For Review: Answering Business Questions SQL Guided Project

Screen Link:

Hello,
Here is my guided project for the Answering Business Questions Using SQL mission. I added some visuals to complement the data. I struggled a lot with the ‘Albums v. Individual Tracks’ part. It was tough for me to visualize each step in one cell of code, so I broke it out into separate steps. That made it easier for me to see what was happening at each step. I’m interested to hear what others think about it.

Things I was wondering about as I was doing the project:

When plotting the aggregate sales and customer data for each country (step 5 in the mission, code cell 6 in the Jupyter notebook), I was wondering how to write my code more efficiently.

Instead of writing the code 4 times for 4 separate plots, I wanted to use a for loop. The plotting and the common aesthetics, such as x-tick label font size and rotation, would be inside the loop. Outside the loop, I wanted to specify separate aesthetics for each of the 4 axes, such as the y-axis label and the title. Where I had trouble was creating ax1=sns.barplot(), ax2=sns.barplot(), etc. inside the loop. If I had achieved this, I think I could have set the separate ax aesthetics outside.
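One possible way to do this, as a minimal sketch rather than the notebook's actual code: create the four axes up front with plt.subplots, pass each one to sns.barplot through its ax argument inside the loop, and keep the axes in a dictionary so they can be styled individually afterwards. The DataFrame values, the column names, and the axes_by_metric dictionary are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up stand-in for the notebook's country_stats DataFrame.
country_stats = pd.DataFrame({
    "country": ["A", "B", "C"],
    "customers": [10, 6, 4],
    "sales": [900.0, 500.0, 400.0],
    "avg_order": [8.0, 7.0, 7.5],
    "avg_customer_value": [90.0, 83.0, 100.0],
})
metrics = ["customers", "sales", "avg_order", "avg_customer_value"]

fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(12, 8))
axes_by_metric = {}  # hypothetical helper: metric name -> its Axes

# Inside the loop: the plotting call and the aesthetics shared by all plots.
for ax, metric in zip(axes.flat, metrics):
    sns.barplot(x="country", y=metric, data=country_stats,
                color="steelblue", ax=ax)
    ax.tick_params(axis="x", labelsize=10, labelrotation=45)
    ax.set_xlabel("")
    axes_by_metric[metric] = ax

# Outside the loop: aesthetics that differ per plot.
axes_by_metric["customers"].set_title("Customers by country")
axes_by_metric["customers"].set_ylabel("Number of customers")
axes_by_metric["sales"].set_title("Total sales by country")
axes_by_metric["sales"].set_ylabel("Sales (USD)")

fig.tight_layout()
```

sns.barplot also returns the Axes it drew on, so a name like ax1 could still be captured from the call if preferred; the dictionary just avoids numbering the variables by hand.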

If anyone has any suggestions it would be appreciated.

Thank you for your time!

Project Inspiration Credit due to @Elena_Kosourova and @jesusayala893 for the excellent projects that served as a benchmark for me. Also, thank you @alvinctk for helping with the pie chart.

SQL Chinook Record Store DB.ipynb (1.9 MB)

Hi @gosaints,

Wow, your project is really excellent! :star: The visualizations are amazing and perfectly readable, especially the pie charts with explode. The project structure is perfect, the storytelling and observations are great, and commenting the SQL code as well was a cool idea (in general, the code commenting is perfect throughout). Comparing the appearances of the artists in playlists rather than the number of playlists for each is an interesting idea. And once again, writing both the conclusion and the methodology summary is a great approach! I also noticed that, at the end, you decided to count the actual number of customers for each sales support agent :slightly_smiling_face:

Now some suggestions from my side:

  • The code cell [6], just an observation: the upper-left graph doesn’t have the grid, while the others do. Honestly, I can’t figure out why myself.
  • It’s better to combine code cells that have no output or markdown observations after them, like [8]-[9], [10]-[11], [24]-[26], [36]-[37].
  • A long line of code, especially a function call with many arguments such as one creating a plot (like [6], [11], [15]), is easier to read when split across several lines, one argument per line. For example, this code from the code cell [6]:
ax1 = sns.barplot(x=country_list, y='sales', data=country_stats, color="steelblue")

can be written in this way:

ax1 = sns.barplot(x=country_list, 
                  y='sales', 
                  data=country_stats, 
                  color="steelblue")
  • It’s better to follow a uniform style of quote marks for the string data in the code cells throughout the project (only single or only double).
  • For the code cell [12] you can consider using ORDER BY.
  • When styling the SQL code, don’t forget to mind the “rivers”.
  • It’s better to write numbers as digits in this sentence:

Seventy-four artists have zero units sold.

  • If you are also curious (as I was) about how to reduce the huge space between the title of a pie chart and the pie chart itself (like in the code cells [20], [34]), I would suggest the following approach, which I found somewhere on StackOverflow while doing the same project:
ax.set_title("My Pie Plot Title", 
              fontsize=30, 
              y=0.9)

The main idea here is to tune the y parameter.
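As a minimal, self-contained sketch of that tip (the pie values and labels are made up, and this is not the notebook's code): y is given in axes coordinates, where 1.0 is the top edge of the axes, so a value slightly below 1 pulls the title down towards the pie.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6, 6))

# Made-up shares, with the first wedge pulled out via explode.
ax.pie([50, 30, 20], labels=["A", "B", "C"], explode=(0.1, 0, 0))

# y below 1.0 moves the title closer to the pie.
title = ax.set_title("My Pie Plot Title",
                     fontsize=30,
                     y=0.9)
```

Trying a few values of y between roughly 0.85 and 1.0 is usually the quickest way to find a spacing that looks right for a given figure size.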

Hope that my suggestions were of use.
Once again, awesome project! :star_struck: Keep up this high level!

Thank you for this feedback. It all makes sense and is very helpful!

Interesting about the grid not being on the upper left plot of cell 6. Also the entire figure is displaying smaller than it was when I was writing it.

When I open the notebook or restart the kernel and run all, it displays the same way. When I click the cell and run just that cell again, the grid comes back and the figure is as big as it was when I wrote it. Very strange, it must be some quirk with Jupyter. I’m running version 6.2.

Thanks again!

Hey gosaints, thanks for sharing. I just want to say I learned a lot reviewing your notebook. You’ve inspired me to take my practice a few steps further by importing the queries into pandas and adding visualizations. Anyway, excellent work, sir!

Thank you @kevinjhuang, I’m glad you were able to learn from it!