Guided Project: Alternative Task 4 Solution for Chinook Business Questions

Hello all,

I was stumped on the fourth task of this guided project, so I wanted to present an alternative solution to that provided by DataQuest and see if people had any comments on it.

Beyond that, I’d like to know if someone could refer me to a more in-depth explanation of the use of EXCLUDE in the solution. The solution notebook provides zero notes on what exactly their query did, which seems to be pretty bad practice imo. Some in the community provided rough explanations, but I didn’t find any with detailed explanations that I could make sense of.

Task 4 and My Solution:
Task 4 was to determine the number and percentage of invoices that were purchases of whole albums vs. individual tracks. The instructions guiding this task included use of the EXCLUDE function. After I failed to understand how to properly use EXCLUDE for the task, I inspected the data and found what I think is a much more readable solution.

Because each track and album has a unique identifying number (and because they’re not assigned at random), the sum of the track IDs on each album plus the album ID results in a unique sum for each album. Thus, if this value is calculated for each album in the database (the dictionary) and across invoices (the data to compare to the dictionary), you can easily determine whether the invoice represents a full album purchase by whether or not the invoice ID sum matches any ID sum in the album dictionary. As an extra precaution (in case they hadn’t ended up being unique identifiers), I marked the invoices as being for full albums only if the unique ID sum and the album ID matched between the dictionary and the invoice summary set. My results matched the solution notebook, as can be seen in my notebook below.

SQLite Analyzing the Chinook Music Store.ipynb (62.9 KB)

Click here to view the jupyter notebook file in a new tab

1 Like

Hi @curtiss.a.chapman,

Welcome to the Community!

About this part of your post:

The solution notebook provides zero notes on what exactly their query did, which seems to be pretty bad practice imo.

Could you please share your feedback with the Content & Product teams of Dataquest? Just click the ? button in the upper-right corner of any screen of the Dataquest learning platform, select Share Feedback, fill in the form, and send it. Thanks!


Done! Thanks for the welcome.