GP: Answering Business Questions using SQL, requesting feedback and suggestions on Step 6 attempt

Step - 6 is quite challenging. I have been stuck on it for waay too long and wanted help regarding my approach so far.

MY CODE:


WITH tracks_per_album AS (
        SELECT album_id, COUNT(track_id) AS "track_count"
        FROM track
        GROUP BY 1
),

tracks_per_invoice AS (
        SELECT il.invoice_id, t.album_id, COUNT(il.track_id) AS "track_count"
        FROM invoice_line AS "il"
        JOIN track AS "t" ON t.track_id = il.track_id
        GROUP BY 1,2
)

SELECT COUNT(*)
FROM tracks_per_album AS "tpa"
JOIN tracks_per_invoice AS "tpi" ON tpa.album_id = tpi.album_id
WHERE (tpa.track_count = tpi.track_count)

The above will output 207. As per the solution, that should be 114.

An overview of my solution -

  • I create tracks_per_album which stores the album_id and the number of tracks in each album.
  • I create tracks_per_invoice which joins invoice_line and track, and it stores the invoice_id, the album_id and the number of tracks per invoice per album.
  • I join the above two on their respective album_ids.
  • To get the number of invoices that are album purchase, I just check if the track_count in both of those tables is the same or not. If it is, then it’s an album purchase.

Looking at a specific invoice

invoice_id = 4 is an album purchase as per my solution above, but as per Dataquest’s solution, it’s not.

Based on my solution above, if we look at invoice_id = 4, I get -

Comparing the above to the number of tracks in the album_ids listed in the above table -

Now, as you can see -

  • For album_id = 260, the number of tracks is 1
  • My current attempt, because of album_id = 260, assumes that the invoice_id = 4 is an album purchase since the track count is 1
  • Even though it’s not an album purchase because there are tracks from different albums that were also bought as per invoice_id = 4

What I require help with

  • How do I tackle the above case given my current implementation? Given the disparity in my count vs the official solution’s, there are definitely more invoices which fall under the same situation.

The above approach felt more intuitive to me than the one provided in the solution. When trying to solve the above, I also came across @Bruno’s suggestion here which, I think, is also talking about a similar approach.

I am just stuck on how to proceed forward from this, at the moment. Any feedback is appreciated. Thanks!

This old post should also be useful.

The approach is similar, but we’re using different definitions of what constitutes an album purchase.

If I recall correctly, the definition used by the author of project is a purchase that contains a full album and nothing more, just as long as that album as more two (?) tracks.

I didn’t look at your code, it’s possible there are more elegant solutions than what I’m about to suggest. On the table above you should be able to add a column that checks whether or not the album was fully purchased and it has more than two (?) tracks.

I was not going to write the code for this, but I did, so I’m leaving it below.

Expand to see query
WITH tracks_per_album AS (
     SELECT album_id,
	        COUNT(track_id) AS "track_count"
       FROM track
      GROUP BY 1
),

tracks_per_invoice AS (
     SELECT il.invoice_id, t.album_id,
			COUNT(il.track_id) AS "track_count"
       FROM invoice_line AS "il"
      INNER JOIN track AS "t" ON t.track_id = il.track_id
      GROUP BY 1,2
)

SELECT tpi.invoice_id, tpa.*, tpi.track_count as tracks_purchased,
       CASE 
	     WHEN tpa.track_count = tpi.track_count AND tpa.track_count > 2 THEN 1
		 ELSE 0
	   END AS full_album
  FROM tracks_per_album AS "tpa"
 INNER JOIN tracks_per_invoice AS "tpi" ON tpa.album_id = tpi.album_id
 ORDER BY invoice_id;

From the resulting table, you can group by invoice_id, count how many albums there is in each invoice, and figure out if any of the albums is a full album (you can take the maximum of full_album in the query above for this).

2 Likes

Thanks for taking the time to write out the query for me. I will first try to attempt this on my own based on the suggestion you provided. If I get stuck again, I will refer to your code as well.

Thanks again!

I just piggy-backed of your code, and I didn’t write the whole thing, I just created an extra column. You’ll still have to do most of the work.

1 Like

Yup, understood. Will report back when I solve it (hopefully).