Screen Link:
https://app.dataquest.io/m/374/guided-project%3A-answering-business-questions-using-sql/7/albums-vs-individual-tracks
My Code:
albums_or_tracks <- "
WITH invoice_tracks AS
(
SELECT
il.invoice_id,
COUNT(DISTINCT(il.track_id)) AS unique_tracks,
t.album_id
FROM invoice_line il
INNER JOIN track t ON t.track_id = il.track_id
GROUP BY invoice_id, album_id
),
album_tracks AS
(
SELECT
album_id,
COUNT(DISTINCT(track_id)) AS number_of_tracks
FROM track
GROUP BY album_id
),
album_purchases AS
(
SELECT
it.invoice_id,
it.unique_tracks,
at.*,
CASE
WHEN unique_tracks = number_of_tracks THEN 'yes'
ELSE 'no'
END AS album_purchase
FROM invoice_tracks it
INNER JOIN album_tracks at ON at.album_id = it.album_id
)
SELECT
album_purchase,
COUNT(DISTINCT(invoice_id)) AS invoices,
ROUND(CAST(COUNT(DISTINCT(invoice_id)) as float) / (SELECT COUNT(DISTINCT(invoice_id)) FROM album_purchases) * 100, 2) AS percentage
FROM album_purchases
GROUP BY album_purchase; "
What I expected to happen:
To get the same values in the output as in the solution notebook.
What actually happened:
I did not get the same values.
Replace this line with the output/error
album_purchase invoices percentage
1 no 500 81.43
2 yes 201 32.74
I’m not sure I understood whether or not we were supposed to handle the “edge cases”, and I’m wondering if the flow and code are still okay in light of this likely misunderstanding.
Thanks for any help!