Getting different results but looks like my code is very similar to the answer code i.e. the Rock tracks sold I get is 418 rather than 561. It looks like when I remove the GROUP BY il.track_id - I get the right answer, however, I’m not sure why I should NOT group by track_id ?
My Code:
%%sql
WITH usa_tracks_sold AS
(SELECT c.country, i.customer_id, i.invoice_id, il.track_id, il.quantity
FROM customer c
INNER JOIN invoice i ON c.customer_id = i.customer_id
INNER JOIN invoice_line il ON i.invoice_id = il.invoice_id
WHERE c.country = "USA"
GROUP BY il.track_id),
track_genre AS
(SELECT t.track_id, t.genre_id, g.name
FROM genre g
INNER JOIN track t ON g.genre_id = t.genre_id)
SELECT tg.name,
count(uts.quantity) tracks_sold,
cast(count(uts.quantity) AS FLOAT) / (SELECT COUNT(*) FROM usa_tracks_sold) percent_sold
FROM track_genre tg
INNER JOIN usa_tracks_sold uts ON uts.track_id = tg.track_id
GROUP BY 1
ORDER BY 2 DESC
LIMIT 10;
What I expected to happen:
What actually happened:
name tracks_sold percent_sold
Rock 418 0.529113924050633
Alternative & Punk 95 0.12025316455696203
Metal 87 0.11012658227848102
R&B/Soul 36 0.04556962025316456
Blues 28 0.035443037974683546
Alternative 23 0.02911392405063291
Latin 22 0.027848101265822784
Hip Hop/Rap 19 0.024050632911392405
Pop 19 0.024050632911392405
Jazz 14 0.017721518987341773