Screen Link:
https://app.dataquest.io/m/191/guided-project%3A-answering-business-questions-using-sql/6/albums-vs-individual-tracks
I have cheated a little bit on this project and I got a different result than the answer, so maybe my cheat was not that great. In general I’ve been banging my head against the wall with SQL, I get the syntax, but I don’t know how to use it (yet!). I wonder how you would fix my ■■■■■■ solution.
WITH albums_invoice AS
(
SELECT il.invoice_id,
COUNT(DISTINCT t.album_id) different_albums
FROM invoice_line il
JOIN track t ON t.track_id = il.track_id
GROUP BY 1
),
albums_invoice_def AS
(
SELECT *,
CASE
WHEN different_albums == 1 THEN "Album"
ELSE "Singles"
END AS album_singles
FROM albums_invoice
)
SELECT album_singles,
COUNT(invoice_id) count,
COUNT(invoice_id) * 100.0 / SUM(COUNT(invoice_id)) OVER () AS Percentage
FROM albums_invoice_def
GROUP BY 1
album_singles count Percentage
Album 171 27.850162866449512
Singles 443 72.14983713355049
Unfortunately there are some albums with one or two songs on it, which are basically singles as well. The total number of those is 90. But of course not every one of those albums has been purchased, so I can’t hack by simply deducting from the histogram. I wrote this code, but I can’t seem to integrate it anywhere in my original. Do I really have to work with EXCEPT then?
WITH tracks_per_album AS
(
SELECT album_id,
COUNT(track_id) tracks
FROM track
GROUP BY 1
)
SELECT tracks,
COUNT(album_id)
FROM tracks_per_album
GROUP BY 1
tracks COUNT(album_id)
1 82
2 8
3 3
4 2
5 2
6 3
7 8
8 14
9 15
10 27
11 25
12 29
13 16
14 34
15 18
16 13
17 15
18 8
19 3
20 5
21 2
22 3
23 3
24 3
25 2
26 1
30 1
34 1
57 1
Sorry for the long post :blush: