I couldn’t figure out the EXCEPT operator so I ended up solving the problem without using it. I figured I would post it here in case someone gets stuck too!
My Code:
%%sql
WITH invoice_albums AS
(
SELECT
i.invoice_id,
COUNT(t.track_id) purchased_track_count,
COUNT(distinct a.album_id) album_count
FROM invoice i
INNER JOIN invoice_line il ON il.invoice_id = i.invoice_id
INNER JOIN track t ON t.track_id = il.track_id
INNER JOIN album a ON a.album_id = t.album_id
GROUP BY 1
),
album_track_count AS
(
SELECT
a.album_id,
COUNT(t.track_id) track_count
FROM album a
INNER JOIN track t ON a.album_id = t.album_id
GROUP BY 1
),
first_track_count AS
(
SELECT
il.invoice_id,
MIN(il.track_id) first_track_id,
atc.album_id,
atc.track_count album_track_count,
COUNT(il.track_id) purchased_track_count
FROM invoice_line il
INNER JOIN track t ON t.track_id = il.track_id
INNER JOIN album_track_count atc ON atc.album_id = t.album_id
GROUP BY 1
)
SELECT
CASE
WHEN ia.album_count = 1
AND ftc.purchased_track_count = ftc.album_track_count
THEN "Albums"
ELSE "Singles"
END AS purchase_type,
COUNT(ia.invoice_id) number_of_invoices,
ROUND(CAST(COUNT(ia.invoice_id) AS float)/
(
SELECT COUNT(*)
FROM invoice
),2) percentage
FROM invoice_albums ia
LEFT JOIN first_track_count ftc ON ftc.invoice_id = ia.invoice_id
GROUP BY 1
;
@gdelaserre: recategorized your topic. The Except operator basically excludes to exclude certain criteria and is usually used with a subquery as shown here
First, I’d like to say a huge thank you for posting this solution. I found it much more intuitive than the one given. Thank you.
Second, I’d like to say I improved the code. There were some unnecessary bits. If I’m wrong, anyone is welcome to correct me, although the code arrives at the same answer in fewer steps.
Thanks again.
%%sql
WITH invoice_albums AS
(
SELECT
i.invoice_id,
COUNT(t.track_id) purchased_track_count,
COUNT(DISTINCT t.album_id) album_count,
t.album_id album_id
FROM invoice i
INNER JOIN invoice_line il ON il.invoice_id = i.invoice_id
INNER JOIN track t ON t.track_id = il.track_id
GROUP BY 1
),
album_track_count AS
(
SELECT
t.album_id album_id,
COUNT(t.track_id) album_track_count
FROM track t
GROUP BY 1
)
SELECT
CASE
WHEN ia.album_count = 1
AND ia.purchased_track_count = atc.album_track_count
THEN "Albums"
ELSE "Singles"
END AS purchase_type,
COUNT(ia.invoice_id) number_of_invoices,
ROUND(CAST(COUNT(ia.invoice_id) AS FLOAT) /
(
SELECT COUNT(*)
FROM invoice), 2) percentage
FROM invoice_albums ia
INNER JOIN album_track_count atc ON atc.album_id = ia.album_id
GROUP BY purchase_type
I totally misunderstood this problem! I was trying to figure out how to also find albums within a purchase (for example, invoice x contains 1 album, but also 3 singles - the purchase would be both album and singles) that I got lost in the weeds. Being able to throw away any invoice where album_count > 1 is such a huge simplification. Thanks.
I am myself one of the people that misunderstood the problem just like @jasonboyle90 said before me.
I thought about this way of solving the exercise but then I stopped because the misunderstanding. I have solved the problem in this way, but I will strive to understand also the way of solving it with the EXCEPTION method.
WITH pre_base
AS (SELECT al.album_id,
Count(tr.track_id) t_cnt
FROM track tr
LEFT JOIN album al
ON al.album_id = tr.album_id
GROUP BY 1),
base
AS (SELECT i.invoice_id,
CASE
WHEN Count(DISTINCT al.album_id) = 1
AND Count(DISTINCT tr.track_id) = pb.t_cnt THEN "album"
ELSE "track"
END AS Category
FROM invoice i
LEFT JOIN invoice_line il
ON i.invoice_id = il.invoice_id
LEFT JOIN track tr
ON il.track_id = tr.track_id
LEFT JOIN album al
ON al.album_id = tr.album_id
LEFT JOIN pre_base pb
ON pb.album_id = tr.album_id
GROUP BY 1)
SELECT category,
Count(invoice_id) num_of_inv,
Round(Cast(Count(invoice_id) AS FLOAT) / Cast((SELECT Count(*)
FROM base)AS FLOAT) *
100, 2)
pct_invoice
FROM base
GROUP BY 1
However I need to revisit and see if I can leverage EXCEPT clause in the logic to achieve the same result. After all we are here to learn the concepts.