Guided Project: Answer to Albums vs. Singles without using the EXCEPT operator

Screen Link:

I couldn’t figure out the EXCEPT operator so I ended up solving the problem without using it. I figured I would post it here in case someone gets stuck too!

My Code:

%%sql

WITH invoice_albums AS 
    (
    SELECT
    i.invoice_id,
    COUNT(t.track_id) purchased_track_count,
    COUNT(distinct a.album_id) album_count
    FROM invoice i
    INNER JOIN invoice_line il ON il.invoice_id = i.invoice_id
    INNER JOIN track t ON t.track_id = il.track_id
    INNER JOIN album a ON a.album_id = t.album_id
    GROUP BY 1
    ),

    album_track_count AS
    (
    SELECT
        a.album_id,
        COUNT(t.track_id) track_count
        FROM album a
        INNER JOIN track t ON a.album_id = t.album_id
        GROUP BY 1
    ),

    first_track_count AS
    (
    SELECT
    il.invoice_id,
    MIN(il.track_id) first_track_id,
    atc.album_id,
    atc.track_count album_track_count,
    COUNT(il.track_id) purchased_track_count
    FROM invoice_line il
    INNER JOIN track t ON t.track_id = il.track_id
    INNER JOIN album_track_count atc ON atc.album_id = t.album_id
    GROUP BY 1
    )

SELECT
    CASE
        WHEN ia.album_count = 1 
        AND ftc.purchased_track_count = ftc.album_track_count
        THEN "Albums"
        ELSE "Singles"
    END AS purchase_type,
    COUNT(ia.invoice_id) number_of_invoices,
    ROUND(CAST(COUNT(ia.invoice_id) AS float)/ 
        (
            SELECT COUNT(*)
            FROM invoice
        ),2) percentage
FROM invoice_albums ia
LEFT JOIN first_track_count ftc ON ftc.invoice_id = ia.invoice_id
GROUP BY 1
    ;

Here is the output:

purchase_type number_of_invoices percentage
Albums 114 0.19
Singles 500 0.81
8 Likes

@gdelaserre: recategorized your topic. The Except operator basically excludes to exclude certain criteria and is usually used with a subquery as shown here

2 Likes

I find this way more intuitive. Thank you.

Could you please shed more light on the use of first_track_count though? Its not really clear.

Hello There,

I didn’t strictly follow dataquest’s guidelines for the same question, but I was able to use the except functionality in my answer, for more details please check: Guided Project: Answer to Albums vs. Singles using the EXCEPT operator - Share / Guided Project - Dataquest Community

First, I’d like to say a huge thank you for posting this solution. I found it much more intuitive than the one given. Thank you.

Second, I’d like to say I improved the code. There were some unnecessary bits. If I’m wrong, anyone is welcome to correct me, although the code arrives at the same answer in fewer steps.

Thanks again.

%%sql

WITH invoice_albums AS
    (
        SELECT
        i.invoice_id,
        COUNT(t.track_id) purchased_track_count,
        COUNT(DISTINCT t.album_id) album_count,
        t.album_id album_id
        FROM invoice i
        INNER JOIN invoice_line il ON il.invoice_id = i.invoice_id
        INNER JOIN track t ON t.track_id = il.track_id
        GROUP BY 1
    ),
    album_track_count AS
    (
        SELECT
        t.album_id album_id,
        COUNT(t.track_id) album_track_count
        FROM track t
        GROUP BY 1
    )
    

SELECT  
    CASE
        WHEN ia.album_count = 1
        AND ia.purchased_track_count = atc.album_track_count
        THEN "Albums"
        ELSE "Singles"
        END AS purchase_type,
    COUNT(ia.invoice_id) number_of_invoices,
    ROUND(CAST(COUNT(ia.invoice_id) AS FLOAT) / 
          (
              SELECT COUNT(*)
              FROM invoice), 2) percentage
FROM invoice_albums ia
INNER JOIN album_track_count atc ON atc.album_id = ia.album_id
GROUP BY purchase_type

I totally misunderstood this problem! I was trying to figure out how to also find albums within a purchase (for example, invoice x contains 1 album, but also 3 singles - the purchase would be both album and singles) that I got lost in the weeds. Being able to throw away any invoice where album_count > 1 is such a huge simplification. Thanks.

I am myself one of the people that misunderstood the problem just like @jasonboyle90 said before me.

I thought about this way of solving the exercise but then I stopped because the misunderstanding. I have solved the problem in this way, but I will strive to understand also the way of solving it with the EXCEPTION method.

1 Like

I used similar method with one step less .

WITH pre_base
     AS (SELECT al.album_id,
                Count(tr.track_id) t_cnt
         FROM   track tr
                LEFT JOIN album al
                       ON al.album_id = tr.album_id
         GROUP  BY 1),
     base
     AS (SELECT i.invoice_id,
                CASE
                  WHEN Count(DISTINCT al.album_id) = 1
                       AND Count(DISTINCT tr.track_id) = pb.t_cnt THEN "album"
                  ELSE "track"
                END AS Category
         FROM   invoice i
                LEFT JOIN invoice_line il
                       ON i.invoice_id = il.invoice_id
                LEFT JOIN track tr
                       ON il.track_id = tr.track_id
                LEFT JOIN album al
                       ON al.album_id = tr.album_id
                LEFT JOIN pre_base pb
                       ON pb.album_id = tr.album_id
         GROUP  BY 1)
SELECT category,
       Count(invoice_id) num_of_inv,
       Round(Cast(Count(invoice_id) AS FLOAT) / Cast((SELECT Count(*)
                                                      FROM   base)AS FLOAT) *
             100, 2)
                         pct_invoice
FROM   base
GROUP  BY 1 

|Category|num_of_inv|pct_invoice|
|Album|114|18.57|
|Track|500|81.43|

However I need to revisit and see if I can leverage EXCEPT clause in the logic to achieve the same result. After all we are here to learn the concepts.

1 Like