Answering Business Questions using SQL: Album vs. Individual tracks - alternative solution using album ids gives different results

Guidance for the last problem in Guided Project: Answering Business Questions using SQL recommends using a strategy involving an EXCEPT clause.

I tried to follow this strategy (especially to get familiar with the new tool) but had to go through some contortions to get it to work,…er um,…not work.

It made most sense to me to compare tracks from the max album id for each invoice to the track list for that album.

I’m still getting different results. Can anyone help me see why? I’ve been staring at this for too long and could use some fresh eyes. DQ community please help! :pray:

Here’s my code. Apologies it’s not cleaner/more concise, but I’m a noob.

WITH

invoice_max_album_id AS (
                         SELECT il.invoice_id, 
                                MAX(a.album_id) AS max_album_id
                         FROM invoice_line il
                         JOIN track t ON t.track_id = il.track_id
                         JOIN album a ON a.album_id = t.album_id
                         GROUP BY 1),

full_album_tracks AS (
                      SELECT i.invoice_id, 
                             t.track_id
                      FROM track t
                      JOIN invoice_max_album_id i
                      ON i.max_album_id = t.album_id
                      ),

invoice_tracks_purchased AS 
                        (
                         SELECT il.invoice_id, 
                                a.album_id,
                                t.track_id
                         FROM invoice_line il
                         JOIN track t 
                            ON t.track_id = il.track_id
                         JOIN album a 
                            ON a.album_id = t.album_id
                         JOIN invoice_max_album_id i
                            ON i.invoice_id = il.invoice_id
                            AND i.max_album_id = a.album_id
                        ),

album_purchases AS (
                    SELECT i.invoice_id,
                    CASE WHEN
                         (
                          (
                           SELECT track_id 
                           FROM full_album_tracks f
                           WHERE f.invoice_id = i.invoice_id
          
                           EXCEPT
          
                           SELECT track_id 
                           FROM invoice_tracks_purchased p
                           WHERE p.invoice_id = i.invoice_id
                          ) IS NULL
         
                         AND 
         
                          (
                           SELECT track_id 
                           FROM invoice_tracks_purchased p
                           WHERE p.invoice_id = i.invoice_id
                            
                           EXCEPT
                            
                           SELECT track_id 
                           FROM full_album_tracks f
                           WHERE f.invoice_id = i.invoice_id
                          ) IS NULL
                         )
                    THEN 'yes' 
                    ELSE 'no' 
                    END AS album_purchase
                    FROM invoice_max_album_id i
                    ORDER BY 1)
    
SELECT album_purchase, 
       COUNT(album_purchase) AS total_invoices,
       CAST(COUNT(album_purchase) AS float) / (SELECT COUNT(*) FROM invoice) AS pct_invoices
FROM album_purchases
GROUP BY 1

Here’s the output

album_purchase total_invoices pct_invoices
no 429 0.6986970684039088
yes 185 0.30130293159609123

Interestingly, this solution is sensitive to using MIN(a.album_id) instead of MAX(a.album_id). Using MIN I get

album_purchase total_invoices pct_invoices
no 497 0.8094462540716613
yes 117 0.19055374592833876

But neither of these match the answer key.

If you made it this far, bless you! Any insight is greatly appreciated.

PS: I tried to tag this 191-6 but problem tags didn’t show up in tag search. Is this something you need elevated community privileges for?