Albums-vs-individual-tracks output mismatches

Screen Link:
albums-vs-individual-tracks

My Code:

%%sql
WITH 
 invoice_tracks AS
                    (
                        SELECT i.invoice_id,
                        il.track_id,
                        t.album_id
                        
                        FROM invoice i 
                        INNER JOIN invoice_line il 
                        ON il.invoice_id = i.invoice_id
                        INNER JOIN track t 
                        ON t.track_id = il.track_id
                    
                    )
        ,
        
        album_purchase_type_1 AS
        
        (SELECT invoice_id,
                COUNT(DISTINCT(album_id)) ,
            CASE
                WHEN COUNT(DISTINCT(album_id)) = 1 THEN "album_purchase"
                ELSE "track_purchase"
            END AS album_purchase_type
        
            FROM invoice_tracks
            GROUP BY invoice_id)
        
        SELECT COUNT(invoice_id ),
        album_purchase_type
        
        FROM album_purchase_type_1
        GROUP BY album_purchase_type
        
        ;

What I expected to happen:
My answer varies from others, I know I am making some mistake but why my approach is wrong ? Which cases are being missing in my approach which is resulting in wrong analysis ?

What actually happened:

COUNT(invoice_id )	album_purchase_type
171	                                album_purchase
443	                                track_purchase

Hi! It’s somewhat late, but in case you haven’t figured it out yet…
When classifying the purchases your logic is:

  • it’s an album purchase if all the tracks from an invoice belong to the same album.

What’s wrong with this logic?
There might be some invoices with only several tracks from the album and not all of them. In your code these purchases are classified as album purchases

1 Like