Even after you fix your solution, it’s going to end up being different because the proposed solution is using a different definition of what an album purchase is.
For example the following invoice ids are purchases that include full albums with only one track that the given solution ignores: {4, 517, 7, 11, 15, 527, 20, 27, 548, 37, 38, 39, 41, 565, 59, 577, 579, 67, 586, 78, 597, 87, 91, 94, 97, 101, 103, 106, 111, 121, 127, 134, 137, 164, 168, 176, 180, 189, 190, 195, 198, 210, 216, 218, 227, 242, 250, 256, 260, 265, 270, 272, 275, 285, 288, 306, 311, 313, 316, 324, 325, 338, 345, 360, 365, 368, 369, 384, 395, 402, 405, 415, 421, 423, 424, 428, 434, 437, 441, 445, 450, 452, 456, 462, 467, 486, 509}
.
You can take a look at, for instance, invoice id 597. The track id 2093 is part of an album with only one track. To see Dataquest’s invoice classification, you can run the following query:
WITH invoice_first_track AS
(
SELECT
il.invoice_id invoice_id,
MIN(il.track_id) first_track_id
FROM invoice_line il
GROUP BY 1
)
SELECT
ifs.*,
CASE
WHEN
(
SELECT t.track_id FROM track t
WHERE t.album_id = (
SELECT t2.album_id FROM track t2
WHERE t2.track_id = ifs.first_track_id
)
EXCEPT
SELECT il2.track_id FROM invoice_line il2
WHERE il2.invoice_id = ifs.invoice_id
) IS NULL
AND
(
SELECT il2.track_id FROM invoice_line il2
WHERE il2.invoice_id = ifs.invoice_id
EXCEPT
SELECT t.track_id FROM track t
WHERE t.album_id = (
SELECT t2.album_id FROM track t2
WHERE t2.track_id = ifs.first_track_id
)
) IS NULL
THEN "yes"
ELSE "no"
END AS "album_purchase"
FROM invoice_first_track ifs
;
Problems with your solution are:
-
Filtering for albums with more than one track. There are a ton of albums with only one track.
-
SELECT count(*) FROM tracks_purchased
is a table comprised of tracks, basically. You’re supposed to count invoices, hence the huge number 2966, despite the fact that theinvoice
table only has 614 rows.