464-4 I know what I did wrong, but what did I pull back instead?

Hi Everyone,

I’m working on Intermediate Joins in SQL, specifically screen 4 Combining Multiple Joins with Subqueries.

My initial submission is below (which I learned was incorrect).

SELECT tr.album_title album,
       tr.artist_name artist,
       COUNT(*) tracks_purchased
  FROM invoice_line il
 INNER JOIN (SELECT t.track_id, ar.name artist_name, al.title album_title
               FROM track t
              INNER JOIN album al ON al.album_id = t.album_id
              INNER JOIN artist ar ON ar.artist_id = al.artist_id
            ) tr ON tr.track_id
 GROUP BY album, artist
 ORDER BY tracks_purchased DESC
 LIMIT 5;

I realize I forgot to complete the join and specify that tr.track_id = il.track_id

Can someone elaborate on exactly what SQL is bringing back in my initial submission? I didn’t receive any errors, but my results were:

Greatest Hits Lenny Kravitz 271149
Minha Historia Chico Buarque161738
Unplugged Eric Clapton 142710
Lost, Season 3 Lost 123682
Lost, Season 1 Lost 118925

instead of:

Are You Experienced? Jimi Hendrix 187
Faceless Godsmack 96
Mezmerize System Of A Down 93
Get Born JET 90
The Doors The Doors 83

Hey, McClure.

In the SQL courses we don’t fully explore what’s happening with joins.

Join conditions don’t need to be of the kind “the column matches that other column”. You can use any condition, and for each combination of rows from the two intervening tables in a join, the SQL engine will check if it evaluates to true or false.

If it evaluates to true, it keeps that combination of rows with whatever columns are chosen in the SELECT statement. Otherwise, it will discard that combination of rows.

In your question the condition is tr.track_id. The SQL engine looks at this condition and tries to assess whether it is true or false. And it does so in the following way:

  • If there are no missing values, then every row is evaluated to true;
  • If there is at least one missing value, then every row is evaluated to false;

Since tr.track_id has no missing values, it evaluates to true and it performs the join on every row. You can test this by swapping tr.track_id with 1=1, like this:

SELECT tr.album_title album,
       tr.artist_name artist,
       COUNT(*) tracks_purchased
  FROM invoice_line il
 INNER JOIN (SELECT t.track_id, ar.name artist_name, al.title album_title
               FROM track t
              INNER JOIN album al ON al.album_id = t.album_id
              INNER JOIN artist ar ON ar.artist_id = al.artist_id
            ) tr ON 1=1
 GROUP BY album, artist
 ORDER BY tracks_purchased DESC
 LIMIT 5;
1 Like