I am having trouble understanding which table and column to call in SELECT after I join a bunch of tables

Screen Link:
SQL Joins And Working With Complex Databases — Combining Multiple Joins With Subqueries | Dataquest

My Code:

SELECT
    ta.album_title as album,
    ta.artist_name as artist,
    COUNT(*) tracks_purchased
FROM invoice_line il
INNER JOIN (
            SELECT
                t.track_id,
                al.title album_title,
                ar.name artist_name
            FROM track t
            INNER JOIN album al ON al.album_id = t.album_id
            INNER JOIN artist ar ON ar.artist_id = al.artist_id
           ) as ta
           ON ta.track_id = il.track_id
GROUP BY 1,2 
ORDER BY 3 DESC
LIMIT 5;

What I don’t understand is why the select is all from the ta table. If the ta table have joined with il table, then why does it matter if I call il or ta?

Why can’t I use:


SELECT
    il.album_title as album,
    il.artist_name as artist,
    COUNT(*) tracks_purchased

Hello @hliu19922019 ,

You cannot use il.album_title and il.artist_name because the invoice_line table does not have such columns.

Even after you joined the tables, you still need to respect the columns in each table to successfully extract the data.

In this particular case, you can also not specify the table. The following works:

SELECT
    album_title as album,
    artist_name as artist,
    COUNT(*) tracks_purchased

But you cannot tell SQL to find a column in a table that does not have such column.

Also, imagine the two joined tables have two columns with the same name but representing different things. For example, a table countries and a table cities might have the columns population. After the join, it would make a great difference to select countries.population or cities.population.

When you make a join, it does not mean you have only one table, but two joined tables.

Hope this helps you.

Thanks for responding! I guess maybe it’s just not me understanding what is “physically” doing when you join 2 tables. My understanding if that you essentially create a whole new table with those columns put in there.
So using an example, say Table 1 has the 2 column titled food where it might be cheese, apples, etc. Column 2 has food category such as dairy, veggies, fruits.

Then Table 2 might have Columns food and column expiration date and table 1 and table 2 match on the food column

My understanding is that if you do an inner join of table 2 on to table 1, you would create a brand new table or table 3 and you’d only have the 2 columns food and food categories right? Since table 1 and table 2 will merge on food and become 1 column

BUT if you have a LEFT JOIN, you’d have a table where it includes 3 categories right? where table 3 will have food, food categories, and expiration date as the two matching ```food`` columns will essentially merge and become one column and the other 2 distinct columns will be added into the table 3 resulting in 3 columns.

This may contain the animations you’re looking for: sql - What is the difference between "INNER JOIN" and "OUTER JOIN"? - Stack Overflow
There are many things venn diagram can’t express, though they serve as a good 1st intro to joins.

I understand you need a better mental model of what’s going on.
Just passively following lessons is insufficient for that, and experimenting in DQ’s browser limits you by data choice and slow return of results.

You can get faster feedback loops by designing your own small tables in https://sqliteonline.com/ or http://sqlfiddle.com/ and running your own investigative queries. This skill is transferrable beyond SQL to learning any example, and crucial for debugging too (isolating the likely cause, replicating it with the smallest possible example, communicating what you’ve tried and what you think the bug is when opening issues on open source projects).

For a start, here is a small example i prepared for you to try:
DDL

CREATE TABLE customers (id,name);
insert into customers VALUES
(1,'tom'),
(2,'jerry');
CREATE TABLE sales (id,amt);
insert into sales VALUES
(1,100);

DML

SELECT *
FROM customers c
LEFT JOIN sales s ON c.id = s.id

This student also asked how left-joins work. My reply here contains a list of questions I hope can help SQL beginners see what driving your own learning with a brainstorm of interactions of concepts looks like.

After that, you may be interested in more difficult topics here: From Zero to Hero in SQL

1 Like

Table 1:

food food_category
cheese dairy
apple fruit
potato veggies

Table 2:

food expiration_date
cheese 2022
apple 2021

Note that table 2 has only 2 rows.

TABLE 1 INNER JOIN TABLE 2

food food category expiration_date
cheese dairy 2022
apple fruit 2021

TABLE 1 LEFT JOIN TABLE 2

food food category expiration_date
cheese dairy 2022
apple fruit 2021
potato veggies NULL

When selecting, you cannot say select Table1.expiration_date or select Table2.food_category because these tables do not have such columns. The so-called Table 3 does, but table Table 1 does not have expiration_date and Table 2 does not have food_category.

That’s the problem with il.album_title, for example. ìl is like Table 2 and it does not have the album_title column. Table 1 (track) does and Table 3 (the one you are creating with the join) also does, but not Table 2 (invoice_line ) does not.

I strongly encourage you to carefully read all the links Hanqi provided.