SQL - JOIN same table twice on different columns

Screen Link:

Example Code:

SELECT
e1.employee_id,
e2.employee_id supervisor_id
FROM employee e1
INNER JOIN employee e2 on e1.reports_to = e2.employee_id
LIMIT 4;

I am learning Recursive Joins or Joining the same table twice on different columns. I wonder what really happens here. We give the same table ‘employee’ aliases: “e1” after FROM and “e2” after JOIN.
My question is:

Does SQL create a new table “e2” here? (since the columns of “e2” are different from those in our original table).

1 Like

Hi @alexpham0903 and welcome to the community!

I found this type of join very confusing the first time as well. For me, it helped to keep experimenting with this join type and analyzing the results. Also, I found it made more sense to name the first table e for employee and to the name the second table s for supervisor.

The columns should not be different for e2 since it is still referencing the same table (employee). I think what’s causing your confusion is how the table is being joined to itself on a different column. Up until now, we have been joining tables using the “same” column (ie track_id, album_id, artist_id, etc…) but now it seems like we are joining on “different” columns (ie reports_to and employee_id). But we must remember that although these columns have different names, they contain the same type of data (ie employee numbers).

Thanks Mike,

Still, I am trying to grasp the concept. I understand the that JOIN clause creates a temporary table, which is the combination of 2 tables (just talk about 2 tables only) coming after FROM and JOIN respectively.

However, in the case of Joining same table twice on different columns, take the above example:

  • the “reports_to” column of “employees” now effectively becomes “employee_id” column of “supervisors”. I understand both columns (and tables) refer to the same info.

Still, I cannot persuade myself that the “supervisor” table now is still 100% the same as our original table. Clearly, the identifying columns of “id” are not the same. So again, the same question: Does SQL create a “slightly” different temporary table compared to the original one?

I do not believe so…it’s still the same table as the original but upon joining to itself, the “supervisor” table is “shuffled” so that it’s id column lines up with the “employee” id column.

Why is this process even necessary?

I can produce the same result by just selecting both columns from the same table:

SELECT employee_id, reports_to
FROM employee;

Why do we need to go through all these extra steps to join columns that are already on the same table??

1 Like

This is normally done if a row in a table is related to another row in the same table.

Some employees in the employee table report to another employee in the same employee table. For example, an employee with an id 1 reports to another employee with an id 2.

What we actually want is to have the row for employee id 1 to also have the data of employee id 2 (which is their supervisor).

We would like to transform the following:

id name reports_to
1 Abraham 2
2 Bob 0

(with Left Join) Into :

id name reports_to supervisor_id supervisor_name reports_to
1 Abraham 2 2 Bob NaN
2 Bob NaN NaN NaN NaN

The second table have the same columns but the rows will be ordered differently when joined with the first table depending on the join condition.

1 Like