SQL Recursive Joins

Hello everyone :slight_smile:,

I’ve got stuck on this mission:

Screen Link:
https://app.dataquest.io/m/189/intermediate-joins-in-sql/5/recursive-joins

This is the example code, that I must say, its the worst example ever, and even worst explained.
Example Code:

SELECT
    e1.employee_id,
    e2.employee_id supervisor_id
FROM employee e1
INNER JOIN employee e2 on e1.reports_to = e2.employee_id
LIMIT 4;

Why do we have to use this:

  e1.employee_id,
  e2.employee_id supervisor_id

Instead of:

  e1.employee_id,
  e2.reports_to 'supervisor_name'

Why using the same col, the employee_id and naming it supervisor_id?!?! To me this does not make any sense.

And also why this peace of code here:

e1.reports_to = e2.employee_id

And not:

e1.employee_id = e2.reports_to

Sorry but you lack a lot of explanation in this mission. And this is quite confusing to me.

Thank you for the help :slight_smile:

You name it supervisor_id because the reports_to column contains ids, not names. Each row of this column contains an id representing the employee to whom the employee identified in the employee_id column reports to. So when you merge it to e2 and select e2.employee_id you are selecting the supervisor’s id.

With e2.reports_to you’d be selecting the id of the supervisor of the supervisor. e2 contains information about the supervisor. Therefore, to select the supervisor’s name you must use e2.first_name and that’s what you are asked in the exercise.

Because e1 is where you get the employee’s information, so to select the employee’s supervisor id you need to select e1.reports_to, not e2.reports_to. Then you merge e1.reports_to to e2 so you can select the supervisor’s information.

employee is a table containing all employees information, supervisors (they are also employees) are also included since they are also employees. Since this is a self join, we will assume that the first table employee AS e1 is a table about the normal employees employee AS e2 is a table about the supervisors now we want to join the two tables using e1.reports_to employee’s supervisor ID with e2.employee_id supervisors Employee ID.
Think of this as getting employees who are also supervisors.

in case you want to use reports_to in the SELECT statement then I think you’ll be referring to the one in e1 and not e2.

2 Likes

Wao!!!
This clarified my issue, i think i understand how recursive joins works now.
Supervisors are also employees but are those other empoyees report to
Thank you

2 Likes

A graphical view of the problem:

4 Likes

Wouldn’t both codes work? Just that when it comes to the select clause would have to vary the e1 and e2.

thats

SELECT
e1.employee_id,
e2.employee_id supervisor_id
FROM employee e1
INNER JOIN employee e2 on e1.reports_to = e2.employee_id
LIMIT 4;

SELECT
e2.employee_id,
e1.employee_id supervisor_id
FROM employee e1
INNER JOIN employee e2 on e2.reports_to = e1.employee_id
LIMIT 4;

this analogy is based on the fact that everyone can choose which table e1 or e2 the modification should be carried on and which should be keep the original content

1 Like