Self Join - On clause explanation

,

Screen Link: screen link

My Code:

SELECT e1.first_name || ' ' || e1.last_name AS report,
       e2.first_name || ' ' || e2.last_name AS manager
  FROM employee AS e1
  LEFT JOIN employee AS e2
    ON e2.reports_to = e1.employee_id;

The solution:

SELECT e1.first_name || ' ' || e1.last_name AS report,
       e2.first_name || ' ' || e2.last_name AS manager
  FROM employee AS e1
  LEFT JOIN employee AS e2
    ON e1.reports_to = e2.employee_id;

What I expected to happen:
I expected to work in the same way as the suggested solution where the only that changes is the ON clause condition.

I don’t understand how SQL interprets differently this solution just by changing the ON clause in the way I did. Any help is highly appreciate as this is driving me crazy :smiley:

What actually happened:

1 Like

Yeah, it can be confusing.

One way to make it clearer is by using a more distinct alias for the two employees table. Since we want e1 to be the ones being supervised, we can name it with just e; for the e2 table, which is meant for finding managers, we can name it m instead.

Yours:

SELECT e.first_name || ' ' || e.last_name AS report,
       m.first_name || ' ' || m.last_name AS manager
  FROM employee AS e
  LEFT JOIN employee AS m
    ON m.reports_to = e.employee_id;

What’s happening here is we are “labeling” the normal employees as the managers; that is e should be the one who reports_to someone and not the other way around. In other words, managers will be in e, while non-managers will be in m. SQL doesn’t understand our business logic, the only thing it knows is we want to join two things together.

Using your original aliases, let’s flip it around:

SELECT 
e2.first_name || ' ' || e2.last_name AS report,
e1.first_name || ' ' || e1.last_name AS manager,    
  FROM employee AS e1
  LEFT JOIN employee AS e2
    ON e2.reports_to = e1.employee_id;

The output:

Is quite similar to the solution’s output:

There are more missing values in the flipped solution because remember that we mislabeled e1 as managers. Because some of those mislabeled employees do not have anyone from e2 who reports to them, they’ll have missing values due to Left Join.

The solution doesn’t have this issue because only one employee, the CEO presumably, doesn’t report to anyone.

I’m not sure if my explanation above is clear or not. If it confuses you even more, please disregard it.


Addendum: Oh yeah, the following also works:

SELECT e1.first_name || ' ' || e1.last_name AS report,
       e2.first_name || ' ' || e2.last_name AS manager
  FROM employee AS e1
  LEFT JOIN employee AS e2
    ON e2.employee_id = e1.reports_to;

It’s essentially the same as the solution, only the left and right is flipped.

3 Likes

We should have used more meaningful aliases here. Nice suggestion!

2 Likes