Recursive join and concatenate operator in SQL

Screen Link:
Over the past few days, have spent countless hours on this mission, still finding it hard to understand, anyone to help me please. Note: Have read the knowledge base, still having same issues, had to create my own topic.

Recursive Joins in SQL
On this, I understand we want to get information from two columns in one table, since the table is aliased (e1), so to get information about employee_id and reports_to columns, we say (Select e1.employee_id, e1.reports_to from eployee e1). pls, what’s is e2 referring to, it’s totally confusing.
it got complicated on the INNER JOIN againg, Please, someone to clarify each line for me please. Thanks.

SELECT
    e1.employee_id,
    e2.employee_id supervisor_id
FROM employee e1
INNER JOIN employee e2 on e1.reports_to = e2.employee_id
LIMIT 4;

Concatenate operator
I’m totally confused from “album id is” || album_id col_1, till the end. Kindly help me, I’m stuck in my mission and I don’t like copying answers without even understanding the topic. Someone to help me break down Concatenate operator to my understanding.

SELECT
    album_id,
    artist_id,
    "album id is" || album_id col_1,
    "artist id is" || artist_id col2,
    album_id || artist_id col3
FROM album LIMIT 3;

Thanks.

Hi @stobiabatan23,
following your code blocks

SELECT
    e1.employee_id,
    e2.employee_id supervisor_id
FROM employee e1
INNER JOIN employee e2 on e1.reports_to = e2.employee_id
LIMIT 4;

What you are doing here is joining a table with itself in two instances.

  • first you call the table employee e1
  • then you join it to itself, but call this second “instance” of the table e2
    What you should be careful about is the join relationship. for e1 you are using the column “reports_to”, and joining it to the id of the supervisor.

Basically, you are looking at employee id data for table 1. Then you are interested at the “manager” data in table 2. To understand who is the supervisor of whom.

The second part is actually easier

SELECT
    album_id,
    artist_id,
    "album id is" || album_id col_1,
    "artist id is" || artist_id col2,
    album_id || artist_id col3
FROM album LIMIT 3;

The concatenate operator simply takes a string variable and concatenates it to a sql statement.
so your result here is a 5-columns table, shaped for instance as

album_id | artist_id | col_1 | col2 | col3

1234 | 1 | album_id is 1234 | artist_id is 1 | 12341

it is easier in my opinion to read this sql by declaring variables with “as”, such as this

SELECT
    album_id,
    artist_id,
    "album id is" || album_id  AS  col_1, -- this concatenates the string "album id is" and the actual album id
    "artist id is" || artist_id AS  col2,
    album_id || artist_id AS  col3 --this concatenates together the value of album_id&artist_id
FROM album LIMIT 3;

Let me know if it makes sense to you!
cheers
N

1 Like

Just for extra knowledge,
ON conditions in JOIN can be removed, and added to WHERE condition after the join is done.
Filtering in ON in JOIN clause vs filtering in WHERE after JOIN is mostly a preference issue as sql query plan optimizers will treat them about the same.

For practical applications of recursive joins, basically anytime you want to convert a long table to a wide table, it is of use.
Long tables are good for convenient data appending and storage. Wide tables are transformed from long tables during analysis to move important elements of analysis into columns.

By storing someone’s id, and the id of his boss, it allows the sql tables to represent an infinitely big tree of an organization hierarchy, with people without boss id at the top.

Another scenario of self-join is when supermarket wants to study which pairs of products are purchased together. I’ll let you figure out how the joining relationship is different from the employee example.

Apple Interview Question:

Let’s say we have two tables, purchases and products. The purchases table consists of over a billion rows of transactions bought by users.

We are trying to find paired products that are often purchased together by the same user, such as wine and bottle openers, chips and beer, etc…

Write a query to find the top 100 paired products and their names.
(Yes, you don’t need real data values to solve such questions)

purchases table

column type
user_id integer
created_at datetime
product_id integer

products table

column type
id integer
name string
1 Like

@nlong
Thanks so much, I believe the recursive join and concatenates is much more clearer.
Thanks
S

@hanqi
Thanks for the tip, it’s awesome.
Thanks
S