Scalar Subqueries in SQL Practice Problems Screen 6

Hi! Can someone please explain why the ON statement is not used for joining tables in the solution for this practice problem?

Instructions:
Write a query that returns all the rows in performance where the student studies more than the average hours studied per week, as detailed in the survey table

Solution:

SELECT *
  FROM performance AS p
  JOIN survey AS s
 WHERE s.hours_per_week > (SELECT AVG(hours_per_week) 
                             FROM survey);

Screen Link:
https://app.dataquest.io/m/1000601/scalar-subqueries-in-sql-practice-problems/6/subqueries-in-where-statements-2

Not including an ON clause is equivalent to performing a CROSS JOIN, so the solution is equivalent to. . .

SELECT *
  FROM performance AS p
 CROSS JOIN survey AS s
 WHERE s.hours_per_week > (SELECT AVG(hours_per_week) 
                             FROM survey);

To bridge the gap between this and what may have been your solution, revisit the CROSS JOIN screen.

2 Likes

Thank you for your reply, but using cross join seems to give a different solution.

Also, I think the answer provided by Dataquest is wrong:

The survey table doesn’t have all the student_ids present in the performance table. The answer provided by Dataquest lists all the student_ids from the performance table and joins them with uncorrect rows from the survey table. Eg. the first 10 rows displayed show the student_ids from performance table from 1 to 10, meanwhile the corresponding id from the survey table is 1196 for all of these.

It may depend on what you mean with “different”. Different order? Possibly, but it should have the same rows. SQL doesn’t care about order unless we tell it to, what usually matters is that the data is there.

You’re right, our answer is incorrect. We’re missing the ON clause, specifically ON p.student_id = s.student_id.

You are right. I thought it gives a different result because it wasn’t accepted as “correct” answer, but if I order the queries the same way, they are the same (based on the first 10 rows displayed).

You can confirm they are the same by running the query suggested by the pseudo code below.

OUR_SOLUTION EXCEPT YOUR_SOLUTION

and then switching the order and verifying we get zero rows both times.

1 Like