# SQL - JOINS Practice Questions. WHERE clause not used in solution

My Code:

``````SELECT pu.parental_education, pu.lunch,
COUNT(s.plays_sports) *100.0 / (SELECT COUNT(*) FROM survey)percentage_sports
FROM performance_updated pu
INNER JOIN survey s
ON pu.student_id = s.student_id
WHERE s.plays_sports = 1
GROUP BY pu.parental_education, pu.lunch;
``````

What I expected to happen:
Return a query that returns the `parental_education` and `lunch` columns from `performance_updated` table, as well as the percentage of students in each combination that participate in sports (`plays_sports` in `survey` ).

What actually happened:

``````The query result table has less rows than we expected. We expected a table with 12 rows but the provided table only has 11 rows.
``````

Solution from DQ is as below, however it is calculating AVG , whereas the mission asks us to
calculate percentage for the students playing sport i.e. column plays_sports value should be equal to 1, for each combination of parental_education & lunch.

SELECT pu.parental_education,
pu.lunch,
AVG(s.plays_sports) * 100 AS percentage_sports
FROM performance_updated AS pu
JOIN survey AS s
ON pu.student_id = s.student_id
GROUP BY pu.parental_education, pu.lunch;

Since there are 100 rows in `survey` and the values are either `0` or `1` in the `plays_sports` column, taking the average and multiplying by `100` will effectively calculate the percentage. Simply taking the sum of this column would get you there as well. EDIT_1: This is true only if we arenâ€™t grouping by anythingâ€¦once we `GROUP BY`, each group will some who play sports, and some who donâ€™t. Taking the average of each group and multiplying by 100 will give us the percentage for that group.

All that said, the way that you are attempting to calculate this percentage will effectively count how many rows you have in each of your groups, as defined by your `GROUP BY` clause. Try adding the column `COUNT(s.plays_sports)` to your `SELECT` clause and you will see that itâ€™s the same as your percentage column.

The reason for this result is that your subquery (namely: SELECT COUNT(*) FROM survey) will always return `100` (ie the number of rows in `survey`) which cancels out with your `*100` in the numerator, leaving only `COUNT(s.plays_sports)` for your `percentage_sports` column.

As to why youâ€™re getting 11 rows instead of 12, Iâ€™m going to guess itâ€™s because of your `WHERE` clause thatâ€™s filtering your results unnecessarily before they are grouped. It looks like filtering by `plays_sports=1` causes you to lose the group: parental_education=bachelorâ€™s degree / lunch=standard because (Iâ€™m guessing) that group doesnâ€™t have anyone who plays sports and therefore you lose this group entirely.

I think if you address these two issues, you should be able to pass the screen without any more problems.

EDIT_2:
Here is how I solved it:

``````SELECT pu.parental_education,
pu.lunch,
SUM(s.plays_sports)/COUNT(s.plays_sports)*100 AS percentage_sports
FROM performance_updated AS pu
JOIN survey s ON s.student_id=pu.student_id
GROUP BY pu.parental_education, pu.lunch;
``````
1 Like

Thanks. Agree on the calculation part

For the WHERE clause, the missions asks to calculate " percentage of students in each combination that participate in sports", that is why I included WHERE.

Edit: I think now I understand, even though mission asks to show students participating in sports, those who are not participating should not be excluded from the output table, but shown as â€ś0â€ť - that way the DQ answer makes sense

1 Like