CYBER WEEK - EXTRA SAVINGS EVENT
TRY A FREE LESSON

Subquerys SQL Why do I need to use subqueries instead of an aggregate function?

For https://app.dataquest.io/m/255/subqueries/1/writing-more-complex-queries

I know what subqueries do but when I have a linger question that was not answered in this section that is bothering me. How come this code:

SELECT * FROM recent_grads
WHERE ShareWomen > AVG(ShareWomen)

gives this error: (sqlite3.OperationalError) misuse of aggregate function AVG() [SQL: SELECT * FROM recent_grads WHERE ShareWomen > AVG(ShareWomen);] (Background on this error at: http://sqlalche.me/e/e3q8)

I need to use something along the line of:

SELECT * FROM recent_grads
WHERE ShareWomen > (SELECT AVG(ShareWomen) FROM recent_grads);

to get my answer but I’m just wondering why SQL can’t use the aggregate function of WHERE ShareWomen > AVG(ShareWomen)

Normally we use having clause when we have some condition based on some aggregate group column.
example: select id, AVG(ShareWomen) from recent_grads group by id having AVG(ShareWomen) > 2;

Where clause = condition on some normal table column and having clause = condition on some aggregate group column so where clause wasn’t designed to use aggregate functions directly.

However for the “rows with ShareWomen greater than AVG(ShareWomen)” problem you are trying to solve in sql, you can store AVG(ShareWomen) value to a variable such as avgWomen and then use that variable in where clause like below:
set @avgWomen = (select AVG(ShareWomen) from recent_grads);
select * from recent_grads where ShareWomen > @avgWomen;

Thanks, that makes sense.

Here’s some interesting info on how having clause can be expressed as where: https://stackoverflow.com/questions/12121125/is-the-having-clause-redundant

Hi
This can be solved in this way too-
Step 1) First we can find the avg

SELECT AVG(ShareWomen) FROM recent_grads
OUTPUT is 0.5225502029537575

Step 2) Then

SELECT * FROM recent_grads
WHERE ShareWomen > 0.52

OR Simply-
SELECT * FROM recent_grads
WHERE ShareWomen > (SELECT AVG(ShareWomen) FROM recent_grads);

I hope this may help
Regards
Anu Jain

1 Like

Hi,
you are asking where we can’t can use average function direct in where clause? like you given in this query.

SELECT * FROM recent_grads
WHERE ShareWomen > AVG(ShareWomen)

means you needs as simple English SQL.

Ans:

  1. Where clause never considers the complete table this only take rows we given in the condition and here average function we are using that will use the complete table.

  2. you can use group by and having and group by if confused with a subquery.

1 Like