Use of "IN" vs. "IS" with WHERE clauses

Hi all,

I’m new to Dataquest and have no prior experience with SQL. I’m enrolled in the SQL Fundamentals course right now, and there’s a point of confusion for me that I was hoping someone could take a stab at answering. In Mission 4 (Subqueries), page 4. How come the correct operator here is “IN,” rather than “IS”? My intuition would tell me to use “IS” to have SQL return the values that meet the subquery criteria.

Any help would be appreciated!

Thanks,
-B

Hey, Ben.

I actually find IN to be more intuitive and in line with what we teach previously in the content. I’ll try to support this intuition, this answer isn’t particularly technical. Hopefully someone else can deal with the technical details.

Let us consider the subquery in the answer:

SELECT major_category
  FROM recent_grads
 GROUP BY major_category
 ORDER BY SUM(total) DESC
 LIMIT 5
;

This returns what in day-to-day life we would call a list:

Major_category
Business
Humanities & Liberal Arts
Education
Engineering
Social Science

Technically it returns a table because everything in SQL is a table, but I want to focus on the intuition.

So we want to filter for the rows that are in this list. Not for rows that are this list. In Python this would look something like if major_category is in top_5.

Another way of looking at this and possibly training your intuition is that the keyword IS more or less works like the comparison operator =. And you don’t want to filter for rows where the major category equals a list.

I hope this helps.

1 Like