BLACK FRIDAY EXTRA SAVINGS EVENT - EXTENDED
START FREE

Confusion about SUM and GROUP BY in SQL

Hello,

I was working on the Group Summary Statistics in SQL mission, and I am confused why the following query seems to be incorrect.

My Code:

SELECT Major_category,
    SUM(Total * ShareWomen) AS Total_women,
    AVG(ShareWomen) AS Mean_women,
    SUM(Total) * AVG(ShareWomen) AS Estimate_women
FROM recent_grads
GROUP BY Major_category

I expected this to return the same output as the query given in the answer key (only the second line has changed):

SELECT Major_category,
    SUM(Women) AS Total_women,
    AVG(ShareWomen) AS Mean_women,
    SUM(Total) * AVG(ShareWomen) AS Estimate_women
FROM recent_grads
GROUP BY Major_category

I played around with this and was surprised to find that when I run

SELECT Major_category,
    SUM(Women+Men) AS Total_1,
    SUM(Total) AS Total_2
FROM recent_grads
GROUP BY Major_category

I get different values in each column.

Is this a bug? Or am I missing something about how SUM and GROUP BY work?

Link to mission:

1 Like

@jjb Welcome to the Community!

For this one below, the logic behind the code is correct. However, you are multiplying by ShareWomen which is in percentage. Therefore, Total_women final value is a float, but the answer check expect integer values. Even if you round, you might not get exactly the same value as SUM(Women) because of rounding errors.

For this code below, I think it is data entry error:

There are discrepancies for Engineering entries. So expect different values when you sum.

WITH one as 
(
    SELECT Major_category,
           men + women as men_women,
           total
    From recent_grads)

SELECT  *
FROM one
WHERE Major_category = 'Engineering'

image

I wanted to be sure that SQL wasn’t playing any tricks:

WITH one as 
(
    SELECT Major_category, Major,
           men + women as men_women,
           total
    From recent_grads
    LIMIT 5)

SELECT * FROM one

image

WITH one as 
(
    SELECT Major_category, Major,
           men + women as men_women,
           total
    From recent_grads
    LIMIT 5)


SELECT  SUM(men_women),
        SUM(total)
FROM one

image

2 Likes