CYBER WEEK - EXTRA SAVINGS EVENT
TRY A FREE LESSON

Calculating percentage or share using AVG vs TOTAL

Screen Link: https://app.dataquest.io/m/254/group-summary-statistics/4/querying-virtual-columns-with-the-having-statement

My Code:

SELECT Major_category, TOTAL(Low_wage_jobs)/TOTAL(Total) share_low_wage FROM recent_grads GROUP BY Major_category HAVING share_low_wage > 0.1

What I expected to happen:
Query is valid and result has same content as when using AVG.

What actually happened:
DQ solutions use AVG function to calculate the percentage or share. If I use TOTAL and submit the answer, it gives the following error. Is using AVG instead of TOTAL in such queries the only correct method? Thanks for your help!

The value for result doesn't look right.

hey @DnaData

I tried your solution as well, turns out the differences are at the 17th place after the decimal. So the answer is wrong! just kidding! :stuck_out_tongue_closed_eyes:

Most likely since the instruction clearly tells you the query you need to use so it might be expecting the exact same values (I am not 100% sure), but this the instruction query deals with avg of avg and your query deals avg of totals. That’s the only difference.

Hey Rucha, I tried using SUM instead of TOTAL, but my output is blank, do you know why this is?

SELECT
Major_category,
SUM(Low_wage_jobs) / SUM(Total) Share_low_wage
FROM recent_grads
GROUP BY 1
HAVING Share_low_wage > 0.

Hi @raunakkedar

Wow! interesting question. (I don’t know if Dataquest covered this part in the course. Let me know if DQ has mentioned this.)

And I assume it’s a decimal after 0 in the having clause in your question here -

Access this link and scroll to the part SUM(X) and TOTAL(X) - Built-in Aggregate Functions (sqlite.org)

Then try these queries:

query1

SELECT
    Major_category,
    SUM(low_wage_jobs),
    SUM(total),
    SUM(Low_wage_jobs) / SUM(Total) AS Share_low_wage_SUM,
    TOTAL(low_wage_jobs),
    TOTAL(total),
    TOTAL(Low_wage_jobs) / TOTAL(Total) AS Share_low_wage_TOTAL
FROM recent_grads
GROUP BY 1
HAVING Share_low_wage_SUM > 0.

query2

SELECT
    Major_category,
    SUM(low_wage_jobs),
    SUM(total),
    SUM(Low_wage_jobs) / SUM(Total) AS Share_low_wage_SUM,
    TOTAL(low_wage_jobs),
    TOTAL(total),
    TOTAL(Low_wage_jobs) / TOTAL(Total) AS Share_low_wage_TOTAL
FROM recent_grads
GROUP BY 1
HAVING Share_low_wage_TOTAL > 0.

Let me know your observations.