Guided Project: Answering Business Questions Using SQL: 191-5, total value of sales?

I’m really struggling with this query, and I think looking at the solution and some of the Q/A on here has only confused me more because I’ve seen this done both ways.
I’m trying to understand how to get the “total value of sales” but the way I thought it should be done is by doing SUM(invoice.total), whereas the solution on Github has it as SUM(invoice_line.unit_price). When I ran them side-by-side, they are no where near the same numbers and I’m just not sure why we wouldn’t use the sum of invoice totals. Can anyone explain it?

Side-by-side comparison:


This is from the solution:

Thanks in advance!

1 Like

Hi ywbadri,
Well I was about to upload my project when I saw your post and I was also confused as I never encountered this problem during my analysis because I was not taking invoice_line table into consideration in my INNER JOINS. So i decided to look at this issue and dig deeper into it…
I ran this simple query without considering invoice_line and I got the following results
SELECT
c.country,
SUM(i.total),
COUNT(i.invoice_id)
FROM invoice AS i
INNER JOIN customer AS c ON i.customer_id=c.customer_id
GROUP BY 1
ORDER BY 1 DESC
LIMIT 10
And ths is my output
image
Now I will also include invoice_line in my inner join
SELECT
c.country,
SUM(i.total),
COUNT(i.invoice_id),
COUNT(il.invoice_id),
SUM(il.unit_price)
FROM invoice AS i
INNER JOIN customer AS c ON i.customer_id=c.customer_id
INNER JOIN invoice_line AS il ON i.invoice_id=il.invoice_id
GROUP BY 1
ORDER BY 1 DESC
LIMIT 10
And my output
image
Now I will only focus on USA and apply groupby on customer ids
SELECT
c.customer_id,
SUM(i.total),
COUNT(i.invoice_id),
COUNT(il.invoice_id),
SUM(il.unit_price)
FROM invoice AS i
INNER JOIN customer AS c ON i.customer_id=c.customer_id
INNER JOIN invoice_line AS il ON i.invoice_id=il.invoice_id
WHERE c.country=‘USA’
GROUP BY 1
ORDER BY 1 DESC
LIMIT 10
And my output is
image
Now I will be investigating the customer ID 28 and groupby on the invoice ids of this customer
SELECT
i.invoice_id,
i.total,
SUM(i.total),
COUNT(i.invoice_id),
COUNT(il.invoice_line_id),
SUM(il.unit_price)
FROM invoice AS i
INNER JOIN customer AS c ON i.customer_id=c.customer_id
INNER JOIN invoice_line AS il ON i.invoice_id=il.invoice_id
WHERE c.country=‘USA’ AND c.customer_id=28
GROUP BY 1
ORDER BY 1 DESC
LIMIT 10
And my output is
image

Observe the first invoice id in the image which is 577. Now over here the SUM(i.total) value is an incorrect indicator because what it is doing is that it is adding SUM(il.unit_price) 9 times because 9 items have been purchased in this invoice_id. The total 8.91 is already 9 times unit_price because 9*0.99 = 8.91. So it is actually calculating the sum(i.total) based on the sum(il.unit_price)*Count(il.invoice_line_id). Therefore the value 10405 for USA actually contains twice the invoice_line_id count for every invoice_id for every customer_id.
I hope you have an idea why 10405 for USA is incorrect…
Feel free to reply back so that we can clear our doubts…
Best regards,
Saad Mohsin Khan

1 Like

Thank you so much for looking into this! It makes sense now somehow.

1 Like