INTERMEDIATE JOINS IN SQL 189/7: Generating columns with the CASE statement

Hello,

I am having difficulty understanding why the GROUP BY 1 line is necessary here

https://app.dataquest.io/m/189/intermediate-joins-in-sql/7/generating-columns-with-the-case-statement

SELECT
   c.first_name || " " || c.last_name customer_name,
   COUNT(i.invoice_id) number_of_purchases,
   SUM(i.total) total_spent,
   CASE
       WHEN sum(i.total) < 40 THEN 'small spender'
       WHEN sum(i.total) > 100 THEN 'big spender'
       ELSE 'regular'
       END
       AS customer_category
FROM invoice i
INNER JOIN customer c ON i.customer_id = c.customer_id
GROUP BY 1
ORDER BY 1

My initial code did not contain the ‘GROUP BY 1’ and the output gives one single customer:
customer_name: Phil Hughes
number_of_purchases: 614
total_spent: 4709.429999999994
customer_category: big spender

Inputting the correct code and locating this customer I see the values are different:
customer_name: Phil Hughes
number_of_purchases: 11
total_spent: 98.01
customer_category: regular

Why is ‘group by 1’ important here when the instructions specifically states we do not have any 2 customers with the same name?

If you don’t include GROUP BY 1 it will count or sum the others purchases and spend amount for the first row (first customer) that’s why it gives output of only one customer with total number of purchases and total amount spend data given in the table.

Grouping the data by customer_name makes the cluster of customers with theirs own purchases and spend amount in any other time. So it become easy to sum or count the amount spend and purchases for respective customer.

Makes sense. Thank you Prem!

1 Like