Why does this solution return fewer rows than requested?

Screen Link:
https://app.dataquest.io/m/190/building-and-organizing-complex-queries/8/challenge-each-countrys-best-customer

My Code:

CREATE VIEW chinook.customer_totals AS
    SELECT customer.customer_id as customer_id, 
        SUM(invoice.total) as total_purchased,
        customer.country as country,
        customer.first_name || ' ' || customer.last_name as name
    FROM chinook.customer as customer
        LEFT JOIN chinook.invoice as invoice
        ON customer.customer_id = invoice.customer_id
    GROUP BY customer.customer_id
    ORDER BY country, total_purchased DESC;
SELECT 
    totals1.country country,
    totals1.name customer_name,
    totals1.total_purchased as total_purchased
    FROM chinook.customer_totals totals1
    WHERE totals1.total_purchased = (
        SELECT totals2.total_purchased
        FROM chinook.customer_totals totals2
        WHERE totals2.country = totals1.country
        ORDER BY totals2.total_purchased DESC
        LIMIT 1)
    ORDER BY country;

What I expected to happen:
The correct answer, which is given in the prompt.

What actually happened:
After some testing, it seems that my subquery (the last “SELECT” in the code) is not properly matching the value it should. Yet I do not understand why.

1 Like

This worked:

CREATE VIEW chinook.customer_totals AS
    SELECT customer.customer_id as customer_id, 
        SUM(invoice.total) as total_purchased,
        customer.country as country,
        customer.first_name || ' ' || customer.last_name as name
    FROM chinook.customer as customer
        LEFT JOIN chinook.invoice as invoice
        ON customer.customer_id = invoice.customer_id
    GROUP BY customer.customer_id
    ORDER BY country, total_purchased DESC;
SELECT 
    totals1.country country,
    totals1.name customer_name,
    totals1.total_purchased as total_purchased
    FROM chinook.customer_totals totals1
    GROUP BY 1
    HAVING total_purchased = MAX(total_purchased);
1 Like

That is much more elegant! So I should have used HAVING there instead of WHERE.

Still it doesn’t exactly answer the question; I was wondering why the query I posted doesn’t work. Because after checking it a few times I don’t understand what is wrong with it, so it seems I will learn something new about SQL by understanding the issue with it.

Suggestion: Run your subquery alone and see whether it is performing as expected.

I have tried doing this, in particular I ran

SELECT totals2.total_purchased
FROM chinook.customer_totals totals2
WHERE totals2.country = 'Argentina'
ORDER BY totals2.total_purchased DESC
LIMIT 1

because the first missing row was Argentina. But the subquery returned the expected quantity, 39.6.

In my opinion, you want to use aggregation to get the max value. LIMIT does not serve this purpose. WHERE does not serve the purpose of aggregation, it is for filtering total purchase is not UNIQUE and is not ideal for one-to-one mapping.customer_id is preferred.

I completely agree with this. But it doesn’t answer the question “why does the query not work?”

Because of the above in my opinion.