BLACK FRIDAY EXTRA SAVINGS EVENT - EXTENDED
START FREE

Pls someone explain for me about CAUSE in SQL section

Screen Link: https://app.dataquest.io/m/191/guided-project%3A-answering-business-questions-using-sql/4/analyzing-employee-sales-performance

In suggestion, they writed below. I can’t understand :’ CASE
WHEN (
SELECT count(*)
FROM customer
where country = c.country
) = 1 THEN “Other”
ELSE c.country’

How does it work? Pls someone explain in detail for me.

sales_by_country = ‘’’
WITH country_or_other AS
(
SELECT
CASE
WHEN (
SELECT count()
FROM customer
where country = c.country
) = 1 THEN “Other”
ELSE c.country
END AS country,
c.customer_id,
il.

FROM invoice_line il
INNER JOIN invoice i ON i.invoice_id = il.invoice_id
INNER JOIN customer c ON c.customer_id = i.customer_id
)

SELECT
country,
customers,
total_sales,
average_order,
customer_lifetime_value
FROM
(
SELECT
country,
count(distinct customer_id) customers,
SUM(unit_price) total_sales,
SUM(unit_price) / count(distinct customer_id) customer_lifetime_value,
SUM(unit_price) / count(distinct invoice_id) average_order,
CASE
WHEN country = “Other” THEN 1
ELSE 0
END AS sort
FROM country_or_other
GROUP BY country
ORDER BY sort ASC, total_sales DESC
);
‘’’

1 Like

CASE statements are used when you want the value in a column to conditionally be different based on other factors, such as the values in other columns.

I’ll illustrate this with the following table (let’s call it table1) as an example:

Name Score
Alice 35
Ben 50
Cynthia 60
Darren 70
Emma 80

I want to make an SQL query that tells me who the high-scorers are. I’ll define a high-scorer as anyone with a score of 55 or higher.

This is what the various parts of the CASE statement would look like:

CASE                # This simply says that you're invoking a CASE statement. 
WHEN Score >= 55    # "When the score is 55 or above...."
THEN "High Scorer"  # .... then return "High Scorer"
ELSE "Low Scorer"   # Else for any other event, return "Low Scorer"
END AS "Outcome"    # End this Case Statement and let's label it as "Outcome"

Now let’s throw this into an actual SQL query:

SELECT
    Name,
    CASE
        WHEN Score >= 55    
        THEN "High Scorer"  
        ELSE "Low Scorer"   
    END AS "Outcome"
FROM table1

Output:

Name Outcome
Alice Low Scorer
Ben Low Scorer
Cynthia High Scorer
Darren High Scorer
Emma High Scorer

Our CASE statement allowed us to conditionally alter the value in the new Outcome column based on the value in the Score column. You can think of it in plain English terms:

“Hey SQL, I’m gonna specify you a CASE.
WHEN the score of a person is 55 or above, THEN call them a “High Scorer”.
ELSE, call them a “Low Scorer”.
END this case statement AS the column ‘Outcome’”.

Let’s link this back to the screen:

SELECT
CASE
WHEN (
SELECT count()
FROM customer
where country = c.country
) = 1 THEN “Other”
ELSE c.country
END AS country,

count(*) returns the number of rows where the specified condition is met.

What this CASE statement is essentially saying is “If the number of rows in the table where this country occurs is 1, then just label it “Others”. Else, give me the actual name of the country”.

4 Likes

Your answer didn’t point out the most important stuff in the question.

This is a good one: CASE and Subqueries - #3 by hanqi
And this part of it is essential:
Correlated subquery - Wikipedia

UPDATE: the main problem that is posted in the question is the part where inner query exchange data with outer query. So its a problem of Chicken or the egg paradox