Several doubts about country stats part of the GP: Answering Business Questions Using SQL

Screen Link: Learn data science with Python and R projects

I have been despairing about this project for a while. So I decided to just copy the logic of the answer provided by the DQ team. But I still have questions. :scream:

I did almost identically the same, except I used invoice.total column for the calculation of the sales amount unstead of unit_price, and using customer table instead of invoice_line in the WITH subquery (but I joined the invoice_line table anyway just in case I see any use for it).

My Code:

%%sql
WITH country_or_other AS
                     (
                      SELECT 
                             CASE 
                                  WHEN (
                                       SELECT COUNT(*)
                                       FROM customer
                                       WHERE country = c.country
                                       ) = 1 THEN "Other"
                                  ELSE  c.country
                             END AS country,
                             c.customer_id,
                             il.*
                      FROM customer c
                      INNER JOIN invoice i ON i.customer_id = c.customer_id
                      INNER JOIN invoice_line il ON il.invoice_id = i.invoice_id
                     )
SELECT country,
       num_customers,
       total_sales,
       avg_sales_per_customer,
       avg_order
    FROM
        (
         SELECT country,
                COUNT (DISTINCT c.customer_id) num_customers,
                SUM (i.total) total_sales,
                SUM (i.total)/COUNT (DISTINCT customer_id) avg_sales_per_customer,
                SUM(i.total)/COUNT (DISTINCT invoice_id) avg_order,
                CASE
                WHEN country = "Other" THEN 1
                ELSE 0
                END AS sort
         FROM country_or_other 
         GROUP BY country
         ORDER BY sort
        );

I get the error

(sqlite3.OperationalError) no such column: c.customer_id

I can’t see what I missed. Please, help.

Also, I would really appreciate it if someone could explain to me:

  1. why do they use unit_price and and not total from the invoice,
  2. What does the line WHERE country = c.country do?

Read you!
Best day to everyone!

PS: I think this project makes a really challenging leap from all the previous material. Maybe DQ Team should think of probably giving some more props for this project. If intimidating isn’t your goal of’course :stuck_out_tongue: Because, personally, I started doubting my capabilities and I am 60% into the path without any such troubles.

hey daryaholodova,

the answer provided by DQ team is pretty dirty too, soo i’m unable to answer your two last questions. But, answering the first one, your code breaks because in your last select statement you’re using aliases that doesn’t exists

please, try the following:


SELECT country,
       num_customers,
       total_sales,
       avg_sales_per_customer,
       avg_order
    FROM
        (
         SELECT country,
                COUNT (DISTINCT customer_id) num_customers,
                SUM (total) total_sales,
                SUM (total)/COUNT (DISTINCT customer_id) avg_sales_per_customer,
                SUM(total)/COUNT (DISTINCT invoice_id) avg_order,
                CASE
                WHEN country = "Other" THEN 1
                ELSE 0
                END AS sort
         FROM country_or_other 
         GROUP BY country
         ORDER BY sort
        );
1 Like