Doubt about solution

Screen Link:
https://app.dataquest.io/m/179/joining-data-in-sql/8/challenge-complex-query-with-joins-and-subqueries

Here is the solution given:

SELECT
    f.name country,
    c.urban_pop,
    f.population total_pop,
    (c.urban_pop / CAST(f.population AS FLOAT)) urban_pct
FROM facts f
INNER JOIN (
            SELECT
                facts_id,
                SUM(population) urban_pop
            FROM cities
            GROUP BY 1
           ) c ON c.facts_id = f.id
WHERE urban_pct > .5
ORDER BY 4 ASC;

Here is the solution I tried, it is not working properly:

SELECT
    facts.name AS country,
    cities.urban_pop,
    facts.population AS total_pop,
    (cities.urban_pop / CAST(facts.population AS FLOAT)) AS urban_pct
FROM facts
INNER JOIN (
            SELECT
                facts_id,
                SUM(population) AS urban_pop
            FROM cities
            GROUP BY 1
           ) ON cities.facts_id = facts.id
WHERE urban_pct > .5
ORDER BY 4 ASC;

It seems like the fact I did not use any aliases for the tables are the unique difference between the two queries. Are these aliases for the tables is a MUST for the query to work?

1 Like

It would be better if you provide more details on why it’s not working properly.

I believe so, yes. cities table does not have urban_pop as a column. So cities.urban_pop will not work. When you use SELECT you are not creating or adding new columns in that table, so cities.urban_pop would be incorrect.

Same goes for cities.facts_id after your JOIN statement. Because the FROM cities is inside of the subquery. So, you can’t access the facts_id from cities outside of that subquery.

That’s why they have used an alias c for the JOIN. Because that alias then allows you to access the columns corresponding to that subquery.

2 Likes

Thank you! It makes sense.

I made another try. This time, using this alternative solution that one of our colleagues came up with in a differente thread:

SELECT  f.name AS country,
         SUM(c.population) AS urban_pop,
         f.population AS total_pop,
         (CAST(SUM(c.population) AS float) / CAST(f.population AS float)) AS urban_pct
   FROM  facts AS f
  INNER  JOIN cities as c ON f.id = c.facts_id
  GROUP  BY country
 HAVING  urban_pct > 0.5
  ORDER  BY urban_pct

I tried the following solution, removing the c and f aliases:

SELECT  facts.name AS country,
         SUM(cities.population) AS urban_pop,
         facts.population AS total_pop,
         (CAST(SUM(c.population) AS float) / CAST(facts.population AS float)) AS urban_pct
   FROM  facts
  INNER  JOIN cities ON facts.id = cities.facts_id
  GROUP  BY country
 HAVING  urban_pct > 0.5
  ORDER  BY urban_pct;

It didn’t work, I can’t understand why since the problem with the inexistent column in the select clause is not the case here. Is anyone able to help?

2 Likes

You forgot to remove alias ‘c’ from this
(CAST(SUM(**c.**population) AS float)

will work if you put full name:
(CAST(SUM(cities.population) AS float)

1 Like