Joining Data in SQL - 8.Challenge: Complex Query with Joins and Subqueries

Hi everyone, hope everything is fine with you :slight_smile:,

I am currently stuck in this mission:

https://app.dataquest.io/m/179/joining-data-in-sql/8/challenge-complex-query-with-joins-and-subqueries

I came out with this Query:

SELECT f.name AS 'country', c.urban_pop, f.population AS 'total_pop', ROUND(CAST(c.urban_pop AS Float)/f.population, 2) AS 'urban_pct' FROM facts AS f
INNER JOIN (SELECT *, SUM(population) AS 'urban_pop' FROM cities) AS c ON c.facts_id = f.id
WHERE urban_pct > .5
ORDER BY urban_pct ASC;

And I expected for the query to generate a table with a list of the countries where the urban_pct of the pop would outpace by half the total pop. But what actually happened is that it only generated me a single output with a single country:

country

urban_pop

total_pop

urban_pct

Zimbabwe

991685100

14229541

69.69

Could someone give me a hand, cuz I donnu what else to do, and I don’t want to click on the See the answer option :smiley:.

Muito Obrigado!

Here’s a hint. You use the subquery below. Run it and see its results.

SELECT *,
       SUM(population) AS 'urban_pop'
  FROM cities;

Olá Olá :smiley:
I am already using it in the INNER JOIN:

INNER JOIN (SELECT *, SUM(population) AS 'urban_pop' FROM cities)

. You mean should I use the subquery in the SELECT?

I’ve changed the query to this:

SELECT f.name AS 'country', (SELECT SUM(population) FROM cities) AS 'urban_pop', f.population AS 'total_pop', ROUND(CAST(c.population AS Float)/f.population, 2) AS 'urban_pct'  FROM facts AS f
INNER JOIN cities AS c ON c.facts_id = f.id
WHERE urban_pct > .5
ORDER BY urban_pct ASC;

But now I have an issue with the urban_pop, I think its adding all the values in the column, and giving the same result to all the rows, but I just want it to add the population per country…

1 Like

Your query is salvageable, but it makes it needlessly complex (it requires the introduction of correlated subqueries). For completion, here’s a modification to your code that makes it right:

SELECT DISTINCT
	   f.name AS 'country',
       (SELECT SUM(population) FROM cities WHERE c.facts_id=facts_id ) AS 'urban_pop',
	   f.population AS 'total_pop',
	   (SELECT SUM(population) FROM cities WHERE c.facts_id=facts_id )*1.0/f.population AS 'urban_pct' 
  FROM facts AS f
 INNER JOIN cities AS c ON c.facts_id = f.id
 WHERE urban_pct > .5
 ORDER BY urban_pct ASC;

Instead, I recommend that you follow the strategy used in the solution:

  • Create a table (it can be a subquery or you can use a WITH statement) that finds the total urban population for each country.
  • Join facts with this table and do the appropriate calculations.

I hope this helps.

2 Likes

Thank you very much, my friend :slight_smile:

The secret was in this line of code:

SELECT 
            facts_id, SUM(population) AS 'urban_pop' 
            FROM cities
            GROUP BY 1)

Adding the countries by group based on their unique id ;).

Merci Merci!!

1 Like