Using Loop With Pandas below

selected_rows = f500[f500["country"] == c]

Can somebody please explain why the code above outputs only "Israel" out of the 34 unique countries?

@jithins123 @nityesh, please explain. Thanks.

hi @aniefiokuduakobong

What is the full code you are using for this mission?

If you just print selected_rows, at the end of the for loop (and outside the for loop), you will get the last group of records selected by the country.

for c in countries:


In this case, only one row, at index 495, belongs to the country “Israel”, so Israel is the country displayed to you.
Try this code to check:

f500[f500["country"] == "Israel"]

To complete this mission you are missing a couple of steps to identify the company with the highest employment for each country.

Thanks @Rucha, but I am still not getting it.

This is the full code for the mission.

#Create an empty dictionary
top_employer_by_country = {}
#Create an array of unique values from `country`
countries = f500["country"].unique()

#loop through unique countries
for c in countries:
    seleted_rows = f500[f500["country"] == c]
    sorted_rows = selected_rows.sort_values("employees", ascending=False)
    first_row = sorted_rows.iloc[0]
    employer_name = first_row["company"]
    top_employer_by_country[c] = employer_name

The first line of code below returns an array of the 34 unique countries.

countries = f500["country"].unique()

The code below outputs all the countries in the data frame


The boolean comparison in the code below should output a boolean (True) for all 34 unique countries instead of only Israel.

f500["country"] == c

These are my thoughts.

It does produce True for the rows of every unique country, just not all at once. The loop variable c changes from country to country as the loop runs, and so does the selected_rows variable. In each iteration, selected_rows stores only the rows that refer to country c. The output you see is the result of the last iteration, which assigns to selected_rows only the rows that refer to the last country in countries. If you take a look at the countries variable, you’ll see that the last country is Israel.
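
To make that concrete, here is a minimal sketch. It assumes a tiny made-up stand-in for f500 (the company names and employee counts are invented, and mask and rows_per_country are names added only for illustration). The comparison gives one True/False per row, and selected_rows is overwritten on every pass, so only the last country's rows are left once the loop ends.

```python
import pandas as pd

# Hypothetical miniature of f500: company names and employee counts are made up.
f500 = pd.DataFrame({
    "company": ["Alpha", "Beta", "Gamma", "Delta"],
    "country": ["USA", "Japan", "Japan", "Israel"],
    "employees": [500, 300, 400, 100],
})

countries = f500["country"].unique()

rows_per_country = {}
for c in countries:
    mask = f500["country"] == c   # one True/False value per row of f500
    selected_rows = f500[mask]    # overwritten on every iteration
    rows_per_country[c] = len(selected_rows)

# Every country was selected at some point during the loop...
print(rows_per_country)
# ...but after the loop only the last country's rows remain.
print(selected_rows["country"].tolist())
```

Printing inside the loop, or collecting results in a dictionary as above, is what lets you see every country rather than just the last one.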


Hey, everything you did is right except for this one

There is a hidden typo here! If you can find it, your code will run fine.
About seeing Israel: I think the output shows only the last iteration, as @otavios.s said. But if you print the variables inside the loop, you can see the rows for every country appear. So your logic is right; don’t worry about seeing only Israel there. If you check selected_rows as well, you will find that only one row is selected. I think that is simply how the variable is displayed on the output screen.

It took me a long while to figure out this very well hidden mistake. Hope you can find it now. Let me know.

hey @aniefiokuduakobong

Yup, as @jithins123 has highlighted, there is a typo: seleted instead of selected. Your code will work once this is corrected!


Hey @Rucha and @jithins123,

The typo is not the problem. If it was, the code would raise an error stating that selected_rows is not defined. The typo probably happened while posting the code here in the community.

The code is correct; he is just expecting a different outcome. As @jithins123 said, if he prints selected_rows inside the loop, he’ll see every unique country appear.
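
For reference, this is roughly what the "not defined" case looks like. It is a sketch that assumes a fresh session where selected_rows has never been assigned, and it uses a made-up miniature of f500 (invented companies and employee counts). The error_message variable is only there to capture the error for printing.

```python
import pandas as pd

# Hypothetical miniature of f500: company names and employee counts are made up.
f500 = pd.DataFrame({
    "company": ["Alpha", "Beta", "Gamma"],
    "country": ["USA", "Japan", "Israel"],
    "employees": [500, 300, 100],
})

error_message = None
try:
    for c in f500["country"].unique():
        seleted_rows = f500[f500["country"] == c]  # typo: assigns a different name
        # selected_rows was never assigned in this session, so this line fails
        sorted_rows = selected_rows.sort_values("employees", ascending=False)
except NameError as exc:
    error_message = str(exc)

print(error_message)
```

If selected_rows already exists from an earlier cell or step, the same loop runs without any error, which is what makes the typo hard to spot.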

Hi @otavios.s,

I think you are right; I didn’t think about that.
But I did run the code with and without the typo in the DQ environment.
It looks like no error was generated because he used the same variable name mentioned in the instructions, so the answer-checking engine probably used the value already stored in selected_rows and returned a different output based on that. When I checked, every company name came out as 'Toyota'. Now I think I know why I got that as an answer.

Maybe if I had run the code in my own Jupyter notebook, it would have raised an error like you mentioned.

What do you think?

You are right, no error is raised. That’s because selected_rows was already defined in the step before this one. And that’s also the reason it outputs Toyota: in step 10 we’re told to assign the rows where the country is Japan to selected_rows.

Step 10 instruction and code:

  1. Find the company headquartered in Japan with the largest number of employees.
  • Select only the rows that have a country name equal to Japan.
selected_rows = f500[f500['country'] == 'Japan']

Now the typo becomes a huge problem, especially because no error is raised. With this typo, the only line whose result changes with each iteration is the following:

    seleted_rows = f500[f500["country"] == c]

Since all the other lines of code inside the for loop depend on selected_rows rather than seleted_rows, and selected_rows never changes, they store the same data (about Japan) in every iteration:

sorted_rows = selected_rows.sort_values("employees", ascending=False)
first_row = sorted_rows.iloc[0]
employer_name = first_row["company"]
top_employer_by_country[c] = employer_name
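
A small sketch of that effect, assuming a made-up miniature of f500 (the companies and employee counts are invented, and buggy and fixed are names added only for this illustration). selected_rows is set to the Japan rows before the loop, as in step 10, and then the loop is run once with the typo and once without it.

```python
import pandas as pd

# Hypothetical miniature of f500: company names and employee counts are made up.
f500 = pd.DataFrame({
    "company": ["Alpha", "Beta", "Gamma", "Delta", "Epsilon"],
    "country": ["USA", "Japan", "Japan", "Germany", "Israel"],
    "employees": [500, 300, 400, 200, 100],
})
countries = f500["country"].unique()

# Left over from the previous step (step 10): the rows for Japan.
selected_rows = f500[f500["country"] == "Japan"]

# Loop with the typo: seleted_rows changes, selected_rows never does.
buggy = {}
for c in countries:
    seleted_rows = f500[f500["country"] == c]  # typo, result is never used
    sorted_rows = selected_rows.sort_values("employees", ascending=False)
    buggy[c] = sorted_rows.iloc[0]["company"]

# Loop without the typo: selected_rows is refreshed for each country.
fixed = {}
for c in countries:
    selected_rows = f500[f500["country"] == c]
    sorted_rows = selected_rows.sort_values("employees", ascending=False)
    fixed[c] = sorted_rows.iloc[0]["company"]

print(buggy)  # every country maps to Japan's largest employer
print(fixed)  # each country maps to its own largest employer
```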

@aniefiokuduakobong, although sorted_rows, first_row, employer_name, and top_employer_by_country hold wrong values (if you have the typo in your code), the final value of seleted_rows (with the typo) is indeed the rows for Israel, as I explained earlier.


Thanks a lot, @otavios.s. I went through my code again. I printed each variable inside the loop and saw that the output covered every unique country. It is just that Python’s output after the loop only shows the last item in the list.

Thanks again for coming through. I am really grateful.

@jithins123, thanks for your response and your contribution. It was a typo on my part.

@otavios.s you are very correct. :smiley:

@otavios.s, you are very correct. I did as you stated above and got every unique country in the output.
Thanks a lot.
