Pandas: selecting the max value using a for loop

Hello everyone,

Happy Thanksgiving first! I hope you are all having a wonderful time.
Here is a small question I ran into today. Could someone please take a look?

The code below is meant to find the company that employs the most people in each of the 34 countries. The answer should be a dictionary. This is slide 11 in the exploring Pandas intermediate section.

Here is my code:

top_employer_by_country = {}

countries = f500["country"].unique()

for c in countries:
    selected_rows = f500[f500["country"] == c]
    employer_name = selected_rows.loc[(selected_rows["employees"]==selected_rows["employees"].max()),"company"]
    top_employer_by_country[c] = employer_name

Output:

{'Australia': 197    Wesfarmers
-  Name: company, dtype: object, 'Belgium': 205    Anheuser-Busch InBev
-  Name: company, dtype: object, 'Brazil': 190    JBS
-  Name: company, dtype: object, 'Britain': 386    Compass Group
-  Name: company, dtype: object, 'Canada': 292    George Weston
-  Name: company, dtype: object, 'China': 3    China National Petroleum
-  Name: company, dtype: object, 'Denmark': 297    Maersk Group
-  Name: company, dtype: object, 'Finland': 414    Nokia
-  Name: company, dtype: object, 'France': 483    Sodexo
-  Name: company, dtype: object, 'Germany': 5    Volkswagen
-  Name: company, dtype: object, 'India': 216    State Bank of India
-  Name: company, dtype: object, 'Indonesia': 288    Pertamina
-  Name: company, dtype: object, 'Ireland': 304    Accenture
-  Name: company, dtype: object, 'Israel': 495    Teva Pharmaceutical Industries
-  Name: company, dtype: object, 'Italy': 284    Poste Italiane
-  Name: company, dtype: object, 'Japan': 4    Toyota Motor
-  Name: company, dtype: object, 'Luxembourg': 155    ArcelorMittal
-  Name: company, dtype: object, 'Malaysia': 183    Petronas
-  Name: company, dtype: object, 'Mexico': 175    America Movil
-  Name: company, dtype: object, 'Netherlands': 19    EXOR Group
-  Name: company, dtype: object, 'Norway': 206    Statoil
-  Name: company, dtype: object, 'Russia': 62    Gazprom
-  Name: company, dtype: object, 'Saudi Arabia': 298    SABIC
-  Name: company, dtype: object, 'Singapore': 454    Flex
-  Name: company, dtype: object, 'South Korea': 14    Samsung Electronics
-  Name: company, dtype: object, 'Spain': 72    Banco Santander
-  Name: company, dtype: object, 'Sweden': 481    H & M Hennes & Mauritz
-  Name: company, dtype: object, 'Switzerland': 63    Nestle
-  Name: company, dtype: object, 'Taiwan': 26    Hon Hai Precision Industry
-  Name: company, dtype: object, 'Thailand': 191    PTT
-  Name: company, dtype: object, 'Turkey': 462    Koc Holding
-  Name: company, dtype: object, 'U.A.E': 479    Emirates Group
-  Name: company, dtype: object, 'USA': 0    Walmart
-  Name: company, dtype: object, 'Venezuela': 441    Mercantil Servicios Financieros
-  Name: company, dtype: object}

My code produces output in a strange format that does not match the expected answer :face_with_head_bandage:, although the companies themselves are correct. Can someone help me find the reason, please? Thank you!

Here is the answer I wanted.

+ {'Australia': 'Wesfarmers',
+  'Belgium': 'Anheuser-Busch InBev',
+  'Brazil': 'JBS',
+  'Britain': 'Compass Group',
+  'Canada': 'George Weston',
+  'China': 'China National Petroleum',
+  'Denmark': 'Maersk Group',
+  'Finland': 'Nokia',
+  'France': 'Sodexo',
+  'Germany': 'Volkswagen',
+  'India': 'State Bank of India',
+  'Indonesia': 'Pertamina',
+  'Ireland': 'Accenture',
+  'Israel': 'Teva Pharmaceutical Industries',
+  'Italy': 'Poste Italiane',
+  'Japan': 'Toyota Motor',
+  'Luxembourg': 'ArcelorMittal',
+  'Malaysia': 'Petronas',
+  'Mexico': 'America Movil',
+  'Netherlands': 'EXOR Group',
+  'Norway': 'Statoil',
+  'Russia': 'Gazprom',
+  'Saudi Arabia': 'SABIC',
+  'Singapore': 'Flex',
+  'South Korea': 'Samsung Electronics',
+  'Spain': 'Banco Santander',
+  'Sweden': 'H & M Hennes & Mauritz',
+  'Switzerland': 'Nestle',
+  'Taiwan': 'Hon Hai Precision Industry',
+  'Thailand': 'PTT',
+  'Turkey': 'Koc Holding',
+  'U.A.E': 'Emirates Group',
+  'USA': 'Walmart',
+  'Venezuela': 'Mercantil Servicios Financieros'}

The issue in the code is this portion:

employer_name = selected_rows.loc[(selected_rows["employees"]==selected_rows["employees"].max()),"company"]

If you look at the type of employer_name, it's a pandas Series instead of a string. I hope that points you in the right direction.

I tried to convert it to a string using str(), but the result is still not what it should be. :sob:
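To make the two points above concrete, here is a minimal sketch. It assumes a tiny made-up stand-in for f500 (the four rows and their employee counts are illustrative only, not the real dataset). It shows that the boolean .loc selection returns a Series, and that str() on a Series returns the printed form of the whole Series rather than the value inside it.

```python
import pandas as pd

# Hypothetical stand-in for f500; values are illustrative only.
f500 = pd.DataFrame({
    "company": ["Walmart", "Apple", "Toyota Motor", "Honda Motor"],
    "country": ["USA", "USA", "Japan", "Japan"],
    "employees": [2300000, 116000, 364445, 211915],
})

selected_rows = f500[f500["country"] == "USA"]
employer_name = selected_rows.loc[
    selected_rows["employees"] == selected_rows["employees"].max(), "company"
]

# A boolean mask can match several rows, so the result is a Series
# even when only one row matches.
print(type(employer_name))

# str() converts the whole Series, including its index label and the
# "Name: company" footer, which is the odd text seen in the dictionary.
as_text = str(employer_name)
print(as_text)
```

So the conversion itself works, but it converts the container rather than the single company name stored in it.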

You won't be able to coerce it into a string, since it's a pandas Series. Because employer_name is a Series, we'll need to slice it to pull out just the first row and the first column using .iloc[].

# Your original code with .iloc chained at the end.
employer_name = selected_rows.loc[(selected_rows["employees"]==selected_rows["employees"].max()),"company"].iloc[0, 0]

Your ending result should give you the following:

Hi, thank you so much for your reply! I just added .iloc at the end of the code.
The code is below:

employer_name = selected_rows.loc[(selected_rows["employees"]==selected_rows["employees"].max()),"company"].iloc[0, 0]

But the code raises an error saying there are too many indexers:

IndexingError: Too many indexers

So confused :sob:
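The "Too many indexers" error fits what was said earlier about employer_name being a Series: a Series has only one axis, so .iloc takes a single position, not a row and column pair. Below is a minimal sketch of the loop with .iloc[0] instead of .iloc[0, 0]. It assumes a small made-up stand-in for f500 (illustrative values, not the real dataset), and the idxmax variant at the end is just one possible alternative, not necessarily the course's intended solution.

```python
import pandas as pd

# Hypothetical stand-in for f500; values are illustrative only.
f500 = pd.DataFrame({
    "company": ["Walmart", "Apple", "Toyota Motor", "Honda Motor"],
    "country": ["USA", "USA", "Japan", "Japan"],
    "employees": [2300000, 116000, 364445, 211915],
})

top_employer_by_country = {}

for c in f500["country"].unique():
    selected_rows = f500[f500["country"] == c]
    matches = selected_rows.loc[
        selected_rows["employees"] == selected_rows["employees"].max(), "company"
    ]
    # matches is one-dimensional, so a single position is enough.
    # If several companies tie for the maximum, this keeps the first one.
    top_employer_by_country[c] = matches.iloc[0]

print(top_employer_by_country)

# Possible alternative: idxmax() returns the index label of the largest
# value, and .loc with a single label and column returns a plain value.
usa_rows = f500[f500["country"] == "USA"]
usa_top = usa_rows.loc[usa_rows["employees"].idxmax(), "company"]
```

With either form, each dictionary value is a plain string rather than a Series.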

Thank you so much! Now I get the correct answer.