Top_japanese_employer code

I have a question on Mission 10 (Sorting Values) in “Exploring Data with Pandas: Intermediate”.

The URL is: Learn data science with Python and R projects

My Code:
top_japanese_employer = f500.loc[(f500.loc[:,"country"]=="Japan") & (f500.loc[:,"employees"].sort_values()) & (f500.loc[:,"company"])].head(1)

What I expected to happen: I thought I would get the same answer, but I don’t. The correct answer is just an integer; mine gives me the whole row. How do you write the code in one step, with no intermediates?

I would recommend going through the documentation for head() to see what it returns.
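As a quick check of what head() gives back, here is a minimal sketch on a made-up three-row DataFrame. The frame `demo` and its values are invented for illustration and are not taken from the real f500 data.

```python
import pandas as pd

# Hypothetical stand-in for f500; company names and numbers are invented.
demo = pd.DataFrame({
    "company": ["Alpha", "Beta", "Gamma"],
    "country": ["Japan", "USA", "Japan"],
    "employees": [100, 300, 200],
})

first = demo.head(1)

# head() returns a DataFrame, even when only one row is requested.
head_type = type(first).__name__
head_shape = first.shape  # one row, all three columns

print(head_type, head_shape)
```

So head(1) on its own still hands back a full row with every column, which matches what the original one-liner produced.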

Then, go through the instructions again to see where your approach deviates. You have opted for a slightly complex approach which is not required here. For example,

You don’t need this as a separate step. You can select the rows corresponding to Japan and then call sort_values() on those rows. If you check the documentation for sort_values(), you will see that you can specify the column by which to sort the dataframe. The content covers this in that step as well:

sorted_rows = selected_rows.sort_values("employees", ascending=False)

So, go through the content carefully. And then follow the instructions. Hopefully this helps.

Your code is close. But you have just selected a dataframe, asked for all the rows that meet your conditions, and returned the first of those rows. To return a value, you have to select the row AND the column that you want returned.
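To make the "row AND column" point concrete, here is one possible way to write it as a single chained expression. This is only a sketch: `f500_demo` is a made-up stand-in for f500 with invented values, and the mission may expect the intermediate variables instead.

```python
import pandas as pd

# Hypothetical stand-in for f500; company names and numbers are invented.
f500_demo = pd.DataFrame({
    "company": ["Alpha", "Beta", "Gamma"],
    "country": ["Japan", "USA", "Japan"],
    "employees": [100, 300, 200],
})

top_japanese_employer = (
    f500_demo[f500_demo["country"] == "Japan"]   # keep only the Japanese rows
    .sort_values("employees", ascending=False)   # largest employer first
    .iloc[0]["company"]                          # first row, then its company value
)

print(top_japanese_employer)  # Gamma
```

The result is a single value (the company name) rather than a row, because the last step picks one column out of the first row.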

Also, as cool as it is to execute a bunch of stuff in one big long line of code, it becomes very hard to understand if you ever want to use it again. Or let’s say you wanted to find the 3rd-largest employer in the US instead: how easy would it be to alter the code for a new search?
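One way to keep the steps readable and easy to reuse is to wrap them in a small function. The helper name `nth_largest_employer` and the sample data below are hypothetical, made up for this sketch; they are not part of the mission.

```python
import pandas as pd

def nth_largest_employer(df, country, n=1):
    """Return the company with the n-th most employees in `country` (n starts at 1)."""
    selected_rows = df[df["country"] == country]
    sorted_rows = selected_rows.sort_values("employees", ascending=False)
    return sorted_rows.iloc[n - 1]["company"]

# Hypothetical stand-in for f500; company names and numbers are invented.
demo = pd.DataFrame({
    "company": ["A", "B", "C", "D", "E", "F"],
    "country": ["USA", "USA", "USA", "USA", "Japan", "Japan"],
    "employees": [500, 300, 400, 100, 200, 250],
})

top_japan = nth_largest_employer(demo, "Japan")     # largest employer in Japan
third_usa = nth_largest_employer(demo, "USA", n=3)  # 3rd-largest employer in the USA

print(top_japan, third_usa)  # F B
```

Changing the search then only means changing the arguments, not rewriting the expression.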

Hello @the_doctor and @ncarvey ,
Thanks for the reply. I’ve been working on the mission for a while.

I did this for the code:
selected_japanese_rows = f500.loc[f500.loc[:,"country"] == "Japan"].sort_values("employees", ascending=False)

top_japanese_employer = selected_japanese_rows.iloc[0]["company"]

I broke the line of code up. I just had two questions: 1) Is this code OK in terms of readability? I did combine the selected_rows and sorted_rows from the mission, but it isn’t like my original one-line code. 2) I’m confused about the use of iloc[0]["company"]. I thought iloc only takes numbers, but this answer also has "company". How does that work?

I would say you are using loc where it’s not really necessary. You can just use f500[f500["country"] == "Japan"] instead, for example. But either is fine, I think.
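A small check that the two spellings select the same rows, again on a made-up frame (`demo` and its values are invented for illustration):

```python
import pandas as pd

# Hypothetical stand-in for f500; company names and numbers are invented.
demo = pd.DataFrame({
    "company": ["Alpha", "Beta", "Gamma"],
    "country": ["Japan", "USA", "Japan"],
    "employees": [100, 300, 200],
})

# Boolean filtering written with loc everywhere...
with_loc = demo.loc[demo.loc[:, "country"] == "Japan"]

# ...and the shorter form using plain square brackets.
without_loc = demo[demo["country"] == "Japan"]

same_rows = with_loc.equals(without_loc)
print(same_rows)  # True
```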

You are not specifying "company" as part of iloc. You first access the row at 0 using iloc, and then from that output you access the value corresponding to the column "company" just like you would do normally from your DataFrame or Series.
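Splitting the expression into its two steps may make this clearer. The sketch below uses a made-up frame (`demo`, invented values); the last line shows a purely positional alternative using `columns.get_loc`, in case you prefer to stay inside iloc.

```python
import pandas as pd

# Hypothetical stand-in for the sorted Japanese rows; values are invented.
demo = pd.DataFrame({
    "company": ["Alpha", "Gamma"],
    "country": ["Japan", "Japan"],
    "employees": [300, 200],
})

# Step 1: iloc[0] takes the first row by position and returns a Series
# whose index labels are the column names.
first_row = demo.iloc[0]
row_type = type(first_row).__name__

# Step 2: ordinary label lookup on that Series; iloc is no longer involved.
via_chain = first_row["company"]

# Alternative: give iloc both a row position and a column position.
via_positions = demo.iloc[0, demo.columns.get_loc("company")]

print(row_type, via_chain, via_positions)  # Series Alpha Alpha
```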

Thanks @the_doctor !