industry_usa = f500.loc[f500["country"] == "China", "sector"].value_counts()
vs
sector_china = f500["sector"][f500["country"] == "China"].value_counts()
Both show the same result, but I don't understand how the second one is written.
When using `loc` you can specify both the rows and the columns you want to access. So the first line above returns the values from the `"sector"` column, but only for the rows where the `country` is `"China"`.
So, if you had a simple example like -
country | sector |
---|---|
China | Energy |
India | Education |
USA | Wholesale |
China | Finance |
running the above code would only return -
sector |
---|
Energy |
Finance |
Because only the rows with the `sector` values `"Energy"` and `"Finance"` correspond to the `country` `"China"`.
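To try this out, here is a minimal sketch that rebuilds the small table above as a DataFrame. The name `f500` is reused only for illustration; the four rows are the made-up example data, not the real dataset.

```python
import pandas as pd

# Toy stand-in for f500, built from the four example rows above.
f500 = pd.DataFrame({
    "country": ["China", "India", "USA", "China"],
    "sector": ["Energy", "Education", "Wholesale", "Finance"],
})

# First argument to loc selects rows (a boolean condition here),
# second argument selects the column.
china_sectors = f500.loc[f500["country"] == "China", "sector"]
print(china_sectors.tolist())  # ['Energy', 'Finance']

# value_counts() then counts how often each sector appears.
sector_counts = china_sectors.value_counts()
print(sector_counts["Energy"])  # 1
```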
Now, the concept for the second version is similar. Instead of using `loc`, you are chaining two indexing steps.
`f500["country"] == "China"` returns a boolean Series in which each row is either `True` or `False`. The boolean indicates whether or not that row's `country` is China.
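As a rough illustration, this is what that boolean Series looks like for the four-row example table from earlier. The variable name `is_china` is just a label chosen for this sketch.

```python
import pandas as pd

# Same toy data as the example table (not the real f500 dataset).
f500 = pd.DataFrame({
    "country": ["China", "India", "USA", "China"],
    "sector": ["Energy", "Education", "Wholesale", "Finance"],
})

# Comparing a column to a value gives one True/False per row.
is_china = f500["country"] == "China"
print(is_china.tolist())  # [True, False, False, True]
print(is_china.dtype)     # bool
```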
`f500["sector"]` just gives you the entire `sector` column. When you combine/chain the two -
`f500["sector"][f500["country"] == "China"]`
it's the same operation: you access the rows of the `sector` column for which the `country` is China.
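If you want to convince yourself that the two spellings agree, a quick check along these lines should work. It again uses the toy four-row table rather than the real data, and `with_loc` / `chained` are names made up for this sketch.

```python
import pandas as pd

# Toy data from the example table (an assumption, not the real f500).
f500 = pd.DataFrame({
    "country": ["China", "India", "USA", "China"],
    "sector": ["Energy", "Education", "Wholesale", "Finance"],
})

# Version 1: rows and column selected in a single loc call.
with_loc = f500.loc[f500["country"] == "China", "sector"]

# Version 2: take the column first, then filter it with the boolean Series.
chained = f500["sector"][f500["country"] == "China"]

# Both selections contain the same rows, so the counts match as well.
print(with_loc.equals(chained))  # True
print(with_loc.value_counts().equals(chained.value_counts()))  # True
```

Both are fine for reading data like this; the `loc` form keeps the row and column selection in one step, which many people find easier to read.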
In Pandas, there is more than one way to index data from a dataframe. They have an entire page dedicated to covering this that I would recommend checking out - Indexing and selecting data — pandas 1.3.1 documentation
Why can't I find, or remember, where I learned to write it the second way? I understand it now, but I'm a bit surprised, as if I had never learned or seen this before.
Thanks, I greatly appreciate your input!
It would be difficult to go through the content and try to find this. But I do think this is covered in one way or another. Even if not, it’s difficult to try and cover everything there is related to Pandas in the content, so it’s even better to keep learning as you get new information by asking questions (just like you did)!
Glad I could help!