My Code:
selected_rows = f500[f500["country"] == "China"]
pandas.core.frame.DataFrame
china = (f500['country']=="China")
pandas.core.series.Series
I would like to know exactly when to use each of these two methods. I’m pretty confused.
When you run china = (f500['country'] == "China"), you create a series of boolean values: True where the country is "China" and False everywhere else.
When you run f500[f500["country"] == "China"], you use that series of boolean values to select rows of the dataframe, namely the rows where the series contains True.
So you create a series of boolean values and then use it to keep only the rows marked True. In this case, the final result contains only the data about China.
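To make the two steps concrete, here is a minimal sketch on a made-up three-row stand-in for f500 (the company names and rows are assumptions for illustration, not the real dataset):

```python
import pandas as pd

# Hypothetical stand-in for the f500 dataframe (made-up rows).
f500 = pd.DataFrame({
    "company": ["A Corp", "B Corp", "C Corp"],
    "country": ["USA", "China", "China"],
})

# Step 1: the comparison yields a boolean Series, one value per row.
china = f500["country"] == "China"
print(china.tolist())                     # [False, True, True]

# Step 2: indexing with that Series keeps only the rows marked True.
selected_rows = f500[china]
print(selected_rows["company"].tolist())  # ['B Corp', 'C Corp']
```

The first result is a Series and the second is a DataFrame, which matches the two types shown in the question.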
Not exactly the same, but I have explained a similar question here, if it helps.
selected_rows = f500[f500["country"] == "China"]
The code above does the same thing as
selected_rows = f500[china]
as long as you defined the variable china:
china = f500['country'] == "China"
The variable china serves as a boolean filter. That is, it contains a series of True or False values, depending on the condition you specified, which you can then use to access specific rows in a dataframe. In this case, pandas compares every entry in the 'country' column of the f500 dataframe with "China" and puts True in the china variable wherever the entry is "China" (and False otherwise). The length of the china variable is therefore equal to the length of the country column, but it contains only True or False values.
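A quick way to check these properties yourself is sketched below. The four-row dataframe is a made-up stand-in for f500, so only the pattern carries over, not the numbers:

```python
import pandas as pd

# Hypothetical stand-in for f500 (made-up rows).
f500 = pd.DataFrame({"country": ["USA", "China", "Japan", "China"]})

china = f500["country"] == "China"

# The mask has one entry per row of the column it was built from.
print(len(china) == len(f500["country"]))  # True

# Every entry is a boolean.
print(china.dtype)                         # bool

# True counts as 1, so the sum is the number of matching rows.
print(china.sum())                         # 2
```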
When you use the china variable as a filter (e.g. f500[china]) and assign the result to the new variable selected_rows, you’re telling pandas to create a new dataframe containing only the rows where the condition f500['country'] == "China" is True.
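As for when to use which form, one reasonable rule of thumb (my suggestion, not an official guideline) is to write the condition inline when you need it only once, and to store it in a variable when you want to reuse it. The sketch below uses a made-up stand-in for f500, and the revenues column and its values are assumptions for illustration:

```python
import pandas as pd

# Hypothetical stand-in for f500 (made-up rows and revenues).
f500 = pd.DataFrame({
    "company": ["A Corp", "B Corp", "C Corp"],
    "country": ["USA", "China", "China"],
    "revenues": [100, 200, 300],
})

china = f500["country"] == "China"

# Inline condition and stored mask select exactly the same rows.
inline = f500[f500["country"] == "China"]
stored = f500[china]
print(inline.equals(stored))       # True

# A stored mask can be reused, for example inverted with ~ ...
not_china = f500[~china]
print(not_china["company"].tolist())   # ['A Corp']

# ... or combined with a column label via .loc.
china_revenues = f500.loc[china, "revenues"]
print(china_revenues.tolist())         # [200, 300]
```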
Hope this helps!