countries = f500["country"]
revenues_years = f500[["revenues","years_on_global_500_list"]]
ceo_to_sector = f500[:,"ceo":"sector"]
What I expected to happen:
Get a slice of the columns from ceo to sector.
What actually happened:
I needed to use f500.loc instead of just f500
Why don't I need .loc to select other DataFrame data, but I do need it to slice columns?
loc has been explained in the 5th step of the mission: https://app.dataquest.io/m/291/introduction-to-pandas/5/selecting-a-column-from-a-dataframe-by-label
I would recommend going through that again if you are confused about when to use it.
But I think this is more of a “how pandas works” situation. Going into more detail isn’t really helpful here, since it would require understanding the underlying code and how the library was structured and designed.
These are two distinct use-cases as per the library. One allows you to select specific columns from the DataFrame with all their rows, and one allows you to select specific columns and specific rows from those columns.
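A minimal sketch of the two use-cases described above. The small DataFrame here is a hypothetical stand-in for f500 (the company names and values are invented for illustration); only the column names revenues, ceo and sector come from the question.

```python
import pandas as pd

# Hypothetical stand-in for f500; values are invented for illustration.
f500 = pd.DataFrame(
    {
        "revenues": [100, 200, 300],
        "ceo": ["Ann", "Bob", "Cy"],
        "industry": ["Retail", "Energy", "Tech"],
        "sector": ["Retailing", "Energy", "Technology"],
    },
    index=["CompanyA", "CompanyB", "CompanyC"],
)

# Use-case 1: plain brackets select columns and keep all of their rows.
ceo_and_sector = f500[["ceo", "sector"]]

# Use-case 2: .loc takes a row selector first, then a column selector,
# so specific rows and specific columns can be picked together.
two_rows = f500.loc[["CompanyA", "CompanyC"], ["ceo", "sector"]]

print(ceo_and_sector.shape)
print(two_rows.shape)
```

In this sketch the first selection keeps all three rows, while the second keeps only the two rows named in the row selector.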
loc can also be helpful if your DataFrame’s Index is not just the row numbers. This might be a more advanced concept/use-case so you can ignore that for now.
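As a rough illustration of the point about the Index, here is a sketch with a DataFrame whose row labels are strings rather than row numbers. The name scores and its values are hypothetical and not part of the mission.

```python
import pandas as pd

# Hypothetical DataFrame whose index holds string labels, not row numbers.
scores = pd.DataFrame({"score": [10, 20, 30]}, index=["x", "y", "z"])

# .loc looks rows up by their label in the index.
by_label = scores.loc["y", "score"]

# Label slices with .loc include both endpoints.
label_slice = scores.loc["x":"y"]

print(by_label)
print(list(label_slice.index))
```

Note that a label slice with loc includes the end label, unlike ordinary Python slicing by position.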
A summary of techniques to select columns:
| Select by label |
| --- |
| List of columns |
| Slice of columns |
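The techniques in this summary can be sketched as follows. The DataFrame is again a hypothetical stand-in for f500 with invented values. The last part reproduces the form from the question, f500[:, "ceo":"sector"], which fails because plain brackets do not accept a row selector and a column selector together; the exact exception type appears to depend on the pandas and Python versions, so the sketch catches a generic Exception.

```python
import pandas as pd

# Hypothetical stand-in for f500; values are invented for illustration.
f500 = pd.DataFrame(
    {
        "revenues": [100, 200, 300],
        "ceo": ["Ann", "Bob", "Cy"],
        "industry": ["Retail", "Energy", "Tech"],
        "sector": ["Retailing", "Energy", "Technology"],
    }
)

# Single column by label: explicit .loc form and the bracket shorthand.
single_explicit = f500.loc[:, "ceo"]
single_short = f500["ceo"]

# List of columns: explicit .loc form and the bracket shorthand.
list_explicit = f500.loc[:, ["revenues", "sector"]]
list_short = f500[["revenues", "sector"]]

# Slice of columns: written with .loc; the ":" selects every row.
col_slice = f500.loc[:, "ceo":"sector"]

# The form from the question, without .loc, raises an error.
try:
    f500[:, "ceo":"sector"]
    bracket_slice_failed = False
except Exception as exc:
    bracket_slice_failed = True
    print(type(exc).__name__)

print(list(col_slice.columns))
```

For the single column and the list of columns, both forms return the same result; for a slice of columns, the loc form is the one that works.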
How I interpreted this is as follows:
The first two commands retrieve different slices of the DataFrame, so we do not need loc when using the shorthand.
But in the third command we are trying to retrieve a smaller DataFrame from the bigger f500 DataFrame, so we need loc.
Thanks for this. So simple yet exactly what I needed.
There is a more confusing case in the next section of this chapter, i.e., section 6.
I simply think this is how it works. You need to add loc in some cases in order to achieve the correct syntax.