Pandas Dataframe vs Series

I understand the difference between a Series and a DataFrame, but I don’t really understand the difference between the three lines of code below.

f500['revenues']

f500[['revenues']]

f500[f500['revenues']]

I hope to get some clarification or further explanation. Thanks!

Hi @adamjosh26,
f500['revenues'] returns the column with label revenues as a Series.
f500[['revenues']] returns the column with label revenues as a DataFrame.
f500[f500['revenues']] filters the rows of the f500 dataframe using the 'revenues' column as a boolean mask. For this to work, the revenues column must contain boolean values, True or False. f500[f500['revenues']] then returns the rows of the f500 dataframe where the 'revenues' column is True.

Just note that when you select a column of a dataframe with single brackets, e.g. ['revenues'], you get a Series, whereas with double brackets, e.g. [['revenues']], you get a DataFrame. You can also filter the entire dataframe by putting a boolean condition inside the square brackets: f500[f500['revenues'] > 100] returns the rows of the f500 dataframe where the revenues column is greater than 100.
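
Here is a minimal sketch of the three cases described above. The small f500 dataframe, its company column, and its values are made up for illustration; only the revenues column name comes from the question.

```python
import pandas as pd

# Hypothetical stand-in for the f500 dataframe; the values are invented.
f500 = pd.DataFrame(
    {
        "company": ["A", "B", "C"],
        "revenues": [50, 150, 300],
    }
)

as_series = f500["revenues"]      # single brackets: a Series
as_frame = f500[["revenues"]]     # list of labels: a one-column DataFrame
mask = f500["revenues"] > 100     # boolean Series, one value per row
filtered = f500[mask]             # keeps only the rows where mask is True

print(type(as_series).__name__)       # Series
print(type(as_frame).__name__)        # DataFrame
print(filtered["company"].tolist())   # ['B', 'C']
```

The comparison f500["revenues"] > 100 is what produces the True/False values; the outer brackets then use them to pick rows.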

  1. Indexing a DataFrame with a column name will return a Series
  2. Indexing a DataFrame with a list of column names will return a DataFrame
  3. I think this will generate an error

I’ll try it out then I’ll give you feedback later.


Learn More:

Selection

Use these commands to select a specific subset of your data.

df[col] | Returns column with label col as Series
df[[col1, col2]] | Returns columns as a new DataFrame
s.iloc[0] | Selection by position
s.loc['index_one'] | Selection by index
df.iloc[0,:] | First row
df.iloc[0,0] | First element of first column
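
As a rough sketch of the commands in this list, assuming a tiny made-up dataframe (the column names, values, and the index_two label are invented for illustration; index_one is taken from the list above):

```python
import pandas as pd

# Hypothetical data; only the 'index_one' label comes from the cheat sheet.
df = pd.DataFrame(
    {"col1": [1, 2], "col2": [3, 4]},
    index=["index_one", "index_two"],
)

s = df["col1"]                        # column as a Series
both = df[["col1", "col2"]]           # columns as a new DataFrame
by_position = s.iloc[0]               # selection by position
by_label = s.loc["index_one"]         # selection by index label
first_row = df.iloc[0, :]             # first row, returned as a Series
first_cell = df.iloc[0, 0]            # first element of first column

print(by_position, by_label, first_cell)   # 1 1 1
print(first_row.tolist())                  # [1, 3]
```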


I have run the code. It generates an error: KeyError.
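
A small sketch that reproduces this, assuming a made-up f500 with a numeric revenues column. Because the values are not booleans, pandas treats them as column labels to look up, and none of them exist. The is_large column is a hypothetical addition to show the boolean case that does work.

```python
import pandas as pd

# Hypothetical stand-in for f500; the values are invented.
f500 = pd.DataFrame(
    {
        "company": ["A", "B", "C"],
        "revenues": [50, 150, 300],
    }
)

try:
    f500[f500["revenues"]]        # numbers are looked up as column labels
    outcome = "no error"
except KeyError:
    outcome = "KeyError"

print(outcome)   # KeyError

# With a boolean column (hypothetical name), the same pattern filters rows.
f500["is_large"] = f500["revenues"] > 100
large = f500[f500["is_large"]]
print(large["company"].tolist())   # ['B', 'C']
```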


Thanks for the explanation! I think I was a bit confused over the boolean indexing portion but this clears it up.

Glad you found my response helpful!
