How does df.set_index() work?

I don’t understand what’s going on in this exercise.

I have read that:

I tried to recreate it by commenting out my code and just pasting df.set_index() as in the example, but it doesn’t work.

…so how did the author “set the Country column as the index”?

Hi @drill_n_bass,

So think of the ‘index’ as the labels for each row. The ‘index’ is NOT a column per se but is just the labels.

If no index is specified, then pandas will just label them 0,1,2,3… and so on. In that case Country is actually an individual column in the dataframe.

If you set the Country column as the index then it removes it as a separate column and turns it into the ‘index’ or label for each row.
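Here is a minimal sketch you can run to see the difference. The data is made up for illustration and is not the dataset from the exercise; only the Country column name comes from your question.

```python
import pandas as pd

# Hypothetical data for illustration; not the exercise's dataset.
df = pd.DataFrame({
    "Country": ["France", "Japan", "Brazil"],
    "Population": [68, 125, 214],
})

# With no index specified, pandas labels the rows 0, 1, 2, ...
default_labels = list(df.index)
columns_before = list(df.columns)

# set_index returns a new DataFrame by default; df itself is unchanged.
indexed = df.set_index("Country")
columns_after = list(indexed.columns)
row_labels = list(indexed.index)

print(default_labels)   # row labels before
print(columns_before)   # Country is a regular column here
print(columns_after)    # Country is no longer listed as a column
print(row_labels)       # the country names are now the row labels
```

After the call, Country is gone from the columns of `indexed` and its values are the row labels instead.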

To answer your question about the error: I think that DQ has already set the Country column as the index, so the Country column doesn’t exist anymore. That’s why you are getting the error.

So you are basically trying to set the index twice.
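As a rough sketch of what I think is happening (I am assuming the exercise's dataframe already has Country as its index; the data below is hypothetical), calling set_index a second time raises a KeyError because there is no Country column left to use:

```python
import pandas as pd

# Hypothetical stand-in for a dataframe whose index was already set.
df = pd.DataFrame({"Country": ["France", "Japan"], "Population": [68, 125]})
df = df.set_index("Country")

# Country is now the index, not a column, so a second call fails.
try:
    df.set_index("Country")
    second_call_failed = False
except KeyError as err:
    second_call_failed = True
    print("KeyError:", err)

# reset_index moves the index back into a regular column if you need it.
restored = df.reset_index()
print(list(restored.columns))
```

If you want to check which situation you are in, printing `df.index.name` and `df.columns` before calling set_index usually makes it clear.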

I recommend you experiment with this in a Jupyter notebook. When I get stuck, I find that experimenting in Jupyter helps me see what is going on better.


I forgot about this fact! Thank you!
