Selecting Data in DataFrames Practice Problems 23/24

Screen Link:

https://app.dataquest.io/m/1033/selecting-data-in-dataframes-practice-problems/23/mixing-iloc-and-loc

Hey!

This exercise was taken from the practice mode:

  1. Select the Name and Country columns from the people DataFrame. Select only the rows whose index is a multiple of 3 between 11 (inclusive) and 66 (exclusive). Assign the result to a variable named my_selection.

I tried the following code:

my_selection = people.iloc[12:66:3][['Name', 'Country']]

I did this because 12 is the first multiple of 3 between 11 and 65. In this case I would expect to get rows 12, 15, 18, 21…

But the answer given for the exercise is:

my_selection = people.iloc[11:66:3][['Name', 'Country']]

And the output DataFrame starts at row 12. Why does it not start at row 11?

Thanks!
Paulo

Hello Paulo!

This is because .iloc selects rows by position, and positions always start at 0. For some reason that I’m not aware of, this particular DataFrame has its first row labeled as 1, but for Python the first row is still at position 0 no matter what. So, to select the first row with .iloc you must run people.iloc[0], which returns the row labeled as 1. Therefore, .iloc[11] returns the row labeled as 12, and that is why the output starts at row 12.
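
To see the offset in action, here is a minimal sketch. I don't have the exercise's data at hand, so the `people` DataFrame below is a made-up stand-in with 70 rows labeled 1 to 70; the names and the "Unknown" country are placeholders, not the real dataset.

```python
import pandas as pd

# Hypothetical stand-in for the exercise's `people` DataFrame:
# 70 rows whose index labels run from 1 to 70 instead of 0 to 69.
people = pd.DataFrame(
    {
        "Name": [f"Person {i}" for i in range(1, 71)],
        "Country": ["Unknown"] * 70,
    },
    index=range(1, 71),
)

# .iloc counts positions from 0, so position 11 holds the row labeled 12.
answer = people.iloc[11:66:3][["Name", "Country"]]
attempt = people.iloc[12:66:3][["Name", "Country"]]

print(answer.index[:4].tolist())   # labels 12, 15, 18, 21
print(attempt.index[:4].tolist())  # labels 13, 16, 19, 22
```

With labels that start at 1, every position is one less than its label, so starting the slice at position 12 skips the row labeled 12 and begins at the row labeled 13.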

If you use .loc instead, you select by the actual label in the index, which would be .loc['001'] for the first row (position 0) and .loc['012'] for the 12th row (position 11).
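
As a rough illustration of label-based versus position-based lookup, the sketch below assumes the labels are zero-padded strings such as '001'. The small `people` frame here is invented for the example and only mimics that labeling.

```python
import pandas as pd

# Hypothetical frame with zero-padded string labels "001" ... "020".
labels = [f"{i:03d}" for i in range(1, 21)]
people = pd.DataFrame(
    {"Name": [f"Person {i}" for i in range(1, 21)]},
    index=labels,
)

# .loc looks rows up by label, .iloc by 0-based position.
first_by_label = people.loc["001", "Name"]
first_by_position = people.iloc[0]["Name"]
twelfth_by_label = people.loc["012", "Name"]
twelfth_by_position = people.iloc[11]["Name"]

print(first_by_label == first_by_position)      # same row
print(twelfth_by_label == twelfth_by_position)  # same row
```

Both pairs point at the same rows: the label '012' and the position 11 are just two different ways of addressing the 12th row.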

This is a bit confusing, I know, and I’m not sure I explained it properly, but the index starting at 001 makes it even more confusing.

Let me know if you need further explanation.

Ok! Your explanation made it clear. Thank you!
