NumPy: Boolean Indexing


In the following code we are working with an ndarray called “taxi”. We calculate “trip_mph” below, which is not originally part of “taxi”. Then, in line 2, we select only the rows of taxi for which trip_mph < 100. But trip_mph is not part of taxi, so how does Python know which rows to pull?

trip_mph = taxi[:,7] / (taxi[:,8] / 3600)
cleaned_taxi = taxi[trip_mph < 100]

Hi there!

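The thing is, trip_mph is a 1D ndarray with one value per row of taxi, so it has the same length as taxi has rows.
So trip_mph < 100 returns a boolean array of that same length, and taxi[trip_mph < 100] keeps only the rows of taxi where that array is True. The two arrays are matched purely by position: row i of taxi is kept if element i of the boolean array is True.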
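
To make this concrete, here is a minimal sketch with a made-up 4-row array standing in for taxi. The values are invented, and I'm assuming column 7 is trip distance in miles and column 8 is trip duration in seconds, which is what the mph formula suggests but isn't stated in the question.

```python
import numpy as np

# Hypothetical stand-in for taxi: 4 rows, 9 columns of made-up values.
# Assumption: column 7 = distance (miles), column 8 = duration (seconds).
taxi = np.zeros((4, 9))
taxi[:, 7] = [1.0, 5.0, 50.0, 2.0]
taxi[:, 8] = [360.0, 900.0, 600.0, 720.0]

# One speed value per row, so trip_mph has shape (4,)
trip_mph = taxi[:, 7] / (taxi[:, 8] / 3600)

# Element-wise comparison gives a boolean array, also of shape (4,)
mask = trip_mph < 100
print(mask)

# Rows are matched by position: row i is kept only if mask[i] is True
cleaned_taxi = taxi[mask]
print(cleaned_taxi.shape)
```

Here the third trip works out to roughly 300 mph, so its mask entry is False and that row is dropped, leaving 3 of the 4 rows. If the boolean array had a different length than the number of rows in taxi, NumPy would raise an IndexError instead of guessing.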

Thanks a lot for your help! Have a wonderful winter break :slight_smile: