Boolean Indexing With NumPy, Challenge: 'Calculating Statistics for Trips on Clean Data'


I just did the Boolean Indexing With NumPy second challenge, but it left me with doubts about my understanding of NumPy vectors. When using the code:

trip_mph = taxi[:,7] / (taxi[:,8] / 3600)

cleaned_taxi = taxi[trip_mph < 100]

How is it that the 2D array ‘taxi’ can be filtered by referring to the 1D array ‘trip_mph’?

I understand them as two separate arrays, so I don’t get how, when we build the boolean condition from trip_mph, the language knows which rows to filter from the other array, ‘taxi’.

Thank you in advance for your support. I hope someone can help me improve my understanding.

Greetings and happy new year to all!

Did you try printing trip_mph < 100? If that output is too long to read, try your own experiment with np.array(range(5)) > 3. You will see a lot of method chaining when learning pandas too. Break the expression down into its smallest operations, and check the type() and the values of both the input and the output of each one.
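Here is one way that experiment could look. It is only a sketch: the tiny taxi array is made up for illustration, and I am assuming column 7 holds the trip distance in miles and column 8 the trip duration in seconds, which is what the formula in the question suggests.

```python
import numpy as np

# Hypothetical stand-in for the real data: 4 trips, 9 columns.
# Assumption: column 7 = distance (miles), column 8 = duration (seconds).
taxi = np.zeros((4, 9))
taxi[:, 7] = [1.0, 5.0, 2.0, 10.0]
taxi[:, 8] = [600.0, 900.0, 30.0, 1800.0]

# Step 1: a 1D float array with one speed per row of taxi.
trip_mph = taxi[:, 7] / (taxi[:, 8] / 3600)
print(type(trip_mph), trip_mph.shape)

# Step 2: the comparison returns a new 1D boolean array (a "mask").
mask = trip_mph < 100
print(mask.dtype, mask)  # bool [ True  True False  True]

# Step 3: indexing with the mask keeps the rows where it is True.
cleaned_taxi = taxi[mask]
print(cleaned_taxi.shape)  # (3, 9)

# The smaller experiment from the reply works the same way.
small = np.array(range(5)) > 3
print(small)
```

The mask has one entry per row of taxi because trip_mph was computed from taxi's own columns, so position i in the mask lines up with row i of taxi. NumPy does not know the two arrays are related; it only matches positions, which is why the mask length has to equal the number of rows.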