Boolean Indexing in NumPy challenge question

Hi,

I am on mission 10 of the Boolean Indexing in NumPy course.

This code is given as the answer:

> trip_mph = taxi[:,7] / (taxi[:,8] / 3600)
> 
> print(trip_mph)
> 
> **cleaned_taxi = taxi[trip_mph < 100]**
> 
> print(cleaned_taxi)
> 
> mean_distance = cleaned_taxi[:,7].mean()
> mean_length = cleaned_taxi[:,8].mean()
> mean_total_amount = cleaned_taxi[:,13].mean()

I do not understand which column of taxi is being used to restrict it when cleaned_taxi is created. We index the whole array, so how does it know to keep only the rows with a value less than 100?

Please help


Hi @jaryszek,

Imagine this is our taxi ndarray:

112 234 345
325 234 123
532 125 563

And this is our trip_mph ndarray:

122
50
20

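When we do cleaned_taxi = taxi[trip_mph < 100], we are not filtering on a column of taxi itself. trip_mph is a separate 1D array with one value per row of taxi, and trip_mph < 100 produces one True/False value per row. We keep only those rows of taxi where the matching value of trip_mph is less than 100. So in the imaginary case above, each row is checked like this:

112 234 345 [122 < 100 is False, row dropped]
325 234 123 [50 < 100 is True, row kept]
532 125 563 [20 < 100 is True, row kept]

The result, cleaned_taxi, contains only the last two rows.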
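
If you want to try it yourself, here is a minimal sketch of the same toy example. The numbers are the made-up ones from above, not the real taxi data, and the name mask is just something I picked for illustration:

```python
import numpy as np

# Toy stand-ins for the course data (made-up values, not the real taxi dataset)
taxi = np.array([[112, 234, 345],
                 [325, 234, 123],
                 [532, 125, 563]])
trip_mph = np.array([122, 50, 20])

# The comparison gives one boolean per row of taxi: False, True, True
mask = trip_mph < 100
print(mask)

# Indexing with the boolean array keeps only the rows where mask is True
cleaned_taxi = taxi[mask]
print(cleaned_taxi)
```

The boolean array has to have the same length as the number of rows in taxi, which is why a value computed per row (like trip_mph) can be used to filter whole rows.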

Hope this helps :slightly_smiling_face:

Best,
Sahil

Thank you very much,

It is clear now!
