I am finding it hard to understand how numpy slicing and consecutive boolean indexing affect a particular variable.
For example, in the final challenge, numpy slicing is applied to taxi to get trip_mph (which is only supposed to store speeds). How come another variable, cleaned_taxi, which is a subset of trip_mph, is then used to fetch the distance and length columns, and methods are applied to them?
Why does cleaned_taxi have those columns in the first place?
Could someone please explain?
Hi @aswadr093:
Please provide a mission link and format your code appropriately as per these guidelines so that we can better assist you.
Hi @aswadr093,
In that challenge, 'cleaned_taxi' is actually not a subset of 'trip_mph' (which stores only speeds, you are right), but a subset of 'taxi' itself, obtained by applying the boolean mask 'trip_mph < 100' to it. In other words, we extracted all the rows from 'taxi' where the speed is less than 100, which creates a new ndarray, 'cleaned_taxi', with the same columns as 'taxi'. All the other manipulations (calculating distance etc.) were then applied to this new ndarray.
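To make this concrete, here is a minimal sketch of the same pattern. The data and the two-column layout (distance in miles, then trip length in seconds) are made up for illustration and are not the actual columns of the mission's dataset; only the names 'taxi', 'trip_mph' and 'cleaned_taxi' come from the challenge.

```python
import numpy as np

# Hypothetical toy data, NOT the real dataset.
# Assumed columns: [trip_distance (miles), trip_length (seconds)]
taxi = np.array([
    [1.0, 360.0],
    [5.0, 900.0],
    [30.0, 60.0],   # implausibly fast trip, should be filtered out
    [2.0, 720.0],
])

# trip_mph is a separate 1D array holding one speed per row of taxi.
trip_mph = taxi[:, 0] / (taxi[:, 1] / 3600)

# The comparison yields a 1D boolean array, one True/False per row.
mask = trip_mph < 100

# The mask is applied to taxi, not to trip_mph, so every selected
# row keeps all of its columns.
cleaned_taxi = taxi[mask]

# The columns are therefore still available on cleaned_taxi.
mean_distance = cleaned_taxi[:, 0].mean()
mean_length = cleaned_taxi[:, 1].mean()

print(trip_mph.shape, cleaned_taxi.shape)
```

The point is that 'trip_mph' is only used to build the mask; the array being indexed is 'taxi', which is why 'cleaned_taxi' has fewer rows but the same columns.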
Hope it was helpful.
Thanks. This sounds plausible.