Boolean Indexing with NumPy - [[]]

newark = taxi[taxi[:, 6] == 5]
newark_count = newark.shape[0]

I do not understand why I need taxi[taxi…].
Can someone explain this doubling to me?



taxi[:, 6] == 5 only compares the value in the 7th column (index 6) of each row of the “taxi” array with 5. On its own it gives you a boolean array with one “True” or “False” per row, not the rows themselves.

To select all the rows where the comparison is “True”, we need to pass that boolean array back to the “taxi” array as an index. That is where the second “taxi” comes from.

taxi[taxi[:, 6] == 5] therefore fetches all the rows where the condition is “True”.
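
It may help to split the one-liner into two steps. The sketch below assumes a small made-up stand-in for taxi (4 rows, 7 columns, invented values); the name mask is also just an illustration and does not appear in the original code.

```python
import numpy as np

# Hypothetical stand-in for the taxi array: 4 rows, 7 columns.
# Only the last column (index 6) matters here; all values are made up.
taxi = np.array([
    [1, 0, 0, 0, 0, 0, 5],
    [2, 0, 0, 0, 0, 0, 2],
    [3, 0, 0, 0, 0, 0, 5],
    [4, 0, 0, 0, 0, 0, 3],
])

# Step 1 (the inner part): one boolean per row.
mask = taxi[:, 6] == 5

# Step 2 (the outer part): keep only the rows where mask is True.
newark = taxi[mask]
newark_count = newark.shape[0]

# The one-liner from the question does both steps at once.
one_liner = taxi[taxi[:, 6] == 5]

print(mask.tolist())                       # [True, False, True, False]
print(newark_count)                        # 2
print(np.array_equal(newark, one_liner))   # True
```

If only the count is needed, mask.sum() should give the same number, since each True counts as 1.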

As an example, take an array with only two columns in which only the rows at index 0 and index 2 satisfy the condition. Those two rows are the ones returned from the array.
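
A minimal sketch of such a two-column case follows. The array data and its values are made up for illustration and are not taken from the taxi dataset.

```python
import numpy as np

# Made-up two-column array; the second column (index 1) is compared with 5.
data = np.array([
    [10, 5],   # index 0: matches
    [20, 3],   # index 1: does not match
    [30, 5],   # index 2: matches
])

# Boolean array: True for rows whose second column equals 5.
condition = data[:, 1] == 5

# Indexing with the boolean array returns only the matching rows.
selected = data[condition]

print(condition.tolist())  # [True, False, True]
print(selected.tolist())   # [[10, 5], [30, 5]]
```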

