Two filters again--Sampling 8/12

I had written this piece of code:

first_players = wnba[wnba['Games Played'] <= 12]
second_players = wnba[12 < wnba['Games Played'] <= 22]
third_players = wnba[wnba['Games Played'] > 22]

sample_mean = []
for i in range(100):
    sample_1 = first_players.sample(1, random_state = i)
    sample_2 = second_players.sample(2, random_state = i)
    sample_3 = third_players.sample(7, random_state = i)
    sample = pd.concat([sample_1, sample_2, sample_3])
    sample_mean.append(sample['PTS'].mean())

plt.scatter(range(1,101), sample_mean)
plt.axhline(wnba['PTS'].mean())

Then the system gave me an error like this:

I checked the solution. The only difference is that the double-sided comparison is broken down into two boolean conditions:

btw_13_22 = wnba[(wnba['Games Played'] > 12) & (wnba['Games Played'] <= 22)]

I know in Matlab, I can write a < x < b, and I don’t have to write like this: (a<x) & (x < b)

A friend of mine told me that in C, we need to break it down into two booleans like (a<x) & (x < b).

I don’t remember if or when I ever wrote something like a < x < b in Python. Can someone please confirm whether it is best to break it down into two booleans like (a<x) & (x < b) in Python?

Changing your line from:

second_players = wnba[12 < wnba['Games Played'] <= 22]

to

second_players = wnba[(12 < wnba['Games Played']) & (wnba['Games Played'] <= 22)]

should get it to work.

You can actually use the a < x < b notation in Python. Here’s a simple piece of code as an example:

list1 = [1,2,3,4,5,6,7,8,9,10]
list2 = []

for x in list1:
    if 3 <= x <= 7:
        list2.append(x)

print(list2)

Output:

[3, 4, 5, 6, 7]

A pandas Series just isn’t one of the places where this notation works!

In the above example, 3 <= x <= 7 was equivalent to 3 <= x and x <= 7.

The or and and operators each need a single truth value from their operands, and the truth value of a pandas Series is ambiguous. That’s the reason for using the element-wise operators | and & in their place.

Your code doesn’t make this explicit when you write second_players = wnba[12 < wnba['Games Played'] <= 22], but the chained comparison implies an and clause. You can see this by running second_players = wnba[(12 < wnba['Games Played']) and (wnba['Games Played'] <= 22)] instead: it raises the same error.
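
A minimal sketch of the point above, using a small made-up Series in place of wnba['Games Played']. The values and the helper raises_value_error are assumptions for illustration, not part of the original data. It suggests that both the chained form and the explicit and form raise ValueError, while & builds the mask element by element:

```python
import pandas as pd

# Hypothetical stand-in for the wnba['Games Played'] column.
games = pd.Series([5, 12, 13, 22, 30], name="Games Played")


def raises_value_error(make_mask):
    """Return True if calling make_mask() raises ValueError."""
    try:
        make_mask()
    except ValueError:
        return True
    return False


# Chained comparison: Python expands it to (12 < games) and (games <= 22),
# so it needs a single truth value from the first boolean Series.
chained_fails = raises_value_error(lambda: 12 < games <= 22)

# Explicit `and`: the same problem, written out.
and_fails = raises_value_error(lambda: (12 < games) and (games <= 22))

# Element-wise `&` combines the two boolean Series row by row.
mask = (12 < games) & (games <= 22)
selected = games[mask].tolist()

print(chained_fails, and_fails, selected)
```

The parentheses around each comparison matter, because & binds more tightly than < and <= in Python.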


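In pandas, comparing a Series with a number does not produce a single True or False.
12 < wnba['Games Played'] gives a Series of booleans.
The chained comparison then needs a single truth value from that Series before it can evaluate the second half (wnba['Games Played'] <= 22). The truth value of a Series is ambiguous, so pandas raises an error.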

It is easier to write the mathematical expression as two separate comparisons.
A Python comparison is not quite the same thing as a mathematical expression.
Mathematical expression: a < b <= c
Python comparison on single values: a < b and b <= c
pandas element-wise comparison: (a < b) & (b <= c)

The following should work:

wnba[(12 < wnba['Games Played']) & (wnba['Games Played'] <= 22)]
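
As a possible alternative to writing both comparisons, pandas also provides Series.between, which is inclusive on both ends by default. The sketch below assumes 'Games Played' holds whole numbers, so 12 < x <= 22 can be expressed as between(13, 22). The wnba_demo frame is made-up data for illustration only.

```python
import pandas as pd

# Hypothetical miniature version of the wnba DataFrame.
wnba_demo = pd.DataFrame({
    "Games Played": [5, 12, 13, 22, 30],
    "PTS": [10, 50, 80, 200, 400],
})

# between() includes both endpoints by default, so 13..22 matches
# 12 < x <= 22 only when the column contains whole numbers.
between_mask = wnba_demo["Games Played"].between(13, 22)

# The explicit two-comparison form from the thread, for comparison.
explicit_mask = (wnba_demo["Games Played"] > 12) & (wnba_demo["Games Played"] <= 22)

second_players_demo = wnba_demo[between_mask]
print(second_players_demo)
```

If the column could contain fractional values, the explicit two-comparison form is the safer choice.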

@alvinctk Thank you for the clarification.

Bingo! I just found out you replied to me! :innocent: Thank you so much for the very thorough clarification. I really appreciate it. :smiley:
