I like the idea of being able to solve this with a mask in case an index is categorical; the solution provided feels risky in a lot of real-world scenarios.
Here’s what I tried, to no avail:
percentage_over_30 = wnba[wnba['Age'] >= 30]['Age'].value_counts(normalize=True)
Can anyone shed some light on what I’m doing wrong here? How do I extract the Age series after the mask is applied?
Several things confuse me in this post. Can you please elaborate on:
- what exactly are you trying to do?
- how does the case below come into play here?
- and the mission link/solution page where:
Think of “mask” as substituting one value wherever a condition is met. For example, this code:
s.mask(s >= 30, "above 30").value_counts() would give you the result below:
I generated the series using this code:
import numpy as np
import pandas as pd
s = pd.Series(np.random.randint(low=5, high=50, size=50))
# s.value_counts(bins=5).sort_index()
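To make the difference between filtering and masking concrete, here is a minimal sketch. It assumes a randomly generated series as a stand-in for the Age column; the seed and the names `filtered`, `masked`, and `share_above_30` are my own and not from the mission. It suggests why the filter-then-count attempt does not give a share of the whole column, while mask does.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Age column; the seed is only for repeatability.
rng = np.random.default_rng(0)
s = pd.Series(rng.integers(low=5, high=50, size=50))

# Filtering first drops every row under 30, so the proportions are
# relative to the filtered subset only and sum to 1 on their own.
filtered = s[s >= 30].value_counts(normalize=True)

# mask() keeps all 50 rows and only relabels the ones meeting the condition,
# so the "above 30" share is relative to the whole series.
masked = s.mask(s >= 30, "above 30").value_counts(normalize=True)
share_above_30 = masked.get("above 30", 0.0)
```

Multiplying share_above_30 by 100 gives the percentage; the remaining entries in masked are the shares of each individual value below 30.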
Apologies, I was not super clear in my initial question, but you answered exactly what I was aiming to do.
This is the link to the problem section I was working on
I ended up using the following code to answer the last two questions:
percentage_over_30 = wnba['Age'].mask(wnba['Age'] >= 30, 'percentage_over_30').value_counts(normalize=True) * 100
percentage_below_23 = wnba['Age'].mask(wnba['Age'] <= 23, 'percentage_below_23').value_counts(normalize=True) * 100
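For anyone who only needs the two percentages and not the full frequency table, a possible alternative is to take the mean of the boolean comparison directly. This is a sketch under assumptions, not the mission's official solution: the small wnba DataFrame below is made-up data standing in for the real dataset.

```python
import pandas as pd

# Hypothetical ages standing in for wnba['Age']; this is not the real dataset.
wnba = pd.DataFrame({"Age": [21, 22, 23, 25, 27, 29, 30, 31, 34, 36]})

# A comparison yields a boolean Series; its mean is the fraction of True values,
# so multiplying by 100 gives the percentage of rows meeting the condition.
percentage_over_30 = (wnba["Age"] >= 30).mean() * 100
percentage_below_23 = (wnba["Age"] <= 23).mean() * 100
```

Each result is a single number rather than a Series, so there is no need to look up a label afterwards. The comparisons (>= 30 and <= 23) are kept the same as in the mask version above.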