Apply with pd.value_counts

Hi,

I’m using apply to do value_counts() on two columns at once. I, however, want to set dropna=False.
This is not possible as I cannot supply any additional arguments for value_counts once I use it in apply.

columns = ['dissatisfaction','job_dissatisfaction']
print(tafe_resignations[columns].apply(pd.value_counts(dropna=False)))

[screenshot]

Is there a workaround for this, or should I do value_counts for both columns separately?

https://app.dataquest.io/c/60/m/348/guided-project%3A-clean-and-analyze-employee-exit-surveys/7/identify-dissatisfied-employees

Try using pandas.DataFrame.value_counts.

Hi Bruno,

I tried

print(tafe_resignations[columns].apply(pd.DataFrame.value_counts))

But then I get an error saying
[screenshot]

Am I not looking in the right direction with your hint? :slight_smile:

Review the examples section in the link I sent. If that’s not enough, reach out again. Also, please include a link to the screen in your original post.

Hi Bruno, I checked the examples section but can’t arrive at a solution.

I currently have this:

The following does what I want:

print(tafe_resignations['dissatisfaction'].value_counts(dropna=False))
print('\n')
print(tafe_resignations['job_dissatisfaction'].value_counts(dropna=False))

But I’d like to do it in one step. Therefore I tried:

print(tafe_resignations[columns].apply(pd.value_counts))

That does exactly what I want, but it won’t let me pass the argument dropna=False.

When I tried:

print(tafe_resignations.value_counts(dropna=False))

I get an error saying that value_counts() got an unexpected argument ‘dropna’ even though this argument is mentioned in the documentation.

Even if it did accept dropna=False, I don’t think it would do the trick, because it returns a Series of counts of unique rows rather than counts per column.
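To illustrate the difference between counting unique rows and counting values per column, here is a minimal sketch. The DataFrame is made up for illustration (it is not the TAFE survey data), and the sketch assumes pandas 1.1 or later, where DataFrame.value_counts is available.

```python
import pandas as pd

# Hypothetical data standing in for the two survey columns.
df = pd.DataFrame({
    "dissatisfaction": [True, False, False, True],
    "job_dissatisfaction": [False, False, False, True],
})

# DataFrame.value_counts counts each distinct combination of values (each unique row).
# The result is a Series indexed by a two-level MultiIndex, one level per column.
row_counts = df.value_counts()
print(row_counts)

# Applying Series.value_counts column by column counts the values within each column.
# The result is a DataFrame with one column of counts per original column.
per_column = df.apply(pd.Series.value_counts)
print(per_column)
```

The first result answers "how often does each pair of answers occur?", while the second answers "how often does each answer occur in each column?", which is what the question is after.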

You could try a lambda function:

.apply(lambda x: x.value_counts(dropna=False))
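A minimal sketch of the lambda approach on a made-up DataFrame (the data below is hypothetical, not the real tafe_resignations). It also shows an alternative that should give the same result: DataFrame.apply forwards extra keyword arguments to the function it calls, so dropna=False can be passed to apply directly.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for tafe_resignations; the values are invented.
df = pd.DataFrame({
    "dissatisfaction": [True, False, np.nan, False],
    "job_dissatisfaction": [False, False, np.nan, np.nan],
})
columns = ["dissatisfaction", "job_dissatisfaction"]

# Option 1: wrap the call in a lambda so apply receives a function,
# and that function passes dropna=False each time it is called.
counts_lambda = df[columns].apply(lambda col: col.value_counts(dropna=False))

# Option 2: pass the function itself and let apply forward the keyword argument.
counts_kwarg = df[columns].apply(pd.Series.value_counts, dropna=False)

print(counts_lambda)
```

Both options hand apply a callable rather than the result of a call. A value that never occurs in a column (here True in job_dissatisfaction) shows up as NaN in the combined table.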


Works like a charm! Thank you @vik.

I’ve seen lambda functions a couple of times but decided to leave them for now, as I assume they will be covered in the Data Analyst with Python path I’m taking.

Does anybody know why I get the unexpected argument (dropna=False) error with DataFrame.value_counts()?

Because calling .apply(pd.value_counts(dropna=False)) runs the value_counts function immediately, instead of passing a reference to it into the apply method. When you instead did .apply(pd.value_counts), you passed in the function reference, and the apply method ran the function on each column. It’s the difference between a = sum and a = sum([1, 2]).
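The distinction can be sketched in plain Python, without pandas. The helpers apply_to_each and count_items below are hypothetical, written only to mimic how apply calls a function once per column; they are not pandas APIs.

```python
from functools import partial

a = sum          # a reference to the function; nothing has run yet
b = sum([1, 2])  # the function runs immediately; b holds the result, 3


def count_items(values, skip_none=True):
    """Hypothetical counter: count items, optionally ignoring None."""
    if skip_none:
        values = [v for v in values if v is not None]
    return len(values)


def apply_to_each(func, columns):
    """Hypothetical stand-in for apply: call func once per column."""
    return {name: func(values) for name, values in columns.items()}


columns = {"x": [1, None, 3], "y": [None, None, 5]}

# Passing the reference works: apply_to_each decides when to call it.
default_counts = apply_to_each(count_items, columns)

# Calling the function inside the argument list fails before apply_to_each
# even starts, because count_items is invoked without its data.
try:
    apply_to_each(count_items(skip_none=False), columns)
except TypeError as exc:
    error_message = str(exc)

# A lambda or functools.partial binds the extra argument and still yields a callable.
all_counts = apply_to_each(partial(count_items, skip_none=False), columns)

print(default_counts, all_counts)
```

The lambda suggested earlier in the thread plays the same role as partial here: it fixes the extra argument while leaving the actual call to apply.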

It probably didn’t work for you because of a version issue. You can see the following just below the spot where your screenshot ends:

If you’re curious about what version you’re on, you can print pd.__version__.
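As a rough sketch, the version can be printed as shown below. Instead of comparing version numbers, this example inspects the function signature to see whether the installed DataFrame.value_counts accepts dropna. The variable name supports_dropna is only illustrative.

```python
import inspect

import pandas as pd

# The installed pandas version, as a string.
print(pd.__version__)

# Older pandas releases have no DataFrame.value_counts at all, so look it up defensively.
df_value_counts = getattr(pd.DataFrame, "value_counts", None)

if df_value_counts is None:
    supports_dropna = False
else:
    # Check whether the installed method declares a dropna parameter.
    supports_dropna = "dropna" in inspect.signature(df_value_counts).parameters

print(supports_dropna)
```

If this prints False, the documentation being read is probably for a newer pandas release than the one installed.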

I see Vik sorted it out in the meantime. Good to know you’re set.