# Guided Project: Clean And Analyze Employee Exit Surveys: [Step 10] Combined Pivot Table Mean Results

Hi All:

I’m not understanding the values returned from the combined_updated.pivot_table() call.
Hoping that someone can shed some light on the numbers for me.

My Code:

My combined_updated dataframe is as expected and seems to match values in the other project solutions that I have reviewed.

``````combined_updated['service_cat'].value_counts(dropna=False)
Out[84]:
new            193
experienced    172
veteran        136
NaN             88
established     62
Name: service_cat, dtype: int64
``````
``````pivot_table_combined_mean = \
combined_updated.pivot_table(index='service_cat',values='dissatisfied')
``````

What I expected to happen:

The pivot_table returns mean values but I am not understanding how these values were calculated. Again these results seem to match the results returned in other solutions to this project.

I’ve added an extra column to the result showing what I think the returned mean values should be.

There are 650 rows in combined_updated so the mean should just be the sum of the number of elements in each category divided by 650 and the grand sum of all the means should be 1 if we include the 88 NaNs in the dataframe.

The mean values for experienced, established and veteran all appear to be wrong to me.

Any enlightenment is appreciated!

dissatisfied
service_cat Mean My Expected Results
new 0.295337 193 / 650 = 0.29692
experienced 0.343023 172 / 650 = 0.26461
established 0.516129 62 / 650 = 0.09538
veteran 0.485294 136 / 650 = 0.20923
------ ------ ------
sum 1.639783 ??? 135 / 563 / 650 = 0.86615

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.pivot_table.html

By default the `aggfunc` argument is `mean`, hence `pivot_table` will return a DataFrame with the mean values of the `values` argument.

Think of `pivot_table` as a way to group and aggregate data in your DataFrame.

1 Like

I figured out how the pivot_table() mean works now.

I was using the wrong numbers for my sanity check.

The mean is calculated on the total number of elements for each specific category and not the total number of elements in the dataframe.

``````# total counts of all service categories with dissatisfied == True
dissatisfied_true = combined_updated[ combined_updated['dissatisfied'] == True]
dissatisfied_true['service_cat'].value_counts(dropna=False).sort_index(ascending=False)

veteran        66
new            57
experienced    59
established    32
NaN            26
Name: service_cat, dtype: int64

# total counts of all service categories (True or False)
combined_updated['service_cat'].value_counts(dropna=False).sort_index(ascending=False)

veteran        136
new            193
experienced    172
established     62
NaN             88
Name: service_cat, dtype: int64
``````

Using the above numbers for my sanity check gives the correct mean values

dissatisfied
service_cat Mean Correct Mean Calculation
new 0.295337 57 / 193
experienced 0.343023 59 / 172
established 0.516129 32 / 62
veteran 0.485294 66 / 136

Hi @mgj000
I got a different number when i used the `pd.pivot_table`, I refer to the solution provided by DQ. but i dont understand the answer they provided.