Subplots for pivot tables in Guided Project: Clean And Analyze Employee Exit Surveys

Hello,

I am trying to create bar graphs which look at the percentage of dissatisfaction of different age groups in different career stages. If I do the code below, I get a massive graph which I attached. Is there a way to turn this into subplots and have 4 bar graphs, each one representing a different career stage (service_cat) with the y axis as “dissatisfied” and the x axis as “age_cleaned” (string age ranges)?

Use a pivot table to calculate the percentage of dissatisfied employees in each age group

dis_pt = combined_updated.pivot_table(index=“service_cat”,columns=“age_cleaned”,values=“dissatisfied”)
dis_pt.head()
dis_pt.plot(kind=“bar”)

image

I have also tried this:

new_pt = new_df.pivot_table(index=“age_cleaned”,values=“dissatisfied”)
exper_pt = exper_df.pivot_table(index=“age_cleaned”,values=“dissatisfied”)
estab_pt = estab_df.pivot_table(index=“age_cleaned”,values=“dissatisfied”)
vet_pt = vet_df.pivot_table(index=“age_cleaned”,values=“dissatisfied”)

df1 = new_pt.reset_index()
df2 = exper_pt.reset_index()
df3 = estab_pt.reset_index()
df4 = vet_pt.reset_index()

fig, ax = plt.subplots(2,2,figsize=(15,15))

for r in df1.itertuples():
ax[0,0].bar(r[0],r[2],label=r[1])
plt.xticks(df1.index,df1.age_cleaned)

for r in df2.itertuples():
ax[0,1].bar(r[0],r[2],label=r[1])
plt.xticks(df2.index,df2.age_cleaned)

for r in df3.itertuples():
ax[1,0].bar(r[0],r[2],label=r[1])
plt.xticks(df3.index,df3.age_cleaned)

for r in df4.itertuples():
ax[1,1].bar(r[0],r[2],label=r[1])
plt.xticks(df4.index,df4.age_cleaned)

AND

new_pt.plot(kind = ‘bar’, color=“green”)
plt.xlabel(“Employee Age”)
plt.ylabel(“Percent Dissatisfied”)
plt.ylim(0,1)
L=plt.legend()
L.get_texts()[0].set_text(‘New’)
plt.show()

exper_pt.plot(kind = ‘bar’)
plt.xlabel(“Employee Age”)
plt.ylabel(“Percent Dissatisfied”)
plt.ylim(0,1)
L=plt.legend()
L.get_texts()[0].set_text(‘Experienced’)
plt.show()

estab_pt.plot(kind = ‘bar’, color=“red”)
plt.xlabel(“Employee Age”)
plt.ylabel(“Percent Dissatisfied”)
plt.ylim(0,1)
L=plt.legend()
L.get_texts()[0].set_text(‘Established’)
plt.show()

estab_pt.plot(kind = ‘bar’, color=“purple”)
plt.xlabel(“Employee Age”)
plt.ylabel(“Percent Dissatisfied”)
plt.ylim(0,1)
L=plt.legend()
L.get_texts()[0].set_text(‘Veteran’)
plt.show()

Which gives me four separate graphs. What I want is four subplots with easy to read labeling and consistent y axis. Thank you for the help!

Hi @tnet92 ,

I would like to take a look at your Guided Project, please click the Download button at the top of the Jupyter Notebook interface in our platform.

jupyter_guided_project_download.png

This will download a .tar file that contains both your notebook file and the required datasets. Please attach this .tar file with your reply.

Best,
Sahil

Hello,

I attached it to this email. I tried something to get some new graphs. I have a whole section at the end where I explore other variables and then another extra section where I played with code.

Could you review this for me and give me feedback? I would love to put this on my portfolio but I want it to look well done and thoughtful. If you have any suggestions, please let me know. Thank you so much for your help.

Guided Project_ Clean And Analyze Employee Exit Surveys.tar (1.9 MB)

1 Like

Hi @tnet92,

First of all, Fantastic work! However, maybe I didn’t understand your question correctly, or you might have already achieved what you were asking. While checking your project, I noticed that on input cell number 81, you have already produced this graph by using the following code.

# Create a fig to put in 4 subplots
fig=plt.figure(figsize=(15,15))
fig.suptitle('Dissatisfied Employees by Career Stage and Age', fontsize=16)

ax = plt.subplot("221")
# Select only the "New" row
dis_pt.iloc[2,:].plot(kind="bar", ax= ax)
plt.ylabel("Percent Dissatisfied")
ax.set_title("New: Less than 3 years")
ax.set_ylim([0, .8])

ax = plt.subplot("222")
dis_pt.iloc[1,:].plot(kind="bar", ax= ax)
ax.set_title("Experienced: 3-6 years")
ax.set_ylim([0, .8])

ax = plt.subplot("223")
dis_pt.iloc[0,:].plot(kind="bar", ax= ax)
plt.xlabel("Employee Age")
plt.ylabel("Percent Dissatisfied")
ax.set_title("Established: 7-10 years")
ax.set_ylim([0, .8])

ax = plt.subplot("224")
dis_pt.iloc[3,:].plot(kind="bar", ax= ax)
plt.xlabel("Employee Age")
ax.set_title("Veteran: 11 or more years")
ax.set_ylim([0, .8])

plt.show()

Let me know if this wasn’t what you were looking for.

And as feedback, since you intend to add it to your portfolio. I would recommend you to remove all the value checking codes that are not giving any meaningful information to the reader like:

# Look to see the columns in combined and the number of null values
combined.info()

And instead of showing True and False value counts for each column.

You can show it in a single graph. This will reduce the length of the project.

Best,
Sahil

Thank you! I had substituted this code in after I asked the question. Can you think of a better way to do this group of graphs?

Thank you for the input; I will work on it!

Hi @tnet92,

There will always be better ways to do the same thing. However, you have already done it really well. As you continue your learning you will keep on learning new ways to do things.

Here is an alternative using for loop:

# Create a fig to put in 4 subplots
fig=plt.figure(figsize=(15,15))
fig.suptitle('Dissatisfied Employees by Career Stage and Age', fontsize=16)

plot_elts = (("221", 2, "New: Less than 3 years"),
             ("222", 1, "Experienced: 3-6 years"),
             ("223", 0, "Established: 7-10 years"),
             ("224", 3, "Veteran: 11 or more years"))

for s_no, s_idx, title in plot_elts:
    ax = plt.subplot(s_no)
    dis_pt.iloc[s_idx,:].plot(kind="bar", ax= ax, sharex=True, sharey=True)
    ax.set_title(title)
    ax.set_ylim([0, .8])
    plt.xlabel("Employee Age")
    plt.ylabel("Percent Dissatisfied")

plt.show()

However, shorter code is not always the best, reducing code size sometimes decreases comprehension.

Best,
Sahil

Hello,

I am having trouble plotting trues and falses. Is there a way to plot boolean values? Thank you!

What do you have so far? Do you mean to say you’d like to plot the number of true values and the number of false values? You should be able to use the .value_counts() method on the column then plot the resulting Series at that point.

1 Like