Creating Boxplot

Screen Link:

https://app.dataquest.io/jupyter/notebooks/notebook/Employee%20Exit%20Surveys.ipynb

My Code:

import pandas as pd
import numpy as np
dete_survey = pd.read_csv("dete_survey.csv")
tafe_survey = pd.read_csv("tafe_survey.csv")
dete_survey.info()
tafe_survey.info()
dete_survey.head()
tafe_survey.head()
dete_survey.isnull()
tafe_survey.isnull()
dete_survey.isnull().sum()
tafe_survey.isnull().sum()
dete_survey = pd.read_csv("dete_survey.csv", na_values='Not Stated')
dete_survey.columns
tafe_survey.columns
dete_survey_updated = dete_survey.drop(dete_survey.columns[28:49], axis=1)
tafe_survey_updated = tafe_survey.drop(tafe_survey.columns[17:66], axis=1)
dete_survey_updated.columns = dete_survey_updated.columns.str.lower().str.strip().str.replace(' ','_')
dete_survey_updated.columns
tafe_survey_updated.columns
tafe_survey_updated = tafe_survey_updated.rename({"Record ID":"id", "CESSATION YEAR":"cease_date", "Reason for ceasing employment":"separationtype", "Gender. What is your Gender?":"gender", "CurrentAge. Current Age":"age", "Employment Type. Employment Type":"employment_status", "Classification. Classification":"position", "LengthofServiceOverall. Overall Length of Service at Institute (in years)":"institute_service", "LengthofServiceCurrent. Length of Service at current workplace (in years)":"role_service"}, axis=1)
tafe_survey_updated.columns
dete_survey_updated['separationtype'].value_counts()
dete_survey_updated['separationtype'].unique()
#Update all separationtypes with the word 'resignation' to 'Resignation' category by splitting and selecting the first element
dete_survey_updated['separationtype'] = dete_survey_updated['separationtype'].str.split('-').str[0]
dete_survey_updated['separationtype'].value_counts()
dete_survey_updated['separationtype'].unique()
tafe_survey_updated['separationtype'].value_counts()
dete_resignations = dete_survey_updated[dete_survey_updated['separationtype'] == 'Resignation'].copy()
tafe_resignations = tafe_survey_updated[tafe_survey_updated['separationtype'] == 'Resignation'].copy()
dete_resignations['cease_date'].value_counts()
dete_resignations['cease_date'] = dete_resignations['cease_date'].str.split('/').str[-1]
dete_resignations['cease_date'] = dete_resignations['cease_date'].astype("float")
dete_resignations['cease_date'].value_counts()
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

ax.boxplot(dete_resignations['cease_date'])
#dete_resignations.boxplot(column=['cease_date'])
plt.show()

What I expected to happen:
A boxplot to be created for dete_resignations['cease_date'].

What actually happened:

KeyError: 0

How would I create separate boxplots for the two datasets?
I’m getting a key error 0. Why is it looking for a key 0?

Hello @vroomvroom

I can't access the Jupyter notebook link you provided; it throws an error. It would be helpful if you could share the mission link.

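As I understand your code and the error, the plot tries to read the data starting from index 0. That fails here because dete_resignations is a filtered slice of the original DataFrame, so its index keeps the original row labels and may not contain a label 0. Passing .values (a plain NumPy array) or resetting the index should solve this.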

ax.boxplot(dete_resignations['cease_date'].values)
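
To illustrate where a KeyError: 0 can come from, here is a minimal sketch with made-up stand-in data (the rows and the names demo, resignations, and series are hypothetical, not taken from your notebook). It only shows the pandas side of the problem: after filtering, the label 0 is gone, while the NumPy array from .values is indexed by position.

```python
import pandas as pd

# Hypothetical stand-in for the survey data: row 0 is not a resignation.
demo = pd.DataFrame({
    "separationtype": ["Age Retirement", "Resignation", "Resignation"],
    "cease_date": [2012.0, 2013.0, 2014.0],
})

# Boolean filtering keeps the original row labels, so label 0 disappears.
resignations = demo[demo["separationtype"] == "Resignation"].copy()
series = resignations["cease_date"]
print(list(series.index))

# With an integer index, series[0] is a label lookup, not a positional one.
try:
    series[0]
    raised = False
except KeyError:
    raised = True
print("KeyError raised:", raised)

# .values returns a NumPy array, which is always indexed by position.
values = series.values
first = values[0]
print(first)
```

Calling resignations.reset_index(drop=True) would be the other route: it renumbers the rows from 0 so that label and position agree again.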

Give it a shot and let me know.

Best
K!

Employee Exit Surveys-Copy1.ipynb (308.2 KB)
Thanks for replying to my post! I entered

import matplotlib.pyplot as plt

fig, ax = plt.subplots()

ax.boxplot(dete_resignations['cease_date'].values)
#dete_resignations.boxplot(column=['cease_date'])
plt.show()

The error now is RuntimeWarning: Invalid value encountered…

This boxplot is for Guided Project: Clean And Analyze Employee Exit Surveys, page 5/11, tag 348-5.


Hey

I plotted the box plot for the start_date. Here is the screenshot

[Screenshot: box plot of start_date]

You can try the same for the cease_date column with the code below. I believe it should work now.
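import matplotlib.pyplot as plt
import seaborn as sns

sns.set_style('dark')
fig = plt.figure(figsize=(10, 5))
ax1 = sns.boxplot(x=dete_resignations['cease_date'])
# despine acts on the current axes, so call it after plotting
sns.despine(left=True, bottom=True)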
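
If you would rather stay with plain matplotlib, one likely cause of the RuntimeWarning is missing values (NaN) in cease_date, since ax.boxplot does not drop them for you. That is an assumption on my side, not something I confirmed in your notebook. A minimal sketch with made-up values (the Series below is a hypothetical stand-in for dete_resignations['cease_date']):

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for dete_resignations['cease_date'], with gaps.
cease_date = pd.Series(
    [2012.0, np.nan, 2013.0, 2014.0, np.nan],
    index=[3, 5, 8, 9, 11],
)

# Drop the missing values before handing the data to matplotlib.
clean = cease_date.dropna()

fig, ax = plt.subplots()
result = ax.boxplot(clean.values)
ax.set_ylabel("cease_date")
```

In the notebook, add plt.show() at the end as in your original cell.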

Best
K!

Thanks for your suggestions! I created the individual boxplot using:

import matplotlib.pyplot as plt
%matplotlib inline
fig = plt.figure(figsize=(10,10))
fig.add_subplot(1,1,1)

dete_resignations.boxplot(column=['cease_date']).set_ylim(2011, 2015)
plt.ticklabel_format(useOffset=False, axis='y')
plt.show()
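
For the other part of my question, separate boxplots for the two datasets, something along these lines might work. This is only a sketch: the two small DataFrames are made-up stand-ins for dete_resignations and tafe_resignations, and it assumes the TAFE cease_date column is already numeric.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the two cleaned resignation DataFrames.
dete_demo = pd.DataFrame({"cease_date": [2012.0, 2013.0, 2013.0, 2014.0, np.nan]})
tafe_demo = pd.DataFrame({"cease_date": [2010.0, 2011.0, 2012.0, 2013.0, np.nan]})

# One row, two columns: one Axes per dataset, sharing the y-axis for comparison.
fig, (ax_dete, ax_tafe) = plt.subplots(1, 2, figsize=(10, 5), sharey=True)

# Drop NaN and pass plain arrays so index labels and gaps are not an issue.
ax_dete.boxplot(dete_demo["cease_date"].dropna().values)
ax_dete.set_title("DETE")

ax_tafe.boxplot(tafe_demo["cease_date"].dropna().values)
ax_tafe.set_title("TAFE")

# Show full years on the y-axis instead of an offset such as +2.01e3.
ax_dete.ticklabel_format(useOffset=False, axis="y")
```

With the real data, swap in dete_resignations and tafe_resignations and finish the cell with plt.show().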