Guided Project: Clean and Analyze Employees Exit Survey

Hello,

I am working on the Clean and Analyze Employees Exit Survey project. In step 5, while verifying the data in the cease_date column, I get the value ‘Not Stated’. I have checked the solution and other people’s work, but this value does not appear there. I cannot convert the column to float type while this value is present. Should I delete the rows that have the ‘Not Stated’ value?
My Code:

dete_resignations['cease_date'].value_counts()

What I expected to happen:

2012       126
2013        74
01/2014     22
12/2013     17
06/2013     14
09/2013     11
11/2013      9
07/2013      9
10/2013      6
08/2013      4
05/2012      2
05/2013      2
09/2010      1
07/2006      1
07/2012      1
2010         1
Name: cease_date, dtype: int64

What actually happened: 
2012          126
2013           74
01/2014        22
12/2013        17
06/2013        14
09/2013        11
Not Stated     11
07/2013         9
11/2013         9
10/2013         6
08/2013         4
05/2012         2
05/2013         2
2010            1
09/2010         1
07/2012         1
07/2006         1
Name: cease_date, dtype: int64

print(dete_resignations[dete_resignations['cease_date']== 'Not Stated'])
      id separationtype  cease_date dete_start_date role_start_date
683  685    Resignation  Not Stated            2011            2012
694  696    Resignation  Not Stated            2012      Not Stated
704  706    Resignation  Not Stated            2006            2007
709  711    Resignation  Not Stated      Not Stated      Not Stated
724  726    Resignation  Not Stated            1984      Not Stated
770  772    Resignation  Not Stated            1987            1987
774  776    Resignation  Not Stated            2005            2005
788  790    Resignation  Not Stated            1990            2010
791  793    Resignation  Not Stated            2007            2007
797  799    Resignation  Not Stated            2000            2013
798  800    Resignation  Not Stated            1995      Not Stated

Hi @haupham.neu,

These Not Stated values indicate missing data that isn’t represented as NaN. So, following code cell [35] in the solution notebook, we have to re-read the data, this time telling pandas to read Not Stated values as NaN:

dete_survey = pd.read_csv('dete_survey.csv', na_values='Not Stated')
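To see the effect without the real file, here is a minimal sketch on a tiny in-memory CSV. The sample rows and the name dete_sample are made up for illustration, and the column is called cease_date here only to match the thread; the idea is just to show that na_values turns the placeholder into NaN, after which the year can be extracted and converted to float.

```python
import io
import pandas as pd

# Hypothetical stand-in for dete_survey.csv (sample rows, not the real data)
sample_csv = io.StringIO(
    "id,separationtype,cease_date\n"
    "1,Resignation,2012\n"
    "2,Resignation,09/2013\n"
    "3,Resignation,Not Stated\n"
)

# Treat the 'Not Stated' placeholder as a missing value while reading
dete_sample = pd.read_csv(sample_csv, na_values='Not Stated')

# 'Not Stated' no longer appears as its own category; it is counted as NaN
counts = dete_sample['cease_date'].value_counts(dropna=False)

# Pull the four-digit year out of both '2012' and '09/2013', then convert to float
years = dete_sample['cease_date'].str.extract(r'(\d{4})', expand=False).astype(float)
print(years)
```

Because the missing entries are now NaN rather than a string, the float conversion goes through and the rows do not need to be deleted just to make the conversion work.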

Oh I see. Thank you.

I’ve tried that and it does not work for me. It also does not work in the answer key provided on GitHub, if you look at the cell: Screen Shot 2021-09-05 at 1...
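If re-reading the file does not change the output, two things may be worth checking. Both are guesses rather than a confirmed diagnosis: dete_resignations may have been created before the re-read and never rebuilt, or the values may carry stray whitespace so they do not match 'Not Stated' exactly. A possible fallback is to replace the placeholder directly on the column. The sketch below uses a made-up Series in place of dete_resignations['cease_date'].

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for dete_resignations['cease_date']
cease_date = pd.Series(['2012', '09/2013', 'Not Stated', ' Not Stated '])

# Strip surrounding whitespace first, then turn the placeholder into a real NaN
cleaned = cease_date.str.strip().replace('Not Stated', np.nan)

# Confirm that no 'Not Stated' strings are left
remaining = (cleaned == 'Not Stated').sum()
print(remaining, cleaned.isna().sum())
```

If the re-read approach is used instead, every cell that derives dete_resignations from dete_survey has to be run again afterwards, otherwise the old values stay in place.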