GP: Employees Exit Survey Cleaning and Analysis

Hey everyone!
Sharing my latest complete project here.
I am mostly worried about the data cleaning part, especially lines 14-15 of the notebook, where the values I got differ from other people's and I couldn't work out why.
Any help (or reassurance:) will be welcome!

Best day to you all!


My project:
Employees Exit Survey.ipynb (158.7 KB)



Hi @daryaholodova! Thanks for sharing yet another project with the Community :) I liked that you clearly stated the two questions you want to answer at the end of the analysis. You've also done a great analysis without writing too much: your sentences are short and summarize only the most important insights you've discovered. Also, great job digging into finer-grained age groups (it's interesting that you found this difference among young people), although are the samples big enough?

Some feedback from my side:

  • How can this information be useful to readers/employees? State it at the beginning. It can also be something like: it is interesting to know the differences between new and veteran employees!
  • You have some typos.
  • There are some Python style discrepancies. You can use the Jupyter Lab/Notebook Formatter together with a code formatter (like Black).
  • When you print out data, don't forget to add a line explaining what that data means, to improve readability. Something like: First five rows of the dataset, etc.
  • Improve your plots: title them, label the axes, remove legends where they are not needed, and reorder categorical variables in a logical way (from New to Veteran). You can also increase the font size of the plots' labels with seaborn.set_context (see seaborn.set_context in the seaborn 0.11.2 documentation).
  • I see a difference between your results and those of other projects for the DETE data set. This may be because you select the columns with .iloc instead of manually selecting the columns you need by name. Make sure that you've selected ["job_dissatisfaction", "dissatisfaction_with_the_department", "physical_work_environment", "lack_of_recognition", "lack_of_job_security", "work_location", "employment_conditions", "work_life_balance", "workload"]
  • What’s the point of your last plot? Does it confirm your hypothesis that sex and job dissatisfaction are not correlated?
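To illustrate the .iloc point from the list above, here is a minimal sketch on a toy DataFrame. The values are made up, the column `unrelated_reason` is hypothetical, and the slice bounds are an assumption, so this is not your actual notebook code or the real DETE data. It only shows how a positional slice can quietly pull in an extra column and change the count of dissatisfied employees.

```python
import pandas as pd

# Column names quoted in the feedback above.
DISSATISFACTION_COLS = [
    "job_dissatisfaction",
    "dissatisfaction_with_the_department",
    "physical_work_environment",
    "lack_of_recognition",
    "lack_of_job_security",
    "work_location",
    "employment_conditions",
    "work_life_balance",
    "workload",
]

# Toy column order: a hypothetical unrelated column sits in the middle
# of the dissatisfaction columns (an assumption for illustration only).
columns_in_file = (
    DISSATISFACTION_COLS[:7] + ["unrelated_reason"] + DISSATISFACTION_COLS[7:]
)

# Four made-up respondents, all answers False to start with.
toy = pd.DataFrame(False, index=range(4), columns=columns_in_file)
toy.loc[0, "job_dissatisfaction"] = True
toy.loc[1, "workload"] = True
toy.loc[2, "unrelated_reason"] = True  # not a dissatisfaction reason

# Selecting by name only looks at the nine intended columns.
by_name = toy[DISSATISFACTION_COLS].any(axis=1)

# Selecting by position also sweeps in "unrelated_reason".
by_position = toy.iloc[:, 0:10].any(axis=1)

# Quick check of which sliced columns were never meant to be included.
extra = [c for c in toy.columns[0:10] if c not in DISSATISFACTION_COLS]

print(by_name.sum(), by_position.sum(), extra)
```

On this toy data the name-based selection flags 2 respondents and the positional slice flags 3. Listing the columns your own slice covers and comparing them with the list above is a quick way to see whether something similar is happening in your notebook.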

Happy coding and good luck with your next projects! 😄


Thanks a bunch for the detailed review again, Artur!
Great tip about Jupyter Lab! I didn’t know about it and will definitely check it out.
Plots are for sure my weak spot; they have been my least favorite topic tbh.

Anyway, this is a great help for polishing the project before I think about adding it to my portfolio!
Happy coding to you too!
