It’s been a while since I posted a guided project here. Well, ‘Clean And Analyze Employee Exit Surveys’ wasn’t really an easy project to do. Also, I got quite ambitious with this one and got myself into a deep pit. I wanted to make use of most of the data that was given to us, while the instructions asked us to drop most of the columns.
I went ahead with all the data I thought would make sense for the analysis (against the advice in the guided project instructions) and finally messed up. I couldn’t really get out of it, and hence I was stuck on that problem for almost 2 weeks. One of our latest Learning Assistants, @Elena_Kosourova, helped me get through this.
Finally, following the good advice, I decided to follow the guided instructions step by step and was able to finish the project.
I will definitely go back and do the project according to my ambitious plan again. But for now, here is my latest guided project. Since I did it quickly in order to finish it and move on with my studies, you might find a few things to rectify.
If you have some time, please go through it and give me some feedback. Looking forward to your suggestions. Here is the link. Thank you all.
Here is my last screen.
P6_Guided+Project_DQ_Clean+And+Analyze+Employee+Exit+Surveys.ipynb (184.9 KB)
Click here to view the jupyter notebook file in a new tab
That’s great, congratulations on finishing another challenging project! I agree that they are becoming more and more difficult throughout the course. It’s actually very good, because it requires you to really search for the right answer and the best approach, and to activate all your resources (I saw somewhere another kind of “guided project”, where everything is already written for you and you only have to fill in a few gaps). But sometimes you can feel stuck for many days, or can even go in the wrong direction with the analysis, that’s also true.
Well, now some suggestions about your project.
- When you return to this project, I would definitely recommend analyzing the dissatisfaction by age and also comparing DETE and TAFE separately. I remember that in the previous version of your project you started analyzing the age factor. Now that you have (finally) removed all the redundant columns, such an analysis will be much clearer and more informative. The same goes for the DETE vs. TAFE comparison. Personally, I found some interesting insights when doing this comparison (let’s say, the distributions for them are quite different, which looks curious), as well as when analyzing the factor of age.
- It’s better to customize the visualizations more. Well, for the same project my visualizations are not the best example, and I’ll definitely return to improve them. Anyway, in this case I would increase the figure size and the title and label font sizes, add a name to the y-axis, remove the x-axis name (the category names are sufficient here), remove the top and right spines together with the ticks, and remove the legend, which is redundant here.
- There seems to be an issue with the links at the beginning of the project (I just mean the way they are displayed).
- The code cell . Here it would be better to use value_counts(), maybe with sorting. This would improve the readability of the output.
- I would also recommend using a uniform style of quote marks throughout the project (either always single or always double). It’s not crucial, of course, but it is good practice.
Well, nothing else to add from my side. For the rest, great job: clean and well-commented code, exhaustive markdown explanations. In the code cell  I found out how to select non-consecutive columns all together.
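The plot customizations listed above could look roughly like the sketch below. It is only an illustration: the percentages are placeholder numbers rather than results from the notebook, and the category names and the title are assumptions.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder numbers for illustration only, not results from the notebook.
pct = pd.DataFrame(
    {"dissatisfied": [30.0, 35.0, 50.0, 48.0]},
    index=["New", "Experienced", "Established", "Veteran"],
)

fig, ax = plt.subplots(figsize=(10, 6))            # larger figure
pct.plot(kind="bar", ax=ax, legend=False, rot=0)   # single series, so no legend

ax.set_title("Dissatisfied employees by service category", fontsize=18)
ax.set_ylabel("Dissatisfied (%)", fontsize=14)     # name the y-axis
ax.set_xlabel("")                                  # category names are enough
ax.tick_params(axis="both", labelsize=12, top=False, right=False)

# Remove the top and right spines.
for side in ("top", "right"):
    ax.spines[side].set_visible(False)
```

In a notebook the figure is rendered at the end of the cell; in a script, plt.show() or fig.savefig(...) would be needed.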
Congratulations also on becoming tier-2 Learning Assistant!
Thank you @Elena_Kosourova for yet another super-detailed signature review! Since you know the back story, I was in a hurry just to finish this project and move on. So it is possible to see some of the reflections of my impatience in this project.
Like you’ve mentioned, I didn’t venture into categorizing age and institute, which I was planning on doing in my initial attempts. So this time I just followed the instructions line by line and avoided going into more adventurous waters. I will definitely go back and re-do this project soon.
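For whoever picks this up later, the age and institute breakdown could be sketched as below. The column names (institute, age_group, dissatisfied) and the rows are made up for illustration and are not taken from the notebook.

```python
import pandas as pd

# Toy data; the column names and values are assumptions for illustration.
combined = pd.DataFrame({
    "institute":    ["DETE", "DETE", "DETE", "TAFE", "TAFE", "TAFE"],
    "age_group":    ["21-40", "21-40", "41+", "21-40", "41+", "41+"],
    "dissatisfied": [True, False, True, False, False, True],
})

# Share of dissatisfied employees per institute (mean of a boolean column).
by_institute = combined.groupby("institute")["dissatisfied"].mean()

# Same share, split by age group, with one column per institute.
by_age_and_institute = (
    combined.groupby(["age_group", "institute"])["dissatisfied"]
    .mean()
    .unstack("institute")
)

print(by_institute)
print(by_age_and_institute)
```

Because the mean of a boolean column is the fraction of True values, no extra counting step is needed.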
You’re right about the plot. I was looking at it and thinking of applying some of the points you had mentioned, and then thought I would do it when I get back to it. Well, call it laziness.
The link could have had anchor text. Well, you may call it cutting corners.
Yes, value_counts() could have been better there.
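A minimal sketch of what that could look like, assuming a categorical column such as the service category (the Series name and values here are invented for illustration):

```python
import numpy as np
import pandas as pd

# Invented sample values; the real column would come from the project's DataFrame.
service_cat = pd.Series(
    ["Veteran", "New", "New", np.nan, "Established", "New", "Veteran"],
    name="service_cat",
)

# Counts sorted from most to least frequent; dropna=False keeps NaN visible.
counts = service_cat.value_counts(dropna=False)

# Alternative: sort by category label instead of by frequency.
counts_by_label = service_cat.value_counts().sort_index()

print(counts)
print(counts_by_label)
```

Keeping dropna=False while cleaning makes it easier to notice missing values before they get silently recategorized.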
Yes, I should stick to good practices, even for selecting the headings, subheadings etc. Thank you for mentioning.
Glad you could find something interesting. Well, that was the result of pushing myself with a lot of data in the previous attempts! I had many clusters of columns and didn’t know how to select them all in one go. I finally found it on Stack Overflow.
Thanks again for your detailed analysis of my project! And congratulations to you too on joining the LA team.
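The thread does not show which approach the notebook uses, so the following is only one common way to select several non-consecutive clusters of columns at once: building the positions with np.r_ and passing them to iloc. The DataFrame and its column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical DataFrame with ten columns named col_0 ... col_9.
df = pd.DataFrame(
    np.arange(20).reshape(2, 10),
    columns=[f"col_{i}" for i in range(10)],
)

# np.r_ concatenates slices and single positions into one index array,
# here positions 0, 1, 4, 7, 8 and 9.
positions = np.r_[0:2, 4, 7:10]
subset = df.iloc[:, positions]

print(list(subset.columns))
```

As with ordinary slicing, the end of each range in np.r_ is exclusive.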
I was stuck on Cleaning The Service Column portion of this project for many days but was able to get the help I needed to make progress thanks to you!
With that being said, I have a question:
1) You did not check for NaN values in your sorter function (cell 38)? The project gave us code to do so, or I may be misunderstanding what is being asked. Your help in understanding this would be greatly appreciated.
I can understand your situation. This was a tough project for me. I’m glad I could somehow help you.
I think cell 38 could be a mistake on my side: I forgot to take NaN values into consideration. I might have converted all the NaNs into veterans.
Please add that condition first in the code and let me know the results.
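A sketch of a sorter with the NaN check placed first is below. The function name and the year thresholds are assumptions (they follow the career-stage buckets usually suggested for this guided project), not the code from cell 38.

```python
import numpy as np
import pandas as pd

def sort_service(years):
    """Map years of service to a career-stage label, keeping NaN as NaN."""
    # Check for missing values first so they never fall through to a label.
    if pd.isnull(years):
        return np.nan
    if years < 3:
        return "New"
    if years < 7:
        return "Experienced"
    if years < 11:
        return "Established"
    return "Veteran"

# Invented sample values for illustration.
years_of_service = pd.Series([1.0, 5.0, np.nan, 20.0])
service_cat = years_of_service.apply(sort_service)
print(service_cat.value_counts(dropna=False))
```

The order matters because every comparison with NaN evaluates to False, so without the first check a missing value ends up in whichever branch is the fallback.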
This is one project I definitely have to come back and improve. Thank you for pointing out a mistake. Also, please do share the result when you finish. Looking forward to it.
Thanks again for all your help with this project and for such a professional and fast response. You are definitely setting the bar high! I hope as we continue to complete the course we can utilize each other as a resource.
My prior post was my first, and you made it a great experience!