GP: Clean And Analyze Employee Exit Surveys

Good project. Thanks for giving me feedback.

You can see my work here.

Hello @biadboze! Thanks for sharing.

You did a rather good job and here is some feedback:

  • You copied the whole Dataquest introduction, which is not fair. It’s OK if you cannot come up with another way of asking the question, but try to rephrase at least the introduction.
  • Your code blocks have comments! That’s a useful habit because it helps you and other people in understanding your code.
  • You also copied the DQ instructions in the “Verification of data” section.
  • What does your boxplot represent? What information would you like to communicate to us? Give it a title, axes labels and some explanation below.
  • Your second plot lacks a title and an x-axis label. It’s also a good idea to order the categories logically (for example, from New to Veteran) to improve readability. You also do not need any rotation, because horizontal category labels are the easiest to read.
  • Could you give more motivation for categorizing NaN values in the categorizeNaN function? You add a considerable amount of data that can have a great impact on the results.
  • When you visualize the number of dissatisfied employees by position, what message do you want to convey? A plot should not be there just for the sake of visualization; it should communicate some useful information.
  • You have some typos in the narratives.
  • Your code style isn’t consistent: for example, you sometimes put spaces around the commas in method calls and sometimes don’t.
  • You have no conclusion. Wrap up all your results: did you answer the initial question? What are the most important insights from your analysis?

I advise you to read this article on project style. For the data visualization part, I recommend reading “Storytelling with Data”.

Happy coding:)

Thank you very much for your feedback. I’ll think about it.

I need your help with this part: I want to order the categories in my plot to make it more readable, and I can’t rename the x-axis label.

  • Could you give more motivation for categorizing NaN values in the categorizeNaN function? You add a considerable amount of data that can have a great impact on the results.
    About this part: I wanted to fill the missing values as above, but you’re right. So should I remove the rows with NaN in the service_cat column?

thanks

Hello!

  • You can rename your x-axis label with this code: axes.set_xlabel(<label name>).
  • You may remove them if you can give a motivation. The choice is yours; just make sure to motivate it.
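
Both points, plus the category ordering asked about earlier, can be sketched roughly as follows. This is only a minimal illustration on made-up data: the toy DataFrame, the two middle category names ("Experienced", "Established"), and the label texts are assumptions, not taken from the actual notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for combined_updated
combined_updated = pd.DataFrame({
    "service_cat": ["Veteran", "New", np.nan, "Established", "Experienced", "New", np.nan],
    "dissatisfied": [True, False, True, True, False, True, False],
})

# One possible choice: drop the rows where service_cat is missing
# (whichever option you pick, explain why in the narrative)
cleaned = combined_updated.dropna(subset=["service_cat"])

# Assumed logical order of the categories, from least to most experienced
order = ["New", "Experienced", "Established", "Veteran"]

# Share of dissatisfied employees per category, rearranged into that order
share = cleaned.groupby("service_cat")["dissatisfied"].mean().reindex(order)

# rot=0 keeps the category labels horizontal
ax = share.plot(kind="bar", rot=0)
ax.set_xlabel("Service category")
ax.set_ylabel("Share of dissatisfied employees")
ax.set_title("Dissatisfaction by length of service")
```

The reindex call is what fixes the bar order; without it, pandas sorts the categories alphabetically.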

Happy coding:)

thanks :wink:

Happy coding :slight_smile:

Please, I want to understand something properly. It’s about combined_updated.pivot_table(index='service_cat', values='dissatisfied'): which value is taken into account, True, False, or both?
Thanks for your help.

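Hi @biadboze! Sorry for the late response. The pivot table aggregates the True values, which are treated as 1s (and False as 0s). With the default aggregation function, the mean, the result is the share of True values in each group.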

Hi @artur.sannikov96, OK, good.

Let’s look at something.


I don’t know why the plot is different from the result above. Could you clarify this part for me?

Hello @biadboze! In the first screenshot, you count the number of dissatisfied employees for each position.

In the second screenshot, you aggregate the data by position and return the mean of the True values (in other words, the share of dissatisfied employees) for each position. The default aggregation function for pivot tables in pandas is the mean, as you can read in the docs. To obtain the same result, you can pass the np.sum function to the aggfunc argument.
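
The difference between the two results can be reproduced with a small sketch. The DataFrame below is made-up toy data, not the real combined_updated, and the variable names are chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for combined_updated
combined_updated = pd.DataFrame({
    "service_cat": ["New", "New", "New", "Veteran", "Veteran"],
    "dissatisfied": [True, False, False, True, True],
})

# Default aggfunc is the mean: True counts as 1 and False as 0,
# so this gives the share of dissatisfied employees per category
share = combined_updated.pivot_table(index="service_cat", values="dissatisfied")

# Summing instead gives the number of dissatisfied employees per category
count = combined_updated.pivot_table(
    index="service_cat", values="dissatisfied", aggfunc=np.sum
)

print(share)
print(count)
```

Here "New" has one dissatisfied employee out of three, so the mean is about 0.33 while the sum is 1. Recent pandas versions may emit a warning suggesting the string "sum" in place of np.sum; both give the same totals.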

Hope it was helpful.

Happy coding:)

Yes, yes, right :man_facepalming:t5: I thought the mean would normally give the same result. I’ll check the docs carefully. Thanks!

If it solved your doubt, could you mark my answer as a solution? Thanks!

OK, no problem, but how can I do it?

Sorry, it’s only for the Q&A section. Never mind.