Feedback and Question: Visualizing Earnings Based On College Major

Hello everyone,
I just finished this project and would appreciate feedback.

I also had some questions I did not know how to solve. It’s more about reading and interpreting the graphs rather than the code.

I couldn’t, and didn’t know how to, answer these questions:

  1. What percent of majors are predominantly male? Predominantly female?

  2. Do students that majored in subjects that were majority female make more money?

Recent Grads Plotting data.ipynb (345.7 KB)



Hi @RayOjel,

For your first question, you could count the majors where ShareWomen > 0.5 to find how many are predominantly female, and the majors where ShareWomen < 0.5 to find how many are predominantly male. Subtracting these two counts from the total number of majors shows how many are evenly split, and dividing each count by the total gives the percentages.
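As a rough sketch of that counting approach in pandas: the small DataFrame below is a made-up stand-in for the project's data (only the ShareWomen column name comes from the thread), and dropping missing values first is an assumption in case some rows have no share recorded.

```python
import pandas as pd

# Toy stand-in for the project's DataFrame; these values are invented
# purely to exercise the code and are NOT from the real dataset.
recent_grads = pd.DataFrame({
    "Major": ["A", "B", "C", "D", "E"],
    "ShareWomen": [0.12, 0.75, 0.50, 0.64, None],
})

# Assumption: rows with a missing share are excluded from every group.
share = recent_grads["ShareWomen"].dropna()

n_total = len(share)
n_female = (share > 0.5).sum()   # predominantly female majors
n_male = (share < 0.5).sum()     # predominantly male majors
n_equal = n_total - n_female - n_male  # exactly evenly split

# Convert the counts to percentages of all majors with a known share.
pct_female = 100 * n_female / n_total
pct_male = 100 * n_male / n_total

print(f"Predominantly female: {pct_female:.1f}%")
print(f"Predominantly male:   {pct_male:.1f}%")
print(f"Evenly split:         {n_equal} major(s)")
```

With the real data you would replace the toy DataFrame with the one you loaded in the notebook.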

For your second question, you’ll want to look at plots that compare the ShareWomen column and the Median column.
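Alongside a plot, a numeric comparison can help when reading the graph. The sketch below uses invented toy values (so it says nothing about the real answer), and the group labels are hypothetical names chosen for illustration.

```python
import pandas as pd

# Toy stand-in data; values are invented only so the code runs and
# do NOT reflect the real dataset.
recent_grads = pd.DataFrame({
    "ShareWomen": [0.10, 0.30, 0.45, 0.60, 0.80, 0.90],
    "Median": [60000, 52000, 45000, 40000, 35000, 30000],
})

# Label each major by whether women make up more than half of it.
# The label strings are hypothetical, picked for readability.
group = recent_grads["ShareWomen"].gt(0.5).map(
    {True: "majority female", False: "majority male or equal"}
)

# Typical (median) earnings within each group.
median_by_group = recent_grads.groupby(group)["Median"].median()

# Pearson correlation between the share of women and median earnings.
correlation = recent_grads["ShareWomen"].corr(recent_grads["Median"])

print(median_by_group)
print(f"Correlation: {correlation:.2f}")
```

For the plot itself, a scatter plot such as recent_grads.plot(x="ShareWomen", y="Median", kind="scatter") puts the two columns on the same axes, so a downward or upward trend is visible at a glance.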

I hope this helps,



Hey @RayOjel,

Thanks for sharing your project with us. :slight_smile:

You have done a good job completing the project. Kudos!

Here are my quick tips to improve it further:

  • It’s difficult to follow what you mean by “looking at the median histogram”. It would help to say which histogram you are referring to. The conclusion might also feel more coherent with the analysis if you moved it to the end instead of the beginning.

  • Can you include some explanation for the last few graphs? Right now, I don’t understand what those graphs accomplish.


Thank you. I kept your suggestions from my previous project in mind, which is why I added an introduction and a conclusion. Thanks for taking the time to help me improve.
Also, how do you add the link for the jupyter notebook? I can only seem to upload it as a file.


I’m so glad that you found my advice helpful @RayOjel!

You don’t have to add the link; it is done automatically. Just upload your file when you create a topic!


Hello fellow learners!

I have one question regarding the dataset for this project. I cannot quite understand what the Sample_size column actually describes, so I cannot work out how to interpret the scatter plots I generate!

It is stated that
Sample_size : Sample size (unweighted) of full time.

I cannot understand what that really means.

Thank you very much for your time!

Ioannis Nikolaos

Sample size is a statistics term: it is the number of observations in a subset (a sample) taken from the whole population.
If you think of the whole population as a pizza, a sample would be a slice, and based on the answers from that slice, we can make estimates for the whole population.
In this dataset, if a major has 200 people in total and a sample size of 36, then we only have survey data from 36 of those 200.
I hope that makes it clearer!
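To make that concrete for the scatter plots, here is a small sketch that compares the sample size with the total for each major. It assumes the dataset has a Total column holding the number of people in each major. The values are invented, and the cutoff of 30 is an arbitrary, hypothetical threshold for "small".

```python
import pandas as pd

# Toy stand-in data; values are invented for illustration only.
# Assumption: "Total" is the number of people with the major.
recent_grads = pd.DataFrame({
    "Major": ["A", "B", "C"],
    "Total": [200, 5000, 1200],
    "Sample_size": [36, 400, 9],
})

# Share of each major's population that was actually surveyed.
recent_grads["sample_fraction"] = (
    recent_grads["Sample_size"] / recent_grads["Total"]
)

# Hypothetical cutoff: majors with very few respondents, whose
# figures should be read with more caution.
SMALL_SAMPLE = 30
small_sample = recent_grads[recent_grads["Sample_size"] < SMALL_SAMPLE]

print(recent_grads)
print(small_sample["Major"].tolist())
```

Points on a scatter plot that come from majors with a small Sample_size rest on fewer responses, so an unusual value there is weaker evidence than the same value for a major with hundreds of respondents.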


Thank you very much @RayOjel for the quick reply!
Yes it makes the whole picture clearer!
