My Fifth Guided Project! Visualizing the gender gap in college degrees

Hey guys, I’m here again!

I’m very happy to finish my last guided project of this year!

Short info about my journey: in April I decided I wanted to work as a Data Scientist instead of an Engineer (I’m in the last year of university). With no programming knowledge, I started learning Python, did some online courses and, a month ago, started my Data Scientist Path here in Data Quest!

People like @brayanopiyo18, who has been reviewing my projects since the first one, @jithins123 and @Yeside make the journey even better with their feedback and support :smiling_face_with_three_hearts: Thanks a lot!!!

About this project: any feedback would be great! I had to look at the solution to work out this loop, but in the end I understood it, at least.

New corrected file:
Visualizing the gender gap in college degrees.ipynb (161.9 KB)

Thanks in advance and happy new year! :sparkles:

7 Likes

Hi @nathalia.pignaton,

Thank you for sharing your project, it looks very nice! It is well-structured, the code is clean and well-commented, and the storytelling is coherent and easy to follow. I also like how you use emphasis throughout the project where necessary (bold and bold-italic fonts). Well done!

Here are some suggestions from my side, hope they will be useful:

  • Chapter numbering: number 7 is missing.
  • Code cell [2]: it’s better to remove the second piece of code (the commented-out one) from the project, while of course keeping it in mind for the future.
  • The code cell [3]. When we select only rows from a dataframe, we can omit the column part. I mean, we can write here recent_grads.iloc[:5], or, exactly in this case, even recent_grads.head().
  • It’s better to use a uniform quote style for strings throughout the project: either only single or only double quotes.
  • For some graphs, especially in chapter 4, I would write more markdown comments. Well, I agree with you that there isn’t much correlation among the columns here :blush: But the absence of a result is also a result. You can also try zooming into some parts of the graphs; a correlation may reveal itself only over a small range of values in certain graphs, for example.
  • You should add a conclusion at the end, summarizing the most interesting and useful insights obtained while doing this project.
  • I would recommend returning to this project after the mission called “Improving Plot Aesthetics” and doing the following: increase the font size of plot titles, axis labels, and legend labels, add titles where missing, remove grids (for example, for the plots in chapter 4) and despine all the plots (remove at least the top and right spines).
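
Two of the suggestions above (selecting rows without the column part, and the plot clean-up) can be sketched in a few lines. This is only an illustration under assumptions: the small recent_grads frame below is a made-up stand-in for the real dataset, and the column names and font sizes are chosen for the example rather than taken from the notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for recent_grads; values and column names are assumptions.
recent_grads = pd.DataFrame({
    "Major": ["A", "B", "C", "D", "E", "F", "G"],
    "ShareWomen": [0.12, 0.35, 0.48, 0.61, 0.77, 0.52, 0.29],
    "Median": [75000, 60000, 52000, 40000, 33000, 45000, 65000],
})

# Selecting only rows: the column part of iloc can be omitted,
# and for the first five rows head() is the shorter spelling.
first_five = recent_grads.iloc[:5]
same_as_head = first_five.equals(recent_grads.head())

fig, ax = plt.subplots(figsize=(6, 4))
ax.scatter(recent_grads["ShareWomen"], recent_grads["Median"])

# Larger fonts for the title and the axis labels.
ax.set_title("Median salary vs. share of women", fontsize=16)
ax.set_xlabel("ShareWomen", fontsize=13)
ax.set_ylabel("Median", fontsize=13)

# No grid, and no top/right spines ("despining").
ax.grid(False)
for side in ("top", "right"):
    ax.spines[side].set_visible(False)

# Zoom into one part of the graph to look for local patterns.
ax.set_xlim(0, 0.5)
```

The same spine and grid calls can be applied to every Axes object in a figure, so they fit naturally inside whatever loop creates the subplots.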

All in all, congratulations on the great job and good luck with your future projects as well!

Happy New Year! :santa:

3 Likes

@Elena_Kosourova through your comment I saw that I attached the wrong file! :woman_facepalming:t4:
Thanks a lot for the feedback (I will for sure improve the things you mentioned), but this was the Visualizing Earnings Based On College Majors project, not Visualizing The Gender Gap In College Degrees. I’ll change it now.
The topic for the fourth project is here, but I’ve already noted your points.
Thanks again and sorry for the mistake :pensive:

1 Like

Hi @nathalia.pignaton,

Oh, so I noticed those small details but not the main issue! :sweat_smile: :joy: No problem, it’s ok! Good that we found it anyway. I will review your new notebook later as well :blush:

1 Like

Hi @nathalia.pignaton

Happy to see your fifth project, Visualizing The Gender Gap in College Degrees.
I can’t believe that in just a month in Data Quest you have managed all of this (5 guided projects). You are very determined indeed, kudos for that!

nathalia.pignaton wrote:
People like @brayanopiyo18, who has been reviewing my projects since the first one, @jithins123 and @Yeside make the journey even better with their feedback and support :smiling_face_with_three_hearts: Thanks a lot!!!

Honestly, reviewing most of your projects has been an added advantage on my path; all the interactions have exposed me to a new level of understanding of data science. Thanks as well for that.

About the project: to me, everything looks good, and the introduction and the comments are very informative. Just a suggestion: I think the conclusion needs more than that. I have checked through it and noticed you haven’t touched on the art courses.

Otherwise, congratulations on the good work, and wishing you a Happy New Year!

3 Likes

Thanks, @brayanopiyo18! I’ll take some time to read other projects and see how I can make the conclusion more informative (it has been difficult). :blush:

1 Like

Hi @nathalia.pignaton

Congratulations on your 5th project and thank you for sharing your projects! I learn a few things from reviewing them.

I agree with @brayanopiyo18 that the conclusion needs a bit more information.

I have a suggestion that I think may help.

Since you already introduced the problem statement (a gender gap) in your introduction and you mentioned sectioning the analysis into STEM, liberal arts, and others, your conclusion should include what you observed about the gender gap in these groups while focusing on ‘interesting’ subjects within each group.

Hope this helps.

Have a wonderful 2021!

2 Likes

@Yeside thanks a lot!! I’ll rewrite it :upside_down_face:

And a wonderful 2021 to you too!!

1 Like

Hi @nathalia.pignaton,

Now I have returned with my review of the right version of your project :blush:
To add to the ideas that our fellow co-learners have already suggested to you:

  • It would be good to add some short observations after code cell [1]. For example, already from these rows it seems that women dominate in some fields (like “Health Professions” and “Foreign Languages”), while in others they are a minority (like “Engineering”). It’s important to mention that these 5 rows cover the “old” years, from 1970 to 1974 (unless you also decide to look at the “recent” years, i.e. women_degrees.tail(), for comparison). Still, this already gives us at least some assumptions about what to expect (or what we expect to have changed) by the end of the project.
  • About the Color Blind 10 palette, you can just mention that it was used in the project, since its name is quite self-explanatory.
  • Code cell [2]. After the comment # generating the second column of the grid: liberal arts courses, it’s better to remove all the intermediate comments in the other for-loops (like # creating the category index, # removing the spines, etc.), since they repeat and we already know what they do from the first for-loop. Actually, it would be great to combine all 3 for-loops into one giant for-loop; I saw an implementation of this idea in some of our fellow learners’ guided projects. I still have to return to my own project to implement this giant for-loop and, in general, to improve it significantly (looking back at it, it looks like a disaster :sweat_smile:).
  • About this part:
# saving our image - fig.savefig() also works
# plt.savefig('gender_degrees.png')

It’s better not to leave commented-out code in the project (either remove it or uncomment it). Also, the alternative version of this code can be omitted (after all, we could write alternative code in many cases).

  • In general, the comments on self-evident things (like importing libraries, creating the plot figure, showing the graphs) can be omitted.
  • For a long line of code with arguments, like this:
ax.plot(women_degrees['Year'], women_degrees[lib_arts_cats[cat_index]], c=(0,107/255,164/255), linewidth=3)    

you can divide it into several lines, for better readability:

ax.plot(
    women_degrees['Year'],
    women_degrees[lib_arts_cats[cat_index]],
    c=(0, 107/255, 164/255),
    linewidth=3
)

One argument per line. It’s an especially good idea when there are a lot of arguments. The same goes for splitting a long comment across several lines.

  • I would add some observations already at the end of chapter 2 and then restate them, more concisely, in the conclusion.
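
The idea of combining the three for-loops from code cell [2] into one could look roughly like the sketch below. It is not the notebook’s code: the category lists are shortened, hypothetical versions, stem_cats and category_columns are assumed names, and women_degrees is filled with synthetic numbers so that the example runs on its own. The two colors are the ones quoted in the ax.plot() lines above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical, shortened category lists; the notebook's lists are longer.
stem_cats = ["Engineering", "Computer Science"]
lib_arts_cats = ["Foreign Languages", "English"]
other_cats = ["Health Professions", "Business"]
category_columns = [stem_cats, lib_arts_cats, other_cats]

# Synthetic stand-in for women_degrees (percentage of degrees awarded to women).
years = np.arange(1970, 2012)
rng = np.random.default_rng(0)
women_degrees = pd.DataFrame({"Year": years})
for name in stem_cats + lib_arts_cats + other_cats:
    women_degrees[name] = np.linspace(
        rng.uniform(5, 60), rng.uniform(20, 90), len(years)
    )

cb_dark_blue = (0, 107/255, 164/255)  # line for women
cb_orange = (1, 128/255, 14/255)      # line for men

n_rows = max(len(cats) for cats in category_columns)
n_cols = len(category_columns)
fig = plt.figure(figsize=(5 * n_cols, 3 * n_rows))

# One nested loop replaces the three near-identical loops:
# the outer loop walks the grid columns, the inner loop the rows.
for col_index, cats in enumerate(category_columns):
    for cat_index, cat in enumerate(cats):
        # Subplot positions are numbered row by row, starting at 1.
        position = cat_index * n_cols + col_index + 1
        ax = fig.add_subplot(n_rows, n_cols, position)
        ax.plot(
            women_degrees["Year"],
            women_degrees[cat],
            c=cb_dark_blue,
            linewidth=3
        )
        ax.plot(
            women_degrees["Year"],
            100 - women_degrees[cat],
            c=cb_orange,
            linewidth=3
        )
        ax.set_title(cat)
        ax.set_ylim(0, 100)
        # Remove all four spines in one place instead of once per loop.
        for spine in ax.spines.values():
            spine.set_visible(False)
```

Because the spine removal and styling now live in a single place, the repeated comments are no longer needed, and a call such as fig.savefig('gender_degrees.png') can follow the loop to save the result.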

Well, that’s all from my side, hopefully my suggestions were helpful. Congratulations on completing a nice and clean project, and also on your fantastic speed of learning! After one month of studying, you’re already on the 5th project; that’s really cool progress!!! :rocket:

2 Likes

Thanks a lot @Elena_Kosourova!! Especially for coming back :slight_smile:
This week I’m taking a break from the courses to review the guided projects I’ve already done, and I’ll implement it all (I loved all the suggestions!!)

Thanks again!

1 Like