Guided Project Feedback - Visualising Earnings Based on College Majors

  1. Hi all, here is another guided project I’ve completed. This time I tried to comment my code and apply the feedback I got on my last project. Please let me know what you think I can improve.
    What I sometimes find hard is relating the question to the chart I’m using. Sometimes I don’t understand what information I can extract from the chart in front of me. Any ideas on how I can improve this?

  2. https://app.dataquest.io/m/146/guided-project%3A-visualizing-earnings-based-on-college-majors/6/next-steps

Visualizing Earnings Based on College Majors.ipynb (529.3 KB)


Hi @drewc,

Thanks for sharing your project! To start with your question: I would recommend first defining the question you want to answer with the available data, and only then choosing a type of plot suited to that particular question. This choice is not always easy, I agree, and it depends both on the question and on the type of data available.
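To make the "question first, plot second" idea concrete, here is a minimal sketch. The values are made up for illustration, and the column names `ShareWomen` and `Median` are assumptions about the dataset, so adjust them to what your notebook actually contains. A question about how one numeric column is distributed suggests a histogram, while a question about how two numeric columns relate suggests a scatter plot.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up values for illustration only; column names are assumptions.
df = pd.DataFrame({
    "ShareWomen": [0.12, 0.35, 0.58, 0.77, 0.90],
    "Median": [75000, 60000, 45000, 38000, 33000],
})

fig, (ax_hist, ax_scatter) = plt.subplots(1, 2, figsize=(12, 4))

# Question 1: "How are median earnings distributed?"
# One numeric column, so a histogram fits.
df["Median"].plot(kind="hist", bins=5, ax=ax_hist,
                  title="Distribution of median earnings")

# Question 2: "Do majors with a higher share of women earn less?"
# Two numeric columns, so a scatter plot fits.
df.plot(kind="scatter", x="ShareWomen", y="Median", ax=ax_scatter,
        title="Share of women vs. median earnings")

fig.tight_layout()
```

Writing the question as a comment above each plot, as done here, also makes it easier to state in a markdown cell afterwards what the chart does or does not tell you.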

Now my comments about your project.

  • It’s best practice to re-run the whole notebook after finishing a project, so that all the code cells are numbered in order.
  • Project structure: add a project title of h1 size, a clear introduction and conclusion, and subheadings for the different sections. The conclusion should be more structured; using bullet points can be a good idea. It’s better not to use too much italic font, especially italic+bold.
  • In the introduction, apart from the dataset description and project goals, you should include a link to the dataset webpage and to the data dictionary, if available.
  • It’s better to combine consecutive code cells into one code cell, especially if they have no output (like [6] and [7] in your project).
  • Some code cells ([2], [3], [4] and some others) are over-commented. For example,
#using describe() to generate a summary of statistics
print(recent_grads.describe())

But we can already see from the code itself that we are using the describe() method here :slightly_smiling_face: Hence, for the comment to be useful and to avoid repeating ourselves, we can write just #generate a summary of statistics.

  • Plots. It’s better to remove the unnecessary ticks (the code cell [8] and similar), especially on the right and top sides. It’s also good to remove the top and right spines themselves. Increasing the figsize and adding titles and labels to all the plots would improve their readability. You’ll find all these things in the next course “Storytelling Through Data Visualization”, and then you can return to this project and improve the aesthetics of your plots.
  • In the code cells [64], [55], and [54], the legend doesn’t seem really informative and should be removed.
  • Overall, please add more storytelling to your project in markdown cells: observations (maybe even unexpected and contradictory ones), interesting insights, and all the useful pieces of information that the data reveals step by step throughout the project and that can help us at the end to answer the questions defined in the introduction.
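The plot suggestions above (larger figsize, title and axis labels, no top/right spines or ticks) could look roughly like this. It is only a sketch: the small DataFrame stands in for `recent_grads` with invented numbers, and the column names `Sample_size` and `Median` are assumptions, not taken from your notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for recent_grads; values and column names are assumptions.
recent_grads = pd.DataFrame({
    "Sample_size": [36, 7, 3, 16, 289],
    "Median": [110000, 75000, 73000, 70000, 65000],
})

# A larger figure is easier to read than the default size.
fig, ax = plt.subplots(figsize=(8, 5))
ax.scatter(recent_grads["Sample_size"], recent_grads["Median"])

# Title and axis labels tell the reader what the plot shows.
ax.set_title("Median earnings vs. sample size")
ax.set_xlabel("Sample size")
ax.set_ylabel("Median earnings (USD)")

# Hide the top and right spines and make sure no ticks are drawn there.
for side in ("top", "right"):
    ax.spines[side].set_visible(False)
ax.tick_params(top=False, right=False)
```

If a pandas plot draws a legend you don’t need, passing `legend=False` to `DataFrame.plot` should suppress it.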

Hope my suggestions were helpful :blush:
