Guided Project: Are SATs fair? Analyzing the NYC High Schools Dataset

Hi Everyone,
I have just finished my project on the NYC high school dataset in the Data Analyst in Python track on Dataquest. I am really looking forward to your comments and feedback on my project, especially on the areas I need to improve. Thanks in advance.
Best Regards,

Analyzing_NYC High School Data_Project_5.ipynb (412.5 KB)

Hello @saadk687! Thanks for sharing your project with the Community! You have a nicely organized project with short but explicit explanations of the steps you’ve taken. You also have nice plots and even explained what the SAT exam is, which is important for readers outside the USA. You also left code comments in most cells.

Here are some suggestions:

  • You can reduce the number of sections, because some of them contain just one code block that does only one thing, as in “Convert AP scores to numeric”
  • You have some minor typos
  • It’s not very clear what you do in the “Condense datasets” section. It seems that you select the data for only one time period but give no reason for doing so
  • In code cells [19], [22], and [26] you can split the code into logical pieces to improve the readability
  • The plots are good, but why not give the tick labels the full names of the measures? For example, what is aca_tot_11? You give the explanation after the plot, but readers often won’t bother reading it and will just look at the plot
  • You write: “Schools having a safety score between 7 and 9 have higher SAT scores, mostly above 1600.” Looking at the plot, I’d say it’s 50/50 in this case
  • It’s better to sort the borough safety scores in decreasing order; otherwise, the plot gives the false impression that the Bronx has the highest average safety score
  • In the conclusions, you have some leftover HTML code
  • You also have a multilevel list that is hard to read
  • So, is SAT fair?
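
For the two plot suggestions above (readable tick labels and sorted bars), here is a minimal sketch in plain Python. The label text in `FULL_NAMES` is my assumption about what the survey codes mean, so check it against the survey's data dictionary. The helper names are hypothetical, and the borough numbers are made-up placeholders, not values from your notebook.

```python
# Hypothetical mapping from survey column codes to readable labels.
# The label text is an assumption; verify it against the data dictionary.
FULL_NAMES = {
    "saf_tot_11": "Safety and respect (total)",
    "aca_tot_11": "Academic expectations (total)",
}


def readable_labels(columns, names=FULL_NAMES):
    """Return one display label per column, falling back to the raw code."""
    return [names.get(col, col) for col in columns]


def sort_descending(scores):
    """Sort a {category: value} mapping from highest to lowest value."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up placeholder numbers, only to show the ordering; not real results.
avg_safety = {"Borough A": 6.2, "Borough B": 7.1, "Borough C": 6.8}

labels = readable_labels(["aca_tot_11", "saf_tot_11", "sat_score"])
ordered = sort_descending(avg_safety)

print(labels)   # use these as tick labels instead of the raw codes
print(ordered)  # plot the bars in this order
```

If the data is already in pandas, the same two ideas are usually expressed as `df.rename(columns=FULL_NAMES)` before plotting and `series.sort_values(ascending=False)` before calling the bar plot.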

That’s it for me. Happy coding :grinning_face_with_smiling_eyes: