Kids in NYC Schools, Guided Project 7

Hello everyone. Here is my seventh guided project. My three main findings are:

  • Schools with low safety scores have low SAT scores.
  • The correlation between SAT scores and percent of Hispanic students is more nuanced than it might first appear.
  • A group of schools share SAT and AP exam values. The influence this has on the data set should be considered.

What I particularly liked about this project was melting data frames and making scatter plots in subplots.
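For readers who have not met melting before, here is a minimal sketch of the idea using `pandas.melt`. The data frame, its column names, and the scores are all made up for illustration; none of them come from the notebook.

```python
import pandas as pd

# Hypothetical wide table: one row per school, one column per group.
# Names and values are invented for this sketch only.
wide = pd.DataFrame({
    "school": ["A", "B"],
    "male_score": [1200, 1100],
    "female_score": [1250, 1080],
})

# Melt to long form: one row per (school, group) pair.
long = pd.melt(
    wide,
    id_vars="school",
    value_vars=["male_score", "female_score"],
    var_name="group",
    value_name="score",
)
print(long)
```

The long form is often easier to feed into plotting code, since the group label becomes an ordinary column that can be filtered or looped over, for example once per subplot.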

Please share any ideas you have! Thanks.

Kids in NYC Schools.ipynb (845.7 KB)



Great work. I like the way you carried out your analysis.


Thank you Naftali, nice of you to say so!

Hello @brucemcminn

My first impression of your work is that there does not appear to be a common thread between cells.

For example:

When encodig = windows appears, you didn't explain where it comes from, even though it is crucial for being able to continue.

It is not that the work is wrong. It seems to me that when we get down to work, usually for many hours, we lose sight of the fact that whoever reads it does not know what we are talking about. This happens to me too.

Another thing that catches my attention is that you use a scatter plot to explain the relationship between male and female. It's very nice, but it's a little bit messy. This could be solved with a bar chart, since on the x-axis you can put the attributes you need and the y-axis gives you the amount of each one.

The file below helped me a while ago to know when to choose one type of chart or another.

Choosing_a_good_chart_dissected.pdf (88.0 KB)

I hope I've helped you a little.



I'm not sure if you are mixing up my work with someone else's.

The text encodig = windows is not in my notebook on NYC schools.

Oooh, I think I found what you're talking about: encoding='windows-1252'. It's at the beginning, where the files are being opened. Check out this link; it is from a lesson earlier in the course where the NYC School data is being used.
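To show why that argument matters, here is a small sketch using only the standard library. The file contents are hypothetical, not the real NYC data. It writes a few bytes in windows-1252 and shows that reading them as UTF-8 fails while reading them with the right encoding works. `pandas.read_csv` accepts an `encoding` argument that serves the same purpose.

```python
import csv
import os
import tempfile

# Hypothetical sample row. "é" is the single byte 0xE9 in windows-1252,
# which is not valid UTF-8 when followed by a space.
raw = "school,borough\nCafé High,Manhattan\n".encode("windows-1252")

with tempfile.NamedTemporaryFile(delete=False, suffix=".csv") as tmp:
    tmp.write(raw)
    path = tmp.name

# Reading with the wrong encoding raises UnicodeDecodeError.
try:
    with open(path, encoding="utf-8", newline="") as f:
        f.read()
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Reading with the encoding the file was written in works.
with open(path, encoding="windows-1252", newline="") as f:
    rows = list(csv.DictReader(f))

os.remove(path)
print(utf8_ok, rows[0]["school"])
```

A one-line comment next to the `encoding` argument saying that the source files are not UTF-8 would probably be enough to answer the question for a reader.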

The original notebook in the guided project already had some code filled in. This is from the introduction page of the guided project: “In another lesson, we began performing some analysis. We’ll extend that analysis in this lesson. As you can see, we’ve included the code to read in all of the data, combine it, and create correlations in the notebook.”

So I guess you think it should be commented out and explained? Maybe.

I actually think the scatter plot provides a good picture of the relationship between male and female SAT scores because the viewer can see the clustering and the small number of data points outside of the cluster. I would like to see what you have in mind for the bar chart.

Thanks for your insight.
