
FULL Guided Project No. 07: Analyzing NYC High School Data

Hello :slight_smile:

Let’s start with what I mean by “FULL” in the title of this topic:
When I opened the first page of this project’s mission and read it through, I realized that the project would be useless to me if it didn’t include all the code. How could I attach it to my CV when the project starts almost at the end of all the code written before :man_facepalming:? No! It couldn’t stay like that, so I upgraded it and built this project from the starting point. Because the “starting point” was the whole learning segment, I was forced to convert it in such a way that I could paste it all… So I did… :sunglasses:

My project literally includes:

Because there are some issues with viewing this project in a new tab (like last time, some pictures don’t display), I suggest downloading the project and viewing it in Jupyter Notebook. Alternatively, download the HTML version of the file and open it; no Jupyter Notebook is needed.


7. Data Cleaning Project Walkthrough and Analyzing NYC High School Data.ipynb (858.9 KB)

Click here to view the jupyter notebook file in a new tab

HTML version: 7. Data Cleaning Project Walkthrough and Analyzing NYC High School Data - Jupyter Notebook.htm - Google Drive


Hi @drill_n_bass,

Thank you for sharing your project with us and welcome back to the Community! :star_struck: Wow, your project looks super-detailed, and you conducted a very thorough analysis!! I can imagine that the whole work took a lot of time. And you were absolutely right to include the previous analysis and cleaning sections in it, otherwise it would lack context.

Some suggestions from my side:

  • It’s important to re-run the whole project once it’s finished, so that all the code cells are in order and numbered starting from 1.
  • You should add a conclusion summarizing the main insights of your work.
  • Project structure. It’s better to put all the technical information in code comments, keeping it as concise as possible. In markdown, we should share only our observations and next steps (logical ones, not technical). For the same reason, it’s a good idea to avoid overly technical chapter names (Seg 2, chapter 11). Also for the same reason, I would remove the descriptions of pandas methods and the links to their documentation (your readers will google them themselves if they are curious, so don’t make their lives too easy :wink:). As for subheadings, I noticed that some of them are repeated up to 3 times even inside one segment (for example, Filling in Missing Values). Also, you practically have 3 different introductions, one for each segment. I’d suggest fixing these things.
  • Technical issues. Some images and links (like the one for Flatbush) are not displayed, or are displayed incorrectly. Also, there are some issues with displaying the tables in Segment 2, chapters 2, 6, and 8. And yes, I was checking your project from my Anaconda, because there is really something wrong with the tab, even after my adjustments.
  • Plots. You might consider despining them, removing the ticks, and adding readable (i.e., descriptive, concise, and big enough) titles and axis labels.
  • About the numbers before the subheadings. A good practice is to avoid them altogether. However, since you have 3 different segments, you can consider two-level numbering. For example, Segment 3, chapter 2 would be numbered 3.2. Otherwise, readers can get confused.
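To illustrate the plot suggestion above, here is a minimal sketch of despining, removing tick marks, and adding readable labels with matplotlib. The helper name `tidy_bar_plot` is hypothetical, and the labels and values are placeholders for illustration only, not results from the project.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt


def tidy_bar_plot(labels, values, title, xlabel, ylabel):
    """Draw a bar plot with no spines, no tick marks, and readable labels.

    Hypothetical helper, not part of the original project.
    """
    fig, ax = plt.subplots(figsize=(8, 5))
    ax.bar(labels, values)

    # Despine: hide all four borders around the plotting area.
    for spine in ax.spines.values():
        spine.set_visible(False)

    # De-tick: keep the tick labels but set the tick mark length to zero.
    ax.tick_params(axis="both", length=0, labelsize=12)

    # Descriptive, concise, and big enough title and axis labels.
    ax.set_title(title, fontsize=16)
    ax.set_xlabel(xlabel, fontsize=13)
    ax.set_ylabel(ylabel, fontsize=13)
    return fig, ax


# Placeholder values for illustration only, not results from the project.
fig, ax = tidy_bar_plot(
    ["A", "B", "C"],
    [0.2, -0.1, 0.4],
    title="Correlation with SAT score (placeholder data)",
    xlabel="Survey field",
    ylabel="Correlation coefficient",
)
```

In a notebook you would drop the `matplotlib.use("Agg")` line and let the figure render inline.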

I hope my suggestions were helpful. Good luck with your future projects and happy learning!


Hello Elena. As always I can count on you. Thank you for your reply and help! :smiley:

Done :sunglasses:

Done :sunglasses:

I was probably sleepwalking when I wrote this. It’s fixed now :crazy_face:

It was an improperly made link, not a picture. It’s funny what kind of mistakes one can make when spaced out at 5 a.m. :joy:

Life is hard enough, no need to make it harder for anyone! :wink: :joy: I understand your point, but there are two reasons I want to leave it as it is. The first is that I can easily find things from my past work together with full documentation. I’ve done it many times, and it’s great when I’m working on a new project, need something, and remember where I used it. Then I just open the old project and I have the full knowledge base. Full, no need to waste my time, without searching the web once again and without reverse engineering my own old code. It’s good to know what professionals do and why. But on the other hand, I need some foundations for this professionalism. I’m just a beginner, I have a beginner’s mind. :yin_yang:

Secondly, I think that this project is very complex and it’s easier to understand it better with a wider description.

The fixed version of my project is in the first post (I’ve updated all the files). From now on, you can view my project as an HTML file, without needing Jupyter Notebook. Just download the HTML file and open it.


Hi @drill_n_bass,

That’s cool, very glad to know that my review was helpful! :star_struck:

As for including explanatory sections from the documentation, you might consider including just the links to the corresponding parts of the documentation. That will save vertical space in your project (which becomes even more important for projects that are already long and detailed). If your readers aren’t familiar with a function (approach, method, etc.), they’ll use the links to learn the technical details. If they are already familiar with it, they’ll skip this step. Remember that your projects can have a lot of readers, not only you yourself :yum: And those people can have different backgrounds and skills, including a high level of experience, so you don’t need to give them too many explanations (and increase the project’s length). Side references (i.e., links) are enough if you want to make their lives easier anyway :grinning:

Also, a good practice, especially at the beginner’s level (even though, judging by your projects, you are not such a beginner anymore :innocent:), is to add appropriate code comments and function documentation. I have read in some online resources that not everybody agrees with this. Some people think that code should simply be written clearly enough, with descriptively named variables, that you won’t need any comments or function descriptions. Well, I agree with that only partially. Code comments, if written correctly and appropriately, can save readers time by describing particularly complicated parts of the code. A function description is also necessary when the function is meant to be reused in further projects. If a function has a description, you can run the help command to access it and know immediately how to use it and what for.
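As a small sketch of that last point: a function with a docstring can be inspected with Python’s built-in `help()`. The function `percent_to_fraction` below is a hypothetical example, not something taken from your project.

```python
def percent_to_fraction(value):
    """Convert a percentage string such as '45%' to a float fraction.

    Hypothetical example function, used only to illustrate docstrings.

    Parameters
    ----------
    value : str
        A percentage, with an optional trailing '%' sign.

    Returns
    -------
    float
        The numeric value divided by 100.
    """
    # Strip surrounding whitespace and the trailing '%' before converting.
    return float(value.strip().rstrip("%")) / 100


# help() prints the signature together with the docstring above.
help(percent_to_fraction)

# The same description is also available as a plain string.
print(percent_to_fraction.__doc__)
```

In a Jupyter notebook, typing `percent_to_fraction?` in a cell shows the same description.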

You can share your Jupyter notebook here anyway. I’ll render it into a readable link if it’s not rendered automatically. Then future reviewers can open it easily, because I suspect that otherwise they’ll avoid the hassle of rendering an HTML file.

Anyway, great job applying my suggestions :star2: Good luck with your future projects!


I know that, but this readable link doesn’t display some pictures (the HTML version displays all).

Yes, I am. I will send you a book about that - very soon :grin: :wink: