I’m really excited you’re reviewing my Guided Project. Take a look at the Introduction and Conclusion, and let me know if the message is clearly conveyed. In addition, feel free to provide critical feedback about the code and explanations in between. I’m really curious about the flow of the project and whether it helps both newcomers and veterans.
Exploring Hacker News Posts.ipynb (18.2 KB)
This is my first iteration. Feel free to ask additional questions about the data, and I will gladly take them into consideration for my next iteration.
@jeprim one word that has to be highlighted regarding your project is neat. Everything seems like it's in order, like walking into a tidy place.
A few pointers to help you make it better (hopefully):
- I haven’t mentioned this in earlier reviews, so I thought I should start with yours. If you’ve read the introduction section, you will have noticed that DQ is not using the entire dataset. Instead, they are using a sample of about 20,000 entries. It would be good to mention this in case the reader wants to validate your findings.
…but note that we have reduced from almost 300,000 rows to approximately 20,000 rows by removing all submissions that didn’t receive any comments and then randomly sampling from the remaining submissions.
- Since this project mostly consists of terminal outputs and you have yet to get into visualizations, you could format your output with color or bold text, e.g., the output for cell  could look like:
Average show comments: 10.31669535283993
Check this for how to do it. This could help to differentiate your code from your output.
- Your commenting style is nice. It's simple and precise. A personal preference is capitalizing the first letter, as you would when you begin a sentence, and using the simple present tense of the verb instead of the present continuous, e.g., in cell  my version of the comment would be #Calculate total_show_comments. Note: this is a personal preference.
- I feel it's good practice to round your outputs instead of printing the unrounded values, as in cell . A simple numpy.round() should help in this regard.
- I could not find any issues, but I haven’t gone too deep into your code.
- Once you have gotten the hang of visualization, I recommend that you redo this project and add a couple of visualizations.
- I like how you put in the time to separate your title from your output using a newline, as in cell . If you are willing to go further, check out this post. It's got a list of libraries that can help to present your output in tabular form. I loved it the moment I started using it.
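
To make the rounding and bold-output pointers above concrete, here is a rough sketch. It assumes the notebook prints plain strings and that the output area renders ANSI escape codes (Jupyter and most terminals do). The names `avg_show_comments` and `format_result` are made up for illustration, and it uses Python's built-in round() rather than numpy.round() so it runs without extra imports.

```python
# The value below is the unrounded average quoted earlier in this review;
# the variable name is hypothetical, not taken from the notebook.
avg_show_comments = 10.31669535283993

# ANSI escape codes: switch bold on, then reset to normal text.
BOLD = "\033[1m"
RESET = "\033[0m"


def format_result(label, value, digits=2):
    """Return 'label: value' with the value rounded and shown in bold."""
    return f"{label}: {BOLD}{round(value, digits)}{RESET}"


print(format_result("Average show comments", avg_show_comments))
```

The label stays in normal weight and only the rounded number is bold, which keeps the result easy to spot among the code cells.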
Hope that was helpful. Keep up this momentum and you should be at the top pretty soon.