Popular Data Science Questions (m469)

Hi, colleagues!
I’m uploading my ninth project.
I look forward to your feedback.
Best regards,
Vadim Maklakov

Popular_Data_Science_Questions_m469.ipynb (425.1 KB)


Hi Vadim,

Thanks for sharing another nice project with the Community! Great project structure, neat visualizations, and well-commented, well-organized code (including correct and thorough function comments), and all the necessary links are available. It’s a good approach to import all the necessary libraries at the beginning and to re-run the completed project. And of course, it’s great that you extracted fresh data for 2020 instead of using the proposed dataset for 2019. Great job!

Now some suggestions:

  • Consider rephrasing the sentence “Pretend that I working in the company” in the introduction.
  • When mentioning column names and pieces of code in markdown, it’s recommended to wrap them in backticks so that they stand out.
  • Consider using the same color for all bars in a bar plot if applying different colors doesn’t highlight anything in particular (the code cells [8] and [12]).
  • The code cells [9], [11], and [13]: better to add more whitespace between different pieces of code (namely, before each code comment).
  • The code cell [11]: there is a warning to fix, or at least to silence.
  • Visualizations in general: don’t forget to add a title to each, and consider increasing the font size of the axis labels.
  • The code cell [15]: consider using a scatter plot here as well, like you did in [10], which was a great idea there, by the way!
  • The code cell [13]: the output is a bit hard to digest, so there is probably no need to display it.
  • The last section of your project, “Definition growth question Machine-Learning by year and quarter from start for Q12021 and general stats for tags”: it’s better to give it a shorter name. Also, it could be a good idea to add some more intermediate markdown explanations in this section, since it’s rather saturated with information and graphs. It will then be easier for a reader to follow the whole workflow.
  • The code cell [16]. Just a curiosity: why do we have those peaks there? I cannot figure it out myself.
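
To make the points about a single bar color, plot titles, and larger axis labels more concrete, here is a minimal sketch with matplotlib. The function name `plot_tag_counts`, the tag names, and the counts are all made up for illustration and are not taken from the notebook, so adapt them to the actual data.

```python
import matplotlib.pyplot as plt


def plot_tag_counts(tags, counts, title):
    """Draw a horizontal bar chart with one color, a title, and larger labels."""
    fig, ax = plt.subplots(figsize=(8, 4))
    # One color for every bar, since the colors carry no extra information.
    ax.barh(tags, counts, color="steelblue")
    ax.set_title(title, fontsize=16)
    ax.set_xlabel("Number of questions", fontsize=13)
    ax.set_ylabel("Tag", fontsize=13)
    ax.tick_params(axis="both", labelsize=11)
    fig.tight_layout()
    return ax


# Made-up numbers, for illustration only.
ax = plot_tag_counts(
    ["machine-learning", "python", "deep-learning"],
    [120, 95, 60],
    "Most used tags (illustrative data)",
)
```

Wrapping the styling in one helper keeps every chart in the project consistent, which may be easier than repeating the same settings in each cell.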

Hope my comments were useful. Good luck with your future projects!

Hi all!
I have updated my previous notebook and uploaded it.
I couldn’t remove the pandas warning because it is a FutureWarning: I tried changing the list brackets to tuples, but every attempt ended with an error being raised.
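
For anyone who only wants to hide such a warning, below is a minimal sketch using the standard library `warnings` module. The function `noisy_step` is a hypothetical stand-in for the pandas call in the notebook, which is not reproduced here, so this shows one possible way to wire in the suppression rather than the notebook's actual code.

```python
import warnings


def noisy_step():
    """Hypothetical stand-in for the pandas call that emits the warning."""
    warnings.warn("this behaviour will change in a future version", FutureWarning)
    return 42


# Suppress FutureWarning only inside this block; other warning types
# and the rest of the notebook are unaffected.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=FutureWarning)
    result = noisy_step()
```

Limiting the filter to a `with` block avoids hiding future warnings from other cells, which a notebook-wide filter would do.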
I added a stacked bar chart with the correct summary values and rewrote the plot of the final values for ml_tags.
To run the code locally, two CSV files must be downloaded into the same local directory as the notebook, but I can’t upload them because they are over 4 MB in size.

Popular_Data_Science_Questions_m469_final.ipynb (615.2 KB)
Best regards,
Vadim Maklakov
