Here’s another one - this one is all about the style! Nice plots, charts, titles etc.
Even comes with a Death Star… I remembered this dataset as a small one and I couldn’t really go big on analysis and digging out various facts like I did with my eBay project, so I worked on the style…
What I’ve learned redoing this proj (or want to learn):
- current version of Jupyter does not recognise ‘span class’ in markdown - had to
use plt.text for all the fancy titles
- if you’re plotting 3 or 4 similar plots - use a loop, reduces the code and
automates the process
- surely there’s a better way of scraping just the movie dates from Wikipedia - those 6 dates almost burned my router! Anyway, I’ll add that to the list of things to do…
- you need a lot of code to customise a plot - here’s a question: surely
there’s a way to define a class (or something) that sets certain plot
attributes across the whole notebook? I had to set the background color,
title font, label font etc. etc. for every figure and axes… wondering how to do it once
for the whole notebook…
Have a look and feel free to go wild with the critique:
yoda.ipynb (976.4 KB)
project on github
pretty sure nbviewer doesn’t update (even with ‘flush_cache=true’ in URL):
Click here to view the jupyter notebook file in a new tab
Thanks for sharing your project with the Community, you’ve done a super-great job! The visualizations are just stunning aesthetically and at the same time highly informative, the navigation throughout the project is great (including back to top), and the whole work is very well-organized with perfectly commented code. Also, you used plenty of interesting libraries, dug deep into the data, and outlined the lessons learned in your post. Amazing job!
About your question: I’d suggest creating functions for these purposes. At the moment, you don’t use any in this project.
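To make the idea a bit more concrete, here is a minimal sketch, assuming matplotlib. The helper name `style_axes` and all colors, sizes and numbers are made up for illustration and are not taken from the notebook. Settings that should hold for the whole notebook can go into `plt.rcParams` once, and whatever changes per plot can live in a small function:

```python
import matplotlib.pyplot as plt

# Notebook-wide defaults: set once, applied to every figure created afterwards.
# The values are placeholders, not the ones used in the project.
plt.rcParams.update({
    "figure.facecolor": "black",
    "axes.facecolor": "black",
    "axes.labelcolor": "white",
    "axes.titlesize": 16,
    "axes.labelsize": 12,
    "xtick.color": "white",
    "ytick.color": "white",
    "text.color": "white",
})


def style_axes(ax, title, xlabel="", ylabel=""):
    """Hypothetical helper: apply the per-plot settings in one call."""
    ax.set_title(title, loc="left", fontweight="bold")
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    # Hide the top and right borders for a cleaner look.
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)
    return ax


fig, ax = plt.subplots()
ax.bar(["Episode I", "Episode II"], [3.7, 4.1])  # made-up numbers
style_axes(ax, "Average ranking", ylabel="Ranking")
```

With something like this at the top of the notebook, each plotting cell only needs the data call plus one `style_axes(...)` line.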
Some minor suggestions from my side:
- It’s better to put the title in the very beginning of the project.
- A good practice is to combine adjacent code cells without any output or markdown observations between them into one cell (e.g., -, -, -, -). Just add empty lines or code comments between different blocks of code.
- The code cell : consider adding an empty line before the code comment # DEATH STAR MOD:
- Use the word “observations” instead of “conclusions” throughout the project, and the word “conclusions” - only for the final ones.
Hope my ideas were helpful. Very well done, congratulations!
Thanks @Elena_Kosourova ! OMW with touch ups…
I got the ‘back to top’ idea from one of last week’s champions, also a Star Wars proj
- combining adjacent code: agreed in the majority of cases BUT… is it a must? What’s the reasoning behind it?
Maybe I forgot some important lesson, but in a very few selected cases (example:   ) I’d use separate cells as an additional formatting tool. Once again, thank you for the feedback.
That’s great, Adam, I’m glad to hear that my suggestions were useful! And you’re doing very well reviewing other people’s projects and learning a lot of things from there, it’s my approach as well! Indeed, your own projects are really cool, so the DQ learners will definitely find a lot of useful ideas in them.
About adjacent code cells, I read about it in the DQ guide on how to style a guided project. It said that the best practice is to alternate one markdown cell and one code cell throughout the project, except for the cases when a code cell has an output. Let’s say the reader always expects something right after a code cell. As a formatting tool inside a big code cell, you should use code comments and, of course, an empty line separating the code blocks of different functionality. But, of course, it’s just a style guideline, only about decorations, because your project already looks really super-great and is interesting to read!
Hi @adam.kubalica! I will add a few words about visualizations. They are good but don’t forget to add Y labels because it’s not immediately clear whether the numbers there are ratings or numbers of views. They may seem obvious (especially in ratings) but I still believe that it’s a good idea to add the labels because they add much more information than ink.
Also, I think that the plot of “New and old trilogy rating in different fan groups” is a bit overwhelming. I have to constantly go from the plot to the legend and hope not to mistake the color associated with its group. It’s better to split the plot into a few mini-plots that each concentrate on one category (male vs. female, fans vs. not fans, etc).
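A rough sketch of what I mean by mini-plots, assuming matplotlib. The numbers and group names below are invented just to show the layout and are not the survey results:

```python
import matplotlib.pyplot as plt

# Made-up ratings, only to show the layout; not the survey results.
groups = {
    "Male vs. female": {"Male": [3.2, 4.1], "Female": [3.5, 3.9]},
    "Fans vs. not fans": {"Fans": [3.0, 4.4], "Not fans": [3.6, 3.7]},
}
trilogies = ["Old trilogy", "New trilogy"]

# One panel per category; sharey keeps the bars comparable across panels.
fig, axes = plt.subplots(1, len(groups), figsize=(10, 4), sharey=True)

for ax, (category, series) in zip(axes, groups.items()):
    width = 0.8 / len(series)
    for i, (label, values) in enumerate(series.items()):
        positions = [x + i * width for x in range(len(trilogies))]
        ax.bar(positions, values, width=width, label=label)
    # Centre the tick labels under each group of bars.
    centre_offset = width * (len(series) - 1) / 2
    ax.set_xticks([x + centre_offset for x in range(len(trilogies))])
    ax.set_xticklabels(trilogies)
    ax.set_title(category)
    ax.legend()

axes[0].set_ylabel("Average rating")
fig.tight_layout()
```

Each panel then has only two colors and its own short legend, so there is much less jumping between the bars and the legend.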
Happy coding and congratulations on your project. I suppose this can compete for the Community Champion award. What do you think @Elena_Kosourova and @nityesh?
@artur.sannikov96 thanks for the input. Initially I had the ‘New and old trilogy’ plot split into a few groups like you’ve mentioned - it looked VERY similar to the previous plot (movie ratings in different groups) and that didn’t look nice (2 almost identical figures). Since this proj was more about style, I decided to show it in 1 plot, which visually looks better, but you’re right, the readability of that one isn’t great. I’m gonna have to find a way to make it clearer (grid) or replace it with something different.
Absolutely agree with you, Artur, this is a perfect candidate for the Community Champion program!
And yes, I agree with your comments as well. Grouped bar plots typically suffer when they have a lot of categories, and stacked bar plots even more so. The latter usually have some other issues too, so I always try to avoid them altogether.
This may be a touch better (very initial phase, I’ll sort out the colors etc):
Yep, it’s much better! You can also experiment with splitting this plot into mini-plots as I said before and see how it looks.
So once again thanks for the feedback, job done, badge won, etc. I’ve updated the proj with a few functions and the old/ new trilogy plot, for now I’m done, time to learn something new, but I’ll probably return to this project sometime in the future:
list of things to update:
- use the BeautifulSoup library for more efficient scraping of Wikipedia movie dates, also scrape movie budget, length and other easily ‘scrapeable’ params and analyse ratings based on them (that’s going to add some analysis content, and it’s going to be a good base model for redoing the fandango project: analysing the ratings shift based on movie budgets, or their marketing budgets, or shift of ranking per studio, producer etc.)
- write more functions for reducing the code of plots (atm, I’ve updated the proj with 2 functions: 1 for episode titles, 1 for bar plots), which reduced quite a few lines of code. BTW thanks @Elena_Kosourova for the idea, that did reduce the ax.set_something_clutter
- github branch/ merge etc. (atm I’m only using Github as a cloud storage solution for my notebooks and it can do sooo much more)
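For the scraping item, something along these lines is probably where I’d start. It’s only a sketch: it assumes the `bs4` package is installed, the helper name `infobox_to_dict` is made up, and the HTML is a tiny hand-written stand-in with placeholder values, so the real Wikipedia markup may well need different selectors:

```python
from bs4 import BeautifulSoup

# Hand-written stand-in for a downloaded page; values are placeholders
# and real Wikipedia markup may be structured differently.
html = """
<table class="infobox">
  <tr><th>Release date</th><td>January 1, 2000</td></tr>
  <tr><th>Running time</th><td>100 minutes</td></tr>
  <tr><th>Budget</th><td>$1 million</td></tr>
</table>
"""


def infobox_to_dict(page_html):
    """Hypothetical helper: map each infobox header to its cell text."""
    soup = BeautifulSoup(page_html, "html.parser")
    table = soup.find("table", class_="infobox")
    facts = {}
    if table is None:
        return facts
    for row in table.find_all("tr"):
        header, cell = row.find("th"), row.find("td")
        # Skip rows that do not have both a header and a value cell.
        if header is not None and cell is not None:
            facts[header.get_text(strip=True)] = cell.get_text(" ", strip=True)
    return facts


facts = infobox_to_dict(html)
print(facts["Release date"])
```

Getting the date, length and budget from one dictionary per movie should make it easy to build a small table and join it to the ratings.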
Great job adjusting all those things, Adam, and also very cool that you’re planning the steps forward and potential improvements! This is the best way to learn and practice coding skills.
By the way, congratulations on becoming the Community Champion! Waiting for your new cool projects published here.
Thank you so much for the insight!