Star Wars Project: Expanded Universe fan study

Just wanted to share my project… Any and all feedback is welcome!

Star Wars Study.ipynb (181.2 KB)


Hi Fried,

Thanks for submitting your project!

Some quick feedback:

  • Give your plots titles instead of relying on the legend to say what they show.
  • Shorten overly long variable names where possible.
  • Use formatting to make your project more skimmable: separate sections for each part of the analysis, and spacing between your numbered conclusions.
  • Some frequency charts would add useful detail. Mean charts are good, but on their own they can hide context such as how the responses are spread.
  • There are some weak to borderline-medium correlations for things like age, sex, familiarity, and movies seen that you could hypothesize about (though you are mostly correct that these aren’t conclusive).
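To illustrate the points about plot titles and frequency charts, here is a minimal sketch. The film names and rankings are made up for the example and are not taken from the notebook; the two films are chosen to have the same mean ranking but very different response distributions.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line inside a notebook
import matplotlib.pyplot as plt
from collections import Counter
from statistics import mean

# Hypothetical rankings (1 = best, 5 = worst), invented for illustration.
rankings = {
    "Film A": [3, 3, 3, 3, 3, 3],  # everyone is lukewarm
    "Film B": [1, 1, 1, 5, 5, 5],  # respondents are split
}
means = {film: mean(values) for film, values in rankings.items()}

fig, (ax_mean, ax_freq) = plt.subplots(1, 2, figsize=(9, 3.5))

# Left: the mean chart, titled on the axes rather than in the legend.
ax_mean.bar(list(means.keys()), list(means.values()))
ax_mean.set_title("Mean ranking per film")
ax_mean.set_ylabel("Mean ranking")

# Right: the frequency chart, which shows what the means hide.
width = 0.4
offsets = (-width / 2, width / 2)
for offset, (film, values) in zip(offsets, rankings.items()):
    counts = Counter(values)
    positions = sorted(counts)
    ax_freq.bar(
        [p + offset for p in positions],
        [counts[p] for p in positions],
        width=width,
        label=film,
    )
ax_freq.set_title("Ranking frequencies per film")
ax_freq.set_xlabel("Ranking given")
ax_freq.set_ylabel("Number of respondents")
ax_freq.legend()

fig.tight_layout()
```

Both bars in the left panel have the same height, while the right panel shows that one film got uniform middling rankings and the other split the audience.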

Lots of positives too, but for the sake of time I’ll just say great job :).


Kevin,
Thanks for the feedback. I appreciate not just being rubber-stamped!
I am uploading a new version with some corrections incorporated.
With regard to the correlations, I would just note that I don’t think any of them even reached .20. I believe that, generally, anything below .2 to .3 is not considered even a meaningful weak correlation.
Star Wars Study (1).ipynb (218.7 KB)


Hi Fried,

Correlations are tricky, as acceptable thresholds can vary by industry and by the data being compared. What I’m recommending is that you use the weak correlations as a source of questions for further exploration of the data. Drawing definitive conclusions from correlations alone is generally not recommended without controlled experiments, since correlation != causation. Here’s a thread that cites general correlation thresholds (I’m sure you can find ones that vary slightly from this): https://www.researchgate.net/post/What_is_the_minimum_value_of_correlation_coefficient_to_prove_the_existence_of_the_accepted_relationship_between_scores_of_two_of_more_tests
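As a rough sketch of that idea, the snippet below computes Pearson's r for each pair of columns and labels its strength, so weak pairs can be listed as follow-up questions rather than conclusions. The survey values are invented, and the 0.2 and 0.4 cutoffs are illustrative assumptions only; as noted, conventions differ between fields.

```python
from itertools import combinations
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def strength(r, weak=0.2, moderate=0.4):
    """Label |r| using cutoffs that are assumptions, not a standard."""
    size = abs(r)
    if size < weak:
        return "negligible"
    if size < moderate:
        return "weak"
    return "moderate or stronger"


# Hypothetical survey columns, invented for illustration.
survey = {
    "age": [22, 35, 41, 19, 58, 27, 33, 46],
    "movies_seen": [6, 4, 5, 6, 2, 5, 3, 4],
    "familiarity": [5, 3, 4, 4, 2, 5, 2, 3],
}

# One (column, column, r, label) row per pair of columns.
report = [
    (a, b, round(pearson_r(survey[a], survey[b]), 2),
     strength(pearson_r(survey[a], survey[b])))
    for a, b in combinations(survey, 2)
]

for a, b, r, label in report:
    print(f"{a} vs {b}: r = {r:+.2f} ({label})")
```

Changing the `weak` and `moderate` arguments is an easy way to see how much the labels depend on which convention you follow.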

To make correlation matrices easier to skim, try heatmaps; seaborn has an easy-to-implement solution. See “How to Create a Seaborn Correlation Heatmap in Python?” by Bibor Szabo on Medium.
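A minimal sketch of the heatmap suggestion, assuming the survey answers are already numeric columns in a pandas DataFrame. The column names and the random data here are placeholders, not values from the notebook.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the survey data (placeholder column names).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "age": rng.integers(18, 70, size=200),
    "movies_seen": rng.integers(0, 7, size=200),
    "familiarity": rng.integers(1, 6, size=200),
})

corr = df.corr()  # pairwise Pearson correlations by default

fig, ax = plt.subplots(figsize=(5, 4))
sns.heatmap(
    corr,
    vmin=-1, vmax=1, center=0,  # fix the colour scale to the full r range
    annot=True, fmt=".2f",      # print each coefficient in its cell
    cmap="coolwarm",
    ax=ax,
)
ax.set_title("Correlation matrix (synthetic data)")
fig.tight_layout()
```

Pinning `vmin` and `vmax` to -1 and 1 keeps weak correlations looking weak; letting the colour scale fit the data would make a 0.15 look as vivid as a 0.9.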
