Winning Jeopardy - Project review on full dataset

Hi Dataquest community,
I finished the Jeopardy project a couple of days ago, then iterated on it a bit after seeing some techniques applied by Raj Talluri (shown here).

The biggest difference is that I worked with the full 200k-row Jeopardy dataset, and the results are surprisingly different!
That is why I would appreciate a review, just to make sure I have not screwed anything up or misinterpreted how the project applies the chi-squared test.

Project here --> https://github.com/nlong-ds/dataquest/blob/master/data-analyst-path/winning_jeopardy/winning_jeopardy.ipynb

Thanks to anyone who will take a look!
Cheers
N

hey @nlong

I am not at this mission yet, so I didn’t read your code in detail, just glanced through it. (I have written this almost 4 times today :( I am lagging too much in my studies :frowning_face:)

A few things I would like to share with you at the outset:

  • You mention and highlight a peer’s project that you checked out, learned from, and extended with ideas and work of your own. :+1:
  • One of the code cells references a Stack Overflow post, which points us to the original SO post and perhaps to more new ideas :slight_smile:
  • You link to external documentation/information to back up your findings :ok_hand:

This is just a guess, but are the conclusions based on the wordcloud stated before the wordcloud itself is displayed?

And do you have any personal tiff with plt.show()? Just asking…

I have bookmarked this project to refer to when I complete it myself.
Thanks for sharing!
