Hello Dataquest community!
I’m sharing my solution to the Winning Jeopardy project, and I’d like some feedback on it! Anything goes: code, visualization, narration, mistakes…
I took the extra step of making the code faster by:
- Computing the high/low value occurrences of all words in a single pass over the dataset, instead of looping over the whole dataset once for each word, as the guide suggests.
- Using a dictionary to store p-values and avoid recomputing them (i.e. memoization). I’m not sure this second step is very useful, but it was fun implementing it!
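
To give an idea of the single-pass counting from the first bullet, here is a minimal sketch rather than the exact notebook code. The function name `count_high_low`, the `(words, value)` row layout, and the 800 cutoff for "high value" are all assumptions made for illustration.

```python
from collections import defaultdict

def count_high_low(rows, threshold=800):
    """Count, in one pass, how many high/low value questions contain each word.

    rows: iterable of (words, value) pairs (hypothetical layout),
          where words is a list of cleaned question words.
    threshold: assumed cutoff; values above it count as "high".
    """
    counts = defaultdict(lambda: [0, 0])  # word -> [high_count, low_count]
    for words, value in rows:
        idx = 0 if value > threshold else 1
        # set() so that a word repeated in one question is counted once
        for word in set(words):
            counts[word][idx] += 1
    return dict(counts)

# Tiny made-up example, not real Jeopardy data
sample_rows = [
    (["the", "river", "nile"], 1000),
    (["the", "capital"], 200),
    (["river", "river"], 400),
]
sample_counts = count_high_low(sample_rows)
```

The dataset is read once, and every word's two counters are updated along the way, so the cost no longer grows with the number of words being tested.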
With these two changes, all the needed p-values can be computed in 3 seconds. Some of the words do have very low p-values.