Guided Project on Jeopardy

Good day :)

I am having some conceptual issues regarding slide 5 of this guided project:

https://app.dataquest.io/m/210/guided-project%3A-winning-jeopardy/5/recycled-questions

I believe the purpose of the exercise is to collect all re-used words in a set, and determine how many times words are repeated (to ultimately determine how many times similar questions are asked).

I seem to be able to grasp everything up until the point where I am asked to do the following:

"If the length of split_question is greater than 0, divide match_count by the length of split_question."

Conceptually, I am struggling to understand the purpose of this code. I see that we are determining whether each word in a question is in our set; if it is, we add 1 to match_count. If not, the word gets added to the set and will be matched (with the corresponding increase in match_count) if it appears in a later question, and so on.

I am struggling to wrap my head around the code that follows from the above instruction:

if len(split_question) > 0:
    match_count /= len(split_question)

If someone could please explain this to me conceptually: what is the purpose of doing this, and what does it achieve?

I would be forever grateful for any light that is shed on this predicament I find myself in!

Thanks!

Regards,
John

For now, ignore the following instruction -

Remove any words in split_question that are less than 6 characters long.

Let’s say you have two questions -

  1. What is your name?
  2. What is your father’s name?

The words in question 1 would be added to terms_used.

For question 2, match_count would be 4 (ignoring the question mark and looking only at the words), because four of its five words already appear in question 1.

What happens when we divide match_count by the length of split_question for question 2?

That’s 4/5, which is 0.8.

Or, we can say that 80% of the words in the 2nd question overlap with words that were present in a previous question.
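The two-question example above can be sketched in Python roughly as follows. This is only a minimal illustration, not the guided project's solution: the helper name overlap_ratio and the simple lowercase/strip-question-mark cleanup are assumptions made here, and the 6-character filter is left out as suggested earlier. The names split_question, match_count, and terms_used follow the ones used in this thread.

```python
def overlap_ratio(question, terms_used):
    """Return the share of words in `question` that were already seen,
    then record this question's words in `terms_used`.

    Hypothetical helper for illustration only.
    """
    # Simplified cleanup (an assumption): lowercase and drop the question mark.
    split_question = question.lower().replace("?", "").split()

    # Count the words that appeared in an earlier question.
    match_count = 0
    for word in split_question:
        if word in terms_used:
            match_count += 1

    # Remember this question's words for the questions that follow.
    for word in split_question:
        terms_used.add(word)

    # Turn the raw count into a proportion; skip the division if no words are left.
    if len(split_question) > 0:
        match_count /= len(split_question)
    return match_count


terms_used = set()
first = overlap_ratio("What is your name?", terms_used)
second = overlap_ratio("What is your father's name?", terms_used)
print(first, second)  # 0.0 0.8
```

Dividing by the length makes questions of different lengths comparable: 4 matching words out of 5 means something different from 4 out of 40. The len(split_question) > 0 check matters once short words are filtered out, since a question could end up with no words left, and dividing by zero would raise an error.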

For a quiz, this information can be helpful to figure out what kind of questions are frequently asked, for example. Or rather, what kind of topics or terms are used more frequently - that is, how often those terms get “recycled”.

Using this and any additional information (like the value of the question), you could focus your studying on high-value questions whose terms occur frequently, and thereby potentially win more. It’s not that simple, of course, but it can give some idea.


Awesome! Thanks for the response. Much appreciated!
