Regex to extract Python versions

Screen Link: https://app.dataquest.io/m/369/advanced-regular-expressions/3/using-capture-groups-to-extract-data

Your Code: [Pp]ython [\d\.]+

Please explain how this regex works in extracting Python versions from the titles. I am not sure why the ‘.’ character is escaped.

Hey, Aditya. I need some clarification.

I see two implied questions in your request:

  1. How does this all work?
  2. Why is . escaped?

I don’t understand if you’re asking question 1 because of question 2 (and if answering question 2 would solve both 1 and 2), or if they are independent questions. In any case, it seems like at least question 2 needs to be answered.

You don’t need to escape the dot. Inside square brackets, many symbols (like + and .) lose their special meaning, so there’s no need to escape them. Escaping it anyway is harmless, so the pattern [Pp]ython [\d.]+ works the same as the one in your question.
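To make this concrete, here is a minimal sketch comparing the two patterns. The sample titles are made up for illustration and are not taken from the course dataset.

```python
import re

escaped = r"[Pp]ython [\d\.]+"    # the pattern from the question
unescaped = r"[Pp]ython [\d.]+"   # same pattern without the backslash

# Hypothetical titles, invented for this example
titles = [
    "Developing with Python 3.6.1",
    "python 2 vs Python 3",
    "Pythonx3 is not a version",
]

for title in titles:
    # Both patterns return identical matches for every title
    assert re.findall(escaped, title) == re.findall(unescaped, title)
    print(title, "->", re.findall(unescaped, title))

# Outside a set, the dot is special and matches any character except a newline
print(re.findall(r"3.6", "3x6 3.6"))    # ['3x6', '3.6']
# Escaped, or placed inside a set, it matches only a literal dot
print(re.findall(r"3\.6", "3x6 3.6"))   # ['3.6']
print(re.findall(r"3[.]6", "3x6 3.6"))  # ['3.6']
```

So the backslash only changes the result when the dot sits outside the square brackets.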

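You can see the details in the documentation. I leave the relevant excerpt below:

Special characters lose their special meaning inside sets. For example, [(+*)] will match any of the literal characters '(', '+', '*', or ')'.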

Hi Bruno!

Apologies for the ambiguous framing of my query.

I was able to discern the meaning of other elements of the regex barring the backslash preceding the . character.
I wrote my regex without escaping the ., just as you did towards the end of your response, but got confused upon seeing the solution code.

Now, I understand that a special character loses its special meaning inside sets.

Thanks for the clarification, the link to the documentation, and the excerpt!

Why do the instructions not ask us to make the space following the word Python optional?
If we go by the instructions, then Python 4 gets counted but Python4 does not?

pattern = r'[Pp]ython ?([\d.]+)' -> this pattern is marked as wrong, even though it counts both of the above scenarios correctly
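Here is a small sketch of the difference. The strict pattern below is my assumption of what the instructions expect (a mandatory space plus a capture group), and the titles are made up for illustration.

```python
import re

strict = r"[Pp]ython ([\d.]+)"    # assumed expected answer: space is required
lenient = r"[Pp]ython ?([\d.]+)"  # pattern from this post: space is optional

# Hypothetical titles, invented for this example
titles = ["Python 4 released", "Python4 released", "python 3.6 tips"]

for title in titles:
    # With a single capture group, findall returns only the captured version
    print(title, "|", re.findall(strict, title), "|", re.findall(lenient, title))

# Python 4 released | ['4'] | ['4']
# Python4 released | [] | ['4']
# python 3.6 tips | ['3.6'] | ['3.6']
```

Only the lenient pattern picks up the version when there is no space after Python.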

Thank you for this! I was trying and failing to figure out why my code wasn’t being accepted until I saw your comment and realized that the answer is just incorrect in the system (definitely should have the ‘?’).

I was also trying to use the ‘?’ after space. I got a little confused about not using it. I hope the correct answer is really with ‘?’.

I am here for the same reason. The preceding exercises definitely led me to believe the ? was important. I mean, it is, right? Otherwise the versions get counted differently.