Advanced regular expression: [Pp]ython freq table

Screen Link: Learn data science with Python and R projects

My Code:

pattern = r"[Pp]ython ([\d.]+)"
py_versions_freq = titles.str.extract(pattern, expand=False).value_counts()

I expected the freq table to show Python/python plus the version digits. However, it only shows the digits. Why?

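That’s what str.extract() does. It returns only the text matched by the capture group in the pattern, which is the part enclosed in parentheses. In your pattern, the parentheses surround only [\d.]+, so only the digits and dots come back.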

If you modified your pattern to r"([Pp]ython [\d.]+)", then you would see Python/python as well.
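
Here is a small sketch comparing the two patterns side by side. The sample titles are made up for illustration (the real titles Series comes from the course dataset), and names like version_only and full_match are just placeholders:

```python
import pandas as pd

# Hypothetical sample data standing in for the course's `titles` Series.
titles = pd.Series([
    "Python 3.6 released",
    "Why python 2.7 still matters",
    "Moving to Python 3.6",
    "No version here",
])

# Capture group around the version only: extract() returns just the digits.
version_only = titles.str.extract(r"[Pp]ython ([\d.]+)", expand=False)

# Capture group around the whole match: extract() returns the word plus the version.
full_match = titles.str.extract(r"([Pp]ython [\d.]+)", expand=False)

# Titles that do not match become NaN, which value_counts() skips by default.
version_freq = version_only.value_counts()
full_freq = full_match.value_counts()

print(version_freq)
print(full_freq)
```

Note that with the second pattern, "Python 3.6" and "python 3.6" would be counted as separate entries, since the capitalization is now part of the extracted text.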

I would recommend checking out the pandas documentation for Series.str.extract().