Screen Link: https://app.dataquest.io/m/355/list-comprehensions-and-lambda-functions/8/lambda-functions
The function extract_and_increment() below extracts digits from a string using regex, groups them together using the .group() method, so we get a single integer, then adds one to that integer.
import re
def extract_and_increment(string):
    digits = re.search(r"\d+", string).group()
    incremented = int(digits) + 1
    return incremented
But I’m not seeing how it’s supposed to work. For example, if
string = "a1bb22ccc333dddd4444eeeee55555"
then my guess is that re.search(r"\d+", string) would match all the runs of digits in string, which would be 1, 22, 333, 4444 and 55555, the way regexr.com shows.
But, that doesn’t seem to happen…
string = "a1bb22ccc333dddd4444eeeee55555"
re.search(r"\d+", string)
…generates the output:
<re.Match object; span=(1, 2), match='1'>
…it only seems to match the first digit, 1. What am I misunderstanding?
After that, what would .group() do? I’ve only seen .group() used with regexes that contain capture groups, but this example has no capture groups, so I’m not sure how to interpret it. If I apply .group() as follows…
digits = re.search(r"\d+", string).group()
digits
…which outputs…
'1'
…I’m not sure what to make of the output. Is it just telling me that there’s one group because there’s just one match?
After that, I understand. incremented = int(digits) + 1 casts digits (1 in this case) as an integer and adds 1 to that integer (resulting in 2 in this case):
incremented = int(digits) + 1
incremented
Out: 2
To summarize: My questions are about digits = re.search(r"\d+", string).group():
- Why does re.search() match only the first digit instead of all the digit “groups”?
- How is .group() supposed to work if there are no capture groups?