Using Flags to Modify Regex Patterns

pattern = r'e-?\s?mail'

What is wrong with this pattern?

I am able to match all of these variations of the word email:

email
Email
e Mail
e mail
E-mail
e-mail
eMail
E-Mail
EMAIL

My code

email_mentions = titles.str.contains(pattern, flags=re.I).sum()

But when I submit, it shows:

email_mentions int64 (<class 'numpy.int64'>)

  • actual: 151
  • expected: 143

This wasn’t immediately obvious to me either, but the answer is in the hint for this screen:

You will need to use a word boundary so that your pattern does not capture words like “late mail.”
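
For anyone who wants to see the hint in action, here is a minimal sketch using plain re instead of pandas. The sample titles are made up for illustration and are not from the course dataset, and the leading-boundary pattern is only an assumed form of the fix; the original pattern and the “late mail” case are the only parts taken from this thread.

```python
import re

# Hypothetical sample titles (not from the course dataset).
sample_titles = [
    "Show HN: Send an email from your shell",
    "Why the late mail delivery matters",
    "E-Mail is not dead",
    "Ask HN: Best e mail client?",
]

# Pattern from the original post: no word boundary.
without_boundary = re.compile(r"e-?\s?mail", flags=re.I)
# Assumed fix: same pattern with a leading word boundary.
with_boundary = re.compile(r"\be-?\s?mail", flags=re.I)

matches_without = [t for t in sample_titles if without_boundary.search(t)]
matches_with = [t for t in sample_titles if with_boundary.search(t)]

print(len(matches_without), len(matches_with))
```

Without the boundary, the "e" that ends "late" plus the following " mail" counts as a match, which is the kind of false positive that could inflate the count.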


Gotcha!!
Thanks
Yeah didn’t think about this particular case


Hey guys! I can’t seem to understand why I need the word boundary only at the beginning. Shouldn’t it also be at the end?

When I try with a word boundary at the end as well, I get 2 fewer results (141 instead of 143), so I investigated and found these two matches:

13943 Why That Salesperson Just Wont Stop Emailing You
14161 Emailing SaaS companies to test support time

I think these two should be discounted and the pattern should be

r'\be[-\s]?mails?\b'

What do you think?
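
To make the difference concrete, here is a small sketch comparing a leading-only boundary with boundaries on both ends. The first two titles are the ones quoted above; the third is a made-up control, and the leading-only pattern is just an assumed form of the expected answer.

```python
import re

titles = [
    "Why That Salesperson Just Wont Stop Emailing You",
    "Emailing SaaS companies to test support time",
    "A plain email title",  # hypothetical control title
]

# Assumed leading-only version: boundary before the "e" only.
leading_only = re.compile(r"\be[-\s]?mail", flags=re.I)
# Proposed stricter version: boundary on both ends, optional plural "s".
both_ends = re.compile(r"\be[-\s]?mails?\b", flags=re.I)

kept_leading = [t for t in titles if leading_only.search(t)]
kept_both = [t for t in titles if both_ends.search(t)]

print(len(kept_leading), len(kept_both))
```

The trailing \b fails on "Emailing" because "l" and "i" are both word characters, so there is no boundary between them.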


Hi @jfpsmatos, welcome to the community! You’re the 2nd person in the last couple of days who has brought this up!

Personally, I think within the context of the mission, these should be included. Our goal is to find the articles that reference email, and these 2 technically fit in. The purpose of the word boundary was to eliminate titles that may contain a variant of “email” but not really be about email (such as the “late mail” example). “Emailing” is about email, so it makes sense we should have these articles.

Probably what’s confusing is that we’re expecting only the “email” variants we were given in the list, and “emailing” wasn’t on it. The wording of the instructions could be clearer that the list isn’t necessarily exhaustive, or the list could just include these 2 variants so we know a word boundary at the end isn’t ideal. I’ll pass along your feedback to @Sahil. Thanks for your input!


Hi @april.g, thank you for your feedback! Being second sucks :smiley:

Happy birthday!


Hi @april.g, @jfpsmatos,

Yes, either the instructions should be clearer or the expected output needs to be changed. I will get this issue logged. Thank you for the excellent feedback on this mission screen.

Best,
Sahil