Screen Link:
email replace
My Code:
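import re
import pandas as pd

# `titles` is assumed to be the Series of titles already loaded on this screen.
email_variations = pd.Series(['email', 'Email', 'e Mail',
                              'e mail', 'E-mail', 'e-mail',
                              'eMail', 'E-Mail', 'EMAIL'])
pattern = r"\be[\s-]*mail\b"         # word-bounded, no plural
pattern1 = r"e[\s\-]?mail"           # pattern from the answer section
pattern2 = r"\be[\s-]*mail[Ss]*\b"   # word-bounded, optional plural
# regex=True is passed explicitly because newer pandas versions default to literal matching.
email_uniform = email_variations.str.replace(pattern, "email", flags=re.I, regex=True)
titles_clean = titles.str.replace(pattern, "email", flags=re.IGNORECASE, regex=True)
titles_clean_1 = titles.str.replace(pattern1, "email", flags=re.IGNORECASE, regex=True)
titles_clean_2 = titles.str.replace(pattern2, "email", flags=re.IGNORECASE, regex=True)
# Titles where the cleaned results differ between patterns
mismatch1 = titles[~titles_clean.eq(titles_clean_1)]
mismatch2 = titles[~titles_clean_2.eq(titles_clean_1)]
print("pattern:\n", titles_clean[[161, 450, 9006]])
print("pattern1:\n", titles_clean_1[[161, 450, 9006]])
print("pattern2:\n", titles_clean_2[[161, 450, 9006]])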
What I expected to happen:
I expected that “source Mailchimp” would not be matched, so I wrote pattern2 = r"\be[\s-]*mail[Ss]*\b". When I submitted it, the test case failed.
The answer section gives the pattern r"e[\s-]?mail", but it has a side effect.
It also matches inside these phrases:
source Mailchimp
open source mail client
This is wrong; the pattern should not match them.
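To show where the extra matches come from, here is a minimal sketch using only the standard-library re module instead of pandas. It reuses pattern1 and pattern2 from my code above. The sample titles are shortened versions of the rows in the output, and normalize is a hypothetical helper name used just for this illustration.

```python
import re

# Two of the patterns from the code above.
pattern1 = r"e[\s\-]?mail"            # no word boundaries
pattern2 = r"\be[\s-]*mail[Ss]*\b"    # anchored at word boundaries

# Shortened versions of the titles shown in the output of this post.
samples = [
    "Mailtrain (the open source Mailchimp clone)",
    "N1 The extensible, open source mail client",
    "Computer Specialist Who Deleted Clinton Emails",
]

def normalize(pattern, text):
    """Replace every case-insensitive match of pattern with 'email'."""
    return re.sub(pattern, "email", text, flags=re.IGNORECASE)

for title in samples:
    # Show exactly which substrings each pattern picks up.
    print(title)
    print("  pattern1 matches:", re.findall(pattern1, title, flags=re.I))
    print("  pattern2 matches:", re.findall(pattern2, title, flags=re.I))
```

Without \b, the trailing "e" of "source" plus the following space and "Mail" or "mail" satisfies pattern1, which is how "sourcemailchimp" appears. With \b in front, the "e" has to start a word, so those titles are left alone.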
What actually happened:
The given answer matched extra lines:
pattern:
161 Computer Specialist Who Deleted Clinton Emails...
450 Mailtrain (the open source Mailchimp clone) is...
9006 N1 The extensible, open source mail client
Name: title, dtype: object
pattern1:
161 Computer Specialist Who Deleted Clinton emails...
450 Mailtrain (the open sourcemailchimp clone) is ...
9006 N1 The extensible, open sourcemail client
Name: title, dtype: object
pattern2:
161 Computer Specialist Who Deleted Clinton email ...
450 Mailtrain (the open source Mailchimp clone) is...
9006 N1 The extensible, open source mail client