Hi @ryan.wetherbie,
I was about to post the same topic when I saw your post.
The pattern you used is similar to what I had:
pattern1 = r"(?<!Series\s)\b[Cc]\b(?![+.])"
While this pattern worked for this particular exercise, I realized that the proposed answer provided by Dataquest:
pattern2 = r"(?<!Series\s)\b[Cc]\b((?![+.])|\.$)"
is more correct in general.
This is because pattern1 will not match cases where [Cc] is at the end of a sentence and immediately followed by a period ".", since the negative lookahead (?![+.]) rejects any [Cc] that is followed by "+" or ".".
Hence, a string with the following value:
string1 = "I find it difficult to learn C."
will not be matched by pattern1, but it will be matched by pattern2.
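A quick way to check this is with Python's standard re module. This is only a minimal sketch; the names match1 and match2 are mine for illustration and are not part of the exercise:

```python
import re

pattern1 = r"(?<!Series\s)\b[Cc]\b(?![+.])"
pattern2 = r"(?<!Series\s)\b[Cc]\b((?![+.])|\.$)"

string1 = "I find it difficult to learn C."

# re.search returns None when the pattern matches nowhere in the string
match1 = re.search(pattern1, string1)
match2 = re.search(pattern2, string1)

print(match1)          # None
print(match2.group())  # C.
```

Note that the match from pattern2 includes the trailing period, because the \.$ branch consumes it.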
This is because pattern2 tells the program to match [Cc] either where it is not followed by the characters "+" or ".", as expressed by the negative lookahead (?![+.]), OR where it is followed by a "." that sits at the very end of the string (or at the end of a line, if the re.MULTILINE flag is used), as expressed by \.$. In that second case the period becomes part of the match.
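To see which inputs pattern2 accepts, here is a small sketch run on a few strings I made up (they are not taken from the Dataquest data set), and matched_text is just a hypothetical helper name:

```python
import re

pattern2 = r"(?<!Series\s)\b[Cc]\b((?![+.])|\.$)"

def matched_text(text):
    """Return the text matched by pattern2, or None if there is no match."""
    m = re.search(pattern2, text)
    return m.group() if m else None

# Made-up examples, one per case discussed above
examples = [
    "I like C and Python",              # lookahead branch: "C" followed by a space
    "I find it difficult to learn C.",  # \.$ branch: "C" then "." at end of string
    "C++ is fast",                      # rejected: "C" followed by "+"
    "Series C funding",                 # rejected by the lookbehind
    "C. is hard",                       # rejected: the "." is not at the end
]

results = {text: matched_text(text) for text in examples}
for text, result in results.items():
    print(repr(text), "->", repr(result))
```

The last example shows a limit of pattern2 as well: a "C." in the middle of a longer string is still skipped, since \.$ only accepts a period at the very end.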
pattern1 works for this exercise only because the data set we’re working with happens not to contain cases like the string1 example I gave, where "C" is at the end of the sentence.
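As a rough illustration, the two patterns agree whenever no string ends in "C.". The titles below are invented for this sketch and are not from the actual data set:

```python
import re

pattern1 = r"(?<!Series\s)\b[Cc]\b(?![+.])"
pattern2 = r"(?<!Series\s)\b[Cc]\b((?![+.])|\.$)"

# Hypothetical titles, none of which end in "C."
titles = [
    "Learning C the hard way",
    "C++ templates explained",
    "Series C funding round closed",
    "Why c is still relevant",
]

flags1 = [bool(re.search(pattern1, t)) for t in titles]
flags2 = [bool(re.search(pattern2, t)) for t in titles]

print(flags1)            # [True, False, False, True]
print(flags1 == flags2)  # True
```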
I hope the instructions and the prompt for this screen are updated, because they are really confusing and took me more time to figure out than I would have preferred.