Counting mentions of the C language

Screen Link:
https://app.dataquest.io/c/64/m/369/advanced-regular-expressions/4/counting-mentions-of-the-c-language

My Code:

# Based on useful information from https://www.dataquest.io/wp-content/uploads/2019/03/python-regular-expressions-cheat-sheet.pdf
pattern = r"\b[Cc]\b(?![.+])"

What I expected to happen:
To match 100% with the expected result.

What actually happened:
It matched 90% with the expected result. An additional mention of the C language was found using the above pattern at titles.loc[221].

From the assignment, I understand that we should find every mention of C that is not followed by a period and is not part of another, distinct language name such as C++. The assignment also says we should use a negative set to prevent those matches. Can anyone explain why my suggested pattern is not correct when compared to the solution? Thank you in advance. /Michael


Hey, Michael.

The solution is checking for something slightly different. In my opinion, it is incorrect and yours is correct.

However, I'd suggest that whoever ends up looking at this fix it not by modifying the solution, but by making the instructions more precise (so that your solution is no longer valid).

I agree, and that’s why I said above that you’re correct. The difference between your solution and the given solution is that yours captures what you mention, while Dataquest’s captures “mentions of C that are followed by something other than a period or plus sign”.

Do you see the difference? It’s “not followed by” vs “followed by something other than”.
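Here is a small sketch of that difference. I'm assuming the solution uses a negative set along the lines of `[^.+]` after the word boundary, which is a guess based on the instructions. The sample titles are made up for illustration and are not from the dataset. A negative set has to consume one character, so it cannot match a C that is the last character of the title, while the negative lookahead can. That may be what happens at titles.loc[221], though I can't confirm it without seeing the row.

```python
import re

# Michael's pattern: C that is NOT followed by "." or "+".
# The lookahead consumes nothing, so it also succeeds at the end of the string.
lookahead = re.compile(r"\b[Cc]\b(?![.+])")

# Assumed shape of the solution's pattern (a guess, not the verified solution):
# C FOLLOWED BY one character that is not "." or "+".
negative_set = re.compile(r"\b[Cc]\b[^.+]")

# Hypothetical titles, invented for this example.
samples = [
    "Learning C the hard way",  # C followed by a space
    "Why I still write C",      # C is the last character
    "Modern C++ features",      # C followed by "+"
    "Ask for the C.E.O.",       # C followed by "."
]

# One (lookahead_matches, negative_set_matches) pair per title.
results = [
    (bool(lookahead.search(title)), bool(negative_set.search(title)))
    for title in samples
]

for title, (by_lookahead, by_set) in zip(samples, results):
    print(f"{title!r}: lookahead={by_lookahead}, negative_set={by_set}")
```

Only the second title separates the two patterns: the lookahead matches it and the negative set does not.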

Nice catch.
