What about 1999/2000?

Shouldn’t this code also account for cases where the first year is 1999 and the second year is 2000?
Thanks,

https://app.dataquest.io/m/346/working-with-strings-in-pandas/10/extracting-more-than-one-group-of-patterns-from-a-series

Hello @annalisa,

There is no code in your post.
Kindly follow the guidelines as discussed in the post below.

Hi @annalisa,

So, for the code:

r"(?P<First_Year>[1-2][0-9]{3})/?(?P<Second_Year>[0-9]{2})?"

First_Year will match any number from 1000 to 2999:
[1-2] - must be 1 or 2
[0-9]{3} - must be three digits, each in the range 0-9

But Second_Year will match only numbers from 00 to 99:
[0-9]{2} - must be two digits, each in the range 0-9
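
To make the two groups concrete, here is a minimal sketch that runs the pattern above with Python's built-in `re` module. The sample strings and the helper name `extract_years` are made up for illustration and are not taken from the mission's dataset; the same pattern could presumably be passed to `Series.str.extract` in pandas.

```python
import re

# Pattern quoted in the post above.
PATTERN = r"(?P<First_Year>[1-2][0-9]{3})/?(?P<Second_Year>[0-9]{2})?"

# Hypothetical sample strings, not taken from the course dataset.
samples = ["1999/00", "2004", "2007/08", "no year"]


def extract_years(text):
    """Return the named groups of the first match, or None if nothing matches."""
    match = re.search(PATTERN, text)
    return match.groupdict() if match else None


results = {text: extract_years(text) for text in samples}
for text, groups in results.items():
    print(text, groups)
```

Note that Second_Year only ever holds the two-digit suffix (or nothing at all when the string has no second year), so the century has to be added in a separate step.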

Could you explain the question you have for this mission?

Hi @doyinsolamiolaoye @sahil, I did not include any code because my question is not about the code, but about the question being asked and the example provided (so I thought my code was not needed). I understand my original post was not detailed, so I am going to clarify my question:
Dataquest question 346-10 is about finding a year in the text and its subsequent year, but it assumes that the first two digits of the second year are the same as the first two digits of the first year. Maybe this question should be modified to also include the case where 1999 is the first year and 2000 is the second year? I am not looking for a code solution; it was more a suggestion to maybe modify this exercise. I probably wrote it in the wrong sub-forum. If so, I apologize, I am still not very familiar with this forum. Thanks a lot for reading. :blush:
(the sample code from the question has been kindly supplied by @kakoori, so I will not write it again)


@kakoori yes, thanks a lot. This is exactly my question. If the code finds 1999/00, it will output 1999 as first year, and 1900 as second year. Unless I am missing something big here… :nerd_face:

@annalisa

Ah, then yes. :smiley:

If we use vectorized slicing only, without any conditions, then this breaks for 1999/00.
But since this is presumably the only such case in our dataset, I suppose it can be corrected with a simple if clause.

A correction for all cases of XX99/00 would be better, of course.
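
One possible way to handle every XX99/00 case is sketched below. It is not the mission's official solution: the function names `naive_second_year` and `full_second_year` are hypothetical, and the fix assumes the second year is never earlier than the first and that the two years are less than a century apart.

```python
def naive_second_year(first_year, suffix):
    """Slicing-only approach: reuse the first two digits of the first year."""
    return int(first_year[:2] + suffix)


def full_second_year(first_year, suffix):
    """Expand a two-digit suffix, rolling over to the next century if needed.

    Assumption (not stated in the mission): the second year is never
    earlier than the first year.
    """
    first = int(first_year)
    century = first - first % 100       # e.g. 1999 -> 1900
    candidate = century + int(suffix)   # e.g. 1900 + 0 -> 1900
    if candidate < first:               # 1900 < 1999, so move to the next century
        candidate += 100
    return candidate


print(naive_second_year("1999", "00"))  # wrong century
print(full_second_year("1999", "00"))   # rolled over
print(full_second_year("2007", "08"))   # unchanged case
```

On a DataFrame the same comparison could be done column-wise (for example with a boolean mask that adds 100 where the expanded second year is smaller than the first), but the scalar version keeps the logic easy to check.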
