Difference between extract and extractall -346-9

Screen Link:
Working With Strings In Pandas | Dataquest

My first question: what do we mean when we say extract will only extract the first match?
Does it mean it extracts the first instance it comes across and ignores the other rows?
What would be a practical example of preferring extract over extractall in a project?

My second question is about the example given. I do not understand how the values in the match column were generated or what they indicate. For the year 1999 for Finland, the match is 0 and 1. For Netherlands, it is 0 and 2. Oh, maybe it simply means the order 0, 1, 2, 3, 4, 5, …, right? What could its use be? I hope it will become clear in the upcoming missions.

Yes, with one clarification: extract() returns only the first match in each row. No rows are skipped; what gets ignored is any further match within the same row, even if there are more to be found. This is useful when we only want the first mention of something rather than every match. extractall(), on the other hand, returns all matches in each row that fit the pattern.
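Here is a small sketch of the difference. The data is made up for illustration (the country names, strings, and the year pattern are my own assumptions, not the dataset from the mission):

```python
import pandas as pd

# Hypothetical data: each row may mention zero, one, or several years.
years = pd.Series(
    ["1999 and 2008", "2004", "no year here"],
    index=["Finland", "Netherlands", "Spain"],
)

# A named capture group becomes the column name in the result.
pattern = r"(?P<Years>[12][0-9]{3})"

# extract(): one output row per input row, holding only the first match.
first = years.str.extract(pattern)

# extractall(): one output row per match; rows with no match are dropped.
every = years.str.extractall(pattern)

print(first)
print(every)
```

With this data, extract() keeps all three rows (NaN for the row with no year), while extractall() returns three rows in total: two for Finland and one for Netherlands.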

I think it might be helpful to know that match is actually an index level, not a regular column. It numbers the matches found within each row, starting again from 0 for every row. The table generated by Series.str.extractall() therefore has what's called a multi-index. This is why Country and match appear on the line below Years: it indicates that they are both index levels of the returned dataframe object.
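If it helps, here is a rough sketch of how the multi-index can be used. Again, the data and the variable names are made up for illustration and are not taken from the mission:

```python
import pandas as pd

# Hypothetical data with a named index, so the first level is called "Country".
years = pd.Series(
    ["1999 and 2008", "2004"],
    index=pd.Index(["Finland", "Netherlands"], name="Country"),
)

matches = years.str.extractall(r"(?P<Years>[12][0-9]{3})")

# The result is indexed by (Country, match).
print(list(matches.index.names))

# All matches for one country, indexed by match number.
finland = matches.loc["Finland"]

# Only the first match (match == 0) for every country.
first_only = matches.xs(0, level="match")

# Turn both index levels back into ordinary columns.
flat = matches.reset_index()
print(flat)
```

So the match level is mostly a bookkeeping device: it lets you tell apart several matches that came from the same original row.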

Sure, thanks. I will read up on multi-index.