Hiya, can someone help me understand what’s happening here? The regex seems to work, but it splits each string in two, keeps both parts, and creates what seems to be a 2-D dataframe.
URL here: https://app.dataquest.io/m/369/advanced-regular-expressions/8/extracting-domains-from-urls
My code here:
pattern = r'(?<=(://))([\w\-\.]+)'
test_urls_clean = test_urls.str.extract(pattern, flags=re.I, expand=False)
It seems that you've used two capture groups in your pattern, which is why the result is a two-column dataframe.
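The parentheses around :// create the first capture group, which becomes column 0, and the parentheses around [\w\-\.]+ create the second capture group, which becomes the next column. str.extract returns one column per capture group, so expand=False can only give you a Series when the pattern has a single group.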
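One way to get a single column is to drop the parentheses around :// inside the lookbehind, so that only the domain part is captured. Here is a minimal sketch of the difference. The sample URLs are made up for illustration and stand in for the exercise's test_urls, and the names two_groups, one_group, both and domains are just placeholders:

```python
import re
import pandas as pd

# Hypothetical sample data standing in for test_urls from the exercise.
test_urls = pd.Series([
    "https://www.amazon.com/Technology-Ventures-Enterprise-Thomas-Byers/dp/0073523429",
    "http://www.interactivedynamicvideo.com/",
    "HTTPS://github.com/dataquestio",
])

# Original pattern: the lookbehind contains its own capture group,
# so there are two groups and extract() returns a two-column DataFrame.
two_groups = r'(?<=(://))([\w\-\.]+)'
both = test_urls.str.extract(two_groups, flags=re.I, expand=False)

# Same lookbehind without the inner parentheses: only one capture group,
# so with expand=False extract() returns a Series of domains.
one_group = r'(?<=://)([\w\-\.]+)'
domains = test_urls.str.extract(one_group, flags=re.I, expand=False)

print(both)
print(domains)
```

If you prefer to keep the original pattern, selecting the second column of the resulting dataframe should give the same domains.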