Regex pattern extracts too much, but why?

Screen Link:

My Code:

import re
import pandas as pd

test_urls = pd.Series([
 'https://www.amazon.com/Technology-Ventures-Enterprise-Thomas-Byers/dp/0073523429',
 'http://www.interactivedynamicvideo.com/',
 'http://www.nytimes.com/2007/11/07/movies/07stein.html?_r=0',
 'http://evonomics.com/advertising-cannot-maintain-internet-heres-solution/',
 'HTTPS://github.com/keppel/pinn',
 'Http://phys.org/news/2015-09-scale-solar-youve.html',
 'https://iot.seeed.cc',
 'http://www.bfilipek.com/2016/04/custom-deleters-for-c-smart-pointers.html',
 'http://beta.crowdfireapp.com/?beta=agnipath',
 'https://www.valid.ly?param',
 'http://css-cursor.techstream.org'
])

pattern = r"(https?)://([\w\.\-]+)/?(.+)"

test_url_parts = test_urls.str.extract(pattern, flags=re.I)
url_parts = hn['url'].str.extract(pattern, flags=re.I)

What I expected to happen:
Columns split by protocol, domain and URL path

For the first row, the pattern extracts / for the 3rd column. I don’t understand why.
I thought it would be excluded by: /?

Hi Jeroen,

According to the documentation, the special character ? matches 0 or 1 repetitions of the preceding regular expression. So /? will match one slash if there is one at that position in the string, and nothing otherwise.
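
To see why the slash still ends up in the third column, here is a minimal sketch using plain re instead of pandas. The helper split_url and the variable names original and relaxed are my own, and the relaxed pattern (with .* instead of .+) is only one possible adjustment, not necessarily the intended solution. Because .+ must match at least one character, the regex engine backtracks and gives something up to it whenever nothing follows the domain.

```python
import re

# Pattern from the question: the last group uses .+ (one or more characters).
original = r"(https?)://([\w\.\-]+)/?(.+)"
# Hypothetical alternative: .* allows the last group to be empty.
relaxed = r"(https?)://([\w\.\-]+)/?(.*)"

def split_url(pattern, url):
    """Return the captured groups, or None if the pattern does not match."""
    match = re.match(pattern, url, flags=re.I)
    return match.groups() if match else None

# Trailing slash only: /? first consumes the slash, but then .+ has nothing
# left to match, so the engine backtracks, /? matches nothing, and .+ takes
# the slash instead.
print(split_url(original, 'http://www.interactivedynamicvideo.com/'))
# ('http', 'www.interactivedynamicvideo.com', '/')

# No path at all: the domain group gives up its last character to satisfy .+
print(split_url(original, 'https://iot.seeed.cc'))
# ('https', 'iot.seeed.c', 'c')

# With .* the last group may be empty, so nothing has to be given up.
print(split_url(relaxed, 'http://www.interactivedynamicvideo.com/'))
# ('http', 'www.interactivedynamicvideo.com', '')
print(split_url(relaxed, 'https://iot.seeed.cc'))
# ('https', 'iot.seeed.cc', '')
```

The same relaxed pattern can be passed to test_urls.str.extract(pattern, flags=re.I) to check the result on the whole series.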