Screen Link: https://app.dataquest.io/m/369/advanced-regular-expressions/9/extracting-url-parts-using-multiple-capture-groups
What I expected to happen:
If I use the above code for the URL, then it should return the whole URL, because the syntax includes the dot (.), which means any char except newline, tab, etc., and this includes ? as well. So ([\w\-\.]+) should return the whole URL, www.valid.ly?param, up to the params.
What actually happened:
Why does it stop before ? and not read further?
Because you’ve not specified them.
The dot (.) reads those characters, and it includes all chars in
Consider this part: if I write r"([\w\-\.]+)", then it should read www.valid.ly?param, but that code only gives www.valid.ly and does not proceed further. It should not stop before ? and should give the whole URL as output, but this doesn't happen. Why?
No @nishu123tushar, it returns the correct string, because according to the pattern you are selecting one or more (+) alphanumeric or underscore characters (\w), together with the escaped dash and dot characters.
The dot at the end of the pattern has a backslash in front of it rather than standing alone, so it matches only a literal dot and not any character. That's why it does not read past the ?.
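If it helps, here is a small sketch you can run to check this yourself. It assumes Python's re module (the r"..." pattern in the question suggests Python) and uses the sample string www.valid.ly?param from the question; the variable names are only for illustration. One extra detail worth knowing: inside square brackets a dot is literal even without the backslash, so the set never means "any character".

```python
import re

url = "www.valid.ly?param"  # sample string taken from the question

# Inside [...], \w matches letters, digits and underscore;
# \- and \. match a literal dash and a literal dot.
# "?" is not in the set, so the match stops right before it.
domain_only = re.search(r"([\w\-\.]+)", url).group(1)
print(domain_only)  # www.valid.ly

# An unescaped dot inside [...] is still a literal dot: same result.
same_result = re.search(r"([\w\-.]+)", url).group(1)
print(same_result)  # www.valid.ly

# Outside a character class, "." matches any character except a newline,
# so this pattern runs past the "?".
whole = re.search(r"(.+)", url).group(1)
print(whole)  # www.valid.ly?param

# One possible alternative (an assumption, not the lesson's answer):
# add "?" to the set if it should be part of the match.
with_query = re.search(r"([\w\-.?]+)", url).group(1)
print(with_query)  # www.valid.ly?param
```

So the pattern is doing exactly what it says: it collects only the characters listed in the set and stops at the first one that is not listed.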