Screen Link:
My Code:
# Pull the first run of digits out of each value (expand=False returns a Series)
combined_updated['institute_service_up'] = combined_updated['institute_service'].astype('str').str.extract(r"(\d+)", expand=False)
# Convert the extracted digit strings to floats
combined_updated['institute_service_up'] = combined_updated['institute_service_up'].astype('float')
combined_updated['institute_service_up'].value_counts()
What I expected to happen:
I’m confused about how this code is able to extract the lower end of the range. Why is this the case?
How was I supposed to figure out that extract(r'(\d+)') was the proper syntax to use? I consulted the Python 3.4.10 documentation for the re module (section 6.2, "Regular expression operations"), but could only find information on matching digits with \d. Is there an alternative, within the framework of what I’ve learned in the modules, for solving this? Had it not been for the answer key, I don’t think I would have found this specific syntax through my own research. Any pointers on how to work this out independently?
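For anyone reading along, here is a minimal sketch of what the pattern appears to do. The sample values are made up for illustration and may not match the real institute_service column. As far as I can tell, \d matches a single digit, + means "one or more", and the parentheses mark the capture group that str.extract returns. Both re.search and str.extract keep only the first match, which would explain why the lower end of a range like "11-20" comes back.

```python
import re
import pandas as pd

# Hypothetical sample values; the real column may contain other formats.
service = pd.Series(["1-2", "11-20", "Less than 1 year", "7.0"])

# Plain re: search scans left to right and stops at the first run of digits.
first_with_re = [re.search(r"(\d+)", value).group(1) for value in service]
print(first_with_re)

# pandas: str.extract applies the same pattern to each element and returns
# the capture group from the first match (expand=False gives a Series).
first_with_pandas = service.str.extract(r"(\d+)", expand=False)
print(first_with_pandas.tolist())

# To see every number in a value instead of only the first, use findall.
print(re.findall(r"\d+", "11-20"))
```

If the upper end of the range were wanted instead, re.findall (or str.findall in pandas) followed by taking the last item would be one way to get it.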
Also, what’s the difference between 'str' and 'float' (in quotes) and str and float (without quotes)?
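A small experiment on this point, assuming a toy Series of digit strings (not the real data): when passed to astype, the built-in type float and the string name 'float' seem to request the same conversion. Outside of a dtype argument they are different objects, one being the type itself and the other an ordinary string.

```python
import pandas as pd

# Toy data for illustration only.
digits = pd.Series(["1", "11", "7"])

# Passing the built-in type object and passing its name as a string.
as_type_object = digits.astype(float)
as_type_name = digits.astype("float")

print(as_type_object.dtype, as_type_name.dtype)
print(as_type_object.equals(as_type_name))

# Outside of astype, the two are not interchangeable.
print(float("3"))           # float is callable and converts the string
print(type("float"))        # "float" is just a str object
```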
What actually happened:
That’s actually what I’m trying to figure out
TL;DR: I’m just curious about how the syntax works, the difference between 'str' and 'float' versus str and float, and why the code above returns the lower end of the range. I’d like to hear what the community thinks. Thanks!