Screen Link: https://app.dataquest.io/m/136/data-cleaning-walkthrough/13/parsing-geographic-coordinates-for-schools
For this exercise, I tried the following as an alternative solution:
pattern=r"((.+))"
def lats(s):
    co=s.str.extract(pattern).str.replace("(", " ").str.replace(")", " ").str.strip()
    latitude=co.str.split(",").str.get(0)
    return latitude
data['hs_directory']['lat']=data['hs_directory']['Location 1'].apply(lats)
I get the following error:
AttributeError: 'str' object has no attribute 'str'
I am not sure why, since the line below runs OK:
cor=data['hs_directory']['Location 1'].str.extract(pattern).str.replace("(", " ").str.replace(")", " ").str.strip()
From my understanding, Series.apply() applies the function to the Series provided, data['hs_directory']['Location 1'], so I don't understand why I am getting the error.
Thank you for your help
Are you absolutely sure that the above is running without any issues? Only that line of code, and nothing before it?
Because if that line runs fine for you, the pattern you shared in your post cannot be the one in your code. In your own code you probably have r"(.+)" (notice the single set of parentheses instead of the two you have above).
Regardless, the reason that line runs while the one in your lats() function doesn't is that, when you use apply(), the s that gets passed to lats(s) is the actual value from each row. If you print type(s), it shows <class 'str'>: it is a string value. apply() calls lats() once for each row of the column and passes in that row's value.
.str is an accessor that only exists on a Series. Since s is a plain string and not a Series, you get the error.
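If it helps to see this for yourself, here is a small sketch that records what apply() hands to the function and reproduces the error. The sample addresses and the record_type helper are made up for illustration; they only mimic the shape of the 'Location 1' column.

```python
import pandas as pd

# Hypothetical sample values shaped like the 'Location 1' column
locations = pd.Series([
    "883 Classon Avenue\nBrooklyn, NY 11225\n(40.67, -73.96)",
    "1110 Boston Road\nBronx, NY 10456\n(40.82, -73.90)",
])

seen_types = []

def record_type(s):
    # Series.apply() calls this once per element, so s is a single value
    seen_types.append(type(s))
    return s

locations.apply(record_type)
print(seen_types)

message = ""
try:
    # Same mistake as in lats(): using the .str accessor on a plain string
    locations.apply(lambda s: s.str.strip())
except AttributeError as err:
    message = str(err)
    print(message)
```

Every recorded type is str, and the message is the same AttributeError you posted.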
It works in that line of code because data['hs_directory']['Location 1'] is a pandas Series, and .str is a valid accessor on it.
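Two possible ways to get the latitude, sketched on made-up sample data: either use .str on the whole Series and skip apply(), or keep apply() and use ordinary string methods inside the function. Note that the escaped pattern r"\((.+)\)" is my assumption about what you intended (literal parentheses around one capture group), and lat_from_string is a hypothetical name, not something from the exercise.

```python
import re

import pandas as pd

# Hypothetical sample values shaped like the 'Location 1' column
locations = pd.Series([
    "883 Classon Avenue\nBrooklyn, NY 11225\n(40.67, -73.96)",
    "1110 Boston Road\nBronx, NY 10456\n(40.82, -73.90)",
])

# Assumed pattern: literal "(" and ")" around a single capture group
pattern = r"\((.+)\)"

# Option 1: vectorized, no apply(). expand=False returns a Series,
# so the .str accessor can keep being chained.
coords = locations.str.extract(pattern, expand=False)
lat_vectorized = coords.str.split(",").str.get(0).str.strip()

# Option 2: keep apply(), but treat s as the plain string it is
def lat_from_string(s):
    match = re.search(pattern, s)
    if match is None:
        return None
    return match.group(1).split(",")[0].strip()

lat_applied = locations.apply(lat_from_string)

print(lat_vectorized.tolist())
print(lat_applied.tolist())
```

Both options give the same latitude strings here; the vectorized one is usually the shorter to write.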