Hi everyone, I am completing the guided project ‘Clean and Analyze Employee Exit Surveys’ in the Dataquest ‘Data Analyst with Python’ path and I’m stuck on a seemingly minor issue.
I am working with a dataset called ‘dete_resignations’ which contains a column called ‘cease_date’. The column is of type int64 and it contains entries with the format ‘YYYY’ and other entries with the format ‘MM/YYYY’.
My aim is to extract the year from this column.
My first idea was to match the ‘MM/YYYY’ pattern with a regex and then select the year part:
When I try this, I get an error: "Can only use .str accessor with string values!" I suppose I cannot use .str.extract because my column's type is ‘int64’.
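For reference, here is a minimal sketch of one way around this error. It assumes a small made-up Series standing in for `cease_date` (the values are hypothetical, not taken from the real dataset). Casting with `astype(str)` makes the `.str` accessor available, after which `str.extract` can pull out a four-digit year.

```python
import pandas as pd

# Hypothetical stand-in for dete_resignations['cease_date'] stored as int64
cease_date = pd.Series([2012, 2013, 2014], dtype="int64")

# The .str accessor only works on string data, so cast to str first
as_text = cease_date.astype(str)

# Capture the four digits at the end of each string;
# expand=False returns a Series instead of a one-column DataFrame
year = as_text.str.extract(r"(\d{4})$", expand=False).astype(float)

print(year.tolist())  # [2012.0, 2013.0, 2014.0]
```

Anchoring the pattern at the end of the string means the same regex would also pick the year out of an ‘MM/YYYY’ entry.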
So I moved on to my second idea: convert the data to datetime with pd.to_datetime(), then extract the year using .dt.year.
pd.to_datetime(dete_resignations['cease_date'], unit = 's').dt.year.astype(float)
When I try this, the code runs, but I get the year ‘1970’ for all entries. I looked on Stack Overflow, where it was suggested that this issue could be solved by setting the parameter unit = ‘s’, but that does not work for me. By printing intermediate results, I concluded that the problem arises not when I extract the year, but when I convert it to float.
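A small sketch of what may be going on here, using made-up integer years rather than the real column: with `unit='s'`, `pd.to_datetime` reads each integer as a number of seconds after 1970-01-01, so a value like 2012 lands about 33 minutes into 1970. That would suggest the ‘1970’ comes from the `unit` argument rather than from the conversion to float. Parsing the value as text with `format="%Y"` is one alternative; it is shown only as an illustration.

```python
import pandas as pd

# Hypothetical integer years, standing in for the int64 column
years = pd.Series([2012, 2013], dtype="int64")

# With unit='s', each integer is treated as seconds since 1970-01-01 00:00:00
as_seconds = pd.to_datetime(years, unit="s")
print(as_seconds.dt.year.tolist())  # [1970, 1970]

# Parsing the text "2012" with an explicit format treats it as a calendar year
as_years = pd.to_datetime(years.astype(str), format="%Y")
print(as_years.dt.year.tolist())  # [2012, 2013]

# Converting the extracted year to float does not change its value
print(as_years.dt.year.astype(float).tolist())  # [2012.0, 2013.0]
```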
At this point, I tried the proposed solution…which is the following:
So I copied and pasted the code:
dete_resignations['cease_date'] = dete_resignations['cease_date'].str.split('/').str[-1]
dete_resignations['cease_date'] = dete_resignations['cease_date'].astype("float")
…and again I end up with the error ‘Can only use .str accessor with string values!’, just like in my first approach. I checked the solution file, but it doesn’t seem to me that the authors did anything to the data, such as converting it to a string type, before these lines.
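One possible explanation, offered as an assumption rather than something confirmed by the solution file: a column that still holds text such as ‘MM/YYYY’ is stored with a text dtype, where `.str` works directly, whereas a column that has already been converted to numbers raises the error above. The sketch below uses a hypothetical helper, `extract_year`, and made-up values to show that casting with `astype(str)` first lets the same split logic run in both cases.

```python
import pandas as pd

def extract_year(column: pd.Series) -> pd.Series:
    """Hypothetical helper: return the part after the last '/' as a float."""
    # astype(str) makes .str usable whether the column started as text or int64
    return column.astype(str).str.split("/").str[-1].astype(float)

# Made-up values mixing the two formats described in the question
mixed_text = pd.Series(["2012", "05/2013", "12/2013"])
print(extract_year(mixed_text).tolist())  # [2012.0, 2013.0, 2013.0]

# Made-up values for a column that is already int64
already_int = pd.Series([2012, 2013], dtype="int64")
print(extract_year(already_int).tolist())  # [2012.0, 2013.0]
```

Printing `dete_resignations['cease_date'].dtype` before running the solution lines would show which of the two cases applies.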
Please help! I’m really struggling! I’d like to know:
- Is it possible to use string methods such as ‘str.extract()’ and ‘str.split()’ when my data is int64? If so, how?
- Why do I get ‘1970’ when I try to turn the year I extracted from datetime into float?
Thank you everyone…