Your Code: I’m trying to write a function that adds a year column to my dataframe. The function takes a dataframe as input, for example Happiness2016. I want to extract 2016 from that name and put it in a new column “Year”. Here is my code:
def addYear(df):
    hold = "df"
    df["Year"] = hold[-4:]
What I expected to happen: I expect the function to add 2015, 2016, and 2017 to Happiness2015, Happiness2016, and Happiness2017 respectively.
What actually happened: My function adds "df" instead of what I expected.
I think it is because I assign "df" to the hold variable. Could you help me figure out a way to make this work, please?
That’s an interesting problem. When you write hold = "df", Python does not take the name of the dataframe and turn it into a string. It treats the df parameter and the string "df" as two unrelated things and makes no connection between them. So when it sees hold[-4:], it reads that as "take the last 4 characters of the string "df"". Since that string is only 2 characters long, the slice returns the whole thing, which is why the column was filled with "df".
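To see the slicing behavior on its own, here is a small sketch using plain strings only. The variable names short_slice, name, and year_slice are just for illustration.

```python
# The literal string "df" is unrelated to any variable that happens to be called df.
hold = "df"
short_slice = hold[-4:]   # the string has only 2 characters, so the slice returns all of it
print(short_slice)        # df

# The slice only gives a year when the string actually contains one.
name = "Happiness2016"
year_slice = name[-4:]
print(year_slice)         # 2016
```

So the slice itself is fine. The missing piece is getting the dataframe's name into the function as a real string.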
I found this article on Stack Overflow where someone asked how to extract a variable name as a string, as you want to do, but it seems like more trouble than it’s worth.
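One way around it might be to pass the name in as a string next to the dataframe, so nothing has to be recovered from the variable. This is only a sketch: add_year, its name parameter, and the sample data are made up for illustration, and it assumes the name always ends in a four-digit year.

```python
import pandas as pd

def add_year(df, name):
    """Return a copy of df with a Year column taken from the last 4 characters of name."""
    out = df.copy()
    out["Year"] = int(name[-4:])  # assumes name ends in a 4-digit year, e.g. "Happiness2016"
    return out

# Hypothetical stand-ins for the real Happiness data
frames = {
    "Happiness2015": pd.DataFrame({"Country": ["A", "B"]}),
    "Happiness2016": pd.DataFrame({"Country": ["A", "B"]}),
    "Happiness2017": pd.DataFrame({"Country": ["A", "B"]}),
}

# Keeping the dataframes in a dict keyed by name gives the function the string it needs
frames = {name: add_year(df, name) for name, df in frames.items()}
print(frames["Happiness2016"])
```

Keeping the dataframes in a dict also means the same call works for all three years in one loop. If you would rather keep the year as text, drop the int() around the slice.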