Renaming columns before combining dataframes problem

I have ran this to get rid of brackets , to make column names uniform before combining dataframes, however when I tried I see that this does not save result back into original datframe and just returns an index .

happiness2015=happiness2015.columns.str.replace("(", "")

However there would still be leading and trailing spaces which causes this as mentioned —

You may have also noticed that some of the column names differ only by punctuation, which caused the dataframes to be combined incorrectly:

Trust (Government Corruption)

Therefore I still have null values and the columns have not been combined

Then I realised I have to amend column names for each year of each separate csv --2015,2016,2017 before combining to make them all same

happiness2015 = pd.read_csv("World_Happiness_2015.csv")
happiness2016 = pd.read_csv("World_Happiness_2016.csv")
happiness2017 = pd.read_csv("World_Happiness_2017.csv")

So then I tried this solution from stackoverflow to also solve a trailing space problem, but got error

AttributeError: Can only use .str accessor with string values!

happiness2015['Health (Life Expectancy)']=happiness2015['Health (Life Expectancy)'].str.replace(r"\s+\(.*\)","")

Even if it did work it seems like a very time consuming task to do this for each column of '16 and '17 to get same spelling and ensure no leading and trailing spaces , is there a more efficient way or if not please advise on how to fix?

You did a great job looking for solutions. But actually you almost did it right on the step 1. There’s a small but crucial error in your code.

You save the results to the data frame itself instead of saving it to the .columns attribute:
happiness2015.columns = happiness2015.columns.str.replace("(", "")

1 Like


For happiness2017dataset,you could do this.

happiness2017.columns = happiness2017.columns.str.replace(’.’, ’ ‘).str.replace(’\s+’, ’ ').str.strip().str.upper()

For renaming columns in happiness2016 try this


For renaming columns in happiness2015 try this


You don’t have to rename each column like happiness2015[‘Health (Life Expectancy)’] .Instead you can do it together ie, rename all columns in a dataset at once like above .

hope this helps.

1 Like

Hi ksenia.kustanovich
Thanks for your feedback and correction.


1 Like