Incorrect answer - Working With Strings In Pandas (11)

It seems that the answer proposed in the mission “Working With Strings In Pandas”, point 11 (Challenge: Clean a String Column, Aggregate the Data, and Plot the Results) is incorrect.

The code below results in an extra space between the words in the strings ‘HIGH OECD’ and ‘HIGH NONOECD’:

merged['IncomeGroup'] = merged['IncomeGroup'].str.replace(' income', '').str.replace(':', '').str.upper()
pv_incomes = merged.pivot_table(values='Happiness Score', index='IncomeGroup')
pv_incomes.plot(kind='bar', rot=30, ylim=(0,10))
plt.show()

I’ve dealt with it by adding another str.replace call (to replace the double space with a single one):

merged['IncomeGroup'] = merged['IncomeGroup'].str.replace('income', '').str.replace(':', '').str.replace('  ', ' ').str.upper()
pv_incomes = merged.pivot_table(values='Happiness Score', index='IncomeGroup')
pv_incomes.plot(kind='bar', rot=30, ylim=(0,10))
plt.show()

Please let me know what you think about it.

This code:

merged['IncomeGroup'] = merged['IncomeGroup'].str.replace(' income', '').str.replace(':','').str.upper()
print(merged['IncomeGroup'].unique())

returns these words:

['HIGH OECD' 'UPPER MIDDLE' 'HIGH NONOECD' nan 'LOWER MIDDLE' 'LOW']

I don’t see extra spaces.

However, notice that the second snippet you posted would return extra spaces if you had not added `str.replace('  ', ' ')`. That’s because in the first replace you did not put a space before income. Note the difference:

.str.replace(' income', '')
.str.replace('income', '')

In the first one you used the space before income, so it should not return extra spaces.

In the second one you did not put the space before income, so you had to add `str.replace('  ', ' ')`.
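
To check the difference without the full dataset, here is a minimal sketch on a few sample values. The Series `income` and its contents are assumptions made up for illustration (they only mimic what the raw IncomeGroup labels probably look like), and `regex=False` is added so the patterns are treated as plain text regardless of the pandas version:

```python
import pandas as pd

# Hypothetical sample values standing in for the raw IncomeGroup column
income = pd.Series([
    "High income: OECD",
    "High income: nonOECD",
    "Upper middle income",
    "Low income",
])

# Removing ' income' (with the leading space) leaves single spaces only
with_space = (
    income.str.replace(" income", "", regex=False)
          .str.replace(":", "", regex=False)
          .str.upper()
)

# Removing 'income' (no leading space) keeps the space that preceded it
without_space = (
    income.str.replace("income", "", regex=False)
          .str.replace(":", "", regex=False)
          .str.upper()
)

# repr() makes double and trailing spaces visible
for a, b in zip(with_space, without_space):
    print(repr(a), "vs", repr(b))
```

On these sample values the second chain also leaves a trailing space on labels that end in "income" (for example "Low income"), which replacing a double space with a single one would not remove.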

@otavios.s you’re right! thank you :wink:
