I have a dataframe column containing strings of the following kind:
To remove the comma and the km, I do:
a_dataframe['a_column'] = a_dataframe['a_column'].str.replace(',', '')
a_dataframe['a_column'] = a_dataframe['a_column'].str.replace('km', '')
Is there a more elegant way to do this in a single line of code?
a_dataframe['a_column'] = a_dataframe['a_column'].str.replace(',', '').str.replace('km', '')
Also, recently I learned from @veratsien that you can remove all the non-digit characters using this trick:
autos['odometer'] = autos['odometer'].str.replace(r'\D', '', regex=True)
This replaces all non-digit characters with an empty string, so only the digits remain. This is super cool.
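In case it helps, here is a small self-contained sketch of that trick. The sample values are made up for illustration (only the `odometer` column name comes from the snippet above), and it assumes a pandas version where `regex=True` has to be passed explicitly for the pattern to be treated as a regular expression.

```python
import pandas as pd

# Made-up sample data, only for illustration.
autos = pd.DataFrame({'odometer': ['150,000km', '5,000km', '80km']})

# \D matches any non-digit character; each match is replaced by an empty string.
cleaned = autos['odometer'].str.replace(r'\D', '', regex=True)

# Once only digits are left, the column can be converted to integers.
autos['odometer'] = cleaned.astype(int)
```

Keep in mind that `\D` also strips decimal points and minus signs, so this probably only suits non-negative whole numbers.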
Or using a comprehension:
your_string = '150,000km'
''.join(c for c in your_string if c not in ',km')
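If you want to use the comprehension on a whole column, one possible sketch is to wrap it in a small helper and map it over the values. The helper name `strip_chars` and the sample data are my own, not something from this thread.

```python
import pandas as pd

def strip_chars(value, unwanted=',km'):
    """Drop every character of `value` that appears in `unwanted`."""
    return ''.join(c for c in value if c not in unwanted)

# Made-up sample data, only for illustration.
a_dataframe = pd.DataFrame({'a_column': ['150,000km', '5,000km']})
a_dataframe['a_column'] = a_dataframe['a_column'].map(strip_chars)
```

Note that this filters single characters, so every `k` and every `m` is removed wherever it appears, not just the substring `km`.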
Thanks guys, I tried the str.replace().str.replace(), but did not realize it did not work because I was running it on integers…
Yea, you probably missed the .str for the second one. It happened to me as well.
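For anyone hitting the same thing, here is a rough sketch of both pitfalls, using made-up sample values. Without the second `.str`, the call becomes `Series.replace`, which only replaces values that are exactly equal to `'km'`. On an integer column, as far as I know, the `.str` accessor is not available at all and raises an `AttributeError`.

```python
import pandas as pd

# Made-up sample data, only for illustration.
s = pd.Series(['150,000km', 'km'])

# Second call is Series.replace: it matches whole values, not substrings.
without_str = s.str.replace(',', '', regex=False).replace('km', '')

# Second call is .str.replace: it matches substrings inside each value.
with_str = s.str.replace(',', '', regex=False).str.replace('km', '', regex=False)

# The .str accessor only works on string data, so an integer column fails.
ints = pd.Series([150000, 5000])
try:
    ints.str.replace(',', '', regex=False)
    int_error = None
except AttributeError as exc:
    int_error = str(exc)

# Converting to strings first makes the accessor usable.
ints_as_text = ints.astype(str).str.replace('0', '', regex=False)
```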
For your current purpose, the answer given by @jithins123 is the best way. But another thing you may want to note is that you can pass multiple conditions to replace by joining them with | (the regex "or"):
a_dataframe['a_column'] = a_dataframe.a_column.str.replace(",|km", "", regex=True)
So it will replace both ',' and 'km' with an empty string.
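If the list of things to remove grows, one way to build the pattern is to join the pieces with `|` yourself. This is just a sketch: the `tokens` list and the sample data are my own, and `re.escape` is there as a precaution in case a token contains regex special characters such as `.` or `$`.

```python
import re
import pandas as pd

# Made-up sample data, only for illustration.
a_dataframe = pd.DataFrame({'a_column': ['150,000km', '5,000km']})

# Literal substrings to remove (hypothetical list).
tokens = [',', 'km']

# Escape each token and join with | so the regex matches any one of them.
pattern = '|'.join(re.escape(t) for t in tokens)

a_dataframe['a_column'] = a_dataframe['a_column'].str.replace(pattern, '', regex=True)
```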