I have a dataframe column containing strings like this: 150,000km
To remove the comma and the km, I do:
a_dataframe['a_column'] = a_dataframe['a_column'].str.replace(',', '')
a_dataframe['a_column'] = a_dataframe['a_column'].str.replace('km', '')
Is there a more elegant way to do this in a single line of code?
2 Likes
a_dataframe['a_column'] = a_dataframe['a_column'].str.replace(',', '').str.replace('km', '')
1 Like
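Also, I recently learned from @veratsien that you can remove all the non-digit characters using this trick:
autos['odometer'] = autos['odometer'].str.replace(r'\D', '', regex=True)
This replaces every non-digit character with an empty string, so only the digits remain. Passing regex=True matters on newer pandas versions, where str.replace treats the pattern as a literal string by default. This is super cool.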
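Here is a small sketch of the \D approach on made-up data. The sample values and the final astype(int) step are my own assumptions, not part of the original column:

```python
import pandas as pd

# Hypothetical sample data standing in for the real odometer column.
autos = pd.DataFrame({'odometer': ['150,000km', '5,000km', '80km']})

# \D matches any non-digit character; regex=True makes pandas treat it as a pattern.
digits_only = autos['odometer'].str.replace(r'\D', '', regex=True)

# Once only digits remain, the column can be converted to integers.
cleaned = digits_only.astype(int)
print(cleaned.tolist())
```

Keep in mind that \D also drops minus signs and decimal points, so this only fits columns holding non-negative whole numbers.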
5 Likes
Or using a comprehension:
your_string = '150,000km'
''.join(c for c in your_string if c not in ',km')
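To run the comprehension over a whole column, one option is to wrap it in a function and use Series.map. This is just a sketch: strip_chars and the sample values are hypothetical names and data I made up for illustration.

```python
import pandas as pd

def strip_chars(text, unwanted=',km'):
    # Drops each unwanted character individually, not the substring 'km' as a unit.
    return ''.join(c for c in text if c not in unwanted)

# Hypothetical sample column.
s = pd.Series(['150,000km', '7,200km'])
cleaned = s.map(strip_chars)
print(cleaned.tolist())
```

Because the filter works character by character, any stray 'k' or 'm' elsewhere in the string is removed too, which is fine for values like 150,000km but may surprise you on free text.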
1 Like
Thanks guys, I tried the str.replace().str.replace(), but did not realize it did not work because I was running it on integers…
1 Like
Yea, you probably missed the .str for the second one. It happened to me as well.
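For anyone hitting the same thing, here is a rough sketch of both pitfalls as I understand them: leaving out .str, and calling .str on an integer column. The sample data is made up.

```python
import pandas as pd

s = pd.Series(['150,000km'])

# Series.replace (no .str) matches whole cell values, so a substring is left untouched.
whole_value = s.replace('km', '')
# Series.str.replace works inside each string.
substring = s.str.replace('km', '', regex=False)

# On an integer column the .str accessor is unavailable and raises AttributeError.
ints = pd.Series([150000, 5000])
try:
    ints.str.replace(',', '')
    accessor_failed = False
except AttributeError:
    accessor_failed = True

# Casting to str first makes the string methods available.
as_text = ints.astype(str).str.replace(',', '', regex=False)
```

So if the column is already numeric there is nothing to strip, and if it is mixed, casting with astype(str) before the replace is one way around the error.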
Hello @nicolas_mtl,
For your current purpose, the answer given by @jithins123 is the best way. But another thing you may want to note is that you can pass several patterns to replace at once by joining them with the | (pipe) character, which means "or" in a regular expression.
a_dataframe['a_column'] = a_dataframe.a_column.str.replace(",|km", "", regex=True)
So it will replace both ',' and 'km' with "" (an empty string).
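If the list of things to remove grows, the pattern can be built from a list instead of typed by hand. A minimal sketch, assuming the tokens should be matched literally (hence re.escape); the tokens list and sample data are my own:

```python
import re
import pandas as pd

# Hypothetical sample data.
a_dataframe = pd.DataFrame({'a_column': ['150,000km', '20,500km']})

# Escape each token so characters such as '.' or '+' are not read as regex syntax,
# then join the tokens with '|' so the pattern matches any one of them.
tokens = [',', 'km']
pattern = '|'.join(re.escape(t) for t in tokens)

a_dataframe['a_column'] = a_dataframe['a_column'].str.replace(pattern, '', regex=True)
print(a_dataframe['a_column'].tolist())
```

Unlike the character-by-character approach, this removes 'km' only where the two letters appear together.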
4 Likes