Is there an elegant way to avoid calling the str.replace() method twice on the same column?

I have a dataframe column containing strings like this: 150,000km
To remove the comma and the km, I do:

a_dataframe['a_column'] = a_dataframe['a_column'].str.replace(',', '')
a_dataframe['a_column'] = a_dataframe['a_column'].str.replace('km', '')

Is there a more elegant way to do this in a single line of code?

2 Likes

a_dataframe['a_column'] = a_dataframe['a_column'].str.replace(',', '').str.replace('km', '')

1 Like

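Also, I recently learned from @veratsien that you can remove all the non-digit characters using this trick:

autos['odometer'] = autos['odometer'].str.replace(r'\D', '', regex=True)

This replaces all non-digit characters with an empty string. Passing regex=True matters in recent pandas versions, where the pattern is otherwise treated as a literal string. This is super cool.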
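
To see what the \D pattern does on its own, here is a minimal sketch using Python's standard re module instead of pandas. The variable names are made up for illustration; the sample value is the one from the question.

```python
import re

sample = '150,000km'  # example value from the question

# \D matches any single character that is not a digit,
# so substituting '' deletes the comma and the letters.
digits_only = re.sub(r'\D', '', sample)

# The result is still a string; convert it if a number is needed.
as_number = int(digits_only)
```

Note that \D would also strip a minus sign or a decimal point, so this is only a good fit when the column holds plain non-negative whole numbers.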

5 Likes

Or using a comprehension (this drops the individual characters ',', 'k' and 'm' wherever they appear):

your_string = '150,000km'
''.join(c for c in your_string if c not in ',km')
1 Like

Thanks guys, I tried the str.replace().str.replace(), but did not realize it did not work because I was running it on integers…

1 Like

Yea, you probably missed the .str for the second one. It happened to me as well.
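
Here is a small sketch of both pitfalls from the last two posts, using made-up Series that are not from the thread. It assumes a reasonably recent pandas; the variable names are hypothetical.

```python
import pandas as pd

# Hypothetical sample data, not from the original thread.
s = pd.Series(['150,000km', '75,000km'])

# Forgetting the second .str calls Series.replace, which only matches
# whole cell values, so the 'km' suffix is silently left in place.
missed = s.str.replace(',', '').replace('km', '')

# With .str on both calls, each replace works on substrings.
cleaned = s.str.replace(',', '').str.replace('km', '')

# On an integer column, the .str accessor is not available at all.
ints = pd.Series([150000, 75000])
try:
    ints.str.replace(',', '')
    str_accessor_failed = False
except AttributeError:
    str_accessor_failed = True
```

If the column is already numeric, there is nothing to strip; if it is a mix, converting with .astype(str) first is one option.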

Hello @nicolas_mtl,

For your current purpose, the answer given by @jithins123 is the best way. But there is another thing you may want to note:

You can pass multiple patterns to replace by joining them with the | (pipe) character.

a_dataframe['a_column'] = a_dataframe.a_column.str.replace(",|km", "", regex=True)

So it will replace both ',' and 'km' with an empty string. The pattern is read as a regular expression, which is why regex=True is passed.
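
As a rough end-to-end sketch of the pipe approach, assuming a small made-up dataframe and that the cleaned values should end up as integers (the conversion step is an addition, not part of the answer above):

```python
import pandas as pd

# Hypothetical dataframe for illustration.
a_dataframe = pd.DataFrame({'a_column': ['150,000km', '75,000km']})

# ',|km' is a regular expression meaning "a comma OR the text km".
# regex=True is passed explicitly because newer pandas versions
# treat the pattern as a literal string by default.
a_dataframe['a_column'] = (
    a_dataframe['a_column']
    .str.replace(',|km', '', regex=True)
    .astype(int)
)
```

Unlike the character-by-character approaches, the alternation only removes 'km' as a whole, so a stray 'k' or 'm' elsewhere in a value would be left alone.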

4 Likes