Apply method faster than series.str method for string manipulation

Hi all,

Screen Link:

For the CurrencyUnit problem, I used the apply method with a function that selects the word at position -1 and timed it. It took 991 microseconds, while the series.str method took 1.73 milliseconds, even though we were told that series.str is faster than apply.

Does anybody have any insight on this?

My Code:

#Apply method:
def extract(element):
    word = element.split()[-1]  # last whitespace-separated word
    return word

%time merged["CurrencyUnit"]=merged["CurrencyUnit"].apply(extract)


#Series.str method:
%time merged['CurrencyUnit']=merged['CurrencyUnit'].str.split().str.get(-1)

What I expected to happen:
I expected series.str to be faster

What actually happened:
Apply method turned out to be faster
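
A single `%time` run can be noisy, and in the code above the apply line overwrites the column before the series.str line runs, so the two timings do not see the same input. Here is a rough sketch of a more repeatable comparison. The `currency_unit` data and the helper names `with_apply` and `with_str` are made up for illustration, since the real `merged` DataFrame is not shown, and the numbers will vary by machine and pandas version.

```python
import timeit

import pandas as pd

# Hypothetical stand-in for merged["CurrencyUnit"]; the real column is not shown in the post.
currency_unit = pd.Series(["Afghan afghani", "Albanian lek", "US dollar", "Euro"] * 2_500)


def extract(element):
    # Return the last whitespace-separated word of one string.
    return element.split()[-1]


def with_apply(series):
    # Element-wise Python function call via Series.apply.
    return series.apply(extract)


def with_str(series):
    # Same result through the .str accessor.
    return series.str.split().str.get(-1)


# Both approaches should produce the same values on the same, unmodified input.
assert with_apply(currency_unit).tolist() == with_str(currency_unit).tolist()

# Run each approach several times and keep the best run to reduce noise.
runs = 5
apply_best = min(timeit.repeat(lambda: with_apply(currency_unit), number=runs, repeat=3))
str_best = min(timeit.repeat(lambda: with_str(currency_unit), number=runs, repeat=3))

print(f"apply:      {apply_best / runs * 1e3:.2f} ms per call")
print(f"series.str: {str_best / runs * 1e3:.2f} ms per call")
```

Neither call assigns back to the column, so both are timed on identical data.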

![apply method for string manipulation|690x286](upload://dlAnZYdSpkS8S7wFbWMgpaagjfd.png) ![series str method|690x280](upload://fGOhBO4PTafWLvUtCr9hQ1yB1Cw.png) 

Hi @immanuel.ajay.r

The answer is: vectorization

First, let’s define vectorization: it is the process of executing operations on entire arrays. In other words, instead of performing an operation line by line (like using .str.split() and so on), it performs the operation on the entire dataset at once.
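
As a small illustration of that definition, here is a minimal sketch on numeric data. The `prices` Series is made up for the example, and it only shows the general idea of a whole-array operation versus a per-element function call. It does not say anything about how the string methods are implemented internally.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric column, used only to illustrate the idea.
prices = pd.Series(np.arange(1_000, dtype="float64"))

# Element by element: a Python-level function call for every row.
doubled_loop = prices.apply(lambda x: x * 2)

# Vectorized: one expression applied to the whole array at once.
doubled_vectorized = prices * 2

# Both give the same result; only the way the work is done differs.
assert doubled_loop.equals(doubled_vectorized)
```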

These two articles might help you understand more about it

Yeah, but series.str is the one that is supposed to use vectorization, and hence be faster, not apply().

But I did find some info here

It’s not clear-cut why, but the discussion hints at what may be causing apply() to run faster than series.str.