Apply method faster than series.str method for string manipulation

I used the apply method with a function to select the last word (position -1) of each value in the CurrencyUnit column and timed it. It took 991 microseconds, while the Series.str method took 1.73 milliseconds, even though I was told that Series.str is faster than apply.

Does anybody have any insight on this?

My Code:

#Apply method:
def extract(element):
    word = element.split()[-1]  # last whitespace-separated word
    return word

%time merged["CurrencyUnit"]=merged["CurrencyUnit"].apply(extract)


#Series.str method:
%time merged['CurrencyUnit']=merged['CurrencyUnit'].str.split().str.get(-1)

What I expected to happen:
I expected series.str to be faster

What actually happened:
Apply method turned out to be faster

![apply method for string manipulation|690x286](upload://dlAnZYdSpkS8S7wFbWMgpaagjfd.png) ![series str method|690x280](upload://fGOhBO4PTafWLvUtCr9hQ1yB1Cw.png)

Hi @immanuel.ajay.r

The answer is: vectorization.

First, let’s define vectorization: it is the process of executing an operation on an entire array at once. In other words, instead of performing an operation row by row (like using .str.split() and so on), it performs the operation on the whole dataset.

These two articles might help you understand more about it.
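As a rough illustration of that definition, here is a minimal sketch using NumPy on made-up numbers (the `prices` array is hypothetical and not part of the original data). It only shows what "operating on the whole array" means for numeric data and makes no claim about how pandas handles strings internally.

```python
import numpy as np

# Made-up values, purely for illustration.
prices = np.array([1.5, 2.0, 3.25, 4.0])

# Element by element: a Python-level loop touches each value in turn.
doubled_loop = np.array([p * 2 for p in prices])

# Vectorized: one expression applied to the whole array;
# the looping happens inside NumPy's compiled code.
doubled_vec = prices * 2

# Both approaches produce the same values.
print(np.array_equal(doubled_loop, doubled_vec))
```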
Yeah, but Series.str is supposed to use vectorization and hence be faster, not apply().

But I did find some info here.

It’s not clear-cut as to why, but the discussion hints at what may be causing apply() to run faster than Series.str.
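For anyone who wants to reproduce the comparison, here is a minimal sketch under some assumptions: the `units` Series is a made-up stand-in for `merged["CurrencyUnit"]` (the real data is not shown in this thread), and `timeit` with repeated runs replaces the single-run `%time`, since one run can be noisy. Each approach also reads from the same untouched Series rather than overwriting the column between measurements. One commonly cited explanation for results like this is that, on object-dtype columns, the `.str` accessor still processes Python strings one at a time and adds missing-value handling on top, but treat that as a hypothesis to check rather than a settled answer.

```python
import timeit

import pandas as pd

# Hypothetical stand-in for merged["CurrencyUnit"]; values are made up.
units = pd.Series(["United States dollar", "Indian rupee", "Euro"] * 1000)


def extract(element):
    # Last whitespace-separated word of the string.
    return element.split()[-1]


# Compute both results from the same input so the comparison is fair.
via_apply = units.apply(extract)
via_str = units.str.split().str.get(-1)

# Sanity check: both approaches return the same words.
assert via_apply.tolist() == via_str.tolist()

# Repeat each measurement several times instead of timing a single run.
apply_time = timeit.timeit(lambda: units.apply(extract), number=20)
str_time = timeit.timeit(lambda: units.str.split().str.get(-1), number=20)

print(f"apply: {apply_time:.4f}s  .str: {str_time:.4f}s")
```

The timings will vary with the pandas version, the column dtype, and the data size, so it is worth running this against the real column before drawing conclusions.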