Filter on Delta between two Dates

Hi all,

I want to filter my DataFrame in pandas based on the delta between two columns. The logic is: Answer_Time (column D) has to be 6 hours or more after Send_Time (column C), i.e. Answer_Time - Send_Time >= 6 hours.

Now I’m looking for a way to convert all dates in my DataFrame into the same format. I have imported locale, but I can’t convert multiple date columns with one statement.

What is the best practice here?
I attached the file if it helps:

https://community.dataquest.io/uploads/short-url/5NoD1UtW4kCNzgvVxhMiSmsHArq.xlsx

What is the best way to filter based on the date delta?

It’s a little bit tricky to work with dates imported from an Excel file. Thanks as always, :slight_smile:
Simo

@ Simo
I think you can add a new column containing the time difference between the answer time and the send time, then filter the dataframe using a condition on that new column.
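A minimal sketch of that idea, assuming the columns are named Send_Time and Answer_Time as in the question. The sample values and the names Delta, date_cols and late_answers are made up for illustration; the real data would come from the attached Excel file.

```python
import pandas as pd

# Hypothetical sample data standing in for the Excel file.
df = pd.DataFrame({
    "Send_Time": ["2020-01-01 08:00", "2020-01-01 09:00", "2020-01-02 10:00"],
    "Answer_Time": ["2020-01-01 15:30", "2020-01-01 10:00", None],
})

# Convert several date columns in one statement.
# errors="coerce" turns values that cannot be parsed into NaT.
date_cols = ["Send_Time", "Answer_Time"]
df[date_cols] = df[date_cols].apply(pd.to_datetime, errors="coerce")

# New column with the time difference (NaT where there is no answer yet).
df["Delta"] = df["Answer_Time"] - df["Send_Time"]

# Keep rows answered 6 hours or more after sending.
# A comparison against NaT is False, so unanswered rows drop out here.
late_answers = df[df["Delta"] >= pd.Timedelta(hours=6)]
```

Comparing against pd.Timedelta(hours=6) keeps the condition in time units, so there is no need to convert the difference to a plain number first.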

Hi @raisa.jerin.sristy79, thanks for your idea.
This is certainly a valuable idea; I will try it this way.

Best,
Simo

@ Simo
It’s always my pleasure :slight_smile: :heart: :heart: :heart:

I have to filter column D for NaNs (some answers haven’t been sent yet, and it doesn’t make sense to use Answer_Time for further analysis against Send_Time if there is no answer yet).

This is not working:

def calculation(row):
    if row.Answer_Time == 'NaN':
        return 0
    else:
        return 1
df['New_Column'] = df.apply(calculation, axis=1)

Why can’t I use an IF statement to check for NaNs?
This code doesn’t recognize the NaNs - it returns a 1 in every row, as if there were no NaNs… :frowning:

Thanks for any help/idea

Simo

@ Simo
You’ve used the comparison operator “==” to check for NaN, and compared against the string 'NaN'.
math.isnan is used to check whether a certain variable is NaN or not. We cannot use the regular comparison operator, ==, to check for NaN. NaN is not equal to anything (not even itself!).
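Here is a small sketch of the point above, reusing the calculation function from the question. It swaps in pd.isna instead of math.isnan, because math.isnan only accepts numbers, while a missing value in a date column is usually NaT; that swap and the sample data are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# NaN is not equal to anything, including itself and the string 'NaN'.
print(np.nan == np.nan)   # False
print(np.nan == "NaN")    # False

# Hypothetical sample data; the second answer has not been sent yet.
df = pd.DataFrame({
    "Answer_Time": pd.to_datetime(["2020-01-01 15:30", None, "2020-01-02 11:00"]),
})

def calculation(row):
    # pd.isna is True for NaN, None and NaT alike.
    if pd.isna(row["Answer_Time"]):
        return 0
    return 1

df["New_Column"] = df.apply(calculation, axis=1)
```

With this check the unanswered row gets a 0 and the answered rows get a 1.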

Hi @raisa.jerin.sristy79, thanks for clarifying.
I played around a bit with the other columns and the filtering went very well. Now I know that NaN is a special case :).
A NaN entry looks like an ordinary string, but now I understand that it isn’t one.

But: in the tutorial you attached, he cleans the DataFrame at the very beginning. I want to filter based on whether there is a NaN or not, using an IF statement. Do you have an idea how I can use my IF statement above to filter for NaN directly?

Thanks,
Simo

@ Simo
You can use dataframe.isnull() as the condition: it is True where a value is NaN and False where it is not, so it replaces the == check (the extra == True or == False is not needed).

Another way is to filter the rows directly. To drop the NaN values and keep only the rows where the column is filled:
dataframe_name.loc[pd.notnull(dataframe_name.column_name)]
To keep only the rows where the column is NaN:
dataframe_name.loc[pd.isnull(dataframe_name.column_name)]
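The two filters above can be sketched on the columns from the question like this. The sample values and the names answered and unanswered are hypothetical; the last line shows one possible way to build the 0/1 column without an IF statement.

```python
import pandas as pd

# Hypothetical sample data; the second row has no answer yet.
df = pd.DataFrame({
    "Send_Time": pd.to_datetime(
        ["2020-01-01 08:00", "2020-01-01 09:00", "2020-01-02 10:00"]
    ),
    "Answer_Time": pd.to_datetime(["2020-01-01 15:30", None, "2020-01-02 11:00"]),
})

# Rows that have an answer (missing values removed).
answered = df.loc[df["Answer_Time"].notna()]

# Rows that are still waiting for an answer.
unanswered = df.loc[df["Answer_Time"].isna()]

# The 0/1 flag from the question, computed for the whole column at once.
df["New_Column"] = df["Answer_Time"].notna().astype(int)
```

Working on the whole column at once is usually faster than df.apply with axis=1, and it avoids the row-by-row IF statement entirely.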