Formatting all Dates

Hi all,

I’m stuck at one point and hope somebody can help me.
I have differently formatted dates in one column that I imported into pandas.
What is the easiest or cleverest way to convert all dates to one standardized format?

I have this here:

24025       2017-10-05 14:39:07
24026       2017-08-05 08:37:58
24027      4/21/2017 8:48:53 AM
24028     4/19/2017 12:23:35 PM
24029      4/19/2017 9:04:31 AM
24030      4/19/2017 8:44:59 AM

And I need all dates in the column in one standard format, e.g. 01-12-2020 10:44 PM (without seconds).
Do I have to loop over the dates and format them row by row?
Or is there a smarter way to do this?

All dates are of type object, I checked.

Thanks in advance,
Simo


Try pd.to_datetime:

>>> pd.to_datetime(df.iloc[:, 1]).dt.strftime('%d-%m-%Y %I:%M %p')
0    05-10-2017 02:39 PM
1    05-08-2017 08:37 AM
2    21-04-2017 08:48 AM
3    19-04-2017 12:23 PM
4    19-04-2017 09:04 AM
5    19-04-2017 08:44 AM
Name: b, dtype: object
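
One detail about the hour directive in format strings like this one: %H is the 24-hour clock and %I is the 12-hour clock, and the AM/PM marker %p is meant to be paired with %I. A minimal sketch using only the standard library (the timestamp is taken from the sample data in the question):

```python
from datetime import datetime

ts = datetime(2017, 10, 5, 14, 39, 7)

# %H is the 24-hour clock, so pairing it with %p gives a contradictory result
mixed = ts.strftime("%d-%m-%Y %H:%M %p")

# %I is the 12-hour clock and is the directive that belongs with %p
twelve_hour = ts.strftime("%d-%m-%Y %I:%M %p")

print(mixed)        # 05-10-2017 14:39 PM
print(twelve_hour)  # 05-10-2017 02:39 PM
```

The same directives apply to the pandas .dt.strftime accessor.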

Hi @DishinGoyani thanks for this.
I have different columns with different data types in my DataFrame.
Your code tries to format everything, right?
Then it will run into an error if it tries to format something that is not a date.

Can I somehow specify exactly which columns should be formatted?
I would pick the 4 columns of the DataFrame which are dates.

Thanks,
Simo

No, it will only format the second column of the DataFrame, because we used df.iloc[:, 1] (this means all rows of the second column).

You can use df.iloc[:, 3] for the fourth column.

So it would be

>>> pd.to_datetime(df.iloc[:, 3]).dt.strftime('%d-%m-%Y %I:%M %p')

You can just use the column name if you have one. I used iloc because I could not tell from the sample data what your column names are.

With a column name (e.g. if it is "date_column_name") it would be

>>> pd.to_datetime(df.date_column_name).dt.strftime('%d-%m-%Y %I:%M %p')
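
If several named columns hold dates, one option is to list them and convert only those, leaving the other columns untouched. This is a rough sketch: the frame, the column names ("created", "closed", "status") and the single input format are all assumptions for illustration, not your real data.

```python
import pandas as pd

# Hypothetical sample frame; only "created" and "closed" hold dates
df = pd.DataFrame({
    "id": [24028, 24029],
    "created": ["4/19/2017 12:23:35 PM", "4/19/2017 9:04:31 AM"],
    "status": ["Yes", "No"],
    "closed": ["4/21/2017 8:48:53 AM", "4/22/2017 1:14:49 PM"],
})

date_cols = ["created", "closed"]  # assumed names, replace with your own

for col in date_cols:
    # Parse with an explicit input format, then write the wanted output format
    parsed = pd.to_datetime(df[col], format="%m/%d/%Y %I:%M:%S %p")
    df[col] = parsed.dt.strftime("%d-%m-%Y %I:%M %p")

print(df)
```

Columns that are not in date_cols (here "id" and "status") are never passed to pd.to_datetime, so they cannot trigger a parsing error.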

Hi @DishinGoyani thank you so much, it worked well :slight_smile:
But tell me please: what if I want to format the 3rd to the 9th column?

This here doesn’t work:
pd.to_datetime(df.iloc[2, 8]).dt.strftime('%d-%m-%Y %I:%M %p')
I picked 2 as the start point because we count from zero here, right?

Simo


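You can select the third to ninth columns with iloc[:, 2:9] and convert each one with apply:

 df.iloc[:, 2:9].apply(lambda col: pd.to_datetime(col).dt.strftime("%d-%m-%Y %I:%M %p"))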

I would suggest going through the iloc indexing examples in the docs (pd.DataFrame.iloc) first, so you get an idea of how the indexing works.
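
To make the difference between the iloc forms concrete, here is a small sketch on a made-up 12-row, 10-column frame (the cell values are just placeholders showing row and column position):

```python
import pandas as pd

# Hypothetical frame: cell "r2c8" sits in row index 2, column index 8
df = pd.DataFrame([[f"r{r}c{c}" for c in range(10)] for r in range(12)])

single_cell = df.iloc[2, 8]    # one value: 3rd row, 9th column
row_slice = df.iloc[2:9]       # rows 3 to 9, all columns
col_slice = df.iloc[:, 2:9]    # all rows, columns 3 to 9 (stop index 9 is excluded)

print(single_cell)       # r2c8
print(row_slice.shape)   # (7, 10)
print(col_slice.shape)   # (12, 7)
```

So df.iloc[2, 8] returns a single cell, which has no .dt accessor, and df.iloc[:, 2:9] is the form that picks a block of columns.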

Hope it helps! :slight_smile:

Hi @DishinGoyani thank you so much for your help.
I think I now have a better understanding and can do the formatting on my own in the future :).

I get two errors for two columns and I started to figure out the source of the problem:

Error message: Unknown string format: Yes
-> I checked the column where this error message shows up. The dates are in mixed formats in one column, e.g. 11/13/2020 9:13:47 AM & 11.12.2020 10:47
How can I cover these two format styles in my statement?

Error message: mixed datetimes and integers in passed array
-> I also checked this column and it seems to be the same issue: two date styles in one column, e.g. 11/27/2020 1:14:49 PM & 11.12.2020 12:13

But I also have mixed dates in the other columns without any error messages. This is very strange. What can I do so that I can work with these two columns?

Thanks in advance as always,
Simo

So what is the expected output? Which one do you want to keep? If you want only one of them, we can clean the data by removing the other before converting to datetime type.
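
Before cleaning, it may help to see exactly which values do not parse. A message like "Unknown string format: Yes" suggests that the column also contains a non-date value such as "Yes". This is a sketch on a made-up column; the input format is an assumption based on the examples you posted.

```python
import pandas as pd

# Hypothetical column mixing two date styles and a stray non-date value
col = pd.Series([
    "11/13/2020 9:13:47 AM",
    "11.12.2020 10:47",
    "Yes",
    "11/27/2020 1:14:49 PM",
])

# errors="coerce" turns everything that does not match the format into NaT
us_style = pd.to_datetime(col, format="%m/%d/%Y %I:%M:%S %p", errors="coerce")

# Rows that failed to parse, shown with their original text
unparsed = col[us_style.isna()]
print(unparsed)
```

The rows left in unparsed are the ones that need cleaning or a second format.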


Hi @DishinGoyani thanks for your fast reply.
Now I understand what you mean.
I would like to keep the 11/13/2020 format.

How can I do the data cleaning?

EDIT: But I will need to keep the other dates, too. So did I understand you correctly that we will bring them all into one format?

Thanks.

# df sample
       a                               date_column
0  24028                     4/19/2017 12:23:35 PM
1  24029                      4/19/2017 9:04:31 AM
2  24030  11/13/2020 9:13:47 AM & 11.12.2020 10:47

# Remove the `&` and everything after it (regex=True is needed in newer pandas versions)
df["date_column"] = df.date_column.str.replace(r"\s*&.*", "", regex=True)

pd.to_datetime(df.date_column).dt.strftime('%d-%m-%Y %I:%M %p')
0    19-04-2017 12:23 PM
1    19-04-2017 09:04 AM
2    13-11-2020 09:13 AM
Name: date_column, dtype: object

But that only covers the case where you want to keep one of the dates.

If you want to keep both, we cannot store both dates in the same column when converting to datetime type, so the second date has to go into a new column.
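
If the two styles are instead in different rows of the same column, one possible approach is to parse the column once per format and combine the results. This is only a sketch: the sample values come from this thread, and reading 11.12.2020 as day.month.year is an assumption you would need to confirm for your data.

```python
import pandas as pd

# Hypothetical column with one date per row, in two different styles
col = pd.Series([
    "11/13/2020 9:13:47 AM",
    "11.12.2020 10:47",
    "11/27/2020 1:14:49 PM",
])

# Parse each style separately; rows that do not match become NaT
us_style = pd.to_datetime(col, format="%m/%d/%Y %I:%M:%S %p", errors="coerce")
# Assumption: the dotted style is day.month.year with a 24-hour clock
dotted = pd.to_datetime(col, format="%d.%m.%Y %H:%M", errors="coerce")

# Take the first parse where it worked, otherwise the second one
parsed = us_style.fillna(dotted)

# Write every row in the 11/13/2020 style, without seconds
formatted = parsed.dt.strftime("%m/%d/%Y %I:%M %p")
print(formatted)
```

Any row that matches neither format stays NaT in parsed, so it is easy to find afterwards.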


@DishinGoyani Thank you so much :slight_smile:

Hi @DishinGoyani sorry, but I have to open this topic again.
Maybe you can help me a little further with working with dates.

The situation:
In my Excel file all dates are in the format MM/DD/YYYY H:M:S and I need them in the format DD.MM.YYYY H:M in my DataFrame.

You showed me that I can use pd.to_datetime with strftime and convert the dates as needed. This works fine, but it creates strings and I need them as datetime objects.

How can I convert them without turning my timestamps into strings?
Or do I need to convert them to strings to format them and then convert them back to datetime objects?
I tried to convert them directly with regex but it doesn’t work.
I hope you have a solution or an idea.

Thank you in advance.

Best,
Simo

PS: Of course all other users are welcome to help too :) Thanks!

I don’t think a custom display format is possible for datetime objects in a DataFrame. You have to convert them to strings for that.
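
One way to live with this is to keep the column as real datetimes inside the DataFrame and apply the DD.MM.YYYY H:M format only when displaying or exporting. A minimal sketch, assuming a column called "opened" (a made-up name) and the input format from your Excel file:

```python
import io
import pandas as pd

# Hypothetical column as it might arrive from the Excel file
df = pd.DataFrame({"opened": ["11/13/2020 9:13:47 AM", "11/27/2020 1:14:49 PM"]})

# Keep real datetimes in the frame so sorting and date arithmetic keep working
df["opened"] = pd.to_datetime(df["opened"], format="%m/%d/%Y %I:%M:%S %p")

# Dropping the seconds is possible without leaving the datetime type
df["opened_minute"] = df["opened"].dt.floor("min")

# Format only for display ...
as_text = df["opened"].dt.strftime("%d.%m.%Y %H:%M")

# ... or at export time, here into an in-memory CSV
buffer = io.StringIO()
df[["opened"]].to_csv(buffer, index=False, date_format="%d.%m.%Y %H:%M")
csv_text = buffer.getvalue()

print(df.dtypes)
print(as_text)
print(csv_text)
```

With this split there is no need to convert to strings and back again: the frame keeps datetime values, and the strings exist only in the output.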


Ok, I will convert to string, do my formatting and convert back to date.
Thanks :slight_smile: