Hi guys
I’m working again in a personal project and i have some files in excel that i open in pandas to transform them into dataframes. There’s a column that is a date in day-month-year hour format as you can see in the image
So i use the next code to first transform the hour 24:00 into 00:00 because pandas only works from 00:00 to 23:00, then i transform them into datetime type because when i create the dataframe pandas don’t recognize the formart and says that it’s a object type. And do other stuff like extracting the day and month and combining them together
It works so far but then i found something really weird, the first 12 days have this format year-day-month hour but then at day 13 it changes into year-month-day hour and later changes again into the first one
At this point i don’t know what to do. I’ve been reading but i’m not getting any clue and to be honest this is one of my biggest flaws. I don’t know how to work well with datetimes
This is 1 way to demonstrate your problem on a small scale.
How did you read in the data?
What were the parameter settings?
Here is a good summary of some issues with pandas which depends on dateutil for date parsing and the problems with dayfirst, yearfirst, default american date format parsing, british date format that people expect as the default, their non-strict behavior (meaning no Error raised, but falling back to a other ways of parsing). Using format parameter with pd.to_datetime is good for both specifying exactly how you want it parsed, and for errors to be raised if you have bad data that doesn’t match your specified format. I expect index 288 to raise error if you used format
@DishinGoyani Thank you so much! it worked. However the date wasn’t transformed into the format specified in pd.to_datetime. I mean if i understand correctly, the date format should have change right?
@hanqi I tried to replicate your code but to be honest i don’t understand it. However the post linked it’s really useful
I use this to read the data i don’t know if i could have save time and problem with another parameter