Changing timedelta format - datetime should be visible as (%Y-%B-%d) - %B = monthname

Hello, I’m trying to change the date format to %Y-%B-%d. The idea is to show the month as a word instead of a number.
Keep in mind that in the dataset the values in the ‘arrival_month’ column are month names, not numbers. I tried converting the values of all the columns I need to strings and then concatenating them into a full date. I expected the end result to be in the format %Y-%B-%d, but it ended up being all numeric.

Then I concatenated the 3 columns into a full date string with the month in %B format, and used a for loop with strptime to convert it back to a date object and extract the month, but it didn’t work.

Help please! Thank you

hi @nelsonlupanda

Welcome to DQ community!

At the outset, I would request that you follow Introducing guidelines for all technical questions in our Community. It is very difficult to work with a snapshot of the code instead of the actual code, especially with no sample data: we are forced to make up our own data, which may or may not be helpful for the question asked.

Before I try to answer your question, I have a few questions of my own:

  • what’s the purpose of dt.obj, which is causing the type error here? I tried a plain obj in the for loop and it works. It looks as if you are trying to assign something to an attribute of the datetime module.
  • you have already converted the room_booking_date series in cell 44 (line 2), so why is the date column processed again in the for loop?
  • do you want to store the date as a datetime object and present it in a month-name format, or do you want to store the data with the month name as well?

(There was no info on the “lead_time” series, so I don’t understand that part and I am excluding it here.)

Below is the code that I worked on to understand your question:

from datetime import datetime as dt
import pandas as pd

arrival_year = pd.Series([2019, 2019, 2019])
arrival_month = pd.Series(["March", "March", "March"])
arrival_day = pd.Series([11, 12, 13])

arrival = arrival_year.astype(str) +"-"+ arrival_month.astype(str) +"-"+ arrival_day.astype(str)
print(arrival, "\n")

df = pd.DataFrame(data = {"year": arrival_year, "month": arrival_month, 
                          "day": arrival_day, "date_str": arrival})

df["date_date"] = pd.Series(pd.to_datetime(arrival))
df["date_format"] = pd.Series(pd.to_datetime(arrival)).dt.strftime("%Y-%B-%d")

The result of the above code is as below:

Some info:

  • date_str is the date we get when the three separate columns are concatenated; it is stored with object dtype
  • by just applying the pd.to_datetime() method to the date_str column, it gets converted to datetime format, but yes, we lose the month name (that’s how pandas stores the datetime info)
  • by applying the dt.strftime() method we get back the date with the month name, although it is now stored as object
  • in case you want to extract only the month and store it for some analysis, you can try this code: df["date_date"].dt.strftime("%B")
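
As a small aside to the last point, pandas also has a dt.month_name() accessor that returns the month name directly. The sketch below assumes a few made-up dates like the ones above; the variable names are only for illustration.

```python
import pandas as pd

# Hypothetical sample dates, similar to the frame built above
dates = pd.Series(pd.to_datetime(["2019-03-11", "2019-03-12", "2019-03-13"]))

# dt.month_name() returns the month names as strings (English by default)
month_names = dates.dt.month_name()

# dt.strftime("%B") also returns strings, using the system locale
month_strf = dates.dt.strftime("%B")

print(month_names.tolist())
print(dates.dtype)  # the original column is still a datetime dtype
```

Either way the result is a column of strings, so the datetime column should be kept alongside it if date arithmetic is still needed.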

Hi @Rucha thanks for taking the time to help.

  • dt.obj was a typo. It should’ve been dt_obj

  • Cell 44 line 2 converts the date column into object dtype and keeps the month-name format. I used the loop to convert it back to a datetime object with strptime and keep the month name as well.

  • I want to store the date as datetime object.

The dataset is about hotel booking demand, and the ‘lead_time’ column stores the number of days between the day the room was booked and the day the guest checked in to the hotel.

I concatenated the year, month and day into a single column named ‘guest_arrival_date’, converted the values into datetime objects, then subtracted ‘lead_time’ from ‘guest_arrival_date’ to determine the exact date the room was booked. I hope that makes sense.
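
The steps just described could be sketched roughly as follows. The rows are made-up sample data, and the names arrival_year, arrival_day, room_booking_date and booking_month are assumptions for illustration; only ‘arrival_month’, ‘lead_time’ and ‘guest_arrival_date’ come from the description.

```python
import pandas as pd

# Hypothetical sample rows standing in for the hotel booking dataset
df = pd.DataFrame({
    "arrival_year": [2019, 2019],
    "arrival_month": ["March", "April"],
    "arrival_day": [11, 12],
    "lead_time": [10, 60],  # days between booking and arrival
})

# Build a date string with the month name, then parse it with an explicit format
date_str = (df["arrival_year"].astype(str) + "-"
            + df["arrival_month"] + "-"
            + df["arrival_day"].astype(str))
df["guest_arrival_date"] = pd.to_datetime(date_str, format="%Y-%B-%d")

# Turn lead_time into a timedelta (in days) and subtract it from the arrival date
df["room_booking_date"] = df["guest_arrival_date"] - pd.to_timedelta(df["lead_time"], unit="D")

# Month name of the booking date, stored as a string for display or grouping
df["booking_month"] = df["room_booking_date"].dt.month_name()
print(df[["guest_arrival_date", "room_booking_date", "booking_month"]])
```

Doing the subtraction on the datetime column and deriving the month name only at the end keeps the arithmetic working, since strings cannot be subtracted as dates.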

I thought with pd.to_datetime() we could specify the format as well. I tried it; it didn’t raise an error, but it also didn’t change anything. I still lost the month name.
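
A likely explanation, shown as a minimal sketch with made-up strings: the format argument of pd.to_datetime() describes how to read the input, and the resulting datetime values do not carry a display format. Getting the month name back means formatting to strings again.

```python
import pandas as pd

# Hypothetical sample strings with the month written as a name
raw = pd.Series(["2019-March-11", "2019-March-12"])

# format= tells pandas how to parse the strings; it is not kept on the result
parsed = pd.to_datetime(raw, format="%Y-%B-%d")
print(parsed.dtype)    # a datetime64 dtype
print(parsed.iloc[0])  # displayed in the default layout, month as a number

# To show the month name, convert the datetime values back into strings
shown = parsed.dt.strftime("%Y-%B-%d")
print(shown.tolist())
```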

I used df["date_date"].dt.strftime("%B") and it gave me the result I was looking for, so thank you very much.