Question about creating a data frame

Hi everyone,

I have a question I haven’t been able to find an answer to, so I’d appreciate any guidance.

I wrote some code to recursively scan through some folders and extract all Outlook email objects (extension ‘.msg’, to be precise). I am using the extract_msg module to get some attributes from each email object: sender, date, subject, and body. I would be very grateful if someone could modify my code to create a data frame from these attributes. I can think of a really inefficient way of doing this by appending to empty lists, but I’d imagine (I hope) there’s a more efficient way.

import os
import extract_msg

os.chdir('…/Raw')
for item in os.listdir():
    msg = extract_msg.Message(item)
    msg_sender = msg.sender
    msg_date = msg.date
    msg_subj = msg.subject
    msg_message = msg.body

@jniakamal

Hey, you can do this:

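import os
import extract_msg
import pandas as pd

os.chdir('…/Raw')
message = []  # one inner list per email

for item in os.listdir():
    msg = extract_msg.Message(item)
    msg_sender = msg.sender
    msg_date = msg.date
    msg_subj = msg.subject
    msg_message = msg.body
    # The first element is the Message object itself, followed by its attributes.
    message.append([msg, msg_sender, msg_date, msg_subj, msg_message])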

This will create a list of lists.

Then you can create a data frame like this:
df = pd.DataFrame(message, columns=['msg','msg_sender','msg_date','msg_subj','msg_message'])

You can use any column names you want when creating the data frame.
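
Since the question mentions scanning folders recursively, here is a rough sketch of a variant that walks subfolders with os.walk and collects one dict per message instead of a list of lists. pd.DataFrame also accepts a list of dicts and uses the keys as column names. The helper names (find_msg_files, message_to_row, build_rows), the extra path column, and the 'Raw' folder in the usage comment are my own placeholders, not something from the code above, and it assumes extract_msg.Message accepts a file path the way it is used in this thread.

```python
import os


def find_msg_files(root):
    """Yield the path of every .msg file under root, including subfolders."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".msg"):
                yield os.path.join(dirpath, name)


def message_to_row(path, msg):
    """Turn one parsed message into a dict; the keys become column names."""
    return {
        "path": path,  # hypothetical extra column, handy for tracing rows back to files
        "msg_sender": msg.sender,
        "msg_date": msg.date,
        "msg_subj": msg.subject,
        "msg_message": msg.body,
    }


def build_rows(root, open_message):
    """Collect one row per .msg file; open_message is e.g. extract_msg.Message."""
    return [message_to_row(p, open_message(p)) for p in find_msg_files(root)]


# Usage (needs extract_msg and pandas installed; 'Raw' is a placeholder folder):
# import extract_msg
# import pandas as pd
# df = pd.DataFrame(build_rows("Raw", extract_msg.Message))
```

Passing the opener in as an argument is just a convenience so the folder walk can be tried out without real .msg files. Appending to a plain Python list and building the data frame once at the end is generally fine performance-wise; the thing to avoid is growing a data frame row by row inside the loop.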

Thank you @jenil2452000, that’s what I was looking for!

@jniakamal,

Hey, please mark it as the solution if it answered your question! Thank you!