Ebay Car Sales - Step 5 confusion

Screen Link:
https://app.dataquest.io/m/294/guided-project%3A-exploring-ebay-car-sales-data/5/exploring-the-date-columns

I’ve been browsing the community posts about step 5 for Ebay car sales because I’ve been confused regarding the instructions. My confusion stems from the following statement in the activity description:

Right now, the date_crawled, last_seen, and ad_created columns are all identified as string values by pandas. Because these three columns are represented as strings, we need to convert the data into a numerical representation so we can understand it quantitatively.

This statement implies that part of our workflow should be converting the dates in the three columns from strings to datetime objects. However, no such instructions are provided as part of the workflow. Like many others on these forums, I was confused because it seems odd to me to generate summary data on strings rather than dates.

Could someone clarify whether we’re supposed to convert the strings in this step? Or are we supposed to follow the instructions without the data conversion? Thanks.

Converting to datetime objects would certainly be appropriate. But this project is not really concerned with time series analysis (we are only calling value_counts and sorting the values), which is why they use the string objects.

For the purpose of learning, though, you can convert to datetime objects.
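The point about value_counts can be sketched like this. Because the timestamps are stored as YYYY-MM-DD HH:MM:SS, the first 10 characters are the date, and sorting those strings alphabetically also sorts them chronologically. The small DataFrame below is made up for illustration and is not the real autos data.

```python
import pandas as pd

# Hypothetical sample rows; the real data comes from autos.csv.
autos = pd.DataFrame({
    "date_crawled": [
        "2016-03-26 17:47:46",
        "2016-03-05 09:12:01",
        "2016-03-26 08:30:15",
        "2016-04-01 22:05:40",
    ]
})

# Keep only the date part (first 10 characters) of each string.
crawl_dates = autos["date_crawled"].str[:10]

# Share of rows per day, ordered by the date string.
date_shares = (
    crawl_dates
    .value_counts(normalize=True, dropna=False)
    .sort_index()
)
print(date_shares)
```

This only works because the year comes first and every field is zero-padded. With a layout such as DD-MM-YYYY, string sorting would no longer match date order.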


How about:

from datetime import datetime as dt
import pandas as pd

date_format = '%d - %m - %y'  # placeholder; must match how the dates are stored

for col in ['date_crawled', 'last_seen', 'ad_created']:
    dataframe[col] = dataframe[col].apply(lambda x: dt.strptime(x, date_format))

Adjust accordingly.
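To make "adjust accordingly" concrete, here is a small standard-library sketch of how the format string has to mirror the stored text. The sample timestamp is the one that appears in the output later in this thread. The variable names are my own.

```python
from datetime import datetime

# One timestamp in the layout the date columns use.
sample = "2016-03-26 17:47:46"

guessed_format = "%d - %m - %y"        # day - month - 2-digit year, with spaces
matching_format = "%Y-%m-%d %H:%M:%S"  # 4-digit year-month-day hour:minute:second

# strptime raises ValueError when the text does not fit the format.
try:
    datetime.strptime(sample, guessed_format)
    guess_worked = True
except ValueError as err:
    guess_worked = False
    print("guessed format failed:", err)

# With the matching format the string parses into a datetime object.
parsed = datetime.strptime(sample, matching_format)
print(parsed.year, parsed.month, parsed.day)
```

Each directive (%Y, %m, %d and so on) and each literal character such as "-" or ":" has to line up with the input, so it is worth printing one raw value from the column before choosing the format.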


Aaaaand after an hour of debugging my own code, I had to jump through a couple of hoops to make it really work. Here is the final version:

from datetime import datetime as dt
import pandas as pd

date_format = '%Y-%m-%d %H:%M:%S'

autos = pd.read_csv('autos.csv',encoding='cp1252')

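# Parse every value in the three date columns into a datetime object.
# Note: applymap is deprecated since pandas 2.1; DataFrame.map is its replacement there.
autos[['dateCrawled', 'lastSeen', 'dateCreated']] = autos[['dateCrawled', 'lastSeen', 'dateCreated']].applymap(lambda x: dt.strptime(str(x), date_format))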

One must check one’s own work:

autos.info()

autos[['dateCrawled']].loc[0]

And it DID work at last, thank God:

dateCrawled 2016-03-26 17:47:46
Name: 0, dtype: datetime64[ns]
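For comparison, pandas also has pd.to_datetime, which parses a whole column at once and avoids the element-by-element lambda. This is only a sketch, assuming the same column names and format as above. The two-row DataFrame is made up so the snippet runs without autos.csv.

```python
import pandas as pd

date_format = "%Y-%m-%d %H:%M:%S"
date_cols = ["dateCrawled", "lastSeen", "dateCreated"]

# Hypothetical stand-in for the real autos.csv.
autos = pd.DataFrame({
    "dateCrawled": ["2016-03-26 17:47:46", "2016-04-04 13:38:56"],
    "lastSeen": ["2016-04-06 06:45:54", "2016-04-06 14:45:08"],
    "dateCreated": ["2016-03-26 00:00:00", "2016-04-04 00:00:00"],
})

# Convert each date column from strings to datetime values.
for col in date_cols:
    autos[col] = pd.to_datetime(autos[col], format=date_format)

print(autos.dtypes)

# With a datetime dtype the .dt accessor is available, e.g. rows per calendar day.
print(autos["dateCrawled"].dt.date.value_counts().sort_index())
```

Passing the format explicitly keeps the parsing strict, so a value that does not fit raises an error instead of being guessed at.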
