Hello guys, thanks for viewing this topic.
I am a fourth-year student majoring in software engineering, but I want to change my career to data analyst. I don't have any experience analyzing data yet.
Please share your own observations so we can discuss what we found. Thank you.
For me, after running this code
autos['last_seen'].str[:10].value_counts(normalize=True, dropna=False).sort_index(ascending=True)
on 3 columns, my observations are:
- Data from before 03-2016 looks like outliers, because I think those ads have not been updated for a while (for example, the advertiser sold the car but did not delete the post).
- It seems like ads on this website work well: the cars have probably been sold, because the ads are no longer on the site.
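To make the date check easier to repeat, here is a minimal sketch of the same chain wrapped in a small helper. The helper name `date_share` and the sample rows are made up for illustration; only `last_seen` comes from my real data, and I am assuming the column holds strings that start with a `YYYY-MM-DD` date.

```python
import pandas as pd

# Hypothetical sample rows; the real autos DataFrame is much larger.
sample = pd.DataFrame({
    "last_seen": [
        "2016-03-05 14:45:46",
        "2016-04-06 06:45:54",
        "2016-04-06 20:17:27",
        "2016-04-07 03:16:17",
    ]
})


def date_share(df, column):
    """Return the share of rows per calendar date, sorted by date.

    Hypothetical helper: assumes the column holds strings whose
    first 10 characters are the date (YYYY-MM-DD).
    """
    return (
        df[column]
        .str[:10]  # keep only the date part of the timestamp string
        .value_counts(normalize=True, dropna=False)  # shares, missing values included
        .sort_index(ascending=True)  # oldest date first
    )


last_seen_share = date_share(sample, "last_seen")
print(last_seen_share)
```

The same helper could be called once per column, which keeps the three distributions easy to compare side by side.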
I also ran
autos['registration_year'].describe()
I think 1000 and 9999 are outliers, because registration_year is the year the car was registered, so it can't be 1000 or 9999 (it can't be higher than 2016 or lower than 1950).
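If the 1950 to 2016 range is a reasonable cutoff, one possible way to check how many rows fall outside it and then drop them is sketched below. The sample values are made up; only the column name and the two bounds come from my observation above, and `Series.between` includes both bounds by default.

```python
import pandas as pd

# Hypothetical sample values, including the two extremes seen in describe().
sample = pd.DataFrame(
    {"registration_year": [1000, 1948, 1999, 2005, 2016, 2019, 9999]}
)

# True for rows whose year is inside 1950..2016 (both bounds included).
in_range = sample["registration_year"].between(1950, 2016)

# Share of rows that would be removed by this filter.
outside_share = (~in_range).mean()
print(f"Share outside 1950-2016: {outside_share:.2%}")

# Keep only the plausible years and look at their distribution.
cleaned = sample[in_range]
print(cleaned["registration_year"].value_counts(normalize=True).sort_index())
```

Checking the share outside the range first helps confirm that removing those rows does not throw away a large part of the data.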