eBay Car Sales Data -JF

Hi all,

This is the first guided project I’m sharing for review! I’m new to this, so I’d appreciate any and all feedback. Thank you!

https://app.dataquest.io/m/294/guided-project%3A-exploring-ebay-car-sales-data/9/next-steps

Basics.ipynb (30.8 KB)

Hi @jesseba.fernando,

Welcome to the community and congratulations on finishing this guided project.

Your project looks good. But there are a few things that caught my attention.

  • It would have been great if you had given a bit more context about the project in the introduction. You have already written a couple of lines in the markdown cell, which is a good start.
  • You used the str.replace() method with a for loop to rename the columns. That certainly works, but
    dataframe.rename({'oldname':'new_name'}, axis=1, inplace=True) would be easier and doesn’t need a loop.

The whole code would look like this:

autos.rename({'dateCrawled':'date_crawled', 
             'offerType':'offer_type', 
             'abtest':'ab_test',
             'vehicleType':'vehicle_type', 
             'yearOfRegistration':'reg_year',
             'powerPS':'power_ps',
             'monthOfRegistration':'reg_month',
             'fuelType':'fuel_type',
             'notRepairedDamage':'not_repaired_damage',
             'dateCreated':'date_created',
             'nrOfPictures':'no_of_pictures',
             'postalCode':'postal_code',
             'lastSeen':'last_seen'}, axis=1, inplace=True)
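If you want to try the idea in isolation first, here is a minimal sketch on a toy DataFrame. The name autos_demo and the values in it are made up for illustration; only the column names come from the dataset. It uses the columns= keyword and reassignment instead of axis=1 and inplace=True, which should behave the same way here.

```python
import pandas as pd

# Toy frame with a few of the original camelCase column names (values are made up).
autos_demo = pd.DataFrame({
    "dateCrawled": ["2016-03-26 17:47:46"],
    "yearOfRegistration": [2004],
    "powerPS": [158],
})

# rename() maps old names to new ones; columns missing from the mapping stay unchanged.
autos_demo = autos_demo.rename(columns={
    "dateCrawled": "date_crawled",
    "yearOfRegistration": "reg_year",
    "powerPS": "power_ps",
})

print(list(autos_demo.columns))
```

Passing the mapping through columns= makes it explicit that column labels, not row labels, are being renamed.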

As for the registration year, I think we can accept cars with a reg_year of 2016, since the data was crawled in 2016.
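A rough sketch of that filter, assuming a toy reg_year column with made-up values. The lower bound of 1900 is only a placeholder I picked for the example, not something taken from your notebook.

```python
import pandas as pd

# Made-up registration years, including obviously invalid ones.
autos_demo = pd.DataFrame({"reg_year": [1000, 1998, 2016, 2017, 9999]})

LOWER = 1900  # assumed lower bound, chosen only for illustration
UPPER = 2016  # the crawl year: a car cannot be registered after it was listed

# Series.between() is inclusive on both ends by default, so 2016 is kept.
valid = autos_demo[autos_demo["reg_year"].between(LOWER, UPPER)]

print(valid["reg_year"].tolist())
```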

From the way you have written the pandas code, I can sense that you know quite a lot about the pandas library. Maybe you were in a rush to finish this project. Anyway, I hope this feedback helps.

Hi @jithins123,

Thank you for your feedback! I had tried the dataframe.rename({'oldname':'new_name'}, axis=1, inplace=True) method, but I think I had it formatted incorrectly, as it kept throwing an error. That is what led me to the for loop using str.replace().

As for the registration year, I excluded 2019 from reg_year (you typed 2016, but I think you meant 2019, since I did include 2016) with future questions in mind. If we were to ask which brands increased in popularity over time, an incomplete year of data (the data was scraped sometime in the middle of 2019, not at the end) might skew the results.

Thank you for your feedback! It definitely helped!
Jesseba

Hi @jesseba.fernando,

What I have understood is that the data was scraped in 2016, as per the date columns.
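One way to check this is to count the years that appear in the date columns. This is only a sketch: the frame below is a made-up stand-in for autos, and it assumes the dates are stored as strings that start with a four-digit year.

```python
import pandas as pd

# Made-up sample rows; assumes date strings begin with the four-digit year.
autos_demo = pd.DataFrame({
    "date_crawled": ["2016-03-26 17:47:46", "2016-04-04 13:38:56"],
    "last_seen": ["2016-04-06 06:45:54", "2016-04-06 14:45:08"],
})

# Take the first four characters (the year) and count how often each year occurs.
year_counts = {
    col: autos_demo[col].str[:4].value_counts().sort_index().to_dict()
    for col in ["date_crawled", "last_seen"]
}

print(year_counts)
```

Running the same loop over the real date columns should show which year or years the crawl covers.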