Exploring a dataset of car sales on eBay

Hi all,

I’ve just finished my third guided project, in which I try to analyse a dataset of car listings on eBay.

If you have the time, please take a look and give me any kind of feedback (preferably related to the guided project)! :smiley:

https://app.dataquest.io/c/54/m/294/guided-project%3A-exploring-ebay-car-sales-data/9/next-steps

GP3 Exploring eBay Car Sales Data.ipynb (290.4 KB)

Hi @leonhekkert! Thanks for sharing your project with the Community. It was a nice and easy-to-follow read for me. I liked that you left a lot of comments on your code, and your explanations are very clear. It’s interesting that you concluded that it’s impossible to compute the effect of mileage on the price; I think it’s the first project I’ve seen that reaches this conclusion. You are correct, it would be interesting to use some other data set with actual selling prices. Do you have one in mind?

I’m not sure what you mean by “I thought to use this to format a 1000 separator”?

Some suggestions from my side:

  • You have slashes after each sentence. Why?
  • You have a warning in the project. Is it important? Could you solve it?
  • Remove the list of all unique prices. It’s like 70% of your project and I don’t think they are useful. You could return the top highest and lowest prices
  • You could work a bit on your code style. For example, in the last code cells, try to make the indentation consistent on each line
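
As a rough illustration of the suggestion to show only the highest and lowest prices instead of every unique value, here is a minimal sketch. The small `autos` DataFrame and its values are made up for the example; only the `price` column name comes from the thread.

```python
import pandas as pd

# Hypothetical sample data; the real notebook has far more rows.
autos = pd.DataFrame({"price": [0, 1, 350, 5000, 8500, 1350, 99999999]})

# Look only at the extremes rather than listing every unique price.
lowest = autos["price"].nsmallest(3)
highest = autos["price"].nlargest(3)

print("Lowest prices:", lowest.tolist())
print("Highest prices:", highest.tolist())
```

Suspicious values such as a price of 0 or an implausibly large number tend to show up at these extremes, so a short output like this is often enough for a first check.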

I hope these suggestions are useful. Happy coding :grinning:

Hi @artur.sannikov96,

thanks a lot for taking the time to check and give feedback!

You have slashes after each sentence. Why?

When I work with Jupyter Notebook, a slash at the end of a sentence forces a line break.
I like to read new sentences on a new line.
Apparently it is not rendered in nbviewer in the same way…

You have a warning in the project. Is it important? Could you solve it?

I have no clue why there is a warning about regex as I believe there is no use of regex in that line of code.
Do you understand the warning?

Remove the list of all unique prices. It’s like 70% of your project and I don’t think they are useful. You could return the top highest and lowest prices

In Jupyter Notebook, output is rendered with scroll bars, so it doesn’t take up that much space.
Apparently it is not rendered in nbviewer in the same way…

Do you know whether rendering it with scroll bars is possible?
I ask because I like to eyeball all the data to check for possible errors.
It’s probably best practice to have code do this, but as a beginner I’d still like to eyeball the data to see whether I’ve overlooked something.

You could work a bit on your code style. For example, in the last code cells, try to make the indentation consistent on each line

Will do, thanks for the feedback.

I’m not sure what you mean by “I thought to use this to format a 1000 separator”?

I was wondering whether it is possible to show numeric data like 10000 as 10,000 in a dataframe,
but I’m not sure how to do it.

I believe you can make new lines with just Enter. I never had the problem…

Yes, apparently you have a version of pandas that warns you that the default value of the regex parameter will change from True to False. In the documentation, however, the default value is None, and the change is not mentioned anywhere.
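
One way to avoid depending on the default at all is to pass `regex` explicitly. The sketch below assumes the warning comes from a `Series.str.replace` call that strips `$` and `,` from price strings; that is a guess about the notebook, and the sample values are made up.

```python
import pandas as pd

# Hypothetical sample; the real column contents may differ.
prices = pd.Series(["$5,000", "$8,500", "$1,350"])

# regex=False treats "$" and "," as literal characters,
# so the result does not depend on the pandas default.
cleaned = (
    prices
    .str.replace("$", "", regex=False)
    .str.replace(",", "", regex=False)
    .astype(int)
)

print(cleaned.tolist())
```

Stating the argument explicitly also matters for correctness here, because `$` is a special character in regular expressions and would not be matched literally if the pattern were interpreted as a regex.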

I don’t know if it’s possible in nbviewer. However, I advise you to draw a box plot to look for errors. It’s much more efficient.
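
A box plot conventionally marks points lying more than 1.5 times the interquartile range beyond the quartiles. The same check can be sketched numerically, which may help when the plot itself is hard to read. The data below is made up, and the 1.5 factor is the common default rather than something stated in the thread.

```python
import pandas as pd

# Hypothetical sample prices with one obviously wrong entry.
prices = pd.Series([500, 1200, 1350, 2000, 5000, 8500, 99999999])

# Quartiles and interquartile range, as used for box plot whiskers.
q1 = prices.quantile(0.25)
q3 = prices.quantile(0.75)
iqr = q3 - q1

lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# Values outside the whisker range are candidates for a closer look.
outliers = prices[(prices < lower) | (prices > upper)]
print(outliers.tolist())

# prices.plot.box() would draw the corresponding plot (requires matplotlib).
```

Points flagged this way are not necessarily errors; they are simply the values worth inspecting first.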

Yes, that’s possible. You can use .map() from pandas. This method accepts a function and applies it element-wise to a Series. So, you can write something like this: autos['price'] = autos['price'].map('{:,}'.format). This inserts a comma as the thousands separator. Be aware that the prices will become strings.
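
Because the formatted values are strings, one option is to write them to a separate column and keep the numeric one for calculations. This is a minimal sketch with made-up data; the `price_display` column name is hypothetical.

```python
import pandas as pd

# Hypothetical sample data.
autos = pd.DataFrame({"price": [10000, 2500, 1234567]})

# Keep the numeric column intact and store the formatted strings separately.
autos["price_display"] = autos["price"].map("{:,}".format)

print(autos)
```

With this layout, sorting and aggregation still work on `price`, while `price_display` is used only for presentation.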
