Guided Project 4: Exploring eBay Car Sales Data

Hello guys 🙌, I hope everyone is doing great!

I am sharing my 4th guided project here for your review and contributions. I was excited to learn more about NumPy and pandas. I also explored a few additional ideas of my own:

  • For the price outliers, I removed rows with values more than 1.5 × IQR above the 75th percentile.
  • I tried to understand why the lowest-mileage cars were priced lower than the higher-mileage ones.
  • Finally, I attempted to create a value metric to identify the best-value brands and models.
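
For anyone curious, the outlier step in the first bullet can be sketched roughly as follows. This is a minimal illustration and not the notebook's exact code: the `autos` DataFrame, the `price` column name, and the sample values are all assumptions made up for the example.

```python
import pandas as pd

# Hypothetical sample data; the real project uses the eBay car sales dataset.
autos = pd.DataFrame(
    {"price": [500, 1200, 2500, 3000, 4500, 6000, 7500, 9000, 250000]}
)

# Quartiles and interquartile range of the price column.
q1 = autos["price"].quantile(0.25)
q3 = autos["price"].quantile(0.75)
iqr = q3 - q1

# Upper fence: anything more than 1.5 * IQR above the 75th percentile
# is treated as an outlier.
upper_fence = q3 + 1.5 * iqr

# Keep only the rows at or below the upper fence.
filtered = autos[autos["price"] <= upper_fence]

print(f"Upper fence: {upper_fence}")
print(f"Rows kept: {len(filtered)} of {len(autos)}")
```

Only the upper fence is applied here, so very cheap listings stay in the data.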

This was an exciting project overall, and I am happy to receive your feedback and suggestions. :raised_hands:

Cheers!!

Link to the last mission screen here

Project’s notebook file:
notebook.ipynb (1.9 MB)

Here’s the link to the project on Github

Click here to view the jupyter notebook file in a new tab

7 Likes

Hi @israelogunmola

What another well-detailed and inspiring project on eBay Car Sales Data. The title, the aim, the explanations, the use of comments, the recommendations and the conclusion are all very informative. Every section has been worked out so well, and I have ended up learning many things from your projects, especially the value metric you created. I think it will be helpful to most of the students in this community as well. I have also noticed the adjustment you made of adding the summary of results to the introduction, which was raised when you shared one of your earlier projects. This is a great improvement. Keep up the good work, mate.

How did you identify the columns with non-English values :thinking:? I guess it must have taken you a lot of time. I have worked on this project before and never thought of doing that. Your approach is actually the best; those loops and the function are very neat and informative.
Your visualizations also got my attention. The automation you added to the one in cell [22] is just excellent.

Don’t you think cells [36] and [37] display the same result, just in a different format? I think that after doing the grouping, the best approach would have been to build your dataframe right away rather than in a separate cell.

Also check cell [8]. I think those outputs are kind of lonely :smiley:. If you had hidden your code lines, it would be hard for the reader to tell exactly what had happened. So I think embedding the output in some descriptive strings will solve this problem.
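
Something along these lines is what I mean by embedding the output in strings. It is only a sketch: the `autos` DataFrame, its columns, and the statistics shown are made up for illustration and are not what cell [8] actually computes.

```python
import pandas as pd

# Hypothetical data standing in for the project's dataframe.
autos = pd.DataFrame(
    {"brand": ["bmw", "audi", "bmw"], "price": [5000, 7000, None]}
)

n_rows, n_cols = autos.shape
missing_prices = int(autos["price"].isna().sum())

# Label each value so the output still makes sense with the code hidden.
summary = (
    f"Rows: {n_rows}, columns: {n_cols}\n"
    f"Listings with a missing price: {missing_prices}"
)
print(summary)
```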

Otherwise, congratulations on the good work, mate. I think @Elena_Kosourova should consider this for the champion section. All the best in your upcoming projects.

3 Likes

Another wonderful project from Israel! :partying_face: Thank you @brayanopiyo18 for pointing it out, this is a great candidate! :partying_face:

3 Likes

Thank you very much for the detailed feedback, @brayanopiyo18. It means a lot that you went through my code and pointed to specific areas I could improve, with clear explanations 🚀.

You are right, it took some time to identify the German columns. I used a language-translation tool called DeepL to check the unique terms in each column I suspected. This made it easier to identify and translate those columns.
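
Roughly, the idea looks like the sketch below. It is a simplified illustration and not the notebook's actual loops: the `unique_terms` helper, the column names, the sample values, and the translation mapping are all assumptions for the example.

```python
import pandas as pd

# Hypothetical sample with German terms in two text columns.
autos = pd.DataFrame(
    {
        "gearbox": ["manuell", "automatik", "manuell"],
        "fuel_type": ["benzin", "diesel", "benzin"],
        "price": [5000, 7000, 3000],
    }
)


def unique_terms(df, max_unique=20):
    """Return {column: sorted unique values} for non-numeric columns
    that have at most `max_unique` distinct values."""
    terms = {}
    for col in df.columns:
        if pd.api.types.is_numeric_dtype(df[col]):
            continue  # numeric columns need no translation
        values = df[col].dropna().unique()
        if len(values) <= max_unique:
            terms[col] = sorted(values)
    return terms


# Print the short lists of terms so they can be checked in a translator.
for col, values in unique_terms(autos).items():
    print(f"{col}: {values}")

# Once the terms are translated, map them to English.
gearbox_translation = {"manuell": "manual", "automatik": "automatic"}
autos["gearbox"] = autos["gearbox"].map(gearbox_translation)
```

Columns with only a handful of distinct terms are the easy ones to check by hand, which is why the helper skips high-cardinality columns.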

You are also right about cells 36 and 37. I will try to avoid these repetitions in upcoming projects. Oh my God, cell 8 :see_no_evil:. I planned to add some strings at some point, but I forgot :sweat_smile:. It is nice of you to point that out.

Again, thanks a lot for this thorough feedback. It helps me improve and get better and better. Cheers to learning more :nerd_face: :raised_hands:

1 Like

Thanks for the kind words, @Elena_Kosourova. I cannot wait to learn and improve even more 🥳

1 Like

This is great @israelogunmola .
Happy learning.

Thanks @israelogunmola !

I was a bit stuck in the middle of the project, and looking at yours helped me find where my mistake was.

Thanks for sharing!

Hi @israelogunmola, I am a beginner in this field. Your project has given me lots of ideas to work on, and it was really interesting to go through it. I have many questions about the steps you have taken. Since most of them arise from my lack of experience in the field, I will ask just one for now. It’s about the outlier fence you created. Did you do anything to get rid of advertisements priced at a very low amount? Since $1 or $100 wouldn’t be a justifiable price for a vehicle, isn’t it better to remove those listings before the analysis? I would be happy to discuss this with you. You may have already worked this out, and I might have missed that part.

Edit: I am not able to use your visualization techniques in Dataquest’s Jupyter-like code editor.

I am happy to hear that :star_struck:. Thank you for sharing this feedback :100:

Hi @madtitan. I am delighted to hear your feedback.

First, I apologize for the late response: my laptop died on me two weeks ago and I only recently got it fixed :sweat_smile:. That’s why I haven’t been active in the community.

You have a good point about the low-priced vehicles. I also doubted the validity of cars listed at those prices. However, on checking the eBay website further, I realised that some cars really were listed in those price ranges (especially auctioned cars). There were also a large number of these kinds of deals on the website, so I decided to include them.

To eliminate the outliers I used Tukey’s rule. You can find a concise explanation of how this rule works here :wink:

I can understand why the visualization techniques didn’t work for you as expected. For this project, I used a visualization library called Plotly. Plotly doesn’t work in Jupyter right out of the box; you need to install it on your computer through the command line. I would suggest waiting to take the Matplotlib course on the Dataquest platform to build your visualizations (Matplotlib is awesome :star_struck:). However, if you decide to explore further with Plotly, the instructions for its installation are here: Plotly getting started :nerd_face:

Let me know if you need anything else. I am happy to help.

1 Like

Thank you very much. Nobody could have explained this more clearly. I appreciate your talent, and if it’s okay with you, I would like to follow you on other social media.

1 Like

Dude, you made me insecure fr about my code style.

Can I ask how long you’ve been in this realm, I mean Python and data analysis? Thanks.

Hello @Helmii8, try not to feel insecure at all. This is a great place to learn and improve on your work :smile:

I started learning Python with Dataquest in January 2022. Believe me, hehe :rofl:, I sucked at first, and I found a couple of things challenging. However, with continued feedback from interesting minds like @jesmaxavier, @Elena_Kosourova and @brayanopiyo18 at the early stage, I was challenged to learn more and improve.

Keep learning, my friend. The only way is up :rocket:

1 Like

Hi! Sorry to say what’s already been said here, but…hey, your project is amazing!

I only started learning Python here on Dataquest a week ago and I’m putting in a lot of effort, so it is amazing to hear that you’ve come so far in such a short time. I hope I can follow your path, and I’ll take inspiration from your projects to learn even more, especially about visualizations, style, and completeness of the analysis.

Thanks for sharing your work and best wishes for your future progress!

2 Likes