Review: German used cars guided project

Hello everyone,

This is my first uploaded guided project since I started learning at dataquest.io. Please let me know if there is anything that could be improved. I really look forward to your feedback.

My project on github can be found here.
The file can be found here: Used car sales in Germany.ipynb (66.5 KB)

Best regards,
Ninh.

Hi @ninh.luongkhanh,

Good project!

Your code is very well laid out and concise - very readable. I also like the use of headings to break the sections up.

I also liked some of the language (e.g., “Let’s take a look at the min and max values of the ‘price’ column to see if there are any anomalies.”). It shows you have a curious mindset, which leads to rigor in your analysis. This comes in handy for data cleaning.

I also appreciate your work in going beyond the Guided Project and trying to tackle the more interesting questions, such as characterizing relationships between variables (e.g., price and odometer readings, damaged and undamaged cars).

A few next steps for you to consider:

  • You mentioned you encountered a ‘SettingWithCopyWarning’. You can resolve this by using .loc or df.copy(). Check out this StackOverflow post for more details on why you keep getting the warning.
  • I loved challenge #4 on trying to see the trends of price and mileage. One minor thing in your code:
    mileage_50k_150k = autos[(autos['odometer_km'] >= 50000) & (autos['odometer_km'] <= 100000)]['price'].mean()
    mileage_above_150k = autos[autos['odometer_km'] >= 100000]['price'].mean()

Be careful here, because vehicles with an odometer reading of exactly 100,000 are counted in both of these groups. I believe you wanted < 100000 in the first line.
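A minimal sketch of the non-overlapping version, in case it helps. The small DataFrame and the variable names below are made up for illustration and stand in for the real autos data; only the column names 'odometer_km' and 'price' come from your code.

```python
import pandas as pd

# Hypothetical stand-in for the project's `autos` DataFrame (made-up values).
autos = pd.DataFrame({
    "odometer_km": [50000, 80000, 100000, 125000, 150000],
    "price": [9000, 7000, 5000, 4000, 3000],
})

# Half-open range [50000, 100000): a car at exactly 100,000 km is excluded here...
mid_mask = (autos["odometer_km"] >= 50000) & (autos["odometer_km"] < 100000)
# ...and is counted only in this group.
high_mask = autos["odometer_km"] >= 100000

mileage_mid_mean = autos.loc[mid_mask, "price"].mean()
mileage_high_mean = autos.loc[high_mask, "price"].mean()

# Sanity check: number of rows that fall into both groups.
overlap_count = int((mid_mask & high_mask).sum())
```

Keeping the masks as named variables also makes it easy to check the overlap directly, as in the last line.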

  • You might want to start experimenting with graphs, using libraries such as matplotlib and seaborn. This will really make your trends pop out to readers, as well as help you understand the data better. If you are not there yet in Dataquest, that’s fine! You can always return to this project at a later date and build on it.
  • Lastly, and this may be a personal stylistic thing based on my education and background, but I always like to include a limitations section in my analysis, as well as future analyses that could be done. I feel this sets the proper context in which you can draw your conclusions, and gives readers (or yourself!) ideas to explore later on.
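On the ‘SettingWithCopyWarning’ point above, here is a rough sketch of the two usual patterns. The data, the price bounds, and the new column names are all invented for illustration and are not taken from your notebook, so adapt them to wherever the warning actually appears.

```python
import pandas as pd

# Hypothetical stand-in for the project's `autos` DataFrame (made-up values).
autos = pd.DataFrame({
    "price": [0, 5000, 12000, 99999999],
    "odometer_km": [150000, 125000, 50000, 5000],
})

# Pattern 1: take an explicit copy of the filtered rows, so later
# assignments clearly target a new, independent DataFrame.
# The bounds 1 and 350000 are arbitrary illustrative values.
cleaned = autos[autos["price"].between(1, 350000)].copy()
cleaned["price_thousands"] = cleaned["price"] / 1000

# Pattern 2: modify the original frame in one .loc[rows, column] call
# instead of chained indexing such as autos[mask]["col"] = value.
autos["high_mileage"] = False
autos.loc[autos["odometer_km"] >= 100000, "high_mileage"] = True
```

Which pattern fits depends on whether you want to change the original DataFrame or work on a filtered subset.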

Good project and keep it up!
Anthony

hey @ninh.luongkhanh

Congrats on the first project. You have done a great job! :ok_hand: I like your conclusion section too.

@AWM007 has already given very detailed feedback :+1: , so I don’t have much to add to that.

Keep learning and happy coding!

Hi @AWM007,

Thank you very much for your detailed feedback. It’s much appreciated! I’m delving into the data visualization courses and hopefully can create visually compelling graphs later on.

Btw, your work on the project is very inspiring; I found lots of things I can learn from it. Let’s keep up the great work!

Best regards,
Ninh.

Hi @Rucha,

Many thanks for your kind words! Really appreciate it. I’m trying to speed up my learning process as there is much to learn ahead. Sometimes I find the work so overwhelming, so it’s great that we have a community here to exchange ideas and provide feedback. It really helps!

Best regards,
Ninh.
