Guided Project: Predicting Car Prices with KNN

Hello all,

I would like feedback on improving my presentation style as well as the quality of the work. I also hope my project will be useful to others who would like to learn from it.

https://app.dataquest.io/m/155/guided-project%3A-predicting-car-prices/6/next-steps

predict_car_prices.ipynb (781.5 KB)

hey @TaiwoGoldAyodeji

I am not at this mission yet. Not even close. So we will leave the technical discussion (if any) for a later date.

Your project has a flow, and using titles to create a narrative is something new.

Your plots really stand out and are very clear.

But your project is missing a main Title!

I bookmarked your project to come back to when completing my own. Thanks for sharing! Happy learning :slight_smile:

Thanks a lot @Rucha. I’ll add a title to it in my folder now. I hope to have that technical discussion with you later. Cheers!

Hello Taiwo. I came across your project and want to try being nit-picky and intentionally find faults. I just went through this project myself, but the problem is that I skipped a lot of the previous lessons. I don't want to just look up the correct solution; I want to test how well I really know these topics. Please challenge or correct my comments as you see fit. I plan to start another project using a linear regression model and need guidance. I will also post my finished project for review later.

Complaint:

  • In cell 12, the bar graphs do not use the normalized values. Could you also add graphs showing the distribution of the normalized values? Although the values may be numerically equivalent, I believe plotting every column on the same scale would make it more “visually accurate” to judge which column is more spread out than another. With the raw values, the different scales cause some minor discrepancies when comparing the graphs to each other.
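
To illustrate the point about scale, here is a minimal sketch of rescaling columns to a common 0 to 1 range before comparing their spread. It assumes min-max normalization, which may not be the method used in the notebook, and the column names and sample values are made up for illustration.

```python
import statistics


def min_max_normalize(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical sample values, not taken from the actual dataset.
columns = {
    "horsepower": [48, 70, 95, 110, 160, 262],
    "curb_weight": [1488, 2145, 2414, 2935, 3505, 4066],
}

for name, values in columns.items():
    scaled = min_max_normalize(values)
    # On a shared scale, the standard deviations are directly comparable.
    print(name, round(statistics.pstdev(scaled), 3))
```

Plotting histograms of the `scaled` lists instead of the raw values would put every column on the same x-axis, which is what makes the visual comparison fair.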

High-point

  • The clever use of a scatter plot on data that has very few data points is good. I assumed scatter plots worked best on many data points, but this series of graphs makes it far easier to see the best RMSE and the corresponding k-value.

That is as far as I will go. My own project removed a lot of rows, has not yet explored different numbers of folds, and has not yet varied both the number of top features included and the k-neighbors value. It also has very few graphs. Overall I got lower scores than your project, but I need to make sure that is not because of overfitting.
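
For anyone else at the same stage, the remaining steps (k-fold validation while varying both the feature set and the number of neighbors) can be sketched roughly as below. This is a plain-Python illustration on synthetic data, not the notebook's code; `knn_predict`, `k_fold_rmse`, and the two made-up features are all hypothetical, and a real project would more likely use a library implementation.

```python
import math
import random


def knn_predict(train, query, k):
    """train is a list of (features, target); average the k nearest targets."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    return sum(target for _, target in nearest) / k


def k_fold_rmse(rows, k_neighbors, n_folds=5):
    """Mean RMSE across n_folds, holding out each fold once."""
    fold_scores = []
    for fold in range(n_folds):
        held_out = rows[fold::n_folds]
        training = [r for i, r in enumerate(rows) if i % n_folds != fold]
        sq_errors = [
            (knn_predict(training, feats, k_neighbors) - target) ** 2
            for feats, target in held_out
        ]
        fold_scores.append(math.sqrt(sum(sq_errors) / len(sq_errors)))
    return sum(fold_scores) / n_folds


# Synthetic data: feature 0 drives the target, feature 1 is pure noise.
random.seed(1)
rows = []
for _ in range(100):
    a, b = random.random(), random.random()
    rows.append(((a, b), 3 * a + random.gauss(0, 0.1)))

# Grid over feature subsets and neighbor counts.
results = {}
for cols in ((0,), (0, 1)):
    subset = [(tuple(f[c] for c in cols), t) for f, t in rows]
    for k in (1, 3, 5, 7):
        results[(cols, k)] = k_fold_rmse(subset, k)

best = min(results, key=results.get)
print("best (features, k):", best, "RMSE:", round(results[best], 3))
```

Because every score is an average over held-out folds, a combination that only looks good on one lucky split is less likely to win the grid.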

Question

Do you have any idea how to tell if your model is overfitting, aside from having too few data points (rows)?
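
One common check, offered here as a sketch rather than a definitive answer, is to compare the error on the training rows with the error on rows the model never saw: a training RMSE far below the held-out RMSE suggests overfitting. The example uses synthetic one-feature data and hypothetical helpers (`knn_predict`, `rmse`). With k=1 each training point is its own nearest neighbor, so the training error drops to zero while the held-out error does not.

```python
import math
import random


def knn_predict(train_x, train_y, query, k):
    """Average the targets of the k training points closest to query (1-D feature)."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))[:k]
    return sum(train_y[i] for i in nearest) / k


def rmse(train_x, train_y, xs, ys, k):
    """RMSE of k-NN predictions for (xs, ys), using the given training data."""
    sq_errors = [(knn_predict(train_x, train_y, x, k) - y) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(sq_errors) / len(sq_errors))


# Synthetic data: a linear trend plus noise (not the car dataset).
random.seed(0)
xs = [random.uniform(0, 10) for _ in range(200)]
ys = [2 * x + random.gauss(0, 2) for x in xs]
train_x, train_y = xs[:150], ys[:150]
test_x, test_y = xs[150:], ys[150:]

for k in (1, 5, 15):
    train_err = rmse(train_x, train_y, train_x, train_y, k)
    test_err = rmse(train_x, train_y, test_x, test_y, k)
    print(f"k={k:2d}  train RMSE={train_err:.2f}  test RMSE={test_err:.2f}")
```

A large gap between the two columns at small k, shrinking as k grows, is the pattern to look for in your own results.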