Hey there fellow learners!
Just a quick question here.
```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error

# Iteration one: train on split_one, test on split_two
train_one = split_one
test_one = split_two
# Iteration two: swap the splits
train_two = split_two
test_two = split_one

knn = KNeighborsRegressor()
knn.fit(train_one[['accommodates']], train_one['price'])
prediction_one = knn.predict(test_one[['accommodates']])
msq_one = mean_squared_error(test_one['price'], prediction_one)
iteration_one_rmse = msq_one ** 0.5

# Second iteration with a fresh estimator
knn_two = KNeighborsRegressor()
knn_two.fit(train_two[['accommodates']], train_two['price'])
prediction_two = knn_two.predict(test_two[['accommodates']])
msq_two = mean_squared_error(test_two['price'], prediction_two)
iteration_two_rmse = msq_two ** 0.5

avg_rmse = np.mean([iteration_two_rmse, iteration_one_rmse])
```
What I expected to happen:
avg_rmse = 128.96254732948216
What actually happened:
avg_rmse = 123.7207888486061
So I seem to be slightly off here. Comparing with the provided answer, the difference is that I should not have created `knn_two`. That seems counterintuitive to me: the original `knn` was already trained on the `test_two` values, right? Reusing it would unfairly improve its performance on that test set, which is why I created a new estimator.

So I'm confused about why I shouldn't create a second `knn`.
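For what it's worth, one thing I tried in order to understand this: in scikit-learn, calling `fit()` on an estimator discards whatever it was trained on before and retrains from scratch, so a reused estimator and a brand-new one should behave identically. Here is a minimal sketch with made-up toy data (the variable names `X_a`, `y_a`, etc. are just for illustration, not from the exercise):

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Two toy "splits" with very different targets
X_a = np.array([[1], [2], [3], [4], [5]])
y_a = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
X_b = np.array([[1], [2], [3], [4], [5]])
y_b = np.array([100.0, 200.0, 300.0, 400.0, 500.0])

knn = KNeighborsRegressor(n_neighbors=2)
knn.fit(X_a, y_a)
pred_after_a = knn.predict([[3]])

# Re-fit the SAME estimator on split B: fit() replaces the old training data
knn.fit(X_b, y_b)
pred_after_b = knn.predict([[3]])

# A fresh estimator fit only on split B predicts exactly the same value,
# which suggests the re-fitted knn kept nothing from split A
fresh = KNeighborsRegressor(n_neighbors=2)
fresh.fit(X_b, y_b)
pred_fresh = fresh.predict([[3]])
```

If that's right, reusing `knn` shouldn't leak the `test_two` values at all, which makes the difference in `avg_rmse` even more puzzling to me.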