Screen Link:
https://app.dataquest.io/m/155/guided-project%3A-predicting-car-prices/3/univariate-model
My Code:
def knn_train_test(train_col, test_col, df):
    #randomize and split data in half for training and testing
    np.random.seed(1)
    tmp_df = df.copy()
    shuffled_index = np.random.permutation(tmp_df.index)
    rand_df = tmp_df.reindex(shuffled_index)
    last_train_ind = int(len(rand_df)/2)
    train = df.iloc[:last_train_ind]
    test = df.iloc[last_train_ind:]
    #instantiating, fitting and predicting using the model
    knn = KNeighborsRegressor() #default k=5
    knn.fit(train[[train_col]], train[test_col])
    test['predicted'] = knn.predict(test[[train_col]])
    rmse = np.sqrt(mean_squared_error(test[test_col], test['predicted']))
    return rmse
#using the knn model function on the numeric columns to identify the best performing feature
train_cols = numeric_cars.copy()
train_cols = train_cols.columns.drop('price')
rmses = {}
for col in train_cols:
    rmse_val = knn_train_test(col, 'price', numeric_cars)
    rmses[col] = rmse_val
rmse_series = pd.Series(rmses)
rmse_series.sort_values()
What I expected to happen:
What actually happened:
The RMSE values do not match the ones in the ‘Solution’! Also, I am not sure how to fix the ‘SettingWithCopyWarning’ shown below:
/dataquest/system/env/python3/lib/python3.4/site-packages/ipykernel/__main__.py:17: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
engine-size 4190.433888
horsepower 4267.730361
highway-mpg 4628.793094
city-mpg 4814.778015
curb-weight 5166.828581
width 7110.412630
compression-rate 8096.301512
normalized-losses 8131.436882
length 8304.189346
stroke 9334.714914
peak-rpm 9759.209970
wheel-base 9969.243292
height 10839.693636
bore 13397.091693
dtype: float64
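
One possible cause, offered as a sketch rather than a confirmed answer: the function above builds `rand_df` but then slices `train` and `test` from the unshuffled `df`, so the shuffle never affects the split. Assigning `test['predicted']` onto a slice is also the kind of write that typically triggers the `SettingWithCopyWarning`. The version below slices from `rand_df` and keeps the predictions in a local variable instead. The `toy` DataFrame and the `target_col` parameter name are made up for illustration, and there is no guarantee this reproduces the exact numbers in the Solution.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import mean_squared_error
from sklearn.neighbors import KNeighborsRegressor


def knn_train_test(train_col, target_col, df):
    # Shuffle the rows reproducibly, then split the SHUFFLED frame in half.
    np.random.seed(1)
    shuffled_index = np.random.permutation(df.index)
    rand_df = df.reindex(shuffled_index)
    last_train_ind = int(len(rand_df) / 2)
    train = rand_df.iloc[:last_train_ind]
    test = rand_df.iloc[last_train_ind:]

    # Fit a k-nearest-neighbors regressor (default k=5) on a single feature.
    knn = KNeighborsRegressor()
    knn.fit(train[[train_col]], train[target_col])

    # Keep predictions in a local variable rather than writing a new
    # column onto a slice, so no SettingWithCopyWarning is raised here.
    predictions = knn.predict(test[[train_col]])
    return np.sqrt(mean_squared_error(test[target_col], predictions))


# Hypothetical toy data standing in for numeric_cars (assumption, not the real dataset).
toy = pd.DataFrame({
    "engine-size": np.arange(20, dtype=float),
    "price": np.arange(20, dtype=float) * 100.0,
})
toy_rmse = knn_train_test("engine-size", "price", toy)
print(toy_rmse)
```

If a `predicted` column is really needed on `test`, calling `.copy()` on the slice first (for example `test = rand_df.iloc[last_train_ind:].copy()`) should also avoid the warning.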
This is my first time posting, so apologies if the query is not formatted properly!