import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.neighbors import KNeighborsRegressor

def knn_train_test(train, target, df):
    # Shuffle the rows reproducibly, then split them in half.
    np.random.seed(1)
    shuffled_index = np.random.permutation(df.index)
    rand_df = df.reindex(shuffled_index)
    train_index = int(len(rand_df)/2)
    train_set = rand_df.iloc[0:train_index]
    test_set = rand_df.iloc[train_index:]
    # Fit one single-feature model per k and record its test RMSE.
    k_rmses = {}
    k_values = [1,3,5,7,9]
    for k in k_values:
        knn = KNeighborsRegressor(n_neighbors = k)
        knn.fit(train_set[[train]], train_set[target])
        predictions = knn.predict(test_set[[train]])
        mse = mean_squared_error(test_set[target],predictions)
        rmse = np.sqrt(mse)
        k_rmses[k] = rmse
    return k_rmses
mult_rmses = {}
train_cols = norm_cars.columns.drop('price')
for col in train_cols:
    rmses = knn_train_test(col, 'price',norm_cars)
    mult_rmses[col] = rmses
mult_rmses
import matplotlib.pyplot as plt
%matplotlib inline
for k,v in mult_rmses.items():
    x = list(v.keys())
    y = list(v.values())
    plt.plot(x,y)
    plt.xlabel('k value')
    plt.ylabel('RMSE')
I expected this chart to match the solution code for this guided project, with lines for each variable showing the RMSE value that corresponds to each k value.
What actually happened:
My matplotlib plot is doubling back right in the middle, even though there are no dictionary keys with duplicate values. So halfway through my chart, I have more lines than I need.
My code is almost completely the same as the solution notebook, so I’m having a hard time finding the source of this error. Thanks for any help!
I’m having the exact same problem. I copied and pasted the code from the solution and got the exact same plot.
@ninasweeney18 @srauten
I too had the same issue.
Not sure what the underlying cause of this is, but I tried changing
k_values = [1,3,5,7,9]
to something else, like
k_values = [k for k in range(1,10)]
This got rid of the issue for me.
Hope this helps!
Hi! I just plugged in your code and the graph came out normal, so I was unable to reproduce the error. One thing to try, though: because the graph only seems to double back along the x axis, break up your code a bit and print out the values of x and y. My guess is that the x values somehow got out of order. Perhaps you could sort them at some point before graphing?
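To make that check concrete, here is a minimal sketch of the "print x and y" step. The `example_rmses` dictionary and the `find_unsorted` helper are hypothetical names, and the RMSE numbers are made up purely for illustration; they only mimic the `{column: {k: rmse}}` shape of `mult_rmses`.

```python
# Hypothetical results shaped like mult_rmses: {column: {k: rmse}}.
# The RMSE numbers are invented; the first entry's keys are deliberately out of order.
example_rmses = {
    "horsepower": {1: 4100.0, 9: 4500.0, 3: 4000.0, 5: 4050.0, 7: 4300.0},
    "width": {1: 4600.0, 3: 4700.0, 5: 4650.0, 7: 4800.0, 9: 4900.0},
}

def find_unsorted(results):
    """Print x and y for each column and return the columns whose k values are not ascending."""
    unsorted_cols = []
    for col, k_rmses in results.items():
        x = list(k_rmses.keys())
        y = list(k_rmses.values())
        print(col, x, y)  # inspect exactly what plt.plot(x, y) would receive
        if x != sorted(x):
            unsorted_cols.append(col)
    return unsorted_cols

print(find_unsorted(example_rmses))
```

If any column shows up in the returned list, its line will double back on the x axis, because `plt.plot` connects points in the order they are given rather than in numeric order.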
Hi, I tried your suggestion on my code (as I had the same issue as the one raised in the question) and it worked. I’m pondering the why.
The issue is that the code below produces out-of-order lists (i.e. “9” and its corresponding RMSE are not at the end of the lists). This results in an incorrect plot.
The solution I found was to order the nested dictionary and then unpack the list as shown in this post.
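The linked post isn't reproduced here, so the following is only one possible way to do the "order, then unpack" step, not necessarily that post's method. It assumes each inner dictionary maps k to RMSE, as `knn_train_test` returns; `sorted_xy` is a hypothetical helper name. A likely reason the order can differ between machines is that dictionaries only guarantee insertion order from Python 3.7 onward, so older interpreters may return the keys in a different order.

```python
def sorted_xy(k_rmses):
    """Return (k values, RMSEs) as two tuples ordered by ascending k."""
    # sorted() on (k, rmse) pairs orders them by k, since the keys are unique.
    pairs = sorted(k_rmses.items())
    if not pairs:
        return (), ()
    # zip(*pairs) unpacks the sorted pairs back into separate x and y sequences.
    ks, rmses = zip(*pairs)
    return ks, rmses

# Made-up values, deliberately out of order, to show the effect.
x, y = sorted_xy({1: 4.0, 9: 2.0, 3: 3.0})
print(x, y)
```

In the plotting loop, `x, y = sorted_xy(v)` could then replace the two `list(...)` lines before calling `plt.plot(x, y)`.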

@ninasweeney18 @srauten @idowumichael49
There is your why 
@ncarvey
Thanks for your suggestion! Coming from a non-programming background, troubleshooting it this way wasn’t readily apparent to me so I just coded it differently instead. I’m still working on building up the programmer’s logical mindset so thanks again!
Troubleshooting Tip: “Try breaking down code to bits and printing values” - Noted.
@peter.dushku
Thanks for the solution Peter. Other folks who stumble onto this thread with the same issue will definitely find it useful.
Cheers.
I’m just circling back to this and really appreciate all your responses! I’m excited to jump back in and fix this issue. Love the virtual community at work 