Matplotlib Line Plot Doubling Back in Machine Learning Guided Project: Help Debugging?

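import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error

def knn_train_test(train, target, df):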
    shuffled_index = np.random.permutation(df.index)
    rand_df = df.reindex(shuffled_index)
    train_index = int(len(rand_df)/2)
    train_set = rand_df.iloc[0:train_index]
    test_set = rand_df.iloc[train_index:]
    k_rmses = {}
    k_values = [1,3,5,7,9]
    for k in k_values: 
        knn = KNeighborsRegressor(n_neighbors = k)
        knn.fit(train_set[[train]], train_set[target])
        predictions = knn.predict(test_set[[train]])
        mse = mean_squared_error(test_set[target],predictions)
        rmse = np.sqrt(mse)
        k_rmses[k] = rmse
    return k_rmses   

mult_rmses = {}

train_cols = norm_cars.columns.drop('price')
for col in train_cols:
    rmses = knn_train_test(col, 'price',norm_cars)
    mult_rmses[col] = rmses

import matplotlib.pyplot as plt
%matplotlib inline

for k,v in mult_rmses.items():
    x = list(v.keys())
    y = list(v.values())
    plt.plot(x,y)
    plt.xlabel('k value')
    plt.ylabel('RMSE')

I expected this chart to match the solution code for this guided project, with lines for each variable showing the RMSE value that corresponds to each k value.

What actually happened:
My matplotlib plot is doubling back right in the middle, even though there are no dictionary keys with duplicate values. So halfway through my chart, I have more lines than I need.

My code is almost completely the same as the solution notebook, so I’m having a hard time finding the source of this error. Thanks for any help!


I’m having the exact same problem. I copied and pasted the code from the solution and found the exact same plot.

@ninasweeney18 @srauten
I too had the same issue.

Not sure what the underlying cause of this is, but I tried changing

k_values = [1,3,5,7,9]

to something else, like

k_values = [k for k in range(1,10)]

This got rid of the issue for me.

Hope this helps!

Hi! I just plugged in your code and the graph came out normal, so I was unable to reproduce the error. One thing to try, though: because the graph only seems to double back along the x axis, break up your code a bit and print out the values of x and y. My guess is that the x values somehow got out of order. Perhaps you could try a sort at some point before graphing?
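A minimal sketch of that check, assuming the results live in a nested dictionary shaped like mult_rmses. The RMSE numbers here are made up for illustration, one column is deliberately stored out of order, and report_order is a hypothetical helper name:

```python
# Made-up RMSE values for illustration only; the keys of "width" are
# deliberately out of order to mimic the problem described above.
mult_rmses = {
    "horsepower": {1: 4100.0, 3: 4000.0, 5: 4050.0, 7: 4300.0, 9: 4500.0},
    "width": {9: 4700.0, 1: 4500.0, 3: 4600.0, 5: 4650.0, 7: 4680.0},
}

def report_order(rmses_by_col):
    """Print x and y for each column; return columns whose x is not ascending."""
    unsorted_cols = []
    for col, k_rmses in rmses_by_col.items():
        x = list(k_rmses.keys())
        y = list(k_rmses.values())
        print(col, "x =", x, "y =", y)
        if x != sorted(x):
            unsorted_cols.append(col)
    return unsorted_cols

out_of_order = report_order(mult_rmses)
print("columns with out-of-order k values:", out_of_order)
```

plt.plot connects the points in the order they are passed, so an x list like [9, 1, 3, 5, 7] draws a segment from 9 back to 1, which shows up as the line doubling back.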

Hi, I tried your suggestion on my code (I had the same issue as in the question raised) and it worked. I’m still pondering the why.

The issue is that the code below produces out-of-order lists (i.e. “9” and its corresponding RMSE are not at the end of the lists). This results in an incorrect plot.

The solution I found was to order the nested dictionary and then unpack the list as shown in this post.
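In case it helps, here is a rough sketch of the sort-then-unpack idea. It is not necessarily the code from the linked post; sorted_xy and plot_rmses are hypothetical names, and the example values are made up:

```python
def sorted_xy(k_rmses):
    """Return k values and RMSEs as two tuples, ordered by ascending k."""
    pairs = sorted(k_rmses.items())  # sorts the (k, rmse) pairs by k
    x, y = zip(*pairs)               # unpack the pairs into two sequences
    return x, y

def plot_rmses(rmses_by_col):
    """Plot one line per column, with k values in ascending order."""
    # Imported here so sorted_xy can be tried without matplotlib installed.
    import matplotlib.pyplot as plt
    for col, k_rmses in rmses_by_col.items():
        x, y = sorted_xy(k_rmses)
        plt.plot(x, y, label=col)
    plt.xlabel('k value')
    plt.ylabel('RMSE')
    plt.legend()
    plt.show()

# Made-up values, stored out of order on purpose.
example = {9: 4700.0, 1: 4500.0, 3: 4600.0, 5: 4650.0, 7: 4680.0}
x, y = sorted_xy(example)
print(x)
print(y)
```

Calling plot_rmses(mult_rmses) would then draw each line left to right, whatever order the inner dictionaries happen to be in.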



@ninasweeney18 @srauten @idowumichael49

There is your why :slight_smile:


Thanks for your suggestion! Coming from a non-programming background, troubleshooting it this way wasn’t readily apparent to me so I just coded it differently instead. I’m still working on building up the programmer’s logical mindset so thanks again!

Troubleshooting Tip: “Try breaking down code to bits and printing values” - Noted.


Thanks for the solution Peter. Other folks who stumble onto this thread with the same issue will definitely find it useful.

I’m just circling back to this and really appreciate all your responses! I’m excited to jump back in and fix this issue. Love the virtual community at work :grinning: