Alternative solution to "Hyperparameter Optimization - step 6: Practice the Workflow" exercise

I found a shorter solution than the official one for the “Hyperparameter Optimization - step 6: Practice the Workflow” exercise. I thought I would share it with the community to see whether my approach is less “pythonic” or whether I am missing something.

Screen Link:

My Code:

# Assumes train_df and test_df are already loaded, as in the exercise
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error

two_features = ['accommodates', 'bathrooms']
three_features = ['accommodates', 'bathrooms', 'bedrooms']
hyper_params = list(range(1, 21))

def model(features):
    mse_values = dict()
    for k in hyper_params:
        knn = KNeighborsRegressor(n_neighbors = k, algorithm = 'brute')
        train_target = train_df['price']
        train_features = train_df[features]
        knn.fit(train_features, train_target)
        predictions = knn.predict(test_df[features])
        mse = mean_squared_error(test_df['price'], predictions)
        mse_values[k] = mse
        lowest_k = min(mse_values, key = mse_values.get)
        winning_k = {key:value for key,value in mse_values.items() if key == lowest_k}
    return winning_k

two_hyp_mse = model(two_features)
three_hyp_mse = model(three_features)

print(two_hyp_mse)
print(three_hyp_mse)

Output is correct:

{5: 14790.314266211606}
{7: 13518.769009310208}

Why are these two lines inside the for loop? Tracking the minimum inside the loop would matter if it controlled other parts of the algorithm, but here the loop just runs through every k and collects its error, so the minimum can be computed once after the loop finishes rather than on every iteration.

They can be replaced with a single line after the loop: return min(mse_values.items(), key=lambda x: x[1])
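As a minimal sketch of that one-liner (using a toy mse_values dict rather than the exercise data):

```python
# Toy stand-in for the mse_values dict built in the loop
mse_values = {1: 20.5, 2: 14.2, 3: 17.8}

# min over (k, mse) pairs, keyed on the mse, returns a (k, mse) tuple
best = min(mse_values.items(), key=lambda x: x[1])
print(best)  # (2, 14.2)
```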


This single line returns a tuple instead of the dictionary required by the exercise, but it does make sense to move those lines out of the loop. Thanks a lot for your feedback!
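If you want to keep the one-liner while still returning a dict, one option (sketched here with a toy dict) is to unpack the tuple and wrap it:

```python
mse_values = {1: 20.5, 2: 14.2, 3: 17.8}  # toy stand-in for the real MSEs

# unpack the (k, mse) tuple that min() returns, then build the dict
k, mse = min(mse_values.items(), key=lambda x: x[1])
winning_k = {k: mse}
print(winning_k)  # {2: 14.2}
```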

Here is the code for reference.

# Assumes train_df and test_df are already loaded, as in the exercise
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error

two_features = ['accommodates', 'bathrooms']
three_features = ['accommodates', 'bathrooms', 'bedrooms']
hyper_params = list(range(1, 21))

def model(features):
    mse_values = dict()
    for k in hyper_params:
        knn = KNeighborsRegressor(n_neighbors = k, algorithm = 'brute')
        train_target = train_df['price']
        train_features = train_df[features]
        knn.fit(train_features, train_target)
        predictions = knn.predict(test_df[features])
        mse = mean_squared_error(test_df['price'], predictions)
        mse_values[k] = mse
    lowest_k = min(mse_values, key = mse_values.get)
    winning_k = {key:value for key,value in mse_values.items() if key == lowest_k}
    return winning_k

two_hyp_mse = model(two_features)
three_hyp_mse = model(three_features)

print(two_hyp_mse)
print(three_hyp_mse)

Hi @gdelaserre

Thanks for sharing!

I think the dictionary comprehension isn't necessary here, since lowest_k has already been found. I tried winning_k = {lowest_k: mse_values[lowest_k]}, and it works. I'm not sure if I missed something; just FYI.
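To illustrate (with a toy dict standing in for the real MSE values), the direct lookup produces the same dict as the comprehension:

```python
mse_values = {1: 20.5, 2: 14.2, 3: 17.8}  # toy stand-in for the real MSEs
lowest_k = min(mse_values, key=mse_values.get)

# comprehension from the original code vs. direct lookup
via_comprehension = {k: v for k, v in mse_values.items() if k == lowest_k}
via_lookup = {lowest_k: mse_values[lowest_k]}
print(via_comprehension == via_lookup)  # True
```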
