
Getting different outputs from DQ and local machine

I usually re-run the code with pandas on my local machine to make sure I understand what I typed. I tried the following exercise, and the outputs I get on DQ and on my local machine are very different even though the code is the same.

https://app.dataquest.io/c/36/m/141/hyperparameter-optimization/6/practice-the-workflow

My Code:

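import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error

# train_df and test_df are assumed to be the DataFrames prepared earlier in the lesson.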
two_features = ['accommodates', 'bathrooms']
three_features = ['accommodates', 'bathrooms', 'bedrooms']
hyper_params = list(range(1, 21))
# Append the first model's MSE values to this list.
two_mse_values = list()
# Append the second model's MSE values to this list.
three_mse_values = list()
two_hyp_mse = dict()
three_hyp_mse = dict()

for k in hyper_params:
    knn = KNeighborsRegressor(algorithm='brute',n_neighbors=k)
    knn.fit(train_df[two_features],train_df['price'])
    predictions = knn.predict(test_df[two_features])
    two_mse_values.append(mean_squared_error(test_df['price'],predictions))

k=two_mse_values.index(np.min(two_mse_values))+1
val=np.min(two_mse_values)
two_hyp_mse[k]=val

for k in hyper_params:
    knn = KNeighborsRegressor(algorithm='brute',n_neighbors=k)
    knn.fit(train_df[three_features],train_df['price'])
    predictions = knn.predict(test_df[three_features])
    three_mse_values.append(mean_squared_error(test_df['price'],predictions))

k=three_mse_values.index(np.min(three_mse_values))+1
val=np.min(three_mse_values)
three_hyp_mse[k]=val
print(two_hyp_mse,three_hyp_mse)

What I expected to happen:
[image not preserved]

What actually happened:
[image not preserved]

I’ve noticed that when I run the code from the previous exercises in this lesson, the outputs also differ even though the code is the same. For example, the second exercise in this lesson asks us to list the MSE values for hyperparameters 1-5:
DQ result:

On my local machine:

Any idea what is behind the difference?

I didn’t investigate, but the versions we used are so far from the current ones that it’s hard to treat this difference as meaningful. It could easily be something like a parameter’s default value having changed, for instance.
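A quick way to see what is installed locally is to print the package versions. This is only a minimal sketch: the helper name installed_versions is made up for the example, and the thread does not say which versions DQ runs, so that side of the comparison stays unknown.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package name: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in this environment.
            found[name] = None
    return found


# Distribution names as published on PyPI (scikit-learn, not sklearn).
versions = installed_versions(["scikit-learn", "pandas", "numpy"])
for name, ver in versions.items():
    print(f"{name}: {ver or 'not installed'}")
```

Posting this output alongside a question makes it easier for others to judge whether a version gap could be involved.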

So is there anything that can be done to resolve the difference? I did try changing the algorithm parameter of the KNeighborsRegressor constructor, but I still did not get the values that DQ produced.

Sorry for the delay. I took another look at this and the default parameters are the same. It’s not impossible to figure out why there’s a difference, but it’s probably not worth your time.

You’d have to dig in the source code of both versions and experiment with stuff looking for clues.
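If you do want to experiment, one thing that might be worth checking is tie-breaking. This is only a guess; nothing in this thread confirms it. Columns like accommodates and bathrooms take few distinct values, so many training rows can sit at exactly the same distance from a test row, and which of them count as the k nearest can then depend on row order. The sketch below is a plain-Python illustration, not scikit-learn's implementation; knn_predict and the three rows are made up for the example.

```python
def knn_predict(train_rows, query, k):
    """Brute-force k-nearest-neighbours regression.

    train_rows is a list of (features, price) pairs. The prediction is the
    mean price of the k rows closest to query by squared Euclidean distance.
    """
    def squared_distance(row):
        features, _ = row
        return sum((a - b) ** 2 for a, b in zip(features, query))

    # sorted() is stable: rows at the same distance keep their input order,
    # so the input order decides which tied rows make it into the top k.
    nearest = sorted(train_rows, key=squared_distance)[:k]
    return sum(price for _, price in nearest) / k


# Hypothetical rows: (accommodates, bathrooms) -> price.
rows = [
    ((2, 1.0), 100.0),
    ((2, 1.0), 200.0),  # same features as the row above, different price
    ((4, 2.0), 300.0),
]
query = (2, 1.0)

pred_original = knn_predict(rows, query, k=1)
pred_reversed = knn_predict(list(reversed(rows)), query, k=1)
print(pred_original, pred_reversed)  # 100.0 200.0
```

Same rows, same k, different prediction. If the rows on DQ and on the local machine end up in a different order (for example after shuffling), the MSE values could differ for this reason alone.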


Thanks @Bruno, I appreciate you taking a look at the matter. :+1:

As you’ve suggested, I think I’ll leave it as it is for now.
