Hi,
In 7.4, Exploring Topics in Data Science - Machine Learning with KNN,
page 7 has the following sklearn example. When I run it on my local machine, I get the following error:
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
I have confirmed that there are some NaN values in the data and would like to know what’s going on behind the curtain. Did Dataquest drop or change the NaN values without saying so?
# The columns that we'll be using to make predictions
x_columns = ['age', 'g', 'gs', 'mp', 'fg', 'fga', 'fg.', 'x3p', 'x3pa', 'x3p.', 'x2p', 'x2pa', 'x2p.', 'efg.', 'ft', 'fta', 'ft.', 'orb', 'drb', 'trb', 'ast', 'stl', 'blk', 'tov', 'pf']
# The column we want to predict
y_column = ["pts"]
from sklearn.neighbors import KNeighborsRegressor
# Create the kNN model
knn = KNeighborsRegressor(n_neighbors=5)
# Fit the model on the training data
knn.fit(train[x_columns], train[y_column])
# Make predictions on the test set using the fit model
predictions = knn.predict(test[x_columns])
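
For context, here is a minimal sketch of how the NaN values can be counted and then handled before fitting. It uses a small made-up DataFrame rather than the course dataset, and the two options shown (dropping rows, or filling with the training column means) are assumptions on my part, not necessarily what Dataquest does on their platform:

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsRegressor

# Hypothetical toy data standing in for the course dataset (values are made up)
train = pd.DataFrame({
    "age": [23, 27, 31, 25, 29, 22],
    "fg.": [0.45, np.nan, 0.51, 0.48, 0.40, 0.47],
    "pts": [800, 950, 1200, 600, 700, 500],
})
test = pd.DataFrame({
    "age": [24, 30],
    "fg.": [np.nan, 0.44],
    "pts": [650, 900],
})

x_columns = ["age", "fg."]
y_column = "pts"

# Count the missing values in each predictor column
nan_counts = train[x_columns].isna().sum()
print(nan_counts)

# Option A (assumption): drop every training row with a NaN in a predictor
train_dropped = train.dropna(subset=x_columns)

# Option B (assumption): fill NaN with the column means computed on the training set
train_means = train[x_columns].mean()
train_filled = train[x_columns].fillna(train_means)
test_filled = test[x_columns].fillna(train_means)

# With no NaN left in the predictors, fit and predict run without the ValueError
knn = KNeighborsRegressor(n_neighbors=3)
knn.fit(train_filled, train[y_column])
predictions = knn.predict(test_filled)
```

Either option gets rid of the error, but they can give different predictions, which is why I would like to know which one (if any) the lesson uses.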