About Features df in KNN algorithm

I would like to ask something about the univariate case of the KNN algorithm.

Whenever I use

knn.fit(df['feature_column'], train_target)

I get an error,

unless I pass the feature column inside a list:

knn.fit(df[['feature_column']], train_target)

Does anyone know why this is happening?

Hey there,

The method knn.fit(X, y) takes two parameters, X and y. X is the training data and y is the target column. The training data can be one or more columns of a DataFrame.

X = df[['col_a', 'col_b', 'col_n']]
# Here, we are training our model on several columns of our data.
X = df[['col_a']]
# Here, we are using only one column, i.e. this is the univariate case.
# The inner brackets are still there because it's a list, even if
# only one element is left in it.

When you have questions like this, it is always a good idea to take a look at the documentation.

Hope this helped!

For multiple features I use this format:
knn.fit(df[['feature_column_1','feature_column_2']], train_target)
Here the input is a DataFrame, so for a single feature I would expect the logical input to be a Series:
knn.fit(df['feature_column_1'], train_target) ?

Erase 'feature_column_2' from your list and you will be left with knn.fit(df[['feature_column_1']], train_target). The double brackets are still there.

Inside df[], you are passing a list of features. Whether that list has one item or many, it is still surrounded by its own [].
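
To make the difference concrete, here is a minimal sketch that compares the two ways of indexing. The toy DataFrame and its values are made up for illustration; only the column names follow the ones used in this thread.

```python
import pandas as pd

# Hypothetical toy data; the values are placeholders for illustration.
df = pd.DataFrame({
    "feature_column_1": [1.0, 2.0, 3.0, 4.0],
    "feature_column_2": [10.0, 20.0, 30.0, 40.0],
})

# A plain string selects one column and returns a 1-D Series.
as_series = df["feature_column_1"]

# A list of names returns a 2-D DataFrame, even with one name in the list.
as_frame = df[["feature_column_1"]]
two_cols = df[["feature_column_1", "feature_column_2"]]

for name, obj in [("as_series", as_series), ("as_frame", as_frame), ("two_cols", two_cols)]:
    print(name, type(obj).__name__, "ndim =", obj.ndim, "shape =", obj.shape)
```

The outer brackets are the indexing operator and the inner brackets are the list, which is why the single-column case keeps both pairs.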

I edited my original answer because there was an error in how I presented the list of features.

This paper tells you how sklearn is designed. It's not academic, so it should be readable for anyone.
If I remember right, all models in sklearn (not only KNN) require 2-D input for fit, predict, and transform. In your case, indexing a DataFrame with a single column name returns a Series, a 1-D object, so you must index with double brackets [[]] to get a DataFrame back (even if it contains only one column).
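
As a rough sketch of what this looks like in practice, the snippet below fits a KNN model on a single feature. The data, the choice of KNeighborsRegressor, and n_neighbors=2 are assumptions for illustration, since the thread does not say which estimator or data was used. It also shows two other ways to get a 2-D object from one column, Series.to_frame() and NumPy's reshape(-1, 1).

```python
import pandas as pd
from sklearn.neighbors import KNeighborsRegressor

# Hypothetical toy data; only the names mirror the ones in the question.
df = pd.DataFrame({"feature_column": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]})
train_target = pd.Series([1.1, 1.9, 3.2, 3.9, 5.1, 6.0])

knn = KNeighborsRegressor(n_neighbors=2)  # assumed estimator and setting

# Passing the 1-D Series: scikit-learn's input validation raises ValueError.
try:
    knn.fit(df["feature_column"], train_target)
    fit_1d_failed = False
except ValueError as err:
    fit_1d_failed = True
    print("1-D input rejected:", type(err).__name__)

# Three ways to build a 2-D input with a single feature column.
X_frame = df[["feature_column"]]                           # double brackets
X_to_frame = df["feature_column"].to_frame()               # Series -> DataFrame
X_numpy = df["feature_column"].to_numpy().reshape(-1, 1)   # (n_samples, 1) array

knn.fit(X_frame, train_target)

# The input to predict must be 2-D as well.
pred = knn.predict(pd.DataFrame({"feature_column": [2.5]}))
print(pred)
```

All three 2-D variants have shape (n_samples, 1); the double-bracket form is usually the most convenient because it keeps the column name.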

This highlighted part is where it mentions you need 2-D input. This requirement is more obvious in the fit section of the LinearRegression docs, where it is bolded: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html

Oh, I see. Thank you very much! Totally understood!

Hi kostasmandilass, could you mark this question as solved so future students can find it if you think it answers your question?