Predicting car prices project: ERROR Found array with 0 sample(s) (shape=(0, 1)) while a minimum of 1 is required

Screen Link:
https://app.dataquest.io/m/155/guided-project%3A-predicting-car-prices/3/univariate-model

My Code:

import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error

numerical_cars.info()

# Create knn_train_test(), which encapsulates the training and simple validation process.
# We split the dataset into a training set and a test set;
# after fitting the model and making predictions, we calculate the RMSE.

def knn_train_test(train_col, target_col, df):
    # shuffling the rows of the dataset:
    shuffled_index = np.random.permutation(df.index)
    shuffled_df = df.reindex(shuffled_index)
    
    # Dividing the dataset for test/ train, we will divide it in half:
    midpoint_df = int(len(shuffled_df / 2))
    
    # We will then assign the train/test sets:
    train_df = df.iloc[0:midpoint_df]
    test_df = df.iloc[midpoint_df:]
    
    # Instantiating the model, using the default k value:
    knn = KNeighborsRegressor()
    knn.fit(train_df[[train_col]], train_df[target_col]) # Fits the KNN model
    predictions = knn.predict(test_df[[train_col]])  # Makes predictions using model
    mse = mean_squared_error(test_df[[target_col]], predictions)
    rmse = mse ** (1/2)
    return rmse

I expected to get the RMSE of my model for each training column.

What actually happened:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-92-2b03a83e67f7> in <module>
      6 
      7 for col in training_cols:
----> 8     rmse_value = knn_train_test(col, 'price',numerical_cars)
      9     rmse_results[col] = rmse_value
     10 print(rmse_results)

<ipython-input-91-87cbbb5e320b> in knn_train_test(train_col, target_col, df)
     23     knn = KNeighborsRegressor()
     24     knn.fit(train_df[[train_col]], train_df[target_col]) # Fits the KNN model
---> 25     predictions = knn.predict(test_df[[train_col]])  # Makes predoctions using model
     26     mse = mean_squared_error(test_df[[target_col]], predictions)
     27     rmse = mse ** (1/2)

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\neighbors\_regression.py in predict(self, X)
    172             Target values.
    173         """
--> 174         X = check_array(X, accept_sparse='csr')
    175 
    176         neigh_dist, neigh_ind = self.kneighbors(X)

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
     70                           FutureWarning)
     71         kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 72         return f(**kwargs)
     73     return inner_f
     74 

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
    648         n_samples = _num_samples(array)
    649         if n_samples < ensure_min_samples:
--> 650             raise ValueError("Found array with %d sample(s) (shape=%s) while a"
    651                              " minimum of %d is required%s."
    652                              % (n_samples, array.shape, ensure_min_samples,

ValueError: Found array with 0 sample(s) (shape=(0, 1)) while a minimum of 1 is required.

Any help here please?

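I just found the mistake!

There was a misplaced parenthesis where I split the dataset into train/test sets: I wrote int(len(shuffled_df / 2)) instead of int(len(shuffled_df) / 2). Dividing the DataFrame by 2 halves its values but not its number of rows, so the midpoint was equal to the full length and the test set came out empty.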
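
For anyone who hits the same error, here is a rough sketch of how the corrected function might look. It also slices shuffled_df instead of df, since the posted code shuffles the rows and then splits the unshuffled DataFrame. The fixed random seed and the small toy_cars DataFrame are my own assumptions for illustration, not part of the guided project, and the sketch assumes the columns used contain no missing values.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error


def knn_train_test(train_col, target_col, df):
    # Shuffle the rows (the fixed seed is an assumption, for reproducibility).
    rng = np.random.RandomState(1)
    shuffled_df = df.reindex(rng.permutation(df.index))

    # Take the length first, then halve it.
    midpoint = int(len(shuffled_df) / 2)

    # Slice the shuffled frame, not the original df.
    train_df = shuffled_df.iloc[:midpoint]
    test_df = shuffled_df.iloc[midpoint:]

    # Fit with the default number of neighbors and score on the test half.
    knn = KNeighborsRegressor()
    knn.fit(train_df[[train_col]], train_df[target_col])
    predictions = knn.predict(test_df[[train_col]])
    mse = mean_squared_error(test_df[target_col], predictions)
    return mse ** 0.5


# Made-up data standing in for numerical_cars (hypothetical values).
toy_cars = pd.DataFrame({
    "horsepower": np.arange(20, dtype=float),
    "price": np.arange(20, dtype=float) * 100.0,
})

# The original bug: dividing the DataFrame does not change its row count.
print(len(toy_cars / 2), len(toy_cars) / 2)

rmse = knn_train_test("horsepower", "price", toy_cars)
print(rmse)
```

With the parenthesis in the wrong place, iloc[midpoint:] starts past the last row, which is what produces the shape=(0, 1) array in the traceback.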
