Need help with cross_val_score

Hi all!

I know this is meant to be solved differently, but I wanted to try my luck with sklearn's built-in functions. Here's what I did:

My Code:

new_features = df.loc[:,["Gr Liv Area"]]
new_targets = df.SalePrice


(2930, 1)


from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
lr = LogisticRegression(solver="saga")
cross_val_score(lr, new_features, new_targets, cv = 5, scoring="accuracy").mean()

What I expected to happen:
getting a score :slight_smile:

What actually happened:

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sklearn/model_selection/ UserWarning: The least populated class in y has only 1 members, which is less than n_splits=5.
  warnings.warn(("The least populated class in y has only %d"
/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sklearn/linear_model/ ConvergenceWarning: The max_iter was reached which means the coef_ did not converge
  warnings.warn("The max_iter was reached which means "

I tried different solvers, different cv values, different models… same result :slight_smile: As I understand it, there's something wrong with my target data? I just can't see why it can't be split into 5 folds :confused:

I tried the same thing with Kaggle's Titanic data and it works like a charm!

Would greatly appreciate any hints/ideas.

Many thanks in advance!

Hi Marina,

Did you try using "neg_root_mean_squared_error" for the scoring parameter in cross_val_score?

I was able to successfully use this in my code.
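
For reference, here is a minimal sketch of that idea. Note that "neg_root_mean_squared_error" is a regression metric, so this sketch also swaps LogisticRegression for LinearRegression; that swap is my assumption, not something taken from your notebook. SalePrice is continuous, and with a classifier cross_val_score uses stratified folds that treat every distinct price as its own class, which is where the "least populated class" warning comes from. The data below is synthetic (made-up numbers standing in for "Gr Liv Area" and SalePrice), since the real df isn't available here.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the housing data (assumption: not the real df).
rng = np.random.default_rng(0)
living_area = rng.uniform(500, 4000, size=200)
sale_price = 100 * living_area + rng.normal(0, 20000, size=200)

# Features must be 2-D, like df.loc[:, ["Gr Liv Area"]]; the target is continuous.
new_features = living_area.reshape(-1, 1)
new_targets = sale_price

# A regressor plus a regression scorer, so plain (unstratified) K-fold is used.
lr = LinearRegression()
scores = cross_val_score(
    lr, new_features, new_targets,
    cv=5, scoring="neg_root_mean_squared_error",
)

# The scorer returns negated RMSE (higher is better), so flip the sign.
rmse = -scores.mean()
print(rmse)
```

With your own data you would pass new_features and new_targets from df instead of the synthetic arrays. The result is in the same units as SalePrice, so lower is better.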

Hope it helps.