I know this is meant to be solved differently, but I wanted to try my luck with sklearn's built-in functions. Here's what I did:
    new_features = df.loc[:, ["Gr Liv Area"]]
    new_targets = df.SalePrice
    new_features.shape
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    lr = LogisticRegression(solver="saga")
    cross_val_score(lr, new_features, new_targets, cv=5, scoring="accuracy").mean()
What I expected to happen:
getting a score
What actually happened:
    /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sklearn/model_selection/_split.py:665: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=5.
      warnings.warn(("The least populated class in y has only %d"
    /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sklearn/linear_model/_sag.py:329: ConvergenceWarning: The max_iter was reached which means the coef_ did not converge
      warnings.warn("The max_iter was reached which means "
I tried different solvers, different cv values, and different models, all with the same result. As I understand it, there's something wrong with my target data? I just can't see why it can't be split into 5 folds.
I tried the same thing with Kaggle's Titanic data and it works like a charm!
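One check that might narrow down the target-data suspicion is sketched below. It uses made-up data, not the real housing values: the `toy` frame and its numbers are hypothetical stand-ins for `df`. The idea is to count how many rows each distinct target value has. For a classifier with an integer `cv`, `cross_val_score` uses stratified folds, and the first warning appears when some class has fewer rows than `n_splits`. If SalePrice is continuous (an assumption here), nearly every price would count as its own class, whereas the Titanic target has only two. The regressor at the end is only there for comparison.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Hypothetical stand-in for df: a continuous price driven by living area.
rng = np.random.default_rng(0)
area = rng.integers(500, 4000, size=200)
price = area * 100 + rng.integers(-20000, 20000, size=200)
toy = pd.DataFrame({"Gr Liv Area": area, "SalePrice": price})

X = toy.loc[:, ["Gr Liv Area"]]
y = toy["SalePrice"]

# A classifier treats every distinct target value as a separate class.
class_counts = y.value_counts()
n_classes = class_counts.size
smallest_class = int(class_counts.min())
print("distinct target values:", n_classes)
print("rows in least populated class:", smallest_class)

# For comparison: a regressor with a regression metric on the same data.
reg_scores = cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2")
print("mean R^2:", reg_scores.mean())
```

Running the same `value_counts()` check on `new_targets` should show whether the least populated class really has only one row.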
Would greatly appreciate any hints/ideas.
Many thanks in advance!