Help with Machine Learning Project Walkthrough

Screen Link: https://app.dataquest.io/m/135/machine-learning-project-walkthrough%3A-making-predictions/9/penalizing-the-classifier

Hi everyone, I've been working on this project for a while. The problem is that when I compute the TPR and FPR in my Jupyter notebook, I get very different values from the ones I get on the site, even though I have the same code in both places.

Site:

lr = LogisticRegression(class_weight="balanced")
predictions = cross_val_predict(lr, features, target, cv=3)
predictions = pd.Series(predictions)

fp_filter = (predictions == 1) & (loans['loan_status'] == 0)
fp = len(predictions[fp_filter])

tp_filter = (predictions == 1) & (loans['loan_status'] == 1)
tp = len(predictions[tp_filter])

fn_filter = (predictions == 0) & (loans['loan_status'] == 1)
fn = len(predictions[fn_filter])

tn_filter = (predictions == 0) & (loans['loan_status'] == 0)
tn = len(predictions[tn_filter])

fpr = fp / (fp + tn)
tpr = tp / (tp + fn)

print('False Positive Rate:', fpr)
print('True Positive Rate:', tpr)

False Positive Rate: 0.38664292074799644
True Positive Rate: 0.6636146617109359

My code:

lr = LogisticRegression(class_weight="balanced")
predictions = cross_val_predict(lr, features, target, cv=3)
predictions = pd.Series(predictions)

fp_filter = (predictions == 1) & (loans_2007['loan_status'] == 0)
fp = len(predictions[fp_filter])

tp_filter = (predictions == 1) & (loans_2007['loan_status'] == 1)
tp = len(predictions[tp_filter])

fn_filter = (predictions == 0) & (loans_2007['loan_status'] == 1)
fn = len(predictions[fn_filter])

tn_filter = (predictions == 0) & (loans_2007['loan_status'] == 0)
tn = len(predictions[tn_filter])

fpr = fp / (fp + tn)
tpr = tp / (tp + fn)

print('False Positive Rate:', fpr)
print('True Positive Rate:', tpr)

False Positive Rate: 0.5237907206317868
True Positive Rate: 0.5465718405873099

I don't know whether I'm doing something wrong or whether there's something wrong with the data.
I'm also posting my notebook in case someone wants to check everything I've done.

Thanks for your help guys!

Loan Predictor.ipynb (59.0 KB)
https://nbviewer.jupyter.org/urls/community.dataquest.io/uploads/short-url/8AolRTZKKAfA26zzZuPzwT7osJF.ipynb


Your code uses loans_2007 where the site uses loans, so it does not look like the same code. How did you verify that the two are the same?

Why not print fp, tn, tp, and fn themselves, since fpr and tpr are derived from them?
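A minimal sketch of that check, using made-up toy data rather than the loans dataset. The helper name confusion_counts is hypothetical; it just wraps the same four filters from the post so the raw counts can be printed side by side.

```python
import pandas as pd

def confusion_counts(predictions, actual):
    """Return fp, tp, fn, tn for two Series that share the same index."""
    fp = int(((predictions == 1) & (actual == 0)).sum())
    tp = int(((predictions == 1) & (actual == 1)).sum())
    fn = int(((predictions == 0) & (actual == 1)).sum())
    tn = int(((predictions == 0) & (actual == 0)).sum())
    return {"fp": fp, "tp": tp, "fn": fn, "tn": tn}

# Toy data (an assumption for illustration, not the real loans data)
actual = pd.Series([1, 1, 0, 0, 1, 0])
predictions = pd.Series([1, 0, 1, 0, 1, 0])

counts = confusion_counts(predictions, actual)
print(counts)

# The four counts should add up to the number of rows. If they do not,
# some rows were lost or mismatched when the filters were built.
print("total:", sum(counts.values()), "rows:", len(actual))
```

Running the same helper in both environments and comparing the four numbers should show which count diverges first.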
Then go back one step further: print the four filters that generated fp, tn, tp, and fn, get summary statistics on them, and use your Python skills to inspect them in detail for differences.
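One thing worth checking while inspecting the filters is whether the index of predictions matches the index of the DataFrame. This is only a guess at a possible cause, not something confirmed in this thread: pd.Series(predictions) gets a fresh 0..n-1 index, and pandas aligns on index labels when combining two boolean Series with &, so a DataFrame that had rows dropped earlier without resetting its index would not line up row by row. A small sketch with hypothetical toy data (loans_toy is not the real dataset):

```python
import pandas as pd

# Toy stand-in for a DataFrame whose rows were dropped earlier,
# leaving gaps in the index (hypothetical data)
loans_toy = pd.DataFrame({"loan_status": [1, 0, 1, 0, 1]},
                         index=[0, 2, 3, 5, 6])

# Wrapping a plain array of predictions gives a default 0..4 index
predictions = pd.Series([1, 0, 1, 0, 1])

# Quick check: do the two indexes match?
same_index = predictions.index.equals(loans_toy.index)
print("same index:", same_index)  # False

# & aligns on index labels, so rows are paired by label, not by position
misaligned_tp = int(((predictions == 1) & (loans_toy["loan_status"] == 1)).sum())
print("tp with mismatched index:", misaligned_tp)

# Reusing the DataFrame's index pairs the values row by row
aligned = pd.Series(predictions.values, index=loans_toy.index)
aligned_tp = int(((aligned == 1) & (loans_toy["loan_status"] == 1)).sum())
print("tp with matching index:", aligned_tp)  # 3
```

If the index check prints False in the notebook, resetting the DataFrame's index with reset_index(drop=True) before building the filters, or giving predictions the DataFrame's index as above, would be a reasonable thing to try.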


I have the same problem here! My Jupyter notebook gives very different values for the machine learning part of this project.