Confused with the output

Hi,
I have a few points of confusion about the screens below.
For the screen https://app.dataquest.io/m/135/machine-learning-project-walkthrough%3A-making-predictions/9/penalizing-the-classifier,
the output I got is as follows:
[screenshot: output from screen 9]

But as per the next screen,
[screenshot: expected values from screen 10]

As per screen 9, fpr is 60% and tpr is ~62%, whereas as per screen 10 they should have been 39% and 66% respectively.

Also, the output from screen 10 (https://app.dataquest.io/m/135/machine-learning-project-walkthrough%3A-making-predictions/10/manual-penalties) is
[screenshot: output from screen 10]

But screen 11 (https://app.dataquest.io/m/135/machine-learning-project-walkthrough%3A-making-predictions/11/random-forests) mentions a different fpr
[screenshot: fpr mentioned on screen 11]

I am confused by the above; can anyone please clarify?
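For reference, the fpr and tpr being compared above are computed from the model's predictions against the actual labels. Here is a minimal sketch with made-up data (the series values are hypothetical stand-ins for the mission's `loans` data, where 1 = paid off and 0 = defaulted):

```python
import pandas as pd

# Hypothetical stand-in labels and predictions (1 = paid off, 0 = defaulted).
target = pd.Series([1, 1, 0, 0, 1, 0, 1, 1])
predictions = pd.Series([1, 0, 1, 0, 1, 0, 1, 1])

# Confusion-matrix counts.
tp = ((predictions == 1) & (target == 1)).sum()  # funded, paid off
fp = ((predictions == 1) & (target == 0)).sum()  # funded, defaulted
fn = ((predictions == 0) & (target == 1)).sum()  # rejected, would have paid
tn = ((predictions == 0) & (target == 0)).sum()  # rejected, would have defaulted

fpr = fp / (fp + tn)  # false positive rate
tpr = tp / (tp + fn)  # true positive rate
print(fpr, tpr)
```

If two people compute these against different label series (see the later replies in this thread), the percentages will disagree even with identical predictions.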

Thanks,
Debasmita


@dash.debasmita I’ve encountered the same problem. This seems to be a bug in the content, best report it through Contact Us.

Also, @Sahil, will you kindly take a look at this? I'm wondering if it's related to a scikit-learn version change. The tpr and fpr in the output are so close in both penalized cases that it feels like the overall accuracy dropped so much the optimization isn't really valid.


I am having the exact same problem!

It's difficult to see the induced parameter changes at work in attaining lower false positives and higher true positives.

Regards,
John


Hi @veratsien,

Thank you for mentioning me. I will get this issue logged.

Best,
Sahil


@Sahil Awesome, thank you!


Thank you @Sahil and @veratsien; I forgot to follow up on this.


Any update on this? I am seeing the same issue that is being reported here.


Hi @Mathew.Thomas,

Thank you for asking. I just checked the ticket for updates and it seems like the issue is scheduled to be fixed by January 11, 2021.

Best,
Sahil

Hi @Sahil - just to note that these issues still seem to be present. Thanks


Hi @Sahil: The issue on both screens still persists. The model actually deteriorates when penalties are applied on both screens; in fact, it gets worse with manual penalties.
Is this the correct way of applying penalties? We seem to be making our model worse with each step.


Hi @joe.gamse,

Sorry about that; it seems like the fix has been delayed so that the content team can focus on our SQL skills path, which is a high priority task for this quarter. I will update this topic once the issue is fixed. Until then, please use this workaround to mark the screen as completed:

https://dataquest.elevio.help/en/articles/151-how-to-mark-a-lesson-screen-as-complete

Best,
Sahil

Hi @vinayak.naik87,

Sorry, I don’t understand what you mean by worse here. The goal here is to reduce the false-positive rate and to demonstrate how to use manual penalties.

We reduced the false positive rate from 60% to 21% using a manual penalty. However, the manual penalty unintentionally reduced the true positive rate as well, which is expected behavior, as mentioned on screen 4:

Generally, if we want to reduce false positive rate, true positive rate will also go down. This is because if we want to reduce the risk of false positives, we wouldn’t think about funding riskier loans in the first place.

Why it is best for us to focus on the false positive rate (in the case of loans) is explained on screen 11:

Note that this comes at the expense of true positive rate. While we have fewer false positives, we’re also missing opportunities to fund more loans and potentially make more money. Given that we’re approaching this as a conservative investor, this strategy makes sense, but it’s worth keeping in mind the tradeoffs.
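To illustrate the trade-off on synthetic data (the dataset and the 10:1 penalty ratio below are illustrative assumptions, not the mission's actual values), a manual penalty is just a `class_weight` dict that makes errors on class 0 (a funded loan that defaults) more costly:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the loans data: class 1 ("paid off") dominates,
# mimicking the class imbalance in the mission's dataset.
X, y = make_classification(n_samples=2000, weights=[0.15, 0.85], random_state=0)

def fpr(y_true, y_pred):
    fp = ((y_pred == 1) & (y_true == 0)).sum()
    tn = ((y_pred == 0) & (y_true == 0)).sum()
    return fp / (fp + tn)

# Unpenalized model vs. a manual penalty that makes mistakes on class 0
# ten times as costly as mistakes on class 1.
plain = LogisticRegression(max_iter=1000).fit(X, y)
penalized = LogisticRegression(class_weight={0: 10, 1: 1}, max_iter=1000).fit(X, y)

print("fpr plain:    ", fpr(y, plain.predict(X)))
print("fpr penalized:", fpr(y, penalized.predict(X)))
```

The penalized model predicts "fund" less often, so its false positive rate drops, and its true positive rate drops with it, exactly as the quoted passage describes.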

Hope this helps! :slightly_smiling_face:

Hi there,

I also struggled with the results in this mission. I finally realized that the variable `target` is not equal to `loans["loan_status"]`, even though it should be, since it is defined that way at the beginning of the mission. Because of this, you can get different tpr and fpr values depending on whether you compare your predictions against `target` or against `loans["loan_status"]`.

I reckon this is why the results don't match those mentioned in the mission: they may have been calculated using `target` instead of `loans["loan_status"]`.
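A toy reproduction of that kind of divergence (the filtering step below is a hypothetical cause, not confirmed from the mission's code):

```python
import pandas as pd

# Snapshot of the label column taken early, as the mission does with `target`.
loans = pd.DataFrame({"loan_status": [1, 0, 1, 1, 0, 1]})
target = loans["loan_status"].copy()

# If `loans` is filtered afterwards, the snapshot no longer matches the column.
loans = loans[loans["loan_status"] == 1]
print(target.equals(loans["loan_status"]))  # prints False
```

Any tpr/fpr computed against one series but reported against the other would then disagree, which matches the symptoms in this thread.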

I hope this helps.

Regards.