
Confused with the output

I have a few points of confusion about the screens below.
For the screen,
the output I got is as follows:

But as per the next screen,

As per screen 9, the fpr is 60% and the tpr is ~62%, whereas as per screen 10 they should have been 39% and 66% respectively.

Also, the output from screen 10 is:

But screen 11 mentions a different fpr.

I have been confused by the above. Can anyone please clarify?



@dash.debasmita I’ve encountered the same problem. This seems to be a bug in the content; it’s best to report it through Contact Us.

Also, @Sahil will you kindly take a look at this? I’m wondering if it’s related to a version change in scikit-learn. The tpr and fpr in the output are so close in both penalized cases that it feels like the overall accuracy dropped so much that the optimization isn’t really valid.


I am having the exact same problem!

It’s difficult to see the induced parameter changes at work in attaining lower false positives and higher true positives.



Hi @veratsien,

Thank you for mentioning me. I will get this issue logged.



@Sahil Awesome, thank you!


Thank you, Sahil and veratsien. I forgot to follow up on this.


Any update on the issue mentioned here? I am seeing the same behavior that is being reported.


Hi @Mathew.Thomas,

Thank you for asking. I just checked the ticket for updates and it seems like the issue is scheduled to be fixed by January 11, 2021.


Hi @Sahil - just to note that these issues still seem to be present. Thanks


Hi @Sahil: The issue on both screens still persists. The model actually deteriorates when penalties are applied on both screens. In fact, it gets worse with manual penalties.
Is this the correct way of applying penalties? We seem to be making our model worse with each step.


Hi @joe.gamse,

Sorry about that; it seems like the fix has been delayed so that the content team can focus on our SQL skills path, which is a high priority task for this quarter. I will update this topic once the issue is fixed. Until then, please use this workaround to mark the screen as completed:


Hi @vinayak.naik87,

Sorry, I don’t understand what you mean by worse here. The goal here is to reduce the false-positive rate and to demonstrate how to use manual penalties.

We reduced the false positive rate from 60% to 21% using a manual penalty. However, the manual penalty unintentionally reduced the true positive rate as well, which is expected behavior, as mentioned in screen 4:

Generally, if we want to reduce false positive rate, true positive rate will also go down. This is because if we want to reduce the risk of false positives, we wouldn’t think about funding riskier loans in the first place.

And why it is best for us to focus on the false positive rate (in the case of loans) is explained in screen 11:

Note that this comes at the expense of true positive rate. While we have fewer false positives, we’re also missing opportunities to fund more loans and potentially make more money. Given that we’re approaching this as a conservative investor, this strategy makes sense, but it’s worth keeping in mind the tradeoffs.

Hope this helps! :slightly_smiling_face:
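To make the trade-off concrete, here is a minimal sketch of how a manual penalty can push both the false positive rate and the true positive rate down. It assumes the lesson applies its penalty through scikit-learn’s class_weight parameter (as the thread suggests scikit-learn is in play), and the data below is synthetic, not the actual loans dataset:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the loans data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = (X[:, 0] + rng.normal(scale=0.5, size=1000) > 0).astype(int)

def tpr_fpr(model):
    """Fit the model and return (true positive rate, false positive rate)."""
    preds = model.fit(X, y).predict(X)
    tp = ((preds == 1) & (y == 1)).sum()
    fn = ((preds == 0) & (y == 1)).sum()
    fp = ((preds == 1) & (y == 0)).sum()
    tn = ((preds == 0) & (y == 0)).sum()
    return tp / (tp + fn), fp / (fp + tn)

tpr_plain, fpr_plain = tpr_fpr(LogisticRegression())
# Weight mistakes on class 0 (the "bad borrowers") 10x more heavily:
tpr_pen, fpr_pen = tpr_fpr(LogisticRegression(class_weight={0: 10, 1: 1}))
print(fpr_plain, fpr_pen)  # penalty lowers the false positive rate...
print(tpr_plain, tpr_pen)  # ...but drags the true positive rate down too
```

The penalized model predicts 1 less often overall, which is exactly why the fpr and tpr fall together rather than independently.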

Hi there,

I also struggled with the results in this mission. I finally realized that the variable "target" is not equal to loans["loan_status"], even though it should be, since it is defined that way at the beginning of the mission. Because of this, you can get different tpr and fpr results depending on whether you compare your predictions against "target" or against loans["loan_status"].

I reckon this is why the results don’t match those mentioned during the mission. They may have been calculated using the variable "target" instead of loans["loan_status"].

I hope this helps.
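One quick way to check this kind of divergence is pandas’ Series.equals. The sketch below uses hypothetical data and a hypothetical cause (the DataFrame being filtered after target was assigned) purely to illustrate how "target" and loans["loan_status"] can stop matching:

```python
import pandas as pd

loans = pd.DataFrame({"loan_status": [1, 0, 1, 1, 0]})
target = loans["loan_status"]  # snapshot taken early in the notebook

# Any later re-assignment/filtering of `loans` leaves `target` holding the old rows:
loans = loans[loans["loan_status"] == 1]

print(target.equals(loans["loan_status"]))  # False - the two have diverged
```

Comparing predictions against one versus the other would then produce different confusion-matrix counts, and therefore different tpr/fpr values.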



Great question @vinayak.naik87 !

First, yes, the issues on all the screens are still there, and that is annoying, especially since a couple of people have now pointed out the source of the issue.

Second, this is an interesting notion, that the model deteriorates as we add these penalties. Just looking at the scores for the metrics we have chosen, we see the numbers going down as we add penalties, so it is easy to think that going from 66% to 30% (or whatever the actual numbers are) indicates a decrease in the performance of the model. But it’s important to remember why we created this model in the first place, and what these numbers actually mean.

For this lesson we are a potential investor looking to use Lending Club to make money. However, there is no guarantee that our investment will pay off, and in fact some of these ‘investment opportunities’ end up losing a bunch of money. How can we guarantee that we don’t pick one of the ‘bad’ borrowers? Simple: we build a model that predicts that EVERY borrower is bad, and we don’t invest. That model predicts all zeros and has a 100% True Negative rate. Unfortunately, that model does not serve the original purpose of selecting an investment to make money.

So, how can we guarantee that we find good investments? Simple: we build a model that predicts all ones and has a 100% True Positive rate. Unfortunately, this model has no discriminating power to help us select a borrower that will pay back the loan, so we might as well just pick a borrower at random.

So why don’t we just build a model that accurately predicts the category to which each borrower belongs? We know that this model will never be ‘perfect’, but let’s assume we could create a model with 90% accuracy, one that correctly categorizes 9 of every 10 borrowers. Dataquest actually played out this example pretty well here:

Even with this ‘accurate’ model we end up losing money! So we need another metric to judge the success of our model. We want a model that selects as few ‘bad’ borrowers as possible, but unlike our all-zero model, we do need it to select at least some borrowers. If the model assigns a 1 to a bad borrower, the model gave us a False Positive, since it falsely predicted that the borrower would be good. For this scenario, then, we want a model with a LOW False Positive rate, ideally zero, since that would mean it never chooses a bad borrower. We don’t really care if we miss out on some potentially good borrowers, but we do need the model to still correctly identify some of them. Successfully selecting a good borrower here is a True Positive, since it is a positive borrower that is truthfully identified. We don’t really care if this number is low, meaning the model doesn’t identify many good borrowers, but we can’t have it go all the way to zero like our all-zero model did.

It’s important to remember, though, that which metric you use to evaluate the model depends on what you are trying to accomplish with it. If you are trying to identify international terrorists at the airport, you need a low False Negative rate because you don’t want to miss even one! However, a high False Positive rate would mean you are arresting everyone simply for being at the airport!
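The two degenerate baselines above can be sketched in a few lines (hypothetical outcome labels, numpy only):

```python
import numpy as np

y = np.array([1, 0, 1, 1, 0, 0, 1, 0])  # hypothetical borrower outcomes

def rates(preds, y):
    """Return (true positive rate, true negative rate)."""
    tp = ((preds == 1) & (y == 1)).sum()
    fn = ((preds == 0) & (y == 1)).sum()
    tn = ((preds == 0) & (y == 0)).sum()
    fp = ((preds == 1) & (y == 0)).sum()
    return tp / (tp + fn), tn / (tn + fp)

print(rates(np.zeros_like(y), y))  # (0.0, 1.0): never funds a loan
print(rates(np.ones_like(y), y))   # (1.0, 0.0): funds every loan blindly
```

Both baselines score perfectly on one metric and uselessly on the other, which is why neither accuracy nor any single rate alone tells you whether the model serves the investor’s goal.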


Hi all, I was encountering the same problem, but Daniel_H’s solution seemed to work for me on DataQuest.

In the True/False/Pos/Neg matrix, I replaced loans["loan_status"] with target, and got the same FPR and TPR percentages as described in the guide.

# Confusion-matrix counts, comparing each prediction with `target`
tn_filter = (predictions == 0) & (target == 0)
tn = len(predictions[tn_filter])

tp_filter = (predictions == 1) & (target == 1)
tp = len(predictions[tp_filter])

fn_filter = (predictions == 0) & (target == 1)
fn = len(predictions[fn_filter])

fp_filter = (predictions == 1) & (target == 0)
fp = len(predictions[fp_filter])

# Rates derived from the counts
fpr = fp / (fp + tn)
tpr = tp / (tp + fn)

However, when I tried to follow this same project in my own Jupyter, I seemed to get the ‘wrong’ %s again…

import pandas as pd

def true_false_matrix(df, column, predictions):
    """Print confusion-matrix counts and rates for `predictions` vs. df[column]."""
    target = pd.Series(df[column])
    tn_filter = (predictions == 0) & (target == 0)
    tn = len(predictions[tn_filter])

    tp_filter = (predictions == 1) & (target == 1)
    tp = len(predictions[tp_filter])

    fn_filter = (predictions == 0) & (target == 1)
    fn = len(predictions[fn_filter])

    fp_filter = (predictions == 1) & (target == 0)
    fp = len(predictions[fp_filter])
    print(' True Negatives: {} ({}%)'.format(tn, round(100 * tn / (tn+tp+fn+fp), 1)))
    print(' True Positives: {} ({}%)'.format(tp, round(100 * tp / (tn+tp+fn+fp), 1)))
    print('False Negatives: {} ({}%)'.format(fn, round(100 * fn / (tn+tp+fn+fp), 1)))
    print('False Positives: {} ({}%)'.format(fp, round(100 * fp / (tn+tp+fn+fp), 1)))
    print(' True Positive Rate: {}%'.format(round(100 * tp / (tp + fn), 2)))
    print('False Positive Rate: {}%'.format(round(100 * fp / (fp + tn), 2)))

E.g. output:

Logistic Regression (k-fold Cross Validation, Harsh Penalty) Metrics:
 True Negatives: 4303 (12.1%)
 True Positives: 4738 (13.3%)
False Negatives: 25774 (72.4%)
False Positives: 762 (2.1%)
 True Positive Rate: 15.53%
False Positive Rate: 15.04%

Instead of the TPR of 24% and FPR of 9% mentioned in DataQuest’s guide.

I’d appreciate some clarity on what the ‘correct’ values should be, if possible, as I’m a bit confused about why Jupyter and DataQuest produce slightly different results.