What is the logic behind squaring as "penalization"? (mean squared error)

Maybe a stupid question, but I don’t understand why exactly we square the error in order to penalize the higher values. Why don’t we use the 4th power or the 6th? Is there any particular logic, or is it just easier and faster to do than any other form of penalization?


Hi @VitaliyNechay,
first off, no question is a stupid question :rofl:

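Penalizing a Machine Learning algorithm essentially means making its mistakes costly: the loss function assigns a larger cost to larger errors, so training pushes the model to reduce them. Penalties that are added to keep your algorithm from overfitting to your data are a related but separate idea, called regularization.

There are other methods that can be used as well. Kindly go through this write-up, which should give you a clear understanding of the topic: https://towardsdatascience.com/why-do-we-minimize-the-mean-squared-error-3b97391f54c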


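There are subtleties to your question.

The first is the issue of convexity. You would like to find the global minimum of your cost function. For linear regression, the MSE is a convex function of the parameters, so gradient descent will reach the global minimum as long as your learning rate is not too high.

If you try a higher even power, such as the 4th, the loss of a linear model is still convex, but it is much steeper far from the minimum and much flatter close to it. That makes gradient descent harder to tune: it can overshoot on large errors and crawl on small ones. With non-linear models the cost function is generally not convex at all, and you may be stuck in a local minimum.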
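The convexity point can be checked numerically. Below is a minimal sketch, assuming a tiny made-up dataset generated from y = 2x + 1; the function name `fit_mse`, the learning rate, and the step count are illustrative choices, not taken from the course material. It runs plain gradient descent on the MSE from two different starting points.

```python
def fit_mse(xs, ys, w=0.0, b=0.0, lr=0.05, steps=2000):
    """Plain gradient descent on the mean squared error of y ~ w*x + b."""
    n = len(xs)
    for _ in range(steps):
        # error = prediction - target, for every example
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of (1/n) * sum(error**2) with respect to w and b
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical toy data lying exactly on the line y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

# Two very different starting points
w_a, b_a = fit_mse(xs, ys, w=0.0, b=0.0)
w_b, b_b = fit_mse(xs, ys, w=-10.0, b=10.0)
print(w_a, b_a)
print(w_b, b_b)
```

Because the MSE of a linear model has a single minimum, both runs should end up at essentially the same `w` and `b`, close to the 2 and 1 used to generate the data.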

The second subtlety is the issue of penalising higher-value errors. Take a look at this article: "Intuitions on L1 and L2 Regularisation" by Raimi Karim (Towards Data Science).

Our focus is on $L$, the loss function without regularization. The error of a prediction is $\text{error} = \hat{y} - y = (wx + b) - y$.

For the MSE, the loss on a single example is $L = \text{error}^2 = (wx + b - y)^2$, so
$\frac{dL}{db} = 2(wx + b - y) = 2 \cdot \text{error}$.

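When updating the bias $b$, the gradient step is therefore proportional to twice the error. With a higher even power, such as $L = \text{error}^4$, the gradient becomes $4 \cdot \text{error}^3$, so large errors are penalised much more heavily. The price is that the update steps can become so large that gradient descent overshoots and may not converge to the global minimum.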
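To make the comparison concrete, here is a small sketch of the gradient with respect to b for a loss of the form error raised to a power, assuming a single example and an illustrative learning rate of 0.1. The helper names `grad_wrt_b` and `error_after_one_step` are made up for this example.

```python
def grad_wrt_b(error, power):
    """Derivative of L = error**power with respect to b, where error = (w*x + b) - y.

    Since d(error)/db = 1, the chain rule gives power * error**(power - 1).
    """
    return power * error ** (power - 1)


def error_after_one_step(error, power, lr):
    """Error after one gradient descent update of b (w and x held fixed)."""
    # b <- b - lr * dL/db, and the error shifts by the same amount as b.
    return error - lr * grad_wrt_b(error, power)


# Gradient size for small, medium, and large errors
for error in (0.1, 1.0, 10.0):
    print(f"error={error:5.1f}  "
          f"grad(p=2)={grad_wrt_b(error, 2):8.3f}  "
          f"grad(p=4)={grad_wrt_b(error, 4):10.3f}")

lr = 0.1  # illustrative learning rate, an assumption for this sketch
after_squared = error_after_one_step(10.0, 2, lr)  # moves toward zero
after_fourth = error_after_one_step(10.0, 4, lr)   # jumps far past zero
print(after_squared, after_fourth)
```

With the squared loss, a starting error of 10 shrinks after one step. With the 4th power and the same learning rate, the step is so large that the error changes sign and grows in magnitude, which is the overshooting behaviour described above.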

