Imagine that you are standing on a flat surface and have to walk down a concave slope. Now imagine that the slope is slippery and you have to complete your descent in seconds. How would you manage that? Welcome to gradient descent!
Gradient descent is a "first-order iterative optimization algorithm for finding a local minimum of a differentiable function".
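In symbols (using the common notation of a parameter vector $\theta$, a learning rate $\alpha$, and a cost function $J$, none of which are fixed by the quoted definition), each iteration takes a step against the gradient:

$$\theta_{t+1} = \theta_t - \alpha \, \nabla_{\theta} J(\theta_t)$$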
To implement gradient descent, you need a differentiable function, that is, a function whose derivative exists with respect to the variable(s) of interest. In machine learning, this differentiable function is what we call the cost (or loss) function.
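As a concrete example, the logistic (sigmoid) function used throughout this article is differentiable everywhere, and its derivative has a particularly tidy form:

$$\sigma(z) = \frac{1}{1 + e^{-z}}, \qquad \frac{d\sigma}{dz} = \sigma(z)\,\bigl(1 - \sigma(z)\bigr)$$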
Gradient descent is a standard optimization technique for finding the bias and coefficient(s) in both linear regression and logistic regression. In this article, I shall apply two cost functions, mean squared error and log-loss, to the logistic function.
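To preview where the math ends up (for a single example with prediction $\hat{y} = \sigma(w^{\top}x + b)$ and label $y$; the $\tfrac{1}{2}$ factor in the MSE is a convention I am assuming to keep the derivative clean), the two cost functions and their gradients with respect to the weights are:

$$L_{\text{MSE}} = \tfrac{1}{2}(\hat{y} - y)^2, \qquad \frac{\partial L_{\text{MSE}}}{\partial w} = (\hat{y} - y)\,\hat{y}\,(1 - \hat{y})\,x$$

$$L_{\text{log}} = -\bigl[y \ln \hat{y} + (1 - y)\ln(1 - \hat{y})\bigr], \qquad \frac{\partial L_{\text{log}}}{\partial w} = (\hat{y} - y)\,x$$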
The aim of this article is to show the math behind gradient descent when the cost function is either mean squared error or log-loss. Hopefully, it will inspire you to apply the same technique to other cost functions, for example one built around tanh.
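For readers who prefer code to symbols, below is a minimal batch gradient-descent sketch for logistic regression with the log-loss cost; the function names, learning rate, and iteration count are illustrative choices of mine rather than anything prescribed by the article.

```python
import numpy as np

def sigmoid(z):
    # Logistic function: squashes any real input into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, learning_rate=0.1, n_iterations=1000):
    """Estimate weights and bias with batch gradient descent on log-loss."""
    n_samples, n_features = X.shape
    weights = np.zeros(n_features)
    bias = 0.0
    for _ in range(n_iterations):
        y_hat = sigmoid(X @ weights + bias)
        error = y_hat - y                   # (y_hat - y): gradient of log-loss w.r.t. the linear term
        grad_w = X.T @ error / n_samples    # average gradient for the coefficients
        grad_b = error.mean()               # average gradient for the bias
        weights -= learning_rate * grad_w   # step against the gradient
        bias -= learning_rate * grad_b
    return weights, bias

# Toy usage: a 1-D dataset that is separable around x = 2
X = np.array([[0.5], [1.5], [2.5], [3.5]])
y = np.array([0.0, 0.0, 1.0, 1.0])
w, b = fit_logistic(X, y)
print("weights:", w, "bias:", b)
```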
This article has shown the math behind applying gradient descent to the logistic function when the cost function is mean squared error or log-loss. In both cases the process is the same: take the first-order derivative of a differentiable cost function and follow it, step by step, toward a local minimum.