Imagine that you are standing on a flat surface and have to walk down a concave slope. Now imagine that the slope is slippery and you must complete your walk in seconds. How would you manage that? Welcome to gradient descent!

Gradient descent is a *“first order iterative optimization algorithm for finding the local minimum of a differentiable function”*.

To implement gradient descent, there must be a differentiable function: a function whose derivative exists with respect to each of its variables. This differentiable function is what we call the **cost or loss function.**
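Before getting to cost functions, the core loop is easy to see on a toy problem. The sketch below (an illustration, not part of the article's derivation) minimizes f(x) = (x - 3)², whose derivative is f'(x) = 2(x - 3); the learning rate and step count are arbitrary choices for the example.

```python
# Minimal gradient descent on a simple differentiable function.
# f(x) = (x - 3)**2 has derivative f'(x) = 2*(x - 3), so repeatedly
# stepping against the derivative walks x toward the minimum at x = 3.

def gradient_descent(derivative, start, learning_rate=0.1, steps=100):
    """Iteratively move against the gradient to find a local minimum."""
    x = start
    for _ in range(steps):
        x -= learning_rate * derivative(x)
    return x

minimum = gradient_descent(lambda x: 2 * (x - 3), start=0.0)
print(round(minimum, 4))  # converges near x = 3, the function's minimum
```

Each iteration moves x a small step in the direction that decreases f, which is exactly the slippery-slope walk from the opening analogy.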

Gradient descent is the optimization technique for finding the bias and coefficient(s) in the linear regression and logistic regression algorithms. In this article, I shall apply both the *mean squared error* and *log-loss* cost functions to the logistic function.

The aim of this article is to show the math behind gradient descent when the cost functions are *mean squared error* and *log-loss*. Hopefully, this article will inspire you to apply the same technique to other cost functions, for example *tanh*.
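As a concrete preview of the log-loss case, here is a hedged sketch (not the article's exact derivation) of gradient descent fitting a tiny logistic regression in pure Python. It relies on the standard result that the gradient of the log-loss with respect to weight w_j is the mean of (sigmoid(w·x) - y) · x_j over the samples; the toy data, learning rate, and step count are illustrative assumptions.

```python
import math

def sigmoid(z):
    """The logistic function."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, learning_rate=0.5, steps=2000):
    """Gradient descent on the log-loss. X rows carry a leading 1.0
    so that w[0] acts as the bias term."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(steps):
        # prediction error (sigmoid(w.x) - y) for each sample
        errors = [sigmoid(sum(wj * xj for wj, xj in zip(w, row))) - yi
                  for row, yi in zip(X, y)]
        # log-loss gradient: mean(error * feature) for each weight
        grad = [sum(e * row[j] for e, row in zip(errors, X)) / n
                for j in range(d)]
        w = [wj - learning_rate * gj for wj, gj in zip(w, grad)]
    return w

# Toy data: the label is 1 when the feature exceeds 2.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 3.0], [1.0, 4.0]]
y = [0, 0, 1, 1]
w = fit_logistic(X, y)
preds = [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, row))) > 0.5 else 0
         for row in X]
print(preds)  # recovers the labels [0, 0, 1, 1]
```

The mean-squared-error variant differs only in the gradient expression, which is where the math in the sections that follow comes in.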

**Conclusion**

This article has shown the math behind applying gradient descent to the logistic function when the cost functions are mean squared error and log-loss. The process finds a local minimum of a differentiable function by repeatedly stepping in the direction of the negative first-order derivative.