Mathematics & steps in fitting logistic regression

Screen Link: https://app.dataquest.io/m/20/logistic-regression/5/training-a-logistic-regression-model

Question 1
The screen mentions:
We won't dive into the math and the steps required to fit a logistic regression model to the training data in this mission
Can you please guide me to a resource (apart from Wikipedia) which explains these concepts in a simple but detailed manner?
To be more specific, something that covers:

  1. The cost function for Logistic regression
  2. Optimization process to find the parameters for best fit

Question 2
Also, in logistic regression the probability is modelled as exp(t)/(1+exp(t)).
What is the relationship between t and the independent variables (as a function or equation)?


Hello Vinayak,
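Logistic regression passes a linear score through the sigmoid function.

It is similar to linear regression, but the linear score tells you where a point lies relative to a hyperplane (the decision boundary).

For each point, you calculate a raw score t = b + w1*x1 + w2*x2 + ... This score is proportional to the signed perpendicular distance from the hyperplane (divide it by the length of the weight vector to get the actual distance), and it is the t you mention in your formula.

Once you convert t to a probability with the sigmoid, the prediction is class 0 if the probability is < 0.5 and class 1 if the probability is > 0.5.
Here is sample code for the raw score and the perpendicular distance.

import numpy as np

# Hyperplane w in augmented form: [b, w1, w2]
# Test point x in augmented form: [1, x1, x2]
x = np.array([1, 0, 4])
w = np.array([-12, 3, 4])

# Raw score t = b + w1*x1 + w2*x2 (this is what goes into the sigmoid)
raw_score = np.dot(x, w)

# Signed perpendicular distance: raw score divided by the norm of [w1, w2]
perpendicular_distance = raw_score / np.linalg.norm(w[1:])
print(raw_score, perpendicular_distance)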
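
To make the "convert to probability" step concrete, here is a minimal sketch that reuses the same x and w. The helper name sigmoid and the 0.5 threshold as a hard cut-off are my own choices for illustration, not something taken from the mission.

```python
import numpy as np

def sigmoid(t):
    """Map a raw score t to a probability; same as exp(t) / (1 + exp(t))."""
    return 1.0 / (1.0 + np.exp(-t))

# Same augmented vectors as in the sample code: x = [1, x1, x2], w = [b, w1, w2]
x = np.array([1, 0, 4])
w = np.array([-12, 3, 4])

t = np.dot(x, w)                 # raw score fed into the sigmoid
p = sigmoid(t)                   # probability of class 1
predicted_class = int(p > 0.5)   # 1 if p > 0.5, otherwise 0

print(t, p, predicted_class)
```

A positive score always gives a probability above 0.5, so thresholding the probability at 0.5 is the same as checking which side of the hyperplane the point is on.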

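Here is the cost function (binary cross-entropy, also called log loss):

cxe = -np.mean(y_true*np.log(p) + (1-y_true)*np.log(1-p))

Here p is the probability you calculate using the sigmoid function and y_true holds the actual 0/1 labels. The leading minus sign makes this a cost: the lower the value, the better the fit.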
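
A minimal sketch of that cost wrapped in a function, written with a leading minus sign so that lower means better. The function name cross_entropy, the eps clipping value, and the labels and probabilities are all made up for illustration.

```python
import numpy as np

def cross_entropy(y_true, p, eps=1e-12):
    """Mean binary cross-entropy (log loss); lower is better."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip probabilities so that log(0) never occurs when p is exactly 0 or 1
    p = np.clip(np.asarray(p, dtype=float), eps, 1 - eps)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

y_true = [1, 0, 1]            # made-up labels
p_good = [0.9, 0.2, 0.8]      # probabilities that mostly agree with the labels
p_bad = [0.2, 0.9, 0.3]       # probabilities that mostly disagree

loss_good = cross_entropy(y_true, p_good)
loss_bad = cross_entropy(y_true, p_bad)
print(loss_good, loss_bad)
```

The cost is small when the predicted probabilities agree with the labels and grows quickly when the model is confidently wrong.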

Thanks for your response, but I still need some clarity on both of my original questions:

  1. A resource (apart from Wikipedia) that explains the cost function for logistic regression and the optimization process used to find the best-fit parameters, in a simple but detailed manner.
  2. The relationship between t in exp(t)/(1+exp(t)) and the independent variables, expressed in the form of a function or equation.

The cost function for Logistic regression

If you would like to understand logistic regression in detail, then I suggest watching this playlist:

If you just want an explanation for the cost function, then I would suggest this video:

Optimization process to find the parameters for the best fit
NOTE: This is a very long video
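
If it helps to see the idea in code, here is a minimal sketch of one common optimization approach: batch gradient descent on the mean cross-entropy cost. This is an assumption about what "fitting" can look like, not what the mission or any particular library does internally; the function name fit_logistic, the learning rate, the iteration count, and the toy data are all made up.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def log_loss(y, p):
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def fit_logistic(X, y, lr=0.1, n_iter=5000):
    """Batch gradient descent; X must already have a leading column of 1s."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ w)               # current predicted probabilities
        grad = X.T @ (p - y) / len(y)    # gradient of the mean cross-entropy
        w -= lr * grad                   # step against the gradient
    return w

# Toy data with one feature (made up): small values are class 0, large are class 1
X = np.array([[1, 0.5], [1, 1.0], [1, 1.5], [1, 3.0], [1, 3.5], [1, 4.0]])
y = np.array([0, 0, 0, 1, 1, 1])

initial_loss = log_loss(y, sigmoid(X @ np.zeros(2)))
w = fit_logistic(X, y)
final_loss = log_loss(y, sigmoid(X @ w))
preds = (sigmoid(X @ w) > 0.5).astype(int)

print(w, initial_loss, final_loss, preds)
```

Each iteration nudges the weights in the direction that lowers the cost, so the final cost ends up below the starting cost. Libraries typically use faster solvers, but the goal is the same.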

I asked this question to one of our content authors and here is the response:

  • The regression equation is y = f(X), where X is the set of independent variables and y is the dependent variable.
  • In logistic regression (for two classes, 1 and 0) we want to decide whether y is equal to 1 or 0; that is why we use the sigmoid function exp(y)/(1+exp(y)).

For your equation, t = y = f(X), so my answer to the question is that t is the outcome of combining the independent variables through f.
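
To spell that out as an equation: in the standard formulation of logistic regression, f is a linear combination of the independent variables. The coefficient symbols \beta_0, ..., \beta_n below are my own notation and do not come from the mission.

```latex
t = \beta_0 + \beta_1 x_1 + \dots + \beta_n x_n,
\qquad
p = \frac{e^{t}}{1 + e^{t}} = \frac{1}{1 + e^{-t}},
\qquad
\log\frac{p}{1 - p} = t
```

The last identity says that t is the log-odds of class 1, which is why each coefficient can be read as the change in log-odds for a one-unit change in its variable.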

Best,
Sahil
