The questions I have are:

- How do you derive the derivative of the MSE for the multivariate case of linear regression?
- After applying the chain rule, how can the two resulting expressions be shown to be equivalent for this specific equation?

I have used the linearity of differentiation property:

I wrote the equation as the summation of derivatives instead of the derivative of a summation.
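
Written out, that step would look roughly like the following. This is a sketch under assumptions: a single feature, `n` samples indexed by `i` (the index is my notation), the MSE defined as the average squared residual, and the derivative taken with respect to `a0`.

$$
\frac{\partial}{\partial a_0}\,\frac{1}{n}\sum_{i=1}^{n}\left(a_0 + a_1 x_{1,i} - y_i\right)^2
= \frac{1}{n}\sum_{i=1}^{n}\frac{\partial}{\partial a_0}\left(a_0 + a_1 x_{1,i} - y_i\right)^2
$$

The constant factor `1/n` and the summation both pass through the derivative because differentiation is linear.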

Next, I used the chain rule assuming:

`g(x) = x^2`

`h(x) = a0 + a1x1 - y`

`f(x) = g(h(x)) = (a0 + a1x1 - y)^2`
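
For reference, here is how the chain rule would apply to these definitions, assuming the derivative is taken with respect to the parameters `a0` and `a1` (with `x1` and `y` treated as constants) rather than with respect to `x`:

$$
\begin{aligned}
g'(u) &= 2u \\
\frac{\partial h}{\partial a_0} &= 1, \qquad \frac{\partial h}{\partial a_1} = x_1 \\
\frac{\partial f}{\partial a_0} &= g'(h)\,\frac{\partial h}{\partial a_0} = 2\,(a_0 + a_1 x_1 - y) \\
\frac{\partial f}{\partial a_1} &= g'(h)\,\frac{\partial h}{\partial a_1} = 2\,(a_0 + a_1 x_1 - y)\,x_1
\end{aligned}
$$

Under that assumption the outer derivative contributes the factor of 2 and the inner derivative contributes either 1 or `x1`, depending on the parameter.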

That is where I am stuck. I can only change the above function to:

I treated

as the sum of two functions, then simplified their derivatives using the “Derivative of a Constant” rule. Next, I used the “Derivative of a Simple Linear Equation” rule:

That brings me back to the two questions. The minor question is, doesn’t using the Chain Rule inflate the value of the derivative for this statement:

How is = ?

The more important question, of course, is where I went wrong or where I go from here to get the DQ answer. Feel free to point to links. I only reviewed the properties/rules of derivatives and did not make any effort to use the proofs of the rules to find an answer or determine the equivalence of the terms after using the Chain Rule.
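
One way to check a candidate derivative without going through the proofs is to compare it against a numerical estimate. The sketch below assumes the single-feature model `a0 + a1*x - y` from above and an MSE averaged over `n` samples. The function names and the sample data are made up for illustration.

```python
def mse(a0, a1, xs, ys):
    """Mean squared error of the line a0 + a1*x against targets ys."""
    n = len(xs)
    return sum((a0 + a1 * x - y) ** 2 for x, y in zip(xs, ys)) / n


def mse_grad(a0, a1, xs, ys):
    """Candidate analytic gradient obtained via the chain rule (assumed form)."""
    n = len(xs)
    residuals = [a0 + a1 * x - y for x, y in zip(xs, ys)]
    d_a0 = 2.0 / n * sum(residuals)
    d_a1 = 2.0 / n * sum(r * x for r, x in zip(residuals, xs))
    return d_a0, d_a1


def numeric_grad(a0, a1, xs, ys, eps=1e-6):
    """Central finite-difference estimate of the same gradient."""
    d_a0 = (mse(a0 + eps, a1, xs, ys) - mse(a0 - eps, a1, xs, ys)) / (2 * eps)
    d_a1 = (mse(a0, a1 + eps, xs, ys) - mse(a0, a1 - eps, xs, ys)) / (2 * eps)
    return d_a0, d_a1


# Hypothetical sample data, chosen only to exercise the functions.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.1, 5.9, 8.2]

analytic = mse_grad(0.5, 1.5, xs, ys)
numeric = numeric_grad(0.5, 1.5, xs, ys)
print("analytic:", analytic)
print("numeric: ", numeric)
```

If the two printed pairs agree to several decimal places, the chain-rule result is at least consistent with the function being differentiated. A mismatch usually points to a dropped factor, such as the 2 from the outer derivative or the `x` from the inner one.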