Hi! As I understand it, variance is a measure of variability.
In the “Overfitting” paragraph we are told about the bias-variance tradeoff, and it is suggested that we represent bias as the MSE and variance as the variance of the predicted values. The thing is that when we compute MSE, we do it by comparing actual and predicted values, whereas when computing the variance we just calculate how much the predictions vary, without any regard to the difference between actual and predicted values.
In my view, variance only makes sense when there are at least 2 test sets, and it should be calculated as the variance of all the MSE (or RMSE) values. This would mean: “if our model’s performance varies a lot (with MSE as the measure of performance) across different train and test sets, it is overfitted; in an ideal world we’d get the same MSE whatever train and test sets we take”.
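Here is a minimal sketch of that idea, assuming a hypothetical synthetic linear dataset and a simple degree-1 polynomial model (both are my own illustrative choices, not from the course): fit the model on several random train/test splits, collect the MSE from each split, and then look at the variance of those MSEs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic data: a noisy linear relationship.
X = rng.uniform(0, 10, 200)
y = 3 * X + 5 + rng.normal(0, 2, 200)

mses = []
for seed in range(10):
    # Each iteration draws a different random train/test split.
    split_rng = np.random.default_rng(seed)
    idx = split_rng.permutation(len(X))
    train, test = idx[:150], idx[150:]

    # Fit a degree-1 polynomial (simple linear model) on the training split.
    coeffs = np.polyfit(X[train], y[train], deg=1)
    preds = np.polyval(coeffs, X[test])

    # MSE compares predictions against the actual test targets.
    mses.append(np.mean((preds - y[test]) ** 2))

# The spread of MSE across splits: a low spread suggests the model
# generalizes consistently; a high spread suggests overfitting.
mse_variance = np.var(mses)
print("MSE per split:", [round(m, 2) for m in mses])
print("Variance of MSE across splits:", round(mse_variance, 3))
```

This is essentially what repeated train/test splitting (or cross-validation) measures: stability of the error estimate, rather than the raw spread of the predictions themselves.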
In DQ’s version it sounds like “the more the predictions vary, the more overfitted the model is”. But imagine the target values themselves vary a lot and the model predicts them precisely: by that logic the model would be considered overfitted, even though it is simply accurate.