Let $x_1$ and $x_2$ be independent with zero mean and unit variance. You measure inputs $\tilde{x}_1 = x_1$ and $\tilde{x}_2 = \sqrt{1-\epsilon^2}\,x_1 + \epsilon x_2$.
(a) What are variance($\tilde{x}_1$), variance($\tilde{x}_2$) and covariance($\tilde{x}_1, \tilde{x}_2$)?
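First, since $x_1$ and $x_2$ are independent with zero mean and unit variance, we have:
$$\operatorname{Var}(\tilde{x}_1) = \operatorname{Var}(x_1) = 1, \qquad \operatorname{Var}(\tilde{x}_2) = \operatorname{Var}\!\left(\sqrt{1-\epsilon^2}\,x_1 + \epsilon x_2\right) = (1-\epsilon^2)\operatorname{Var}(x_1) + \epsilon^2\operatorname{Var}(x_2) = 1.$$
Now, we consider the product $\tilde{x}_1\tilde{x}_2$ and, because both measured inputs have zero mean, get:
$$\operatorname{Cov}(\tilde{x}_1, \tilde{x}_2) = E[\tilde{x}_1\tilde{x}_2] = \sqrt{1-\epsilon^2}\,E[x_1^2] + \epsilon\,E[x_1 x_2] = \sqrt{1-\epsilon^2}.$$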
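The result of part (a) can be checked without sampling. The sketch below assumes the measured inputs are $\tilde{x}_1 = x_1$ and $\tilde{x}_2 = \sqrt{1-\epsilon^2}\,x_1 + \epsilon x_2$, writes them as $\tilde{\mathbf{x}} = A\mathbf{x}$, and uses $\operatorname{Cov}(A\mathbf{x}) = A\operatorname{Cov}(\mathbf{x})A^T$. The function name `measured_cov` is only illustrative.

```python
import numpy as np

def measured_cov(eps):
    """Covariance matrix of (x~1, x~2) for independent, zero-mean, unit-variance x1, x2."""
    # Assumed mixing: x~1 = x1, x~2 = sqrt(1 - eps^2) * x1 + eps * x2
    A = np.array([[1.0, 0.0],
                  [np.sqrt(1.0 - eps**2), eps]])
    # Cov(A x) = A Cov(x) A^T, and Cov(x) is the identity here
    return A @ np.eye(2) @ A.T

# Diagonal entries are the two variances, the off-diagonal entry is the covariance
print(measured_cov(0.6))
```

The diagonal stays at 1 for every $\epsilon$, while the off-diagonal entry equals $\sqrt{1-\epsilon^2}$.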
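(b) Suppose $f(\mathbf{x}) = \hat{w}_1 x_1 + \hat{w}_2 x_2$ (linear in the independent variables). Show that $f$ is linear in the correlated inputs, $f(\mathbf{x}) = w_1\tilde{x}_1 + w_2\tilde{x}_2$. (Obtain $w_1, w_2$ as functions of $\hat{w}_1, \hat{w}_2$.)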
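Notice that, for $\epsilon \neq 0$, $x_1$ and $x_2$ are linear functions of the measured inputs: $x_1 = \tilde{x}_1$ and $x_2 = \left(\tilde{x}_2 - \sqrt{1-\epsilon^2}\,\tilde{x}_1\right)/\epsilon$.
That means $f$ is a linear combination of $\tilde{x}_1$ and $\tilde{x}_2$:
$$f(\mathbf{x}) = \hat{w}_1\tilde{x}_1 + \frac{\hat{w}_2}{\epsilon}\left(\tilde{x}_2 - \sqrt{1-\epsilon^2}\,\tilde{x}_1\right) = \left(\hat{w}_1 - \frac{\sqrt{1-\epsilon^2}}{\epsilon}\,\hat{w}_2\right)\tilde{x}_1 + \frac{\hat{w}_2}{\epsilon}\,\tilde{x}_2,$$
so $w_1 = \hat{w}_1 - \frac{\sqrt{1-\epsilon^2}}{\epsilon}\,\hat{w}_2$ and $w_2 = \frac{\hat{w}_2}{\epsilon}$.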
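A quick numerical check of part (b), as a sketch: it assumes $\tilde{x}_1 = x_1$, $\tilde{x}_2 = \sqrt{1-\epsilon^2}\,x_1 + \epsilon x_2$, and the mapping $w_1 = \hat{w}_1 - \hat{w}_2\sqrt{1-\epsilon^2}/\epsilon$, $w_2 = \hat{w}_2/\epsilon$. The helper name `correlated_weights` and the test values are illustrative choices.

```python
import numpy as np

def correlated_weights(w_hat, eps):
    """Map weights on (x1, x2) to weights on (x~1, x~2); requires eps != 0."""
    w1_hat, w2_hat = w_hat
    w1 = w1_hat - w2_hat * np.sqrt(1.0 - eps**2) / eps
    w2 = w2_hat / eps
    return np.array([w1, w2])

# Evaluate f both ways on a few random points and compare
rng = np.random.default_rng(0)
eps = 0.6
w_hat = np.array([2.0, -3.0])
x = rng.standard_normal((5, 2))
x_tilde = np.column_stack([x[:, 0],
                           np.sqrt(1.0 - eps**2) * x[:, 0] + eps * x[:, 1]])
w = correlated_weights(w_hat, eps)
assert np.allclose(x @ w_hat, x_tilde @ w)  # same function, different coordinates
```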
(c) Consider the ‘simple’ target function $f(\mathbf{x}) = x_1 + x_2$. If you perform regression with the correlated inputs $\tilde{\mathbf{x}}$ and regularization constraint $w_1^2 + w_2^2 \le C$, what is the maximum amount of regularization you can use (minimum value of $C$) and still be able to implement the target?
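One way to work this out, assuming the measured inputs are $\tilde{x}_1 = x_1$ and $\tilde{x}_2 = \sqrt{1-\epsilon^2}\,x_1 + \epsilon x_2$ with $\epsilon \neq 0$: the target has $\hat{w}_1 = \hat{w}_2 = 1$, and because $\tilde{x}_1$ and $\tilde{x}_2$ are linearly independent, the weights that implement it in the correlated inputs are unique. The constraint must therefore admit exactly those weights:

```latex
\begin{align*}
w_1 &= 1 - \frac{\sqrt{1-\epsilon^2}}{\epsilon}, \qquad w_2 = \frac{1}{\epsilon}, \\
C_{\min} = w_1^2 + w_2^2
  &= 1 - \frac{2\sqrt{1-\epsilon^2}}{\epsilon} + \frac{1-\epsilon^2}{\epsilon^2} + \frac{1}{\epsilon^2} \\
  &= \frac{2}{\epsilon^2} - \frac{2\sqrt{1-\epsilon^2}}{\epsilon}.
\end{align*}
```

As a sanity check, $\epsilon = 1$ (uncorrelated inputs) gives $w_1 = w_2 = 1$ and $C_{\min} = 2$.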
(d) What happens to the minimum $C$ as the correlation increases ($\epsilon \to 0$)?
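First we need to explain why the correlation increases as $\epsilon \to 0$. Remember that $\tilde{x}_1 = x_1$ and $\tilde{x}_2 = \sqrt{1-\epsilon^2}\,x_1 + \epsilon x_2$. As $\epsilon \to 0$, the $x_1$ term dominates $\tilde{x}_2$ while $\tilde{x}_1 = x_1$, hence $\tilde{x}_1$ and $\tilde{x}_2$ become more correlated. When $\epsilon \to 1$ or $\epsilon \to -1$, the $x_2$ term dominates and $x_1$ has little influence on $\tilde{x}_2$, so $\tilde{x}_1$ and $\tilde{x}_2$ are less correlated. The covariance $\sqrt{1-\epsilon^2}$ also reveals this: it tends to 1 as $\epsilon \to 0$ and to 0 as $|\epsilon| \to 1$.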
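So the minimum $C$ approaches infinity as the correlation increases: the weight on $\tilde{x}_2$ needed to implement the target $x_1 + x_2$ is $1/\epsilon$, so $w_1^2 + w_2^2 \ge 1/\epsilon^2 \to \infty$ as $\epsilon \to 0$.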
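The growth can be tabulated numerically. This sketch assumes the target $x_1 + x_2$ and the measured inputs $\tilde{x}_1 = x_1$, $\tilde{x}_2 = \sqrt{1-\epsilon^2}\,x_1 + \epsilon x_2$, for which the implementing weights are $w_1 = 1 - \sqrt{1-\epsilon^2}/\epsilon$ and $w_2 = 1/\epsilon$. The function name `c_min` is illustrative.

```python
import numpy as np

def c_min(eps):
    """Smallest C with w1^2 + w2^2 <= C that still admits the target x1 + x2."""
    w1 = 1.0 - np.sqrt(1.0 - eps**2) / eps
    w2 = 1.0 / eps
    return w1**2 + w2**2

# Smaller eps means stronger correlation between the measured inputs
for eps in [0.5, 0.1, 0.01]:
    print(f"eps={eps:<5} C_min={c_min(eps):.2f}")
```

For small $\epsilon$ the $2/\epsilon^2$ term dominates, so shrinking $\epsilon$ by a factor of 10 raises the required $C$ by roughly a factor of 100.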
(e) Assuming that there is significant noise in the data, discuss your results in the context of bias and var.
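The more $\tilde{x}_1$ and $\tilde{x}_2$ are correlated, the larger $C$ must be, that is, the larger the hypothesis set required to implement the target; hence the lower the bias and the higher the variance. The higher the variance, the more susceptible the model is to noise. Conversely, keeping $C$ small to control the variance excludes the target from the hypothesis set and raises the bias.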
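To make the noise sensitivity concrete, here is a rough sketch under assumptions not stated in the problem: unregularized least squares, noise variance $\sigma^2$, and a large sample of size $N$, where the covariance of the fitted weights is approximately $(\sigma^2/N)\,\Sigma^{-1}$ with $\Sigma$ the input covariance. It also assumes $\operatorname{Var}(\tilde{x}_1) = \operatorname{Var}(\tilde{x}_2) = 1$ and $\operatorname{Cov}(\tilde{x}_1, \tilde{x}_2) = \sqrt{1-\epsilon^2}$. The name `weight_variance_factor` is illustrative.

```python
import numpy as np

def weight_variance_factor(eps):
    """Diagonal of the inverse input covariance.

    In the large-sample limit, the variance of each unregularized
    least-squares weight is roughly (sigma^2 / N) times this factor.
    """
    rho = np.sqrt(1.0 - eps**2)  # assumed Cov(x~1, x~2)
    cov = np.array([[1.0, rho],
                    [rho, 1.0]])
    return np.diag(np.linalg.inv(cov))

for eps in [0.5, 0.1, 0.01]:
    print(eps, weight_variance_factor(eps))
```

The factor equals $1/\epsilon^2$, so the weights become far more sensitive to noise as the inputs become more correlated, which is the variance side of the trade-off described above.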