# Learning From Data – A Short Course: Exercise 4.7

Page 139:

Fix $g^-$ (learned from $\mathcal{D}_{\text{train}}$) and define $\sigma^2_{\text{val}} = \text{Var}_{\mathcal{D}_{\text{val}}}[E_{\text{val}}(g^-)]$. We consider how $\sigma^2_{\text{val}}$ depends on $K$. Let

$$\sigma^2(g^-) = \text{Var}_{\mathbf{x}}[e(g^-(\mathbf{x}), y)]$$

be the pointwise variance in the out-of-sample error of $g^-$.

(a) Show that $\sigma^2_{\text{val}} = \frac{1}{K}\sigma^2(g^-)$.

We have:

$$E_{\text{val}}(g^-) = \frac{1}{K}\sum_{n=1}^{K} e(g^-(\mathbf{x}_n), y_n)$$

Because the validation points $(\mathbf{x}_n, y_n)$ are independent of each other, the variance of the sum is the sum of the variances, so:

$$\sigma^2_{\text{val}} = \frac{1}{K^2}\sum_{n=1}^{K} \text{Var}_{\mathbf{x}_n}\!\left[e(g^-(\mathbf{x}_n), y_n)\right] = \frac{1}{K^2} \cdot K\,\sigma^2(g^-) = \frac{1}{K}\sigma^2(g^-)$$

Reference: properties of variance and covariance.
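As a quick numerical sanity check (my own sketch, not part of the book's solution), we can simulate the claim of (a): if each pointwise error is an i.i.d. draw, the variance of the $K$-point average should shrink like $1/K$. The helper `var_of_val_error` below is hypothetical, and uses Uniform(0, 1) pointwise errors, whose variance is $1/12$.

```python
import random

# Monte Carlo check that Var[E_val] = sigma^2(g^-) / K when the pointwise
# errors are i.i.d.; here each error e_n is drawn Uniform(0, 1), so the
# pointwise variance is 1/12 and Var[E_val] should be close to (1/12)/K.
random.seed(0)

def var_of_val_error(K, trials=20000):
    """Empirical variance of the K-point validation error over many draws."""
    vals = [sum(random.random() for _ in range(K)) / K for _ in range(trials)]
    mean = sum(vals) / trials
    return sum((v - mean) ** 2 for v in vals) / trials

pointwise_var = 1.0 / 12.0  # Var of Uniform(0, 1)
for K in (5, 20):
    print(K, var_of_val_error(K), pointwise_var / K)
```

The two printed columns (empirical vs. $\sigma^2(g^-)/K$) should agree closely for each $K$.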

(b) In a classification problem, where $e(g^-(\mathbf{x}), y) = [\![g^-(\mathbf{x}) \neq y]\!]$, express $\sigma^2_{\text{val}}$ in terms of $P[g^-(\mathbf{x}) \neq y]$.

First we observe that $e$ only takes the values $0$ and $1$, so $e^2 = e$. Writing $P = P[g^-(\mathbf{x}) \neq y]$:

$$\mathbb{E}[e] = \mathbb{E}[e^2] = P[g^-(\mathbf{x}) \neq y] = P$$

and:

$$\text{Var}_{\mathbf{x}}[e] = \mathbb{E}[e^2] - (\mathbb{E}[e])^2 = P - P^2 = P(1-P)$$

Hence:

$$\sigma^2_{\text{val}} = \frac{1}{K}P(1-P)$$
(c) Show that for any $g^-$ in a classification problem, $\sigma^2_{\text{val}} \leq \frac{1}{4K}$.

First we consider the function $f(P) = P(1-P)$. We have $f'(P) = 1 - 2P$ and $f''(P) = -2 < 0$. So $f$ reaches its maximum value at $P = \frac{1}{2}$, and that value is $f(\frac{1}{2}) = \frac{1}{4}$, which also means $P(1-P) \leq \frac{1}{4}$ for all $P$.

Combining this with the result of (b), we have:

$$\sigma^2_{\text{val}} = \frac{1}{K}P(1-P) \leq \frac{1}{4K}$$
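The result for (b) and (c) can also be checked numerically (again my own sketch, not the book's): with 0/1 errors and misclassification probability $P$, the empirical variance of $E_{\text{val}}$ should match $P(1-P)/K$ and never exceed $1/(4K)$. The helper `empirical_var` is hypothetical.

```python
import random

# Sanity check: for 0/1 classification errors with misclassification
# probability P, Var[E_val] should equal P(1 - P) / K, maximized at
# P = 1/2 where it attains the uniform bound 1 / (4K).
random.seed(1)

def empirical_var(P, K, trials=20000):
    """Empirical variance of E_val when each point is misclassified w.p. P."""
    vals = [sum(1 if random.random() < P else 0 for _ in range(K)) / K
            for _ in range(trials)]
    mean = sum(vals) / trials
    return sum((v - mean) ** 2 for v in vals) / trials

K = 25
for P in (0.1, 0.5, 0.9):
    print(P, empirical_var(P, K), P * (1 - P) / K, 1 / (4 * K))
```

Note the symmetry: $P = 0.1$ and $P = 0.9$ give the same variance, and $P = 0.5$ attains the bound $1/(4K)$.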

(d) Is there a uniform upper bound for $\text{Var}[E_{\text{val}}(g^-)]$ similar to (c) in the case of regression with squared error $e(g^-(\mathbf{x}), y) = (g^-(\mathbf{x}) - y)^2$?

No. Because the squared error is unbounded, its variance cannot be uniformly bounded either. However, the result of (a) suggests that a large $K$ may help reduce the variance.

(e) For regression with squared error, if we train using fewer points (smaller $N - K$) to get $g^-$, do you expect $\sigma^2_{\text{val}}$ to be higher or lower?

We have:

$$\sigma^2_{\text{val}} = \frac{1}{K}\,\text{Var}_{\mathbf{x}}\!\left[(g^-(\mathbf{x}) - y)^2\right]$$

As we use fewer points to train, $g^-$ gets worse, so the squared error $(g^-(\mathbf{x}) - y)^2$ tends to take higher values. For continuous, non-negative random variables, a higher mean often implies a higher variance, so $\text{Var}_{\mathbf{x}}[(g^-(\mathbf{x}) - y)^2]$ often gets higher, which means $\sigma^2_{\text{val}}$ is expected to be higher.
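One way to make the "higher mean often implies higher variance" hint concrete (my own illustration, not the book's): model the residual $g^-(\mathbf{x}) - y$ as Gaussian noise whose scale $s$ grows when $g^-$ is trained on fewer points. For $e = \text{residual}^2$ we get $\mathbb{E}[e] = s^2$ and $\text{Var}[e] = 2s^4$, so a worse $g^-$ raises both the mean and, much faster, the variance of the squared error. The helper name is hypothetical.

```python
import random

# If the residual g^-(x) - y ~ Normal(0, s), then e = residual^2 has
# mean s^2 and variance 2 s^4: a worse fit (larger s) inflates the
# variance of the squared error quartically.
random.seed(2)

def mean_and_var_of_sq_error(s, n=200000):
    """Empirical mean and variance of the squared residual at noise scale s."""
    es = [random.gauss(0.0, s) ** 2 for _ in range(n)]
    mean = sum(es) / n
    var = sum((e - mean) ** 2 for e in es) / n
    return mean, var

for s in (1.0, 2.0):
    m, v = mean_and_var_of_sq_error(s)
    print(s, m, v)  # mean ~ s^2, variance ~ 2 * s^4
```

Doubling $s$ quadruples the mean squared error but multiplies its variance by 16, which is why $\sigma^2_{\text{val}}$ is expected to rise when $g^-$ degrades.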

(f) Conclude that increasing the size of the validation set can result in a better or a worse estimate of $E_{\text{out}}$.

See the answers for (d) and (e): by (a), a larger $K$ shrinks $\sigma^2_{\text{val}}$ through the $\frac{1}{K}$ factor, but a larger $K$ leaves fewer training points $N - K$, which makes $g^-$ worse and increases the pointwise variance $\sigma^2(g^-)$. The net effect can therefore go either way. This question is also discussed on the same page of the book.
