Learning From Data – A Short Course: Exercise 7.13

Page 27


Suppose you run gradient descent for 1000 iterations. You have 500 examples in D, and you use 450 for D_{train} and 50 for D_{val}. You output the weight from iteration 50, with E_{val}(w_{50}) = 0.05 and E_{train}(w_{50}) = 0.04.

(a) Is E_{val}(w_{50}) = 0.05 an unbiased estimate of E_{out}(w_{50})?


(b) Use the Hoeffding bound to get a bound for E_{out} using E_{val} plus an error bar. Your bound should hold with probability at least 0.1.

If the problem is classification then please refer to (2.1) formula on Page 40 [Section 2.1] and then replace M = 1000, N = 50,  \delta = 1 - 0.1 = 0.9. So the bound will be: \pm \sqrt{\frac{1}{2N}\ln{\frac{2M}{\delta}}} \approx \pm 0.27760156655.

(c) Can you bound E_{out} using E_{train} or do you need more information?

As cardinality of the hypothesis set at 1000 iterations is infinite, the (2.1) formula does not longer hold even the problem is binary classification. In the case of binary classification, we will use the (2.12) formula on Page 53 [Section 2.1]. In other cases, I guess we will use some similar formula involving d_{VC}. In either cases, I guess that we lack the d_{VC} information to actually be able to compute the bound.



Leave a Reply

Your email address will not be published. Required fields are marked *