Page 139: Fix g (learned from D) and define E_out(g) = E_x[e(g(x), f(x))]. We consider how the test-set estimate E_test(g) = (1/N) Σ_{n=1}^{N} e(g(x_n), f(x_n)) depends on N. Let σ² = Var_x[e(g(x), f(x))] be the pointwise variance in the out-of-sample error of g. (a) Show that Var[E_test] = σ²/N. We have: Var[E_test] = Var[(1/N) Σ_n e(g(x_n), f(x_n))] = (1/N²) Σ_n Var[e(g(x_n), f(x_n))] = σ²/N. Because each test point is independent of the others, the variance of the sum is the sum of the variances. Reference: properties of variance and covariance. (b) In a classification problem, where e(g(x), f(x)) = [g(x) ≠ f(x)], express [...]
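A quick numerical sanity check of part (a), using my own toy setup (not from the text): in a classification problem the pointwise error is Bernoulli(p) with p = E_out, so σ² = p(1 − p) and the variance of E_test over many independent test sets should be σ²/N.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.3          # assumed E_out: probability that g misclassifies a point
N = 100          # test-set size
trials = 200_000

# Each test point contributes a Bernoulli(p) pointwise error e in {0, 1},
# so sigma^2 = Var[e] = p * (1 - p); for the average over N independent
# points, Var[E_test] should be sigma^2 / N.
errors = rng.random((trials, N)) < p      # e_n = 1 with probability p
E_test = errors.mean(axis=1)              # E_test for each simulated test set

empirical = E_test.var()
predicted = p * (1 - p) / N
print(empirical, predicted)               # the two should nearly agree
```

The agreement relies only on the independence of the test points, which is exactly the step used in the derivation above.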
Page 125: Deterministic noise depends on H, as some models approximate f better than others. (a) Assume H is fixed and we increase the complexity of f. Will deterministic noise in general go up or down? Is there a higher or lower tendency to overfit? Deterministic noise will in general go up, because it becomes harder for [...]
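As a rough illustration of this answer, the toy experiment below fixes H at degree-2 polynomials and makes the target f increasingly wiggly; the residual error of the best in-class fit (the deterministic noise) grows. The sinusoidal targets, frequencies, and grid are my own choices, not from the text.

```python
import numpy as np

# Fixed hypothesis set H: polynomials of degree 2 (fit by least squares).
# We make the target f more complex (higher-frequency sinusoid) and measure
# how well the *best* h in H can track it; the leftover error is the
# deterministic noise.
x = np.linspace(-1, 1, 400)
det_noise = []
for freq in [1, 2, 4, 8]:          # increasing target complexity
    f = np.sin(freq * np.pi * x)   # target function f
    coeffs = np.polyfit(x, f, deg=2)       # best degree-2 fit in H
    h = np.polyval(coeffs, x)
    det_noise.append(np.mean((f - h) ** 2))

print(det_noise)   # grows as f gets more complex than H can capture
```

With H held fixed, each step up in the complexity of f leaves more of f outside what H can express, which is the "deterministic noise goes up" conclusion.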
Because the devil never stops laughing at us. Because trust is a luxury.
Page 74: When there is noise in the data, y = f(x) + ε, where E[ε] = 0. If ε is a zero-mean noise random variable with variance σ², show that the bias-variance decomposition becomes E_D,ε[(g^(D)(x) − y(x))²] = σ² + bias + var. We have: E_D,ε[(g^(D)(x) − y(x))²] = E_D,ε[(g^(D)(x) − f(x) − ε)²] = E_D[(g^(D)(x) − f(x))²] − 2 E_D[g^(D)(x) − f(x)] E_ε[ε] + E_ε[ε²]. We split the above expression into sub-expressions, in which: the cross term vanishes because ε is independent of D and E_ε[ε] = 0; the last term is E_ε[ε²] = σ²; and the first term decomposes into bias + var exactly as in the noiseless case. Together, we can derive that: E_D,ε[(g^(D)(x) − y(x))²] = σ² + bias + var.
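A Monte Carlo check of this decomposition, under assumptions of my own choosing (target f(x) = x², hypothesis set H = lines, Gaussian noise): simulate many datasets D, fit a line to each, and compare the expected squared error against noisy targets with σ² + bias + var.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 0.5                      # noise std; sigma^2 is the noise term
f = lambda x: x ** 2             # assumed target function
xs = np.linspace(-1, 1, 50)      # test points to average over
n_train, n_datasets = 10, 4000

# Learn a line from each noisy dataset D: y = f(x) + eps.
preds = np.empty((n_datasets, xs.size))
for d in range(n_datasets):
    x_tr = rng.uniform(-1, 1, n_train)
    y_tr = f(x_tr) + rng.normal(0, sigma, n_train)
    a, b = np.polyfit(x_tr, y_tr, deg=1)
    preds[d] = a * xs + b

g_bar = preds.mean(axis=0)                       # average hypothesis g_bar(x)
bias = np.mean((g_bar - f(xs)) ** 2)
var = np.mean(preds.var(axis=0))

# Expected squared error against fresh noisy targets y = f(x) + eps:
y_test = f(xs) + rng.normal(0, sigma, (n_datasets, xs.size))
total = np.mean((preds - y_test) ** 2)

print(total, sigma ** 2 + bias + var)            # should nearly agree
```

The two printed numbers match up to Monte Carlo error, mirroring how the cross term averages out to zero in the derivation.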
Page 33: Prove that the PLA eventually converges to a linear separator for separable data. The following steps will guide you through the proof. Let w* be an optimal set of weights (one which separates the data). The essential idea in this proof is to show that the PLA weights w(t) get "more aligned" with w* with every [...]
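The convergence this proof guarantees can be observed directly. Below is a minimal PLA sketch on separable toy data of my own construction, where a hidden weight vector plays the role of w*; the loop is guaranteed to terminate by the theorem.

```python
import numpy as np

rng = np.random.default_rng(2)

# Separable toy data: labels come from a hidden weight vector w_star,
# which plays the role of w* in the proof.
N, d = 100, 2
X = np.hstack([np.ones((N, 1)), rng.uniform(-1, 1, (N, d))])  # x0 = 1 (bias)
w_star = np.array([0.1, 1.0, -1.0])
y = np.sign(X @ w_star)

# PLA: repeatedly pick a misclassified point and update w += y_n * x_n.
w = np.zeros(3)
updates = 0
while True:
    mis = np.flatnonzero(np.sign(X @ w) != y)
    if mis.size == 0:
        break                      # converged: w now separates the data
    n = rng.choice(mis)
    w += y[n] * X[n]
    updates += 1

print(updates, np.all(np.sign(X @ w) == y))
```

The loop halts after finitely many updates precisely because each update increases the alignment of w(t) with w* faster than it grows the norm of w(t).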
Page 8: The weight update rule in (1.3) has the nice interpretation that it moves w(t) in the direction of classifying x(t) correctly. (a) Show that y(t) w^T(t) x(t) < 0. [Hint: x(t) is misclassified by w(t).] Because x(t) is misclassified by w(t), sign(w^T(t) x(t)) ≠ y(t). Case y(t) = +1: then w^T(t) x(t) < 0. Hence: y(t) w^T(t) x(t) < 0. Case y(t) = −1: then w^T(t) x(t) > 0. Hence: y(t) w^T(t) x(t) < 0. So: y(t) w^T(t) x(t) < 0 in both cases. (b) Show that y(t) w^T(t+1) x(t) > y(t) w^T(t) x(t). [Hint: Use (1.3).] My solution for (b) is wrong. [...]
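Both claims can be checked numerically. In the sketch below (a made-up weight vector and point, not from the text), the point is forced to be misclassified, so (a) holds; applying the update w(t+1) = w(t) + y(t) x(t) then increases y(t) w^T x(t) by ||x(t)||², which is claim (b).

```python
import numpy as np

rng = np.random.default_rng(3)

# A made-up misclassified example: w currently gets x wrong.
w = np.array([0.5, -1.0, 0.3])
x = rng.uniform(-1, 1, 3)
y = -np.sign(w @ x)                # force misclassification

# (a) a misclassified point has y * w^T x < 0
assert y * (w @ x) < 0

# (b) after the update w(t+1) = w(t) + y * x(t), the quantity y * w^T x
#     strictly increases, since it gains y^2 * ||x||^2 = ||x||^2 > 0
w_new = w + y * x
assert y * (w_new @ x) > y * (w @ x)
print(y * (w @ x), y * (w_new @ x))
```

This is the sense in which the update "moves w in the direction of classifying x(t) correctly": the margin quantity y(t) w^T x(t) strictly increases at every update.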