Page 103: We know that in the Euclidean plane, the perceptron model $\mathcal{H}$ cannot implement all 16 dichotomies on 4 points. That is, $m_{\mathcal{H}}(4) < 16$. Take the feature transform $\Phi$ in (3.12). (a) Show that $m_{\mathcal{H}_\Phi}(3) = 8$. We have proved (in Exercise 2.4) that the hypothesis set of the perceptron model in the Euclidean plane has $d_{vc} = 3$, and by the definition of [...]
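Page 98: (a) Define an error for a single data point $(\mathbf{x}_n, y_n)$ to be $e_n(\mathbf{w}) = \max(0, -y_n \mathbf{w}^T \mathbf{x}_n)$. Argue that PLA can be viewed as SGD on $e_n$ with learning rate $\eta = 1$. When $y_n \mathbf{w}^T \mathbf{x}_n > 0$, the sign of $\mathbf{w}^T \mathbf{x}_n$ agrees with $y_n$ (no error at that point): $e_n(\mathbf{w}) = 0$. When $y_n \mathbf{w}^T \mathbf{x}_n < 0$, the two disagree (that point is misclassified): $e_n(\mathbf{w}) = -y_n \mathbf{w}^T \mathbf{x}_n$. Hence: When there is no error at the [...]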
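The argument can be checked numerically with a minimal sketch. It assumes the book's pointwise error $e_n(\mathbf{w}) = \max(0, -y_n \mathbf{w}^T \mathbf{x}_n)$; the function names are illustrative, and at the kink $y_n \mathbf{w}^T \mathbf{x}_n = 0$ the subgradient is chosen to match the PLA update, which is an assumption of this sketch.

```python
def dot(w, x):
    """Inner product of two equal-length lists."""
    return sum(wi * xi for wi, xi in zip(w, x))

def pointwise_error(w, x, y):
    """Assumed error measure: e_n(w) = max(0, -y * w.x)."""
    return max(0.0, -y * dot(w, x))

def error_gradient(w, x, y):
    """(Sub)gradient of e_n: -y*x on a misclassified point, zero otherwise.
    At y * w.x == 0 we pick -y*x so the step matches PLA (an assumption)."""
    if y * dot(w, x) <= 0:
        return [-y * xi for xi in x]
    return [0.0] * len(x)

def sgd_step(w, x, y, eta=1.0):
    """One SGD step on e_n: w <- w - eta * grad e_n(w)."""
    grad = error_gradient(w, x, y)
    return [wi - eta * gi for wi, gi in zip(w, grad)]

def pla_step(w, x, y):
    """One PLA step: w <- w + y*x if the point is misclassified."""
    if y * dot(w, x) <= 0:
        return [wi + y * xi for wi, xi in zip(w, x)]
    return list(w)

w = [0.0, 1.0, -1.0]
x = [1.0, 2.0, 3.0]
print(sgd_step(w, x, 1), pla_step(w, x, 1))    # misclassified point: both update
print(sgd_step(w, x, -1), pla_step(w, x, -1))  # correct point: both leave w alone
```

With `eta=1.0` the two step functions return the same weights on both a misclassified and a correctly classified point.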
Page 94: The claim that $\hat{\mathbf{v}}$ is the direction which gives the largest decrease in $E_{in}$ only holds for small $\eta$. Why? The $O(\eta^2)$ term is small and ignorable only when $\eta$ is small. P.S. I’m bored. That’s why I’m posting this as a separate post.
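For context, the claim rests on a first-order Taylor expansion. The following is a sketch in the book's notation, where $\mathbf{w}(0)$ for the current weights and the unit vector $\hat{\mathbf{v}}$ are assumed symbols:

```latex
\begin{aligned}
\Delta E_{in} &= E_{in}(\mathbf{w}(0) + \eta\hat{\mathbf{v}}) - E_{in}(\mathbf{w}(0)) \\
              &= \eta\,\nabla E_{in}(\mathbf{w}(0))^{T}\hat{\mathbf{v}} + O(\eta^{2}) \\
              &\ge -\eta\,\lVert\nabla E_{in}(\mathbf{w}(0))\rVert + O(\eta^{2})
\end{aligned}
```

Choosing $\hat{\mathbf{v}} = -\nabla E_{in}(\mathbf{w}(0)) / \lVert\nabla E_{in}(\mathbf{w}(0))\rVert$ minimizes only the linear term. For large $\eta$ the $O(\eta^2)$ remainder can dominate, so that direction is no longer guaranteed to give the largest decrease.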
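Page 87: Consider a noisy target $y = \mathbf{w}^{*T}\mathbf{x} + \epsilon$ for generating the data, where $\epsilon$ is a noise term with zero mean and $\sigma^2$ variance, independently generated for every example $(\mathbf{x}, y)$. The expected error of the best possible linear fit to this target is thus $\sigma^2$. For the data $\mathcal{D} = \{(\mathbf{x}_1, y_1), \ldots, (\mathbf{x}_N, y_N)\}$, denote the noise in $y_n$ as $\epsilon_n$ and let $\boldsymbol{\epsilon} = [\epsilon_1, \epsilon_2, \ldots, \epsilon_N]^T$; assume that [...]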
Page 87, Exercise 3.3: Consider the hat matrix $H = X(X^TX)^{-1}X^T$, where $X$ is an $N$ by $d+1$ matrix, and $X^TX$ is invertible. (a) Show that $H$ is symmetric. (b) Show that $H^K = H$ for any positive integer $K$. (c) If $I$ is the identity matrix of size $N$, show that $(I - H)^K = I - H$ for any positive integer $K$. (d) Show that $\mathrm{trace}(H) = d + 1$, where the trace is [...]
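One way the four parts might be worked, assuming the standard definition $H = X(X^TX)^{-1}X^T$ with $X$ of size $N \times (d+1)$ (a sketch, not necessarily the route the full post takes):

```latex
\begin{aligned}
H^{T} &= \bigl(X(X^{T}X)^{-1}X^{T}\bigr)^{T}
       = X\bigl((X^{T}X)^{T}\bigr)^{-1}X^{T}
       = X(X^{T}X)^{-1}X^{T} = H \\
H^{2} &= X(X^{T}X)^{-1}\,(X^{T}X)\,(X^{T}X)^{-1}X^{T}
       = X(X^{T}X)^{-1}X^{T} = H \\
(I-H)^{2} &= I - 2H + H^{2} = I - H \\
\operatorname{trace}(H) &= \operatorname{trace}\bigl((X^{T}X)^{-1}X^{T}X\bigr)
       = \operatorname{trace}(I_{d+1}) = d + 1
\end{aligned}
```

Parts (b) and (c) for general $K$ follow by induction from the $K = 2$ identities, and the trace step uses $\operatorname{trace}(AB) = \operatorname{trace}(BA)$.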
Page 23, Exercise 1.10: Here is an experiment that illustrates the difference between a single bin and multiple bins. Run a computer simulation for flipping 1,000 fair coins. Flip each coin independently 10 times. Let’s focus on 3 coins as follows: $c_1$ is the first coin flipped; $c_{rand}$ is a coin you choose at random; $c_{min}$ is [...]
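A minimal sketch of the simulation the exercise asks for. The excerpt is cut off before the third coin is defined, so the sketch assumes it is the coin with the lowest frequency of heads; the function and variable names are illustrative.

```python
import random

def run_experiment(num_coins=1000, num_flips=10, rng=random):
    """Flip every coin num_flips times; return (nu_1, nu_rand, nu_min)."""
    # Fraction of heads per coin; each flip is heads with probability 0.5.
    fractions = [
        sum(rng.random() < 0.5 for _ in range(num_flips)) / num_flips
        for _ in range(num_coins)
    ]
    nu_1 = fractions[0]              # first coin flipped
    nu_rand = rng.choice(fractions)  # a coin chosen at random
    nu_min = min(fractions)          # assumed: coin with the fewest heads
    return nu_1, nu_rand, nu_min

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    runs = [run_experiment(rng=rng) for _ in range(200)]
    # Average each of the three fractions over all runs.
    for name, values in zip(("nu_1", "nu_rand", "nu_min"), zip(*runs)):
        print(name, sum(values) / len(values))
```

Averaged over many runs, the first two fractions should hover near 0.5, while the minimum sits far below it, which is the single-bin versus multiple-bin contrast the exercise points at.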
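Page 19, Exercise 1.9. If $\mu = 0.9$, use the Hoeffding Inequality to bound the probability that a sample of 10 marbles will have $\nu \le 0.1$ and compare the answer to the previous exercise. With $\mu = 0.9$, $\nu \le 0.1$ implies $|\nu - \mu| \ge 0.8$, so $\epsilon$ means any number slightly less than $0.8$ (reference). Hence: $P[\nu \le 0.1] \le P[|\nu - \mu| > \epsilon] \le 2e^{-2\epsilon^2 N}$. Letting $\epsilon$ approach $0.8$: $2e^{-2(0.8)^2(10)} = 2e^{-12.8} \approx 5.52 \times 10^{-6}$. We observe that the exact probability from Exercise 1.8 respects the bound: $9.1 \times 10^{-9} \le 5.52 \times 10^{-6}$ is true.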
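The bound can be evaluated directly. This sketch assumes the exercise's values $\mu = 0.9$, $N = 10$ and threshold $\nu \le 0.1$, so that $\epsilon$ is taken at its limiting value $0.8$; the function name is illustrative.

```python
from math import exp

def hoeffding_bound(epsilon, n):
    """Right-hand side of the Hoeffding Inequality: 2 * exp(-2 * eps^2 * n)."""
    return 2 * exp(-2 * epsilon**2 * n)

# Assumed values: |nu - mu| >= 0.8 when mu = 0.9 and nu <= 0.1; N = 10.
print(hoeffding_bound(0.8, 10))  # roughly 5.5e-6
```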
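Page 19, Exercise 1.8. If $\mu = 0.9$, what is the probability that a sample of 10 marbles will have $\nu \le 0.1$? Here we have N = 10 (a sample of 10 marbles), so $\nu \le 0.1$ means at most 1 red marble in the sample. Hence: $P[\nu \le 0.1] = \binom{10}{0}(0.1)^{10} + \binom{10}{1}(0.9)(0.1)^{9} = 9.1 \times 10^{-9}$.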
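The exact value is a binomial tail and can be computed in a few lines. The sketch assumes $\mu = 0.9$ is the probability of drawing a red marble and that $\nu \le 0.1$ with $N = 10$ means at most one red marble; the function name is illustrative.

```python
from math import comb

def prob_at_most(k_max, n, p):
    """P[at most k_max successes in n independent trials, success prob p]."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_max + 1))

# Assumed values: mu = 0.9, N = 10, at most 1 red marble in the sample.
print(prob_at_most(1, 10, 0.9))  # roughly 9.1e-9
```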
Page 69, Problem 2.5. Prove by induction that $\sum_{i=0}^{D} \binom{N}{i} \le N^D + 1$, hence $m_{\mathcal{H}}(N) \le N^{d_{vc}} + 1$. Base cases: Induction step for : We will prove later, for now, we will use its result: : : So follows. Prove by induction: Base case: Induction step for : [...]
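Before attempting the induction, the inequality can be sanity-checked numerically. The sketch assumes the statement from the book, $\sum_{i=0}^{D} \binom{N}{i} \le N^D + 1$, and only tests a small grid of values, so it is a check rather than a proof; the names are illustrative.

```python
from math import comb

def binomial_sum(n, d):
    """Left-hand side of the assumed inequality: sum of C(n, i) for i = 0..d."""
    return sum(comb(n, i) for i in range(d + 1))

# Collect every (N, D) on a small grid where the bound would fail.
violations = [
    (n, d)
    for n in range(1, 31)
    for d in range(0, 11)
    if binomial_sum(n, d) > n**d + 1
]
print(violations)
```

An empty list means no counterexample was found on the grid.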
This book should be read alongside its corresponding online course. However, you should not watch the online course on its own. The Bin Model PAGE NOTE Page 19: Reference: Malik Magdon-Ismail. Page 31: y here is a random variable. Reference: Malik Magdon-Ismail, The Elements of Statistical Learning page 28. Page 32: “we will assume the target to be a [...]