Page 103: We know that in the Euclidean plane, the perceptron model cannot implement all 16 dichotomies on 4 points. That is . Take the feature transform in (3.12). (a) Show that . We have proved (in Exercise 2.4) that the hypothesis set of perceptron model in Euclidean plane has , and by the definition of [...]
Page 98: (a) Define an error for a single data point to be Argue that PLA can be viewed as SGD on with learning rate . when means that agrees with (no error at that point): . when and disagrees (that point is misclassified): . Hence: When there is no error at the [...]
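The argument above can be checked numerically. The sketch below assumes the textbook's pointwise error e_n(w) = max(0, -y_n wᵀx_n) and learning rate η = 1, both of which are missing from the excerpt. The toy data, the function names, and the choice to count a point exactly on the boundary as misclassified are illustrative assumptions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def grad_pointwise_error(w, x, y):
    """(Sub)gradient of the assumed error e_n(w) = max(0, -y * w.x)."""
    if y * dot(w, x) > 0:
        # Correctly classified: e_n is 0 near w, so the gradient is zero.
        return [0.0] * len(w)
    # Misclassified (boundary counted as an error): gradient is -y * x.
    return [-y * xi for xi in x]

def sgd_step(w, x, y, eta=1.0):
    """One SGD step on e_n: w <- w - eta * grad e_n(w)."""
    g = grad_pointwise_error(w, x, y)
    return [wi - eta * gi for wi, gi in zip(w, g)]

def pla_step(w, x, y):
    """One PLA step: w <- w + y * x on a misclassified point, else unchanged."""
    if y * dot(w, x) > 0:
        return list(w)
    return [wi + y * xi for wi, xi in zip(w, x)]

# Hypothetical toy data; the first coordinate of each x is the bias term 1.
data = [
    ([1.0, 2.0, 1.0], 1),
    ([1.0, -1.0, -2.0], -1),
    ([1.0, 0.5, -1.5], -1),
    ([1.0, 1.5, 2.5], 1),
]

w_sgd = [0.0, 0.0, 0.0]
w_pla = [0.0, 0.0, 0.0]
for _ in range(10):
    for x, y in data:
        w_sgd = sgd_step(w_sgd, x, y)
        w_pla = pla_step(w_pla, x, y)

# Both update rules visit the same points and produce the same weights.
same_weights = (w_sgd == w_pla)
print(same_weights)
```

With η = 1 the two update rules are the same expression, so the weight vectors agree after every step, not just at the end.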
Page 94: The claim that v̂ is the direction which gives largest decrease in E_in only holds for small η. Why? The O(η²) term is small and ignorable only when η is small. P.S. I’m bored. That’s why I’m posting this as a separate post.
Page 87: Consider a noisy target y = w*ᵀx + ε for generating the data, where ε is a noise term with zero mean and σ² variance, independently generated for every example (x, y). The expected error of the best possible linear fit to this target is thus σ². For the data D = {(x₁, y₁), ..., (x_N, y_N)}, denote the noise in y_n as ε_n and let ε = [ε₁, ε₂, ..., ε_N]ᵀ; assume that [...]
I had this project done (the code part) in roughly 10 days. I do not think that I should upload the source code on here because it can be my final project and may help me graduate from university, LOL. My project is a mix of Computational Networks and Fuzzy Knowledge Based System (my teacher [...]
Page 87, Exercise 3.3: Consider the hat matrix H = X(XᵀX)⁻¹Xᵀ, where X is an N by d+1 matrix, and XᵀX is invertible. (a) Show that H is symmetric. (b) Show that Hᴷ = H for any positive integer K. (c) If I is the identity matrix of size N, show that (I − H)ᴷ = I − H for any positive integer K. (d) Show that trace(H) = d + 1, where the trace is [...]
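The four properties in this exercise can be sanity-checked on a small example. The sketch below assumes the standard hat matrix H = X(XᵀX)⁻¹Xᵀ and uses a made-up 4 by 2 matrix X (so N = 4 and d = 1); exact fractions are used so the equalities hold without rounding. The helper names are illustrative, and a numerical check is not a proof.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(r) for r in zip(*a)]

def inverse_2x2(m):
    """Closed-form inverse; enough here because X^T X is 2 x 2 when d = 1."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical data: a bias column of ones plus one input feature.
X = [[Fraction(1), Fraction(v)] for v in (0, 1, 2, 4)]
Xt = transpose(X)
H = matmul(matmul(X, inverse_2x2(matmul(Xt, X))), Xt)

N = len(X)
I = [[Fraction(int(i == j)) for j in range(N)] for i in range(N)]
I_minus_H = [[I[i][j] - H[i][j] for j in range(N)] for i in range(N)]

is_symmetric = (H == transpose(H))                               # part (a)
is_idempotent = (matmul(H, H) == H)                              # part (b), K = 2
complement_idempotent = (matmul(I_minus_H, I_minus_H) == I_minus_H)  # part (c), K = 2
trace_H = sum(H[i][i] for i in range(N))                         # part (d)

print(is_symmetric, is_idempotent, complement_idempotent, trace_H)
```

Checking K = 2 is enough for parts (b) and (c), since H² = H gives Hᴷ = H for every larger K by induction.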
Page 23, Exercise 1.10. Here is an experiment that illustrates the difference between a single bin and multiple bins. Run a computer simulation for flipping 1,000 fair coins. Flip each coin independently 10 times. Let’s focus on 3 coins as follows: c₁ is the first coin flipped; c_rand is a coin you choose at random; c_min is [...]
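A minimal sketch of this simulation using only the standard library. The excerpt is cut off before the third coin is defined; the code assumes it is the coin with the minimum fraction of heads, as in the textbook. The number of repetitions (100), the seed, and all names are illustrative choices.

```python
import random

def run_experiment(rng, n_coins=1000, n_flips=10):
    """Flip n_coins fair coins n_flips times each; return three head fractions."""
    fractions = [
        sum(rng.random() < 0.5 for _ in range(n_flips)) / n_flips
        for _ in range(n_coins)
    ]
    nu_first = fractions[0]          # the first coin flipped
    nu_rand = rng.choice(fractions)  # a coin chosen at random
    nu_min = min(fractions)          # assumption: coin with the fewest heads
    return nu_first, nu_rand, nu_min

rng = random.Random(0)  # fixed seed so the run is repeatable
results = [run_experiment(rng) for _ in range(100)]

# Average each of the three fractions over all repetitions.
avg_first, avg_rand, avg_min = (sum(col) / len(col) for col in zip(*results))
print(avg_first, avg_rand, avg_min)
```

The first and random coins should average close to 0.5, while the minimum coin sits far below it, because it is selected after looking at the data.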
Page 19, Exercise 1.9. If μ = 0.9, use the Hoeffding Inequality to bound the probability that a sample of 10 marbles will have ν ≤ 0.1 and compare the answer to the previous exercise. means any number slightly less than (reference). Hence: If so: We observe that: is true.
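A short sketch of the bound, assuming the textbook's values μ = 0.9 and ν ≤ 0.1, so that |ν − μ| ≥ 0.8. The code plugs in ε = 0.8 as the limiting case of "slightly less than 0.8", and uses the two-sided Hoeffding form 2·exp(−2ε²N). The function name is illustrative.

```python
from math import exp

def hoeffding_bound(eps, n):
    """Two-sided Hoeffding bound on P[|nu - mu| > eps] for a sample of size n."""
    return 2 * exp(-2 * eps ** 2 * n)

# Assumed values: nu <= 0.1 with mu = 0.9 forces |nu - mu| >= 0.8; N = 10.
bound = hoeffding_bound(0.8, 10)
print(bound)  # roughly 5.5e-6
```

The bound is valid but loose: it is several hundred times larger than the exact binomial probability asked for in Exercise 1.8.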
Page 19, Exercise 1.8. If μ = 0.9, what is the probability that a sample of 10 marbles will have ν ≤ 0.1? Here we have N = 10 (a sample of 10 marbles): Hence:
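A minimal sketch of the exact calculation, assuming the textbook's values μ = 0.9 and ν ≤ 0.1. With N = 10, ν ≤ 0.1 means at most 1 red marble in the sample, so the probability is a two-term binomial sum. The function name is illustrative.

```python
from math import comb

def prob_at_most(mu, n, max_red):
    """P[at most max_red red marbles in n independent draws], each red w.p. mu."""
    return sum(
        comb(n, k) * mu ** k * (1 - mu) ** (n - k)
        for k in range(max_red + 1)
    )

# Assumed values: mu = 0.9, N = 10, and nu <= 0.1 means 0 or 1 red marbles.
p = prob_at_most(0.9, 10, 1)
print(p)  # about 9.1e-9
```

The two terms are 0.1¹⁰ for zero red marbles and 10 · 0.9 · 0.1⁹ for exactly one.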