## Learning From Data – A Short Course: Exercise 3.12

Page 103: We know that in the Euclidean plane, the perceptron model $\mathcal{H}$ cannot implement all 16 dichotomies on 4 points. That is, $m_{\mathcal{H}}(4) < 16$. Take the feature transform $\Phi$ in (3.12). (a) Show that $m_{\mathcal{H}_\Phi}(3) = 8$. We have proved (in Exercise 2.4) that the hypothesis set of the perceptron model in the Euclidean plane has $d_{\text{vc}} = 3$, and by the definition of [...]

## Learning From Data – A Short Course: Exercise 3.10

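Page 98: (a) Define an error for a single data point $(\mathbf{x}_n, y_n)$ to be $e_n(\mathbf{w}) = \max(0, -y_n \mathbf{w}^T \mathbf{x}_n)$. Argue that PLA can be viewed as SGD on $e_n$ with learning rate $\eta = 1$. $e_n(\mathbf{w}) = 0$ when $y_n \mathbf{w}^T \mathbf{x}_n > 0$, which means that $\text{sign}(\mathbf{w}^T \mathbf{x}_n)$ agrees with $y_n$ (no error at that point): $\nabla e_n(\mathbf{w}) = \mathbf{0}$. $e_n(\mathbf{w}) = -y_n \mathbf{w}^T \mathbf{x}_n$ when $y_n \mathbf{w}^T \mathbf{x}_n < 0$ and $\text{sign}(\mathbf{w}^T \mathbf{x}_n)$ disagrees with $y_n$ (that point is misclassified): $\nabla e_n(\mathbf{w}) = -y_n \mathbf{x}_n$. Hence: When there is no error at the [...]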
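The claim can be checked numerically. The following is a minimal sketch, not code from the original post: the helper names (`pla_step`, `grad_e`, `sgd_step`) and the three data points are made up for illustration. At the tie $y_n \mathbf{w}^T \mathbf{x}_n = 0$ the error is not differentiable, so the sketch assumes the subgradient $-y_n \mathbf{x}_n$ there, which matches PLA treating $\text{sign}(0)$ as a mistake.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def pla_step(w, x, y):
    """PLA rule: add y*x to w only when (x, y) is misclassified."""
    if y * dot(w, x) <= 0:
        return [wi + y * xi for wi, xi in zip(w, x)]
    return list(w)

def grad_e(w, x, y):
    """(Sub)gradient of e(w) = max(0, -y * w.x).
    Assumption: at the tie y * w.x == 0 we pick the subgradient -y*x."""
    if y * dot(w, x) <= 0:
        return [-y * xi for xi in x]
    return [0.0] * len(w)

def sgd_step(w, x, y, eta=1.0):
    """One SGD step on the single-point error e(w)."""
    g = grad_e(w, x, y)
    return [wi - eta * gi for wi, gi in zip(w, g)]

# Hypothetical toy data; the first coordinate is the bias term x0 = 1.
data = [([1.0, 2.0, 1.0], 1), ([1.0, -1.0, -2.0], -1), ([1.0, 0.5, -1.5], 1)]

w_pla = [0.0, 0.0, 0.0]
w_sgd = [0.0, 0.0, 0.0]
for _ in range(10):
    for x, y in data:
        w_pla = pla_step(w_pla, x, y)
        w_sgd = sgd_step(w_sgd, x, y, eta=1.0)

print("same weights:", w_pla == w_sgd)
```

With $\eta = 1$ the two update rules produce identical weight sequences on this data; any other learning rate only rescales the step taken on a misclassified point.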

## Learning From Data – A Short Course: Exercise 3.8

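Page 94: The claim that $\hat{\mathbf{v}}$ is the direction which gives the largest decrease in $E_{\text{in}}$ only holds for small $\eta$. Why? The $O(\eta^2)$ term in the Taylor expansion of $E_{\text{in}}$ is small and ignorable only when $\eta$ is small. P.S. I'm bored, which is why I'm posting this as a separate post.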
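A one-dimensional toy case makes the point concrete. This is a sketch under assumed choices: the error function $E(w) = w^2$ and the starting point $w_0 = 1$ are illustrative and do not come from the book. It compares the actual change in $E$ with the first-order (linear) prediction for several step sizes.

```python
def E(w):
    return w * w          # toy error surface (assumption for illustration)

def grad(w):
    return 2.0 * w        # its derivative

w0 = 1.0
v_hat = -grad(w0) / abs(grad(w0))   # unit vector along the negative gradient

def change(eta):
    """Return (actual change in E, first-order prediction) for step size eta."""
    actual = E(w0 + eta * v_hat) - E(w0)
    linear = eta * grad(w0) * v_hat
    return actual, linear

for eta in (0.01, 0.1, 1.0, 3.0):
    actual, linear = change(eta)
    print(f"eta={eta}: actual={actual:.4f}, linear={linear:.4f}")
```

For small $\eta$ the two numbers nearly coincide and the step decreases $E$. For $\eta = 3$ the neglected second-order term dominates and the same direction increases $E$.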

## Learning From Data – A Short Course: Exercise 3.4

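Page 87: Consider a noisy target $y = \mathbf{w}^{*T}\mathbf{x} + \epsilon$ for generating the data, where $\epsilon$ is a noise term with zero mean and variance $\sigma^2$, independently generated for every example $(\mathbf{x}, y)$. The expected error of the best possible linear fit to this target is thus $\sigma^2$. For the data $\mathcal{D} = \{(\mathbf{x}_1, y_1), \dots, (\mathbf{x}_N, y_N)\}$, denote the noise in $y_n$ as $\epsilon_n$ and let $\boldsymbol{\epsilon} = [\epsilon_1, \epsilon_2, \dots, \epsilon_N]^T$; assume that [...]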

## Knowledge Representation Project

I had this project done (the code part) in roughly 10 days. I do not think that I should upload the source code on here because it can be my final project and may help me graduate from university, LOL. My project is a mix of Computational Networks and Fuzzy Knowledge Based System (my teacher [...]

## Learning From Data – A Short Course: Exercise 3.3

Page 87, Exercise 3.3: Consider the hat matrix $H = X(X^T X)^{-1} X^T$, where $X$ is an $N$ by $d+1$ matrix, and $X^T X$ is invertible. (a) Show that $H$ is symmetric. (b) Show that $H^K = H$ for any positive integer $K$. (c) If $I$ is the identity matrix of size $N$, show that $(I - H)^K = I - H$ for any positive integer $K$. (d) Show that $\text{trace}(H) = d + 1$, where the trace is [...]
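These properties are easy to sanity-check numerically before proving them. The sketch below is illustrative only: it assumes $d = 1$ so that $X^T X$ is a 2 by 2 matrix with a closed-form inverse, uses a made-up $X$, and sticks to the standard library (a linear algebra package would be the more usual choice). Checking $K = 2$ is enough, since $H^2 = H$ gives every higher power by induction.

```python
def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def inv2(M):
    """Inverse of a 2x2 matrix (enough here because d + 1 = 2)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def close(A, B, tol=1e-9):
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

# Hypothetical data: N = 4 points, d = 1, first column is the bias coordinate.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 4.0]]
N, d = len(X), len(X[0]) - 1

Xt = transpose(X)
H = matmul(matmul(X, inv2(matmul(Xt, X))), Xt)            # H = X (X^T X)^-1 X^T
I = [[1.0 if i == j else 0.0 for j in range(N)] for i in range(N)]
M = [[I[i][j] - H[i][j] for j in range(N)] for i in range(N)]  # I - H
trace_H = sum(H[i][i] for i in range(N))

print("symmetric:       ", close(H, transpose(H)))
print("H^2 == H:        ", close(matmul(H, H), H))
print("(I-H)^2 == I-H:  ", close(matmul(M, M), M))
print("trace(H), d + 1: ", round(trace_H, 9), d + 1)
```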

## Learning From Data – A Short Course: Exercise 1.10

Page 23, Exercise 1.10: Here is an experiment that illustrates the difference between a single bin and multiple bins. Run a computer simulation for flipping 1,000 fair coins. Flip each coin independently 10 times. Let's focus on 3 coins as follows: $c_1$ is the first coin flipped; $c_{\text{rand}}$ is a coin you choose at random; $c_{\min}$ is [...]
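A minimal simulation sketch of this experiment follows. It is not the code from the original post. The excerpt above is cut off before $c_{\min}$ is defined, so the sketch assumes the textbook's definition: the coin with the minimum frequency of heads. The function name `run_experiment`, the seed, and the number of repetitions (200) are arbitrary choices.

```python
import random

def run_experiment(rng, n_coins=1000, n_flips=10):
    """Flip n_coins fair coins n_flips times each.
    Returns the fraction of heads (nu) for c_1, c_rand and c_min."""
    heads = [sum(rng.random() < 0.5 for _ in range(n_flips))
             for _ in range(n_coins)]
    nu_1 = heads[0] / n_flips              # first coin flipped
    nu_rand = rng.choice(heads) / n_flips  # a coin chosen at random
    # Assumption: c_min is the coin with the fewest heads (textbook definition).
    nu_min = min(heads) / n_flips
    return nu_1, nu_rand, nu_min

rng = random.Random(0)  # fixed seed so the run is repeatable
runs = [run_experiment(rng) for _ in range(200)]
avg_nu_1, avg_nu_rand, avg_nu_min = (sum(col) / len(runs) for col in zip(*runs))

print(f"average nu_1    = {avg_nu_1:.3f}")
print(f"average nu_rand = {avg_nu_rand:.3f}")
print(f"average nu_min  = {avg_nu_min:.3f}")
```

The averages for $c_1$ and $c_{\text{rand}}$ sit near $\mu = 0.5$, while $c_{\min}$ is far below it: picking the coin after looking at the data is what breaks the single-bin picture.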

## Learning From Data – A Short Course: Exercise 1.9

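Page 19, Exercise 1.9. If $\mu = 0.9$, use the Hoeffding Inequality to bound the probability that a sample of 10 marbles will have $\nu \le 0.1$ and compare the answer to the previous exercise. With $\mu = 0.9$, $\nu \le 0.1$ gives $|\nu - \mu| \ge 0.8$, so we take $\epsilon = 0.8^-$, where $0.8^-$ means any number slightly less than $0.8$ (reference). Hence: $P[\nu \le 0.1] \le P[|\nu - \mu| > \epsilon] \le 2e^{-2\epsilon^2 N}$. If so: $P[\nu \le 0.1] \le 2e^{-2 \times 0.8^2 \times 10} = 2e^{-12.8} \approx 5.52 \times 10^{-6}$. We observe that: $9.1 \times 10^{-9} \le 5.52 \times 10^{-6}$ is true.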
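The role of "slightly less than $0.8$" can be seen by evaluating the Hoeffding bound $2e^{-2\epsilon^2 N}$ for values of $\epsilon$ approaching $0.8$ from below. This is a small sketch; the function name `hoeffding_bound` and the sample values of $\epsilon$ are illustrative choices.

```python
import math

def hoeffding_bound(eps, n):
    """Right-hand side of Hoeffding's inequality: 2 * exp(-2 * eps^2 * n)."""
    return 2.0 * math.exp(-2.0 * eps ** 2 * n)

N = 10
limit = hoeffding_bound(0.8, N)   # value the bound approaches as eps -> 0.8

# Hoeffding uses a strict inequality |nu - mu| > eps, so eps must stay below 0.8.
eps_values = [0.79, 0.799, 0.7999]
bounds = [hoeffding_bound(eps, N) for eps in eps_values]

for eps, b in zip(eps_values, bounds):
    print(f"eps = {eps}: bound = {b:.3e}")
print(f"limit as eps -> 0.8: {limit:.3e}")
```

The bound tightens as $\epsilon$ approaches $0.8$ and converges to $2e^{-12.8}$, which is still orders of magnitude looser than the exact probability.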

## Learning From Data – A Short Course: Exercise 1.8

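Page 19, Exercise 1.8. If $\mu = 0.9$, what is the probability that a sample of 10 marbles will have $\nu \le 0.1$? Here we have $N = 10$ (a sample of 10 marbles), so $\nu \le 0.1$ means the sample contains at most one red marble: $P[\nu \le 0.1] = P[\nu = 0] + P[\nu = 0.1] = (1 - \mu)^{10} + 10\mu(1 - \mu)^{9}$. Hence: $P[\nu \le 0.1] = 0.1^{10} + 10 \times 0.9 \times 0.1^{9} = 9.1 \times 10^{-9}$.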
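The same number can be obtained from the binomial distribution directly. A short sketch, assuming marbles are drawn independently with probability $\mu$ of being red; the helper name `prob_at_most` is made up for this example.

```python
from math import comb

def prob_at_most(k, n, mu):
    """P[at most k red marbles in a sample of n], each red with probability mu."""
    return sum(comb(n, i) * mu ** i * (1 - mu) ** (n - i) for i in range(k + 1))

# nu <= 0.1 with N = 10 means at most 1 red marble in the sample.
p = prob_at_most(1, 10, 0.9)
print(f"P[nu <= 0.1] = {p:.3e}")
```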