# Learning From Data – A Short Course: Problem 1.3

Page 33:

Prove that the PLA eventually converges to a linear separator for separable data. The following steps will guide you through the proof. Let $\mathbf{w}^*$ be an optimal set of weights (one which separates the data). The essential idea in this proof is to show that the PLA weights $\mathbf{w}(t)$ get “more aligned” with $\mathbf{w}^*$ with every iteration. For simplicity, assume that $\mathbf{w}(0) = \mathbf{0}$.

(a) Let $\rho = \min_{1 \le n \le N} y_n \left(\mathbf{w}^{*T} \mathbf{x}_n\right)$. Show that $\rho > 0$.

Because $\mathbf{w}^*$ separates the data perfectly, for every data point $(\mathbf{x}_n, y_n)$ we always have $y_n \left(\mathbf{w}^{*T} \mathbf{x}_n\right) > 0$ (as argued in Exercise 1.3). Hence $\rho = \min_{1 \le n \le N} y_n \left(\mathbf{w}^{*T} \mathbf{x}_n\right) > 0$.
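As a quick numerical illustration, the sketch below computes $\rho$ on a small hypothetical dataset with a hand-chosen separating $\mathbf{w}^*$ (both the data and `w_star` are assumptions made up for this example):

```python
import numpy as np

# A small toy dataset (hypothetical, for illustration): inputs with a
# bias coordinate prepended, labels in {-1, +1}.
X = np.array([[1.0,  2.0,  1.0],
              [1.0, -1.0, -2.0],
              [1.0,  3.0,  0.5],
              [1.0, -2.0, -1.0]])
y = np.array([1, -1, 1, -1])

# A separating weight vector w* (chosen by hand for this dataset).
w_star = np.array([0.0, 1.0, 1.0])

# rho = min_n y_n * (w*^T x_n); it is positive iff w* separates the data.
rho = np.min(y * (X @ w_star))
print(rho)  # -> 3.0
```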

(b) Show that $\mathbf{w}^T(t)\mathbf{w}^* \ge \mathbf{w}^T(t-1)\mathbf{w}^* + \rho$, and conclude that $\mathbf{w}^T(t)\mathbf{w}^* \ge t\rho$.

[Hint: Use induction.]

We have: $\mathbf{w}(t) = \mathbf{w}(t-1) + y(t-1)\mathbf{x}(t-1)$.

Hence: $\mathbf{w}^T(t)\mathbf{w}^* = \mathbf{w}^T(t-1)\mathbf{w}^* + y(t-1)\mathbf{x}^T(t-1)\mathbf{w}^*$.

We also have: $y(t-1)\mathbf{x}^T(t-1)\mathbf{w}^* \ge \rho$ (by the definition of $\rho$ in part (a)), hence: $\mathbf{w}^T(t)\mathbf{w}^* \ge \mathbf{w}^T(t-1)\mathbf{w}^* + \rho$.

So: $\mathbf{w}^T(t)\mathbf{w}^* \ge t\rho$ follows by induction.

Base case: $t = 0$: $\mathbf{w}^T(0)\mathbf{w}^* = 0 \ge 0 \cdot \rho$. (We have assumed that $\mathbf{w}(0) = \mathbf{0}$.)

Induction step:

Assume: $\mathbf{w}^T(t-1)\mathbf{w}^* \ge (t-1)\rho$.

We observe that: $\mathbf{w}^T(t)\mathbf{w}^* \ge \mathbf{w}^T(t-1)\mathbf{w}^* + \rho \ge (t-1)\rho + \rho = t\rho$.

So the statement follows.
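The inequality from part (b) can be checked numerically. The sketch below runs PLA on hypothetical data labeled by a known `w_star` (a margin is enforced when sampling, an assumption made only so that $\rho$ is not vanishingly small and the run stays short), asserting $\mathbf{w}^T(t)\mathbf{w}^* \ge t\rho$ after every update:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical separable data: sample points, label them with a known w*,
# and keep only points with margin > 0.3 so rho is bounded away from zero.
w_star = np.array([0.0, 1.0, 1.0])
X = np.hstack([np.ones((100, 1)), rng.uniform(-1, 1, size=(100, 2))])
X = X[np.abs(X @ w_star) > 0.3]
y = np.sign(X @ w_star)
rho = np.min(y * (X @ w_star))

# Run PLA from w(0) = 0, checking the part-(b) bound after every update.
w = np.zeros(3)
t = 0
while np.any(np.sign(X @ w) != y):
    i = np.where(np.sign(X @ w) != y)[0][0]
    w = w + y[i] * X[i]  # PLA update on a misclassified point
    t += 1
    assert w @ w_star >= t * rho - 1e-9  # w^T(t) w* >= t*rho
print(t)
```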

(c) Show that $\|\mathbf{w}(t)\|^2 \le \|\mathbf{w}(t-1)\|^2 + \|\mathbf{x}(t-1)\|^2$.

Check slide 13 of Lecture 01. We will use the *Law of Cosines* here: for vectors $\mathbf{a}$ and $\mathbf{b}$, $\|\mathbf{a} + \mathbf{b}\|^2 = \|\mathbf{a}\|^2 + \|\mathbf{b}\|^2 + 2\|\mathbf{a}\|\|\mathbf{b}\|\cos\theta$, where $\theta$ is the angle between them.

For $y(t-1) = +1$, we have:

$$\|\mathbf{w}(t)\|^2 = \|\mathbf{w}(t-1) + \mathbf{x}(t-1)\|^2 = \|\mathbf{w}(t-1)\|^2 + \|\mathbf{x}(t-1)\|^2 + 2\|\mathbf{w}(t-1)\|\|\mathbf{x}(t-1)\|\cos\theta,$$

where $\theta$ is the angle between $\mathbf{w}(t-1)$ and $\mathbf{x}(t-1)$, and $\cos\theta \le 0$. Here is the proof:

Since $\mathbf{x}(t-1)$ was misclassified and $y(t-1) = +1$, we have $\mathbf{w}^T(t-1)\mathbf{x}(t-1) \le 0$, i.e., $\|\mathbf{w}(t-1)\|\|\mathbf{x}(t-1)\|\cos\theta \le 0$, so $\cos\theta \le 0$.

Hence: $\|\mathbf{w}(t)\|^2 \le \|\mathbf{w}(t-1)\|^2 + \|\mathbf{x}(t-1)\|^2$.

The case $y(t-1) = -1$ can be proved similarly (there the update is $\mathbf{w}(t) = \mathbf{w}(t-1) - \mathbf{x}(t-1)$ and $\mathbf{w}^T(t-1)\mathbf{x}(t-1) \ge 0$).

So in both cases:

$$\|\mathbf{w}(t)\|^2 \le \|\mathbf{w}(t-1)\|^2 + \|\mathbf{x}(t-1)\|^2.$$

The statement then follows.

(d) Show by induction that $\|\mathbf{w}(t)\|^2 \le tR^2$, where $R = \max_{1 \le n \le N} \|\mathbf{x}_n\|$.

Base case: $t = 0$: $\|\mathbf{w}(0)\|^2 = 0 \le 0 \cdot R^2$.

Induction step:

Assume: $\|\mathbf{w}(t-1)\|^2 \le (t-1)R^2$.

We observe: $\|\mathbf{w}(t)\|^2 \le \|\mathbf{w}(t-1)\|^2 + \|\mathbf{x}(t-1)\|^2 \le (t-1)R^2 + R^2 = tR^2$ (refer to part (c)).

So the statement follows.
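Both the one-step inequality from part (c) and the cumulative bound from part (d) can be verified along a PLA run. As before, the data and `w_star` below are hypothetical, with a sampling margin enforced only to keep the run short:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical separable data labeled by a known w* (margin enforced).
w_star = np.array([0.0, 1.0, 1.0])
X = np.hstack([np.ones((100, 1)), rng.uniform(-1, 1, size=(100, 2))])
X = X[np.abs(X @ w_star) > 0.3]
y = np.sign(X @ w_star)

R2 = np.max(np.sum(X ** 2, axis=1))  # R^2 = max_n ||x_n||^2

w = np.zeros(3)
t = 0
while np.any(np.sign(X @ w) != y):
    i = np.where(np.sign(X @ w) != y)[0][0]
    w_prev = w
    w = w + y[i] * X[i]  # PLA update on a misclassified point
    t += 1
    # Part (c): one-step growth is at most ||x(t-1)||^2.
    assert w @ w <= w_prev @ w_prev + X[i] @ X[i] + 1e-9
    # Part (d): cumulative bound ||w(t)||^2 <= t * R^2.
    assert w @ w <= t * R2 + 1e-9
print(t)
```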

(e) Using (b) and (d), show that:

$$\frac{\mathbf{w}^T(t)}{\|\mathbf{w}(t)\|}\mathbf{w}^* \ge \sqrt{t}\,\frac{\rho}{R},$$

and hence prove that:

$$t \le \frac{R^2\|\mathbf{w}^*\|^2}{\rho^2}.$$

Special thanks to MaciekLeks for pointing this out.

Referring to (b), we have: $\mathbf{w}^T(t)\mathbf{w}^* \ge t\rho$.

Referring to (d), we have: $\|\mathbf{w}(t)\|^2 \le tR^2$, i.e., $\|\mathbf{w}(t)\| \le \sqrt{t}\,R$.

Hence [1]:

$$\frac{\mathbf{w}^T(t)}{\|\mathbf{w}(t)\|}\mathbf{w}^* \ge \frac{t\rho}{\sqrt{t}\,R} = \sqrt{t}\,\frac{\rho}{R}.$$

We have: $\frac{\mathbf{w}^T(t)}{\|\mathbf{w}(t)\|}\mathbf{w}^* = \|\mathbf{w}^*\|\cos\alpha$, with $\alpha$ being the angle between $\mathbf{w}(t)$ and $\mathbf{w}^*$.

We observe that: $\|\mathbf{w}^*\|\cos\alpha \le \|\mathbf{w}^*\|$, so the left-hand side of [1] is bounded. As $t$ increases, the right-hand side of [1] increases, so the left-hand side must increase too. An increasing left-hand side means an increasing $\cos\alpha$; an increasing $\cos\alpha$ means a decreasing $\alpha$; and a decreasing $\alpha$ means $\mathbf{w}(t)$ gets closer in direction to $\mathbf{w}^*$. So as $t$ increases, $\mathbf{w}(t)$ converges toward $\mathbf{w}^*$.

And the bound for $t$ follows: combining [1] with $\frac{\mathbf{w}^T(t)}{\|\mathbf{w}(t)\|}\mathbf{w}^* \le \|\mathbf{w}^*\|$ gives $\sqrt{t}\,\frac{\rho}{R} \le \|\mathbf{w}^*\|$, i.e.:

$$t \le \frac{R^2\|\mathbf{w}^*\|^2}{\rho^2}.$$
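The final iteration bound can also be checked empirically. The sketch below (again on hypothetical data labeled by an assumed `w_star`, with a sampling margin so $\rho$ is not tiny) runs PLA to convergence and asserts that the number of updates never exceeds $R^2\|\mathbf{w}^*\|^2/\rho^2$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical separable data labeled by a known w* (margin enforced).
w_star = np.array([0.0, 1.0, 1.0])
X = np.hstack([np.ones((200, 1)), rng.uniform(-1, 1, size=(200, 2))])
X = X[np.abs(X @ w_star) > 0.3]
y = np.sign(X @ w_star)

rho = np.min(y * (X @ w_star))
R = np.max(np.linalg.norm(X, axis=1))
bound = (R ** 2) * (w_star @ w_star) / rho ** 2  # part (e) bound on t

# Run PLA to convergence, counting updates.
w = np.zeros(3)
t = 0
while np.any(np.sign(X @ w) != y):
    i = np.where(np.sign(X @ w) != y)[0][0]
    w = w + y[i] * X[i]  # PLA update on a misclassified point
    t += 1

assert t <= bound  # PLA halted within R^2 ||w*||^2 / rho^2 updates
print(t, bound)
```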