# Learning From Data – A Short Course: Problem 8.7

For any $x_1, \dots, x_N$ with $\|x_n\| \le R$ and $N$ even, show that there exists a balanced dichotomy $y_1, \dots, y_N$ that satisfies

$$\sum_{n=1}^{N} y_n = 0,$$ and

$$\left\| \sum_{n=1}^{N} y_n x_n \right\| \le \frac{NR}{\sqrt{N-1}}.$$

(This is the geometric lemma that is needed to bound the VC-dimension of $\rho$-fat hyperplanes by $\lceil R^2 / \rho^2 \rceil + 1$.) The following steps are a guide for the proof.

Suppose you randomly select $N/2$ of the labels $y_1, \dots, y_N$ to be $+1$, the others being $-1$. By construction, $\sum_{n=1}^{N} y_n = 0$.
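This random balanced labeling is easy to simulate. Here is a small sketch of mine (not code from the book): pick $N/2$ indices to receive $+1$, give the rest $-1$, and the labels sum to zero by construction.

```python
# Sketch of the random balanced labeling: choose N/2 indices uniformly at
# random to receive the label +1; the remaining N/2 receive -1.
import random

random.seed(0)          # for reproducibility
N = 8                   # any even N works
plus = set(random.sample(range(N), N // 2))
y = [1 if n in plus else -1 for n in range(N)]
print(sum(y))           # 0, by construction
```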

(a) Show that

$$\mathbb{E}\left[\left\|\sum_{n=1}^{N} y_n x_n\right\|^2\right] = \sum_{n=1}^{N} \sum_{m=1}^{N} x_n^T x_m \, \mathbb{E}[y_n y_m].$$

Expanding the squared norm gives $\left\|\sum_n y_n x_n\right\|^2 = \sum_{n}\sum_{m} y_n y_m \, x_n^T x_m$. The points $x_n$ are fixed and only the labels $y_n$ are random, so by linearity of expectation each $x_n^T x_m$ factors out of the expectation.

(b) When $n = m$, what is $\mathbb{E}[y_n y_m]$? Show that $\mathbb{E}[y_n y_m] = -\frac{1}{N-1}$ when $n \ne m$. Hence show that

$$\mathbb{E}\left[\left\|\sum_{n=1}^{N} y_n x_n\right\|^2\right] = \sum_{n=1}^{N} \|x_n\|^2 - \frac{1}{N-1} \sum_{n \ne m} x_n^T x_m.$$

When $n = m$, $\mathbb{E}[y_n y_m] = \mathbb{E}[y_n^2] = 1$, since $y_n^2 = 1$ always.

When $n \ne m$, we have $N\left(\frac{N}{2} - 1\right)$ ways to choose $y_n$ and $y_m$ such that they share the same sign (first we choose one label out of $N$; second we choose one of the $\frac{N}{2}$ labels that share the same sign as $y_n$, minus $1$ for the case $m = n$). We also have $N(N-1)$ ways to choose $y_n$ and $y_m$ for the product $y_n y_m$ (first we choose one label out of $N$; second we choose one out of $N$, minus $N$ for the cases $m = n$). Hence:

$$\mathbb{E}[y_n y_m] = \frac{N\left(\frac{N}{2}-1\right) - \left(N(N-1) - N\left(\frac{N}{2}-1\right)\right)}{N(N-1)} = \frac{\frac{N}{2} - 1 - \frac{N}{2}}{N-1} = -\frac{1}{N-1}.$$

Combining the two cases with the result of (a) gives the displayed formula.
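As a sanity check of my own (not part of the book's proof), we can enumerate all balanced dichotomies for a small even $N$ and verify that $\mathbb{E}[y_n y_m] = -\frac{1}{N-1}$ for $n \ne m$:

```python
# Enumerate all balanced dichotomies of N points and compute the exact
# average of y_0 * y_1 over them; it should equal -1/(N-1).
from itertools import combinations

N = 6
dichotomies = []
for plus in combinations(range(N), N // 2):
    dichotomies.append([1 if n in plus else -1 for n in range(N)])

expectation = sum(y[0] * y[1] for y in dichotomies) / len(dichotomies)
print(expectation)      # -0.2, i.e. -1/(N-1) for N = 6
```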

(c) Show that

$$\mathbb{E}\left[\left\|\sum_{n=1}^{N} y_n x_n\right\|^2\right] = \frac{N}{N-1} \sum_{n=1}^{N} \|x_n - \bar{x}\|^2,$$

where the average vector $\bar{x} = \frac{1}{N} \sum_{n=1}^{N} x_n$.

One remark about why the expectation acts only on the labels: here the points $x_n$ are fixed and the $y_n$ are assigned at random, independently of the $x_n$. This is unlike supervised learning, where you assume $y$ depends on $x$ (otherwise the whole training process would be meaningless); in this problem the dichotomy is purely random.

It looks like we are still missing some components; let's check back the formula we have to arrive at:

$$\mathbb{E}\left[\left\|\sum_{n=1}^{N} y_n x_n\right\|^2\right] = \frac{N}{N-1} \sum_{n=1}^{N} \|x_n - \bar{x}\|^2.$$

Now let's consider the following missing components:

$$\sum_{n=1}^{N} \|x_n - \bar{x}\|^2 = \sum_{n=1}^{N} \|x_n\|^2 - 2\bar{x}^T \sum_{n=1}^{N} x_n + N\|\bar{x}\|^2 = \sum_{n=1}^{N} \|x_n\|^2 - N\|\bar{x}\|^2,$$

$$\sum_{n \ne m} x_n^T x_m = \left\|\sum_{n=1}^{N} x_n\right\|^2 - \sum_{n=1}^{N} \|x_n\|^2 = N^2 \|\bar{x}\|^2 - \sum_{n=1}^{N} \|x_n\|^2.$$

Substituting the second identity into the result of (b):

$$\mathbb{E}\left[\left\|\sum_{n=1}^{N} y_n x_n\right\|^2\right] = \sum_{n=1}^{N} \|x_n\|^2 - \frac{N^2 \|\bar{x}\|^2 - \sum_{n=1}^{N} \|x_n\|^2}{N-1} = \frac{N}{N-1} \left( \sum_{n=1}^{N} \|x_n\|^2 - N\|\bar{x}\|^2 \right),$$

which by the first identity equals $\frac{N}{N-1} \sum_{n=1}^{N} \|x_n - \bar{x}\|^2$. So yes, we have what we need to show.
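To build confidence in this identity, here is a small numerical check of mine (not the book's): for random points, average $\left\|\sum_n y_n x_n\right\|^2$ exactly over all balanced dichotomies and compare with $\frac{N}{N-1}\sum_n \|x_n - \bar{x}\|^2$.

```python
# Compare E[||sum_n y_n x_n||^2] (exact average over all balanced
# dichotomies) with N/(N-1) * sum_n ||x_n - xbar||^2 for random points.
from itertools import combinations
import random

random.seed(0)
N, d = 6, 3
X = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(N)]
xbar = [sum(x[j] for x in X) / N for j in range(d)]

def sq_norm(v):
    return sum(c * c for c in v)

values = []
for plus in combinations(range(N), N // 2):
    y = [1 if n in plus else -1 for n in range(N)]
    s = [sum(y[n] * X[n][j] for n in range(N)) for j in range(d)]
    values.append(sq_norm(s))

lhs = sum(values) / len(values)
rhs = N / (N - 1) * sum(sq_norm([x[j] - xbar[j] for j in range(d)]) for x in X)
print(abs(lhs - rhs) < 1e-9)   # True: the identity holds
```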

(d) Show that $\mathbb{E}\left[\left\|\sum_{n=1}^{N} y_n x_n\right\|^2\right] \le \frac{N^2 R^2}{N-1}$.

Based on the result from Problem 8.6 (the centroid minimizes the sum of squared distances), we have:

$$\sum_{n=1}^{N} \|x_n - \bar{x}\|^2 \le \sum_{n=1}^{N} \|x_n - z\|^2 \quad \text{for any } z.$$

If we set $z = 0$, then:

$$\sum_{n=1}^{N} \|x_n - \bar{x}\|^2 \le \sum_{n=1}^{N} \|x_n\|^2.$$

We also have $\|x_n\| \le R$ for every $n$, so:

$$\sum_{n=1}^{N} \|x_n\|^2 \le N R^2.$$

Combining these with the result of (c),

$$\mathbb{E}\left[\left\|\sum_{n=1}^{N} y_n x_n\right\|^2\right] = \frac{N}{N-1} \sum_{n=1}^{N} \|x_n - \bar{x}\|^2 \le \frac{N \cdot N R^2}{N-1} = \frac{N^2 R^2}{N-1}.$$

So the statement follows.
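The bound can also be checked numerically; this is an illustration of mine under the assumption that the points are sampled inside the ball of radius $R$:

```python
# Check the bound of (d): for points with ||x_n|| <= R, the exact average of
# ||sum_n y_n x_n||^2 over all balanced dichotomies is at most N^2 R^2/(N-1).
from itertools import combinations
import random

random.seed(1)
N, d, R = 6, 2, 1.0
X = []
for _ in range(N):
    v = [random.gauss(0.0, 1.0) for _ in range(d)]
    norm = sum(c * c for c in v) ** 0.5
    radius = R * random.random()            # a point inside the ball
    X.append([radius * c / norm for c in v])

values = []
for plus in combinations(range(N), N // 2):
    y = [1 if n in plus else -1 for n in range(N)]
    s = [sum(y[n] * X[n][j] for n in range(N)) for j in range(d)]
    values.append(sum(c * c for c in s))

mean_sq = sum(values) / len(values)
bound = N ** 2 * R ** 2 / (N - 1)
print(mean_sq <= bound)   # True
```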

(e) Conclude that

$$\mathbb{E}\left[\left\|\sum_{n=1}^{N} y_n x_n\right\|^2\right] \le \frac{N^2 R^2}{N-1},$$

and hence that

$$P\left[\left\|\sum_{n=1}^{N} y_n x_n\right\| \le \frac{NR}{\sqrt{N-1}}\right] > 0.$$

This means for some choice of $y_1, \dots, y_N$, $\left\|\sum_{n=1}^{N} y_n x_n\right\| \le \frac{NR}{\sqrt{N-1}}$.

Remember that:

$$\mathbb{E}\left[\left\|\sum_{n=1}^{N} y_n x_n\right\|^2\right] \le \frac{N^2 R^2}{N-1}.$$

So it is true that:

$$\min_{y} \left\|\sum_{n=1}^{N} y_n x_n\right\|^2 \le \mathbb{E}\left[\left\|\sum_{n=1}^{N} y_n x_n\right\|^2\right],$$

hence:

$$\min_{y} \left\|\sum_{n=1}^{N} y_n x_n\right\|^2 \le \frac{N^2 R^2}{N-1}.$$

The expected value is a mean value, and what does a mean value imply? It implies that at least one value is greater than or equal to the mean, and at least one value is less than or equal to the mean. Hence, for at least one balanced dichotomy,

$$\left\|\sum_{n=1}^{N} y_n x_n\right\| \le \frac{NR}{\sqrt{N-1}},$$

that is, $P\left[\left\|\sum_{n=1}^{N} y_n x_n\right\| \le \frac{NR}{\sqrt{N-1}}\right] > 0$.

So the statement follows.
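The existence argument can also be seen by brute force; this sketch of mine (with points assumed sampled inside the ball of radius $R$) finds a balanced dichotomy meeting the bound:

```python
# Illustrate (e): the minimum of ||sum_n y_n x_n|| over all balanced
# dichotomies never exceeds N * R / sqrt(N - 1), because the mean of the
# squared norms is bounded by N^2 R^2 / (N - 1).
from itertools import combinations
import random

random.seed(2)
N, d, R = 8, 3, 1.0
X = []
for _ in range(N):
    v = [random.gauss(0.0, 1.0) for _ in range(d)]
    norm = sum(c * c for c in v) ** 0.5
    radius = R * random.random()            # a point inside the ball
    X.append([radius * c / norm for c in v])

best = float("inf")
for plus in combinations(range(N), N // 2):
    y = [1 if n in plus else -1 for n in range(N)]
    s = [sum(y[n] * X[n][j] for n in range(N)) for j in range(d)]
    best = min(best, sum(c * c for c in s) ** 0.5)

print(best <= N * R / (N - 1) ** 0.5)   # True: some dichotomy meets the bound
```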