Suppose that we remove a data point $(\mathbf{x}_n, y_n)$ with $\alpha_n^* = 0$.

(a) Show that the previous optimal solution $\alpha^*$ remains feasible for the new dual problem (8.21) (after removing $\alpha_n^*$).

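$$\min_{\alpha \in \mathbb{R}^{N}} \; \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} y_i y_j \alpha_i \alpha_j \mathbf{x}_i^T \mathbf{x}_j - \sum_{i=1}^{N} \alpha_i \quad \text{subject to} \quad \sum_{i=1}^{N} y_i \alpha_i = 0, \;\; \alpha_i \ge 0$$

is the old dual problem, while

$$\min_{\alpha \in \mathbb{R}^{N-1}} \; \frac{1}{2}\sum_{i \ne n}\sum_{j \ne n} y_i y_j \alpha_i \alpha_j \mathbf{x}_i^T \mathbf{x}_j - \sum_{i \ne n} \alpha_i \quad \text{subject to} \quad \sum_{i \ne n} y_i \alpha_i = 0, \;\; \alpha_i \ge 0$$

is the new dual problem.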

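The multiplier $\alpha_n$ appears exactly where the data point $(\mathbf{x}_n, y_n)$ appears, and vice versa: every term of the objective and of the equality constraint that involves $(\mathbf{x}_n, y_n)$ carries a factor $\alpha_n$. So when we substitute $\alpha_n^* = 0$ into (8.21), there is no difference between setting $\alpha_n = 0$ and removing the data point $(\mathbf{x}_n, y_n)$. In particular, the remaining multipliers satisfy $\sum_{i \ne n} y_i \alpha_i^* = \sum_{i=1}^{N} y_i \alpha_i^* - y_n \alpha_n^* = 0$ and $\alpha_i^* \ge 0$, so they are feasible for the new dual.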

(b) Show that if there is any other feasible solution for the new dual that has a lower objective value than $\alpha^*$, this would contradict the optimality of $\alpha^*$ for the original dual problem.

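Write $\mathcal{L}_{\text{old}}$ and $\mathcal{L}_{\text{new}}$ for the objectives of the old and new dual problems, and $\alpha^*_{-n}$ for $\alpha^*$ with the entry $\alpha_n^*$ removed. If there exists a feasible $\hat{\alpha} \in \mathbb{R}^{N-1}$ for the new dual such that $\mathcal{L}_{\text{new}}(\hat{\alpha}) < \mathcal{L}_{\text{new}}(\alpha^*_{-n})$, then we can define $\tilde{\alpha} \in \mathbb{R}^{N}$ by inserting a zero at position $n$ (so $\tilde{\alpha}_n = 0$ and the other entries are those of $\hat{\alpha}$) such that:

$$\sum_{i=1}^{N} y_i \tilde{\alpha}_i = 0, \qquad \tilde{\alpha}_i \ge 0, \qquad \mathcal{L}_{\text{old}}(\tilde{\alpha}) = \mathcal{L}_{\text{new}}(\hat{\alpha}).$$

Hence $\mathcal{L}_{\text{old}}(\tilde{\alpha}) = \mathcal{L}_{\text{new}}(\hat{\alpha}) < \mathcal{L}_{\text{new}}(\alpha^*_{-n}) = \mathcal{L}_{\text{old}}(\alpha^*)$, where the last equality uses $\alpha_n^* = 0$. This would contradict the optimality of $\alpha^*$ for the original problem.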
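The bookkeeping in parts (a) and (b) can be checked numerically. The following is a minimal sketch on a four-point toy data set chosen here for illustration (it is an assumption, not part of the exercise); the multipliers `alpha_star` satisfy the KKT conditions on this data set, and the helper names `dual_objective` and `is_feasible` are hypothetical.

```python
# Toy data set (an assumption chosen for illustration): four points in the plane.
X = [(0.0, 0.0), (2.0, 2.0), (2.0, 0.0), (3.0, 0.0)]
y = [-1.0, -1.0, 1.0, 1.0]
# These multipliers satisfy the KKT conditions here; the last one is zero.
alpha_star = [0.5, 0.5, 1.0, 0.0]


def dual_objective(alpha, X, y):
    """0.5 * sum_ij y_i y_j a_i a_j x_i.x_j - sum_i a_i, the objective of (8.21)."""
    quad = 0.0
    for a_i, x_i, y_i in zip(alpha, X, y):
        for a_j, x_j, y_j in zip(alpha, X, y):
            dot = sum(u * v for u, v in zip(x_i, x_j))
            quad += y_i * y_j * a_i * a_j * dot
    return 0.5 * quad - sum(alpha)


def is_feasible(alpha, y, tol=1e-12):
    """Check the dual constraints: sum_i y_i a_i = 0 and a_i >= 0."""
    balanced = abs(sum(a * t for a, t in zip(alpha, y))) <= tol
    return balanced and all(a >= 0 for a in alpha)


n = 3  # zero-based index of the removed point (its multiplier is zero)
X_new, y_new = X[:n] + X[n + 1:], y[:n] + y[n + 1:]
alpha_new = alpha_star[:n] + alpha_star[n + 1:]

# (a) the truncated solution is feasible and keeps its objective value
old_value = dual_objective(alpha_star, X, y)
new_value = dual_objective(alpha_new, X_new, y_new)

# (b) padding any feasible solution of the new dual with a zero gives a
# feasible solution of the old dual with the same objective value
alpha_hat = [1.0, 1.0, 2.0]
alpha_tilde = alpha_hat[:n] + [0.0] + alpha_hat[n:]
hat_value = dual_objective(alpha_hat, X_new, y_new)
tilde_value = dual_objective(alpha_tilde, X, y)

print(is_feasible(alpha_new, y_new), old_value, new_value)
print(is_feasible(alpha_tilde, y), hat_value, tilde_value)
```

The padded vector `alpha_tilde` is exactly the construction used in the contradiction argument: its objective value in the old dual equals that of `alpha_hat` in the new dual, so a strictly better `alpha_hat` would yield a strictly better solution of the old dual.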

(c) Hence, show that $\alpha^*$ (minus $\alpha_n^*$) is optimal for the new dual.

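By (a), $\alpha^*$ without the entry $\alpha_n^*$ is feasible for the new dual. By (b), no feasible solution of the new dual can have a strictly lower objective value, since that would contradict the optimality of $\alpha^*$ for the original dual. Hence it is optimal for the new dual.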

(d) Hence, show that the optimal fat-hyperplane did not change.

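The optimal hyperplane is recovered from the dual solution as $\mathbf{w}^* = \sum_{i} \alpha_i^* y_i \mathbf{x}_i$ and $b^* = y_s - \mathbf{w}^{*T}\mathbf{x}_s$ for any support vector $(\mathbf{x}_s, y_s)$ with $\alpha_s^* > 0$. The removed point contributes $\alpha_n^* y_n \mathbf{x}_n = \mathbf{0}$ to $\mathbf{w}^*$, and it cannot be the support vector used for $b^*$ because $\alpha_n^* = 0$. By (c) the remaining multipliers are optimal for the new dual, so the new problem yields the same $(b^*, \mathbf{w}^*)$.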
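As a small numerical illustration of part (d), the sketch below recovers the hyperplane from the dual solution before and after dropping a point whose multiplier is zero. The four-point data set and the function name `hyperplane` are assumptions made for this example; the multipliers satisfy the KKT conditions on this data set.

```python
# Toy data set (an assumption chosen for illustration) and its dual solution.
X = [(0.0, 0.0), (2.0, 2.0), (2.0, 0.0), (3.0, 0.0)]
y = [-1.0, -1.0, 1.0, 1.0]
alpha_star = [0.5, 0.5, 1.0, 0.0]  # the last point is not a support vector


def hyperplane(alpha, X, y):
    """Recover (w, b): w = sum_i a_i y_i x_i, b = y_s - w.x_s for a support vector s."""
    dim = len(X[0])
    w = [sum(a * t * x[k] for a, t, x in zip(alpha, y, X)) for k in range(dim)]
    s = next(i for i, a in enumerate(alpha) if a > 0)  # any index with a_s > 0
    b = y[s] - sum(w_k * x_k for w_k, x_k in zip(w, X[s]))
    return w, b


# Hyperplane from the full data set.
w_old, b_old = hyperplane(alpha_star, X, y)

# Drop the last point (zero multiplier) and recompute from what remains.
w_new, b_new = hyperplane(alpha_star[:-1], X[:-1], y[:-1])

print(w_old, b_old)
print(w_new, b_new)
```

Both calls give the same weights and bias, because the dropped term is multiplied by a zero multiplier and is never selected as the support vector that fixes the bias.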

(e) Prove the bound on $E_{\text{cv}}$ in (8.27).

I assume that formula (8.27) refers to leave-one-out cross validation (LOOCV), so the following argument may only apply to LOOCV.

If we remove a non-support vector ($\alpha_n^* = 0$), then by (d) the optimal fat-hyperplane does not change, so the removed point, which was correctly classified before, remains correctly classified ($e_n = 0$). If we remove one of the support vectors, then the optimal fat-hyperplane may change and, what is worse, it may misclassify the removed support vector ($e_n \le 1$). Only support vectors can contribute to the leave-one-out error, each by at most one. Hence the bound (8.27).
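The counting argument can be written as a short chain of inequalities. This is a sketch under two assumptions: that $e_n \in \{0, 1\}$ denotes the leave-one-out classification error on the point $(\mathbf{x}_n, y_n)$, and that (8.27) bounds $E_{\text{cv}}$ by the fraction of support vectors.

$$E_{\text{cv}} = \frac{1}{N}\sum_{n=1}^{N} e_n = \frac{1}{N}\sum_{n:\,\alpha_n^* = 0} e_n + \frac{1}{N}\sum_{n:\,\alpha_n^* > 0} e_n \le 0 + \frac{1}{N}\sum_{n:\,\alpha_n^* > 0} 1 = \frac{\#\{n : \alpha_n^* > 0\}}{N}.$$

The first sum vanishes because removing a point with $\alpha_n^* = 0$ leaves the hyperplane unchanged, and the second sum is bounded term by term using $e_n \le 1$.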