# Learning From Data – A Short Course: Exercise 7.11

**Page 24**

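For weight elimination, show that

$$\frac{\partial E_{\text{aug}}}{\partial w_{ij}^{(\ell)}} = \frac{\partial E_{\text{in}}}{\partial w_{ij}^{(\ell)}} + \frac{2\lambda}{N}\cdot\frac{w_{ij}^{(\ell)}}{\left(1+\left(w_{ij}^{(\ell)}\right)^2\right)^2}.$$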

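I have this differential formula (the quotient rule): $\left(\frac{u}{v}\right)' = \frac{u'v - uv'}{v^2}$. Now I consider the following derivative, writing $w$ for a single weight $w_{ij}^{(\ell)}$:

$$\frac{d}{dw}\left(\frac{w^2}{1+w^2}\right) = \frac{2w\,(1+w^2) - w^2\cdot 2w}{(1+w^2)^2} = \frac{2w}{(1+w^2)^2}$$

The weight elimination error is $E_{\text{aug}}(\mathbf{w}, \lambda) = E_{\text{in}}(\mathbf{w}) + \frac{\lambda}{N}\sum_{\ell,i,j}\frac{\left(w_{ij}^{(\ell)}\right)^2}{1+\left(w_{ij}^{(\ell)}\right)^2}$, and only one term of the sum depends on a given weight. So we have:

$$\frac{\partial E_{\text{aug}}}{\partial w_{ij}^{(\ell)}} = \frac{\partial E_{\text{in}}}{\partial w_{ij}^{(\ell)}} + \frac{2\lambda}{N}\cdot\frac{w_{ij}^{(\ell)}}{\left(1+\left(w_{ij}^{(\ell)}\right)^2\right)^2}$$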
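As a sanity check on the derivative of a single penalty term, here is a minimal sketch that compares the closed form $\frac{2w}{(1+w^2)^2}$ against a central finite difference. The function names and the sample weights are my own choices for illustration; they do not come from the book.

```python
def penalty(w):
    """One term of the weight elimination penalty: w^2 / (1 + w^2)."""
    return w * w / (1.0 + w * w)


def penalty_grad(w):
    """Closed-form derivative of one penalty term: 2w / (1 + w^2)^2."""
    return 2.0 * w / (1.0 + w * w) ** 2


def central_difference(func, w, h=1e-6):
    """Numerical derivative of func at w using a symmetric difference."""
    return (func(w + h) - func(w - h)) / (2.0 * h)


# Hypothetical sample weights, chosen only for illustration.
sample_weights = [-3.0, -0.5, 0.0, 0.2, 1.0, 4.0]

# Largest disagreement between the numerical and the closed-form derivative.
max_abs_error = max(
    abs(central_difference(penalty, w) - penalty_grad(w)) for w in sample_weights
)
print("largest |numerical - closed form|:", max_abs_error)
```

If the closed form is right, the printed discrepancy should be tiny (on the order of floating-point noise for this step size).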

Argue that weight elimination shrinks small weights faster than large ones.

There are many ways to do this. One of the fastest ways that I can think of is to Google for “graph for x / (1+x^2)^2” haha. The more traditional way would be: take the derivative of $f(x) = \frac{x}{(1+x^2)^2}$ (I have dropped the constant factor $\frac{2\lambda}{N}$ here because it only rescales the function and has no effect on the argument), then solve the equation $f'(x) = 0$ (I admit that I will let Maple do these computational jobs):

$$f'(x) = \frac{1-3x^2}{(1+x^2)^3} = 0 \iff x = \pm\frac{1}{\sqrt{3}}$$

I check and see that $x = -\frac{1}{\sqrt{3}}$ (local minimum) and $x = \frac{1}{\sqrt{3}}$ (local maximum).

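That means on the interval $\left(-\infty, \frac{1}{\sqrt{3}}\right]$, $f$ decreases then increases, and on $\left[\frac{1}{\sqrt{3}}, \infty\right)$ it decreases again.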
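For anyone without Maple at hand, the stationary points can be double-checked numerically. The sketch below is only an illustration under my own assumptions: it uses $f(x) = \frac{x}{(1+x^2)^2}$ and $f'(x) = \frac{1-3x^2}{(1+x^2)^3}$, and the helper `bisect_root` is a hypothetical name for a plain bisection search, not something from the book.

```python
import math


def f(x):
    """Regularization gradient without the constant factor: x / (1 + x^2)^2."""
    return x / (1.0 + x * x) ** 2


def f_prime(x):
    """Derivative of f: (1 - 3x^2) / (1 + x^2)^3."""
    return (1.0 - 3.0 * x * x) / (1.0 + x * x) ** 3


def bisect_root(func, lo, hi, tol=1e-12):
    """Find a root of func in [lo, hi], assuming func changes sign there."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_lo > 0) == (f_mid > 0):
            # Same sign as the left end: the root lies to the right of mid.
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# f' changes sign on [0, 1] and on [-1, 0], so each bracket holds one root.
x_max = bisect_root(f_prime, 0.0, 1.0)
x_min = bisect_root(f_prime, -1.0, 0.0)

print("local maximum near x =", x_max, "with f =", f(x_max))
print("local minimum near x =", x_min, "with f =", f(x_min))
print("1/sqrt(3) =", 1.0 / math.sqrt(3.0))
```

The two roots should agree with $\pm\frac{1}{\sqrt{3}}$, and the values of $f$ there should be $\pm\frac{3\sqrt{3}}{16}$.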

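The only concern left is how the function behaves when $x$ goes to infinity. Well, it’s easy to see that $\lim_{x\to\pm\infty} f(x) = 0$. I also observe that:

$$f(x) > 0 \text{ for } x > 0, \qquad f(x) < 0 \text{ for } x < 0$$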

So far so good.

I should have realized before that $f(-x) = -f(x)$ and $f(0) = 0$, so if I’m interested in how $f$ behaves when $|x|$ is small / large in general, I only need to consider how $f$ behaves when $x \ge 0$ and then deduce the other case’s result by symmetry.

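So the statement follows: the penalty contributes $\frac{2\lambda}{N}\cdot\frac{w}{(1+w^2)^2}$ to the gradient. For small $|w|$ this is approximately $\frac{2\lambda}{N}\,w$, the same pull toward zero as in weight decay, while for large $|w|$ it behaves like $\frac{2\lambda}{N}\cdot\frac{1}{w^3}$ and goes to $0$, so a large weight is barely shrunk at all.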
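To see the effect concretely, here is a small sketch that applies gradient steps using only the penalty term (ignoring $E_{\text{in}}$ entirely) to one small and one large weight. The step size, the number of steps, and the starting weights are assumptions I picked for illustration; `step` stands in for the learning rate times $\frac{2\lambda}{N}$.

```python
def shrink(w0, step=0.1, n_steps=100):
    """Apply n_steps gradient steps on the weight elimination penalty alone.

    step stands in for (learning rate) * 2 * lambda / N; its value here is
    an assumption made only for this illustration.
    """
    w = w0
    for _ in range(n_steps):
        w -= step * w / (1.0 + w * w) ** 2
    return w


# Hypothetical starting weights: one small, one large.
small_start, large_start = 0.1, 10.0

# Fraction of the original weight that survives after the updates.
small_ratio = shrink(small_start) / small_start
large_ratio = shrink(large_start) / large_start

print("small weight keeps fraction:", small_ratio)
print("large weight keeps fraction:", large_ratio)
```

Under these assumed settings, the small weight should be driven almost to zero while the large weight stays nearly where it started, which is the behavior the exercise asks us to argue.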