[Notes] Learning From Data – A Short Course: e-Chapter 7
Need an explanation for .
: is the number of nodes in the first layer, is the number of nodes in the input layer. Here is my guess: each input node must connect to at least one node in the first layer (that is ). So the first input node can choose one of the hidden nodes to connect to, the second input node can also choose one of the hidden nodes to connect to, et cetera, hence: .
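To sanity-check the counting argument in my guess, here is a minimal sketch that enumerates the choices by brute force. The names `num_inputs` and `num_hidden` are my own placeholders for the two layer sizes, not the book's notation. It counts the case the argument actually describes, where each input node picks exactly one hidden node; if every input node chooses independently, the count should be `num_hidden ** num_inputs`.

```python
from itertools import product


def count_assignments(num_inputs, num_hidden):
    """Count the ways each input node can pick exactly one hidden node.

    num_inputs and num_hidden are hypothetical names for the sizes of the
    input layer and the first layer.
    """
    # Each tuple (h_1, ..., h_n) records the hidden node chosen by
    # input node 1, ..., input node n.
    return sum(1 for _ in product(range(num_hidden), repeat=num_inputs))


# Compare the brute-force count with the product-rule count on a small case.
n, m = 3, 4
assert count_assignments(n, m) == m ** n
```

Note that this covers only "one connection per input node". Allowing an input node to connect to several hidden nodes would give a larger count.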
How is early stopping actually related to weight decay? Please refer to the Figure on Page 130 [Chapter 4].