In this particular experiment, the black curve ($E_{\text{cv}}$) is sometimes below and sometimes above the red curve ($E_{\text{out}}$). If we repeated this experiment many times, and plotted the average black and red curves, would you expect the black curve to lie above or below the red curve?
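Page 147 states that “In this case, the cross validation estimate will on average be an upper estimate for the out-of-sample error”. The reason is that the cross validation estimate $E_{\text{cv}}$ is an unbiased estimate of $\bar{E}_{\text{out}}(N-1)$, the expected out-of-sample error after learning from $N-1$ data points:

$$\mathbb{E}_{\mathcal{D}}\left[E_{\text{cv}}\right] = \bar{E}_{\text{out}}(N-1)$$

and (I am not sure whether this is proved or just a leap of faith, see also: “The fact that more training data lead to a better final hypothesis has been extensively verified empirically, although it is challenging to prove theoretically” on page 141):

$$\bar{E}_{\text{out}}(N-1) \ge \bar{E}_{\text{out}}(N)$$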
Hence on average, the red curve will lie below the black curve.
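
The argument can also be checked numerically. The sketch below is not the experiment from the book. It assumes a toy setting chosen for illustration: noisy linear targets, Gaussian inputs, least-squares fitting, and leave-one-out cross validation. The names `fit`, `one_experiment` and `average_curves`, and the values `n=8`, `d=3`, `runs=3000`, are hypothetical. It averages the cross validation estimate and the out-of-sample error over many random data sets and compares the two averages.

```python
import numpy as np

def fit(X, y):
    """Least-squares weights for the linear model y ~ X @ w."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def one_experiment(rng, n=8, d=3, sigma=1.0):
    """Return (E_cv, E_out) for one random data set of size n."""
    w_true = rng.standard_normal(d)
    X = rng.standard_normal((n, d))
    y = X @ w_true + sigma * rng.standard_normal(n)

    # Leave-one-out cross validation: train on n-1 points,
    # measure the squared error on the single held-out point.
    errors = []
    for i in range(n):
        keep = np.arange(n) != i
        w = fit(X[keep], y[keep])
        errors.append((X[i] @ w - y[i]) ** 2)
    e_cv = np.mean(errors)

    # Out-of-sample error of the hypothesis trained on all n points.
    # For x ~ N(0, I) and independent noise of variance sigma^2,
    # the expected squared error is sigma^2 + ||w - w_true||^2.
    w_full = fit(X, y)
    e_out = sigma**2 + np.sum((w_full - w_true) ** 2)
    return e_cv, e_out

def average_curves(runs=3000, seed=0):
    """Average E_cv and E_out over many independent experiments."""
    rng = np.random.default_rng(seed)
    results = np.array([one_experiment(rng) for _ in range(runs)])
    return results.mean(axis=0)  # (average E_cv, average E_out)

avg_cv, avg_out = average_curves()
print(f"average E_cv  = {avg_cv:.3f}")
print(f"average E_out = {avg_out:.3f}")
```

In a single run `e_cv` can land on either side of `e_out`, but under these assumptions the average of `e_cv` should come out above the average of `e_out`, because each cross validation fit sees one data point fewer than the final fit.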