I have some difficulty understanding the idea of deterministic noise. I think there is a disturbing contradiction with what we've seen in the bias-variance tradeoff, particularly in the noiseless 50th-order target example.

Chapter 4 states that:
- A 2nd-order polynomial can be better than a 10th-order polynomial at fitting a 50th-order polynomial target, and this is due to deterministic noise.

--> So I conclude that there is more deterministic noise with the 10th-order model than with the 2nd-order model.

- Deterministic noise = bias

But we've seen with the bias-variance tradeoff that a more complex model has a lower bias than a simpler one.

The confusion is justified, since there are two opposing factors here (see Exercise 4.3). There is more deterministic noise with the 2nd order model than with the 10th order model (which would suggest more overfitting with the 2nd order), but the model itself is simpler so that would suggest less overfitting. It turns out that the latter factor wins here.

If you want to isolate the impact of deterministic noise on overfitting without interference from the model complexity, you can fix the model and change the complexity of the target function.
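A minimal sketch of such an experiment, not taken from the book: the model is fixed at 10th order and only the target order changes, with no stochastic noise. It assumes inputs uniform on [-1, 1], targets with random Legendre coefficients normalized to unit energy, and squared error; the function names, N = 30, and the number of trials are illustrative choices.

```python
import numpy as np
from numpy.polynomial import legendre


def random_target(target_degree, rng):
    """Random Legendre coefficients, scaled so E_x[f(x)^2] = 1 for x ~ U[-1, 1]."""
    coeffs = rng.standard_normal(target_degree + 1)
    orders = np.arange(target_degree + 1)
    # For x uniform on [-1, 1], E[L_q(x)^2] = 1 / (2q + 1).
    energy = np.sum(coeffs**2 / (2 * orders + 1))
    return coeffs / np.sqrt(energy)


def squared_distance(a, b):
    """Exact E_x[(a(x) - b(x))^2] for x ~ U[-1, 1], using Legendre orthogonality."""
    size = max(len(a), len(b))
    diff = np.zeros(size)
    diff[:len(a)] += a
    diff[:len(b)] -= b
    return float(np.sum(diff**2 / (2 * np.arange(size) + 1)))


def mean_out_of_sample_error(target_degree, model_degree=10, n_points=30,
                             n_trials=200, seed=0):
    """Average E_out of a fixed-order least-squares fit over random noiseless targets."""
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(n_trials):
        f = random_target(target_degree, rng)
        x = rng.uniform(-1.0, 1.0, n_points)
        y = legendre.legval(x, f)                # noiseless samples of the target
        g = legendre.legfit(x, y, model_degree)  # least-squares fit in the fixed model
        errors.append(squared_distance(g, f))
    return float(np.mean(errors))


if __name__ == "__main__":
    # Same model every time; only the target complexity changes.
    for target_degree in (10, 20, 30, 40, 50):
        print(target_degree, mean_out_of_sample_error(target_degree))
```

With a 10th-order target the model can represent the target exactly, so there is no deterministic noise; any error that appears for higher-order targets comes from the part of the target the model cannot capture.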

I think the confusion comes from comparing Figure 4.4 with the figures for stochastic noise.

There you write that the shading is the deterministic noise, since it is the difference between the best fit of the current model and the target function.
This is exactly the shading from the bias-variance analysis. Thus the value of the deterministic noise is directly related to the bias.

When you talk about stochastic noise, you say that the out-of-sample error increases with the model complexity, and that this is related to the area between the final hypothesis $g^{(\mathcal{D})}$ and the target $f$. Thus the reader might think that the bias increases with the complexity of the model. However, the bias depends on the average hypothesis $\bar{g}$ and not on $g^{(\mathcal{D})}$, and the reason this area increases is the stochastic noise. If there isn't any noise, the final hypothesis has a better chance of fitting the target (depending on the position of the samples).

In fact (and this is not really clear from the text, only from Exercise 4.3), on a noiseless target the shaded area in Figure 4.4 decreases as the model complexity increases, and thus the bias decreases.
My suggestion is to make it clearer that in the case of stochastic noise you are talking about the actual final hypothesis, while in the case of deterministic noise you are talking about the best-fitting hypothesis, which is related to the average hypothesis $\bar{g}$.
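A small sketch of why the shaded area shrinks as the model order grows, under assumptions that are mine rather than the book's: inputs uniform on [-1, 1], squared error, and a target written as a Legendre expansion. In that setting the best fit in the Q-th order model simply keeps the target's first Q + 1 Legendre coefficients, so the deterministic noise is whatever is left in the higher-order terms. The target coefficients below are hypothetical random values.

```python
import random


def deterministic_noise_energy(target_coeffs, model_degree):
    """E_x[(f(x) - h*(x))^2] for x ~ U[-1, 1].

    f = sum_q a_q L_q(x) is the target and h* is its best approximation of
    order `model_degree`. By orthogonality, h* keeps a_0..a_Q, so only the
    higher-order terms contribute, each with weight E[L_q^2] = 1 / (2q + 1).
    """
    return sum(a * a / (2 * q + 1)
               for q, a in enumerate(target_coeffs) if q > model_degree)


random.seed(0)
# Hypothetical 50th-order target: 51 random Legendre coefficients.
target = [random.gauss(0.0, 1.0) for _ in range(51)]

for degree in (2, 10, 50):
    print(degree, deterministic_noise_energy(target, degree))
```

The quantity can only go down as the model order goes up, and it reaches zero once the model can represent the target. This concerns the best fit only, not the hypothesis actually learned from a finite data set.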

From my understanding I would say:
Overfitting does not apply to the best fit of the model ($\bar{g}$) but to the actual final hypothesis ($g^{(\mathcal{D})}$). In the bias-variance analysis we saw that the variance increases with the model complexity (for the same number of samples). So I think overfitting shows up mainly in the variance, whether it is caused by stochastic noise or by deterministic noise.
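To check this reading numerically, here is a rough sketch that estimates bias and variance for 2nd- and 10th-order fits to one noiseless 50th-order target. It is an illustration under my own assumptions (inputs uniform on [-1, 1], N = 15, a random Legendre target, $\bar{g}$ estimated by averaging over simulated data sets), not the book's experiment.

```python
import numpy as np
from numpy.polynomial import legendre


def squared_distance(a, b):
    """Exact E_x[(a(x) - b(x))^2] for x ~ U[-1, 1], using Legendre orthogonality."""
    size = max(len(a), len(b))
    diff = np.zeros(size)
    diff[:len(a)] += a
    diff[:len(b)] -= b
    return float(np.sum(diff**2 / (2 * np.arange(size) + 1)))


def bias_variance(target, model_degree, n_points=15, n_trials=500, seed=1):
    """Estimate bias, variance and mean E_out over simulated noiseless data sets."""
    rng = np.random.default_rng(seed)
    fits = np.empty((n_trials, model_degree + 1))
    for k in range(n_trials):
        x = rng.uniform(-1.0, 1.0, n_points)
        y = legendre.legval(x, target)                 # no stochastic noise
        fits[k] = legendre.legfit(x, y, model_degree)  # final hypothesis for this data set
    g_bar = fits.mean(axis=0)                          # empirical average hypothesis
    bias = squared_distance(g_bar, target)
    variance = float(np.mean([squared_distance(g, g_bar) for g in fits]))
    e_out = float(np.mean([squared_distance(g, target) for g in fits]))
    return bias, variance, e_out


# Hypothetical 50th-order target with random Legendre coefficients.
target = np.random.default_rng(0).standard_normal(51)

for degree in (2, 10):
    bias, variance, e_out = bias_variance(target, degree)
    print(degree, bias, variance, e_out)
```

Because $\bar{g}$ is estimated from the same simulated data sets, the mean out-of-sample error equals bias plus variance up to rounding. The bias estimate for the 10th-order model is noisy when its variance is large, so it is best read as indicative only.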

The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.