
Suppose you have data $\{Y_t,X_{t-h}\}_{t=h+1}^T$, where $h \in \{1,2,\ldots\},$ and your goal is to build a model (say, $\hat f(X_{t-h})$) to predict $Y_t$ given $X_{t-h}$. For concreteness, suppose the data is daily and $T$ corresponds to today.

In-sample analysis means estimating the model using all available data up to and including $T$, and then comparing the model's fitted values to the actual realizations. However, this procedure is known to paint an overly optimistic picture of the model's forecasting ability: common fitting algorithms (e.g. those based on squared-error or likelihood criteria) go to great lengths to avoid large errors in the estimation sample, and are thus susceptible to overfitting, i.e. mistaking noise for signal in the data.
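To see the optimism concretely, here is a toy sketch (my own construction, not from any particular model): a flexible polynomial fitted in-sample shows small residuals, but on fresh data drawn from the same process its errors are typically larger.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data: a smooth signal plus noise (all choices here are illustrative)
n = 40
x = np.linspace(-1, 1, n)
y = np.sin(2 * x) + rng.normal(scale=0.5, size=n)

# In-sample fit with a deliberately over-flexible polynomial: small residuals ...
coef = np.polyfit(x, y, deg=20)
fitted = np.polyval(coef, x)
in_sample_rmse = np.sqrt(np.mean((fitted - y) ** 2))

# ... but on a fresh draw from the same process, the errors are larger,
# because part of the in-sample "fit" was just chasing the old noise.
y_new = np.sin(2 * x) + rng.normal(scale=0.5, size=n)
oos_rmse = np.sqrt(np.mean((fitted - y_new) ** 2))

print(f"in-sample RMSE:     {in_sample_rmse:.3f}")
print(f"out-of-sample RMSE: {oos_rmse:.3f}")
```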

A true out-of-sample analysis would be to estimate the model based on data up to and including today, construct a forecast of tomorrow's value $Y_{T+1}$, wait until tomorrow, record the forecast error $e_{T+1} \equiv Y_{T+1} - \hat f(X_{T+1-h}),$ re-estimate the model, make a new forecast of $Y_{T+2}$, and so forth. At the end of this exercise, one would have a sample of forecast errors $\{e_{T+l}\}_{l=1}^L$ which would be truly out-of-sample and would give a very realistic picture of the model's performance.

Since this procedure is very time-consuming, people often resort to "pseudo" (or "simulated") out-of-sample analysis, which mimics the procedure described in the last paragraph but uses some historical date $T_0 < T$, rather than today's date $T$, as the starting point. The resulting forecast errors $\{e_t\}_{t=T_0+1}^T$ are then used to estimate the model's out-of-sample forecasting ability.
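In code, the pseudo-out-of-sample loop might look like the following minimal sketch, using simulated daily data and a simple linear $\hat f$ re-estimated on an expanding window (all names and parameter choices are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate data where Y_t depends linearly on X_{t-h} plus noise
h, T = 1, 500
x = rng.normal(size=T)
y = np.empty(T)
y[h:] = 0.5 * x[:-h] + rng.normal(scale=1.0, size=T - h)
y[:h] = rng.normal(size=h)

T0 = 300  # historical starting point for the pseudo-out-of-sample exercise
errors = []
for t in range(T0, T):
    # Estimate f using only the pairs (Y_s, X_{s-h}) observed before date t
    X_train, y_train = x[: t - h], y[h:t]
    b, a = np.polyfit(X_train, y_train, deg=1)  # slope, intercept
    y_hat = a + b * x[t - h]   # forecast of Y_t from X_{t-h}
    errors.append(y[t] - y_hat)  # record the forecast error e_t

rmse = np.sqrt(np.mean(np.square(errors)))
print(f"pseudo-out-of-sample RMSE over {len(errors)} forecasts: {rmse:.3f}")
```

Each forecast is made using only information available at its forecast origin, which is what makes the resulting errors a fair stand-in for the true out-of-sample errors.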

Note that pseudo-out-of-sample analysis is not the only way to estimate a model's out-of-sample performance. Alternatives include cross-validation and information criteria.
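As a rough illustration of the information-criterion route (again a toy example of my own, with AIC computed up to constants for a Gaussian linear model): larger models always fit better in-sample, and the criterion's penalty term is what approximates out-of-sample performance without any holdout data.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated data: three candidate predictors, only the first one matters
T = 400
x = rng.normal(size=(T, 3))
y = 0.8 * x[:, 0] + rng.normal(size=T)

def aic(y, X):
    """AIC (up to constants) for a Gaussian linear model y ~ X with intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    k = X1.shape[1]              # number of estimated coefficients
    sigma2 = np.mean(resid ** 2)
    return len(y) * np.log(sigma2) + 2 * k

# In-sample fit improves mechanically as predictors are added; AIC's
# 2k penalty pushes back, favouring the model expected to forecast best.
for p in (1, 2, 3):
    print(f"model with {p} predictor(s): AIC = {aic(y, x[:, :p]):.1f}")
```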

A very good discussion of all these issues is provided in Chapter 7 of