Prediction Error Properties

Two Key Properties

For the best linear predictor $P_n X_{n+h}$:

1. Zero mean error:

$$E(X_{n+h} - P_n X_{n+h}) = 0$$

The predictor is unbiased.

2. Error uncorrelated with the observed data:

$$E[(X_{n+h} - P_n X_{n+h}) \cdot X_j] = 0, \qquad j = 1, 2, \ldots, n$$

Equivalently, because the error has mean zero by Property 1: $\text{Cov}(X_{n+h} - P_n X_{n+h},\; X_j) = 0$ for $j = 1, \ldots, n$.

Interpretation

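The prediction error contains no linear information that can be extracted from the observed data $X_1, \ldots, X_n$. If the error were correlated with some $X_j$, adjusting the coefficient on that $X_j$ would lower the mean squared error, contradicting optimality.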
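Both properties can be checked numerically from population moments. The sketch below is only an illustration under stated assumptions: the process is taken to be a causal AR(1) with mean `mu`, the values `phi = 0.6`, `n = 5`, `h = 2`, `mu = 3.0` are arbitrary, and the helper `acvf_ar1` is a hypothetical name. It assumes the slope coefficients solve the linear system $\Gamma a = \gamma$ and that the intercept is chosen to make the mean error zero. Here `a[j-1]` is the coefficient on $X_j$.

```python
import numpy as np

def acvf_ar1(lag, phi=0.6, sigma2=1.0):
    """Autocovariance of a causal AR(1) process at the given lag (assumed model)."""
    return sigma2 * phi ** abs(lag) / (1.0 - phi ** 2)

n, h, mu = 5, 2, 3.0  # illustrative values, not taken from the notes

# Gamma[i-1, j-1] = Cov(X_i, X_j) for the observed X_1, ..., X_n
idx = np.arange(1, n + 1)
Gamma = np.array([[acvf_ar1(i - j) for j in idx] for i in idx])

# gamma[j-1] = Cov(X_{n+h}, X_j)
gamma = np.array([acvf_ar1(n + h - j) for j in idx])

# Slopes solve Gamma a = gamma; the intercept enforces a zero-mean error
a = np.linalg.solve(Gamma, gamma)
a0 = mu * (1.0 - a.sum())

# Property 1: E(X_{n+h} - a0 - sum_j a_j X_j) = mu - a0 - mu * sum_j a_j
mean_error = mu - a0 - mu * a.sum()

# Property 2: Cov(error, X_j) = gamma_j - sum_i a_i Cov(X_i, X_j)
cov_error = gamma - Gamma @ a

print("mean of prediction error:", mean_error)
print("largest |Cov(error, X_j)|:", np.max(np.abs(cov_error)))
```

Both printed quantities should be zero up to floating-point rounding. For this particular AR(1) choice, all the weight falls on the most recent observation $X_n$, which is a feature of the assumed model rather than of best linear predictors in general.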

Derivation Sketch

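Let $L = E[(X_{n+h} - P_n X_{n+h})^2]$ be the mean squared error, viewed as a function of the intercept $a_0$ and the coefficients $a_1, \ldots, a_n$ of the predictor. Property 1 comes from $\partial L / \partial a_0 = 0$. Property 2 comes from $\partial L / \partial a_i = 0$ for $i = 1, \ldots, n$.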
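The first-order conditions can be written out explicitly. The sketch below assumes the predictor has the form $P_n X_{n+h} = a_0 + \sum_{i=1}^{n} a_i X_{n+1-i}$ (an indexing convention not stated in these notes) and that differentiation can be moved inside the expectation:

$$
\begin{aligned}
L &= E\Big[\Big(X_{n+h} - a_0 - \sum_{i=1}^{n} a_i X_{n+1-i}\Big)^2\Big] \\
\frac{\partial L}{\partial a_0} &= -2\,E\big(X_{n+h} - P_n X_{n+h}\big) = 0 \\
\frac{\partial L}{\partial a_i} &= -2\,E\big[(X_{n+h} - P_n X_{n+h})\,X_{n+1-i}\big] = 0, \qquad i = 1, \ldots, n
\end{aligned}
$$

As $i$ runs over $1, \ldots, n$, the index $j = n+1-i$ runs over $n, \ldots, 1$, so the last line is Property 2 for every observed $X_j$.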