Best Linear Predictor
The best linear predictor (BLP) is the linear combination of observed values that minimizes the mean squared prediction error.
1. Definition
Given observations $X_1, \dots, X_n$ of a zero-mean stationary process, the best linear predictor of $X_{n+h}$ is
$$P_n X_{n+h} = \sum_{i=1}^n a_i X_i$$
where $a_1, \dots, a_n$ minimize $\mathbb{E}\big[(X_{n+h} - \sum_{i=1}^n a_i X_i)^2\big]$.
2. Prediction Equations
The optimal coefficients satisfy the normal equations (orthogonality principle):
$$\mathbb{E}\left[(X_{n+h} - P_n X_{n+h}) \cdot X_j\right] = 0 \quad \text{for } j = 1, \dots, n$$
In matrix form, with $\Gamma_n = (\gamma_X(i-j))_{i,j=1}^n$:
$$\Gamma_n \mathbf{a} = \boldsymbol{\gamma}_{n,h}$$where $\boldsymbol{\gamma}_{n,h} = (\gamma_X(n+h-1), \gamma_X(n+h-2), \dots, \gamma_X(h))^\top$.
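As a numerical sketch of solving the prediction equations, the following builds $\Gamma_n$ and $\boldsymbol{\gamma}_{n,h}$ for an MA(1) autocovariance and solves for $\mathbf{a}$; the parameter values ($\theta = 0.6$, $\sigma^2 = 1$, $n = 5$, $h = 1$) are illustrative choices, not from the text.

```python
import numpy as np

# Hypothetical MA(1) parameters chosen for illustration
theta, sigma2 = 0.6, 1.0

def gamma(h):
    """MA(1) autocovariance: gamma(0) = sigma^2(1 + theta^2), gamma(+-1) = sigma^2*theta."""
    h = abs(h)
    if h == 0:
        return sigma2 * (1 + theta**2)
    if h == 1:
        return sigma2 * theta
    return 0.0

n, h = 5, 1  # predict X_{n+h} from X_1, ..., X_n

# Gamma_n = (gamma_X(i - j)) and gamma_{n,h} = (gamma_X(n+h-1), ..., gamma_X(h))
Gamma = np.array([[gamma(i - j) for j in range(1, n + 1)] for i in range(1, n + 1)])
g = np.array([gamma(n + h - j) for j in range(1, n + 1)])

a = np.linalg.solve(Gamma, g)   # optimal coefficients a_1, ..., a_n
mse = gamma(0) - a @ g          # MSE = gamma_X(0) - a^T gamma_{n,h}  (Section 3)
```

Since $\Gamma_n$ is a symmetric Toeplitz matrix, a general-purpose solve works but specialized Toeplitz solvers (e.g. Levinson-Durbin) are the standard choice for large $n$.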
3. MSE
$$\text{MSE} = \gamma_X(0) - \mathbf{a}^\top \boldsymbol{\gamma}_{n,h}$$
4. Example: MA(1) Prediction
For the MA(1) process $X_t = Z_t + \theta Z_{t-1}$ with $Z_t \sim \mathrm{WN}(0, \sigma^2)$, suppose $X_4$ and $X_5$ are observed and we seek the best linear predictor $P(X_3 \mid X_4, X_5) = a_1 X_4 + a_2 X_5$ (a backcasting problem: a past value is predicted from later observations).
Set up: $\mathbb{E}[(X_3 - a_1 X_4 - a_2 X_5)X_j] = 0$ for $j = 4, 5$.
This gives two equations using $\gamma_X(0) = \sigma^2(1+\theta^2)$, $\gamma_X(1) = \sigma^2\theta$, $\gamma_X(h) = 0$ for $|h| > 1$.
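The resulting $2 \times 2$ system can be checked numerically; the values $\theta = 0.5$, $\sigma^2 = 1$ below are hypothetical stand-ins.

```python
import numpy as np

theta, sigma2 = 0.5, 1.0          # hypothetical MA(1) parameters
g0 = sigma2 * (1 + theta**2)      # gamma_X(0)
g1 = sigma2 * theta               # gamma_X(1)

# Normal equations for P(X_3 | X_4, X_5) = a1*X_4 + a2*X_5:
#   j = 4:  a1*g0 + a2*g1 = gamma_X(1) = g1
#   j = 5:  a1*g1 + a2*g0 = gamma_X(2) = 0
A = np.array([[g0, g1],
              [g1, g0]])
b = np.array([g1, 0.0])
a1, a2 = np.linalg.solve(A, b)
```

Working the algebra by hand gives $a_1 = \theta(1+\theta^2)/(1+\theta^2+\theta^4)$ and $a_2 = -\theta^2/(1+\theta^2+\theta^4)$, which the solve reproduces.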
5. AR(1) Simplification
For the causal AR(1) process $X_t = \phi X_{t-1} + Z_t$ with $|\phi| < 1$, the one-step predictor reduces to $P_n X_{n+1} = \phi X_n$ with $\text{MSE} = \sigma^2$.
The predictor depends only on the most recent observation: by the Markov property, all earlier observations are redundant once $X_n$ is known.
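This collapse can be verified by solving the full $n$-dimensional prediction equations with the AR(1) autocovariance $\gamma_X(h) = \sigma^2 \phi^{|h|}/(1-\phi^2)$; the values $\phi = 0.7$, $\sigma^2 = 1$, $n = 6$ are illustrative.

```python
import numpy as np

phi, sigma2, n = 0.7, 1.0, 6   # hypothetical AR(1) parameters

def gamma(h):
    """AR(1) autocovariance: gamma_X(h) = sigma^2 * phi^|h| / (1 - phi^2)."""
    return sigma2 * phi ** abs(h) / (1 - phi**2)

# Full n-dimensional prediction equations for h = 1
Gamma = np.array([[gamma(i - j) for j in range(n)] for i in range(n)])
g = np.array([gamma(n - j) for j in range(n)])   # (gamma_X(n), ..., gamma_X(1))
a = np.linalg.solve(Gamma, g)

# The solution puts all weight on the last observation:
# a = (0, ..., 0, phi), i.e. P_n X_{n+1} = phi * X_n, with MSE = sigma^2.
mse = gamma(0) - a @ g
```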