Molecular Note: Best Linear Prediction for Stationary Series
Linked Atomic Concepts: Best Predictor — Conditional Expectation, Best Linear Predictor — General (Theorem 1.2), Best Linear Predictor — P_n X_{n+h}, Prediction Error Properties, Prediction for Causal AR(p), Prediction Intervals, Autocovariance Function (ACVF)
Scene-Setting
You have an AR(1) process $X_t = 0.8 X_{t-1} + Z_t$ with $\sigma^2 = 1$, and you’ve observed $X_1 = 2.1, X_2 = 1.5, X_3 = 2.8$. You need to predict $X_4$ and give a 95% prediction interval.
Concept Chain
Why not use the best predictor? → Best Predictor — Conditional Expectation
$E(X_4 | X_3, X_2, X_1)$ is optimal under MSE, but requires the joint distribution. We restrict to linear functions instead.
The general framework → Best Linear Predictor — General (Theorem 1.2)
$P(U|\mathbf{W}) = E(U) + \mathbf{a}'(\mathbf{W} - E(\mathbf{W}))$, where $\Gamma\mathbf{a} = \text{Cov}(U, \mathbf{W})$.
Specialization to time series → Best Linear Predictor — P_n X_{n+h}
$U = X_{n+h}$, $\mathbf{W} = (X_n, \ldots, X_1)'$, $\mu = 0$ (for zero-mean process):
$$P_n X_{n+h} = \mathbf{a}_n' \mathbf{W}$$
Solve $\Gamma_n \mathbf{a}_n = \boldsymbol{\gamma}_n(h)$, where $\Gamma_n$ is Toeplitz, $(\Gamma_n)_{ij} = \gamma(|i-j|)$, and $\boldsymbol{\gamma}_n(h) = (\gamma(h), \ldots, \gamma(h+n-1))'$.
MSE $= \gamma(0) - \mathbf{a}_n'\boldsymbol{\gamma}_n(h)$.
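The prediction equations above can be sketched numerically. This is a minimal illustration (the helper name `blp` is hypothetical, and the ACVF is supplied as a function):

```python
import numpy as np

def blp(gamma, X, h):
    """Best linear predictor P_n X_{n+h} and its MSE for a zero-mean
    stationary series, given the ACVF gamma(k) and observations
    X = (X_1, ..., X_n). Hypothetical helper, for illustration only."""
    n = len(X)
    # Toeplitz covariance matrix: (Gamma_n)_{ij} = gamma(|i - j|)
    Gamma = np.array([[gamma(abs(i - j)) for j in range(n)] for i in range(n)])
    # Right-hand side: gamma_n(h) = (gamma(h), ..., gamma(h + n - 1))'
    g = np.array([gamma(h + k) for k in range(n)])
    a = np.linalg.solve(Gamma, g)
    pred = a @ X[::-1]        # W = (X_n, ..., X_1)', so reverse X
    mse = gamma(0) - a @ g    # MSE = gamma(0) - a' gamma_n(h)
    return float(pred), float(mse)
```

For the AR(1) example in the worked pipeline below, `blp(lambda k: 0.8**k / 0.36, [2.1, 1.5, 2.8], 1)` recovers $P_3 X_4 = 2.24$ with MSE $= 1$.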
Quality guarantees → Prediction Error Properties
The prediction error $X_{n+h} - P_n X_{n+h}$ has zero mean and is uncorrelated with each observation $X_1, \ldots, X_n$. These orthogonality conditions characterize the best linear predictor.
Shortcut for AR(p) → Prediction for Causal AR(p)
For causal AR(p) with $n \geq p$: $P_n X_{n+1} = \phi_1 X_n + \cdots + \phi_p X_{n+1-p}$. No need to solve the full linear system — the AR coefficients are the prediction coefficients.
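As a sketch (hypothetical helper name), the shortcut reduces to a dot product over the $p$ most recent observations:

```python
def ar_one_step(phi, X):
    """One-step predictor for a causal AR(p) with n >= p:
    P_n X_{n+1} = phi_1 X_n + ... + phi_p X_{n+1-p}.
    phi = (phi_1, ..., phi_p), X = (X_1, ..., X_n). Hypothetical helper."""
    return sum(phi[i] * X[-1 - i] for i in range(len(phi)))
```

For example, `ar_one_step([0.8], [2.1, 1.5, 2.8])` computes $0.8 \times X_3$, matching Step 1 of the worked pipeline.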
Uncertainty quantification → Prediction Intervals
$(1-\alpha)$ PI (exact for Gaussian processes): $P_n X_{n+h} \pm z_{\alpha/2}\sqrt{\text{MSE}}$.
Worked Pipeline
AR(1): $X_t = 0.8X_{t-1} + Z_t$, $\sigma^2 = 1$, $\mu = 0$. Observed: $X_1 = 2.1$, $X_2 = 1.5$, $X_3 = 2.8$.
Step 1 — Apply AR(p) shortcut: $P_3 X_4 = 0.8 \cdot X_3 = 0.8 \times 2.8 = 2.24$.
Step 2 — MSE for one-step AR(1): $\text{MSE} = \sigma^2 = 1$.
Step 3 — 95% PI: $2.24 \pm 1.96\sqrt{1} = (0.28, 4.20)$.
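The three steps as a runnable sketch ($1.96$ is the standard normal quantile $z_{0.025}$ for a 95% interval):

```python
import math

phi, sigma2, x3 = 0.8, 1.0, 2.8   # model parameters and last observation
pred = phi * x3                    # Step 1: AR(1) shortcut, 0.8 * 2.8 = 2.24
mse = sigma2                       # Step 2: one-step AR MSE is sigma^2
z = 1.96                           # Step 3: z_{0.025} for a 95% interval
lo = pred - z * math.sqrt(mse)
hi = pred + z * math.sqrt(mse)
print(round(lo, 2), round(hi, 2))  # 0.28 4.2
```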
Alternative: Full linear system approach (to verify).
For AR(1), $\gamma(h) = \phi^{|h|}\sigma^2/(1-\phi^2)$, so $\gamma(0) = 1/(1 - 0.64) = 2.778$, $\gamma(1) = 0.8 \times 2.778 = 2.222$, $\gamma(2) = 0.64 \times 2.778 = 1.778$.
For $n = 3, h = 1$:
$$\Gamma_3 = \begin{pmatrix} 2.778 & 2.222 & 1.778 \\ 2.222 & 2.778 & 2.222 \\ 1.778 & 2.222 & 2.778 \end{pmatrix}, \quad \boldsymbol{\gamma}_3(1) = \begin{pmatrix} 2.222 \\ 1.778 \\ 1.422 \end{pmatrix}$$
Solving $\Gamma_3 \mathbf{a} = \boldsymbol{\gamma}_3(1)$ gives $\mathbf{a} = (0.8, 0, 0)'$ (as expected: only the most recent observation matters for a causal AR(1)).
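This verification can be reproduced in a few lines of NumPy, using the exact ACVF $\gamma(k) = \phi^k \sigma^2/(1-\phi^2)$ rather than the rounded values above:

```python
import numpy as np

def gamma(k):
    # AR(1) ACVF with phi = 0.8, sigma^2 = 1: gamma(k) = phi^k / (1 - phi^2)
    return 0.8 ** k / (1 - 0.8 ** 2)

Gamma3 = np.array([[gamma(abs(i - j)) for j in range(3)] for i in range(3)])
g3 = np.array([gamma(h) for h in (1, 2, 3)])   # gamma_3(1)
a = np.linalg.solve(Gamma3, g3)                # approximately (0.8, 0, 0)
pred = a @ np.array([2.8, 1.5, 2.1])           # W = (X_3, X_2, X_1)'
mse = gamma(0) - a @ g3                        # approximately 1.0
```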
Exam Patterns
Pattern 1: “Compute $P_n X_{n+h}$ for a given ACVF”
- Template: Given $\gamma(h)$, set up $\Gamma_n$ and $\boldsymbol{\gamma}_n(h)$, solve the system, compute prediction and MSE.
- Key move: Build $\Gamma_n$ from $\gamma(|i-j|)$ (Toeplitz structure). Build $\boldsymbol{\gamma}_n(h)$ as $(\gamma(h), \gamma(h+1), \ldots, \gamma(h+n-1))'$.
Pattern 2: “Show one-step prediction for AR(p) equals $\sum \phi_i X_{n+1-i}$” (Lec 6 Problem 1)
- Key move: Apply linearity of $P$: $P(X_{n+1}|X_n,\ldots,X_1) = \sum \phi_i P(X_{n+1-i}|\cdots) + P(Z_{n+1}|\cdots)$. Use $P(X_j|\cdots) = X_j$ for observed, $P(Z_{n+1}|\cdots) = 0$ for causal.
Pattern 3: “Compute prediction interval”
- Template: Given model parameters and observed data, compute $P_n X_{n+h}$, the MSE, and the 95% PI.
- Key move: $\text{MSE} = \gamma(0) - \mathbf{a}'\boldsymbol{\gamma}(h)$. For one-step AR(p): MSE $= \sigma^2$.
Links
- Forward: ARMA from Definition to Prediction (extends to ARMA case)
- Backward: AR(1) Three Regimes, Autocovariance Function (ACVF)
- STA457