Molecular Note: Best Linear Prediction for Stationary Series

Linked Atomic Concepts: Best Predictor — Conditional Expectation, Best Linear Predictor — General (Theorem 1.2), Best Linear Predictor — P_n X_{n+h}, Prediction Error Properties, Prediction for Causal AR(p), Prediction Intervals, Autocovariance Function (ACVF)


Scene-Setting

You have an AR(1) process $X_t = 0.8 X_{t-1} + Z_t$ with $\sigma^2 = 1$, and you’ve observed $X_1 = 2.1, X_2 = 1.5, X_3 = 2.8$. You need to predict $X_4$ and give a 95% prediction interval.

Concept Chain

Why not use the best predictor? → Best Predictor — Conditional Expectation

$E(X_4 \mid X_3, X_2, X_1)$ is optimal under MSE among all functions of the data, but it requires the full joint distribution, which is rarely known. We restrict to linear functions instead.

The general framework → Best Linear Predictor — General (Theorem 1.2)

$P(U|\mathbf{W}) = E(U) + \mathbf{a}'(\mathbf{W} - E(\mathbf{W}))$, where $\mathbf{a}$ solves $\Gamma\mathbf{a} = \text{Cov}(U, \mathbf{W})$ with $\Gamma = \text{Cov}(\mathbf{W}, \mathbf{W})$.

Specialization to time series → Best Linear Predictor — P_n X_{n+h}

$U = X_{n+h}$, $\mathbf{W} = (X_n, \ldots, X_1)'$, $\mu = 0$ (for zero-mean process):

$$P_n X_{n+h} = \mathbf{a}_n' \mathbf{X}$$

Solve $\Gamma_n \mathbf{a}_n = \boldsymbol{\gamma}_n(h)$ where $\Gamma_n$ is Toeplitz: $(\Gamma_n)_{ij} = \gamma(|i-j|)$.

MSE $= \gamma(0) - \mathbf{a}_n'\boldsymbol{\gamma}_n(h)$.
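The recipe above fits in a few lines of code. A minimal sketch, assuming a zero-mean process with known ACVF; the helper name `best_linear_predictor` and the callable-ACVF interface are illustrative choices, not from any library:

```python
import numpy as np

def best_linear_predictor(acvf, x_obs, h=1):
    """P_n X_{n+h} and its MSE for a zero-mean stationary series.

    acvf(k) returns gamma(k); x_obs = (x_1, ..., x_n) in time order.
    """
    n = len(x_obs)
    # Toeplitz matrix: (Gamma_n)_{ij} = gamma(|i - j|)
    Gamma = np.array([[acvf(abs(i - j)) for j in range(n)] for i in range(n)])
    # gamma_n(h) = (gamma(h), gamma(h+1), ..., gamma(h+n-1))'
    gvec = np.array([acvf(h + k) for k in range(n)])
    a = np.linalg.solve(Gamma, gvec)           # Gamma_n a_n = gamma_n(h)
    pred = a @ np.asarray(x_obs, float)[::-1]  # a_n pairs with (X_n, ..., X_1)
    mse = acvf(0) - a @ gvec                   # gamma(0) - a_n' gamma_n(h)
    return pred, mse
```

For the AR(1) of the scene-setting example, `acvf = lambda k: 0.8**k / 0.36` reproduces $P_3 X_4 = 2.24$ and MSE $= 1$.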

Quality guarantees → Prediction Error Properties

The error $X_{n+h} - P_n X_{n+h}$ has zero mean and is uncorrelated with each observation $X_1, \ldots, X_n$. These orthogonality conditions characterize the best linear predictor.

Shortcut for AR(p) → Prediction for Causal AR(p)

For causal AR(p) with $n \geq p$: $P_n X_{n+1} = \phi_1 X_n + \cdots + \phi_p X_{n+1-p}$. No need to solve the full linear system — the AR coefficients are the prediction coefficients.
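The shortcut as code, a sketch with a hypothetical helper name (the formula is the one above):

```python
import numpy as np

def ar_one_step(phi, x_obs):
    """One-step predictor for a causal AR(p):
    P_n X_{n+1} = phi_1 X_n + ... + phi_p X_{n+1-p}.

    phi = (phi_1, ..., phi_p); x_obs = (x_1, ..., x_n) in time order, n >= p.
    """
    p = len(phi)
    recent = np.asarray(x_obs, float)[-p:][::-1]  # (X_n, X_{n-1}, ..., X_{n+1-p})
    return float(np.dot(phi, recent))
```

Note the reversal: the AR coefficients pair with the observations in reverse time order, $\phi_1$ with the most recent $X_n$.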

Uncertainty quantification → Prediction Intervals

$(1-\alpha)$ PI, assuming Gaussian innovations: $P_n X_{n+h} \pm z_{\alpha/2}\sqrt{\text{MSE}}$.

Worked Pipeline

AR(1): $X_t = 0.8X_{t-1} + Z_t$, $\sigma^2 = 1$, $\mu = 0$. Observed: $X_1 = 2.1$, $X_2 = 1.5$, $X_3 = 2.8$; for a causal AR(1), only the most recent value $X_3$ enters the one-step prediction.

Step 1 — Apply AR(p) shortcut: $P_3 X_4 = 0.8 \cdot X_3 = 0.8 \times 2.8 = 2.24$.

Step 2 — MSE for one-step AR(1): the error is $X_4 - P_3 X_4 = Z_4$, so $\text{MSE} = \sigma^2 = 1$.

Step 3 — 95% PI: $2.24 \pm 1.96\sqrt{1} = (0.28, 4.20)$.
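The three steps, executed in plain Python ($1.96$ is the usual normal quantile $z_{0.025}$, hard-coded here):

```python
import math

phi, sigma2, x3 = 0.8, 1.0, 2.8
pred = phi * x3                 # Step 1: AR(p) shortcut, P_3 X_4 = 0.8 * 2.8
mse = sigma2                    # Step 2: one-step AR(1) error is Z_4
z = 1.96                        # Step 3: z_{alpha/2} for alpha = 0.05
lo = pred - z * math.sqrt(mse)
hi = pred + z * math.sqrt(mse)
# pred = 2.24, interval roughly (0.28, 4.20)
```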

Alternative: Full linear system approach (to verify).

$\gamma(0) = 1/(1 - 0.64) = 2.778$, $\gamma(1) = 0.8 \times 2.778 = 2.222$, $\gamma(2) = 0.64 \times 2.778 = 1.778$.

For $n = 3, h = 1$:

$$\Gamma_3 = \begin{pmatrix} 2.778 & 2.222 & 1.778 \\ 2.222 & 2.778 & 2.222 \\ 1.778 & 2.222 & 2.778 \end{pmatrix}, \quad \boldsymbol{\gamma}_3(1) = \begin{pmatrix} 2.222 \\ 1.778 \\ 1.422 \end{pmatrix}$$

Solving $\Gamma_3 \mathbf{a} = \boldsymbol{\gamma}_3(1)$ gives $\mathbf{a} = (0.8, 0, 0)'$ (as expected — only the most recent observation matters for causal AR(1)).
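The same verification in code, keeping the autocovariances exact rather than rounded to three decimals:

```python
import numpy as np

phi, sigma2 = 0.8, 1.0
gamma0 = sigma2 / (1 - phi ** 2)               # gamma(0) = sigma^2 / (1 - phi^2)
gamma = lambda k: phi ** k * gamma0            # AR(1): gamma(k) = phi^k gamma(0)
Gamma3 = np.array([[gamma(abs(i - j)) for j in range(3)] for i in range(3)])
g3 = np.array([gamma(1), gamma(2), gamma(3)])  # gamma_3(1)
a = np.linalg.solve(Gamma3, g3)
# a comes out as (0.8, 0, 0): only X_3 carries weight, as argued above
```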

Exam Patterns

Pattern 1: “Compute $P_n X_{n+h}$ for a given ACVF”

  • Template: Given $\gamma(h)$, set up $\Gamma_n$ and $\boldsymbol{\gamma}_n(h)$, solve the system, compute prediction and MSE.
  • Key move: Build $\Gamma_n$ from $\gamma(|i-j|)$ (Toeplitz structure). Build $\boldsymbol{\gamma}_n(h)$ as $(\gamma(h), \gamma(h+1), \ldots, \gamma(h+n-1))'$.
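The two key moves, applied to a two-step prediction ($h = 2$) that the worked pipeline did not cover; a sketch using the same AR(1) ACVF as above:

```python
import numpy as np

n, h, phi = 3, 2, 0.8
gamma = lambda k: phi ** k / (1 - phi ** 2)       # AR(1) ACVF with sigma^2 = 1
Gamma = np.array([[gamma(abs(i - j)) for j in range(n)] for i in range(n)])
gvec = np.array([gamma(h + k) for k in range(n)])  # (gamma(2), gamma(3), gamma(4))'
a = np.linalg.solve(Gamma, gvec)
mse = gamma(0) - a @ gvec
# a = (phi^2, 0, 0), so P_3 X_5 = 0.64 X_3; mse = sigma^2 (1 + phi^2) = 1.64
```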

Pattern 2: “Show one-step prediction for AR(p) equals $\sum \phi_i X_{n+1-i}$” (Lec 6 Problem 1)

  • Key move: Apply linearity of $P$: $P(X_{n+1}|X_n,\ldots,X_1) = \sum \phi_i P(X_{n+1-i}|\cdots) + P(Z_{n+1}|\cdots)$. Use $P(X_j|\cdots) = X_j$ for observed, $P(Z_{n+1}|\cdots) = 0$ for causal.
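The key move written out as a chain (this is the source's argument, just typeset):

```latex
\begin{align*}
P_n X_{n+1}
  &= P_n\bigl(\phi_1 X_n + \cdots + \phi_p X_{n+1-p} + Z_{n+1}\bigr) \\
  &= \phi_1 P_n X_n + \cdots + \phi_p P_n X_{n+1-p} + P_n Z_{n+1}
     && \text{(linearity of } P_n\text{)} \\
  &= \phi_1 X_n + \cdots + \phi_p X_{n+1-p}
     && (P_n X_j = X_j \text{ for observed } j;\ P_n Z_{n+1} = 0 \text{ by causality}).
\end{align*}
```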

Pattern 3: “Compute prediction interval”

  • Template: Given model parameters and observed data, compute $P_N X_{N+h}$, MSE, and 95% PI.
  • Key move: $\text{MSE} = \gamma(0) - \mathbf{a}'\boldsymbol{\gamma}(h)$. For one-step AR(p): MSE $= \sigma^2$.