Molecular Note: From Raw Series to Stationary Residuals

Linked Atomic Concepts: Time Series Characteristics, Classical Decomposition, Trend Estimation — Regression, Trend Estimation — Moving Average Filter, Trend Estimation — Exponential Smoothing, Trend Elimination — Differencing, Seasonality Estimation and Elimination, Testing Residuals — iid and White Noise


Scene-Setting

You receive quarterly beer production data for 1975–1983. The plot shows an upward trend and a repeating seasonal cycle every 4 quarters. You need to strip these components until the residuals look like stationary noise, then decide whether further modeling is needed.

Concept Chain

Stage 1: Visual Diagnosis → Time Series Characteristics

Plot the data. Identify: trend (upward drift), seasonality (period $d = 4$), and whether the variation is constant or changes with the level. If variation grows with level → log transform first.

Stage 2: Choose a Decomposition → Classical Decomposition

Assume additive: $X_t = m_t + s_t + Y_t$. If log was applied, this holds on the log scale.
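As a short illustration of why the log scale restores additivity, suppose (an assumption for this example, not something stated about the data) that the components combine multiplicatively and $X_t > 0$:

$$
X_t = m_t \, s_t \, Y_t \quad\Longrightarrow\quad \log X_t = \log m_t + \log s_t + \log Y_t .
$$

The right-hand side has the additive form above, with $\log m_t$, $\log s_t$, $\log Y_t$ playing the roles of trend, seasonal component, and noise.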

Stage 3: Estimate/Eliminate Trend

Three options, each with different trade-offs:

Option A — Trend Estimation — Regression: Fit $\hat{m}_t = \hat{a} + \hat{b}t$. Pro: can extrapolate. Con: assumes linear form.
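A minimal sketch of Option A, assuming ordinary least squares on the time index $t = 1, \ldots, n$. The function name `fit_linear_trend` and the toy series are hypothetical and chosen only so the result can be checked by hand.

```python
def fit_linear_trend(x):
    """Least-squares fit of m_t = a + b*t for t = 1, ..., n. Returns (a, b)."""
    n = len(x)
    t = list(range(1, n + 1))
    t_bar = sum(t) / n
    x_bar = sum(x) / n
    # Slope: sum of cross-deviations divided by sum of squared deviations of t.
    sxy = sum((ti - t_bar) * (xi - x_bar) for ti, xi in zip(t, x))
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    b = sxy / sxx
    a = x_bar - b * t_bar
    return a, b


# Toy series (made up): an exact line 10 + 2t, so the fit should recover it.
x = [10 + 2 * t for t in range(1, 9)]
a_hat, b_hat = fit_linear_trend(x)

# Extrapolation: the fitted line can be evaluated one step beyond the sample.
forecast_next = a_hat + b_hat * (len(x) + 1)
```

The last line is what "can extrapolate" means in practice: the fitted formula is defined for any $t$, including $t > n$.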

Option B — Trend Estimation — Moving Average Filter: Compute a $d$-point moving average. For even $d = 2q$ (here $d = 4$, $q = 2$): $\hat{m}_t = (0.5x_{t-2} + x_{t-1} + x_t + x_{t+1} + 0.5x_{t+2})/4$. Pro: nonparametric. Con: gives no estimate for the first and last $q$ points, and can’t forecast.
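The filter for even $d$ can be sketched as follows. This is an illustrative implementation, not the course's code: `centered_ma` is a hypothetical name, positions where the window does not fit are returned as `None`, and the toy series is made up.

```python
def centered_ma(x, d):
    """Centered moving average for even period d = 2q.

    Returns a list the same length as x, with None at the first and
    last q positions, where the window does not fit.
    """
    if d % 2 != 0:
        raise ValueError("this sketch handles even d only")
    q = d // 2
    n = len(x)
    m = [None] * n
    for t in range(q, n - q):
        inner = sum(x[t - q + 1 : t + q])    # full-weight terms
        ends = 0.5 * (x[t - q] + x[t + q])   # half-weight end terms
        m[t] = (inner + ends) / d
    return m


# Toy series (made up): linear trend t plus a period-4 pattern that sums to zero.
pattern = [1, -1, 2, -2]
x = [t + pattern[t % 4] for t in range(12)]
m = centered_ma(x, 4)
```

Because the window covers exactly one full period, the seasonal pattern averages out and the filter returns the underlying linear trend wherever it is defined.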

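Option C — Trend Elimination — Differencing: Apply $\nabla X_t = X_t - X_{t-1}$. Pro: no functional form needed. Con: loses one observation per difference and alters the dependence structure of the noise, so any later model describes the differenced series, not the original.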
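A worked example of the trade-off, assuming for illustration that $X_t = a + bt + Z_t$ with $Z_t$ white noise of variance $\sigma^2$:

$$
\nabla X_t = (a + bt + Z_t) - (a + b(t-1) + Z_{t-1}) = b + Z_t - Z_{t-1}.
$$

The trend is gone, but the new noise term $Z_t - Z_{t-1}$ has variance $2\sigma^2$ and lag-1 autocorrelation $-1/2$, so it is no longer white noise.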

Stage 4: Eliminate Seasonality → Seasonality Estimation and Elimination

If using estimation (Options A/B): Apply the 4-step classical decomposition procedure. Compute seasonal estimates $\hat{s}_k$, re-estimate trend from deseasonalized data, extract residuals $\hat{Y}_t = x_t - \hat{m}_t - \hat{s}_t$.

If using differencing (Option C): Apply $\nabla_4 X_t = X_t - X_{t-4}$ to kill seasonality, then $\nabla$ to kill remaining trend.
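The two differencing steps can be sketched on a made-up noise-free series. The helper `difference` and all numbers here are hypothetical, chosen so the outcome is exact.

```python
def difference(x, lag=1):
    """Lag-`lag` difference x_t - x_{t-lag}; the result is `lag` values shorter."""
    return [x[t] - x[t - lag] for t in range(lag, len(x))]


# Toy series (made up): linear trend with slope b = 1.5 plus a period-4 pattern.
season = [5.0, -2.0, -6.0, 3.0]
x = [20 + 1.5 * t + season[t % 4] for t in range(16)]

step1 = difference(x, lag=4)  # seasonal pattern cancels; trend becomes the constant 4*b
step2 = difference(step1)     # removes the remaining constant
```

In this toy case the lag-4 difference already reduces the linear trend to the constant $4b = 6$, and the ordinary difference then removes it. The cost is five lost observations in total.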

Stage 5: Check Residuals → Testing Residuals — iid and White Noise

Plot sample ACF of $\hat{Y}_t$ with $\pm 1.96/\sqrt{n}$ bounds. If residuals look like white noise → done. If significant autocorrelations remain → fit a stationary model (MA, AR, ARMA) to the residuals.
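A minimal sketch of the check without plotting, assuming the usual sample autocovariance with divisor $n$. The names `sample_acf` and `outside_bounds` are hypothetical, and the alternating toy series is deliberately far from white noise.

```python
import math


def sample_acf(y, max_lag):
    """Sample autocorrelations rho(0), ..., rho(max_lag), using divisor n."""
    n = len(y)
    y_bar = sum(y) / n
    dev = [v - y_bar for v in y]
    gamma0 = sum(v * v for v in dev) / n
    acf = []
    for h in range(max_lag + 1):
        gamma_h = sum(dev[t] * dev[t + h] for t in range(n - h)) / n
        acf.append(gamma_h / gamma0)
    return acf


def outside_bounds(acf, n):
    """Lags h >= 1 whose sample autocorrelation exceeds 1.96 / sqrt(n) in size."""
    bound = 1.96 / math.sqrt(n)
    return [h for h in range(1, len(acf)) if abs(acf[h]) > bound]


# Toy residuals (made up): a strictly alternating series.
y = [(-1) ** t for t in range(20)]
acf = sample_acf(y, max_lag=5)
flagged = outside_bounds(acf, len(y))
```

Lag 0 is excluded from the check because its autocorrelation is always 1. A long list of flagged lags, as in this toy case, points toward fitting a stationary model to the residuals.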

Worked Pipeline

Data: Quarterly production, $n = 36$, period $d = 4$.

  1. Plot → upward trend + clear seasonal cycle + roughly constant amplitude → additive model.
  2. 4-point MA: for example, $\hat{m}_5 = (0.5 \cdot x_3 + x_4 + x_5 + x_6 + 0.5 \cdot x_7)/4$. Compute $\hat{m}_t$ the same way for $t = 3, \ldots, 34$.
  3. Seasonal estimates: Group the deviations $x_t - \hat{m}_t$ ($t = 3, \ldots, 34$) by quarter. Average each quarter’s deviations → $w_1, w_2, w_3, w_4$. Center so the estimates sum to zero: $\hat{s}_k = w_k - \bar{w}$, where $\bar{w} = (w_1 + w_2 + w_3 + w_4)/4$.
  4. Re-estimate trend from $d_t = x_t - \hat{s}_t$ using regression: $\hat{m}_t = \hat{a} + \hat{b}t$.
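  5. Residuals: $\hat{Y}_t = x_t - \hat{m}_t - \hat{s}_t$. Plot the sample ACF. If about 95% of the autocorrelations at lags $h \ge 1$ fall within $\pm 1.96/\sqrt{36} \approx \pm 0.33$ → consistent with white noise, done.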
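Steps 2 to 5 above can be sketched end to end. The series below is a synthetic stand-in, not the beer data: it has $n = 36$, $d = 4$, and no noise, so each step can be checked exactly. All function names and numbers are hypothetical.

```python
def centered_ma4(x):
    """Step 2: 4-point centered MA; None at the first and last two positions."""
    n = len(x)
    m = [None] * n
    for i in range(2, n - 2):
        m[i] = (0.5 * x[i - 2] + x[i - 1] + x[i] + x[i + 1] + 0.5 * x[i + 2]) / 4
    return m


def seasonal_estimates(x, m, d=4):
    """Step 3: average x_t - m_t by season, then center the averages."""
    sums, counts = [0.0] * d, [0] * d
    for i, (xi, mi) in enumerate(zip(x, m)):
        if mi is not None:
            sums[i % d] += xi - mi
            counts[i % d] += 1
    w = [s / c for s, c in zip(sums, counts)]
    w_bar = sum(w) / d
    return [wk - w_bar for wk in w]


def fit_line(y):
    """Step 4: least-squares a + b*t with t = 1, ..., n. Returns (a, b)."""
    n = len(y)
    t = list(range(1, n + 1))
    t_bar, y_bar = sum(t) / n, sum(y) / n
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    b = sxy / sxx
    return y_bar - b * t_bar, b


# Synthetic series (assumption): x_t = 50 + 0.8 t + s_t for t = 1, ..., 36.
true_season = [4.0, -1.0, -5.0, 2.0]
x = [50 + 0.8 * t + true_season[(t - 1) % 4] for t in range(1, 37)]

m = centered_ma4(x)
s_hat = seasonal_estimates(x, m)
deseason = [xi - s_hat[i % 4] for i, xi in enumerate(x)]
a_hat, b_hat = fit_line(deseason)
# Step 5: residuals after removing the fitted trend and seasonal component.
resid = [xi - (a_hat + b_hat * (i + 1)) - s_hat[i % 4] for i, xi in enumerate(x)]
```

With no noise the residuals are zero up to rounding, which confirms that the steps fit together. On real data the residuals would then go to the ACF check.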

Comparison Table

| Method | Type | Forecasting? | Endpoints? | Assumptions |
|---|---|---|---|---|
| Regression | Estimation | ✓ | | Parametric form |
| MA Filter | Estimation | ✗ | Loses $q$ on each side | None |
| Exp. Smoothing | Estimation | ✓ (1-step) | | No trend/seasonality |
| Differencing | Elimination | | Loses 1 obs per $\nabla$ | Polynomial trend degree |

Exam Patterns

Pattern 1: “Is this series stationary?”

  • What they’re testing: visual identification + formal check (mean/var/cov)
  • Template: Given a plot or formula for $X_t$, determine stationarity.
  • Key move: If you see trend or seasonality in the plot → not stationary. If given a formula, compute $E(X_t)$ — if it depends on $t$, stop immediately.
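Two illustrative formulas show the check in both directions. Both are assumptions made for this example, with $Z_t$ white noise of variance $\sigma^2$ and $b \neq 0$:

$$
X_t = a + bt + Z_t: \qquad E(X_t) = a + bt \ \text{ depends on } t \ \Rightarrow\ \text{not stationary.}
$$

$$
X_t = Z_t + \theta Z_{t-1}: \qquad E(X_t) = 0, \qquad
\operatorname{Cov}(X_t, X_{t+h}) =
\begin{cases}
(1 + \theta^2)\sigma^2, & h = 0, \\
\theta\sigma^2, & |h| = 1, \\
0, & |h| > 1.
\end{cases}
$$

In the second case neither the mean nor the covariance depends on $t$, so the series is stationary.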

Pattern 2: “Remove trend and seasonality, test residuals”

  • What they’re testing: the full pipeline
  • Template: Given data with known period $d$, apply classical decomposition, compute residuals, test with ACF.
  • Key move: For even $d$, remember that the two end terms of the MA window get weight $0.5$ (before dividing by $d$); to center the seasonal estimates, subtract the mean of the $w_k$’s.

Pattern 3: “Prove differencing eliminates linear trend” (MT1 Review)

  • Template: Show $\nabla m_t = b$ for $m_t = a + bt$.
  • Key move: Direct substitution: $\nabla m_t = (a + bt) - (a + b(t-1)) = b$.
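The same substitution shows why the number of differences has to match the polynomial degree. As an illustration, take a quadratic trend $m_t = a + bt + ct^2$:

$$
\nabla m_t = b + c\,\bigl(t^2 - (t-1)^2\bigr) = b + c(2t - 1), \qquad
\nabla^2 m_t = c(2t - 1) - c(2(t-1) - 1) = 2c.
$$

One difference leaves a linear trend; the second difference reduces it to a constant.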