Forward-Backward Algorithm

The Forward-Backward algorithm computes marginal posteriors $p(z_t \mid x_{1:T})$ for each latent state in a Hidden Markov Model (HMM) by combining a forward and a backward recursion.

1. Forward Recursion

Define the forward message $\alpha_t(k) = p(z_t = k, x_{1:t})$. Initialize:

$$\alpha_1(k) = \pi_k \,\lambda_1(k)$$

where $\pi_k = p(z_1 = k)$ and $\lambda_t(k) = p(x_t \mid z_t = k)$.

Recurse for $t = 2, \dots, T$:

$$\alpha_t(j) = \lambda_t(j) \sum_{i=1}^K \alpha_{t-1}(i) \,A_{ij}$$

In vector form: $\alpha_t = \lambda_t \odot (A^\top \alpha_{t-1})$, where $A_{ij} = p(z_t = j \mid z_{t-1} = i)$ is the transition matrix and $\odot$ denotes the element-wise product. (In practice each $\alpha_t$ is rescaled to sum to one to prevent underflow, in which case the update holds only up to a normalizing constant.)
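The forward pass can be sketched in NumPy; the HMM parameters below are hypothetical toy values chosen only for illustration:

```python
import numpy as np

# Toy HMM (hypothetical values): K = 2 states, T = 3 observations.
pi = np.array([0.6, 0.4])        # pi_k = p(z_1 = k)
A = np.array([[0.7, 0.3],        # A[i, j] = p(z_t = j | z_{t-1} = i)
              [0.2, 0.8]])
lam = np.array([[0.9, 0.1],      # lam[t, k] = p(x_t | z_t = k)
                [0.4, 0.6],
                [0.2, 0.8]])

T, K = lam.shape
alpha = np.zeros((T, K))
alpha[0] = pi * lam[0]                        # alpha_1(k) = pi_k * lambda_1(k)
for t in range(1, T):
    alpha[t] = lam[t] * (A.T @ alpha[t - 1])  # alpha_t = lambda_t (*) (A^T alpha_{t-1})
```

Each row `alpha[t]` sums to the prefix likelihood $p(x_{1:t})$, since $\alpha_t(k) = p(z_t = k, x_{1:t})$.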

2. Backward Recursion

Define the backward message $\beta_t(k) = p(x_{t+1:T} \mid z_t = k)$. Initialize:

$$\beta_T(k) = 1 \quad \text{for all } k$$

Recurse for $t = T-1, \dots, 1$:

$$\beta_t(i) = \sum_{j=1}^K A_{ij} \,\lambda_{t+1}(j) \,\beta_{t+1}(j)$$
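The backward recursion is a matrix-vector product in the other direction. A minimal sketch, reusing the same hypothetical toy parameters as above:

```python
import numpy as np

# Toy HMM (hypothetical values): K = 2 states, T = 3 observations.
A = np.array([[0.7, 0.3],        # A[i, j] = p(z_t = j | z_{t-1} = i)
              [0.2, 0.8]])
lam = np.array([[0.9, 0.1],      # lam[t, k] = p(x_t | z_t = k)
                [0.4, 0.6],
                [0.2, 0.8]])

T, K = lam.shape
beta = np.ones((T, K))                       # beta_T(k) = 1 for all k
for t in range(T - 2, -1, -1):
    # beta_t(i) = sum_j A[i, j] * lam[t+1, j] * beta[t+1, j]
    beta[t] = A @ (lam[t + 1] * beta[t + 1])
```

Note the transpose asymmetry with the forward pass: the forward update uses `A.T @ alpha`, the backward update uses `A @ (...)`, because $\beta$ sums over the *next* state while $\alpha$ sums over the *previous* one.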

3. Posterior Marginals (Smoothing)

Combining both messages:

$$p(z_t = k \mid x_{1:T}) = \frac{\alpha_t(k) \,\beta_t(k)}{\sum_{j=1}^K \alpha_t(j) \,\beta_t(j)}$$
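Putting the two recursions together gives the full smoother. A self-contained sketch with the same hypothetical toy model:

```python
import numpy as np

# Toy HMM (hypothetical values): K = 2 states, T = 3 observations.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.2, 0.8]])
lam = np.array([[0.9, 0.1],
                [0.4, 0.6],
                [0.2, 0.8]])

T, K = lam.shape

# Forward messages: alpha_t(k) = p(z_t = k, x_{1:t})
alpha = np.zeros((T, K))
alpha[0] = pi * lam[0]
for t in range(1, T):
    alpha[t] = lam[t] * (A.T @ alpha[t - 1])

# Backward messages: beta_t(k) = p(x_{t+1:T} | z_t = k)
beta = np.ones((T, K))
for t in range(T - 2, -1, -1):
    beta[t] = A @ (lam[t + 1] * beta[t + 1])

# Smoothing: row t of gamma is p(z_t | x_{1:T})
gamma = alpha * beta
gamma /= gamma.sum(axis=1, keepdims=True)
```

Each row of `gamma` is a proper distribution over the $K$ states, so it sums to one.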

4. Special Cases

  • Filtering (online inference): uses only the forward pass to compute $p(z_t \mid x_{1:t}) \propto \alpha_t$
  • Prediction: $p(z_{t+1} \mid x_{1:t}) = A^\top \cdot p(z_t \mid x_{1:t})$
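Both special cases reuse only the forward messages. A minimal sketch, again with hypothetical toy parameters:

```python
import numpy as np

# Toy HMM (hypothetical values): K = 2 states, T = 3 observations.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.2, 0.8]])
lam = np.array([[0.9, 0.1],
                [0.4, 0.6],
                [0.2, 0.8]])

T, K = lam.shape
alpha = np.zeros((T, K))
alpha[0] = pi * lam[0]
for t in range(1, T):
    alpha[t] = lam[t] * (A.T @ alpha[t - 1])

# Filtering: normalize each forward message.
filtered = alpha / alpha.sum(axis=1, keepdims=True)   # row t: p(z_t | x_{1:t})

# One-step prediction: push the last filtered distribution through the transitions.
predicted = A.T @ filtered[-1]                        # p(z_{T+1} | x_{1:T})
```

Unlike smoothing, neither quantity looks at future observations, which is what makes this pair usable online.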

5. Marginal Likelihood

The marginal likelihood of the observations is obtained as a by-product of the forward pass:

$$p(x_{1:T}) = \sum_{k=1}^K \alpha_T(k)$$

More generally, $\sum_{k=1}^K \alpha_t(k)\,\beta_t(k) = p(x_{1:T})$ for every $t$, which provides a useful consistency check on an implementation.
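For long sequences $\alpha_t$ underflows, so implementations typically rescale each forward message to sum to one and accumulate the log of the normalizers; the sum of those logs recovers $\log p(x_{1:T})$. A sketch with the same hypothetical toy model:

```python
import numpy as np

# Toy HMM (hypothetical values): K = 2 states, T = 3 observations.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.2, 0.8]])
lam = np.array([[0.9, 0.1],
                [0.4, 0.6],
                [0.2, 0.8]])

T, K = lam.shape
log_lik = 0.0
a = pi * lam[0]
for t in range(T):
    if t > 0:
        a = lam[t] * (A.T @ a)
    c = a.sum()               # c_t = p(x_t | x_{1:t-1}), the normalizer
    log_lik += np.log(c)      # log p(x_{1:T}) = sum_t log c_t
    a = a / c                 # a is now the filtering distribution p(z_t | x_{1:t})
```

The product of the per-step normalizers $c_t$ equals $\sum_k \alpha_T(k)$, so `log_lik` matches the unscaled formula while staying numerically stable.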

6. Complexity

Time $O(TK^2)$, space $O(TK)$. Linear in sequence length $T$, quadratic in number of states $K$.