Forward-Backward Algorithm
The Forward-Backward algorithm computes marginal posteriors $p(z_t \mid x_{1:T})$ for each latent state in a Hidden Markov Model (HMM) by combining a forward and a backward recursion.
1. Forward Recursion
Define the forward message $\alpha_t(k) = p(z_t = k, x_{1:t})$. Initialize:

$$\alpha_1(k) = \pi_k \,\lambda_1(k)$$

where $\pi_k = p(z_1 = k)$ and $\lambda_t(k) = p(x_t \mid z_t = k)$.
Recurse for $t = 2, \dots, T$:

$$\alpha_t(j) = \lambda_t(j) \sum_{i=1}^K \alpha_{t-1}(i) \,A_{ij}$$

In vector form: $\alpha_t = \lambda_t \odot (A^\top \alpha_{t-1})$, where $\odot$ denotes the element-wise product. In practice each $\alpha_t$ is rescaled to sum to one, since the unnormalized messages underflow for long sequences.
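The forward recursion can be sketched in NumPy as follows. This is a minimal implementation with per-step rescaling; the names `forward`, `pi`, `A`, and `lam` are illustrative, not from any particular library.

```python
import numpy as np

def forward(pi, A, lam):
    """Scaled forward pass for an HMM.

    pi  : (K,)   initial distribution, pi[k] = p(z_1 = k)
    A   : (K, K) transition matrix, A[i, j] = p(z_t = j | z_{t-1} = i)
    lam : (T, K) observation likelihoods, lam[t, k] = p(x_t | z_t = k)

    Returns per-step normalized messages alpha (T, K), where
    alpha[t] = p(z_t | x_{1:t}), and normalizers c (T,) with
    p(x_{1:T}) = prod(c).
    """
    T, K = lam.shape
    alpha = np.zeros((T, K))
    c = np.zeros(T)

    a = pi * lam[0]                        # alpha_1(k) = pi_k * lam_1(k)
    c[0] = a.sum()
    alpha[0] = a / c[0]
    for t in range(1, T):
        a = lam[t] * (A.T @ alpha[t - 1])  # lam_t ⊙ (A^T alpha_{t-1})
        c[t] = a.sum()
        alpha[t] = a / c[t]
    return alpha, c
```

Rescaling by `c[t]` keeps the recursion numerically stable; the unnormalized $\alpha_t$ is recovered (up to the accumulated constants) if needed.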
2. Backward Recursion
Define the backward message $\beta_t(k) = p(x_{t+1:T} \mid z_t = k)$. Initialize:

$$\beta_T(k) = 1 \quad \text{for all } k$$

Recurse for $t = T-1, \dots, 1$:
$$\beta_t(i) = \sum_{j=1}^K A_{ij} \,\lambda_{t+1}(j) \,\beta_{t+1}(j)$$

3. Posterior Marginals (Smoothing)
Combining both messages:
$$p(z_t = k \mid x_{1:T}) = \frac{\alpha_t(k) \,\beta_t(k)}{\sum_{j=1}^K \alpha_t(j) \,\beta_t(j)}$$

4. Special Cases
- Filtering (online inference): uses only the forward pass to compute $p(z_t \mid x_{1:t}) \propto \alpha_t$
- Prediction: $p(z_{t+1} \mid x_{1:t}) = A^\top \cdot p(z_t \mid x_{1:t})$
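The backward recursion and the smoothing combination can be sketched as below. This assumes `alpha` holds forward messages that were rescaled at every step (so `alpha[t]` is the filtered posterior $p(z_t \mid x_{1:t})$); with per-step rescaling of both messages, the product $\alpha_t \odot \beta_t$ is still proportional to the true posterior at each $t$, so row-wise renormalization recovers it exactly. Function names are illustrative.

```python
import numpy as np

def backward(A, lam):
    """Scaled backward pass: beta[t] is proportional to p(x_{t+1:T} | z_t)."""
    T, K = lam.shape
    beta = np.ones((T, K))                    # beta_T(k) = 1 for all k
    for t in range(T - 2, -1, -1):
        # beta_t(i) = sum_j A_ij * lam_{t+1}(j) * beta_{t+1}(j)
        b = A @ (lam[t + 1] * beta[t + 1])
        beta[t] = b / b.sum()                 # rescale to avoid underflow
    return beta

def smooth(alpha, beta):
    """Posterior marginals: p(z_t | x_{1:T}) ∝ alpha_t ⊙ beta_t, row-normalized."""
    g = alpha * beta
    return g / g.sum(axis=1, keepdims=True)
```

Filtering is then just `alpha[t]` itself, and one-step prediction is `A.T @ alpha[t]`.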
5. Marginal Likelihood
The marginal likelihood of the observations is obtained as a by-product of the forward pass:

$$p(x_{1:T}) = \sum_{k=1}^K \alpha_T(k)$$

If the forward messages are rescaled at each step, $p(x_{1:T})$ is instead the product of the per-step normalization constants, usually accumulated in log space.

6. Complexity
Time $O(TK^2)$, space $O(TK)$. Linear in sequence length $T$, quadratic in number of states $K$.
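Summing the unnormalized $\alpha_T$ underflows for long sequences, so implementations typically accumulate the log-likelihood from the per-step normalizers of the scaled forward recursion. A minimal sketch (the function name is illustrative):

```python
import numpy as np

def log_likelihood(pi, A, lam):
    """log p(x_{1:T}) via the scaled forward recursion.

    Each step's normalizer c_t satisfies p(x_{1:T}) = prod_t c_t,
    so log p(x_{1:T}) = sum_t log c_t.
    """
    T, _ = lam.shape
    ll = 0.0
    a = pi * lam[0]
    for t in range(T):
        if t > 0:
            a = lam[t] * (A.T @ a)  # a is the normalized alpha_{t-1}
        c = a.sum()
        ll += np.log(c)
        a = a / c                   # rescale; stable for arbitrarily long T
    return ll
```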