Hidden Markov Model (HMM)
A Hidden Markov Model is a latent variable model for sequential data in which the latent states form a Markov chain and each observation depends only on its corresponding latent state.
1. Three Distributions
An HMM with $K$ hidden states and $T$ time steps is specified by:
- Initial distribution: $\pi_k = p(z_1 = k)$ for $k = 1, \dots, K$
- Transition matrix: $A_{ij} = p(z_t = j \mid z_{t-1} = i)$, with rows summing to 1
- Emission distribution: $p(x_t \mid z_t)$; the local evidence at time $t$ is often written $\lambda_t(k) = p(x_t \mid z_t = k)$
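As a concrete sketch, the three distributions can be stored as numpy arrays. The parameters below are a made-up 2-state, 2-symbol toy model chosen only to illustrate the shapes and the normalization constraints:

```python
import numpy as np

# Hypothetical toy HMM: K = 2 hidden states, 2 discrete observation symbols.
pi = np.array([0.6, 0.4])            # initial distribution: pi[k] = p(z_1 = k)
A = np.array([[0.9, 0.1],            # transition matrix: A[i, j] = p(z_t = j | z_{t-1} = i)
              [0.2, 0.8]])
B = np.array([[0.5, 0.5],            # emission matrix: B[k, x] = p(x_t = x | z_t = k)
              [0.9, 0.1]])

# Sanity checks: pi and every row of A and B must sum to 1.
assert np.isclose(pi.sum(), 1.0)
assert np.allclose(A.sum(axis=1), 1.0)
assert np.allclose(B.sum(axis=1), 1.0)
```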
2. Joint Distribution
The joint probability of a sequence of observations $x_{1:T}$ and latent states $z_{1:T}$ factorizes as
$$p(x_{1:T}, z_{1:T}) = p(z_1) \prod_{t=2}^T p(z_t \mid z_{t-1}) \prod_{t=1}^T p(x_t \mid z_t)$$

This factorization follows from the directed graphical model structure: $z_1 \to z_2 \to \cdots \to z_T$, with each $z_t \to x_t$.
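The factorization translates directly into code: one initial term, one transition term per step $t \ge 2$, and one emission term per step. A minimal sketch (working in log space to avoid underflow; the parameter arrays follow the $\pi$, $A$, and discrete-emission conventions defined above):

```python
import numpy as np

def joint_log_prob(pi, A, B, z, x):
    """log p(x_{1:T}, z_{1:T}) via the Markov-chain factorization.

    pi: (K,) initial distribution; A: (K, K) transition matrix;
    B: (K, V) discrete emission matrix; z, x: length-T state and observation sequences.
    """
    logp = np.log(pi[z[0]]) + np.log(B[z[0], x[0]])          # p(z_1) p(x_1 | z_1)
    for t in range(1, len(z)):
        logp += np.log(A[z[t - 1], z[t]])                    # p(z_t | z_{t-1})
        logp += np.log(B[z[t], x[t]])                        # p(x_t | z_t)
    return logp
```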
3. Key Inference Tasks
| Task | Quantity | Algorithm |
|---|---|---|
| Filtering | $p(z_t \mid x_{1:t})$ | Forward algorithm (forward pass only) |
| Smoothing | $p(z_t \mid x_{1:T})$ | Forward–backward algorithm (both passes) |
| MAP sequence | $\arg\max_{z_{1:T}} p(z_{1:T} \mid x_{1:T})$ | Viterbi algorithm |
| Marginal likelihood | $p(x_{1:T})$ | Forward algorithm |
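Filtering, smoothing, and the marginal likelihood all fall out of one forward–backward pass. A sketch of the scaled (normalized) version for discrete emissions, using the $\pi$, $A$, $B$ conventions above; the normalizers $c_t$ double as the per-step likelihood factors, so $\log p(x_{1:T}) = \sum_t \log c_t$:

```python
import numpy as np

def forward_backward(pi, A, B, x):
    """Scaled forward-backward pass for a discrete-emission HMM.

    Returns (alpha, gamma, loglik):
      alpha[t, k] = p(z_t = k | x_{1:t})   (filtered marginals)
      gamma[t, k] = p(z_t = k | x_{1:T})   (smoothed marginals)
      loglik      = log p(x_{1:T})
    """
    T, K = len(x), len(pi)
    alpha = np.zeros((T, K))
    c = np.zeros(T)                          # per-step normalizers
    alpha[0] = pi * B[:, x[0]]
    c[0] = alpha[0].sum()
    alpha[0] /= c[0]
    for t in range(1, T):                    # forward (filtering) pass
        alpha[t] = (alpha[t - 1] @ A) * B[:, x[t]]
        c[t] = alpha[t].sum()
        alpha[t] /= c[t]
    beta = np.ones((T, K))
    for t in range(T - 2, -1, -1):           # backward pass, same scaling
        beta[t] = A @ (B[:, x[t + 1]] * beta[t + 1]) / c[t + 1]
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)
    return alpha, gamma, np.log(c).sum()
```

Each filtered or smoothed row is a proper distribution over the $K$ states, and the returned log-likelihood can be checked against brute-force enumeration of all $K^T$ state paths for short sequences.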
4. Observation Models
Common emission distributions include:
- Discrete: $p(x_t \mid z_t = k)$ is a categorical distribution (e.g., dice, words)
- Gaussian: $p(x_t \mid z_t = k) = \mathcal{N}(x_t \mid \mu_k, \Sigma_k)$
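For the Gaussian case, the emission step just evaluates each state's density at the observed value; these densities play the same role as a column $B[:, x_t]$ of the discrete emission matrix. A minimal 1-D sketch (the means and standard deviations below are illustrative assumptions):

```python
import numpy as np

def gaussian_emission(x_t, mus, sigmas):
    """p(x_t | z_t = k) = N(x_t | mu_k, sigma_k^2), evaluated for every state k (1-D case)."""
    return np.exp(-0.5 * ((x_t - mus) / sigmas) ** 2) / (sigmas * np.sqrt(2.0 * np.pi))

# Hypothetical 2-state example: a "low" state around 0 and a "high" state around 5.
mus = np.array([0.0, 5.0])
sigmas = np.array([1.0, 2.0])
evidence = gaussian_emission(0.0, mus, sigmas)   # local evidence vector at x_t = 0
```

Unlike the discrete case, these values are densities rather than probabilities, so they need not sum to 1 across states; the forward–backward normalizers absorb that.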