Hidden Markov Model (HMM)

A Hidden Markov Model is a Latent Variable Model for sequential data where the latent states form a Markov chain and each observation depends only on its corresponding latent state.

1. Three Distributions

An HMM with $K$ hidden states and $T$ time steps is specified by:

  1. Initial distribution: $\pi_k = p(z_1 = k)$ for $k = 1, \dots, K$
  2. Transition matrix: $A_{ij} = p(z_t = j \mid z_{t-1} = i)$, with rows summing to 1
  3. Emission distribution: $p(x_t \mid z_t = k)$ for each state $k$; for discrete observations this is often collected into an emission matrix $B_{kv} = p(x_t = v \mid z_t = k)$, with rows summing to 1
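As a concrete toy instance of these three distributions (all numeric values are hypothetical, chosen only for illustration), a 2-state HMM with 3 discrete observation symbols can be written down directly:

```python
import numpy as np

# Hypothetical 2-state HMM with 3 discrete observation symbols.
pi = np.array([0.6, 0.4])                # initial distribution: pi_k = p(z_1 = k)
A = np.array([[0.7, 0.3],                # transition matrix: A_ij = p(z_t = j | z_{t-1} = i)
              [0.4, 0.6]])
B = np.array([[0.1, 0.4, 0.5],           # emission matrix: B_kv = p(x_t = v | z_t = k)
              [0.6, 0.3, 0.1]])

# Each row of A and B is a conditional distribution, so rows must sum to 1.
assert np.allclose(pi.sum(), 1.0)
assert np.allclose(A.sum(axis=1), 1.0)
assert np.allclose(B.sum(axis=1), 1.0)
```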

2. Joint Distribution

The joint probability of a sequence of observations $x_{1:T}$ and latent states $z_{1:T}$ factorizes as

$$p(x_{1:T}, z_{1:T}) = p(z_1) \prod_{t=2}^T p(z_t \mid z_{t-1}) \prod_{t=1}^T p(x_t \mid z_t)$$

This factorization follows from the directed graphical model structure: $z_1 \to z_2 \to \cdots \to z_T$ with each $z_t \to x_t$.
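The factorization can be evaluated term by term for any single state path. A minimal sketch, using hypothetical parameter values:

```python
import numpy as np

# Hypothetical parameters for a 2-state, 3-symbol HMM.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.1, 0.4, 0.5], [0.6, 0.3, 0.1]])

def joint_prob(x, z, pi, A, B):
    """p(x_{1:T}, z_{1:T}) = p(z_1) * prod_t p(z_t | z_{t-1}) * prod_t p(x_t | z_t)."""
    p = pi[z[0]] * B[z[0], x[0]]                 # p(z_1) p(x_1 | z_1)
    for t in range(1, len(x)):
        p *= A[z[t - 1], z[t]] * B[z[t], x[t]]   # p(z_t | z_{t-1}) p(x_t | z_t)
    return p

x = [0, 1, 2]       # observation sequence
z = [1, 1, 0]       # one candidate state path
p = joint_prob(x, z, pi, A, B)
```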

3. Key Inference Tasks

| Task | Quantity | Algorithm |
| --- | --- | --- |
| Filtering | $p(z_t \mid x_{1:t})$ | Forward-backward algorithm (forward pass only) |
| Smoothing | $p(z_t \mid x_{1:T})$ | Forward-backward algorithm (both passes) |
| MAP sequence | $\arg\max_{z_{1:T}} p(z_{1:T} \mid x_{1:T})$ | Viterbi algorithm |
| Marginal likelihood | $p(x_{1:T})$ | Forward algorithm |
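The filtering and marginal-likelihood tasks share one computation: a normalized forward pass. A minimal NumPy sketch (parameter values hypothetical), with a brute-force check that the recursion really computes $p(x_{1:T})$:

```python
import numpy as np
from itertools import product

# Hypothetical parameters for a 2-state, 3-symbol HMM.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.1, 0.4, 0.5], [0.6, 0.3, 0.1]])

def forward(x, pi, A, B):
    """Normalized forward pass: filtered marginals p(z_t | x_{1:t}) and log p(x_{1:T})."""
    T, K = len(x), len(pi)
    alpha = np.zeros((T, K))
    loglik = 0.0
    for t in range(T):
        # Predict (propagate through A), then condition on the new observation.
        alpha[t] = (pi if t == 0 else alpha[t - 1] @ A) * B[:, x[t]]
        c = alpha[t].sum()           # normalizer = p(x_t | x_{1:t-1})
        loglik += np.log(c)          # log p(x_{1:T}) accumulates the normalizers
        alpha[t] /= c                # alpha[t, k] = p(z_t = k | x_{1:t})
    return alpha, loglik

x = [0, 1, 2]
alpha, loglik = forward(x, pi, A, B)

# Sanity check: p(x_{1:T}) equals the brute-force sum over all K^T state paths.
brute = sum(
    pi[z[0]] * B[z[0], x[0]]
    * np.prod([A[z[t - 1], z[t]] * B[z[t], x[t]] for t in range(1, len(x))])
    for z in product(range(2), repeat=len(x))
)
assert np.isclose(np.exp(loglik), brute)
```

Normalizing at each step keeps the recursion numerically stable for long sequences, and the log of each normalizer sums to the log marginal likelihood for free.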

4. Observation Models

Common emission distributions include:

  • Discrete: $p(x_t \mid z_t = k)$ is a categorical distribution (e.g., dice, words)
  • Gaussian: $p(x_t \mid z_t = k) = \mathcal{N}(x_t \mid \mu_k, \Sigma_k)$
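Switching the observation model only changes how the per-state emission likelihoods are computed; the rest of the machinery is unchanged. A sketch for 1-D Gaussian emissions (the means and standard deviations here are hypothetical):

```python
import numpy as np

# Hypothetical 1-D Gaussian emission parameters for K = 2 states.
mu = np.array([-1.0, 2.0])
sigma = np.array([0.5, 1.0])

def gauss_emission(x_t, mu, sigma):
    """Vector of emission likelihoods N(x_t | mu_k, sigma_k^2) for all states k."""
    return np.exp(-0.5 * ((x_t - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

lam = gauss_emission(0.3, mu, sigma)   # one likelihood per hidden state
```

In a forward pass, this vector plays exactly the role that the column lookup `B[:, x_t]` plays in the discrete case.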