Viterbi Algorithm

The Viterbi algorithm finds the most probable sequence of hidden states (MAP inference) in a Hidden Markov Model (HMM) using dynamic programming.

1. Objective

Find the MAP state sequence:

$$z_{1:T}^* = \arg\max_{z_{1:T}} p(z_{1:T} \mid x_{1:T}) = \arg\max_{z_{1:T}} p(x_{1:T}, z_{1:T})$$
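To make the objective concrete, here is a brute-force sketch that evaluates $p(x_{1:T}, z_{1:T})$ for every candidate state sequence of a tiny hypothetical 2-state HMM (all numbers are illustrative, not from the text). This is exponential in $T$; the dynamic program below avoids it.

```python
import itertools
import numpy as np

# Hypothetical toy HMM, for illustration only.
pi = np.array([0.6, 0.4])          # initial distribution pi_k
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])         # transitions A_ij = p(z_t = j | z_{t-1} = i)
B = np.array([[0.9, 0.1],          # emissions B[j, x] = p(x_t = x | z_t = j)
              [0.2, 0.8]])
obs = [0, 0, 1]                    # observed sequence x_{1:T}

def joint(z, x):
    """p(x_{1:T}, z_{1:T}) for one candidate state sequence z."""
    p = pi[z[0]] * B[z[0], x[0]]
    for t in range(1, len(x)):
        p *= A[z[t - 1], z[t]] * B[z[t], x[t]]
    return p

# Exhaustive argmax over all K^T state sequences.
best = max(itertools.product(range(2), repeat=len(obs)),
           key=lambda z: joint(z, obs))
print(best)   # -> (0, 0, 1)
```

Maximizing the joint $p(x_{1:T}, z_{1:T})$ is equivalent to maximizing the posterior $p(z_{1:T} \mid x_{1:T})$, since they differ only by the constant $p(x_{1:T})$.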

2. Recursion

Define $\delta_t(j)$ as the joint probability of the observations $x_{1:t}$ and the most probable state path ending in state $j$ at time $t$:

$$\delta_t(j) = \max_{z_{1:t-1}} p(z_{1:t-1}, z_t = j, x_{1:t})$$

Initialize, where $\pi_k$ is the initial state distribution and $\lambda_t(j) = p(x_t \mid z_t = j)$ is the local evidence:

$$\delta_1(k) = \pi_k \,\lambda_1(k)$$

Recurse for $t = 2, \dots, T$, where $A_{ij} = p(z_t = j \mid z_{t-1} = i)$ is the transition matrix:

$$\delta_t(j) = \lambda_t(j) \max_{i} \left[\delta_{t-1}(i) \,A_{ij}\right]$$

Store the argmax for traceback:

$$\psi_t(j) = \arg\max_i \left[\delta_{t-1}(i) \,A_{ij}\right]$$

3. Traceback

The optimal final state is $z_T^* = \arg\max_k \delta_T(k)$.

Then recover the full sequence backwards:

$$z_t^* = \psi_{t+1}(z_{t+1}^*) \quad \text{for } t = T-1, \dots, 1$$
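The recursion and traceback above can be sketched in NumPy as follows. The toy HMM parameters are hypothetical, chosen only to exercise the code.

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """MAP state sequence via the delta/psi recursions.

    pi:  (K,) initial distribution
    A:   (K, K) transitions, A[i, j] = p(z_t = j | z_{t-1} = i)
    B:   (K, M) emissions,   B[j, x] = p(x_t = x | z_t = j)
    obs: length-T list of observation indices
    """
    T, K = len(obs), len(pi)
    delta = np.zeros((T, K))
    psi = np.zeros((T, K), dtype=int)

    delta[0] = pi * B[:, obs[0]]                 # delta_1(k) = pi_k * lambda_1(k)
    for t in range(1, T):
        scores = delta[t - 1][:, None] * A       # scores[i, j] = delta_{t-1}(i) A_ij
        psi[t] = scores.argmax(axis=0)           # psi_t(j)
        delta[t] = B[:, obs[t]] * scores.max(axis=0)

    # Traceback: z_T^* = argmax_k delta_T(k), then z_t^* = psi_{t+1}(z_{t+1}^*).
    z = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        z.append(int(psi[t][z[-1]]))
    return z[::-1]

# Toy 2-state HMM (hypothetical numbers).
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],
              [0.2, 0.8]])
print(viterbi(pi, A, B, [0, 0, 1]))   # -> [0, 0, 1]
```

Each time step does one $K \times K$ broadcasted product followed by a column-wise max and argmax, which is where the $O(TK^2)$ cost comes from.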

4. Key Difference from Forward Algorithm

The Viterbi algorithm replaces the sum in the forward recursion with a max:

| Forward | Viterbi |
|---|---|
| $\alpha_t(j) = \lambda_t(j) \sum_i \alpha_{t-1}(i) \, A_{ij}$ | $\delta_t(j) = \lambda_t(j) \max_i \delta_{t-1}(i) \, A_{ij}$ |

Both run in $O(TK^2)$ time for $K$ states and $T$ time steps.
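The single-line difference between the two recursions can be seen on one step of a hypothetical $K = 2$ model (values are illustrative):

```python
import numpy as np

delta_prev = np.array([0.54, 0.08])     # previous delta (or alpha, for the forward pass)
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])              # transition matrix A_ij
lam = np.array([0.9, 0.2])              # lambda_t(j) = p(x_t | z_t = j)

scores = delta_prev[:, None] * A        # scores[i, j] = prev(i) * A_ij
alpha_t = lam * scores.sum(axis=0)      # forward: sum over predecessors i
delta_t = lam * scores.max(axis=0)      # Viterbi: max over predecessors i
```

The intermediate `scores` matrix is identical; only the reduction over the predecessor index differs.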

5. Log-Domain Implementation

In practice, the recursion is computed in log-space to avoid numerical underflow; because $\log$ is monotonic, the maximizing states (and hence the decoded sequence) are unchanged:

$$\log \delta_t(j) = \log \lambda_t(j) + \max_i \left[\log \delta_{t-1}(i) + \log A_{ij}\right]$$
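A minimal log-space sketch of the same recursion (toy model values as above are assumed; `np.errstate` suppresses the harmless `log(0) = -inf` warning for structural zeros):

```python
import numpy as np

def viterbi_log(pi, A, B, obs):
    """Log-space Viterbi: products replaced by sums of logs."""
    T, K = len(obs), len(pi)
    with np.errstate(divide="ignore"):          # log(0) -> -inf for zero probabilities
        log_pi, log_A, log_B = np.log(pi), np.log(A), np.log(B)

    log_delta = np.zeros((T, K))
    psi = np.zeros((T, K), dtype=int)

    log_delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = log_delta[t - 1][:, None] + log_A
        psi[t] = scores.argmax(axis=0)
        log_delta[t] = log_B[:, obs[t]] + scores.max(axis=0)

    z = [int(log_delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        z.append(int(psi[t][z[-1]]))
    return z[::-1]
```

Because only comparisons (max/argmax) drive the decoding, the log-space version returns exactly the same state sequence as the linear-space recursion.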