Gaussian Mixture Model (GMM)

A Gaussian Mixture Model is a Latent Variable Model that represents a distribution as a weighted sum of Gaussian components.

1. Generative Model

For each data point $x_n$:

  1. Draw a latent cluster assignment $z_n \sim \text{Categorical}(\pi_1, \dots, \pi_K)$
  2. Draw the observation $x_n \mid z_n \sim \mathcal{N}_m(\mu_{z_n}, \Sigma_{z_n})$

The mixing weights satisfy $\pi_k \geq 0$ and $\sum_{k=1}^K \pi_k = 1$.
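The two-step generative process above can be sketched directly in NumPy. The parameters `pi`, `mus`, and `Sigmas` below are arbitrary illustrative values, not from any fitted model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 2-component, 2-D mixture parameters (assumed values).
pi = np.array([0.3, 0.7])                        # mixing weights, sum to 1
mus = np.array([[0.0, 0.0], [4.0, 4.0]])         # component means
Sigmas = np.array([np.eye(2), 0.5 * np.eye(2)])  # component covariances

def sample_gmm(n):
    """Draw n points via the two-step generative process."""
    z = rng.choice(len(pi), size=n, p=pi)  # step 1: z_n ~ Categorical(pi)
    x = np.array([rng.multivariate_normal(mus[k], Sigmas[k]) for k in z])  # step 2
    return z, x

z, x = sample_gmm(500)
```

With these weights, roughly 30% of the draws come from the first component and 70% from the second.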

2. Marginal Density

Marginalizing over the latent variable gives

$$p(x) = \sum_{k=1}^K \pi_k \,\mathcal{N}_m(x \mid \mu_k, \Sigma_k)$$

Each component $k$ has three parameters: a mean $\mu_k$, a covariance $\Sigma_k$, and a mixing weight $\pi_k$.
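The marginal density can be evaluated term by term from this formula. A plain-NumPy sketch, with a hand-rolled Gaussian density helper and illustrative 1-D parameters (both are assumptions for the example, not part of any library API):

```python
import numpy as np

def gauss_pdf(x, mu, Sigma):
    """Density of N(x | mu, Sigma) at a single point x."""
    d = len(mu)
    diff = x - mu
    quad = diff @ np.linalg.solve(Sigma, diff)       # (x-mu)^T Sigma^{-1} (x-mu)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(Sigma))
    return np.exp(-0.5 * quad) / norm

def gmm_pdf(x, pi, mus, Sigmas):
    """Marginal density p(x) = sum_k pi_k N(x | mu_k, Sigma_k)."""
    return sum(w * gauss_pdf(x, mu, S) for w, mu, S in zip(pi, mus, Sigmas))

# Illustrative 1-D mixture: two unit-variance Gaussians at 0 and 4.
pi = np.array([0.3, 0.7])
mus = [np.array([0.0]), np.array([4.0])]
Sigmas = [np.eye(1), np.eye(1)]
p0 = gmm_pdf(np.array([0.0]), pi, mus, Sigmas)
```

At $x = 0$ the density is dominated by the first component, since the second mean is four standard deviations away.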

3. Responsibilities

The posterior probability that data point $x_n$ belongs to component $k$ is

$$r_{nk} = p(z_n = k \mid x_n) = \frac{\pi_k \,\mathcal{N}_m(x_n \mid \mu_k, \Sigma_k)}{\sum_{j=1}^K \pi_j \,\mathcal{N}_m(x_n \mid \mu_j, \Sigma_j)}$$

These are called responsibilities and are computed in the E-step of EM.
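The E-step computation of $r_{nk}$ can be sketched as a vectorized NumPy function. The parameter values in the demo at the bottom are assumed for illustration:

```python
import numpy as np

def responsibilities(X, pi, mus, Sigmas):
    """E-step: r[n, k] = pi_k N(x_n | mu_k, Sigma_k) / sum_j pi_j N(x_n | mu_j, Sigma_j)."""
    N, d = X.shape
    K = len(pi)
    dens = np.empty((N, K))
    for k in range(K):
        diff = X - mus[k]                                  # (N, d)
        sol = np.linalg.solve(Sigmas[k], diff.T).T         # Sigma_k^{-1} (x_n - mu_k)
        quad = np.einsum('nd,nd->n', diff, sol)            # Mahalanobis terms
        norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(Sigmas[k]))
        dens[:, k] = pi[k] * np.exp(-0.5 * quad) / norm
    return dens / dens.sum(axis=1, keepdims=True)          # normalize each row

# Illustrative demo: two well-separated components (assumed values).
pi = np.array([0.5, 0.5])
mus = [np.array([0.0, 0.0]), np.array([10.0, 0.0])]
Sigmas = [np.eye(2), np.eye(2)]
X = np.array([[0.1, -0.1], [9.8, 0.2]])
r = responsibilities(X, pi, mus, Sigmas)  # each row sums to 1
```

With well-separated means, each point's responsibility concentrates almost entirely on its nearest component.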

4. Log-Likelihood

Given $N$ observations $\{x_1, \dots, x_N\}$, the observed-data log-likelihood is

$$\ell(\theta) = \sum_{n=1}^N \log \left( \sum_{k=1}^K \pi_k \,\mathcal{N}_m(x_n \mid \mu_k, \Sigma_k) \right)$$

Because the sum sits inside the logarithm, the log-likelihood does not decompose into per-component terms, which makes direct optimization difficult and motivates Expectation-Maximization (EM).
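In practice $\ell(\theta)$ is evaluated in log space with the log-sum-exp trick, since the component densities can underflow in high dimensions. A minimal NumPy sketch (the single-point sanity check at the bottom uses an assumed one-component standard normal):

```python
import numpy as np

def gmm_loglik(X, pi, mus, Sigmas):
    """Observed-data log-likelihood, computed stably via log-sum-exp."""
    N, d = X.shape
    K = len(pi)
    logp = np.empty((N, K))  # logp[n, k] = log(pi_k) + log N(x_n | mu_k, Sigma_k)
    for k in range(K):
        diff = X - mus[k]
        sol = np.linalg.solve(Sigmas[k], diff.T).T
        quad = np.einsum('nd,nd->n', diff, sol)
        _, logdet = np.linalg.slogdet(Sigmas[k])
        logp[:, k] = np.log(pi[k]) - 0.5 * (quad + logdet + d * np.log(2 * np.pi))
    # log sum_k exp(logp[n, k]), shifted by the row max for numerical stability
    m = logp.max(axis=1, keepdims=True)
    return float((m.squeeze(1) + np.log(np.exp(logp - m).sum(axis=1))).sum())

# Sanity check: one standard-normal component evaluated at x = 0
# gives -0.5 * log(2*pi).
ll = gmm_loglik(np.array([[0.0]]), np.array([1.0]), [np.array([0.0])], [np.eye(1)])
```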

5. Relation to K-Means

The K-Means Algorithm is the limiting case of EM for GMMs obtained by fixing all covariances to $\Sigma_k = \varepsilon I$ and letting $\varepsilon \to 0$. In this limit, the responsibilities collapse to hard assignments (each $r_{nk}$ is exactly 0 or 1), so the E-step reduces to assigning every point to its nearest mean.
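This limit is easy to verify numerically: with uniform weights and $\Sigma_k = \varepsilon I$, the normalizing constants cancel and the responsibilities reduce to a softmax of $-\lVert x - \mu_k \rVert^2 / (2\varepsilon)$, which sharpens toward a hard assignment as $\varepsilon$ shrinks. The means and test point below are illustrative assumptions:

```python
import numpy as np

def soft_assign(x, mus, eps):
    """Responsibilities under Sigma_k = eps * I and uniform mixing weights.
    The shared normalizers cancel, leaving a softmax over -||x - mu_k||^2 / (2*eps)."""
    d2 = np.array([np.sum((x - mu) ** 2) for mu in mus])
    logits = -d2 / (2 * eps)
    logits -= logits.max()        # shift for numerical stability
    w = np.exp(logits)
    return w / w.sum()

mus = [np.array([0.0, 0.0]), np.array([2.0, 2.0])]
x = np.array([0.4, 0.3])          # closer to the first mean

r_wide = soft_assign(x, mus, 1.0)    # eps large: genuinely soft assignment
r_sharp = soft_assign(x, mus, 0.01)  # eps small: effectively a hard assignment
```

As `eps` shrinks, the responsibility of the nearest component approaches 1, recovering K-Means' hard cluster assignment.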