Importance Sampling
Importance sampling estimates expectations under a target distribution $p(x)$ using samples from a different proposal distribution $q(x)$.
1. Core Identity
For any function $f(x)$, provided $q(x) > 0$ wherever $f(x)\,p(x) \neq 0$:
$$\mathbb{E}_p[f(x)] = \int f(x)\,p(x)\,dx = \int f(x)\frac{p(x)}{q(x)}\,q(x)\,dx = \mathbb{E}_q\left[f(x)\frac{p(x)}{q(x)}\right]$$
2. Importance Weights
The ratio $w(x) = \frac{p(x)}{q(x)}$ is called the importance weight. Given samples $x^{(s)} \sim q$:
$$\mathbb{E}_p[f(x)] \approx \frac{1}{S}\sum_{s=1}^S f(x^{(s)})\,w(x^{(s)})$$
3. Unnormalized Case
When $p(x) = \bar{p}(x)/Z$ with unknown $Z$, use self-normalized importance sampling:
$$\mathbb{E}_p[f(x)] \approx \frac{\sum_{s=1}^S f(x^{(s)})\,\bar{w}(x^{(s)})}{\sum_{s=1}^S \bar{w}(x^{(s)})}$$
where $\bar{w}(x) = \bar{p}(x)/q(x)$ are the unnormalized weights.
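
The plain and self-normalized estimators can be compared side by side. The following is a minimal sketch, not a reference implementation. It assumes a target $p = \mathcal{N}(0, 1)$, a proposal $q = \mathcal{N}(0, 2^2)$, and $f(x) = x^2$, so the true value is $\mathbb{E}_p[f(x)] = 1$. These choices and the names `normal_pdf`, `p_bar`, and `importance_estimates` are illustrative assumptions, not part of the text above.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def p_bar(x):
    """Unnormalized target: N(0, 1) without its 1/sqrt(2*pi) constant (assumed example)."""
    return math.exp(-0.5 * x * x)


def f(x):
    """Test function; E_p[f] = 1 when p = N(0, 1)."""
    return x * x


def importance_estimates(num_samples, q_sigma=2.0, seed=0):
    """Return (plain, self_normalized) estimates of E_p[f] using samples from q."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, q_sigma) for _ in range(num_samples)]
    q_vals = [normal_pdf(x, 0.0, q_sigma) for x in xs]

    # Plain estimator: needs the normalized density p(x).
    w = [normal_pdf(x, 0.0, 1.0) / qx for x, qx in zip(xs, q_vals)]
    plain = sum(f(x) * wx for x, wx in zip(xs, w)) / num_samples

    # Self-normalized estimator: only needs p_bar(x); Z cancels in the ratio.
    w_bar = [p_bar(x) / qx for x, qx in zip(xs, q_vals)]
    self_normalized = sum(f(x) * wx for x, wx in zip(xs, w_bar)) / sum(w_bar)

    return plain, self_normalized


plain, self_norm = importance_estimates(100_000)
print(f"plain: {plain:.4f}, self-normalized: {self_norm:.4f}")
```

Both estimates should land close to 1 in this setup. The self-normalized version never evaluates the constant $Z = \sqrt{2\pi}$, which is the point of using it when $Z$ is unknown.
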
4. Variance
The variance of the importance sampling estimator depends heavily on the choice of $q$:
- If $q$ has lighter tails than $p$, the weights $w(x) = p(x)/q(x)$ grow without bound in the tails, so the estimator can have very high or even infinite variance
- The optimal proposal (minimum variance) is $q^*(x) \propto |f(x)|\,p(x)$
- In practice, choose $q$ to cover the regions where $|f(x)|p(x)$ is large
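
The effect of the tails of $q$ can be checked empirically. The sketch below assumes $p = \mathcal{N}(0, 1)$ and $f(x) = x^2$, and compares a proposal with lighter tails, $q = \mathcal{N}(0, 0.5^2)$, against one with heavier tails, $q = \mathcal{N}(0, 2^2)$. For the lighter-tailed proposal in this example the second moment $\int f(x)^2\,p(x)^2/q(x)\,dx$ diverges, so the estimator has infinite variance. The function names and the repetition counts are illustrative assumptions.

```python
import math
import random
import statistics


def normal_pdf(x, sigma):
    """Density of the zero-mean normal N(0, sigma^2) at x."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def is_estimate(rng, num_samples, q_sigma):
    """One plain importance sampling estimate of E_p[x^2] with p = N(0, 1)."""
    total = 0.0
    for _ in range(num_samples):
        x = rng.gauss(0.0, q_sigma)                       # sample from q
        weight = normal_pdf(x, 1.0) / normal_pdf(x, q_sigma)  # w(x) = p(x) / q(x)
        total += x * x * weight
    return total / num_samples


def estimator_variance(q_sigma, num_repeats=200, num_samples=1000, seed=0):
    """Empirical variance of the estimator across independent repetitions."""
    rng = random.Random(seed)
    estimates = [is_estimate(rng, num_samples, q_sigma) for _ in range(num_repeats)]
    return statistics.variance(estimates)


var_light = estimator_variance(q_sigma=0.5)  # lighter tails than p
var_heavy = estimator_variance(q_sigma=2.0)  # heavier tails than p
print(f"variance with light-tailed q: {var_light:.6f}")
print(f"variance with heavy-tailed q: {var_heavy:.6f}")
```

With the lighter-tailed proposal, the empirical variance is typically much larger and changes noticeably from seed to seed, because rare samples in the tails carry very large weights.
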
5. Comparison with Other Methods
| Method | Requires envelope | Samples are weighted | Samples are independent |
|---|---|---|---|
| Rejection Sampling | Yes | No | Yes |
| Importance Sampling | No | Yes | Yes |
| Metropolis-Hastings | No | No | No |