Sufficient Statistics
A sufficient statistic is a function of the data that preserves all information about the parameter needed for inference.
1. Definition
A statistic $T(X)$ is sufficient for parameter $\theta$ if the conditional distribution of $X$ given $T(X)$ does not depend on $\theta$.
Equivalently, once $T(X)$ is known, the remaining details of the sample contain no additional information about $\theta$.
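This can be checked directly in a small discrete case. The sketch below (an illustration, not part of the original text) uses i.i.d. Bernoulli($\theta$) draws, for which $T(X) = \sum_i X_i$ is sufficient: the conditional probability $P(X = x \mid T = t)$ works out to $1/\binom{n}{t}$, the same value for every $\theta$.

```python
from math import comb

# Illustrative sketch: for i.i.d. Bernoulli(theta) draws, T(X) = sum(X) is
# sufficient. The conditional probability P(X = x | T = t) equals 1 / C(n, t),
# free of theta, which we verify numerically for two different theta values.

def cond_prob(x, theta):
    """P(X = x | T(X) = sum(x)) under i.i.d. Bernoulli(theta)."""
    n, t = len(x), sum(x)
    p_x = theta**t * (1 - theta)**(n - t)               # P(X = x)
    p_t = comb(n, t) * theta**t * (1 - theta)**(n - t)  # P(T = t), binomial
    return p_x / p_t                                    # = 1 / C(n, t)

x = [1, 0, 1, 1, 0]
print(cond_prob(x, 0.3))  # 0.1 = 1 / C(5, 3)
print(cond_prob(x, 0.8))  # 0.1 again: the answer does not depend on theta
```

Any $\theta$ gives the same conditional distribution, which is exactly the definition of sufficiency.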
2. Fisher-Neyman Factorization Criterion
$T(X)$ is sufficient for $\theta$ if and only if the joint density or mass function can be written as
$$p(x|\theta) = h(x)\, g(T(x), \theta)$$
where $h(x)$ does not depend on $\theta$.
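As a standard worked example (not in the original text), for i.i.d. Poisson($\theta$) observations the joint mass function factors as
$$p(x|\theta) = \prod_{i=1}^n \frac{\theta^{x_i} e^{-\theta}}{x_i!} = \underbrace{\left(\prod_{i=1}^n \frac{1}{x_i!}\right)}_{h(x)} \underbrace{\theta^{\sum_{i=1}^n x_i}\, e^{-n\theta}}_{g(T(x),\,\theta)},$$
so the criterion identifies $T(x) = \sum_{i=1}^n x_i$ as sufficient for $\theta$.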
3. Meaning
A sufficient statistic compresses the data without losing inferential information about $\theta$.
4. Role in Exponential Family
If a model belongs to the exponential family,
$$p(x|\theta) = h(x)\exp\left(\eta(\theta)^\top T(x) - A(\eta(\theta))\right)$$
then $T(x)$ is a sufficient statistic for $\theta$.
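A minimal numerical sketch, assuming the standard exponential-family representation of the Bernoulli distribution: with natural parameter $\eta(\theta) = \log\frac{\theta}{1-\theta}$, $T(x) = x$, $A(\eta) = \log(1 + e^\eta)$, and $h(x) = 1$, the exponential-family form reproduces the usual pmf.

```python
import math

# Sketch (assumption: Bernoulli written as a one-parameter exponential family):
# p(x|theta) = h(x) * exp(eta * T(x) - A(eta)) with h(x) = 1, T(x) = x,
# eta = log(theta / (1 - theta)), and A(eta) = log(1 + exp(eta)).

def bernoulli_pmf(x, theta):
    """Direct pmf: theta^x * (1 - theta)^(1 - x)."""
    return theta**x * (1 - theta)**(1 - x)

def exp_family_pmf(x, theta):
    """Same pmf via the exponential-family form."""
    eta = math.log(theta / (1 - theta))  # natural parameter eta(theta)
    A = math.log(1 + math.exp(eta))      # log-partition function A(eta)
    return math.exp(eta * x - A)         # h(x) = 1, T(x) = x

for theta in (0.2, 0.7):
    for x in (0, 1):
        assert abs(bernoulli_pmf(x, theta) - exp_family_pmf(x, theta)) < 1e-12
```

Reading $T(x) = x$ off this form is how sufficient statistics are usually identified for exponential-family models.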
For i.i.d. observations $x_1,\dots,x_n$ from such a model, the joint density is again of exponential-family form, with sufficient statistic
$$\sum_{i=1}^n T(x_i).$$
5. Importance
Sufficient statistics reduce optimization and inference to a lower-dimensional summary of the data, which is why they are central in maximum likelihood estimation (MLE), Bayesian inference, expectation-maximization (EM), and variational inference (VI).
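As a concrete illustration of this reduction (a sketch, not from the original text): for i.i.d. $N(\mu, \sigma^2)$ data, the pair $\left(\sum_i x_i, \sum_i x_i^2\right)$ is sufficient, so the MLE can be computed from these two numbers alone, regardless of the sample size.

```python
# Sketch: for i.i.d. N(mu, sigma^2) data, (sum x_i, sum x_i^2) is sufficient,
# so the MLE needs only these two summaries, never the raw sample.

def normal_mle_from_stats(n, s1, s2):
    """MLE of (mu, sigma^2) from sufficient statistics s1 = sum x, s2 = sum x^2."""
    mu = s1 / n
    var = s2 / n - mu**2  # MLE of the variance (divides by n, so it is biased)
    return mu, var

data = [2.0, 4.0, 4.0, 6.0]
n, s1, s2 = len(data), sum(data), sum(x**2 for x in data)
print(normal_mle_from_stats(n, s1, s2))  # (4.0, 2.0)
```

The same compression is what lets EM and VI work with expected sufficient statistics rather than full datasets.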