Question 1
- [8] Let $C = I_n - \frac{1}{n} \mathbf{1}_n \mathbf{1}_n^\top$ be the centering operator on $\mathbb{R}^n$.
(a) [2] Show that $C$ is positive semi-definite (PSD).
(b) [2] Explain why the rank of $C$ is $n-1$.
(c) [2] Determine the eigenvalues of $C$ and their multiplicities. (Recall: the multiplicity of an eigenvalue $\lambda$ is the number of linearly independent eigenvectors corresponding to $\lambda$.)
(d) [2] Show that $C \mathbf{1}_n = 0$, i.e., $\mathbf{1}_n$ is an eigenvector corresponding to the zero eigenvalue.
Solution 1
(a) $C$ is symmetric and idempotent ($C^\top = C$, $C^2 = C$), i.e., an orthogonal projection matrix. Hence for any $x \in \mathbb{R}^n$, $x^\top C x = x^\top C^\top C x = \|Cx\|^2 \geq 0$, so $C$ is PSD.
(b) $\mathrm{span}(\mathbf{1}_n)$ is a 1-dimensional subspace of $\mathbb{R}^n$, so its orthogonal complement $\mathrm{span}(\mathbf{1}_n)^\perp$ is an $(n-1)$-dimensional subspace of $\mathbb{R}^n$. $C$ projects onto $\mathrm{span}(\mathbf{1}_n)^\perp$, hence $\mathrm{rank}(C) = \dim\left(\mathrm{span}(\mathbf{1}_n)^\perp\right) = n-1$.
(c) $C$ is a projection matrix $\Rightarrow$ its eigenvalues are 0 or 1. Since $\mathrm{rank}(C) = n-1$, the eigenvalue 1 has multiplicity $n-1$ and the eigenvalue 0 has multiplicity 1.
(d) $C \mathbf{1}_n = \left(I_n - \frac{\mathbf{1}_n \mathbf{1}_n^\top}{n}\right) \mathbf{1}_n = \mathbf{1}_n - \frac{\mathbf{1}_n (\mathbf{1}_n^\top \mathbf{1}_n)}{n} = \mathbf{1}_n - \frac{\mathbf{1}_n \cdot n}{n} = 0$
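As an optional numerical sanity check of (a)–(d) (not part of the solution), here is a minimal numpy sketch; the choice $n = 5$ is arbitrary:

```python
# Verify the four claimed properties of the centering matrix C for a small n.
import numpy as np

n = 5
C = np.eye(n) - np.ones((n, n)) / n       # C = I_n - (1/n) 1 1^T

eigvals = np.linalg.eigvalsh(C)           # C is symmetric, so eigvalsh applies
assert np.all(eigvals >= -1e-12)          # (a) PSD: all eigenvalues >= 0
assert np.linalg.matrix_rank(C) == n - 1  # (b) rank n-1
# (c) eigenvalue 0 with multiplicity 1, eigenvalue 1 with multiplicity n-1
assert np.allclose(np.sort(eigvals), [0] + [1] * (n - 1))
assert np.allclose(C @ np.ones(n), 0)     # (d) C 1_n = 0
```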
Question 2
Let
$$X = \begin{pmatrix} | & | & | \\ f_1 & f_2 & f_3 \\ | & | & | \end{pmatrix} = \begin{pmatrix} - & x_1^\top & - \\ & \vdots & \\ - & x_{10}^\top & - \end{pmatrix} \in \mathbb{R}^{10 \times 3}$$
be a data matrix where rows correspond to observations and columns correspond to variables. Assume that $X$ is column-centered (each column has mean 0) and column-orthogonal ($X^\top X = I_3$).
(a) [2] Find the sample mean vector $\bar{x}$ and the sample covariance matrix $S$ of $X$.
(b) [2] Suppose $f_1$ represents measurements in meters, $f_2$ in centimeters, and $f_3$ in millimeters. Convert all measurements to centimeters by defining
$$Y = \begin{pmatrix} | & | & | \\ 100f_1 & f_2 & 0.1f_3 \\ | & | & | \end{pmatrix} \in \mathbb{R}^{10 \times 3}$$
What is the sample covariance matrix of $Y$?
(c) [2] Let $v \in \mathbb{R}^3$ be a vector with unit norm, i.e., $\|v\| = 1$. Define $z = Xv$. Find the sample mean of $z$.
(d) [2] What is the sample variance of $z$?
Solution 2
(a) Column-centered $\Rightarrow \overline{f}_1 = \overline{f}_2 = \overline{f}_3 = 0 \Rightarrow \bar{x} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$. Column-orthogonal $\Rightarrow S_X = \frac{X^\top X}{n} = \frac{1}{10} I_3 = \begin{pmatrix} 1/10 & & \\ & 1/10 & \\ & & 1/10 \end{pmatrix}$.
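An optional numpy sketch of (a). The problem does not specify a particular $X$; one way to construct a column-centered, column-orthogonal matrix is to center a random matrix and take the Q factor of its QR decomposition:

```python
# Build a column-centered, column-orthogonal X in R^{10x3} and check (a).
import numpy as np

rng = np.random.default_rng(0)
n, p = 10, 3
A = rng.standard_normal((n, p))
A -= A.mean(axis=0)            # center the columns (each column now sums to 0)
X, _ = np.linalg.qr(A)         # Q has orthonormal columns: X^T X = I_3
# Q's columns span the same space as A's centered columns, so they stay centered.
assert np.allclose(X.mean(axis=0), 0)        # x-bar = 0
assert np.allclose(X.T @ X / n, np.eye(p) / n)  # S_X = I_3 / 10
```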
(b) $Y = XD$ where $D = \begin{pmatrix} 100 & & \\ & 1 & \\ & & 0.1 \end{pmatrix}$ (right-multiplication by the diagonal matrix $D$ rescales the columns). Then $S_Y = \frac{Y^\top Y}{n} = \frac{D X^\top X D}{n} = D S_X D = \frac{1}{10} D^2 = \begin{pmatrix} 1000 & & \\ & 1/10 & \\ & & 1/1000 \end{pmatrix}$.
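An optional check of (b), reusing the same (assumed) construction of $X$:

```python
# Unit conversion as right-multiplication by a diagonal matrix D.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((10, 3))
X, _ = np.linalg.qr(A - A.mean(axis=0))   # column-centered, column-orthogonal

D = np.diag([100.0, 1.0, 0.1])
Y = X @ D                                  # scales column j of X by D[j, j]
S_Y = Y.T @ Y / 10
assert np.allclose(S_Y, np.diag([1000.0, 0.1, 0.001]))  # S_Y = (1/10) D^2
```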
(c) Since $X$ is column-centered, $CX = X$, so $Cz = CXv = Xv = z$; i.e., $z$ is already centered, hence $\bar{z} = 0$.
(d) Since $\bar{z} = 0$, $S_z = \frac{z^\top z}{n} = \frac{v^\top X^\top X v}{n} = \frac{v^\top v}{n} = \frac{\|v\|^2}{n} = \frac{1}{10}$.
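An optional check of (c)–(d) with the same $X$ construction and an arbitrary unit vector $v$:

```python
# z = Xv has sample mean 0 and sample variance 1/n.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((10, 3))
X, _ = np.linalg.qr(A - A.mean(axis=0))   # column-centered, column-orthogonal

v = rng.standard_normal(3)
v /= np.linalg.norm(v)                    # unit-norm direction
z = X @ v
assert np.isclose(z.mean(), 0)            # (c) z-bar = 0
assert np.isclose(z @ z / 10, 1 / 10)     # (d) sample variance = 1/10
```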
Question 3
Consider the linear regression model
$$y = X\beta + \epsilon, \quad \epsilon \sim \mathcal{N}_n(0, \sigma^2 I_n),$$
where $X \in \mathbb{R}^{n \times p}$ is a fixed column-orthogonal data matrix ($X^\top X = I_p$), $\beta \in \mathbb{R}^p$ is a vector of regression coefficients, and $\epsilon$ is Gaussian noise. Assume that $\sigma^2$ is known.
(a) [2] What is the distribution of $y$? Write it down in terms of $X, \beta,$ and $\sigma^2$.
(b) [2] Note that for column-orthogonal $X$, the regression coefficient vector is $\hat{\beta} = X^\top y$. Show that the fitted values $\hat{y} = X\hat{\beta}$ can be written as $\hat{y} = Py$, where $P$ is a projection matrix. Specify the space onto which $P$ projects.
(c) [2] What is the distribution of $\hat{y}$?
(d) [2] Show that the residuals $r = y - \hat{y}$ can be expressed as $r = P_\perp y$ for some projection matrix $P_\perp$. Specify the space onto which $P_\perp$ projects.
(e) [2] What is the distribution of the residuals $r$?
(f) [2] Explain why the residuals $r$ and the fitted values $\hat{y}$ are independent.
(g) [2] Write down the joint distribution of $\begin{pmatrix} \hat{y} \\ r \end{pmatrix}$.
(h) [2] What is the conditional distribution of $r|\hat{y}$?
Solution 3
(a) $y = X\beta + \epsilon \sim \mathcal{N}_n(X\beta, \sigma^2 I_n)$
(b) $\hat{y} = X\hat{\beta} = XX^\top y = Py$, where $P = XX^\top$ is a projection matrix: it is symmetric, and $P^2 = XX^\top XX^\top = XX^\top = P$ since $X^\top X = I_p$. $P$ projects onto the column space $C(X)$.
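An optional numpy sketch of (b); the dimensions $n = 20$, $p = 4$ are arbitrary:

```python
# P = X X^T is symmetric and idempotent, and fixes the column space of X.
import numpy as np

rng = np.random.default_rng(1)
n, p = 20, 4
X, _ = np.linalg.qr(rng.standard_normal((n, p)))  # column-orthogonal: X^T X = I_p

P = X @ X.T
assert np.allclose(P, P.T)      # symmetric
assert np.allclose(P @ P, P)    # idempotent
assert np.allclose(P @ X, X)    # fixes C(X), as a projection onto C(X) should
```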
(c) $\hat{y} = Py \sim \mathcal{N}_n(PX\beta, \sigma^2 PP^\top)$. Since $P^\top = P$ and $P^2 = P = XX^\top$, and $PX\beta = XX^\top X\beta = X\beta$, we get $\hat{y} \sim \mathcal{N}_n(X\beta, \sigma^2 XX^\top)$.
(d) $r = y - \hat{y} = y - Py = (I - XX^\top)y = P_\perp y$, where $P_\perp = I - XX^\top$ projects onto $C(X)^\perp$, the orthogonal complement of $C(X)$.
(e) $r = P_\perp y \sim \mathcal{N}_n(P_\perp X\beta, \sigma^2 P_\perp P_\perp^\top)$. Here $P_\perp X\beta = (I - P)X\beta = X\beta - X\beta = 0$ and $P_\perp P_\perp^\top = P_\perp^2 = P_\perp = I - XX^\top$, so $r \sim \mathcal{N}_n(0, \sigma^2(I - XX^\top))$.
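An optional check of (d)–(e), with the same arbitrary $X$ construction:

```python
# P_perp = I - X X^T is a projection and annihilates C(X), so E[r] = 0.
import numpy as np

rng = np.random.default_rng(1)
n, p = 20, 4
X, _ = np.linalg.qr(rng.standard_normal((n, p)))

P_perp = np.eye(n) - X @ X.T
assert np.allclose(P_perp @ P_perp, P_perp)  # idempotent: a projection
beta = rng.standard_normal(p)
assert np.allclose(P_perp @ (X @ beta), 0)   # P_perp X beta = 0
```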
(f) $\mathrm{cov}(\hat{y}, r) = \mathrm{cov}(Py, P_\perp y) = P \, \mathrm{var}(y) \, P_\perp^\top = P \cdot \sigma^2 I \cdot P_\perp = \sigma^2 P(I - P) = \sigma^2(P - P^2) = 0$. Since $\hat{y}$ and $r$ are jointly Gaussian (both are linear functions of the Gaussian vector $y$), uncorrelated implies independent.
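An optional check of (f): the matrix factor $P(I - P)$ in the cross-covariance is exactly zero.

```python
# The cross-covariance cov(y-hat, r) = sigma^2 P (I - P) vanishes identically.
import numpy as np

rng = np.random.default_rng(1)
n, p = 20, 4
X, _ = np.linalg.qr(rng.standard_normal((n, p)))

P = X @ X.T
assert np.allclose(P @ (np.eye(n) - P), 0)  # P(I - P) = P - P^2 = 0
```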
(g) $\begin{pmatrix} \hat{y} \\ r \end{pmatrix} \sim \mathcal{N}_{2n} \left( \begin{pmatrix} X\beta \\ 0 \end{pmatrix}, \sigma^2 \begin{pmatrix} XX^\top & 0 \\ 0 & I - XX^\top \end{pmatrix} \right)$
(h) $r$ and $\hat{y}$ are independent, so $r|\hat{y}$ has the same distribution as $r$: $r|\hat{y} \sim \mathcal{N}_n(0, \sigma^2(I - XX^\top))$. (Alternatively, use the conditional formula from class, $y|x \sim \mathcal{N}_q\left(\mu_y + \Sigma_{yx}\Sigma_x^{-1}(x - \mu_x),\ \Sigma_y - \Sigma_{yx}\Sigma_x^{-1}\Sigma_{xy}\right)$, with $r$ in the role of $y$ and $\hat{y}$ in the role of $x$; since $\Sigma_{r\hat{y}} = 0$, it reduces to the marginal distribution of $r$.)
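A final optional Monte Carlo sketch tying (g)–(h) together; the dimensions, $\sigma$, and number of replications are arbitrary choices for the simulation:

```python
# Simulate y many times and compare empirical moments of (y-hat, r)
# against the stated joint Gaussian: means (X beta, 0), zero cross-covariance.
import numpy as np

rng = np.random.default_rng(2)
n, p, sigma, reps = 8, 2, 0.5, 200_000
X, _ = np.linalg.qr(rng.standard_normal((n, p)))
beta = rng.standard_normal(p)
P = X @ X.T

Y = X @ beta + sigma * rng.standard_normal((reps, n))  # each row is a draw of y
Yhat = Y @ P.T                                         # y-hat = P y (P symmetric)
R = Y - Yhat                                           # residuals r = P_perp y

assert np.allclose(Yhat.mean(axis=0), X @ beta, atol=0.01)  # E[y-hat] = X beta
assert np.allclose(R.mean(axis=0), 0, atol=0.01)            # E[r] = 0
# Empirical cross-covariance block should be ~0, consistent with independence.
cross = (Yhat - X @ beta).T @ R / reps
assert np.allclose(cross, 0, atol=0.01)
```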