Matrix Algebra & Decomposition Exercises
4. Eigen-Decomposition and Column Signs
Let $A \in \mathbb{R}^{n \times n}$ be symmetric with Eigen Decomposition (E.D.) $A = U\Lambda U^T$ where $U = (u_1, \dots, u_n)$. Denote $\tilde{U} = (u_1, \dots, -u_i, \dots, u_n)$, i.e., matrix $U$ with the $i$-th column replaced by $-u_i$.
Show that $\tilde{U}^T\tilde{U} = \tilde{U}\tilde{U}^T = I_n$ and $A = \tilde{U}\Lambda\tilde{U}^T$.
Proof: Since $U$ is orthogonal, $U^T U = I$, which implies:
- $u_i^T u_i = 1$
- $u_i^T u_j = 0$ for $i \neq j$
For $\tilde{u}_i = -u_i$:
- $\tilde{u}_i^T \tilde{u}_i = (-u_i)^T (-u_i) = u_i^T u_i = 1$
- $\tilde{u}_i^T u_j = (-u_i)^T u_j = - (u_i^T u_j) = 0$ for $j \neq i$
Thus, $\tilde{U}^T \tilde{U} = I$. Since $\tilde{U}$ is square, $\tilde{U}\tilde{U}^T = I$.
For the decomposition:
$$ \tilde{U}\Lambda\tilde{U}^T = \sum_{j=1}^{n} \lambda_j \tilde{u}_j \tilde{u}_j^T = \lambda_i (-u_i)(-u_i)^T + \sum_{j \neq i} \lambda_j u_j u_j^T = \sum_{j=1}^{n} \lambda_j u_j u_j^T = U\Lambda U^T = A $$
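To make this concrete, here is a minimal NumPy sketch (not part of the original notes; NumPy is an assumed tool) checking that flipping the sign of one eigenvector column preserves both orthogonality and the reconstruction of $A$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a random symmetric matrix and its eigen-decomposition A = U Lambda U^T
B = rng.standard_normal((5, 5))
A = (B + B.T) / 2
lam, U = np.linalg.eigh(A)

# Flip the sign of the i-th eigenvector column
i = 2
U_tilde = U.copy()
U_tilde[:, i] = -U_tilde[:, i]

# U_tilde is still orthogonal and still reconstructs A
print(np.allclose(U_tilde.T @ U_tilde, np.eye(5)))         # True
print(np.allclose(U_tilde @ np.diag(lam) @ U_tilde.T, A))  # True
```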
5. Eigen-Decomposition and Column Swapping
Let $A \in \mathbb{R}^{n \times n}$ be symmetric with E.D. $A = U\Lambda U^T$ where $\Lambda = \text{diag}(\lambda_1, \dots, \lambda_n)$ with $\lambda_1 = \lambda_2$. Let $\tilde{U} = (u_2, u_1, \dots, u_n)$, i.e., matrix $U$ with the first two columns swapped.
Show that $\tilde{U}^T\tilde{U} = \tilde{U}\tilde{U}^T = I_n$ and $A = \tilde{U}\Lambda\tilde{U}^T$.
Proof: Swapping two columns of an orthogonal matrix yields another orthogonal matrix, so $\tilde{U}^T\tilde{U} = \tilde{U}\tilde{U}^T = I$. For the decomposition:
$$ \tilde{U}\Lambda\tilde{U}^T = \lambda_1 u_2 u_2^T + \lambda_2 u_1 u_1^T + \dots + \lambda_n u_n u_n^T $$
Since $\lambda_1 = \lambda_2$, this sum is equal to:
$$ \lambda_1 u_1 u_1^T + \lambda_2 u_2 u_2^T + \dots + \lambda_n u_n u_n^T = A $$
6. Powers of a Matrix
E.D. can be used to compute powers of $A$. Specifically, if $A = U\Lambda U^T$, then:
$$ A^k = (U\Lambda U^T)(U\Lambda U^T)\dots(U\Lambda U^T) = U(\Lambda \dots \Lambda)U^T = U\Lambda^k U^T $$
(The inner $U^T U$ factors cancel because $U^T U = I$.)
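A small NumPy sketch (again an illustration, not from the original notes) comparing the direct matrix power with $U\Lambda^k U^T$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Symmetric matrix and its eigen-decomposition
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2
lam, U = np.linalg.eigh(A)

k = 5
A_power_direct = np.linalg.matrix_power(A, k)
A_power_eig = U @ np.diag(lam**k) @ U.T

print(np.allclose(A_power_direct, A_power_eig))  # True
```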
7. Orthogonal Projection Matrix Diagonal Elements
If $P \in \mathbb{R}^{n \times n}$ is an orthogonal projection matrix, show that $0 \le P_{ii} \le 1$. (Hint: use $P = P^2 = P^T P$.)
Proof: Since $P$ is an orthogonal projection matrix, $P = P^T P$, so $P_{ii} = (P^T P)_{ii}$.
$$ P_{ii} = \sum_{j=1}^{n} P_{ji}^2 = P_{ii}^2 + \sum_{j \neq i} P_{ji}^2 \ge P_{ii}^2 $$
This implies $P_{ii} \ge P_{ii}^2$, so $P_{ii}(1 - P_{ii}) \ge 0$, which means $0 \le P_{ii} \le 1$.
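A quick numerical check (a NumPy sketch of my own, using the hat-matrix projection onto the column space of a random $X$, which is not part of the original exercise):

```python
import numpy as np

rng = np.random.default_rng(2)

# Orthogonal projection onto the column space of a random n x p matrix
n, p = 8, 3
X = rng.standard_normal((n, p))
P = X @ np.linalg.inv(X.T @ X) @ X.T

diag = np.diag(P)
print(np.allclose(P, P @ P), np.allclose(P, P.T))          # idempotent, symmetric
print(np.all(diag >= -1e-12), np.all(diag <= 1 + 1e-12))   # 0 <= P_ii <= 1
```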
8. Constructing Orthogonal Matrices
If $U \in \mathbb{R}^{n \times p}$ has orthonormal columns ($U^T U = I_p$) and $U_{\perp} \in \mathbb{R}^{n \times (n-p)}$ has orthonormal columns spanning the orthogonal complement of $\text{col}(U)$, then $\tilde{U} = (U, U_{\perp}) \in \mathbb{R}^{n \times n}$ is orthogonal.
$$ \tilde{U}^T \tilde{U} = \begin{pmatrix} U^T U & U^T U_{\perp} \\ U_{\perp}^T U & U_{\perp}^T U_{\perp} \end{pmatrix} = \begin{pmatrix} I_p & 0 \\ 0 & I_{n-p} \end{pmatrix} = I_n $$
9. Centering Matrix as Projection
Explain why the centering matrix $C = I_n - \frac{1_n 1_n^T}{n}$ is an orthogonal projection operator. Find the space it projects onto.
Explanation: Let $u = \frac{1_n}{\sqrt{n}}$. Then $u$ is a unit vector ($u^T u = 1$).
$$ C = I - u u^T $$
$C$ is symmetric, and $C^2 = (I - uu^T)(I - uu^T) = I - 2uu^T + u(u^T u)u^T = I - uu^T = C$, so it is an orthogonal projection. It projects onto the orthogonal complement of the space spanned by $u$ (equivalently, by the vector of ones $1_n$). Thus, $C$ projects onto the subspace of vectors with zero sum (mean-centered vectors).
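A minimal NumPy sketch (an illustration added here, not part of the notes) showing that $C$ is symmetric, idempotent, and maps any vector to its centered version:

```python
import numpy as np

n = 6
C = np.eye(n) - np.ones((n, n)) / n   # centering matrix

v = np.arange(1.0, n + 1)
cv = C @ v

print(np.allclose(C, C.T), np.allclose(C, C @ C))  # symmetric, idempotent
print(np.isclose(cv.sum(), 0.0))                   # projected vector has zero sum
print(np.allclose(cv, v - v.mean()))               # Cv = v - v_bar
```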
SVD and Matrix Calculus Exercises
10. SVD Relations
Let $A \in \mathbb{R}^{n \times p}$ with SVD $A = UDV^T$ where $U=(u_1, \dots, u_p)$, $D=\text{diag}(d_1, \dots, d_p)$, $V=(v_1, \dots, v_p)$.
Show that $Av_i = d_i u_i$ and $A^T u_i = d_i v_i$.
Proof:
$$ A = \sum_{j=1}^{p} d_j u_j v_j^T $$
$$ Av_i = \left( \sum_{j=1}^{p} d_j u_j v_j^T \right) v_i = \sum_{j=1}^{p} d_j u_j (v_j^T v_i) = d_i u_i $$
(Since $v_j^T v_i = 1$ if $i=j$, and $0$ otherwise.) Similarly, $A^T u_i = d_i v_i$.
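As a sanity check, a short NumPy sketch (not from the original notes) verifying $Av_i = d_i u_i$ and $A^T u_i = d_i v_i$ with the thin SVD:

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 7, 4
A = rng.standard_normal((n, p))

# Thin SVD: U is n x p, d has p entries, Vt is p x p
U, d, Vt = np.linalg.svd(A, full_matrices=False)
V = Vt.T

i = 1
print(np.allclose(A @ V[:, i], d[i] * U[:, i]))    # A v_i = d_i u_i
print(np.allclose(A.T @ U[:, i], d[i] * V[:, i]))  # A^T u_i = d_i v_i
```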
11. SVD from E.D. (Handling Negative Eigenvalues)
Let $A \in \mathbb{R}^{(n+1) \times (n+1)}$ be symmetric with E.D. $A = U\Lambda U^T$, where $U = (u_1, \dots, u_n, u_{n+1})$ and $\Lambda = \text{diag}(\lambda_1, \dots, \lambda_n, \lambda_{n+1})$, with $\lambda_1, \dots, \lambda_n \ge 0$ and $\lambda_{n+1} < 0$. Set $d_i = \lambda_i$ for all $i$, and let $\tilde{V} = (u_1, \dots, u_n, -u_{n+1})$ and $D = \text{diag}(d_1, \dots, d_n, -d_{n+1})$. Show that $A = UD\tilde{V}^T$ is a valid SVD.
Proof:
$$ UD\tilde{V}^T = \sum_{i=1}^{n} d_i u_i u_i^T + (-d_{n+1})u_{n+1}(-u_{n+1})^T = \sum_{i=1}^{n+1} d_i u_i u_i^T = \sum_{i=1}^{n+1} \lambda_i u_i u_i^T = A $$
This is a valid SVD because $U$ and $\tilde{V}$ are orthogonal (flipping the sign of one column preserves orthogonality, as in Exercise 4) and the diagonal entries of $D$ are non-negative: $d_i = \lambda_i \ge 0$ for $i \le n$ and $-d_{n+1} = -\lambda_{n+1} > 0$.
12. Frobenius Norm and Singular Values
For $A \in \mathbb{R}^{n \times p}$, show that $||A||_F^2 = \text{tr}(A^T A) = \sum_{i=1}^{\min(n,p)} d_i^2(A)$.
Proof:
$$ \text{tr}(A^T A) = \text{tr}(V D U^T U D V^T) = \text{tr}(V D^2 V^T) = \text{tr}(D^2 V^T V) = \text{tr}(D^2) = \sum_i d_i^2 $$
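A short NumPy sketch (an added illustration, not part of the notes) confirming $\|A\|_F^2 = \text{tr}(A^T A) = \sum_i d_i^2$:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((6, 3))

d = np.linalg.svd(A, compute_uv=False)         # singular values
frob_sq = np.linalg.norm(A, 'fro')**2

print(np.isclose(frob_sq, np.trace(A.T @ A)))  # ||A||_F^2 = tr(A^T A)
print(np.isclose(frob_sq, np.sum(d**2)))       # ... = sum of squared singular values
```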
13. Gradient of Quadratic Form
If $f(x) = x^T S x$ for a symmetric $S \in \mathbb{R}^{n \times n}$, then $\nabla_x f(x) = 2Sx$.
Derivation:
$$ x^T S x = \sum_{i,j} S_{ij} x_i x_j $$
$$ \frac{\partial}{\partial x_k} (x^T S x) = \sum_{j} S_{kj} x_j + \sum_{i} S_{ik} x_i = 2 \sum_{j} S_{kj} x_j \quad (\text{since } S_{ij} = S_{ji}) $$
Thus, the gradient vector is $2Sx$.
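The gradient can be checked against central finite differences. A minimal NumPy sketch (my illustration, not part of the original notes):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4
B = rng.standard_normal((n, n))
S = (B + B.T) / 2                     # symmetric S
x = rng.standard_normal(n)

f = lambda z: z @ S @ z               # quadratic form f(x) = x^T S x

# Central finite differences vs the analytic gradient 2 S x
eps = 1e-6
grad_fd = np.array([
    (f(x + eps * e) - f(x - eps * e)) / (2 * eps)
    for e in np.eye(n)                # rows of I are the unit vectors e_k
])
print(np.allclose(grad_fd, 2 * S @ x, atol=1e-5))  # True
```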
14. Gradient of Trace
If $A, B \in \mathbb{R}^{n \times p}$ and $f(A) = \text{tr}(AB^T)$, then $\nabla_A f(A) = B$.
Derivation:
$$ \text{tr}(AB^T) = \sum_{i,j} A_{ij} B_{ij} \implies \frac{\partial f(A)}{\partial A_{ij}} = B_{ij} $$
Statistical Estimates & MVN Exercises
1. Correlation Matrix
Let $x$ be a $p$-dimensional random vector with $\Sigma = \text{var}(x)$. Let $D_{\Sigma} = \text{diag}(\Sigma_{11}, \dots, \Sigma_{pp})$. Then the correlation matrix is:
$$ R = D_{\Sigma}^{-1/2} \Sigma D_{\Sigma}^{-1/2} $$
Element-wise: $(D_{\Sigma}^{-1/2} \Sigma D_{\Sigma}^{-1/2})_{ij} = \frac{\Sigma_{ij}}{\sqrt{\Sigma_{ii}\Sigma_{jj}}}$.
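A minimal NumPy sketch (an added illustration; the data and sample sizes are arbitrary) showing that $D_{\Sigma}^{-1/2} \Sigma D_{\Sigma}^{-1/2}$ reproduces the usual correlation matrix:

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.standard_normal((200, 3)) @ rng.standard_normal((3, 3))  # correlated columns

Sigma = np.cov(X, rowvar=False)
D_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(Sigma)))
R = D_inv_sqrt @ Sigma @ D_inv_sqrt

print(np.allclose(R, np.corrcoef(X, rowvar=False)))  # matches NumPy's correlation matrix
```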
2. Expectation Identity
Show that $\mathbb{E}(xx^T) - \mu\mu^T = \mathbb{E}((x-\mu)(x-\mu)^T)$.
Proof:
$$ \mathbb{E}((x-\mu)(x-\mu)^T) = \mathbb{E}(xx^T - x\mu^T - \mu x^T + \mu\mu^T) $$
$$ = \mathbb{E}(xx^T) - \mathbb{E}(x)\mu^T - \mu\mathbb{E}(x)^T + \mu\mu^T $$
$$ = \mathbb{E}(xx^T) - \mu\mu^T - \mu\mu^T + \mu\mu^T = \mathbb{E}(xx^T) - \mu\mu^T $$
3. Expectation of Random Matrix Product
If $S$ is a random matrix, and $A, B$ are constant matrices:
$$ \mathbb{E}(ASB) = A \cdot \mathbb{E}(S) \cdot B $$
4. Covariance Properties
For random vectors $x, y$ and matrices $A, B$:
- $Cov(x, y) = \mathbb{E}(xy^T) - \mathbb{E}(x)\mathbb{E}(y)^T$
- $Cov(x, y) = Cov(y, x)^T$
- $Cov(Ax, By) = A Cov(x, y) B^T$
5. Sample Variance and Centering
If $v \in \mathbb{R}^n$ and $C$ is the centering matrix:
$$ S_v^2 = \frac{1}{n} ||Cv||^2 $$
Proof:
$$ Cv = \begin{pmatrix} v_1 - \bar{v} \\ \vdots \\ v_n - \bar{v} \end{pmatrix} \implies \frac{1}{n}||Cv||^2 = \frac{1}{n} \sum_i (v_i - \bar{v})^2 = S_v^2 $$
6. Quadratic Form and Sample Variance
Let $S$ be the sample covariance matrix of $X$. For $v \in \mathbb{R}^p$, $v^T S v$ is the sample variance of $Xv$.
$$ v^T S v = v^T \frac{X^T C X}{n} v = \frac{(Xv)^T C (Xv)}{n} = \frac{||C(Xv)||^2}{n} = S_{Xv}^2 $$
Here $(Xv)^T C (Xv) = ||C(Xv)||^2$ because $C = C^T C$. Since $S_{Xv}^2 \ge 0$ for every $v$, $S$ is positive semidefinite (PSD).
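A quick NumPy sketch (an added check with arbitrary simulated data, using the $1/n$ convention of these notes) confirming that $v^T S v$ equals the sample variance of $Xv$ and that $S$ is PSD:

```python
import numpy as np

rng = np.random.default_rng(7)
n, p = 50, 3
X = rng.standard_normal((n, p))
v = rng.standard_normal(p)

C = np.eye(n) - np.ones((n, n)) / n
S = X.T @ C @ X / n                    # sample covariance (1/n convention)

lhs = v @ S @ v
rhs = np.var(X @ v)                    # np.var uses the same 1/n convention by default
print(np.isclose(lhs, rhs))            # v^T S v equals the sample variance of Xv
print(np.all(np.linalg.eigvalsh(S) >= -1e-12))  # S is PSD
```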
7. Regression Coefficients
For centered $X$ and $y$, the regression coefficient $\beta = S_{xx}^{-1}S_{xy}$.
$$ S_{xx} = \frac{X^T X}{n}, \quad S_{xy} = \frac{X^T y}{n} $$
$$ \beta = S_{xx}^{-1} S_{xy} = \left( \frac{X^T X}{n} \right)^{-1} \frac{X^T y}{n} = (X^T X)^{-1} X^T y $$
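A short NumPy sketch (my illustration with simulated data; the true coefficients are arbitrary) showing that the covariance-based formula matches ordinary least squares on centered data:

```python
import numpy as np

rng = np.random.default_rng(8)
n, p = 100, 3
X = rng.standard_normal((n, p))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(n)

# Center both X and y
Xc = X - X.mean(axis=0)
yc = y - y.mean()

S_xx = Xc.T @ Xc / n
S_xy = Xc.T @ yc / n

beta_cov = np.linalg.solve(S_xx, S_xy)
beta_ls, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
print(np.allclose(beta_cov, beta_ls))   # the two formulas agree
```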
Standard MVN Density
For $X \sim N_p(0, I_p)$, show the density is $f(x) = \frac{1}{(2\pi)^{p/2}} e^{-\frac{1}{2}||x||^2}$.
- $\det(I_p) = 1$
- $(x-\mu)^T \Sigma^{-1} (x-\mu) = x^T I x = ||x||^2$
Independence of Uncorrelated MVN
Let $z = \begin{pmatrix} X \\ Y \end{pmatrix} \sim N_{p+q} \left( \begin{pmatrix} \mu_X \\ \mu_Y \end{pmatrix}, \begin{pmatrix} \Sigma_X & 0 \\ 0 & \Sigma_Y \end{pmatrix} \right)$. Show $X$ and $Y$ are independent. The joint density factors:
$$ f_Z(x, y) = f_X(x) f_Y(y) $$
This is because $\Sigma^{-1}$ is block diagonal, making the quadratic form additive in the exponent, and the determinant is multiplicative: $\det(\Sigma) = \det(\Sigma_X)\det(\Sigma_Y)$.
LEC4 EXERCISE (Numerical MVN)
Suppose $X = (X_1, X_2)^T \sim N_2(\mu, \Sigma)$ where:
- $\mu = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$
- Eigenvalues: $\lambda_1 = 1, \lambda_2 = 2$
- Eigenvectors: $u_1 = \begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix}, u_2 = \begin{pmatrix} -1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix}$
Calculate $\Sigma$:
$$ \Sigma = U \Lambda U^T = \lambda_1 u_1 u_1^T + \lambda_2 u_2 u_2^T = \begin{pmatrix} 0.5 & 0.5 \\ 0.5 & 0.5 \end{pmatrix} + \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix} = \begin{pmatrix} 1.5 & -0.5 \\ -0.5 & 1.5 \end{pmatrix} $$
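A two-line NumPy check of this computation (an added illustration, not part of the exercise):

```python
import numpy as np

U = np.array([[1, -1],
              [1,  1]]) / np.sqrt(2)   # columns are u1, u2
Lam = np.diag([1.0, 2.0])

Sigma = U @ Lam @ U.T
print(Sigma)                           # [[ 1.5 -0.5], [-0.5  1.5]]
```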
1. Distributions of $X_1$ and $X_2$
Marginal distributions:
- $X_1 \sim N(1, 1.5)$
- $X_2 \sim N(2, 1.5)$
2. Conditional Distributions
Formula: $X_1 | X_2 = x_2 \sim N\left( \mu_1 + \frac{\sigma_{12}}{\sigma_{22}}(x_2 - \mu_2), \sigma_{11} - \frac{\sigma_{12}^2}{\sigma_{22}} \right)$.
- $\frac{\sigma_{12}}{\sigma_{22}} = \frac{-0.5}{1.5} = -\frac{1}{3}$
- Conditional Variance: $1.5 - \frac{(-0.5)^2}{1.5} = 1.5 - \frac{0.25}{1.5} = \frac{2.25 - 0.25}{1.5} = \frac{2}{1.5} = \frac{4}{3}$
Results:
- $X_1 | X_2 = 1 \sim N(1 - \frac{1}{3}(1-2), \frac{4}{3}) = N(\frac{4}{3}, \frac{4}{3})$
- $X_2 | X_1 = 1 \sim N(2 - \frac{1}{3}(1-1), \frac{4}{3}) = N(2, \frac{4}{3})$
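A small NumPy sketch (an added check; the helper function name is my own) computing the conditional mean and variance of $X_1 \mid X_2 = x_2$ for this bivariate normal:

```python
import numpy as np

mu = np.array([1.0, 2.0])
Sigma = np.array([[1.5, -0.5],
                  [-0.5, 1.5]])

def cond_x1_given_x2(x2):
    """Conditional mean and variance of X1 | X2 = x2 for a bivariate normal."""
    mean = mu[0] + Sigma[0, 1] / Sigma[1, 1] * (x2 - mu[1])
    var = Sigma[0, 0] - Sigma[0, 1]**2 / Sigma[1, 1]
    return mean, var

print(cond_x1_given_x2(1.0))   # (1.333..., 1.333...), i.e. N(4/3, 4/3)
```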
3 & 4. Contour Plots and Confidence Intervals (90%)
Using $\chi^2(2)$ table, $90\%$ percentile is approx $4.61$. Ellipse axes lengths are proportional to $\sqrt{4.61 \lambda_i}$:
- Axis 1 along $u_1$: length $\propto \sqrt{1 \cdot 4.61}$
- Axis 2 along $u_2$: length $\propto \sqrt{2 \cdot 4.61}$
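A brief sketch of the numbers behind this (using scipy.stats.chi2, an assumed tool not mentioned in the notes) computing the 90% quantile and the resulting semi-axis lengths:

```python
import numpy as np
from scipy.stats import chi2

lam = np.array([1.0, 2.0])     # eigenvalues of Sigma
c = chi2.ppf(0.90, df=2)       # ~4.605

semi_axes = np.sqrt(c * lam)   # semi-axis lengths of the 90% ellipse, along u1 and u2
print(c, semi_axes)            # 4.605..., [2.146..., 3.035...]
```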
5. General Conditional Distribution
$$ X_1 | X_2 = x \sim N\left( 1 - \frac{1}{3}(x - 2), \frac{4}{3} \right) $$
6. Plot of Conditional Mean
$$ g(x) = \mathbb{E}(X_1 | X_2 = x) = 1 - \frac{1}{3}x + \frac{2}{3} = \frac{5}{3} - \frac{1}{3}x $$
(Or in terms of $x_2$: $x_1 = -\frac{1}{3}x_2 + \frac{5}{3} \iff x_2 = -3x_1 + 5$.)
7. 95% Confidence Region for $X_1 | X_2 = x$
Using 2SD rule (approx for 95%):
$$ \text{Upper: } u(x) = \left( 1 - \frac{1}{3}(x-2) \right) + 2\sqrt{\frac{4}{3}} $$
$$ \text{Lower: } l(x) = \left( 1 - \frac{1}{3}(x-2) \right) - 2\sqrt{\frac{4}{3}} $$
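A minimal NumPy sketch (an added illustration; the evaluation grid and the helper name `g` are my own) tabulating the conditional mean and the approximate 95% band, which could then be plotted against $x$:

```python
import numpy as np

def g(x):
    """Conditional mean of X1 | X2 = x, i.e. 5/3 - x/3."""
    return 1 - (x - 2) / 3

sd = np.sqrt(4 / 3)                 # conditional standard deviation

x = np.linspace(-2, 6, 5)
upper = g(x) + 2 * sd               # 2-SD rule as an approximation to 1.96
lower = g(x) - 2 * sd
print(np.column_stack([x, lower, g(x), upper]))
```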