Matrix Algebra & Decomposition Exercises

4. Eigen-Decomposition and Column Signs

Let $A \in \mathbb{R}^{n \times n}$ be symmetric with Eigen Decomposition (E.D.) $A = U\Lambda U^T$ where $U = (u_1, \dots, u_n)$. Denote $\tilde{U} = (u_1, \dots, -u_i, \dots, u_n)$, i.e., matrix $U$ with the $i$-th column replaced by $-u_i$.

Show that $\tilde{U}^T\tilde{U} = \tilde{U}\tilde{U}^T = I_n$ and $A = \tilde{U}\Lambda\tilde{U}^T$.

Proof: Since $U$ is orthogonal, $U^T U = I$, which implies:

  • $u_i^T u_i = 1$
  • $u_i^T u_j = 0$ for $i \neq j$

For $\tilde{u}_i = -u_i$:

  • $\tilde{u}_i^T \tilde{u}_i = (-u_i)^T (-u_i) = u_i^T u_i = 1$
  • $\tilde{u}_i^T u_j = (-u_i)^T u_j = - (u_i^T u_j) = 0$

Thus, $\tilde{U}^T \tilde{U} = I$. Since $\tilde{U}$ is square, its left inverse $\tilde{U}^T$ is also its right inverse, so $\tilde{U}\tilde{U}^T = I$.

For the decomposition:

$$ \tilde{U}\Lambda\tilde{U}^T = \sum_{j=1}^{n} \lambda_j \tilde{u}_j \tilde{u}_j^T = \lambda_i (-u_i)(-u_i)^T + \sum_{j \neq i} \lambda_j u_j u_j^T = \sum_{j=1}^{n} \lambda_j u_j u_j^T = U\Lambda U^T = A $$
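
A minimal NumPy sketch to check this numerically (the symmetric matrix, its size, and the flipped column index are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.normal(size=(4, 4))
A = (B + B.T) / 2                        # arbitrary symmetric matrix
lam, U = np.linalg.eigh(A)               # A = U diag(lam) U^T

U_tilde = U.copy()
U_tilde[:, 1] *= -1                      # flip the sign of one column

assert np.allclose(U_tilde.T @ U_tilde, np.eye(4))          # still orthogonal
assert np.allclose(U_tilde @ U_tilde.T, np.eye(4))
assert np.allclose(U_tilde @ np.diag(lam) @ U_tilde.T, A)   # still reconstructs A
```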

5. Eigen-Decomposition and Column Swapping

Let $A \in \mathbb{R}^{n \times n}$ be symmetric with E.D. $A = U\Lambda U^T$ where $\Lambda = \text{diag}(\lambda_1, \dots, \lambda_n)$ with $\lambda_1 = \lambda_2$. Let $\tilde{U} = (u_2, u_1, \dots, u_n)$, i.e., matrix $U$ with the first two columns swapped.

Show that $\tilde{U}^T\tilde{U} = \tilde{U}\tilde{U}^T = I_n$ and $A = \tilde{U}\Lambda\tilde{U}^T$.

Proof: Swapping two columns of an orthogonal matrix just reorders its orthonormal columns, so $\tilde{U}$ is again orthogonal: $\tilde{U}^T\tilde{U} = \tilde{U}\tilde{U}^T = I$. For the decomposition:

$$ \tilde{U}\Lambda\tilde{U}^T = \lambda_1 u_2 u_2^T + \lambda_2 u_1 u_1^T + \dots + \lambda_n u_n u_n^T $$

Since $\lambda_1 = \lambda_2$, this sum is equal to:

$$ \lambda_1 u_1 u_1^T + \lambda_2 u_2 u_2^T + \dots + \lambda_n u_n u_n^T = A $$

6. Powers of a Matrix

E.D. can be used to compute powers of $A$. Specifically, if $A = U\Lambda U^T$, then:

$$ A^k = (U\Lambda U^T)(U\Lambda U^T)\cdots(U\Lambda U^T) = U\Lambda (U^T U) \Lambda (U^T U) \cdots \Lambda U^T = U\Lambda^k U^T $$

since each interior factor $U^T U = I$ cancels.
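
As a sanity check, a short NumPy sketch comparing $U\Lambda^k U^T$ against direct matrix multiplication (matrix and exponent are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.normal(size=(3, 3))
A = (B + B.T) / 2                        # symmetric, so A = U diag(lam) U^T
lam, U = np.linalg.eigh(A)

k = 4
A_k = U @ np.diag(lam**k) @ U.T          # U Lambda^k U^T
assert np.allclose(A_k, np.linalg.matrix_power(A, k))
```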

7. Orthogonal Projection Matrix Diagonal Elements

If $P \in \mathbb{R}^{n \times n}$ is an orthogonal projection matrix, show that $0 \le P_{ii} \le 1$. (Hint: use $P = P^2 = P^T P$)

Proof: Since $P$ is a projection matrix, $P_{ii} = (P^T P)_{ii}$.

$$ P_{ii} = \sum_{j=1}^{n} P_{ji}^2 = P_{ii}^2 + \sum_{j \neq i} P_{ji}^2 \ge P_{ii}^2 $$

This implies $P_{ii} \ge P_{ii}^2$, so $P_{ii}(1 - P_{ii}) \ge 0$, which means $0 \le P_{ii} \le 1$.
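
A quick numerical illustration with the hat matrix $P = X(X^TX)^{-1}X^T$, an orthogonal projection onto the column space of an arbitrary $X$:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(10, 3))
P = X @ np.linalg.inv(X.T @ X) @ X.T     # orthogonal projection onto col(X)

assert np.allclose(P, P.T) and np.allclose(P, P @ P)    # P = P^T = P^2
d = np.diag(P)
assert np.all(d >= 0) and np.all(d <= 1)                # 0 <= P_ii <= 1
```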

8. Constructing Orthogonal Matrices

If $U \in \mathbb{R}^{n \times p}$ has orthonormal columns ($U^T U = I_p$) and $U_{\perp} \in \mathbb{R}^{n \times (n-p)}$ has orthonormal columns spanning the orthogonal complement of the column space of $U$, then $\tilde{U} = (U, U_{\perp}) \in \mathbb{R}^{n \times n}$ is orthogonal:

$$ \tilde{U}^T \tilde{U} = \begin{pmatrix} U^T U & U^T U_{\perp} \\ U_{\perp}^T U & U_{\perp}^T U_{\perp} \end{pmatrix} = \begin{pmatrix} I_p & 0 \\ 0 & I_{n-p} \end{pmatrix} = I_n $$

9. Centering Matrix as Projection

Explain why the centering matrix $C = I_n - \frac{1_n 1_n^T}{n}$ is an orthogonal projection operator. Find the space it projects onto.

Explanation: Let $u = \frac{1_n}{\sqrt{n}}$. Then $u$ is a unit vector ($u^T u = 1$).

$$ C = I - u u^T $$

Since $C^T = C$ and $C^2 = (I - uu^T)(I - uu^T) = I - 2uu^T + u(u^T u)u^T = I - uu^T = C$, the matrix $C$ is symmetric and idempotent, i.e., an orthogonal projection. It projects onto the orthogonal complement of the space spanned by $u$ (equivalently, by the vector of ones $1_n$): the subspace of vectors with zero sum, i.e., mean-centered vectors.
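
A small NumPy check of these properties (the vector below is arbitrary):

```python
import numpy as np

n = 6
C = np.eye(n) - np.ones((n, n)) / n      # centering matrix I - (1/n) 1 1^T

assert np.allclose(C, C.T) and np.allclose(C, C @ C)    # symmetric and idempotent
v = np.arange(n, dtype=float)
assert np.allclose(C @ v, v - v.mean())                 # subtracts the mean
assert np.isclose((C @ v).sum(), 0.0)                   # result has zero sum
```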


SVD and Matrix Calculus Exercises

10. SVD Relations

Let $A \in \mathbb{R}^{n \times p}$ with SVD $A = UDV^T$ where $U=(u_1, \dots, u_p)$, $D=\text{diag}(d_1, \dots, d_p)$, $V=(v_1, \dots, v_p)$.

Show that $Av_i = d_i u_i$ and $A^T u_i = d_i v_i$.

Proof:

$$ A = \sum_{j=1}^{p} d_j u_j v_j^T $$

$$ Av_i = \left( \sum_{j=1}^{p} d_j u_j v_j^T \right) v_i = \sum_{j=1}^{p} d_j u_j (v_j^T v_i) = d_i u_i $$

(Since $v_j^T v_i = 1$ if $i=j$, else $0$). Similarly for $A^T u_i = d_i v_i$.
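
A NumPy sketch of these relations using the thin SVD (the matrix dimensions are arbitrary; note that np.linalg.svd returns $V^T$, so its rows are the $v_i$):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(6, 4))
U, d, Vt = np.linalg.svd(A, full_matrices=False)   # thin SVD: A = U diag(d) Vt

for i in range(len(d)):
    assert np.allclose(A @ Vt[i], d[i] * U[:, i])       # A v_i = d_i u_i
    assert np.allclose(A.T @ U[:, i], d[i] * Vt[i])     # A^T u_i = d_i v_i
```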

11. SVD from E.D. (Handling Negative Eigenvalues)

Suppose $A \in \mathbb{R}^{(n+1) \times (n+1)}$ is symmetric with E.D. $A = U\Lambda U^T$, where $U = (u_1, \dots, u_n, u_{n+1})$ and $\Lambda = \text{diag}(d_1, \dots, d_n, -d_{n+1})$ with all $d_i \ge 0$ (i.e., the last eigenvalue is negative). Let $\tilde{V} = (u_1, \dots, u_n, -u_{n+1})$ and $D = \text{diag}(d_1, \dots, d_n, d_{n+1})$. Show that $A = UD\tilde{V}^T$ is a valid SVD.

Proof:

$$ UD\tilde{V}^T = \sum_{i=1}^{n} d_i u_i u_i^T + d_{n+1} u_{n+1} (-u_{n+1})^T = \sum_{i=1}^{n} d_i u_i u_i^T - d_{n+1} u_{n+1} u_{n+1}^T = U\Lambda U^T = A $$

This is a valid SVD because $U$ and $\tilde{V}$ are orthogonal ($\tilde{V}$ by the column-sign argument of Exercise 4) and the diagonal entries $d_i = |\lambda_i|$ of $D$ are non-negative.
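
A numerical sketch of this construction for a random symmetric (generally indefinite) matrix; every column of $U$ whose eigenvalue is negative gets its sign flipped in $\tilde{V}$:

```python
import numpy as np

rng = np.random.default_rng(4)
B = rng.normal(size=(4, 4))
A = (B + B.T) / 2                         # symmetric, typically with negative eigenvalues
lam, U = np.linalg.eigh(A)

D = np.diag(np.abs(lam))                  # singular values |lambda_i| >= 0
signs = np.where(lam < 0, -1.0, 1.0)
V_tilde = U * signs                       # flip columns where lambda_i < 0

assert np.allclose(V_tilde.T @ V_tilde, np.eye(4))      # V_tilde is orthogonal
assert np.allclose(U @ D @ V_tilde.T, A)                # A = U D V_tilde^T
```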

12. Frobenius Norm and Singular Values

For $A \in \mathbb{R}^{n \times p}$, show that $||A||_F^2 = \text{tr}(A^T A) = \sum_{i=1}^{\min(n,p)} d_i^2(A)$.

Proof: First, $||A||_F^2 = \sum_{i,j} A_{ij}^2 = \sum_{j} (A^T A)_{jj} = \text{tr}(A^T A)$. Then, writing $A = UDV^T$:

$$ \text{tr}(A^T A) = \text{tr}(V D U^T U D V^T) = \text{tr}(V D^2 V^T) = \text{tr}(D^2 V^T V) = \text{tr}(D^2) = \sum d_i^2 $$
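
A quick NumPy check that the three expressions agree (the matrix is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.normal(size=(5, 3))
d = np.linalg.svd(A, compute_uv=False)    # singular values only

fro_sq = np.sum(A**2)                     # ||A||_F^2
assert np.isclose(fro_sq, np.trace(A.T @ A))
assert np.isclose(fro_sq, np.sum(d**2))
```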

13. Gradient of Quadratic Form

If $f(x) = x^T S x$ for a symmetric $S \in \mathbb{R}^{n \times n}$, then $\nabla_x f(x) = 2Sx$.

Derivation:

$$ x^T S x = \sum_{i,j} S_{ij} x_i x_j $$

$$ \frac{\partial}{\partial x_k} (x^T S x) = \sum_{j} S_{kj} x_j + \sum_{i} S_{ik} x_i = 2 \sum_{j} S_{kj} x_j \quad (\text{since } S_{ij} = S_{ji}) $$

Thus, the gradient vector is $2Sx$.
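
A finite-difference sanity check of the gradient formula (a minimal sketch; the matrix, point, and step size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(6)
B = rng.normal(size=(4, 4))
S = (B + B.T) / 2                         # symmetric S
x = rng.normal(size=4)

grad = 2 * S @ x                          # analytic gradient of x^T S x

eps = 1e-6
num = np.zeros(4)
for k in range(4):
    e = np.zeros(4); e[k] = eps
    num[k] = ((x + e) @ S @ (x + e) - (x - e) @ S @ (x - e)) / (2 * eps)

assert np.allclose(grad, num, atol=1e-5)  # central differences match 2Sx
```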

14. Gradient of Trace

If $A, B \in \mathbb{R}^{n \times p}$ and $f(A) = \text{tr}(AB^T)$, then $\nabla_A f(A) = B$.

Derivation:

$$ \text{tr}(AB^T) = \sum_{i,j} A_{ij} B_{ij} \implies \frac{\partial f(A)}{\partial A_{ij}} = B_{ij} $$

Statistical Estimates & MVN Exercises

1. Correlation Matrix

Let $X$ be a $p$-dimensional random vector with $\Sigma = \text{var}(X)$. Let $D_{\Sigma} = \text{diag}(\Sigma_{11}, \dots, \Sigma_{pp})$. Then the correlation matrix is:

$$ R = D_{\Sigma}^{-1/2} \Sigma D_{\Sigma}^{-1/2} $$

Element-wise: $(D_{\Sigma}^{-1/2} \Sigma D_{\Sigma}^{-1/2})_{ij} = \frac{\Sigma_{ij}}{\sqrt{\Sigma_{ii}\Sigma_{jj}}}$.
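
A NumPy sketch of the element-wise relation; the normalization of the sample covariance ($1/n$ vs $1/(n-1)$) cancels, so the result matches np.corrcoef:

```python
import numpy as np

rng = np.random.default_rng(7)
X = rng.normal(size=(500, 3)) @ rng.normal(size=(3, 3))   # correlated columns
Sigma = np.cov(X, rowvar=False)                           # sample covariance
D_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(Sigma)))       # D_Sigma^{-1/2}
R = D_inv_sqrt @ Sigma @ D_inv_sqrt

assert np.allclose(R, np.corrcoef(X, rowvar=False))
```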

2. Expectation Identity

Show that $\mathbb{E}(xx^T) - \mu\mu^T = \mathbb{E}((x-\mu)(x-\mu)^T)$.

Proof:

$$ \mathbb{E}((x-\mu)(x-\mu)^T) = \mathbb{E}(xx^T - x\mu^T - \mu x^T + \mu\mu^T) $$

$$ = \mathbb{E}(xx^T) - \mathbb{E}(x)\mu^T - \mu\mathbb{E}(x)^T + \mu\mu^T $$

$$ = \mathbb{E}(xx^T) - \mu\mu^T - \mu\mu^T + \mu\mu^T = \mathbb{E}(xx^T) - \mu\mu^T $$

3. Expectation of Random Matrix Product

If $S$ is a random matrix, and $A, B$ are constant matrices:

$$ \mathbb{E}(ASB) = A \cdot \mathbb{E}(S) \cdot B $$

This follows from linearity of expectation applied entrywise: $(\mathbb{E}(ASB))_{ij} = \mathbb{E}\left( \sum_{k,l} A_{ik} S_{kl} B_{lj} \right) = \sum_{k,l} A_{ik}\, \mathbb{E}(S_{kl})\, B_{lj} = (A\, \mathbb{E}(S)\, B)_{ij}$.

4. Covariance Properties

For random vectors $x, y$ and matrices $A, B$:

  • $Cov(x, y) = \mathbb{E}(xy^T) - \mathbb{E}(x)\mathbb{E}(y)^T$
  • $Cov(x, y) = Cov(y, x)^T$
  • $Cov(Ax, By) = A Cov(x, y) B^T$

5. Sample Variance and Centering

If $v \in \mathbb{R}^n$ and $C$ is the centering matrix:

$$ S_v^2 = \frac{1}{n} ||Cv||^2 $$

Proof:

$$ Cv = \begin{pmatrix} v_1 - \bar{v} \\ \vdots \\ v_n - \bar{v} \end{pmatrix} \implies \frac{1}{n}||Cv||^2 = \frac{1}{n} \sum (v_i - \bar{v})^2 = S_v^2 $$

6. Quadratic Form and Sample Variance

Let $S$ be the sample covariance matrix of $X$. For $v \in \mathbb{R}^p$, $v^T S v$ is the sample variance of $Xv$.

$$ v^T S v = v^T \frac{X^T C X}{n} v = \frac{(Xv)^T C (Xv)}{n} = \frac{(Xv)^T C^T C (Xv)}{n} = \frac{||C(Xv)||^2}{n} = S_{Xv}^2 $$

(using $C = C^2 = C^T C$, since the centering matrix is symmetric and idempotent).

Since $S_{Xv}^2 \ge 0$, $S$ is Positive Semidefinite (PSD).
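
A numerical check that $v^T S v$ equals the ($1/n$-normalized) sample variance of $Xv$ (the data and $v$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(8)
n, p = 200, 3
X = rng.normal(size=(n, p))
v = rng.normal(size=p)

C = np.eye(n) - np.ones((n, n)) / n
S = X.T @ C @ X / n                          # sample covariance (1/n convention)

lhs = v @ S @ v
rhs = np.mean((X @ v - (X @ v).mean())**2)   # sample variance of Xv (1/n convention)
assert np.isclose(lhs, rhs)
```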

7. Regression Coefficients

For centered $X$ and $y$, the regression coefficient $\beta = S_{xx}^{-1}S_{xy}$.

$$ S_{xx} = \frac{X^T X}{n}, \quad S_{xy} = \frac{X^T y}{n} $$

$$ \beta = (X^T X)^{-1} X^T y $$
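
A short NumPy sketch checking that $S_{xx}^{-1}S_{xy}$ and $(X^TX)^{-1}X^Ty$ agree on centered data (the simulated coefficients and noise level are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(9)
n = 100
X = rng.normal(size=(n, 2))
y = X @ np.array([1.5, -0.7]) + rng.normal(scale=0.1, size=n)

Xc = X - X.mean(axis=0)                      # centered predictors
yc = y - y.mean()                            # centered response

S_xx = Xc.T @ Xc / n
S_xy = Xc.T @ yc / n
beta = np.linalg.solve(S_xx, S_xy)           # S_xx^{-1} S_xy

beta_ols = np.linalg.solve(Xc.T @ Xc, Xc.T @ yc)   # (X^T X)^{-1} X^T y
assert np.allclose(beta, beta_ols)
```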

Standard MVN Density

For $X \sim N_p(0, I_p)$, show the density is $f(x) = \frac{1}{(2\pi)^{p/2}} e^{-\frac{1}{2}||x||^2}$.

Substituting $\mu = 0$ and $\Sigma = I_p$ into the general density $f(x) = \frac{1}{(2\pi)^{p/2} \det(\Sigma)^{1/2}} e^{-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)}$:

  • $\det(I_p) = 1$, so the normalizing constant reduces to $\frac{1}{(2\pi)^{p/2}}$
  • $(x-\mu)^T \Sigma^{-1} (x-\mu) = x^T I_p x = ||x||^2$

Independence of Uncorrelated MVN

Let $z = \begin{pmatrix} X \\ Y \end{pmatrix} \sim N_{p+q} \left( \begin{pmatrix} \mu_X \\ \mu_Y \end{pmatrix}, \begin{pmatrix} \Sigma_X & 0 \\ 0 & \Sigma_Y \end{pmatrix} \right)$. Show $X$ and $Y$ are independent. The joint density factors:

$$ f_Z(x, y) = f_X(x) f_Y(y) $$

This is because $\Sigma^{-1}$ is block diagonal, making the quadratic form additive in the exponent, and the determinant multiplicative.
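
A numerical illustration of the factorization, assuming SciPy is available; the block means and covariance blocks below are arbitrary choices for the check:

```python
import numpy as np
from scipy.stats import multivariate_normal

mu_x, mu_y = np.array([0.0, 1.0]), np.array([2.0])
Sig_x = np.array([[2.0, 0.5], [0.5, 1.0]])
Sig_y = np.array([[3.0]])

mu = np.concatenate([mu_x, mu_y])
Sigma = np.block([[Sig_x, np.zeros((2, 1))],
                  [np.zeros((1, 2)), Sig_y]])       # block-diagonal covariance

z = np.array([0.3, -0.4, 1.7])                      # arbitrary evaluation point
joint = multivariate_normal(mu, Sigma).pdf(z)
product = (multivariate_normal(mu_x, Sig_x).pdf(z[:2])
           * multivariate_normal(mu_y, Sig_y).pdf(z[2:]))
assert np.isclose(joint, product)                   # f_Z(x, y) = f_X(x) f_Y(y)
```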


LEC4 EXERCISE (Numerical MVN)

Suppose $X = (X_1, X_2)^T \sim N_2(\mu, \Sigma)$ where:

  • $\mu = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$
  • Eigenvalues: $\lambda_1 = 1, \lambda_2 = 2$
  • Eigenvectors: $u_1 = \begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix}, u_2 = \begin{pmatrix} -1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix}$

Calculate $\Sigma$:

$$ \Sigma = U \Lambda U^T = \begin{pmatrix} 1.5 & -0.5 \\ -0.5 & 1.5 \end{pmatrix} $$
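
A short NumPy check of the computation:

```python
import numpy as np

U = np.array([[1.0, -1.0],
              [1.0,  1.0]]) / np.sqrt(2)    # columns u1, u2
Lam = np.diag([1.0, 2.0])

Sigma = U @ Lam @ U.T
print(Sigma)                                # [[ 1.5 -0.5]
                                            #  [-0.5  1.5]]
assert np.allclose(np.linalg.eigvalsh(Sigma), [1.0, 2.0])
```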

1. Distributions of $X_1$ and $X_2$

Marginal distributions:

  • $X_1 \sim N(1, 1.5)$
  • $X_2 \sim N(2, 1.5)$

2. Conditional Distributions

Formula: $X_1 | X_2 = x_2 \sim N\left( \mu_1 + \frac{\sigma_{12}}{\sigma_{22}}(x_2 - \mu_2), \sigma_{11} - \frac{\sigma_{12}^2}{\sigma_{22}} \right)$.

  • $\frac{\sigma_{12}}{\sigma_{22}} = \frac{-0.5}{1.5} = -\frac{1}{3}$
  • Conditional Variance: $1.5 - \frac{(-0.5)^2}{1.5} = 1.5 - \frac{0.25}{1.5} = \frac{2.25 - 0.25}{1.5} = \frac{2}{1.5} = \frac{4}{3}$

Results:

  • $X_1 | X_2 = 1 \sim N(1 - \frac{1}{3}(1-2), \frac{4}{3}) = N(\frac{4}{3}, \frac{4}{3})$
  • $X_2 | X_1 = 1 \sim N(2 - \frac{1}{3}(1-1), \frac{4}{3}) = N(2, \frac{4}{3})$

3 & 4. Contour Plots and Confidence Intervals (90%)

Using the $\chi^2(2)$ table, the $90\%$ percentile is approximately $4.61$. The $90\%$ contour is an ellipse centered at $\mu$ whose half-axes lie along the eigenvectors $u_i$ and have lengths $\sqrt{4.61\,\lambda_i}$ (computed numerically in the sketch below):

  • Axis along $u_1$: half-length $\sqrt{1 \cdot 4.61} \approx 2.15$
  • Axis along $u_2$: half-length $\sqrt{2 \cdot 4.61} \approx 3.04$
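
A small sketch of the axis computation, assuming SciPy's chi2 for the percentile:

```python
import numpy as np
from scipy.stats import chi2

c = chi2.ppf(0.90, df=2)        # ~4.61
print(np.sqrt(1.0 * c))         # half-axis along u1, ~2.15
print(np.sqrt(2.0 * c))         # half-axis along u2, ~3.04
```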

5. General Conditional Distribution

$$ X_1 | X_2 = x \sim N\left( 1 - \frac{1}{3}(x - 2), \frac{4}{3} \right) $$

6. Plot of Conditional Mean

$$ g(x) = \mathbb{E}(X_1 | X_2 = x) = 1 - \frac{1}{3}x + \frac{2}{3} = \frac{5}{3} - \frac{1}{3}x $$

(Or in terms of $x_2$: $x_1 = -\frac{1}{3}x_2 + \frac{5}{3} \iff x_2 = -3x_1 + 5$)

7. 95% Confidence Region for $X_1 | X_2 = x$

Using the two-standard-deviation rule (approximately 95% coverage for a normal distribution), with conditional standard deviation $\sqrt{4/3} \approx 1.15$:

$$ \text{Upper } u(x) = \left( 1 - \frac{1}{3}(x-2) \right) + 2\sqrt{\frac{4}{3}} $$

$$ \text{Lower } l(x) = \left( 1 - \frac{1}{3}(x-2) \right) - 2\sqrt{\frac{4}{3}} $$
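
A minimal numeric sketch of the band; the conditioning value $x = 1$ is an arbitrary example:

```python
import numpy as np

def cond_mean(x):                       # E(X1 | X2 = x) = 1 - (x - 2)/3
    return 1.0 - (x - 2.0) / 3.0

sd = np.sqrt(4.0 / 3.0)                 # conditional SD, ~1.15

x = 1.0
l, u = cond_mean(x) - 2 * sd, cond_mean(x) + 2 * sd
print(l, u)                             # band centered at cond_mean(1) = 4/3
```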