Rank of a matrix

We start by defining the column (or row) rank of a matrix as the maximum number of linearly independent columns (or rows) in the matrix. It is not obvious to me when I learned linear algebra that row rank and column rank are equal. Thanks to this equality, we can simply speaks of the rank of a matrix.   

The proof of row rank = column rank:  Suppose the column rank of a matrix $\mathbf{A}\in \mathbb{R}^{m\times n}$ is $r$, we can make a linear expansion $\mathbf{A} = \mathbf{X}\mathbf{Y}$ where the $r$ columns of $\mathbf{X}\in \mathbb{R}^{m\times r}$ are linearly independent and $\mathbf{Y}\in \mathbb{R}^{r\times n}$ contains the expansion coefficients: $\mathbf{A}[:, j]=\sum_{k=1}^r \mathbf{X}[:, k]\,\mathbf{Y}[k, j]$. $\mathbf{A} = \mathbf{X}\mathbf{Y}$ can also be interpreted as a linear expansion of rows: $\mathbf{A}[i, :] = \sum_{k=1}^r \mathbf{X}[i, k]\mathbf{Y}[k, :]$. As a result, $\text{row rank}(\mathbf{A})\leq r = \text{column rank}(\mathbf{A})$. By applying the same argument to $\mathbf{A}^T$, we have $\text{column rank}(\mathbf{A})\leq\text{row rank}(\mathbf{A})$. Finally, $\text{row rank}(\mathbf{A})=\text{column rank}(\mathbf{A})\equiv\text{rank}(\mathbf{A})$, or $\text{rank}(\mathbf{A})=\text{rank}(\mathbf{A}^T)$.

Simply by the definition of rank, we have the following properties: 
  • It is obvious that $\text{rank}(\mathbf{A})\leq \min(m, n)$.
  • For overdetermined linear equations $\mathbf{A}\mathbf{x}=0$, $\text{rank}(\mathbf{A})=n$ iff $\mathbf{x}=0$ is the only solution.
  • Only zero matrix has zero rank.
  • $\text{rank}(\mathbf{A}+\mathbf{B})\leq \text{rank}(\mathbf{A})+\text{rank}(\mathbf{B})$, since the columns of $\mathbf{A}+\mathbf{B}$ are linear combinations of columns of $\mathbf{A}$ and $\mathbf{B}$.
  • $\text{rank}\left[\begin{array}{cc} \mathbf{A} & 0 \\ 0 & \mathbf{B}\end{array}\right] = \text{rank}\left[\begin{array}{cc} 0 & \mathbf{A} \\ \mathbf{B} & 0\end{array}\right]=\text{rank}(\mathbf{A})+\text{rank}(\mathbf{B})$.
  • $\text{rank}\left[\begin{array}{cc} \mathbf{A} & \mathbf{C} \\ 0 & \mathbf{B}\end{array}\right] \geq\text{rank}(\mathbf{A})+\text{rank}(\mathbf{B})$.
More properties of matrix rank:
  • $\text{rank}(\mathbf{A}\mathbf{B})\leq \min\left(\text{rank}(\mathbf{A}), \text{rank}(\mathbf{B})\right)$, following the above proof of row rank=column rank.
  • $\mathbf{A}\in\mathbb{R}^{n\times n}$ is invertible iff $\text{rank}(\mathbf{A})=n$:
    • $(\Rightarrow)$ Multiplying $\mathbf{A}^{-1}$ on both sides of $\mathbf{A}\mathbf{x}=0$ gives $\mathbf{x}=0$.
    • $(\Leftarrow)$ Elementary row operations can reduce $\left[\mathbf{A}\,|\, \mathbf{I}_n\right]$ to $\left[\mathbf{I}_n\,|\,\mathbf{A}^{-1}\right]$.
  • For $\mathbf{A}\in\mathbb{R}^{m\times n}$ with $m\geq n$, $\text{rank}(\mathbf{A})=n$ iff $\text{rank}(\mathbf{A}^T\mathbf{A})=n$. As a result, $\mathbf{A}^T\mathbf{A}$ is invertible, which is used in deriving the normal equation of linear regression.
    • $(\Rightarrow)$ Suppose $\mathbf{A}^T\mathbf{A}\mathbf{x}=0$, $\mathbf{x}^T\mathbf{A}^T\mathbf{A}\mathbf{x}=||\mathbf{A}\mathbf{x}||^2=0$. As a result, $\mathbf{A}\mathbf{x}=0$. $\mathbf{x}=0$ because of $\text{rank}(\mathbf{A})=n$.
    • $(\Leftarrow)$ Suppose $\mathbf{A}\mathbf{x}=0$, $\mathbf{A}^T\mathbf{A}\mathbf{x}=0$. $\mathbf{x}=0$ because of $\text{rank}(\mathbf{A}^T\mathbf{A})=n$.
  • $\mathbf{A}\in\mathbb{R}^{m\times n}$ is left invertible iff $\text{rank}(\mathbf{A})=n$. This is because $\left(\mathbf{A}^T\mathbf{A}\right)^{-1}\mathbf{A}^T\cdot\mathbf{A}=\mathbf{I}_n$. 
  • Similarly, $\mathbf{A}\in\mathbb{R}^{m\times n}$ is right invertible iff $\text{rank}(\mathbf{A})=m$. This is because $\mathbf{A}\cdot\mathbf{A}^T\left(\mathbf{A}\mathbf{A}^T\right)^{-1}=\mathbf{I}_m$.
  • For $\mathbf{A}\in\mathbb{R}^{m\times n}$ and $\mathbf{B}\in\mathbb{R}^{n\times k}$, if $\text{rank}(\mathbf{B})=n$, $\text{rank}(\mathbf{A}\mathbf{B})=\text{rank}(\mathbf{A})$. This is because $\text{rank}(\mathbf{A}\mathbf{B})\leq\text{rank}(\mathbf{A})$ and $\text{rank}(\mathbf{A})=\text{rank}(\mathbf{A}\mathbf{B}\mathbf{B}^{-1}_{\text{right}})\leq\text{rank}(\mathbf{A}\mathbf{B})$.
  • Similarly, for $\mathbf{A}\in\mathbb{R}^{m\times n}$ and $\mathbf{B}\in\mathbb{R}^{k\times m}$, if $\text{rank}(\mathbf{B})=m$, $\text{rank}(\mathbf{B}\mathbf{A})=\text{rank}(\mathbf{A})$.
  • Sylvester inequality: for $\mathbf{A}\in\mathbb{R}^{m\times n}$ and $\mathbf{B}\in \mathbb{R}^{n\times k}$, $\text{rank}(\mathbf{A}\mathbf{B})\geq\text{rank}(\mathbf{A})+\text{rank}(\mathbf{B})-n$. The proof is as follows: Note that\begin{equation}\left[\begin{array}{cc} \mathbf{I}_n & \mathbf{B} \\ \mathbf{A} & 0\end{array}\right]=\left[\begin{array}{cc} \mathbf{I}_n & 0 \\ \mathbf{A} & \mathbf{I}_m\end{array}\right] \left[\begin{array}{cc} \mathbf{I}_n & 0 \\ 0 & \mathbf{AB}\end{array}\right] \left[\begin{array}{cc} \mathbf{I}_n & \mathbf{B} \\ 0 & -\mathbf{I}_k\end{array}\right]\,,\end{equation} we have $\text{rank}\left[\begin{array}{cc} \mathbf{I}_n & \mathbf{B} \\ \mathbf{A} & 0\end{array}\right]=\text{rank}\left[\begin{array}{cc} \mathbf{I}_n & 0 \\ 0 & \mathbf{AB}\end{array}\right]$. $\text{L.H.S}\geq\text{rank}(\mathbf{A})+\text{rank}(\mathbf{B})$ while $\text{R.H.S}=n+\text{rank}(\mathbf{AB})$.
  • For $\mathbf{A}\in\mathbb{R}^{m\times n}$,  $\text{rank}(\mathbf{A})=r$ iff there exists two invertible matrices $\mathbf{X}\in \mathbb{R}^{m\times m}$ and $\mathbf{Y}\in \mathbb{R}^{n\times n}$ such that \begin{equation}\mathbf{X}\mathbf{A}\mathbf{Y}=\left[\begin{array}{cc} \mathbf{I}_r & 0 \\ 0 & 0\end{array}\right]\,.\end{equation}
Finally, $\mathbf{A}\in\mathbb{R}^{m\times n}$ defines a linear map $\mathbf{y}=\mathbf{A}\mathbf{x}$. The null space is defined as $\text{ker}\mathbf{A}=\{\mathbf{x} | \mathbf{A}\mathbf{x}=0\}$ and the range space is defined as $\text{im}\mathbf{A}=\{\mathbf{y} | \mathbf{y}=\mathbf{A}\mathbf{x} \text{ for some }\mathbf{x} \}$. By the definition of rank, we have $\dim(\text{im}\mathbf{A})=\text{rank}(\mathbf{A})$. More importantly, rank-nullity theorem states that $\dim(\text{ker}\mathbf{A}) + \text{rank}\left(\mathbf{A}\right)=n$.



Problems:
(1) If $\mathbf{A}$ is a $n\times n$ matrix satisfying $\mathbf{A}^2=0$, what is the rank of $\mathbf{A}$?

(2) If $\mathbf{A}$ is a $n\times n$ matrix that contains only 0 or 1 and satisfies $\mathbf{A}^2=0$, how many 1s in $\mathbf{A}$?

(3) $\mathbf{A}$ is a $n\times n$ matrix. If there is some integer $k$ such that $\mathbf{A}^k=0$ (nilpotent), what are the eigenvalues of $\mathbf{A}$?

Solution:
(1) Method 1: apply Sylvester inequality with $\mathbf{B}=\mathbf{A}$. 
Method2: $\mathbf{A}^2=0$ implies that $\text{im}\mathbf{A}\subseteq\text{ker}\mathbf{A}$, and thus $\dim(\text{im}\mathbf{A})\leq \dim(\text{ker}\mathbf{A})$. Since $\dim(\text{ker}\mathbf{A}) + \dim(\text{im}\mathbf{A})=n$, we have $\text{rank}(\mathbf{A}) =\dim(\text{im}\mathbf{A})\leq n/2$.

(2) to do.

(3) $\mathbf{A}\mathbf{x}=\lambda\mathbf{x}$, then $\mathbf{A}^k\mathbf{x}=\lambda^k\mathbf{x}$. As a result, $\lambda^k=0$ and thus all the eigenvalues are zero.
An example of nilpotent matrix: strictly upper triangular matrix (creation operator in quantum physics).




Comments

Popular posts from this blog

529 Plan

How to offset W2 tax

Retirement Accounts