Rank of a matrix
We start by defining the column (or row) rank of a matrix as the maximum number of linearly independent columns (or rows) in the matrix. It was not obvious to me, when I first learned linear algebra, that row rank and column rank are always equal. Thanks to this equality, we can simply speak of the rank of a matrix.
The proof of row rank = column rank: Suppose the column rank of a matrix \mathbf{A}\in \mathbb{R}^{m\times n} is r. Then we can write the linear expansion \mathbf{A} = \mathbf{X}\mathbf{Y}, where the r columns of \mathbf{X}\in \mathbb{R}^{m\times r} are linearly independent and \mathbf{Y}\in \mathbb{R}^{r\times n} contains the expansion coefficients: \mathbf{A}[:, j]=\sum_{k=1}^r \mathbf{X}[:, k]\,\mathbf{Y}[k, j]. The factorization \mathbf{A} = \mathbf{X}\mathbf{Y} can also be interpreted as a linear expansion of rows: \mathbf{A}[i, :] = \sum_{k=1}^r \mathbf{X}[i, k]\mathbf{Y}[k, :], so every row of \mathbf{A} lies in the span of the r rows of \mathbf{Y}. As a result, \text{row rank}(\mathbf{A})\leq r = \text{column rank}(\mathbf{A}). Applying the same argument to \mathbf{A}^T gives \text{column rank}(\mathbf{A})\leq\text{row rank}(\mathbf{A}). Finally, \text{row rank}(\mathbf{A})=\text{column rank}(\mathbf{A})\equiv\text{rank}(\mathbf{A}), or equivalently \text{rank}(\mathbf{A})=\text{rank}(\mathbf{A}^T).
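The equality can be checked numerically. A minimal NumPy sketch (the matrix sizes and seed are my own choices), building \mathbf{A}=\mathbf{X}\mathbf{Y} exactly as in the proof above:

```python
import numpy as np

# Build A = X Y as in the proof: r = 2 independent columns in X,
# expansion coefficients in Y, so A has rank 2 by construction.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 2))
Y = rng.standard_normal((2, 6))
A = X @ Y

# numpy computes the rank from the SVD; row rank and column rank agree
assert np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A.T) == 2
```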
Simply by the definition of rank, we have the following properties:
- It is obvious that \text{rank}(\mathbf{A})\leq \min(m, n).
- For the homogeneous linear system \mathbf{A}\mathbf{x}=0 with \mathbf{A}\in\mathbb{R}^{m\times n}, \text{rank}(\mathbf{A})=n iff \mathbf{x}=0 is the only solution.
- Only the zero matrix has rank zero.
- \text{rank}(\mathbf{A}+\mathbf{B})\leq \text{rank}(\mathbf{A})+\text{rank}(\mathbf{B}), since the columns of \mathbf{A}+\mathbf{B} are linear combinations of columns of \mathbf{A} and \mathbf{B}.
- \text{rank}\left[\begin{array}{cc} \mathbf{A} & 0 \\ 0 & \mathbf{B}\end{array}\right] = \text{rank}\left[\begin{array}{cc} 0 & \mathbf{A} \\ \mathbf{B} & 0\end{array}\right]=\text{rank}(\mathbf{A})+\text{rank}(\mathbf{B}).
- \text{rank}\left[\begin{array}{cc} \mathbf{A} & \mathbf{C} \\ 0 & \mathbf{B}\end{array}\right] \geq\text{rank}(\mathbf{A})+\text{rank}(\mathbf{B}).
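These definitional properties are easy to spot-check numerically; a small sketch (random matrices and sizes are my own choices):

```python
import numpy as np

rank = np.linalg.matrix_rank
rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((3, 4))
Z = np.zeros((3, 4))

# subadditivity: rank(A + B) <= rank(A) + rank(B)
assert rank(A + B) <= rank(A) + rank(B)

# block-diagonal rank: rank([[A, 0], [0, B]]) = rank(A) + rank(B)
M = np.block([[A, Z], [Z, B]])
assert rank(M) == rank(A) + rank(B)

# block upper-triangular: rank([[A, C], [0, B]]) >= rank(A) + rank(B)
C = rng.standard_normal((3, 4))
T = np.block([[A, C], [Z, B]])
assert rank(T) >= rank(A) + rank(B)
```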
More properties of matrix rank:
- \text{rank}(\mathbf{A}\mathbf{B})\leq \min\left(\text{rank}(\mathbf{A}), \text{rank}(\mathbf{B})\right), following the above proof of row rank=column rank.
- \mathbf{A}\in\mathbb{R}^{n\times n} is invertible iff \text{rank}(\mathbf{A})=n:
- (\Rightarrow) Multiplying both sides of \mathbf{A}\mathbf{x}=0 by \mathbf{A}^{-1} gives \mathbf{x}=0, so \text{rank}(\mathbf{A})=n by the property above.
- (\Leftarrow) Elementary row operations can reduce \left[\mathbf{A}\,|\, \mathbf{I}_n\right] to \left[\mathbf{I}_n\,|\,\mathbf{A}^{-1}\right].
- For \mathbf{A}\in\mathbb{R}^{m\times n} with m\geq n, \text{rank}(\mathbf{A})=n iff \text{rank}(\mathbf{A}^T\mathbf{A})=n. As a result, when \mathbf{A} has full column rank, \mathbf{A}^T\mathbf{A} is invertible, which is used in deriving the normal equation of linear regression.
- (\Rightarrow) Suppose \mathbf{A}^T\mathbf{A}\mathbf{x}=0; then \mathbf{x}^T\mathbf{A}^T\mathbf{A}\mathbf{x}=||\mathbf{A}\mathbf{x}||^2=0, so \mathbf{A}\mathbf{x}=0, and \mathbf{x}=0 because \text{rank}(\mathbf{A})=n.
- (\Leftarrow) Suppose \mathbf{A}\mathbf{x}=0; then \mathbf{A}^T\mathbf{A}\mathbf{x}=0, and \mathbf{x}=0 because \text{rank}(\mathbf{A}^T\mathbf{A})=n.
- \mathbf{A}\in\mathbb{R}^{m\times n} is left invertible iff \text{rank}(\mathbf{A})=n. This is because \left(\mathbf{A}^T\mathbf{A}\right)^{-1}\mathbf{A}^T\cdot\mathbf{A}=\mathbf{I}_n.
- Similarly, \mathbf{A}\in\mathbb{R}^{m\times n} is right invertible iff \text{rank}(\mathbf{A})=m. This is because \mathbf{A}\cdot\mathbf{A}^T\left(\mathbf{A}\mathbf{A}^T\right)^{-1}=\mathbf{I}_m.
- For \mathbf{A}\in\mathbb{R}^{m\times n} and \mathbf{B}\in\mathbb{R}^{n\times k}, if \text{rank}(\mathbf{B})=n, \text{rank}(\mathbf{A}\mathbf{B})=\text{rank}(\mathbf{A}). This is because \text{rank}(\mathbf{A}\mathbf{B})\leq\text{rank}(\mathbf{A}) and \text{rank}(\mathbf{A})=\text{rank}(\mathbf{A}\mathbf{B}\mathbf{B}^{-1}_{\text{right}})\leq\text{rank}(\mathbf{A}\mathbf{B}).
- Similarly, for \mathbf{A}\in\mathbb{R}^{m\times n} and \mathbf{B}\in\mathbb{R}^{k\times m}, if \text{rank}(\mathbf{B})=m, \text{rank}(\mathbf{B}\mathbf{A})=\text{rank}(\mathbf{A}).
- Sylvester inequality: for \mathbf{A}\in\mathbb{R}^{m\times n} and \mathbf{B}\in \mathbb{R}^{n\times k}, \text{rank}(\mathbf{A}\mathbf{B})\geq\text{rank}(\mathbf{A})+\text{rank}(\mathbf{B})-n. The proof is as follows: note that\begin{equation}\left[\begin{array}{cc} \mathbf{I}_n & \mathbf{B} \\ \mathbf{A} & 0\end{array}\right]=\left[\begin{array}{cc} \mathbf{I}_n & 0 \\ \mathbf{A} & \mathbf{I}_m\end{array}\right] \left[\begin{array}{cc} \mathbf{I}_n & 0 \\ 0 & \mathbf{AB}\end{array}\right] \left[\begin{array}{cc} \mathbf{I}_n & \mathbf{B} \\ 0 & -\mathbf{I}_k\end{array}\right]\,,\end{equation} and since multiplying by invertible matrices preserves rank, \text{rank}\left[\begin{array}{cc} \mathbf{I}_n & \mathbf{B} \\ \mathbf{A} & 0\end{array}\right]=\text{rank}\left[\begin{array}{cc} \mathbf{I}_n & 0 \\ 0 & \mathbf{AB}\end{array}\right]. The left-hand side is \geq\text{rank}(\mathbf{A})+\text{rank}(\mathbf{B}) by the block-matrix property above, while the right-hand side equals n+\text{rank}(\mathbf{AB}).
- For \mathbf{A}\in\mathbb{R}^{m\times n}, \text{rank}(\mathbf{A})=r iff there exist two invertible matrices \mathbf{X}\in \mathbb{R}^{m\times m} and \mathbf{Y}\in \mathbb{R}^{n\times n} such that \begin{equation}\mathbf{X}\mathbf{A}\mathbf{Y}=\left[\begin{array}{cc} \mathbf{I}_r & 0 \\ 0 & 0\end{array}\right]\,.\end{equation}
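The Sylvester inequality and the rank normal form can both be demonstrated numerically. A sketch, assuming we build the normal form from the SVD \mathbf{A}=\mathbf{U}\,\text{diag}(\sigma)\,\mathbf{V}^T (one convenient choice, not the only one): take \mathbf{X}=\mathbf{D}\mathbf{U}^T with \mathbf{D}=\text{diag}(1/\sigma_1,\ldots,1/\sigma_r,1,\ldots,1) and \mathbf{Y}=\mathbf{V}.

```python
import numpy as np

rank = np.linalg.matrix_rank
rng = np.random.default_rng(2)
m, n, r = 5, 4, 2
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # rank r

# Sylvester inequality: rank(AB) >= rank(A) + rank(B) - n
B = rng.standard_normal((n, 3))
assert rank(A @ B) >= rank(A) + rank(B) - n

# Rank normal form X A Y = [[I_r, 0], [0, 0]] built from the SVD:
# X = D U^T (invertible, product of invertible matrices), Y = V (orthogonal).
U, s, Vt = np.linalg.svd(A)
d = np.ones(m)
d[:r] = 1.0 / s[:r]
X = np.diag(d) @ U.T
Y = Vt.T
target = np.zeros((m, n))
target[:r, :r] = np.eye(r)
assert np.allclose(X @ A @ Y, target)
```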
Problems:
(1) If \mathbf{A} is an n\times n matrix satisfying \mathbf{A}^2=0, what can we say about the rank of \mathbf{A}?
(2) If \mathbf{A} is an n\times n matrix that contains only 0s and 1s and satisfies \mathbf{A}^2=0, how many 1s can \mathbf{A} contain?
(3) \mathbf{A} is an n\times n matrix. If there is some integer k such that \mathbf{A}^k=0 (i.e. \mathbf{A} is nilpotent), what are the eigenvalues of \mathbf{A}?
Solution:
(1) Method 1: apply the Sylvester inequality with \mathbf{B}=\mathbf{A}: 0=\text{rank}(\mathbf{A}^2)\geq 2\,\text{rank}(\mathbf{A})-n, so \text{rank}(\mathbf{A})\leq n/2.
Method 2: \mathbf{A}^2=0 implies that \text{im}\,\mathbf{A}\subseteq\text{ker}\,\mathbf{A}, and thus \dim(\text{im}\,\mathbf{A})\leq \dim(\text{ker}\,\mathbf{A}). Since \dim(\text{ker}\,\mathbf{A}) + \dim(\text{im}\,\mathbf{A})=n by the rank-nullity theorem, we have \text{rank}(\mathbf{A}) =\dim(\text{im}\,\mathbf{A})\leq n/2.
(2) to do.
(3) If \mathbf{A}\mathbf{x}=\lambda\mathbf{x} with \mathbf{x}\neq 0, then \mathbf{A}^k\mathbf{x}=\lambda^k\mathbf{x}. As a result, \lambda^k=0 and thus all the eigenvalues are zero.
An example of a nilpotent matrix: a strictly upper triangular matrix (the creation operator in quantum physics).
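A quick numerical check of (1) and (3); the specific 4\times 4 matrices below are my own illustrative choices:

```python
import numpy as np

# Strictly upper triangular 4x4 matrix: nilpotent (A^4 = 0), all eigenvalues 0.
A = np.triu(np.ones((4, 4)), k=1)
assert np.allclose(np.linalg.matrix_power(A, 4), 0)
# Computed eigenvalues of a defective matrix carry O(eps^(1/4)) rounding
# error, so compare against zero with a loose tolerance.
assert np.max(np.abs(np.linalg.eigvals(A))) < 1e-2

# Problem (1): if B^2 = 0 then rank(B) <= n/2. Here B maps e2 -> e0 and
# e3 -> e1, so im(B) = span(e0, e1) is contained in ker(B).
B = np.zeros((4, 4))
B[0, 2] = B[1, 3] = 1.0
assert np.allclose(B @ B, 0)
assert np.linalg.matrix_rank(B) == 2  # 2 <= 4/2
```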