Principal components analysis
PCA is a common technique for reducing high-dimensional data to a lower-dimensional subspace, either for visualization or for data compression.
Suppose we have a dataset of $n$ samples, each with $p$ features. We represent the dataset by an $n\times p$ matrix $\mathbf{X}$.
We also assume that $\mathbf{X}$ is centered (each column has zero mean), so that the sample covariance matrix $\boldsymbol{\Sigma}$ in the feature space is:
\begin{eqnarray}
\boldsymbol{\Sigma} = \frac{1}{n}\mathbf{X}^T\mathbf{X}\,.\tag{1}\end{eqnarray}
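As a concrete illustration, here is a minimal NumPy sketch of (1); the data matrix `X` below is a hypothetical random example, not from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))   # hypothetical n x p data matrix

X = X - X.mean(axis=0)        # center each feature (column) to zero mean
Sigma = X.T @ X / n           # sample covariance matrix, as in (1)
```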
PCA finds a set of $k\leq p$ orthonormal feature vectors $\mathbf{v}_1, \cdots, \mathbf{v}_k$ (called principal component directions) such that the projected data have the $k$ largest sample variances
$\mathbf{v}^T_1 \boldsymbol{\Sigma} \mathbf{v}_1,\cdots, \mathbf{v}^T_k \boldsymbol{\Sigma} \mathbf{v}_k$. That is, $\mathbf{v}_1, \cdots, \mathbf{v}_k$ are the eigenvectors of $\boldsymbol{\Sigma}$ corresponding to its $k$ largest eigenvalues $\lambda_1\geq \cdots \geq \lambda_k$.
In sum, letting $\mathbf{V}$ be the $p\times k$ matrix $[\mathbf{v}_1, \cdots, \mathbf{v}_k]$ and $\boldsymbol{\lambda} = \text{diag}\left[\lambda_1, \cdots, \lambda_k\right]$,
we can solve for $\mathbf{V}$ via the eigenvalue problem \begin{eqnarray}
\boldsymbol{\Sigma}\,\mathbf{V} = \mathbf{V}\,\boldsymbol{\lambda}\,.\tag{2} \end{eqnarray}
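A minimal sketch of solving (2) with NumPy, continuing from the `Sigma` computed above; the choice of `k` here is arbitrary:

```python
k = 2                                      # number of principal components to keep

eigvals, eigvecs = np.linalg.eigh(Sigma)   # eigh returns eigenvalues in ascending order
order = np.argsort(eigvals)[::-1][:k]      # indices of the k largest eigenvalues
lam = eigvals[order]                       # lambda_1 >= ... >= lambda_k
V = eigvecs[:, order]                      # p x k matrix [v_1, ..., v_k]
```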
In practice, instead of computing (1) and (2) explicitly, one can obtain $\mathbf{V}$ directly from the SVD of $\mathbf{X}$,
\begin{eqnarray}
\mathbf{X} = \mathbf{U}\,\boldsymbol{\sigma}\,\mathbf{V}^T\,, \tag{3}\end{eqnarray} since $\boldsymbol{\Sigma} = \frac{1}{n}\mathbf{X}^T\mathbf{X} = \frac{1}{n}\mathbf{V}\,\boldsymbol{\sigma}^2\,\mathbf{V}^T$, so the right singular vectors of $\mathbf{X}$ are the eigenvectors of $\boldsymbol{\Sigma}$ with $\lambda_i = \sigma_i^2/n$. Keeping only the first $k$ columns of $\mathbf{V}$, the dataset $\mathbf{X}$ can then be projected onto the lower $k$-dimensional feature space as $\mathbf{X}\,\mathbf{V}$.
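A sketch of the SVD route and the projection, again continuing from the centered `X` and `k` above:

```python
U, s, Vt = np.linalg.svd(X, full_matrices=False)   # X = U diag(s) V^T, s in descending order
V_svd = Vt[:k].T                                   # first k right singular vectors, p x k
lam_svd = s[:k] ** 2 / n                           # eigenvalues of Sigma recovered as sigma_i^2 / n

Z = X @ V_svd                                      # project X onto the k-dimensional subspace
```

Note that the columns obtained from the eigendecomposition and from the SVD agree only up to sign, since each principal direction is defined up to a sign flip.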