Principal components analysis

Principal component analysis (PCA) is a common technique for reducing high-dimensional data to a lower-dimensional subspace, either for visualization or for data compression.
Suppose we have a dataset of n samples, each with p features, represented by an n\times p matrix \mathbf{X}. We also assume that \mathbf{X} is centered, so that the sample covariance matrix \boldsymbol{\Sigma} in the feature space is \begin{eqnarray} \boldsymbol{\Sigma} = \frac{1}{n}\mathbf{X}^T\mathbf{X}\,.\tag{1}\end{eqnarray} PCA finds a set of k\leq p orthonormal feature vectors \mathbf{v}_1, \cdots, \mathbf{v}_k (called principal component directions) whose projections have the k largest sample variances \mathbf{v}^T_1 \boldsymbol{\Sigma} \mathbf{v}_1,\cdots, \mathbf{v}^T_k \boldsymbol{\Sigma} \mathbf{v}_k. That is, \mathbf{v}_1, \cdots, \mathbf{v}_k are the eigenvectors of \boldsymbol{\Sigma} corresponding to its k largest eigenvalues \lambda_1\geq \cdots \geq \lambda_k. In sum, letting \mathbf{V} be the p\times k matrix [\mathbf{v}_1, \cdots, \mathbf{v}_k] and \boldsymbol{\lambda} = \text{diag}\left[\lambda_1, \cdots, \lambda_k\right], we can find \mathbf{V} by solving the eigenvalue problem \begin{eqnarray} \boldsymbol{\Sigma}\,\mathbf{V} = \mathbf{V}\,\boldsymbol{\lambda}\,.\tag{2} \end{eqnarray}
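
To make the eigenvalue route concrete, here is a minimal NumPy sketch of (1) and (2); the function name pca_eig and the array shapes are illustrative assumptions, not code from any particular library.

```python
# Minimal sketch of PCA via the covariance eigenvalue problem (2).
# Assumes X is a NumPy array of shape (n, p) and k <= p is the target dimension.
import numpy as np

def pca_eig(X, k):
    """Return the top-k principal component directions V (p x k) and their eigenvalues."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)                    # center the data
    Sigma = Xc.T @ Xc / n                      # sample covariance matrix, eq. (1)
    eigvals, eigvecs = np.linalg.eigh(Sigma)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]      # indices of the k largest eigenvalues
    return eigvecs[:, order], eigvals[order]
```
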
In practice, instead of computing (1) and (2) explicitly, one can obtain \mathbf{V} directly from the SVD of \mathbf{X}, \begin{eqnarray} \mathbf{X} = \mathbf{U}\,\boldsymbol{\sigma}\,\mathbf{V}^T\,, \tag{3}\end{eqnarray} since \mathbf{X}^T\mathbf{X} = \mathbf{V}\,\boldsymbol{\sigma}^2\,\mathbf{V}^T: the right singular vectors are the eigenvectors of \boldsymbol{\Sigma}, with eigenvalues \lambda_i = \sigma_i^2/n. Keeping the first k columns of \mathbf{V}, the dataset \mathbf{X} can then be projected onto the lower k-dimensional feature space as \mathbf{X}\,\mathbf{V}.
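
The SVD route can be sketched the same way; again this is an illustrative snippet (pca_svd is a hypothetical name), relying on the fact that the right singular vectors of the centered \mathbf{X} are the eigenvectors of \boldsymbol{\Sigma}.

```python
# Minimal sketch of PCA via the SVD (3): the right singular vectors of the
# centered X are the principal component directions, with lambda_i = sigma_i**2 / n.
import numpy as np

def pca_svd(X, k):
    """Return the top-k directions V (p x k) and the projected data X V (n x k)."""
    Xc = X - X.mean(axis=0)                              # center the data
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)    # singular values in descending order
    V = Vt[:k].T                                         # first k right singular vectors
    return V, Xc @ V                                     # directions and projected dataset
```

For large p, the SVD route also avoids forming the p\times p covariance matrix explicitly, which is generally the numerically preferred approach.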
