| Principal Component Analysis |
Article Index for Principal |
Website Links For Principal |
Information AboutPrincipal Component Analysis |
| CATEGORIES ABOUT PRINCIPAL COMPONENTS ANALYSIS | |
| multivariate statistics | |
| singular value decomposition | |
| data mining | |
| mathematical components | |
|
PCA is also called the (discrete) Karhunen-Loève transform (or KLT, named after Kari Karhunen and Michel Loève ) or the '''Hotelling transform''' (in honor of Harold Hotelling ). PCA has the distinction of being the optimal Linear Transformation for keeping the subspace that has largest variance. This advantage, however, comes at the price of greater computational requirement if compared, for example, to the Discrete Cosine Transform . Unlike other linear transforms, the PCA does not have a fixed set of Basis Vector s. Its basis vectors depend on the data set. Assuming zero Empirical Mean (the empirical mean of the distribution has been subtracted away from the data set), the principal component ''w''1 of a dataset ''x'' can be defined as: : (See Arg Max for the notation.) With the first components, the -th component can be found by subtracting the first principal components from ''x'': : and by substituting this as the new dataset to find a principal component in : The Karhunen-Loève transform is therefore equivalent to finding the Singular Value Decomposition of the data matrix ''X'', : and then obtaining the reduced-space data matrix Y by projecting '''X''' down into the reduced space defined by only the first ''L'' singular vectors, '''WL''': : The matrix W of singular vectors of '''X''' is equivalently also the matrix W of eigenvectors of the matrix of observed covariances '''C''' = '''X XT''', : By finding the Eigenvalue s and Eigenvector s of the covariance matrix, the eigenvectors with the largest eigenvalues correspond to the dimensions that have the strongest Correlation in the dataset (see Rayleigh Quotient ). PCA is equivalent to Empirical Orthogonal Functions (EOF). PCA is a popular technique in Pattern Recognition . However, PCA is not optimized for class separability. An alternative is the Linear Discriminant Analysis , which does take this into account. PCA optimally minimizes reconstruction error under the L2 Norm . TABLE OF SYMBOLS AND ABBREVIATIONS ALGORITHM #1: THE COVARIANCE METHOD Following is a detailed description of PCA using the covariance method. The goal is to transform a given data set X of dimension ''M'' to an alternative data set '''Y''' of smaller dimension ''L''. Equivalently, we are seeking to find the matrix '''Y''', where '''Y''' is the Karhunen-Loeve transform (KLT) of matrix X: : Organize the data set Suppose you have ''N'' data vectors each length ''M'', written as , and you want to project your data into aמ ''L'' dimensional subspace.
Calculate the empirical mean
:: Calculate the deviations from the mean
:: ::where h is a 1 x ''N'' row vector of all 1's: ::: Find the covariance matrix
::where ::: is the Expected Value operator, ::: is the Outer Product operator, and
Find the eigenvectors and eigenvalues of the covariance matrix
:: ::This step will typically require the use of a computer-based algorithm for computing eigenvectors and eigenvalues. These algorithms are readily available as sub-components of most Matrix Algebra systems, such as MatLab . See, for example, the eig function .
:: :is the ''m''th eigenvalue of the covariance matrix C, and ::
Rearrange the eigenvectors and eigenvalues
Compute the cumulative energy content for each eigenvector
:: Select a subset of the eigenvectors as basis vectors
:: :where ::
:: Convert the source data to z-scores
::
:: (divide element-by-element) Project the z-scores of the data onto the new basis
ALGORITHM #2: THE CORRELATION METHOD Editor's note: This section is currently undergoing a major revision. See page history for previous revisions. ''DERIVATION'' OF PCA USING THE COVARIANCE METHOD Let X be a ''d''-dimensional random vector expressed as column vector. Without loss of generality, assume X has zero empirical mean. We want to find a Orthonormal Projection Matrix P such that : with the constraint that : is a Diagonal Matrix and By substitution, and matrix algebra, we obtain: : We now have: : Rewrite P as d column vectors, so : and as: : Substituting into equation above, we obtain: : Notice that in , ''P''i is an Eigenvector of ''X′''s covariance matrix. Therefore, by finding the eigenvectors of ''X′''s covariance matrix, we find a projection matrix ''P'' that satisfies the original constraints. CORRESPONDENCE ANALYSIS Correspondence analysis is conceptually similar to PCA, but scales the data (which must be positive) so that rows and columns are treated equivalently. It is traditionally applied to contingency tables where Pearson's Chi-square Test has shown a relationship between rows and columns. SOFTWARE/SOURCE CODE REFERENCES
SEE ALSO
EXTERNAL LINKS
|
|
|