PCA - Easy to Understand



    PCA - Principal Component Analysis

    Dr. Saed Sayad, University of Toronto

    2010

    [email protected]

    http://chem-eng.utoronto.ca/~datamining/


    Basic Statistics


    Statistics - Example

     X    X - Xbar    (X - Xbar)^2
     0      -10            100
     8       -2              4
    12        2              4
    20       10            100

    SumX = 40, Count = 4, Average (Xbar) = 10
    Sxx = 208, Variance = Sxx / (Count - 1) = 69.33, SD = 8.33

     Y    Y - Ybar    (Y - Ybar)^2
     8       -2              4
     9       -1              1
    11        1              1
    12        2              4

    SumY = 40, Count = 4, Average (Ybar) = 10
    Syy = 10, Variance = Syy / (Count - 1) = 3.33, SD = 1.83

     X     Y    X - Xbar    Y - Ybar    (X - Xbar)(Y - Ybar)
     0     8      -10          -2                20
     8     9       -2          -1                 2
    12    11        2           1                 2
    20    12       10           2                20

    Count = 4
    Sxy = 44, Covariance = Sxy / (Count - 1) = 14.67
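
    As a cross-check, a minimal Python/NumPy sketch (not part of the original slides) that reproduces the numbers above, using the same sample (n - 1) denominator:

        import numpy as np

        x = np.array([0.0, 8.0, 12.0, 20.0])
        y = np.array([8.0, 9.0, 11.0, 12.0])

        # Sample variance and standard deviation (n - 1 denominator, as in the table)
        print(x.mean(), x.var(ddof=1), x.std(ddof=1))   # 10.0  69.33  8.33
        print(y.mean(), y.var(ddof=1), y.std(ddof=1))   # 10.0   3.33  1.83

        # Sample covariance between X and Y
        sxy = np.sum((x - x.mean()) * (y - y.mean()))   # 44.0
        print(sxy / (len(x) - 1))                       # 14.67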


    Covariance Matrix
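
    The slide's figure is not in the transcript. For the X and Y example above, the covariance matrix puts the variances on the diagonal and the covariance off the diagonal; a one-line NumPy check (a sketch, not from the slides):

        import numpy as np

        x = np.array([0.0, 8.0, 12.0, 20.0])
        y = np.array([8.0, 9.0, 11.0, 12.0])

        # [[Var(X), Cov(X,Y)], [Cov(X,Y), Var(Y)]]
        print(np.cov(x, y))   # [[69.33, 14.67], [14.67, 3.33]]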


    Matrix Algebra

    Example of one non-eigenvector and one eigenvector


    Matrix Algebra

    Example of how a scaled eigenvector is still an eigenvector
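
    The worked matrices for these two examples are not in the transcript. A minimal sketch, assuming the classic 2 x 2 example from the referenced tutorial (it matches the eigenvalue 4 quoted on the Eigenvalues slide); any square matrix with a known eigenvector would do:

        import numpy as np

        A = np.array([[2.0, 3.0],
                      [2.0, 1.0]])

        # [3, 2] is an eigenvector: A @ v equals exactly 4 times v
        v = np.array([3.0, 2.0])
        print(A @ v)          # [12.  8.] = 4 * [3, 2]

        # [1, 3] is not an eigenvector: A @ w is not a multiple of w
        w = np.array([1.0, 3.0])
        print(A @ w)          # [11.  5.]

        # A scaled eigenvector is still an eigenvector, with the same eigenvalue
        print(A @ (2 * v))    # [24. 16.] = 4 * [6, 4]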


    Eigenvectors Properties

    Eigenvectors can only be found for square matrices, and not every square matrix has eigenvectors.

    Given an n x n matrix that does have eigenvectors, there are n of them.

    Another property of eigenvectors is that even if we scale the vector by some amount before we multiply it, we still get the same multiple of it as a result. This is because scaling a vector only makes it longer; it does not change its direction.

    Lastly, all the eigenvectors of a symmetric matrix, such as a covariance matrix, are perpendicular to each other. This means that you can express the data in terms of these perpendicular eigenvectors, instead of expressing it in terms of the x and y axes.


    Standardized Eigenvectors

    We like to find the eigenvectors whose length is exactly one.

    This is because the length of a vector does not affect whether it is an eigenvector or not, whereas the direction does.

    So, in order to keep eigenvectors standard, whenever we find an eigenvector we usually scale it to have a length of 1, so that all eigenvectors have the same length.


    Standardized Eigenvectors

    (Figure: an eigenvector, its vector length, and the eigenvector rescaled to length one)
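
    The numeric example on this slide is not in the transcript; a minimal sketch of the normalization step, reusing the eigenvector [3, 2] from the earlier matrix-algebra example:

        import numpy as np

        v = np.array([3.0, 2.0])          # an eigenvector
        length = np.linalg.norm(v)        # sqrt(3^2 + 2^2) = sqrt(13)
        unit_v = v / length               # same direction, length exactly 1
        print(length, unit_v, np.linalg.norm(unit_v))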


    Eigenvalues

    The eigenvalue is the amount by which the original vector is scaled after multiplication by the square matrix.

    4 is the eigenvalue associated with that eigenvector in the example.

    No matter what multiple of the eigenvector we took before we multiplied it by the square matrix, we would always get 4 times the scaled vector.

    Eigenvectors and eigenvalues always come in pairs.


    PCA

    PCA is a way of identifying patterns in data, and expressing the data in such a way as to highlight their similarities and differences.

    Since patterns can be hard to find in data of high dimension, where the luxury of graphical representation is not available, PCA is a powerful tool for analysing data.

    The other main advantage of PCA is that once you have found these patterns in the data, you can compress the data, i.e. by reducing the number of dimensions, without much loss of information. This technique is used in image compression.


    PCA Original and Adjusted Data

    (Table: Original Data alongside Original Data - Average)
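
    The data table itself is not in the transcript. A minimal sketch of the adjustment step, using a small hypothetical two-dimensional data set (the later sketches continue from these names):

        import numpy as np

        # Hypothetical data: rows are observations, columns are the x and y dimensions
        data = np.array([[2.5, 2.4],
                         [0.5, 0.7],
                         [2.2, 2.9],
                         [1.9, 2.2],
                         [3.1, 3.0]])

        # Subtract the mean of each dimension so the adjusted data has zero mean
        data_adjust = data - data.mean(axis=0)
        print(data_adjust.mean(axis=0))   # approximately [0, 0]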


    PCA Original Data plot


    Calculate Covariance Matrix


    Calculate Eigenvectors and Eigenvalues from the Covariance Matrix
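
    The numbers for these two steps are not in the transcript. A minimal sketch, continuing the hypothetical data_adjust from the centering step above (NumPy's eigh is meant for symmetric matrices such as a covariance matrix and returns unit-length eigenvectors):

        # Covariance matrix of the adjusted data (columns are the variables)
        cov = np.cov(data_adjust, rowvar=False)

        # Eigenvalues and eigenvectors of the covariance matrix
        eigenvalues, eigenvectors = np.linalg.eigh(cov)
        print(cov)
        print(eigenvalues)    # in ascending order
        print(eigenvectors)   # unit-length eigenvectors in the columns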


    Eigenvectors Plot


    Choosing Components and Forming a Feature Vector

    The eigenvector with the highest eigenvalue is the principal component of the data set. It represents the most significant relationship between the data dimensions.

    In general, once eigenvectors are found from the covariance matrix, the next step is to order them by eigenvalue, highest to lowest. This gives the components in order of significance.

    Now, if we like, we can decide to ignore the components of lesser significance. We do lose some information, but if the eigenvalues are small, we do not lose much.

    If we leave out some components, the final data set will have fewer dimensions than the original.
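
    A minimal sketch of this selection step, continuing the hypothetical example above: order the eigenvectors by eigenvalue and keep only the strongest one(s) as the feature vector:

        # Order eigenvectors by eigenvalue, highest to lowest
        order = np.argsort(eigenvalues)[::-1]
        eigenvalues = eigenvalues[order]
        eigenvectors = eigenvectors[:, order]

        # Keep the top k components (k = 1 drops the less significant direction)
        k = 1
        feature_vector = eigenvectors[:, :k]   # chosen eigenvectors as columns
        print(feature_vector)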


    Feature Vector


    Deriving the new data set

    Row Feature Vector is the matrix with the eigenvectors in the columns, transposed so that the eigenvectors are now in the rows, with the most significant eigenvector at the top.

    Row Data Adjust is the mean-adjusted data transposed, i.e. the data items are in each column, with each row holding a separate dimension.

    Final Data has the final data items in columns and the dimensions along the rows. It is the original data expressed solely in terms of the chosen eigenvectors.

    Final Data = Row Feature Vector x Row Data Adjust
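
    A minimal sketch of the projection, continuing the hypothetical example; each variable name mirrors the slide's terms:

        # Eigenvectors in rows, most significant eigenvector first
        row_feature_vector = feature_vector.T

        # Mean-adjusted data transposed: one column per data item, one row per dimension
        row_data_adjust = data_adjust.T

        # Express the data in terms of the chosen eigenvectors
        final_data = row_feature_vector @ row_data_adjust
        print(final_data)   # k rows, one column per original data item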


    Deriving the new data set

    Two Eigenvectors


    Get the original data back

    Row Data Adjust = Row Feature Vector^T x Final Data

    Row Original Data = (Row Feature Vector^T x Final Data) + Original Mean

    (Because the standardized eigenvectors are perpendicular and of length one, the transpose of the Row Feature Vector acts as its inverse.)
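
    A minimal sketch of the reconstruction, continuing the hypothetical example; with k smaller than the number of dimensions the recovered data is an approximation, with all eigenvectors kept it is exact:

        # Undo the projection (the transpose acts as the inverse for unit-length,
        # perpendicular eigenvectors), then add the original mean back
        row_data_adjust_back = row_feature_vector.T @ final_data
        row_original_data = row_data_adjust_back + data.mean(axis=0).reshape(-1, 1)
        print(row_original_data.T)   # compare with the original data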


    Principal Component Regression

    PCA + MLR (Multiple Linear Regression)


    PCR

    T is a matrix of SCORES

    X is a DATA matrix

    P is a matrix of LOADINGS

    Y is a dependent variable vector

    B is the regression coefficients vector


    X = T P^T

    Y = T B + E

    B = (T^T T)^-1 T^T Y
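
    A minimal PCR sketch along these lines (NumPy only; the data and the choice of two components are hypothetical, not from the slides): compute the scores T from PCA on X, then regress Y on T by least squares:

        import numpy as np

        rng = np.random.default_rng(0)
        X = rng.normal(size=(50, 5))                     # hypothetical data matrix
        y = X @ np.array([1.0, 0.5, 0.0, 0.0, 0.0]) + 0.1 * rng.normal(size=50)

        # PCA on mean-centered X: loadings P are eigenvectors of the covariance matrix
        Xc = X - X.mean(axis=0)
        eigenvalues, P = np.linalg.eigh(np.cov(Xc, rowvar=False))
        P = P[:, np.argsort(eigenvalues)[::-1]][:, :2]   # keep the top 2 loadings
        T = Xc @ P                                       # scores: X is approximated by T P^T

        # Regression on the scores: B = (T^T T)^-1 T^T Y
        yc = y - y.mean()
        B = np.linalg.solve(T.T @ T, T.T @ yc)
        y_hat = T @ B + y.mean()
        print(B, np.mean((y - y_hat) ** 2))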


    PCR

    In PCR, the X matrix is replaced by the T matrix, which has fewer, orthogonal variables.

    There is no issue with inverting the T^T T matrix (unlike in MLR), because the scores are orthogonal.

    PCR also resolves the issue of collinearity, which in turn can reduce the prediction error.

    There is no guarantee that a PCR model will work better than an MLR model on the same data set.


    Reference

    www.cs.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf


    Questions?
