
Slide 1 of 39

A User-Friendly Demonstration of Principal Components Analysis as a Data Reduction Method

R. Michael Haynes, PhD, Assistant Vice President, Student Life Studies, Tarleton State University

Keith Lamb, MBA, Associate Vice President, Student Affairs, Midwestern State University

Slide 2 of 39

What is Principal Components Analysis (PCA)?

A member of the general linear model (GLM) family, in which all analyses are correlational

A term often used interchangeably with factor analysis, although there are slight differences

A method of reducing large data sets into more manageable factors or components

A method of identifying the most useful variables in a dataset

A method of identifying and classifying variables according to the common themes, or constructs, that they represent

Slide 3 of 39

Before we get started, a GLOSSARY of terms we'll be using today:

Bartlett's Test of Sphericity

Communality coefficients

Construct

Correlation matrix

Cronbach's alpha coefficient

Effect sizes (variance accounted for)

Eigenvalues

    Extraction

    Factor or component

    Kaiser criterion for retaining factors

    Kaiser-Meyer-Olkin Measure of Sampling Adequacy

    Latent

    Reliability

    Rotation

    Scree plot

    Split-half reliability

    Structure coefficients

Slide 4 of 39

Desired outcomes from today's session

Understand the terminology associated with principal components analysis (PCA)

Understand when using PCA is appropriate

Understand how to conduct PCA using SPSS 17.0

Understand how to interpret a correlation matrix

Understand how to interpret a communality matrix

Understand how to interpret a components matrix and the methods used in determining how many components to retain

Understand how to analyze a component to determine which variables to include and why

    Understand the concept of reliability and why it is

    important in survey research

Slide 5 of 39

LET'S GET STARTED!!

Slide 6 of 39

    When is using PCA appropriate?

When your data are at the interval or ratio level

When you have at least 5 observations per variable and at least 100 observations total (e.g., 20 variables would require at least 100 observations)

When trying to reduce the number of variables to be used in another GLM technique (e.g., regression, MANOVA, etc.)

    When attempting to identify latent constructs that are

    being measured by observed variables in the absence of

    a priori theory.

Slide 7 of 39

HEURISTIC DATA

Responses to the Developing Purpose Inventory (DPI) collected at a large, metropolitan university between 2004 and 2006 (IRB approval received)

45 questions related to Chickering's developing purpose stage

Responses on a 5-point interval scale; 1 = always true to 5 = never true

Sample size = 998 participants

SUGGESTION: always visually inspect data for missing cases and potential outliers! (APA Task Force on Statistical Inference, 1999). A quick sketch of such a check follows this slide.

There are multiple ways of dealing with missing data, but that's for another day!
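As a quick illustration of the inspection suggested above, here is a minimal Python/pandas sketch (not part of the deck's SPSS workflow; the file name is hypothetical):

import pandas as pd

# Hypothetical export of the 45 DPI items (question1 ... question45), one row per participant
dpi = pd.read_csv("dpi_responses.csv")

# Missing responses per item, plus summary statistics whose min/max expose impossible codes
print(dpi.isna().sum())
print(dpi.describe())

# Valid responses are 1-5, so anything outside that range deserves a second look
print(((dpi < 1) | (dpi > 5)).sum())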

Slide 8 of 39

    SPSS 17.0

Make sure your set-up in Variable View is complete to accommodate your data

    Names, labels, possible values of the data, and type of measure

Slide 9 of 39

Analyze > Dimension Reduction > Factor

    SPSS 17.0

Slide 10 of 39

SPSS 17.0 SYNTAX (orange highlighting in the original slide marks the sections specific to your analysis!)

    DATASET ACTIVATE DataSet1.

    FACTOR

/VARIABLES question1 question2 question3 question4 question5 question6 question7 question8 question9 question10 question11 question12 question13 question14 question15 question16 question17 question18 question19 question20 question21 question22 question23 question24 question25 question26 question27 question28 question29 question30 question31 question32 question33 question34 question35 question36 question37 question38 question39 question40 question41 question42 question43 question44 question45

/MISSING LISTWISE

/ANALYSIS question1 question2 question3 question4 question5 question6 question7 question8 question9 question10 question11 question12 question13 question14 question15 question16 question17 question18 question19 question20 question21 question22 question23 question24 question25 question26 question27 question28 question29 question30 question31 question32 question33 question34 question35 question36 question37 question38 question39 question40 question41 question42 question43 question44 question45

    /PRINT INITIAL CORRELATION SIG KMO EXTRACTION ROTATION FSCORE

/FORMAT SORT BLANK(.000)

/PLOT EIGEN

    /CRITERIA MINEIGEN(1) ITERATE(25)

    /EXTRACTION PC

    /CRITERIA ITERATE(25)

    /ROTATION VARIMAX

    /SAVE AR(ALL)

    /METHOD=CORRELATION.

Slide 11 of 39

    OUTPUT COMPONENTS

Correlation Matrix: Pearson r correlations between the individual variables

Correlations range from -1.0 to +1.0; they may be strong, modest, or weak, and positive or negative

Correlations of 1.00 sit on the diagonal; every variable is perfectly and positively correlated with itself!

It is this information that is the basis for PCA! In other words, if you have only a correlation matrix, you can conduct PCA! (A small sketch of this idea follows the matrix below.)

    Question 1 - ARI Question 2 - VI Question 3 - SL Question 4 - ARI Question 5 - VI

    Question 1 - ARI 1.000 .157 .077 .165 .069

    Question 2 - VI .157 1.000 .261 .109 .211

    Question 3 - SL .077 .261 1.000 .157 .017

    Question 4 - ARI .165 .109 .157 1.000 .098

    Question 5 - VI .069 .211 .017 .098 1.000
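To make that last point concrete, here is a minimal NumPy sketch (an illustration, not the deck's SPSS procedure) that runs PCA directly on the 5 x 5 correlation matrix shown above:

import numpy as np

# The correlation matrix from this slide (Questions 1-5)
R = np.array([
    [1.000, 0.157, 0.077, 0.165, 0.069],
    [0.157, 1.000, 0.261, 0.109, 0.211],
    [0.077, 0.261, 1.000, 0.157, 0.017],
    [0.165, 0.109, 0.157, 1.000, 0.098],
    [0.069, 0.211, 0.017, 0.098, 1.000],
])

# PCA of standardized variables is just the eigendecomposition of R
eigenvalues, eigenvectors = np.linalg.eigh(R)      # returned in ascending order
order = np.argsort(eigenvalues)[::-1]              # put the largest component first
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

print(eigenvalues)                # variance carried by each component
print(eigenvalues / R.shape[0])   # proportion of total variance (trace of R = 5)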

Slide 12 of 39

KMO Measure of Sampling Adequacy and Bartlett's Test of Sphericity

KMO values closer to 1.0 are better; Kaiser (1970 & 1975; as cited by Meyers, Gamst, & Guarino, 2006) states that a value of .70 is considered adequate

Bartlett's Test: you want a statistically significant value, so that you reject the null hypothesis of a lack of sufficient correlation between the variables (a sketch of Bartlett's computation follows this slide)

Kaiser-Meyer-Olkin Measure of Sampling Adequacy: .861

Bartlett's Test of Sphericity: Approx. Chi-Square = 9193.879, df = 990, Sig. = .000

    OUTPUT COMPONENTS
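For the curious, Bartlett's chi-square can be reproduced from the correlation matrix and the sample size; a NumPy/SciPy sketch (illustrative only; the deck relies on the SPSS output above):

import numpy as np
from scipy.stats import chi2

def bartlett_sphericity(R, n):
    """Test the null hypothesis that the correlation matrix R is an identity matrix."""
    p = R.shape[0]
    statistic = -((n - 1) - (2 * p + 5) / 6.0) * np.log(np.linalg.det(R))
    df = p * (p - 1) // 2
    return statistic, df, chi2.sf(statistic, df)

# Here R would be the full 45 x 45 correlation matrix and n = 998;
# with p = 45 items, df = 45 * 44 / 2 = 990, matching the SPSS table above.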

Slide 13 of 39

Communality Coefficients: the amount of variance in a variable accounted for by the components

higher coefficients = stronger variables

lower coefficients = weaker variables

(A sketch of how these are computed follows the table below.)

    Initial Extraction

    Question 1 - ARI 1.000 .560

    Question 2 - VI 1.000 .446

    Question 3 - SL 1.000 .773

    Question 4 - ARI 1.000 .519

    Question 5 - VI 1.000 .539

    Question 6 - SL 1.000 .439

    Question 7 - ARI 1.000 .605

    Question 8 - VI 1.000 .527

    Question 9 - SL 1.000 .537

    Question 10 - ARI 1.000 .775

    Question 11 - VI 1.000 .635

    Question 12 - SL 1.000 .476

    Question 13 - ARI 1.000 .542

    Question 14 - VI 1.000 .435

    Question 15 - SL 1.000 .426

    OUTPUT COMPONENTS
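A short NumPy sketch of where the extraction communalities come from (illustrative; the tiny made-up loading matrix stands in for the full 45 x 12 component matrix):

import numpy as np

# Rows = items, columns = retained components (a made-up 3 x 2 stand-in)
loadings = np.array([
    [0.78, -0.06],
    [0.13,  0.66],
    [0.31, -0.16],
])

# Extraction communality of an item = sum of its squared loadings across the
# retained components (the "Extraction" column above); if all 45 components
# were kept, every communality would equal the "Initial" value of 1.000
communalities = (loadings ** 2).sum(axis=1)
print(communalities)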

Slide 14 of 39

Total Variance Explained Table: lists the individual components (remember, you have as many components as you have variables) by eigenvalue and variance accounted for

How do we determine how many components to retain?

    Component

    Initial Eigenvalues Extraction Sums of Squared Loadings Rotation Sums of Squared Loadings

    Total % of Variance Cumulative % Total % of Variance Cumulative % Total % of Variance Cumulative %

    1 7.216 16.035 16.035 7.216 16.035 16.035 3.666 8.147 8.147

    2 3.107 6.904 22.938 3.107 6.904 22.938 2.649 5.887 14.034

    3 2.455 5.456 28.395 2.455 5.456 28.395 2.597 5.771 19.806

    4 1.846 4.103 32.498 1.846 4.103 32.498 2.555 5.677 25.482

    5 1.690 3.755 36.253 1.690 3.755 36.253 2.243 4.984 30.466

    6 1.458 3.239 39.493 1.458 3.239 39.493 2.189 4.865 35.331

    7 1.307 2.906 42.398 1.307 2.906 42.398 1.746 3.880 39.212

    8 1.180 2.623 45.021 1.180 2.623 45.021 1.578 3.507 42.719

    9 1.107 2.461 47.482 1.107 2.461 47.482 1.555 3.455 46.174

    10 1.064 2.364 49.846 1.064 2.364 49.846 1.314 2.919 49.093

    11 1.024 2.275 52.121 1.024 2.275 52.121 1.221 2.712 51.805

    12 1.014 2.253 54.374 1.014 2.253 54.374 1.156 2.569 54.374

    13 .976 2.170 56.544

    OUTPUT COMPONENTS

Slide 15 of 39

[The Total Variance Explained table from the previous slide is repeated here.]

    OUTPUT COMPONENTS

    Total Variance Explained Table

Kaiser Criterion (K1 Rule): retain only those components with an eigenvalue greater than 1; this can lead to retaining more components than necessary
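A small NumPy sketch of the K1 rule applied to the eigenvalues above (illustrative; only the first 13 of the 45 eigenvalues are typed in):

import numpy as np

eigenvalues = np.array([7.216, 3.107, 2.455, 1.846, 1.690, 1.458, 1.307,
                        1.180, 1.107, 1.064, 1.024, 1.014, 0.976])

# Kaiser criterion: keep every component whose eigenvalue exceeds 1
n_retained = int((eigenvalues > 1).sum())
print(n_retained)   # 12, matching the extraction in the SPSS table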

Slide 16 of 39

    OUTPUT COMPONENTS

[The Total Variance Explained table is repeated here.]

    Total Variance Explained Table

    Retain as many factors as will account for a pre-determined

    amount of variance, say 70%; can lead to retention of

    components that are variable specific (Stevens, 2002)

Slide 17 of 39

    Scree Plot

Plots eigenvalues on the Y axis and component numbers on the X axis

The recommendation is to retain all components in the descent before the first one on the line where it levels off (Cattell, 1966; as cited by Stevens, 2002). (A minimal plotting sketch follows this slide.)

    OUTPUT COMPONENTS
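The deck uses the scree plot produced by SPSS; for reference, a minimal matplotlib sketch of the same idea:

import numpy as np
import matplotlib.pyplot as plt

# First 13 of the 45 initial eigenvalues from the Total Variance Explained table
eigenvalues = np.array([7.216, 3.107, 2.455, 1.846, 1.690, 1.458, 1.307,
                        1.180, 1.107, 1.064, 1.024, 1.014, 0.976])

plt.plot(np.arange(1, len(eigenvalues) + 1), eigenvalues, marker="o")
plt.axhline(1.0, linestyle="--")   # the K1 reference line
plt.xlabel("Component number")
plt.ylabel("Eigenvalue")
plt.title("Scree plot")
plt.show()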

Slide 18 of 39

    Other Retention Methods

Velicer's Minimum Average Partial (MAP) test

Seeks to determine which components are common

Does not seek a cut-off point, but rather a more comprehensive solution

Components that have a high number of highly correlated variables are retained

However, variable-based decisions can result in underestimating the number of components to retain

(Ledesma & Valero-Mora, 2007)

Slide 19 of 39

    Other Retention Methods

Horn's Parallel Analysis (PA)

Compares observed eigenvalues with simulated eigenvalues

Retain all components with an eigenvalue greater than the mean of the simulated eigenvalues

Considered highly accurate and exempt from extraneous factors

(Ledesma & Valero-Mora, 2007)
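SPSS 17.0 has no built-in parallel analysis (Ledesma & Valero-Mora, 2007, describe a free program for it), but the idea fits in a few lines of NumPy; a sketch, assuming normally distributed random data:

import numpy as np

def parallel_analysis(observed_eigenvalues, n_obs, n_vars, n_sims=500, seed=0):
    """Retain components whose observed eigenvalue exceeds the mean eigenvalue
    obtained from correlation matrices of random data of the same size."""
    rng = np.random.default_rng(seed)
    sim_eigs = np.empty((n_sims, n_vars))
    for i in range(n_sims):
        X = rng.standard_normal((n_obs, n_vars))
        R = np.corrcoef(X, rowvar=False)
        sim_eigs[i] = np.sort(np.linalg.eigvalsh(R))[::-1]
    reference = sim_eigs.mean(axis=0)
    observed = np.asarray(observed_eigenvalues)
    keep = observed > reference[:len(observed)]
    return int(keep.sum()), reference

# Usage with the DPI data: pass the 45 observed eigenvalues from the SPSS
# table along with n_obs=998 and n_vars=45.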

Slide 20 of 39

    OUTPUT COMPONENTS

    Component Matrix

Column values are structure coefficients, or the correlation between the test question and the synthetic component; REMEMBER: squared structure coefficients inform us of how well the item can reproduce the effect in the component!

Slide 21 of 39

Rotated Component Matrix

    Component

    1 2 3 4 5 6 7 8 9 10 11 12

    Question 42 - SL .781 -.060 .000 .117 .034 .071 .055 -.062 .093 -.002 .032 .025

    Question 39 - SL .778 -.132 .107 .109 .008 .024 -.025 .018 .044 -.010 .022 -.025

    Question 33 - SL .765 -.042 .115 .098 .034 .090 -.035 -.035 .011 .013 -.012 .020

    Question 9 - SL .672 -.103 .127 .092 .050 .126 .005 -.119 -.002 -.063 -.034 -.114

    Question 37 - ARI .462 -.173 .193 -.103 .075 .197 .345 -.018 .024 .232 .009 .119

    Question 15 - SL .406 -.002 .340 .038 .050 .091 .120 -.007 .067 -.152 -.127 -.273

    Question 36 - SL .395 -.067 .212 -.104 .225 .125 .365 -.089 .110 .168 -.037 .221

    Question 44 - VI .375 -.033 .360 .128 .175 .091 .221 -.023 .177 -.035 -.027 -.001

    Question 26 - VI -.022 .660 -.113 .009 .021 -.063 -.096 .089 .044 .034 -.060 .174

    Question 27 - SL -.158 .652 -.088 .032 .069 -.091 .040 .193 -.032 -.150 -.019 .003

    Question 38 - VI -.058 .501 -.109 -.171 .032 -.276 -.051 .078 -.042 .255 -.016 -.097

    Question 20 - VI -.240 .489 .016 .076 .036 -.092 -.052 .434 -.102 .071 -.079 .056

    Question 32 - VI -.101 .488 -.134 .084 -.074 -.415 -.010 .046 .025 -.057 -.050 .020

    Question 45 - SL -.144 .443 -.049 -.097 -.105 -.026 -.097 .078 -.031 .057 .421 -.013

    Question 29 - VI -.006 .439 .154 -.114 .007 .231 .238 -.196 .145 -.098 .089 -.138

    Question 41 - VI -.019 .421 -.087 -.210 .006 -.107 .333 -.005 .125 .091 .300 -.082

    Question 24 - SL .129 -.067 .720 .101 .147 .119 -.003 .011 .005 .011 -.012 .203

    Question 21 - SL .125 -.164 .676 -.056 .161 .047 .160 -.044 -.012 .137 -.006 .029

    Question 23 - VI .313 -.164 .537 .286 .063 .007 .076 -.094 .119 .049 .123 .031

    Question 17 - VI .076 -.050 .459 .187 .040 .136 .314 .048 .120 -.212 .083 -.140

    Question 30 - SL .120 .114 .420 .287 -.081 .309 -.109 -.165 .061 .328 -.107 .161

    Question 22 - ARI .042 .075 .364 .045 .087 -.081 -.135 -.353 .324 .216 .016 -.188

    Question 34 - ARI .187 .042 .067 .791 -.002 .075 -.031 -.019 .012 .063 -.050 -.036

    Question 1 - ARI -.002 -.062 .082 .722 .055 -.018 .008 -.014 .039 .132 .015 -.075

    Extraction Method: Principal Component Analysis.

    Rotation Method: Varimax with Kaiser Normalization.

Slide 22 of 39

Rotated Component Matrix, continued

Component

    Question 35 - VI .113 .077 .015 .569 .030 .439 -.053 -.140 .067 -.089 .095 .105

    Question 40 - ARI .194 -.161 .176 .553 .033 .057 -.041 .016 .186 -.086 .216 .147

    Question 10 - ARI .029 .016 .144 .033 .860 .036 .010 -.074 .032 .063 -.010 .006

    Question 3 - SL .069 -.015 .197 .050 .848 -.029 .025 -.011 .067 -.026 -.003 .004

    Question 12 - SL .297 .069 .072 .000 .488 .137 .282 .024 .033 .091 .082 .158

    Question 13 - ARI -.046 .058 -.118 .045 .447 -.102 .321 .069 .128 .368 -.222 -.033

    Question 11 - VI .151 -.021 .024 .361 .115 .663 .000 -.006 -.124 -.028 .021 .104

    Question 5 - VI .154 -.134 .201 .042 -.057 .652 .020 .028 -.019 .124 .039 -.092

    Question 8 - VI -.090 .250 -.017 .010 .000 -.623 -.034 .115 -.105 .141 .120 .088

    Question 18 - SL .034 .003 .095 -.055 .092 -.039 .686 -.026 .015 .006 -.024 .036

    Question 14 - VI .241 -.157 .289 -.007 .132 .221 .418 .061 -.057 -.006 .122 -.080

    Question 28 - ARI -.232 .248 .051 .181 -.128 -.237 .357 -.112 .043 .074 -.144 .240

    Question 16 - ARI -.069 .213 -.008 .062 -.006 -.075 .033 .678 -.051 -.101 -.103 .023

    Question 19 - ARI .001 .054 -.042 -.241 -.033 -.010 -.112 .630 .147 -.010 .127 .036

    Question 43 - ARI .138 -.011 .067 .255 .017 .045 -.091 .086 .756 .024 -.074 .075

    Question 31 - ARI .062 .045 .069 -.048 .122 -.040 .186 -.053 .721 .140 -.077 .033

    Question 4 - ARI .023 -.057 .119 .100 .132 .007 .034 -.131 .184 .643 .020 -.088

    Question 6 - SL -.186 .177 -.039 .065 -.051 -.066 .087 .372 -.059 .390 .230 -.080

    Question 7 - ARI .024 -.059 .047 .149 .010 .005 .016 -.017 -.133 .008 .736 .126

    Question 2 - VI .234 -.198 .246 .175 .233 .094 .203 .086 .179 -.161 .254 -.162

    Question 25 - ARI -.048 .063 .119 .021 .073 -.049 .064 .085 .078 -.123 .108 .767

    Extraction Method: Principal Component Analysis.

    Rotation Method: Varimax with Kaiser Normalization.

Slide 23 of 39

    Component Matrix

Column values are structure coefficients, or the correlation between the test question and the synthetic component; REMEMBER: squared structure coefficients inform us of how well the item can reproduce the effect in the component!

Rule of thumb: include all items with structure coefficients with an absolute value of .300 or greater (a small sketch of this rule follows this slide)

    OUTPUT COMPONENTS
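A tiny NumPy sketch of the |.300| rule of thumb (illustrative; the 3 x 2 array stands in for the full 45 x 12 rotated component matrix):

import numpy as np

# A made-up slice of the rotated component matrix (rows = items, columns = components)
rotated = np.array([
    [0.781, -0.060],
    [-0.022, 0.660],
    [0.313, -0.164],
])

# True wherever the loading's absolute value reaches .300: those are the
# items a reader would consider assigning to that component
print(np.abs(rotated) >= 0.300)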

Slide 24 of 39

[The Rotated Component Matrix shown on the earlier slides is repeated here.]

Slide 25 of 39

[The continuation of the Rotated Component Matrix is repeated here.]

Slide 26 of 39

    Component Matrix

For heuristic purposes, we're retaining the first X components; what variables should we include in the components?

Column values are structure coefficients, or the correlation between the test question and the synthetic component; REMEMBER: squared structure coefficients inform us of how well the item can reproduce the effect in the component!

Rule of thumb: include all items with structure coefficients with an absolute value of .300 or greater

Stevens recommends a better way!

    OUTPUT COMPONENTS

Slide 27 of 39

Critical Values for a Correlation Coefficient at α = .01 for a Two-Tailed Test

    n CV n CV n CV

    50 .361 180 .192 400 .129

    80 .286 200 .182 600 .105

    100 .256 250 .163 800 .091

    140 .217 300 .149 1000 .081

(Stevens, 2002, p. 394)

Test the structure coefficient for statistical significance against a two-tailed table based on sample size and a critical value (CV); for our sample size of 998, the tabled CV is |.081|, which Stevens doubles to |.162| for judging loadings.
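A SciPy sketch (illustrative) of where the tabled critical values come from, using the n = 1000 row that the deck applies to its sample of 998:

import numpy as np
from scipy.stats import t

n = 1000
df = n - 2
t_crit = t.ppf(1 - 0.01 / 2, df)              # two-tailed, alpha = .01
r_crit = t_crit / np.sqrt(t_crit ** 2 + df)   # critical value of a correlation
print(round(r_crit, 3))                       # about .081, as in Stevens' table
# Doubling the tabled value, 2 x .081, gives the |.162| cutoff used above.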

Slide 28 of 39

[The Rotated Component Matrix is repeated here for comparison against the |.162| critical value.]

Slide 29 of 39

[The continuation of the Rotated Component Matrix is repeated here.]

Slide 30 of 39

Sum the interval values for the responses of all questions included in the retained component

Obtain mean values for the responses of all questions included in the retained component (hint: you'll get the same R, R², and structure coefficients as with the sums!)

Use SPSS to obtain factor scores for the component

Choose the Scores button when setting up your PCA

Options include calculating scores based on the regression, Bartlett, or Anderson-Rubin methodologies; be sure to check Save as Variables

Factor scores will appear in your data set and can be used as variables in other GLM analyses

(A short sketch of the sum and mean options follows this slide.)

    Obtaining Continuous Component Values

    for Use in Further Analysis
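A pandas sketch of the sum and mean options (illustrative; the file name is hypothetical, and the item list is simply the four highest-loading items on component 1 in the rotated matrix above):

import pandas as pd

dpi = pd.read_csv("dpi_responses.csv")   # hypothetical export of the DPI responses

# Items loading most strongly on the first rotated component
component1_items = ["question42", "question39", "question33", "question9"]

dpi["component1_sum"] = dpi[component1_items].sum(axis=1)
dpi["component1_mean"] = dpi[component1_items].mean(axis=1)
# Either composite can now be carried into a regression, MANOVA, etc.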

Slide 31 of 39

    RELIABILITY

    The extent to which scores on a test are consistent

    across multiple administrations of the test; the amount of

    measurement error in the scores yielded by a test (Gall,

    Gall, & Borg, 2003).

Validity, in contrast, is about ensuring our tests are really measuring what we intended to measure; you wouldn't administer an English literature test to assess math competency, would you?

    Can be measured several ways using SPSS 17.0

Slide 32 of 39

    A Visual Explanation of

    Reliability and Validity

[Image illustrating reliability versus validity; not reproduced in this transcript.]
Slide 33 of 39

    RELIABILITY

Slide 34 of 39

    RELIABILITY

Slide 35 of 39

Cronbach's Alpha Coefficient

RELIABILITY

    /VARIABLES=question1 question2 question3 question4 question5 question6 question7 question8

    question9 question10 question11 question12 question13 question14 question15 question16 question17

    question18 question19 question20 question21 question22 question23 question24 question25 question26

    question27 question28 question29 question30 question31 question32 question33 question34 question35

question36 question37 question38 question39 question40 question41 question42 question43 question44 question45

    /SCALE('ALL VARIABLES') ALL

    /MODEL=ALPHA.
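For reference, a NumPy sketch of the calculation behind /MODEL=ALPHA (illustrative; the `responses` argument stands in for the 998 x 45 matrix of item responses):

import numpy as np

def cronbach_alpha(responses):
    """responses: 2-D array with rows = respondents and columns = items."""
    responses = np.asarray(responses, dtype=float)
    k = responses.shape[1]
    item_variances = responses.var(axis=0, ddof=1)
    total_variance = responses.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# For the 45 DPI items this should land near the .749 that SPSS reports on the next slide.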

Split-Half Coefficient

RELIABILITY

/VARIABLES=question1 question2 question3 question4 question5 question6 question7 question8 question9 question10 question11 question12 question13 question14 question15 question16 question17

    question18 question19 question20 question21 question22 question23 question24 question25 question26

    question27 question28 question29 question30 question31 question32 question33 question34 question35

    question36 question37 question38 question39 question40 question41 question42 question43 question44

    question45

    /SCALE('ALL VARIABLES') ALL

    /MODEL=SPLIT.

    RELIABILITY

Slide 36 of 39

Cronbach's Alpha Coefficient

    Reliability Statistics

    Cronbach's Alpha N of Items

    .749 45

    RELIABILITY

    Benchmarks for Alpha

    .9 & up = very good

    .8 to .9 = good

    .7 to .8 = acceptable

    .7 & below = suspect.

Don't refer to the test as "reliable," but say that "scores from this administration of the test yielded reliable results." (Kyle Roberts)

Slide 37 of 39

Split-Half Coefficient

Reliability Statistics

Cronbach's Alpha                 Part 1   Value          .620
                                          N of Items     23a
                                 Part 2   Value          .623
                                          N of Items     22b
                                 Total N of Items        45
Correlation Between Forms                                .518
Spearman-Brown Coefficient                Equal Length   .683
                                          Unequal Length .683
Guttman Split-Half Coefficient                           .683

a. The items are: Question 1 - ARI, Question 2 - VI, Question 3 - SL, Question 4 - ARI, Question 5 - VI, Question 6 - SL, Question 7 - ARI, Question 8 - VI, Question 9 - SL, Question 10 - ARI, Question 11 - VI, Question 12 - SL, Question 13 - ARI, Question 14 - VI, Question 15 - SL, Question 16 - ARI, Question 17 - VI, Question 18 - SL, Question 19 - ARI, Question 20 - VI, Question 21 - SL, Question 22 - ARI, Question 23 - VI.

b. The items are: Question 23 - VI, Question 24 - SL, Question 25 - ARI, Question 26 - VI, Question 27 - SL, Question 28 - ARI, Question 29 - VI, Question 30 - SL, Question 31 - ARI, Question 32 - VI, Question 33 - SL, Question 34 - ARI, Question 35 - VI, Question 36 - SL, Question 37 - ARI, Question 38 - VI, Question 39 - SL, Question 40 - ARI, Question 41 - VI, Question 42 - SL, Question 43 - ARI, Question 44 - VI, Question 45 - SL.

    RELIABILITY
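A one-line check (illustrative) of how the Spearman-Brown coefficient above follows from the correlation between the two halves:

r_halves = 0.518                                # correlation between forms, rounded, from the table
spearman_brown = 2 * r_halves / (1 + r_halves)
print(round(spearman_brown, 3))                 # about .682; SPSS's .683 comes from the unrounded correlation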

Slide 38 of 39

RELATED LINKS

http://faculty.chass.ncsu.edu/garson/PA765/factor.htm

http://www.uic.edu/classes/epsy/epsy546/Lecture%204%20---%20notes%20on%20PRINCIPAL%20COMPONENTS%20ANALYSIS%20AND%20FACTOR%20ANALYSIS1.pdf

http://www.ats.ucla.edu/stat/Spss/output/factor1.htm

http://www.statsoft.com/textbook/principal-components-factor-analysis/

Slide 39 of 39

REFERENCES

Gall, M. D., Gall, J. P., & Borg, W. R. (2003). Educational research: An introduction (7th ed.). Boston: Allyn and Bacon.

Ledesma, R. D., & Valero-Mora, P. (2007). Determining the number of factors to retain in EFA: An easy-to-use computer program for carrying out parallel analysis. Practical Assessment, Research & Evaluation, 12(2).

Meyers, L. S., Gamst, G., & Guarino, A. J. (2006). Applied multivariate research: Design and interpretation. Thousand Oaks, CA: Sage.

Stevens, J. P. (2002). Applied multivariate statistics for the social sciences (4th ed.). Mahwah, NJ: Lawrence Erlbaum Associates.

University of California at Los Angeles Academic Technology Services (2009). Annotated SPSS output: Factor analysis. Retrieved January 11, 2010 from http://www.ats.ucla.edu/stat/Spss/output/factor1.htm

University of Illinois at Chicago (2009). Principal components analysis and factor analysis. Retrieved January 11, 2010 from http://www.uic.edu/classes/epsy/epsy546/Lecture%204%20---%20notes%20on%20PRINCIPAL%20COMPONENTS%20ANALYSIS%20AND%20FACTOR%20ANALYSIS1.pdf

Wilkinson, L., & Task Force on Statistical Inference. (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54, 594-604.
