8/12/2019 User Friendly Demonstration PCA
1/39
A User-Friendly Demonstration of Principal Components Analysis as a Data Reduction Method
R. Michael Haynes, PhD, Assistant Vice President, Student Life Studies, Tarleton State University
Keith Lamb, MBA, Associate Vice President, Student Affairs, Midwestern State University
-
What is Principal Components
Analysis (PCA)?
A member of the general linear model (GLM), where all analyses are correlational
Term often used interchangeably with factor analysis; however, there are slight differences
A method of reducing large data sets into more manageable factors or components
A method of identifying the most useful variables in a dataset
A method of identifying and classifying variables across common themes, or the constructs that they represent
-
Before we get started, a GLOSSARY of terms we'll be using today:
Bartlett's Test of Sphericity
Communality coefficients
Construct
Correlation matrix
Cronbach's alpha coefficient
Effect sizes (variance accounted for)
Eigenvalues
Extraction
Factor or component
Kaiser criterion for retaining factors
Kaiser-Meyer-Olkin Measure of Sampling Adequacy
Latent
Reliability
Rotation
Scree plot
Split-half reliability
Structure coefficients
-
Desired outcomes from today's session
Understand the terminology associated with principal components analysis (PCA)
Understand when using PCA is appropriate
Understand how to conduct PCA using SPSS 17.0
Understand how to interpret a correlation matrix
Understand how to interpret a communality matrix
Understand how to interpret a components matrix and the methods used in determining how many components to retain
Understand how to analyze a component to determine which variables to include and why
Understand the concept of reliability and why it is important in survey research
-
LET'S GET STARTED!!
-
When is using PCA appropriate?
When your data is interval or ratio level
When you have at least 5 observations per variable and at least 100 observations (e.g., 20 variables call for at least 100 observations)
When trying to reduce the number of variables to be used in another GLM technique (e.g., regression, MANOVA)
When attempting to identify latent constructs that are
being measured by observed variables in the absence of
a priori theory.
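The sample-size rule of thumb above can be sketched as a small check. This is only an illustration; the function name and its default arguments are assumptions made for this example, not part of SPSS.

```python
def pca_sample_size_ok(n_observations, n_variables, per_variable=5, minimum=100):
    """Return True when both rules of thumb from the slide are met:
    at least `per_variable` observations per variable AND at least
    `minimum` observations overall."""
    required = max(per_variable * n_variables, minimum)
    return n_observations >= required


# The heuristic data set used in this session: 998 participants, 45 items.
print(pca_sample_size_ok(998, 45))   # 45 * 5 = 225 required
# A hypothetical small study: 90 participants, 20 items.
print(pca_sample_size_ok(90, 20))    # 100 required
```

With 45 items the rule asks for 225 observations, so the 998 responses used here clear it comfortably.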
-
HEURISTIC DATA
Responses to the Developing Purpose Inventory (DPI) collected at a large, metropolitan university between 2004 and 2006 (IRB approval received)
45 questions related to Chickering's developing purpose stage
Responses on a 5-point interval scale; 1 = always true to 5 = never true
Sample size = 998 participants
SUGGESTION: always visually inspect data for missing cases and potential outliers! (APA Task Force on Statistical Inference, 1999).
Multiple ways of dealing with missing data, but that's for another day!
-
SPSS 17.0
Make sure your set-up in Variable View is complete to accommodate your data
Names, labels, possible values of the data, and type of measure
-
Analyze>Dimension Reduction>Factor
SPSS 17.0
-
SPSS 17.0 SYNTAX
Orange indicates sections specific to your analysis!
DATASET ACTIVATE DataSet1.
* PCA on the 45 DPI items with varimax rotation and saved Anderson-Rubin scores.
FACTOR
  /VARIABLES question1 question2 question3 question4 question5 question6 question7 question8
    question9 question10 question11 question12 question13 question14 question15 question16
    question17 question18 question19 question20 question21 question22 question23 question24
    question25 question26 question27 question28 question29 question30 question31 question32
    question33 question34 question35 question36 question37 question38 question39 question40
    question41 question42 question43 question44 question45
  /MISSING LISTWISE
  /ANALYSIS question1 question2 question3 question4 question5 question6 question7 question8
    question9 question10 question11 question12 question13 question14 question15 question16
    question17 question18 question19 question20 question21 question22 question23 question24
    question25 question26 question27 question28 question29 question30 question31 question32
    question33 question34 question35 question36 question37 question38 question39 question40
    question41 question42 question43 question44 question45
  /PRINT INITIAL CORRELATION SIG KMO EXTRACTION ROTATION FSCORE
  /FORMAT SORT BLANK(.000)
  /PLOT EIGEN
  /CRITERIA MINEIGEN(1) ITERATE(25)
  /EXTRACTION PC
  /CRITERIA ITERATE(25)
  /ROTATION VARIMAX
  /SAVE AR(ALL)
  /METHOD=CORRELATION.
-
OUTPUT COMPONENTS
Correlation Matrix
Pearson r between the individual variables
Values range from -1.0 to +1.0; strong, modest, weak; positive, negative
Correlations of 1.00 on the diagonal; every variable is perfectly and positively correlated with itself!
It is this information that is the basis for PCA! In other words, if you have only a correlation matrix, you can conduct PCA!
                   Question 1 - ARI   Question 2 - VI   Question 3 - SL   Question 4 - ARI   Question 5 - VI
Question 1 - ARI   1.000              .157              .077              .165               .069
Question 2 - VI    .157               1.000             .261              .109               .211
Question 3 - SL    .077               .261              1.000             .157               .017
Question 4 - ARI   .165               .109              .157              1.000              .098
Question 5 - VI    .069               .211              .017              .098               1.000
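To make the point that a correlation matrix alone is enough for PCA, here is a minimal sketch that extracts eigenvalues from the five-item excerpt above. It uses a hand-written cyclic Jacobi routine rather than SPSS, so `jacobi_eigenvalues` is an illustrative helper, not the algorithm SPSS uses, and the results describe only this 5 x 5 excerpt, not the full 45-item analysis.

```python
import math


def jacobi_eigenvalues(matrix, max_sweeps=50, tol=1e-18):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]           # work on a copy
    n = len(a)
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:                         # off-diagonal mass is negligible
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that zeroes the (p, q) entry.
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):            # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):            # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


# Correlations among Questions 1-5, copied from the table above.
R = [
    [1.000, 0.157, 0.077, 0.165, 0.069],
    [0.157, 1.000, 0.261, 0.109, 0.211],
    [0.077, 0.261, 1.000, 0.157, 0.017],
    [0.165, 0.109, 0.157, 1.000, 0.098],
    [0.069, 0.211, 0.017, 0.098, 1.000],
]

eigenvalues = jacobi_eigenvalues(R)
for number, ev in enumerate(eigenvalues, start=1):
    # Each eigenvalue divided by the number of variables is the share of variance.
    print(number, round(ev, 3), round(100 * ev / len(R), 1))
```

The eigenvalues of a correlation matrix always sum to the number of variables (here 5), which is why each one can be read as a share of total variance.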
-
KMO Measure of Sampling Adequacy and Bartlett's Test of Sphericity
KMO values closer to 1.0 are better
Kaiser (1970 & 1975; as cited by Meyers, Gamst, & Guarino, 2006) states that a value of .70 is considered adequate.
Bartlett's Test: you want a statistically significant value
Reject the null hypothesis of a lack of sufficient correlation between the variables.
Kaiser-Meyer-Olkin Measure of Sampling Adequacy: .861
Bartlett's Test of Sphericity: Approx. Chi-Square = 9193.879; df = 990; Sig. = .000
OUTPUT COMPONENTS
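As a quick check on the Bartlett output, the degrees of freedom follow directly from the number of variables, and the chi-square statistic follows from the determinant of the correlation matrix. The sketch below uses the standard textbook formula; the function names are illustrative, and the two-variable example is hypothetical.

```python
import math


def bartlett_df(n_variables):
    """Degrees of freedom: number of unique off-diagonal correlations, p(p - 1) / 2."""
    return n_variables * (n_variables - 1) // 2


def bartlett_chi_square(n_observations, n_variables, log_det_r):
    """Chi-square = -(n - 1 - (2p + 5) / 6) * ln|R|, where |R| is the
    determinant of the correlation matrix."""
    return -(n_observations - 1 - (2 * n_variables + 5) / 6) * log_det_r


# 45 DPI items, as in the output above.
print(bartlett_df(45))

# Hypothetical two-variable case: r = .50, n = 100, so |R| = 1 - .50 ** 2.
print(round(bartlett_chi_square(100, 2, math.log(1 - 0.5 ** 2)), 3))
```

If the variables were completely uncorrelated, |R| would be 1, ln|R| would be 0, and the statistic would be 0, which is the null hypothesis being rejected.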
-
Communality Coefficients
Amount of variance in the variable accounted for by the components
Higher coefficients = stronger variables
Lower coefficients = weaker variables
Initial Extraction
Question 1 - ARI 1.000 .560
Question 2 - VI 1.000 .446
Question 3 - SL 1.000 .773
Question 4 - ARI 1.000 .519
Question 5 - VI 1.000 .539
Question 6 - SL 1.000 .439
Question 7 - ARI 1.000 .605
Question 8 - VI 1.000 .527
Question 9 - SL 1.000 .537
Question 10 - ARI 1.000 .775
Question 11 - VI 1.000 .635
Question 12 - SL 1.000 .476
Question 13 - ARI 1.000 .542
Question 14 - VI 1.000 .435
Question 15 - SL 1.000 .426
OUTPUT COMPONENTS
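One way to see where an extraction communality comes from: with orthogonal components, it is the sum of the item's squared structure coefficients across the retained components. The sketch below uses the Question 10 row from the rotated component matrix shown later in this session; the function name is illustrative.

```python
def communality(loadings):
    """Sum of squared structure coefficients across retained components."""
    return sum(value * value for value in loadings)


# Question 10 - ARI, components 1-12, from the rotated component matrix.
question10 = [0.029, 0.016, 0.144, 0.033, 0.860, 0.036,
              0.010, -0.074, 0.032, 0.063, -0.010, 0.006]

h2 = communality(question10)
print(round(h2, 3))  # 0.775, the Extraction value listed for Question 10
```

Because the printed loadings are rounded to three decimals, other items may differ from the table in the third decimal place.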
-
Total Variance Explained Table
Lists the individual components (remember, you have as many components as you have variables) by eigenvalue and variance accounted for
How do we determine how many components to retain?
           Initial Eigenvalues                    Extraction Sums of Squared Loadings    Rotation Sums of Squared Loadings
Component  Total   % of Variance   Cumulative %   Total   % of Variance   Cumulative %   Total   % of Variance   Cumulative %
1          7.216   16.035          16.035         7.216   16.035          16.035         3.666   8.147           8.147
2          3.107   6.904           22.938         3.107   6.904           22.938         2.649   5.887           14.034
3          2.455   5.456           28.395         2.455   5.456           28.395         2.597   5.771           19.806
4          1.846   4.103           32.498         1.846   4.103           32.498         2.555   5.677           25.482
5          1.690   3.755           36.253         1.690   3.755           36.253         2.243   4.984           30.466
6          1.458   3.239           39.493         1.458   3.239           39.493         2.189   4.865           35.331
7          1.307   2.906           42.398         1.307   2.906           42.398         1.746   3.880           39.212
8          1.180   2.623           45.021         1.180   2.623           45.021         1.578   3.507           42.719
9          1.107   2.461           47.482         1.107   2.461           47.482         1.555   3.455           46.174
10         1.064   2.364           49.846         1.064   2.364           49.846         1.314   2.919           49.093
11         1.024   2.275           52.121         1.024   2.275           52.121         1.221   2.712           51.805
12         1.014   2.253           54.374         1.014   2.253           54.374         1.156   2.569           54.374
13         .976    2.170           56.544
OUTPUT COMPONENTS
-
OUTPUT COMPONENTS
Total Variance Explained Table
Kaiser Criterion (K1 Rule): retain only those components with
an eigenvalue of greater than 1; can lead to retaining more
components than necessary
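The Kaiser criterion and the "% of Variance" column can be reproduced from the eigenvalues alone. This sketch uses the first 13 initial eigenvalues from the Total Variance Explained table; the variable and function names are illustrative.

```python
# First 13 initial eigenvalues from the Total Variance Explained table.
eigenvalues = [7.216, 3.107, 2.455, 1.846, 1.690, 1.458, 1.307,
               1.180, 1.107, 1.064, 1.024, 1.014, 0.976]
n_variables = 45  # total variance in a correlation matrix equals the number of variables


def percent_variance(eigenvalue, n_variables):
    """Share of total variance accounted for by one component, in percent."""
    return 100 * eigenvalue / n_variables


# Kaiser criterion (K1 rule): keep components whose eigenvalue exceeds 1.
retained = [ev for ev in eigenvalues if ev > 1.0]
cumulative = sum(percent_variance(ev, n_variables) for ev in retained)

print(len(retained))                                            # components kept
print(round(percent_variance(eigenvalues[0], n_variables), 2))  # first component
print(round(cumulative, 2))                                     # cumulative % retained
```

The small differences from the table in the last decimal come from working with eigenvalues that were already rounded to three places.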
-
OUTPUT COMPONENTS
Total Variance Explained Table
Retain as many factors as will account for a pre-determined
amount of variance, say 70%; can lead to retention of
components that are variable specific (Stevens, 2002)
-
Scree Plot
Plots eigenvalues on the Y axis and component number on the X axis
Recommendation is to retain all components in the descent before the first one on the line where it levels off (Cattell, 1966; as cited by Stevens, 2002).
OUTPUT COMPONENTS
-
Other Retention Methods
Velicer's Minimum Average Partial (MAP) test
Seeks to determine what components are common
Does not seek a cut-off point, but rather to find a more comprehensive solution
Components that have a high number of highly correlated variables are retained
However, variable based decisions can result in underestimating the number of components to retain
(Ledesma & Valero-Mora, 2007)
-
Other Retention Methods
Horn's Parallel Analysis (PA)
Compares observed eigenvalues with simulated eigenvalues
Retain all components with an eigenvalue greater than the mean of the simulated eigenvalues
Considered highly accurate and exempt from extraneous factors
(Ledesma & Valero-Mora, 2007)
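A rough sketch of the parallel analysis idea follows: simulate random, uncorrelated data of the same size, average the eigenvalues, and keep the observed components that beat those averages. It is not the program described by Ledesma and Valero-Mora. The helper names, the 20 replications, and the random seed are assumptions, and the observed eigenvalues come only from the five-item correlation excerpt shown earlier (n = 998), not from the full 45-item analysis.

```python
import math
import random


def jacobi_eigenvalues(matrix, max_sweeps=50, tol=1e-18):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        if sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j) < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):            # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):            # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def correlation_matrix(rows):
    """Pearson correlation matrix of a list of observation rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    norms = [math.sqrt(sum(c[j] ** 2 for c in centered)) for j in range(p)]
    return [[sum(c[i] * c[j] for c in centered) / (norms[i] * norms[j])
             for j in range(p)] for i in range(p)]


def simulated_mean_eigenvalues(n_obs, n_vars, reps=20, seed=0):
    """Mean eigenvalues from `reps` data sets of uncorrelated normal variables."""
    rng = random.Random(seed)
    totals = [0.0] * n_vars
    for _ in range(reps):
        data = [[rng.gauss(0, 1) for _ in range(n_vars)] for _ in range(n_obs)]
        for k, ev in enumerate(jacobi_eigenvalues(correlation_matrix(data))):
            totals[k] += ev
    return [t / reps for t in totals]


# Observed correlations among Questions 1-5 (excerpt from the correlation matrix slide).
R = [[1.000, 0.157, 0.077, 0.165, 0.069],
     [0.157, 1.000, 0.261, 0.109, 0.211],
     [0.077, 0.261, 1.000, 0.157, 0.017],
     [0.165, 0.109, 0.157, 1.000, 0.098],
     [0.069, 0.211, 0.017, 0.098, 1.000]]

observed = jacobi_eigenvalues(R)
simulated = simulated_mean_eigenvalues(n_obs=998, n_vars=5)
# Retain components whose observed eigenvalue exceeds the simulated mean.
n_retained = sum(1 for obs, sim in zip(observed, simulated) if obs > sim)
print(n_retained)
```

In practice more replications are typical, and some analysts compare against a high percentile of the simulated eigenvalues rather than the mean; both are design choices beyond what the slide specifies.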
-
OUTPUT COMPONENTS
Component Matrix
Column values are structure coefficients, or the correlation between the test question and the synthetic component; REMEMBER: squared structure coefficients inform us of how well the item can reproduce the effect in the component!
-
Rotated Component Matrix
Component
1 2 3 4 5 6 7 8 9 10 11 12
Question 42 - SL .781 -.060 .000 .117 .034 .071 .055 -.062 .093 -.002 .032 .025
Question 39 - SL .778 -.132 .107 .109 .008 .024 -.025 .018 .044 -.010 .022 -.025
Question 33 - SL .765 -.042 .115 .098 .034 .090 -.035 -.035 .011 .013 -.012 .020
Question 9 - SL .672 -.103 .127 .092 .050 .126 .005 -.119 -.002 -.063 -.034 -.114
Question 37 - ARI .462 -.173 .193 -.103 .075 .197 .345 -.018 .024 .232 .009 .119
Question 15 - SL .406 -.002 .340 .038 .050 .091 .120 -.007 .067 -.152 -.127 -.273
Question 36 - SL .395 -.067 .212 -.104 .225 .125 .365 -.089 .110 .168 -.037 .221
Question 44 - VI .375 -.033 .360 .128 .175 .091 .221 -.023 .177 -.035 -.027 -.001
Question 26 - VI -.022 .660 -.113 .009 .021 -.063 -.096 .089 .044 .034 -.060 .174
Question 27 - SL -.158 .652 -.088 .032 .069 -.091 .040 .193 -.032 -.150 -.019 .003
Question 38 - VI -.058 .501 -.109 -.171 .032 -.276 -.051 .078 -.042 .255 -.016 -.097
Question 20 - VI -.240 .489 .016 .076 .036 -.092 -.052 .434 -.102 .071 -.079 .056
Question 32 - VI -.101 .488 -.134 .084 -.074 -.415 -.010 .046 .025 -.057 -.050 .020
Question 45 - SL -.144 .443 -.049 -.097 -.105 -.026 -.097 .078 -.031 .057 .421 -.013
Question 29 - VI -.006 .439 .154 -.114 .007 .231 .238 -.196 .145 -.098 .089 -.138
Question 41 - VI -.019 .421 -.087 -.210 .006 -.107 .333 -.005 .125 .091 .300 -.082
Question 24 - SL .129 -.067 .720 .101 .147 .119 -.003 .011 .005 .011 -.012 .203
Question 21 - SL .125 -.164 .676 -.056 .161 .047 .160 -.044 -.012 .137 -.006 .029
Question 23 - VI .313 -.164 .537 .286 .063 .007 .076 -.094 .119 .049 .123 .031
Question 17 - VI .076 -.050 .459 .187 .040 .136 .314 .048 .120 -.212 .083 -.140
Question 30 - SL .120 .114 .420 .287 -.081 .309 -.109 -.165 .061 .328 -.107 .161
Question 22 - ARI .042 .075 .364 .045 .087 -.081 -.135 -.353 .324 .216 .016 -.188
Question 34 - ARI .187 .042 .067 .791 -.002 .075 -.031 -.019 .012 .063 -.050 -.036
Question 1 - ARI -.002 -.062 .082 .722 .055 -.018 .008 -.014 .039 .132 .015 -.075
Question 35 - VI .113 .077 .015 .569 .030 .439 -.053 -.140 .067 -.089 .095 .105
Question 40 - ARI .194 -.161 .176 .553 .033 .057 -.041 .016 .186 -.086 .216 .147
Question 10 - ARI .029 .016 .144 .033 .860 .036 .010 -.074 .032 .063 -.010 .006
Question 3 - SL .069 -.015 .197 .050 .848 -.029 .025 -.011 .067 -.026 -.003 .004
Question 12 - SL .297 .069 .072 .000 .488 .137 .282 .024 .033 .091 .082 .158
Question 13 - ARI -.046 .058 -.118 .045 .447 -.102 .321 .069 .128 .368 -.222 -.033
Question 11 - VI .151 -.021 .024 .361 .115 .663 .000 -.006 -.124 -.028 .021 .104
Question 5 - VI .154 -.134 .201 .042 -.057 .652 .020 .028 -.019 .124 .039 -.092
Question 8 - VI -.090 .250 -.017 .010 .000 -.623 -.034 .115 -.105 .141 .120 .088
Question 18 - SL .034 .003 .095 -.055 .092 -.039 .686 -.026 .015 .006 -.024 .036
Question 14 - VI .241 -.157 .289 -.007 .132 .221 .418 .061 -.057 -.006 .122 -.080
Question 28 - ARI -.232 .248 .051 .181 -.128 -.237 .357 -.112 .043 .074 -.144 .240
Question 16 - ARI -.069 .213 -.008 .062 -.006 -.075 .033 .678 -.051 -.101 -.103 .023
Question 19 - ARI .001 .054 -.042 -.241 -.033 -.010 -.112 .630 .147 -.010 .127 .036
Question 43 - ARI .138 -.011 .067 .255 .017 .045 -.091 .086 .756 .024 -.074 .075
Question 31 - ARI .062 .045 .069 -.048 .122 -.040 .186 -.053 .721 .140 -.077 .033
Question 4 - ARI .023 -.057 .119 .100 .132 .007 .034 -.131 .184 .643 .020 -.088
Question 6 - SL -.186 .177 -.039 .065 -.051 -.066 .087 .372 -.059 .390 .230 -.080
Question 7 - ARI .024 -.059 .047 .149 .010 .005 .016 -.017 -.133 .008 .736 .126
Question 2 - VI .234 -.198 .246 .175 .233 .094 .203 .086 .179 -.161 .254 -.162
Question 25 - ARI -.048 .063 .119 .021 .073 -.049 .064 .085 .078 -.123 .108 .767
Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.
-
Component Matrix
Rule of thumb: include all items with structure coefficients with an absolute value of .300 or greater
OUTPUT COMPONENTS
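The .300 rule of thumb amounts to a simple filter on one column of the rotated component matrix. The sketch below uses a handful of Component 1 values copied from that matrix; the dictionary and function names are illustrative.

```python
# Selected Component 1 structure coefficients from the rotated component matrix.
component1 = {
    "Question 42": 0.781, "Question 39": 0.778, "Question 33": 0.765,
    "Question 9": 0.672, "Question 37": 0.462, "Question 15": 0.406,
    "Question 36": 0.395, "Question 44": 0.375, "Question 23": 0.313,
    "Question 12": 0.297, "Question 20": -0.240, "Question 26": -0.022,
}


def salient_items(loadings, cutoff=0.300):
    """Items whose structure coefficient has an absolute value >= cutoff."""
    return {item: r for item, r in loadings.items() if abs(r) >= cutoff}


kept = salient_items(component1)
for item, r in kept.items():
    # The squared structure coefficient is the variance shared with the component.
    print(item, r, round(r * r, 3))
```

Note that Question 12 (.297) just misses the cutoff, which shows how sensitive a fixed threshold can be to small differences.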
-
Component Matrix
For heuristic purposes, we're retaining the first X components; what variables should we include in the components?
Stevens recommends a better way!
OUTPUT COMPONENTS
-
Critical Values for a Correlation Coefficient at α = .01 for a Two-Tailed Test
n      CV      n      CV      n      CV
50     .361    180    .192    400    .129
80     .286    200    .182    600    .105
100    .256    250    .163    800    .091
140    .217    300    .149    1000   .081
(Stevens, 2002, p. 394)
Test the structure coefficient for statistical significance against a
two-tailed table based on sample size and a critical value (CV); for
our sample size of 998, the CV would be |.081| doubled (two-tailed),
or |.162|.
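The lookup-and-double step can be sketched as below. The table values are the ones above; choosing the nearest tabled n for a sample size that is not listed is an assumption of this sketch (interpolation would be another option), and the function name is illustrative.

```python
# Critical values at alpha = .01 (two-tailed), from the table above.
CRITICAL_VALUES = {
    50: 0.361, 80: 0.286, 100: 0.256, 140: 0.217,
    180: 0.192, 200: 0.182, 250: 0.163, 300: 0.149,
    400: 0.129, 600: 0.105, 800: 0.091, 1000: 0.081,
}


def doubled_critical_value(n):
    """Double the critical value tabled for the sample size nearest to n."""
    nearest_n = min(CRITICAL_VALUES, key=lambda tabled: abs(tabled - n))
    return 2 * CRITICAL_VALUES[nearest_n]


cutoff = doubled_critical_value(998)
print(round(cutoff, 3))

# Question 12 loads .297 on Component 1: below the .300 rule of thumb,
# yet above the doubled critical value for this sample size.
print(abs(0.297) >= cutoff)
```

With nearly 1,000 respondents this cutoff is far more lenient than .300, so the two rules can disagree about borderline items.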
-
Obtaining Continuous Component Values for Use in Further Analysis
Sum the interval values for the responses of all questions included in the retained component
Obtain mean values for the responses of all questions included in the retained component (hint: you'll get the same R, R², β, and structure coefficients as with the sums!)
Use SPSS to obtain factor scores for the component
Choose the "Scores" button when setting up your PCA
Options include calculating scores based on regression, Bartlett, or Anderson-Rubin methodologies; be sure to check "Save as variables"
Factor scores will appear in your data set and can be used as variables in other GLM analyses
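The hint about sums and means can be checked with a few lines: a mean score is just the sum divided by a constant, so its correlation with any other variable is unchanged. All numbers below are hypothetical and the helper name is illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical responses of five participants to three items in one component.
responses = [[1, 2, 2], [3, 3, 4], [5, 4, 5], [2, 1, 1], [4, 5, 3]]
# Hypothetical outcome variable for the same five participants.
outcome = [2.0, 3.5, 4.8, 1.9, 4.1]

sum_scores = [sum(row) for row in responses]
mean_scores = [sum(row) / len(row) for row in responses]

r_sum = pearson_r(sum_scores, outcome)
r_mean = pearson_r(mean_scores, outcome)
print(round(r_sum, 4), round(r_mean, 4))  # the two values agree
```

Saved factor scores differ from both: they weight each item by its relationship to the component rather than weighting all items equally.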
-
RELIABILITY
The extent to which scores on a test are consistent
across multiple administrations of the test; the amount of
measurement error in the scores yielded by a test (Gall,
Gall, & Borg, 2003).
Validity is also important in ensuring our tests are really measuring what we intended to measure. You wouldn't administer an English literature test to assess math competency, would you?
Can be measured several ways using SPSS 17.0
-
A Visual Explanation of
Reliability and Validity
-
Cronbach's Alpha Coefficient
RELIABILITY
  /VARIABLES=question1 question2 question3 question4 question5 question6 question7 question8
    question9 question10 question11 question12 question13 question14 question15 question16
    question17 question18 question19 question20 question21 question22 question23 question24
    question25 question26 question27 question28 question29 question30 question31 question32
    question33 question34 question35 question36 question37 question38 question39 question40
    question41 question42 question43 question44 question45
  /SCALE('ALL VARIABLES') ALL
  /MODEL=ALPHA.
Split-Half Coefficient
RELIABILITY
  /VARIABLES=question1 question2 question3 question4 question5 question6 question7 question8
    question9 question10 question11 question12 question13 question14 question15 question16
    question17 question18 question19 question20 question21 question22 question23 question24
    question25 question26 question27 question28 question29 question30 question31 question32
    question33 question34 question35 question36 question37 question38 question39 question40
    question41 question42 question43 question44 question45
  /SCALE('ALL VARIABLES') ALL
  /MODEL=SPLIT.
RELIABILITY
-
Cronbach's Alpha Coefficient
Reliability Statistics
Cronbach's Alpha N of Items
.749 45
RELIABILITY
Benchmarks for Alpha
.9 & up = very good
.8 to .9 = good
.7 to .8 = acceptable
.7 & below = suspect.
"Don't refer to the test as reliable, but scores from this administration of the test yielded reliable results." (Kyle Roberts)
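For readers who want to see what SPSS is computing, here is a minimal sketch of Cronbach's alpha from raw item scores, using the standard formula alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores). The response data are hypothetical, the function names are illustrative, and treating exactly .70 as "acceptable" is an assumption because the benchmark list above overlaps at that boundary.

```python
def variance(values):
    """Sample variance (n - 1 in the denominator)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def benchmark(alpha):
    """Label alpha using the benchmarks listed above."""
    if alpha >= 0.9:
        return "very good"
    if alpha >= 0.8:
        return "good"
    if alpha >= 0.7:      # assumption: exactly .70 counts as acceptable
        return "acceptable"
    return "suspect"


# Hypothetical responses of five participants to three items.
responses = [[1, 2, 2], [3, 3, 4], [5, 4, 5], [2, 1, 1], [4, 5, 3]]
alpha = cronbach_alpha(responses)
print(round(alpha, 3), benchmark(alpha))

# The alpha reported above for the 45 DPI items.
print(benchmark(0.749))
```

Alpha describes the scores from one administration, which is the point of the quotation above.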
-
Split-Half Coefficient
Reliability Statistics
Cronbach's Alpha                 Part 1   Value            .620
                                          N of Items       23 (a)
                                 Part 2   Value            .623
                                          N of Items       22 (b)
                                 Total N of Items          45
Correlation Between Forms                                  .518
Spearman-Brown Coefficient       Equal Length              .683
                                 Unequal Length            .683
Guttman Split-Half Coefficient                             .683
a. The items are: Question 1 - ARI, Question 2 - VI, Question 3 - SL, Question 4 - ARI, Question 5 - VI, Question 6 - SL, Question 7 - ARI, Question 8 - VI, Question 9 - SL, Question 10 - ARI, Question 11 - VI, Question 12 - SL, Question 13 - ARI, Question 14 - VI, Question 15 - SL, Question 16 - ARI, Question 17 - VI, Question 18 - SL, Question 19 - ARI, Question 20 - VI, Question 21 - SL, Question 22 - ARI, Question 23 - VI.
b. The items are: Question 23 - VI, Question 24 - SL, Question 25 - ARI, Question 26 - VI, Question 27 - SL, Question 28 - ARI, Question 29 - VI, Question 30 - SL, Question 31 - ARI, Question 32 - VI, Question 33 - SL, Question 34 - ARI, Question 35 - VI, Question 36 - SL, Question 37 - ARI, Question 38 - VI, Question 39 - SL, Question 40 - ARI, Question 41 - VI, Question 42 - SL, Question 43 - ARI, Question 44 - VI, Question 45 - SL.
RELIABILITY
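The equal-length Spearman-Brown coefficient in the output can be approximated from the correlation between the two halves, using the standard correction 2r / (1 + r). The function name is illustrative.

```python
def spearman_brown(r_between_halves):
    """Step the half-test correlation up to full-test length: 2r / (1 + r)."""
    return 2 * r_between_halves / (1 + r_between_halves)


# Correlation between forms reported in the output above.
estimate = spearman_brown(0.518)
print(round(estimate, 3))
```

The result agrees with the reported .683 to within the rounding of the .518 correlation.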
-
RELATED LINKS
http://faculty.chass.ncsu.edu/garson/PA765/factor.htm
http://www.uic.edu/classes/epsy/epsy546/Lecture%204%20---%20notes%20on%20PRINCIPAL%20COMPONENTS%20ANALYSIS%20AND%20FACTOR%20ANALYSIS1.pdf
http://www.ats.ucla.edu/stat/Spss/output/factor1.htm
http://www.statsoft.com/textbook/principal-components-factor-analysis/
-
REFERENCES
Gall, M. D., Gall, J. P., & Borg, W. R. (2003). Educational research: An introduction (7th ed.). Boston: Allyn and Bacon.
Ledesma, R. D., & Valero-Mora, P. (2007). Determining the number of factors to retain in EFA: An easy-to-use computer program for carrying out parallel analysis. Practical Assessment, Research, & Evaluation, 12(2).
Meyers, L. S., Gamst, G., & Guarino, A. J. (2006). Applied multivariate research: Design and interpretation. Thousand Oaks, CA: Sage.
Stevens, J. P. (2002). Applied multivariate statistics for the social sciences (4th ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
University of California at Los Angeles Academic Technology Services (2009). Annotated SPSS output: Factor analysis. Retrieved January 11, 2010 from http://www.ats.ucla.edu/stat/Spss/output/factor1.htm
University of Illinois at Chicago (2009). Principal components analysis and factor analysis. Retrieved January 11, 2010 from http://www.uic.edu/classes/epsy/epsy546/Lecture%204%20---%20notes%20on%20PRINCIPAL%20COMPONENTS%20ANALYSIS%20AND%20FACTOR%20ANALYSIS1.pdf
Wilkinson, L., & Task Force on Statistical Inference. (1999). Statistical methods in psychology journals: Guidelines and explanation. American Psychologist, 54, 594-604.