Ch 9 Multivariate Data Analysis: Discriminant Analysis and Multidimensional Scaling


Harcourt, Inc. items and derived items copyright Harcourt, Inc.

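Multiple-Group Discriminant Analysis

The rules for discriminating between groups are the same as in two-group discriminant analysis. The only difference: a single discriminant function may not be able to show all the differences among the groups.

Identify the minimum number of discriminant functions that provides the best discrimination.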


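A Geometric View of Multiple-Group Discriminant Analysis

Identify the minimum number of discriminant functions that provides the best discrimination.

Suppose there are G groups. Examining the differences among the groups does not necessarily require (G-1) dimensions; fewer may suffice (denote this number by r).

How many discriminant functions are needed to discriminate effectively among multiple groups?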



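A Geometric View of Multiple-Group Discriminant Analysis

A discriminant score is the value obtained when a point within a group is projected onto a new coordinate axis. These discriminant scores can effectively separate multiple groups.

Rotate the axes to define a new coordinate axis (Zi), chosen by a maximization criterion, and obtain the discriminant scores from it.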



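A Geometric View of Multiple-Group Discriminant Analysis

When axis Z1 cannot effectively separate the remaining groups, another discriminant function (that is, another axis Z2) is needed to tell them apart.

The two axes Z1 and Z2 need not be orthogonal, but they must be uncorrelated (that is, their discriminant scores must not be correlated).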
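As a rough illustration of these points, the sketch below extracts discriminant axes for the three salespeople groups tabulated in Transparency 17.1 (G = 3, four variables). It assumes the usual construction in which the axes solve the eigenproblem B v = lambda W v, where W is the pooled within-group and B the between-group sums-of-squares-and-cross-products matrix; the slides do not spell out this computation, and the names `group` and `discriminant_axes` are hypothetical.

```python
import numpy as np

def group(x1, x2, x3, x4):
    """Stack four variable lists into a (salespeople x variables) array."""
    return np.array([x1, x2, x3, x4], dtype=float).T

# X1..X4 for the three groups of Transparency 17.1 (15 salespeople each).
grand = group(
    [130, 122, 89, 104, 116, 100, 85, 113, 108, 116, 99, 78, 106, 94, 98],
    [62, 70, 68, 58, 40, 65, 66, 59, 52, 48, 57, 70, 61, 58, 64],
    [148, 186, 171, 135, 160, 151, 183, 130, 163, 154, 188, 190, 157, 173, 137],
    [42, 44, 32, 40, 36, 30, 42, 25, 41, 48, 32, 40, 38, 29, 36])
consolation = group(
    [105, 86, 64, 104, 102, 73, 94, 59, 84, 91, 83, 95, 68, 101, 89],
    [39, 60, 48, 36, 53, 62, 51, 64, 31, 47, 40, 42, 52, 51, 39],
    [155, 140, 132, 119, 143, 128, 152, 130, 102, 96, 87, 114, 123, 98, 117],
    [45, 33, 36, 29, 41, 30, 36, 28, 32, 35, 30, 28, 26, 24, 33])
unsuccessful = group(
    [80, 47, 26, 94, 57, 38, 29, 48, 57, 39, 51, 40, 64, 35, 51],
    [23, 42, 37, 24, 32, 41, 52, 24, 36, 37, 38, 42, 21, 32, 29],
    [69, 74, 132, 68, 94, 83, 96, 73, 82, 98, 117, 112, 67, 78, 81],
    [32, 33, 20, 26, 23, 28, 22, 26, 28, 21, 24, 22, 29, 25, 26])

X = np.vstack([grand, consolation, unsuccessful])
y = np.repeat([0, 1, 2], 15)

def discriminant_axes(X, y):
    """Solve B v = lambda W v; return eigenvalues, axes (columns) and W."""
    p = X.shape[1]
    overall_mean = X.mean(axis=0)
    W = np.zeros((p, p))
    B = np.zeros((p, p))
    for g in np.unique(y):
        Xg = X[y == g]
        mean_g = Xg.mean(axis=0)
        W += (Xg - mean_g).T @ (Xg - mean_g)
        d = (mean_g - overall_mean)[:, None]
        B += len(Xg) * (d @ d.T)
    # Reduce to an ordinary symmetric eigenproblem via Cholesky W = L L^T.
    L_inv = np.linalg.inv(np.linalg.cholesky(W))
    evals, U = np.linalg.eigh(L_inv @ B @ L_inv.T)
    order = np.argsort(evals)[::-1]          # largest eigenvalue first
    return evals[order], L_inv.T @ U[:, order], W

eigenvalues, axes, within = discriminant_axes(X, y)
r = int(np.sum(eigenvalues > 1e-8))          # useful functions, at most G-1
scores = X @ axes[:, :r]                     # discriminant scores Z1, Z2
print("number of discriminant functions:", r)
print("correlation of Z1 and Z2 scores:", np.corrcoef(scores.T)[0, 1])
print("dot product of the two axes:", axes[:, 0] @ axes[:, 1])
```

With three groups at most G-1 = 2 eigenvalues can be nonzero, and the scores on the two axes come out uncorrelated even though the dot product of the axes themselves is generally not zero.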


Transparency 17.1 Salespeople’s New Account Activities

X1 = Number of Calls on New Accounts
X2 = Percentage of Calls with Advance Appointments
X3 = Telephone Calls Made to Prospects
X4 = Number of New Accounts Visited

Grand Prize Winners (W)

No.   Salesperson     X1      X2      X3      X4
 1    RMB            130      62     148      42
 2    ALB            122      70     186      44
 3    BCC             89      68     171      32
 4    JJC            104      58     135      40
 5    EDC            116      40     160      36
 6    WPD            100      65     151      30
 7    RHH             85      66     183      42
 8    BEK            113      59     130      25
 9    DAK            108      52     163      41
10    JJN            116      48     154      48
11    MYS             99      57     188      32
12    PJS             78      70     190      40
13    CET            106      61     157      38
14    LLV             94      58     173      29
15    LMW             98      64     137      36
      Mean           103.9    59.9   161.7    37.0

Transparency 17.1 (Continued) Salespeople’s New Account Activities

Consolation Prize Winners (C)

No.   Salesperson     X1      X2      X3      X4
 1    JGB            105      39     155      45
 2    RAB             86      60     140      33
 3    HAF             64      48     132      36
 4    PPD            104      36     119      29
 5    BCE            102      53     143      41
 6    ASG             73      62     128      30
 7    WLH             94      51     152      36
 8    LHL             59      64     130      28
 9    RJL             84      31     102      32
10    WFM             91      47      96      35
11    JRP             83      40      87      30
12    EJS             95      42     114      28
13    VES             68      52     123      26
14    HMT            101      51      98      24
15    BMT             89      39     117      33
      Mean            86.5    47.7   122.4    32.4

Transparency 17.1 (Continued) Salespeople’s New Account Activities

Unsuccessful Salespeople (U)

No.   Salesperson     X1      X2      X3      X4
 1    RBB             80      23      69      32
 2    GEB             47      42      74      33
 3    ADC             26      37     132      20
 4    JFC             94      24      68      26
 5    LDE             57      32      94      23
 6    JFH             38      41      83      28
 7    JCH             29      52      96      22
 8    RPF             48      24      73      26
 9    APL             57      36      82      28
10    HAL             39      37      98      21
11    ERM             51      38     117      24
12    WRR             40      42     112      22
13    JTS             64      21      67      29
14    JMV             35      32      78      25
15    HEY             51      29      81      26
      Mean            50.4    34.0    88.3    25.7

OVERALL
      Mean            80.3    47.2   124.1    31.7
      Std. Dev.       15.91    8.97   19.99    5.37

Transparency 17.2 Scatter Plot of Selected Two-Variable Combinations

[Figure, Panel A: X2 (Percentage of Calls with Advance Appointments) plotted against X1 (Number of Calls on New Accounts); legend: Grand Prize Winner, Consolation Prize Winner]

[Figure, Panel B: X2 (Percentage of Calls with Advance Appointments) plotted against X1 (Number of Calls on New Accounts); legend: Grand Prize Winner]

[Figure, Panel C: X4 (Number of New Accounts Visited) plotted against X1 (Number of Calls on New Accounts); legend: Grand Prize Winner]

Transparency 17.3 Scatter Plot Containing New Axis

[Figure: X2 (Percentage of Calls with Advance Appointments) plotted against X1 (Number of Calls on New Accounts), with a new axis drawn through the points; legend: Grand Prize Winner]



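A Geometric View of Multiple-Group Discriminant Analysis

Identify the minimum number of discriminant functions that provides the best discrimination.

Examining the differences between groups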


Transparency 17.4 Calculated Discriminant Scores for Grand Prize and Consolation Prize Winners Using the Discriminant Function Y = .058X1 + .063X2 + .034X3 - .032X4

Grand Prize Winners (W)

No.   Salesperson     X1      X2      X3      X4      Y
 1    RMB            130      62     148      42     15.2
 2    ALB            122      70     186      44     16.5
 3    BCC             89      68     171      32     14.3
 4    JJC            104      58     135      40     13.0
 5    EDC            116      40     160      36     13.6
 6    WPD            100      65     151      30     14.1
 7    RHH             85      66     183      42     14.0
 8    BEK            113      59     130      25     13.9
 9    DAK            108      52     163      41     13.8
10    JJN            116      48     154      48     13.5
11    MYS             99      57     188      32     14.8
12    PJS             78      70     190      40     14.2
13    CET            106      61     157      38     14.2
14    LLV             94      58     173      29     14.1
15    LMW             98      64     137      36     13.3
      Mean           103.9    59.9   161.7    37.0

Transparency 17.4 (Continued) Calculated Discriminant Scores for Grand Prize and Consolation Prize Winners Using the Discriminant Function Y = .058X1 + .063X2 + .034X3 - .032X4

Consolation Prize Winners (C)

No.   Salesperson     X1      X2      X3      X4      Y
 1    JGB            105      39     155      45     12.4
 2    RAB             86      60     140      33     12.5
 3    HAF             64      48     132      36     10.1
 4    PPD            104      36     119      29     11.4
 5    BCE            102      53     143      41     12.8
 6    ASG             73      62     128      30     11.6
 7    WLH             94      51     152      36     12.7
 8    LHL             59      64     130      28     11.0
 9    RJL             84      31     102      32      9.3
10    WFM             91      47      96      35     10.4
11    JRP             83      40      87      30      9.4
12    EJS             95      42     114      28     11.2
13    VES             68      52     123      26     10.6
14    HMT            101      51      98      24     11.7
15    BMT             89      39     117      33     10.6
      Mean            86.5    47.7   122.4    32.4
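The Y column is a weighted sum of the four activity measures. The sketch below applies the rounded weights printed in the transparency title to the first salesperson of each group; the function name `discriminant_score` is hypothetical. Because the printed weights are rounded, the results (about 15.13 and 12.38) differ slightly from the tabulated 15.2 and 12.4.

```python
# Rounded weights from the title: Y = .058*X1 + .063*X2 + .034*X3 - .032*X4
WEIGHTS = (0.058, 0.063, 0.034, -0.032)

def discriminant_score(x1, x2, x3, x4):
    """Weighted sum of the four new-account activity measures."""
    return sum(w * x for w, x in zip(WEIGHTS, (x1, x2, x3, x4)))

rmb = discriminant_score(130, 62, 148, 42)   # first grand prize winner
jgb = discriminant_score(105, 39, 155, 45)   # first consolation prize winner
print(round(rmb, 2), round(jgb, 2))
```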

Transparency 17.5 Predicted Group Membership Using the Simple Classification Rule

Grand Prize Winners (W)

                         Differences from Mean of:
       Discriminant      First Group            Second Group           Predicted Group
No.    Score Yi          Yi - Yw = Yi - 14.2    Yi - Yc = Yi - 11.2    Membership
 1     15.2               1.0                    4.0                   W
 2     16.5               2.3                    5.3                   W
 3     14.3               0.1                    3.1                   W
 4     13.0              -1.2                    1.8                   W
 5     13.6              -0.6                    2.4                   W
 6     14.1              -0.1                    2.9                   W
 7     14.0              -0.2                    2.8                   W
 8     13.9              -0.3                    2.7                   W
 9     13.8              -0.4                    2.6                   W
10     13.5              -0.7                    2.3                   W
11     14.8               0.6                    3.6                   W
12     14.2               0.0                    3.0                   W
13     14.2               0.0                    3.0                   W
14     14.1              -0.1                    2.9                   W
15     13.3              -0.9                    2.1                   W

Transparency 17.5 (Continued) Predicted Group Membership Using the Simple Classification Rule

Consolation Prize Winners (C)

                         Differences from Mean of:
       Discriminant      First Group            Second Group           Predicted Group
No.    Score Yi          Yi - Yw = Yi - 14.2    Yi - Yc = Yi - 11.2    Membership
 1     12.4              -1.8                    1.2                   C
 2     12.5              -1.7                    1.3                   C
 3     10.1              -4.1                   -1.1                   C
 4     11.4              -2.8                    0.2                   C
 5     12.8              -1.4                    1.6                   W
 6     11.6              -2.6                    0.4                   C
 7     12.7              -1.5                    1.5                   W*
 8     11.0              -3.2                   -0.2                   C
 9      9.3              -4.9                   -1.9                   C
10     10.4              -3.8                   -0.8                   C
11      9.4              -4.8                   -1.8                   C
12     11.2              -3.0                    0.0                   C
13     10.6              -3.6                   -0.6                   C
14     11.7              -2.5                    0.5                   C
15     10.6              -3.6                   -0.6                   C

*The assignments were actually carried out using more significant digits in the calculations of discriminant scores. While the calculations to one decimal place suggest this case is equidistant from the two group means, it actually is slightly closer to the mean for the grand prize winners.
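The simple classification rule assigns each salesperson to the group whose mean discriminant score is closer. A minimal sketch, using the one-decimal group means 14.2 and 11.2 from the column headings; the function name `predict_group` is hypothetical, and returning `None` for an exact tie is an assumption of this sketch, since the source resolves ties by carrying more digits.

```python
MEAN_W = 14.2  # mean score of the grand prize winners (Yw)
MEAN_C = 11.2  # mean score of the consolation prize winners (Yc)

def predict_group(score):
    """Return 'W' or 'C' for the nearer group mean, or None on an exact tie."""
    dist_w = abs(score - MEAN_W)
    dist_c = abs(score - MEAN_C)
    if dist_w < dist_c:
        return "W"
    if dist_c < dist_w:
        return "C"
    return None

for y_i in (15.2, 12.8, 12.4, 9.3):
    print(y_i, predict_group(y_i))
```

A consolation prize winner with a score of 12.8 is therefore predicted to be a grand prize winner, which is one of the two misclassifications in the table.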

Transparency 17.6 Matrix of Actual Versus Predicted Group Membership

                              Predicted Classification:
Actual Classification         Grand Prize Winner    Consolation Prize Winner    Total
Grand prize winner                   15                       0                  15
Consolation prize winner              2                      13                  15
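The matrix can be summarized by the proportion of salespeople classified correctly (often called the hit rate), which is the sum of the diagonal divided by the total. A small sketch with the counts from Transparency 17.6; rows are actual groups, columns are predicted groups, and the function name `hit_rate` is hypothetical.

```python
def hit_rate(matrix):
    """Share of cases on the diagonal of an actual-by-predicted matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Rows: actual W, actual C. Columns: predicted W, predicted C.
two_group = [[15, 0],
             [2, 13]]
print(f"{hit_rate(two_group):.1%}")
```

Here 28 of the 30 salespeople fall on the diagonal.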

Transparency 17.7 Discriminant Scores for Each Salesperson and Group to Which Salesperson Would Be Predicted to Belong Employing the Function Y = .064X1 + .079X2 + .027X3 - .002X4

       Grand Prize Winners (W)       Consolation Prize Winners (C)     Unsuccessful Salespeople (U)
No.    Name    Score    Predicted    Name    Score    Predicted        Name    Score    Predicted
 1     RMB     17.25    W            JGB     14.00    C                RBB      8.80    U
 2     ALB     18.40    W            RAB     14.06    C                GEB      8.32    U
 3     BCC     15.73    W            HAF     11.47    C                ADC      8.18    U
 4     JJC     14.91    W            PPD     12.74    C                JFC      9.76    U
 5     EDC     14.94    W            BCE     14.60    W                LDE      8.73    U
 6     WPD     15.66    W            ASG     13.06    C                JFH      7.92    U
 7     RHH     15.63    W            WLH     14.18    W                JCH      8.58    U
 8     BEK     15.45    W            LHL     12.38    C                RPF      6.94    U
 9     DAK     15.45    W            RJL     10.59    U                APL      8.71    U
10     JJN     15.39    W            WFM     12.14    C                HAL      8.08    U
11     MYS     15.97    W            JRP     10.83    U                ERM      9.45    U
12     PJS     15.69    W            EJS     12.50    C                WRR      8.93    U
13     CET     15.88    W            VES     11.81    C                JTS      7.56    U
14     LLV     15.32    W            HMT     13.17    C                JMV      6.88    U
15     LMW     15.06    W            BMT     11.95    C                HEY      7.75    U

Transparency 17.8 Matrix of Predicted by Actual Classifications for Salespeople

                              Predicted Classification:
Actual Classification         Grand Prize Winner    Consolation Prize Winner    Unsuccessful Salesperson    Total
Grand prize winner                   15                       0                          0                   15
Consolation prize winner              2                      11                          2                   15
Unsuccessful salesperson              0                       0                         15                   15


Perceptual Mapping

Alternative Approaches to Develop Perceptual Maps

Nonattribute-based approaches (MDS)
    Similarity data
    Preference data

Attribute-based approaches
    Factor analysis
    Discriminant analysis

Nonattribute-based Approaches

Respondent Tasks
    Similarity judgment of various stimuli

Advantages
    It does not require a predefined attribute set.
    It allows respondents to use their own criteria to form similarity judgments.
    It is suitable when perception is not decomposable in terms of attributes.

Nonattribute-based Approaches (Cont.)

Disadvantages
    Naming of dimensions.
    Individual variations are at most a stretching of the common measure.
    Criteria used by respondents are sensitive to the stimulus set.
    It requires special computer programs.
    It is limited by the number of objects.

Attribute-based Approaches

Respondent Tasks
    Rating stimuli on pre-specified attributes

Advantages
    It facilitates naming the attributes.
    It is easier to cluster respondents into groups.
    It is easy and inexpensive to use.
    Computer programs are available.

Attribute-based Approaches (Cont.)

Disadvantages
    It requires a relatively complete set of attributes.
    It assumes that an individual’s overall perception of a stimulus is decomposable into his or her reactions to various attributes.

Transparency 17.36 Perceptual Map of Automobiles

[Figure: Perceptual Map, Brand Images]

Labels at the ends of the axes:
    Has a Touch of Class; a Car to Be Proud to Own; Distinctive Looking
    Very Practical; Provides Good Gas Mileage; Affordable
    Has Spirited Performance; Appeals to Young People; Fun to Drive; Sporty Looking
    Conservative Looking; Appeals to Older People

Brands plotted: Pontiac, BMW, Porsche, Lincoln, Cadillac, Mercedes, Chrysler, Buick, Chevrolet, Oldsmobile, Datsun, Toyota, VW, Ford, Dodge, Plymouth

Source: Chrysler Corp. John Koten, “Car Makers Use ‘Image’ Map as Tool to Position Products,” The Wall Street Journal, March 22, 1984, p. 31. Reprinted by permission of The Wall Street Journal, © Dow Jones & Company, Inc., 1984. All Rights Reserved.

Transparency 17.37 Respondent Similarity Judgments

Triangular matrix of similarity judgments for the 45 pairs of cameras A through J. Cell values, in the order they appear in the source table:

28  5 24 32 37 31 27 16  7 29 21  1  3 36 43
40 30 17 26 34 22 20 23  2 18 25  7 13 12 15
 4 35 42 39 33 41 45 44 38  9 10 19  6 14 11

Transparency 17.38 Similarity Judgments of Four Objects

                  Object (J)
Object (I)     1      2      3      4
    1                 1      4      2
    2                        3      6
    3                               5
    4

Transparency 17.39 Arbitrary Plot of Four Objects in Two Space

[Figure: objects 1 to 4 plotted in an arbitrary two-dimensional configuration; horizontal axis 1 to 8, vertical axis 1 to 6]

Transparency 17.40 Distance Versus Judgments

Actual Distances Among Objects in Arbitrary Configuration

                  Object (J)
Object (I)     1      2      3      4
    1                2.0    5.9    6.1
    2                       5.1    7.1
    3                              5.2
    4

Original Input Similarity Judgments

                  Object (J)
Object (I)     1      2      3      4
    1                 1      4      2
    2                        3      6
    3                               5
    4
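One way to read the two matrices together is to check whether the distances respect the rank order of the judgments. The sketch below counts the cases where a pair judged more similar lies farther apart than a pair judged less similar. It assumes that rank 1 marks the most similar pair and that the triangular layout lists pairs (1,2), (1,3), (1,4), (2,3), (2,4), (3,4); the function name `order_violations` is hypothetical.

```python
from itertools import combinations

# Keys are object pairs (I, J) from Transparency 17.40.
similarity_rank = {(1, 2): 1, (1, 3): 4, (1, 4): 2,
                   (2, 3): 3, (2, 4): 6, (3, 4): 5}
distance = {(1, 2): 2.0, (1, 3): 5.9, (1, 4): 6.1,
            (2, 3): 5.1, (2, 4): 7.1, (3, 4): 5.2}

def order_violations(ranks, dists):
    """Return (a, b) where a is judged more similar than b yet is farther apart."""
    by_rank = sorted(ranks, key=ranks.get)   # most similar pair first
    return [(a, b) for a, b in combinations(by_rank, 2) if dists[a] > dists[b]]

violations = order_violations(similarity_rank, distance)
print(len(violations), violations)
```

The arbitrary configuration produces four such violations, so the distances are not yet a monotonic function of the judgments.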

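Transparency 17.41 Plot of Input Similarities Versus Actual Distances

[Figure: actual distances D_IJ plotted against the input similarity judgments for each pair of objects I, J; one axis runs from 0 to 7 and the other from 0 to 8]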

Transparency 17.42 Flow Diagram of a Multidimensional Scaling Analysis

1. Input similarity/dissimilarity data.
2. Set dimensions = 1.
3. Determine initial configuration.
4. Compute distances between objects.
5. Compare computed distances against hypothetical distances to make function monotonic.
6. Is goodness of fit better this iteration than last iteration?
   Yes: modify existing configuration and return to step 4.
   No: go to step 7.
7. Is the number of dimensions less than or equal to maximum?
   Yes: increase number of dimensions by one and return to step 3.
   No: STOP.

Transparency 17.43 Stress Index for Bank Similarity Judgments

[Figure: stress (vertical axis, 0 to .2) plotted against number of dimensions (horizontal axis, 1 to 4)]

Transparency 17.44 Multidimensional Scaling Map of Similarity Judgments

[Figure: two-dimensional map with axes I and II showing the positions of objects A through I]

Transparency 17.45 Key Decisions When Conducting a Multidimensional Scaling Analysis

Specify the Products and/or Brands to Be Used

Specify How the Similarities Judgments Are to Be Secured and Construct the Stimuli

Decide on Whether Judgments Will Be Aggregated and, If Yes, How

Collect the Judgments and Analyze Them to Generate the Perceptual Map

Name the Resulting Dimensions

Transparency 17.46 Simple Example to Interpret

[Figure: two-dimensional MDS map with axes I and II showing Diet Coke, Diet Pepsi, 7up, Sprite, Pepsi, and Coke]

Transparency 17.47 “Vector Fitting”: Objective Means of Interpreting Dimensions

Dependent variable: diet drinks (1) vs. not-diet drinks (0)

Stimulus       Dep. Var.      Predictor Vars.
(Brand)        (Attribute)    (Coordinates in MDS Space)
7up             0              -.7     -.3
Sprite          0              -.6     -.4
Diet Coke       1               .4      .5
Diet Pepsi      1               .5      .4
Pepsi           0               .6     -.4
Coke            0               .5     -.5

Dependent variable: mean rating of sweetness (1 to 7, 7 = very)

Stimulus       Dep. Var.      Predictor Vars.
(Brand)        (Attribute)    (Coordinates in MDS Space)
7up            3.5             -.7     -.3
Sprite         4.8             -.6     -.4
Diet Coke      2.7              .4      .5
Diet Pepsi     5.4              .5      .4
Pepsi          6.3              .6     -.4
Coke           3.2              .5     -.5

Transparency 17.48 Multiple Regressions

R2 tells you how helpful the attribute is in interpreting the MDS solution.

Betas tell you where to put the head of the “attribute vector” (the direction along which the property is at its maximum):

diet = b1 (dim1) + b2 (dim2) = .03 dim1 + .75 dim2 (R2 = .85)
caramel colored = .82 dim1 + .09 dim2 (R2 = .89)
sweetness = .15 dim1 + .07 dim2 (R2 = .21)
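Vector fitting amounts to an ordinary least-squares regression of each attribute on the MDS coordinates. The sketch below runs that regression with NumPy on the six brands tabulated for the vector-fitting example. The function name `fit_vector` and the inclusion of an intercept are assumptions of this sketch, and the coefficients obtained from these six rounded coordinates need not match the ones quoted on the slide.

```python
import numpy as np

# Brand order: 7up, Sprite, Diet Coke, Diet Pepsi, Pepsi, Coke.
coords = np.array([[-0.7, -0.3], [-0.6, -0.4], [0.4, 0.5],
                   [0.5, 0.4], [0.6, -0.4], [0.5, -0.5]])
diet = np.array([0, 0, 1, 1, 0, 0], dtype=float)        # 1 = diet drink
sweetness = np.array([3.5, 4.8, 2.7, 5.4, 6.3, 3.2])    # mean rating, 1 to 7

def fit_vector(attribute, coords):
    """Regress an attribute on the map coordinates; return (betas, R2)."""
    design = np.column_stack([np.ones(len(coords)), coords])  # intercept + dims
    beta, *_ = np.linalg.lstsq(design, attribute, rcond=None)
    residual = attribute - design @ beta
    ss_total = ((attribute - attribute.mean()) ** 2).sum()
    r_squared = 1.0 - (residual ** 2).sum() / ss_total
    return beta[1:], r_squared

diet_betas, diet_r2 = fit_vector(diet, coords)
sweet_betas, sweet_r2 = fit_vector(sweetness, coords)
print("diet:     ", np.round(diet_betas, 2), round(diet_r2, 2))
print("sweetness:", np.round(sweet_betas, 2), round(sweet_r2, 2))
```

For the diet attribute the weight on the second dimension dominates, so the fitted vector points almost straight along dimension II, which is the reading the slide suggests for that axis.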

Transparency 17.49 Example Overlaying Vectors

[Figure: the MDS map of Diet Coke, Diet Pepsi, 7up, Sprite, Pepsi, and Coke on axes I and II, overlaid with attribute vectors labeled sweet, caramel, and dietness]

Transparency 17.50 Preferences: “Ideal Points”

[Figure: the MDS map of Diet Coke, Diet Pepsi, 7up, Sprite, Pepsi, and Coke on axes I and II, with ideal points marked for Consumer 1, Consumer 2, and Consumer 3]

Transparency 17.51 MDS Models that Allow for Individual Differences

Subject weights:

Subject    Dim I    Dim II
S1          .7       .3
S2           0        1
S3          .5       .5

[Figure, Subject 1: map on axes I and II showing DC, DP, 7, S, P, and C]

[Figure, Subject 2: map showing 7up, Coke, S,P, DC, and DP]
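A subject's weights can be read as stretching or shrinking each dimension of the common map. The sketch below assumes the weighted Euclidean distance commonly used for individual-differences scaling, d_ij(k) = sqrt(sum over dimensions a of w_ka (x_ia - x_ja)^2); the slides do not state the formula. It also borrows the brand coordinates from the vector-fitting table as the common map, which is a further assumption, and the function name `weighted_distance` is hypothetical.

```python
import numpy as np

# Common-space coordinates (assumed: taken from the vector-fitting table).
coords = {"7up": (-0.7, -0.3), "Sprite": (-0.6, -0.4),
          "Diet Coke": (0.4, 0.5), "Diet Pepsi": (0.5, 0.4),
          "Pepsi": (0.6, -0.4), "Coke": (0.5, -0.5)}
# Subject weights on dimensions I and II from the transparency.
subject_weights = {"S1": (0.7, 0.3), "S2": (0.0, 1.0), "S3": (0.5, 0.5)}

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between brands a and b for one subject."""
    diff = np.array(coords[a]) - np.array(coords[b])
    return float(np.sqrt(np.sum(np.array(weights) * diff ** 2)))

for subject, w in subject_weights.items():
    print(subject, round(weighted_distance("Sprite", "Pepsi", w), 3))
```

Subject 2 puts no weight on dimension I, so Sprite and Pepsi, which differ only on that dimension, end up at zero distance, whereas Subject 1 sees them as far apart.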

Multidimensional Scaling

J. B. Kruskal, Psychometrika, Vol. 29, No. 1, March 1964

The purpose is to represent the n objects by n points in a t-dimensional space.

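$\delta_{ij}$: dissimilarity between $i$ and $j$. The $M$ dissimilarities are arranged in increasing order:

$$\delta_{ij}^{(1)} < \delta_{ij}^{(2)} < \delta_{ij}^{(3)} < \cdots < \delta_{ij}^{(M)}$$

Point $i$ has coordinates $x_i = (x_{i1}, x_{i2}, \ldots, x_{it})$, and the distance between points $i$ and $j$ is

$$d_{ij} = \sqrt{\sum_{s=1}^{t} (x_{is} - x_{js})^2}$$

The fitted values $\hat{d}_{ij}$ must be monotonic in the dissimilarities:

$$\hat{d}_{ij}^{(1)} \le \hat{d}_{ij}^{(2)} \le \cdots \le \hat{d}_{ij}^{(M)} \qquad \text{(Mon)}$$

Given a configuration $(x_1, x_2, \ldots, x_n)$,

$$\text{Stress} = S(x_1, x_2, \ldots, x_n) = \min_{\hat{d}_{ij} \text{ satisfying (Mon)}} \sqrt{\frac{\sum_{i<j} (d_{ij} - \hat{d}_{ij})^2}{\sum_{i<j} d_{ij}^2}}$$

Stress in $t$ dimensions:

$$S^* = \min_{\text{all } t\text{-dimensional configurations}} S(x_1, x_2, \ldots, x_n)$$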
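To make the stress of a given configuration concrete, the sketch below evaluates it for the four-object example of Transparency 17.40. The fitted values come from monotone (isotonic) least-squares regression of the distances on the rank order of the judgments, computed with the pool-adjacent-violators procedure. The function names are hypothetical, and rank 1 is assumed to mark the most similar pair.

```python
import math

def monotone_fit(values):
    """Least-squares nondecreasing fit to a sequence (pool adjacent violators)."""
    blocks = []                      # each block is [sum, count]
    for v in values:
        blocks.append([v, 1])
        # Merge blocks while the previous block mean exceeds the current one.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted

def kruskal_stress(distances, fitted):
    """sqrt( sum (d - d_hat)^2 / sum d^2 ) over all pairs."""
    numerator = sum((d - f) ** 2 for d, f in zip(distances, fitted))
    denominator = sum(d ** 2 for d in distances)
    return math.sqrt(numerator / denominator)

# Distances ordered by judgment rank 1..6:
# pairs (1,2), (1,4), (2,3), (1,3), (3,4), (2,4).
ranked_distances = [2.0, 6.1, 5.1, 5.9, 5.2, 7.1]
fitted = monotone_fit(ranked_distances)
stress = kruskal_stress(ranked_distances, fitted)
print([round(f, 3) for f in fitted], round(stress, 3))
```

The four middle distances are pooled to their common mean of 5.575, and the stress of the arbitrary configuration comes out at roughly 0.065. An MDS program would then move the points to reduce this value.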

Acceptable Level of Stress

Stress     Goodness of Fit
0.2        Poor
0.1        Fair
0.05       Good
0.025      Excellent
0.0        Perfect
