multidimensional scaling mds · 2018. 11. 21. · multidimensional scaling mds and other...


Multidimensional scaling (MDS)

And other permutation-based analyses

MDS Aim

• Graphical representation of dissimilarities between objects in as few dimensions (axes) as possible


• Graphical representation is termed an “ordination” in ecology

• Axes of graph represent new variables which are summaries of original variables

Haynes & Quinn (unpublished)

• Four sites along Morwell River

– site 1 upstream from planned sewage outfall

– sites 2, 3 and 4 downstream

– site 3 below fish farm

• Abundance of all species of invertebrates recorded from 3 stations at each site


• 12 objects (sampling units):

– 4 sites by 3 stations at each site

• 94 variables (species)

Do invertebrate communities (or assemblages) differ between stations and sites?

– Is Site 1 different from rest?

Multidimensional scaling

1. Set up a raw data matrix

Site/sample   Species 1   Species 2   Species 3   Species 4   Species 5   etc.
S11           54          0           0           5           0
S12           37          1           0           4           0
S13           68          2           0           2           0
S21           60          0           0           0           1
S22           47          0           0           2           0
S23           60          0           0           0           0
etc.


2. Calculate a dissimilarity (Bray-Curtis) matrix

       S11    S12    S13    S21    S22    S23    etc.
S11    .000
S12    .203   .000
S13    .666   .652   .000
S21    .216   .331   .759   .000
S22    .328   .410   .796   .191   .000
S23    .336   .432   .796   .183   .054   .000
etc.
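
As a minimal sketch of step 2, the Bray-Curtis dissimilarity can be computed directly from the raw data matrix. The snippet assumes Python with NumPy (the slides themselves use PRIMER) and uses only the five species columns shown in step 1, so its values will not match the matrix above, which was calculated from all 94 species. The function name `bray_curtis_matrix` is hypothetical.

```python
import numpy as np

# Rows: stations S11, S12, S13, S21, S22, S23; columns: species 1 to 5.
# Only the five species printed in the raw data matrix are used here.
labels = ["S11", "S12", "S13", "S21", "S22", "S23"]
abund = np.array([
    [54, 0, 0, 5, 0],
    [37, 1, 0, 4, 0],
    [68, 2, 0, 2, 0],
    [60, 0, 0, 0, 1],
    [47, 0, 0, 2, 0],
    [60, 0, 0, 0, 0],
], dtype=float)

def bray_curtis_matrix(y):
    """Return the square matrix of Bray-Curtis dissimilarities between rows of y."""
    # Sum over species of |y_ij - y_ik| for every pair of rows (j, k).
    num = np.abs(y[:, None, :] - y[None, :, :]).sum(axis=2)
    # Sum over species of (y_ij + y_ik) for every pair of rows.
    den = (y[:, None, :] + y[None, :, :]).sum(axis=2)
    return num / den

bc = bray_curtis_matrix(abund)
print(labels)
print(np.round(bc, 3))
```

The result is symmetric with zeros on the diagonal, so only the lower triangle needs to be reported, as in the matrix above.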

3. Decide on number of dimensions (axes) for the ordination:

– suspected number of underlying ecological gradients

– match distances between objects on plot and dissimilarities between objects as closely as possible

– more dimensions means better match

– usually between 2 and 4 dimensions


4. Arrange objects (e.g. sampling units) initially on ordination plot in chosen number of dimensions

– starting configuration

– usually generated randomly

[Figure: Starting configuration. Ordination plot of Axis I against Axis II (both from -2 to 2), with stations coded by site (Site 1 to Site 4).]


5. Compare distances between objects on ordination plot and Bray-Curtis dissimilarities between objects

– strength of relationship measured by Kruskal’s stress value

– measures “badness of fit” so lower values indicate better match

– plot of distances against dissimilarities is called a Shepard plot
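
For reference, Kruskal's stress (formula 1) is usually written as below. The notation is an assumption rather than something taken from the slides: $d_{jk}$ is the distance between objects j and k on the ordination plot, and $\hat{d}_{jk}$ is the distance predicted from the dissimilarities by the regression fitted through the Shepard plot (a monotonic regression in non-metric MDS).

```latex
\text{Stress} = \sqrt{\frac{\sum_{j<k} \left(d_{jk} - \hat{d}_{jk}\right)^{2}}{\sum_{j<k} d_{jk}^{2}}}
```

A perfect match between plot distances and fitted distances gives a stress of 0, which is why lower values indicate a better ordination.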

[Figure: Starting configuration (Axis I against Axis II, stations coded by site) and its Shepard plot (Distance against Dissimilarity). Stress = 0.394]

6. Move objects on ordination plot iteratively by method of steepest descent

– each step improves match between dissimilarities and distances between objects on ordination plot

– lowers stress value
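
The iterative idea in steps 4 to 6 can be sketched in a few lines of Python with NumPy. This is a simplified, metric stand-in for the non-metric procedure in the slides: it minimises raw stress (squared differences between plot distances and the dissimilarities themselves) by plain gradient descent, rather than fitting a monotonic regression first. The step size, the random seed and all function names are assumptions made for illustration, and the stress values it prints are not comparable with those on the slides.

```python
import numpy as np

# Bray-Curtis dissimilarities for S11, S12, S13, S21, S22, S23 (lower triangle from step 2).
lower = [
    [0.000],
    [0.203, 0.000],
    [0.666, 0.652, 0.000],
    [0.216, 0.331, 0.759, 0.000],
    [0.328, 0.410, 0.796, 0.191, 0.000],
    [0.336, 0.432, 0.796, 0.183, 0.054, 0.000],
]
n = len(lower)
delta = np.zeros((n, n))
for j, row in enumerate(lower):
    delta[j, : len(row)] = row
delta = delta + delta.T  # fill in the upper triangle

def distances(x):
    """Euclidean distances between all pairs of points in configuration x."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def raw_stress(x, delta):
    """Sum of squared differences between plot distances and dissimilarities."""
    d = distances(x)
    upper = np.triu_indices(len(x), k=1)
    return ((d[upper] - delta[upper]) ** 2).sum()

def gradient(x, delta):
    """Gradient of raw_stress with respect to the coordinates in x."""
    d = distances(x)
    ratio = np.zeros_like(d)
    np.divide(d - delta, d, out=ratio, where=d > 0)  # skip zero distances
    diff = x[:, None, :] - x[None, :, :]
    return 2.0 * (ratio[:, :, None] * diff).sum(axis=1)

rng = np.random.default_rng(1)
start = rng.uniform(-2, 2, size=(n, 2))  # step 4: random starting configuration
config = start.copy()
step = 1.0 / (2 * n)                     # assumed step size
history = [raw_stress(config, delta)]
for _ in range(50):                      # step 6: move objects downhill
    config = config - step * gradient(config, delta)
    history.append(raw_stress(config, delta))

print(f"raw stress: start {history[0]:.3f}, after 50 iterations {history[-1]:.3f}")
```

Because the starting configuration is random, different seeds can end in different configurations, which is one reason MDS software is normally run from several random starts.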

[Figure: Configuration after 20 iterations (Axis I against Axis II) and its Shepard plot (Distance against Dissimilarity). Stress = 0.119]

7. Final configuration

• further moving of objects on ordination plot cannot improve match between dissimilarities and distances

• stress as low as possible

[Figure: Final configuration after 50 iterations (Axis I against Axis II) and its Shepard plot (Distance against Dissimilarity). Stress = 0.069]

Iteration history

Iteration   Stress
1           0.394
2           0.368
3           0.357
4           0.351
...         ...
20          0.119
...         ...
49          0.069
50          0.069

Stress of final configuration is 0.069

How low should stress be?

Clarke (1993) suggests:

• > 0.20 is basically random

• < 0.15 is good

• < 0.10 is ideal

– configuration is close to actual dissimilarities


How many dimensions?

• Increasing no. of dimensions above 4 usually offers little reduction in stress

• 2 or 3 dimensions usually adequate to get good fit (i.e. low stress)

• 2 dimensions straightforward to plot

Lonhart (unpublished data)

• Effects of depth and piling location on marine fouling assemblage

• Two pilings, four sides of each panel, two depths, sampled 4 times

• 40 species in total recorded


• MDS to examine the effects of piling location and depth on the invertebrate community

– Does the community vary as a function of depth?

– Does the community vary as a function of piling location?

– Does the effect of depth on the community vary as a function of piling location?

• Bray-Curtis dissimilarity

• Non-metric MDS

• ANOSIM / PERMANOVA

• SIMPER

MDS Plot

[Figure: MDS plots of the fouling assemblage (Transform: square root; Resemblance: S17 Bray-Curtis similarity; 2D Stress: 0.17), with samples coded in turn by Date (2_22_2010, 3_05_2010, 3_18_2010, 4_02_2010), by Piling (8381, 8179), by Depth (Shallow, Deep) and by Piling and Depth combined (8381 Shallow, 8381 Deep, 8179 Shallow, 8179 Deep).]

Comparing groups in MDS

• 2 Piling locations

• 2 Depths

• 8 replicates per treatment combination (4 sides x 2 samples)

• Are sites significantly different in species composition?

• Is there an ANOVA-like equivalent for MDS?


Procedure 1: Analysis of similarities - ANOSIM

• Uses (dis)similarity matrix

• Because dissimilarities are not normally distributed, uses ranks of pairwise dissimilarities

• Because dissimilarities are not independent of each other, uses randomization test rather than usual significance testing procedure

• Calculates its own test statistic (called R) from the rank dissimilarities and generates the null distribution of R by randomization

• Available through PRIMER package

Lonhart ANOSIM

• Depth effect R = 0.305, P = 0.001 so reject H0

- Significant difference between depths

• Piling location R = 0.761, P = 0.001 so reject H0

- Significant difference between pilings


PERMANOVA (permutational multivariate ANOVA)

• Run just like an ANOVA

• Sums of squares can be partitioned in multivariate space (based on distances to multidimensional centroids)

• P-values obtained by permutation rather than from the F distribution
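
A minimal one-way sketch of this idea, assuming Python with NumPy: sums of squares are computed from squared pairwise dissimilarities, which is equivalent to using distances to group centroids, and the P-value comes from shuffling group labels. The data are the six stations from the Bray-Curtis matrix in the Morwell River example, grouped as Site 1 and Site 2 purely for illustration; the function names are hypothetical and this is not the factorial Depth x Piling design analysed in the slides.

```python
import numpy as np

# Bray-Curtis dissimilarities among S11, S12, S13, S21, S22, S23.
d = np.array([
    [0.000, 0.203, 0.666, 0.216, 0.328, 0.336],
    [0.203, 0.000, 0.652, 0.331, 0.410, 0.432],
    [0.666, 0.652, 0.000, 0.759, 0.796, 0.796],
    [0.216, 0.331, 0.759, 0.000, 0.191, 0.183],
    [0.328, 0.410, 0.796, 0.191, 0.000, 0.054],
    [0.336, 0.432, 0.796, 0.183, 0.054, 0.000],
])
groups = np.array([1, 1, 1, 2, 2, 2])  # Site 1 vs Site 2

def pseudo_f(d, groups):
    """One-way pseudo-F computed from a dissimilarity matrix."""
    n = len(groups)
    labels = np.unique(groups)
    upper = np.triu_indices(n, k=1)
    ss_total = (d[upper] ** 2).sum() / n
    ss_within = 0.0
    for g in labels:
        idx = np.flatnonzero(groups == g)
        sub = d[np.ix_(idx, idx)]
        sub_upper = np.triu_indices(len(idx), k=1)
        ss_within += (sub[sub_upper] ** 2).sum() / len(idx)
    ss_among = ss_total - ss_within
    a = len(labels)
    return (ss_among / (a - 1)) / (ss_within / (n - a))

def permanova(d, groups, n_perm=999, seed=0):
    """Return (pseudo-F, permutation P-value) by shuffling group labels."""
    rng = np.random.default_rng(seed)
    f_obs = pseudo_f(d, groups)
    f_perm = np.array([pseudo_f(d, rng.permutation(groups)) for _ in range(n_perm)])
    p = (np.sum(f_perm >= f_obs) + 1) / (n_perm + 1)
    return f_obs, p

f_obs, p_value = permanova(d, groups)
print(f"pseudo-F = {f_obs:.3f}, P(perm) = {p_value:.3f}")
```

With only six objects there are very few distinct ways to reassign the labels, so the P-value here cannot be small; real designs such as the one in the slides have far more possible permutations.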

PERMANOVA table of results

Source           df    SS         MS       Pseudo-F   P(perm)   Unique perms
Depth            1     14884      14884    15.67      0.001     999
Piling           1     70878      70878    74.623     0.001     999
Depth x Piling   1     10558      10558    11.116     0.001     999
Res              124   1.1778E5   949.82
Total            127   2.141E5

[Figure: MDS plots of the fouling assemblage coded by Piling (8381, 8179), by Depth (Shallow, Deep) and by Piling and Depth combined (Transform: square root; Resemblance: S17 Bray-Curtis similarity; 2D Stress: 0.17).]

Interaction effect

Which variables (species) are most important?

• For MDS-type analyses, three methods:

– correlate individual variables (species abundances) with axis scores, like PCA loadings

– SIMPER (similarity percentages) to determine which species contribute most to Bray-Curtis dissimilarity

– CA (Correspondence Analysis) to simultaneously ordinate objects and species (biplots)


SIMPER (similarity percentages)

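Bray-Curtis dissimilarity between objects j and k:

$$\delta_{jk} = \frac{\sum_{i=1}^{p} |y_{ij} - y_{ik}|}{\sum_{i=1}^{p} (y_{ij} + y_{ik})}$$

Note: $\sum$ is summing over each species, 1 to p.

The contribution of species i is:

$$\delta_i = \frac{|y_{ij} - y_{ik}|}{\sum_{i=1}^{p} (y_{ij} + y_{ik})}$$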

SIMPER results – comparing deep depths between pilings

Groups 8381Deep & 8179Deep
Average dissimilarity = 77.47

Species                   Av.Abund     Av.Abund     Av.Diss   Diss/SD   Contrib%   Cum.%
                          (8381Deep)   (8179Deep)
Watersipora, live         11.34        0            11.58     1.63      14.94      14.94
Detritus                  3.28         13.34        10.7      1.7       13.81      28.75
Corynactis californica    0            7.53         7.68      1.15      9.92       38.67
Burgundy crust            0            6.66         6.79      1.04      8.77       47.44
Diplosoma listerianum     6.97         2.41         6.4       0.8       8.26       55.7
CaCO3                     9.13         8.16         6.06      1.43      7.82       63.52
Dead bryozoan             5.41         0.19         5.35      1.16      6.91       70.42
Orange bryozoan           5            0            5.1       0.83      6.59       77.01
Dead Watersipora          4.88         0            4.97      0.9       6.42       83.43
Ascidia ceratodes         0.09         4.91         4.95      0.83      6.39       89.82
Rhynchozoon (brwn bryo)   1            1.44         2.04      0.67      2.64       92.45


Are these results interpretable graphically?

[Figure: two MDS plots of the fouling assemblage (Transform: square root; Resemblance: S17 Bray-Curtis similarity; 2D Stress: 0.17). One has a legend for Watersipora, live (scale 0, 10, 20, 30); the other codes samples by Piling and Depth (8381 Shallow, 8381 Deep, 8179 Shallow, 8179 Deep).]

Linking biota MDS to environmental variables

• Are differences in species composition related to differences in environmental variables?

• Correlate MDS axis scores with environmental variables

• BIO-ENV procedure - correlates dissimilarities from biota with dissimilarities from environmental variables


BIO-ENV procedure

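[Diagram: for the same samples, species abundances give a Bray-Curtis dissimilarity matrix and subsets of the environmental variables give a Euclidean dissimilarity matrix; the two matrices are compared by rank correlation (Spearman or weighted Spearman).]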

BIO-ENV correlations

• Exploratory rather than hypothesis testing procedure.

• Tries to find best combination of environmental variables, i.e. combination most correlated with biotic dissimilarities.

• A priori chosen correlations can be tested with RELATE procedure - randomization test of correlation.
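
The randomization test behind RELATE can be sketched as follows, assuming Python with NumPy: the Spearman rank correlation between two dissimilarity matrices is compared with the values obtained after randomly reordering the samples in one of them. This is a hedged illustration, not PRIMER's implementation. The salinity values and the biotic matrix below are made up for the example, and the function names are hypothetical.

```python
import numpy as np

def average_ranks(v):
    """Ranks starting at 1, with tied values sharing their average rank."""
    _, inverse, counts = np.unique(v, return_inverse=True, return_counts=True)
    last = np.cumsum(counts)  # highest rank within each block of ties
    return (last - (counts - 1) / 2.0)[inverse]

def matrix_spearman(a, b):
    """Spearman correlation between the upper triangles of two square matrices."""
    upper = np.triu_indices(len(a), k=1)
    return np.corrcoef(average_ranks(a[upper]), average_ranks(b[upper]))[0, 1]

def relate(bio, env, n_perm=999, seed=0):
    """Return (rho, P-value) from permuting sample labels of the env matrix."""
    rng = np.random.default_rng(seed)
    rho_obs = matrix_spearman(bio, env)
    count = 0
    for _ in range(n_perm):
        order = rng.permutation(len(env))
        # Reorder rows and columns together so the matrix stays symmetric.
        if matrix_spearman(bio, env[np.ix_(order, order)]) >= rho_obs:
            count += 1
    return rho_obs, (count + 1) / (n_perm + 1)

# Hypothetical data: 10 stations along a made-up salinity gradient.
rng = np.random.default_rng(42)
salinity = np.linspace(1.0, 9.0, 10)
env = np.abs(salinity[:, None] - salinity[None, :])  # distances on one variable
noise = rng.uniform(0.0, 1.0, size=env.shape)
bio = env + (noise + noise.T) / 2.0                  # biotic matrix that tracks salinity
np.fill_diagonal(bio, 0.0)

rho, p_value = relate(bio, env)
print(f"Rho = {rho:.3f}, P = {p_value:.3f}")
```

Counting the observed arrangement as one of the permutations is why the smallest attainable P-value with 999 permutations is 0.001.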


Example

• Bristol Channel zooplankton

• 57 stations

• 25 species sampled

• Salinity measures taken at the same time

• Question: is the zooplankton community related to salinity?

Zooplankton community data

Community Matrix

NMDS plot

[Figure: Non-metric MDS of Bristol Channel zooplankton stations, labelled by station number (Transform: square root; Resemblance: S17 Bray-Curtis similarity; 2D Stress: 0.1).]

NMDS plot with Salinity Bubbles

[Figure: Non-metric MDS of Bristol Channel zooplankton stations, labelled by station number and overlaid with Salinity bubbles (legend 1.8, 4.2, 6.6, 9) (Transform: square root; Resemblance: S17 Bray-Curtis similarity; 2D Stress: 0.1).]

Salinity data

Salinity Matrix

RELATE procedure

[Diagram: for the same samples, species abundances give a Bray-Curtis dissimilarity matrix and all of the environmental variables give a Euclidean dissimilarity matrix; the two matrices are compared by rank correlation (Spearman or weighted Spearman).]

RELATE the matrices

[Figure: RELATE for Bristol Channel salinity group (1-9 in increasing salinity). Histogram of permuted Rho values (Frequency from 0 to 56; Rho axis from -0.2 to 0.8), with the sample statistic marked at 0.741.]

Parameters

Correlation method: Spearman rank
Sample statistic (Rho): 0.741
Significance level of sample statistic: 0.1 % (≤ 0.001)
Number of permutations: 999
Number of permuted statistics greater than or equal to Rho: 0

A more complicated example – linking multivariate biological data to multivariate environmental data

• Biological data: Nematode species (>100) abundance at 19 sites in Exe estuary

• Environmental:

– MPD: median particle diameter

– % Org: Percent organic matter

– WT: water table depth

– H2S: depth of Hydrogen sulfide layer

– Sal: interstitial salinity

– Ht: Intertidal range

Environmental NMDS

[Figure: Non-metric MDS of the 19 Exe estuary sites based on the environmental variables, labelled by site number (Normalise; Resemblance: D1 Euclidean distance; 2D Stress: 0.06).]

Biological NMDS

[Figure: Non-metric MDS of Exe nematodes (19 sites averaged over season), labelled by site number (Transform: square root; Resemblance: S17 Bray-Curtis similarity; 2D Stress: 0.05).]

Linking Environment to Community

[Figure: four copies of the Exe nematode non-metric MDS (19 sites averaged over season; Transform: square root; Resemblance: S17 Bray-Curtis similarity; 2D Stress: 0.05), each overlaid with one environmental variable: Med Part Diam (legend 0.2 to 2), Interstit Salinity (19 to 100), Dep Water Tab (2 to 20) and %Organics (0.8 to 8).]

Formally

• First: use RELATE to determine relationship between the biological community and the environmental variables

RELATE procedure

[Diagram: for the same samples, species abundances give a Bray-Curtis dissimilarity matrix and all of the environmental variables give a Euclidean dissimilarity matrix; the two matrices are compared by rank correlation (Spearman or weighted Spearman).]

[Figure: RELATE for Exe estuary. Histogram of permuted Rho values (Frequency from 0 to 67; Rho axis from -0.2 to 0.8), with the sample statistic marked at 0.791.]

Parameters

Correlation method: Spearman rank
Sample statistic (Rho): 0.791
Significance level of sample statistic: 0.1 % (<0.001)
Number of permutations: 999
Number of permuted statistics greater than or equal to Rho: 0

Formally

• First: use RELATE to determine relationship between the biological community and the environmental variables

• Second: use BIO-ENV to determine best fit of environmental variables to biological community

BIO-ENV procedure

[Diagram: for the same samples, species abundances give a Bray-Curtis dissimilarity matrix and subsets of the environmental variables give a Euclidean dissimilarity matrix; the two matrices are compared by rank correlation (Spearman or weighted Spearman).]

Select best model

Best result for each number of variables

No.Vars   Corr.   Selections
1         0.676   Dep H2S layer
2         0.777   Dep H2S layer, Interstit Salinity
3         0.816   Med Part Diam, Dep H2S layer, Interstit Salinity
4         0.811   Med Part Diam, Dep H2S layer, %Organics, Interstit Salinity
5         0.804   Med Part Diam, Dep H2S layer, Shore height, %Organics, Interstit Salinity
6         0.791   Med Part Diam, Dep Water Tab, Dep H2S layer, Shore height, %Organics, Interstit Salinity
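
The subset search that produces a table of this kind can be sketched as below, assuming Python with NumPy: every combination of normalised environmental variables is turned into a Euclidean distance matrix and scored by its Spearman rank correlation with the biotic dissimilarity matrix. The data are randomly generated placeholders (the biotic matrix is deliberately built from `var_a` and `var_c`), not the Exe estuary data, and all variable and function names are hypothetical.

```python
import itertools
import numpy as np

def average_ranks(v):
    """Ranks starting at 1, with tied values sharing their average rank."""
    _, inverse, counts = np.unique(v, return_inverse=True, return_counts=True)
    last = np.cumsum(counts)
    return (last - (counts - 1) / 2.0)[inverse]

def matrix_spearman(a, b):
    """Spearman correlation between the upper triangles of two square matrices."""
    upper = np.triu_indices(len(a), k=1)
    return np.corrcoef(average_ranks(a[upper]), average_ranks(b[upper]))[0, 1]

def euclidean(x):
    """Euclidean distances between all pairs of rows of x."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def bioenv(bio, env_z, names):
    """Best-correlated subset of (already normalised) variables for each subset size."""
    n_vars = env_z.shape[1]
    best = {}
    for size in range(1, n_vars + 1):
        scored = []
        for cols in itertools.combinations(range(n_vars), size):
            rho = matrix_spearman(bio, euclidean(env_z[:, list(cols)]))
            scored.append((float(rho), tuple(names[c] for c in cols)))
        best[size] = max(scored)  # highest correlation for this size
    return best

# Placeholder data: 12 samples, 4 made-up environmental variables.
rng = np.random.default_rng(7)
names = ["var_a", "var_b", "var_c", "var_d"]
env = rng.normal(size=(12, 4))
env_z = (env - env.mean(axis=0)) / env.std(axis=0, ddof=1)  # normalise each variable
noise = rng.uniform(0.0, 0.05, size=(12, 12))
bio = euclidean(env_z[:, [0, 2]]) + (noise + noise.T) / 2.0  # driven by var_a and var_c
np.fill_diagonal(bio, 0.0)

best = bioenv(bio, env_z, names)
for size, (rho, chosen) in best.items():
    print(size, round(rho, 3), ", ".join(chosen))
```

As in the table above, adding variables beyond the best subset need not raise the correlation, because unrelated variables dilute the match with the biotic pattern. The exhaustive search grows quickly with the number of variables, so it is only practical for small sets like this one.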

Linking Environment to Community – model results

[Figure: the Exe nematode non-metric MDS (19 sites averaged over season; Transform: square root; Resemblance: S17 Bray-Curtis similarity; 2D Stress: 0.05) overlaid in turn with the three variables of the best model, Med Part Diam (legend 0.2 to 2), Interstit Salinity (19 to 100) and Dep H2S layer (2 to 20), shown alongside the table of best results for each number of variables.]


Procedure 1: Analysis of similarities - ANOSIM


Null hypothesis

Average of rank dissimilarities between objects within groups = average of rank dissimilarities between objects between groups

$$\bar{r}_B = \bar{r}_W$$

No difference in species composition between groups

Within group dissimilarities

Between group dissimilarities


Test statistic

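R is based on the average of rank dissimilarities between objects between groups minus the average of rank dissimilarities between objects within groups:

$$R = \frac{\bar{r}_B - \bar{r}_W}{M/2}, \quad \text{where } M = \frac{n(n-1)}{2}$$

and n is the total number of objects.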

• R between -1 and +1.

• Use randomization test to generate probability

distribution of R when H0 is true.
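
As a hedged illustration of the statistic and its randomization test, the sketch below (Python with NumPy assumed) ranks the pairwise dissimilarities, averages the ranks within and between groups, and recomputes R after shuffling the group labels. It reuses the six stations from the Bray-Curtis matrix in the Morwell River example, grouped as Site 1 and Site 2 for illustration only; the function names are hypothetical and this is not PRIMER's code.

```python
import numpy as np

# Bray-Curtis dissimilarities among S11, S12, S13, S21, S22, S23.
d = np.array([
    [0.000, 0.203, 0.666, 0.216, 0.328, 0.336],
    [0.203, 0.000, 0.652, 0.331, 0.410, 0.432],
    [0.666, 0.652, 0.000, 0.759, 0.796, 0.796],
    [0.216, 0.331, 0.759, 0.000, 0.191, 0.183],
    [0.328, 0.410, 0.796, 0.191, 0.000, 0.054],
    [0.336, 0.432, 0.796, 0.183, 0.054, 0.000],
])
groups = np.array([1, 1, 1, 2, 2, 2])  # Site 1 vs Site 2

def average_ranks(v):
    """Ranks starting at 1, with tied values sharing their average rank."""
    _, inverse, counts = np.unique(v, return_inverse=True, return_counts=True)
    last = np.cumsum(counts)
    return (last - (counts - 1) / 2.0)[inverse]

def anosim_r(d, groups):
    """ANOSIM R = (mean between-group rank - mean within-group rank) / (M / 2)."""
    n = len(groups)
    upper = np.triu_indices(n, k=1)
    ranks = average_ranks(d[upper])
    within = groups[upper[0]] == groups[upper[1]]  # True for within-group pairs
    r_w = ranks[within].mean()
    r_b = ranks[~within].mean()
    m = n * (n - 1) / 2
    return (r_b - r_w) / (m / 2)

def anosim(d, groups, n_perm=999, seed=0):
    """Return (R, P-value), with the null distribution from shuffled labels."""
    rng = np.random.default_rng(seed)
    r_obs = anosim_r(d, groups)
    r_perm = np.array([anosim_r(d, rng.permutation(groups)) for _ in range(n_perm)])
    p = (np.sum(r_perm >= r_obs) + 1) / (n_perm + 1)
    return r_obs, p

r_obs, p_value = anosim(d, groups)
print(f"R = {r_obs:.3f}, P = {p_value:.3f}")
```

For these six stations the within-group ranks average 5.5 and the between-group ranks about 9.67, giving R of about 0.556. With so few objects the number of distinct label arrangements is tiny, so the P-value cannot be small.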


SIMPER (similarity percentages)

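Bray-Curtis dissimilarity between objects j and k:

$$\delta_{jk} = \frac{\sum_{i=1}^{p} |y_{ij} - y_{ik}|}{\sum_{i=1}^{p} (y_{ij} + y_{ik})}$$

Note: $\sum$ is summing over each species, 1 to p.

The contribution of species i is:

$$\delta_i = \frac{|y_{ij} - y_{ik}|}{\sum_{i=1}^{p} (y_{ij} + y_{ik})}$$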

Which species discriminate groups of objects?

• Calculate average contribution $\delta_i$ over all pairs of objects between groups

– larger values indicate species contribute more to group differences

• Calculate standard deviation of $\delta_i$

– smaller values indicate species contribution is consistent across all pairs of objects

• Calculate ratio of average $\delta_i$ / SD($\delta_i$)

– larger values indicate good discriminating species between 2 groups
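
These three calculations can be sketched as follows, assuming Python with NumPy. The example uses only the five species columns printed in the Morwell River raw data matrix, with Site 1 and Site 2 as the two groups, so it illustrates the arithmetic rather than reproducing any result in the slides. Contributions are left as proportions (PRIMER reports Bray-Curtis on a 0 to 100 scale), the use of the sample standard deviation (`ddof=1`) is an assumption, and the function name `simper` is hypothetical.

```python
import numpy as np

species = ["Species 1", "Species 2", "Species 3", "Species 4", "Species 5"]
site1 = np.array([[54, 0, 0, 5, 0],
                  [37, 1, 0, 4, 0],
                  [68, 2, 0, 2, 0]], dtype=float)   # S11, S12, S13
site2 = np.array([[60, 0, 0, 0, 1],
                  [47, 0, 0, 2, 0],
                  [60, 0, 0, 0, 0]], dtype=float)   # S21, S22, S23

def simper(group_a, group_b):
    """Average, SD and ratio of per-species contributions over between-group pairs."""
    # |y_ij - y_ik| for every pair (j in group_a, k in group_b) and every species i.
    num = np.abs(group_a[:, None, :] - group_b[None, :, :])
    # Sum over species of (y_ij + y_ik) for every pair.
    den = (group_a[:, None, :] + group_b[None, :, :]).sum(axis=2)
    contrib = (num / den[:, :, None]).reshape(-1, group_a.shape[1])  # one row per pair
    avg = contrib.mean(axis=0)
    sd = contrib.std(axis=0, ddof=1)
    ratio = np.full_like(avg, np.nan)
    np.divide(avg, sd, out=ratio, where=sd > 0)  # leave NaN where SD is zero
    return avg, sd, ratio

avg, sd, ratio = simper(site1, site2)
for i in np.argsort(avg)[::-1]:  # largest average contribution first
    print(f"{species[i]}: Av.Diss = {avg[i]:.4f}, Diss/SD = {ratio[i]:.2f}, "
          f"Contrib% = {100 * avg[i] / avg.sum():.1f}")
```

The average contributions sum to the average Bray-Curtis dissimilarity between the two groups, which is what allows each species to be expressed as a percentage contribution.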