Multivariate analysis in community ecology Gerry Quinn Deakin University


Post on 25-Jan-2017


Page 1: Gerry Quinn - Multivariate analysis in community ecology

Multivariate analysis in community ecology

Gerry Quinn
Deakin University

Page 2: Gerry Quinn - Multivariate analysis in community ecology

Data sets in community ecology
• Multivariate abundance data
• Sampling or experimental units
  – plots, cores, panels, quadrats …
  – usually in hierarchical spatial or temporal structure
• Abundances recorded for multiple taxa in each unit
  – simple counts, densities, % cover, presence-absence …
• Environmental variables recorded in each unit
  – pH, salinity, temperature, nutrients, sediment load, elevation …

Page 3: Gerry Quinn - Multivariate analysis in community ecology

Typical aims
• Examine spatial and temporal patterns in species composition
  – assemblage/community “structure”, more than simply biodiversity (e.g. taxon richness/diversity)
  – test formal hypotheses about spatial and temporal differences in composition
• Relate patterns to unit (or higher) level environmental predictors
  – typical linear model type question
• Determine which taxa are most important in “driving” the patterns
  – which taxa most typify differences across spatial and temporal hierarchies

Page 4: Gerry Quinn - Multivariate analysis in community ecology

Why multivariate?
• Individual taxa of main interest
  – concern over multiple univariate hypothesis testing (Type 1 error rates)
  – referees and editors won’t accept paper with 50-100 ANOVAs
• Community (assemblage) structure interest
  – recognition of limitations of univariate biodiversity (richness, diversity, evenness) measures
  – hypotheses about community/assemblage composition
• Most multivariate analyses in community ecology also incorporate univariate (individual taxa or environmental predictors) models
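
The concern about Type 1 error rates with 50-100 separate ANOVAs can be made concrete with a little arithmetic. The sketch below assumes the tests are independent and each uses the same significance level (taxa in real communities are rarely independent, so treat the numbers as indicative only); the function names are illustrative.

```python
def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test threshold that keeps the family-wise rate at or below alpha."""
    return alpha / n_tests


# One ANOVA per taxon, for 1, 50 and 100 taxa.
for n_taxa in (1, 50, 100):
    rate = familywise_error_rate(n_taxa)
    print(f"{n_taxa:3d} tests: family-wise rate {rate:.3f}, "
          f"Bonferroni per-test alpha {bonferroni_alpha(n_taxa):.4f}")
```

With 50 tests at alpha = 0.05 the chance of at least one spurious "significant" taxon is already above 0.9 under these assumptions.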

Page 5: Gerry Quinn - Multivariate analysis in community ecology

Forest bird communities
• Does bird community composition vary between forest types?
  – 5 types (box-ironbark, river redgum, Gippsland manna gum etc.) plus mixed
• Maximum bird abundance (across 4 seasons)
  – 102 species across 37 sites
• Mac Nally (1989)

beechworthonline.com.au

Swift parrot - Wikipedia

Page 6: Gerry Quinn - Multivariate analysis in community ecology
Page 7: Gerry Quinn - Multivariate analysis in community ecology

Estuary nematode communities
• Does nematode community composition vary between sites and with environmental variables?
• Nematode abundance (6 seasonal “replicates”)
  – 182 species across 19 “sites”
• Environmental variables
  – 6 (sediment particle size, % organic matter etc.) at each site
• Clarke & Warwick (1993)

Exe estuary - Wikipedia

Marine nematodes
http://www.ipm.iastate.edu

Page 8: Gerry Quinn - Multivariate analysis in community ecology

Site  Sp1  Sp2  Sp3  Sp4  Sp5  Sp6  Sp7  etc.  Part size  WTab depth  H2S depth  Shore height  % organ  Salinity
1     90   187  90   23   123  28   5    etc.  0.06       0           2.167      4             6.43     24.833
2     54   158  66   51   22   10   5    etc.  0.06       0           3.183      3             7.06     22.833
3     47   117  28   97   9    26   3    etc.  0.06       0           1.817      2             7.99     17.833
4     52   27   6    72   1    3    1    etc.  0.06       0           2.02       1             7.15     16.2
5     0    0    0    0    0    0    0    etc.  1.275      20          20         5             0.24     10
6     5    0    0    0    0    0    1    etc.  0.562      3.417       2.95       4             0.37     76.6
7     8    14   145  0    0    4    120  etc.  0.06       0           2.167      3             1.98     76
8     3    18   35   0    0    17   94   etc.  0.177      0           2.683      2             2.22     81.2
9     51   2    206  0    0    1    76   etc.  0.06       0           2.66       1             5.88     71.2
10    0    0    0    0    0    0    0    etc.  0.451      20          20         5             0.09     10
11    0    0    0    0    2    0    0    etc.  0.205      4.417       7.25       4             0.39     88
12    0    0    0    0    0    0    0    etc.  0.528      20          20         3             0.09     88
13    0    0    0    0    0    0    0    etc.  0.598      20          20         2             0.06     88
14    1    0    0    0    0    0    0    etc.  0.769      0           20         1             0.09     88.5
15    0    0    0    0    0    0    0    etc.  0.468      14.917      20         5             0.06     89
16    0    0    0    0    0    0    0    etc.  0.837      6.333       20         4             0.04     90.875
17    0    0    0    0    0    0    0    etc.  0.797      6.75        20         3             0.06     91.667
18    0    0    0    0    0    0    0    etc.  1.141      3.667       20         2             0.07     89.4
19    1    0    0    0    0    0    0    etc.  0.223      0           20         1             0.09     90.833

Page 9: Gerry Quinn - Multivariate analysis in community ecology

Impact assessment
• Does sessile marine animal community composition vary between sewage impact and control sites?
  – 3 control and 1 impact locations
  – 4 randomly chosen times
  – replicate sites and photographic quadrats at each location
• Percent cover of 58 taxa
• Classical “beyond” BACI design
  – split-plot type linear model
• Terlizzi et al (2005)

http://www.conisma.it/total/t_aim.html

Page 10: Gerry Quinn - Multivariate analysis in community ecology
Page 11: Gerry Quinn - Multivariate analysis in community ecology

Three broad approaches
• Eigenanalyses
  – distance measure implied
• Distance-based analyses
  – distance measure explicit and user-selected
• Multi-species linear models
  – combine taxon-specific univariate (linear) models
  – no distance measure required

Page 12: Gerry Quinn - Multivariate analysis in community ecology

Eigenanalysis methods
• Principal components analysis (PCA)
  – implied Euclidean distance
• Correspondence analysis (CA)
  – implied chi-square distance
• Canonical correspondence analysis (CCA/CANOCO)
  – constrains ordination based on linear modelling with environmental variables
• Strengths
  – biplots of sample and species ordinations
  – CCA provides measures of fit with covarying environmental variables
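
To make the "implied distance" point concrete, the sketch below computes the Euclidean distance (implied by PCA) and the chi-square distance (implied by CA) between samples. It uses the nine species counts for three fragments from the rodent table (Florida, Sandmark, 34street) purely as an illustration; the function names are assumptions, and this is a sketch of the two distance formulas, not of either ordination.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two raw abundance vectors (PCA's implied distance)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def chi_square_distances(table):
    """Chi-square distance between every pair of rows of a samples x taxa count table.

    Rows are converted to profiles (proportions of the row total) and each
    squared difference is divided by the taxon's share of the grand total,
    so rarer taxa receive more weight. Every row needs a non-zero total.
    """
    grand_total = sum(sum(row) for row in table)
    n_taxa = len(table[0])
    col_mass = [sum(row[k] for row in table) / grand_total for k in range(n_taxa)]
    profiles = [[v / sum(row) for v in row] for row in table]
    n = len(table)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum(
                (profiles[i][k] - profiles[j][k]) ** 2 / col_mass[k]
                for k in range(n_taxa)
                if col_mass[k] > 0  # skip taxa absent from every sample
            )
            dist[i][j] = dist[j][i] = math.sqrt(d2)
    return dist


# Species counts (RRATTUS ... MCALIF) for three fragments in the rodent table.
sites = ["Florida", "Sandmark", "34street"]
counts = [
    [0, 13, 3, 1, 1, 2, 0, 0, 0],
    [0, 1, 57, 65, 9, 16, 8, 2, 3],
    [0, 4, 36, 0, 2, 9, 0, 0, 0],
]
chi2 = chi_square_distances(counts)
for i in range(len(sites)):
    for j in range(i + 1, len(sites)):
        print(sites[i], sites[j],
              round(euclidean(counts[i], counts[j]), 2),
              round(chi2[i][j], 2))
```

Because chi-square distance works on profiles, two samples with proportional counts are at distance zero, whereas their Euclidean distance grows with the difference in total abundance.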

Cajo ter Braak

Page 13: Gerry Quinn - Multivariate analysis in community ecology

Rodents in habitat fragments
Bolger et al (1997)

SITE         AREA  DISTX  AGE  RRATTUS  MMUS  PCALIF  PEREM  RMEGAL  NFUSC  NLEPID  PFALLAX  MCALIF
Florida      25    2100   50   0        13    3       1      1       2      0       0        0
Sandmark     84.1  914    20   0        1     57      65     9       16     8       2        3
34street     53.8  1676   34   0        4     36      0      2       9      0       0        0
Balboaterr   51.8  243    34   0        4     53      1      5       30     0       18       3
Katesess     25.6  822    16   0        2     63      21     11      16     0       0        0
Altalajolla  32.1  121    14   0        1     48      35     12      8      12      2        2
Laurel       9.7   1554   79   0        11    0       0      0       0      0       0        0
Canon        8.7   1219   58   0        16    0       0      0       0      0       0        0
Zena         8.5   2865   36   3        8     0       0      0       0      0       0        0
etc.

Page 14: Gerry Quinn - Multivariate analysis in community ecology

Rodent data – CA biplot

[Figure: CA biplot, Axis 1 vs Axis 2. Species labels: Rr, Mm. Site labels: Acuna, El mac, 54th Street, Oakcrest, Baja, Zena, 32nd Street Sth, Florida, plus a cluster marked “7 fragments”.]

Page 15: Gerry Quinn - Multivariate analysis in community ecology

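Rodent data – CCA triplot

[Figure: CCA triplot, Axis 1 vs Axis 2. Environmental vectors: Area, Dist, Age. Species labels: Rr, Mm, Mc, Pe, Nl. Site labels: Sandmark, Laurel, Spruce, Montanosa, Edison, Balboa, 34th Street, Acuna, El mac, 54th Street.]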

Page 16: Gerry Quinn - Multivariate analysis in community ecology

Issues
• Both methods “compress” distances at ends of axes (so-called arch or horseshoe effect)
  – detrended CA brute force “fix” for this effect
• CA and CCA implicitly up-weight rarer taxa by use of chi-square distance
• No choice of distance measure

[Figure: PCA of bird community data, Comp 1 vs Comp 2]

Page 17: Gerry Quinn - Multivariate analysis in community ecology

Distance-based methods
• Include principal coordinates analysis (PCoA), multidimensional scaling (MDS), generalised dissimilarity modelling (GDM)
• Hypothesis testing
  – compare groups using multi-response permutation procedure (MRPP), analysis of similarities (ANOSIM), permutational multivariate ANOVA (PERMANOVA)
  – relate to environmental variables with Mantel test, BIO-ENV
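
The group-comparison idea behind PERMANOVA can be sketched for the simplest one-way case: partition the sums of squared dissimilarities into among-group and within-group parts, form a pseudo-F ratio, and compare it with the values obtained after shuffling group labels. The code below is a minimal sketch under those assumptions, using Bray-Curtis and made-up counts; it is not the PRIMER/PERMANOVA implementation and does not handle nested or multi-factor designs.

```python
import random


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    total = sum(x) + sum(y)
    return sum(abs(a - b) for a, b in zip(x, y)) / total if total else 0.0


def distance_matrix(samples):
    n = len(samples)
    return [[bray_curtis(samples[i], samples[j]) for j in range(n)] for i in range(n)]


def pseudo_f(dist, groups):
    """One-way pseudo-F from sums of squared pairwise dissimilarities."""
    n = len(groups)
    labels = sorted(set(groups))
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for g in labels:
        idx = [i for i in range(n) if groups[i] == g]
        ss_within += sum(
            dist[i][j] ** 2 for p, i in enumerate(idx) for j in idx[p + 1:]
        ) / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (len(labels) - 1)) / (ss_within / (n - len(labels)))


def permanova(dist, groups, n_perm=999, seed=1):
    """Return (observed pseudo-F, permutation P-value) for a one-way design."""
    rng = random.Random(seed)
    f_obs = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign whole samples to groups at random
        if pseudo_f(dist, shuffled) >= f_obs - 1e-9:
            hits += 1
    return f_obs, (hits + 1) / (n_perm + 1)


# Hypothetical counts: 8 samples x 3 taxa, two forest types of 4 samples each.
samples = [
    [10, 8, 0], [12, 9, 1], [9, 7, 0], [11, 10, 2],
    [1, 0, 9], [0, 2, 12], [2, 1, 10], [0, 0, 8],
]
groups = ["A"] * 4 + ["B"] * 4
f_obs, p_value = permanova(distance_matrix(samples), groups)
print(f"pseudo-F = {f_obs:.2f}, P = {p_value:.3f}")
```

Only the labels are permuted; the dissimilarity matrix is computed once, which is why the same matrix can feed both the ordination and the test.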

John Curtis

Bob Clarke

Marti Anderson

Page 18: Gerry Quinn - Multivariate analysis in community ecology

Distance-based methods
• Strengths
  – flexibility of distance/dissimilarity measure, standardisation and transformation
  – consistency in that ordination and subsequent analyses based on original dissimilarities
  – some dissimilarities can be “decomposed” into relative taxon contributions (similarity percentages - SIMPER)
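
The "decomposition" idea can be illustrated for a single pair of samples: Bray-Curtis is a sum of per-taxon terms, so each taxon's share of the dissimilarity can be read off directly. The sketch below uses the first seven species columns for sites 1 and 2 in the nematode table (the remaining species are ignored, so the numbers are illustrative only). SIMPER itself averages such contributions over all between-group sample pairs, which this sketch does not do.

```python
def bray_curtis_contributions(x, y):
    """Per-taxon terms |x_k - y_k| / sum(x + y); they add up to Bray-Curtis."""
    total = sum(x) + sum(y)
    if total == 0:
        raise ValueError("both samples are empty")
    return [abs(a - b) / total for a, b in zip(x, y)]


taxa = ["Sp1", "Sp2", "Sp3", "Sp4", "Sp5", "Sp6", "Sp7"]
site1 = [90, 187, 90, 23, 123, 28, 5]
site2 = [54, 158, 66, 51, 22, 10, 5]

contrib = bray_curtis_contributions(site1, site2)
bc = sum(contrib)
print(f"Bray-Curtis = {bc:.3f}")
# List taxa from largest to smallest share of the dissimilarity.
for name, c in sorted(zip(taxa, contrib), key=lambda pair: -pair[1]):
    print(f"{name}: {100 * c / bc:5.1f}% of the dissimilarity")
```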

Page 19: Gerry Quinn - Multivariate analysis in community ecology

nMDS – bird community data

Page 20: Gerry Quinn - Multivariate analysis in community ecology

PERMANOVA – bird community data

Page 21: Gerry Quinn - Multivariate analysis in community ecology

nMDS – subtidal reef data

Page 22: Gerry Quinn - Multivariate analysis in community ecology

PERMANOVA – subtidal reef data

Page 23: Gerry Quinn - Multivariate analysis in community ecology

Issues
• Flexible choice of distance/dissimilarity measure
  – ecologists nearly always default to Bray-Curtis
  – does B-C represent ecological differences of interest?
• Modelling dissimilarities tricky
  – appropriate probability distributions – permutation procedures usually applied – robustness for complex models?
  – PERMANOVA only partitions SS not likelihoods
  – lack of independence – rely on permutation robustness
• Limited predictive capacity
• Distance-based methods cannot easily separate location and dispersion effects
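
One way to see why location and dispersion are hard to separate is a small simulation. The sketch below assumes Poisson counts (variance equal to the mean) and uses the mean pairwise Bray-Curtis dissimilarity within a group as a rough stand-in for multivariate dispersion; both choices, and all names and parameter values, are assumptions made for illustration. The two groups differ only in mean abundance, yet their within-group spread in Bray-Curtis space differs markedly.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson variate using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def bray_curtis(x, y):
    total = sum(x) + sum(y)
    return sum(abs(a - b) for a, b in zip(x, y)) / total if total else 0.0


def mean_within_dissimilarity(samples):
    """Average Bray-Curtis over all pairs of samples in one group."""
    n = len(samples)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return sum(bray_curtis(samples[i], samples[j]) for i, j in pairs) / len(pairs)


def simulate_group(rng, mean, n_samples=20, n_taxa=10):
    """Samples x taxa counts where every taxon has the same Poisson mean."""
    return [[poisson(rng, mean) for _ in range(n_taxa)] for _ in range(n_samples)]


rng = random.Random(42)
low = simulate_group(rng, mean=2.0)    # sparse assemblage
high = simulate_group(rng, mean=50.0)  # abundant assemblage

spread_low = mean_within_dissimilarity(low)
spread_high = mean_within_dissimilarity(high)
print(f"mean within-group Bray-Curtis, low-abundance group:  {spread_low:.3f}")
print(f"mean within-group Bray-Curtis, high-abundance group: {spread_high:.3f}")
```

A change in location (mean abundance) therefore shows up as a change in dispersion as well, which is the kind of confounding the slide refers to.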

Page 24: Gerry Quinn - Multivariate analysis in community ecology

Location vs dispersion
• Warton et al (2012)

Page 25: Gerry Quinn - Multivariate analysis in community ecology

Location vs dispersion
• Transformation of abundances may help BUT many taxa have very skewed distributions
• Issue recognised by PRIMER/PERMANOVA
  – “we can consider the homogeneity of dispersions to be included as part of the general null hypothesis of "no differences" among groups being tested by PERMANOVA (even though the focus of the PERMANOVA test is to detect location effects)” (PERMANOVA manual p.22)
• Ongoing debate PRIMER/PERMANOVA vs mvabund

Page 26: Gerry Quinn - Multivariate analysis in community ecology

“Univariate” linear model approach
• Fit separate generalised linear models to each taxon
  – based on negative binomial distribution (over-dispersed counts)
• Testing overall group or covariate effects
  – sum likelihood ratio (LR) tests across taxa
  – use permutation (resampling) methods to generate the null distribution of the summed test statistic
• Relative taxon contribution to patterns
  – LR statistic as measure of strength of individual taxon contributions
• Strengths
  – linear models framework, univariate predictive capacity
  – handles mean-variance relationship
• Issues
  – not an “ordination” method
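
The sum-of-LR idea can be sketched in a few lines if a simpler model is swapped in. The code below fits a one-way Poisson model to each taxon (the slide's approach uses the negative binomial, which needs iterative fitting), sums the per-taxon likelihood ratio statistics, and permutes group labels across whole samples to get a P-value. The Poisson substitution, the plain label permutation, the data and all function names are assumptions for illustration; this is not the mvabund implementation.

```python
import math
import random


def poisson_lr(counts, groups):
    """LR statistic for group means vs a single common mean, one taxon.

    For a one-way Poisson model the fitted means are the sample means,
    so LR = 2 * sum over groups of T_g * log(mean_g / overall mean).
    """
    total = sum(counts)
    if total == 0:
        return 0.0  # taxon never recorded: no information
    overall_mean = total / len(counts)
    lr = 0.0
    for g in sorted(set(groups)):
        vals = [c for c, lab in zip(counts, groups) if lab == g]
        group_total = sum(vals)
        if group_total > 0:
            lr += 2.0 * group_total * math.log((group_total / len(vals)) / overall_mean)
    return lr


def sum_lr_test(abund, groups, n_perm=999, seed=1):
    """Return (summed LR, per-taxon LR list, permutation P-value)."""
    columns = [list(col) for col in zip(*abund)]  # one list of counts per taxon
    per_taxon = [poisson_lr(col, groups) for col in columns]
    observed = sum(per_taxon)
    rng = random.Random(seed)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # whole samples move, so taxa stay correlated
        if sum(poisson_lr(col, shuffled) for col in columns) >= observed - 1e-9:
            hits += 1
    return observed, per_taxon, (hits + 1) / (n_perm + 1)


# Hypothetical counts: 8 samples x 3 taxa; the third taxon does not respond.
abund = [
    [10, 8, 3], [12, 9, 3], [9, 7, 3], [11, 10, 3],
    [1, 0, 3], [0, 2, 3], [2, 1, 3], [0, 0, 3],
]
groups = ["A"] * 4 + ["B"] * 4
observed, per_taxon, p_value = sum_lr_test(abund, groups)
print(f"sum of LR = {observed:.2f}, P = {p_value:.3f}")
print("per-taxon LR:", [round(v, 2) for v in per_taxon])
```

The per-taxon LR values double as the measure of each taxon's contribution to the overall effect, which is the role SIMPER plays in the distance-based approach.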

David Warton

Page 27: Gerry Quinn - Multivariate analysis in community ecology

Methods in community ecology
• Journals searched 2011-2012
  – Austral Ecology
  – Oikos
• Analyses of community/assemblage (species abundance incl. pres-abs data)
  – 62 papers found
• Methods used
  – overall multivariate “philosophy”
  – choice of dissimilarity measure (if relevant)
  – transformation/standardisation used
  – modeling (hypothesis testing) method
  – choice of “ordination” plot

Page 28: Gerry Quinn - Multivariate analysis in community ecology

Multivariate approach

Approach                               # papers  % papers
Eigenanalysis                          15        24
Distance-based                         47        76
Combined taxon-specific linear models  0         0

Page 29: Gerry Quinn - Multivariate analysis in community ecology

Eigenanalyses

Approach                                         # papers
MANOVA / DFA                                     3
PCA                                              0
Correspondence analysis (incl. detrended)        8
Constrained (canonical) correspondence analysis  4

Majority of “ordinations” based on biplots, many with vectors fitted for environmental predictors (triplots)

Page 30: Gerry Quinn - Multivariate analysis in community ecology

Distance-based

Dissimilarity measure  # papers
Bray-Curtis            31
Sorensen               4
Jaccard                2
Gower                  2

Page 31: Gerry Quinn - Multivariate analysis in community ecology

Distance/dissimilarity
• Why do ecologists default to Bray-Curtis?
  – Faith et al (1987 – Vegetatio) strongly recommended B-C as robust indicator of ecological gradients
  – ranges between 0 (identical samples) and 1 (no species in common)
  – handles joint absences (taxa missing from both samples)
  – default in PRIMER/PERMANOVA, PC-ORD
• Does B-C represent patterns ecologists are really interested in?
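
The properties listed above are easy to check numerically. The sketch below uses three made-up samples: A and B share no species, while A and C contain the same species at different abundances. Bray-Curtis reaches its maximum of 1 for the pair with nothing in common, whereas Euclidean distance ranks that pair as the closer one.

```python
import math


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity: sum |x_k - y_k| / sum (x_k + y_k)."""
    total = sum(x) + sum(y)
    return sum(abs(a - b) for a, b in zip(x, y)) / total if total else 0.0


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Hypothetical counts for three taxa.
sample_a = [0, 1, 1]
sample_b = [1, 0, 0]  # no species shared with A
sample_c = [0, 4, 4]  # same species as A, higher abundance

for name, other in (("A vs B", sample_b), ("A vs C", sample_c)):
    print(name,
          "Bray-Curtis =", round(bray_curtis(sample_a, other), 3),
          "Euclidean =", round(euclidean(sample_a, other), 3))
```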

Page 32: Gerry Quinn - Multivariate analysis in community ecology

Distance-based

Approach                    # papers
Comparing groups
ANOSIM / PERMANOVA / dbRDA  24
MRPP                        6
ANOVA on MDS axis scores    2

Majority of “ordinations” based on non-metric MDS, 3 papers used cluster analysis

Page 33: Gerry Quinn - Multivariate analysis in community ecology

Distance-based

Approach                                    # papers
Relating to env predictors
BIO-ENV / Relate                            24
Mantel tests                                6
Regression/correlation with MDS axis scores 2
Generalised dissimilarity modelling         1
Determining taxa driving group differences
SIMPER                                      9

Page 34: Gerry Quinn - Multivariate analysis in community ecology

Transformations
• Transformations of abundances common in ecology
  – log (y+1) or square/fourth root
  – original PRIMER program had 4th root as default!
• Most common reason - to reduce the influence of most abundant (dominant) taxa and give relatively greater weighting to rarer taxa
  – each taxon will be affected differently depending on its distribution?
  – effects on interaction terms almost never considered
• Issues of unequal dispersions almost never raised in ecological papers
  – “it is not at all difficult to understand that transformations will also affect relative dispersions in multivariate space” (PERMANOVA manual p. 97)
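
How strongly each transformation down-weights a dominant taxon can be shown with one rare and one dominant count. The values (1 and 10000 individuals) and the helper names below are made up for illustration.

```python
import math

# The transformations named on the slide, plus the untransformed case.
TRANSFORMS = {
    "none": lambda y: float(y),
    "square root": math.sqrt,
    "fourth root": lambda y: y ** 0.25,
    "log(y+1)": math.log1p,
}


def dominance_ratio(transform, rare, dominant):
    """How many times larger the dominant taxon is than the rare one after transforming."""
    return transform(dominant) / transform(rare)


rare, dominant = 1, 10000  # hypothetical counts in one sample
for name, transform in TRANSFORMS.items():
    print(f"{name:12s} dominant/rare = {dominance_ratio(transform, rare, dominant):10.2f}")
```

The fourth root and log(y+1) shrink a 10000-fold difference to roughly 10-fold, so analyses on such data are driven far more by the rarer taxa than analyses on raw counts.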

Page 35: Gerry Quinn - Multivariate analysis in community ecology

Standardisations
• Invertebrate assemblages in lake (Quinn et al 1996)
• Four site-season combinations
• nMDS on Bray-Curtis
• Four standardisations:
  – None
  – By sample totals
  – By taxa totals
  – Double
• Bray-Curtis vs Canberra

[Figure: nMDS panels titled None, Sample, Taxa, Double]
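
The four standardisations and the two dissimilarities compared on this slide can be sketched as below. Two details are assumptions, since the slide does not define them: "double" is taken to mean standardising by taxon totals and then by sample totals, and Canberra is averaged over the taxa present in at least one of the two samples so that it ranges from 0 to 1. The counts are made up.

```python
def by_sample_totals(abund):
    """Divide each sample (row) by its total so that rows sum to 1."""
    out = []
    for row in abund:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def by_taxon_totals(abund):
    """Divide each taxon (column) by its total so that columns sum to 1."""
    col_totals = [sum(col) for col in zip(*abund)]
    return [[v / t if t else 0.0 for v, t in zip(row, col_totals)] for row in abund]


def double_standardise(abund):
    # Assumption: taxon totals first, then sample totals.
    return by_sample_totals(by_taxon_totals(abund))


def bray_curtis(x, y):
    total = sum(x) + sum(y)
    return sum(abs(a - b) for a, b in zip(x, y)) / total if total else 0.0


def canberra(x, y):
    """Canberra dissimilarity, averaged over taxa present in either sample."""
    terms = [abs(a - b) / (a + b) for a, b in zip(x, y) if a + b > 0]
    return sum(terms) / len(terms) if terms else 0.0


# Hypothetical samples x taxa counts: one dominant taxon and two rarer ones.
abund = [
    [500, 4, 0],
    [450, 0, 3],
    [50, 5, 1],
]
versions = [
    ("none", abund),
    ("sample totals", by_sample_totals(abund)),
    ("taxon totals", by_taxon_totals(abund)),
    ("double", double_standardise(abund)),
]
for name, table in versions:
    print(f"{name:14s} Bray-Curtis = {bray_curtis(table[0], table[1]):.3f}  "
          f"Canberra = {canberra(table[0], table[1]):.3f}")
```

Canberra weights every taxon equally regardless of abundance, so it is much less affected by standardising taxa than Bray-Curtis is.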

Page 36: Gerry Quinn - Multivariate analysis in community ecology

To Bayes or not to Bayes….

Page 37: Gerry Quinn - Multivariate analysis in community ecology

Bayesian approaches
• Detecting transitions between upslope and riparian vegetation
  – management of stream riparian zones
• Based on plant assemblage data (% cover) along transects away from stream
  – pairwise Canberra distances between quadrats along each transect
• Aim - to find the model with the highest probability of being the break between riparian and upslope vegetation
  – usual MCMC estimation of models

Acheron River

Page 38: Gerry Quinn - Multivariate analysis in community ecology

Higher elevation sites

Lower elevation sites

Bayes factors > 10

Mac Nally et al (2008) Plant Ecology

Page 39: Gerry Quinn - Multivariate analysis in community ecology

Bayesian approaches
• Maybe more robust than ML for complex models
  – already being used for variance estimation and confidence (credible) intervals in some mixed model software
• Straightforward(?) under mvabund generalised linear model approach
  – select suitable probability distributions for parameters
  – use uninformative prior if appropriate
• More difficult with distance-based methods
  – but can be adapted (see Mac Nally 2005 Divers & Distr)
  – other examples using MDS and clustering (Oh & Raftery 2007 J Comp Graph Stat) focus on graphical representation (“ordination”)

Page 40: Gerry Quinn - Multivariate analysis in community ecology

Questions for discussion
• Is the confounding of location and dispersion a “fatal” flaw for distance-based measures?
  – more direct comparisons between distance-based and linear model approaches needed
• Comparison to other new methods
  – generalised dissimilarity modelling (Ferrier et al 2007)
  – gradient forests (Ellis et al 2012)
• If distance-based measures are used:
  – what does Bray-Curtis actually measure ecologically?
• What do multivariate models actually predict?

Page 41: Gerry Quinn - Multivariate analysis in community ecology

Questions for discussion
• Should ecologists re-think their use of transformations?
  – NOT just a multivariate issue!
• How do ecologists determine optimum sample sizes for community ecology?
  – power characteristics will vary between taxa in linear models approach
  – power for distance-based permutation analyses?