The Temporal Beta diversity Index (TBI)

Pierre Legendre
Département de sciences biologiques, Université de Montréal

Fushan Short Course, April 28, 2016

1. The Temporal Beta diversity Index (TBI)

Researchers may want to compare observations made at several sites at two different times, T1 and T2.

Question: Are there sites where the differences are so important that they do not seem to belong to the same statistical population as the others?


Some examples –

• In community ecology: examining surveys made at two different times, before and after a disturbance, may indicate sampling units that have been exceptionally affected by the disturbance, e.g. a climatic or anthropogenic event.

• In palaeoecology: comparison of ancient and modern diatom communities preserved in lake sediment may indicate areas where acute anthropogenic processes have singularly changed the land use, hence the environmental conditions in the lakes and their diatom communities.

• In population genetics: comparing local populations of a species observed at two different moments separated by an event of interest may indicate the locations where the event has had exceptionally strong effects.

More examples can be found in other fields of biological and biomedical research.


Data – Mat1 for time T1 and Mat2 for time T2. The data may be of different kinds. In landscape ecology and genetics, the data are community composition or population gene frequencies observed at different sites. To fix ideas, I will use sites and species in this presentation.

The null hypothesis (H0) of the TBI test is that a site (species assemblage) is not exceptionally different between T1 and T2, compared to other sites that have been observed at the same two times. Under H0, the T1-T2 difference at that site belongs to the same statistical population as the differences at the other sites.

The problem – Tests of significance for dissimilarity coefficients are usually not possible because the D values in a dissimilarity matrix are interrelated.

However, the dissimilarity between T1 and T2 for a site is independent of the dissimilarities observed at other sites. So it may be possible to compute a valid test of significance in that case.


Proposed solution – 

1. For each site i, compute the dissimilarity (TBI_i) between vectors T1_i in Mat1 and T2_i in Mat2.

2. Permute the data at random in both matrices and recompute the dissimilarities (TBI) to obtain a p-value for each site. How?

⇒ What are the permutable units?

They are the abundances of the species observed at T1 and T2 at the various sites.

Remember H0: a site (species assemblage) is not exceptionally different between T1 and T2, compared to other sites that have been observed at the same two times.

The species abundances can be permuted at random among the site vectors in Mat1 and Mat2.

(I will not discuss the choice of a dissimilarity function for the data at hand: another talk. For quantitative community composition data, I used five dissimilarity coefficients in the simulations.)
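Step 1 of the proposed solution can be sketched in a few lines. This is a minimal illustration in Python, not the TBI R function: the function names are hypothetical, the two coefficients shown (percentage difference and Hellinger distance) are only two of the possible choices, and the small matrices are made-up data with sites as rows and species as columns.

```python
import math

def percentage_difference(y1, y2):
    """Percentage difference (Bray-Curtis) dissimilarity between two abundance vectors."""
    numerator = sum(abs(a - b) for a, b in zip(y1, y2))
    denominator = sum(y1) + sum(y2)
    return numerator / denominator if denominator > 0 else 0.0

def hellinger_distance(y1, y2):
    """Euclidean distance between square-rooted relative abundances."""
    s1, s2 = sum(y1), sum(y2)
    h1 = [math.sqrt(a / s1) if s1 > 0 else 0.0 for a in y1]
    h2 = [math.sqrt(b / s2) if s2 > 0 else 0.0 for b in y2]
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))

def tbi(mat1, mat2, dissim=percentage_difference):
    """One dissimilarity per site: row i of mat1 (T1) against row i of mat2 (T2)."""
    return [dissim(row1, row2) for row1, row2 in zip(mat1, mat2)]

# Made-up example: 3 sites, 3 species.
mat1 = [[10, 0, 5], [3, 3, 3], [0, 8, 2]]
mat2 = [[10, 0, 5], [0, 0, 9], [1, 7, 2]]
tbi_values = tbi(mat1, mat2)
tbi_hellinger = tbi(mat1, mat2, dissim=hellinger_distance)
```

Only the n within-site dissimilarities are computed, never the full site-by-site matrix, which is what makes the per-site values independent of one another.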


Several possible variants of that permutation procedure – 

2.1  Permute the abundance data at random within each column separately, in the same way in matrices Mat1 and Mat2.

2.2 Permute the abundance data at random in each column separately, independently in matrices Mat1 and Mat2.

2.3 Permute entire rows of Mat1 and Mat2 at random, independently in these two matrices, as in tests of RDA.

⇒ I assessed these variants through numerical simulations.

2.4 (not done yet): If the sites are part of a geographic broad-scale gradient on a map and spatial autocorrelation is considered to be a salient property of the data, each species could be permuted in a toroidal manner to preserve the spatial autocorrelation of the data.
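The three variants 2.1 to 2.3 differ only in what is shuffled and whether the two matrices share the shuffle. The sketch below is one possible reading of those descriptions in Python; the function names are hypothetical, both matrices are assumed to have the same dimensions, and variant 2.4 (toroidal shifts) is not covered.

```python
import random

def permute_columns_same(mat1, mat2, rng):
    """Variant 2.1: each column gets its own random site order, shared by Mat1 and Mat2."""
    n, p = len(mat1), len(mat1[0])
    out1 = [row[:] for row in mat1]
    out2 = [row[:] for row in mat2]
    for j in range(p):
        order = list(range(n))
        rng.shuffle(order)
        for i, src in enumerate(order):
            out1[i][j] = mat1[src][j]
            out2[i][j] = mat2[src][j]
    return out1, out2

def permute_columns_independent(mat1, mat2, rng):
    """Variant 2.2: each column is shuffled separately and independently in Mat1 and Mat2."""
    def shuffle_columns(mat):
        n, p = len(mat), len(mat[0])
        out = [row[:] for row in mat]
        for j in range(p):
            column = [mat[i][j] for i in range(n)]
            rng.shuffle(column)
            for i in range(n):
                out[i][j] = column[i]
        return out
    return shuffle_columns(mat1), shuffle_columns(mat2)

def permute_rows_independent(mat1, mat2, rng):
    """Variant 2.3: entire rows are shuffled, independently in Mat1 and Mat2."""
    out1 = [row[:] for row in mat1]
    out2 = [row[:] for row in mat2]
    rng.shuffle(out1)
    rng.shuffle(out2)
    return out1, out2

# Made-up example: 3 sites, 2 species.
m1 = [[1, 2], [3, 4], [5, 6]]
m2 = [[10, 20], [30, 40], [50, 60]]
rng = random.Random(0)
p1, p2 = permute_columns_same(m1, m2, rng)
q1, q2 = permute_columns_independent(m1, m2, rng)
r1, r2 = permute_rows_independent(m1, m2, rng)
```

In variant 2.1 the T1 and T2 values of a species travel together from site to site, so the T1-T2 pairs are kept intact; variants 2.2 and 2.3 break that pairing.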


Steps of the permutation procedure. Example for method 2.1 – 

a) In each matrix, values are permuted at random, independently in each column. Permutations start with the same random seed in Mat1 and Mat2.

=> It is the differences in values between T1 and T2, for each species, that are permuted among the sites. Justification: we are testing dissimilarities, obtained from the differences between T1 and T2.

b) The first step in the calculation of the Hellinger and chord distances is a data transformation. The transformation is recomputed after permuting the values in the columns of the two matrices. This is necessary to make sure that the permuted data are transformed in the same way as the initial data, with row sums or row lengths of 1. In this way, the D of the permuted data remain comparable to the reference D.

c) TBI distances between Mat1 and Mat2 are recomputed for each site.

d) After a large number of permutations, a p-value is computed for each site i (hence for each TB index D(yi.1, yi.2)), as in any permutation test. A correction for multiple testing should be applied to obtain a correct experimentwise error rate.
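Steps a) to d) can be put together as follows. This is a sketch under stated assumptions, not the TBI R function: it uses the Hellinger distance, method 2.1, a one-tailed test in the upper tail, and the usual permutation p-value (number of permuted values at least as large as the reference value, reference included, divided by the number of permutations plus one). All names and the example data are hypothetical, and the correction for multiple testing is left out.

```python
import math
import random

def hellinger_rows(mat):
    """Hellinger transformation: square root of relative abundances, row by row."""
    out = []
    for row in mat:
        total = sum(row)
        out.append([math.sqrt(v / total) if total > 0 else 0.0 for v in row])
    return out

def site_distances(mat1, mat2):
    """Step c): Hellinger distance between T1 and T2 for each site."""
    h1, h2 = hellinger_rows(mat1), hellinger_rows(mat2)  # step b): transform again
    return [math.sqrt(sum((a - b) ** 2 for a, b in zip(r1, r2)))
            for r1, r2 in zip(h1, h2)]

def tbi_permutation_test(mat1, mat2, nperm=999, seed=42):
    rng = random.Random(seed)
    n, p = len(mat1), len(mat1[0])
    ref = site_distances(mat1, mat2)
    counts = [1] * n  # the reference value counts as one outcome
    for _ in range(nperm):
        perm1 = [row[:] for row in mat1]
        perm2 = [row[:] for row in mat2]
        for j in range(p):  # step a): same site order for column j in both matrices
            order = list(range(n))
            rng.shuffle(order)
            for i, src in enumerate(order):
                perm1[i][j] = mat1[src][j]
                perm2[i][j] = mat2[src][j]
        perm_d = site_distances(perm1, perm2)
        for i in range(n):
            if perm_d[i] >= ref[i] - 1e-12:  # small tolerance for floating-point ties
                counts[i] += 1
    pvals = [c / (nperm + 1) for c in counts]  # step d): one p-value per site
    return ref, pvals

# Made-up data: 10 sites, 6 species, site 0 strongly altered at T2.
data_rng = random.Random(7)
mat1 = [[data_rng.randint(1, 20) for _ in range(6)] for _ in range(10)]
mat2 = [[max(0, v + data_rng.randint(-2, 2)) for v in row] for row in mat1]
mat2[0] = [0, 0, 0, 0, 0, 60]
ref, pvals = tbi_permutation_test(mat1, mat2, nperm=199)
```

The transformation sits inside `site_distances`, so it is recomputed on every permuted pair of matrices, as step b) requires.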


Simulations for type 1 error rate of the 3 permutation methods – 

“A statistical testing procedure is valid if the probability of a type I error (rejecting H0 when true) is no greater than α, the level of significance, for any α.” (Edgington 1995, p. 37).

There were various ways of generating data for which H0 was true.

First simulation method –

• Draw random species-like data from a random lognormal distribution. (Simulation with Poisson-distributed random data produced similar results.)

• n = 20 sites, p = 20 species per matrix.

• In this series of simulations, all data from Mat1 and Mat2 came from the same statistical population. Data for all sites, species and times (T1, T2) were generated in the same way.
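The first simulation method could be set up along these lines. This is a minimal sketch: the lognormal parameters (mu = 0, sigma = 1), the seed and the function names are assumptions, since the slides do not give them. The helper `rejection_rate` shows how a type I error rate would be estimated from p-values obtained on data generated under H0.

```python
import random

def simulate_h0(n_sites=20, n_species=20, mu=0.0, sigma=1.0, seed=1):
    """Mat1 and Mat2 drawn from the same lognormal population, so H0 is true."""
    rng = random.Random(seed)
    def draw():
        return [[rng.lognormvariate(mu, sigma) for _ in range(n_species)]
                for _ in range(n_sites)]
    return draw(), draw()

def rejection_rate(pvalues, alpha=0.05):
    """Proportion of tests rejecting H0 at level alpha."""
    return sum(1 for pv in pvalues if pv <= alpha) / len(pvalues)

mat1, mat2 = simulate_h0()
```

For a valid test in the sense of Edgington's definition, the rejection rate computed over many simulated data sets should not exceed alpha, whatever the value of alpha.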


Second simulation method –

• Draw random species-like data from a random lognormal distribution. (Simulation with Poisson-distributed random data produced similar results.)

• n = 20 sites, p = 20 species per matrix.

• Furthermore, sites in Mat2 differed from those in Mat1 by having gained a number of species that were not found at time T1: 6 species were added to all sites of the Mat2 data matrix, with random values at each site. These species had abundances of 0 in the T1 matrix. So, for each site, T1 and T2 differed, yet none of the TBI differences was exceptionally larger than the others. Hence H0 was still true.

• n = 20 sites, p = 20 species in Mat1, 26 species in Mat2.
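The second simulation method can be sketched the same way. The assumptions are the same as before (lognormal parameters, seed and names are hypothetical); the 6 gained species are stored as extra columns that are all zero in Mat1, so that the two matrices keep the same dimensions.

```python
import random

def simulate_h0_with_gains(n_sites=20, n_species=20, n_new=6,
                           mu=0.0, sigma=1.0, seed=1):
    """Every site gains n_new species at T2; no site changes more than the others."""
    rng = random.Random(seed)
    mat1 = [[rng.lognormvariate(mu, sigma) for _ in range(n_species)] + [0.0] * n_new
            for _ in range(n_sites)]
    mat2 = [[rng.lognormvariate(mu, sigma) for _ in range(n_species + n_new)]
            for _ in range(n_sites)]
    return mat1, mat2

mat1, mat2 = simulate_h0_with_gains()
```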


Simulations to estimate the power of the 3 permutation methods – 

Generate data with strong differences between T1 and T2 at some of the sites.

### Results not tabulated yet. ###

=> Preliminary examination of the simulation results indicates that permutation method 3 (variant 2.3) consistently had lower power than methods 1 and 2 (variants 2.1 and 2.2).


Ecological example 1 – Insecticide treatments in mesocosms

Observations on the abundances of 178 invertebrate species (macroinvertebrates and zooplankton) subjected to insecticide treatments in 12 aquatic mesocosms (“ditches”).

Data from van den Brink & ter Braak (1999); available in package vegan.

12 mesocosms, treated once with insecticide doses of {0, 0, 0, 0, 0.1, 0.1, 0.9, 0.9, 6, 6, 44, 44} μg/L of Chlorpyrifos, were surveyed on eleven occasions.

The invertebrate data were log-transformed abundances, y' = log_e(10y + 1), as in the van den Brink & ter Braak (1999) paper.
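As a small worked example, the transformation applies the natural logarithm to 10y + 1. The function name and the abundance values below are made up for illustration.

```python
import math

def log_transform(y):
    """Return y' = ln(10*y + 1) for a non-negative abundance y."""
    return math.log(10.0 * y + 1.0)

abundances = [0, 1, 12, 250]
transformed = [log_transform(y) for y in abundances]
```

An abundance of 0 stays at 0 after transformation, and large abundances are strongly compressed.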

I compared the community composition data from week 4 (one week after treatment) and week 11 (after full recovery of the fauna).

Questions: which dissimilarity functions and permutation methods produced the most powerful tests?


Comparison of permutation methods using the percentage difference dissimilarity and 999 random permutations for the tests of significance.

Only the results for mesocosms M10 to M12 are reported; all other p-values were 1.0 after correction for multiple testing.

• Permutation method 1 – The last three corrected p-values were 0.230, 0.012 and 0.012; M11 and M12 were significant at α = 0.05.

• Permutation method 2 – The last three corrected p-values were 0.470, 0.012 and 0.044; M11 and M12 were significant at α = 0.05.

• Permutation method 3 – The last three corrected p-values were 1.000, 0.120 and 0.759; M10, M11 and M12 were not significant at α = 0.05.
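The slides do not say which correction for multiple testing was applied. As an illustration only, the sketch below implements the Holm procedure, one common choice; the function name is hypothetical. With 999 permutations the smallest attainable p-value is 1/1000 = 0.001, and with 12 simultaneous tests Holm multiplies the smallest p-value by 12, which gives 0.012.

```python
def holm(pvals):
    """Holm-adjusted p-values, returned in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        value = min(1.0, (m - rank) * pvals[i])
        running_max = max(running_max, value)  # keep adjusted values monotone
        adjusted[i] = running_max
    return adjusted

# 12 tests: one at the smallest attainable p-value, eleven clearly non-significant.
adjusted = holm([0.001] + [0.5] * 11)
```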

Conclusions –

• Permutation method 1 had the highest power to detect changes in species composition,

• whereas permutation method 3 lacked power.

• As expected, the Euclidean distance was inappropriate to study community composition data. More about this in Legendre & De Cáceres (2013).


Ecological example 2 – Alpine grassland data, François Gillet 2016


Material for TBI study of Alpine vegetation –

• 150 quadrats surveyed at T1 = 1990 to 2000 and T2 = 2012

• 325 species of vascular plants.

Data were relative cover data from Braun-Blanquet dominance codes, summing to 100 at each site.

Questions for TBI study –

• Did land-use changes affect plant biodiversity? If so, in what way?

TBI analysis using Hellinger distance identified 22 sites that had significant changes in community composition.

=> In another analysis based on Sørensen index for presence-absence data, species gains (C, in 2012) were a bit higher than species losses (B).
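The split of the Sørensen dissimilarity into losses (B) and gains (C) can be sketched as below. The assumption here is the usual presence-absence notation: a is the number of species present at both times, b the number present at T1 only (losses) and c the number present at T2 only (gains), with dissimilarity (b + c) / (2a + b + c). The function name and the example vectors are hypothetical.

```python
def sorensen_losses_gains(t1, t2):
    """Return (B, C, D): loss and gain components and the total Sørensen dissimilarity."""
    a = sum(1 for x, y in zip(t1, t2) if x > 0 and y > 0)    # present at both times
    b = sum(1 for x, y in zip(t1, t2) if x > 0 and y == 0)   # lost after T1
    c = sum(1 for x, y in zip(t1, t2) if x == 0 and y > 0)   # gained at T2
    denominator = 2 * a + b + c
    if denominator == 0:
        return 0.0, 0.0, 0.0
    losses = b / denominator
    gains = c / denominator
    return losses, gains, losses + gains

# Made-up site: 5 species, one lost and two gained between T1 and T2.
losses, gains, total = sorensen_losses_gains([1, 1, 0, 1, 0], [1, 0, 1, 1, 1])
```

Comparing B and C site by site shows whether the change at a site is dominated by losses or by gains.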


[Figure: TBI, 150 Alpine grassland sites. Axes: Longitude E (km) and Latitude N (km).]


It remains to examine what happened at these 22 sites between the two surveys.

--------------------

LCBD indices decompose the total variance in a community composition table (or total beta diversity, BDTotal) into Local Contributions to Beta Diversity (LCBD). See Legendre & De Cáceres (2013). [1]

LCBD indices were computed for each survey to determine which sites were exceptional in each time period.

LCBD indices reflect the spatial contributions of sites to beta diversity computed in a survey. TBI and LCBD do not measure the same thing.

[1] An application of LCBD analysis to a space-time study is found in the Legendre & Gauthier (2014) paper.
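To make the contrast with TBI concrete, here is a minimal sketch of LCBD for a single survey, following the variance decomposition described in Legendre & De Cáceres (2013): after a Hellinger transformation, the sum of squared deviations from the column means is computed per site and divided by the total sum of squares. The function names and the data are hypothetical, and the permutation test of the LCBD values is not shown.

```python
import math

def hellinger_rows(mat):
    """Hellinger transformation: square root of relative abundances, row by row."""
    out = []
    for row in mat:
        total = sum(row)
        out.append([math.sqrt(v / total) if total > 0 else 0.0 for v in row])
    return out

def lcbd(mat):
    """Return (LCBD values, BDTotal) for one sites-by-species table.

    Assumes at least two sites that are not all identical.
    """
    y = hellinger_rows(mat)
    n, p = len(y), len(y[0])
    means = [sum(y[i][j] for i in range(n)) / n for j in range(p)]
    ss_site = [sum((y[i][j] - means[j]) ** 2 for j in range(p)) for i in range(n)]
    ss_total = sum(ss_site)
    bd_total = ss_total / (n - 1)
    return [s / ss_total for s in ss_site], bd_total

# Made-up survey: the fourth site has a composition unlike the other three.
survey = [[10, 0, 0], [9, 1, 0], [10, 1, 0], [0, 0, 10]]
lcbd_values, bd_total = lcbd(survey)
```

LCBD uses one survey at a time and compares each site to the other sites, whereas TBI compares each site to itself at two times.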


[Figure: LCBD, 150 Alpine grassland sites. Axes: Longitude E (km) and Latitude N (km). Legend: LCBD signif. in 1990-2000 survey; LCBD signif. in 2012 survey.]


The points with high TBI values were in most cases not the same as those that had high LCBD indices in the individual surveys.

High TBI values indicate strong changes in composition in the T1-T2 interval. High LCBD values indicate exceptional composition in a given survey.

[Figures repeated for comparison: TBI, 150 Alpine grassland sites; LCBD, 150 Alpine grassland sites. Axes: Longitude E (km) and Latitude N (km). LCBD legend: LCBD signif. in 1990-2000 survey; LCBD signif. in 2012 survey.]


Ecological example 3 – Amanda Winegardner’s research

Objective – Quantify and identify drivers of spatial (landscape) and temporal beta diversity in diatom assemblages between modern and historical times. Understanding the magnitude and drivers of freshwater diversity change over the last 150 years, a period of intense industrial proliferation and land cover alteration, provides essential insights for developing scenarios of future change. [From the Abstract of the paper.]

Material for TBI study – Diatom community composition in pre-1850 and modern (2007) sediments of 176 lakes throughout the US.

Questions for TBI study: (1) Are there lakes where the change between pre-1850 and modern diatom community composition was exceptionally large, and (2) are there environmental drivers associated with these exceptional TBI values?

Reference – Winegardner, A. K., P. Legendre, B. E. Beisner and I. Gregory-Eaves. Diatom beta diversity through time and space. To be submitted to: Global Ecology and Biogeography.


3. Other types of variables

TBI analysis can be carried out on physical (e.g. environmental) variables. One should use the Euclidean distance for standardized quantitative variables, or the Gower dissimilarity for mixtures of quantitative and qualitative (= factor) variables [function gowdis() in package FD] (Legendre & Legendre 2012, Chapter 7).
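For quantitative environmental variables, the per-site computation could look like the sketch below. One point is an assumption rather than something stated in the slides: each variable is standardized with the mean and standard deviation of its values pooled over T1 and T2, so that both times share the same scale. The function name and the data are hypothetical, and the Gower case for mixed variables is not covered.

```python
import math
import statistics

def standardized_euclidean_tbi(env1, env2):
    """Euclidean distance between T1 and T2 for each site, on standardized variables."""
    n_vars = len(env1[0])
    pooled = env1 + env2  # assumption: standardize over both times together
    means = [statistics.mean([row[j] for row in pooled]) for j in range(n_vars)]
    sds = [statistics.stdev([row[j] for row in pooled]) for j in range(n_vars)]

    def standardize(row):
        return [(row[j] - means[j]) / sds[j] if sds[j] > 0 else 0.0
                for j in range(n_vars)]

    distances = []
    for row1, row2 in zip(env1, env2):
        z1, z2 = standardize(row1), standardize(row2)
        distances.append(math.sqrt(sum((a - b) ** 2 for a, b in zip(z1, z2))))
    return distances

# Made-up data: 3 sites, 2 variables (e.g. temperature and pH).
env1 = [[10.0, 7.0], [12.0, 7.5], [14.0, 8.0]]
env2 = [[10.0, 7.0], [12.5, 7.4], [20.0, 6.0]]
env_tbi = standardized_euclidean_tbi(env1, env2)
```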

Future work – I intend to carry out simulations to show that these analyses can be performed, and how their power varies for different numbers of sites and environmental variables.

4. Software

An R function called TBI is available on the Web page

http://adn.biol.umontreal.ca/~numericalecology/Rcode/

Click on the link beta.div to download a folder containing four functions for the study of beta diversity. Among these are function TBI.R and its documentation file TBI.help.pdf.


5. References

Edgington, E. S. 1995. Randomization Tests, 3rd edition. Marcel Dekker, New York.

Legendre, P. A temporal beta-diversity index to identify exceptional sites in space-time surveys. (Manuscript to be submitted.)

Legendre, P. and M. De Cáceres. 2013. Beta diversity as the variance of community data: dissimilarity coefficients and partitioning. Ecology Letters 16: 951-963.

Legendre, P. and O. Gauthier. 2014. Statistical methods for temporal and space-time analysis of community composition data. Proceedings of the Royal Society B 281: 20132728.

Legendre, P. and L. Legendre. 2012. Numerical ecology, 3rd English edition. Developments in Environmental Modelling, Vol. 24. Elsevier Science BV, Amsterdam.


Questions?