

Caroline Rhodén

Degree project for Bachelor of Science in Biology

BIO602 Biology: Degree project 15 hec
Spring 2015

Department of Biological and Environmental Sciences
University of Gothenburg

Examiner: Bernard Pfeil
Department of Biological and Environmental Sciences
University of Gothenburg

Supervisor: Christer Erséus
Department of Biological and Environmental Sciences
University of Gothenburg

Aporrectodea longa (Clitellata, Lumbricidae) - one or two species in Scandinavia

Photo on cover: Aporrectodea longa, by Caroline Rhodén

Table of contents

Abstract/Sammanfattning
Introduction
- Earthworms
- Species delimitation
- Aim of this study
Material and methods
- Sampling
- DNA-extraction, amplification and gel electrophoresis
- Sequence assembling and alignment
- Distance analyses
- Bayesian coalescent analyses
- Species delimitation tests
- Haplotype network analyses
- Morphological study
Results
- DNA sequencing
- Distance analyses
- Bayesian coalescent analyses
- Species delimitation tests
- Haplotype network analyses
- Morphological study
Discussion
- Conclusions
Acknowledgements
References
Appendix


Abstract

The black-headed worm, Aporrectodea longa (Ude, 1895), is an earthworm belonging to the family Lumbricidae and is widely distributed in Europe. When some specimens of A. longa collected in Scandinavia were sequenced for the mitochondrial genetic marker Cytochrome c Oxidase subunit I (COI), two distinct clusters were observed, with a sequence difference of about 7%. When the genetic distance between the two clusters was analyzed, a 'barcoding gap' (here the gap between variation within and variation between the clusters) of 4.4% was observed. COI is used as an animal barcoding gene, i.e., species are identified by comparing the COI sequence of an individual with reference COI sequences in a database. However, the existence of these two clusters shows that the nominal taxon A. longa comprises at least two maternal genetic lineages. The question is whether these lineages represent one or two species. To investigate this, additional data sources were analyzed, namely the nuclear markers Histone 3 (H3) and the Internal Transcribed Spacer 2 (ITS2), combined with five external morphological characters. The genetic markers were explored by distance analysis using Automatic Barcode Gap Discovery (ABGD) and by Bayesian coalescent analysis. The two nuclear markers did not separate the individuals into two groups corresponding to the COI clusters, and no significant difference in morphological characters between the clusters was found either. Based on these results, the two clusters cannot be considered to represent two species. This implies that the use of COI as a barcoding gene for delimitation of species should be complemented with other data sets, such as nuclear genes, morphology and ecology, to ensure that the delimitation reflects a speciation event.

Sammanfattning (Swedish summary)

The long earthworm, Aporrectodea longa (Ude, 1895), is a species belonging to the earthworm family (Lumbricidae) and is widespread across large parts of Europe. When some Scandinavian specimens of A. longa were sequenced for the mitochondrial genetic marker Cytochrome c Oxidase subunit I (COI), two distinct clusters were found, differing by about 7%. When the genetic distance between the two clusters was analyzed, a so-called 'barcoding gap' (a term that normally refers to the difference between intraspecific and interspecific variation) of 4.4% was found. COI is used today as a barcoding gene for animals, i.e., species are identified by comparing the COI sequence of an individual with reference COI sequences in a database. The existence of the two clusters means that the nominal species A. longa comprises at least two maternal genetic lineages. This study investigates whether these two lineages represent one or two species. To test this, additional data were analyzed, namely the nuclear markers Histone 3 (H3) and Internal Transcribed Spacer 2 (ITS2), combined with five external morphological characters. To examine the genetic markers, a distance analysis was performed using Automatic Barcode Gap Discovery (ABGD), together with a Bayesian coalescent analysis. The two nuclear genes did not separate the individuals into two clusters as COI did, and no significant morphological difference could be seen either. Based on these results, the two clusters cannot be considered to represent two species. This means that the use of COI as a barcoding gene for species delimitation should be complemented with other data, such as nuclear genes, morphology and ecology, to ensure that the delimitation reflects a speciation event.


Introduction

This study addresses species delimitation in earthworms (family Lumbricidae), with a focus on the black-headed worm, Aporrectodea longa, in Scandinavia.

Earthworms

Annelida is a large invertebrate phylum that includes about 16 500 species of segmented worms worldwide (GBIF; http://www.gbif.org/), at least 3000 of which are earthworms. Earthworms of the family Lumbricidae, class Clitellata, are known for their soil-enhancing properties and play an important part in the formation and maintenance of soil fertility and structure (Edwards, 2004). Their burrowing creates channels for water and gas transport in the soil, and it also contributes to the mixing of minerals and organic material. By consuming soil, earthworms take part in the breakdown of terrestrial organic matter, which then becomes available to microorganisms (Brown et al., 2000). Because of these soil-enhancing qualities, some ecologists refer to earthworms as soil engineers (Jouquet et al., 2006).

Ecologically, earthworms are divided into three categories: epigeic, endogeic and anecic species. Epigeic worms live in and eat plant litter on the soil surface and rarely burrow. Endogeic worms live in the soil, making both vertical and horizontal burrows and eating organic material in the soil. Anecic worms make deep vertical burrows and feed on litter from the surface as well as organic material in the soil (Brown et al., 2000). Earthworms also serve as prey for several vertebrate and invertebrate species, such as birds, moles, hedgehogs, rove beetles (family Staphylinidae) and ants (Sims and Gerard, 1985). Lumbricid earthworms are widely distributed and commonly found in most regions of the earth, with the exception of areas with arctic and desert climates (Edwards, 2004).
There are about 432 European species of the family Lumbricidae, and at least 25 of them occur in Sweden (Fauna Europaea 2013; http://www.faunaeur.org, Dyntaxa; https://www.dyntaxa.se). The lumbricid genus Aporrectodea Örley, 1885, comprises 83 species in Europe according to the Fauna Europaea database (2013), and five of them are currently found in Sweden (Dyntaxa; https://www.dyntaxa.se).

The black-headed worm A. longa (Ude, 1895) is an anecic species of Lumbricidae. In Swedish it is called lång daggmask (long earthworm) because of its large and slender body (Sims and Gerard, 1985; Edwards and Bohlen, 1996). Aporrectodea longa has around 170-190 segments and its body length varies between 9 and 17 cm. Segment number 28 generally marks the beginning of a saddle-shaped, swollen glandular region called the clitellum, which can be seen in mature individuals and is used to produce a cocoon for the fertilized eggs. The clitellum extends over eight to nine segments. A pair of male reproductive pores, used when transferring sperm to a mate, is commonly situated in segment 15 (Sims and Gerard, 1985; Edwards and Bohlen, 1996).

Species delimitation

How to delimit species, i.e., how to determine the boundaries and number of species, has long been debated among biologists, and several species concepts have been proposed over the past half century (De Queiroz, 2007). One well-known concept is the biological species concept, which, simplified, states that organisms that are able to produce viable and fertile offspring and are reproductively isolated from other groups of organisms constitute the same species (Dobzhansky, 1950). The ecological species concept, on the other hand, separates species according to whether they share the same niche or adaptive zone (Van Valen, 1976). The phylogenetic species concepts consider phylogenetic relationships and comprise four versions: Hennigian (Hennig, 1966), monophyletic (Rosen, 1979), diagnosable (Nelson and Platnick, 1981) and genealogical (Baum and Shaw, 1995). With so many concepts, all with different criteria, disagreement has arisen among scientists as to when and how to delimit species. However, most of the concepts have one element in common: that species are “separately evolving metapopulation lineages”. In 2007, the zoologist Kevin de Queiroz proposed a unified species concept which retains this common element and treats the secondary species criteria as equally important for estimating the separation of lineages. The unified species concept thereby makes it possible to take advantage of all existing criteria for species delimitation (De Queiroz, 2007). In my study I adopt the unified species concept, using molecular and morphological data as evidence of speciation.

When using molecular evidence to delimit species, it is preferable to use both mitochondrial and nuclear gene markers (Brown et al., 1979). The mitochondrial genome is haploid and maternally inherited, in contrast to the nuclear genome, which is diploid and biparentally inherited, and the mitochondrial genome is much smaller than the nuclear one. Processes such as replication, transcription, translation and DNA proofreading are simpler in the mitochondrial genome than in the nuclear one.
This results in a higher error rate and thereby a higher rate of evolution in the mitochondrial genes, with more rapid lineage sorting, which makes it possible to distinguish more recently accumulated differences between closely related species (Brown et al., 1979; Neigel et al., 1986; Alberts et al., 2008). The mitochondrial genetic marker Cytochrome c Oxidase subunit I (COI) has been proposed as an animal DNA barcoding gene for identifying species. The sequence in question is compared with sequences from many taxa in a large database, and a match indicates which species the sequence belongs to. Nevertheless, nuclear genes, both coding and non-coding, can be used to validate or falsify a species delimitation hypothesis (Brown et al., 1979; Klarica et al., 2012).

I sequenced the two nuclear markers H3 (Histone 3) and ITS2 (Internal Transcribed Spacer 2). The latter is a non-coding region in the nuclear ribosomal DNA, and it can be highly variable between closely related species because mutations occur more frequently in this region (Baldwin et al., 1995; Collins and Paskewitz, 1996; Marinucci et al., 1999). H3, which is a coding gene, does not evolve as quickly as either the mitochondrial DNA or the ITS2 region (Brown et al., 1982).

As an additional test I will perform a morphological study comparing five external characters. However, morphological evidence might be hard to find because of the simple body structure of lumbricids and the fact that their features are highly variable and may overlap between both closely and distantly related taxa (Lee and Frost, 2002; Pop et al., 2003; Perez-Losada et al., 2009). This has led to morphologically similar species being classified as one species (such species are called cryptic), and to the recognition of so-called species complexes, which include several taxa with unclear relationships (Bickford et al., 2007; Perez-Losada et al., 2009; Karaca, 2011).

Aim of this study

In earlier DNA barcoding, Christer Erséus (University of Gothenburg) found that individuals of the earthworm Aporrectodea longa collected in Sweden, Norway and Denmark grouped into two different clusters (1 and 2) for the COI marker. The genetic distance (p-distance) between the clusters is about 7%, which might indicate that A. longa comprises two separate species in Scandinavia (Hebert et al., 2003). However, to conclude that this is associated with cryptic speciation, evidence beyond a difference in a single mitochondrial gene is preferable (Brown et al., 1979). The aim of my study is to investigate the phylogenetic relationship among individuals representing the two COI clusters of A. longa found in Scandinavia. My hypothesis is that the genetic variation in the mitochondrial COI gene is interspecific. To test my hypothesis I will sequence and analyze the nuclear markers H3 and ITS2 in addition to the COI barcodes, and also conduct a study of five external morphological characters.

Material and methods

Sampling

More than 200 specimens of the earthworm species Aporrectodea longa were collected by Christer Erséus and others from various locations in Sweden, Norway and Denmark during 2008-2015. The specimens were preserved in 95% ethanol and stored at -20°C at the University of Gothenburg, Sweden. The material studied in this project comes from this collection. The identification numbers of the studied specimens and their collecting sites are listed in Table 1.

DNA-extraction, amplification and gel electrophoresis

Thirty-one of the specimens included in this study had previously been barcoded using COI at the University of Gothenburg, 9 specimens belonging to cluster 1 and 22 specimens belonging to cluster 2.
To increase the number of specimens in cluster 1, DNA from 36 additional individuals of A. longa was extracted and COI barcoded. Three of these specimens turned out to belong to cluster 1 and were then included in the study. For the resulting total of 34 specimens, the nuclear genetic markers H3 and ITS2 were also sequenced. A small piece of the body wall was taken from the posterior part of each specimen for DNA extraction, using Epicentre QuickExtract DNA Extraction Solution 1.0 following the manufacturer's instructions. The genetic markers were amplified using PCR (polymerase chain reaction) with the primers listed in Table 2. For COI and H3, the PCR was carried out as a 25 µl reaction containing 11 µl Taq Master Mix RED, 10 µl distilled water, 1 µl of each primer (10 µM) and 2 µl template DNA. The 25 µl ITS2 mixture contained 15 µl Taq Master Mix RED, 6 µl distilled water, 1 µl of each primer (10 µM) and 2 µl template DNA.
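As a quick consistency check on the reaction mixtures described above, the component volumes can be summed and the final primer concentration derived. This is only a sketch: the dictionary keys and function names are illustrative, and it assumes that the stated 10 µM is the concentration of the primer stock that was pipetted.

```python
# Reaction volumes in microlitres, as stated in the text.
REACTIONS = {
    "COI/H3": {
        "Taq Master Mix RED": 11,
        "distilled water": 10,
        "forward primer": 1,
        "reverse primer": 1,
        "template DNA": 2,
    },
    "ITS2": {
        "Taq Master Mix RED": 15,
        "distilled water": 6,
        "forward primer": 1,
        "reverse primer": 1,
        "template DNA": 2,
    },
}


def total_volume(mix):
    """Sum of all component volumes in microlitres."""
    return sum(mix.values())


def final_concentration(stock_um, added_ul, total_ul):
    """Concentration after dilution, from C1 * V1 = C2 * V2."""
    return stock_um * added_ul / total_ul


for name, mix in REACTIONS.items():
    volume = total_volume(mix)
    # Assumption: 10 uM refers to the primer stock solution.
    primer = final_concentration(10, mix["forward primer"], volume)
    print(f"{name}: {volume} ul total, {primer} uM of each primer")
```

Both mixtures add up to the stated 25 µl, and under the stock-concentration assumption each primer ends up at 0.4 µM in the reaction.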


The amplification program for COI was 95°C for 5 min, followed by 35 cycles of 95°C for 40 sec, 45°C for 45 sec and 72°C for 60 sec, and a final step at 72°C for 8 min. The amplification program for H3 was 95°C for 5 min, followed by 35 cycles of 95°C for 30 sec, 50°C for 30 sec and 72°C for 90 sec, and a final step at 72°C for 8 min. The amplification program for ITS2 was 95°C for 5 min, followed by 35 cycles of 95°C for 45 sec, 48°C for 60 sec and 72°C for 90 sec, and a final step at 72°C for 8 min. To determine whether the target genes had been successfully amplified, the PCR products were checked by gel electrophoresis on 1% agarose. When the gel electrophoresis showed a negative result, the PCR was repeated for that sample. If a COI sample had to be run a third time, the primer HCO2198 was replaced by CO1-E; if the result was still negative, a new DNA extraction was carried out. The PCR products were purified using 5 µl ExoTAP (Exonuclease I and FastAP Thermosensitive Alkaline Phosphatase) (Werle et al., 1994), and then sequenced by Eurofins MWG Operon (Ebersberg, Germany).

Sequence assembling and alignment

The 36 COI sequence segments were assembled, i.e., the forward and reverse primer reads of each individual were combined into a consensus sequence. The sequences were then aligned, i.e., the individuals' DNA sequences were arranged on top of each other so that similarities and differences could be discerned, using the bioinformatics program Geneious R6 v. 6.0.6 (Biomatters Ltd., Auckland, New Zealand). To determine to which of the two clusters each of the 36 samples belonged, they were added to the 31 sequences already barcoded and a tree was constructed by neighbor joining. As already mentioned, three individuals belonged to cluster 1 and were added to the 31 individuals in the study. Similarly, the ITS2 and H3 sequences of 34 individuals were assembled into consensus sequences and aligned in Geneious R6.
The alignments created in Geneious R6 were exported to MEGA 6.06 (Tamura et al., 2013) to test which substitution model fits the data best and should therefore be used in the Bayesian coalescent analysis. For ITS2 and COI (the H3 sequences showed no variation and could not be tested), the model proposed by MEGA 6.06 under the Bayesian information criterion (BIC) was HKY+Γ. This model allows unequal base frequencies and different substitution rates for transitions and transversions. All new sequences will be deposited in GenBank and vouchers of the sequenced specimens will be deposited in Scandinavian museums.

Distance analyses

A first test of whether the data set contains two hypothetical species was conducted by calculating pairwise genetic distances, i.e., the differences between every pair of sequences, for COI and ITS2 (H3 was not analyzed because all sequences belonged to the same haplotype, i.e., showed no variation). The test was conducted in MEGA 6.06 using uncorrected p-distances. If two sequences differed in length, the missing data were deleted pairwise. Histograms of pairwise distances were drawn in Microsoft Excel. The distances were analyzed using the online version of Automatic Barcode Gap Discovery, ABGD (http://wwwabi.snv.jussieu.fr/public/abgd/abgdweb.html) (Puillandre et al., 2012) with default settings. ABGD delimits genetic clusters by detecting whether there is a significant gap in the distribution of pairwise distances. If there is a gap in the distribution, the lower distance values are presumed to reflect intraspecific differences, i.e., differences within a species, and the higher values to reflect interspecific differences, i.e., differences between species.

Bayesian coalescent analyses

To estimate the gene trees for COI and ITS2, the DNA sequences were analyzed with Bayesian coalescent analysis, a computational method that searches for the most probable tree by comparing the likelihoods (L) of a huge number of possible trees. A tree's likelihood is the probability of the observed DNA data given that tree and a certain model; thus, the higher the likelihood, the better the tree (Baum and Smith, 2013). The Bayesian coalescent analysis was conducted in BEAST MC3 v. 1.7.5 (Drummond and Rambaut, 2007; Drummond et al., 2012). The data were converted to an .xml file in BEAUTI v1.6.2 (Drummond and Rambaut, 2007) before being run in BEAST, with the HKY+Γ model for both gene data sets. The following settings were used for all analyses: base frequencies ʻestimatedʼ; clock model ʻlognormal relaxed clock (uncorrelated)ʼ; tree prior ʻcoalescent/constant sizeʼ; constant.popsize ʻlognormal: Log(Mean) = 0.0, Log(Stdev) = 1.0, offset = 0.0ʼ. A prior is based on the background knowledge one has about the data before the analysis is run. For all other priors the default settings were used. Both the COI and ITS2 analyses were run for 50 million generations; each generation yields a parameter proposal for the tree, which is then accepted or rejected.
When a decision has been made regarding the proposal, a new generation starts, i.e., a new proposal is made. The tree was sampled every 5000th generation, so that 10 000 trees were kept for further analysis. TRACER 1.5 (Drummond and Rambaut, 2007) was used to determine the burn-in of the 10 000 trees. At the beginning of a Bayesian analysis the trees often have low likelihood values that steadily increase, while the values of most later trees are higher and more stable; these first trees are regarded as the burn-in. The burn-in does not give any useful information and was discarded. The first 25% of the trees were discarded as burn-in, and the remaining trees were summarized and the best tree was chosen based on posterior probability using TREEANNOTATOR v1.7.5 (Drummond and Rambaut, 2007). The posterior probability of a node is its probability after the data have been observed; the values range between 0 and 1, where 0 means no support and 1 means full support.

Species delimitation tests

To test whether the two clusters were significantly separated, the COI and ITS2 data were imported into Geneious, where the species delimitation plug-in (Masters et al., 2011) was used to calculate P (Randomly Distinct), which tests whether the observed distinctiveness between groups is due to random coalescent events (Rodrigo et al., 2008), and Rosenberg's PAB, which tests the probability of reciprocal monophyly of the clusters (Rosenberg, 2007).
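The sampling scheme given for the Bayesian coalescent analyses (50 million generations, one tree sampled every 5000th generation, the first 25% discarded as burn-in) can be checked with a few lines of arithmetic. The function name below is illustrative and is not part of BEAST, TRACER or TREEANNOTATOR; the sketch only follows the numbers stated in the text.

```python
def mcmc_tree_counts(generations, sample_every, burnin_fraction):
    """Return (sampled, discarded, retained) tree counts for an MCMC run."""
    sampled = generations // sample_every
    discarded = int(sampled * burnin_fraction)
    return sampled, discarded, sampled - discarded


# Values taken from the text.
sampled, discarded, retained = mcmc_tree_counts(50_000_000, 5_000, 0.25)
print(f"sampled: {sampled}, burn-in: {discarded}, summarized: {retained}")
```

With these settings, 10 000 trees are sampled per analysis, 2500 are discarded as burn-in and 7500 remain to be summarized.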


Haplotype network analyses

Haplotype networks were constructed for both genes (COI and ITS2) in PopART v1. (http://popart.otago.ac.nz) using statistical parsimony (Templeton et al., 1992; Clement, 2002).

Morphological study

Five external morphological characters were compared between the two clusters: number of body segments, body length (of preserved specimens), the segment in which the clitellum begins, number of segments in the clitellum, and the segment in which the male reproductive pores are located. To get a reliable result only adult individuals were used (Klarica et al., 2012), and because of that only 10 individuals from cluster 1 and 18 from cluster 2 could be used. For some individuals a certain character could not be measured, so that one character may have, e.g., 8 values for one cluster and 13 values for the other. The characters were studied on specimens preserved in 95% ethanol using a dissection microscope (Leica MZ6). The body length was measured using a piece of plastic string and a ruler. The morphological data were analyzed statistically using a two-sample t-test when the data were normally distributed. When the data were not normally distributed, a Mann-Whitney U test was used instead. The tests were performed in IBM SPSS Statistics 22.

Results

DNA sequencing

Twenty-nine of the 36 DNA samples were successfully amplified and sequenced for COI. Three of them belonged to cluster 1 and were included in the study, and 26 belonged to cluster 2 and were excluded from the study. Of the 34 DNA samples included in the study (12 individuals from cluster 1 and 22 individuals from cluster 2), 26 were successfully amplified and sequenced for H3. The sequences were identical and could not be analyzed further. For ITS2, 30 samples were amplified and sequenced using the reverse primer (1082R); the forward primer (606F) reads were not sequenced correctly by Eurofins MWG Operon and could not be used for analysis.

Distance analyses

For COI, the maximum pairwise distance within Aporrectodea longa was 8.1% (Fig. 1). The distance within the two clusters (bars on the left side of the gap in the histogram) was between 0.0 and 2.0%, and the distance between the two clusters (bars on the right side of the gap in the histogram) was around 6.4 to 8.1%. A barcoding gap was observed between 2.0 and 6.4% pairwise distance. The ABGD analysis for COI also detected a distinct gap between the two clusters. For ITS2, the maximum pairwise distance was 1.5% and there was continuous variation between 0 and 1.5% (Fig. 2). The ABGD analysis found only a single cluster for ITS2.
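The uncorrected p-distance with pairwise deletion, and the idea of a gap in the distance distribution, can be illustrated with a minimal sketch. This is not the MEGA or ABGD implementation: ABGD uses a more elaborate statistical procedure, whereas the helper below simply reports the widest empty interval between sorted distances. The function names, the set of symbols treated as missing data and the toy alignment are all assumptions made for illustration.

```python
from itertools import combinations

MISSING = set("-?N")  # symbols treated as missing data (an assumption)


def p_distance(seq_a, seq_b):
    """Uncorrected p-distance: differing sites / compared sites.

    Sites missing in either sequence are skipped (pairwise deletion).
    """
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in MISSING or b in MISSING:
            continue
        compared += 1
        differing += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def pairwise_distances(sequences):
    """All pairwise p-distances for a dict of name -> aligned sequence."""
    return [p_distance(sequences[a], sequences[b])
            for a, b in combinations(sequences, 2)]


def widest_gap(distances):
    """Bounds (lower, upper) of the widest empty interval in the distances."""
    ordered = sorted(distances)
    if len(ordered) < 2:
        raise ValueError("need at least two distances")
    return max(zip(ordered, ordered[1:]), key=lambda pair: pair[1] - pair[0])


# Toy alignment, invented for illustration only (not thesis data).
toy = {
    "a1": "ACGTACGTAC",
    "a2": "ACGTACGTAT",
    "b1": "ACGAACCTTC",
    "b2": "ACGAACCTT-",
}
print(widest_gap(pairwise_distances(toy)))

# Distance bounds reported in the text for COI, written as proportions.
lower, upper = widest_gap([0.0, 0.020, 0.064, 0.081])
print(f"gap width: {(upper - lower) * 100:.1f}%")
```

Applied to the bounds reported for COI, the widest interval runs from 2.0 to 6.4%, which corresponds to the 4.4% barcoding gap mentioned in the abstract.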


Figure 1. Histogram showing pairwise COI distances in % on 34 individuals.

Figure 2. Histogram showing pairwise ITS2 distances in % on 30 individuals.

Bayesian coalescent analyses

The tree created by the Bayesian coalescent analysis of COI (655 bp) showed two clearly distinct clusters forming two monophyletic groups. Within these there was variation among the individuals. The posterior probability for the two main lineages was 0.98 for cluster 1 and 0.86 for cluster 2 (Fig. 3). The ITS2 (525 bp) tree showed a great deal of variation too, but it did not display the two clusters as the COI tree did, and individuals from COI clusters 1 and 2 were widely distributed among the branches of the ITS2 tree (Fig. 4).


Figure 3. Bayesian coalescent COI tree with Bayesian posterior probabilities; values above 0.50 are given. ID numbers are provided for each specimen. Individuals from cluster 1 are colored green and individuals from cluster 2 purple. The scale bar shows the expected number of substitutions per site.


Figure 4. Bayesian coalescent ITS2 tree with Bayesian posterior probabilities; values above 0.50 are given. ID numbers are provided for each specimen. Individuals from cluster 1 are colored green and individuals from cluster 2 purple. The scale bar shows the expected number of substitutions per site.


Species delimitation tests

The P (Randomly Distinct) species delimitation test of the COI tree was not significant, with a value of 0.53 (significance level 0.05). Rosenberg's PAB species delimitation test was highly significant, with a value of 3.9 × 10⁻¹¹ (significance level 0.05). The species delimitation tests were not performed on ITS2 because the two clusters were not monophyletic.

Haplotype network analyses

The haplotype network of COI showed two distinct groups, cluster 1 with 3 haplotypes and cluster 2 with 9 haplotypes, and the number of substitutions between some haplotypes was high (Fig. 5). To obtain an equal number of base pairs, the COI sequences had to be trimmed to 612 bp each. The ITS2 sequences were 525 bp each and comprised 11 haplotypes. In the haplotype network, four ITS2 haplotypes were shared between individuals of the two COI clusters, and the number of substitutions between haplotypes was low (Fig. 5).
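The bookkeeping behind these haplotype counts (identical sequences collapse into one haplotype, a haplotype is shared if it occurs in both COI clusters, and branches are measured in substitutions) can be sketched as follows. This does not reproduce the statistical parsimony algorithm used by PopART; the function names, specimen labels and sequences are hypothetical and serve only as illustration.

```python
from collections import defaultdict


def collapse_haplotypes(sequences):
    """Map each distinct sequence to the list of specimens carrying it."""
    groups = defaultdict(list)
    for specimen, seq in sequences.items():
        groups[seq.upper()].append(specimen)
    return dict(groups)


def shared_between_clusters(haplotypes, cluster_of):
    """Haplotypes found in specimens from more than one cluster."""
    return {seq: ids for seq, ids in haplotypes.items()
            if len({cluster_of[i] for i in ids}) > 1}


def substitutions(seq_a, seq_b):
    """Number of differing sites between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be trimmed to equal length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


# Hypothetical specimens and sequences, invented for illustration.
toy_seqs = {"s1": "ACGTAC", "s2": "ACGTAC", "s3": "ACGTAT", "s4": "ACGTAT"}
toy_cluster = {"s1": 1, "s2": 2, "s3": 2, "s4": 2}

haps = collapse_haplotypes(toy_seqs)
shared = shared_between_clusters(haps, toy_cluster)
print(len(haps), "haplotypes;", len(shared), "shared between clusters")
```

The equal-length check in `substitutions` mirrors the reason the COI sequences had to be trimmed before the network was built.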

Figure 5. Haplotype networks of COI and ITS2. Specimens are color-coded according to COI clusters 1 and 2. The marks on the branches represent changes.

Morphological study

The position of the male reproductive pores had a mean value of 14.91 for cluster 1 (11 individuals) and 14.94 for cluster 2 (17 individuals). No statistical test was performed on this character because of the low variation. The mean number of body segments was 193.33 in cluster 1 and 178.81 in cluster 2; the variation within the groups was high, and the difference was not significant according to the two-sample t-test (p=0.07) (Fig. 6A). The mean body length was 11.80 cm in cluster 1 and 12.03 cm in cluster 2. The two-sample t-test was not significant (p=0.791) (Fig. 6B).


The number of segments in the clitellum had a mean value of 8.20 for cluster 1 and 7.94 for cluster 2. The Mann-Whitney U test showed no significant difference between the clusters (p=0.195) (Fig. 7A). For the starting segment of the clitellum, the mean value was 28.30 for cluster 1 and 28.00 for cluster 2. The Mann-Whitney U test showed no significant difference between the clusters (p=0.742) (Fig. 7B). The morphological study thus showed no significant difference between the two clusters of A. longa for any of the characters (Table 3).
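The two test statistics used in the morphological comparison can be sketched with the Python standard library. This is not the SPSS procedure: it assumes the pooled-variance (Student) form of the two-sample t-test, computes only the test statistics and not the p-values, and uses invented toy measurements rather than the thesis data.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Two-sample t statistic with pooled variance (Student's t-test)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


def mann_whitney_u(a, b):
    """U statistic for sample a: pairs with a > b count 1, ties count 0.5."""
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


# Invented toy measurements, for illustration only (not thesis data).
group_1 = [1, 2, 3]
group_2 = [4, 5, 6]

print(round(pooled_t_statistic(group_1, group_2), 3))
print(mann_whitney_u(group_1, group_2), mann_whitney_u(group_2, group_1))
```

The two U values always add up to the product of the sample sizes, which is a convenient sanity check. Turning either statistic into a p-value requires the corresponding reference distribution, which is what a statistics package supplies.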

Figure 6. A, Boxplot of the number of body segments within clusters 1 and 2. The mean value for cluster 1 is 193.3 segments, and for cluster 2, 178.8 segments. Cluster 1 comprises 9 adult individuals and cluster 2 comprises 15 adult individuals. B, Boxplot of the body length of individuals within clusters 1 and 2. The mean value for cluster 1 is 11.80 cm, and for cluster 2, 12.03 cm. Cluster 1 comprises 9 individuals and cluster 2 comprises 15 individuals.

Figure 7. A, Diagram of the number of segments in the clitellum in clusters 1 and 2. B, Diagram of the segment at which the clitellum begins in clusters 1 and 2. In both A and B, cluster 1 comprises 10 adult individuals and cluster 2 comprises 17 adult individuals.

Discussion

A barcoding gap can be observed when the intraspecific genetic variation is smaller than the interspecific genetic variation (Meyer and Paulay, 2013). In the present case the gap is thereby an indication of two species. The distance analysis together with the ABGD analysis of the COI data yielded a barcoding gap between 2.0 and 6.4%, i.e., the variation within the clusters is below 2.0% and the variation between them is above 6.4%, which suggests two hypothetical species (Fig. 1). This is not a small gap, but in other clitellate studies the gap more often seems to be larger when species are delimited (Matamoros et al., 2012). However, the corresponding analysis of ITS2 yielded no such gap (Fig. 2) and thereby suggests that all specimens belong to the same species. Thus, one marker proposes two species and the other does not, which is not an unreasonable result considering that (in animals) mitochondrial genes generally exhibit a faster mutation rate and faster lineage sorting than nuclear genes (Brown et al., 1979; Neigel et al., 1986).

Furthermore, the phylogenetic tree resulting from the Bayesian coalescent analysis of the COI sequences showed two distinct monophyletic groups, just as the distance analysis predicted, one including individuals from cluster 1 and the other including individuals from cluster 2. The ITS2 tree, on the other hand, exhibited a different topology and no sign of the two distinct clusters; on the contrary, the individuals of the two species predicted by the COI data were scattered widely within the tree. This result is interesting in light of the collecting localities: representatives of both clusters 1 and 2 were found at the same sites. For instance, in the ITS2 tree one clade with full support (posterior probability = 1) consists of two individuals (CE4219 and CE4226) that are both from Bromölla, Sweden; CE4219 belongs to COI cluster 2, and CE4226 to COI cluster 1.
Individuals from both clusters 1 and 2 also co-exist in Oslo, Tjärnö and Kinnekulle. In part this indicates that there is no geographic divergence between the two clusters, and it also seems to suggest that the Scandinavian metapopulation of A. longa is panmictic, i.e., all individuals reproduce freely with each other. Reproductive incompatibility is a criterion of speciation derived from the biological species concept and is also used in the unified species concept (which I am applying); it does not seem to have evolved between the clusters recognized in this study. On the other hand, one can argue that there is too little variation in the ITS2 data to conclude that this is just one species, and that additional nuclear data should be investigated before deciding whether to delimit these clusters or not.

The reason why we see the distinct divergence of the two clusters in the mitochondrial gene, but not in the nuclear marker, is probably the free reproduction among individuals, which keeps the population size large and maintains frequent recombination of the nuclear genome between the maternal clusters 1 and 2. The mitochondrial genome, which is inherited maternally and does not recombine with the paternal counterpart, exhibits a reduced effective population size (Ne), i.e., the number of individuals in a population that reproduce, and thereby tends to experience a higher rate of genetic drift. Genetic drift has a stronger effect in small populations, where the extinction of gene copies proceeds faster, which results in faster lineage sorting than in a large population. For the nuclear genome the effective population size is larger, because the genes are biparentally inherited with recombination. It will thus take a longer time for the nuclear gene lineages to coalesce (Neigel et al., 1986; Futuyma, 2013).

Regarding H3, there was no variation at all between the sequences; they all belonged to the same haplotype. Although ITS2 is also a nuclear marker, it possessed much more variation than H3; being a non-coding region of the ribosomal DNA, it is generally more variable than coding regions (such as H3) of the nuclear genome (Collins and Paskewitz, 1996). The ITS2 tree seems to reflect some variation, but the posterior probability values on the short branches are very low, i.e., the support for the nodes is low, so the question is whether these branches are accurate. The tree is constructed by a Bayesian coalescent analysis that assumes coalescence to occur between any two individuals. This means that even if two individuals have the same haplotype, the analysis will still search for a coalescence event between them. One such example is the group CE11874 and CE16498, with a posterior probability of 0.05. To get a more precise picture of how many haplotypes the tree actually contains, I conducted the haplotype network analysis, which showed fewer haplotypes than the tree did. The variation between the haplotypes is also small, often just a single substitution, as can be seen in the network. Four haplotypes were shared between the two clusters based on COI (Fig. 5). The fact that the ITS2 tree is constructed solely from reverse primer sequences might have led to loss or gain of haplotypes due to sequencing error, which may have caused under- or overestimation of the variability in the ITS2 tree.

The species delimitation tests, conducted on the COI data, found no significant clusters according to P (Randomly Distinct) (>0.05). These results mean that we cannot be sure that the two clusters demonstrated by the COI data have not arisen by random coalescent processes.
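The point that identical sequences form a single haplotype, and that some nuclear haplotypes are shared between the two COI clusters, can be sketched as follows. This is only an illustration under simple assumptions: sequences are aligned, of equal length and free of gaps or ambiguity codes. The specimen labels `w1` to `w5`, the sequences and the function names are hypothetical, and the code is not the algorithm used by the haplotype network software.

```python
from collections import defaultdict


def collapse_haplotypes(seqs):
    """Group specimen labels by identical sequence (one group per haplotype)."""
    groups = defaultdict(list)
    for name, seq in seqs.items():
        groups[seq].append(name)
    return dict(groups)


def shared_haplotypes(seqs, cluster_of):
    """Return the specimen groups whose haplotype occurs in more than one cluster."""
    shared = []
    for names in collapse_haplotypes(seqs).values():
        if len({cluster_of[n] for n in names}) > 1:
            shared.append(sorted(names))
    return shared


# Hypothetical nuclear-marker fragments and COI cluster assignments.
toy_seqs = {
    "w1": "ACGTAC",
    "w2": "ACGTAC",
    "w3": "ACGTAT",
    "w4": "ACGAAC",
    "w5": "ACGAAC",
}
toy_clusters = {"w1": 1, "w2": 2, "w3": 1, "w4": 2, "w5": 2}

haplotypes = collapse_haplotypes(toy_seqs)
print("number of haplotypes:", len(haplotypes))
print("shared between clusters:", shared_haplotypes(toy_seqs, toy_clusters))
```

Counting haplotypes this way gives a lower bound on what a tree with one tip per individual appears to show, which is why the network can contain fewer haplotypes than the tree has tips.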
Rosenberg's PAB metric, on the other hand, was highly significant (<0.05) for random reciprocal monophyly. This raises the question as to which of the two tests to trust. A previous clitellate study showed a similar result, with P (Randomly Distinct) non-significant and Rosenberg's PAB significant, for several groups, although other sources of data pointed against a single group (Martinsson et al., 2013). Rosenberg's PAB statistic seems to be considerably more liberal than P (Randomly Distinct). With regard to A. longa, and given that the nuclear DNA data sources I used did not suggest that this taxon is two species (even though it cannot be ruled out), it seems more legitimate to trust the more conservative P (Randomly Distinct) in this case. In summary, it seems that even COI does not give unambiguous support for speciation.

The five external morphological characters studied did not differ significantly when compared between the two COI clusters. The morphological variation among the A. longa individuals was high, as expected for earthworms (see Introduction), for which morphological studies alone have been considered an unreliable source for detecting speciation (Dominguez et al., 2015). However, the mean number of body segments differed by 14.52 segments between the clusters, and according to the t-test this was nearly significant, with a p-value of 0.07. For future studies it would be relevant to investigate the morphology further, with an increased sample size (in both clusters) and with additional external and internal characters. Considering that cryptic speciation is common among earthworms, this result does not necessarily support the hypothesis that this is one species.
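The comparison of body segment counts between the two COI clusters was made with a t-test in this study. As a complementary sketch, and explicitly not the method used here, the same kind of question can be asked with an exact two-sided permutation test on the difference in means, which needs no distributional assumption for small samples. The segment counts below are hypothetical, as is the function name `permutation_p_value`.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the absolute difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    observed = abs(mean(group_a) - mean(group_b))
    hits = 0
    count = 0
    # Enumerate every way of assigning n_a of the pooled values to group A.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        count += 1
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            hits += 1
    return hits / count


# Hypothetical numbers of body segments (not the measurements in Table 3).
cluster_1 = [190, 200, 205, 195]
cluster_2 = [170, 180, 185, 175]

p = permutation_p_value(cluster_1, cluster_2)
print(f"difference in means: {mean(cluster_1) - mean(cluster_2):.2f}")
print(f"exact permutation p-value: {p:.4f}")
```

Full enumeration is only feasible for small samples; with larger samples a random subset of permutations would be drawn instead.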

Conclusions

Out of a total of four data sources (the three genetic markers COI, H3 and ITS2, plus morphology), only COI supports the hypothesis of two species, based on the distance analysis, ABGD, Bayesian coalescent analysis and haplotype network. COI has been proposed as, and is commonly used as, a suitable barcoding gene in animal species identification. Such identification relies on the so-called barcoding gap, which, if it is large enough, is said to give enough support for species delimitation (Hebert et al., 2003). According to my results, a COI barcoding gap is not an adequate tool for species delimitation by itself. COI data should be complemented with data from other genetic markers, as well as with other sources of information such as morphology and ecology. Previous studies have also questioned the uncritical use of COI as a barcoding gene (Dasmahapatra et al., 2010; Matamoros et al., 2012; Achurra and Erséus, 2013; Martinsson et al., 2013). However, according to my results, and especially considering the poor resolution in ITS2 and H3, it is not possible to fully reject the hypothesis that the COI variation in Aporrectodea longa is interspecific. To reach a final conclusion as to whether the two clusters represent one or two species, further analyses are needed. For example, it would be interesting to build a species tree with the inclusion of outgroups, which are not as closely related as clusters 1 and 2 are to each other.

Acknowledgements

First of all I want to thank my supervisor Christer Erséus for giving me the opportunity to do my thesis in systematics, and for being a dedicated and helpful supervisor. I also want to thank Svante Martinsson for great guidance and for always taking time to answer my questions. Last but not least I want to thank Suzanna Persson and Mårten Eriksson for pleasant company and helpful advice. Thanks to all of you, I have learnt a lot during these weeks!

References

Achurra, A., and C. Erséus. 2013. DNA barcoding and species delimitation: the Stylodrilus heringianus case (Annelida : Clitellata : Lumbriculidae). Invertebrate Systematics 27: 118-128.

Alberts, B. et al. 2008. Molecular biology of the cell. Fifth ed. Garland Science, New York.

Baldwin, B. G. et al. 1995. The ITS region of nuclear ribosomal DNA - A valuable source of evidence on angiosperm phylogeny. Annals of the Missouri Botanical Garden 82: 247-277.

Baum, D. A., and K. L. Shaw. 1995. Genealogical perspectives on the species problem. Experimental and Molecular Approaches to Plant Biosystematics 53: 289-303.

Baum, D. A., and S. D. Smith. 2013. Tree thinking: an introduction to phylogenetic biology. Roberts and Company Publishers, Inc., Greenwood Village.

Bely, A. E., and G. A. Wray. 2004. Molecular phylogeny of naidid worms (Annelida : Clitellata) based on cytochrome oxidase I. Molecular Phylogenetics and Evolution 30: 50-63.

Bickford, D. et al. 2007. Cryptic species as a window on diversity and conservation. Trends in Ecology & Evolution 22: 148-155.

Brown, G. G., I. Barois, and P. Lavelle. 2000. Regulation of soil organic matter dynamics and microbial activity in the drilosphere and the role of interactions with other edaphic functional domains. European Journal of Soil Biology 36: 177-198.

Brown, S., G. Rouse, P. Hutchings, and D. Colgan. 1999. Assessing the usefulness of histone H3, U2 snRNA and 28S rDNA in analyses of polychaete relationships. Australian Journal of Zoology 47: 499-516.

Brown, W. M., M. George, and A. C. Wilson. 1979. Rapid evolution of animal mitochondrial-DNA. Proceedings of the National Academy of Sciences of the United States of America 76: 1967-1971.

Brown, W. M., E. M. Prager, A. Wang, and A. C. Wilson. 1982. Mitochondrial-DNA sequences of primates - Tempo and mode of evolution. Journal of Molecular Evolution 18: 225-239.

Clement, M., Q. Snell, P. Walker, D. Posada, and K. Crandall. 2002. TCS: estimating gene genealogies. In Parallel and Distributed Processing Symposium, International 2.

Collins, F. H., and S. M. Paskewitz. 1996. A review of the use of ribosomal DNA (rDNA) to differentiate among cryptic Anopheles species. Insect Molecular Biology 5: 1-9.

Dasmahapatra, K. K., M. Elias, R. I. Hill, J. I. Hoffman, and J. Mallet. 2010. Mitochondrial DNA barcoding detects some species that are real, and some that are not. Molecular Ecology Resources 10: 264-273.

De Queiroz, K. 2007. Species concepts and species delimitation. Systematic Biology 56: 879-886.

Dobzhansky, T. 1950. Mendelian populations and their evolution. American Naturalist 84: 401-418.

Dominguez, J. et al. 2015. Underground evolution: New roots for the old tree of lumbricid earthworms. Molecular Phylogenetics and Evolution 83: 7-19.

Drummond, A. J., and A. Rambaut. 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evolutionary Biology 7.

Drummond, A. J., M. A. Suchard, D. Xie, and A. Rambaut. 2012. Bayesian Phylogenetics with BEAUti and the BEAST 1.7. Molecular Biology and Evolution 29: 1969-1973.

Edwards, C. A. 2004. Earthworm ecology. Second ed. CRC Press, Florida.

Edwards, C. A., and P. J. Bohlen. 1996. Biology and ecology of earthworms. Third ed. Chapman & Hall, London.

Folmer, O., M. Black, W. Hoeh, R. Lutz, and R. Vrijenhoek. 1994. DNA primers for amplification of mitochondrial cytochrome c oxidase subunit I from diverse metazoan invertebrates. Molecular Marine Biology and Biotechnology 3: 294-299.

Futuyma, D. J. 2013. Evolution. Third ed. Sinauer Associates, Inc., Sunderland.

Hebert, P. D. N., A. Cywinska, S. L. Ball, and J. R. DeWaard. 2003. Biological identifications through DNA barcodes. Proceedings of the Royal Society B-Biological Sciences 270: 313-321.

Hennig, W. 1966. Phylogenetic systematics. University of Illinois Press, Urbana.

Jouquet, P., J. Dauber, J. Lagerlof, P. Lavelle, and M. Lepage. 2006. Soil invertebrates as ecosystem engineers: Intended and accidental effects on soil and feedback loops. Applied Soil Ecology 32: 153-164.

Karaca, A. 2011. Biology of earthworms. Springer, New York.

Klarica, J., A. Kloss-Brandstaetter, M. Traugott, and A. Juen. 2012. Comparing four mitochondrial genes in earthworms - Implications for identification, phylogenetics, and discovery of cryptic species. Soil Biology & Biochemistry 45: 23-30.

Lee, C. E., and B. W. Frost. 2002. Morphological stasis in the Eurytemora affinis species complex (Copepoda : Temoridae). Hydrobiologia 480: 111-128.

Marinucci, M., R. Romi, P. Mancini, M. Di Luca, and C. Severini. 1999. Phylogenetic relationships of seven palearctic members of the maculipennis complex inferred from ITS2 sequence analysis. Insect Molecular Biology 8: 469-480.

Martinsson, S., A. Achurra, M. Svensson, and C. Erséus. 2013. Integrative taxonomy of the freshwater worm Rhyacodrilus falciformis s.l. (Clitellata: Naididae), with the description of a new species. Zoologica Scripta 42: 612-622.

Masters, B. C., V. Fan, and H. A. Ross. 2011. Species delimitation - a Geneious plugin for the exploration of species boundaries. Molecular Ecology Resources 11: 154-157.

Matamoros, L., E. Rota, and C. Erséus. 2012. Cryptic diversity among the achaetous Marionina (Annelida, Clitellata, Enchytraeidae). Systematics and Biodiversity 10: 509-525.

Meyer, C. P., and G. Paulay. 2013. DNA Barcoding: Error Rates Based on Comprehensive Sampling. Figshare.

Neigel, J. E., J. C. Avise, S. Karlin, and E. Nevo. 1986. Phylogenetic relationships of mitochondrial DNA under various demographic models of speciation. Evolutionary processes and theory: 515-534.

Nelson, G., and N. Platnick. 1981. Systematics and biogeography: cladistics and vicariance. Systematics and biogeography: cladistics and vicariance: 1-567.

Perez-Losada, M., M. Ricoy, J. C. Marshall, and J. Dominguez. 2009. Phylogenetic assessment of the earthworm Aporrectodea caliginosa species complex (Oligochaeta: Lumbricidae) based on mitochondrial and nuclear DNA sequences. Molecular Phylogenetics and Evolution 52: 293-302.

Pop, A. A., M. Wink, and V. V. Pop. 2003. Use of 18S, 16S rDNA and cytochrome c oxidase sequences in earthworm taxonomy (Oligochaeta, Lumbricidae). Pedobiologia 47: 428-433.

Puillandre, N., A. Lambert, S. Brouillet, and G. Achaz. 2012. ABGD, Automatic Barcode Gap Discovery for primary species delimitation. Molecular Ecology 21: 1864-1877.

Rodrigo, A. et al. 2008. The perils of plenty: what are we going to do with all these genes? Philosophical Transactions of the Royal Society B-Biological Sciences 363: 3893-3902.

Rosen, D. E. 1979. Fishes from the uplands and intermontane basins of Guatemala: revisionary studies and comparative geography. Bulletin of the American Museum of Natural History 162: 269-375.

Rosenberg, N. A. 2007. Statistical tests for taxonomic distinctiveness from observations of monophyly. Evolution 61: 317-323.

Sims, R. W., and B. M. Gerard. 1985. Earthworms. Keys and notes for the identification and study of the species. The Linnean Society of London and the Estuarine and Brackish-Water Sciences Association London.

Tamura, K., G. Stecher, D. Peterson, A. Filipski, and S. Kumar. 2013. MEGA6: Molecular Evolutionary Genetics Analysis Version 6.0. Molecular Biology and Evolution 30: 2725-2729.

Templeton, A. R., K. A. Crandall, and C. F. Sing. 1992. A cladistic-analysis of phenotypic associations with haplotypes inferred from restriction endonuclease mapping and DNA-sequence data .3. cladogram estimation. Genetics 132: 619-633.

Van Valen, L. 1976. Ecological species, multispecies, and oaks. Taxon 25: 233-239.

Werle, E., C. Schneider, M. Renner, M. Volker, and W. Fiehn. 1994. Convenient single-step, one tube purification of PCR products for direct sequencing. Nucleic Acids Research 22: 4354-4355.

Appendix

Table 1. List of specimens of Aporrectodea longa used in the study, including identification number, COI cluster, collection site and collection date. SE = Sweden, NO = Norway, DK = Denmark.

GenBank accession numbers

Id no. COI cluster Collection site Collection date COI H3 ITS2

CE4219 2 Bromölla, Skåne, SE May 2008 XX000000 XX000000 XX000000

CE4224 2 Bromölla, Skåne, SE May 2008 XX000000 XX000000 XX000000

CE4226 1 Bromölla, Skåne, SE May 2008 XX000000 XX000000 XX000000

CE4890 1 Kinnekulle, Västergötland, SE Sep 2008 XX000000 XX000000 XX000000

CE4891 2 Kinnekulle, Västergötland, SE Oct 2008 XX000000 XX000000 XX000000

CE4951 2 Århus, Midtjylland, DK Oct 2008 XX000000 XX000000 XX000000

CE5124 2 Tjärnö, Bohuslän, SE Oct 2008 XX000000 XX000000 XX000000

CE5125 2 Tjärnö, Bohuslän, SE Oct 2008 XX000000 XX000000 XX000000

CE5155 1 Tjärnö, Bohuslän, SE Oct 2008 XX000000 XX000000 XX000000

CE5900 2 Vellinge, Skåne, SE Apr 2009 XX000000 XX000000 XX000000

CE5982 2 Nårunga, Västergötland, SE May 2009 XX000000 XX000000 XX000000

CE6259 2 Åkersberga, Uppland, SE June 2009 XX000000 XX000000 XX000000

CE6268 2 Uppsala, Uppland, SE June 2009 XX000000 XX000000 XX000000

CE6430 1 Östervåla, Uppland, SE June 2009 XX000000 XX000000

CE6987 1 Smögen, Bohuslän, SE Aug 2009 XX000000 XX000000

CE9148 1 Etelhem, Gotland, SE Aug 2010 XX000000 XX000000

CE9957 2 Lerum, Västergötland, SE Oct 2010 XX000000 XX000000 XX000000

CE10419 1 Torshälla, Södermanland, SE Oct 2010 XX000000

CE11874 2 Kristiansand, Vest-Agder, NO June 2011 XX000000 XX000000 XX000000

CE11908 2 Frikstad, Vest-Agder, NO June 2011 XX000000 XX000000 XX000000

CE13319 2 Halden, Östfold, NO Sep 2011 XX000000 XX000000 XX000000

CE13532 2 Kviberg, Västergötland, SE Oct 2011 XX000000 XX000000 XX000000

CE13638 1 Trondheim, Sør-Trøndelag, NO Nov 2011 XX000000 XX000000

CE14811 2 Horten, Vestfold, NO June 2012 XX000000 XX000000

CE15977 2 Göteborg, Västergötland, SE Sep 2012 XX000000 XX000000 XX000000

CE16494 1 Oslo, NO Oct 2012 XX000000 XX000000 XX000000

CE16495 2 Oslo, NO Oct 2012 XX000000 XX000000 XX000000

CE16496 1 Oslo, NO Oct 2012 XX000000 XX000000 XX000000

CE16497 2 Oslo, NO Oct 2012 XX000000 XX000000

CE16498 2 Oslo, NO Oct 2012 XX000000 XX000000

CE16502 2 Oslo, NO Oct 2012 XX000000 XX000000

CE16809 1 Sollentuna, Uppland, SE Nov 2012 XX000000 XX000000 XX000000

CE19799 1 Karlstad, Värmland, SE Aug 2013 XX000000 XX000000 XX000000

CE19800 2 Karlstad, Värmland, SE Aug 2013 XX000000

Table 2. Primers, their sequences and references.

Marker | Primer | Sequence 5'-3' | Reference
COI | LCO1490 | GGTCAACAAATCATAAAGATATTGG | (Folmer et al., 1994)
COI | HCO2198 | TAAACTTCAGGGTGACCAAAAAATCA | (Folmer et al., 1994)
COI | COI-E | TATACTTCTGGGTGTCCGAAGAATCA | (Bely and Wray, 2004)
H3 | H3F | ATGGCTCGTACCAAGCAGACVGC | (Brown et al., 1999)
H3 | H3R | ATATCCTTRGGCATRATRGTGAC | (Brown et al., 1999)
ITS2 | 606F | GTCGATGAAGAGCGCAGCCA | (Yingkui Liu, in litt.)
ITS2 | 1082R | TTAGTTTCTTTTCCTCCGCTT | (Yingkui Liu, in litt.)

Table 3. Mean values, standard deviations and p-values for the five morphological characters. Ranges included for number of body segments and body length. All p-values are >0.05, which means there is no significant difference between the two clusters in A. longa.

 | No. of body segments | Body length (cm)* | Position of male pores | Clitellum (no. of segments) | Clitellum (starting segment)
A. longa cluster 1, mean | 193.33 (161-208) | 11.80 (7-14.5) | 14.91 (14-15) | 8.20 (7-9) | 28.30 (27-30)
A. longa cluster 1, standard deviation | 13.96 | 2.09 | | 0.63 | 0.95
A. longa cluster 2, mean | 178.81 (140-214) | 12.03 (7.5-14.5) | 14.94 (14-15) | 7.94 (7-9) | 28.00 (25-29)
A. longa cluster 2, standard deviation | 20.31 | 2.06 | | 0.42 | 1.24
Statistical test, P-value <0.05 | 0.070 | 0.791 | | 0.195 | 0.742

Ranges in parentheses. * Fixed specimens.