novel universal primers for metabarcoding edna surveys of … · groups: i) vertebrates (excluding...

40
Elena Valsecchi 1 , Jonas Bylemans 2 , Simon J. Goodman 3 , Roberto Lombardi 1 , Ian Carr 4 , Laura Castellano 5 , Andrea Galimberti 6 , Paolo Galli 1,7 NOVEL UNIVERSAL PRIMERS FOR METABARCODING eDNA SURVEYS OF MARINE MAMMALS AND OTHER MARINE VERTEBRATES 1 Department of Environmental and Earth Sciences, University of Milano-Bicocca, Piazza della Scienza 1, 20126 Milan, Italy 2 Fondazione Edmund Mach, Via E. Mach 1, 38010 S. Michele all'Adige (TN), Italy 3 School of Biology, University of Leeds, Woodhouse Lane, Leeds LS2 9JT, United Kingdom 4 Leeds Institute of Molecular Medicine, St James’s University Hospital, Leeds LS9 7TF, United Kingdom 5 Acquario di Genova, Costa Edutainment SPA, Area Porto Antico, Ponte Spinola, 16128 Genoa, Italy 6 Department of Biotechnology and Biosciences, University of Milano-Bicocca, Piazza della Scienza 2, 20126 Milan, Italy 7 MaRHE Center, Magoodhoo Island, Faafu Atoll, Republic of Maldives Corresponding author: [email protected] ORCID ID 0000.0003.3869.6413 Key words: 12S, 16S, cetaceans, pinnipeds, fish, sea turtles Running title: Marine Vertebrate Universal Markers for eDNA Metabarcoding was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which this version posted September 5, 2019. . https://doi.org/10.1101/759746 doi: bioRxiv preprint was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which this version posted September 5, 2019. . https://doi.org/10.1101/759746 doi: bioRxiv preprint was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which this version posted September 5, 2019. . https://doi.org/10.1101/759746 doi: bioRxiv preprint was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which this version posted September 5, 2019. . https://doi.org/10.1101/759746 doi: bioRxiv preprint was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which this version posted September 5, 2019. . https://doi.org/10.1101/759746 doi: bioRxiv preprint was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which this version posted September 5, 2019. . https://doi.org/10.1101/759746 doi: bioRxiv preprint was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which this version posted September 5, 2019. . https://doi.org/10.1101/759746 doi: bioRxiv preprint was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which this version posted September 5, 2019. . https://doi.org/10.1101/759746 doi: bioRxiv preprint was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which this version posted September 5, 2019. . https://doi.org/10.1101/759746 doi: bioRxiv preprint was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which this version posted September 5, 2019. . https://doi.org/10.1101/759746 doi: bioRxiv preprint was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which this version posted September 5, 2019. . https://doi.org/10.1101/759746 doi: bioRxiv preprint was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which this version posted September 5, 2019. . https://doi.org/10.1101/759746 doi: bioRxiv preprint was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which this version posted September 5, 2019. . https://doi.org/10.1101/759746 doi: bioRxiv preprint was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which this version posted September 5, 2019. . https://doi.org/10.1101/759746 doi: bioRxiv preprint

Upload: others

Post on 07-Jun-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

Elena Valsecchi1, Jonas Bylemans2, Simon J. Goodman3, Roberto Lombardi1, Ian Carr4, Laura Castellano5,

Andrea Galimberti6, Paolo Galli1,7

NOVEL UNIVERSAL PRIMERS FOR METABARCODING eDNA SURVEYS OF MARINE

MAMMALS AND OTHER MARINE VERTEBRATES

1 Department of Environmental and Earth Sciences, University of Milano-Bicocca, Piazza della Scienza 1,

20126 Milan, Italy 2 Fondazione Edmund Mach, Via E. Mach 1, 38010 S. Michele all'Adige (TN), Italy 3 School of Biology, University of Leeds, Woodhouse Lane, Leeds LS2 9JT, United Kingdom 4 Leeds Institute of Molecular Medicine, St James’s University Hospital, Leeds LS9 7TF, United Kingdom 5 Acquario di Genova, Costa Edutainment SPA, Area Porto Antico, Ponte Spinola, 16128 Genoa, Italy 6 Department of Biotechnology and Biosciences, University of Milano-Bicocca, Piazza della Scienza 2,

20126 Milan, Italy 7 MaRHE Center, Magoodhoo Island, Faafu Atoll, Republic of Maldives

Corresponding author:

[email protected] ORCID ID 0000.0003.3869.6413

Key words: 12S, 16S, cetaceans, pinnipeds, fish, sea turtles

Running title: Marine Vertebrate Universal Markers for eDNA Metabarcoding

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

Page 2: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

ACKNOWLEDGEMENTS

We thank Fulvio Maffucci for providing marine turtles DNAa samples. Giudo Gnone of the Aquarium of

Genoa for allowing and supporting collection of controlled environmental eDNA samples. Antonia Bruno for

advises on filtering procedures of environmental samples and Anna Sandionigi for bioinformatic advice.

AUTHOR CONTRIBUTIONS

E.V. designed primer sets, planned the testing approach compiled the marine mammal guideline to the

primers’ use and wrote the manuscript; J.B. performed the in silico validation of the novel primer sets; R.L.

carried out samples collection, wet lab tests, eDNA amplifications and analysed controlled environment HTS

data; S.G. contributed to the design and implementation of the wet lab primer validation, provided support

and facilities for HTS analysis at UoL, and contributed to the drafting and editing of the manuscript; I.C.

provided HTS services and bioinformatics support; L.C. allowed and supported collection of water samples

from tanks of the Aquarium of Genoa structure; A.G. provided useful comments on manuscript; P.G.

enthusiastically supported and hosted the research study at UnMB.

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

Page 3: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

ABSTRACT

Metabarcoding studies using environmental DNA (eDNA) and high throughput sequencing (HTS) are rapidly

becoming an important tool for assessing and monitoring marine biodiversity, detecting invasive species, and

supporting basic ecological research. Several barcode loci targeting teleost fish and elasmobranchs have previously

been developed, but to date primer sets focusing on other marine megafauna, such as marine mammals have

received less attention. Similarly, there have been few attempts to identify potentially ‘universal’ barcode loci which

may be informative across multiple marine vertebrate Orders. Here we describe the design and validation of four

new sets of primers targeting hypervariable regions of the vertebrate mitochondrial 12S and 16S rRNA genes, which

have conserved priming sites across virtually all cetaceans, pinnipeds, elasmobranchs, boney fish, sea turtles and

birds, and amplify fragments with consistently high levels of taxonomically diagnostic sequence variation. ‘In

silico’ validation using the OBITOOLS software showed our new barcode loci outperformed most existing

vertebrate barcode loci for taxon detection and resolution. We also evaluated sequence diversity and taxonomic

resolution of the new barcode loci in 680 complete marine mammal mitochondrial genomes demonstrating that they

are effective at resolving amplicons for most taxa to the species level. Finally, we evaluated the performance of the

primer sets with eDNA samples from aquarium communities with known species composition. These new primers

will potentially allow surveys of complete marine vertebrate communities in single HTS metabarcoding

assessments, simplifying workflows, reducing costs, and increasing accessibility to a wider range of investigators.

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

Page 4: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

INTRODUCTION

The use of DNA fragments extracted from environmental sources (e.g. soil and water samples) is becoming a

well-established tool for monitoring biodiversity (Deiner et al. 2017; Jarman et al. 2018). Within the marine

environment, such eDNA surveys have been used to assess the diet of marine species (Deagle et al. 2010,

Peters et al.2015, McInnes et al. 2017), monitor the species diversity of marine communities (Port et al. 2016;

Sigsgaard et al. 2017), determine the presence/absence of invasive species (Borrell et al. 2017) and to obtain

estimates of population genetic diversity (Sigsgaard et al. 2016). Community biodiversity surveys from

eDNA (i.e. eDNA metabarcoding) rely on primers targeting specific taxonomic groups and high-throughput

sequencing to amplify and sequence barcoding regions from all species of interest (Creer et al. 2016). While

DNA metabarcoding primers have been developed to target several individual marine taxonomic groups (e.g.

elasmobranchs, teleost fish, cephalopods and crustaceans) (e.g. Jarman et al. 2016, Miya et al. 2015, Komai et

al. 2019, Bylemans et al. 2018), to date no primer sets have been specifically designed to maximize the

recovery and identification of marine mammals.

The few marine eDNA studies focussing on marine mammals used targeted species-specific assays (Foote et

al. 2012, Baker et al. 2018) or used universal fish specific primers as a proxy to assess the total vertebrate

biodiversity (e.g. Andruszkiewicz et al. 2017). All these approaches present at least one drawback when

aiming to detecting the presence of cetacean and pinniped species within an eDNA sample. Primer sets

designed for one group, i.e. ‘fish-specific’ primers, might amplify eDNA from other taxa, but unquantified

primer mismatches risk reduced detection rates and the introduction of biases in high throughput sequencing

(HTS) results (e.g. Elbrecht et al 2018). Where primers are designed for the species of interest, amplicon

target size may also be a consideration. The instability and short life span of eDNA molecules (e.g. Thomsen

et al. 2012), favours the use of short and informative (hypervariable) DNA regions, such as the mitochondrial

12S, 16S and cytochrome oxidase I (COI) genes, with amplicons of typically around 100-200 bp although

larger mtDNA fragments have been shown to be successfully amplified from eDNA samples (Deiner et al.

2017).

There is therefore a need to develop marine mammal-specific primers suitable for metabarcoding analysis

from marine eDNA samples. Mitochondrial 12S and 16S regions provide suitable targets since their sequence

variation provides good taxonomic resolution for macro-eukaryotes, while also maintaining conserved sites

across regions for siting primers (Deagle et al. 2014). The design criteria for such primers should be, a)

primer sites which are conserved among marine mammal groups, while amplifying hypervariable DNA

fragments for taxonomic resolution; b) where possible, identify marine mammal specific priming sites in

order to reduce cross-amplification with human DNA, thus lessening contamination risks from investigators,

swimmers, or other biological residues left by humans; c) for each primer set evaluate predicted binding

efficiency and amplicon sequence diversity for other marine vertebrates such as fish and sea turtles (when

necessary allowing for a single degenerate base per primer). This final point would provide to give a more

accurate understanding of primer specificity and suitability for use with other vertebrate groups. Primer sets

proven to have reliable affinity across multiple vertebrate Orders and could support more efficient and cost-

effective eDNA biodiversity surveys.

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

Page 5: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

In this paper we report on the development of novel “universal” marine vertebrate eDNA primers meeting the

above criteria, their ‘in silico’ validation against a large marine vertebrate NCBI-Genbank dataset, and their

successful initial application and validation with a high throughput sequencing analysis of environmental

water samples collected from a public marine aquarium. Finally, we test for inter- and intra-specific variation

over large mitogenomic data sets available for marine mammal species, presenting ready-to-use guidelines

for the accurate selection of primer set of choice in specific marine mammal studies.

MATERIALS AND METHODS

Initial design of primer sets

Seventy-one complete mitochondrial genome sequences, representative of most marine vertebrate groups

(fish, sea turtles, birds and marine mammals), were retrieved from GenBank and used for initial primer

development (Online Resource 1 in Supplementary Materials). The selected sequences represented 30 marine

vertebrate families, including most marine mammal families (all 3 Pinniped families, both Sirenian and 13

Cetacean families). The selection comprised all Cetacean species occurring in the Mediterranean Sea. In

addition, four human mitochondrial genomes representative of the four main human haplogroups (i.e.

haplotypes 16, 31, 33 and 52 in Ingman et al. 2000) were included (Online Resource 1) in order to design

primers with reduced amplification efficiency for human DNA. All sequences (n = 75) were aligned with the

online tool Clustal-Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/) with default parameters, and the

complete ribosomal 12S and 16S genes were isolated. Potential sites for metabarcoding primers were

identified by manually searching for suitable locations within alignments. Gene regions were considered

suitable for designing metabarcoding primers if they encompassed a short (80-230bp) highly variable

fragment, required for species discrimination, and were flanked by highly conserved sites for situating

primers. Where possible, (i.e. when enough intra-mammal variation was found in proximity of the priming

sites), we also tried to design, for each candidate locus, we also tried to design alternative primers which

minimised the probability of amplifying human targets, by ensuring mismatches between the primers and

human templates. Such variants could be preferentially used in studies specifically targeting marine

mammals.

Primer evaluation and validation

Three approaches were used to assess primer performance. Firstly, primers were evaluated in silico in two

steps: i) predicted primer binding and amplicon sequence diversity was assessed using the ecoPCR scripts

within the OBITOOLS software package (Ficetola et al. 2010; Boyer et al. 2016); and ii) 680 complete marine

mammal mitogenome sequences deposited in Genbank, were used to quantify sequence diversity for the

primer target regions within marine mammal Families, Genera and species, to provide recommendations on

taxonomic resolution utility of primer sets for specific taxa. Secondly, the performance of the primers was

evaluated in vitro using tissue derived DNA extracts with varying levels of degradation. Finally, eDNA

samples, obtained from tanks of known species composition at the Aquarium of Genoa (Italy) as a proxy for

‘real world’ environmental samples, were used to assess the metabarcoding performance of the primers.

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

Page 6: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

In silico primer evaluation

An in silico approach was used to assess the universality of the newly designed primers against all standard

nucleotide sequences in the NCBI-Genbank data repository (accessed April 2019) for three taxonomic

groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the

newly designed primers were compared against the commonly used 12S-V5 vertebrate metabarcoding

primers (Riaz et al. 2011) as a benchmark.

The ecoPCR script was used to simulate an in silico PCR and extract the amplifiable barcoding regions for

each primer pair while allowing for a maximum of 3 base-pair (bp) mismatches between the primers and

template DNA. Barcode regions shorter than 50 bp and longer than 400 bp were not considered.

Subsequently, the OBIGREP command was used to extract sequences that were reliably assigned to a species

level taxonomy. Ambiguous species level identifications (i.e. sequences with ‘sp.’, ‘aff.’, etc. in the definition

of the sequence) and nuclear pseudogenes were excluded from the final sequence database. The OBIUNIQ

command was then used to remove duplicate records for each species. Sequences were classified according to

their higher taxonomy (i.e. Vertebrates [excluding Cetacean species], Cetaceans and Invertebrates) before

summarising the data into a tabular format for further analyses using R version 3.5.2 (R Development Core

Team 2010). Finally, the taxonomic resolution of the different primers was assessed by splitting the data into

their higher taxonomy and running the ECOTAXSPECIFICITY script with three different thresholds for barcode

similarity (i.e. sequences were considered different if they have 1, 3 or 5 bp differences). The data obtained

from the in silico analyses were imported into R and the packages tidyverse (Wickham 2016) and gridExtra

(Auguie 2016) were used to construct summary figures to evaluate the taxonomic coverage, the specificity

and taxonomic resolution for each primer pair.

We downloaded 680 GenBank complete marine mammal mitogenome sequences from GenBank, and

evaluated levels of polymorphism within Families, Genera and species at the three proposed loci. Complete

12S and 16S genes were extracted from the retrieved sequences and aligned for each type of taxonomic

comparison and the number of variable sites recorded within the three loci amplicons were reported.

In vitro primer evaluation

Tissue derived DNA extracts were used to assess the performance of the newly designed primer pairs in vitro

and optimise amplification conditions. DNA extracts of diverse marine vertebrate groups (cetaceans,

pinnipeds, sea turtles and fish) were used as templates for PCR amplification (see Table 4). The 13 DNA

templates were purposely selected to have different levels of degradation, being extracted with different

techniques and spanning 1-31 years since extraction, in order to evaluate the ability of the primer to amplify

low-quality DNA. High quality DNA extracts were obtained from fresh samples (i.e. muscle, skin or blood)

using the Qiagen DNeasy Blood & Tissue extraction kit and following the manufacturers protocols. Low

quality DNA extracts consisted of phenol-chloroform extracted DNA which was over 25 years old or DNA

extracts obtained from boiling tissue samples in a buffer solution (Valsecchi 1998). For each primer pair and

each of the DNA extracts a (single, duplicate, triplicate) PCR reaction was performed consisting of 25 U/µl of

GoTaq G2 Flexi DNA polymerase (Promega); 1X Green GoTaq Flexi Reaction Buffer (Promega), 1.25-1.87

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

Page 7: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

mM MgCl2 (Promega); 0.2 mM dNTPs (Promega), 0.25-0.75 µM of each primer and dH2O to reach a final

volume of 20 µl. Thermal cycling conditions followed a touch-down PCR protocol with annealing

temperatures depending on primer pairs: 10/10/18 cycles at 54/55/56 °C for MarVer1; 10/10/18 cycles at

51/52/53 °C for MarVer2; 8/10/10/10 cycles at 54/55/56/57 °C for MarVer3 and finally 10/10/18 cycles at

59/60/61 °C for Ceto2. After an initial denaturation step of 4 min at 94 oC, each of the 38 cycles consisted of

a 30 s at 95 oC, 30 s at the primer specific annealing temperatures as described above, followed by 40 s at 72 oC. The final extension consisted of 5 s at 72 oC. To confirm the amplification of the desired amplicon, PCR

products were visually assessed using gel-electrophoresis and Sanger sequenced (GENEWIZ, UK; data not

shown).

Evaluation of primer performance with environmental samples

Environmental DNA samples derived from water collected from 6 tanks of the Aquarium of Genoa, Italy in

June 2018, were also employed to further validate the performance of the four primer sets. The tanks

contained from 1 to 14 known vertebrate species and were named after their main hosted species or typology:

1) “Manatee”, 2) “Dolphin”, 3) “Shark”, 4) “Seal”, 5) “Penguin” and 6) “Rocky shore” - a multispecies tank

hosting fish and invertebrates typical of Mediterranean rocky shores. Two tanks (Dolphin and Seal) were

single species, while the Shark and the Rocky shore tanks included a combination of cartilaginous and bony

fish.

For each tank, a total of 13.5 litres of water was collected from the water surface using a sterile graduated

2000ml glass cylinder, while wearing sterile gloves, and were stored within a 15-litre sterile container in

order to homogenise the water sample, and avoid stochastic variability due to sampling of small volumes. For

each tank, 3x 1.5 litre and 3x 3 litre replicates (total 6 replicates per tank) were then aliquoted from the larger

sample. To capture environmental DNA, immediately after aliquoting, each of the 6 replicates was then

filtered using individual 0.45 µm pore size nitrocellulose filters, using a BioSartâ 100 filtration system

(Sartorius). After filtering, membranes were placed on ice for transport to the University of Milano-Bicocca,

and subsequently stored at -20oC. Two weeks later environmental DNA was extracted from the filter

membranes using a DNeasy PowerSoil Kitâ (Qiagen), following the manufacturer’s protocol.

For each of the four primer sets developed in this study, PCR performance with the eDNA extractions, was

initially evaluated using the same PCR conditions as the ‘in vitro’ validation. Subsequently two of the primer

sets were selected for further use in a DNA metabarcoding analysis with high throughput sequencing using an

Illumina MiSeq platform. For each selected primer set (metabarcoding locus), indexed forward and reverse

sequencing primers were created comprising (5’ to 3’): an 8bp Illumina bar code tag - 4 random nucleotides -

amplification primer sequence, and sourced from Sigma, UK. Eight forward primer indexes were combined

with 12 reverse primer indexes, to allow pooling of amplicons from up to 96 uniquely identifiable samples

within a single sequencing library (Taberlet et al. 2018). Trial amplifications with the Illumina barcode

tagged primers suggested their yield and specificity was unchanged, and so the previously optimised PCR

conditions were used to generate amplicons for MiSeq sequencing. For each locus, eDNA was amplified in

triplicate in 40µl final PCR volume; 5µl of each replicate was used to check for successful amplification via

agarose gel electrophoresis, and the remainder combined to yield a single pool for each sample.

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

Page 8: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

Each sample amplicon pool was first assessed for fragment size distribution using an Agilent Tapestation, and

cleaned with AMPure beads (Beckman Coulter), following the manufacturer’s protocol, to remove primer

dimers. The cleaned samples were quantified with a Qubit fluorometer, and then for each metabarcoding

locus, separate Illumina NEBNext Ultra DNA libraries were generated, with the pooled samples in equimolar

ratios. Prior to sequencing each library was further assessed for fragment size distribution and DNA quantity

by Agilent Tapestation and Qubit fluorometer. The library for locus MarVer1 (see results) was sequenced in a

150bp paired-end lane, and locus MarVer3 (see results) in a 250bp paired-end lane, using an Illumina MiSeq

sequencer at the University of Leeds Genomics Facility, St James’s Hospital.

Bioinformatics for environmental HTS data

Paired reads were first screened for the presence of the expected primer and index sequence combinations to

exclude off-target amplicons. The reads were then combined to generate the insert sequence, and the

sequence of the random nucleotide region noted, such that only one instance of an insert per sample with the

sample random nucleotide finger print was saved to a sample specific file (i.e. to avoid PCR duplicates and

chimeric sequences). The insert data was aggregated to create a counts matrix containing the occurrence of

each unique sequence in each sample. The taxonomic origin of each insert was determined by Blasting their

sequence against a local instance of the Genbank NT database (Nucleotide, [Internet]). The level of

homology of insert to the hit sequence was noted, as was the species name of the hit sequence. The

taxonomic hierarchy for each unique insert was generated by searching a local instance of the ITIS database

(ITIS, [internet]) with the annotated Genbank species name. The count matrix and taxonomic hierarchy for all

annotated unique sequences were then aggregated into values for equivalent Molecular Operational

Taxonomic Units (MOTUs), by combining all inserts with a set homology (>=98%) to the Genbank hit at a

specified taxonomic level (i.e. ‘Order’, ‘Family’, ‘Genus’ or ‘Species’), using bespoke software (available on

request). Summaries and visualisations of read counts for different taxonomic levels were generated using the

R package ‘Phyloseq’ (McMurdie and Holmes, 2013).

RESULTS

Description of primer sets

From the initial evaluation of aligned marine vertebrate mitochondrial sequences, four novel primer sets were

identified, three within the 12S and one in the 16S genes (Table 1). The three ‘MarVer’ primer combinations

(2 in 12S and 1 in 16S) were developed to amplify the target regions in any of the 71 taxa selected as

representatives of marine vertebrates. To allow for variable sites between different vertebrate groups within

the primer sequence, single degenerate bases were introduced (Table 1) for 4 out of the 6 MarVer primers.

The fourth primer set ‘Ceto2’, was designed to preferentially amplify Cetacean DNA by minimizing base-

pair mismatches for cetacean species while maximizing base-pair mismatches for other vertebrate groups.

Online Resource 2 (Supplementary Material) shows variability at the priming sites across the eight marine

vertebrate categories. Amplicon variability across the 71 taxa (plus the 4 human sequences) recorded in the

regions targeted by the proposed markers are shown in Online Resource 3 (Supplementary Material).

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

Page 9: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

MarVer1

MarVer1 primer set (abbr. MV1) targets a hypervariable region of the 12S gene, amplifying a fragment of

about 199-212bp (Table 2). It partially overlaps with loci Tele02 (Taberlet et al. 2018), Tele03 (as named by

Taberlet et al. 2018, corresponding to MiFish-U in Miya et al. 2015) and Elas01 (as named by Taberlet et al.

2018, corresponding to MiFish-E in Miya et al. 2015), targeting bony and cartilaginous fishes (Figure 1). The

forward primer MarVer1F differs from the forward primers of previously described loci, in that by shifting 5-

12bp at the 5’ end, it skips variable sites distinguishing bony from cartilaginous fishes, while gaining, at the

3’ end, bases which are conserved across all surveyed marine vertebrates.

MarVer2

MarVer2 primer set (abbr. MV2) targets a variable region of the 12S gene, amplifying a smaller fragment of

85-96bp in the tested vertebrate taxa (Table 2). It is a modification of primer sets Tele01 (Valentini et al.

2016), targeting bony fishes, and Mamm01 (Taberlet et al. 2018), targeting mammals (Figure 1). The

MarVer2 reverse primer partially overlaps with AcMDB01R primer designed by Bylemans et al. 2018 for

freshwater fish Actinopterygii species. Both forward and reverse MarVer2 primers were designed with a

degenerate base to allow amplification in all the 71 tested vertebrate taxa (Table 2). Sequence variation

within MarVer2 amplicon did not resolve all 71 marine vertebrate taxa in the original alignment, with

sequences shared between two or more Odontocete species within the Delphinidae family; with Sousa

chinensis, Tursiops truncatus, Tursiops aduncus, Stenella coeruleoalba, Delphinus delphis and Delphinus

capensis sharing one amplicon sequence, and Peponocephala electra and Feresa attenuaae sharing another.

MarVer3

MarVer3 (abbr. MV3) amplifies a variable region of the 16S gene, producing amplicons ranging in size from

232 to 274bp in the 71 marine vertebrate taxa tested (Table 2). MarVer3 is partially covered by locus

Mamm02 (Taberlet et al. 2018, see Figure 1), but targets a fragment twice as long: for example, in

Odontocetes MarVer3 amplifies a 233bp fragment versus a 115bp amplicon for Mamm02. The reverse

primer (MarVer3R) was the only one of the presented oligonucleotides to be truly “universal”, as it was

found to be fully conserved across all tested marine vertebrates (Table 2). The MarVer3 amplicon sequence

resolved 69 of the 71 tested marine vertebrate species. The unresolved species also fall into the Delphinidae

family: Sousa chinensis and Tursiops truncatus sharing one amplicon sequence, and Tursiops aduncus and

Delphinus capensis sharing another.

Ceto2

Ceto2 primers were designed to complement the MarVer2 locus, amplifying the same 12S region, but with

nucleotides specific to Cetaceans (and excluding humans) in both forward and reverse primers, and

containing no degenerate bases (Table 1). Ceto2 amplicon sizes were 15bp longer than MarVer2, due to

shifting of the priming sites. The Ceto2 primer set is suggested as not suitable for use with marine mammals

other than Cetaceans due to primer site mismatches. The priming sites present one variable site in some

Pinnipeds with the New Zealand fur seal (Arctocephalus forsteri) exhibiting a variant at the 3’ end of the

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

Page 10: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

forward primer, while both the walrus (Odobenus rosmarus) and the harp seal (Phoca groenlandica)

presented a variant towards the 5’ end of the Ceto2R primer.

Primer evaluation and validation

In silico primer evaluation

For the different primer pairs and higher taxonomic groups, the total number of unique taxa for which the in

silico amplification recovered target sequences is given in Figure 2. The results show that the MarVer3

primer pair amplified DNA from the most vertebrate and cetacean taxa. However, this primer pair also

successfully amplifies the DNA of invertebrate species thus indicating that it has a low overall specificity to

the intended taxonomic targets (Figure 2). The three other primer pairs all amplified DNA from fewer target

taxa then the commonly used 12S-V5 primers.

The proportion of sequences amplified for each higher taxonomic group is a function of the bp-mismatches

between the primers and template DNA is shown in Figure 3. The results of the commonly used 12S-V5

primer pair show that a high number of vertebrate sequences have very few bp-mismatches while the

recovered non-vertebrate sequences generally have ≥ 5 bp-mismatches at both primer binding regions (Figure

3). A similar pattern can be observed for the MarVer2 primer pair although the bp-mismatches in the primer

binding regions for non-vertebrate sequences are slightly lower (i.e. ≥ 4 bp-mismatches between both primer

regions). The results of both the MarVer1 and MarVer3 primers shows that even with a low number of bp-

mismatches (i.e. ≥ 2 mismatches for both the forward and reverse primers) a significant proportion of the

amplified sequences belong to non-vertebrate taxa. Finally, the results of the Ceto2 primers show that this

primer pair will preferentially amplify Cetacean species as the bp-mismatches between Cetacean sequences

are lower compared to the number of bp-mismatches observed in the primer binding regions for most

vertebrate sequences (i.e. ≤ 2 bp-mismatches for the Cetacean sequences versus ≥ 3 bp-mismatches for most

vertebrate sequences).

The taxonomic resolution power of the different primer pairs was evaluated using both the Cetacean and

Vertebrate sequences and the results are summarised in Figure 4. Overall the commonly used 12S-V5

metabarcoding primers have a lower resolution capacity compared to our newly designed primers (for both

the Cetacean and Vertebrate taxa), with our primers assigning 20-25 % more sequences to the correct species

level taxonomy (Figure 4). For the newly designed primers, differences are observed in their taxonomic

assignment power, with MarVer1 and MarVer3 generally assigning a higher percentage of the sequences to

the correct family, genus and species level taxonomy (Figure 4).

Primer set resolution for Marine Mammal taxonomic detection

Table 3 shows the results of the comparison performed on 680 GenBank complete marine mammal

mitogenome sequences (GenBank accession numbers shown in Online Resource 4), in order to evaluate

levels of polymorphism within Families, Genera and species.

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

Page 11: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

Family level. All three targeted regions contained high genetic variability within the seven analysed marine

mammals Families (Pinnipeds (2), Mysticetes (1) and Odontocetes (4)). The DNA fragment amplified by

MarVer3 primer set (16S region) proved to be the most diverse, highlighting 59 variable sites within the

Phocoidea, and over 40 in the Otariidae, Ziphiidae and Delphinidae Families. The least variable locus of the

three was MarVer2 (thus also Ceto2). Nevertheless, this marker had high discriminatory power for genera

within Families, with the exception of the Monodontidae and Phocoenoidae families. Here both families

contain only 2 genera, which in both cases only differ by one basepair, and therefore may not be

automatically resolved as distinct MOTUs if a threshold of 98% homology is used in the annotation pipeline

(Table 3).

Genus level. Nine Genera, each including 2 to 5 species, were assessed for within-genus variability,

including pairwise con-generic species comparisons (Table 3). MarVer3 consistently revealed the highest

levels of polymorphism (23 of the 31 inter-generic pairwise comparisons show differences of at least 1bp),

while MarVer2 was least variable, showing no variation in 7 of the 31 pairwise comparisons (grey boxes in

Table 3). However, MarVer2 showed good power to resolve taxon identity in the Genus Mesoplodon

(Ziphiidae family). MarVer1 was the second most variable locus at this level, with the highest number of

variable sites in 10 out of the 31 congeneric-species pairwise comparisons. Within all three Tursiops spp.

pairwise comparisons MarVer1 was the locus showing the highest variability.

Species level. Genetic variability was investigated also below the nominal species level. This could be tested

only on the few species for which large enough sample sizes were available to evaluate intraspecific

variation. We assessed 14 marine mammal species for which mitogenomic data were available for a number

of individuals ranging from 2 (Megaptera novaeangliae and Dugong dugong) to 152 (Balaenoptera physalus)

individuals (Table 3). The 14 species were representative of seven marine mammal Families: Dugongidae (1

species), Otariidae (2 species), Phocidae (1 species), Balaenopteridae (2 species), Delphinidae (4 species),

Ziphiidae (3 species) and Phocoenidae (1 species). All sequences were retrieved from GenBank (see Online

Resource 4), with the exception of 22 unpublished Pusa caspica sequences (provided by SG). In only one of

the 14 species (Eumetopias jubatus, fam. Otariidae) were none of the three loci polymorphic (11 individuals

compared). Four of 14 species had only one informative locus: MarVer1 (12S) in two instances (Mesoplodon

europaeus and Phocoena phocoena), and MarVer3 (16S) in two other cases (Tursiops truncatus and Ziphius

cavirostris) (Table 3). In the Caspian seal (Pusa capsica), the killer whale (Orcinus orca) and the spinner

dolphin (Stenella longirostris), two of three loci showed polymorphism. In the remaining 6 species (Dugong

dugong, Arctocephalus forsteri, Balaenoptera physalus, Megaptera novaeangliae, Stenella attenuata,

Mesoplodon densirostris) all 3 loci presented polymorphisms: in some instances MarVer1 was found to be

the most informative locus (e.g. 12 variable sites in Balaenoptera physalus), in others MarVer3 (e.g. 6

variable sites in Stenella longirostris).

In vitro primer evaluation

PCR amplicons of the expected size were generated by the four primer sets with most of the 13 DNA extracts

(Table 4). The oldest DNA extract (31yo, humpback whale, Megaptera novaeangliae) was the worst

performing, yielding only very faint bands for MarVer1, MarVer3 and Ceto2, and failing to produce a band

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

Page 12: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

for locus MarVer2. All sea turtle extracts produced one strong artefact band (ca 161bp) and a fainter band in

the correct size range (ca 107bp) place of the correct sized fragment for locus Ceto2. Bony fish DNA extracts

also produced a band compatible in size with the Ceto2 amplicon.

Application to environmental samples

Amplicons of the expected size were obtained from all 36 water samples collected at the Genoa Aquarium

with the four primer sets. Locus Ceto2 amplified a faint band in the expected size range from the shark,

penguin and seal tanks, in addition to the bottlenose dolphin tank, suggesting either low level amplification

from non-cetacean taxa or potential transfer of dolphin eDNA between tanks (see below).

Loci MarVer1 and MarVer3 were selected for further HTS metabarcoding evaluation on the basis of the

bioinformatic prediction that they would be the most informative for Mediterranean marine megafauna

species (see below). For these loci after sequence quality filtering and demultiplexing, annotated read counts

per sample ranged from 682 to 52,478 for MarVer1 and 1,025 to 43,003 for MarVer3, with combined reads

per Tank of 10,750 to 158,950 for MarVer1, and 1,3251 to 232,015 for MarVer3 (see Table 5, Figure 5,

Online Resource 5).

The percentage of resident vertebrate species with amplicons annotated to the species level within each tank

ranged between 0% (Tank 6, Rocky shore) to 100% (Tank 2, Dolphin and Tank 5, Seal), mean 50.4% for

MarVer1; and between 66.6% (Tank 1, Manatee) to 100% (Tanks 4, 5, 6), mean 89.7%, for MarVer3 (see

Online Resource 5). Overall, for MarVer1 amplicons for 9 out of 27 resident taxa were annotated to the

species level, and for 22 resident species for MarVer3, (see Table 5, Figure 5, Online Resource 5).

Amplicons for the aquarium’s four frequently used vertebrate feed species (Culpea harengus, Mallotous

villosus, Merluccius productus, and Scomber scombrus), were detected in tanks 2 to 6 (in the Manatee tank,

feed consists of lettuce) using both loci, with the exception of MarVer1 failing to detect Merluccius

productus. Squid (unspecified species), is also supplied to the Dolphin Tank, but no cephalopod, amplicons

were recovered.

Amplicons for resident species were also detected in tanks other than their host tanks at low levels (e.g. with

MarVer3, dolphin and seal amplicons were detected in the manatee and shark, and penguin and rocky shore

tanks respectively), suggesting possible transfer of eDNA between tanks in the aquarium, e.g. via the

equipment used by staff members. Similarly, human amplicons were detected in all tanks, consistent with the

practice of aquarium staff entering the water for maintenance.

Both MarVer1 and MarVer3 identified amplicons (partially overlapping between loci and tanks) that were not

directly attributable to resident species or food sources (category B in Online Resource 5). These comprised 6

recurrent species, two of which were previously (but no longer) used as feed (Sardina pilchardus, Engraulis

encrasicolus), and four species present in the Ligurian Sea (e.g. Auxis rochei, Auxis thazard, Belone belone,

Coris julis) from which the water used to fill the tanks is drawn, after being filtered and UV irradiated. All of

these unexpected species detections were at low abundance, with read counts greater than 100 to a maximum

of 947 in at least 1 tank (range 0.3% to 3.7% of total reads with MarVer3 and MarVer1 respectively). Very

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

Page 13: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

low abundance amplicons (<100 reads per tanks) for at least 20 other Mediterranean resident teleost fish

species were also observed, but were not considered further as definitive detections.

Amplicons from invertebrate species (category C in Online Resource 5) were detected in two tanks by the

MarVer3 locus at low abundances (read count <100), which would normally be discounted as a detection.

These being, an Anthozoa species, Seriatopora hystrix, in the manatee tank, and a Sipunculid worm, of the

family Phascolosomatidae, in the seal tank (Online Resource 5). Neither taxa were in the tank in which their

traces were found. No invertebrate amplicons were recovered in the “rocky shore” tank which contains some

Anthozoa (e.g. Anemonia viridis), some unidentified sponges growing spontaneously and hydrozoans, or

from the shark tank where Aiptasia spp grows spontaneously. The other tanks (with the exception of dolphin

and manatee) may also contain other spontaneously growing small invertebrates, such as copepods,

amphipods and hydrozoans, but none were detected.

DISCUSSION

Comparison with previously described barcode primer sets

This study was conceived to identify cetacean specific barcode loci complementing the many already

available for fish species (e.g. Miya et al. 2015, Sato et al. 2018), for use in eDNA biodiversity monitoring

studies of Mediterranean marine vertebrates. However, in the primer design process, we realised that with

minor adaptations, it was possible to cover the whole range of marine vertebrates in a single HTS run,

potentially dramatically reducing costs for eDNA HTS biodiversity monitoring of pelagic vertebrates.

Most existing 12S/16S barcode primer sets (e.g. Taberlet et al. 2018, Miya et al. 2015, Bylemans et al. 2018)

were designed for particular vertebrate groups and thus contained conserved sequence elements specific for

their target taxonomic group. Most of these primer sets are partially overlapping, at least at one end, with the

ones presented in this study although they are never identical (Figure 1). Our proposed primer sets were also

found to be different from the universal 12S and 16S primers combinations described by Yang et al. (2014) -

although MarVer3 sits within Yang et al.’s 16S target fragment - however, their amplicon sizes were too

large (ca 430bp and ca 500bp for respectively the 12S and the 16S primer sets) to be easily employed in

eDNA studies using current short-read HTS technology.

The 12S-V5 primer set (Riaz et al. 2011, renamed Vert01 by Taberlet et al. 2018), is the only one previously

described as being specific for vertebrates. It is located adjacent to MarVer1 site (forward 12S-V5 primer

partially overlaps with reverse MarVer1 sequence, see Figure 1). Within our alignments, the 12S-V5 site was

not as variable as any of the three loci described in this paper. For example, looking at one of the most

variable Families considered in this study, the Ziphiidae (Odontocetes), our three loci (MarVer 1, 2 and 3)

recovered 37, 30 and 43 variable sites respectively (Table 3), while 12S-V5 highlighted only 13 within the

same pool of sequences. Given that 12S-V5 and MarVer2 primer pairs generate similar sized amplicons

(respectively 106bp and 96bp), MarVer2 detects more than twice (30) the number of variable sites found in

the 12S-V5 target sequence (13). Such performance difference was also highlighted in the in silico simulation

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

Page 14: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

(Figure 4). Moreover, within both forward and reverse 12S-V5 primer sites, polymorphisms were present

within the Ziphiidae Family, suggesting that the 12S-V5 primers may not be conserved across all vertebrate

classes. However, 12S-V5 performed better than MarVer1, MarVer2 and Ceto2 based on the total number of

unique taxa for which DNA was amplified in the in silico simulation (Figure 2). While the lower predicted

number of taxa recovered by these primers may indicate a failure to amplify DNA from some species, the

completeness of the reference database used will also influence the results given that all sequences records

were considered and not only the full mitochondrial genomes. Within the partial 12S sequences deposited on

GenBank the fragment including locus 12S-V5 might be over-represented, as most previous studies relied on

this marker.

Besides being highly conserved among Vertebrates, all three MarVer primer-sets were shown to be

potentially non-exclusive to Vertebrates when 2-4bp primer/template mismatches were allowed (Figure 3).

With reduced specificity, these primer pairs could potentially amplify unwanted non-vertebrate taxa. Given

the high number of vertebrate taxa recovered by the MarVer3 primers in silico (Figure 2), and the observation

that the majority of the non-vertebrate taxa recovered have ≥ 3 bp-mismatches in the primer binding regions

(Figure 3), this primer pair should still be valuable if stringent thermal cycling conditions are used during

PCR amplification. This was supported by the eDNA sequencing of aquarium samples where there was

minimal recovery of invertebrate amplicons from tanks known to contain invertebrate species. This suggests

the use of MarVer3 in Vertebrate biodiversity surveys would not be limited by potential homology with some

invertebrate sequences.

Performance of MarVer1 and MarVer3 primer sets with environmental samples

We evaluated the performance of MarVer1 and MarVer3 primer sets with water samples collected at the

Genoa aquarium, from tanks with known community compositions. Amplicons annotated to species level

were recovered for 81.5% and 37% of the 27 resident taxa for MarVer3 and MarVer1 respectively. For

MarVer3, the five ‘undetected’ species included Diplodus cervinus (Zebra seabream), Pterygoplichthys

gibbiceps (Armored catfish), two ray species (Dasiatis americana, Teaniura grabata), and Pristis zjisron

(Longcomb sawfish). In the case of D. cervinus, amplicons assigned to other non-resident Diplodus species

were observed, so eDNA from the species may have been present, but annotated as a congeneric. For the four

other ‘undetected’ species, there were no other incompletely (above species level) assigned reads at genus,

family, or order level which could be attributed to these taxa (for Pristis zjisron no matching reference

sequence for the MarVer3 region was available in Genbank). These four cases therefore appear to be genuine

non-detections. These are all bottom dwelling species, whereas our water samples were collected at the

surface, and therefore potentially we did not capture eDNA from these species.

For MarVer1, 9 out of 13 resident species with Genbank reference sequences covering the MarVer1 region

were detected successfully. Amplicons correctly assigned to the species level were not observed for the two

penguin species, Blackspot seabream (Pagellus bogaraveo), and longcomb sawfish, despite reference

sequences being available. For the Magellanic penguin, amplicons assigned to the congeneric Spheniscus

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

Page 15: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

demersus were observed, but no candidate incompletely or misattributed amplicons for the Gentoo penguin

(Pygoscelis papua) or two fish species were recorded.

For 14 species (51.6% cases), no Genbank reference sequences covering the MarVer1 region were available.

Reads assigned to the non-resident grouper Epinepehelus lanceolatus were observed, indicating that eDNA

from the three resident groupers may have been detected but attributed to a congeneric for which a reference

was available. For the remaining 11 species no other candidate amplicons attributable to related taxa were

recorded.

Our demultiplexing and annotation pipeline required an amplicon sequence homology of at least 98% with

other Molecular Operational Taxonomic Units (MOTUs) and with Genbank reference sequences. Therefore,

the lower assignment rate for MarVer1 compared to MarVer3 likely reflects the lower Genbank coverage for

the 12S region encompassed by MarVer1. In this case, reducing stringency (e.g. to 95% homology) may

increase annotation rates, allowing successful attribution of amplicons to genus and/or family level, but with

a requirement to check homology level for individual MOTUs before accepting species level assignments.

For both primer sets, annotation success rates would be expected to increase as taxonomic coverage of

Genbank reference sequences improves over time. Similarly, while MarVer3 was predicted to potentially

recover invertebrate amplicons when allowing for low levels of degeneracy at priming sites, few were

observed. Potentially this may also be accounted for by low coverage with reference sequences and the level

of stringency applied in the annotation pipeline. The annotation of the few observed invertebrate amplicons

from the aquarium samples should also be treated cautiously for the same reasons.

Optimising locus choice for different eDNA and taxon detection applications

The four loci described in this paper provide investigators with flexible options to target different barcode

markers depending on priorities for their study objectives, tailored to requirements for taxonomic breadth,

variation and resolution at different taxonomic levels, amplicon size where eDNA degradation is a concern

(e.g. Speller et al. 2016), and sensitivity to contamination from human DNA.

Overall the 16S based MarVer3, the largest amplicon product with an expected size of approximately 245bp,

has the highest potential taxonomic coverage across all vertebrates, and taxon resolution from species level

upwards. Our trial eDNA HTS assay using samples from aquarium tanks demonstrated the locus performed

well with environmental samples despite its relatively large amplicon size compared to the other loci.

However, the MarVer3 region showed limited variability at nominal species level compared to MarVer1 and

2 amplicons (e.g. in killer whale, Orcinus orca, no variability was found among 151 individuals tested, see

Table 3). This is consistent with lower rates of evolution in 16S compared to 12S genes, and suggests 12S

genes may be more informative for resolving intraspecific differences (see below).

The 12S based MarVer1, Marver2, and Ceto2, offer the advantage of smaller amplicon sizes (approximately

202bp, 89bp, 104bp respectively), which may be a consideration for applications requiring work with more

highly degraded eDNA templates (Nichols et al. 2018, Wei et al. 2018).

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

Page 16: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

Here we focused on evaluating taxonomic resolution with marine mammal groups, and found the relative

performance of the loci varied across different families. For instance, MarVer1 was more variable than

MarVer2 in most Cetacean species, but MarVer2 still had strong taxon resolution for species outside the

Delphinidae family, and in particular for the Ziphiidae. In a stretch of less than 100bp, the MarVer2 region

contained more polymorphic sites in the Ziphiidae (5 Genera assessed) than in the numerically and

morphologically more diverse Delphinidae Family (14 Genera assessed). This observation is likely due to the

older phylogenetic divergence of the Ziphiidae lineages, allowing more time for accumulation of fixed

differences between species, compared to the more recently derived Delphinidae family (Nikado et al. 2001).

MarVer2 was also more effective at resolving Pinnipeds of the genus Pusa compared to MarVer1. However,

the incomplete resolution of MarVer2 (and thus Ceto2), within the Delphinidae would preclude its use for

species level detection studies within the Mediterranean, one of our geographic areas of interest, since it

cannot differentiate bottlenose, common and striped dolphins, which are sympatric in this sea. There were

also notable performance differences for loci at the interspecific level within Families. For example, MarVer3

was the most polymorphic locus within the 104 spinner dolphin (Stenella longirostris) samples, but it was

monomorphic within the 151 killer whale (Orcinus orca) mitogenome sequences retrieved from GenBank,

see Table 3).

Resolution of intraspecific phylogeographic variation is likely to be best attempted with either more variable,

or longer target amplicons (e.g. d-loop region; Kunal et al 2013). However, the large Cetacean sequence

dataset we evaluated allowed us to test the potential of our loci to identify phylogeographically informative

variation, which could be used for simple haplotype clade determination with eDNA for some species

(Adams et al. 2019). For instance, within the MarVer1 amplicon in the 151 killer whale mitogenomes (Morin

et al. 2010, 2015 and Filatova et al. 2018), some variable sites were private to either the Transient clade or to

the AntB and AntC clades identified by the larger dataset.

The preliminary investigation of sequence variation in other large marine vertebrate groups (tuna and sea

turtle species), suggested our loci also have potential to be informative for species identification in those taxa.

While not assessed directly in this study, the MarVer loci may also prove to be useful as barcode markers for

terrestrial vertebrates given taxonomic conservation of the priming sites.

Finally, the high levels of diagnostic variation seen within MarVer loci amplicons offer potential for

designing additional species specific nested internal primers (Stoeckle et al. 2018). These might have utility

for species focused, non-sequencing based detection applications, such as quantitative PCR, or simple

agarose gel based amplicon visualisation when there is limited access to laboratory facilities, or funding.

MarVer2 and Ceto2, would be most suited for this in Cetaceans (due to the clustering of variable sites in

proximity of one of the priming site), while for Teleost species, MarVer3 would be the best candidate (due to

many in-del mutations in fish).

Conclusions

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

Page 17: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

This paper presents four novel primer sets targeting 12S and 16S vertebrate mitogenome regions, with a

particular focus on marine mammals. Using a combination of ‘in silico’ validation, and application to eDNA

samples from aquarium communities with known species composition, we show the loci to have high

potential for metabarcoding and eDNA studies targeting marine vertebrates. These primer sets have broader

taxonomic coverage and resolution compared to previously developed 12S and 16S primer sets, potentially

allowing surveys of complete marine vertebrate communities (including fish, sea turtles, bird and mammals)

in single HTS runs, simplifying workflows, reducing costs, and increasing accessibility to a wider range of

investigators. They may be applied in any context focusing on resolving vertebrate taxonomic identity, from

biodiversity surveys and forensics (e.g. CITES surveillance, or surveys of commercially targeted fish

species), through to behavioural ecology studies and supporting conservation of rare or endangered marine

vertebrate species.

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

Page 18: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

FUNDING

No funding was received for this work.

CONFLICT OF INTEREST

The authors declare that they have no conflict of interest.

REFERENCES

Adams CIM, Knapp M, Gemmell NJ, et al (2019) Beyond Biodiversity: Can Environmental DNA (eDNA)

Cut It as a Population Genetics Tool? Genes (Basel) 10:192. https://doi.org/10.3390/genes10030192

Andruszkiewicz EA, Starks HA, Chavez FP, et al (2017) Biomonitoring of marine vertebrates in Monterey

Bay using eDNA metabarcoding. PLoS One 12:1–20. https://doi.org/10.1371/journal.pone.0176343

Archer FI, Morin PA, Hancock-Hanser BL, et al (2013) Mitogenomic Phylogenetics of Fin Whales

(Balaenoptera physalus spp.): Genetic Evidence for Revision of Subspecies. PLoS One 8:e63396.

https://doi.org/10.1371/journal.pone.0063396

Auguie, B. (2016). gridExtra: miscellaneous functions for “grid” graphics. R package version, 2(1), 242.

Baker CS, Steel D, Nieukirk S, Klinck H (2018) Environmental DNA (eDNA) From the Wake of the Whales:

Droplet Digital PCR for Detection and Species Identification. Front Mar Sci 5:1–11.

https://doi.org/10.3389/fmars.2018.00133

Borrell YJ, Miralles L, Do Huu H, et al (2017) DNA in a bottle - Rapid metabarcoding survey for early alerts

of invasive species in ports. PLoS One 12:1–17. https://doi.org/10.1371/journal.pone.0183347

Boyer F, Mercier C, Bonin A, et al (2016) obitools: A unix-inspired software package for DNA

metabarcoding. Mol Ecol Resour 16:176–182. https://doi.org/10.1111/1755-0998.12428

Bylemans J, Gleeson DM, Hardy CM, Furlan E (2018) Toward an ecoregion scale evaluation of eDNA

metabarcoding primers: A case study for the freshwater fish biodiversity of the Murray–Darling Basin

(Australia). Ecol Evol 8:8697–8712. https://doi.org/10.1002/ece3.4387

Creer S, Deiner K, Frey S, et al (2016) The ecologist’s field guide to sequence-based identification of

biodiversity. Methods Ecol Evol 7:1008–1018. https://doi.org/10.1111/2041-210X.12574

Deagle BE, Chiaradia A, McInnes J, Jarman SN (2010) Pyrosequencing faecal DNA to determine diet of

little penguins: Is what goes in what comes out? Conserv Genet 11:2039–2048.

https://doi.org/10.1007/s10592-010-0096-6

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

Page 19: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

Deagle BE, Jarman SN, Coissac E, et al (2014) DNA metabarcoding and the cytochrome c oxidase subunit I

marker: Not a perfect match. Biol Lett 10:2–5. https://doi.org/10.1098/rsbl.2014.0562

Deiner K, Renshaw MA, Li Y, et al (2017) Long-range PCR allows sequencing of mitochondrial genomes

from environmental DNA. Methods Ecol Evol 8:1888–1898. https://doi.org/10.1111/2041-210X.12836

Elbrecht V, Hebert PDN, Steinke D (2018) Slippage of degenerate primers can cause variation in amplicon

length. Sci Rep 8:1–5. https://doi.org/10.1038/s41598-018-29364-z

Ficetola GF, Coissac E, Zundel S, et al (2010) An in silico approach for the evaluation of DNA barcodes.

BMC Genomics 11:434. https://doi.org/10.1186/1471-2164-11-434

Filatova OA, Borisova EA, Meschersky IG, et al (2018) Colonizing the wild west: Low diversity of complete

mitochondrial genomes in Western North Pacific killer whales suggests a founder effect. J Hered 109:735–

743. https://doi.org/10.1093/jhered/esy037

Foote AD, Thomsen PF, Sveegaard S, et al (2012) Investigating the Potential Use of Environmental DNA

(eDNA) for Genetic Monitoring of Marine Mammals. PLoS One 7:2–7.

https://doi.org/10.1371/journal.pone.0041781

Ingman M, Kaessmann H, Pääbo S, Gyllensten U (2000) Mitochondrial genome variation and the origin of

modem humans. Nature 408:708–713. https://doi.org/10.1038/35047064

ITIS (2018). The Integrated Taxonomic Information System on-line database, http://www.itis.gov.

Data retrieved 2018-12-18.

Jarman SN, Berry O, Bunce M (2018) The value of environmental DNA biobanking for long-term

biomonitoring. Nat. Ecol. Evol. 2

Jarman SN, Redd KS, Gales NJ (2006) Group-specific primers for amplifying DNA sequences that identify

Amphipoda, Cephalopoda, Echinodermata, Gastropoda, Isopoda, Ostracoda and Thoracica. Mol Ecol Notes

6:268–271. https://doi.org/10.1111/j.1471-8286.2005.01172.x

Komai T, Gotoh RO, Sado T, Miya M (2019) Development of a new set of PCR primers for eDNA

metabarcoding decapod crustaceans. Metabarcoding and Metagenomics 3:1–19.

https://doi.org/10.3897/mbmg.3.33835

Kunal SP, Kumar G, Menezes MR, Meena RM (2013) Mitochondrial DNA analysis reveals three stocks of

yellowfin tuna Thunnus albacares (Bonnaterre, 1788) in Indian waters. Conserv Genet 14:205–213.

https://doi.org/10.1007/s10592-013-0445-3

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

Page 20: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

McInnes JC, Alderman R, Deagle BE, et al (2017) Optimised scat collection protocols for dietary DNA

metabarcoding in vertebrates. Methods Ecol Evol 8:192–202. https://doi.org/10.1111/2041-210X.12677

McMurdie PJ, Holmes S (2013) Phyloseq: An R Package for Reproducible Interactive Analysis and Graphics

of Microbiome Census Data. PLoS One 8:. https://doi.org/10.1371/journal.pone.0061217

Miya M, Sato Y, Fukunaga T, et al (2015) MiFish, a set of universal PCR primers for metabarcoding

environmental DNA from fishes: detection of more than 230 subtropical marine species. R Soc open Sci

2:150088. https://doi.org/10.1098/rsos.150088

Morin PA, Archer FI, Foote AD, et al (2010) Complete mitochondrial genome phylogeographic analysis of

killer whales (Orcinus orca) indicates multiple species. Genome Res 908–916.

https://doi.org/10.1101/gr.102954.109.908

Morin PA, Parson KM, Archer FI, et al (2015) Geographic and temporal dynamics of a global radiation and

diversification in the killer whale. Mol Ecol 24:3964–3979. https://doi.org/10.1111/mec.13284

Nichols R V., Vollmers C, Newsom LA, et al (2018) Minimizing polymerase biases in metabarcoding. Mol

Ecol Resour 18:927–939. https://doi.org/10.1111/1755-0998.12895

Nikado M, Matsuno F, Hamilton H, et al (2001) Retroposon analysis of major cetacean lineages: the

monophyly of toothed whales and the paraphyly of river dolphins. Proc Natl Acad Sci 98:7384–7389.

https://doi.org/10.1073/pnas.121139198

Nucleotide [Internet]. Bethesda (MD): National Library of Medicine (US), National Center for

Biotechnology Information; [1988] - [data retrieved 2018-12-18]. Available from:

https://www.ncbi.nlm.nih.gov/nucleotide/

Parsons KM, Everett M, Dahlheim M, Park L (2018) Water, water everywhere: Environmental DNA can

unlock population structure in elusive marine species. R Soc Open Sci 5:. https://doi.org/10.1098/rsos.180537

Peters KJ, Ophelkeller K, Bott NJ, et al (2015) Fine-scale diet of the Australian sea lion (Neophoca cinerea)

using DNA-based analysis of faeces. Mar Ecol 36:347–367. https://doi.org/10.1111/maec.12145

Port JA, O’Donnell JL, Romero-Maraccini OC, et al (2016) Assessing vertebrate biodiversity in a kelp forest

ecosystem using environmental DNA. Mol Ecol. https://doi.org/10.1111/mec.13481

Riaz T, Shehzad W, Viari A, et al (2011) EcoPrimers: Inference of new DNA barcode markers from whole

genome sequence analysis. Nucleic Acids Res 39:1–11. https://doi.org/10.1093/nar/gkr732

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

Page 21: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

Sato Y, Miya M, Fukunaga T, et al (2018) MitoFish and mifish pipeline: A mitochondrial genome database

of fish with an analysis pipeline for environmental DNA metabarcoding. Mol Biol Evol 35:.

https://doi.org/10.1093/molbev/msy074

Sigsgaard EE, Nielsen IB, Bach SS, et al (2016) Population characteristics of a large whale shark aggregation

inferred from seawater environmental DNA. Nat Ecol Evol 1:4. https://doi.org/10.1038/s41559-016-0004

Sigsgaard EE, Nielsen IB, Carl H, et al (2017) Seawater environmental DNA reflects seasonality of a coastal

fish community. Mar Biol 164:. https://doi.org/10.1007/s00227-017-3147-4

Speller C, Van Den Hurk Y, Charpentier A, et al (2016) Barcoding the largest animals on earth: Ongoing

challenges and molecular solutions in the taxonomic identification of ancient cetaceans. Philos Trans R Soc B

Biol Sci 371:. https://doi.org/10.1098/rstb.2015.0332

Stoeckle MY, Mishu M Das, Charlop-Powers Z (2018) Gofish: A versatile nested PCR strategy for

environmental DNA assays for marine vertebrates. PLoS One 13:1–17.

https://doi.org/10.1371/journal.pone.0198717

Taberlet P, Bonin A, Zinger L and Coissac E (2018) Environmental DNA For Biodiversity Research and

Monitoring. Oxford University Press

Thomsen PF, Kielgast J, Iversen LL, et al (2012) Detection of a Diverse Marine Fish Fauna Using

Environmental DNA from Seawater Samples. PLoS One 7:1–9. https://doi.org/10.1371/journal.pone.0041732

Valentini A, Taberlet P, Miaud C, et al (2016) Next-generation monitoring of aquatic biodiversity using

environmental DNA metabarcoding. Mol Ecol 25:929–942. https://doi.org/10.1111/mec.13428

Valsecchi E (1998) Tissue boiling: A short-cut in DNA extraction for large-scale population screenings. Mol

Ecol 7:1243–1245. https://doi.org/10.1046/j.1365-294x.1998.00379.x

Wei N, Nakajima F, Tobino T (2018) Effects of treated sample weight and DNA marker length on sediment

eDNA based detection of a benthic invertebrate. Ecol Indic 93:267–273.

https://doi.org/10.1016/j.ecolind.2018.04.063

Wickham H. (2016). ggplot2: Elegant Graphics for Data Analysis. useR. Springer.

Yang L, Tan Z, Wang D, et al (2014) Species identification through mitochondrial rRNA genetic analysis.

Sci Rep 4:1–11. https://doi.org/10.1038/srep04089

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

Page 22: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

Table 1 Sequences of the four described primer sets

Locus Primer ID Primer sequence (5'-3') Size Region Average amplicon size

MarVer1 MarVer1F CGTGCCAGCCACCGCG 16bp 12S ca. 202 bpMarVer1R GGGTATCTAATCCYAGTTTG 20bp

MarVer2 MarVer2F CCGCCCGTCACYCTC 15bp 12S ca. 89 bpMarVer2R CTTACCWTGTTACGACTT 18bp

MarVer3 MarVer3F AGACGAGAAGACCCTRTG 18bp 16S ca. 245 bpMarVer3R GGATTGCGCTGTTATCCC 18bp

Ceto2 Ceto2F CACGCACACACCGCCCG 17bp 12S ca. 104 bpCeto2R GTATGCTTACCTTGTTACGAC 21bp

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

Page 23: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

Table 2 Characteristics of the amplicons produced by each of the primer sets for the 71 marine vertebrate species used

for primers’ design. The overall amplicon size is the mean value of the amplicon sizes recorded in the eight marine

vertebrate groups analysed. Ceto2 locus is not included in this table as its characteristics are the same as MarVer2, with

the only exception of having the total amplicon size increased of 15bp compared with MarVer2 due to primers

positioning. The grey shading indicates, for each vertebrate group, rank of nucleotide p-distance values across the three

loci (the highest shown in darker grey)

MarVer01 CODO CMYS PINN SIRE STUR SBIR TELE ELAS Overall

approximate amplicon size (bp) 199 199 197 199 211 209 197 212 ca. 202

n variable sites 77 31 33 18 36 45 80 40 171resolved over tested species 36/36 12/12 3/3 2/2 6/6 2/2 7/7 3/3 71/71percentage of variable sites (%) 38.7 15.6 16.8 9.0 17.1 21.5 40.6 18.9 84.7nucleotide p-distance 0.084 0.042 0.097 0.084 0.059 0.157 0.176 0.090 0.196

MarVer02 CODO CMYS PINN SIRE STUR SBIR TELE ELAS Overall

approximate amplicon size (bp) 96 96 89 90 87 87 85 85 ca. 89

n variable sites 46 22 29 16 18 11 47 26 65resolved over tested species 30/36 12/12 3/3 2/2 6/6 2/2 7/7 3/3 65/71percentage of variable sites (%) 47.9 22.9 32.6 17.8 20.7 12.6 55.3 30.6 73.0nucleotide p-distance 0.056 0.028 0.146 0.125 0.075 0.047 0.175 0.109 0.154

MarVer03 CODO CMYS PINN SIRE STUR SBIR TELE ELAS Overall

approximate amplicon size (bp) 233 232 240 235 240 246 274 265 ca. 245

n variable sites 79 41 53 36 50 31 142 37 251resolved over tested species 34/36 12/12 3/3 2/2 6/6 2/2 7/7 3/3 69/71percentage of variable sites (%) 33.9 17.7 22.1 15.3 20.8 12.6 51.8 14.0 102.4nucleotide p-distance 0.052 0.040 0.098 0.081 0.054 0.058 0.151 0.046 0.167

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

Page 24: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

Table 3 Levels of polymorphism found at the three proposed loci in a series of marine mammals Families, Genera and

species for which complete mitogenomic data were available from GenBank (accession numbers listed in Appendix E,

with the exception of 22 Pusa capsica entries for which data are unpublished). Numbers in the three columns on the

right indicate the number of variable sites detected in the sequence targeted by the three primers sets in each given

comparison: clear boxes indicate polymorphism thus usefulness of the primer set to resolve taxa case, while grey boxes

show lack of diagnostic polymorphism and light grey boxes indicate lack of resolution at 98% homology threshold for

MOTUs assignment

a all mutations present in only 1 or 2 individuals, not diagnostic to detect populations or subspecies b some of these are mutations present in only 1 or 2 individuals c further 7 variable sites were found in a single individual (NZFS8, in Emami-Khoyi et al. 2016) and were not included in this list as they might be an artifact

Taxonomic level Group Taxa (sample size) Comparisons between MV1 MV2 MV3

Family level PINN Phocidae (10 Genera) Cystophora, Erignathus, Halichoerus, Hydrurga, Lobodon, Mirounga, Monachus, Phoca, 39 29 59Pusa, Leptonychotes

Otariidae (6 Genera) Arctocephalus, Callorhinus, Eumetopias, Neophoca, Phocarctos, Zalophus 31 22 42CMYS Balaenopteridae (2 Genera) Balaenoptera, Megaptera 23 14 25CODO Ziphiidae (5 Genera) Ziphius, Mesoplodon, Indopacetus, Hyperodon, Berardius 37 30 43

Monodontidae (2 Genera) Delphinapterus, Monodon 6 1 13Delphinidae (14 Genera) Cephalorhynchus, Sousa, Tursiops, Stenella, Delphinus, Lagenodelphis, Legenorhynchus, 41 25 42

Grampus, Peponocephala, Feresa, Pseudorca, Orcinus, Globicephala, OrcaellaPhocoenidae (2 Genera) Neophocaena, Phocoena 13 1 10

Genus level PINN Phoca (4 species, 5 individuals) Phoca vitulina (1) vs Phoca larga (2) 0 2 2Phoca vitulina (1) vs Phoca groenlandica (1) 3 3 11Phoca vitulina (1) vs Phoca fasciata (1) 10 5 10Phoca larga (2) vs Phoca groenlandica (1) 3 2 11Phoca larga (2) vs Phoca fasciata (1) 11 4 10Phoca groenlandica (1) vs Phoca fasciata (1) 7 4 8

Pusa (3 species, 3 individuals) Pusa caspica (1) vs Pusa hispida (1) 1 1 4Pusa caspica (1) vs Pusa sibirica (1) 1 3 5Pusa hispida (1) vs Pusa sibirica (1) 0 2 1

Arctocephalus (3 species, 48 individuals) Arctocephalus forsteri (46) vs Arctocephalus pusillus (1) 8 5 4Arctocephalus forsteri (46) vs Arctocephalus townsendi (1) 4 3 5Arctocephalus pusillus (1) vs Arctocephalus townsendi (1) 8 7 4

CODO Tursiops (3 species, 23 individuals) Tursiops truncatus (16) vs Tursiops aduncus (5) 4 0 1Tursiops truncatus (16) vs Tursiops australis (2) 3 0 1Tursiops aduncus (5) vs Tursiops australis (2) 3 0 2

Delphinus (2 species, 2 individuals) Delphinus delphis (1) vs Delphinus capensis (1) 1 0 2Stenella (3 species , 176 individuals) Stenella coeruleoalba (2) vs Stenella attenuata (70) 3 1 3

Stenella coeruleoalba (2) vs Stenella longirostris (104) 1 0 4Stenella attenuata (70) vs Stenella longirostris (104) 2 0 2

Globipephala (2 species, 2 individuals) Globicephala melas (1) vs Globicephala macrorhynchus (1) 1 1 2Mesoplodon (5 species, 26 individuals) Mesoplodon europaeus (8) vs Mesoplodon densirostris (12) 8 9 11

Mesoplodon europaeus (8) vs Mesoplodon grayi (2) 10 9 13Mesoplodon europaeus (8) vs Mesoplodon ginkgodens (2) 7 7 11Mesoplodon europaeus (8) vs Mesoplodon stejnegeri (2) 12 10 12Mesoplodon densirostris (12) vs Mesoplodon grayi (2) 6 6 9Mesoplodon densirostris (12) vs Mesoplodon ginkgodens (2) 8 11 15Mesoplodon densirostris (12) vs Mesoplodon stejnegeri (2) 7 6 7Mesoplodon grayi (2) vs Mesoplodon ginkgodens (2) 10 9 12Mesoplodon grayi (2) vs Mesoplodon stejnegeri (2) 5 8 11Mesoplodon ginkgodens (2) vs Mesoplodon stejnegeri (2) 6 10 17

Neophocena (2 species, 12 individuals) Neophocena asiaeorientalis (4) vs Neophocena phocaenoides (8) 0 0 0

Species level SIRE Dugong dugong 2 individuals 1 2 2PINN Arctocephalus forsteri 46 individuals 2 2 5c

Eumetopias jubatus 11 individuals 0 0 0Pusa caspica 23 individuals 2a 2b 2

CMYS Balaenoptera physalus 152 individuals 12b 1 8b

Magaptera novaeangliae 2 individuals 1 1 1CODO Orcinus orca 151 individuals 3 3b 0

Stenella longirostris 104 individuals 4a 3b 6Stenella attenuata 70 individuals 3b 1 3Tursiops truncatus 16 individuals 1a 0 2b

Ziphius cavirostris 20 individuals 1a 1a 4Mesoplodon densirostris 12 individuals 1 1 4Mesoplodon europaeus 8 individuals 1 0 0Phocoena phocoena 5 individuals 1 0 0

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

Page 25: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

Table 4 List of tissue DNA extracts used for wet lab primer tests. “PC” indicates phenol chloroform extracts, “CK”

indicates commercial kit extracts, while “BE” refers to boil extracts as described in Valsecchi 1998. The signs “✓” and

“✗” specify successful amplification and PCR failure (no band or aspecific bands) respectively

Group Species Extract type Tissue Time since extraction MarVer1 MarVer2 MarVer3 Ceto2

CODO Stenella coeruleoalba PC skin 27yr � � � �

Stenella frontalis PC muscle 26yr � � � �

Pontoporia blainvillei BE skin 25yr � � � �

CMYS Megaptera novaeangliae PC skin 31yr � � � �

Eubalaena australis BE skin 24yr � � � �

PINN Puca caspica CK blood 1yr � � � �

STUR Lepidochelys kempii CK blood 1yr � � � �

Chelonia midas CK muscle 1yr � � � �

Lepidochelys olivacea CK blood 1yr � � � �

Caretta caretta CK blood 1yr � � � �

Caretta caretta CK muscle 1yr � � � �

TELE Thunnus albacares CK muscle 1yr � � � �

Merlangius merlangus CK muscle 1yr � � � �

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

Page 26: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

26

Table 5 Species composition of the 6 tanks of the Aquarium of Genoa from which water samples were collected for

extracting eDNA. The last four columns on the right show NGS outcomes for primer sets MarVer1 and MarVer3:

“GB” columns indicate whether reference sequences are present in GenBank for that specific species and locus; “■”

indicate successful species detection (half-full and empty square indicates a number of reads 0.001>nr>0.005 or

nr<0,001 respectively); △ denotes that a congenereric was identified in place of the resident vertebrate species, “?”

indicates possible detection of the resident species at a higher taxonomic group (this case together with those

instances for which reference sequences were available but the species remained undetected are discussed in the

text)

Tank Name Hosted species (n. individuals) Common name Vertebrate groupGB detection GB detection

1 MANATEE Trichechus manatus (4) Matatee SIRE yes ■ yes ■Piaractus brachypomus (3) Pirapitinga TELE yes ■ yes ■Pterygoplichthys gibbiceps (2) Armored catfish TELE no X yes X

2 DOLPHIN Tursiops truncatus (10) Bottlenose dolphin CODO yes ■ yes ■3 SHARK Ginglymostoma cirratum (3) Nurse shark ELAS yes ■ yes ■

Stegostsoma fasciatum (1) Zebra shark ELAS yes ■ yes ■Carcharhinus plumbeus (4) Grey shark ELAS yes ■ yes ■Dasiatis americana (1) Southern stingray ELAS no X no XPristis zjisron (2) Longcomb sawfish ELAS yes X yes XTeaniura grabata (2) Round stingray ELAS no X yes XEpinepehelus itajara (1) Atlantic goliath grouper TELE no � yes �Epinephelus costae (1) Goldblotch grouper TELE no � yes ■Epinephelus marginatus (2) Dusky grouper TELE no � yes ■Seriola dumerili (9) Greater amberjack TELE yes ■ yes �Dicentrarchus labrax (7) European seabass TELE yes ■ yes ■Diplodus sargus (4) White seabream TELE no X yes ■Diplodus cervinus (1) Zebra sea bream TELE no X yes �

Pagellus bogaraveo (1) Blackspot seabream TELE yes X yes ◻

4 PENGUIN Pygoscelis papua (12) Gentoo penguin SBIR yes X yes ■Spheniscus magellanicus (24) Magellanic penguin SBIR yes � yes ■

5 SEAL Phoca vitulina (6) Harbour seal PINN yes ■ yes ■6 ROCKY SHORE Diplodus vulgaris (21) Common two-banded seabream TELE no X yes ■

Diplodus sargus (3) White seabream TELE no X yes ■Sciaena umbra (3) Brown meagre TELE no ? yes ■Sarpa salpa (2) Salema TELE no X yes ■Serranus scriba (1) Painted comber TELE no X yes �Scyliorhinus stellaris (2) Nursehound ELAS no X yes ■

Feed species2,3,4,5,6 Culpea harengus Atlantic herring TELE yes ■ yes ■2,3,4,5,6 Mallotous villosus Capelin TELE yes ■ yes ■

3,6 Merluccius productus North Pacific hake TELE no X yes ■6 Scomber scombrus Atlantic mackerel TELE yes � yes ■2 Squid yes yes

MarVer1 MarVer3

was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (whichthis version posted September 5, 2019. . https://doi.org/10.1101/759746doi: bioRxiv preprint

Page 27: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

27

Fig. 1 Map of the regions amplified by the newly presented primer sets within the 12S (light grey) and 16S (dark grey) genes. The positions of some of most commonly

used barcode markers used for detecting vertebrate groups are shown for comparison. The size marker at the bottom refers to the 12S and 16S mtDNA fragment from

position 72 to position 2690 in the stripe dolphin (Stenella coeruleoalba) complete mitogenome (GenBank accession number NC_012053)

MarVer1 MarVer2 MarVer3Ceto2

23

4

561

7

0 bp 500 bp 1000 bp 1500 bp 2000 bp 2500 bp

Page 28: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

28

Fig. 2 The total number of unique taxa (y-axis) recovered by the different primer pairs (x-axis). The results

are shown for all three higher taxonomic groups (i.e. Vertebrates -excluding cetacean species-, Cetaceans and

Invertebrates) considered during the analyses (horizontal panels)

Page 29: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

29

Fig. 3 The proportion of sequences amplified for each higher taxonomic group as a function of the bp-

mismatches between the template DNA and the forward (y-axis) and reverse (x-axis) primers. The size of the

pie-charts is proportional (on a log scale) to the total number of sequences recovered for a given number of

bp-mismatches between the forward and reverse primer

Page 30: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

30

Fig. 4 The percentage of sequences correctly identified (y-axis) to the family, genus and species level taxonomy (x-

axis) for the different primer pairs. Results are shown for all sequences belonging to the Cetaceans and Vertebrates

(vertical panels) and using different threshold values for barcode similarity (i.e. sequences were considered different if

they have 1, 3 or 5 bp differences) (horizontal panels)

Page 31: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

31

Fig. 5 Taxon ‘abundance’ (read counts) for amplicons generated with the MarVer3 primer set for environmental DNA

extracted from water samples collected from 6 tanks at the Genoa Aquarium. Read counts are combined across the 6

replicate samples assayed for each tank, and seqeunce demultiplexing and amplicon annotation against Genbank

references was carried out using a threshold of 98% sequence similarity. Taxa presented in the figure were filtered to

exclude those with read counts less than 0.005*median read count across tanks. Taxa names with *, are feed species, all

others are resident

Page 32: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

32

Online Resource 1 List of 75 taxa used for primer sets’ design.

Label Family Species GenBank A.N.HOMI01 Hominidae Homo sapiens sapiens (UK) AY339432HOMI02 Hominidae Homo sapiens sapiens (JP) AY339469HOMI03 Hominidae Homo sapiens sapiens (AF1) AY339522.2HOMI04 Hominidae Homo sapiens sapiens (AF2) AY339578.1PINN01 Otariidae Arctocephalus forsteri KT693378PINN02 Odobenidae Odobenus rosmarus NC_004029PINN03 Phocidae Phoca vitulina NC_001325CMYS01 Balaenidae Balaena mysticetus NC_005268CMYS02 Balaenidae Eubalaena glacialis NC_037444CMYS03 Cetotheriae Caperea marginata NC_005269CMYS04 Eschrichtiidae Eschrichtius robustus NC_005270CMYS05 Balaenopteridae Megaptara novaeangliae NC_006927CMYS06 Balaenopteridae Balaenoptera acutorostroata NC_005271CMYS07 Balaenopteridae Balaenoptera bonaerensis NC_006926CMYS08 Balaenopteridae Balaenoptera edeni NC_007938CMYS09 Balaenopteridae Balaenoptera brydei NC_006928CMYS10 Balaenopteridae Balaenoptera borealis NC_006929CMYS11 Balaenopteridae Balaenoptera physalus NC_001321CMYS12 Balaenopteridae Balaenoptera musculus NC_001601CODO01 Physeteridae Physeter catodon NC_002503CODO02 Koogidae Kogia breviceps NC_005272CODO03 Ziphiidae Ziphius cavirostris NC_021435CODO04 Ziphiidae Berardius Bairdii NC_005274CODO05 Ziphiidae Hyperoodon ampullatus NC_005273CODO06 Ziphiidae Mesoplodon europaeus NC_021434CODO07 Ziphiidae Mesoplodon grayi NC_023830CODO08 Ziphiidae Mesoplodon ginkgodens NC_027593CODO09 Ziphiidae Mesoplodon stejnegeri NC_036997CODO10 Ziphiidae Mesoplodon densirostris NC_021974CODO11 Iniidae Inia geofrensis NC_005276CODO12 Lipotidae Lipotes vexillifer NC_007629CODO13 Pontoporiidae Pontoporia blainvillei NC_005277CODO14 Monodontidae Delphinapterus leucas NC_034236CODO15 Monodontidae Monodon monoceros NC_005279CODO16 Delphinidae Cephalorhynchus heavisidii NC_020696CODO17 Delphinidae Sousa chinensis NC_012057CODO18 Delphinidae Tursiops truncatus KF570323CODO19 Delphinidae Tursiops aduncus NC_012058CODO20 Delphinidae Stenella attenuata NC_012051CODO21 Delphinidae Stenella coeruleoalba NC_012053CODO22 Delphinidae Delphinus delphis MH000365CODO23 Delphinidae Delphinus capensis NC_012061CODO24 Delphinidae Lagenodelphis hosei MG599457CODO25 Delphinidae Legenorhynchus albirostris NC_005278CODO26 Delphinidae Lagenorhynchus obliquidens NC_035426CODO27 Delphinidae Grampus griseus NC_012062CODO28 Delphinidae Peponocephala electra NC_019589CODO29 Delphinidae Feresa attenuata NC_019588CODO30 Delphinidae Pseudorca crassidens NC_019577CODO31 Delphinidae Orcinus orca NC_023889CODO32 Delphinidae Globicephala melas NC_019441CODO33 Delphinidae Globicephala macrorhynchus NC_019578CODO34 Delphinidae Orcaella brevirostris NC_019590CODO35 Phocoenidae Neophocaena phocaenoides NC_021461CODO36 Phocoenidae Phocoena phocoena NC_005280SIRE01 Trichechidae Trichecus manatus NC_010302SIRE02 Dugongidae Dugong dugong NC_003314STUR01 Dermochelyidae Dermochelys coriacea MF460363.1STUR02 Cheloniidae Caretta caretta NC_016923.1STUR03 Cheloniidae Chelonia mydas NC_000886.1STUR04 Cheloniidae Eretmochelys imbricata DQ533485.1STUR05 Cheloniidae Lepidochelys olivacea DQ486893.1STUR06 Cheloniidae Natator depressa NC_018550.1SBIR1 PhalacrocoracidaePhalacrocorax carbo NC_027267.1SBIR2 Laridae Larus crassirostris NC_025556.1TELE01 Scombridae Thunnus thynnus KF906720.1 TELE02 Scombridae Scomber scombrus NC_006398.1TELE03 Moronidae Dicentrarchus labrax NC_026074.1TELE04 Clupeidae Sardina pilchardus NC_009592.1TELE05 Engraulinae Engraulis encrasicolus NC_009581.1TELE06 Salmoninae Salmo salar NC_001960.1TELE07 Xiphiidae Xiphias gladius NC_012677.1ELAS01 Carcharhinidae Sphyrna mokarron NC_035491.1ELAS02 Carcharhinidae Prionace glauca NC_022819.1ELAS03 Alopiidae Carcharodon carcharias NC_022415.1

Page 33: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

33

Online Resource 2 Variable sites at the priming sites of the described loci for each of the eight marine vertebrate groups considered in this study (71 taxa considered): CODO= Cetacean Odontocetes; CMYS= Cetacean Mysticetes; PINN= Pinnipeds; SIRE= Sirenians; STUR= Sea Turtle; SBIR= Sea Birds; TELE= Teleosts or bony fish; ELAS= Elasmobranch or cartilaginous fish. Star signs (*) indicate variable sites within the corresponding vertebrate group. Locus Primer Primer Sequence (5'-3') vertebrate groupMarVer1

T * SBIRMarVer1F CGTGCCAGCCACCGCGMarVer1R GGGTATCTAATCCYAGTTTG

C CODO C CMYS C PINN C SIRE T STUR C SBIR C TELE T ELAS

MarVer2 C H. sapiens T ELAS * TELE C SBIR C STUR C SIRE C PINN C CMYS C CODO

MarVer2F CCGCCCGTCACYCTCMarVer2R CTTACCWTGTTACGACTT

T CODO T CMYS T PINN T SIRE T STUR T SBIR A TELE A ELAS A H. sapiens

MarVer3 A H. sapiens A ELAS A TELE G SBIR G STUR A SIRE A PINN A CMYS A CODO

MarVer3F AGACGAGAAGACCCTRTGMarVer3R GGATTGCGCTGTTATCCC

Ceto2 G T H. sapiens ** ELAS * TELET* T T STUR

Ceto2F CACGCACACACCGCCCGCeto2R GTATGCTTACCTTGTTACGAC

* PINN CA A SBIR CA A TELE CA A ELAS CA H. sapiens

Page 34: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

34

Online Resource 3 Extraction of variable sites for loci MarVer1, MarVer2, and MarVer3 for the 75 taxa used for primers design. MarVer1

111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111222222222222222 111122222223333333444444445555666666667777777777888888888899999999000000000111111112222222223333333333444444444455555555666666666777777777788888888889999999999000000111111223 1018914567891345678012345675678234567890123456789012345678901234789013456789012345670124567890123456789012345678901236789012356789012345678901234567890123456789046789123456247 HOMI01 CCATCCGATTAACAAGTCATAGAAGC-AAGATTTTAGAT-------CA--CCCCCTCCC--CAAAAGCTAAACTCACCTGAGTTAAAAACTCCAGTTGACAC--AAAATAGACTACGAGTGGCTTA--------ACA----------TATCT----GAACAATAGCAAGACC-GC HOMI02 ............................................................................................................................................................................... HOMI03 ............................................................................................................................................................................... HOMI04 ............................................................................................................................................................................... PINN01 .....T.........ACT.CG.GCC.A...C.AAA...........TT..A---TC-AA....C...T....T.T.A.CA..CC....G..A.C...A.T..........T..........A............CT...........CT.........TT.G............. PINN02 .....T.........A.T.C..GTT.A...T..AAG..........T...G---..-AA....C...T......T.G.CA..CC....G..G.C.C.A.T..........T..........A............TT...........AC..C......TT.GC............ PINN03 .....T.........ACT....GCC.T...C..AA...............A---.C-.A....C.........CT.A.CA..CC....G..A.C...A...T...........C.......A............CT...........AT.........CTGG........T.... CODO01 .....T....G.....CT....GCATA.G....AA...G.......TT..A---TA.AA..A.-...TC.GC..TGA..A.ACC....G.CAT....A.A.T.T..GC..A..C.......A...G........GT.C.........A........-.CTGGC.A.......... CODO02 .....T....G.....CT...AGCATA.....C.AG..A........C..A---.A.AA..A.-....C.GCT.TGA..A..C.....G.CAT...CA.A...C..G..............A.............T.C.........AG.........CT.GC............ CODO03 .....T....G..G.A.T......ATA......AA...A.......TT..A---TGTAA..A.-...T...T..T.AT.A..C..G..G.CAT.A..A.A...T...C..A..C.......A.............TCT.........A..........TT.GC.......T.... CODO04 .....T....G....A.T......ATA......AAG..G.......TC..A---.ATAA..A.-...TC..C..T.AT.A..C.....G.CAT.A..A.A...C..GC.............A.............T.T.........A..........TT.G.......GT.... CODO05 .....T....G....A.T......ATA......AA...A.......TC..G---TATAA..A.-...T...T..T.AT.A..C.....GTCAT.A..A.A.T.T...C..A..C...G...A.............T.C.........A..........CT.GA.A.....T.... CODO06 .....T.........A.T......ATA...G..AA...A.......T...G---TATAA..A.-...T...C..T.AT.A..C..G..G.CAT.A..A.A.T.T...C..A..C.......A.............T.C.........A........A.CT.G..A.......... CODO07 .....T....G....A.T......ATA......AA...A.......T...A---.A.AA..A.-...T...T..T.AT.A..C..G..G.CAT.A..A.A...T..GC..A..........A.............T.C.........A..........CT.G..A.....T.... CODO08 .....T.........A.T......ATA......AA...A.......TG..A---TGTAA..A.-G..T...C..T.AT.A..C..G..G.CAT.A..A.A...T...C..A..C.......A.............T.T.........A..........CT.GC.A.......... CODO09 .....T....G....A.T......ATA......AA...A.......TG..A---.A.AA..A.-...T...C..T.AT.A..C..G..G.CAT.A..A.AG..T..GCC.A..C.......A.............TGT.........A..........CT.GC.A.....T.... CODO1O .....T....G....A.T......ATA......AA...A.......T...A---TA.AA..A.-...T...T..T.AT.A..C.....G.CAT.A..A.A...T...C..A..C.......A.............T.T.........G..........CT.G..A.....T.... CODO11 ...C.T.......G.A.T.....CATC.....CAAG..G.......TC..A---TA-TA..A.....TC..TGC..AT.A.AC.....GTCAT.A..A.T.TCT..GC.....C.......A............TT.T.........A.........C.T.GC......G..... CODO12 .....T.......G.A.T..G..CA.A.....AAAGA.A........T..A---AA-TA..A......C..TACT.AT.A..CC.G..G.CAT.A..A.A.T.T..GC..A..C.......A............T..C.........A...C......TT.GC.A.......... CODO13 .....T.......C.A.T.....CATT......GAG..........TC..A---TG-.A..A.....TC..TA...A....AC.....GTCAT....A.A..CT..G...AG.........A............CT.C.........A.........CGT.GC.A.....T.... CODO14 ...C.T....G..G.A.T....GCATC.....CAAG..A...........A---TA-.A..A.....TC.GCT.T.AT.A..C.....G.CAT.A..A.ATT.T..GCCGA..C......AA............C..T.........A.A........GT.GC.A.......... CODO15 .....T....G..G.A.T.....CATC.....CAAG..A...........A---TA-.A..A.....TC..C..T.AT.A..C.....G.CAT.A..A.ACA.T..GCCGA..C......AA............C..T.........A.A........GT.GC.A.....T.... CODO16 .....T....G..G.A.T.....CA.C.....CAAG..A...........A---TA.AA..A.....TC..C..T.AT.A..C.....G.CAT.A..A.A.T.T..GT..A.........AA............C..T.........A.A........GT.GC.A.......... CODO17 .....T....G....A.T.....CATC.....CAA...A...........A---TATAA..A.....TC..C..T.AT.A..C.....G.CAT.A..A.A.T.T..GT..A.........AA............C..T.........A.A.........T.GC.A.......... CODO18 .....T....G....ACT.....CA.C.....CAA...A...........A---TATAA..A.....TC..C..T.AT.A..C.....G.CAT.A..A.A.T.T..GT..A.........AA............C..T.........A.A.........T.GC.A.......... CODO19 .....T....G......T.....CA.C.....CAAG..A...........A---TATAA..A.....TC..C..T.AT.A..C.....G.CAT.A..A.A.T.T..GT..A.........AA............C..T.........A.A........GT.GC.A.......... CODO20 .....T....G....A.T.....CA.C.....CAA...A...........A---.ATAA..A.....TC..C..TGAT.A..C.....G.CAT.A..A.A.T.T..GT..A.........AA............C..T.........A.A........GT.GC.A.......... CODO21 .....T....G....A.T.....CA.C.....CAA...A...........A---TATAA..A.....TC..C..T.AT.A..C.....G.CAT.A..A.A.T.T..GT..A.........AA............C..C.........A.A........GT.GC.A.......... CODO22 .....T....G....ACT.....CA.C.....CAA...A...........A---TATAA..A.....TC.GC..T.AT.A..C.....G.CAT.A..A.A.T.T..GT..A.........AA............C..T.........A.A........GT.GC.A.......... CODO23 .....T....G....ACT.....CA.C.....CAA...A...........A---TATAA..A.....TC..C..T.AT.A..C.....G.CAT.A..A.A.T.T..GT..A.........AA............C..T.........A.A........GT.GC.A.......... CODO36 .....T....G......T.....CA.C.....CGAG..G.......T...A---.A-TA..A.....TC..T..T.GT.A..C.....G.CAT.AC.A.AC..T..GC..A..C......AA............T..C.........A.A.C......GT.GC.A.......... CODO24 .....T....G....A.T.....CA.C.....CAA..GA...........A---TATAA..A.....TC..C..T.AT.A..C.....G.CAT.A..A.A.T.T..GT..A.........AA............C..T.........A.A........GT.GC.A.......... CODO25 .....T....G....A.T.....CATC.....CAAG..A........T..A---TA.AA..A.....TC..C..T.AT.A..C.....G.CAT.A..A.A.T.T..GT..A.........AA............T..T.........A.A.........T.GC.A.......... CODO26 .....T....G..G.A.T.....CA.C.....CAAG..A...........A---TATAA..A.....TC..C..T.AT.A..C.....G.CAT.A..A.A.T.T..GT..A.........AA............C..T.........A.A........GT.GC.A.......... CODO27 .....T....G....A.T.....CA.C.....CGAG..A...........A---TATAA..A.....TC.GC..T.AT.A..C.....G.CAT.A..A.A.T.T..GT..A.........AA............C..T.........A.A.C......GT.G..A.......... CODO28 .....T....G....A.T...A.CA.C.....CAAG..A...........A---TATAA..A.....TC..C..T.AT.A..C.....G.CAT.A..A.A.T.T..GT..A.........AA............C..T.........A.A.C......GT.G..A.......... CODO29 .....T....G....A.T...A.CA.C.....CAAG..A...........A---TATAA..A.....TC..CT.T.AT.A..C.....G.CAT.A..A.A.T.T..GT..A.........AA............C..T.........A.A.C......GT.G..A.......... CODO30 .....T....G....A.T.....CA.C.....CAAG..A...........A---TA.AA..A.....TC..C..T.AT.A..C.....G.CAT.A..A.A.T.T..GT..A.........AA............C..T.........A.A.C......GT.G..A.......... CODO31 .....T....G....A.T.....CA.C.....CAA...A........G..A---TATAA..A.....T..GC..T.AT.A..C.....GTCAT.A..A.A.T.T..GT..A.........AA............C..T.........A.A........GT.GC.A.....T.... CODO32 .....T....G....A.T...A.CA.C.....CGAG..A...........A---TATAA..A.....TC..C..T.AT.A..C.....G.CAT.A..A.A.T.T..GT..A.........AA............C..T.........A.A.C.......T.G..A.......... CODO33 .....T....G....A.T...A.CA.C.....CAAG..A...........A---TATAA..A.....TC..C..T.AT.A..C.....G.CAT.A..A.A.T.T..GT..A.........AA............C..T.........A.A.C.......T.G..A.......... CODO34 .T...T....G....A.T.....CATC.....CAAG..A........C..A---TATAA..A.....TC..C..T.AT.A..C.....G.CAT.A..A.A.T.T..GT..A..........A............C..T.........A.A.C......GT.G..A.......... CODO35 .....T....G..G..CT.....CATA...G.CAAG..G.......T...G---TA-TA..G.....TC..T..T.G..A..C.....G.CAT.AC.A.GC..T..GC..A..C......AA............C..C.........A.A.C......GT.GC.A.......... CMYS01 .....T.........A.T......A.A......AAG..G.......TC..A---.GTAA..A.-...TC.GC..T.AT.A..C.....G.C.T.A..A.A...T..GCC.A..........A.............T.T.........A..........TT.GC............ CMYS02 .....T.........A.T....G.ATA......AAG..G.......TT..A---.ATAA..A.-...TC.GC..T.AT.A..C.....G.C.T.A..A.A...T..GCC.A..........A.............T.T.........A..........TT.GC.......T.... CMYS03 .....T.........A.T......A.A......AA...A.......TC..G---TATA...A.-...TC..C..T.AT.A..C.....G.C.T.AC.A.A.A.T..GCC.A..........A.............T.T.........G..........T..GC......G.T... CMYS04 .....T....G....A.T......ATA......AA...G.......TC..A---.ATAA..A.-...TC..C..T.AT.A..C.....G.C.T.A..A.A...T..GCC.A..........A.............T.T.........G........A.T..GC............ CMYS05 .....T....G....A.T....G.A.A......AAG..G.......TC..A---.ATAA..A.-...TC..C..T.AT.A..C.....G.C.T.A..A.A.T.T..GCC.A..........A...............T.........A.C........T..GC............ CMYS06 .....T....G....A.T........A......AAG..A........T..A---.ATAA..A.T...TC..T..T.AT.A..C.....G.C.T.A..A.A.T.T..GCC.A..........A............GT.T.........A...C......T..GC............ CMYS07 .....T....G....A.T........A......AAG..A.......TT..A---.ATAA..A.T...TC..T..T.AT.A..C.....G.C.T.A..A.A.T.T..GCC.A..........A...............T.........A...C....A.T..GC............ CMYS08 .....T.........A.T......A.A......AAG..A........T..A---.ATAA..A.-...TC..C..T.AT.A..C.....G.CAT.A..A.A...T..GC..A..........A.............T.T.........G...C....A.T..GC.......T.... CMYS09 .....T.........A.T......A.A......AAG..C.......TT..A---.A.AA..A.-...TC..C..T.AT.A..C.....G.CAT.A..A.A...T..GCC.A..........A.............T.T.........G...C....A.T..GC.......T.... CMYS10 .....T.........A.T......A.A......AAG..G.......TC..A---.A.AG..A.-...TC..C..T.AT.A..C.....G.CAT.A..A.A...T..GCC.A..........A.............T.T.........G...C....A.T..GC.......T.... CMYS11 .....T...CG....A.T........A......AAG..G........C..A---.ATGA..A.-...TC..C..T.AT.A..C.....G.C.T.A..A.A.T.T..GCC.A..........A.............T.T.........A..........T..GC.......T.... CMYS12 .....T.........A.T......A.A......AAG..G.......TC..T---.ATAG..A.-...TC..C..T.AT.A..C.....G.CAT.A..A.A.T.T..GCC.A..........A.............T.T.........G..........T..GC.......T.... SIRE01 ...C.T.........A.T...AGCAT.G..C...AG..G.......TC..G-A.AC-A...A.....T...TTAA.T..A..CC....G..AAG.A.T.G...A......AC.C.......A............C............GC..........T.G.....G...A... SIRE02 ...............A.T...AGTA..G..C...AG..G.......A...G-A..C-AT..A.....T...T.AG.T.CA..CC.G..G..AA..A.C.G...A......AT.C.......A............T............GC............G.........A... STUR01 ....TTA.GA.G...ACT.C..CTAAT..ATGC.A.ATA.......TT..TAT.AA.AA..A.T.G.TG.CCTAATA.CC.AC.C.TCGTAAAG..AT...T..-T..C.C....T..A.AA.C...........T.CTAACGAAACA..T.........CG.C.......A.A. STUR02 ....TTA.GA.G...ACT.C.ACCAAT..ATGC.A.ACA.......GCTCTAT.AATAA..A.T.GATG.CCAAGTA.CA.AC.C.TCGTAAA...ACGG.T..-T..C.C....T..A.AA.C...........T.C.AATGAAATA..T.........CG.C.......A.A. STUR03 ....TTA.GA.G...A.T.C.ACCAAT..ATAC.A.ACA.......TTTCTAT.AATAA..A.T.G.TG.CCAAGTA..C.AC.C.TCGTAAA...ACG..T..-C..C.C.T..T..A.AA.C...........T.C.AATGAAACA..T.........TG.T.......A.A. STUR04 ....TTA.GA.G...ACT.C.ACTAAT..ATGC.A.A.A.......A..CTATTAA.AA..A.T.G.TG.CCAAGTA.CA.AC.C.TCGTAAA...ACG..T..-T..C.C....T..A.AATC...........T.C.AACAAAATA..T.........CG.C.......A.A. STUR05 ....TTA.GA.G...ACT.C.ACCAAT..ATGC.A.ACA.......ACCCTAT.AATAA..A.T.G.TG.CTAAGTA.CA.AC.C.TCGTAAA...AC.A.T..-T....C....T..A.AA.C...........T.C.AATGAAACA.CT.........CG.C.......ATAA STUR06 ....TTA.GA.GT..ACT.C.ACCAAC..ATGC.A.ACA.......TTTATAT.AATAA..A.T.G.TG.TCAAGTA.CT.AC.C.TCGTAAA...ACG..T..-C..C.C....T..A.AA.C...........T.C.AATGAAACA..T.........CG.C.......A.A. SBIR01 T.G...A.GAGG...A.T.C....ATT....G..ACATG........T..-AT.TC.A...A.CGG..C...TA..G.....C.C.T.G..TA.AC..CAC...T..GACCC.CCTCGA..A.CC.........G..CACACGATCCAT.AA....-.C.CCA....G.G..... SBIR02 T.....A.GAG....A.T.CC.-TATA....GCACTATG.......AT..-AT.ACAGT..G.CGGAT..G.TG..A..A..C.C.TCG.C.A..G..TACT..T..GATCG.CCTTGA..ATCC.........GT.CTTTTGATTGAT.TA....-.T.CGA....G.G.A... TELE01 ....TT..GAGG.....TGC...CA.....CG..A..G....ACA..C..-------AA..A.C....CG.CAC.TT.A.G.CAT.TCG.AT.C.AA.G....G..GCCCC..C...........-.........T.A.........ACC........C.CGA....T...A... TELE02 ....TT...AGG.....TGC...CC.....CG..AG.GA...AAATTT..-------AA..A.C....CG.CAC.TT.A.G.CAC.TCG..T.C.AA.G....G..GCCCCT.C...........-..........TA.........CCC.C......C.CGA....G...A... TELE03 ....TA.GGAGGT....TG....CA.....CG..A..GACCCAACTTT..-------TA....C....CG.CGCT...A...CAG.TCG.A-.C.AGAGT.A.G..G..CA.T.......A.............TTTA.........A.C........C.CGA......G.A... TELE04 ....TT..GAGG.T...TG.TT.-AT.....G..AT.GA...GAGTT...-------AA..A..G....G.GAC.T.TC.G.CC..TCG.AT.CAGAAGTT..G..TCACA.AC......A.............C.TT.........CT..CACCA...TCGC...GG...A... TELE05 ....TT..GAG..T...TG.T..-AG.....G..AT.GA...ATTTTT..-------TA...CC....AG..AC.T.TCA.AC.T.TCG.A..CAGA.GTTA.A..CCCCTTAC.......A............TTTT.........CGC..ACCA....CGA....G..TA... TELE06 ....TT..GAGG.T...TG..ACTA......G..AC.GA...AAAAT...-------TT..T......CG.CAC.C...C..CCC.TCG.A..T.GG.G....G..G..CT........CA..............TTA...........C..........CGC....C...A... TELE07 ....TT..GAGG.....TGC...CAA.....G..A..GA...AACCA...-------AA..A.C....CG.CGCTCT.A...C.T.TCG.AT.C.AA.GT.T.G..GCCCA..............-...........T.........C.C..........CGA.......AA... ELAS01 ....TT..G.G.T.TA.T.C.C.TC......A..AGAGAAAGACCT....-------AACCT.T...T.TG..CTCATAA..C.T.TCG.A..CA.GA.TGG.A.T..ACA..A.......A....TAGACCTA.GGA.........ATCT.......TGTGC..TGG.....A. ELAS02 ....TT..G...T.CA.T...C.CC......A..A.AGAATGACCTTT..-------AA.CT.C...T.TG..CTTAT.A.AC.T.TCGTATTCA.AA.TGG.A.T..ACA..A.......A.....CGAATTA.GGA.........ACCT.......TGTGC..TGG.....A. ELAS03 ....TT..G...T.CA.T...CTTC......A....AGAAGTATCTA...-------.AGTA.T...T..G..CTTATCA..C.C.CCG.A..CA.AA..GG.A.TT..CA..A.......A....TCTCACT..G.A.........ATCT.......TGTGC..T.GAC...A.

Page 35: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

35

MarVer2 11112222222222333333333344444444445555555555666666666677777777889 26790123456789012345678901234567890123456789012345678901234589346 HOMI01 CCTAAGTATAC--TTCAAA--------GGACATTTAACTAAAACCCCTACGCATTTATATAGACT HOMI02 ................................................................. HOMI03 ................................................................. HOMI04 ................................................................. PINN01 .....A..AT..ACC.---........CA.A.CAAT..AT...TTAAC.TAA.A.C...-....A PINN02 .....AC.ACA.CC..---........AA....AC.TAA.T.TGTA.A.AAATA.ATCT-...TA PINN03 .....A..A...TC.A---........CA.GC.AC.TAA--....AAC..AA.ACATAC-...TA CODO01 .........CA.CAG.--G.AACGCCCAAT.CAC...T.T.TG..GAGCGC-CCCC.C.-....A CODO02 ........CCA.CAGT--..AAGCTCCAAT..A.C...CC.CG.TAAATAA-T..C.C.-....A CODO03 ......C..TA.T.A.--C.AAGCCTAA.CT.GC....AC.TG..AAACA.-CCC.GC.-....A CODO04 ......C.CCA.CCG.--T.GAGCCC.A.CTTGC....AC.CG.TAAGCAA-GCA.GC.-....A CODO05 .....AC..TA.CCG.--T.AAGCCCTA.CT.GC....AC.CG..AAGCAA-CCC.GC.-....A CODO06 ......C.GTA.TCG.--C.AAGCTCTA.CT.GC....AC.CG..AAGCAA-C.C.GCG-....A CODO07 ......C..TA.TCA.--T.AAGCTC.A.CT.GC....AC.CG.TAAGTA.-C...GCG-...TA CODO08 ......C..TA.CCG.--T.AAGCCCTA.TT.AC....AC.TG..AAGCAA-C.C.GCG-...TA CODO09 ......C..TA.TCA.--T.AAACTC.A.CT.AC....AC.TG..AAATAA-C...GC.-....A CODO1O ......C..TA.TCA.--T.GAGCTC.A.CT.GC...TAC.TG..AAGTAA-C...GCG-....A CODO11 ........CCA.CAG.--T.AGGCCCAA.TTCCC....CC.TG..AAG-.AAC.AC.CT-....A CODO12 .........CA..AG.--..AGGCCCTA..TCAC....A..TG..AAG-.AAC.CC.C.-....A CODO13 .......GACA.CGA.--..AAGCCCTA.CTTAC....AC.TG.TAAGT.AACCAC.CG-....A CODO14 ........GCA.TAG.--G.AAGCCCTA..TTAC....CC.TG.TAAG-.AAG.A..C.-....A CODO15 ........GCA.TAG.--..AAGCCCTA..TCAC....CC.TG.TAAG-.AAG.A..C.-....A CODO16 .......GCCA.TAG.--T.GAGCCTCA..TCAC....CC.CG.TAAG-.AAGCG..C.-....A CODO17 .......GCCA.TAG.--..AAGCCCCA..TTGC....CC.TG.TAAG-.AAGCG..C.-....A CODO18 .......GCCA.TAG.--..AAGCCCCA..TTGC....CC.TG.TAAG-.AAGCG..C.-....A CODO19 .......GCCA.TAG.--..AAGCCCCA..TTGC....CC.TG.TAAG-.AAGCG..C.-....A CODO20 .......GCCA.TAG.--G.AAGCCCCA..TTGC....CC.TG.TAAG-.AAGCG..C.-....A CODO21 .......GCCA.TAG.--..AAGCCCCA..TTGC....CC.TG.TAAG-.AAGCG..C.-....A CODO22 .......GCCA.TAG.--..AAGCCCCA..TTGC....CC.TG.TAAG-.AAGCG..C.-....A CODO23 .......GCCA.TAG.--..AAGCCCCA..TTGC....CC.TG.TAAG-.AAGCG..C.-....A CODO36 .........CA.CAG.--..GAGCCTTA..TTAC...TCCCTG.TAAG-.AAGCA..C.-....A CODO24 .......GCCA.TAG.--..AAGCCCCA..TCAC....CT.TG.TAAG-.AAGCG..C.-....A CODO25 .......GCCA.TAG.--..AAGCCCCA..TCAC....CC.TG.TAAG-.AAGCG..C.-....A CODO26 .......GCCA.TAG.--..GAGCCCCA..TCAC....CC.TG.TAAG-.AAGCG..C.-....A CODO27 .......GCCA.CAG.--TAAAGCCTCA..TCAC....CCGTG.TAAG-.AAGCG..C.-....A CODO28 .......GCCA.TAG.---AAAGCCCCA..TCAC....CC.TG.TAAG-.AAGCG..C.-....A CODO29 .......GCCA.TAG.---AAAGCCCCA..TCAC....CC.TG.TAAG-.AAGCG..C.-....A CODO30 ........CCA.TAG.--.TAAGCCCCA..TCAC....CC.CG.TAAG-.AAGCG..C.-....A CODO31 ..C.....CCA.CAG.--..AAGCCCCA..TCGC....CC.TG.TAAG-.AGGCG..C.-....A CODO32 ........CCA.TAG.---AAGG.CCCA..TCAC....CC.TG.TAAG-.AAGCG..C.-.A..A CODO33 ........CCA.TAG.---AAAG.CCCA..TCAC....CC.TG.TAAG-.AAGCG..C.-.A..A CODO34 .......GCCA.TAG.--..AAGCCCTA..TTGC....CC.TG.TAAG-.AAGCG..C.-....A CODO35 .........CA.CAG.--..GAGCCCTA..TCAC...TCCCTG.TAAG-.AAGCA..C.-....A CMYS01 ........CC..TAG.--..AAGCCCCA.TTCG......C.CG..AAGCAA-TCA..CG-....A CMYS02 ........CC..TAG.--..AAGCCCCA.TTCG.....CCCCG.TAAGCAA-TCA..CG-....A CMYS03 ........C...CCG.-.T.AAGCCCCA.T.CA.....CC.GG..AAGCAATCCA..CG-....A CMYS04 ........CC..CAG.T.-.AAGCCCCA.T.CA.....CC.GG..AAGCAATTCG..C.-....A CMYS05 ........CC..CAG.T.-.AAACCTCA.TTCA....T.C.GG..AAGCAATT.A..CG-....A CMYS06 ........CC..TAG.T.-.GAGCCCCT.TTCA......C.GG..AAGCAATCCA..CG-....A CMYS07 ........CC..TAG.T.-.AAGCCCCA.T.CA......C.GG..AAGCAATC.A..CG-...TA CMYS08 ........CC..TAG.T.-.AAGCCCCA.TTCG......C.GG..AAGCAATC.A..CG-....A CMYS09 ........CC..TAG.T.-.AAGCCCCA.TTCG......C.GG..AAGTAATC.A..CG-....A CMYS10 ........CC..TAG.T.-.AAGCCCCA.TTCG......C.GG..AAGCAATCCA..CG-....A CMYS11 ........CC..CAG.T.T.AAACCCCA.TTCG......C.GG..AAGCAA-T.A..CG-....A CMYS12 ........CC..CAG.T.-.AAGCCACA.TTCA......C.GG..AAGCAA.T.A..CG-....A SIRE01 .....A..CC.ATAAT---........TAT.T.ACC..A..C.ATTT..ATA.CG...G-....A SIRE02 .........T.ATAA.---........CATTCCACC.TAT.TGAT....ATAGCA...G-....A STUR01 .....AA..CAAC.AA---........AC.ACCA...A-T.....A.A..AA.CA-.AT-.TG.A STUR02 .....A---CAATC.A---........AC.ACCA...A.T.....A.A..AA.CA..AT-.TGTA STUR03 .....AA..TAACC.A---........AT.ACCA...A-T.....ATA..AA.CA-.AT-.TG.A STUR04 .....A--..AACC.A---........ATCA.CAAT.AAT.....ATA..AA.CA.TAC-.TGTA STUR05 .....A--..AACC.A---........AT.ACAA...A-T.....A.A..AA.CA..AT-.TGTA STUR06 .....AA.C.ACCCAA---........CC.ACCA...A-T.....ATA..AA.CA-.AT-.TG.A SBIR01 ....CAAGCCA..CAA---.ATACCCCAT.AC.AAT..CCCTG..GGCCAA---------.TGTA SBIR02 ....CAAGCCA..CCA---.ACA.ACCAT.AT.AAT.AACCTTATGGCTAA---------.TGTA TELE01 T.C...CT...CAA.T---........TATATA.CT.AA------A.GCTTT.AC.GCGAG.G.. TELE02 TTC...CTC..CCAC.---........CATTCAAC.TAACT..TTA.CCATA.AC.GC.AGAG.. TELE03 T.C....C...CAA.A---........CCGT..AACTAAT.CG.TA----AA.C..GC.AG..T. TELE04 T.C..---C.ACTAC.---........TATA.AAATGTA.CT.A.A.A.TATTCGCCGCAG.G.. TELE05 T.C..CA.CC.TAA.T---........CACG.G.A.TT...CC.TA.AC---..G..A.AG.G.. TELE06 T.C....TC.ATTAA.---........CCTTC.AACTAAG...TTAACCGAA-----C.AG.G.. TELE07 ..CG..C-CCTAAACT---........CACTTAACT.AA------A.CCTAT.A..GCGAG.GA. ELAS01 T....TA.ACAACC.A---........CTT.TC.AT.AAC.T..TT------C..C.ACA..G.. ELAS02 T....AATA.TTTAC.---........CTTTT.CAT.AAT.C.TTT------C.CC.ACA..G.. ELAS03 T....-A.A.AATC.A---........CTTAT...T.A.T..TGAA------.AA.C.CA..G..

Page 36: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

36

MarVer3 1111111111111111111111111111111111111111111111111111111111111111111111111111111112222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222223333333333333333333333 1122222233333333334444444444555555555566666666667777777777888888888899999999990000000000111111111122222222223333333333444444444455555555567777888888888999999990000000000111111111122222222223333333333444444444455555555666666666777777777788888888999999990000000000111111222235 5804578901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567810569013456789012467890123456789012345678901234567890123456789012345678901245679123456789012345678901234569234567890123456789012569135843 HOMI01 AGGTATTTATTAATGC--AAACAGTAC-----------CTAACAAACCCA---CAGGTCCTAAACTACCAAACCTGCATTAAA---------------------------------------------------AATCCTCAGCGAACCCA-ACTCGAGCAG----------TA-----------------------CATGCTAAGACTTCACAGTAAGCGAACTACTAT-----ACTCAATTT---AATA-ACTT---------CCCGAATTC HOMI02 .................................................................................................................................................................................................................................................................................. HOMI03 .................................................................................................................................................................................................................................................................................. HOMI04 .................................................................................................................................................................................................................................................................................. PINN01 ...C...A.C.T.CC.AA--------TCAGAACCTACT.A.GTC....A.AACGG.A.AAA.T.ACT.TGC..A..GG.C.GC......................................................AT....AC..AA..........TGA...................................T-AAA.CTAGACAT.......ATA..T.---.......CA.T.......A..A.CCT...........T........ PINN02 .......A.C...CC.AA--------T..AGAATTTATTC..TT....A.TATGG.ACAAAGT.A.T.T.C.-G..GGAC.GC......................................................A.....TT.G.A.....C....TGA...................................T-AAA.CTAGACAT..T....A......---.......CA.A.......A..A.TTT...........T........ PINN03 .......A.C...CT.AA--------...AGAGCAAAT.C.GTC....A.CA.GG.AATAA.....T.T.C.-A..AG...GC......................................................AT....AC..AA..........TGA...................................T-AAACT.AGACAA........T.CCAC---.......CACT.......A..A.TTT...........T........ CODO01 ...C...A..C..CC.AA-T.ACCC.............AACCTC..A....CCA...GATA...AA.A.TT.TA..GG..G.C.......................................................T.....C..AAA.CC......TGA..........--.......................TTAAACCTAGGCCT....C....AT.AC---.......CACTT.........AG.CT...........T........ CODO02 ...C...A..C.GCC.AA-..AGCC-A............A.T.T..A....CCA...GATA...AA..TTC.TA..GGC.G.C............................................................TT..AAAC.C......TGA..........-T.......................T.GAGCCTAGGCCT.T..C....AT.A.---.......CACTT.........A..TA...........T........ CODO03 .......A.CC..CC.AA--.A.C..T...........AA.T.T..A..G.CCA...GATA.C.GA..TTT.TA..GG..G.C......................................................TT.....C..AAA.........TGA..........-T.......................T.AAACCTAGGCC...G.C...T.T.A.---.......CACTT......A..A..TT...........T........ CODO04 .......A..C..CT.AA--.A.C..............AA.C.T..A....CCA...GATA.C.AAG.TTT.TA..AG..G.C.......................................................T....TC..AAA..G......TGA..........-T.......................T.AAACCTAGGC......C...T.T.A.---.......CACTT......A..AT.TT...........T........ CODO05 .......A.CC..CC.AA.T.A.C..T...........AA.T.T..A....CCA...AATA.C.AA...TT.TA..GG..G.C......................................................TT....TC..AAA.........TGA..........-T.......................T.AAACCTAGGC.T....C...TTTCA.---.......CACTT......A..AT.TT...........T........ CODO06 .......A.CC..CC.AA...ATA--T...........AA.TTT..A....CCA...AATA.C.AA..TTT.TA..GG..G.C......................................................TT....TC..AAA.........TGA..........-T.......................T.AAACCTAGGCCT....C...T.TTAA---.......CACTT......A.T.T.TT...........T........ CODO07 .......A.CC..CC.AA...TTA--T...........AA.T.TT.A....TCA...AATA.C.AA.TTTT.TA..GG..G.C.......................................................T....TC..AAA.........TGA..........-T.......................T.AAACCTAGGC.T....C...TACTA.---.......CACTT....AAA.G.T.TT...........T........ CODO08 .......A..C..CC.AA...TTA--............AA.T.T..A....CCA...GATA.C.AA...TT.TA..GG..G.C......................................................TT.....C..AAA.........TGA..........-T.......................T.AAACCTAGGC.T.T..C...TACTA.---.......CACTT......A.T.T.TT...........T........ CODO09 .......A.CC..CC.AA...ATC.-T...........AA.T.TT.A....CCA...AATA.C.AA..TTT.TA..GG..G.C.......................................................T....TT..AAA........ATGA..........-T.......................T.AAACTTAGGCCT....C...TATCA.---.......CACTT....AAA.-.T.TT...........T........ CODO1O .......A.CC..CC.AA...ATCA-T...........AA.TACT.A....CCA...AATA.C.AA.TTTT.TA..GG..G.C.......................................................T....TC..AAA........ATGA..........-T.......................T.AAACCTAGGCCT.T..C...TATCA.---.......CA.TT....AAA.-.T.TT...........T........ CODO11 ...C...A.CC..CC....TTA.C..............AA....--AA.CACCAT..GATA.C.AA..TTT.....GG..G.T......................................................TT.....C.GAAA.........TGA...........T.......................ATAAGCCTAGGCCT.T..C.....TGAC---.......CACTT.........GT.TT...........T........ CODO12 ...C...A.....CC......T..CCA............A...T..TA.CATCA...GAT-.T.AA..TTT.TA..GG....C.......................................................T.....C..AAA..T......TGA...........T.......................A-AAGCCTAGGCCT....C....AT...---.......CACTT.........A.TTT...........T........ CODO13 ...C...A..C.GCC......AGT..T...........AA.TTT.CA....CCA...GATA.C.AA..T.T.AT..AGC.G.C.......................................................T....TC..AAA..C......TGA...........T.......................A-AAGCCTAGGCC..T..C....ATGAC---.......CACTT.........AG.T............T........ CODO14 .......A..C..CC......A.AC.............AA....GCA....CCA...GATA.C.AA.TTTT..A..GG..G.C.......................................................T....TC..AAA..C......TGA...........T.......................A-AAACCTAGGCCT.TG.C...T.T.A.---.......CACTT.........AT.T............T........ CODO15 ...C...A..C...C......AGTC.A...........AA..T..CA....CCA...GATA.C.AA.T.TT.TA..GG..G.C.......................................................T....TC..AAG..C......TGA...........T.......................A-AAACTTAGGCCT.T..C...T.T.A.---.......CACTT.........AC.T............T........ CODO16 .......A..C..CT......A..C.T...........AA..T.GTA....CCA...GATA.C.AAG.TTT.TA..AG..G.C.......................................................T....TT..AAA..C......TGA...........T.......................A-AAACTTAGGCCT....C...TAT.G.---.......CACTT.........A..TT...........T........ CODO17 .......A..C..CT......A..C.T...........AA..T..TA....CCA...GATA.C.AAG.TTT.GA..AGC.G.C.......................................................T....TT..AAA..C......TGA...........T.......................A-AAACTTAGGCCT....C...TAT.G.---.......CACTT.........A..TT...........T........ CODO18 .......A..C..CT......A..C.T...........AA..T..TA....CCA...GATA.C.AAG.TTT.GA..AGC.G.C.......................................................T....TT..AAA..C......TGA...........T.......................A-AAACTTAGGCCT....C...TAT.G.---.......CACTT.........A..TT...........T........ CODO19 .......A..C..CT......A..C.T...........AA..T..TA....CCA...GATA.C.AAG.TTT.GA..AGC.G.C.......................................................T....TT..AAA..C......TGA...........T.......................A-AAACTTAGGCCT.T..C...TAT.G.---.......CACTT.........A..TT...........T........ CODO20 .......A..C..CT......A..C.T...........AA..T..TA....CCA...GATA.C.AAG.TTT..A..AGC.G.C.......................................................T....TT..AAA..C......TGA...........T.......................A-AAACTTAGGCCT....C...TAT.G.---.......CACTT.........A..TT...........T........ CODO21 .......A..C..CT......A..C.T...........AA..T..TA....CTA...GATA.C.AAG.TTT.GA..AGC.G.C.......................................................T....TT..AAA..C......TGA...........T.......................A-AAACTTAGGCCT.T..C...TAT.G.---.......CACTT.........A..TT...........T........ CODO22 .......A..C..CT......A..C.T...........AA.....TA....CCA...GATA.C.AAG.TTT.AA..AGC.G.C.......................................................T....TT..AAA..C......TGA...........T.......................A-AAACTTAGGCCT.T..C...TAT.G.---.......CACTT.........A..TT...........T........ CODO23 .......A..C..CT......A..C.T...........AA..T..TA....CCA...GATA.C.AAG.TTT.GA..AGC.G.C.......................................................T....TT..AAA..C......TGA...........T.......................A-AAACTTAGGCCT.T..C...TAT.G.---.......CACTT.........A..TT...........T........ CODO36 .......A..C..CT.AA...A.TCGT...........AA....GTA....CTA...GATA.C.AA.TTTT.TA..GG..G.C.......................................................T....TT..AAA..C......TGA...........T.......................A-AAACCTAGGCCT.T..C...T.T.GC---.......CACTT.........AT.TT...........T........ CODO24 .......A..C..CT......A..C.T...........AA..T..TA....CCA...GATA.C.AAG.TTT..A..AGC.G.C.......................................................T....TT..AAA..C......TGA...........T.......................A-AAACTTAGGC.T..G.C...TAT.G.---.......CACTT.........A..TT...........T........ CODO25 .......A..C..CT......A..C.T...........AA..T..CA....CCA...AATA.C.TAG..TT.TA..AG..G.C.......................................................T....TT..AAA..C......TGA...........T.......................A-AAACTTAGGC.T....C...TAT.AC---.......CACTT.........A..TT...........T........ CODO26 .......A..C..CT......A..C.T...........AA.....TA....CCA...GATA.C.AAG..TT.TA..AG..G.C.......................................................T....TT..AAA..C......TGA...........T.......................A-AAACTTAGGCCT....C...TAT.G.---.......CACTT.........A..TT...........T........ CODO27 .......A..C..CT......A..C.T...........AC..T..TA....CCA...GGTA.T.AAG.TTT.TA..AGC.G.C.......................................................T....TT..AAA..C......TGA...........T.......................A-AAACTTAGGCCT.T..C...TAT.T.---.......CACAT.........G..TT...........T........ CODO28 .......A..C..CT......A..C.............AA..T..TA....CCA...GATA.C.AAG.TTT.TA..AGC.G.C.......................................................T....TT..AAA..C......TGA...........T.......................A-AAACTTAGGCCT.T..C...TAT.G.---.......CACAT.........A..TT...........T........ CODO29 .......A..C..CT......A..C.T...........AA..T..TA....CCA...GATA.C.AAG.TTT.TA..AGC.G.C.......................................................T....TT..AAA..C......TGA...........T.......................A-AAACTTAGGCCT.T..C...TAT...---.......CACAT.........A..TT...........T........ CODO30 .......A..C..CT......A..C.T...........AA..T..TA....CCA...GATA.T.GAG.TTT..A..AGC.G.C.......................................................T....TT..AAA..C......TGA...........C.......................A-AAACTTAGGCCT.T..C...TAT.G.---.......CACAT.........A..TT...........T........ CODO31 .......A..C..CT......A..C.T...........AA..T..TA....CCA...GATA.C.AAG.TTT..A..AGC.G.C.......................................................T.....T..AAA..C......TGA...........T.......................A-AAACTTAGGCCT....C...TAT.AC---.......CACTT.........A..TT...........T........ CODO32 ...C...A..C..CT......A..C.T...........AA.....TA....CCA...GATA.C.AAG.TTT.TA..AGC.G.C.......................................................T....TT..AAA..C......TGA...........T.......................A-AAACTTAGGCCT.T..C...TAC.G.---.......CACAT.........A..TT...........T........ CODO33 .......A..C..CT......A..C.T...........AA.....TA....CCA...GATA.C.AA..TTT.TA..AGC.G.C.......................................................T....TT..AAA..C......TGA...........T.......................A-AAACTTAGGCCT.T..C...TAC.G.---.......CACAT.........A..TT...........T........ CODO34 .......A..C..CT......A..C.T...........AA..T..CA....CTA...GATA.C.GAG.TTT.TA..AGC.G.C.......................................................T....TT..AAA..C......TGA...........T.......................A-AAACTTAGGC.T.T..C...TAC.A.---.......CACTT.........A..TT...........T........ CODO35 .......A..C..CC......A.TC.............AA....GTA...CCTA...GATA.C.AA.TTTT.TA..GG..G.C.......................................................T.....C..AAA..C......TGA...........T.......................A-AAACCTAGGCCT.TG.C...T.T.G.---.......CACTT.........GT.TT...........T........ CMYS01 ...C...A..C..CC.AA-..ATCA-T...........AACCTC..A....CCA...GATA.C.AA.TTTT.TA..GGC.G.C.......................................................T.....C..AAA..C......TGA..........-T.......................T.AAGCCTAGGCC..T..C....AT.A.---.......CACTT..........C.TT...........T........ CMYS02 ...C...A..C..CC.AA-..ATCA-............AACCTT..A....CCA...GATAGC.AA..TTT..A..GGC.G.C.......................................................T.....C..AAA..C......TGA..........-T.......................T.AAGCCTAGGCC.....C....AT.G.---.......CACTT..........C.TT...........T........ CMYS03 ...C...A.....CC.AA-..ATCA-T...........AATC.C..A....CCA...AATA.C.AA.TTTT.TA..GG....C......................................................TT....TT..AAA..C......TGA..........-T.......................T.AAGCCTAGAC...T.......AC.AC---.......CACTT..........C.T............T........ CMYS04 ...C...A..C..CC.AAT..ATCA-A...........AACCTT..A....CCA...GA.A.C.AA.TTTT.TA..GGC.G.C.......................................................T.....C..AAA..C......TGA...........T.......................T.AAACCTAGGC...T..CG..TAT.A.---.......CACTT..........C.TA...........T........ CMYS05 ...C...A..C..CC.AAT..ATCA-............AACCTT..A....CCA...GA.A.C.AA.TTTT.TG..GG..G.C.......................................................T.....C..AAA..C......TGA..........-T.......................T.AAACTTAGGCC..T..C...T.T.A.---.......CA.TT..........C.TT...........T........ CMYS06 ...C...A..C.GCC.AA-.TATTA-T...........AACCTT..A....CCA...GATA.C.AA.T.TT.TA..AGC.G.C.......................................................T....TC..AAA..C......TGA..........-T.......................T.AAACCTAGGCC..T..C...TAT.AC---.......CACTT..........C.CT...........T........ CMYS07 ...C...A..C..CC.AA-.TATCA-T...........AACCTT..A....CCA...GATA.C.AA.T.TT.TA..GGC.G.C.......................................................T....TC..AAA..C......TGA..........-T.......................T.AAACCTAGGC...T..C...TATGAC---.......CACTT..........C.TT...........T........ CMYS08 ...C...A..C..CC..A...ATCA-T...........AACCTT..A....CCA...GATA.C.AA.TTTT.TA..GGC.G.C......................................................TT....TC..AAA..C......TGA..........-T.......................T.AAACTTAGGC...T..C...TAT.AC---.......CACTT..........C.TT...........T........ CMYS09 ...C...A..C..CC.AA...ATCA-T...........AACCTT..A....CCA...GATA.C.AA.TTTT.TA..GGC.G.C.......................................................T.....C..AAA..C......TGA..........-T.......................T.AAACTTAGGC...T..C...TAT.AC---.......CACTT..........C.TT...........T........ CMYS10 ...C...A..C..CC.AA...ATCA-T...........AACCTT..A....CCA...GATA.C.AA.TTTT.TA..GGC.G.C.......................................................T.....C..AAA..C......TGA..........-T.......................T.AAACTTAGGC...T..C...TAT.A.---.......CACTT..........C.TT...........T........ CMYS11 ...C...A..C..CC.AA-..ACCA-T...........AACCTT..A....CCA...GATA.C.AA...TT.TA..GGC.G.C.......................................................T....TC..AAA..C......TGA..........-T.......................T.AAACTTAGGCC..T..C...TAC.A.---.......CACTT..........C.CT...........T........ CMYS12 ...C...A..C..CC.AA-..ATCA-............AACCTT..A....CCA...GATA.C.AA.TTTT.TA..GGC.G.C.......................................................T.....C..AAA..T......TGA..........-T.......................T.AAACTTAGGC...T..C...T.T.A.---.......CACTT..........C.T............T........ SIRE01 ...A...A.C...CTTAA--------T......AATATAA.TATCCA..CTACGG.C.TAACCTAGCATTTTTA..AG....C......................................................TT....AC..GAA........ATGATA.......C.........................T-GA..C....C..........TCTG.G.--.......CACTT...C...--A.TTA...........T........ SIRE02 ...A...A....G.T.AA--------.......CA....A..AC.TATACTACGG..-..ACCCAACAGTTTAA..A.C...C......................................................TT....AC..AAA........ATGATA.......C.........................T-AA.CC......A.....G..TCC.TA.--.......CACTT...C...--A.TTA...........T........ STUR01 G.A..A.ACAG.TCAACT.C...TCCA............ACCT..T.TA....GGAC.TA...C.A..T.G..T..ATCC.T....................................................TT.......TA...AA........AA.AAGAACAA..C.T........................T.AACCTAGAC.T.A.T....T.CCAA----C.....GGCA..A.........T.T...........T..AC.CC. STUR02 G.A..A.ATAA.TCAACT.TT--A..T...........T..C..CC.TA....AGAC.TA...TTA..T.GTT.-TG..CC.T....................................................T.TT....TA...AA........AA.AAGAACATATT.T........................T.AACCTAGA..T.A.C....T.CCAA----C.....GGCA..A.........T.T...........T..AC.CC. STUR03 G.A..A.ACAG.TCAACT.TCT.TAC.............ACT..CT.TA....GGACCTA...CTA..T.GTA..TG.CCT.T....................................................T.T.....TA...AA........AA.AAGAATACA.C.T........................T.AACCTAGACC..A.T....T.CCAA----C.....GGCA..A.........T.T...........T..AC.CC. STUR04 G.A..A.ATAA.TCAACT.TTT.CAC.............ACC..CT.TA....AGAC.TA...TT...T.GTT..TG..CC.T....................................................T.TT....AA...AA.......-AA.AAGAACATATC.T........................T.AACCTAGACC..A.C....T.CCAA----C.....GGAA..A.........T.T...........T..AC.CC. STUR05 G.A..A.ATAA.TCAACT.TC--AC.............T..CT.CC.TA....AGAC.TA...CT...T.GTT.-TG..CC.T....................................................T.TT....TA...AA........AA.AAGAACATACT.T........................T.AACCTAGA..T.A.C....T.CCGA----C.....GG.A..A.........T.T...........T..AC.CC. STUR06 G.A..A.ACAA.TCAACT.TC-.TACT............ACC..CC..A....GGACCTA...CTA..T..TA..TG.CCT.T....................................................T.T.....TA...GA........AA.AAGAACATA.C.T........................T.AACCCAGACC..A.T....T.CTAA----C.....GGCA..A.........T.T...........T..AC.CC. SBIR01 G.AA.AA.CAGC.GC.AC--------.CGACACAAAAT.....CT.T.AC...AG.C..ACC.T.AC.AGGCA...GCCCGC....................................................TT....T..AA...AA..T...A.AA.CAAGACCTCCCCT........................T.AACC.AGAGCA..CC....TACTAA----C......G.A.CCAC.......C.A.............T.C.CC. SBIR02 G.AA.AA.CAGCGAC.AC--------.ACATACAAACCTA.C.CT.--.C...AG.C..ACT.C.CT.A..CT...GCCCGC....................................................TT....T..AA..TAG..T...A.AA.TAAGACCACATCT........................T..ACC.AGAGCA..CC..T.TACTAA----C......G.A.CCAC.......C.............T.T.C.CCT TELE01 ....-GAC.CC..G..AT---------...........------------...--------------------A.CATG.C.....ACACCCCTAAACAAAGGACTAAACCAAATGAAT..CAT.ACCCCCAT.GTCT.G.G.AAT.AAA.AC.CACGTGGAATGGGAGTAC..................C..CTCCT.CAACC.AGAGCTGAGC.T.AAA.CAG.A...CTGACCAAT..--....----.GGCA...ACGCC.T...CG... TELE02 .....GAC.C.G.GC.AT---------...........------------...--------------------A.CA.G.T.....ACACCCCCAAACAAGGGACTAAACTTATTGAAATCATT.GGCCGTAT.GTC..ATG.AAC.AAA.AC.CACGTGGAATGGGAGCACA.................CTACTCCT.CAG.C.AGAGC....C.T.CAA.CAG.A.T.CTGACCAAT..C.....----.GGCA...ACGCC.T...C.... TELE03 .A..-GAC.CA-.G.GAG---------...........------------...--------------------ACTACC.T.....TTACCCCTGGATAAAGGGCAAAAGCTAAAGGTAGCCCT...CCCCAGCGTCT.G.G.TA...AA.AC.CATGTGGAATAGGAGCACCC...................CTCCTTCAACC.AGAGCTC.GC.T..GA.CAG.ACT.TTGACCAATC.A.....----.GGCA...TCGCC.T..ACG... TELE04 ....-GACGC...CCAAC---------...........------------...--------------------.CCG---......AGCACCCTACACCAGGGCCCAAACAACGTGGTT....TTGGCACAACCGTC..G.GAGT.G.A..GCTCGAGTGGATGGGGGAAACCC.......................T.AAACC.AGAGCT.AGC.G..TC.CAA.AC..TTGACCAA...-.....----.GGC..TCATGCC.....C.... TELE05 ....-GAC.C.-.GC.AA---------...........------------...--------------------...TGA...G...CGACTGAACTGAACAAGTCCTAAATACCCGCAGCCTTATGGTAATGTAGTCA.A.GAGA.GTAA.GCTCAAGCAGACCGGGAAAACCC.......................TTAAGCCGAGAG.TGA.C.T...CGCAA.A.T.TTGACTGAT.T--....----.GG..GAAAAACC.TT.AC.... TELE06 ....-GAC.CC-.G..AG---------...........------------...--------------------A.CACG.C.....GTAACCTTGAATTAACAAGTAAAAACGCAGTGACCCCTAGCCCATAT.GTCT.G.G.GA...AA.GC.CATGTGGACTGGGGGCAC.G......................C..CAACC.AGAG...A.C.T..TACCAG.A.T.CTGACCAAA..-.....----.GGCA.TCACGCC.T...CG... TELE07 ....-GACG.C..A..AG---------...........------------...--------------------A.CATG.T.....ACACCCCCAAATAAGGGACCAAACTAAATGACC..CCT.GCTTTAAT.GTC..ATG.AAT.AAA.AC.CACGCGGACTGGGAGCAC..AACCCCTTCTTTACCTCTCCTCC..CAACT.AGAGCT.AGC.T.CTA.CAG.A...CTGACCAATC.-.....----.GGCA...ATGCC.T...C.... ELAS01 ...C-AAC.CAT.AATTA---------...........------------...--------------------A.TATG...CCTTATACCTCCCAGGATATAAACAAAACATACAACACTTCTAATTTAACT.GT.TTGAG.AA.G.AA.TC.CTC.T.GACTGAGTACT.--AAGTACT................T.AAAATTAGAA.G.A.T.T.ATAGTAA.A...TTATCGAAA..-.C...----..GGA...TTACCTT.TAC.... ELAS02 ...C-AAC.C.T.AATTA---------...........------------...--------------------A.TATG...CCATCCATTTCCCAGGAAATAAACAAAATATACAACACTTCTAATTTAACT.GT.TTAAG.AA...AA.TC.CTC.T.GATTGAGTACT.--AAGTACT................T.AAAATTAGAA.G.A.T.T.TTA.TAA.AC..TTACCGAAA..-.....----..GGA...TTTCCTT.TAC.... ELAS03 ...C-AAC.CAT.AATTA---------...........------------...--------------------A.TATG....TTAACCACTCTACGGATATAAACAGAA.ATACAATACTTTTAATTTAGCT.GT.TTAAG.AA..TTA.TC.CTT.T.GACCGAGTACTC.CAAGTACT................T.AAAATTAGAAC..A.T.T.TTA.TAA.AC..TTATCGAAA..-.....----..GGA...TTTCCTT.TAC....

Page 37: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

37

Online Resource 4 List of 657 GenBank entries used in this study to assess amplicons’ polymorphism at the three presented loci. NB 22 Pusa capsica sequences are not shown as unpublished yet.

CODOZiphidae (n=50) Phocoenidae (n=18) Tursiops spp (n=24) Stenella spp (n=176) Orcinus orca (n=151)

KC776688 - KC776691 KP170488 NC_022805 EU557096 - EU557097 KR180337

KC776693 - KC776698 KC777291 KF570317 KX857264 - KX857266 GU187155 - GU187164

KC776700 - KC776712 KR108307 - KR108308 KF570323 KX857268 - KX857281 GU187166 - GU187219

KC776715 - KC776717 KT852939 KF570333 KX857284 - KX857298 HQ405752

KF032860 - KF032861 KU886000 KF570335 KX857300 - KX857301 KF164610

KF032863 KX650869 - KX650872 KF570344 - KF570345 KX857303 - KX857306 KF418372 - KF418376

KF032866 - KF032868 MF669488 - MF669489 KF570349 KX857309 - KX857315 KF418379 - KF418393

KF032870 - KF032871 MF669491 KF570353 KX857318 - KX857321 KR180299 - KR180300

KF032873 MF669493 - MF669494 KF570360 KX857323 KR180303 - KR180325

KF032876 NC_005280 KF570367 KX857325 - KX857326 KR180327

KF032878 NC_021461 KF570374 KX857330 - KX857339 KR180330 - KR180332

KF981442 NC_026456 KF570377 KX857341 - KX857342 KR180334 - KR180336

KR534596 KF570386 KX857344 KR180338 - KR180367

KY364702 KF570388 - KF570389 KX857347 - KX857349 MH062792

MG000980 KT601194 KX857351 - KX857416 NC_02388

NC_005273 - NC_005274 KT601196 KX857418 - KX857422

NC_021434 - NC_021435 KT601203 KX857424 - KX857427

NC_021974 KT601207 KX857430 - KX857438

NC_023830 MF669485 - MF669486 KX857440 - KX857445

NC_027593 NC_012058 - NC_012059 KX857447 - KX857451

NC_034348 KX857453 - KX857460

NC_036997 NC_012051

NC_012053

NC_032301

CMYS PINN SIREBalaenoptera Phocidae (n=16) Otariidae (n=63) Odobenidae (n=2) Dugongidae (n=2)

physalus (n=152) NC_008426 - NC_008441 NC_008415 - NC_008477 AJ428576 NC_003314

KC572708 - KC572860 NC_004029 AY075116

Page 38: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

38

Online Resource 5 HTS species detection differences between loci MarVer1 and MarVer3, as inferred from controlled environment eDNA samples collected in 6 tanks of the Aquarium of Genoa (July 2018). For each tank, two sets of data are shown: one referring to the number and composition of detected species, where the species scoring more than 100 reads were considered as “detected” (top part of each section), and one showing the number of HTS reads obtained (bottom part of each section, with related pie graphs). Note that all data refer to 6 eDNA-samples replicates collected in each tank. The top part of each table section shows 7 categories: 1) “DETECTED/TOTAL vertebrate species” indicates the number of vertebrate species correctly assigned over the total number of species known to be present in the tank as main occupants (“*” indicates two instances -Shark and Rocky shore” tanks- in which one of the species detected was found but with a read count lower than 100); 2) “TANK FEED” indicates the number of molecularly detected species that were not present in the tank as living species, that however were used as food source provided to the tank hosts (thus their DNA traces originating from both hosts’ faeces and food left-overs could be found); 3) “OVERAL FEED CONTAMINATION” indicates the number of molecularly detected species that should not be in that specific sampled tank, but that are found elsewhere in the Aquarium structure, used as food source in other tanks (food buckets and/or operators’ boots or wetsuit are likely to have provided a vehicle for contamination); 4) “CONTAMINATION FROM OTHER TANKS AND HUMANS” indicates the number of molecularly detected species that matched with either species present in other tanks of the Aquarium or humans (NB: the diver in charge of the maintenance of different tanks is the same person, wearing the same wetsuit: often tank hosts -like dolphins- closely interact with him and his wetsuit); 5) “MISASSIGNED SPECIES” groups the molecularly detected species which were not those present in the tank, that however were taxonomically related to them. 6) “UNEXPECTED VERTEBRATE SPECIES” indicate the number of molecularly assigned species whose detection could not be explained. Note that all species falling in these first 6 categories were all vertebrate species (including those falling into the two “food” categories). 7) “INVERTEBRATE SPECIES” indicate the number of molecularly assigned invertebrate species. The bottom part of each table section shows the number of HTS reads scored in each of the following 3 categories: A) vertebrate species whose presence (direct or indirect) in form of eDNA traces could be justified (thus grouping together categories 1 to 5 of the above described list) [no artefact] B) vertebrate species whose presence (direct or indirect) in form of eDNA traces could not be explained [possible artefact] C) invertebrate species detection [primer aspecificity] Results are discussed in the main text.

Page 39: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

39

TANKSPRIMER SET MV1 MV3 MV1 MV3 MV1 MV3

1 DETECTED/TOTAL HOSTED VERTEBRATE SPECIES 2/3 2/3 1/1 1/1 5/14 10**/14 2 TANK FEED 0 0 2 2 2 33 OVERALL FEED CONTAMINATION 3 1 A 1 2 A 0 0 A4 CONTAMINATION FROM OTHERS TANKS AND HUMAN 4 2 4 12 3 55 MISASSIGNED SPECIES 0 0 1 2 1 06 UNEXPECTED VERTEBRATE SPECIES 3 0 B 0 1 B 0 0 B7 INVERTEBRATE SPECIES 0 1* C 0 0 C 0 0 C

HTS read count A 51608 26998 232726 196247 80565 124265HTS read count B 877 0 0 693 0 0HTS read count C 0 69 0 0 0 0

TANKSPRIMER SET MV1 MV3 MV1 MV3 MV1 MV3

1 DETECTED/TOTAL HOSTED VERTEBRATE SPECIES 0/2 2/2 1/1 1/1 0/6 6*/6 2 TANK FEED 2 2 2 1 1 43 OVERALL FEED CONTAMINATION 1 1 A 2 1 A 1 0 A4 CONTAMINATION FROM OTHERS TANKS AND HUMAN 5 8 3 9 4 105 MISASSIGNED SPECIES 1 0 1 0 0 36 UNEXPECTED VERTEBRATE SPECIES 0 2 B 0 1 B 4 3 B7 INVERTEBRATE SPECIES 0 0 C 0 1* C 0 0 C

HTS read count A 91348 191573 158492 194057 10036 204597HTS read count B 0 583 0 230 2627 1585HTS read count C 0 0 0 38 0 0

MANATEES DOLPHINS SHARKS

PENGUINS SEALS ROCKY SHORE

Page 40: Novel Universal Primers for Metabarcoding eDNA Surveys of … · groups: i) vertebrates (excluding Cetaceans), ii) Cetaceans only, and iii) Invertebrates. Performance of the newly

40