
Contents lists available at ScienceDirect

Genomics

journal homepage: www.elsevier.com/locate/ygeno

Database Tools

Molecular footprints of inshore aquatic adaptation in Indo-Pacific humpback dolphin (Sousa chinensis)

Yao Ming a, Jianbo Jian a, Fei Yu a, Xueying Yu b, Jingzhen Wang b,⁎, Wenhua Liu a,⁎

a Marine Biology Institute, Shantou University, Shantou, Guangdong 515063, PR China
b Guangxi Key Laboratory of Marine Disaster in the Beibu Gulf, Qinzhou University, Qinzhou, Guangxi 535011, PR China

ARTICLE INFO

Keywords: Sousa chinensis; Sequence; Evolution; Adaptation; Cetaceans

ABSTRACT

The Indo-Pacific humpback dolphin, Sousa chinensis, is a cetacean that has fully adapted to inshore waters. Although it is a threatened marine mammal, little molecular information is available for understanding the genetic basis of its ecological adaptation. We sequenced and obtained the first draft genome of S. chinensis. The phylogenetic analysis in this study, based on the single-copy orthologous genes of the draft genome, is consistent with traditional phylogenetic classification. The comparative genomic analysis indicated that S. chinensis had 494 species-specific gene families, which involved immune, DNA repair and sensory systems associated with the potential adaptation mechanism. We also identified expanded and positively selected genes in the S. chinensis lineage to investigate the potential adaptation mechanism. Our study provides potential insight into the molecular bases of ecological adaptation in the Indo-Pacific humpback dolphin and will also be valuable for future understanding of the ecological adaptation and evolution of cetaceans at the genomic level.

1. Introduction

Cetaceans, including whales, dolphins, and porpoises, are a model mammalian group for the study of macroevolution, particularly the molecular changes and adaptations required for the transition from terrestrial to aquatic environments [1, 2]. Cetaceans are the only mammalian group that has secondarily adapted to an entirely aquatic lifestyle, and these animals have been used as indicator species in ocean ecosystems [3, 4]. Cetaceans diverged from terrestrial artiodactyls approximately 50 million years ago (Mya), and they rapidly diversified to form currently extant species around 30 Mya [5, 6]. Extant cetaceans are dispersed worldwide in oceans, estuaries and rivers. These species are thought to have a strong influence on the aquatic ecosystems they inhabit [7, 8]. Given their extraordinary evolutionary history and their important ecological function, cetaceans make an attractive model with which to test correlations between physical adaptations and gene sequences, so as to understand the genetic bases of habitat shifts [9].

Recent studies have investigated the genetic signatures of phenotypic changes and stress responses related to the transition and adaptation to aquatic life. These phenotypic changes include the loss of limbs, the loss of fur, the development of echolocation, the degeneration of olfactory and taste senses, an increase in hypoxic resistance, and a relative increase in brain size [6, 10–14]. All of these are phenotypic changes associated with aquatic life; however, there is a lack of sufficient data for research on their genetic background.

With the rapid development of next-generation sequencing technologies, large-scale genome sequencing, including whole genome shotgun (WGS) sequencing, has become more widely accessible [15, 16]. With whole genomes, species can be fully characterized at the molecular level, and their evolutionary history and the influences of natural selection can be clarified [17]. Whole genomes are also useful for gene mining and gene function prediction [18]. Thus, organismal genomic research can identify the signatures of natural selection, the dynamics of genome evolution, the molecular bases of species divergences, and functional genomic changes underpinning adaptation to the environment [18]. Although genome data for several cetaceans have been published [6, 10, 19, 20], genome-scale studies of molecular adaptation in cetaceans have not yet included the genus Sousa.

The threatened Indo-Pacific humpback dolphin Sousa chinensis (Odontocetes: Delphinidae) is found in inshore and estuary waters from the mouth of the Yangtze River, China, south throughout the waters of Southeast Asia, at least as far southeast as Borneo and as far west as the Orissa coast of India [21]. As inhabitants of shallow coastal waters (< 20 m), S. chinensis face different ecological stressors compared with deep ocean cetaceans: their habitats are more ecologically precarious and they are in more frequent contact with anthropogenic activity

https://doi.org/10.1016/j.ygeno.2018.07.015
Received 11 April 2018; Received in revised form 28 June 2018; Accepted 17 July 2018

⁎ Corresponding authors.
E-mail addresses: [email protected] (Y. Ming), [email protected] (J. Jian), [email protected] (F. Yu), [email protected] (X. Yu), [email protected] (J. Wang), [email protected] (W. Liu).

Genomics xxx (xxxx) xxx–xxx

0888-7543/ © 2018 Elsevier Inc. All rights reserved.

Please cite this article as: Ming, Y., Genomics (2018), https://doi.org/10.1016/j.ygeno.2018.07.015


[3, 22]. The Indo-Pacific humpback dolphin therefore offers a prime example for understanding the genetic basis of adaptation to aquatic and nearshore habitats. The aim of this study was to investigate the ecological and behavioral variation and the genome-level adaptive evolution of this species through comparative genomics.

2. Materials and methods

2.1. Sample collection, DNA extraction and DNA sequencing

A skin sample was collected from a female specimen of S. chinensis that died from stranding in the Guangxi Beibu Gulf, China, in 2015. No ethical issues were involved in this study. Genomic DNA was extracted from the skin sample with the QIAGEN DNeasy Blood & Tissue Kit (QIAGEN, Shanghai, China) following the manufacturer's instructions. We constructed DNA libraries with insert sizes of 500 bp and 2 Kb. We then used the Illumina HiSeq4000 platform to sequence paired-end (PE) reads for each library: PE125 (read length: 125 bp) for the 500 bp library and PE50 (read length: 50 bp) for the 2 Kb library, generating 107.6 Gb of raw data. We filtered the reads, removing all reads that met any of the following criteria:

1) Reads from the 500 bp insert-size libraries comprising > 2% ‘N’s, and reads from the 2 Kb insert-size libraries comprising > 5% ‘N’s.

2) Reads from the 500 bp insert-size libraries with > 40% low-quality bases (≤7), and reads from the 2 Kb insert-size libraries with > 30% low-quality bases (≤7).

3) Reads containing adapter sequences.

4) Identical read 1 and read 2 paired-end reads (considered to be PCR duplications).
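The four criteria above can be sketched as a small filter. This is a minimal illustration, not the pipeline used in the study: the `Read` class, the function names and the dictionary layout are hypothetical, and criterion 4 is interpreted here as dropping a pair whose read 1 and read 2 sequences both match a pair already seen.

```python
from dataclasses import dataclass

# Thresholds from criteria 1 and 2, keyed by insert size.
# The dictionary layout and key names are illustrative assumptions.
THRESHOLDS = {
    "500bp": {"max_n_frac": 0.02, "max_lowq_frac": 0.40},
    "2kb": {"max_n_frac": 0.05, "max_lowq_frac": 0.30},
}
LOW_QUALITY_CUTOFF = 7  # bases with quality <= 7 count as low quality


@dataclass
class Read:
    seq: str
    quals: list  # per-base quality scores as integers


def read_fails(read, library, adapters=()):
    """Return True if a single read meets criterion 1, 2 or 3."""
    limits = THRESHOLDS[library]
    length = len(read.seq)
    if length == 0:
        return True
    n_frac = read.seq.upper().count("N") / length
    lowq_frac = sum(q <= LOW_QUALITY_CUTOFF for q in read.quals) / length
    has_adapter = any(adapter in read.seq for adapter in adapters)
    return (n_frac > limits["max_n_frac"]
            or lowq_frac > limits["max_lowq_frac"]
            or has_adapter)


def filter_pairs(pairs, library, adapters=()):
    """Keep read pairs that pass criteria 1-4; a pair is dropped if either mate fails."""
    seen = set()
    kept = []
    for read1, read2 in pairs:
        key = (read1.seq, read2.seq)
        if key in seen:  # criterion 4 (assumed meaning): duplicate pair
            continue
        seen.add(key)
        if read_fails(read1, library, adapters) or read_fails(read2, library, adapters):
            continue
        kept.append((read1, read2))
    return kept
```

Keeping the thresholds in one table makes it explicit that the two libraries differ only in their cutoffs, not in the logic applied.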

2.2. Estimation of genome size

We used 60.3 Gb of clean data (the 500 bp library) for the K-mer analysis, with K set to 17. The peak depth of the 17-mer distribution derived from the sequencing reads was about 19, and the total K-mer count was 52,217,435,273. Based on this, the genome size was estimated as K-mer_num/Peak_depth, which gives approximately 2.75 Gb.
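The estimate can be reproduced directly from the two numbers reported above. The function and variable names below are illustrative only.

```python
def estimate_genome_size(total_kmers, peak_depth):
    """Genome size in bp, estimated as K-mer_num / Peak_depth."""
    return total_kmers / peak_depth


KMER_NUM = 52_217_435_273   # total 17-mer count reported in the text
PEAK_DEPTH = 19             # peak of the 17-mer depth distribution

size_bp = estimate_genome_size(KMER_NUM, PEAK_DEPTH)
print(f"{size_bp / 1e9:.2f} Gb")  # prints 2.75 Gb
```

This value agrees with the predicted genome size of about 2.7 Gb reported in the results.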

2.3. Genome assembly

We used SOAPdenovo2 [23] to assemble the genome from all of the filtered data, using the default parameters except that the k-mer size was set to 47 at the pregraph stage. First, at the pregraph stage, the read sequences of the 500 bp insert-size libraries were split into 47-mers to construct a de Bruijn graph [24], which was then simplified by removing tips, merging bubbles and resolving repeats to obtain contigs. All clean reads were then mapped onto the contigs to construct scaffolds. Finally, the clean data from the short-insert-size libraries were used to fill gaps with GapCloser [23]. Only scaffolds longer than 200 bp were retained.
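To illustrate the idea behind the pregraph stage, the following toy sketch builds a de Bruijn graph from k-mers and walks an unbranched path to recover a contig. It is not SOAPdenovo2's implementation: it omits tip removal, bubble merging and repeat resolution, and the function names and the tiny example reads (with k = 4 rather than 47) are hypothetical.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some k-mer."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def extend_contig(graph, start):
    """Extend from `start` while the path is unbranched (one way out, one way in)."""
    indegree = defaultdict(int)
    for successors in graph.values():
        for node in successors:
            indegree[node] += 1
    contig, node, visited = start, start, {start}
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if indegree[nxt] != 1 or nxt in visited:
            break
        contig += nxt[-1]  # each step adds one base
        visited.add(nxt)
        node = nxt
    return contig


# Two overlapping toy reads sharing the 4-mer "GCGT".
toy_reads = ["ATGGCGT", "GCGTGCA"]
toy_graph = build_de_bruijn(toy_reads, k=4)
toy_contig = extend_contig(toy_graph, "ATG")
```

In this toy case the two reads overlap by one k-mer, so the walk joins them into a single sequence.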

2.4. Repeat annotation

Genomes typically contain many repeated sequences. There are two main types of genomic repeats: tandem repeats and interspersed repeats. Tandem repeats were identified using Tandem Repeats Finder version 4.07b [25]. Interspersed repeats were identified by homology-based searches using RepeatMasker (http://www.repeatmasker.org) and the repeat database Repbase (version 21.01). Repeat-related proteins were identified using RepeatProteinMask, and de novo interspersed repeat annotation was performed using RepeatModeler (http://www.repeatmasker.org/RepeatModeler.html). RepeatMasker analysis was applied following de novo repeat identification.

2.5. Gene prediction and functional annotation

Gene annotation consists of structural and functional annotation. We used a homology-based method to annotate the coding genes in the S. chinensis genome as follows [26]. First, proteins of Homo sapiens, Bos taurus, Lipotes vexillifer, and Tursiops truncatus were mapped onto the S. chinensis genome using TBLASTN [27]. Second, high-scoring segment pairs (HSPs) were concatenated between pairs of proteins using solar V0.9.6 (in-house software; [28]). Third, homologous genome sequences were aligned against matching proteins using Genewise [29] to define accurate gene models. Fourth, we filtered redundant models based on the Genewise score. Finally, we combined the annotated results from all of the homology-based analyses to generate the final genome annotation [26]. Gene functions were assigned based on the best matches with annotated proteins across five databases: InterPro [30], Gene Ontology [31], KEGG [32], Swissprot [33] and TrEMBL [33].

2.6. ncRNA annotation

We used INFERNAL [34] and tRNAscan-SE [35] to predict ncRNAs. Four types of ncRNAs were annotated in our analysis: tRNAs, rRNAs, microRNAs, and snRNAs. tRNAs were predicted with tRNAscan-SE, using eukaryote settings. rRNA fragments were identified by aligning the rRNA template sequences from the human genome to the S. chinensis genome with BLASTN, using an E-value cutoff of 1e−5. MicroRNAs and small nuclear RNAs were predicted by INFERNAL [36], running against the Rfam database V11.0.

2.7. Gene family construction

We identified gene families using TreeFam [37]. First, we used BLAST to compare all of the protein sequences from eight species (S. chinensis, Orcinus orca, Lipotes vexillifer, Balaenoptera acutorostrata, Bos taurus, Canis familiaris, Homo sapiens, and Ornithorhynchus anatinus), using an E-value threshold of 1e−7. Second, HSP segments for each protein pair were concatenated with Solar. H-scores were computed based on bit-scores, and these were used to evaluate gene similarity. Finally, we identified gene families by clustering homologous gene sequences with Hcluster_sg V0.5.0.

2.8. Phylogenetic tree construction and divergence time estimation

We constructed a phylogenetic tree of S. chinensis and seven other sequenced mammals (O. orca, L. vexillifer, B. acutorostrata, B. taurus, C. familiaris, H. sapiens, and O. anatinus) based on 2753 single-copy orthologous genes. Fourfold degenerate sites were extracted from each single-copy family and concatenated to form one supergene for each species [6, 10, 38]. Phylogenetic relationships among the eight species (four cetaceans and four other mammals) were inferred using PhyML [39, 40], with a GTR substitution model and gamma-distributed rates. We used O. anatinus as an outgroup. Divergence times were estimated using the approximate likelihood calculation method in mcmctree (in PAML V4.5; [41–43]).
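The extraction of fourfold degenerate sites can be sketched as follows. This is an illustration under stated assumptions, not the script used in the study: it assumes codon-aligned coding sequences and the standard genetic code, and it keeps a codon column only when all species share the same fourfold-degenerate codon prefix, which is one common convention. All function names are hypothetical.

```python
# First two codon bases whose third position is fourfold degenerate
# under the standard genetic code (Leu, Val, Ser, Pro, Thr, Ala, Arg, Gly).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def fourfold_sites(aligned_cds):
    """aligned_cds: dict species -> codon-aligned CDS (equal lengths).

    Returns dict species -> string of third-position bases taken from codon
    columns where every species has the same fourfold-degenerate prefix.
    """
    species = list(aligned_cds)
    length = len(aligned_cds[species[0]])
    sites = {sp: [] for sp in species}
    for pos in range(0, length - length % 3, 3):
        codons = [aligned_cds[sp][pos:pos + 3].upper() for sp in species]
        prefixes = {codon[:2] for codon in codons}
        if len(prefixes) != 1 or next(iter(prefixes)) not in FOURFOLD_PREFIXES:
            continue
        if any(codon[2] not in "ACGT" for codon in codons):
            continue  # skip gaps and ambiguous bases
        for sp, codon in zip(species, codons):
            sites[sp].append(codon[2])
    return {sp: "".join(bases) for sp, bases in sites.items()}


def build_supergene(families):
    """Concatenate the fourfold sites of each gene family into one sequence per species."""
    supergene = {}
    for aligned_cds in families:
        for sp, bases in fourfold_sites(aligned_cds).items():
            supergene[sp] = supergene.get(sp, "") + bases
    return supergene
```

Because substitutions at these sites do not change the encoded amino acid, the concatenated supergene is often treated as evolving close to neutrally.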

2.9. Expansion and contraction of gene families

We analyzed the expansion and contraction of gene families using CAFE [44], which employs a random birth and death model across a user-specified phylogeny. The global parameter λ, which describes the rate of both gene birth (λ) and gene death (μ = λ) across all of the tree branches for all of the gene families, was estimated by maximum likelihood. A conditional P-value was calculated for each gene family, and families with conditional P-values below the threshold (0.01) were considered to have expanded or contracted. Families on branches with low overall P-values were considered to be significant.


2.10. PSG (positively selected genes) detection

We investigated the genotypes underlying the aquatic adaptations of the cetacean lineage by analyzing clade-specific positively selected genes (PSGs). To estimate positive selection, Ka/Ks ratios were calculated for all of the single-copy orthologs of S. chinensis and the other seven mammals (O. orca, L. vexillifer, B. acutorostrata, B. taurus, C. familiaris, H. sapiens, and O. anatinus). Orthologous genes were aligned with PRANK [45]. We then used Gblocks 0.91b [46] to remove ambiguously aligned sites within the PRANK alignments. We used codeml (in PAML V4.5 [43]) with the free-ratio model to estimate the Ka, Ks, and Ka/Ks ratios for all of the branches. We compared the mean Ka/Ks ratios for single-copy genes between S. chinensis and each of the other species with paired Wilcoxon rank sum tests. The genes with Ka/Ks > 1 along the S. chinensis branch were reanalyzed using the codon-based branch-site tests in PAML V4.5. The branch-site model, which allows ω to vary both among sites in the protein and across branches, was used to detect episodic positive selection.
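The two-step screen described above can be sketched as below. The first function selects genes with Ka/Ks > 1 on the focal branch. The second assumes that the branch-site test is evaluated as a likelihood ratio test against a chi-square distribution with one degree of freedom, which is a common convention but is not stated in the text. The function names and input layout are hypothetical, and the log-likelihoods would come from separate codeml runs.

```python
import math


def candidate_genes(branch_rates):
    """branch_rates: dict gene -> (Ka, Ks) on the focal branch.

    Returns the genes with Ka/Ks > 1, skipping genes with Ks == 0.
    """
    return [gene for gene, (ka, ks) in branch_rates.items()
            if ks > 0 and ka / ks > 1]


def lrt_pvalue(lnl_null, lnl_alt):
    """Likelihood ratio test with 1 degree of freedom (assumed here).

    Returns (test statistic, p-value). For a chi-square variable with one
    degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

Only genes passing the first screen would need the more expensive branch-site runs, which is the order of operations the text describes.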

3. Results and discussion

3.1. Genome assembly and annotation

Our clean data were approximately 88.88 Gb, about 32.9-fold genome coverage (Table S1). The predicted size of the S. chinensis genome was 2.7 Gb (Table S2 and Fig. S1). The total assembled genome length was approximately 2.29 Gb, with a contig N50 of 12,900 bp and a scaffold N50 of 163,000 bp (Fig. 1; Table S3). The average GC content of the S. chinensis genome (~40%; Fig. S2) was generally consistent with those of other marine cetaceans (e.g., B. acutorostrata, T. truncatus and L. vexillifer: around 40%) [6, 10, 13] (Figs. S2 and S3). The GC distribution of the S. chinensis genome was consistent with other cetaceans and distinct from non-cetacean mammals, such as cattle (Fig. S3). Consistent with previously published cetacean genomes [6, 10, 13, 19], approximately 30.4% of the S. chinensis genome was comprised of transposon-derived repeats (Fig. 1; Table S4). Most of the transposable elements in the S. chinensis genome were long interspersed nuclear elements (LINEs, 21.87%; Table S4), similar to the proportion of LINEs in L. vexillifer (26.65%) and B. acutorostrata (22.83%) [6, 10]. We predicted 22,501 coding genes in the S. chinensis genome, comparable to B. acutorostrata (20,605) and L. vexillifer (22,168; Table S5) [6, 10]. The average length of the 15 longest scaffolds of S. chinensis was 1,451 Kb (Fig. 1). We identified gene functions in the S. chinensis genome by aligning against Swissprot [33], TrEMBL [33], Gene Ontology [31], KEGG [32] and InterPro [30]. Most of the annotated genes were enriched in the following gene ontology (GO) terms: cell, cell part, cellular process, organelle, binding and biological regulation (Fig. S4). According to KEGG analysis, the most commonly found pathways in the genome of S. chinensis were related to signal transduction, global and overview maps, cancers overview, translation, immune system and endocrine system (Fig. S5). Non-coding RNAs (ncRNAs), including ribosomal RNAs (rRNAs), transfer RNAs (tRNAs), small nuclear RNAs (snRNAs) and microRNAs, were estimated to comprise 0.049% of the whole genome. Most of the ncRNAs were tRNAs (0.025% of the whole genome) and miRNAs (0.006% of the whole genome; Table S6).
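For readers unfamiliar with the assembly statistics quoted above, the sketch below shows how fold coverage and N50 are conventionally computed. The coverage calculation uses the figures reported in this section; the scaffold lengths passed to `n50` are made-up toy values, not data from this assembly, and the function names are illustrative.

```python
def fold_coverage(clean_bases, genome_size):
    """Sequencing depth as total clean bases divided by estimated genome size."""
    return clean_bases / genome_size


def n50(lengths):
    """Length L such that sequences of length >= L contain at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


coverage = fold_coverage(88.88e9, 2.7e9)     # figures reported in the text
toy_n50 = n50([2, 2, 2, 3, 3, 4, 8, 8])      # hypothetical lengths
```

With 88.88 Gb of clean data and a predicted genome size of 2.7 Gb, the coverage works out to about 32.9-fold, matching the value given above.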

3.2. Cetacean taxonomy and evolution

Our cetacean phylogeny based on whole genome data was consistent with traditional classification and research [47, 48]. Our dating analysis indicated that the cetaceans diverged from their terrestrial sister clade (cattle) about 55.5 Mya (Fig. S7), which was consistent with previous studies [49]. Cetaceans are thought to have diverged from their terrestrial ancestors (closely related to modern artiodactyls) approximately 56–53 Mya, when they reinvaded aquatic environments [50]. Approximately 30.8 Mya, the Cetacea split into the Mysticeti (baleen whales) and the Odontoceti (toothed whales, dolphins, and porpoises; Fig. S7). S. chinensis split from other Delphinidae species, and began to colonize estuary and coastal areas, around 14.9 Mya (Fig. S7). Previous studies have indicated that the cetacean crown group diversified from the Archaeocetes about 34 Mya, and gradually spread through nearly all of the ocean bodies [5, 51]. Thus, our whole genome phylogeny provides complementary evidence for the patterns of divergence and evolution in the cetaceans.

3.3. Adaptation of S. chinensis to the inshore environment

3.3.1. Species-specific gene families and functions

Our comparison of gene families across the S. chinensis genome and the genomes of three other cetacean species (O. orca, L. vexillifer, and B. acutorostrata) recovered 14,193 orthologous gene families (Fig. 2A). Of these, 10,158 were common to all of the four cetacean species, and 494 were found only in S. chinensis (Fig. 2A). The functions of most of these genes were related to cellular processes, metabolic processes, single-organism processes, biological process regulation and response to stimulus (Fig. 2B). On the other hand, most of the genes specific to S. chinensis were involved in cellular functions (growth, death, replication and repair), signal transduction, bacterial infections, neurodegenerative diseases, viral infections, lipid metabolism, cofactor/vitamin metabolism, and the digestive, excretory, immune, nervous, and sensory systems in KEGG enrichment (Fig. 2C).

The inshore environment is frequently disturbed by natural processes and anthropogenic activities, including freshwater inflow, vessel traffic, environmental contamination, intensive fishing and habitat destruction/degradation [52–56]. Previous studies of the mechanisms of the cetacean immune system have been fragmented and inconclusive, and studies of the immune response of cetaceans to microparasite infection are rare [57, 58]. However, viral infections have been found in several species of cetaceans, including T. truncatus [59, 60], Delphinus delphis [61] and Phocoena phocoena [62]. In addition, respiratory illness is thought to be the most common cause of death in both wild and captive populations of T. truncatus [63]. It has been shown that T. truncatus and the bacterial community of its upper respiratory tract have co-evolved [63]. S. chinensis, as a nearshore species, probably faces a higher risk of exposure to pathogens and viruses than offshore cetaceans, such as O. orca, L. vexillifer, and B. acutorostrata. Previous microarray studies of T. truncatus identified 52 genes known to be important in the stress response and adaptive immune function [64]. In addition, several unigenes were found in the transcriptome of S. chinensis that were predicted to be involved in either the immune response or the adaptive evolution of cetacean-specific traits [24]. The unique gene families that we identified as involved in bacterial and viral infections in S. chinensis probably also co-evolved with infectious bacteria and viruses: the long-term exposure of this species to multiple pathogens might have contributed to this potential regulatory and response mechanism.

Here, the specific genes associated with stimulus response in the nervous and sensory systems (Fig. 2C) suggested that S. chinensis is possibly sensitive and responsive to anthropogenic pressures in the environment. Mammals that live exclusively in aquatic environments normally cope with many physiological challenges, including sensory ones. The transition from the terrestrial to the aquatic environment required an adaptation to the different physical properties of water as opposed to air. T. truncatus has a complex sensory area in its cerebral neocortex [65]. This species is capable of immediately recognizing a variety of complex objects with vision or echolocation; when tested, the intersensory recognition accuracy of T. truncatus was nearly errorless [66]. The species-specific genes involved in stimulus response in the nervous and sensory systems (Fig. 2C) found in S. chinensis might indicate a similarly strong sensory system.

We also identified specific genes involved in catalytic activity (Fig. 2B) and in replication and repair (Fig. 2C). These genes suggest an ability of S. chinensis to initiate repairs and adjust responses to


environmental pressures at the molecular level. Studies have shown that DNA repair in T. truncatus cells was more efficient than in human leukocytes [67]. Indeed, the resistance of dolphins to methyl mercury toxicity might be due to defensive cellular strategies developed to combat high dietary exposure to contaminants and to compensate for the limited capacity of dolphins to excrete persistent pollutants [67]. Our results suggested that, similar to T. truncatus, S. chinensis may have a great capacity for DNA replication and repair in the presence of persistent environmental pollution.

3.3.2. Gene family expansion

Expanded gene families might reflect lineage-specific physiological adaptations and traits [6, 10]. In the S. chinensis lineage, 1140 gene families were predicted to have expanded (Fig. 3). The 256 gene families in S. chinensis that were most significantly expanded (P < 0.01) were composed of genes primarily involved in inflammatory response (GO:0006954), immune response (GO:0006955), scavenger receptor activity (GO:0005044), iron ion transport (GO:0006826), cellular iron ion homeostasis (GO:0006879), and antioxidant activity (GO:0016209; Table S7). The expansion of gene families containing inflammatory response, immune response and scavenger receptor activity genes (Table S7) suggested that these gene families were potentially involved in the immune response of S. chinensis, probably activated by its habitat. Previous studies have mainly used the number of gene copies in the major histocompatibility complex (MHC) to investigate the immune response of aquatic organisms [68–71]. An increased number of MHC I

Fig. 1. Characterization of the Sousa chinensis genome. Circos plot of the 15 longest scaffolds. There are four circles in total, from inside to outside: (A) GC content, (B) repeats, (C) gene model, (D) scaffolds.


Fig. 2. The Venn diagram of unique and shared gene families in Sousa chinensis, Orcinus orca, Lipotes vexillifer and Balaenoptera acutorostrata. (A) gene family, (B) GO and (C) KEGG function enrichment of unique gene families of Sousa chinensis among cetaceans.


genes was found in Atlantic cod [68]. Sequences of the MHC class I region of Neophocaena asiaeorientalis asiaeorientalis and T. truncatus are available [70, 72], but no such sequences are available for S. chinensis. Research on the immune response of S. chinensis is still inconclusive. Information about the copy number of immunity-related genes in S. chinensis would supplement our knowledge of the immune response in this species.

Gene families containing genes related to antioxidant activity (GO:0016209), iron ion transport (GO:0006826), and cellular iron ion homeostasis (GO:0006879) were also expanded in S. chinensis (Table S7). Under hypoxic conditions, cetaceans generate reactive oxygen species (ROS) that may damage cellular molecules, such as DNA, proteins and lipids. In cetaceans, genes associated with antioxidant function and iron ion transport are vital for the prevention of damage to important cellular components. An expansion of antioxidant-related gene families was also identified in B. acutorostrata, a species that must contend with the hypoxic conditions experienced during dives [10].

3.3.3. Positive selection

We identified 624 positively selected genes (PSGs) in the S. chinensis genome. These PSGs were enriched in oxidoreductase activity, process, and function (GO:0016627, GO:0016491, GO:0055114, GO:0016705, GO:0045454, GO:0016671, GO:0016616, GO:0016705, and GO:0016491; Table 1). Oxidoreductases are important for hypoxia resistance and diving activity in cetaceans, as they can combat the damaging effects of hypoxia-induced ROS. In addition, our results indicated that genes associated with apoptotic processes (GO:0042981, GO:0006915), with pathways related to the DNA damage checkpoint (GO:0000077), and with DNA repair (GO:0006281) were under positive selection (Table 2). The processes of DNA damage and repair are known to play important roles in brain development and microcephaly [73]. The evolution of DNA damage repair pathways might have contributed to the slowdown of the nucleotide substitution rate in cetaceans [74]. PSGs with similar functions to those in S. chinensis were also reported in the L. vexillifer genome [6].

PSGs in the S. chinensis lineage were also involved in cation transport processes, ion channel activity, and ion transport (GO:0005216, GO:0005886, GO:0005215, GO:0005272, GO:0055085, GO:0006812, GO:0005261, and GO:0004889) as well as sodium symporter neurotransmitter activity (GO:0005328) and neuron projection development (GO:0031175; Table 1). This is consistent with a previous study of T. truncatus, where 228 genes were potentially under positive selection; of those, 27 were associated with the nervous system [13]. Therefore, we hypothesized that the relatively high nerve sensitivity of nearshore cetaceans might contribute to their adaptive capacities. A quick response to environmental change might be critical for avoiding damage, triggering the stress response, and initiating adaptation mechanisms.

It is noteworthy that PSGs associated with tumor necrosis factor receptor binding (GO:0005164) and leukemia inhibitory factor receptor binding (GO:0005146; Table 2) might be involved in the immune regulation and disease resistance mechanisms of S. chinensis. The tumor necrosis factor alpha of bottlenose dolphins (doTNF-alpha) has been shown to share 79% similarity with the TNF-alpha of cattle, pigs, mice and humans [75]. As neoplasia is infrequently reported in cetaceans, positive selection on the tumor necrosis factor receptor binding gene might help cetaceans combat disease [76]. Cetaceans might therefore become model organisms for anti-tumor research.

4. Conclusions

Our results provide a genomic resource for further studies of gene function and conservation biology in S. chinensis. The phylogenetic analysis, based on single-copy orthologous genes from the draft genome, is consistent with traditional phylogenetic classification, and our genome-level estimates of cetacean divergence times are consistent with previous studies. The species-specific genes, gene family expansions, and footprints of positive selection that we identified might reflect molecular genetic shifts related to the adaptation of S. chinensis to an inshore, anthropogenically affected environment. Our study is an important first step towards understanding the ecological adaptations of S. chinensis at the genomic level. Our data will also be useful for future studies of cetacean evolution and conservation, as well as for intraspecific population genetic and ecological studies of S. chinensis.

Competing interests

The authors declare that they do not have any competing interests.

Data accessibility

The genome assembly has been deposited in NCBI under BioProject PRJNA449414.

Fig. 3. Gene family expansion or contraction. The numbers indicate the number of gene families that have expanded (green) or contracted (red) since the split from a common ancestor. MRCA, most recent common ancestor. Timelines indicate the divergence times between species. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Y. Ming et al. Genomics xxx (xxxx) xxx–xxx


Table 1. Positively selected genes (PSGs) involved in oxidoreductase activity (green), ion channels (red), and neuron regulation (blue) in the Sousa chinensis genome (FDR (false discovery rate) < 0.05; P ≤ .01).

| Gene ID | Function description | P-value |
|---|---|---|
| ENSBTAP00000033383 | GO:0016627; oxidoreductase activity, acting on the CH-CH group of donors; function | 4.44089209850063E-16 |
| ENSTTRP00000013585 | GO:0016491; oxidoreductase activity; function | 0 |
| ENSTTRP00000009681 | GO:0055114; oxidation-reduction process; process | 0 |
| ENSTTRP00000001775 | GO:0016705; oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen; function | 9.5187191462287E-12 |
| ENSBTAP00000011043 | GO:0045454; cell redox homeostasis; process | 4.1860733213106E-07 |
| ENSTTRP00000003635 | GO:0016671; oxidoreductase activity, acting on a sulfur group of donors, disulfide as acceptor; function | 0 |
| ENSTTRP00000001162 | GO:0016616; oxidoreductase activity, acting on the CH-OH group of donors, NAD or NADP as acceptor; function | 0 |
| ENSBTAP00000039756 | GO:0016705; oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen; function | 1.13827454129023E-08 |
| ENSTTRP00000010906 | GO:0016705; oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen; function | 9.81941197308878E-06 |
| ENSTTRP00000013529 | GO:0016491; oxidoreductase activity; function | 0 |
| ENSTTRP00000006964 | GO:0005216; ion channel activity; function | 0 |
| ENSTTRP00000009075 | GO:0005886; plasma membrane; component | 0 |
| ENSTTRP00000006110 | GO:0005215; transporter activity; function | 0 |
| ENSTTRP00000016140 | GO:0005272; sodium channel activity; function | 0 |
| ENSTTRP00000005755 | GO:0005272; sodium channel activity; function | 0 |
| ENSTTRP00000002153 | GO:0005216; ion channel activity; function | 0 |
| ENSTTRP00000008783 | GO:0055085; transmembrane transport; process | 0 |
| ENSTTRP00000014332 | GO:0006812; cation transport; process | 0.00002737219349358 |
| ENSTTRP00000014317 | GO:0005216; ion channel activity; function | 0 |
| ENSTTRP00000005291 | GO:0005261; cation channel activity; function | 4.76577000085854E-10 |
| ENSTTRP00000008335 | GO:0004889; acetylcholine-activated cation-selective channel activity; function | 1.15745424356106E-09 |
| ENSTTRP00000003398 | GO:0005328; neurotransmitter:sodium symporter activity; function | 4.35207425653061E-14 |
| ENSTTRP00000015487 | GO:0005328; neurotransmitter:sodium symporter activity; function | 0 |
| ENSTTRP00000002375 | GO:0031175; neuron projection development; process | 0 |
| ENSBTAP00000053664 | GO:0004672; protein kinase activity; function | 3.5419742694387E-08 |
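The two thresholds in the caption of Table 1 (FDR < 0.05 and P ≤ .01) can be illustrated with a minimal sketch. It assumes a Benjamini-Hochberg correction, which this section does not specify, and it uses hypothetical gene names and p-values rather than data from this study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


def select_psgs(gene_p, fdr_cutoff=0.05, p_cutoff=0.01):
    """Keep genes with q < fdr_cutoff and raw p <= p_cutoff."""
    genes = list(gene_p)
    q_values = benjamini_hochberg([gene_p[g] for g in genes])
    return [g for g, q in zip(genes, q_values)
            if q < fdr_cutoff and gene_p[g] <= p_cutoff]


# Hypothetical gene names and p-values, for illustration only.
example = {"geneA": 1e-12, "geneB": 0.004, "geneC": 0.03, "geneD": 0.6}
print(select_psgs(example))
```

The correction must be applied to the p-values of all tested genes, not only to the subset listed in a table, because the adjusted values depend on the total number of tests.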


Authors' contributions

Y.M. and WH.L. conceived and designed this study. XY.Y. and JZ.W. prepared the samples. Genome sequencing was performed by BGI-Shenzhen; Y.M. and F.Y. performed genome assembly. Y.M. performed genome annotation and other data analyses. Y.M., JB.J., JZ.W., and WH.L. discussed the data. Y.M. wrote the paper with significant contributions from JB.J., JZ.W., XY.Y. and WH.L. All of the authors contributed to the data interpretation.

Acknowledgements

This project was supported by the Ministry of Agriculture of China (Chinese White Dolphin Conservation Action), the China National Offshore Oil Corporation Foundation, the National Natural Science Foundation of China (grant nos. 41676166 and 41776174), the Education Department of Guangxi Zhuang Autonomous Region Foundation (grant nos. KY2016YB487 and KY2016YB476), the Foundation of Guangdong Provincial Key Laboratory of Marine Biotechnology (grant no. GPKLMB201602) and the Guangxi Natural Science Foundation (grant no. 2016GXNSFBA380142).

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.ygeno.2018.07.015.

References

[1] M.R. McGowen, J. Gatesy, D.E. Wildman, Molecular evolution tracks macroevolutionary transitions in Cetacea, Trends Ecol. Evol. 29 (2014) 336–346.

[2] J. Wang, X. Yu, B. Hu, J. Zheng, W. Xiao, Y. Hao, W. Liu, D. Wang, Physicochemical Evolution and Molecular Adaptation of the Cetacean Osmoregulation-related Gene UT-A2 and Implications for Functional Studies, Sci. Rep. 5 (2015).

[3] J. Wang, Y. Yang, F. Yang, Y. Li, L. Li, D. Lin, T. He, B. Liang, T. Zhang, Y. Lin, P. Li, W. Liu, A framework for the assessment of the spatial and temporal patterns of threatened coastal delphinids, Sci. Rep. 6 (2016) 19883.

[4] J. Roman, J.A. Estes, L. Morissette, C. Smith, D. Costa, J. McCarthy, J.B. Nation, S. Nicol, A. Pershing, V. Smetacek, Whales as marine ecosystem engineers, Front. Ecol. Environ. 12 (2014) 377–385.

[5] M.E. Steeman, M.B. Hebsgaard, R.E. Fordyce, S.Y. Ho, D.L. Rabosky, R. Nielsen, C. Rahbek, H. Glenner, M.V. Sorensen, E. Willerslev, Radiation of extant cetaceans driven by restructuring of the oceans, Syst. Biol. 58 (2009) 573–585.

[6] X. Zhou, F. Sun, S. Xu, G. Fan, K. Zhu, X. Liu, Y. Chen, C. Shi, Y. Yang, Z. Huang, J. Chen, H. Hou, X. Guo, W. Chen, Y. Chen, X. Wang, T. Lv, D. Yang, J. Zhou, B. Huang, Z. Wang, W. Zhao, R. Tian, Z. Xiong, J. Xu, X. Liang, B. Chen, W. Liu, J. Wang, S. Pan, X. Fang, M. Li, F. Wei, X. Xu, K. Zhou, J. Wang, G. Yang, Baiji genomes reveal low genetic variability and new insights into secondary aquatic adaptations, Nat. Commun. 4 (2013) 2708.

[7] W. Bowen, Role of marine mammals in aquatic ecosystems, Mar. Ecol. Prog. Ser. 158 (1997) 74.

[8] J.G. Sanders, A.C. Beichman, J. Roman, J.J. Scott, D. Emerson, J.J. McCarthy, P.R. Girguis, Baleen whales host a unique gut microbiome with similarities to both carnivores and herbivores, Nat. Commun. 6 (2015).

[9] M.F. Nery, D.J. Gonzalez, J.C. Opazo, How to make a dolphin: molecular signature of positive selection in cetacean genome, PLoS One 8 (2013) e65491.

[10] H.S. Yim, Y.S. Cho, X. Guang, S.G. Kang, J.Y. Jeong, S.S. Cha, H.M. Oh, J.H. Lee, E.C. Yang, K.K. Kwon, Y.J. Kim, T.W. Kim, W. Kim, J.H. Jeon, S.J. Kim, D.H. Choi, S. Jho, H.M. Kim, J. Ko, H. Kim, Y.A. Shin, H.J. Jung, Y. Zheng, Z. Wang, Y. Chen, M. Chen, A. Jiang, E. Li, S. Zhang, H. Hou, T.H. Kim, L. Yu, S. Liu, K. Ahn, J. Cooper, S.G. Park, C.P. Hong, W. Jin, H.S. Kim, C. Park, K. Lee, S. Chun, P.A. Morin, S.J. O'Brien, H. Lee, J. Kimura, D.Y. Moon, A. Manica, J. Edwards, B.C. Kim, S. Kim, J. Wang, J. Bhak, H.S. Lee, J.H. Lee, Minke whale genome and aquatic adaptation in cetaceans, Nat. Genet. 46 (2014) 88–92.

[11] Y.-B. Sun, W.-P. Zhou, H.-Q. Liu, D.M. Irwin, Y.-Y. Shen, Y.-P. Zhang, Genome-Wide Scans for Candidate Genes Involved in the Aquatic Adaptation of Dolphins, Genome Biol. Evol. 5 (2013) 130–139.

[12] J.Y. Park, Y.-R. An, N. Kanda, C.-M. An, H.S. An, J.-H. Kang, E.M. Kim, D.-H. An, H. Jung, M. Joung, M.H. Park, S.H. Yoon, B.-Y. Lee, T. Lee, K.-W. Kim, W.C. Park, D.H. Shin, Y.S. Lee, J. Kim, W. Kwak, H.J. Kim, Y.-J. Kwon, S. Moon, Y. Kim, D.W. Burt, S. Cho, H. Kim, Cetaceans evolution: insights from the genome sequences of common minke whales, BMC Genomics 16 (2015) 13.

[13] M.R. McGowen, L.I. Grossman, D.E. Wildman, Dolphin genome provides evidence for adaptive evolution of nervous system genes and a molecular rate slowdown, Proc. Biol. Sci. 279 (2012) 3643–3651.

[14] A.D. Foote, N. Vijay, M.C. Ávila-Arcos, R.W. Baird, J.W. Durban, M. Fumagalli, R.A. Gibbs, M.B. Hanson, T.S. Korneliussen, M.D. Martin, Genome-culture coevolution promotes rapid divergence of killer whale ecotypes, Nat. Commun. 7 (2016) 11693.

[15] M. Yandell, D. Ence, A beginner's guide to eukaryotic genome annotation, Nat. Rev. Genet. 13 (2012) 329–342.

[16] P.C. Ng, E.F. Kirkness, Whole genome sequencing, Methods Mol. Biol. 628 (2010) 215–226.

[17] A.G. Clark, Genomics of the evolutionary process, Trends Ecol. Evol. 21 (2006) 316–321.

[18] J.L. Kelley, A.P. Brown, N.O. Therkildsen, A.D. Foote, The life aquatic: advances in marine vertebrate genomics, Nat. Rev. Genet. 17 (2016) 523–534.

[19] M. Keane, J. Semeiks, A.E. Webb, Y.I. Li, V. Quesada, T. Craig, L.B. Madsen, S. van Dam, D. Brawand, P.I. Marques, P. Michalak, L. Kang, J. Bhak, H.S. Yim, N.V. Grishin, N.H. Nielsen, M.P. Heide-Jorgensen, E.M. Oziolor, C.W. Matson, G.M. Church, G.W. Stuart, J.C. Patton, J.C. George, R. Suydam, K. Larsen, C. Lopez-Otin, M.J. O'Connell, J.W. Bickham, B. Thomsen, J.P. de Magalhaes, Insights into the evolution of longevity from the bowhead whale genome, Cell Rep. 10 (2015) 112–122.

[20] A.D. Foote, Y. Liu, G.W. Thomas, T. Vinar, J. Alfoldi, J. Deng, S. Dugan, C.E. van Elk, M.E. Hunter, V. Joshi, Z. Khan, C. Kovar, S.L. Lee, K. Lindblad-Toh, A. Mancia, R. Nielsen, X. Qin, J. Qu, B.J. Raney, N. Vijay, J.B. Wolf, M.W. Hahn, D.M. Muzny, K.C. Worley, M.T. Gilbert, R.A. Gibbs, Convergent evolution of the genomes of marine mammals, Nat. Genet. 47 (2015) 272–275.

[21] T.A. Jefferson, H.C. Rosenbaum, Taxonomic revision of the humpback dolphins (Sousa spp.), and description of a new species from Australia, Mar. Mamm. Sci. 30 (2014) 1494–1541.

[22] T.A. Jefferson, B.E. Curry, Humpback dolphins: a brief introduction to the genus Sousa, in: A.J. Thomas, E.C. Barbara (Eds.), Advances in Marine Biology, Academic Press, 2015, pp. 1–16.

[23] R. Luo, B. Liu, Y. Xie, Z. Li, W. Huang, J. Yuan, G. He, Y. Chen, Q. Pan, Y. Liu, J. Tang, G. Wu, H. Zhang, Y. Shi, Y. Liu, C. Yu, B. Wang, Y. Lu, C. Han, D.W. Cheung, S.M. Yiu, S. Peng, Z. Xiaoqian, G. Liu, X. Liao, Y. Li, H. Yang, J. Wang, T.W. Lam, J. Wang, SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler, Gigascience 1 (2012) 18.

[24] R. Li, H. Zhu, J. Ruan, W. Qian, X. Fang, Z. Shi, Y. Li, S. Li, G. Shan, K. Kristiansen, S. Li, H. Yang, J. Wang, J. Wang, De novo assembly of human genomes with massively parallel short read sequencing, Genome Res. 20 (2010) 265–272.

[25] G. Benson, Tandem repeats finder: a program to analyze DNA sequences, Nucleic Acids Res. 27 (1999) 573–580.

[26] V. Curwen, E. Eyras, T.D. Andrews, L. Clarke, E. Mongin, S.M. Searle, M. Clamp, The Ensembl automatic gene annotation system, Genome Res. 14 (2004) 942–950.

[27] S.F. Altschul, T.L. Madden, A.A. Schaffer, J. Zhang, Z. Zhang, W. Miller, D.J. Lipman, Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Nucleic Acids Res. 25 (1997) 3389–3402.

[28] C. Qin, C. Yu, Y. Shen, X. Fang, L. Chen, J. Min, J. Cheng, S. Zhao, M. Xu, Y. Luo, Y. Yang, Z. Wu, L. Mao, H. Wu, C. Ling-Hu, H. Zhou, H. Lin, S. Gonzalez-Morales, D.L. Trejo-Saavedra, H. Tian, X. Tang, M. Zhao, Z. Huang, A. Zhou, X. Yao, J. Cui, W. Li, Z. Chen, Y. Feng, Y. Niu, S. Bi, X. Yang, W. Li, H. Cai, X. Luo, S. Montes-Hernandez, M.A. Leyva-Gonzalez, Z. Xiong, X. He, L. Bai, S. Tan, X. Tang, D. Liu, J. Liu, S. Zhang, M. Chen, L. Zhang, L. Zhang, Y. Zhang, W. Liao, Y. Zhang, M. Wang, X. Lv, B. Wen, H. Liu, H. Luan, Y. Zhang, S. Yang, X. Wang, J. Xu, X. Li, S. Li, J. Wang, A. Palloix, P.W. Bosland, Y. Li, A. Krogh, R.F. Rivera-Bustamante, L. Herrera-Estrella, Y. Yin, J. Yu, K. Hu, Z. Zhang, Whole-genome sequencing of cultivated and wild peppers provides insights into Capsicum domestication and specialization, Proc. Natl. Acad. Sci. U. S. A. 111 (2014) 5135–5140.

Table 2. Positively selected genes (PSGs) involved in immune response (green), apoptosis, DNA damage and repair (red) in the Sousa chinensis genome (FDR < 0.05; P ≤ .01).

| Gene ID | Function description | P-value |
|---|---|---|
| ENSTTRP00000012493 | GO:0005164; tumor necrosis factor receptor binding; function | 6.38022967791585E-12 |
| ENSTTRP00000001444 | GO:0005146; leukemia inhibitory factor receptor binding; function | 4.42978986825437E-13 |
| ENSTTRP00000007588 | GO:0006915; apoptotic process; process | 1.30114751417665E-07 |
| ENSP00000380150 | GO:0042981; regulation of apoptotic process; process | 0.0000691926854976543 |
| ENSBTAP00000005496 | GO:0006915; apoptotic process; process | 2.26385576951316E-12 |
| ENSTTRP00000014307 | GO:0000077; DNA damage checkpoint; process | 6.20510309801148E-11 |
| ENSTTRP00000011670 | GO:0006281; DNA repair; process | 0 |

[29] E. Birney, M. Clamp, R. Durbin, GeneWise and Genomewise, Genome Res. 14 (2004) 988–995.

[30] N. Mulder, R. Apweiler, InterPro and InterProScan: tools for protein sequence classification and comparison, Methods Mol. Biol. 396 (2007) 59–70.

[31] M. Ashburner, C.A. Ball, J.A. Blake, D. Botstein, H. Butler, J.M. Cherry, A.P. Davis, K. Dolinski, S.S. Dwight, J.T. Eppig, M.A. Harris, D.P. Hill, L. Issel-Tarver, A. Kasarskis, S. Lewis, J.C. Matese, J.E. Richardson, M. Ringwald, G.M. Rubin, G. Sherlock, Gene ontology: tool for the unification of biology. The Gene Ontology Consortium, Nat. Genet. 25 (2000) 25–29.

[32] M. Kanehisa, S. Goto, KEGG: kyoto encyclopedia of genes and genomes, Nucleic Acids Res. 28 (2000) 27–30.

[33] A. Bairoch, R. Apweiler, The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000, Nucleic Acids Res. 28 (2000) 45–48.

[34] E.P. Nawrocki, D.L. Kolbe, S.R. Eddy, Infernal 1.0: inference of RNA alignments, Bioinformatics 25 (2009) 1335–1337.

[35] T.M. Lowe, S.R. Eddy, tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence, Nucleic Acids Res. 25 (1997) 955–964.

[36] E.P. Nawrocki, S.R. Eddy, Infernal 1.1: 100-fold faster RNA homology searches, Bioinformatics 29 (2013) 2933–2935.

[37] H. Li, A. Coghlan, J. Ruan, L.J. Coin, J.K. Heriche, L. Osmotherly, R. Li, T. Liu, Z. Zhang, L. Bolund, G.K. Wong, W. Zheng, P. Dehal, J. Wang, R. Durbin, TreeFam: a curated database of phylogenetic trees of animal gene families, Nucleic Acids Res. 34 (2006) D572–D580.

[38] H. Wu, X. Guang, M.B. Al-Fageeh, J. Cao, S. Pan, H. Zhou, L. Zhang, M.H. Abutarboush, Y. Xing, Z. Xie, A.S. Alshanqeeti, Y. Zhang, Q. Yao, B.M. Al-Shomrani, D. Zhang, J. Li, M.M. Manee, Z. Yang, L. Yang, Y. Liu, J. Zhang, M.A. Altammami, S. Wang, L. Yu, W. Zhang, S. Liu, L. Ba, C. Liu, X. Yang, F. Meng, S. Wang, L. Li, E. Li, X. Li, K. Wu, S. Zhang, J. Wang, Y. Yin, H. Yang, A.M. Al-Swailem, J. Wang, Camelid genomes reveal evolution and adaptation to desert environments, Nat. Commun. 5 (2014) 5188.

[39] S. Guindon, O. Gascuel, A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood, Syst. Biol. 52 (2003) 696–704.

[40] S. Guindon, J.F. Dufayard, V. Lefort, M. Anisimova, W. Hordijk, O. Gascuel, New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0, Syst. Biol. 59 (2010) 307–321.

[41] B. Rannala, Z. Yang, Inferring speciation times under an episodic molecular clock, Syst. Biol. 56 (2007) 453–466.

[42] Z. Yang, B. Rannala, Bayesian estimation of species divergence times under a molecular clock using multiple fossil calibrations with soft bounds, Mol. Biol. Evol. 23 (2006) 212–226.

[43] Z. Yang, PAML 4: Phylogenetic Analysis by Maximum Likelihood, Mol. Biol. Evol. 24 (2007) 1586–1591.

[44] M.W. Hahn, J.P. Demuth, S.G. Han, Accelerated rate of gene gain and loss in primates, Genetics 177 (2007) 1941–1949.

[45] A. Loytynoja, N. Goldman, An algorithm for progressive multiple alignment of sequences with insertions, Proc. Natl. Acad. Sci. U. S. A. 102 (2005) 10557–10562.

[46] G. Talavera, J. Castresana, Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments, Syst. Biol. 56 (2007) 564–577.

[47] S. Montelli, A. Peruffo, T. Patarnello, B. Cozzi, E. Negrisolo, Back to Water: Signature of Adaptive Evolution in Cetacean Mitochondrial tRNAs, PLoS One 11 (2016) e0158129.

[48] W. Lin, R. Zhou, L. Porter, J. Chen, Y. Wu, Evolution of Sousa chinensis: a scenario based on mitochondrial DNA study, Mol. Phylogenet. Evol. 57 (2010) 907–911.

[49] N.D. Pyenson, The Ecological Rise of Whales Chronicled by the Fossil Record, Curr. Biol. 27 (2017) R558–R564.

[50] J.G. Thewissen, L.N. Cooper, M.T. Clementz, S. Bajpai, B.N. Tiwari, Whales originated from aquatic artiodactyls in the Eocene epoch of India, Nature 450 (2007) 1190–1194.

[51] J.G.M. Thewissen, E.M. Williams, The Early Radiations of Cetacea (Mammalia): Evolutionary Pattern and Developmental Correlations, Annu. Rev. Ecol. Syst. 33 (2002) 73–90.

[52] D. Gui, R. Yu, X. He, Q. Tu, L. Chen, Y. Wu, Bioaccumulation and biomagnification of persistent organic pollutants in Indo-Pacific humpback dolphins (Sousa chinensis) from the Pearl River Estuary, China, Chemosphere 114 (2014) 106–113.

[53] T.A. Jefferson, S.K. Hung, B. Würsig, Protecting small cetaceans from coastal development: Impact assessment and mitigation experience in Hong Kong, Mar. Policy 33 (2009) 305–311.

[54] T.A. Jefferson, S.K. Hung, A review of the status of the indo-pacific humpback dolphin (Sousa chinensis) in Chinese Waters, Aquat. Mamm. 30 (2004) 149–158.

[55] X. Xu, J. Song, Z. Zhang, P. Li, G. Yang, K. Zhou, The world's second largest population of humpback dolphins in the waters of Zhanjiang deserves the highest conservation priority, Sci. Rep. 5 (2015) 8147.

[56] T.A. Jefferson, Population biology of the indo-pacific hump-backed dolphin in Hong Kong waters, Wildl. Monogr. (2000) 1–65.

[57] A. Beineke, U. Siebert, P. Wohlsein, W. Baumgartner, Immunology of whales and dolphins, Vet. Immunol. Immunopathol. 133 (2010) 81–94.

[58] B.M. Stone, D.J. Blyde, J.T. Saliki, J.M. Morton, Morbillivirus infection in live stranded, injured, trapped, and captive cetaceans in southeastern Queensland and northern New South Wales, Australia, J. Wildl. Dis. 48 (2012) 47–55.

[59] A.J. Davison, K. Subramaniam, K. Kerr, J.M. Jacob, N. Landrau-Giovannetti, M.T. Walsh, R.S. Wells, T.B. Waltzek, Genome sequence of a gammaherpesvirus from a common bottlenose dolphin (Tursiops truncatus), Genome Announc. 5 (2017).

[60] M.C. Montalvo Villalba, D. Cruz Martinez, I. Ahmad, L.A. Rodriguez Lay, M. Bello Corredor, C. Guevara March, L.S. Martinez, L.S. Martinez-Campo, S. Jameel, Hepatitis E virus in bottlenose dolphins Tursiops truncatus, Dis. Aquat. Org. 123 (2017) 13–18.

[61] J. Diaz-Delgado, E. Sierra, A.I. Vela, M. Arbelo, D. Zucca, K.R. Groch, A. Fernandez, Coinfection by Streptococcus phocae and cetacean morbillivirus in a short-beaked common dolphin Delphinus delphis, Dis. Aquat. Org. 124 (2017) 247–252.

[62] S.J. van Beurden, L.L. IJsseldijk, S.R. Ordonez, C. Forster, G. de Vrieze, A. Grone, M.H. Verheije, M. Kik, Identification of a novel gammaherpesvirus associated with (muco)cutaneous lesions in harbour porpoises (Phocoena phocoena), Arch. Virol. 160 (2015) 3115–3120.

[63] W.R. Johnson, M. Torralba, P.A. Fair, G.D. Bossart, K.E. Nelson, P.J. Morris, Novel diversity of bacterial communities associated with bottlenose dolphin upper respiratory tracts, Environ. Microbiol. Rep. 1 (2009) 555–562.

[64] A. Mancia, M.L. Lundqvist, T.A. Romano, M.M. Peden-Adams, P.A. Fair, M.S. Kindy, B.C. Ellis, S. Gattoni-Celli, D.J. McKillen, H.F. Trent, Y.A. Chen, J.S. Almeida, P.S. Gross, R.W. Chapman, G.W. Warr, A dolphin peripheral blood leukocyte cDNA microarray for studies of immune function and stress reactions, Dev. Comp. Immunol. 31 (2007) 520–529.

[65] R.A. Lende, W.I. Welker, An unusual sensory area in the cerebral neocortex of the bottlenose dolphin, Tursiops truncatus, Brain Res. 45 (1972) 555–560.

[66] A.A. Pack, L.M. Herman, Sensory integration in the bottlenosed dolphin: immediate recognition of complex shapes across the senses of echolocation and vision, J. Acoust. Soc. Am. 98 (1995) 722–733.

[67] F. Taddei, V. Scarcelli, G. Frenzilli, M. Nigro, Genotoxic hazard of pollutants in cetaceans: DNA damage and repair evaluated in the bottlenose dolphin (Tursiops truncatus) by the Comet Assay, Mar. Pollut. Bull. 42 (2001) 324–328.

[68] B. Star, A.J. Nederbragt, S. Jentoft, U. Grimholt, M. Malmstrom, T.F. Gregers, T.B. Rounge, J. Paulsen, M.H. Solbakken, A. Sharma, O.F. Wetten, A. Lanzen, R. Winer, J. Knight, J.H. Vogel, B. Aken, O. Andersen, K. Lagesen, A. Tooming-Klunderud, R.B. Edvardsen, K.G. Tina, M. Espelund, C. Nepal, C. Previti, B.O. Karlsen, T. Moum, M. Skage, P.R. Berg, T. Gjoen, H. Kuhl, J. Thorsen, K. Malde, R. Reinhardt, L. Du, S.D. Johansen, S. Searle, S. Lien, F. Nilsen, I. Jonassen, S.W. Omholt, N.C. Stenseth, K.S. Jakobsen, The genome sequence of Atlantic cod reveals a unique immune system, Nature 477 (2011) 207–210.

[69] T.A. Romano, S.H. Ridgway, V. Quaranta, MHC class II molecules and immunoglobulins on peripheral blood lymphocytes of the bottlenosed dolphin, Tursiops truncatus, J. Exp. Zool. 263 (1992) 96–104.

[70] R. Ruan, X.L. Wan, Y. Zheng, J.S. Zheng, D. Wang, Assembly and characterization of the MHC class I region of the Yangtze finless porpoise (Neophocaena asiaeorientalis asiaeorientalis), Immunogenetics 68 (2016) 77–82.

[71] C. Monzon-Arguello, C. Garcia De Leaniz, G. Gajardo, S. Consuegra, Less can be more: loss of MHC functional diversity can reflect adaptation to novel conditions during fish invasions, Ecol. Evol. 3 (2013) 3359–3368.

[72] K. Shirai, T. Sakai, T. Oike, Molecular cloning of bottle-nosed dolphin (Tursiops truncatus) MHC class I cDNA, J. Vet. Med. Sci. 60 (1998) 1093–1096.

[73] M. O'Driscoll, P.A. Jeggo, The role of the DNA damage response pathways in brain development and microcephaly: insight from human disorders, DNA Repair (Amst) 7 (2008) 1039–1050.

[74] R.J. Britten, Rates of DNA sequence evolution differ between taxonomic groups, Science 231 (1986) 1393–1398.

[75] Y. Shoji, Y. Inoue, H. Sugisawa, T. Itou, T. Endo, T. Sakai, Molecular cloning and functional characterization of bottlenose dolphin (Tursiops truncatus) tumor necrosis factor alpha, Vet. Immunol. Immunopathol. 82 (2001) 183–192.

[76] J.S. Estep, R.E. Baumgartner, F. Townsend, D.A. Pabst, W.A. McLellan, A. Friedlaender, D.G. Dunn, T.P. Lipscomb, Malignant seminoma with metastasis, Sertoli cell tumor, and pheochromocytoma in a spotted dolphin (Stenella frontalis) and malignant seminoma with metastasis in a bottlenose dolphin (Tursiops truncatus), Vet. Pathol. 42 (2005) 357–359.
