phylogenomics “the intersection of phylogenetics and genomics” the reconstruction of...


Upload: adrian-cummings

Post on 13-Jan-2016


TRANSCRIPT

Page 1: Phylogenomics “The intersection of phylogenetics and genomics” The reconstruction of evolutionary relationships by comparing sequences of whole genomes

Phylogenomics
• “The intersection of phylogenetics and genomics”
• The reconstruction of evolutionary relationships by comparing sequences of whole genomes or portions of genomes
• Several potential methods/strategies to discuss
• We will focus on:
– Ultraconserved element phylogenetics
– Transposable element phylogenetics
– RADSeq
– PhylomeDB

Page 2

Phylogenomics
• UltraConserved Elements (UCEs)
• Bejerano et al. 2004. Science 304:1321-1325
• “481 segments longer than 200 base pairs (bp) that are absolutely conserved (100% identity with no insertions or deletions) between orthologous regions of the human, rat, and mouse genomes”
• “Nearly all of these segments are also conserved in the chicken and dog genomes, with an average of 95 and 99% identity, respectively. Many are also significantly conserved in fish”
• “more than 5000 sequences of over 100 bp that are absolutely conserved among the three sequenced mammals”
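The definition quoted on this slide (runs of 100% identity with no insertions or deletions) can be sketched as a scan over a multiple alignment. This is a minimal illustration only, not the pipeline used by Bejerano et al.; the function name `conserved_runs`, the toy sequences, and the use of "-" for gaps are assumptions made for the example.

```python
def conserved_runs(aligned, min_len=200):
    """Return half-open (start, end) column ranges in which every aligned
    sequence carries the same base, with no gaps, for at least min_len
    consecutive columns."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")

    runs = []
    start = None
    # Iterate one column past the end so that a run touching the last
    # column is closed and recorded.
    for i in range(length + 1):
        identical = (
            i < length
            and aligned[0][i] != "-"
            and all(seq[i] == aligned[0][i] for seq in aligned)
        )
        if identical:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs


# Toy three-way alignment (hypothetical data); a real scan would use
# min_len=200 on whole-genome alignments.
toy = ["ACGTACGTAC",
       "ACGTACGTTC",
       "ACG-ACGTAC"]
print(conserved_runs(toy, min_len=3))
```

With `min_len=200` the same scan corresponds to the 200 bp threshold in the first quotation; lowering it to 100 corresponds to the "over 100 bp" set in the last one.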

Page 3

Phylogenomics
• UltraConserved Elements
• UCEs have been associated with gene regulation and development
• It is generally assumed that UCEs must be important, given their near-universal conservation across extremely divergent taxa.
• However, gene knockouts of UCE loci in mice resulted in viable, fertile offspring, suggesting that their role in the biology of the genome may be cryptic.

Page 4

Phylogenomics
• By definition, UCEs themselves would be of minimal use in phylogenetics because of their low variability
• The neighboring (linked) sequence is not as highly conserved and is expected to be under less constraint, and therefore to be more variable
• UCEs serve as the anchors to access the neighboring sequence
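The anchor idea can be checked on an alignment by comparing the proportion of variable columns in the UCE core with that in the flanks. The sketch below is only an illustration under assumed inputs: the function names, the toy alignment, and the flank width are hypothetical, and gaps ("-") are simply ignored when judging whether a column varies.

```python
def variable_fraction(aligned, start, end):
    """Fraction of alignment columns in [start, end) that show more than
    one distinct base (gap characters are ignored)."""
    cols = range(max(start, 0), min(end, len(aligned[0])))
    if not cols:
        return 0.0
    variable = 0
    for i in cols:
        bases = {seq[i] for seq in aligned if seq[i] != "-"}
        if len(bases) > 1:
            variable += 1
    return variable / len(cols)


def core_vs_flanks(aligned, core_start, core_end, flank=100):
    """Compare variability in a conserved core with its two flanks."""
    return {
        "left_flank": variable_fraction(aligned, core_start - flank, core_start),
        "core": variable_fraction(aligned, core_start, core_end),
        "right_flank": variable_fraction(aligned, core_end, core_end + flank),
    }


# Toy alignment (hypothetical): columns 2-7 are the conserved core.
toy = ["AATTCCGGAA",
       "ATTTCCGGTA",
       "AGTTCCGGAC"]
print(core_vs_flanks(toy, core_start=2, core_end=8, flank=2))
```

A core value near zero with higher values in the flanks is the pattern that makes the flanking sequence, rather than the UCE itself, phylogenetically informative.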

Page 5

• UCE workflow
• http://ultraconserved.org/
• Target enrichment of ultraconserved elements from arthropods provides a genomic perspective on relationships among Hymenoptera. Mol Ecol Res 2014
• The evolution of peafowl and other taxa with ocelli (eyespots): A phylogenomic approach. Proc R Soc Lond B Biol Sci 281: 20140823. 2014.
• Target Capture and Massively Parallel Sequencing of Ultraconserved Elements (UCEs) for Comparative Studies at Shallow Evolutionary Time Scales. Syst Biol 63:83-95. 2014.
• A Phylogenomic Perspective on the Radiation of Ray-Finned Fishes Based upon Targeted Sequencing of Ultraconserved Elements (UCEs). PLoS ONE 8: e65923. 2013.

Page 6

Phylogenomics
Because of the way they accumulate in a genome, TEs, especially retrotransposons, make excellent markers for phylogenetic analysis
[Figure: SINE accumulation in genomes, showing Subfamily 1, Subfamily 2, and Subfamily 3 accumulating in the genome over time]

Page 7

SINEs as phylogenetic markers
But which SINE families do you target, and how do you identify them?

Page 8

Phylogenomics
• Transposable element phylogenetics
1. Identical by descent
2. Known ancestral state
3. Simple evolutionary model
4. Neutral
5. “Low-tech”
6. Bi-allelic markers
Consistency index = 1.00
Homoplasy index = 0.00
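The two indices on this slide can be worked through on a small presence/absence matrix. The consistency index is the minimum possible number of changes divided by the number of changes required on the tree, and the homoplasy index is one minus that value. The sketch below counts changes with unordered (Fitch) parsimony, which is a simplification: it does not use the known ancestral state (absence) listed above. The tree, taxon names, and matrices are hypothetical.

```python
def fitch(tree, states):
    """Fitch parsimony for one character on a rooted binary tree.
    tree: nested 2-tuples of leaf names; states: dict leaf -> 0/1.
    Returns (candidate state set at this node, number of changes below it)."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_steps = fitch(left, states)
    right_set, right_steps = fitch(right, states)
    common = left_set & right_set
    if common:
        return common, left_steps + right_steps
    # No shared state: one change is needed on a branch below this node.
    return left_set | right_set, left_steps + right_steps + 1


def consistency_index(tree, matrix):
    """CI = (minimum possible changes) / (changes required on the tree)."""
    n_chars = len(next(iter(matrix.values())))
    min_steps = 0
    obs_steps = 0
    for i in range(n_chars):
        states = {taxon: row[i] for taxon, row in matrix.items()}
        min_steps += len(set(states.values())) - 1
        obs_steps += fitch(tree, states)[1]
    if obs_steps == 0:
        raise ValueError("no variable characters in the matrix")
    return min_steps / obs_steps


tree = ((("A", "B"), "C"), "D")

# Three insertion loci (1 = present), each shared by a nested group of taxa.
clean = {"A": [1, 1, 1], "B": [1, 1, 0], "C": [0, 1, 0], "D": [0, 0, 0]}
ci = consistency_index(tree, clean)
print("CI =", ci, "HI =", 1 - ci)

# One locus present in A and C but absent in B conflicts with the tree.
conflicting = {"A": [1], "B": [0], "C": [1], "D": [0]}
print("CI =", consistency_index(tree, conflicting))
```

Insertions that each arose once and are shared only by descent give CI = 1.00 and HI = 0.00, as on the slide; any locus that needs more than one change on the tree pulls CI below 1.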

Page 9

Phylogenomics
• ME-Scan

Page 10

Phylogenomics
• ME-Scan validation

Page 11

Phylogenomics
• RAD-Seq
• Restriction Site Associated DNA Sequencing
• Cresko and colleagues (PLoS ONE 2008;3:e3376, PLoS Genet 2010;6:e1000862, PNAS 2010;107:16196–200.)
• Akin to RFLP and AFLP except that you sequence the fragments
• Rapidly identify genome-wide suites of SNPs and other polymorphisms
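To make the "sequence next to restriction sites" idea concrete, the sketch below locates a recognition sequence in a reference string and extracts the sequence on either side of each site. It is an in silico illustration only: the function names and toy genome are hypothetical, the EcoRI recognition sequence GAATTC is used purely as a familiar example (the slide does not name an enzyme), and the exact cut position within the site is ignored.

```python
def find_sites(genome, site):
    """Return the 0-based start position of every occurrence of the
    recognition sequence, including overlapping occurrences."""
    positions = []
    i = genome.find(site)
    while i != -1:
        positions.append(i)
        i = genome.find(site, i + 1)
    return positions


def flanking_tags(genome, site, flank):
    """For each site, return (upstream, downstream) sequence of up to
    `flank` bases on either side of the recognition sequence."""
    tags = []
    for pos in find_sites(genome, site):
        upstream = genome[max(0, pos - flank):pos]
        downstream = genome[pos + len(site):pos + len(site) + flank]
        tags.append((upstream, downstream))
    return tags


# Toy reference sequence (hypothetical) containing two GAATTC sites.
genome = "TTTTGAATTCAAAACCCCGAATTCGGGG"
print(find_sites(genome, "GAATTC"))
print(flanking_tags(genome, "GAATTC", flank=4))
```

Because every individual is cut at the same recognition sequence, the same flanking regions are sampled across individuals, which is what allows SNPs to be compared among them.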

Page 12

Phylogenomics
• (A) Genomic DNA is digested with a restriction enzyme.
• (B) P1 adapter is ligated to the cut fragments.
• (C) Samples from multiple individuals are pooled together and randomly sheared. Only a subset of the resulting fragments contains restriction sites and P1 adapters.
• (D) P2 adapter is ligated to all fragments. The P2 adapter has a divergent end.
• (E) PCR amplification with P1 and P2 primers. The P2 adapter will be completed only in the fragments ligated with P1 adapter, and so only these fragments will be fully amplified.
• (F) Pooled samples with different MIDs (barcodes) are separated bioinformatically and SNPs are called (C/G SNP underlined).
• (G) As fragments are sheared randomly, paired-end sequences from each sequenced fragment will cover a 300-400 bp region downstream of the restriction site.
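Step (F) can be sketched in a few lines: reads are assigned to individuals by the barcode at the start of each read, and variable positions are then found among reads from the same locus. This is a simplified illustration and not the software used in the cited studies; the barcodes, sample names, and reads are hypothetical, barcodes are assumed not to be prefixes of one another, and sequencing errors are ignored.

```python
from collections import defaultdict


def demultiplex(reads, barcodes):
    """Assign each read to a sample by its leading barcode and trim the
    barcode off. barcodes: dict barcode -> sample name.
    Reads with no recognised barcode are collected under the key None."""
    by_sample = defaultdict(list)
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                by_sample[sample].append(read[len(barcode):])
                break
        else:
            by_sample[None].append(read)
    return dict(by_sample)


def variable_sites(seqs):
    """Return 0-based positions at which equal-length sequences differ."""
    return [i for i, column in enumerate(zip(*seqs)) if len(set(column)) > 1]


# Hypothetical barcodes and pooled reads from one RAD locus.
barcodes = {"ACGT": "ind1", "TGCA": "ind2"}
reads = ["ACGTTGCAGGAC", "TGCATGCAGGAG", "GGGGTGCAGGAC"]

samples = demultiplex(reads, barcodes)
print(samples)
print(variable_sites([samples["ind1"][0], samples["ind2"][0]]))
```

In the toy data the two individuals differ at the last position (C versus G), analogous to the C/G SNP marked in the slide's figure.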

Page 13

Phylogenomics
• RAD-Seq and phylogenetics
• There is potential but there are problems
– “the most substantial obstacle to using RAD sequences for phylogenetics is determining orthology”
– “Deep divergences are problematic for two reasons: first, restriction sites change over time, with losses favored over gains, leading to a reduction in the number of orthologs retained across divergent taxa; second, evolutionary divergence of orthologous RAD sequences compromises the ability to infer their orthology based on sequence similarity. Consequently, taxa that are phylogenetically isolated on long branches are less likely to retain orthologous restriction sites, and the RAD sequences they do retain will be more divergent, diminishing their representation in clusters.”
– “While correct nodes are more likely in general to be strongly supported, incorrect nodes can also have high bootstrap values, although this is not unique to RAD phylogenetics.”
• Probably still really good for phylogeography within species and among closely related species
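The loss of restriction sites with divergence can be illustrated with a back-of-the-envelope calculation. Under the simplifying assumption (not taken from the quoted paper) that each base of a recognition sequence differs independently with probability equal to the pairwise divergence, and ignoring site gains, the chance that a site of length k is still intact in the second taxon is (1 - d)^k. The function name is hypothetical.

```python
def site_retention(divergence, site_len):
    """Probability that a recognition sequence of site_len bases is
    unchanged, assuming each base differs independently with
    probability `divergence` (site gains are ignored)."""
    if not 0.0 <= divergence <= 1.0:
        raise ValueError("divergence must be between 0 and 1")
    return (1.0 - divergence) ** site_len


# Compare a 6-base and an 8-base recognition sequence across a range
# of pairwise divergences.
for d in (0.01, 0.05, 0.10, 0.20):
    print(d, round(site_retention(d, 6), 3), round(site_retention(d, 8), 3))
```

Under this rough model, at 10% divergence fewer than half of the 8-base sites are expected to survive, which is consistent with the slide's point that RAD-Seq fits shallow comparisons better than deep ones.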

Page 14

Phylogenomics
• PhylomeDB
• Remember that gene tree/species tree problem?
• “given the plurality of evolutionary histories among genes encoded in a given genome, there is a need for the combined analysis of genome-wide collections of phylogenetic trees (phylomes).”
• Phylome – the complete collection of evolutionary histories of all genes in a genome
• Huerta-Cepas et al. 2007
• Latest version of PhylomeDB is v4, Nucleic Acids Research 2013
• phylomedb.org

Page 15

Phylogenomics

Page 16

Phylogenomics
• Phylome for gene family TP53 (screenshot from Huerta-Cepas et al. 2013)
[Figure legend: gene duplication events; speciation events]
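One simple way to label the internal nodes of a gene tree as duplication or speciation events is a species-overlap heuristic: a node is called a duplication if its two child lineages share at least one species, and a speciation otherwise. The slide does not state how the labels in the screenshot were produced, so the sketch below is only an assumed illustration of the idea; the leaf naming scheme (`gene_species`), the function names, and the toy tree are hypothetical.

```python
def species_of(leaf):
    """Extract the species from a leaf name such as 'TP53_human'
    (naming convention assumed for this example)."""
    return leaf.rsplit("_", 1)[1]


def label_events(tree):
    """Label internal nodes of a rooted binary gene tree.
    tree: nested 2-tuples of leaf names.
    Returns (species set, leaf list, events), where events is a list of
    (event type, leaves under the node) in post-order."""
    if isinstance(tree, str):
        return {species_of(tree)}, [tree], []
    left, right = tree
    left_sp, left_leaves, left_events = label_events(left)
    right_sp, right_leaves, right_events = label_events(right)
    # Shared species on both sides imply the split predates speciation.
    kind = "duplication" if left_sp & right_sp else "speciation"
    leaves = left_leaves + right_leaves
    events = left_events + right_events + [(kind, tuple(leaves))]
    return left_sp | right_sp, leaves, events


# Toy gene tree (hypothetical topology and names).
gene_tree = ((("TP53_human", "TP53_mouse"),
              ("TP63_human", "TP63_mouse")),
             "p53_fly")

for kind, leaves in label_events(gene_tree)[2]:
    print(kind, leaves)
```

In the toy tree the node joining the two human/mouse pairs is labelled a duplication because both sides contain human and mouse, while the remaining nodes are speciations.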

Page 17

Phylogenomics
• Alternative topology resolution using phylomes
[Table columns: # trees (%) supporting the given phylogeny; # trees (%) with PP >0.9 supporting the given phylogeny; # gene families (%) supporting the given phylogeny]
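The first column of the table above (number and percentage of trees supporting a given phylogeny) can be sketched as a tally over a collection of gene trees. The example below counts how many rooted gene trees contain the clade that defines each alternative hypothesis. It is a simplified illustration that assumes one sequence per species in every tree and ignores support values; the function names, taxa, and trees are hypothetical.

```python
def clades(tree):
    """Return (leaf set, set of clades) for a rooted tree given as nested
    tuples of leaf names. Each clade is a frozenset of leaf names."""
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)
    return leaves, found


def support(gene_trees, hypotheses):
    """Count gene trees containing the clade that defines each hypothesis.
    hypotheses: dict name -> frozenset of taxa.
    Returns dict name -> (number of trees, percentage of trees)."""
    if not gene_trees:
        raise ValueError("need at least one gene tree")
    counts = {name: 0 for name in hypotheses}
    for tree in gene_trees:
        _, found = clades(tree)
        for name, clade in hypotheses.items():
            if clade in found:
                counts[name] += 1
    total = len(gene_trees)
    return {name: (n, 100.0 * n / total) for name, n in counts.items()}


# Four toy gene trees for taxa A, B, C with outgroup O (hypothetical).
gene_trees = [
    ((("A", "B"), "C"), "O"),
    ((("A", "B"), "C"), "O"),
    ((("A", "C"), "B"), "O"),
    ((("B", "C"), "A"), "O"),
]
hypotheses = {
    "A+B": frozenset(["A", "B"]),
    "A+C": frozenset(["A", "C"]),
    "B+C": frozenset(["B", "C"]),
}
print(support(gene_trees, hypotheses))
```

Restricting the tally to trees in which the clade has high support (the PP > 0.9 column), or counting once per gene family, would need support values and family assignments, neither of which this sketch models.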