A network framework to explore phylogenetic structure in genomic data
Guifang Zhou 1, Jeremy Ash 2, Wen Huang 3, Melissa Marchand 4, David Morris 1, Paul Van Dooren 3, James C. Wilgenbusch 5, Jeremy M. Brown 1, Kyle A. Gallivan 4
1 Department of Biological Sciences, Louisiana State University
2 Bioinformatics Research Center, North Carolina State University
3 ICTEAM Institute, Université catholique de Louvain
4 Department of Mathematics, Florida State University
5 Minnesota Supercomputing Institute, University of Minnesota
June 20, 2016
Motivations
Phylogenetic analyses often produce large sets of competing trees
Summarize interesting evolutionary history: hybridization, recombination, horizontal gene transfer, incomplete lineage sorting
Identify systematic error
Shortcomings of Current Approaches
Consensus tree: discards information concerning competing trees
Dimensionality reduction: may be difficult to interpret
Shortcomings of Current Approaches
Clustering: based on pairwise tree-to-tree distance; only considers nonnegative links
Our Approaches
Apply graph-based methods to understand relationships among:
Tree topologies
Bipartitions within tree topologies
Application
Yeast dataset with 5 species, 106 loci
106 gene trees were reconstructed using maximum parsimony
Topology-based Network Analysis
Affinity matrix: reciprocal of pairwise distances
Detect communities: discovered 11 communities
Consensus trees for each community
Top 2 communities recover the top 2 candidate species trees
Community sizes (figure labels): 62/106, 17/106, 11/106, 4/106, 3/106, 2/106, 2/106, 2/106, ...
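The affinity step above can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes that the tree-to-tree distance is Robinson-Foulds (the slides do not name the metric), that each tree is stored as a set of its non-trivial bipartitions, and that a hypothetical `offset` of 1 is added before taking the reciprocal so that identical topologies (distance 0) get a finite weight. The names `rf_distance` and `affinity_matrix` and the toy trees are made up for the example.

```python
def rf_distance(splits_a, splits_b):
    """Robinson-Foulds distance: bipartitions present in exactly one tree."""
    return len(set(splits_a) ^ set(splits_b))


def affinity_matrix(trees, offset=1.0):
    """Symmetric affinity matrix with entries 1 / (distance + offset).

    The offset is an assumption of this sketch; it keeps the weight
    finite when two trees share the same topology.
    """
    n = len(trees)
    affinity = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = rf_distance(trees[i], trees[j])
            affinity[i][j] = affinity[j][i] = 1.0 / (d + offset)
    return affinity


# Toy gene trees on taxa A-E. Each bipartition is written as the side
# that does not contain E, so AB|CDE becomes frozenset("AB").
t1 = {frozenset("AB"), frozenset("ABC")}
t2 = {frozenset("AB"), frozenset("ABD")}
t3 = {frozenset("AB"), frozenset("ABC")}  # same topology as t1

A = affinity_matrix([t1, t2, t3])
for row in A:
    print(row)
```

Trees with the same topology receive the largest weight, so a community-detection method run on this weighted graph would tend to group them. The slides do not say which community-detection algorithm was used, so that step is left out of the sketch.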
Bipartition-based Network Analysis
Covariance matrix based on presence or absence of bipartitions in the gene trees
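A small sketch of how such a covariance matrix could be built, under stated assumptions: each gene tree is stored as a set of non-trivial bipartitions, presence or absence is coded as 1 or 0, and the sample covariance (dividing by n - 1) is used. The slides do not specify the normalization, and the function names and toy trees here are hypothetical.

```python
def presence_matrix(trees):
    """Return (bipartitions, rows); rows[t][b] is 1 if tree t contains bipartition b."""
    splits = sorted({s for tree in trees for s in tree}, key=sorted)
    rows = [[1 if s in tree else 0 for s in splits] for tree in trees]
    return splits, rows


def covariance_matrix(rows):
    """Sample covariance between the columns (bipartitions) of a 0/1 matrix."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    return [
        [
            sum((r[j] - means[j]) * (r[k] - means[k]) for r in rows) / (n - 1)
            for k in range(m)
        ]
        for j in range(m)
    ]


# Toy gene trees on taxa A-E. Each bipartition is written as the side
# that does not contain E.
trees = [
    {frozenset("AB"), frozenset("ABC")},
    {frozenset("AB"), frozenset("ABD")},
    {frozenset("AB"), frozenset("ABC")},
    {frozenset("AC"), frozenset("ABC")},
]

splits, rows = presence_matrix(trees)
cov = covariance_matrix(rows)
for s, row in zip(splits, cov):
    print("".join(sorted(s)), row)
```

In this toy data the bipartitions ABC|DE and ABD|CE cannot occur in the same tree, so their covariance is negative. A covariance-based network can therefore carry negative links, unlike the distance-based clustering described earlier.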