finding orthologous groups

Upload: nessa

Post on 25-Feb-2016

DESCRIPTION

Finding Orthologous Groups. René van der Heijden. What is this lecture about? What is 'orthology'? Why do we study gene ancestry and gene trees (phylogenies)? Several approaches to find orthologous genes. High-resolution orthology. Steps involved. Things to think about (homework).

TRANSCRIPT

  • Finding Orthologous Groups

    René van der Heijden

  • What is this lecture about?

    What is orthology?
    Why do we study gene ancestry and gene trees (phylogenies)?
    Several approaches to find orthologous genes
    High-resolution orthology
    Steps involved
    Things to think about (homework)

  • Homology

    Genes are homologous if and only if they derive from the same ancestral gene.
    Sufficient sequence similarity proves homology.
    Very dissimilar sequences: PSI-BLAST, HMM searches.

  • Homologous genes tend to have similar functions

    Accurate function prediction requires something better than homology: orthology.

  • Duplications, Speciations, and Orthology

    Evolution results in:
    Growing number of genes: gene duplications, horizontal gene transfer, de novo generation.
    Growing number of species.
    The fate of gene duplicates: perish, or find a new functional niche.
    Tendency for functional expansion.

  • Duplications, Speciations, and Orthology

    Two genes in two species are orthologous if they derive from one gene in their last common ancestor.

    Orthologous genes are likely to have the same function.
    This is much stronger than "tend to have similar functions".

  • Duplications, Speciations, and Orthology

    [Figure: gene tree. Labels: primal ancestor, present genes, evolutionary distance.]

  • Homologs, Orthologs, and Paralogs

    Homologous: one common ancestral gene.
    Orthologous: separated by a speciation event.
    Paralogous: separated by a duplication event.

    Orthologs and paralogs must be homologs.
    Are there homologous genes that are neither orthologous nor paralogous?
    The view on orthology and paralogy is relative to a certain speciation.
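
The three definitions above can be turned into a small lookup on a gene tree whose internal nodes are already labeled. The following is a minimal sketch, assuming a hypothetical `Node` class and hand-assigned event labels (neither comes from the lecture): the relation between two genes is read from the event at their last common ancestor node.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Gene-tree node (hypothetical helper): leaves carry a gene name,
    internal nodes carry an event label."""
    name: Optional[str] = None    # gene name, leaves only
    event: Optional[str] = None   # "speciation" or "duplication", internal nodes only
    children: List["Node"] = field(default_factory=list)


def leaves(node):
    """Return the set of gene names below (or at) this node."""
    if not node.children:
        return {node.name}
    found = set()
    for child in node.children:
        found |= leaves(child)
    return found


def relation(root, gene_a, gene_b):
    """Classify two genes by the event at their last common ancestor node."""
    if gene_a == gene_b or not {gene_a, gene_b} <= leaves(root):
        raise ValueError("need two different genes that are both in the tree")
    node = root
    while True:
        # Descend as long as a single child still contains both genes.
        deeper = next((c for c in node.children
                       if {gene_a, gene_b} <= leaves(c)), None)
        if deeper is None:
            break
        node = deeper
    return "orthologous" if node.event == "speciation" else "paralogous"


# Toy tree: one duplication, followed by a speciation in each copy.
tree = Node(event="duplication", children=[
    Node(event="speciation", children=[Node(name="A1"), Node(name="B1")]),
    Node(event="speciation", children=[Node(name="A2"), Node(name="B2")]),
])

print(relation(tree, "A1", "B1"))  # orthologous
print(relation(tree, "A1", "B2"))  # paralogous
```

Any two genes in one tree are homologous by construction, so the sketch only has to decide between the two remaining labels.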

  • Inparalogs and Outparalogs

    Both inparalogous and outparalogous genes are separated by a gene duplication event.
    For inparalogs, the duplication event is not followed by speciation(s).
    Outparalogs are separated by a duplication event that is followed by speciation(s).

    Inparalogs are recent paralogs; outparalogs are more ancient paralogs.
    Are inparalogs orthologs? That depends on your definition:
    Yes: two genes are orthologous if they derive from one gene in their last common ancestor.
    No: two genes are orthologous if they are separated only by cell division events.

  • Reading Gene-Trees

    Although genes spec1,1 and spec2,1 are closer relatives, their distance is larger than that between spec1,1 and spec3,1.
    The tree suggests at least 2 gene losses.

  • In- and Outparalogs, Orthologs, and Co-orthologs

  • More examples

  • www = What, Why, and hoW?

    What: orthologous genes are separated by cell division only.
    Why: orthologous genes are likely to have the same function.
    How: yes, how can orthologous relations be established?

  • Several approaches

    The COG approach
    InParanoid
    Tree-based methods

  • COG approach

    Based on BLAST hits.
    Establishment and extension of triangles.

  • COG approach II

    Extension of orthologous groups.
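
The triangle idea can be sketched in a few lines. This is a simplified illustration, not the published COG procedure: it assumes symmetric best hits are already known, stored in a hypothetical `best_hit` dictionary keyed by `(gene, target_species)`, and it merges any triangles that share a side (two genes). All gene and species names are made up.

```python
from itertools import combinations


def bidirectional_best_hits(best_hit, species_of):
    """Pairs of genes that are each other's best hit across two species."""
    pairs = set()
    for (gene, _target_species), hit in best_hit.items():
        if best_hit.get((hit, species_of[gene])) == gene:
            pairs.add(frozenset((gene, hit)))
    return pairs


def triangles(pairs, species_of):
    """Three genes from three different species that are pairwise best hits."""
    genes = sorted({g for pair in pairs for g in pair})
    found = []
    for a, b, c in combinations(genes, 3):
        if len({species_of[a], species_of[b], species_of[c]}) < 3:
            continue
        if all(frozenset(p) in pairs for p in ((a, b), (a, c), (b, c))):
            found.append({a, b, c})
    return found


def merge_triangles(tris):
    """Extend groups by merging triangles that share a side (two genes)."""
    groups = []
    for tri in tris:
        group = set(tri)
        merged = True
        while merged:
            merged = False
            for other in groups:
                if len(other & group) >= 2:
                    group |= other
                    groups.remove(other)
                    merged = True
                    break
        groups.append(group)
    return groups


# Toy data: one gene in each of four species; c1 has no hit in species D.
species_of = {"a1": "A", "b1": "B", "c1": "C", "d1": "D"}
best_hit = {
    ("a1", "B"): "b1", ("a1", "C"): "c1", ("a1", "D"): "d1",
    ("b1", "A"): "a1", ("b1", "C"): "c1", ("b1", "D"): "d1",
    ("c1", "A"): "a1", ("c1", "B"): "b1",
    ("d1", "A"): "a1", ("d1", "B"): "b1", ("d1", "C"): "c1",
}

pairs = bidirectional_best_hits(best_hit, species_of)
groups = merge_triangles(triangles(pairs, species_of))
print(groups)  # one group containing a1, b1, c1 and d1
```

Here the triangles a1-b1-c1 and a1-b1-d1 share the side a1-b1, so they end up in one group even though c1 and d1 are not mutual best hits.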

  • InParanoid I

    The method identifies in- and outparalogs, for two species.
    Find all hits from species A on B.
    Find all hits from species B on A.
    Find all bi-directional best hits (BBH).
    These form putative orthologs.

  • InParanoid II

    Find all hits from A on A.
    Find all hits from B on B.
    Find all inparalogs: these are all hits better than the orthologs.
    Better => more recently split.

  • InParanoid III

    Putative orthologous pairs are curated by an outgroup species C.
    Inparalogs are given a confidence value.
    Bootstrapping is used to give confidence values for orthologous pairs.
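
Steps I and II above can be sketched as follows. This is a minimal illustration and not the InParanoid program itself: it assumes a hypothetical `scores` dictionary of pairwise similarity scores (for example BLAST bit scores), and it leaves out the outgroup check, the confidence values and the bootstrapping of step III. All gene names and numbers are invented.

```python
def best_hits(scores, queries, targets):
    """For each query gene, the target gene with the highest score."""
    best = {}
    for q in queries:
        candidates = [(scores[(q, t)], t) for t in targets if (q, t) in scores]
        if candidates:
            best[q] = max(candidates)[1]
    return best


def putative_orthologs(scores, genes_a, genes_b):
    """Bi-directional best hits (BBH) between species A and species B."""
    a_to_b = best_hits(scores, genes_a, genes_b)
    b_to_a = best_hits(scores, genes_b, genes_a)
    return [(a, b) for a, b in a_to_b.items() if b_to_a.get(b) == a]


def inparalogs_of(scores, seed, ortholog, own_species_genes):
    """Same-species genes that hit the seed better than its ortholog does."""
    threshold = scores[(seed, ortholog)]
    return [g for g in own_species_genes
            if g != seed and scores.get((seed, g), 0) > threshold]


# Toy scores: a1 and a2 are very similar to each other, a3 is distant.
scores = {
    ("a1", "b1"): 200, ("b1", "a1"): 200,
    ("a2", "b1"): 180, ("b1", "a2"): 180,
    ("a3", "b1"): 80,  ("b1", "a3"): 80,
    ("a1", "a2"): 350, ("a2", "a1"): 350,
    ("a1", "a3"): 90,  ("a3", "a1"): 90,
}
genes_a, genes_b = ["a1", "a2", "a3"], ["b1"]

pairs = putative_orthologs(scores, genes_a, genes_b)
print(pairs)                                          # [('a1', 'b1')]
print(inparalogs_of(scores, "a1", "b1", genes_a))     # ['a2']
```

In the toy data a2 scores higher against a1 than the ortholog b1 does, so it is taken to have split off more recently than the speciation and is reported as an inparalog; a3 scores lower and is left out.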

  • Genes with promiscuous domains

    Gene A may hit on gene B because of a shared domain X.
    Gene B may hit on gene C because of a shared domain Y.

    Promiscuous domains require (manual) curation.

  • Tree-based methods

    Get all homologous genes.
    Make multiple alignments.
    Generate phylogenetic gene trees.
    Analyze trees.

    Uncertainty in multiple alignment?
    Different methods for distance calculations.
    Superpose a trusted species tree?
    How to assess a level of accuracy?

  • The Phylogenetic Gene-Tree

    Multiple alignment for all genes.

    Distance matrix calculation: Kimura correction, PAM model, categories model.
    Large trees: distance-based methods (neighbor joining).
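
As an illustration of the distance step, the sketch below computes the observed fraction of differing positions between two aligned protein sequences and applies Kimura's empirical correction, d = -ln(1 - p - 0.2 p^2). The slide only names "Kimura correction", so the choice of this particular formula is an assumption, and the function names and sequences are made up for the example.

```python
import math


def p_distance(seq_a, seq_b, gap="-"):
    """Fraction of differing positions, ignoring columns with a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = differing = 0
    for x, y in zip(seq_a, seq_b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def kimura_distance(p):
    """Kimura's empirical correction for multiple substitutions (assumed form)."""
    inner = 1.0 - p - 0.2 * p * p
    if inner <= 0:
        raise ValueError("sequences too divergent for this correction")
    return -math.log(inner)


# Two aligned toy sequences that differ at one of ten positions.
p = p_distance("ACDEFGHIKL", "ACDEFGHIKV")
d = kimura_distance(p)
print(p, d)
```

The corrected distance is always at least as large as the observed fraction, because some positions will have changed more than once. A full distance matrix is obtained by applying the two functions to every pair of rows in the alignment.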

  • Uncertainty in trees

    Evolutionary noise: differing rates of evolution; convergent evolution (low complexity, coiled coils); promiscuous domains (recombination, fusion, fission).
    Use of heuristic methods: multiple alignment; tree making.

  • Analyze trees but don't trust them fully

    Rigid analysis suggests many duplications and losses.
    Presume scp branch is wrongly placed!

    If this is correct ... this can't be.

  • Analyze trees but don't trust them fully

    And if we accept wrong placement of branches:

    Three orthologous groups suggesting 15 gene losses.
    Considering one wrongly placed gene leaves only 2 gene losses.

  • High-res versus Low-res

    Many, complete, and closely related genomes.
    Challenge: automatic orthology assignment.

  • Things to think about (homework)

    Select a partner.
    Collect a gene tree (and some copies).
    Carefully deduce which nodes are duplications and which are speciations.
    Denote which genes are orthologous to each other (orthologous groups).
    Select interesting parts to predict what the COG procedure would say and what InParanoid would say.
    What would have happened if some genes (or species) were not involved in the analysis?

  • Homework: also think about

    No need to explain about Large Scale I guess

    Duplication node: the sets of species in the branches overlap.
    Ancient: the duplication took place before any speciation within the species considered.
    Recent: a duplication within a currently existing species.
    Speciation node: all genes that arise from the node are of different species, except recent duplicates.
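
The node rules above can be tried out on a small tree. The following is a minimal sketch under stated assumptions: the tree is a nested pair of Python tuples, leaves are strings named like `spec1,1` (species, then gene number), and the function names are hypothetical. A node is called a duplication when the species sets of its two branches overlap, and a speciation otherwise; genes on opposite sides of a speciation node are reported as orthologous.

```python
def species_of(leaf):
    """Species part of a leaf name such as 'spec1,2'."""
    return leaf.split(",")[0]


def label_nodes(tree, events):
    """Post-order walk; appends (event, left_genes, right_genes) per internal node.

    Returns the species set and the gene list found below `tree`.
    """
    if isinstance(tree, str):
        return {species_of(tree)}, [tree]
    left, right = tree
    left_species, left_genes = label_nodes(left, events)
    right_species, right_genes = label_nodes(right, events)
    # Species overlap between the two branches marks a duplication.
    event = "duplication" if left_species & right_species else "speciation"
    events.append((event, left_genes, right_genes))
    return left_species | right_species, left_genes + right_genes


def orthologous_pairs(tree):
    """Gene pairs whose last common ancestor is a speciation node."""
    events = []
    label_nodes(tree, events)
    pairs = set()
    for event, left_genes, right_genes in events:
        if event == "speciation":
            for a in left_genes:
                for b in right_genes:
                    pairs.add(frozenset((a, b)))
    return pairs


# Toy tree: an ancient duplication at the root, plus a recent one in spec1.
tree = ((("spec1,1", "spec1,2"), "spec2,1"), ("spec1,3", "spec2,2"))

events = []
label_nodes(tree, events)
print([event for event, _, _ in events])
# ['duplication', 'speciation', 'speciation', 'duplication']
print(sorted(sorted(pair) for pair in orthologous_pairs(tree)))
# [['spec1,1', 'spec2,1'], ['spec1,2', 'spec2,1'], ['spec1,3', 'spec2,2']]
```

The recent duplicates spec1,1 and spec1,2 sit below a speciation node together with spec2,1, which matches the "except recent duplicates" clause: both come out as orthologous to spec2,1.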