manolis kellis: research synopsis

12
Manolis Kellis: Research synopsis Brief overvie 1 slide each vignette Why biology in a computer science group? Big biological questions: 1.Interpreting the human genome. 2.Revealing the logic of gene regulation. 3.Principles of evolutionary change. Underlying computational techniques: Comparative genomics: evolutionary signatures Regulatory genomics: motifs, networks, models Epigenomics: chromatin states, dynamics, disease Phylogenomics: evolution at the genome scale Defining characteristics of research program: Genome-wide rules, exploit nature of problems, interdisciplinary collaborations, biology impact

Upload: lelia

Post on 15-Jan-2016

62 views

Category:

Documents


0 download

DESCRIPTION

Manolis Kellis: Research synopsis. Why biology in a computer science group? Big biological questions: Interpreting the human genome. Revealing the logic of gene regulation. Principles of evolutionary change. Underlying computational techniques: - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Manolis Kellis: Research synopsis

Manolis Kellis: Research synopsis

Brief overview1 slide each

vignette

• Why biology in a computer science group?• Big biological questions:

1. Interpreting the human genome. 2. Revealing the logic of gene regulation. 3. Principles of evolutionary change.

• Underlying computational techniques: – Comparative genomics: evolutionary signatures– Regulatory genomics: motifs, networks, models– Epigenomics: chromatin states, dynamics, disease– Phylogenomics: evolution at the genome scale

• Defining characteristics of research program:– Genome-wide rules, exploit nature of problems,

interdisciplinary collaborations, biology impact

Page 2: Manolis Kellis: Research synopsis

(1) Comparative genomics: evolutionary signaturesProtein-coding signatures• 1000s new coding exons• Translational readthrough• Overlapping constraints

Non-coding RNA signatures• Novel structural families• Targeting, editing, stability• Structures in coding exons

microRNA signatures:• Novel/expanded miR

families• miR/miR* arm cooperation• Sense/anti-sense switchesRegulatory motif signatures• Systematic motif discovery• Regulatory motif instances• TF/miRNA target networks• Single binding-site

resolution

Page 3: Manolis Kellis: Research synopsis

(2) Regulatory genomics: circuits, predictive models

• Initial annotation of the non-coding genome, from 20% to 70%• Systems biology for an animal genome for the first time possible• Students and postdocs are co-first authors, leadership roles

Predictive models of gene regulation•Infer networks•Predict function•Predict regulators•Predict gene expression

ENCODE/modENCODE•4-year effort, dozens of experimental labs•Integrative analysis•Systematic genome annotation•Flagship NIH project

Page 4: Manolis Kellis: Research synopsis

Gen

erat

ive

mo

del

1. Family rate 2. Species-specific rates

SiFj~gamma

(α,β)

~normal(μi,σi)

Selective pressures on gene function

Population dynamics of the species

Two components of gene evolution

(3) Phylogenomics: Bayesian gene-tree reconstructionN

ew p

hyl

og

eno

mic

pip

elin

e

Learned Fj,Sidistributions

Birth-Deathprocess

Branch lengthprior

SequencelikelihoodLength I, Topology T, Reconciliation R

Topologyprior

HKY model(traditional)Alignment data D, species-level parameters θB

ayes

ian

form

ula

tio

n

Page 5: Manolis Kellis: Research synopsis

Vignette: Epigenomics

Jason Ernst, Pouya Kheradpour

Ernst and Kellis, Nature Biotech, 2010Ernst, Kheradpour et al, Nature, 2011 (in press)

Page 6: Manolis Kellis: Research synopsis

Epigenomics and ‘chromatin state’ signatures

• Learn de novo combinations of chromatin marks

• Reveal functional elements

• Use for genome annotation

• Use for studying dynamics across many cell types

Promoter states

Transcribed states

Active Intergenic

Repressed

DNA

Histonetails

Chromatin‘marks’

Page 7: Manolis Kellis: Research synopsis

ChromHMM: learning ‘hidden’ chromatin statesTranscription Start Site

7

EnhancerDNA

Observedchromatin marks. Called based on a poisson distribution

Most likely Hidden State

Transcribed Region

1 6 53 4 6 6 6 6 5

1:

3:

4:

5:

6:

5

High Probability Chromatin Marks in State

2:

0.8

0.9

0.9

0.80.7

0.9

200bp intervals

All probabilities are learned de novo from chromatin data alone (Baum-Welch aka. EM)

2

K4me3 K36me3 K36me3 K36me3 K36me3K4me1 K4me3 K4me1

K27ac

0.8

K4me1

K36me3

K27ac

K4me1K4me3

K4me3

K4me1 K4me1

Each state: vector of emissions, vector of transitions

Page 8: Manolis Kellis: Research synopsis

Chromatin states dynamics across nine cell types

• State definitions are cell-type invariant– Same combinations consistently found

• State locations are cell-type specific– Can study pair-wise or multi-way changes

Page 9: Manolis Kellis: Research synopsis

Multi-cell activity profiles and their correlations

HUVEC

NHEK

GM12878

K562

HepG2

NHLF

HMEC

HSMM

H1

Gene

expression

Chromatin

States

Active TF motif

enrichment

ON

OFF

Active enhancer

Repressed

Motif enrichment

Motif depletion

TF regulator

expression

TF On

TF Off

Dip-aligned

motif biases

Motif aligned

Flat profile

Chromatin state & gene expression link enhancers and target genes

TF motif enrichment & TF expression reveal activators / repressors

Page 10: Manolis Kellis: Research synopsis

Coordinated activity reveals enhancer links

• Enhancer networks: Regulator enhancer target gene

• Ex1: Oct4 predicted activator of embryonic stem (ES) cells

• Ex2: Ets activator of GM/HUVEC (but not either one alone)

Enhanceractivity

Geneactivity

Predicted

regulators Activity signatures for each TF

Page 11: Manolis Kellis: Research synopsis

xx

• Disease-associated SNPs enriched for enhancers in relevant cell types• E.g. lupus SNP in GM enhancer disrupts Ets1 predicted activator

Revisiting disease- associated variants

Page 12: Manolis Kellis: Research synopsis

ScienceNatureNatureNature

Nature BiotechNatureNature

PLoS Genetics

MBEGenome Research

Nature

Genome ResearchNature

Genome ResearchPLoS Comp. Bio.

Genes & DevelopmentGenome Research

NaturePNAS

BMC Evo. Bio.ACM TKDD

Genome ResearchRECOMB

J. Comp. Bio.PNAS

ContributionsWe aim to further our understanding of the human genome by computational integration of large-scale functional and comparative genomics datasets. •We use comparative genomics of multiple related species to recognize evolutionary signatures of protein-coding genes, RNA structures, microRNAs, regulatory motifs, and individual regulatory elements. •We use combinations of epigenetic modifications to define chromatin states associated with distinct functions, including promoter, enhancer, transcribed, and repressed regions, each with distinct functional properties. •We develop phylogenomic methods to study differences between species and to uncover evolutionary mechanisms for the emergence of new gene functionsOur methods have led to numerous new insights on diverse regulatory mechanisms, uncovered evolutionary principles, and provide mechanistic insights for previously uncharacterized disease-associated SNPs

Nature Nature Nature Nature Nature In review

Nature Gen Genes&Dev Nature Nature Biotech

Nature Nature Nature Nature WBpress

PNAS Nature G.R. BioChem

Nature GenomRes Nature G.R. Science

RECOMB RECOMB