Towards an understanding of diversity in biological and biomedical systems


Upload: cursongs

Post on 22-Nov-2014


DESCRIPTION

Towards an understanding of diversity in biological and biomedical systems. Igor Zwir. Massive sequencing data analysis workshop, Granada 2011.

TRANSCRIPT

Page 1: Towards an understanding of diversity in biological and biomedical systems

Data analysis workshop for massive sequencing data

Igor Zwir

Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain

Howard Hughes Medical Institute, Yale School of Medicine, New Haven, CT, US

Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, US

e-mail: [email protected]

Towards an understanding of diversity in biological and biomedical systems

Page 2: Towards an understanding of diversity in biological and biomedical systems

“Some people enjoy reading papers, juggling possibilities and formulating ideas, even if they can’t work a pipette”

(“Reasoning for results”, Nature, Bray, D., 2001)

“Some people enjoy reading papers, juggling possibilities and formulating ideas, even if they can’t write a line of a computer program”

(“Reasoning for results”, Groisman Lab, 2007)

Page 3: Towards an understanding of diversity in biological and biomedical systems

“…organisms of the most different sorts are constructed from the very same battery of genes. The diversity of life forms results from small changes in the regulatory systems that govern expression of these genes.”

François Jacob, in Of Flies, Mice and Men

Page 4: Towards an understanding of diversity in biological and biomedical systems

Salmonella: a Gram-negative pathogen with a varied lifestyle

Page 5: Towards an understanding of diversity in biological and biomedical systems

Signal transduction cascade by two-component regulatory systems

[Diagram: Signal (low Mg2+) → Sensor (PhoQ) → Regulator (PhoP, -PO3) → Effectors (mgtA, mgtB) → Response (Mg2+ transport)]

Page 6: Towards an understanding of diversity in biological and biomedical systems

Two-component systems regulate physiological and virulence functions

System      Signal                Function
ArcA/ArcB   Quinones              Anaerobic respiration
OmpR/EnvZ   Osmolarity changes    Osmoadaptation
NtrB/NtrC   Low nitrogen levels   Nitrogen metabolism
PhoP/PhoQ   Low Mg2+              Virulence, growth in low Mg2+
PmrA/PmrB   Fe3+ and Al3+         Resistance to polymyxin B
SsrA/SpiR   Unknown               Virulence
TtrR/TtrS   Tetrathionate         Anaerobic respiration

Page 7: Towards an understanding of diversity in biological and biomedical systems

The Salmonella PmrA/PmrB system responds to Fe3+ and low Mg2+

[Diagram labels: high Fe3+, PmrB, PmrA-PO3; low Mg2+, PhoQ, PhoP-PO3; pmrD, PmrD; pbgP; LPS modification]

Page 8: Towards an understanding of diversity in biological and biomedical systems

The E. coli PmrA/PmrB system responds to Fe3+ but not to low Mg2+

[Diagram labels: high Fe3+, PmrB, PmrA-PO3; low Mg2+, PhoQ, PhoP-PO3; pmrD, PmrD; pbgP; LPS modification; identity labels 85.4%, 93.3%]

Page 9: Towards an understanding of diversity in biological and biomedical systems

The Salmonella but not the E. coli ugd gene is regulated by the PhoP protein

(the median amino acid identity between Salmonella and E. coli proteins is 90%)

[Diagram, two panels: PhoQ, PhoP-PO3, ugd; identity labels 85.4% and 93.3% beside PhoQ and PhoP-PO3, 85.5% beside ugd]
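The identity percentages on this slide compare orthologous proteins from the two species. As a rough illustration of how such a figure can be computed, the sketch below counts matching residues over the ungapped columns of a pairwise alignment. The function name `percent_identity`, the toy sequences, and the choice of denominator (ungapped columns only) are assumptions made for illustration; the slides do not say how the reported values were obtained.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of ungapped alignment columns in which both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment (invented, not real PhoP/PhoQ sequence):
# 6 ungapped columns, 5 of them identical.
print(round(percent_identity("MKRV-LL", "MKKVALL"), 1))  # 83.3
```

Other conventions divide by the alignment length or by the shorter sequence length, so values from different tools are not always directly comparable.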

Page 10: Towards an understanding of diversity in biological and biomedical systems

The PhoP/PhoQ two-component system regulates 5% of Salmonella genes

[Figure: consensus motif, Salmonella LT2 & E. coli K12]

Page 11: Towards an understanding of diversity in biological and biomedical systems

Single motif vs. a family of PhoP submotifs

Harari et al., PLoS Computational Biology, 2010

[Figure labels: +Sensitivity, +Specificity]
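The comparison between a single motif and a family of submotifs is framed in terms of sensitivity and specificity. A minimal sketch of how those two quantities can be computed for a set of predicted binding sites follows. The function name, the promoter labels, and the toy predictions are all hypothetical; nothing here reproduces the evaluation in Harari et al.

```python
def sensitivity_specificity(predicted, known_sites, candidates):
    """Return (sensitivity, specificity) for a set of predicted binding sites.

    predicted   -- promoters the motif model calls as bound
    known_sites -- promoters with experimentally supported binding
    candidates  -- every promoter that was scanned
    """
    predicted = set(predicted)
    known_sites = set(known_sites)
    negatives = set(candidates) - known_sites
    tp = len(predicted & known_sites)   # true positives
    fn = len(known_sites - predicted)   # missed known sites
    fp = len(predicted & negatives)     # false calls
    tn = len(negatives - predicted)     # correctly rejected
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one known site and one known non-site")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (invented): ten scanned promoters, four with known sites.
scanned = {f"p{i}" for i in range(1, 11)}
known = {"p1", "p2", "p3", "p4"}
single_motif_calls = {"p1", "p2"}
submotif_calls = {"p1", "p2", "p3", "p5"}

print(sensitivity_specificity(single_motif_calls, known, scanned))
print(sensitivity_specificity(submotif_calls, known, scanned))
```

In this invented example the submotif calls recover more known sites at the cost of one false call, which is the kind of trade-off the two axis labels describe.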

Page 12: Towards an understanding of diversity in biological and biomedical systems

PhoP submotifs improve binding site (BS) detection

[Figure label: 26 BS]

Page 13: Towards an understanding of diversity in biological and biomedical systems

Genome-wide analysis: custom tiling arrays and ChIP assays

Page 14: Towards an understanding of diversity in biological and biomedical systems

Evolution of submotifs throughout the Gamma/Enterobacteria

Perez et al., PLoS Genetics, 2009; Harari et al., PLoS Computational Biology, 2010

[Figure: information content of submotifs S01 and S05; Background (HKY85 model); PhoP (Halpern-Bruno)]
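The figure on this slide reports information content for the submotifs. As a reminder of what that quantity measures, the sketch below computes the information content of a single motif column as the relative entropy (in bits) between the observed base frequencies and a background distribution. The uniform background, the `pseudocount` parameter, and the function name are assumptions for illustration; the evolutionary models named on the slide (HKY85, Halpern-Bruno) are not implemented here.

```python
import math

BASES = "ACGT"


def column_information(counts, background=None, pseudocount=0.0):
    """Information content (bits) of one motif column.

    counts     -- dict mapping base -> observed count in that column
    background -- dict mapping base -> background probability (uniform if None)
    """
    if background is None:
        background = {b: 0.25 for b in BASES}
    total = sum(counts.get(b, 0) for b in BASES) + pseudocount * len(BASES)
    if total <= 0:
        raise ValueError("column has no observations")
    ic = 0.0
    for b in BASES:
        p = (counts.get(b, 0) + pseudocount) / total
        if p > 0:  # 0 * log(0) is taken as 0
            ic += p * math.log2(p / background[b])
    return ic


print(column_information({"A": 10}))                          # 2.0
print(column_information({"A": 5, "T": 5}))                   # 1.0
print(column_information({"A": 5, "C": 5, "G": 5, "T": 5}))   # 0.0
```

A fully conserved position carries 2 bits against a uniform background, and a position that matches the background carries none; summing over columns gives the information content of the whole motif.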

Page 15: Towards an understanding of diversity in biological and biomedical systems

The submotifs and the PhoP protein evolve at correlated rates

Page 16: Towards an understanding of diversity in biological and biomedical systems

In vitro affinities correlate well with the top three families of submotifs

Page 17: Towards an understanding of diversity in biological and biomedical systems

Zwir et al., PNAS, 2005; Zwir et al., Bioinformatics, 2005; Harari et al., BMC Bioinformatics, 2009

Page 18: Towards an understanding of diversity in biological and biomedical systems

Submotifs & distances from the RNAP binding site

[Figure: distance classes Close, Medium, Remote; labels 45% and 21%]

Harari et al., PLoS Computational Biology, 2010

Page 19: Towards an understanding of diversity in biological and biomedical systems

Two closely related species show distinct promoter preferences

[Figure: distance classes Close, Medium, Remote]

Submotifs & distances can distinguish Salmonella & E. coli

Page 20: Towards an understanding of diversity in biological and biomedical systems

Two distantly related species show distinct promoter architectures

Page 21: Towards an understanding of diversity in biological and biomedical systems

PhoP-activated genes are bound and transcribed at different times and levels

Page 22: Towards an understanding of diversity in biological and biomedical systems

Predicting gene binding and transcription of PhoP-regulated targets

[Figure labels: ancestral, horizontally-acquired]

Page 23: Towards an understanding of diversity in biological and biomedical systems

Summary

TF affinity for its binding sites determines promoter timing and levels in naked DNA

Binding and transcription in vivo depend on where the binding sites sit (promoter architectures)

Cis-acting features in the PhoP-activated promoters determine non-arbitrary, organized architectures

Differences in the regulon across species depend on the evolution of the binding sites and promoter architectures

Page 24: Towards an understanding of diversity in biological and biomedical systems

Two paradigms: multiple genes with small effect, or few genes with large effect

[Figure panels: London Metro, Boston Metro]

de Vries, Nature Medicine, 2009

Page 25: Towards an understanding of diversity in biological and biomedical systems

Phenotypic-genotypic relations describe a risk surface of schizophrenia

0.1% of the population affected

Multigenic disease

Non-genetic contributions

Risk: monozygotic twins 50%, dizygotic twins 15%

Gottesman II, Gould TD. Am J Psychiatry, 2003

[Figure labels: R10: 11 affected, 6 relatives; R19: 6 affected, 1 relative]

Page 26: Towards an understanding of diversity in biological and biomedical systems

Uncovering genotype-phenotype relations by independently clustering both domains

Trios (affected, relatives and controls)

70 clinical attributes: cognitive, motor, behavioral, structural

SNP chips

[Figure: subjects grouped into phenotype clusters and, separately, into genotype clusters]

Page 27: Towards an understanding of diversity in biological and biomedical systems

Identifying significant genotype-phenotype relations among inter-domain clusters

[Figure scale labels: 1E-10, 0.01]

Romero-Zaliz et al., Nucleic Acids Research, 2008; Romero-Zaliz et al., IEEE Trans. on Evol. Computation, 2008; de Erausquin et al., Mol. Psych., in press
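One common way to decide whether a phenotype cluster and a genotype cluster share more subjects than chance would allow is a one-sided hypergeometric test on their overlap. The sketch below shows that calculation using only the standard library. Treating the slide's significance values as coming from this particular test is an assumption, and the function name and toy numbers are hypothetical.

```python
from math import comb


def overlap_pvalue(n_total, n_pheno, n_geno, n_shared):
    """P(X >= n_shared) when n_geno subjects are drawn without replacement
    from n_total subjects, of which n_pheno belong to the phenotype cluster."""
    if not (0 <= n_pheno <= n_total and 0 <= n_geno <= n_total):
        raise ValueError("cluster sizes must lie between 0 and n_total")
    if n_shared < 0:
        raise ValueError("n_shared must be non-negative")
    upper = min(n_pheno, n_geno)
    tail = sum(
        comb(n_pheno, k) * comb(n_total - n_pheno, n_geno - k)
        for k in range(n_shared, upper + 1)
    )
    return tail / comb(n_total, n_geno)


# Toy numbers (invented): 100 subjects, a phenotype cluster of 20,
# a genotype cluster of 15, and 12 subjects in both.
print(overlap_pvalue(100, 20, 15, 12))
```

When many cluster pairs are tested, the resulting p-values would normally be corrected for multiple comparisons before a threshold is applied.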

Page 28: Towards an understanding of diversity in biological and biomedical systems

Phenotype relations

Page 29: Towards an understanding of diversity in biological and biomedical systems

=~

Genotype relations

Page 30: Towards an understanding of diversity in biological and biomedical systems

Optimal (multiobjective/multimodal) relations are hierarchically organized
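In a multiobjective setting, "optimal" usually means non-dominated: no other candidate is at least as good on every objective and strictly better on one. The sketch below filters a list of candidates down to that Pareto front. The two objectives in the toy data and the function name are hypothetical, and this is not the evolutionary algorithm used in the cited work.

```python
def pareto_front(points):
    """Return the points not dominated by any other point.

    Each point is a tuple of objective values; all objectives are maximized.
    """
    def dominates(a, b):
        # a dominates b: no worse on every objective, strictly better on one
        return all(x >= y for x, y in zip(a, b)) and any(
            x > y for x, y in zip(a, b)
        )

    return [p for p in points if not any(dominates(q, p) for q in points)]


# Toy candidates (invented), scored on two objectives to maximize.
candidates = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1)]
print(pareto_front(candidates))  # [(1, 5), (2, 4), (3, 3)]
```

Keeping the whole front, rather than a single best candidate, is what lets several distinct relations be reported side by side.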

Page 31: Towards an understanding of diversity in biological and biomedical systems

Relations reflect the risk of schizophrenia

First-degree relatives have a genetic predisposition

Page 32: Towards an understanding of diversity in biological and biomedical systems

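Validation using an independent set of subjects

Relation   Risk (%)   Subject IDs
R22        91         10164, 10170
R19        88         10155, 10192
R05        61         10184
R06        57         10156
R11        32         10181
R30        28         20148, 10127
R29        17         10198, 10158, 10165
R24        9          10193, 10151, 10166
R25        1          10157

[The slide's table places each subject ID under an Affected, Relative, or Control column; that placement is not recoverable from the transcript.]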

Page 33: Towards an understanding of diversity in biological and biomedical systems

Qualitative significance of learned SNPs

[Figure: Pathway analysis, Process for Neurological Disease]

Page 34: Towards an understanding of diversity in biological and biomedical systems

Neuronal cell adhesion pathway derived from the genotype domain of the relations

Page 35: Towards an understanding of diversity in biological and biomedical systems

Novel pathways: oxidative stress and epigenetic control of gene expression

Page 36: Towards an understanding of diversity in biological and biomedical systems

Summary

We proposed the first data-driven definition of the schizophrenia risk function

Concurrent CGWAS provides a panoramic vision of phenotype-genotype associations, each of which can be used by traditional GWAS analysis

Four signaling pathways associated with risk of schizophrenia were identified

Phenotype-genotype relations were sufficient to reliably predict subject status

This finding opens the door for early detection and preventative intervention prior to the onset of psychotic symptoms in high/intermediate risk populations

Page 37: Towards an understanding of diversity in biological and biomedical systems

Acknowledgements

Eduardo Groisman Lab, Howard Hughes Medical Institute

Dongwoo Shin

Christian Perez

Henry Huang Lab, Dept. of Molecular Microbiology, Washington U. School of Medicine, USA

Gabriel de Erausquin Lab, Departments of Psychiatry and Neurology, Harvard Med. School

Dept. of Computer Science and Artificial Intelligence, University of Granada, Spain

Coral del Val

Pat Anders

Javier Arnedo

Luis Miguel Merino

Rocio Romero-Zaliz (U. de Granada)

Cristina Rubio-Escudero (U. Seville)

Christopher Previti (U. Bergen)

Oscar Harari (Washington U.)

Page 38: Towards an understanding of diversity in biological and biomedical systems

Acknowledgments

Francisco Herrera

Coral del Val

Igor Zwir

Mining for Modeling Lab

Kathleen Marchal

Department of Microbial and Molecular Systems, Katholieke Universiteit Leuven

Department of Psychiatry, Washington University in St. Louis

Gabriel de Eraúsquin

Department of Molecular Biology, Washington University in St. Louis

Henry Huang

DECSAI, University of Granada

DECSAI, University of Granada

DECSAI, University of Granada

DECSAI, University of Granada

HHMI, Department of Molecular Biology, Washington University in St. Louis

Eduardo Groisman