friend niehs 2013-03-01

If not Integrating genomes and networks to understand health and disease

Upload: sage-base

Post on 13-Jul-2015


TRANSCRIPT

Page 1: Friend NIEHS 2013-03-01

If not

Integrating genomes and networks to understand health and disease

Page 2: Friend NIEHS 2013-03-01

Examples of being Naive:

Expression Profiles

Page 3: Friend NIEHS 2013-03-01

2000

Page 4: Friend NIEHS 2013-03-01
Page 5: Friend NIEHS 2013-03-01

Examples of being Naive:

DNA Alterations

Page 6: Friend NIEHS 2013-03-01
Page 7: Friend NIEHS 2013-03-01
Page 8: Friend NIEHS 2013-03-01

Examples of being Naive:

Synthetic Lethal Screens

Page 9: Friend NIEHS 2013-03-01
Page 10: Friend NIEHS 2013-03-01
Page 11: Friend NIEHS 2013-03-01

Examples of being Naive:

Drugs and Trials

Page 12: Friend NIEHS 2013-03-01

PARP IGF1-R m-TOR VEGF-R Wee-1

Page 13: Friend NIEHS 2013-03-01

Reality: Overlapping Pathways

Page 14: Friend NIEHS 2013-03-01
Page 15: Friend NIEHS 2013-03-01
Page 16: Friend NIEHS 2013-03-01

• alchemist

Page 17: Friend NIEHS 2013-03-01

How often are we hurt by going from the particular to the general in very complex systems driven by context?

Is this going from the particular to the general a central problem in Hypothesis Driven Biomedical Research?

How often do we inappropriately praise findings that go on to have awkward adjacencies?

Page 18: Friend NIEHS 2013-03-01


Page 19: Friend NIEHS 2013-03-01

TENURE FEUDAL STATES

Page 20: Friend NIEHS 2013-03-01

What could be done by us?

Page 21: Friend NIEHS 2013-03-01

BUILDING PRECISION MEDICINE

Extensions of Current Institutions

Proprietary Short term Solutions

Open Systems of Sharing in a Commons

Page 22: Friend NIEHS 2013-03-01
Page 23: Friend NIEHS 2013-03-01
Page 24: Friend NIEHS 2013-03-01
Page 25: Friend NIEHS 2013-03-01
Page 26: Friend NIEHS 2013-03-01

Massive amounts of human “omics” and compound data

Page 27: Friend NIEHS 2013-03-01

Network Modeling Approaches for Diseases are emerging

Page 28: Friend NIEHS 2013-03-01

IT Infrastructure and Cloud compute capacity allow a generative open approach to solving problems

Page 29: Friend NIEHS 2013-03-01

Nascent Movement for patients to Control Sensitive information allowing sharing

Page 30: Friend NIEHS 2013-03-01

Open Social Media allows citizens and experts to use gaming to solve problems

Page 31: Friend NIEHS 2013-03-01

1- It is now possible to generate massive amounts of human “omics” data

2- Network Modeling Approaches for Diseases are emerging

3- IT Infrastructure and Cloud compute capacity allow a generative open approach to biomedical problem solving

4- Nascent Movement for patients to Control Sensitive information, allowing sharing

5- Open Social Media allows citizens and experts to use gaming to solve problems

A HUGE OPPORTUNITY -- A HUGE RESPONSIBILITY

Page 32: Friend NIEHS 2013-03-01

We focus on a world where biomedical research is about to fundamentally change. We think it will often be conducted in an open, collaborative way where teams of teams, far beyond the current guilds of experts, will contribute to making better, faster, relevant discoveries.

Page 33: Friend NIEHS 2013-03-01

Better Models of Disease:

KNOWLEDGE NETWORK

Technology Platform

Rewards/Challenges

Impactful Models

Governance

Page 34: Friend NIEHS 2013-03-01

1) Identifying key disease systems and genes- Alzheimer’s Gaiteri et al.

Example “modules” of coexpressed genes, color-coded

1.) Identify groups of genes that move together – coexpressed “modules”

- correlated expression of multiple genes across many patients

- coexpression calculated separately for Disease/healthy groups

- these gene groups are often coherent cellular subsystems, enriched in one or more GO functions
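The module-finding step above can be illustrated with a minimal sketch. It assumes modules are simply the connected components of a thresholded Pearson correlation graph, which is a simplification and not necessarily the method used for the Alzheimer's analysis. The gene names, the toy values, and the 0.8 threshold are all hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def coexpression_modules(expr, threshold=0.8):
    """Group genes whose |correlation| across samples meets the threshold.

    expr maps gene name -> list of expression values (one per patient).
    Modules are the connected components of the thresholded graph.
    """
    parent = {g: g for g in expr}

    def find(g):
        # Union-find lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b in combinations(expr, 2):
        if abs(pearson(expr[a], expr[b])) >= threshold:
            parent[find(a)] = find(b)

    modules = {}
    for g in expr:
        modules.setdefault(find(g), []).append(g)
    return sorted(sorted(m) for m in modules.values())

# Hypothetical toy data: 4 genes measured in 5 patients.
toy = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],
    "geneD": [4.9, 1.2, 4.1, 1.8, 3.1],
}
modules = coexpression_modules(toy)
```

To mirror the slide, the function would be called once on the disease group and once on the healthy group, and the resulting modules compared.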

Page 35: Friend NIEHS 2013-03-01

1.) Identify groups of genes that move together – coexpressed “modules”

2.) Prioritize the disease-relevance of the modules by clinical and network measures

Prioritize modules through expression synchrony with clinical measures or tendency to reconfigure themselves in disease

vs

1) Identifying key disease systems and genes- Alzheimer’s

Page 36: Friend NIEHS 2013-03-01

Infer directed/causal relationships and clear hierarchical structure by incorporating eSNP information

(no hair-balls here) vs

Prioritize modules through expression synchrony with clinical measures or tendency to reconfigure themselves in disease

1) Identifying key disease systems and genes- Alzheimer’s

1.) Identify groups of genes that move together – coexpressed “modules”

2.) Prioritize the disease-relevance of the modules by clinical and network measures

3.) Incorporate genetic information to find directed relationships between genes

Page 37: Friend NIEHS 2013-03-01

1) Identifying key disease systems and genes- Alzheimer’s

Example network finding: microglia activation

Module selection – what identifies these modules as relevant to Alzheimer’s disease?

The eigengene of a module of ~400 probes correlates with Braak score, age, cognitive disease severity and cortical atrophy. Members of this module are on average differentially expressed (both up- and down-regulated).

Evidence these modules are related to microglia function

The members of this module are enriched with GO categories (p<.001) such as “response to biotic stimulus” that are indicative of immunologic function for this module. The microglia markers CD68 and CD11b/ITGAM are contained in the module (this is rare – even when a module appears to represent a specific cell-type, the histological markers may be lacking). Numerous key drivers (SYK, TREM2, DAP12, FC1R, TLR2) are important elements of microglia signaling.

Alzgene hits found in co-regulated microglia module:

Page 38: Friend NIEHS 2013-03-01

Figure key:

Five main immunologic families found in Alzheimer’s-associated module.

Square nodes in surrounding network denote literature-supported nodes. Node size is proportional to connectivity in the full module.

(Interior circle) Width of connections between the 5 immune families is linearly scaled to the number of inter-family connections.

Labeled nodes are either highly connected in the original network, implicated by at least 2 papers as associated with Alzheimer’s disease, or core members of one of the 5 immune families.

Core family members are shaded.

1) Identifying key disease systems and genes- Alzheimer’s

Page 39: Friend NIEHS 2013-03-01

Transforming networks into biological hypotheses

1) Identifying key disease systems and genes- Alzheimer’s

Page 40: Friend NIEHS 2013-03-01

Design-stage AD projects at Sage

Fusing our expertise in…

Join us in uniting genes, circuits and regions to build multi-scale biophysical disease models. Contact [email protected]

Diffusion Spectrum Imaging

Microcircuits & neuronal diversity

Gene regulatory networks

Feedback

1) Identifying key disease systems and genes- Alzheimer’s

Page 41: Friend NIEHS 2013-03-01

N=587 P<0.0001

N=944, P<0.0001

2) Identifying genetic biomarkers of statin response from cellular expression changes in treated LCLs

Differential eQTL analysis

Identifying local “cis” acting genetic effects

Differential network analysis

Identifying “trans” acting genetic effects.

Genotypes

2M simvastatin

Control

Clinical simvastatin trial Cellular Simvastatin exposure

N=480

Lara Mangravite

Page 42: Friend NIEHS 2013-03-01

[Figure: three panels, each split by genotype (AA, AG, GG)]

Differential eQTL analysis identifies loci for which genetic association with gene expression is altered by statin treatment

Control Simvastatin Difference Control vs. Simvastatin

log10BF=0.52 log10BF=7.1* log10BF=5.7*

Diff-eQTL locus is associated with reduced incidence of statin-induced myopathy

Lara Mangravite
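The idea behind a differential eQTL can be sketched as a genotype-on-expression slope that changes with treatment. The example below is a simplified illustration under stated assumptions: it fits ordinary least-squares slopes and compares them, whereas the slide reports Bayes factors (log10BF), which this sketch does not compute. The dosage coding and all data values are hypothetical.

```python
def ols_slope(dosage, expression):
    """Least-squares slope of expression on genotype dosage."""
    n = len(dosage)
    mx = sum(dosage) / n
    my = sum(expression) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(dosage, expression))
    sxx = sum((x - mx) ** 2 for x in dosage)
    return sxy / sxx

# Hypothetical data: dosage counts copies of the G allele (AA=0, AG=1, GG=2).
dosage = [0, 0, 1, 1, 2, 2]
control = [5.0, 5.2, 5.1, 4.9, 5.0, 5.2]  # no genotype effect
treated = [5.0, 5.2, 6.0, 6.2, 7.0, 7.2]  # expression rises with each G allele

slope_control = ols_slope(dosage, control)
slope_treated = ols_slope(dosage, treated)
# A differential eQTL shows up as a treatment-dependent change in slope.
slope_difference = slope_treated - slope_control
```

In this toy case the locus has no effect in control cells and a clear per-allele effect after treatment, which is the pattern the three panels on the slide describe.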

Page 43: Friend NIEHS 2013-03-01

Differential network analysis:

Partial correlation, FDR=5% and PP>0.90

By integrating statin-mediated changes in gene correlation with eQTLs, we identify genes predicted to alter cholesterol homeostasis and lipoprotein metabolism.

Knockdown of candidate gene in hepatocytes confirms alterations in lipoprotein metabolism

78.1±8.0% gene knockdown, Huh7 cells

Lara Mangravite

(including one involved in creatine biosynthesis)
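For readers unfamiliar with the partial correlation named in the differential network analysis above, the first-order case is shown here as a reminder. This is the standard textbook definition for two genes x and y conditioned on a single third gene z; the slide does not say which order or estimator was used, so treat this as an illustrative assumption.

```latex
\rho_{xy \cdot z} \;=\;
\frac{\rho_{xy} - \rho_{xz}\,\rho_{yz}}
     {\sqrt{\left(1 - \rho_{xz}^{2}\right)\left(1 - \rho_{yz}^{2}\right)}}
```

A pair of genes that are correlated only because both track z will have a partial correlation near zero, which is why the measure is used to prune indirect edges from a network.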

Page 44: Friend NIEHS 2013-03-01

3) Classification of transporter-mediated hepatotoxicity

Bile Salt Exporter BSEP (Amgen)

AUC=0.98 5-fold cross-validation

1. Characterization of differential expression following compound exposures in rat liver

2. Classification of response to compounds by BSEP Inhibitor Status (rat IC50)

3. Development of classifier for predicting BSEP inhibition of unknown compounds

4. Validation

Mangravite, Jang, Mecham, Derry
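The two evaluation terms on this slide, AUC and 5-fold cross-validation, can be made concrete with a small sketch. It computes AUC by pairwise comparison of positive and negative scores and builds interleaved fold indices. The labels, scores, and fold layout are hypothetical and are not the BSEP data or the classifier used by the authors.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))

def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k interleaved (train, test) index pairs."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    return [
        ([j for f in folds[:i] + folds[i + 1:] for j in f], folds[i])
        for i in range(k)
    ]

# Hypothetical labels (1 = BSEP inhibitor) and classifier scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
example_auc = auc(labels, scores)

# Five train/test splits over ten hypothetical compounds.
splits = k_fold_indices(10, k=5)
```

In a cross-validated evaluation, the classifier is trained on each train split, scored on the matching test split, and the AUC is computed from the held-out scores.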

Page 45: Friend NIEHS 2013-03-01

How It All Fits Together

DREAM Challenges

Synapse

Data Generation

BRIDGE Data Activation

FEDERATION

On-Line Open Generative Communities

Portable Legal Consent

2009-2010

Access to Data Sets

Page 46: Friend NIEHS 2013-03-01

How It All Fits Together

DREAM Challenges

Synapse

Data Generation

BRIDGE Data Activation

FEDERATION

On-Line Open Generative Communities

Portable Legal Consent

2010-2011

Page 47: Friend NIEHS 2013-03-01

two approaches to building common scientific knowledge

Text summary of the completed project

Assembled after the fact

Every code change versioned

Every issue tracked

Every project the starting point for new work

All evolving and accessible in real time

Social Coding

TECHNOLOGY PLATFORM

Page 48: Friend NIEHS 2013-03-01

Synapse is GitHub for Biomedical Data

• Data and code versioned

• Analysis history captured in real time

• Work anywhere, and share the results with anyone

• Social/Interactive Science

• Every code change versioned

• Every issue tracked

• Every project the starting point for new work

• Social/Interactive Coding

Page 49: Friend NIEHS 2013-03-01

Data Analysis with Synapse

Run Any Tool

On Any Platform

Record in Synapse

Share with Anyone

Page 50: Friend NIEHS 2013-03-01

“Synapse is a nascent compute platform for transparent, reproducible, and modular collaborative research.”

Page 51: Friend NIEHS 2013-03-01

Currently at 16K+ datasets and ~1M models

Page 52: Friend NIEHS 2013-03-01

Download analysis and meta-analysis

Download another Cluster Result Download Evaluation and view more stats

• Perform Model averaging

• Compare/contrast models

• Find consensus clusters

• Visualize in Cytoscape
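One simple way to "find consensus clusters" across several downloaded cluster results is a co-association count: for each pair of samples, the fraction of models that place them in the same cluster. The sketch below assumes that approach; it is not a description of what Synapse itself computes, and the sample names and cluster labels are hypothetical.

```python
from itertools import combinations

def co_association(clusterings):
    """Fraction of clusterings in which each pair of samples shares a cluster.

    clusterings is a list of dicts mapping sample name -> cluster label.
    Labels only need to be comparable within a single clustering.
    """
    samples = sorted(clusterings[0])
    return {
        (a, b): sum(c[a] == c[b] for c in clusterings) / len(clusterings)
        for a, b in combinations(samples, 2)
    }

# Hypothetical cluster results from three models over the same four samples.
results = [
    {"s1": 0, "s2": 0, "s3": 1, "s4": 1},
    {"s1": "x", "s2": "x", "s3": "y", "s4": "x"},
    {"s1": 1, "s2": 1, "s3": 0, "s4": 0},
]
agreement = co_association(results)
# Pairs that co-cluster in a majority of models form the consensus.
consensus_pairs = sorted(p for p, frac in agreement.items() if frac > 0.5)
```

Because only within-model label equality is compared, models that name their clusters differently can still be combined.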

Page 53: Friend NIEHS 2013-03-01

Pancancer collaborative subtype discovery

Page 54: Friend NIEHS 2013-03-01

Objective assessment of factors influencing model performance (>1 million predictions evaluated)

Sanger CCLE Prediction accuracy

improved by…

Not discretizing data

Including expression data

Elastic net regression

130 compounds 24 compounds

Cross validation prediction accuracy (R2)

In Sock Jang
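As background for the "elastic net regression" and "R2" terms on this slide, the usual definitions are given below. The elastic net objective is written in the common parameterization with a mixing weight α and penalty strength λ; the slide does not state which parameterization, hyperparameters, or exact R² definition were used, so these are assumptions for illustration.

```latex
\hat{\beta} \;=\; \arg\min_{\beta}\;
\frac{1}{2n}\,\lVert y - X\beta \rVert_2^{2}
\;+\; \lambda \left( \alpha \lVert \beta \rVert_1
\;+\; \frac{1-\alpha}{2}\,\lVert \beta \rVert_2^{2} \right),
\qquad 0 \le \alpha \le 1

R^{2} \;=\; 1 \;-\;
\frac{\sum_{i} \left( y_i - \hat{y}_i \right)^{2}}
     {\sum_{i} \left( y_i - \bar{y} \right)^{2}}
```

With α = 1 the penalty reduces to the lasso and with α = 0 to ridge regression; in a cross-validated setting the predictions in R² come from held-out folds.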

Page 55: Friend NIEHS 2013-03-01

How It All Fits Together

DREAM Challenges

Synapse

Data Generation

BRIDGE Data Activation

FEDERATION

On-Line Open Generative Communities

Portable Legal Consent

2011-2012

Page 56: Friend NIEHS 2013-03-01

(Nolan and Haussler)

THE FEDERATION

Schadt Ideker Friend Califano Nolan Vidal

Page 57: Friend NIEHS 2013-03-01

How It All Fits Together

DREAM Challenges

Synapse

Data Generation

BRIDGE Data Activation

FEDERATION

On-Line Open Generative Communities

Portable Legal Consent

2012-2013

Page 58: Friend NIEHS 2013-03-01

Sage-DREAM Breast Cancer Prognosis Challenge #1 Building better disease models together

154 participants; 27 countries

334 participants; >35 countries

>500 models posted to Leaderboard

breast cancer data

Challenge Launch: July 17

Sep 26 Status

Caldos/Aparicio

Page 59: Friend NIEHS 2013-03-01

How It All Fits Together

DREAM Challenges

Synapse

Data Generation

BRIDGE Data Activation

FEDERATION

On-Line Open Generative Communities

Portable Legal Consent

2012-2013

Page 60: Friend NIEHS 2013-03-01

GOVERNANCE: PORTABLE LEGAL CONSENT Control of Private information by Citizens allows sharing

weconsent.us

John Wilbanks

• Online educational wizard

• Tutorial video

• Legal Informed Consent Document

• Profile registration

• Data upload

John Wilbanks TED Talk “Let’s pool our medical data” weconsent.us

Page 61: Friend NIEHS 2013-03-01

How It All Fits Together

DREAM Challenges

Synapse

Data Generation

BRIDGE

Data Activation

FEDERATION

On-Line Open Generative Communities

Portable Legal Consent

2012-2013

Page 62: Friend NIEHS 2013-03-01

BRIDGE

BRIDGE

Page 63: Friend NIEHS 2013-03-01
Page 64: Friend NIEHS 2013-03-01

How It All Fits Together

DREAM Challenges

Synapse

Data Generation

BRIDGE

Data Activation

FEDERATION

On-Line Open Generative Communities

Portable Legal Consent

2013-2014

IMPACT

Page 65: Friend NIEHS 2013-03-01

A ‘clearScience’ way of modeling PI3K pathway activation in breast cancer

web-accessible DATA

web-accessible SOURCE CODE

web-accessible PROVENANCE

web-accessible MODEL

Sage Bionetworks metaGenomics/pan-cancer project collaboration with David Haussler @ UCSC for “analysis-ready” TCGA data

TCGA breast RNAseq data

TCGA breast exome seq data

R code for a pathway heuristic

random forest model of PI3K activation

executable PI3K model binary

World Wide Web Consortium (W3C) specification PROVENANCE for all the interconnections above

all of these elements can be housed in a virtual machine

Page 66: Friend NIEHS 2013-03-01

THE DREAM PROJECT JOINS

SAGE BIONETWORKS TO ENABLE

COLLABORATIVE SCIENCE


Page 67: Friend NIEHS 2013-03-01

How to incent the joint evolution of ideas in a rapid learning space- prepublication?

How to fund where data generators and analysts are not always the same people- repeatedly?

Should we consider Centralized Guilds and Distributed Dynamic Teams to perform gene-environment model building?

Page 68: Friend NIEHS 2013-03-01

If not

SYNAPSE FEDERATION PORTABLE LEGAL CONSENT CHALLENGES BRIDGE CITIZEN ENGAGEMENT