m reich - genomespace

31
Michael Reich Broad Institute of Harvard and MIT July 14, 2012

Upload: jan-aerts

Post on 10-May-2015

627 views

Category:

Health & Medicine


2 download

DESCRIPTION

Presentation at BOSC2012 by M Reich - GenomeSpace

TRANSCRIPT

Page 1: M Reich - GenomeSpace

Michael ReichBroad Institute of Harvard and MIT

July 14, 2012

Page 2: M Reich - GenomeSpace

Gene mutation causing Leigh Syndrome French Canadian Type (LFSC) and 8 other mitochondrial diseases

Integrate: candidate genomic region, mitochondrial proteomic data, and cancer expression compendium

Authors: Mootha et al. 2003, Calvo et al. 2006Subtle repression of oxphos genes in diabetic muscle: role for mitochondrial dysregulation in diabetes pathology; new computational approach Gene Set Enrichment Analysis

Integrate: Gene sets/pathways & processes with expression profiles

Authors: Mootha et al. 2003, Subramanian & Tamayo et al. 2005

IKBKE as a new breast cancer oncogene

Integrate: RNAi screens, transformation of activated kinases, and copy number from SNP arrays of cell linesAuthors: Boehm et al. 2007

~3000 novel, large non-coding RNAs with functions in development, the immune response and cancer

Integrate: Genome sequences from 21 mammals, epigenomic maps, and expression profiles Authors: Guttman, Rinn et al. 2009

Discovery of 3 new genes involved in Glioblastoma Multiforme (NF1, ERRB2, PIK3R1); Confirmation of TP53, PTEN, EGFR, RB1, PIK3CA

Integrate: DNA sequence, copy number, methylation aberrations and expression profiles in 206 glioblastomasAuthors: TCGA Research Network 2008

Characterization of disease subtypes and improved risk stratification for medulloblastoma patients

Integrate: Copy number, expression, clinical data for 96 medulloblastoma patientsAuthors: Tamayo et al., Cho et al. 2011

Need: Insights through integrative studies

Page 3: M Reich - GenomeSpace

Translational Research ExampleGenePattern Cytoscape IGV/UCSC Genomica

Network

Compendium

Expression

Alterations

atcgcgtttattcgataaggatcgcgttttttcgataagg

CMAP

Add Transcription Factor track from UCSC

6

Looks close to p53 site

7 Test for similarity of

p53 and gene location

8

Extractmodule

ii

Learn p53 site/score on

promoteriv

Load compendium Show module

mapi

Show Chromosome

5

Expand +1 (include

neighbors)

4

Show network

3

Differentially Expressed

Genes

1

Idea

GSEA test enrichment

2

iii

Arrests G2/M

Conclusion

vi

Pathwayactivation

Added to GenePattern

v

ODF GMT

HTML

SIF

GCT GXP

NA gene list

GXC GRP

GFF GXA

Page 4: M Reich - GenomeSpace

GenePattern

Cytoscape

IGV

Genomica

CMAP

UCSC Browser

Analysis step

Analysis conclusion

Within tool

Across tools

12 steps, 6 tools, 7 transitions

6 8

2 3 ii iii

4 5 iv v

1 i

Page 5: M Reich - GenomeSpace

• A lightweight “connection layer” for a wide variety of integrative genomic analyseso Support for all types of resource: Web-based,

desktop, etc.o Automatic conversion of data formats between toolso Easy access to data from any locationo Any tool that joins is automatically connected to the

community of toolso Ease of entry into the environment

The Need

Page 6: M Reich - GenomeSpace

Cloud-based storage

API connectivity layer

Page 7: M Reich - GenomeSpace

3 Driving BiologicalProjects

lincRNAs

Cancer stem cells

Patient Stratification

6 Seed Tools

CytoscapeGalaxy

GenePatternGenomica

IGVUCSC Browser

New tools New Biological Projects

Online community to share diverse computational tools

www.genomespace.org

Page 8: M Reich - GenomeSpace

• Aimed towards non-programming users

• Support interoperability through automatic cross-tool data transfer

• Requires minimal changes to tools

www.genomespace.org

GenomeSpace Principles

Page 9: M Reich - GenomeSpace

GS Enabled Tools

Integrative Genomics ViewerCytoscapeGalaxyGenePattern

GenomeSpace Components

Authentication and Authorization

Genome Space Server

Data ManagerAnalysis and Tool Manager

GenomeSpace Project Data

1

2 3

geWorkbench

External Data Sources & Tools

Page 10: M Reich - GenomeSpace

Seed Tools

Cytoscape Galaxy GenePattern

Genomica IGV UCSC Genome Browser

Page 11: M Reich - GenomeSpace

New Tools

InSilicoDB(University of Brussels)

Cistrome(Dana-Farber Cancer Institute)

ArrayExpress(EBI)

geWorkbench(Columbia University)

Reactome(Ontario Institute of Cancer Research)

Recently added

In development

Page 12: M Reich - GenomeSpace

Using GenomeSpace

Cloud-based

filesystem

Tools and Data

SourcesActions

Page 13: M Reich - GenomeSpace

GenomeSpace Actions

Page 14: M Reich - GenomeSpace

GenomeSpace Tool Enablement: IGV

Page 15: M Reich - GenomeSpace

GenomeSpace Tool Enablement: GenePattern

Page 16: M Reich - GenomeSpace

GenomeSpace Tool Enablement: GenePattern

Page 17: M Reich - GenomeSpace

GenomeSpace Data Source Integration: InSilico DB

Page 18: M Reich - GenomeSpace

Other collaborating projects

• Taverna/MyExperiment (University of Manchester)

• National Center for Biomedical Ontology (Stanford University)

Page 19: M Reich - GenomeSpace

DBP3: Studying the regulatory control of human hematopoiesis

Page 20: M Reich - GenomeSpace

DBP3: Studying the regulatory control of human hematopoiesis – Overview

42

2

1 1 1 1 1

2

1

Part 1: Data pre-processingand quality control

Part 3: Studying thetranscriptional program

Part 2: Basic analysis

2

To part 2

From part 11

1

3

2

Genomica

GenePattern

IGV

Cytoscape

4Analysis step (# steps)

Analysis conclusion

Currently integrated

Not yet integrated

Analysis section

Manual step

Galaxy

Optional choices

12

To part 3

2

2

2

3

1

4

1

1

1

2

3 2

1

1

3

Part 4: cis-regulatorysite analysis

2

1

To part 4

From part 2

Frompart 2

1

2

From part 3

To part 5

Part 5: Finding newtranscription factor

regulators

1

2

From part 4

2

2

23

1

1 3+

Page 21: M Reich - GenomeSpace

Part 5: Finding newtranscription factor

regulators

2

From part 4

2

2

3Step 1

Step 2

Step 3

Step 4

Page 22: M Reich - GenomeSpace

Step 1: Create transcription factor dataset in Genomica and save to GenomeSpace

Page 23: M Reich - GenomeSpace

Step 2: Send transcription factor datasets into GenePattern

Page 24: M Reich - GenomeSpace

Step 2: Perform differential expression analysis in GenePattern

Page 25: M Reich - GenomeSpace

Step 3: Send differentially expressed genes to Genomica

Page 26: M Reich - GenomeSpace

Step 3: perform module network analysis in Genomica

Page 27: M Reich - GenomeSpace

Step 4: Visualize regulators with known SNPs and linkage regions

Page 28: M Reich - GenomeSpace

Step 4: Visualize regulators with known SNPs and linkage regions

Page 29: M Reich - GenomeSpace

Clustered

30

Deployment ArchitectureAmazon

ExternalData Sources(e.g., Arrary Express)

Data Manager (DM)

Analysis Task Manager (ATM)

SimpleDB

S3 File transfers

GS ClientsClustered

IdentityService (OpenID)

RE

ST

RE

ST

Pro

ven

an

ce

Deployment Architecture and APIs

GS UI

IGV

RE

ST

RE

ST

Genomica

CD

K

Gene-Pattern

CD

K

Galaxy

Cytoscape

RE

ST

CD

K

UCSC

RE

ST

Page 30: M Reich - GenomeSpace

Join the GenomeSpace community

• Researchers with biological projects• Developers

– Add your tools– Contribute format converters– Build new infrastructure

• Data portals and repositories– Link your resources

Page 31: M Reich - GenomeSpace

Acknowledgements

Broad InstituteTed LiefeldHelga ThorvaldsdottirJim RobinsonMarco OcanaEliot PolkJill Mesirov, PI

GenomeSpace CollaboratorsCytoscape: Trey Ideker Lab, UCSDGalaxy: Anton Nekrutenko Lab, Penn State University Genomica: Eran Segal Lab, Weizmann InstituteUCSC Browser TeamGenePattern TeamIGV Team

Driving Biological ProjectsHoward Chang Lab – Stanford UniversityAviv Regev Lab – Broad Institute Funding

[email protected]