geWorkbench John Watkinson Columbia University

Post on 17-Jan-2016

geWorkbench

John Watkinson

Columbia University

geWorkbench

The bioinformatics platform of the National Center for the Multi-scale Analysis of Genomic and Cellular Networks (MAGNet).

Also, part of the NCI’s cancer Biomedical Informatics Grid (caBIG) initiative. The project was formerly called caWorkbench.

geWorkbench (cont.)

A desktop application for integrative genomics.

Runs on Windows, Linux and Macintosh.

Includes a variety of informatics tools, but specializes in microarray analysis.

Open-source and free for non-commercial use.

Includes an API for plugin development.

geWorkbench (cont.)

Integrative Genomics

Increasingly, researchers need to combine several data sources (microarray assays, DNA/RNA/protein sequences, protein structure, gene ontology, clinical data, etc.)

geWorkbench attempts to move past simple microarray analysis to include integrative methods.

Plugin framework allows geWorkbench to interact with other major software packages, including BioConductor, GenePattern and Cytoscape.

Data Support

Microarray assays (one-color and two-color, as well as caArray assays).

Sequence files.

BLAST queries.

Gene-gene interaction networks (interactomes).

Gene Ontology terms.

caBIO pathways and annotations.

Protein structure files (PDB).

Components

geWorkbench has a plugin interface for the development of third-party components.

Documentation and developer support are available from the geWorkbench team.

All visualizations and analyses have been written using the API.

Several groups at Columbia are developing for the platform.

Microarray Analysis

Summarization of raw chip data (via BioConductor).

Normalization and filtering.

Differential expression analysis.

Clustering (hierarchical and self-organizing maps).

Classification (SVM and SMLR).

Many visualization tools.
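
As an illustration of the differential expression step, the sketch below ranks genes by Welch's t statistic between two groups of arrays. It is a minimal example using only the Python standard library; the function names, the gene names and the toy expression values are assumptions made for illustration and are not part of geWorkbench or its API.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances.

    Assumes each sample has at least two values and that the two samples
    are not both constant (otherwise the denominator would be zero).
    """
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    return (mean(a) - mean(b)) / sqrt(va + vb)


def rank_genes(expr, group_a, group_b):
    """Return gene names sorted by decreasing |t| between the two groups.

    expr maps a gene name to its expression values, one per array;
    group_a and group_b are lists of array indices for each phenotype.
    """
    scores = {}
    for gene, values in expr.items():
        a = [values[i] for i in group_a]
        b = [values[i] for i in group_b]
        scores[gene] = welch_t(a, b)
    return sorted(scores, key=lambda g: abs(scores[g]), reverse=True)


# Hypothetical toy data: six arrays, three per phenotype.
expr = {
    "geneA": [1.0, 1.2, 0.9, 3.1, 3.0, 2.8],
    "geneB": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],
}
ranked = rank_genes(expr, group_a=[0, 1, 2], group_b=[3, 4, 5])
```

In practice the statistic would be turned into a p-value and corrected for multiple testing; that step is left out here to keep the sketch short.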

Hierarchical Clustering

Scatter Plot Visualization

caBIO Pathway Viewer

Sequence Analysis

BLAST and HMM search interface.

Pattern discovery.

Synteny analysis.

Promoter region analysis.

A variety of sequence viewers.

Pattern Discovery Viewer

Promoter Viewer

GO Term Enrichment

Traditional t-tests on microarray data determine differentially expressed genes between two different phenotypes.

Gene Ontology (GO) term enrichment can determine which functional or structural categories are significantly over-represented among the differentially expressed genes.

Supported in geWorkbench’s GO Panel component.

A similar technique can be applied to other gene sets, such as KEGG pathways.
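
One common way to score enrichment of a gene set is a one-sided hypergeometric test on the overlap between the set and the list of differentially expressed genes. The sketch below shows that calculation. It is an assumption about how such a test can be done, not a description of what the GO Panel component computes, and the function and parameter names are hypothetical.

```python
from math import comb


def enrichment_p(total_genes, annotated, selected, overlap):
    """One-sided hypergeometric p-value, P(X >= overlap).

    total_genes: number of genes on the array
    annotated:   number of those genes annotated with the term (or in the set)
    selected:    number of differentially expressed genes
    overlap:     number of selected genes that carry the annotation
    """
    upper = min(annotated, selected)
    ways = sum(
        comb(annotated, i) * comb(total_genes - annotated, selected - i)
        for i in range(overlap, upper + 1)
    )
    return ways / comb(total_genes, selected)


# Hypothetical toy numbers: 10 genes, 3 annotated, 3 selected, all 3 overlap.
p = enrichment_p(total_genes=10, annotated=3, selected=3, overlap=3)
```

The same function applies unchanged to other gene sets, such as pathway memberships, since it only depends on the four counts.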

GO Terms (cont.)

Reverse Engineering

Microarray data can be used to infer biological pathways.

geWorkbench’s Reverse Engineering component uses the ARACNE algorithm to build gene-gene interaction networks.

These can be compared and combined with an online database of interactions, curated by Columbia.
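
The core idea behind ARACNE can be sketched in two steps: estimate mutual information (MI) between every pair of genes, then apply the data processing inequality (DPI) to drop the weakest edge of every fully connected gene triplet, since that edge is the most likely to be indirect. The sketch below is a simplified stand-in written for illustration. It uses equal-frequency binning to estimate MI instead of the estimator used by the published algorithm, and every name, parameter and data value in it is an assumption rather than part of geWorkbench.

```python
from collections import Counter
from itertools import combinations
from math import log


def discretize(values, bins=3):
    """Equal-frequency binning: replace each value by the bin of its rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * bins // len(values)
    return labels


def mutual_information(x, y):
    """Plug-in MI estimate (in nats) for two discrete label sequences."""
    n = len(x)
    pxy, px, py = Counter(zip(x, y)), Counter(x), Counter(y)
    return sum(c / n * log(c * n / (px[a] * py[b])) for (a, b), c in pxy.items())


def aracne_sketch(expr, threshold=0.0, bins=3):
    """Return the set of edges (frozensets of two gene names) kept after DPI."""
    genes = sorted(expr)
    d = {g: discretize(expr[g], bins) for g in genes}
    mi = {
        frozenset(pair): mutual_information(d[pair[0]], d[pair[1]])
        for pair in combinations(genes, 2)
    }
    edges = {e for e, v in mi.items() if v > threshold}
    removed = set()
    for a, b, c in combinations(genes, 3):
        triangle = [frozenset((a, b)), frozenset((b, c)), frozenset((a, c))]
        if all(e in edges for e in triangle):
            # DPI: the weakest edge of a triangle is treated as indirect.
            removed.add(min(triangle, key=lambda e: mi[e]))
    return edges - removed


# Hypothetical toy data: g2 is a perturbed copy of g1, g3 a perturbed copy of g2.
expr = {
    "g1": [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "g2": [1, 2, 4, 3, 5, 6, 7, 8, 9],
    "g3": [1, 2, 4, 3, 5, 7, 6, 8, 9],
}
network = aracne_sketch(expr)
```

On this toy chain the g1 to g3 edge has the lowest MI of the triangle, so it is the one removed, leaving g1 to g2 and g2 to g3.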

Reverse Engineering (cont.)

Reverse Engineering (cont.)

Matrix REDUCE

Given microarray data and upstream sequences for genes, transcription factor binding sites can be inferred.

The Matrix REDUCE component in geWorkbench provides this analysis and tools to visualize the results.
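
To make the idea of inferring binding sites from expression plus upstream sequence concrete, the sketch below scores each short sequence word by how well its count in the upstream regions correlates with expression across genes. This is a much simpler substitute chosen for illustration, not the Matrix REDUCE algorithm itself, which fits a position-specific matrix rather than counting exact words. All names, sequences and expression values are hypothetical.

```python
from math import sqrt


def count_overlapping(seq, motif):
    """Count occurrences of motif in seq, allowing overlaps."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def rank_motifs(upstream, expression, k=4):
    """Rank every k-mer seen upstream by |correlation| of its count with expression."""
    genes = sorted(upstream)
    kmers = {
        upstream[g][i:i + k]
        for g in genes
        for i in range(len(upstream[g]) - k + 1)
    }
    y = [expression[g] for g in genes]
    scores = {
        m: pearson([count_overlapping(upstream[g], m) for g in genes], y)
        for m in kmers
    }
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical toy data: expression rises with the number of ACGT sites.
upstream = {
    "g1": "TTACGTTTACGT",
    "g2": "TTACGTTTTTTT",
    "g3": "TTTTTTTTTTTT",
    "g4": "GGGGTTTTTTTT",
}
expression = {"g1": 2.0, "g2": 1.0, "g3": 0.0, "g4": 0.1}
ranked = rank_motifs(upstream, expression)
```

Words that overlap the planted site also score well in this toy example, which is one reason matrix-based models are preferred over exact word counts for real data.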

For More Information

http://www.geworkbench.org

Mailing List: [email protected]

John Watkinson: [email protected]

Acknowledgements

ARACNE algorithm by Califano et al.

Matrix REDUCE algorithm by Bussemaker et al.

geWorkbench team: Aris Floratos, Eileen Daly, Kenneth Smith, Kiran Keshav, Xiaoqing Zhang, Manjunath Kustagi, Matthew Hall, Bernd Jagla, Mary VanGinhoven, John Watkinson.