introduction to the plant metabolic network: 18 databases and omics-level tools for analysis and...

32
Introduction to the Plant Metabolic Network: 18 Databases and Omics-Level Tools for Analysis and Discovery kate dreher The Carnegie Institution for Science Stanford, CA (CIMMYT, Mexico)

Upload: maud-gordon

Post on 18-Dec-2015

220 views

Category:

Documents


0 download

TRANSCRIPT

 Introduction to the Plant Metabolic Network: 18 Databases and

Omics-Level Tools for Analysis and Discovery

kate dreher

The Carnegie Institution for Science

Stanford, CA

(CIMMYT, Mexico)

Free access to high quality, curated data promotes beneficial research on plant metabolism

Plants provide crucial benefits to humanity and the ecosystem

A better understanding of plant metabolism may contribute to:

More nutritious foods More pest-resistant plants More stress-tolerant crops Higher photosynthetic capacity

and higher yield in

agricultural and biofuel crops New pharmaceutical sources . . . many more applications

These efforts benefit from access to high quality plant metabolism data

Plant Metabolic Network goals

Transform published results into data-rich metabolic pathways

Create and deploy improved methods for predicting enzyme function and metabolic capacity using plant genome sequences

Facilitate data analysis

Support research, breeding, and education

Provide public resources : PlantCyc AraCyc 16 additional species-specific databases

www.plantcyc.org

Plant Metabolic Network collaborators

SRI International – BioCyc project

Provide Pathway Tools Software

Maintain and update MetaCyc

Other collaborators / contributors include: MaizeGDB GoFORSYS TAIR SoyBase Sol Genomics Network (SGN) / Boyce Thompson Institute Gramene MedicCyc / Nobel Foundation PlantMetabolomics group . . . and more

SoyBase

Editorial BoardCommunity

Submissions

17 PMN species are phylogenetically and "functionally" diverse

Eudicots

Cereals

Basal land plants

Green alga

• Major crops

• Model species

• Legume

• Woody and herbaceous species

• Annuals and perennials

• C3 and C4 species

AraCyc 11.5 Arabidopsis thaliana

CassavaCyc 3.0 Manihot esculenta

ChineseCabbageCyc 1.0 Brassica rapa (+ spp.)

GrapeCyc 3.0 Vitis vinifera

PapayaCyc 2.0 Carica papaya

PoplarCyc 6.0 Populus trichocarpa (+ spp.)

SoyCyc 4.0 Glycine max

BarleyCyc 1.0 Hordeum vulgare

BrachypodiumCyc 1.0 Brachypodium distachyon

CornCyc 4.0 Zea mays

OryzaCyc 1.0 Oryza sativa (+ spp.)

SetariaCyc 1.0 Setaria italica

SorghumBicolorCyc 1.0 Sorghum bicolor

SwitchgrassCyc 1.0 Panicum virgatum

MossCyc 2.0 Physcomitrella patens

SelaginellaCyc 2.0 Selaginella moellendorffii

ChlamyCyc 3.5 Chlamydomonas reinhardtii

Database Species Pathways* PlantCyc 8.0 400+ 1050AraCyc 11.5 Arabidopsis thaliana 597

CassavaCyc 3.0 Manihot esculenta 491ChineseCabbageCyc 1.0 Brassica rapa (+ spp.) 499

GrapeCyc 3.0 Vitis vinifera 479PapayaCyc 2.0 Carica papaya 481PoplarCyc 6.0 Populus trichocarpa (+ spp.) 505

SoyCyc 4.0 Glycine max 520BarleyCyc 1.0 Hordeum vulgare 465

BrachypodiumCyc 1.0 Brachypodium distachyon 473CornCyc 4.0 Zea mays 508OryzaCyc 1.0 Oryza sativa (+ spp.) 482

SetariaCyc 1.0 Setaria italica 477SorghumBicolorCyc 1.0 Sorghum bicolor 480

SwitchgrassCyc 1.0 Panicum virgatum 479MossCyc 2.0 Physcomitrella patens 416

SelaginellaCyc 2.0 Selaginella moellendorffii 421ChlamyCyc 3.5 Chlamydomonas reinhardtii 349

PlantCyc provides access to important pathways not found in species-specific databases

PlantCyc contains over 1000 pathways and information from over 400 plant species

1050pathways

477pathways(average)

PlantCyc provides access to numerous specialized metabolic pathways

Many PlantCyc-specific metabolic pathways produce or break-down specialized ("secondary") metabolites

Many of the enzymes in these pathways have experimental evidence

Caffeine biosynthesis I

(Caffea arabica)Morphine biosynthesis(Papaver somniferum)

Taxol biosynthesis(cancer drug)

(Taxus brevifolia)

Raspberry ketone biosynthesis

(Rubus idaeus)

Vicianin bioactivation (defense compound)

(Vicia sativa)

Alliin degradation (garlic odor)

(Allium sativum)

Species-specific databases require predictions

Annotated Genomee.g. Populus trichocarpa

PathoLogic

SoftwareReference PathwayDatabase (MetaCyc)

Reactions

Pathways

compounds

Gene products

genes

Pathway/Genome Database (PoplarCyc)

E2P2

The PMN predicts enzyme functions from sequenced proteomes

To improve enzyme functional predictions:

A high-confidence Reference Protein Sequence Database was built RPSD 2.0 contains 34,269 enzymes and 82,216 non-enzymes

The Ensemble Enzyme Prediction Pipeline (E2P2) uses the RPSD to predict functions based on protein sequence

E2P2 can predict reactions that are not fully defined in the EC system by incorporating MetaCyc reaction IDs

Pathway predictions are refined through a SAVI pipeline

Annotated Genomee.g. Populus trichocarpa

PathoLogic

SoftwareReference PathwayDatabase (MetaCyc)

Reactions

Pathways

compounds

Gene products

genes

Pathway/Genome Database (PoplarCyc)

E2P2

SAVI

Validated – Publicly Released

Pathway/Genome Database (PoplarCyc)

SAVI pipeline Decision rules and criteria for each pathway are generated by curators

The Semi-automated validation / incorporation pipeline uses the rules to:

Identify pathways with curated experimental evidence*

Bring in “Ubiquitous Plant Pathways” that were not predicted Calvin cycle, glycolysis, etc.

Remove predicted non-plant / non-PMN pathways Glycogen biosynthesis (non-plant) Proteolysis (not small molecule metabolism)

Check key reactions and expected phylogenetic range to automatically assess many other predicted pathways

Highlight pathways that require manual validation

Expert input welcome!! To submit data, report an error, volunteer to validate, or ask a question

Send an e-mail: [email protected]

Use our feedback form:

Meet with me at the end of the workshop

Schedule an individual meeting with me at PAG

Community gratitude

We thank you publicly!

Together, we can make valuable, high quality databases

The PMN databases provide data and tools for analysis

Information, curated summaries, high quality predictions, and experimentally supported information about: Pathways Enzymes Reactions Compounds

Tools General and specific searches Comparative analysis tools BLAST against enzymes in the PMN or the RPSD

OMICs-level data analysis

Data analysis with the Metabolic Map / Omics Viewer

Display experimental data on a metabolic map

Data types: Genes - transcriptomics Enzymes – proteomics Reactions - fluxomics Compounds – metabolomics

Data inputs: Single or multiple values for each object Absolute or relative values

gene IDs

relative expression levels at different timepoints

Visualizing quantitative data

Visualizing quantitative datainput file

color gradient

data columns

type of data

scale

relative or

absolute

data display options

Easily identify altered pathways

Navigate to items of interest on the metabolic map

Working with Groups

Opportunities

Create custom data sets

Explore experimental results

Perform enrichment analyses

Share data

Requires free registration

Generate custom datasets

Create groups from searches

modify content

paint on map

compare to other groups

export to Excel

share with

others

Plant metabolic NETWORKING Please use our data

Please use our tools

Please come explore our new species databases coming in 2014

Please help us to improve our databases!

Please contact us if we can be of any help!

[email protected]

www.plantcyc.org

PMN AcknowledgementsCurator:

- kate dreher

Post-docs:

- Lee Chae

- Ricardo Nilo Poyanco

- Chuan Wang

Interns- Ashley Joseph

Tech Team Members:- Bob Muller - Garret Huntress

Rhee Lab Members:- Flavia Bossi- Hye-in Nam- Taehyong Kim- Meng Xu- Jim Guo- Jue Fan- Caryn Johansen

Peifen Zhang (Director and curator)

Sue Rhee (PI)

Collaborators:

SRI

- Peter Karp

- Ron Caspi

- Hartmut Foerster

- Suzanne Paley

- SRI Tech Team

MaizeGDB

- Mary Schaeffer

- Lisa Harper

- Jack Gardiner

- Taner Sen

ChlamyCyc- Patrick May

- Dirk Walther

- Lukas Mueller (SGN)

- Rex Nelson (Soybase)

- Gramene and MedicCyc

PMN Alumni:

- A. S. Karthikeyan (curator)

- Christophe Tissier (curator)

- Hartmut Foerster (curator)

- Eva Huala (co-PI)

- Tam Tran (intern)

- Varun Dwaraka (intern)

- Damian Priamurskiy (intern)

- Ricardo Leitao (intern)

- Michael Ahn (intern)

- Purva Karia (intern)

- Anuradha Pujar (SGN curator)

Tech Team Alumni

- Anjo Chi

- Cynthia Lee- Tom Meyer- Larry Ploetz- Shanker Singh- Bill Nelson

- Vanessa Kirkup

- Chris Wilks

- Raymond Chetty

Please use our data

Please use our tools

[email protected]

www.plantcyc.org

Sue Rhee (PI)

Peifen Zhang (Director)

We're here to help . . .