
VectorBase

A Resource Centre for Invertebrate Hosts of Human Pathogens

Bob MacCallum

Imperial College London

Outline

• Introduction to VectorBase

• Two important recent developments:

– Community Annotations

– Gene Expression Data

What is VectorBase?

• Aim

– Genomic bioinformatics resource for invertebrate vectors of human pathogens

– Data hub for community

• Funding

– US NIAID (National Institute of Allergy and Infectious Diseases) via its Bioinformatics Resource Centre (BRC) program

Why VectorBase?

• Sequencing initiatives do not include “after-care”

• Ensembl had no long-term plans for insects

Main VectorBase activities

• www.vectorbase.org

– Browse, search & download genomic data

• Genome annotation

– Automatic & manual

• Functional genomics

• Ontologies

• Training/outreach/consultancy

Invertebrate vectors

| Species | Disease | Status | Funder |
|---|---|---|---|
| Aedes aegypti | Yellow fever, Dengue fever | Complete† | NIAID |
| Anopheles gambiae PEST | Malaria | Complete† | - |
| Anopheles gambiae M & S form | Malaria | Assembled | NHGRI |
| Culex pipiens quinquefasciatus | Lymphatic filariasis | Complete† | NIAID |
| Glossina morsitans morsitans | Sleeping sickness | Initiated | Wellcome Trust |
| Ixodes scapularis | Lyme disease | Draft gene set | NIAID |
| Lutzomyia longipalpis | Leishmania | Planned | NHGRI/Wellcome Trust |
| Pediculus humanus | Typhus | Draft gene set | NHGRI |
| Phlebotomus papatasi | Leishmania | Planned | NHGRI/Wellcome Trust |
| Rhodnius prolixus | Chagas disease | Initiated | NHGRI |

Who is VectorBase?

US, UK, GR

Notre Dame

PIs: Frank Collins, Dave Severson, Greg Madey, Nora Besansky

Tasks: project coordination; core website development; community annotation pipeline; Aedes and Anopheles community reps

EBI (European Bioinformatics Institute)

PI: Ewan Birney

Tasks: “automated” genome annotation; comparative genomics; GenBank submissions; genome browser technology

IMBB, Crete

PI: Kitsos Louis

Tasks: ontologies for anatomy, insecticide resistance, biological processes; population genetics

Harvard

PI: Bill Gelbart

Tasks: manual annotation

Imperial College, London

PIs: George Christophides, Fotis Kafatos

Tasks: functional genomics (gene expression, RNAi phenotypes)

UC Riverside

PI: Peter Atkinson

Tasks: Culex pipiens

Purdue University

PI: Catherine Hill

Tasks: Ixodes scapularis

A quick tour of VectorBase

• BLAST

• Genome browser

• Search engine

• BioMart

• Downloads

VectorBase genome browser

Genome annotation cycle

Automatic gene build

Assembly

Community annotations

Manual annotations

Other genomes, gene sets

Repeat library (TEs etc)

ESTs, cDNAs

Protein domains

Manual annotation

• FlyBase team (Kathy Campbell)

• Anopheles 2L completed Sep 2006

• Anopheles 2R completed Sep 2007

• Anopheles X completed Feb 2008

• 875 Culex genes completed July 2008

• Three mosquitoes better than one

Community annotation

• Expertise from around world

• Gene models, symbols, literature, function

• Need system to track contributions

• Incorporated in gene build updates

• Credit sources

Community Annotation Pipeline (CAP)

CAP: gene model submission

• Gene symbol

• Gene description

• mRNA sequence

• Translation start

• Translation stop

• Determination method

• GO IDs

• PubMed IDs

Excel spreadsheet
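
The submission fields above can be pictured as one record per gene model. The following is a minimal sketch only, not the actual CAP code: the class name, field names, coordinate convention (1-based, inclusive positions within the mRNA) and the sanity checks are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class GeneModelSubmission:
    """One spreadsheet row; all names here are hypothetical, not CAP's."""
    gene_symbol: str
    gene_description: str
    mrna_sequence: str
    translation_start: int   # assumed 1-based position within the mRNA
    translation_stop: int    # assumed 1-based, inclusive
    determination_method: str
    go_ids: List[str] = field(default_factory=list)
    pubmed_ids: List[str] = field(default_factory=list)

    def problems(self) -> List[str]:
        """Return a list of human-readable problems (empty if none found)."""
        issues = []
        seq = self.mrna_sequence.upper()
        if re.fullmatch(r"[ACGTUN]+", seq) is None:
            issues.append("mRNA sequence must be non-empty and use only A, C, G, T/U, N")
        if not (1 <= self.translation_start < self.translation_stop <= len(seq)):
            issues.append("translation start/stop must lie within the mRNA, start before stop")
        elif (self.translation_stop - self.translation_start + 1) % 3 != 0:
            issues.append("coding region length is not a multiple of 3")
        for go_id in self.go_ids:
            # GO identifiers have the form GO: followed by seven digits.
            if re.fullmatch(r"GO:\d{7}", go_id) is None:
                issues.append(f"malformed GO ID: {go_id}")
        for pmid in self.pubmed_ids:
            if not pmid.isdigit():
                issues.append(f"malformed PubMed ID: {pmid}")
        return issues


# Made-up example row (not real data).
example = GeneModelSubmission(
    gene_symbol="TEST1",
    gene_description="hypothetical example gene",
    mrna_sequence="ATGAAATAG",
    translation_start=1,
    translation_stop=9,
    determination_method="cDNA sequencing",
    go_ids=["GO:0008150"],
    pubmed_ids=["12345678"],
)
print(example.problems())
```

Checks like these would only catch formatting slips before a row is submitted; whether a gene model is biologically sound still depends on the alignment and review steps.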

CAP: what happens next

• Transcript aligned to genome

• Gene model constructed

• Reviewed by community representative

CAP: other annotations

• Publications

• CV/ontology terms

• Free text comment*

(* unmoderated)

Expression data

• Many microarray technologies

• Many experimental designs

• Large amount of information

• Many ways to do analysis

Microarray repositories

• Widely adopted standard: MIAME

• GEO (NCBI) & ArrayExpress (EBI)

• Repository ≠ Useful data

• Curation backlog at central repositories

• VectorBase data is manageable

• We manage and curate

Microarray pipeline at VB

| What | Where |
|---|---|
| Alignments & gene assignments | Ensembl-style database |
| Microarray data, raw & processed | BASE |
| Statistics and web interface | VB’s GESOL API |

Web interface

PPO*

Overall picture of expression

Genome browser integration

Help & Documentation

No time today for…

• Averaging over multiple reporters

• Ambiguous reporters

• List of microarray experiments in VB

• Community microarray data submission

• Expert analysis & collaboration

• Future developments

VectorBase’s future directions

• More genomes & sequencing

• Population biology, association studies

• More community involvement in genome annotation

• Enhanced functional genomics resources

Acknowledgements

• VB team

• IC PIs

• VB SWG

• NIAID

• Community

• Organisers

• Audience