
Page 1: Modware

Eric Just – BOSC 2007 - July 20, 2007

Modware: An Object-oriented Perl Interface to the Chado Schema

Eric Just

Senior Bioinformatics Scientist

dictyBase: http://dictybase.org

Northwestern University

Generic Model Organism Database Project (GMOD)

Page 2: Modware

Agenda

• What?
  – What is Chado?
  – What is Modware?
• Why?
• How?
  – Example: a protein coding gene
    • Storing in Chado
    • Writing a Web Page with Modware
• Try
  – Getting Modware

Page 3: Modware

What is Chado?

• Standardized genomics database schema
• Developed by FlyBase
• Adopted and distributed by GMOD project
• Extremely flexible and compact
  – Heavy ontology usage
  – Entity-Attribute-Value tables
• Modularized for different areas of bioinformatics

See Talk on Chado at ISMB (Paper 65)
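
To make the Entity-Attribute-Value point concrete, here is a minimal Python sketch using the standard sqlite3 module. The tables are a simplified stand-in loosely modeled on Chado's cvterm and featureprop tables, and every property name and value below is invented for illustration; none of it comes from the talk.

```python
import sqlite3

# In-memory database; the schema is a simplified, assumed stand-in for Chado.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cvterm (cvterm_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE featureprop (
        feature_id INTEGER,  -- the entity
        type_id    INTEGER,  -- the attribute, named by a cvterm row
        value      TEXT      -- the value
    );
""")

# Hypothetical attribute names and values, for illustration only.
conn.executemany("INSERT INTO cvterm VALUES (?, ?)",
                 [(401, "description"), (402, "note")])
conn.executemany("INSERT INTO featureprop VALUES (?, ?, ?)",
                 [(101, 401, "hypothetical description text"),
                  (101, 402, "hypothetical curator note")])


def properties(feature_id):
    """Return all attribute/value pairs stored for one feature."""
    rows = conn.execute(
        "SELECT c.name, p.value FROM featureprop p "
        "JOIN cvterm c ON c.cvterm_id = p.type_id "
        "WHERE p.feature_id = ?", (feature_id,))
    return dict(rows.fetchall())


print(properties(101))
```

Adding a new kind of annotation only needs a new cvterm row, not a new column, which is one way to read "flexible and compact".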

Page 4: Modware

What is Modware?

• Open-source tool for programmers who write software to QUERY AND UPDATE Chado

• Object-oriented API with semantically sensible classes and methods

• Developed at dictyBase

• Working with GMOD

Page 6: Modware

Why Modware Exists

• Chado has many business rules
• Modware encapsulates many Chado business rules
• Faster, more efficient development time
• More readable code
• UI changes, logic does not
• Leverage GMOD/Chado/Open Source community

Page 8: Modware

A Simple Gene Example

A gene is a region on a chromosome that encloses one or more transcript objects. An mRNA is a protein-coding transcript composed of one or more exons, which have coordinates on a chromosome.

[Figure: diagram labeled "Chromosome 3"]
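
The containment described above (gene, then transcript, then exons with coordinates) can be sketched as plain Python data classes. This is only an illustration of the data model, not Modware or BioPerl code; the class names and the span() helper are hypothetical, and the coordinates are borrowed from the mlcE rows on the "Storing mlcE in Chado" slide.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Exon:
    start: int
    end: int


@dataclass
class Transcript:
    name: str
    exons: List[Exon] = field(default_factory=list)

    def span(self) -> Tuple[int, int]:
        """A transcript covers everything from its first exon to its last."""
        return (min(e.start for e in self.exons),
                max(e.end for e in self.exons))


@dataclass
class Gene:
    name: str
    transcripts: List[Transcript] = field(default_factory=list)

    def span(self) -> Tuple[int, int]:
        """A gene encloses all of its transcripts."""
        spans = [t.span() for t in self.transcripts]
        return (min(s for s, _ in spans), max(e for _, e in spans))


mrna = Transcript("DDB0214813", [Exon(67979, 67982), Exon(68256, 68706)])
gene = Gene("mlcE", [mrna])
print(gene.name, gene.span())
```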

Page 9: Modware

Storing mlcE in Chado

CV (controlled vocabulary)

cv_id  cv
1      Sequence Ontology
2      Relationship Ontology

CVterm

cvterm_id  cvterm      cv
250        chromosome  1
251        gene        1
252        mRNA        1
253        exon        1
301        part_of     2

Feature

Feat_id  Feat_type  name
100      250        Chr 3
101      251        mlcE
102      252        DDB0214813
103      253        _DDB0214813_exon_1
104      253        _DDB0214813_exon_2

Featureloc (feature is located on srcfeature)

srcfeature  feature_id  fmin   fmax   strand
100         101         67979  68706  1
100         102         67979  68706  1
100         103         67979  67982  1
100         104         68256  68706  1

Feature_relationship (subject is part of object)

subject_id  type_id  object_id
102         301      101
103         301      102
104         301      102
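
As a rough illustration of what reading these rows back involves, the Python sketch below loads the slide's values into an in-memory SQLite database and walks gene to mRNA to exon through the part_of relationships. The table layouts are simplified assumptions (real Chado tables have more columns and constraints), and exons_of_gene is a hypothetical helper, not part of Modware.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, assumed versions of the five tables shown on the slide.
conn.executescript("""
    CREATE TABLE cv (cv_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE cvterm (cvterm_id INTEGER PRIMARY KEY, name TEXT, cv_id INTEGER);
    CREATE TABLE feature (feature_id INTEGER PRIMARY KEY, type_id INTEGER, name TEXT);
    CREATE TABLE featureloc (srcfeature_id INTEGER, feature_id INTEGER,
                             fmin INTEGER, fmax INTEGER, strand INTEGER);
    CREATE TABLE feature_relationship (subject_id INTEGER, type_id INTEGER,
                                       object_id INTEGER);
""")
conn.executemany("INSERT INTO cv VALUES (?, ?)",
                 [(1, "Sequence Ontology"), (2, "Relationship Ontology")])
conn.executemany("INSERT INTO cvterm VALUES (?, ?, ?)",
                 [(250, "chromosome", 1), (251, "gene", 1), (252, "mRNA", 1),
                  (253, "exon", 1), (301, "part_of", 2)])
conn.executemany("INSERT INTO feature VALUES (?, ?, ?)",
                 [(100, 250, "Chr 3"), (101, 251, "mlcE"),
                  (102, 252, "DDB0214813"),
                  (103, 253, "_DDB0214813_exon_1"),
                  (104, 253, "_DDB0214813_exon_2")])
conn.executemany("INSERT INTO featureloc VALUES (?, ?, ?, ?, ?)",
                 [(100, 101, 67979, 68706, 1), (100, 102, 67979, 68706, 1),
                  (100, 103, 67979, 67982, 1), (100, 104, 68256, 68706, 1)])
conn.executemany("INSERT INTO feature_relationship VALUES (?, ?, ?)",
                 [(102, 301, 101), (103, 301, 102), (104, 301, 102)])


def exons_of_gene(gene_name):
    """Follow part_of links gene <- mRNA <- exon and return exon locations."""
    sql = """
        SELECT exon.name, loc.fmin, loc.fmax
        FROM feature gene
        JOIN feature_relationship g2t ON g2t.object_id = gene.feature_id
        JOIN feature mrna ON mrna.feature_id = g2t.subject_id
        JOIN feature_relationship t2e ON t2e.object_id = mrna.feature_id
        JOIN feature exon ON exon.feature_id = t2e.subject_id
        JOIN featureloc loc ON loc.feature_id = exon.feature_id
        JOIN cvterm part_of ON part_of.cvterm_id = g2t.type_id
                           AND part_of.cvterm_id = t2e.type_id
        WHERE gene.name = ? AND part_of.name = 'part_of'
        ORDER BY loc.fmin
    """
    return conn.execute(sql, (gene_name,)).fetchall()


for name, fmin, fmax in exons_of_gene("mlcE"):
    print(name, fmin, fmax)
```

Even this small question needs a six-way join, which is the kind of rule an object-oriented layer can hide behind a method call.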

Page 10: Modware

A Simple Gene Page

Page 11: Modware

#!/usr/bin/perl
use strict;
use warnings;
use Modware::Feature;
use CGI;

my $id    = CGI::param('primary_id');
my $count = 1;

# Get all data from database
my $feature    = new Modware::Feature( -primary_id => $id );
my $chromosome = $feature->reference_feature()->name();
my $gene       = $feature->gene()->name();
my @exons      = $feature->bioperl()->exons();
my $sequence   = $feature->sequence( -type => 'protein', -format => 'fasta' );

# print the report
print CGI->header;
print "<pre>";

print $id." is on chromosome $chromosome";
print " and is the gene $gene\n";

# print the number and position of each exon
foreach my $exon (@exons) {
    print "Exon $count. start=".$exon->start()." end=".$exon->end()."\n";
    $count++;
}
print $sequence;
print "</pre>";

Page 12: Modware

Modware::Features

Modware Feature Classes

• Gene

• mRNA

• ncRNA

• Contig

• Chromosome

• EST

• Generic (catch-all)

Modware can manage the following annotations

• Sequence

• Location

• Name

• Synonyms

• Description

• Public identifiers

• External identifiers (dbxrefs)

Page 13: Modware

Modware::Search

These classes retrieve groups (iterators) of features

Text searches
Retrieve all kinase genes

Modware::Search::Gene->Search_by_name_and_synonym('*kinase*');

Location searches
Find all protein-coding genes on Chromosome 3 between bases 100,000 and 500,000

Modware::Search::Feature->Search_overlapping_feats_by_range('Chr3', 100000, 500000, 'mRNA');
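
The talk does not show how these two searches are implemented. The Python sketch below is one plausible reading, under stated assumptions: the "*" wildcard is translated to SQL LIKE's "%", and a feature counts as overlapping a range when it starts before the range ends and ends after the range starts. The two-table schema, the function names, and all feature names are hypothetical demo data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Deliberately flattened, hypothetical schema: type and reference as text.
conn.executescript("""
    CREATE TABLE feature (feature_id INTEGER PRIMARY KEY, type TEXT, name TEXT);
    CREATE TABLE featureloc (feature_id INTEGER, src TEXT,
                             fmin INTEGER, fmax INTEGER);
""")
conn.executemany("INSERT INTO feature VALUES (?, ?, ?)",
                 [(1, "gene", "demo_kinase_1"), (2, "gene", "demo_actin"),
                  (3, "mRNA", "demo_mRNA_inside"),
                  (4, "mRNA", "demo_mRNA_outside"),
                  (5, "mRNA", "demo_mRNA_straddling"),
                  (6, "mRNA", "demo_mRNA_other_chr")])
conn.executemany("INSERT INTO featureloc VALUES (?, ?, ?, ?)",
                 [(3, "Chr3", 120000, 125000), (4, "Chr3", 600000, 610000),
                  (5, "Chr3", 499000, 501000), (6, "Chr2", 200000, 210000)])


def search_genes_by_name(pattern):
    """Text search: '*' in the pattern becomes the SQL LIKE wildcard '%'.
    Literal '%' or '_' in the pattern are not escaped in this sketch."""
    like = pattern.replace("*", "%")
    rows = conn.execute(
        "SELECT name FROM feature WHERE type = 'gene' AND name LIKE ? "
        "ORDER BY name", (like,))
    return [r[0] for r in rows]


def search_overlapping(src, start, end, feature_type):
    """Location search: keep features whose interval intersects [start, end]."""
    rows = conn.execute(
        "SELECT f.name FROM feature f "
        "JOIN featureloc l ON l.feature_id = f.feature_id "
        "WHERE l.src = ? AND f.type = ? AND l.fmin < ? AND l.fmax > ? "
        "ORDER BY l.fmin", (src, feature_type, end, start))
    return [r[0] for r in rows]


print(search_genes_by_name("*kinase*"))
print(search_overlapping("Chr3", 100000, 500000, "mRNA"))
```

Whether a feature that only straddles the range boundary should be returned is a design choice; this sketch includes it.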

Page 14: Modware

Updating a Gene Name and Adding a Synonym

# get gene
my ($gene) = new Modware::Search::Gene->Search_by_name( 'mlcA' );

# change the name
$gene->name( 'newname' );

# add a synonym
$gene->add_synonym( 'mlcA' );

# write changes to database
$gene->update();
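
The pattern on this slide (change the object in memory, then call update() once) can be sketched in Python as follows. This is an assumption about the general design, not Modware's actual implementation: the Gene class, its private attributes, and the single synonym table are hypothetical simplifications of how Chado stores names and synonyms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical two-table layout, much simpler than Chado's.
conn.executescript("""
    CREATE TABLE feature (feature_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE synonym (feature_id INTEGER, name TEXT);
    INSERT INTO feature VALUES (1, 'mlcA');
""")


class Gene:
    """Holds edits in memory until update() writes them in one transaction."""

    def __init__(self, feature_id):
        self.feature_id = feature_id
        row = conn.execute("SELECT name FROM feature WHERE feature_id = ?",
                           (feature_id,)).fetchone()
        self._name = row[0]
        self._new_synonyms = []

    def name(self, new_name=None):
        # Combined getter/setter, mirroring the Perl calling style.
        if new_name is not None:
            self._name = new_name
        return self._name

    def add_synonym(self, synonym):
        self._new_synonyms.append(synonym)

    def update(self):
        # "with conn" commits on success and rolls back on an exception,
        # so the rename and the synonym are stored together or not at all.
        with conn:
            conn.execute("UPDATE feature SET name = ? WHERE feature_id = ?",
                         (self._name, self.feature_id))
            conn.executemany("INSERT INTO synonym VALUES (?, ?)",
                             [(self.feature_id, s) for s in self._new_synonyms])
        self._new_synonyms = []


gene = Gene(1)
gene.name("newname")
gene.add_synonym("mlcA")
stored_before = conn.execute("SELECT name FROM feature").fetchone()[0]
gene.update()
stored_after = conn.execute("SELECT name FROM feature").fetchone()[0]
synonyms = [r[0] for r in conn.execute("SELECT name FROM synonym")]
print(stored_before, stored_after, synonyms)
```

Nothing reaches the database until update() runs, so the old name stays stored while the object is being edited.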

Page 15: Modware

Modware Goals

• Future releases will include:
  – Literature annotations
  – GO annotations
  – Phenotype annotations

• Incorporate feedback from users

Page 17: Modware

Getting Modware

http://gmod-ware.sourceforge.net

• Download the NEW Virtual Machine

• Modware is preinstalled and ready for you!

• See me if you have any questions or want a demo

• Visit Poster N41 at ISMB

Page 18: Modware

Online Documentation

http://gmod-ware.sourceforge.net/doc/

Page 19: Modware

Thanks

The organizers of BOSC 2007

O|B|F

dictyBase

• PIs
  – Rex Chisholm, PhD
  – Warren Kibbe, PhD

• Programmer
  – Sohel Merchant

• Curators
  – Petra Fey
  – Pascale Gaudet, PhD

Other Groups

• Funding
  – NIH (NIGMS and NHGRI)

• GMOD
  – Scott Cain
  – Brian O'Connor

• Chado developers

• Bioperl developers

Page 20: Modware

# USE CASE: Add a description, dbxref, and an exon
use strict;
use warnings;
use Modware::Feature;
use Bio::SeqFeature::Gene::Exon;

my $transcript = new Modware::Feature( -primary_id => 'DDB0233595' );
$transcript->description( 'Gene model derived from AU12345' );
$transcript->add_external_id( -source => 'GenBank Accession Number', -id => 'AU12345' );

# call the bioperl method to retrieve bioperl representation of object
# need this to view/edit exon structure
my $bioperl = $transcript->bioperl();

# here, we are manipulating a Bio::SeqFeature::Gene object
# shift the last exon back a little bit (to lose stop codon)
[$bioperl->exons()]->[2]->start( 281050 );

# create a new exon and add it to the feature
my $exon = Bio::SeqFeature::Gene::Exon->new( -start => 280921, -end => 280959, -strand => -1 );
$exon->is_coding(1);
$bioperl->add_exon($exon);

# update writes everything to the database
$transcript->update();

Page 21: Modware

Modware::Feature

Modware::Feature::GENE

Modware::Feature::MRNA

Bio::SeqFeature::Gene::Transcript Bio::Seq

Modware::Feature::CHROMOSOME

Bio::SeqFeature::Gene::Exon

Page 22: Modware

Feature

ncRNA, mRNA, Contig, Chromosome

getOverlappingFeatures()
getOverlappingAlignments()

Bio::SeqFeature::Gene::Transcript, Bio::SeqFeature::Generic, Bio::SeqFeature::Generic, Bio::Seq