An implementation of Jan Aerts' LocusTree


Upload: pierre-lindenbaum

Post on 09-Jul-2015


DESCRIPTION

My implementation of Jan Aerts' LocusTree algorithm, built on BerkeleyDB-JE, a key/value datastore. It has been used to build an SVG genome browser.

TRANSCRIPT

Page 1: An implementation of Jan Aerts' LocusTree

Another Genome Browser

LocusTree

Pierre Lindenbaum PhD. Fondation Jean Dausset

Dec. 2009

http://plindenbaum.blogspot.com


“A brave new genome browser”


Saaien Tist

http://saaientist.blogspot.com


BerkeleyDB: A Key/Value Database

(Oracle)

C API

Java Bindings (JNI)

Pure Java

http://www.oracle.com/database/berkeley-db/index.html


How it works
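The slides in this part of the deck are images, so the transcript carries no description of the algorithm. As a rough illustration only, the sketch below shows a generic multi-level binning index on top of a sorted key/value map. It is not taken from the slides and may differ from the scheme LocusTree uses. The bin widths, the number of levels, the "chrom:level:bin" key layout, the coordinate convention (0-based, end exclusive) and the class name BinIndexSketch are all assumptions, and a TreeMap stands in for the key/value database.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Generic multi-level binning sketch; NOT the author's implementation.
class BinIndexSketch {
    // All three constants are assumptions chosen for illustration.
    static final long BASE_WIDTH = 1_000L; // width of a level-0 bin, in bases
    static final int FACTOR = 10;          // each level is 10x wider than the one below
    static final int LEVELS = 7;           // level-6 bins span 1,000,000,000 bases

    static final class Feature {
        final String name;
        final int start; // 0-based, inclusive (assumed convention)
        final int end;   // exclusive
        Feature(String name, int start, int end) {
            this.name = name; this.start = start; this.end = end;
        }
    }

    // Stand-in for the key/value database: key = "chrom:level:bin".
    private final TreeMap<String, List<Feature>> store = new TreeMap<>();

    static long width(int level) {
        long w = BASE_WIDTH;
        for (int i = 0; i < level; i++) w *= FACTOR;
        return w;
    }

    static String key(String chrom, int level, long bin) {
        return chrom + ":" + level + ":" + bin;
    }

    // Store the feature in the smallest bin that contains it entirely.
    void put(String chrom, Feature f) {
        if (f.start < 0 || f.end <= f.start) {
            throw new IllegalArgumentException("bad interval");
        }
        for (int level = 0; level < LEVELS; level++) {
            long w = width(level);
            if (f.start / w == (f.end - 1) / w) {
                store.computeIfAbsent(key(chrom, level, f.start / w),
                        k -> new ArrayList<>()).add(f);
                return;
            }
        }
        throw new IllegalArgumentException("feature does not fit in a top-level bin");
    }

    // Visit, at every level, the bins touched by [start, end) and keep true overlaps.
    List<Feature> query(String chrom, int start, int end) {
        List<Feature> hits = new ArrayList<>();
        if (start < 0 || end <= start) return hits;
        for (int level = 0; level < LEVELS; level++) {
            long w = width(level);
            for (long bin = start / w; bin <= (end - 1) / w; bin++) {
                List<Feature> bucket = store.get(key(chrom, level, bin));
                if (bucket == null) continue;
                for (Feature f : bucket) {
                    if (f.start < end && f.end > start) hits.add(f);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        BinIndexSketch index = new BinIndexSketch();
        index.put("chr1", new Feature("uc001act.2", 1007060, 1041599));
        for (Feature f : index.query("chr1", 1010000, 1010100)) {
            System.out.println(f.name + "\t" + f.start + "\t" + f.end);
        }
    }
}
```

Keeping the level in the key means a zoomed-out view can skip the fine-grained levels entirely, which is one common motivation for this kind of layout.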


Loading the Data


Organisms

<organisms>
  <organism id="36">
    <name>hg18</name>
    <description>
      Human Genome Build v.36
    </description>
    <metadata>{"taxon-id":9606}</metadata>
  </organism>
</organisms>
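A file shaped like the organisms document above can be read with the JDK's built-in DOM parser. The following is a minimal sketch, not the loader used in the talk: the class name OrganismsXmlSketch and the Organism holder are hypothetical, and the JSON in the metadata element is kept as an opaque string rather than parsed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical reader for the <organisms> document; names are illustrative.
class OrganismsXmlSketch {
    static final class Organism {
        final int id;
        final String name;
        final String description;
        final String metadata; // raw JSON text, left unparsed here
        Organism(int id, String name, String description, String metadata) {
            this.id = id; this.name = name;
            this.description = description; this.metadata = metadata;
        }
    }

    // Text of the first child element with the given tag, trimmed; "" if absent.
    static String text(Element parent, String tag) {
        NodeList found = parent.getElementsByTagName(tag);
        return found.getLength() == 0 ? "" : found.item(0).getTextContent().trim();
    }

    static List<Organism> parse(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList nodes = doc.getElementsByTagName("organism");
        List<Organism> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element e = (Element) nodes.item(i);
            result.add(new Organism(
                    Integer.parseInt(e.getAttribute("id")),
                    text(e, "name"),
                    text(e, "description"),
                    text(e, "metadata")));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<organisms><organism id=\"36\"><name>hg18</name>"
                + "<description> Human Genome Build v.36 </description>"
                + "<metadata>{\"taxon-id\":9606}</metadata></organism></organisms>";
        for (Organism o : parse(xml)) {
            System.out.println(o.id + "\t" + o.name + "\t" + o.description + "\t" + o.metadata);
        }
    }
}
```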


Tracks

<tracks>
  <track id="1">
    <name>cytobands</name>
    <description>UCSC cytobands</description>
  </track>
  <track id="2">
    <name>knownGene</name>
    <description>UCSC knownGene</description>
  </track>
  <track id="3">
    <name>snp130</name>
    <description>dbSNP v.130</description>
  </track>
</tracks>


Chromosomes

<chromosomes organism-id="36">
  <chromosome id="1">
    <name>chr1</name>
    <metadata>
      {"size":247249719,"type":"autosomal"}
    </metadata>
  </chromosome>
  <chromosome id="10">
    <name>chr10</name>
    <metadata>
      {"size":135374737,"type":"autosomal"}
    </metadata>
  </chromosome>
  (...)
</chromosomes>


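Objects

import java.io.IOException;
import java.util.Set;

// Exposes the current record: the mapped object, its chromosome and its keywords.
public interface LTLoader
{
    public MappedObject getMappedObject();
    public String getChromosome();
    public Set<String> getKeywords();
}

// A loader bound to a URI; next() presumably advances to the following record.
public interface LTStreamLoader
    extends LTLoader
{
    public void open(String uri) throws IOException;
    public void close() throws IOException;
    public boolean next() throws IOException;
}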
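To show how a concrete loader might satisfy these two interfaces, here is a minimal sketch that reads a tab-delimited stream with the columns chromosome, start, end and name. Everything beyond the interface signatures is an assumption: the MappedObject class below is a hypothetical stand-in (the slides do not show the real one), the four-column layout is invented for the example, and next() is assumed to return false once the input is exhausted.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.Set;

class LoaderSketch {
    // Hypothetical stand-in for the MappedObject type named on the slide.
    static final class MappedObject {
        final String name; final int start; final int end;
        MappedObject(String name, int start, int end) {
            this.name = name; this.start = start; this.end = end;
        }
    }

    // The two interfaces as shown on the slide.
    interface LTLoader {
        MappedObject getMappedObject();
        String getChromosome();
        Set<String> getKeywords();
    }

    interface LTStreamLoader extends LTLoader {
        void open(String uri) throws IOException;
        void close() throws IOException;
        boolean next() throws IOException;
    }

    // Illustrative loader for lines of the form: chrom <TAB> start <TAB> end <TAB> name
    static final class TabDelimitedLoader implements LTStreamLoader {
        private BufferedReader in;
        private String chromosome;
        private MappedObject current;

        @Override public void open(String uri) throws IOException {
            in = new BufferedReader(new InputStreamReader(
                    URI.create(uri).toURL().openStream(), StandardCharsets.UTF_8));
        }

        // Reads the next data line; assumed to return false at end of input.
        @Override public boolean next() throws IOException {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.isEmpty() || line.startsWith("#")) continue; // skip blanks and comments
                String[] t = line.split("\t");
                if (t.length < 4) throw new IOException("expected 4 columns: " + line);
                chromosome = t[0];
                current = new MappedObject(t[3], Integer.parseInt(t[1]), Integer.parseInt(t[2]));
                return true;
            }
            return false;
        }

        @Override public MappedObject getMappedObject() { return current; }
        @Override public String getChromosome() { return chromosome; }
        @Override public Set<String> getKeywords() { return Collections.singleton(current.name); }
        @Override public void close() throws IOException { if (in != null) in.close(); }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: LoaderSketch <uri-of-tab-delimited-file>");
            return;
        }
        TabDelimitedLoader loader = new TabDelimitedLoader();
        loader.open(args[0]);
        try {
            while (loader.next()) {
                MappedObject o = loader.getMappedObject();
                System.out.println(loader.getChromosome() + "\t" + o.start + "\t" + o.end
                        + "\t" + loader.getKeywords());
            }
        } finally {
            loader.close();
        }
    }
}
```

Separating open/next/close from the three getters lets the caller pull records one at a time without holding the whole file in memory.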


Objects

<loaders>
  <load organism-id="36" track-id="5"
        class-loader="fr.cephb.locustree.loaders.UCSCAllMrnaLoader"
        limit="10000">
    http://hgdownload.cse.ucsc.edu/(...)/all_mrna.txt.gz
  </load>
  <load organism-id="36" track-id="4"
        class-loader="fr.cephb.locustree.loaders.UCSCSnpCodingLoader"
        limit="10000">
    http://hgdownload.cse.ucsc.edu/(...)/snp130CodingDbSnp.txt.gz
  </load>
</loaders>
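The class-loader attribute names a Java class, which suggests that loaders are instantiated by reflection. The slides do not show that step, so the sketch below is only one plausible way to do it: it reads each load element and creates the named class through its no-argument constructor. The Loader interface, the DummyLoader class, the LoadTask holder and the file URI in the demo are all hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LoaderConfigSketch {
    // Hypothetical minimal loader contract, used only by this example.
    interface Loader { String describe(); }

    // Hypothetical loader class that the demo configuration refers to.
    public static class DummyLoader implements Loader {
        @Override public String describe() { return "dummy loader"; }
    }

    // One <load> element: its attributes plus the URI found in its text content.
    static final class LoadTask {
        final int organismId; final int trackId; final int limit;
        final String className; final String uri;
        LoadTask(int organismId, int trackId, int limit, String className, String uri) {
            this.organismId = organismId; this.trackId = trackId; this.limit = limit;
            this.className = className; this.uri = uri;
        }
    }

    static List<LoadTask> parse(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList nodes = doc.getElementsByTagName("load");
        List<LoadTask> tasks = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element e = (Element) nodes.item(i);
            tasks.add(new LoadTask(
                    Integer.parseInt(e.getAttribute("organism-id")),
                    Integer.parseInt(e.getAttribute("track-id")),
                    Integer.parseInt(e.getAttribute("limit")),
                    e.getAttribute("class-loader"),
                    e.getTextContent().trim()));
        }
        return tasks;
    }

    // Create the class named in the configuration via its no-argument constructor.
    static Loader instantiate(String className) throws ReflectiveOperationException {
        Object o = Class.forName(className).getDeclaredConstructor().newInstance();
        return (Loader) o;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<loaders><load organism-id=\"36\" track-id=\"5\" class-loader=\""
                + DummyLoader.class.getName() + "\" limit=\"10000\">"
                + " file:///tmp/example.txt.gz </load></loaders>"; // hypothetical URI
        for (LoadTask t : parse(xml)) {
            Loader loader = instantiate(t.className);
            System.out.println(t.trackId + "\t" + t.limit + "\t" + t.uri + "\t" + loader.describe());
        }
    }
}
```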


Implementation: Servlet/JSP

Apache Tomcat


Searching


Browsing


SVG
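The SVG slide is an image, so the drawing code itself is not in the transcript. As a hedged illustration of the basic step any such renderer needs, the sketch below maps genomic positions linearly onto a pixel axis and emits one rect per interval. The class name, the fixed 20-pixel track height and the rounding to one decimal are assumptions made for the example.

```java
import java.util.Locale;

// Illustrative only: linear genome-to-pixel mapping and a one-track SVG.
class SvgTrackSketch {
    // Pixel x-coordinate of a genomic position within the visible window.
    static double toPixel(int pos, int viewStart, int viewEnd, int widthPx) {
        return (pos - viewStart) * (double) widthPx / (viewEnd - viewStart);
    }

    // intervals[i] = {start, end} in genomic coordinates.
    static String render(int viewStart, int viewEnd, int widthPx, int[][] intervals) {
        StringBuilder sb = new StringBuilder();
        sb.append(String.format(Locale.ROOT,
                "<svg xmlns=\"http://www.w3.org/2000/svg\" width=\"%d\" height=\"20\">", widthPx));
        for (int[] iv : intervals) {
            double x = toPixel(iv[0], viewStart, viewEnd, widthPx);
            // Keep very short features visible by giving them at least one pixel.
            double w = Math.max(1.0, toPixel(iv[1], viewStart, viewEnd, widthPx) - x);
            sb.append(String.format(Locale.ROOT,
                    "<rect x=\"%.1f\" y=\"5\" width=\"%.1f\" height=\"10\"/>", x, w));
        }
        sb.append("</svg>");
        return sb.toString();
    }

    public static void main(String[] args) {
        // First two exons of uc001act.2 from the text export in this deck.
        int[][] exons = { {1007060, 1008230}, {1009157, 1009626} };
        System.out.println(render(1007060, 1041599, 800, exons));
    }
}
```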


Export: text/plain

hg18 chr1 knownGene uc001act.2 Q96HA4 1007060 1041599 {'exonEnds':[1008230,1009626,1009749,1011255,1012447,1012840,1015671,1016226,1016808,1017346,1040318,1041599],'exonStarts':[1007060,1009157,1009723,1011120,1012381,1012744,1015595,1016118,1016714,1017233,1040264,1041302],'strand':'-'}
hg18 chr1 knownGene uc001acm.2 Q96HA4-2 1007060 1017346 {'exonEnds':[1008230,1011255,1012447,1012840,1015671,1016808,1017346],'exonStarts':[1007060,1011120,1012381,1012744,1015595,1016714,1017233],'strand':'-'}
hg18 chr1 knownGene uc001acu.2 Q96HA4-4 1007060 1041599 {'exonEnds':[1008230,1009626,1009749,1011255,1012447,1012840,1015671,1016808,1017346,1041599],'exonStarts':[1007060,1009595,1009723,1011120,1012381,1012744,1015595,1016714,1017233,1041302],'strand':'-'}
hg18 chr1 knownGene uc009vju.1 uc009vju.1 1007060 1017346 {'exonEnds':[1008230,1009626,1009749,1011255,1012840,1015671,1016808,1017346],'exonStarts':[1007060,1009595,1009723,1011120,1012744,1015595,1016714,1017233],'strand':'-'}
hg18 chr1 knownGene uc001acr.2 uc001acr.2 1007060 1041599 {'exonEnds':[1008230,1009329,1009626,1009749,1011255,1012447,1012840,1015671,1016617,1016808,1017346,1041599],'exonStarts':[1007060,1009157,1009595,1009723,1011120,1012381,1012744,1015595,1016520,1016714,1017233,1041302],'strand':'-'}
hg18 chr1 knownGene uc001acs.2 uc001acs.2 1007060 1041599 {'exonEnds':[1008230,1009329,1009626,1009749,1011255,1012447,1012840,1015671,1016808,1017346,1041599],'exonStarts':[1007060,1009157,1009595,1009723,1011120,1012381,1012744,1015595,1016714,1017233,1041302],'strand':'-'}
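The export records above can be consumed with a few lines of Java. This sketch assumes, from the visible values only, that the eight whitespace-separated fields are organism, chromosome, track, identifier, name, start, end and a metadata object; the slides do not label the columns. It also assumes exonStarts and exonEnds are parallel arrays, and it pulls them out with a regular expression rather than a real JSON parser. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative parser for one line of the text/plain export; column names are assumed.
class ExportLineSketch {
    static final class Record {
        final String organism, chromosome, track, id, name, metadata;
        final int start, end;
        Record(String organism, String chromosome, String track, String id,
               String name, int start, int end, String metadata) {
            this.organism = organism; this.chromosome = chromosome; this.track = track;
            this.id = id; this.name = name; this.start = start; this.end = end;
            this.metadata = metadata;
        }
    }

    // Split into at most 8 fields so the metadata stays in one piece.
    static Record parse(String line) {
        String[] t = line.trim().split("\\s+", 8);
        if (t.length != 8) {
            throw new IllegalArgumentException("expected 8 fields, got " + t.length);
        }
        return new Record(t[0], t[1], t[2], t[3], t[4],
                Integer.parseInt(t[5]), Integer.parseInt(t[6]), t[7]);
    }

    // Extract an integer array such as 'exonStarts':[1,2,3] from the metadata text.
    static int[] intArray(String metadata, String key) {
        Matcher m = Pattern.compile("'" + Pattern.quote(key) + "':\\[([0-9,]*)\\]")
                .matcher(metadata);
        if (!m.find() || m.group(1).isEmpty()) return new int[0];
        String[] parts = m.group(1).split(",");
        int[] values = new int[parts.length];
        for (int i = 0; i < parts.length; i++) values[i] = Integer.parseInt(parts[i]);
        return values;
    }

    // Sum of (end - start) over all exons, assuming parallel arrays.
    static int exonLength(Record r) {
        int[] starts = intArray(r.metadata, "exonStarts");
        int[] ends = intArray(r.metadata, "exonEnds");
        if (starts.length != ends.length) {
            throw new IllegalArgumentException("exonStarts and exonEnds differ in length");
        }
        int total = 0;
        for (int i = 0; i < starts.length; i++) total += ends[i] - starts[i];
        return total;
    }

    public static void main(String[] args) {
        String line = "hg18 chr1 knownGene uc001acm.2 Q96HA4-2 1007060 1017346 "
                + "{'exonEnds':[1008230,1011255,1012447,1012840,1015671,1016808,1017346],"
                + "'exonStarts':[1007060,1011120,1012381,1012744,1015595,1016714,1017233],"
                + "'strand':'-'}";
        Record r = parse(line);
        System.out.println(r.id + "\t" + intArray(r.metadata, "exonStarts").length
                + " exons\t" + exonLength(r) + " bp");
    }
}
```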


Distributed Annotation System (DAS)


LocusTreeService service=new LocusTreeService();
LocusTree locustree=service.getLocusTreeSePort();
final int organismId=36;
for(Chromosome chrom : locustree.getChromosomes(organismId))
{
    System.out.println(
        chrom.getId()+"\t"+
        chrom.getName()+"\t"+
        chrom.getOrganismId()+"\t"+
        chrom.getLength()
    );
}

1  chr1  36  247249719
2  chr2  36  242951149
3  chr3  36  199501827
4  chr4  36  191273063
5  chr5  36  180857866
6  chr6  36  170899992
7  chr7  36  158821424
8  chr8  36  146274826
9  chr9  36  140273252
(...)


Thank you.