PhyQL: A Phylogenetic Visual Query Engine

Shahriyar Hossain, Munirul Islam, Jesmin, Hasan M Jamil. Integration Informatics Laboratory, Computer Science, Wayne State University; Department of Genetic Engineering and Biotechnology, University of Dhaka, Bangladesh. BIBM 2008.



TRANSCRIPT

Page 1: PhyQL: A Phylogenetic Visual Query Engine

Shahriyar Hossain, Munirul Islam, Jesmin, Hasan M Jamil

Integration Informatics Laboratory, Computer Science, Wayne State University

Department of Genetic Engineering and Biotechnology, University of Dhaka, Bangladesh

BIBM 2008

PhyQL: A Phylogenetic Visual Query Engine

Integration Informatics Research Group

Page 2: PhyQL: A Phylogenetic Visual Query Engine


What is a Phylogenetic Tree?

Page 3: PhyQL: A Phylogenetic Visual Query Engine


Page 4: PhyQL: A Phylogenetic Visual Query Engine

Queries: Least Common Ancestor

<root>
  <node>rayfinned fish</node>
  <inode>
    <node>lungfish</node>
    <inode>
      <inode>
        <node>salamanders</node>
        <node>frogs</node>
      </inode>
      . . .
    </inode>
  </inode>
</root>

for $root in doc("tree.xml")//root
return <span> <h1> { $root/node/text() } </h1> </span>
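A least-common-ancestor lookup over this XML encoding can be sketched in Python. This is only an illustration, not the PhyQL implementation: the function names are hypothetical, the elided part of the tree (". . .") is left out, and leaf names are assumed to be unique.

```python
import xml.etree.ElementTree as ET

# The tree from the slide; the part elided with ". . ." is omitted here.
TREE_XML = """
<root>
  <node>rayfinned fish</node>
  <inode>
    <node>lungfish</node>
    <inode>
      <inode>
        <node>salamanders</node>
        <node>frogs</node>
      </inode>
    </inode>
  </inode>
</root>
"""


def path_to_leaf(elem, name, trail=()):
    """Return the elements from `elem` down to the leaf called `name`, or None."""
    trail = trail + (elem,)
    if elem.tag == "node" and (elem.text or "").strip() == name:
        return trail
    for child in elem:
        found = path_to_leaf(child, name, trail)
        if found is not None:
            return found
    return None


def least_common_ancestor(root, name_a, name_b):
    """Deepest element lying on the root-to-leaf paths of both leaves."""
    path_a = path_to_leaf(root, name_a)
    path_b = path_to_leaf(root, name_b)
    if path_a is None or path_b is None:
        return None
    lca = None
    for a, b in zip(path_a, path_b):
        if a is not b:
            break
        lca = a
    return lca


def leaves_under(elem):
    """Names of all leaves in the subtree rooted at `elem`, in document order."""
    return [n.text.strip() for n in elem.iter("node")]


root = ET.fromstring(TREE_XML)
lca = least_common_ancestor(root, "lungfish", "frogs")
print(leaves_under(lca))
```

The sketch walks both root-to-leaf paths in parallel and keeps the last element they share, which is the usual way to find the ancestor when no precomputed index is available.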

Page 5: PhyQL: A Phylogenetic Visual Query Engine

Phylogenetic Query Language:

Select: select the subset of trees that match a given criterion

Join: join two trees based on a pair of nodes

Subset: retrieve part of a given tree
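One possible reading of these three operations, sketched in Python. Everything here is an assumption made for illustration: the `Tree` class, the function names, and in particular the join semantics (grafting the subtree at one node of the second tree under a node of the first) are not taken from the slides, and node names are assumed to be disjoint across trees.

```python
from dataclasses import dataclass, field


@dataclass
class Tree:
    author: str
    root: str
    children: dict = field(default_factory=dict)  # parent name -> list of child names

    def leaves(self, start=None):
        """Leaf names below `start` (default: the root), left to right."""
        start = self.root if start is None else start
        kids = self.children.get(start, [])
        if not kids:
            return [start]
        found = []
        for kid in kids:
            found.extend(self.leaves(kid))
        return found


def select(trees, predicate):
    """Select: keep the trees that satisfy the criterion."""
    return [t for t in trees if predicate(t)]


def subset(tree, node):
    """Subset: the part of `tree` rooted at `node`."""
    kept = {}
    stack = [node]
    while stack:
        current = stack.pop()
        if current in tree.children:
            kept[current] = list(tree.children[current])
            stack.extend(tree.children[current])
    return Tree(tree.author, node, kept)


def join(left, left_node, right, right_node):
    """Join (assumed semantics): graft the subtree of `right` at `right_node`
    under `left_node` in `left`. Node names must not collide."""
    grafted = subset(right, right_node)
    merged = {parent: list(kids) for parent, kids in left.children.items()}
    merged.update(grafted.children)
    merged.setdefault(left_node, []).append(right_node)
    return Tree(left.author, left.root, merged)


t1 = Tree("stern", "r1", {"r1": ["x", "E"], "x": ["A", "B"]})
t2 = Tree("other", "r2", {"r2": ["C", "D"]})

print(select([t1, t2], lambda t: t.author == "stern"))
print(subset(t1, "x").leaves())
print(join(t1, "E", t2, "r2").leaves())
```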


Page 6: PhyQL: A Phylogenetic Visual Query Engine


Using Path Operators

SubTree Projection

Tree Join


Page 7: PhyQL: A Phylogenetic Visual Query Engine

PhyQL:

[Architecture diagram. Components shown: User; Visual Query Interface (SELECT, JOIN, SUBTREE); Translator; XSB; DB; Wrappers; XML / NEXUS from user; Interoperable Databases]

Page 8: PhyQL: A Phylogenetic Visual Query Engine

Why XSB?

Eliminates the left-recursion problem, e.g. path(X,Z) :- path(X,Y), edge(Y,Z).

Stores intermediate results (tabling)

Model-based: the order in which rules are written doesn't matter

:- table path/2.    % tabling declaration, needed for the left-recursive rule
path(X,Y) :- edge(X,Y).
path(X,Z) :- path(X,Y), edge(Y,Z).

Its in-memory database queries are an order of magnitude faster than systems such as tuProlog.
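Why storing intermediate results removes the left-recursion problem can be illustrated with a small bottom-up fixpoint in Python. This is not how XSB evaluates queries internally; it is only a sketch of the idea that answers already in the table are never re-derived, so the left-recursive path rule terminates even on cyclic edge data. The function name is hypothetical.

```python
def path_closure(edges):
    """Least fixpoint of:
         path(X,Y) :- edge(X,Y).
         path(X,Z) :- path(X,Y), edge(Y,Z).
    """
    successors = {}
    for parent, child in edges:
        successors.setdefault(parent, set()).add(child)

    table = set(edges)     # answers found so far (the "table")
    frontier = set(edges)  # answers found in the previous round
    while frontier:
        new = set()
        for x, y in frontier:
            for z in successors.get(y, ()):
                if (x, z) not in table:  # already-known answers are skipped
                    new.add((x, z))
        table |= new
        frontier = new
    return table


# A cyclic graph: a naive top-down evaluation of the left-recursive rule
# would loop forever, while the tabled fixpoint stops once nothing new appears.
edges = [("a", "b"), ("b", "c"), ("c", "a")]
paths = path_closure(edges)
print(sorted(paths))
```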


:- odbc_import(conn, 'tbl_treeinfo'('rootId', 'author'), tree).
:- odbc_import(conn, 'tbl_nodeinfo'('nodeId', 'nodename'), node).
:- odbc_import(conn, 'tbl_edge'('parentId', 'childId'), edge).

Page 9: PhyQL: A Phylogenetic Visual Query Engine


<tree author="stern">
  <node type="*">
    <node type="?">
      <node> Stanhopea_gibbosa </node>
      <node> Stanhopea_vasquezii </node>
    </node>
    <node> Stanhopea_shuttleworthii </node>
  </node>
</tree>

node(Y1, 'Stanhopea_shuttleworthii'),
node(Y2, 'Stanhopea_gibbosa'),
node(Y3, 'Stanhopea_vasquezii'),
edge(Y4,Y2),
edge(Y4,Y3),
lca(Y0,Y4,Y1),
edge(Y0,Y1)
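How such a conjunctive query could be evaluated against node and edge facts is sketched below in Python. The node identifiers and the fact base are invented for illustration, and the reading of lca(Y0,Y4,Y1) as "Y0 is the nearest node that is an ancestor of, or equal to, both Y4 and Y1" is an assumption about the argument order.

```python
from itertools import product

# Hypothetical fact base: node(Id, Name) and edge(Parent, Child).
NODE = {
    3: "Stanhopea_gibbosa",
    4: "Stanhopea_vasquezii",
    5: "Stanhopea_shuttleworthii",
}
EDGE = {(1, 2), (1, 5), (2, 3), (2, 4)}
PARENT = {child: parent for parent, child in EDGE}


def ancestors(n):
    """Proper ancestors of n, nearest first."""
    chain = []
    while n in PARENT:
        n = PARENT[n]
        chain.append(n)
    return chain


def lca(a, b):
    """Nearest node that is an ancestor of (or equal to) both a and b."""
    chain_b = set([b] + ancestors(b))
    for candidate in [a] + ancestors(a):
        if candidate in chain_b:
            return candidate
    return None


def ids_named(name):
    return [i for i, label in NODE.items() if label == name]


def run_query():
    """Bindings satisfying every goal of the query on the slide."""
    answers = []
    for y1, y2, y3 in product(
        ids_named("Stanhopea_shuttleworthii"),
        ids_named("Stanhopea_gibbosa"),
        ids_named("Stanhopea_vasquezii"),
    ):
        y4 = PARENT.get(y2)                      # edge(Y4, Y2)
        if y4 is None or PARENT.get(y3) != y4:   # edge(Y4, Y3)
            continue
        y0 = lca(y4, y1)                         # lca(Y0, Y4, Y1)
        if y0 is not None and (y0, y1) in EDGE:  # edge(Y0, Y1)
            answers.append({"Y0": y0, "Y1": y1, "Y2": y2, "Y3": y3, "Y4": y4})
    return answers


print(run_query())
```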


Page 10: PhyQL: A Phylogenetic Visual Query Engine


Page 11: PhyQL: A Phylogenetic Visual Query Engine


Page 12: PhyQL: A Phylogenetic Visual Query Engine


Page 13: PhyQL: A Phylogenetic Visual Query Engine


Page 14: PhyQL: A Phylogenetic Visual Query Engine

Summary

PhyQL offers a simple web-based visual query interface
Logic-based tree query operations
Modifying the query tools requires changing only the logic rules
The proposed architecture can also be applied to protein-protein interaction networks, metabolic pathways, etc.

Future Work:

Database Interoperability: allow retrieving integrated phylogenetic data during query submission
ReQuery: query on the result set
Tree Similarity Estimation

Page 15: PhyQL: A Phylogenetic Visual Query Engine

Thank You!


me: http://homopan.wayne.edu/PhD Students/Munirul Islam/index.htm

Page 16: PhyQL: A Phylogenetic Visual Query Engine

Uses of Phylogenetic Trees:

1. Date events of divergence of species
2. What is the most recent common ancestor of all living species?
3. Identify geographic origins of new disease outbreaks

Page 17: PhyQL: A Phylogenetic Visual Query Engine

Crimson

Uses nested subtrees to avoid long strings

Zheng, Y., S. Fisher, S. Cohen, S. Guo, J. Kim, and S. B. Davidson. 2006. Crimson: A Data Management System to Support Evaluating Phylogenetic Tree Reconstruction Algorithms. 32nd International Conference on Very Large Data Bases, ACM, pp. 1231-1234.

Page 18: PhyQL: A Phylogenetic Visual Query Engine

Dewey system:

[Tree diagram: leaves A, B, C, D, E; nodes labeled with Dewey paths 0; 0.1, 0.1.1, 0.1.2; 0.2, 0.2.1, 0.2.1.1, 0.2.1.2, 0.2.2]

Page 19: PhyQL: A Phylogenetic Visual Query Engine

Label   Path
Root    0
NULL    0.1
A       0.1.1
B       0.1.2
NULL    0.2
NULL    0.2.1
C       0.2.1.1
D       0.2.1.2
E       0.2.2

Find clade for: Z = (<CS+Ds)

Find common pattern starting from left

SELECT * FROM nodes
WHERE path LIKE '0.2.1%';
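The prefix lookup can be tried out against an in-memory SQLite table holding the labels and paths from the table above. This is a sketch under assumptions: the column names `label` and `path` and the helper names are hypothetical. It also matches on the exact prefix or on the prefix followed by a dot, a small variation on the slide's query that avoids also matching a sibling such as 0.2.10 in larger trees.

```python
import sqlite3

# (label, Dewey path) rows from the table above; unlabeled nodes are NULL.
ROWS = [
    ("Root", "0"), (None, "0.1"), ("A", "0.1.1"), ("B", "0.1.2"),
    (None, "0.2"), (None, "0.2.1"), ("C", "0.2.1.1"), ("D", "0.2.1.2"),
    ("E", "0.2.2"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (label TEXT, path TEXT)")
conn.executemany("INSERT INTO nodes VALUES (?, ?)", ROWS)


def common_prefix(path_a, path_b):
    """Longest shared run of Dewey components: 0.2.1.1 and 0.2.1.2 give 0.2.1."""
    shared = []
    for a, b in zip(path_a.split("."), path_b.split(".")):
        if a != b:
            break
        shared.append(a)
    return ".".join(shared)


def clade(conn, label_a, label_b):
    """Rows of the smallest clade containing both labeled leaves."""
    paths = dict(conn.execute(
        "SELECT label, path FROM nodes WHERE label IN (?, ?)",
        (label_a, label_b),
    ))
    prefix = common_prefix(paths[label_a], paths[label_b])
    rows = conn.execute(
        "SELECT label, path FROM nodes "
        "WHERE path = ? OR path LIKE ? ORDER BY path",
        (prefix, prefix + ".%"),
    )
    return rows.fetchall()


print(clade(conn, "C", "D"))
```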


Page 20: PhyQL: A Phylogenetic Visual Query Engine

[Tree diagram: leaves A, B, C, D, E; each node is annotated with a left and a right ID, numbered 1 to 18]

Depth-first traversal scoring each node with a left and right ID
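The numbering can be reproduced with a short recursive traversal in Python. The tree shape is the one implied by the Dewey paths on the earlier slide; the names root, x, y and z for the unlabeled nodes and the function name are placeholders introduced here.

```python
# (name, children) pairs; root, x, y and z stand in for the unlabeled nodes.
TREE = ("root", [
    ("x", [("A", []), ("B", [])]),
    ("y", [("z", [("C", []), ("D", [])]), ("E", [])]),
])


def number_nodes(tree):
    """Return (name, left, right) rows in depth-first (pre-order) order."""
    rows = []
    counter = 0

    def visit(node):
        nonlocal counter
        name, children = node
        counter += 1
        left = counter          # assigned when the node is entered
        index = len(rows)
        rows.append(None)       # reserve the row to keep pre-order
        for child in children:
            visit(child)
        counter += 1            # right ID is assigned when the node is left
        rows[index] = (name, left, counter)

    visit(tree)
    return rows


for row in number_nodes(TREE):
    print(row)
```

Because a node's right ID is assigned only after all of its descendants have been numbered, every descendant's IDs fall strictly between the left and right IDs of its ancestor.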


Page 21: PhyQL: A Phylogenetic Visual Query Engine

Label         Left   Right
(unlabeled)   1      18
(unlabeled)   2      7
A             3      4
B             5      6
(unlabeled)   8      17
(unlabeled)   9      14
C             10     11
D             12     13
E             15     16


SELECT * FROM nodes
INNER JOIN nodes AS include
ON (nodes.left_id BETWEEN include.left_id AND include.right_id)
WHERE include.node_id = 5;

Minimum Spanning Clade of Node 5
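The self-join can be run against an in-memory SQLite table built from the left and right IDs above. The slides do not show how `node_id` values are assigned, so the numbering used here (nodes numbered 1 to 9 in depth-first order, which makes node 5 the internal node spanning 8 to 17) is an assumption, as is the `label` column.

```python
import sqlite3

# (node_id, label, left_id, right_id); node_id values are assumed.
ROWS = [
    (1, None, 1, 18),
    (2, None, 2, 7),
    (3, "A", 3, 4),
    (4, "B", 5, 6),
    (5, None, 8, 17),
    (6, None, 9, 14),
    (7, "C", 10, 11),
    (8, "D", 12, 13),
    (9, "E", 15, 16),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nodes ("
    "node_id INTEGER PRIMARY KEY, label TEXT, left_id INTEGER, right_id INTEGER)"
)
conn.executemany("INSERT INTO nodes VALUES (?, ?, ?, ?)", ROWS)


def spanning_clade(conn, node_id):
    """All nodes whose left ID lies inside the interval of `node_id`."""
    query = """
        SELECT nodes.node_id, nodes.label
        FROM nodes
        INNER JOIN nodes AS include
            ON nodes.left_id BETWEEN include.left_id AND include.right_id
        WHERE include.node_id = ?
        ORDER BY nodes.left_id
    """
    return conn.execute(query, (node_id,)).fetchall()


print(spanning_clade(conn, 5))
```

With this encoding the whole clade comes back from a single range comparison, with no recursion over parent-child edges.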
