
Post on 18-Dec-2015


On Triple Dissemination, Forward-Chaining and Load Balancing in DHT Based RDF Stores

Dominic Battre, Felix Heine, Andre Höing, and Odej Kao

Presented by Aldarwich Yaser

Albert-Ludwigs-University Freiburg, SS 2009
Department of Computer Science
Computer Networks and Telematics
Prof. Christian Schindelhauer


Overview

Motivation
Introduction
• RDF
• DHT
• Pastry
Triple dissemination
Reasoning
Load balancing
References


Motivation

Shortcomings of centralized databases
• Unable to handle the load
• Limited capacity, as in Sesame and Jena

Decentralized databases
• Examples: BabelPeers, RDFPeers and Edutella
• Provide scalability, efficiency and capacity

Reasoning
• Infers new data from existing information

Load balancing


RDF Introduction

Resource Description Framework (RDF)
Used for representing information on the Web
RDFS provides a powerful model for storing knowledge and drawing inferences from it
In RDF, everything is represented by triples of the form (S, P, O): subject, predicate, object

Example: Germany (S) hasCapital (P) Berlin (O)


DHT Introduction

Solves the item location problem in a distributed network of nodes

Uses a key k to calculate the ID: ID = hash(k)

Operations:
• Put(k, x)
• Get(k)
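
The two operations can be tried out in a few lines. This is a minimal single-process sketch, not a real DHT: the class `ToyDHT`, the node names, the use of SHA-1, the 32-bit ID space and the "first node ID at or after the key ID" placement rule are all assumptions made for illustration.

```python
import hashlib
from bisect import bisect_left


def dht_id(key: str, bits: int = 32) -> int:
    """ID = hash(k): map a key to a point in the identifier space."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)


class ToyDHT:
    """Single-process stand-in for a DHT: no networking, no routing."""

    def __init__(self, node_names):
        # Nodes get their IDs from the same hash function as the keys.
        self.ring = sorted((dht_id(name), name) for name in node_names)
        self.storage = {name: {} for name in node_names}

    def responsible_node(self, key: str) -> str:
        # Assumed policy: first node whose ID is >= the key ID, wrapping around.
        ids = [node_id for node_id, _ in self.ring]
        index = bisect_left(ids, dht_id(key)) % len(self.ring)
        return self.ring[index][1]

    def put(self, key: str, value) -> None:
        self.storage[self.responsible_node(key)].setdefault(key, []).append(value)

    def get(self, key: str):
        return self.storage[self.responsible_node(key)].get(key, [])


dht = ToyDHT(["node-a", "node-b", "node-c"])
dht.put("Berlin", ("Germany", "hasCapital", "Berlin"))
print(dht.get("Berlin"))  # [('Germany', 'hasCapital', 'Berlin')]
```

Because both Put and Get derive the responsible node from the key alone, no central index is needed to find an item.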


Triple dissemination

Triple T = (s, p, o) is sent to three nodes:
• identifier = hash(s): responsible node for s
• identifier = hash(p): responsible node for p
• identifier = hash(o): responsible node for o

Query q = (s, p, o): identifier = hash(p)

Source: http://videolectures.net/iswc08_kaoudi_rdfs/
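
The indexing scheme on this slide can be sketched as follows. The dictionary `store` is a hypothetical stand-in for the whole network (one entry per DHT identifier), SHA-1 is an assumed hash, and the convention of writing unknown query components as `None` is also an assumption.

```python
import hashlib
from collections import defaultdict


def identifier(term: str) -> int:
    """identifier = hash(term); SHA-1 is an assumed choice of hash."""
    return int.from_bytes(hashlib.sha1(term.encode("utf-8")).digest(), "big")


# Hypothetical stand-in for the network: DHT identifier -> triples stored there.
store = defaultdict(set)


def disseminate(triple):
    """Send the triple to the nodes responsible for hash(s), hash(p), hash(o)."""
    for term in triple:
        store[identifier(term)].add(triple)


def query(pattern):
    """pattern uses None for unknown components, e.g. (None, "hasCapital", None)."""
    known = [term for term in pattern if term is not None]
    if not known:
        raise ValueError("at least one component must be known to pick a DHT key")
    candidates = store[identifier(known[0])]
    return {
        t for t in candidates
        if all(want is None or want == got for want, got in zip(pattern, t))
    }


disseminate(("Germany", "hasCapital", "Berlin"))
disseminate(("France", "hasCapital", "Paris"))
print(sorted(query((None, "hasCapital", None))))
# [('France', 'hasCapital', 'Paris'), ('Germany', 'hasCapital', 'Berlin')]
```

Storing each triple under all three components costs three times the space, but any query with at least one known component can be routed to a single identifier.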


Pastry Protocol

Each peer has a 128-bit ID: nodeID
• Unique and uniformly distributed
• Computed by a cryptographic hash function applied to the IP address

A message takes O(log N) steps to reach its destination

Node state contains:
• Leaf set
• Routing table
• Neighborhood set


Pastry (prefix-matching)

Route(m, 323310)?

[Figure: prefix-matching route for key 323310; IDs shown: 323310, 323211, 322021, 313221, 103231; labels: Node-id, Key]
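
The idea of prefix matching can be illustrated with the IDs from this slide. This is a simplified sketch, not the Pastry algorithm itself: it assumes that all five values are node IDs, that the message starts at 103231, and that every node knows every other node, whereas real Pastry consults a routing table and a leaf set. The function names are hypothetical.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits that two IDs have in common."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


def next_hop(current: str, key: str, known_nodes):
    """Pick a known node that shares a strictly longer prefix with the key."""
    here = shared_prefix_len(current, key)
    better = [n for n in known_nodes if shared_prefix_len(n, key) > here]
    if not better:
        return None
    # Take the smallest improvement to mimic resolving one more digit per hop.
    return min(better, key=lambda n: shared_prefix_len(n, key))


# IDs taken from the slide; treating all five as node IDs is an assumption.
nodes = ["103231", "313221", "322021", "323211", "323310"]
key = "323310"

route = ["103231"]
while (hop := next_hop(route[-1], key, nodes)) is not None:
    route.append(hop)
print(" -> ".join(route))
# 103231 -> 313221 -> 322021 -> 323211 -> 323310
```

Each hop agrees with the key in more leading digits than the previous one, which is why the number of hops grows only logarithmically with the number of nodes.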


RDF Reasoning

Queries are formulated in general terms
RDFS retrieves data even if its description does not exactly match the query

Example:

Christian fatherOf Schindelhauer
fatherOf subPropertyOf relativeOf

=> Christian relativeOf Schindelhauer
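
The inference in this example corresponds to the W3C entailment rule rdfs7 (sub-property inheritance). A minimal sketch, with the property names spelled as `fatherOf` and `relativeOf` and a hypothetical helper `apply_subproperty`:

```python
SUB_PROPERTY_OF = "rdfs:subPropertyOf"

triples = {
    ("Christian", "fatherOf", "Schindelhauer"),
    ("fatherOf", SUB_PROPERTY_OF, "relativeOf"),
}


def apply_subproperty(triples):
    """If (a subPropertyOf b) and (u a v) hold, derive (u b v)."""
    derived = set()
    for a, p, b in triples:
        if p != SUB_PROPERTY_OF:
            continue
        for u, q, v in triples:
            if q == a:
                derived.add((u, b, v))
    return derived - triples


print(apply_subproperty(triples))
# {('Christian', 'relativeOf', 'Schindelhauer')}
```

A query for `relativeOf` now finds Christian even though only `fatherOf` was stated explicitly.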


RDFS Rules

Rule Name   Precondition                                             Generated Triple
rdfs2       (a, rdfs:domain, x), (u, a, v)                           (u, rdf:type, x)
rdfs3       (a, rdfs:range, x), (u, a, v)                            (v, rdf:type, x)
rdfs5       (u, rdfs:subPropertyOf, v), (v, rdfs:subPropertyOf, x)   (u, rdfs:subPropertyOf, x)
rdfs9       (u, rdfs:subClassOf, x), (v, rdf:type, u)                (v, rdf:type, x)
rdfs11      (u, rdfs:subClassOf, v), (v, rdfs:subClassOf, x)         (u, rdfs:subClassOf, x)
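
Forward-chaining means applying these rules repeatedly until no new triple is generated. The following is a naive, single-machine sketch of that loop for the five rules listed here; it ignores distribution entirely, and the function names and sample facts are assumptions for illustration.

```python
TYPE = "rdf:type"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"
SUB_PROPERTY = "rdfs:subPropertyOf"
SUB_CLASS = "rdfs:subClassOf"


def infer_once(triples):
    """Apply rdfs2, rdfs3, rdfs5, rdfs9 and rdfs11 once to every pair of triples."""
    derived = set()
    for s1, p1, o1 in triples:
        for s2, p2, o2 in triples:
            if p1 == DOMAIN and p2 == s1:                    # rdfs2
                derived.add((s2, TYPE, o1))
            if p1 == RANGE and p2 == s1:                     # rdfs3
                derived.add((o2, TYPE, o1))
            if p1 == p2 == SUB_PROPERTY and o1 == s2:        # rdfs5
                derived.add((s1, SUB_PROPERTY, o2))
            if p1 == SUB_CLASS and p2 == TYPE and o2 == s1:  # rdfs9
                derived.add((s2, TYPE, o1))
            if p1 == p2 == SUB_CLASS and o1 == s2:           # rdfs11
                derived.add((s1, SUB_CLASS, o2))
    return derived


def forward_chain(triples):
    """Repeat rule application until no new triple appears (a fixpoint)."""
    closure = set(triples)
    while True:
        new = infer_once(closure) - closure
        if not new:
            return closure
        closure |= new


facts = {
    ("hasCapital", DOMAIN, "Country"),
    ("Country", SUB_CLASS, "Region"),
    ("Germany", "hasCapital", "Berlin"),
}
print(sorted(forward_chain(facts) - facts))
# [('Germany', 'rdf:type', 'Country'), ('Germany', 'rdf:type', 'Region')]
```

The second generated triple only appears in the second round, because rdfs9 needs the output of rdfs2 as its input.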


Node Architecture

Each node hosts multiple RDF databases:
• Local triples database
• Received triples database
• Replica database
• Generated triples database

[Figure: a node containing the Generated Triples, Local Triples, Received Triples and Replica databases]


Triple dissemination in DHT

[Figure: Node1, Node2, Node3 and Node4, each with Generated Triples, Local Triples, Received Triples and Replica databases]


Triple life-cycle

Triples are subject to different events, such as node joining and departure

Triple lifetime
• Long-lived triples need few refreshes
• Short-lived triples (generated triples)

Update of triples
Update of inferred triples
Soft state
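
The slide only names the soft-state idea. A common way to realise it, assumed here rather than taken from the slides, is to give every stored triple an expiry time and drop it unless it is refreshed. The class `SoftStateStore` and the lifetimes of 3600 and 60 seconds are hypothetical.

```python
class SoftStateStore:
    """Triples expire unless their owner refreshes them in time."""

    def __init__(self):
        self.expires_at = {}  # triple -> absolute expiry time (seconds)

    def put(self, triple, now: float, ttl: float) -> None:
        # Inserting and refreshing are the same operation in a soft-state design.
        self.expires_at[triple] = now + ttl

    def purge(self, now: float) -> None:
        self.expires_at = {t: e for t, e in self.expires_at.items() if e > now}

    def triples(self, now: float):
        self.purge(now)
        return set(self.expires_at)


store = SoftStateStore()
original = ("Germany", "hasCapital", "Berlin")
generated = ("Germany", "rdf:type", "Country")

store.put(original, now=0, ttl=3600)  # long-lived: few refreshes needed
store.put(generated, now=0, ttl=60)   # short-lived generated triple

print(store.triples(now=30) == {original, generated})  # True: both still alive
print(store.triples(now=120) == {original})            # True: generated one expired
```

With this design a triple whose source has disappeared vanishes on its own, and inferred triples derived from it stop being regenerated.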


Node Departure

Node substitution
Correction of routing table
Replica duty
Decreasing number of replicas

[Figure: nodes n1, n2, n3, n4 and n9]


Node Arrival

More complicated than node departure
Query receiving
Task of replica nodes
Time reduction

[Figure: nodes n1, n2, n3, n4, n6 and n9]


Load balancing

Major criticism of DHT-based RDF stores
Many collisions are unavoidable
Example:
• The DHT stores many triples with predicate rdf:type
• rdfs:subClassOf generates many more triples with predicate rdf:type

An overlay tree is built for discrete DHT positions, such as the one that stores the triples with predicate rdf:type
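
The rdf:type hot spot can be made concrete by counting how many triples land on each DHT key when every triple is indexed by subject, predicate and object. The sample triples are invented for illustration.

```python
from collections import Counter

triples = [
    ("Germany", "rdf:type", "Country"),
    ("France", "rdf:type", "Country"),
    ("Berlin", "rdf:type", "City"),
    ("Paris", "rdf:type", "City"),
    ("Germany", "hasCapital", "Berlin"),
    ("France", "hasCapital", "Paris"),
]

# Every triple is stored under its subject, predicate and object,
# so count how many triples each key has to hold.
load = Counter(term for triple in triples for term in triple)

print(load.most_common(1))  # [('rdf:type', 4)]
```

Even in this tiny data set the key rdf:type holds twice as many triples as any other key, and the gap grows with every instance that is added.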


Load-balancing with remote triples database

[Figure: Node1, Node2, Node3 and Node4, each with Local Triples, Received Triples and Generated Triples databases; Remote Triples databases are linked to them by references]


Replicated overlay tree

[Figure: overlay tree with Root, Rank1 and Rank2]


Query routing in overlay tree

[Figure: a Query enters the overlay tree (Root, Rank1, Rank2) and a Result is returned]
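
The slides do not spell out how the tree splits its data or forwards queries, so the following is only one possible reading, not the authors' algorithm. It assumes that an overloaded node hands its triples down to two children chosen by a hash of the triple, and that a query enters at the root and is forwarded to every child. `TreeNode`, `FANOUT`, `MAX_DEPTH` and the capacity of 2 are hypothetical.

```python
import hashlib

FANOUT = 2     # assumed number of children per split
MAX_DEPTH = 8  # assumed limit so that splitting always terminates


class TreeNode:
    """One node of a hypothetical overlay tree for a single hot DHT key."""

    def __init__(self, capacity: int, depth: int = 0):
        self.capacity = capacity
        self.depth = depth
        self.triples = set()
        self.children = []  # empty until this node overflows

    def _child_for(self, triple):
        # Assumed split rule: a different hash byte per tree level.
        digest = hashlib.sha1(" ".join(triple).encode("utf-8")).digest()
        return self.children[digest[self.depth] % FANOUT]

    def insert(self, triple) -> None:
        if self.children:
            self._child_for(triple).insert(triple)
            return
        self.triples.add(triple)
        if len(self.triples) > self.capacity and self.depth < MAX_DEPTH:
            # Overloaded: recruit children and hand all triples down to them.
            self.children = [
                TreeNode(self.capacity, self.depth + 1) for _ in range(FANOUT)
            ]
            pending, self.triples = self.triples, set()
            for t in pending:
                self._child_for(t).insert(t)

    def query(self, predicate):
        # The query enters at the root; partial results are merged on the way back.
        found = {t for t in self.triples if t[1] == predicate}
        for child in self.children:
            found |= child.query(predicate)
        return found


root = TreeNode(capacity=2)
data = [(name, "rdf:type", "City") for name in ("Berlin", "Paris", "Rome", "Oslo")]
for triple in data:
    root.insert(triple)
print(root.query("rdf:type") == set(data))  # True
```

The client still addresses a single DHT position, while storage and query work for that position are spread over several nodes.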


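Handling RDFS rules in load balancing

Problem with RDFS rules
• When a node is overloaded, its triples are split among other nodes
• Example: rule rdfs2 needs (a, rdfs:domain, x) and (u, a, v) on the same node, so a node that holds (u, a, v) without (a, rdfs:domain, x) cannot apply it

[Figure: the triples (a, rdfs:domain, x) and (u, a, v) spread over Node1, Node2 and Node3]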


Handling RDFS rules in load balancing

Solution
• Copy the most common RDFS schema triples to each node in the overlay tree

[Figure: Node1, Node2, Node3 and Node4, each holding a copy of (a, rdfs:domain, x); the (u, a, v) triples are spread over the nodes]
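
The effect of copying the schema triple can be shown with rule rdfs2 applied locally on each node. The node contents and the helper `apply_rdfs2` are assumptions for illustration.

```python
TYPE, DOMAIN = "rdf:type", "rdfs:domain"


def apply_rdfs2(local_triples):
    """rdfs2 on one node: (a, rdfs:domain, x) and (u, a, v) give (u, rdf:type, x)."""
    domains = {(s, o) for s, p, o in local_triples if p == DOMAIN}
    return {(u, TYPE, x) for a, x in domains for u, p, _ in local_triples if p == a}


schema = ("hasCapital", DOMAIN, "Country")
node1 = {schema, ("Germany", "hasCapital", "Berlin")}
node2 = {("France", "hasCapital", "Paris")}  # received instance triples only

print(apply_rdfs2(node2))             # set(): the rule cannot fire on node2
print(apply_rdfs2(node2 | {schema}))  # {('France', 'rdf:type', 'Country')}
```

Schema triples are few compared with instance triples, so replicating them costs little storage while keeping every node able to reason locally.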


Conclusion

P2P-based distributed databases offer better scalability and source integration
The real power of RDF stems from the possibility of deriving new data from explicit knowledge
The overlay tree is the solution to the overloading problem


References

http://www.videolectures.net
http://cone.informatik.uni-freiburg.de
http://www.w3schools.com
http://www.w3.org/TR/rdf-schema/
http://peersim.sourceforge.net/
http://infolab.stanford.edu
http://www.edutella.org/edutella.shtml
Battre, Heine, Kao: Top k RDF query evaluation in p2p


Thanks for your Attention