What to Send First? A Study of Utility in the Semantic Web

Mike Dean¹, Prithwish Basu¹, Ben Carterette², Craig Partridge¹, and James Hendler³

¹ Raytheon BBN Technologies
² University of Delaware
³ Rensselaer Polytechnic Institute

Joint Large and Heterogeneous Data and Quantified Formalization Workshop (LHD+SemQuant 2012)
Boston, Massachusetts
12 November 2012

Copyright 2012 Raytheon BBN Technologies




Outline

• Problem
• Our Solution
• Future Work


Problem

• Transfer a knowledge base in a constrained or intermittent communication environment
  – Tactical military
  – Large football game or conference
• Send the most important information first
  – Prioritize statements based on their utility
• Account for inference
  – No need to transfer inferred statements


Utility

• The utility of a statement can be calculated by a preference function U(S, s), where S is the set of statements in a knowledge base and s ∈ S
• Somewhat arbitrarily
  – Utility ranges from 0 to 1
  – The total utility of all statements in S should equal 1


Preference Functions

• Ideally, users would provide a preference function suitable for a given context
  – Difficult to extract or derive
• Need a default preference function when nothing more specific is available
• We selected inverse frequency as the default
• Motivations
  – Surprise in previous research on Semantic Information Theory
  – Term frequency-inverse document frequency (TF-IDF) in Information Retrieval systems


RDF Utility

• We consider each URI and literal to be a symbol
• We compute the utility of a statement by averaging the inverse frequencies of its subject, predicate, and object components and then normalizing the results
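The computation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not say how symbol frequency is counted, so the sketch counts every occurrence of a symbol in any position across the whole KB, and the function name `inverse_frequency_utility` and the three-statement KB are hypothetical.

```python
from collections import Counter


def inverse_frequency_utility(statements):
    """Map each (subject, predicate, object) triple to a utility; utilities sum to 1.

    Assumption: a symbol's frequency is its number of occurrences,
    in any position, over all statements in the KB.
    """
    counts = Counter(symbol for stmt in statements for symbol in stmt)
    # Average the inverse frequencies of the three components.
    raw = {
        stmt: sum(1.0 / counts[symbol] for symbol in stmt) / len(stmt)
        for stmt in statements
    }
    # Normalize so that the total utility of the KB equals 1.
    total = sum(raw.values())
    return {stmt: value / total for stmt, value in raw.items()}


# Hypothetical three-statement KB, written as plain string triples.
kb = [
    (":A", "rdfs:subClassOf", ":B"),
    (":B", "rdfs:subClassOf", ":C"),
    (":a", "rdf:type", ":A"),
]
utility = inverse_frequency_utility(kb)
for stmt, value in sorted(utility.items(), key=lambda item: -item[1]):
    print(f"{value:.3f}  {' '.join(stmt)}")
```

In this toy KB the `rdf:type` statement ranks highest because two of its three symbols occur only once, which matches the intuition that rarer symbols are more surprising.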


Inference

• Statements can be used to infer other statements
  – We want to quantify this by computing the inference contribution of each of these statements
• Statements can have different utilities in different KBs
  – We’re particularly interested in the initial (ground) KB and its deductive closure
• The total inference contribution is 1 minus the summed utility of the ground statements in the deductive closure
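One way to write the last bullet as a formula, reading the dash as a minus sign and "each" as a sum over ground statements. The symbols G (ground KB), S* (its deductive closure), and IC are notation assumed here for illustration; they do not appear in the slides.

```latex
\sum_{s \in S^{*}} U(S^{*}, s) = 1,
\qquad
\mathrm{IC}_{\mathrm{total}}
  = 1 - \sum_{g \in G} U(S^{*}, g)
  = \sum_{s \in S^{*} \setminus G} U(S^{*}, s)
```

Under this reading, the total inference contribution is simply the share of utility in the closure that belongs to inferred statements.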


Framework

• An experiment consists of
  – A set of statements (KB)
  – An inference procedure (we used RDF Schema)
  – A preference function (we used inverse frequency)
  – A statement ranking function, which uses various computed values
• Implemented using Jena, its rule-based reasoner, and its Derivation interface
• We accumulate utility in the deductive closure as statements are transmitted and inferred
  – Generate a transcript and a cumulative utility graph
  – An experiment can be summarized by its average cumulative utility
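The accumulation step can be sketched as follows. This is not the authors' Jena implementation: it is a minimal Python model that assumes the receiver re-runs inference after every transmitted statement and that "average cumulative utility" is the mean of the resulting curve. The names `cumulative_utility`, `average_cumulative_utility`, and `toy_infer`, and the toy utilities, are hypothetical.

```python
def cumulative_utility(order, closure_utility, infer):
    """Utility available to the receiver after each transmitted statement.

    order           -- ground statements in transmission order
    closure_utility -- utility of every statement in the deductive closure
    infer           -- function mapping a set of statements to its closure
    """
    received, curve = set(), []
    for stmt in order:
        received.add(stmt)
        known = infer(received)  # received statements plus what they entail
        curve.append(sum(closure_utility[s] for s in known))
    return curve


def average_cumulative_utility(curve):
    """Summarize one experiment by the mean height of its curve (assumed definition)."""
    return sum(curve) / len(curve)


# Toy inference procedure: "p" and "q" together entail "r".
def toy_infer(received):
    known = set(received)
    if {"p", "q"} <= known:
        known.add("r")
    return known


toy_utility = {"p": 0.2, "q": 0.3, "r": 0.5}
curve = cumulative_utility(["p", "q"], toy_utility, toy_infer)
print(curve, average_cumulative_utility(curve))  # roughly [0.2, 1.0] and 0.6
```

Passing the inference procedure in as a function mirrors the framework's separation of KB, inference procedure, preference function, and ranking function.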


Data Sets

• POTUS – Wikipedia information about Presidents of the United States

• FOAF – My FOAF profile + FOAF vocabulary
• Cascade – Discussed later

• Data sets and code are available at http://asio.bbn.com/2012/04/utility/


Ranking

• Gold standard: ranking by utility in the deductive closure + inference contribution
• Inference contribution is rather difficult and expensive to compute
  – Most reasoners provide only one justification for an inferred statement, not all of them
• Also tried several heuristics
  – Utility in the initial KB
  – Utility in the deductive closure
  – Random TBox, then random ABox
• Base case: random


Results of Different Ranking Functions

• Cumulative utility for 262 statements in the POTUS data set


Observations

• We can effectively order statements to increase or maximize average cumulative utility
• Using inverse frequency
  – Inferred RDFS statements are of lower utility
  – Matches intuitions and practice regarding rdfs:Resource, etc.
• Ranking based on simpler heuristics appears promising
  – More research is needed


Cascade Data Set

• @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
  @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
  @prefix : <http://example.org/cascade#> .

  :A rdfs:subClassOf :B .
  :B rdfs:subClassOf :C .
  :C rdfs:subClassOf :D .
  :D rdfs:subClassOf :E .

  :a rdf:type :A .
• Possible to analyze all 5! = 120 possible permutations
• What order do you think is best?
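To get a feel for why transmission order matters here, the sketch below enumerates all 120 orders and counts how many statements the receiver knows after each send. It is a simplified stand-in, not the experiment itself: it applies only two RDFS entailment rules (rdfs9, type propagation, and rdfs11, subclass transitivity), whereas a full RDFS reasoner such as Jena's derives additional triples, and it counts statements rather than weighting them by utility. All function names are hypothetical.

```python
from itertools import permutations

SUBCLASS, TYPE = "rdfs:subClassOf", "rdf:type"

# The five ground statements of the cascade data set, as string triples.
CASCADE = [
    (":A", SUBCLASS, ":B"),
    (":B", SUBCLASS, ":C"),
    (":C", SUBCLASS, ":D"),
    (":D", SUBCLASS, ":E"),
    (":a", TYPE, ":A"),
]


def rdfs_closure(statements):
    """Closure under rdfs9 and rdfs11 only (a deliberately restricted rule set)."""
    known = set(statements)
    while True:
        new = set()
        for s1, p1, o1 in known:
            for s2, p2, o2 in known:
                if p2 != SUBCLASS or o1 != s2:
                    continue
                if p1 == SUBCLASS:   # rdfs11: subclass transitivity
                    new.add((s1, SUBCLASS, o2))
                elif p1 == TYPE:     # rdfs9: instance of a subclass
                    new.add((s1, TYPE, o2))
        if new <= known:             # fixed point reached
            return known
        known |= new


def known_after_each_send(order):
    """Number of statements the receiver knows after each transmission."""
    received, sizes = [], []
    for stmt in order:
        received.append(stmt)
        sizes.append(len(rdfs_closure(received)))
    return sizes


orders = list(permutations(CASCADE))
best = max(orders, key=lambda order: sum(known_after_each_send(order)))
print(len(orders), "orders; one order that reveals statements earliest:")
for stmt in best:
    print("  ", " ".join(stmt))
```

Every order ends with the same closure, so the differences lie entirely in how quickly inferred statements become available along the way.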


Cascade Data Set (2)

• Average Cumulative Utility for all 120 permutations of cascade statements


Cascade Data Set (3)

• Statements
  0. :D rdfs:subClassOf :E .
  1. :B rdfs:subClassOf :C .
  2. :C rdfs:subClassOf :D .
  3. :a rdf:type :A .
  4. :A rdfs:subClassOf :B .
• Best results: average cumulative utility 0.639
  – 01423
  – 04123
  – 04213
  – 10423
  – 24013
  – 40123
  – 40213
  – 42013


Contributions

• Introducing utility into the Semantic Web
• Quantifying inference
• A new problem
• An evaluation framework


Future Directions

• Incorporating user-defined preferences
• Employing more sophisticated inference (e.g. OWL RL)
• Working with (much) larger data sets
• Generalizing our framework into a toolkit
• Considering bits required to encode messages
• Addressing multi-party situations with different preference functions
• Modeling information fusion
