enterprise information integration using semantic web ... · enterprise information integration...

43
© 2008 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software Semantic Technology Conference 20-May-2008 In collaboration with Steve Battle, HP Labs Latest version of these slides: http://dbooth.org/2008/stc/slides.ppt

Upload: others

Post on 18-Jul-2020

13 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

© 2008 Hewlett-Packard Development Company, L.P.The information contained herein is subject to change without notice

Enterprise Information Integration using Semantic Web Technologies:

RDF as the Lingua FrancaDavid Booth, Ph.D.

HP SoftwareSemantic Technology Conference 20-May-2008

In collaboration with Steve Battle, HP Labs

Latest version of these slides:http://dbooth.org/2008/stc/slides.ppt

Page 2: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

2

Disclaimer

This work reflects research and is presented for discussion purposes only. No product commitment whatsoever is expressed or implied. Furthermore, views expressed herein are those of the author and do not necessarily reflect those of HP.

Page 3: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

3

Outline• PART 0: The problem• PART 1: RDF: The lingua franca for information exchange

• Why1. Focus on semantics2. Easier data integration3. Easier to bridge other formats/models4. Looser coupling

• How1. RDF message semantics2. REST-based SPARQL endpoints3. XML with GRDDL transformations4. Aggregators

• PART 2: POC: A SPARQL adaptor for UCMDB• What is UCMDB• SPARQL adaptor

Page 4: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

4

PART 0

The problem

Page 5: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

5

Problem 1: Integration complexity• Multiple producers/consumers need to share data• Tight coupling hampers independent versioning

ProvisioningDiscovery

ChangeManagement

Compliance Management IncidentManagement

ReleaseManagement

Monitoring

Ticketing

Release Managers

Unix SystemAdministrators

Networking Engineers

NetworkingAdministrators

ComplianceManagers

Windows SystemAdministrators

Operation Centers

StorageAdministrators

Source Control

Page 6: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

6

Problem 2: Babelization• Proliferation of data models (XML schemas, etc.)• Parsing issues influence data models• No consistent semantics• Data chaos

Tower of Babel, Abel Grimmer (1570-1619)

Page 7: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

7

PART 1

RDF: The lingua franca for information exchange

Page 8: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

8

Why?Four reasons . . .

Page 9: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

9

Why?1. Focus on semantics• XML:

•Schema is focused on how to serialize• Constrains more than the model

•Parent/child and sibling relationships are not named• Are their semantics documented? E.g., does sibling order matter?

• RDF:•One URI per concept•Syntax independent

• Who cares about syntax?

Page 10: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

10

Why?2. Easier data integration

Page 11: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

11

Why?2. Easier data integration• Blue App has model

Page 12: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

12

Why?2. Easier data integration• Red App has model

• Need to integrate Red & Blue models

Page 13: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

13

Why?2. Easier data integration• Step 1: Merge RDF• Same nodes (URIs) join automatically

Page 14: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

14

Why?2. Easier data integration• Step 2: Add relationships and rules• (Relationships are also RDF)

Page 15: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

15

Why?2. Easier data integration

• Step 3: Define Green model• (Making use of Red

& Blue models)

Page 16: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

16

Why?2. Easier data integration• What the Blue app sees:

•No difference!

Page 17: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

17

Why?2. Easier data integration• What the Red app sees• No difference!

Page 18: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

18

Why?3. RDF helps bridge other formats/models• Producers and consumers may use different formats/models• Rules can specify transformations• Inference engine finds path to desired result model

RDFModelTransform

A1

A2

A3

B1

B2

C1

C2

X

Y

Z Ontologies& RulesOntologies& RulesOntologies& Rules

Page 19: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

19

Why?4. Looser coupling• Without breaking consumers:

•Ontologies can be mixed and extended•Triples can be added

• Producer & consumer can be versioned more independently

Page 20: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

20

Example of looser coupling• RedCust and GreenCust ontologies added• Blue app is not affected

(Blue app)

Consumer Producer

Page 21: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

21

How?Four ways . . .

Page 22: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

22

How?1. RDF message semantics• Interface contract specifies RDF, regardless of

serialization• RDF pins the semantics

Consumer ProducerRDF

Page 23: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

23

How?2. REST-based SPARQL endpoints

Consumer Producer

SPA

RQLRDF

HTTP

Page 24: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

24

REST-based SPARQL endpoints• Why REST:

•HTTP is ubiquitous•Simpler than SOAP-based Web services (WS*)•Looser process coupling

Page 25: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

25

REST-based SPARQL endpoints • Why SPARQL:

•One endpoint supports multiple data needs• Each consumer gets what it wants

• Insulates consumers from internal model changes• Inferencing transforms data to consumer's desired model • Looser data coupling

Page 26: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

26

How?3. XML with GRDDL transformations• GRDDL is a W3C standard• GRDDL permits RDF to be "gleaned" from XML

•XML document or schema specifies desired GRDDL transformation

•GRDDL transformation produces RDF from XML document

•Mostly intended for getting microformat and other data/metadata from HTML pages

Page 27: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

27

Using GRDDL for XML document semantics• Each XML format can be viewed as a custom serialization

of RDF!• GRDDL transformation produces semantics of the XML document

• Helps bridge XML and RDF worlds• Same XML document can be consumed by:

• Legacy XML app• RDF app

• App interface contract can specify RDF• Serializations can vary• Semantics are pinned by RDF

Page 28: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

28

Using GRDDL for XML document semantics

See: http://dbooth.org/2007/rdf-and-soa/rdf-and-soa-paper.htm

Client

Normalizeto RDF

Serialize asXML/other/RDF

RDF Engine/ Store

Service

Core AppProcessing

Page 29: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

29

How?4. Aggregators• Gets data from multiple sources• Provides data to consumers

Aggregator

A1

A2

A3

B1

B2

C1

C2

X

Y

Z Ontologies& RulesOntologies& RulesOntologies& Rules

SPA

RQL

Page 30: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

30

Aggregator• Conceptual component

•Not necessarily a separate physical service

• Handles mechanics of getting data•Different adaptors for different sources

• REST, WS*, Relational, XML, etc.• Diverse data models

•Might do caching and query distribution (federation)

• Provides model transformation•Plug in ontologies and inference rules as needed

Page 31: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

31

PART 2

Proof-of-Concept: A SPARQL adaptor for UCMDB

Page 32: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

32

IT Service Management (ITSM)• Manage IT environment• Configuration

Management Data Base (CMDB) is central

Page 33: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

33

The HP Universal CMDB (UCMDB)

CMDB : Configuration Management DB

Goal:• Maintain a comprehensive and

current record of all configuration items (CIs) and their relationships

Page 34: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

34

Example: host information

Host propertiesHost properties

One particular host machine

One particular host machine

http://cmdb.mercury.com#nt.35014541

Page 35: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

35

SPARQL adaptor• Uses existing SOAP

interface to UCMDB• Enables SPARQL

queries• Results can be RDF• No model

transformation (yet)SOAP interface

SPARQL adaptor

HP UCMDB

Page 36: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

36

Architecture of SPARQL adaptor

SOAP interface

SPARQL adaptor

HP UCMDB

SPARQLSPARQL compile compile TQLTQL

XMLXMLRDF RDF lift lift

Database

CMDB metadata CMDB metadata export export

submitsubmit

OWLOWL

Design Time

Run Time

Page 37: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

37

UCMDB ontology• The HP UCMDB ontology

defines CI types and relationship hierarchies.

• Derived automatically from HP UCMDB metadata.

Page 38: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

38

Jena based implementation• Jena, ARQ, Joseki developed at HP Labs*.

Jena : Semantic Web toolkit RDF, OWL, inference

ARQ : Query Engine

SPARQL queryalgebra, evaluationJoseki : SPARQL server

SPARQL protocol

* http://www.hpl.hp.com/semweb/

Page 39: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

39

Query returning a table

"MBADIR-IL" ^^<http://www.w3.org/2001/XMLSchema#string>

"JONI" ^^<http://www.w3.org/2001/XMLSchema#string>

"ILDTRD129" ^^<http://www.w3.org/2001/XMLSchema#string>

host_name

Select the names of host servers on the network with addresses from 192.168.81.0SELECT ?host_nameWHERE {

[ a object:network ] attr:network_netaddr "192.168.81.0" ;link:member [ a object:host ; attr:host_dnsname ?host_name

] }

Page 40: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

40

Query returning an RDF subgraph

Describe a network (192.168.81.0) with host servers containing a DB.

Describe a network (192.168.81.0) with host servers containing a DB.

Page 41: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

41

Example RDF result set

DatabaseDatabase

HostHost

Page 42: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

42

Outline• PART 0: The problem• PART 1: RDF: The lingua franca for information exchange

• Why1. Focus on semantics2. Easier data integration3. Easier to bridge other formats/models4. Looser coupling

• How1. RDF message semantics2. REST-based SPARQL endpoints3. XML with GRDDL transformations4. Aggregators

• PART 2: POC: A SPARQL adaptor for UCMDB• What is UCMDB• SPARQL adaptor

Page 43: Enterprise Information Integration using Semantic Web ... · Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca David Booth, Ph.D. HP Software

© 2008 Hewlett-Packard Development Company, L.P.The information contained herein is subject to change without notice

Questions?