Page 1: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

1

Open Ontology Repository Session
OOR-Team Presentation

Ontology Summit 2008
Interoperability Week, NIST
Gaithersburg, MD

MikeDean, LeoObrst, PeterYim, et al.
April 29, 2008

v 1.02

Page 2: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

2

Agenda: Presenting the OOR Initiative

1) What is the OOR? Overview, rationale and motivations
  – Leo Obrst

2) What do users expect? How do these needs align with the rationale?
  – Ken Baclawski

3) How do these needs translate into OOR system requirements? How do these satisfy the rationale?
  – Evan Wallace

4) What are some existing efforts? How do these address or satisfy the rationale?
  – Bruce Bargmeyer

5) What is the roadmap to developing/delivering these requirements in an OOR implementation effort? How does the roadmap satisfy the rationale?
  – Mike Dean

Page 3: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

3

Overview, Rationale & Motivations

Leo Obrst

Page 4: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

4

Overview

• Recognizing the need for an Open Ontology Repository, the co-conveners got their act together:
  – 2001 DAML Ontology Library (Mike Dean)
  – 2005 MITRE Study on OWL/RDF Registry & Repository (Leo Obrst)
  – 2002/2005 CIM3-CWE / CODS initiative (Peter Yim)

• 2008-01-03: Open Ontology Repository initiative - Planning Meeting
  – Proposed to have OOR as the theme for Ontology Summit 2008

• 2008-01-23: OOR Initiative - Founding Members Conference Call
  – “Open Ontology Repository (OOR) Initiative” came into being, with about 40 participants (active participants, as well as observers)
  – Team adopts Mission Statement

• 2008-02-07: OOR team adopts their “Ontology Repository” definition

• 2008-02-28 to 2008-04-10: joined with the OntologySummit2008 effort and co-organized four OOR-Panel Sessions
  – Covering: “Technology Landscape,” “Expectations & Requirements,” and “Ontology of Ontologies”

• OOR team virtual activities are being hosted within the Ontolog collaborative work environment, for the time being

Page 5: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

5

The charter of the Open Ontology Repository (OOR) Initiative is to promote the global use and sharing of ontologies by:

• 1. establishing a hosted registry-repository;
• 2. enabling and facilitating open, federated, collaborative ontology repositories, and
• 3. establishing best practices for expressing interoperable ontology and taxonomy work in registry-repositories.

where, “An ontology repository is a facility where ontologies and related information artifacts can be stored, retrieved and managed.” -- definition as adopted by the OOR-team / 2008.02.07

Homepage: http://ontolog.cim3.net/cgi-bin/wiki.pl?OpenOntologyRepository

Page 6: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

6

Rationale & Motivations (1)

• Why are we interested in an OOR and what purpose does it serve?
• Isn’t the Semantic Web notion of distributed islands of semantics sufficient as a de facto repository?
  – If you put it out there, will they come?
  – If you build it better and put it out there, will they prefer yours?
  – History does not show that this laissez-faire “field of dreams” approach works in reality
  – The "clickable" web has been very successful in employing this strategy for HTML documents
  – However, the use and content of the semantic web have different characteristics that make it far less tolerant of the change and frequent errors which are commonplace on the clickable web.

• Distinguishing characteristics of the Semantic Web
  – Machines rather than humans are the primary consumers of content. Errors that a human may be able to diagnose and fix (such as a change in location of a document) are likely fatal for machine processing
  – The use of owl:imports creates a strong transitive dependency between ontology documents; changes in any imported document (imported directly or through nested import) can cause the resulting import closure to be inconsistent or to change its meaning or computational characteristics significantly.
  – Ontologies convey a precise meaning with an unambiguous machine interpretation. This means that, when using this content, careful selection and precise reference are critical.
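The transitive dependency created by owl:imports can be illustrated with a small sketch. It assumes the import relationships have already been extracted into a plain dictionary; the IRIs and the names `import_closure` and `affected_by` are hypothetical and are not part of any OOR design.

```python
from collections import deque

def import_closure(root, imports):
    """Return every ontology IRI reachable from `root` via imports (root included)."""
    seen = {root}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for target in imports.get(current, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

def affected_by(changed, imports):
    """Return every ontology whose import closure contains `changed`."""
    return {iri for iri in imports if changed in import_closure(iri, imports)}

# Hypothetical import graph: app imports domain, domain imports time.
IMPORTS = {
    "http://example.org/app": ["http://example.org/domain"],
    "http://example.org/domain": ["http://example.org/time"],
    "http://example.org/time": [],
    "http://example.org/other": [],
}

closure = import_closure("http://example.org/app", IMPORTS)
impacted = affected_by("http://example.org/time", IMPORTS)
```

Here a change to the `time` ontology reaches `app` even though `app` never imports it directly, which is the nested-import case the slide warns about.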

Page 7: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

7

Rationale & Motivations (2)

• Value Added to the content by an Open Ontology Repository/Registry:
  – The OOR is reliably available
  – The OOR is persistent and sustainable, so you can be confident when committing to its use
  – The OOR has information about when, why, and how an ontology has changed, so you can be aware of changes that may affect its usability
  – You can find ontologies easily
  – Ontologies are registered, so you know who built them
  – Metadata provides the ontology purpose, KR language, user group, content subject area, etc.
  – The OOR includes mappings, so you can connect ontologies to other ontologies
  – The OOR content has quality and value, as gauged by recognized criteria
  – The OOR supports services, so that ontologies can map and be mapped, find and be found, review/certify and be reviewed/certified; you can hook in your own services and use the services others have hooked in
  – Ontologies can reuse or extend other ontologies, including common middle and upper ontologies
  – The OOR can be easily extended

• Ref. also – opening post to the OOR-team ref “definition: registry vs. repository; goals, etc.” - http://ontolog.cim3.net/forum/oor-forum/2008-01/msg00016.html

Page 8: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

8

Open Ontology Repository

User Needs & Requirements

Ken Baclawski

Page 9: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

9

“OOR” - what is in scope?

• Repository: "An ontology repository is a facility where ontologies and related information artifacts can be stored, retrieved and managed"
  – first: the persistent store for ontologies
  – then, the registry for ontologies in the repository
  – then progressively, the value-added services

• Ontology:
  – all types of artifacts on the ontology spectrum
    • from folksonomies, terminologies, controlled vocabularies, taxonomies, thesauri, ... to data-schema, data-models ... to OWL ontologies ... and, axiomatized logical theories
    • from shared understanding ... to ontological commitments ... to the future of standards

• Open:
  – open access; compliance with open standards; open technology (with open source); open knowledge (open content); open collaboration (with transparent community process)
  – open to integration with “non-open” repositories via an open interface

Page 10: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

10

OOR Users Needs (1)

• Who are the users of an OOR
  – ontology developers (individuals or distributed teams)
  – ontology centers and institutions
  – end-users (human) who need to search/browse an ontology
  – software agents (machine) that need to use the ontologies
  – application developers

• When
  – design time
  – run-time (dynamic, real-time, on-the-fly, ...)

Page 11: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

11

OOR Users Needs (2)

Through the two virtual panel sessions and our online discourse, we heard from experts among the panelists and participants coming from different domains:

• 2008_03_27 - Thursday: Joint OOR-OntologySummit2008 Panel Discussion: "An Open Ontology Repository: Rationale, Expectations & Requirements - Session-1" - Chair: LeoObrst & FabianNeuhaus; Panelists: WilliamBug, EvanWallace, JohnLMcCarthy, KenBaclawski, PeterBenson & RexBrooks - http://ontolog.cim3.net/cgi-bin/wiki.pl?ConferenceCall_2008_03_27

• 2008_04_03 - Thursday: Joint OOR-OntologySummit2008 Panel Discussion: "An Open Ontology Repository: Rationale, Expectations & Requirements - Session-2" - Chair: LeoObrst & FabianNeuhaus; Panelists: DougLenat, DekeSmith, MarciaZeng, DeniseBedford, PatHayes, MalaMehrotra & RobRaskin - http://ontolog.cim3.net/cgi-bin/wiki.pl?ConferenceCall_2008_04_03

Page 12: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

12

OOR Users Needs (3)

The top needs came out to be:

• a well-maintained persistent store (with high availability and performance) where ontological work can be stored, shared and accessed
• having “ontologies” properly registered and “governed,” with provenance and versioning support, and made available (logically) in one place so that they can be browsed, discovered, queried, analysed, validated and reused
• allowing ontologies to be “open” and unencumbered by IPR constraints, in terms of access and reuse
• services that can be provided across disparate ontological artifacts to support cross-domain interoperability, mapping, application and making inferences … and having such semantic services be properly registered and available to support peer OORs
• (in addition to the panel proceedings cited above) ref. IM chat/discussion: http://ontolog.cim3.net/cgi-bin/wiki.pl?ConferenceCall_2008_03_27#nid1CJJ and, for example, input from AndrewSchain (NASA/HQ): http://ontolog.cim3.net/forum/ontology-summit/2008-04/msg00010.html

Page 13: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

13

A sample of the input on Needs and Expectations

… based on summary slides received from some of the OOR-Panelists on the 2008.03.27 & 2008.04.03 “Requirements Panel” Sessions

References to the Rationale are shown in green boxes

Page 14: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

“What is impossible to do right now, but, if you could do it, would fundamentally change your business?”

1990 Joel Arthur Barker

Codification at source!
  – Common metadata (ISO 22745-20/eOTD)
    • an end to data mapping
  – Requirement specifications (ISO 22745-30/eOTD-i-xml)
    • an end to incomplete data
  – Data provenance (ISO 8000-120)
    • an end to inaccurate information

Vision of the Future - Peter Benson

NATO codification system as the foundation for the eOTD, ISO 22745 and ISO 8000

Faster data – Better data – Cheaper data

Quality and value

Metadata

Reliable, available

Page 15: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

Codification at Source - Peter Benson
Using standards to automate the data supply chain

[Diagram labels: Data requestor; Data provider; Sub-Tier; eOTD-i-xml (data requirements statement), ISO 22745-30; eOTD-q-xml (query), ISO 22745-35; eOTD-r-xml (data exchange), ISO 22745-40; Sub-Tier eOTD-q-xml; Sub-Tier eOTD-r-xml]

Faster data – Better data – Cheaper data

Page 16: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

16

Rex Brooks: Content Provider-Repository Builder
Focus on Architecture, Registry-Repository & Emergency Data Exchange Language Reference Information Model (EDXL-RIM)

Find Ontologies

Ontology Registration

Page 18: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

18

Developer Requirements - Neil Sarkar

Must have the ability to browse and query small segments of an ontology.

Good to have the ability to dynamically curate and suggest changes via the user community.

Ideally, it can be used to navigate across inferred information that is associated with a small set of terms and that comes from many ontologies.

Ontology Services

Change Management

Ontology Mapping

Page 19: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

19

End User Requirements – Neil Sarkar

Must have
  – Ability to efficiently navigate multiple hierarchies
  – Consistency across multiple ontologies

Good to have
  – Ability to provide live feedback
  – Ability to annotate relationships or propose new terms

Ideally, it can
  – Support scientific hypothesis testing

Ontology Mapping

Change Management

OOR Extensions

Metadata

Page 20: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

20

OOR needs for content / application providers - Mala Mehrotra

• Content developers: Discover related terms/axioms/models for reuse
  – Context – collaboration groups of concepts
    • region (geographic, biological, political)
  – Depth/detail
    • month in SUMO vs. monthDescription in DAML time ontologies
  – Differences in competing models
    • TimeInterval in SUMO vs DurationDescription in DAML
  – Degree of Crossover/Overlap
    • More than just imports closure
    • Orthogonality measures across ontologies

• Application developers: Interoperate using multiple ontologies
  – Create formalized mapping relationships
  – Find mapping relationships
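As a rough illustration of a "degree of crossover/overlap" measure, the sketch below computes a Jaccard index over normalized term labels. This is a crude lexical proxy chosen for illustration, not a measure proposed in the panel input; the term lists are invented and are not taken from SUMO or the DAML time ontologies.

```python
def normalize(label):
    """Lower-case a label and drop separators so 'Time_Interval' matches 'timeInterval'."""
    return "".join(ch for ch in label.lower() if ch.isalnum())

def term_overlap(terms_a, terms_b):
    """Jaccard index of two term lists: shared labels divided by all distinct labels."""
    a = {normalize(t) for t in terms_a}
    b = {normalize(t) for t in terms_b}
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical vocabularies for two time ontologies.
ONTOLOGY_A = ["Month", "Day", "TimeInterval"]
ONTOLOGY_B = ["month", "day", "DurationDescription"]

score = term_overlap(ONTOLOGY_A, ONTOLOGY_B)
```

A label-based score like this misses competing models that use different names for the same concept, which is why the slide asks for more than lexical or imports-based comparison.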

Reuse and Extension

Ontology Mapping

Page 21: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

21

Infrastructure Needs Mala Mehrotra

• Cognitive Tools for discovery– Collaborating groups of concepts used in applications

– Implicit relationships across resources

– Ontological/Taxonomy hierarchy browsing

– Human-machine collaboration mode

• Mapping Tools for capturing inter-resources’ relationships

• Need formal representation of relationships for reasoners

– A large repertoire of relationships

– Multiple ontological representations

– Mechanisms to represent formalism in human-readable form

• Pragati’s Exposé tool addresses these issues www.pragati-inc.com

Reuse and Extension

Ontology Mapping

Page 22: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

22

What We’d Want a Good Host to Provide - Doug Lenat, Cycorp

• A commitment to use – to have contributors all provide content under – some Creative Commons license, as opposed to e.g. a GNU license
• Retention of the provenance/lineage of contributed ontological content
• Agreement on some of the most fundamental ontological relations
• Agreement on a small set of inter-ontology alignment relations

Content that Cycorp could provide to be hosted:
* OpenCyc (www.opencyc.org) 100% free even for commercial purposes
* ResearchCyc (researchcyc.cyc.com) free for R&D purposes
In both cases, there are ontologies plus inference engines and API-level and graphical interface tools

Meta-level message: Look at OKKAM, LarKC, etc., and decide what role, if any, OOR can/should play, and how it should tie in with those other efforts.

Unencumbered by IP constraints

Registration, Metadata

Ontology Mapping

Page 23: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

23

What’s in OpenCyc - Doug Lenat

• (#$isa 596215)
• (#$genls 99198)
• (#$disjointWith 6114)
• (#$resultIsa 4277)
• (#$resultGenl 1206)
• (#$argIsa 35617)
• (#$argGenl 5398)
• (#$arg1Isa 16748)
• (#$arg1Genl 2354)
• (#$arg2Isa 14114)
• (#$arg2Genl 2283)
• (#$arg3Isa 3486)
• (#$argFormat 5493)
• (#$arg2Format 3320)
• (#$functionalInArgs 1427)
• (#$arity 16416)
• (#$arityMin 958)
• (#$comment 57305)
• (#$genlPreds 7440)
• (#$negationInverse 990)
• (#$genlMt 26078)
• (#$denotationInEnglish 409745)
• (#$synonymousExternalConcept 13916)

Explicitly: 300k terms; 14k predicates; 57k classes; 2 million assertions
Implicitly: There are infinitely more nonatomic terms and inferred assertions
More subtle but crucial point: There are infinitely many contexts (microtheories) defined compositionally rather than having only explicitly reified contexts

This means there are 596k “isa” assertions in OpenCyc

E.g., mapping between a term in OpenCyc and a WordNet synset

Page 24: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

24

Needs vs. Rationale

• The “Needs and Expectations” map well to the “Rationale” cited in the previous slide
• A community OOR will provide us with (from slide #11 from Denise Bedford's 2008.04.03 brief)
  – Knowledge value
  – Collaboration value
  – Shared process value
• However, further to the intellectual discourse on what an OOR should be, the implementors of the OOR will also need to answer questions like:
  – How could we make sure the OOR is still around in 100 years?
  – What can be done about assuring the sustainability of the resources, expertise, quality of the ontologies in the OOR and the services provided?
  – How can we ensure long term value, commitment and continuous improvement to the OOR?

Page 25: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

25

Open Ontology Repository

Translating User Needs into Requirements for OOR Implementation

Evan Wallace

Page 26: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

26

… more input from the OOR-Panel sessions

… based on summary slides received from the panelists on the virtual panel sessions

Page 27: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

27

Current ontology reuse challenges - Elisa Kendall & Evan Wallace

• Ontologies developed for programs such as the DARPA DAML program are aging
  – Ontology pages have not been revised since 2004 (see http://www.daml.org/ontologies/)
  – Most recent submission was actually in 2003 (see http://www.daml.org/ontologies/submission.html)
  – Community knowledge about development methodology & facts about the world relevant to the IC community have continued to evolve

• Ontologies are often published in an author’s user space, which is ephemeral. When these ontologies move, references to them are invalidated and references within the artifacts must be updated but sometimes are not (e.g. OWL Time)

• Research ontologies tend to be focused on demonstration-related content and are by nature incomplete, with varying coverage and levels of granularity due to funding limitations

Page 28: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

28

Challenges in deploying an effective OOR - Elisa Kendall & Evan Wallace

• Linking among models built from different metamodels and for different CoPs (business modelers versus knowledge engineers)
• Intellectual Property concerns, particularly w.r.t. content based on International Standards
• Ensuring availability and persistence
• Maintenance and refreshment of content
  – Need long term resource commitment
  – Need staff with correct technical skills/knowledge
  – What policies, processes, tools and automation are needed?
  – How will freshness be monitored?

Page 29: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

29

“good practices” for reusability - Elisa Kendall

• Well-specified policies for vocabulary management, metadata, and provenance enable trust
• Commitment to forming, accommodating, serving, & working with a community of users is critical
• Emerging portals (e.g., NCBO’s BioPortal) provide the repository, publish relevant metadata, manage versions, and provide web-based access to facilitate collaboration & reuse
• Minimal principles for vocabulary publication & management*
  – Use URIs for naming – publish not only the URIs but policies for URI persistence, ownership, delegation of responsibility for specific vocabularies, etc.
  – Provide adequate readable documentation
  – Articulate maintenance policies that specify whether or not changes can be made, the process for doing so, a feedback loop for user community involvement
  – Identify versions
  – Publish a formal schema in a recommended standard
• Essential metadata
  – Identify sources, creation & revision dates, etc. at the ontology level (minimum)
  – Knowledge provenance for business & government intelligence may require detail at the fact/individual level*
  – Quality, trustworthiness assessment metrics for the vocabulary & source materials
  – Licensing, IP limitations

* http://www.w3.org/2006/07/SWD/Vocab/principles

Page 30: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

30

Steps towards an OOR enabling reuse- Elisa Kendall

• Design a repository structure, version strategy, & naming conventions

• Determine metrics for content assessment / evaluation

• Create rules & procedures for content acceptance

• Adopt metadata schema for annotation & assessment information

• Determine mechanisms for content annotation / classification & querying

• Create a strategy/schedule for deployment

Page 31: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

31

Translating Needs to Requirements

• Active discussion on the matter evolved on the [oor-forum] list, initiated from threads like:
  – post from EvanWallace (NIST): http://ontolog.cim3.net/forum/oor-forum/2008-04/msg00011.html .. &
  – post from ToddSchneider (Raytheon): http://ontolog.cim3.net/forum/oor-forum/2008-04/msg00012.html

• breaking it down to:
  – general requirements (scalability, distributed repository support, platform independence, ...)
  – requirements to support search and discovery
  – requirements to support subscription and notification
  – management requirements, &
  – governance requirements
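The subscription and notification requirement could be prototyped with a simple in-process publish/subscribe object. The sketch below is an assumption-laden illustration: `ChangeNotifier`, its methods, and the example IRI are hypothetical names, and a real OOR would presumably deliver notices over a network protocol rather than by direct callback.

```python
class ChangeNotifier:
    """Keeps per-ontology subscriber lists and fans change notices out to them."""

    def __init__(self):
        self._subscribers = {}  # ontology IRI -> list of callbacks

    def subscribe(self, iri, callback):
        """Register `callback(iri, change)` to be called when `iri` changes."""
        self._subscribers.setdefault(iri, []).append(callback)

    def publish(self, iri, change):
        """Deliver `change` to every subscriber of `iri`; return how many were notified."""
        callbacks = self._subscribers.get(iri, [])
        for callback in callbacks:
            callback(iri, change)
        return len(callbacks)

# Example: one subscriber collecting notices in a list.
inbox = []
notifier = ChangeNotifier()
notifier.subscribe("http://example.org/time", lambda iri, change: inbox.append((iri, change)))
delivered = notifier.publish("http://example.org/time", "version 2 registered")
```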

Page 32: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

32

Ontology Language Requirements

• Bare minimum
  – OWL
  – Common Logic (CL)

• Expected evolution
  – OWL → OWL2 + SWRL
  – CL → IKL

• Desirable additions
  – RDFS
  – Topic Maps
  – SBVR exchange form
  – ?

Page 33: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

33

Top User Needs (1st bullet)

• Well-maintained persistent store with high availability and performance for storage, sharing and access:
  – Maintenance mechanisms and interfaces
  – Long term storage with backup (how often?)
  – Persistent references and structured reference scheme
  – Adequate throughput and uptime percentage
  – HTTP support for upload/access including redirection (303 response code handling)
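One way to read the redirection item: a persistent ontology IRI stays fixed while an HTTP 303 (See Other) response points the client at the document's current location, chosen by the requested media type. The sketch below models only that decision as a pure function; the `LOCATIONS` table, the paths, and `resolve` are hypothetical, and no actual HTTP server is involved.

```python
# Hypothetical table from persistent ontology paths to current document locations,
# keyed by the media type the client asks for.
LOCATIONS = {
    "/ontology/time": {
        "application/rdf+xml": "/files/time-v2.owl",
        "text/html": "/docs/time-v2.html",
    },
}

def resolve(path, accept="application/rdf+xml"):
    """Return (status, headers) for a request to a persistent ontology path."""
    variants = LOCATIONS.get(path)
    if variants is None:
        return 404, {}  # unknown identifier
    location = variants.get(accept)
    if location is None:
        return 406, {}  # identifier known, but no representation in that media type
    return 303, {"Location": location}  # See Other: fetch the current document there

machine_response = resolve("/ontology/time")
human_response = resolve("/ontology/time", "text/html")
```

If the stored document later moves, only the table entry changes; references to the persistent path keep working.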

Page 34: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

34

Top User Needs (2nd bullet)

• Ontology content is properly registered and “governed” (managed?), with provenance and versioning support, and made available (logically) in one place so that it can be browsed, discovered, queried, analyzed, validated, and reused
  – Must support interfaces for registration and management
  – Must support storage and retrieval of provenance information
  – Must handle versioning of content and associated metadata
  – Must somehow integrate distributed content
  – Must support user browsing of content and metadata
  – Must support both user and machine interfaces for querying and accessing content and metadata
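A minimal in-memory sketch of registration with versioning and provenance follows. It assumes a record needs only an IRI, a version label, a submitter, a license, and an optional pointer to the version it was derived from; the field names, `OntologyRecord`, and `Registry` are hypothetical and do not come from the OOR requirements or from OMV.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OntologyRecord:
    iri: str                # persistent identifier of the ontology
    version: str            # version label supplied at registration
    submitter: str          # provenance: who registered this version
    license: str            # license type captured at registration time
    derived_from: str = ""  # provenance: version this one was derived from, if any

class Registry:
    """Append-only version history per ontology IRI."""

    def __init__(self):
        self._history = {}  # IRI -> list of OntologyRecord, oldest first

    def register(self, record):
        versions = self._history.setdefault(record.iri, [])
        if any(r.version == record.version for r in versions):
            raise ValueError(f"{record.iri} version {record.version} is already registered")
        versions.append(record)

    def latest(self, iri):
        return self._history[iri][-1]

    def history(self, iri):
        return list(self._history.get(iri, []))

    def find_by_license(self, license):
        """Return the latest record of every ontology currently under `license`."""
        return [v[-1] for v in self._history.values() if v[-1].license == license]

registry = Registry()
registry.register(OntologyRecord("http://example.org/time", "1.0", "alice", "CC-BY"))
registry.register(OntologyRecord("http://example.org/time", "1.1", "bob", "CC-BY", derived_from="1.0"))
```

Keeping the history append-only means earlier versions remain retrievable after an update, which is what lets dependents pin a precise reference.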

Page 35: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

35

Top User Needs (3rd bullet)

• Allow ontologies to be “open” and unencumbered by IPR constraints, in terms of access and reuse
  – Do we mean: allow or require?
  – Need to capture information at registration time about license type

Page 36: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

36

Top User Needs (4th bullet)

• Provide services across disparate ontological artifacts to support cross-domain interoperability, mapping, application and making inferences … and having such semantic services be properly registered and available to support peer OORs
  – This is a research area
  – Mapping types need to be defined
  – “Semantic services” need to be defined

Page 37: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

37

Next steps

• Determine minimum content types and formats to support and minimum repository capabilities

• Refine relevant requirements
  – workflows: manual and automated (functions, roles, responsibilities, and processes)
  – define supporting interfaces
  – associated data requirements

Page 38: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

38

To Enable the Management and Services specified in the Requirements, we need to capture a set of metadata about the ontologies

… here is some of the input from the 2008.04.10 Panel Session on “Ontology of Ontologies”

Page 39: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

39

Objective Michael Gruninger

• This is based on the communique from the 2007 Ontology Summit that took place April 22-23, 2007 in Gaithersburg, Maryland, at the National Institute of Standards and Technology (NIST).

• Provide a framework that ensures that we can support diversity without divergence, so that we can maintain sharability and reusability among the different approaches to ontologies.

• To this end, we can define a set of characteristics common to all approaches and then propose a set of features that can be used to distinguish among different approaches.

Page 40: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

40

Dimensions - Michael Gruninger

• We can identify a set of dimensions that can be used to distinguish among different approaches to ontologies.
• There are two kinds of dimensions:
• Semantic - how an ontology specifies the meaning of its vocabulary
  – Expressiveness of the ontology representation language
  – Level of structure
  – Representational granularity
• Pragmatic - the purpose and context in which the ontology is designed and used
  – Intended use
  – Role of automated reasoning
  – Descriptive vs prescriptive
  – Design methodology
  – Governance

Page 41: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

41

OMV - Ontology Metadata Vocabulary - Peter Haase

• OMV is … a metadata schema
  – Captures reuse-relevant information about an ontology
• OMV consists of … core and extensions
  – OMV Core: fundamental information about an ontology and its life cycle
  – OMV Extensions: detailed account of specific phases of an ontology life cycle
• OMV is designed … as an ontology
• OMV is realized … in OWL DL
• Website http://omv.ontoware.org/

Page 42: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

42

Applications of OMV Peter Haase

• Numerous existing and planned applications of OMV
  – Ontology Registries in NeOn
    • Oyster as Open Source implementation
    • Centrasite as commercial product of Software AG
  – Watson - Gateway to the Semantic Web
    • Web interface for searching ontologies and semantic documents
  – Stanford BMIR intend to use OMV in Protege and their Bioportal ontology repository
  – OMG intend to use OMV in their ontology repository
• OMV development sustained by OMV Consortium
  – Current members: UPM, AIFB, TU Berlin, Stanford BMIR
  – Looking for wider adoption / standardization in the community
  – Opportunity to join, contribute, collaborate!

Page 43: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

43

Requirements vs. Rationale

• Again, the “Requirements” map well to the “Rationale” cited in the previous slide; additionally, though:
• Unlike R&D, what's not “interesting” there could still be crucial in implementation and in production
  – for example, delivering high availability, performance, even security, spam control ... these are almost irrelevant to “serving ontologies,” yet essential to having a viable OOR
• For the *real* implementation of the OOR initiative, we would still need:
  – the proper Organizational Model that would deliver the previously mentioned needs (like sustainability, ... etc.)
  – a viable Operating (Business) Model, and
  – most importantly, the community, and the skills and generosity of its membership, to get this off the ground

Page 44: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

44

Existing Efforts

Bruce Bargmeyer

… based on input from the 2008.02.28 OOR-Panel on “Ontology Registry and Repository Technology & Infrastructure Landscape”

Page 45: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

45

eXtended Metadata Registry (XMDR)

Bruce Bargmeyer

What XMDR Brings to the Table:
• Use cases - semantics challenges - and Requirements
• Potential design specifications
  – Proposed specifications for ISO/IEC 11179 Edition 3
  – A UML Model, definitions, and OWL ontology
• Modular software architecture and open source software modules
• Open Source XMDR software
• Test content – concept systems including thesauri, taxonomies, ontologies
• A group of participants (XMDR project) that has considerable experience in this area. See: XMDR.org

Page 46: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

46

Modular XMDR Architecture - Bruce Bargmeyer

[Architecture diagram; component labels:]
• Users: Web browsers, client software
• Human User Interface (HTML from JSP and JavaScript; Exhibit)
• Application Program Interface (REST)
• Authentication Service
• Validation (XML Schema)
• Mapping Engine
• Search & Content Serving (Jena, Lucene)
• Logic Indexer (Jena & Pellet); Logic Index
• Text Indexer (Lucene); Text Index
• Registry Store
• Postgres Database
• Content Loading & Transformation (Lexgrid & custom)
• Metadata Sources: concept systems, data elements
• Standard XMDR files
• XMDR metamodel (OWL & XML schema)
• Metamodel specs (UML & Editing) (Poseidon, Protege)
• XMDR data model & exchange format: XML, RDF, OWL
• Third Party Software

Page 47: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

47

DAML Ontology Library - Mike Dean

• Created early in the DARPA Agent Markup Language program
  – Organize content
  – Promote reuse
  – Demonstrate adoption
• 282 DAML+OIL and OWL ontologies submitted from October 2000 – December 2003
• Cited in several ISWC papers
  – Property (Feature) use across libraries
• Largely replaced by Ontaria and SchemaWeb
• Available at http://www.daml.org/ontologies/
  – daml.org now archived at W3C

Page 48: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

48

Lessons Learned Mike Dean

• Curation/quality control
  – Many users desired some indication of use and quality of the ontologies, e.g. user ratings
• Caching
  – Many links subsequently became unavailable
  – Desirable to store (and make available) local copies
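The caching lesson can be sketched as "try the live link, fall back to the stored copy". The example below is a minimal illustration under stated assumptions: `fetch_with_cache` is a hypothetical helper, the fetcher is passed in as a callable so no network access is needed, and the cache is an ordinary dictionary rather than durable storage.

```python
def fetch_with_cache(url, fetch, cache):
    """Try the live source first; fall back to the last locally stored copy on failure.

    `fetch` is any callable that returns the document text or raises OSError.
    Returns (content, source), where source is "live" or "cache".
    """
    try:
        content = fetch(url)
    except OSError:
        if url in cache:
            return cache[url], "cache"
        raise  # nothing stored locally, so the failure is passed on
    cache[url] = content  # refresh the local copy on every successful fetch
    return content, "live"

# Stand-in fetchers for illustration only.
def working_fetch(url):
    return "<owl:Ontology/>"

def broken_fetch(url):
    raise OSError("link no longer available")

cache = {}
first = fetch_with_cache("http://example.org/time.owl", working_fetch, cache)
second = fetch_with_cache("http://example.org/time.owl", broken_fetch, cache)
```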

Page 49: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

49

NCBO BioPortal - Mark Musen

• The National Center for Biomedical Ontology (http://bioontology.org) is developing BioPortal, an open-source repository of ontologies, terminologies, and thesauri of importance in biomedicine.

• An early version of BioPortal is accessible at http://bioportal.bioontology.org. Users can access the BioPortal content interactively via Web browsers or programmatically via Web services.

Page 50: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

50

BioPortal will offer - Mark Musen

• Ontology repository functionality
• Linkages among different ontologies
• Community-based peer review and ontology annotation
• Linkages between ontology content and related online data repositories
• Support for communities of ontology users and developers

Page 51: MikeDean, LeoObrst, PeterYim, et al. April 29, 2008 v 1.02

51

OASIS ebXML RegRep - Farrukh Najmi

• A generic registry / repository standard
  – ebRIM = meta infomodel, ebRS = services and protocols
  – Approved as OASIS and ISO standards
  – Version 4 of RegRep expected in 2008
  – Highly extensible
• Not specifically an Ontology repository, but can be made so
  – ebXML RegRep Profile for OWL-Lite is an approved specification
• Has rich feature set to support use cases and architecture
  – Extensible metamodel, extensible protocol, stored queries, extensible relationships, service model, validation, cataloging, subscription & notification, role-based access control and authorization, change history, federation / federated query, SOAP and REST bindings, Java API (JAXR)
• freebXML Registry provides a royalty-free open source implementation
• Is more a toolkit than an out-of-box solution

ebXML RegRep as an OOR Server: A Proposal - Farrukh Najmi

• Build upon RegRep 4.0 impl from Wellfleet Software
• Implement OWL-Lite Profile (modulo RegRep 4)
• RegRep does not provide Ontology specific UI, use Protege
• Integrate RegRep 4.0 with Protege such that
  – RegRep serves as backend for Multi-user Protege client
  – Protege reasoning engine serves as Reasoning plugin for RegRep
• Initially deploy a single Root OOR instance with pilot users playing various roles in the collaborative ontology management use cases
  – Use OpenID as distributed identity management solution
• Later facilitate deployment of Community-specific OORs (e.g. Medical, GIS, Defense ...)

What can CIM3 bring to the table? - Peter Yim

• CIM3 Mission: to enable more effective distributed collaboration and virtual enterprise through bootstrapping collective intelligence over the Internet

CIM3's potential role in OOR - Peter Yim

• provide the “plumbing” (the bottom layer of the technology stack) - a robust hosted (hardware and network) infrastructure for OOR
  – Network Facility:
    • Tier-1 IPv4 Internet hosting facility (IPv6 ready)
    • 100Mbps bandwidth into the Internet backbone (upgradable to 1Gbps in short order)
    • Backbone: multiple OC192 & GigE self-healing fiber-ring (among the top 10 networks in the world as measured by connectivity to the rest of the Internet)
  – Linux Servers (mainly on IBM 1U boxes)
    • Triple redundant storage (in 2 locations)
    • Locked-down system environment, with spam-filtering and content filtering capabilities
• provide a collaborative work environment for the OOR team
• help facilitate the distributed teamwork - project coordination and management

Open Ontology Repository Roadmap

Mike Dean

OOR Is …

• An open source software platform
• 1 or more public instantiations of that platform
• A sustainable organization
• (Lots of potential parallelism here)

Apache-like Software Platform

• Architectural framework (internal APIs, core representation standards, processing pipeline)
• A few core modules (basic registry, GUI, web service interfaces, …)
• Lots of optional modules (pick and choose when instantiating)
  – Quality and gatekeeping (basic checks, usage-based, community ranking, curation, etc.)
  – Languages (OWL, RDFS, Common Logic, UML, SKOS, etc.)
  – Mapping and translation
  – Federation (bi-directional, one way)
  – Repository (expanded persistence)
  – Editing (access control, versioning)
  – Encapsulations of existing ontology services
  – …
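One way to picture "core modules plus optional modules, picked when instantiating" is the minimal Python sketch below. It is an illustration under stated assumptions, not the OOR design: the `Platform` class, the `run_pipeline` helper, and the two example modules are hypothetical names, and a module is modeled simply as a function that takes a metadata record and returns an updated one.

```python
from typing import Callable, Dict, List

# Assumption: a module is a function from a metadata record to an updated record.
Module = Callable[[dict], dict]


class Platform:
    def __init__(self) -> None:
        self.core: List[Module] = []
        self.optional: Dict[str, Module] = {}

    def add_core(self, module: Module) -> None:
        self.core.append(module)

    def register_optional(self, name: str, module: Module) -> None:
        self.optional[name] = module

    def instantiate(self, enabled: List[str]) -> List[Module]:
        # Core modules always run; optional ones run only when picked.
        return self.core + [self.optional[name] for name in enabled]


def run_pipeline(pipeline: List[Module], record: dict) -> dict:
    # Pass the record through each module in order.
    for module in pipeline:
        record = module(record)
    return record


def basic_registry(record: dict) -> dict:
    # Hypothetical core module: marks the ontology as registered.
    return {**record, "registered": True}


def basic_checks(record: dict) -> dict:
    # Hypothetical optional quality module: checks that a URI is present.
    return {**record, "checked": bool(record.get("uri"))}


platform = Platform()
platform.add_core(basic_registry)
platform.register_optional("quality", basic_checks)

pipeline = platform.instantiate(enabled=["quality"])
result = run_pipeline(pipeline, {"uri": "http://example.org/onto.owl"})
```

An instantiation that enables no optional modules would run only the core registry step, which mirrors the "pick and choose" idea on the slide.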

OOR Federation

• Other OOR instances
  – P2P (ish)
  – Easiest case - we have full control
• Collaborative ontology editing environments (knoodl, Semantic MediaWiki, BioPortal, CODS, etc.)
  – Want to cooperate rather than compete
• Other registries/repositories
• “Loose” ontologies posted on the WWW
  – Add metadata and apply services

“Open”

• OOR platform software should be open source (like Apache)
  – Probably use one of the licenses from opensource.org
• OOR instantiations can set their own policies with respect to ontology content licensing
• Accommodates
  – Open source content
  – Private instantiations behind firewalls
  – Commercially licensed content (e.g. ResearchCyc) and services
• OOR organization should do whatever it can to promote the adoption of ontologies and related technologies that benefit the community

Public Services

• At least one “central” instantiation
  – Showcase for the technology
    • Employ many/all optional modules
  – Should include some sort of “registry of OORs” or OOR DNS
  – Might federate ontologies from domain-specific public OORs
  – Primary focus on “open source” ontologies
  – <OOR.org> is already taken; Mike Dean has registered OpenOntologyRepository.org & OpenOntologyRepository.net
  – Peter Yim has volunteered supporting infrastructure, initially

OOR Organization Support

• Volunteers (now)
• Funded OOR proposals (hopefully soon)
• Collaborative projects/proposals citing OOR
  – Enhances credibility of both parties
  – Most immediate path forward
• Endowment
• Acquisition by another sustainable organization (Apache Foundation, W3C, OMG, …)
• Should adopt a W3C-like persistence policy for software and content

Linking Open Data

• Ontologies as (meta) data
  – Adopt their conventions
• Nice visual depiction of increasing inter-connections
• A community linked by best practices and a cool logo

Strawman Phase 1

• 6 month time horizon
• Basic registry and repository
  – Might build off XMDR
• Some federation capability
• Initial versions of architectural framework and core modules
• Support for at least 1 language
  – Other optional modules as they become available
• Specific open source license selected
• 1 or more instantiations

Going Forward

• [email protected] email list
  – http://ontolog.cim3.net/forum/oor-forum/ archives
• Next OOR telecon
  – Friday, May 9, noon EDT
• Calling for:
  – Collaborators / contributors
  – Developers
  – “A cool logo!”

Open Discussion

Q & A