1
Ontology Management in CALO, a Cognitive Assistant that Learns and Organizes
Adam Cheyer
Program Director, Cognitive Computing Group, SRI International
2
Abstract
CALO is one of DARPA's most ambitious efforts to develop a persistent assistant that lives with, learns from, and supports users in managing the complexities of their daily work lives. A multi-year project that unites some 200+ researchers from 25 academic and commercial organizations, the goal is to produce a single system where learning happens "in vivo", inside an ever-evolving agent that can observe, comprehend, reason, anticipate, act, and communicate.
In this talk, we will first provide an overview of CALO: the what, the how, the why. Next, we will discuss the engineering methods we use to develop and maintain the ontology of CALO. CALO has some unusual requirements, such as "Concept Learning", where the ontology is extended and modified "in-the-wild" by machine learning algorithms. Finally, we will demonstrate IRIS, a semantic desktop that serves as the office environment that integrates best with CALO. IRIS leverages many of CALO's techniques for ontology management and, being open source, provides a distributable, transparent example of the approach.
3
Outline
CALO Overview (separate presentation)
Ontology Management in CALO
  Ontology Usage in CALO’s Architecture
  CALO’s Unique Issues (and Solutions Attempted) for Ontology Management and Maintenance
  In Practice
Overview of IRIS Semantic Desktop
Demonstration of CALO/IRIS
4
CALO Functions
Organize & Manage Information
Schedule & Organize in Time
Acquire, Allocate Resources
Prepare Information Products
Monitor & Manage Tasks
Observe & Mediate Interactions
[Figure: the six functions above arranged around the high-level CALO architecture diagram]
5
High-level CALO Architecture
[Figure: architecture diagram. Component labels, in the order they appear:]
Perception Manager
Meeting Activity Recognition
Knowledge Manager
Timeline Mgr
Update Mgr
Query Mgr
Memory Mgr
Task Manager
Interaction Manager
Interpretation
NL/Speech
IRIS
Explanation
Knowledge Base
Timeline Database (Episodic Memory)
Collaborative Problem Solver
Plan Reasoner
Task Exec
Coordination Mgr
Participant Tracking
Cyber Manager
Remote Cyber Environment
Local Cyber Environment
Task Registry
Towel
IRIS Office Environment
6
Ontology in CALO’s Architecture
Query Manager
  Provides single entry point for querying knowledge in CALO – unifies many data sources and reasoning components
Publish Subscribe Event Framework
  Across all cyber/physical events in CALO
Episodic Memory (Timeline Server)
  Records instances of events for learning
Task Interface Registry
  Engineered and Learned Actions in CALO
Dialog Management
  Used for understanding user intent and generating interactions to user
IRIS Office Environment
  Rich model of user’s electronic life
MOKB Meeting Ontology KB
  Rich model of meeting events
CALO Test Infrastructure (CATS)
  Evaluates CALO’s abilities and how much learning in the wild has contributed
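The "single entry point" role of the Query Manager can be pictured with a small sketch. This is only an illustration of the facade idea, not CALO's actual Query Manager: the class name, the `register` and `query` methods, and the two example sources are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]          # (subject, property, value)
Source = Callable[[str], List[Triple]]  # a source answers queries about a subject


class QueryManager:
    """Hypothetical facade: one query call fans out to every registered source."""

    def __init__(self) -> None:
        self._sources: Dict[str, Source] = {}

    def register(self, name: str, source: Source) -> None:
        self._sources[name] = source

    def query(self, subject: str) -> List[Tuple[str, Triple]]:
        # Tag each answer with the source it came from so callers can tell
        # engineered knowledge apart from, say, recorded events.
        results: List[Tuple[str, Triple]] = []
        for name, source in self._sources.items():
            for triple in source(subject):
                results.append((name, triple))
        return results


# Two made-up sources standing in for a knowledge base and a timeline server.
def knowledge_base(subject: str) -> List[Triple]:
    facts = [("meeting-1", "type", "Meeting")]
    return [t for t in facts if t[0] == subject]


def timeline(subject: str) -> List[Triple]:
    events = [("meeting-1", "startedAt", "10:00")]
    return [t for t in events if t[0] == subject]


qm = QueryManager()
qm.register("kb", knowledge_base)
qm.register("timeline", timeline)
answers = qm.query("meeting-1")
```

Callers see one interface regardless of how many sources sit behind it, which is the property the slide attributes to the Query Manager.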
[Figure: High-level CALO Architecture]
7
CALO Ontology: Core+Office
Core Ontology (aka CLIB)
  Created by UT Austin: http://www.cs.utexas.edu/users/mfkb/RKF/clib.html
  Library of generic, composable and re-usable knowledge components.
  It was created before CALO and has been used in a variety of different projects including RKF, HALO and AURA.
  857 core components (as of 2005-11-14)
  Ex: Time-Interval, Person, Organization, Message
Office Ontology
  Extension of CORE suitable for CALO Office domain
  108 office components (as of 2005-11-14)
  Ex: Author, Vendor, ProjectLeader, ElectronicPresentationDocument
Implemented in KM (“The Knowledge Machine”)
  KM is a frame-based language with clear first-order logic semantics
  It contains sophisticated machinery for reasoning, including selection by description, unification, classification, and reasoning about actions using a situations mechanism
  http://www.cs.utexas.edu/users/mfkb/RKF/km.html
8
CALO’s Unique Ontology Management Issues
Very large project, many different representation and inference needs
5 year project: Ontology will change. How to maintain consistency of code, data, and docs?
Enduring Personal Cognitive Assistant: can’t forget data.
Concept & Task Learning: Ontology can change “in the wild” by the user and by CALO
Uncertainty a reality, from many different reasoners and predictors
9
Consistent Ontology Evolution
Issues
  KM vs. OWL (tools)
  Keeping Code, Data, and Doc in Synch
  Migrating acquired data instances forward through ontology changes from Engineering Releases
  Concept learning allows user ontologies to diverge
    How to rationalize with engineering releases?
    How to validate CALO-learned changes?
Solutions Attempted
  KM (master) exports to OWL
  Documentation “POJOs” and Human Readable Doc
  Transactional POJOs
  SOUP: “Simple Ontology Update Program” applies system of patches to data to migrate forward to latest version
  Concept learning changes kept separate from main Engineering “trunk”
  Restrict changes allowed: add, rename properties and classes; not move or delete
  Shadow ontology and validation processes
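The patch-based migration idea can be sketched as follows. This is a minimal illustration, not SOUP itself: the patch functions, the `ontology_version` field, the version numbers, and the `Presentation` and `author` names are hypothetical. The only operations offered are additions and renames, mirroring the restriction stated on the slide.

```python
from typing import Callable, Dict, List

Instance = Dict[str, object]
Patch = Callable[[Instance], Instance]


def rename_property(old: str, new: str) -> Patch:
    def apply(instance: Instance) -> Instance:
        updated = dict(instance)
        if old in updated:
            updated[new] = updated.pop(old)
        return updated
    return apply


def rename_class(old: str, new: str) -> Patch:
    def apply(instance: Instance) -> Instance:
        updated = dict(instance)
        if updated.get("class") == old:
            updated["class"] = new
        return updated
    return apply


def add_property(name: str, default: object) -> Patch:
    def apply(instance: Instance) -> Instance:
        updated = dict(instance)
        updated.setdefault(name, default)
        return updated
    return apply


# Patches keyed by the ontology version they upgrade FROM (made-up history).
PATCHES: Dict[int, List[Patch]] = {
    1: [rename_property("author", "creator")],
    2: [rename_class("Presentation", "ElectronicPresentationDocument"),
        add_property("reviewed", False)],
}
LATEST_VERSION = 3


def migrate(instance: Instance) -> Instance:
    """Apply every patch between the instance's version and the latest one."""
    current = dict(instance)
    start = int(current.get("ontology_version", 1))
    for version in range(start, LATEST_VERSION):
        for patch in PATCHES.get(version, []):
            current = patch(current)
    current["ontology_version"] = LATEST_VERSION
    return current


old = {"class": "Presentation", "author": "adam", "ontology_version": 1}
new = migrate(old)
```

Because no patch moves or deletes anything, data acquired under an old release stays reachable after migration, which fits the "can't forget data" requirement.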
10
Keeping Code, Data, and Docs in Synch
[Figure: data-flow diagram. Recoverable labels: CLIB Ontology (KM); OWL Translator; CLIB in OWL; ILR Specialized OWL Ontologies; OWLDoc; HTML Doc; Ontology Usage Spec; Semantic Object; Query APIs (POJOs, Java (RN), SPARQL); Radar Networks KB; JENA KB; Lucene Full Text Index; KB1, KB2, KB3; Query Manager; QM Domain File; MOKB Query; Timeline; TaskMgr; CATS Tester; Back End; Front End; Apps; CALO UI; UI Separation; Data UI; IRIS; Event frmwk; Action frmwk; Action To TaskMgr; Plugin Svcs; Cluster Framewk; Classifier; Other plugins]
11
CALO Concept Learning
Concept Learning works in 2 steps / workflows
1. Building a ‘Shadow Ontology/Knowledgebase’
   Information harvesting
   Validation of harvested facts
   Integrated into a Shadow Ontology and Knowledgebase
   This is a longer term process and will be done first
2. Realtime updating of CALO
   Uses Shadow Ontology and KB
   CALO queries CL about a concept
   CL returns one or more concepts
   CALO user verifies which was actually meant
   CALO Ontology and IRIS KB get updated
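The two workflows above can be sketched in a few lines. This is a toy model under stated assumptions, not the CALO Concept Learning component: the `ShadowOntology` class, its methods, the `realtime_update` function, and the example concepts are all hypothetical, and the user's verification step is represented by a callback.

```python
from typing import Callable, Dict, List, Optional, Set


class ShadowOntology:
    """Step 1: harvested concepts are held here and offered only once validated."""

    def __init__(self) -> None:
        self._concepts: Dict[str, dict] = {}

    def harvest(self, name: str, terms: List[str]) -> None:
        self._concepts[name] = {
            "terms": {t.lower() for t in terms},
            "validated": False,
        }

    def validate(self, name: str) -> None:
        self._concepts[name]["validated"] = True

    def candidates(self, term: str) -> List[str]:
        term = term.lower()
        return sorted(
            name for name, c in self._concepts.items()
            if c["validated"] and term in c["terms"]
        )


def realtime_update(
    term: str,
    shadow: ShadowOntology,
    ontology: Set[str],
    choose: Callable[[List[str]], Optional[str]],
) -> Optional[str]:
    """Step 2: look up candidates, let the user pick, then update the ontology."""
    options = shadow.candidates(term)
    if not options:
        return None
    chosen = choose(options)       # the user verifies which concept was meant
    if chosen in options:
        ontology.add(chosen)       # only a confirmed concept is written
        return chosen
    return None


shadow = ShadowOntology()
shadow.harvest("JaguarAnimal", ["jaguar", "big cat"])
shadow.harvest("JaguarCar", ["jaguar", "car"])
shadow.harvest("JaguarUnchecked", ["jaguar"])
shadow.validate("JaguarAnimal")
shadow.validate("JaguarCar")

ontology: Set[str] = set()
picked = realtime_update("Jaguar", shadow, ontology, lambda opts: opts[0])
```

Keeping harvested concepts in a separate store until they are validated and confirmed means the live ontology only changes through the user-verified path.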
12
Uncertainty across multiple sources
Issues
  When to “write” hypothesis as “truth” into KB? Maintaining consistency
  How to rationalize/combine hypotheses from different algorithms
  Credit assignment problem
Solutions Attempted
  Year1: Global KB, some algorithms wrote, some hypotheses only accessible through APIs
  Year2: Provenance in global KB – record multiple solutions and where they came from
  Year3:
    Separate KBs by learning component, “smart” queries across sources
    Probabilistic Consistency Engine maintains global “what CALO believes” repository
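One way to picture recording provenance and combining hypotheses is the sketch below. It is not the Probabilistic Consistency Engine: the `Hypothesis` and `BeliefStore` names, the acceptance threshold, and the source names are hypothetical, and the combination rule is a plain noisy-OR, which assumes the sources are independent.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple

Fact = Tuple[str, str, str]  # (subject, property, value)


@dataclass(frozen=True)
class Hypothesis:
    fact: Fact
    confidence: float  # the source's own estimate, between 0 and 1
    source: str        # provenance: which learner or reasoner proposed it


class BeliefStore:
    """Keeps every hypothesis with its source; 'truth' is decided at read time."""

    def __init__(self, threshold: float = 0.9) -> None:
        self.threshold = threshold
        self._by_fact: Dict[Fact, List[Hypothesis]] = defaultdict(list)

    def record(self, hypothesis: Hypothesis) -> None:
        self._by_fact[hypothesis.fact].append(hypothesis)

    def provenance(self, fact: Fact) -> List[str]:
        return [h.source for h in self._by_fact.get(fact, [])]

    def belief(self, fact: Fact) -> float:
        # Noisy-OR: the fact is disbelieved only if every source is wrong.
        disbelief = 1.0
        for h in self._by_fact.get(fact, []):
            disbelief *= 1.0 - h.confidence
        return 1.0 - disbelief

    def accepted(self) -> List[Fact]:
        return [f for f in self._by_fact if self.belief(f) >= self.threshold]


store = BeliefStore(threshold=0.9)
speaker = ("meeting-1", "speaker", "adam")
topic = ("meeting-1", "topic", "budget")
store.record(Hypothesis(speaker, 0.7, "audio-classifier"))
store.record(Hypothesis(speaker, 0.8, "calendar-reasoner"))
store.record(Hypothesis(topic, 0.7, "topic-model"))
```

Here the speaker fact is accepted because two sources agree (combined belief about 0.94), while the single-source topic fact stays a hypothesis. Deferring the "write as truth" decision to read time keeps the original hypotheses and their sources available for later credit assignment.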
13
IRIS Semantic Desktop
IRIS: “Integrate. Relate. Infer. Share.”
“Real” office applications (Mozilla, GLOW, Jabber, …)
Plug-in Architecture (180+ plugins: UI, KB, NL, learning, apps, …)
Semantic Object layer: Java objects on top of OWL
Full-text & relational query (SPARQL)
Ontology-based event and action framework
Machine learning framework: classification, extraction, clustering, ranking, …
LGPL Open Source: http://www.openiris.org
Only a small subset of CALO, but should be useful for many applications and uses many of the techniques in this presentation
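The idea of a semantic object layer, typed objects whose fields are backed by ontology statements, can be sketched as below. IRIS implements this in Java on top of OWL; the Python here is only an illustration, and the `TripleStore` and `SemanticObject` classes and the `name` property are hypothetical stand-ins rather than IRIS APIs.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject URI, property, value)


class TripleStore:
    """Tiny in-memory stand-in for an OWL/RDF knowledge base."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, subject: str, prop: str, value: str) -> None:
        self._triples.add((subject, prop, value))

    def remove(self, subject: str, prop: str) -> None:
        self._triples = {
            t for t in self._triples if not (t[0] == subject and t[1] == prop)
        }

    def objects(self, subject: str, prop: str) -> List[str]:
        return sorted(o for (s, p, o) in self._triples if s == subject and p == prop)


class SemanticObject:
    """Base class: an object is just a URI plus a view onto the store."""

    def __init__(self, store: TripleStore, uri: str) -> None:
        self._store = store
        self.uri = uri

    def _get(self, prop: str) -> Optional[str]:
        values = self._store.objects(self.uri, prop)
        return values[0] if values else None

    def _set(self, prop: str, value: str) -> None:
        # Single-valued property: replace any earlier statement.
        self._store.remove(self.uri, prop)
        self._store.add(self.uri, prop, value)


class Person(SemanticObject):
    """Typed accessor for an ontology class; holds no state of its own."""

    @property
    def name(self) -> Optional[str]:
        return self._get("name")

    @name.setter
    def name(self, value: str) -> None:
        self._set("name", value)


kb = TripleStore()
person = Person(kb, "person:1")
before = person.name
person.name = "Ada"
person.name = "Grace"
```

Application code works with ordinary object fields while every read and write goes through the knowledge base, so plugins that query the store directly see the same data.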
14
Questions?
Adam Cheyer