Knowledge Systems and Project Halo
In collaboration with
SRI (Vinay Chaudhri)
and Boeing (Peter Clark)
Knowledge Systems
• Knowledge Systems are formal representations of knowledge capable of answering unanticipated questions with coherent explanations
• Knowledge System = KB + Q/A + Explanation Generator + Knowledge Acq. tools
Project Halo
• Funded and administered by Vulcan, Inc – a Paul Allen company
• Objective: to assess the state of the art of knowledge systems – computer programs that know a lot and answer tough questions with coherent explanations
• Method: administer an AP Chemistry exam to knowledge systems built by 4 teams of researchers
A Significant Advance over Expert Systems
• Coverage
• Reasoning
• Explanation
• Rapid construction
KM: A Logic Programming Language
• …able to represent:
– classes, instances, prototypes
– defaults, fluents, constraints
– (hypothetical) situations
– actions (pre-, post-, and during-conditions)
• …and reason about:
– inheritance with exceptions
– deductive and abductive inference (with constraints)
– automatic classification (given a partial description of an instance, determine the classes to which it belongs)
– temporal projection (“my car is where I left it”)
– effects of actions
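Automatic classification can be illustrated with a minimal Python sketch. This is not KM syntax or KM's algorithm; the class definitions, slot names, and the `classify` function are hypothetical, chosen only to show the idea of matching a partial description against sufficient conditions.

```python
# Hypothetical class definitions: each class lists slot values that are
# sufficient for membership. These are illustrative, not KM's definitions.
CLASS_DEFINITIONS = {
    "Aqueous-Solution": {"solvent": "H2O"},
    "Strong-Electrolyte-Solution": {
        "solvent": "H2O",
        "base-type": "strong-electrolyte",
    },
}


def classify(instance):
    """Return every class whose sufficient conditions the instance meets."""
    return sorted(
        name
        for name, required in CLASS_DEFINITIONS.items()
        if all(instance.get(slot) == value for slot, value in required.items())
    )


# A partial description: extra slots are allowed, missing ones block a match.
solution = {"solvent": "H2O", "base-type": "strong-electrolyte", "volume": "0.07 liter"}
classes = classify(solution)
```

Here `classes` holds both class names, while a description with only `solvent` filled in would match `Aqueous-Solution` alone.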
A Simple Example
• When 70 ml of 3.0-Molar Na2CO3 is added to 30 ml of 1.0-Molar NaHCO3 the resulting concentration of Na+ is:
a) 2.0 M  b) 2.4 M  c) 4.0 M  d) 4.5 M  e) 7.0 M
Question Representation
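[Diagram: Question 26 as a graph. A Mix has two raw materials, both Aqueous Solutions: one with base Na2CO3, conc. 3.0 M, volume 0.07 liter; the other with base NaHCO3, conc. 1.0 M, volume 0.03 liter. The result is a Mixture that has-part Na+. The context is the question's given facts; the output is the unknown conc. (??) of Na+.]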
Background Knowledge
Chemistry laws:
1. Concentration of a solute
2. Composition of strong electrolyte solutions
3. Conservation of mass
4. Conservation of volume
etc.
Law 1: Concentration of a Solute
Explanation Template:
The concentration of a chemical in a mixture is the quantity of the chemical divided by the volume of the mixture.
Divide the quantity by the volume: <Quantity> / <Volume> = X *molar
Therefore, the concentration of <Chemical> in <Mixture> = X *molar
[Diagram: Compute-Concentration Method, with context, input, and output regions. A Mixture has a volume (Volume*liters) and a conc. (Concentration*molar), and has-part a Chemical whose quantity is Quantity*moles.]
Note: when this law is applied, using Novak’s code, the quantities are automatically converted to the units of measurement specified here.
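Law 1 and its unit conversion can be sketched in a few lines of Python. This is an illustration only, not Novak's conversion code: the conversion tables and the function name `concentration_molar` are assumptions made for the example.

```python
# Conversion factors to the units Law 1 expects (moles and liters).
# Hypothetical tables; a real unit system would cover far more units.
TO_MOLES = {"mole": 1.0, "millimole": 0.001}
TO_LITERS = {"liter": 1.0, "milliliter": 0.001}


def concentration_molar(quantity, quantity_unit, volume, volume_unit):
    """Law 1: concentration = quantity / volume, reported in molar."""
    moles = quantity * TO_MOLES[quantity_unit]
    liters = volume * TO_LITERS[volume_unit]
    if liters <= 0:
        raise ValueError("volume must be positive")
    return moles / liters


# 0.45 mole dissolved in 100 milliliters of mixture.
conc = concentration_molar(0.45, "mole", 100, "milliliter")
print(round(conc, 2))
```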
Law 2: Composition of Strong Electrolytes
[Diagram: Compute-Ions-in-Strong-Electrolyte, with context, input, and output regions. A Strong Electrolyte whose quantity is Quantity*moles has-part an Anion and a Cation, each with its own quantity in Quantity*moles.]
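A minimal Python sketch of Law 2 follows, assuming that a strong electrolyte dissociates completely, so each ion's quantity is the electrolyte's quantity times the number of that ion per formula unit. The `IONS_PER_FORMULA_UNIT` table and the function name are hypothetical.

```python
# Ions released per formula unit of each strong electrolyte.
# Hypothetical lookup table covering only the two salts in the example.
IONS_PER_FORMULA_UNIT = {
    "Na2CO3": {"Na+": 2, "CO3-2": 1},
    "NaHCO3": {"Na+": 1, "HCO3-": 1},
}


def ion_quantities(electrolyte, moles):
    """Law 2: moles of each ion produced by `moles` of a strong electrolyte."""
    return {
        ion: count * moles
        for ion, count in IONS_PER_FORMULA_UNIT[electrolyte].items()
    }


ions = ion_quantities("Na2CO3", 0.21)  # 0.21 mole of Na2CO3
```

With 0.21 mole of Na2CO3, the sketch yields twice as many moles of Na+ as of CO3-2.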
Law 3: Conservation of Mass
[Diagram: Conservation of Mass, with context, input, and output regions. A Mix has raw-materials Chemical1 … Chemicaln and a result Chemical. Each raw material has-part the Chemical of interest with a quantity in Quantity*moles; the quantity of that Chemical as part-of the result is ??*moles.]
Explanation Template:
By the Law of Conservation of Mass, the quantity of a chemical in a mixture is the sum of the quantities of that chemical in the parts of the mix.
The quantity of <Chemical> in <Chemical1> is X1 *moles
…
The quantity of <Chemical> in <Chemicaln> is Xn *moles
Therefore, the quantity of <Chemical> = X *moles
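As a rough Python sketch of Law 3, each raw material can be modeled as a mapping from its parts to their quantities in moles. The data layout and the name `quantity_in_mixture` are assumptions for illustration.

```python
def quantity_in_mixture(chemical, raw_materials):
    """Law 3: sum the moles of `chemical` over the parts of each raw material.

    A raw material that does not contain the chemical contributes 0 moles.
    """
    return sum(parts.get(chemical, 0.0) for parts in raw_materials)


raw_materials = [
    {"Na+": 0.42, "CO3-2": 0.21},  # parts of the Na2CO3 solution, in moles
    {"Na+": 0.03, "HCO3-": 0.03},  # parts of the NaHCO3 solution, in moles
]
total_na = quantity_in_mixture("Na+", raw_materials)
```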
Law 4: Conservation of Volume
[Diagram: Conservation of Volume, with context, input, and output regions. A Mix has raw-materials Chemical1 … Chemicaln, with volumes Volume<uom1> … Volume<uomn>, and a result Mixture whose volume is ??*liter.]
Explanation Template:
By the Law of Conservation of Volume, the volume of a mixture is the sum of the volumes of the parts mixed.
The sum of X1 <uom1>, … and Xn <uomn> = X *liter
Therefore, the volume of <Mixture> = X *liter
Step 1: Reclassify Terms
[Diagram: the Question 26 graph (Mix of an Na2CO3 solution, 3.0 M, 0.07 lit, and an NaHCO3 solution, 1.0 M, 0.03 lit, with a result Mixture that has-part Na+), now with a superclass link to Strong Electrolyte Solution.]
Step 2: Use Law 1 to Compute Concentration
[Diagram: Law 1 matched against the Question 26 graph. The conc. of Na+ in the Mixture (??*molar) now depends on two unknowns: the volume of the Mixture (??*liters) and the quantity of Na+ (??*moles).]
The Search is non-deterministic
• Multiple laws might be used to compute a value for any property. For example, here’s another way to compute concentration:
pH = - log [H+], where [H+] is the concentration of H+
• Since this applies only to H+, this search path ends quickly
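The non-deterministic search can be sketched as a small backward chainer in Python. This is a simplified illustration, not KM's inference engine: the goal encoding `(property, chemical, mixture)`, the `solve` function, and the two law functions are all hypothetical.

```python
def solve(goal, facts, laws, depth=0):
    """Return a value for `goal`, trying each law in turn; None if all fail."""
    if goal in facts:
        return facts[goal]
    if depth >= 10:  # guard against runaway recursion
        return None
    for law in laws:
        value = law(goal, facts, laws, depth + 1)
        if value is not None:
            return value
    return None


def law_ph(goal, facts, laws, depth):
    """pH = -log10 [H+], so [H+] = 10 ** -pH. Applies only to H+."""
    prop, chemical, mixture = goal
    if prop != "conc" or chemical != "H+":
        return None  # this search path ends immediately
    ph = solve(("pH", None, mixture), facts, laws, depth)
    return None if ph is None else 10 ** -ph


def law_concentration(goal, facts, laws, depth):
    """Law 1: concentration = quantity / volume."""
    prop, chemical, mixture = goal
    if prop != "conc":
        return None
    quantity = solve(("quantity", chemical, mixture), facts, laws, depth)
    volume = solve(("volume", None, mixture), facts, laws, depth)
    if quantity is None or volume is None:
        return None
    return quantity / volume


LAWS = [law_ph, law_concentration]
facts = {("quantity", "Na+", "mix"): 0.45, ("volume", None, "mix"): 0.1}
na_conc = solve(("conc", "Na+", "mix"), facts, LAWS)
```

For the Na+ goal, `law_ph` is tried first and fails at once because the chemical is not H+, so the search falls through to `law_concentration`.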
Step 3: Use Law 4 to Compute Volume
[Diagram: Law 4 matched against the Question 26 graph. The volume of the Mixture is computed as .1 liters, the sum of the raw materials' volumes (0.07 lit and 0.03 lit). The conc. of Na+ (??*molar) and the quantity of Na+ (??*moles) are still unknown.]
Step 4: Use Law 3 to Compute Quantity
[Diagram: Law 3 matched against the Question 26 graph. The volume of the Mixture is .1*liters. The quantity of Na+ in the Mixture (??*moles) is the sum of the quantities of the Na+ parts of the two raw materials, each still unknown (??*moles).]
Step 5: Use Law 2 to Compute Quantity of Ionic Parts
[Diagram: Law 2 matched against a raw-material solution in the Question 26 graph. The quantity of the solution's Na+ part (??*moles) depends on the quantity of the strong electrolyte (??*moles), which is still unknown. The volume of the Mixture is .1*liters.]
Step 6: Use Law 1’ to Compute Quantity
[Diagram: Law 1’ matched against the Na2CO3 solution in the Question 26 graph. From conc. 3.0 M and volume 0.07 liters, the quantity is computed as .21 moles. The quantities of the two Na+ parts are still unknown (??*moles).]
Step 7: Wind out of Law 2 from step 5
[Diagram: Law 2 returns. With a strong-electrolyte quantity of .21*moles, the quantity of the Na+ part of the Na2CO3 solution is .42 moles. The Na+ quantity for the NaHCO3 solution and the quantity of Na+ in the Mixture are still unknown (??*moles).]
Steps 8-10: Similar to steps 5-7
[Diagram: the same three steps applied to the NaHCO3 solution in the Question 26 graph: from conc. 1.0 M and volume 0.03 liters, its quantity is computed as .03 moles. The Na+ part of the Na2CO3 solution remains .42*moles; the quantity of Na+ in the Mixture is still unknown (??*moles).]
Step 11: Wind out of Law 3 from Step 4
[Diagram: Law 3 returns. The quantity of Na+ in the Mixture is .45 moles, the sum of the Na+ quantities in the raw materials (.03*moles and .42*moles). The conc. of Na+ is still unknown (??*molar).]
Step 12: Wind out of Law 1 from Step 2
[Diagram: Law 1 returns. With a quantity of .45*moles of Na+ and a Mixture volume of .1*liters, the conc. of Na+ in the Mixture is 4.5 molar.]
Question 26 Answer
When 70 ml of 3.0-Molar Na2CO3 is added to 30 ml of 1.0-Molar NaHCO3, what is the resulting concentration of Na+?
The concentration of a chemical in a mixture is the quantity of the chemical divided by the volume of the mixture.
By the Law of Conservation of Mass, the quantity of a chemical in a mixture is the sum of the quantities of that chemical in the parts of the mix.
In the na2co3 strong-electrolyte-solution and the nahco3 strong-electrolyte-solution :
In the na-plus :
Multiply the concentration and the volume:
3 molar * 70 milliliter = 0.21 mole.
The quantity of na-plus in the na-plus is 0.42 mole.
In the co3-2 :
The quantity of na-plus in the co3-2 is 0 mole.
Multiply the concentration and the volume:
1 molar * 30 milliliter = 0.03 mole.
In the na-plus :
The quantity of na-plus in the na-plus is 0.03 mole.
In the hco3- :
The quantity of na-plus in the hco3- is 0 mole.
The quantity of na-plus in the na2co3 strong-electrolyte-solution and the nahco3 strong-electrolyte-solution is 0.45 mole.
Therefore, the quantity of na-plus = 0.45 mole.
By the Law of Conservation of Volume, the volume of a mixture is the sum of the volumes of the parts mixed.
The sum of 70 milliliter and 30 milliliter = 0.10 liter.
Therefore, the volume of the strong-electrolyte-solution strong-electrolyte-solution mixture = 0.10 liter.
Divide the quantity by the volume:.
0.45 mole / 0.10 liter = 4.50 molar.
Therefore, the concentration of na-plus in the strong-electrolyte-solution strong-electrolyte-solution mixture = 4.50 molar.
When 70 ml of 3.0-Molar Na2CO3 is added to 30 ml of 1.0-Molar NaHCO3, the resulting concentration of Na+ is 4.50 molar
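For reference, the arithmetic behind the generated explanation can be written compactly as follows. The symbols n (quantity in moles), V (volume in liters), and the superscripts (1) and (2) for the two raw-material solutions are notation introduced here, not taken from the slides.

```latex
\begin{align*}
n_{\mathrm{Na_2CO_3}} &= 3.0\ \mathrm{M} \times 0.070\ \mathrm{L} = 0.21\ \mathrm{mol}
  && \text{(Law 1')} \\
n^{(1)}_{\mathrm{Na^+}} &= 2 \times 0.21\ \mathrm{mol} = 0.42\ \mathrm{mol}
  && \text{(Law 2)} \\
n_{\mathrm{NaHCO_3}} &= 1.0\ \mathrm{M} \times 0.030\ \mathrm{L} = 0.03\ \mathrm{mol}
  && \text{(Law 1')} \\
n^{(2)}_{\mathrm{Na^+}} &= 1 \times 0.03\ \mathrm{mol} = 0.03\ \mathrm{mol}
  && \text{(Law 2)} \\
n_{\mathrm{Na^+}} &= 0.42\ \mathrm{mol} + 0.03\ \mathrm{mol} = 0.45\ \mathrm{mol}
  && \text{(Law 3)} \\
V &= 0.070\ \mathrm{L} + 0.030\ \mathrm{L} = 0.10\ \mathrm{L}
  && \text{(Law 4)} \\
[\mathrm{Na^+}] &= \frac{0.45\ \mathrm{mol}}{0.10\ \mathrm{L}} = 4.5\ \mathrm{M}
  && \text{(Law 1)}
\end{align*}
```

This corresponds to choice d) in the original multiple-choice question.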
Results of Project Halo
• After a 4-month development effort, the knowledge systems were sequestered and given a test:
– 165 novel questions: 50 multiple choice; 115 free-form response
– Questions translated from English to formal language by each team, then assessed for fidelity by an independent committee
• High likelihood of long-term follow-on
Correctness
• The SRI team’s correctness score corresponds to an AP score of 3, high enough for credit at UCSD, UIUC, and many other universities.
• We predict a score of 85% after a 3-month follow-on project.
Explanation Quality
Our Long Term Goal
• to enable distributed communities of domain experts to build knowledge systems in their area of expertise …
– without direct help from knowledge engineers
– working with familiar concepts and without writing axioms
– with little more effort than writing technical papers
Our Current Focus
• Insight: even domain-specific representations contain common abstractions
• Approach: we build a library consisting of
– a small hierarchy of reusable, composable, domain-independent knowledge units (“components”)
– a small vocabulary of relations to connect them
then domain experts build representations by instantiating and composing these components
[Diagram: Bioremediation represented as a graph. Nodes: Bioremediation, Script, Get, Apply, BreakDown, Absorb, Bio-technologist, Microbes, Oil, Fertilizer, Soil, Rate, and two Amounts. Edge labels: script, then, se, agent, patient, pollutant, remediator, environment, contains, product, absorbed, amount, rate, and the qualitative influences Q+, Q-, I-.]
Building a Representation Compositionally
[Diagram: the Bioremediation graph shown alongside a Conversion component. Conversion nodes: Conversion, two Substances, Rate, and Amounts. Conversion edge labels: raw-materials, product, rate, amount, and the qualitative influences Q+, Q-, I-.]
An underlying abstraction...
[Diagram: the Bioremediation graph shown alongside a Digest component. Digest nodes: Digest, Script, BreakDown, Absorb, Substance, Agent. Digest edge labels: script, then, se, agent, patient, food, eater, absorbed.]
Another abstraction...
[Diagram: the Bioremediation graph with its Get and Apply steps, Microbes, and Script grouped together (edge labels: patient, script, then), alongside the label TreatmentAgent.]
Another abstraction...
[Diagram: a component whose Script consists of Get then Apply, applied to a substance. Edge labels: script, then, se, patient.]
Examples of Concepts Described Compositionally
• a Fuel-Cell is a Producer of Electricity
• a Bulb is an Electrical Resistor that Produces Light
• a Camera is an Image Recording Device
• a Wire is a Conduit of Electricity
Library Contents
• actions: things that happen, change states
– Enter, Copy, Replace, Transfer, etc.
• states: relatively temporally stable events
– Be-Closed, Be-Attached-To, Be-Confined, etc.
• entities: things that are
– Substance, Place, Object, etc.
• roles: things that are, but only in the context of things that happen
– Container, Catalyst, Barrier, Vehicle, etc.
Library Contents
• relations between events, entities, roles
– agent, donor, object, recipient, result, etc.
– content, part, material, possession, etc.
– causes, defeats, enables, prevents, etc.
– purpose, plays, etc.
• properties between events/entities and values
– rate, frequency, intensity, direction, etc.
– size, color, integrity, shape, etc.
Computational Semantics
• Knowledge about Enter:
– instances of Enter inherit axioms from Move, such as: the action changes the location of the object of the Move
– before the Enter, the object is outside some enclosure
– after the Enter, the object is inside that enclosure and contained by it
– during the Enter, the object passes through a portal of the enclosure
– if the portal has a covering, it must be open; and unless it is known to be closed, assume that it’s open
– etc.
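The axioms for Enter can be approximated with ordinary Python classes. This sketch uses class inheritance in place of KM's axiom inheritance and a three-valued field for the default rule; the class names `Portal`, `Move`, and `Enter` mirror the slide, but every attribute and method name is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Portal:
    has_covering: bool = False
    covering_open: Optional[bool] = None  # None means "not known"

    def is_passable(self) -> bool:
        # Default rule: unless the covering is known to be closed, assume open.
        if not self.has_covering:
            return True
        return self.covering_open is not False


class Move:
    def __init__(self, obj: dict, destination: str):
        self.obj = obj
        self.destination = destination

    def apply(self) -> None:
        # Axiom inherited by subclasses: a Move changes the object's location.
        self.obj["location"] = self.destination


class Enter(Move):
    def __init__(self, obj: dict, enclosure: str, portal: Portal):
        super().__init__(obj, destination=enclosure)
        self.enclosure = enclosure
        self.portal = portal

    def apply(self) -> None:
        # Pre-condition: the object starts outside the enclosure.
        if self.obj.get("location") == self.enclosure:
            raise ValueError("object is already inside the enclosure")
        # During-condition: the object passes through an open portal.
        if not self.portal.is_passable():
            raise ValueError("portal covering is known to be closed")
        super().apply()
        # Post-condition: the object is contained by the enclosure.
        self.obj["contained_by"] = self.enclosure


person = {"location": "hallway"}
Enter(person, "room", Portal(has_covering=True)).apply()
```

The door's state is unknown in this usage example, so the default applies and the Enter succeeds.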
Searching the Library
• browsing the hierarchy top-down
• WordNet-based search
– all components have hooks to WordNet
– climb the WordNet hypernym tree with search terms
– assemble: Attach, Come-Together
– mend: Repair
– infiltrate: Enter, Traverse, Penetrate, Move-Into
– gum-up: Block, Obstruct
– busted: Be-Broken, Be-Ruined
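The hypernym climb can be sketched with a toy lookup in Python. The `HYPERNYM` and `HOOKS` tables below are invented for illustration and are not real WordNet data or the library's actual hooks; real WordNet allows several hypernyms per sense, which this single-parent sketch ignores.

```python
# Toy hypernym links (term -> more general term); a stand-in for WordNet.
HYPERNYM = {
    "infiltrate": "enter",
    "enter": "move",
    "mend": "repair",
    "repair": "improve",
}

# Hypothetical hooks from WordNet-style terms to library components.
HOOKS = {
    "enter": ["Enter"],
    "move": ["Move"],
    "repair": ["Repair"],
}


def find_components(term):
    """Climb from `term` toward the root, collecting hooked components."""
    found, seen = [], set()
    while term is not None and term not in seen:
        seen.add(term)  # protects against accidental cycles in the table
        found.extend(HOOKS.get(term, []))
        term = HYPERNYM.get(term)
    return found


matches = find_components("infiltrate")
```

Components are returned most specific first, so a user would see the closest match before its generalizations.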
First Challenge Problem
• To enable biologists to encode college-level textbook knowledge about cells
• A small example: mRNA-Transport
• “mRNA is transported out of the cell nucleus into the cytoplasm”
• Transport: Move-Out-Of
[Diagram labels: unify, location]
Evaluation
• Can Domain Experts learn to use the library to encode domain knowledge?
• Can sophisticated knowledge be captured through composition of components?
Methodology
• train biologists (4 graduate students) for six days
• have them encode knowledge from a college textbook, Essential Cell Biology by Bruce Alberts
• supply end-of-the-chapter-style Biology questions
• have the biologists pose the questions to their knowledge bases and record the answers
• have another biologist evaluate the answers on a scale of 0-3
• qualitatively evaluate their KBs
Some Example Questions
• What nucleotide base pairs with adenine in RNA?
• How is uracil in RNA like thymine in DNA?
• What is the relationship between thymine and uracil?
• For a given bacterial gene, how are bacterial RNA and DNA molecules different?
• Describe RNA as a kind of polymer.
• What are the four bases/nucleotides of RNA?
• What is the relationship between a DNA gene and its RNA transcription product?
Evaluation — Productivity
[Chart: axioms (× 1000) over time, from 6/25 to 7/30, with series Structural, Implication, and Total; vertical axis from 0.0 to 2.5.]
Evaluation — Question Answering
Summary
• Knowledge Systems offer significant benefits compared with expert systems
• Multi-functional knowledge bases can be built
• … by domain experts, almost
• … and they will be, with or without sound principles of ontological engineering
• … and ontologists can significantly improve the results
Discussion
• Will the idiosyncrasies of specific domains overshadow the commonalities coded in the component library?
• How can NLP be used to pull information from text to build knowledge systems?
• How can knowledge acquisition systems use machine learning?