Knowledge Systems and Project Halo In collaboration with SRI (Vinay Chaudhri) and Boeing (Peter Clark)

Upload: robert-ward

Post on 18-Jan-2016

TRANSCRIPT

Page 1: Knowledge Systems and Project Halo In collaboration with SRI (Vinay Chaudhri) and Boeing (Peter Clark)

Knowledge Systems and Project Halo

In collaboration with

SRI (Vinay Chaudhri)

and Boeing (Peter Clark)

Page 2

Knowledge Systems

• Knowledge Systems are formal representations of knowledge capable of answering unanticipated questions with coherent explanations

• Knowledge System = KB + Q/A + Explanation Generator + Knowledge Acq. tools

Page 3

Project Halo

• Funded and administered by Vulcan, Inc – a Paul Allen company

• Objective: to assess the state of the art of knowledge systems – computer programs that know a lot and answer tough questions with coherent explanations

• Method: administer an AP Chemistry exam to knowledge systems built by 4 teams of researchers

Page 4

A Significant Advance over Expert Systems

• Coverage

• Reasoning

• Explanation

• Rapid construction

Page 5

KM: A Logic Programming Language

• …able to represent:

– classes, instances, prototypes

– defaults, fluents, constraints

– (hypothetical) situations

– actions (pre-, post-, and during-conditions)

• …and reason about:

– inheritance with exceptions

– deductive and abductive inference (with constraints)

– automatic classification (given a partial description of an instance, determine the classes to which it belongs)

– temporal projection (“my car is where I left it”)

– effects of actions
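To make "inheritance with exceptions" concrete, here is a minimal Python sketch. It is not KM syntax; the `Frame` class and the bird and penguin examples are hypothetical illustrations. A slot value is looked up locally first, then inherited as a default from the parent, so a subclass can override what it would otherwise inherit.

```python
from typing import Optional


class Frame:
    """A class or instance with local slot values and one parent (hypothetical)."""

    def __init__(self, name: str, parent: Optional["Frame"] = None, **slots):
        self.name = name
        self.parent = parent
        self.slots = dict(slots)

    def get(self, slot: str):
        """Return the most specific value: local first, then inherited defaults."""
        frame = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.parent
        raise KeyError(f"{self.name} has no value for {slot!r}")


# Illustrative knowledge, not taken from the slides.
bird = Frame("Bird", can_fly=True, covering="feathers")
penguin = Frame("Penguin", parent=bird, can_fly=False)  # exception to the default
tweety = Frame("Tweety", parent=bird)
opus = Frame("Opus", parent=penguin)
```

Here `opus` inherits `covering` from `Bird` but takes `can_fly` from the more specific `Penguin` frame, which is the exception.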

Page 6

A Simple Example

• When 70 ml of 3.0-Molar Na2CO3 is added to 30 ml of 1.0-Molar NaHCO3 the resulting concentration of Na+ is:

a) 2.0 M
b) 2.4 M
c) 4.0 M
d) 4.5 M
e) 7.0 M

Page 7

Question Representation

[Figure: concept graph for Question 26. A Mix event has two raw materials, both Aqueous Solutions: one with base Na2CO3, concentration 3.0 M, and volume 0.07 liters; the other with base NaHCO3, concentration 1.0 M, and volume 0.03 liters. The result of the Mix is a Mixture that has Na+ as a part. The legend marks context and output; the output is the unknown concentration (??) of Na+.]
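One way to picture the question graph as data is a dictionary of nodes and slots. This is only a sketch: the node names (`solution-1`, `na-plus`, and so on) and slot names are hypothetical, while the values come from the question itself. `None` stands for the "??" that the system must compute.

```python
# Hypothetical node and slot names; the numbers come from Question 26.
question_26 = {
    "mix": {"isa": "Mix", "raw-material": ["solution-1", "solution-2"], "result": "mixture"},
    "solution-1": {"isa": "Aqueous-Solution", "base": "Na2CO3",
                   "conc_molar": 3.0, "volume_liters": 0.07},
    "solution-2": {"isa": "Aqueous-Solution", "base": "NaHCO3",
                   "conc_molar": 1.0, "volume_liters": 0.03},
    "mixture": {"isa": "Mixture", "has-part": ["na-plus"]},
    "na-plus": {"isa": "Na+", "conc_molar": None},  # None marks the unknown ("??")
}


def unknowns(graph):
    """List (node, slot) pairs whose value is still unknown."""
    return [(node, slot)
            for node, slots in graph.items()
            for slot, value in slots.items()
            if value is None]
```

Calling `unknowns(question_26)` returns the single goal of the question: the concentration of Na+ in the mixture.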

Page 8

Background Knowledge

Chemistry laws:

1. Concentration of a solute

2. Composition of strong electrolyte solutions

3. Conservation of mass

4. Conservation of volume

etc.

Page 9

Law 1: Concentration of a Solute

The concentration of a chemical in a mixture is the quantity of the chemical divided by the volume of the mixture.

Explanation Template:

Divide the quantity by the volume: <Quantity> / <Volume> = X *molar

Therefore, the concentration of <Chemical> in <Mixture> = X *molar

[Figure: Compute-Concentration method. A Mixture has a volume (Volume *liters) and has a Chemical as a part; the Chemical has a quantity (Quantity *moles); the concentration is Concentration *molar. The legend marks context, input, and output.]

Note: when this law is applied, using Novak’s code, the quantities are automatically converted to the units of measurement specified here.
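Law 1 and its explanation template can be sketched in Python as follows. The conversion tables are a simple stand-in for the automatic unit conversion the note mentions, not a reproduction of Novak's code, and the function and table names are assumptions made for illustration.

```python
# Stand-in unit tables (hypothetical); they convert inputs to the law's units.
TO_LITERS = {"liter": 1.0, "milliliter": 1e-3}
TO_MOLES = {"mole": 1.0, "millimole": 1e-3}


def compute_concentration(quantity, quantity_unit, volume, volume_unit):
    """Law 1: concentration (molar) = quantity (moles) / volume (liters)."""
    moles = quantity * TO_MOLES[quantity_unit]
    liters = volume * TO_LITERS[volume_unit]
    if liters <= 0:
        raise ValueError("volume must be positive")
    molar = moles / liters
    # Fill in the explanation template with the converted values.
    explanation = (f"Divide the quantity by the volume: "
                   f"{moles:g} mole / {liters:g} liter = {molar:g} molar")
    return molar, explanation
```

For example, `compute_concentration(0.45, "mole", 100, "milliliter")` converts the volume to liters before dividing and returns the concentration together with the filled-in explanation line.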

Page 10

Law 2: Composition of Strong Electrolytes

[Figure: Compute-Ions-in-Strong-Electrolyte method. A Strong Electrolyte has an Anion and a Cation as parts; the electrolyte and each ion have a quantity (Quantity *moles). The legend marks context, input, and output.]
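The slide gives Law 2 only as a diagram, so the following Python sketch rests on an assumption: a strong electrolyte dissociates completely, and each ion's quantity is its count per formula unit times the electrolyte's quantity. The `DISSOCIATION` table and function name are hypothetical; the ion counts are standard chemistry.

```python
# Ions released per formula unit; the table layout is a hypothetical illustration.
DISSOCIATION = {
    "Na2CO3": {"Na+": 2, "CO3-2": 1},
    "NaHCO3": {"Na+": 1, "HCO3-": 1},
}


def compute_ions(electrolyte, moles):
    """Law 2 sketch: ion quantity = ions per formula unit * electrolyte quantity."""
    return {ion: count * moles
            for ion, count in DISSOCIATION[electrolyte].items()}
```

With 0.21 mole of Na2CO3 this yields 0.42 mole of Na+, which matches the value shown in the worked solution later in the deck.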

Page 11

Law 3: Conservation of Mass

[Figure: Conservation of Mass method. A Mix has raw materials Chemical1 … Chemicaln and a result; each raw material has the Chemical as a part with a quantity (Quantity *moles). The output is the quantity (?? *moles) of the Chemical that is part of the result. The legend marks context, input, and output.]

Explanation Template:

By the Law of Conservation of Mass, the quantity of a chemical in a mixture is the sum of the quantities of that chemical in the parts of the mix.

The quantity of <Chemical> in <Chemical1> is X1 *moles
…
The quantity of <Chemical> in <Chemicaln> is Xn *moles

Therefore, the quantity of <Chemical> = X *moles
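Law 3 and its explanation template reduce to a sum over the parts mixed. Here is a minimal Python sketch; the function name and the shape of the `parts` argument are assumptions made for illustration.

```python
def conserve_mass(chemical, parts):
    """Law 3: quantity of `chemical` in the mix = sum over the parts mixed.

    `parts` maps each raw material's name to {chemical name: moles}.
    Returns the total in moles and the filled-in explanation lines.
    """
    lines = []
    total = 0.0
    for part, contents in parts.items():
        moles = contents.get(chemical, 0.0)  # a part may not contain the chemical
        lines.append(f"The quantity of {chemical} in {part} is {moles:g} *moles")
        total += moles
    lines.append(f"Therefore, the quantity of {chemical} = {total:g} *moles")
    return total, lines
```

Each input part contributes one explanation line, followed by the concluding "Therefore" line, mirroring the template.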

Page 12

Law 4: Conservation of Volume

[Figure: Conservation of Volume method. A Mix has raw materials Chemical1 … Chemicaln, each with a volume in its own unit of measurement (Volume <uom1> … Volume <uomn>), and a result Mixture whose volume (?? *liter) is the output. The legend marks context, input, and output.]

Explanation Template:

By the Law of Conservation of Volume, the volume of a mixture is the sum of the volumes of the parts mixed.

The sum of X1 <uom1>, … and Xn <uomn> = X *liter

Therefore, the volume of <Mixture> = X *liter
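Law 4 differs from Law 3 mainly in that each input volume may arrive in its own unit. A minimal Python sketch, with a hypothetical conversion table and function name:

```python
# Hypothetical conversion table; inputs are normalized to liters before summing.
TO_LITER = {"liter": 1.0, "milliliter": 0.001}


def conserve_volume(volumes):
    """Law 4: volume of the mixture (liters) = sum of the volumes mixed.

    `volumes` is a list of (value, unit) pairs.
    """
    return sum(value * TO_LITER[unit] for value, unit in volumes)
```

For Question 26, the inputs would be `[(70, "milliliter"), (30, "milliliter")]`.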

Page 13

Step 1: Reclassify Terms

[Figure: the Question 26 graph (a Mix of two Aqueous Solutions, Na2CO3 at 3.0 M and 0.07 liters and NaHCO3 at 1.0 M and 0.03 liters, resulting in a Mixture that has Na+ as a part), with a Strong Electrolyte Solution node attached by a superclass arc.]

Page 14

Step 2: Use Law 1 to Compute Concentration

[Figure: the Law 1 template (a Mixture with a volume in liters, a Chemical part with a quantity in moles, and a concentration in molar) matched against the Question 26 graph. Three values are unknown: the concentration of Na+ (?? *molar), the volume of the Mixture (?? *liters), and the quantity of Na+ (?? *moles).]

Page 15

The Search is non-deterministic

• Multiple laws might be used to compute a value for any property. For example, here’s another way to compute concentration:

pH = - log [H+], where [H+] is the concentration of H+

• Since this applies only to H+, this search path ends quickly
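The idea that several laws compete for the same property, and that inapplicable ones fail fast, can be sketched in Python. This is a deliberately simplified illustration, not the system's actual search: the function names, the `facts` dictionary, and the first-success strategy are all assumptions.

```python
def law_1(chemical, facts):
    """Concentration = quantity / volume; needs both values."""
    if "quantity" in facts and "volume" in facts:
        return facts["quantity"] / facts["volume"]
    return None  # dead end


def ph_law(chemical, facts):
    """From pH = -log [H+]: [H+] = 10 ** (-pH). Applies only to H+."""
    if chemical != "H+" or "pH" not in facts:
        return None  # dead end: this search path ends immediately
    return 10 ** (-facts["pH"])


# Candidate laws for the property "concentration", tried in order.
LAWS = [("pH law", ph_law), ("Law 1", law_1)]


def concentration(chemical, facts):
    """Try each candidate law in turn and return the first that succeeds."""
    for name, law in LAWS:
        value = law(chemical, facts)
        if value is not None:
            return name, value
    return None, None
```

For Na+ the pH law is rejected at once and the search falls through to Law 1; for H+ with a known pH, the pH law succeeds.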

Page 16

Step 3: Use Law 4 to Compute Volume

[Figure: the Law 4 template (a Mix whose raw materials and result each have a volume in liters) matched against the Question 26 graph. The volume of the Mixture is computed as .1 *liters; the concentration (?? *molar) and quantity (?? *moles) of Na+ are still unknown.]

Page 17

Step 4: Use Law 3 to Compute Quantity

[Figure: the Law 3 template (a Mix whose raw materials each have a Chemical part with a quantity in moles, and whose result has that Chemical as a part) matched against the Question 26 graph. Each Aqueous Solution now has an Na+ part with an unknown quantity (?? *moles); the volume of the Mixture is .1 *liters; the concentration (?? *molar) and quantity (?? *moles) of Na+ in the Mixture are still unknown.]

Page 18

Step 5: Use Law 2 to Compute Quantity of Ionic Parts

[Figure: the Law 2 template (a Strong Electrolyte with Anion and Cation parts, each with a quantity in moles) matched against the Question 26 graph. The quantity of the strong electrolyte (?? *moles) and the quantities of the Na+ parts (?? *moles) are still unknown.]

Page 19

Step 6: Use Law 1’ to Compute Quantity

[Figure: the Law 1 template, applied as Law 1’ to compute a quantity from a concentration and a volume, matched against the Question 26 graph. For the Na2CO3 solution (3.0 M, 0.07 liters) the quantity is .21 *moles.]

Page 20

Step 7: Wind out of Law 2 from step 5

[Figure: the Law 2 template matched against the Question 26 graph. From .21 *moles of Na2CO3, the quantity of its Na+ part is .42 *moles.]

Page 21

Step 8-10: Similar to steps 5-7

[Figure: the Question 26 graph after repeating the same steps for the NaHCO3 solution. The quantities shown are .03 *moles (NaHCO3 solution) and .21 *moles (Na2CO3 solution); the Na+ part of the Na2CO3 solution is .42 *moles.]

Page 22

Step 11: Wind out of Law 3 from Step 4

[Figure: the Law 3 template matched against the Question 26 graph. The Na+ quantities in the two solutions, .42 *moles and .03 *moles, sum to .45 *moles of Na+ in the Mixture.]

Page 23

Step 12: Wind out of Law 1 from Step 2

[Figure: the Law 1 template matched against the Question 26 graph. With .45 *moles of Na+ and a Mixture volume of .1 *liters, the concentration of Na+ is 4.5 *molar.]

Page 24

Question 26 Answer

When 70 ml of 3.0-Molar Na2CO3 is added to 30 ml of 1.0-Molar NaHCO3, what is the resulting concentration of Na+?

The concentration of a chemical in a mixture is the quantity of the chemical divided by the volume of the mixture.

By the Law of Conservation of Mass, the quantity of a chemical in a mixture is the sum of the quantities of that chemical in the parts of the mix.

In the na2co3 strong-electrolyte-solution and the nahco3 strong-electrolyte-solution :

In the na-plus :

Multiply the concentration and the volume:

3 molar * 70 milliliter = 0.21 mole.

The quantity of na-plus in the na-plus is 0.42 mole.

In the co3-2 :

The quantity of na-plus in the co3-2 is 0 mole.

Multiply the concentration and the volume:

1 molar * 30 milliliter = 0.03 mole.

In the na-plus :

The quantity of na-plus in the na-plus is 0.03 mole.

In the hco3- :

The quantity of na-plus in the hco3- is 0 mole.

The quantity of na-plus in the na2co3 strong-electrolyte-solution and the nahco3 strong-electrolyte-solution is 0.45 mole.

Therefore, the quantity of na-plus = 0.45 mole.

By the Law of Conservation of Volume, the volume of a mixture is the sum of the volumes of the parts mixed.

The sum of 70 milliliter and 30 milliliter = 0.10 liter.

Therefore, the volume of the strong-electrolyte-solution strong-electrolyte-solution mixture = 0.10 liter.

Divide the quantity by the volume:.

0.45 mole / 0.10 liter = 4.50 molar.

Therefore, the concentration of na-plus in the strong-electrolyte-solution strong-electrolyte-solution mixture = 4.50 molar.

When 70 ml of 3.0-Molar Na2CO3 is added to 30 ml of 1.0-Molar NaHCO3, the resulting concentration of Na+ is 4.50 molar
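The arithmetic behind this explanation can be checked with a short Python sketch that chains the four laws in the order the steps use them. The data layout and names are hypothetical, and the `na_per_unit` counts (2 for Na2CO3, 1 for NaHCO3) encode the complete-dissociation assumption behind Law 2.

```python
ML = 0.001  # liters per milliliter

# Values from Question 26; the dictionary layout is a hypothetical illustration.
solutions = [
    {"base": "Na2CO3", "molar": 3.0, "liters": 70 * ML, "na_per_unit": 2},
    {"base": "NaHCO3", "molar": 1.0, "liters": 30 * ML, "na_per_unit": 1},
]


def na_moles(solution):
    # Law 1': quantity = concentration * volume
    electrolyte_moles = solution["molar"] * solution["liters"]
    # Law 2: Na+ ions per formula unit times the electrolyte quantity
    return solution["na_per_unit"] * electrolyte_moles


total_moles = sum(na_moles(s) for s in solutions)    # Law 3: sum over the parts
total_liters = sum(s["liters"] for s in solutions)   # Law 4: sum of the volumes
na_concentration = total_moles / total_liters        # Law 1: quantity / volume

print(f"{na_concentration:.2f} molar")  # prints "4.50 molar"
```

The intermediate values agree with the explanation: 0.42 mole and 0.03 mole of Na+, 0.45 mole in total, in 0.10 liter.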

Page 25

Results of Project Halo

• After a 4-month development effort, the knowledge systems were sequestered and given a test:

– 165 novel questions: 50 multiple choice; 115 free-form response

– Questions translated from English to formal language by each team, then assessed for fidelity by an independent committee

• High likelihood of long-term follow-on

Page 26

Correctness

• The SRI team’s correctness score corresponds to an AP score of 3 – high enough for credit at UCSD, UIUC, and many other universities.

• We’ve predicted scoring 85% after a 3-month follow-on project.

Page 27

Explanation Quality

Page 28

Our Long Term Goal

• to enable distributed communities of domain experts to build knowledge systems in their area of expertise …

– without direct help from knowledge engineers

– working with familiar concepts and without writing axioms

– with little more effort than writing technical papers

Page 29

Our Current Focus

• Insight: even domain-specific representations contain common abstractions

• Approach: we build a library consisting of

– a small hierarchy of reusable, composable, domain-independent knowledge units (“components”)

– a small vocabulary of relations to connect them

then domain experts build representations by instantiating and composing these components

Page 30

Building a Representation Compositionally

[Figure: concept graph for Bioremediation. Nodes: Bioremediation, Oil, Fertilizer, Microbes, Soil, Bio-technologist, Rate, two Amount nodes, a Script, and the events Get, Apply, BreakDown, and Absorb. Arc labels: environment, contains, amount, product, absorbed, agent, patient, script, pollutant, remediator, rate, se, and then, plus the influence markers Q+, I-, Q-, I-.]

Page 31

An underlying abstraction...

[Figure: the Bioremediation concept graph, shown next to a Conversion component. Conversion nodes: Conversion, Substance (twice), Amount (twice), Rate. Arc labels: raw-materials, product, amount, rate, plus the influence markers Q+, I-, Q-, I-.]

Page 32

Another abstraction...

[Figure: the Bioremediation concept graph, shown next to a Digest component. Digest nodes: Digest, Agent, Substance, Script, BreakDown, Absorb. Arc labels: eater, food, script, agent, patient, absorbed, se, then.]

Page 33

Another abstraction...

[Figure: the Bioremediation concept graph, shown next to a Treatment component. Treatment nodes: Treatment, Agent, substance (twice), Script, Get, Apply. Arc labels: script, patient, then, se.]

Page 34

Examples of Concepts Described Compositionally

• a Fuel-Cell is a Producer of Electricity

• a Bulb is an Electrical Resistor that Produces Light

• a Camera is an Image Recording Device

• a Wire is a Conduit of Electricity

Page 35

Library Contents

• actions: things that happen, change states

– Enter, Copy, Replace, Transfer, etc.

• states: relatively temporally stable events

– Be-Closed, Be-Attached-To, Be-Confined, etc.

• entities: things that are

– Substance, Place, Object, etc.

• roles: things that are, but only in the context of things that happen

– Container, Catalyst, Barrier, Vehicle, etc.

Page 36

Library Contents

• relations between events, entities, roles

– agent, donor, object, recipient, result, etc.

– content, part, material, possession, etc.

– causes, defeats, enables, prevents, etc.

– purpose, plays, etc.

• properties between events/entities and values

– rate, frequency, intensity, direction, etc.

– size, color, integrity, shape, etc.

Page 37

Computational Semantics

• Knowledge about Enter:

– instances of Enter inherit axioms from Move, such as: the action changes the location of the object of the Move

– before the Enter, the object is outside some enclosure

– after the Enter, the object is inside that enclosure and contained by it

– during the Enter, the object passes through a portal of the enclosure

– if the portal has a covering, it must be open; and unless it is known to be closed, assume that it’s open

– etc.
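The axioms for Enter can be illustrated with a small Python sketch. This is not the library's encoding; the `Portal` and `Thing` classes and the `enter` function are hypothetical, and they model only the conditions listed above, including the default that a covering is assumed open unless known to be closed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Portal:
    has_covering: bool = False
    is_open: Optional[bool] = None  # None means "not known"

    def passable(self) -> bool:
        # Default rule: unless the covering is known to be closed, assume it is open.
        return not (self.has_covering and self.is_open is False)


@dataclass
class Thing:
    name: str
    location: str


def enter(obj: Thing, enclosure: str, portal: Portal) -> None:
    """Check the before- and during-conditions, then assert the after-condition."""
    if obj.location == enclosure:
        raise ValueError("before-condition failed: object is already inside")
    if not portal.passable():
        raise ValueError("during-condition failed: portal covering is closed")
    obj.location = enclosure  # inherited from Move: the object's location changes
```

A portal whose covering has unknown state lets the object through, while one explicitly marked closed blocks the action.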

Page 38

Searching the Library

• browsing the hierarchy top-down

• WordNet-based search

– all components have hooks to WordNet

– climb the WordNet hypernym tree with search terms

– assemble: Attach, Come-Together

mend: Repair

infiltrate: Enter, Traverse, Penetrate, Move-Into

gum-up: Block, Obstruct

busted: Be-Broken, Be-Ruined
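The hypernym climb can be sketched in Python with toy data. The two tables below are hand-made stand-ins, not real WordNet content or the library's actual hooks, and the single-parent chain is a simplification (a WordNet term can have several hypernyms).

```python
# Toy, hand-made stand-ins for WordNet hypernym links and component hooks.
HYPERNYM = {"infiltrate": "penetrate", "penetrate": "enter",
            "enter": "move", "mend": "repair"}
HOOKS = {"penetrate": "Penetrate", "enter": "Enter",
         "move": "Move", "repair": "Repair"}


def find_components(term):
    """Climb from `term` toward the root, collecting every hooked component."""
    found, seen = [], set()
    while term is not None and term not in seen:
        seen.add(term)  # guard against cycles in the toy data
        if term in HOOKS:
            found.append(HOOKS[term])
        term = HYPERNYM.get(term)  # None once the chain runs out
    return found
```

With this toy data, searching for "mend" reaches the Repair component, and a term with no links returns an empty list.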

Page 39

First Challenge Problem

• To enable biologists to encode college-level textbook knowledge about cells

• A small example: mRNA-Transport

• “mRNA is transported out of the cell nucleus into the cytoplasm”

• Transport: Move-Out-Of

Page 40: Knowledge Systems and Project Halo In collaboration with SRI (Vinay Chaudhri) and Boeing (Peter Clark)
Page 41: Knowledge Systems and Project Halo In collaboration with SRI (Vinay Chaudhri) and Boeing (Peter Clark)
Page 42: Knowledge Systems and Project Halo In collaboration with SRI (Vinay Chaudhri) and Boeing (Peter Clark)
Page 43: Knowledge Systems and Project Halo In collaboration with SRI (Vinay Chaudhri) and Boeing (Peter Clark)
Page 44: Knowledge Systems and Project Halo In collaboration with SRI (Vinay Chaudhri) and Boeing (Peter Clark)
Page 45: Knowledge Systems and Project Halo In collaboration with SRI (Vinay Chaudhri) and Boeing (Peter Clark)

unify

Page 46: Knowledge Systems and Project Halo In collaboration with SRI (Vinay Chaudhri) and Boeing (Peter Clark)

location

Page 47

Evaluation

• Can Domain Experts learn to use the library to encode domain knowledge?

• Can sophisticated knowledge be captured through composition of components?

Page 48

Methodology

• train biologists (4 graduate students) for six days

• have them encode knowledge from a college textbook, Essential Cell Biology by Bruce Alberts

• supply end-of-the-chapter-style Biology questions

• have the biologists pose the questions to their knowledge bases and record the answers

• have another biologist evaluate the answers on a scale of 0-3

• qualitatively evaluate their KBs

Page 49

Some Example Questions

• What nucleotide base pairs with adenine in RNA?

• How is uracil in RNA like thymine in DNA?

• What is the relationship between thymine and uracil?

• For a given bacterial gene, how are bacterial RNA and DNA molecules different?

• Describe RNA as a kind of polymer.

• What are the four bases/nucleotides of RNA?

• What is the relationship between a DNA gene and its RNA transcription product?

Page 50

Evaluation — Productivity

[Figure: chart of axioms (× 1000; axis from 0.0 to 2.5) by date, from 6/25 to 7/30 in weekly steps, with three series: Structural, Implication, and Total.]

Page 51

Evaluation — Question Answering

Page 52

Summary

• Knowledge Systems offer significant benefits compared with expert systems

• Multi-functional knowledge bases can be built

• … by domain experts, almost

• … and they will be, with or without sound principles of ontological engineering

• … and ontologists can significantly improve the results

Page 53

Discussion

• Will the idiosyncrasies of specific domains overshadow the commonalities coded in the component library?

• How can NLP be used to pull information from text to build knowledge systems?

• How can knowledge acquisition systems use machine learning?