Building a System that can Learn by Reading

Post on 14-Jan-2016

DESCRIPTION

Building a System that can Learn by Reading. Kevin Livingston, PhD Candidate, Cognitive Systems Division, EECS Department, Northwestern University. Presented at University of Dayton, November 3, 2006, as part of the Computer Science Research Colloquium Series.

TRANSCRIPT

1

Building a System that can Learn by Reading

Kevin Livingston
PhD Candidate

Cognitive Systems Division
EECS Department

Northwestern University

Presented at University of Dayton, November 3, 2006
As part of the Computer Science Research Colloquium Series

2

Understanding Systems

[Diagram: Text, Text Understanding System, Knowledge Base, Question Answering, Explanation, Reasoning]

3

Examples of Intelligent Systems

• Digital Assistants

• Conversational Agents
– Imagine a video game character that actually talked to you

• Intelligent Games

• Search

• Question Answering

4

Modeling and Using Knowledge

• Frames (Minsky)

• Scripts (Schank)

• Qualitative Reasoning (Forbus)

5

Semantic Memory

• Ontology
– Doctors are people
– Bombings are events

• Semantics
– Doctors have patients
– Bombings have targets

6

Episodic Memory

• Memory Instances
– Madrid is a city
– Madrid is in Spain
– The Madrid Bombings is a terrorist attack
– The Madrid Bombings occurred on March 11, 2004
– Al Qaida is a terrorist organization

7

Modeling Reasoning

• Logic Based Systems
– Cyc (Cycorp; Austin, TX)
– Fire (Qualitative Reasoning Group (QRG); Northwestern)

• Statistical
• Bayesian
• Markovian
• Neural Nets

8

Knowledge Bases

[Diagram: Text, Text Understanding System, Knowledge Base, Question Answering, Explanation, Reasoning]

9

Available Knowledge Bases

• Information Retrieval (IR) techniques
– mine information from Internet (MUC and TREC)

• Open Mind Common Sense (OMCS)
– Sentences collected from Internet contributors
– Mined for knowledge

• Knowledge Machine (KM)
– Frame based

• ResearchCyc
– Latest release ~3,000,000 assertions
– Predicate Logic in CycL
  • First order and some second order constructs

10

ResearchCyc

• Bombings are Attacks and Attacks are Events
(genls Bombing AttackOnObject)
(genls AttackOnObject Event)

• The Madrid terrorist attack was a bombing
(isa TerroristAttack-September-8-2003-Madrid Bombing)

• Al Qaida is an Islamist Terrorist Group
(isa AlQaida TerroristGroup-Islamist)
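The genls and isa assertions above support simple type inference: an instance of Bombing is also an instance of every collection that Bombing specializes. A minimal Python sketch of that idea follows. The dictionaries and the function names `all_genls` and `is_instance_of` are illustrative assumptions, not part of ResearchCyc or its API.

```python
# Toy type reasoning over the slide's assertions.
# Names and data layout are assumptions for illustration only.

# genls: direct "is a kind of" links between collections (from the slide).
GENLS = {
    "Bombing": {"AttackOnObject"},
    "AttackOnObject": {"Event"},
}

# isa: direct "is an instance of" links (from the slide).
ISA = {
    "TerroristAttack-September-8-2003-Madrid": {"Bombing"},
    "AlQaida": {"TerroristGroup-Islamist"},
}


def all_genls(collection):
    """Return the collection plus every collection reachable through genls."""
    seen = set()
    stack = [collection]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(GENLS.get(current, ()))
    return seen


def is_instance_of(instance, collection):
    """True if some direct type of the instance generalizes to the collection."""
    return any(collection in all_genls(t) for t in ISA.get(instance, ()))


attack = "TerroristAttack-September-8-2003-Madrid"
print(is_instance_of(attack, "Event"))     # Bombing -> AttackOnObject -> Event
print(is_instance_of("AlQaida", "Event"))  # no genls path to Event
```

The traversal is a plain graph walk, so it also handles collections with several direct generalizations.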

11

ResearchCyc (cont.)

• Location of an event
(eventOccursAt TerroristAttack-September-8-2003-Madrid CityOfMadridSpain)

• Perpetrator of an event
(perpetrator TerroristAttack-September-8-2003-Madrid AlQaida)

12

ResearchCyc (cont.)

• Person being killed
(organismKilled SpaceShuttleChallengerDisaster ChristaMcAuliffe)

• Deaths caused by an attack
(deathToll TerroristAttack-September-8-2003-Madrid Person 190)
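Ground assertions like the ones on these ResearchCyc slides can be queried with patterns that contain variables. The sketch below stores the slide's facts as Python tuples and matches them against a pattern whose `?`-prefixed terms are variables. The tuple encoding and the helper names `match` and `ask` are assumptions made for illustration; this is not how Cyc's inference engine is invoked.

```python
# Toy fact store and pattern query. Encoding and names are assumptions.

FACTS = [
    ("eventOccursAt", "TerroristAttack-September-8-2003-Madrid", "CityOfMadridSpain"),
    ("perpetrator", "TerroristAttack-September-8-2003-Madrid", "AlQaida"),
    ("deathToll", "TerroristAttack-September-8-2003-Madrid", "Person", 190),
    ("organismKilled", "SpaceShuttleChallengerDisaster", "ChristaMcAuliffe"),
]


def is_variable(term):
    """Variables are strings that start with '?', as in CycL."""
    return isinstance(term, str) and term.startswith("?")


def match(pattern, fact):
    """Match one pattern against one ground fact; return bindings or None."""
    if len(pattern) != len(fact):
        return None
    bindings = {}
    for p, f in zip(pattern, fact):
        if is_variable(p):
            # A variable must bind to the same value everywhere it appears.
            if bindings.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return bindings


def ask(pattern):
    """Return one bindings dict per fact that matches the pattern."""
    results = []
    for fact in FACTS:
        bindings = match(pattern, fact)
        if bindings is not None:
            results.append(bindings)
    return results


print(ask(("perpetrator", "?event", "?who")))
print(ask(("deathToll", "?event", "Person", "?count")))
```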

13

Building Knowledge Bases

• Slow and Tedious
– years to grow Cyc from 1.2M to 3M assertions

• Requires Training
– measured in weeks+, for GUI tools (SHAKEN)

• Expensive
– Project Halo estimates $10,000 per page of AP level Chemistry content!

14

Available Information

• Encyclopedias

• Newspapers

• Online sources

• Print

15

A Better Way?

Teach the Computer to Read

16

What we want to Read

• Episodic Knowledge
– New people
– New events

• General Knowledge
– “the heart is a pump”

17

Standard Model for Natural Language Processing

[Diagram: Text, Dictionary, POS Tagging, Tagged Text, Grammar, Syntactic Parser, Semantic Interpreter, Semantics]

18

“Time flies like an arrow.”

23

“Time flies like an arrow.”

• Time moves quickly just like an arrow does
• (You should) time flies like you would an arrow
• Time flies in the same way that an arrow would (time them)
• Time those flies that are like arrows
• A type of flying insect, “time-flies,” enjoy arrows (compare “Fruit flies like a banana.”)
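Part of this ambiguity already appears at the part-of-speech tagging stage of the standard model. The sketch below enumerates every tag sequence a toy dictionary allows for the sentence and keeps those that fit a few clause patterns. The dictionary entries, tag names, and patterns are assumptions chosen for this example, not the contents of any real tagger or grammar.

```python
# Toy enumeration of part-of-speech ambiguity. All data here is assumed.
from itertools import product

# Possible tags per word in a hypothetical dictionary.
DICTIONARY = {
    "time": ["Noun", "Verb"],
    "flies": ["Noun", "Verb"],
    "like": ["Verb", "Prep"],
    "an": ["Det"],
    "arrow": ["Noun"],
}

# Tag sequences a hypothetical grammar would accept, with a gloss for each.
PATTERNS = {
    ("Noun", "Verb", "Prep", "Det", "Noun"): "time moves the way an arrow does",
    ("Verb", "Noun", "Prep", "Det", "Noun"): "imperative: time the flies",
    ("Noun", "Noun", "Verb", "Det", "Noun"): "time-flies are fond of an arrow",
}


def tag_sequences(sentence):
    """Return every combination of dictionary tags for the sentence's words."""
    words = sentence.lower().strip(".").split()
    return [tuple(tags) for tags in product(*(DICTIONARY[w] for w in words))]


candidates = tag_sequences("Time flies like an arrow.")
readings = [seq for seq in candidates if seq in PATTERNS]

print(len(candidates), "tag sequences,", len(readings), "accepted")
for seq in readings:
    print(seq, "->", PATTERNS[seq])
```

Several readings listed on the slide share the imperative tag sequence and differ only in what the phrase "like an arrow" attaches to, so tagging alone cannot separate them.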

24

Picture of George Burns and Gracie Allen

25

Gracie, those are beautiful flowers.

Picture of George Burns and Gracie Allen

26

Gracie, those are beautiful flowers. Where did they come from?

Picture of George Burns and Gracie Allen

27

Don’t you remember, George?

Picture of George Burns and Gracie Allen

28

You said that if I went to visit Clara Bagley in the hospital I should be sure to take her flowers. So when she wasn’t looking, I did.

Picture of George Burns and Gracie Allen

29

Picture of George Burns and Gracie Allen

30

Common Sense

• “You said that if I went to visit Clara Bagley in the hospital I should be sure to take her flowers. So, when she wasn't looking, I did.”

– take flowers from her
– take flowers to her

31

Picture of Elevator Operator

32

Down?

Picture of Elevator Operator

33

That Way!

Picture of Elevator Operator

34

Context in the Environment

“Down?”

What does the question mean?
– Which way is down?
– Are you going down?

35

Language Understanding

Goals include understanding:
– The meaning of the text
– How it fits into what is known
– The purpose of being told

Requires:
– Common sense knowledge
– Awareness of context

Our model is to get to knowledge as quickly as possible, with as few intermediate steps as possible.

36

37

Reader Example

An attack occurred in Madrid
The bombing killed 190 people
The bombing was perpetrated by Al-Qaida

(eventOccursAt TerroristAttack-September-8-2003-Madrid CityOfMadridSpain)
(perpetrator TerroristAttack-September-8-2003-Madrid AlQaida)
(deathToll TerroristAttack-September-8-2003-Madrid Person 190)
(isa TerroristAttack-September-8-2003-Madrid Bombing)

38

Lexical Processing

“An attack occurred in Madrid.”

• “attack”
(singular Attack-TheWord "attack")
(denotation Attack-TheWord CountNoun 0 AttackOnObject)
(isa ?x AttackOnObject)

• “Madrid”
(placeName-Standard CityOfMadridSpain "Madrid")
(isa CityOfMadridSpain City)

39

Rule Pattern

• Pattern:
(isa ?event Event)
Occur-TheWord
In-TheWord
(isa ?location GeographicLocation)

• Results:
(eventOccursAt ?event ?location)

40

Pattern Matching

• Pattern:
(isa ?event Event) Occur-TheWord In-TheWord (isa ?location GeographicLocation)

• Input:
– “An”
– “attack” (isa ?x AttackOnObject)
– “occurred” Occur-TheWord
– “in” In-TheWord
– “Madrid” (isa CityOfMadridSpain City)
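The annotated input above is what a lexical lookup step would hand to the pattern matcher. A minimal sketch of such a lookup follows. The `LEXICON` and `PLACE_NAMES` tables and the function names are hypothetical stand-ins for the denotation and placeName-Standard assertions on the Lexical Processing slide; the real system reads these from the knowledge base.

```python
# Toy lexical lookup. Tables and function names are assumptions.

# Word forms mapped to their word constant and, for nouns, a denotation.
LEXICON = {
    "attack": {"word": "Attack-TheWord", "pos": "CountNoun",
               "denotation": "AttackOnObject"},
    "occurred": {"word": "Occur-TheWord"},
    "in": {"word": "In-TheWord"},
}

# Proper names mapped to (instance, collection).
PLACE_NAMES = {"madrid": ("CityOfMadridSpain", "City")}


def lexical_lookup(token):
    """Return an isa assertion, a word constant, or None for an unknown token."""
    key = token.lower().strip(".,")
    if key in PLACE_NAMES:
        instance, collection = PLACE_NAMES[key]
        return ("isa", instance, collection)
    entry = LEXICON.get(key)
    if entry is None:
        return None
    if "denotation" in entry:
        # A common noun introduces an unnamed instance of its denotation.
        return ("isa", "?x", entry["denotation"])
    return entry["word"]


def annotate(sentence):
    """Pair every token with its lexical annotation."""
    return [(token, lexical_lookup(token)) for token in sentence.split()]


for token, annotation in annotate("An attack occurred in Madrid."):
    print(token, annotation)
```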

41

Rule Completion

“An attack occurred in Madrid.”

• Pattern:
(isa ?event Event) Occur-TheWord In-TheWord (isa ?location GeographicLocation)

• Results:
(eventOccursAt ?event ?location)

• Constraints:
(isa ?event AttackOnObject)

• Bindings from Reading:
((?location . CityOfMadridSpain))
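The rule completion shown above can be sketched as a walk over the pattern and the annotated input in parallel. Named instances become bindings, and unnamed instances contribute a type constraint. This is a simplified illustration, not the Learning Reader implementation: it assumes the input lines up one-to-one with the pattern once unannotated words are dropped, and it assumes a genls link from City to GeographicLocation that the slides do not show.

```python
# Toy rule completion. Function names and the City link are assumptions.

GENLS = {
    "Bombing": "AttackOnObject",
    "AttackOnObject": "Event",
    "City": "GeographicLocation",  # assumed link, not shown on the slides
}

PATTERN = [
    ("isa", "?event", "Event"),
    "Occur-TheWord",
    "In-TheWord",
    ("isa", "?location", "GeographicLocation"),
]
RESULT = ("eventOccursAt", "?event", "?location")


def specializes(collection, ancestor):
    """True if collection equals ancestor or reaches it through genls."""
    while collection is not None:
        if collection == ancestor:
            return True
        collection = GENLS.get(collection)
    return False


def complete_rule(pattern, result, items):
    """Align annotated input with the pattern; return None on any mismatch."""
    items = [item for item in items if item is not None]
    if len(items) != len(pattern):
        return None
    bindings, constraints = {}, []
    for element, item in zip(pattern, items):
        if isinstance(element, str):
            if element != item:  # word constants must match exactly
                return None
            continue
        if not isinstance(item, tuple):
            return None
        _, variable, required = element
        _, referent, found = item
        if not specializes(found, required):
            return None
        if referent.startswith("?"):
            constraints.append(("isa", variable, found))
        else:
            bindings[variable] = referent
    return {"results": result, "constraints": constraints, "bindings": bindings}


annotated = [
    None,                                  # "An"
    ("isa", "?x", "AttackOnObject"),       # "attack"
    "Occur-TheWord",                       # "occurred"
    "In-TheWord",                          # "in"
    ("isa", "CityOfMadridSpain", "City"),  # "Madrid"
]
print(complete_rule(PATTERN, RESULT, annotated))
```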

42

Remindings

“An attack occurred in Madrid.”

• Pattern:
(isa ?event Event) Occur-TheWord In-TheWord (isa ?location GeographicLocation)

• Results:
(eventOccursAt ?event ?location)

• Constraints:
(isa ?event AttackOnObject)

• Bindings from Reading:
((?location . CityOfMadridSpain))

• Remindings from Memory:
((?event . TerrorAttack-Sept8-2003-Madrid))
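A reminding can be pictured as a memory lookup: find known instances that satisfy the type constraint and agree with the bindings already read from the text. The sketch below is one way to express that idea. The memory tables, the function names, and the second event (a made-up distractor with a hypothetical location constant) are assumptions for illustration only.

```python
# Toy reminding from memory. Data and names are assumptions.

GENLS = {"Bombing": "AttackOnObject", "AttackOnObject": "Event"}

# Episodic memory: instance -> direct type.
ISA = {
    "TerrorAttack-Sept8-2003-Madrid": "Bombing",
    "HypotheticalAttack-Elsewhere": "Bombing",  # made-up distractor
    "CityOfMadridSpain": "City",
}

# Episodic memory: event -> location (eventOccursAt).
EVENT_LOCATION = {
    "TerrorAttack-Sept8-2003-Madrid": "CityOfMadridSpain",
    "HypotheticalAttack-Elsewhere": "HypotheticalCity",  # made-up constant
}


def has_type(instance, collection):
    """True if the instance's direct type reaches the collection via genls."""
    current = ISA.get(instance)
    while current is not None:
        if current == collection:
            return True
        current = GENLS.get(current)
    return False


def remind(required_type, location):
    """Return known events of the required type that occurred at location."""
    return [
        instance
        for instance in ISA
        if has_type(instance, required_type)
        and EVENT_LOCATION.get(instance) == location
    ]


# Constraint (isa ?event AttackOnObject), binding ?location = CityOfMadridSpain.
candidates = remind("AttackOnObject", "CityOfMadridSpain")
print([("?event", c) for c in candidates])
```

An empty result would suggest the sentence describes a new instance rather than one already in memory.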

43

Coreference Resolution

An attack occurred in Madrid. The bombing was perpetrated by Al-Qaida.

44

Coreference Resolution

An attack occurred in Madrid
Results: (eventOccursAt ?event ?location)
Constraints: (isa ?event AttackOnObject)
Bindings from Reading: ((?location . CityOfMadridSpain))
Remindings from Memory: ((?event . TerrorAttack-Sept8-2003-Madrid))

The bombing was perpetrated by Al-Qaida
Results: (perpetrator ?action ?agent)
Constraints: (isa ?action Bombing)
Bindings from Reading: ((?agent . AlQaida))

49

Coreference

• Refer to a more general or specific type
– “bombing” and “attack”

• Consistent with being the same, have a known shared instance in memory
– “doctor” and “father of four”
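Both coreference criteria above can be written as a small compatibility test over types. The sketch below treats two mentions as possibly coreferent if one type generalizes the other, or if memory holds a single instance that has both types. The collection names Doctor and FatherOfFour, the example person, and the function names are hypothetical and chosen only to mirror the slide's examples.

```python
# Toy coreference compatibility check. Data and names are assumptions.

GENLS = {
    "Bombing": "AttackOnObject",
    "AttackOnObject": "Event",
}

# Known instances and every type recorded for them (hypothetical data).
INSTANCE_TYPES = {
    "HypotheticalPerson-1": {"Doctor", "FatherOfFour"},
}


def ancestors(collection):
    """Return the collection followed by everything it specializes."""
    chain = []
    while collection is not None:
        chain.append(collection)
        collection = GENLS.get(collection)
    return chain


def may_corefer(type_a, type_b):
    """Apply the two criteria: related types, or a shared known instance."""
    if type_a in ancestors(type_b) or type_b in ancestors(type_a):
        return True
    return any({type_a, type_b} <= types for types in INSTANCE_TYPES.values())


print(may_corefer("AttackOnObject", "Bombing"))  # more general / more specific
print(may_corefer("Doctor", "FatherOfFour"))     # shared instance in memory
print(may_corefer("Bombing", "Doctor"))          # neither criterion holds
```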

50

Tracking Ambiguity

• Words are ambiguous
– “Bush”
  • A shrubbery?
  • A president? Which one?

• Sentences and Phrases
– “take her flowers”
– “Iraq borders Iran on the North.”

• Intention
– “Down?”
– “Where is Baghdad?”

51

% of sentences processed under time

[Chart: % of corpus sentences processed (0 to 100) against time in ms (log scale, 1 to 1E+07); series: RDE, SLE, WLE]
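A curve of this kind can be computed from per-sentence processing times by counting, for each time threshold, the share of sentences that finished at or under it. The sketch below shows that calculation. The timings are made-up placeholders, not measurements from the talk, and the function name `percent_under` is an assumption.

```python
# Toy computation of "% of sentences processed under time".
# The timings below are placeholders, not data from the talk.


def percent_under(times_ms, thresholds_ms):
    """Return (threshold, percent of sentences finished within threshold)."""
    total = len(times_ms)
    if total == 0:
        raise ValueError("need at least one timing")
    return [
        (t, 100.0 * sum(1 for x in times_ms if x <= t) / total)
        for t in thresholds_ms
    ]


timings_ms = [3, 40, 75, 900, 12000]             # hypothetical timings
thresholds_ms = [10 ** k for k in range(0, 8)]   # 1 ms up to 1E+07 ms

for threshold, percent in percent_under(timings_ms, thresholds_ms):
    print(threshold, percent)
```

Powers of ten are used as thresholds to mirror the chart's logarithmic time axis.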

52

What Learning Reader can Read

• Information about existing instances
– “the 2003 Madrid bombing”

• Information extending existing instances
– “the attack killed 190 people”

• Information representing new instances
– “An attack occurred in Al Anbar.”

53

What Learning Reader can’t Read

• Larger patterns
– Scripts: “A bomb went off.” “6 people were arrested.”

• Generalizations
– “All countries have political leaders.”
– “The heart is a pump.”

54

Acknowledgements

Dr. Christopher K. Riesbeck
Dr. Ken Forbus
Dr. Larry Birnbaum
Abhishek Sharma

Dr. Jennifer Seitzer
Dr. Saverio Perugini

grant HR0011-04-1-0051

55

For more Information

http://cs.northwestern.edu/~livingston/
http://cs.northwestern.edu/~livingston/talks.html

Kevin Livingston
PhD Candidate

Cognitive Systems Division
EECS Department

Northwestern University
