Page 1: Learning to “Read Between the Lines” using Bayesian Logic Programs

Learning to “Read Between the Lines” using Bayesian Logic Programs

Sindhu Raghavan, Raymond Mooney, and Hyeonseo Ku

The University of Texas at Austin
July 2012

Page 2: Learning to “Read Between the Lines” using Bayesian Logic Programs

Information Extraction

• Information extraction (IE) systems extract factual information that occurs in text [Cowie and Lehnert, 1996; Sarawagi, 2008]

• Natural language text is typically “incomplete”
  – Commonsense information is not explicitly stated
  – Easily inferred facts are omitted from the text

• Human readers use commonsense knowledge and “read between the lines” to infer implicit information

• IE systems have no access to commonsense knowledge and hence cannot infer implicit information

Page 3: Learning to “Read Between the Lines” using Bayesian Logic Programs

Example

Natural language text: “Barack Obama is the President of the United States of America.”

Query: “Barack Obama is a citizen of what country?”

IE systems cannot answer this query since citizenship information is not explicitly stated!

Page 4: Learning to “Read Between the Lines” using Bayesian Logic Programs

Objective

• Infer implicit facts from explicitly stated information
  – Extract explicitly stated facts using an IE system
  – Learn commonsense knowledge in the form of logical rules to deduce additional facts
  – Employ models from statistical relational learning (SRL) that allow probabilities to be estimated using well-founded probabilistic graphical models

Page 5: Learning to “Read Between the Lines” using Bayesian Logic Programs

Related Work

• Learning propositional rules [Nahm and Mooney, 2000]
  – Learn propositional rules from the output of an IE system on computer-related job postings
  – Perform logical deduction to infer new facts
  – Purely logical deduction is brittle
    • Cannot assign probabilities or confidence estimates to inferences

Page 6: Learning to “Read Between the Lines” using Bayesian Logic Programs

Related Work

• Learning first-order rules
  – Logical deduction using probabilistic rules [Carlson et al., 2010; Doppa et al., 2010]
    • Modify existing rule learners like FOIL and FARMER to learn probabilistic rules
    • Probabilities are not computed using well-founded probabilistic graphical models
  – Markov Logic Network (MLN) [Domingos and Lowd, 2009] based approaches to infer additional facts [Schoenmackers et al., 2010; Sorower et al., 2011]
    • Grounding process could result in intractably large networks for large domains

Page 7: Learning to “Read Between the Lines” using Bayesian Logic Programs

Related Work

• Learning for Textual Entailment [Lin and Pantel, 2001; Yates and Etzioni, 2007; Berant et al., 2011]
  – Textual entailment rules have a single antecedent in the body of the rule
  – Approaches from statistical relational learning have not been applied so far
  – Do not use extractions from a traditional IE system to learn rules

Page 8: Learning to “Read Between the Lines” using Bayesian Logic Programs

Our Approach

• Use an off-the-shelf IE system to extract facts

• Learn commonsense knowledge from the extracted facts in the form of probabilistic first-order rules

• Infer additional facts based on the learned rules using Bayesian Logic Programs (BLPs) [Kersting and De Raedt, 2001]

Page 9: Learning to “Read Between the Lines” using Bayesian Logic Programs

System Architecture

Training pipeline: Training Documents → Information Extractor (IBM SIRE) → Extracted Facts → Inductive Logic Programming (LIME) → First-Order Logical Rules → BLP Weight Learner (version of EM) → Bayesian Logic Program (BLP)

Test pipeline: Test Document → Information Extractor → Extractions → BLP Inference Engine → Inferences with probabilities

Example annotations from the slide:

Training text: “Barack Obama is the current President of USA……. Obama was born on August 4, 1961, in Hawaii, USA.”

Extracted facts:
nationState(USA)
person(BarackObama)
isLedBy(USA,BarackObama)
hasBirthPlace(BarackObama,USA)
hasCitizenship(BarackObama,USA)

Learned first-order rules:
nationState(B) ∧ isLedBy(B,A) ⇒ hasCitizenship(A,B)
nationState(B) ∧ employs(B,A) ⇒ hasCitizenship(A,B)

Bayesian clauses with noisy-or parameters:
hasCitizenship(A,B) | nationState(B), isLedBy(B,A).   0.9
hasCitizenship(A,B) | nationState(B), employs(B,A).   0.6

Test extractions:
nationState(malaysian)
person(mahathir-mohamad)
isLedBy(malaysian,mahathir-mohamad)
employs(malaysian,mahathir-mohamad)

Inference with probability:
hasCitizenship(mahathir-mohamad, malaysian)   0.75

Page 10: Learning to “Read Between the Lines” using Bayesian Logic Programs

Bayesian Logic Programs [Kersting and De Raedt, 2001]

• Set of Bayesian clauses a | a1, a2, …, an
  – Definite clauses in first-order logic, universally quantified
  – Head of the clause: a
  – Body of the clause: a1, a2, …, an
  – Associated conditional probability table (CPT): P(head | body)

• Bayesian predicates a, a1, a2, …, an have finite domains
  – Combining rule like noisy-or for mapping multiple CPTs into a single CPT

• Given a set of Bayesian clauses and a query, SLD resolution is used to construct ground Bayesian networks for probabilistic inference
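
To make the clause structure concrete, here is a minimal Python sketch (with hypothetical names, not the authors' implementation) of a Bayesian clause and the noisy-or combining rule:

    # Minimal sketch of Bayesian clauses with a noisy-or combining rule.
    # All names here are hypothetical, not from the authors' system.
    from dataclasses import dataclass
    from typing import List

    @dataclass
    class BayesianClause:
        head: str          # e.g., "hasCitizenship(A,B)"
        body: List[str]    # e.g., ["nationState(B)", "isLedBy(B,A)"]
        weight: float      # noisy-or parameter: P(head | body satisfied)

    def noisy_or(weights: List[float]) -> float:
        """Probability the head holds when the bodies of all these
        ground clauses are satisfied: 1 - prod(1 - w_i)."""
        p_none = 1.0
        for w in weights:
            p_none *= (1.0 - w)
        return 1.0 - p_none

    # Two ground rules supporting the same head (parameters from slide 9):
    clauses = [
        BayesianClause("hasCitizenship(A,B)",
                       ["nationState(B)", "isLedBy(B,A)"], 0.9),
        BayesianClause("hasCitizenship(A,B)",
                       ["nationState(B)", "employs(B,A)"], 0.6),
    ]
    print(noisy_or([c.weight for c in clauses]))  # 0.96 when both bodies hold

With both bodies satisfied, these illustrative parameters give 1 − (1 − 0.9)(1 − 0.6) = 0.96; the 0.75 shown on slide 9 reflects the deck's actual learned parameters and priors, which the transcript does not give.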

Page 11: Learning to “Read Between the Lines” using Bayesian Logic Programs

Why BLPs?

• Pure logical deduction is brittle and results in many undifferentiated inferences

• Inference in BLPs is probabilistic, i.e., inferences are assigned probabilities
  – Probabilities can be used to select only high-confidence inferences

• Efficient grounding mechanism in BLPs enables our approach to scale

Page 12: Learning to “Read Between the Lines” using Bayesian Logic Programs

Inductive Logic Programming (ILP) for learning first-order rules

Target relation: hasCitizenship(X,Y)

Positive instances:
hasCitizenship(BarackObama, USA)
hasCitizenship(GeorgeBush, USA)
hasCitizenship(IndiraGandhi, India)
…

Negative instances (generated using the closed-world assumption; see the sketch after this slide):
hasCitizenship(BarackObama, India)
hasCitizenship(GeorgeBush, India)
hasCitizenship(IndiraGandhi, USA)
…

KB:
hasBirthPlace(BarackObama,USA)
person(BarackObama)
nationState(USA)
nationState(India)
…

The ILP rule learner outputs rules such as:
nationState(Y) ∧ isLedBy(Y,X) ⇒ hasCitizenship(X,Y)
…
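
Since the slide notes the negatives come from the closed-world assumption, here is a minimal Python sketch of that generation step; the variable names are hypothetical:

    # Sketch: generate negative instances for hasCitizenship under the
    # closed-world assumption: any person-country pair not listed as a
    # positive instance is treated as a negative. Names are hypothetical.
    from itertools import product

    positives = {("BarackObama", "USA"), ("GeorgeBush", "USA"),
                 ("IndiraGandhi", "India")}
    persons = {p for p, _ in positives}
    countries = {c for _, c in positives}

    negatives = {(p, c) for p, c in product(persons, countries)
                 if (p, c) not in positives}
    # e.g., ("BarackObama", "India"), ("IndiraGandhi", "USA"), ...
    print(sorted(negatives))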

Page 13: Learning to “Read Between the Lines” using Bayesian Logic Programs

Inference using BLPs

Test document: “Malaysian Prime Minister Mahathir Mohamad Wednesday announced for the first time that he has appointed his deputy Abdullah Ahmad Badawi as his successor.”

Extracted facts:
nationState(malaysian)
person(mahathir-mohamad)
isLedBy(malaysian,mahathir-mohamad)
employs(malaysian,mahathir-mohamad)

Learned rules:
nationState(B) ∧ isLedBy(B,A) ⇒ hasCitizenship(A,B)
nationState(B) ∧ employs(B,A) ⇒ hasCitizenship(A,B)

Page 14: Learning to “Read Between the Lines” using Bayesian Logic Programs

Logical Inference in BLPs

Rule 1: nationState(B) ∧ isLedBy(B,A) ⇒ hasCitizenship(A,B)

Matching the facts nationState(malaysian) and isLedBy(malaysian,mahathir-mohamad) against Rule 1 derives hasCitizenship(mahathir-mohamad, malaysian).

Page 15: Learning to “Read Between the Lines” using Bayesian Logic Programs

Logical Inference in BLPs

Rule 2: nationState(B) ∧ employs(B,A) ⇒ hasCitizenship(A,B)

Matching the facts nationState(malaysian) and employs(malaysian,mahathir-mohamad) against Rule 2 derives the same fact, hasCitizenship(mahathir-mohamad, malaysian).
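
Here is a minimal Python sketch of the grounding step shown on these two slides: matching each rule body against the extracted facts to derive the same ground conclusion. The representation is hypothetical, not the BLP engine's actual data structures:

    # Sketch: ground the two citizenship rules against the extracted facts.
    # Facts are (predicate, args) tuples; both rules have the shape
    # nationState(B) ∧ <rel>(B,A) ⇒ hasCitizenship(A,B).
    facts = {
        ("nationState", ("malaysian",)),
        ("person", ("mahathir-mohamad",)),
        ("isLedBy", ("malaysian", "mahathir-mohamad")),
        ("employs", ("malaysian", "mahathir-mohamad")),
    }

    rules = ["isLedBy", "employs"]  # the <rel> of each rule body

    derived = set()
    for rel in rules:
        for pred, args in facts:
            if pred == rel:
                b, a = args
                if ("nationState", (b,)) in facts:
                    derived.add(("hasCitizenship", (a, b)))

    # Both rules derive the same ground fact, so the set holds one entry:
    print(derived)  # {('hasCitizenship', ('mahathir-mohamad', 'malaysian'))}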

Page 16: Learning to “Read Between the Lines” using Bayesian Logic Programs

Probabilistic Inference in BLPs

[Ground Bayesian network figure: nationState(malaysian) and isLedBy(malaysian, mahathir-mohamad) feed a logical-and node (dummy1); nationState(malaysian) and employs(malaysian, mahathir-mohamad) feed a second logical-and node (dummy2); dummy1 and dummy2 are combined by a noisy-or node into hasCitizenship(mahathir-mohamad, malaysian), whose marginal probability is the query. The CPT entries are not captured in the transcript.]
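
A minimal Python sketch of the computation this network performs, using the illustrative parameters 0.9 and 0.6 from slide 9 (the learned weights behind the deck's 0.75 are not given in the transcript):

    # Sketch: marginal of hasCitizenship in the tiny ground network above.
    # The evidence facts are observed true, so each logical-and node is
    # true, and the noisy-or node combines the two active ground rules.
    # Parameters are the illustrative values from slide 9, not learned.
    w_isLedBy, w_employs = 0.9, 0.6

    dummy1 = True  # logical-and of nationState(malaysian), isLedBy(...)
    dummy2 = True  # logical-and of nationState(malaysian), employs(...)

    active = [w for w, on in [(w_isLedBy, dummy1),
                              (w_employs, dummy2)] if on]
    p_head = 1.0
    for w in active:
        p_head *= (1.0 - w)
    p_head = 1.0 - p_head
    print(p_head)  # 0.96 with these parameters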

Page 17: Learning to “Read Between the Lines” using Bayesian Logic Programs

Sample rules learned:
governmentOrganization(A) ∧ employs(A,B) ⇒ hasMember(A,B)
eventLocation(A,B) ∧ bombing(A) ⇒ thingPhysicallyDamage(A,B)
isLedBy(A,B) ⇒ hasMemberPerson(A,B)

Page 18: Learning to “Read Between the Lines” using Bayesian Logic Programs

Experimental Evaluation

• Data
  – DARPA’s intelligence community (IC) data set from the Machine Reading Project (MRP)
  – Consists of news articles on politics, terrorism, and other international events
  – 10,000 documents in total

• Perform 10-fold cross validation

Page 19: Learning to “Read Between the Lines” using Bayesian Logic Programs

Experimental Evaluation

• Learning first-order rules using LIME [McCreath and Sharma, 1998]
  – Learn rules for 13 target relations
  – Learn rules using both positive and negative instances and using only positive instances
  – Include all unique rules learned from the different models

• Learning BLP parameters
  – Learn noisy-or parameters using Expectation Maximization (EM)
  – Set priors to maximum likelihood estimates
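
The deck does not spell out its EM variant, so the following is only a sketch of one standard EM scheme for noisy-or parameters (with a hidden per-rule "fired" variable), under hypothetical names; the authors' weight learner may differ:

    # Sketch: EM for noisy-or parameters. Each training case lists which
    # ground rules had satisfied bodies ("active") and whether the head
    # was observed true. Hidden variable Z_i = "rule i fired". E-step:
    # P(Z_i=1 | head=1) = w_i / (1 - prod_j(1 - w_j)) over active rules;
    # if head=0, every active rule's Z_i must be 0. M-step: w_i is the
    # expected firing count over the cases where rule i was active.
    def em_noisy_or(cases, n_rules, iters=50):
        w = [0.5] * n_rules                 # uniform initialization
        for _ in range(iters):
            num = [0.0] * n_rules           # expected firings
            den = [0.0] * n_rules           # times body was active
            for active, head in cases:
                p_none = 1.0
                for i in active:
                    p_none *= (1.0 - w[i])
                for i in active:
                    den[i] += 1.0
                    if head:                # head true: share credit
                        num[i] += w[i] / (1.0 - p_none)
            w = [num[i] / den[i] if den[i] > 0 else w[i]
                 for i in range(n_rules)]
        return w

    # Toy data: rule 0 fires for ~9/10 of its cases; rule 1 is weaker.
    cases = ([([0], True)] * 9 + [([0], False)] +
             [([1], True)] * 6 + [([1], False)] * 4 +
             [([0, 1], True)] * 2)
    print(em_noisy_or(cases, n_rules=2))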

Page 20: Learning to “Read Between the Lines” using Bayesian Logic Programs

Experimental Evaluation

• Performance evaluation
  – Manually evaluated inferred facts from 40 documents, randomly selected from each test set
  – Compute two precision scores (see the sketch after this slide):
    • Unadjusted (UA) – does not account for the extractor’s mistakes
    • Adjusted (AD) – accounts for the extractor’s mistakes
  – Rank inferences using marginal probabilities and evaluate the top-n
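
As a concrete check, here is a minimal Python sketch computing both scores from the logical-deduction counts on the backup slide (443 correct out of 1490 total inferences, of which 1257 rest on correct extractions); the function names are hypothetical:

    # Sketch: unadjusted vs. adjusted precision. UA divides by all
    # inferences; AD drops the inferences that were wrong only because
    # the underlying extraction was wrong. Counts from the backup slide.
    def unadjusted_precision(correct, total):
        return 100.0 * correct / total

    def adjusted_precision(correct, total, wrong_due_to_extractor):
        return 100.0 * correct / (total - wrong_due_to_extractor)

    print(unadjusted_precision(443, 1490))     # 29.73
    print(adjusted_precision(443, 1490, 233))  # 35.24 (1490 - 233 = 1257)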

Page 21: Learning to “Read Between the Lines” using Bayesian Logic Programs

Experimental Evaluation

• Systems compared
  – BLP Learned Weights
    • Noisy-or parameters learned using online EM
  – BLP Manual Weights
    • Noisy-or parameters set to 0.9
  – Logical Deduction
  – MLN Learned Weights
    • Learn weights using a generative online weight learner
  – MLN Manual Weights
    • Assign a weight of 10 to all rules and MLE priors to all predicates

Page 22: Learning to “Read Between the Lines” using Bayesian Logic Programs

Unadjusted Precision
[Results chart not captured in the transcript.]

Page 23: Learning to “Read Between the Lines” using Bayesian Logic Programs

Adjusted Precision
[Results chart not captured in the transcript.]

Page 24: Learning to “Read Between the Lines” using Bayesian Logic Programs

Future Work

• Improve the performance of weight learning for BLPs and MLNs
  – Learn parameters on larger data sets

• Improve the performance of MLNs
  – Use the open-world assumption for learning
  – Add constraints required to prevent inference of facts like employs(a,a)
  – Specialize predicates that do not have strictly defined argument types

• Develop an online rule learner that can learn rules from uncertain training data

Page 25: Learning to “Read Between the Lines” using Bayesian Logic Programs

Conclusions

• Efficient learning of probabilistic first-order rules that represent commonsense knowledge using extractions from an IE system

• Inference of implicitly stated facts with high precision using BLPs

• Superior performance of BLPs over purely logical deduction and MLNs

Page 26: Learning to “Read Between the Lines” using Bayesian Logic Programs

Questions?

Page 27: Learning to “Read Between the Lines” using Bayesian Logic Programs

Backup Slides

Page 28: Learning to “Read Between the Lines” using Bayesian Logic Programs

Results for Logical Deduction

                 UA                 AD
Precision (%)    29.73 (443/1490)   35.24 (443/1257)

Page 29: Learning to “Read Between the Lines” using Bayesian Logic Programs

Experimental Evaluation

• Learning BLP parameters
  – Use the logical-and model to combine evidence from the conjuncts in the body of a clause
  – Use the noisy-or model to combine evidence from several ground rules that have the same head
  – Learn noisy-or parameters using Expectation Maximization (EM)
  – Set priors to maximum likelihood estimates