Natural Logic for Textual Inference
Bill MacCartney and Christopher D. Manning
NLP Group
Stanford University
29 June 2007
Inferences involving monotonicity
Premise: Few states completely forbid casino gambling.
OK (valid inferences):
• Few western states completely forbid casino gambling.
• Few or no states completely forbid casino gambling.
• Few states completely forbid gambling.
No (invalid inferences):
• Few states completely forbid casino gambling for kids.
• Few states or cities completely forbid casino gambling.
• Few states restrict gambling.
What kind of textual inference system could predict this?
Introduction • Foundations of Natural Logic • The NatLog System • Experiments with FraCaS • Experiments with RTE • Conclusion
Textual inference: a spectrum of approaches
[Figure: approaches ordered from "robust, but shallow" to "deep, but brittle", with natural logic marked as a point on the spectrum]
• lexical/semantic overlap (Jijkoun & de Rijke 2005)
• patterned relation extraction (Romano et al. 2006)
• pred-arg structure matching (Hickl et al. 2006)
• FOL & theorem proving (Bos & Markert 2006)
What is natural logic?
• A logic whose vehicle of inference is natural language
  • No formal notation
  • Just words & phrases: All men are mortal…
• Focus on a ubiquitous category of inference: monotonicity
  • I.e., reasoning about the consequences of broadening or narrowing the concepts or constraints in a proposition
• Precise, yet sidesteps difficulties of translating to FOL: idioms, intensionality and propositional attitudes, modalities, indexicals, reciprocals, scope ambiguities, quantifiers such as most, anaphoric adjectives, temporal and causal relations, aspect, unselective quantifiers, adverbs of quantification, donkey sentences, generic determiners, …
• Aristotle, Lakoff, van Benthem, Sánchez Valencia 1991
Outline
• Introduction
• Foundations of Natural Logic
• The NatLog System
• Experiments with FraCaS
• Experiments with RTE
• Conclusion
The entailment relation: ⊑
In natural logic, entailment is defined as an ordering relation ⊑ over expressions of all semantic types (not just sentences)
category                        semantic type      example(s)
common nouns                    e→t                penguin ⊑ bird
adjectives                      e→t                tiny ⊑ small
intransitive verbs              e→t                hover ⊑ fly
transitive verbs                e→e→t              kick ⊑ strike
temporal & locative modifiers   (e→t)→(e→t)        this morning ⊑ today; in Beijing ⊑ in China
connectives                     t→t→t              and ⊑ or
quantifiers                     (e→t)→t            everyone ⊑ someone
                                (e→t)→(e→t)→t      all ⊑ most ⊑ some
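The ordering view of entailment can be made concrete with a toy lexicon. This is a minimal sketch, not part of NatLog: the `BROADER` dictionary and the `entails` function are hypothetical names, and the lexicon is hand-built from the examples in the table above.

```python
# Toy lexicon: each expression maps to the expressions directly broader than it.
# Hand-built from the slide's examples; purely illustrative.
BROADER = {
    "penguin": {"bird"},
    "tiny": {"small"},
    "hover": {"fly"},
    "kick": {"strike"},
    "this morning": {"today"},
    "in Beijing": {"in China"},
    "and": {"or"},
    "everyone": {"someone"},
    "all": {"most"},
    "most": {"some"},
}


def entails(x: str, y: str) -> bool:
    """Return True if x ⊑ y under the reflexive-transitive closure of BROADER."""
    seen, stack = set(), [x]
    while stack:
        cur = stack.pop()
        if cur == y:
            return True
        if cur in seen:
            continue
        seen.add(cur)
        stack.extend(BROADER.get(cur, ()))
    return False
```

Because the relation is an ordering, it chains: `entails("all", "some")` holds via `most`, while the reverse direction does not.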
Monotonicity of semantic functions
In compositional semantics, meanings are seen as functions, and can have various monotonicity properties:
• Upward-monotone (↑M)
  • The default: "bigger" inputs yield "bigger" outputs
  • Example: broken. Since chair ⊑ furniture, broken chair ⊑ broken furniture
  • Heuristic: in a ↑M context, broadening edits preserve truth
• Downward-monotone (↓M)
  • Negatives, restrictives, etc.: "bigger" inputs yield "smaller" outputs
  • Example: doesn't. While hover ⊑ fly, doesn't fly ⊑ doesn't hover
  • Heuristic: in a ↓M context, narrowing edits preserve truth
• Non-monotone (#M)
  • Superlatives, some quantifiers (most, exactly n): neither ↑M nor ↓M
  • Example: most. While penguin ⊑ bird, most penguins # most birds
  • Heuristic: in a #M context, no edits preserve truth
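One way to see these definitions at work is to model predicates as sets of individuals and read ⊑ as the subset relation. The sketch below assumes a made-up eight-element domain; the set contents and the helper names (`broken`, `negate`, `leq`) are illustrative only.

```python
# Toy model: predicates are sets of individuals, and x ⊑ y means x is a subset of y.
UNIVERSE = frozenset(range(8))
CHAIR = frozenset({0, 1})
FURNITURE = frozenset({0, 1, 2, 3})   # chair ⊑ furniture
HOVER = frozenset({4})
FLY = frozenset({4, 5, 6})            # hover ⊑ fly
BROKEN = frozenset({1, 3, 5})


def leq(x, y):
    """x ⊑ y, read as set inclusion."""
    return x <= y


def broken(noun):
    """Intersective adjective: an upward-monotone function."""
    return noun & BROKEN


def negate(pred):
    """Negation ("doesn't"): a downward-monotone function."""
    return UNIVERSE - pred


# Upward monotone: the ordering of the inputs is preserved.
up_ok = leq(broken(CHAIR), broken(FURNITURE))
# Downward monotone: the ordering of the inputs is reversed.
down_ok = leq(negate(FLY), negate(HOVER))
```

Both `up_ok` and `down_ok` come out true in this model, while `leq(negate(HOVER), negate(FLY))` does not hold.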
Downward monotonicity
• restrictive quantifiers (no, few, at most n): few athletes ⊑ few sprinters
• negative & restrictive verbs (lack, fail, prohibit, deny): prohibit weapons ⊑ prohibit guns
• prepositions & adverbs (without, except, only): without clothes ⊑ without pants
• negative & restrictive nouns (ban, absence [of], refusal): drug ban ⊑ heroin ban
• the antecedent of a conditional: If stocks rise, we'll get real paid ⊑ If stocks soar, we'll get real paid
• explicit negation (no, n't): didn't dance ⊑ didn't tango
Downward-monotone constructions are widespread!
Monotonicity of binary functions
• Some quantifiers are best viewed as binary functions
• Different arguments can have different monotonicities
quantifier   arg 1   arg 2   examples
all          ↓       ↑       All ducks fly ⊑ All mallards fly; All ducks fly ⊑ All ducks move
some         ↑       ↑       Some mammals fly ⊑ Some animals fly; Some mammals fly ⊑ Some mammals move
no           ↓       ↓       No dogs fly ⊑ No poodles fly; No dogs fly ⊑ No dogs hover
not every    ↑       ↓       Not every bird flies ⊑ Not every animal flies; Not every bird flies ⊑ Not every bird hovers
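The per-argument monotonicity of these quantifiers can be checked by brute force over a small universe. The sketch below assumes the standard set-theoretic readings of the four quantifiers (so `all` has no existential import); the names `QUANTIFIERS` and `monotonicity` are hypothetical.

```python
from itertools import combinations

UNIVERSE = (0, 1, 2)


def subsets(items):
    """All subsets of items, as frozensets."""
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


# Quantifiers as binary functions over sets (restrictor a, scope b).
QUANTIFIERS = {
    "all": lambda a, b: a <= b,
    "some": lambda a, b: bool(a & b),
    "no": lambda a, b: not (a & b),
    "not every": lambda a, b: not (a <= b),
}


def monotonicity(q, arg):
    """Classify q in argument 0 or 1 as 'up', 'down', or 'non' by exhaustive check."""
    up = down = True
    sets = subsets(UNIVERSE)
    for fixed in sets:
        for small in sets:
            for big in sets:
                if not small <= big:
                    continue
                if arg == 0:
                    lo, hi = q(small, fixed), q(big, fixed)
                else:
                    lo, hi = q(fixed, small), q(fixed, big)
                if lo and not hi:
                    up = False     # truth was lost when broadening
                if hi and not lo:
                    down = False   # truth was lost when narrowing
    return "up" if up else "down" if down else "non"
```

For example, `monotonicity(QUANTIFIERS["all"], 0)` classifies the first argument of `all` as downward and `monotonicity(QUANTIFIERS["all"], 1)` classifies the second as upward, matching the table.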
Composition of monotonicity
• Composition of functions ⇒ composition of monotonicity
• Sánchez Valencia: a precise monotonicity calculus for CG (categorial grammar)
Composition table:
∘     ↑M    ↓M    #M
↑M    ↑M    ↓M    #M
↓M    ↓M    ↑M    #M
#M    #M    #M    #M
Example, with polarity marks projected onto each token:
Few(+) states(–) completely(–) forbid(–) casino(+) gambling(+)
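The composition table amounts to a sign rule with an absorbing non-monotone element. A minimal sketch, using hypothetical names `compose` and `polarity`, and assuming a token's polarity is found by folding the marks of the operators that dominate it:

```python
UP, DOWN, NON = "+", "-", "#"


def compose(outer, inner):
    """Monotonicity of the composition of two functions, per the table above."""
    if NON in (outer, inner):
        return NON                      # non-monotone absorbs everything
    return UP if outer == inner else DOWN


def polarity(path):
    """Fold compose over the marks on the path from the root down to a token."""
    result = UP                         # the top level is an upward context
    for mark in path:
        result = compose(result, mark)
    return result


# "states" sits in the first argument of "few" (downward).
states = polarity([DOWN])
# "casino gambling" sits under "forbid" (downward) inside the second
# argument of "few" (downward), so the two reversals cancel.
casino_gambling = polarity([DOWN, DOWN])
```

Here `states` comes out as `-` and `casino_gambling` as `+`, which is why narrowing "states" to "western states" and broadening "casino gambling" to "gambling" both preserve truth.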
The NatLog System
textual inference problem
→ 1. linguistic pre-processing
→ 2. alignment
→ 3. entailment classification
→ prediction
Step 1: Linguistic Pre-processing
• Tokenize & parse input sentences (future: & NER & coref & …)
• Identify & project monotonicity operators
  • Problem: PTB-style parse tree ≠ semantic structure!
  • Solution: specify projections in PTB trees using Tregex
Example: Few states completely forbid casino gambling
  POS tags: Few/JJ states/NNS completely/RB forbid/VBD casino/NN gambling/NN
  Phrase labels in the parse: NP, ADVP, NP, VP, S
  Projected polarity: Few(+) states(–) completely(–) forbid(–) casino(+) gambling(+)
Operator definition for "few":
  pattern: JJ < /^[Ff]ew$/
  arg1: ↓M on dominating NP
    __ >+(NP) (NP=proj !> NP)
  arg2: ↓M on dominating S
    __ >+(/.*/) (S=proj !> S)
Step 2: Alignment
• Alignment = a sequence of atomic edits [cf. Harmeling 07]
• Atomic edits over token spans: DEL, INS, SUB, ADV
• Limitations:
  • no easy way to represent movement
  • no alignments to non-contiguous sets of tokens
• Benefits:
  • well-defined sequence of intermediate forms
  • can use adaptation of Levenshtein string-edit DP
• We haven't (yet) invested much effort here
Example:
  P: Few states completely forbid casino gambling
  H: Few states have completely prohibited gambling
  Edits: INS (have), SUB (forbid → prohibited), DEL (casino); ADV over the remaining matched tokens
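The Levenshtein adaptation mentioned above can be sketched at the token level. This is a plain unit-cost edit-distance DP with a backtrace, not NatLog's actual aligner: it works on single tokens rather than spans, the function name `align` is hypothetical, and ADV is taken here to mean advancing over a matched token.

```python
def align(premise, hypothesis):
    """Return (cost, edits); edits is a list of (op, premise_token, hypothesis_token)."""
    p, h = premise.split(), hypothesis.split()
    n, m = len(p), len(h)
    # cost[i][j] = minimum number of edits turning p[:i] into h[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (p[i - 1] != h[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)
    # Backtrace from the bottom-right corner to recover one optimal edit sequence.
    edits, i, j = [], n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and p[i - 1] == h[j - 1]
                and cost[i][j] == cost[i - 1][j - 1]):
            edits.append(("ADV", p[i - 1], h[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            edits.append(("DEL", p[i - 1], None))
            i -= 1
        elif j > 0 and cost[i][j] == cost[i][j - 1] + 1:
            edits.append(("INS", None, h[j - 1]))
            j -= 1
        else:
            edits.append(("SUB", p[i - 1], h[j - 1]))
            i, j = i - 1, j - 1
    edits.reverse()
    return cost, edits


_, example_edits = align(
    "Few states completely forbid casino gambling",
    "Few states have completely prohibited gambling",
)
```

Several alignments tie at the minimum cost for this pair, so which one is returned depends on the tie-breaking order in the backtrace (match first, then DEL, INS, SUB).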
Step 3: Entailment Classification
• Atomic edits ⇒ atomic entailment problems
• Feature representation
  • Basic features: edit type, monotonicity, "light edit" feature
  • Lexical features for SUB edits: lemma sim, WN features
• Decision tree classifier
  • Trained on small data set designed to exercise feature space
  • Outputs an elementary entailment relation: = ⊏ ⊐ # |
• Composition of atomic entailment predictions
  • Fairly intuitive: ⊏ º ⊏ ⇒ ⊏, ⊏ º ⊐ ⇒ #, = º = ⇒ =, etc.
  • Composition yields global entailment prediction for problem
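Composing the atomic predictions can be sketched as a fold over a small join table. The rules for =, ⊏, ⊐ and # follow the intuitive behavior described on the slide; the rules involving | (exclusion) are an assumption derived from reading ⊏ as "entails" and | as "entails the negation of", and may differ from NatLog's actual table. The names `join` and `compose_all` are hypothetical.

```python
EQ, FWD, REV, IND, EXC = "=", "⊏", "⊐", "#", "|"


def join(a, b):
    """Compose two elementary entailment relations (a first, then b)."""
    if a == EQ:
        return b                        # equivalence is the identity
    if b == EQ:
        return a
    if IND in (a, b):
        return IND                      # independence absorbs everything
    if a == b and a in (FWD, REV):
        return a                        # forward and reverse are transitive
    # Assumption: exclusion survives only in these two configurations.
    if (a, b) in ((FWD, EXC), (EXC, REV)):
        return EXC
    return IND                          # e.g. forward then reverse


def compose_all(relations):
    """Fold join over a sequence of atomic predictions."""
    result = EQ
    for rel in relations:
        result = join(result, rel)
    return result
```

For instance, `compose_all(["=", "=", "⊏"])` yields `⊏`, while any sequence mixing `⊏` and `⊐` collapses to `#`.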
Entailment model example
P: Few states completely forbid casino gambling .
H: Few states have completely prohibited gambling .
Pipeline: featurize → predict → compose
edit   features                                                                               prediction
INS    type INS, mono down, isLight true                                                      = (equivalent)
SUB    type SUB, mono down, isLight false, lemSim 0.375, wnSyn 1.0, wnAnto 0.0, wnHypo 0.0    = (equivalent)
DEL    type DEL, mono up, isLight false                                                       ⊏ (forward)
Composed result: ⊏ (forward)
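To illustrate the featurize and predict steps, the sketch below replaces the trained decision tree with a few hand-written rules. Everything here is an assumption made for illustration: the rule set, the thresholds, and the function name `predict` are hypothetical, and deletion is treated as broadening, which holds for restrictive modifiers such as "casino" but not in general. Only the feature names and values are taken from the slide.

```python
def predict(edit):
    """Hand-written stand-in for the decision tree; returns one of = ⊏ ⊐ # |."""
    if edit.get("isLight"):
        return "="                       # light edits are assumed not to change meaning
    kind, mono = edit["type"], edit["mono"]
    if kind == "SUB":
        if edit.get("wnSyn", 0.0) >= 1.0:
            return "="                   # WordNet synonyms
        if edit.get("wnAnto", 0.0) >= 1.0:
            return "|"                   # WordNet antonyms
        return "#"
    if mono == "non":
        return "#"
    broadens = kind == "DEL"             # assumption: DEL broadens, INS narrows
    if mono == "down":
        broadens = not broadens          # downward contexts reverse the direction
    return "⊏" if broadens else "⊐"


# Feature values as shown on the slide.
EDITS = [
    {"type": "INS", "mono": "down", "isLight": True},
    {"type": "SUB", "mono": "down", "isLight": False,
     "lemSim": 0.375, "wnSyn": 1.0, "wnAnto": 0.0, "wnHypo": 0.0},
    {"type": "DEL", "mono": "up", "isLight": False},
]
predictions = [predict(e) for e in EDITS]
```

With these rules the three edits receive `=`, `=` and `⊏`, in line with the predictions shown on the slide.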
The FraCaS test suite
• FraCaS: mid-90s project in computational semantics
• 346 "textbook" examples of textual inference problems
  P: No delegate finished the report.
  H: Some delegate finished the report on time.
  Answer: no
  P: Smith believed that ITEL had won the contract in 1992.
  H: ITEL won the contract in 1992.
  Answer: unk
• 9 sections: quantifiers, plurals, anaphora, ellipsis, …
• 3 possible answers: yes, no, unknown (not balanced!)
• 55% single-premise, 45% multi-premise (excluded)
Results on FraCaS
By section:
§    Category                  #      Acc. (%)
1    Quantifiers               44     84.09
2    Plurals                   24     41.67
3    Anaphora                  6      50.00
4    Ellipsis                  25     28.00
5    Adjectives                15     60.00
6    Comparatives              16     68.75
7    Temporal                  36     61.11
8    Verbs                     8      62.50
9    Attitudes                 9      55.56
     "Applicable": 1, 5, 6     75     76.00
     All sections              183    59.56
Confusion matrix (rows: gold; columns: guess):
         yes    unk    no     total
yes      62     40     -      102
unk      15     45     -      60
no       6      13     2      21
total    90     91     2      183
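The pooled figures can be re-derived from the per-section rows as a quick consistency check. The sketch below recovers the number of correct answers per section by rounding, which is an assumption about how the percentages were produced; `SECTIONS`, `DIAGONAL` and `pooled_accuracy` are hypothetical names.

```python
# (number of problems, accuracy in %) per FraCaS section, from the table above.
SECTIONS = {
    1: (44, 84.09), 2: (24, 41.67), 3: (6, 50.00),
    4: (25, 28.00), 5: (15, 60.00), 6: (16, 68.75),
    7: (36, 61.11), 8: (8, 62.50), 9: (9, 55.56),
}

# Diagonal of the confusion matrix: gold and guess agree (yes, unk, no).
DIAGONAL = (62, 45, 2)


def pooled_accuracy(section_ids):
    """Return (total problems, correct answers, accuracy in %) over the sections."""
    ids = list(section_ids)
    total = sum(SECTIONS[s][0] for s in ids)
    correct = sum(round(SECTIONS[s][0] * SECTIONS[s][1] / 100) for s in ids)
    return total, correct, round(100 * correct / total, 2)


applicable = pooled_accuracy([1, 5, 6])
overall = pooled_accuracy(range(1, 10))
```

The "applicable" sections pool to 57 of 75 correct and all sections to 109 of 183, which agrees with the sum of the confusion-matrix diagonal.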
The RTE3 test suite
• RTE: more "natural" textual inference problems
• Much longer premises: average 35 words (vs. 11)
• Binary classification: yes and no
• RTE problems not ideal for NatLog
  • Many kinds of inference not addressed by NatLog
  • Big edit distance ⇒ propagation of errors from atomic model
• Maybe we can achieve high precision on a subset?
• Strategy: hybridize with broad-coverage RTE system
  • As in Bos & Markert 2006
A hybrid RTE system using NatLog
• NatLog: pre-processing → alignment → classification → {yes, no}
• Stanford: pre-processing → alignment → classification → score in [–∞, +∞] → threshold (balanced) → {yes, no}
• Hybrid: NatLog's {yes, no} prediction is combined (x) with the Stanford score → threshold (optimized) → {yes, no}
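The diagram does not spell out how the two systems are combined. One plausible reading, sketched below purely as an assumption, is that NatLog's decision shifts the Stanford score by an amount x before thresholding. The function name `hybrid_predict` and the parameter values are hypothetical; x and the threshold would be tuned on development data.

```python
def hybrid_predict(stanford_score, natlog_says_yes, x, threshold):
    """Shift the real-valued Stanford score by +x or -x, then apply the threshold.

    Assumption: NatLog's yes/no decision enters as an additive offset.
    """
    adjusted = stanford_score + (x if natlog_says_yes else -x)
    return "yes" if adjusted >= threshold else "no"


# A borderline Stanford score is pushed either way by NatLog's decision.
pushed_up = hybrid_predict(-0.2, True, x=0.5, threshold=0.0)
pushed_down = hybrid_predict(0.2, False, x=0.5, threshold=0.0)
```

With x = 0 this reduces to thresholding the Stanford score alone, so the size of x controls how much weight NatLog's prediction carries.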
Results on RTE3
RTE3 Development Set (800 problems)
System               % yes    precision    recall    accuracy
Stanford             50.25    68.66        66.99     67.25
NatLog               18.00    76.39        26.70     58.00
Hybrid, balanced     50.00    69.75        67.72     68.25
Hybrid, optimized    55.13    69.16        74.03     69.63
RTE3 Test Set (800 problems)
System               % yes    precision    recall    accuracy
Stanford             50.00    61.75        60.24     60.50
NatLog               23.88    68.06        31.71     57.38
Hybrid, balanced     50.00    64.50        62.93     63.25
Hybrid, optimized    54.13    63.74        67.32     63.62
25 extra problems (significant, p < 0.01)
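The columns of these tables are tied together by simple arithmetic, which makes it possible to recover raw counts from the percentages. The sketch below assumes precision and recall are computed for the yes class and that counts can be recovered by rounding; the function name `counts` is hypothetical. The "25 extra problems" note appears to match the accuracy gap between the optimized hybrid and the Stanford system on the test set, though that pairing is inferred from the numbers rather than stated on the slide.

```python
def counts(n, pct_yes, precision, recall):
    """Recover raw counts and accuracy from % yes, precision and recall (all in %)."""
    guessed_yes = round(n * pct_yes / 100)
    tp = round(guessed_yes * precision / 100)    # correct yes guesses
    gold_yes = round(tp / (recall / 100))        # problems whose answer is yes
    fp = guessed_yes - tp
    tn = (n - gold_yes) - fp                     # correct no guesses
    accuracy = round(100 * (tp + tn) / n, 2)
    return {"tp": tp, "fp": fp, "gold_yes": gold_yes, "accuracy": accuracy}


natlog_dev = counts(800, 18.00, 76.39, 26.70)
stanford_dev = counts(800, 50.25, 68.66, 66.99)

# Accuracy gap between hybrid (optimized) and Stanford on the test set, in problems.
extra_problems = round(800 * (63.62 - 60.50) / 100)
```

On the development set, NatLog answers yes on only 144 problems and is right on 110 of them, which is the high-precision, low-recall profile the hybrid is designed to exploit.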
Conclusion
Natural logic enables precise reasoning about monotonicity, while sidestepping the difficulties of translating to FOL.
The NatLog system successfully handles a broad range of such inferences, as demonstrated on the FraCaS test suite.
Future work:
• Add proof search, to handle multiple-premise inference problems
• Consider using CCG parses to facilitate monotonicity projection
• Explore the use of more sophisticated alignment models
• Bring factive & implicative inferences into the NatLog framework
:-) Thanks! Questions?