statistical relational learning for knowledge extraction from the web
Statistical Relational Learning for Knowledge Extraction from the Web. Hoifung Poon, Dept. of Computer Science & Eng., University of Washington.
![Page 1: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/1.jpg)
Statistical Relational Learning for Knowledge Extraction from the Web
Hoifung Poon
Dept. of Computer Science & Eng.
University of Washington
![Page 2: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/2.jpg)
“Drowning in Information, Starved for Knowledge”
WWW
![Page 3: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/3.jpg)
Great Vision: Knowledge Extraction from Web
Also need: Knowledge representation and reasoning
Close the loop: Apply knowledge to extraction
Machine reading [Etzioni et al., 2007]
Craven et al., “Learning to Construct Knowledge Bases from the World Wide Web,” Artificial Intelligence, 1999.
![Page 4: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/4.jpg)
Machine Reading: Text → Knowledge
![Page 5: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/5.jpg)
Rapidly Growing Interest
AAAI-07 Spring Symposium on Machine Reading
DARPA Machine Reading Program (2009-2014)
NAACL-10 Workshop on Learning By Reading
Etc.
![Page 6: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/6.jpg)
Great Impact
Scientific inquiry and commercial applications
Literature-based discovery, robot scientists
Question answering, semantic search
Drug design, medical diagnosis
Breach knowledge acquisition bottleneck for AI and natural language understanding
Automatically semantify the Web
Etc.
![Page 7: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/7.jpg)
This Talk
Statistical relational learning offers promising solutions to machine reading
Markov logic is a leading unifying framework
A success story: USP
Unsupervised, end-to-end machine reading
Extracts five times as many correct answers as state of the art, with highest accuracy of 91%
![Page 8: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/8.jpg)
USP: Question-Answer Example
Q: What does IL-2 control?
A: The DEX-mediated IkappaBalpha induction
Interestingly, the DEX-mediated IkappaBalpha induction was completely inhibited by IL-2, but not IL-4, in Th1 cells, while the reverse profile was seen in Th2 cells.
![Page 9: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/9.jpg)
Overview
Machine reading: Challenges
Statistical relational learning
Markov logic
USP: Unsupervised Semantic Parsing
Research directions
![Page 10: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/10.jpg)
Key Challenges
Complexity
Uncertainty
Pipeline accumulates errors
Supervision is scarce
![Page 11: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/11.jpg)
Languages Are Structural
governments
lm$pxtm (Hebrew: according to their families)
IL-4 induces CD11B
Involvement of p70(S6)-kinase activation in IL-10 up-regulation in human monocytes by gp41......
George Walker Bush was the 43rd President of the United States. …… Bush was the eldest son of President G. H. W. Bush and Barbara Bush. …… In November 1977, he met Laura Welch at a barbecue.
![Page 12: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/12.jpg)
Languages Are Structural
govern-ment-s
l-m$px-t-m (Hebrew: according to their families)
IL-4 induces CD11B
[Figure: syntactic parse tree over the sentence, with nodes S, NP, VP, V, NP]
Involvement of p70(S6)-kinase activation in IL-10 up-regulation in human monocytes by gp41......
[Figure: nested event structure over involvement, up-regulation, activation, IL-10, human monocyte, gp41, and p70(S6)-kinase, with edges labeled Theme, Cause, and Site]
George Walker Bush was the 43rd President of the United States. …… Bush was the eldest son of President G. H. W. Bush and Barbara Bush. …… In November 1977, he met Laura Welch at a barbecue.
![Page 13: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/13.jpg)
Knowledge Is Heterogeneous
Individuals, e.g.: Socrates is a man
Types, e.g.: Man is mortal
Inference rules, e.g.: Syllogism
Ontological relations, e.g.: HUMAN ISA MAMMAL, EYE ISPART FACE
Etc.
![Page 14: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/14.jpg)
Complexity
Can handle using first-order logic
Trees, graphs, dependencies, hierarchies, etc. easily expressed
Inference algorithms (satisfiability testing, theorem proving, etc.)
But … logic is brittle with uncertainty
![Page 15: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/15.jpg)
Languages Are Ambiguous
I saw the man with the telescope
[Figure: two bracketings of the sentence, with “with the telescope” read as part of an NP or as an ADVP]
Here in London, Frances Deek is a retired teacher … In the Israeli town …, Karen London says … Now London says …
London: PERSON or LOCATION?
G. W. Bush ………… Laura Bush …… Mrs. Bush ……
Which one?
Microsoft buys Powerset
Microsoft acquires Powerset
Powerset is acquired by Microsoft Corporation
The Redmond software giant buys Powerset
Microsoft’s purchase of Powerset, …
……
![Page 16: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/16.jpg)
Knowledge Has Uncertainty
We need to model correlations
Our information is always incomplete
Our predictions are uncertain
![Page 17: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/17.jpg)
Uncertainty
Statistics provides the tools to handle this
Mixture models
Hidden Markov models
Bayesian networks
Markov random fields
Maximum entropy models
Conditional random fields
Etc.
But … statistical models assume i.i.d. (independently and identically distributed) data: objects → feature vectors
![Page 18: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/18.jpg)
Pipeline is Suboptimal
E.g., NLP pipeline: Tokenization → Morphology → Chunking → Syntax → …
Accumulates and propagates errors
Wanted: Joint inference
Across all processing stages
Among all interdependent objects
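A rough arithmetic sketch of why a pipeline accumulates errors. The per-stage accuracies below are hypothetical, and the independence assumption is a simplification that is not taken from the talk:

```python
def pipeline_accuracy(stage_accuracies):
    """Probability that every stage is correct, assuming independent errors."""
    result = 1.0
    for accuracy in stage_accuracies:
        result *= accuracy
    return result

# Hypothetical per-stage accuracies, for illustration only.
stages = {"tokenization": 0.99, "morphology": 0.97, "chunking": 0.95, "syntax": 0.90}
end_to_end = pipeline_accuracy(stages.values())
print(f"end-to-end accuracy: {end_to_end:.3f}")
```

Under these assumptions the end-to-end accuracy is lower than that of the weakest stage, which is the motivation for joint inference across stages.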
![Page 19: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/19.jpg)
Supervision is Scarce
Tons of text … but most is not annotated
Labeling is expensive (Cf. Penn-Treebank)
Need to leverage indirect supervision
![Page 20: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/20.jpg)
Redundancy
Key source of indirect supervision
State-of-the-art systems depend on this
E.g., TextRunner [Banko et al., 2007]
But … Web is heterogeneous: Long tail
Redundancy only present in head regime
![Page 21: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/21.jpg)
Overview
Machine reading: Challenges
Statistical relational learning
Markov logic
USP: Unsupervised Semantic Parsing
Research directions
![Page 22: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/22.jpg)
Statistical Relational Learning
Burgeoning field in machine learning
Offers promising solutions for machine reading
Unify statistical and logical approaches
Replace pipeline with joint inference
Principled framework to leverage both direct and indirect supervision
![Page 23: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/23.jpg)
Machine Reading: A Vision
Challenge: Long tail
![Page 24: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/24.jpg)
Machine Reading: A Vision
![Page 25: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/25.jpg)
Challenges in Applying Statistical Relational Learning
Learning is much harder
Inference becomes a crucial issue
Greater complexity for user
![Page 26: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/26.jpg)
Progress to Date
Probabilistic logic [Nilsson, 1986]
Statistics and beliefs [Halpern, 1990]
Knowledge-based model construction [Wellman et al., 1992]
Stochastic logic programs [Muggleton, 1996]
Probabilistic relational models [Friedman et al., 1999]
Relational Markov networks [Taskar et al., 2002]
Markov logic [Domingos & Lowd, 2009]
Etc.
![Page 27: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/27.jpg)
Progress to Date
Probabilistic logic [Nilsson, 1986]
Statistics and beliefs [Halpern, 1990]
Knowledge-based model construction [Wellman et al., 1992]
Stochastic logic programs [Muggleton, 1996]
Probabilistic relational models [Friedman et al., 1999]
Relational Markov networks [Taskar et al., 2002]
Markov logic [Domingos & Lowd, 2009]: Leading unifying framework
Etc.
![Page 28: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/28.jpg)
Overview
Machine reading
Statistical relational learning
Markov logic
USP: Unsupervised Semantic Parsing
Research directions
![Page 29: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/29.jpg)
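Markov Networks
Undirected graphical models
[Figure: network over Smoking, Cancer, Asthma, Cough]
Log-linear model:
$$P(x) = \frac{1}{Z} \exp\left( \sum_i w_i f_i(x) \right)$$
where $w_i$ is the weight of feature $i$ and $f_i$ is feature $i$. E.g.:
$$f_1(\text{Smoking}, \text{Cancer}) = \begin{cases} 1 & \text{if } \neg\text{Smoking} \lor \text{Cancer} \\ 0 & \text{otherwise} \end{cases} \qquad w_1 = 1.5$$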
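A minimal sketch of the log-linear model on this slide, restricted to the two variables Smoking and Cancer. It assumes the single feature reads "1 if not Smoking or Cancer, else 0" with weight 1.5, and it computes the normalizer Z by brute-force enumeration, which is only feasible for toy networks:

```python
import itertools
import math

W1 = 1.5  # weight of feature f1 (value read from the slide)

def f1(smoking: bool, cancer: bool) -> int:
    # Assumed feature: 1 if (not Smoking) or Cancer, 0 otherwise.
    return 1 if (not smoking) or cancer else 0

def unnormalized(smoking: bool, cancer: bool) -> float:
    return math.exp(W1 * f1(smoking, cancer))

# Enumerate all four joint states to obtain the partition function Z.
states = list(itertools.product([False, True], repeat=2))
Z = sum(unnormalized(s, c) for s, c in states)
P = {(s, c): unnormalized(s, c) / Z for s, c in states}

for (s, c), p in P.items():
    print(f"Smoking={s!s:5} Cancer={c!s:5} P={p:.3f}")
```

The only state that violates the feature (smoking without cancer) is less probable than the others by a factor of exp(1.5), but it is not impossible.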
![Page 30: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/30.jpg)
First-Order Logic
Constants, variables, functions, predicates
E.g.: Anna, x, MotherOf(x), Friends(x,y)
Grounding: Replace all variables by constants
E.g.: Friends(Anna, Bob)
World (model, interpretation): Assignment of truth values to all ground predicates
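Grounding and worlds can be sketched in a few lines. The predicate list below is an assumption for illustration (Friends comes from this slide; Smokes and Cancer are borrowed from the later Friends & Smokers example), and functions such as MotherOf are left out:

```python
import itertools

constants = ["Anna", "Bob"]
predicates = {"Smokes": 1, "Cancer": 1, "Friends": 2}  # predicate name -> arity

def ground_atoms(predicates, constants):
    """Replace all variables by constants in every possible way."""
    atoms = []
    for name, arity in predicates.items():
        for args in itertools.product(constants, repeat=arity):
            atoms.append(f"{name}({','.join(args)})")
    return atoms

atoms = ground_atoms(predicates, constants)
# A world assigns a truth value to every ground atom.
num_worlds = 2 ** len(atoms)
print(atoms)
print(num_worlds)
```

Even with two constants there are 8 ground atoms and 256 worlds, which hints at why inference over groundings gets expensive quickly.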
![Page 31: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/31.jpg)
Markov Logic
Intuition: Soften logical constraints
Syntax: Weighted first-order formulas
Semantics: Feature templates for Markov networks
A Markov Logic Network (MLN) is a set of pairs $(F_i, w_i)$ where $F_i$ is a formula in first-order logic and $w_i$ is a real number
$$P(x) = \frac{1}{Z} \exp\left( \sum_i w_i\, n_i(x) \right)$$
where $n_i(x)$ is the number of true groundings of $F_i$ in $x$
![Page 32: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/32.jpg)
Example: Friends & Smokers
Smoking causes cancer.
Friends have similar smoking habits.
![Page 33: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/33.jpg)
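Example: Friends & Smokers
$$\forall x\; \mathit{Smokes}(x) \Rightarrow \mathit{Cancer}(x)$$
$$\forall x, y\; \mathit{Friends}(x, y) \Rightarrow (\mathit{Smokes}(x) \Leftrightarrow \mathit{Smokes}(y))$$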
![Page 34: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/34.jpg)
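Example: Friends & Smokers
$$1.5 \quad \forall x\; \mathit{Smokes}(x) \Rightarrow \mathit{Cancer}(x)$$
$$1.1 \quad \forall x, y\; \mathit{Friends}(x, y) \Rightarrow (\mathit{Smokes}(x) \Leftrightarrow \mathit{Smokes}(y))$$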
![Page 35: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/35.jpg)
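Example: Friends & Smokers
$$1.5 \quad \forall x\; \mathit{Smokes}(x) \Rightarrow \mathit{Cancer}(x)$$
$$1.1 \quad \forall x, y\; \mathit{Friends}(x, y) \Rightarrow (\mathit{Smokes}(x) \Leftrightarrow \mathit{Smokes}(y))$$
Two constants: Anna (A) and Bob (B)
[Figure: ground Markov network over Friends(A,A), Friends(A,B), Friends(B,A), Friends(B,B), Smokes(A), Smokes(B), Cancer(A), Cancer(B)]
Probabilistic graphical models and first-order logic are special cases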
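The ground network for two constants is small enough to evaluate exactly. The sketch below assumes the weights 1.5 (smoking causes cancer) and 1.1 (friends have similar smoking habits) and enumerates all 256 worlds; it is an illustration of the MLN semantics, not how Alchemy or any real system performs inference:

```python
import itertools
import math

CONSTANTS = ["A", "B"]
W_CANCER, W_FRIENDS = 1.5, 1.1  # formula weights as read from the slide

ATOMS = ([("Smokes", c) for c in CONSTANTS]
         + [("Cancer", c) for c in CONSTANTS]
         + [("Friends", a, b) for a in CONSTANTS for b in CONSTANTS])

def n_cancer(world):
    # Number of true groundings of Smokes(x) => Cancer(x).
    return sum((not world[("Smokes", x)]) or world[("Cancer", x)]
               for x in CONSTANTS)

def n_friends(world):
    # Number of true groundings of Friends(x,y) => (Smokes(x) <=> Smokes(y)).
    return sum((not world[("Friends", x, y)])
               or (world[("Smokes", x)] == world[("Smokes", y)])
               for x in CONSTANTS for y in CONSTANTS)

def score(world):
    return math.exp(W_CANCER * n_cancer(world) + W_FRIENDS * n_friends(world))

WORLDS = [dict(zip(ATOMS, values))
          for values in itertools.product([False, True], repeat=len(ATOMS))]
Z = sum(score(w) for w in WORLDS)

def prob(event):
    """Probability of the set of worlds for which event(world) is true."""
    return sum(score(w) for w in WORLDS if event(w)) / Z

p_cancer = prob(lambda w: w[("Cancer", "A")])
p_cancer_given_smokes = (
    prob(lambda w: w[("Cancer", "A")] and w[("Smokes", "A")])
    / prob(lambda w: w[("Smokes", "A")])
)
print(f"P(Cancer(A)) = {p_cancer:.3f}")
print(f"P(Cancer(A) | Smokes(A)) = {p_cancer_given_smokes:.3f}")
```

Conditioning on Smokes(A) raises the probability of Cancer(A) without forcing it to 1, which is the "softened constraint" reading of a weighted formula.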
![Page 36: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/36.jpg)
MLN Algorithms: The First Three Generations

| Problem | First generation | Second generation | Third generation |
| --- | --- | --- | --- |
| MAP inference | Weighted satisfiability | Lazy inference | Cutting planes |
| Marginal inference | Gibbs sampling | MC-SAT | Lifted inference |
| Weight learning | Pseudo-likelihood | Voted perceptron | Scaled conj. gradient |
| Structure learning | Inductive logic progr. | ILP + PL (etc.) | Clustering + pathfinding |
![Page 37: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/37.jpg)
Efficient Inference
Logical or statistical inference already hard
But … can do approximate inference
Suffice to perform well in most cases
Combine ideas from both camps
E.g., MC-SAT: MCMC + SAT solver
Can also leverage sparsity in relational domains
More: Poon & Domingos, “Sound and Efficient Inference with Probabilistic and Deterministic Dependencies”, in Proc. AAAI-2006.
More: Poon, Domingos & Sumner, “A General Method for Reducing the Complexity of Relational Inference and its Application to MCMC”, in Proc. AAAI-2008.
![Page 38: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/38.jpg)
Weight Learning
Probability model P(X)
X: Observable in training data
Maximize likelihood of observed data
Regularization to prevent overfitting
![Page 39: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/39.jpg)
Weight Learning
$$\frac{\partial}{\partial w_i} \log P(x) = n_i(x) - E_x[n_i(x)]$$
$n_i(x)$: No. of times clause $i$ is true in data
$E_x[n_i(x)]$: Expected no. of times clause $i$ is true according to MLN (requires inference)
Gradient descent
Use MC-SAT for inference
Can also leverage second-order information [Lowd & Domingos, 2007]
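The gradient (observed count minus expected count) can be checked on a toy model. Everything below is an assumption made for illustration: two ground atoms, two formulas, four hypothetical training worlds averaged together, and exact enumeration in place of MC-SAT. The update climbs the log-likelihood, which is the same as descending its negative:

```python
import itertools
import math

def counts(world):
    """True-grounding counts for two toy formulas over (Smokes, Cancer)."""
    smokes, cancer = world
    n1 = int((not smokes) or cancer)  # Smokes => Cancer
    n2 = int(smokes)                  # unit clause Smokes
    return [n1, n2]

WORLDS = list(itertools.product([False, True], repeat=2))

def expected_counts(weights):
    # Exact expectation under the current model (stands in for MC-SAT).
    scores = [math.exp(sum(w * n for w, n in zip(weights, counts(x))))
              for x in WORLDS]
    z = sum(scores)
    return [sum(s * counts(x)[i] for s, x in zip(scores, WORLDS)) / z
            for i in range(len(weights))]

def gradient(weights, data):
    """Average over the data of n_i(x) minus the model expectation of n_i."""
    expected = expected_counts(weights)
    observed = [sum(counts(x)[i] for x in data) / len(data)
                for i in range(len(weights))]
    return [o - e for o, e in zip(observed, expected)]

# Hypothetical training worlds: (Smokes, Cancer).
data = [(True, True), (True, True), (False, False), (True, False)]
weights = [0.0, 0.0]
for _ in range(2000):
    g = gradient(weights, data)
    weights = [w + 0.5 * gi for w, gi in zip(weights, g)]
final_gradient = gradient(weights, data)
print(weights, final_gradient)
```

At convergence the gradient vanishes, meaning the model's expected counts match the counts observed in the data.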
![Page 40: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/40.jpg)
Unsupervised Learning: How?
I.I.D. learning: Sophisticated model requires more labeled data
Statistical relational learning: Sophisticated model may require less labeled data
Ambiguities vary among objects
Joint inference → Propagate information from unambiguous objects to ambiguous ones
One formula is worth a thousand labels
Small amount of domain knowledge → large-scale joint inference
![Page 41: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/41.jpg)
Unsupervised Weight Learning
Probability model P(X,Z)
X: Observed in training data
Z: Hidden variables
E.g., clustering with mixture models
Z: Cluster assignment
X: Observed features
$$P(X, Z) = P(Z)\, P(X \mid Z)$$
Maximize likelihood of observed data by summing out hidden variables Z
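Summing out the hidden variable can be sketched with a two-cluster mixture over one binary feature. The cluster names and all probabilities below are hypothetical:

```python
import math

prior = {"c0": 0.4, "c1": 0.6}          # P(Z), hypothetical values
likelihood = {"c0": {0: 0.9, 1: 0.1},   # P(X | Z), hypothetical values
              "c1": {0: 0.2, 1: 0.8}}

def joint(x, z):
    # P(X, Z) = P(Z) P(X | Z)
    return prior[z] * likelihood[z][x]

def marginal(x):
    # P(X) is obtained by summing the joint over all cluster assignments.
    return sum(joint(x, z) for z in prior)

def log_likelihood(observations):
    return sum(math.log(marginal(x)) for x in observations)

def posterior(z, x):
    # P(Z | X), the quantity an E-step would compute.
    return joint(x, z) / marginal(x)

print(marginal(1), posterior("c1", 1), log_likelihood([1, 1, 0]))
```

The objective being maximized is `log_likelihood`, which only ever sees the observed X; the cluster assignments enter through the sum inside `marginal`.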
![Page 42: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/42.jpg)
Unsupervised Weight Learning
$$\frac{\partial}{\partial w_i} \log P(x) = E_{z|x}[n_i(x, z)] - E_{x,z}[n_i(x, z)]$$
$E_{z|x}$: Sum over z, conditioned on observed x
$E_{x,z}$: Summed over both x and z
Gradient descent
Use MC-SAT to compute both expectations
May also combine with contrastive estimation
More: Poon, Cherry, & Toutanova, “Unsupervised Morphological Segmentation with Log-Linear Models”, in Proc. NAACL-2009. Best Paper Award
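The gradient on this slide follows from the log-linear form in a few steps. The derivation below is a sketch that assumes the joint model is an MLN over (x, z); the partition function is written $Z_w$ here only to keep it distinct from the hidden variables z:

```latex
\begin{aligned}
P(x, z) &= \frac{1}{Z_w} \exp\Big(\sum_i w_i\, n_i(x, z)\Big),
\qquad Z_w = \sum_{x', z'} \exp\Big(\sum_i w_i\, n_i(x', z')\Big) \\
\log P(x) &= \log \sum_z \exp\Big(\sum_i w_i\, n_i(x, z)\Big) - \log Z_w \\
\frac{\partial}{\partial w_i} \log P(x)
  &= \sum_z P(z \mid x)\, n_i(x, z) - \sum_{x', z'} P(x', z')\, n_i(x', z') \\
  &= E_{z|x}[n_i(x, z)] - E_{x,z}[n_i(x, z)]
\end{aligned}
```

The first term clamps x to the observed data and averages over z; the second averages over both, which is why both expectations require inference.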
![Page 43: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/43.jpg)
Markov Logic
Unified inference and learning algorithms
Can handle millions of variables, billions of features, tens of thousands of parameters
Easy-to-use software: Alchemy
Many successful applications
E.g.: Information extraction, coreference resolution, semantic parsing, ontology induction
![Page 44: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/44.jpg)
Pipeline → Joint Inference
Combine segmentation and entity resolution for information extraction
Extract complex and nested bio-events from PubMed abstracts
More: Poon & Domingos, “Joint Inference for Information Extraction”, in Proc. AAAI-2007.
More: Poon & Vanderwende, “Joint Inference for Knowledge Extraction from Biomedical Literature”, in Proc. NAACL-2010.
![Page 45: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/45.jpg)
Unsupervised Learning: Example
Coreference resolution: Accuracy comparable to previous supervised state of the art
More: Poon & Domingos, “Joint Unsupervised Coreference Resolution with Markov Logic”, in Proc. EMNLP-2008.
![Page 46: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/46.jpg)
Overview
Machine reading: Challenges
Statistical relational learning
Markov logic
USP: Unsupervised Semantic Parsing
Research directions
![Page 47: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/47.jpg)
Unsupervised Semantic Parsing
USP [Poon & Domingos, EMNLP-09] (Best Paper Award)
First unsupervised approach for semantic parsing
End-to-end machine reading system
Read text, answer questions
OntoUSP = USP + Ontology Induction [Poon & Domingos, ACL-10]
Encoded in a few Markov logic formulas
![Page 48: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/48.jpg)
Semantic Parsing
Goal: Microsoft buys Powerset → BUY(MICROSOFT,POWERSET)
Challenge:
Microsoft buys Powerset
Microsoft acquires semantic search engine Powerset
Powerset is acquired by Microsoft Corporation
The Redmond software giant buys Powerset
Microsoft’s purchase of Powerset, …
![Page 49: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/49.jpg)
Limitations of Existing Approaches
Manual grammar or supervised learning
Applicable to restricted domains only
For general text
Not clear what predicates and objects to use
Hard to produce consistent meaning annotation
Also, often learn both syntax and semantics
Fail to leverage advanced syntactic parsers
Make semantic parsing harder
![Page 50: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/50.jpg)
USP: Key Idea # 1
Target predicates and objects can be learned
Viewed as clusters of syntactic or lexical variations of the same meaning
BUY(-,-): buys, acquires, ’s purchase of, …
Cluster of various expressions for acquisition
MICROSOFT: Microsoft, the Redmond software giant, …
Cluster of various mentions of Microsoft
![Page 51: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/51.jpg)
USP: Key Idea # 2
Relational clustering: Cluster relations with same objects
USP: Recursively cluster arbitrary expressions with similar subexpressions
Microsoft buys Powerset
Microsoft acquires semantic search engine Powerset
Powerset is acquired by Microsoft Corporation
The Redmond software giant buys Powerset
Microsoft’s purchase of Powerset, …
![Page 52: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/52.jpg)
USP: Key Idea # 2
Relational clustering: Cluster relations with same objects
USP: Recursively cluster arbitrary expressions with similar subexpressions
Microsoft buys Powerset
Microsoft acquires semantic search engine Powerset
Powerset is acquired by Microsoft Corporation
The Redmond software giant buys Powerset
Microsoft’s purchase of Powerset, …
Cluster same forms at the atom level
![Page 53: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/53.jpg)
USP: Key Idea # 2
Relational clustering: Cluster relations with same objects
USP: Recursively cluster arbitrary expressions with similar subexpressions
Microsoft buys Powerset
Microsoft acquires semantic search engine Powerset
Powerset is acquired by Microsoft Corporation
The Redmond software giant buys Powerset
Microsoft’s purchase of Powerset, …
Cluster forms in composition with same forms
![Page 56: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/56.jpg)
USP: Key Idea # 3
Start directly from syntactic analyses
Focus on translating them to semantics
Leverage rapid progress in syntactic parsing
Much easier than learning both
![Page 57: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/57.jpg)
Joint Inference in USP
Forms canonical meaning representation by recursively clustering synonymous expressions
Text → Logical form in this representation
Induces ISA hierarchy among clusters and applies hierarchical smoothing (shrinkage)
![Page 58: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/58.jpg)
USP: System Overview
Input: Dependency trees for sentences
Converts dependency trees into quasi-logical forms (QLFs)
Starts with QLF clusters at atom level
Recursively builds up clusters of larger forms
Output:
Probability distribution over QLF clusters and their composition
MAP semantic parses of sentences
![Page 59: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/59.jpg)
Generating Quasi-Logical Forms
[Figure: dependency tree with root buys, an nsubj edge to Microsoft, and a dobj edge to Powerset]
Convert each node into a unary atom
![Page 60: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/60.jpg)
Generating Quasi-Logical Forms
[Figure: the same tree with nodes rewritten as atoms: buys(n1), with an nsubj edge to Microsoft(n2) and a dobj edge to Powerset(n3)]
n1, n2, n3 are Skolem constants
![Page 61: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/61.jpg)
Generating Quasi-Logical Forms
[Figure: buys(n1), with an nsubj edge to Microsoft(n2) and a dobj edge to Powerset(n3)]
Convert each edge into a binary atom
![Page 62: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/62.jpg)
Generating Quasi-Logical Forms
Convert each edge into a binary atom
buys(n1)
Microsoft(n2) Powerset(n3)
nsubj(n1,n2) dobj(n1,n3)
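The node and edge conversion can be sketched as follows. The function name `to_qlf` and the tuple-based tree encoding are illustrative assumptions, not part of USP:

```python
def to_qlf(nodes, edges):
    """Turn a dependency tree into quasi-logical form atoms.

    nodes: list of words; edges: list of (label, head_index, dependent_index).
    """
    # One Skolem constant per node: n1, n2, ...
    constants = [f"n{i + 1}" for i in range(len(nodes))]
    # Each node becomes a unary atom, each edge a binary atom.
    unary = [f"{word}({const})" for word, const in zip(nodes, constants)]
    binary = [f"{label}({constants[head]},{constants[dep]})"
              for label, head, dep in edges]
    return unary + binary

qlf = to_qlf(["buys", "Microsoft", "Powerset"],
             [("nsubj", 0, 1), ("dobj", 0, 2)])
print(qlf)
```

For "Microsoft buys Powerset" this yields the five atoms shown on the slide.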
![Page 63: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/63.jpg)
A Semantic Parse
buys(n1)
Microsoft(n2) Powerset(n3)
nsubj(n1,n2) dobj(n1,n3)
Partition QLF into subformulas
![Page 64: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/64.jpg)
A Semantic Parse
buys(n1)
Microsoft(n2) Powerset(n3)
nsubj(n1,n2) dobj(n1,n3)
Subformula → Lambda form: Replace Skolem constant not in unary atom with a unique lambda variable
![Page 65: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/65.jpg)
A Semantic Parse
buys(n1)
Microsoft(n2) Powerset(n3)
λx2.nsubj(n1,x2) λx3.dobj(n1,x3)
Subformula → Lambda form: Replace Skolem constant not in unary atom with a unique lambda variable
![Page 66: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/66.jpg)
A Semantic Parse
buys(n1): Core form
λx2.nsubj(n1,x2): Argument form
λx3.dobj(n1,x3): Argument form
Microsoft(n2) Powerset(n3)
Core form: No lambda variable
Argument form: One lambda variable
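A sketch of the lambda-form step for one part of the partition. It assumes the part groups buys(n1) with its two dependency atoms, so that n1 is anchored by a unary atom while n2 and n3 are not; the function name and the string-based atom encoding are illustrative only:

```python
import re

def lambda_form(part):
    """Split a part into core forms and argument forms.

    part: list of atoms such as 'buys(n1)' or 'nsubj(n1,n2)'.
    """
    parsed = []
    for atom in part:
        name, args = re.fullmatch(r"(\w+)\((.*)\)", atom).groups()
        parsed.append((name, args.split(",")))
    # Skolem constants that occur in a unary atom of this part stay fixed.
    anchored = {args[0] for _, args in parsed if len(args) == 1}
    core, arguments = [], []
    for name, args in parsed:
        free = [a for a in args if a not in anchored]
        # Replace each unanchored constant n_k by the lambda variable x_k.
        renamed = ["x" + a[1:] if a in free else a for a in args]
        body = f"{name}({','.join(renamed)})"
        if free:
            arguments.append("".join(f"λx{a[1:]}." for a in free) + body)
        else:
            core.append(body)
    return core, arguments

core, arguments = lambda_form(["buys(n1)", "nsubj(n1,n2)", "dobj(n1,n3)"])
print(core, arguments)
```

Atoms left without a lambda variable are core forms; atoms that gained one are argument forms.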
![Page 67: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/67.jpg)
A Semantic Parse
Assign subformula to object cluster
buys(n1), λx2.nsubj(n1,x2), λx3.dobj(n1,x3) → BUY
Microsoft(n2) → MICROSOFT
Powerset(n3) → POWERSET
![Page 68: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/68.jpg)
Object Cluster: BUY
Distribution over core forms:
buys(n1) 0.1
acquires(n1) 0.2
……
One formula in MLN
Learn weights for each pair of cluster and core form
![Page 69: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/69.jpg)
Object Cluster: BUY
buys(n1) 0.1
acquires(n1) 0.2
……
May contain variable number of property clusters: BUYER, BOUGHT, PRICE, ……
![Page 70: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/70.jpg)
Property Cluster: BUYER
Distributions over argument forms, clusters, and number (three MLN formulas)
Argument forms: λx2.nsubj(n1,x2) 0.5, λx2.agent(n1,x2) 0.4, ……
Argument clusters: MICROSOFT 0.2, GOOGLE 0.1, ……
Number: Zero 0.1, One 0.8, ……
![Page 71: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/71.jpg)
Probabilistic Model
Exponential prior on number of parameters
Cluster mixtures:
Object Cluster: BUY
Core forms: buys 0.1, acquires 0.4, …
Property Cluster: BUYER
Argument forms: nsubj 0.5, agent 0.4, …
Argument clusters: MICROSOFT 0.2, GOOGLE 0.1, …
Number: Zero 0.1, One 0.8, …
![Page 72: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/72.jpg)
Probabilistic Model
Exponential prior on number of parameters
Cluster mixtures with hierarchical smoothing:
Object Cluster: BUY
Core forms: buys 0.1, acquires 0.4, …
Property Cluster: BUYER
Argument forms: nsubj 0.5, agent 0.4, …
Argument clusters: MICROSOFT 0.2, GOOGLE 0.1, …
Number: Zero 0.1, One 0.8, …
E.g., picking MICROSOFT as BUYER argument depends not only on BUY, but also on its ISA ancestors
![Page 73: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/73.jpg)
Abstract Lambda Form
buys(n1) λx2.nsubj(n1,x2) λx3.dobj(n1,x3)
BUYS(n1) λx2.BUYER(n1,x2) λx3.BOUGHT(n1,x3)
Final logical form is obtained via lambda reduction
![Page 74: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/74.jpg)
Challenge: State Space Too Large
Potential cluster number: exp(token-number)
Also, meaning units and clusters often small
Use combinatorial search
![Page 75: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/75.jpg)
Inference: Find MAP Parse
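Initialize
Search Operator: Lambda reduction
[Figure: dependency tree with edges nsubj(induces, protein), dobj(induces, CD11B), and nn(protein, IL-4); the nn(protein, IL-4) fragment is repeated to illustrate the search operator]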
![Page 76: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/76.jpg)
Learning: Greedily Maximize Posterior
Initialize: one cluster per form (induces 1.0; enhances 1.0; amino 1.0; acid 1.0; ……)
Search Operators:
MERGE: (induces 1.0) + (enhances 1.0) → (induces 0.2, enhances 0.8)
COMPOSE: (amino 1.0) + (acid 1.0) → (amino acid 1.0)
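A toy sketch of the greedy search, restricted to the MERGE operator. The score here is an assumption standing in for the posterior: the log-likelihood of pooled argument counts minus a fixed penalty per parameter, which mimics an exponential prior on the number of parameters. The observations, the penalty value, and all names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

# Toy observations (hypothetical): argument heads seen with each form.
OBSERVATIONS = {
    "induces": Counter({"IL-4": 3, "CD11B": 1}),
    "enhances": Counter({"IL-4": 2, "CD11B": 2}),
    "binds": Counter({"DNA": 4}),
}
PENALTY = 1.0  # assumed cost per parameter (stand-in for the exponential prior)

def cluster_score(cluster):
    """Log-likelihood of the pooled counts minus a penalty per parameter."""
    pooled = Counter()
    for form in cluster:
        pooled.update(OBSERVATIONS[form])
    total = sum(pooled.values())
    log_lik = sum(n * math.log(n / total) for n in pooled.values())
    return log_lik - PENALTY * len(pooled)

def total_score(clusters):
    return sum(cluster_score(c) for c in clusters)

def greedy_merge(forms):
    """Start with one cluster per form; apply the best MERGE while it helps."""
    clusters = {frozenset([f]) for f in forms}
    while True:
        best, best_score = None, total_score(clusters)
        for a, b in combinations(clusters, 2):
            candidate = (clusters - {a, b}) | {a | b}
            score = total_score(candidate)
            if score > best_score:
                best, best_score = candidate, score
        if best is None:
            return clusters
        clusters = best

clusters = greedy_merge(OBSERVATIONS)
print(sorted(sorted(c) for c in clusters))
```

With these toy counts, merging induces and enhances saves two parameters at a small likelihood cost, so the merge is accepted. Adding binds to that cluster would lower the score, so the search stops with two clusters.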
![Page 77: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/77.jpg)
Operator: Abstract
[Figure: cluster INDUCE (induces 0.6, up-regulates 0.2, …) and cluster INHIBIT (inhibits 0.4, suppresses 0.2, …), each attached by an ISA link to a parent cluster; alongside, a single flat cluster (induces 0.3, enhances 0.1, inhibits 0.2, suppresses 0.1, …)]
MERGE with REGULATE?
Captures substantial similarities
![Page 78: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/78.jpg)
Experiments
Apply to machine reading: Extract knowledge from text and answer questions
Evaluation: Number of answers and accuracy
GENIA dataset: 1,999 PubMed abstracts
Use simple factoid questions, e.g.:
What does anti-STAT1 inhibit?
What regulates MIP-1 alpha?
![Page 79: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/79.jpg)
Total and Correct Answers
[Bar chart: total and correct answers (axis from 0 to 500) for KW-SYN, TextRunner, RESOLVER, DIRT, and USP]
USP extracted five times as many correct answers as TextRunner
Highest precision of 91%
![Page 80: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/80.jpg)
Qualitative Analysis
Resolve many nontrivial variations
Argument forms that mean the same, e.g.,
expression of X / X expression
X stimulates Y / Y is stimulated with X
Active vs. passive voices
Synonymous expressions
Etc.
![Page 81: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/81.jpg)
Clusters And Compositions
Clusters in core forms
investigate, examine, evaluate, analyze, study, assay
diminish, reduce, decrease, attenuate
synthesis, production, secretion, release
dramatically, substantially, significantly
……
Compositions
amino acid, t cell, immune response, transcription factor, initiation site, binding site …
![Page 82: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/82.jpg)
Question-Answer Example
Q: What does IL-2 control?
A: The DEX-mediated IkappaBalpha induction
Interestingly, the DEX-mediated IkappaBalpha induction was completely inhibited by IL-2, but not IL-4, in Th1 cells, while the reverse profile was seen in Th2 cells.
![Page 83: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/83.jpg)
Overview
Machine reading
Statistical relational learning
Markov logic
USP: Unsupervised Semantic Parsing
Research directions
![Page 84: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/84.jpg)
Web-Scale Joint Inference
Challenge: Efficiently identify the relevant
Key: Induce and leverage an ontology
Ontology: Capture essential properties & abstract away unimportant variations
Upper-level nodes: Skip irrelevant branches
Wanted: Combine the following
Probabilistic ontology induction (e.g., USP)
Coarse-to-fine learning and inference [Felzenszwalb & McAllester, 2007; Petrov, Ph.D. Thesis]
![Page 85: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/85.jpg)
Knowledge Reasoning
Most facts/rules are not explicitly stated
“Dark matter” in the natural language universe
kale contains calcium
calcium prevents osteoporosis
⇒ kale prevents osteoporosis
Keys:
Induce generic reasoning patterns
Incorporate reasoning in extraction
Additional sources of indirect supervision
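The kale example can be reproduced with a small forward-chaining sketch. The rule below (contains followed by prevents implies prevents) is written as a hard rule purely for illustration. In a Markov logic setting it would presumably carry a weight, and the function name and triple encoding are assumptions.

```python
# Facts from the slide, encoded as (subject, relation, object) triples.
facts = {
    ("kale", "contains", "calcium"),
    ("calcium", "prevents", "osteoporosis"),
}

def infer_prevents(facts):
    """Apply contains(x, y) and prevents(y, z) => prevents(x, z) to a fixed point."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for (x, r1, y) in list(known):
            if r1 != "contains":
                continue
            for (y2, r2, z) in list(known):
                new_fact = (x, "prevents", z)
                if r2 == "prevents" and y2 == y and new_fact not in known:
                    known.add(new_fact)
                    changed = True
    return known

derived = infer_prevents(facts) - facts
print(derived)  # {('kale', 'prevents', 'osteoporosis')}
```

The derived fact is never stated in the input, which is the point of the slide: extraction alone would miss it.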
![Page 86: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/86.jpg)
Harness Social Computing
Bootstrap online community
Knowledge Base
![Page 87: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/87.jpg)
Harness Social Computing
Bootstrap online community
Incorporate human & end tasks in the loop
“Tell me everything about dicer applied to synapse …”
Knowledge Base
![Page 88: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/88.jpg)
Harness Social Computing
Bootstrap online community
Incorporate human & end tasks in the loop
“Your extraction from my paper is correct except for blah …”
Knowledge Base
![Page 89: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/89.jpg)
Harness Social Computing
Bootstrap online community
Incorporate human & end tasks in the loop
Form positive feedback loop
Knowledge Base
![Page 90: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/90.jpg)
Acknowledgments
Pedro Domingos, Colin Cherry, Kristina Toutanova, Lucy Vanderwende, Oren Etzioni, Dan Weld, Matt Richardson, Parag Singla, Stanley Kok, Daniel Lowd, Marc Sumner
ARO, AFRL, ONR, DARPA, NSF
![Page 91: Statistical Relational Learning for Knowledge Extraction from the Web](https://reader036.vdocuments.site/reader036/viewer/2022062422/56813f63550346895daa357d/html5/thumbnails/91.jpg)
Summary
Statistical relational learning offers promising solutions for machine reading
Markov logic provides a language for this
Syntax: Weighted first-order logical formulas
Semantics: Feature templates of Markov nets
Open-source software: Alchemy
A success story: USP
Three key research directions
alchemy.cs.washington.edu
alchemy.cs.washington.edu/papers/poon09
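For reference, the usual statement of the Markov logic semantics summarized above, written in LaTeX. The symbols follow the standard definition (formula weights w_i, grounding counts n_i, partition function Z) and are not copied from these slides:

```latex
P(X = x) = \frac{1}{Z} \exp\Big( \sum_i w_i \, n_i(x) \Big),
\qquad
Z = \sum_{x'} \exp\Big( \sum_i w_i \, n_i(x') \Big)
```

Each weighted formula F_i acts as a feature template: n_i(x) counts its true groundings in world x, and w_i is the weight shared by all of those groundings.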