Coupling Semi-Supervised Learning of Categories and Relations by Andrew Carlson, Justin Betteridge, Estevam R. Hruschka Jr. and Tom M. Mitchell School of Computer Science Carnegie Mellon University presented by Thomas Packer

Post on 19-Dec-2015


Coupling Semi-Supervised Learning of Categories and Relations

by Andrew Carlson, Justin Betteridge, Estevam R. Hruschka Jr. and Tom M. Mitchell

School of Computer Science, Carnegie Mellon University

presented by Thomas Packer

Bootstrapped Information Extraction

• Semi-Supervised:
– Seed knowledge (predicate instances & patterns)
– Pattern learners (use learned instances)
– Instance learners (use learned patterns)

• Feedback Loop:
– Rel1(X, Y)
– Sent1(X, Y), Rel0(X, Y) → Pat1
– Pat1: Sent2(A, B) → Rel1(A, B)
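
The feedback loop can be illustrated with a toy sketch. This is not the authors' system: the function name `bootstrap`, the string-replacement pattern learner, and the example sentences are all assumptions made for illustration. Known instances turn sentences into patterns, and patterns turn other sentences into new instances.

```python
import re

def bootstrap(sentences, seed_pairs, rounds=2):
    """Toy bootstrapping loop (hypothetical): instances -> patterns -> instances."""
    instances = set(seed_pairs)
    patterns = set()
    for _ in range(rounds):
        # Pattern learner: abstract each sentence that mentions a known pair.
        for x, y in list(instances):
            for s in sentences:
                if x in s and y in s:
                    patterns.add(s.replace(x, "arg1").replace(y, "arg2"))
        # Instance learner: match each learned pattern against every sentence.
        for p in patterns:
            regex = (re.escape(p)
                     .replace("arg1", "(?P<a1>.+)")
                     .replace("arg2", "(?P<a2>.+)"))
            for s in sentences:
                m = re.fullmatch(regex, s)
                if m:
                    instances.add((m.group("a1"), m.group("a2")))
    return instances, patterns

# Assumed example data, not taken from the paper.
sentences = [
    "Steve Ballmer is the CEO of Microsoft",
    "Tim Cook is the CEO of Apple",
]
instances, patterns = bootstrap(sentences, {("Steve Ballmer", "Microsoft")})
```

Nothing in this loop checks whether a learned pattern or instance is correct, so any early mistake is fed back into later rounds.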

Challenges and Previous Solutions

• Semantic drift: The feedback loop amplifies errors and ambiguities.

• Semi-Supervised learning often suffers from being under-constrained.

• Multiple mutually-exclusive predicate learning: Positive examples of one predicate are also negative examples of others.

• Category and predicate learning: Arguments must be of certain types.

Does More Look Harder?

Approach

• Simultaneous bootstrapped training of multiple categories and multiple relations.

• Growing related knowledge provides constraints to guide continued learning.

• Ontology Constraints:
– Mutually exclusive predicates imply negative instances and patterns.
– Hypernyms imply positive instances.
– Relation argument type constraints imply positive category and negative relation instances.

Mutual Exclusion Constraint

• “city” and “scientist” categories are mutually exclusive.

• If “Boston” is an instance of “city”, then it is also a negative instance of “scientist”.

• If “mayor of arg1” is a pattern for “city”, then it is also a negative pattern for “scientist”.
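
A minimal sketch of how mutual exclusion could generate negatives, assuming a hypothetical `MUTEX` set of exclusive category pairs and a dictionary mapping each category to its positive items. The same function applies to instances and to patterns.

```python
# Assumed ontology fragment: unordered pairs of mutually exclusive categories.
MUTEX = {("city", "scientist")}

def mutually_exclusive(a, b):
    """True if categories a and b are declared exclusive, in either order."""
    return (a, b) in MUTEX or (b, a) in MUTEX

def negatives_from(positives, categories):
    """Map each category to the items that exclusion marks as its negatives."""
    negatives = {c: set() for c in categories}
    for cat, items in positives.items():
        for other in categories:
            if mutually_exclusive(cat, other):
                negatives[other] |= items
    return negatives

categories = ["city", "scientist"]
neg_instances = negatives_from({"city": {"Boston"}, "scientist": set()}, categories)
neg_patterns = negatives_from({"city": {"mayor of arg1"}}, categories)
```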

Hypernym Constraints

• “athlete” is a hyponym of “person”.

• If “John McEnroe” is a positive instance of “athlete”, then it is also a positive instance of “person”.
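
The hypernym rule can be sketched as upward propagation through the category hierarchy. The `HYPERNYM` map and the function names are assumptions for illustration, and the hierarchy is assumed to be acyclic.

```python
# Assumed ontology fragment: each category maps to its direct hypernym.
HYPERNYM = {"athlete": "person"}

def with_hypernyms(category):
    """Return the category followed by all of its ancestors (assumes no cycles)."""
    chain = [category]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain

def promote(instance, category, kb):
    """Record instance as positive for category and for every hypernym of it."""
    for c in with_hypernyms(category):
        kb.setdefault(c, set()).add(instance)
    return kb

kb = promote("John McEnroe", "athlete", {})
```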

Type Checking Constraints

• The “ceoOf()” relation must have arguments of type “person” and “company”.

• If “bicycle” is not a “person” then “ceoOf(bicycle, Microsoft)” is a negative instance of “ceoOf()”.

• If “ceoOf(Steve Ballmer, Microsoft)” is true, then “Steve Ballmer” is a positive instance of “person”. “Microsoft” handled similarly.
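
Both directions of the type-checking constraint can be sketched as follows. The `SIGNATURES` table and the two helper functions are hypothetical names: one rejects a relation candidate whose argument is a known negative for the required type, and the other promotes the arguments of an accepted relation instance into their categories.

```python
# Assumed ontology fragment: argument types required by each relation.
SIGNATURES = {"ceoOf": ("person", "company")}

def type_check(relation, args, negatives):
    """False if any argument is a known negative instance of its required type."""
    return all(arg not in negatives.get(t, set())
               for arg, t in zip(args, SIGNATURES[relation]))

def promote_args(relation, args, positives):
    """Add each argument of an accepted relation instance to its category."""
    for arg, t in zip(args, SIGNATURES[relation]):
        positives.setdefault(t, set()).add(arg)
    return positives

negatives = {"person": {"bicycle"}}
rejected = type_check("ceoOf", ("bicycle", "Microsoft"), negatives)
accepted = type_check("ceoOf", ("Steve Ballmer", "Microsoft"), negatives)
positives = promote_args("ceoOf", ("Steve Ballmer", "Microsoft"), {})
```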

Coupled Bootstrap Learner

Knowledge Constraints Make Extraction Easier

Conclusion

• The results clearly show improvements based on constraints.

• Could probably benefit from:
– adding probabilistic reasoning
– a larger corpus
– higher thresholds
– more contrastive categories
– other techniques discussed in this class

Questions