a metric-based framework for automatic taxonomy induction
TRANSCRIPT
6/27/13
A Metric-based Framework for Automatic Taxonomy Induction
Hui Yang and Jamie Callan
Language Technologies Institute, Carnegie Mellon University
ACL 2009, Singapore
ROADMAP
• Introduction
• Related Work
• Metric-Based Taxonomy Induction Framework
• The Features
• Experimental Results
• Conclusions
INTRODUCTION
• Semantic taxonomies, such as WordNet, play an important role in solving knowledge-rich problems
• Limitations of manually created taxonomies
  ◦ Rarely complete
  ◦ Difficult to include new terms from emerging or changing domains
  ◦ Time-consuming to create, which can make them infeasible for specialized domains and personalized tasks
INTRODUCTION
• Automatic Taxonomy Induction is a solution to
  ◦ Augment existing resources
  ◦ Quickly produce new taxonomies for specialized domains and personalized tasks
• Subtasks in Automatic Taxonomy Induction
  ◦ Term extraction
  ◦ Relation formation
• This paper focuses on Relation Formation
RELATED WORK
• Pattern-based Approaches
  ◦ Define lexical-syntactic patterns for relations, and use these patterns to discover instances
  ◦ Have been applied to extract is-a, part-of, sibling, synonym, causal, and other relations
  ◦ Strength: highly accurate
  ◦ Weakness: sparse coverage of patterns
• Clustering-based Approaches
  ◦ Hierarchically cluster terms based on similarities of their meanings, usually represented by a feature vector
  ◦ Have only been applied to extract is-a and sibling relations
  ◦ Strength: allow discovery of relations that do not explicitly appear in text; higher recall
  ◦ Weaknesses: generally fail to produce coherent clusters for small corpora [Pantel and Pennacchiotti 2006]; hard to label non-leaf nodes
A UNIFIED SOLUTION
• Combine strengths of both approaches in a unified framework
  ◦ Flexibly incorporate heterogeneous features
  ◦ Use lexical-syntactic patterns as one type of feature in a clustering framework
Metric-based Taxonomy Induction
THE FRAMEWORK
• A novel framework, which
  ◦ Incrementally clusters terms
  ◦ Transforms taxonomy induction into a multi-criteria optimization
  ◦ Uses heterogeneous features
• Optimization based on two criteria
  ◦ Minimization of taxonomy structures ⇔ Minimum Evolution Assumption
  ◦ Modeling of term abstractness ⇔ Abstractness Assumption
LET’S BEGIN WITH SOME IMPORTANT DEFINITIONS
• A taxonomy is a data model
[Formula annotations: Concept Set, Relationship Set, Domain]
MORE DEFINITIONS
A Full Taxonomy:
[Figure: taxonomy tree; visible node labels: Game Equipment, ball, table]
AssignedTermSet = {game equipment, ball, table, basketball, volleyball, soccer, table-tennis table, snooker table}
UnassignedTermSet = {}
MORE DEFINITIONS
A Partial Taxonomy:
[Figure: taxonomy tree; visible node labels: Game Equipment, ball, table]
AssignedTermSet = {game equipment, ball, table, basketball, volleyball}
UnassignedTermSet = {soccer, table-tennis table, snooker table}
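The full and partial taxonomies above can be sketched as a small data structure. This is a minimal illustration, not the authors' implementation: the class name `Taxonomy`, its fields, and the `assign` method are hypothetical, and the parent links given to the example terms are assumptions, since the slide lists only the term sets.

```python
from dataclasses import dataclass, field


@dataclass
class Taxonomy:
    """Hypothetical container: term sets plus (parent, child) relations."""
    assigned: set = field(default_factory=set)    # terms already placed
    unassigned: set = field(default_factory=set)  # terms still to place
    relations: set = field(default_factory=set)   # (parent, child) pairs

    def assign(self, term, parent):
        """Move `term` from the unassigned set into the tree under `parent`."""
        if term not in self.unassigned:
            raise ValueError(f"{term!r} is not waiting to be assigned")
        if parent not in self.assigned:
            raise ValueError(f"{parent!r} is not in the taxonomy yet")
        self.unassigned.remove(term)
        self.assigned.add(term)
        self.relations.add((parent, term))

    def is_full(self):
        """A taxonomy is full once no term remains unassigned."""
        return not self.unassigned


# The partial taxonomy from the slide (parent links are assumed).
taxonomy = Taxonomy(
    assigned={"game equipment", "ball", "table", "basketball", "volleyball"},
    unassigned={"soccer", "table-tennis table", "snooker table"},
    relations={
        ("game equipment", "ball"),
        ("game equipment", "table"),
        ("ball", "basketball"),
        ("ball", "volleyball"),
    },
)
was_full_before = taxonomy.is_full()

# Assigning the remaining terms turns it into the full taxonomy.
taxonomy.assign("soccer", "ball")
taxonomy.assign("table-tennis table", "table")
taxonomy.assign("snooker table", "table")
```

Keeping the unassigned terms inside the object makes the distinction on the slides explicit: a partial taxonomy is simply one whose unassigned set is not yet empty.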
MORE DEFINITIONS: Ontology Metric
[Figure: taxonomy with edge labels distance = 1.5, distance = 2, distance = 1, distance = 1, and term-pair distances d(·, ·) = 2, d(·, ·) = 1, d(·, ·) = 4.5; visible node labels: ball, table]
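One reading of the figure is that each edge carries a distance and that the ontology metric between two terms is the sum of the edge distances on the path connecting them. The sketch below assumes that reading. Which of the four edge distances belongs to which edge, and which term pairs produce the values 2, 1, and 4.5, are guesses chosen so that the slide's numbers come out. The names `EDGES`, `PARENT`, and `ontology_distance` are hypothetical.

```python
# Assumed edge distances, keyed by (parent, child). The four values come from
# the slide; their assignment to particular edges is a guess.
EDGES = {
    ("game equipment", "ball"): 1.5,
    ("game equipment", "table"): 2.0,
    ("ball", "basketball"): 1.0,
    ("ball", "volleyball"): 1.0,
}

PARENT = {child: parent for (parent, child) in EDGES}


def path_to_root(term):
    """Return the nodes from `term` up to the root, inclusive."""
    path = [term]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def ontology_distance(x, y):
    """Sum the edge distances on the tree path between x and y."""
    up_x, up_y = path_to_root(x), path_to_root(y)
    ancestors_y = set(up_y)
    # First node on x's upward path that also lies on y's upward path.
    meet = next(node for node in up_x if node in ancestors_y)
    total = 0.0
    for path in (up_x, up_y):
        for child in path[: path.index(meet)]:
            total += EDGES[(PARENT[child], child)]
    return total


print(ontology_distance("basketball", "volleyball"))  # 2.0
print(ontology_distance("ball", "basketball"))        # 1.0
print(ontology_distance("basketball", "table"))       # 4.5
```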
ASSUMPTIONS
Minimum Evolution Assumption: the optimal ontology is the one that introduces the least change in information.
ILLUSTRATION: Minimum Evolution Assumption
[Figure sequence across several slides: the taxonomy is built up step by step; node labels appear in the order ball, table, Game Equipment]
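The incremental idea behind the Minimum Evolution Assumption can be sketched as a greedy placement step: try the new term under every existing node and keep the position that changes the taxonomy's information the least. Everything specific here is an assumption made for illustration: the information function is taken to be the sum of path distances over all term pairs, the estimated distances in `est` are made-up numbers, and the function names are hypothetical.

```python
from itertools import combinations


def tree_distance(parent, weight, x, y):
    """Path-sum distance between x and y in a rooted tree."""
    def climb(node):
        # Map each ancestor of `node` (and node itself) to its distance.
        dist, out = 0.0, {node: 0.0}
        while node in parent:
            dist += weight[(parent[node], node)]
            node = parent[node]
            out[node] = dist
        return out

    up_x, up_y = climb(x), climb(y)
    # With positive weights the cheapest common ancestor is the meeting point.
    return min(up_x[n] + up_y[n] for n in up_x if n in up_y)


def information(parent, weight, nodes):
    """Assumed information function: sum of distances over all term pairs."""
    return sum(tree_distance(parent, weight, a, b)
               for a, b in combinations(sorted(nodes), 2))


def best_parent(parent, weight, nodes, new_term, est):
    """Place new_term under the node that changes the information least."""
    before = information(parent, weight, nodes)
    scored = []
    for cand in sorted(nodes):
        trial_parent = {**parent, new_term: cand}
        trial_weight = {**weight, (cand, new_term): est[(cand, new_term)]}
        after = information(trial_parent, trial_weight, nodes | {new_term})
        scored.append((abs(after - before), cand))
    return min(scored)[1]


# Current partial taxonomy (edge distances are illustrative).
parent = {"ball": "game equipment", "table": "game equipment"}
weight = {("game equipment", "ball"): 1.5, ("game equipment", "table"): 2.0}
nodes = {"game equipment", "ball", "table"}

# Made-up estimated distances from each candidate parent to the new term.
est = {
    ("ball", "basketball"): 1.0,
    ("table", "basketball"): 3.0,
    ("game equipment", "basketball"): 2.5,
}

chosen = best_parent(parent, weight, nodes, "basketball", est)
print(chosen)  # ball
```

With these numbers, placing "basketball" under "ball" raises the total by 8.0, against 11.0 under "game equipment" and 14.5 under "table", so "ball" is selected.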
ASSUMPTIONS
Abstractness Assumption: each abstraction level has its own information function.
[Figure: Abstractness Assumption shown on the taxonomy; visible node labels: Game Equipment, ball, table]
MULTIPLE CRITERION OPTIMIZATION
[Equation with annotated parts: Minimum Evolution objective function; Abstractness objective function; scalarization variable]
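A common way to combine two objectives with a scalarization variable is a weighted sum. The form below is a sketch under that assumption and not a transcription of the slide's equation. The symbols are chosen here for illustration: $u$ stands for the Minimum Evolution objective, $v$ for the Abstractness objective, and $\lambda$ for the scalarization variable.

```latex
\min \; \lambda\, u + (1 - \lambda)\, v, \qquad 0 \le \lambda \le 1
```

Setting $\lambda = 1$ would optimize the Minimum Evolution objective alone, and $\lambda = 0$ the Abstractness objective alone.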
ESTIMATING ONTOLOGY METRIC
• Assume the ontology metric is a linear interpolation of some underlying feature functions
• Use ridge regression to estimate and predict the ontology metric
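The two bullets above can be sketched as closed-form ridge regression over feature vectors of term pairs. This is a dependency-free illustration under stated assumptions, not the authors' code: the training rows, target distances, and the regularization strength `alpha` are made up, no intercept is fitted, and the helper names `solve`, `ridge_fit`, and `predict` are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ridge_fit(features, distances, alpha):
    """Closed-form ridge weights: w = (X^T X + alpha I)^-1 X^T y."""
    k = len(features[0])
    xtx = [[sum(row[i] * row[j] for row in features)
            + (alpha if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(features, distances))
           for i in range(k)]
    return solve(xtx, xty)


def predict(weights, feature_vector):
    """Predicted ontology metric: a weighted sum of the feature values."""
    return sum(w * f for w, f in zip(weights, feature_vector))


# Made-up training data: one row of feature values per term pair, with the
# known taxonomy distance for that pair as the target.
train_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
train_distances = [2.0, 3.0, 5.0]

weights = ridge_fit(train_features, train_distances, alpha=1.0)
estimate = predict(weights, [1.0, 1.0])
```

The `alpha` term shrinks the weights toward zero; with `alpha=0.0` the same routine reduces to ordinary least squares, which is only safe when the feature columns are linearly independent.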
THE FEATURES
• Our framework allows a wide range of features to be used
• Input for the feature functions: two terms
• Output: a numeric score measuring the semantic distance between these two terms
• We can use the following types of feature functions, but are not restricted to these:
  ◦ Contextual Features
  ◦ Term Co-occurrence
  ◦ Lexical-Syntactic Patterns
  ◦ Syntactic Dependency Features
  ◦ Word Length Difference
  ◦ Definition Overlap, etc.
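The feature-function interface described above (two terms in, one numeric score out) can be sketched as follows. Both definitions are stand-ins chosen for brevity and are assumptions, not the measures used in the paper: word length difference is taken as a difference in character counts, and co-occurrence is replaced by a Jaccard distance over a toy document list with plain substring matching. All function names are hypothetical.

```python
def word_length_difference(term_a, term_b):
    """Absolute difference in character counts (one possible definition)."""
    return float(abs(len(term_a) - len(term_b)))


def cooccurrence_distance(term_a, term_b, documents):
    """1 minus the Jaccard overlap of the documents mentioning each term."""
    docs_a = {i for i, doc in enumerate(documents) if term_a in doc}
    docs_b = {i for i, doc in enumerate(documents) if term_b in doc}
    union = docs_a | docs_b
    if not union:
        return 1.0  # neither term observed: treat the pair as maximally distant
    return 1.0 - len(docs_a & docs_b) / len(union)


def feature_vector(term_a, term_b, documents):
    """Stack the individual feature scores for one term pair."""
    return [
        word_length_difference(term_a, term_b),
        cooccurrence_distance(term_a, term_b, documents),
    ]


# Toy corpus; substring matching means "basketball" also counts for "ball".
documents = [
    "the ball hit the table",
    "a basketball is a ball",
    "snooker table",
]

vector = feature_vector("ball", "table", documents)
```

Vectors of this shape are what a regression over feature functions would take as input, one row per term pair.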
EXPERIMENTAL RESULTS
• Task: reconstruct taxonomies from WordNet and ODP
  ◦ Not the entire WordNet or ODP, but fragments of each
• Ground truth: 50 hypernym taxonomies from WordNet; 50 hypernym taxonomies from ODP; 50 meronym taxonomies from WordNet
• Auxiliary datasets: 1000 Google documents per term or per term pair; 100 Wikipedia documents per term
• Evaluation metric: F1-measure (averaged by leave-one-out cross-validation)
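As a reminder of how the F1-measure is computed, the sketch below scores a predicted taxonomy against a ground-truth one. Treating each taxonomy as a set of (parent, child) relation pairs is an assumption about the scoring unit, the example pairs are invented, and the leave-one-out averaging is not shown. The function name `f1_score` is hypothetical.

```python
def f1_score(predicted, gold):
    """F1 over two sets of (parent, child) relation pairs."""
    true_pos = len(predicted & gold)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented example: three of the four predicted relations are correct.
gold = {
    ("game equipment", "ball"),
    ("game equipment", "table"),
    ("ball", "basketball"),
    ("ball", "volleyball"),
}
predicted = {
    ("game equipment", "ball"),
    ("game equipment", "table"),
    ("ball", "basketball"),
    ("table", "volleyball"),
}

score = f1_score(predicted, gold)
print(score)  # 0.75
```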
DATASETS
PERFORMANCE OF TAXONOMY INDUCTION
• Compare our system (ME) with other state-of-the-art systems
  ◦ HE: 6 is-a patterns [Hearst 1992]
  ◦ GI: 3 part-of patterns [Girju et al. 2003]
  ◦ PR: a probabilistic framework [Snow et al. 2006]
  ◦ ME: our metric-based framework
PERFORMANCE OF TAXONOMY INDUCTION
• Our system (ME) consistently gives the best F1 for all three tasks
• Systems using heterogeneous features (ME and PR) achieve a significant absolute F1 gain (>30%) over the pattern-only systems (HE and GI)
FEATURES VS. RELATIONS
• This is the first study of the impact of using different features on taxonomy induction for different relations
• Co-occurrence and lexico-syntactic patterns are good for is-a, part-of, and sibling relations
• Contextual and syntactic dependency features are good only for the sibling relation
FEATURES VS. ABSTRACTNESS
• This is the first study of the impact of using different features on taxonomy induction for terms at different abstraction levels
• Contextual, co-occurrence, lexical-syntactic pattern, and syntactic dependency features work well for concrete terms
• Only co-occurrence works well for abstract terms
CONCLUSIONS
• This paper presents a novel metric-based taxonomy induction framework, which
  ◦ Combines strengths of pattern-based and clustering-based approaches
  ◦ Achieves better F1 than 3 state-of-the-art systems
• The first study on the impact of using different features on taxonomy induction for different types of relations and for terms at different abstraction levels
CONCLUSIONS
• This work is a general framework, which
  ◦ Allows a wider range of features
  ◦ Allows different metric functions at different abstraction levels
• This work has the potential to learn more complex taxonomies than previous approaches