![Page 1: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/1.jpg)
Survey of Semantic Annotation Platforms
CILAB Seminar 2008/03/21
![Page 2: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/2.jpg)
Contents
Paper Overview
Wrapper Induction
Pattern-based Approach
Rule-based Approach
Conclusion
![Page 3: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/3.jpg)
Introducing Paper
Survey of Semantic Annotation Platforms
▪ Authors: Lawrence Reeve, Hyoil Han
▪ Affiliation: Drexel University (Philadelphia)
▪ Venue: ACM Symposium on Applied Computing
The authors examined Semantic Web annotation platforms
▪ Platform classification, overview, evaluation comparison
What I want to get from it
▪ Annotation hints for our project
▪ Term unification: pattern, rule, …
▪ Input for my own research
![Page 4: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/4.jpg)
Platform Classification
Pattern-based
▪ Discovery: seed expansion
▪ Rules: taxonomy label matching
Machine Learning-based
▪ Probabilistic: HMM, N-gram analysis
▪ Induction: linguistic, structural
![Page 5: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/5.jpg)
Platform Overview
| Platform | Method | Machine Learning | Manual Rules | Bootstrap Ontology |
| --- | --- | --- | --- | --- |
| AeroDAML | Rule | N | Y | WordNet |
| Armadillo | Pattern Discovery | N | Y | User |
| KIM | Rule | N | Y | KIMO |
| MnM | Wrapper Induction | Y | N | KMi |
| MUSE | Rule | N | Y | User |
| Ont-O-Mat: Amilcare | Wrapper Induction | Y | N | User |
| Ont-O-Mat: PANKOW | Pattern Discovery | N | N | User |
| SemTag | Rule | N | N | TAP |
![Page 6: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/6.jpg)
![Page 7: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/7.jpg)
Wrapper Induction
![Page 8: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/8.jpg)
What is Wrapper? - 1
A frame for analyzing semi-structured data (mostly on the web)
![Page 9: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/9.jpg)
What is Wrapper? - 2
![Page 10: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/10.jpg)
Wrapper Induction
Information extraction from semi-structured data by creating wrappers automatically
“Wrapper Induction for Information Extraction”, Nicholas Kushmerick (264 pp.)
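To make the idea concrete, here is a minimal sketch of inducing a wrapper from labeled examples. It is a simplified, single-attribute left/right-delimiter wrapper in the spirit of Kushmerick's LR wrappers. The function names (`induce_lr_wrapper`, `apply_wrapper`), the sample pages, and the shortcut of taking the longest shared context as the delimiter are all assumptions made for illustration, not the algorithm as stated in the thesis.

```python
import os


def common_suffix(strings):
    """Longest string that every input ends with."""
    reversed_prefix = os.path.commonprefix([s[::-1] for s in strings])
    return reversed_prefix[::-1]


def induce_lr_wrapper(examples):
    """Learn (left, right) delimiters from (page, value) training pairs.

    Assumes each value occurs in its page; the first occurrence is used.
    """
    lefts, rights = [], []
    for page, value in examples:
        start = page.index(value)
        lefts.append(page[:start])                 # text before the value
        rights.append(page[start + len(value):])   # text after the value
    left = common_suffix(lefts)
    right = os.path.commonprefix(rights)
    if not left or not right:
        raise ValueError("no common delimiters found")
    return left, right


def apply_wrapper(wrapper, page):
    """Extract every value enclosed by the learned delimiters."""
    left, right = wrapper
    values, pos = [], 0
    while True:
        start = page.find(left, pos)
        if start == -1:
            break
        start += len(left)
        end = page.find(right, start)
        if end == -1:
            break
        values.append(page[start:end])
        pos = end + len(right)
    return values


# Hypothetical training pages: the country name is the labeled value.
training = [
    ("<li><b>Congo</b> <i>242</i></li>", "Congo"),
    ("<li><b>Spain</b> <i>34</i></li>", "Spain"),
]
wrapper = induce_lr_wrapper(training)
print(apply_wrapper(wrapper, "<ul><li><b>Egypt</b> <i>20</i></li></ul>"))
```

The learned delimiters are only as general as the training pages allow, which is one reason wrapper induction tends to give high precision on pages that share a template.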
![Page 11: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/11.jpg)
Wrapper Induction
High precision
Useful bootstrapping method
Many other semantic annotation platforms use this method
Amilcare: wrapper induction tool
▪ MnM
▪ OntoMat
▪ Armadillo
![Page 12: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/12.jpg)
Pattern-based Approach
![Page 13: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/13.jpg)
OntoMat PANKOW
PANKOW: Pattern-based Annotation through Knowledge on the Web
Plugin for OntoMat
Institute AIFB, University of Karlsruhe
![Page 14: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/14.jpg)
Patterns in PANKOW
Hearst Patterns
![Page 15: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/15.jpg)
Patterns in PANKOW
Definites
Apposition and Copula
![Page 16: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/16.jpg)
The Process of PANKOW
Proper Noun Extraction (Term Extraction)
Hypothesis Phrase Construction (Using Patterns)
![Page 17: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/17.jpg)
PANKOW Example - 1
Web Page
The Extensible Markup Language (XML) is a general-purpose markup language. It is classified as an extensible language because it allows its users to define their own tags.
Web Page with Proper Noun Phrases
The |Extensible Markup Language| (|XML|) is a general-purpose |markup language|. It is classified as an |extensible language| because it allows its users to define their own |tags|.
Hypothesis Phrases
H1: <CONCEPT>s such as <INSTANCE>
▪ Extensible Markup Languages such as XML
▪ Extensible Markup Languages such as markup language
▪ XMLs such as markup language
▪ markup languages such as XML
▪ …
DEFINITE1: the <INSTANCE> <CONCEPT>
▪ the markup language XML
▪ …
▪ the tags markup language
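The hypothesis phrase construction step above can be sketched as follows. The two templates are the H1 and DEFINITE1 patterns from the slide; the function name `hypothesis_phrases`, the naive pluralization (appending "s"), and the choice to try every ordered pair of noun phrases are assumptions for illustration.

```python
from itertools import permutations

# Pattern templates as written on the slide.
PATTERNS = {
    "H1": "{concept}s such as {instance}",
    "DEFINITE1": "the {instance} {concept}",
}


def hypothesis_phrases(noun_phrases, patterns=PATTERNS):
    """Yield (pattern_name, concept, instance, phrase) for every ordered pair."""
    for concept, instance in permutations(noun_phrases, 2):
        for name, template in patterns.items():
            phrase = template.format(concept=concept, instance=instance)
            yield name, concept, instance, phrase


noun_phrases = ["Extensible Markup Language", "XML", "markup language"]
for name, concept, instance, phrase in hypothesis_phrases(noun_phrases):
    print(f"{name}: {phrase}")
```

Most generated phrases are nonsense ("XMLs such as markup language"); that is intentional, since the next step uses web hit counts to separate plausible hypotheses from implausible ones.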
![Page 18: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/18.jpg)
PANKOW Example - 2
Hypothesis Phrases
H1: <CONCEPT>s such as <INSTANCE>
▪ Extensible Markup Languages such as XML
▪ Extensible Markup Languages such as markup language
▪ XMLs such as markup language
▪ markup languages such as XML
▪ …
DEFINITE1: the <INSTANCE> <CONCEPT>
▪ the markup language XML
▪ …
▪ the tags markup language
Number of hits for phrase
▪ Extensible Markup Languages such as XML -- 3
▪ Extensible Markup Languages such as markup language -- 0
▪ XMLs such as markup language -- 0
▪ markup languages such as XML -- 834
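A minimal sketch of how the hit counts can be turned into a decision: pick the (concept, instance) pair whose hypothesis phrase has the most hits. The counts are copied from the slide and stand in for search engine queries; the `HITS` table and the `best_hypothesis` helper are hypothetical, and a fuller system would presumably combine counts across several patterns rather than use H1 alone.

```python
# Hit counts copied from the slide, keyed by (concept, instance) for pattern H1.
# A real system would obtain these by querying a search engine.
HITS = {
    ("Extensible Markup Language", "XML"): 3,
    ("Extensible Markup Language", "markup language"): 0,
    ("XML", "markup language"): 0,
    ("markup language", "XML"): 834,
}


def best_hypothesis(hits):
    """Return the (concept, instance) pair with the most hits, or None if all are zero."""
    pair, count = max(hits.items(), key=lambda item: item[1])
    return pair if count > 0 else None


concept, instance = best_hypothesis(HITS)
print(f"{instance} is an instance of {concept}")
```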
![Page 19: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/19.jpg)
PANKOW Example - 3
Number of hits for phrase
▪ Extensible Markup Languages such as XML -- 3
▪ Extensible Markup Languages such as markup language -- 0
▪ XMLs such as markup language -- 0
▪ markup languages such as XML -- 834
Annotated Document
The Extensible Markup Language (<Term id="2" instanceOf="3">XML</Term>) is a general-purpose <Term id="3" conceptOf="2">markup language</Term>. It is classified as an extensible language because it allows its users to define their own tags.
Ontology concepts shown in the figure: Computer Language, Markup Language, Programming Language
![Page 20: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/20.jpg)
Rule-based Approach
![Page 21: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/21.jpg)
Semantic Annotation in KIM
Upper Ontology
Named Entity Recognition
Mapping
![Page 22: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/22.jpg)
Key Points in KIM
A semantic annotation system requires a lightweight upper-level ontology focused on named entity classes
RDF(S), with compliance and possible extensions to OWL Lite, is the best choice of knowledge representation language for the ontology and the KB
▪ More expressive power would unnecessarily degrade scalability and performance
The documents and the metadata (annotations) should be kept decoupled from each other and separate from the ontology and the knowledge base
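The decoupling point can be illustrated with stand-off annotation: the document text is left untouched, and each annotation is stored separately as character offsets plus references into the ontology and knowledge base. This is only a sketch of the general idea; the `Annotation` class, its fields, and the `ex:` identifiers are hypothetical and do not describe KIM's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    doc_id: str
    start: int       # character offset, inclusive
    end: int         # character offset, exclusive
    class_uri: str   # ontology class (hypothetical identifier)
    entity_uri: str  # knowledge-base entity (hypothetical identifier)


# Documents and annotations live in separate stores.
documents = {"doc1": "XML is a general-purpose markup language."}
annotations = [
    Annotation("doc1", 0, 3, "ex:MarkupLanguage", "ex:XML"),
]


def annotated_spans(doc_id):
    """Resolve stored offsets against the unchanged document text."""
    text = documents[doc_id]
    return [
        (text[a.start:a.end], a.class_uri)
        for a in annotations
        if a.doc_id == doc_id
    ]


print(annotated_spans("doc1"))
```

Because the text itself is never modified, the same document can carry several independent annotation sets, and the ontology can evolve without rewriting the documents.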
![Page 23: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/23.jpg)
Rules in KIM
Lists of mapping rules
80,000 mapping rules already
▪ Date, Person, Organization, Location, Percent, Money
![Page 24: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/24.jpg)
Evaluation
![Page 25: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/25.jpg)
Platform Evaluation
| Framework | Precision | Recall | F-Measure |
| --- | --- | --- | --- |
| Armadillo | 91.0 | 74.0 | 87.0 |
| KIM | 86.0 | 82.0 | 84.0 |
| MnM | 95.0 | 90.0 | n/a |
| MUSE | 93.5 | 92.3 | 92.9 |
| Ont-O-Mat: PANKOW | 65.0 | 28.2 | 24.9 |
| SemTag | 82.0 | n/a | n/a |
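As a reading aid for the table above, the balanced F-measure is the harmonic mean of precision and recall. The sketch below recomputes it from the precision and recall columns; the `f_measure` helper and the `beta` parameter are the standard textbook definition, not something stated on the slide.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# (precision, recall, reported F-measure) for rows that report all three values.
rows = {
    "Armadillo": (91.0, 74.0, 87.0),
    "KIM": (86.0, 82.0, 84.0),
    "MUSE": (93.5, 92.3, 92.9),
    "Ont-O-Mat: PANKOW": (65.0, 28.2, 24.9),
}

for name, (p, r, reported) in rows.items():
    print(f"{name}: balanced F1 = {f_measure(p, r):.1f}, reported = {reported}")
```

The KIM and MUSE rows agree with the balanced F1 of their precision and recall. The Armadillo and PANKOW rows do not (balanced F1 would be about 81.6 and 39.3), which suggests the reported figures may come from different weightings or different experiments and should not be compared directly.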
![Page 26: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/26.jpg)
Unfairness in Evaluation
Definitions and scope of semantic annotation are different
▪ PANKOW: concept and instance annotation
▪ Armadillo: restricted NE annotation (Human, Paper)
▪ KIM: NE annotation (Date, Person, Organization, Location, Percent, Money)
"To the best of our knowledge there is no well established term for this task; Neither there is a well established meaning for the term 'semantic annotation'" (from "KIM – Semantic Annotation Platform")
![Page 27: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/27.jpg)
Conclusion
Terms like pattern, rule, and semantic annotation are very ambiguous
▪ Defining these terms in a way that suits our project is important
Wrapper induction for bootstrapping data
PANKOW term extraction method
Upper ontology is important
▪ Every annotation tool has an upper ontology and maps extracted entities to it
▪ KIMO is well defined
Separation of relation extraction from concept gathering
![Page 28: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/28.jpg)
The end
![Page 29: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/29.jpg)
![Page 30: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/30.jpg)
![Page 31: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/31.jpg)
Conclusion
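Named Entity (narrowing down what you want to extract makes the task easier)
Separate concept registration from relation building
Use an upper ontology
Use annotation tools in a way that fits your own purpose.
Tools that use the same term do not necessarily do the same thing.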
![Page 32: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/32.jpg)
The meaning of Semantic Annotation in each paper
![Page 33: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/33.jpg)
Named Entity Recognition
![Page 34: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/34.jpg)
Term unification
Pattern, Rule, Machine Learning
Extracting patterns from new triples is not machine learning.
![Page 35: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/35.jpg)
Pattern
Example of Ont-O-Mat: PANKOW
▪ PANKOW: Pattern-based Annotation through Knowledge on the Web
Patterns in PANKOW
Linguistic patterns (similar to our patterns)
▪ Hearst Patterns
▪ Definites
▪ Apposition and Copula
They use patterns to extract concepts and instances from text
![Page 36: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/36.jpg)
Pattern Discovery in PANKOW
![Page 37: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/37.jpg)
Evaluation method: Precision, Recall
Where did they get their evaluation sets?
![Page 38: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/38.jpg)
Examples of major programs: KIM?
The form in which each program stores its annotations
![Page 39: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/39.jpg)
Annotation Tools Not Covered
MMAX2 EML
OntoNote
![Page 40: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/40.jpg)
Differences from our annotation
![Page 41: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/41.jpg)
Which program would be most useful for our project?
![Page 42: 슬라이드 1](https://reader036.vdocuments.site/reader036/viewer/2022081413/5464a41faf795950608b5122/html5/thumbnails/42.jpg)
Reference