06/03/'07 upd 04/03/08 CmpE 588 Spring 2008 EMU 1 Tools for Semantic Annotation Atilla ELÇİ Dept. of Computer Engineering Eastern Mediterranean University


Semantic Annotation

• Transcoding: transformation of information from one form to another
• Web content transcoding: transcoding performed along the Web transaction path
  – Objective: repurposing
  – Annotations are kept externally:
    • They cannot be embedded internally because HTML is rigid.
    • Even if inserted, they would not be recognized (they are skipped) by browsers.


Annotation: Passin Ch. 4.

Def.: The act of adding notes (WordNet)


Annotation continued

The Semantic Web can improve the usefulness of annotations through the following features:
• Annotation discovery by agents
• Machine-understandable annotation
• Intelligent filtering of annotations (needs annotation metadata)
• Improved searching
• Enhancement by agents
• Collaboration over annotations and supporting data
• Information extraction


Web Annotation Systems

• Inline editing: download and edit an HTML page, then save it in your own domain
• External annotation: the browser first downloads a page and its annotation (from your domain or another server), then merges them on the fly and displays the result
• External server: a server creates the annotated view and serves it

Important criteria to consider:
• Ownership of annotations
• Tool becoming obsolete / unavailable
• Server scalability


Current Web Annotation Systems

• Wiki collaboratives: allow members to edit Web content on the fly. Check a dictionary. Examples:
  – The original Wiki by Ward Cunningham
  – Wikipedia and others of the Free Encyclopedia Project
  – Ontolog Wiki
• W3C Annotea Project and the Amaya editor-cum-browser: uses RDF to describe annotations.
• Multivalent Browser:
  – Multivalent Home Page
  – Wikipedia entry


External Annotation Framework

• Ref.: Hori; also available in the ACM PD Bookshelf.
• Def.: A scheme for representing annotation files and a way of associating original documents with external annotations.
  – An annotation file contains metadata addressing part of the Web document.
  – XPath & XPointer are used to link the two.
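
To make the linking idea concrete, here is a minimal sketch in Python. It assumes a well-formed, XHTML-like document and a made-up annotation file format: the `annotations` and `note` elements and the `target` and `role` attributes are hypothetical and are not taken from Hori's framework. It also relies on the limited XPath subset supported by Python's `xml.etree.ElementTree`, not on full XPath or XPointer.

```python
import xml.etree.ElementTree as ET

# Original document (well-formed markup so that ElementTree can parse it).
DOCUMENT = """
<html>
  <body>
    <h1>News</h1>
    <p>First paragraph.</p>
    <p>Second paragraph.</p>
  </body>
</html>
"""

# External annotation file. Element and attribute names are hypothetical:
# each <note> carries a path expression that addresses part of the document.
ANNOTATIONS = """
<annotations>
  <note target="./body/p[2]" role="secondary"/>
  <note target="./body/h1" role="title"/>
</annotations>
"""


def resolve_annotations(document_xml, annotation_xml):
    """Pair each external note with the document element it addresses."""
    doc = ET.fromstring(document_xml.strip())
    notes = ET.fromstring(annotation_xml.strip())
    resolved = []
    for note in notes.findall("note"):
        # The path is evaluated relative to the document root element.
        target = doc.find(note.get("target"))
        if target is not None:
            resolved.append((note.get("role"), target.tag, target.text))
    return resolved


resolved = resolve_annotations(DOCUMENT, ANNOTATIONS)
for role, tag, text in resolved:
    print(role, tag, text)
```

The original document is never modified; the annotation file alone decides which parts receive which metadata, which is the point of keeping annotations external.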


Annotation Rendering: Client Preference & Capability

• Content adaptation requires dealing with the client feature set:
  – User preferences
  – Device capabilities
• W3C’s Composite Capability / Preference Profiles (CC/PP) is used to describe such information profiles.
• CC/PP specifies that client profiles can be delivered to a proxy server over HTTP.
• Thus the proxy server is able to consider together the original document, the annotation(s) on it, and the client’s CC/PP specs when transcoding the content.
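
The last point can be illustrated with a small Python sketch. It is a simplification under stated assumptions: real CC/PP profiles are expressed in RDF, whereas here the profile is a plain dictionary, and the keys `images_supported` and `min_importance`, the `Part` class, and the `importance` score are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Part:
    kind: str          # "text" or "image"
    content: str
    importance: float  # assumed to come from an external annotation, 0.0 to 1.0


def transcode(parts, profile):
    """Return the content of the parts that suit the given client profile."""
    kept = []
    for part in parts:
        # Device capability: skip images the client cannot render.
        if part.kind == "image" and not profile.get("images_supported", True):
            continue
        # User preference: skip parts annotated as less important.
        if part.importance < profile.get("min_importance", 0.0):
            continue
        kept.append(part.content)
    return kept


page = [
    Part("text", "Headline", 1.0),
    Part("image", "banner.png", 0.2),
    Part("text", "Sidebar links", 0.3),
    Part("text", "Main story", 0.9),
]

# Two hypothetical client profiles, as a proxy might receive them.
phone = {"images_supported": False, "min_importance": 0.5}
desktop = {"images_supported": True, "min_importance": 0.0}

phone_view = transcode(page, phone)
desktop_view = transcode(page, desktop)
print(phone_view)
print(desktop_view)
```

The same document and the same annotations yield different views, because only the client profile changes between the two calls.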


Annotation-Based Transcoding System

• Transcoding Architecture: Abstract architecture based on intermediary (proxy) between client & server.


Ex: Page Splitting

• Annotation vocabulary: “pcd” consists of:
  – Alternatives
  – Splitting hints (a in the ex.)
  – Selection criteria (b in the ex.)
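
As a rough illustration of how splitting hints might be used, the Python sketch below packs annotated groups into subpages without ever breaking a group apart. This is an assumption-laden toy, not the algorithm from the referenced chapter: the group names, the size units, and the greedy packing strategy are all hypothetical.

```python
def split_page(groups, capacity):
    """Greedily pack consecutive groups into subpages of limited capacity.

    `groups` is a list of (name, size) pairs. Each group stands for a region
    that a splitting hint marks as one unit, so it is never divided.
    """
    pages, current, used = [], [], 0
    for name, size in groups:
        # Start a new subpage when the next group would not fit.
        if current and used + size > capacity:
            pages.append(current)
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        pages.append(current)
    return pages


groups = [("header", 2), ("news", 5), ("weather", 3), ("links", 4)]

large_screen = split_page(groups, capacity=8)
small_screen = split_page(groups, capacity=5)
print(large_screen)
print(small_screen)
```

A smaller capacity, standing in for a smaller device screen, produces more subpages from the same annotated document.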


Ex: Page Splitting (continued)

• Adaptation engine:


Information Extraction (IE) (Bontcheva et al @Davies Ch. 3)

≡ A technology based on analysing natural language in order to extract snippets of information and produce fixed format, unambiguous data as output.

Types of information sought (ref. Message Understanding Conference, MUC-7, definitions):

1. Entities (NE): such as people, places, organizations, quantities of commodities, dates, etc.

2. Mentions (CO): places of references to entities in the text

3. Descriptions (TE): of the entities

4. Relations (TR): between entities

5. Events (ST): involving entities

Semantic annotation: assigning to entities and relations in the text links to their semantic descriptions in an ontology.
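
The definition above can be sketched in a few lines of Python. This is a toy gazetteer lookup, assuming exact string matching instead of real natural language analysis; the `example.org` URIs and the tiny `ONTOLOGY` table are hypothetical placeholders for links into an actual ontology.

```python
import re

# Hypothetical mapping from entity labels to ontology URIs.
ONTOLOGY = {
    "Eastern Mediterranean University":
        "http://example.org/onto#EasternMediterraneanUniversity",
    "Famagusta": "http://example.org/onto#Famagusta",
}


def annotate(text, gazetteer):
    """Find known entity labels in the text and link each mention to its URI."""
    annotations = []
    for label, uri in gazetteer.items():
        for match in re.finditer(re.escape(label), text):
            annotations.append({
                "start": match.start(),
                "end": match.end(),
                "text": label,
                "uri": uri,
            })
    # Report mentions in reading order.
    return sorted(annotations, key=lambda a: a["start"])


text = "Eastern Mediterranean University is located in Famagusta."
annotations = annotate(text, ONTOLOGY)
for a in annotations:
    print(a["start"], a["end"], a["text"], a["uri"])
```

Each annotation records where the mention occurs and which ontology entry it refers to, so the text itself stays unchanged.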


HLT & Semantic Web

(Figure 3.1 in Bontcheva et al. @ Davies)


Applying IE in SemWeb: Traditional Cases

• “Traditional” IE: annotating with metadata; the ontology is not incorporated into the annotated text, such as Web pages:
  – AeroDAML annotation tool (2001): automatically generates DAML annotations from Web pages
  – Amilcare IE system (2003): a machine learning system that produces extracted information as triples for use by an annotation tool
  – MnM semantic annotation tool (2002): semi-automatic; piggy-backed on Amilcare
  – S-CREAM (2002): automatic annotation using the Ont-O-Mat manual annotation tool, which implements the CREAM framework for creating relational metadata, together with Amilcare


Applying IE in SemWeb: Ontology-Based Cases

• Ontology-based IE: annotating using a formal ontology as one of the system’s resources:
  – PANKOW (Pattern-based Annotation through Knowledge on the Web, 2004): gathers surface patterns with respect to a given ontology.
    • Patterns: <instance> <concept>, <instance> is a <concept>
    • Checks validity through Google queries
    • Automatic performance: 24.9% against human performance of 62.09%; semi-automatic: 49.56%
  – SemTag: large-scale semantic annotation with respect to the TAP ontology.
  – KIM (Knowledge and Information Management system by Ontotext Lab, 2005): now taken over by the SEKT Project.
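
The pattern-based idea listed for PANKOW can be sketched as follows. This Python toy uses only the two surface patterns named above and replaces Web search hit counts with counts over a small, made-up local corpus; the `count_hits` and `categorize` functions and the sample sentences are assumptions for illustration, not the actual system.

```python
# The two surface patterns mentioned above, as format strings.
PATTERNS = ["{instance} {concept}", "{instance} is a {concept}"]


def count_hits(phrase, corpus):
    """Stand-in for a Web search hit count: number of sentences with the phrase."""
    phrase = phrase.lower()
    return sum(phrase in sentence.lower() for sentence in corpus)


def categorize(instance, concepts, corpus):
    """Score each candidate concept by pattern hits and return the best one."""
    scores = {}
    for concept in concepts:
        phrases = [p.format(instance=instance, concept=concept) for p in PATTERNS]
        scores[concept] = sum(count_hits(phrase, corpus) for phrase in phrases)
    best = max(scores, key=scores.get)
    return best, scores


# Made-up sentences standing in for Web pages.
corpus = [
    "Nicosia is a city on the island of Cyprus.",
    "We stayed at the Nicosia hotel near the old town.",
    "The Nicosia city centre is divided.",
]

best, scores = categorize("Nicosia", ["city", "hotel", "river"], corpus)
print(best, scores)
```

The concept whose instantiated patterns occur most often is taken as the annotation for the instance. A semi-automatic variant would show the ranked scores to a human for confirmation.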


Semantic HTML?

• W3C has now issued a draft of HTML 5 aiming to formalize semantic annotations as the next version of the HTML tag vocabulary:
  – Tony Patton's blog
  – Differences of HTML 5 from the previous version
  – HTML5 specs


References

• Thomas B. Passin. Explorer’s Guide to the Semantic Web. Manning, 2004. Ch. 4.
• Masahiro Hori. Semantic Annotation for Web Content Adaptation. Ch. 14 in Spinning the Semantic Web, Dieter Fensel et al. (eds.). MIT Press, 2003. (Check ACM Digital Library Books at Professional Development Center.)
• Adobe’s annotation reference.
• Yuce’s thesis project.
• Bontcheva et al. (2006). Semantic Annotation and Human Language Technology. Ch. 3 in Semantic Web Technologies, Davies et al. (eds.). Wiley.