facilitating document annotation using content and querying value

10
We present a novel alternative approach that facilitates the generation of the structured metadata by identifying documents that are likely to contain information of interest and this information is going to be subsequently useful for querying the database. Our experimental evaluation shows that our approach generates superior results compared to approaches that rely only on the textual content or only on the query workload, to identify attributes of interest. Abstract

Upload: cegon-technologies

Post on 15-Jul-2015

183 views

Category:

Engineering


2 download

TRANSCRIPT

Page 1: Facilitating Document Annotation Using Content and Querying Value

We present a novel alternative approach that

facilitates the generation of the structured metadata

by identifying documents that are likely to contain

information of interest and this information is going

to be subsequently useful for querying the database.

Our experimental evaluation shows that our

approach generates superior results compared to

approaches that rely only on the textual content or

only on the query workload, to identify attributes of

interest.

Abstract

Page 2: Facilitating Document Annotation Using Content and Querying Value

Existing system

Many annotation systems allow only “untyped”

keyword annotation for instance, a user may annotate a

weather report using a tag such as “Storm Category 3”.

In Datas paces, users provide data integration hints at

query time.

This results in data entry users ignoring such annotation

capabilities.

Page 3: Facilitating Document Annotation Using Content and Querying Value

Disadvantages of existing system

The cost is high for creation of annotation information.

The existing system produces some errors in the

suggestions.

Page 4: Facilitating Document Annotation Using Content and Querying Value

Proposed System

In this paper, we propose CADS (collaborative adaptive

data sharing platform), which is an “annotate-as-you

create” infrastructure that facilitates fielded data

annotation.

We are trying to prioritize the annotation of documents

towards generating attribute values for attributes that are

often used by querying users.

Page 5: Facilitating Document Annotation Using Content and Querying Value

Advantages of proposed system

We present an adaptive technique for automatically

generating data input forms, for annotating

unstructured textual documents.

We create principled probabilistic methods and

algorithms to seamlessly integrate information from the

query workload into the data annotation process

Page 6: Facilitating Document Annotation Using Content and Querying Value

Software Configuration

Operating System - Windows XP/7

Programming Language - Java/J2EE

Software Version - JDK 1.7 or above

Database - MYSQL

Page 7: Facilitating Document Annotation Using Content and Querying Value

Hardware Configuration

Processor - Pentium IV

Speed - 1.1 Ghz

RAM - 512 MB (min)

Hard Disk - 20GB

Keyboard - Standard Keyboard

Mouse - Two or Three Button Mouse

Monitor - LCD/LED Monitor

Page 8: Facilitating Document Annotation Using Content and Querying Value

UML Diagrams

Use case Diagram:

ADMIN USER

DETAILS

INFORMATION

ADD SOURCE

SERACH BY Q1,Q2,Q3, CONTENT

Page 9: Facilitating Document Annotation Using Content and Querying Value

Class diagram

USER

View SourceQ1, Q2, Q3,Content,

Download()

Sink

Action receiveProvide Services

ADMIN

Add SourceAdd ContentAdd Information

Add Source()

Page 10: Facilitating Document Annotation Using Content and Querying Value

Sequence Diagram

Admin Storage User

SearchAdding File

Adding source

Provide Information

Get Information