Metadata for DL Metadata Architecture for Digital Libraries: Conceptual Framework for Indian Digital Libraries Madhusudana Rao CR C-DAC, Bangalore.

Post on 20-Jan-2016


Metadata for DL

Metadata Architecture for Digital Libraries:

Conceptual Framework for Indian Digital Libraries

Madhusudana Rao CR

C-DAC, Bangalore.

Agenda

• Introduction

• Metadata

• Digital Library Architecture
  – SODA
  – STARTS

• Indian Digital Library
  – Background

Agenda

  – Proposed Architecture
  – SODA & STARTS

• Conclusion

Exclude

• Search Engines - General

• Digital Library - General

Introduction

• Information Processing & Retrieval
  – Typical Library Environment
  – Library Automation
  – Networking of Libraries
  – Digital Library
  – Digital Library initiatives

Introduction

• Digital Library Scene
  – Search Engines
    • Heterogeneous
    • Vertical Information Retrieval
    • Unique User Interface
    • Search engines are different
    • Protocols are different
    • Querying & Ranking
    • Incompatible across the sources

Introduction

  – Possible solutions
    • Identifying the User Group
    • Identifying the Information Sources
    • Negotiating with different Information Sources
    • Resource Description Format
    • Choose best Information Source to evaluate Query
    • Evaluate the query at these sources
    • Merge the Query Results from these sources

New Protocol

• User

• User Query

• Information Source

• Networked Environment

• RDF Metadata

• User Interface

• Search & Retrieval

Issues..

• Metadata

• Network Protocols

• Possible Solutions for typical environment

Metadata…definition

Structured data about data...

Metadata…definition

• Metadata is data that helps to design, create, describe, preserve and use information systems and resources.

• Metadata can play a role in the development of effective, authoritative, interoperable, scalable, and preservable information and record-keeping systems.

Metadata…means

• Information Resource

• Library Catalogue
  – Index, Abstracts, Catalog Records, etc. > MARC, AACR, LCSH, etc.

• Human Generated Textual description

• Machine generated data

Metadata…features

• Content
  – Intrinsic
    • What does it contain?
    • What is it about?
• Context
  – Extrinsic
    • Who, What, Why, Where, How, etc.
• Structure
  – Formal Set

Metadata…Attributes

• Intrinsic
  – Subject, Title, Author, Publisher, Publication place, Other agent, Date, Object type, Form - Identifier, Relation, Source, Language, Coverage, Abstract, Version, Notes, Signature, Classification, Keyword

Metadata…Attributes

• Extrinsic
  – System Requirement, Mode of access, Availability, Cost, Control, Extent, Encoding description, Revision description

Metadata…for two communities

• Information Generators

• Librarians / Cataloguers

Metadata… can be

• Information Objects
  – Physical
  – Intellectual Form

Metadata…similar

• Typical Physical Library:
  – Catalogue
  – Book Racks
  – Books

Metadata…currently

• Electronic Information Environment
  – Users search Metadata
  – Pointers
  – Primary Information available on computer display
• Distinction
  – Electronic Environment

Metadata…process

[Diagram labels: Two Communities (Generators of Information; Libraries & Cataloguers); Users; Metadata]

Metadata…can be

• Need not be digital
• More than the description of an object
• Comes from a variety of sources
• Continues to accrue
• One object's metadata can also be another information object's metadata

Metadata…can be

• Intermediate steps to retrieve content

• Surrogates of objects

Metadata… need

• The Internet & WWW have witnessed exponential growth

• The need of the hour on the Internet is catalogs of some kind

• The Internet/WWW was not designed to catalog its contents

Metadata…need

• Resource Description is a Challenge

• Tools are available

• But these are just directory listings of network resources and search engines

• Metadata is one of the solutions

• Again, standards are yet to make their impact

Metadata…issues

• Increased accessibility
  – Searching > existence of rich and consistent metadata
  – Search across multiple collections
  – Distributed across several repositories

Metadata…issues

• Retention of Context
  – Collection of objects
  – Complex interrelationships with people, places, movements & events
  – Documenting and maintaining those relationships
  – Authenticity, structural and procedural integrity

Metadata…issues

• Expanding use
  – Disseminating digital versions
  – Geography
  – Economics
  – Infinite ways to search information
  – Retrieve to wider community

Metadata…issues

• Multi-versioning
  – Variant versions
  – High resolution copy for preservation
  – Low resolution copy for thumbnail image for quick reference and network transfers

Metadata…issues

• Legal Issues
  – Track many layers of rights and reproduction information
  – Privacy
  – Proprietary interests

Metadata…issues

• Preservation
  – Generations - H/W & S/W
  – Technical, Descriptive and Preservation data
  – Information objects to remain accessible and intelligible over time

Metadata…issues

• System improvement and economics
  – Benchmarking
  – Planning new systems

Metadata..life cycle

[Cycle diagram: Creation & Multi-versioning > Organization > Searching & Retrieval > Utilization > Preservation & Disposition]

Metadata…standards

• For metadata to be useful & cost-effective, it is essential that
  – Structure, semantics and syntax conform to standards
  – It captures the essence of sources
  – It follows a distributed metadata model

Metadata…standards

• There is no single international standard for Metadata

• Different levels - from complex, rich formats to simple formats

• Several metadata schemes have been proposed for different levels of requirements

Metadata…standards

• IAFA templates

• WWW semantic header

• URC (Uniform Resource Citation)

• OCLC InterCat project

• TEI (Text Encoding Initiative)

• Search engine meta tags

• Resource Description Framework

• EAD (Encoded Archival Description)

• GILS (Govt Information Locator Service)

• Federal Geographic Data Committee

• Museum Educational Site Licensing Project

• Dublin Core

Dublin Core

Because it is simple… yet effective…

Dublin Core..means

• Dublin, Ohio

• International consensus meetings, workshops, etc

• Emerging Infrastructure for Internet

• Support Resource Discovery

• Elements represent a broad interdisciplinary consensus

• Core set of elements

Dublin Core..standard

• Comprises 15 core elements

• Consensus by an international, cross-disciplinary group representing
  – Library & Information
  – Computer Science
  – Text Encoding
  – Museum
  – Related fields of scholarship

Dublin Core..standard

• Each of the 15 elements is optional and repeatable

• Each element has a limited set of qualifiers and attributes

• Simple DC

• Qualified DC

Dublin Core..goals

• Simplicity of creation & maintenance
  – Non-specialists can create descriptive records for effective retrieval in a networked environment
• Commonly understood semantics
  – Digital tourist for non-specialist searcher
  – Convergence of common, more generic elements
  – Increasing visibility and accessibility

Dublin Core..goals

• International scope
  – 20 languages
  – Coordinating efforts
  – RDF - WWW
• Technical challenges of Internationalization
  – Multilingual & multicultural nature of the electronic information universe

Dublin Core..goals

• Extensibility
  – Additional resource discovery needs

Dublin Core..elements

• Content
  – Coverage, Description, Type, Relation, Source, Subject and Title
• Intellectual property
  – Contributor, Creator, Publisher & Rights
• Instantiation
  – Date, Format, Identifier & Language
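
The three element groups above can be sketched as a small data structure. The snippet below is an illustration only, not part of the original slides: the record values are invented, the `to_meta_tags` helper is hypothetical, and the `DC.` prefix in HTML `<meta>` tags is assumed here as one common way of embedding Dublin Core in a web page.

```python
from html import escape

# The 15 Dublin Core elements, grouped as on the slide above.
DC_ELEMENTS = {
    "content": ["coverage", "description", "type", "relation",
                "source", "subject", "title"],
    "intellectual_property": ["contributor", "creator", "publisher", "rights"],
    "instantiation": ["date", "format", "identifier", "language"],
}
ALL_ELEMENTS = {name for group in DC_ELEMENTS.values() for name in group}


def to_meta_tags(record):
    """Render a record as HTML <meta> tags, one tag per value.

    `record` maps an element name to a list of values, because every
    element is optional (may be absent) and repeatable (may hold
    several values).
    """
    unknown = set(record) - ALL_ELEMENTS
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    lines = []
    for name in sorted(record):
        for value in record[name]:
            lines.append(f'<meta name="DC.{name}" content="{escape(value)}">')
    return "\n".join(lines)


# Hypothetical record for an invented manuscript.
record = {
    "title": ["Palm leaf manuscript on grammar"],
    "language": ["sa"],
    "format": ["image/tiff", "image/jpeg"],
}
print(to_meta_tags(record))
```

Storing every element as a list keeps the "optional and repeatable" rule visible in the data itself: a missing key means the element is absent, and a repeated element is simply a longer list.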

Dublin Core..implementation

• The Dublin Core web site lists 15 implementations in North America and Mexico, [number missing in source] in Europe, and 12 in Asia and Australia

Digital Library Architecture

• SODA (Smart Objects Dumb Archives)

• STARTS (Stanford Protocol proposal for Internet Retrieval and Search)

Digital Library

• Digital Library Services
  – User
    • Functionality & Interface
  – Searching
  – Browsing
• Archive
  – Managed sets of objects

Digital Library

• Digital Object
  – Stored and trafficked digital content
    • Simple files
    • Sophisticated objects

Digital Library

[Diagram labels: Publishers; Digital Objects in Archives; Archive 1, Archive 2, … Archive N; Digital Library Service Providers; Digital Library Services; Digital Objects out of Archives; Library Users]

Digital Library.. builds

• Identifying a user group

• Identifying archives holding information of interest

• Negotiating terms and conditions with publishers

• Creating Indices

• Services such as Search & Browse

Digital Library.. builds

• Creating User interaction services
  – Terms & Conditions
  – Authentication
  – Billing
  – Display

Digital Library.. hindered

• Interoperability

• Object mobility

• Complex archives

Digital Library..cons

• Digital Libraries are partitioned
  – Discipline - Computer Science, Aeronautics, Physics, etc.
  – Format - Technical reports, video, software, etc.

• Interdisciplinary search difficult

• Resource Description includes manuscripts, software, data sets etc.

Digital Library..cons

• Manuscripts Vs Other objects - Reintegration

• All digital storage and transmission, tight integration

SODA…background

• Information generated in several forms

• Differentiated by semantic types (report, software, video, data sets etc.)

• A given semantic type is further differentiated by syntactic representation (PS, PDF, Word)

• Media boundaries exist

SODA…addresses

• Archive-independent container construct

• All semantic and syntactic data types

• Objects that are logically grouped together

• Archived & manipulated as a single object

• Several objects can communicate with each other

• Arbitrary network services

SODA..addresses

• Traditional functionality associated with archives has been pushed down into objects

• Making objects smarter/increase the responsibility

• Archives dumber/decrease the responsibility

SODA

• Archives exist to help the user locate objects

• Once an object is found, the user interacts directly with the object

Smart Objects.. illustration

                 Smart Archives         Dumb Archives
Smart Objects    SOSA (Ex: none)        SODA (Ex: NCSTRL+)
Dumb Objects     DOSA (Ex: NCSTRL)      DODA (Ex: FTP server)

SOSA: Smart Objects, Smart Archives
SODA: Smart Objects, Dumb Archives
DOSA: Dumb Objects, Smart Archives
DODA: Dumb Objects, Dumb Archives

SODA Model…implementation

Buckets..containers

• Object oriented containers
• Logically grouped items are
  – Collected
  – Stored
  – Transported as a single unit
• Many forms of same data
• Related & non-traditional data (supportive material)

Buckets.. containers

• Multiple packages

• Packages can correspond to semantic types
  – Manuscript, software, etc.
  – Metadata
  – Terms and conditions
  – Pointers

• Single package can have several items

Bucket..architecture

[Diagram: a bucket holds packages, and each package holds elements]

Handle (unique ID)
Access Methods
Packages inside the bucket, with the elements inside each package:
• Terms and Conditions
• Metadata: RFC 1807, Dublin Core
• Manuscript: .ps, .pdf, .tex, .doc
• Software: .tar, .c, .java, .asp
• Images: .gif, .jpg
• Data sets: .xls, .tar

Bucket…requirements

• Unique ID - handle

• Either standalone or multiple repositories

• Standalone - WWW through TCP/IP

• Moderation of number of buckets through intelligence and functionality

• Individual buckets may have custom terms and conditions

Buckets..characteristics

• Is of arbitrary size

• Globally unique ID

• 0 or more components called packages

• Package contains 1 or more components - elements

• Element can be a file or pointer

• Packages and elements can be other buckets

Buckets..characteristics

• A package can be a pointer to a remote bucket, another package or an element

• Buckets can keep internal logs of actions

• Interactions or communication between buckets are made only through defined methods

• Buckets can initiate actions, they do not have to wait to be acted on
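
The bucket characteristics listed above can be sketched as a few small classes. This is a minimal illustration under stated assumptions, not the SODA or NCSTRL+ implementation: the class names, the `Pointer` type, the handle format and the log messages are all hypothetical, and nested buckets and bucket-initiated actions are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Pointer:
    """Reference to a remote bucket, package or element (hypothetical form)."""
    target: str


@dataclass
class Element:
    name: str
    content: Union[bytes, Pointer]  # a file's bytes, or a pointer


@dataclass
class Package:
    name: str
    elements: List[Element] = field(default_factory=list)


@dataclass
class Bucket:
    handle: str  # globally unique ID
    packages: Dict[str, Package] = field(default_factory=dict)  # 0 or more
    log: List[str] = field(default_factory=list)  # internal log of actions

    # Callers go through these methods only, never the raw dictionaries.
    def add_element(self, package_name, element):
        # A package is created on first use, so it always has >= 1 element.
        package = self.packages.setdefault(package_name, Package(package_name))
        package.elements.append(element)
        self.log.append(f"add {package_name}/{element.name}")

    def list_elements(self):
        self.log.append("list")
        return [f"{p.name}/{e.name}"
                for p in self.packages.values() for e in p.elements]


# Invented example content.
bucket = Bucket(handle="example/0001")
bucket.add_element("manuscript", Element("report.pdf", b"%PDF-"))
bucket.add_element("manuscript", Element("report.ps", b"%!PS"))
bucket.add_element("metadata", Element("dublin_core", Pointer("example/0002")))
print(bucket.list_elements())
```

Because the archive only needs the handle to store and return a bucket, everything else (contents, log, access rules) travels with the object, which is the "smart object, dumb archive" split in miniature.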

Traditional Vs Bucket repository

Traditional repository      Bucket repository
User                        User
Repository Interface        Repository Interface
Intelligence                Optional intelligence
Archived objects            Archived Buckets
                            Bucket extraction procedure

Buckets..protocol

[Diagram labels: Index holdings; Search/retrieve holdings; Display holdings; Bucket; Archive; User]

Bucket..Tools

• Author Tool
  – Metadata
  – Adds packages
  – Adds elements to package
  – Selects applicable clusters
  – Terms and conditions

Bucket..Tools

• Management Tool
  – Interface
  – Query and update buckets
• Bucket Matching System
  – SDI
  – Find similar works by different authors
  – Arbitrary SDI
  – Metadata scrubbing

Buckets..implementation

• NCSTRL

• NCSTRL+

STARTS

• Stanford Digital Library Project

• Search Engine Vendors

STARTS

• Document Sources
  – Internal networks
  – Internet
• Source Contents
  – Hidden behind search interfaces
• Algorithms/Protocols are different

STARTS..Architecture

STARTS..Architecture

• Large Number of resources

• Each resource consists of one or more sources

• A source is a collection of files

• Accepts queries from clients and produces results

• Sources may be small or large

• Extract the source list from resources periodically

STARTS..Architecture

• Extract Metadata and content summaries from source periodically

• Send the query to a source at a resource

• Communicate with promising resources

• Results come from multiple sources; merge them & return them to the user

STARTS..Query language

• Filter expression
  – Boolean in nature
  – Defines documents
• Ranking expression
  – Associates score with documents

STARTS..Query language

• L-strings
  – Language-country
  – String behavior
• Atomic Terms
  – Fields
  – Modifiers
• Complex filter expression
  – and, or, and-not, prox, etc.
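
A filter expression built from atomic terms and the Boolean operators above can be sketched as a small expression tree. The classes below are hypothetical and do not reproduce the STARTS wire syntax; modifiers, L-strings and the `prox` operator are omitted, and a term is assumed to match by case-insensitive substring. The sample documents are invented.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Term:
    """Atomic term: a value looked up in one named field."""
    field_name: str  # e.g. "title" or "author"
    value: str

    def matches(self, doc):
        return self.value.lower() in doc.get(self.field_name, "").lower()


@dataclass(frozen=True)
class Op:
    """Binary Boolean operator over two sub-expressions."""
    name: str    # "and", "or" or "and-not"
    args: Tuple  # exactly two Term/Op instances

    def matches(self, doc):
        left, right = (arg.matches(doc) for arg in self.args)
        if self.name == "and":
            return left and right
        if self.name == "or":
            return left or right
        if self.name == "and-not":
            return left and not right
        raise ValueError(f"unsupported operator: {self.name}")


docs = [
    {"title": "Metadata for digital libraries", "author": "Rao"},
    {"title": "Digital library architecture", "author": "Nelson"},
    {"title": "Search engines", "author": "Rao"},
]
query = Op("and-not", (Term("title", "digital"), Term("author", "Rao")))
hits = [d["title"] for d in docs if query.matches(d)]
print(hits)  # → ['Digital library architecture']
```

A filter expression only decides which documents qualify; ordering them is the job of a separate ranking expression, which is why the two are kept apart in the query language.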

STARTS..Query language

• Complex ranking expressions

• Global settings

STARTS..Merging ranks

• Unnormalized score of the document for each query

• ID of the sources where document appears

• Statistics
  – Term-frequency, Term-weight, Document-frequency, Document-size, Document-count
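
The slides do not say how a metasearcher should combine the unnormalized scores, so the snippet below is only one simple possibility: rescale each source's raw scores with that source's advertised score range, then keep the best score per document. The function name, the data layout and the sample numbers are assumptions made for illustration; the term statistics listed above could support a more careful re-ranking.

```python
def merge_results(results_by_source, score_ranges):
    """Merge per-source result lists into one ranked list.

    results_by_source: {source_id: [(doc_id, raw_score), ...]}
    score_ranges:      {source_id: (min_score, max_score)}
    Returns [(doc_id, normalized_score, source_id), ...], best first.
    """
    best = {}
    for source_id, results in results_by_source.items():
        low, high = score_ranges[source_id]
        for doc_id, raw in results:
            # Min-max rescaling puts every source on a 0..1 scale.
            normalized = (raw - low) / (high - low)
            if doc_id not in best or normalized > best[doc_id][0]:
                best[doc_id] = (normalized, source_id)
    ranked = sorted(best.items(), key=lambda item: (-item[1][0], item[0]))
    return [(doc_id, score, source_id) for doc_id, (score, source_id) in ranked]


# Invented results: source A scores on 0..1000, source B on 0..1.
results = {
    "A": [("d1", 800.0), ("d2", 200.0)],
    "B": [("d3", 0.9), ("d1", 0.5)],
}
ranges = {"A": (0.0, 1000.0), "B": (0.0, 1.0)}
for row in merge_results(results, ranges):
    print(row)
```

Document `d1` appears in both sources, which is why each result carries the ID of the source it came from: the merged list can report where the best-scoring copy lives.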

STARTS..Source metadata

• Properties of the source
  – Fields supported, score range, linkage, etc.
• Content Summary of the source
  – List of words that appear in the source
  – Statistics of each word listed
  – Total documents in the list, etc.
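
A content summary of this kind lets a metasearcher choose promising sources before sending any query. The sketch below is an assumption-laden illustration rather than the STARTS algorithm: it scores each source by the fraction of its documents that contain each query word, and the function name, dictionary layout and sample summaries are all invented.

```python
def select_sources(query_terms, summaries, k=2):
    """Return up to k source IDs, most promising first.

    summaries: {source_id: {"num_docs": int, "doc_freq": {word: int}}}
    """
    scored = []
    for source_id, summary in summaries.items():
        # Sum, over the query words, of the share of documents containing them.
        score = sum(summary["doc_freq"].get(term, 0) / summary["num_docs"]
                    for term in query_terms)
        if score > 0:
            scored.append((score, source_id))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [source_id for _, source_id in scored[:k]]


# Invented content summaries for three sources.
summaries = {
    "heritage": {"num_docs": 100, "doc_freq": {"manuscript": 60, "sculpture": 30}},
    "physics": {"num_docs": 1000, "doc_freq": {"quantum": 400}},
    "art": {"num_docs": 50, "doc_freq": {"sculpture": 40, "painting": 45}},
}
print(select_sources(["manuscript", "sculpture"], summaries))
```

Sources whose summaries contain none of the query words are dropped outright, so the query is sent only to a small set of collections instead of to every source on the network.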

STARTS..in the end

• General Search Engines
  – Gather all documents on the network
• STARTS
  – Gathers metadata about collections
  – Selects small set of collections
  – Search & retrieve

STARTS..implementation

• Alexandria Digital Library

STARTS..limitation

• Text only

Indian Digital Library..

• Ancient & Diverse culture

• 5000-year-old culture

• Largest Democracy

• Seventh largest country

• High population

• Illiterate

• Important part of World Economy

Indian Digital Library..

• World’s largest middle class

• Poverty

• Highly skilled manpower

• Generates Research Oriented Information

• Global interest

• Major players in IT in the World

• World is looking for ancient Indian Culture

Indian Scene..IT

• Content is lacking

• Control of Indian literature (both bibliographic and full text) is sketchy in almost all fields

• NII

• DL on Indian Heritage

• World Wide accord for Indian Heritage

• Internet Religion is the hot attraction

Indian Scene.. IT

• Western research has been done on the Vedas, Upanishads, Shastras, philosophy, etc., but the soul is missing

• Protection, Preservation, Study, Research, Propagation for posterity

• NLP

• Knowledge Presentation

Indian Scene.. IT

• Speech recognition

• OCR

• Machine translation

• NL interfaces

• Text Processing through Index, Concordance, Thesauri, Dictionaries

Indian Scene.. IT

• National Integration, Guide Humanity, Conflicts, Aberrations, intolerance etc

• Value based system

• Historic priceless manuscripts

Indian Heritage

• Indian Art

• Indian Paintings

• Indian Sculpture

• Religion

Proposed Architecture….

• Background
  – User Group
    • Skilled & Illiterates
    • Oral tradition still exists
    • Multilingual
  – Information Sources
    • Content is lacking
    • Literature control, both bibliographic and full text, is very weak

Proposed Architecture….

• Media– Computer Generated files to Palm leaf manuscripts

• Language

• lack of standards for communication

• Geographical boundaries

• Accessibility

• Reaching rural population

– Publishing

• Restricted to regional and local

Page 93:

Proposed Architecture….

• National initiatives are yet to take off

• Cooperative publishing is lacking

• Unicode/universal protocol is yet to make its impact

– Network Resources

• Communication infrastructure exists but is not stable

• Individuals and organizations, local and regional, are the generators of sources

• Loose networks - manpower & infrastructure

• Lack of communication standards

• Duplicate works

Page 94:

Proposed Architecture….

– Need for Networked Information Sources

• Much priceless knowledge has been lost or is being lost

• Future generations are missing the values of life handed down by their ancestors

• Protection, Preservation, Study, Research, Propagation for posterity

– Looking to the future

• NII

• Better CCC: Computer, Communication, Content

Page 95:

Hybrid Architecture….

• Combination of SODA & STARTS Architecture

– From SODA - Bucket Architecture

– From STARTS - Search and Retrieval protocol

• Metadata - Dublin Core

– For its simplicity and popularity
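
To make the Dublin Core choice concrete, here is a minimal sketch of a record built from the 15 elements of the Dublin Core Metadata Element Set. The helper name `make_record` and the sample values are assumptions for illustration only; the slides do not prescribe a record format.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def make_record(**fields):
    """Build a Dublin Core record (hypothetical helper, not from the slides)."""
    unknown = set(fields) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    # Every element is optional and repeatable, so values are stored as lists.
    return {
        name: list(value) if isinstance(value, (list, tuple)) else [value]
        for name, value in fields.items()
    }


# Illustrative values only: a scanned manuscript described in two languages.
record = make_record(
    title="Palm-leaf manuscript (sample)",
    language=["sa", "kn"],
    format="image/tiff",
)
```

Because every element is optional and repeatable, a multilingual item can simply carry several `language` values, which suits the multilingual setting described earlier.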

Page 96:

Bucket Architecture….

• Buckets are logically grouped

– Language, Region, Content, Media, Images, etc. (any combination, or together as an intelligent bucket)

• Large archives have buckets with many different functionalities

• Bucket may contain resources, packages, elements, metadata, pointers, etc.

Page 97:

Bucket Architecture….

• A bucket may be a unique entity, or many buckets may form an entity

• A bucket may be standalone with the content

• Many buckets may together become a resource

• Each bucket is built with some degree of intelligence and functionality

• Includes an author tool and a management tool

Page 98:

Bucket Architecture….

• Similarly, users’ buckets are also created

• Bucket matching may take place

• Interactions with packages or elements are made only through defined methods on a bucket

• A bucket can initiate actions

• Buckets can exist inside or outside a repository
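
The bucket properties listed above can be sketched as a small class. This is only an illustration under stated assumptions: the class name `Bucket`, its method names, and the equality-based matching rule are hypothetical and are not taken from SODA or from the slides.

```python
class Bucket:
    """Hypothetical bucket: metadata plus packages of elements."""

    def __init__(self, bucket_id, metadata):
        self.bucket_id = bucket_id
        self._metadata = dict(metadata)
        self._packages = {}  # package name -> {element name -> content}

    # Packages and elements are reachable only through these defined methods.
    def add_element(self, package, element, content):
        self._packages.setdefault(package, {})[element] = content

    def list_packages(self):
        return sorted(self._packages)

    def list_elements(self, package):
        return sorted(self._packages[package])

    def get_element(self, package, element):
        return self._packages[package][element]

    def metadata(self):
        return dict(self._metadata)

    def matches(self, user_bucket):
        """Assumed matching rule: every field the user's bucket
        declares must equal the same field in this bucket."""
        wanted = user_bucket.metadata()
        return all(self._metadata.get(k) == v for k, v in wanted.items())


# Illustrative values only.
archive = Bucket("b1", {"language": "Kannada", "media": "manuscript"})
archive.add_element("scans", "page1.tiff", b"image bytes")
user = Bucket("u1", {"language": "Kannada"})
matched = archive.matches(user)
```

Keeping `_packages` private reflects the rule that interactions happen only through defined methods; a fuller design would also let a bucket initiate actions, which this sketch omits.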

Page 99:

STARTS Architecture….

• Search, Retrieval and Browse within Bucket

• Resources, Sources, Elements, Packages, Pointers, etc. based on the Bucket definition

• Search query is made within the source defined in the Bucket

• Query may run within a bucket or across buckets, based on the definition and functionality

Page 100:

STARTS Architecture….

• Ranking is done within the source

• Matching is done with User’s Bucket definition

• Results displayed based on Ranking and user’s requirements

• Although STARTS uses Z39.50 for metadata & transfer protocol, we propose to use Dublin Core for metadata
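
The search flow described on these two slides (query within each bucket's source, rank within the source, match against the user's bucket definition, then display by rank) might look roughly like the sketch below. The function names, the dictionary layout, and the term-count scoring are all assumptions for illustration; this is not the STARTS protocol itself, which has sources export metadata so that ranks from different sources can be merged more carefully.

```python
def search_source(documents, query):
    """Rank documents within one source by how many query terms
    occur in the title (a deliberately simple, assumed scoring)."""
    terms = query.lower().split()
    hits = []
    for doc in documents:
        words = doc["title"].lower().split()
        score = sum(words.count(term) for term in terms)
        if score:
            hits.append((score, doc))
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return hits


def search_buckets(buckets, user_definition, query):
    """Search only buckets that match the user's bucket definition,
    then merge per-source rankings by raw score (naive merge)."""
    results = []
    for bucket in buckets:
        definition = bucket["definition"]
        if any(definition.get(k) != v for k, v in user_definition.items()):
            continue  # bucket does not match the user's bucket
        for score, doc in search_source(bucket["documents"], query):
            results.append((score, bucket["name"], doc["title"]))
    results.sort(key=lambda result: result[0], reverse=True)
    return results


# Illustrative data only.
buckets = [
    {"name": "kannada-manuscripts", "definition": {"language": "kn"},
     "documents": [{"title": "Veda commentary"},
                   {"title": "Temple sculpture notes"}]},
    {"name": "hindi-texts", "definition": {"language": "hi"},
     "documents": [{"title": "Veda recitation"}]},
]
results = search_buckets(buckets, {"language": "kn"}, "temple sculpture veda")
```

Passing an empty user definition searches across all buckets; passing a definition restricts the query to the matching ones.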

Page 101:

New Protocol..

• Need to create a standard for communication

• Information processing and retrieval

• Feeling of a universal information source

• Many sources converge as one resource

• Global information resource

• Universal accessibility by unified protocol

• Global access

Page 102:

New Protocol..

• The framework is just beginning