
Page 1: The Handle System: and its role in a Digital Object Architecture Robert E. Kahn CNRI Workshop on Frontiers in Distributed Information Systems Presidio

The Handle System: and its role in a Digital Object Architecture

Robert E. Kahn
CNRI

Workshop on Frontiers in Distributed Information Systems

Presidio of San Francisco
July 31 – August 1, 2003

Page 2

Objective of the Framework

Internet objective: Best-effort Packet Delivery

[Diagram: "Heterogeneous" Networks and Information Systems on one side, "Seamless Interoperability" of Networks and Information Systems on the other]

Federating Heterogeneous Systems

Page 3

Internet Comparison

• IP addresses identify machines

• Gateways (now routers) help with access

• TCP handles end-end issues
– Remove duplicate packets
– Restructure the arriving fragmented stream
– Perform end-end error detection & retransmission
– Provide flow control

Page 4

Further Scoping the Problem

[Figure: axes "Complexity of Query" and "Time to Resolve Query"]

Initial Focus on Queries with Complexity = Zero

Page 5

Literary Music Video Financial Grid Enum RFID

“Simple Lookup” URL IP addresses “Unfederated Databases”

Page 6

Basic Attributes of the Approach

• Digital Objects (i.e. Data Structures)

• Unique Identifiers for Digital Objects

• Resolution & Administration Mechanism
– Maintains the uniqueness of the binding between Ids and DOs for as long as they persist
– Maps Ids to Useful State Information
– Is distributed and scalable
– Does not involve complete search

Page 7

Digital Object

• Set of elements, each of <Type, Value>

• Parsable across heterogeneous platforms

• One element must be the unique identifier

• Properties Record contains metadata

• Transaction Record records usage

• Most users wish to access its Essence

• Key Metadata is part of the Essence
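
The element-set view of a digital object can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the type name "identifier", the string-valued elements, and the rule of exactly one identifier element are hypothetical choices, not something the slides specify.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical type name for the element that carries the unique identifier.
IDENTIFIER_TYPE = "identifier"


@dataclass(frozen=True)
class Element:
    """One <Type, Value> pair of a digital object."""
    type: str
    value: str


@dataclass
class DigitalObject:
    """A set of typed elements, one of which is the unique identifier."""
    elements: List[Element] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Assumption: exactly one identifier element is required.
        ids = [e for e in self.elements if e.type == IDENTIFIER_TYPE]
        if len(ids) != 1:
            raise ValueError("a digital object needs exactly one identifier element")

    @property
    def identifier(self) -> str:
        return next(e.value for e in self.elements if e.type == IDENTIFIER_TYPE)

    def values(self, type_name: str) -> List[str]:
        """Return the values of every element with the given type."""
        return [e.value for e in self.elements if e.type == type_name]
```

Because every element carries its own type, a reader on any platform can walk the list and pick out the parts it understands without knowing the rest.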

Page 8

Internal Data Structure

Methods / Disseminators

Digital Object

Access to the object is subject to control by the owner. For example, a market in disseminators is possible.

The internal data structure is not directly accessible by the programmer.

Page 9

Purposely Silent about

• What Types

• What Type of Types

• What Values

• What metadata or metadata schema

• What state information in Handle Records

• Policies and Procedures in general

• There are policies for Global, however

Page 10

A Range of Possibilities

• Identifiers are persistent – e.g. DOIs

• Identifiers are transient – e.g. Grid

• Identifiers are resolvable

• Resolution information is not accessible

• Digital Objects are fixed, unchangeable

• Access to Digital Objects is fixed, even if DOs are changeable

Page 11

Repository Notion

Any Hardware & Software Configuration

Logical External Interface

RAP

Page 12

Nesting of Repositories

[Diagram labels: Core; Structure; Content; Aggregation & De-aggregation]

Core Interface must be present at each level.
Other levels could be separately defined later.

Page 13

Federated Repositories

• Key issue is commonality of interests in accessing information from multiple repositories.

• Financial Information is a prime application area.

• Metadata Registries allow for searching based on “user-supplied” inputs. The use of handles (however branded) can simplify access.

• Access via local repositories is an operationally desirable capability.

Page 14

MetaObjects & Metadata Registries

• MetaObjects provide a structural basis for indirection and for organizing information

• Metadata is used to characterize digital objects, to access their identifiers and to assist in cross referencing

• Metadata Registries provide uniform access to metadata.

Page 15

Handle Format

Handle = Prefix / Suffix

• Prefix: Naming Authority

• Suffix: Item ID (any format)

Example: 2304568.40/12345678

In use, a Handle is an opaque string.

Corporation For National Research Initiatives
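
As a rough sketch of the prefix/suffix format, the Python function below splits a handle at its first slash. The function name is hypothetical, and splitting at the first slash (so that the item ID may itself contain slashes) is an assumption made for illustration.

```python
from typing import Tuple


def parse_handle(handle: str) -> Tuple[str, str]:
    """Split a handle into (naming authority prefix, item ID suffix).

    Assumption: the first "/" separates prefix from suffix, so the
    suffix can have any format, including further slashes.
    """
    prefix, sep, suffix = handle.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a prefix/suffix handle: {handle!r}")
    return prefix, suffix
```

Clients would normally treat the whole string as opaque; a split like this is only needed to find out which naming authority is responsible for the handle.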

Page 16

Handles Resolve to Typed Data

Handle: 2304568.40/12345678

Handle Record:

  Data type    Handle data
  URL          http://www.loc.gov/.....
  RAP          loc/repository
  URL          http://www.loc2.gov/..
  XYZ          1001110011110    (Extensible Data Types)

Just one example - also looks like a digital object

Handles can also have semantics but we frown on it! Resolution is independent of semantics in every instance.
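
A handle record of typed data can be pictured as a list of (type, data) pairs that a client filters by type. The Python sketch below copies the example values from the slide verbatim, including the elided URLs; the class and function names are hypothetical.

```python
from typing import List, NamedTuple


class HandleValue(NamedTuple):
    """One typed entry of a handle record."""
    type: str
    data: str


# Example record for 2304568.40/12345678, values as shown on the slide.
RECORD: List[HandleValue] = [
    HandleValue("URL", "http://www.loc.gov/....."),
    HandleValue("RAP", "loc/repository"),
    HandleValue("URL", "http://www.loc2.gov/.."),
    HandleValue("XYZ", "1001110011110"),  # an extensible, user-defined type
]


def values_of_type(record: List[HandleValue], type_name: str) -> List[str]:
    """Return the data of every entry whose type matches type_name."""
    return [v.data for v in record if v.type == type_name]
```

A client that only understands URLs would call `values_of_type(RECORD, "URL")` and ignore the other entries, which is what lets new types be added without breaking existing clients.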

Page 17

Allocation of Prefixes

1 - System Uses
2 - High Fan in/out Organizations
3 - “
4 - Businesses and formal organizations
5 - “
6 - Individuals and anything that can't fit above
7 - “
8 - “

Page 18

Creating & Resolving Type Information Dynamically

• Prefixes of the form 0.X are reserved for defining resolvable “system information” such as types and naming authorities

• 0.type/<type> is a handle for the type in brackets

• 0.na/<na> is a handle for a particular na

• Non-system types can also be created by individual users
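
The 0.type and 0.na conventions amount to building ordinary handles under reserved prefixes. A minimal Python sketch, with hypothetical helper names:

```python
def type_handle(type_name: str) -> str:
    """Handle under the reserved 0.type prefix that identifies a type."""
    return f"0.type/{type_name}"


def na_handle(naming_authority: str) -> str:
    """Handle under the reserved 0.na prefix that identifies a naming authority."""
    return f"0.na/{naming_authority}"
```

For example, `type_handle("URL")` gives `0.type/URL`, and `na_handle("1895.22")` gives `0.na/1895.22`; both can then be resolved like any other handle to obtain the corresponding system information.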

Page 19

Global Handle Resolution

HS1 HS2 HPS3 HS4

HANDLE ADMINISTRATION

HANDLE RESOLUTION

Handle Servers

(Handles are uniformly spread by hashing)

Multiple Handle Servers
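
Spreading handles over several servers by hashing can be sketched as follows. The choice of MD5, the modulo step, and the function name are assumptions made for illustration; the slide only states that handles are spread uniformly by hashing. The server labels are copied from the diagram.

```python
import hashlib
from typing import List


def server_for(handle: str, servers: List[str]) -> str:
    """Pick the server responsible for a handle by hashing the handle.

    Assumption: MD5 of the UTF-8 handle, reduced modulo the server count.
    """
    digest = hashlib.md5(handle.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[bucket]


SERVERS = ["HS1", "HS2", "HPS3", "HS4"]  # labels as they appear on the slide
```

Because the mapping depends only on the handle and the server list, any client can compute which server to ask without consulting a directory, and load spreads roughly evenly as the number of handles grows.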

Page 20

Global & Local Handle Resolution

HS1 HS2 HPS3 HS4

HANDLE ADMINISTRATION

HANDLE RESOLUTION

Handle ServersGlobal

Local

HANDLE RESOLUTION

Page 21

How do handles resolve...

Two steps to resolve a handle:

• Client queries GHR: “Which Handle Service has 1895.22/1011?”

• GHR responds with a “map” showing the client which servers within the responsible LHS it can query for that handle.

Handle Client

GHR

LHS A

LHS C

LHS ..n

LHS B

LHS D

Handle System

1. Where is 1895.22/1011?

Map of LHS B

2. Give me all data for 1895.22/1011

Handle Data
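
The two-step exchange in the diagram can be simulated with in-memory objects. This Python sketch is not the handle protocol itself: the class names are hypothetical, the GHR "map" is reduced to a single service object looked up by prefix, and the record contents are made-up example data.

```python
from typing import Dict, List, Tuple

Record = List[Tuple[str, str]]  # (data type, handle data) pairs


class LocalHandleService:
    """Holds the handle records for the prefixes it is responsible for."""

    def __init__(self, name: str, records: Dict[str, Record]) -> None:
        self.name = name
        self._records = records

    def resolve(self, handle: str) -> Record:
        return self._records[handle]


class GlobalHandleRegistry:
    """Knows which local handle service is responsible for each prefix."""

    def __init__(self) -> None:
        self._services: Dict[str, LocalHandleService] = {}

    def register(self, prefix: str, service: LocalHandleService) -> None:
        self._services[prefix] = service

    def locate(self, handle: str) -> LocalHandleService:
        prefix = handle.split("/", 1)[0]
        return self._services[prefix]


def resolve(ghr: GlobalHandleRegistry, handle: str) -> Record:
    lhs = ghr.locate(handle)    # step 1: "Where is 1895.22/1011?"
    return lhs.resolve(handle)  # step 2: "Give me all data for 1895.22/1011"


# Hypothetical example data for illustration only.
lhs_b = LocalHandleService("LHS B", {"1895.22/1011": [("URL", "http://example.org/1011")]})
ghr = GlobalHandleRegistry()
ghr.register("1895.22", lhs_b)
```

Calling `resolve(ghr, "1895.22/1011")` first asks the registry which service holds the prefix and then fetches the record from that service, so the registry never needs to store the handle data itself.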

Page 22

Administration of Handle Records

univ/thesis.txt 1217/4913527 univ/4913527 1217/thesis.txt

univ.csl.17.2

(the handles shown above identify digital objects)

univ 1217

univ.csl.17

univ.csl 1217.34

1217.34.1

Page 23

The Global Handle Registry

• The GHR is a unique handle service used to store the identity and location of all local handle services (LHS), and tells a handle client which service to query to resolve a handle.

• All handle clients (for resolution or administration) know how to contact and query the GHR.

[Diagram: the Global Handle Registry linked to local handle services]

• DOI Handle Service
• LOC Handle Service
• CMU Handle Service
• DTIC Handle Service
• Korean Ctrl Lib Handle Service
• Nat’l Lib Australia Handle Service
• Twin Bays Handle Service
• Liqid Krystal Handle Service
• MIT Handle Service

Page 24

Groups of Handle Servers

[Diagram labels: P, S, S, S, S; Group A, Group B, Group C, Group D]

Page 25

Handle Clients: Administration

Use the Java™ Handle Client Tool provided in the distribution for creating or updating handles one-at-a-time or via a batch.

or

Develop your own administration client.

Page 26

Handle Clients: Resolution

Download a web browser plug-in which enables browsers to recognize the handle protocol.

or

Append a handle to the proxy server URL (e.g. http://hdl.handle.net/<handle>), which understands both HTTP and HDL protocols.

or

Develop your own resolution client.
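
The proxy option needs no handle-aware software on the client: the handle is appended to the proxy address and the result is an ordinary HTTP URL. A minimal Python sketch, where the function name is hypothetical and percent-encoding of unusual characters is an added precaution rather than something the slide describes:

```python
from urllib.parse import quote

PROXY = "http://hdl.handle.net/"  # proxy address given on the slide


def proxy_url(handle: str) -> str:
    """Build the proxy URL for a handle, keeping "/" unescaped."""
    return PROXY + quote(handle, safe="/")
```

For the handle used earlier, `proxy_url("1895.22/1011")` yields `http://hdl.handle.net/1895.22/1011`, which any web browser can follow.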

Page 27

Setting up a Local Handle Service...

• Download the software from http://www.handle.net.

• Follow the instructions in the installation script.

• Send your “site bundle”, containing the IP address of your server and your administrator information, to the Global Handle Registry (GHR) administrator.

Page 28

Organization of the International DOI Foundation

IDF

IDF is a non-profit organization with offices in Washington, DC (AAP) and Geneva, Switzerland (IPA).

Members are mostly Book & Journal Publishers

Membership Dues

- Policies & Procedures
- Licensing the DOI TM
- Qualifying RAs
- Marketing the DOI brand

4¢ per DOI on deposit – 1X; min $20K/yr
1¢ per DOI in CDD on 12/31 – annual
½¢ per DOI in CDD after $50K per RA

CDD

Page 29

Business Potential

• Enabling new forms of Creativity
– New forms of expression
– Representing value as Digital Objects

• Selling infrastructure technology & services

• Enabling Third Party value-added capabilities

• Helping organizations manage their own information better & offer new types of services

• Stimulating access to “surface information” and “embedded information” with appropriate access controls and conditions of use

Page 30

Evolution of Policy for Global

• Original Policy
– Best efforts service; run in-house
– Cost paid by the Government
– Available to the research community for free

• Current Policy (still in flux)
– Best efforts service; run 7x24 with backup
– Free to the research community; commercial users pay after a period of experimentation
– Handle System Advisory Committee oversees costs and evolution.

Page 31

Cost of Global Services

• IPv4 several million addresses; about 50M TLDs (excluding CCs)

• At say $20 per year per TLD, the cost of global registration and resolution services is about $1B per year – this is inefficient, very profitable or both

• The handle system is almost as large as DNS (there are over 10M DOIs alone) and costs about $250K per year at present.

• The DNS can be run within the handle system, if desired; but the handle system can support IPv4 and IPv6 without DNS
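
The cost comparison rests on simple arithmetic, reproduced here in Python using only the figures quoted on the slide. The per-DOI figure is a derived illustration, and since the slide says "over 10M DOIs" it is an upper bound on the cost per identifier.

```python
# Figures quoted on the slide.
registered_names = 50_000_000      # "about 50M"
fee_per_name_per_year = 20         # dollars, "say $20 per year"
handle_system_cost = 250_000       # dollars per year
doi_count = 10_000_000             # "over 10M DOIs alone"

# Global registration and resolution cost implied by the slide.
dns_cost_per_year = registered_names * fee_per_name_per_year

# Handle system cost spread over the DOIs alone (an upper bound per identifier).
cost_per_doi = handle_system_cost / doi_count

print(f"DNS: ${dns_cost_per_year:,} per year")
print(f"Handle System: at most ${cost_per_doi:.3f} per DOI per year")
```

The first figure is the $1B per year quoted on the slide; the second works out to a few cents per identifier per year, against $20 per name.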

Page 32

Applications of the Technology

• Identity Management (DHS)
• PKI Infrastructure
• Personal Locator Information
• Efficient Communications
• Steganography
• Managing Digital Cash
• Managing Business Transactions (e.g. email)
• Learning of more up to date Publications
• Cataloguing and Indexing