
Page 1: The Digital Object Architecture A presentation at Louisiana State University Baton Rouge, Louisiana August 26, 2005 Robert E. Kahn Corporation for National

The Digital Object Architecture

A presentation at Louisiana State University, Baton Rouge, Louisiana

August 26, 2005

Robert E. Kahn
Corporation for National Research Initiatives
Reston, Virginia

Page 2

Selected Major Network Issues

• How to get affordable broadband access to homes, businesses, government, etc.
• How to add more dimensionality to the mobile wireless experience
• How to take advantage of many devices/appliances being on the Internet
• Protecting critical elements (including infrastructure elements such as DNS)
• Stifling spam; detecting and fighting viruses

Page 3

Selected Major Issues (cont'd)

• Identity management (w/o certificates)
• Trust in the security mechanisms
• Managing privacy
• How to enable more widespread sharing of important information on the Net
• Trusting your information to the Net
• Managing your information on the Net over very long periods of time

Page 4

Infrastructure Development

• What is so hard about it?
  • Making it scalable over platforms, size and time
  • Achieving critical mass
  • Getting buy-in
  • Pleasing many essential participants
  • Displacing prior capabilities
  • Structuring matters to deal with concerns about empire building
• It’s a lot easier to create brand new capabilities than to affect existing means of operation

Page 5

Infrastructure Creation is a Subtractive Process

• Infrastructure reduces a common, shared capability to its basic and essential attributes
• These attributes are not always recognized or understood up front
• Upon further scrutiny, capabilities are usually deleted from a well-conceived architecture over time
• Consensus develops when no more can be removed without disabling the infrastructure

Page 6

What is the Problem?

• Managing information in the Net over very long periods of time (e.g., centuries or more)
• Dealing with very large amounts of information in the Net over time
• When information, its location(s) and even the underlying systems may change dramatically over time
• Respecting and protecting rights, interests and value

Page 7

A Meta-level Architecture

• Allows for arbitrary types of information systems
• Allows for dynamic formatting and data typing
• Can accommodate interoperability between multiple different information systems
• Allows metadata schema to be identified and typed

Page 8

Digital Object Architecture: Motivation

• To reformulate the Internet architecture around the notion of uniquely identifiable data structures
• Enabling existing and new types of information to be reliably managed and accessed in the Internet environment over long periods of time
• Providing mechanisms to stimulate innovation and the creation of dynamic new forms of expression, and to manifest older forms
• While supporting intellectual property protection and fine-grained access control, and enabling well-formed business practices to emerge

Page 9

Objective of the Framework

[Diagram: Organizing Heterogeneous Systems. Labels: Internet objective; Best-effort Packet Delivery; Seamless Interoperability; Heterogeneous Networks; Heterogeneous Information Systems]

Page 10

Digital Object Architecture: Technical Components

• Digital Objects (DOs)
  • Structured data, independent of the platform on which it was created
  • Consisting of “elements” of the form <type,value>
  • One of which is its unique, persistent identifier
• Resolution of Unique Identifiers
  • Maps an identifier into “state information” about the DO
  • Handle System is a general-purpose resolution system
• Repositories
  • From which DOs may be accessed
  • And into which they may be deposited
• Metadata Registries
  • Repositories that contain general information about DOs
  • Support multiple metadata schemes
  • Can map queries into unique DO specifications (via handles)

Page 11

What is a Digital Object

• Defined data structure, machine independent
• Consisting of a set of elements
  • Each of the form <type,value>
  • One of which is the unique identifier
• Identifiers are known as “Handles”
  • Format is “prefix/suffix”
  • Prefix is unique to a naming authority
  • Suffix can be any string of bits assigned by that authority
• Data structure can be parsed; types can be resolved within the architecture
• Associated properties record and transaction record containing metadata and usage information
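The element structure described on this slide can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the class names, the `HANDLE_TYPE` constant and the `text/plain` type label are invented here and are not taken from the presentation or from any Handle System software.

```python
from dataclasses import dataclass, field

# Hypothetical type name marking the element that carries the identifier.
HANDLE_TYPE = "handle"


@dataclass(frozen=True)
class Element:
    """One <type, value> pair of a digital object."""
    type: str
    value: bytes


@dataclass
class DigitalObject:
    """A platform-independent set of typed elements, one of which is the handle."""
    elements: list[Element] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Enforce "one of which is the unique identifier".
        ids = [e for e in self.elements if e.type == HANDLE_TYPE]
        if len(ids) != 1:
            raise ValueError("a digital object needs exactly one identifier element")

    @property
    def handle(self) -> str:
        """Return the identifier element decoded as text."""
        ident = next(e for e in self.elements if e.type == HANDLE_TYPE)
        return ident.value.decode("utf-8")

    def values(self, type_name: str) -> list[bytes]:
        """Return the values of every element with the given type."""
        return [e.value for e in self.elements if e.type == type_name]


memo = DigitalObject([
    Element(HANDLE_TYPE, b"2304.1/memo123"),
    Element("text/plain", b"Meeting moved to Friday."),
])
print(memo.handle)
```

Values are kept as raw bytes so that the structure itself stays independent of any particular encoding or platform; interpreting a value is left to whatever resolves its type.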

Page 12

Interoperability & Federated Repositories

• Create a cohesive interoperable collection of repository-based systems
  • Initially, perhaps, around a core set of projects, content, applications and/or organizations as in ADL
• Demonstrate interoperability between different repository collections
• Develop procedures to ensure continued accessibility to key archival information

Page 13

Repository Notion

[Diagram: Any Hardware & Software Configuration, behind a Logical External Interface, reached through RAP (Repository Access Protocol)]

Page 14

Digital Object Repository

[Diagram: a Client reaches the Repository through RAP; operations shown: Deposit, Disseminate]

• Provides distributed Digital Object storage.
• May itself be a Digital Object.
• Provides a dynamic acquisition and execution mechanism for the mobile code that implements the content type operations.
• Exclusively accessed using the Repository Access Protocol (RAP).
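The deposit and disseminate operations named on this slide can be sketched as a narrow interface in front of arbitrary storage. This is a minimal in-memory stand-in, not the Repository Access Protocol itself: the class, the method signatures and the one-value-per-type simplification are all assumptions made for illustration.

```python
class Repository:
    """In-memory stand-in for a repository reached only through two operations."""

    def __init__(self) -> None:
        # handle -> {element type -> value}; one value per type is a simplification.
        self._objects: dict[str, dict[str, bytes]] = {}

    def deposit(self, handle: str, elements: dict[str, bytes]) -> None:
        """Store (or replace) the elements of the object named by `handle`."""
        self._objects[handle] = dict(elements)

    def disseminate(self, handle: str, element_type: str) -> bytes:
        """Return one element of a stored object; raises KeyError if absent."""
        return self._objects[handle][element_type]


repo = Repository()
repo.deposit("2304.1/memo123", {"text/plain": b"Meeting moved to Friday."})
print(repo.disseminate("2304.1/memo123", "text/plain"))
```

Because clients see only these two calls, the storage behind them could be swapped for any hardware and software configuration without changing client code.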

Page 15

Nesting of Repository Functionality

[Diagram: nested levels: Core; Structure; Content; Aggregation & De-aggregation]

• Core Interface must be present at each level
• Other levels could be separately defined later

Page 16

Repositories & Digital Objects

[Diagram labels: REPOSITORY; IPv6]

• Each Digital Object has its own unique & persistent ID
• Content Providers want to assign IDs
• Could be upwards of trillions of DOs per Repository
• Objects may be replicated in multiple Repositories

Page 17

Handle System

• Distributed Identifier Service on the Internet
• First General Purpose Resolution system
• Can be used to locate repositories that contain digital objects given their handles, and more!
  • Other indirect references
  • Public keys, authentication information for DOs
• Accommodates interoperability between many different information systems; for example
  • DNS was demonstrated on the Handle System in preparation for Y2K
  • Can support ENUM, RFID, and more

Page 18

Attributes of the Handle System

• The basic architecture of the Handle System is flat, scalable, and extensible
  • Logically central, but physically decentralized
  • Supports Local Handle Services, if desired
• Handle resolutions return entire “Handle Records” or portions thereof
• Handle Records are also
  • digital objects
  • signed by the servers
  • doubly certificated by the system

Page 19

Resolution Mechanism

[Diagram: a Handle is sent to the Handle System <www.handle.net> (Multiple Sites, Multiple Servers), which returns a Handle Record]

• System is non-nodal
• Scalable & Distributed
• Supports global (and local) resolution
• With backup for reliability, mirroring for efficiency

Page 20

Type Resolution

• Types are resolvable in the Handle System
• Types may be created dynamically
• Types may be locally named, mapped into bit strings without semantics
• Primary prefix zero “0” is used for system identifiers
• 0.type/<type> is the system handle for type
• Other handles may cross reference this handle (e.g. for international use)
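The `0.type/<type>` pattern and the cross-referencing idea can be illustrated with a toy lookup table. The records, the `TYPE_REF` and `DESC` labels, and the locally named type `2304/adresse-web` are hypothetical; only the `0.type/` pattern comes from the slide.

```python
TYPE_PREFIX = "0.type"

# Hypothetical handle records: handle -> list of (type, value) pairs.
records = {
    "0.type/URL": [("DESC", "a uniform resource locator")],
    # A locally named type that cross-references the system type handle.
    "2304/adresse-web": [("TYPE_REF", "0.type/URL")],
}


def type_handle(type_name: str) -> str:
    """Build the system handle for a type, following the 0.type/<type> pattern."""
    return f"{TYPE_PREFIX}/{type_name}"


def resolve_type(handle: str, max_hops: int = 5) -> str:
    """Follow TYPE_REF cross-references until a 0.type/ handle is reached."""
    for _ in range(max_hops):
        if handle.startswith(TYPE_PREFIX + "/"):
            return handle
        refs = [value for kind, value in records[handle] if kind == "TYPE_REF"]
        if not refs:
            raise LookupError(f"{handle} does not reference a system type")
        handle = refs[0]
    raise LookupError("too many cross-reference hops")


print(resolve_type("2304/adresse-web"))
```

The hop limit is a defensive choice for the sketch, so that a cycle of cross-references cannot loop forever.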

Page 21

Handle Format

[Diagram: a handle has the form Prefix/Suffix. Prefix: Prefix Authority. Suffix: Item ID (any format). Example: 2304.40/1234]

In use, a Handle is an opaque string.

Other examples of Handles:

• 2304/general info
• 2304/1
• 2304.HQ/staff
• 2304.1/memo123
• 2304.22.Pub/2004
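Although a handle is treated as an opaque string in use, the prefix/suffix form can be split mechanically. The helper below is a sketch; the function name is invented, and splitting at the first slash is an assumption consistent with the examples on the slide.

```python
def parse_handle(handle: str) -> tuple[str, str]:
    """Split a handle into (prefix, suffix) at the first '/'."""
    prefix, sep, suffix = handle.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a prefix/suffix handle: {handle!r}")
    return prefix, suffix


print(parse_handle("2304.40/1234"))  # ('2304.40', '1234')
```

Splitting only at the first slash leaves the suffix free to hold any string the naming authority assigns, including spaces as in `2304/general info`.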

Page 22

Direct Access and Proxies

[Diagram: Direct Access to the Handle System, or Indirect Access through one or more Proxy Servers]

Page 23

Redirection of Handle Requests

[Diagram: the General Registry of all Naming Authorities returns Redirection Information; the client then has Direct Access to one or more Local Handle Services]
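The two-step flow in this diagram (ask the registry of naming authorities where a prefix lives, then query that service directly) might look like the following. All class and method names are invented for illustration; no real Handle System client API is implied.

```python
class LocalHandleService:
    """Holds the handle records for the prefixes it is responsible for."""

    def __init__(self, records: dict[str, dict[str, str]]) -> None:
        self._records = records

    def resolve(self, handle: str) -> dict[str, str]:
        return self._records[handle]


class GlobalRegistry:
    """Maps each naming-authority prefix to the service that handles it."""

    def __init__(self) -> None:
        self._services: dict[str, LocalHandleService] = {}

    def register(self, prefix: str, service: LocalHandleService) -> None:
        self._services[prefix] = service

    def locate(self, handle: str) -> LocalHandleService:
        """Return the redirection target for the handle's prefix."""
        prefix = handle.split("/", 1)[0]
        return self._services[prefix]


def resolve(registry: GlobalRegistry, handle: str) -> dict[str, str]:
    service = registry.locate(handle)  # step 1: redirection information
    return service.resolve(handle)     # step 2: direct access to the local service


registry = GlobalRegistry()
registry.register("2304.1", LocalHandleService(
    {"2304.1/memo123": {"URL": "http://example.org/memo123"}}
))
print(resolve(registry, "2304.1/memo123"))
```

A client that already knows which local service owns a prefix could skip step 1 and call that service directly.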

Page 24

[Diagram labels: Literary; Music; Video; Financial; Grid; ENUM; RFID; “Simple Lookup”; URL; IP addresses; “Unfederated Databases”]

Page 25

Digital Object Overview

[Diagram labels: Digital Object; Content Type(s); Access Requests; Information; Disseminations; Unique Identifier (Handle)]

Page 26

Digital Object Overview

[Diagram: the Digital Object “Hamlet”. Content type: “It’s a Book”. Access request: Get Page(2)]

Page 27

Digital Object Overview

• Digital objects are uniquely identified in a given identifier space.
• Data elements reference sequences of typed data.
• A Digital Object can have zero or more Content Types to reflect intended uses by its creator.
• Content Type Operations are accessible as DOs.

[Diagram: the Digital Object “Hamlet” with four Data Elements and its Content Type Operations]
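One way to picture content type operations is a table that maps each content type to the operations it offers, consulted when an access request arrives. The sketch below is hypothetical throughout: the handle `2304.40/hamlet`, the `Book` type, the `get_page` operation and the placeholder page text are illustrations, and the operations are plain functions rather than digital objects fetched as mobile code.

```python
from typing import Any, Callable

# Hypothetical registry: content type -> operation name -> implementation.
OPERATIONS: dict[str, dict[str, Callable[..., Any]]] = {
    "Book": {
        # Pages are numbered from 1, as in "Get Page(2)".
        "get_page": lambda obj, number: obj["pages"][number - 1],
    },
}

hamlet = {
    "handle": "2304.40/hamlet",   # illustrative identifier
    "content_types": ["Book"],    # "It's a Book"
    "pages": ["text of page 1", "text of page 2"],
}


def invoke(obj: dict, operation: str, *args: Any) -> Any:
    """Run the first matching operation offered by the object's content types."""
    for content_type in obj["content_types"]:
        implementation = OPERATIONS.get(content_type, {}).get(operation)
        if implementation is not None:
            return implementation(obj, *args)
    raise LookupError(f"no content type of {obj['handle']} offers {operation!r}")


print(invoke(hamlet, "get_page", 2))  # text of page 2
```

An object with zero content types simply offers no operations, which matches the "zero or more" wording on the slide.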

Page 28

The Digital Object Identifier (DOI®)

• Used by the International DOI Foundation (IDF) to reference high-quality materials of publishers (and other owners of IP)
• Major commercial user of the Handle System at present, with approximately 12 million handles
• Usage growing at about 4 million per year
• DNS domain names, by comparison, are relatively flat with perhaps 40% churn per year.

Page 29

Setting up a Local Handle Service...

• Download the software from http://www.handle.net
• Follow the instructions in the installation script.
• Send your “site bundle”, containing the IP address of your server and your administrator information, to the Global Handle Registry® (GHR) administrator
• Site is under re-development to accommodate widespread use via automated means
• Experimental Repository software also available on-line

Page 30

Managing Rights & Interests

• Not just about copyright
• Terms and Conditions (T&Cs) for use may be contained within each DO; also information about intrinsic value, such as monetary value
• T&Cs are intended to indicate clearly what one can and/or cannot do with a given DO, where such clarity is intended by the owner of the DO
• Not an enforcement means, although it may be used by an enforcement system
• Mobile programs that are Digital Objects may apply such terms to themselves and to any digital objects they contain

Page 31

Handle-DNS Integration

• Development Environment
  • C/C++, Linux/Windows
• Additional Modules
  • DNS Interface integrated with handle server
  • Cache/Preload Module
  • Database Connection Pools
  • C-Version Handle-DNS Admin Toolkit
• Performance Improvements
  • Exceptional Processing
  • Memory Leak Protection
  • Thread Pool Management

Page 32

Design & Implementation

Simple Handle Server Workflow (C-Version)

[Diagram labels: Client; Handle Requests; Handle Server (Listener; Thread Pool; Message Processor; Storage Management Interface; Database Connection Pool); DB]
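The components named in this workflow (listener, thread pool, message processor, database connection pool) fit a familiar server pattern. The sketch below shows that pattern in Python rather than the C implementation the slide refers to; every name is an assumption, and a dictionary stands in for the database.

```python
import queue
from concurrent.futures import ThreadPoolExecutor
from typing import Optional


class ConnectionPool:
    """Hands out a fixed number of 'connections' to a shared store."""

    def __init__(self, store: dict[str, str], size: int) -> None:
        self._pool: queue.Queue = queue.Queue()
        for _ in range(size):
            self._pool.put(store)  # each "connection" is just a reference to the dict

    def lookup(self, handle: str) -> Optional[str]:
        connection = self._pool.get()      # blocks while all connections are in use
        try:
            return connection.get(handle)
        finally:
            self._pool.put(connection)     # always return the connection


def process_message(pool: ConnectionPool, request: str) -> str:
    """Message processor: turn one handle request into one response."""
    record = pool.lookup(request)
    return f"{request} -> {record}" if record is not None else f"{request} -> NOT FOUND"


def serve(requests: list[str], pool: ConnectionPool, workers: int = 4) -> list[str]:
    """Listener stand-in: hand each request to the thread pool, keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(lambda r: process_message(pool, r), requests))


database = {"2304/1": "http://example.org/one"}
print(serve(["2304/1", "2304/2"], ConnectionPool(database, size=2)))
```

Keeping fewer connections than worker threads is deliberate here: it shows the pool acting as the limit on concurrent database access.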

Page 33

External Protocol Converter

[Diagram labels: DNS Protocol; Converter; Handle Protocol; ports 53, 8000, 2641; Handle Process Module; Handle Server; Latency]

Page 34

Plug & Play Interfaces

Integrate DNS Interface with Handle Server

[Diagram labels: DNS Protocol; DNS Message Processor; Handle Protocol; Handle Message Processor; ports 53, 8000, 2641; Handle Server]
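The plug-in arrangement can be sketched as a table of message processors that all share one handle lookup. The port numbers 53 and 2641 appear on the slide; pairing 53 with DNS and 2641 with the handle protocol is this sketch's reading of the diagram. The `dns-demo/` prefix, the name-to-handle mapping and the record contents are hypothetical.

```python
from typing import Callable, Optional

# Hypothetical handle records shared by every protocol front end.
HANDLE_RECORDS = {"dns-demo/www.example.org": "192.0.2.10"}


def lookup(handle: str) -> Optional[str]:
    """Core handle lookup used by all message processors."""
    return HANDLE_RECORDS.get(handle)


def handle_processor(message: str) -> Optional[str]:
    """Handle-protocol front end: the message is already a handle."""
    return lookup(message)


def dns_processor(message: str) -> Optional[str]:
    """DNS front end: map a queried name onto a handle (assumed mapping)."""
    name = message.rstrip(".").lower()
    return lookup("dns-demo/" + name)


# Listening port -> message processor plugged in for that protocol.
PROCESSORS: dict[int, Callable[[str], Optional[str]]] = {
    53: dns_processor,
    2641: handle_processor,
}


def dispatch(port: int, message: str) -> Optional[str]:
    """Route an incoming message to the processor registered for its port."""
    return PROCESSORS[port](message)


print(dispatch(53, "WWW.Example.org."))
```

Integrating the DNS processor into the server this way avoids the extra hop of an external converter, which is presumably the latency concern shown on the previous slide.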

Page 35

Cache & Storage Management

• Preload (Cache) Module
  • Preload Handle Records from database into RAM
  • Reduce database access times
  • Improve throughput of Handle Server
• Storage Management API
  • User transparent
    • RAM or database
    • Combination of RAM and database
  • Multiple database interfaces
    • MySQL, PostgreSQL, etc.
• Features of Cache Module
  • Efficient query performance
    • STL RBTree, hash table
  • Configurable size of RAM for each Handle Record, or total records

[Diagram: Storage Management API. Labels: Storage Management Interface; RAM; Operations (Create, Modify, Delete); Database; Periodic Update]
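The preload-and-periodic-update idea can be sketched as a write-back cache: records are loaded into RAM once, create/modify/delete touch only RAM, and a flush step later brings the database up to date. This is an assumed reading of the slide, with invented names and a dictionary standing in for the database.

```python
class StorageManager:
    """Serves records from RAM and writes changes back to the database in batches."""

    def __init__(self, database: dict[str, str]) -> None:
        self._db = database
        self._ram = dict(database)      # preload every record into memory
        self._dirty: set[str] = set()   # created or modified since the last flush
        self._deleted: set[str] = set() # deleted since the last flush

    def get(self, handle: str):
        return self._ram.get(handle)    # reads never touch the database

    def create(self, handle: str, record: str) -> None:
        if handle in self._ram:
            raise KeyError(f"{handle} already exists")
        self._put(handle, record)

    def modify(self, handle: str, record: str) -> None:
        if handle not in self._ram:
            raise KeyError(f"{handle} does not exist")
        self._put(handle, record)

    def delete(self, handle: str) -> None:
        del self._ram[handle]
        self._dirty.discard(handle)
        self._deleted.add(handle)

    def _put(self, handle: str, record: str) -> None:
        self._ram[handle] = record
        self._dirty.add(handle)
        self._deleted.discard(handle)

    def flush(self) -> int:
        """Periodic update: apply pending changes to the database."""
        changed = len(self._dirty) + len(self._deleted)
        for handle in self._dirty:
            self._db[handle] = self._ram[handle]
        for handle in self._deleted:
            self._db.pop(handle, None)
        self._dirty.clear()
        self._deleted.clear()
        return changed


db = {"2304/1": "first"}
storage = StorageManager(db)
storage.create("2304/2", "second")
print(storage.flush())  # 1
```

In a running server `flush` would be called on a timer; the trade-off is that changes made since the last flush are lost if the process stops first.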

Page 36

Benchmark

• UDP Interface for DNS Protocol
• Compared to BIND 9.3.0

[Chart: Handle-DNS vs. BIND. X-axis: number of client requests (×10³), 2 to 98. Y-axis: responses per second, 0 to 16000. Series: Handle-DNS, BIND]

Page 37

Business Potential

• Selling infrastructure technology
• Providing identification, management and metadata services
• Enabling third-party value-added capabilities
• Helping organizations manage their own information better & offer new types of services
• Stimulating access to “surface information” and “embedded information” with appropriate access controls and conditions of use

Page 38

Conclusions

• Managing Digital Objects for long-term access is a key challenge
• Initial Technology Components are available; Industry is expected to generate more over time
• Third-party value-added providers in the private sector will ultimately shape the long-term evolution
• Interoperability and reliable information access are critical objectives
• A diversity of applications (with user-friendly interfaces) needs to be developed & deployed
• Application Projects have a central role to play in demonstrating the technology and using it effectively