Post on 21-Dec-2015
PAWN: Producer-Archive Workflow Network
University of Maryland Institute for Advanced Computer Studies
Joseph JaJa, Mike Smorul, Mike McGann
What is PAWN?
· Software that provides an ingestion framework
· Distributed and secure ingestion of digital objects into an archive
· Handles the process from package assembly to archival storage
· Simple interface for end-users
· Flexible interface for archive managers
· Designed for use in multiple contexts
Distributed Ingestion
· Multiple producing sites with different requirements
· Separation of administrative responsibility
· Customizable roles for various parties
· Scalable infrastructure
[Diagram: four Producers submitting to a Distributed Archive]
Overall Organization
· Producers are organized into domains; each domain contains a transfer agreement negotiated with the archive.
· Each domain contains a hierarchical organization of data grouped into record sets/templates (convenient groupings from the transfer agreement).
· An end-user operates within a domain, with record sets associated with the account.
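The organization above can be sketched as a small data model. This is only an illustration: the class names, fields, and the path convention are assumptions made for the example and are not taken from PAWN itself.

```python
from dataclasses import dataclass, field


@dataclass
class RecordSet:
    """A grouping of data taken from the transfer agreement; may nest."""
    name: str
    children: list["RecordSet"] = field(default_factory=list)

    def paths(self, prefix: str = "") -> list[str]:
        """Return the path of this record set followed by all descendants."""
        here = f"{prefix}/{self.name}"
        found = [here]
        for child in self.children:
            found.extend(child.paths(here))
        return found


@dataclass
class Domain:
    """A group of producers covered by one transfer agreement."""
    name: str
    agreement: str  # hypothetical: a title or identifier for the agreement
    record_sets: list[RecordSet] = field(default_factory=list)


@dataclass
class Account:
    """An end-user account bound to one domain and some of its record sets."""
    user: str
    domain: Domain
    record_set_paths: set[str] = field(default_factory=set)

    def can_use(self, path: str) -> bool:
        """True if the record set at `path` is associated with this account."""
        return path in self.record_set_paths


# Example data (all names are made up for illustration).
publication = RecordSet(
    "Publication",
    [RecordSet("Technical Reports"), RecordSet("Presentations")],
)
domain = Domain("example-domain", "Example transfer agreement", [publication])
account = Account("student1", domain, {"/Publication/Presentations"})
```

Here an account only sees the record sets explicitly associated with it, which mirrors the statement that an end-user operates within a domain with record sets tied to the account.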
Package Workflow Overview
1. Create Producer-Archive Agreement.
2. Create package template.
3. Create package based on template.
4. Once approved, packages can be archived.
5. Rejected packages can be held until rectified or deleted for resubmission.
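The package part of this workflow (steps 3 to 5) can be read as a small state machine. The sketch below is one possible reading: the state names and the transition table are assumptions chosen to match the steps above, not PAWN's actual implementation.

```python
from enum import Enum


class State(Enum):
    CREATED = "created"      # built from a template, awaiting review
    APPROVED = "approved"    # reviewed and cleared for archiving
    REJECTED = "rejected"    # held until rectified or deleted
    ARCHIVED = "archived"    # pushed to the archive
    DELETED = "deleted"      # removed so the producer can resubmit


# Allowed transitions, following the workflow steps above.
ALLOWED = {
    State.CREATED: {State.APPROVED, State.REJECTED},
    State.APPROVED: {State.ARCHIVED},
    State.REJECTED: {State.CREATED, State.DELETED},  # rectified, or deleted
    State.ARCHIVED: set(),
    State.DELETED: set(),
}


class Package:
    def __init__(self, name, template):
        self.name = name
        self.template = template
        self.state = State.CREATED
        self.history = [State.CREATED]

    def move_to(self, new_state):
        """Apply a transition, refusing any that the table does not allow."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"cannot go from {self.state.value} to {new_state.value}"
            )
        self.state = new_state
        self.history.append(new_state)


# A package that is rejected once, rectified, approved, then archived.
pkg = Package("results-2005", template="Research Results")
pkg.move_to(State.REJECTED)
pkg.move_to(State.CREATED)
pkg.move_to(State.APPROVED)
pkg.move_to(State.ARCHIVED)
```

Keeping the transitions in a single table makes it easy to see that a package can never be archived without passing through approval.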
Package Builder Review
Producer Agreement
· Administrative
    Strategic and Performance Plans
    Appointment and Promotion
    Policies and Committees
    Alumni Affairs
· Financial
    Contracts and Grants
    Payroll
    Donations
· Publication Reports
    Technical Reports
    Presentations
    Posters
    Outreach
Template
Template Name: Research Results
Notes: Published results and conference presentations
Contents:
· Presentations
· Technical Reports
Create Template, Create Package, Audit Package
Activity Log
Package Lifecycle
[Diagram: package lifecycle; surviving labels are Archive Gateway and Archive]
Customizable Components
· Definable Roles
    Roles are groupings of actions in PAWN applied to accounts
· Pluggable Interface
    APIs for creating gateways between PAWN and archive (SRB, Fedora, etc.)
    APIs for creating custom package builders
    APIs for handling metadata formats
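To make the two customization points concrete, here is a minimal sketch of a role table and a pluggable gateway contract. Everything in it is hypothetical: `ArchiveGateway`, `InMemoryGateway`, the role names, and the action names are invented for illustration and do not reflect PAWN's real API.

```python
from abc import ABC, abstractmethod


class ArchiveGateway(ABC):
    """Hypothetical contract between PAWN and one archive back end."""

    @abstractmethod
    def store(self, package_id: str, files: dict[str, bytes]) -> str:
        """Push a package to the archive and return an archive-side id."""


class InMemoryGateway(ArchiveGateway):
    """Stand-in back end; a real gateway would talk to SRB, Fedora, etc."""

    def __init__(self):
        self.stored = {}

    def store(self, package_id, files):
        self.stored[package_id] = dict(files)
        return f"mem:{package_id}"


# A role is a named grouping of actions; an account holds one or more roles.
ROLES = {
    "producer": frozenset({"create_package", "submit_package"}),
    "reviewer": frozenset({"approve_package", "reject_package"}),
}


def allowed(account_roles, action):
    """True if any of the account's roles includes the requested action."""
    return any(action in ROLES[role] for role in account_roles)


gateway = InMemoryGateway()
archive_id = gateway.store("pkg-1", {"report.pdf": b"%PDF"})
```

Because callers depend only on the abstract `store` method, a different archive can be supported by adding another subclass without touching the rest of the ingestion code.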
Package Builders
· Default Builder
    Create files and folders
    Attach descriptive metadata to files or folders
· ICDL Builder
    Create ‘books’ with Dublin Core metadata
    Uses ICDL database as source for book list and metadata
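The default builder's two operations, creating files and folders and attaching descriptive metadata to either, can be sketched with an in-memory tree. The `Node` and `PackageBuilder` names are assumptions for this example; the metadata keys used (`title`, `creator`, `description`) are Dublin Core element names, chosen here only because the ICDL builder mentions Dublin Core.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One file or folder in a package, with its descriptive metadata."""
    name: str
    is_folder: bool
    metadata: dict[str, str] = field(default_factory=dict)
    children: dict[str, "Node"] = field(default_factory=dict)


class PackageBuilder:
    def __init__(self):
        self.root = Node("", True)

    def add(self, path, is_folder=False):
        """Create the file or folder at `path`, adding parent folders as needed."""
        parts = path.strip("/").split("/")
        node = self.root
        for i, part in enumerate(parts):
            last = i == len(parts) - 1
            if part not in node.children:
                # Intermediate parts are always folders.
                node.children[part] = Node(part, is_folder or not last)
            node = node.children[part]
        return node

    def find(self, path):
        """Return the node at `path`; raises KeyError if it does not exist."""
        node = self.root
        for part in path.strip("/").split("/"):
            node = node.children[part]
        return node

    def describe(self, path, **metadata):
        """Attach descriptive metadata to an existing file or folder."""
        self.find(path).metadata.update(metadata)


builder = PackageBuilder()
builder.add("reports/tr-001.pdf")
builder.describe("reports/tr-001.pdf", title="Example report", creator="Example Author")
builder.describe("reports", description="Technical reports")
```

A custom builder such as the ICDL one could reuse the same tree while filling in the metadata from its own database instead of from user input.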
PAWN Architecture
· Management service
    Track all administrative and high-level metadata (domains, accounts, agreements, package description, etc.)
    Multiple instances
· Receiving servers
    Storage of submitted packages
    Push to archive
· Scheduler
    Allocate resources on receiving servers to clients
    Control receiving servers
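As an illustration of the scheduler's job of allocating receiving-server resources to clients, the sketch below reserves space on whichever server has the most free capacity. The allocation policy, the class names, and the byte-based capacity model are all assumptions; the slides do not say how PAWN's scheduler actually chooses a server.

```python
from dataclasses import dataclass


@dataclass
class ReceivingServer:
    """A server that stores submitted packages before they go to the archive."""
    name: str
    capacity_bytes: int
    reserved_bytes: int = 0

    @property
    def free_bytes(self):
        return self.capacity_bytes - self.reserved_bytes


class Scheduler:
    def __init__(self, servers):
        self.servers = list(servers)

    def allocate(self, size_bytes):
        """Reserve space for one client transfer.

        Returns the chosen server, or None if no server can hold the package.
        """
        candidates = [s for s in self.servers if s.free_bytes >= size_bytes]
        if not candidates:
            return None
        chosen = max(candidates, key=lambda s: s.free_bytes)
        chosen.reserved_bytes += size_bytes
        return chosen


scheduler = Scheduler([ReceivingServer("a", 100), ReceivingServer("b", 50)])
first = scheduler.allocate(60)   # fits only on "a"
second = scheduler.allocate(45)  # "a" has 40 free, so this goes to "b"
third = scheduler.allocate(50)   # no server has 50 free any more
```

A client would send its schedule request, receive the chosen server, and then transfer the package there in bulk.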
PAWN Architecture
[Diagram: a producer-managed side (Producer data suppliers) and an archive-managed side (Management Server, Scheduler, Receiving Server, Validation Services, Distributed Archive). Labeled flows: Schedule Request, Authentication, Package Information, Bulk Transfer, Ingestion Status.]
Case Study: 15,000 CD-ROMs
· 15,000 CD-ROMs containing Landsat data.
· CDs are in the control of the library; processing and data storage are across campus.
· Moving the CD collection is not feasible.
· Need for untrained (student) labor to ingest without supervision.
· Final copy needs to be accessible by several parties.
Case Study: 15,000 CD-ROMs
· Custom PAWN interface.
· Two workstations, 4 CD drives apiece.
· Generate thumbnails and barcode the CD-ROMs.
· Use SRB as the final archive, with the pre-existing PAWN-SRB driver.