Access2011 - Van Garderen: Occupy The Memory

Open-Source Big Data for Archives and Libraries: An Artefactual Case Study. Peter Van Garderen, President/Systems Archivist; MJ Suhonos, Systems Librarian/Software Engineer


Post on 11-May-2015


DESCRIPTION

Access2011 conference, #access2011, Peter Van Garderen, Occupy The Memory, #occupy-the-memory, occupythememory.org, Open Source Big Data for Archives and Libraries: an Artefactual Case Study, Artefactual Systems, MJ Suhonos

TRANSCRIPT

Page 1: Access2011 - Van Garderen: Occupy The Memory

Open-Source Big Data for Archives and Libraries:

An Artefactual Case Study

Peter Van Garderen, President/Systems Archivist

MJ Suhonos, Systems Librarian/Software Engineer

Page 2: Access2011 - Van Garderen: Occupy The Memory
Page 3: Access2011 - Van Garderen: Occupy The Memory

Free Beer!

Page 4: Access2011 - Van Garderen: Occupy The Memory

http://archivematica.org

http://ica-atom.org

http://dcb-gcn.canadiana.org

http://qubit-toolkit.org

Page 5: Access2011 - Van Garderen: Occupy The Memory

Peter Van Garderen (MAS), President / Systems Archivist, @pjvangarderen

Evelyn McLellan (MAS), Systems Archivist

David Juhasz, Software Engineer

Austin Trask, Software Engineer

Jesús García Crespo, Software Engineer

Joseph Perry, Software Engineer

Jessica Bushey (MAS), Systems Archivist

MJ Suhonos (MLIS), Systems Librarian / Software Engineer

open-source software for archives and libraries

digital preservation consulting services

http://artefactual.com

Courtney Mumma (MAS/MLIS), Systems Archivist

Page 6: Access2011 - Van Garderen: Occupy The Memory

http://archivesspace.org ?

Page 7: Access2011 - Van Garderen: Occupy The Memory

Artefactual clients and project sponsors

International Council on Archives

UNESCO Memory of the World

UNESCO Archives

United Nations Archives and Records Management Section

The World Bank Group

International Monetary Fund

NATO Archives

International Records Management Trust

Rockefeller Archive Center

Library and Archives Canada

Canadian Council of Archives

Canadiana

National Archives of the Netherlands

Dutch Ministry of the Interior and Kingdom Relations

Dutch Institute for Archival Research and Education (Archiefschool)

British Commonwealth Secretariat

United Kingdom Department for International Development

Direction des Archives de France

United Arab Emirates Center for Documentation and Research

Al-Dhakira Al-Arabiyya

Association of Brazilian Archivists

Botswana National Archives and Records Service

Caribbean Regional Branch of the International Council on Archives

American Institute of Architects

British Columbia Museum and Archives

British Columbia Ministry of Management Services

Provincial Archives of Alberta

Alberta Government Services Ministry

Insurance Corporation of British Columbia

Archives Association of British Columbia

Archives Society of Alberta

Archives Association of Ontario

Association for Manitoba Archives

University of British Columbia Library

Simon Fraser University Archives

Simon Fraser University Library

University of Victoria Archives

University of Toronto iSchool Institute

University of Northern British Columbia Library and Archives

University of Strathclyde Archives

British Columbia Electronic Library Network

University of British Columbia Irving K. Barber Learning Centre

Diocese of New Westminster - Anglican Church of Canada Archives

City of Vancouver Archives

City of Toronto Corporate Information Management Services

City of Rotterdam Archives

City of Edmonton Archives

Squamish Public Library

West Vancouver Museum and Archives

Whistler Museum and Archives

Langley Centennial Museum and National Exhibition Centre

Stirling Council Archives

Page 8: Access2011 - Van Garderen: Occupy The Memory

Archivists & Librarians: Who are we?

Who are we in the face of Google, ebooks, iTunes, Facebook, Flickr, Internet Archive, Ancestry.com, History Channel, Sharepoint, Twitter...

Who are we in the face of our traditional services, our traditional identity, tight budgets?

Page 9: Access2011 - Van Garderen: Occupy The Memory

we're space

Page 10: Access2011 - Van Garderen: Occupy The Memory

http://www.vancouverarchives.ca/2011/06/forming-a-new-archives/

Page 11: Access2011 - Van Garderen: Occupy The Memory
Page 12: Access2011 - Van Garderen: Occupy The Memory

we're Trusted Digital Repositories

we're code

we're portals

Page 13: Access2011 - Van Garderen: Occupy The Memory

we're context

Page 14: Access2011 - Van Garderen: Occupy The Memory

all creation is connected

in various ways

in a marvelous spatial balance.

Out of the formation of new entities

has emerged

information

resulting in communication

and memory

Hugh Taylor, “The Archivist, the Letter, and the Spirit,” Archivaria 43 (Association of Canadian Archivists, 1997), p. 6.

http://journals.sfu.ca/archivar

Page 15: Access2011 - Van Garderen: Occupy The Memory

now future

bitstream

storage media

packaging

storage device

storage driver

file system

error correction

operating system

application software

user interface

input / output devices

metadata

find

relate / bind

authenticate

contextualize

stored

conserved

protected

Accessible? Usable? Authentic?

compression

decryption

file format

character encoding

fonts

codec

Page 16: Access2011 - Van Garderen: Occupy The Memory

now future

Accessible? Usable? Authentic?

In your scope, I am content

communication

memory

wisdom

<metadata isa="love note to the future" />

consciousness

Page 17: Access2011 - Van Garderen: Occupy The Memory
Page 18: Access2011 - Van Garderen: Occupy The Memory
Page 19: Access2011 - Van Garderen: Occupy The Memory

Doctoral Candidate, Archival Science

Page 20: Access2011 - Van Garderen: Occupy The Memory
Page 21: Access2011 - Van Garderen: Occupy The Memory

we're the 99%

● We the people, helped by our archivists & librarians, should be in charge of:
  ● the space
  ● the portals
  ● the Trusted Digital Repositories
  ● the code
  ● the information

Page 22: Access2011 - Van Garderen: Occupy The Memory

we're the 99%

● We the people, helped by our archivists & librarians, should be in charge of:
  ● the space
  ● the portals
  ● the Trusted Digital Repositories
  ● the code
  ● the information

● the public record
● the social network
● personal archives
● big data

Page 23: Access2011 - Van Garderen: Occupy The Memory

#occupy the memory

● We the people, helped by our archivists & librarians, should be in charge of:
  ● the space
  ● the portals
  ● the Trusted Digital Repositories
  ● the code
  ● the information

occupythememory.org

Page 24: Access2011 - Van Garderen: Occupy The Memory

“They’ll never take our freedom!”

© 1995 Paramount Pictures & 20th Century Fox. See fair use rationale: http://en.wikipedia.org/wiki/File:Brave_mel.jpg

Page 25: Access2011 - Van Garderen: Occupy The Memory

Foundation or Steering Committee

Governance

Coordination

Funding

Promotion

Users

Lead institutions: Funding, Development

All users: Bug reports, Enhancement requests, Code patches, Documentation, Promotion

Open Source Software

Code

Knowledge

Community

Service Providers

Development

Technical Support

Hosting

Training

Promotion

Code Time

Money Knowledge

Code Time Money Knowledge

Time Money

Knowledge

The open-source eco-system

Page 26: Access2011 - Van Garderen: Occupy The Memory
Page 27: Access2011 - Van Garderen: Occupy The Memory

hosting

installation

integration

software development

tech support

training

system analysis

strategy

$125/hr

Annual maintenance program

Community Support

We will try to answer fairly straightforward questions from the open source community about installing and configuring our software. When we think a particular query goes beyond these free support parameters (too specific, in-depth, or time-consuming), we will inform the user that it may need to be addressed as paid, commercial support.

Commercial Support

Our software is always free and open source, but with our optional hosting and support services, the Artefactual development team will assist a client with more in-depth questions to get the software installed and operating as required, whether on one of our servers or their own.

Page 28: Access2011 - Van Garderen: Occupy The Memory
Page 29: Access2011 - Van Garderen: Occupy The Memory

Propel ORM

ZSL index

Page 30: Access2011 - Van Garderen: Occupy The Memory

Big Data in Canadian Libraries and Archives: How Big?

● MemoryBC.ca: <100,000 archival descriptions & authority records

● Archeion.ca: <100,000 archival descriptions & authority records

● Canadiana Portal: 1 million items, 4-5 million records

● Toronto Public Library: 3 million MARC records

● Library and Archives Canada: 3.5 million MARC records

● ArchivesCanada.ca: with LAC & BNQ? (<5 million?)

● City of Vancouver: >25TB of digital files from VANOC

Page 31: Access2011 - Van Garderen: Occupy The Memory

The original content in this presentation is Copyright Artefactual Systems Inc. 2011. You may freely re-use this content under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 license.

Attribution

Title: Open-Source Big Data for Archives and Libraries: An Artefactual Systems Case Study

Creator: Peter Van Garderen & MJ Suhonos

Publisher: Artefactual Systems Inc.

Date: October 20, 2011