One Site Among Many: Stanford and Collaborative Technical Development for Web Archiving Nicholas Taylor Web Archiving Service Manager Stanford University Libraries PASIG 2016 March 11, 2016
Page 1

One Site Among Many: Stanford and Collaborative Technical Development for Web Archiving

Nicholas Taylor

Web Archiving Service Manager

Stanford University Libraries

PASIG 2016

March 11, 2016

Page 2

overview

• web archiving opportunity gaps

• situation of SUL web archiving

• APIs + community (technical) development

“LAX on take off” by Doug under CC BY-NC-ND 2.0

Page 4

web content > preserved web content

“The Seeker” by C MB 166 under CC BY-ND 2.0

Page 5

link rot + content drift

Andrew Jackson: “Ten years of the UK Web Archive”

Page 6

a centralized enterprise

[Bar chart: External, Local, Both; 2011 vs. 2013; data labels 60%, 25%, 14%, 63%, 20%, 16%]

NDSA: “Web Archiving in the U.S.: A 2013 Survey”

Page 7

a centralized enterprise

[Bar chart by year, 1996 to 2013; series: Number of organizations, Archive-It Partner as of 2013]

NDSA: “Web Archiving in the U.S.: A 2013 Survey”

Page 8

minimal local preservation

[Bar chart: Transferred, Haven't transferred; 2011 vs. 2013; data labels 19%, 81%, 20%, 80%]

NDSA: “Web Archiving in the U.S.: A 2013 Survey”

Page 10

opportunities for research

“Exploring the Canadian Political Interest Group and Political Parties Web Sphere” by Ian Milligan under Standard YouTube License

Page 12

Stanford Web Archive Portal

Stanford University Libraries: “Stanford Web Archive Portal”

Page 13

SearchWorks (online catalog)

Stanford University Libraries: “SearchWorks”

Page 14

web archaeology (SLAC)

oldweb.today: “WorldWideWeb SLAC Home Page”

Page 15

building + integrating infrastructure

discovery

preservation

access

capture

SDR

Page 16

APIS + COMMUNITY DEVELOPMENT

“P1050827” by Rebecca Siegel under CC BY 2.0

Page 17

web archiving lifecycle

Internet Archive: “The Web Archiving Life Cycle Model”

Page 18

functional overlap

Lifecycle functions: Appraisal and Selection; Scoping; Data Capture; Storage and Organization; QA and Analysis; Metadata / Description; Access / Use / Reuse; Preservation; Risk Management

Tools: ACT; Archive-It; AtN; BCWeb; CDL WAS; DigiBoard; Islandora WARC Solution Pack; Netarchive Suite; PageFreezer; UNT Nomination Tool; WCT

Page 19

smaller, modular components

“Giant Rubik's Cube” by Francois Lamotte under CC BY 2.0

Page 21

API candidates

• capture tool/proxy interconnect

• capture tool management

• data import/export

• query + extraction

• integrity audit + repair

• descriptive metadata

• logs + analytics

• renderings/derivative formats

• federated data delivery

• federated replay

• federated full-text search