Using Wayback Machine for Research - Library of Congress Blogs
![Page 1: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/1.jpg)
Nicholas Taylor
Repository Development Group
Using Wayback Machine for Research
![Page 2: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/2.jpg)
What Is the WAYBACK MACHINE?
![Page 3: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/3.jpg)
WABAC Machine?
![Page 4: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/4.jpg)
Internet Archive’s Wayback Machine
![Page 5: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/5.jpg)
not one, but many Wayback Machines
open source software to “replay” web archives:
rewrites links to point to archived resources
allows for temporal navigation within archive
used by many web archiving institutions: 33 out of 62 initiatives listed on Wikipedia
![Page 6: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/6.jpg)
Government of Canada Web Archive
![Page 7: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/7.jpg)
Government of Canada Web Archive
![Page 8: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/8.jpg)
Portuguese Web Archive
![Page 9: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/9.jpg)
Web Archive Singapore
![Page 10: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/10.jpg)
Web Archive Singapore
![Page 11: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/11.jpg)
Catalonian Web Archive
![Page 12: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/12.jpg)
Catalonian Web Archive
![Page 13: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/13.jpg)
California Digital Library Web Archiving Service
![Page 14: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/14.jpg)
Harvard University Web Archive Collection Service
![Page 15: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/15.jpg)
Common LIMITATIONS AND WORKAROUNDS
![Page 16: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/16.jpg)
limitation: banner displaces page elements
![Page 17: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/17.jpg)
workaround: hide the banner
![Page 18: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/18.jpg)
limitation: AJAX-enabled sites
![Page 19: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/19.jpg)
limitation: AJAX-enabled sites
![Page 20: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/20.jpg)
workaround: disable JavaScript
![Page 21: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/21.jpg)
limitation: nav menu link errors
![Page 22: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/22.jpg)
workaround: insert live site URL in archive
![Page 23: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/23.jpg)
workaround: insert live site URL in archive
![Page 24: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/24.jpg)
workaround: insert live site URL in archive
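A minimal sketch of this workaround, assuming the archive URL contains a 14-digit timestamp slot as in the Library of Congress example elsewhere in this deck. The function name and the live URL in the usage note are hypothetical; Wayback deployments typically redirect to the capture nearest the requested timestamp, but that behavior is not confirmed by the slides.

```python
import re

# A 14-digit YYYYMMDDHHMMSS timestamp between slashes marks where the
# archived resource's original URL begins.
_TIMESTAMP_SLOT = re.compile(r"/\d{14}/")


def swap_in_live_url(archive_url: str, live_url: str) -> str:
    """Keep the archive prefix and timestamp; replace the archived resource URL.

    Hypothetical helper: it only rewrites the string and does not check
    that the archive actually holds a capture of live_url.
    """
    match = _TIMESTAMP_SLOT.search(archive_url)
    if match is None:
        raise ValueError(f"no 14-digit timestamp found in: {archive_url}")
    return archive_url[: match.end()] + live_url
```

For example, when a navigation menu link fails on the archived home page, copy the target page's address from the live site (say, a hypothetical `http://www.loc.gov/rr/`) and pass it with the archived home page URL to `swap_in_live_url`.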
![Page 25: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/25.jpg)
limitation: no full-text search
![Page 26: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/26.jpg)
workaround: none yet, but R&D ongoing
![Page 27: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/27.jpg)
Basic MECHANICS
![Page 28: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/28.jpg)
structure of a Wayback Machine URL
http://webarchiveqr.loc.gov/loc_sites/20120131201510/http://www.loc.gov/index.html
Wayback Machine URL: http://webarchiveqr.loc.gov/
collection: loc_sites
date/timestamp (YYYYMMDDHHMMSS): 20120131201510
URL of archived resource: http://www.loc.gov/index.html
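The URL structure above can be taken apart programmatically. The following is a sketch only, assuming every access URL follows the four-part layout shown on this slide (base, collection, 14-digit timestamp, archived resource URL); the class and function names are illustrative and not part of any Wayback software.

```python
import re
from datetime import datetime
from typing import NamedTuple


class WaybackURL(NamedTuple):
    base: str            # address of the Wayback Machine itself
    collection: str      # collection name
    timestamp: datetime  # capture date and time
    original: str        # URL of the archived resource


# Assumed layout: <base>/<collection>/<YYYYMMDDHHMMSS>/<original URL>
_PATTERN = re.compile(
    r"^(?P<base>https?://[^/]+/)"
    r"(?P<collection>[^/]+)/"
    r"(?P<timestamp>\d{14})/"
    r"(?P<original>.+)$"
)


def parse_wayback_url(url: str) -> WaybackURL:
    """Split a Wayback Machine access URL into its four components."""
    match = _PATTERN.match(url)
    if match is None:
        raise ValueError(f"not a recognized Wayback Machine URL: {url}")
    return WaybackURL(
        base=match["base"],
        collection=match["collection"],
        timestamp=datetime.strptime(match["timestamp"], "%Y%m%d%H%M%S"),
        original=match["original"],
    )


parts = parse_wayback_url(
    "http://webarchiveqr.loc.gov/loc_sites/20120131201510/http://www.loc.gov/index.html"
)
```

Other deployments may omit the collection segment or use a different path prefix, so the pattern would need adjusting for them.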
![Page 29: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/29.jpg)
URL-based access
![Page 30: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/30.jpg)
URL-based access
![Page 31: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/31.jpg)
date wildcarding
![Page 32: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/32.jpg)
date wildcarding
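As a rough illustration of date wildcarding: in Wayback deployments that support it, a partial timestamp followed by `*` in the date slot asks for a listing of captures whose timestamps start with those digits. The sketch below only builds such a URL; the function name is hypothetical, and whether a given archive honors the wildcard is an assumption to verify against that archive.

```python
def date_wildcard_url(base: str, collection: str, original: str,
                      date_prefix: str = "") -> str:
    """Build a URL requesting all captures whose timestamp starts with date_prefix.

    date_prefix is zero to 14 digits of YYYYMMDDHHMMSS, e.g. "2012" for a
    year or "201201" for a month. An empty prefix yields a bare "*",
    meaning captures from any date.
    """
    if date_prefix and not date_prefix.isdigit():
        raise ValueError("date_prefix must contain digits only")
    if len(date_prefix) > 14:
        raise ValueError("date_prefix cannot exceed 14 digits")
    return f"{base.rstrip('/')}/{collection}/{date_prefix}*/{original}"
```

Calling it with the base `http://webarchiveqr.loc.gov/`, the collection `loc_sites`, the resource `http://www.loc.gov/index.html`, and the prefix `2012` gives a URL for the captures of the Library of Congress home page made during 2012.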
![Page 33: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/33.jpg)
document wildcarding
![Page 34: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/34.jpg)
document wildcarding
![Page 35: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/35.jpg)
document wildcarding
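Document wildcarding works on the other end of the URL: a trailing `*` after a URL prefix asks for every archived document whose address begins with that prefix. The sketch below assumes a deployment that accepts this form; both function names are hypothetical, and the second function only imitates the prefix match locally to show what the wildcard means.

```python
from typing import Iterable, List


def document_wildcard_url(base: str, collection: str, url_prefix: str) -> str:
    """Build a URL listing all archived documents under url_prefix, any date."""
    return f"{base.rstrip('/')}/{collection}/*/{url_prefix}*"


def matching_documents(captured_urls: Iterable[str], url_prefix: str) -> List[str]:
    """Local illustration of the prefix match a document wildcard expresses."""
    return [url for url in captured_urls if url.startswith(url_prefix)]
```

This is useful when the exact file name of a document is unknown but its directory on the original site is.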
![Page 36: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/36.jpg)
Strategies for FINDING MISSING RESOURCES
![Page 37: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/37.jpg)
removed or moved?
don’t start with the archive
missing resources have often just moved (Klein & Nelson, 2010)
Synchronicity for Firefox helps find new location:
scrapes archived version for “fingerprint” keywords; uses them to query search engines
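The “fingerprint” idea can be sketched in a few lines. This is not Synchronicity's actual algorithm, which the slides do not describe; it assumes plain term frequency as a stand-in for keyword selection, and the stopword list and function names are illustrative.

```python
import re
from collections import Counter
from typing import List
from urllib.parse import quote_plus

# Tiny illustrative stopword list; a real tool would use a fuller one.
STOPWORDS = frozenset(
    {"the", "a", "an", "and", "of", "to", "in", "for", "on", "is", "at", "by", "with"}
)


def fingerprint_keywords(text: str, k: int = 5) -> List[str]:
    """Return the k most frequent non-stopword terms in the archived text."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if len(w) > 2 and w not in STOPWORDS)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:k]]


def search_query(keywords: List[str]) -> str:
    """URL-encode the keywords for use in a search engine query string."""
    return quote_plus(" ".join(keywords))
```

The text of the archived copy goes into `fingerprint_keywords`, and the encoded result of `search_query` can be appended to a search engine's query parameter to look for the page at its new location.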
![Page 38: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/38.jpg)
MementoFox
![Page 39: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/39.jpg)
MementoFox
![Page 40: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/40.jpg)
find archived content now at a new URL
congressional committee hearings archive
live site URL doesn’t work in archive
find a site in the archive that would link to the desired site, then navigate to contemporaneous snapshot
![Page 41: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/41.jpg)
hearings archive only spans 2001-2006
![Page 42: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/42.jpg)
hearings archive URL changed in 2011
![Page 43: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/43.jpg)
truncate archival access URL
![Page 44: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/44.jpg)
snapshot from prior to site change
![Page 45: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/45.jpg)
navigate to appropriate section
![Page 46: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/46.jpg)
navigate to appropriate section
![Page 47: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/47.jpg)
find archived content now at a new URL
records currently stored in a password-protected part of a site may previously have been publicly accessible
conceptual site organization lasts longer than exact link construction
figure out where desired resource would be on the live site, then navigate to analogous section on archived site
![Page 48: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/48.jpg)
location of resources on live site
![Page 49: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/49.jpg)
location of resources on live site
![Page 50: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/50.jpg)
authentication required
![Page 51: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/51.jpg)
check the site in the archive
![Page 52: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/52.jpg)
navigate to an individual capture
![Page 53: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/53.jpg)
navigate to appropriate section
![Page 54: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/54.jpg)
navigate to appropriate section
![Page 55: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/55.jpg)
How You Can GET INVOLVED
![Page 56: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/56.jpg)
what websites from today would you want to be able to consult in five, ten, twenty years’ time?
have you told us what is important to capture?
help us to help you
![Page 57: Using Wayback Machine for Research - Library of Congress Blogs](https://reader036.vdocuments.site/reader036/viewer/2022071523/613d08ef736caf36b7588a0e/html5/thumbnails/57.jpg)
for more information
Library of Congress Web Archiving Program: http://www.loc.gov/webarchiving/
Library of Congress Web Archives: http://loc.gov/lcwa/
International Internet Preservation Consortium: http://netpreserve.org/
National Digital Information Infrastructure and Preservation Program: http://www.digitalpreservation.gov/