Open Library at the API Workshop
DESCRIPTION
Presented February 26, 2011 at The Maryland Institute for Technology in the Humanities.
TRANSCRIPT
![Page 1: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/1.jpg)
Hello.
MITH API Workshop
George Oates
Maryland, February 2011
Monday, April 11, 2011
![Page 2: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/2.jpg)
Some rights reserved by mattdork
I work at the Internet Archive, leading The Open Library project. We recently moved into this church in The Richmond in San Francisco. We’re turning it into a library.
![Page 3: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/3.jpg)
We’re based in San Francisco, California, where I happen to have been living for about 5 years.
![Page 4: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/4.jpg)
Universal Access to All Knowledge
Since 1996, the non-profit Internet Archive has been building a digital library of Internet sites and other things in digital form. archive.org has a ton of texts, video, software, live music... all sorts of things.
Our mission is Universal Access to all Knowledge. Not a bad reason to get out of bed each day...
![Page 5: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/5.jpg)
Some rights reserved by heather
It’s not your traditional non-profit... Lots of the staff are technologists and developers.
![Page 6: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/6.jpg)
archive.org
We have many computers. They store over:
- 100,000 hours of TV from channels all over the world
- 250,000 moving images or video
- 500,000 audio recordings
- 2.5 million scanned texts
- 150,000,000,000 web pages
![Page 7: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/7.jpg)
By rkumar
Just the other day we had 2.88 petabytes of hard drives delivered. That’s enough storage for about 2 billion books.
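As a rough sanity check on that figure, here is the arithmetic as a small sketch. It assumes decimal petabytes (10^15 bytes), which the talk does not specify, and simply divides the delivered storage by the number of books to get the implied average size per book.

```python
# Assumption: 1 petabyte = 10**15 bytes (decimal); the talk does not say
# which convention is meant.
PETABYTE = 10**15

storage_bytes = 2.88 * PETABYTE   # hard drives delivered
books = 2_000_000_000             # "about 2 billion books"

# Implied average storage per book, in decimal megabytes.
bytes_per_book = storage_bytes / books
megabytes_per_book = bytes_per_book / 10**6

print(f"{megabytes_per_book:.2f} MB per book")
```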
![Page 8: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/8.jpg)
Another major part of what we do is scanning books. This is a picture of one of the scanning centers in San Francisco. We currently employ about 200 staff scanning books.
![Page 9: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/9.jpg)
And today, we have over million free texts available online. That includes:
- over 1 million books
- 150 million pages scanned
- 1,000 books scanned EVERY day
- 24 scanning centers in 5 countries, and we hope for more.
![Page 10: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/10.jpg)
We’re also scanning microfilm, which is much faster than individual books. Here’s an example of the record of the population census from 1790 to 1930. Scanned from microfilm from the collections of the Allen County Public Library and originally from the United States National Archives Record Administration.
![Page 11: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/11.jpg)
Examples of Cross Writing from Boston Public Library
![Page 12: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/12.jpg)
Over 1 million free books that you can read on archive.org today, and access through the Open Library site, by checking the little “Only eBooks” box as you search.
![Page 13: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/13.jpg)
As well as being able to download these books in a variety of different formats, from PDF to TXT and more, we also have a web-based book reader, which you can use to read our scanned texts within your web browser, without the need for any additional software. At the end of 2010, we released a new version of our open source, browser-based BookReader.
I’ve actually come to Wellington direct from a meeting in San Francisco called Books in Browser, held at the Internet Archive last week. It was there that we announced an upcoming new release of our bookreader, which will hopefully go live in the next few weeks... Here are some screenshots...
![Page 14: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/14.jpg)
The main reason we wanted to improve on the current design was to try to build an “app-level quality” book reading experience right in the browser. This included several improvements for touch interfaces in browsers on devices like the iPad.
From a straightforward design perspective, there were also improvements to be made on usability and simple stuff like making the book bigger in the browser window.
![Page 15: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/15.jpg)
This is a screenshot with the toolbar open, where you can see new features like a navigation bar at the bottom that allows you to scroll through the book, a “read to me” feature which plays the book in a computer-y voice, and highlights what’s being read. Also, if we know a table of contents for the book, each chapter is mapped along the navigation bar.
We’ve also rewritten the full text search engine, and I’ll talk more about that a bit later.
![Page 16: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/16.jpg)
By rkumar
Apologies for the slightly blurry picture, but this is my boss, Brewster Kahle, who founded the Internet Archive back in 1996. He’s playing with a touchscreen which is displaying the new bookreader. The screen’s been installed in one of the reading desks that used to sit in the reading room of the Christian Science church before it became our new home. A big part of the bookreader redesign was to evolve an app-level quality book reading experience within a web browser. If you have an iPad, I’d encourage you to try it!
![Page 17: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/17.jpg)
The Open Library project was launched back in 2007. In May 2010, we launched a total site redesign. Just last week, we released a revised home page, building on our new Lending program, and generally trying to do a better job of communicating that you can come to Open Library to find something to read for free, or a book to borrow. We also added activity graphs to try to show that there’s stuff happening, all day, every day.
![Page 18: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/18.jpg)
A “Wikipedia for Books”
There are a few different ways to describe what Open Library is, but I think the explanation that makes the most sense is “a Wikipedia for Books”.
![Page 19: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/19.jpg)
Scrolling down the home page...
![Page 20: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/20.jpg)
We have a lending library of some 10,000 20th Century books. You can also access another 80,000 books if you’re (literally) sitting in one of the 150 or so libraries participating in our “In-Library Lending” program. Each participating library contributes eBooks into the in-library pool, and you can borrow anything in the pool, once you’re sitting in one of the libraries.
![Page 21: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/21.jpg)
Yay! Graphs going up! (That peak you can see across the graphs is our lending launch. For more info, read “Get Thee to a Library!” http://blog.openlibrary.org/2011/02/22/get-thee-to-a-library/)
![Page 22: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/22.jpg)
Snapshot of the various combinations of links we can provide to get you to books... For books we can’t lend through our own lending program, we’ve connected to Overdrive... We’re hoping to make the vendors you can buy from more dynamic, and open up the sources for online free texts. Right now, it’s just the Internet Archive texts that we link to in full.
![Page 23: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/23.jpg)
lending ebooks
• map / OpenStreetMap
You can browse a map of (mainly North American) libraries participating in the In-Library lending program. If you’re interested in joining, please contact us!
![Page 24: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/24.jpg)
borrow page
• screen
Here’s what a page looks like to borrow a book. You can see 3 options: In Browser, PDF, and ePub.
In-browser is available immediately. You need to download/install Adobe Digital Editions to read PDF or ePub versions.
![Page 25: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/25.jpg)
Developer Resources
![Page 26: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/26.jpg)
Open Library
http://openlibrary.org/developers
Python, Postgres, SOLR, JSON, REST
![Page 27: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/27.jpg)
http://github.com/openlibrary
We certainly have our code online at GitHub, but we rarely receive patches. I’m OK with this, at least for now.
![Page 28: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/28.jpg)
JSON/RDF
http://openlibrary.org/developers
![Page 29: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/29.jpg)
![Page 30: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/30.jpg)
{"description": {"type": "/type/text", "value": "Published in 1845, this pre-eminent American slave narrative powerfully details the life of the internationally famous abolitionist Frederick Douglass from his birth into slavery in 1818 to his escape to the North in 1838\u2014how he endured the daily physical and spiritual brutalities of his owners and drivers, how he learned to read and write, and how he grew into a man who could only live free or die."}, "created": {"type": "/type/datetime", "value": "2009-10-16T05:15:16.306558"}, "title": "Narrative of the life of Frederick Douglass, an American slave", "covers": [5658658], "subject_places": ["United States", "Maryland"], "last_modified": {"type": "/type/datetime", "value": "2011-02-26T02:29:58.442342"}, "subject_people": ["Frederick Douglass (1818-1895)", "Frederick Douglass (1817?-1895)", "Harriet A. Jacobs (1813-1897)"], "key": "/works/OL69181W", "authors": [{"type": {"key": "/type/author_role"}, "author": {"key": "/authors/OL23684A"}}], "latest_revision": 9, "subject_times": ["19th century"], "type": {"key": "/type/work"}, "subjects": ["Biography", "Abolitionists", "African American abolitionists", "Slaves", "Slavery", "United States", "African Americans", "Women slaves", "Social conditions", "Antislavery movements", "History", "Accessible book", "OverDrive", "Biography & Autobiography", "Nonfiction", "Classic Literature", "Fiction", "Protected DAISY"], "revision": 9}
JSON blob
![Page 31: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/31.jpg)
![Page 32: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/32.jpg)
![Page 33: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/33.jpg)
• http://openlibrary.org/works/OL69181W/
• http://openlibrary.org/works/OL69181W.json
• http://openlibrary.org/works/OL69181W.rdf
HTML, JSON, RDF
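To make the three representations concrete, here is a minimal Python sketch. The helper name `work_urls` is made up for illustration, and the sample string is an abridged copy of the JSON record shown on the earlier slide, so nothing here needs a network connection.

```python
import json

BASE_URL = "http://openlibrary.org"

def work_urls(work_key):
    """Return the HTML, JSON and RDF URLs for a key such as '/works/OL69181W'.

    Hypothetical helper: it only mirrors the three URL shapes on the slide.
    """
    return {
        "html": f"{BASE_URL}{work_key}/",
        "json": f"{BASE_URL}{work_key}.json",
        "rdf": f"{BASE_URL}{work_key}.rdf",
    }

# Abridged copy of the work record shown on the JSON slide.
sample = """{
  "title": "Narrative of the life of Frederick Douglass, an American slave",
  "key": "/works/OL69181W",
  "authors": [{"type": {"key": "/type/author_role"},
               "author": {"key": "/authors/OL23684A"}}],
  "subject_places": ["United States", "Maryland"],
  "revision": 9
}"""

work = json.loads(sample)

# Each author entry wraps the author key in a small role object.
author_keys = [entry["author"]["key"] for entry in work["authors"]]

print(work["title"])
print(author_keys)
print(work_urls(work["key"])["json"])
```

The same key drives all three URLs, so once you have a record's `key` you can move between the human-readable page and the machine-readable formats.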
![Page 34: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/34.jpg)
Data Dumps
http://archive.org/details/ol_data
![Page 35: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/35.jpg)
archive.org/details/ol_data
There’s a copy of everything we’re using on the Internet Archive too.
![Page 36: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/36.jpg)
API
http://openlibrary.org/developers/api
Open Library has a RESTful API, best used to link into Open Library data in JSON, YAML and RDF/XML.
![Page 37: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/37.jpg)
API
http://openlibrary.org/developers/api
• Books
• Covers
• Search inside
• Subjects
• Recent Changes
• Lists
![Page 38: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/38.jpg)
Request:
GET http://openlibrary.org/people/george08/lists.json
Response:
{
  "links": {
    "self": "/people/george08/lists.json",
    "next": "/people/george08/lists.json?limit=5&offset=5"
  },
  "size": 12,
  "entries": [
    {
      "url": "/people/george08/lists/OL13L",
      "full_url": "/people/george08/lists/OL13L/Various_Seeds_for_Testing",
      "name": "Various Seeds for Testing",
      "last_update": "2010-12-21T00:46:17.712513",
      "seed_count": 13,
      "edition_count": 13181
    },
    {
      "url": "/people/george08/lists/OL97L",
      "full_url": "/people/george08/lists/OL97L/Time_Travel",
      "name": "Time Travel",
      "last_update": "2010-12-17T18:27:14.781336",
      "seed_count": 5,
      "edition_count": 838
    },
    { ... },
    { ... },
    { ... }
  ]
}
http://openlibrary.org/dev/docs/api/lists
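The `links.next` value in that response suggests a simple way to walk through all of a person's lists. The sketch below is one way to do it in Python, not an official client: `fetch_json`, `iter_lists` and `summarize` are names invented here, and it assumes the last page simply omits `next`, which the slide does not confirm.

```python
import json
from urllib.request import urlopen

BASE_URL = "http://openlibrary.org"

def fetch_json(path):
    """Fetch a path such as '/people/george08/lists.json' and parse the body."""
    with urlopen(BASE_URL + path) as response:
        return json.load(response)

def iter_lists(start_path, fetch=fetch_json):
    """Yield list entries page by page, following links.next.

    Assumption: the final page has no "next" link, which ends the loop.
    """
    path = start_path
    while path:
        page = fetch(path)
        yield from page.get("entries", [])
        path = page.get("links", {}).get("next")

def summarize(entry):
    """One-line summary built from fields shown in the sample response."""
    return (f'{entry["name"]}: {entry["seed_count"]} seeds, '
            f'{entry["edition_count"]} editions')
```

With a network connection, `for entry in iter_lists("/people/george08/lists.json"): print(summarize(entry))` would print one line per list. Passing a different `fetch` function makes the paging logic easy to try out against canned data.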
![Page 39: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/39.jpg)
![Page 40: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/40.jpg)
![Page 41: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/41.jpg)
We built lists for a couple of reasons: 1, to help people collect things together, and 2, to make it easy to get at smaller sets of records.
![Page 42: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/42.jpg)
Covers
http://openlibrary.org/developers/api
![Page 43: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/43.jpg)
http://covers.openlibrary.org/b/$key/$value-$size.jpg
Where:
• key can be any one of ISBN, OCLC, LCCN, OLID and ID (case-insensitive)
• value is the value of the chosen key
• size can be one of S, M and L for small, medium and large respectively.
![Page 44: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/44.jpg)
http://covers.openlibrary.org/b/$key/$value-$size.jpg
http://covers.openlibrary.org/b/olid/OL7440033M-S.jpg (we use this)
http://covers.openlibrary.org/b/isbn/0385472579-S.jpg
http://covers.openlibrary.org/b/isbn/9780385472579-S.jpg
http://covers.openlibrary.org/b/lccn/93005405-S.jpg
http://covers.openlibrary.org/b/oclc/28419896-S.jpg
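The URL pattern is easy to wrap in a small helper. This is a sketch, not part of any Open Library client: the function name `cover_url` and its validation are assumptions made for illustration, and only the pattern, the key names and the three sizes come from the slides.

```python
COVERS_BASE = "http://covers.openlibrary.org/b"

# Key types and sizes as listed on the slide.
VALID_KEYS = {"isbn", "oclc", "lccn", "olid", "id"}
VALID_SIZES = {"S", "M", "L"}

def cover_url(key, value, size="M"):
    """Build a cover image URL of the form /b/$key/$value-$size.jpg.

    The key is lower-cased to match the example URLs (the slide says keys
    are case-insensitive). Upper-casing the size is this sketch's own choice.
    """
    key = key.lower()
    size = size.upper()
    if key not in VALID_KEYS:
        raise ValueError(f"unknown key type: {key!r}")
    if size not in VALID_SIZES:
        raise ValueError(f"unknown size: {size!r}")
    return f"{COVERS_BASE}/{key}/{value}-{size}.jpg"

print(cover_url("olid", "OL7440033M", "S"))
print(cover_url("ISBN", "9780385472579", "L"))
```

Validating the key and size up front means a typo fails loudly in your code rather than quietly producing a URL that returns no image.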
![Page 45: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/45.jpg)
![Page 46: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/46.jpg)
![Page 47: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/47.jpg)
Yay!
![Page 48: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/48.jpg)
![Page 49: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/49.jpg)
DOUBLE Yay!
![Page 50: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/50.jpg)
One of quite a few examples of Open Library in the wild is the National Library of Australia’s new search engine, Trove.
![Page 51: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/51.jpg)
You can see there that there are links to Open Library books wherever one can be sourced.
There are a growing number of sites making use of Open Library data... and that’s what we’re all about - data in, data out. The more interconnections we can make with other systems, the easier it will be for people to land where they want to go inside Open Library.
![Page 52: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/52.jpg)
This is ImportBot. He gets new catalog records from the Library of Congress and puts them into Open Library every Tuesday. We also import records from Amazon, and from the Internet Archive. ImportBot looks for recently scanned books, and creates new records (or merges them with existing ones) just a few minutes after the record is created on the Internet Archive.
![Page 53: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/53.jpg)
You can see ImportBot working away, just like you can see the Wiki’s edit history for every person who edits something.
![Page 54: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/54.jpg)
Another quick note on data in before I move on...
We’ve been experimenting with a couple of other “surgical” bots that look across the catalog and connect edition records directly to other services by stamping identifiers from other systems into Open Library. This is a bot written by a developer called Ben Gimpert. It takes a file mapping ISBNs to Goodreads IDs, looks for ISBN matches in OL, and adds the Goodreads ID to those records. This allows us to construct links to Goodreads, and to make the Goodreads ID available through the API.
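To give a feel for what a “surgical” bot like this does, here is a minimal in-memory sketch. It is not Ben Gimpert's code: the function names, the record shape (`isbn_10`, `isbn_13`, `identifiers`) and the sample Goodreads ID are all assumptions for illustration, and a real bot would read and save records through the Open Library API rather than edit dictionaries.

```python
def normalize_isbn(isbn):
    """Strip hyphens and spaces so '0-385-47257-9' matches '0385472579'."""
    return isbn.replace("-", "").replace(" ", "").upper()

def stamp_goodreads_ids(editions, isbn_to_goodreads):
    """Add a Goodreads ID to every edition whose ISBN is in the mapping.

    `editions` is a list of dicts with hypothetical 'isbn_10' / 'isbn_13'
    lists. Returns the number of editions that were changed.
    """
    lookup = {normalize_isbn(k): v for k, v in isbn_to_goodreads.items()}
    updated = 0
    for edition in editions:
        isbns = edition.get("isbn_10", []) + edition.get("isbn_13", [])
        for isbn in isbns:
            goodreads_id = lookup.get(normalize_isbn(isbn))
            if goodreads_id is None:
                continue
            ids = edition.setdefault("identifiers", {}).setdefault("goodreads", [])
            if goodreads_id not in ids:  # keep the bot safe to re-run
                ids.append(goodreads_id)
                updated += 1
            break  # first matching ISBN is enough for this edition
    return updated

# Made-up records and a made-up Goodreads ID, for illustration only.
editions = [
    {"key": "/books/EXAMPLE1", "isbn_10": ["0385472579"]},
    {"key": "/books/EXAMPLE2", "isbn_13": ["9780000000002"]},
]
mapping = {"0-385-47257-9": "12345"}

changed = stamp_goodreads_ids(editions, mapping)
print(changed, editions[0].get("identifiers"))
```

Checking whether the ID is already present makes the update idempotent, which matters for a batch job that may be run more than once over the same records.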
![Page 55: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/55.jpg)
You can see we’ve added a little widget on the page that connects to Goodreads. If you have an account, you can add our records to your lists on Goodreads. There’s also a LibraryThing ID, added by a similar batch bot update.
Writing bots to do things like this is the sort of development we’d like to open up to external developers too...
![Page 56: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/56.jpg)
BookReader
http://openlibrary.org/dev/docs/ia
![Page 57: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/57.jpg)
![Page 58: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/58.jpg)
The Library of Congress is using our Bookreader on read.gov. There are quite a few other examples of the IA Bookreader out there on the web. Hopefully the redesign (with touch interactions etc) will attract new people too...
![Page 59: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/59.jpg)
Princeton Digital Library
![Page 60: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/60.jpg)
Internet Archive
http://openlibrary.org/dev/docs/ia
![Page 62: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/62.jpg)
Raw Full Text > 4 million documents
with metadata
![Page 63: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/63.jpg)
Stanford NLP thing
http://nlp.stanford.edu/
We’ve just begun experimenting with some of the software made by the Stanford Natural Language Processing Group, which includes members of both the Linguistics Department and the Computer Science Department. One idea is to fold this software into the scanning process, so we can do a first pass on entity extraction on the full text of a book, to extract things like names, places and common subjects...
![Page 64: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/64.jpg)
But then of course, you can do cool stuff like this :)
![Page 65: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/65.jpg)
Challenges
![Page 66: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/66.jpg)
Tension? http://flic.kr/p/6zyU3U
The Taxonomy vs Folksonomy debate may be represented thusly.
![Page 67: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/67.jpg)
1) Books are for use.
2) Every reader his [or her] book.
3) Every book its reader.
4) Save the time of the User.
5) The library is a growing organism.
So, on the basis of the idea of our current catalog being a substrate, as Ranganathan suggests in his five laws of library science...
![Page 68: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/68.jpg)
![Page 69: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/69.jpg)
So... Open Library is a virtual space. Its organization isn’t constrained like a physical catalog. In fact, the more connections you can make into one of our “virtual index cards” the more ways people have to discover and navigate its contents.
http://www.flickr.com/photos/brixton/1394845916/
![Page 70: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/70.jpg)
http://flic.kr/p/6pmtQL
But, librarians are (very clever) humans too. And everyone who’s responsible for putting books into a traditional catalogue must work within patterns. Patterns that have grown semantically remarkable and deeply complex.
![Page 71: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/71.jpg)
Unknown author 403
Unknown Author 358
Author unknown 254
No Author 145
Author Unknown 59
No Author. 54
Author 20
No author. 16
No author 12
unknown author 8
Unknown Author Unknown 7
no author 7
No Author Stated 7
(No Author) 6
No author noted 5
No author noted. 4
no author listed 4
(no author) 4
Author Not Stated 4
Author. 4
No author specified 3
Miscellaneous Author 3
no Author 3
Author One 3
Multi-Author 3
No Author Listed 3
No Stated Author 3
Author Anonymous 2
(no author given) 2
Author 2
Author Wright 2
Unkown Author 2
No author stated 2
Mms suspense author 2
Author Test 2
TEST AUTHOR 2
http://openlibrary.org/search?author=author
Duplicate authors (and editions) are an issue... This is an example search for author records with “author” in their names... you can see the variety of ways that catalogers have noted unknown authors...
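As a small illustration of why this is hard for a computer, here is a Python sketch that folds together only the most mechanical differences (case, punctuation, spacing). The `normalize_name` function is an assumption made for this example, not how Open Library merges authors, and the counts are a subset of the figures on the slide.

```python
import re
from collections import Counter

def normalize_name(name):
    """Lower-case, drop punctuation and collapse whitespace.

    Deliberately naive: it handles ASCII only and does nothing about
    word order, misspellings or synonyms.
    """
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    return " ".join(cleaned.split())

# A subset of the variant counts shown on the slide.
variants = {
    "Unknown author": 403,
    "Unknown Author": 358,
    "Author unknown": 254,
    "No Author": 145,
    "Author Unknown": 59,
    "No Author.": 54,
    "No author.": 16,
    "No author": 12,
    "unknown author": 8,
    "(No Author)": 6,
}

merged = Counter()
for name, count in variants.items():
    merged[normalize_name(name)] += count

for name, count in merged.most_common():
    print(name, count)
```

Even after this pass, "unknown author", "author unknown" and "no author" remain three separate names, and a misspelling like "Unkown Author" would be a fourth. Closing those gaps takes rules or human judgment, not just string cleanup.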
![Page 72: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/72.jpg)
http://www.flickr.com/photos/blackbeltjones/4294354526/
We’ve noticed a TON of minor variations in the way cataloguers enter data... Trivial to us, but very hard for computers to differentiate.
![Page 73: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/73.jpg)
Substrate: any surface on which a plant or animal lives or on which a material sticks
Some rights reserved by Brynja Eldon
We have a repository that mostly contains records created by professionals. I find it useful to consider these records as a substrate, something that can be reacted upon.
![Page 74: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/74.jpg)
What if we consider the source Open Library records like that?
Some rights reserved by Brynja Eldon
Now that we’ve begun to reveal this substrate, how will people react to it? What reactions has it caused so far?
![Page 75: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/75.jpg)
Handwritten scribbles and scrawls; annotations; corrections
![Page 76: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/76.jpg)
Some rights reserved by jared
What if a catalog looks like this? Is crystalline? What if it is unconstrained by the need to sort, say, alphabetically?
From the artist of this image, Jared Tarbell: “Lines like crystals form at perpendicular angles to existing lines. A complex form emerges. 1000 classic computational substrate, color palette stolen from Jackson Pollock: A simple perpendicular growth rule creates intricate city-like structures. The simple rule, the complex results, the enormous potential for modification; this has got to be one of my all time favorite self-discovered algorithms. Lines like crystals grow on a computational substrate.”
![Page 77: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/77.jpg)
What happens when you introduce turbulence into the catalog? Here are a few examples of the sorts of edits we’re seeing... at a rate of about 100,000 edits per month.
http://www.flickr.com/photos/rreis/4859722551/sizes/l/
![Page 78: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/78.jpg)
000s of edits per month
if you don’t stimulate an organism, it atrophies
![Page 79: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/79.jpg)
Activity/History
One of the key components of any happy social system is the visibility of other people, and a sense of activity. This is one of the key elements we’re focussed on in the redesign. This particular list shows all edits by humans on Open Library, and it turns out to be a handy way to spot check what’s happening. You’ll notice, too, that there’s a special tab for the variety of edits that we run across the system using bots. Those edits are often pretty mechanical and repetitive, and we found that the bots obscure the humans if you just mush everything up in a big list, so we separated them.
![Page 80: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/80.jpg)
Activity/History
Live Data
![Page 81: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/81.jpg)
Solutions?
![Page 82: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/82.jpg)
http://www.flickr.com/photos/emdot/400280705/
Shelf
I really like how Raymond described his book yesterday, that as soon as he’d written it, it began to decay... Concrete, decay
![Page 83: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/83.jpg)
http://www.flickr.com/photos/arenamontanus/352130655/
Network
Plastic, self-healing
![Page 84: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/84.jpg)
Minimum Viable Record
Now, I want to try a little exercise. I’m going to hand out an index card to all of you, and ask you to nominate 5 fields that you think are enough to describe a book. I’ll collate the results and report back later.
![Page 85: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/85.jpg)
http://dotspotting.stamen.com/
Stamen Design in SF. Got funding from Knight Foundation to build Citytracking. Challenge is a “hodgepodge of bits—including APIs [2] and official sources, scraped websites, sometimes-reusable data formats and datasets, visualizations, embeddable widgets etc.—is fractured, overly technical and obscure, held in the knowledge base of a relatively small number of people, and requires considerable expertise to harness.”
![Page 86: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/86.jpg)
“...the first part of this project is to start from scratch, in a 'clean room' environment. We've started from a baseline that's really straightforward, tackling the simplest part: getting dots on maps, without legacy code or any baggage. Just that, to start. Dots on maps.
But “dots on maps” implies a few other things: getting the locations, putting them on there, working with them, and, crucially, getting them out in a format that people can work with.”
![Page 87: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/87.jpg)
http://dotspotting.stamen.com/about
![Page 88: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/88.jpg)
Open Publication Distribution System (OPDS)
http://bookserver.archive.org/catalog/new
This is an example of trying something very bare bones, to try to help systems intercommunicate more easily. (Open Library plans to publish OPDS feeds soon.) The Open Publication Distribution System (OPDS) Catalog specification is a syndication format for electronic publications based on Atom (RFC 4287) and HTTP (RFC 2616).
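Because an OPDS catalog is an Atom feed, a client can read it with an ordinary XML parser. The sketch below is only an illustration: the sample feed is hand-written for this example (it is not the real output of bookserver.archive.org), the `example.org` download links are placeholders, and `read_entries` is a hypothetical helper name. The acquisition link relation is the one defined by the OPDS specification.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
# Link relation OPDS uses to mark "get the book here" links.
ACQUISITION_REL = "http://opds-spec.org/acquisition"

# Hand-written sample feed (an assumption for illustration only).
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Sample catalog (hypothetical)</title>
  <entry>
    <title>American notes for general circulation [microform]</title>
    <author><name>Dickens, Charles, 1812-1870</name></author>
    <link rel="http://opds-spec.org/acquisition" type="application/pdf"
          href="http://example.org/american-notes.pdf"/>
    <link rel="http://opds-spec.org/acquisition" type="application/epub+zip"
          href="http://example.org/american-notes.epub"/>
  </entry>
</feed>
"""


def read_entries(feed_xml):
    """Return one dict per Atom entry: title, author, and download links."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.findall(ATOM + "entry"):
        # Keep only acquisition links, keyed by their media type.
        downloads = {
            link.get("type"): link.get("href")
            for link in entry.findall(ATOM + "link")
            if link.get("rel") == ACQUISITION_REL
        }
        entries.append({
            "title": entry.findtext(ATOM + "title"),
            "author": entry.findtext(ATOM + "author/" + ATOM + "name"),
            "downloads": downloads,
        })
    return entries


if __name__ == "__main__":
    for book in read_entries(SAMPLE_FEED):
        print(book["title"], "by", book["author"])
        for media_type, href in book["downloads"].items():
            print("  ", media_type, href)
```

The point of the bare-bones format is visible here: a title, an author and a couple of typed download links are enough for another system to do something useful.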
![Page 89: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/89.jpg)
American notes for general circulation [microform]
February 25, 2011 10:22 AM
Author: Dickens, Charles, 1812-1870
Publisher: New York : Harper
Year published: 1842
Book contributor: Canadiana.org
Language: en
Download Ebook: (PDF) (EPUB)
![Page 90: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/90.jpg)
Individuals can also add new books with a few details like Title, Author, Publisher and Publish Date. That’s enough for a stub, and then people are invited to add more details.
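As a rough sketch of what such a stub might look like in code, here is a small record type with just those four fields plus room to grow. The class name `StubRecord`, the field names and the `add_detail` method are all hypothetical; this is not Open Library's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class StubRecord:
    """A minimal book record: four required fields, everything else optional."""
    title: str
    author: str
    publisher: str
    publish_date: str
    # Details that other people add later (hypothetical free-form store).
    extra: dict = field(default_factory=dict)

    def add_detail(self, name, value):
        """Attach one more piece of information to the stub."""
        self.extra[name] = value


# Values taken from the sample record shown earlier in the talk.
stub = StubRecord(
    title="American notes for general circulation",
    author="Dickens, Charles, 1812-1870",
    publisher="Harper",
    publish_date="1842",
)
stub.add_detail("language", "en")
```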
![Page 91: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/91.jpg)
Canonical ID?
![Page 92: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/92.jpg)
Canonical ID? Collect them.
![Page 93: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/93.jpg)
Another experiment we’re looking forward to trying is about identifiers. We’re not particularly concerned about canonical identifiers. Perhaps it’s a waste of time to wait for one, so instead, we’re going to try and attach as many ID types to our records as we can. (This list is just a braindump - not active yet.) The idea is that people could add a URL or actual identifier and Open Library would just do the right thing. A suggestion (after this presentation was delivered) was that people could ping Open Library with an identifier, not even knowing what TYPE of ID it is. Perhaps Open Library could help “triangulate” this query towards a book record. “Record laundering.”
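To make the “triangulate” idea a little more concrete, here is a minimal sketch of guessing what type an unlabelled identifier might be. None of this is an existing Open Library feature: `guess_id_types` is a hypothetical helper, and the rules are deliberately rough. ISBNs are recognised by their standard check digits, Open Library edition IDs by their `OL…M` shape, and any other all-digit string is simply reported as ambiguous between LCCN and OCLC.

```python
import re


def is_isbn10(s):
    """True if s has the ISBN-10 shape and a valid mod-11 check digit."""
    if not re.fullmatch(r"\d{9}[\dX]", s):
        return False
    total = sum((10 - i) * (10 if ch == "X" else int(ch))
                for i, ch in enumerate(s))
    return total % 11 == 0


def is_isbn13(s):
    """True if s is 13 digits with a valid mod-10 check digit."""
    if not re.fullmatch(r"\d{13}", s):
        return False
    total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(s))
    return total % 10 == 0


def guess_id_types(raw):
    """Return candidate ID types for raw, most specific first."""
    value = raw.strip().replace("-", "").upper()
    candidates = []
    if re.fullmatch(r"OL\d+M", value):
        candidates.append("olid")
    if is_isbn10(value) or is_isbn13(value):
        candidates.append("isbn")
    if value.isdigit():
        # Shape alone cannot separate these two (assumption: no other
        # numeric ID types are considered in this sketch).
        candidates.extend(["lccn", "oclc"])
    return candidates


if __name__ == "__main__":
    for example in ["OL7440033M", "0385472579", "978-0-385-47257-9", "28419896"]:
        print(example, "->", guess_id_types(example))
```

A real service would then try each candidate type in turn and see which ones actually match a record.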
![Page 94: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/94.jpg)
Canonical ID? Exchange them.
![Page 95: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/95.jpg)
http://openlibrary.org/books/olid/OL7440033M
http://openlibrary.org/books/isbn/0385472579
http://openlibrary.org/books/isbn/9780385472579
http://openlibrary.org/books/lccn/93005405
http://openlibrary.org/books/oclc/28419896
http://openlibrary.org/books/id/240727
http://openlibrary.org/books/amazon/...
http://openlibrary.org/books/bookmooch/...
http://openlibrary.org/books/goodreads/...
http://openlibrary.org/books/ocaid/...
http://openlibrary.org/books/librarything/...
http://openlibrary.org/books/paperback_swap/...
http://openlibrary.org/books/Your ID Here/...
You can already ping Open Library with an ID other than the Open Library identifier to see if we have any matches.
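The URLs on this slide all follow one pattern: `/books/`, then the ID type, then the ID value. A small sketch of building those lookup URLs is below. The function name `lookup_url` is hypothetical, the sketch only constructs the URL (it makes no request), and whether a given ID type resolves is up to Open Library.

```python
from urllib.parse import quote

BASE = "http://openlibrary.org/books"


def lookup_url(id_type, value):
    """Build a lookup URL following the /books/<type>/<value> pattern."""
    return f"{BASE}/{quote(id_type)}/{quote(value)}"


# ID type and value pairs taken from the slide.
EXAMPLES = [
    ("olid", "OL7440033M"),
    ("isbn", "0385472579"),
    ("isbn", "9780385472579"),
    ("lccn", "93005405"),
    ("oclc", "28419896"),
]

if __name__ == "__main__":
    for id_type, value in EXAMPLES:
        print(lookup_url(id_type, value))
```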
![Page 96: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/96.jpg)
![Page 97: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/97.jpg)
Your ID
![Page 98: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/98.jpg)
Your ID
Everyone else’s
![Page 99: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/99.jpg)
Make nodes, not cards
Some rights reserved by yobink
![Page 100: Open Library at the API Workshop](https://reader038.vdocuments.site/reader038/viewer/2022110118/5552c418b4c90581158b498f/html5/thumbnails/100.jpg)
Network, not sequence