NSDL 2.0: Creating a Collaborative Digital Library
Dean Krafft, Cornell University [email protected]
Structure of the Talk
Project Overview and NSDL 1.0
The Fedora-based NSDL Data Repository (NDR) and NSDL 2.0
Inspiring Contribution and Collaboration: ExpertVoices, Soft Matter Wiki, etc.
IS Challenges for NSDL
Q&A
What is the NSDL?
An NSF-funded $20 million/year program in Science, Technology, Engineering and Mathematics (STEM) education
A digital library describing nearly two million carefully selected online STEM resources from well over 100 collections (at http://nsdl.org)
A core integration team (Cornell, UCAR, Columbia) working with 9 “pathways” portals and over 200 NSF grantees
A large community of researchers, librarians, content providers, developers, students, and teachers
Portals to the NSDL
NSDL 1.0
A “Union Catalog”
OAI-PMH Harvesting
Central Metadata Database
OAI Server for catalog
Search index of metadata/content
Initial K-Gray Portal: nsdl.org
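As a rough illustration of the harvesting step listed above, the sketch below builds an OAI-PMH `ListRecords` request and pulls identifiers and Dublin Core titles out of a response. The verb, the `metadataPrefix` and `resumptionToken` arguments, and the XML namespaces come from the OAI-PMH 2.0 protocol; the endpoint URL, the sample record, and the function names are assumptions made for this example and say nothing about how the NSDL harvester was actually written.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
OAI_NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    if resumption_token:
        # A resumptionToken is an exclusive argument: send it alone with the verb.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)

def parse_list_records(xml_text):
    """Return ([(identifier, title), ...], resumption_token_or_None)."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", OAI_NS):
        identifier = rec.findtext("oai:header/oai:identifier", default="", namespaces=OAI_NS)
        title = rec.findtext(".//dc:title", default="", namespaces=OAI_NS)
        records.append((identifier, title))
    token = root.findtext(".//oai:resumptionToken", default=None, namespaces=OAI_NS)
    return records, (token or None)

# Hypothetical response from a hypothetical collection (illustration only).
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Hurricane basics</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

records, token = parse_list_records(SAMPLE)
next_url = list_records_url("http://example.org/oai", resumption_token=token)
```

A harvester would repeat the request with each returned token until the response carries none, then load the collected records into the central metadata database.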
Infrastructure overview: NSDL 1.0
[Diagram: STEM Collections on the Web, Central Metadata Repository, Search Service, Archive Service, Collection Registration System, NSDL.org Portal. Protocols: OAI-PMH, HTTP, REST, SQL]
NSDL 1.0 Lessons
Metadata Repository was quick to implement using known technologies, but:
Limited model
Metadata-centric orientation
No content – only metadata
Limited relationships – collection/item
Limits on context, structure, and access
Severe limits on contribution and collaboration
One-way data flow: NSDL → Users
Photo by Jon Crispin
NSDL 2.0
Create an NSDL that not only guides resource discovery, but also:
Supports creating “context” for resources
Presents resources in context: linked to related concepts; with user ratings; with codes and data
Enables community tools for selecting, organizing, evaluating, annotating, contributing, and collaborating
Provides two-way data flow: NSDL ↔ users
Goal: Create a dynamic, living library
In architectural terms, create an NSDL Data Repository that:
Supports storing both content and metadata
Allows arbitrary relationships among resource and metadata objects: organization, annotation, citation
Is accessible through a web service architecture of remixable data sources and transformations
The Fedora Vision: A Repository for Rich Information Networks
[Diagram labels: Data, Actor, Formal document, Informal document, Data sets, Web service]
Fedora: the NDR middleware
A Flexible, Extensible Digital Object Repository Architecture (http://www.fedora.info)
Open source project with $2.2 million in Mellon funding 2002-2007
Collaboration of Cornell and Univ. of Virginia
Key funded users include:
eSciDoc project (collaboration of the Max Planck Society and FIZ Karlsruhe)
Public Library of Science (Topaz Foundation)
VTLS Corp., Harris Corp., Library of Congress
Australian Research Repositories Online to the World
Royal Library Denmark, National Library, and DTU
What is Fedora?
An architecture, toolkit, and implementation: middleware, not a vertical application
Stores arbitrary internal and external digital objects, disseminations (transformations and combinations), and relationships among objects
Entirely SOAP/REST based; disseminations are URLs
XML data store; RDBMS cache; RDF triplestore supports relationship queries
NSDL Data Repository (NDR)
References to roughly 2 million selected STEM resources on the web
Sourced metadata statements about those resources
A REST API to allow authenticated access by Pathways and providers
Support for annotation, aggregation, and other relationships
Sample NDR Objects & Relationships
[Diagram. Objects: Publication Resource, Publication Metadata, Data Set Resource, Data Set Metadata, Code Resource, Metadata Provider, MatForge Collection, SoftMatter Collection, Cornell CCMR, MatDL Pathway. Relationships: Cites, Metadata for, Member of, Selector for]
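One way to picture objects and relationships of this kind is as subject/predicate/object triples, which is also the form an RDF triplestore queries. The sketch below is a toy in-memory store, not Fedora's resource index. The object names echo the labels in the diagram, but the specific edges between them are assumptions made for illustration, since the transcript does not preserve which object each relationship connects.

```python
from collections import namedtuple

Triple = namedtuple("Triple", "subject predicate object")

class TripleStore:
    """Toy in-memory triple store with wildcard pattern matching."""

    def __init__(self):
        self._triples = set()

    def add(self, s, p, o):
        self._triples.add(Triple(s, p, o))

    def match(self, s=None, p=None, o=None):
        """Return matching triples, sorted; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if (s is None or t.subject == s)
            and (p is None or t.predicate == p)
            and (o is None or t.object == o)
        )

# Illustrative edges only: the pairing of objects is assumed, not sourced.
store = TripleStore()
store.add("publication-metadata", "metadataFor", "publication-resource")
store.add("dataset-metadata", "metadataFor", "dataset-resource")
store.add("publication-resource", "cites", "dataset-resource")
store.add("publication-resource", "memberOf", "softmatter-collection")
store.add("dataset-resource", "memberOf", "matforge-collection")

def metadata_for(store, resource):
    """List the metadata objects that describe a given resource."""
    return [t.subject for t in store.match(p="metadataFor", o=resource)]

def cited_by(store, resource):
    """List the resources that cite a given resource."""
    return [t.subject for t in store.match(p="cites", o=resource)]
```

Because every statement has the same shape, a new relationship type (an annotation, say) needs no schema change, only a new predicate.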
Draft NDR API Characteristics
Uses REST calls for all interactions; uses handles (DOIs) for all external references
Ensures external applications can’t violate the NDR model constraints
Disseminations allow combining metadata from multiple sources, or related content
Authentication: Requests signed with private key associated with an agent
Authorization: Agent can become a metadata provider or aggregator; can create resources
An Information Network Overlay
Think of the NDR as a lens for viewing science content on the net
Content can be:
Local: stored directly in the NDR
Remote: accessed through a URL
Computed: derived from a database or web service
Archived: an older version stored at SDSC
It all has a repository-based URL
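The point that every kind of content gets a repository-based URL can be sketched as a small data model: the kind records where the bytes really live, while callers only ever see one URL scheme. The base URL, class names, and handles below are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum

class ContentKind(Enum):
    LOCAL = "local"        # stored directly in the repository
    REMOTE = "remote"      # accessed through an external URL
    COMPUTED = "computed"  # derived from a database or web service
    ARCHIVED = "archived"  # an older version held in an archive

@dataclass(frozen=True)
class NdrContent:
    handle: str        # repository identifier (hypothetical format)
    kind: ContentKind
    source: str        # where the content actually comes from

# Hypothetical base URL, for illustration only.
REPOSITORY_BASE = "http://ndr.example.org/content/"

def repository_url(content):
    """Every item is addressed the same way, whatever its kind."""
    return REPOSITORY_BASE + content.handle

items = [
    NdrContent("h1", ContentKind.LOCAL, "datastream:h1"),
    NdrContent("h2", ContentKind.REMOTE, "http://example.org/page.html"),
    NdrContent("h3", ContentKind.COMPUTED, "service:standards-lookup"),
    NdrContent("h4", ContentKind.ARCHIVED, "archive:h4/2005-06"),
]
urls = [repository_url(item) for item in items]
```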
Network Overlay View
User View
API/UI
Repository View with Relations & Annotations
Resources on the Web
NSDL 2.0 Technical Challenges
Scaling the RDF triple-store past 200 million triples
Constraining RDF queries to be reasonably computable
Building meaningful search indices on explicit metadata, annotations, resource content, and relationships
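One common way to keep relationship queries computable is to cap both how many hops a query may follow and how many results it may return. The sketch below shows that idea as a bounded breadth-first walk over triples. It is an assumed approach for illustration, and the talk does not say which constraints the NDR adopted.

```python
from collections import deque

def related(edges, start, max_depth=2, max_results=100):
    """Breadth-first walk over (subject, predicate, object) edges with hard limits."""
    neighbours = {}
    for s, _p, o in edges:
        neighbours.setdefault(s, []).append(o)

    seen = {start}          # guards against cycles
    found = []
    queue = deque([(start, 0)])
    while queue and len(found) < max_results:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue        # do not expand beyond the hop limit
        for nxt in neighbours.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                found.append(nxt)
                queue.append((nxt, depth + 1))
                if len(found) >= max_results:
                    break
    return found

# A citation chain that loops back on itself.
EDGES = [
    ("a", "cites", "b"),
    ("b", "cites", "c"),
    ("c", "cites", "d"),
    ("d", "cites", "a"),
]
```

With both limits in place, the cost of a query is bounded by the limits rather than by the size of the whole graph.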
Applying the NDR
The NDR provides powerful capabilities for:
Creating context around resources
Enabling the NSDL community to directly contribute resources and context
Representing a web of relationships among science resources and information about those resources
How do we use it? Here’s one specific example …
ExpertVoices
What is Expert Voices?
A multi-user blogging tool
Topic-based discussions (e.g. forensics) with pointers to related resources
An outreach tool to explain and document NSF-funded research
A way for NSDL community members to become NSDL contributors: of resources, questions, reviews, annotations, metadata
A question/answer and discussion forum: scientist ↔ teacher ↔ student ↔ librarian
What isn’t EV?
Expert Voices ≠ LiveJournal
Contributors are carefully selected; contributions are about science, the process of science, and education
Comic by Michael Lalonde/orneryboy.com
Hurricane Floyd/Photo by NASA
Photo by Jon Crispin
Broadening Participation: An Expert Voices Learning Scenario
“Hurricane Season Blog” run by a National Weather Service hurricane expert, an Earth Science teacher, and a school media specialist familiar with NSDL
Expert creates an entry for Hurricane Gertrude: “On track to hit Ft. Lauderdale in 72 hours”; “Currently undergoing eyewall replacement cycle”; “Expecting 15 foot storm surge”
Media specialist adds links to NSDL resources: Hurricane Hunters site, latest satellite photos, and USGS flooding and flood plain web page
Teacher makes connections to relevant standards and appropriate pedagogy for use by other teachers
Students experience engaging real-time, real-world applications of science lessons
Expert Voices Implementation
Open source multi-user blogging system
Published entries become NSDL resources
Owner controls publication of entries and visibility of comments
Entries can contain linked references to NSDL resources, references to URLs that should become resources, and new resource metadata
Integrated with NSDL Shibboleth-based community sign-on
But Expert Voices is just the beginning…
Soft Matter Wiki: Planned NDR Integration
A community of approved contributors (e.g. teachers, librarians, materials scientists) is granted edit access to the Soft Matter wiki
New resources and metadata are created as wiki pages and reflected into the NDR
Relevant non-wiki-based NDR resources and metadata are displayed as read-only wiki pages, subject to comment and linking
User and project pages organize NDR resources
NDR Entry for Soft Matter Wiki
[Diagram. Objects: Wiki Entry, New Metadata, New Audience MD, Referenced New Resource 1, Referenced Existing Resource 2, Metadata Provider, Existing Collection, Soft Matter Wiki. Relationships: Annotates, Metadata for, Member of, Inferred relationship between resources]
MyNSDL: NDR-integrated tagging, bookmarking, and recommendation
Based on the Connotea open-source folksonomic tagging/bookmarking system
Tags and bookmarking structure are reflected back into the NDR
Authorized users can “automatically” recommend new NSDL resources simply by tagging them
Gives user a personal view of NSDL resources
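The tagging-as-recommendation idea can be sketched as follows: every tag is recorded as a relationship, and when an authorized user tags a URL the library does not yet hold, a recommendation is recorded as well. All names here are hypothetical, and the sketch does not describe how Connotea or MyNSDL store their data.

```python
def apply_tag(triples, authorized, library, user, url, tag):
    """Record a tag; return True if it also produced a new-resource recommendation."""
    triples.add((user, "tagged:" + tag, url))
    if user in authorized and url not in library:
        triples.add((user, "recommends", url))
        return True
    return False

def personal_view(triples, user):
    """URLs a user has tagged: a per-user view of the library."""
    return sorted({o for s, p, o in triples if s == user and p.startswith("tagged:")})

# Hypothetical users and resources.
triples = set()
authorized = {"librarian-1"}
library = {"http://example.org/known"}

r1 = apply_tag(triples, authorized, library, "librarian-1", "http://example.org/new", "hurricanes")
r2 = apply_tag(triples, authorized, library, "librarian-1", "http://example.org/known", "hurricanes")
r3 = apply_tag(triples, authorized, library, "student-7", "http://example.org/other", "storms")
```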
NDR Application: Content Assignment Tool
Developed by Anne Diekema, Elizabeth Liddy, et al. at the Syracuse University Center for Natural Language Processing
Uses text analysis and machine learning to suggest Educational Standards alignment for resources
Content expert assigns standard, and system learns from the assignment
Standalone tool available now; standards associated with resources in the NDR 4Q06
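The suggest-then-learn loop described above can be illustrated with a deliberately simple word-overlap model. This is a toy stand-in, not the Content Assignment Tool's method, which the slide describes only as text analysis and machine learning. The standard identifiers and sample texts are invented for the example.

```python
import re
from collections import Counter, defaultdict

def tokens(text):
    """Lowercase alphabetic words in a text."""
    return re.findall(r"[a-z]+", text.lower())

class StandardSuggester:
    """Toy model: score each standard by word overlap with past assignments."""

    def __init__(self):
        self._counts = defaultdict(Counter)  # standard -> word counts

    def learn(self, text, standard):
        """Record an expert's assignment of a standard to a resource text."""
        self._counts[standard].update(tokens(text))

    def suggest(self, text, top_n=3):
        """Return up to top_n standards, best match first."""
        words = set(tokens(text))
        scores = {}
        for standard, counts in self._counts.items():
            total = sum(counts.values())
            score = sum(counts[w] for w in words) / total if total else 0.0
            if score > 0:
                scores[standard] = score
        return sorted(scores, key=lambda s: (-scores[s], s))[:top_n]

# Hypothetical standards and resource descriptions.
suggester = StandardSuggester()
suggester.learn("Hurricanes form over warm ocean water", "ES-weather")
suggester.learn("Cells divide by mitosis", "LS-cells")
suggestions = suggester.suggest("How warm water fuels a hurricane")
```

Each confirmed assignment passed back through `learn` changes later scores, which is the feedback loop the slide describes.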
Other applications in development
Automated grade-level assignment based on vocabulary analysis (San Diego Supercomputer Center)
OnRamp – multi-user, multi-project NDR-integrated content management system
Instructional Architect: Lesson plan development for K12 teachers (Utah State)
iVia-based Expert-Guided crawl: Tool for Pathways and others to turn websites into resource collections (in development at UC Riverside)
Other proposed applications
Moodle Course Management System – courses integrated with NSDL resources
Electronic lab notebook – integrating lab notes with code, data sets, and reference materials within the library archival framework
…
NSDL 2.0 Ecosystem
[Diagram: STEM Collections, Search Service, Archive Service, Fedora-based NDR. Protocols: OAI-PMH, HTTP, REST, NDR API]
What are the Information Science challenges?
Trust
Photo © 2005 Reuters
Contribution
Trust and reputation in NSDL
We brand NSDL as a source of “trusted” resources
What is our trust mechanism?
Transitive trust approval
Community rating/filtering/reputation
Trusted vs. complete “views”
What is the right balance of trust vs. community contribution?
Community Formation
Build the tools and they will come?
What can we learn from Wikipedia, MySpace, Flickr, and YouTube?
How do we leverage existing societies and groupings (NSTA, ACM, AAPT, AAAS)?
Is there an NSDL community, or are there many small communities?
Courtesy Kathy Sierra/WickedlySmart.com
Creating Passionate Users
How do we help NSDL users “kick ass”?
What can we learn from game design?
Motivating goal
Challenging interaction
Meaningful payoff
Multiple levels
Can we use fun, emotion, seduction, surprise, and visuals – and still be academics?
Courtesy Kathy Sierra/WickedlySmart.com
Photo by Jon Crispin
Challenges of ubiquity
Should we target NSDL materials at limited devices (iPods, cell phones)?
How does ubiquitous NSDL access change teacher/student interactions?
Should we build tools to capture field data from these devices?
Other IS Challenges
Personalization: SDI, automated activity analysis, targeted user views
Visualizing the library: alternatives to text search for discovery and context
Location awareness: specializing library views by physical location
Summary
NSDL 2.0 and its tools allow scientists, mathematicians, teachers, engineers, librarians, and students to create a unique web of context, contribution, and collaboration around the high-quality STEM education resources at the core of the NSDL.
NSDL CI needs to solve the IS problems that stand between capability and reality.
Acknowledgements
NSDL NSF Program Officers
Lee Zia
David McArthur
NSDL Core Integration Team
UCAR: Kaye Howe, PI and Executive Director
Cornell: Dean Krafft, PI
Columbia: Kate Wittenberg, PI
Fedora Development Team
Cornell: Sandy Payette & Carl Lagoze
Univ. of Virginia: Thornton Staples
Apology
Courtesy Kathy Sierra/WickedlySmart.com
Questions?
Contact Information
Dean B. Krafft
Cornell Information Science
301 College Ave.
Ithaca, NY [email protected]
This work is licensed under the Creative Commons Attribution-NoDerivs 2.5 License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nd/2.5/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.