RDAP13 Lorrie Johnson: Facilitating Access to Scientific Data

Lorrie Apple Johnson, Senior Librarian, Information Analysis & Services, Office of Scientific and Technical Information (OSTI). Research Data Access & Preservation Summit 2013, Baltimore, MD, April 4, 2013. Facilitating Access to Scientific Data: The DataCite, Science.gov, and WorldWideScience.org Initiatives


Post on 22-Jan-2015






Lorrie Johnson, U.S. Department of Energy/Office of Scientific and Technical Information: “Facilitating Access to Scientific Data: The DataCite, Science.gov, and WorldWideScience.org Initiatives”. Panel: Linked data and metadata (co-sponsored by the ASIS&T Digital Libraries SIG). Research Data Access & Preservation Summit 2013, Baltimore, MD, April 4, 2013. #rdap13


  • 1. Lorrie Apple Johnson, Senior Librarian, Information Analysis & Services, Office of Scientific and Technical Information (OSTI). Research Data Access & Preservation Summit 2013, Baltimore, MD, April 4, 2013.

2. OSTI is a program within the DOE Office of Science with the corporate responsibility for ensuring appropriate access to DOE R&D results. DOE invests over $10 billion/year in basic sciences, clean energy technology, and nuclear research. The immediate output from this investment is information (knowledge, R&D results). OSTI's mission is to accelerate scientific progress by accelerating access to this information. Energy Policy Act of 2005: "The Secretary, through the Office of Scientific and Technical Information, shall maintain within the Department publicly available collections of scientific and technical information resulting from research, development, demonstration, and commercial applications activities supported by the Department."

3. Department of Energy Scientific and Technical Information Program. DOE R&D results are:
    • Collected from DOE offices, labs, and facilities, as well as university grantees;
    • Preserved for re-use; and
    • Made accessible via multiple web outlets.
OSTI works to ensure that research results from DOE programs are shared globally, plus DOE-supported researchers have access to scientific discoveries from around the world.

4. Scientific research is conducted at many agencies across the federal government. Scientists and researchers produce a lot of information, in many different formats:
    • Textual: reports, journal articles, conference proceedings, patents
    • Multimedia: videos, images
    • Data

5. Hard to FIND. Hard to NAVIGATE. Hard to CITE.

6. Data should be cited in just the same way that other sources of information, such as articles and books, are cited. Data citation can help by:
    • enabling easy reuse and verification of data
    • allowing the impact of data to be tracked
    • creating a scholarly structure that recognizes and rewards data producers

7. What is DataCite?
    • A global consortium composed of local institutions focused on improving the scholarly infrastructure around datasets and other non-textual information.
    • A service for assigning Digital Object Identifiers (DOIs) and metadata to datasets.
DataCite (www.datacite.org) helps researchers find, access and reuse data.

8.
    • Easier identification and access of datasets across the international community of researchers via DataCite's resolving tools
    • Linkage between DOE's R&D documents and the underlying datasets generated by the research
    • Standard format for including data in the accepted bibliographic citation framework
    • Aid researchers in locating exact datasets used in previous work, thus allowing verification of results or new uses for the data

9. DOE Data ID Service
    • DOE/OSTI is the only U.S. federal member of DataCite.
    • Interagency agreement in place with NIH project; in discussions with seven agencies representing 12 projects.
    • OSTI partnered with Oak Ridge National Laboratory to pioneer the procedure.
    • First DOI for a DOE dataset was minted and registered with DataCite on 8/10/2011.
    • DOE Atmospheric Radiation Measurement (ARM) has now registered over 400 datasets.

10. [Workflow diagram. Labels recoverable from the diagram:]
    • Data Citation metadata (the fields listed on slide 11) submitted to DOE-OSTI by the Creator/Author, Principal Investigator, or Submitter, via 241.6 AN or Web Service API
    • DOI assigned by DOE-OSTI
    • DOE-OSTI submits nightly feed of new DOIs to DataCite
    • DataCite validates DOI registration with DOE-OSTI
    • DataCite registers DOI
    • DOE-OSTI updates metadata record with DOI, creating a full Data Citation
    • Submitter notified of Data Citation availability
    • Data Citation submitted to search engines for indexing

11. [Data Citation metadata fields:]
    • Dataset Type
    • Originating Research Organization
    • Dataset Title
    • Publication/Issue Date
    • Dataset Creator/Author or Principal Investigator
    • Sponsoring Organization
    • Dataset Product Number
    • URL where the Dataset is posted for access
    • DOE Contract/Award Number
    • Contact information

12. Federated Searching. Since science is not bound by agency, organization, or geography:
    • We integrate or aggregate multiple government R&D-related databases into single-search portals.
    • Innovative technology drills down to selected databases and websites in parallel, then presents ranked search results.

13.
    • Drills into the deep web, where scientific databases reside
    • Finds dynamically generated content living inside those databases; high-quality managed subject-specific content
    • Returns current, real-time results
    • Presents no burden for database owner
    • Allows for fielded searching
Plus:
    • Inexpensive to implement
    • No need-to-know for user
    • No searching door-to-door
    • Automatic interoperability achieved

14. Parallel Searching. Visualization. Clustering. Relevancy Ranking.

15. Science.gov Integrates Federal Agency R&D Results. OSTI developed and operates Science.gov, a single search box portal to STI from 13 federal science agencies. Represents 97% of the federal research and development budget.
    • 200 million pages of science information
    • Over 55 databases
    • 2,100 select websites
Expanding to formats beyond text to multimedia and data.

16. WorldWideScience.org: Enabling Access to Global R&D Results. U.S. research results (Science.gov) plus research results from 70+ countries are searchable via a single-query global science portal. Multilingual translation capability for 10 languages. More than 400 million pages of scientific and technical information, including:
    • Text
    • Multimedia
    • Data

17. Thank you! Lorrie Apple Johnson, U.S. Department of Energy, Office of Scientific and Technical Information. [email protected]
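
Illustrative sketch (not part of the slides): slides 10 and 11 describe submission metadata being combined with an assigned DOI to form a full data citation. The Python below shows one way a few of those fields could be assembled into a citation string, loosely following a "Creator (Year): Title. Publisher. Identifier" pattern. The class, field names, sample values, and the DOI are all hypothetical placeholders; this is not OSTI's or DataCite's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class DatasetRecord:
    """A few of the submission fields from slide 11 (names are illustrative)."""
    creator: str                    # Dataset Creator/Author or Principal Investigator
    title: str                      # Dataset Title
    publication_year: int           # from Publication/Issue Date
    originating_organization: str   # Originating Research Organization
    doi: str = ""                   # empty until a DOI has been assigned


def format_data_citation(record: DatasetRecord) -> str:
    """Build a citation string; the DOI link is appended only once assigned."""
    citation = (
        f"{record.creator} ({record.publication_year}): "
        f"{record.title}. {record.originating_organization}."
    )
    if record.doi:
        citation += f" https://doi.org/{record.doi}"
    return citation


# Hypothetical record; the DOI below is a placeholder, not a registered identifier.
example = DatasetRecord(
    creator="Doe, J.",
    title="Example Surface Measurements",
    publication_year=2013,
    originating_organization="Example National Laboratory",
    doi="10.0000/example",
)
print(format_data_citation(example))
```

Keeping the DOI optional mirrors the workflow on slide 10, where the metadata record exists before the DOI is registered and becomes a full citation afterwards.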