Jane Greenberg, Professor and Director, Metadata Research Center <MRC>, School of Information and Library Science, University of North Carolina at Chapel Hill

Post on 29-Mar-2015

TRANSCRIPT

Page 1: Jane Greenberg, Professor and Director, Metadata Research Center School of Information And Library Science University of North Carolina at Chapel Hill


Page 2

Ecology, Paleontology, Population genetics, Physiology, Systematics, Genomics

Dryad Goals
- One-stop deposition/access for data objects supporting published research…
- Acquisition, preservation, discovery, and reuse of heterogeneous digital datasets

Page 3
Page 4

American Society of Naturalists: American Naturalist

Ecological Society of America: Ecology, Ecological Letters, Ecological Monographs, etc.

European Society for Evolutionary Biology: Journal of Evolutionary Biology

Society for Integrative and Comparative Biology: Integrative and Comparative Biology

Society for Molecular Biology and Evolution: Molecular Biology and Evolution

Society for the Study of Evolution: Evolution

Society for Systematic Biology: Systematic Biology

Commercial journals: Molecular Ecology, Molecular Phylogenetics and Evolution

Page 5
Page 6

Dryad’s functional requirements: support resource discovery and use; data interoperability; computer-aided metadata generation, augmentation, and quality control; linking publications and underlying datasets; and data security (Dube et al., 2006; White et al., 2008)

Dryad’s high-level metadata requirements:
- Simple: depositor-created metadata; automation of metadata generation; discovery of heterogeneous datasets
- Interoperable: harvesting, cross-system searching
- Semantic Web compatible: sustainable and adaptable metadata architecture, supporting machine processing

Page 7

Digital data repositories ought to support immediate operational needs and long-term goals to be sustainable. ~ a two-pronged approach

1. Sustainable in the evolving information environment (DCAM application profile)

2. Immediate need to make content available in DSpace (XML schema)

Page 8

Synthesis

Sharing

Discovery

Preservation

Page 9

Dublin Core based
• Circumvents limitations of using a single scheme
• Interoperable with other schemes
• Why reinvent the wheel?

Modular scheme:
1. Data package
2. Journal citation
3. Data object

(Carrier et al., 2007; White et al., 2008; Greenberg, 2009; Greenberg et al., in press, JLM, 2010)

Namespaces:
1. Dublin Core
2. Data Documentation Initiative (DDI)
3. Ecological Metadata Language (EML)
4. PREMIS Data Dictionary for Preservation Metadata
5. Publishing Requirements for Industry Standard Metadata (PRISM)
6. Darwin Core (DwC)
7. Dryad
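
To make the multi-namespace idea concrete, here is a minimal Python sketch of a record that draws properties from more than one namespace. The two Dublin Core URIs are the ones that appear elsewhere in this deck; the `dryad` namespace URI, the `DataPackage` element name, and all field values are hypothetical placeholders, not Dryad's actual schema.

```python
import xml.etree.ElementTree as ET

# Namespace URIs. The two Dublin Core URIs appear in this deck; the Dryad
# URI is a made-up placeholder for illustration only.
NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "dryad": "http://example.org/dryad/terms/",  # hypothetical
}


def q(prefix, local):
    """Return an ElementTree qualified name of the form {uri}local."""
    return f"{{{NS[prefix]}}}{local}"


def build_data_package(title, creator, identifier, available):
    """Build a small XML record whose properties come from several namespaces."""
    for prefix, uri in NS.items():
        ET.register_namespace(prefix, uri)
    root = ET.Element(q("dryad", "DataPackage"))  # hypothetical element name
    ET.SubElement(root, q("dc", "title")).text = title
    ET.SubElement(root, q("dc", "creator")).text = creator
    ET.SubElement(root, q("dc", "identifier")).text = identifier
    ET.SubElement(root, q("dcterms", "available")).text = available
    return root


# Placeholder values, not real Dryad data.
record = build_data_package(
    title="Example data package",
    creator="Example, A.",
    identifier="http://example.org/dataset/1",
    available="2010-04-23",
)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Borrowing an element from another scheme (DDI, EML, PREMIS, PRISM, Darwin Core) would follow the same pattern: add its namespace URI to `NS` and qualify the element name with that prefix.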

Page 10

Singapore Framework (machine processing, long-term quality control, Semantic Web/linked data)

A loose standard for DC endorsed application profiles

Page 11

A “loose” standard for Dublin Core “endorsed” application profiles

The Singapore Framework provides guidelines for creating a DCAM-conformant Application Profile (“DC Application Profile”)

A packet of documentation which consists of:

1. Functional requirements (desirable)
2. Domain model (mandatory)
3. Description Set Profile (DSP) (mandatory)
4. Usage guidelines (optional)
5. Encoding syntax guidelines (optional)

Stuff we do anyhow… just better! (JLM paper)

Page 12

DSP is “an information model and XML expression” (http://www.unc.edu/~scarrier/dryad/DSPLevelOneAppProfDraft.xml)

Obligation (optional, mandatory)

Non-literal (a thing, philosophically: things in the real world, known in different ways):
- http://purl.org/dc/elements/1.1/rights (mandatory); there are different rights
- Subject, creator, description…

Literals (strings):
- http://purl.org/dc/elements/1.1/identifier = http://purl.org/dc/terms/URI
- http://purl.org/dc/terms/available = http://purl.org/dc/terms/W3CDTF
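
The constraints on this slide can be sketched as a small checker in Python. This is an illustration of the idea only, not the DSP XML format or Dryad's implementation. The `PROFILE` layout, the `validate` function, the way non-literal values are modelled, and the simplified regular expressions are all assumptions. So is treating `identifier` and `available` as optional, since the slide states an obligation only for `rights`.

```python
import re

DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"

# Constraints from the slide: rights is mandatory and non-literal;
# identifier and available are literals with a syntax encoding scheme.
# Marking identifier and available as optional is an assumption.
PROFILE = {
    DC + "rights": {"mandatory": True, "kind": "nonliteral", "scheme": None},
    DC + "identifier": {"mandatory": False, "kind": "literal",
                        "scheme": DCTERMS + "URI"},
    DCTERMS + "available": {"mandatory": False, "kind": "literal",
                            "scheme": DCTERMS + "W3CDTF"},
}

# Simplified checks for the two encoding schemes (not full validators).
SCHEME_CHECKS = {
    DCTERMS + "URI": re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:\S+$"),
    DCTERMS + "W3CDTF": re.compile(r"^\d{4}(-\d{2}(-\d{2})?)?$"),
}


def validate(description):
    """Return a list of problems found in a {property_uri: value} description.

    A non-literal value is modelled as a dict with a "uri" key; a literal
    value is modelled as a plain string.
    """
    problems = []
    for prop, rule in PROFILE.items():
        if prop not in description:
            if rule["mandatory"]:
                problems.append(f"missing mandatory property: {prop}")
            continue
        value = description[prop]
        if rule["kind"] == "nonliteral":
            if not (isinstance(value, dict) and "uri" in value):
                problems.append(f"expected non-literal value for: {prop}")
        elif not isinstance(value, str):
            problems.append(f"expected literal string for: {prop}")
        elif rule["scheme"] and not SCHEME_CHECKS[rule["scheme"]].match(value):
            problems.append(f"value does not match {rule['scheme']}: {prop}")
    return problems


# Placeholder descriptions, not real Dryad records.
good = {
    DC + "rights": {"uri": "http://example.org/rights/open"},
    DC + "identifier": "http://example.org/dataset/1",
    DCTERMS + "available": "2010-04-23",
}
bad = {
    DC + "identifier": "not a uri",
    DCTERMS + "available": "April 2010",
}
print(validate(good))
print(validate(bad))
```

Keeping the profile as data rather than code mirrors the point of a DSP: the constraints can be published, shared, and processed by machines independently of any one validator.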


Toward the Semantic Web… linked data….


Potential for metadata and …data reuse in different contexts

Page 13

<AMG> approach for integrating discipline controlled vocabularies (CVs)

Model addressing CV cost, interoperability, and usability constraints (interdisciplinary environment)

Building, sharing, and evaluating the HIVE…

Page 14

Dryad development was reverse engineered: long-term goals came first, since the project did not have to hit the ground running

Commencement of data exchange plans refocused the DCAP work on the XML schema

- Harvesting EML records from LTER Network’s Metacat

- Data exchange plans: Dryad and TreeBASE

Discouraged… the approaches are not at odds

Application profile work was instrumental in creating a sufficient XML schema

Renderings of the same intellectual work on a continuum

GRDDL (Gleaning Resource Descriptions from Dialects of Languages): enabling technology to create RDF
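
The gleaning idea can be illustrated with a short Python sketch that turns a flat XML record into RDF-style triples. This is a stand-in for the concept, not GRDDL itself: GRDDL associates an XML document with a transformation (commonly XSLT) that produces RDF, whereas the code below hard-codes a simple mapping. The sample record, the `glean_triples` and `to_ntriples` names, and the choice of `dc:identifier` as the subject are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample record; the values are placeholders.
SAMPLE = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:identifier>http://example.org/dataset/1</dc:identifier>
  <dc:title>Example data package</dc:title>
  <dc:creator>Example, A.</dc:creator>
</record>"""

IDENTIFIER = "{http://purl.org/dc/elements/1.1/}identifier"


def glean_triples(xml_text):
    """Return (subject, predicate, object) triples for a flat XML record.

    The subject is the record's dc:identifier; each other namespaced child
    element becomes one triple with a literal object.
    """
    root = ET.fromstring(xml_text)
    subject = root.findtext(IDENTIFIER)
    if subject is None:
        raise ValueError("record has no dc:identifier to use as a subject")
    triples = []
    for child in root:
        if child.tag == IDENTIFIER or not child.tag.startswith("{"):
            continue
        # ElementTree tags look like "{namespace}local"; join them into a URI.
        namespace, local = child.tag[1:].split("}", 1)
        triples.append((subject, namespace + local, (child.text or "").strip()))
    return triples


def to_ntriples(triples):
    """Serialize triples with literal objects as simplified N-Triples lines."""
    lines = []
    for s, p, o in triples:
        escaped = o.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        lines.append(f'<{s}> <{p}> "{escaped}" .')
    return "\n".join(lines)


triples = glean_triples(SAMPLE)
print(to_ntriples(triples))
```

The XML record and the triples are two renderings of the same description, which is the sense in which the XML schema and the application profile sit on a continuum.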

Page 15

Positive aspects
- Intellectually engaging
- Think we are making a contribution, have to start somewhere…
- Machine capabilities
- eScience/data synthesis

Challenges
- Infrastructure not all there… (a lot is not in RDF)
- Registered Dryad “purl”
- Proof of concept difficult
- Time consuming
- Documentation lacking

Page 16

Conclusions:
- The Dryad development team has discovered that a two-pronged hybrid philosophy works well in this environment and keeps legacy practices from impeding development
- A bridge…
- Development has been consensus driven: biologists, computer scientists, librarians
- New models need to be explored
- No exploration, no discovery… unsuccessful results are not always bad

DCAP

Semantic Web ~ linked data…

Page 17

HIVE Advisory Board
- Jim Balhoff, NESCent
- Libby Dechman, LCSH
- Mike Frame, USGS
- Alistair Miles, CCLRC Rutherford Appleton Laboratory
- William Moen, University of North Texas
- Eva Méndez Rodríguez, University Carlos III of Madrid
- Joseph Shubitowski, Getty Research Institute
- Ed Summers, LCSH
- Barbara Tillett, Library of Congress
- Kathy Wisser, UNC Chapel Hill
- Lisa Zolly, USGS

Vocabulary Partners
- Library of Congress: LCSH
- The Getty Research Institute (GRI): TGN (Thesaurus of Geographic Names)
- United States Geological Survey (USGS): NBII Thesaurus

Dryad Team

NESCent
- Hilmar Lapp, Assistant Director for Informatics
- Ryan Scherle, Data Repository Architect
- Todd Vision, Associate Director of Informatics

UNC/SILS/MRC
- Jane Greenberg, Professor
- Lina Huang, MSIS Student, Research Assistant
- Robert M. Losee, Professor
- Jose R. Pérez-Agüera, Clinical Assistant Professor
- Hollie White, Metadata Research Center Doctoral Fellow

Collaborators
- North Carolina State University
- University of New Mexico/LTER
- Yale University
- Partner journals and societies

Page 18

For Information
- Dryad: http://datadryad.org/
- Dryad Wiki: https://www.nescent.org/wg_digitaldata/Main_Page (includes links to publications, the application profile, and lists Dryad team members)
- Metadata Research Center <MRC>: http://www.ils.unc.edu/mrc/
- National Evolutionary Synthesis Center (NESCent): http://www.nescent.org/index.php

The Dryad Data Repository