GRAD 521, Research Data Management Winter 2014 – Lecture 9
Amanda L. Whitmire, Asst. Professor
Data documentation through metadata
Lesson topics
1. Definition of metadata
2. Examine information included in a metadata record
3. Examples of metadata standards and how to choose
4. Illustrate the value of metadata to data users, data providers, and organizations
5. Describe the utility of metadata for a variety of scenarios beyond discovery
The data lifecycle
Data collection
CC images by Justin See, CIMMYT, acordova, kukkurovaca, SEDAC, and ISAS on Flickr
From field notes to datasets
Average temperature of observation for each species
Species | Average Temperature | Temperature Standard Deviation | Number of Observations | Minimum Temperature | Maximum Temperature
Northern Red-legged Frog | 4.4 | --- | 1 | 4.4 | 4.4
Tailed Frog | 7.0 | 3.0 | 3 | 4 | 10
Arizona Toad | 10.0 | --- | 1 | 10 | 10
Strecker's Chorus Frog | 10.5 | 2.0 | 11 | 9 | 16
Oregon Spotted Frog | 11.0 | 15.5 | 2 | 0 | 22
New Jersey Chorus Frog | 11.5 | 4.5 | 17 | 3 | 22
Wood Frog | 12.5 | 5.5 | 897 | 0 | 28.8
Spring Peeper | 13.2 | 5.6 | 569 | -1 | 32
Red-legged Frog | 13.3 | 5.9 | 16 | 4 | 27
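A summary table like this one is usually derived from raw field observations. The following is a minimal sketch of that step using only Python's standard library. The observation values and the `summarize` helper are hypothetical and are not the lecture's actual data or method.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical raw field notes: (species, temperature at observation).
observations = [
    ("Tailed Frog", 4.0),
    ("Tailed Frog", 7.0),
    ("Tailed Frog", 10.0),
    ("Arizona Toad", 10.0),
]

def summarize(rows):
    """Group temperatures by species and compute per-species summary statistics."""
    by_species = defaultdict(list)
    for species, temperature in rows:
        by_species[species].append(temperature)
    summary = {}
    for species, temps in by_species.items():
        summary[species] = {
            "average": mean(temps),
            # A sample standard deviation needs at least two observations;
            # None plays the role of the "---" entries in the table.
            "std_dev": stdev(temps) if len(temps) > 1 else None,
            "count": len(temps),
            "minimum": min(temps),
            "maximum": max(temps),
        }
    return summary

summary = summarize(observations)
```

Note that the summary alone does not say what units the temperatures are in or how they were measured. Documentation has to supply that.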
From datasets to published papers
CC image by Heather Kennedy on Flickr
Working with data
When you provide data to someone else, what types of information would you want to include with the data?
When you receive a dataset from an external source, what types of details do you want to know about the data?
Working with data
Providing data:
o Why were the data created?
o What limitations, if any, do the data have?
o What do the data mean?
o How should the data be cited if they are re-used in a new study?
Receiving data:
o What are the data gaps?
o What processes were used for creating the data?
o Are there any fees associated with the data?
o In what scale were the data created?
o What do the values in the tables mean?
o What software do I need in order to read the data?
o What projection are the data in?
o Can I give these data to someone else?
What is metadata?
“Data about data”
“Structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource.”
NISO, Understanding Metadata
Metadata
“The metadata accompanying your data should be written for a user 20 years into the future -- what does that person need to know to use your data properly? Prepare the metadata for a user who is unfamiliar with your project, methods, or observations.”
Oak Ridge National Laboratory Distributed Active Archive Center for Biogeochemical Dynamics (ORNL DAAC)
What is metadata?
WHO created the data? WHAT is the content of the data? WHEN were the data created? WHERE is it geographically? HOW were the data developed? WHY were the data developed?
Photo by Michelle Chang. All Rights Reserved
Metadata is: Data ‘reporting’
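The six questions can be treated as the fields of a minimal metadata record. The sketch below is one way to picture that in Python. The `MetadataRecord` class, its field names, and all example values are hypothetical and do not come from any metadata standard.

```python
from dataclasses import dataclass, asdict

@dataclass
class MetadataRecord:
    """One field per question on the slide; field names are illustrative."""
    who: str    # WHO created the data?
    what: str   # WHAT is the content of the data?
    when: str   # WHEN were the data created?
    where: str  # WHERE is it geographically?
    how: str    # HOW were the data developed?
    why: str    # WHY were the data developed?

    def missing(self):
        """Return the names of the questions left unanswered (blank)."""
        return [name for name, value in asdict(self).items() if not value.strip()]

# Hypothetical example values, invented for illustration only.
record = MetadataRecord(
    who="J. Doe, Example University",
    what="Air temperature at each frog observation",
    when="2013-03-01 to 2013-06-30",
    where="",
    how="Handheld thermometer, one reading per observation",
    why="Relate species activity to temperature",
)
unanswered = record.missing()
```

Here `unanswered` lists the one question left blank (`where`), which is the kind of gap a future user of the data would run into.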
Levels of metadata
PROJECT LEVEL: Descriptive information
DATA LEVEL: Granular information
Metadata in real life
You use it all the time…
Metadata standards
Dublin Core (DC), Darwin Core (DwC), EML, DDI, NBII, FGDC/CSDGM, ISO 19139, ISO 19115, DIF, LDIF, e-GMS, AGLS, METS, MODS, PREMIS, OAI-PMH, MARC, CDWA, CIDOC/CRM, DACS, DIG35, GILS, GML, ISBD, LCSH, KML, MARCXML, MEI, MODS, MIX, OAIS, ANSI/NISO Z39.88, PB Core, PRISM, QDC, RDF, SGML, VSO, XML, XMP
What is a metadata standard?
A standard provides a structure to describe data with:
o Common terms to allow consistency between records
o Common definitions for easier interpretation
o Common language for ease of communication
o Common structure to quickly locate information
In search and retrieval, standards provide:
o Documentation structure in a reliable and predictable format for computer interpretation
o A uniform summary description of the dataset
CC image by ccarlstead on Flickr
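To make "a reliable and predictable format for computer interpretation" concrete, here is a small sketch that writes a few Dublin Core elements as XML with Python's standard library. The namespace URI is the Dublin Core element set namespace. The `record` wrapper element, the `to_dublin_core` helper, and all values are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)

def to_dublin_core(fields):
    """Serialize a dict of Dublin Core element names and values to an XML string."""
    root = ET.Element("record")  # wrapper element name is an assumption
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")

# Hypothetical values; title, creator, date, and rights are Dublin Core elements.
xml_text = to_dublin_core({
    "title": "Frog observation temperatures",
    "creator": "J. Doe",
    "date": "2014",
    "rights": "CC BY 4.0",
})
```

Because every record uses the same element names, a program can read the title or creator of any record without knowing anything else about the dataset.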
What does a metadata record look like?
Ocean Currents and Biogeochemistry: Nearshore Water Profiles (Monthly CTD and Chemistry; SBC-LTER) web link
New York City Community Health Survey, 2009 (ICPSR) web link
Mountain hemlock tree-ring width chronologies from the western Oregon Cascade Mountains (USFS Research Data Archive) web link
Muddiest point…
What did you find unclear about the concept of metadata?
Concerns about creating metadata
Even if the value of data documentation is recognized, concerns remain as to the effort required to create metadata that effectively describe the data.
Concerns about creating metadata
Concern | Solution
workload required to capture accurate, robust metadata | incorporate metadata creation into data development process; distribute the effort
time and resources to create, manage, and maintain metadata | include in grant budget and schedule
readability / usability of metadata | use a standardized metadata format
discipline-specific information and ontologies | ‘profile’ standard to require specific information and use specific values
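Profiling a standard means requiring certain fields and restricting some of them to agreed values. The sketch below shows one way to picture that check. The `PROFILE` contents, the field names, and the `validate` helper are hypothetical and not taken from any published profile.

```python
# Hypothetical profile: required fields plus allowed values for one field.
PROFILE = {
    "required": ["title", "abstract", "contact", "habitat"],
    "controlled": {"habitat": {"forest", "wetland", "grassland"}},
}

def validate(record, profile=PROFILE):
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for field in profile["required"]:
        if not record.get(field):
            problems.append(f"missing required field: {field}")
    for field, allowed in profile["controlled"].items():
        value = record.get(field)
        if value and value not in allowed:
            problems.append(f"{field}: '{value}' is not in the controlled vocabulary")
    return problems

# Hypothetical record with one missing field and one out-of-vocabulary value.
problems = validate({
    "title": "Frog survey",
    "abstract": "Counts by site",
    "habitat": "swamp",
})
```

Running a check like this while the data are being developed spreads the effort out, in line with the first solution in the table.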
The value of metadata
Metadata helps…
Data creators
Data users
Organizations
What is the value to data creators?
Metadata allows data creators to:
o Avoid data duplication
o Share reliable information
o Publicize efforts: promote the work of a scientist and his/her contributions to a field of study
CC image by US Embassy Guyana on Flickr
What is the value to data users?
Metadata gives a user the ability to:
o Search, retrieve, and evaluate data set information from both inside and outside an organization
o Find data: Determine what data exists for a geographic location and/or topic
o Determine applicability: Decide if a data set meets a particular need
o Discover how to acquire the dataset you identified; process and use the dataset
CC image by ASEE on Flickr
What is the value to organizations?
Metadata helps ensure an organization’s investment in data
o Documentation of data processing steps, quality control, definitions, data uses, and restrictions
o Ability to use data after initial intended purpose
Transcends people & time
o Offers data permanence
o Creates institutional memory
Advertises an organization’s research
o Creates possible new partnerships and collaborations through data sharing
Information Entropy
Figure (from Michener et al. 1997): data details (vertical axis) versus time (horizontal axis), annotated with:
o Time of data development
o Specific details about problems with individual items or specific dates are lost relatively rapidly
o General details about datasets are lost through time
o Accident or technology change may make data unusable
o Retirement or career change makes access to “mental storage” difficult or unlikely
o Loss of data developer leads to loss of remaining information
Information Entropy
Figure: data details (vertical axis) versus time (horizontal axis).
Sound information management, including metadata development, can arrest the loss of dataset detail.
A closer look: the utility of metadata
Metadata can support:
o data distribution
o data management
o [project management]
If it is:
o considered a component of the data
o created during data development
o populated with rich content
Figure labels: collect, derive, classify, planimetric imagery, analysis, alternative, committee review, charette, PLAN; "meta" appears at four points in the workflow.
Data distribution via metadata
metadata publication
data portals
data discovery
Distribution: data discovery
The descriptive content of the metadata file can be used to identify, assess, and access available data resources.
IDENTIFY
• keywords
• geographic location
• time period
• attributes
ASSESS
• use constraints
• access constraints
• data quality
• availability/pricing
ACCESS
• online access
• order process
• contacts
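The identify, assess, access sequence can be sketched as a filter over a metadata collection followed by a look at each hit's constraints and access link. Everything in the sketch is hypothetical: the records, the field names, the example.org URLs, and the `identify` helper are invented for illustration.

```python
from datetime import date

# Hypothetical metadata collection; every value is invented for illustration.
collection = [
    {"title": "Example water profiles", "keywords": {"ocean", "chemistry"},
     "location": "Example Channel", "start": date(2000, 1, 1), "end": date(2012, 12, 31),
     "use_constraints": "cite the data provider", "online_access": "https://example.org/water"},
    {"title": "Example tree-ring widths", "keywords": {"forest", "dendrochronology"},
     "location": "Example Mountains", "start": date(1600, 1, 1), "end": date(2009, 12, 31),
     "use_constraints": "none", "online_access": "https://example.org/rings"},
]

def identify(records, keyword=None, location=None, during=None):
    """Filter records by keyword, location substring, and a date inside the covered period."""
    hits = []
    for rec in records:
        if keyword and keyword not in rec["keywords"]:
            continue
        if location and location.lower() not in rec["location"].lower():
            continue
        if during and not (rec["start"] <= during <= rec["end"]):
            continue
        hits.append(rec)
    return hits

# IDENTIFY: find candidate datasets.
hits = identify(collection, keyword="forest", during=date(1900, 6, 1))
# ASSESS and ACCESS: read the constraints and the access link from each hit.
report = [(rec["title"], rec["use_constraints"], rec["online_access"]) for rec in hits]
```

The search only works because every record fills in the same descriptive fields, which is the point of the slide.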
Distribution: metadata publication
A metadata collection can be published to the internet via:
o website catalog
o web accessible folder (WAF)
o Z39.50 metadata clearinghouse
o metadata service
o geospatial data portal
Figure labels: User Query, Internet, Metadata Collection, Internet / Intranet, Dataset
Distribution: data portals
Examples of metadata search portals:
Data.gov
o Federal e-gov geospatial data portal
o http://www.geo.data.gov
Metacat
o Repository for data and metadata
o http://knb.ecoinformatics.org/index.jsp
US Geological Survey
o USGS Core Science Metadata Clearinghouse: http://mercury.ornl.gov/clearinghouse
ICPSR
o Political and Social Science data portal
Data management via metadata
Data Accountability
Discovery & Re-use
Maintenance & Update
Data Liability
Management: maintenance & update
Metadata records can be used to track data provenance and accuracy.
Data Maintenance:
• Are the data current?
  o Do we have data older than ten years?
  o Were the data created before some political or geophysical event that resulted in significant change?
• Are the data valid?
  o prior to most current source data
  o prior to most current methodologies
Data Update:
• Contact information
• Distribution policies, availability, pricing, URLs
• New derivations of the dataset
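The "older than ten years" question is easy to automate once each metadata record carries a date. A minimal sketch follows, in which the `last_updated` field name, the records, and the `is_stale` helper are all hypothetical.

```python
from datetime import date

def is_stale(last_updated, today, max_age_years=10):
    """True if the last update is more than max_age_years before today."""
    try:
        cutoff = today.replace(year=today.year - max_age_years)
    except ValueError:
        # today is 29 February and the cutoff year is not a leap year
        cutoff = today.replace(year=today.year - max_age_years, day=28)
    return last_updated < cutoff

# Hypothetical records; "last_updated" is an illustrative field name.
records = [
    {"title": "Stream gauges", "last_updated": date(2001, 5, 1)},
    {"title": "Land cover", "last_updated": date(2012, 9, 15)},
]
today = date(2014, 2, 1)
needs_review = [r["title"] for r in records if is_stale(r["last_updated"], today)]
```

A flagged record is only a prompt for review. Whether old data are still valid depends on the source data and methods, as the slide notes.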
Discovery: data reuse
If you create metadata, other people can discover your data
If you create metadata, you can find your own data
CC image by Oceanit Daily Photo on Flickr
Management: data discovery & reuse
Find your data by:
o themes / attributes
o geographic location
o time ranges
o analytical methods used
o sources & contributors
o data quality
Discoverable data is usable data!
CC image by NASA Goddard Space Flight Center on Flickr
Management: data accountability
Metadata allows you to repeat scientific process if:
o methodologies are defined
o variables are defined
o analytical parameters are defined
Metadata allows you to defend your scientific process:
o demonstrate process
o increasingly GIS-savvy public requires metadata for consumer information
Figure labels: INPUT, RESULTS
Management: data accountability
Metadata is an exercise in data accountability. It requires you to assess:
o What do you know about the dataset?
o What don’t you know about the dataset?
o What should you know about the dataset?
Are you willing to associate yourself with the metadata record?
Management: data liability
Metadata is a declaration of:
Purpose
o the originator’s intended application of the data
Use Constraints
o inappropriate applications of the data
Completeness
o features or geographies excluded from the data
Distribution Liability
o explicit liability of the data producer and assumed liability of the consumer
What to do…
What not to do…
Review: the utility of metadata
Metadata can support:
Data distribution
o discovery
o metadata publication
o data portals
Data management
o maintenance & update
o discovery & reuse
o data accountability
o data liability
[Project management]
Choosing Metadata Standards
Image courtesy of Viv Hutchinson
Multiple standards exist
Darwin Core | biological diversity, taxonomy
Dublin Core | general
DDI (Data Documentation Initiative) | social & behavioral sci.
DIF (Directory Interchange Format) | environmental sci.
EML (Ecological Metadata Language) | ecology, biology
ISO 19115 | geographic data
Browse by discipline: http://www.dcc.ac.uk/resources/metadata-standards
Comparing metadata standards
EML | FGDC
Title | Title
Abstract | Abstract
Entity Description | Entity Type Definition
Intellectual Rights | Use Constraints
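Because the two standards collect similar information, a record can be translated from one to the other with a crosswalk. The sketch below uses only the four pairs from the comparison above, keyed by the human-readable labels on the slide. Real EML and FGDC records use each standard's own XML element names, and the `crosswalk` helper and the example record are hypothetical.

```python
# Element pairs taken from the comparison above (EML label -> FGDC label).
EML_TO_FGDC = {
    "Title": "Title",
    "Abstract": "Abstract",
    "Entity Description": "Entity Type Definition",
    "Intellectual Rights": "Use Constraints",
}

def crosswalk(record, mapping=EML_TO_FGDC):
    """Rename mapped fields; collect unmapped ones so nothing is silently dropped."""
    converted, unmapped = {}, {}
    for name, value in record.items():
        if name in mapping:
            converted[mapping[name]] = value
        else:
            unmapped[name] = value
    return converted, unmapped

# Hypothetical record keyed by the EML labels from the slide.
converted, unmapped = crosswalk({
    "Title": "Frog observation temperatures",
    "Intellectual Rights": "CC BY 4.0",
    "Methods": "Handheld thermometer",
})
```

Returning the unmapped fields separately makes it visible when a translation between standards would lose information.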
Choosing a metadata standard
Many standards collect similar information
Factors to consider:
1. Your data type
  • raster/vector GIS data, images, surveys/text, etc.
2. Organization [funder] policies
3. Future preservation/sharing location
4. Tools to support creation & distribution
5. Other factors: availability of human support; instructional materials; use of controlled vocabularies; output formats
Summary
o Metadata is documentation of data
o A metadata record captures critical information about the content of a dataset
o Metadata allows data to be discovered, accessed, and re-used
o A metadata standard provides structure and consistency to data documentation
o Standards and tools vary; select according to defined criteria such as data type, organizational guidance, and available resources
o Metadata is of critical importance to data developers, data users, and organizations
o Metadata can be effectively used for:
  • data distribution
  • data management
  • project management
o Metadata completes a dataset.
Creating robust metadata is in your OWN best interest!
On Thursday
Barnard Classroom, 5th Floor