Research Data Management BExIS Hands-On Workshop Tayebeh Kiani, Javad Chamanara February 2016 Tehran, Iran

Upload: javadch

Post on 10-Jan-2017

Research Data Management

BExIS Hands-On Workshop

Tayebeh Kiani, Javad Chamanara

February 2016

Tehran, Iran

BExIS Hands-On Workshop, Feb. 2016, Tehran, Iran

Scientific Data

“The recorded information (regardless of the form or the media in which they may exist) necessary to support or validate a research project’s observations, findings or outputs.”

-University of Oxford

A Usual Research Cycle

Data Intensive Research (DIR)

Considerations of DIR

• Data Acquisition
• Data Processing/Analysis
• Result reproduction
• Availability of data
• Teamwork and data sharing
• Digital rights
• Referencing and citation

Data Management is needed

Need for Data Management

“Data management refers to all aspects of creating, housing, delivering, maintaining, and archiving and preserving data. It is one of the essential areas of responsible conduct of research.”

-MANTRA 2013

DM is done through a Lifecycle

• Boston University
• The University of Alabama
• The University of Virginia
• DataONE
• The U.S. Geological Survey

In This Workshop

• Data and File Formats
• Data Standards
• Data Access Policies
• Data Management Plan
• Data Preservation Plan
• Data Retirement
• Quality Level

• Hardware
• Software
• Cost/Funding
• Technical Staff
• Tools:
  – https://dmptool.org/
  – https://dmponline.dcc.ac.uk

• By means of:
  – Collecting new data
  – Updating existing data
  – Converting/transforming existing data
  – Purchasing/obtaining data

• Either manually or automatically
• In the laboratory, in the field, or by computation
• Following methodologies, standards, recommendations
• Satisfying constraints such as access policies

• To prepare data for subsequent use:
  – Verify
  – Organize
  – Transform
  – Integrate and extract

• Tools:
  – OpenRefine (formerly Google Refine)
  – Statistical software: R, SAS
  – Modeling tools: …
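
The verify, transform, and organize steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`station`, `temp_f`, `temp_c`), the sample rows, and the three helper functions are hypothetical and do not come from the workshop or from any of the tools listed.

```python
def verify(rows):
    """Verify: keep rows that have a station name and a numeric temperature."""
    good = []
    for row in rows:
        try:
            float(row["temp_f"])
        except (KeyError, TypeError, ValueError):
            continue  # temperature missing or not numeric
        if row.get("station"):
            good.append(row)
    return good


def transform(rows):
    """Transform: normalize station names and convert Fahrenheit to Celsius."""
    return [
        {
            "station": row["station"].strip().upper(),
            "temp_c": round((float(row["temp_f"]) - 32) * 5 / 9, 1),
        }
        for row in rows
    ]


def organize(rows):
    """Organize: sort by station, then by temperature."""
    return sorted(rows, key=lambda row: (row["station"], row["temp_c"]))


# Hypothetical raw input, as it might arrive from a spreadsheet export.
raw = [
    {"station": " b2 ", "temp_f": "50"},
    {"station": "a1", "temp_f": "n/a"},  # fails verification: not numeric
    {"station": "a1", "temp_f": "32"},
    {"station": "", "temp_f": "41"},     # fails verification: no station
]

clean = organize(transform(verify(raw)))
print(clean)
```

Keeping each step as a separate function makes it possible to rerun or replace one step without touching the others, which also helps result reproduction.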

• Describe facts
• Detect patterns
• Develop explanations
• Test hypotheses
• This includes:
  – Data quality assurance
  – Statistical data analysis
  – Modeling
  – Interpretation of analysis results

• The need for:
  – Supporting research publications with associated, accessible datasets
  – Reusability by others

• Actions and procedures to:
  – Keep data for some period of time
  – Set data aside for future use
  – Archive data in a data repository

• Considering:
  – Discovery
  – Identification
  – Reproduction/presentation
  – Policies

• Disseminate quality data to the public and to other agencies
• Medium- and agent-independent
• Via automated or non-automated mechanisms
• Shared, but with controls
• Useful metadata

• What to publish:
  – The research result citing the data
  – A data paper describing the data
  – The data itself

• Where to publish:
  – Catalogs
  – Portals
  – Repositories
  – National archives

• Considerations:
  – Licensing and rights
  – Cost
  – Sensitive data
  – Anonymization

“Metadata is information about the context, content, quality, provenance, and/or accessibility of a set of data.”

-Digital Curation at the University of Wisconsin-Madison

But why is it needed?

Figure (Michener et al. 1997): data details (vertical axis) plotted against time (horizontal axis), annotated with:

• Time of data development
• Specific details about problems with individual items or specific dates are lost relatively rapidly
• General details about datasets are lost through time
• Accident or technology change may make data unusable
• Retirement or career change makes access to “mental storage” difficult or unlikely
• Loss of data developer leads to loss of remaining information

• Formally describes the key attributes of each data element or collection of elements
• Helps maintain data quality
• Makes use of the data possible, or at least easier
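
As a small illustration of describing a dataset formally, the sketch below models a metadata record whose fields follow the five aspects named in the quoted definition (context, content, quality, provenance, accessibility). The class name, the helper function, and the sample values are hypothetical; this is not a BExIS metadata schema.

```python
from dataclasses import dataclass, fields


@dataclass
class MetadataRecord:
    """One free-text field per aspect named in the metadata definition."""
    context: str = ""
    content: str = ""
    quality: str = ""
    provenance: str = ""
    accessibility: str = ""


def missing_fields(record):
    """Return the names of the aspects that have not been described yet."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


# Hypothetical record that is only partly filled in.
record = MetadataRecord(
    context="Soil survey, spring campaign",
    content="Hourly soil moisture readings",
    provenance="Collected by the field team with handheld probes",
)

print(missing_fields(record))
```

A check like `missing_fields` can run at submission time, while the people who know the details are still available to fill the gaps.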

• QA focuses on building in quality to prevent defects
  – Setting the quality level
  – Setting standards
  – Proper protocols and methods for:
    • Data collection
    • Data processing and usage
    • Maintenance

• QC focuses on testing for quality (defect detection)
  – Acceptance criteria
  – Automatic QC upon data manipulation
  – Configuring/testing instruments
  – Unit of measurement, accuracy, conversion errors, …
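
The QC items above (acceptance criteria, automatic checks, units of measurement) can be sketched as a small rule check. Everything here is an assumption made for illustration: the `AcceptanceCriterion` class, the variable names, the unit labels, and the numeric ranges are hypothetical and not taken from the workshop.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcceptanceCriterion:
    variable: str
    unit: str
    minimum: float
    maximum: float


def quality_check(observation, criteria):
    """Return a list of defect messages; an empty list means accepted.

    `observation` maps a variable name to a (value, unit) pair.
    """
    defects = []
    for c in criteria:
        if c.variable not in observation:
            defects.append(f"{c.variable}: missing")
            continue
        value, unit = observation[c.variable]
        if unit != c.unit:
            defects.append(f"{c.variable}: unit {unit!r}, expected {c.unit!r}")
        elif not (c.minimum <= value <= c.maximum):
            defects.append(f"{c.variable}: {value} outside [{c.minimum}, {c.maximum}]")
    return defects


# Hypothetical acceptance criteria and observations.
criteria = [
    AcceptanceCriterion("temperature", "degC", -40.0, 60.0),
    AcceptanceCriterion("humidity", "%", 0.0, 100.0),
]
ok = {"temperature": (22.0, "degC"), "humidity": (46.0, "%")}
bad = {"temperature": (71.6, "degF"), "humidity": (120.0, "%")}

print(quality_check(ok, criteria))
print(quality_check(bad, criteria))
```

Running such a check on every data manipulation is one way to read "automatic QC": defects are reported when they are introduced rather than discovered later during analysis.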

• Protect data from:
  – Loss
  – Corruption
  – Unauthorized access

• Regular backups
• Regular restores
• Proper structure and naming
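
A backup is only useful if the copy can be read back unchanged. The sketch below copies a file and compares SHA-256 checksums of the original and the copy, using only the Python standard library. The file name, directory layout, and function names are hypothetical; a real setup would also keep copies on separate storage and test full restores.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_and_verify(source, backup_dir):
    """Copy `source` into `backup_dir` and confirm the copy is identical."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / Path(source).name
    shutil.copy2(source, target)
    return sha256_of(source) == sha256_of(target)


# Demonstration in a temporary directory with a hypothetical data file.
with tempfile.TemporaryDirectory() as workdir:
    data_file = Path(workdir) / "2016-02_soil-moisture_v1.csv"
    data_file.write_text("station,moisture\nA1,12\n")
    verified = backup_and_verify(data_file, Path(workdir) / "backup")

print(verified)
```

The file name in the example puts date, content, and version in a fixed order, which is one simple reading of "proper structure and naming".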

Feedback

The first part is over. Thank you.

Note!

There are some suggestions for cooperation at the end of the workshop

The Workshop

• BExIS
  – Data Lifecycle Management
  – Generic
  – Extensible
  – Portable
  – Scalable

• Flexible Data Structures
• Data Submission
• Validation
• Preserving
• Metadata Management
• Versioning

BExIS Core Concepts

Diagram components, linked by «use» dependencies:

• Data
• Metadata
• Data Structure
• Metadata Structure
• Semantics
• Geo
• Administration
• Security

BExIS Core Concepts (cont.)

• Search
• Publishing
• CM
• Land Use
• Reservation
• Data Submission

Preparation

• URL: bx2train.inf-bb.uni-jena.de
• Demo: http://bexis2.vmguest.uni-jena.de/
• Source: http://fusion.cs.uni-jena.de/bexis

The Scenario

• Registration/logging in
• Seeing the data and metadata structures
• Downloading a template
• Filling in the Excel data (sample datasets)
• Uploading the datasets
• Providing metadata
• Checking validations
• Seeing the dataset in the system
• Searching, etc.
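
To give a feel for what checking validations means in the scenario above, the following sketch validates one uploaded row against a declared data structure. It is a generic illustration, not the BExIS validation logic or API: the structure, the variable names, and the function are all hypothetical.

```python
# Hypothetical data structure: variable name -> type each value must convert to.
STRUCTURE = {"station": str, "depth_cm": int, "moisture": float}


def validate_row(row, structure):
    """Return a list of validation errors for one row; empty means valid."""
    errors = []
    for name, expected in structure.items():
        if name not in row:
            errors.append(f"missing variable {name!r}")
            continue
        try:
            expected(row[name])  # e.g. int("10") succeeds, int("ten") fails
        except (TypeError, ValueError):
            errors.append(f"{name!r}: cannot read {row[name]!r} as {expected.__name__}")
    for name in row:
        if name not in structure:
            errors.append(f"unexpected variable {name!r}")
    return errors


# Hypothetical rows, as they might be read from a filled-in template.
good = {"station": "A1", "depth_cm": "10", "moisture": "12.5"}
bad = {"station": "A1", "depth_cm": "ten", "extra": "x"}

print(validate_row(good, STRUCTURE))
print(validate_row(bad, STRUCTURE))
```

Filling in a template downloaded from the system serves the same purpose from the other side: the column names already match the data structure, so fewer rows fail such checks.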

Registration

Logging In

Example Datasets

– Tectonic Stress Fields on BExIS website
– International Seismological Centre
– Data type: focal mechanism

Creating a Data Structure

Downloading a Template

BExIS Dataset

Data Structure

The variables S.N., Tmp, Time, S.M., Depth, Pos., Hu.

Observations (Tuples)

S.N.  Tmp  Time  S.M.  Depth  Pos.  Hu.
14    22   1     12    -10    A     46
13    23   2     10    -10    B     45
16    21   3     12    -11    C     30
16    18   5     15    -10    A     25
18    14   6     17    -9     D     25

Views

Tmp  Time  Hu.
22   1     46
23   2     45
21   3     30
18   5     25
14   6     25

S.N.  S.M.  Depth
14    12    -10
13    10    -10
16    15    -10
18    17    -9

Variable properties (shown for Variable 1 and Variable 2)

Error: ±0.10%
Interval: 1 Sec.
Rounded: Yes

Amendments and Extended Properties

Values attached in the figure: 78, Green, 0.11, Yes, 100, EP

Globalization Info

English: Soil Moisture
German: Bodenfeuchte
Persian: رطوبت خاک
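
The relationship between a data structure, its tuples, and the views on this slide can be sketched as a projection over a list of tuples. The values are the ones shown on the slide; the `view` function and the row filter `Depth >= -10` are assumptions chosen so that the second view reproduces the four rows in the figure. This is not how BExIS implements views.

```python
# Variables of the data structure, in column order, as shown on the slide.
VARIABLES = ("S.N.", "Tmp", "Time", "S.M.", "Depth", "Pos.", "Hu.")

# One tuple per observation, values taken from the slide.
TUPLES = [
    (14, 22, 1, 12, -10, "A", 46),
    (13, 23, 2, 10, -10, "B", 45),
    (16, 21, 3, 12, -11, "C", 30),
    (16, 18, 5, 15, -10, "A", 25),
    (18, 14, 6, 17, -9, "D", 25),
]


def view(variables, tuples, selected, keep=lambda row: True):
    """Project tuples onto the selected variables, optionally filtering rows."""
    index = [variables.index(name) for name in selected]
    result = []
    for values in tuples:
        row = dict(zip(variables, values))
        if keep(row):
            result.append(tuple(values[i] for i in index))
    return result


first_view = view(VARIABLES, TUPLES, ("Tmp", "Time", "Hu."))

# Assumed filter: the slide does not say why one row is absent from this view.
second_view = view(
    VARIABLES, TUPLES, ("S.N.", "S.M.", "Depth"),
    keep=lambda row: row["Depth"] >= -10,
)

print(first_view)
print(second_view)
```

Because a view only selects from the stored tuples, the underlying dataset stays unchanged no matter how many views are defined on it.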

Creating a Dataset

Choosing a Metadata Scheme

Entering Metadata

Submitting the Metadata

Adding Data to the Dataset

Choosing the Dataset’s File

Selecting the Online Dataset

Validating the Data

Submitting the Data

Searching for Datasets

Dataset Details

Acknowledgment

Thank you:

• Workshop participants
• Martin Hohmuth
• Nafiseh Navabpour
• Roman Gerlach

Contact: [email protected]
bexis2.uni-jena.de

BEXIS Tech Talk #2: The Conceptual Model

Suggestions

• Data Lifecycle survey
  – List of lifecycles
  – Their features/domain of application
  – Strengths/weaknesses

• GSI data lifecycle
  – Best of all
  – Customizable
  – …