immersive informatics - research data management at pitt ischool and carnegie mellon university...
TRANSCRIPT
ImmersiveInformatics -
RDM at Pitt iSchool
Library Research Seminar VI, Illinois, October 2014
Professor Liz Lyon, School of Information Sciences,
University of Pittsburgh
http://www.flickr.com/photos/thinkmulejunk/352387473/
http://www.google.co.uk/imgres?q=illumina+bgi&hl=en&client=firefox-a&hs=Jl2&rls=org.mozilla:en-GB:official&biw=1366&bih
http://www.flickr.com/photos/wasp_barcode/4793484478/
http://www.flickr.com/photos/charleswelch/3597432481/
http://www.flickr.com/photos/usfsregion5/4546851916/
Data
...
evidence, reproducibility,
curation, stewardship
Implications of
“Big Data” and
data science for
organisations in
all sectors
Predicts a
shortage of
190,000
data scientists
by 2019
http://www.mckinsey.com/Insights/MGI/Research/Technology_and_Innovation/Big_data_The_next_frontier_for_innovation
Flavours of
data scientist (Lyon 2012)
• data engineer - focus on software
development, coding,
programming, tools
• data analyst – focus on
business/scientific analytics and
statistics e.g. R, SAS, Excel to
support researchers and modellers,
business
• data librarian – focus on
advocacy, research data
management / informatics in a
university / institute
• data steward – focus on long term
digital preservation, repositories,
archives, data centres
• data journalist – focus on telling
stories and news
New roles
New skills
…data librarian, research data services manager, data
scientist, technical data co-ordinator, data curator, data
analyst, data steward, chief data officer....
http://www.ala.org/acrl/sites/ala.org.acrl/files/content/publications/whitepapers/Tenopir_Birch_Allard.pdf
“Very few librarians are
likely to have specialist
scientific or medical
knowledge - if you train as
a research scientist or a
medic, you probably won’t
become a librarian.”
RLUK/Mary Auckland: Reskilling for Research 2012
https://www.flickr.com/photos/23312112@N04/9108008669/in/photolist-eSQVEe-9dju9H-f3rUKg-h2kzk1-8nbYbH-apTJmZ-a8CapW-ahsWNa-a8CXWA-6SEFzn-7DMRoo-4Zudts-TpNW5-4jh869-2MhStX-8tqtLd-8XQvXo-8s9Uup-6GB7QU-995KZ7-7uG7vL-9mgxCa-6qpip1-77mSoG-7LGBw4-at7uYC-ghoME2-jfbsjF-8rHmyB-khyact-7gWFgw-968oHa-i8gDyd-jrvv7v-hu8KBH-5X3pmT-8LBseT-dMXgXq-fmNe5N-dNuWKB-dNMuxP-brmocc-b9djCp-yKP5Q-dsQ5xw-9HtSRR-eRVdi8-5BncXP-apYD4x-6gQ4mJ-4N6WQv
Data curation: domain disconnect?
How best to make the domain connection?
ImmersiveInformatics pilot development 2013
http://immersiveinformatics.org/
Co-developed UKOLN Informatics +
University of Melbourne
Focus on work-based RDM training
IDCC14 paper
Librarians &
researchers mix
10 modules
Immersive data
sessions in labs
Co-curate dataset
Keep “data diary”
Positive evaluation
Next step: Bring the
immersiveinformatics
model to iSchool data
education programs
• Visiting Professor @Pitt from January 2014
• Spring Semester – new Research Data
Management course first run as a Doctoral
Seminar Program Special Topic
Methodology
• 12 student participants for immersive session
scheduled for Week 8 Weds 26 February
• Doctoral students from Pitt (2), practicing
librarians from University of Pittsburgh (2) and
from Carnegie Mellon University (8)
• Lab placements set up by email / phone via
contacts and recommendations
• Immersive session for up to 3 hours in the lab
• Students work in pairs with a researcher
• Briefing note sent to Pitt faculty/researchers
• Students briefed during the RDM course
RDM Course @ Pitt iSchool
1. Introductions & Overview
2. Data Landscape
3. Universities & Data
4. Data Requirements &
Capability
5. RDM Roadmaps, Strategy,
Services & Structures
6. Data Management Plans
7. No Class – Fall Break
8. Immersive session
with Researchers
9. Disciplinary Data 1
10. Legal & Ethical Issues
11. Disciplinary Data 2
12. Data Centers
13. Data Advocacy, Skills,
Training
14. Data Sustainability &
Costs
15. Presentations
Immersive Unit Objectives
Students will be able to:
• Observe research data practice “at the coalface” in
a selected discipline or sub-discipline
• Learn about disciplinary data creation, capture,
collection, manipulation, analysis etc.
• Understand data methodologies, tools, protocols,
instrumentation, workflows etc.
• Build first-hand experience of the day-to-day data
challenges and constraints for researchers
• Begin to provide RDM advocacy, advice and
guidance to researchers
Fall Semester 2014 RDM & RDI
• Research Data Management run as an MLIS
Masters course
• New Research Data Infrastructures (RDI)
Doctoral Seminar Program Special Topic
• Student participants (total = 9) include
librarians from Pitt and CMU
• Immersive session in Week 8 (RDM) and
Week 7 (RDI), up to 3 hours in the lab
Research Data Infrastructures
1. No class Labor Day
2. Introductions, Syllabus
Overview & Data
Storage Part 1
3. Data Storage Part 2
4. Data Publication &
Citation Part 1
5. Data Publication &
Citation Part 2
6. Data Discovery
7. Immersive session
with Researchers
8. Disciplinary Data 3
9. Data Standards
10. Data Repositories
11. Data Preservation
(Long-term)
12. Citizen Science,
Citizen Data
13. Data Science
14. Data, Society,
Futures
15. Presentations &
Summary Evaluation
Evaluation feedback
• Collected from faculty and researchers via
1 hour focus group in department
– Semi-structured interview approach
• Collected from iSchool students via
questionnaire completed in class
– What worked well?
– What didn’t work at all / less well?
– What did you learn?
– How were Timings? Environment?
– How can the placements be improved?
Student feedback
“It was great to see a real-life example of how
a lab generates and uses data.”
“We learned not only about the specifics of
their research but about the lifecycle of data.”
“This was a valuable experience. It was very
practical and illuminated some of the struggles
that one may encounter in discussing data as
its own area of research.”
Faculty / Researcher feedback
“We talked about the project, I took them to the lab,
showed them cells, raw data, calculations, final
data, data which is stored and shared with the PI,
details kept in notebook, reagents, primers,
antibodies, PubMed, gene databases”
“Explaining what one does to a new person is
instructive, since it shows you what you do not
understand and cannot explain. Discussion with the
(LIS) student exposed some weaknesses in my
own thinking”
Process / methodology feedback
• Fall Semester RDM & RDI courses will have:
– more background information for faculty, e.g.
a proposed agenda for the session
– more guidance for students, e.g. suggested
questions for faculty, topics to explore
– Class debrief sessions
“More communication needed beforehand –
context, agenda” (Faculty)
“a debriefing to compare notes either in the pairing
or as a larger group” (Student)
A centre of expertise in digital information management
• It is likely that the way that researchers publish, assess
impact, communicate, and collaborate will change more
within the next 20 years than it did in the past 200 years.
http://book.openingscience.org/
Useful knowledge
Sharable knowledge
Research
collaboration is
associated with high
academic and wider
impact
International
collaboration is
associated with high
academic impact
Data can be shared
easily across borders
More data will be created in the next five years than has been collected in the whole of human history. Properly managed, this data will form a major resource for Australian researchers.
Why Data Management Services?
"The Board believes that timely attention to digital
research data sharing and management is fundamental
to supporting U.S. science and engineering in the twenty-
first century.
...strong and sustainable data sharing and management
policies [are] a critical national need."
Digital Research Data Sharing and Management
December 2011
Task Force on Data Policies
Committee on Strategy and Budget
National Science Board
• The rapid development in computing
technology and the Internet have
opened up new applications for the
basic sources of research — the
base material of research data —
which has given a major impetus to
scientific work in recent years.
• Access to research data increases
the returns from public investment in
this area; reinforces open scientific
inquiry; encourages diversity of
studies and opinion; promotes new
areas of work and enables the
exploration of topics not envisioned
by the initial investigators.
• The value of data lies in their use.
Full and open access to scientific
data should be adopted as the
international norm for the exchange
of scientific data derived from publicly
funded research.
Institutions are to retain
research data, provide
secure data storage,
identify ownership, and
ensure security and
confidentiality of research
data
Researchers are to retain
research data and primary
materials, manage storage
of research data and
primary materials, maintain
confidentiality of research
data and primary materials.
“The Holdren Memo”
To achieve the Administration’s commitment to
increase access to federally funded published
research and digital scientific data, Federal agencies
investing in research and development must have
clear and coordinated policies for increasing
such access.
Memo on Increasing Access to the Results of
Federally Funded Scientific Research
White House Office of Science and Technology
Policy
February 22, 2013
Current priorities in academic
libraries
1. Continue and complete migration from print
to electronic and realign service operations
2. Retire legacy collections
3. Continue to repurpose library as primary
learning space
4. Reposition library expertise and resources to
be more closely embedded in research and
teaching enterprise outside library
5. Extend focus of collection development from
external purchase to local curation
Lewis (2007); Webster (2010, 2012)
• The part that academic
librarians should play
remains unclear
• Raise awareness of
eResearch amongst
library staff
• Provide advice on data
management to
eResearchers
• Data curation is vast,
complex and requires
subject input
• “The bad news is that I’m not sure they understand what goes on in the library other than taking out books.”
Benton Foundation, 1996
• “User perceptions negatively affect the ability of librarians to meet information needs simply because a profession cannot serve those who do not understand its purpose and expertise.”
Durrance, 1988
The worst thing about
the stereotype is that it
impacts on the psyche
of librarians who really
begin to believe that
they don't deserve the
kingpin role
US Congress, 2001
CORE SCHEMA, Body of Professional Knowledge, CILIP, 2004
Collections grid
Axes: stewardship (high, low); uniqueness (low, high)
Books
Journals
Newspapers
Gov. docs
CD, DVD
Maps
Scores
Special collections
Rare books
Local/Historical newspapers
Local history materials
Archives & Manuscripts,
Theses & dissertations
Research, learning and
administrative materials:
• ePrints/tech reports
• Learning objects
• Courseware
• E-portfolios
• Research data
• Institutional records
• Reports, newsletters, etc
Freely-accessible web resources
Open source software
Newsgroup archives
http://www.slideshare.net/lisld/collections-grid
Librarians’ competencies profile for RDM
Key roles
• Providing access to data
– Identification of data sets; discovery and analytic
tools; advice on informatics
• Advocacy and support for managing data
– Policy development; articulating benefits; promoting
data sharing and reuse; education and training; data
audits
• Managing data collections
– Preparing for data deposit; appraisal; selection;
ingestion; curation; preservation; storage and backup
Librarians’ competencies profile for RDM
Core competencies
• Providing access to data
– Data centres and repositories; organization and
structure of data; licensing and IP; manipulation and
analysis
• Advocacy and support for managing data
– Research funder mandates; DMP; research
workflows; disciplinary norms; journal requirements;
data audit and assessment tools
• Managing data collections
– Metadata; discovery tools and indexing; database
design; data linking; forensic procedures in data
curation
Data Management at
CMU Timeline
July 2013
September 2013
November
Dean appointed Data Management Services Group
DM Librarian appointed
December 2013
January 2014
February 2014
Initial presentation to Faculty Senate
Faculty Senate resolution
CLIR Data Curation Fellows
March 2014
April 2014
May 2014
Draft detailed strategy
Initial consultation
First ‘graduates’ from LIS2975
What might our service offer?
• Teaching or doing?
• Compliance or support?
• Storage or registering?
• Policy advice vs policy development
• Institution-wide or in response to requests?
• Advising on data re-use (sources, analysis
etc)
uqkeithw
Keith Webster
cmkeithw
Keith Webster