
Data and Society Lecture 7: Data in the Global Landscape

4/3/15

Announcements

• We are now in Section 3! Come talk to me if you plan on turning in an extra credit op-ed.

• Papers / mini-proposals due today. Please send .pdf to [email protected] now if you haven’t done it already.

• Guest lecture next week by RPI Professor Bulent Yener on Data and Privacy

• Lecture 9 will be on “Digital Rights and Regulation” instead of “Data and the Workforce”

• Mike Schroepfer, Facebook CTO, is coming April 24. We will have a bigger room in Walker 5113. The class will be open to more people but you get “first dibs”. Feel free to bring 1+ friends to this class and let me know how many you are bringing.

• Data Roundtables for the rest of the semester will be on L7, L8, L10.

– THERE IS TIME FOR DO-OVERS. Let me know if you would like to do an additional Data Roundtable. Your grade will be computed from the best 3 out of 4.

Today (4/3/15)

• Lecture 7: Data in the Global Landscape

– Digital Rights in Europe

– Health Data in Iceland

– Competition and collaboration in Japan

– International Efforts -- The Research Data Alliance

• L6 Data Roundtable (Lars, Sumit, Karl, Dennis)


You are here

Date | First “half” | Second “half”

Section 1: The Data Ecosystem -- Fundamentals

January 30 | Class introduction; Digital data in the 21st Century (L1) | Data Roundtable / Fran

February 6 | Data Stewardship and Preservation (L2) | L1 Data Roundtable / 5 students

February 13 | Data and Computing (L3) | L2 Data Roundtable / 6 students

February 20 | Colin Bodel, Time Inc. CTO Guest Lecture and Q&A | L3 Data Roundtable / 5 students

Section 2: Data and Innovation – How has data transformed science and society?

February 27 | Section 1 Exam | Data and the Health Sciences (L4)

March 6 | Paper preparation / no class

March 13 | Data and Entertainment (L5) | L4 Data Roundtable / 6 students

March 20 | Big Data Applications (L6) | L5 Data Roundtable / 5 students

Section 3: Data and Community – Social infrastructure for a data-driven world

April 3 | Data in the Global Landscape (L7); Section 2 paper due | L6 Data Roundtable / 4 students

April 10 | Bulent Yener Guest Lecture, Data Privacy / Bad guys on the Internet (L8) | L7 Data Roundtable / 4 students

April 17 | Digital Rights and Regulation (L9) | L8 Data Roundtable / 4 students

April 24 | Mike Schroepfer, Facebook CTO Guest Lecture and Q&A

May 1 | Data Futures (L10) | L10 Data Roundtable / 4 students

May 8 | Section 3 Exam | Data Roundtable as needed


Lecture 7: Data in the Global Landscape

Perspectives on digital data vary globally

• “Social infrastructure” around rights, privacy, and data sharing varies around the world

– Complex interaction of data potential, privacy, policy, innovation driving critical national conversations

• Even for scientific communities, different national approaches to data sharing, stewardship, preservation, responsibility for support

• At the same time, there is universal recognition of the importance of digital data as a driver for innovation and progress

– each nation finding their own solutions to common, fundamental problems within their own cultures

Digital Rights in Europe

European Union (EU) Digital Agenda

• Overall aim is to deliver sustainable economic and social benefits to Europeans from information and communication technologies.

– Europe perceives itself as lagging behind in terms of use and deployment of IT

• EU launched Europe 2020 strategy in March 2010. Digital Agenda for Europe is one of the 7 flagship initiatives of the Europe 2020 Strategy.

EU Data Challenges

• Fragmented digital markets

– 27 countries in EU, much variation between content, services, and infrastructure across borders; unification difficult

• Lack of interoperability

– “weaknesses in standard setting”, difficulty in coordination

• Rising cybercrime and risk of low trust in networks

• Lack of investment in networks

• Insufficient research and innovation efforts

• Lack of digital literacy and skills

• Missed opportunities in addressing societal challenges

EU Digital Agenda Action Areas 1

• Single digital market

– Want to unify telecom, services, rules, and content

– Rights and protection for consumers and businesses when doing business on-line

• Interoperability and Standards

• Trust and security

– “Europeans will not embrace technology they do not trust – the digital age is neither ‘big brother’ nor ‘cyber wild west’.” (Digital Agenda for Europe, COM(2010) 245, 19.05.2010)

EU Digital Agenda Action Areas 2

• Fast and ultra fast internet access

– Universal broadband coverage, open and neutral internet

• Research and Innovation

– Leverage private investment and accelerate innovation

– Increase digital literacy, skills and services

• ICT-enabled benefits for EU society

– ICT-enabled energy, environment, health care, independent living, cultural diversity / arts, e-government, transportation.

Code of EU on-line rights

• Rights and Principles applicable when you access and use online services

– “Universal” access to electronic communication networks and services

– Access to services and applications of your choice

– Non-discrimination when accessing services provided online

– Privacy, protection of personal data and security

• Rights and Principles applicable when you buy goods or services online

– Information prior to the conclusion of a contract

– Timely, clear and complete contractual information

– Fair contract terms & conditions

– Protection against unfair practices

– Delivery of goods and services without defects and in good time

– Withdrawal from a contract

• Rights and Principles protecting you in case of conflict

– Access to justice and dispute resolution

Europe vs. Facebook

• Advocacy group started by Austrian university student (Max Schrems) grew to a grass-roots movement of 25,000+ people

• Issue is potential violation of EU data protection law due to personal data collected by Facebook, etc.

• Schrems filed a complaint with Irish Data Protection Commissioner alleging 22 violations of European law

Europe vs. Facebook, 2011/2012

• Schrems claimed that Facebook collected data he never consented to provide: physical location, data he had deleted, etc. Schrems started the “Europe vs. Facebook” movement and 25,000+ other users also requested FB data.

– Legal case has been crowdfunded …

• Irish Data Protection Commissioner (DPC) started investigation

– Complaints filed in Ireland because European users have a contract with “Facebook Ireland Ltd”. Under European law, Facebook Ireland is the “data controller” for facebook.com and therefore facebook.com is governed by European data protection laws.

• Schrems eventually recovered 1,222 pages of material in 57 data categories from FB in 2011

– Schrems claims that Facebook did not provide all data and that Facebook holds at least 84 data categories about every user.

• FB developed a download tool to provide users a quick overview of the data being kept on file. FB also agreed to cut the amount of time it retains data on user activities to less than one year.

• Irish Data Protection Commissioner refused to investigate complaints against Facebook and Apple in summer 2013. Claimed that the legal view in the complaints was frivolous.

Current Status

• Schrems lost claim against regulator in Irish High Court, but the Judge asked the European Court of Justice to examine whether Ireland’s data watchdog is bound by “safe harbor” and whether an investigation should be launched

– Wikipedia: A safe harbor is a provision of a statute or a regulation that specifies that certain conduct will be deemed not to violate a given rule. … The EU Data Protection Directive is an example of a safe harbor law. It sets comparatively strict privacy protections for EU citizens. It prohibits European firms from transferring personal data to overseas jurisdictions with weaker privacy laws, but creates exceptions where the foreign recipients have voluntarily agreed to meet EU standards under the Directive's Safe Harbor Principles.

• Schrems looking for a declaration that the safe harbor designation for Facebook under EU law should be cancelled and that the Irish Data Protection Commissioner should audit the exchange of information rather than allow it to continue unexamined.

• Still ongoing: Case heard on 3/24/15, judgement expected in a few months …

Safe Harbor Principles:

• Notice - Individuals must be informed that their data is being collected and about how it will be used.

• Choice - Individuals must have the option to opt out of the collection and forward transfer of the data to third parties.

• Onward Transfer - Transfers of data to third parties may only occur to other organizations that follow adequate data protection principles.

• Security - Reasonable efforts must be made to prevent loss of collected information.

• Data Integrity - Data must be relevant and reliable for the purpose it was collected for.

• Access - Individuals must be able to access information held about them, and correct or delete it if it is inaccurate.

• Enforcement - There must be effective means of enforcing these rules.

Wikipedia

Health Data in Iceland

Icelanders

• Iceland has a population of ~326,000 and is the most sparsely populated country in Europe.

• Iceland provides universal health care to its citizens and spends a fair amount on health care, ranking 11th in health care expenditures as a percentage of GDP and 14th in spending per capita.

– Health care system is ranked 15th in performance by the World Health Organization.

• Ethnically homogeneous. Most Icelanders descendants of Germanic and Gaelic (Celtic) settlers.

– 93% Icelandic

– 3.13% Polish

– 3.84% Other

• Iceland has extensive genealogical records dating back to the late 17th century and fragmentary records extending back to the 9th century.

Source: Wikipedia articles on Iceland, Icelanders

Whole Country Health Data

• 1996: deCODE Genetics founded to identify human genes associated with common diseases using population studies and apply the knowledge gained to guide the development of candidate drug treatments.

• Company worked with government on the Health Sector Database Act and created Icelandic Health Sector Database (HSD) which merged genealogical, genetic and health records for the entire population of Iceland

– Opt-out model of presumed consent

– Services and infrastructure developed as well for mining data in HSD

• deCODE data used for discoveries about genes that increase risk for kidney disease, cancer, lupus, vascular disease, schizophrenia, osteoporosis, etc.

– One result identified a gene that protects against Alzheimer’s

Controversy and Transition

• Health Sector Database very controversial because of privacy and consent issues

• Legal judgement from the Icelandic Supreme Court effectively killed off the HSD project in 2003

– Court case focused on the legal rights of a deceased Icelander, including the right not to participate. Legal issues included legal standing and personal rights of deceased individual, identifiability due to the richness of data

– Part of the problem was the original Health Sector Database Act which did not provide information and guidelines on how DB should be set up, who should run it, who should have access to the data, and what control Icelandic citizens should have over samples.

– Company believed it could continue to identify disease-related genes without the database

• Services and assets of deCODE went through many transitions:

– deCODE founded in 1996; filed for bankruptcy in 2009

– Saga Investments LLC purchased deCODE services and assets in 2010

– Amgen purchased deCODE in 2012, spun off NextCODE Health in 2013

– NextCODE acquired by WuXi PharmaTech in 2015

Precision Medicine, challenging ethics

• deCODE has collected full DNA sequences on 10,000 individuals.

– Because Icelandic population is so homogeneous, deCODE says it can extrapolate (“impute”) to accurately guess the DNA makeup of the other 330K citizens, including those who never participated in the studies.

• deCODE has identified mutations in BRCA2 that confer sharply increased risk of breast and ovarian cancers.

– deCODE’s data could identify 2K people with the gene mutation but there are legal and ethical issues that prevent deCODE from informing people who are at risk

– Inferences go “beyond informed consent”

• Ongoing issues: Development of large-scale data collections provides critical information for scientists, but development of accompanying privacy, ethics and health policy and regulation remains a problem for many countries

Moving Japan Towards Competitiveness through Collaboration and Data Sharing (All slides adapted from Nao Tsunematsu)

• Nao Tsunematsu – Senior Policy Advisor for Japan Science and Technology Agency

• Nao working to help Japan be more competitive in research and innovation

• Focusing on key drivers:

– Scientific leadership and collaboration

– New approaches to scientific innovation including data sharing

“To the greatest extent and with the fewest constraints possible publicly funded scientific research data should be open, while at the same time respecting concerns in relation to privacy, safety, security and commercial interests, whilst acknowledging the legitimate concerns of private partners.” (G8 + 5 Science Ministers)

Increasing focus on collaboration in science: From individual to team work

[Figure: N of Authors, by field: Agri. Science, Biology/Biochemistry, Chemistry, Clinical Medicine, Computational Sci., Economics/Business Administration, Engineering, Environment Studies/Ecology, Earth Science, Immunology, Material Science, Mathematics, Microbiology, Molecular Biology/Genetics, Neuro Science, Pharmacy/Toxicology, Physics, Botany/Zoology, Psychiatry/Psychology, Social Sciences, Space Science]

Source: “White Paper on Science and Technology 2014” by Ministry of Education, Culture, Sports, Science and Technology (MEXT)

Analysis: National Institute of Science and Technology Policy (NISTEP) “Science of Science, Technology, and Innovation Policy”

International Collaboration Increasing

Integer Count

• If a paper is coauthored by two American authors, U.S. gets “Count 1.”

• If a paper is coauthored by a Japanese author and an author in the U.S.A., each country gets “Count 1”
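
The integer count rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `integer_count` and the two sample papers are hypothetical and are not taken from the white paper. Each paper is represented as the list of its authors' countries, and every distinct country on a paper receives exactly one count, no matter how many of its authors are on that paper.

```python
from collections import Counter


def integer_count(papers):
    """Tally papers per country using the integer count rule.

    `papers` is an iterable of lists; each inner list holds the country
    code of every author on one paper (hypothetical input format).
    """
    counts = Counter()
    for author_countries in papers:
        # A country is counted once per paper, even with several authors.
        for country in set(author_countries):
            counts[country] += 1
    return counts


# Hypothetical sample mirroring the two cases on the slide:
# one paper by two American authors, one paper by a Japanese and a U.S. author.
papers = [["US", "US"], ["JP", "US"]]
counts = integer_count(papers)
print(counts["US"], counts["JP"])
```

Because every collaborating country receives a full count, the per-country totals can add up to more than the number of papers, which is worth keeping in mind when reading share charts built on this method.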

Increase in Chinese Collaboration results in loss of relative standing for some countries

[Figure: Changes in shares in most cited papers (Top 10%); countries shown: US, DE, KR, FR, CN, JP, UK]

[Figure: Changes in shares in most cited papers (Top 1%); countries shown: US, UK, DE, CN, FR, CA, IT, NL, AU]

Risk of Encapsulation and isolationism

• Nao: If Japan stays out of the movement toward building the global infrastructure for research data sharing, science in Japan will become the Galápagos Islands of Science

– Galapagos

• Isolated from other parts of the world

• Many unique species have evolved

– Gala-Kei

• Gala-Kei mobile phones uniquely evolved in Japan.

• They are out of sync with the global standard, and not sold outside of Japan

• 1999: Web Browsing Service; 2004: e-money, e-ticket…

• Emerging focus on Open Science Policy in Japan

Pipeline Model of Innovation, Circa 1999

Publicly funded research is the source of competitiveness: “Funding a Revolution: Government Support for Computing Research” (NRC, 1999)

New Value Proposition?

• Is the pipeline model out of date?

– WiFi, Google, Amazon, Facebook

• If science is changing, the way science drives innovation may be changing

– If “All Legacy Information” is discoverable, accessible, and usable, business organizations can create and/or collect new datasets to fit their own goals.

[Figure labels: All Legacy information; User; New dataset; New Insights]

Figure adapted from presentation by Prof. Barend Mons, http://www.slideshare.net/GigaScience/barend-mons-slides-from-ismb-2014?qid=cb9133a2-fce9-4b68-9ff5-c0a6145342c1&v=qf1&b=&from_search=1

The Game is Changing

Building the capacity for analysis is critical

• Role for funding agencies for innovation

– Build infrastructure, in the broad sense of the word, in collaboration with other agencies

– Build human capacity in all phases of data life cycle

International efforts – the Research Data Alliance

RDA Plenary 3, Dublin, Ireland

Data Sharing as a Driver of Innovation

Research Data Alliance: Global community-driven organization launched in March 2013 to accelerate data-driven innovation

RDA focus is on building the social, organizational and technical infrastructure to

• reduce barriers to data sharing and exchange

• accelerate the development of coordinated global data infrastructure

RDA Vision and Mission

Research Data Alliance Vision: Researchers and innovators openly share data across technologies, disciplines, and countries to address the grand challenges of society.

Research Data Alliance Mission: RDA builds the social and technical bridges that enable data sharing.

Goal of RDA Infrastructure: Support Data Sharing and Interoperability Across Cultures, Scales, Technologies

• Common data types for data interoperability

• Persistent identifiers

• Domain-focused portals

• Harmonized standards

• Data access and preservation policy and practice

• Tools for data discoverability, …

RDA First Meetings

• October 2012: RDA Pre-meeting -- “Global Planning Meeting”

– First broad discussion about goals and organizational framework for RDA

– First Working / Interest groups formed

– Meeting drew > 100 participants from US, EU, elsewhere

• March 2013: RDA Launch / First Plenary

– High level talks by Neelie Kroes (EU VP of Digital Agenda), Farnam Jahanian (NSF CISE AD), Duncan Lewis (AU Ambassador to Belgium, Luxembourg, the EU and NATO)

– First working meeting for RDA groups

– Drew 240 participants from ~30 countries

[Timeline]

• August 2012: First RDA organizational telecon

• October 2012: Global Data Planning Meeting

• March 2013: RDA Launch / First Plenary; First Working Groups and Interest Groups; 240 participants

Evolving focus for RDA “Deliverables”: CREATE ADOPT USE

RDA Members come together as

• Working Groups – 12-18 month efforts to build, adopt, and use specific pieces of infrastructure

• Interest Groups – longer-lived discussion forums that spawn Working Groups as specific pieces of needed infrastructure are identified.

Working Group efforts focus on the development and use of data sharing infrastructure

• Code, policy, infrastructure, standards, or best practices that are adopted and used by communities to enable data sharing

• “Harvestable” efforts for which 12-18 months of work can eliminate a roadblock

• Efforts that have substantive applicability to groups within the data community, but may not apply to everyone

• Efforts for which working scientists and researchers can start today

RDA Community Today

[Chart: growth in RDA community membership; values shown: 393, 993, 1276, 1658, 2051, 2407, 2645, 2778]

Total RDA Community Members: 2778

RDA Community Spans 95+ Countries

Precipitous Growth

• August 2012: First RDA organizational telecon

• October 2012: Global Data Planning Meeting

• March 2013: RDA Launch / First Plenary (RDA Plenary 1, Gothenburg, Sweden)

– First Working Groups and Interest Groups

– 240 participants

• September 2013: RDA Second Plenary (RDA Plenary 2, Washington, DC)

– First “neutral space” community meeting (Data Citation Summit)

– First Organizational Partner Meet-up

– First BOFs

– 380 participants from 22 countries

• March 2014: RDA Third Plenary (RDA Plenary 3, Dublin, Ireland)

– First Organizational Assembly

– 6 co-located events

– 14 BOF, 12 Working Groups, 22 Interest Groups

– 497 Participants from 32 countries

• September 2014: RDA Fourth Plenary (RDA Plenary 4, Amsterdam)

– First RDA Deliverables presented

– Organizational Assembly and first OAB / Council meeting

– 10 co-located events

– 11 BOF, 14 Working Groups, 36 Interest Groups

– 550 Participants from 40 countries

• Also on the timeline: First Working Group exchange meeting

slide courtesy Fran Berman

How RDA is Organized

• RDA Council: Organizational mission and strategy

• Technical Advisory Board: Socio-technical vision and strategy

• Organizational Advisory Board and Organizational Assembly: Needs, adoption, business advice

• Secretary-General and Secretariat: Administration and operations

• Funder’s Forum: Public Sector Organizational and Community Support

• RDA Membership

• Working Groups: implementable, impactful outcomes

• Interest Groups: domain coordination, idea generation, maintenance

• RDA Foundation: Legal entity

RDA Interest Groups

1. Agricultural Data Interoperability IG

2. Active Data Management Plans*

3. Big Data Analytics IG

4. Biodiversity Data Integration IG

5. Brokering IG

6. Community Capability Model IG

7. Data Fabric IG

8. Data for Development

9. Data Foundations and Terminology IG*

10. Data in Context IG

11. Development of cloud computing capacity and education in developing world research*

12. Development of cloud computing capacity and education in developing world research

13. Digital Practices in History and Ethnography IG

14. Domain Repositories Interest Group

15. Education and Training on handling of research data

16. ELIXIR Bridging Force IG

17. Engagement IG

18. Federated Identity Management

19. Geospatial IG*

20. Libraries for Research Data

21. Long tail of research data IG

22. Marine Data Harmonization IG

23. Metabolomics

24. Metadata IG

25. PID Interest Group

26. Preservation e-Infrastructure IG

27. Quality of Urban Life Interest Group

28. RDA/CODATA Legal Interoperability IG

29. RDA/CODATA Materials Data, Infrastructure & Interoperability IG

30. RDA/WDS Certification of Digital Repositories IG

31. RDA/WDS Publishing Data Cost Recovery for Data Centres

32. RDA/WDS Publishing Data IG

33. Reproducibility IG

34. Research data needs of the Photon and Neutron Science community

35. Research Data Provenance

36. Service Management IG

37. Structural Biology IG

38. Toxicogenomics Interoperability IG

* in review

RDA Example Efforts: Domain Repositories Interest Group (George Alter, ICPSR; Peter Doorn, DANS; Ruth Duerr, NSIDC; Bob Hanisch, VAO)

Why: Repositories critical for stewardship and preservation of research data. Provide “homes” for accessing and using data now and in the future.

What: RDA Domain Repositories Interest Group brings together active data repositories serving scientific disciplines to share/create good practice in (and collaborations around) data curation, dissemination, preservation and institutional sustainability

How: RDA Domain Repositories Interest Group will

• Share best practices among domain repositories

• Collaborate to create economies of scale and common approaches

• Work with other RDA groups (data citation, metadata, certification of digital repositories) to adopt/amplify infrastructure

Impact: The Domain Repositories IG will advance repository infrastructure critical to support data sharing and exchange and build community among domain repositories regionally (in the US) and world-wide

RDA Working Groups

1. Brokering Governance

2. Data Citation WG

3. Data Description Registry Interoperability

4. Data Foundation and Terminology WG

5. Data Type Registries WG

6. Metadata Standards Directory Working Group

7. PID Information Types WG

8. Practical Policy WG

9. RDA/CODATA Summer Schools in Data Science and Cloud Computing in the Developing World

10. RDA/WDS Publishing Data Bibliometrics WG

11. RDA/WDS Publishing Data Services WG

12. RDA/WDS Publishing Data Workflows WG

13. Repository Audit and Certification DSA–WDS Partnership WG

14. Repository Platforms for Research Data*

15. The BioSharing Registry: connecting data policies, standards & databases in life sciences*

16. Wheat Data Interoperability WG

* in review

2014, 2015 deliverables

RDA Example Efforts: Wheat Data Interoperability WG (Esther Dzale Yeumo Kabore, French National Institute for Ag. Research; Devika Madalli, Indian Statistical Institute; Johannes Keizer, Food and Agriculture Organization of the UN)

Why: Wheat information systems needed to answer questions such as “What genes and traits are relevant for understanding the impact of climate change on wheat plant productivity?”

– Answers to question require coordination / integration of diverse data sets regarding yield, market pricing, soil analysis, genomic and phenotypic information, etc.

What: RDA Wheat Data Interoperability Working Group developing a common integration framework for describing, representing, linking and publishing wheat data with respect to open standards to support wheat data sharing, use and re-use. Contributing to WheatIS (Wheat Information System of the Global Wheat Initiative)

How: RDA Group will

• Create common standards and vocabularies for wheat data management.

• Facilitate access, discovery, use and re-use through technical and social infrastructure development: metadata, vocabularies/ontologies/formats, and good practice.

• Data to be integrated in WheatIS: genomic annotations, phenotypes, genetic maps, physical maps, germplasm.

• Intend to adapt framework to other crops such as RICE and MAIZE

Impact: Helps accelerate work of Global Forum on Agricultural Research (GFAR), the Consultative Group on International Agricultural Research (CGIAR) and others promoting the Coherence in Information for Agricultural Research for Development (CIARD) movement to open up access to agricultural knowledge worldwide.

Working Group Clusters

[Figure: Working Groups placed in quadrants Q1 to Q4. Axes: Solutions dimension (Social / Technical) and Beneficiary dimension (Data providers / Data consumers). Working Groups shown:]

• RDA/WDS Publishing Data Bibliometrics

• Data Citation

• Data Foundation and Terminology

• Repository Audit and Certification DSA–WDS Partnership

• Brokering Governance

• PID Information Types

• Data Type Registries

• RDA/WDS Publishing Data Workflows

• RDA/CODATA Summer Schools in Data Science and Cloud Computing in the Developing World

• Metadata Standards Directory

• Practical Policy

• The BioSharing Registry

• Wheat Data Interoperability

• RDA/WDS Publishing Data Services

• Data Description Registry Interoperability

• Standardisation of Data Categories and Codes

RDA Deliverables and Adopters P4

• Data Foundation & Terminology Working Group

– Deliverable: Basic vocabulary of foundational terminology, query tool

– Impact: Ensures researchers use a common terminology when referring to data

– Adopters: DataFed.net, CLARIN

• Data Type Registries Working Group

– Deliverable: Data type model and registry

– Impact: Provides machine-readable and researcher-accessible registries of data types that support the accurate use of data

– Adopters: Materials Genome Initiative, Deep Carbon Observatory

• PID Information Types Working Group

– Deliverable: Persistent identifier registry

– Impact: Conceptual model for structuring typed information to better identify PIDs, common interface for access to this information

– Adopters: Materials Genome Initiative, Deep Carbon Observatory

• Practical Policy Working Group

– Deliverable: Basic set of machine actionable rules

– Impact: Policy templates that can be used to support data sharing and interchange between communities

– Adopters: Platform for Experimental Collaborative Ethnography, EUDAT

RDA Deliverables and Adopters P5

• Scalable Dynamic Data Citation Working Group

– Deliverables: Dynamic-data citation methodology that supports efficient processing of data and linking from publications

– Impact: Researchers can reference precise subsets of changing data

– Adopters: Collaborators include: CODATA, OpenAire, Datacite, W3C, other related standards

• Metadata Standards Directory Working Group

– Deliverables: Prototype Metadata Standards Directory and use cases

– Impact: Information can be maintained transparently and with full version control.

– Adopters: Digital Curation Centre

• Wheat Data Interoperability Working Group

– Deliverables: Common framework for Wheat Data Terminology to enable interoperability between distinct data collections

– Impact: Semantically linked terms describing wheat data so researchers can share harvest and related information between data sets and communities

– Adopters: Wheat Initiative Wheat Information System

• Data Description Registry Interoperability Working Group

– Deliverables: Systems and graph technologies to link data across multiple registries to facilitate search and discovery

– Impact: Enables more efficient discovery of data sets

– Adopters: Collaborators include: Australian National Data Service, CERN, DANS, DataCite, DataPASS, Thomson Reuters, Cornell, others

Going forward: Focus on impact

Continuing pipeline of infrastructure deliverables adopted and used to accelerate data sharing

Increasing coordination of infrastructure

Increasing cross-boundary collaborations between domains, sectors, organizations

International and regional programs focusing on workforce, outreach, expansion of infrastructure impact

New partners in the Organizational Assembly

Focused strategy to support development of industry infrastructure for data sharing

More Infrastructure

Partnership with Industry

Synergistic Programs

Effective Community

slide courtesy Fran Berman

RDA/US Goals:

• Contribute to RDA “international” efforts and leadership

• Bring US efforts to broader RDA community

• Build the RDA community within the US

• Leverage and implement RDA deliverables in the US to amplify impact

• Collaborate closely with other RDA “regions” on key programs and initiatives

RDA/US: Collaborate Globally, Contribute Locally

NSF-supported RDA/US initiatives:

• Outreach (RDA RDA/US)

• RDA Deliverables Amplification

• Student / Early Career Engagement

RDA/US Steering Committee

• Fran Berman, RPI

• Larry Lannom, CNRI

• Kathy Fontaine, RPI

• Beth Plale, IU

Lecture 7 Sources (not already on slides)

• Research Data Alliance, http://rd-alliance.org

• “Genome study predicts DNA of the whole of Iceland,” MIT Technology Review, http://www.technologyreview.com/news/536096/genome-study-predicts-dna-of-the-whole-of-iceland/

• “NextCODE Health Mines deCODE’s Data, and More, to Catalyze Clinical Diagnosis”, PLOS Biology Blog, blogs.plos.org/dnascience/2013/11/14/nextcode-health-mines-decodes-data-and-more-to-catalyze-clinical-diagnosis

• “An analysis of the Icelandic Supreme Court judgement on the Health Sector Database Act”, http://www2.law.ed.ac.uk/ahrc/script-ed/issue2/iceland.asp

• “Facebook data privacy case to be heard before European Court,” The Guardian, http://www.theguardian.com/technology/2015/mar/24/facebook-data-privacy-european-union-court-maximillian-schrems

• Europe versus Facebook, http://www.europe-v-facebook.org/EN/en.html/

Data Roundtable

April 10: L7 Data Roundtable

• “Facebook’s privacy policy breaches European law, report finds”, The Guardian, http://www.theguardian.com/technology/2015/feb/23/facebooks-privacy-policy-breaches-european-law-report-finds (Juan Poma)

• “Australia Tops OECD’s Better Life Index”, Wall Street Journal, http://www.wsj.com/articles/SB10001424052702303610504577419320948930402 (Miguel Inoa-Lantigua)

• “Africa: Data Gaps Make Malnutrition Too Easy to Ignore”, SciDev.net, http://allafrica.com/stories/201503171418.html (Alex Karcher)

• “In China, an Open Data Movement is Starting to Take Off,” TechPresident, http://techpresident.com/news/wegov/24940/China-Open-Data-Movement-Starting-Take-Off (Kate McGuire)

April 17: L8 Roundtable (Need 4 volunteers from {Philip, Dennis, Charles, Yusri, Lars, Juan, Oskari, Robert})

• “Everything Google Knows About You (and How it Knows It)”, The Washington Post, http://www.washingtonpost.com/news/the-intersect/wp/2014/11/19/everything-google-knows-about-you-and-how-it-knows-it/ (Juan Poma)

• “How the Politics of Data Privacy Defy Party Labels in Minnesota”, MINNPOST, https://www.minnpost.com/politics-policy/2015/04/how-politics-data-privacy-defy-party-labels-minnesota (Robert Stephens)

• “Bill Would Limit Use of Student Data”, The New York Times, http://www.nytimes.com/2015/03/23/technology/bill-would-limit-use-of-student-data.html?_r=0 (Yusri Jamaluddin)

• “DDoS attacks that crippled GitHub linked to Great Firewall of China,” Ars Technica, http://arstechnica.com/security/2015/04/ddos-attacks-that-crippled-github-linked-to-great-firewall-of-china/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+arstechnica%2Findex+%28Ars+Technica+-+All+content%29 (Charles Hathaway)

Today: Big Data (L6) Data Roundtable

• “Big data: Welcome to the petacentre”, Nature, http://www.nature.com/news/2008/080903/full/455016a.html (Lars Olsson)

• “Police Push the Limits of Big Data Technology”, Datanami, http://www.datanami.com/2014/07/31/police-push-limits-big-data-technology/ (Sumit Munshi)

• “Big Data: Big Obstacles”, Chronicle of Higher Education, http://chronicle.com/article/Big-Data-Big-Obstacles/151421/ (Karl Appel)

• “How pro teams are using data analytics to draft better players,” Financial Post, http://business.financialpost.com/2013/09/03/pro-sports-teams-turning-to-data-anlaytics-to-fill-seats/?__lsa=88c9-3dab (Dennis Fogerty)

• “The big deal about “big data” – your guide to what the heck it actually means”, Ars Technica, http://arstechnica.com/information-technology/2015/02/the-big-deal-about-big-data-your-guide-to-what-the-heck-it-actually-means/ (READ THIS)