Data and Society Lecture 7: Data in the Global Landscape
TRANSCRIPT
Announcements
• We are now in Section 3! Come talk to me if you plan on turning in an extra credit op-ed.
• Papers / mini-proposals due today. Please send .pdf to [email protected] now if you haven’t done it already.
• Guest lecture next week by RPI Professor Bulent Yener on Data and Privacy
• Lecture 9 will be on “Digital Rights and Regulation” instead of “Data and the Workforce”
• Mike Schroepfer, Facebook CTO, is coming April 24. We will have a bigger room in Walker 5113. The class will be open to more people but you get “first dibs”. Feel free to bring 1+ friends to this class and let me know how many you are bringing.
• Data Roundtables for the rest of the semester will be on L7, L8, L10.
– THERE IS TIME FOR DO-OVERS. Let me know if you would like to do an additional Data Roundtable. Your grade will be computed from the best 3 out of 4.
FYI
• April 30 Webcast at 11 EDT with 2014 A.M. Turing Award Winner Michael Stonebraker: "The Fast Data Challenge and Picking the Right Database"
• Register at https://event.on24.com/eventRegistration/EventLobbyServlet?target=reg20.jsp&eventid=975297&sessionid=1&key=228605EABDC52F9B90014185EF938119&sourcepage=register
Today (4/3/15)
• Lecture 7: Data in the Global Landscape
– Digital Rights in Europe
– Health Data in Iceland
– Competition and collaboration in Japan
– International Efforts -- The Research Data Alliance
• L6 Data Roundtable (Lars, Sumit, Karl, Dennis)
You are here
Section Theme / Date | First “half” | Second “half”
Section 1: The Data Ecosystem -- Fundamentals
January 30 | Class introduction; Digital data in the 21st Century (L1) | Data Roundtable / Fran
February 6 | Data Stewardship and Preservation (L2) | L1 Data Roundtable / 5 students
February 13 | Data and Computing (L3) | L2 Data Roundtable / 6 students
February 20 | Colin Bodel, Time Inc. CTO Guest Lecture and Q&A | L3 Data Roundtable / 5 students
Section 2: Data and Innovation – How has data transformed science and society?
February 27 | Section 1 Exam | Data and the Health Sciences (L4)
March 6 | Paper preparation / no class
March 13 | Data and Entertainment (L5) | L4 Data Roundtable / 6 students
March 20 | Big Data Applications (L6) | L5 Data Roundtable / 5 students
Section 3: Data and Community – Social infrastructure for a data-driven world
April 3 | Data in the Global Landscape (L7); Section 2 paper due | L6 Data Roundtable / 4 students
April 10 | Bulent Yener Guest Lecture, Data Privacy / Bad guys on the Internet (L8) | L7 Data Roundtable / 4 students
April 17 | Digital Rights and Regulation (L9) | L8 Data Roundtable / 4 students
April 24 | Mike Schroepfer, Facebook CTO Guest Lecture and Q&A
May 1 | Data Futures (L10) | L10 Data Roundtable / 4 students
May 8 | Section 3 Exam | Data Roundtable as needed
Perspectives on digital data vary globally
• “Social infrastructure” around rights, privacy, and data sharing varies around the world
– Complex interaction of data potential, privacy, policy, innovation driving critical national conversations
• Even for scientific communities, different national approaches to data sharing, stewardship, preservation, responsibility for support
• At the same time, there is universal recognition of the importance of digital data as a driver for innovation and progress
– each nation finding their own solutions to common, fundamental problems within their own cultures
European Union (EU) Digital Agenda
• Overall aim is to deliver sustainable economic and social benefits to Europeans from information and communication technologies.
– Europe perceives itself as lagging behind in terms of use and deployment of IT
• EU launched Europe 2020 strategy in March 2010. Digital Agenda for Europe is one of the 7 flagship initiatives of the Europe 2020 Strategy.
EU Data Challenges
• Fragmented digital markets
– 27 countries in EU, much variation between content, services, and infrastructure across borders; unification difficult
• Lack of interoperability
– “weaknesses in standard setting”, difficulty in coordination
• Rising cybercrime and risk of low trust in networks
• Lack of investment in networks
• Insufficient research and innovation efforts
• Lack of digital literacy and skills
• Missed opportunities in addressing societal challenges
EU Digital Agenda Action Areas 1
• Single digital market
– Want to unify telecom, services, rules, and content
– Rights and protection for consumers and businesses when doing business on-line
• Interoperability and Standards
• Trust and security
– “Europeans will not embrace technology they do not trust – the digital age is neither ‘big brother’ nor ‘cyber wild west’.” (Digital Agenda for Europe, COM(2010) 245, 19.05.2010)
EU Digital Agenda Action Areas 2
• Fast and ultra fast internet access
– Universal broadband coverage, open and neutral internet
• Research and Innovation
– Leverage private investment and accelerate innovation
– Increase digital literacy, skills and services
• ICT-enabled benefits for EU society
– ICT-enabled energy, environment, health care, independent living, cultural diversity / arts, e-government, transportation.
Code of EU on-line rights
• Rights and Principles applicable when you access and use online services
– “Universal” access to electronic communication networks and services
– Access to services and applications of your choice
– Non-discrimination when accessing services provided online
– Privacy, protection of personal data and security
• Rights and Principles applicable when you buy goods or services online
– Information prior to the conclusion of a contract
– Timely, clear and complete contractual information
– Fair contract terms & conditions
– Protection against unfair practices
– Delivery of goods and services without defects and in good time
– Withdrawal from a contract
• Rights and Principles protecting you in case of conflict
– Access to justice and dispute resolution
Europe vs. Facebook
• Advocacy group started by Austrian University student (Max Schrems) grew to a grass-roots movement of 25,000+ people
• Issue is potential violation of EU data protection law due to personal data collected by Facebook, etc.
• Schrems filed a complaint with Irish Data Protection Commissioner alleging 22 violations of European law
Europe vs. Facebook, 2011/2012
• Schrems claimed that Facebook collected data he never consented to provide: physical location, data he had deleted, etc. Schrems started the “Europe vs. Facebook” movement and 25,000+ other users also requested FB data.
– Legal case has been crowdfunded …
• Irish Data Protection Commissioner (DPC) started investigation
– Complaints filed in Ireland because European users have a contract with “Facebook Ireland Ltd”. Under European law, Facebook Ireland is the “data controller” for facebook.com and therefore facebook.com is governed by European data protection laws.
• Schrems eventually recovered 1,222 pages of material in 57 data categories from FB in 2011
– Schrems claims that Facebook did not provide all data and that Facebook holds at least 84 data categories about every user.
• FB developed a download tool to provide users a quick overview of the data being kept on file. FB also agreed to cut the amount of time it retains data on user activities to less than one year.
• Irish Data Protection Commissioner refused to investigate complaints against Facebook and Apple in summer 2013. Claimed that the legal view in the complaints was frivolous.
Current Status
• Schrems lost claim against regulator in Irish High Court, but the Judge asked the European Court of Justice to examine whether Ireland’s data watchdog is bound by “safe harbor” and whether an investigation should be launched
– Wikipedia: A safe harbor is a provision of a statute or a regulation that specifies that certain conduct will be deemed not to violate a given rule. … The EU Data Protection Directive is an example of a safe harbor law. It sets comparatively strict privacy protections for EU citizens. It prohibits European firms from transferring personal data to overseas jurisdictions with weaker privacy laws, but creates exceptions where the foreign recipients have voluntarily agreed to meet EU standards under the Directive's Safe Harbor Principles.
• Schrems looking for a declaration that the safe harbor designation for Facebook under EU law should be cancelled and that the Irish Data Protection commissioner should audit the exchange of information rather than to allow it to continue unexamined.
• Still ongoing: Case heard on 3/24/15, judgement expected in a few months …
Safe Harbor Principles:
• Notice - Individuals must be informed that their data is being collected and about how it will be used.
• Choice - Individuals must have the option to opt out of the collection and forward transfer of the data to third parties.
• Onward Transfer - Transfers of data to third parties may only occur to other organizations that follow adequate data protection principles.
• Security - Reasonable efforts must be made to prevent loss of collected information.
• Data Integrity - Data must be relevant and reliable for the purpose it was collected for.
• Access - Individuals must be able to access information held about them, and correct or delete it if it is inaccurate.
• Enforcement - There must be effective means of enforcing these rules.
Wikipedia
Icelanders
• Iceland has a population of ~326,000 and is the most sparsely populated country in Europe.
• Iceland provides universal health care to its citizens and spends a fair amount on health care, ranking 11th in health care expenditures as a percentage of GDP and 14th in spending per capita.
– Health care system is ranked 15th in performance by the World Health Organization.
• Ethnically homogeneous. Most Icelanders are descendants of Germanic and Gaelic (Celtic) settlers.
– 93% Icelandic
– 3.13% Polish
– 3.84% Other
• Iceland has extensive genealogical records dating back to the late 17th century and fragmentary records extending back to the 9th century.
Source: Wikipedia articles on Iceland, Icelanders
Whole Country Health Data
• 1996: deCODE Genetics founded to identify human genes associated with common diseases using population studies and apply the knowledge gained to guide the development of candidate drug treatments.
• Company worked with government on the Health Sector Database Act and created Icelandic Health Sector Database (HSD) which merged genealogical, genetic and health records for the entire population of Iceland
– Opt-out model of presumed consent
– Services and infrastructure developed as well for mining data in HSD
• deCODE data used for discoveries about genes that increase risk for kidney disease, cancer, lupus, vascular disease, schizophrenia, osteoporosis, etc.
– One result identified a gene that protects against Alzheimer’s
Controversy and Transition
• Health Sector Database very controversial because of privacy and consent issues
• Legal judgement from the Icelandic Supreme Court effectively killed off the HSD project in 2003
– Court case focused on the legal rights of a deceased Icelander, including the right not to participate. Legal issues included legal standing and personal rights of the deceased individual, and identifiability due to the richness of data
– Part of the problem was the original Health Sector Database Act, which did not provide information and guidelines on how the DB should be set up, who should run it, who should have access to the data, and what control Icelandic citizens should have over samples.
– Company believed it could continue to identify disease-related genes without the database
• Services and assets of deCODE went through many transitions:
– deCODE founded in 1996; filed for bankruptcy in 2009
– Saga Investments LLC purchased deCODE services and assets in 2010
– Amgen purchased deCODE in 2012, spun off NextCODE Health in 2013
– NextCODE acquired by WuXi PharmaTech in 2015
Precision Medicine, challenging ethics
• DeCODE has collected full DNA sequences on 10,000 individuals.
– Because Icelandic population is so homogeneous, DeCODE says it can extrapolate (“impute”) to accurately guess the DNA makeup of the other 330K citizens, including those who never participated in the studies.
• DeCODE has identified mutations in BRCA2 that convey sharply increased risk of breast and ovarian cancers.
– DeCODE’s data could identify 2K people with the gene mutation but there are legal and ethical issues that prevent DeCODE from informing people who are at risk
– Inferences go “beyond informed consent”
• Ongoing issues: Development of large-scale data collections provides critical information for scientists, but development of accompanying privacy, ethics and health policy and regulation is a problem for many countries
Moving Japan Towards Competitiveness through
Collaboration and Data Sharing (All slides adapted from Nao Tsunematsu)
• Nao Tsunematsu – Senior Policy Advisor for Japan Science and Technology Agency
• Nao working to help Japan be more competitive in research and innovation
• Focusing on key drivers:
– Scientific leadership and collaboration
– New approaches to scientific innovation including data sharing
“To the greatest extent and with the fewest constraints possible publicly funded scientific research data should be open, while at the same time respecting concerns in relation to privacy, safety, security and commercial interests, whilst acknowledging the legitimate concerns of private partners.” (G8 + 5 Science Ministers)
Increasing focus on collaboration in science: From individual to team work
[Figure: number of authors (“N of Authors”) by research field: Agri. Science, Biology/Biochemistry, Chemistry, Clinical Medicine, Computational Sci., Economics/Business Administration, Engineering, Environment Studies/Ecology, Earth Science, Immunology, Material Science, Mathematics, Microbiology, Molecular Biology/Genetics, Neuro Science, Pharmacy/Toxicology, Physics, Botany/Zoology, Psychiatry/Psychology, Social Sciences, Space Science]
Source: “White Paper on Science and Technology 2014” by Ministry of Education, Culture, Sports, Science and Technology (MEXT)
Analysis: National Institute of Science and Technology Policy (NISTEP) “Science of Science, Technology, and Innovation Policy”
International Collaboration Increasing
Integer Count
• If a paper is coauthored by two American authors, U.S. gets “Count 1.”
• If a paper is coauthored by a Japanese author and an author in the U.S.A., each country gets “Count 1.”
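A minimal sketch of the integer counting rule described above, assuming each paper is represented simply as a list of its authors' country codes. The function name `integer_count` and the sample data are hypothetical and only for illustration; they are not taken from the MEXT / NISTEP analysis.

```python
from collections import Counter


def integer_count(papers):
    """Integer counting: every country appearing on a paper's author list
    gets a count of 1 for that paper, no matter how many of its authors
    are on the paper."""
    counts = Counter()
    for author_countries in papers:
        # set() collapses repeated countries, so two U.S. authors still give 1
        for country in set(author_countries):
            counts[country] += 1
    return counts


# Hypothetical sample data: one inner list of author countries per paper.
papers = [
    ["US", "US"],  # two American authors: U.S. gets count 1
    ["JP", "US"],  # Japanese and U.S. author: each country gets count 1
]
counts = integer_count(papers)
print(dict(counts))
```

Under this rule an internationally coauthored paper adds to the total of every participating country, so country totals can sum to more than the number of papers.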
Increase in Chinese Collaboration results in loss of relative standing for some countries
[Figure: changes in shares in most cited papers (Top 10%) for US, DE, KR, FR, CN, JP, UK]
Risk of Encapsulation and Isolationism
• Nao: If Japan stays out of the movement toward building the global infrastructure for research data sharing, science in Japan will become the Galápagos Islands of Science
– Galápagos
• Isolated from other parts of the world
• Many unique species have evolved
– Gala-Kei
• Gala-Kei mobile phones uniquely evolved in Japan (1999: Web Browsing Service; 2004: e-money, e-ticket…)
• They are out of sync with the global standard, and not sold outside of Japan
• Emerging focus on Open Science Policy in Japan
Pipeline Model of Innovation, Circa 1999
Publicly funded research is the source of competitiveness: “Funding a Revolution: Government Support for Computing Research” (NRC, 1999)
New Value Proposition?
• Is the pipeline model out of date? – WiFi, Google, Amazon,
• If science is changing, the way science drives innovation may be changing
– If “All Legacy Information” is discoverable, accessible, and usable, business organizations can create and/or collect new datasets to fit their own goals.
[Figure: All Legacy Information, User, New dataset, New Insights]
Figure adapted from presentation by Prof. Barend Mons, http://www.slideshare.net/GigaScience/barend-mons-slides-from-ismb-2014?qid=cb9133a2-fce9-4b68-9ff5-c0a6145342c1&v=qf1&b=&from_search=1
The Game is Changing
Building the capacity for analysis is critical
• Role for funding agencies for innovation
– Build infrastructure in broad sense of the meaning of the word in collaboration with other agencies
– Build human capacity in all phases of data life cycle
Research Data Alliance
Global community-driven organization launched in March 2013 to accelerate data-driven innovation
RDA focus is on building the social, organizational and technical infrastructure to
• reduce barriers to data sharing and exchange
• accelerate the development of coordinated global data infrastructure
RDA Vision and Mission
Research Data Alliance Vision: Researchers and innovators openly share data across technologies, disciplines, and countries to address the grand challenges of society.
Research Data Alliance Mission: RDA builds the social and technical bridges that enable data sharing.
Goal of RDA Infrastructure: Support Data Sharing and Interoperability Across Cultures, Scales, Technologies
Common data types for data Interoperability
Persistent identifiers
Domain-focused portals
Harmonized standards
Data access and preservation policy and practice
Tools for data discoverability, …
RDA First Meetings
• October 2012: RDA Pre-meeting -- “Global Planning Meeting”
– First broad discussion about goals and organizational framework for RDA
– First Working / Interest groups formed
– Meeting drew > 100 participants from US, EU, elsewhere
• March 2013: RDA Launch / First Plenary
– High level talks by Neelie Kroes (EU VP of Digital Agenda), Farnam Jahanian (NSF CISE AD), Duncan Lewis (AU Ambassador to Belgium, Luxembourg, the EU and NATO)
– First working meeting for RDA groups
– Drew 240 participants from ~30 countries
[Timeline]
• First RDA organizational telecon: August 2012
• Global Data Planning Meeting: October 2012
• RDA Launch / First Plenary, March 2013: First Working Groups and Interest Groups; 240 participants
Evolving focus for RDA “Deliverables”: CREATE ADOPT USE
RDA Members come together as
• Working Groups – 12-18 month efforts to build, adopt, and use specific pieces of infrastructure
• Interest Groups – longer-lived discussion forums that spawn Working Groups as specific pieces of needed infrastructure are identified.
Working Group efforts focus on the development and use of data sharing infrastructure
• Code, policy, infrastructure, standards, or best practices that are adopted and used by communities to enable data sharing
• “Harvestable” efforts for which 12-18 months of work can eliminate a roadblock
• Efforts that have substantive applicability to groups within the data community, but may not apply to everyone
• Efforts for which working scientists and researchers can start today
Precipitous Growth
[Timeline]
• First RDA organizational telecon: August 2012
• Global Data Planning Meeting: October 2012
• RDA Plenary 1 / Launch, Gothenburg, Sweden (March 2013)
– First Working Groups and Interest Groups
– 240 participants
• RDA Plenary 2, Washington, DC (September 2013)
– First “neutral space” community meeting (Data Citation Summit)
– First Organizational Partner Meet-up
– First BOFs
– 380 participants from 22 countries
• RDA Plenary 3, Dublin, Ireland (March 2014)
– First Organizational Assembly
– 6 co-located events
– 14 BOF, 12 Working Groups, 22 Interest Groups
– 497 participants from 32 countries
• RDA Plenary 4, Amsterdam (September 2014)
– First Working Group exchange meeting
– First RDA Deliverables presented
– Organizational Assembly and first OAB / Council meeting
– 10 co-located events
– 11 BOF, 14 Working Groups, 36 Interest Groups
– 550 participants from 40 countries
slide courtesy Fran Berman
How RDA is Organized
[Organization chart]
• RDA Membership
• RDA Council: Organizational mission and strategy
• Technical Advisory Board: Socio-technical vision and strategy
• Secretary-General and Secretariat: Administration and operations
• Organizational Advisory Board and Organizational Assembly: Needs, adoption, business advice
• Funder’s Forum: Public Sector Organizational and Community Support
• Working Groups: implementable, impactful outcomes
• Interest Groups: domain coordination, idea generation, maintenance,
• RDA Foundation: Legal entity
RDA Interest Groups
1. Agricultural Data Interoperability IG
2. Active Data Management Plans*
3. Big Data Analytics IG
4. Biodiversity Data Integration IG
5. Brokering IG
6. Community Capability Model IG
7. Data Fabric IG
8. Data for Development
9. Data Foundations and Terminology IG*
10. Data in Context IG
11. Development of cloud computing capacity and education in developing world research*
12. Development of cloud computing capacity and education in developing world research
13. Digital Practices in History and Ethnography IG
14. Domain Repositories Interest Group
15. Education and Training on handling of research data
16. ELIXIR Bridging Force IG
17. Engagement IG
18. Federated Identity Management
19. Geospatial IG*
20. Libraries for Research Data
21. Long tail of research data IG
22. Marine Data Harmonization IG
23. Metabolomics
24. Metadata IG
25. PID Interest Group
26. Preservation e-Infrastructure IG
27. Quality of Urban Life Interest Group
28. RDA/CODATA Legal Interoperability IG
29. RDA/CODATA Materials Data, Infrastructure & Interoperability IG
30. RDA/WDS Certification of Digital Repositories IG
31. RDA/WDS Publishing Data Cost Recovery for Data Centres
32. RDA/WDS Publishing Data IG
33. Reproducibility IG
34. Research data needs of the Photon and Neutron Science community
35. Research Data Provenance
36. Service Management IG
37. Structural Biology IG
38. Toxicogenomics Interoperability IG
* in review
RDA Example Efforts: Domain Repositories Interest Group (George Alter, ICPSR; Peter Doorn, DANS; Ruth Duerr, NSIDC; Bob Hanisch, VAO)
Why: Repositories critical for stewardship and preservation of research data. Provide “homes” for accessing and using data now and in the future.
What: RDA Domain Repositories Interest Group brings together active data repositories serving scientific disciplines to share/create good practice in (and collaborations around) data curation, dissemination, preservation and institutional sustainability
How: RDA Domain Repositories Interest Group will
– Share best practices among domain repositories
– Collaborate to create economies of scale and common approaches
– Work with other RDA groups (data citation, metadata, certification of digital repositories) to adopt/amplify infrastructure
Impact: The Domain Repositories IG will advance repository infrastructure critical to support data sharing and exchange and build community among domain repositories regionally (in the US) and world-wide
RDA Working Groups
1. Brokering Governance
2. Data Citation WG
3. Data Description Registry Interoperability
4. Data Foundation and Terminology WG
5. Data Type Registries WG
6. Metadata Standards Directory Working Group
7. PID Information Types WG
8. Practical Policy WG
9. RDA/CODATA Summer Schools in Data Science and Cloud Computing in the Developing World
10. RDA/WDS Publishing Data Bibliometrics WG
11. RDA/WDS Publishing Data Services WG
12. RDA/WDS Publishing Data Workflows WG
13. Repository Audit and Certification DSA–WDS Partnership WG
14. Repository Platforms for Research Data*
15. The BioSharing Registry: connecting data policies, standards & databases in life sciences*
16. Wheat Data Interoperability WG
* in review
2014, 2015 deliverables
RDA Example Efforts: Wheat Data Interoperability WG (Esther Dzale Yeumo Kabore, French National Institute for Ag. Research; Devika Madalli, Indian Statistical Institute; Johannes Keizer, Food and Agriculture Organization of the UN)
Why: Wheat information systems needed to answer questions such as “What genes and traits are relevant for understanding the impact of climate change on wheat plant productivity?”
– Answers to such questions require coordination / integration of diverse data sets regarding yield, market pricing, soil analysis, genomic and phenotypic information, etc.
What: RDA Wheat Data Interoperability Working Group developing a common integration framework for describing, representing, linking and publishing wheat data with respect to open standards to support wheat data sharing, use and re-use. Contributing to WheatIS (Wheat Information System of the Global Wheat Initiative)
How: RDA Group will
– Create common standards and vocabularies for wheat data management.
– Facilitate access, discovery, use and re-use through technical and social infrastructure development: metadata, vocabularies/ontologies/formats, and good practice.
– Data to be integrated in WheatIS: genomic annotations, phenotypes, genetic maps, physical maps, germplasm.
– Intend to adapt framework to other crops such as rice and maize
Impact: Helps accelerate work of Global Forum on Agricultural Research (GFAR), the Consultative Group on International Agricultural Research (CGIAR) and others promoting the Coherence in Information for Agricultural Research for Development (CIARD) movement to open up access to agricultural knowledge worldwide.
Working Group Clusters
[Figure: quadrant chart (Q1, Q2, Q3, Q4) placing RDA Working Groups along a solutions dimension (social, technical) and a beneficiary dimension (data providers, data consumers). Groups shown: RDA/WDS Publishing Data Bibliometrics; Data Citation; Data Foundation and Terminology; Repository Audit and Certification DSA–WDS Partnership; Brokering Governance; PID Information Types; Data Type Registries; RDA/WDS Publishing Data Workflows; RDA/CODATA Summer Schools in Data Science and Cloud Computing in the Developing World; Metadata Standards Directory; Practical Policy; The BioSharing Registry; Wheat Data Interoperability; RDA/WDS Publishing Data Services; Data Description Registry Interoperability; Standardisation of Data Categories and Codes]
RDA Deliverables and Adopters P4
• Data Foundation & Terminology Working Group
– Deliverable: Basic vocabulary of foundational terminology, query tool
– Impact: Ensures researchers use a common terminology when referring to data
– Adopters: DataFed.net, CLARIN
• Data Type Registries Working Group
– Deliverable: Data type model and registry
– Impact: Provides machine-readable and researcher-accessible registries of data types that support the accurate use of data
– Adopters: Materials Genome Initiative, Deep Carbon Observatory
• PID Information Types Working Group
– Deliverable: Persistent identifier registry
– Impact: Conceptual model for structuring typed information to better identify PIDs, common interface for access to this information
– Adopters: Materials Genome Initiative, Deep Carbon Observatory
• Practical Policy Working Group
– Deliverable: Basic set of machine actionable rules
– Impact: Policy templates that can be used to support data sharing and interchange between communities
– Adopters: Platform for Experimental Collaborative Ethnography, EUDAT
RDA Deliverables and Adopters P5
• Scalable Dynamic Data Citation Working Group
– Deliverable: Dynamic-data citation methodology that supports efficient processing of data and linking from publications
– Impact: Researchers can reference precise subsets of changing data
– Adopters: Collaborators include: CODATA, OpenAire, Datacite, W3C, other related standards
• Metadata Standards Directory Working Group
– Deliverable: Prototype Metadata Standards Directory and use cases
– Impact: Information can be maintained transparently and with full version control.
– Adopters: Digital Curation Centre
• Wheat Data Interoperability Working Group
– Deliverable: Common framework for Wheat Data Terminology to enable interoperability between distinct data collections
– Impact: Semantically linked terms describing wheat data so researchers can share harvest and related information between data sets and communities
– Adopters: Wheat Initiative Wheat Information System
• Data Description Registry Interoperability Working Group
– Deliverable: Systems and graph technologies to link data across multiple registries to facilitate search and discovery
– Impact: Enables more efficient discovery of data sets
– Adopters: Collaborators include: Australian National Data Service, CERN, DANS, DataCite, DataPASS, Thomson Reuters, Cornell, others
Going forward: Focus on impact
Continuing pipeline of infrastructure deliverables adopted and used to accelerate data sharing
Increasing coordination of infrastructure
Increasing cross-boundary collaborations between domains, sectors, organizations
International and regional programs focusing on workforce, outreach, expansion of infrastructure impact
New partners in the Organizational Assembly
Focused strategy to support development of industry infrastructure for data sharing
More Infrastructure
Partnership with Industry
Synergistic Programs
Effective Community
slide courtesy Fran Berman
RDA/US: Collaborate Globally, Contribute Locally
RDA/US Goals:
• Contribute to RDA “international” efforts and leadership
• Bring US efforts to broader RDA community
• Build the RDA community within the US
• Leverage and implement RDA deliverables in the US to amplify impact
• Collaborate closely with other RDA “regions” on key programs and initiatives
NSF-supported RDA/US initiatives:
• Outreach (RDA RDA/US)
• RDA Deliverables Amplification
• Student / Early Career Engagement
RDA/US Steering Committee
• Fran Berman, RPI
• Larry Lannom, CNRI
• Kathy Fontaine, RPI
• Beth Plale, IU
Lecture 7 Sources (not already on slides)
• Research Data Alliance, http://rd-alliance.org
• “Genome study predicts DNA of the whole of Iceland,” MIT Technology Review, http://www.technologyreview.com/news/536096/genome-study-predicts-dna-of-the-whole-of-iceland/
• “NextCODE Health Mines deCODE’s Data, and More, to Catalyze Clinical Diagnosis”, PLOS Biology Blog, blogs.plos.org/dnascience/2013/11/14/nextcode-health-mines-decodes-data-and-more-to-catalyze-clinical-diagnosis
• “An analysis of the Icelandic Supreme Court judgement on the Health Sector Database Act”, http://www2.law.ed.ac.uk/ahrc/script-ed/issue2/iceland.asp
• “Facebook data privacy case to be heard before European Court,” The Guardian, http://www.theguardian.com/technology/2015/mar/24/facebook-data-privacy-european-union-court-maximillian-schrems
• Europe versus Facebook, http://www.europe-v-facebook.org/EN/en.html/
April 10: L7 Data Roundtable
• “Facebook’s privacy policy breaches European law, report finds”, The Guardian, http://www.theguardian.com/technology/2015/feb/23/facebooks-privacy-policy-breaches-european-law-report-finds (Juan Poma)
• “Australia Tops OECD’s Better Life Index”, Wall Street Journal, http://www.wsj.com/articles/SB10001424052702303610504577419320948930402 (Miguel Inoa-Lantigua)
• “Africa: Data Gaps Make Malnutrition Too Easy to Ignore”, SciDev.net http://allafrica.com/stories/201503171418.html (Alex Karcher)
• “In China, an Open Data Movement is Starting to Take Off,” TechPresident, http://techpresident.com/news/wegov/24940/China-Open-Data-Movement-Starting-Take-Off (Kate McGuire)
April 17: L8 Roundtable (Need 4 volunteers from {Philip, Dennis, Charles, Yusri, Lars, Juan, Oskari, Robert})
• “Everything Google Knows About You (and How it Knows It)”, The Washington Post, http://www.washingtonpost.com/news/the-intersect/wp/2014/11/19/everything-google-knows-about-you-and-how-it-knows-it/ (Juan Poma)
• “How the Politics of Data Privacy Defy Party Labels in Minnesota”, MINNPOST, https://www.minnpost.com/politics-policy/2015/04/how-politics-data-privacy-defy-party-labels-minnesota (Robert Stephens)
• “Bill Would Limit Use of Student Data”, The New York Times, http://www.nytimes.com/2015/03/23/technology/bill-would-limit-use-of-student-data.html?_r=0 (Yusri Jamaluddin)
• “DDoS attacks that crippled GitHub linked to Great Firewall of China,” Ars Technica, http://arstechnica.com/security/2015/04/ddos-attacks-that-crippled-github-linked-to-great-firewall-of-china/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+arstechnica%2Findex+%28Ars+Technica+-+All+content%29 (Charles Hathaway)
Today: Big Data (L6) Data Roundtable
• “Big data: Welcome to the petacentre”, Nature, http://www.nature.com/news/2008/080903/full/455016a.html (Lars Olsson)
• “Police Push the Limits of Big Data Technology”, Datanami, http://www.datanami.com/2014/07/31/police-push-limits-big-data-technology/ (Sumit Munshi)
• “Big Data: Big Obstacles”, Chronicle of Higher Education, http://chronicle.com/article/Big-Data-Big-Obstacles/151421/ (Karl Appel)
• “How pro teams are using data analytics to draft better players,” Financial Post, http://business.financialpost.com/2013/09/03/pro-sports-teams-turning-to-data-anlaytics-to-fill-seats/?__lsa=88c9-3dab (Dennis Fogerty)
• “The big deal about “big data” – your guide to what the heck it actually means”, Ars Technica, http://arstechnica.com/information-technology/2015/02/the-big-deal-about-big-data-your-guide-to-what-the-heck-it-actually-means/ (READ THIS)