new and emerging forms of data

David De Roure @dder New and Emerging Forms of Data: Past, Present, and Future OXFORD E-RESEARCH CENTRE


Post on 15-Apr-2017


David De Roure @dder

New and Emerging Forms of Data: Past, Present, and Future

OXFORD E-RESEARCH CENTRE

http://www.data-archive.ac.uk/media/54761/ukda-40thanniversary.pdf

When did time begin?

Figure: chart with axes "More people" and "More machines". Labels: Big Data; High Performance Computing; Conventional computing; Web 2 Social Media; e-infrastructure; online R&D; New and Emerging Forms of Data; deeply about society.

Nigel Shadbolt et al

https://twitter.com/CR_UK/status/446223117841494016/

Some people's smartphones had autocorrected the word "BEAT" to instead read "BEAR". "Thank you for choosing an adorable polar bear," the reply from the WWF said. "We will call you today to set up your adoption."

http://www.bbc.com/news/technology-26723457

http://www.parliament.uk/business/committees/committees-a-z/commons-select/science-and-technology-committee/news/report-responsible-use-of-data/

theODI.org

Social Media Triangle

social media data and analytics

social media for engagement with research

social media as a subject of research

Sam McGregor

New Forms of Data

▶ Internet data, derived from social media and other online interactions (including data gathered by connected people and devices, e.g. mobile devices, wearable technology, Internet of Things)

▶ Tracking data, monitoring the movement of people and objects (including GPS/geolocation data, traffic and other transport sensor data, CCTV images etc.)

▶ Satellite and aerial imagery (e.g. Google Earth, Landsat, infrared, radar mapping etc.)

http://www.oecd.org/sti/sci-tech/new-data-for-understanding-the-human-condition.htm

What do we mean by real-time analytics?

▶  Live data streams vs live data analysis

▶ Different kinds of data, at a different pace

▶  Time-critical integration and analysis

▶  Influencing processes as they unfold, at speed & at scale

▶ New methodological apparatus

▶ New computational methods and infrastructure

▶ Not just social media – but social media is a rehearsal

New and emerging CDTs

Real life is and must be full of all kinds of social constraint – the very processes from which society arises. Computers can help if we use them to create abstract social machines on the Web: processes in which the people do the creative work and the machine does the administration... The stage is set for an evolutionary growth of new social engines. The ability to create new forms of social process would be given to the world at large, and development would be rapid.

Berners-Lee, Weaving the Web, 1999 (pp. 172–175)

Social Machines

The Macroscope

Figure: diagram of Social Machines (SM) with seven numbered roles. Labels: Observer of one social machine; Observers using third party observatory; Observer of multiple social machines; Human participants in Social Machine; Human participants in multiple Social Machines; Observer of Social Machine infrastructure; Social Machine Observing Social Machines.


De Roure, D., Hooper, C., Page, K., Tarte, S., and Willcox, P. 2015. Observing Social Machines Part 2: How to Observe? ACM Web Science

STORYTELLING AS A STETHOSCOPE FOR SOCIAL MACHINES

1.  Sociality through storytelling potential and realization

2.  Sustainability through reactivity and interactivity

3.  Emergence through collaborative authorship and mixed authority

Zooniverse is a highly storified Social Machine

Facebook doesn’t allow for improvisation

Wikipedia assigns authority rights rigidly

Tarte, S. M., De Roure, D., and Willcox, P. Working out the plot: the role of stories in social machines. In Proceedings of the companion publication of the 23rd international conference on World wide web companion (2014), International World Wide Web Conferences Steering Committee, pp. 909–914.

Seizing the tiger by the tail

▶  The Internet of Things describes a world in which everyday objects are connected to a network so that data can be shared

▶  But it is really as much about people as the inanimate object

▶  It is impossible to anticipate all the social changes that could be created by connecting billions of devices

https://www.gov.uk/government/publications/internet-of-things-blackett-review

PETRAS Privacy, Ethics, Trust, Reliability, Acceptability, and Security for the Internet of Things

•  Use an integrated approach of collaborative social and physical science expertise

•  Remove barriers to the beneficial adoption of Internet of Things

•  Address generic knowledge gaps through case study approaches covering major sectors

•  Use innovative methodologies including ‘in the wild’ and citizen science

Principles


Key Facts about PETRAS

•  9 world-leading universities via the core and spoke model (4 from the Alan Turing Institute)

•  Combined hub value: £23m

•  Blackett Review expertise

•  47 partners at submission combining presence in the UK, Central Europe and America (giving International links and perspective)

•  Inter- and multi-disciplinary focus

Normal Accidents

Small events cascade through the system, with catastrophic consequences, when:

•  The system is complex

•  The system is tightly coupled

•  The system has catastrophic potential

doi:10.1016/B0-08-043076-7/04509-5

Figure (repeated, with additions): chart with axes "More people" and "More machines". Labels: Big Data; High Performance Computing; Conventional computing; Web 2 Social Media; e-infrastructure; online R&D; New and Emerging Forms of Data; deeply about society. Added labels: The future; increasing automation; machine learning.

Data Detect Store Analytics Filter Analysts

Edwards, P. N., et al. (2013) Knowledge Infrastructures: Intellectual Frameworks and Research Challenges. Ann Arbor: Deep Blue. http://hdl.handle.net/2027.42/97552

Findable, Accessible, Interoperable, Reusable

Jameson L. Toole, Yu-Ru Lin, Erich Muehlegger, Daniel Shoag, Marta C. González, David Lazer. Journal of the Royal Society Interface. Volume 12, issue 107. Published 27 May 2015. DOI: 10.1098/rsif.2015.0185

Tracking employment shocks using mobile phone data

A computationally-enabled sense-making network of expertise, data, software, models and narratives

Big Data, in a Big Data Centre

The R Dimensions

Research Objects facilitate research that is reproducible, repeatable, replicable, reusable, referenceable, retrievable, reviewable, replayable, re-interpretable, reprocessable, recomposable, reconstructable, repurposable, reliable, respectful, reputable, revealable, recoverable, restorable, reparable, refreshable?

@dder 14 April 2014

Figure: Research Object Principles. Labels: sci method; access; understand; new use; social; curation.

What are we trying to achieve?

My reflection is that the reason we seek “reproducible research” is principally to achieve two ends:

1.  Confidence in results, because they inform policy, decision-making, and further research

2.  Sharing and citation of methods, data, software, to make it easier to stand on each other's shoulders, not toes

Let’s focus on (1)…

Research in the Wild (West)

Imagine you are a conference chair… or responsible for urban planning, or security. Confidence in results is getting harder:

Trusting the analysis that is occurring: automation of workflows, crowd-sourced data reduction, software vulnerabilities, increasing adoption of machine learning, and no critical human in the loop

Knowing what the data is, where it has come from, and what we can do with it: multiple and partial data sources, at speed and scale, in an evolving ecosystem of data processing intermediaries, with complexity in permissions for data use

What interventions should we make to improve confidence and quality? What (socio-)technology can we adopt?

Provocation One

▶ Are there questions which are answered using longitudinal studies data today that could be answered in other ways?

▶  There is massive (voluntary) supply of data about individuals on a huge scale

▶  The supply is set to increase with Internet of Things

▶  This data is “real time” (Fitbit, smartphone, accelerometer methods…)

Provocation Two

▶ Sometimes we really do need a longitudinal study in order to answer a question

▶  So can we do that longitudinal study in a new way?

▶  By:

–  Supplementing existing studies, using linkage

–  Using new techniques with easy reporting at scale

–  Working internationally, regionally – shining the torch

Provocation Three

▶ Are we planning for how the world will be in 5 years?

▶ What have we learned from the rehearsal so far?

▶  Increasing automation, bots, robots

▶  Behaviour in the digital world (physical-digital world)

▶ Changing data ecosystem, e.g. personal data stores

Figure labels: consume; produce; compose; perform; capture; distribute.

www.semanticaudio.ac.uk

Closing reflections

1.  Not just new forms of data, but new social processes and new research questions

2.  What can we learn from the social media analytics rehearsal?

3.  Are we ready?

–  for the data supply ahead

–  for inevitable automation

4.  How do we ensure the quality of research?

[email protected] @dder

Thanks to Peter Elias, Wendy Hall, Sam McGregor, Mark Sandler, Nigel Shadbolt, Jeremy Watson, also Grant Miller, Petar Radanliev, Ségolène Tarte, and Pip Willcox.

http://www.slideshare.net/dder/new-and-emerging-forms-of-data

www.oerc.ox.ac.uk
