big data, new epistemologies and paradigm shifts

Big data, new epistemologies and paradigm shifts or Do revolutions in measurement lead to revolutions in science? Rob Kitchin, National University of Ireland Maynooth

Upload: robkitchin

Post on 10-Feb-2017


TRANSCRIPT

Page 1: Big data, new epistemologies and paradigm shifts

Big data, new epistemologies and paradigm shifts

or

Do revolutions in measurement lead to revolutions in science?

Rob Kitchin, National University of Ireland Maynooth

Page 2: Big data, new epistemologies and paradigm shifts

Introduction

• “Revolutions in science have often been preceded by revolutions in measurement” Sinan Aral (2010)

• “Big data creates a radical shift in how we think about research. ... [It offers] a profound change at the levels of epistemology and ethics. Big data reframes key questions about the constitution of knowledge, the processes of research, how we should engage with information, and the nature and the categorization of reality ... Big data stakes out new terrains of objects, methods of knowing, and definitions of social life” (boyd and Crawford 2012)

• Critically examine:

• Big data

• Data analytics

• Effects on epistemological and methodological approaches in the sciences, social sciences and humanities

Page 3: Big data, new epistemologies and paradigm shifts

Small data / big data

Characteristic | Small data | Big data
Volume | Limited to large | Very large
Exhaustivity | Samples | Entire populations
Resolution and indexicality | Coarse & weak to tight & strong | Tight & strong
Relationality | Weak to strong | Strong
Velocity | Slow, freeze-framed | Fast
Variety | Limited to wide | Wide
Flexible and scalable | Low to middling | High

Page 4: Big data, new epistemologies and paradigm shifts

Urban big data

• Directed

o Surveillance: CCTV, drones/satellite

o Scaled public admin records

• Automated

o Automated surveillance

o Digital devices

o Sensors, actuators, transponders, meters (IoT)

o Interactions and transactions

• Volunteered

o Social media

o Sousveillance/wearables

o Crowdsourcing

o Citizen science

Page 5: Big data, new epistemologies and paradigm shifts

Big data analytics

• Challenge of making sense of big data is coping with its abundance and exhaustivity, timeliness and dynamism, messiness and uncertainty, semi-structured or unstructured nature

• Solution has been machine learning (AI) made possible by advances in computation and computational techniques

• Four broad classes of analytics:

• data mining and pattern recognition

• statistical analysis

• prediction, simulation, and optimization

• data visualization and visual analytics

Page 6: Big data, new epistemologies and paradigm shifts
Page 7: Big data, new epistemologies and paradigm shifts

New paradigms

• Big data, coupled with new data analytics, challenges established epistemologies across the sciences, social sciences and humanities

• Transforming how we frame, ask and answer questions

• Some argue leading to new paradigms within and across disciplines

• For Kuhn (1962) paradigm shifts are driven by science being unable to account for particular phenomena or answer key questions

• For Gray (2009) paradigm shifts are driven by new forms of measurement, data and analytical techniques. He charts the evolution of science through four broad paradigms

Paradigm | Nature | Form | When
First | Experimental science | Empiricism; describing natural phenomena | pre-Renaissance
Second | Theoretical science | Modelling and generalization | pre-computers
Third | Computational science | Simulation of complex phenomena | pre-big data
Fourth | Exploratory science | Data-intensive; statistical exploration and data mining | Now

Page 8: Big data, new epistemologies and paradigm shifts

Science

• Gray proposes that science is entering a fourth paradigm driven by big data and new data analytics

• Leading to a new era of data-intensive science and a radically new extension of the established scientific method

• Others suggest that big data ushers in a new era of empiricism, wherein data can speak for themselves free of theory

• The latter has gained credence outside of the academy, especially within business circles, but its ideas have also taken root in data science

Page 9: Big data, new epistemologies and paradigm shifts

‘The end of theory’

• Anderson (2008) argues: ‘The data deluge makes the scientific method obsolete’; that the patterns and relationships contained within big data inherently produce meaningful and insightful knowledge

• “There is now a better way. Petabytes allow us to say: ‘Correlation is enough.’ ... We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot. ... Correlation supersedes causation, and science can advance even without coherent models, unified theories, or really any mechanistic explanation at all. There’s no reason to cling to our old ways.”

• Ayasdi software claims to be able to:

• “automatically discover insights -- regardless of complexity -- without asking questions.”

Page 10: Big data, new epistemologies and paradigm shifts

‘The end of theory’

• Moreover, analysts can employ an ensemble approach

• Literally hundreds of different algorithms can be applied to a dataset to determine the best answer or a composite model or explanation

• A radically different approach to that traditionally used, wherein the analyst selects an appropriate method based on their knowledge of techniques and the data

• The logic is that insight is born from the data, not from theory
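
The ensemble approach described on this slide can be sketched in a few lines. This is a toy illustration only, not any vendor's method: it uses four simple predictors rather than hundreds, and the synthetic data, the function names (`make_data`, `run_ensemble`) and the choice of mean squared error on held-out data as the selection criterion are all assumptions made for the example.

```python
import random
import statistics


def make_data(n, seed):
    """Hypothetical data: y depends linearly on x, plus noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0, 1) for x in xs]
    return xs, ys


def fit_mean(xs, ys):
    m = statistics.fmean(ys)
    return lambda x: m


def fit_median(xs, ys):
    m = statistics.median(ys)
    return lambda x: m


def fit_line(xs, ys):
    # Ordinary least squares for a single predictor
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def fit_nearest(xs, ys):
    # One-nearest-neighbour prediction
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


FITTERS = {"mean": fit_mean, "median": fit_median,
           "line": fit_line, "nearest": fit_nearest}


def mse(model, xs, ys):
    return statistics.fmean((model(x) - y) ** 2 for x, y in zip(xs, ys))


def run_ensemble(fitters, train, holdout):
    """Fit every algorithm, score each on held-out data, build a composite."""
    models = {name: fit(*train) for name, fit in fitters.items()}
    scores = {name: mse(model, *holdout) for name, model in models.items()}
    best = min(scores, key=scores.get)
    # Composite model: the unweighted average of all predictions
    composite = lambda x: statistics.fmean(m(x) for m in models.values())
    return best, scores, composite


train = make_data(200, seed=1)
holdout = make_data(100, seed=2)
best, scores, composite = run_ensemble(FITTERS, train, holdout)
print("best algorithm:", best)
```

Note that even here the analyst has chosen which algorithms enter the pool and what counts as the "best answer", so the selection is not free of prior framing.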

Page 11: Big data, new epistemologies and paradigm shifts

‘The end of theory’

• Powerful and attractive set of ideas at work in the empiricist epistemology that runs counter to the mainstream deductive approach:

• big data can capture the whole of a domain and provide full resolution

• there is no need for a priori theory, models or hypotheses

• through the application of agnostic data analytics the data can speak for themselves free of human bias or framing

• any patterns and relationships within big data are inherently meaningful and truthful

• meaning transcends context or domain-specific knowledge, thus can be interpreted by anyone who can decode a statistic or data visualization

• offers the possibility of insightful, objective and profitable knowledge without science or scientists

• These work together to suggest that a new mode of understanding the world is being created, one in which the modus operandi is purely inductive in nature

Page 12: Big data, new epistemologies and paradigm shifts

‘The end of theory’

• Empiricist thinking is problematic for four reasons:

• Big data are both a representation and a sample, shaped by the technology and platform used, the data ontology employed, the regulatory environment, and are subject to sampling bias

• Big data do not arise from nowhere, free from ‘the regulating force of philosophy’

• Big data cannot simply speak for themselves free of human bias or framing

• Big data cannot be interpreted outside of context and domain-specific knowledge
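
The first of these reasons, that a very large dataset is still a sample shaped by how it was collected, can be illustrated with a small simulation. Everything here is hypothetical: the population, the assumption that a platform under-represents older people, and the function name `simulate` are invented for the sketch.

```python
import random
import statistics


def simulate(seed=42, small_n=500):
    rng = random.Random(seed)

    # Hypothetical population: ages 18 to 90, with an outcome that rises with age
    population = []
    for _ in range(100_000):
        age = rng.uniform(18, 90)
        outcome = 0.5 * age + rng.gauss(0, 5)
        population.append((age, outcome))
    true_mean = statistics.fmean(o for _, o in population)

    # "Big" platform-shaped data: the chance of being captured falls with age
    big_biased = [o for age, o in population
                  if rng.random() < (90 - age) / 72]

    # Small but simple random sample of the same population
    small_random = [o for _, o in rng.sample(population, small_n)]

    return (true_mean,
            statistics.fmean(big_biased), len(big_biased),
            statistics.fmean(small_random))


true_mean, biased_mean, n_biased, random_mean = simulate()
print(f"population mean:        {true_mean:.2f}")
print(f"big biased sample mean: {biased_mean:.2f} (n={n_biased})")
print(f"small random sample:    {random_mean:.2f} (n=500)")
```

Under these assumptions the large dataset is many times the size of the small one, yet its estimate is further from the population value, because volume does not correct for who is missing from the data.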

Page 13: Big data, new epistemologies and paradigm shifts

Data-driven science

• Data-driven science seeks to hold to the tenets of the scientific method, but is more open to using a hybrid combination of abductive, inductive and deductive approaches

• Differs from traditional, experimental deductive design in that it seeks to generate hypotheses and insights ‘born from the data’ rather than ‘born from the theory’

• Seeks to incorporate a mode of induction into the research design, though explanation through induction is not the intended end-point

• Instead, induction forms a new mode of hypothesis generation before a deductive approach is employed

• Process of induction does not arise from nowhere, but is situated and contextualised within a highly evolved theoretical domain
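
The two-stage logic on this slide, induction to generate hypotheses followed by deductive testing, might be sketched as below. This is a simplified illustration under stated assumptions: the synthetic dataset, the use of Pearson correlation as the exploratory technique, the split into exploration and confirmation rows, and the threshold of 0.3 are all choices made for the example, not part of the original talk.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def make_dataset(n_rows=400, n_noise=49, seed=7):
    """Hypothetical data: one variable truly related to the outcome, the rest noise."""
    rng = random.Random(seed)
    columns = {"signal": [rng.gauss(0, 1) for _ in range(n_rows)]}
    for i in range(n_noise):
        columns[f"noise_{i}"] = [rng.gauss(0, 1) for _ in range(n_rows)]
    outcome = [s + rng.gauss(0, 1) for s in columns["signal"]]
    return columns, outcome


def generate_hypotheses(columns, outcome, rows, top_k=3):
    """Inductive step: rank variables by |r| on the exploration rows."""
    ys = [outcome[i] for i in rows]
    strength = {name: abs(pearson([col[i] for i in rows], ys))
                for name, col in columns.items()}
    return sorted(strength, key=strength.get, reverse=True)[:top_k]


def confirm_hypotheses(columns, outcome, rows, candidates, threshold=0.3):
    """Deductive step: keep candidates whose |r| holds up on unseen rows."""
    ys = [outcome[i] for i in rows]
    return [name for name in candidates
            if abs(pearson([columns[name][i] for i in rows], ys)) >= threshold]


columns, outcome = make_dataset()
explore_rows = range(0, 200)
confirm_rows = range(200, 400)
hypotheses = generate_hypotheses(columns, outcome, explore_rows)
confirmed = confirm_hypotheses(columns, outcome, confirm_rows, hypotheses)
print("hypotheses born from the data:", hypotheses)
print("hypotheses surviving the test:", confirmed)
```

The exploration step will always return some candidates, including noise variables that happen to correlate by chance; only the separate testing step, ideally informed by established theory, indicates which of them are worth keeping.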

Page 14: Big data, new epistemologies and paradigm shifts

Data-driven science

• The epistemological strategy is to use guided knowledge discovery techniques to identify potential questions worthy of further examination and testing

• And instead of testing whether every relationship revealed has veracity, attention is focused on those that seemingly offer the most likely or valid way forward based on established science

• Approach is suited to extracting additional, valuable insights that traditional ‘knowledge-driven science’ would fail to generate

• The data-driven approach:

• is suited to exploring, extracting value from and making sense of massive, interconnected data sets

• fosters interdisciplinary research that conjoins domain expertise

• will lead to more holistic and extensive models and theories of entire complex systems rather than elements of them

Page 15: Big data, new epistemologies and paradigm shifts

Social sciences and humanities

• The effect of big data/data analytics in the humanities and social sciences is less certain

• These areas of scholarship are highly diverse in their philosophical underpinnings, with only some scholars employing the epistemology common in the sciences

• Whilst there is a history of quantitative and positivistic scholarship in the social sciences, it is much rarer in the humanities

• There has been a strong post-positivistic shift in many social science disciplines

Page 16: Big data, new epistemologies and paradigm shifts

Computational social science

• For positivistic scholars in the social sciences, big data offers the opportunity to develop more sophisticated, wider-scale, finer-grained models of human life. To shift:

• from data-scarce to data-rich studies of societies

• from static snapshots to dynamic unfoldings

• from coarse aggregations to high resolutions

• from relatively simple models to more complex, sophisticated simulations

• The potential is for studies that have much greater breadth, depth, scale and timeliness, and that are inherently longitudinal

• The variety, exhaustivity, resolution and relationality of data, plus new techniques, address some of the critiques of positivistic scholarship (reductionism and universalism) by providing more finely grained, sensitive and nuanced analysis

Page 17: Big data, new epistemologies and paradigm shifts

Social sciences

• For post-positivist scholars, big data offers both opportunities and challenges

• Opportunities:

• a proliferation, digitisation and interlinking of a diverse set of analogue and unstructured data, much of it new (e.g., social media) and much of it previously difficult to access (e.g., millions of books, documents, newspapers, photographs, art works, material objects, etc.)

• new tools of data curation, management and analysis that can handle massive numbers of data objects

• Challenges:

• analysis is mechanistic, atomizing and parochial, reducing diverse individuals and complex multidimensional social structures to mere data points; it identifies trends but not what produces them

• struggles with the social and with context

• creates bigger haystacks

• identifies but does not address problems

• tends to marginalize metaphysical and normative questions

• erosion of domain level expertise

• promotion of empiricist/quantitative approaches and skewing of funding towards big data

• skills and knowledge deficit

Page 18: Big data, new epistemologies and paradigm shifts

Digital humanities

• Opportunities/challenges being keenly felt in the humanities; rise of digital humanities

• Rather than providing a close reading of a handful of novels or photographs, or a couple of artists and their work, it becomes possible to search, connect and find patterns across a very large number of related works

• Digital humanities advocates are broadly divided into two camps epistemologically:

• Those that believe that new techniques -- counting, graphing, mapping, data mining -- bring methodological rigour and objectivity to disciplines that have heretofore been unsystematic and random in their focus and approach

• Those that see the techniques as a supplement to, rather than a replacement for, existing humanities methods and theory building

Page 19: Big data, new epistemologies and paradigm shifts

Digital humanities

• Both cases tend to use descriptive rather than inferential statistics

• The claims of the former have opened up an epistemological debate centred on close versus distant reading/interpretation and the ability of algorithms to parse meaning and context

• DH seen by some as mechanistic and reductionist (reduces literature and art to data)

• Identifies patterns but not processes or meaning

• Sacrifices complexity, specificity, context, depth and critique for scale, breadth, automation, descriptive patterns and the impression that interpretation does not require deep contextual knowledge

• Other concerns are similar to those raised in the social sciences

Page 20: Big data, new epistemologies and paradigm shifts

What happens to small data studies?

• Big data doesn’t replace or negate small data

• Small data have a proven track record of answering specific questions, with established procedures, methods, etc.

• Studies can be much more finely tailored

• Small data studies seek to mine gold from carefully working a narrow seam, whereas big data studies seek to extract nuggets through open-pit mining, scooping up and sieving huge tracts of land

• Small data will, however, increasingly be made more big data-like through the development of new data infrastructures that:

• pool, scale and link small data in order to create larger datasets

• encourage sharing and re-use

• open them up to combination with big data and analysis using big data analytics
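
As a minimal sketch of what pooling and linking small datasets involves, assuming the studies share a common key: the records, field names (`area`, `income`, `population`) and helper functions below are hypothetical and stand in for whatever a real data infrastructure would provide.

```python
def pool(*datasets):
    """Concatenate records from several small studies, tagging each with its source."""
    pooled = []
    for name, records in datasets:
        for rec in records:
            pooled.append({**rec, "source": name})
    return pooled


def link(records, other, key):
    """Attach fields from `other` to every record that shares the same key value."""
    index = {row[key]: row for row in other}
    return [{**rec, **index[rec[key]]} for rec in records if rec[key] in index]


# Two hypothetical small surveys and one administrative lookup table
survey_a = [{"area": "A1", "income": 30}, {"area": "B2", "income": 45}]
survey_b = [{"area": "A1", "income": 32}]
area_info = [{"area": "A1", "population": 1200},
             {"area": "B2", "population": 800}]

pooled = pool(("survey_a", survey_a), ("survey_b", survey_b))
linked = link(pooled, area_info, key="area")
for row in linked:
    print(row)
```

In practice the hard part is not the join itself but agreeing on shared identifiers, definitions and metadata so that records from different studies are genuinely comparable.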

Page 21: Big data, new epistemologies and paradigm shifts

Conclusion

• Big data/analytics constitute a data revolution, fundamentally altering the nature of data and how we make sense of them (disruptive innovation)

• It is starting to transform how research is conducted, organised and managed, enabling new approaches to data generation/analysis that make it possible to ask and answer questions in new ways

• It also poses significant social, political and ethical questions

• As new technologies and analytics develop, these transformations will extend and deepen, raising a series of conceptual and methodological challenges across the sciences, social sciences and humanities

• It has the potential to usher in new paradigms, but further pluralism in approaches is more likely

Page 22: Big data, new epistemologies and paradigm shifts

@robkitchin

Kitchin, R. and McArdle, G. (2016) What makes big data, big data? Exploring the ontological characteristics of 26 datasets. Big Data & Society 3: 1–10.

Kitchin, R. and Lauriault, T. (2014) Towards critical data studies. SSRN.

Kitchin, R. and Lauriault, T. (2015) Small data in the era of big data. GeoJournal 80(4): 463–475.

Kitchin, R. (2014) Big data, new epistemologies & paradigm shifts. Big Data & Society 1: 1–12.

Kitchin, R. (2014) The real-time city? Big data and smart urbanism. GeoJournal 79(1): 1–14.

Kitchin, R. (2013) Big data and human geography: Opportunities, challenges and risks. Dialogues in Human Geography 3(3): 262–267.

http://www.nuim.ie/progcity

@progcity