The Science of Grammar and Vice-Versa
Or, We Don't Need No Stinkin’ Linguists

Geoff Nunberg, School of Information, UC Berkeley
AAAS, Feb. 14, 2015

TRANSCRIPT

Page 1: The Science of Grammar and Vice-Versa Or, We Don't Need No …courses.ischool.berkeley.edu/i218/s15/slides/AAASslidesFeb14.pdf · Geoff Nunberg School of Information, UC Berkeley

The Science of Grammar and Vice-Versa

Or, We Don't Need No Stinkin’ Linguists

Geoff Nunberg
School of Information, UC Berkeley

AAAS
Feb. 14, 2015

Page 2

The view from section Z

It was ≈20 years ago today…

Making the case for a science of language

“Words are so interesting”

Why are people incurious about the workings of language?

Because they can be.

The pleasures of curmudgeonry

What’s overlooked…

Page 3

Anecdotes aren’t data, but…

Page 4

Anecdotes aren’t data, but… data can be an anecdote

The great data kerfuffle: who cares? Why do people use data in the singular or the plural?

Seeing the kerfuffle through linguistic eyes

Count & mass: Individuation in language

Looking at “data” with a linguist’s eyes tells us something about how we think about the stuff

Page 5

The Great “Data” Kerfuffle

A word with a great past…

Page 6

The Great “Data” Kerfuffle

A word with a past… and a great future

Page 7

The New Age of “Data”

Frequency of Google searches on information science/scientist vs data science/scientist, 2005-2015 (Google Trends)

Chart series: data science/scientist; information science/scientist

Page 9

But what do we focus on?

Like strata, phenomena and media, data is plural and is best used with a plural verb. —Strunk and White, The Elements of Style

Can we just clear this up now: the word 'data', in English, is a singular mass noun. It is thus a grammatical and stylistic error to use it as a plural. Plural use is barbaric… —Norman Gray

Page 10

Where do you stand?

“Data are everywhere and piling up in dizzying amounts.”

A. Fine as it stands

Page 11

Where do you stand?

“Data are everywhere and piling up in dizzying amounts.”

A. Fine as it stands

B. More natural as “data is”

Page 12

Where do you stand?

“Data are everywhere and piling up in dizzying amounts.”

A. Fine as it stands

B. More natural as “data is”

C. WHAT-ever

Page 13

A Hundred Years War

The grammatical error most commonly made by engineering writers [is] the use of a singular noun with the plural verb “data.” —American Machinist, May 9, 1907

“During the year much data have been furnished to the various departments of the State Experiment Station.”
Report of the New Jersey Weather Service, 1895

Page 14

A Hundred Years War

I HAVE more than once publicly protested against that abomination "data is." We say "phenomenon is" and "phenomena are," and I do not recall in Latin any singular verb used in English with a plural noun, excepting poor "data is."
W. W. Keen, Science, July 1, 1927

… I feel minded to essay the role of devil's advocate for the apparently incorrect use [of “data”]. We speak and hence write English by ear and not by rules of grammar. If "this data" sounds better than "these data" it will be used.
Charles H. Blake, Science, July 1, 1927

Page 15

Foolish inconsistency

The plural form of some nouns of foreign origin… may appear to be singular and can cause authors to select a verb that does not agree in number with the noun.

Correct: The data indicate that Terrence was correct.
Incorrect: The data indicates that Terrence was correct.

APA Publication Manual, p. 79

Tables are efficient, enabling the researcher to present a large amount of data in a small amount of space.

APA Publication Manual, p. 147

Page 16

Foolish inconsistency

Yet even as big data are helping banks, they are also throwing up new competitors from outside the industry.
The Economist, 19 May 2012

Big datum: x = 3.1415926535897932384626433832795028841971…

Little datum: x > 3

Fetishism makes us foolish

Page 17

The Perils of Pedantry

Samuel Johnson:

PEDANTRY. Awkward ostentation of needless learning.

candelabra candelabrum

insignia insigne

enema enemata

rhinocerotes

Page 18

The Perils of Pedantry

Samuel Johnson:

PEDANTRY. Awkward ostentation of needless learning.

candelabra candelabrum?

insignia insigne?

enema enemata?

rhinocerotes

Page 19

Singularian Dogmatists

Saying “data are” is like… over-pronouncing Italian at the Olive Garden. No one is impressed, and frankly, we’re just a little embarrassed for you. —John August

I won't rest until [everyone] accepts the plain fact that data should be treated as a singular noun in all circumstances.… I'm not sure I've ever heard anyone say "data are," but lots of diehards with PhDs still use it in print. —Kevin Drum

But where is it written you can use any form in writing that wouldn't sound natural in speech?

Page 20

Enough Dogma!

How can we be so superficial about such an important word?

Page 21

“Data” and the Scientific Voice

Page 22

“Data” and the Scientific Voice

Frequency of “data” in English genres (COCA)

Page 23

“Data” and the Scientific Voice

Frequency of “data” in English sub-genres (COCA)

Rank  Sub-genre         Words per million
1     ACAD:Medicine     1,415.1
2     ACAD:Education    1,267.9
3     ACAD:Sci/Tech     1,228.5
4     ACAD:Geog/SocSci  855.8
5     MAG:Sci/Tech      561.7
6     NEWS:Money        333.5
7     MAG:Financial     327.0
8     ACAD:Phil/Rel     301.9
9     ACAD:History      204.1
10    ACAD:Law/PolSci   193.8

“Data” is most widely used in genres that stress quantitative evidence
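
The “words per million” figures are normalized frequencies: occurrences of the word divided by the size of the text, scaled to one million words. A minimal sketch of that normalization follows. It assumes a crude regex tokenizer and a made-up sample sentence, not COCA itself, and the function name `per_million` is hypothetical.

```python
import re


def per_million(text: str, word: str) -> float:
    """Occurrences of `word` per million tokens of `text` (case-insensitive)."""
    # Assumption: a token is any run of letters or apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token == word.lower())
    return hits * 1_000_000 / len(tokens)


# Illustrative sample only; not drawn from any corpus.
sample = "The data show a trend. More data is needed before the data can be shared."
print(per_million(sample, "data"))
```

Normalizing this way is what makes sub-genres of very different sizes comparable in a single ranking.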

Page 24

Where is data plural?

Genre      Data pp M  Ratio pl/sg
fiction    3.5        0.08
broadcast  4.3        0.06
newspaper  1.61       0.39
magazine   0.99       1.02
academic   0.33       7.12

Frequency of plural data correlates with text frequency of data

Page 25

Frequency of plural correlates with “science-ness”

Ratio of plural to singular in journal articles since 1990

Journal             pl/sg
Jrnl. Cell Biology  14.48
Am. Educ. Research  6.21
Language            3.41
Am. Hist. Review    1.38
Representations     0.03
Early Am. Lit.      0
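
A pl/sg ratio of this kind can be sketched as a count of plural-agreement tokens divided by a count of singular-agreement tokens. The slides do not say which patterns were counted, so the verb lists below are an assumption, and `plural_singular_ratio` is a hypothetical name.

```python
import re
from typing import Optional


def plural_singular_ratio(text: str) -> Optional[float]:
    """Ratio of plural to singular verb agreement with "data" in `text`.

    Assumption: "data are/were/have" count as plural and "data is/was/has"
    as singular. The source does not specify the patterns it used.
    """
    lowered = text.lower()
    plural = len(re.findall(r"\bdata (?:are|were|have)\b", lowered))
    singular = len(re.findall(r"\bdata (?:is|was|has)\b", lowered))
    if singular == 0:
        return None  # the ratio is undefined when no singular tokens occur
    return plural / singular


# Illustrative sample only.
text = "The data are clear. The data were collected. This data is old."
print(plural_singular_ratio(text))
```

A ratio above 1 means the plural predominates in the sample, and a ratio below 1 means the singular does.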

Page 26

…with one exception

Ratio of plural to singular in journal articles since 1990

Journal             pl/sg
Jrnl. Cell Biology  14.48
Am. Educ. Research  6.21
Language            3.41
Am. Hist. Review    1.38
ACM                 0.39
Representations     0.03
Early Am. Lit.      0

Page 27

…with one exception

The data that’s almost always singular

Page 28

Why do most people make data singular?

Are people just ignorant about Latin?

Or does the singular make sense?

…data are so aggregative that English usage increasingly makes many into one. —Lisa Gitelman

Page 29

Why do scientists make data plural?

Are most scientists just hopeless pedants?

Slaves to the style guides?

A badge of belonging?

When do we learn to pluralize data?

Page 30

Why do scientists make data plural?

An empirical question: What determines when scientists use the plural?

Page 31

The linguistics of count and mass

The mass/count distinction keeps being at the center of much attention among cognitive scientists, as it involves in fundamental ways the relation between language (i.e. grammar), thought (i.e. conceptual systems not necessarily rooted in grammar) and reality (i.e. the physical world). —Gennaro Chierchia (2010)

Page 32

“Count” and “Mass”

Making the linguistic distinction in English

Counting and quantification:

a dog, three dogs, *a mud, *three muds

dogs are, *mud are

too many dogs, *too many muds

*too much dog, too much mud

Measure: *a gallon of dog, a gallon of mud
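
The diagnostics on this slide work like a lookup: each frame accepts one noun class and stars the other. A toy sketch, assuming a hand-labelled two-word lexicon (`NOUN_CLASS`, `FRAMES` and `judge` are hypothetical names, and the labels are a simplification, since some nouns have both count and mass uses):

```python
# Toy lexicon: only the two nouns from the slide, labelled by hand.
NOUN_CLASS = {"dog": "count", "mud": "mass"}

# Frames from the slide, each paired with the noun class it accepts.
FRAMES = {
    "a {n}": "count",
    "three {n}s": "count",
    "too many {n}s": "count",
    "too much {n}": "mass",
    "a gallon of {n}": "mass",
}


def judge(frame: str, noun: str) -> str:
    """Fill the frame with the noun; prefix "*" if the classes mismatch."""
    phrase = frame.format(n=noun)
    acceptable = FRAMES[frame] == NOUN_CLASS[noun]
    return phrase if acceptable else "*" + phrase


for frame in FRAMES:
    print(judge(frame, "dog"), "|", judge(frame, "mud"))
```

The starred outputs reproduce the slide's judgments for these two nouns only. Nothing here generalizes beyond the lexicon it is given.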

Page 33

How to Tell Things from Gunk

                      Things  Gunk
Divisive reference    no      yes
Cumulative reference  no      yes
Countable             yes     no

“…the metaphysical question of the primary existence of gunk vs. things”
Jeff Pelletier (2011)

Page 34

But categories are not clear-cut

Count and Mass noun pairs with similar meanings:

leaves vs foliage, pebbles vs gravel, clothes vs clothing, noodles vs spaghetti

Page 35

Words that go both ways

She killed a chicken (c) / We ate chicken (m)

I have some gray hairs (c) / I’m losing my hair (m)

We had many talks (chats, *chatters).

There was too much talk (*chat, chatter).

We’re concerned about all the crimes/crime in the neighborhood.

Words with both count and mass uses

Page 36

Cross-linguistic Mismatches

English      Italian
contents C   contenuto M
spaghetti M  spaghetti C
lightning M  lampo C
furniture M  mobili C
advice M     consiglio C

Page 37

Cross-linguistic Mismatches

English        Italian
contents C     contenuto M
spaghetti M    spaghetti C
lightning M    lampo C
furniture M    mobili C
advice M       consiglio C
data C/M       dati C
information M  informazione C/M

Page 38

Where do the distinctions live?

[dɔg]

“The mass/count distinction … involves in fundamental ways the relation between language…, thought… and reality.”

Page 39

The question every linguist has to answer

Does language structure reality or reflect it?

Yes.

Language is ultimately wedded to reality…

…but it’s a very rocky marriage.

But language can reveal tacit conceptual structure

Page 40

Cross-linguistic consistencies

Scott Grimm, “Grammatical Number and Individuation,” to appear in Lg.

Languages signal individuation in different ways, but carve experience along similar lines

Page 41

Cross-linguistic consistencies

Collective/Singulative Classes in Turkana, Welsh and Maltese on the Lattice of Animacy and Individuation (Grimm, to appear)

liquids/substances < granular aggregates < collective aggregates < individuals

Page 42

Populating the space between dogs and milk

Collective nouns:

team, committee, etc.

Bipartites:

scissors, trousers, tongs, goggles…

Singular aggregates:

footwear, jewelry, furniture…

Plural aggregates:

arms, supplies, valuables; Americana…

Other plural-only nouns:

directions, manners, dues, troops…

Figure: conceptualization of a collective noun (Joosten et al.)

Page 43

Populating the space between dogs and milk

Collective nouns:

team, committee, etc.

Bipartites:

scissors, trousers, tongs, goggles…

Singular aggregates:

footwear, jewelry, furniture…

Plural aggregates:

arms, supplies, valuables; Americana…

Other plural-only nouns:

directions, manners, dues, troops… and data

Figure: conceptualization of a collective noun (Joosten et al.)

Page 44

Explaining count/noncount alternations

Do alternations correspond to systematic conceptual distinctions? Often.

Count nouns are favored when:

• Individual elements are distinguishable (size, perceptibility, contiguity)

• We interact with elements one-by-one rather than in quantity

autumn foliage

autumn leaves

Page 45

Explaining count/noncount alternations

Cf. Zwicky on cover plants and hedges

But clover swings both ways…

The shifting use of chad, November 2000

Page 46

Explaining count/noncount alternations

Langacker on pebbles and gravel:

Middleton et al.’s worgel

Page 47

Back to data

A culturally saturated notion…

Who has data and who doesn’t?

Page 48

When quantitative facts aren’t data

Rank  Sub-genre         Per million
1     ACAD:Medicine     1,415.1
2     ACAD:Education    1,267.9
3     ACAD:Sci/Tech     1,228.5
4     ACAD:Geog/SocSci  855.8
5     MAG:Sci/Tech      561.7
6     NEWS:Money        333.5
7     MAG:Financial     327.0
8     ACAD:Phil/Rel     301.9
9     ACAD:History      204.1
10    ACAD:Law/PolSci   193.8

Page 49

When quantitative facts aren’t data

Rank  Sub-genre         Per million
1     ACAD:Medicine     1,415.1
2     ACAD:Education    1,267.9
3     ACAD:Sci/Tech     1,228.5
4     ACAD:Geog/SocSci  855.8
5     MAG:Sci/Tech      561.7
6     NEWS:Money        333.5
7     MAG:Financial     327.0
8     ACAD:Phil/Rel     301.9
9     ACAD:History      204.1
10    ACAD:Law/PolSci   193.8
24    MAG:Sports        64.8

Page 50

The invention of data

Frequency of data in Google Books, 1800-2000

Page 51

The invention of data

Frequency of data in Google Books, 1800-1900

It is remarkable that until a very few years ago no data were collected where a calculation of the average occurrence of sickness at the several ages of man could be formed with tolerable accuracy. —Select Committee of the House of Commons, 1825

“the first Big Data revolution”

Page 52

The things we do with data

Collect

Wash

Organize

Process

Analyze

Evaluate

Appraise

Axis label: Interaction at Particulate Level (High to Low)

Page 53

Working the difference?

It is now difficult to walk the streets of a major city without having one’s progress captured by some hidden gaitkeeping device. Data about me are stored in thousands of virtual locations…. As that data is reworked, processed through an online algorithm or spat out to somewhere…, my possibilities for action are being shaped. —Geoffrey Bowker

Page 54

When is data plural?

Ratio of “data are” to “data is” preceding various adjectives

Adjective      pl/sg
insufficient   3.79
inconsistent   3.56
self-reported  3.49
scattered      2.79
precise        2.70
scarce         2.66
collected      2.31
sketchy        2.21

Plural predominates when predicate implies direct interaction with data at individual/particulate level

Page 55

Sensitivity of plural usage to tense

Adjective      Present  Past
insufficient   3.79     3.78
inconsistent   3.56     3.56
self-reported  3.49     5.82
scattered      2.79     5.09
precise        2.70     3.29
scarce         2.66     3.48
collected      2.31     8.19
sketchy        2.21     2.16

Plural is more frequent in the past tense (“data were”) than in the present (“data are”)
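
The tense effect can be read off the table by dividing each past-tense ratio by its present-tense ratio. A minimal sketch using the figures from this slide follows. It assumes the Present column is “data are”/“data is” and the Past column is “data were”/“data was”, which the caption implies but does not state, and `tense_shift` is a hypothetical helper name.

```python
# pl/sg ratios copied from the slide: adjective -> (present, past).
RATIOS = {
    "insufficient": (3.79, 3.78),
    "inconsistent": (3.56, 3.56),
    "self-reported": (3.49, 5.82),
    "scattered": (2.79, 5.09),
    "precise": (2.70, 3.29),
    "scarce": (2.66, 3.48),
    "collected": (2.31, 8.19),
    "sketchy": (2.21, 2.16),
}


def tense_shift(ratios):
    """Return adjective -> past/present quotient, largest shift first."""
    shifts = {adj: past / present for adj, (present, past) in ratios.items()}
    return dict(sorted(shifts.items(), key=lambda item: item[1], reverse=True))


for adjective, shift in tense_shift(RATIOS).items():
    print(f"{adjective:14s} {shift:.2f}")
```

On these figures the past-tense ratio is higher for five of the eight adjectives, most sharply for “collected”. The other three are essentially unchanged.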

Page 56

When is data plural?

Singular is favored when appraising significance of results…

Adjective     pl/sg
surprising    0.98
important     0.87
significant   0.82
crucial       0.59
overwhelming  0.45
remarkable    0.26

Page 57

Evaluating data

Predicate        pl/sg
consistent with  7.63
compatible with  6.91
fit              6.47
refute           4.14
confirm          3.77
show that        2.45

Plural is favored when evaluating data for fit with model or hypothesis

Page 58

Appraising data

Predicate        pl/sg
consistent with  7.63
compatible with  6.91
fit              6.47
refute           4.14
confirm          3.77
show that        2.45

…but singular is favored when drawing broad extra-scientific conclusions from data

Predicate     pl/sg
challenge     1.03
say that      0.54
mean that     0.21
tell us that  0.21

Our data confirm that the bulk of the Deccan eruptions occurred in a short time…

The scientific data tells us that people who are without symptoms… are not a threat.

Page 59

Appraising data

Predicate        pl/sg
consistent with  7.63
compatible with  6.91
fit              6.47
refute           4.14
confirm          3.77
show that        2.45

…but singular is favored when drawing broad extra-scientific conclusions from data

Predicate     pl/sg
challenge     1.03
say that      0.54
mean that     0.21
tell us that  0.21

Our data confirm that the bulk of the Deccan eruptions occurred in a short time…

The scientific data tells us that people who are without symptoms… are not a threat.

However, it is now difficult to walk the streets of a major city without having one’s progress captured by some hidden gaitkeeping device. Data about me are stored in thousands of virtual locations…. As that data is reworked, processed through an online algorithm or spat out to somewhere and somewhen to the computer screen of a vigilant operator, my possibilities for action are being shaped.

Page 60

Appraising data

Predicate        pl/sg
consistent with  7.63
compatible with  6.91
fit              6.47
refute           4.14
confirm          3.77
show that        2.45

…but singular is favored when drawing broad extra-scientific conclusions from data

Predicate     pl/sg
challenge     1.03
say that      0.54
mean that     0.21
tell us that  0.21

Our data confirm that the bulk of the Deccan eruptions occurred in a short time…

The scientific data tells us that people who are without symptoms… are not a threat.

However, it is now difficult to walk the streets of a major city without having one’s progress captured by some hidden gaitkeeping device. Data about me are stored in thousands of virtual locations…. As that data is reworked, processed through an online algorithm or spat out to somewhere and somewhen to the computer screen of a vigilant operator, my possibilities for action are being shaped.

It is now difficult to walk the streets of a major city without having one’s progress captured by some hidden gaitkeeping device. Data about me are stored in thousands of virtual locations…. As that data is reworked, processed through an online algorithm or spat out to somewhere…, my possibilities for action are being shaped. —Geoffrey Bowker

Page 61

Evaluating data

Predicate        pl/sg
consistent with  7.63
compatible with  6.91
fit              6.47
refute           4.14
confirm          3.77
show that        2.45

Predicate     pl/sg
challenge     1.03
say that      0.54
mean that     0.21
tell us that  0.21

…and the singular predominates when speaking of digital data

Adjective   pl/sg
structured  0.65
encrypted   0.50
secure      0.34
safe        0.26

The history of digital technology: turning data into a singular noun

Page 62

Conclusions

Plural data is most common in genres where data is most frequent, particularly scientific discourse

Variation between plural and singular data in scientific discourse is systematic

Refutes “pedantry,” “stylebook” and “membership badge” explanations

Page 63

Conclusions

Plural predominates when speaking of direct interaction with data, especially in ways particular to scientific practice or reasoning

e.g. “consistent with,” “self-reported,” “collected,” “scattered”

Distribution of "consistent with" in COCA genres

Page 64

Conclusions

Singular predominates when drawing broad conclusions from data; here, “data” often means “results” or “findings”:

This data means that the company should put even more focus on this product line.

This data tells us that such pursuits are futile.

Page 65

Conclusions

Scientists use plural data more than others because they more often interact immediately with data or because a corpuscular conceptualization of data is particular to some forms of scientific reasoning.

i.e., plural-singular disparities reflect conceptual differences

Page 66

Conclusions

Are these conceptual distinctions particular to speakers of English (as opposed to Germans, Italians, etc. who use data only in the plural)?

Of course not—but it would be harder to notice them or validate their existence if language didn’t signal them.

Page 67

Author’s Messages

1. Language encodes tacit cognitive distinctions that (only?) reveal themselves to linguistic analysis. BUT

2. If you think you’re smarter than the English language, you’re apt not to hear what it’s trying to tell you.