The Beginning of something important?: Corpus evidence on the text beginnings of hard news stories Michael Hoey (& Matt O’Donnell)

Post on 03-Jan-2016

Page 1: Michael Hoey (& Matt O’Donnell)

The Beginning of something important?: Corpus evidence on the text beginnings of hard news stories

Michael Hoey (& Matt O’Donnell)

Page 2: Michael Hoey (& Matt O’Donnell)

Dedicated to John Sinclair

Discoverer of collocation

Page 4: Michael Hoey (& Matt O’Donnell)

Becoming a science…

• Observations made with specially designed instrumentation

• Data classification

• Hypotheses that give rise to experimentation

• Unifying theories

Page 5: Michael Hoey (& Matt O’Donnell)

Pre-Darwinian biology

Creationism

Classification

Contradictions

Page 6: Michael Hoey (& Matt O’Donnell)

Pre-Darwinian biology

Dog breeding

Darwin’s finches

Whimsicality of God

Page 7: Michael Hoey (& Matt O’Donnell)

Darwin’s contribution to biology

Not evolution but mechanism for evolution

Page 8: Michael Hoey (& Matt O’Donnell)

Pre-Darwinian linguistics

Creationism

Classification

Contradictions

Page 10: Michael Hoey (& Matt O’Donnell)

Pre-Darwinian linguistics

Language as unity (contrast Harris)

Classification

Contradictions

Page 12: Michael Hoey (& Matt O’Donnell)

Darwin’s finches = collocations

• ubiquity

• apparent arbitrariness

• apparent unnecessariness

Page 16: Michael Hoey (& Matt O’Donnell)

The Lexical Priming claim

Collocation can only be explained if we revise our ideas on how utterances are stored.

Psycholinguistic mainstream thinking is that there are different types of memory and decomposition of received utterances into semantic ‘primes’ [no direct connection]

BUT to explain collocation, we must have “concordances” in the head.

Page 17: Michael Hoey (& Matt O’Donnell)

The Lexical Priming claim

Mental “concordances” also explain

• literary allusion

• quotation

• spoonerisms (know your blows rather than blow your nose)

• recognition of plagiarism

• recognition of non-nativisms

• recognition of creativity

Page 18: Michael Hoey (& Matt O’Donnell)

ago 177,827

• years 100,306
• months 16,564
• weeks 13,925
• year 11,304
• long 7,062
• days 5,343

BUT ALSO

▪ yonks 14
▪ Novembers 11
▪ Thursdays 9

AND ALSO

• defeats 3
• overs 3
• albums 2
• books 1
• budgets 1
• careers 1

AND EVEN

▪ a world ago
▪ three wives ago
▪ 14 Wimbledons ago
▪ 61 victories ago
▪ two Thanksgivings ago
▪ a few years and several stone ago
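A collocate list of this kind can be produced by counting the word that immediately precedes each occurrence of the node word. The sketch below is a minimal illustration only, not the tool behind the figures above: the function name left_collocates, the simple regular-expression tokeniser and the sample text are all assumptions made for the example.

```python
import re
from collections import Counter


def left_collocates(text: str, node: str = "ago") -> Counter:
    """Count the word immediately to the left of each occurrence of `node`."""
    # Crude tokeniser (an assumption): lower-cased runs of letters, digits, apostrophes.
    tokens = re.findall(r"[A-Za-z0-9']+", text.lower())
    counts = Counter()
    for previous, word in zip(tokens, tokens[1:]):
        if word == node:
            counts[previous] += 1
    return counts


# Invented sample text, used only to exercise the function.
sample = (
    "He left two years ago. She called three weeks ago. "
    "They met many years ago, 14 Wimbledons ago in fact."
)
collocates = left_collocates(sample)
print(collocates.most_common())
```

On a real corpus the same count would simply be run over every sentence, which is how rare collocates such as Wimbledons surface alongside frequent ones such as years.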

Page 22: Michael Hoey (& Matt O’Donnell)

The Lexical Priming claim

So how do we get ‘concordances’ in our head?

My claim is that all the pieces of language we encounter prime us so that when we come to use the piece of language ourselves, we are likely (in speech, particularly) to use it in the same kinds of way as it was used in those encounters.

We may be primed so that word clusters are acquired as unities with their own primings, and then learn that they are not always fixed; or we may be primed to recognise collocations and build clusters from them.

Page 23: Michael Hoey (& Matt O’Donnell)

drink all gone

either the child is primed to associate all gone with foods, liquids and then learns that gone may go with nearly

or the child is primed to collocate gone with all and nearly, and has the priming strengthened on each occasion.

Page 25: Michael Hoey (& Matt O’Donnell)

The Lexical Priming claim

Whenever we encounter a word (or syllable or combination of words), we note subconsciously

• the words it occurs with (its collocations),

• the grammatical patterns it occurs in (its colligations),

• the meanings with which it is associated (its semantic associations),

Page 29: Michael Hoey (& Matt O’Donnell)

The Lexical Priming claim

Whenever we encounter a word (or syllable or combination of words), we note subconsciously

• the words it occurs with (its collocations),

• the meanings with which it is associated (its semantic associations),

Page 30: Michael Hoey (& Matt O’Donnell)

all gone with drink

all with gone

years with ago

Page 33: Michael Hoey (& Matt O’Donnell)

all gone with CONSUMABLES

gone with PROPORTION

ago with MEASURE OF TIME

The collocations set up the semantic association, then the semantic association creates the environment for further collocations.

The way we categorise the world is a direct consequence of our primings.

Page 35: Michael Hoey (& Matt O’Donnell)

The Lexical Priming claim

Whenever we encounter a word (or syllable or combination of words), we note subconsciously

• the words it occurs with (its collocations),

• the meanings with which it is associated (its semantic associations),

• the pragmatics it is associated with (its pragmatic associations),

Page 36: Michael Hoey (& Matt O’Donnell)

The Lexical Priming claim

Whenever we encounter a word (or syllable or combination of words), we also note subconsciously

• the grammatical patterns it is associated with (its colligations),

• the genre and/or style and/or social situation it is used in,

• whether it is used in a context we are likely to want to emulate or not

Page 38: Michael Hoey (& Matt O’Donnell)

Colligations

Colligations are an observation made with specially designed instrumentation.

Page 39: Michael Hoey (& Matt O’Donnell)

Colligations

An accumulation of colligations may (& usually does) lead to the creation of a grammar

Page 40: Michael Hoey (& Matt O’Donnell)

20 years ago, 136 summers ago, two seasons ago, three nights ago, twelve months ago:

NUMBER + NNs + ago

a week ago, a year ago, one year ago,

one/a + NN(-s) + ago (a = single number)

nearly six years ago, almost five years ago, just three months ago, exactly a century ago

PREMODIFYING ADVERB of (NON-)APPROXIMATION + NUMBER/one/a + NN(s) + ago

a year or so ago, 10 days or so ago

PREMODIFYING ADVERB of (NON-)APPROXIMATION + NUMBER/one/a + NN(s) + ago

OR

NUMBER/one/a + NN(s) + POSTMODIFYING EXPRESSION of (NON-)APPROXIMATION + ago
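The colligational patterns for ago can be approximated with a regular expression. The sketch below is a rough illustration under stated assumptions: a real analysis would use a part-of-speech tagger, whereas here the word lists NUMBER, ADVERB and POSTMOD, the pattern AGO_PATTERN and the function classify are hypothetical stand-ins covering only the examples quoted above.

```python
import re

# Hypothetical word lists standing in for a real part-of-speech tagger.
NUMBER = r"(?:\d+|one|two|three|five|six|ten|twelve|a|an)"
ADVERB = r"(?:nearly|almost|just|exactly)"
POSTMOD = r"(?:or so)"

AGO_PATTERN = re.compile(
    rf"\b(?:(?P<pre>{ADVERB})\s+)?"   # optional premodifying adverb
    rf"(?P<num>{NUMBER})\s+"          # NUMBER, or one/a
    rf"(?P<noun>[a-z]+)"              # the noun, singular or plural
    rf"(?:\s+(?P<post>{POSTMOD}))?"   # optional postmodifying expression
    r"\s+ago\b",
    re.IGNORECASE,
)


def classify(phrase: str):
    """Return the label of the pattern a phrase instantiates, or None."""
    match = AGO_PATTERN.search(phrase)
    if match is None:
        return None
    if match.group("pre"):
        return "PREMODIFYING ADVERB + NUMBER/one/a + NN(s) + ago"
    if match.group("post"):
        return "NUMBER/one/a + NN(s) + POSTMODIFYING EXPRESSION + ago"
    if match.group("num").lower() in {"a", "an", "one"}:
        return "one/a + NN + ago"
    return "NUMBER + NNs + ago"


for example in ["20 years ago", "a week ago", "nearly six years ago", "a year or so ago"]:
    print(example, "->", classify(example))
```

Phrases that fit none of the patterns, such as long ago, return None, which is a reminder that the patterns describe tendencies in the data and not the full range of what occurs with ago.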

Page 51: Michael Hoey (& Matt O’Donnell)

The Lexical Priming claim

All the features we notice prime us so that when we come to use the word ourselves, we are likely (in speech, particularly) to use it in the same lexical context, with the same grammar, in the same semantic context, as part of the same genre/style, in the same kind of social and physical context, with a similar pragmatics and in similar textual ways.

Page 52: Michael Hoey (& Matt O’Donnell)

The Lexical Priming claim

• Our ability to do this is what it means to know a word.

• We are ALL learners, since we never stop being primed.

• The only difference between the native speaker and the non-native speaker is the way that they are typically primed.

• Fluency is the result of conformity to one’s primings.

Page 54: Michael Hoey (& Matt O’Donnell)

But…

text linguistics is traditionally top-down

and

lexical priming is bottom-up

Page 55: Michael Hoey (& Matt O’Donnell)

Characteristics of text

• text is interactively produced & processed

• text is linearly developed

• text is cohesive

• text is chunked

• text is shaped in the service of particular communities of users

Hoey (2004)

Page 56: Michael Hoey (& Matt O’Donnell)

Three possible relations between text linguistic claims and corpus linguistic observations

1. The relationships (interactive, linear, cohesive, hierarchical and structural) found in a text are independent of the lexis of the language.

Hoey (2004)

Page 57: Michael Hoey (& Matt O’Donnell)

Three possible relations between text linguistic claims and corpus linguistic observations

2. The relationships (interactive, linear, cohesive, hierarchical and structural) found in a text are dependent upon and created by the lexis of the language.

Hoey (2004)

Page 58: Michael Hoey (& Matt O’Donnell)

Three possible relations between text linguistic claims and corpus linguistic observations

3. The relationships (interactive, linear, cohesive, hierarchical and structural) found in a text and the lexis of the language are interdependent.

Page 59: Michael Hoey (& Matt O’Donnell)

Five textual claims about lexis (or lexical claims about text)

1. We are primed to expect every word to enter into or avoid cohesive chains (or cohesive links) [textual collocation]

2. We are primed to associate each (cohesive) word with particular kinds of cohesion [textual collocation]

Page 61: Michael Hoey (& Matt O’Donnell)

Five textual claims about lexis (or lexical claims about text)

3. Every word may be primed for us to occur within a specific semantic relation, e.g. contrast, time sequence, exemplification [textual semantic association]

4. Every word may be primed for us to occur as part of Theme or Rheme in a Theme-Rheme relation [textual colligation]

Page 63: Michael Hoey (& Matt O’Donnell)

Five textual claims about lexis (or lexical claims about text)

5. Every word may be primed for us to occur at the beginning or end of an independently recognised ‘chunk’ of text, e.g. the paragraph, the whole text [textual colligation]

Page 64: Michael Hoey (& Matt O’Donnell)

Important sixth claim about the previous five claims

If a word is primed for us in any of the ways mentioned, these primings may be only (or especially) operative in texts designed for a particular community of users, e.g. academic papers, newspapers.

Page 65: Michael Hoey (& Matt O’Donnell)

The textual priming of hard news stories

Our objective is to test claim 5 exhaustively on a corpus of hard news stories taken from the Guardian:

5. Every word may be primed for us to occur at the beginning or end of an independently recognised ‘chunk’ of text, e.g. the paragraph, the whole text [textual colligation]

Page 66: Michael Hoey (& Matt O’Donnell)

Method: Building positional subcorpora

• Corpus = archive of The Guardian 1998-2004

• Selected ‘Home News’ section for initial investigation of ‘hard news’

• Bell (1991: 147) ‘hard news is news as we all recognize it, and at its core is spot news – tales of accidents, disasters, crimes’.

• approx. 52.1 million words and 113,288 articles

• Corpus contains basic structural markup used by Guardian typesetters (paragraphs)

• Carried out sentence tokenization

Page 67: Michael Hoey (& Matt O’Donnell)

Method: Building positional subcorpora

Anatomy of an article:

Headline

(Subheadline)

sentence ………

sentence ……… sentence ………

sentence ……… sentence ……… sentence ………

sentence ………

….

sentence ……… sentence ………

To test textual colligation, we are particularly interested in:

a. Initial paragraph

b. Initial sentences of paragraphs

Page 69: Michael Hoey (& Matt O’Donnell)

‘Taking the PISC…’

Process each article and extract sentences into:

• TISC – first sentence of first paragraph

• PISC – first sentence of subsequent paragraphs

• NISC – all non paragraph-initial sentences

• SISC – sentences from single sentence paragraphs

• HISC – headline and subheadline material
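The extraction step can be sketched as a single pass over an article's paragraphs. This is a minimal sketch, not the authors' actual pipeline. It assumes that an article arrives as a list of paragraphs, each already split into sentences, that the four sentence subcorpora do not overlap, and that a single-sentence paragraph after the first goes to SISC and not to PISC. The function name split_positional and the sample article are hypothetical, and headline material (HISC) is assumed to be handled separately.

```python
from typing import Dict, List


def split_positional(paragraphs: List[List[str]]) -> Dict[str, List[str]]:
    """Assign every sentence of one article to TISC, PISC, NISC or SISC."""
    subcorpora: Dict[str, List[str]] = {"TISC": [], "PISC": [], "NISC": [], "SISC": []}
    # Ignore empty paragraphs so that index 0 is always the first real paragraph.
    non_empty = [sentences for sentences in paragraphs if sentences]
    for index, sentences in enumerate(non_empty):
        first, rest = sentences[0], sentences[1:]
        if index == 0:
            # Text-initial sentence; the rest of the paragraph is non-initial.
            subcorpora["TISC"].append(first)
            subcorpora["NISC"].extend(rest)
        elif not rest:
            # Assumption: a single-sentence paragraph goes to SISC only.
            subcorpora["SISC"].append(first)
        else:
            subcorpora["PISC"].append(first)
            subcorpora["NISC"].extend(rest)
    return subcorpora


# Invented article: three paragraphs of two, one and three sentences.
article = [["S1.", "S2."], ["S3."], ["S4.", "S5.", "S6."]]
result = split_positional(article)
print(result)
```

Under these assumptions each article contributes exactly one sentence to TISC, which agrees with the corpus figures: 113,288 articles and 113,288 TISC sentences.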

Page 70: Michael Hoey (& Matt O’Donnell)

Table 1 – Summary of positional subcorpora

                  TISC        PISC         SISC         NISC
tokens            3,122,037   12,521,902   17,129,694   19,338,590
types             58,432      127,038      137,322      141,793
sentences         113,288     607,125      555,641      1,064,493
mean (in words)   28          21           31           18
std.dev.          11.11       9.68         23.8         9.88

Page 71: Michael Hoey (& Matt O’Donnell)

Guardian Corpus details (in brief)

           words        sentences
• TISC     3,122,037    113,288
• NISC     19,338,590   1,064,493
• PISC     12,521,902   607,125
• SISC     17,129,694
• HISC     1,273,635
• Total    53,385,858
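The summary figures can be cross-checked with a few lines of arithmetic. The sketch below recomputes the mean sentence length of each subcorpus (tokens divided by sentences) and the overall token total from the counts reported above; the variable names are illustrative assumptions, and the SISC sentence count is the one given in Table 1.

```python
# (tokens, sentences) for each positional subcorpus, as reported in the tables.
subcorpora = {
    "TISC": (3_122_037, 113_288),
    "PISC": (12_521_902, 607_125),
    "SISC": (17_129_694, 555_641),
    "NISC": (19_338_590, 1_064_493),
}
HISC_TOKENS = 1_273_635  # headline material; no sentence count is reported

# Mean sentence length in words, rounded to the nearest whole word.
mean_length = {
    name: round(tokens / sentences)
    for name, (tokens, sentences) in subcorpora.items()
}

# Token totals without and with the headline material.
body_tokens = sum(tokens for tokens, _ in subcorpora.values())
total_tokens = body_tokens + HISC_TOKENS

print(mean_length)
print(body_tokens, total_tokens)
```

The recomputed means agree with the rounded values in Table 1, and the four sentence subcorpora together account for the roughly 52.1 million words of the corpus, with the headline material bringing the total to 53,385,858.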

Page 72: Michael Hoey (& Matt O’Donnell)

British authorities yesterday launched an inquiry into how computer hackers who targeted the cut-price fashion retailers TK Maxx were able to steal information from more than 45 million credit and debit card holders on both sides of the Atlantic.

1st sentence of ‘Inquiry launched after biggest ever credit card heist’ The Guardian Saturday March 31 2007, p.11

Page 76: Michael Hoey (& Matt O’Donnell)

British TISC 7,231 instances

6.4% of TISC sentences

23.2 per 10,000 words of TISC

British 1st or 2nd word of sentence in TISC

1,863 = 25.8% of TISC occurrences

Page 77: Michael Hoey (& Matt O’Donnell)

British NISC 16,124 instances

1.5% of NISC sentences

8.3 per 10,000 words of NISC

British 1st or 2nd word of sentence in NISC

1,788 = 11.1% of NISC occurrences

Page 78: Michael Hoey (& Matt O’Donnell)

Proportionally British appears between 2½ and 3 times more often in text-initial sentences than in non-initial sentences in Guardian text, and 4 times as many text-initial sentences contain British as non-initial.

So Guardian writers (and readers) are primed to use British to start a news text.

When it is used in text-initial sentences, it is also 2½ times more likely to be in the first two words of the sentence than when it occurs in non-initial sentences.

We begin texts with British.
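The proportions quoted for British follow directly from the raw counts. The sketch below recomputes them; the helper names per_10k and pct and the dictionary layout are assumptions made for the example. The percentage of sentences is computed as instances divided by sentences, which treats each instance as falling in a different sentence.

```python
def per_10k(instances: int, tokens: int) -> float:
    """Frequency normalised to occurrences per 10,000 words."""
    return instances / tokens * 10_000


def pct(part: int, whole: int) -> float:
    """Part as a percentage of whole."""
    return part / whole * 100


# Counts for British taken from the slides above.
tisc = {"tokens": 3_122_037, "sentences": 113_288, "hits": 7_231, "first_two": 1_863}
nisc = {"tokens": 19_338_590, "sentences": 1_064_493, "hits": 16_124, "first_two": 1_788}

# How much more frequent British is in text-initial sentences, per word.
rate_ratio = per_10k(tisc["hits"], tisc["tokens"]) / per_10k(nisc["hits"], nisc["tokens"])

# How many more text-initial sentences contain it.
sentence_ratio = pct(tisc["hits"], tisc["sentences"]) / pct(nisc["hits"], nisc["sentences"])

# How much more often it falls in the first two words of the sentence.
position_ratio = pct(tisc["first_two"], tisc["hits"]) / pct(nisc["first_two"], nisc["hits"])

print(round(rate_ratio, 2), round(sentence_ratio, 2), round(position_ratio, 2))
```

Normalising per 10,000 words matters here because the non-initial subcorpus is about six times the size of the text-initial one, so raw counts alone would hide the effect.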

Page 81: Michael Hoey (& Matt O’Donnell)

authorities TISC 574 instances

0.5% of TISC sentences

1.8 per 10,000 words of TISC

authorities in first third of sentence in TISC

29.6% of TISC occurrences

Page 82: Michael Hoey (& Matt O’Donnell)

authorities NISC 3564 instances

0.3% of NISC sentences

1.8 per 10,000 words of NISC

authorities in first third of sentence in NISC

29.2% of NISC occurrences

Page 83: Michael Hoey (& Matt O’Donnell)

Text-initial sentences contain authorities no more frequently than non-initial sentences.

But Guardian writers are primed to use it slightly differently when they do use it in text-initial sentences.

Page 85: Michael Hoey (& Matt O’Donnell)

TISC authorities 574

Semantic association

nation/region 168 29.3%

social welfare 109 19.0%

Page 86: Michael Hoey (& Matt O’Donnell)

NISC authorities 3564

Semantic association

nation/region 726 20.3%

social welfare 972 27.3%

Page 89: Michael Hoey (& Matt O’Donnell)

yesterday TISC 34,646

30.6% of all text-initial sentences in news stories in the Guardian contain yesterday.

yesterday NISC 13,363

1.3% of all non-initial sentences in news stories in the Guardian contain yesterday.

NOT AS OBVIOUS AS IT SEEMS.

Page 94: Michael Hoey (& Matt O’Donnell)

TISC launch* 2317

2.0% of sentences (1 in 49)

NISC launch* 3684

0.35% of sentences (1 in 289)

Proportionally, nearly 6 times as many text-initial sentences as non-initial sentences contain launch*.

Page 97: Michael Hoey (& Matt O’Donnell)

TISC launched 1503

1.3% of sentences (1 in 75)

NISC launched 1707

0.16% of sentences (1 in 623)

Proportionally over 8 times more text-initial sentences contain launched than non-initial

Page 100: Michael Hoey (& Matt O’Donnell)

TISC launched 1503

                                       AFTER   BEFORE   TOTAL
launched yesterday                       117      248     365 (24.2%)
launched today                            57        -      57 ( 3.8%)
launched last night                       18       70      88 ( 5.9%)
TOTAL (yesterday, today, last night)                      510 (33.9%)

launched this week          14
launched on DATE/TIME        6
launched in YEAR             6
launched X years ago         6
launched tomorrow            5
launched next week           3
launched next year           3
launched later this year     3
launched last week           3
launched in MONTH            3
launched this summer         2
launched X years ago         2
launched next month          2
launched last year           2

launched next MONTH, this year, X days ago, over the weekend, later this month, last MONTH, last month, in X months, at the weekend, as early as next year, at the end of the month, at any time, at CLOCK TIME    1

Page 101: Michael Hoey (& Matt O’Donnell)

NISC launched 1787

                                       AFTER   BEFORE   TOTAL
launched yesterday                        17       23      40 ( 2.2%)
launched today                             7        1       8 ( 0.4%)
launched last night                        3        3       6 ( 0.3%)
TOTAL (yesterday, today, last night)                       54 ( 3.0%)

launched this week           9
launched on DATE/TIME       22
launched in YEAR            55
launched X years ago         9
launched tomorrow            3
launched next week           3
launched next year           3
launched later this year     4
launched last week           9
launched in MONTH           31
launched this summer         1
launched X years ago         9
launched next month          8
launched last year          12

launched next MONTH, this year, X days ago, over the weekend, later this month, last MONTH, last month, in X months, at the weekend, as early as next year, at the end of the month, at any time, at CLOCK TIME    1

Page 104: Michael Hoey (& Matt O’Donnell)

1. Guardian writers are primed to use launch in text-initial sentences. Proportionally, 6 times as many text-initial sentences contain the word as non-initial sentences.

2. If any of the time adjuncts yesterday, today and last night is used with launched, it is 10 times more likely to be part of a text-initial sentence than of a non-initial one.

3. If the time adjunct chosen specifies a year or a month, it is over 9 times more likely to occur in a non-initial sentence
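
The proportional comparison behind finding 2 can be checked from the counts on the two preceding tables. The sketch below is illustrative only: the helper name `share` is our own, and the figures are the slide totals for launched with yesterday, today or last night (510 of 1503 text-initial instances, 54 of 1787 non-initial instances).

```python
def share(hits: int, total: int) -> float:
    """Proportion of instances of the word that show the feature."""
    return hits / total


# Counts taken from the slides: launched + yesterday / today / last night
tisc_share = share(510, 1503)   # text-initial sentence corpus
nisc_share = share(54, 1787)    # non-initial sentence corpus

# How many times larger the text-initial proportion is
ratio = tisc_share / nisc_share

print(f"TISC {tisc_share:.1%}, NISC {nisc_share:.1%}, ratio {ratio:.1f}")
```

On these counts the ratio comes out a little above 11, which is in line with the "10 times more likely" stated above.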

Page 105: Michael Hoey (& Matt O’Donnell)

British authorities yesterday launched an inquiry into how computer hackers who targeted the cut-price fashion retailers TK Maxx were able to steal information from more than 45 million credit and debit card holders on both sides of the Atlantic.

1st sentence of ‘Inquiry launched after biggest ever credit card heist’ The Guardian Saturday March 31 2007, p.11

Page 107: Michael Hoey (& Matt O’Donnell)

TISC launch* 2317 2.0% of sentences (1 in 49)

HAVE launched a(n) * inquiry 30
launched a(n) * inquiry <yesterday> 23
launched a(n) * inquiry last night 17
is (*) to launch a(n) * inquiry 17
will launch a(n) * inquiry 2
launched a(n) * inquiry [NO TIME ADJ] 2
other (+ an + inquiry) 1

a(n) (*) inquiry HAVE been launched 14
a(n) (*) inquiry was launched <yesterday> 15
a(n) (*) inquiry was launched <last night> 5
a(n) (*) inquiry is to be launched 6
a(n) inquiry was launched [NO TIME ADJ] 6

other (+ inquiry) 8

TOTAL 146, 6.3% of launch* (1 in 16 instances of launch* occur with inquiry)

Page 108: Michael Hoey (& Matt O’Donnell)

NISC launch* 3684 0.35% of sentences (1 in 289)

HAVE (already) launched a(n) * inquiry 33
launched a(n) * inquiry <yesterday> 1
launched a(n) * inquiry last week/month 3 [no last night]
is (*) to launch a(n) * inquiry 14
will launch a(n) * inquiry 2
launched a(n) * inquiry [NO TIME ADJ] 6
launched a(n) * inquiry recently 2
other (+ an + inquiry) 10

a(n) (*) inquiry HAVE been launched 14
a(n) (*) inquiry was launched <yesterday> 2
a(n) (*) inquiry was launched <last wk, x mnth, Fr> 1
a(n) (*) inquiry will be launched 1
a(n) inquiry was launched [NO TIME ADJ] 7

other (+ inquiry) 11

TOTAL 111, 3.0% of launch* (1 in 33 instances of launch* occur with inquiry)

Page 109: Michael Hoey (& Matt O’Donnell)

4. If launch is used, it is twice as likely to be used in the expression launch* an inquiry in text-initial sentences as in non-initial sentences.

5. The expression launch* an inquiry is proportionally 6 times as likely to occur with a time adjunct in text-initial sentences as in non-initial sentences.

This illustrates the way primings nest.

Page 110: Michael Hoey (& Matt O’Donnell)

British authorities yesterday launched an inquiry into how computer hackers who targeted the cut-price fashion retailers TK Maxx were able to steal information from more than 45 million credit and debit card holders on both sides of the Atlantic.

1st sentence of ‘Inquiry launched after biggest ever credit card heist’ The Guardian Saturday March 31 2007, p.11

Page 112: Michael Hoey (& Matt O’Donnell)

inquiry TISC 1655

1 in every 68 text-initial sentences

inquiry NISC 4196

1 in every 254 non-initial sentences

Page 114: Michael Hoey (& Matt O’Donnell)

British authorities yesterday launched an inquiry into how computer hackers who targeted the cut-price fashion retailers TK Maxx were able to steal information from more than 45 million credit and debit card holders on both sides of the Atlantic.

1st sentence of ‘Inquiry launched after biggest ever credit card heist’ The Guardian Saturday March 31 2007, p.11

Page 116: Michael Hoey (& Matt O’Donnell)

into how TISC 62 0.054%

top collocations a(n)

inquiry

the

investigation

yesterday

Page 121: Michael Hoey (& Matt O’Donnell)

into how NISC 58 0.0054%

TISC 10 times more likely to contain into how

Page 122: Michael Hoey (& Matt O’Donnell)

British authorities yesterday launched an inquiry into how computer hackers who targeted the cut-price fashion retailers TK Maxx were able to steal information from more than 45 million credit and debit card holders on both sides of the Atlantic.

1st sentence of ‘Inquiry launched after biggest ever credit card heist’ The Guardian Saturday March 31 2007, p.11

Page 124: Michael Hoey (& Matt O’Donnell)

computer TISC 552 0.49%

1 in 205 sentences 1 in every 5656 words

computer NISC 1929 0.18%

1 in 552 sentences 1 in every 10,025

So computer occurs in text-initial sentences 2.7 times more frequently than in non-initial sentences

Page 127: Michael Hoey (& Matt O’Donnell)

British authorities yesterday launched an inquiry into how computer hackers who targeted the cut-price fashion retailers TK Maxx were able to steal information from more than 45 million credit and debit card holders on both sides of the Atlantic.

1st sentence of ‘Inquiry launched after biggest ever credit card heist’ The Guardian Saturday March 31 2007, p.11

Page 129: Michael Hoey (& Matt O’Donnell)

hackers TISC 30

1 in 3776 sentences

hackers NISC 44

1 in 24193 sentences

So hackers occurs in text-initial sentences 6.4 times more frequently than in non-initial sentences
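
The "times more frequently" figures follow directly from the "1 in N sentences" rates. A minimal sketch, assuming only the rates quoted on the slides (the function name `times_more_frequent` is our own):

```python
def times_more_frequent(one_in_initial: float, one_in_non_initial: float) -> float:
    """Given a word that occurs once per `one_in_initial` text-initial sentences
    and once per `one_in_non_initial` non-initial sentences, return how many
    times more frequent it is in text-initial sentences."""
    return one_in_non_initial / one_in_initial


# "1 in N sentences" figures as quoted on the slides: (TISC, NISC)
rates = {
    "computer": (205, 552),
    "hackers": (3776, 24193),
    "cut-price": (4720, 18675),
}

ratios = {word: times_more_frequent(t, n) for word, (t, n) in rates.items()}

for word, r in ratios.items():
    print(f"{word}: {r:.1f} times more frequent in text-initial sentences")
```

Rounded to one decimal place, these reproduce the 2.7, 6.4 and 4.0 given on the slides for computer, hackers and cut-price.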

Page 132: Michael Hoey (& Matt O’Donnell)

hackers in TISC collocates with computer in 1 in 5 cases

hackers in NISC collocates with no lexical item, despite the raw number of instances being larger than in TISC

Page 133: Michael Hoey (& Matt O’Donnell)

British authorities yesterday launched an inquiry into how computer hackers who targeted the cut-price fashion retailers TK Maxx were able to steal information from more than 45 million credit and debit card holders on both sides of the Atlantic.

1st sentence of ‘Inquiry launched after biggest ever credit card heist’ The Guardian Saturday March 31 2007, p.11

Page 135: Michael Hoey (& Matt O’Donnell)

who TISC 11,474

10.1% of all TISC sentences

1 instance every 272 words

who NISC 58,557

5.5% of all NISC sentences

1 instance every 330 words

NEEDS FURTHER INVESTIGATION

Page 139: Michael Hoey (& Matt O’Donnell)

British authorities yesterday launched an inquiry into how computer hackers who targeted the cut-price fashion retailers TK Maxx were able to steal information from more than 45 million credit and debit card holders on both sides of the Atlantic.

1st sentence of ‘Inquiry launched after biggest ever credit card heist’ The Guardian Saturday March 31 2007, p.11

Page 141: Michael Hoey (& Matt O’Donnell)

targeted TISC 143

1 in 792 sentences

1 per 21832 words

targeted NISC 770

1 in 1382 sentences

1 per 25115 words

So targeted occurs in text-initial sentences 1.7 times more frequently than in non-initial sentences

BUT there is no difference if words used as base
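
The effect of the choice of base can be seen by computing the ratio twice, once from the per-sentence rates and once from the per-word rates. This is a sketch using only the figures on this slide; the variable names are our own.

```python
# "1 in N" rates for targeted, as given on the slide
SENTENCES_PER_HIT = {"TISC": 792, "NISC": 1382}
WORDS_PER_HIT = {"TISC": 21832, "NISC": 25115}


def tisc_to_nisc_ratio(units_per_hit: dict) -> float:
    """How many times more frequent the word is in TISC than in NISC,
    for a given base (sentences or words)."""
    return units_per_hit["NISC"] / units_per_hit["TISC"]


sentence_based = tisc_to_nisc_ratio(SENTENCES_PER_HIT)
word_based = tisc_to_nisc_ratio(WORDS_PER_HIT)

print(f"sentence base: {sentence_based:.2f}")
print(f"word base:     {word_based:.2f}")
```

With sentences as the base the ratio is about 1.7; with words as the base it drops to roughly 1.15, which is why text-initial sentence length has to be allowed for.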

Page 143: Michael Hoey (& Matt O’Donnell)

British authorities yesterday launched an inquiry into how computer hackers who targeted the cut-price fashion retailers TK Maxx were able to steal information from more than 45 million credit and debit card holders on both sides of the Atlantic.

1st sentence of ‘Inquiry launched after biggest ever credit card heist’ The Guardian Saturday March 31 2007, p.11

Page 145: Michael Hoey (& Matt O’Donnell)

cut-price TISC 24

1 in 4720 sentences

cut-price NISC 57

1 in 18,675 sentences

So cut-price occurs in text-initial sentences 4.0 times more frequently than in non-initial sentences

Page 149: Michael Hoey (& Matt O’Donnell)

British authorities yesterday launched an inquiry into how computer hackers who targeted the cut-price fashion retailers TK Maxx were able to steal information from more than 45 million credit and debit card holders on both sides of the Atlantic.

1st sentence of ‘Inquiry launched after biggest ever credit card heist’ The Guardian Saturday March 31 2007, p.11

Page 151: Michael Hoey (& Matt O’Donnell)

fashion TISC 430

1 in 263 sentences

fashion NISC 1171

1 in 9449 sentences

So fashion occurs in text-initial sentences 36 times more frequently than in non-initial sentences

Page 154: Michael Hoey (& Matt O’Donnell)

British authorities yesterday launched an inquiry into how computer hackers who targeted the cut-price fashion retailers TK Maxx were able to steal information from more than 45 million credit and debit card holders on both sides of the Atlantic.

1st sentence of ‘Inquiry launched after biggest ever credit card heist’ The Guardian Saturday March 31 2007, p.11

Page 156: Michael Hoey (& Matt O’Donnell)

retailers TISC 51

1 in 2221 sentences

retailers NISC 290

1 in 3671 sentences

So, allowing for greater length of TISC sentences, retailers occurs no more frequently in text-initial sentences than in non-initial sentences.

Page 159: Michael Hoey (& Matt O’Donnell)

British authorities yesterday launched an inquiry into how computer hackers who targeted the cut-price fashion retailers TK Maxx were able to steal information from more than 45 million credit and debit card holders on both sides of the Atlantic.

1st sentence of ‘Inquiry launched after biggest ever credit card heist’ The Guardian Saturday March 31 2007, p.11

Page 160: Michael Hoey (& Matt O’Donnell)

TK Maxx TISC 1

1 in 2221 sentences

TK Maxx NISC 1

1 in 3671 sentences

So, allowing for greater length of TISC sentences, cut-price occurs no more frequently in text-initial sentences than in non-initial sentences

Page 161: Michael Hoey (& Matt O’Donnell)

1. British [sp 4, cr 7, sp 15, sr 27, cr A9] authorities (gen – spec?? 10) yesterday (co-hyp 14, 15) launched an inquiry [cp 12, sp 21] into how computer [sr 2] hackers [spec – gen 3, cp 5, cr 14, spec-gen 17, 20, cr 20, spec-gen 24, cr A3, sr A6, spec-gen A7] who targeted the cut-price fashion retailer [spec –gen 2, cp 3, 7 (x2), 8, spec-gen 14, cp 23, spec – gen 26] TK Maxx [sr 3, 5, 7, spec – gen 7, 9, 14, 15, sr 21, 23, cr 25, sr A2, spec – gen A3, pro A4, spec-gen A6, A8] were able to steal [cp 2, 3, sr 6, 9, cp 19, sr 20, cp 23] information [sp 3, 5, 6, sr 9, 10, 11, 14, sp 14, sr 21, 26, sp A9] from more than 45 million credit [sr 2, 3, 17, 20, 23, 24, 25, A2] and debit [sr 3, A2] card [sr 2, 6, cr 7, sr 17, 20, 23, 24, 26, A2] holders on both sides of the Atlantic [cp 5, 6, 7, 8, 11].

Page 163: Michael Hoey (& Matt O’Donnell)

information TISC 620

0.55% of sentences

information NISC 5701

0.54% of sentences

Page 164: Michael Hoey (& Matt O’Donnell)

credit TISC 154

0.14% of sentences

credit NISC 1346

0.13% of sentences

Page 165: Michael Hoey (& Matt O’Donnell)

card TISC 209

0.18% of sentences

card NISC 1178

0.11% of sentences

Page 167: Michael Hoey (& Matt O’Donnell)

So it may be that we are primed to expect certain words to be cohesive and others to chunk the discourse.

Page 168: Michael Hoey (& Matt O’Donnell)

launched – Problem-Solution patterns

attack 109
campaign 107
against 82
appeal 40
attacks 16
urgent 16
challenge 16
drive 15
assault 14
fight 13
strike 12
crackdown 10
offensive 10
strategy 8
rescue 7
fightback 6

Page 169: Michael Hoey (& Matt O’Donnell)

launched – Gap in Knowledge-Filling patterns

investigation 149
inquiry 120
into 142
after 192
hunt 37
allegations 13
claim 13
?complaints 11
search 8
investigations 6
?find 5
study 5
test case 5

Page 170: Michael Hoey (& Matt O’Donnell)

TISC launched 1503

GAP IN KNOWLEDGE - FILLING

into 134 (8.9%)

(excluding projection meaning e.g. launched himself into)

after 177 (11.8%)

Page 171: Michael Hoey (& Matt O’Donnell)

NISC launched 1787

GAP IN KNOWLEDGE - FILLING

into 108 (6.0%)

(excluding projection meaning e.g. launched himself into)

after 46 (2.6%)

Page 172: Michael Hoey (& Matt O’Donnell)

In a sample of 50 texts where targeted is text-initial, 31 occur as part of a Problem-Solution pattern

Page 173: Michael Hoey (& Matt O’Donnell)

So it may be that we are primed to expect certain words to have textual semantic associations. These, like targeted, may not be involved in cohesion or text chunking.

Page 174: Michael Hoey (& Matt O’Donnell)

Examining text-initial keywords

• Any items deemed ‘key’ (Scott, 2001) in TISC (with NISC as reference corpus) are candidate words with text-initial priming

• Examples:

– yesterday ‘Tony Blair yesterday revealed…’

– fresh ‘Fresh evidence of the involvement…’

– branded ‘NORMAN Tebbit was branded ‘paranoid’…’

– announced ‘British scientists today announced they had…’
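
Keyness compares how often an item occurs in a study corpus (here TISC) with how often it occurs in a reference corpus. One statistic commonly used for this is log-likelihood; the sketch below implements the two-cell form of that statistic. Whether this is the exact statistic behind the keyness scores reported in these slides is an assumption, and the corpus sizes in the example call are hypothetical placeholders, not figures from the study.

```python
import math


def log_likelihood(freq_study: int, size_study: int,
                   freq_ref: int, size_ref: int) -> float:
    """Two-cell log-likelihood (G2) score for one item.

    freq_*: occurrences of the item in each corpus
    size_*: total number of items (e.g. tokens) in each corpus
    """
    total_freq = freq_study + freq_ref
    total_size = size_study + size_ref
    # Expected frequencies if the item were spread evenly over both corpora
    expected_study = size_study * total_freq / total_size
    expected_ref = size_ref * total_freq / total_size

    score = 0.0
    for observed, expected in ((freq_study, expected_study),
                               (freq_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            score += observed * math.log(observed / expected)
    return 2 * score


# Hypothetical corpus sizes, for illustration only
example = log_likelihood(freq_study=1482, size_study=3_000_000,
                         freq_ref=389, size_ref=15_000_000)
print(f"keyness (log-likelihood): {example:.2f}")
```

The score is zero when the item has the same relative frequency in both corpora and grows as the two relative frequencies diverge, so a high score for an item in TISC marks it as a candidate for text-initial priming.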

Page 175: Michael Hoey (& Matt O’Donnell)

Key clusters in TISC (against PISC)

Key cluster Freq. % RC. Freq. RC. % Keyness

# # # 2,734 0.09 1,488 0.01 3,989.79

ACCORDING TO A 1,482 0.05 389 3,033.75

LAST NIGHT AFTER 1,096 0.03 92 2,923.47

A # YEAR 1,823 0.06 1,724 0.01 1,725.59

IT EMERGED YESTERDAY 693 0.02 92 1,705.62

WAS LAST NIGHT 712 0.02 142 1,587.87

ARE TO BE 849 0.03 298 1,553.02

# YEAR OLD 2,622 0.08 4,087 0.03 1,288.79

LAST NIGHT WHEN 537 0.02 91 1,250.30

THE MURDER OF 855 0.03 468 1,243.07

Page 177: Michael Hoey (& Matt O’Donnell)

according to a

Table 3 – Occurrences of ‘according to a’ in positional subcorpora

TISC PISC SISC NISC

occurrences 1482 389 637 485

per 10000 sent. 130.82 6.41 11.46 4.56

• confirming text-initial keyness

Page 179: Michael Hoey (& Matt O’Donnell)

Textual colligation of according to a

• In text-initial position, according to a appears to be strongly primed to avoid Theme

– only 4 out of 1482 sentences in TISC begin ‘According to a…’

– this is 0.27%

Page 183: Michael Hoey (& Matt O’Donnell)

Textual colligation of According to a

• Elsewhere, according to a appears to be strongly primed for Theme

PISC 126 32.47%

SISC 190 29.83%

NISC 125 25.83%

TISC 4 0.27%

Page 184: Michael Hoey (& Matt O’Donnell)

Textual colligation of according to a

• In text-initial position, according to a appears to be strongly primed to occur in second half of a sentence

in 1340 of TISC sentences (90.42%), according to a occurs at >= 50% position in the sentence

compare: PISC 202 52.06%

SISC 349 54.79%

NISC 287 59.30%
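
Both positional measures used here (whether the phrase is sentence-initial, and whether it falls at or beyond the 50% point of the sentence) can be computed per sentence. The sketch below is an assumption about how this might be done: it measures position in whitespace-separated tokens, which may differ from the measure actually used, and the two example sentences are invented for illustration.

```python
from typing import Optional, Tuple


def phrase_position(sentence: str, phrase: str) -> Optional[Tuple[bool, float]]:
    """Locate the first occurrence of `phrase` in `sentence`.

    Returns (is_sentence_initial, relative_position), where relative_position
    is the index of the phrase's first token divided by the sentence length in
    tokens. Returns None if the phrase does not occur.
    """
    tokens = sentence.lower().split()
    target = phrase.lower().split()
    n = len(target)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == target:
            return i == 0, i / len(tokens)
    return None


# Invented example sentences
late = "Children in Britain are reading far fewer books, according to a survey published today."
early = "According to a study, sleep matters."

for s in (late, early):
    initial, position = phrase_position(s, "according to a")
    print(f"initial={initial}, position={position:.0%}, second half={position >= 0.5}")
```

Applied to every sentence in a positional subcorpus, the share of sentences with `initial` true gives the sentence-initial percentage, and the share with `position >= 0.5` gives the second-half percentage.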

Page 187: Michael Hoey (& Matt O’Donnell)

according to a

according to a appears to be strongly primed for semantic association with WRITTEN

RESEARCH SOURCE

Page 188: Michael Hoey (& Matt O’Donnell)

according to a (1482)

report 449
survey 331
study 221
poll 120
paper 24
book 21
document 19
investigation 14

TOTAL 1199 (81%)

Page 190: Michael Hoey (& Matt O’Donnell)

according to a

new X 133

leaked X 24

damning X 16

controversial X 16

Page 192: Michael Hoey (& Matt O’Donnell)

TISC

according to a * published

263

according to a survey published today

report yesterday

study this week

NISC

37

Page 194: Michael Hoey (& Matt O’Donnell)

according to a * published

             TISC  NISC
TOTAL         263    37
report        103    12
survey         85    12
study          45     8
poll           20     4
paper           5     1
book            3
list            1
blueprint       1

Page 196: Michael Hoey (& Matt O’Donnell)

according to a * published

TISC 263
today 141
yesterday 106
this week 3
tomorrow 3
last night 2
next month 1
none 7

NISC 37
today 16
yesterday 14
this week 2
last week 1
last night 1
on Thursday 1
none 2

Page 197: Michael Hoey (& Matt O’Donnell)

According to a

When according to a is not at the end of Rheme, it is strongly primed to occur in the structure

according to a WRITTEN SOURCE WHICH VERB OF SPEECH OR CLAIM

e.g.

according to a study that suggests there may be…

according to a survey which shows they expect…

Page 198: Michael Hoey (& Matt O’Donnell)

According to a

In text-initial position, according to is not primed for exact repetition with itself in chains and is only very weakly primed for exact repetition with itself in cohesive links

• Examining 139 articles with according to a study in TISC, we found:

– 13 with one repetition of according to (9%)

– 3 with two repetitions of according to (2%)

Page 199: Michael Hoey (& Matt O’Donnell)

According to a

In text-initial position, according to a is primed for cohesive chains of complex paraphrase (according to = says)

i.e. one expects lexis of source, claim and statement

Page 201: Michael Hoey (& Matt O’Donnell)

So what are the implications?

1. Text chunking and cohesion are interrelated.

2. We start top down with the need to write (for example) a Guardian news story about credit card fraud. The words we have in mind when starting are primed for us to be cohesive.

3. We then are primed to use certain other words to start our text (& our paragraphs).

4. If it can be demonstrated that certain words are primed to be cohesive or have textual semantic associations, then we have a unifying theory. But that is another story…