This is the author’s postprint. Copyright is now held by Edinburgh University Press. Final version is available at http://www.euppublishing.com/doi/abs/10.3366/cor.2014.0049.

A, an and the environments in Spoken Korean English

Glenn Hadikin

School of Languages and Area Studies

University of Portsmouth

Park Building

King Henry 1 Street

Portsmouth PO1 2DZ

[email protected]

This paper comprises an analysis of small corpora of spoken Korean English: a burgeoning

New English that is rarely discussed in published articles. With a theoretical framework

based on Hoey’s Theory of Lexical Priming (Hoey 2005), the lexical environment

surrounding the items a, an and the in two Korean corpora (one comprising Korean English

speakers in Liverpool, England and the other speakers in Seoul, Korea) is compared with

two British comparator corpora. The results show a balance of differences and similarities

between the Korean corpora, which may suggest that, while Korean English is distinct from

British varieties, recent priming effects and the L1 are interacting in complex ways that give

each corpus a unique identity.

1 Introduction

The Republic of Korea, or South Korea, is a small country situated between China and Japan

and has a population of just under 50 million¹. The people speak Korean, considered to

be either an isolated language or part of the Altaic group that includes Turkic and Japonic

¹ Population data taken from http://www.worldatlas.com/aatlas/populations/ctypopls.htm on 24/3/12

languages (Lee and Ramsey 2000), and, as Porter (2011) reports, ‘English mania’ has now

become so widespread that people have even had surgery on their tongue in the hope that it

will improve pronunciation. The search term “English in Korea” gets notably more hits on

Google than the term “English in China” (1.6 million and 618 000 respectively²); English has

been taught to all middle and high school students since 1945 and in all elementary schools

since 1997 (Tollefson 2002) but until very recently it was rarely listed as a World English in

reference books such as Crystal (2003:111). Crystal reproduces a ‘circle of World Englishes’

from McArthur (1987) that includes 51 regional varieties including Appalachian and Inuit

English as well as Chinese and Japanese English but it does not mention the English used in

South Korea whatsoever (in this paper the terms Korea and South Korea are both used to

refer to the Korean republic). The large number of teaching positions being advertised online

at the time of writing, however, is a reflection of the level of interest in English³ and the

extent to which it is used in Korea in the 21st century is a reflection of Korea’s developing

multiculturalism (Mehlsen 2011 provides a useful summary) where English is used more and

more outside of the classroom albeit typically in groups where at least one speaker does not

speak Korean.

Korean English has more recently been discussed as part of a World Englishes model (see

Kachru and Nelson 2006) and the variety will be discussed in this light throughout this paper

i.e. although Korean speakers often refer to external norms it is a developing form of English

in its own right and its unique features can be described and discussed without necessarily

being seen as erroneous. This is in contrast to discussions of Konglish (a disparaging term

used to describe a mixture of Korean and English). Korean Learner English, comprising the

features described in Lee (2001), is a more positive construct that highlights certain cases

where the L1 affects Korean people’s English, but it still carries the implicit message

that features of English unique to Korea or East Asia are problematic. Previous Korean

English studies have tended to focus on pronunciation (see Yeni-Komshian, Flege and Liu

2000 for example) or ideological and pedagogical issues of teaching English such as Park’s

(2009) study which claims three underlying ideologies in Korean English: necessitation is the

idea that English is a necessary tool for success in a global economy, externalisation suggests

that English is often still seen as the language of the other and can conflict with a Korean

² Google searches conducted on 24/3/12

³ See travelandteachrecruiting.com and teachkoreans.com as examples

identity and, finally, a shared ideology of self-deprecation: that no matter what they do

Koreans see themselves as poor at English (Park 2009).

A study that explicitly argues that Korea now has a form of codified English that is taught in

schools is Shim (1999). Shim highlights a variety of usages that are found in Korean English

textbooks ranging from lexico-semantic differences (on life used as a synonym for alive e.g.

gardens come on life again) to grammatical differences such as her claim that the simple

present tense is not differentiated from the progressive form; as an example Shim reports that

the following exchange would be acceptable in codified Korean English:

Q What happens to the grass and trees when spring comes?

A The grass is turning green and trees are budding with fresh leaves.

(Shim 1999: 253)

Shim (1999) discusses articles twice under a heading of morpho-syntactic differences

(between Korean and American English): the first is her suggestion that students are taught

that a noun phrase must be preceded with the definite article when the noun phrase contains a

relative clause so he is the man who can help other people (Shim’s example) must be used

rather than he is a man who can help other people. Shim’s second point regarding articles is

that Korean English allows for more variation in terms of count/noncount nouns and gives the

example although it is a hard work, I enjoy it as an acceptable structure. For the purposes of

this paper I accept Shim’s claims of codification as evidence that Korean English has begun

to separate from related varieties. Note, however, that this study is now over twelve years old

and there has not, to my knowledge, been a corpus-driven study published that highlights

Korean English as it is actually used in the 21st century.

With this lack of corpus-driven studies in mind I collected and transcribed two corpora of

Korean spoken English for my PhD. The motivation for creating two corpora was to allow

me to explore similarities and differences between Korean English as spoken by volunteers in

Korea itself and that of comparable Korean volunteers speaking English in the UK. The

theoretical basis for this paper is Hoey’s Lexical Priming which postulates that:

As a word is acquired through encounters with it in speech and writing, it becomes

cumulatively loaded with the contexts and co-texts in which it is encountered, and our

knowledge of it includes the fact that it co-occurs with certain other words in certain

kinds of context.

(Hoey 2005:8)

Lexical Priming repositions the related phenomena of collocation and colligation at the very

heart of language so that even traditional grammar is seen as a secondary output. The

following ten priming hypotheses are posited:

1. Every word is primed to occur with particular other words; these are its collocates.

2. Every word is primed to occur with particular semantic sets; these are its semantic associations.

3. Every word is primed to occur in association with particular pragmatic functions; these are its pragmatic associations.

4. Every word is primed to occur in (or avoid) certain grammatical positions, and to occur in (or avoid) certain grammatical functions; these are its colligations.

5. Co-hyponyms and synonyms differ with respect to their collocations, semantic associations and colligations.

6. When a word is polysemous, the collocations, semantic associations and colligations of one sense of the word differ from those of its other senses.

7. Every word is primed for use in one or more grammatical roles; these are its grammatical categories.

8. Every word is primed to participate in, or avoid, particular types of cohesive relation in a discourse; these are its textual collocations.

9. Every word is primed to occur in particular semantic relations in the discourse; these are its textual semantic associations.

10. Every word is primed to occur in, or avoid, certain positions within the discourse; these are its textual colligations.

Reproduced from Hoey (2012)

Hoey (2005) argues that cultures harmonise their primings in three key ways: formal

education, shared literary and religious traditions and the mass media. If we are primed, then,

by television, radio, adverts, our friends, teachers, neighbours and family members (indeed

by every single instance of language we are exposed to), it would be reasonable to expect

measurable differences in the language used by two communities in two different countries

even when they share a first language and cultural background. This study was developed to

test such a hypothesis as well as to explore the level of similarity between the corpora. The

following research questions have guided the study:

1 What are the key similarities between the two Korean English corpora in terms of the

lexical environment around the articles a, an and the?

2 What are the key differences between the Korean corpora and two British reference corpora

in terms of the lexical environment around articles?

3 To what extent can Lexical Priming theory account for any observed variation?

Hoey (2005) makes use of selected 2-grams and 3-grams (amongst others) in his own

analysis. On the face of it an analysis of function words may seem surprising (not least when

one notes Hoey’s claim that every word has pragmatic and semantic associations) but Hoey

provides a convincing argument that in the winter has different semantic primings from in

winter: in his corpus of news articles from The Guardian, the former tends to occur with

material process verbs whereas the latter tends to occur with relational process verbs, which

highlights the potential priming effects of the. The focus on function words is also likely to

minimise variation depending on the topics discussed during data collection.

Note Hoey’s (2005) caution, shared by the author, that concordance lines cannot be taken

as direct evidence that speakers are primed (psychologically) to associate certain words and

strings with other words and strings; the concordance lines and corpus data are to be seen as

an indication of strings that could reasonably be seen as being primed. In a similar vein I

share Hoey’s rejection of lemmatisation for the purposes of this study simply because one

cannot assume different forms of a lexical item will share primings; get a job for example

will be treated independently from got a job unless the data provide a reason to discuss

common primings. Hoey (2005) also suggests that a speaker’s L2 primings will be

superimposed on their L1 primings; this is clearly a complex relationship that requires further

research but it will be touched upon in this paper and partially explains the choice of articles

as the focus of this paper.

Articles were chosen partially because there is no article system in the Korean language (and

thus less obvious L1 interference) and, perhaps more so, because many of my students and

respondents reported that article use was the most challenging aspect of learning and using English.

Lee (2001) only refers to articles by stating that the Korean language does not have them,

while Ko et al. (2012) highlight a number of problems surrounding article use in Korean

English but base their analysis on very specific written and picture-based tasks, in which the

volunteers responded to prompts such as draw circles around the books,

rather than on speech. Chuang and Nesi (2006) use a corpus-driven method to study article use

in Chinese Written English and show that up to 29.7% (including problems with the zero

article) of the learners’ errors are article related. It also seems reasonable to suggest that

articles would be subject to subconscious priming effects (rather than the more conscious

effects of education) while speakers are focussing on the (more salient) content of their

speech. I begin with a brief summary of the two Korean corpora and two British comparator

corpora that were used.

2 Four corpora

The two spoken Korean English corpora were collected in Liverpool and Seoul in 2008 and

are named SK (for Seoul Koreans) and LK (for Liverpool Koreans). For each recording the

Korean informant and I sat in a small room and I began by asking questions

about, for example, their reasons for studying English, hobbies and career ambitions; I aimed

to keep the conversations as informal as possible and was keen to find a subject that would

‘get them talking’ freely without focussing on form. Table 1 shows that SK and LK were well

matched for age and years of learning English but that SK has a more notable female bias.

Ideally the number of speakers and gender balance would be better matched but difficulties

finding respondents who were willing to be recorded speaking English prevented this. (Recall

Park’s (2009) suggestion that Korean speakers tend to feel that they are poor at English and

thus may be hesitant to volunteer for such studies.)

                                  SK           LK
Number of respondents             39           28
Average age                       25           27
Gender                            29f (78%)    16f (57%)
                                  8m (22%)⁴    12m (43%)
Average years learning English    9.7          12.2

Table 1: Korean informants

The respondents in Liverpool had spent an average of two years living in the UK. My own

utterances were removed from the main Korean corpora and not used in subsequent

frequency counts but all audio files and complete transcripts were kept for reference. The

total number of word tokens in each of the four corpora used in this study is shown in Table

2.

Corpus                                     Word tokens
Liverpool Korean corpus (LK)               83 446
Seoul Korean corpus (SK)                   112 621
Scouse corpus (SCO)                        106 562
Demographic section of spoken BNC (BNC)    3 945 881

Table 2: Corpus sizes

As I was directly involved in preparing the Korean corpora great care was taken to keep them

as comparable as possible; the comparator corpora, however, were not prepared specifically

for this study so certain differences must be noted and taken into account. A corpus of native

⁴ Two respondents in Seoul chose not to complete the demographic data sheet

Liverpool spoken English or ‘scouse’ (SCO) was developed by a colleague between 2001 and

2004 (Pace-Sigge 2010) so I used this as a comparator corpus because of its similar size and

to allow me to account for any possible influence of the local primings on the Liverpool (LK)

volunteers. Note, however, that SCO has a total of 51 speakers and a larger

proportion of males at 54%; the volunteers are slightly older than the Korean speakers, with

an average age of 33, and there is also a large number of group discussions compared with the

one-to-one arrangement used for the Korean corpora.

Finally the much larger demographic spoken section of the British National Corpus (BNC)

was used. This is a large reference corpus with data collected by 124 volunteers in 38 UK

locations in the section used for this study (What is the BNC? 2012) but with notably older

recordings collected in 1991 and 1992 one has to be cautious about any language structures

that may be changing in this timescale; the difference between the one-to-one, interview-like

data collection in SK and LK and the freer recording used in SCO and the BNC

(including groups) must also be noted. All analysis was carried out using WordSmith Tools

version 5 (Scott 2011); a simple orthographic transcription was used for all the data and

small differences such as the transcription of filled pauses (er and um) between the Korean

data and SCO were not judged as being problematic for the current study.
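
The per-million (pm) figures quoted in this paper follow directly from the token totals in Table 2. The short sketch below illustrates that arithmetic; it is not the procedure used inside WordSmith Tools, and the names CORPUS_TOKENS and per_million are labels introduced here for illustration only.

```python
# Token totals as reported in Table 2.
CORPUS_TOKENS = {
    "LK": 83_446,
    "SK": 112_621,
    "SCO": 106_562,
    "BNC": 3_945_881,
}

def per_million(raw_count: int, corpus: str) -> float:
    """Normalise a raw frequency to occurrences per million tokens."""
    return raw_count / CORPUS_TOKENS[corpus] * 1_000_000

# 170 occurrences of "a lot" in SK correspond to the 1509pm reported in the text.
print(round(per_million(170, "SK")))  # 1509

# Relative size of the BNC section and SK (the "35 times larger" comparison).
print(round(CORPUS_TOKENS["BNC"] / CORPUS_TOKENS["SK"]))  # 35
```

Normalising in this way is what allows a raw count from the 83 446-token LK corpus to be set beside one from the BNC section, which is nearly fifty times its size.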

3 The a environment

Table 3 shows the frequency of the article a in each of the four corpora alongside the most

frequent R1 collocates (the items immediately to the right of the article). Items in the L1

position, immediately to the left, would also be affected by the speakers’ primings, but only R1 items

are examined in this paper in order to explore the relationship between the article

and the other components of a noun phrase that follow. Note that while many R1 items would

traditionally be seen as specific components of a noun phrase, for the purposes of this study I

would see this simply as colligation (a, for example, appears to colligate with quantifiers

such as lot and little in the data shown in Table 3) and I wish to avoid further assumptions

about structure unless they emerge from the data.

The column labelled dispersion shows how many files the string occurs in and shows that the

strings under observation are evenly spread through the files rather than being clustered in

one or two (clustering which, particularly for SK and LK, would suggest that only one or two speakers are

using the form).

Table 3: Frequency details for a and most frequent R1 collocates in four corpora
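
The two quantities reported for each R1 collocate, frequency and dispersion, can be sketched in a few lines. This is a minimal illustration and not the WordSmith Tools implementation; the function name r1_collocates, the file names and the toy sentences are all hypothetical. Word forms are compared exactly as transcribed, with no lemmatisation, in line with the approach described earlier.

```python
from collections import Counter, defaultdict

def r1_collocates(files, node):
    """Return (collocate, frequency, dispersion) for words directly after `node`.

    `files` maps a file name to its list of tokens; dispersion is the
    number of files in which the 2-gram occurs.
    """
    frequency = Counter()
    files_with_string = defaultdict(set)
    for name, tokens in files.items():
        # Pair every token with the token immediately to its right.
        for left, right in zip(tokens, tokens[1:]):
            if left == node:
                frequency[right] += 1
                files_with_string[right].add(name)
    return [
        (word, count, len(files_with_string[word]))
        for word, count in frequency.most_common()
    ]

# Hypothetical two-file toy corpus, not taken from SK or LK.
toy_files = {
    "speaker_01": "i have a lot of friends and a little bit of time".split(),
    "speaker_02": "there are a lot of people here".split(),
}
for row in r1_collocates(toy_files, "a"):
    print(row)
```

In the toy data a lot occurs twice across two files while a little occurs once in one file; a high frequency paired with a dispersion of one or two would be the warning sign of a single speaker driving the count.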

3.1 a lot versus a bit

The most frequent 2-gram in the three smaller corpora is a lot which is discussed in some

detail in Hadikin (2011a) and Hadikin (2011b) so is not repeated other than to mention two

key points. The first is that a high frequency 5-gram there are a lot of seems to be driving the

very high frequency of 170 occurrences of a lot (or 1509 per million) seen in SK; SK

contains 14 occurrences of there are a lot of compared to just 17 in the BNC which is 35

times larger (p < 0.0001 with two-tailed Fisher’s exact test (FE)). The second is the possible

influence of the speakers’ first language form 많은 (manun) which is often used without any

variation and glossed as there are/is a lot of in translation dictionaries. The LK data seems

more heavily influenced by the string quite a lot which more closely reflects the percentage

values in the British corpora; eight percent of the usage of a lot in LK consists of the string

quite a lot compared with four percent in SCO and six percent in the BNC (Table 4). When LK

is compared with the BNC for quite a lot versus OTHER a lot, p = 0.25 and so the difference

is not statistically significant, but p = 0.0009 (two-tailed FE) when SK is compared with the

BNC; this is clearly below the oft-cited cut-off point of 0.05 for statistical significance (see

Gries 2009 for example).
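
For readers who wish to check comparisons of this kind, a two-tailed Fisher's exact test on a 2x2 table can be sketched with the standard library alone. The exact contingency tables behind the p-values above are not given in the text, so the example makes a loudly stated assumption: it sets the 14 and 17 occurrences of there are a lot of against all remaining tokens in SK and the BNC section. The function names are introduced here for illustration.

```python
from math import exp, lgamma

def _log_comb(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def fisher_exact_two_tailed(a: int, b: int, c: int, d: int) -> float:
    """Two-tailed Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    log_denominator = _log_comb(row1 + row2, col1)

    def log_prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return _log_comb(row1, x) + _log_comb(row2, col1 - x) - log_denominator

    observed = log_prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    p_value = sum(
        exp(log_prob(x))
        for x in range(lowest, highest + 1)
        if log_prob(x) <= observed + 1e-7
    )
    return min(1.0, p_value)

# ASSUMPTION: "there are a lot of" versus all other tokens (totals from Table 2).
p = fisher_exact_two_tailed(14, 112_621 - 14, 17, 3_945_881 - 17)
print(p < 0.0001)  # True
```

Working in log space via lgamma avoids the enormous integers that binomial coefficients over a four-million-token corpus would otherwise produce. Under the stated assumption the result agrees with the reported p < 0.0001, although a different choice of comparison cell would give a different exact value.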

Table 4: Frequency detail for a lot and quite a lot in four corpora

Rather than the string a lot, this highest frequency position is occupied by a bit in the BNC

and an important factor affecting the frequency of this 2-gram is the use of the string a bit of

(Concordance 1). From the concordance this structure appears to be used for a number of

functions ranging from the expression of large amounts (quite a bit of work) to somewhat

fixed expressions (a bit of a pain).

Concordance 1: Sample concordance of a bit of from the BNC

The normalised frequency of a bit of in the BNC is 264pm compared with 197pm in SCO and

9pm and 48pm in SK and LK respectively; this suggests that the form may not be established

in Korean English (two-tailed χ² with Yates’ correction is 22.5 when a bit of is compared

with OTHER of in SK and the BNC, df = 1, p < 0.0001). Four occurrences in LK suggest that

the form may be developing: a bit of a passion; two closely related

strings relating to studies from two speakers (I studied a bit of sociology and I studied a bit of

infectious disease); and what appears to be a reformulation, as a speaker produces we had a bit

of we had a few complaints, which suggests the speaker was primed to refer to small amounts

as a bit of but then became aware of the problematic string we had a bit of complaints and

reformulated.
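
The chi-square statistic with Yates' correction used above has a simple closed form for a 2x2 table, sketched below. The counts for OTHER of are not reported in the text, so the table in the example is purely hypothetical and serves only to show the calculation; the one figure taken from the paper is the statistic of 22.5, for which the one-degree-of-freedom p-value is computed.

```python
from math import erfc, sqrt

def chi_square_yates(a: int, b: int, c: int, d: int) -> float:
    """Chi-square statistic with Yates' continuity correction for [[a, b], [c, d]]."""
    n = a + b + c + d
    corrected = max(0.0, abs(a * d - b * c) - n / 2)
    return n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def p_value_df1(chi_square: float) -> float:
    """Upper-tail p-value of a chi-square statistic with one degree of freedom."""
    return erfc(sqrt(chi_square / 2))

# Hypothetical counts, for illustration only (not from SK or the BNC).
print(round(chi_square_yates(10, 20, 30, 40), 4))  # 0.4464

# The statistic reported in the text, checked against the 0.0001 threshold.
print(p_value_df1(22.5) < 0.0001)  # True
```

The correction subtracts half the table total from the absolute cross-product difference before squaring, which makes the test slightly more conservative when cell counts are small, as they are for a bit of in the Korean corpora.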

3.2 a little bit

Returning to Table 3 it is interesting to note that the string a lot takes up 15% of all uses of a

in both Korean corpora while the corresponding figure is 4% in both British corpora. The

variation of the second most frequent 2-gram, a little, in both Korean corpora appears to be

less striking with 8% of SK, 6% of LK and 2% of the British corpora but these numbers hide

a surprising point. The 3-gram a little bit takes up 78% of all a little strings in SK, 79% in LK

but just 31% and 32% in SCO and the BNC (see Table 5, p <0.0001 when a little bit is

compared with a little OTHER for SK and the BNC, two-tailed FE).

After the highest frequency of 70 occurrences of a little bit in SK the second most frequent a

little * string is a little different but with just three occurrences there is a notable drop in

frequency that is shared with LK. It seems reasonable to suggest that speakers of Korean

English are primed to use the string a little bit and that this may partially explain the smaller

frequencies of the string a bit seen in Table 3. The phrase a little bit appears as an adjective

modifier in most cases with examples ranging from a little bit different and a little bit better

to a little bit free and a little bit hard; this could reflect the L1 primings for the item 조금

(chokum).

Table 5: Frequency details for a little and high frequency R1 items in four corpora

3.3 get a job

This final section, before moving on to the an environment, centres on the third most frequent

a * 2-gram in SK: a job. It stands out amongst the other top four frequent 2-grams in that it is

a psychologically complete unit that does not refer to an amount (cf. a lot, a little, a very, a

few, a bit and a good). This should not come as a complete surprise considering that many of

the informants were students and were likely to be practising English to improve their job

prospects but, as is often the case, the lexical environment that this particular 2-gram carries

with it tells a story about the primings of the informants that may mark Korean English as

subtly different from other varieties.

Table 6: Frequency and percentage chart for the 3-gram get a job (based on Biber 2009)

In this case it appears that the Korean English speakers are primed to use the string get a job

and I will use a technique inspired by Biber (2009) to highlight the ways in which this string

is used differently in the corpora. Table 6 shows three main data columns: the first showing

data for the frequency of the string get a job compared with the total frequency of * a job

strings, the second column compares get a job with all get * job strings and the final column

compares get a job with all get a * data. As an example, the highlighted parts of Table 6

show that there are 14 occurrences of get a job out of a total 25 occurrences of all * a job

strings in SK and, in the lower part of the chart, that this is equivalent to 56%. Recall that the

dispersion column shows the number of files in which the string occurs so get a job, for

example, occurs in seven different files in SK and this corresponds with seven different

speakers.
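
The three comparisons described for Table 6 amount to holding two positions of the 3-gram fixed and asking what share of the resulting frame the full string accounts for. The following is a minimal sketch of that idea under my own naming (trigram_counts and slot_share are hypothetical helpers, and the token list is invented); it is not a reproduction of Biber's (2009) procedure.

```python
from collections import Counter

def trigram_counts(tokens):
    """Count every contiguous 3-gram in a token list."""
    return Counter(zip(tokens, tokens[1:], tokens[2:]))

def slot_share(counts, target, slot):
    """Return (hits, frame total, percentage) for `target` within its frame.

    The frame is every 3-gram that matches `target` in the two positions
    other than `slot`; slot 0 gives * a job, 1 gives get * job, 2 gives get a *.
    """
    frame_total = sum(
        n
        for trigram, n in counts.items()
        if all(trigram[i] == target[i] for i in range(3) if i != slot)
    )
    hits = counts[target]
    share = 100.0 * hits / frame_total if frame_total else 0.0
    return hits, frame_total, share

# Invented toy utterance, not corpus data.
toy_tokens = (
    "i want to get a job and he got a job "
    "but they get a car then get the job"
).split()
toy_counts = trigram_counts(toy_tokens)
for slot in range(3):
    print(slot, slot_share(toy_counts, ("get", "a", "job"), slot))
```

Applied to the SK figures given above, the first slot would return 14 hits out of a frame total of 25, that is 56%. A share close to 100% in a slot indicates that the item filling it is effectively fixed within the string.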

This 56% value for SK combined with a figure of 77% for LK shows a clear level of relative

fixedness when compared with the equivalent figures in the British corpora: 14% in SCO and

13% in the BNC (p < 0.0001 when get a job is compared with OTHER a job in SK and the

BNC, two-tailed FE). Concordance 2 shows the range of contexts in which the 2-gram a

job occurs in the BNC (in a sample of data) while the Korean data consists largely of strings

such as I want to get a job and any chance to get a job which could easily be influenced by

the fact that the data were collected in a university setting and many informants were

involved in looking for a job. The speakers would, in terms of traditional grammar, have had

the option to say I want a job or any chance of a job in this context, however, so we are left

with a mixed picture. Korean speakers may be primed to use the string get a job more

strongly than British speakers in a range of contexts but the data sets are not comparable

enough for a strong claim at this point. A study based on a comparison between a corpus of

Korean English and a purpose built, directly comparable corpus of spoken British English

would be useful to further explore this area. Note, however, that while the same limitation

might be expected to suggest primings for the string got a job, a rather different picture

emerges. In the Korean data the item got shows a much weaker attraction to a job when

compared to get a job (just 16% of * a job strings are filled with the item got in SK compared

with 8% in LK, 7% in SCO and 14% in the BNC). One’s attention is drawn more to the third

element when conducting a got a job analysis, with 22% of SK showing the item job in the

got a * frame compared with just 8% of LK, and 1% in each of SCO and the BNC, thus

highlighting the need for caution when it comes to lemmatisation.

Concordance 2: Sample concordance of a job in the BNC

The fully fixed status of the article in the string get a job is clear from the two figures of

100% in Table 6: SK and LK show no variety at all. SCO and the BNC have figures of 67%

and 61% respectively though it should be noted that there are only three occurrences of get *

job in SCO; the 33% comes from a single use of get your job back. The variety in the BNC is

interesting in that 24% of the get * job frame makes use of the article the, which could

reasonably have been expected to appear in the 24 lines of Korean data if speakers shared

similar primings (p = 0.004 when get a job is compared with all other get * job strings for SK

and the BNC, two-tailed FE).

The third column in Table 6 shows some of the greatest differences between the Korean

corpora and the British corpora; the item job completes the get a * frame in just 3% of cases

in the BNC, for example, compared with 37%, a figure more than 10 times larger, in SK (p

< 0.0001, two-tailed FE).

Rank    SCO      BNC
1       few      bit
2       car      lot
2       grant
2       lot
2       bit
2       big
2       job

Table 7: Items occupying R1 position following get a in British corpora

The apparent British primings for the colligation a QUANTIFIER influence these data with

get a few forming the most frequent get a * 3-gram in SCO and the items car, grant, lot, bit

and big occurring at the same frequency as job with two occurrences each (see Table 7). The

strings get a bit and get a lot are the most frequent get a * strings in the BNC but get a job is

clearly the most frequent in both Korean corpora.

The Korean corpora have no occurrences of the BNC’s most frequent get a * 3-gram get a bit

though there are two occurrences in SCO (19pm) and 104 occurrences in the BNC (26pm).

SCO’s most frequent get a * 3-gram get a few is also completely absent from both Korean

corpora (there are three occurrences in the similar sized SCO (28pm) and 40 occurrences in

the BNC (10pm)). This suggests that get a QUANTIFIER strings are used rather differently in

Korean English but the potential combined primings for delexicalised verbs (see Chi, Wong

and Wong 1994 for a study of learners in Hong Kong for example), articles and quantifiers

give a complex picture that would be beyond the scope of this study.
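
The per-million (pm) figures quoted above follow from dividing a raw count by the corpus size. The short Python sketch below illustrates that arithmetic. It assumes a size of 4,000,000 words for the BNC section (described in this paper only as a 'near four million word section') and roughly 107,000 words for SCO (a size inferred here from the reported counts and pm values, not stated in the text); the function name is purely illustrative.

```python
def per_million(count: int, corpus_size: int) -> float:
    """Normalised frequency: occurrences per million running words."""
    return count / corpus_size * 1_000_000


# Assumed corpus sizes; neither is given exactly in the paper.
BNC_SIZE = 4_000_000  # "near four million word section"
SCO_SIZE = 107_000    # inferred from 3 occurrences being reported as 28pm

for label, count, size in [
    ("get a bit, BNC", 104, BNC_SIZE),
    ("get a few, BNC", 40, BNC_SIZE),
    ("get a bit, SCO", 2, SCO_SIZE),
    ("get a few, SCO", 3, SCO_SIZE),
]:
    print(f"{label}: {per_million(count, size):.0f}pm")
```

Under these assumed sizes the rounded values agree with the 26pm, 10pm, 19pm and 28pm reported for these strings.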

4 The an environment

The lexical environment surrounding the item an in the four corpora is clearly rather different

from that of a, as Table 8 shows.

Table 8: Frequency details for an and most frequent R1 collocates in four corpora

There are very few obvious similarities in the most frequent ten items that follow an (though

note that the Korean data shown contains a number of single occurrences). The rather low

normalised frequencies of an (222pm in SK and 467pm in LK) are quite striking, however,

compared with the British corpora (1398pm in SCO and 1303pm in the BNC); ratios of the

frequency of a to an are 45-1 in SK, 21-1 in LK, 11-1 in SCO and 15-1 in the BNC which

clearly positions LK between SK and the comparator corpora and may suggest primings are

shifting to a British level.

LK stands out amongst the four corpora, however, in that the string an hour is not the most

frequent an * 2-gram. It is possible that these speakers are primed to say thirty minutes rather

than half an hour (the most frequent * an hour string in both SCO and the BNC); this would

reflect my experience of teaching time phrases in Korea and, indeed, LK has the highest

normalised frequency of the string thirty minutes at 48pm. SK has 18pm, the BNC has just

3pm and there are no occurrences at all in SCO. Clearly, a Korean preference for the thirty

minutes form does not explain why SK differs from LK but note that SK only has 25

occurrences of an so is particularly susceptible to statistical variation and that the most

influential * an hour 3-gram in SK is for an hour rather than half an hour; it is perhaps

something of an illusion that SK is more closely aligned with the British corpora.

Concordance 3: Complete concordance of an * student in LK

The item hour is not only absent from the top of the an * frequency list in LK, it ranks sixth

below international, English, exchange, interview and essay. These items suggest a possible

priming based on the concept of international student activities which would be

understandable considering the demographics of the informants. Such an explanation should

not take away from the fact that this is a difference between SK and LK (potentially) based

on recent primings (note, however, that the difference between an hour and other an * strings

in SK and LK is not statistically significant with p=0.199, two-tailed FE). The an * student

frame shown in Concordance 3 is, in fact, more frequent than the most frequent 2-gram an

international in LK and lends support to this idea.

5 The the environment

In this final analysis-based part of the paper I will be discussing areas of the the environment

as it is shown in Table 9. The frequencies of the most frequent the * 2-grams are much lower

than a * 2-grams, particularly in the Korean corpora, but it is notable that there is a clear

division between the Korean data and the British data: the Korean corpora make greatest use

of the string the first while the British corpora make greater use of the string the other. For this

reason, as well as the fact that both strings are available for the formation of longer noun phrases,

I will take these forms as my starting point. (Recall that I am cautious about simply assuming

that structures such as noun phrases have an important role in corpus-driven studies but

Korean learners are taught phrase structure from an early age so it is reasonable to think

about the priming effects of such an education.)

Table 9: Frequency details for the and most frequent R1 collocates in four corpora

5.1 the first time

The first is the most frequent the * 2-gram in the Korean data with normalised frequencies of

611pm and 462pm in LK and SK respectively; by comparison the normalised frequencies for

SCO and the BNC are 197pm and 270pm. The 3-gram the first time is the most frequent the

first * 3-gram in each of the corpora so I selected this string to produce the frequency chart

shown in Table 10.

Table 10: Frequency and percentage chart for the 3-gram the first time (based on Biber 2009)

Compared to the chart shown in Table 6, and other charts in Hadikin (2011a) and Hadikin

(2011b), Table 10 is quite unusual because the Korean corpora show the most flexibility in the

first slot while the British corpora appear somewhat fixed.

Concordance 4: Sample concordance of first time in SK

Concordance 4 shows a sample concordance from SK where the following strings can be seen:

1 when I look back first time

2 when I met him first time

3 in Korea at first time

This suggests that many of the speakers are weakly, if at all, primed to insert the item the

before the 2-gram and may be primed to use the string first time in similar ways to the way a

British speaker simply uses first or at first (p < 0.0001 when the first time is compared with

other * first time strings in SK and the BNC, two-tailed FE). Indeed, LK has seven

occurrences of at first time which highlights this potential example of mixed primings (there

are three occurrences in SK and none in either SCO or the BNC).

The second column of Table 10 returns to a more familiar pattern of the string in the Korean

data appearing more fixed with 100% of the * time strings taking the form the first time in

LK, 58% in SK but just 25% in both SCO and the BNC. It is a curious point that to a large

extent (12/34 occurrences or 35% of all the * time occurrences) the variation comes from the

use of the same time in SK which is completely absent in LK; there is a single occurrence of

same time as a 2-gram. With 5/16 occurrences (31%) SCO actually has the only time as its

most frequent the * time phrase and the BNC has the first time as its most frequent (138/724

or 19%) followed by the same time (129/724 or 18%) and the next time (45/724 or 6%).

The percentage of the first * that takes the form the first time shows that there is a certain

amount of flexibility in all four corpora; LK shows the most fixedness but with just 45% of

the concordance in the form the first time there is actually a large set of other strings such as

the first thing, the first floor and the first one. It seems that the numbers are most notably

affected by the high normalised frequencies of the first time in the Korean corpora – 178pm

in SK and 276pm in LK compared to 38pm in SCO and 45pm in the BNC. This appears to be

the result of a more general use that compares and overlaps with the meaning of at first in

British English (p=0.55 when the first time is compared with other the first * strings in LK

and SK but 0.0003 between SK and the BNC, two-tailed FE).

5.2 the other

This most frequent the * 2-gram in both SCO and the BNC does not lend itself as readily to

analysis by a frequency/percentage chart as it tends to form quite different longer strings

depending on whether one is looking at the Korean corpora or the British. In both SK and LK

it has a tendency to form and the other but in SCO one mostly finds the other side and in the

BNC the most notable 3-gram is the other one. SK stands alone in its high relative use of and

the other compared to the total number of * the other strings with 14/38 (37%) compared to

5/35 (14%) in LK, 2/48 (4%) in SCO and 202/2918 (7%) in the BNC (p = 0.035 when SK is

compared with LK, two-tailed FE).
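
The SK versus LK comparison above can be checked from the counts given in the text. The following is a minimal pure-Python sketch of a two-tailed Fisher's exact test; it is not the software used for the study, and the function name and table layout are assumptions made for illustration.

```python
from math import comb


def fisher_exact_two_tailed(a: int, b: int, c: int, d: int) -> float:
    """Two-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Probability that x of the col1 "hits" fall in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance so floating-point ties still count.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# Counts from the text: "and the other" against all other "* the other" strings.
sk_hits, sk_total = 14, 38
lk_hits, lk_total = 5, 35
p = fisher_exact_two_tailed(
    sk_hits, sk_total - sk_hits, lk_hits, lk_total - lk_hits
)
print(f"p = {p:.3f}")
```

With these counts the result should come out close to the reported p = 0.035.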

Concordance 5: Complete concordance of and the other in SK (above) and LK (below)

Use of and the other shown in Concordance 5 shows that the string is being used mostly for

comparisons or for the addition of a new point in the conversation in both Korean corpora. A

colligation and the other thing BE is noteworthy with a presence in both corpora shown in

Concordance 5 but is not present in SCO and only occurs six times in the near four million

word section of the BNC (cf. four times in the much smaller SK) always in the same form

and the other thing is; the Korean informants appear to be more strongly primed to use this

string/colligation to add information without necessarily specifying that they were going to

make two or more points earlier in the discourse as one may expect (p = 0.002 when and the

other thing is compared with other and the other * strings in SK and the BNC, two-tailed

FE).

The most frequent * the other/the other * string in SCO is the other side which makes it

appear somewhat different from both the Korean corpora and the BNC. 11/48 (23%)

occurrences of the other form this 3-gram in SCO and are mostly used to refer to physical or

geographic areas (the other side of the Liverbuilding, the other side of Wigan etc.) though it is

important to note that four of the occurrences were produced by the researcher himself. There

are no occurrences of this string in the Korean corpora which may suggest different primings;

279/2919 (10%) are the corresponding figures for the other side in the BNC and it appears to

be used with a similar function of describing physical locations. The most frequent 3-gram in

the BNC’s the other environment is the other one with 480/2919 (16%) of all the other *

strings taking this form compared with 3/48 (6%) in SCO, 3/38 (8%) in SK and 1/35 (3%) in

LK. The right side of the string the other in the Korean corpora appears to be more flexible

than in the British corpora; recall that 23% of the other * in SCO takes the form the other

side and 16% of the other * in the BNC takes the form the other one. The most frequent the

other * in SK is the other thing with 4/38 (11%) of occurrences and the most frequent the

other * 3-gram in LK is the other countries with 3/35 (9%) in this form.
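
The R1 flexibility comparison rests on the share of the other * occurrences taken by the single most frequent filler in each corpus. The sketch below recomputes those shares from the counts reported in this section; the dictionary layout and function name are illustrative assumptions rather than part of the original analysis.

```python
# (most frequent string, its count, total "the other *" occurrences),
# all taken from the figures reported in the text.
top_filler = {
    "SCO": ("the other side", 11, 48),
    "BNC": ("the other one", 480, 2919),
    "SK": ("the other thing", 4, 38),
    "LK": ("the other countries", 3, 35),
}


def share(count: int, total: int) -> float:
    """Percentage of a frame's occurrences taken by one filler."""
    return 100 * count / total


shares = {corpus: share(c, n) for corpus, (_, c, n) in top_filler.items()}
for corpus, (string, c, n) in top_filler.items():
    print(f"{corpus}: {string} {c}/{n} = {shares[corpus]:.0f}%")
```

A lower share for the top filler is read here as greater flexibility in the R1 slot, which is why the Korean figures (11% and 9%) suggest more flexibility than the British ones (23% and 16%).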

Despite this apparent R1 flexibility, time expressions such as the other day, the other week

and the other night are conspicuously absent from 38 lines of the other in SK but these forms

make up 23% of the BNC’s the other * occurrences, 21% of SCO’s and 9% of LK’s

occurrences (a single occurrence in SK would have represented approximately 3%); this

suggests that the informants in SK are weakly primed or, possibly, not primed to use the

other in time expressions but the LK informants may have begun a shift to British primings

during their time in the UK (Concordance 6 illustrates the lack of time expressions in SK;

p=0.105 when LK is compared with SK for time expressions but p < 0.0001 when SK is

compared with the BNC).
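
Because SK has no time expressions at all, the two-tailed Fisher's exact p-value for the SK and LK comparison reduces to a single hypergeometric probability. The sketch below works this through, assuming that LK's 9% corresponds to 3 of its 35 lines (an inference from the rounded percentage, not a count given in the text); all names are illustrative.

```python
from math import comb

SK_LINES, LK_LINES = 38, 35  # lines of "the other" per corpus (from the text)
LK_TIME = 3                  # assumed: 9% of 35 lines, rounded
SK_TIME = 0                  # no time expressions in SK

total_lines = SK_LINES + LK_LINES
total_time = SK_TIME + LK_TIME


def prob_sk_gets(k: int) -> float:
    """Hypergeometric probability that k of the time expressions fall in SK."""
    return (
        comb(SK_LINES, k) * comb(LK_LINES, total_time - k)
        / comb(total_lines, total_time)
    )


p_observed = prob_sk_gets(SK_TIME)
# Two-tailed p: add every outcome that is no more likely than the observed one.
p_two_tailed = sum(
    prob_sk_gets(k)
    for k in range(total_time + 1)
    if prob_sk_gets(k) <= p_observed * (1 + 1e-9)
)
print(f"p = {p_two_tailed:.3f}")

# A single hypothetical SK occurrence as a share of its 38 lines.
print(f"{100 * 1 / SK_LINES:.1f}%")
```

Under the 3/35 assumption the result is close to the reported p = 0.105, and one hypothetical occurrence in SK works out at about 2.6%, in line with the 'approximately 3%' mentioned above.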

Concordance 6: Sample concordance of the other in SK showing R1 flexibility but a lack of time expressions

6 Conclusion

In this paper I have tried to show some of the variety as well as the consistency of the English

spoken by Korean adults by discussing three lexical environments: the phraseology and lexis

surrounding the items a, an and the in two small corpora of Korean English and, as

comparator corpora, a small ‘scouse’ corpus and the spoken demographic section of the

BNC.

Some of the similarities between the Seoul Korean corpus (SK) and the Liverpool Korean

corpus (LK) include the following:

• The percentage of a * that takes the form a lot is consistent at 15% and this is notably

higher than the British corpora.

• The percentage of a little * that takes the form a little bit is very similar at 78% (SK)

and 79% (LK); this is also much higher than the British corpora.

• The string get a job shows a comparable level of fixedness in the Korean corpora

with 100% of get * job taking the form get a job, for example, compared with 67%

and 61% in SCO and the BNC.

• The item an occurs with very low normalised frequency in the Korean corpora:

222pm and 467pm in SK and LK compared with 1398pm and 1303pm in SCO and the BNC.

• The string the other has a strong tendency to form the 3-gram and the other in the

Korean corpora compared to SCO and the BNC which tend to form R1-based the

other * strings.

While differences between SK and LK include:

• LK has a greater frequency of quite a lot than SK despite being approximately 25%

smaller.

• LK stands out amongst the four corpora because it does not have an hour as the

highest frequency an * 2-gram.

• LK has a higher frequency of at first time than SK.

• LK uses the other to form time expressions such as the other day and the other night

but there are no occurrences in SK.

This kind of variation suggests that while priming effects may separate LK from SK in

certain areas, other strings are being used with great consistency. Similarities such as the high

frequency use of a lot, for example, are arguably influenced by a comparable L1 form and

consistent translation across pedagogic materials - Korean learners may be using a lot with a

comparable frequency to the Korean equivalent 많은 (manun); their primings are then

reinforced by high exposure to a lot when reading Korean English texts - but then, as

language researchers, we might ask how and why the speakers are primed to use this English

form as part of longer utterances. The SK speakers appear primed to produce there are a lot

of while LK speakers appear more likely to say quite a lot. This arguably reflects a recent

change to a British priming as the LK speakers rely less on there are a lot of (possibly stored

as a formulaic chunk for many speakers and an exact translation of its Korean equivalent) and

begin to include a hedging term quite that they will have been exposed to in the UK. It is also

interesting to note that a colligation appears to have crossed over from the L1: for each article

under consideration in this paper the normalised frequencies are lower in the Korean data

compared to the British data. The article a occurs in SK at a normalised rate of 10,176pm but

19,621pm in the BNC, for example, and this pattern is consistent across the data for an and

the suggesting that noun phrases are weakly primed for colligation with articles in Korean

English. In at least one case British-like primings appear to have come together to create a

uniquely Korean result: the primings for at first and first time appear quite unexceptional but

then combine to give a high frequency of at first time with its own function of referring to the

first time something is done or experienced. Lexical Priming is arguably unique in that its

focus on the primings of each word (which is actually shorthand for the primings of the

language user’s or users’ use of that word) allows for a detailed consideration of how and

why a string appears to be changing form.

I hope that this paper has highlighted some of the areas of spoken language which might be

expected to show priming effects whereas until now any suggestions would have been merely

theoretical (Hoey 2005 was largely based on written texts). The issues discussed here may

also suggest which areas of spoken English are the first to change when individuals move

into a new geographical area and which parts of one’s idiolect are relatively fixed or slower

to change – this could have exciting implications for both pedagogy and language evolution.

Many of these changes of language forms could, of course, be discussed without reference to

Lexical Priming but the alternative model would need to account for communities and

individual speakers on both a psychological and sociological level changing their

collocational behaviour in a short space of time – linguistic models such as those proposed in

Sinclair (1991) and Wray (2002) are suitable but are also generally compatible with Lexical

Priming as discussed in Hadikin (2011a) and Hadikin (2011b). Wray (2002), however, argues

that learners tend to break down chunks into their parts based on meaning thus leaving a

fuzzy picture when it comes to function words; it is not clear how the details of Wray’s

model could explain how LK speakers have manipulated the strings at first and first time to

create a new form, for example.

The work also raises research questions such as ‘how and why do individuals vary in their

use of a lot?’, for example, and pedagogic questions such as how alternatives could be taught

or, indeed, if it is actually beneficial to try to reproduce the primings of native speakers. The

need for very carefully chosen/carefully prepared comparator corpora is a further issue raised

because the differences between interview-type data in the Korean corpora and the wider

contexts recorded in the comparator corpora will interfere with and exaggerate any ‘true’

differences between the language varieties.

There are, however, very few papers published about Korean spoken English so I hope this

one can add to the developing picture of corpus-based language variation and act as a starting

point for further research work as well as providing Korean learners, language users and

teachers with what some may see as points to notice (in the sense of Schmidt 1990), such as

an overall tendency to drop articles, while to others these are simply differences between

two equally valid World Englishes and in many cases the speaker’s meaning would be

unhindered.

References

Biber, D. 2009. A corpus-driven approach to formulaic language in English: multi-word patterns in speech and writing. Presentation given at Corpus Linguistics 2009 on 23rd July 2009.

Chi, A.M., Wong, K.P. and Wong, M.C. 1994. ‘Collocational problems amongst ESL learners: A corpus-based study’ in L. Flowerdew and A.K.K. Tong, Entering text. Hong Kong: Language Centre, Hong Kong University of Science and Technology, and Department of English, Guangzhou Institute of Foreign Languages, pp. 157-165.

Chuang, F. and Nesi, H. 2006. ‘An analysis of formal errors in a corpus of L2 English produced by Chinese students’, Corpora 1(2): 251-271.

Crystal, D. 2003. The Cambridge Encyclopaedia of the English Language (2nd Edition). Cambridge: Cambridge University Press.

Gries, S. 2009. Quantitative Corpus Linguistics with R. London: Routledge.

Hadikin, G. S. 2011a. Corpus, Concordance, Koreans: a corpus-driven study of an emerging New English. Manuscript submitted for publication.

Hadikin, G. S. 2011b. Corpus, Concordance, Koreans: a comparison of the spoken English of two Korean communities. Unpublished PhD thesis, University of Liverpool.

Hoey, M. 2005. Lexical Priming: a New Theory of Words and Language. London: Routledge.

Hoey, M. 2012. Priming hypotheses. Retrieved from http://lexicalpriming.org/priming-hypotheses/.

Ionin, T., Baek, S., Kim, E., Ko, H. and Wexler, K. 2012. ‘That’s not so different from the:

definite and demonstrative descriptions in second language acquisition’, Second Language

Research, 28, 69-101.

Kachru, B. and Nelson, C. 2006. World Englishes in Asian contexts. Hong Kong: Hong Kong

University Press.

Lee, I. and Ramsey, R. 2000. The Korean Language. Albany, NY: SUNY Press.

Lee, J. 2001. ‘Korean speakers’ in Swan, M. and Smith, B. (eds.) Learner English: a teacher’s guide to interference and other problems. Cambridge: Cambridge University Press.

McArthur, A. 1987. The English Languages? English Today 11, pp 9-13.

Mehlsen, C. 2011. The Rise of Multiculturalism in Korea. Retrieved from http://www.dpu.dk/fileadmin/www.dpu.dk/ialeimagazine/multiculturaleducation/IALEI_Magazine_18-20.pdf.

Pace-Sigge, M. T. L. 2010. Evidence of lexical priming in spoken Liverpool English. Unpublished PhD thesis, University of Liverpool.

Park, J. S. 2009. The Local Construction of a Global Language: Ideologies of English in South Korea. Berlin: Mouton de Gruyter.

Porter, C. 2011. ‘Review of ‘The Local Construction of a Global Language: Ideologies of English in South Korea’’, TESL-EJ 14 (4).

Schmidt, R. 1990. The role of consciousness in second language learning. Applied Linguistics 11, pp 17-46.

Scott, M. 2011. Wordsmith tools. Retrieved from http://www.lexically.net/wordsmith/index.html.

Shim, R. J. 1999. Codified Korean English. World Englishes 18 (2), pp. 247-259.

Sinclair, J.McH. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.

Tollefson, J. 2002. Language Policies in Education: Critical Issues. London: Routledge.

Wray, A. 2002. Formulaic Language and the Lexicon. Cambridge: Cambridge University

Press.

Yeni-Komshian, G., Flege, J. and Liu, S. 2000. ‘Pronunciation Proficiency in the First and Second Languages of Korean-English Bilinguals’, Bilingualism: Language and Cognition 3 (2) pp. 131-49.

What is the BNC? 2012. Retrieved from http://www.natcorp.ox.ac.uk/corpus/index.xml?ID=intro.
