11/01/06 Penn State 1
Bridging contrastive study and language acquisition research
A corpus-based study of passives in English and Chinese
Richard [email protected]
Overview of the talk
• Corpora for contrastive study
• Passives in English and Chinese
• Passive errors in Chinese learner English
Corpora for contrastive study
Parallel corpora? No
• Two types of multilingual corpora
• Parallel corpus = source texts + translations
• Some misunderstandings, e.g.
– ‘translation equivalence is the best available basis of comparison’ (James 1980: 178)
– ‘studies based on real translations are the only sound method for contrastive analysis’ (Santos 1996: i)
• But…
Evidence of translationese (1)
• An unrepresentative special variant
• A ‘third code’ (Frawley 1984: 168)
• Four core patterns of lexical use (Laviosa 1998)
– a relatively low proportion of lexical words over function words
– a relatively high proportion of high-frequency words over low-frequency words
– a relatively greater repetition of the most frequent words
– less variety in the most frequently used words
Evidence of translationese (2)
• Beyond the lexical level
– Normalization, simplification (Baker 1993, 1999)
– Explicitation (Øverås 1998)
– Sanitization (Kenny 1998)
– Aspect markers twice as frequent in L1 Chinese as in translated Chinese (McEnery & Xiao 2002)
• Parallel corpora: unreliable for contrastive study
Comparable corpora: Yes
• Comparable corpus = same sampling techniques + similar balance and representativeness
• Well suited for contrastive study
• Some E-C contrastive studies
– Aspect marking (e.g. McEnery, Xiao & Mo 2003)
– Situation aspect (e.g. Xiao & McEnery 2004a)
– Collocation and semantic prosody (e.g. Xiao & McEnery 2005)
Passive constructions in English and Chinese
Corpus data
• Two English corpora
– Freiburg-LOB (FLOB)
– BNCdemo
• Two Chinese corpora
– Lancaster Corpus of Mandarin Chinese (LCMC)
– LDC CallHome Mandarin Transcripts
Text categories in FLOB and LCMC
Code  Text category                                No. of samples  Proportion
A     Press reportage                              44              8.8%
B     Press editorials                             27              5.4%
C     Press reviews                                17              3.4%
D     Religion                                     17              3.4%
E     Skills, trades and hobbies                   38              7.6%
F     Popular lore                                 44              8.8%
G     Biographies and essays                       77              15.4%
H     Miscellaneous (reports, official documents)  30              6%
J     Science (academic prose)                     80              16%
K     General fiction                              29              5.8%
L     Mystery/detective fiction                    24              4.8%
M     Science fiction                              6               1.2%
N     Adventure fiction                            29              5.8%
P     Romantic fiction                             29              5.8%
R     Humour                                       9               1.8%
      Total                                        500             100%
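The proportions in the table follow directly from the sample counts. The sketch below recomputes them and also totals the informative (A-J) and imaginative (K-R) groups used later in the talk. The names `SAMPLES`, `proportion` and `group_total` are illustrative only and are not part of any corpus toolkit.

```python
# Sample counts per text category, copied from the table above.
SAMPLES = {
    "A": 44, "B": 27, "C": 17, "D": 17, "E": 38, "F": 44, "G": 77,
    "H": 30, "J": 80, "K": 29, "L": 24, "M": 6, "N": 29, "P": 29, "R": 9,
}


def proportion(code: str) -> float:
    """Share of the corpus (in percent) taken up by one text category."""
    return 100 * SAMPLES[code] / sum(SAMPLES.values())


def group_total(codes: str) -> int:
    """Total number of samples in a group of categories, e.g. "KLMNPR"."""
    return sum(SAMPLES[code] for code in codes)


informative = group_total("ABCDEFGHJ")  # categories A-J
imaginative = group_total("KLMNPR")     # categories K-R

for code in SAMPLES:
    print(code, f"{proportion(code):.1f}%")
print("informative:", informative, "imaginative:", imaginative)
```

Grouping by a string of category codes keeps the A-J versus K-R split explicit, which matters because the code letters skip I, O and Q.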
Two major passive types in English
• Be- vs. get-passives
– Dynamic vs. stative
  • e.g. Go and get/*be changed! (BNCdemo)
– Infinitival complements
  • e.g. they liked to be/*get seen to go to church (BNCdemo)
– Contrast in overall frequencies
  • 955 vs. 31 instances of be-passives vs. get-passives per 100K words
– Writing vs. speech
  • Normalised frequencies (per 100K words)
    – Be-passives: 854 (W) vs. 101 (S)
    – Get-passives: 5 (W) vs. 26 (S)
Long vs. short forms by register
• Long vs. short passives
• Distribution in speech & writing
• Short passives more frequent in S than W
– LL=209.225 for 1 d.f., p<0.001
[Figure: percentage of short vs. long passives (agent type) by corpus, FLOB vs. BNCdemo]
Long vs. short forms by passive type
• Get-passives are more likely than be-passives to occur in short forms
– LL=76.015 for 1 d.f., p<0.001
• Agents in get-passives
– Impersonal, e.g. got caught by the police
– Inanimate, e.g. got knocked down by a car
– Personal agents: informationally dense, semantically indispensable, e.g. The bleeding fat girl, he got asked out by her. (BNC)
[Figure: percentage of short vs. long passives (agent type) by passive type, be-passive vs. get-passive]
Adverbials in English passives
• Passives with no adverbial are much more common than those with an adverbial – true for both be- and get-passives
• Adverbials are more frequent in be- passives than get-passives
– 17.7% of be-passives; 7% of get-passives
• Less diversified in get-passives
– Typically ‘have an intensifying or focusing role’ (Carter & McCarthy 1999: 53)
• Proportions of be-passives with an adverbial are similar in S & W
– 19.5% (S) vs. 17.3% (W)
• BUT the proportion of get-passives with an adverbial is much greater in W than S
– 15.2% (W) vs. 6.6% (S)
[Figure: percentage of passives with and without an adverbial, by passive type (be-passive vs. get-passive)]
Pragmatic meanings
Passive type  Negative  Positive  Neutral
Be-passive    15%       4.7%      80.3%
Get-passive   37.7%     3.4%      58.9%
Collocation analysis
• Observation of pragmatic meanings of get-passives is supported by collocation analysis
– z score>3.0, frequency>3, L0-R1
• Collocates of get-passives are more likely to show a negative pragmatic meaning
– Negative get-passives: 46.5% in BNCdemo (one collocate in FLOB: married)
– Negative be-passives: 27% in BNCdemo and 8% in FLOB
– Get-passives NOT necessarily more frequently negative in S
  • Proportions of negative cases: 45.8% (W) vs. 37.3% (S)
– Exceptionally high co-occurrence frequency of a few neutral collocates of get-passives in S (married, paid, dressed, changed)
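The thresholds above (z score > 3.0, frequency > 3) can be illustrated with a minimal sketch of a collocation z-score. The slides do not say which formula or tool was used, so the textbook z-score below is an assumption, as is reading the L0-R1 window as a span of one position to the right of the node. All counts in the usage lines are hypothetical, and the function names are invented for this example.

```python
import math


def z_score(observed: int, node_freq: int, collocate_freq: int,
            corpus_size: int, span: int = 1) -> float:
    """z-score of a collocate found `observed` times within `span` positions of the node.

    p is the chance of meeting the collocate at any position outside the node;
    expected is the co-occurrence count predicted if node and collocate were independent.
    """
    p = collocate_freq / (corpus_size - node_freq)
    expected = p * node_freq * span
    return (observed - expected) / math.sqrt(expected * (1 - p))


def is_collocate(observed: int, node_freq: int, collocate_freq: int,
                 corpus_size: int, span: int = 1,
                 min_z: float = 3.0, min_freq: int = 3) -> bool:
    """Apply both thresholds from the slide: z score > 3.0 and frequency > 3."""
    z = z_score(observed, node_freq, collocate_freq, corpus_size, span)
    return observed > min_freq and z > min_z


# Hypothetical counts: a node seen 1,000 times, a collocate seen 5,000 times,
# 30 co-occurrences, in a corpus of 1,001,000 tokens.
print(z_score(30, 1_000, 5_000, 1_001_000))
print(is_collocate(30, 1_000, 5_000, 1_001_000))
```

The frequency floor guards against rare pairs that reach a high z-score from only a handful of co-occurrences.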
Collocation vs. style
• Get-passives are more informal in style
– More restricted in collocation, more likely to refer to daily activities and be used in informal expressions
• GET - dressed, changed, weighed, fed (i.e. eat), washed, cleaned
• GET - pricked, hooked, mixed (up), carried (away), muddled (up), sacked, kicked (out), stuffed, thrown (out), chucked, pissed, nicked
– Rarely found among the top 100 collocates of be-passives
Style vs. distribution
• Stylistic difference > distribution
• Be-passives: over 8 times as frequent in FLOB (A-R) as in BNCdemo (S)
– Of written genres, more common in informative texts (A-J) than imaginative writing (K-R)
– Exceptionally frequent in H & J (cf. Biber 1988)
• Get-passives typically occur in speech and colloquial, informal genres
– Over 5 times as frequent in speech as in writing
– Of written genres, exceptionally frequent in E (leisure) & R
[Figure: be-passives and get-passives by genre (A-R and S)]
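The "over 8 times" and "over 5 times" claims can be checked against the normalised frequencies reported earlier in the talk (be-passives: 854 in writing vs. 101 in speech; get-passives: 5 vs. 26, all per 100K words). The short sketch below does that arithmetic; the variable and function names are illustrative.

```python
# Normalised frequencies per 100K words, as reported earlier in the talk.
BE_PASSIVE = {"writing": 854, "speech": 101}
GET_PASSIVE = {"writing": 5, "speech": 26}


def ratio(freqs: dict, numerator: str, denominator: str) -> float:
    """How many times as frequent one register is compared with the other."""
    return freqs[numerator] / freqs[denominator]


be_w_over_s = ratio(BE_PASSIVE, "writing", "speech")
get_s_over_w = ratio(GET_PASSIVE, "speech", "writing")

print(f"be-passives: {be_w_over_s:.1f} times as frequent in writing as in speech")
print(f"get-passives: {get_s_over_w:.1f} times as frequent in speech as in writing")
```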
Syntactic functions
• Finite vs. non-finite
– Finite: predicate
– Non-finite: adjectival, adverbial, complement, object, subject
• Typically used as predicates
– 97% of be-passives and 96% of get-passives
– Sometimes found in object and complement positions
– Rarely used as subjects
• Distribution of get-passives is more balanced across syntactic functions
Passives in Chinese: Notional, syntactic vs. lexical
• Marked (47%) vs. unmarked (53%) passives
– Unmarked passives: notional or pseudo-passives
– Topic sentences (topic + comment)
  • e.g. fan (meal) <*bei (PSV)> zuo-hao (do-ready) le (PERF) ‘The dinner is cooked (ready)’ (LCMC)
• Syntactic vs. lexical passives
– Passivised verbs do not inflect morphologically
– Syntactic passive markers
  • Bei: the most frequent, ‘universal’ passive marker
  • Gei, jiao, rang: not fully grammaticalised, typically in colloquial genres & dialects
  • Wei…suo: archaic, only in formal written genres
– Lexical passives: ai, shou(dao), zao(dao)
  • Inherently passive
Long vs. short passives
• Bei and gei: in both long (40%, 43%) and short (60%, 57%) passives
• Wei, jiao and rang: only in long passives
• Shou and zao: more frequent in short (68%, 63%) than long (32%, 37%) passives
• Ai: almost exclusively in short passives (97%)
• Long passives: in speech and colloquial genres; short passives: typically in written genres such as J, H and G
Agent NPs in syntactic vs. lexical passives
• Can be systematically interpreted as attributive modifiers of (nominalised) verbs in lexical passives, but cannot in syntactic passives, cf.
– A) danshi (but) zhe (this) yi (one) jianyi (proposal) zaodao (suffer) Xide (West Germany) zongli (prime minister) <de (PRT)> jujue (reject/rejection) ‘But this proposal was rejected by the prime minister of West Germany’ (LCMC)
– B) wo-men (we) na-ge (that-CL) che (car), bei (PSV) Xinhuan (Xinhuan) <*de (PRT)> nong-huai (ruin) le (PERF) ‘Our car was ruined by Xinhuan’ (CallHome)
Syntactic functions
• Most frequent in the predicate position
– 76% of syntactic passives (74% of bei); 75% of lexical passives
• Non-predicate uses
– Attributive modifier: second most important syntactic function (14%)
– Uncommon as subjects or complements
Interaction with aspect
• Interacting with aspect closely (Xiao and McEnery 2004b)
– Syntactic passives convey an aspectual meaning of result
• Bare passives account for the largest proportions of syntactic (40%) and lexical (78%) passives
• BUT perfective -le is not uncommon in both syntactic (17%) and lexical (11%) passives
• RVCs and resultative de-structure are more common in syntactic passives; bare forms are more frequent in lexical passives
• Passivised verbs in bare forms are uncommon in syntactic passives, especially when they function as predicates
Pragmatic meanings
• Typically express a negative pragmatic meaning
– “usually of unfavourable meanings” (Chao 1968: 703)
• Universal passive marker bei derived from its main verb usage, meaning ‘suffer’ (Wang 1957)
• Under the influence of Western languages, Chinese passives are no longer restricted to verbs with an inflictive meaning
– Proportions of negative pragmatic meaning
  • Syntactic passives: gei (68%), rang (67%), bei (52%), jiao (50%), wei (19%)
  • Lexical passives: ai (100%), zao (100%), shou (65%)
– Collocates of bei-passives
  • 51% negative, 39% neutral, 10% positive
Distribution across genres
• 11 times as frequent in writing as in speech
• Most common in religious writing (D) and mystery/detective stories (L)
– Mystery/detective stories are often concerned with victims who suffer from various kinds of mishaps or what criminals do to them
– In religions, human beings are passive animals whose fate is controlled by some kind of supernatural force
• Least frequent in news editorials (C) and official documents (H)
• Universal passive marker bei
– Contrast in proportions between long vs. short passives typically less marked in 5 types of fiction (K-P), humour (R) and speech (S)
– Predominantly negative in speech (S); more often than not negative in news editorials (C), mystery/detective stories (L), and adventure stories (N); but rarely negative in official documents (H) and academic prose (J)
Contrast: Overall frequencies
• Passive constructions are significantly more common in English than in Chinese (nearly 10 times as frequent)
– English (be-)passives occur in both dynamic and stative situations; Chinese passives can only occur in dynamic events
– Chinese passives typically have a negative pragmatic meaning; English passives (esp. be-passives) do not
– Unmarked notional passives are more common in Chinese
• Chinese topic-oriented; English subject-oriented
– English tends to over-use passives, esp. in formal writing (Quirk 1968; Baker 1985); Chinese tends to avoid syntactic passives wherever possible
  • Chinese uses topic sentences instead
Contrast: Long vs. short passives
• The agent NP in the long passive follows the passivised verb in English but precedes it in Chinese
• Short passives are predominant in English; long passives are not uncommon in Chinese
– Passives are used in English to avoid mentioning the agent
– The agent must normally be spelt out in Chinese passives
  • This constraint has become more relaxed nowadays
• When it is difficult to spell out the agent…
– Passives are used in English
– In Chinese, a vague expression such as ren/youren ‘someone’ or renmen ‘people’ is used instead of using passives
Contrast: Pragmatic meanings
• Chinese passives are more frequently used with a negative pragmatic meaning than English passives
– Chinese passives were used at early stages primarily for unpleasant or undesirable events; the semantic constraint on the use of passives has become more relaxed, especially in writing
– Rank order of meaning categories
  • English: neutral > negative > positive
  • Chinese: negative > neutral > positive
– In this respect, the get-passive is more akin to Chinese passives than the unmarked be-passive – more stylistically oriented
Contrast: Syntactic functions
• Passives are most frequently used in the predicate position in English and Chinese
• Proportion of passives used as predicates in English (over 95%) is much greater than that in Chinese (76% on average)
• More frequent in the object than subject position in both languages
• More frequent as attributive modifiers in Chinese; more frequent as complements in English
• Passives in Chinese (esp. bei-passives) are more balanced across syntactic functions than English passives
• Chinese passives in the predicate position typically interact with aspect but this interaction is not obvious in English
Contrast: Distribution
• Unmarked English (be-)passives more frequent in informative (A-J) than imaginative writing (K-R); get-passives more common in speech and informal written genres
– H and J show very high proportions of passives in English, but they have the lowest proportions of passives in Chinese
  • Unmarked English passives function to mark objectivity and a formal style but Chinese passives do not have this function
• In Chinese, wei typically occurs in formal written genres; jiao, rang and gei are used in colloquial genres
– Mystery/detective stories (L) and religious writing (D) show exceptionally high proportions of passives in Chinese
• Different distributions are associated with different functions
– English (be-)passives: an impersonal, objective and formal style
– Chinese passives: ‘inflictive voice’
Contrast: Typological differences
• Klaiman’s (1991: 23) 3-way classification of grammatical voices
– Basic (unmarked) voice: active/middle voice
– Derived/non-basic (marked) voice: passivisation
– Pragmatic voice: involving ‘assignment to some sentential arguments of some special pragmatic status or salience’ (Klaiman 1991: 24)
• English passive: derived voice
• Chinese passive: pragmatic voice
Passive errors in Chinese Learner English
Corpora
• CLEC: the Chinese Learner English Corpus
– One million words
– Essays
– Five proficiency levels
• LOCNESS: the Louvain Corpus of Native English Essays
– 324,304 words
– Essays
– British A-Level children and British/American university students
Under-use of passives
Corpus   Words      Frequency  Per 100K words
CLEC     1,070,602  9,711      907
LOCNESS  324,304    5,465      1,685
LL=1235.6, 1 d.f., p<0.001
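The "Per 100K words" column is the raw passive count divided by the corpus size and scaled to 100,000 words, which is what makes two corpora of very different sizes comparable. A minimal sketch, with `per_100k` and `CORPORA` as illustrative names:

```python
def per_100k(hits: int, corpus_words: int) -> float:
    """Normalise a raw frequency to occurrences per 100,000 words."""
    return hits / corpus_words * 100_000


# (passive count, corpus size in words), copied from the table above.
CORPORA = {"CLEC": (9_711, 1_070_602), "LOCNESS": (5_465, 324_304)}

for name, (hits, words) in CORPORA.items():
    print(name, round(per_100k(hits, words)))
```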
Long vs. short passives
• Long passives are slightly more frequent in Chinese learner English
– Long passives in CLEC
  • 9.14%: 888 out of 9,711
– Long passives in LOCNESS
  • 8.44%: 461 out of 5,465
• Not statistically significant
– LL=2.184, 1 d.f., p=0.139
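The log-likelihood test for the long vs. short comparison can be sketched as a 2x2 contingency table (corpus by passive form). The slides do not name the exact formula used, so the standard G2 statistic below is an assumption; the function names are illustrative. The p-value line relies on the identity that, for 1 d.f., the chi-square upper tail equals erfc(sqrt(x/2)).

```python
import math


def log_likelihood_2x2(a: int, b: int, c: int, d: int) -> float:
    """G2 for the table [[a, b], [c, d]] (rows: corpora; columns: long, short)."""
    n = a + b + c + d
    observed = [a, b, c, d]
    row_totals = [a + b, a + b, c + d, c + d]
    col_totals = [a + c, b + d, a + c, b + d]
    g2 = 0.0
    for obs, row, col in zip(observed, row_totals, col_totals):
        expected = row * col / n
        if obs > 0:  # a zero cell contributes nothing to the sum
            g2 += obs * math.log(obs / expected)
    return 2 * g2


def p_value_1df(g2: float) -> float:
    """Upper-tail chi-square probability for 1 degree of freedom."""
    return math.erfc(math.sqrt(g2 / 2))


# Counts from the slide: long passives out of all passives in each corpus.
clec_long, clec_all = 888, 9_711
locness_long, locness_all = 461, 5_465

g2 = log_likelihood_2x2(clec_long, clec_all - clec_long,
                        locness_long, locness_all - locness_long)
print(f"LL={g2:.3f}, p={p_value_1df(g2):.3f}")
```

With these counts the sketch should land close to the LL=2.184 and p=0.139 reported on the slide, which suggests the comparison was run on a table of this shape.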
Pragmatic meanings
• Passives are more frequently negative in Chinese learner English
– CLEC
  • Negative: 25.7%
  • Positive: 5.9%
  • Neutral: 68.4%
– LOCNESS
  • Negative: 16.8%
  • Positive: 4.4%
  • Neutral: 78.8%
– LL=7.4, 2 d.f., p=0.025
[Figure: percentage of positive, neutral and negative meanings by corpus (CLEC vs. LOCNESS)]
Passive errors vs. learner levels
• Learners at higher levels generally make fewer passive errors
• Four major types of passive errors
• Under-use is the most important error type
• Learning curve is not a straight line, especially for difficult items
[Figure: frequency per 200K words of each error type (aux. omission, misformation, overuse, underuse, all error types) by proficiency level (ST2 to ST6)]
Error types vs. learner levels
• Error types are associated with learner levels
– LL=51.774, 12 d.f., p<0.001
• Similar learner groups make similar types of errors
– ST2 >> ST3: statistically significant (LL=27.303, 3 d.f., p<0.001)
– ST3 >> ST4: not significant (LL=6.955, 3 d.f., p=0.073)
– ST4 >> ST5: statistically significant (LL=18.563, 3 d.f., p<0.001)
– ST5 >> ST6: not significant (LL=6.987, 3 d.f., p=0.072)
• Learner groups: ST2 (high school students); ST3/ST4 (junior/senior non-English major students); ST5/ST6 (junior/senior English major students)
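The p-values quoted with each LL score follow from the chi-square distribution with the stated degrees of freedom. The sketch below computes the upper-tail probability for any whole number of degrees of freedom using only the standard library; `chi2_sf` is an illustrative name, and the closed-form series it uses is a standard identity rather than anything stated in the talk.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer d.f."""
    h = x / 2
    if df % 2 == 0:
        # Even d.f.: exp(-h) * sum of h^i / i! for i below df/2.
        terms = (h ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-h) * sum(terms)
    # Odd d.f.: erfc(sqrt(h)) plus a finite series of half-integer powers.
    terms = (h ** (i - 0.5) / math.gamma(i + 0.5)
             for i in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(h)) + math.exp(-h) * sum(terms)


# (LL score, d.f.) pairs taken from the slide.
TESTS = {
    "all levels": (51.774, 12),
    "ST2 >> ST3": (27.303, 3),
    "ST3 >> ST4": (6.955, 3),
    "ST4 >> ST5": (18.563, 3),
    "ST5 >> ST6": (6.987, 3),
}

for label, (ll, df) in TESTS.items():
    print(f"{label}: LL={ll}, {df} d.f., p={chi2_sf(ll, df):.3f}")
```

The 12 d.f. for the overall test is consistent with a table of five proficiency levels by four error types, since (5 - 1) x (4 - 1) = 12.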
Under-use: L1 transfer
• Borne out by the contrastive analysis
• Confirmed by the CLEC-LOCNESS comparison
• Result of L1 transfer
• Typically occurs with verbs whose Chinese equivalents are not normally used in passives, e.g.
– A birthday party will hold in Lily’s house. (ST2)
– …or our efforts will waste. (ST4)
– The woman in white called Anne Catherick. (ST5)
• Also under the influence of Chinese topic sentences
– The supper had done. (ST2)
Over-use: three major types
• Intransitive verbs used in passives, e.g.
– A very unhappy thing was happened in this week. (ST2)
– Their friendships are not died off with the passing of time (ST4)
– I was graduated from Zhongshan University (ST5)
• Misuse of ergative verbs, e.g.
– …the science <sic. secince> is developed quickly (ST4)
– …infant mortality was declined (ST4)
• Passive training effects, e.g.
– …many machines <sic. machine> and appliances <sic. appliance> are used electricity as power (ST5)
– Because they have been mastered everything of this job… (ST4)
Misformation: L1 interference
• Result of L1 interference
• Related to morphological inflections
– Passivised verbs do not inflect in L1 Chinese
• Tend to use uninflected verbs or misspelt past participles in passives, e.g.
– The door is wrap with two coats of iron (ST5)
– His relatives can not stop him, because his choice is protect by the laws. (ST6)
– Since the People’s Republic of China <sic. china> was found on October 1, 1949… (ST2)
– I was moving at that time, but I didn't cry. (ST2)
Auxiliary omission: L1 interference
• Result of L1 interference
– Unmarked ‘notional passives’ are abundant in Chinese
• Tend to omit or misuse auxiliaries in passives, e.g.
– …and we will not satisfied with what we have done. (ST4)
– In China, since the new China established, people’s life has gotten <sic. goten> better and better. (ST3)
– I am not a smoker, but why do we forced to be a second-hand smoker? (ST5)
Conclusions
• While passive constructions express a basic passive meaning in both English and Chinese, they also show a range of differences which are associated with their different functions in the two languages
• Most passive-related errors made by Chinese learners of English can be accounted for from a contrastive perspective
• A combination of contrastive study and learner corpus analysis can bring insights into language acquisition research
Thank you!
References (1)
• Baker, M. (1993) ‘Corpus linguistics and translation studies’. In M. Baker, G. Francis & E. Tognini-Bonelli (eds.) Text and technology (pp. 233-52). Amsterdam: Benjamins.
• Baker, M. (1999) ‘The role of corpora in investigating the linguistic behaviour of professional translators’. International Journal of Corpus Linguistics 4: 281-98.
• Baker, S. (1985) The Practical Stylist [6th ed.]. New York: Harper & Row.
• Biber, D. (1988) Variation across Speech and Writing. Cambridge: Cambridge University Press.
• Carter, R. and McCarthy, M. (1999) ‘The English get-passive in spoken discourse’. English Language and Linguistics 3(1): 41-58.
• Chao, Y. (1968) A Grammar of Spoken Chinese. Berkeley: University of California Press.
• Frawley, W. (1984) ‘Prolegomenon to a theory of translation’. In W. Frawley (ed.) Translation: Literary, linguistic and philosophical perspectives (pp. 159-75). London: Associated University Press.
References (2)
• James, C. (1980) Contrastive Analysis. London: Longman.
• Kenny, D. (1998) ‘Creatures of habit? What translators usually do with words?’ Meta 43(4).
• Klaiman, M. (1991) Grammatical Voice. Cambridge: Cambridge University Press.
• Laviosa, S. (1998) ‘Core patterns of lexical use in a comparable corpus of English narrative prose’. Meta 43(4).
• McEnery, A. and Xiao, Z. (2002) ‘Domains, text types, aspect marking and English-Chinese translation’. Languages in Contrast 2(2): 211-31.
• McEnery, A., Xiao, Z. and Mo, L. (2003) ‘Aspect marking in English and Chinese’. Literary and Linguistic Computing 18(4): 361-78.
• McEnery, A., Xiao, Z. and Tono, Y. (2005) Corpus-Based Language Studies. London: Routledge.
• Øverås, S. (1998) ‘In search of the third code: An investigation of norms in literary translation’. Meta 43(4).
References (3)
• Quirk, R. (1968) The Use of English [2nd ed.]. London: Longman.
• Santos, D. (1996) Tense and Aspect in English and Portuguese: A contrastive semantical study. PhD thesis. Universidade Tecnica de Lisboa.
• Wang, L. (1957) ‘Hanyu beidongju de fazhan (Development of Chinese passives)’. Yuyanxue Luncong (Studies in Linguistics) Vol. 1. Beijing: Commercial Printing House.
• Xiao, Z. and McEnery, A. (2004a) ‘A corpus-based two-level model of situation aspect’. Journal of Linguistics 40(2): 325-63.
• Xiao, Z. and McEnery, A. (2004b) Aspect in Mandarin Chinese. Amsterdam: John Benjamins.
• Xiao, Z. and McEnery, A. (2006) ‘Collocation, semantic prosody and near synonymy: a cross-linguistic perspective’. Applied Linguistics. [In press]
• Xiao, Z., McEnery, A. and Qian, Y. (2006) ‘Passive constructions in English and Chinese: a corpus-based contrastive study’. Languages in Contrast 3(1).