Journal of East Asian Linguistics, ISSN 0925-8558, Volume 25, Number 1
J East Asian Linguist (2016) 25:1-35, DOI 10.1007/s10831-015-9135-0
The productivity of variable disyllabic tone sandhi in Tianjin Chinese
Jie Zhang & Jiang Liu
Received: 9 January 2013 / Accepted: 23 December 2014 / Published online: 3 November 2015
© Springer Science+Business Media Dordrecht 2015
Abstract Tianjin Chinese has one of the more complex tone sandhi systems in
Northern Chinese dialects. Due to its close contact with Standard Chinese, many of
its tone sandhi patterns are also variable. This article first reports a detailed acoustic
study of tone sandhi patterns in both real lexical items and novel words in Tianjin.
The data were collected from 48 speakers of Tianjin, who were instructed to pronounce
disyllabic sequences as real words based on voice prompts. The results
showed that the productivity of the sandhis in novel words varied depending on the
sandhi: some were less productive than in real words, and some were more productive,
indicating a combination of underlearning, overlearning, and proper
learning of the sandhis from the lexicon. A theoretical model that predicts the
productivity patterns based on the phonetic properties of the sandhis and statistical
generalizations about the sandhis over the lexicon is then proposed.
Keywords Tone · Tone sandhi · Tianjin · Productivity · Optimality theory ·
Maximum entropy grammar
Electronic supplementary material The online version of this article (doi:
10.1007/s10831-015-9135-0) contains supplementary material, which is available to authorized
users.
Jie Zhang: Department of Linguistics, The University of Kansas, 1541 Lilac Lane, Blake Hall, Room 427,
Lawrence, KS 66045-3129, USA
Jiang Liu: Department of Asian Languages and Literatures, University of Minnesota, 220 Folwell Hall, 9
Pleasant Street SE, Minneapolis, MN 55455, USA
1 Introduction
1.1 Two types of evidence for phonological knowledge
Kenstowicz and Kisseberth, in Chap. 5 of their seminal Generative phonology: Description and theory (1979), raised a serious methodological issue for generative
phonology research: they questioned the assumption that the phonological
abstractions derived by traditional research methods that focused on lexically
manifested patterns of sound distribution and morpheme alternation were the same
abstractions in speakers’ unconscious phonological knowledge—the knowledge that
generative phonology aims to uncover. Consequently, they advocated the research
practice of complementing the evidence gleaned from such traditional sources with
evidence from speakers’ linguistic behavior that directly manifested their unconscious
knowledge, from speech errors and language games to loanwords and second
language acquisition.
Their skepticism of the assumption turned out to be well founded as subsequent
research showed that speakers know both more and less than the lexical patterns. A
number of recent studies have shown that speakers possess phonological knowledge
that the lexical patterns of their language do not inform them of—a scenario that we
will refer to as “overlearning.” For instance, Zuraw (2007) showed through a corpus
study on loans and a web-based survey on novel words that Tagalog speakers
possessed knowledge of the splittability of word-initial consonant clusters that could
not be deduced from the lexicon. Berent et al. (2007) demonstrated through a series
of experiments that English speakers preferred /bd/ as an onset cluster over /lb/,
even though neither is a legal onset cluster in English. In an artificial language-
learning setting, Wilson (2006) established that when English speakers were
presented with velar palatalization before mid vowels, they could extend the process
before high vowels but not vice versa. These have been taken as “the poverty of the
stimulus” arguments for the relevance of Universal Grammar or substantive biases
in phonological learning.
“Underlearning,” alternatively termed “the surfeit of the stimulus” (Becker et al.
2011), refers to speakers’ subpar knowledge, and sometimes total ignorance, of
generalizable patterns in the lexicon. For example, Becker et al. (2011) found that
Turkish speakers could generalize to novel words the statistical patterns seen in
relations between an obstruent voicing alternation and word length as well as place
of articulation in obstruents in the lexicon, but they were oblivious to a similar
statistically significant relation between the voicing alternation and properties of the
preceding vowel (height, backness). Hayes et al. (2009b) investigated the variation
patterns in suffixal vowel harmony in Hungarian and compared how speakers
internalized two types of gradient patterns in novel words—natural ones in which
the harmony behavior is based on the properties of the stem vowels (number of
triggers, height of the trigger) and unnatural ones in which the harmony is correlated
with features of the stem-final consonant. They found that speakers learned both the
natural and unnatural patterns, but the unnatural patterns were undervalued and
learned less robustly than the natural ones. Using an artificial language-learning
paradigm, Moreton (2008) showed that English speakers learned a vowel height-voicing
dependency significantly more poorly than a height-height dependency
despite the facts that (a) neither dependency is attested in English, (b) the
dependency in question was present in the learning experiment, and (c) the two
dependencies have comparable phonetic precursors. These results also suggest that
speakers’ phonological knowledge is the combined result of learned lexical patterns
and a priori knowledge.
These studies support Kenstowicz and Kisseberth’s thesis that evidence for
speakers’ phonological knowledge needs to come from both within and beyond
lexical patterns. Beyond the areas identified by Kenstowicz and Kisseberth such as
speech errors and loanwords, corpus-external evidence has emerged from experimental
investigations of productivity, especially in the form of wug tests (Berko
1958), in which speakers are asked to provide responses to novel words in contexts
that are facilitative to the application of the phonological process in question. This
methodology has been widely used to test the productivity of phonological
alternations (e.g., Albright et al. 2001; Hayes and Londe 2006; Zuraw 2007; Hayes
et al. 2009b; Becker et al. 2011) as well as regular and irregular morphological rules
(e.g., Bybee and Pardo 1981; Albright 2002; Albright and Hayes 2003;
Pierrehumbert 2006).
1.2 The role of productivity in tone sandhi research
Tone sandhi research, particularly descriptive work, has had a long tradition in
Chinese phonology. Both detailed descriptions of tone sandhi in individual dialects
and typological works on cross-linguistic patterns of tone sandhi abound (see Zhang
2014a, b for reviews and references). The relation between tone sandhi and
theoretical phonology, however, has been an uncomfortable one. The analysis of
Chinese tone sandhi patterns has presented considerable challenges to theoretical
phonology in both rule-based and constraint-based frameworks, and complete
theoretical analyses of any given tone sandhi system have proven difficult. Beyond
the sheer complexity of tone sandhi patterns often observed in Chinese dialects,
especially in the Wu and Min groups, three other properties of tone sandhi are
responsible for this difficulty. First, as the result of diachronic changes, many of the
sandhi patterns in the present-day systems are phonetically arbitrary. This presents
particular challenges to the analysis of these patterns in Optimality Theory (Prince
and Smolensky 1993), which relies on surface-oriented, generalizable markedness
constraints. Second, many of the tone sandhi patterns are phonologically opaque
(Kiparsky 1973). For example, in Taiwanese, four of the five tones in the tonal
inventory on non-checked syllables are involved in a circular chain shift:
55 → 33 → 21 → 51 → 55 (Cheng 1968; Chen 1987); in Fuzhou, the following
synchronic chain shifts are attested: 32 → 44 → 53 → 21 / __ {212, 242};
44 → 53 → 32 → 24 / __ 32 (Liang and Feng 1996). These patterns also pose
analytical challenges for Optimality Theory: circular chain shift has been shown to
be incomputable by a “conservative” OT grammar that uses only IO-faithfulness
and markedness constraints (Moreton 2004), and regular chain shift requires
additional mechanisms such as constraint conjunction to be captured (Kirchner
1996). Third, due to complex contact situations as well as internal factors, many
sandhi patterns are riddled with variation and exceptions. Under these contexts, it is
particularly worthwhile to ask whether the lexical sandhi patterns that the speakers
encounter are a true reflection of their phonological knowledge via productivity
studies. Do speakers overlearn/generalize sandhi patterns in the face of variation
and exceptions? Do speakers underlearn lexical regularities in tone sandhi due to
their phonetic arbitrariness and phonological opacity? In other words, we need to
expand our empirical basis from which theoretical analysis of tone sandhi proceeds
to include not only lexical patterns of tone sandhi but also experimental evidence of
tone sandhi productivity. This was exactly Kenstowicz and Kisseberth’s recommendation
to phonologists over 30 years ago.
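The circularity of the Taiwanese tone circle cited in this section (55 → 33 → 21 → 51 → 55) can be made concrete with a small lookup table. The following is only an illustrative sketch: the names `TONE_CIRCLE`, `apply_sandhi`, and `orbit` are hypothetical and do not come from any of the cited analyses.

```python
from typing import Dict, List

# Non-final tone changes on non-checked syllables in Taiwanese, as cited in
# the text (Cheng 1968; Chen 1987). The dictionary name is illustrative only.
TONE_CIRCLE: Dict[str, str] = {"55": "33", "33": "21", "21": "51", "51": "55"}


def apply_sandhi(tone: str) -> str:
    """Return the sandhi form of a base tone; tones outside the circle are unchanged."""
    return TONE_CIRCLE.get(tone, tone)


def orbit(tone: str) -> List[str]:
    """Follow the mapping from `tone` until it comes back to its starting point."""
    path = [tone]
    current = apply_sandhi(tone)
    while current != tone:
        path.append(current)
        current = apply_sandhi(current)
    return path


if __name__ == "__main__":
    # Every output tone is also an input tone, which is what makes the shift circular.
    print(orbit("55"))
```

Because the set of outputs equals the set of inputs, no tone in the circle is consistently avoided on the surface, which is one way to see why the pattern resists an account based only on markedness and input-output faithfulness.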
Using wug tests to investigate the productivity of tone sandhi patterns can be
traced back to the ground-breaking work of Hsieh (1970, 1975, 1976), who showed
that the opaque tone sandhi circle in Taiwanese is generally not productive. Later
works by Wang (1993), Zhang and Lai (2008), and Zhang et al. (2009, 2011)
replicated and expanded Hsieh’s studies and reached similar conclusions. Zhang and
Lai (2008) and Zhang et al. (2009, 2011), in addition, showed that sandhi
productivity is also correlated with the frequencies of the sandhi patterns in the
lexicon and the phonetic nature of the tone change: sandhis that have higher type
and token frequencies in the lexicon tend to have higher productivity, and sandhis
that turn longer tones into shorter tones have a productivity advantage over sandhis
that turn shorter tones into longer tones due to the impoverished duration of the
sandhi position as compared to the non-sandhi position. Zhang and Lai (2010) tested
the productivity difference between the third-tone sandhi (213 → 35 / __ 213) and
half-third sandhi (213 → 21 / __ T, T ≠ 213) in Standard Chinese in two wug test
experiments and showed that the former applies less productively in novel words
than the latter. They argued that the results were due to the fact that the half-third
sandhi is a contour reduction process directly related to the shortened duration in
non-final positions and thus has a clearer phonetic motivation than the third-tone
sandhi, which (a) has a long diachronic history, (b) involves a pitch raising not
easily explainable by phonetics, and (c) is also perceptually neutralizing. Zhang and
Meng (2012) demonstrated that in Shanghai Wu, rightward contour extension,
which effectively reduces contour tones on both syllables, is more productive than
rightward contour displacement, which does not level the contour, and in the
meantime causes large phonetic mismatches in both stress and tonal contour
between the base and sandhi tones. These studies indicate that wug testing the
productivity of tone sandhi patterns is a worthy research endeavor as speakers’
phonological knowledge can indeed differ from lexically manifested sandhi patterns
due to the phonetic (e.g., tone duration, tone similarity) and phonological (e.g.,
opacity) properties of the sandhis.
What we hope to achieve in this article is to present a productivity study of the
tone sandhi system in Tianjin Chinese, which differs from previously investigated
sandhi systems in a number of respects. First, as a northern dialect with a close
affinity to the Beijing dialect and Standard Chinese, Tianjin’s sandhi pattern is also
“right-dominant” (Yue-Hashimoto 1987), in that the tone at the right edge of the
sandhi domain remains intact while non-final tones undergo sandhi. But its sandhi
pattern is considerably more complex than that of Beijing and Standard Chinese.
Second, different from the “right-dominant” southern Min dialects like Taiwanese,
the Tianjin sandhi pattern does not involve phonological opacity. Third, the sandhi
pattern in Tianjin is riddled with variation and exceptions, likely due to its close
contact with the Beijing dialect and the dominance of Standard Chinese. The
productivity study on Tianjin tone sandhi, therefore, allows us to expand the
typology of sandhi productivity, address new questions such as the effect of
variation and exceptions on productivity, and in the meantime provide further tests
of some of the hypotheses mentioned earlier, such as the relevance of lexical
frequency and phonetic properties to sandhi productivity. In the rest of the article,
we introduce the Tianjin tone sandhi pattern first in Sect. 1.3, then discuss the
hypotheses and the methodology of the productivity study in Sect. 2. Results of our
experiment follow in Sect. 3. We then provide a theoretical model for our results in
Sect. 4. Discussions and concluding remarks are provided in Sect. 5.
1.3 Tianjin tone sandhi
Tianjin Chinese is spoken in the city of Tianjin 65 miles to the southeast of Beijing.
Its four lexical tones are cognates with the four tones in Standard Chinese, but the
pitch values of the tones in the two dialects differ, as shown in (1) (Chen 2000).¹
The four-way contrast maᴴ ‘mother’ ~ maᴹᴴ ‘hemp’ ~ maᴹᴸᴴ ‘horse’ ~ maᴴᴸ
‘to scold’ in Standard Chinese, for example, is realized as maᴸ ~ maᴴ ~ maᴸᴴ
~ maᴴᴸ in Tianjin.
(1) Lexical tones in Tianjin Chinese and Standard Chinese:
                     Tone 1   Tone 2   Tone 3   Tone 4
  Tianjin            L        H        LH       HL
  Standard Chinese   H        MH       MLH      HL
As mentioned previously, despite its close affinity and similarity to Standard
Chinese, Tianjin has a considerably more complex system of tone sandhi. The
traditional disyllabic sandhis reported in Li and Liu (1985) and later confirmed by
Shi (1986), Yang et al. (1999), and Chen (2000), are summarized in (2). The T3+T3
sandhi in (2b) is cognate with the third-tone sandhi in Standard Chinese, which also
changes a T3 to a T2 before another T3. The other three sandhis are not attested in
Standard Chinese nor do they have extensive synchronic counterparts in other
dialects to the best of our knowledge.
1 The transcriptions of the Tianjin tones vary from source to source. For example, using Chao’s tone
numbers (Chao 1968), Li and Liu (1985) transcribed the four tones as 21, 45, 213, 54, respectively, while
Shi (1990) used 11, 55, 24, 53. We use Chen’s (2000) notation here. For more detailed discussion and
acoustic data on Tianjin citation tones, see Zhang and Liu (2011).
(2) Tianjin disyllabic tone sandhi I:
a. L+L → LH+L (T1+T1 → T3+T1)
b. LH+LH → H+LH (T3+T3 → T2+T3)
c. HL+L → H+L (T4+T1 → T2+T1)
d. HL+HL → L+HL (T4+T4 → T1+T4)
Shi (1988) noted that the four sandhi processes in (2) applied with different
propensities in Tianjin. Under the criteria of the number of lexical exceptions and
the likelihood with which the base-tone combinations surface as the result of tone
sandhi in longer sequences, Shi ordered the sandhis according to their “strength” as
follows: (T3+T3) > (T1+T1) > (T4+T4) > (T4+T1). From the recordings of
204 Tianjin speakers in different age groups, Shi and Wang (2004) showed that the
T4+T1 sandhi had a tendency to apply with greater regularity among younger
speakers (close to 100 % application for speakers younger than 20 but only around
60 % for speakers older than 70), and the T4+T4 sandhi had generally become
obsolete for younger speakers (close to 0 % application for <20 years; around 40 %
for >70 years).² The disappearance of the T4+T4 sandhi has also been reported in
Liu and Gao (2003) and Gao (2004), and they attributed the disappearance to the
influence of Standard Chinese, which has a similar T4 (51) that does not undergo
sandhi before another T4. Shi and Wang’s (2004) results were in general agreement
with Zhang and Liu’s (2011) acoustic findings on disyllabic tone sandhi from 12
Tianjin speakers (average age = 34.3), which showed that the T3+T3 and T1+T1
sandhis applied consistently, the T4+T1 sandhi had a small number of exceptions,
and the T4+T4 sandhi only applied to a handful of words for a small subset of the
speakers. Furthermore, Zhang and Liu (2011) showed that the sandhi patterns, even
when they applied, generally did not result in tonal neutralization as the description
in (2) implies, as the sandhi tone always preserved certain pitch properties from the
base tone.
Wee (2004) reported two additional tone sandhis for Tianjin, given in (3). These
sandhis likely originated from the half-third sandhi in Standard Chinese, whereby
the falling-rising T3 is realized as its first half before a tone other than T3 (213
+T → 21+T, T ≠ 213). Although Wee (2004) reported these sandhis as
neutralizing sandhis (neutralization of T3 and T1 in the sandhi contexts), Ma and
Jia’s (2006) acoustic and perceptual studies showed that neither sandhi in (3) was
truly neutralizing: the sandhi tones partially preserved the rising property of T3, and
listeners could identify the difference between T1 and T3 in the sandhi contexts
with an accuracy rate of over 85 %. Zhang and Liu’s (2011) acoustic results further
supported the incomplete neutralization property of these two sandhis. In our
discussion of the sandhis below, we will still use the conventional categorical
transcriptions, but only as a convenient shorthand.
2 In addition, Shi and Wang (2004) found that for T1+T1, younger speakers (<20 years)
consistently used T2+T1 as the sandhi tones, not the previously reported T3+T1, while older speakers
(>70 years) varied between T3+T1 and T2+T1. See Lu (1997, 2004) and Zhang and Liu (2011) for similar
findings and additional discussions.
(3) Tianjin disyllabic tone sandhi II:
a. LH+H → L+H (T3+T2 → T1+T2)
b. LH+HL → L+HL (T3+T4 → T1+T4)
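The six categorical rules in (2) and (3) can be collected into a single lookup table. This is a minimal sketch under the simplifying assumption that the sandhis are exceptionless and fully neutralizing, which the acoustic studies cited above show they are not; the names `PITCH`, `SANDHI`, `citation_to_sandhi`, and `describe` are hypothetical.

```python
from typing import Dict, Tuple

# Pitch shapes of the four Tianjin citation tones, following (1).
PITCH: Dict[int, str] = {1: "L", 2: "H", 3: "LH", 4: "HL"}

# Categorical rules from (2) and (3): (first tone, second tone) -> sandhi tone
# of the first syllable. The second syllable is never changed (right-dominant).
SANDHI: Dict[Tuple[int, int], int] = {
    (1, 1): 3,  # (2a) L+L   -> LH+L
    (3, 3): 2,  # (2b) LH+LH -> H+LH
    (4, 1): 2,  # (2c) HL+L  -> H+L
    (4, 4): 1,  # (2d) HL+HL -> L+HL
    (3, 2): 1,  # (3a) LH+H  -> L+H
    (3, 4): 1,  # (3b) LH+HL -> L+HL
}


def citation_to_sandhi(first: int, second: int) -> Tuple[int, int]:
    """Map a pair of citation tones to its categorical surface tones."""
    return SANDHI.get((first, second), first), second


def describe(first: int, second: int) -> str:
    """Render a tone pair and its surface form in L/H notation."""
    s1, s2 = citation_to_sandhi(first, second)
    return f"{PITCH[first]}+{PITCH[second]} -> {PITCH[s1]}+{PITCH[s2]}"


if __name__ == "__main__":
    for pair in sorted(SANDHI):
        print(describe(*pair))
    unchanged = [(a, b) for a in PITCH for b in PITCH if (a, b) not in SANDHI]
    print(len(unchanged), "tone combinations are not listed as undergoing sandhi")
```

Of the sixteen possible disyllabic combinations, six appear in the table and the remaining ten pass through unchanged in this sketch.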
The complexity of tone sandhi in Tianjin, therefore, comes not only from the
intricacy of the pattern itself but also from the variation and exceptions in the
pattern and the changes that it is currently undergoing. The pattern itself, then, is not
only interesting in its own right but also presents an opportunity to contribute to the
theoretical debate on the roles of variation and exceptions in the formal grammar—
an issue that has captured much attention in the recent phonological literature (see
Coetzee and Pater 2011 for a review). It is also worth noting that the complexity of
the Tianjin sandhi pattern does not involve opaque chain shifts as in Taiwanese. A
study of the productivity of the tone sandhi pattern in Tianjin, therefore, allows us to
investigate the speakers’ knowledge of a typologically different kind of sandhi
system. We lay out the specific hypotheses about the productivity of Tianjin tone
sandhi and the methodology for the study in the next section.
2 Hypotheses and methodology
2.1 Hypotheses
We have seen in Sect. 1.2 that a series of studies on the productivity of tone sandhi
patterns in Chinese dialects has shown that the phonological transparency, phonetic
properties, and lexical frequency of a sandhi can all affect its productivity in novel
words. Phonological transparency is not relevant here as all Tianjin tone sandhis are
transparent. But we expect the effects of phonetic properties and lexical frequency
to manifest themselves in Tianjin. In particular, we first hypothesize that regular
sandhis with a strong phonetic basis, such as the half-third sandhis LH+H → L+H
(T3+T2 → T1+T2) and LH+HL → L+HL (T3+T4 → T1+T4), would be more
productive than other regular sandhis L+L → LH+L (T1+T1 → T3+T1) and LH
+LH → H+LH (T3+T3 → T2+T3), whose phonetic basis is less strong. Our
judgment of the strength of the phonetic basis follows that of Zhang and Lai
(2010) for Standard Chinese. The half-third sandhi is a contour reduction process
directly related to the shortened duration in non-final positions.³ The other sandhis
have properties that are not directly related to phonetic reduction. The T1+T1
sandhi involves a contouring process in non-final position, which is typologically
rare (Yue-Hashimoto 1987; Zhang 2002). It also cannot be easily interpreted as
phonetically motivated dissimilation as coarticulatory dissimilation typically
involves the raising of a high tone before a low tone (see Gandour et al. 1994 for
Thai; Xu 1997 for Standard Chinese; Peng 1997 for Taiwanese; and Zhang and Liu
2011 for Tianjin). The third-tone sandhi, like in Standard Chinese, also involves a
raising of the pitch not easily explainable by phonetics.
3 An anonymous reviewer questioned the phonetic basis of the T3+T2 sandhi as the opposite pattern,
whereby L+H → LH+H, is attested in African languages. But this type of regressive tone spreading is
considerably rarer than progressive assimilation (Maddieson 1978; Hyman 2007; Zhang 2007). Hyman
(2007) in fact goes on to argue that regressive tone spreading is due to special circumstances involving
tone attraction to stressed positions or pressure from intonation at the right edge and therefore is not a
diachronically natural process.
Second, we hypothesize that ceteris paribus, sandhi patterns with higher type and
token frequencies will be more productive than those with lower frequencies. This
should be most clearly manifested in the comparison between the two half-third
sandhi patterns: both are equally motivated by insufficient duration, yet Tone 2 has
lower type and token frequencies than Tone 4 (based on Da 2004). Therefore, we
expect the T3+T2 sandhi to be less productive than the T3+T4 sandhi, a result also
found in Zhang and Lai’s (2010) study on Standard Chinese. Relatedly, we also
hypothesize that the token frequency of a particular lexical item is related to how the
sandhi applies to the item, in that higher frequency leads to higher productivity. If
so, then any underlearning or overlearning effects in novel words may be interpreted
as exaggerated frequency effects. This will further inform the theoretical model for
the speakers’ sandhi knowledge.
Third, for the sandhis with exceptions, we predict that they will tend to change in
the innovative direction in novel words. This is because we expect new words to
take on the behavior that represents the direction of change. In other words, the
disappearing HL+HL → L+HL (T4+T4 → T1+T4) should show further
underlearning in novel words while the sandhi gaining popularity—HL+L → H
+L (T4+T1 → T2+T1)—should be overlearned and generalized.
In short, we hypothesize that a Tianjin speaker’s knowledge of tone sandhi is a
combination of proper learning, underlearning, and overlearning from the lexicon:
the lexical frequency of the sandhi pattern is positively correlated with sandhi
productivity; however, sandhis that lack phonetic motivation should be underlearned
and lack full productivity, yet sandhis with a limited number of exceptions
should be overlearned and generalized.
2.2 Methodology
2.2.1 Experimental design
To test these hypotheses, we designed a wug test in which native speakers of Tianjin
were asked to pronounce two separately presented individual syllables together as a
real disyllabic word in Tianjin. All six sandhis in (2) and (3) were tested, and within
each sandhi, three types of words were used: real disyllabic words in Tianjin,
pseudo words composed of two actually occurring syllables in Tianjin, and novel
words in which the first syllable was an accidental gap in the Tianjin syllabary. An
accidental gap is a syllable in which both the segmentals and tone are legal, but their
combination happens to be missing in Tianjin. We will refer to these three groups as
REAL, PSEUDO, and NOVEL henceforth. REAL words were then further divided into
four subtypes according to whether the disyllable and the first syllable were of high
or low token frequency as in Fig. 1a, and PSEUDO words were further divided into
two subtypes depending on whether the first syllable had high or low token
frequency as in Fig. 1b. For each word-(sub)type/sandhi-type combination,
we used four different words, which resulted in 168 test words (6 × 7 × 4). Token
frequency data were derived from a corpus of written Chinese with 28,278,285
bigrams compiled from online resources by Da (2004). The mean raw bigram
frequency for the high-frequency disyllabic words is 3721, and that for the low-frequency
words is 178. Frequencies for the first syllables in high-frequency REAL,
low-frequency REAL, and PSEUDO words include the frequencies of all homophonous
characters, provided that the characters are among the 3500 most commonly used
characters in Da’s character corpus. In other words, these frequencies are
approximations of the frequencies of the phonetic syllables with tones. High-frequency
syllables all have a mean raw frequency over 210,000 while low-frequency
syllables all have a frequency under 80,000. Care was taken to minimize
the effect of tonal combination on word and syllable frequencies. We also used 160
fillers, 16 for each of the 10 disyllabic tonal combinations that did not undergo
sandhi. We did not control for whether the REAL words were verbs or nouns or
whether the PSEUDO words were more easily interpreted as verbs or nouns as word
category is not known to affect the application of disyllabic tone sandhi in either
Tianjin or Standard Chinese. Additional information on the selection of the stimuli
and the entire word list are given in Appendix 1 (see Supplementary material).
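The stimulus counts in this design follow from crossing the factors described above. The short sketch below simply redoes that arithmetic; the variable and function names are hypothetical, and the subtype labels are paraphrases of the design rather than the authors' own coding.

```python
# Six sandhi types from (2) and (3).
SANDHI_TYPES = ["T1+T1", "T3+T3", "T4+T1", "T4+T4", "T3+T2", "T3+T4"]

# Seven word (sub)types: four REAL, two PSEUDO, one NOVEL.
WORD_SUBTYPES = [
    "REAL: high-freq word, high-freq first syllable",
    "REAL: high-freq word, low-freq first syllable",
    "REAL: low-freq word, high-freq first syllable",
    "REAL: low-freq word, low-freq first syllable",
    "PSEUDO: high-freq first syllable",
    "PSEUDO: low-freq first syllable",
    "NOVEL: first syllable is an accidental gap",
]

WORDS_PER_CELL = 4            # four different words per combination
FILLER_COMBINATIONS = 10      # tonal combinations that do not undergo sandhi
FILLERS_PER_COMBINATION = 16


def count_test_words() -> int:
    """Number of test words: sandhi types x word subtypes x words per cell."""
    return len(SANDHI_TYPES) * len(WORD_SUBTYPES) * WORDS_PER_CELL


def count_fillers() -> int:
    """Number of filler words across the non-sandhi tonal combinations."""
    return FILLER_COMBINATIONS * FILLERS_PER_COMBINATION


if __name__ == "__main__":
    print(count_test_words(), count_fillers(), count_test_words() + count_fillers())
```

The test words and fillers together give the total number of experimental stimuli per speaker.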
The 328 experimental stimuli were recorded in their monosyllabic citation form
by a 23-year-old male native speaker of Tianjin in an anechoic chamber at the
University of Kansas. Each monosyllable was read without sentential context twice,
and the token deemed clearer by the two authors was used in the experiment. The
experiment was implemented in Paradigm® (Perception Research Systems). The
stimuli were evenly divided into two blocks. Block A included all stimuli with the
tonal combinations T1+T1, T3+T2, and T3+T3 as well as fillers with the tonal
combinations T1+T2, T1+T3, T1+T4, T2+T1, and T2+T2. Block B included all
T3+T4, T4+T1, and T4+T4 stimuli and T2+T3, T2+T4, T3+T1, T4+T2, and T4
+T3 fillers. Half of the subjects took block A first, and the other half took block B
first. There was a 5-min break between the blocks. Within each block, the stimuli
were randomized by Paradigm® for each speaker. Each stimulus consisted of two
monosyllables separated by an 800 ms interval. The stimuli were played through a
pair of headphones to the subjects. For each stimulus, the subjects were asked to put
the two syllables together and pronounce them as a real disyllabic word in Tianjin as
naturally as possible. Before the experiment began, there was an introduction in
Tianjin that the subjects heard through the headphones and simultaneously read on a
computer screen in front of them. The introduction explained their task both in prose
and through examples. There was then a practice session of 9 words that did not
appear in the real experiment (three of each of REAL, PSEUDO, and NOVEL words).
The instruction and practice items were recorded by the same male speaker whose
voice was used in the experiment. The experiment began after a verbal confirmation
from the subjects that they were ready. The entire experiment took around 45 min.
Fig. 1 Stimulus design for (a) REAL words and (b) PSEUDO words. (a) REAL: high- or low-frequency disyllable, each with a high- or low-frequency first syllable. (b) PSEUDO: high- or low-frequency first syllable.
Fifty native speakers of Tianjin participated in the experiment. Two of them were
recorded in an anechoic chamber in the Phonetics and Psycholinguistics Laboratory
of the University of Kansas using a Marantz solid state recorder PMD 671 sampling
at 22.05 kHz and an Electro-Voice RE-20 microphone. The other 48 were recorded
in a quiet room in the Phonetics Laboratory of the Department of Chinese Language
and Literature at Nankai University in Tianjin using the same model of solid-state
recorder and an EV N/D 767a microphone. These speakers all self-reported to be
native Tianjin speakers but were all bilingual in Tianjin and Standard Chinese. We
made it clear to them that we were interested in the Tianjin dialect, and the native-Tianjin
instruction and practice should also orient them to the Tianjin context. The
speakers’ recordings were judged to be native-Tianjin-like by a native Tianjin
consultant in the US and a trained Tianjin linguist in Tianjin. The data from two of
the speakers in Tianjin could not be used: one speaker was from a suburb of Tianjin
and spoke a different native dialect; the other’s data were lost due to a software
malfunction. For the 48 speakers whose data we did use, all were from the six inner-
city districts of Tianjin and used both Tianjin and Standard Chinese in their daily
lives; 14 were male, 34 were female; they had an average age of 23.4 at the time of
the experiment.
2.2.2 Data analysis
All acoustic analyses of the data were conducted in Praat (Boersma and Weenink
2009). For the first syllable in all test words, we took an f0 measurement every 10 %
of the rhyme duration using Yi Xu’s TimeNormalizedF0 Praat script (Xu 2005),
giving eleven f0 measurements for each syllable. The Maxf0 and Minf0 parameters
in the script as well as the octave-jump cost were adjusted for each speaker, and the
f0 measurements were hand-checked against narrow-band spectrograms in Praat.
There were two situations in which a token was not used in further analysis: first, if
neither the TimeNormalizedF0 script nor the narrow-band spectrogram could
produce reliable pitch measurements for it; second, if its second syllable was
pronounced as a stressless syllable, as judged by both authors, who are native
speakers of Standard Chinese.⁴ The reason the latter cases were excluded was that
stressless syllables in Tianjin have a reduced tonal inventory, and words with
stressless syllables have a different set of tone sandhi behaviors as shown in Jiang
(1994) and Wang (2002). Of the 8064 tokens recorded (168 test words × 48
speakers), 932 were excluded for these two reasons, an attrition rate of 11.56 %.
4 Although neither author is a native speaker of Tianjin, we believe that our judgment was accurate as
stressless syllables in Tianjin have significantly reduced duration (Jiang 1994), similar to Standard
Chinese.
The f0 measurements in Hz were converted to Semi-tone using the formula in
(4a) to better reflect pitch perception (Rietveld and Chen 2006). The Semi-tone
values were then z-score transformed using the formula in (4b) over all
measurements from a given speaker in order to normalize for between-speaker
variation, especially male and female differences (Rose 1987; Zhu 2004). Then for
each speaker, the f0 values of the four words within each word-(sub)type/sandhi-
type combination were averaged, and the averaged data were submitted for
statistical analyses.
(4) a. $ST = 39.87 \times \log_{10}(\mathrm{Hz}/50)$
    b. $zST_x = \dfrac{ST_x - \frac{1}{n}\sum_{i=1}^{n} ST_i}{\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(ST_i - \frac{1}{n}\sum_{j=1}^{n} ST_j\right)^{2}}}$
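The two normalization steps in (4) can be sketched as follows. The function names and the f0 values are illustrative assumptions; the constants 39.87 and 50 Hz and the use of the sample standard deviation (n − 1 denominator) follow (4a) and (4b).

```python
import math
import statistics


def hz_to_semitone(f0_hz, reference_hz=50.0):
    """Convert f0 in Hz to semitones relative to 50 Hz, as in (4a)."""
    return 39.87 * math.log10(f0_hz / reference_hz)


def z_score_by_speaker(semitones):
    """z-score all semitone values of one speaker, as in (4b)."""
    mean = statistics.mean(semitones)
    sd = statistics.stdev(semitones)  # sample SD, n - 1 in the denominator
    return [(st - mean) / sd for st in semitones]


# Made-up f0 measurements (Hz) standing in for one speaker's data.
f0_values_hz = [100.0, 120.0, 150.0, 180.0, 200.0]
semitones = [hz_to_semitone(f0) for f0 in f0_values_hz]
z_scores = z_score_by_speaker(semitones)
print([round(z, 3) for z in z_scores])
```

Because the z-scores are computed over all measurements from a given speaker, each speaker's values end up with a mean of 0 and a standard deviation of 1, which is what removes the overall register difference between male and female speakers.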
3 Results and discussions
3.1 Word-type results
We first report the productivity differences among the three word types—REAL,
PSEUDO, and NOVEL—for the different sandhis. For REAL and PSEUDO words, we
further averaged the pitch values of the first syllable from different lexical
frequencies within each word type. For each sandhi, three two-way Repeated-
Measures ANOVAs were conducted, with Word-Type [two levels each: (1) REAL vs.
PSEUDO; (2) PSEUDO vs. NOVEL; (3) REAL vs. NOVEL] and Data-Point (11 levels) as
independent variables. A significant main effect on Word-Type indicates that the
pitches from the two word types under comparison have different means; a
significant main effect on Data-Point indicates that the pitch value is time-sensitive;
and a significant interaction between Word-Type and Data-Point indicates that the
pitches from the two word types have different slopes. Huynh–Feldt adjusted values
were used to correct for sphericity violations.
The average pitches for the different word types for each sandhi are plotted in
Fig. 2, and the ANOVA results are summarized in Appendix 2 (see Supplementary
material). Let us first note that regardless of the word type, the general sandhi
patterns agree with the acoustic findings in Zhang and Liu (2011). Most notably,
compared to traditional descriptions, for the T1+T1 sandhi, the sandhi tone is
higher than expected and closer to a base T2 (Fig. 2a); for the T3+T3 sandhi, the
sandhi tone is lower than expected and does not neutralize with T2 (Fig. 2b); and for
the T4+T4 sandhi, the sandhi tone has the same falling shape as the base tone,
indicating that the sandhi has indeed become obsolete (Fig. 2d).
But crucially, there are differences in how the sandhis applied to the three types
of words as indicated by the often significant differences in pitch means and pitch
slopes between the word types under comparison. Specifically, we can categorize
the six sandhis into three types depending on whether it is the NOVEL words or the
REAL words or neither that share more phonetic properties with the base tone of the
first syllable of the stimuli. The properties under comparison are the pitch mean and
pitch slope of the tones. In L+L → LH+L (Fig. 2a), LH+LH → H+LH (Fig. 2b),
HL+HL → L+HL (Fig. 2d), and LH+H → L+H (Fig. 2e), the sandhi tone for
NOVEL words shares more phonetic properties with the base tone compared to REAL
Fig. 2 Average pitch contours of the first syllable in REAL, PSEUDO, and NOVEL disyllabic words for the six different sandhis (a–f). Significant comparisons in pitch means and pitch slopes are noted in the graphs. *, **, and *** indicate significant differences at the p < 0.05, p < 0.01, and p < 0.001 levels, respectively. For detailed ANOVA results, see Appendix 2 in Supplementary material. a L+L → LH+L (T1+T1 → T3+T1), b LH+LH → H+LH (T3+T3 → T2+T3), c HL+L → H+L (T4+T1 → T2+T1), d HL+HL → L+HL (T4+T4 → T1+T4), e LH+H → L+H (T3+T2 → T1+T2), f LH+HL → L+HL (T3+T4 → T1+T4)
words and PSEUDO words. For instance, in L+L → LH+L (Fig. 2a), the sandhi tone
for NOVEL words is lower in pitch than that for REAL words. Given that the base tone
(L) has a lower pitch than the expected sandhi tone (LH), this falls under the
category in which the NOVEL words share more phonetic properties with the base
tone. On the other hand, in HL+L → H+L (Fig. 2c), the sandhi tone in NOVEL
words has more properties of the expected sandhi tone H by having an overall
higher pitch than the sandhi tones in REAL and PSEUDO words. Finally, for LH
+HL → L+HL (Fig. 2f), there is no difference in pitch mean or pitch slope among
the sandhi tones for the three types of words.
Our interpretation of these results is as follows. The first type of sandhi is
underlearned by the speakers as the sandhi applies less productively in NOVEL words
than REAL words, as indicated by the greater phonetic similarity between the sandhi
tone and the base tone in NOVEL words. The second type of sandhi—the sandhi with
exceptions HL+L → H+L (Fig. 2c)—has been generalized and thus applies with a
greater regularity in NOVEL words. We consider this as an instance of overlearning. The last type of sandhi, LH+HL → L+HL (Fig. 2f), is properly learned from the
lexicon and applies in the same fashion to PSEUDO and NOVEL words as in REAL
words. These results and their interpretations are summarized in Table 1.
A note of caution is in order for the interpretation of the gradient differences
among different word types. The average pitches in Fig. 2 represent all usable
tokens in the recorded data regardless of whether the token undergoes the sandhi per
the rules in (2). This is because whether a tonal combination has undergone the
sandhi categorically, incompletely, or has not undergone the sandhi at all is often
difficult to determine except for a handful of cases. For L+L → LH+L and LH
+LH → H+LH, all usable tokens have undergone sandhi regardless of word type;
therefore, the gradient differences among the different word types were due to
incomplete application of the sandhis to at least some of the wug tokens. For HL
+L → H+L and HL+HL → L+HL, however, there are tokens in which the sandhi
clearly did not apply—a handful for the former and the vast majority for the latter;
the gradient differences seen in Fig. 2 are thus likely due to both categorical and
gradient differences in the application of the sandhis. For the two half-third sandhis
LH+H → L+H and LH+HL → L+HL, whether the sandhi has applied to a token
was particularly difficult to decide, and we surmise that the gradient differences
Table 1 A summary of the word-type results for the six tone sandhi patterns

Sandhi pattern   | Acoustic results for sandhi tone        | Learning classification
L → LH / __ L    | Lower pitch in NOVEL than REAL words    | Underlearning
LH → H / __ LH   | Lower pitch in NOVEL than REAL words    | Underlearning
HL → H / __ L    | Higher pitch in NOVEL than REAL words   | Overlearning
HL → L / __ HL   | Higher pitch in NOVEL than REAL words   | Underlearning
LH → L / __ H    | Higher pitch in NOVEL than REAL words   | Underlearning
LH → L / __ HL   | No pitch difference among word types    | Proper learning
observed for the former are primarily caused by different degrees of gradient
application of the sandhi.
Our results are in agreement with our hypotheses. We have shown that a tone
sandhi pattern may be underlearned despite its full productivity in the lexicon, and
the underlearning may be gradiently realized as the incomplete application of the
sandhi. As hypothesized, the set of sandhis that shows underlearning includes not
only the regular and the obsolete sandhis, L+L → LH+L, LH+LH → H+LH, and
HL+HL → L+HL, but also the durationally based LH+H → L+H (reduction of
contour due to insufficient duration). The other durationally-based sandhi LH
+HL → L+HL, however, shows proper learning as expected. It is possible that in
order to approach proper learning, the pattern needs the help of both phonetics and
high lexical frequency: the trigger of the properly learned half-third sandhi, HL
(Tone 4), has considerably higher type and token frequencies than the trigger of the
underlearned half-third sandhi, H (Tone 2). Zhang and Lai (2010) found the same
underlearning and proper learning patterns for the half-third sandhis before Tone 2
and Tone 4 in Standard Chinese as well. For the underlearning of the obsolete
sandhi HL+HL → L+HL, our interpretation is that the real words that still undergo
the sandhi are listed in the lexicon, but the sandhi itself has become unproductive,
manifested in the results as underlearning. And for HL+L → H+L, the exceptions
to the sandhi are listed in the lexicon, but the sandhi itself is productive, manifested
in the results as overlearning.
3.2 Lexical frequency results
The effects of lexical frequency on sandhi productivity are reported on three separate
graphs for each tone sandhi, two for REAL words and one for PSEUDO words as shown
in Fig. 3. The two comparisons for the REAL words are based on the token frequency of
the disyllable (high vs. low) and the token frequency of the first syllable (high vs.
low), and the comparison for the PSEUDO words is based on the token frequency of the
first syllable. Each graph represents the average pitch contours of the first syllable of
the two word types under comparison. A two-way Repeated-Measures ANOVA was
conducted for each comparison, with Frequency (two levels) and Data-Point (11
levels) as independent variables. A significant main effect on Frequency indicates that
the pitches from the two frequency profiles have different means, and a significant
interaction between Frequency and Data-Point indicates that the pitches from the
different frequencies have different slopes. Huynh–Feldt adjusted values were again
used. Significant comparisons are indicated in Fig. 3. Detailed ANOVA results are
given in Appendix 3 (see Supplementary material).
The frequency results in Fig. 3 show that higher token frequency generally leads
to higher productivity. For both L+L → LH+L (Fig. 3a) and LH+LH → H+LH
(Fig. 3b) for which the sandhi raises the base tone, the σ1 comparison for the PSEUDO
words showed that higher token frequency σ1 leads to higher pitch. This indicates
that higher frequency for a syllable likely leads to a stronger allomorph listing for its
sandhi tone. In turn, this supports the hypothesis that the gradient underlearning of
sandhis exhibited in wug words is an exaggerated frequency effect. The higher
Fig. 3 Effects of lexical frequency on the productivity of different sandhis. For each sandhi, three comparisons are shown: REAL-Word-High vs. REAL-Word-Low; REAL-Syll1-High vs. REAL-Syll1-Low; PSEUDO-Syll1-High vs. PSEUDO-Syll1-Low. All graphs show the pitch contours of the first syllable of the two word types under comparison. In the graphs, *, **, and *** indicate significant differences at the p < 0.05, p < 0.01, and p < 0.001 levels, respectively. For detailed ANOVA results, see Appendix 3 in Supplementary material. a L+L → LH+L (T1+T1 → T3+T1), b LH+LH → H+LH (T3+T3 → T2+T3), c HL+L → H+L (T4+T1 → T2+T1), d HL+HL → L+HL (T4+T4 → T1+T4), e LH+H → L+H (T3+T2 → T1+T2), f LH+HL → L+HL (T3+T4 → T1+T4)
sandhi productivity for high-frequency words is also attested in the two REAL
comparisons of the obsolete sandhi HL+HL → L+HL (Fig. 3d): the sandhi tones
for words with higher frequency are lower than those for words with low frequency,
which, for a sandhi that lowers the base tone, indicates higher productivity for the
high-frequency words. This is likely due to the fact that high-frequency words are
more conservative in maintaining exceptional behavior, which in this case is to
maintain the sandhi. Interestingly, for the sandhi with exceptions, HL+L → H+L
(Fig. 3c), in the REAL-Word-High vs. REAL-Word-Low comparison, higher
frequency in fact leads to lower productivity as evidenced by the lower pitch of
the sandhi tone for the high-frequency words. This reversal of the pattern may also
be caused by the conservative nature of high-frequency words in maintaining
exceptional patterns; but in this case, the exceptional pattern is the failure to apply
this sandhi.
We also found a difference between the two half-third sandhis in the frequency
effect. For LH+H → L+H, two of the frequency comparisons (REAL-Word-High
vs. REAL-Word-Low; PSEUDO-σ1-High vs. PSEUDO-σ1-Low) showed a significant
difference in pitch slope, yet no significant difference was obtained for any
frequency comparison for LH+HL → L+HL. In particular, for the two
comparisons for LH+H → L+H that showed a significant difference, the high-
frequency words both had a more pronounced pitch fall at the beginning. Given that
the half-third sandhi in Tianjin primarily renders the first syllable a falling tone as
shown in Zhang and Liu (2011), this seems to indicate a productivity advantage for
the high-frequency words. This result, again, may be due to the overall higher lexical
frequency of HL (Tone 4) than H (Tone 2), further encouraging proper learning of
the sandhi involving the former.
We are unable to interpret two of the frequency patterns observed in PSEUDO words: the
high productivity of HL+L → H+L when σ1 has a high frequency and the low
productivity of HL+HL → L+HL when σ1 has a high frequency. It is interesting to
note that these anomalies occur in the two sandhis with exceptional behaviors and
that they are the mirror images of the patterns observed for REAL words for these
sandhis. But we have yet to understand the significance of these observations.
4 A learning model
Our experimental results showed that Tianjin speakers’ knowledge of tone sandhi is
a combination of proper learning, underlearning, and overlearning from the lexicon:
on the one hand, lexical statistics do inform learning as evidenced by the frequency
effects found in our experiment; on the other hand, productive patterns in the
lexicon can be underlearned, especially when the patterns do not have strong
phonetic bases, and the underlearning can be gradiently manifested in the phonetic
realization of the sandhi tones, yet patterns with exceptions can also be overlearned
and generalized to novel words. It is therefore imperative to have a learning model
that is able to make these predictions.
4.1 The maximum entropy (MaxEnt) model
To this end, we designed a substantively-biased learning model based on the
Maximum Entropy (MaxEnt) grammar. In MaxEnt, each constraint is associated
with a weight, and for each input, the probability of a particular candidate
surfacing as the output is determined by how well this candidate satisfies the
constraint weight hierarchy when compared with all other candidates. Learning in
a MaxEnt grammar is to determine the constraint weights that maximize the log
probability of the learning data, and for each constraint, the learner can impose a
Gaussian prior, with a mean of μ and a variance of σ², over its weight to prevent
overfitting the data. The μ represents the default weight for the constraint, and σ²
determines the severity of the penalty when the weight of the constraint deviates
from μ: the smaller the σ², the greater the penalty. Crucially, learning biases
can be encoded as different σ² values for different constraints. For more details on
MaxEnt grammars and learning biases as Gaussian priors, see Goldwater and
Johnson (2003), Wilson (2006), Jager (2007), and Hayes and Wilson (2008),
among others.
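As a minimal sketch of the quantities just described, the code below computes candidate probabilities from weighted violations and the objective that a MaxEnt learner maximizes: the log probability of the data minus a Gaussian penalty of (w − μ)²/(2σ²) per constraint, following the standard formulation in the works cited above. It is not the software used in the study; the function names and the toy violations, weights, and counts are illustrative assumptions.

```python
import math


def maxent_probabilities(violations, weights):
    """violations: one row of violation counts per candidate."""
    harmonies = [sum(w * v for w, v in zip(weights, row)) for row in violations]
    unnormalized = [math.exp(-h) for h in harmonies]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def penalized_log_likelihood(data, weights, mus, variances):
    """data: list of (violations, observed_counts) pairs, one per input."""
    log_likelihood = 0.0
    for violations, counts in data:
        probs = maxent_probabilities(violations, weights)
        log_likelihood += sum(c * math.log(p) for c, p in zip(counts, probs))
    # Gaussian prior: a smaller variance penalizes deviation from mu more.
    penalty = sum((w - mu) ** 2 / (2 * var)
                  for w, mu, var in zip(weights, mus, variances))
    return log_likelihood - penalty


# Toy input with two candidates and two constraints (all numbers made up).
toy_violations = [[1, 0], [0, 1]]
toy_weights = [2.0, 0.0]
toy_probs = maxent_probabilities(toy_violations, toy_weights)
toy_data = [(toy_violations, [1, 9])]
tight = penalized_log_likelihood(toy_data, toy_weights, [0.0, 0.0], [0.001, 0.001])
loose = penalized_log_likelihood(toy_data, toy_weights, [0.0, 0.0], [10.0, 10.0])
print(toy_probs, tight < loose)
```

With the same weights, the objective is much lower under the small variance than under the large one, which illustrates why a small σ² makes a constraint's weight hard to move away from μ.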
4.2 Constraints
We also base our analysis on the dual listing/generation model of Zuraw (2000,
2010). This model assumes that existing forms are lexically listed and are protected
by highly-ranked faithfulness constraints, but lower and stochastically-ranked
constraints can encode both patterns of lexical statistics and phonetically-based
generalizations. One crucial type of constraint in our model is USELISTED, inspired
by Zuraw (2000). Two types of USELISTED constraints are proposed. First, given that
the speakers performed the sandhis better in real words than in wug words, we posit
that the disyllabic words are listed in the lexicon with their sandhi tones, and there
are USELISTED constraints on disyllables that force the listed disyllables to be used as
in (5a). Second, since our results also showed that the speakers performed the
sandhis better when the first syllable is an existing Tianjin syllable than when it is an
accidental gap, this indicates that sandhi allomorphs of existing syllables are also
listed, and we posit a second group of USELISTED constraints that forces the listed
syllable allomorphs to be used in non-final sandhi positions as in (5b). Note that the
term “allomorph” in (5b) is used in a more abstract sense than the morpheme-
specific traditional definition of the term as it refers to syllables that can cue
multiple homophonous morphemes. For example, [panHL] is an existing syllable in
Tianjin and can represent morphemes meaning “half,” “partner,” “to mix,” “to act
as,” “to deal with,” and “to trip.” Therefore, this syllable has a listed allomorph
[panH] to be used before an L-toned syllable, and USELISTED(panHL/_L) requires the
use of this allomorph in the appropriate context regardless of which morpheme this
syllable represents.
(5) USELISTED constraints:
a. USELISTED(σL–σL): Use the listed /σLH–σL/ for /σL/+/σL/.
Mutatis mutandis for USELISTED(σLH–σLH), USELISTED(σHL–σL), USELISTED(σHL–σHL), USELISTED(σLH–σH), and USELISTED(σLH–σHL).
b. USELISTED(σL/_L): Use the listed allomorph /σLH/ for /σL/ before an /L/-toned syllable.
Mutatis mutandis for USELISTED(σLH/_LH), USELISTED(σHL/_L), USELISTED(σHL/_HL), USELISTED(σLH/_H), and USELISTED(σLH/_HL).
In our implementation of the model, the USELISTED constraints in (5a) are word-
specific, and the ones in (5b) are syllable-specific. In other words, there are as many
(5a)-type USELISTED constraints as words in Tianjin, and there are as many (5b)-type
USELISTED constraints as syllable types. This is in the same spirit as the lexically
indexed constraints à la Coetzee (2009), Becker et al. (2011), and Coetzee and
Kawahara (2013), and it can be seen as a possible way in which lexical entries
interact with the rest of the phonological grammar: the strength of the lexical entry
is now represented as the weight of its USELISTED constraint.
The USELISTED constraints employed here are different from USELISTED in Zuraw
(2000) in the following respects. First, Zuraw employs USELISTED only for
morphologically complex forms, not for allomorphs. Second, Zuraw assumes that
each candidate is an input–output pairing, and her USELISTED constraint is defined as
“The input portion of a candidate must be a single lexical entry” (p. 50). We have
made a different assumption: the candidate that is identical to the listed form is
necessarily derived from the listed form. Third, Zuraw uses only one USELISTED
constraint and encodes the strength of a lexical entry by a listedness value from 0 to
1 that is determined by the entry’s lexical frequency. The listedness value reflects
the availability of the lexical entry in the derivation of the output. We, on the other
hand, have a proliferation of USELISTED constraints whose weights reflect the
strengths of lexical entries and syllable allomorph listings as determined by their
frequencies. The assumption, then, is that a lexical entry for a word or an abstract
allomorph for a syllable with the appropriate tone sandhi is built together with a
USELISTED constraint whenever a sandhied form is encountered and accepted by
the speaker, and the weight of the USELISTED constraint gradually increases as the
form is further encountered.5
5 A reviewer asked what restrictions USELISTED constraints would have and whether they only refer to
tones of a given language. These are interesting and difficult questions. Provided that (a) a lexical
phonological pattern is not entirely productive, and (b) there are productivity differences among different
types of phonological patterns, indicating that the lack of full productivity is not just a task effect, it is
necessary to encode the effect of lexicality for this pattern in the grammar. Therefore, USELISTED
constraints would be applicable to any type of phonological pattern, not just tonal ones. It is possible to
conceive of USELISTED constraints simply as IO-faithfulness constraints, which would require the output
to be identical to the listed form. This is essentially how we have used these constraints here. It is then
less of a surprise that these constraints are applicable to other phonological features. In a published update
of Zuraw (2000), Zuraw (2010) in fact rephrased the USELISTED constraints in similar terms and
distinguished the correspondence between the output and the listed form and the correspondence between
the output and the “underlying” form by shifting the burden of the latter to Output–Output-
correspondence. We have simply maintained the distinction between USELISTED and IO-faithfulness
here. The proliferation of the USELISTED constraints is necessary for the analysis of lexical frequency
Markedness constraints that militate against certain tonal combinations and
hence motivate tone sandhi6 and faithfulness constraints that protect underlying
tones, as defined in (6) and (7), are also included in our model.
(6) Markedness constraints:
a. *L–L b. *LH–LH c. *HL–L
d. *HL–HL e. *LH–H f. *LH–HL
(7) Faithfulness constraints:7
a. PRESERVE(L) b. PRESERVE(H) c. PRESERVE(LH)
d. PRESERVE(HL/_L) e. PRESERVE(HL/_HL)
In order to capture the gradience observed in sandhi application, we define
these constraints to be gradient in that candidates may incur different degrees of
violation of the constraints encoded as different numbers of violation marks. We
assume that the number of violations for each constraint ranges from 0 to 4, with
0 indicating that the output tone completely satisfies the requirement set forth by
the constraint, and 4 indicating that the output tone maximally deviates from the
requirement. The 0–4 scale is admittedly ad hoc, but it represents a reasonable
trade-off between the contrastive tone differences in Tianjin and the potential
gradient steps between contrastive tones given the production and perception of
tones. As an illustration, Table 2 shows the evaluations of five candidates for a
real word with /L/+/L/ base tones and a listed /LH–L/ form against USELISTED
(σL–σL), USELISTED(σL/_σL), PRESERVE(L), and *L–L. The five candidates [L–L],
[LL↑–L], [LM–L], [LH↓–L], and [LH–L] are phonetically evenly spaced between
[L–L] and [LH–L]. The closer a candidate is to /L–L/, the more violations it
incurs for USELISTED(σL–σL), USELISTED(σL/_L), and *L–L, but the fewer
violations it incurs for PRESERVE(L).
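The gradient violation scheme can be sketched as a small function that maps a candidate's position between the base tone and the listed sandhi tone onto violation counts. This is only an illustration of the 0–4 scale for a real word with /L/+/L/ base tones; the names `violation_row` and `tableau` are assumptions, not part of the original implementation.

```python
# Candidates ordered from the base-tone form to the listed sandhi form.
CANDIDATES = ["L–L", "LL↑–L", "LM–L", "LH↓–L", "LH–L"]
CONSTRAINTS = ["USELISTED(σL–σL)", "USELISTED(σL/_L)", "PRESERVE(L)", "*L–L"]
MAX_VIOLATIONS = 4


def violation_row(step):
    """Violations for the candidate `step` steps away from the base tone."""
    away_from_listed = MAX_VIOLATIONS - step  # distance from the listed sandhi form
    away_from_base = step                     # distance from the underlying tone
    # Order follows CONSTRAINTS: both USELISTED constraints and the markedness
    # constraint favor the sandhi form; PRESERVE(L) favors the base tone.
    return [away_from_listed, away_from_listed, away_from_base, away_from_listed]


tableau = [violation_row(step) for step in range(len(CANDIDATES))]
```

Each row sums the same pressure in two directions: every step toward the sandhi form removes one violation from the three constraints that favor sandhi and adds one to PRESERVE(L).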
Footnote 5 continued
effects on productivity as well as lexical variation, and Coetzee (2009), Becker et al. (2011), and Coetzee
and Kawahara (2013), among others, have used a similar strategy.
6 The markedness constraints should be taken as phonotactic generalizations that speakers make when
tonal alternations are encountered. This is different from the canonical OT assumption that all constraints
are in UG (Prince and Smolensky 1993). For modeling the learning of phonotactic constraints, see Hayes
and Wilson (2008).
7 The reason we use PRESERVE instead of IDENT in our faithfulness constraints is that in its formal
definition, IDENT(F) requires [F] to be a distinctive feature; the featural representation of tone, however, is
controversial in both the number of tone levels and whether there are contour tone features (see Zhang
2010 for a review of the issue). We have therefore chosen to use the theory-neutral PRESERVE to avoid this
controversy.
4.3 Learning biases as σ² values
We set the default weight μ to be 0 and the default σ² to be 10⁻³ for all constraints.
But we also encode two learning biases by adjusting the σ² values of the USELISTED
and the markedness constraints in the following ways.
First, the σ² value of each USELISTED constraint is multiplied by a coefficient
BListed that is smaller than 1 and thus biases against promoting the weight of the
constraint. For each USELISTED constraint, we posit BListed to be 10 to the negative
power of a logistic function, in which x represents the number of morphemes that
the USELISTED constraint covers as in (8). The x value for the USELISTED
constraints for disyllabic words is naturally 1. For the USELISTED constraints for
syllable-level allomorphs, the x value equals the number of homophones that the
syllable represents. As estimated from Da’s (2004) corpus, the average numbers
of homophones for a syllable in each of the tones in Mandarin are summarized in
Table 3. We will use these numbers as approximations for the x values for the
syllable-level USELISTED constraints in our learning simulation. The BListed values
according to these numbers are summarized in Table 3 as well. The intuition
behind this bias coefficient is that learners use lexical information in concomi-
tance with grammatical resources such as the MARKEDNESS » FAITHFULNESS ranking
to make phonological generalizations, but they do so cautiously, expressed in the
model by assigning USELISTED constraints greater penalties if they deviate from
the default weight of 0, so that the weights of these constraints are harder to
promote; moreover, learners are unwilling to treat large amounts of data as listed
behavior, expressed in the model as greater penalties for syllable-level USELISTED
constraints, so that these constraints are even harder to promote along the weight
scale.
(8) $B_{Listed} = 10^{-\frac{1}{1 + e^{1 - 0.25x}}}$
(x = the number of morphemes that the USELISTED constraint covers.)
Second, we encode a learning bias in favor of promoting the weights of
USELISTED constraints that regulate base-sandhi mappings with a strong phonetic
basis [i.e., USELISTED(σLH/_H), USELISTED(σLH/_HL)] and the relevant markedness
constraints (i.e., *LH+H, *LH+HL) by multiplying their σ² values with a
coefficient BPhonetics = 10. The rest of the constraints are assumed to have a
Table 2 Constraint evaluations
Base: /L/+/L/, listed: /LH–L/

Candidate | USELISTED(σL–σL) | USELISTED(σL/_L) | PRESERVE(L) | *L–L
L–L       | 4                | 4                | 0           | 4
LL↑–L     | 3                | 3                | 1           | 3
LM–L      | 2                | 2                | 2           | 2
LH↓–L     | 1                | 1                | 3           | 1
LH–L      | 0                | 0                | 4           | 0
BPhonetics = 1. This coefficient expresses a substantive bias à la Wilson (2006) in
allowing phonetically motivated patterns to have an edge in learning over other
patterns (see also Zhang and Lai 2010; Zhang et al. 2009, 2011). Each USELISTED
constraint’s σ² value, then, is 10⁻³ multiplied by its BListed and BPhonetics values while
the rest of the constraints’ σ² values are 10⁻³ multiplied by their respective BPhonetics
values.
The σ² values for all constraints are summarized in Table 4.
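The way the two bias coefficients combine into a prior variance can be sketched as below. The logistic form of BListed, the default variance of 10⁻³, and the BPhonetics value of 10 come from the text; the function names and the calling convention are illustrative assumptions.

```python
import math

DEFAULT_VARIANCE = 1e-3   # default sigma^2 stated in the text
B_PHONETICS_BOOST = 10    # coefficient for phonetically based patterns


def b_listed(x):
    """Bias coefficient in (8); x = number of morphemes the constraint covers."""
    return 10 ** (-1.0 / (1.0 + math.exp(1.0 - 0.25 * x)))


def prior_variance(uselisted_x=None, phonetically_based=False):
    """sigma^2 for one constraint; pass uselisted_x only for USELISTED constraints."""
    variance = DEFAULT_VARIANCE
    if uselisted_x is not None:
        variance *= b_listed(uselisted_x)
    if phonetically_based:
        variance *= B_PHONETICS_BOOST
    return variance


print(round(b_listed(1), 4))  # 0.4777, the word-level value
print(prior_variance(uselisted_x=3.72, phonetically_based=True))
print(prior_variance(phonetically_based=True))
```

Since BListed decreases as x grows, syllable-level USELISTED constraints, which cover several homophonous morphemes, receive a smaller variance than word-level ones and are therefore harder to promote.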
4.4 Learning simulations
The goal of our learning simulation is to train the learner with a representative
sample of the Tianjin lexicon so that it will acquire a grammar that can predict our
speakers’ wug test behavior. The learning was simulated using the MaxEnt
Grammar Tool (Hayes et al. 2009a). The training dataset included 20 real words for
each of the base tone combinations L+L, LH+LH, HL+L, HL+HL, LH+H, and
LH+HL. Among the 20 words for each tonal combination, 10 were high frequency,
and 10 were low frequency. We used the average raw frequencies of the disyllabic
words in each tonal combination used in our experiment from Da’s corpus to
simulate the token frequencies of words in the training dataset as shown in Table 5.
For example, for L+L, each of the 10 high-frequency words had a token frequency
of 4615, and each of the 10 low-frequency words had a token frequency of 75.
For each word, five candidates whose initial syllables were phonetically evenly
spaced between the base tone and the sandhi tone, as in Table 2, were considered.
For L+L, LH+LH, LH+H, and LH+HL, the base tone was consistently listed as
undergoing sandhi; for HL+L, one high-frequency word and one low-frequency
word were listed not to undergo sandhi; for HL+HL, only one high-frequency word
and one low-frequency word were listed to undergo sandhi. The USELISTED
constraints were indexed to the words and the syllables.
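The composition of the training dataset can be sketched as a simple data structure. The token frequencies are those in Table 5 and the listing of exceptions follows the description above; the class and function names are illustrative assumptions, and this layout is not the input format of the MaxEnt Grammar Tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingWord:
    tones: str                # base tone combination, e.g. "L+L"
    frequency_class: str      # "high" or "low"
    token_frequency: int
    listed_with_sandhi: bool  # whether the listed form has the sandhi tone


# (high, low) token frequencies per tonal combination, from Table 5.
TOKEN_FREQUENCIES = {
    "L+L": (4615, 75), "LH+LH": (3267, 201), "HL+L": (3652, 307),
    "HL+HL": (3629, 173), "LH+H": (2851, 117), "LH+HL": (4291, 216),
}
WORDS_PER_CLASS = 10


def is_listed_with_sandhi(tones, index):
    if tones == "HL+L":
        return index != 0   # one word per frequency class listed without sandhi
    if tones == "HL+HL":
        return index == 0   # only one word per frequency class listed with sandhi
    return True             # all other combinations consistently undergo sandhi


def build_training_set():
    words = []
    for tones, (high, low) in TOKEN_FREQUENCIES.items():
        for frequency_class, frequency in (("high", high), ("low", low)):
            for index in range(WORDS_PER_CLASS):
                words.append(TrainingWord(tones, frequency_class, frequency,
                                          is_listed_with_sandhi(tones, index)))
    return words


training_set = build_training_set()
print(len(training_set))  # 120 words: 6 combinations x 20 words
```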
Each sandhi was tested separately, and the learner acquired the weights of the
constraints relevant for the sandhi. We will not list the weights for individual
constraints due to the large number of word- and syllable-specific USELISTED
constraints. But overall, the USELISTED constraints for high-frequency disyllabic
words have higher weights than those for low-frequency disyllabic words, and the
USELISTED constraints for high-frequency syllable allomorphs have higher weights
Table 3 BListed values for USELISTED constraints

Constraint             | x    | BListed
USELISTED(σ–σ)         | 1    | 0.4777
USELISTED(σL/_σL)      | 5.45 | 0.2573
USELISTED(σLH/_σLH)    | 3.72 | 0.3292
USELISTED(σHL/_σL)     | 5.76 | 0.2465
USELISTED(σHL/_σHL)    | 5.76 | 0.2465
USELISTED(σLH/_σH)     | 3.72 | 0.3292
USELISTED(σLH/_σHL)    | 3.72 | 0.3292
than those for low-frequency syllable allomorphs. Also, USELISTED for a disyllabic
word has a higher weight than USELISTED for the syllable allomorph of its first
syllable. The markedness constraints generally have high weights except for *HL–
HL, which has a weight of 0. The faithfulness constraints, on the other hand, have a
weight of 0 except for PRESERVE(HL/_HL), which has a high weight.
To test the accuracy of the learning model, we considered the learner’s
predictions for five types of words: high- and low-frequency real words (REAL-High,
REAL-Low), pseudo words in which σ1 comes from high- and low-frequency real
words (PSEUDO-High, PSEUDO-Low), and novel words with a nonce σ1 (NOVEL). For
PSEUDO words, we assumed that the only relevant type of USELISTED constraint was
the syllable-level constraints, and for NOVEL words, none of the USELISTED
constraints were relevant. For HL+L and HL+HL, we tested both words that are
listed to undergo sandhi as well as words that are listed not to. The learner made
predictions on the percentages of the five output candidates whose initial syllables
were phonetically evenly spaced between the base tone and the sandhi tone.
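How the relevance of USELISTED constraints differs by word type can be illustrated with the sketch below, for a word listed with its sandhi tone. The weights are hypothetical and chosen only for illustration, since the learned weights are not listed here; the function and variable names are likewise assumptions.

```python
import math

# Which USELISTED constraints are relevant for each word type (from the text).
ACTIVE_USELISTED = {
    "REAL": ("uselisted_word", "uselisted_syllable"),
    "PSEUDO": ("uselisted_syllable",),
    "NOVEL": (),
}


def candidate_probabilities(weights, word_type):
    """Probabilities of five candidates, from step 0 (base tone) to step 4 (sandhi tone)."""
    active = ACTIVE_USELISTED[word_type] + ("markedness", "preserve")
    scores = []
    for step in range(5):
        violations = {
            "uselisted_word": 4 - step,
            "uselisted_syllable": 4 - step,
            "markedness": 4 - step,
            "preserve": step,
        }
        harmony = sum(weights[name] * violations[name] for name in active)
        scores.append(math.exp(-harmony))
    total = sum(scores)
    return [score / total for score in scores]


# Hypothetical weights, NOT the weights learned in the simulation.
HYPOTHETICAL_WEIGHTS = {
    "uselisted_word": 1.0, "uselisted_syllable": 0.5,
    "markedness": 0.8, "preserve": 0.0,
}
full_sandhi = {
    word_type: candidate_probabilities(HYPOTHETICAL_WEIGHTS, word_type)[-1]
    for word_type in ACTIVE_USELISTED
}
print(full_sandhi)
```

Under these assumed weights, the probability of the full-sandhi candidate decreases from REAL to PSEUDO to NOVEL, because each step removes one source of pressure toward the listed sandhi form.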
Given that for all the sandhis, the base and the sandhi tones differ in pitch only at
either the left or the right edge of the tone, in reporting the learner’s predictions, we
report the average pitch of this crucial edge according to the predicted outputs for each
base tone combination. To facilitate the pitch calculation, we represented the pitch on a
1–5 numerical scale, on which 5 = H, 4 = H↓, 3 = M, 2 = L↑, 1 = L. To illustrate, take
Table 4 σ² values for all constraints

Constraint            | σ²
USELISTED(σ–σ)        | 0.0004777
USELISTED(σL/_L)      | 0.0002573
USELISTED(σLH/_LH)    | 0.0003292
USELISTED(σHL/_L)     | 0.0002465
USELISTED(σHL/_HL)    | 0.0002465
USELISTED(σLH/_H)     | 0.003292
USELISTED(σLH/_HL)    | 0.003292
*L+L                  | 0.001
*LH+LH                | 0.001
*HL+L                 | 0.001
*HL+HL                | 0.001
*LH+H                 | 0.01
*LH+HL                | 0.01
PRESERVE(L)           | 0.001
PRESERVE(H)           | 0.001
PRESERVE(LH)          | 0.001
PRESERVE(HL/_L)       | 0.001
PRESERVE(HL/_HL)      | 0.001
Table 5 Token frequencies of disyllabic words in the training dataset

Tonal combination | High frequency | Low frequency
L+L               | 4615           | 75
LH+LH             | 3267           | 201
HL+L              | 3652           | 307
HL+HL             | 3629           | 173
LH+H              | 2851           | 117
LH+HL             | 4291           | 216
the example of a high-frequency real word with the base tones /L/+/L/: if the learner
predicts the five candidates [L–L], [LL↑–L], [LM–L], [LH↓–L], and [LH–L] to have
the percentages 0.005, 0.062, 0.705, 8.021, and 91.206 %, respectively, then the
predicted average offset pitch for σ1 is 1 × 0.005 % + 2 × 0.062 % + 3 × 0.705 % +
4 × 8.021 % + 5 × 91.206 % = 4.9036. For HL+L and HL+HL, the average pitch
Fig. 4 The learner’s predictions for the behavior of different sandhis for five word types: REAL-High,REAL-Low, PSEUDO-High, PSEUDO-Low, and NOVEL. Bars in the graphs represent the average pitchesamong the predicted outputs of the edge of the tone where the base and the sandhi tones differ. The pitchis represented in a 1–5 numerical scale: 5 = High and 1 = Low. a L+L → LH+L (T1+T1 → T3+T1),b LH+LH→ H+LH (T3+T3→ T2+T3), c HL+L → H+L (T4+T1→ T2+T1), d HL+HL→ L+HL(T4+T4 → T1+T4), e LH+H → L+H (T3+T2 → T1+T2) f LH+HL → L+HL (T3+T4 → T1+T4)
Tone sandhi productivity in Tianjin Chinese 25
123
Author's personal copy
was derived by proportionally combining the predictions for forms with listed sandhi
and the predictions for forms with listed no-sandhi (9:1 for HL+L, 1:9 for HL+HL).
The learner’s predictions are summarized in Figure 4.
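
The averaging of predicted candidates described above for the /L/+/L/ example, and the proportional combination used for HL+L and HL+HL, can be checked with a few lines of code. The following is a minimal sketch, not the procedure actually used for the simulations: the function names `average_edge_pitch` and `combine_listed` are hypothetical, and the two averages passed to `combine_listed` are made-up illustrative values rather than model outputs. Only the pitch scale, the five percentages for /L/+/L/, and the 9:1 ratio come from the text.

```python
# Pitch scale from the text: 5 = H, 4 = H↓, 3 = M, 2 = L↑, 1 = L.
PITCH = {"L": 1, "L↑": 2, "M": 3, "H↓": 4, "H": 5}


def average_edge_pitch(predicted_percent):
    """Weighted mean pitch of the crucial tone edge.

    predicted_percent maps the pitch label of that edge to the predicted
    percentage (0-100) of the candidate ending (or starting) on that pitch.
    """
    return sum(PITCH[label] * pct / 100.0
               for label, pct in predicted_percent.items())


def combine_listed(avg_sandhi, avg_no_sandhi, ratio_sandhi, ratio_no_sandhi):
    """Proportionally combine predictions for forms listed with sandhi
    and forms listed without sandhi (e.g., 9:1 for HL+L)."""
    total = ratio_sandhi + ratio_no_sandhi
    return (ratio_sandhi * avg_sandhi + ratio_no_sandhi * avg_no_sandhi) / total


# Worked example from the text: sigma-1 offset pitch of the candidates
# [L-L], [LL↑-L], [LM-L], [LH↓-L], [LH-L] for a high-frequency /L/+/L/ word.
ll_high = {"L": 0.005, "L↑": 0.062, "M": 0.705, "H↓": 8.021, "H": 91.206}
avg = average_edge_pitch(ll_high)
print(round(avg, 4))  # 4.9036, matching the value reported in the text

# Hypothetical averages (NOT model outputs) combined at the 9:1 ratio.
combined = combine_listed(5.0, 1.0, 9, 1)
print(combined)
```

Because the percentages are divided by 100 before weighting, the result stays on the same 1–5 scale as the individual pitch values.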
For the sandhi L+L → LH+L, given that the sandhi changes L to LH, the higher
the right edge of the output tone is, the more productively the sandhi has applied.
Our predictions in Fig. 4a are that, first, there is a general gradation of sandhi
productivity from REAL to PSEUDO to NOVEL, and second, the sandhi applies more
productively in high-frequency than low-frequency words. These predictions were
borne out in our experimental results. For LH+LH → H+LH (Fig. 4b), the pattern
is similar.
For the two half-third sandhi patterns LH+H → L+H (Fig. 4e) and LH+
HL → L+HL (Fig. 4f), our model predicts that the magnitude of the differences
among the different word types is smaller due to the Bphonetics coefficient that
allowed the weights of relevant USELISTED and markedness constraints to be
promoted more easily. In our experiment, the LH+H → L+H sandhi showed
underlearning in PSEUDO and NOVEL words but no clear results based on lexical
frequency; the LH+HL sandhi showed proper learning. Our model in fact predicts
slightly smaller pitch differences among different word types for the latter due to the
higher token frequencies of the LH+HL words in the learner’s input.
Regarding the sandhis with exceptional behavior, for HL+HL → L+HL we
predicted a higher productivity in real words, especially those with high frequency
as indicated by a lower σ1 onset pitch (Fig. 4d), and for HL+L → H+L we
predicted the mirror image, namely, a lower productivity in high-frequency real
words as indicated by a lower σ1 offset pitch (Fig. 4c); both agreed with the
experimental results. In our model, the nature of the predicted productivity
differences is a combination of categorical and gradient differences in the
application of the sandhis. This also echoes the experimental results.
We compared the current model with a baseline model in which the phonetic
nature of the sandhi is not encoded in the grammar; in other words, Bphonetics = 1 for
all constraints. This baseline model makes different predictions for LH+H and
LH+HL as shown in Fig. 5. The main difference is in the magnitude of the
predicted pitch difference: the baseline model predicts a considerably larger
productivity difference between different word types and words of different lexical
frequencies. Given that in our experimental results, lexical frequency had a
significant effect on productivity for only a subset of the comparisons for LH+H
(Fig. 3e), and neither word type nor lexical frequency had a significant effect on
productivity for LH+HL, the smaller effect predicted by the model with the
Bphonetics coefficient is more consistent with the experimental results.

Fig. 5 The predictions of the baseline learner in which Bphonetics = 1 for all constraints for the LH+H and LH+HL sandhis. a LH+H → L+H (T3+T2 → T1+T2), b LH+HL → L+HL (T3+T4 → T1+T4)

Another baseline model that we compared our results to was one in which there is
no bias against the promotion of USELISTED constraints; in other words, BListed = 1.
The phonetic bias is retained. The predictions of this baseline model are given in
Fig. 6. Compared to the original model, this baseline model predicts similar
patterns, but the differences predicted among word types and different frequencies
are of slightly greater magnitudes. This works to the advantage of the non-phonetic
sandhis of L+L, LH+LH, HL+L, and HL+HL as the magnitude of effects
predicted by the original model was smaller than the attested effects, but to the
disadvantage of the two phonetic sandhis LH+H and LH+HL as our experimental
results showed no consistent effect.
However, we do not consider this baseline model to be a theoretically sound
model. This is because earlier work by Zhang et al. (e.g., 2009, 2011) has shown
that the BListed coefficients are crucial to the learning of opaque tone sandhi patterns
in Taiwanese. There is thus no reason to assume that they would not be relevant for
Tianjin. The reason these coefficients are particularly important for opaque sandhis
is that these sandhis cannot be captured by the MARKEDNESS » FAITHFULNESS schema
and must be acquired through lexical and allomorph listings in our model; in order
to capture the lack of full productivity of the opaque sandhis manifested in wug
tests, the learning model must actively suppress the promotion of weights for
USELISTED constraints, especially those regarding syllable and tonal allomorphs. For
transparent sandhis like those in Tianjin, the patterns are the combined result of both
USELISTED and markedness constraints. The suppression of weights for USELISTED,
therefore, has a less dramatic effect as the markedness constraints will
compensate by acquiring greater weights. Indeed, in the baseline
simulation where BListed was set to 1, the weights for USELISTED constraints were
greater, but the weights for the markedness constraints were smaller. This trade-off
produced effects of word type and frequency similar to those of the original model.
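
One way to see why a smaller σ² holds a constraint's weight down is to look at the Gaussian prior commonly used in maximum entropy learning (e.g., Goldwater and Johnson 2003), which penalizes each weight w by (w − μ)²/(2σ²). The sketch below assumes this standard form with μ = 0; that the learner here uses exactly this form is an assumption, the function name `prior_penalty` is hypothetical, and the weight 0.05 is an arbitrary illustrative value, not one taken from the simulations. The three σ² values are from Table 4.

```python
def prior_penalty(weight, sigma_sq, mu=0.0):
    """Gaussian prior penalty (w - mu)^2 / (2 * sigma^2) for one constraint.

    Assumed standard form; mu = 0 is an assumption, not a reported setting.
    """
    return (weight - mu) ** 2 / (2.0 * sigma_sq)


# sigma^2 values copied from Table 4.
SIGMA_SQ = {
    "USELISTED(σ–σ)": 0.0004777,  # lexical listing constraint
    "*L+L": 0.001,                # markedness, no phonetic motivation
    "*LH+H": 0.01,                # markedness, phonetically motivated
}

w = 0.05  # hypothetical weight, identical for all three constraints

penalties = {name: prior_penalty(w, s2) for name, s2 in SIGMA_SQ.items()}
for name, penalty in penalties.items():
    print(f"{name}: {penalty:.3f}")
```

For the same weight, the USELISTED constraint incurs the largest penalty and the phonetically motivated markedness constraint the smallest, which is the sense in which the smaller σ² values suppress the promotion of USELISTED weights.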
Finally, we also tested a baseline model in which both Bphonetics and BListed are set
to 1. Aside from the theoretical issue of not suppressing the weights for USELISTED
constraints just mentioned, this model has the same problem as the first baseline
model in predicting a productivity difference between different word types and
words of different lexical frequencies for the two half-third sandhis as shown in
Fig. 7. The predicted patterns for the sandhis without clear phonetic motivation are
identical to those in Fig. 6a–d.
Overall, we believe that our biased model is the one that is both theoretically sound
and makes good empirical predictions. It succeeds in predicting the simultaneous
underlearning and overlearning of the sandhi patterns: the learner can underlearn the
sandhi patterns slightly despite their full productivity in the lexicon, but it can also
overgeneralize the sandhi with exceptions to wug words; both the underlearning and
overlearning are correlated with the frequency effects in the right direction as well.
Regarding proper learning for one of the half-third sandhis, the biased model only
predicts smaller differences in productivity between real and wug words, not the
identity between the two; the predicted differences would likely be further reduced if we
took type frequency into account in our model as the HL tone that triggers the properly
learned half-third sandhi has the highest syllable-type frequency among all tones.
Fig. 6 The predictions of the baseline learner in which BListed = 1 for all USELISTED constraints for the six sandhi patterns. a L+L → LH+L (T1+T1 → T3+T1), b LH+LH → H+LH (T3+T3 → T2+T3), c HL+L → H+L (T4+T1 → T2+T1), d HL+HL → L+HL (T4+T4 → T1+T4), e LH+H → L+H (T3+T2 → T1+T2), f LH+HL → L+HL (T3+T4 → T1+T4)
Our model, however, still needs improvements in the following areas. First, the
overall magnitudes of the predicted productivity differences are currently too small
compared to our wug test results. Second, just as we were unable to interpret them, we also
fail to model the frequency patterns in PSEUDO words for the two sandhis with
exceptional behaviors. Third, although we have commented on the influence of
Beijing and Standard Chinese (SC) on Tianjin, our model has not formally taken
this influence into account and can hence only model underlearning or overlearning
effects due to Tianjin-internal factors. A more comprehensive model should be able
to make predictions on how the SC input helps shape the productivity patterns.
5 General discussion
5.1 The theoretical model
Earlier theoretical analyses of disyllabic tone sandhi in Tianjin (e.g., Wang 2002;
Lin 2008) take the four sandhi patterns in (2) as a given in terms of both their
productivity and their neutralizing nature and account for the patterns via the
interaction between various types of tonal Obligatory Contour Principle (OCP,
Leben 1973) constraints and tonal faithfulness constraints. For example, Lin (2008)
accounts for the sandhi L+L → LH+L as in (9). The tones are represented on two
levels: the tonal level (T), which is directly associated with the syllable, and the
tonemic level (t), which comprises the level components of contour tones dominated by the
tonal level. OCP and faithfulness constraints to tones can be defined on either the T
or the t level, and subscripted L or R indicates the level tone on the left or right of a
contour tone. The conjoined constraint [IDENT(t)R and IDENT(t)L]T militates against
changing both the left and right edges of the contour tone.
Fig. 7 The predictions of the baseline learner in which Bphonetics = BListed = 1 for all constraints for the LH+H and LH+HL sandhis. a LH+H → L+H (T3+T2 → T1+T2), b LH+HL → L+HL (T3+T4 → T1+T4)
We have taken a very different approach in our analysis. The advantage of our
proposal is that it is a more accurate reflection of speakers’ knowledge of Tianjin
disyllabic tone sandhi, which involves exceptions and incomplete neutralization,
and the productivity of the patterns also varies depending on the sandhi. An analysis
along the lines of (9) misses these nuanced yet important generalizations. The price
that we pay, however, is that we now have a proliferation of USELISTED constraints
that interact with the rest of the grammar, and the syllable-level USELISTED
constraints partially duplicate the function of the MARKEDNESS » FAITHFULNESS
ranking. Zhang et al. (2009, 2011) have shown that this duplication is empirically
necessary to capture the lack of full productivity of the opaque tone sandhis in
Taiwanese. What we have seen here is that even for transparent sandhis that can be
captured by the markedness and faithfulness interaction, full productivity is still not
guaranteed, and lexical listing is still necessary.
The coexistence of traditional markedness and faithfulness constraints with ad
hoc USELISTED constraints that require the surface forms to use listed allomorphs
coincides with Moreton’s (2004) argument that the grammar is composed of an
innate and “conservative” component of markedness and faithfulness constraints
and a language-specific component with constraints that require particular lexical
items to have particular surface representations in particular environments. We
share with Moreton (2004) the intuition that such constraints are necessary in the
grammar in any case to deal with processes that target specific lexical items and
morphological categories, that are suppletive, and that have lexical exceptions, but
we have taken the position one step further by positing that speakers will build
lexical constraints in any event. In other words, USELISTED can be considered as a
universal template into which learners plug the specifics of their language.
Finally, the model proposed here has affinities with exemplar-based models of
grammar (e.g., Bybee 2001, 2006; Pierrehumbert 2001, 2002; Gahl and Yu 2006) in
that it allows usage frequency effects on phonological patterning to be captured. But
the frequency effects are derived through the weights of USELISTED constraints,
which interact with other constraints in the grammar, rather than just emerging from
the lexicon. Thus, the frequency effects are predicted to interact with other
grammatical effects in ways constrained by the grammar.
5.2 Size of the acoustic effects
We have seen in the acoustic results that although some of the word-type or
frequency comparisons show significant differences in either pitch mean or pitch
slope, the differences are typically of small magnitudes. Absolute f0 differences
found in the comparisons are generally on the order of a few Hertz before
normalization. An anonymous reviewer questioned whether such small differences
can be the basis for the claim of learning differences and, hence, grammatical
differences. The point that we would like to emphasize, however, is that our main
result is in the different behaviors of different sandhi patterns, and we have
interpreted the different behaviors based on the lexical and phonetic properties of
the sandhi patterns that are known to affect phonological productivity in general.
Therefore, the pitch differences, though small, cannot be easily claimed to have
resulted from the nature of the task and swept under the rug; they need to be
accounted for in other ways. The position we have taken is that the lexical and
phonetic properties of the sandhi directly influence its production grammar. The fact
that the small acoustic differences may not be perceptible does not contradict the
fact that different sandhis are processed differently in production. This is in fact a
familiar scenario: production and perception studies of incomplete neutralization
and near merger often find small but consistent acoustic differences in
production, but speakers’ perceptual use of these subtle cues is highly context-dependent
and often unreliable (e.g., Jassem and Richter 1989; Port and Crawford 1989; Peng
2000; Warner et al. 2004; Yu 2007; Herd et al. 2010).
5.3 Aggregate vs. individual differences
As pointed out by an anonymous reviewer, it is important to recognize that our
grammatical model is based on the aggregate acoustic results from multiple speakers
of Tianjin. Therefore, the model is only a representation of the behavior of an
idealized native speaker of Tianjin. As we have commented in Sects. 1.3 and 3, there
were clearly individual differences in how the speakers behaved. This means that
each individual speaker’s grammar will deviate from the model that we have
proposed. However, we opted not to take each individual speaker’s data and construct
a grammar for him/her as idiosyncrasies of the speakers will likely be overrepresented
in these grammars while a grammar based on the aggregate results is more likely to be
representative of the Tianjin language. This is common practice for modeling
analyses of phonological patterns based on experimental results or corpus data (e.g.,
Wilson 2006; Hayes and Londe 2006; Coetzee and Pater 2008; Hayes et al. 2009b;
Becker et al. 2011; Zuraw 2010; Coetzee and Kawahara 2013).
6 Conclusions
The tone sandhi patterns of Tianjin Chinese are variable, gradient, and full of
exceptions. To understand how the speakers of Tianjin tackle phonological patterns
with such complexity, we conducted a wug test to investigate the productivity of the
sandhi patterns. Our results indicate that a Tianjin speaker’s knowledge of tone
sandhi may differ from the sandhi pattern in the lexicon in nuanced ways: sandhis
with exceptions can be generalized and overlearned while a number of fully
productive sandhis in the lexicon are underlearned, both of which illustrate the
effects of frequency and lexical listing on sandhi productivity; the phonetic nature
of a sandhi may encourage learning, bringing underlearning closer to proper
learning. These mismatches are claimed here to be informative as to the nature of
the speakers’ phonological grammars. A model of the grammar, consequently,
needs to be quantitative and flexible enough to capture the variability, gradience,
and exceptions, and the resultant overlearning and underlearning effects.
Acknowledgments We are indebted to Ping Wang, Xiaoyu Zeng, and Feng Shi at Nankai University for hosting us during data collection and discussing various aspects of this project with us. We also thank Geng Wang for serving as our Tianjin language consultant and the speakers of Tianjin who participated in our experiment. We are grateful to the participants at GLOW-Asia 8 and the second Pan-American/Iberian Meeting on Acoustics, especially James Myers, Doug Whalen, and Charles Yang, for their comments on this research. We, however, remain fully responsible for the opinions expressed here. This research was supported by the National Science Foundation grant BCS-0750773 and the University of Kansas General Research Fund 2301166.
References
Albright, Adam. 2002. Islands of reliability for regular morphology: Evidence from Italian. Language 78(4): 684–709.
Albright, Adam, and Bruce Hayes. 2003. Rules vs. analogy in English past tenses: A computational/experimental study. Cognition 90: 119–161.
Albright, Adam, Argelia Andrade, and Bruce Hayes. 2001. Segmental environments of Spanish diphthongization. In UCLA working papers in linguistics 7, (Papers in phonology 5), ed. Adam Albright, and Taehong Cho, 117–151. Los Angeles: UCLA Department of Linguistics.
Becker, Michael, Nihan Ketrez, and Andrew Nevins. 2011. The surfeit of the stimulus: Analytic biases filter lexical statistics in Turkish laryngeal alternations. Language 87: 84–125.
Berent, Iris, Donca Steriade, Tracy Lennertz, and Vered Vaknin. 2007. What we know about what we have never heard: Evidence from perceptual illusions. Cognition 104: 591–630.
Berko, Jean. 1958. The child’s learning of English morphology. Word 14: 150–177.
Boersma, Paul, and David Weenink. 2009. Praat: Doing phonetics by computer (computer program). http://www.praat.org/. Accessed 5 Jan 2009.
Bybee, Joan. 2001. Phonology and language use. Cambridge: Cambridge University Press.
Bybee, Joan. 2006. From usage to grammar: The mind’s response to repetition. Language 82: 711–733.
Bybee, Joan, and Elly Pardo. 1981. On lexical and morphological conditioning of alternations: A nonce-probe experiment with Spanish verbs. Linguistics 19: 937–968.
Chao, Yuen Ren. 1968. A grammar of spoken Chinese. Berkeley and Los Angeles: University of California Press.
Chen, Matthew Y. 1987. The syntax of Xiamen tone sandhi. Phonology Yearbook 4: 109–150.
Chen, Matthew Y. 2000. Tone sandhi: Patterns across Chinese dialects. Cambridge: Cambridge University Press.
Cheng, Robert L. 1968. Tone sandhi in Taiwanese. Linguistics 41: 19–42.
Coetzee, Andries W. 2009. Learning lexical indexation. Phonology 26: 109–145.
Coetzee, Andries W., and Shigeto Kawahara. 2013. Frequency biases in phonological variation. Natural Language and Linguistic Theory 31: 47–89.
Coetzee, Andries W., and Joe Pater. 2008. Weighted constraints and gradient restrictions on place co-occurrence in Muna and Arabic. Natural Language and Linguistic Theory 26: 289–337.
Coetzee, Andries, and Joe Pater. 2011. The place of variation in phonological theory. In The handbook of phonological theory, 2nd ed, ed. John A. Goldsmith, Jason Riggle, and Alan C.L. Yu, 401–434. Cambridge, MA and Oxford, UK: Blackwell.
Da, Jun. 2004. Chinese text computing. http://lingua.mtsu.edu/chinese-computing. Accessed 1 Sept 2008.
Gahl, Susanne, and Alan Yu. 2006. Special issue on exemplar-based models in linguistics. The Linguistic Review 23(3): 289–318.
Gandour, Jackson T., Siripong Potisuk, and Sumalee Dechongkit. 1994. Tonal coarticulation in Thai. Journal of Phonetics 22: 474–492.
Gao, Jing. 2004. The changing sandhi rules in Tianjin dialect. In Phonetic and phonological studies on Tianjin dialect, ed. Lu Jilun, 193–247. Beijing: Beijing Institute of Technology Press.
Goldwater, Sharon, and Mark Johnson. 2003. Learning OT constraint ranking using a maximum entropy model. In Proceedings of the Stockholm workshop on variation within optimality theory, ed. Jennifer Spenader, Anders Eriksson, and Osten Dahl, 111–120. Stockholm: Stockholm University.
Hayes, Bruce, and Zsuzsa C. Londe. 2006. Stochastic phonological knowledge: The case of Hungarian vowel harmony. Phonology 23: 59–104.
Hayes, Bruce, and Colin Wilson. 2008. A maximum entropy model of phonotactics and phonotactic learning. Linguistic Inquiry 39: 379–440.
Hayes, Bruce, Colin Wilson, and Benjamin George. 2009a. Maxent grammar tool. Java program. http://www.linguistics.ucla.edu/people/hayes/MaxentGrammarTool/. Accessed 22 May 2009.
Hayes, Bruce, Kie Zuraw, Peter Siptar, and Zsuzsa Londe. 2009b. Natural and unnatural constraints in Hungarian vowel harmony. Language 85: 822–863.
Herd, Wendy, Allard Jongman, and Joan Sereno. 2010. An acoustic and perceptual analysis of /t/ and /d/ flaps in American English. Journal of Phonetics 38: 504–516.
Hsieh, Hsin-I. 1970. The psychological reality of tone sandhi rules in Taiwanese. In Papers from the 6th meeting of the Chicago Linguistic Society, ed. M.A. Campbell, 489–503. Chicago: Chicago Linguistic Society.
Hsieh, Hsin-I. 1975. How generative is phonology. In The transformational-generative paradigm and modern linguistic theory, ed. E.F. Koerner, 109–144. Amsterdam: John Benjamins.
Hsieh, Hsin-I. 1976. On the unreality of some phonological rules. Lingua 38: 1–19.
Hyman, Larry. 2007. Universals of tone rules: 30 years later. In Tones and tunes vol 1: Typological studies in word and sentence prosody, ed. Tomas Riad, and Carlos Gussenhoven, 1–34. Berlin: Mouton de Gruyter.
Jager, Gerhard. 2007. Maximum Entropy models and stochastic optimality theory. In Architectures, rules and preferences: Variation on themes by Joan W. Bresnan, ed. Annie Zaenen, Jane Simpson, Tracy H. King, Jane Grimshaw, Joan Maling, and Chris Manning, 467–479. Stanford: CSLI Publications.
Jassem, Wiktor, and Lutoslawa Richter. 1989. Neutralization of voicing in Polish obstruents. Journal of Phonetics 17: 317–325.
Jiang, Hui. 1994. The phonetic description of neutral tone in Tianjin dialect. MA thesis, Tianjin Normal University, Tianjin.
Kenstowicz, Michael, and Charles Kisseberth. 1979. Generative phonology: Description and theory. San Diego: Academic.
Kiparsky, Paul. 1973. Abstractness, opacity, and global rules. In Three dimensions of linguistic theory, ed. Osamu Fujimura, 57–86. Tokyo: TEC Company Ltd.
Kirchner, Robert. 1996. Synchronic chain shifts in optimality theory. Linguistic Inquiry 27: 341–350.
Leben, William. 1973. Suprasegmental phonology. PhD dissertation, MIT.
Li, Xing-Jian, and Si-Xun Liu. 1985. Tianjin fangyan de liandu biandiao [Tone sandhi in the Tianjin dialect]. Zhongguo Yuwen [Studies of the Chinese Language] 1985(1): 76–80.
Liang, Yuzhang, and Aizhen Feng. 1996. Fuzhouhua yindang [The sound system of Fuzhou dialect]. Shanghai: Shanghai Education.
Lin, Huishan. 2008. Variable directional applications in Tianjin tone sandhi. Journal of East Asian Linguistics 17: 181–226.
Liu, Yu-Zhen, and Jiang Gao. 2003. Qu-Qu liandu biandiao guize: shehui yuyanxue bianxiang [FF sandhi rule in Tianjin dialect: A sociolinguistic variable]. Tianjin Shifan Daxue Xuebao—Shehui Kexue Ban [Journal of Tianjin Normal University—Social Sciences] 2003(5): 65–69.
Lu, Ji-Lun. 1997. Tianjin fangyan zhong de yizhong xin de liandu biandiao [A new tone sandhi rule in Tianjin dialect]. Tianjin Shida Xuebao [Journal of Tianjin Normal University] 1997(4): 67–72.
Lu, Ji-Lun. 2004. A new phenomenon in Tianjin tone sandhi. In Phonetic and phonological studies on Tianjin dialect: Festschrift for Professor Wang Jialing’s 70th birthday, ed. Lu Ji-Lun, 89–137. Beijing: Beijing Institute of Technology Press.
Ma, Qiuwu, and Yuan Jia. 2006. Tianjinhua shangsheng de liangtiao “biandiao guize” bianxi [Two new third tone sandhi rules in Tianjin dialect—a critical reanalysis]. Tianjin Shifan Daxue Xuebao—Shehui Kexue Ban [Journal of Tianjin Normal University—Social Science] 2006(1): 53–58.
Maddieson, Ian. 1978. Universals of tone. In Universals of human language, vol. 2: Phonology, ed. Joseph H. Greenberg, 335–366. Stanford: Stanford University Press.
Moreton, Elliott. 2004. Non-computable functions in optimality theory. In Optimality theory in phonology, ed. John McCarthy, 141–164. Malden: Blackwell.
Moreton, Elliott. 2008. Analytical bias and phonological typology. Phonology 25: 83–127.
Peng, Shu-Hui. 1997. Production and perception of Taiwanese tones in different tonal and prosodic contexts. Journal of Phonetics 25: 371–400.
Peng, Shu-Hui. 2000. Lexical versus ‘phonological’ representations of Mandarin sandhi tones. In Language acquisition and the lexicon: Papers in laboratory phonology 5, ed. Michael B. Broe, and Janet B. Pierrehumbert, 152–167. Cambridge: Cambridge University Press.
Pierrehumbert, Janet B. 2001. Exemplar dynamics: Word frequency, lenition and contrast. In Frequency and the emergence of linguistic structure, ed. Joan Bybee, and Paul Hopper, 137–157. Amsterdam: John Benjamins.
Pierrehumbert, Janet B. 2002. Word-specific phonetics. In Laboratory phonology 7, ed. Carlos Gussenhoven, and Natasha Warner, 101–139. Berlin: Mouton de Gruyter.
Pierrehumbert, Janet B. 2006. The statistical basis of an unnatural alternation. In Laboratory phonology 8. Varieties of phonological competence, ed. Louis Goldstein, Douglas H. Whalen, and Catherine Best, 81–107. Berlin: Mouton de Gruyter.
Port, Robert, and Penny Crawford. 1989. Incomplete neutralization and pragmatics in German. Journal of Phonetics 17: 257–282.
Prince, Alan, and Paul Smolensky. 1993. Optimality theory: Constraint interactions in generative grammar. New Brunswick: Rutgers Center for Cognitive Science, Rutgers University. (re-printed in 2004 by MIT Press, Cambridge, MA).
Rietveld, Toni, and Aoju Chen. 2006. How to obtain and process perceptual judgements of intonational meaning. In Methods in empirical prosody research, ed. Stefan Sudhoff, Denisa Lenortova, Roland Meyer, Sandra Pappert, Petra Augurzky, Ina Mleinek, Nicole Richter, and Johannes Schieβer, 283–319. Berlin: Walter de Gruyter.
Rose, Phil. 1987. Considerations in the normalization of the fundamental frequency in linguistic tone. Speech Communication 6: 343–351.
Shi, Feng. 1986. Tianjin fangyan shuangzizu shengdiao fenxi [An analysis of disyllabic tones in Tianjin dialect]. Yuyan Yanjiu [Linguistic Research] 1986(1): 77–90.
Shi, Feng. 1988. Shilun Tianjinhua de shengdiao jiqi bianhua—xiandai yuyinxue biji [On tones and their recent changes in Tianjin dialect—modern phonetics notes]. Zhongguo Yuwen [Studies of the Chinese Language] 1988(5): 351–360.
Shi, Feng. 1990. Hanyu he Dong-Tai yu de shengdiao geju [Tone systems in Chinese and Kam-Tai languages]. PhD dissertation, Nankai University, Tianjin.
Shi, Feng, and Ping Wang. 2004. Tianjinhua shengdiao de xin bianhua [New changes in Tianjin tones]. In The joy of research: A festschrift in honor of Professor William S.-Y. Wang on his seventieth birthday, ed. Feng Shi, and Zhongwei Shen, 176–188. Tianjin: Nankai University Press.
Wang, Samuel H. 1993. Taiyu biandiao de xinli texing [On the psychological status of Taiwanese tone sandhi]. Tsinghua Xuebao [Tsinghua Journal of Chinese Studies] 23: 175–192.
Wang, Jia-Ling. 2002. Youxuanlun he Tianjinhua de liandu biandiao ji qingsheng [Optimality Theory and tone sandhi and neutral tone in Tianjin dialect]. Zhongguo Yuwen [Studies of the Chinese Language] 2002(4): 363–371.
Warner, Natasha, Allard Jongman, Joan Sereno, and Rachel Kemper. 2004. Incomplete neutralization of sub-phonemic durational differences in production and perception of Dutch. Journal of Phonetics 32: 251–276.
Wee, Lian-Hee. 2004. Inter-tier correspondence theory. PhD dissertation, Rutgers University, New Brunswick, NJ.
Wilson, Colin. 2006. Learning phonology with substantive bias: An experimental and computational study of velar palatalization. Cognitive Science 30(5): 945–982.
Xu, Yi. 1997. Contextual tonal variations in Mandarin. Journal of Phonetics 25: 61–83.
Xu, Yi. 2005. TimeNormalizedF0. Praat script. http://www.phon.ucl.ac.uk/home/yi/tools.html. Accessed 1 Dec 2005.
Yang, Zi-Xiang, He-Tong Guo, and Xiang-Dong Shi. 1999. Tianjinhua Yindang [The sound system of Tianjin dialect]. Shanghai: Shanghai Education Press.
Yu, Alan C.L. 2007. Understanding near mergers: The case of morphological tone in Cantonese. Phonology 24: 187–214.
Yue-Hashimoto, Anne O. 1987. Tone sandhi across Chinese dialects. In Wang Li memorial volumes, English volume, ed. Chinese Language Society of Hong Kong, 445–474. Hong Kong: Joint Publishing Co.
Zhang, Jie. 2002. The effects of duration and sonority on contour tone distribution: A typological survey and formal analysis. New York: Routledge.
Zhang, Jie. 2007. A directional asymmetry in Chinese tone sandhi systems. Journal of East Asian Linguistics 16: 259–302.
Zhang, Jie. 2010. Issues in the analysis of Chinese tone. Language and Linguistics Compass 4(12): 1137–1153.
Zhang, Jie. 2014a. Tones, tonal phonology, and tone sandhi. In The handbook of Chinese linguistics, ed. C.-T. James Huang, Y.-H. Audrey Li, and Andrew Simpson, 443–464. Oxford: Wiley-Blackwell.
Zhang, Jie. 2014b. Tone sandhi. In Oxford bibliographies in linguistics, ed. Mark Aronoff. New York: Oxford University Press. http://www.oxfordbibliographies.com/view/document/obo-9780199772810/obo-9780199772810-0160.xml. Accessed 15 July 2014.
Zhang, Jie, and Yuwen Lai. 2008. Phonological knowledge beyond the lexicon in Taiwanese double reduplication. In Interfaces in Chinese phonology: Festschrift in honor of Matthew Y. Chen on his 70th birthday, ed. Yuchau E. Hsiao, Hui-Chuan Hsu, Lian-Hee Wee, and Dah-An Ho, 183–222. Taipei: Academia Sinica.
Zhang, Jie, and Yuwen Lai. 2010. Testing the role of phonetic knowledge in Mandarin tone sandhi. Phonology 27(1): 153–201.
Zhang, Jie, and Jiang Liu. 2011. Tone sandhi and tonal coarticulation in Tianjin Chinese. Phonetica 68(3): 161–191.
Zhang, Jie, and Yuanliang Meng. 2012. Structure-dependent tone sandhi in real and nonce words in Shanghai Wu. In Proceedings of the 3rd international symposium on tonal aspects of languages, ed. Gu Wentao. Nanjing: Nanjing Normal University.
Zhang, Jie, Yuwen Lai, and Craig Sailor. 2009. Opacity, phonetics, and frequency in Taiwanese tone sandhi. In Current issues in unity and diversity of languages: Collection of papers selected from the 18th International Congress of Linguists, ed. Manghyu Pak, 3019–3038. Seoul: Linguistic Society of Korea.
Zhang, Jie, Yuwen Lai, and Craig Sailor. 2011. Modeling Taiwanese speakers’ knowledge of tone sandhi in reduplication. Lingua 121(2): 181–206.
Zhao, Yuan, and Dan Jurafsky. 2009. The effect of lexical frequency and Lombard reflex on tone hyperarticulation. Journal of Phonetics 37: 231–247.
Zhu, Xiaonong. 2004. Jipin guiyihua — ruhe chuli shengdiao de suiji chayi? [F0 normalization — How to deal with between-speaker tonal variations?]. Yuyan Kexue [Linguistic Sciences] 3(2): 3–19.
Zuraw, Kie. 2000. Patterned exceptions in phonology. PhD dissertation, University of California, Los Angeles.
Zuraw, Kie. 2007. The role of phonetic knowledge in phonological patterning: Corpus and survey evidence from Tagalog infixation. Language 83: 277–316.
Zuraw, Kie. 2010. A model of lexical variation and the grammar with application to Tagalog nasal substitution. Natural Language and Linguistic Theory 28: 417–472.