MULTIMODAL LINGUISTICS. Seminar
Dulzon Readings, Tomsk, 2011. A. A. Kibrik (Institute of Linguistics, Russian Academy of Sciences)
2
The mainstream linguistic approach
Language consists of hierarchically organized segmental units, such as phonemes, morphemes, words, phrases, and sentences
Linguistic form is thus equated with verbal form
Search for “linguistic form” in Google. The first result is: “A meaningful unit of language, such as an affix, a word, a phrase, or a sentence.” (TheFreeDictionary.com)
“Taken together, linguistic signs form a special kind of sign system: language. <…> The most typical linguistic sign is the word <…> The form of expression of any verbal sign consists of phonemes” (Linguistic Encyclopedic Dictionary [Lingvističeskij ènciklopedičeskij slovar’], p. 167)
3
However
Apart from sound, there are other channels of communication, in the first place vision (body language: gesture, facial expression, gaze, etc.)
Sound itself has prosodic, that is, non-verbal aspects
Imagine prosody-free talk or, vice versa, talk heard from behind a wall
4
Multimodality
In order to understand language and communication, all aspects of linguistic form must be taken into account
This is what is sometimes called the multimodal approach
Modality, or mode, refers to a distinct type of input
In particular, modality is a kind of stimulus associated with one of the human senses, particularly hearing and sight
So the verbal component, prosody, and body language all count as modes or modalities
“Any use of language is inescapably multimodal” (Scollon 2006)
5
Goals of this talk
Emphasize the importance of prosody and visual aspects of communication in linguistic research
Show how prosody and visual communication interact with the verbal component, thus suggesting not only the multimodal, but also the cross-modal approach
Propose that linguistics cannot progress without taking multimodality seriously into account
6
Are these goals relevant and important?
After all, linguists and other scholars have already been pursuing these issues for many decades, and the respective research traditions are quite rich
But:
First, prosody and visual communication are marginalized in linguistics: they are located in certain “pockets” of the overall linguistic panorama and are tolerated by the mainstream as “paralinguistics”
Second, those focusing on these information channels often treat them as a “thing in itself”, without integration with the verbal component
7
Plan of talk
I. Prosody
II. Gestures
III. Relative contribution of three information channels
IV. Signed languages
V. Wider context
8
I. PROSODY
Prosodic components
pausing
accents
pitch
tempo (of various scope)
registers
degrees of reduction
glottal features
loudness
...
“The growing interest in prosody is connected <…> with new semantic tasks (the description of non-propositional semantics <...>)” (Kodzasov 1996: 85)
Prosody is responsible for discourse segmentation into Elementary Discourse Units (EDUs), identified on the basis of several prosodic components and strongly correlated with clauses
9
An example of prosodically oriented discourse transcription
(Each unit is given as Russian original, transliteration, and word-by-word gloss.)

....(1.5) /\Озеро ...(0.5) какое-то,
....(1.5) /\Ozero ...(0.5) kakoe-to,
Lake some

..(0.3) (Или /\речка,
..(0.3) (Ili /\rečka,
Either river

или /\озеро,
ili /\ozero,
or lake

но по-моему \озеро,
no po-moemu \ozero,
but I guess lake

потому что’ ..(0.2) как-то-оw
potomu čto’ ..(0.2) kak-to-oW
because somehow

...(0.6) \маленькое такое,
...(0.6) \malen’koe takoe,
small such

\небольшое.)
\nebol’šoe.)
minor

....(1.0) ’и-иh ...(0.7) через /него
....(1.0) ’i-iH ...(0.7) čerez /nego
and across it

..(0.3) как-то \бревно какое-то,
..(0.3) kak-to \brevno kakoe-to,
somehow log some

типа \моста.
tipa \mosta.
like bridge
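The pause marks in the transcript above lend themselves to mechanical processing. The following is a minimal Python sketch, assuming that a pause is written as a run of two or more dots followed by its duration in seconds in parentheses, as in `....(1.5)`. The function name `pause_durations` and the regular expression are illustrative and are not part of the transcription system itself.

```python
import re

# Assumed notation: two or more dots, then a duration in seconds, e.g. "....(1.5)".
PAUSE = re.compile(r"\.{2,}\((\d+(?:\.\d+)?)\)")

def pause_durations(line: str) -> list[float]:
    """Return the durations (in seconds) of all marked pauses in one transcript line."""
    return [float(m) for m in PAUSE.findall(line)]

# A few transliterated lines from the lake example.
lines = [
    "....(1.5) /\\Ozero ...(0.5) kakoe-to,",
    "..(0.3) (Ili /\\rečka,",
    "potomu čto’ ..(0.2) kak-to-oW",
    "tipa \\mosta.",
]
for line in lines:
    print(pause_durations(line))
```

A sentence-final period, as in the last line, is not mistaken for a pause because the pattern requires at least two dots followed by a parenthesized number.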
10
Night Dream Stories
Corpus of spoken Russian stories
Speakers: children and adolescents
Subject matter: retelling of night dreams
Discourse type: monologic narrative (personal stories)
Joint study with Vera Podlesskaya and a group of our graduate students
Kibrik and Podlesskaya (eds.) 2009
11
Segmentation (EDUs)
[The slide repeats the transcript of the lake example shown above.]
12
Pauses
[The slide repeats the transcript of the lake example shown above.]
13
Pitch accents
[The slide repeats the transcript of the lake example shown above.]
14
Tempo: wide and narrow scope
[The slide repeats the transcript of the lake example shown above.]
15
Other prosodic phenomena
[The slide repeats the transcript of the lake example shown above.]
16
Prosody and sentence
Does spoken language consist of sentences?
Sheer facts:
Spoken language is the primary form of language
Spoken language does not contain periods, question marks and other explicit signals of sentence boundaries
Research question: Is the sentence, as a theoretical construct, as identifiable and as basic for the primary form of language as it is (or as it is thought to be) for written language?
17
Sentence in spoken language
Position 1: the sentence is a universal and basic unit of language
Assumption typically held not only by linguists but also by other cognitive scientists
But the sentence is very far from being obvious in spoken language
Position 2: avoidance of the issue, typical of discourse-oriented linguists
If so, how could sentences become so entrenched in written language?
18
Phase (фаза)
Term by Sandro V. Kodzasov
Alternative term by J. DuBois et al. 1992: transitional continuity
Discourse semantic category: ‘end’ vs. ‘non-end’ (= expectation of a forthcoming end)
Hierarchical nature of phase
End of tentative sentence – falling tonal accent
Non-end – rising tonal accent
19
A canonical example of the transitional continuity distinction (z57:15-16)

..(0.4) /\Мы-ы’ ..(0.4) \как бы за них /взя-ались,
..(0.4) /\My-y’ ..(0.4) \kak by za nix /vzja-alis’,
We sort of at them got.hold
Rising (“comma”): non-end

...(0.5) и-и ввь= || ..(0.2) полетели \вве-ерх.
...(0.5) i-i vv’= || ..(0.2) poleteli \vve-erx.
and flew upward
Falling (“period”): end

If things were that easy, the sentence would be uncontroversial
20
Non-canonical situation: Non-end with a falling tonal accent
[The slide repeats the lake example above. Several units that end with a comma, i.e. non-ends, carry a falling accent (\): no po-moemu \ozero, ‘but I guess lake’; \malen’koe takoe, ‘small such’; kak-to \brevno kakoe-to, ‘somehow log some’.]
21
The problem of two kinds of falling
The existence of non-final falling calls the relevance of the sentence into question
However, the distinction between the two kinds of falling is very systematic
The two kinds of falling:
are prosodically distinct
have distinct discourse functions
22
Prosodic criteria of the final vs. non-final falling distinction
1. Target frequency band
2. Post-accent behavior
3. Pausing pattern
4. Reset vs. latching
5. Steepness of falling
6. Interval of falling
23
Target frequency band
Final falling (“period”): targets at the bottom of the speaker’s F0 range
Non-final falling (“falling comma”): targets a level several dozen Hz (several semitones) higher
24
F0 graph for the “lake” example
[Figure: F0 graph. Labeled segments: \ozero, · \malen’koe takoe, · \nebol’šoe. · \brevno kakoe-to, · \mosta.]
25
Non-final falling (210 Hz), final falling (170 Hz), rising, post-rising falling (Z54: 4-5)

..(0.4) А /тогда уже д= || ..(0.2) закрывались \двери,
..(0.4) A /togda uže d= || ..(0.2) zakryvalis’ \dveri,
And then already d= were.closing doors

..(0.1) и /’Аня не –успела \сесть.
..(0.1) i /Anja ne –uspela \sest’.
and Anja not managed get.in

...(0.7) Иw мм(0.4) /\когда-а ..(0.2) ’’(0.3) ..(0.4) {ЧМОКАНЬЕ 0.2} ..(0.4) когда я приехала на нашу /остановку’,
...(0.7) IW mm(0.4) /\kogda-a ..(0.2) ’’(0.3) ..(0.4) {SMACKING 0.2} ..(0.4) kogda ja priexala na našu /ostanovku’,
And when when I arrived to our station

F0 targets marked on the slide: 210 Hz, 170 Hz
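The gap between the two targets can be restated in semitones, the unit used in the target frequency band criterion. A minimal Python sketch, assuming the standard equal-tempered definition of 12 semitones per octave; the function name `semitones` is illustrative:

```python
import math

def semitones(f_high_hz: float, f_low_hz: float) -> float:
    """Interval between two frequencies in semitones (12 semitones = one octave)."""
    return 12 * math.log2(f_high_hz / f_low_hz)

# Targets from the example: non-final falling at 210 Hz, final falling at 170 Hz.
print(round(semitones(210, 170), 2))
```

For these two values the interval is a little under four semitones, which is in line with the "several semitones" stated for the difference between non-final and final targets.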
26
Post-accent behavior
Final falling (“period”): steady falling on the post-accent syllables
Non-final falling (“comma”): lack of falling on post-accent syllables, often rise of tone (V-curve)
27
V-curve (z26)

....(5.7) /Домик ...(0.6) был /около \реч↑ки,
....(5.7) /Domik ...(0.6) byl /okolo \reč↑ki,
Little.house was near creek

....(3.3) /рядом были \–родник-ки,
....(3.3) /rjadom byli \–rodnik-ki,
nearby were springs

..(0.4) и \–ле-ес.
..(0.4) i \–le-es.
and forest

F0 values marked on the slide: 260 Hz, 235 Hz, 240 Hz
28
The final vs. non-final falling distinction
A speaker’s prosodic pattern must be identified first
On its basis, the distinction between final and non-final falling can be identified with a high degree of robustness
29
Contexts of non-final falling
Anticipatory mirror-image adaptation
Inset
Stepwise falling
30
Anticipatory mirror-image adaptation
....(1.8) Когда я \услышала,
Kogda ja \uslyšala,
when I heard

...(0.5) что-о /бомба гремит,
čto-o /bomba gremit,
that bomb growls
31
Inset
/Входит это ...(0.5) /\ма-аль↑чик,
/Vxodit èto ...(0.5) /\ma-al’↑čik,
enters here boy

’ ’ ..(0.1) /\ну к \другому,
’ ’ ..(0.1) /\nu k \drugomu,
well to another

..(0.1) и \говорит:
..(0.1) i \govorit:
and says
32
Stepwise falling

....(1.5) /\Ozero ...(0.5) kakoe-to,
Lake some
..(0.3) (Ili /\rečka,
Either river
ili /\ozero,
or lake
no po-moemu \ozero,
but I guess lake
potomu čto’ ..(0.2) kak-to-oW
because somehow
...(0.6) \malen’koe takoe,
small such
\nebol’šoe.)
minor

F0 values marked on the slide: 210 Hz, 190 Hz, 160 Hz
33
Representation of EDU continuity types in corpus
[Bar chart, vertical axis from 0 to 1200. Values: 894, 606, 1188. Categories: final falling, non-final falling, (non-final) rising.]
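The three counts can be turned into shares of all EDUs. A minimal Python sketch; the pairing of values with categories is an assumption based on the order in which they appear in the chart (894 final falling, 606 non-final falling, 1188 rising) and should be checked against the original figure.

```python
# Counts read off the chart; the value-to-label pairing is an assumption
# based on the order in which values and labels appear.
counts = {"final falling": 894, "non-final falling": 606, "(non-final) rising": 1188}

total = sum(counts.values())
shares = {label: 100 * n / total for label, n in counts.items()}

for label, share in shares.items():
    print(f"{label}: {share:.1f}%")
```

Under that assumption, roughly a third of the EDUs end in a final falling accent, while the two non-final types together make up the remaining two thirds.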
34
The status of sentence
In the speech of most speakers final falling is clearly distinct from non-final patterns
Final intonation, expressly distinct from non-final intonation (both rising and falling), makes the notion of sentence valid for spoken discourse
Speakers “know” when they complete a sentence and when they do not
Apparently, spoken sentences are the prototype of written sentences
35
However
Identification of sentences is possible only on the basis of a complex analytic procedure
It is dependent on prior understanding of a speaker’s prosodic “portrait”
There are prototypes of final and non-final fallings, but there are intermediate instances, therefore sentencehood may be a matter of degree
Unlike EDUs, sentences are highly variable
Speakers with short sentences
Speakers with long sentences equaling stories
• Clause chaining
A significant tune-up is necessary to apply the procedure to a different discourse type or a different language
36
Conclusions on prosody and sentence
Sentence is an intermediate hierarchical grouping between an EDU (roughly, clause) and whole discourse
Sentence is an elusive, complex, non-elementary unit of spoken language
These conclusions, possible only due to prosodic analysis, are of prime importance for linguistic theory
The notion of sentence, so salient in theories restricted to the verbal component alone, can only be evaluated relying on prosodic evidence
37
Other languages?
Upper Kuskokwim Athabaskan
Bobby Esai, Sr.
38
Excerpt from a story

a. (1.6) hwndine ŒiÈ chu
suddenly with Ptcl
b. (2.2) sighwdlaŒ todoltsitÈ' ts'eŒ
my.sled it.broke.through.ice and
c. (5.5) sileka ch'ildon' nich'i toghedak Œedinh
my.dogs some too they.fell.in.water though
d. (0.9) ch'ildon' chuŒda
some though
e. (0.2) tinh k'its' ==
ice on
f. (0.9) tinh k'its' Œohighet'a ts'eŒ
ice on they.are.there and

‘Suddenly, my sled broke through the ice, and some of my dogs also fell into the water, while others remained on top of the ice, and <…>’
39
Tonal contours and EDUs
[Figure: tonal contours, with segments labeled a, b, c, d, e, f.]
40
II. GESTURE
In the course of communication, it is not just that the speaker speaks and the addressee listens
In addition, the speaker displays, and the addressee observes:
Gesture
Gaze
Facial expression
Posture
Proxemics
Cultural symbolism
...
(see, for example, Kreidlin 2002, Butovskaya 2004)
41
Gestures
Gestures are kinetic behaviors of arms and other limbs, capable of conveying meaning from speaker to addressee.
Among the various types of gestures (see e.g. McNeill 1992) pointing gestures are one of the most salient types.
42
Pointing
Понюхай эти! (‘Smell these!’)
43
Elements of a canonical pointing act
44
Phylogeny and ontogeny
Pointing gestures:
Appear to be an exclusive property of humans (Tomasello et al. 2007)
Are a very ancient gesture type (Kreidlin 2007)
Appear at the end of the first year of life
Can participate in binary multimodal constructions “word + gesture”, such as open POINT (Butcher and Goldin-Meadow 2000)
45
Reference and pointing
Reference is a fundamental linguistic phenomenon, accounting for about every third word in running discourse
Studies of reference (deixis, anaphora, etc.) are among the central concerns of modern linguistics
Pointing is the developmental source of reference
46
Pointing, deixis, and exophora
Deixis is the most widely recognized function of pointing
However, quite frequently pointing is associated with exophora, that is mention of perceptually activated referents (O'Neill 1996, Levy 2000: 219, Nikolaeva 2003)
Exophora is the ontological source of anaphora
47
Exophoric and anaphoric reference (from Nikolaeva 2003)
a. My s Anatoliem uže mnogo let očen’ rabotaem,
<three intervening clauses>
e. on mnogo raz zavjazyval,
‘Anatolij and I have been working together for many years, <…> he was winding it up (drinking) many times’
48
Pointing and prosody
Pointing and accentuation are analogous phenomena, both associated with making an item salient
Nikolaeva (p.c.): pointing typically cooccurs with accent
Levy (2000): energy expenditure
49
Substitution: Referent vs. demonstratum
Reference to non-specific items:
Vot počemu my i obraščaemsja poroj k psixologam.
‘This is why we address psychologists now and then’
This phenomenon is known as deferred ostension, analogic deixis, ostensive metonymy, etc.
In substitution, reference does not have to be non-specific
He got a big scar here (pointing to one’s cheek) (Levelt 1989)
50
Virtual pointing
Pointing to imaginary targets
cf. Buehler’s Deixis am Phantasma, McNeill’s abstract pointing
51
Frequency in two discourse types
Nikolaeva 2003 (TV shows):
5.4 pointing gestures per 100 EDUs
2.7 are virtual pointing
Nikolaeva p.c. (retelling of a film):
4.2 pointing gestures per 100 EDUs
All are virtual pointing
Virtual pointing in exophora/anaphora is as frequent as in deixis
52
a. … əə Kogda on exal po= po doroge,
b. on əə mm … poravnjalsja s devočkoj,
‘As he rode along the road, he passed a girl <...>’
Illustrative gesture
53
d. on zasmotrelsja na neë,
‘he gaped at her’
Pointing gestures
54
Spatial representation of referents
By illustrative gestures in the previous example
By verbal devices
a. i naprotiv menja sideli dve devočki-mulatki,
<21 intervening clauses>
y. vot êti dve devočki i ja,
‘And across from me sat two brown-skinned girls, <…> these two girls and I <...>’
For the referential system, it makes no difference which means is used to convey spatial relations
Verbal and gestural material is jointly used to convey the inner cognitive representation from the speaker to the addressee
55
Conclusions on gestures and reference
The pointing gesture is the developmental source of reference
The use of pointing is intimately connected to reference
Reference is performed with the help of both verbal devices and illustrative gestures
Reference, a central linguistic phenomenon, cannot be understood if we fail to take gesture into account
56
III. Relative contribution of three information channels
Discourse
  Vocal channels
    Verbal channel
    Prosodic channel
  Visual channel
57
What is the contribution of different channels?
Traditional approach of mainstream linguistics: the verbal channel is so central that prosody and the visual channel are at best downgraded as “paralinguistics”
Applied psychology:
“Since body language conveys more than half of any message in any face-to-face encounter, how you act is vital” (Business advising)
http://www.sideroad.com/Business_Etiquette/business-body-language.html
It is often stated that (figures go back to Mehrabian 1971):
• body language conveys 55% of information
• prosody conveys 38% of information
• the verbal component conveys 7% of information
“Words may be what men use when all else fails” (Kreidlin 2002: 6)
Who is right?
58
Experimental study
Isolate three information channels
Present a sample discourse in all possible variants (2^3 = 8)
Present each of the eight variants to a group of subjects
Assess the degree of understanding in each case
Kibrik and El’bert 2008
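The eight variants follow from switching each of the three channels on or off independently. A minimal Python sketch of that enumeration; the names `CHANNELS` and `conditions` are illustrative, and the order of the output is not meant to match the group numbering used in the study.

```python
from itertools import product

CHANNELS = ("verbal", "prosodic", "visual")

def conditions():
    """All on/off combinations of the three channels, as tuples of the channels kept."""
    return [
        tuple(ch for ch, on in zip(CHANNELS, flags) if on)
        for flags in product((True, False), repeat=len(CHANNELS))
    ]

for cond in conditions():
    print(" + ".join(cond) if cond else "[none]")
```

The list runs from the full original (all three channels) to the empty condition, in which subjects see only the preceding context.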
59
Experimental material
Russian TV serial “Tajny sledstvija” – “Mysteries of the investigation”
Experimental excerpt: 3 min. 20 sec.
Preceded by an 8-minute context (starting from the beginning of the series)
The excerpt consists entirely of a conversation, to ensure that we are testing the understanding of discourse rather than of the film in general
The two vocal channels have been separated:
verbal alone – running subtitles
prosodic alone – superimposed filter creating the “behind a wall” effect
Subjects:
99 participants, divided into 8 groups
Native speakers of Russian
Each group comprised 10 to 17 subjects
60
Full version
61
Visual + verbal channels
62
Visual + prosodic channels
63
Procedure
Every subject was instructed to watch the context and the experimental excerpt and then answer a set of questions concerned with the experimental excerpt alone
Questionnaire was constructed in accordance with the received principles of test tasks (Panchenko 2000)
23 questions in the questionnaire
A subject was supposed to choose only one answer out of four listed variants
Sample question: What does Tamara Stepanovna offer Masha before the beginning of the conversation?
a. to take off her coat
b. to have a cup of tea
c. to have a seat
d. to have a drink
Percentage of correct answers is used as an assessment of a subject’s degree of understanding
64
Results
Group  Experimental material   Information channels      Number of channels  Mean % of correct answers
1      Original                verbal, prosodic, visual  3                   87.4%
2      Sound                   verbal, prosodic          2                   70.4%
3      Subtitles + video       verbal, visual            2                   73.9%
4      Prosody + video         prosodic, visual          2                   51.2%
5      Subtitles               verbal                    1                   72.0%
6      Prosody                 prosodic                  1                   51.1%
7      Video                   visual                    1                   61.7%
8      Nothing (context only)  [none]                    0                   38.3%
65
Each of the three information channels, taken in isolation, is quite informative
66
The hierarchy of informativeness: verbal > visual > prosodic
67
The combination ‘prosodic plus visual’ (group 4) yields a significantly lower result than the other pairs of channels (groups 2 and 3).
68
Relative contribution of the three channels
For the sake of simplicity, assume that all three channels are independent
Normalizing factor: (72 + 51 + 62) / 100 = 185 / 100 = 1.85
Results:
Verbal channel: 39% (72.0 / 1.85 ≈ 39)
Prosodic channel: 28% (51.1 / 1.85 ≈ 28)
Visual channel: 33% (61.7 / 1.85 ≈ 33)
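The normalization above can be reproduced in a few lines. A minimal Python sketch, using the mean scores of the three single-channel groups (5, 6, and 7) and, like the slide, assuming that the channels are independent; the variable names are illustrative.

```python
# Mean scores (%) of the single-channel groups 5, 6, and 7 from the results table.
single_channel = {"verbal": 72.0, "prosodic": 51.1, "visual": 61.7}

# Each channel's share of the summed single-channel scores, rounded to whole percent.
total = sum(single_channel.values())
contribution = {ch: round(100 * score / total) for ch, score in single_channel.items()}

print(contribution)
```

The sketch divides by the unrounded sum (184.8) rather than by 185, which gives the same whole-number percentages as on the slide.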
69
Conclusions about the relative weight of three information channels
All information channels are highly significant
the traditional linguistic viewpoint is erroneous
The verbal channel is the leading one
the viewpoint popular in applied psychology is erroneous
Information from the prosodic and the visual channels is primarily used through integration with the verbal channel, at least for this discourse type
70
IV. Signed languages
NATURAL LANGUAGES
  SPOKEN
  SIGNED
DEAF SIGN LANGUAGES
natural, fully-fledged human languages
visual-spatial languages
use hands and arms, facial expressions, eye gaze, head and body posture to encode linguistic information
manual signs are produced in a three-dimensional space immediately in front of the signer – the signing arena
121 sign languages (http://www.ethnologue.com): American Sign Language, Russian Sign Language, …
71
Reference in RSL
Prozorova 2006, Kibrik and Prozorova 2007
Goal: to characterize referential choice of a deaf sign language as contrasted to that of spoken languages
72
RSL data collection
‘The Pear Stories’ film (Chafe 1980)
Corpus of 10 video-recorded RSL narratives based on the retellings of the Pear Film
Speakers:
6 men and 4 women
age 15-55
all based in Moscow
7 animate referents in the Pear Film
657 clauses
542 referential expressions (animate)
73
Deictic demonstrative reference in RSL
operates in the perceived space P
deictic expressions: pointing signs
pointing with an index finger towards the intended referent
(2) DEMcat ILL
‘He is ill’
74
Major anaphoric options in RSL
Full NPs (114)
Zero expressions (401)
Demonstratives (27)
75
Full NP
BOY YOUNG AGE CYCLE ‘A young boy is riding a bicycle’
76
Zero expressions
1. BOY YOUNG AGE CYCLE
2. Øboy STOP
3. Øboy HUMAN-STANDrightdown
4. Øboy LOOKrightdown P-E-A-R

1. A young boy is riding a bicycle.
2. He stops.
3. He stands upright.
4. He sees the pears.
77
Anaphoric zero reference
Interlocutors’ shared cognitive representation contains not only perceived referents, but also referents conceived of (remembered or imagined)
We call this representation the conceived space C
Mentioning referents that are present, or activated, in the conceived space is what is known as anaphora
Anaphoric referential choice depends on a referent’s activation in the conceived space:
High activation: zero
Low activation: full NP
78
Two discourse factors and anaphoric referential devices
Factor 1: RD=1, RD=2, RD=3+. Factor 2 (within RD=1): Ant=S, Ant=O.

         RD=1, Ant=S  RD=1, Ant=O  RD=2       RD=3+      TOTAL
full NP  <1 %         33 %         14 %       57 %       59
zero NP  99 %         42 %         67 %       27 %       401
DEM      <1 %         25 %         19 %       16 %       27
TOTAL    346 (100%)   24 (100%)    43 (100%)  74 (100%)  487
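The absolute totals in the table can be cross-checked against each other. A minimal Python sketch; the column labels assume that the Ant=S / Ant=O split applies to RD=1 only, which is a reading of the flattened table header rather than something the source states explicitly.

```python
# Absolute totals from the table: by referential device and by factor column.
# The column labels are an assumed reading of the table header.
device_totals = {"full NP": 59, "zero NP": 401, "DEM": 27}
column_totals = {"RD=1, Ant=S": 346, "RD=1, Ant=O": 24, "RD=2": 43, "RD=3+": 74}

grand_total = sum(device_totals.values())
# Row totals and column totals must describe the same set of expressions.
assert grand_total == sum(column_totals.values())

device_share = {d: round(100 * n / grand_total, 1) for d, n in device_totals.items()}
print(grand_total, device_share)
```

Both sets of totals add up to 487, and zero expressions account for the large majority of the anaphoric mentions counted in the table.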
79
Demonstrative
1. Øboy CYCLE
2. Øboy GOsignerforward AWAYsignerforward
3. DEMmanright SEE NEG
4. Øman PICK-ROUND

1. He cycles.
2. He goes away.
3. That one doesn’t see.
4. He picks pears.
80
Anaphoric demonstrative reference
In signed discourse the signer maps referents from the inner conceived space C onto the external signing arena
Mapping includes various parameters of referents:
locations
orientations
physical interactions
even abstract relations between them
Thus a constructed space C’ is created, inhabited by referents conceived of
81
How are locations of referents established in the constructed space?
Signed discourse takes place in the three-dimensional signing arena
The topology of the signing arena isomorphically represents the topology of the scenes, remembered by signers from the film
The signer establishes the locations of referents in his signing arena
These locations are isomorphic to the locations of the referents in the film, as remembered by the signer
82
An episode from the Pear Film
83
A retelling
1. ONE-MOVEfrontsigner MANi
2. ONE-MOVEfrontsigner SHE-GOAT
3. BOY GIRL UNCLEAR
4. SHE-GOAT
5. Øgoat TWO-HORN HAVE.NEG
6. DEMi front PULL

1. A man is coming,
2. with a she-goat.
3. Male, female – it is unclear.
4. It’s a she-goat:
5. It has no horns.
6. This one is pulling it.
84
Anaphoric demonstratives
Once the signer has explicitly indicated the location/path of a referent, demonstratives may be used for further mentions of this referent
Thus demonstratives are the basic device used for repeated mention of referents in the constructed space
Formally they are the same as deictic demonstratives
Demonstratives are based on the mechanism of virtual pointing, but it is conventionalized in RSL
What is an ad hoc, fluid device in spoken languages is an established, nearly lexical device in RSL
85
Referential function of demonstratives
Demonstratives are not particularly sensitive to activation factors:
Factor 1: RD=1, RD=2, RD=3+. Factor 2 (within RD=1): Ant=S, Ant=O.

             RD=1, Ant=S  RD=1, Ant=O  RD=2  RD=3+  TOTAL
nominal DEM  <1 %         25 %         19 %  16 %   27
86
Conclusions on reference in RSL
Types of referential devices and factors of reference are analogous to those of spoken languages
Some devices, only embryonically present in spoken languages, are strongly entrenched in RSL: virtual pointing
This is apparently due to the fundamentally spatio-visual character of RSL
Studying signed languages gives us a new perspective on spoken languages
Recognition of two fundamental types of languages, spoken and signed, appears indispensable for a general theory of language
87
V. A wider picture
The world surrounding us is multimodal
We are multimodal animals
Obviously, language and communication are multimodal
As often happens, those specializing in applied fields have understood the importance of multimodality before pure scholars and theorists
88
Multimodality in technology
TV is superior to radio
Multimodal communication devices
Internet, especially Web 2.0, is all multimodal
89
Stages of multimodal integration, from Cohen and Oviatt 2006
90
Multimodality in biological sciences
“Within biology, experimental psychology, and cognitive neuroscience, a separate rapidly growing literature has clarified that multisensory perception and integration cannot be predicted by studying the senses in isolation.”
(Cohen and Oviatt 2006)
91
Multimodality in communication studies and semiotics
Kress G & van Leeuwen T (2001). Multimodal discourse: the modes and media of contemporary communication. London: Arnold.
‘‘A multimodal approach assumes that the message is ‘spread across’ all the modes of communication. If this is so, then each mode is a partial bearer of the overall meaning of the message. All modes, speech and writing included, are then seen as always partial bearers of meaning only. This is a fundamental challenge to hitherto current notions of ‘language’ as a full means of making meaning’’ (Kress, 2002: 6).
92
Multimodal corpora
LREC-2008 (Language Resources and Evaluation Conference)
Blache P., Bertrand R., Ferré G. 2008. Creating and exploiting multimodal annotated corpora.
Gallo C.G., Jaeger T.F., Allen J., Swift M. 2008. Production in a multimodal corpus: How speakers communicate complex actions.
Kitazawa Sh., Kiriyama Sh., Kasami T., Ishikawa Sh., Otani N., Horiuchi H., Takebayashii Y. 2008. A multimodal infant behavior annotation for developmental analysis of demonstrative expressions.
93
Synthesis
LeVine P & Scollon R (eds.) Discourse and technology: multimodal discourse analysis. Washington, DC: Georgetown University Press. 2004
94
Conclusions
“Normal” linguists, researching conventional verbal material, need to understand that further progress in linguistics is impossible if one ignores the multimodality of language
Language in the understanding of the 20th century mainstream linguistics is an abstraction, very remote from reality. We live in the multimodal world, this is where language evolved and where it functions, and this is what we need to realize if we want to understand it
Taking the multimodal perspective into account can help to adequately approach classical questions of narrow linguistics
95
Acknowledgements
Julia Nikolaeva
Vera Podlesskaya
Evgenia Prozorova
Ekaterina El’bert
96
Alm 2006, “Augmentative and Alternative Communication”
“Unimpaired communication is, of course, inherently multimodal, with the speech content being modified by prosody and delivered in parallel with facial expression, gesture, posture, and a range of other nonverbal communication methods.”
97
Schrøder 2006
Kress G & van Leeuwen T (2001). Multimodal discourse: the modes and media of contemporary communication. London: Arnold/Hodder Headline Group.
NB: this is multimodal social semiotic theory
“The overall theoretical framework of Kress and van Leeuwen’s visual discourse semiotics is strongly akin to Fairclough’s three-dimensional model, whereas the analytical practice is inspired eclectically by theoretical and analytical work in linguistics, visual semiotics, film theory, art criticism, as well as numerous predecessors in the various fields of media research, especially the analysis of advertising (Cook, 1992; Myers, 1994; Williamson, 1978).”
98
Norris S (2004). Analyzing multimodal interaction: A methodological framework. London: Routledge.
99
Multimodal microplanning ELL, P. 168
100
ELL, 514 – multimodal technology
101
Cohen and Oviatt 2006
On technology
“before building high-performance multimodal systems, it is crucial that the architecture be based on an understanding of how humans communicate multimodally in different contexts.”
“future multimodal systems that can detect and adapt to a user’s dominant integration pattern potentially could yield substantial improvements in system robustness and overall performance”
“systems that allow users to distribute their content across modalities will face simpler recognition and understanding problems and thus are likely to be more robust”
“Within biology, experimental psychology, and cognitive neuroscience, a separate rapidly growing literature has clarified that multisensory perception and integration cannot be predicted by studying the senses in isolation.”
102
McKay 2006
“Studying texts with images and sounds has presented challenges to conventional discourse analysis, which has valued modes of language through speech and/or writing over visual images or music. The mass media produce multimodal texts, that is, texts that draw from language, pictures, or other graphic elements and sounds in various combinations. Considerations of the multimodal nature of media texts are difficult to incorporate in language-based media analysis. <...> In spite of the difficulties in trying [to] capture such multimodality, concentrating on language and ignoring the other modes is to miss much of the potential for meaning of contemporary media texts.”
103
Busch 2006
“Media communication is inherently multimodal communication: this means that language in written and spoken form is one of several modes available for expressing a potential of meanings. For instance, in print media lay-out and image are available in addition to the written word; in radio, language is present in its spoken form, alongside music and different sounds; in television all the aforementioned modes can be drawn upon in a context in which the moving image holds a central position. Similarly, in computer-mediated communication, a wide range of modes is available. ‘A multimodal approach assumes that the message is “spread across” all the modes of communication. If this is so, then each mode is a partial bearer of the overall meaning of the message. All modes, speech and writing included, are then seen as always partial bearers of meaning only. This is a fundamental challenge to hitherto current notions of “language” as a full means of making meaning’ (Kress, 2002: 6).”
104
Scollon 2006
“any use of language is inescapably multimodal. That is, spoken or written language inherently cooccurs in grammatical interactions among other semiotic modes such as gesture, image, color, texture, shape, or spatial layout and configuration”
105
EDUs vs. sentences: degree of variability
EDUs: distribution in terms of number of words
Sentences: distribution in terms of number of EDUs
[Two histograms; axis tick values omitted]
53% – 3±1
80% – 3±2
106
Gestures enhance understanding
Cutica and Bucciarelli 2006; Cassell et al. 1998
107
Alternative theories of gestures’ functions
Alibali, Kita and Young 2000:
Lexical retrieval hypothesis
Information packaging hypothesis
108
Combining the verbal channel with one additional channel does not increase the percentage of correct answers
Group  Experimental material    Information channels       Number of channels  Mean % of correct answers
1      Original                 verbal, prosodic, visual   3                   87.4%
2      Sound                    verbal, prosodic           2                   70.4%
3      Subtitles + video        verbal, visual             2                   73.9%
4      Prosody + video          prosodic, visual           2                   51.2%
5      Subtitles                verbal                     1                   72.0%
6      Prosody                  prosodic                   1                   51.1%
7      Video                    visual                     1                   61.7%
8      Nothing (context only)   [none]                     0                   38.3%
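As a rough check of this slide's claim, the sketch below recomputes the differences between groups from the reported means. The dictionary layout and the helper name `gain_from_adding` are illustrative assumptions, not part of the original study. The slide gives no sample sizes, so the code compares raw means only and says nothing about statistical significance.

```python
# Mean percentage of correct answers per group, copied from the slide.
# Each key is the set of information channels available to that group.
results = {
    frozenset({"verbal", "prosodic", "visual"}): 87.4,  # group 1: original
    frozenset({"verbal", "prosodic"}): 70.4,            # group 2: sound
    frozenset({"verbal", "visual"}): 73.9,              # group 3: subtitles + video
    frozenset({"prosodic", "visual"}): 51.2,            # group 4: prosody + video
    frozenset({"verbal"}): 72.0,                        # group 5: subtitles
    frozenset({"prosodic"}): 51.1,                      # group 6: prosody
    frozenset({"visual"}): 61.7,                        # group 7: video
    frozenset(): 38.3,                                  # group 8: context only
}


def gain_from_adding(base, extra):
    """Change in mean % correct when channel `extra` is added to the set `base`.

    Hypothetical helper for illustration; it simply subtracts two reported means.
    """
    base = frozenset(base)
    return round(results[base | {extra}] - results[base], 1)


# Verbal channel alone versus verbal plus one additional channel.
verbal_plus_prosodic = gain_from_adding({"verbal"}, "prosodic")
verbal_plus_visual = gain_from_adding({"verbal"}, "visual")

# All three channels versus the verbal channel alone.
all_channels = frozenset({"verbal", "prosodic", "visual"})
all_three_vs_verbal = round(results[all_channels] - results[frozenset({"verbal"})], 1)

print("verbal -> verbal + prosodic:", verbal_plus_prosodic)
print("verbal -> verbal + visual:  ", verbal_plus_visual)
print("verbal -> all three:        ", all_three_vs_verbal)
```

On these figures, adding prosody to the verbal channel shifts the mean by -1.6 points and adding video by +1.9 points, whereas all three channels together score 15.4 points above the verbal channel alone.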
109
Use of zero expressions under RD > 1
49 usages (12% of all zeroes)
Pragmatic and semantic clues that help to identify the referent of a zero expression:
certain predicates associated with a particular referent (RIDE-BICYCLE; HOLD-BICYCLE)
The process of role-shifting (Padden 1986): by shifting (rotating) the body and changing his/her facial expression the signer shows that s/he is currently “acting” for one of the referents
110
Role-shifting
1. Øboy LOOKdown
2. Øboy BE-ABOUT ONE PEAR ONE TAKE-ROUND
3. Øboy LOOKup
role-shifting
4. DEMup man PICK-ROUND
role-shifting
5. Øboy LOOKdown
6. Øboy TAKE-ROUND
1. He (the boy) looks down.
2. He is about to take one pear.
3. He looks up.
role-shifting
4. That one (the man) is picking pears.
role-shifting
5. He (the boy) looks down.
6. He takes one.
111
Full NPs vs. nominal demonstratives
In case of intermediate referent activation, full NPs and demonstratives compete
In case of low activation (RD=3+) full NPs strongly prevail (57%)
Apparently, information on the location of a referent in the constructed space can be assumed available to the addressee only for a limited time
112
Full NPs vs demonstratives
1. Øboy CYCLE
2. Øboy OBJECT-MOVEsignerforward
3. Øboy GO-AWAYsignerleft-forward
4. DEMup MAN STILL PICK-PEAR
5. CYCLE DEMboy front
6. Øboy OBJECT-MOVEsignerforward
1. He (the boy) is cycling.
2. He is riding forward.
3. He goes away.
4. That man is still picking pears.
5. This one is cycling.
6. He is riding forward.
113
The multimodal flight finder speeds up task completion by letting the user interact through multiple interaction modalities
114
Multimodal Analysis Lab (Singapore): collaboration of social scientists and computer scientists
115
Multimodality in computational linguistics
Gibbon D, Mertins I & Moore R (eds.) Handbook of multimodal and spoken dialogue systems: resources, terminology and product evaluation. Dordrecht: Kluwer. 2000
116
In related disciplines
Assumption typically held by other cognitive scientists, for example psychologists: language consists of words, sentences, and other verbal units
“With no more than 50 to 100 K words humans can create and understand an infinite number of sentences” (Bernstein et al. 1994: 349-350)
When cognitive scientists work with “language”, they almost invariably think that language is a set of individual words or, at most, sentences