MULTIMODAL LINGUISTICS

Seminar

A. A. Kibrik (Institute of Linguistics, Russian Academy of Sciences)

[email protected]

Dulzon Readings, Tomsk, 2011

2

The mainstream linguistic approach

Language consists of hierarchically organized segmental units, such as phonemes, morphemes, words, phrases, and sentences

Linguistic form is thus equated with verbal form

Search for “linguistic form” in Google. The first result is: “A meaningful unit of language, such as an affix, a word, a phrase, or a sentence.” (TheFreeDictionary.com)

“Taken together, linguistic signs form a sign system of a special kind: language. <…> The most typical linguistic sign is the word <…> The form of expression of any verbal sign consists of phonemes” (Lingvističeskij ènciklopedičeskij slovar’ [Linguistic Encyclopedic Dictionary], p. 167)

3

However

Apart from sound, there are other channels of communication, in the first place vision (body language: gesture, facial expression, gaze, etc.)

Sound itself has prosodic, that is, non-verbal aspects

Imagine prosody-free talk or, vice versa, talk heard from behind a wall

4

Multimodality

In order to understand language and communication, all aspects of linguistic form must be taken into account

This is what is sometimes called the multimodal approach

Modality, or mode, refers to a distinct type of input

In particular, a modality is a kind of stimulus associated with one of the human senses, particularly hearing and sight

So the verbal component, prosody, and body language all count as modes or modalities

“Any use of language is inescapably multimodal” (Scollon 2006)

5

Goals of this talk

Emphasize the importance of prosody and visual aspects of communication in linguistic research

Show how prosody and visual communication interact with the verbal component, thus suggesting not only the multimodal, but also the cross-modal approach

Propose that linguistics cannot progress without taking multimodality seriously into account

6

Are these goals relevant and important?

After all, linguists and other scholars have already been pursuing these issues for many decades, and the respective research traditions are quite rich

But:

First, prosody and visual communication are marginalized in linguistics: they are located in certain “pockets” of the overall linguistic panorama and are tolerated by the mainstream as “paralinguistics”

Second, those focusing on these information channels often treat them as a “thing in itself”, without integration with the verbal component

7

Plan of talk

I. Prosody

II. Gestures

III. Relative contribution of three information channels

IV. Signed languages

V. Wider context

8

I. PROSODY

Prosodic components:
• pausing
• accents
• pitch
• tempo (of various scope)
• registers
• degrees of reduction
• glottal features
• loudness
• ...

“The growth of interest in prosody is connected <…> with new semantic tasks (description of non-propositional semantics <...>)” (Kodzasov 1996: 85)

Prosody is responsible for discourse segmentation into Elementary Discourse Units (EDUs), identified on the basis of several prosodic components and strongly correlated with clauses

9

An example of prosodically oriented discourse transcription

....(1.5) /\Озеро ...(0.5) какое-то,
    Lake some
..(0.3) (Или /\речка,
    Either river
или /\озеро,
    or lake
но по-моему \озеро,
    but I guess lake
потому что’ ..(0.2) как-то-оw
    because somehow
...(0.6) \маленькое такое,
    small such
\небольшое.)
    minor
....(1.0) ’и-иh ...(0.7) через /него
    and across it
..(0.3) как-то \бревно какое-то,
    somehow log some
типа \моста.
    like bridge

Transliteration:

....(1.5) /\Ozero ...(0.5) kakoe-to,
..(0.3) (Ili /\rečka,
ili /\ozero,
no po-moemu \ozero,
potomu čto’ ..(0.2) kak-to-oW
...(0.6) \malen’koe takoe,
\nebol’šoe.)
....(1.0) ’i-iH ...(0.7) čerez /nego
..(0.3) kak-to \brevno kakoe-to,
tipa \mosta.
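The pause notation in this transcript (a run of dots followed by a duration in seconds in parentheses) is regular enough to be read mechanically. The following is a minimal sketch, not part of the original study; the function names and the regular expression are assumptions made for illustration, and it only covers pauses written as two to four dots plus a parenthesized number.

```python
import re

# Assumed pattern: two to four dots, then a duration in seconds in parentheses,
# e.g. "..(0.3)" or "....(1.5)". Other marks in the transcript are ignored.
PAUSE_RE = re.compile(r"\.{2,4}\((\d+(?:\.\d+)?)\)")


def pause_durations(line):
    """Return the pause durations (in seconds) marked in one transcript line."""
    return [float(value) for value in PAUSE_RE.findall(line)]


def total_pause(lines):
    """Sum all marked pauses over a sequence of transcript lines."""
    return sum(d for line in lines for d in pause_durations(line))


# First three lines of the transliterated example.
example = [
    r"....(1.5) /\Ozero ...(0.5) kakoe-to,",
    r"..(0.3) (Ili /\rečka,",
    r"ili /\ozero,",
]

for line in example:
    print(pause_durations(line), line)
```

Filled pauses written without dots, such as `mm(0.4)` elsewhere in the corpus, are deliberately not matched by this sketch.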

10

Night Dream Stories

Corpus of spoken Russian stories

Speakers: children and adolescents

Subject matter: retelling of night dreams

Discourse type: monologic narrative (personal stories)

Joint study with Vera Podlesskaya and a group of our graduate students

Kibrik and Podlesskaya eds. 2009

11

Segmentation (EDUs)

[Slide repeats the “lake” transcription example.]

12

Pauses

[Slide repeats the “lake” transcription example.]

13

Pitch accents

[Slide repeats the “lake” transcription example.]

14

Tempo: wide and narrow scope

[Slide repeats the “lake” transcription example.]

15

Other prosodic phenomena

[Slide repeats the “lake” transcription example.]

16

Prosody and sentence

Does spoken language consist of sentences?

Sheer facts:
• Spoken language is the primary form of language
• Spoken language does not contain periods, question marks and other explicit signals of sentence boundaries

Research question:
• Is the sentence, as a theoretical construct, as identifiable and as basic for the primary form of language as it is (or as it is thought to be) for written language?

17

Sentence in spoken language

Position 1: the sentence is a universal and basic unit of language
• Assumption typically held not only by linguists but also by other cognitive scientists
• But the sentence is very far from being obvious in spoken language

Position 2: avoidance of the issue, typical of discourse-oriented linguists
• If so, how could sentences become so entrenched in written language?

18

Phase (фаза)

Term by Sandro V. Kodzasov

Alternative term by J. DuBois et al. 1992: transitional continuity

Discourse semantic category: ‘end’ vs. ‘non-end’ (= expectation of a forthcoming end)

Hierarchical nature of phase

End of tentative sentence – falling tonal accent

Non-end – rising tonal accent

19

A canonical example of the transitional continuity distinction (z57:15-16)

..(0.4) /\Мы-ы’ ..(0.4) \как бы за них /взя-ались,
..(0.4) /\My-y’ ..(0.4) \kak by za nix /vzja-alis’,
    We sort of at them got.hold
Rising (“comma”): non-end

...(0.5) и-и ввь= || ..(0.2) полетели \вве-ерх.
...(0.5) i-i vv’= || ..(0.2) poleteli \vve-erx.
    and flew upward
Falling (“period”): end

If things were that easy, sentence would be uncontroversial

20

Non-canonical situation: Non-end with a falling tonal accent

[Slide repeats the “lake” transcription example. Several of its non-final EDUs, such as «но по-моему \озеро,» (no po-moemu \ozero, ‘but I guess lake’), carry a falling accent.]

21

The problem of two kinds of falling

The existence of non-final falling calls the relevance of the sentence into question

However, the distinction between the two kinds of falling is very systematic

The two kinds of falling:
• are prosodically distinct
• have distinct discourse functions

22

Prosodic criteria of the final vs. non-final falling distinction

1. Target frequency band
2. Post-accent behavior
3. Pausing pattern
4. Reset vs. latching
5. Steepness of falling
6. Interval of falling

23

Target frequency band

Final falling (“period”): targets the bottom of the speaker’s F0 range

Non-final falling (“falling comma”): targets a level several dozen Hz (several semitones) higher
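To relate “several dozen Hz” to “several semitones”, a frequency ratio can be converted with the standard formula of 12 semitones per octave (a doubling of frequency). The sketch below applies it to the 210 Hz and 170 Hz targets that the talk reports for one non-final and one final falling; the function name is an assumption made for illustration.

```python
import math


def interval_in_semitones(f_high_hz, f_low_hz):
    """Musical interval between two frequencies: 12 semitones per octave."""
    return 12 * math.log2(f_high_hz / f_low_hz)


# Targets reported in the talk for one speaker:
# non-final falling at 210 Hz, final falling at 170 Hz.
print(round(interval_in_semitones(210, 170), 2))
```

For this pair, the 40 Hz difference comes out between three and four semitones, which fits the “several semitones” description.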

24

F0 graph for the “lake” example

[Figure: F0 contour of the “lake” example, with the accented words labeled: \ozero, \malen’koe takoe, \nebol’šoe. \brevno kakoe-to, \mosta.]

25

Non-final falling (210 Hz), final falling (170 Hz), rising, post-rising falling (Z54: 4-5)

..(0.4) А /тогда уже д= || ..(0.2) закрывались \двери,
..(0.4) A /togda uže d= || ..(0.2) zakryvalis’ \dveri,
    And then already d= were.closing doors

..(0.1) и /’Аня не –успела \сесть.
..(0.1) i /Anja ne –uspela \sest’.
    and Anja not managed get.in

...(0.7) Иw мм(0.4) /\когда-а ..(0.2) ’’(0.3) ..(0.4) {ЧМОКАНЬЕ 0.2} ..(0.4) когда я приехала на нашу /остановку’,
...(0.7) IW mm(0.4) /\kogda-a ..(0.2) ’’(0.3) ..(0.4) {SMACKING 0.2} ..(0.4) kogda ja priexala na našu /ostanovku’,
    And when when I arrived to our station

Labels in figure: 210 Hz, 170 Hz

26

Post-accent behavior

Final falling (“period”): steady falling on the post-accent syllables

Non-final falling (“comma”): lack of falling on post-accent syllables, often rise of tone (V-curve)

27

V-curve z26

....(5.7) /Домик ...(0.6) был /около \реч↑ки,
....(5.7) /Domik ...(0.6) byl /okolo \reč↑ki,
    Little.house was near creek

....(3.3) /рядом были \–родник-ки,
....(3.3) /rjadom byli \–rodnik-ki,
    nearby were springs

..(0.4) и \–ле-ес.
..(0.4) i \–le-es.
    and forest

Labels in figure: 260 Hz, 235 Hz, 240 Hz

28

The final vs. non-final falling distinction

A speaker’s prosodic pattern must be identified

On its basis, the distinction between final and non-final falling can be drawn with a high degree of robustness
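The talk does not spell out the identification procedure. As a rough illustration of only one of the six criteria (target frequency band), one could estimate the bottom of a speaker’s F0 range and test how far above it a falling accent lands. Everything in this sketch is an assumption made for illustration, including the 5% floor estimate, the 2-semitone tolerance, and the function names; none of it comes from the study itself.

```python
import math


def speaker_floor(f0_samples_hz, fraction=0.05):
    """Crude estimate of the bottom of a speaker's F0 range (assumed: low quantile)."""
    ordered = sorted(f0_samples_hz)
    index = int(fraction * (len(ordered) - 1))
    return ordered[index]


def semitones_above(f0_hz, floor_hz):
    """Distance of a pitch target above the floor, in semitones."""
    return 12 * math.log2(f0_hz / floor_hz)


def classify_falling(target_hz, floor_hz, tolerance_st=2.0):
    """Label a falling accent by its target alone (hypothetical threshold)."""
    if semitones_above(target_hz, floor_hz) <= tolerance_st:
        return "final"
    return "non-final"


# Illustrative F0 targets in Hz, reusing values shown on the slides.
samples = [160, 165, 170, 190, 210, 240, 260]
floor = speaker_floor(samples)
for target in (170, 210):
    print(target, classify_falling(target, floor))
```

A usable procedure would need to combine this with the other criteria (post-accent behavior, pausing, reset vs. latching, steepness, interval) and be calibrated per speaker.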

29

Contexts of non-final falling

• Anticipatory mirror-image adaptation
• Inset
• Stepwise falling

30

Anticipatory mirror-image adaptation

....(1.8) Когда я \услышала,
Kogda ja \uslyšala,
    when I heard

...(0.5) что-о /бомба гремит,
čto-o /bomba gremit,
    that bomb growls

31

Inset

/Входит это ...(0.5) /\ма-аль↑чик,
/Vxodit èto ...(0.5) /\ma-al’↑čik,
    enters here boy

’ ’ ..(0.1) /\ну к \другому,
’ ’ ..(0.1) /\nu k \drugomu,
    well to another

..(0.1) и \говорит:
..(0.1) i \govorit:
    and says

32

Stepwise falling

....(1.5) /\Озеро ...(0.5) какое-то,
..(0.3) (Или /\речка,
или /\озеро,
но по-моему \озеро,
потому что’ ..(0.2) как-то-оw
...(0.6) \маленькое такое,
\небольшое.)

....(1.5) /\Ozero ...(0.5) kakoe-to,
    Lake some
..(0.3) (Ili /\rečka,
    Either river
ili /\ozero,
    or lake
no po-moemu \ozero,
    but I guess lake
potomu čto’ ..(0.2) kak-to-oW
    because somehow
...(0.6) \malen’koe takoe,
    small such
\nebol’šoe.)
    minor

Labels in figure: 210 Hz, 190 Hz, 160 Hz

33

Representation of EDU continuity types in corpus

[Bar chart: number of EDUs per continuity type]

Final falling: 894

Non-final falling: 606

(Non-final) rising: 1188
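To see these counts as shares of the corpus, they can be normalized by their sum. The sketch below assumes that the three values on the chart (894, 606, 1188) correspond to the three labels in the order in which they are listed; the variable names are illustrative.

```python
# EDU counts per continuity type, as read off the chart
# (assumed mapping: values and labels in the same order).
counts = {
    "final falling": 894,
    "non-final falling": 606,
    "(non-final) rising": 1188,
}

total = sum(counts.values())
shares = {label: 100 * n / total for label, n in counts.items()}

for label, share in shares.items():
    print(f"{label}: {share:.1f}% of {total} EDUs")
```

Under that mapping, non-final falling accounts for somewhat more than a fifth of all EDUs, so it is far from a marginal pattern.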

34

The status of sentence

In the speech of most speakers final falling is clearly distinct from non-final patterns

Final intonation, expressly distinct from non-final intonation (both rising and falling), makes the notion of sentence valid for spoken discourse

Speakers “know” when they complete a sentence and when they do not

Apparently, spoken sentences are the prototype of written sentences

35

However

Identification of sentences is possible only on the basis of a complex analytic procedure

It is dependent on prior understanding of a speaker’s prosodic “portrait”

There are prototypes of final and non-final fallings, but there are intermediate instances, therefore sentencehood may be a matter of degree

Unlike EDUs, sentences are highly variable
• Speakers with short sentences
• Speakers with long sentences equaling stories (clause chaining)

A significant tune-up is necessary to apply the procedure to a different discourse type or a different language

36

Conclusions on prosody and sentence

Sentence is an intermediate hierarchical grouping between an EDU (roughly, clause) and whole discourse

Sentence is an elusive, complex, non-elementary unit of spoken language

These conclusions, possible only due to prosodic analysis, are of prime importance for linguistic theory

The notion of sentence, so salient in theories restricted to the verbal component alone, can only be evaluated relying on prosodic evidence

37

Other languages?

Upper Kuskokwim Athabaskan

Bobby Esai, Sr.

38

Excerpt from a story

a. (1.6) hwndine ŒiÈ chu
    suddenly with Ptcl
b. (2.2) sighwdlaŒ todoltsitÈ' ts'eŒ
    my.sled it.broke.through.ice and
c. (5.5) sileka ch'ildon' nich'i toghedak Œedinh
    my.dogs some too they.fell.in.water though
d. (0.9) ch'ildon' chuŒda
    some though
e. (0.2) tinh k'its' ==
    ice on
f. (0.9) tinh k'its' Œohighet'a ts'eŒ
    ice on they.are.there and

‘Suddenly, my sled broke through the ice, and some of my dogs also fell into the water, while others remained on top of the ice, and <…>’

39

Tonal contours and EDUs

[Figure: tonal contours of EDUs a, b, c, d, e, f from the excerpt]

40

II. GESTURE

In the course of communication, it is not just that the speaker speaks and the addressee listens

In addition, the speaker displays, and the addressee observes:
• Gesture
• Gaze
• Facial expression
• Posture
• Proxemics
• Cultural symbolism
• ...

(see, for example, Kreidlin 2002, Butovskaya 2004)

41

Gestures

Gestures are kinetic behaviors of arms and other limbs, capable of conveying meaning from speaker to addressee.

Among the various types of gestures (see e.g. McNeill 1992) pointing gestures are one of the most salient types.

42

Pointing

“Smell these!”

43

Elements of a canonical pointing act

44

Phylogeny and ontogeny

Pointing gestures:

Appear to be an exclusive property of humans (Tomasello et al. 2007)

Are a very ancient gesture type (Kreidlin 2007)

Appear at the end of the first year of life

Can participate in binary multimodal constructions “word + gesture”, such as open POINT (Butcher and Goldin-Meadow 2000)

45

Reference and pointing

Reference is a fundamental linguistic phenomenon, accounting for about every third word in running discourse

Studies of reference (deixis, anaphora, etc.) are among the central concerns of modern linguistics

Pointing is the developmental source of reference

46

Pointing, deixis, and exophora

Deixis is the most widely recognized function of pointing

However, quite frequently pointing is associated with exophora, that is mention of perceptually activated referents (O'Neill 1996, Levy 2000: 219, Nikolaeva 2003)

Exophora is the ontological source of anaphora

47

Exophoric and anaphoric reference (from Nikolaeva 2003)

a. My s Anatoliem uže mnogo let očen’ rabotaem,

<three intervening clauses>

e. on mnogo raz zavjazyval,

‘Anatolij and I have been working together for many years, <…> he was winding it up (drinking) many times’

48

Pointing and prosody

Pointing and accentuation are analogous phenomena, both associated with making an item salient

Nikolaeva (p.c.): pointing typically cooccurs with accent

Levy (2000): energy expenditure

49

Substitution: Referent vs. demonstratum

Reference to non-specific items:

Vot počemu my i obraščaemsja poroj k psixologam.

‘This is why we address psychologists now and then’

This phenomenon is known as deferred ostension, analogic deixis, ostensive metonymy, etc.

In substitution, reference does not have to be non-specific

He got a big scar here (pointing to one’s cheek) (Levelt 1989)

50

Virtual pointing

Pointing to imaginary targets

cf. Buehler’s Deixis am Phantasma, McNeill’s abstract pointing

51

Frequency in two discourse types

Nikolaeva 2003 (TV shows):
• 5.4 pointing gestures per 100 EDUs
• of these, 2.7 are virtual pointing

Nikolaeva p.c. (retelling of a film):
• 4.2 pointing gestures per 100 EDUs
• all are virtual pointing

Virtual pointing in exophora/anaphora is as frequent as in deixis

52

a. … əə Kogda on exal po= po doroge,

b. on əə mm … poravnjalsja s devočkoj,

‘As he rode along the road, he passed a girl <...>’

Illustrative gesture

53

d. on zasmotrelsja na neë,

‘he gaped at her’

Pointing gestures

54

Spatial representation of referents

By illustrative gestures in the previous example

By verbal devices:
a. i naprotiv menja sideli dve devočki-mulatki,
<21 intervening clauses>
y. vot èti dve devočki i ja,
‘And across from me sat two brown-skinned girls, <…> these two girls and I <...>’

For the referential system, it makes no difference what is used to convey spatial relations

Verbal and gestural material is jointly used to convey the inner cognitive representation from the speaker to the addressee

55

Conclusions on gestures and reference

The pointing gesture is the developmental source of reference

The use of pointing is intimately connected to reference

Reference is performed with the help of both verbal devices and illustrative gestures

Reference, a central linguistic phenomenon, cannot be understood if we fail to take gesture into account

56

III. Relative contribution of three information channels

Discourse
• Vocal channels
    • Verbal channel
    • Prosodic channel
• Visual channel

57

What is the contribution of different channels?

Traditional approach of mainstream linguistics: the verbal channel is so central that prosody and the visual channel are at best downgraded as “paralinguistics”

Applied psychology: “Since body language conveys more than half of any message in any face-to-face encounter, how you act is vital” (business advising)

http://www.sideroad.com/Business_Etiquette/business-body-language.html

It is often stated that (figures go back to Mehrabian 1971):
• body language conveys 55% of information
• prosody conveys 38% of information
• the verbal component conveys 7% of information

“Words may be what men use when all else fails” (Kreidlin 2002: 6)

Who is right?

58

Experimental study

Isolate three information channels

Present a sample discourse in all possible variants (2^3 = 8)

Present each of the eight variants to a group of subjects

Assess the degree of understanding in each case

Kibrik and El’bert 2008
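The eight variants follow from switching each of the three channels on or off independently. A minimal sketch of that enumeration is given below; the channel names are those used in the talk, while the function name is an assumption made for illustration.

```python
from itertools import product

CHANNELS = ("verbal", "prosodic", "visual")


def all_variants(channels=CHANNELS):
    """Enumerate every on/off combination of the channels (2**n variants)."""
    variants = []
    for mask in product((True, False), repeat=len(channels)):
        variants.append(tuple(c for c, on in zip(channels, mask) if on))
    return variants


variants = all_variants()
for number, variant in enumerate(variants, start=1):
    print(number, variant if variant else "(context only)")
```

The empty combination corresponds to the control group that saw only the preceding context. The numbering printed here is just the enumeration order and is not meant to match the group numbers used in the experiment.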

59

Experimental material

Russian TV serial “Tajny sledstvija” – “Mysteries of the investigation”

Experimental excerpt: 3 min. 20 sec.
• Preceded by an 8-minute context (starting from the beginning of the series)
• The excerpt consists entirely of a conversation, to ensure that we are testing the understanding of discourse rather than of the film in general

Two vocal channels have been separated:
• verbal alone: running subtitles
• prosodic alone: superimposed filter creating the “behind a wall” effect

Subjects:
• 99 participants, divided into 8 groups
• Native speakers of Russian
• Each group comprised 10 to 17 subjects

60

Full version

61

Visual + verbal channels

62

Visual + prosodic channels

63

Procedure

Every subject was instructed to watch the context and the experimental excerpt and then answer a set of questions concerned with the experimental excerpt alone

Questionnaire was constructed in accordance with the received principles of test tasks (Panchenko 2000)

23 questions in the questionnaire

A subject was supposed to choose only one answer out of four listed variants

Example: What does Tamara Stepanovna offer Masha before the beginning of the conversation?
a. to take off her coat
b. to have a cup of tea
c. to have a seat
d. to have a drink

Percentage of correct answers is used as an assessment of a subject’s degree of understanding

64

Results

Group   Experimental material     Information channels        Number of channels   Mean % of correct answers
1       Original                  verbal, prosodic, visual    3                    87.4%
2       Sound                     verbal, prosodic            2                    70.4%
3       Subtitles + video         verbal, visual              2                    73.9%
4       Prosody + video           prosodic, visual            2                    51.2%
5       Subtitles                 verbal                      1                    72.0%
6       Prosody                   prosodic                    1                    51.1%
7       Video                     visual                      1                    61.7%
8       Nothing (context only)    [none]                      0                    38.3%

65

Each of the three information channels, taken in isolation, is quite informative


66

The hierarchy of informativeness: verbal > visual > prosodic


67

The combination ‘prosodic plus visual’ (group 4) yields a significantly lower result than the other two-channel combinations (groups 2 and 3).


68

Relative contribution of the three channels

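For the sake of simplicity, assume that all three channels are independent

Normalize the single-channel scores (groups 5, 6, 7) by their sum: 72 + 51 + 62 = 185, so each score is divided by 1.85

Results:
• Verbal channel: 39% (72 : 1.85 ≈ 39)
• Prosodic channel: 28% (51.1 : 1.85 ≈ 28)
• Visual channel: 33% (61.7 : 1.85 ≈ 33)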
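The arithmetic behind these percentages can be reproduced in a few lines. The sketch below takes the single-channel scores from the results table and divides each by their sum; it relies on the same simplifying independence assumption as the slide, and the names are illustrative.

```python
# Mean % of correct answers for the single-channel groups (5, 6, 7).
single_channel_scores = {"verbal": 72.0, "prosodic": 51.1, "visual": 61.7}


def relative_contributions(scores):
    """Express each score as a percentage of the sum of all scores."""
    total = sum(scores.values())
    return {channel: 100 * score / total for channel, score in scores.items()}


contributions = relative_contributions(single_channel_scores)
for channel, share in contributions.items():
    print(f"{channel}: {share:.0f}%")
```

Using the unrounded scores gives the same rounded shares as the slide, which works with the rounded sum of 185.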

69

Conclusions about the relative weight of three information channels

All information channels are highly significant
• the traditional linguistic viewpoint is erroneous

The verbal channel is the leading one
• the viewpoint popular in applied psychology is erroneous

Information from the prosodic and the visual channels is primarily used through integration with the verbal channel, at least for this discourse type

70

IV. Signed languages

NATURAL LANGUAGES: SPOKEN and SIGNED

DEAF SIGN LANGUAGES
• natural, fully-fledged human languages
• visual-spatial languages
• use hands and arms, facial expressions, eye gaze, head and body posture to encode linguistic information
• manual signs are produced in a three-dimensional space immediately in front of the signer: the signing arena
• 121 sign languages (http://www.ethnologue.com): American Sign Language, Russian Sign Language, …

71

Reference in RSL

Prozorova 2006, Kibrik and Prozorova 2007

Goal: to characterize referential choice of a deaf sign language as contrasted to that of spoken languages

72

RSL data collection

‘The Pear Stories’ film (Chafe 1980)

Corpus of 10 video-recorded RSL narratives based on retellings of the Pear Film

Speakers:
• 6 men and 4 women
• age 15-55
• all based in Moscow

7 animate referents in the Pear Film

657 clauses

542 referential expressions (animate)

73

Deictic demonstrative reference in RSL

operates in the perceived space P

deictic expressions: pointing signs (pointing with an index finger towards the intended referent)

(2) DEMcat ILL ‘He is ill’

74

Major anaphoric options in RSL

• Full NPs (114)
• Zero expressions (401)
• Demonstratives (27)

75

Full NP

BOY YOUNG AGE CYCLE ‘A young boy is riding a bicycle’

76

Zero expressions

1. BOY YOUNG AGE CYCLE
2. Øboy STOP
3. Øboy HUMAN-STANDrightdown
4. Øboy LOOKrightdown P-E-A-R

1. A young boy is riding a bicycle.
2. He stops.
3. He stands upright.
4. He sees the pears.

77

Anaphoric zero reference

Interlocutors’ shared cognitive representation contains not only perceived referents, but also referents conceived of (remembered or imagined)

We call this representation the conceived space C

Mentioning referents that are present, or activated, in the conceived space is what is known as anaphora

Anaphoric referential choice depends on a referent’s activation in the conceived space:
• High activation → zero
• Low activation → full NP

78

Two discourse factors and anaphoric referential devices

factor 1:    RD=1          RD=1         RD=2         RD=3+        TOTAL
factor 2:    Ant=S         Ant=O
full NP      <1 %          33 %         14 %         57 %         59
zero NP      99 %          42 %         67 %         27 %         401
DEM          <1 %          25 %         19 %         16 %         27
TOTAL        346 (100%)    24 (100%)    43 (100%)    74 (100%)    487

79

Demonstrative

1. Øboy CYCLE
2. Øboy GOsignerforward AWAYsignerforward
3. DEMmanright SEE NEG
4. Øman PICK-ROUND

1. He cycles.
2. He goes away.
3. That one doesn’t see.
4. He picks pears.

80

Anaphoric demonstrative reference

In signed discourse the signer maps referents from the inner conceived space C onto the external signing arena

Mapping includes various parameters of referents:
• locations
• orientations
• physical interactions
• even abstract relations between them

Thus a constructed space C’ is created, inhabited by referents conceived of

81

How are locations of referents established in the constructed space?

Signed discourse takes place in the three-dimensional signing arena

The topology of the signing arena isomorphically represents the topology of the scenes, remembered by signers from the film

The signer establishes the locations of referents in his signing arena

These locations are isomorphic to the locations of the referents in the film, as remembered by the signer

82

An episode from the Pear Film

83

A retelling

1. ONE-MOVEfrontsigner MANi
2. ONE-MOVEfrontsigner SHE-GOAT
3. BOY GIRL UNCLEAR
4. SHE-GOAT
5. Øgoat TWO-HORN HAVE.NEG
6. DEMifront PULL

1. A man is coming,
2. with a she-goat.
3. Male, female – it is unclear.
4. It’s a she-goat:
5. It has no horns.
6. This one is pulling it.

84

Anaphoric demonstratives

Once the signer has explicitly indicated the location/path of a referent, demonstratives may be used for further mentions of this referent

Thus demonstratives are the basic device used for repeated mention of referents in the constructed space

Formally they are the same as deictic demonstratives

Demonstratives are based on the mechanism of virtual pointing, but it is conventionalized in RSL

What is a kind of an ad hoc, fluid device in spoken languages, is an established, nearly lexical device in RSL

85

Referential function of demonstratives

Demonstratives are not particularly sensitive to activation factors:

factor 1:      RD=1      RD=1      RD=2    RD=3+   TOTAL
factor 2:      Ant=S     Ant=O
nominal DEM    <1 %      25 %      19 %    16 %    27

86

Conclusions on reference in RSL

Types of referential devices and factors of reference are analogous to those of spoken languages

Some devices, only embryonically present in spoken languages, are strongly entrenched in RSL: virtual pointing

This is apparently due to the fundamentally spatio-visual character of RSL

Studying signed languages gives us a new perspective on spoken languages

Recognition of two fundamental types of languages, spoken and signed, appears indispensable for a general theory of language

87

V. A wider picture

The world surrounding us is multimodal

We are multimodal animals

Obviously, language and communication are multimodal

As often happens, those specializing in applied fields have understood the importance of multimodality before pure scholars and theorists

88

Multimodality in technology

TV is superior to radio

Multimodal communication devices

Internet, especially Web 2.0, is all multimodal

89

Stages of multimodal integration, from Cohen and Oviatt 2006

90

Multimodality in biological sciences

“Within biology, experimental psychology, and cognitive neuroscience, a separate rapidly growing literature has clarified that multisensory perception and integration cannot be predicted by studying the senses in isolation.”

(Cohen and Oviatt 2006)

91

Multimodality in communication studies and semiotics

Kress G & van Leeuwen T (2001). Multimodal discourse: the modes and media of contemporary communication. London: Arnold.

‘‘A multimodal approach assumes that the message is ‘spread across’ all the modes of communication. If this is so, then each mode is a partial bearer of the overall meaning of the message. All modes, speech and writing included, are then seen as always partial bearers of meaning only. This is a fundamental challenge to hitherto current notions of ‘language’ as a full means of making meaning’’ (Kress, 2002: 6).

92

Multimodal corpora

LREC-2008 (Language Resources and Evaluation Conference)
• Blache P., Bertrand R., Ferré G. 2008. Creating and exploiting multimodal annotated corpora.
• Gallo C.G., Jaeger T.F., Allen J., Swift M. 2008. Production in a multimodal corpus: How speakers communicate complex actions.
• Kitazawa Sh., Kiriyama Sh., Kasami T., Ishikawa Sh., Otani N., Horiuchi H., Takebayashii Y. 2008. A multimodal infant behavior annotation for developmental analysis of demonstrative expressions.

93

Synthesis

LeVine P & Scollon R (eds.) Discourse and technology: multimodal discourse analysis. Washington, DC: Georgetown University Press. 2004

94

Conclusions

“Normal” linguists, researching conventional verbal material, need to understand that further progress in linguistics is impossible if one ignores the multimodality of language

Language in the understanding of the 20th century mainstream linguistics is an abstraction, very remote from reality. We live in the multimodal world, this is where language evolved and where it functions, and this is what we need to realize if we want to understand it

Taking the multimodal perspective into account can help to adequately approach classical questions of narrow linguistics

95

Acknowledgements

Julia Nikolaeva
Vera Podlesskaya
Evgenia Prozorova
Ekaterina El’bert

96

Alm 2006, “Augmentative and Alternative Communication”

“Unimpaired communication is, of course, inherently multimodal, with the speech content being modified by prosody and delivered in parallel with facial expression, gesture, posture, and a range of other nonverbal communication methods.”

97

Schrøder 2006

Kress G & van Leeuwen T (2001). Multimodal discourse: the modes and media of contemporary communication. London: Arnold/Hodder Headline Group.

NB: this is multimodal social semiotic theory

“The overall theoretical framework of Kress and van Leeuwen’s visual discourse semiotics is strongly akin to Fairclough’s three-dimensional model, whereas the analytical practice is inspired eclectically by theoretical and analytical work in linguistics, visual semiotics, film theory, art criticism, as well as numerous predecessors in the various fields of media research, especially the analysis of advertising (Cook, 1992; Myers, 1994; Williamson, 1978).”

98

Norris S (2004). Analyzing multimodal interaction: A methodological framework. London: Routledge.

99

Multimodal microplanning ELL, P. 168

100

ELL, 514 – multimodal technology

101

Cohen and Oviatt 2006

On technology

“before building high-performance multimodal systems, it is crucial that the architecture be based on an understanding of how humans communicate multimodally in different contexts.”

“future multimodal systems that can detect and adapt to a user’s dominant integration pattern potentially could yield substantial improvements in system robustness and overall performance”

“systems that allow users to distribute their content across modalities will face simpler recognition and understanding problems and thus are likely to be more robust”

“Within biology, experimental psychology, and cognitive neuroscience, a separate rapidly growing literature has clarified that multisensory perception and integration cannot be predicted by studying the senses in isolation.”

102

McKay 2006

“Studying texts with images and sounds has presented challenges to conventional discourse analysis, which has valued modes of language through speech and/or writing over visual images or music. The mass media produce multimodal texts, that is, texts that draw from language, pictures, or other graphic elements and sounds in various combinations. Considerations of the multimodal nature of media texts are difficult to incorporate in language-based media analysis. <...> In spite of the difficulties in trying capture such multimodality, concentrating on language and ignoring the other modes is to miss much of the potential for meaning of contemporary media texts.”

103

Busch 2006

“Media communication is inherently multimodal communication: this means that language in written and spoken form is one of several modes available for expressing a potential of meanings. For instance, in print media lay-out and image are available in addition to the written word; in radio, language is present in its spoken form, alongside music and different sounds; in television all the aforementioned modes can be drawn upon in a context in which the moving image holds a central position. Similarly, in computer-mediated communication, a wide range of modes is available. ‘A multimodal approach assumes that the message is ‘spread across’ all the modes of communication. If this is so, then each mode is a partial bearer of the overall meaning of the message. All modes, speech and writing included, are then seen as always partial bearers of meaning only. This is a fundamental challenge to hitherto current notions of ‘language’ as a full means of making meaning’ (Kress, 2002: 6).”

104

Scollon 2006

“any use of language is inescapably multimodal. That is, spoken or written language inherently cooccurs in grammatical interactions among other semiotic modes such as gesture, image, color, texture, shape, or spatial layout and configuration”

105

EDUs vs. sentences: degree of variability

EDUs: distribution in terms of number of words

Sentences: distribution in terms of number of EDUs

[Figure: two distribution charts]

53% – 3±1; 80% – 3±2

106

Gestures enhance understanding

Cutica and Bucciarelli 2006; Cassell et al. 1998

107

Alternative theories of gestures’ functions

Alibali, Kita and Young 2000:

Lexical retrieval hypothesis

Information packaging hypothesis

108

Combining the verbal channel with one additional channel does not increase the percentage of correct answers

Group number   Channels                               Number of channels   Correct answers
1              verbal, prosodic, visual (Original)    3                    87.4%
2              verbal, prosodic                       2                    70.4%
3              verbal, visual                         2                    73.9%
4              prosodic, visual (Prosody + video)     2                    51.2%
5              verbal (Subtitles)                     1                    72.0%
6              prosodic                               1                    51.1%
7              visual                                 1                    61.7%
8              [none]                                 0                    38.3%
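The claim in the slide title can be checked with simple arithmetic on the reported percentages. The following is only a minimal sketch: the assignment of channel combinations to groups is an assumption reconstructed from the slide layout, and the names `RESULTS` and `gain_over_verbal` are hypothetical, introduced here for illustration.

```python
# Percentages of correct answers per group, as read from the slide.
# ASSUMPTION: the mapping of channel combinations to groups 1-8 is
# reconstructed from the slide layout, not stated explicitly in the source.
RESULTS = {
    ("verbal", "prosodic", "visual"): 87.4,  # group 1
    ("verbal", "prosodic"): 70.4,            # group 2
    ("verbal", "visual"): 73.9,              # group 3
    ("prosodic", "visual"): 51.2,            # group 4
    ("verbal",): 72.0,                       # group 5
    ("prosodic",): 51.1,                     # group 6
    ("visual",): 61.7,                       # group 7
    (): 38.3,                                # group 8
}


def gain_over_verbal(results):
    """Difference (in percentage points) between each combination that
    includes the verbal channel and the verbal channel alone."""
    baseline = results[("verbal",)]
    return {
        channels: round(score - baseline, 1)
        for channels, score in results.items()
        if "verbal" in channels and len(channels) > 1
    }


if __name__ == "__main__":
    for channels, gain in gain_over_verbal(RESULTS).items():
        print(" + ".join(channels), f"{gain:+.1f}")
```

Under this reading, adding a single channel to the verbal one changes the score by only about two percentage points in either direction, whereas adding both channels raises it by roughly fifteen points. Whether the small differences are statistically meaningful cannot be judged from the slide alone.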

109

Use of zero expressions under RD > 1

49 usages (12% of all zeroes)

Pragmatic and semantic clues that help to identify the referent of a zero expression:

certain predicates associated with a particular referent (RIDE-BICYCLE; HOLD-BICYCLE)

The process of role-shifting (Padden 1986): by shifting (rotating) the body and changing his/her facial expression the signer shows that s/he is currently “acting” for one of the referents

110

Role-shifting

1. Øboy LOOKdown

2. Øboy BE-ABOUT ONE PEAR ONE TAKE-ROUND

3. Øboy LOOKup

role-shifting

4. DEMup man PICK-ROUND

role-shifting

5. Øboy LOOKdown

6. Øboy TAKE-ROUND

1. He (the boy) looks down.

2. He is about to take one pear.

3. He looks up.

role-shifting

4. That one (the man) is picking pears.

role-shifting

5. He (the boy) looks down.

6. He takes one.

111

Full NPs vs nominal demonstratives

In case of intermediate referent activation, full NPs and demonstratives compete

In case of low activation (RD=3+) full NPs strongly prevail (57%)

Apparently, information on the location of a referent in the constructed space can be assumed available to the addressee only for a limited time

112

Full NPs vs demonstratives

1. Øboy CYCLE

2. Øboy OBJECT-MOVEsignerforward

3. Øboy GO-AWAYsignerleft-forward

4. DEMup MAN STILL PICK-PEAR

5. CYCLE DEMboy front

6. Øboy OBJECT-MOVEsignerforward

1. He (the boy) is cycling.

2. He is riding forward.

3. He goes away.

4. That man is still picking pears.

5. This one is cycling.

6. He is riding forward.


113

The multimodal flight finder enables rapid task completion by letting the user interact through multiple interaction modalities

114

Multimodal Analysis Lab (Singapore): collaboration of social scientists and computer scientists

115

Multimodality in computational linguistics

Gibbon D, Mertins I & Moore R (eds.) Handbook of multimodal and spoken dialogue systems: resources, terminology and product evaluation. Dordrecht: Kluwer. 2000

116

In related disciplines

Assumption typically held by other cognitive scientists, for example psychologists: language consists of words, sentences, and other verbal units

“With no more than 50 to 100 K words humans can create and understand an infinite number of sentences” (Bernstein et al. 1994: 349-350)

When cognitive scientists work with “language”, they almost invariably think that language is a set of individual words or, at most, sentences
