Fieldwork – consultation and elicitation methods

ELDP Training 2007

Friederike Lüpke

Structure of the talk

Motivation for methodological considerations in the field of language documentation

Overview of Himmelmann’s types of communicative events

Illustration of data resulting from different types of communicative events, with a focus on staged communicative events

Presentation and classification of different types of stimuli

Potential problems

Links

Motivation for methodological considerations

Why bother?

So far, the field of language documentation has focussed on the shape that a language documentation should take, but not on what data should be included, how they should be collected and to whom they should be of use.

A step toward this is a systematic investigation of the goals of language documentation, of the data collection methods associated with them, and the usability of the data resulting from them.

I-language vs. E-language

Text vs. performance

Oral vs. written

Oral vs. nonverbal

Quality vs. quantity

…

The (new?) role of data

“For description, the main concern is the production of grammars and dictionaries whose primary audience are linguists… In these products language data serves essentially as exemplification and support for the linguist’s analysis.” (Austin 2006: 87)

“[…] Language documentation, on the other hand, places data at the center of its concerns.” (Austin 2006: 87)

One view of language documentation

Elicitations (paradigms, results of tests…)

Observed communicative events (conversation, narratives…)

Staged communicative events (descriptions of picture and video stimuli…)

Corpus (Himmelmann 1998)

Qualitative analyses (occurrence of x in context y)

Quantitative analyses (weighting of occurrence of x in context y throughout speakers, texts and genres…)
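As a rough illustration of the quantitative analysis named above, the following Python sketch computes how often an item x occurs in context y, broken down by speaker and by genre. The token list, its field order and the function name `context_share` are assumptions made for this example; they do not come from any particular corpus tool.

```python
from collections import Counter

# Hypothetical annotated tokens: (speaker, genre, item, context).
tokens = [
    ("S1", "narrative", "x", "y"),
    ("S1", "narrative", "x", "z"),
    ("S2", "conversation", "x", "y"),
    ("S2", "conversation", "x", "y"),
    ("S2", "narrative", "x", "z"),
]


def context_share(tokens, item, context, key_index):
    """Share of `item` tokens occurring in `context`,
    grouped by speaker (key_index=0) or by genre (key_index=1)."""
    totals, hits = Counter(), Counter()
    for tok in tokens:
        if tok[2] != item:
            continue  # skip tokens of other items
        key = tok[key_index]
        totals[key] += 1
        if tok[3] == context:
            hits[key] += 1
    return {key: hits[key] / totals[key] for key in totals}


by_speaker = context_share(tokens, "x", "y", key_index=0)
by_genre = context_share(tokens, "x", "y", key_index=1)
```

With this toy data, speaker S1 uses x in context y in half of the cases, while in conversation every token of x occurs in y. Real counts would need far more tokens per speaker and genre before such shares mean anything.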

Data types in the corpus

Video, audio and image data: speech in cultural context

Written data: transcription, annotation and analysis of data

Metadata: information about the content, format and structure of data

Corpus

But exactly what data?

“A language documentation […] conceived of as a lasting, multipurpose record of a language […] should contain a large set of primary data which provide evidence for the language(s) used at a given time in a given community” (Himmelmann 2006: 7)

“The main goal of a language documentation is to make primary data available for a broad group of users.” (Himmelmann 2006: 15)

Which community/ies?

Which language/s?

Which audience(s)?

Homogeneity vs. heterogeneity

Do we aim at a monolithic record or at documenting variation (of what)?

How do we establish representativeness?

Do we document a snapshot or the production, transmission, maintenance and change of linguistic and cultural behaviour?

What status for negative evidence?

“With regard to the usual way of obtaining negative evidence (i.e. asking one or two speakers whether examples x, y, z, are “okay”), it is doubtful whether this really makes a difference in quality compared to evidence provided by the fact that the structure in question is not attested in a large corpus. Elicited evidence is only superior here if it is very carefully elicited, paying adequate attention to the sample of speakers interviewed, potential biases in presenting the material, and the like.” (Himmelmann 2006: 23)

How much methodological and theoretical awareness can we expect in language documentation?

Which methods are robust and widely accepted?

Data for who?

We are aware of the disciplines that also have language as a centre of interest – but do we cater for their needs?

We want to create data relevant for the speech community/ies, but we have little evidence for the use of our electronic corpora.

How can we create a true multipurpose record of a language?

The (new?) role of the consultant

“…some older field manuals give advice on what kind of questions to ask or not to ask, … . In this manner, such manuals quite automatically assign a passive role to the speaker. If we regard fieldwork as a mutual teaching-learning event, this approach is no longer acceptable.” (Mosel 2006: 75)

What roles do we assume for ourselves and our consultants?

What’s left? Data and methodology

“The major discovery of post-1957 “syntactic theory” is not “theoretical”, but methodological: That a huge amount of generalizations can best be found by adopting an “experimental” approach… What remains of the published body of research is the empirical part. So all the papers that are neatly divided into a “data/generalizations” part and an “analysis” part have a good chance of continuing to be useful”. (Haspelmath 2006: Linguistlist 17.2304)

If it’s data that is central, how can we ensure that our data are, and will be, relevant?

How can we reach maximal transparency and explicitness in providing information about how and why we collected our data?

Your turn

Please take 5 minutes to:
– Think about the main goals and users of your research project.
– Think about how you have collected and/or intend to collect data in the field.
– Think about what kinds of data collection methods (e.g. word lists, questionnaires, stimuli…) you use.

We will discuss your findings and concerns in the plenary.

Data resulting from observed communicative events

Data resulting from monologues

PRO:
– Have a high degree of ecological validity.
– Yield phonologically, semantically and syntactically natural utterances.
– Give insight into the culture, if thematically balanced.
– Show high-frequency phenomena.

CON:
– Can seem natural but in fact aren’t, because the cultural settings are not respected.
– Can contain pragmatic oddities.
– Are not very controlled.
– Many features are not quantifiable because each monologue is a unique performance of one speaker.
– Don’t offer negative evidence and are not good for low-frequency phenomena.

“This lecture is about the fascinating theory on...”

Data resulting from conversation

PRO:
– Often seen as the ne plus ultra in naturalness.
– Yields data that are naturalistic in every respect.
– Also gives important information about the culture.

CON:
– Is not controlled at all.
– Is very difficult to get.
– Is tedious and time-consuming to transcribe.
– Is even more time-consuming to analyse.
– Offers neither negative evidence nor insight into low-frequency phenomena.

A: “How do you like the ELDP training so far?”

B: “All I can say is they start too early and don’t give us enough breaks!”

Representativeness of an LDD corpus – Jalonke high-frequency verb kolon ‘know’

Attested in the corpus: causative, reciprocal, complement, perfect, passive, and many transitive uses.

Representativeness of an LDD corpus – Jalonke low-frequency verb

Attested in the corpus: past, NP subject, goal PP. All uses are intransitive.

Not attested: Causative? Transitive uses? Perfect? Passive?

Summary

Observed communicative events that are investigated in a qualitative way allow us to:
– Get a first impression of the most frequent syntactic environments of the most frequent verbs.
– Formulate hypotheses and prepare elicitation sessions.

But: these data don’t tell us anything about markedness, about the full distributional range, about low frequency items and constructions, and about their semantic properties.
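One way to turn corpus gaps into an elicitation plan can be sketched in a few lines of Python. The sets below paraphrase the labels on the two Jalonke slides; the checklist and the names `attested` and `elicitation_targets` are hypothetical, chosen for illustration only.

```python
# Constructions attested per verb in the corpus (labels taken from the
# two Jalonke slides; the low-frequency verb is not named there).
attested = {
    "kolon": {"causative", "reciprocal", "complement", "perfect",
              "transitive", "passive"},
    "low_frequency_verb": {"past", "np_subject", "goal_pp", "intransitive"},
}

# Assumed checklist of constructions we want evidence on for every verb.
checklist = ["causative", "transitive", "perfect", "passive"]


def elicitation_targets(attested, checklist):
    """For each verb, list the checklist constructions missing from the corpus."""
    return {
        verb: [c for c in checklist if c not in seen]
        for verb, seen in attested.items()
    }


targets = elicitation_targets(attested, checklist)
```

Here `targets["kolon"]` comes out empty, while all four checklist items are listed for the low-frequency verb. A gap in this output is only a prompt for elicitation; absence from a small corpus is not negative evidence.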

Data resulting from elicitation

Data resulting from contextualising elicitation

PRO:
– Yield phonologically natural utterances.
– Can be quantified to some extent.
– Are highly controlled, or at least seem to be.
– Offer negative evidence.

CON:
– Results depend heavily on the creativity of the researcher and the receptiveness of the consultant.
– Easily lead to misunderstandings that go unnoticed.
– Can thus yield syntactically/semantically/pragmatically odd utterances.

“How do you greet in the morning?”

Data resulting from translational equivalent elicitation

PRO:
– Are easy when starting work on an unknown language.
– Give good data to work on phoneme inventory, basic lexicon, and for lexical comparison.
– Are quantifiable and highly controlled.
– Offer negative evidence.

CON:
– Yield phonologically odd utterances.
– Can easily lead to misunderstandings due to the lack of context.
– Translatable items are limited in number.
– Hyper-cooperative consultants may create neologisms and produce calques to be helpful.

“How do you say ‘bee’ in Dida?”

Data resulting from acceptability judgements

PRO:
– Are controlled and quantifiable.
– Can give results for domains that are difficult to cover otherwise.
– Give comparable results for many fields.
– Offer negative evidence.

CON:
– Very often do not test acceptability of the utterance, but rather of the context provided for it.
– Can therefore very often be contradicted by the same and/or different speakers.

“Can I say ‘this book’ when the book is lying over there?”

Summary

Elicited data that are inspected in a qualitative way allow us to:
– Get the full distributional range of a given item/construction.
– Test the semantic properties of that item/construction.
– Provide negative evidence, i.e. information on unattested structures/uses, ungrammaticality, etc.

But: these data are often influenced by the metalanguage/elicitation method and are not naturalistic at all.

Your turn

Please take five minutes to think about other data collection methods you use, in particular about stimuli-based data:
– Which media do you use if you collect data based on non-verbal stimuli?
– How do you rate the quality of the data obtained with stimuli?
– Have you encountered any problems when working with stimuli?
– Do you have recommendations to make regarding specific stimuli that worked well?

We will compare your observations in the plenary.

Data resulting from staged communicative events

Staged communicative events based on nonverbal stimuli

Types of stimuli

Static stimuli:
– Comics
– Picture books
– Photos

Dynamic stimuli:
– Acted videos
– Animated videos
– Staged life events

Interactive stimuli:
– Puzzle tasks
– Map tasks
– Matching games

Static stimuli

Picture books:
– Topological relations picture book
– Frog story

Photos:
– Positional verbs picture book

Comics:
– Calvin & Hobbes
– Tintin
– Asterix & Obelix

Dynamic stimuli

Acted videos:
– Staged events
– Cut & Break
– Pear film

Animated videos:
– Fish film
– Event triads
– ECOM clips

Interactive stimuli

Matching/sorting games:
– Basic colour terms: Munsell chips
– Men and tree
– Cluedo

Puzzles:
– Eisenbeiss/Matsuo puzzle

Map tasks/route descriptions:
– HCRC map task
– Table top route description task

Data resulting from static stimuli

PRO:
– Are highly controlled, quantifiable and comparable.
– Yield phonologically, semantically and syntactically accurate data.
– Are free from linguistic interference of the metalanguage and from misunderstandings of context.
– Can be used for nonlinguistic categorisation tasks.

CON:
– Validity of the data depends on coverage of the domain under inspection by the stimulus.
– If there are gaps in the parameters covered, the data can be severely flawed.
– Cross-cultural applicability can be limited.
– Use is limited to visually depictable scenes.

Data resulting from dynamic stimuli

PRO:
– Yield phonologically, syntactically and semantically quantifiable and comparable data etc. (see previous slide).
– Can be used for nonlinguistic categorisation tasks.

CON:
– See previous slide, and:
– Require high-tech equipment, which is complicated if not impossible to use in many field settings.
– Depending on the abstractness of the stimulus and the purpose of the elicitation, misunderstandings can occur.

Data resulting from interactive stimuli

PRO:
– Allow controlled interaction of two or more speakers.
– Yield quantifiable and comparable data.
– Can be used for nonlinguistic categorisation tasks.

CON:
– May create culturally inappropriate or strange situations.
– Since the true purpose of the interaction is normally not known to the consultants, misunderstandings occur easily.

Examples for the use of static stimuli

Posture verbs in stative positions (Ameka, de Witte & Wilkins 1999)

English/Dutch: The bottle is standing on the rock.

Jalonke:
Biniir-<< d$$-xi g<m-<< fari.
bottle-DEF sit-PF rock-DEF on
‘The bottle is sitting on the rock.’

Goemai: The stick is hanging on the tree trunk.

Jalonke:
Tam-<< kiran-xi wurixuntun-na ma.
stick-DEF lean-PF tree trunk-DEF at
‘The stick is leaning against the tree trunk.’

Examples for the use of dynamic stimuli

Event segmentation: ECOM clips (Bohnemeyer & Caelen 1999)

English (single clause; single verb): The ball rolled from the square past the house to the triangle.

Yukatek (multiple independent clauses): The ball is at the square, and it goes rolling, and then it passes the house, and then it arrives at the triangle.

Posture verbs in caused positions (Hellwig & Lüpke 1999)

English: She puts the bottle on the table.

Jalonke/Goemai/Dutch: She ‘sits’ the bottle on the table.

Differences between stative and caused positions:

Same posture verb used: Jalonke.

Different verbs with same extension used: Goemai.

Different verbs with different extensions used: English and Dutch.

Semantic differences:

In Jalonke and Goemai, objects with a base sit/are ‘sat’, even when their longest axis is vertical.

In English and Dutch, they stand, but are put (English) or ‘sat’ (Dutch).

Cut & break verbs (Bohnemeyer, Bowerman & Brown 2001)

English: cut (with scissors)

Dutch: knippen ‘cut with scissors’

Jalonke: cut-iterative (because cloth has already been cut).

English: cut (with knife)

Dutch: snijden ‘cut with a knife’

Jalonke: cut (because fish hasn’t been cut yet).

Examples for the use of interactive stimuli

The Puzzle Task (Eisenbeiss & Matsuo 2003)

Children have to describe puzzle pieces in order to have them handed over.

The pictures are selected in order to elicit descriptions of external possession and to ‘force’ the children to verbalise all the relevant contrasts.

An Example of the contrasts involved

The HCRC map task (HCRC Edinburgh)

The instruction giver’s map

The instruction follower’s map

Crucial: the landmarks on the two maps are not identical, which increases the motivation to communicate.

The men and tree matching game (MPI Nijmegen)

Two consultants, a ‘director’ and a ‘matcher’, have identical sets of photos with similar scenes.

The director describes a photo to the matcher, who has to find the matching picture.

The photos are selected to uncover the categories triggering the choice of the matching photos – in this case, intrinsic vs. absolute frames of reference.

Ad hoc stimuli

New technologies enable fieldworkers to create stimuli ‘ad hoc’ in the field:
– Digital photos
– Video clips
– Animations

Although generally not usable for cross-linguistic comparison, these stimuli can yield interesting data that are difficult to get otherwise.

Action descriptions (Lüpke 2005, ms.)

Videos recorded in the field that are described by consultants.

PRO:
– Yield fine-grained event descriptions difficult to obtain otherwise.
– Can be used to cover semantic domains not attested so far in the corpus.

CON:
– Don’t constitute a ‘speech event’ in the sense of Hymes.

Photos and PowerPoint animations

Useful for ethnobotany.

Sequences of stills from digital video or PowerPoint animations can be used to elicit stages of an event.

Potential problems

Ecological validity

It is important to aim at culturally appropriate methods.

However, total ecological validity leads to non-transferability.

Therefore:
– Elicitations and stimuli should replace the names of culturally unfamiliar items with more familiar ones.
– Unfamiliar or uncomfortable settings (elicitation sessions with consultants of different rank/sex…, elicitation games, etc.) should be explained and negotiated beforehand, and, if necessary, amended.

Procedure and analysis

The familiarity or unfamiliarity of certain devices, techniques, or media in different cultures and groups should be taken seriously but need not rule out using them with caution and preparation:
– Pilot studies determine whether a technique works and, if not, give indications of what should be changed.
– Consultant training is important for all kinds of data collection.

Data resulting from one collection technique should always be checked against data from another technique.

If comparability of data is the aim, collection procedure and analysis should not deviate from the instructions or procedures given for the specific technique, questionnaire or stimulus.

My conclusion

Why all kinds of data?

Corpus: Field-based corpora are relatively small. Thus:
– They don’t show the full distributional range of a given item.
– They don’t offer negative evidence.

Staged communicative events: Are very controlled, but not linguistically prompted. Thus:
– They permit the controlled variation of a situation.
– They allow the assessment of the real-world situation referred to.
– They yield data that are directly comparable across languages.

Observed communicative events: Are relatively uncontrolled. Thus:
– They don’t allow an active manipulation of parameters of variation by the researcher.
– Often, texts do not permit the reconstruction of the real-world context for a given item.

Elicitations: Are highly controlled, but linguistically prompted. Thus:
– They are likely to be influenced by the linguistic input.
– Their ‘naturalness’ cannot be assessed.

A circle

Observed communicative events

Staged communicative events

Elicitation

Demonstrative questionnaire

Action description

BowPed picture stimulus

Your conclusion?

Useful links

MPI Nijmegen Language & Cognition and Acquisition Groups:
– Large number of stimuli on a range of topics; stimuli and manuals upon request: http://www.mpi.nl/world/index.html

The MPI EVA Leipzig links to field tools:
– http://lingweb.eva.mpg.de/fieldtools/tools.htm

Russ Tomlin’s Fish Film:
– Stimulus designed to uncover the motivation for voice contrasts, topicality, etc.: http://logos.uoregon.edu/tomlin/research_fishfilm.html

Wallace Chafe’s Pear Film:
– Designed to compare narrative structure: http://www.linguistics.ucsb.edu/faculty/chafe/pearfilm.htm

Phillip Wolff’s animations on causality (upon request?):
– Aimed at testing Talmy’s force dynamics model of causation: http://userwww.service.emory.edu/~pwolff/

Sonja Eisenbeiß’s elicitation games (upon request):
– A variety of games and tasks for language acquisition studies, focussing on three-participant events and external possession: http://privatewww.essex.ac.uk/~seisen/index.htm
