Fieldwork – consultation and elicitation methods
ELDP Training 2007
Friederike Lüpke
Structure of the talk
Motivation for methodological considerations in the field of language documentation
Overview of Himmelmann’s types of communicative events
Illustration of data resulting from different types of communicative events, with a focus on staged communicative events
Presentation and classification of different types of stimuli
Potential problems
Links
Motivation for methodological considerations
Why bother?
So far, the field of language documentation has focussed on the shape that a language documentation should take, but not on what data should be included, how they should be collected and to whom they should be of use.
A step toward this is a systematic investigation of the goals of language documentation, of the data collection methods associated with them, and the usability of the data resulting from them.
I-language vs. E-language
Text vs. performance
Oral vs. written
Oral vs. nonverbal
Quality vs. quantity
…
The (new?) role of data
“For description, the main concern is the production of grammars and dictionaries whose primary audience are linguists… In these products language data serves essentially as exemplification and support for the linguist’s analysis.” (Austin 2006: 87)
“[…] Language documentation, on the other hand, places data at the center of its concerns.” (Austin 2006: 87)
One view of language documentation
Elicitations (paradigms, results of tests...)
Observed communicative events (conversation, narratives…)
Staged communicative events (descriptions of picture and video stimuli...)
Corpus (Himmelmann 1998)
Qualitative analyses (occurrence of x in context y)
Quantitative analyses (weighting of occurrence of x in context y throughout speakers, texts and genres…)
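As a rough illustration of the quantitative side, the following Python sketch computes, for each speaker or genre, the share of tokens of an item x that occur in context y. The record layout, the function name `context_share` and all values are hypothetical placeholders, not data or tooling from the talk.

```python
from collections import Counter

# Hypothetical annotated corpus records: (speaker, genre, item, context).
# All values are invented placeholders for illustration only.
records = [
    ("S1", "narrative", "x", "transitive"),
    ("S1", "narrative", "x", "intransitive"),
    ("S2", "conversation", "x", "transitive"),
    ("S2", "stimulus description", "x", "transitive"),
    ("S2", "conversation", "z", "transitive"),
]


def context_share(records, item, context, group_index):
    """Return, per group, the proportion of `item` tokens found in `context`.

    group_index selects the grouping field: 0 = speaker, 1 = genre.
    """
    totals = Counter()  # all tokens of `item` per group
    hits = Counter()    # tokens of `item` in `context` per group
    for rec in records:
        if rec[2] != item:
            continue
        group = rec[group_index]
        totals[group] += 1
        if rec[3] == context:
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


by_speaker = context_share(records, "x", "transitive", 0)
by_genre = context_share(records, "x", "transitive", 1)
print(by_speaker)
print(by_genre)
```

Comparing the per-speaker and per-genre shares is one simple way to see whether an observed pattern is spread across the corpus or rests on a single speaker or text type.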
Data types in the corpus
Video, audio and image data
Written data
Metadata
Corpus
Information about the content, format and structure of data
Transcription, annotation and analysis of data
Speech in cultural context
But exactly what data?
“A language documentation […] conceived of as a lasting, multipurpose record of a language […] should contain a large set of primary data which provide evidence for the language(s) used at a given time in a given community” (Himmelmann 2006: 7)
“The main goal of a language documentation is to make primary data available for a broad group of users.” (Himmelmann 2006: 15)
Which community/ies?
Which language/s?
Which audience(s)?
Homogeneity vs. heterogeneity
Do we aim at a monolithic record or at documenting variation (of what)?
How do we establish representativeness?
Do we document a snapshot or the production, transmission, maintenance and change of linguistic and cultural behaviour?
What status for negative evidence?
“With regard to the usual way of obtaining negative evidence (i.e. asking one or two speakers whether examples x, y, z, are “okay”), it is doubtful whether this really makes a difference in quality compared to evidence provided by the fact that the structure in question is not attested in a large corpus. Elicited evidence is only superior here if it is very carefully elicited, paying adequate attention to the sample of speakers interviewed, potential biases in presenting the material, and the like.” (Himmelmann 2006: 23)
How much methodological and theoretical awareness can we expect in language documentation?
Which methods are robust and widely accepted?
Data for who?
We are aware of the disciplines that also have language as a centre of interest – but do we cater for their needs?
We want to create data relevant for the speech community/ies, but we have little evidence for the use of our electronic corpora.
How can we create a true multipurpose record of a language?
The (new?) role of the consultant
“…some older field manuals give advice on what kind of questions to ask or not to ask, … . In this manner, such manuals quite automatically assign a passive role to the speaker. If we regard fieldwork as a mutual teaching-learning event, this approach is no longer acceptable.” (Mosel 2006: 75)
What roles do we assume for ourselves and our consultants?
What’s left? Data and methodology
“The major discovery of post-1957 “syntactic theory” is not “theoretical”, but methodological: That a huge amount of generalizations can best be found by adopting an “experimental” approach… What remains of the published body of research is the empirical part. So all the papers that are neatly divided into a “data/generalizations” part and an “analysis” part have a good chance of continuing to be useful.” (Haspelmath 2006: Linguistlist 17.2304)
If it is data that is central, how can we ensure that our data are, and will be, relevant?
How can we reach maximal transparency and explicitness in providing information about how and why we collected our data?
Your turn
Please take 5 minutes to:
– Think about the main goals and users of your research project.
– Think about how you have collected and/or intend to collect data in the field.
– Consider what kinds of data collection methods (e.g. word lists, questionnaires, stimuli…) you use.
We will discuss your findings and concerns in the plenary.
Data resulting from observed communicative events
Data resulting from monologues
PRO:
– Have a high degree of ecological validity.
– Yield phonologically, semantically and syntactically natural utterances.
– Give insight into the culture, if thematically balanced.
– Show high-frequency phenomena.
CON:
– Can seem natural but in fact aren’t, because the cultural settings are not respected.
– Can contain pragmatic oddities.
– Are not very controlled.
– Many features are not quantifiable because each text is a unique performance by one speaker.
– Don’t offer negative evidence and are not good for low-frequency phenomena.
“This lecture is about the fascinating theory on...”
Data resulting from conversation
PRO:
– Often seen as the ne plus ultra in naturalness.
– Yields data that are naturalistic in every respect.
– Also gives important information about the culture.
CON:
– Is not controlled at all.
– Is very difficult to get.
– Is tedious and time-consuming to transcribe.
– Is even more time-consuming to analyse.
– Doesn’t offer negative evidence or insight into low-frequency phenomena.
A: “How do you like the ELDP training so far?”
B: “All I can say is they start too early and don’t give us enough breaks!”
Representativeness of an LDD corpus – Jalonke high-frequency verb kolon ‘know’
Causative
Reciprocal
Complement
Perfect
Many transitive uses
Passive
Representativeness of an LDD corpus – Jalonke low-frequency verb
Past
NP subject
Goal PP
All uses are intransitive
Causative?
Transitive uses?
Perfect?
Passive?
???
???
Summary
Observed communicative events that are investigated in a qualitative way allow us to:
– Get a first impression of the most frequent syntactic environments of the most frequent verbs.
– Formulate hypotheses and prepare elicitation sessions.
But: these data don’t tell us anything about markedness, about the full distributional range, about low-frequency items and constructions, and about their semantic properties.
Data resulting from elicitation
Data resulting from contextualising elicitation
PRO:
– Yield phonologically natural utterances.
– Can be quantified to some extent.
– Are highly controlled, or at least seem to be.
– Offer negative evidence.
CON:
– Results depend heavily on the creativity of the researcher and the receptiveness of the consultant.
– Easily lead to misunderstandings that go unnoticed.
– Can thus yield syntactically/semantically/pragmatically odd utterances.
“How do you greet in the morning?”
Data resulting from translational equivalent elicitation
PRO:
– Are easy when starting work on an unknown language.
– Give good data for working on the phoneme inventory and basic lexicon, and for lexical comparison.
– Are quantifiable and highly controlled.
– Offer negative evidence.
CON:
– Yield phonologically odd utterances.
– Can easily lead to misunderstandings due to the lack of context.
– Translatable items are limited in number.
– Hyper-cooperative consultants may create neologisms and produce calques to be helpful.
“How do you say ‘bee’ in Dida?”
Data resulting from acceptability judgements
PRO:
– Are controlled and quantifiable.
– Can give results for domains that are difficult to cover otherwise.
– Give comparable results for many fields.
– Offer negative evidence.
CON:
– Very often do not test acceptability of the utterance, but rather of the context provided for it.
– Can therefore very often be contradicted by the same and/or different speakers.
“Can I say ‘this book’ when the book is lying over there?”
Summary
Elicited data that are inspected in a qualitative way allow us to:
– Get the full distributional range of a given item/construction.
– Test the semantic properties of that item/construction.
– Provide negative evidence, i.e. information on unattested structures/uses, ungrammaticality, etc.
But: these data are often influenced by the metalanguage/elicitation method and not naturalistic at all.
Your turn
Please take five minutes to think about other data collection methods you use, in particular about stimuli-based data:
– Which media do you use if you collect data based on non-verbal stimuli?
– How do you rate the quality of the data obtained with stimuli?
– Have you encountered any problems when working with stimuli?
– Do you have recommendations to make regarding specific stimuli that worked well?
We will compare your observations in the plenary.
Data resulting from staged communicative events
Staged communicative events based on nonverbal stimuli
Types of stimuli
Static stimuli:
– Comics
– Picture books
– Photos
Dynamic stimuli:
– Acted videos
– Animated videos
– Staged life events
Interactive stimuli:
– Puzzle tasks
– Map tasks
– Matching games
Static stimuli
Picture books:
– Topological relations picture book
– Frog story
Photos:
– Positional verbs picture book
Comics:
– Calvin & Hobbes
– Tintin
– Asterix & Obelix
Dynamic stimuli
Acted videos:
– Staged events
– Cut & Break
– Pear film
Animated videos:
– Fish film
– Event triads
– ECOM clips
Interactive stimuli
Matching/sorting games:
– Basic colour terms Munsell chips
– Men and tree
– Cluedo
Puzzles:
– Eisenbeiss/Matsuo puzzle
Map tasks/route descriptions:
– HCRC map task
– Table top route description task
Data resulting from static stimuli
PRO:
– Are highly controlled, quantifiable and comparable.
– Yield phonologically, semantically and syntactically accurate data.
– Are free from linguistic interference of the metalanguage and from misunderstandings of context.
– Can be used for nonlinguistic categorisation tasks.
CON:
– Validity of the data depends on coverage of the domain under inspection by the stimulus.
– If there are gaps in the parameters covered, data can be severely flawed.
– Cross-cultural applicability can be limited.
– Use is limited to visually depictable scenes.
Data resulting from dynamic stimuli
PRO:
– Yield phonologically, syntactically and semantically quantifiable and comparable data etc. (see previous slide).
– Can be used for nonlinguistic categorisation tasks.
CON:
– See previous slide and:
– Require the use of high-tech equipment, which is complicated if not impossible in many field settings.
– Depending on the abstractness of the stimulus and the purpose of the elicitation, misunderstandings can occur.
Data resulting from interactive stimuli
PRO:
– Allow controlled interaction of two or more speakers.
– Yield quantifiable and comparable data.
– Can be used for nonlinguistic categorisation tasks.
CON:
– May create culturally inappropriate or strange situations.
– Since the true purpose of the interaction is normally not known to the consultants, misunderstandings occur easily.
Examples for the use of static stimuli
Posture verbs in stative positions (Ameka, de Witte & Wilkins 1999)
English/Dutch: The bottle is standing on the rock.
Jalonke:
Biniir-<< d$$-xi g<m-<< fari.
bottle-DEF sit-PF rock-DEF on
‘The bottle is sitting on the rock.’
Goemai: The stick is hanging on the tree trunk.
Jalonke:
Tam-<< kiran-xi wurixuntun-na ma.
stick-DEF lean-PF tree trunk-DEF at
‘The stick is leaning against the tree trunk.’
Examples for the use of dynamic stimuli
Event segmentation: ECOM clips (Bohnemeyer & Caelen 1999)
Single clause; single verb
English: The ball rolled from the square past the house to the triangle.
Multiple independent clauses
Yukatek: The ball is at the square, and it goes rolling, and then it passes the house, and then it arrives at the triangle.
Posture verbs in caused positions (Hellwig & Lüpke 1999)
English: She puts the bottle on the table.
Jalonke/Goemai/Dutch: She ‘sits’ the bottle on the table.
Differences between stative and caused positions:
Same posture verb used: Jalonke.
Different verbs with same extension used: Goemai.
Different verbs with different extensions used: English and Dutch.
Semantic differences:
In Jalonke and Goemai, objects with a base sit/are ‘sat’, even when their longest axis is vertical.
In English and Dutch, they stand, but are put (English) or ‘sat’ (Dutch).
Cut & break verbs (Bohnemeyer, Bowerman & Brown 2001)
English: cut (with scissors)
Dutch: knippen ‘cut with scissors’
Jalonke: cut-iterative (because cloth has already been cut).
English: cut (with knife)
Dutch: snijden ‘cut with a knife’
Jalonke: cut (because fish hasn’t been cut yet).
Examples for the use of interactive stimuli
The Puzzle Task (Eisenbeiss & Matsuo 2003)
Children have to describe puzzle pieces in order to have the pieces handed to them.
The pictures are selected in order to elicit descriptions of external possession and to ‘force’ the children to verbalise all the relevant contrasts.
An example of the contrasts involved
The HCRC map task (HCRC Edinburgh)
The instruction giver’s map
The instruction follower’s map
Crucial: the landmarks on the two maps are not identical, which increases the motivation to communicate.
The men and tree matching game (MPI Nijmegen)
Two consultants, a ‘director’ and a ‘matcher’, have identical sets of photos with similar scenes.
The director describes a photo to the matcher, who has to find the matching picture.
The photos are selected to uncover the categories triggering the choice of the matching photos – in this case, intrinsic vs. absolute frames of reference.
Ad hoc stimuli
Ad hoc stimuli
New technologies enable fieldworkers to create stimuli ‘ad hoc’ in the field:
– Digital photos
– Video clips
– Animations
Although generally not usable for cross-linguistic comparison, these stimuli can yield interesting data that are difficult to get otherwise.
Action descriptions (Lüpke 2005, ms.)
Videos recorded in the field that are described by consultants.
PRO:
– Yield fine-grained event descriptions difficult to obtain otherwise.
– Can be used to cover semantic domains not attested so far in the corpus.
CON:
– Don’t constitute a ‘speech event’ in the sense of Hymes.
Photos and PowerPoint animations
Useful for ethnobotany.
Sequences of stills from digital video or PowerPoint animations can be used to elicit stages of an event.
Potential problems
Ecological validity
It is important to aim at culturally appropriate methods.
However, total ecological validity leads to non-transferability.
Therefore:
– Elicitations and stimuli should replace the names of culturally unfamiliar items with more familiar ones.
– Unfamiliar or uncomfortable settings (elicitation sessions with consultants of different rank/sex…, elicitation games, etc.) should be explained and negotiated beforehand, and, if necessary, amended.
Procedure and analysis
The familiarity or unfamiliarity of certain devices, techniques, or media in different cultures and groups should be taken seriously but need not rule out using them with caution and preparation:
– Pilot studies determine whether a technique works and, if not, give indications on what should be changed.
– Consultant training is important for all kinds of data collection.
Data resulting from one collection technique should always be checked against data from another technique.
If comparability of data is the aim, collection procedure and analysis should not deviate from the instructions or procedures given for the specific technique, questionnaire or stimulus.
My conclusion
Why all kinds of data?
Corpus: Field-based corpora are relatively small. Thus:
– They don’t show the full distributional range of a given item.
– They don’t offer negative evidence.
Staged communicative events: Are very controlled, but not linguistically prompted. Thus:
– They permit the controlled variation of a situation.
– They allow the assessment of the real-world situation referred to.
– They yield data that are directly comparable across languages.
Observed communicative events: Are relatively uncontrolled. Thus:
– They don’t allow an active manipulation of parameters of variation by the researcher.
– Often, texts do not permit the reconstruction of the real-world context for a given item.
Elicitations: Are highly controlled, but linguistically prompted. Thus:
– They are likely to be influenced by the linguistic input.
– Their ‘naturalness’ cannot be assessed.
A circle
Observed communicative events
Staged communicative events
Elicitation
Demonstrative questionnaire
Action description
BowPed picture stimulus
Your conclusion?
Useful links
MPI Nijmegen Language & Cognition and Acquisition Groups:
– Large number of stimuli on a range of topics; stimuli and manuals upon request: http://www.mpi.nl/world/index.html
The MPI EVA Leipzig links to field tools:
– http://lingweb.eva.mpg.de/fieldtools/tools.htm
Russ Tomlin’s Fish Film:
– Stimulus designed to uncover the motivation for voice contrasts, topicality, etc.: http://logos.uoregon.edu/tomlin/research_fishfilm.html
Wallace Chafe’s Pear Film:
– Designed to compare narrative structure: http://www.linguistics.ucsb.edu/faculty/chafe/pearfilm.htm
Phillip Wolff’s animations on causality (upon request?):
– Aimed at testing Talmy’s force dynamics model of causation: http://userwww.service.emory.edu/~pwolff/
Sonja Eisenbeiß’s elicitation games (upon request):
– A variety of games and tasks for language acquisition studies, focussing on three participant events and external possession: http://privatewww.essex.ac.uk/~seisen/index.htm