Forschungscolloquium Migration und Minderheiten

Europa-Universität Viadrina

Caucasian Urum: Language contact and speaker variation

Stavros Skopeteas University of Bielefeld

Frankfurt (Oder), 25.06.2011

1. Preliminaries

1.1 Caucasian Urum

- Spoken in the district of Trialeti, Georgia.

- Language contact: Anatolian Turkish (basic substrate), exchange with Russian and Georgian (possibly also Armenian).

- Population: 30 811 people according to the 1979 Population Census of the Georgian SSR, estimated at 1500 people in 2006 (Wheatley 2006).

- According to the tradition of the community, the Urum people were originally situated in Eastern Turkey (Kars). Their ancestors moved to the Caucasus at the beginning of the 19th century.

- The Caucasian Urum language should not be confused with the Urum language spoken in Ukraine (also known as Greek-Tatar) or with the Urum people in Turkey.

Fig. 1. Tshalka

1.2 Urum documentation project

University of Athens: Athanasios Markopoulos, Eleni Sella-Mazi

University of Bielefeld: Stavros Skopeteas

University of Bremen: Elisabeth Verhoeven

Tbilisi: Violeta Moisidi

Funded by the Latsis foundation (January 2010 – February 2011)

1.3 Objectives

Sections

(a) thematic LEXICON: translation of 1419 concepts (belonging to 24 different semantic fields) (4 native speakers);

(b) SENTENCE sample: representative sentences for the examination of different grammatical categories (4 native speakers);

(c) TEXT collection containing semi-naturalistic narratives (80 short narratives, 16 native speakers);

(d) documentation of the COMMUNITY: sociolinguistic questionnaires about the individuals' use of Urum and other languages (30 native speakers).

Method

- This decision follows from the assumption that linguistic properties vary in at least two dimensions: (a) the variation between speakers (pervasive in an endangered language); (b) the variation between linguistic objects.

- Repeated observations in language documentation.

Fig. 2. Urum in the Web (www.urum.lili.uni-bielefeld.de)

2. Words

Research questions

• What are the sources of the Urum vocabulary?

• Which (phonological, morphological, semantic) deviations from the Eastern varieties of Turkish may be observed in Urum?

• What is the influence of the contact languages (Georgian, Russian, Pontic Greek) on the vocabulary? How is this influence manifested in particular semantic fields?

Method

- Adapted version of the inventory of lexical concepts of the World Loanword Database (WOLD; see Haspelmath & Tadmor 2009).

Table 1. Basic vocabulary in semantic fields

semantic field                  illustrative examples                            n
sense perception                smell, bitter, hear, etc.                       47
spatial relations               remain, pick up, in front of, left, etc.        71
body                            head, eye, bone, cheek, etc.                   138
kinship                         mother, father, sister, younger sister, etc.    82
motion                          fall, throw, swim, carry on the back, etc.      76
physical world                  land, soil, mud, mountain, etc.                 71
emotions and values             heavy, happy, cry, proud, etc.                  54
quantity                        fifteen, count, few, empty, etc.                39
time                            slow, sometime, soon, year, etc.                56
actions and technology          cut, pull, build, hammer, etc.                  64
cognition                       study, teach, pupil, doubt, etc.                51
speech and language             tell, speech, paper, pen, etc.                  42
animals                         cow, sheep, goat, chicken, etc.                104
possession                      give, find, pay, price, etc.                    47
warfare and hunting             army, soldier, victory, defeat, etc.            35
social and political relations  queen, Russian, servant, command, etc.          56
food and drink                  oven, bowl, soup, bean, etc.                   109
agriculture                     shovel, flower, tree, orange, etc.              68
law                             accuse, guilty, prison, thief, etc.             20
house                           door, window, chimney, bed, etc.                39
clothing                        glove, leather, skirt, shoe, etc.               52
religion and belief             bishop, hymn, marriage, Muslim, etc.            33
modern world                    bomb, plastic, workshop, film, etc.             51
miscellaneous                   same, nothing, without, that, etc.              14
TOTAL                                                                         1419

The native speaker was then presented with a sentential example in the contact language (Russian) that contains the target word. The sentential examples are developed with the following rules:

A. Non-relational entities are encoded in the contact language as subjects.

Target concept: “goat”

Sentential example: “The goat is clever.”

B. Relational entities are encoded in the contact language as subjects possessed by a third person.

Target concept: “nose”

Sentential example: “Her nose is beautiful.”

C. Properties and events are encoded in the contact language as predicates.

Target concept: “big”

Sentential example: “The cow is big.”

Fig. 3. Entry in Lexicon

Illustrative Results.

Fig. 4. Likelihood of borrowing per semantic field

(y-axis: likelihood of borrowing, 0 to 1; x-axis: semantic fields; series: Urum, 48 languages)

Urum: likelihood calculated in a total of 5273 translations (4 speakers)

48 languages: likelihood for the same words in the WOLD sample

Comments

- The proportion of borrowings in Urum, i.e., 23,7% (aggregated per field), is smaller than the corresponding proportion of the same words in the 48-languages sample, i.e., 28,6% (WOLD).

- Some outliers (e.g., kinship, time, warfare and hunting).

- The general pattern of proportions across semantic fields is similar to the cross-linguistic pattern (Pearson r = .84).
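The comparison across semantic fields can be sketched as a Pearson correlation between two lists of per-field proportions. The following is only an illustration: the function name `pearson_r` and the four proportions per sample are hypothetical placeholders, not the values behind Fig. 4.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# HYPOTHETICAL proportions of borrowed words per semantic field
# (illustration only; not the handout's data).
urum_by_field = {"kinship": 0.30, "body": 0.10, "modern world": 0.80, "law": 0.45}
wold_by_field = {"kinship": 0.15, "body": 0.14, "modern world": 0.70, "law": 0.40}

# Pair the two samples field by field before correlating them.
fields = sorted(urum_by_field)
r = pearson_r([urum_by_field[f] for f in fields],
              [wold_by_field[f] for f in fields])
print(f"Pearson r = {r:.2f}")
```

A high r under this computation only says that fields with many borrowings in one sample tend to have many in the other; it does not say that the overall proportions are equal.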

Fig. 5. Origin of borrowed words per semantic field

(preliminary decoding)

(y-axis: 0% to 100%; x-axis: semantic fields; series: Urum, Russian, Georgian, Greek)

Comments

- Majority of borrowings from Russian: 1037 translations (i.e., 24.1%);

- fewer borrowings from Georgian (in particular semantic fields, e.g., food and drink): 77 translations (i.e., 1.8%);

- very few borrowings from Greek (in highly culture-specific fields, e.g., religion): 10 translations (i.e., .2%).

The subset of “Urum” words contains diverse groups:

(a) Turkish words (1935 out of 2254 decoded tokens, i.e., 85,8%)

(b) Old Turkish words (73 out of 2254 decoded tokens, i.e., 3,2%)

(c) not yet identified (246 out of 2254 decoded tokens, i.e., 10,9%)

Based on the cross-linguistic data of the WOLD database, we obtain an index of borrowability for each concept (n of languages in which this concept is encoded through a borrowing / n total). E.g.,

(1) a. brother 0,06

b. fish 0,15

c. mouse 0,18

d. potato 0,42

e. trousers 0,56

f. car 0,79

The index of borrowability gives us a tool for the estimation of the developments in

the conservative/innovative parts of the lexicon.
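As a minimal sketch of the index defined above, the ratio can be computed per concept and then assigned to one of five likelihood intervals of width 0.2. The function names and the language counts below are hypothetical: the counts are invented so that the ratios come out near the 0,06 and 0,79 quoted in (1), and the number of languages with data for a given concept may differ from 48. Treating the intervals as closed on the right is also an assumption.

```python
def borrowability_index(n_borrowing_languages, n_languages):
    """Share of languages in which a concept is encoded through a borrowing."""
    if n_languages <= 0:
        raise ValueError("n_languages must be positive")
    if not 0 <= n_borrowing_languages <= n_languages:
        raise ValueError("n_borrowing_languages must lie between 0 and n_languages")
    return n_borrowing_languages / n_languages


# Upper bounds of five intervals of width 0.2 (assumed to be right-closed).
BIN_UPPER_BOUNDS = (0.2, 0.4, 0.6, 0.8, 1.0)


def likelihood_bin(index):
    """Return the (lower, upper) interval that contains the index."""
    if not 0.0 <= index <= 1.0:
        raise ValueError("index must lie between 0 and 1")
    lower = 0.0
    for upper in BIN_UPPER_BOUNDS:
        if index <= upper:
            break
        lower = upper
    return (lower, upper)


# HYPOTHETICAL counts: (languages with a borrowing, languages in the sample).
hypothetical_counts = {"brother": (3, 48), "car": (38, 48)}

for concept, (n_borrowed, n_total) in hypothetical_counts.items():
    index = borrowability_index(n_borrowed, n_total)
    print(concept, round(index, 2), likelihood_bin(index))
```

Concepts in the lowest interval would count as the conservative part of the lexicon, concepts in the highest interval as the innovative part.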

Fig. 6. Likelihood of borrowing and origin of Urum words

(2606 translated words; words occurring in both languages excluded)

(y-axis: % of borrowed words out of n words, 0 to 100; x-axis: likelihood of borrowing in 48 languages, in five intervals from 0 to 1 with upper bounds ,2, ,4, ,6, ,8 and 1; series: Russian, Turkish)

3. Sentences

Research questions

• In which clausal environment do Urum speakers select a particular inflectional category?

• What are the basic properties of Urum syntax?

• What are the similarities and differences between the Urum clause structure and the clause structure in the other languages at issue (Turkish, Georgian, Russian, Pontic Greek)?

Method

- Sentence list by Suarez (820 sentences).

- 4 native speakers

Fig. 7. Entry in Sentences

4. Texts

Research questions

• How do speakers select words and syntactic structures in naturalistic data

(narratives)?

• What can we learn about the frequencies of particular linguistic properties

in discourse?

• Is there variation between speakers?

Method

Cheese story

Instruction: Please tell me how you make cheese in Tshalka. (Do not worry if there are some details that you do not know; just tell me everything you consider necessary.)

Path description

Instruction: Please describe the path to go from Beshtasheni to Hadik to me. Please give exact descriptions, so that we can recognize the path that we have to follow (by telling me about all the important places on the way to Hadik, e.g., characteristic houses, trees, crossroads, etc.).

The story of the ancestors

Instruction: Please tell me the story of how the Urum people came to the Caucasus. It is not a problem if you are not sure about the historical details. Just tell me the story of your ancestors as far as you know it and include all the details you consider necessary.

Modern life story

Instruction: Please tell me about the changes in the situation of the Urum people in the last twenty years. The best way to do this is to start by telling me about your and/or your family's experiences at the time Georgia became independent. Try to remember those events and tell me about the course of events until today. Please take your time in doing this and give me all the details you consider important, because I am interested in everything that is important for you.

Pear story

Instruction: You are going to see a film twice. Please notice what happens in the film

and tell me the story. Try to remember as many details as you can.

Data

- The 5 texts have been recorded with 16 native speakers

- Total: 80 parallel narratives

Fig. 8. Entry in Text

Illustrative generalizations

The plural morpheme is a modifier that is used if it is relevant and not obvious in the context. Beginning with nouns, when a quantifier/numeral encodes plurality, the specified noun is most frequently not morphologically marked for plural, see, e.g., (2) (observed in 14 tokens out of a total of 16 quantified noun phrases, i.e., 87.5%). This phenomenon is known for a large array of languages with concatenative morphology (see Corbett 2004: 211).

(2) a. ušax-lar
       child-PL
       ‘children’

    b. uč    ušax
       three child
       ‘three children’ (speaker 21)

Plural subjects (either with a quantifier/numeral or with a plural suffix) can be cross-referenced by the plural suffix on the verb, e.g., (3a). However, we observe that the plural marker is omitted elsewhere in the corpus, e.g., (3b). The crucial question for the grammatical description is whether the variation observed in these examples is random (e.g., resulting from the varying choice of the individual speakers) or whether it is determined by some grammatically relevant factor.

(3) a. uč    ušax-ta   gäl-dɴ-lar.
       three child-and come-PAST-3.PL
       ‘and three children came’ (speaker 21)

    b. uč    ušax-ta   gäl-di.
       three child-and come-PAST
       ‘and three children came’ (speaker 28)

Motivated by the established typologies of number distinctions, we hypothesize that the likelihood of plural marking on the verb depends on the mental representation of the referents, such that highly individuated referents are more likely to be cross-referenced by a plural affix on the verb. This hypothesis predicts an animate-inanimate asymmetry in the marking of plurality (see Smith-Stark 1974; Lucy 1992; Corbett 2004: 70).

The empirical investigation of our corpus confirms our expectations: 39 out of 63 animate plural subjects (61,9%) are cross-referenced by a plural affix on the verb, while this is the case for only 3 out of 16 (18,8%) inanimate subjects. Moreover, our corpus allows us to examine whether this empirical difference is independent of speaker variation. We found tokens of both conditions at issue (animate plural subjects, inanimate plural subjects) in twelve speakers and were able to run a paired-samples t-test, which revealed that the observed difference is beyond the chance level (t(11) = 3,7, p < .003).
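The logic of a paired-samples t-test can be sketched as follows: each speaker contributes one proportion per condition, and the test statistic is the mean of the per-speaker differences divided by its standard error. The function name `paired_t` and the twelve pairs of proportions are hypothetical and serve only to show the computation; they are not the corpus data.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t, degrees of freedom) for a paired-samples t-test."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two paired samples of equal length (at least 2)")
    # One difference per speaker: condition 1 minus condition 2.
    diffs = [a - b for a, b in zip(first, second)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("t is undefined when all differences are identical")
    standard_error = sd / sqrt(len(diffs))
    return mean(diffs) / standard_error, len(diffs) - 1


# HYPOTHETICAL per-speaker proportions of plural agreement (12 speakers).
animate = [0.50, 0.40, 1.00, 0.33, 0.90, 0.45, 1.00, 0.50, 0.38, 0.85, 0.42, 0.95]
inanimate = [0.00, 0.00, 1.00, 0.00, 0.50, 0.00, 1.00, 0.00, 0.00, 0.50, 0.00, 1.00]

t, df = paired_t(animate, inanimate)
print(f"t({df}) = {t:.2f}")
```

With twelve speakers the test has 11 degrees of freedom, which matches the subscript of the reported statistic; the p-value would then be read from the t distribution with that many degrees of freedom.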

Fig. 9. Percentages of plural agreement for animates and inanimates

(Y-bars indicate standard error of the mean values)

(y-axis: % of plural agreement, 0 to 100; x-axis: animates, inanimates)

However, plotting the production of plural agreement per speaker reveals that our data are not canonically distributed. The distribution is bimodal, i.e., the sample is distributed around two central values: a group of speakers is distributed around the 33,3-50% segment of plural agreement proportions and a second group of speakers is distributed around the 83,3-100% segment.

Fig. 10. Proportions of plural agreement and n of speakers

(y-axis: n of speakers, 0 to 6; x-axis: % of plural agreement, in segments 0-16,6%, 16,6-33,3%, 33,3-50%, 50-66,6%, 66,6-83,3%, 83,3-100%)

Fig. 10 implies that our sample contains two groups of individuals speaking two different grammars: the former grammar has optional plural marking while the latter has near-obligatory plural marking. This means that Fig. 9 involves a confound between two types of information: the difference depending on the type of entity (animate vs. inanimate) and the difference between groups. I.e., the question ‘is there a difference in the production of plural agreement between animates and inanimates’ has to be answered for each subgroup of speakers separately, as in Fig. 11. Indeed, this view on the data reveals a quite different empirical situation. There is a group A of speakers that produce plural agreement optionally, and a group B for which plural agreement is near obligatory. The crucial issue is that the Group A speakers never produced plural agreement with inanimate entities, i.e., the gradient difference between animates and inanimates in our data (see Fig. 9) does not result from the lower likelihood of plural with inanimates, but from the lower proportion of speakers of Group B in our speaker sample (if we had only speakers of the B-group in our sample the gradience would not be visible). Group A speakers display a categorical pattern that is not visible in the average result. They optionally produce plural agreement with animates and never produce plural agreement with inanimates.
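The effect of pooling two speaker groups can be illustrated with a small sketch that computes agreement rates once over all tokens and once per group. The function name `agreement_rates` and the token counts are hypothetical; they are constructed so that Group A never marks plural agreement with inanimates while Group B marks it almost always, which is the pattern described above.

```python
from collections import defaultdict


def agreement_rates(tokens, by_group):
    """Proportion of tokens with plural agreement, per animacy (and group)."""
    counts = defaultdict(lambda: [0, 0])  # key -> [agreeing tokens, all tokens]
    for group, animacy, agrees in tokens:
        key = (group, animacy) if by_group else animacy
        counts[key][0] += int(agrees)
        counts[key][1] += 1
    return {key: hits / total for key, (hits, total) in counts.items()}


# HYPOTHETICAL tokens: (speaker group, animacy of subject, plural agreement?).
tokens = (
    [("A", "animate", True)] * 4 + [("A", "animate", False)] * 6
    + [("A", "inanimate", False)] * 5
    + [("B", "animate", True)] * 9 + [("B", "animate", False)] * 1
    + [("B", "inanimate", True)] * 3 + [("B", "inanimate", False)] * 1
)

pooled = agreement_rates(tokens, by_group=False)
grouped = agreement_rates(tokens, by_group=True)

print("pooled:", pooled)
print("per group:", grouped)
```

In this toy sample the pooled rate for inanimates is gradient (3 out of 9 tokens), although no Group A token shows agreement; the per-group view makes the categorical pattern visible.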

Fig. 11. Percentages of plural agreement per speaker group

(Y-bars indicate standard error of the mean values)

(Panels: (a) Speakers’ Group A, (b) Speakers’ Group B; in each panel, y-axis: % of plural agreement, 0 to 100; x-axis: animates, inanimates)

5. Community

(by Eleni Sella)

Research questions

• What are the sociolinguistic properties of the Urum community?

• Which second and third languages do Urum people speak?

• In which communicative situations do Urum people use their language?

Method

In order to answer these questions, we developed a detailed sociolinguistic

questionnaire. This questionnaire contains:

• Biographical details;

• Language competence;

• Use of the language in several fields of communication;

• Attitude towards the language;

• Self-estimation of the fluency in Urum.

30 native speakers were interviewed with this questionnaire.

Data

The questionnaire was designed as a set of multiple-choice questions allowing for

selection of more than one option. The interviews were conducted in Urum. E.g.,

You are using Urum: (Говорите на Урум)

a. with the parents (С родителями)

b. with the grandparents (С дедушкой, бабушкой)

c. with your children (С вашими детьми)

d. with the neighbours (С соседями)

e. at work (На работе)

f. with your friends (С вашими друзьями)

g. in other occasions. Where? (В другом месте. Где?)

Fig. 12. Primary language in social interactions

(percentages of 30 native speakers’ estimations)

(y-axis: 0% to 100%; x-axis: grandparents, parents, siblings, spouse, children, friends, colleagues; series: Armenian, Greek, Georgian, Russian, Urum)

Comments

- The frequency of language use decreases across generations:

grandparents > parents > siblings/spouse > children.

- The frequency of language use decreases with social distance:

relatives > friends > colleagues.

References

Bowern, Claire 2008, Linguistic Fieldwork: A Practical Guide. New York: Palgrave Macmillan.

Corbett, G. 2004, Number. Cambridge: Cambridge University Press.

Gippert, Jost, Nikolaus P. Himmelmann, and Ulrike Mosel (eds.) 2006, Essentials of Language Documentation. Berlin: Mouton de Gruyter.

Haspelmath, Martin and Uri Tadmor (eds.) 2009, Loanwords in the World's Languages: A Comparative Handbook. Berlin: Mouton de Gruyter.

Haviland, John 2006, Documenting lexical knowledge. In Gippert et al. (eds.), 129-162.

Levinson, Stephen C. 2003, Space in Language and Cognition: Explorations in Cognitive Diversity. Cambridge: Cambridge University Press.

Lucy, J. 1992, Grammatical Categories and Cognition. Cambridge: Cambridge University Press.

Mosel, Ulrike 2006, Fieldwork and community language work. In Gippert et al. (eds.), 67-85.

Roudik, Peter L. 2009, Culture and Customs of the Caucasus. Westport: Greenwood.

Smith-Stark, T. 1974, The plurality split. Chicago Linguistic Society 10, 657-671.

Snider, K. and J. Roberts 2006, SIL Comparative African Wordlist (SILCAWL). Dallas: SIL International.

Swadesh, Morris 1952, Lexicostatistic dating of prehistoric ethnic contacts. Proceedings of the American Philosophical Society 96, 152-63.

Wheatley, J. 2006, Defusing conflict in Tsalka district of Georgia: migration, international intervention and the role of the state. European Centre for Minority Issues, Working Paper 36.