SP Publications

International Journal Of English and Studies (IJOES)

An International Peer-Reviewed Journal ; Volume-4, Issue-1, 2022 www.ijoes.in ISSN: 2581-8333; Impact Factor: 5.421(SJIF)

ISSN: 2581-8333 Copyright © 2022 SP Publications Page 73

RESEARCH ARTICLE

Presenting Sambalpuri-Kosli Language: A Demonstration of Limited-Resourced Language

Dr. Bipin Bihari Dash, Assistant Professor in English, Odisha University of Technology and Research (OUTR), Bhubaneswar-751029, Odisha, India

Abstract:

Language is something in which everybody has a stake; it encapsulates a range of elements such as the culture of a community, its indigenous knowledge paradigm, social and religious values, folklore and so on. One of the salient objectives of this research is to document and describe the Sambalpuri-Kosli language by preparing an online dictionary that could serve as a stepping stone for technological advancement of, and future research into, the language. The present dictionary is a multilingual, web-based, thematic dictionary of around 600 words collected from three domains: flora and fauna, kinship, and body parts. To document one’s own language is to archive and disseminate it for posterity, and a web-based dictionary is among the best ways of doing so. The lexical data has been encoded with Toolbox, and Lexique Pro has been used for its online launch. The data has been analyzed and processed in such a manner that it can be comprehended by researchers from other disciplines. The paper argues that dictionaries are not only repositories of lexical items representing linguistic knowledge but also databases of the cultural, anthropological, ethnographic and social knowledge of a particular speech community embedded in its language. Furthermore, it attempts to examine how other ontological information inherently pertains to language.

Keywords: Toolbox, Lexique Pro, Lexicography, Documentation, Lexicon

INTRODUCTION

One of the pertinent issues in the study of language in the present era is that languages are becoming extinct at an alarming rate. It is feared that the coming century will witness the rapid disappearance of languages ‘without being adequately recorded’ (Krauss, 1992; Crystal, 2002:19). The current worldwide distribution of languages shows that the majority (3,586) of the world’s languages are spoken by a meagre 0.2% of the population, whereas a small number of languages (83) are spoken by around 79% of the world’s population (Harrison, 2007:14). Besides, most languages are less-resourced and less-described, in terms of the availability of electronic corpora on the one hand and the amount of linguistic research on the other. Because of the existence of a dominant language, minor languages fail to attract government patronage; as a result they are consistently neglected, which in turn leads to their endangerment.

As Ostler (1999) rightly puts it, languages that lack an active presence in electronic media are liable to become endangered. Such languages are either dialects or languages without government patronage (Behera et al., 2015). As a consequence, the situation of these languages in South Asia in general, and of Indic languages in particular, is ‘relatively bleak’ (McEnery et al., 2000). Although India is a land of more than 1500 languages belonging to five prominent language families (Abbi, 2001), only 22 are scheduled and the rest are fighting for their survival. It is therefore indispensable to document, describe and archive the languages fighting for survival and to make them available online for posterity, so that further natural language processing research and development can be conducted on them.

Description of a language refers to describing its formal properties at the level of the phoneme, the morpheme, the sentence, and other higher levels. Language documentation complements language description, which aims at describing a language's abstract system of structures and rules in the form of a grammar or dictionary.

Documentation, as put forth by Himmelmann (2006:01), is a “lasting, multipurpose record of a language”. Broadly speaking, it is concerned with the compilation and preservation of linguistic primary data and of the interfaces between the primary data and the analyses based on it. The primary data include audio and video recordings of communicative events as well as field notes taken during elicitation sessions. Typical steps involve recording, transcribing (often using the International Phonetic Alphabet and/or a "practical orthography" devised for that language), annotation and analysis, translation into a language of wider communication, archiving and dissemination.

One innovative way to document a language and the phenomena pertaining to it is lexicographic documentation. Lexicography is the applied study of the meaning, evolution, and function of the vocabulary units of a language for the purpose of compilation in book form. Perhaps the simplest explanation is that lexicography is a scholarly discipline that involves compiling, writing, or editing dictionaries. It is widely considered an independent scholarly discipline, though it is a subfield within linguistics.

There are two types of lexicography:

Practical: the art or craft of compiling, writing and editing dictionaries.

Theoretical: the scholarly discipline of analyzing and describing the semantic, syntagmatic and pragmatic relationships within the lexicon of a language, and of developing theories of dictionary components, of the structures linking the data in dictionaries, of the information needs of users in specific types of situation, and of how users may best access the data incorporated in printed and electronic dictionaries. This is sometimes referred to as 'meta-lexicography'.

“These dictionaries of endangered languages comprise a wider inventory from a variety of speech genres, with sophisticated multimedia materials, and new ways of preserving cultural memory and representing semantic and cultural ontologies” (Ogilvie, 2011: 389-404).

Language documentation is multidisciplinary in nature and draws on theoretical concepts and methods from linguistics, ethnography, folklore studies, psychology, information and library science, archiving and museum studies, digital humanities, media and recording arts, pedagogy, ethics, and other research areas. Its major goal is the creation of well-organized, long-lasting corpora that can be used for a variety of purposes, including theoretical research and practical needs such as language and cultural revitalization. Another prominent feature is attention to the rights and desires of language speakers and communities, and collaboration with them in the recording, analysis, archiving, dissemination, and support of their own languages.

AIMS AND OBJECTIVES

1. To create a lexicon of Sambalpuri-Kosli in electronic form so that future research can be initiated on the linguistic aspects of the language.

2. To present the socio-cultural, anthropological and ethnographic aspects of the region where the language is spoken, so as to delve into the indigenous knowledge system underlying the language.

3. To make the language available to researchers from other disciplines so that other language-related aspects can be explored in future.

4. To cater to the linguistic needs of the Western Odisha region and to make the dictionary usable for teaching and learning through the language.

5. To publish in both print and electronic versions so that the dictionary reaches those who have access to technology as well as those who do not.

6. To document the entries in three languages, viz. English, Hindi and Sambalpuri, so that the dictionary is comprehensible to speakers of any of the three.

BACKGROUND

Sambalpuri-Kosli (ISO 639-3: spv) belongs to the Indo-Aryan language family. It is spoken by approximately 18 million people (Census Report, 2001) across ten districts of western, south-western and north-western Odisha (Sambalpur, Bargarh, Bolangir, Sonepur, Kalahandi, Sundargarh, Boud, Deogarh, Nuapada and Jharsuguda), as well as in some parts of Jharkhand and Chhattisgarh.

Although an adequate amount of literature is available in Sambalpuri-Kosli, linguistic research and development on it is negligible. The attitude of the speakers towards the language is quite positive, and it is used more in informal settings than in formal ones. The language is not used as a medium of instruction; rather, the dominant language, Odia, takes its place.

The present paper concerns the documentation of a less-resourced and less technologically advanced language, Sambalpuri-Kosli, with Lexique Pro (see Figure 1) and Toolbox. It describes an effort to build an online Sambalpuri dictionary under the semantic domains of body parts, kinship terms, and flora and fauna. The domain of body parts has been sub-divided into two broad categories: internal and external. The domain of flora and fauna has been sub-categorized into six categories: creepers, fruit plants, vegetable plants, flowers, weather, and other trees. Kinship terms have been categorized into affinal and non-affinal. The dictionary is multilingual (Sambalpuri-Hindi-English) and comprises approximately four hundred words under the aforementioned semantic categories.

Fig 1 Lexique Pro Sambalpuri-Kosli Lexicon Snapshot

The languages employed in the dictionary are Sambalpuri, for phonetic transcriptions and examples; English, for descriptions, glosses, parts of speech, and examples; and Hindi, for descriptions, examples and the gloss of each word. “These dictionaries challenge the traditional types of dictionaries because they are everything in one. They combine aspects of the learners dictionary, historical dictionary, encyclopaedic dictionary, talking dictionary, pictorial dictionary, video dictionary, and visual thesaurus” (Ogilvie, 2011: 389-404). The dictionary also contains a picture and an audio recording of each word in electronic format, which makes it a talking dictionary. In addition, it provides scientific nomenclature, etymological references, cross-references, details of the source, morphemic breaks where needed, and metadata such as the date of entry and the part of speech of each word.


Fig 2 An Exemplary Entry of the Lexicon

Special Features of the Dictionary:

Multilingual

Contains images and audios of each word

Morphemic breaks

Parts of speech

Entry of the source

Contains scientific names of words pertaining to flora and fauna

Etymological reference

Cross reference
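The features listed above suggest the shape of a single dictionary record. The sketch below models such a record as a Python dataclass. It is an illustration only: the class name `Entry`, every field name and the audio file name are hypothetical and are not taken from the Toolbox or Lexique Pro database described in this paper. The sample values reuse the compound /bɪleɪ ɑɛ̃kʰ/ ‘cat’s eye’ cited in the linguistic analysis.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entry:
    """One dictionary record; every field name here is hypothetical."""
    headword: str                        # phonetic transcription of the word
    part_of_speech: str
    gloss_en: str                        # English gloss
    gloss_hi: str = ""                   # Hindi gloss (left empty in this sketch)
    morphemes: List[str] = field(default_factory=list)   # morphemic breaks
    scientific_name: Optional[str] = None                 # flora and fauna only
    etymology: Optional[str] = None
    cross_refs: List[str] = field(default_factory=list)
    source: Optional[str] = None
    image_file: Optional[str] = None
    audio_file: Optional[str] = None

    def is_talking(self) -> bool:
        """An entry 'talks' when an audio recording is attached."""
        return self.audio_file is not None


cat_eye = Entry(
    headword="bɪleɪ ɑɛ̃kʰ",
    part_of_speech="n",
    gloss_en="cat's eye",
    morphemes=["bɪleɪ", "ɑɛ̃kʰ"],
    audio_file="bilei_aekh.wav",         # hypothetical file name
)
print(cat_eye.is_talking(), len(cat_eye.morphemes))
```

Keeping optional features as `None` or empty lists lets an entry without, say, a scientific name or an etymology remain valid, which matches the "if needed" wording used for morphemic breaks.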

Method of Data Collection: The data has been collected from Sambalpuri speakers of western Odisha. The remaining data is proposed to be collected from Sambalpuri blogs and from Facebook posts by native speakers. The data collected and documented so far comes from the three domains, or themes, listed below (see Figure 3).

Body parts:
a. Internal
b. External

Kinship terms:
a. Affinal
b. Non-Affinal

Flora and fauna:
a. Animals: birds, reptiles, insects, and wild animals
b. Plants: creepers, herbs, trees


Fig 3 Three Domains of Data Collection

Method of Data Analysis:

After data collection, the complete lexicon was entered using the Toolbox software, and Lexique Pro was used for online launching and upload.

The audio files were recorded with the Audacity software and an Angel SV 200mA recorder.

The analysis has been conducted at two levels, linguistic and cultural, and also considers the relation between the two in a dictionary-making enterprise.
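Toolbox stores a lexicon as plain text in which each field begins with a backslash marker. As a rough illustration of how such data could be read outside Toolbox, here is a minimal Python sketch. It assumes the commonly used MDF-style markers (`\lx` lexeme, `\ps` part of speech, `\ge` English gloss, `\sd` semantic domain); the paper does not state which markers were actually used, and the two sample records are assembled from words cited in this paper rather than copied from the real database.

```python
from collections import defaultdict

# Hypothetical sample in standard-format style; marker choice is an assumption.
SAMPLE = r"""
\lx kɑn
\ps n
\ge ear
\sd body parts

\lx nɑk
\ps n
\ge nose
\sd body parts
"""


def parse_sfm(text):
    """Split standard-format text into records, one per \\lx marker.

    Lines that do not start with a backslash (blank or continuation lines)
    are skipped in this simplified sketch.
    """
    entries = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("\\"):
            continue
        marker, _, value = line[1:].partition(" ")
        if marker == "lx":               # a new lexeme starts a new record
            current = defaultdict(list)
            entries.append(current)
        if current is not None:          # ignore fields before the first \lx
            current[marker].append(value.strip())
    return [dict(entry) for entry in entries]


for record in parse_sfm(SAMPLE):
    print(record["lx"][0], "=", record["ge"][0])
```

Each field is stored as a list because a marker may legitimately repeat within one record, for example when an entry carries several glosses.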

Linguistic Analysis:

Sambalpuri has borrowed many words into its lexicon from other Indian languages and beyond. Of the total number of lexical items, around 33 percent are indigenous Sambalpuri, about 38 percent are from Odia, approximately 27 percent are from Hindi, and the rest come from other languages (see Chart 1).
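The share left for the other languages follows by subtraction from the approximate figures quoted above. The short Python sketch below does that arithmetic and also shows how such shares could be computed from per-entry source-language tags. The function name `shares_by_source` and the toy tag list are hypothetical and do not come from the dictionary data.

```python
from collections import Counter

# Approximate shares quoted in the text (percent of lexical items).
reported = {"Sambalpuri": 33, "Odia": 38, "Hindi": 27}
other = 100 - sum(reported.values())    # share left for English and others


def shares_by_source(tags):
    """Return the percentage of entries per source-language tag."""
    counts = Counter(tags)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {lang: 100 * n / total for lang, n in counts.items()}


# Toy tag list, invented purely to exercise the function.
toy_tags = ["Odia", "Odia", "Sambalpuri", "Hindi"]
print(other, shares_by_source(toy_tags))
```

Since the quoted figures are themselves rounded ("around", "about", "approximately"), the remainder should be read as roughly 2 percent rather than as an exact value.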

[Chart 1: share of lexical items by source language. Series: Sambalpuri, Odia, Hindi, English and others]

[Semantic domains diagram: Body Parts (Internal, External); Kinship Terms (Affinal, Non-Affinal); Flora and Fauna (Creepers, Vegetables, Flower, Fruit, Weather, Miscellaneous)]


Noun (body parts)          Verb (n > v)
/kɑn/ ‘ear’                /kənɑ/ ‘hear’
/kɑnd̪ʰ/ ‘shoulder’         /kənd̪ʰɑ/ ‘bear’

Noun                       Verb (to + infinitive)
/nɑk/ ‘nose’               /nəkɑbɑr/ ‘to smell’
/kɑn/ ‘ear’                /kənɑbɑr/ ‘to hear’

Compound nouns (flora & fauna)
Noun + noun:               /bɪleɪ/ ‘cat’ + /ɑɛ̃kʰ/ ‘eye’ > /bɪleɪ ɑɛ̃kʰ/ ‘cat’s eye’
Noun + adjective:          /hɑt̪ɪ/ ‘elephant’ + /muɖɪɑ/ ‘headed’ > /hɑt̪ɪ muɖɪɑ/ ‘elephant-headed’

With respect to verbalization, verbs are formed from monosyllabic nouns by adding ‘-ɑ’, with reduction of the stem vowel (/ɑ/ becomes /ə/, as in /kɑn/ > /kənɑ/; a parallel reduction involving /ʊ/ is also noted). In the infinitive construction, the suffix ‘-bɑr’ is added to the verb stem with the same vowel reduction. In compound noun formation, two types of construction are noticeable, viz. noun + noun and noun + adjective.
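The verb-forming pattern can be checked mechanically against the examples tabulated above. The Python sketch below is one possible reading of the rule, not the author's formalization: it implements only the /ɑ/ to /ə/ reduction that the four examples attest, and the function names `derive_verb` and `derive_infinitive` are hypothetical. IPA symbols are written as Unicode escapes so that visually similar characters cannot be confused.

```python
OPEN_A = "\u0251"   # ɑ
SCHWA = "\u0259"    # ə


def derive_verb(noun: str) -> str:
    """Reduce the first stem vowel ɑ to ə, then append the verbalizer -ɑ."""
    reduced = noun.replace(OPEN_A, SCHWA, 1)
    return reduced + OPEN_A


def derive_infinitive(noun: str) -> str:
    """Build the 'to + infinitive' form by adding -bɑr to the derived verb."""
    return derive_verb(noun) + "b" + OPEN_A + "r"


ear = "k" + OPEN_A + "n"     # /kɑn/ 'ear'
nose = "n" + OPEN_A + "k"    # /nɑk/ 'nose'
print(derive_verb(ear))           # /kənɑ/ 'hear'
print(derive_infinitive(nose))    # /nəkɑbɑr/ 'to smell'
```

The sketch is limited to monosyllabic stems with a single /ɑ/, which is the only environment the cited examples cover.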

Body parts (n > adj)

Noun                       Adjective
/hɑ:t̪/ ‘hand’              /hɑ:t̪e/ ‘hand-sized’
/peʈ/ ‘stomach’            /peʈe/ ‘full-stomach’
/pɑ:d̪/ ‘foot’              /pɑ:d̪e/ ‘one foot’
/ɑ̃:ʈʰʊ/ ‘knee’             /ɑ̃:ʈʰe/ ‘length up to knee’
/mʊɖ/ ‘head’               /mʊɖɑ/ ‘bent’

In the domain of body parts, adjectives are formed by adding the derivational morphemes ‘-e’ and ‘-ɑ’ to the roots. In addition, in some cases a long nucleus vowel in the noun (e.g. /ɑ:/) is centralized (e.g. to /ə/) in the adjective.


Noun (n > adj)             Adjective
/kʊkʊr/ ‘dog’              /kʊkʊrɪɑ/ ‘doggish’
/mɑkəɖ/ ‘monkey’           /mɑkəɖɪɑ/ ‘ugly’
/gʰʊsrɪ/ ‘pig’             /gʰʊsrɪɑ/ ‘piggish’

So far as the words of flora and fauna are concerned, adjectivalizations are formed with the addition of ‘-ɪɑ’ suffix to the root.
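The ‘-ɪɑ’ pattern can be expressed in the same spirit. In the Python sketch below the function name `adjectivalize` is hypothetical, and the special case for roots that already end in /ɪ/ is an assumption inferred from the single example /gʰʊsrɪ/ > /gʰʊsrɪɑ/, where the suffix vowel is not doubled. IPA symbols are again given as Unicode escapes.

```python
SMALL_I = "\u026a"   # ɪ
OPEN_A = "\u0251"    # ɑ
SUFFIX = SMALL_I + OPEN_A   # -ɪɑ


def adjectivalize(root: str) -> str:
    """Attach the adjectival suffix -ɪɑ to a flora-and-fauna noun root."""
    if root.endswith(SMALL_I):
        # Assumed coalescence: a root-final ɪ merges with the suffix-initial ɪ.
        return root + OPEN_A
    return root + SUFFIX


dog = "k\u028ak\u028ar"      # /kʊkʊr/ 'dog'
print(adjectivalize(dog))    # /kʊkʊrɪɑ/ 'doggish'
```

Note that the rule is purely formal: the gloss of the derived adjective is not predictable from the noun, as /mɑkəɖɪɑ/ ‘ugly’ from ‘monkey’ shows.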

Cultural Analysis:

Words are cultural, archaeological, and environmental signatures of a community. “But more important, for humanity in general, is the need to preserve cultural diversity and knowledge systems that can be encoded in a dictionary” (Ogilvie, 2011: 389-404). Many words, such as /ʈə̃ɖʰɛɪ pok/ ‘praying mantis’, /ərəkʰ gəcʰ/ ‘calotropis tree’, /d̪ʊd̪ʰrɑ gəcʰ/ ‘stramonium plant’, /kəĩ gəcʰ/ ‘water lily plant’, /d̪ʰəmnɑ/ ‘the female cobra’ and /cɑt̪ək/ ‘the swallow bird’, provide ample information regarding the culture of a specific speech community. For instance, /ʈə̃ɖʰɛɪ/ means ‘witch’ and /pok/ refers to a worm in Sambalpuri, so /ʈə̃ɖʰɛɪ pok/ (see Figure 5) denotes ‘the worm of the witch’. The praying mantis is believed to be the agent of a witch, sent to suck the blood of the person on whom the spell is cast at night, especially on full moon and new moon nights.

This is one of the popular superstitions among speakers of the language in the region. /cɑt̪ək/ (see Figure 6) ‘the swallow bird’ is believed to be one of the rarest birds, one that does not drink water found on the earth’s surface but drinks directly from the falling rain. In Sambalpuri /cɑhə̃/ means ‘want’ or ‘look’, so the word /cɑt̪ək/ has probably been derived from /cɑhə̃/.

Fig 5 praying mantis Fig 6 the swallow bird

/ərəkʰ gəcʰ/ (see Figure 7) ‘calotropis tree’ and /d̪ʊd̪ʰrɑ gəcʰ/ (see Figure 8) ‘stramonium plant’ are two plants from the flora and fauna domain that reflect the religious life of the region. Their flowers are offered in worship to Lord Shiva and cannot be used for worshipping any other god or goddess. This indicates that most speakers of the language belong to the Hindu community. /d̪ʰəmnɑ/ (see Figure 9) ‘the female cobra’ likewise relates to the religious life of the community: witnessing the mating of the king and queen cobra is considered auspicious by the people.


Fig 7 Calotropis tree

Fig 8 the female cobra

Fig 9 Stramonium plant

/kəĩ gəcʰ/ (see Figure 10) ‘water lily plant’ generally grows in ponds, and such ponds are extremely deep. There is a belief that the lilies are the abodes of the gods and goddesses and must not be plucked; anyone who plucks them is sure to face some problems. This belief is also manifested in a festival known as /bərʊɑ/ ‘Barua’, for which a large number of people gather. During this festival certain persons become possessed by one of the deities and behave with the characteristics of the respective deity.


Fig 10 water lily plant

Conclusion:

English, Chinese, Spanish, French, Japanese and many European and Western languages are high-resource languages: a vast corpus of data already exists in them that can be tapped for training and learning. Low-resource languages are those that have relatively little data available for training conversational systems. One of the best aspects of today’s hyper-connected world is that the fruits of technological innovation can spread across the globe, and there is no reason why a new technological breakthrough could not be replicated to help societies in India. Odia is well established in Odisha, but Sambalpuri-Kosli is a limited-resourced language.

From the foregoing discussion and analysis it can be affirmed that dictionaries provide readers with an abundant storehouse of the language under study. In addition, dictionaries, being products of the language, also encapsulate a wealth of other semantic and ontological information concerning the socio-cultural aspect, the indigenous knowledge paradigm, philosophical and religious values, folklore and so on.

REFERENCES

[1] Abbi, A. Manual of Linguistic Fieldwork and Structures of Indian Languages. Lincom Europa. 2001

[2] Behera, P., Ojha, A. K. & Jha, G. N. Issues and Challenges in Developing Statistical POS Taggers for Sambalpuri. In Proceedings of LTC-2015, Poland, Springer Verlag. Accessed on 23.02.2016 http://ltc.amu.edu.pl/book/papers/LRL-13.pdf. 2015

[3] Behera, P. & Ojha, A. K. Developing an Automated SVM POS Tagger for Sambalpuri: the Case of a Lesser-known Language. In Proceedings of ELKL-4, 2016, Cambridge Scholars Publishing (to be published), India. 2016

[4] Behera, P. Issues and Challenges in Corpus Collection and Annotation of Sambalpuri: the Case of a Lesser-known Language. In Proceedings of ELKL-4, 2016, Cambridge Scholars Publishing (to be published), India. 2016

[5] Buseman, A. & Buseman, K. Toolbox Self-Training: How to use the Field Linguist’s Toolbox, Version 1.5.9 Ma. 2011

[6] Coward, D. E. & Grimes, C. E. Making Dictionaries. North Carolina: SIL. 2000

[7] Himmelmann, N. P. Language Documentation: What is it and what is it good for. Essentials of language documentation, 178, 1. 2006

[8] Jha, G. N., Hellan, L., Beermann, D., Singh, S., Behera, P. & Banerjee, E. Indian Languages on the TypeCraft Platform: The Case of Hindi and Odia. In Proceedings of WILDRE-2014 (ISBN: 978-2-9517408-8-4), Reykjavik, Iceland. Accessed on 23.02.2016 http://www.lrec-conf.org/proceedings/lrec2014/workshops/LREC2014Workshop-WILDRE%20Proceedings.pdf. 2014

[9] Kushal, G. Case and Agreement in Sambalpuri. Centre for Linguistics, Jawaharlal Nehru University. 2015

[10] Mathai, E. K. & Kelsall, J. Sambalpuri of Orissa, India: A Brief Sociolinguistic Survey. SIL International. 2013

[11] McEnery, T., Baker, P. & Burnard, L. Corpus resources and minority language engineering. In LREC. 2000

[12] Ostler, N. Language technology and the Smaller Language. ELRA Newsletter, 4(2). 1999

[13] Ogilvie, S. Linguistics, Lexicography, and the Revitalization of Endangered Languages. International Journal of Lexicography, Vol. 24 No. 4, pp. 389–404. doi:10.1093/ijl/ecr019. 2011

[14] Ojha, A. K., Behera, P., Singh, S. & Jha, G. N. Training & Evaluation of POS Taggers in Indo-Aryan Languages: A Case of Hindi, Odia and Bhojpuri. In Proceedings of LTC-2015, Poland, Springer Verlag. Accessed on 23.02.2016 http://ltc.amu.edu.pl/book/papers/TANO2-2.pdf. 2015

[15] Patel, Kunjabana. A Sambalpuri Phonetic Reader. Sambalpur: Menaka Prakashani. 2017