MLCP, 2007-11-02 Computational Linguistics and Mayan Languages Michael Gasser School of Informatics Indiana University

MLCP, 2007-11-02

Computational Linguistics and Mayan Languages

Michael Gasser
School of Informatics
Indiana University

Languages

2

• A language is

- A set of conventions shared by a linguistic community concerning pronunciation, vocabulary, grammar, and usage. Conventions may differ across functions and genres: informal conversation, formal oratory.

- A set of lexical and grammatical categories that constitute a way of slicing up the “world”, an interpretation of reality. Examples: a single verb “eat” in English and similar languages vs. multiple words in Mayan languages, depending on the nature of the thing eaten (eat fruit, eat meat, eat soft thing); shape vs. material biases in English and similar languages vs. Yucatec Mayan; ergative vs. nominative languages.

- All of the available data in the language (uses by native speakers/writers in context)

Linguistic revolutions

• Linguistic revolutions — Writing, Printing, Computing

- Changed the nature of language

- Changed individual languages

3

The First and Second Linguistic Revolutions: Writing, Printing

• Changes in languages that were written

- New functions, genres (bookkeeping, the letter, the receipt, the novel, the parking ticket, the crib sheet)

- New processing constraints and possibilities: new grammatical constructions

- Increase in the database

4

The Second Linguistic Revolution: Printing

• By greatly facilitating copying, greatly increased speed and range of access to knowledge for literate native readers of written languages, but access to printed material still restricted even among this population

• Increased population capable of creating accessible knowledge, but, because of costs, left this still quite limited

5

Privileged and disadvantaged languages

6

• At the dawn of the Digital Revolution

- Most languages remained essentially unwritten

- Most speakers of many written languages were illiterate

- Among written languages, the availability of material varied greatly (not necessarily as a function of the number of speakers: Danish, Finnish, Hebrew vs. Hindi, Bengali, Indonesian, Swahili)

- Many speakers of disadvantaged languages relied on other languages for important functions such as accessing archived information

- Speakers of disadvantaged languages remained handicapped in participation in national and global debates about the future

The Third Linguistic Revolution: the Digital Revolution

• Changes in the affected languages

- New functions and genres (email, chat, blogs, text messaging)

- Massive increase in the database

• The promise for disadvantaged languages

- Democratic nature of the Internet

- Possibility of bypassing printing

- Possibility of partially automating translation

- Computer-assisted literacy training

7

The “Information Society”

• World Summit on the Information Society (WSIS) principles (Geneva, 2003; Tunis, 2005)

- 3: “The ability for all to access and contribute information, ideas and knowledge is essential in an inclusive Information Society.”

- 8: “The Information Society should be founded on and stimulate respect for cultural identity, cultural and linguistic diversity, traditions and religions, and foster dialogue among cultures and civilizations. The promotion, affirmation and preservation of diverse cultural identities and languages ... will further enrich the Information Society.”

8

The reality

• One language dominates the Internet (~70% of web pages). 12 languages account for ~97% of all web pages. (O’Neill, Lavoie, Bennett)

• Distribution corresponds closely to that in the world’s libraries.

9

The reality: Wikipedia

• 251 languages

• English: 1,743,312 articles

• Swedish: 222,821 articles

• Quechua: 2,102 articles

• 11 other indigenous American languages (only 4 with more than 100 articles)

10

Language and the Internet

• Even some communities that share a language other than English (e.g., Panjabi speakers) use English for email and chat. (Paolillo)

• Even for languages such as Spanish, resources may focus on music, dancing, food, shopping (in fact may be catering to “cultural tourists”). (Clark & Gorski)

• The adoption of encoding standards and keyboards for some scripts lags behind.

• Programming and markup languages are based on English. (Paolillo)

11

The Linguistic Digital Divide (LDD)

• Relative lack of documents and computational resources in disadvantaged languages

• Denies revolutionary advances in access, creation, collaboration to the majority

• Inhibits the participation of the majority in solving urgent national and international problems

• Exaggerates class divisions within linguistic communities

• Diminishes the role of many languages

- “Strong” international, national, and regional languages (Thai, Tamil, Amharic, Swahili, etc.)

- Languages already marginalized within their own countries

12

Causes of the LDD

• Lack of users

• Lack of power and financial resources

• Linguistic imperialism, chauvinism

13

Bridging the LDD

• Have everybody learn English.

- It doesn’t work. (Brock-Utne: The Recolonization of the African Mind)

- It relegates all other languages to a secondary role: violates WSIS Principle 8.

• Create documents in and tools for under-represented languages, including mediating interpreters

14

A role for translation?

15

• Translation and the spread of knowledge in the Middle Ages

- Greek to Arabic in 8th-10th century Baghdad

- Arabic and Hebrew to Latin and Spanish in 12th-13th century Toledo

• Translation could bridge the divide by making more documents available in disadvantaged languages and giving speakers/writers of these languages a voice.

Translation and the LDD

English

K’iche’

16

Translation and the LDD

English

K’iche’

17

Translation

• The enormity of the problem

- Hundreds of languages

- Millions of documents

• Machine translation

18

Overview of machine translation

• As in other fields within computational linguistics, two classes of methods

- Symbolic/linguistic: grammatical rules and explicit lexicons for each language; explicit correspondence rules between the languages. Based on grammars, dictionaries, and linguistic theories.

- Statistical: co-occurrence regularities between words (within and between languages) are learned. Based on large corpora of texts (monolingual and bilingual) and on theories from machine learning (AI) and Bayesian statistics.

• The problem of integrating the two classes of methods
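
To make the statistical idea concrete, here is a minimal sketch of learning word correspondences from co-occurrence counts in a bilingual corpus. The four English-Spanish sentence pairs are invented for illustration, and the Dice coefficient is a swapped-in, very simple association measure; the Bayesian models the slide alludes to estimate translation probabilities and are considerably more involved.

```python
from collections import Counter
from itertools import product

# Invented toy parallel corpus (English, Spanish); not data from the talk.
PARALLEL = [
    ("the house", "la casa"),
    ("the white house", "la casa blanca"),
    ("the white flower", "la flor blanca"),
    ("a flower", "una flor"),
]

def dice_table(pairs):
    """Score every (source word, target word) pair with the Dice coefficient."""
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        src_words, tgt_words = set(src.split()), set(tgt.split())
        src_count.update(src_words)   # sentences containing each source word
        tgt_count.update(tgt_words)   # sentences containing each target word
        joint.update(product(src_words, tgt_words))  # sentence pairs containing both
    return {
        (s, t): 2 * n / (src_count[s] + tgt_count[t])
        for (s, t), n in joint.items()
    }

def best_translation(word, scores):
    """Return the target word most strongly associated with `word`."""
    candidates = {t: v for (s, t), v in scores.items() if s == word}
    return max(candidates, key=candidates.get)

scores = dice_table(PARALLEL)
for word in ("house", "white", "flower"):
    print(word, "->", best_translation(word, scores))
```

Even on this tiny corpus the frequent article "la" co-occurs with almost everything, which is why the score normalizes by how often each word appears overall rather than using raw counts.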

19

Overview of machine translation

• Transfer architectures

• Interlingua architectures
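
The slide only names the two architectures. As a rough sketch of the transfer idea, the source sentence is analyzed, a language-pair-specific rule restructures it, and the target sentence is generated. The English-Spanish lexicon, the single reordering rule, and all function names below are hypothetical and chosen only for illustration. An interlingua system would instead map the analysis to a language-neutral representation and generate from that, with no pair-specific rules.

```python
# Hypothetical toy lexicon (English -> Spanish); entries chosen for illustration.
NOUNS = {"house": ("casa", "f"), "book": ("libro", "m")}
ADJECTIVES = {"white": {"m": "blanco", "f": "blanca"},
              "red": {"m": "rojo", "f": "roja"}}
ARTICLES = {"the": {"m": "el", "f": "la"}}

def analyze(sentence):
    """Analysis: tag each English word as DET, ADJ, or N."""
    tagged = []
    for word in sentence.lower().split():
        if word in ARTICLES:
            tagged.append(("DET", word))
        elif word in ADJECTIVES:
            tagged.append(("ADJ", word))
        elif word in NOUNS:
            tagged.append(("N", word))
        else:
            raise ValueError(f"unknown word: {word}")
    return tagged

def transfer(tagged):
    """Transfer: reorder English DET ADJ N into Spanish DET N ADJ."""
    if [tag for tag, _ in tagged] == ["DET", "ADJ", "N"]:
        det, adj, noun = tagged
        return [det, noun, adj]
    return tagged

def generate(tagged):
    """Generation: pick Spanish forms that agree with the noun's gender."""
    gender = next(NOUNS[word][1] for tag, word in tagged if tag == "N")
    out = []
    for tag, word in tagged:
        if tag == "DET":
            out.append(ARTICLES[word][gender])
        elif tag == "ADJ":
            out.append(ADJECTIVES[word][gender])
        else:
            out.append(NOUNS[word][0])
    return " ".join(out)

def translate(sentence):
    return generate(transfer(analyze(sentence)))

print(translate("the white house"))
print(translate("the red book"))
```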

20

Machine translation

• Sophisticated machine translation relies on

- Explicit grammars and lexicons and

- Training the system on monolingual and bilingual texts

• Quality of translation depends on

- Distance between languages

- Breadth of content domain

• In general, human intervention is still required

• Because of the background knowledge that seems essential for translation, the original goals of MT will probably never be realized.

• Toward appropriate and efficient forms of collaboration between people and machines (Kay)

21

L3: long-term goals

• Translation

- Between “disadvantaged” languages (DLs) of the Global South and “privileged” languages of the Global North

- Among the DLs

• Computational tools to aid in teaching the DLs

• Software to facilitate the creation of virtual communities of “experts” (teachers, writers, students, linguists) on DLs

- Strengthening the language

- Providing texts for training MT system

- Providing feedback for MT system

22

L3

• Collaboration between

- Computational linguists and

- Members of the linguistic communities themselves, who define the content areas for translation and are responsible for the quality of the final translations

23

Machine translation and disadvantaged languages

• Much research on privileged languages

- English, French, German, Spanish, Russian, Dutch, Portuguese, Italian, Chinese, Japanese, Korean

• Some research on major languages of the Global South, in countries with significant research facilities, and on languages deemed “critical” by the US government

- Arabic, Farsi, Hindi

• For the majority of languages, we only have at best dictionaries and a few other resources

24

The situation in Guatemala

• Roughly half of the population of 12,800,000 is indigenous (Mayan), speaking ~20 languages, each with between ~1,000 and more than 1,000,000 speakers.

• A significant number of Mayans are monolingual.

• Most Mayans are not literate in their mother tongue; literacy in Spanish is more common.

• Officially the languages are recognized and promoted; in practice there is not much actual support for this.

• There are now some bilingual schools.

25

The situation in Guatemala

• A semi-governmental organization, the Academia de las Lenguas Mayas de Guatemala, oversees language-related issues and is involved in translation, education, and the production of materials, but there is little funding for its work.

• Other independent organizations working on Mayan language issues, including Asociación Ajb’atz’ Enlace Quiché, an NGO dedicated to using technology for teaching and strengthening the languages.

• Small number of online resources, a few monolingual and bilingual texts, bilingual dictionaries, teaching materials.

26

From Loq’aläj täq Mayab’ kunab’äl (Mayan Medicine)

27

LOQ’ALÄJ TÄQ MAYAB’ KUNAB’ÄL

201=

ANIX

UPETIK UB’ANTAJIK

We q’ayes kunab’äl ri’, man kk’iy ta ulöq utukel. Je wa’ kpetik: ri rulewal are

täq le xoral, le uxäq laj täq xerxob’ uwach.

RI KUKUNAJ

! Pamaj.

! Ib’och’.

! Kuk’iysaj upam le tu’,

RI UKOJIK

! Chi rech le pamaj: Kpoq’owisäx jun

laj jub’utzaj pa jun xa’r k’a te k’u ri’

ktzaq jub’iq’ tzam ruk’, ktijow jun qumb’äl ronojel nïm aq’ab’

b’elejeb’ q’ij ktijowik.

! Chi rech le ib’och’: Kpoq’owisäx jun laj jub’utzaj pa jun xa’r, ktijow

xäq pa chi kech kajb’äl, jun qumb’äl ktijowik ronojel q’ij, lajuj q’ij

ktaqexïk.

! Chi rech upam le tu’: Ktijowik are chi’ ke’l ulöq ri ixöq pa tuj,

kutïj jun qumb’äl chi rech pa waq’ib’ q’ij.

APACÍN

UPETIK UB’ANTAJIK

We q’ayes ri’ sib’aläj nima’q raqan le uxaq xuquje’ räx kka’yik le uwachib’äl

le uche’al, kk’iy ulöq pa joron juyub’ xuquje’ pa meq’ïn juyub’.

RI KUKUNAJ

! Le öj.

! Q’oxom jolomaj.

! Q’oxom wareyaj.

! Ch’a’k rech palajaj.

From Poemas infantiles

28

EL RÍO
Me gusta tu belleza
No más que tu pureza
Lo digo con ternura
Me encanta tu dulzura.

Cada vez al mirarte
No dejo de exclamarte
Lo bello y vitalizante
Que es tu mundo fascinante.

LE JA’
Ütz kinwil ri je’l apetik
Rumal ri asaqil
Kinb’ij ruk’ chuch’jal
Sib’aläj kwaj ri a ki’al.

Ronojel le q’ij chi’ katinwilo
Loq’ ta chik kinya’ kan rilik
Ri aje’lik xuquje’ ri k’aslik
Are b’a wa’ nimaläj ak’olb’al.

The situation in Guatemala

• Internet cafes fairly common, home computers not

• Cellphones everywhere

• In order to benefit from the Digital Revolution, Mayan language communities need

- Access to technology

- Literacy in their mother tongues

- Many more documents online

- Ways to interact with speakers of other languages

29

L3 and Mayan languages: short-term goals

• Morphological parsers and generators for four largest languages (K’iche’, Kaqchikel, Mam, Q’eqchi’)

• Translation of simple sentences in a narrow content domain among these languages.
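
As a rough illustration of what a morphological parser does, the sketch below segments a K’iche’ transitive verb into slots (aspect, absolutive, ergative, root, status suffix). The affix tables are a deliberately tiny, simplified inventory and are an assumption made for this example, not the L3 project's analysis; a real parser would need a full grammar and lexicon. The two test words, katinwilo and kinwil, are taken from the poem excerpt shown earlier in the talk.

```python
# Simplified, partial slot tables for K'iche' transitive verbs.
# Assumption for illustration only; not the talk's actual analysis.
SLOTS = [
    ("aspect", {"k": "INCOMPLETIVE", "x": "COMPLETIVE"}),
    ("absolutive", {"in": "1SG.ABS", "at": "2SG.ABS", "": "3SG.ABS"}),
    ("ergative", {"inw": "1SG.ERG", "aw": "2SG.ERG"}),  # forms used before vowels
    ("root", {"il": "see"}),
    ("status", {"o": "STATUS", "": None}),  # suffix may be absent
]

def parse(word, slot=0):
    """Return all analyses of `word` as lists of (morph, gloss) pairs."""
    if slot == len(SLOTS):
        # Succeed only if every character has been consumed.
        return [[]] if word == "" else []
    analyses = []
    for morph, gloss in SLOTS[slot][1].items():
        if word.startswith(morph):
            for tail in parse(word[len(morph):], slot + 1):
                # Show zero morphemes as "Ø"; skip an absent optional suffix.
                head = [(morph or "Ø", gloss)] if gloss else []
                analyses.append(head + tail)
    return analyses

def gloss(word):
    """Return one hyphen-joined gloss string per analysis."""
    return ["-".join(g for _, g in analysis) for analysis in parse(word)]

for verb in ("katinwilo", "kinwil"):
    print(verb, gloss(verb))
```

Because the parser returns every segmentation that fits the slot tables, it also exposes ambiguity, which is one place where feedback from speakers of the language becomes essential.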

30

Thank you! ¡Maltyox! ¡Matyox! ¡Chjonta!

31