ो. 34 ो. 34 ो. 34 -...
TRANSCRIPT
-
Page | 0
---- (((( $$$$ &&&& ))))
LEXICAL AMBIGUITY IN HINDI-MARATHI MACHINE TRANSLATION (IN THE
CONTEXT OF HOMONYMS)
((((....++++.... ---- //// ---- ////))))
4444
//// 55556666
----89898989 ----89898989
////.... ....----
? ? ? ?
, , , , AAAA 8888 CCCC
EEEE FGFHFGFHFGFHFGFH
JJJJ 110067110067110067110067
2009200920092009
-
Page | 1
RRRR
U$U$U$U$
. . .. . .. . .. . .
-
Page | 2
Tkokgjyky usg: foTkokgjyky usg: foTkokgjyky usg: foTkokgjyky usg: fo''''ofo|ky;ofo|ky;ofo|ky;ofo|ky;
JAWAHARLAL NEHRU UNIVERSITY
Centre of Indian Languages,
School of Language, Literature & Culture Studies,
New Delhi 110067, India
_______________________________________________________________________________
Centre of Indian Languages DATE: - th, July 2009
DECLARATION
I declare that the material in this dissertation entitled in Hindi ----
(((( $$$$ &&&& )))) in Hindi
Transcription HINDI-MARATHI MASHINI ANUVAAD MEIN SHAABDIK ASPASHTATA
(SAMAAN UCHHARAN VAALE SHABDON KE SANDARBHA MEIN) in English LEXICAL
AMBIGUITY IN HINDI-MARATHI MACHINE TRANSLATION (IN THE CONTEXT OF
HOMONYMS) submitted by me is an original research work and has not been
previously submitted for any other degree in this or any other university /institution.
Kamble Prakash Abhimannu
RESEARCH SCHOLAR
PROF. CHAMAN LAL, SUPERVISOR, CIL / SLL&CS, Jawaharlal Nehru University, New Delhi 110067
Dr. GIRISH NATH JHA, JOINT-SUPERVISOR, Special Centre for Sanskrit Studies, Jawaharlal Nehru University, New Delhi 110067
PROF. CHAMAN LAL, CHAIRPERSON, CIL / SLL&CS, Jawaharlal Nehru University, New Delhi 110067
GRAM:JAYENU TEL.:(91-001) 6107676, 6167557TELEX:031-73167 JNU IN FAX:91-011-6865886
-
Page | 3
VVVV
. . . . . . . . . . . . . . . . . . . . . . . . . 3-33
WXYWXYWXYWXY XXXX////////-\-\-\-\ ]]]] VVVV. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3ii-3iii
FFFF //// . . . . . . . . . . . . . . . . . . . . . . . . . . 01 -03
CCCC:::: ____ . . . . . . . . . . . . . . .. . . . . . . . . . . . . . .. . . . . . . . . . . . . . .. . . . . . . . . . . . . . . 04 22
//// CCCC . . . . . . . . . . . . . . . . . . . . . . . . . . 05-05
1. RRRR . . . . . . . . . . . . . . . . . . . . . . . . 05-06
1.1 ____ /+V/+V/+V/+V WXYWXYWXYWXY 8888 . . . . . . . . . . . . . . . . . . . . . 07-10
1.2 ____ bbbb . . . . . . . . . . . . . . . . . . . . . . . . 10-10
1.3 ........ cccc FdFdFdFd (The Source Language Analysis) ____ bbbb 11-11
2. ____ . . . . . . . . . . . . . . . . . . . . . . . . . . 11-22
2.1. & & & & (pre -Translation Problems) . . . . . . . . . . . 1212
2.2. ____ (Translation Problems) . . . . . . . . . . . . . . 1215
2.3. gggg (Post-Translation Problems) . . . . . . . . . . . 1515
2.4. ____ (Translator/User problems). . . . . . . . . . . 15-16
3. GGGG ____ . . . . . . . . . . . . . . . . . . . . . . . . . 16-17
4. ____ . . . . . . . . . . . . . . . . . . . . . . . . . . 17-22
CCCC:::: &&&& $$$$
. . . . . . . . . . . . . . .. . . . . . . . . . . . . . .. . . . . . . . . . . . . . .. . . . . . . . . . . . . . . 2361
-
Page | 4
CCCC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 - 24
1. &&&& RRRR RRRR . . . . . . . . . . . . . . . . . . . . . . . . . 25 - 31
2. 6 5 & $ 6 5 & $ 6 5 & $ 6 5 & $ . . . . . . . . . . . . 31- 36
3. $$$$ //// jjjjkkkk \\\\ . . . . . . . . . . . . . 36 - 40
4. llll
$ _$ _$ _$ _ . . . . . . . . . . . . . . . . . . . . . . . . . . 40 - 61
4.1 llll & $ _ & $ _ & $ _ & $ _ 40-49
4.2 l & $ _ l & $ _ l & $ _ l & $ _ 49-61
CCCC :::: ---- $$$$ ____
FFFF \\\\ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62119
C C C C . . . . . . . . . . . . . . . . . . . . . . . . 6363
1. &&&& //// &&&& FFFF . . . 64-74
2. 5F5F5F5F , , 56&56&56&56& . . . . . . . . . . . . . . . . . 74-91
3. 56565656 5F5F5F5F, , 56&56&56&56& . . . . . . . . . . . . . . . . . 91-103
4. mmmm &&&& . . . . . . . . . . . . . . . . . . . . . 104-119
CCCC : : : : &&&& $$$$ ____ ____ 8888 120-130
CCCC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121-121
1. &&&& ____ F5F5F5F5 . . . . . . . . . . . . . . . . . . . . . . . . . . 122-127
2. F-F-F-F- . . . . . . . . . . . . . . . . . . . . . . . . . . 128-130
3. . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . 130-140
-
Page | 5
1. ____ 5555 . . . . . 130- 136
2. 5555136-140
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . .. . . .. . . . 141-148
mVkmVkmVkmVk. . . . . . . . . . . . . . . . . . . . . . . . . .. . . .. . . . . . . . . . . . . . . . . . . . . . . . . .. . . .. . . . . . . . . . . . . . . . . . . . . . . . . .. . . .. . . . . . . . . . . . . . . . . . . . . . . . . .. . . . 149-156
1.... mmmm . . . . . . . . . . . . . . . . . . . . . . . . 150-150
2....&&&& mmmm, , , , , , , , p\p\p\p\ . . . . . . . . . . . . . . . . . . . . . . . . . 151-155
3.... &&&& . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155-156
-
Page | 6
A &q FGFH ... J
+ U FGFH " /H-_
(Translation Technology)" + .. .+
-/ U ?, ...
5 F / js /.
.-
u & +
F _ 4 A C U-
/. & _ /R +
- - "- (
56& $ & )" F -/ _
U- A& +
-/ & &- x -/
89 5 ./.- , _ u
J + -/ X\
u / /+V ...
... / 5 X\ &
5 /R + &
y _ u F5 ? _
_ 8 u + p\ -
u F /
& F5 ? 5\$ / ,
6$ & 5\, , && U &
{F k, 5x, ?,
/ U
-
Page | 7
& & U m$ _ F U
5 1 & - .p\ " 2.
Learning Form-Meaning Mappings in Presence of Homonymy: a linguistically motivated model of learning inflection by Katya Pertsova +$ |8
R- & 5 5 u j}
& |8 5 5 &
$ 5 $ b &
5 ... l & _ F
m F F 5 u _
_ 6 ~ C
, FG, , F/, & /
_ /, , F
& $ $ 5 _ _ m
A - $
- $ _ 5 ? ,J (), F5
C ?(), ..., ? U
FGFH A& 5 &R$
A& 5 u
j}
& 5\ 5 -F
- && -
& /8 _ /R + &5\$ 5,
, , Y, + F 8 & _ /
_ U- +
8 F, , R, k,
, , 6$ X\
& & & _ /
-
Page | 8
A& _ +
$ FdF _ & / 5 8
& F(js, ? ,J), (.),
(.), & (js) / U /
& &
u & + RX C 8
+ & F &
F U & $ & $ \
u & & u + &
& E J 6
& $ & 5 &
-/ & /. 8C & + -
/ 5 /.-
& 89 /8 u & x| /
/ 56
X s 28, R \,
U FGFH,
J -110067
-
Page | 9
WXYWXYWXYWXY XXXX //// //// -\-\-\-\
WXYWXYWXYWXY XXXX
No  Abbreviation  Description
1   ALPAC    Automatic Language Processing Advisory Committee
2   COLING   Computational Linguistics
3   C-DAC    Centre for Development of Advanced Computing
4   IIT      Indian Institute of Technology
5   JSP      Java Server Pages
6   NLP      Natural Language Processing
7   PL       Plural
8   OBJ      Object
9   POS      Part-of-speech tagging (POS tagging or POST)
10  SG       Singular
11  STT      Speech to Text
12  TDIL     Technology Development for Indian Languages
13  TTS      Text to Speech
14  VR       Verb Root
15  R&D      Research and Development
16  AI       Artificial Intelligence
17  AUX      Auxiliary
18  PREF     Prefix
19  SG       Singular
20  VR       Verb Root
21  SUFF     Suffix
22  SUBJ     Subject
23  OP       Output
24  FL       Formed language
25  LTs      Language Tools
26  MT       Machine Translation
WXYX 27 ........ 28 N 29 PRN & 30 ADJ F 31 V +V 32 NP 8 33 ADV +V-F
-
Page | 10
Name Page NO.
1. Table No. 1: Marathi-Hindi MT is one of the major language pairs in this project . . . . . . . . . . 3
2. Table No. 2: Types of homonyms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3. Table No. 3 - & $ - . . . . . . . . . . . . . . . . . . . . . . . . . . 73
4. Table No. 4. F $ - . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5. Table No. 5 5 & 89 . . . . . . . . . . . . . 124
6. Table No. 6 j} 6A & 89 . . . . . . . . . . . . . . . . . 126
7. /8 & 5 6 & 127
-\-\-\-\
1. Graph No. 1: Process of MT . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2. Graph No. 2: Error analysis, framework example . . . . . . . . . . . . 54
3. Graph No. 3: Process of the Homonymy Word Translation Tool (Hindi to Marathi) . . . . . 129
4. Graph No. 4: CLIR schematic for Marathi input . . . . . . . . . . . . . 135
-
Page | 11
FFFF ////
8_ Fl$ _
F F _ C + +
/H-_(Information Technology)
& G X\ / 8 U
u, 8 56&
(Homonymy), l&(Syntactical Ambiguity), &
l&(Referential Ambiguity), (Fuzzy), , , (Metaphors
and Symbols), F5 (New Vocabulary Development)
Homonymy _ / , G F +
b + F Homonym m
Homo + onyms $ && & &
56& , 6 , x, l +
(Human Translation), (interpretation), (Language
Acquisition), p\ F| 8&-& F$ (Artificial Intelligence specialists)
56& $ (Homonymy) & _
A6 m F|, & $ _ R 8F-$
$ & /+V & U R ,
F5 , G F _ &
8 5 _ +
F & U F5 u _
$ (Homonyms) _ U
$ 8 _ FG - \$
8& G _
-
Page | 12
/ C _ & _
WXY 8 /8 () _
& _ A \$ b
_ G .. _
.. _ F +
l C / F 56& ,
56& $ / 56& $ 6
jk \$ & _ Homonymy $ /$ _
l + + / A6 ,
+ / 8 F +
C 56& $ F $ + .
5F , , 56& . 56 5F, ,
56& F $ - $ + 5 $
F & - ~ + Homonymy $
jk $ _ + , &
- /8 F $
56& $ _ 5 (Language
Tool) 8& U A&
& C s U y /m5 5 &
$ + _ & + /
_ /m5 _ 89 Homonymy $ F5
(Special Dictionary) _ &
jk 8$ _ -F- 6 _
8 C + &
+ / 5|$ 5 - 5 -
-
Page | 13
- - - + /
& _ & C$ 5
_ b F +
2006 _ A
_ - / \ _ R
8& G 8 +
& _ \ 8& _
R 5 J & X\ 5 _
A F
-
Page | 14
Table No. 1: Marathi-Hindi MT is one of the major language pairs in this project.1
CCCC
____
FFFF ////
CCCC
1111.... RRRR
1.1 ____ /+V/+V/+V/+V WXYWXYWXYWXY 8888
1.2 ____ bbbb
1.3 ..... . . . cccc FdFdFdFd (The Source Language Analysis) ____
bbbb
2222.... ____
2.1. & & & & (pre -Translation Problems)
2.2. ____ (Translation Problems)
2.3. gggg (Post-Translation Problems)
2.4. ____ (Translator/User problems)
3. GGGG ____
4. ____
1 paradigm Shift Of Language Technology initiatives Under Programme - FG@TDIL-page 4
-
Page | 15
CCCC
C _ /
_ 6 , WXY 8, _ /+V
/8 (NLP) & _ _
- - b
A _ b $
F + ) G _ )
_ .. _ (Human
Translation) (Machine Translation) _
$ + . _ F
& b _
6 & _
-
Page | 16
1. RRRR
/ u /
U + _ y C
& + F5 y /m l 5 + +
$ , 8,
/ u $ y F| 8 +
8, _ y : _
y + p
y b &: .. 8&
& 8&g&
\ 5 u
5 \ 8& & _ F b
_ J j U 5 .. _
R 8 U u
/. 5 _ 6 R + _
/+V y /(system) R
, /+V _ m(Text) (Input) U
y _ / $ $, jk 8$
- , m X$
8& (output) U m /Y 2
2 /. 5, , /H-_, p\, F,
/-
-
Page | 17
. Y .. _ R / /+V
$ (c ) (X ) y C
Machine Translation (MT) is the process of translating text units of
one language (source language) into a second language (target language) by using
computers.3
&} R + : (\) y
/8 /8
\ .. _ -
.. $ F + u
1) &: (Fully Machine Translation)
2) - (Human Aided Machine Translation)
3) - (Machine Aided Human Translation)4
1.1 ____ /+V/+V/+V/+V WXWXWXWXYYYY 8888
FG 8 F + 8
& \ 8& /
& FGFH 5 U J R& (ALPAC
Report1966) & .. & _
: .. & _ 8 8 + FG
\ U & u \
\ 8& 8& +
.. \ u \$ / u GAT,
3 Contributions to English to Hindi Machine Translation Using Example-Based Approach, Deepa Gupta, Department of Mathematics, Indian Institute of Technology Delhi, Hauz Khas, New Delhi-110016, India, January 2005
4 Punjabi Text Generation using Interlingua approach in Machine Translation (A thesis) by Sachin Kalra, page No. 10
-
Page | 18
SYSTRAN, LOGOS, METEL, TAUM-METEO, EUROTRA, ATLAS-I, ATLAS-II, TAURUS,
PIVOT, ALPA, LOGOS, CULT, TITUS, ARINC-78, METAL, MU, MT.S-NCST 5
_ / U 1983 WX 5
FGFH 5-5 TUMTS \ 5 U (In India research on
MT began in 1983 at the Tamil University in South India. The TUMTS system is a
small-scale 'direct translation' system specifically designed for Russian as SL and
Tamil as TL and running on a small microcomputer.6) + \ _ X _
\ 8& (Output) - 5
_ F- U .. U X m
\
\$ & + u ()MAT, ()ANUVADAK,
(\)MANTRA, (5)SIVA, (\)MANTR, (-6\)ANUVADAYAN- TR (CIILM)
(m-) ()ANUSARAKA ( 5), (}) Shakti (m-
, , 5) 5
\ _ A& _ Indian Language to
Indian Language Machine Translation system 85& + u F
_ /+V - 8 & _ /+V F
+ .. _ /+V l + (Human Translation) _
/+V + .. _ /+V U
& _ /+V & 5 .. _
/+V - jk
& u 6
5 F 5| / ( .) _ - . F- .
6 Recent developments in Machine Translation a review of the last five years - W.John Hutchins, (University of East Anglia, Norwich)
-
Page | 19
-8 5 /+V U
/+V (Text) $(LTs)
_ & /+V WXY U 8
_ /+V : - 5 / y 5 F
jA y ? C y _
? 8\ , s 8 k
$ y _ /+V
8 & /A \
_ /+V U$ . \
. _ /+V _ /+V
6: _ /+V 8
/ 1.c (SOURCE LANGUAGE TEXT) 2.(ROOT WORD)
c $ - & 5
jk & \$ 5
jk & - \ 8& & 8 + \
+ j & :- 3. X j(POS TAG)
$ -6 + & 4. -6
CHANKAR MARKING 5.\ (PADSUTRA) /+V $ + +
6. /+V (WORD GROUPING) 7. F (SENSE
DISAMBIGUATION) F _ /+V A& /+V
8& _ &&
& 8& 8 /+V C
+ 8.&& 8F-(PREPOSITION MOVEMENTS) /+V
c U 5 5 $ / +
9. / (TARGET LANGUAGE GENERATION)
-
Page | 20
/ 8& & 5 u j m _ 10. j m (CORPORA) F & U$ FdF
U Fd 11.E8 Fd MORPHOLOGICAL ANALYSER 12.Fd(PARSER) 8& /5& 5
JR5 F 13.& &(GENERATOR) 14.8& 8 jk, 5,
$ _ /+V 8 _ 6 /+V u + & 8& (Output) / _ $ 8R} 6 /+V $ & A&
(Graph No. 1: process of MT, ref. by International Machine Translation)7
.. $ (Tools) (Lexical Data) 8 8 /Y _ C + c
7 Parallel MTS for Indian Languages, S. Raman and N. Rama Krishna Reday, IIT Madras, IJT Vol. No. 7, Jan-Dec 1995
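The processing steps listed above (source language text, root word, POS tag, and so on up to target language generation) can be read as a chain of stages, each adding one annotation to the tokens it receives. The following is only a minimal sketch of that idea for the first two stages; the function names, the toy root table and the toy tag table are assumptions made for illustration and are not taken from any of the systems named in this chapter.

```python
from typing import Callable, Dict, List

Token = Dict[str, str]
Stage = Callable[[List[Token]], List[Token]]

# Hypothetical toy resources, invented for this illustration only.
ROOTS = {"chairs": "chair", "rooms": "room"}
POS_TAGS = {"the": "DET", "chair": "N", "room": "N"}


def find_roots(tokens: List[Token]) -> List[Token]:
    """ROOT WORD stage: attach a root to every token (the form itself if unknown)."""
    return [{**t, "root": ROOTS.get(t["form"], t["form"])} for t in tokens]


def tag_pos(tokens: List[Token]) -> List[Token]:
    """POS TAG stage: attach a part-of-speech tag looked up by root."""
    return [{**t, "pos": POS_TAGS.get(t["root"], "UNK")} for t in tokens]


def run_pipeline(text: str, stages: List[Stage]) -> List[Token]:
    """Feed the source-language text through the stages in the given order."""
    tokens: List[Token] = [{"form": w} for w in text.lower().split()]
    for stage in stages:
        tokens = stage(tokens)
    return tokens


result = run_pipeline("The chairs", [find_roots, tag_pos])
```

Later stages such as sense disambiguation or target language generation would be further functions of the same shape appended to the list, which is why the order of the stages matters: `tag_pos` here relies on the `root` key written by `find_roots`.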
-
Page | 21
/+V & & 8&
- .. \
u - - l -, ... , -
l \, ... l \, 6 l ,
+ _ | _ -/8 u, _ /
1.2 ____ bbbb
F u F
8 - /8 $ \ F5
& 5 5| _
5 _ b /8 & -
8 &
1.3 ........ cccc FdFdFdFd (The Source Language Analysis) ____ bbbb
+ /m (\) 5 $ _ b _
/m & & /m 5
/... (NLP) $ 8
U u (A) (Recognition) (B) /+V(Understanding), (C)
(Generation) (D) /8 _ / U (/... Standard Paradigm)
(E) /} Fd (Discourse Analyser) (F) & Fd (Semantic Analyser) (G) U
Fd (Morphological Analyser) (H) c Fd (The Source Language
Analyser), (I) Fd (Syntactic Analyser) (J) / (Target
Language Generation Content Delimitation) (K) 6(Anaphora) (L)
(Syntactic Selection) (M) (Text Structuring) (N) &
(Constituent Ordering) (O) /8 (Realization) (P) (Lexical
Selection) F $ c Fd 5 +
-
Page | 22
F $ 6 $ _ b
8& /... + F $ , 5 ..
8& /... _ - b
2. ____
U F| /+V _ X :
y F| _ +
_ /+V \ $ J F /+V , , &
/8 jY 8| 5
_ J _ 8
/8 F| p\ F| _ /+V A$ /
5|$, j 8$ & F (logic science) 5|$
_ / U $ F +
u
2.1. & & & & (pre -Translation Problems)
2.2. ____ (Translation Problems)
2.3. gggg (Post-Translation Problems)
2.4. ____ (Translator/User problems)
_ $ &X j
A6 u /m A6 u,
85& u 6 8 U u
2.1. &&&& :-
8 y F / J\, F ] s -
-
Page | 23
& u
F _ - s 8
j u 5 .. -& U
:- MT is a very expensive endeavor, both in terms of the
software development effort required and in terms of the linguistic resources which
need to be assembled.9 y _ 5 ,
8 & y u R8 g&/& y _
\$ & &
2.2 ____ :- _ / U _ /+V
: $ F u
2.2.1 8 (Language problems)
2.2.2 jk (Grammar problems)
2.2.3 F $ 8& _ (for use and development of
language tool)
2.1 8888 :- 8 (Linguistic
problems In ..) Harold Somers 8 U F +
F _ - F U F 10
2.1.1 FdFdFdFd (/8/8/8/8 5555) problems Analysis
(Apply to all NLP):- () (Lexical) () (Syntactic), ()
&(Semantic)
2.1.2. ..... j8_ j8_ j8_ j8_ . (Contrastive problems In MT)
9 Prospects for Machine Translation of the Tamil Language, Fredric C. Gey, University of California, Berkeley
10 Machine Translation-2, Harold Somers, CCL, UMIST, Manchester M60 1QD, England; Australian Language Technology Summer School, 8-9 December 2003
-
Page | 24
() (Lexical), () (Structural), () & (Representational),
() /}(problems in Style/Register)
() & 56 (conceptual differences), () (Lexical gaps)
1. Structural divergence linked to lexical differences
2. Structural divergence linked to grammatical differences
3. Level shift: similar grammatical meanings conveyed by different devices
2.1.3 FdFdFdFd(Lexical Analysis)
2.1.3.1 F /+V (Segmentation)
2.1.1.1 U F _ (In Morphology):-
1. /&A UF(Functional Morphology)
2. jA UF (Derivational Morphology)
3. l4 UF(Ambiguous Morphology)
2.1.3.3 R- (Unknown words):- Misspelled, spelling, not in
dictionary because it is a regular derivation, proper name, or compound
2.1.3.4 (Lexical Ambiguity) :-
(1) A (Structural) (A) Unstructured
(2) (F)Real ()Accidental
(3) (5)Local G(Global)
(4) FdF Analytical () G Global Ambiguity
-
Page | 25
(5) (Shallow) (Deep Ambiguity) : -
6 Anaphora, ()Raising, ()Case-
U R (Frame ambiguity), (R )
Quantifier and g& R (operator scope)11
2.1.3.5 (Category Ambiguities):-
1) & (Homonymy):- (C6A ) Homophones,
Homographs ( 5F ) (proper names : - many proper names form
Homonyms with meaningful words)
2). R :- (Polysemy) F l +
U {F (85&) u
2.2 jk :- 5 / jk _ U
U jk 56
/ .. j / - u
/ u FUG (Kay, 1984), HPSG (Pollard and Sag, 1994),
LFG (Bresnan, 1982), TAG (Joshi and Schabes, 1992), Panini Grammar12
_ $ F jk - u
2.3 F $ 8& _ :- + .. 8&
F (Language Tools (LTs)) + \
8& 8 U u F $ _
_ A6 8& &
A $ & _
2.3.gggg :- g u +
/AX U WX (ignore)
11 Machine Translation-1, Harold Somers, CCL, UMIST, Manchester M60 1QD, England; Australian Language Technology Summer School, 8-9 December 2003
12 Computational Linguistics and Sanskrit, Dr. Girish N. Jha, Computational Linguistics, Special Centre for Sanskrit Studies, JNU, New Delhi-67
-
Page | 26
.. & /} F ; + 6 () --\
8& /} $ 8 + 8&
F 6
u, -
2.3.1.1 p\ (y)
2.3.1.2 5A F$ _ 8 _
2.3.1.3 y /m _
l 5 (output(OP)) 8& (Formed language) p\
5 j U / 5
F |
2.4. ____ :-
/A / + U //
g& _ u & 8& A6
\ 5 .. } 5
\ 8& & + / _
/ g& \
/ & (Translator/user)
u F & _ b _
F $ u, + _ \
A $ & & A 8
F8$ -8 Fd 5 +
& & U & &
-
Page | 27
R + & Fl A
\$ c \$ U u
3. GGGG ____
G A, j, -F X\ _ b
6 F8$ + A, F
l G _ _
G u FG _
\ U u 8
$ 8 j &, C8 j /& $
6 : / \ X + &
(incompleteness) F +
13 $+ y
&(incomplete) $ _ X F5
+
_ \ (quantity) + g(quality) \ \ /
c
_ 8 _ 8 X \
+ / +
& $ $ 8& 5 $ &
_ g \
/ .. _ J G
13 F 5| /( .) _ - . F-
.-
-
Page | 28
\$ _ u 8 $ $ _
_ 6 $ 5 G
F U G 8 / U 5 u 1)
5k +V R 2) 3) 8
/& (High function and low function term) 4) (Vagueness) 5) 6
(Anaphora) 6) (juncture) 7) _ - 5WX$ _
/U 8) -R& 9) F /& 10) $ F
(Segmentation of unit) F + +
C(mediator language) & C jAFg
(Derivational) c _ 5j} & 8UF (Represent)
+ 14 + & + &
C 8& / U G u
j F u, FG \$
U u
4. ____
_ X & U , &$
\ + +
_ & _
6 C + g& F 58 /5$
$ R$ $+ 5 R F5 +
6 F R F5 + /} -
15 .. 8& g& _ .. g& (.. users)
14 _ . &, / (y F-) /- 15 : R{b /. p\ (y F-) /-
-
Page | 29
u _ |8$ F +
$ } X\ }
U % m
(Sinha and Jain, 2003) b + m 5
\ 8& + Private research also in other countries but not in
India. Consequently, in a country like India, where English is understood by less than
3% of the population (Sinha and Jain, 2003), the need for developing MT systems for
translating from English into some native Indian languages is very acute.16
_
_ /+V - 8 / 5
A& + A /H-_(Language Technology)
/8 u /H-_ 8 _
5 - F5 Ab
/-/ FG _ 6 _ U
+ - F5
6 F - s
Y \$ _ c F +
- & m U
m _ 56
/ & m
\$ J _
Fl _ 5 .. _ J u +
5 \ 8& _ R8 To our knowledge
there is no direct machine translation software development being done on Indian
subcontinent languages, languages spoken by nearly twenty percent of the world's
people. The reasons are simple, the resources available for traditional machine
16 Contributions to English to Hindi Machine Translation Using Example-Based Approach, Deepa Gupta, Department of Mathematics, Indian Institute of Technology Delhi, Hauz Khas, New Delhi-110016, India, January 2005
-
Page | 30
translation are simply non-existent and the commercial viability of such
development seems dim.17 However, the new field of research into statistical
machine translation offers hopes that Tamil translation may be just around the
corner, to the pleasure and social benefit of all. l _
5 &: \ 8& + %
u + R \
+ F , /$ _
j8 - 5 U
_ + 5 \ U
A& Two major problems exist in connection with machine
translation and cross-language retrieval of Tamil (and other Indian languages). First
is the lack of machine-readable resources for either machine translation or cross-
language dictionary lookup. Such dictionaries need to be pared of poetic and
classical terminology and augmented with modern words as found in recent
newspaper and magazine texts. Tamil is a phonetic language. By default each Tamil
consonant is followed by an "a" vowel sound. The vowel sound is modified by the
addition of glyphs, such as the curlicue following the "m" which changes its sound
from "ma" to "mi". To suppress the vowel sound one places a dot over the consonant.
This means that borrowed words from English or other languages will sound
similar in Tamil to their native language (Malten 1996).18 5 \
F + 5 ( 6 ) \ /
- 5
- 5 $ jA
F5 \$ p\ 5 5 8 R
, 8 /A 5 j F
8 l R& + + \ C8 5
R8& FF 5 j p
& m 6 5 $ 5
X\ _ -$
17 Prospects for Machine Translation of the Tamil Language, Fredric C. Gey
18 Prospects for Machine Translation of the Tamil Language, Fredric C. Gey
-
Page | 31
_ & (-) F, $
8&, F & + _ b 5 ?F
R _ & R _ A&
-5 5- \$ +
5 5 \
+ & : & : F8$,
F8$, F8$ F8$ _ 5 /
- \ Fg F5 + -F$
- RWX HF F$ A
u, + g: 8 + y F F$
u19 -/A -, _ b $
.. _ &8 .. _ +
m \$ $ & \
- } - / , + & U
5 X + \$ 8& /8 U
\$ 8& s 8 U
1. F 2. F 3.c -
F5 4.& , 5. $ 6 _
6.$ $ _ , 7.F5 $ _
, 8. & 6 _ 9., /} _ &
_ 10. _ F $
& F5 + F l
\ 8& u
19
-
Page | 32
- X 5 +
\ /8 | _ /+V
5 .. _ /+V _ /+V $
8$ /8 & y /5 8 u
_ u .. l _ - 8$
8& (Output)
_ 8 + 5 , |
X\ _ $ _ -
F, $ 8&, & F &
+ _ b _ s +
, _ & &
+ + p\ F| (Artificial Intelligence) &
8U(Knowledge Representation) _ F- F5 _ u, _
/8 6 , & & /+V
y - 20 ..
$ {F 5 ,
_ J y _ }
- + 5
Fl$
5 / $ $
- F
5 U
/ - \$ 8& +
20 y F / F J\ ] s -
-
Page | 33
F 5
\ 5 Building translation system from
and to Tamil helps the Tamil community all over the world in accessing the
information in Tamil. 5 5 .. F &
5 FG -& & &
5 5 _ 5 5 .. _
b ..
R .(Fredric C. Gey).. $ & +
$ F5 _ Ab After having discussed
the various components of an MT system, and the resources that might be needed to
be built for MT 21
21 Prospects for Machine Translation of the Tamil Language Fredric C. Gey
-
Page | 34
CCCC
&&&& $$$$
CCCC
1. &&&& RRRR RRRR
2. 6 5 & $6 5 & $6 5 & $6 5 & $
3. &&&& $$$$ //// jkjkjkjk \\\\
4. llll &&&& $ _$ _$ _$ _
4.1 llll & $ _ & $ _ & $ _ & $ _
4.2 l l l l & $ _ & $ _ & $ _ & $ _
-
Page | 35
CCCC
l C / F & $ & F &
_ /: & + / 8& , /
& U u $
/A + / .. 8&
F _ + /F & _
8& l& .. 8&
& $ F & b $
F + & $ 6 jk \$
& _ & $ /$ l +
+ / 8& u
8 , F +
\ & U 8&
5 m- \$ & $
8& F +
\$ & $ 8 5 / +
+ & \$ _
s R{ 5 .. \$
C & _
-
Page | 36
1.&&&& RRRR RRRR
&q , F & 6 _
- C8 A
$ C8
C8 _ A R _
6 _ - u +
6 $()
? +
8 + 56 & 56 8 ?
8 + 6 $
56& \ u Fl$
+ 6 $
F _ +
b +
$
& } + l&
8&
5F 5 5F & _ 56
56 F-5; -; -: -;
-6 22 _ 8$ &
8& u - j} 8
5 5 , +
& j} $ ~ Fl , _
$ F & 5 C8
22
& 5 .p\ - 76
-
Page | 37
/ F , / F
& / J 5 -\ {b / U / +
23 y X$ -\$ U + -\$
~ y 5 & & 5 y
6 -\$ $
& , jk , s V
, 6 /H-_ 5
&
X$ -\ - -\$ 8& $
U , 6 & /Y $
$ jk /Y u $ 5 ~
$ / 5 /+V
U / + / _ U +
F / $ _ 8 Y6
U - A6 -
/F u
$
V $ F $
/ / + F5 +
F(concept), j, 8, , , + &
8& _
8 $ F R u $ R
{F u F {F F + X\ $
/ R u /: 8 $ $
23
8 F . 87
-
Page | 38
R- _ s F {F
$ l& 8&
u 6 $ -
Y u u + 6
$ l& (),
(g&), (), (8\), ( ), ()
_ {F $ F + -\ U
u / + u y
+ - & ? & 5
_ b _ ; F
F 5
+ $ _ (Homophony) &
(Homogaraphy ) ~ 5 A6 , /FF$
&-/ _ E 24 & $
R{ k8
Fl$ & + ..
k8 &/ m C
56& $ & _ in Sanskrit one comes across
discussion on homonymy among the suffixes. Pāṇini (6th cent. B.C.) treats them in his
Aṣṭādhyāyī 25 & $ & m j
&R $ &R( ) &/ F
j U F & + Among the Indian writers,
24
F 5| / (& 6? & 6 m) . 8, .
, .? , .G, .
25 Treatment of homonymy in Pāṇini Aṣṭādhyāyī, H.S. Ananthanarayana - page No. 50
-
Page | 39
Bhartrihari (7th cent. A.D.) was probably the first to discuss this question
systematically. 26 &R( ) & $ &
$ $ F + & $
l& + - & /F 5 &R
& U$ / syntactic
construction, / context, & goal, -A propriety, place,
time. Bhartrihari lists in his Vakyapadiya (2.314) six factors which may be used as
helpful cues in correctly interpreting the intended sense. They are: vakya (syntactic
construction), prakarana (context), artha (goal), aucitya (propriety), desa (place) and
kala (time).27 / &R $ &
_ & & 8& 56&
$ 8 5 + 5
\ &: \ &
U + \ C _ +
+ A6 $
8 J $ _ +
$ l /5
/+V - .p\
9 $ 8 C 5 F _
85 (Homonymics) F5 C
& $ 8& 8
-
1. J& 6(The world book encyclopedia):- _ 8
- $ C8 6 56 & _ 5F
m & $
26
] s - 49 27
] s - 50
-
Page | 40
Homonym is a language term for two or more words that have the same sound but
different meanings. The words may or may not be spelled alike. Homonyms are
common in the English language.28
2. . (M. Teresa Cabré) :- R F & $
C8 } & ( C8 )
& & ( 5F ) C8
+ 5F 56
Traditional linguistics draws a distinction within homonymy between phonological
homonymy (homophony) and orthographic homonymy (homography). Homophones
are units that are pronounced the same but are spelled differently, e.g. night/knight,
here/hear, bare/bear, whereas homographs are words that have the same spelling
but differ in origin.29
3. . :- $ 5 u &
6 & 56 u $ 56&, C6A,
C 30 m & (8 Homonyms)
u
4. . & (M. Teresa Cabré):- &
C8 $ U + & 56 u :- m
bank (u) p\ U 56
& :- bank + bank & u
( _ ) &
+ 5F 56 :- bare bear
& C8 $ R C8
:- mouth the mouth of the river, +
& $ & A word that sounds identical to 28
The world book encyclopedia volume 9 (H) page 275-276 29
Terminology: Theory, Methods and Applications - M. Teresa Cabré, page No. 111
30 , - . , ] s 28
-
Page | 41
another word with a different meaning; e.g. "bank", pertaining to the ground
bordering a river and also pertaining to a place for keeping money. A homonym may also
be a word that sounds identical to another word that is spelled differently; e.g. "bare"
and "bear". Homonym is distinct from polysemy or multiple meaning; e.g. "mouth", as
in "the mouth of the river", is a single word with two related meanings.31
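The distinction drawn in the definitions above, homonymy as separate words that happen to share a form and polysemy as one word with related senses, can be made concrete in the way a lexicon is stored. The sketch below is a hedged illustration only: the `Entry` class, the `LEXICON` list and the helper functions are hypothetical names, and the sense glosses are paraphrased from the English examples quoted in this chapter ("bank", "mouth").

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Entry:
    """One lexical entry: a spelling plus the related senses it carries."""
    spelling: str
    senses: List[str]


# Toy lexicon (an assumption of this sketch): homonyms get separate entries,
# a polysemous word gets a single entry with several related senses.
LEXICON: List[Entry] = [
    Entry("bank", ["ground bordering a river"]),
    Entry("bank", ["place for keeping money"]),
    Entry("mouth", ["orifice on the face", "opening of a river"]),
]


def entries_for(spelling: str) -> List[Entry]:
    """Return every entry stored under the given spelling."""
    return [e for e in LEXICON if e.spelling == spelling]


def is_homonym(spelling: str) -> bool:
    """Homonymy: more than one unrelated entry shares the spelling."""
    return len(entries_for(spelling)) > 1


def is_polysemous(spelling: str) -> bool:
    """Polysemy: a single entry carries more than one related sense."""
    return any(len(e.senses) > 1 for e in entries_for(spelling))
```

Under this layout a translation tool can treat the two cases differently: a homonym needs a choice between entries before any target word is selected, while a polysemous word needs a choice among senses inside one entry.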
&} R + ; F 6
& $ 5 6 5F 56
6 _ 5F $ _
R + & $
- + $ & 56
+ Fl & / ;
- + . & (M. Teresa
Cabré) .. R 6$ mouth
5 C8 & $
56 5 C8 $ & $
& _ {F F + $ $ & 56 8&
8 C8 $
& $ & Fl$ & $
F ..$ & $ & m
& $ _ ..$ -
- 5
X\ \ & & $ _
8 F
& _ AFg F + m Homonym m
Homo + onyms $ && & homonyms
m Homo & onym / m
31
Encyclopedia Britannica vol.No.5.H :- page No - 107
-
Page | 42
Homonymy 8& 32 Homonymy 5 x8
56& /5 Homonymy 5
m- $ - R
- C 56&
5F & /Y
m Homonymy 5 &
+ , ... m-
5 & $ & + +
$ F + +
$ _ - s 6: +
FG _ - U 5
2.6 5 & $ 6 5 & $ 6 5 & $ 6 5 & $
1. () ()
(/) ()
() (&)
() ()
(A) ( )
(A) ()
g(4) g()
2. (&) ()
() ( &)
( ) ( )
(_) ( )
32
http://efl.htmlplanet.com/polysemy.htm
-
Page | 43
() ()
() (-J)
( _ ) (_ )
3.m be() bee ()
knot() not ()
dear(F/) deer ()
hour() our()
wind() wind()
dove (g) dove(_)
mobail () mobile(C \,)
4.&
& m & m
morgen tomorrow () Morgen morning ()
arm poor () Arm arm ()
elf eleven () Elf elf ()
Steuer tax () Steuer wheel ()
Tau dew () Tau line ()
Bis until () Biss bite ()
Boot boat () bot offered (/F)
Bund federation () bunt colorful ()
Chor chorus ( ) Korps corps (8)
das he () dass that ($+)
Heer army () her from ()
Isst eats ( ) ist is ()
Meer sea () mehr more (-)
Nahmen took (5) Namen name ()
-
Page | 44
Rad wheel () Rat advice ()
seid are (u) seit since ( )
Tod death (8) tot dead ()
viel much () fiel fell ()
wahr true () war was ()
Waise orphan () Weise way (&)
5.5 33
5 m 5 m
max stroke / Max stretch
cegok horseman cegok passenger /
Beulu will power } Beulu freedom
Digx spirit Digx scent \
6.8 34
8 m 8 m
solo (alone) ) sólo (only) (\)
si (if) () sí (yes) ()
sumo (supreme) () zumo (juice) ()
tasa (rate) () taza (cup) (y)
ti (you) () ti (note of the musical scale) ()
33 Langenscheidt's Russian Dictionary, by E. Wedel
34 www.spanicity.com
-
Page | 45
tu (your) () tú (you) (,)
tubo (pipe) () tuvo (to have) ( )
vino (wine) () vino (to come) ()
A - F F
C & $ + m, &, 8, , 5,
5, , , , p 35 F
& $ _ {F
F + b + F
F$ _ {F & + 6: & $ F
- + /: $ _ R _ 56 +
6 j}, FH4 $ _ R U _ $
FG _ + & + / 8& u
F b FG l
F5 u b &
$ _ AFg $ _ _
& 8& - l - F5
_ 6 $, 8$ _ & F 6
$ 5 _ /+V,
/ - F5 R / , X
$ _ & $ 5
U _ -WX$
_
35
Lexical meaning page No. 75
-
Page | 46
+ & 8& ?
F + 8 U / u
1.5 F
2.8 b
3.RF
4.&
5.$ & /
6.$ R /
7.$ WX8Y
8.& X\ F
9.X
10.F 5
11.$ jk /
12.56& 5x, F56 -$ _ U, HA U
& 8& / 5F l8
8& 56 / _
& 8& _ /+V 8& -
& - , /: , + &
56 & 56 u
& 56 , & + $ _ s
_ m $ & $ _
/ & U C8
u & $ + $
F + , F b $ -
- - U , 6
U & $ 8 $ F
-
Page | 47
1.HOMOGRAPHS
2.HOMOPHONES
3.HETERONYMS
4.POLYSEMES
5.CAPITONYMS36
(Pronunciation), C8 56& 8$ -$
F u & $ _ (inequality)
F /$ _ R 8 U
Table No. 2. Types of homonyms (diagram labels: Same spelling, Different spelling, Same pronunciation, Different pronunciation; Homonym, Homograph, Homophone, Heteronym/Heterophony)
3. $$$$ //// jkjkjkjk \\\\
1. 5F (HOMOGRAPHS) - &
56 5F $ . :-
bark (the sound of a dog) and bark (the skin of a tree). $
56 (heteronyms) :- bow (the
front of a ship) and bow (a type of knot).
2. C8 (HOMOPHONES) C8 $
$ & , &
C8 56& , 5F
36 http://en.wikipedia.org/wiki/Homonym
-
Page | 48
56& ( 56& $ _ 5F )
C8 56& :- Homographic examples include tire (to
become weary) and tire (on the wheel of a car). + 56
5F 56& (Heterographic) :- to, too, two, and there, their,
they're.
3.- ~ (HETERONYMS) $ 5F
56& $ ~ _ $ _ 5F
+ $ 56 5 5F
+
:- desert (to abandon) and desert (arid region); row (to argue or an argument) and
row (as in to row a boat or a row of seats). $
F heterophones U
4. (POLYSEMES) _ 5F/&
+ & 56 - /:
56& $ $ 8 F
, : 8 +
56& u; F 8& :-
Words such as "mouth", meaning either the orifice on one's face, or the opening of a
cave or river, are polysemous and may or may not be considered homonyms.
5. 5F R& & (CAPITONYMS) - 5F
+ 56 ~ s U _ &
R& 8& :- polish (to make shiny) and Polish (from Poland).
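The types listed above differ only in which of three properties two words share: spelling, pronunciation, and capitalization. A minimal sketch of that classification follows. The `Word` tuple, the `classify` function and the informal pronunciation respellings are assumptions introduced for illustration; the word pairs themselves are the English examples quoted in this section.

```python
from typing import NamedTuple, Set


class Word(NamedTuple):
    spelling: str
    pronunciation: str  # informal respelling, an assumption of this sketch


def classify(a: Word, b: Word) -> Set[str]:
    """Label a pair of distinct-meaning words by what they share."""
    labels: Set[str] = set()
    same_spelling = a.spelling == b.spelling
    same_letters = a.spelling.lower() == b.spelling.lower()
    same_sound = a.pronunciation == b.pronunciation
    if same_spelling:
        labels.add("homograph")
        if not same_sound:
            labels.add("heteronym")  # same spelling, different pronunciation
    if same_sound:
        labels.add("homophone")
    if same_letters and not same_spelling:
        labels.add("capitonym")  # spelling differs only in capitalization
    return labels


bark = classify(Word("bark", "bark"), Word("bark", "bark"))
desert = classify(Word("desert", "dih-ZURT"), Word("desert", "DEZ-ert"))
to_too = classify(Word("to", "too"), Word("too", "too"))
polish = classify(Word("polish", "POL-ish"), Word("Polish", "POH-lish"))
```

Polysemes are deliberately left out of this sketch, since deciding whether two senses are related cannot be read off spelling or pronunciation.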
& $ F _ $
R + R
56 6 F b 56
+ 6 j}
-
Page | 49
5F $ 5F & 56
+ - ~ (HETERONYMS)
F: 5F - F U -
_ 5F C8 - ~ $ 5F
+ 56 5 $ 5F A
- 5F C8 5 , +
5 &
C8 $ & $ F
+ & & 8 R
5F R& & (CAPITONYMS) & $ _ \
$ 5F R&
& ; _ 5F
, + / X
, & R& & 56 A6 5F
/ 8 jk 8 U
F /$ & _ 56
/ $ - ~ (HETERONYMS) $
8& 5F
A & $ $
- A j jk
$ - A 8& & $ _
AFg jk F _ A& 5 :-
U & F 5
U & &
& 5 U Grammatical developments also sometimes lead
to the formation of homonyms. Jita as the past tense of jina, meaning alive, is also
the past tense of jitna, meaning conquered; and diya is both a lamp and the past
-
Page | 50
tense of dena or give.37 / $ R& $
8& &: jk & $ 8& j
8$ _ A& 5 j & $
8& $ 8 A& 5
8 :- U + 6 U
& +V U
+ & U
/ j $ U$ +
$ 8 U / _
8 U 1. A (Structural
Ambiguity) - 8& F5
8& Y6
A
2. (Lexical ambiguity):- + F5 8&
8
5 A& + y _ +
& 5 - & m
1. He chairs the session.
2. The chairs in this room are comfortable.
chairs +V U +
chairs 5 + chairs & /A
& 5 & 6 8 POS
tags C +
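The two "chairs" sentences above show a category ambiguity that POS tags can resolve from the immediately preceding word. The following is a deliberately small rule-based sketch, not a real tagger: the word lists and the function names are hypothetical, and the tags N and V follow the abbreviation list of this dissertation.

```python
from typing import List

# Toy closed-class word lists, assumed for this illustration only.
DETERMINERS = {"the", "a", "an", "this", "these", "those"}
PRONOUNS = {"he", "she", "it", "who"}


def tag_ambiguous(tokens: List[str], index: int) -> str:
    """Guess N or V for the token at `index` from the word before it."""
    previous = tokens[index - 1].lower() if index > 0 else ""
    if previous in DETERMINERS:
        return "N"   # "The chairs ..." -> noun
    if previous in PRONOUNS:
        return "V"   # "He chairs ..." -> verb
    return "UNK"     # no rule applies


def tag_word(sentence: str, word: str) -> str:
    """Tag the first occurrence of `word` in `sentence`."""
    tokens = sentence.rstrip(".").split()
    index = [t.lower() for t in tokens].index(word)
    return tag_ambiguous(tokens, index)


verb_reading = tag_word("He chairs the session.", "chairs")
noun_reading = tag_word("The chairs in this room are comfortable.", "chairs")
```

Once the tag is known, a translation system can look up the target word under the right category instead of guessing between the noun and the verb reading.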
37
http://www.tribuneindia.com/2000/20000819/windows/roots.htm
-
Page | 51
3. jk jk jk jk (Grammatical ambiguity) :- /A j
8& jk 8$
A6 jk
8& & $ 8&
A
jk - /
/A
/ _
56& (Homonymy is
a lexical ambiguity.) 38 8 Word Sense
Disambiguation (WSD) C + , 5
/A \ c j F
8$ 8& + &
8& &
/ A&
j &R 56& $ _
5 j 8$ 8& +
4. llll $ _$ _$ _$ _
l & $ _
U -, -, 5WX-
5WX, & jR-$ F$ 6
j}$ & $
38 Punjabi Text Generation using Interlingua approach in Machine Translation (A thesis) by Sachin Kalra, page No. 39-40
-
Page | 52
6 U / 5 + + 56
/ /F ,
5 + / /$ F
+ (Interpretation) l 5k U +
- _ - j U
5k 6 6 $
y _ F
/ + A6
y _
/
$, 5$, /$ l
& - l u
F l + /
8, , & & U
& 5 J A FG _
& _ g _ A
_ U 5 l & $
8& $
8$ R- b
& $ / _
& 56
/: -- $ 5k U
+ $ F& 5 /+V 8 U
6 A &
u 8 _ /+V A6 / U
-
Page | 53
l& , & , u
$ ,
+V , +
+ _ _
4.1 l & $ _ l & $ _ l & $ _ l & $ _
_ /+V & $ + / 8 +
5 6 8$ 5
m A / _ 8 G
l 5k + + 6 / _ $
$ _ +
+ 6 A _
c & $
A +
8 5
6 + +
G
6 + +
/ _ 8 8
m 8 /
F
-
Page | 54
1. 6 , + X 39
2. 6 & & X 40
1. _$ 41
2. 8 _ .42
56 56 U +
s & + +
& /k /Y +
:-
1. l E
2. A A .
1. FG
2. -- A FG
1. , 43
2. 8 44
1. 5 5
2. A A .
1. u
2. k 8
1. FH y &
2. FH J y& .
39
- / _ 8 8 40
5 - 8 / - F 41
- / _ 8 8 42
5 - 8 / - F 43
k-/ _ 8 8 44
F| -8 / - F
-
Page | 55
& $ $ -
5 + $
R- 5F $ & 8
y $
l + 8 U
1. J 45
2. J -46
1. / , 47
2. / H 48
1. _ ?
2. 8 \ .
1. 5
2. .
1. +$ + +
2. / J E _ /
45
_ - / _ 8 8 46
\ - 8 / - F 47
_ - / _ 8 8 48
\ - 8 / - F
-
Page | 56
_ 5F 6
U / - +
, 4 5
5 :-
1. g 5
- 2. | g 5 .
1. $
2. A V F .
1.
1. .
/ A& + + 6
6$ (] s 14-15) 10 +
& U, , -, E,
, -, & , , ,
, $ 5 , , , ,
, & 5, , , , U
+ , , , , u +
+ , U + _ {F
A + /
+ / F R
/+ } -
1) 1. ! 8 + 49
2. ! 50
49
+ + - G, ] s - 12
-
Page | 57
2) 1. ! } R _ E 51
2. ! J R 8 b 52
3) 1.
2. , J C .
4) 1. !
2. ! g .
5) 1. + ?
2. ?
6 6 $ +
6) ! + /\ F \
7) ! |
8) ! 5$ 8$ _ 5
9) !
10) !
/ 6 5 $ 6
U +
+ \ _
/ 6 5 u
_ _ _ _
1. _
2.
1. 5
50
- + + - - , 6 / J, ] s - 16 51
+ + - G, ] s - 12 52
- + + - - , 6 / J, ] s - 16
-
Page | 58
2. .
/ & + 5
8 & $ &
/ 5
/ + 5 &
+ 4 b
, 5 & _
{F
G _ - 5
+ F| _ + / 5 $
- $
1. b +
2. b k
1. _ +
5
-- 53
2. J A- \ J
E _ C
.54
56 & $
_ u + $ $ _ {F
U +
53
+ + - G, ] s - 59 54
- + + - - , 6 / J, ] s - 69
/ + U +
+ -
5 & $ /
& $ / 8
1. _ p ! ?
2. _ , ?
+ & ,
, & -, , -, u
+ _ - _ {F /
8& + +V 5 A $+ & $
$ U$ / /
- U, , -, -, p -, -
/ $ 8& &
5 \ 8& \ 8& b
& $ 8 6 $ _
- m 5 &
\ 5 u
\$ 5 & }
+ + \$ + \
8 u? 56 ? $+
\ - $ _ /+V
5 8$ _ . (B Kumara
Shanmugam) /+V 56-56
$ _ /+V 6 _ 5
u b + $
$ + Machine Translation is a complex system that does
different kinds of processing over the text at various levels. As in other languages
Tamil also contains polysemous words that require a collocation dictionary to
disambiguate them. 55
8 & - {F b
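The collocation-dictionary idea in the quotation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary entries, the sense labels, the function name `disambiguate`, and the window size are all hypothetical and are not taken from any of the systems discussed here. The sketch looks for known collocates near the ambiguous word and lets them vote for a sense.

```python
import re
from collections import Counter
from typing import Optional

# Hypothetical collocation dictionary: ambiguous word -> {collocate: sense label}.
COLLOCATION_DICT = {
    "present": {
        "birthday": "gift",
        "buy": "gift",
        "moment": "current time",
        "ideas": "to show",
    },
}


def disambiguate(word: str, sentence: str, window: int = 5) -> Optional[str]:
    """Pick a sense for `word` from collocates found within `window` tokens."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    if word not in tokens or word not in COLLOCATION_DICT:
        return None
    position = tokens.index(word)
    nearby = tokens[max(0, position - window): position + window + 1]
    collocates = COLLOCATION_DICT[word]
    # Each collocate in the window casts one vote for its sense.
    votes = Counter(collocates[t] for t in nearby if t in collocates)
    return votes.most_common(1)[0][0] if votes else None


print(disambiguate("present", "I need to buy my sister a present for her birthday."))
print(disambiguate("present", "He who neglects the present moment throws away all he has"))
```

When no collocate is found the function returns `None`, which leaves the choice to a fallback such as the most frequent sense. A real collocation dictionary would need far more entries than this toy table.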
4.2 l & $ _
8& & $ _ 5
\ & l 5
+ & 5
l + / b
8& & $ _
F
m- \ 56&
+ 5F } 8 U u
1. $ \ 8& +
\ u &2009
1.The night looked dashing in his armour
The knight looked dashing in his armour
55
Machine Translation as related to Tamil, B Kumara Shanmugam, AU-KBC Research Centre, M.I.T.,
Anna University, Chennai, India.
2. I have blond hair and blew eyes.
u 6 u .
I have blond hair and blue eyes.
u 6 $ $ .
3. Bambi's mother was a dear
_
Bambi's mother was a deer
_
4. He said he knew where the place was.
+ + .
He said he new where the place was.
+ .56
2. m- \ (& ) +
} 8 U u
&2009 5 \ 6
\ u :-
The night looked dashing in his armour
1.1
2.1
Shakti machine translation system
56
http://translate.google.com/translate_t#en|hi|He
3. } m- \ & +
(Homophone) } 8 U u
&2009 5
1.The night looked dashing in his armour
! r^aata ne usake kavacha men' de maaranaa dekhaa
The knight looked dashing in his armour
! yoddhe ne usake kavacha men' de maaranaa dekhaa
2.I have blond hair and blew eyes.
! mein' sunaharaa kesha huuZ oora aazkhen' ud'Zaa
I have blond hair and blue eyes.
! mein' sunaharaa kesha oora niilaa ran'ga aazkhen' huuZ
3. Bambi's mother was a dear
! bambi kii maataa priya thaa
Bambi's mother was a deer
! bambi kii maataa hirand-a thaa57
4. \ \ & +
} 8 U u &2009 5
1.The night looked dashing in his armour
The knight looked dashing in his armour
2.I have blond hair and blew eyes.
u u|
57
http://shakti.iiit.net/~shakti/cgi-bin/mt/translate.cgi
I have blond hair and blue eyes.
u u|
3.Bambi's mother was a dear
p F/
Bambi's mother was a deer
p 58
5. \ \ & (Homophone)
} 8 U u &2009
5
1.The night looked dashing in his armour
The knight looked dashing in his armour.
| &&
2.I have blond hair and blew eyes.
+
I have blond hair and blue eyes.
u u
3.Bambi's mother was a dear
F/
Bambi's mother was a deer
R
4. He said he knew where the place was.
+
58
http://mantra-rajbhasha.cdac.in/mantrarajbhasha/index.html
He said he new where the place was.
6 59
C8 $ + , $+
- 8$ 8 C8
$ _ - 8 5F 56-56
_ 5F & +
8& C8 $
$ ~ : jk & _ {F
/ 8 U u
1. _ 8 (Lexical inaccuracies)
2. & $ 8
3. & $ _ A J (Violated compositionality of multiword units)
4. & $ _ $ | (Mistranslated multiword units)
5. X & $ (Wrong equivalents due to unidentified homonymy)
6. + $ (Untranslated words) 5
7. jk 8 (Grammatical inaccuracies)
8. $ $ _ (Wrong formation of words or phrases)
9. $ _ (Wrong word form agreement)
10. &~ 6 (Wrong or missing prepositions)
11. $ V (Inaccurate word order)
12. (Inadequate clause connection)
13. 8R} jk & (Unusual grammatical form in target language)
14. 8& & 8 (Mistakes in government)
6 U u
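The error categories listed above lend themselves to a simple annotation scheme. The following Python sketch is an assumption about how such a tally could be organised, not a description of how the errors were counted in this study. The names `MTErrorType` and `tally_errors` are hypothetical, and the sample annotations are invented placeholders, not results. Only the thirteen categories that carry an English gloss in the text are included; the second category has no English gloss and is left out.

```python
from collections import Counter
from enum import Enum
from typing import Iterable


class MTErrorType(Enum):
    """Error categories with an English gloss in the list above."""
    LEXICAL_INACCURACY = "Lexical inaccuracies"
    VIOLATED_COMPOSITIONALITY = "Violated compositionality of multiword units"
    MISTRANSLATED_MULTIWORD = "Mistranslated multiword units"
    UNIDENTIFIED_HOMONYMY = "Wrong equivalents due to unidentified homonymy"
    UNTRANSLATED_WORD = "Untranslated words"
    GRAMMATICAL_INACCURACY = "Grammatical inaccuracies"
    WRONG_FORMATION = "Wrong formation of words or phrases"
    WRONG_AGREEMENT = "Wrong word form agreement"
    PREPOSITION = "Wrong or missing prepositions"
    WORD_ORDER = "Inaccurate word order"
    CLAUSE_CONNECTION = "Inadequate clause connection"
    UNUSUAL_FORM = "Unusual grammatical form in target language"
    GOVERNMENT = "Mistakes in government"


def tally_errors(annotations: Iterable[MTErrorType]) -> Counter:
    """Count how often each error category was assigned."""
    return Counter(annotations)


# Placeholder annotations for illustration only; these are not study data.
sample = [
    MTErrorType.UNIDENTIFIED_HOMONYMY,
    MTErrorType.UNTRANSLATED_WORD,
    MTErrorType.UNIDENTIFIED_HOMONYMY,
    MTErrorType.WORD_ORDER,
]
counts = tally_errors(sample)
for error_type, count in counts.most_common():
    print(f"{error_type.value}: {count}")
```

Keeping the gloss as the enum value means a report can print the category names exactly as they appear in the list.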
F + 8 - :-
+ &:
\ 8& &
59
http://www.cdacmumbai.in/matra/index.jsp
_ /+V -
/ U C
V 8& 8, & & U
jk 8 8 & j
8& 5 & &
- jk U U
- U 8 F +
(Vilar et al.) (Rimkutė) / j (Kovalevskaitė)
8 X l A6
+ & $ A6
8
Graph No.2 - Error analysis, Frameworks example60
60 Machine translation evaluation, Error analysis, Frameworks example 1, [Vilar et al., 2006] - [Rimkutė and Kovalevskaitė, 2007] (English to Lithuanian)
& $ & $ _
5F 56& $ - &
8 U
2. 5F 56& $ (Homographs example) -
1. -m \ & + 5F
} 8 U u & 2009
5
1. Water travelled through ancient Rome through lead pipes.
/ C A C \ _.
The mother duck can lead her ducklings around.
ducklings A u.
2. The baby sat in awe at the bright colors on the mobile.
5 $ 8
Although most animals are mobile, the sponge is sessile.
HF - u, p .
3. All need to be present for a unanimous vote.
_ E & 5 5.
I need to buy my sister a present for her birthday.
u 6 5 _ b
.
He who neglects the present moment throws away all he has
& X neglects
He will present his ideas to the Board of Directors tomorrow.
6$ + 8 _ F / .
4. The sow suckled her newborn piglets.
piglets suckled .
The farmer will sow oats in the back forty.
+ .
5.speaker please wind up your speech
}
The wind blew from the northeast.
g & .61
.} \ & + 5F }
8 U u &2009 5
1.Water travelled through ancient Rome through lead pipes.
! paanii ne netRtva paaipon' men' se praachiina rome men' se yaatraa kiyaa
The mother duck can lead her ducklings around.
! maataa batakha ve batakha kaa bachce maar^a dikhaa sakataa hei idhara udhara
2.The baby sat in awe at the bright colors on the mobile.
! shishu mobile para chamakiile ran'gon' para vismayapuurnd-a aadara men'
beit'haa
Although most animals are mobile, the sponge is sessile.
! haalaan'ki sabse adhika pashu phira bhii chalataa phirate phira bhii hein' phira
bhii span'ja sessile
Hei
61
http://translate.google.com/translate_t#en|hi|speaker
3.All need to be present for a unanimous vote.
! sabhii zaruurata honaa ekamata mata kaa honaa
I need to buy my sister a present for her birthday.
! mein' zaruurata honaa merii bahana vaha janmadina kaa khZariidanaa
4.The sow suckled her newborn piglets.
! sov ne ve navajaata ghen't'e stana se duudha pilaaye
The farmer will sow oats in the back forty.
! kisaana oat boegaa mein' piiche kaa forty
5. speaker please wind up your speech.
! vaktaa aapakii bolii para havaa khusha karataa hei
The wind blew from the northeast.
! havaa ud'Zii se uttara puurva62
.\ \ & + 5F
} 8 U u &2009 5
1 Water travelled through ancient Rome through lead pipes.
/ C \ + C $ A
u|
The mother duck can lead her ducklings around.
g$ A u|
2.The baby sat in awe at the bright colors on the mobile.
_ 8 |
Although most animals are mobile, the sponge is sessile.
$ HF , u 6 |
3.All need to be present for a unanimous vote.
62
http://shakti.iiit.net/~shakti/cgi-bin/mt/translate.cgi
5 & b |
I need to buy my sister a present for her birthday.
5 u E 6 5 &|
4.The sow suckled her newborn piglets.
5 F |
The farmer will sow oats in the back forty.
4 |
5.speaker please wind up your speech
CX
The wind blew from the northeast.
g4 |63
.\ \ & + 5F }
8 U u & 2009 5
1. Water travelled through ancient Rome through lead pipes.
/ A 5 \ +
The mother duck can lead her ducklings around.
$ A u
2. The baby sat in awe at the bright colors on the mobile.
5 xC} _ & + +
Although most animals are mobile, the sponge is sessile.
J + u u
3.All need to be present for a unanimous vote. 63
http://mantra-rajbhasha.cdac.in/mantrarajbhasha/index.html
b 5
I need to buy my sister a present for her birthday.
u 6 5 &
He who neglects the present moment throws away all he has
X u X u u
He will present his ideas to the Board of Directors tomorrow.
& 89 F
4. The sow suckled her newborn piglets.
J J
The farmer will sow oats in the back forty.
+
5.speaker please wind up your speech
y u u
The wind blew from the northeast.
g & + 64
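The comparison carried out above (feeding sentence pairs that share a homograph to each system and inspecting the output) can be partly automated. The sketch below is a hypothetical harness, not the procedure actually used here. The names `HomographCase` and `find_sense_errors` are invented, and the translator is a stub that replays three of the transliterated Shakti outputs quoted earlier in this section. Each case names one target-language token that would signal the wrong sense: `netRtva` (leadership) for the metal "lead", `havaa` (wind, air) for the verb "wind up". The marker `siisaa` for the metal is an assumed transliteration.

```python
from typing import Callable, List, NamedTuple


class HomographCase(NamedTuple):
    sentence: str            # English input sentence
    homograph: str           # ambiguous word under test
    wrong_sense_marker: str  # target token that signals the wrong sense


def find_sense_errors(
    cases: List[HomographCase], translate: Callable[[str], str]
) -> List[HomographCase]:
    """Return the cases whose translation contains the wrong-sense token."""
    failures = []
    for case in cases:
        output_tokens = translate(case.sentence).split()
        if case.wrong_sense_marker in output_tokens:
            failures.append(case)
    return failures


# Stub translator: replays outputs recorded in the Shakti examples above.
RECORDED_OUTPUTS = {
    "Water travelled through ancient Rome through lead pipes.":
        "paanii ne netRtva paaipon' men' se praachiina rome men' se yaatraa kiyaa",
    "The mother duck can lead her ducklings around.":
        "maataa batakha ve batakha kaa bachce maar^a dikhaa sakataa hei idhara udhara",
    "speaker please wind up your speech.":
        "vaktaa aapakii bolii para havaa khusha karataa hei",
}

CASES = [
    HomographCase("Water travelled through ancient Rome through lead pipes.",
                  "lead", "netRtva"),
    # "siisaa" is an assumed transliteration of the word for the metal.
    HomographCase("The mother duck can lead her ducklings around.",
                  "lead", "siisaa"),
    HomographCase("speaker please wind up your speech.", "wind", "havaa"),
]

failures = find_sense_errors(CASES, RECORDED_OUTPUTS.__getitem__)
for case in failures:
    print(f"wrong sense of '{case.homograph}': {case.sentence}")
```

A token check of this kind only catches wrong senses that someone has listed in advance, so it supplements manual reading of the output rather than replacing it.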
&} $ /: $ 5 - &
$ 56 \$ $
_ {F + - /
\$ _ , & 56
\ &
+ +
\ + + & _
64
http://www.cdacmumbai.in/matra/
- F - F & $ & ?
+ / $ /
& + & $
+ &
:- lead - ..\$ A + _ /
A + &
wind + wind
up &
& / U +
+ & + $
F + present /, &,
$ \ +
$ 5 +
+ +
+
,
+ + $
_ 5 +
+ \$ :-
sow 5 / _ _
..\ U \$
+ \
5y + + & $
} & 6 $ _ & _
{F - A& 5y _
$ - }
- 5F 56& $ &
& $ U
_ 56 F & $
, & /m5
8&(depend) $ 8&
+ & 5
U _ -
+ , & 5
& 8 +
& $
5 + U
5 _ 5F /
_ U
_ U
+ / & $ /
\ 89
jk 5|$ -
C
- 56& $ _
F \
C
1. & / & F
2. 5F , , 56&
3. 56 5F, , 56&
4. m &
C
C / + + & $
,
U 6 &
8& U U
u + $ \ _ j U 8&
b + / 6 & $ 8
5 F5 \ 8& / & $
8 5 m$(jk) \ 5
C 56& $ F $ +
1. 5F , , 56& 2. 56 5F,
, 56& 3. m & F
A6 & $ -- +
/ b - \ & $ _
J $ F & _
& $ ~ +
& $ jk $ _ + ,
jk & - /8
F $ & $ _
5 (Language Tool) 8& U A&
/ A&
1. & / & F
+ / & A6 u
-5 5 5F,
, 56 & $ U /
1. R A6 2.
_ 5F C8 j 3. _ - &
-F j8_ -F & + F Fd _
& $ _ - 56 F Fd
& 56 ~ _ /+V 56
5 A & + j} & $
j8_ F _ /F + &
j8_ F & 56 & & l&
8& J Fl$ u + $
- 8& 65 $ / _ F 8&
$ j jA U
& $ jA - -
& X 8&
& $ $ F + 1.&: & (complete) 2.
5(partial) & 66 6: - lA
$ + &
& :- $ +V +V
5 (5 +V) 5
65 C - ? / 56, /, ] 431-432
66 Anusaraka: An Approach to Machine Translation, Akshar Bharati, Dipti Misra Sharma, Amba Kulkarni, Vineet Chaitanya
_ + _ 5 _ _
67 _ /
$
+ U + &
& u / & $ U
|8 $ 8X U 5
- 8X & 5 l &
F 8& b 8 56
& / 8& 5 +
/ 8 + + 8X U 68
+ / X / , &, Rb-
, X C +
: A& ~ 8X U +
Fd 5 ~g
& $ 5 m$ F _
F & 5 $ & $
& & 5 &
& $ 8 _ & _
C8 5F + 56 & F5
8& _ R 8 U
(-j-) , 56-56
$ 69 + C &
67 http://www.srajaram.com/2008/01/did-harabhajan-say-maaki-or-monkey-to.html
68 F 5| / .? ] s -
69 C - ? / 56, /, ] 431
_ 56 6 _ $ b
8 U
pppp ,
, ++++ 70
& /A 56-56 /
& & & : &
& 8& & / 56 ~
8&
----
, 71
56-56 ~ +
& / &
& + /
$ C $ +
\ & U + F
+ $ 8 5 F 8 /F +
& $ $ y
$ $ ~ + & \ U +
8& 5 +
F 5 A 6 C
C /+V $
F 5 A {F & _
b _ U
70 ] s-431 71 ] s-432
5 /+V
A& + _
J FG 15 72 q _
q / u _ j8Y /
, 6, F / 73 &
/ 505 5 5 m
611 Ax l F 5 q / J 5
74 / q , WX _
FG _ 15 _ 4 _ -
75 AFg /5 u /
8& 6 /$, q, &, ,
_ AFg , $ _ 76 1188 m
6 F5 5 1290 G G 5
A _ {F b u + &
j _ 5$ $ $ _
$ 602, 675, 660, 735, 757, 810, 832
900 5 6, F/-, , R, , , 8, ,
5 77 905 ( 983) R
x 5 5 , U /
x F
x g F
72 www.wikipedeia/marathi-lang/
73 &A j .-., q / 58, 6, & ] .-11
74 ] s-11
75 &A j .-., q / 58, 6, & ] .-11
76 ] s - 11
77 ] s - 12
G & & 1289 5 8 R G
5 _ 5-x _ J
X\$ 5$ F A ,
/ _ 5 5 +
U q 5 5 / u 1.-
_ C 2. F&- g /
3.WX F: $ $
, , -A, J, , , , ,
5 u / + - 5$ F5
q _ A /8
A /8 U
+ 8 A | R& _
_ F
66 A - _ F _ b
&
R, 8 A
/F _ U _ , F$ _
& 5 & U /8p j F _
F /+V F 5 /A
_ / F R- b _
& A6 + --
$?
u R & R _ WX
_ u _ AFg R
J $
_ 5F X$ - 5 $
jk - 8 5 -
A& F
8& _ U u
$ U / /
m $ 6 $ ,
/ m 6
... _ _ X U
5 _ 78 & _
u A & $
& _ {F - _ &
$ _ 8 & A & & 5
A & + _ {F + F
6 _ $ - F _
/H-_ & 85 U
(TDIL) 5 /H-_ F _ FG p\
... _ 5
5 ...(2000), -,, -, IIT Bombay,
CDAC Mumbai and CDAC Pune from Maharashtra are playing leading roles in these
endeavours.79 / U (i) 8& (lexical resources
like wordnets), 6 & (ontologies and
semantics-rich machine readable dictionaries (MRD)), (ii) \F
78 http://www.cfilt.iitb.ac.in
79 White Paper on Cross Lingual Search and Machine Translation, Dr. Pushpak Bhattacharyya, Department of Computer Science and Engineering, IIT Bombay
8& 5 interlingua-based machine translation software for
Hindi and Marathi and (iii) \ & & ( Marathi search engine
and spell checker)80 8& -,
.. } m \ 8&
+ s U b + + F
$ F $ +
1.(Lexicon):- , -, m- 6
C $ 8& + q
R & $ _ + 81 2.
: - () \ - F5 +
C + , & - +
3. Fd(Marathi Analysis) 4. Word Sense Disambiguation 5.
5 Fd /(Speech Synthesis for Marathi Language) 6.
/ (Project Tukaram ): - q & /
/ b l - $ y \
82 7. 6 8& (Designing Devanagari Fonts) 8.
. (IT Localization) F $
F $ _ +
5 _ R$
+ & 5
80 Paradigm Shift Of Language Technology initiatives Under Programme (\5 p\) FG@TDIL /H-_ \, - page 5
81 ] s - 6
82 Paradigm Shift Of Language Technology initiatives Under Programme (\5 p\) FG@TDIL /H-_ \, - page 5
6 $ 5 8&
& $ - & &
_ g -, , , + F
F5 - $ _
/ A + j} -
R& / F $ R&
/+V R& ? jk
- + $ F +
& $ - jk $ F
+ 83
V. jk
WXYX -jk
1. N 1. j} 2. 8
3.
4. ?j 5.
N 1. -5 j 2. - j
3. - j 2. & PRN 1. U 2. 8
3. 8
4. /
5.
6. 8 & PRN 1. &-U j 83 j-/--.? 5 , & / --118-216
2. &-5 j
3. 3&-5 j
4. &- j
5. &- j 3. F ADJ 1. 2. s
3. R
4. &5
V. jk
WXYX -jk
4. +V V 1. /& +V 2. } +V
2.1. +V
2.2. A +V
3. A
4.
5. & +V
6. & +V
+V V 1. +V- 5 2. +V-
3. +V-
4. +V-
5. +V- g
6. +V- X
7. +V- U
8. +V-
8.1. +V- &
8.2. +V- F
8.3 +V-
. 3 - & $ -
F $ - _ $ 8 U 84
84 j-/ - - .? 5 , & / - - 118-216
V. jk
WXYX -jk
1. 1. 2. -
3. 2. 1. 8 2.
3. / 3. +V F 1. R 2. 8 4. / --------
5. 1. 2.
3. 4. 5. b
6. &
7. j8
8. F 6.
EXM 1. -
2.
3. F
4. F&
5. R&
. 4. F $ -
& _ 8
_ u $ -
- & U F + 85
2. 5F, , 56&
85 paradigm Shift Of Language Technology initiatives Under Programme (\5 p\)
FG@TDIL /H-_ \, - page 11
6. j-
7. js
8.
9. b
10. /8 7. PSY 1. F 2. &
3.
4.
5. 8
6. 4
7.
8. 8. 8 NP ---- ---
$ m (Homograph) & $ _ R
R $ & b
U 5 $ 5F $