Transcript
Page 1: 1 2009 Year of science. 2 Koycho Mitev DIGITAL LITERACY

2009

Year of science

Koycho Mitev

DIGITAL LITERACY

INTERNATIONAL STANDARDS

Mankind has created a number of internationally coordinated systems that give a common description of various aspects and objects of knowledge: the international system of weights and measures, the symbols for chemical elements written in letters of the Latin alphabet, musical notation, the barcode system for identifying products in shops, and many others.

THE NOTES

THE BARCODE SYSTEM

US Patent #2,612,994, issued to inventors Joseph Woodland and Bernard Silver on October 7, 1952

THE PERIODIC SYSTEM OF CHEMICAL ELEMENTS

Dmitrii Mendeleev

QUESTION

Is it possible for there to be AN INTERNATIONAL STANDARD FOR DIGITAL COMMUNICATION IN MOTHER TONGUE, where the actual interpreter of spoken and written speech is the computer?

You speak in your mother tongue, and someone on the other side of the world hears your voice, but in his or her own language!?

Centenary of the birth of JOHN ATANASOV

Acad. Sendov, http://www.aba.government.bg/bg/BGpoSveta/Kariera/021203.html

John Atanassov was very much interested in the most important means of communication: human language and its written equivalent. Written languages can be regarded as symbolic representations of spoken ones. Such a representation is not unique and can therefore vary in quality. When John Atanassov was awarded the “Cyril and Methodius” medal, something he had not expected, he showed that he was very well informed about the work of the brothers Cyril and Methodius, because he was interested in different scripts. He also complained about the high rate of illiteracy in the USA and attributed it to the imperfections of written English. He considered the Cyrillic alphabet to be more felicitous.

This motivated him to create a new script which would be entirely phonetic and suitable for both people and machines. He did not fulfil his dream, although he pursued it until the end of his life.

SCIENTIFIC DISCOVERY DIGITAL SCRIPT

A scientific discovery must simultaneously meet three requirements:

A. CAUSE

B. EFFECT

C. CAUSE-EFFECT RELATIONSHIP

CAUSES - 1

The topics of communication are the same all over the world: work, money, sport, love, business, education, culture, etc.

CAUSES - 2

The organs of speech are the same in all human beings, regardless of ethnic, racial, or religious group.

CAUSES - 3

The number 10 hides secrets!

The parts of speech in all languages are exactly 10 in number: noun, adjective, numeral, pronoun, verb, adverb, conjunction, preposition, particle, and interjection.

EFFECT

With the help of the digits of the decimal system we can transform communicative spoken and written speech from any language or dialect into any other language.

CAUSE-EFFECT RELATIONSHIP - 1

Undeniable scientific facts:

• The digits of the decimal system, 0 1 2 3 4 5 6 7 8 9, are 10 in number, so the initial digit of the code of each of the 10 parts of speech can be assigned simply with their help.

• Regardless of the language, the sentence is the basic unit of speech and is characterized by semantic and intonational unity and by its communicative function.

CAUSE-EFFECT RELATIONSHIP - 2

• The digits and punctuation symbols on the computer keyboard are common to all languages in the world.

CAUSE-EFFECT RELATIONSHIP - 3

• The grammar of all languages consists of the same elements:

phonetics;

morphology;

syntax;

lexicology;

semantics.

CAUSE-EFFECT RELATIONSHIP - 4

The number of sounds that human beings use in communication is a two-digit number, so the sounds can be coded in the same way in all languages.

CAUSE-EFFECT RELATIONSHIP - 5

The digital representation of spoken and written speech can be transformed into binary code using John Atanassov’s invention and transmitted in real time (online) to any place in the world.

THE INVENTION

Patent BG 63704 – 04.10.2002

METHOD FOR COMMUNICATION IN MOTHER TONGUE

Koichiro Matsuura, Director-General of UNESCO

More than half of the 6,800 languages spoken today may disappear by the end of the century. When a language dies, part of the world dies. Language is more than an ordinary tool and a means of communication. It is a fundamental element of human nature. More than 20% of all languages have no written form. In Africa, where a third of human languages are recorded, 80% of the dialects have no script and exist only in spoken form. They are therefore in danger of disappearing.

NATURE OF THE INVENTION

Communicative spoken and written speech is recorded once in the computer's memory with the help of digits. A system of digital codes is entered that allows the identification of equivalent words (including idioms), phrases, and the entire grammar of the particular language.

Digital coding of phonetic speech

• The number of sounds in speech is a two-digit number.

• The characteristics of the phonemes are coded with the help of digits: serial number; vowel or consonant; short or long, stressed, etc. for the vowels; type of consonant (voiced as opposed to voiceless).

• A – 01100: 01 is the serial number; the third digit, 1, marks a vowel; the fourth digit, 0, marks a short vowel; the fifth digit, 0, marks a vowel in a stressed syllable. Czech, Slovak and other languages have long vowels (nemám); the fourth digit is then 1. For example, a long vowel under stress: Á – 01111.

Digital coding of phonetic speech

• B – 02200: 02 is the serial number; the third digit, 2, marks a consonant; the fourth digit, 0, marks a voiceless (not voiced) consonant; the fifth digit encodes another characteristic of consonants (a long or double consonant, as in “innocent” or, in Arabic, “arrabia”).

• Ль (сколько) in Russian – 17210: 17 is the serial number; the third digit, 2, marks a consonant; the fourth digit, 1, marks a voiced consonant; the fifth digit marks a long or double consonant.

And so on for all sounds, in the same way for all languages.

Digital coding of phonetic speech

• The digital representation of phonetic speech is a combination of the digital codes of the phonemes. The number of digits identifying a sound will be the same for all phonemes (for example, 5 digits).

• The digital representation of a syllable or word is the sequence of the digital codes of the phonemes that make up that syllable or word. This also applies to diphthongs: ai, ou, ei, ie, etc.

Phoneme Recognition

Just as fingerprint identification systems match a print to the name of a particular person, each phoneme will have a single digital code, common to all languages, that corresponds to its graphical representation.

Phoneme Recognition

• The digital representation of words can be designed to take into account the reduction of vowels, the devoicing of consonants, and other phonetic phenomena typical of dialects. These variants will also correspond to the written variants of the word, so that the software can recognize the spoken language.

Recognition of phonemes: example

Canada – digital representation of the word:

03200 01101 14200 01101 04100 01101
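
The decomposition of this word code into phoneme codes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `split_phonemes` and `describe` are hypothetical, the fixed width of 5 digits follows the "for example 5 digits" remark in the presentation, and the fourth and fifth digits are returned uninterpreted because their meaning depends on whether the phoneme is a vowel or a consonant.

```python
PHONEME_WIDTH = 5  # assumed fixed number of digits per phoneme code


def split_phonemes(word_code: str, width: int = PHONEME_WIDTH) -> list[str]:
    """Split a word code into fixed-width phoneme codes, ignoring spaces."""
    digits = "".join(word_code.split())
    if not digits.isdigit() or len(digits) % width != 0:
        raise ValueError("word code must be a whole number of phoneme codes")
    return [digits[i:i + width] for i in range(0, len(digits), width)]


def describe(code: str) -> dict:
    """Read the fields of one 5-digit phoneme code."""
    kinds = {"1": "vowel", "2": "consonant"}  # third digit, as on the slides
    return {
        "serial": int(code[:2]),               # first two digits
        "kind": kinds.get(code[2], "unknown"),
        "fourth": code[3],                     # left uninterpreted here
        "fifth": code[4],                      # left uninterpreted here
    }


# The word code for "Canada" exactly as given on the slide.
canada = split_phonemes("0320001101142000110104100 01101")
for code in canada:
    print(code, describe(code))
```

The six codes correspond to the six letters C, a, n, a, d, a, and the three occurrences of "a" share the same code.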

MORPHOLOGICAL CODES

The morphological codes show, in sequence, the part of speech the word belongs to, its grammatical categories, and other characteristics.

1. The first digit – PART OF SPEECH (10 parts of speech): zero (0) for nouns, one (1) for adjectives, five (5) for verbs, etc.

MORPHOLOGICAL CODES

2. The second digit – GENDER 0 – no gender, 1 – masculine gender, 2 – feminine gender, 3 – neuter gender;

3. The third digit – NUMBER 0 – no number, 1 – singular, 2 – plural;

4. Fourth and fifth digit – VERB TENSE (in some languages the number of verb tenses is a two-digit number);

MORPHOLOGICAL CODES

5. Sixth and seventh digit – CASE (in some languages the number of cases is a two-digit number). Languages which do not have cases (such as Bulgarian and English) write 00.

6. Eighth and subsequent digits – other grammatical categories and/or characteristics of words, such as countable/uncountable or animate/inanimate nouns.
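
A minimal Python sketch of how such a morphological code might be read, field by field. It assumes the case occupies the sixth and seventh digits and that any further characteristics follow them. The names `Morphology` and `decode_morphology` and the sample code `0210000` are hypothetical, and only the three part-of-speech digits named in the presentation (0, 1, 5) are filled in.

```python
from dataclasses import dataclass

# Only these three assignments are given in the presentation.
PARTS_OF_SPEECH = {"0": "noun", "1": "adjective", "5": "verb"}
GENDERS = {"0": "none", "1": "masculine", "2": "feminine", "3": "neuter"}
NUMBERS = {"0": "none", "1": "singular", "2": "plural"}


@dataclass
class Morphology:
    part_of_speech: str
    gender: str
    number: str
    tense: int   # fourth and fifth digits
    case: int    # sixth and seventh digits, 0 for languages without cases
    extra: str   # any further digits, left uninterpreted


def decode_morphology(code: str) -> Morphology:
    """Split a morphological code into its positional fields."""
    if not code.isdigit() or len(code) < 7:
        raise ValueError("expected at least 7 digits")
    return Morphology(
        part_of_speech=PARTS_OF_SPEECH.get(code[0], "unassigned"),
        gender=GENDERS.get(code[1], "unknown"),
        number=NUMBERS.get(code[2], "unknown"),
        tense=int(code[3:5]),
        case=int(code[5:7]),
        extra=code[7:],
    )


# Hypothetical code: a feminine singular noun with no tense and no case.
example = decode_morphology("0210000")
print(example)
```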

MORPHOLOGICAL CODES

• Tenses with compound verb forms (consisting of two or more words) are coded in the same way as phrases.

• The digital phonetic and morphological codes are recorded in an electronic dictionary against each word entered in it.

Coding of phrases, collocations, proverbs, etc.

The digital codes of the separate words are joined by underscores. In this way the program will understand that this is one semantic unit consisting of several words:

Leje_ako_z_konvy. (Slovak)

It_is_raining_cats_and_dogs.

Coding according to field of knowledge

Our knowledge of the world is divided into various fields in which words have different meanings (idioms). These fields of knowledge are coded in the same manner for all languages:

01 everyday speech

02 business

03 science

etc., with room for up to 99 fields of knowledge.

Coding according to field of knowledge

• The digital codes of words start with the code for the field of knowledge.

• If a word has a different semantic meaning in the different fields it is coded separately for each field of knowledge. Semantic synonymy is achieved in this way.
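
The idea of prefixing every word code with a field code, and storing a word once per field in which its meaning differs, can be sketched as follows. The dictionary entries, the word "cell", and the function names are hypothetical and serve only as an illustration. The three field codes are the ones listed in the presentation.

```python
FIELDS = {"01": "everyday speech", "02": "business", "03": "science"}

# Hypothetical entries keyed by (field code, word): the same word is
# stored separately for each field in which it means something different.
MEANINGS = {
    ("01", "cell"): "small room",
    ("03", "cell"): "basic unit of a living organism",
}


def split_field(code: str) -> tuple[str, str]:
    """Return the field name and the remaining digits of a word code."""
    field, rest = code[:2], code[2:]
    return FIELDS.get(field, "unassigned"), rest


def meaning(word: str, field: str):
    """Look up the meaning of a word within one field of knowledge."""
    return MEANINGS.get((field, word))


print(split_field("030210000"))
print(meaning("cell", "01"), "/", meaning("cell", "03"))
```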

Coding of synonyms

• Words which have synonyms can be grouped and arranged according to various aspects of their meaning, for example – everyday speech, slang, expressing of quality, etc. The digital codes of these groups are then added to the other codes.

Syntactic parser

• To perform a syntactic analysis of sentences we need to design software that covers all parts of speech, their grammatical categories, and other characteristics. This program will apply the grammatical, syntactic, and other rules of each language separately. Note that the components are the same for all languages, but they interact differently according to the grammar rules of each particular language.

Syntactic analysis

• Each sentence is analyzed syntactically before it is translated. To do so, the program finds the predicative centre of the sentence (the subject-verb relationship). The predicate is found first. It is a verb, and verb codes start with the digit 5. Then the program searches for the subject. It can be a pronoun or a noun that agrees with the verb in gender and number (the second and third digits), according to the particular word order. The analysis continues until the software has found the syntactic function of each word in the sentence.
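
The first two steps of this search (find the verb, then find a word that agrees with it) might look roughly like the sketch below. It is a simplification under stated assumptions: the function name and the sample codes are hypothetical, only nouns (first digit 0) are considered as subjects because the presentation does not give the digit for pronouns, and language-specific word-order rules are ignored.

```python
VERB = "5"  # first digit of verb codes, as stated in the presentation
NOUN = "0"  # first digit of noun codes


def find_predicate_and_subject(words, subject_digits=(NOUN,)):
    """Return (predicate index, subject index) for a coded sentence.

    `words` is a list of (surface form, morphological code) pairs.
    Either index is None when no suitable word is found.
    """
    predicate = next(
        (i for i, (_, code) in enumerate(words) if code.startswith(VERB)),
        None,
    )
    if predicate is None:
        return None, None
    verb_code = words[predicate][1]
    for i, (_, code) in enumerate(words):
        # The subject agrees with the verb in gender and number,
        # that is, in the second and third digits of the code.
        agrees = code[1:3] == verb_code[1:3]
        if i != predicate and code[:1] in subject_digits and agrees:
            return predicate, i
    return predicate, None


# Hypothetical codes for a two-word sentence.
sentence = [("dogs", "0120000"), ("bark", "5120100")]
print(find_predicate_and_subject(sentence))
```

Passing a different `subject_digits` tuple would let pronouns take part in the search once their part-of-speech digit is fixed.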

Coding of the syntactic function of words in sentences

Current machine-translation algorithms apply the so-called “statistical method”. Now, for the first time, it is possible to assign a TEMPORARY code to each word in a sentence. This temporary code shows the function of the word: subject, predicate, attribute, adverbial, etc.

Coding of the syntactic function of words in sentences

• The temporary codes for the syntactic functions of words allow the digital codes of one language to be transformed into the corresponding digital codes of the other language “in bulk”. A new syntactic analysis, including subject-verb agreement and government, is then carried out in the second language, so that the translated sentence is grammatically correct and follows that language's word-order rules.

Language code

This code identifies the language of origin and its dialects:

000100 Standard Bulgarian language

000101 Rhodopi dialect (from the Rhodope region)

000102 Shopski dialect (from the Sofia region)

and so on up to 000199

000200 Standard British English

000201 American English

000202 and so on up to 000299

Language code

000800 Standard French language

000801 French spoken in Quebec

000802 Second French dialect

000803 Another French dialect, etc.

• Over 6800 languages (the number can be expanded up to 9999 languages) can be identified with the help of the first four digits of these codes. The remaining two digits can be used to identify up to 99 dialects of each language.
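
Splitting such a six-digit code into its language and dialect parts is straightforward. The sketch below is illustrative only: the function name `parse_language_code` is hypothetical, and the small table holds just the three languages named in the presentation.

```python
# Language numbers taken from the examples in the presentation.
LANGUAGES = {1: "Bulgarian", 2: "English", 8: "French"}


def parse_language_code(code: str) -> tuple[int, int]:
    """Split a 6-digit code into (language number, dialect number)."""
    if len(code) != 6 or not code.isdigit():
        raise ValueError("expected exactly 6 digits")
    return int(code[:4]), int(code[4:])


language, dialect = parse_language_code("000801")  # French spoken in Quebec
print(LANGUAGES.get(language, "unassigned"), dialect)
```

Dialect 00 denotes the standard language, so `000100` parses to language 1, dialect 0.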

Sequence in the programming of written speech

Saving the sentence in the computer memory

Connection with the database

Syntactic analysis

Morphological analysis

Temporary codes for the roles of the words

Transfer of codes

Morphological analysis

Syntactic analysis and word order

Writing the sentence in the other language

Sequence in the programming of spoken speech

Recording of the sentence

Determining the phonemes of each word

Determining the written form of each word

Morphological analysis

Syntactic analysis

Temporary codes of the words

Transferring to another language

Morphological analysis

Syntactic analysis

Agreement of words in gender, number, etc.

Sound representation of the sentence in the voice of the person speaking the first language

EXAMPLE

BG Това изобретение ще промени нашите представи за комуникация!

EN This invention will transform our ideas about communication!

GR Αυτή η εφεύρεση θα αλλάξει τις ιδέες μας επικοινωνίας!

AR هذا الاختراع سوف تغيير أفكارنا من الاتصال!

RU Это изобретение изменит наше представление о коммуникации!

DE Diese Erfindung wird unsere Vorstellungen von der Kommunikation verändern!

The future of this scientific discovery: free communication in mother tongue!

Possibilities of the technology

• After your voice has been recorded once (as if taking fingerprints), the program will allow you to speak in Bulgarian (or any other human language entered in the system), and people on the other end of the telephone line, Skype, microphone, etc. will hear you in their own language but in your voice. Phonemes missing from your language will be added by a synthesizer. Members of the European Parliament will speak in their mother tongue, while the rest of the hall will hear them in their own languages. The delay will be the same as in live interpreting: the time the program needs to perform a syntactic analysis of one sentence. The program can be installed on the computers of your mobile service provider.

• You can open any website, written in any language, and read it in your own language.

Vision for the development of the technology

• A pilot model for written translation between 4-5 languages;

• A pilot model for speech translation;

• Licensing by universities around the world;

• Every language and dialect will become part of the system for communication in a mother tongue after a one-time coding of its grammar, words, and collocations;

• Communication between two distant languages (those for which there are no linguists who know both) will be carried out through another, basic language such as English. If we do not do this now, communication will soon be through Chinese.

Others about us

• India (page 24 from the text, or page 19 from 132 of PDF file): http://www.saneinetwork.net/pdf/SANEI_VI/SANEI-VI-(EcommerceandEconomicDevelopment_FPEPR).pdf

Others about us

• France Press Agency: http://www.bulgaria-france.net/kmitev.html

• One invention against terrorism:

http://www.democrit.com/category.php?n=330&cat=27&br=12&wh_n=news17

• China radio international http://bg.chinabroadcast.cn/64/2005/09/29/[email protected]

Others about us

• BabelPort, USA: http://www.babelport.com/news/1106

• “A Great Bulgarian Invention is waiting”: http://www.novavizia.com/399.html

• Vietnam: http://www.daichung.com/110/12_tinnho.shtm

• Slovakia, “Will a brilliant Bulgarian invention change human communication?”: http://www.itnews.sk/buxus_dev/generate_page.php?page_id=37989

Belarusian Academy of Sciences

“...We are convinced that this project does not need only financial support. You know that UNESCO is deeply concerned about the future of the majority of the 6800 existing languages which are threatened by globalization processes. We think that this promising project will contribute to this very sensitive topic for the human civilization and in this way will obtain considerable political support”.

Sergei ABLAMEIKO

Professor, Ph.D. in Computer Sciences, Associate Member of the Belarusian Academy of Sciences

THANK YOU FOR YOUR ATTENTION!

Dipl. Eng. Koycho Mitev

E-mail: [email protected]

Pictures from the Internet that are subject to copyright law are used in this presentation. Since this document is not intended for commercial purposes, the author thanks the rights holders for their understanding.

June 2009 Copyright ©

