Alexander Gelbukh, Moscow, Russia

Post on 23-Jan-2016


TRANSCRIPT

1

Alexander Gelbukh, Moscow, Russia

2

Mexico

3

Computing Research Center (CIC), Mexico

4

Chung-Ang University, Korea

Electronic Commerce and Internet Application Lab

Natural Language Processing

Alexander Gelbukh

www.Gelbukh.com

6

What language is

[Diagram labels: Text (voice, OCR; shown as placeholder sample text), Linguistic module (twice), Language, Meaning, Expert system]

7

Better communication with computers

010101110101000110101011101101001011

vs.

Persons are more productive when speaking their own language

8

Accessibility of computers for all

vs.

It’s easier to teach one computer how to speak than to teach generations of people how to use computers

9

Better knowledge management

vs.

Computers are better than people at managing information

10

Solution: Language understanding by computers

11

Applications

Information retrieval (Internet search, Google)

Question answering (Internet)

Information extraction (fill a DB from newspapers)

Automatic translation

OCR, speech recognition

Natural language interfaces (robots, computers)

Interaction of agents

Thinking computers? Think = speak

12

Source of language complexity: 1-D

[Diagram labels: Brain 1, Brain 2, Meaning (in each brain), Language, Text (speech) passed between the two, shown as placeholder text]

13

Source of language complexity: 1-D

[Diagram labels: Knowledge (twice), Language (twice), Text, shown as placeholder text]

14

Linguistic processor translates between representations

[Diagram labels: Texts (shown as placeholder text), Linguistic module (twice), Meanings, Applied system]

15

General scheme of text processing

[Diagram labels: Input, Output, Linguistic processor, (Semantic) representation, Applied system (e.g., Expert system)]

Linguistic processor uses linguistic knowledge.

Applied system uses other types of knowledge (e.g., Artificial Intelligence).

16

Language levels

Morphological: words

Syntactic: sentences

Semantic: meaning

Pragmatic: intention

...?

17

Fine structure of linguistic processor

[Diagram: Language maps Text to Meaning through a chain of stages: surface representation, morphological transformer, morphological representation, syntactic transformer, syntactic representation, semantic transformer, semantic representation. The text side is shown as placeholder text.]
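The chain of transformers on this slide can be sketched as three functions applied in sequence. This is only an illustration of the architecture: the function names, the whitespace tokenizer, the "first word is the head" rule, and the set-of-concepts output are all placeholder assumptions, not the behavior of the system described in the talk.

```python
from typing import Dict, List, Set


def morphological_transformer(text: str) -> List[str]:
    """Surface text -> morphological representation (placeholder: lowercase tokens)."""
    cleaned = text.replace(".", " ").replace(",", " ")
    return [word.lower() for word in cleaned.split()]


def syntactic_transformer(words: List[str]) -> Dict[str, object]:
    """Morphological -> syntactic representation (placeholder: flat tree, first word as head)."""
    if not words:
        return {}
    return {"head": words[0], "dependents": words[1:]}


def semantic_transformer(tree: Dict[str, object]) -> Set[str]:
    """Syntactic -> semantic representation (placeholder: a set of upper-case concepts)."""
    if not tree:
        return set()
    concepts = {str(tree["head"]).upper()}
    concepts.update(w.upper() for w in tree["dependents"])
    return concepts


def linguistic_processor(text: str) -> Set[str]:
    """Run the three stages in order, each consuming the previous representation."""
    representation = text
    for stage in (morphological_transformer, syntactic_transformer, semantic_transformer):
        representation = stage(representation)
    return representation
```

The point of the design is that each stage sees only the representation produced by the stage before it, so any single stage can be replaced without touching the others.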

18

Example of text

“Science is important for our country. The Government pays it much attention.”

19

Textual representation

Text is a sequence of letters.

S c i e n c e  i s  i m p o r t a n t  f o r  o u r  c o u n t r y .  T h e  G o v e r n m e n t  p a y s  i t  m u c h  a t t e n t i o n .

20

Morphological analysis

[Diagram: Linguistic processor with three components: morphological analyzer, syntactic parser, semantic analyzer. Stage shown: morphological analysis.]

21

Morphological representation

A sequence of words.

The THE article definite, plural/singular

science SCIENCE noun singular

is BE verb present, 3rd person, sing.

important IMPORTANT adjective

for FOR preposition

our WE pronoun possessive

country COUNTRY noun singular
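A minimal sketch of how such a representation could be produced, assuming a tiny hand-written lexicon copied from the rows above. The `Analysis` type, the `analyze` function, and the "unknown" fallback are hypothetical names introduced for illustration; a real analyzer would use a full dictionary and inflection rules.

```python
import re
from typing import List, NamedTuple


class Analysis(NamedTuple):
    word: str      # surface form as it appears in the text
    lemma: str     # dictionary form, written in capitals as on the slide
    pos: str       # part of speech
    features: str  # grammatical features


# Toy lexicon taken from the slide's rows (an assumption: nothing else is covered).
LEXICON = {
    "the": ("THE", "article", "definite, plural/singular"),
    "science": ("SCIENCE", "noun", "singular"),
    "is": ("BE", "verb", "present, 3rd person, sing."),
    "important": ("IMPORTANT", "adjective", ""),
    "for": ("FOR", "preposition", ""),
    "our": ("WE", "pronoun", "possessive"),
    "country": ("COUNTRY", "noun", "singular"),
}


def analyze(text: str) -> List[Analysis]:
    """Split text into words and look each one up in the toy lexicon."""
    result = []
    for word in re.findall(r"[A-Za-z]+", text):
        # Words missing from the lexicon get their upper-cased form and an "unknown" tag.
        lemma, pos, features = LEXICON.get(word.lower(), (word.upper(), "unknown", ""))
        result.append(Analysis(word, lemma, pos, features))
    return result
```

Calling `analyze("Science is important for our country.")` yields one `Analysis` per word, with `is` mapped to the lemma `BE` and `our` to `WE`, as in the rows above.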

22

Syntactic parsing

[Diagram: Linguistic processor with three components: morphological analyzer, syntactic parser, semantic analyzer. Stage shown: syntactic parsing.]

23

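Syntactic representation

A sequence of syntactic trees.

[Diagram: two trees. Tree 1: BE at the root, with SCIENCE, IMPORTANT, COUNTRY, and WE below it (one edge labeled "of"). Tree 2: PAY at the root, with GOVERNMENT, ATTENTION, IT, and MUCH below it.]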
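One way to hold such trees in a program is a small recursive node type. The sketch below is an assumption on two counts: the `Node` class and `to_brackets` helper are hypothetical, and the exact attachment of each word is one plausible reading of the flattened diagram (for instance, MUCH under ATTENTION), not a structure confirmed by the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    lemma: str
    children: List["Node"] = field(default_factory=list)
    relation: str = ""  # optional label on the edge to the parent, e.g. "of"


def to_brackets(node: Node) -> str:
    """Render a tree as HEAD(child, child, ...), prefixing labeled edges."""
    label = f"{node.relation}:{node.lemma}" if node.relation else node.lemma
    if not node.children:
        return label
    return f"{label}({', '.join(to_brackets(c) for c in node.children)})"


# Assumed reading of the two trees for the example text.
tree1 = Node("BE", [
    Node("SCIENCE"),
    Node("IMPORTANT", [Node("COUNTRY", [Node("WE", relation="of")])]),
])
tree2 = Node("PAY", [
    Node("GOVERNMENT"),
    Node("ATTENTION", [Node("MUCH")]),
    Node("IT"),
])

syntactic_representation = [tree1, tree2]  # "a sequence of syntactic trees"
```

Under these assumptions, `to_brackets(tree1)` gives `BE(SCIENCE, IMPORTANT(COUNTRY(of:WE)))`.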

24

Semantic analysis

[Diagram: Linguistic processor with three components: morphological analyzer, syntactic parser, semantic analyzer. Stage shown: semantic analysis.]

25

Semantic representation

Complex structure of whole text

[Diagram: semantic network. Nodes from the example text: SCIENCE, IMPORTANT, COUNTRY, WE, GOVERNMENT, ATTENTION. Additional nodes: Funding, Organization, Sector, Money. Edge labels: is, of, gives, for, is a, is a main form, needs, implies.]
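A semantic network like this is commonly stored as a list of (source, relation, target) triples. The sketch below uses node and relation names from the slide, but which node is connected to which is an assumption: the flattened diagram does not preserve the edges, so the triples are one plausible reading for illustration only. `TRIPLES` and `outgoing` are hypothetical names.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (source node, relation label, target node)

# ASSUMPTION: edge endpoints are guessed from the labels; only the node and
# relation names themselves come from the slide.
TRIPLES: List[Triple] = [
    # relations read from the text itself
    ("SCIENCE", "is", "IMPORTANT"),
    ("IMPORTANT", "for", "COUNTRY"),
    ("COUNTRY", "of", "WE"),
    ("GOVERNMENT", "gives", "ATTENTION"),
    ("ATTENTION", "for", "SCIENCE"),
    # relations that look like background knowledge
    ("GOVERNMENT", "is a", "Organization"),
    ("SCIENCE", "is a", "Sector"),
    ("SCIENCE", "needs", "Funding"),
    ("Funding", "is a", "Money"),
]


def outgoing(node: str) -> List[Tuple[str, str]]:
    """Return (relation, target) pairs leaving the given node, sorted for stable output."""
    return sorted((rel, tgt) for src, rel, tgt in TRIPLES if src == node)
```

Unlike the syntactic trees, the network spans both sentences at once: a single SCIENCE node carries facts from the first sentence, from the second, and from outside the text.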

26

The meaning

“La ciencia es importante para nuestro país. El Gobierno le pone mucha atención.”

La LA definite article, feminine

ciencia CIENCIA noun, feminine, singular

es SER verb, present, 3rd person, sing.

importante IMPORTANTE adjective, singular

para PARA preposition

nuestro NOSOTROS possessive pronoun

país PAIS noun, masculine, singular

[Diagram: two syntactic trees. Tree 1: SER at the root, with CIENCIA, IMPORTANTE, PAIS, and NOSOTROS below it (one edge labeled "de"). Tree 2: PONER at the root, with GOBIERNO, ATENCION, LE, and MUCHA below it.]

[Diagram: semantic network. Nodes from the text: CIENCIA, IMPORTANTE, PAIS, NOSOTROS, GOBIERNO, ATENCION. Additional nodes: Presupuesto (budget), Organizacion (organization), Sector (sector), Dinero (money). Edge labels: es (is), de (of), da (gives), para (for), es un (is a), forma principal (main form), necesita (needs), implica (implies).]

“Science is important for our country. The Government pays it much attention.”

There are good conditions for the development of science in our country.

27

Example: Translation

[Diagram: translation between Language A and Language B can pass through the text level, the morphological level, the syntactic level, or the semantic level. Beyond these lies "The Meaning, yet unreachable", marked with "?".]

28

Problems

Ambiguity of text: "I see a cat with a telescope"

Knowledge needed: linguistic; about the world and life

Good news: learning from texts

Plenty of texts on the Internet!

Good statistical methods
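The ambiguity in "I see a cat with a telescope" is a matter of where the phrase "with a telescope" attaches. A minimal sketch, assuming a simple nested-tuple encoding of the two readings (the `pp_attachments` function and the encoding are hypothetical, chosen only for illustration):

```python
from typing import List, Tuple


def pp_attachments(subject: str, verb: str, obj: str, pp: str) -> List[Tuple]:
    """Return both structural readings of 'subject verb obj pp'.

    Each reading is (head, [dependents]); a dependent may itself be such a pair.
    """
    # Reading 1: the phrase modifies the verb (the seeing is done with a telescope).
    verb_attachment = (verb, [subject, obj, pp])
    # Reading 2: the phrase modifies the object (the cat has a telescope).
    noun_attachment = (verb, [subject, (obj, [pp])])
    return [verb_attachment, noun_attachment]


readings = pp_attachments("I", "see", "a cat", "with a telescope")
```

Both structures are grammatical, so syntax alone cannot choose between them. Picking the first reading takes knowledge about the world (cats do not usually carry telescopes), which is the kind of knowledge the slide lists next.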

29

30

31

32

33

34

Current state

Working...

35

Conclusions

Is it necessary?

Is it simple?

Is it possible?

Has something been done?

Has everything been done?

Where are people working on it?

36

Thank you!

www.Gelbukh.com