ccsb354 artificial intelligence (ai)

CCSB354 ARTIFICIAL INTELLIGENCE (AI). CHAPTER 12: NATURAL LANGUAGE PROCESSING (NLP). Textbook (Chapter 13, & especially pages 558 & 588). Instructor: Alicia Tang Y. C.

Upload: cornelius-welsh

Post on 03-Jan-2016


TRANSCRIPT

Page 1: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

CCSB354 ARTIFICIAL INTELLIGENCE (AI)

CHAPTER 12: NATURAL LANGUAGE PROCESSING (NLP)

Textbook (Chapter 13, & especially pages 558 & 588)

Instructor: Alicia Tang Y. C.

Page 2: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

Language is a complicated phenomenon, involving processes as varied as the recognition of sounds or printed letters, syntactic parsing, high-level semantic inferences, etc.

Page 3: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

What is Natural Language Processing (NLP)?

Natural language processing gives computer users the ability to communicate with the computer in their native language.

This technology allows for a conversational type of interface. A general NLP system is not yet possible, especially one that recognises and interprets written sentences.

Page 4: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

Types

Natural Language Understanding: Investigates methods of allowing the computer to comprehend instructions given in ordinary English.

Natural Language Generation: Strives to have computers produce ordinary English so that people can understand computers easily.

Page 5: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

NLP Software Tools

– INTELLECT
– SPOCK
– BBN Parlance
– NaturalLink

Page 6: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

Components of NLP

– Parser
– Lexicon
– Understander
– Knowledge base
– Generator

Page 7: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

Stages in Producing an Internal Representation of a Sentence

Page 8: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

Input: Tarzan kissed Jane

Parsing

Parse tree:
Sentence
  noun phrase
    noun: Tarzan
  verb phrase
    verb: kissed
    noun phrase
      noun: Jane

Semantic Interpretation

kiss
  agent: person: Tarzan
  object: person: Jane
  instrument: lips

Contextual Knowledge Interpretation

Expanded representation:
[Figure: graph with nodes person: tarzan, person: jane, pet: cheetah, kiss, love and possess, joined by links labelled agent, object, experiencer, instrument (lips) and location (jungle); one part of the figure is marked "??".]
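The step from parse tree to semantic interpretation on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the nested-tuple tree encoding, the `VERB_FRAMES` table and the `interpret` function are hypothetical names introduced here, not part of the course material, and the sketch handles only the subject-verb-object shape of "Tarzan kissed Jane".

```python
# Parse tree for "Tarzan kissed Jane", encoded as nested tuples (assumed encoding).
PARSE_TREE = (
    "sentence",
    ("noun-phrase", ("noun", "Tarzan")),
    ("verb-phrase", ("verb", "kissed"), ("noun-phrase", ("noun", "Jane"))),
)

# Hypothetical verb lexicon: surface form -> (action, default instrument).
VERB_FRAMES = {"kissed": ("kiss", "lips")}


def interpret(tree):
    """Build a case frame (agent / object / instrument) from a subject-verb-object tree."""
    _, subject_np, verb_phrase = tree
    _, (_, verb), object_np = verb_phrase
    action, instrument = VERB_FRAMES[verb]
    return {
        "action": action,
        "agent": {"person": subject_np[1][1]},   # the noun inside the subject noun phrase
        "object": {"person": object_np[1][1]},   # the noun inside the object noun phrase
        "instrument": instrument,
    }


FRAME = interpret(PARSE_TREE)
```

The contextual-knowledge stage would then add facts that are not in the sentence itself (for example the jungle location), which needs a knowledge base and is not attempted here.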

Page 9: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

Block Diagram of a Natural Language Understanding Program of the Syntactic-Semantic Analysis Type

[Figure: block diagram with boxes Input text string, Parser, Lexicon, Understander, Knowledge base, Generator and Output.]

Page 10: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

Block Diagram of a Computer Language Translation System

[Figure: block diagram with boxes Input Program, Parser, Lexicon, Rule-based Processor (shown twice), Generalised Intermediate Form (GIF) and Output Program in Target Language.]

Page 11: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

NLP Database

Expert Systems: where do they stand?

Page 12: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

EASE OF COMMUNICATION WITH DIFFERENT TYPES OF INTEGRATED SYSTEMS

[Figure: three reactions, captioned "Oh no!", "Well.. not bad!" and "Oh yeah!!"]

Page 13: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

Levels of analysis for natural language

Prosody
– deals with the rhythm and intonation of language

Phonology
– examines the sounds that are combined to form language

Morphology
– concerns the components that make up words
– e.g. the effect of prefixes (non-, un-) and suffixes (-ing, -ly)

Page 14: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

Levels of analysis for natural language

Syntax
– the study of the rules for combining words to form legal sentences

Semantics
– considers the meaning of words, phrases and sentences

Pragmatics
– the study of the ways in which language is used and its effects on the listener

World Knowledge
– includes knowledge of the 'physical' world and the world of our social interaction

Page 15: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

Syntax

Specification and parsing using Context-free Grammars
– the rules listed below define a GRAMMAR for simple transitive sentences

1. Sentence → noun-phrase verb-phrase
2. Noun-phrase → noun
3. Noun-phrase → article noun
4. Verb-phrase → verb
5. Verb-phrase → verb noun-phrase
6. Article → a
7. Article → the
8. Noun → man
9. Noun → dog
10. Verb → likes
11. Verb → bites
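As a rough illustration of what these eleven rules define, the grammar can be written down as data and expanded exhaustively. The dictionary layout and the names `GRAMMAR`, `expand` and `SENTENCES` are assumptions made for this sketch; the enumeration only terminates because this particular grammar has no recursive rule.

```python
from itertools import product

# Rules 1-11 from the slide; dictionary keys are the nonterminals.
GRAMMAR = {
    "sentence": [["noun-phrase", "verb-phrase"]],
    "noun-phrase": [["noun"], ["article", "noun"]],
    "verb-phrase": [["verb"], ["verb", "noun-phrase"]],
    "article": [["a"], ["the"]],
    "noun": [["man"], ["dog"]],
    "verb": [["likes"], ["bites"]],
}


def expand(symbol):
    """Return every word sequence (as a tuple) derivable from symbol."""
    if symbol not in GRAMMAR:  # a terminal, i.e. an actual word
        return [(symbol,)]
    results = []
    for rhs in GRAMMAR[symbol]:
        # Combine every expansion of each symbol on the right-hand side.
        for parts in product(*(expand(s) for s in rhs)):
            results.append(tuple(word for part in parts for word in part))
    return results


# The complete (finite) language generated by the grammar.
SENTENCES = {" ".join(words) for words in expand("sentence")}
```

Checking `"the man bites the dog" in SENTENCES` then answers whether a string is a legal sentence of this small language.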

Page 16: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

Syntax

The words introduced by rules 6 to 11 (listed on the previous slide) are terminals; they define a lexicon for the language.

Terms that describe high-level linguistic concepts are called nonterminals (sentence, verb, noun-phrase, etc.).

Page 17: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

Syntax

A derivation of the sentence "the man bites the dog" (syntactically correct but semantically wrong, as we shall see later on)

String                          Apply Rule No.
sentence                        1
noun-phrase verb-phrase         3
article noun verb-phrase        7
the noun verb-phrase            8
the man verb-phrase             5
the man verb noun-phrase        11
the man bites noun-phrase       3
the man bites article noun      7
the man bites the noun          9
the man bites the dog           done

Parsing algorithms fall into two classes:
1. Bottom-up
2. Top-down
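The derivation of "the man bites the dog" on this slide can be replayed mechanically: at each step the leftmost nonterminal is rewritten with the numbered rule. The sketch below assumes that leftmost convention (which matches the table) and uses hypothetical names `RULES`, `apply_rule` and `derive`.

```python
# Rules numbered as on the slides: number -> (left-hand side, right-hand side).
RULES = {
    1: ("sentence", ["noun-phrase", "verb-phrase"]),
    2: ("noun-phrase", ["noun"]),
    3: ("noun-phrase", ["article", "noun"]),
    4: ("verb-phrase", ["verb"]),
    5: ("verb-phrase", ["verb", "noun-phrase"]),
    6: ("article", ["a"]),
    7: ("article", ["the"]),
    8: ("noun", ["man"]),
    9: ("noun", ["dog"]),
    10: ("verb", ["likes"]),
    11: ("verb", ["bites"]),
}
NONTERMINALS = {lhs for lhs, _ in RULES.values()}


def apply_rule(string, number):
    """Rewrite the leftmost nonterminal of string using the numbered rule."""
    lhs, rhs = RULES[number]
    for i, symbol in enumerate(string):
        if symbol in NONTERMINALS:
            if symbol != lhs:
                raise ValueError(f"rule {number} does not apply to {symbol!r}")
            return string[:i] + rhs + string[i + 1:]
    raise ValueError("no nonterminal left to rewrite")


def derive(rule_numbers):
    """Return every intermediate string of the derivation, starting from 'sentence'."""
    steps = [["sentence"]]
    for number in rule_numbers:
        steps.append(apply_rule(steps[-1], number))
    return steps


# The rule sequence from the "Apply Rule No." column.
STEPS = derive([1, 3, 7, 8, 5, 11, 3, 7, 9])
```

Printing `" ".join(step)` for each entry of `STEPS` reproduces the String column of the table.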

Page 18: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

Draw a parse tree for the sentence "the man bites the dog"

Page 19: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

Top-down Parsing

Unrecognised I/P:
the man bites the dog
the man bites the dog
the man bites the dog
man bites the dog
bites the dog
bites the dog
the dog
the dog
dog

Parse Tree:
S
  NP
    A: the
    N: man
  VP
    V: bites
    NP
      A: the
      N: dog
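A top-down parser starts from the sentence symbol and expands rules until the words are matched. The following is a minimal recursive-descent sketch with backtracking for the slide grammar; the function names and the `(label, children)` tree encoding are assumptions for illustration, and the order in which it tries rules need not match the trace on the slide.

```python
GRAMMAR = {
    "sentence": [["noun-phrase", "verb-phrase"]],
    "noun-phrase": [["noun"], ["article", "noun"]],
    "verb-phrase": [["verb"], ["verb", "noun-phrase"]],
    "article": [["a"], ["the"]],
    "noun": [["man"], ["dog"]],
    "verb": [["likes"], ["bites"]],
}


def parse(symbol, words, pos):
    """Yield (tree, next_pos) for each way symbol can match words starting at pos."""
    if symbol not in GRAMMAR:  # terminal: must equal the next input word
        if pos < len(words) and words[pos] == symbol:
            yield symbol, pos + 1
        return
    for rhs in GRAMMAR[symbol]:  # try each rule for this nonterminal in turn
        for children, end in match_sequence(rhs, words, pos):
            yield (symbol, children), end


def match_sequence(symbols, words, pos):
    """Yield (list of subtrees, next_pos) for matching symbols one after another."""
    if not symbols:
        yield [], pos
        return
    for tree, mid in parse(symbols[0], words, pos):
        for rest, end in match_sequence(symbols[1:], words, mid):
            yield [tree] + rest, end


def top_down_parse(sentence):
    """Return a parse tree covering the whole sentence, or None if there is none."""
    words = sentence.split()
    for tree, end in parse("sentence", words, 0):
        if end == len(words):
            return tree
    return None


TREE = top_down_parse("the man bites the dog")
```

`TREE` has the same shape as the parse tree on the slide, with the full nonterminal names in place of S, NP, VP, A, N and V.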

Page 20: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

Bottom-Up Parsing

Unrecognised I/P:
the man bites the dog
man bites the dog
man bites the dog
bites the dog
bites the dog
bites the dog
the dog
the dog
dog
dog

Parse Tree:
S
  NP
    A: the
    N: man
  VP
    V: bites
    NP
      A: the
      N: dog
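A bottom-up parser works in the opposite direction: it starts from the words and reduces them to higher-level symbols until only the sentence symbol remains. One common way to organise this is shift-reduce parsing; the sketch below is a backtracking shift-reduce recogniser for the slide grammar. The names `RULES`, `shift_reduce` and `recognise` are hypothetical, and the sequence of actions it finds is not claimed to match the slide's trace step for step.

```python
# Each rule as (left-hand side, right-hand side), with tuples so they compare to the stack.
RULES = [
    ("sentence", ("noun-phrase", "verb-phrase")),
    ("noun-phrase", ("noun",)),
    ("noun-phrase", ("article", "noun")),
    ("verb-phrase", ("verb",)),
    ("verb-phrase", ("verb", "noun-phrase")),
    ("article", ("a",)),
    ("article", ("the",)),
    ("noun", ("man",)),
    ("noun", ("dog",)),
    ("verb", ("likes",)),
    ("verb", ("bites",)),
]


def shift_reduce(stack, remaining, trace=()):
    """Return the list of actions that reduces the input to 'sentence', or None."""
    if stack == ("sentence",) and not remaining:
        return list(trace)
    # Try every reduction whose right-hand side matches the top of the stack.
    for lhs, rhs in RULES:
        n = len(rhs)
        if stack[-n:] == rhs:
            action = f"reduce {' '.join(rhs)} -> {lhs}"
            result = shift_reduce(stack[:-n] + (lhs,), remaining, trace + (action,))
            if result is not None:
                return result
    # If no reduction leads to a parse, shift the next word onto the stack.
    if remaining:
        action = f"shift {remaining[0]}"
        return shift_reduce(stack + (remaining[0],), remaining[1:], trace + (action,))
    return None


def recognise(sentence):
    """True if the sentence can be reduced to the sentence symbol."""
    return shift_reduce((), tuple(sentence.split())) is not None


TRACE = shift_reduce((), tuple("the man bites the dog".split()))
```

Backtracking is needed because a greedy choice can go wrong: reducing "bites" straight to a verb-phrase (rule 4) leaves "the dog" with nothing to attach to.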

Page 21: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

Transition Network Parsers

A transition network parser represents a grammar as a set of finite-state machines (i.e. transition networks), like this:

Example: sentence (S)

S_initial --Noun-phrase--> (state) --Verb-phrase--> S_final

Page 22: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

Transition Network Parsers

Example: Noun-phrase

S_initial --Article--> (state) --Noun--> S_final
S_initial --Noun--> S_final

Example: Article

S_initial --a--> S_final
S_initial --the--> S_final

Read page 563, text
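A transition network parser can be sketched by storing each network as a table of arcs and following them recursively. In the sketch below the sentence, noun-phrase and article networks follow the slides; the verb-phrase, noun and verb networks are not shown on the slides and are assumed here from grammar rules 4, 5 and 8 to 11. The state names and function names are likewise illustrative.

```python
# network name -> state -> list of (arc label, next state).
# An arc label is either another network's name or a literal word.
NETWORKS = {
    "sentence": {"initial": [("noun-phrase", "mid")], "mid": [("verb-phrase", "final")]},
    "noun-phrase": {
        "initial": [("article", "mid"), ("noun", "final")],
        "mid": [("noun", "final")],
    },
    "verb-phrase": {  # assumed from rules 4 and 5
        "initial": [("verb", "final"), ("verb", "mid")],
        "mid": [("noun-phrase", "final")],
    },
    "article": {"initial": [("a", "final"), ("the", "final")]},
    "noun": {"initial": [("man", "final"), ("dog", "final")]},      # assumed from rules 8, 9
    "verb": {"initial": [("likes", "final"), ("bites", "final")]},  # assumed from rules 10, 11
}


def traverse(network, state, words, pos):
    """Yield every input position at which network can reach its final state."""
    if state == "final":
        yield pos
        return
    for label, next_state in NETWORKS[network].get(state, []):
        if label in NETWORKS:  # arc labelled with a network: traverse it first
            for after in traverse(label, "initial", words, pos):
                yield from traverse(network, next_state, words, after)
        elif pos < len(words) and words[pos] == label:  # arc labelled with a word
            yield from traverse(network, next_state, words, pos + 1)


def accepts(sentence):
    """True if the sentence network can consume every word of the sentence."""
    words = sentence.split()
    return len(words) in traverse("sentence", "initial", words, 0)
```

Because an arc may name another network, traversal is recursive, which is what lets a set of finite-state machines handle phrase structure.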

Page 23: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

The Chomsky Hierarchy and Context-Sensitive Grammars

A context-free grammar allows rules to have only a single nonterminal on their left-hand side.

Context-free grammars are not powerful enough to represent the rules of natural language syntax.

The context-sensitive languages form a proper superset of the context-free languages.

Page 24: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

The Chomsky Hierarchy and Context-Sensitive Grammars

Here, more than one symbol is allowed on the left-hand side of a rule, which makes it possible to define a context in which that rule can be applied.

This ensures satisfaction of global constraints such as number agreement and other semantic checks.

The semantic error in the earlier example could be detected in a context-sensitive grammar if a nonterminal, act_of_biting, is added to the grammar, preventing the sentence "man bites dog" from being valid.

Page 25: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

The Chomsky Hierarchy

To correctly express the grammatical structure of a language (e.g. English), rules are needed.

We can classify grammars according to the kinds of rules that appear in them.

Having done that, we can classify languages into families according to the kinds of rules that are needed to express their grammars.
– One such means of classifying grammars in this manner is called the Chomsky Hierarchy

Page 26: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

Grammar Hierarchy

[Figure: the grammars for the language, labelled Type 0, Type 1, Type 2 and Type 3.]

Page 27: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

Grammar Hierarchy

Type 0
Name of Grammar: Transformation Grammar
Form of Rules: anything → anything
Computational Power: General Turing Machine
String Characteristics: Any form

Page 28: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

Grammar Hierarchy

Type 1
Name of Grammar: Context-sensitive
Form of Rules: A B C → A D C
Computational Power: Linear Bounded Automata
String Characteristics: a^n b^n c^n

a1 a2 a3 b1 b2 b3
Crossing dependencies
E.g. Ali1 helps4 Ahmad2 to teach5 Aida3 programming6.
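To make the string characteristic concrete, here is a direct membership check for strings consisting of n a's, then n b's, then n c's. It is not a linear bounded automaton or a context-sensitive grammar, only a sketch of which strings belong to the language; the function name and the choice to require n ≥ 1 are assumptions.

```python
def is_anbncn(s):
    """True if s is n a's, then n b's, then n c's, for some n >= 1 (assumed lower bound)."""
    n, remainder = divmod(len(s), 3)
    if remainder != 0 or n == 0:
        return False
    # Three counts must agree at once, which a single stack cannot keep track of.
    return s == "a" * n + "b" * n + "c" * n
```

For example, `is_anbncn("aabbcc")` holds while `is_anbncn("aabbc")` does not.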

Page 29: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

Type 2
Name of Grammar: Context-free
Form of Rules: A → B C D ...
Computational Power: Push Down Stack Automata
String Characteristics: a^n b^n

a1 a2 a3 b3 b2 b1
Nested dependencies
E.g. Ali1 who studies in UNITEN2 that offers2 quality programmes is1 graduating
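The link between a push-down stack and nested dependencies can be sketched as follows: each a is pushed, and each b pops the most recently pushed a, so the pairs come out nested (a1 a2 a3 b3 b2 b1). The function name `match_nested` and the requirement n ≥ 1 are assumptions for this illustration.

```python
def match_nested(s):
    """Return the (a-index, b-index) pairs for a string of n a's then n b's, else None."""
    stack, pairs = [], []
    for i, ch in enumerate(s):
        if ch == "a" and not pairs:  # still reading the leading a's
            stack.append(i)
        elif ch == "b" and stack:    # each b closes the most recently opened a
            pairs.append((stack.pop(), i))
        else:
            return None
    # Accept only if at least one pair was formed and no a is left unmatched.
    return pairs if pairs and not stack else None
```

For "aaabbb" the innermost pair is matched first and the outermost last, which is the nested pattern named on the slide.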

Page 30: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

Type 3
Name of Grammar: Right Linear
Form of Rules: A → x B (x in terminal category)
Computational Power: Finite State Automata
String Characteristics: a* b*

They have been used for grammars of Morphology
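A finite-state automaton for a* b* needs only a transition table and a current state, with no stack or counter. The sketch below uses two hypothetical states, "A" (still reading a's) and "B" (reading b's), which correspond to a right-linear grammar such as A → a A, A → b B, B → b B with both states allowed to stop.

```python
# (current state, input symbol) -> next state; any missing entry means reject.
TRANSITIONS = {
    ("A", "a"): "A",
    ("A", "b"): "B",
    ("B", "b"): "B",
}
ACCEPTING = {"A", "B"}


def accepts(s):
    """True if s is any number of a's followed by any number of b's."""
    state = "A"
    for ch in s:
        state = TRANSITIONS.get((state, ch))
        if state is None:  # no arc for this symbol from the current state
            return False
    return state in ACCEPTING
```

Unlike the Type 2 case, the numbers of a's and b's are unrelated here, so "aabbb" and the empty string are both accepted.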

Page 31: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

NATURAL LANGUAGE PROCESSING

Problems with NLP:
– Ambiguity
  » multiple word meanings
    the pitcher is angry
    the pitcher is empty
– Inaccuracy
– Incompleteness

Page 32: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

– Imprecision
  » I've been waiting for you for a long time.
  » The king ruled the kingdom for a long time.
– Unclear antecedents
  » Ben hits Bill because he sympathized with Mary.

How do people overcome natural language problems?
– Context
– Familiarity
– Expectations

Page 33: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

SPEECH RECOGNITION

Advantages:
» most natural method
» ease of access
» speed
» manual freedom
» remote access

Context
» Isolated Word Recognition (IWR)
» Connected Word Recognition (CWR)
» Continuous Speech Recognition (CSR)

Analysing Speech
» Syllable, Phonemes, Allophones

Page 34: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

Speech Component: A Holistic Look

Input: Freq. spectrum

Speech recognition
– Output (Word Sentence): She likes ice-cream
– Other data: Match with other sound frequencies

Syntactic Analysis
– Output (Sentence Structure): She like ice-cream
– Other data: Grammar of language (dictionary)

Semantic Analysis
– Output (Partial Meaning): x. Likes(x,ice-cream)
– Other data: Meaning of each word (thesaurus)

Pragmatics
– Output (Full Sentence Meaning): likes(siti, ice-cream).
– Other data: Context of utterance

Page 35: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

Exercises:

Page 36: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

Question: Consider the grammar defined by the following BNF:

S → NP VP
NP → <name> | <det> <noun> | PP
VP → <verb> | <verb> NP NP | <verb> NP PP | VP NP
PP → <prep> NP

<name> → "Ben" | "Ann"
<noun> → "morning" | "ice-cream"
<verb> → "gave" | "saw"
<det> → "the"
<prep> → "to" | "in"

S – SENTENCE
NP – NOUN PHRASE
VP – VERB PHRASE
DET – DETERMINER
PP – PREPOSITIONAL PHRASE

Page 37: CCSB354 ARTIFICIAL INTELLIGENCE (AI)

Draw parse trees in this grammar for the following sentences.

a) Ben gave Ann the ice-cream.
b) Ann gave the ice-cream to Ben.

Answers:

S → NP VP
NP → <name> | <det> <noun> | PP
VP → <verb> | <verb> NP NP | <verb> NP PP | VP NP
PP → <prep> NP

<name> → "Ben" | "Ann"
<noun> → "morning" | "ice-cream"
<verb> → "gave" | "saw"
<det> → "the"
<prep> → "to" | "in"