09.04.2003 CSA2050: DCG I 1
CSA2050 Introduction to Computational Linguistics
Lecture 8
Definite Clause Grammars
Rationale
[Diagram with the labels: Logic, CFG, Sentence, Prolog Program, Sentence Structure]
Logic Rules and Grammar Rules
Basic Question: what is the connection between logic rules and grammar rules?
∀x ∀y male(x) & parent(x,y) → father(x,y)
S → NP VP
They are both concerned with the definition of predicates.
Logic Rules and Grammar Rules
Logic: arbitrary n-ary predicates, e.g. raining; clever(x); father(x,y); between(x,y,z)
Grammar Rules: predicates over text segments, e.g. np(x); vp(y); s(z).
Text Segments
A text segment is a sequence of consecutive words.
A text segment can be identified by two pointers, if we assign names to the spaces between words.
0 the 1 cat 2 sat 3 on 4 the 5 mat 6
(0,6) is the whole sentence
(0,2) is the first noun phrase
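The pointer idea can be sketched in Prolog, assuming the sentence is held as a plain list of words. The predicate name segment/4 is hypothetical and not part of the lecture; positions count the gaps between words, starting at 0.

```prolog
:- use_module(library(lists)).   % append/3

% segment(+Words, +P1, +P2, -Segment)
% Segment is the list of consecutive words lying between positions P1 and P2,
% where position 0 is the gap before the first word.
segment(Words, P1, P2, Segment) :-
    length(Before, P1),            % Before holds the first P1 words
    append(Before, Rest, Words),   % Rest is everything from position P1 on
    Len is P2 - P1,
    Len >= 0,
    length(Segment, Len),          % Segment holds the next P2-P1 words
    append(Segment, _, Rest).

% ?- segment([the,cat,sat,on,the,mat], 0, 2, S).
% S = [the,cat]
```

Asking for positions 0 and 6 returns the whole sentence, matching the (0,6) example.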
From Grammar Rules to Logic
The general statement made by the CF rule S → NP VP
can be summarised using predicates over segments with the following logic statement
NP(p1,p) & VP(p,p2) => S(p1,p2)
From Grammar Rules to Logic
0 the 1 cat 2 sat 3 on 4 the 5 mat 6
NP: (0,2)
VP: (2,6)
S: (0,6)
From Logic to Prolog
Each logic statement of the form
NP(p1,p) & VP(p,p2) => S(p1,p2)
corresponds to the "definite clause"
s(P1,P2) :- np(P1,P), vp(P,P2).
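As a small sketch, the clause can be tried against hand-written facts for the example sentence. The facts np(0,2) and vp(2,6) are assumptions standing in for the rules and lexicon that have not been defined at this point.

```prolog
% Assumed facts for "0 the 1 cat 2 sat 3 on 4 the 5 mat 6".
np(0, 2).    % "the cat"
vp(2, 6).    % "sat on the mat"

% The definite clause for S -> NP VP: an S spans P1..P2 if an NP spans
% P1..P and a VP spans P..P2 for some intermediate position P.
s(P1, P2) :- np(P1, P), vp(P, P2).

% ?- s(0, 6).    succeeds
% ?- s(0, X).    binds X = 6
```

The shared variable P is what forces the noun phrase to end exactly where the verb phrase begins.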
Converting a Grammar
S → NP VP
NP → N
NP → Det N
VP → V NP
s(P1,P2) :- np(P1,P), vp(P,P2).
np(P1,P2) :- n(P1,P2).
np(P1,P2) :- det(P1,P), n(P,P2).
vp(P1,P2) :- v(P1,P), np(P,P2).
Lexical Categories and Rules
Lexical categories are those which are not defined in the grammar itself (e.g. N and V in our grammar).
Instead, they are defined by the words that they rewrite:
V → run, sleep, talk etc.
Lexical categories always derive exactly one input token.
Lexical Rules
A rule defining lexical category C must express the following information:
there is a C between positions p1 and p2 if some word of syntactic category C spans those positions.
There are many different ways to translate such a rule into a Prolog clause.
Each way needs to make reference to how the input sentence is represented.
Defining Lexical Categories
Each category is defined in terms of the words it can rewrite:
d(P1,P2) :- input(P1,P2,[the]).
n(P1,P2) :- input(P1,P2,[cat]).
n(P1,P2) :- input(P1,P2,['John']).
v(P1,P2) :- input(P1,P2,[ate]).
How is the input sentence represented?
Representing the Input
Define the predicate input(P1,P2,L) such that P1 and P2 are positions and L is a list containing the words spanning those positions
Checkpoint: show how to represent the input sentence "John ate the cat"
John ate the cat
input(0,1,['John']).
input(1,2,[ate]).
input(2,3,[the]).
input(3,4,[cat]).
Checkpoints
Why is John in quotes?
Why use a list of one element rather than an atom?
Is this the only way to do it?
Complete Program
1. Grammar
s(P1,P2) :- np(P1,P), vp(P,P2).
np(P1,P2) :- n(P1,P2).
np(P1,P2) :- d(P1,P), n(P,P2).
vp(P1,P2) :- v(P1,P2).
vp(P1,P2) :- v(P1,P), np(P,P2).
2. Lexicon
d(P1,P2) :- input(P1,P2,[the]).
n(P1,P2) :- input(P1,P2,[cat]).
n(P1,P2) :- input(P1,P2,['John']).
v(P1,P2) :- input(P1,P2,[ate]).
3. Input
input(0,1,['John']).
input(1,2,[ate]).
input(2,3,[the]).
input(3,4,[cat]).
4. Query
?- s(0,4).
Trace of query ?- vp(1,4).
1 1 Call: vp(1,4) ?
2 2 Call: v(1,4) ?
3 3 Call: input(1,4,[ate]) ?
3 3 Fail: input(1,4,[ate]) ?
2 2 Fail: v(1,4) ?
2 2 Call: v(1,_349) ?
3 3 Call: input(1,_349,[ate]) ?
3 3 Exit: input(1,2,[ate]) ?
2 2 Exit: v(1,2) ?
4 2 Call: np(2,4) ?
5 3 Call: n(2,4) ?
6 4 Call: input(2,4,[cat]) ?
6 4 Fail: input(2,4,[cat]) ?
6 4 Call: input(2,4,[John]) ?
6 4 Fail: input(2,4,[John]) ?
5 3 Fail: n(2,4) ?
5 3 Call: d(2,_1338) ?
6 4 Call: input(2,_1338,[the]) ?
6 4 Exit: input(2,3,[the]) ?
5 3 Exit: d(2,3) ?
7 3 Call: n(3,4) ?
8 4 Call: input(3,4,[cat]) ?
8 4 Exit: input(3,4,[cat]) ?
7 3 Exit: n(3,4) ?
4 2 Exit: np(2,4) ?
1 1 Exit: vp(1,4) ?
Representing the Sentence Using Difference Lists
We can represent the input as a pair of pointers.
The first pointer points to the entire list.
The second pointer points to a suffix of the list.
The represented list is the difference between the two lists.
input(['John',ate,the,cat],['John',ate,the,cat]).
input(['John',ate,the,cat],[ate,the,cat]).
input(['John',ate,the,cat],[the,cat]).
input(['John',ate,the,cat],[]).
input([X|Y],Y,X).
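A sketch of how the earlier complete program might look once positions are lists rather than numbers. It is an adaptation for illustration, not the lecture's own code: it assumes an input/3 whose third argument is a one-word list, so it wraps the word as [W], unlike input([X|Y],Y,X) above, in order to reuse the [the]-style lexicon clauses unchanged.

```prolog
% A position is the list of words still to be read.
% input(P1, P2, [W]): word W lies between position P1 and position P2.
input([W|Rest], Rest, [W]).

% Grammar (same clauses as in the position-number version)
s(P1, P2)  :- np(P1, P), vp(P, P2).
np(P1, P2) :- n(P1, P2).
np(P1, P2) :- d(P1, P), n(P, P2).
vp(P1, P2) :- v(P1, P2).
vp(P1, P2) :- v(P1, P), np(P, P2).

% Lexicon
d(P1, P2) :- input(P1, P2, [the]).
n(P1, P2) :- input(P1, P2, [cat]).
n(P1, P2) :- input(P1, P2, ['John']).
v(P1, P2) :- input(P1, P2, [ate]).

% ?- s(['John',ate,the,cat], []).    succeeds
% ?- np([the,cat,ate], Rest).        binds Rest = [ate]
```

No separate input facts are needed any more: the sentence is passed in the query itself, and the single input/3 clause works for any sentence.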
DCG Notation
The conversion of CF rules into Prolog is so simple that it can be done automatically.
Clauses in DCG notation:
s --> np, vp.
np --> d, n.
n --> [cat].
are automatically translated when read in to
s(P1,P2) :- np(P1,P), vp(P,P2).
np(P1,P2) :- d(P1,P), n(P,P2).
n([cat|L],L).
DCG Notation
Every DCG rule takes the form
nonterminal --> expansion
where expansion is any of
A nonterminal symbol: np
A list of terminal symbols: [each,other]
A null constituent: [ ]
A plain Prolog goal enclosed in braces: {write('Found')}
A series of any of these expansions joined by commas.
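The kinds of expansion listed above can be seen together in a toy grammar. This is a hypothetical example (the nonterminals recip and adv and the word list are invented for illustration). It is called through phrase/2, the conventional entry point, which here amounts to calling s(Words, []).

```prolog
:- use_module(library(lists)).   % member/2

s     --> np, vp.                          % nonterminals joined by commas
np    --> [they].                          % a list of terminal symbols
vp    --> v, recip, adv.
v     --> [W], { member(W, [like, see]) }. % Prolog goal in braces checks the word
recip --> [each, other].                   % a terminal list of two words
adv   --> [].                              % null constituent: consumes no input
adv   --> [often].

% ?- phrase(s, [they, like, each, other]).         succeeds
% ?- phrase(s, [they, see, each, other, often]).   succeeds
% ?- phrase(s, [they, each, other]).               fails (no verb)
```

The goal in braces is run as ordinary Prolog and does not consume any input, which is why it can be used for tests such as the membership check.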
Complete DCG
1. Grammar
s --> np, vp.
np --> n.
np --> d, n.
vp --> v.
vp --> v, np.
2. Lexicon
d --> [the].
n --> [cat].
n --> ['John'].
v --> [ate].
3. Input (given directly in the query)
4. Query
?- s(['John',ate,the,cat], []).
Checkpoints
What is your system's translation of
s --> np, vp.
n --> [cat].
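One way to find out, sketched here on the assumption that your system provides expand_term/2 and portray_clause/1 (as SWI-Prolog and many other Prolog systems do), is to ask for the translation directly. The helper name show_translation/1 is hypothetical; variable names and minor details of the printed clause differ between systems.

```prolog
% show_translation(+Rule): print the clause the system produces for a DCG rule.
show_translation(Rule) :-
    expand_term(Rule, Clause),   % apply the system's DCG translation
    portray_clause(Clause).      % print the resulting ordinary clause

% ?- show_translation((s --> np, vp)).
% ?- show_translation((n --> [cat])).
```

Note the extra parentheses around the rule: without them the comma and the --> operator would be read as separate arguments.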