CS626-460: Language Technology for the Web/Natural Language Processing

Pushpak Bhattacharyya, CSE Dept., IIT Bombay

Beta probabilities; parser evaluation criteria


Page 1

CS626-460: Language Technology for the Web/Natural Language Processing

Pushpak Bhattacharyya, CSE Dept., IIT Bombay

Beta probabilities; parser evaluation criteria

Page 2

Inside and Outside probabilities and their usage

Inside Probability

$\beta_j(k,l) = P(w_{k,l} \mid N^j)$

$\beta_j(k,l)$ gives the probability that $N^j$ yields $w_{k,l}$, the words from position $k$ to position $l$.

[Figure: $N^j$ dominating the span $w_k \ldots w_l$]

Page 3

Outside Probability

$\alpha_j(k,l) = P(w_{1,k-1}, N^j_{k,l}, w_{l+1,m})$

$w_{1,m}$ is the sentence.

$\alpha_j(k,l)$ is the probability that $N^j$ spans $w_{k,l}$ and is surrounded by $w_{1,k-1}$ on the left and $w_{l+1,m}$ on the right.

To calculate the probability of a sentence:

$P(w_{1,m}) = \beta_1(1,m)$, where $N^1$ is the start symbol.

[Figure: $N^j$ spanning $w_k \ldots w_l$ inside the full sentence $w_1 \ldots w_m$]
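As an illustration of how the two quantities are used together, the following worked step is a standard PCFG identity, not something stated on the slides. It assumes the usual context-free independence assumption: given that $N^j$ spans positions $k$ to $l$, the words inside the span are independent of the words outside it.

```latex
\begin{align*}
P(w_{1,m}, N^j_{k,l})
  &= P(w_{1,k-1}, N^j_{k,l}, w_{l+1,m}) \cdot
     P(w_{k,l} \mid w_{1,k-1}, N^j_{k,l}, w_{l+1,m}) \\
  &= P(w_{1,k-1}, N^j_{k,l}, w_{l+1,m}) \cdot P(w_{k,l} \mid N^j_{k,l}) \\
  &= \alpha_j(k,l) \, \beta_j(k,l)
\end{align*}
```

The product of the outside and inside probabilities therefore gives the joint probability of the sentence and of a constituent $N^j$ covering $w_{k,l}$.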

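Page 4

Recursive calculation of β

Base case: $\beta_j(k,k) = P(w_{k,k} \mid N^j) = P(N^j \to w_k)$

Assume the grammar to be in Chomsky Normal Form (CNF).

$\beta_j(k,l) = P(w_{k,l} \mid N^j)$

$= \sum_{p,q,m} P(w_{k,m}, N^p_{k,m}, w_{m+1,l}, N^q_{m+1,l} \mid N^j)$ (marginalization over the child nonterminals $N^p$, $N^q$ and the split point $m$, with $k \le m < l$; in this derivation $m$ is the split point, not the sentence length)

[Figure: $N^j$ with children $N^p$ over $w_k \ldots w_m$ and $N^q$ over $w_{m+1} \ldots w_l$]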

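Page 5

Continuing the derivation of $\beta_j(k,l)$:

$= \sum_{p,q,m} P(N^p_{k,m}, N^q_{m+1,l} \mid N^j) \cdot P(w_{k,m} \mid N^p_{k,m}, N^q_{m+1,l}, N^j) \cdot P(w_{m+1,l} \mid N^p_{k,m}, N^q_{m+1,l}, N^j, w_{k,m})$ (chain rule)

$= \sum_{p,q,m} P(N^j \to N^p N^q) \cdot P(w_{k,m} \mid N^p_{k,m}) \cdot P(w_{m+1,l} \mid N^q_{m+1,l})$ (context-free independence assumptions)

$= \sum_{p,q,m} P(N^j \to N^p N^q) \cdot \beta_p(k,m) \cdot \beta_q(m+1,l)$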
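The recursion above can be sketched as a bottom-up dynamic program. This is a minimal sketch, not code from the course: the function name `inside_probabilities`, the dictionary encoding of the rules, and the toy grammar and sentence are all assumptions made for illustration. Positions are 1-based to match the slides, and the loop variable `split` plays the role of the split point $m$.

```python
from collections import defaultdict


def inside_probabilities(words, lexical, binary):
    """Compute beta[(j, k, l)] for a PCFG in Chomsky Normal Form.

    words   : list of tokens w_1..w_m (positions are 1-based)
    lexical : dict mapping (N_j, word) -> P(N_j -> word)
    binary  : dict mapping (N_j, N_p, N_q) -> P(N_j -> N_p N_q)
    """
    n = len(words)
    beta = defaultdict(float)

    # Base case: beta_j(k, k) = P(N_j -> w_k)
    for k in range(1, n + 1):
        for (j, word), prob in lexical.items():
            if word == words[k - 1]:
                beta[(j, k, k)] += prob

    # Recursive case, filling shorter spans before longer ones:
    # beta_j(k, l) = sum over p, q, split of
    #                P(N_j -> N_p N_q) * beta_p(k, split) * beta_q(split+1, l)
    for span in range(2, n + 1):
        for k in range(1, n - span + 2):
            l = k + span - 1
            for (j, p, q), rule_prob in binary.items():
                total = 0.0
                for split in range(k, l):
                    total += beta[(p, k, split)] * beta[(q, split + 1, l)]
                beta[(j, k, l)] += rule_prob * total
    return beta


# Toy grammar and sentence (illustrative assumptions, not from the slides).
LEXICAL = {
    ("NP", "astronomers"): 0.1,
    ("NP", "ears"): 0.18,
    ("NP", "saw"): 0.04,
    ("NP", "stars"): 0.18,
    ("NP", "telescopes"): 0.1,
    ("V", "saw"): 1.0,
    ("P", "with"): 1.0,
}
BINARY = {
    ("S", "NP", "VP"): 1.0,
    ("PP", "P", "NP"): 1.0,
    ("VP", "V", "NP"): 0.7,
    ("VP", "VP", "PP"): 0.3,
    ("NP", "NP", "PP"): 0.4,
}
SENTENCE = "astronomers saw stars with ears".split()

beta = inside_probabilities(SENTENCE, LEXICAL, BINARY)
# Sentence probability: the inside probability of the start symbol
# over the whole sentence.
print(f"{beta[('S', 1, len(SENTENCE))]:.7f}")  # prints 0.0015876
```

The sentence has two parses under this toy grammar (the prepositional phrase attaches either to the verb phrase or to the noun phrase), and the inside probability sums over both of them.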

Page 6

Assignment 3

Study and note the relative merits of the Charniak Parser, Collins Parser, Stanford Parser, and RASP (Robust, Accurate, Statistical Parser).

Criteria
• Robustness to ungrammaticality
• Ranking in case of multiple parses
• Time taken
• How efficiently embedding is handled
  – Example: The cat that killed the rat that stole the milk that spilled on the floor that was slippery escaped
• How effectively multiple POS tags are handled
  – i.e., if the words can take numerous POS tags, does the parser still work?
• Whether it can handle repeated words with changing POS
  – Example: Buffalo buffaloes Buffalo buffaloes buffalo buffalo Buffalo buffaloes
  – Compare: Black cows brown cows cow cow white cows
• Length of the sentence
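The time, embedding, and sentence-length criteria can be probed with a small harness like the one below. This is a rough sketch under stated assumptions: `embedded_sentence` and `time_parser` are hypothetical helper names, the clause list is taken from the embedding example above, and `parse` stands for whatever callable wraps the parser under study. None of the four parsers' actual interfaces are shown here.

```python
import time

# Relative clauses from the embedding example, in order.
CLAUSES = [
    "that killed the rat",
    "that stole the milk",
    "that spilled on the floor",
    "that was slippery",
]


def embedded_sentence(depth):
    """Return the example sentence with `depth` nested relative clauses."""
    if not 0 <= depth <= len(CLAUSES):
        raise ValueError("depth must be between 0 and %d" % len(CLAUSES))
    return " ".join(["The cat"] + CLAUSES[:depth] + ["escaped"])


def time_parser(parse, sentences, repeats=3):
    """Time `parse` on each sentence; return (word_count, best_seconds) pairs.

    `parse` is any callable taking a sentence string (hypothetical wrapper
    around the parser being evaluated).
    """
    results = []
    for sentence in sentences:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            parse(sentence)
            best = min(best, time.perf_counter() - start)
        results.append((len(sentence.split()), best))
    return results


# Stand-in "parser" so the sketch runs on its own; replace with a real one.
sentences = [embedded_sentence(d) for d in range(len(CLAUSES) + 1)]
timings = time_parser(str.split, sentences)
for words, seconds in timings:
    print(words, "words:", seconds, "s")
```

Plotting the timings against word count or embedding depth gives a quick view of how each parser scales; robustness and ranking quality still need to be judged by inspecting the parses themselves.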

Page 7

[Figure: parse tree for "Buffalo buffaloes Buffalo buffaloes buffalo buffalo Buffalo buffaloes", built up node by node over successive animation frames. The root S expands to NP VP; the subject NP contains an embedded clause S'; the remaining nodes are labeled NP, VP, V, N and NN.]