"SSC" - Geometry and Semantics of Language
Meaning in the real world

Extensional representation of meaning!

automobile
Meaning in the real world

Intensional representation of meaning!

automòbile adj. and fem. noun [from Fr. automobile ‹otomobìl›, comp. of auto-1 and the Lat. adj. mobĭlis "that moves"]. – 1. adj. Self-moving, especially of vehicles that move on land (or also in water, e.g. self-propelled craft such as torpedoes and missiles) by means of their own engine. 2. fem. noun. A four-wheeled motor vehicle, generally with an internal-combustion engine, used to carry a limited number of people on ordinary roads…

AUTOMOBILE
automobile
Meaning in our mind

The representation of the concept AUTOMOBILE in our mind

Connectionist (neural networks)
Symbolic (logical formalism)
Meaning in text

Distributional semantics: what is an automobile?

AUTOMOBILE is the set of linguistic contexts in which the word automobile occurs
A bottle of Tezguno is on the table.
Everyone likes Tezguno.
Tezguno makes you drunk.
We make Tezguno out of corn.
What’s Tezguno?
Distributional semantic models

You shall know a word by the company it keeps! (J. R. Firth)

The meaning of a word is determined by its usage
Distributional semantic models

• Computational models that build semantic representations of words by analyzing corpora
– Words are represented as vectors
– Vectors are built by statistically analyzing the linguistic contexts in which the words occur
Distributional vector

1. count how many times a word occurs in a given context
2. build a vector from the counts computed in step 1

SIMILAR WORDS WILL HAVE SIMILAR VECTORS
Matrix = geometric space

• Matrix: words × contexts

|       | C1 | C2 | C3 | C4 | C5 | C6 | C7 |
|-------|----|----|----|----|----|----|----|
| cane  | 5  | 0  | 11 | 2  | 2  | 9  | 1  |
| gatto | 4  | 1  | 7  | 1  | 1  | 7  | 2  |
| pane  | 0  | 12 | 0  | 0  | 9  | 1  | 9  |
| pasta | 0  | 8  | 1  | 2  | 14 | 0  | 10 |
| carne | 0  | 7  | 1  | 1  | 11 | 1  | 8  |
| topo  | 4  | 0  | 8  | 0  | 1  | 8  | 1  |

(cane = dog, gatto = cat, pane = bread, carne = meat, topo = mouse)
Matrix = geometric space

[Plot: words as points in the plane of contexts C3 and C5; cane, gatto and topo cluster together, pasta lies apart]

Similarity -> closeness in a multi-dimensional space (cosine similarity)
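The similarity-as-closeness idea can be sketched by computing the cosine between rows of the word-by-context matrix above (a minimal pure-Python sketch using the slide's values):

```python
import math

# Rows of the word-by-context matrix from the slide (contexts C1..C7)
vectors = {
    "cane":  [5, 0, 11, 2, 2, 9, 1],
    "gatto": [4, 1, 7, 1, 1, 7, 2],
    "pane":  [0, 12, 0, 0, 9, 1, 9],
}

def cosine(u, v):
    """Cosine similarity: dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Words with similar contexts get similar vectors, hence high cosine:
print(round(cosine(vectors["cane"], vectors["gatto"]), 2))  # high (≈ 0.98)
print(round(cosine(vectors["cane"], vectors["pane"]), 2))   # low  (≈ 0.13)
```

cane and gatto occur in the same contexts (C1, C3, C6) and end up close in the space; cane and pane do not.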
Generalization

• A distributional model can be defined by the tuple <T, C, R, W, M, d, S>
– T: target elements -> the words (generally)
– C: the contexts
– R: the relation linking T to C
– W: weighting scheme
– M: geometric space T×C
– d: dimensionality-reduction function M -> M'
– S: similarity function on M'
Building a semantic space

1. Pre-process the corpus
2. Identify words and contexts
3. Count word/context co-occurrences
4. Weighting (optional, but recommended)
5. Build the T×C matrix
6. Reduce the matrix (optional)
7. Compute the similarity between vectors
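Steps 2-3 can be sketched as window-based co-occurrence counting over a toy corpus (a minimal sketch; the window size and the toy sentence are illustrative assumptions, not from the slides):

```python
from collections import defaultdict

def cooccurrences(tokens, n=2):
    """Count word/context co-occurrences within a symmetric window of size n."""
    counts = defaultdict(int)
    for i, target in enumerate(tokens):
        # Context = up to n tokens on each side of the target
        window = tokens[max(0, i - n):i] + tokens[i + 1:i + 1 + n]
        for context in window:
            counts[(target, context)] += 1
    return counts

corpus = "the dog chases the cat the cat chases the mouse".split()
counts = cooccurrences(corpus, n=2)
print(counts[("cat", "the")])  # 4
```

The resulting counts fill the cells of the T×C matrix; weighting (step 4) would then rescale them.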
The parameters

• The definition of context
– A window of size n, a sentence, a paragraph, a document, a particular syntactic context
• Weighting scheme
• Similarity function
An example

• Term-term matrix
– T: words
– C: words
– R: T occurs "near" C
– W: number of times T and C co-occur
– M: term/term matrix
– d: none, or e.g. SVD (Latent Semantic Analysis)
– S: cosine similarity
1. Pre-processing

• Tokenization is necessary!
– PoS tagging
– Lemmatization
– Parsing
• Too deep an analysis
– Introduces errors
– Requires further parameters
– Is language-dependent
• Pre-processing affects the choice of words and contexts
2. Defining the context

• The document
– the whole document
– a paragraph, a sentence, a portion of text (passage)
• The other words
– Usually the n most frequent ones are chosen
– Where?
• at a distance fixed a priori (window)
• syntactic dependency
• pattern
3. Weighting

• Frequency, or log(frequency) to dampen contexts that occur very often
• Idea: the rarer the events, the more informative their co-occurrence is about the strength of the association
– Mutual Information, Log-Likelihood Ratio
• Information Retrieval: tf-idf, word entropy, …
Pointwise Mutual Information

MI(w1, w2) = log2( P(w1, w2) / (P(w1) · P(w2)) )

where P(wi) = freq(wi) / N and P(w1, w2) = freq(w1, w2) / N

Local Mutual Information: LMI(w1, w2) = P(w1, w2) · MI(w1, w2)

Example (bere = to drink, birra = beer, acqua = water):

P(bere) = 100/10^6
P(birra) = 25/10^6
P(acqua) = 150/10^6
P(bere, acqua) = 60/10^6
P(bere, birra) = 20/10^6

MI(bere, birra) = log2(20·10^6 / (100·25)) ≈ 12.97
MI(bere, acqua) = log2(60·10^6 / (100·150)) ≈ 11.97

LMI(bere, birra) ≈ 0.000259
LMI(bere, acqua) ≈ 0.000718

MI tends to give too much weight to rare events
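The worked example above can be reproduced directly (a minimal sketch using the slide's frequencies and a corpus of N = 10^6 tokens):

```python
import math

N = 10**6  # corpus size from the slide

freq = {"bere": 100, "birra": 25, "acqua": 150}
cofreq = {("bere", "birra"): 20, ("bere", "acqua"): 60}

def mi(w1, w2):
    """Pointwise mutual information, log base 2."""
    p12 = cofreq[(w1, w2)] / N
    p1, p2 = freq[w1] / N, freq[w2] / N
    return math.log2(p12 / (p1 * p2))

def lmi(w1, w2):
    """Local MI: PMI weighted by the joint probability."""
    return (cofreq[(w1, w2)] / N) * mi(w1, w2)

# MI prefers the rarer pair (birra); LMI reverses the ranking:
print(mi("bere", "birra"), mi("bere", "acqua"))
print(lmi("bere", "birra"), lmi("bere", "acqua"))
```

This makes the closing remark concrete: MI ranks the rare pair (bere, birra) highest, while the frequency weighting in LMI promotes (bere, acqua).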
5. Matrix reduction

• M (T×C) is a highly dimensional matrix; it can be useful to reduce it:
1. Find the latent dimensions: LSI, PCA
2. Reduce the space: approximations of M, e.g. Random Indexing
Random Indexing
The method

• Assign a "random vector" to each context: the random/context vector
• The semantic vector associated with each target (e.g. a word) is the sum of the context vectors of all the contexts in which the target (word) occurs
Context vector

• sparse
• highly dimensional
• ternary, values in {-1, 0, +1}
• a small number of non-zero elements, randomly distributed

0 0 0 0 0 0 0 -1 0 0 0 0 1 0 0 -1 0 1 0 0 0 0 1 0 0 0 0 -1
Random Indexing (formal)

B(n×k) = A(n×m) · R(m×k),   with k << m

B preserves the distance between points (Johnson-Lindenstrauss lemma): if d is the distance between two points in A, their distance r in B satisfies r ≈ c·d for a constant c
Example

John eats a red apple

Rjohn -> (0, 0, 0, 0, 0, 0, 1, 0, -1, 0)
Reat -> (1, 0, 0, 0, -1, 0, 0, 0, 0, 0)
Rred -> (0, 0, 0, 1, 0, 0, 0, -1, 0, 0)

SVapple = Rjohn + Reat + Rred = (1, 0, 0, 1, -1, 0, 1, -1, -1, 0)
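The Random Indexing sum in the example above can be checked directly (a minimal sketch using the slide's vectors):

```python
# Random (context) vectors of the words that co-occur with "apple"
r_john = [0, 0, 0, 0, 0, 0, 1, 0, -1, 0]
r_eat  = [1, 0, 0, 0, -1, 0, 0, 0, 0, 0]
r_red  = [0, 0, 0, 1, 0, 0, 0, -1, 0, 0]

# Semantic vector of "apple" = sum of the context vectors it occurs with
sv_apple = [a + b + c for a, b, c in zip(r_john, r_eat, r_red)]
print(sv_apple)  # [1, 0, 0, 1, -1, 0, 1, -1, -1, 0]
```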
Random Indexing

• Advantages
– Simple and fast
– Scalable and parallelizable
– Incremental
• Disadvantages
– Requires a lot of memory
Permutations

• Use different permutations of the elements of the random vector to encode different contexts
– word order
– (syntactic/relational) dependencies between terms
• Before being summed, the random vector is permuted according to the context being encoded
Example (word order)

John eats a red apple

Rjohn -> (0, 0, 0, 0, 0, 0, 1, 0, -1, 0)
Reat -> (1, 0, 0, 0, -1, 0, 0, 0, 0, 0)
Rapple -> (0, 1, 0, 0, 0, 0, 0, 0, 0, -1)

SVred = Π-2(Rjohn) + Π-1(Reat) + Π+1(Rapple) =
(0,0,0,0,1,0,-1,0,0,0) + (0,0,0,-1,0,0,0,0,0,1) + (-1,0,1,0,0,0,0,0,0,0) = (-1,0,1,-1,1,0,-1,0,0,1)
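The word-order permutations in the example above are plain rotations, so the computation can be sketched as (Π^n implemented as rotation by n positions; values from the slide):

```python
def rotate(v, n):
    """Permutation Π^n: rotate right by n positions (negative n rotates left)."""
    n %= len(v)
    return v[-n:] + v[:-n]

r_john  = [0, 0, 0, 0, 0, 0, 1, 0, -1, 0]
r_eat   = [1, 0, 0, 0, -1, 0, 0, 0, 0, 0]
r_apple = [0, 1, 0, 0, 0, 0, 0, 0, 0, -1]

# "red" sees John two positions to its left (Π^-2), "eats" one to its
# left (Π^-1) and "apple" one to its right (Π^+1).
sv_red = [a + b + c for a, b, c in zip(rotate(r_john, -2),
                                       rotate(r_eat, -1),
                                       rotate(r_apple, +1))]
print(sv_red)  # [-1, 0, 1, -1, 1, 0, -1, 0, 0, 1]
```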
Permutations (query)

• At query time, apply the inverse permutation according to the context
• Word order: <t> ? (the most similar terms occurring to the right of <t>)
– Permute by -1 the random vector of t: Π-1(Rt)
• t must appear to the left
– Compute the similarity of Π-1(Rt) with all the vectors in the term space
SIMPLE DSMS AND SIMPLE OPERATORS
Simple DSMs…

Term-term co-occurrence matrix (TTM): each cell contains the number of co-occurrences of two terms within a fixed distance

|          | dog | cat | computer | animal | mouse |
|----------|-----|-----|----------|--------|-------|
| dog      | 0   | 4   | 0        | 2      | 1     |
| cat      | 4   | 0   | 0        | 3      | 5     |
| computer | 0   | 0   | 0        | 0      | 3     |
| animal   | 2   | 3   | 0        | 0      | 2     |
| mouse    | 1   | 5   | 3        | 2      | 0     |
…Simple DSMs
Latent Semantic Analysis (LSA): relies on the Singular Value Decomposition (SVD) of the co-occurrence matrix
Random Indexing (RI): based on the Random Projection
Latent Semantic Analysis over Random Indexing (RILSA)
Latent Semantic Analysis over Random Indexing

1. Reduce the dimension of the co-occurrence matrix using RI
2. Perform LSA over RI (RILSA)
– reduces LSA computation time: the RI matrix has fewer dimensions than the co-occurrence matrix
Simple operators…

Addition (+): pointwise sum of components

Multiplication (∘): pointwise multiplication of components

Addition and multiplication are commutative
– they do not take word order into account

Complex structures are represented by summing or multiplying the words that compose them
…Simple operators

Given two word vectors u and v
– composition by sum: p = u + v
– composition by multiplication: p = u ∘ v

Can be applied to any sequence of words
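The two simple operators can be sketched in a few lines (u and v are small hypothetical word vectors, not taken from the slides):

```python
u = [1, 0, 2, -1]   # hypothetical word vector
v = [0, 3, 1, -1]   # hypothetical word vector

# Composition by sum: pointwise addition of components
p_sum = [a + b for a, b in zip(u, v)]
# Composition by multiplication: pointwise product of components
p_mult = [a * b for a, b in zip(u, v)]

print(p_sum)   # [1, 3, 3, -2]
print(p_mult)  # [0, 0, 2, 1]

# Both are commutative, so "u v" and "v u" get the same representation:
print(p_sum == [a + b for a, b in zip(v, u)])  # True
```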
SYNTACTIC DEPENDENCIES IN DSMS
Syntactic dependencies…

John eats a red apple.

[Dependency tree: "eats" -subject-> "John", "eats" -object-> "apple", "apple" -modifier-> "red"]
…Syntactic dependencies

John eats a red apple.

[Same dependency tree, with the HEAD and DEPENDENT roles marked on each arc]
Representing dependencies

Use the filler/role binding approach to represent a dependency dep(u, v):

rd ⊚ u + rh ⊚ v

rd and rh are vectors representing the dependent and head roles, respectively

⊚ is a placeholder for a composition operator
Representing dependencies (example)

obj(apple, eat)

rd ⊚ apple + rh ⊚ eat

rd and rh are the role vectors
Structured DSMs

1. Vector permutation in RI (PERM) to encode dependencies
2. Circular convolution (CONV) as a filler/role binding operator to represent syntactic dependencies in DSMs
3. LSA over PERM and CONV yields two further spaces: PERMLSA and CONVLSA
Vector permutation in RI (PERM)

Use permutations of the elements of the context vectors to encode dependencies:
– right rotation by n elements to encode dependents (permutation)
– left rotation by n elements to encode heads (inverse permutation)
PERM (method)

Create and assign a context vector to each term

Assign a rotation function Π+1 to the dependent role and Π-1 to the head role

Each term is represented by a vector which is
– the sum of the permuted vectors of all its dependent terms
– the sum of the inverse-permuted vectors of all its head terms
– the sum of the unpermuted vectors of both dependent and head words
PERM (example…)

John eats a red apple

John -> (0, 0, 0, 0, 0, 0, 1, 0, -1, 0)
eat -> (1, 0, 0, 0, -1, 0, 0, 0, 0, 0)
red -> (0, 0, 0, 1, 0, 0, 0, -1, 0, 0)
apple -> (1, 0, 0, 0, 0, 0, 0, -1, 0, 0)

[Dependency fragment: "red" is a dependent of "apple"; "eats" is the head of "apple"]

TVapple = Π+1(CVred) + Π-1(CVeat) + CVred + CVeat
PERM (…example)

John eats a red apple

John -> (0, 0, 0, 0, 0, 0, 1, 0, -1, 0)
eat -> (1, 0, 0, 0, -1, 0, 0, 0, 0, 0)
red -> (0, 0, 0, 1, 0, 0, 0, -1, 0, 0)
apple -> (1, 0, 0, 0, 0, 0, 0, -1, 0, 0)

TVapple = Π+1(CVred) + Π-1(CVeat) + CVred + CVeat =
= (0, 0, 0, 0, 1, 0, 0, 0, -1, 0)   [right shift of red]
+ (0, 0, 0, -1, 0, 0, 0, 0, 0, 1)   [left shift of eat]
+ (0, 0, 0, 1, 0, 0, 0, -1, 0, 0)
+ (1, 0, 0, 0, -1, 0, 0, 0, 0, 0)
= (1, 0, 0, 0, 0, 0, 0, -1, -1, 1)
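The PERM computation for TVapple can be sketched with plain rotations (vectors from the example; Π^+1 as right rotation, Π^-1 as left rotation):

```python
def rotate(v, n):
    """Π^n: rotate right by n positions (negative n rotates left)."""
    n %= len(v)
    return v[-n:] + v[:-n]

cv_eat = [1, 0, 0, 0, -1, 0, 0, 0, 0, 0]
cv_red = [0, 0, 0, 1, 0, 0, 0, -1, 0, 0]

# "red" is a dependent of "apple" (right shift), "eats" is its head
# (left shift); the unpermuted vectors of both are also added.
terms = [rotate(cv_red, +1), rotate(cv_eat, -1), cv_red, cv_eat]
tv_apple = [sum(col) for col in zip(*terms)]
print(tv_apple)  # [1, 0, 0, 0, 0, 0, 0, -1, -1, 1]
```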
Convolution (CONV)

Create and assign a context vector to each term

Create two context vectors for the head and dependent roles

Each term is represented by a vector which is
– the sum of the convolutions between its dependent terms and the dependent role vector
– the sum of the convolutions between its head terms and the head role vector
– the sum of the vectors of both dependent and head words
Circular convolution operator

Circular convolution p = u ⊛ v is defined as:

p_j = Σ_{k=1}^{n} u_k · v_{((j−k) mod n) + 1}

U = <1, 1, -1, -1, 1>
V = <1, -1, 1, -1, -1>
P = <-1, 3, -1, -1, -1>

[Table: the products u_k · v_j; each p_j is the sum of one wrapped diagonal]
Circular convolution by FFTs

Naively, circular convolution is computed in O(n^2)
– using FFTs it is computed in O(n log n)

Given f, the discrete FFT, and f^-1 its inverse:
– u ⊛ v = f^-1( f(u) ∘ f(v) )
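The definition can be checked on the U, V, P example above with a direct O(n^2) implementation (a minimal sketch; the FFT identity computes the same result in O(n log n)):

```python
def circular_convolution(u, v):
    """p_j = sum_k u_k * v_{(j-k) mod n}, the slide's formula in 0-based form."""
    n = len(u)
    return [sum(u[k] * v[(j - k) % n] for k in range(n)) for j in range(n)]

U = [1, 1, -1, -1, 1]
V = [1, -1, 1, -1, -1]
print(circular_convolution(U, V))  # [-1, 3, -1, -1, -1]
```

Note that circular convolution is commutative, which is exactly why CONV binds each filler to a distinct role vector rather than convolving the words with each other directly.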
CONV (example)

John eats a red apple

John -> (0, 0, 0, 0, 0, 0, 1, 0, -1, 0)
eat -> (1, 0, 0, 0, -1, 0, 0, 0, 0, 0)
red -> (0, 0, 0, 1, 0, 0, 0, -1, 0, 0)
apple -> (1, 0, 0, 0, 0, 0, 0, -1, 0, 0)
rd -> (0, 0, 1, 0, -1, 0, 0, 0, 0, 0)   [context vector for the dependent role]
rh -> (0, -1, 1, 0, 0, 0, 0, 0, 0, 0)   [context vector for the head role]

apple = eat + red + (rd ⊛ red) + (rh ⊛ eat)
Complex operators

Based on filler/role binding, taking the syntactic role into account: rd ⊚ u + rh ⊚ v
– u and v can themselves be recursive structures

Two vector operators to bind the role:
– convolution (⊛)
– tensor (⊗)
– convolution (⊛+): also exploits the sum of the term vectors

rd ⊛ u + rh ⊛ v + v + u
Complex operators (remarks)

Existing operators
– t1 ⊚ t2 ⊚ … ⊚ tn: does not take the syntactic role into account
– t1 ⊛ t2 is commutative
– t1 ⊗ t2 ⊗ … ⊗ tn: the tensor order depends on the phrase length
• two phrases of different length are not comparable
– t1 ⊗ r1 ⊗ t2 ⊗ r2 ⊗ … ⊗ tn ⊗ rn: also depends on the sentence length
System setup

• Corpus
– WaCkypedia EN, based on a 2009 dump of Wikipedia
– about 800 million tokens
– dependency-parsed with MaltParser
• DSMs
– 500 vector dimensions (LSA/RI/RILSA)
– 1,000 vector dimensions (PERM/CONV/PERMLSA/CONVLSA)
– 50,000 most frequent words
– co-occurrence distance: 4
Evaluation

• GEMS 2011 Shared Task for compositional semantics
– a list of pairs of word combinations, rated by humans:
(support offer) (help provide) 7
(old person) (right hand) 1
– 5,833 ratings
– 3 types involved: noun-noun (NN), adjective-noun (AN), verb-object (VO)
– GOAL: compare the system performance against the human scores
• Spearman correlation
Results (simple spaces)…

Simple Semantic Spaces (Spearman correlation)

| Op    | NN: TTM | LSA | RI  | RILSA | AN: TTM | LSA | RI  | RILSA | VO: TTM | LSA | RI  | RILSA |
|-------|---------|-----|-----|-------|---------|-----|-----|-------|---------|-----|-----|-------|
| +     | .21     | .36 | .25 | .42   | .22     | .35 | .33 | .41   | .23     | .31 | .28 | .31   |
| ∘     | .31     | .15 | .23 | .22   | .21     | .20 | .22 | .18   | .13     | .10 | .18 | .21   |
| ⊛     | .21     | .38 | .26 | .35   | .20     | .33 | .31 | .44   | .15     | .31 | .24 | .34   |
| ⊛+    | .21     | .34 | .28 | .43   | .23     | .32 | .31 | .37   | .20     | .31 | .25 | .29   |
| ⊗     | .21     | .38 | .25 | .39   | .22     | .38 | .33 | .43   | .15     | .34 | .26 | .32   |

human: NN .49, AN .52, VO .55
…Results (structured spaces)

Structured Semantic Spaces (Spearman correlation)

| Op    | NN: CONV | PERM | CONVLSA | PERMLSA | AN: CONV | PERM | CONVLSA | PERMLSA | VO: CONV | PERM | CONVLSA | PERMLSA |
|-------|----------|------|---------|---------|----------|------|---------|---------|----------|------|---------|---------|
| +     | .36      | .39  | .43     | .42     | .34      | .39  | .42     | .45     | .27      | .23  | .30     | .31     |
| ∘     | .22      | .17  | .10     | .13     | .23      | .27  | .13     | .15     | .20      | .15  | .06     | .14     |
| ⊛     | .31      | .36  | .37     | .35     | .39      | .39  | .45     | .44     | .28      | .23  | .27     | .28     |
| ⊛+    | .30      | .36  | .40     | .36     | .38      | .32  | .48     | .44     | .27      | .22  | .30     | .32     |
| ⊗     | .34      | .37  | .37     | .40     | .36      | .40  | .45     | .45     | .27      | .24  | .31     | .32     |

human: NN .49, AN .52, VO .55
Final remarks

• The best results are obtained when complex operators, structured spaces, or both are involved
• No single best combination of operator and space exists
– it depends on the type of relation (NN, AN, VO)
• Tensor product and convolution provide good results, in spite of previous results
– filler/role binding is effective
• Future work
– generate several rd and rh vectors, one for each kind of dependency
– apply this approach to other directed graph-based representations