Embedding-Based Techniques: Matrices, Tensors, and Neural Networks


Probabilistic Models: Downsides

Limitation to Logical Relations
• Representation restricted by manual design
• Clustering? Asymmetric implications?
• Information flows through these relations
• Difficult to generalize to unseen entities/relations

Computational Complexity of Algorithms
• Complexity depends on explicit dimensionality
• Often NP-Hard, in size of data
• More rules, more expensive inference
• Query-time inference is sometimes NP-Hard
• Not trivial to parallelize, or use GPUs

Embeddings
• Everything as dense vectors
• Can capture many relations
• Learned from data
• Complexity depends on latent dimensions
• Learning using stochastic gradient, back-propagation
• Querying is often cheap
• GPU-parallelism friendly

Two Related Tasks

[Figure: two panels. Graph Completion: entities connected by relation edges. Relation Extraction: entities connected by surface pattern edges and relation edges.]


Relation Extraction From Text

John was born in Liverpool, to Julia and Alfred Lennon.

[Figure: the entities John Lennon, Alfred Lennon, Julia Lennon, and Liverpool, connected by the surface patterns "was born in", "was born to", "and", and "born in __, to". A second version of the figure adds the relations birthplace, childOf, and livedIn.]

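"Distant" Supervision

[Figure: John Lennon and Liverpool are linked by the relation birthplace and the surface pattern "was born in". Barack Obama and Honolulu are linked by birthplace and by the surface patterns "was born in", "is native to", "visited", and "met the senator from".]

No direct supervision gives us this information.
Supervised: Too expensive to label sentences
Rule-based: Too much variety in language
Both only work for a small set of relations, i.e. 10s, not 100s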

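Relation Extraction as a Matrix

John was born in Liverpool, to Julia and Alfred Lennon.

[Figure: a matrix with one row per entity pair and one column per surface pattern or relation. Observed cells are marked 1; cells to be predicted are marked ?.]

Entity Pairs
• John Lennon, Liverpool
• John Lennon, Julia Lennon
• John Lennon, Alfred Lennon
• Julia Lennon, Alfred Lennon
• Barack Obama, Hawaii
• Barack Obama, Michelle Obama

Universal Schema, Riedel et al, NAACL (2013)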
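The matrix view above can be sketched in a few lines of Python. This is only an illustration, assuming NumPy: the fact list, the choice of columns, and the helper name `build_matrix` are hypothetical and merely mirror the slide's example.

```python
import numpy as np

# Hypothetical observed facts: (entity pair, surface pattern or KB relation).
facts = [
    (("John Lennon", "Liverpool"), "was born in"),
    (("John Lennon", "Liverpool"), "birthplace"),
    (("John Lennon", "Julia Lennon"), "was born to"),
    (("John Lennon", "Alfred Lennon"), "was born to"),
    (("Julia Lennon", "Alfred Lennon"), "and"),
    (("Barack Obama", "Hawaii"), "was born in"),
]

def build_matrix(facts):
    """Return (X, pairs, relations) with X[i, j] = 1 if pair i was seen with relation j."""
    pairs = sorted({p for p, _ in facts})
    relations = sorted({r for _, r in facts})
    pair_index = {p: i for i, p in enumerate(pairs)}
    rel_index = {r: j for j, r in enumerate(relations)}
    X = np.zeros((len(pairs), len(relations)), dtype=np.int8)
    for p, r in facts:
        X[pair_index[p], rel_index[r]] = 1
    return X, pairs, relations

X, pairs, relations = build_matrix(facts)

# The cell (Barack Obama, Hawaii) x birthplace is unobserved: it is one of the "?" cells.
query = X[pairs.index(("Barack Obama", "Hawaii")), relations.index("birthplace")]
```

Surface patterns and knowledge-base relations share the same column space here, which is what lets a pattern such as "was born in" inform a prediction for birthplace.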

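Matrix Factorization

[Figure: the n × m matrix X, with pairs on one axis and relations on the other, is approximated by the product of an n × k matrix and a k × m matrix. Example cell: bornIn(John, Liverpool).]

Universal Schema, Riedel et al, NAACL (2013)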
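As a rough sketch of the factorization, each entity pair and each relation gets a k-dimensional vector, and a cell is scored by their dot product. The sizes, the random initialization, the names `P`, `R`, and `score`, and the sigmoid squashing are assumptions made for illustration; the slides do not fix these details.

```python
import numpy as np

rng = np.random.default_rng(0)

n_pairs, n_relations, k = 6, 4, 3   # illustrative sizes; k is the latent dimension

# One k-dimensional embedding per entity pair and per relation (random here, learned in practice).
P = rng.normal(scale=0.1, size=(n_pairs, k))      # pair embeddings
R = rng.normal(scale=0.1, size=(n_relations, k))  # relation embeddings

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def score(i, j):
    """Score in (0, 1) for the cell (pair i, relation j)."""
    return sigmoid(P[i] @ R[j])

# The full reconstruction is the product of an (n x k) and a (k x m) matrix.
X_hat = sigmoid(P @ R.T)
```

Because the cost of scoring a cell depends only on k, querying stays cheap even when the matrix itself is very large.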

Training: Stochastic Updates

[Figure: two copies of the pairs × relations matrix, one highlighting an observed cell and one highlighting a random cell.]

Pick an observed cell, $R(i, j)$:
◦ Update $p_{ij}$ & $r_R$ such that $R(i, j)$ is higher

Pick any random cell, assume it is negative:
◦ Update $p_{xy}$ & $r_{R'}$ such that $R'(x, y)$ is lower
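The two update rules can be sketched as stochastic gradient steps on a logistic loss. The loss, learning rate, epoch count, the toy data, and the function name `train` are assumptions for illustration, not the tutorial's actual training setup.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(observed, n_pairs, n_relations, k=4, epochs=200, lr=0.1, seed=0):
    """SGD sketch: observed cells are pushed up, randomly sampled cells are pushed down."""
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(n_pairs, k))      # pair embeddings p
    R = rng.normal(scale=0.1, size=(n_relations, k))  # relation embeddings r
    for _ in range(epochs):
        for idx in rng.permutation(len(observed)):
            i, j = observed[idx]
            # Observed cell (i, j): gradient ascent on log sigmoid(p_ij . r_R).
            g = 1.0 - sigmoid(P[i] @ R[j])
            P[i], R[j] = P[i] + lr * g * R[j], R[j] + lr * g * P[i]
            # Random cell (x, y), assumed negative: ascent on log(1 - sigmoid(p_xy . r_R')).
            x, y = rng.integers(n_pairs), rng.integers(n_relations)
            g = -sigmoid(P[x] @ R[y])
            P[x], R[y] = P[x] + lr * g * R[y], R[y] + lr * g * P[x]
    return P, R

# Toy data: pairs 0 and 1 share relations 0 and 1; pairs 2 and 3 share relation 2.
observed = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 2), (3, 2)]
P, R = train(observed, n_pairs=4, n_relations=3)
scores = sigmoid(P @ R.T)
```

The random cell may occasionally be an observed one; the sketch follows the slide in simply assuming it is negative.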

Relation Embeddings

Embeddings ~ Logical Relations

Relation Embeddings, w
◦ Similar embeddings for 2 relations denote that they are paraphrases
  ◦ is married to, spouseOf(X,Y), /person/spouse
◦ One embedding can be contained by another
  ◦ w(topEmployeeOf) ⊂ w(employeeOf)
  ◦ topEmployeeOf(X,Y) → employeeOf(X,Y)
◦ Can capture logical patterns, without needing to specify them!

Entity Pair Embeddings, v
◦ Similar entity pairs denote similar relations between them
◦ Entity pairs may describe multiple "relations"
  ◦ independent foundedBy and employeeOf relations

From Sebastian Riedel

Similar Embeddings

Entity pair (X, Y)                     | X own percentage of Y | X buy stake in Y
Time, Inc / Amer. Tel. and Comm.       | 1                     | 1
Volvo / Scania A.B.                    |                       | 1
Campeau / Federated Dept Stores        |                       |
Apple / HP                             |                       |

Successfully predicts "Volvo owns percentage of Scania A.B." from "Volvo bought a stake in Scania A.B."

Figure annotations: "similar underlying embedding", "similar embedding"

From Sebastian Riedel

Implications

[Table: columns "X professor at Y" and "X historian at Y"; rows (Kevin Boyle, Ohio State) and (R. Freeman, Harvard), each with an observed cell marked 1.]

Learns asymmetric entailment:
PER historian at UNIV → PER professor at UNIV
But not the reverse: PER professor at UNIV ↛ PER historian at UNIV

X historian at Y → X professor at Y
(Freeman, Harvard) → (Boyle, Ohio State)

From Sebastian Riedel


Graph Completion

[Figure: the graph over John Lennon, Alfred Lennon, Julia Lennon, and Liverpool with the relations birthplace, childOf, and livedIn and the surface patterns "was born in", "was born to", "and", and "born in __, to". A second version drops the surface patterns and adds spouse edges.]

Tensor Formulation of KG

[Figure: a tensor of size |E| × |E| × |R|, indexed by e1, e2, and r.]

Does an unseen relation exist?

Factorize that Tensor

[Figure: the |E| × |E| × |R| tensor is factorized into three factors of sizes |E| × k, |E| × k, and |R| × k.]

$S(r(a, b)) = f(v_r, v_a, v_b)$

Many Different Factorizations

CANDECOMP/PARAFAC-Decomposition
$S(r(a, b)) = \sum_k R_{r,k} \cdot e_{a,k} \cdot e_{b,k}$

Tucker2 and RESCAL Decompositions
$S(r(a, b)) = (R_r \times e_a) \times e_b$

Model E
$S(r(a, b)) = R_{r,1} \cdot e_a + R_{r,2} \cdot e_b$

Not tensor factorization (per se)

Holographic Embeddings
$S(r(a, b)) = R_r \times (e_a \star e_b)$

HOLE: Nickel et al, AAAI (2016), Model E: Riedel et al, NAACL (2013), RESCAL: Nickel et al, WWW (2012), CP: Harshman (1970), Tucker2: Tucker (1966)
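The scoring functions above can be written compactly with NumPy. This is a sketch under stated assumptions: the sizes and random embeddings are illustrative, the RESCAL-style score is read as the bilinear form e_a^T W_r e_b with a k × k matrix per relation, and the ⋆ in the holographic score is taken to be circular correlation. All function and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_entities, n_relations, k = 5, 3, 4   # illustrative sizes

E = rng.normal(size=(n_entities, k))        # entity embeddings e_a
R = rng.normal(size=(n_relations, k))       # relation vectors R_r (CP and holographic scores)
W = rng.normal(size=(n_relations, k, k))    # relation matrices (RESCAL-style score)
R1 = rng.normal(size=(n_relations, k))      # Model E: R_{r,1}
R2 = rng.normal(size=(n_relations, k))      # Model E: R_{r,2}

def score_cp(r, a, b):
    # sum_k R_{r,k} * e_{a,k} * e_{b,k}
    return float(np.sum(R[r] * E[a] * E[b]))

def score_rescal(r, a, b):
    # bilinear form e_a^T W_r e_b
    return float(E[a] @ W[r] @ E[b])

def score_model_e(r, a, b):
    # R_{r,1} . e_a + R_{r,2} . e_b (no interaction between the two entities)
    return float(R1[r] @ E[a] + R2[r] @ E[b])

def circular_correlation(x, y):
    # [x * y]_j = sum_i x_i * y_{(i + j) mod k}, computed via the FFT
    return np.fft.ifft(np.conj(np.fft.fft(x)) * np.fft.fft(y)).real

def score_hole(r, a, b):
    # R_r . (e_a * e_b) with * as circular correlation
    return float(R[r] @ circular_correlation(E[a], E[b]))
```

Note that the CP score is symmetric in a and b, while the bilinear and holographic scores are not, which matters for directed relations such as childOf.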

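Translation Embeddings

[Figure: entities e1 and e2 as points, with the relation r as a translation between them. Example: birthplace links John Lennon to Liverpool and Barack Obama to Honolulu.]

TransE
$S(r(a, b)) = -\lVert e_a + R_r - e_b \rVert_2^2$

TransH
$S(r(a, b)) = -\lVert e_a^{\perp} + R_r - e_b^{\perp} \rVert_2^2$, where $e_a^{\perp} = e_a - w_r^T e_a w_r$

TransR
$S(r(a, b)) = -\lVert e_a M_r + R_r - e_b M_r \rVert_2^2$

TransE: Bordes et al. XXX (2011), TransH: Bordes et al. XXX (2011), TransR: Bordes et al. XXX (2011)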
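A minimal sketch of the three translation scores, assuming NumPy. The dimensions, the random vectors, and the function names are illustrative assumptions; the hyperplane normal is normalized inside the projection because the TransH formula presupposes a unit-length w_r.

```python
import numpy as np

rng = np.random.default_rng(0)
k, d = 4, 3   # entity dimension k; relation-space dimension d for TransR (illustrative)

def score_transe(e_a, r, e_b):
    # -|| e_a + R_r - e_b ||^2
    return -float(np.sum((e_a + r - e_b) ** 2))

def project_hyperplane(e, w):
    # e_perp = e - (w^T e) w, with w scaled to unit length first
    w = w / np.linalg.norm(w)
    return e - (w @ e) * w

def score_transh(e_a, r, w, e_b):
    # translate between the projections onto the relation's hyperplane
    diff = project_hyperplane(e_a, w) + r - project_hyperplane(e_b, w)
    return -float(np.sum(diff ** 2))

def score_transr(e_a, r, M, e_b):
    # map both entities into the relation space with M_r (k x d), then translate
    return -float(np.sum((e_a @ M + r - e_b @ M) ** 2))

# A perfectly translated pair gets the maximum TransE score of 0.
john = rng.normal(size=k)
birthplace = rng.normal(size=k)
liverpool = john + birthplace
best = score_transe(john, birthplace, liverpool)
```

All three scores are non-positive, and a score closer to zero means the triple is considered more plausible.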

Parameter Estimation

[Figure: the |E| × |E| × |R| tensor, indexed by e1, e2, and r.]

Observed cell: increase score $S(r(a, b))$

Unobserved cell: decrease score $S(r'(x, y))$

Matrix vs Tensor Factorization

Matrix Factorization
• Vectors for each entity pair
• Can only predict for entity pairs that appear in text together
• No sharing for same entity in different entity pairs

Tensor Factorization
• Vectors for each entity
• Assume entity pairs are "low-rank"
  • But many relations are not!
  • Spouse: you can have only ~1
• Cannot learn pair-specific information

What they can, and can't, do..

• Red: deterministically implied by Black
  - needs pair-specific embedding
  - Only F is able to generalize
• Green: needs to estimate entity types
  - needs entity-specific embedding
  - Tensor factorization generalizes, F doesn't
• Blue: implied by Red and Green
  - Nothing works much better than random

From Singh et al. VSM (2015), http://sameersingh.org/files/papers/mftf-vsm15.pdf

Joint Extraction + Completion

[Figure: the Relation Extraction graph (surface patterns and relations) and the Graph Completion graph (relations) are combined into a Joint Model.]

Compositional Neural Models

So far, we're learning vectors for each entity/surface pattern/relation..

But learning vectors independently ignores "composition"

Composition in Surface Patterns
• Not every surface pattern is unique
• Synonymy:
  ◦ A is B's spouse.
  ◦ A is married to B.
• Inverse:
  ◦ X is Y's parent.
  ◦ Y is one of X's children.
• Can the representation learn this?

Composition in Relation Paths
• Not every relation path is unique
• Explicit or implicit compositions:
  ◦ X "bornInState" Z: X bornInCity Y, Y cityInState Z
  ◦ A grandparent C: A parent B, B parent C
• Can the representation capture this?

Composing Dependency Paths

… was born to …    … 's parents are …    \parentsOf

(never appears in training data)

But we don't need linked data to know they mean similar things…

Use neural networks to produce the embeddings from text!

[Figure: a neural network (NN) encodes each of the text patterns "… was born to …" and "… 's parents are …" into an embedding, alongside \parentsOf.]

Verga et al (2016), https://arxiv.org/pdf/1511.06396v2.pdf

Composing Relational Paths

[Figure: the path Microsoft, isBasedIn, Seattle, stateLocatedIn, Washington, countryLocatedIn, USA. Neural networks (NN) compose the relations along the path into stateBasedIn and countryBasedIn.]

Neelakantan et al (2015), http://www.aaai.org/ocs/index.php/SSS/SSS15/paper/viewFile/10254/10032
Lin et al, EMNLP (2015), https://arxiv.org/pdf/1506.00379.pdf
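One way to picture path composition is a small recurrent network that folds the relation embeddings along a path into a single vector, which is then compared with the embedding of a candidate relation. The sketch below is only in the spirit of the recurrent models cited above: the update rule, the dimension, the weight names `W_h` and `W_r`, and the untrained random parameters are all assumptions, so the resulting scores carry no meaning until the parameters are learned.

```python
import numpy as np

rng = np.random.default_rng(0)
k = 8   # embedding dimension (illustrative)

relations = ["isBasedIn", "stateLocatedIn", "countryLocatedIn",
             "stateBasedIn", "countryBasedIn"]
emb = {name: rng.normal(scale=0.5, size=k) for name in relations}

# Hypothetical recurrent weights; these would be trained jointly with the embeddings.
W_h = rng.normal(scale=0.5, size=(k, k))
W_r = rng.normal(scale=0.5, size=(k, k))

def compose_path(path):
    """Fold a list of relation names into one vector: h_t = tanh(W_h h_{t-1} + W_r r_t)."""
    h = emb[path[0]]
    for name in path[1:]:
        h = np.tanh(W_h @ h + W_r @ emb[name])
    return h

def path_score(path, target):
    """Dot-product similarity between the composed path and a target relation."""
    return float(compose_path(path) @ emb[target])

# Microsoft -isBasedIn-> Seattle -stateLocatedIn-> Washington -countryLocatedIn-> USA
state_path = ["isBasedIn", "stateLocatedIn"]
country_path = ["isBasedIn", "stateLocatedIn", "countryLocatedIn"]
s_state = path_score(state_path, "stateBasedIn")
s_country = path_score(country_path, "countryBasedIn")
```

Because the path vector is built from its parts, a path never seen during training can still be scored as long as its individual relations have embeddings.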

Review: Embedding Techniques

Two Related Tasks:
• Relation Extraction from Text
• Graph (or Link) Completion

Relation Extraction:
• Matrix Factorization Approaches

Graph Completion:
• Tensor Factorization Approaches

Compositional Neural Models
• Compose over dependency paths
• Compose over relation paths

Using Embeddings in MLNs

Tutorial Overview
Part 1: Knowledge Graphs
Part 2: Knowledge Extraction
Part 3: Graph Construction
Part 4: Critical Analysis