introduction to categorical compositional distributional semantics lecture 1: introduction ·...

49
Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction Dimitri Kartsaklis 1 Martha Lewis 2 1 Department of Theoretical and Applied Linguistics University of Cambridge 2 Department of Computer Science University of Oxford ESSLLI 2017 Toulouse, France D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 1/48

Upload: buidang

Post on 06-Mar-2018

241 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Introduction to Categorical Compositional Distributional Semantics

Lecture 1: Introduction

Dimitri Kartsaklis1 Martha Lewis2

1 Department of Theoretical and Applied LinguisticsUniversity of Cambridge

2Department of Computer ScienceUniversity of Oxford

ESSLLI 2017

Toulouse, France

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 1/48

Page 2: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Outline

1 Introduction

2 Computers and meaning

3 Compositional Distributional Models: An Overview

4 Introduction to Category Theory

5 Diagrammatic calculus for monoidal categories

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 2/48

Page 3: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Outline

1 Introduction

2 Computers and meaning

3 Compositional Distributional Models: An Overview

4 Introduction to Category Theory

5 Diagrammatic calculus for monoidal categories

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 3/48

Page 4: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

This lecture in a nutshell

Categorical Compositional Distributional Semantics (CCDS)formalises the notion of composition over distributional modelsof meaning, unifying two orthogonal semantic paradigms:

The type-logical compositional approach of formal semanticsThe quantitative perspective of statistical vector space modelsof meaning

We will give an outline of the course and explain thefundamental ideas behind the work.

We give some of the basic category theory needed.

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 4/48

Page 5: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Outline of the course

Lecture 1: Introduction. Linguistic intuitions; compositionality;distributional hypothesis; category-theory basics; diagrammaticcalculus

Lecture 2: CCDS. Pregroup grammars; compact closed categories; amap from syntax to semantics; creating relational tensors

Lecture 3: Frobenius algebras. Role of Frobenius algebras in CCDS;relative pronouns; coordination; intonation; non-compositionalcompounds

Lecture 4: From vectors to density operators. Word vectors asstates; the CPM construction; density matrices for modellingambiguity and textual entailment.

Lecture 5: Advanced topics. From distributional spaces toconceptual spaces; work in progress and future directions; conclusion

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 5/48

Page 6: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Outline

1 Introduction

2 Computers and meaning

3 Compositional Distributional Models: An Overview

4 Introduction to Category Theory

5 Diagrammatic calculus for monoidal categories

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 6/48

Page 7: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Computers and meaning

Computational linguistics is the scientific and engineering

discipline concerned with understanding written and

spoken language from a computational perspective.

—Stanford Encyclopedia of Philosophy1

1http://plato.stanford.eduD. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 7/48

Page 8: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Compositional semantics

The principle of compositionality

The meaning of a complex expression is determined by themeanings of its parts and the rules used for combining them.

Montague Grammar: A systematic way of processingfragments of the English language in order to get semanticrepresentations capturing their meaning.

There is in my opinion no important theoretical

di↵erence between natural languages and the

artificial languages of logicians.

—Richard Montague, Universal Grammar (1970)

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 8/48

Page 9: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Syntax-to-semantics correspondence (1/2)

A lexicon:

(1) a. every ` Dt : �P .�Q.8x [P(x) ! Q(x)]

b. man ` N : �y .man(y)

c. walks ` V

IN

: �z .walk(z)

A parse tree, so syntax guides the semantic composition:

S

NP

Dt

Every

N

man

V

IN

walks

NP ! Dt N : [[N]]([[Dt]])S ! NP V

IN

: [[VIN

]]([[NP]])

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 9/48

Page 10: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Syntax-to-semantics correspondence (2/2)

Logical forms of compounds are computed via �-reduction:

S

8x [man(x) ! walk(x)]

NP

�Q.8x [man(x) ! Q(x)]

Dt

�P .�Q.8x [P(x) ! Q(x)]

Every

N

�y .man(y)

man

V

IN

�z .walk(z)

walks

The semantic value of a sentence can be true or false.

Can we do better than that?

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 10/48

Page 11: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

The meaning of words

Distributional hypothesis

Words that occur in similar contexts have similar meanings[Harris, 1958].

The functional interplay of philosophy and ? should, as a minimum, guarantee......and among works of dystopian ? fiction...

The rapid advance in ? today suggests......calculus, which are more popular in ? -oriented schools.

But because ? is based on mathematics......the value of opinions formed in ? as well as in the religions...

...if ? can discover the laws of human nature.......is an art, not an exact ? .

...factors shaping the future of our civilization: ? and religion....certainty which every new discovery in ? either replaces or reshapes.

...if the new technology of computer ? is to grow significantlyHe got a ? scholarship to Yale.

...frightened by the powers of destruction ? has given......but there is also specialization in ? and technology...

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 11/48

Page 12: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

The meaning of words

Distributional hypothesis

Words that occur in similar contexts have similar meanings[Harris, 1958].

The functional interplay of philosophy and science should, as a minimum, guarantee......and among works of dystopian science fiction...

The rapid advance in science today suggests......calculus, which are more popular in science -oriented schools.

But because science is based on mathematics......the value of opinions formed in science as well as in the religions...

...if science can discover the laws of human nature.......is an art, not an exact science .

...factors shaping the future of our civilization: science and religion....certainty which every new discovery in science either replaces or reshapes.

...if the new technology of computer science is to grow significantlyHe got a science scholarship to Yale.

...frightened by the powers of destruction science has given......but there is also specialization in science and technology...

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 11/48

Page 13: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Distributional models of meaning

A word is a vector of co-occurrence statistics with every otherword in a selected subset of the vocabulary:

milk

cute

dog

bank

money

12

8

5

0

1

cat

catdog

account

money

pet

Semantic relatedness is usually based on cosine similarity:

sim(�!v ,�!u ) = cos ✓�!v ,�!u =

h�!v ·�!u ik�!v kk�!u k

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 12/48

Page 14: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

A real vector space

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 13/48

Page 15: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

The necessity for a unified model

Distributional models of meaning are quantitative, but theydo not scale up to phrases and sentences; there is not enoughdata:

Even if we had an infinitely large corpus, what the con-text of a sentence would be?

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 14/48

Page 16: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

The role of compositionality

Compositional distributional models

We can produce a sentence vector by composing the vectorsof the words in that sentence.

�!s = f (�!w1,

�!w2, . . . ,

�!w

n

)

Three generic classes of CDMs:

Vector mixture models [Mitchell and Lapata (2010)]

Tensor-based models [Coecke, Sadrzadeh, Clark (2010); Baroni and

Zamparelli (2010)]

Neural models [Socher et al. (2012); Kalchbrenner et al. (2014)]

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 15/48

Page 17: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Applications (1/2)

Why are CDMs important?

The problem of producing robust representations for themeaning of phrases and sentences is at the heart of everytask related to natural language.

Paraphrase detection

Problem: Given two sentences, decide if they say the samething in di↵erent words

Solution: Measure the cosine similarity between the sentencevectors

Sentiment analysis

Problem: Extract the general sentiment from a sentence or adocument

Solution: Train a classifier using sentence vectors as input

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 16/48

Page 18: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Applications (2/2)

Textual entailment

Problem: Decide if one sentence logically infers a di↵erent one

Solution: Examine the feature inclusion properties of thesentence vectors

Machine translation

Problem: Automatically translate one sentence into a di↵erentlanguage

Solution: Encode the source sentence into a vector, then usethis vector to decode a surface form into the target language

And so on. Many other potential applications exist...

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 17/48

Page 19: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Outline

1 Introduction

2 Computers and meaning

3 Compositional Distributional Models: An Overview

4 Introduction to Category Theory

5 Diagrammatic calculus for monoidal categories

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 18/48

Page 20: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Element-wise vector composition

The easiest way to compose two vectors is by workingelement-wise [Mitchell and Lapata (2010)]:

���!w1w2 = ↵�!w1 + ��!w2 =

X

i

(↵cw1i

+ �cw2i

)�!ni

���!w1w2 =

�!w1 ��!

w2 =X

i

c

w1i

c

w2i

�!n

i

An element-wise “mixture” of the input elements:

=

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 19/48

Page 21: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Properties of vector mixture models

Words, phrases and sentences share the same vector space

A bag-of-word approach. Word order does not play a role:

��!dog +

��!bites +��!

man = ��!man +

��!bites +

��!dog

Feature-wise, vector addition can be seen as feature union,and vector multiplication as feature intersection

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 20/48

Page 22: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Vector mixtures: Pros and Cons

Distinguishing feature:

All words contribute equally to the final result.

PROS:

Trivial to implement

Surprisingly e↵ective in practice

CONS:

A bag-of-word approach

Does not distinguish between the type-logical identities of thewords

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 21/48

Page 23: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Tensor-based models: Relational words as functions

In a vector mixture model, an adjective is of the same order asthe noun it modifies, and both contribute equally to the result.

One step further: Relational words are multi-linear maps(tensors of various orders) that can be applied to one or morearguments (vectors).

= =

Vector mixture Tensor-based

Formalized in the context of compact closed categories byCoecke, Sadrzadeh and Clark (2010).

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 22/48

Page 24: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Tensor-based models: Pros and Cons

Distinguishing feature:

Relational words are multi-linear maps acting on arguments

PROS:

Aligned with the formal semantics perspective

More powerful than vector mixtures

Flexible regarding the representation of functional words, suchas relative pronouns and prepositions.

CONS:

Every logical and functional word must be assigned to anappropriate tensor representation–it’s not always clear how

Space complexity problems for functions of higher arity (e.g. aditransitive verb is a tensor of order 4)

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 23/48

Page 25: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Neural networks for composition

Vector composition does not have to be linear. Below, f is anon-linear function such as tanh or sigmoid.

Pollack (1990); Socher et al. (2011;2012):

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 24/48

Page 26: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Neural models: Pros and Cons

Distinguishing feature:

Drastic transformation of the sentence space.

PROS:

Non-linearity and layered approach allow the simulation of avery wide range of functions

Word vectors are parameters of the model, optimized duringtraining

State-of-the-art results in a number of NLP tasks

CONS:

Requires expensive training based on backpropagation

Di�cult to discover the right configuration

A “black-box” approach: not easy to correlate inner workingswith output

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 25/48

Page 27: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Outline

1 Introduction

2 Computers and meaning

3 Compositional Distributional Models: An Overview

4 Introduction to Category Theory

5 Diagrammatic calculus for monoidal categories

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 26/48

Page 28: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

What is category theory?

Category theory aims at identifying and studying connections be-tween seemingly di↵erent forms of mathematical structures.

Introduced by Saunders Mac Lane and Samuel Eilenberg in 1940s inthe context of algebraic topology

In contrast to set theory, category theory focuses not on elementsx , y ... (the objects), but on the relations between these objects (themorphisms)

In this course we use category theory to show that:

1 Syntax and vector space semantics have the same structure,

2 thus there is a canonical translation from a syntactic derivation to alinear-algebraic formula computing the meaning of a sentence vector

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 27/48

Page 29: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Categories, objects and morphisms

A category C consists of:A collection of objectsA collection of morphisms (or arrows) that define interactionsbetween the objects

Morphisms can be composed:

A

f�! B

g�! C

Composition is associative:

(h � g) � f = h � (g � f )

Every object A has an identity morphism 1A

: A ! A suchthat:

8f : A ! B , 1B

� f = f = f � 1A

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 28/48

Page 30: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Functors

A functor is a structure-preserving morphism between cate-gories

Let C, D be categories, then

F : C ! D

is a functor if:

for every object X 2 C there is a corresponding object F(X ) 2 D

for every morphism f : X ! Y 2 C there is a morphismF(f ) : F(X ) ! F(Y ) such that:

1 F(1X

) = 1F(X ), for all X 2 C2 F(g � f ) = F(g) � F(f ) for all morphisms f : X ! Y and

g : Y ! Z in C.

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 29/48

Page 31: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Natural transformations

Recall that functors are tools to transform one specificcategory into another

Natural transformations operate one level higher, and providea way of transforming one functor into another (i.e. they aremorphisms between functors)

Together with categories and functors, natural transformationsare one of the fundamental notions of category theory

We don’t deal with them in the course

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 30/48

Page 32: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Commutative diagrams

In category theory, the relationships between morphisms areoften depicted with commutative diagrams.

A diagram commutes if all paths starting and ending at thesame points lead to the same result by the application ofcomposition.

The diagram below shows the associativity of composition:

A B

C D

f

g � f g h � g

h

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 31/48

Page 33: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Example categories

The category Set has sets as objects and functions asmorphisms

The category Poset has posets as objects and monotonefunctions as morphisms

The category Grp, with groups as objects and grouphomomorphisms as morphisms

The category Vect, where the objects are vector spaces over afield and the morphisms are linear maps

The category FdVect is restricted to finite-dimensional vectorspaces (the kind that occurs in vector space models ofmeaning)

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 32/48

Page 34: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Monoidal categories

A monoidal category is a category equipped with a secondoperation, the monoidal tensor:

⌦ : C ⇥ C ! C

For each pair of objects (A,B), there is a composite objectA⌦ B

For each pair of morphisms (Af�! C ,B

g�! D), there is a

composite morphism A⌦ B

f⌦g��! C ⌦ D

Further, the monoidal tensor is associative, and there is a unitobject I such that:

A⌦ I

⇠= A

⇠= I ⌦ A

for all A 2 C

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 33/48

Page 35: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Examples of monoidal categories

The category Set, with the cartesian product as monoidaltensor, and the one-element set as unit

The category Rel of sets and relations also has the cartesianproduct as a monoidal tensor and the one-element set as unit.

The categories Vect and FdVect, with the tensor product ofvector spaces as monoidal tensor, and the one-dimensionalvector space as unit

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 34/48

Page 36: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Compact closed categories

Compact closed categories are monoidal categories withadditional structure.

Every object A of a CCC has a left and a right adjoint, Al ,Ar ,for which the following special morphisms exist:

⌘l : I ! A⌦ A

l ⌘r : I ! A

r ⌦ A

✏l : Al ⌦ A ! I ✏r : A⌦ A

r ! I

These satisfy the so-called yanking equations:

(1A

⌦ ✏lA

) � (⌘lA

⌦ 1A

) = 1A

(✏rA

⌦ 1A

) � (1A

⌦ ⌘rA

) = 1A

(✏lA

⌦ 1A

l ) � (1A

l ⌦ ⌘lA

) = 1A

l (1A

r ⌦ ✏rA

) � (⌘rA

⌦ 1A

r ) = 1A

r

In a symmetric CCC, the left and right adjoints collapse intoone: A⇤ := A

l = A

r

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 35/48

Page 37: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Examples of CCCs

The category FdVect is symmetric compact closed.

The adjoint of each vector space V is its dual space V

The ✏-map is the application of inner productThe ⌘-maps are identity maps and multiples of them

The category Rel is also symmetric compact closed.

Each set X is self-dualThe ✏-map is given by

✏X

: X ⇥ X ! {?} :: {((x , x), ?) | x 2 X}

The ⌘-map is given by:

⌘X

: {?} ! X ⇥ X :: {(?, (x , x)) | x 2 X}

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 36/48

Page 38: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Preamble to Lecture 2: CCCs in linguistics

A tensor-based model of meaning, as defined previously, livesin FdVect, i.e. has a compact closed structure

As we will see in Lecture 2, a grammar (in a form of aLambek pregroup) has also compact closure.

Recall that category theory is focused on processes, not onobjects

So, if grammar and semantics have the same categoricalstructure, this means that the principal processes are the samein both sides. In other words...

I.e., there exists a canonical translation between them

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 37/48

Page 39: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Outline

1 Introduction

2 Computers and meaning

3 Compositional Distributional Models: An Overview

4 Introduction to Category Theory

5 Diagrammatic calculus for monoidal categories

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 38/48

Page 40: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

From symbols to diagrams

Monoidal categories are complete with regard to a convenientpictorial calculus

This means that an equation between morphisms follows fromthe axioms of monoidal categories i↵ it holds, up to planarisotopy, in the pictorial calculus

The diagrammatic language visualises the derivations in anintuitive way, focusing on the interactions between the variousobjects

It will be the main tool for demonstrating the various conceptsin this course

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 39/48

Page 41: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Objects, morphisms, vectors and tensors

Objects are denoted by upper-case letters, A, B and so on.

Morphisms are lines connecting objects, with boxes containingtheir names. The identity morphism is just a straight line.

A

f A

B

Vectors and tensors are represented as morphisms from themonoidal unit, e.g. �!v : I ! V , m : I ! V ⌦ V etc.

V V W V W Z

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 40/48

Page 42: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Composition and tensor product

Tensor product is juxtaposition, while composing morphismsamounts to connecting outputs to inputs:

A

A C f

Bf g

hA B B D

C

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 41/48

Page 43: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Structural morphisms

The ✏-maps are represented as cups ([) and the ⌘-maps ascaps (\):

A Ar

A Ar A = AAr A

Equations such as (✏rA

⌦ 1A

) � (1A

⌦ ⌘rA

) = 1A

now get anintuitive visual justification.

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 42/48

Page 44: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Comparison with the symbolic approach

Tensor-based composition of a transitive sentence��!subj ⇥ verb ⇥

�!obj in symbols:

I ⌦ I ⌦ I

I

N ⌦ N

r ⌦ S ⌦ N

l ⌦ N

S

��!subj ⌦ verb ⌦

�!obj

✏rN

⌦ 1S

⌦ ✏lN

(✏rN

⌦ 1S

⌦ ✏lN

) � (��!subj ⌦ verb ⌦

�!obj)

⇠ =

Same derivation in the pictorial calculus:

subject verb object

N NS

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 43/48

Page 45: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Diagrammatic calculus: Summary

A

f A

V V W V W Z

B

morphisms tensors

A A

r

A A

r

A

=A

r

A

r

A

✏-map ⌘-map (✏rA

⌦ 1A

) � (1A

⌦ ⌘rA

) = 1A

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 44/48

Page 46: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

Summary

We’ve described two contrasting models of meaning -compositional models and distributional models.

We’ve given examples of tasks these can be used for.

We’ve provided an overview of approaches to combiningcompositional and distributional models...

...and we’ve given an introduction to some basic categorytheory and the types of category that we will use.

Tomorrow: Categorical Compositional DistributionalSemantics

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 45/48

Page 47: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

References I

Baroni, M. and Zamparelli, R. (2010).

Nouns are Vectors, Adjectives are Matrices.In Proceedings of Conference on Empirical Methods in Natural Language Processing (EMNLP).

Cheng, J. and Kartsaklis, D. (2015).

Syntax-aware multi-sense word embeddings for deep compositional models of meaning.In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages1531–1542, Lisbon, Portugal. Association for Computational Linguistics.

Coecke, B., Sadrzadeh, M., and Clark, S. (2010).

Mathematical Foundations for a Compositional Distributional Model of Meaning. Lambek Festschrift.Linguistic Analysis, 36:345–384.

Denil, M., Demiraj, A., Kalchbrenner, N., Blunsom, P., and de Freitas, N. (2014).

Modelling, visualising and summarising documents with a single convolutional neural network.Technical Report arXiv:1406.3830, University of Oxford.

Grefenstette, E. and Sadrzadeh, M. (2011).

Experimental support for a categorical compositional distributional model of meaning.In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pages1394–1404, Edinburgh, Scotland, UK. Association for Computational Linguistics.

Harris, Z. (1968).

Mathematical Structures of Language.Wiley.

Kartsaklis, D. (2015).

Compositional Distributional Semantics with Compact Closed Categories and Frobenius Algebras.PhD thesis, University of Oxford.

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 46/48

Page 48: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

References II

Kartsaklis, D. and Sadrzadeh, M. (2016).

A compositional distributional inclusion hypothesis.In Proceedings of the 2017 Conference on Logical Aspects of Computational Linguistics, Nancy, France.Springer.

Mitchell, J. and Lapata, M. (2010).

Composition in distributional models of semantics.Cognitive Science, 34(8):1388–1439.

Montague, R. (1970a).

English as a formal language.In Linguaggi nella Societa e nella Tecnica, pages 189–224. Edizioni di Comunita, Milan.

Montague, R. (1970b).

Universal grammar.Theoria, 36:373–398.

Sadrzadeh, M., Clark, S., and Coecke, B. (2013).

The Frobenius anatomy of word meanings I: subject and object relative pronouns.Journal of Logic and Computation, Advance Access.

Sadrzadeh, M., Clark, S., and Coecke, B. (2014).

The Frobenius anatomy of word meanings II: Possessive relative pronouns.Journal of Logic and Computation.

Socher, R., Huang, E., Pennington, J., Ng, A., and Manning, C. (2011).

Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection.Advances in Neural Information Processing Systems, 24.

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 47/48

Page 49: Introduction to Categorical Compositional Distributional Semantics Lecture 1: Introduction · PDF fileOutline 1 Introduction 2 Computers and meaning 3 Compositional Distributional

References III

Socher, R., Huval, B., Manning, C., and A., N. (2012).

Semantic compositionality through recursive matrix-vector spaces.In Conference on Empirical Methods in Natural Language Processing 2012.

Socher, R., Manning, C., and Ng, A. (2010).

Learning Continuous Phrase Representations and Syntactic Parsing with recursive neural networks.In Proceedings of the NIPS-2010 Deep Learning and Unsupervised Feature Learning Workshop.

D. Kartsaklis, M. Lewis Introduction to CCDS - Lecture 1: Introduction 48/48