Drawings as Models of Syntactic Structure: Theory and Algorithms by Mathias Möhl supervised by Marco Kuhlmann final talk of diploma thesis at Programming Systems Lab. Saarland University, Prof. Smolka

Drawings as Models of Syntactic Structure: Theory and Algorithms

by Mathias Möhl
supervised by Marco Kuhlmann

final talk of diploma thesis
at Programming Systems Lab, Saarland University, Prof. Smolka


Dependency analysis

This is a sentence

The dependency analysis of a sentence consists of two relations:

• a tree (dependencies among words)
• a total order (word order)

formal model: drawings

Definition: A drawing is a relational structure (V; S, ≺), where (V; S) forms a tree and (V; ≺) is a total order.


The task

structural properties of drawings
• two relaxations of projectivity: gap degree and well-nestedness

description language
• constraint language for well-nested drawings
• saturation algorithm for enumeration

Tree Adjoining Grammar (TAG)
• definition of TAG drawing
• TAG'ness = well-nestedness + gap degree ≤ 1


Part I

Structural properties of drawings


Some terminology

for any node v of a drawing (V; S, ≺) we define:

yield(v) := S*v

cover(v) := Convex-Hull(S*v)

Example: [Figure: drawing on the nodes 1 to 5]

yield(5) = {2,4,5}

cover(5) = {2,3,4,5}

Definition: A drawing is projective iff the yield of each node equals its cover.
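A minimal Python sketch of these definitions, assuming a drawing is stored as a child-to-parent map plus a list giving the word order. The tree below is hypothetical: the slide's figure is not recoverable, so it was chosen only so that node 5 has the yield and cover stated in the example. The names `yield_of`, `cover_of` and `is_projective` are illustrative and not taken from the thesis.

```python
from collections import defaultdict


def children_of(parent):
    """Invert a child -> parent map (the root maps to None)."""
    kids = defaultdict(list)
    for node, par in parent.items():
        if par is not None:
            kids[par].append(node)
    return kids


def yield_of(parent, v):
    """yield(v) = S*v: v together with all of its descendants."""
    kids = children_of(parent)
    result, stack = set(), [v]
    while stack:
        node = stack.pop()
        result.add(node)
        stack.extend(kids[node])
    return result


def cover_of(parent, order, v):
    """cover(v): convex hull of yield(v) with respect to the word order."""
    pos = {node: i for i, node in enumerate(order)}
    positions = [pos[u] for u in yield_of(parent, v)]
    return set(order[min(positions):max(positions) + 1])


def is_projective(parent, order):
    """A drawing is projective iff yield and cover agree for every node."""
    return all(yield_of(parent, v) == cover_of(parent, order, v) for v in order)


# Hypothetical tree (assumption): root 3 with children 1 and 5; 5 has children 2 and 4.
parent = {3: None, 1: 3, 5: 3, 2: 5, 4: 5}
order = [1, 2, 3, 4, 5]

print(yield_of(parent, 5))         # the nodes 2, 4, 5
print(cover_of(parent, order, 5))  # the nodes 2, 3, 4, 5
print(is_projective(parent, order))
```

Node 3 lies inside the cover of node 5 without belonging to its yield, so this drawing is not projective.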


Gaps in drawings

A gap of a node v is a maximal convex set in cover(v) − yield(v).

The number of gaps of a node is called its gap degree.

Example: [Figure: drawing on the nodes 1 to 7]

node 1 has two gaps: {3} and {5,6}

its gap degree is two

The gap degree of a drawing is the maximum of the gap degrees of its nodes.

gap degree 0 ⇔ projective

The gap degree is a measure of the non-projectivity of a drawing.
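The gap definitions can be sketched directly in Python. This is a naive illustration that scans the cover of every node; it is not the O(n*g) algorithm from the thesis. The tree is a hypothetical one (the figure is lost), chosen so that node 1 has exactly the gaps {3} and {5,6} named in the example, and `gaps_of` and `gap_degree` are illustrative names.

```python
def yield_of(parent, v):
    """v together with all its descendants (child -> parent map, root -> None)."""
    result = {v}
    changed = True
    while changed:
        changed = False
        for node, par in parent.items():
            if par in result and node not in result:
                result.add(node)
                changed = True
    return result


def gaps_of(parent, order, v):
    """Maximal convex runs of nodes inside cover(v) that are not in yield(v)."""
    pos = {node: i for i, node in enumerate(order)}
    own = yield_of(parent, v)
    lo = min(pos[u] for u in own)
    hi = max(pos[u] for u in own)
    gaps, current = [], []
    for node in order[lo:hi + 1]:
        if node in own:
            if current:          # a yield node closes the running gap
                gaps.append(current)
                current = []
        else:
            current.append(node)
    return gaps


def gap_degree(parent, order):
    """Gap degree of a drawing: maximum number of gaps over all nodes."""
    return max(len(gaps_of(parent, order, v)) for v in order)


# Hypothetical tree (assumption): root 3; 1 and 5 are children of 3;
# 2, 4, 7 are children of 1; 6 is a child of 5.
parent = {3: None, 1: 3, 5: 3, 2: 1, 4: 1, 7: 1, 6: 5}
order = [1, 2, 3, 4, 5, 6, 7]

print(gaps_of(parent, order, 1))   # [[3], [5, 6]]
print(gap_degree(parent, order))   # 2
```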


Contributions related to gaps

Algorithm to compute the gap degree in O(n*g)

(n = number of nodes; g = gap degree; g < n)

A drawing has at most O(n²) gaps and at most O(n*log(n)) different gaps

[Figures: worst-case example drawings on the nodes 1 to 7 for the O(n*log(n)) and O(n²) bounds]


Well-nestedness

Definition: A drawing is well-nested if it satisfies the constraint: all disjoint subtrees are non-interleaving.

Two subtrees T1, T2 interleave if there are nodes l1, r1 ∈ T1 and l2, r2 ∈ T2 such that l1 ≺ l2 ≺ r1 ≺ r2.

[Figures: examples of interleaving and non-interleaving subtrees over the nodes l1, l2, r1, r2]

Well-nestedness is a second kind of relaxation of projectivity

Orthogonal to the gap degree

[Figure: projective and well-nested drawings]
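The definition of well-nestedness translates almost literally into code. The sketch below checks every pair of disjoint subtrees, which is simple but slower than the algorithms presented in the talk. It assumes a child-to-parent map and an order list as the representation; the example tree and the names `subtree`, `interleave` and `is_well_nested` are illustrative assumptions.

```python
from itertools import combinations


def subtree(parent, v):
    """Nodes of the subtree rooted at v (child -> parent map, root -> None)."""
    nodes = set()
    for u in parent:
        w = u
        while w is not None and w != v:
            w = parent[w]
        if w == v:
            nodes.add(u)
    return nodes


def interleave(pos, t1, t2):
    """True if l1 < l2 < r1 < r2 for some l1, r1 in t1 and l2, r2 in t2."""
    first = min(pos[u] for u in t1)   # the best candidate for l1
    last = max(pos[u] for u in t2)    # the best candidate for r2
    return any(first < pos[l2] < pos[r1] < last for l2 in t2 for r1 in t1)


def is_well_nested(parent, order):
    """No two disjoint subtrees interleave (checked in both roles)."""
    pos = {v: i for i, v in enumerate(order)}
    trees = {v: subtree(parent, v) for v in order}
    for u, v in combinations(order, 2):
        t1, t2 = trees[u], trees[v]
        if t1.isdisjoint(t2) and (interleave(pos, t1, t2) or interleave(pos, t2, t1)):
            return False
    return True


# Hypothetical tree (assumption): r has children a and c; b is a child of a, d of c.
parent = {"r": None, "a": "r", "b": "a", "c": "r", "d": "c"}
crossing = ["r", "a", "c", "b", "d"]   # subtrees {a, b} and {c, d} interleave
nested = ["r", "a", "c", "d", "b"]     # {c, d} sits inside the gap of {a, b}

print(is_well_nested(parent, crossing))  # False
print(is_well_nested(parent, nested))    # True
```

The second order is well-nested although it is not projective, which illustrates why well-nestedness is a relaxation of projectivity.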


Contributions related to well-nestedness

A well-nested drawing has at most O(n) different gaps

[Figures: the worst-case example drawings on the nodes 1 to 7 for the O(n*log(n)) and O(n²) bounds, annotated "not well-nested"]

bound for well-nested drawings: O(n)


Contributions related to well-nestedness

[Figures: example drawings labelled projective, planar and well-nested]

two algorithms to test well-nestedness; both O(n²)

first algorithm: tests for sibling nodes whether their subtrees interleave

second algorithm: reduction to a cycle test (details later)

projective drawings ⊂ planar drawings ⊂ well-nested drawings
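The sibling-based test can be sketched as follows. Restricting the check to siblings is sound because any two disjoint subtrees sit below two different children of their lowest common ancestor, so if they interleave, the subtrees of those two siblings interleave as well. This is an illustrative sketch under an assumed representation (child-to-parent map plus order list); it makes no attempt to reproduce the O(n²) bound or the details of the algorithm in the thesis.

```python
from itertools import combinations


def yield_positions(parent, order):
    """Map every node to the sorted list of positions of its yield."""
    ys = {v: [] for v in order}
    for i, v in enumerate(order):
        node = v
        while node is not None:      # credit position i to v and all ancestors
            ys[node].append(i)
            node = parent[node]
    return ys                        # lists are sorted because i increases


def interleave(p1, p2):
    """Sorted position lists: is there l1 < l2 < r1 < r2 (l1, r1 in p1; l2, r2 in p2)?"""
    return any(p1[0] < l2 < r1 < p2[-1] for l2 in p2 for r1 in p1)


def is_well_nested_by_siblings(parent, order):
    ys = yield_positions(parent, order)
    siblings = {}
    for v in order:
        siblings.setdefault(parent[v], []).append(v)
    for group in siblings.values():
        for u, v in combinations(group, 2):
            if interleave(ys[u], ys[v]) or interleave(ys[v], ys[u]):
                return False
    return True


# Hypothetical tree (assumption): r has children a and c; b is a child of a, d of c.
parent = {"r": None, "a": "r", "b": "a", "c": "r", "d": "c"}

print(is_well_nested_by_siblings(parent, ["r", "a", "c", "b", "d"]))  # False
print(is_well_nested_by_siblings(parent, ["r", "a", "c", "d", "b"]))  # True
```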


Contributions related to well-nestedness

In well-nested drawings each node has a gap forest

The gap forest of a node v describes the relative position among the subtrees rooted at its children

If the drawing is not well-nested, some nodes have no gap forest.

[Figure: a drawing that is not well-nested, with a node that has no gap forest]


Contributions related to well-nestedness

Example: gap forest of node b

[Figure: drawing over the nodes a to h and the gap forest of node b]


Part II

A description language for drawings


Description language

projective drawings are describable as tree structure + local order

Example: This is a sentence

[Figure: dependency tree with the edges is → This (subj), is → sentence (obj) and sentence → a (det)]

for well-nested drawings local order is not sufficient

[Figure: example drawing over the words of "This is a sentence"]

goal: find local description for order in well-nested drawings
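For the projective case, "tree structure + local order" can be illustrated with a short sketch: if every node only records how it is ordered relative to its own children, the whole word order follows by recursive flattening. The dictionary layout and the function name `linearize` are assumptions made for this illustration, and words are used as node identifiers, which only works while each word occurs once.

```python
def linearize(local_order, root):
    """Flatten a projective drawing from per-node local orders.

    local_order maps a node to the ordered list consisting of the node
    itself and its children; leaves may be omitted.
    """
    words = []
    for item in local_order.get(root, [root]):
        if item == root:
            words.append(root)
        else:
            words.extend(linearize(local_order, item))
    return words


# Local orders for the example sentence: "is" governs "This" and "sentence",
# "sentence" governs "a".
local_order = {
    "is": ["This", "is", "sentence"],
    "sentence": ["a", "sentence"],
}

print(" ".join(linearize(local_order, "is")))  # This is a sentence
```

Each subtree is emitted as one contiguous block, so this scheme can only produce projective drawings. That is the limitation that motivates a richer local description for well-nested drawings.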


Description language

description of order in our approach: extended form of gap forests

(contains some additional nodes and edge labels)

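Example: [Figure: drawing over the nodes c, a, d, b, e and its description by extended gap forests, with additional nodes labelled "self" and edge labels 1 and 2]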

each well-nested drawing has a unique description

order is described locally for each node


Description language

underspecified description of gap forests with constraint language

describes sets of drawings (with the same tree structure)

saturation algorithm to enumerate all described gap forests (NP)

(related to saturation algorithms for dominance constraints)

Example: [Figure: underspecified description over the nodes c, a, d, b, e and the drawings it describes]

constraints:
• self is in the first gap of c
• b before c
• ...


Part III

Tree Adjoining Grammar


Tree Adjoining Grammar (TAG)

TAG derivation combines elementary trees into derived tree

derivation tree records the combining operations

lexicalised TAG: each elementary tree corresponds to one word

[Figures: elementary trees for the words what, does, Dan and like (node labels S, VB, NP1, NP2), the derived tree, and the derivation tree]


TAG drawings

consist of derivation tree + leaf-order of derived tree

Theorem (structural characterisation): TAG drawings = well-nested drawings with gap degree ≤ 1

[Figure: derived tree + derivation tree for the words what, does, Dan, like, giving the drawing "what does Dan like"]
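Read as a test, the structural characterisation says that a drawing is a TAG drawing exactly when it is well-nested and no node has more than one gap. The sketch below checks these two conditions on abstract example drawings. It does not construct any TAG derivation, and the representation, the example trees and the name `is_tag_drawing` are assumptions made for illustration.

```python
from itertools import combinations


def yield_positions(parent, order):
    """Map every node to the sorted positions of its yield (child -> parent map)."""
    ys = {v: [] for v in order}
    for i, v in enumerate(order):
        node = v
        while node is not None:
            ys[node].append(i)
            node = parent[node]
    return ys


def gap_count(ps):
    """Number of gaps in a yield given as sorted positions."""
    return sum(1 for a, b in zip(ps, ps[1:]) if b > a + 1)


def interleave(p1, p2):
    """Is there l1 < l2 < r1 < r2 with l1, r1 in p1 and l2, r2 in p2?"""
    return any(p1[0] < l2 < r1 < p2[-1] for l2 in p2 for r1 in p1)


def is_tag_drawing(parent, order):
    """Well-nested and gap degree at most 1."""
    ys = yield_positions(parent, order)
    if any(gap_count(ps) > 1 for ps in ys.values()):
        return False
    for u, v in combinations(order, 2):
        if set(ys[u]).isdisjoint(ys[v]):
            if interleave(ys[u], ys[v]) or interleave(ys[v], ys[u]):
                return False
    return True


# Hypothetical examples (assumptions, not taken from the slides):
chain = {"r": None, "a": "r", "b": "a"}                       # one gap, well-nested
cross = {"r": None, "a": "r", "b": "a", "c": "r", "d": "c"}   # interleaving subtrees
wide = {"r": None, "a": "r", "b": "a", "c": "a", "e": "r"}    # node a has two gaps

print(is_tag_drawing(chain, ["a", "r", "b"]))            # True
print(is_tag_drawing(cross, ["r", "a", "c", "b", "d"]))  # False
print(is_tag_drawing(wide, ["a", "r", "b", "e", "c"]))   # False
```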


finally a technical detail:

Reducing well-nestedness to a cycle-test


The gap graph of a drawing

Two types of edges: tree edges and gap edges

[Figure: a drawing over the nodes a, b, c, d and its gap graph, with tree edges and gap edges marked]

Theorem: A drawing is well-nested if and only if its gap graph is acyclic


The gap graph of a drawing

Proof:
I. If a drawing is not well-nested, its gap graph contains a cycle.
II. If the gap graph contains a cycle, the drawing is not well-nested.

Part I.

If the drawing is not well-nested, there exist two disjoint subtrees with interleaving nodes l1, r1 and l2, r2.

[Figures: drawing with interleaving subtrees; gap graph with cycle]


The gap graph of a drawing

Proof: II. If the gap graph contains a cycle, the drawing is not well-nested.

If the gap graph contains a cycle, it contains a cycle in which all nodes reached by a gap edge are pairwise disjoint.

[Figure: cycle through the nodes y1, x1, y2, x2, ..., yn, xn]

If x1 and x2 are not disjoint, either x1 dominates x2 or x2 dominates x1.

[Figures: the cycle for the two cases]


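The gap graph of a drawing

Proof: II. If the gap graph contains a cycle, the drawing is not well-nested.

Assume that the drawing is well-nested. Then the path from x1 to x2 implies a strict containment between C(x1) and C(x2).

[Figure: cycle through the nodes y1, x1, y2, x2, ..., yn, xn]

Following the cycle, C(x1), C(x2), ..., C(xn), C(x1) form a chain of strict containments, so C(x1) would be strictly contained in itself: a contradiction.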


Main contributions

Formalisation of drawings

Measures for non-projectivity of drawings: gap degree and well-nestedness

Description language for well-nested drawings.

Characterisation of TAG drawings (well-nested + gap 1)

Future work:

tree bank evaluations

grammar formalism based on drawings

structural properties of other formalisms
