Drawings as Models of Syntactic Structure:
Theory and Algorithms
by Mathias Möhl
supervised by Marco Kuhlmann
final talk of diploma thesis
at Programming Systems Lab, Saarland University, Prof. Smolka
2 / 25
Dependency analysis
Example: This is a sentence
The dependency analysis of a sentence consists of two relations:
• a tree (dependencies among words)
• a total order (word order)
formal model: drawings
Definition: A drawing is a relational structure (V; S, ≺), where (V; S) forms a tree and (V; ≺) is a total order.
3 / 25
The task
structural properties of drawings
• two relaxations of projectivity: gap degree, well-nestedness
description language
• constraint language for well-nested drawings
• saturation algorithm for enumeration
Tree Adjoining Grammar (TAG)
• definition of TAG drawing
• TAG'ness = well-nestedness + gap degree ≤ 1
4 / 25
Part I
Structural properties of drawings
5 / 25
Some terminology
for any node v of a drawing (V; S, ≺) we define:
yield(v) := S*v
cover(v) := Convex-Hull(S*v)
Example: [Figure: drawing over the nodes 1 to 5]
yield(5) = {2,4,5}
cover(5) = {2,3,4,5}
Definition: A drawing is projective iff the yield of each node equals its cover.
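A minimal sketch of these definitions in Python. The slide's figure is not recoverable from the transcript, so the tree below is a hypothetical one, chosen only so that it reproduces yield(5) and cover(5) from the example. Nodes are identified with their positions in the total order, and the names `children`, `node_yield`, `cover` and `is_projective` are illustrative.

```python
from typing import Dict, List, Set

# Hypothetical drawing: each node is named by its position in the order.
# children[v] lists the direct successors of v in the tree (relation S).
children: Dict[int, List[int]] = {1: [3, 5], 3: [], 5: [2, 4], 2: [], 4: []}


def node_yield(children: Dict[int, List[int]], v: int) -> Set[int]:
    """yield(v) = S*v: the node v together with all its descendants."""
    result = {v}
    for child in children[v]:
        result |= node_yield(children, child)
    return result


def cover(children: Dict[int, List[int]], v: int) -> Set[int]:
    """cover(v): every position between the leftmost and rightmost yield node."""
    y = node_yield(children, v)
    return set(range(min(y), max(y) + 1))


def is_projective(children: Dict[int, List[int]]) -> bool:
    """Projective iff yield(v) equals cover(v) for every node v."""
    return all(node_yield(children, v) == cover(children, v) for v in children)


print(sorted(node_yield(children, 5)))  # [2, 4, 5]
print(sorted(cover(children, 5)))       # [2, 3, 4, 5]
print(is_projective(children))          # False
```

Node 3 lies inside cover(5) but not in yield(5), which is why this hypothetical drawing is not projective.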
6 / 25
Gaps in drawings
A gap of a node v is a maximal convex set in cover(v) − yield(v).
The number of gaps of a node is called its gap degree.
Example: [Figure: drawing over the nodes 1 to 7]
node 1 has two gaps: {3} and {5,6}
its gap degree is two
The gap degree of a drawing is the maximum of the gap degrees of its nodes.
gap degree 0 ⇔ projective
The gap degree is a measure of the non-projectivity of a drawing.
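The gap definitions can be sketched as follows. The tree is again hypothetical (the figure did not survive extraction); it is chosen so that node 1 has exactly the two gaps named in the example. This is a direct reading of the definition, not the O(n·g) algorithm mentioned in the talk.

```python
from typing import Dict, List, Set

# Hypothetical drawing over positions 1..7 in which yield(1) = {1, 2, 4, 7}.
children: Dict[int, List[int]] = {
    3: [1, 5], 1: [2, 4, 7], 5: [6], 2: [], 4: [], 6: [], 7: [],
}


def node_yield(children: Dict[int, List[int]], v: int) -> Set[int]:
    """The node v together with all its descendants."""
    result = {v}
    for child in children[v]:
        result |= node_yield(children, child)
    return result


def gaps(children: Dict[int, List[int]], v: int) -> List[List[int]]:
    """Maximal runs of consecutive positions in cover(v) that are not in yield(v)."""
    y = node_yield(children, v)
    found: List[List[int]] = []
    current: List[int] = []
    for pos in range(min(y), max(y) + 1):
        if pos in y:
            if current:          # a yield node closes the gap collected so far
                found.append(current)
                current = []
        else:
            current.append(pos)  # position inside the cover but outside the yield
    return found


def gap_degree(children: Dict[int, List[int]]) -> int:
    """Gap degree of the drawing: the maximum number of gaps over all nodes."""
    return max(len(gaps(children, v)) for v in children)


print(gaps(children, 1))     # [[3], [5, 6]]
print(gap_degree(children))  # 2
```

A drawing with gap degree 0 has no node with a gap, so every yield equals its cover.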
7 / 25
Contributions related to gaps
Algorithm to compute the gap degree: O(n·g)
(n = number of nodes; g = gap degree; g < n)
A drawing has at most O(n²) gaps and at most O(n·log(n)) different gaps
[Figure: worst-case examples for the O(n·log(n)) and O(n²) bounds, two drawings over the nodes 1 to 7]
8 / 25
Well-nestedness
Definition: A drawing is well-nested if it satisfies the constraint: all disjoint subtrees are non-interleaving.
Two subtrees T1, T2 interleave if there are nodes l1, r1 ∈ T1 and l2, r2 ∈ T2 such that l1 ≺ l2 ≺ r1 ≺ r2.
[Figure: interleaving subtrees (l1 l2 r1 r2) and two examples of non-interleaving subtrees]
Well-nestedness is a second kind of relaxation of projectivity
Orthogonal to the gap degree
[Figure labels: projective, well-nested]
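The interleaving condition can be checked directly from its definition. The sketch below assumes that subtrees are given as sets of positions in the total order. It is a brute-force check written for readability, and the function name `interleave` is illustrative.

```python
from typing import Set


def interleave(t1: Set[int], t2: Set[int]) -> bool:
    """True if there are l1, r1 in t1 and l2, r2 in t2 with l1 < l2 < r1 < r2."""
    return any(
        l1 < l2 < r1 < r2
        for l1 in t1
        for r1 in t1
        for l2 in t2
        for r2 in t2
    )


print(interleave({1, 3}, {2, 4}))  # True:  1 < 2 < 3 < 4
print(interleave({1, 2}, {3, 4}))  # False: the subtrees lie side by side
print(interleave({1, 4}, {2, 3}))  # False: one subtree sits in a gap of the other
```

The condition as stated is not symmetric in T1 and T2, so a test over all pairs of disjoint subtrees has to try both argument orders.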
9 / 25
Contributions related to well-nestedness
A well-nested drawing has at most O(n) different gaps
[Figure: the worst-case examples for the O(n·log(n)) and O(n²) bounds, labelled "not well-nested"]
bound for well-nested drawings: O(n)
10 / 25
Contributions related to well-nestedness
[Figure: example drawings labelled projective, planar, well-nested]
two algorithms to test well-nestedness; both O(n²)
first algorithm: tests, for sibling nodes, whether their subtrees interleave
second algorithm: reduction to a cycle test (details later)
projective drawings ⊂ planar drawings ⊂ well-nested drawings
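A sketch of the idea behind the first algorithm, under two assumptions: nodes are identified with their positions, and two sibling subtrees count as interleaving when their merged positions alternate at least three times (pattern 1 2 1 2 or 2 1 2 1). Testing siblings is enough because any two disjoint subtrees are contained in the subtrees of two distinct children of their lowest common ancestor. The code makes no attempt to meet the O(n²) bound stated on the slide, and both example trees are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set

Tree = Dict[int, List[int]]  # node (named by its position) -> list of children


def node_yield(children: Tree, v: int) -> Set[int]:
    """The node v together with all its descendants."""
    result = {v}
    for child in children[v]:
        result |= node_yield(children, child)
    return result


def interleave(t1: Set[int], t2: Set[int]) -> bool:
    """True if the two disjoint position sets interleave in either order."""
    labelled = sorted([(p, 1) for p in t1] + [(p, 2) for p in t2])
    # Count maximal runs of consecutive positions from the same subtree.
    runs = sum(
        1
        for i, (_, label) in enumerate(labelled)
        if i == 0 or labelled[i - 1][1] != label
    )
    return runs >= 4


def is_well_nested(children: Tree) -> bool:
    """Well-nested iff no two sibling subtrees interleave."""
    for v in children:
        for a, b in combinations(children[v], 2):
            if interleave(node_yield(children, a), node_yield(children, b)):
                return False
    return True


crossing: Tree = {5: [1, 2], 1: [3], 2: [4], 3: [], 4: []}  # yields {1,3} and {2,4}
nested: Tree = {2: [4], 4: [1, 3], 1: [], 3: []}            # node 4 has the gap {2}
print(is_well_nested(crossing))  # False
print(is_well_nested(nested))    # True
```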
11 / 25
Contributions related to well-nestedness
In well-nested drawings each node has a gap forest
The gap forest of a node v describes the relative position among the subtrees rooted at its children
If the drawing is not well-nested, some nodes have no gap forest.
[Figure omitted]
12 / 25
Contributions related to well-nestedness
Example: gap forest of node b
[Figure: drawing over the nodes a to h and the gap forest of node b, which contains the nodes a, c, e, f]
13 / 25
Part II
A description language for drawings
14 / 25
Description language
projective drawings are describable as tree structure + local order
[Figure: dependency tree of "This is a sentence" with root "is" and labelled edges This: subj, sentence: obj, a: det]
for well-nested drawings local order is not sufficient
[Figure: variant of the example sentence, details not recoverable]
goal: find local description for order in well-nested drawings
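The claim about projective drawings can be illustrated with the example sentence. In the sketch below, the local order at a node is assumed to be a list containing the node itself and its children from left to right. This encoding and the names `local_order` and `linearise` are illustrative, not taken from the talk.

```python
from typing import Dict, List

# Assumed encoding: for each word, the left-to-right order of the word itself
# and its dependents. The tree structure is implicit in these lists.
local_order: Dict[str, List[str]] = {
    "is": ["This", "is", "sentence"],  # subj, head, obj
    "sentence": ["a", "sentence"],     # det, head
    "This": ["This"],
    "a": ["a"],
}


def linearise(local_order: Dict[str, List[str]], node: str) -> List[str]:
    """Recover the word order by concatenating the yields in local order."""
    words: List[str] = []
    for item in local_order[node]:
        if item == node:
            words.append(node)                           # the head itself
        else:
            words.extend(linearise(local_order, item))   # a whole subtree
    return words


print(" ".join(linearise(local_order, "is")))  # This is a sentence
```

This works because every yield in a projective drawing is contiguous, so a subtree can be placed as one block. A node with a gap breaks that assumption, which is why local order alone does not describe well-nested drawings.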
15 / 25
Description language
description of order in our approach: extended form of gap forests
(contains some additional nodes and edge labels)
Example: [Figure: drawing over the nodes c, a, d, b, e and its description; the extended gap forests contain "self" nodes and edge labels 1 and 2]
each well-nested drawing has a unique description
order is described locally for each node
16 / 25
Description language
underspecified description of gap forests with constraint language
describes sets of drawings (with the same tree structure)
saturation algorithm to enumerate all described gap forests (NP)
(related to saturation algorithms for dominance constraints)
Example (nodes c, a, d, b, e), constraints:
• self is in the first gap of c
• b before c
• ...
17 / 25
Part III
Tree Adjoining Grammar
18 / 25
Tree Adjoining Grammar (TAG)
TAG derivation combines elementary trees into a derived tree
derivation tree records the combining operations
lexicalised TAG: each elementary tree corresponds to one word
[Figure: elementary trees, derived tree and derivation tree for the words what, does, Dan, like]
19 / 25
TAG drawings
consist of derivation tree + leaf order of derived tree
structural characterisation:
Theorem: TAG drawings = well-nested drawings with gap degree ≤ 1
[Figure: derived tree + derivation tree (does, like, Dan, what) give the drawing of "what does Dan like"]
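A sketch that checks the two structural conditions of the theorem on the example. The derivation tree is an assumption read off the figure residue ("does" at the root, "like" below it, "Dan" and "what" below "like"); the slide text does not state it explicitly. The function names are illustrative, and the well-nestedness test compares sibling subtrees only.

```python
from itertools import combinations
from typing import Dict, List, Set

order = ["what", "does", "Dan", "like"]  # leaf order of the derived tree
children: Dict[str, List[str]] = {       # assumed derivation tree
    "does": ["like"], "like": ["Dan", "what"], "Dan": [], "what": [],
}
position = {word: i for i, word in enumerate(order)}


def yield_positions(v: str) -> Set[int]:
    """Positions of v and all its descendants in the word order."""
    result = {position[v]}
    for child in children[v]:
        result |= yield_positions(child)
    return result


def gap_degree(v: str) -> int:
    """Number of gaps of v: jumps between consecutive yield positions."""
    ys = sorted(yield_positions(v))
    return sum(1 for a, b in zip(ys, ys[1:]) if b - a > 1)


def interleave(t1: Set[int], t2: Set[int]) -> bool:
    """True if the merged positions alternate as 1 2 1 2 or 2 1 2 1."""
    labelled = sorted([(p, 1) for p in t1] + [(p, 2) for p in t2])
    runs = sum(
        1
        for i, (_, label) in enumerate(labelled)
        if i == 0 or labelled[i - 1][1] != label
    )
    return runs >= 4


def is_well_nested() -> bool:
    """No two sibling subtrees interleave."""
    return not any(
        interleave(yield_positions(a), yield_positions(b))
        for v in children
        for a, b in combinations(children[v], 2)
    )


def has_tag_shape() -> bool:
    """Well-nested and gap degree at most 1, as in the characterisation."""
    return is_well_nested() and all(gap_degree(v) <= 1 for v in children)


print(gap_degree("like"))  # 1: "does" lies between "what" and "Dan like"
print(has_tag_shape())     # True
```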
20 / 25
finally a technical detail:
Reducing well-nestedness to a cycle-test
21 / 25
The gap graph of a drawing
[Figure: a drawing over the nodes a, b, c, d and its gap graph; legend: tree edges, gap edges]
Two types of edges: tree edges and gap edges
Theorem: A drawing is well-nested if and only if its gap graph is acyclic
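A sketch of the cycle test under an assumed definition of the gap graph, since the transcript does not spell it out: tree edges point from a parent to its child, and there is a gap edge from y to x whenever y lies in a gap of x. This reading is consistent with the proof outline in the talk but is not confirmed by it. The example drawings and all names are hypothetical.

```python
from typing import Dict, List, Set

Tree = Dict[str, List[str]]


def yield_positions(children: Tree, position: Dict[str, int], v: str) -> Set[int]:
    """Positions of v and all its descendants."""
    result = {position[v]}
    for child in children[v]:
        result |= yield_positions(children, position, child)
    return result


def gap_graph(children: Tree, order: List[str]) -> Dict[str, Set[str]]:
    """Build the gap graph under the assumed definition of gap edges."""
    position = {v: i for i, v in enumerate(order)}
    edges = {v: set(children[v]) for v in children}  # tree edges: parent -> child
    for x in children:
        ys = yield_positions(children, position, x)
        for p in range(min(ys), max(ys) + 1):
            if p not in ys:
                edges[order[p]].add(x)  # gap edge: node in a gap of x -> x
    return edges


def has_cycle(edges: Dict[str, Set[str]]) -> bool:
    """Depth-first search; reaching a node that is still open closes a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {v: WHITE for v in edges}

    def visit(v: str) -> bool:
        colour[v] = GREY
        for w in edges[v]:
            if colour[w] == GREY or (colour[w] == WHITE and visit(w)):
                return True
        colour[v] = BLACK
        return False

    return any(colour[v] == WHITE and visit(v) for v in edges)


# Hypothetical drawing with interleaving subtrees: yields {a, c} and {b, d}.
crossing: Tree = {"r": ["a", "b"], "a": ["c"], "b": ["d"], "c": [], "d": []}
print(has_cycle(gap_graph(crossing, ["a", "b", "c", "d", "r"])))  # True

# Hypothetical well-nested drawing with one gap.
nested: Tree = {"does": ["like"], "like": ["Dan", "what"], "Dan": [], "what": []}
print(has_cycle(gap_graph(nested, ["what", "does", "Dan", "like"])))  # False
```

In the first drawing the cycle is a, c, b, a: a tree edge from a to c, a gap edge from c to b (c lies in the gap of b) and a gap edge from b to a (b lies in the gap of a).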
22 / 25
The gap graph of a drawing
Proof:
I. If a drawing is not well-nested, its gap graph contains a cycle.
II. If the gap graph contains a cycle, the drawing is not well-nested.
Part I.
If the drawing is not well-nested, there exist two disjoint subtrees with interleaving nodes l1, l2, r1, r2.
[Figure: drawing with interleaving subtrees and the corresponding gap graph with a cycle]
23 / 25
The gap graph of a drawing
Proof: II. If the gap graph contains a cycle, the drawing is not well-nested.
If the gap graph contains a cycle, it contains a cycle in which all nodes reached by a gap edge are pairwise disjoint.
[Figure: cycle through the nodes x1, y1, x2, y2, ..., xn, yn]
If x1 and x2 are not disjoint, either x1 dominates x2 or x2 dominates x1.
[Figures: the cycle in each of the two cases]
24 / 25
The gap graph of a drawing
Proof: II. If the gap graph contains a cycle, the drawing is not well-nested.
Assume that the drawing is well-nested. Then the path from x1 via y1 to x2 implies C(x1) ⊂ C(x2).
[Figure: the cycle x1, y1, x2, y2, ..., xn, yn and the path x1, y1, x2]
Applied around the cycle: C(x1) ⊂ C(x2) ⊂ ... ⊂ C(xn) ⊂ C(x1)
Hence C(x1) ⊂ C(x1), a contradiction.
25 / 25
Main contributions
Formalisation of drawings
Measures for non-projectivity of drawings: gap degree, well-nestedness
Description language for well-nested drawings
Characterisation of TAG drawings (well-nested + gap degree ≤ 1)
Future work:
tree bank evaluations
grammar formalism based on drawings
structural properties of other formalisms