a graph grammar approach to geographical databases


Inform. Systems Vol. 10, No. 1, pp. 9-19, 1985. 0306-4379/85 $3.00 + .00. Printed in the U.S.A. Pergamon Press Ltd.

A GRAPH GRAMMAR APPROACH TO GEOGRAPHICAL DATABASES†

ANDREAS MEIER
Institut fuer Informatik, Swiss Federal Institute of Technology (ETH) Zurich, CH-8092 Zurich, Switzerland

(Received 20 July 1983)

Abstract-This paper treats the modeling of an important class of databases, i.e. geographical databases, with emphasis on both structural (data definition) and behavioral (data manipulation) aspects. Geometric objects such as polygons, line segments, and points may have different relations among each other (such as order, adjacency, connectivity) and can be represented in a uniform spatial data structure (structure graph). The dynamic behavior is defined by a finite set of consistency-preserving state transitions (productions) where coincidence problems as well as topological properties have to be solved. Moreover, the graph grammar approach can be used to study the synchronization of several concurrent productions (Church-Rosser properties) and offers a framework for implementing a geographical database.

1. REPRESENTATION OF GEOGRAPHICAL DATA

Database modeling is concerned with the static and dynamic behavior of real world applications. As of today, most of the effort to represent data in databases has been invested in commercial applications, e.g. administration. We have studied a few applications dealing with geographical data, e.g. triangulation networks, real estate parcel plans, and earth resource and land use registers.

Computerized geographical databases offer advantages over conventional methods: easy updating, fast analysis, and meaningful visual representation of data can be economically accomplished. Of course, in order to be considered official and legal (juridical land register), such a system should ensure data integrity (i.e. reliability, consistency, quality, security and protection of data).

Geographical entities are classified geometrically as POINTS, LINES and REGIONS. Relationships among these primitives play an important role. Although the relational model adequately describes these relationships, it has been recognized as lacking some semantics because of its uniform and sometimes inconsistent treatment of attributes and relationships. Extensions have been proposed to capture more of the meaning of the data [1]. Furthermore, the modeling of the behavioral properties in a consistent manner has become more and more important, especially in the field of geo-processing where modifications can last as long as a year due to legal procedures [2]. As a consequence, data items and topological properties must be analyzed to decide whether or not certain modifications are allowed (context conditions of the manipulation).

†Much of this work was done as Ph.D. thesis research in the Computer Science Department at the ETH Zurich and was supported, in part, by the Zentenarfond of the ETH under grant number 0.330.080.07/8.

The goal of this paper is to model structure and behavior of a geographical database in a very consistent manner. Our approach is based on ideas of the relational model and concepts of graph grammars (originally described in [3]) for the following main reasons:

(1) Graph grammars may be used to provide a semantic model for data which controls both the allowed states of the data and the transitions from one state to another by way of update operations.

(2) Graph grammar semantics are an effective model for geographical data in particular. This effectiveness will be demonstrated by an example.

(3) Using graph grammars, it is possible under certain circumstances to show parallel independence for a set of productions. This may permit study of concurrency problems in a multi-user environment.

In the relational data model, constraints may be required for semantic and integrity reasons [4]. Static constraints express rules about database states, and dynamic constraints are operation oriented and specify which state transitions are allowed. In our approach, both types of constraints are inherent in the model as long as no geometric computation is involved. For instance, one might impose a constraint that no division of a region is to be permitted if one or the other of the resulting regions is smaller than some size. This will require an auxiliary function to compute areas of regions.

Other data models for databases have been proposed to describe events of a database. Programs, integral parts of the semantic network description [5], are characterized by a prerequisite, an action body, and an effect or complaint depending on whether or not the body completes successfully. In [6], structural and behavioral specifications of an object class are integrated into a data abstraction given in terms of predicates. Petri nets have been considered for event modeling as well [7]; net transitions express a change of conditions from those which hold before to those which hold after a rule fires.

In Section 2, we describe a graph grammar approach to model structure and operations of a database. Section 3 shows that all the example productions on a map of polygons preserve consistency. The context conditions of merging and dividing productions are discussed in Section 4. In Section 5, we study the synchronization of several concurrent map productions. Section 6 treats some implementation aspects (extended by an Appendix).

2. THE UNDERLYING DATA MODEL

Survey articles [8, 9] on geographical information systems show that the relational model is often used because of its tabular structure and the set-oriented nature of its operations. A relation is a subset of a Cartesian product of not necessarily distinct domains (sets of possible values of an attribute) and represents an entity set in terms of its intension (i.e. entity set name, attributes) and its extension (i.e. data values). Attributes and relationships between entity sets can be described by the edges of a graph where the vertices represent entity sets or domains [10]. Applying results from graph grammar theory [11], all consistent states of a database can then be modeled by labeled graphs, and state transitions are given by graph productions.

2.1. Description of the structure graph

The first step in database modeling is identifying the relevant entity sets and analyzing how entity sets are interrelated. A relationship between two entity sets Ei and Ej can be described by two associations Tij and Tji. The association Tij between entity sets Ei and Ej indicates the number of potential entities in Ej which can be associated with one entity in Ei. For our purpose, we distinguish four types of associations between two entity sets by specifying both cardinality and dependency of the association [12, 13].

The cardinality (unique: 1, or multiple: m) places restrictions on the number of entities in one entity set that may be related to an entity of the other set, whereas the dependency (conditional: c, or multiple-conditional: mc) defines whether or not an entity can exist independently.

As an example, we consider the entity sets SEGMENT and POINT. For defining the geometric relationship of coincidence, we may conclude: To each segment there exist exactly two points (type Tsp = 2), and, on the other hand, each point can be related to several segments (type Tps = mc). The relationship between SEGMENT and POINT characterizes the boundary of each segment and the co-boundary of their defining points, respectively.

Having defined all entity sets and their relevant relationships, mapping into relations is the next step. A structure graph is a directed graph where the vertices represent relations and domains (value sets), and the edges represent attributes and roles. An attribute is a function from a relation into a domain of any data type (e.g. INTEGER, CHARACTER, BOOLEAN, etc.). A role indicates an association of one relation to another by naming the role played by the referenced domain in this association. Each role edge connects the two domains which define the association and includes its corresponding association type.

We discuss a structure graph (Fig. 1) derived from two entity sets and some relationship. This graph describes all consistent states of a possible database by drawing its relations, domains, attributes, and roles. The relation R3 is defined over subdomains from R1 and R2, respectively, and therefore shows two role edges to the referenced domains. The roles reflect the notion of relationship in the relational model where associations are based on value matching. Furthermore, they explicitly describe referential constraints [14], i.e. only existing data values of referenced domains may be used to establish relationships.
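As an illustration only, a structure graph of this kind could be held in a small data structure such as the following sketch. The class and method names (`StructureGraph`, `add_role`, `referenced_by`) and the domain name `D1` are assumptions made for this example and do not appear in the paper; association types are stored as plain labels.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Attribute:
    name: str
    relation: str   # relation the attribute edge starts from
    domain: str     # domain (value set) the edge points to


@dataclass(frozen=True)
class Role:
    name: str
    relation: str     # referencing relation
    referenced: str   # referenced relation
    association: str  # association type label, e.g. "T13"


@dataclass
class StructureGraph:
    """Hypothetical container: vertices are relations and domains,
    edges are attributes and roles."""
    relations: set = field(default_factory=set)
    domains: set = field(default_factory=set)
    attributes: list = field(default_factory=list)
    roles: list = field(default_factory=list)

    def add_attribute(self, name, relation, domain):
        if relation not in self.relations:
            raise ValueError(f"unknown relation {relation!r}")
        self.domains.add(domain)
        self.attributes.append(Attribute(name, relation, domain))

    def add_role(self, name, relation, referenced, association):
        # A role edge may only connect relations that are part of the graph.
        for r in (relation, referenced):
            if r not in self.relations:
                raise ValueError(f"unknown relation {r!r}")
        self.roles.append(Role(name, relation, referenced, association))

    def referenced_by(self, relation):
        """Relations that `relation` refers to through its role edges."""
        return sorted(r.referenced for r in self.roles if r.relation == relation)


# The situation of Fig. 1: R3 refers to R1 and R2 through two roles.
graph = StructureGraph(relations={"R1", "R2", "R3"})
graph.add_attribute("a1", "R1", "D1")   # D1 is a made-up domain name
graph.add_role("r1", "R3", "R1", "T13")
graph.add_role("r2", "R3", "R2", "T23")
```

Here `graph.referenced_by("R3")` lists the relations whose values a tuple of R3 must match, which is the information the referential constraints rely on.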

Fig. 1. Structure graph of a relationship. [Figure not reproduced; recoverable labels: attribute a1, role r1, role r2, association T13, association T23.]

2.2. Consistency-preserving productions

Database operations transform a database state to another database state. These states relate to a particular structure graph in order to preserve its static properties. Other consistent states may then be derived by the notion of production.

A production p: B1 ⇒ B2 consists of two graphs B1 and B2, each corresponding to the structure graph, and a set of gluing points [15] which is part of the left- and righthand side of the production as well. Both sides of a production represent graphs labeled with the names of the appropriate relations and bearing suitable variables or constants of the referenced domains. The set of gluing points is given implicitly by denotation numbers and makes it possible to identify parts of the left- and righthand sides of a production.

While all consistent states of a database are described by the structure graph, their state transitions are given by a set of productions. Starting from an initial state, we may derive all consistent states of a database by applying suitable productions. To apply a production p: B1 ⇒ B2 to a consistent state G, we have to identify the lefthand side B1 as a subgraph of G and replace it by the righthand side B2 of the production according to the gluing condition [15]. Locating the lefthand graph in the database at its momentary consistent state (pattern matching) and then replacing it by the corresponding righthand graph transforms the database into a new consistent state.

We discuss a sample production INSERT_TUPLE for relation R3 (Fig. 2) which refers to the structure graph described in the previous section. Given a consistent state of the database, the question arises which matching conditions must hold when applying an insertion. Since INSERT_TUPLE creates a new relationship in R3, both relations R1 and R2 must be referenced. Therefore, the lefthand side of the production consists of all three relations R1, R2 and R3. By studying the existence or absence of semantic edges in the lefthand graph, we are able to check the conditions which lead to a possible insertion. These preconditions of database manipulation are given explicitly in graph notation and express referential constraints: To insert a new tuple in relation R3, there must be matching values in R1 and R2, e.g. to insert a new segment, the two points defining a segment must exist.
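The referential precondition of INSERT_TUPLE can be illustrated with a minimal sketch. It models a database state as a plain dictionary, which is an assumption of this example rather than the paper's graph representation; the function name `insert_tuple` and the exception `PreconditionError` are likewise hypothetical.

```python
class PreconditionError(Exception):
    """Raised when the lefthand side of the production cannot be matched."""


def insert_tuple(state, key, r1_value, r2_value):
    """Return a new state with (r1_value, r2_value) inserted into R3.

    `state` maps "R1" and "R2" to sets of existing key values and "R3"
    to a dict of tuples. The input state is left unchanged.
    """
    # Referential constraints: matching values must exist in R1 and R2.
    if r1_value not in state["R1"]:
        raise PreconditionError(f"no matching value {r1_value!r} in R1")
    if r2_value not in state["R2"]:
        raise PreconditionError(f"no matching value {r2_value!r} in R2")
    # Absence condition: the new tuple must not exist yet.
    if key in state["R3"]:
        raise PreconditionError(f"tuple {key!r} already in R3")
    new_r3 = dict(state["R3"])
    new_r3[key] = (r1_value, r2_value)
    return {"R1": state["R1"], "R2": state["R2"], "R3": new_r3}


before = {"R1": {"p1", "p2"}, "R2": {"p1", "p2"}, "R3": {}}
after = insert_tuple(before, "s1", "p1", "p2")
```

Returning a new state instead of mutating the old one mirrors the view of a production as a transition from one consistent state to another; an update in place would serve equally well.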

3. CONSISTENCY THEOREM APPLIED TO A MAP

For illustration, we describe a map of mutually disjoint polygons and a closed set of appropriate productions. In [16], analogous concepts are applied to a database system for a library. In contrast to this example from commercial database processing, we must investigate geometric and topological properties defined on points, segments and polygons. Therefore, composition and division of polygons or segments involve problems of computational geometry.

3.1. Consistent states of a map

We consider a map of mutually disjoint polygonal regions; disjointness is the crucial property to be preserved in this example. Each polygon is given by a set of directed line segments, and each segment is defined by its start and end point.

A map is orientable because we can apply one orientation to each polygon such that each segment can be used in both possible directions. If we traverse the boundary of a given polygon, the polygon under consideration lies to the left or to the right of the enclosing line. It is important to note that the directions of two segments meeting at a point need not be identical, nor must all boundary segments of a polygon be directed in the same sense as we travel around. Attaching a direction to a segment has been introduced only for the purpose of distinguishing the polygon to the left from the polygon to the right.

It is reasonable to demand that the polygons be singly-connected regions. With no loss of generality, multiply-connected regions can be partitioned so as to result in singly-connected ones.

The structure graph of Fig. 3 represents all consistent states of a map:

-The vertices of the starting graph are represented by the relations POLYGON, SEGMENT and POINT.

-The relation POLYGON consists of the primary or key domain R#_DOM and the domain A_DOM with their respective attributes Polygon Number and Area. Each polygon has to be singly-connected and its area is not allowed to be smaller than a specific amount.

-The relation POINT is defined by the primary domain P#_DOM and the domains X_DOM and Y_DOM which describe Point Numbers, X- and Y-Coordinates. Furthermore, each coordinate pair must be unique with reference to the primary key.

-The relation SEGMENT consists of the primary domain S#_DOM (Segment Numbers) and two subdomains of the relation POINT by means of the primary domain P#_DOM and, similarly, two subdomains of the relation POLYGON by means of the primary domain R#_DOM.

-The relations POINT and SEGMENT are double-linked by two edges representing the roles Start Point and End Point, respectively. Every point may be coincident with an arbitrary set of outgoing segments (Start Point) or incoming ones (End Point), i.e. both association types are multiple-conditional.

-The relations POLYGON and SEGMENT are connected via two edges expressing the roles Left Polygon and Right Polygon for each directed segment. The border of each polygon is composed of a set of arbitrarily oriented segments, i.e. the corresponding association types for Left and Right Polygon are multiple-conditional.

Fig. 2. Shows the elementary referential constraints of a production.

Fig. 3. Structure graph of a map of non-overlapping polygons. [Figure not reproduced; recoverable labels: PolygonNumber, SegmentNumber, Y-Coordinate, RightPolygon, Trs = mc, Tps = mc.]

Every consistent state of a database with polygonal maps is restricted to the above constraints. As we can see, some geometric or topological properties are not explicitly expressed by the structure graph but have to be guaranteed when applying map productions.

3.2. A set of map productions

A summary of the behavior of a map database is given in Table 1 by means of map representations rather than productions. For simplicity, we define a minimal set of graph productions as follows:

-CREATE_MAP: The initial manipulation rule defines the boundary of a map by existing points.

-DROP_MAP: The reverse rule deletes a map which consists of a single polygon, i.e. the polygon which completely covers the map.

-MERGE_SEGMENT: This rule is only allowed for adjacent segments of the same or opposite direction. It is a composition of dimension one and merges two segments by deleting the "boundary" point in common.

-DIVIDE_SEGMENT: This rule is a subdivision of dimension one. A segment of a polygon is divided into two segments by assigning to any inner point the meaning of an end point.

-MERGE_POLYGON: This composition of dimension two turns two polygons with a common boundary (segment or chain) into a new polygon by omitting this common boundary.

-DIVIDE_POLYGON: This is the reverse rule of merging two polygons, and it defines a segment or chain by which the polygon is parceled out (two neighboring polygons).

We now show that the consistent states of a map as defined by the rules of its structure graph remain stable under the above productions and that each consistent state can be derived from the initial state.

THEOREM. All map productions transform a consistent state into a consistent state, and each consistent state may be generated from the starting graph by applying suitable map productions.

Preservation of consistent states. We show that the manipulation rules leave all consistent states topologically invariant. Obviously, the alternating sum (Euler characteristic) χ = a2 - a1 + a0 of the number of polygons a2, segments a1, and points a0 equals two (normal form of the sphere) and does not change when applying one of the above productions. In addition, the application of each production preserves all other conditions of the structure graph.

Generation of consistent states. We assume that a consistent state is given with more than one polygon, and we have to show that it can be derived by applying only the above productions. Since we have a map which contains more than one polygon, we can use the composition of dimension two such that we get a map with smaller degree of the parceling (number of polygons, segments and points). By the induction hypothesis we conclude that our given map is also derivable.
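The invariance of the Euler characteristic under the map productions can be checked on counts alone. The following sketch is a bookkeeping model, not the paper's graph productions: it assumes that the region outside the map boundary is counted as one polygon (so that the sum equals two, as on the sphere) and that a chain of k segments has k - 1 inner points. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MapCounts:
    polygons: int  # a2, including the outer region (assumption of this sketch)
    segments: int  # a1
    points: int    # a0

    @property
    def euler(self):
        return self.polygons - self.segments + self.points


def create_map(boundary_points):
    # A closed boundary with n points has n segments; it separates
    # the single map polygon from the outer region.
    return MapCounts(2, boundary_points, boundary_points)


def divide_segment(m):
    # One new point splits one segment into two.
    return MapCounts(m.polygons, m.segments + 1, m.points + 1)


def merge_segment(m):
    # Deleting the common point turns two segments into one.
    return MapCounts(m.polygons, m.segments - 1, m.points - 1)


def divide_polygon(m, chain_segments):
    # A new chain of k segments brings k - 1 inner points and one new polygon.
    return MapCounts(m.polygons + 1,
                     m.segments + chain_segments,
                     m.points + chain_segments - 1)


def merge_polygon(m, chain_segments):
    # Omitting the common chain removes k segments and k - 1 inner points.
    return MapCounts(m.polygons - 1,
                     m.segments - chain_segments,
                     m.points - (chain_segments - 1))


state = create_map(4)
history = [state.euler]
for step in (divide_segment,
             divide_segment,
             lambda m: divide_polygon(m, 3),
             lambda m: merge_polygon(m, 3),
             merge_segment):
    state = step(state)
    history.append(state.euler)
```

Every entry of `history` is 2, since each rule changes the three counts by amounts whose alternating sum is zero.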

4. CONTEXT CONDITIONS IN MAP PRODUCTIONS

Table 1. Summary of map operations. [Diagrams not reproduced; recoverable headings and labels: INITIATE AND DELETE OPERATIONS (CREATE, DROP); OPERATIONS OF DIMENSION 1 (MERGE, DIVIDE; labels Pk, Pk+1, Sk', Sk, Sk'', P); OPERATIONS OF DIMENSION 2 (MERGE, DIVIDE; label Rk).]

Fig. 4. Production MERGE_POLYGON with conditions encircled. [Figure not reproduced.]

Since geographical applications deal with complex objects and relationships, manipulation of the data cannot be restricted to a few tuples or a single relation. As an illustrative example, Fig. 4 shows the geometric and topological conditions of the production MERGE_POLYGON in an abstract notation by means of graphs. The allowed conditions are expressed by the existence or absence of semantic edges in the encircled area of the lefthand graph. In this example, they are:

-The two polygons Rn and Rm must exist.

-The polygons Rn and Rm must have at least one segment in common; i.e. there must be at least one vertex Sk with exactly two edges attaching the vertices Rn and Rm. Polygons sharing only one point are prohibited since the merging of such polygons would result in multiply-connected regions which we have excluded from the beginning.

-The common boundary must be singly-connected; i.e. there must be a path which starts from Sk and alternately visits a vertex among Pk, ..., Pk+n-1 or Sk+1, ..., Sk+n.

-Each inner point of the common boundary must have exactly two polygons meeting there, namely Rn and Rm; i.e. all vertices Pk, ..., Pk+n-1 must not have any semantic edges that intersect with the encircled line.

On the other hand, when dividing a polygon Rk into two non-overlapping polygons Rn and Rm, the new chain Sk, ..., Sk+n should link two already existing points on the boundary of polygon Rk. This chain with inner points Pk, ..., Pk+n-1 is neither allowed to lie partly outside polygon Rk nor to coincide anywhere with the boundary of the polygon, except in the two border points Pq and Pr.
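The four merge conditions listed above lend themselves to a direct check on the SEGMENT relation. The sketch below is one possible reading, under the assumptions that each segment is stored as a tuple (start point, end point, left polygon, right polygon) and that the common boundary must form a single open chain; the function name `can_merge` and the sample data are hypothetical.

```python
from collections import defaultdict


def can_merge(polygons, segments, rn, rm):
    """Check the merge preconditions for polygons rn and rm.

    segments maps a segment id to (start, end, left_polygon, right_polygon).
    """
    # Condition 1: both polygons exist (and are distinct).
    if rn == rm or rn not in polygons or rm not in polygons:
        return False
    # Condition 2: at least one segment lies between rn and rm.
    common = {s for s, (_, _, left, right) in segments.items()
              if {left, right} == {rn, rm}}
    if not common:
        return False
    # Condition 3: the common boundary is one connected open chain.
    degree = defaultdict(int)
    neighbours = defaultdict(set)
    for s in common:
        start, end = segments[s][:2]
        degree[start] += 1
        degree[end] += 1
        neighbours[start].add(end)
        neighbours[end].add(start)
    if sorted(degree.values()) != [1, 1] + [2] * (len(degree) - 2):
        return False
    seen, stack = set(), [next(iter(neighbours))]
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.add(p)
            stack.extend(neighbours[p] - seen)
    if seen != set(neighbours):
        return False
    # Condition 4: inner points of the chain touch no other segment.
    inner = {p for p, d in degree.items() if d == 2}
    return not any(s not in common and (start in inner or end in inner)
                   for s, (start, end, _, _) in segments.items())


# Two unit squares R1 and R2 side by side; None marks the outside of the map.
polygons = {"R1", "R2"}
segments = {
    "s1": ("A", "B", "R1", None), "s2": ("B", "C", "R2", None),
    "s3": ("C", "D", "R2", None), "s4": ("D", "E", "R2", None),
    "s5": ("E", "F", "R1", None), "s6": ("F", "A", "R1", None),
    "s7": ("B", "E", "R1", "R2"),   # the common boundary
}
```

Geometric requirements, such as the minimum area of the resulting polygon, are outside the scope of this purely topological check.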

5. CHURCH-ROSSER PROPERTIES OF MAP PRODUCTIONS

A subtle problem arises when several segments or polygons are manipulated at the same time (i.e. in parallel). This is not only of purely theoretical interest, but also has some practical consequences since modifications can last as long as a year. For instance, many legal procedures are involved when land surveyors are restructuring a specific region.

5.1. Parallel and sequential independence

The study of synchronization issues in the case of shared data needs a few definitions. We concentrate again on the algebraic approach of graph grammars described in [15]. As we know, the application of a production to a consistent state demands the identification of the lefthand graph and its replacement by the righthand graph according to the gluing points. The direct application of a production is often named a manipulation rule in order to distinguish it from a derivation built from a sequence of productions.

Two manipulation rules are called parallel independent if the intersection of the corresponding lefthand sides fulfills the following conditions: There exist no common edges, and each vertex must be a common gluing point. If we apply this definition to a sequence of two manipulation rules by studying the intersection property for the common consistent state, we get the definition of sequential independence. In other words, parallel and sequential independence properties are dual to each other.
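The definition of parallel independence translates almost literally into set operations. The sketch below assumes that a matched lefthand side is given as sets of vertices, edges, and gluing points; the class `LeftHandSide` and all vertex names are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeftHandSide:
    vertices: frozenset
    edges: frozenset    # each edge is a (vertex, vertex) pair
    gluing: frozenset   # vertices preserved by the production


def parallel_independent(a, b):
    """No common edges, and every common vertex is a gluing point of both."""
    common_edges = a.edges & b.edges
    common_vertices = a.vertices & b.vertices
    return not common_edges and common_vertices <= (a.gluing & b.gluing)


merge = LeftHandSide(
    vertices=frozenset({"Rn", "Rm", "S1", "Pq"}),
    edges=frozenset({("S1", "Rn"), ("S1", "Rm"), ("S1", "Pq")}),
    gluing=frozenset({"Pq"}),
)
# A division that only touches the merge in the gluing point Pq.
divide_far = LeftHandSide(
    vertices=frozenset({"Rk", "S2", "Pq"}),
    edges=frozenset({("S2", "Rk"), ("S2", "Pq")}),
    gluing=frozenset({"Pq"}),
)
# A division that also involves segment S1, which the merge deletes.
divide_near = LeftHandSide(
    vertices=frozenset({"Rk", "S1", "Pq"}),
    edges=frozenset({("S1", "Rk"), ("S1", "Pq")}),
    gluing=frozenset({"Pq"}),
)
```

With these inputs, `merge` and `divide_far` are parallel independent, whereas `merge` and `divide_near` are not, because they share the edge (S1, Pq) and the non-gluing vertex S1.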

Without any proof we state the Church-Rosser theorem: Given two direct derivations p1: G ⇒ H1 and p2: G ⇒ H2 which are parallel independent, there exists a consistent state X such that the sequences p2.p1: G ⇒ H1 ⇒ X and p1.p2: G ⇒ H2 ⇒ X are sequential independent. (A precise proof may be found in [15].) In other words, the theorem states that parallel independent productions can be executed concurrently or in any desired sequential order yielding the same result.

The example given in Fig. 5 illustrates the basic idea: Can we merge the two polygons Rm and Rn without complications concurrently with dividing Rk by segment Sk? The common boundary segments S', S'', and S''' of the polygons Rk, Rm, and Rn have common vertices and edges in the intersection of the lefthand graphs of both productions MERGE_POLYGON and DIVIDE_POLYGON. This means that the requirements of the Church-Rosser theorem are not met. As a consequence, we are not allowed to process the above in parallel. When merging Rm and Rn, the topological properties of the common boundary segments with Rk are being changed, which cannot be detected by the process which concurrently divides region Rk. Undoubtedly, this will lead to a conflict.

5.2. Independence theorem for a map

We now ask which map productions are independent for each application such that they can be used in arbitrary order or even in parallel. Obviously, creating and dropping a map are not independent of any other map production. We therefore restrict ourselves to merging and dividing manipulations in both one and two dimensions.

THEOREM. Each possible pair of merging and dividing polygons or segments is parallel independent if the corresponding polygons or segments have at most points in common.

For a proof, we first consider the map productions of dimension one. Of course, if the segments which are involved in the manipulation intersect each other, this will lead to a conflict. Therefore, merging (or dividing) segments is allowed in parallel with merging (or dividing) other segments if the two segment sets are disjoint or have only points in common. We are sure that these common points are always gluing points since two single segments which will be merged cannot meet other adjacent segments at their common point (precondition of MERGE_SEGMENT). For the same reason, merging mixed with dividing segments may be applied in parallel if the sets of segments have at most points in common.

Fig. 5. Conflicting situation caused by concurrent manipulations.

Next, we evaluate the map productions of dimension two. As we know from the previous section, two sets of polygons cannot be manipulated in parallel if they intersect in common segments or even polygons. The study of the context conditions also leads to the conclusion that all endpoints of outer boundary segments are gluing points. Thus two sets of polygons are allowed to intersect in single points in an application of parallel map productions of dimension two.

Finally, if we mix manipulation rules for polygons and segments, the geometric intersection may consist of gluing points, i.e. outer border points of polygons or outer endpoints of segments. Analogously, productions which involve both polygons and segments are parallel independent if they have at most points in common.

To summarize the key idea (Fig. 6), we point to the conflicting situation discussed in the previous section. As soon as the polygons involved in the productions MERGE_POLYGON and DIVIDE_POLYGON are separated widely enough (i.e. no polygons or segments lie in the intersection), merging and dividing can be processed in parallel.

6. IMPLEMENTATION ASPECTS

In order to implement the structure graph and its corresponding productions for the map example, we propose a transformation into a linear notation. This will lead to a set of relations and transactions partly described in the Appendix. (The specification language is based on [17].)

Fig. 6. Parallel independent manipulation rules: DIVIDE_POLYGON(Rk,Sk;Rk',Rk") and MERGE_POLYGON(Rn,Rm;Rnm).

To express the matching conditions, we break up each lefthand side of a production into a set of predicates. Since we investigate geometric and topological properties defined on points, segments and polygons, the notion of predicate must be seen in a somewhat more general sense [18]. For example, the transaction MERGE-POLYGON involves topological predicates to verify that the two polygons are adjacent, the common boundary is singly-connected, and the inner points are coincident with only the two merging polygons. On the other hand, the implementation of the transaction DIVIDE-POLYGON relies on solutions to some well-known problems in computational geometry, e.g. a point-in-polygon test for the new boundary points, an intersection test for the common boundary segments, and an algorithm to calculate the areas of the two resulting polygons. Unacceptable preconditions for the transactions MERGE-POLYGON and DIVIDE-POLYGON are listed in Table 2.

Consistency might be assured by the conventional double line test and the area test. The former says that for a set of map polygons each boundary segment should be referenced exactly twice, in opposite directions, when traveling around each polygon boundary. The latter test assures that the sum over all polygon areas remains constant and equals the area of the entire map. Obviously, it is impractical to perform the double line and area tests after each operation. We have therefore put most of the topological properties of a map into the structure graph and pay some costs for preserving consistency when applying map productions. To the user, the transactions are guaranteed to commit successfully once the rule part is satisfied (provided that the database system is reliable). This also eliminates the tedious case of undoing operations after an inconsistency has been detected.
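The double line test and the area test can be illustrated with a minimal Python sketch. Several assumptions are made here that the paper does not state: polygons are given as rings of coordinate pairs, inner polygons are listed counter-clockwise, and the outside of the map is supplied as one additional ring traversed in the opposite sense so that outer boundary segments are also referenced twice. All function names are hypothetical.

```python
from collections import Counter


def directed_edges(ring):
    """Consecutive vertex pairs of a closed ring given as a list of points."""
    return [(ring[i], ring[(i + 1) % len(ring)]) for i in range(len(ring))]


def double_line_test(rings):
    """Each segment must be traversed exactly once in each direction."""
    counts = Counter(edge for ring in rings for edge in directed_edges(ring))
    return all(
        n == 1 and counts.get((b, a), 0) == 1
        for (a, b), n in counts.items()
    )


def signed_area(ring):
    """Shoelace formula; positive for counter-clockwise rings."""
    return 0.5 * sum(
        x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in directed_edges(ring)
    )


def area_test(polygons, map_area, tol=1e-9):
    """The polygon areas must add up to the area of the entire map."""
    return abs(sum(abs(signed_area(p)) for p in polygons) - map_area) <= tol


# Two unit squares sharing the edge from (1, 0) to (1, 1).
left = [(0, 0), (1, 0), (1, 1), (0, 1)]
right = [(1, 0), (2, 0), (2, 1), (1, 1)]
# The outside of the map, traversed clockwise (an assumption of this sketch).
outer = [(0, 0), (0, 1), (1, 1), (2, 1), (2, 0), (1, 0)]

print(double_line_test([left, right, outer]), area_test([left, right], 2.0))
```

Both tests visit every polygon of the map, which is consistent with the remark that running them after each operation is impractical.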

Table 2

Unacceptable Preconditions for MERGE-POLYGON

    Bordering(Rn,Rm) = FALSE
    Connected(Sk,Sk+1) = FALSE
    Coincident(Pk+1) = FALSE

Unacceptable Preconditions for DIVIDE-POLYGON

    Point-in-polygon(Pk,Rk) = FALSE
    Intersecting(Sk,Rk) = FALSE
    Self-intersecting(Sk,Sk+1,Sk+2) = FALSE

As we already mentioned, merging and dividing polygons and segments differ from typical data processing transactions, which touch only a few tuples and generally complete in a short time. It is therefore inappropriate to lock all or part of a map and force a user to wait until all legal procedures of a running transaction are fulfilled, or even to back out a transaction (and probably destroy the update work of several days or even weeks!) in case of a deadlock. The study of the Church-Rosser properties suggests a more convenient solution: the Independence Theorem for a map tells us which geometric objects are topologically involved in a map transaction and should be copied into a separate workspace called a version. For instance, MERGE_POLYGON extracts all involved adjacent polygons (with at least one segment in common) into a new version, together with their segments and points. Versions are allowed to overlap since the system keeps track of them and guarantees a consistent re-incorporation of the changed data as soon as all legal procedures are satisfied.
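How a version might be extracted can be sketched as follows. This is an illustration only, and it rests on one reading of the text: the version is taken to contain the merging polygons, every polygon sharing at least one segment with them, and the segments and points of all those polygons. The `Segment` tuple and the function `extract_version` are hypothetical names; the paper does not describe the extraction procedure in detail.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    sid: str
    left: Optional[str]   # polygon to the left, None for the outside of the map
    right: Optional[str]  # polygon to the right, None for the outside of the map
    start: str
    end: str


def extract_version(segments, merging):
    """Collect the polygons, segments and points touched by a merge."""
    merging = set(merging)
    # The merging polygons plus every polygon sharing a segment with them.
    polygons = set(merging)
    for s in segments:
        if s.left in merging or s.right in merging:
            polygons.update(p for p in (s.left, s.right) if p is not None)
    # All segments bounding one of the collected polygons, with their points.
    bounding = [s for s in segments if s.left in polygons or s.right in polygons]
    points = {p for s in bounding for p in (s.start, s.end)}
    return polygons, {s.sid for s in bounding}, points


# A strip of four polygons A | B | C | D; A and B are to be merged.
strip = [
    Segment("s1", "A", "B", "p1", "p2"),
    Segment("s2", "B", "C", "p3", "p4"),
    Segment("s3", "C", "D", "p5", "p6"),
    Segment("s4", "A", None, "p1", "p7"),
    Segment("s5", "D", None, "p6", "p8"),
]
version_polygons, version_segments, version_points = extract_version(strip, {"A", "B"})
print(sorted(version_polygons), sorted(version_segments))
```

Two versions built this way may overlap, for example in polygon C when C and D are merged at the same time; detecting and reconciling such overlaps is left to the system, as the text describes.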

7. CONCLUSIONS

Investigations into the representation of production systems have been carried out mainly in the field of Artificial Intelligence. We have taken a geographical database as an example to show the applicability of graph grammars to practical systems. The concept is well-suited to the preservation of consistency of all productions and general enough to support additional investigations (e.g. Church-Rosser properties). Moreover, an implementation proved that the concept is reasonable and allowed estimation of the effort involved in designing and implementing such a consistency-preserving system.

Acknowledgements: I am grateful to Hans-Joerg Kreowski, Jim Rhyne and Adrian Walker for their many valuable comments on the presentation and technical accuracy of this paper.

REFERENCES

[1] E. F. Codd: Extending the database relational model to capture more meaning. ACM Trans. on Data Base Systems 4(4), 397-434 (1979).

[2] A. Meier and M. Ilg: Consistent operations on a spatial data structure. IEEE Conf. on Pattern Recognition and Image Processing, Las Vegas, pp. 432-440 (1982).

[3] J. L. Pfaltz and A. Rosenfeld: Web grammars. Proc. Int. Joint Conf. of Artificial Intelligence, Washington, pp. 609-619 (1969).

[4] K. P. Eswaran and D. D. Chamberlin: Functional specifications of a subsystem for data base integrity. Proc. 1975 Conf. on Very Large Data Base Systems, pp. 48-68.

[5] H. Levesque and J. Mylopoulos: A procedural semantics for semantic networks. In Associative Networks (Edited by N. Findler), pp. 93-120. Academic Press, New York (1979).

[6] L. M. Brodie: On modelling behavioral semantics of databases. Proc. 1981 Conf. on Very Large Data Bases, Cannes, pp. 32-42 (1981).

[7] V. De Antonellis and B. Zonta: Modelling events in data base applications design. Proc. 1981 Conf. on Very Large Data Bases, Cannes, pp. 23-31 (1981).

[8] S. K. Chang (Ed): Pictorial information systems. IEEE Comput. 14(11), (1981).

[9] G. Nagy and S. Wagle: Geographic data processing. Computing Surveys 11(2), 139-181 (1979).

[10] E. H. Sibley and L. Kerschberg: Data architecture and data model considerations. Proc. AFIPS, Texas, pp. 85-96 (1977).

[11] A. Meier: A graph-relational approach to geographic data bases. In Graph-Grammars and Their Application to Computer Science (Edited by H. Ehrig, M. Nagl and G. Rozenberg), pp. 245-254. Lecture Notes in Comp. Science, No. 153. Springer-Verlag, Berlin (1983).

[12] J. R. Abrial: Data semantics. In Data Base Management (Edited by J. W. Klimbie and K. L. Koffeman), pp. 1-59. North-Holland, Amsterdam (1974).

[13] R. El-Masri and G. Wiederhold: Properties of relationships and their representation. AFIPS Conf. Proc. 1980, Vol. 49, pp. 319-326, Anaheim.

[14] C. J. Date: Referential integrity. Proc. 1981 Conf. on Very Large Data Bases, Cannes, pp. 2-12 (1981).

[15] H. Ehrig: Introduction to the algebraic theory of graph grammars (a survey). In Graph-Grammars and their Application to Computer Science and Biology (Edited by V. Claus, H. Ehrig and G. Rozenberg), pp. 1-69. Lecture Notes in Comp. Science, No. 73. Springer-Verlag, Berlin (1979).

[16] H. Ehrig and H. J. Kreowski: Applications of graph grammar theory to consistency, synchronization and scheduling in data base systems. Inform. Systems 5, 225-238 (1980).

[17] A. Meier: A framework for specifying geographic database applications. 6th Int. Comput. Software and Applic. Conf., Chicago, pp. 476-481 (1982).

[18] M. Minsky and S. Papert: Perceptrons: An Introduction to Computational Geometry. The MIT Press, Cambridge, Mass. (1969).

APPENDIX

Description of the map structure

RELATION Points;
  ATTRIBUTE
    P#: Numbers;
    XCoordinate: REAL;
    YCoordinate: REAL;
  IDENT
    P# PRIMARY DOMAIN PointNumbers;
    (XCoordinate,YCoordinate);
END Points;

RELATION Polygons;
  ATTRIBUTE
    R#: Numbers;
    Area: REAL;
  IDENT
    R# PRIMARY DOMAIN PolygonNumbers;
END Polygons;

RELATION Segments;
  ROLE
    LeftPolygon: Polygons ASSOCIATION(mc);
    RightPolygon: Polygons ASSOCIATION(mc);
    StartPoint: Points ASSOCIATION(mc);
    EndPoint: Points ASSOCIATION(mc);
  ATTRIBUTE
    S#: Numbers;
  IDENT
    S#;
    (StartPoint,EndPoint);
END Segments;
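For readers more familiar with a general-purpose language, the three relations can be mirrored roughly as Python data classes. This is a loose transliteration, not the specification language of [17]: the class and field names are chosen for this sketch, the ASSOCIATION(mc) annotations are not interpreted, and the use of None for the outside of the map is an assumption. The helper `identifiers_hold` checks only the IDENT clauses of Points and Segments.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Point:
    number: int      # P#
    x: float         # XCoordinate
    y: float         # YCoordinate


@dataclass(frozen=True)
class Polygon:
    number: int      # R#
    area: float      # Area


@dataclass(frozen=True)
class Segment:
    number: int                    # S#
    start_point: int               # P# of StartPoint
    end_point: int                 # P# of EndPoint
    left_polygon: Optional[int]    # R# of LeftPolygon, None outside the map (assumption)
    right_polygon: Optional[int]   # R# of RightPolygon, None outside the map (assumption)


def unique(values):
    """True if no value occurs twice."""
    values = list(values)
    return len(values) == len(set(values))


def identifiers_hold(points, segments):
    """Check the IDENT clauses: each listed key identifies its tuple."""
    return (
        unique(p.number for p in points)
        and unique((p.x, p.y) for p in points)
        and unique(s.number for s in segments)
        and unique((s.start_point, s.end_point) for s in segments)
    )


points = [Point(1, 0.0, 0.0), Point(2, 1.0, 0.0)]
segments = [Segment(1, 1, 2, 10, None)]
print(identifiers_hold(points, segments))
```

Referential checks, such as every `start_point` naming an existing point, are omitted to keep the sketch short.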

Description of two map transactions

TRANSACTION MergePolygons;
  SCOPE (* Left side of production *)
    SELECT OldPolygons FROM Polygons;
    SELECT CommonEdgeSegments FROM Segments;
    SELECT OuterEdgeSegments FROM Segments;
    PREPARE NewPolygons FOR Polygons;
  RULE
    PolygonsExist(OldPolygons) AND
    Bordering(CommonEdgeSegments) AND
    Connected(CommonEdgeSegments) AND
    Coincident(InnerPoints) AND
    NOT PolygonsExist(NewPolygons);
  EVENT (* -> Right side of production *)
    DELETE OldPolygons FROM Polygons;
    INSERT NewPolygons IN Polygons;
    UPDATE OuterEdgeSegments IN Segments;
    DELETE CommonEdgeSegments FROM Segments;
    DELETE InnerPoints FROM Points;
END MergePolygons;
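The SCOPE / RULE / EVENT structure of this transaction can be mimicked with a small Python sketch in which the whole rule part is evaluated before anything is changed. The sketch is heavily simplified and all names are hypothetical: the database is a dictionary of identifier sets, the geometric predicates Bordering, Connected and Coincident are left out, and the update of the outer edge segments and the deletion of inner points are not modelled.

```python
def merge_polygons(db, old, common, new):
    """Hypothetical MergePolygons over a dict mapping relation names to id sets.

    Returns the resulting database and the list of violated rule clauses.
    """
    # RULE: evaluated completely before any change is made.
    rule = {
        "PolygonsExist(OldPolygons)": old <= db["Polygons"],
        "CommonEdgeSegments exist": bool(common) and common <= db["Segments"],
        "NOT PolygonsExist(NewPolygons)": new not in db["Polygons"],
    }
    violated = [name for name, holds in rule.items() if not holds]
    if violated:
        return db, violated  # rule part failed: the database is left untouched

    # EVENT: build the new state; the input sets are never mutated.
    result = {
        "Polygons": (db["Polygons"] - old) | {new},
        "Segments": db["Segments"] - common,
    }
    return result, []


db = {"Polygons": {"Rn", "Rm", "Rx"}, "Segments": {"S1", "S2", "S3"}}
after, violated = merge_polygons(db, {"Rn", "Rm"}, {"S2"}, "Rnm")
print(sorted(after["Polygons"]), violated)
```

Because the event part runs only when no clause is violated, there is nothing to undo after a failed precondition, which mirrors the guarantee described in Section 6.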


TRANSACTION DividePolygons;
  SCOPE (* Left side of production *)
    SELECT OldPolygons FROM Polygons;
    SELECT OuterEdgeSegments FROM Segments;
    PREPARE NewPoints FOR Points;
    PREPARE NewSegments FOR Segments;
    PREPARE NewPolygons FOR Polygons;
  RULE
    PolygonsExist(OldPolygons) AND
    Point-in-polygon(NewPoints) AND
    Intersecting(NewSegments) AND
    Self-intersecting(NewSegments) AND
    NOT PolygonsExist(NewPolygons);
  EVENT (* -> Right side of production *)
    DELETE OldPolygons FROM Polygons;
    INSERT NewPolygons IN Polygons;
    INSERT NewSegments IN Segments;
    UPDATE OuterEdgeSegments IN Segments;
    INSERT NewPoints IN Points;
END DividePolygons;
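The Point-in-polygon predicate in the rule part refers to a standard problem of computational geometry. A minimal sketch using the common ray-casting (even-odd) technique is given below; the paper does not say which algorithm its implementation uses, so this choice, the function name and the ring representation are assumptions. Points lying exactly on the boundary are not handled reliably by this simple version, and a real implementation would need to treat them explicitly since new boundary points may lie on existing segments.

```python
def point_in_polygon(point, ring):
    """Ray casting: count crossings of a horizontal ray running to the right.

    `ring` is a list of (x, y) vertices of a simple polygon, in either
    orientation. Returns True for points strictly inside the polygon.
    """
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        (x1, y1), (x2, y2) = ring[i], ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(point_in_polygon((2, 2), square), point_in_polygon((5, 2), square))
```

The division is safe because an edge that straddles the line through the point cannot be horizontal.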
