ORIGINAL PAPER
Shape detection with vision: implementing shape grammars in conceptual design
Iestyn Jowers • David C. Hogg • Alison McKay •
Hau Hing Chau • Alan de Pennington
Received: 3 August 2009 / Revised: 19 February 2010 / Accepted: 8 March 2010 / Published online: 27 March 2010
© Springer-Verlag London Limited 2010
Abstract Despite more than 30 years of research, shape
grammar implementations have limited functionality. This
is largely due to the difficult problem of subshape detec-
tion. Previous research has addressed this problem analyt-
ically and has proposed solutions that directly compare
geometric representations of shapes. Typically, such work
has concentrated on shapes composed of limited geometry,
for example straight lines or parametric curves, and as a
result, their application has been restricted. The problem of
general subshape detection has not been resolved. In this
paper, an alternative approach is proposed, in which sub-
shape detection is viewed as a problem of object recogni-
tion, a sub-domain of computer vision. In particular, a
general method of subshape detection is introduced based
on the Hausdorff distance. The approach is not limited in
terms of geometry, and any shapes that can be represented
in an image can be compared according to the subshape
relation. Based on this approach, a prototype shape gram-
mar system has been built in which the geometry of two-
dimensional shapes is not restricted. The system automates
the discovery of subshapes in a shape, enabling the
implementation of shape rules in a shape grammar.
Application of the system is illustrated via consideration of
shape exploration in conceptual design. The manipulations
of sketched design concepts are formalised by shape rules
that reflect the types of shape transformations employed by
designers when sketching.
Keywords Shape detection · Shape grammars · Object recognition · Hausdorff distance · Conceptual design · Computational design
1 Introduction
Research into shape grammar implementation has been
active since the concept of shape grammars was devised by
Stiny and Gips in the 1970s (Stiny and Gips 1972). A
summary of this research activity is given in Chau et al.
(2004) and spans many design disciplines, from art to
engineering and from architecture to product design.
However, despite some promising results, these imple-
mentations have not yet satisfactorily met the potential
suggested by theoretical developments of the shape gram-
mar formalism (Stiny 2006). This is largely due to the
technical difficulties involved in implementing shape
grammar systems, which rely on a general purpose solution
for subshape detection.
Typically, the subshape detection problem has been
approached as an analytical problem, and while a number
of solutions have proved to be feasible, they have also
proved to be limited in their applicability. These limitations
result from the necessity of analytical solutions being
applied to specific types of geometry, such as straight lines
(Krishnamurti 1981) or parametric curves (Jowers and Earl
2010).
In this paper, a new approach to addressing the subshape
detection problem is presented. Here, subshape detection is
regarded as an object recognition problem, and established
results from computer vision are used in order to identify
embedded subshapes in a shape. In particular, model-based
recognition using the Hausdorff distance (Huttenlocher
et al. 1993) is employed in order to determine whether the
shape represented in one image can be embedded within
the shape represented in a second image. Such images can
be the output of computational systems, such as computer-
aided design (CAD) systems, or alternatively can be taken
from designers’ sketches.

I. Jowers (✉) · A. McKay · H. H. Chau · A. de Pennington
School of Mechanical Engineering, University of Leeds,
Leeds LS2 9JT, UK
e-mail: [email protected]

D. C. Hogg
School of Computing, University of Leeds, Leeds LS2 9JT, UK

Res Eng Design (2010) 21:235–247
DOI 10.1007/s00163-010-0088-z
This approach to subshape detection has been imple-
mented in a shape grammar system. The system is intro-
duced, and subshape detection is illustrated via
consideration of shapes that are not defined according to a
limited geometry but instead are taken from designers’
sketches. The system is used to manipulate a sketched
design concept in a mode that reflects the shape transfor-
mations employed by designers when sketching in con-
ceptual design (Prats et al. 2009).
2 Background
A shape grammar (Stiny 2006) consists of an initial shape
and a set of rules of the form a → b, where a and b are
both shapes, as illustrated in Fig. 1. In this context, a shape
is defined visually as a finite arrangement of geometric
elements, such as lines or curves, each with a definite
boundary and limited but non-zero extent. A rule a → b is
applicable to a shape c if some similarity transformation
(typically a Euclidean transformation) of the shape a on the
left-hand side of the rule is a subshape of c, denoted a ≤ c,
where ≤ is the subshape relation. Application of the rule
removes the instance of the subshape a and replaces it with
an instance of the shape b on the right-hand side of the rule.
For example, the shape rule in Fig. 1 removes a lens and
replaces it with a similar lens translated along its central
axis, as indicated by the local coordinate axis. Repeated
application of shape rules to an initial shape leads to the
generation of a sequence of shapes, as illustrated in Fig. 2.
Shape grammars embody the philosophy that a designer
using a computational system, such as a CAD system,
ought to be able to recognise and manipulate any subshape
or structure that can be perceived within a shape. As a
result, application of a shape rule is not restricted to the
geometric elements initially used to define a shape but
instead is applicable to any subshapes that can be seen to be
embedded in the shape. Indeed, within the shape grammar
formalism, the structure of a shape is defined retrospec-
tively according to shape rule applications (Stiny 1994).
A consequence of this is that shape grammars often gen-
erate shapes that incorporate some unexpected results that
follow from the recognition and manipulation of new
interpretations of shapes. For example, the shape sequence
in Fig. 2 begins with an initial shape (Fig. 2a) and ends
with a shape that is a rotation of this initial shape (Fig. 2e),
but this sequence is generated via application of a rule that
merely translates a subshape of this shape. This unexpected
result occurs because after a single application of the shape
rule, additional instances of the lens in the left-hand side of
the rule have emerged (Fig. 2c). Indeed, in Fig. 2c, three
instances of the lens can now be recognised, and applica-
tion of the shape rule to one of these subshapes translates it
along its central axis and restructures the shape so that it
appears to be rotated.
Shape grammars provide a formal description of the
shape exploration processes employed by designers in
conceptual design (Prats et al. 2009). This is because shape
rules enable the perceived structure of a design shape to be
freely recognised and manipulated without adherence to a
predefined geometric structure. Indeed, application of a
shape rule reflects the explorative process of ‘seeing-
moving-seeing’, as employed by designers when sketching
in conceptual design (Schön and Wiggins 1992). In ‘seeing’,
a rule formalises the perception of a shape by recognising
particular subshapes, and in ‘moving’, the rule
manipulates the shape according to replacement of the
recognised subshapes. This process of recognising and
manipulating subshapes leads to the generation of
sequences of shapes in a manner that reflects the way
designers generate sequences of sketches (Prats and Earl
2006).
Shape grammars have also been employed as a gener-
ative mechanism in a range of design disciplines, from
architecture, e.g. Koning and Eizenberg (1981), to engi-
neering, e.g. Brown and Cagan (1997). However, few of
these applications are computationally implemented and
instead it is common for rules to be applied and designs
generated as a paper-based exercise. This is because the
task of developing a shape grammar system—a computa-
tional system intended to automate the application of shape
rules—is not trivial, requiring a general purpose solution to
the subshape detection problem.
Subshape detection can be thought of as a type of object
recognition problem. Object recognition is a central theme
in computer vision and is applied to a range of problems
including fingerprint matching, character recognition and
content-based image retrieval (Forsyth and Ponce 2003). In
general, these problems are concerned with finding a target
object in an image or video sequence, for example by
applying statistical analysis or by comparing features.
Similarly, subshape detection is concerned with finding a
target shape embedded in a second shape, in such a way
that the embedded shape can be replaced according to a
shape rule. However, despite the commonalities betweenFig. 1 An example shape rule
236 Res Eng Design (2010) 21:235–247
123
the problems, the techniques employed in object recogni-
tion have never been applied to subshape detection, which
has instead been typically viewed as a problem to be solved
analytically by matching directly against geometric
representations.
3 Analytical subshape detection
A successful shape grammar implementation is dependent
on a general solution to the subshape detection problem.
For over 30 years, research in this field has produced
analytical solutions to the problem, which have been
applied to shapes composed of a variety of geometric
elements, including lines (Krishnamurti 1981), planes
(Krishnamurti 1992), circular arcs (Chau et al. 2004) and
parametric curves (McCormack and Cagan 2006; Jowers
and Earl 2010). Each of these analytical solutions takes
advantage of the geometric properties of shapes in order to
allow for subshape detection, and they typically utilise a
common representation of shapes, given by maximal geo-
metric elements.
Software implementation of shape grammars is difficult,
because it depends on a visual representation of shape that
does not take into consideration the underlying structure.
This means that it is necessary to compare shapes
according to their perceptual qualities, regardless of their
mathematical representation. However, comparing shapes
to determine whether or not they are perceptually the same
can be computationally expensive, if not impossible. In
addition, further difficulties arise with respect to subshape
detection. This is because the underlying structure of a
shape has consequences concerning which subshapes are
apparent for manipulation. For example, although three
lens subshapes are visually apparent in the triquetra in
Fig. 2c, a structural decomposition of the shape will only
allow for the recognition of one lens at a time. This is due
to the overlapping configuration of the three lenses.
In order to enable a visual comparison of shapes, it is
beneficial to provide a unique, canonical representation to
which all visually equivalent shapes can be reduced. One
such representation is the maximal representation as dis-
cussed by Stiny (2006). The maximal representation results
from merging any geometric elements in a shape that can
be merged, such as touching co-linear lines. For example,
the maximal representation of the triquetra in Fig. 2c is
given by the three largest curve segments in the shape.
Comparison of shapes according to maximal representa-
tions can reveal whether or not two shapes are perceptually
the same, or if one shape can be embedded as a subshape of
the other. Indeed, if one shape can be embedded in another,
then all of the maximal geometric elements of the first can
be embedded in the maximal geometric elements of the
second under a similarity transformation. For example, the
lens shape in the left-hand side of the rule in Fig. 1 is a
subshape of the triquetra in Fig. 2c, because the maximal
curves that define the lens shape can be embedded in the
maximal curves of the triquetra, under Euclidean
transformations.
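For segments embedded in the same carrier line, reduction to the maximal representation is essentially interval merging. A minimal sketch in Python (illustrative only; the representation of co-linear segments as parameter intervals along a shared carrier is our assumption, not that of the cited implementations):

```python
def maximal_segments(intervals):
    """Merge touching or overlapping co-linear segments, given as
    (start, end) parameter intervals along a shared carrier line,
    into the maximal segments of the shape."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # The segment touches or overlaps the last maximal segment: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

# Three touching segments reduce to one maximal line; a separate
# segment remains its own maximal element.
print(maximal_segments([(0, 2), (2, 5), (4, 7), (9, 10)]))  # [(0, 7), (9, 10)]
```

The same idea extends to curves, with arc length or curve parameter playing the role of the interval coordinate.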
With shapes reduced to a canonical representation, such
as maximal lines (Krishnamurti 1981), maximal curves
(Jowers and Earl 2010), or a representative shape based on
distinct points (McCormack and Cagan 2006), subshape
detection can be implemented via consideration of the
specific geometry of shapes. For example, in Jowers and
Earl (2010), subshape detection is implemented on shapes
composed of parametric curve segments via consideration
of the intrinsic properties of the mathematical curves in
which the segments are embedded. These intrinsic prop-
erties specify the embedding relations between curve seg-
ments and consequently between shapes composed of
curve segments. However, despite their theoretical success,
analytical approaches to subshape detection lead to prob-
lems with respect to a shape grammar implementation.
One problem arises from the dependency of analytical
approaches on the formal structures used to represent
shapes. In an analytical approach, subshape detection is
facilitated through consideration of the properties of spe-
cific geometric elements, such as straight lines or curve
segments. Such an approach can be defined so that it is
theoretically applicable to a range of element types.
However, in practice, implementation of an analytical
approach is restricted to a small class of elements. For
example, the algorithms introduced by Jowers and Earl
(2010) are theoretically applicable to shapes composed of
any parametric curve segments. However, in practice, the
detail of implementation is such that a different algorithm
is required for parametric curves of different order, as
illustrated in Jowers (2006).

Fig. 2 An example shape sequence. a Initial shape. b Detection of lens subshape. c Result of one rule application. d Detection of emergent lens subshape. e Result of two rule applications

Similarly, the algorithms
introduced by Krishnamurti (1981) are theoretically
extendable to include all rectilinear shapes including
shapes composed of planes, as discussed in Krishnamurti
(1992). Despite this, they have only been implemented for
shapes composed of straight lines, and the implementation
details for other geometric elements are yet to be fully
resolved.
With respect to a shape grammar implementation, this
means that any system based on any one analytical
approach to subshape detection will prove restrictive to
designers who generally incorporate many types of geo-
metric elements in their work. Alternatively, a shape
grammar system could be developed that is not based on a
single analytical approach to subshape detection but
instead incorporates many such approaches. This could
result in an inefficient system, since it would be necessary
to apply multiple subshape detection algorithms when
comparing shapes composed of different types of geo-
metric elements. In addition, there are still many types of
geometric elements for which the subshape detection
problem has not been addressed, for example free-form
surfaces.
A second problem arises with respect to the similarity of
shapes. A key premise in the application of shape gram-
mars lies in the fact that shapes are viewed as visual
entities and that they can be formally explored in a visually
intuitive way. When humans view a shape, visual similarity
implies equality. That is, if two shapes look the same, then
they are generally considered to be the same. However,
when analytical techniques carry out the same process,
visual similarity does not necessarily imply equality. For
example, the two curve segments, C1 and C2, in Fig. 3 are
visually similar but analytically they are different, because
they are segments of infinite curves that are mathematically
distinct (as illustrated by the extended curves). As a result,
in a shape grammar system based on analytical techniques,
shape matching between C1 and C2 would fail despite their
perceptual similarity.
This distinction between visual similarity and analytic
similarity has not been addressed in any of the analytical
approaches developed to date. As a result, computational
systems that recognise and manipulate subshapes based on
analytical approaches may not match shapes that seem
obviously similar to humans.
This paper presents an alternative approach to subshape
detection that addresses the problem as a computer vision
problem rather than as an analytical problem. In the
remainder of this paper, an approach is described, which is
based on an established technique of object recognition for
computer vision. This technique, which builds on the
Hausdorff distance, does not rely on formal geometry in
order to facilitate object recognition but instead treats
images as visual entities to be analysed and compared
within a definable similarity threshold. As a result, the
problems discussed with respect to analytical subshape
detection are avoided, and a more robust approach is
obtained.
4 Object recognition with the Hausdorff distance
Object recognition is a key challenge within the study of
computer vision. Much of the current work on this problem
is concerned with developing methods for detecting
instances of generic object categories within images, for
example a cup, a face, or a car (Pinz 2005), as opposed to
detecting instances of a particular object. The dominant
approach is to characterise the appearance of objects from a
target category in terms of collections of local features that
are expected to occur, such as the spatial and angular
distributions of intensity gradients. In this paper, the focus
is on shapes represented in raster images that can be
reduced to binary images, and ultimately, sets of points (i.e.
the individual pixels). This special case has been studied in
its own right, and a recent and powerful approach for
addressing this problem is based on the Hausdorff distance,
a distance metric defined between two sets of points
(Huttenlocher et al. 1993).
The benefits of using the Hausdorff distance as a method
for object recognition have been discussed in detail by
Rucklidge (1996). In particular, the Hausdorff distance
avoids many of the difficulties that result from other
approaches to object recognition. For example, difficulties
that emerge from unstable linear feature detection are
avoided by representing images as point features rather
than linear features. Also, problems that can result from
feature correspondence are avoided, because the approach
does not rely on any explicit pairing of features between
images. This also means there is no reliance on combina-
torial comparison, and it is possible to compare images
with many (thousands to tens of thousands) features,
whereas approaches that rely on feature pairing are typi-
cally restricted to relatively few (tens to hundreds). Most
importantly, the Hausdorff distance provides an approach
that enables real-time comparison of images and is reliable,
producing sensible results even in the presence of noise or
occlusion.

Fig. 3 Visually similar curve segments
4.1 The Hausdorff distance
The Hausdorff distance between two point sets, A and B, is
calculated from the directed distances between the sets.
The directed Hausdorff distance is a measure of the max-
imum distance from a set of points to the nearest point in a
second set. Formally, the directed Hausdorff distance
between two point sets A = {a1, a2, …, an} and B = {b1,
b2, …, bm} is defined as
h(A, B) = max_{a∈A} min_{b∈B} ||a − b||    (1)
where ||a - b|| is a measure of the distance between two
points a and b, e.g. the Euclidean distance. The value
h(A, B) is the distance from the point a ∈ A that is furthest
from all points of B to its nearest neighbour in B. For
example, if h(A, B) = d, then each point in A is within
distance d of a point in B, and there is at least one point in
A that is exactly distance d from its nearest neighbour in
B. The directed Hausdorff distance is asymmetric, meaning
that it is not necessarily true that h(A, B) = h(B, A), and
therefore, h(A, B) measures the distance from point set A to
point set B, but not vice versa.
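The directed distance of Eq. 1 translates almost literally into code. A minimal sketch in Python (illustrative; the function and variable names are ours):

```python
import math

def directed_hausdorff(A, B):
    """Directed Hausdorff distance h(A, B) of Eq. 1: the largest,
    over points a in A, of the distance from a to its nearest
    neighbour in B."""
    return max(
        min(math.dist(a, b) for b in B)  # distance from a to its nearest point of B
        for a in A
    )

# The directed distance is asymmetric: h(A, B) need not equal h(B, A).
A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
print(directed_hausdorff(A, B))  # 0.0: every point of A coincides with a point of B
print(directed_hausdorff(B, A))  # 4.0: (5, 0) is distance 4 from its nearest point in A
```

The example makes the asymmetry concrete: A embeds exactly in B, but B has an outlying point far from A.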
A symmetrical measure of the distance between A and B
is given by the (bidirectional) Hausdorff distance, which is
defined as
H(A, B) = max(h(A, B), h(B, A))    (2)
The Hausdorff distance H(A, B) is the maximum of the
directed Hausdorff distances from A to B and from B to A
and measures the minimum distance between all points in
A and all points in B. For example, if H(A, B) = d, then
each point in A is within distance d of some point in B and
also each point in B is within distance d of some point in A.
As such, given two point sets A and B, the Hausdorff
distance can provide a measure of their similarity. For
example, if A and B contain the same points, then the
Hausdorff distance between the sets will be zero, or if
the points in B are displaced by a small distance, then the
Hausdorff distance will be small. This is illustrated by the
two point sets in Fig. 4. Let A be the set of solid points and
B be the set of hollow points. Here, the Hausdorff distance
H(A, B) is large, because there are points in one set that are
far away from the points in the other. This reflects that
there is little similarity between A and B. Note that the
Hausdorff distance does not explicitly pair the points in A
with the points in B; instead, it is possible for many of the
points of A to be close to the same point of B and vice
versa.
As defined, the Hausdorff distance measures the simi-
larity between point sets that have fixed position relative to
each other. However, it can also be used to measure the
absolute similarity between point sets by allowing com-
parison under transformation. For example, Huttenlocher
et al. (1993) investigate the problem of finding the mini-
mum Hausdorff distance between two point sets under
Euclidean motion. Given two point sets, A and B, that do
not have fixed relative position, the Hausdorff distance can
be defined as a function of a transformation group T, e.g.
the Euclidean transformations, with the set B transformed
relative to the set A, according to T(B) = {T(b) | b ∈ B}. In
this case, the minimum Hausdorff distance determines the
transformation group T that minimises the distance
between A and B and is defined as
H_T(A, B) = min_T H(A, T(B))    (3)
The minimum Hausdorff distance provides a measure of
the absolute similarity between A and B relative to a
transformation group T. For example, for the two point sets
in Fig. 4, the minimum Hausdorff distance defined as a
function of translation is small since there are translations
of B that make all the points in A nearly coincident to
points in B and all the points in B nearly coincident with
points in A. Similarly, if T is a Euclidean motion, then
HT(A, B) = 0, since there is a rotation that embeds all the
points in B in all the points of A. In both cases, the value of
HT(A, B) reflects the similarity between the sets A and B,
under a specified transformation group.
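For a discrete set of candidate transformations, Eq. 3 can be approximated by exhaustive search. A brute-force sketch in Python restricted to translations (illustrative only; Huttenlocher et al. search transformation space far more efficiently than this):

```python
import math

def hausdorff(A, B):
    """Bidirectional Hausdorff distance H(A, B) of Eq. 2."""
    h = lambda X, Y: max(min(math.dist(x, y) for y in Y) for x in X)
    return max(h(A, B), h(B, A))

def min_hausdorff_translation(A, B, translations):
    """Minimum Hausdorff distance over a finite set of candidate
    translations applied to B: a discretised version of Eq. 3."""
    best, best_t = math.inf, None
    for tx, ty in translations:
        TB = [(bx + tx, by + ty) for bx, by in B]  # translated copy of B
        d = hausdorff(A, TB)
        if d < best:
            best, best_t = d, (tx, ty)
    return best, best_t

# B is a translated copy of A, so the minimum distance is zero,
# attained at the translation (-2, 0).
A = [(0, 0), (1, 0), (1, 1)]
B = [(2, 0), (3, 0), (3, 1)]
ts = [(tx, 0) for tx in range(-4, 5)]
print(min_hausdorff_translation(A, B, ts))  # (0.0, (-2, 0))
```

The returned value reflects absolute similarity under the sampled transformation set, independent of the initial relative position of the two point sets.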
4.2 Comparing images with the Hausdorff distance
In Huttenlocher et al. (1993), the minimum Hausdorff
distance under transformation, as defined in Eq. 3, is
presented as an approach for comparing shapes in images
given by a raster representation. In particular, it is applied
to the problem of model-based recognition, which is
concerned with searching a scene in order to locate
instances of an object, of which a model is provided. For
example, the problem could be concerned with tracking
an object as it moves about a scene by locating it in the
different frames of a video sequence. The minimum
Hausdorff distance H_T(A, B) is particularly appropriate for
vision problems such as this because, in the case where
the transformation T is a Euclidean transformation, it
obeys the metric properties of identity, symmetry and the
triangle inequality.

Fig. 4 Two sets of points

These properties correspond to intuitive
notions of shape similarity. In particular, identity
implies that a point set representing a shape is identical
only to itself; symmetry implies that shape similarity is
not dependent on the order of comparison between point
sets; and the triangle inequality implies that two point
sets representing shapes that are dissimilar cannot both be
similar to a point set representing some third shape. Many
shape comparison functions do not obey the triangle
inequality, and thus, it is possible that such functions will
report that two highly dissimilar shapes are both similar
to a third shape, which is counterintuitive.
Raster images are suitable for comparison according to
the Hausdorff distance, because they can be simply reduced
to point sets, for example through the use of edge detection
techniques (e.g. Canny 1986). Traditionally, in model-
based recognition, the point set representing the scene that
is to be searched is denoted A, and the point set repre-
senting the model that is to be searched for is denoted B. In
practice, recognition problems involve finding a transfor-
mation of the model B that brings it within close corre-
spondence with a portion of the scene A, and vice versa.
For this task, it is beneficial if it is possible to compare
partial images.
Consider again the directed Hausdorff distance, defined
according to Eq. 1. As discussed, this equation calculates
a distance h(A, B) = d such that all points in A are within
distance d of some point in B. A partial equivalent of this
equation calculates a distance d such that a subset of the
points in A is within distance d of some point in B
(Huttenlocher et al. 1993). In particular, the partial
directed Hausdorff distance from a subset of A to B is
given by
h_K(A, B) = K-th_{a∈A} min_{b∈B} ||a − b||    (4)

where K is the number of points in the subset of A, so that
0 ≤ K ≤ n, where n is the number of points in A. K-th_{a∈A}
denotes a ranked value based on an ordering of the points
in A according to their distance from the nearest point in B.
Therefore, h_K(A, B) = d is the distance from the Kth
ranked point in A to its nearest neighbour in B, and
consequently, K of the points in A are within a distance d of
some point in B. For example, if K = n, the number of
points in A, then d is equal to the directed Hausdorff
distance h(A, B).
Note that this definition of the partial Hausdorff dis-
tance is not dependent on a specification of which part of
A is to be matched with B. Instead, the best matching part
of A is selected because Eq. 4 identifies the subset of K
points in A that minimises the directed Hausdorff distance
from A to B.
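Eq. 4 amounts to replacing the maximum of Eq. 1 with the K-th smallest nearest-neighbour distance. A minimal Python sketch (illustrative; the names are ours):

```python
import math

def partial_directed_hausdorff(A, B, K):
    """Partial directed Hausdorff distance h_K(A, B) of Eq. 4:
    rank the points of A by distance to their nearest neighbour
    in B, and return the K-th ranked (K-th smallest) distance."""
    dists = sorted(min(math.dist(a, b) for b in B) for a in A)
    return dists[K - 1]  # K = len(A) recovers the directed distance h(A, B)

# One outlier in A: matching only 3 of the 4 points ignores it.
A = [(0, 0), (1, 0), (2, 0), (50, 0)]
B = [(0, 0), (1, 0), (2, 0)]
print(partial_directed_hausdorff(A, B, 3))       # 0.0
print(partial_directed_hausdorff(A, B, len(A)))  # 48.0, equal to h(A, B)
```

As the example shows, choosing K < n makes the measure robust to points of A that have no counterpart in B, which is the basis of matching under occlusion.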
The partial Hausdorff distance between A and B is cal-
culated according to the maximum of the partial directed
distances between the point sets and is defined as
H_{KL}(A, B) = max(h_K(A, B), h_L(B, A))    (5)
This partial Hausdorff distance differs from the
Hausdorff distance as defined in Eq. 2, because instead
of matching whole point sets, it compares subsets of points.
For recognition problems, this is important since it enables
matching of an object in scenes in which many other
objects are present, and it also enables matching of objects
that are not wholly represented in a scene, for example due
to occlusion or sensor failure.
4.3 Computing the Hausdorff distance
When computing the Hausdorff distance, it is useful to
consider its graphical interpretation. As discussed in Hut-
tenlocher et al. (1993), the directed Hausdorff distance as
defined in Eq. 1 can be re-expressed as
h(A, B) = max_{a∈A} d(a)

with

d(x) = min_{b∈B} ||x − b||
A similar expression, d′(x), can also be defined for the
reverse directed Hausdorff distance h(B, A). The graph of
d(x), {(x, d(x)) | x ∈ ℝ²}, is known as the distance
transform of B since it gives the distance from a point x
to the nearest point b ∈ B. For example, for a set of points
defined by a raster image, the resulting distance transform
can be represented as an array, as illustrated in Fig. 5.
Here, the distance transform is calculated according to the
L1 distance (also known as the Manhattan distance), where
the distance d between two points p1 = (x1, y1) and
p2 = (x2, y2) is given by d = |x2 − x1| + |y2 − y1|.
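The distance transform need not be computed by comparing every pair of points; for the L1 metric, the standard two-pass chamfer sweep over the raster suffices. A sketch in Python (illustrative only; the paper does not specify how its transforms are computed):

```python
def l1_distance_transform(grid):
    """Distance transform of a binary raster under the L1 (Manhattan)
    metric: each cell receives the distance to the nearest cell of the
    point set (marked 1), via a two-pass chamfer sweep."""
    rows, cols = len(grid), len(grid[0])
    INF = rows + cols  # upper bound on any L1 distance within the grid
    d = [[0 if grid[r][c] else INF for c in range(cols)] for r in range(rows)]
    # Forward pass: propagate distances from the top-left.
    for r in range(rows):
        for c in range(cols):
            if r > 0:
                d[r][c] = min(d[r][c], d[r - 1][c] + 1)
            if c > 0:
                d[r][c] = min(d[r][c], d[r][c - 1] + 1)
    # Backward pass: propagate distances from the bottom-right.
    for r in range(rows - 1, -1, -1):
        for c in range(cols - 1, -1, -1):
            if r < rows - 1:
                d[r][c] = min(d[r][c], d[r + 1][c] + 1)
            if c < cols - 1:
                d[r][c] = min(d[r][c], d[r][c + 1] + 1)
    return d

# A single point in the centre of a 3x3 raster.
B = [[0, 0, 0],
     [0, 1, 0],
     [0, 0, 0]]
for row in l1_distance_transform(B):
    print(row)
# [2, 1, 2]
# [1, 0, 1]
# [2, 1, 2]
```

Two linear sweeps give the exact L1 transform, so the cost is proportional to the number of pixels rather than to the product of the two point-set sizes.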
Consideration of the graphical interpretation of the
Hausdorff distance gives rise to the algorithms outlined in
Table 1. Graphically, the directed Hausdorff distance
h(A, B) is simply given by the maximum of the distance
transform d(x) over the points of A. For example, if A and
B are defined by raster images, then h(A, B) is the highest
entry in d(x) that corresponds to a point a ∈ A.

Fig. 5 An example distance transform

This procedure for calculating h(A, B) is outlined in
algorithm (a) in Table 1.
h(B, A) is similarly defined as the maximum of d′(x) over
the points of B, and the Hausdorff distance H(A, B) is
simply the larger of the two maxima of the distance
transforms d(x) and d′(x), as outlined in algorithm (b).
Similarly, the Hausdorff distance as a function of a
transformation T can be computed by considering the
maxima of the distance transforms d(x) and d′(x) under
transformation, as outlined in algorithm (c) in Table 1.
When comparing images, a threshold value, say s, is
specified. This value is a numerical expression for the
similarity of images and defines the maximum distance that
is allowed between two images for them to be considered
similar. That is, if the Hausdorff distance between a model
and a scene is greater than s, then there is little similarity
between them, suggesting that there are no instances of the
model in the scene. If an exact match is required, then it is
possible to let s = 0; however, for images where similarity
does not require an exact match, the value of s can be
greater. Based on this, algorithm (d) in Table 1 describes a
procedure in which a set of transformations are tested to
determine whether they map a point set B onto a point set A
within a specified threshold value s.
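Algorithm (d) in Table 1 is, in essence, a filter over candidate transformations. A Python sketch (illustrative; here the directed model-to-scene distance is used in place of the bidirectional distance so that the model can match a portion of a larger scene, which is an assumption of ours, not the paper's exact formulation):

```python
import math

def directed_h(X, Y):
    """Directed Hausdorff distance: largest nearest-neighbour
    distance from a point of X to the set Y."""
    return max(min(math.dist(x, y) for y in Y) for x in X)

def matching_transformations(A, B, transformations, s):
    """Keep every candidate transformation t for which each point of
    the transformed model t(B) lies within distance s of the scene A,
    mirroring algorithm (d) in Table 1."""
    return [t for t in transformations
            if directed_h([t(b) for b in B], A) < s]

A = [(0, 0), (1, 0), (4, 0), (5, 0)]   # scene
B = [(0, 0), (1, 0)]                   # model
# Candidate horizontal translations of the model.
candidates = [lambda p, d=d: (p[0] + d, p[1]) for d in (0, 2, 4)]
found = matching_transformations(A, B, candidates, s=0.5)
print([t((0, 0))[0] for t in found])   # [0, 4]: the model embeds at shifts 0 and 4
```

Each surviving transformation corresponds to one detected embedding of the model, which is exactly the information a shape grammar system needs in order to apply a rule.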
5 A shape grammar system
The methods of object detection described in the previous
section were implemented in a shape grammar system. This
system, introduced in McKay et al. (2008) and illustrated in
Fig. 6, enables implementation of shape grammars on two-
dimensional shapes arranged in the plane. The shapes are
input as raster images in bmp, jpg, gif, or png format and are
converted to point sets via consideration of the colour density
of the pixels in the image. Here, colour density is defined by
adding the red, green and blue colour values of a pixel, each
of which is defined over the range [0, 255]. Colour density is
therefore defined over the range [0, 765], with 0 indicating a
black pixel and 765 indicating a white pixel. Any pixel with a
colour density less than the specified tolerance (here set at
300) is defined as a point in the point set representing the
shape. Conversion of raster images into point sets allows for
their comparison according to the Hausdorff distance.
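The colour-density conversion described above can be sketched directly (illustrative Python; the system reads bmp/jpg/gif/png files, whereas here the image is a small in-memory RGB array):

```python
def image_to_point_set(pixels, tolerance=300):
    """Convert an RGB raster to a point set: colour density is the sum
    of a pixel's red, green and blue values (range [0, 765], with 0
    black and 765 white); any pixel darker than the tolerance becomes
    a point of the shape."""
    points = []
    for y, row in enumerate(pixels):
        for x, (r, g, b) in enumerate(row):
            if r + g + b < tolerance:
                points.append((x, y))
    return points

# A 2x3 image: one black pixel, one dark grey pixel, the rest near-white.
img = [[(255, 255, 255), (0, 0, 0), (250, 250, 250)],
       [(240, 240, 240), (80, 90, 100), (255, 255, 255)]]
print(image_to_point_set(img))  # [(1, 0), (1, 1)]
```

With shapes reduced to point sets in this way, the Hausdorff machinery of the previous section applies regardless of the geometry originally used to draw them.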
In the interface to the shape grammar system, the main
window displays the current shape in a generative sequence,
which, in Fig. 6, is the triquetra from Fig. 2c. Shape rules are
defined in a second interface, as illustrated in Fig. 7. The
rules are defined according to two shapes, one for the left-
hand side of the rule and the other for the right-hand side, and
Table 1 Algorithms for calculating Hausdorff distance

(a) calculate h(A,B)

    Data: Two point sets A and B
    Result: Directed Hausdorff distance h(A,B)
    dB = distance transform of B;
    max = 0;
    for i = 1 to size(A) do
        point = A[i];
        dist = dB[point.x, point.y];
        if dist > max then
            max = dist;
        end
    end
    return max;

(b) calculate H(A,B)

    Data: Two point sets A and B
    Result: Hausdorff distance H(A,B)
    hAB = h(A,B);
    hBA = h(B,A);
    max = 0;
    if hAB > hBA then
        max = hAB;
    else
        max = hBA;
    end
    return max;

(c) calculate H(A,T(B))

    Data: Two point sets A and B, and a transformation T
    Result: Hausdorff distance H(A,T(B))
    tB = new set;
    for i = 1 to size(B) do
        point = B[i];
        tPoint = T(point);
        tB.add(tPoint);
    end
    return H(A, tB);

(d) calculate set of transformations that map B onto A within threshold s

    Data: Two point sets A and B, a set of transformations T, and a threshold value s
    Result: A set of transformations that map B onto A within threshold s
    map = new set;
    for i = 1 to size(T) do
        t = T[i];
        HAB = Ht(A,B);
        if HAB < s then
            map.add(t);
        end
    end
    return map;
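Algorithms (a) and (b) can be sketched in Python. The version below computes nearest-point distances directly by brute force rather than via a distance transform of B, so it is a simplification of algorithm (a), not the paper's implementation:

```python
import math

def directed_hausdorff(A, B):
    """h(A, B): the greatest distance from any point of A to its
    nearest point of B (brute-force variant of algorithm (a); the
    paper instead precomputes a distance transform of B)."""
    return max(min(math.dist(a, b) for b in B) for a in A)

def hausdorff(A, B):
    """H(A, B) = max(h(A, B), h(B, A)) -- algorithm (b)."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))

A = {(0, 0), (4, 0)}
B = {(0, 0), (1, 0)}
print(directed_hausdorff(A, B))  # → 3.0: (4,0) is 3 away from (1,0)
print(directed_hausdorff(B, A))  # → 1.0
print(hausdorff(A, B))           # → 3.0
```

The asymmetry of the directed distance is visible in the example: h(A,B) and h(B,A) differ, and H(A,B) takes the larger of the two.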
Res Eng Design (2010) 21:235–247 241
according to the spatial relation between the two shapes. For
example, Fig. 7 illustrates the shape rule from Fig. 1, in
which a lens is translated along its central axis.
In order to apply a rule to a shape, it is first necessary to
find the shape on the left-hand side of the rule embedded as
a subshape. For example, in order to apply the rule in
Fig. 7 to the triquetra in Fig. 6, it is first necessary to detect
the lens on the left-hand side of the rule embedded as a
subshape of the triquetra. This is achieved by considering
the Hausdorff distance between the two shapes, under a set
of transformations, and will be discussed in Sect. 6. In this
example, three instances of the lens shape are found
embedded in the triquetra as illustrated in Fig. 8.
Application of the rule involves removing the pixels that
compose the matched subshape and the addition of the
pixels that compose the shape on the right-hand side of
the rule, according to the transformation under which the
embedded subshape was found. For example, application
of the rule in Fig. 7 to the triquetra shape according to the
match illustrated in the middle image in Fig. 8 results in
the shape in Fig. 9.
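With shapes held as point sets, rule application as just described reduces to set subtraction and union. The following Python sketch is illustrative only; the function signature is an assumption, not the system's API:

```python
def apply_rule(shape, matched_lhs, rhs, transform):
    """Apply a shape rule to a point set.

    `matched_lhs` is the set of points of `shape` matched by the
    left-hand side of the rule under `transform`; those points are
    removed, and the right-hand-side points `rhs`, mapped under the
    same transformation, are added.
    """
    result = set(shape) - set(matched_lhs)
    result |= {transform(p) for p in rhs}
    return result

# Toy example: the rule "moves" a matched point one unit right.
shape = {(0, 0), (1, 0), (2, 0)}
translate = lambda p: (p[0] + 1, p[1])
print(sorted(apply_rule(shape, {(2, 0)}, {(2, 0)}, translate)))
# → [(0, 0), (1, 0), (3, 0)]
```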
5.1 Shape exploration in conceptual design
Unlike previous shape grammar implementations, the
approach to subshape detection presented here does not
depend on the geometric properties of shape. Instead, it
depends on the similarity properties of shapes as defined by
the Hausdorff distance under transformation. As a result,
within this shape grammar system, shapes need not con-
form to formal geometric elements such as straight lines or
parametric curves. Indeed, since shapes are compared as
point sets, anything that can be represented as such can be
considered a shape, including raster images.
In the remainder of this paper, examples of shape rule
applications will be presented, which are based on studies
of designers exploring design shapes in conceptual design
(Lim et al. 2009). In particular, these examples will be
based on the sketches in Fig. 10, in which a designer was
exploring design concepts for a kettle. Shape rules are used
to manipulate a sketched design concept in a manner that
reflects the types of shape transformations employed by the
designer when sketching (Prats et al. 2009).
A sketch, such as one of the kettle concepts in Fig. 10,
can be digitised as a raster image (for example, by using an
optical input device such as a scanner) and imported into
the system, as illustrated in Fig. 11. Shape rules can then
be applied to modify design concepts and systematically
generate and explore design alternatives. For example, the
Fig. 6 The shape grammar system
Fig. 7 An example shape rule
Fig. 8 Result of subshape detection
Fig. 9 Result of rule application
rule in Fig. 12 replaces the handle of the kettle with one
that has a base that is disconnected from the main body of
the kettle and reflects transformations made by the designer
when sketching.
Application of the rule detects the handle subshape and
replaces it, as illustrated in Fig. 13. In this example, the
shape that is being manipulated by the shape rule is not
defined according to mathematically precise geometry as is
common in most shape grammar implementations. Instead,
the shape incorporates inaccuracies that commonly arise in
sketched shapes. The resulting design is not one that was
sketched by the designer, but it does result from applying
shape transformations used by the designer and, as a result,
would not be out of place in the sketches in Fig. 10. This
illustrates the potential for shape grammar systems to be
used as a computational aid in design exploration, as a means
of formalising the shape transformations used by designers
and applying those transformations to generate ideas and
explore alternatives that a designer may not consider
unaided (Prats et al. 2009).
6 Vision-based subshape detection
6.1 One-way subshape detection
The simplest approach to implementing subshape detection
using the Hausdorff distance is to consider only the
directed Hausdorff distance as defined in Eq. 1. The partial
(bidirectional) Hausdorff distance, defined in Eq. 5, mea-
sures distance in two directions, from a model to a scene
and conversely from the scene to the model. This partial
two-way matching ensures not only that the model is
similar to an object in the scene, but also that the object in
the scene is similar to the model. In subshape detection, the
model is a shape a on the left-hand side of a rule, while the
scene is the shape c in which it is to be embedded. In this
case, partial two-way matching is not strictly necessary,
because the problem is concerned with finding transfor-
mations of a such that it can be embedded in a shape c. For
this, it is necessary to determine whether a transformation
Fig. 10 Shape exploration in conceptual design
Fig. 11 An initial kettle concept
Fig. 12 A handle exploration rule
Fig. 13 Modification of kettle design via shape rule. a Detection of
handle subshape. b Result of rule application
of the shape a is similar to a portion of the shape c, and for
this the directed Hausdorff distance is sufficient.
The directed Hausdorff distance is measured from a shape
a on the left-hand side of a rule to a shape c in which it is
embedded, according to a transformation group T. Typically,
in shape grammar implementations, T would be the Euclid-
ean transformations consisting of translation, rotation, iso-
tropic scale and reflection. In the shape grammar system
presented here, a subset of the Euclidean transformations is
specified by the user before subshape detection is initiated.
This is necessary in order to avoid having to consider an
infinite number of transformations. This subset is specified
according to incremental steps between transformations. For
example, a rotation defined with a step size of 60° will result in
a set of transformations that incorporates rotations of 60°,
120°, 180°, 240°, 300° and 360°. With T defined as a subset
of the Euclidean transformations, the directed Hausdorff
distance as a function of T is given by
hT(B, A) = h(T(B), A)    (6)
where the shape c is defined according to point set A and
the shape a is defined according to point set B. hT(B, A) can
be calculated using a variation of the algorithms presented
in Table 1.
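Equation 6 and the discretised transformation set can be combined into a simple search for embedding transformations. Below is a minimal Python sketch, restricted to rotations about the origin and brute-force distance computation; names and structure are assumptions for illustration:

```python
import math

def directed_hausdorff(A, B):
    """h(A, B): greatest nearest-point distance from A to B."""
    return max(min(math.dist(a, b) for b in B) for a in A)

def rotations(step_deg):
    """Discretise rotation about the origin at the given step size,
    e.g. a step of 60 gives rotations of 60, 120, ..., 360 degrees."""
    ts = []
    for k in range(1, 360 // step_deg + 1):
        th = math.radians(k * step_deg)
        ts.append(lambda p, c=math.cos(th), s=math.sin(th):
                  (p[0] * c - p[1] * s, p[0] * s + p[1] * c))
    return ts

def embedding_transforms(A, B, transforms, s):
    """Transformations t with h(t(B), A) < s, i.e. those that embed
    the rule shape B in the shape A within threshold s (Eq. 6)."""
    return [t for t in transforms
            if directed_hausdorff({t(b) for b in B}, A) < s]

# A four-point "diamond" maps onto itself under every 90-degree rotation.
A = {(1, 0), (0, 1), (-1, 0), (0, -1)}
valid = embedding_transforms(A, A, rotations(90), s=1e-6)
print(len(valid))  # → 4: all four rotations embed the shape in itself
```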
In the algorithms, shape similarity is defined within a
threshold value s, which is specified before subshape
detection is initiated. Transformations for which hT(B, A) is
less than s are considered valid transformations that embed
the shape a in the shape c. Once such a transformation is
found, it can be used in the application of a shape rule. The
value of s depends on the specific shapes that are being
compared, including their relative resolutions, and a value
that produces reasonable results for one pair of shapes may
not produce such reasonable results for a different pair. For
example, in the application of the lens rule to the triquetra
illustrated in Figs. 6, 7, 8 and 9, the similarity threshold
value is set such that s = 1. As a result, the transformations
that were calculated embed the lens shape in the triquetra
within a maximum distance of one pixel, as illustrated in
Fig. 8. In this case, a low similarity threshold was feasible
because the shapes were defined according to formal geo-
metric elements. In the application of the kettle handle rule
illustrated in Figs. 11, 12 and 13, a higher threshold value is
needed in order to account for the inexact geometry in the
sketched kettle concept. In this case, the similarity threshold
value was set so that s = 4, and the transformations that
were calculated embed the handle shape in the kettle within
a maximum distance of four pixels, as illustrated in Fig. 13a.
The threshold value, s, can be increased further in order
to broaden the definition of similarity between shapes. For
example, the kettle concept in Fig. 11 does not contain any
exact ellipse subshapes, although the handle of the kettle
can be visually compared to an ellipse. As a result, with the
threshold value increased, the rule in Fig. 14 can be used to
recognise the handle and replace it with a handle composed
of concentric circles, similar to those explored in the
sketches in Fig. 10.
With the threshold value set so that s = 10, transfor-
mations were calculated that embed the ellipse in the kettle
concept within a maximum distance of ten pixels, as
illustrated in Fig. 15a. The rule can then be applied to
replace the oval-shaped handle with the circular one.
As illustrated, one-way subshape detection is reasonably
successful. However, the approach does suffer from limi-
tations that result from it being based on a directed distance
from one shape to another. In particular, because the dis-
tance hT(B, A) is directed, it does not obey the metric
properties of identity, symmetry and the triangle inequality.
As discussed in Sect. 4.2, these properties correspond to
intuitive notions of shape similarity, and as a result, one-
way subshape detection can sometimes produce undesir-
able results. For example, consider the sketched kettle in
Fig. 16, which contains a large shaded area in which the
majority of pixels are coloured black. Examples such as
this can readily arise in sketches where shaded areas are
used to introduce three-dimensional effects.
With shape similarity defined according to the distance
hT(B, A), any shape can be embedded in shaded areas such
as this under uncountable Euclidean transformations. For
example, the rule in Fig. 12 can be used to detect handle
subshapes embedded in the shaded area, as illustrated in
Fig. 16. Despite this, a designer is unlikely to recognise the
handle as a feature of this area, and its recognition is likely
to be undesirable in a shape grammar system. This issue
can be avoided by insisting that shapes not contain black
areas and be composed solely of linear elements, for
example as a result of edge detection (Canny 1986).
Alternatively, a method of subshape detection can be used
based on a bidirectional distance.
6.2 Two-way subshape detection
As discussed in Sect. 4.1, the minimum Hausdorff distance
HT(A, B), defined in Eq. 3, measures the similarity between
Fig. 14 A second handle exploration rule
shapes represented by point sets A and B, according to a
specific transformation group T. This distance is bidirec-
tional and, when T is a Euclidean transformation, it obeys
the metric properties of identity, symmetry and the triangle
inequality. However, HT(A, B) cannot be used directly to
determine the transformations that embed one shape in
another. This is because when a shape a is embedded in a
shape c under a Euclidean transformation, there are likely
to be points in c that are far away from points in the
transformed shape a, resulting in a high value for HT(A, B).
For example, one-way subshape detection can calculate
transformations that embed a lens in a triquetra under a
similarity threshold value s = 1, as illustrated in Fig. 8. If
HT(A, B) is used to measure the similarity between the
triquetra and the lens under these transformations, then the
distance calculated between them would be much greater
than one because there are points in the triquetra that are
far away from the points in an embedded lens.
A bidirectional measure of the distance between a shape
a and a subshape of the shape c can instead be defined by
considering the partial Hausdorff distance, as discussed in
Sect 4.2. In particular, a measure of the distance between a
subset of the points in A and all of the points in B is
required. This is given by
HKT(A, B) = max(hKT(A, B), hT(B, A))
This distance is defined as a function of a transformation
group T with the set B transformed relative to the set A,
according to T(B) = {T(b) | b ∈ B}. Accordingly, hT(B, A)
is as defined in Eq. 6, and hKT(A, B) is given by
hKT(A, B) = Kth_{a ∈ A} min_{b ∈ B} ||a − T(b)||
Here, K specifies the size of the subset of A that is
considered, and its value is very significant with respect to
the subshapes that can be detected. For example, if K = n,
the number of points in A, then hKT(A, B) will be equal to
hT(A, B), the directed distance from A to B, and the benefits
of using the partial Hausdorff distance will be negated.
HKT(A, B) is a bidirectional measure of the distance
between a subset of the points in A and all of the points in
B. Despite this, HKT(A, B) still does not obey the metric
properties of identity, symmetry and the triangle inequality.
It does however obey weaker conditions that correspond to
intuitively reasonable behaviour with respect to subshape
similarity. These conditions are that the metric properties
are in effect obeyed between subsets of points, for example
between the points in B and a subset of the points in A,
when compared under a Euclidean transformation. A dis-
cussion of these weaker conditions is presented in Huttenlocher et al. (1993).
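The partial directed distance hKT underlying this definition is the K-th ranked nearest-point distance. A minimal Python sketch (omitting the transformation, which is applied as in Eq. 6) shows how it ignores outlying points:

```python
import math

def partial_directed_hausdorff(A, B, K):
    """h_K(A, B): the K-th ranked (K-th smallest, 1-based) of the
    distances from each point of A to its nearest point of B.
    With K = len(A) this reduces to the directed distance h(A, B)."""
    dists = sorted(min(math.dist(a, b) for b in B) for a in A)
    return dists[K - 1]

# Three points of A coincide with points of B; one outlier lies far away.
A = [(0, 0), (1, 0), (2, 0), (50, 0)]
B = [(0, 0), (1, 0), (2, 0)]
print(partial_directed_hausdorff(A, B, K=3))  # → 0.0: outlier ignored
print(partial_directed_hausdorff(A, B, K=4))  # → 48.0: full directed distance
```

As the text notes, the choice of K is significant: with K equal to the size of A, the partial distance degenerates to the directed distance and the benefit of partial matching is lost.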
In the shape grammar system discussed in this paper,
two-way subshape detection is still to be implemented.
Instead, for the examples presented, the method of one-way
detection was found to be sufficient.
7 Discussion
The approach to subshape detection presented in this paper
is significantly different from the analytical approaches
discussed in Sect. 3, because here shapes are defined as sets
of points. Under the shape grammar formalism, shapes are
defined within algebras that are ordered according to the
subshape relation and the embedding properties of geo-
metric elements (Stiny 1991). This definition enables the
reinterpretation of shapes according to features that are not
apparent in their initial formulation. However, within these
algebras, shapes composed of geometric elements of dif-
ferent types are treated independently. This is because
there is no relation between them with respect to their
embedding properties. For example, under the subshape
relation, points cannot be embedded in lines, and lines
cannot be embedded in planes. Instead, every subdivision
of a line contains only lines, and similarly, every subdivi-
sion of a plane contains only planes.
Fig. 15 Modification of kettle design via shape rule. a Detection of
ellipse subshape. b Result of rule application
Fig. 16 A shape containing a shaded region
Implementations of shape grammars typically conform
to this algebraic definition of shape and do not enable
comparison of geometric elements across types. For
example, the algorithms presented by Krishnamurti (1981)
enable comparison of shapes composed only of straight
lines, whereas the algorithms presented by Jowers and Earl
(2010) enable comparison of shapes composed only of
parametric curve segments. In the approach to subshape
detection presented in this paper, this restriction does not hold,
and it is possible to compare geometric elements across
types. This is illustrated in Fig. 16 where a shape com-
posed of curve segments is embedded in a shaded region
intended to represent a surface. This is possible because
shapes are defined as point sets given by raster images.
Visually, a shape composed of points can be indistinguishable
from a shape composed of other geometric elements. For
example, a raster image can be seen to be composed of
lines, curves, planes or surfaces. However, there is a fun-
damental difference with respect to the structure of these
shapes. As Stiny (2006) emphasises, point sets do not
always behave in the same way as the geometric elements
they can mimic. For example, shapes composed of points
have a granularity, which can result in a loss of form—this
is apparent in the pixelation that can occur in raster images.
Such granularity gives shapes a specific resolution, beyond
which a shape rule application can be limited. Granularity
does not occur in other geometric elements since they
contain no atomic elements—a line can always be subdivided
into smaller lines, a plane can always be subdivided
into smaller planes. In addition, when points are used to
represent other geometric elements, the boundaries of these
elements become indistinct from the rest of the shape,
meaning the shapes have to be regularised to behave cor-
rectly under Boolean shape operations (Earl 1997).
With respect to addressing the problem of subshape
detection, representing shapes as point sets has been suc-
cessful, because points are different from higher order
geometric elements in that they cannot be subdivided. As a
result, any two points are identical, and the embedding
relation between them is always identity. This means that
subshape detection can be reduced to a simple comparison
of collections of atomic elements, without consideration of
embedding. The Hausdorff distance provides one such
method of comparison. With shapes defined as point sets
and compared according to the Hausdorff distance, the
problems discussed in Sect. 3 have been avoided. This is
because (1) there is no restriction on the two-dimensional
geometry that can be represented by a set of points and (2)
two-way comparison obeys the (weak) metric properties of
identity, symmetry and the triangle inequality, which cor-
respond to intuitive notions of shape similarity.
Traditionally in the shape grammar formalism, shape
similarity is defined according to Euclidean transformations,
and for rectilinear shapes, such as shapes composed of
straight lines, this definition has been sufficient. However,
for non-rectilinear shapes, such as shapes composed of curve
segments, this definition is found to be lacking, as illustrated
in Fig. 4. Here, two curve segments are presented that are
visually similar but are not similar under Euclidean trans-
formation. As discussed, this is because the curves are seg-
ments of infinite curves that are mathematically distinct. In
this paper, an alternative measure of similarity was defined
according to the Hausdorff distance. This measure arguably
better reflects visual similarity and enables general subshape
detection.
The application of shape grammars to formalise and
support the process of shape exploration in conceptual
design has been discussed in detail by Prats et al. (2009).
However, development of shape grammar systems is still
limited due to the difficulties of subshape detection. This
research suggests an alternative approach to subshape
detection that is not restricted to formal geometrical shapes
and can support the fluid and dynamic interaction required
by designers when exploring design concepts (Jowers et al.
2008).
Acknowledgments The research reported in this paper was carried
out as part of the Design Synthesis and Shape Generation project
which is funded through the UK Arts & Humanities Research Council
(AHRC) and Engineering & Physical Sciences Research Council
(EPSRC)’s Designing for the 21st century programme.
References
Brown KN, Cagan J (1997) Optimized process planning by generative
simulated annealing. Artif Intell Eng Des Anal Manuf
11(3):219–235
Canny J (1986) A computational approach to edge detection. IEEE
Trans Pattern Anal Mach Intell 8(6):679–698
Chau HH, Chen X, McKay A, de Pennington A (2004) Evaluation of
a 3D shape grammar implementation. In: Gero JS (ed) Design
computing and cognition ‘04. Kluwer, Boston, pp 357–376
Earl CF (1997) Shape boundaries. Environ Plan B Plan Des
24(5):669–687
Forsyth DA, Ponce J (2003) Computer vision: a modern approach.
Prentice Hall, Englewood Cliffs
Huttenlocher DP, Klanderman GA, Rucklidge WJ (1993) Comparing
images using the Hausdorff distance. IEEE Trans Pattern Anal
Mach Intell 15(9):850–863
Jowers I (2006) Computation with curved shapes: towards freeform
shape generation in design. PhD Thesis, The Open University
Jowers I, Earl C (2010) The construction of curved shapes. Environ
Plan B Plan Des 37(1):42–58
Jowers I, Prats M, Lim S, McKay A, Garner S, Chase S (2008) Supporting
reinterpretation in computer-aided conceptual design. In: Alvarado
C, Cani M-P (eds) Sketch-based interfaces and modeling 2008,
Eurographics symposium proceedings, pp 151–158
Koning H, Eizenberg J (1981) The language of the prairie—Frank
Lloyd Wright’s Prairie houses. Environ Plan B Plan Des
8(3):295–323
Krishnamurti R (1981) The construction of shapes. Environ Plan B
Plan Des 8(1):5–40
Krishnamurti R (1992) The arithmetic of maximal planes. Environ
Plan B Plan Des 19(4):431–464
Lim S, Prats M, Jowers I, Chase S, Garner S, McKay A (2009) Shape
exploration in design: formalising and supporting a transforma-
tional process. Int J Archit Comput 6(4):415–433
McCormack JP, Cagan J (2006) Curve-based shape matching:
supporting designers’ hierarchies through parametric shape
recognition of arbitrary geometry. Environ Plan B Plan Des
33(4):523–540
McKay A, Jowers I, Chau HH, de Pennington A, Hogg DC (2008)
Computer aided design: an early shape synthesis system. In: Yan
X-T, Eynard B, Ion WJ (eds) Global design to gain a competitive
edge: an holistic and collaborative design approach based on
computational tools. Springer, London, pp 3–12
Pinz A (2005) Object categorization. Found Trends Comput Graph
Vis 1(4):255–353
Prats M, Earl CF (2006) Exploration through drawings in product
design. In: Gero JS (ed) Design computing and cognition ‘06.
Springer, The Netherlands, pp 82–102
Prats M, Garner SW, Lim S, Jowers I, Chase S (2009) Transforming
shape in design: observations from studies of sketching. Des
Stud 30(5):503–520
Rucklidge W (1996) Efficient visual recognition using the Hausdorff
distance. Lecture Notes in Computer Science. Springer, London
Schon DA, Wiggins G (1992) Kinds of seeing and their functions in
designing. Des Stud 13(2):135–156
Stiny G (1991) The algebras of design. Res Eng Des 2(3):171–181
Stiny G (1994) Shape rules—closure, continuity, and emergence.
Environ Plan B Plan Des 21(7):s49–s78
Stiny G (2006) Shape: talking about seeing and doing. MIT Press,
Cambridge
Stiny G, Gips J (1972) Shape grammars and the generative
specification of painting and sculpture. In: Freiman CV (ed)
Information processing 71, North Holland, Amsterdam, pp
1460–1465