ORIGINAL PAPER
Shape detection with vision: implementing shape grammars in conceptual design
Iestyn Jowers • David C. Hogg • Alison McKay •
Hau Hing Chau • Alan de Pennington
Received: 3 August 2009 / Revised: 19 February 2010 / Accepted: 8 March 2010 / Published online: 27 March 2010
© Springer-Verlag London Limited 2010
Abstract Despite more than 30 years of research, shape
grammar implementations have limited functionality. This
is largely due to the difficult problem of subshape detec-
tion. Previous research has addressed this problem analyt-
ically and has proposed solutions that directly compare
geometric representations of shapes. Typically, such work
has concentrated on shapes composed of limited geometry,
for example straight lines or parametric curves, and as a
result, their application has been restricted. The problem of
general subshape detection has not been resolved. In this
paper, an alternative approach is proposed, in which sub-
shape detection is viewed as a problem of object recogni-
tion, a sub-domain of computer vision. In particular, a
general method of subshape detection is introduced based
on the Hausdorff distance. The approach is not limited in
terms of geometry, and any shapes that can be represented
in an image can be compared according to the subshape
relation. Based on this approach, a prototype shape gram-
mar system has been built in which the geometry of two-
dimensional shapes is not restricted. The system automates
the discovery of subshapes in a shape, enabling the
implementation of shape rules in a shape grammar.
Application of the system is illustrated via consideration of
shape exploration in conceptual design. The manipulations
of sketched design concepts are formalised by shape rules
that reflect the types of shape transformations employed by
designers when sketching.
Keywords Shape detection · Shape grammars · Object recognition · Hausdorff distance · Conceptual design · Computational design
1 Introduction
Research into shape grammar implementation has been
active since the concept of shape grammars was devised by
Stiny and Gips in the 1970s (Stiny and Gips 1972). A
summary of this research activity is given in Chau et al.
(2004) and spans many design disciplines, from art to
engineering and from architecture to product design.
However, despite some promising results, these imple-
mentations have not yet satisfactorily met the potential
suggested by theoretical developments of the shape gram-
mar formalism (Stiny 2006). This is largely due to the
technical difficulties involved in implementing shape
grammar systems, which rely on a general purpose solution
for subshape detection.
Typically, the subshape detection problem has been
approached as an analytical problem, and while a number
of solutions have proved to be feasible, they have also
proved to be limited in their applicability. These limitations
result from the necessity of analytical solutions being
applied to specific types of geometry, such as straight lines
(Krishnamurti 1981) or parametric curves (Jowers and Earl
2010).
In this paper, a new approach to addressing the subshape
detection problem is presented. Here, subshape detection is
regarded as an object recognition problem, and established
results from computer vision are used in order to identify
embedded subshapes in a shape. In particular, model-based
recognition using the Hausdorff distance (Huttenlocher
et al. 1993) is employed in order to determine whether the
shape represented in one image can be embedded within
the shape represented in a second image. Such images can
be the output of computational systems, such as computer-
aided design (CAD) systems, or alternatively can be taken
from designers’ sketches.

I. Jowers (✉) · A. McKay · H. H. Chau · A. de Pennington
School of Mechanical Engineering, University of Leeds,
Leeds LS2 9JT, UK
e-mail: [email protected]

D. C. Hogg
School of Computing, University of Leeds, Leeds LS2 9JT, UK

Res Eng Design (2010) 21:235–247
DOI 10.1007/s00163-010-0088-z
This approach to subshape detection has been imple-
mented in a shape grammar system. The system is intro-
duced, and subshape detection is illustrated via
consideration of shapes that are not defined according to a
limited geometry but instead are taken from designers’
sketches. The system is used to manipulate a sketched
design concept in a mode that reflects the shape transfor-
mations employed by designers when sketching in con-
ceptual design (Prats et al. 2009).
2 Background
A shape grammar (Stiny 2006) consists of an initial shape
and a set of rules of the form a → b, where a and b are
both shapes, as illustrated in Fig. 1. In this context, a shape
is defined visually as a finite arrangement of geometric
elements, such as lines or curves, each with a definite
boundary and limited but non-zero extent. A rule a → b is
applicable to a shape c if some similarity transformation
(typically a Euclidean transformation) of the shape a on the
left-hand side of the rule is a subshape of c, denoted a ≤ c,
where ≤ is the subshape relation. Application of the rule
removes the instance of the subshape a and replaces it with
an instance of the shape b on the right-hand side of the rule.
For example, the shape rule in Fig. 1 removes a lens and
replaces it with a similar lens translated along its central
axis, as indicated by the local coordinate axis. Repeated
application of shape rules to an initial shape leads to the
generation of a sequence of shapes, as illustrated in Fig. 2.
Shape grammars embody the philosophy that a designer
using a computational system, such as a CAD system,
ought to be able to recognise and manipulate any subshape
or structure that can be perceived within a shape. As a
result, application of a shape rule is not restricted to the
geometric elements initially used to define a shape but
instead is applicable to any subshapes that can be seen to be
embedded in the shape. Indeed, within the shape grammar
formalism, the structure of a shape is defined retrospec-
tively according to shape rule applications (Stiny 1994).
A consequence of this is that shape grammars often gen-
erate shapes that incorporate some unexpected results that
follow from the recognition and manipulation of new
interpretations of shapes. For example, the shape sequence
in Fig. 2 begins with an initial shape (Fig. 2a) and ends
with a shape that is a rotation of this initial shape (Fig. 2e),
but this sequence is generated via application of a rule that
merely translates a subshape of this shape. This unexpected
result occurs because after a single application of the shape
rule, additional instances of the lens in the left-hand side of
the rule have emerged (Fig. 2c). Indeed, in Fig. 2c, three
instances of the lens can now be recognised, and applica-
tion of the shape rule to one of these subshapes translates it
along its central axis and restructures the shape so that it
appears to be rotated.
Shape grammars provide a formal description of the
shape exploration processes employed by designers in
conceptual design (Prats et al. 2009). This is because shape
rules enable the perceived structure of a design shape to be
freely recognised and manipulated without adherence to a
predefined geometric structure. Indeed, application of a
shape rule reflects the explorative process of ‘seeing-
moving-seeing’, as employed by designers when sketching
in conceptual design (Schön and Wiggins 1992). In ‘seeing’,
a rule formalises the perception of a shape by recognising
particular subshapes, and in ‘moving’, the rule
manipulates the shape according to replacement of the
recognised subshapes. This process of recognising and
manipulating subshapes leads to the generation of
sequences of shapes in a manner that reflects the way
designers generate sequences of sketches (Prats and Earl
2006).
Shape grammars have also been employed as a gener-
ative mechanism in a range of design disciplines, from
architecture, e.g. Koning and Eizenberg (1981), to engi-
neering, e.g. Brown and Cagan (1997). However, few of
these applications are computationally implemented and
instead it is common for rules to be applied and designs
generated as a paper-based exercise. This is because the
task of developing a shape grammar system—a computa-
tional system intended to automate the application of shape
rules—is not trivial, requiring a general purpose solution to
the subshape detection problem.
Subshape detection can be thought of as a type of object
recognition problem. Object recognition is a central theme
in computer vision and is applied to a range of problems
including fingerprint matching, character recognition and
content-based image retrieval (Forsyth and Ponce 2003). In
general, these problems are concerned with finding a target
object in an image or video sequence, for example by
applying statistical analysis or by comparing features.
Similarly, subshape detection is concerned with finding a
target shape embedded in a second shape, in such a way
that the embedded shape can be replaced according to a
shape rule. However, despite the commonalities betweenFig. 1 An example shape rule
236 Res Eng Design (2010) 21:235–247
123
the problems, the techniques employed in object recogni-
tion have never been applied to subshape detection, which
has instead been typically viewed as a problem to be solved
analytically by matching directly against geometric
representations.
3 Analytical subshape detection
A successful shape grammar implementation is dependent
on a general solution to the subshape detection problem.
For over 30 years, research in this field has produced
analytical solutions to the problem, which have been
applied to shapes composed of a variety of geometric
elements, including lines (Krishnamurti 1981), planes
(Krishnamurti 1992), circular arcs (Chau et al. 2004) and
parametric curves (McCormack and Cagan 2006; Jowers
and Earl 2010). Each of these analytical solutions takes
advantage of the geometric properties of shapes in order to
allow for subshape detection, and they typically utilise a
common representation of shapes, given by maximal geo-
metric elements.
Software implementation of shape grammars is difficult,
because it depends on a visual representation of shape that
does not take into consideration the underlying structure.
This means that it is necessary to compare shapes
according to their perceptual qualities, regardless of their
mathematical representation. However, comparing shapes
to determine whether or not they are perceptually the same
can be computationally expensive, if not impossible. In
addition, further difficulties arise with respect to subshape
detection. This is because the underlying structure of a
shape has consequences concerning which subshapes are
apparent for manipulation. For example, although three
lens subshapes are visually apparent in the triquetra in
Fig. 2c, a structural decomposition of the shape will only
allow for the recognition of one lens at a time. This is due
to the overlapping configuration of the three lenses.
In order to enable a visual comparison of shapes, it is
beneficial to provide a unique, canonical representation to
which all visually equivalent shapes can be reduced. One
such representation is the maximal representation as dis-
cussed by Stiny (2006). The maximal representation results
from merging any geometric elements in a shape that can
be merged, such as touching co-linear lines. For example,
the maximal representation of the triquetra in Fig. 2c is
given by the three largest curve segments in the shape.
Comparison of shapes according to maximal representa-
tions can reveal whether or not two shapes are perceptually
the same, or if one shape can be embedded as a subshape of
the other. Indeed, if one shape can be embedded in another,
then all of the maximal geometric elements of the first can
be embedded in the maximal geometric elements of the
second under a similarity transformation. For example, the
lens shape in the left-hand side of the rule in Fig. 1 is a
subshape of the triquetra in Fig. 2c, because the maximal
curves that define the lens shape can be embedded in the
maximal curves of the triquetra, under Euclidean
transformations.
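For segments embedded in the same carrier line, reduction to the maximal representation is essentially interval merging. A minimal sketch in Python (illustrative only; the representation of co-linear segments as parameter intervals along a shared carrier is our assumption, not that of the cited implementations):

```python
def maximal_segments(intervals):
    """Merge touching or overlapping co-linear segments, given as
    (start, end) parameter intervals along a shared carrier line,
    into the maximal segments of the shape."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # The segment touches or overlaps the last maximal segment: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

# Three touching segments reduce to one maximal line; a separate
# segment remains its own maximal element.
print(maximal_segments([(0, 2), (2, 5), (4, 7), (9, 10)]))  # [(0, 7), (9, 10)]
```

The same idea extends to curves, with arc length or curve parameter playing the role of the interval coordinate.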
With shapes reduced to a canonical representation, such
as maximal lines (Krishnamurti 1981), maximal curves
(Jowers and Earl 2010), or a representative shape based on
distinct points (McCormack and Cagan 2006), subshape
detection can be implemented via consideration of the
specific geometry of shapes. For example, in Jowers and
Earl (2010), subshape detection is implemented on shapes
composed of parametric curve segments via consideration
of the intrinsic properties of the mathematical curves in
which the segments are embedded. These intrinsic prop-
erties specify the embedding relations between curve seg-
ments and consequently between shapes composed of
curve segments. However, despite their theoretical success,
analytical approaches to subshape detection lead to prob-
lems with respect to a shape grammar implementation.
One problem arises from the dependency of analytical
approaches on the formal structures used to represent
shapes. In an analytical approach, subshape detection is
facilitated through consideration of the properties of spe-
cific geometric elements, such as straight lines or curve
segments. Such an approach can be defined so that it is
theoretically applicable to a range of element types.
However, in practice, implementation of an analytical
approach is restricted to a small class of elements. For
example, the algorithms introduced by Jowers and Earl
(2010) are theoretically applicable to shapes composed of
any parametric curve segments. However, in practice, the
detail of implementation is such that a different algorithm
is required for parametric curves of different order, as
illustrated in Jowers (2006).

Fig. 2 An example shape sequence. a Initial shape. b Detection of lens subshape. c Result of one rule application. d Detection of emergent lens subshape. e Result of two rule applications

Similarly, the algorithms
introduced by Krishnamurti (1981) are theoretically
extendable to include all rectilinear shapes including
shapes composed of planes, as discussed in Krishnamurti
(1992). Despite this, they have only been implemented for
shapes composed of straight lines, and the implementation
details for other geometric elements are yet to be fully
resolved.
With respect to a shape grammar implementation, this
means that any system based on any one analytical
approach to subshape detection will prove restrictive to
designers who generally incorporate many types of geo-
metric elements in their work. Alternatively, a shape
grammar system could be developed that is not based on a
single analytical approach to subshape detection but
instead incorporates many such approaches. This could
result in an inefficient system, since it would be necessary
to apply multiple subshape detection algorithms when
comparing shapes composed of different types of geo-
metric elements. In addition, there are still many types of
geometric elements for which the subshape detection
problem has not been addressed, for example free-form
surfaces.
A second problem arises with respect to the similarity of
shapes. A key premise in the application of shape gram-
mars lies in the fact that shapes are viewed as visual
entities and that they can be formally explored in a visually
intuitive way. When humans view a shape, visual similarity
implies equality. That is, if two shapes look the same, then
they are generally considered to be the same. However,
when analytical techniques carry out the same process,
visual similarity does not necessarily imply equality. For
example, the two curve segments, C1 and C2, in Fig. 3 are
visually similar but analytically they are different, because
they are segments of infinite curves that are mathematically
distinct (as illustrated by the extended curves). As a result,
in a shape grammar system based on analytical techniques,
shape matching between C1 and C2 would fail despite their
perceptual similarity.
This distinction between visual similarity and analytic
similarity has not been addressed in any of the analytical
approaches developed to date. As a result, computational
systems that recognise and manipulate subshapes based on
analytical approaches may not match shapes that seem
obviously similar to humans.
This paper presents an alternative approach to subshape
detection that addresses the problem as a computer vision
problem rather than as an analytical problem. In the
remainder of this paper, an approach is described, which is
based on an established technique of object recognition for
computer vision. This technique, which builds on the
Hausdorff distance, does not rely on formal geometry in
order to facilitate object recognition but instead treats
images as visual entities to be analysed and compared
within a definable similarity threshold. As a result, the
problems discussed with respect to analytical subshape
detection are avoided, and a more robust approach is
obtained.
4 Object recognition with the Hausdorff distance
Object recognition is a key challenge within the study of
computer vision. Much of the current work on this problem
is concerned with developing methods for detecting
instances of generic object categories within images, for
example a cup, a face, or a car (Pinz 2005), as opposed to
detecting instances of a particular object. The dominant
approach is to characterise the appearance of objects from a
target category in terms of collections of local features that
are expected to occur, such as the spatial and angular
distributions of intensity gradients. In this paper, the focus
is on shapes represented in raster images that can be
reduced to binary images, and ultimately, sets of points (i.e.
the individual pixels). This special case has been studied in
its own right, and a recent and powerful approach for
addressing this problem is based on the Hausdorff distance,
a distance metric defined between two sets of points
(Huttenlocher et al. 1993).
The benefits of using the Hausdorff distance as a method
for object recognition have been discussed in detail by
Rucklidge (1996). In particular, the Hausdorff distance
avoids many of the difficulties that result from other
approaches to object recognition. For example, difficulties
that emerge from unstable linear feature detection are
avoided by representing images as point features rather
than linear features. Also, problems that can result from
feature correspondence are avoided, because the approach
does not rely on any explicit pairing of features between
images. This also means there is no reliance on combina-
torial comparison, and it is possible to compare images
with many (thousands to tens of thousands) features,
whereas approaches that rely on feature pairing are typi-
cally restricted to relatively few (tens to hundreds). Most
importantly, the Hausdorff distance provides an approach
that enables real-time comparison of images and is reliable,
producing sensible results even in the presence of noise or
occlusion.

Fig. 3 Visually similar curve segments
4.1 The Hausdorff distance
The Hausdorff distance between two point sets, A and B, is
calculated from the directed distances between the sets.
The directed Hausdorff distance is a measure of the max-
imum distance from a set of points to the nearest point in a
second set. Formally, the directed Hausdorff distance
between two point sets A = {a1, a2, …, an} and B = {b1,
b2, …, bm} is defined as
h(A, B) = max_{a∈A} min_{b∈B} ||a − b||    (1)
where ||a - b|| is a measure of the distance between two
points a and b, e.g. the Euclidean distance. The value
h(A, B) is the distance from the point a ∈ A that is furthest
from all points of B to its nearest neighbour in B. For
example, if h(A, B) = d, then each point in A is within
distance d of a point in B, and there is at least one point in
A that is exactly distance d from its nearest neighbour in
B. The directed Hausdorff distance is asymmetric, meaning
that it is not necessarily true that h(A, B) = h(B, A), and
therefore, h(A, B) measures the distance from point set A to
point set B, but not vice versa.
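The directed distance of Eq. 1 translates almost literally into code. A minimal sketch in Python (illustrative; the function and variable names are ours):

```python
import math

def directed_hausdorff(A, B):
    """Directed Hausdorff distance h(A, B) of Eq. 1: the largest,
    over points a in A, of the distance from a to its nearest
    neighbour in B."""
    return max(
        min(math.dist(a, b) for b in B)  # distance from a to its nearest point of B
        for a in A
    )

# The directed distance is asymmetric: h(A, B) need not equal h(B, A).
A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
print(directed_hausdorff(A, B))  # 0.0: every point of A coincides with a point of B
print(directed_hausdorff(B, A))  # 4.0: (5, 0) is distance 4 from its nearest point in A
```

The example makes the asymmetry concrete: A embeds exactly in B, but B has an outlying point far from A.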
A symmetrical measure of the distance between A and B
is given by the (bidirectional) Hausdorff distance, which is
defined as
H(A, B) = max(h(A, B), h(B, A))    (2)
The Hausdorff distance H(A, B) is the maximum of the
directed Hausdorff distances from A to B and from B to A
and measures the minimum distance between all points in
A and all points in B. For example, if H(A, B) = d, then
each point in A is within distance d of some point in B and
also each point in B is within distance d of some point in A.
As such, given two point sets A and B, the Hausdorff
distance can provide a measure of their similarity. For
example, if A and B contain the same points, then the
Hausdorff distance between the sets will be zero, or if
the points in B are displaced by a small distance, then the
Hausdorff distance will be small. This is illustrated by the
two point sets in Fig. 4. Let A be the set of solid points and
B be the set of hollow points. Here, the Hausdorff distance
H(A, B) is large, because there are points in one set that are
far away from the points in the other. This reflects that
there is little similarity between A and B. Note that the
Hausdorff distance does not explicitly pair the points in A
with the points in B; instead, it is possible for many of the
points of A to be close to the same point of B and vice
versa.
As defined, the Hausdorff distance measures the simi-
larity between point sets that have fixed position relative to
each other. However, it can also be used to measure the
absolute similarity between point sets by allowing com-
parison under transformation. For example, Huttenlocher
et al. (1993) investigate the problem of finding the mini-
mum Hausdorff distance between two point sets under
Euclidean motion. Given two point sets, A and B, that do
not have fixed relative position, the Hausdorff distance can
be defined as a function of a transformation group T, e.g.
the Euclidean transformations, with the set B transformed
relative to the set A, according to T(B) = {T(b) | b ∈ B}. In
this case, the minimum Hausdorff distance determines the
transformation group T that minimises the distance
between A and B and is defined as
H_T(A, B) = min_T H(A, T(B))    (3)
The minimum Hausdorff distance provides a measure of
the absolute similarity between A and B relative to a
transformation group T. For example, for the two point sets
in Fig. 4, the minimum Hausdorff distance defined as a
function of translation is small since there are translations
of B that make all the points in A nearly coincident to
points in B and all the points in B nearly coincident with
points in A. Similarly, if T is a Euclidean motion, then
HT(A, B) = 0, since there is a rotation that embeds all the
points in B in all the points of A. In both cases, the value of
HT(A, B) reflects the similarity between the sets A and B,
under a specified transformation group.
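For a discrete set of candidate transformations, Eq. 3 can be approximated by exhaustive search. A brute-force sketch in Python restricted to translations (illustrative only; Huttenlocher et al. search transformation space far more efficiently than this):

```python
import math

def hausdorff(A, B):
    """Bidirectional Hausdorff distance H(A, B) of Eq. 2."""
    h = lambda X, Y: max(min(math.dist(x, y) for y in Y) for x in X)
    return max(h(A, B), h(B, A))

def min_hausdorff_translation(A, B, translations):
    """Minimum Hausdorff distance over a finite set of candidate
    translations applied to B: a discretised version of Eq. 3."""
    best, best_t = math.inf, None
    for tx, ty in translations:
        TB = [(bx + tx, by + ty) for bx, by in B]  # translated copy of B
        d = hausdorff(A, TB)
        if d < best:
            best, best_t = d, (tx, ty)
    return best, best_t

# B is a translated copy of A, so the minimum distance is zero,
# attained at the translation (-2, 0).
A = [(0, 0), (1, 0), (1, 1)]
B = [(2, 0), (3, 0), (3, 1)]
ts = [(tx, 0) for tx in range(-4, 5)]
print(min_hausdorff_translation(A, B, ts))  # (0.0, (-2, 0))
```

The returned value reflects absolute similarity under the sampled transformation set, independent of the initial relative position of the two point sets.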
4.2 Comparing images with the Hausdorff distance
In Huttenlocher et al. (1993), the minimum Hausdorff
distance under transformation, as defined in Eq. 3, is
presented as an approach for comparing shapes in images
given by a raster representation. In particular, it is applied
to the problem of model-based recognition, which is
concerned with searching a scene in order to locate
instances of an object, of which a model is provided. For
example, the problem could be concerned with tracking
an object as it moves about a scene by locating it in the
different frames of a video sequence. The minimum
Hausdorff distance H_T(A, B) is particularly appropriate for
vision problems such as this because, in the case where
the transformation T is a Euclidean transformation, it
obeys the metric properties of identity, symmetry and the
triangle inequality.

Fig. 4 Two sets of points

These properties correspond to intuitive
notions of shape similarity. In particular, identity
implies that a point set representing a shape is identical
only to itself; symmetry implies that shape similarity is
not dependent on the order of comparison between point
sets; and the triangle inequality implies that two point
sets representing shapes that are dissimilar cannot both be
similar to a point set representing some third shape. Many
shape comparison functions do not obey the triangle
inequality, and thus, it is possible that such functions will
report that two highly dissimilar shapes are both similar
to a third shape, which is counterintuitive.
Raster images are suitable for comparison according to
the Hausdorff distance, because they can be simply reduced
to point sets, for example through the use of edge detection
techniques (e.g. Canny 1986). Traditionally, in model-
based recognition, the point set representing the scene that
is to be searched is denoted A, and the point set repre-
senting the model that is to be searched for is denoted B. In
practice, recognition problems involve finding a transfor-
mation of the model B that brings it within close corre-
spondence with a portion of the scene A, and vice versa.
For this task, it is beneficial if it is possible to compare
partial images.
Consider again the directed Hausdorff distance, defined
according to Eq. 1. As discussed, this equation calculates
a distance h(A, B) = d such that all points in A are within
distance d of some point in B. A partial equivalent of this
equation calculates a distance d such that a subset of the
points in A is within distance d of some point in B
(Huttenlocher et al. 1993). In particular, the partial
directed Hausdorff distance from a subset of A to B is
given by
h_K(A, B) = K-th_{a∈A} min_{b∈B} ||a − b||    (4)

where K is the number of points in the subset of A, so that
0 ≤ K ≤ n, where n is the number of points in A. K-th_{a∈A}
denotes a ranked value based on an ordering of the points
in A according to their distance from the nearest point in B.
Therefore, h_K(A, B) = d is the distance from the Kth
ranked point in A to its nearest neighbour in B, and
consequently, K of the points in A are within a distance d of
some point in B. For example, if K = n, the number of
points in A, then d is equal to the directed Hausdorff
distance h(A, B).
Note that this definition of the partial Hausdorff dis-
tance is not dependent on a specification of which part of
A is to be matched with B. Instead, the best matching part
of A is selected because Eq. 4 identifies the subset of K
points in A that minimises the directed Hausdorff distance
from A to B.
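Eq. 4 amounts to replacing the maximum of Eq. 1 with the K-th smallest nearest-neighbour distance. A minimal Python sketch (illustrative; the names are ours):

```python
import math

def partial_directed_hausdorff(A, B, K):
    """Partial directed Hausdorff distance h_K(A, B) of Eq. 4:
    rank the points of A by distance to their nearest neighbour
    in B, and return the K-th ranked (K-th smallest) distance."""
    dists = sorted(min(math.dist(a, b) for b in B) for a in A)
    return dists[K - 1]  # K = len(A) recovers the directed distance h(A, B)

# One outlier in A: matching only 3 of the 4 points ignores it.
A = [(0, 0), (1, 0), (2, 0), (50, 0)]
B = [(0, 0), (1, 0), (2, 0)]
print(partial_directed_hausdorff(A, B, 3))       # 0.0
print(partial_directed_hausdorff(A, B, len(A)))  # 48.0, equal to h(A, B)
```

As the example shows, choosing K < n makes the measure robust to points of A that have no counterpart in B, which is the basis of matching under occlusion.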
The partial Hausdorff distance between A and B is cal-
culated according to the maximum of the partial directed
distances between the point sets and is defined as
H_{KL}(A, B) = max(h_K(A, B), h_L(B, A))    (5)
This partial Hausdorff distance differs from the
Hausdorff distance as defined in Eq. 2, because instead
of matching whole point sets, it compares subsets of points.
For recognition problems, this is important since it enables
matching of an object in scenes in which many other
objects are present, and it also enables matching of objects
that are not wholly represented in a scene, for example due
to occlusion or sensor failure.
4.3 Computing the Hausdorff distance
When computing the Hausdorff distance, it is useful to
consider its graphical interpretation. As discussed in Hut-
tenlocher et al. (1993), the directed Hausdorff distance as
defined in Eq. 1 can be re-expressed as
h(A, B) = max_{a∈A} d(a)

with

d(x) = min_{b∈B} ||x − b||
A similar expression, d′(x), can also be defined for the
reverse directed Hausdorff distance h(B, A). The graph of
d(x), {(x, d(x)) | x ∈ ℝ²}, is known as the distance
transform of B since it gives the distance from a point x
to the nearest point b ∈ B. For example, for a set of points
defined by a raster image, the resulting distance transform
can be represented as an array, as illustrated in Fig. 5.
Here, the distance transform is calculated according to the
L1 distance (also known as the Manhattan distance), where
the distance d between two points p1 = (x1, y1) and
p2 = (x2, y2) is given by d = |x2 − x1| + |y2 − y1|.
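The distance transform need not be computed by comparing every pair of points; for the L1 metric, the standard two-pass chamfer sweep over the raster suffices. A sketch in Python (illustrative only; the paper does not specify how its transforms are computed):

```python
def l1_distance_transform(grid):
    """Distance transform of a binary raster under the L1 (Manhattan)
    metric: each cell receives the distance to the nearest cell of the
    point set (marked 1), via a two-pass chamfer sweep."""
    rows, cols = len(grid), len(grid[0])
    INF = rows + cols  # upper bound on any L1 distance within the grid
    d = [[0 if grid[r][c] else INF for c in range(cols)] for r in range(rows)]
    # Forward pass: propagate distances from the top-left.
    for r in range(rows):
        for c in range(cols):
            if r > 0:
                d[r][c] = min(d[r][c], d[r - 1][c] + 1)
            if c > 0:
                d[r][c] = min(d[r][c], d[r][c - 1] + 1)
    # Backward pass: propagate distances from the bottom-right.
    for r in range(rows - 1, -1, -1):
        for c in range(cols - 1, -1, -1):
            if r < rows - 1:
                d[r][c] = min(d[r][c], d[r + 1][c] + 1)
            if c < cols - 1:
                d[r][c] = min(d[r][c], d[r][c + 1] + 1)
    return d

# A single point in the centre of a 3x3 raster.
B = [[0, 0, 0],
     [0, 1, 0],
     [0, 0, 0]]
for row in l1_distance_transform(B):
    print(row)
# [2, 1, 2]
# [1, 0, 1]
# [2, 1, 2]
```

Two linear sweeps give the exact L1 transform, so the cost is proportional to the number of pixels rather than to the product of the two point-set sizes.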
Consideration of the graphical interpretation of the
Hausdorff distance gives rise to the algorithms outlined in
Table 1. Graphically, the directed Hausdorff distance
h(A, B) is simply given by the maximum of the distance
transform d(x) over the points of A. For example, if A and
B are defined by raster images, then h(A, B) is the highest
entry in d(x) that corresponds to a point a ∈ A.

Fig. 5 An example distance transform

This procedure for calculating h(A, B) is outlined in
algorithm (a) in Table 1.
h(B, A) is similarly defined as the maximum of d′(x) over
the points of B, and the Hausdorff distance H(A, B) is
simply the larger of the two maxima of the distance
transforms d(x) and d′(x), as outlined in algorithm (b).
Similarly, the Hausdorff distance as a function of a
transformation T can be computed by considering the
maxima of the distance transforms d(x) and d′(x) under
transformation, as outlined in algorithm (c) in Table 1.
When comparing images, a threshold value, say s, is
specified. This value is a numerical expression for the
similarity of images and defines the maximum distance that
is allowed between two images for them to be considered
similar. That is, if the Hausdorff distance between a model
and a scene is greater than s, then there is little similarity
between them, suggesting that there are no instances of the
model in the scene. If an exact match is required, then it is
possible to let s = 0; however, for images where similarity
does not require an exact match, the value of s can be
greater. Based on this, algorithm (d) in Table 1 describes a
procedure in which a set of transformations are tested to
determine whether they map a point set B onto a point set A
within a specified threshold value s.
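Algorithm (d) in Table 1 is, in essence, a filter over candidate transformations. A Python sketch (illustrative; here the directed model-to-scene distance is used in place of the bidirectional distance so that the model can match a portion of a larger scene, which is an assumption of ours, not the paper's exact formulation):

```python
import math

def directed_h(X, Y):
    """Directed Hausdorff distance: largest nearest-neighbour
    distance from a point of X to the set Y."""
    return max(min(math.dist(x, y) for y in Y) for x in X)

def matching_transformations(A, B, transformations, s):
    """Keep every candidate transformation t for which each point of
    the transformed model t(B) lies within distance s of the scene A,
    mirroring algorithm (d) in Table 1."""
    return [t for t in transformations
            if directed_h([t(b) for b in B], A) < s]

A = [(0, 0), (1, 0), (4, 0), (5, 0)]   # scene
B = [(0, 0), (1, 0)]                   # model
# Candidate horizontal translations of the model.
candidates = [lambda p, d=d: (p[0] + d, p[1]) for d in (0, 2, 4)]
found = matching_transformations(A, B, candidates, s=0.5)
print([t((0, 0))[0] for t in found])   # [0, 4]: the model embeds at shifts 0 and 4
```

Each surviving transformation corresponds to one detected embedding of the model, which is exactly the information a shape grammar system needs in order to apply a rule.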
5 A shape grammar system
The methods of object detection described in the previous
section were implemented in a shape grammar system. This
system, introduced in McKay et al. (2008) and illustrated in
Fig. 6, enables implementation of shape grammars on two-
dimensional shapes arranged in the plane. The shapes are
input as raster images in bmp, jpg, gif, or png format and are
converted to point sets via consideration of the colour density
of the pixels in the image. Here, colour density is defined by
adding the red, green and blue colour values of a pixel, each
of which is defined over the range [0, 255]. Colour density is
therefore defined over the range [0, 765], with 0 indicating a
black pixel and 765 indicating a white pixel. Any pixel with a
colour density less than the specified tolerance (here set at
300) is defined as a point in the point set representing the
shape. Conversion of raster images into point sets allows for
their comparison according to the Hausdorff distance.
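The colour-density conversion described above can be sketched directly (illustrative Python; the system reads bmp/jpg/gif/png files, whereas here the image is a small in-memory RGB array):

```python
def image_to_point_set(pixels, tolerance=300):
    """Convert an RGB raster to a point set: colour density is the sum
    of a pixel's red, green and blue values (range [0, 765], with 0
    black and 765 white); any pixel darker than the tolerance becomes
    a point of the shape."""
    points = []
    for y, row in enumerate(pixels):
        for x, (r, g, b) in enumerate(row):
            if r + g + b < tolerance:
                points.append((x, y))
    return points

# A 2x3 image: one black pixel, one dark grey pixel, the rest near-white.
img = [[(255, 255, 255), (0, 0, 0), (250, 250, 250)],
       [(240, 240, 240), (80, 90, 100), (255, 255, 255)]]
print(image_to_point_set(img))  # [(1, 0), (1, 1)]
```

With shapes reduced to point sets in this way, the Hausdorff machinery of the previous section applies regardless of the geometry originally used to draw them.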
In the interface to the shape grammar system, the main
window displays the current shape in a generative sequence,
which, in Fig. 6, is the triquetra from Fig. 2c. Shape rules are
defined in a second interface, as illustrated in Fig. 7. The
rules are defined according to two shapes, one for the left-
hand side of the rule and the other for the right-hand side, and
Table 1 Algorithms for calculating Hausdorff distance

(a) calculate h(A,B)

    Data: Two point sets A and B
    Result: Directed Hausdorff distance h(A,B)
    dB = distance transform of B;
    max = 0;
    for i = 1 to size(A) do
        point = A[i];
        dist = dB[point.x, point.y];
        if dist > max then
            max = dist;
        end
    end
    return max;

(b) calculate H(A,B)

    Data: Two point sets A and B
    Result: Hausdorff distance H(A,B)
    hAB = h(A,B);
    hBA = h(B,A);
    max = 0;
    if hAB > hBA then
        max = hAB;
    else
        max = hBA;
    end
    return max;

(c) calculate H(A,T(B))

    Data: Two point sets A and B, and a transformation T
    Result: Hausdorff distance H(A,T(B))
    tB = new set;
    for i = 1 to size(B) do
        point = B[i];
        tPoint = T(point);
        tB.add(tPoint);
    end
    return H(A, tB);

(d) calculate set of transformations that map B onto A within threshold s

    Data: Two point sets A and B, a set of transformations T, and a threshold value s
    Result: A set of transformations that map B onto A within threshold s
    map = new set;
    for i = 1 to size(T) do
        t = T[i];
        HAB = Ht(A,B);
        if HAB < s then
            map.add(t);
        end
    end
    return map;
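Algorithms (a) and (b) can be sketched in Python. The version below computes nearest-point distances directly by brute force rather than via a distance transform of B, so it is a simplification of algorithm (a), not the paper's implementation:

```python
import math

def directed_hausdorff(A, B):
    """h(A, B): the greatest distance from any point of A to its
    nearest point of B (brute-force variant of algorithm (a); the
    paper instead precomputes a distance transform of B)."""
    return max(min(math.dist(a, b) for b in B) for a in A)

def hausdorff(A, B):
    """H(A, B) = max(h(A, B), h(B, A)) -- algorithm (b)."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))

A = {(0, 0), (4, 0)}
B = {(0, 0), (1, 0)}
print(directed_hausdorff(A, B))  # → 3.0: (4,0) is 3 away from (1,0)
print(directed_hausdorff(B, A))  # → 1.0
print(hausdorff(A, B))           # → 3.0
```

The asymmetry of the directed distance is visible in the example: h(A,B) and h(B,A) differ, and H(A,B) takes the larger of the two.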
Res Eng Design (2010) 21:235–247 241
according to the spatial relation between the two shapes. For
example, Fig. 7 illustrates the shape rule from Fig. 1, in
which a lens is translated along its central axis.
In order to apply a rule to a shape, it is first necessary to
find the shape on the left-hand side of the rule embedded as
a subshape. For example, in order to apply the rule in
Fig. 7 to the triquetra in Fig. 6, it is first necessary to detect
the lens on the left-hand side of the rule embedded as a
subshape of the triquetra. This is achieved by considering
the Hausdorff distance between the two shapes, under a set
of transformations, and will be discussed in Sect. 6. In this
example, three instances of the lens shape are found
embedded in the triquetra as illustrated in Fig. 8.
Application of the rule involves removing the pixels that
compose the matched subshape and the addition of the
pixels that compose the shape on the right-hand side of
the rule, according to the transformation under which the
embedded subshape was found. For example, application
of the rule in Fig. 7 to the triquetra shape according to the
match illustrated in the middle image in Fig. 8 results in
the shape in Fig. 9.
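With shapes held as point sets, rule application as just described reduces to set subtraction and union. The following Python sketch is illustrative only; the function signature is an assumption, not the system's API:

```python
def apply_rule(shape, matched_lhs, rhs, transform):
    """Apply a shape rule to a point set.

    `matched_lhs` is the set of points of `shape` matched by the
    left-hand side of the rule under `transform`; those points are
    removed, and the right-hand-side points `rhs`, mapped under the
    same transformation, are added.
    """
    result = set(shape) - set(matched_lhs)
    result |= {transform(p) for p in rhs}
    return result

# Toy example: the rule "moves" a matched point one unit right.
shape = {(0, 0), (1, 0), (2, 0)}
translate = lambda p: (p[0] + 1, p[1])
print(sorted(apply_rule(shape, {(2, 0)}, {(2, 0)}, translate)))
# → [(0, 0), (1, 0), (3, 0)]
```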
5.1 Shape exploration in conceptual design
Unlike previous shape grammar implementations, the
approach to subshape detection presented here does not
depend on the geometric properties of shape. Instead, it
depends on the similarity properties of shapes as defined by
the Hausdorff distance under transformation. As a result,
within this shape grammar system, shapes need not con-
form to formal geometric elements such as straight lines or
parametric curves. Indeed, since shapes are compared as
point sets, anything that can be represented as such can be
considered a shape, including raster images.
In the remainder of this paper, examples of shape rule
applications will be presented, which are based on studies
of designers exploring design shapes in conceptual design
(Lim et al. 2009). In particular, these examples will be
based on the sketches in Fig. 10, in which a designer was
exploring design concepts for a kettle. Shape rules are used
to manipulate a sketched design concept in a manner that
reflects the types of shape transformations employed by the
designer when sketching (Prats et al. 2009).
A sketch, such as one of the kettle concepts in Fig. 10,
can be digitised as a raster image (for example, by using an
optical input device such as a scanner) and imported into
the system, as illustrated in Fig. 11. Shape rules can then
be applied to modify design concepts and systematically
generate and explore design alternatives. For example, the
Fig. 6 The shape grammar system
Fig. 7 An example shape rule
Fig. 8 Result of subshape detection
Fig. 9 Result of rule application
rule in Fig. 12 replaces the handle of the kettle with one
that has a base that is disconnected from the main body of
the kettle and reflects transformations made by the designer
when sketching.
Application of the rule detects the handle subshape and
replaces it, as illustrated in Fig. 13. In this example, the
shape that is being manipulated by the shape rule is not
defined according to mathematically precise geometry as is
common in most shape grammar implementations. Instead,
the shape incorporates inaccuracies that commonly arise in
sketched shapes. The resulting design is not one that was
sketched by the designer, but it does result from applying
shape transformations used by the designer and, as a result,
would not be out of place in the sketches in Fig. 10. This
illustrates the potential for shape grammar systems to be
used as a computational aid in design exploration, as a means
of formalising the shape transformations used by designers
and applying those transformations to generate ideas and
explore alternatives that a designer may not consider
unaided (Prats et al. 2009).
6 Vision-based subshape detection
6.1 One-way subshape detection
The simplest approach to implementing subshape detection
using the Hausdorff distance is to consider only the
directed Hausdorff distance as defined in Eq. 1. The partial
(bidirectional) Hausdorff distance, defined in Eq. 5, mea-
sures distance in two directions, from a model to a scene
and conversely from the scene to the model. This partial
two-way matching ensures not only that the model is
similar to an object in the scene, but also that the object in
the scene is similar to the model. In subshape detection, the
model is a shape a on the left-hand side of a rule, while the
scene is the shape c in which it is to be embedded. In this
case, partial two-way matching is not strictly necessary,
because the problem is concerned with finding transfor-
mations of a such that it can be embedded in a shape c. For
this, it is necessary to determine whether a transformation
Fig. 10 Shape exploration in conceptual design
Fig. 11 An initial kettle concept
Fig. 12 A handle exploration rule
Fig. 13 Modification of kettle design via shape rule. a Detection of
handle subshape. b Result of rule application
of the shape a is similar to a portion of the shape c, and for
this the directed Hausdorff distance is sufficient.
The directed Hausdorff distance is measured from a shape
a on the left-hand side of a rule to a shape c in which it is
embedded, according to a transformation group T. Typically,
in shape grammar implementations, T would be the Euclid-
ean transformations consisting of translation, rotation, iso-
tropic scale and reflection. In the shape grammar system
presented here, a subset of the Euclidean transformations is
specified by the user before subshape detection is initiated.
This is necessary in order to avoid having to consider an
infinite number of transformations. This subset is specified
according to incremental steps between transformations. For
example, a rotation defined with a step size of 60° will result in
a set of transformations that incorporates rotations of 60°,
120°, 180°, 240°, 300° and 360°. With T defined as a subset
of the Euclidean transformations, the directed Hausdorff
distance as a function of T is given by
hT(B, A) = h(T(B), A)    (6)
where the shape c is defined according to point set A and
the shape a is defined according to point set B. hT(B, A) can
be calculated using a variation of the algorithms presented
in Table 1.
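Equation 6 and the discretised transformation set can be combined into a simple search for embedding transformations. Below is a minimal Python sketch, restricted to rotations about the origin and brute-force distance computation; names and structure are assumptions for illustration:

```python
import math

def directed_hausdorff(A, B):
    """h(A, B): greatest nearest-point distance from A to B."""
    return max(min(math.dist(a, b) for b in B) for a in A)

def rotations(step_deg):
    """Discretise rotation about the origin at the given step size,
    e.g. a step of 60 gives rotations of 60, 120, ..., 360 degrees."""
    ts = []
    for k in range(1, 360 // step_deg + 1):
        th = math.radians(k * step_deg)
        ts.append(lambda p, c=math.cos(th), s=math.sin(th):
                  (p[0] * c - p[1] * s, p[0] * s + p[1] * c))
    return ts

def embedding_transforms(A, B, transforms, s):
    """Transformations t with h(t(B), A) < s, i.e. those that embed
    the rule shape B in the shape A within threshold s (Eq. 6)."""
    return [t for t in transforms
            if directed_hausdorff({t(b) for b in B}, A) < s]

# A four-point "diamond" maps onto itself under every 90-degree rotation.
A = {(1, 0), (0, 1), (-1, 0), (0, -1)}
valid = embedding_transforms(A, A, rotations(90), s=1e-6)
print(len(valid))  # → 4: all four rotations embed the shape in itself
```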
In the algorithms, shape similarity is defined within a
threshold value s, which is specified before subshape
detection is initiated. Transformations for which hT(B, A) is
less than s are considered valid transformations that embed
the shape a in the shape c. Once such a transformation is
found, it can be used in the application of a shape rule. The
value of s depends on the specific shapes that are being
compared, including their relative resolutions, and a value
that produces reasonable results for one pair of shapes may
not produce such reasonable results for a different pair. For
example, in the application of the lens rule to the triquetra
illustrated in Figs. 6, 7, 8 and 9, the similarity threshold
value is set such that s = 1. As a result, the transformations
that were calculated embed the lens shape in the triquetra
within a maximum distance of one pixel, as illustrated in
Fig. 8. In this case, a low similarity threshold was feasible
because the shapes were defined according to formal geo-
metric elements. In the application of the kettle handle rule
illustrated in Figs. 11, 12 and 13, a higher threshold value is
needed in order to account for the inexact geometry in the
sketched kettle concept. In this case, the similarity threshold
value was set so that s = 4, and the transformations that
were calculated embed the handle shape in the kettle within
a maximum distance of four pixels, as illustrated in Fig. 13a.
The threshold value, s, can be increased further in order
to broaden the definition of similarity between shapes. For
example, the kettle concept in Fig. 11 does not contain any
exact ellipse subshapes, although the handle of the kettle
can be visually compared to an ellipse. As a result, with the
threshold value increased, the rule in Fig. 14 can be used to
recognise the handle and replace it with a handle composed
of concentric circles, similar to those explored in the
sketches in Fig. 10.
With the threshold value set so that s = 10, transfor-
mations were calculated that embed the ellipse in the kettle
concept within a maximum distance of ten pixels, as
illustrated in Fig. 15a. The rule can then be applied to
replace the oval-shaped handle with the circular one.
As illustrated, one-way subshape detection is reasonably
successful. However, the approach does suffer from limi-
tations that result from it being based on a directed distance
from one shape to another. In particular, because the dis-
tance hT(B, A) is directed, it does not obey the metric
properties of identity, symmetry and the triangle inequality.
As discussed in Sect. 4.2, these properties correspond to
intuitive notions of shape similarity, and as a result, one-
way subshape detection can sometimes produce undesir-
able results. For example, consider the sketched kettle in
Fig. 16, which contains a large shaded area in which the
majority of pixels are coloured black. Examples such as
this can readily arise in sketches where shaded areas are
used to introduce three-dimensional effects.
With shape similarity defined according to the distance
hT(B, A), any shape can be embedded in shaded areas such
as this under uncountable Euclidean transformations. For
example, the rule in Fig. 12 can be used to detect handle
subshapes embedded in the shaded area, as illustrated in
Fig. 16. Despite this, a designer is unlikely to recognise the
handle as a feature of this area, and its recognition is likely
to be undesirable in a shape grammar system. This issue
can be avoided by insisting that shapes not contain black
areas and be composed solely of linear elements, for
example as a result of edge detection (Canny 1986).
Alternatively, a method of subshape detection can be used
based on a bidirectional distance.
6.2 Two-way subshape detection
As discussed in Sect. 4.1, the minimum Hausdorff distance
HT(A, B), defined in Eq. 3, measures the similarity between
Fig. 14 A second handle exploration rule
shapes represented by point sets A and B, according to a
specific transformation group T. This distance is bidirec-
tional and, when T is a Euclidean transformation, it obeys
the metric properties of identity, symmetry and the triangle
inequality. However, HT(A, B) cannot be used directly to
determine the transformations that embed one shape in
another. This is because when a shape a is embedded in a
shape c under a Euclidean transformation, there are likely
to be points in c that are far away from points in the
transformed shape a, resulting in a high value for HT(A, B).
For example, one-way subshape detection can calculate
transformations that embed a lens in a triquetra under a
similarity threshold value s = 1, as illustrated in Fig. 8. If
HT(A, B) is used to measure the similarity between the
triquetra and the lens under these transformations, then the
distance calculated between them would be much greater
than one because there are points in the triquetra that are
far away from the points in an embedded lens.
A bidirectional measure of the distance between a shape
a and a subshape of the shape c can instead be defined by
considering the partial Hausdorff distance, as discussed in
Sect 4.2. In particular, a measure of the distance between a
subset of the points in A and all of the points in B is
required. This is given by
HKT(A, B) = max(hKT(A, B), hT(B, A))
This distance is defined as a function of a transformation
group T with the set B transformed relative to the set A,
according to T(B) = {T(b) | b ∈ B}. Accordingly, hT(B, A)
is as defined in Eq. 6, and hKT(A, B) is given by
hKT(A, B) = Kth_{a ∈ A} min_{b ∈ B} ||a − T(b)||
Here, K specifies the size of the subset of A that is
considered, and its value is very significant with respect to
the subshapes that can be detected. For example, if K = n,
the number of points in A, then hKT(A, B) will be equal to
hT(A, B), the directed distance from A to B, and the benefits
of using the partial Hausdorff distance will be negated.
HKT(A, B) is a bidirectional measure of the distance
between a subset of the points in A and all of the points in
B. Despite this, HKT(A, B) still does not obey the metric
properties of identity, symmetry and the triangle inequality.
It does however obey weaker conditions that correspond to
intuitively reasonable behaviour with respect to subshape
similarity. These conditions are that the metric properties
are in effect obeyed between subsets of points, for example
between the points in B and a subset of the points in A,
when compared under a Euclidean transformation. A dis-
cussion of these weaker conditions is presented in Huttenlocher et al. (1993).
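The partial directed distance hKT underlying this definition is the K-th ranked nearest-point distance. A minimal Python sketch (omitting the transformation, which is applied as in Eq. 6) shows how it ignores outlying points:

```python
import math

def partial_directed_hausdorff(A, B, K):
    """h_K(A, B): the K-th ranked (K-th smallest, 1-based) of the
    distances from each point of A to its nearest point of B.
    With K = len(A) this reduces to the directed distance h(A, B)."""
    dists = sorted(min(math.dist(a, b) for b in B) for a in A)
    return dists[K - 1]

# Three points of A coincide with points of B; one outlier lies far away.
A = [(0, 0), (1, 0), (2, 0), (50, 0)]
B = [(0, 0), (1, 0), (2, 0)]
print(partial_directed_hausdorff(A, B, K=3))  # → 0.0: outlier ignored
print(partial_directed_hausdorff(A, B, K=4))  # → 48.0: full directed distance
```

As the text notes, the choice of K is significant: with K equal to the size of A, the partial distance degenerates to the directed distance and the benefit of partial matching is lost.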
In the shape grammar system discussed in this paper,
two-way subshape detection is still to be implemented.
Instead, for the examples presented, the method of one-way
detection was found to be sufficient.
7 Discussion
The approach to subshape detection presented in this paper
is significantly different from the analytical approaches
discussed in Sect. 3, because here shapes are defined as sets
of points. Under the shape grammar formalism, shapes are
defined within algebras that are ordered according to the
subshape relation and the embedding properties of geo-
metric elements (Stiny 1991). This definition enables the
reinterpretation of shapes according to features that are not
apparent in their initial formulation. However, within these
algebras, shapes composed of geometric elements of dif-
ferent types are treated independently. This is because
there is no relation between them with respect to their
embedding properties. For example, under the subshape
relation, points cannot be embedded in lines, and lines
cannot be embedded in planes. Instead, every subdivision
of a line contains only lines, and similarly, every subdivi-
sion of a plane contains only planes.
Fig. 15 Modification of kettle design via shape rule. a Detection of
ellipse subshape. b Result of rule application
Fig. 16 A shape containing a shaded region
Implementations of shape grammars typically conform
to this algebraic definition of shape and do not enable
comparison of geometric elements across types. For
example, the algorithms presented by Krishnamurti (1981)
enable comparison of shapes composed only of straight
lines, whereas the algorithms presented by Jowers and Earl
(2010) enable comparison of shapes composed only of
parametric curve segments. In the approach to subshape
detection presented in this paper, this restriction does not hold,
and it is possible to compare geometric elements across
types. This is illustrated in Fig. 16 where a shape com-
posed of curve segments is embedded in a shaded region
intended to represent a surface. This is possible because
shapes are defined as point sets given by raster images.
Visually, a shape composed of points can be indistinguishable
from a shape composed of other geometric elements. For
example, a raster image can be seen to be composed of
lines, curves, planes or surfaces. However, there is a fun-
damental difference with respect to the structure of these
shapes. As Stiny (2006) emphasises, point sets do not
always behave in the same way as the geometric elements
they can mimic. For example, shapes composed of points
have a granularity, which can result in a loss of form—this
is apparent in the pixelation that can occur in raster images.
Such granularity gives shapes a specific resolution, beyond
which a shape rule application can be limited. Granularity
does not occur in other geometric elements since they
contain no atomic elements—a line can always be subdivided
into smaller lines, a plane can always be subdivided
into smaller planes. In addition, when points are used to
represent other geometric elements, the boundaries of these
elements become indistinct from the rest of the shape,
meaning the shapes have to be regularised to behave cor-
rectly under Boolean shape operations (Earl 1997).
With respect to addressing the problem of subshape
detection, representing shapes as point sets has been suc-
cessful, because points are different from higher order
geometric elements in that they cannot be subdivided. As a
result, any two points are identical, and the embedding
relation between them is always identity. This means that
subshape detection can be reduced to a simple comparison
of collections of atomic elements, without consideration of
embedding. The Hausdorff distance provides one such
method of comparison. With shapes defined as point sets
and compared according to the Hausdorff distance, the
problems discussed in Sect. 3 have been avoided. This is
because (1) there is no restriction on the two-dimensional
geometry that can be represented by a set of points and (2)
two-way comparison obeys the (weak) metric properties of
identity, symmetry and the triangle inequality, which cor-
respond to intuitive notions of shape similarity.
Traditionally in the shape grammar formalism, shape
similarity is defined according to Euclidean transformations,
and for rectilinear shapes, such as shapes composed of
straight lines, this definition has been sufficient. However,
for non-rectilinear shapes, such as shapes composed of curve
segments, this definition is found to be lacking, as illustrated
in Fig. 4. Here, two curve segments are presented that are
visually similar but are not similar under Euclidean trans-
formation. As discussed, this is because the curves are seg-
ments of infinite curves that are mathematically distinct. In
this paper, an alternative measure of similarity was defined
according to the Hausdorff distance. This measure arguably
better reflects visual similarity and enables general subshape
detection.
The application of shape grammars to formalise and
support the process of shape exploration in conceptual
design has been discussed in detail by Prats et al. (2009).
However, development of shape grammar systems is still
limited due to the difficulties of subshape detection. This
research suggests an alternative approach to subshape
detection that is not restricted to formal geometrical shapes
and can support the fluid and dynamic interaction required
by designers when exploring design concepts (Jowers et al.
2008).
Acknowledgments The research reported in this paper was carried
out as part of the Design Synthesis and Shape Generation project
which is funded through the UK Arts & Humanities Research Council
(AHRC) and Engineering & Physical Sciences Research Council
(EPSRC)’s Designing for the 21st century programme.
References
Brown KN, Cagan J (1997) Optimized process planning by generative
simulated annealing. Artif Intell Eng Des Anal Manuf
11(3):219–235
Canny J (1986) A computational approach to edge detection. IEEE
Trans Pattern Anal Mach Intell 8(6):679–698
Chau HH, Chen X, McKay A, de Pennington A (2004) Evaluation of
a 3D shape grammar implementation. In: Gero JS (ed) Design
computing and cognition ‘04. Kluwer, Boston, pp 357–376
Earl CF (1997) Shape boundaries. Environ Plan B Plan Des
24(5):669–687
Forsyth DA, Ponce J (2003) Computer vision: a modern approach.
Prentice Hall, Englewood Cliffs
Huttenlocher DP, Klanderman GA, Rucklidge WJ (1993) Comparing
images using the Hausdorff distance. IEEE Trans Pattern Anal
Mach Intell 15(9):850–863
Jowers I (2006) Computation with curved shapes: towards freeform
shape generation in design. PhD Thesis, The Open University
Jowers I, Earl C (2010) The construction of curved shapes. Environ
Plan B Plan Des 37(1):42–58
Jowers I, Prats M, Lim S, McKay A, Garner S, Chase S (2008) Supporting
reinterpretation in computer-aided conceptual design. In: Alvarado
C, Cani M-P (eds) Sketch-based interfaces and modeling 2008,
Eurographics symposium proceedings, pp 151–158
Koning H, Eizenberg J (1981) The language of the prairie—Frank
Lloyd Wright’s Prairie houses. Environ Plan B Plan Des
8(3):295–323
Krishnamurti R (1981) The construction of shapes. Environ Plan B
Plan Des 8(1):5–40
Krishnamurti R (1992) The arithmetic of maximal planes. Environ
Plan B Plan Des 19(4):431–464
Lim S, Prats M, Jowers I, Chase S, Garner S, McKay A (2009) Shape
exploration in design: formalising and supporting a transforma-
tional process. Int J Archit Comput 6(4):415–433
McCormack JP, Cagan J (2006) Curve-based shape matching:
supporting designers’ hierarchies through parametric shape
recognition of arbitrary geometry. Environ Plan B Plan Des
33(4):523–540
McKay A, Jowers I, Chau HH, de Pennington A, Hogg DC (2008)
Computer aided design: an early shape synthesis system. In: Yan
X-T, Eynard B, Ion WJ (eds) Global design to gain a competitive
edge: an holistic and collaborative design approach based on
computational tools. Springer, London, pp 3–12
Pinz A (2005) Object categorization. Found Trends Comput Graph
Vis 1(4):255–353
Prats M, Earl CF (2006) Exploration through drawings in product
design. In: Gero JS (ed) Design computing and cognition ‘06.
Springer, The Netherlands, pp 82–102
Prats M, Garner SW, Lim S, Jowers I, Chase S (2009) Transforming
shape in design: observations from studies of sketching. Des
Stud 30(5):503–520
Rucklidge W (1996) Efficient visual recognition using the Hausdorff
distance. Lecture Notes in Computer Science. Springer, London
Schon DA, Wiggins G (1992) Kinds of seeing and their functions in
designing. Des Stud 13(2):135–156
Stiny G (1991) The algebras of design. Res Eng Des 2(3):171–181
Stiny G (1994) Shape rules—closure, continuity, and emergence.
Environ Plan B Plan Des 21(7):s49–s78
Stiny G (2006) Shape: talking about seeing and doing. MIT Press,
Cambridge
Stiny G, Gips J (1972) Shape grammars and the generative
specification of painting and sculpture. In: Freiman CV (ed)
Information processing 71, North Holland, Amsterdam, pp
1460–1465