
ELSEVIER Signal Processing 63 (1997) 199-210

Parameter estimation for excursion set texture models¹

David J. Nott (a,b), Richard J. Wilson (a,b,*)

(a) Department of Mathematics, University of Queensland, Queensland 4072, Australia

(b) Cooperative Research Centre for Mining Technology and Equipment, P.O. Box 883, Moggill Rd, Kenmore, Queensland 4067, Australia

Received 3 October 1996; received in revised form 27 May 1997

Abstract

We discuss some work which was motivated by the need to synthesize binary images with a structure statistically similar to a given image. Excursion sets of random fields, which are obtained by ‘thresholding’ a random field at some level, have many advantages for this kind of problem. For stationary Gaussian random fields, excursion sets can be easily simulated and the global properties of the simulated images can be directly related to the model parameters. One barrier to the wider application of excursion set texture models is the lack of statistically efficient methods of parameter estimation. We discuss here the use of the EM algorithm for this problem. Markov chain Monte Carlo techniques are used to implement a stochastic version of the EM procedure. A further modification of the algorithm, which introduces no approximations, enables the method to be implemented in problems of typical size. The techniques can be extended to parameter estimation from contour data at more than one level, to the Bayesian analysis of excursion sets, and to the modelling of binary and categorical time series. © 1997 Elsevier Science B.V. All rights reserved.


* Corresponding author. E-mail: [email protected].

¹ Supported in part by ONR grant N00014-93-1-0043.

0165-1684/97/$17.00 © 1997 Elsevier Science B.V. All rights reserved

PII S0165-1684(97)00156-4

200 D.J. Nott, R.J. Wilson / Signal Processing 63 (1997) 199-210


Keywords: Excursion set; Random field; EM algorithm; Stochastic EM algorithm; Texture modelling; Circulant embedding; Markov chain Monte Carlo technique

1. Introduction

Workers in image processing may talk informally of ‘texture modelling’, although the word texture has no commonly accepted technical meaning. A summary of proposed definitions is given by Haindl [13]; a pragmatic approach is to define texture to mean a realization of a random field or random set. We discuss some work which was motivated by a problem in mining engineering, where there was a need to synthesize binary images with a structure statistically similar to a given image. Excursion sets of random fields, which are obtained by ‘thresholding’ a random field at some fixed level, are statistical models for binary images with many advantages for texture synthesis. Given an observed binary image, we assume that it is the excursion set of a stationary Gaussian random field. We estimate the covariance function of the random field and then ‘threshold’ simulations from the fitted Gaussian random field model to obtain simulated binary textures. Stationary Gaussian random fields (and hence their excursion sets) can be easily simulated, and in general the global geometrical properties of excursion sets (such as the behaviour of size distributions) can be directly related to the model parameters. For other common binary texture models, such as binary Markov random fields, the parameters of the model have a conditional interpretation and simulation may be more delicate. Haindl [13] gives a review of commonly used statistical models for texture synthesis.

In the following, we will deal with a stationary Gaussian random field $\{Y(t) : t \in \mathbb{R}^2\}$ with zero mean and variance one. The distribution of $Y(t)$ is completely specified by its covariance function $R(t)$ (see for instance [2]). The $u$-level excursion set of $Y$, $A_u(Y)$, is simply the set of points where $Y$ is not less than $u$,

$$A_u(Y) = \{t \in \mathbb{R}^2 : Y(t) \ge u\}.$$

Probability theorists have studied the geometry of excursion sets extensively; the monograph of Adler [2] describes the developments up to 1982. However, there are very few papers dealing with the statistical theory of excursion sets. Some notable exceptions are Adler [1], Worsley [22,23], Siegmund and Worsley [19] and Lindgren and Rychlik [16]. Certain generalizations of excursion set models such as the truncated Gaussian model [17] and the substitution random functions of Lantuejoul [15] are also of interest in geostatistics. Parameter estimation in these models typically involves exploiting functional relationships between the covariance function of a Gaussian random field and the covariance functions of certain indicator random fields. A similar approach to parameter estimation can be devised for excursion sets. To explain the idea we need some more notation. Define a binary random field $B_u(t)$ from $A_u(Y)$ by

$$B_u(t) = \begin{cases} 1, & \text{if } t \in A_u(Y), \\ 0, & \text{otherwise.} \end{cases}$$

If we write $R_u(t)$ for the covariance function of $B_u$, then under the assumptions we have made on $Y$ it can be shown (see for instance [4], p. 27) that

$$R_u(t) = \frac{1}{2\pi} \int_0^{R(t)} \frac{\exp(-u^2/(1+z))}{\sqrt{1-z^2}} \, dz. \qquad (1)$$

When $u = 0$, this expression simplifies to

$$R_0(t) = \frac{1}{2\pi} \arcsin(R(t)). \qquad (2)$$

Eq. (1) is monotone, and hence invertible. Hence the distribution of the random field $Y(t)$ can be determined from $u$ and $R_u(t)$ (i.e. from a knowledge of the second-order statistical properties of $B_u(t)$). A method of estimating $R(t)$ using Eq. (1) immediately suggests itself. If $Y$ is ergodic and the domain of observation increases appropriately, then the observed areal fraction, $\hat{p}$ say, converges almost surely to $1 - \Phi(u)$ (where $\Phi$ denotes the standard normal distribution function). Hence we can estimate $u$ by solving

$$\hat{p} = 1 - \Phi(\hat{u})$$

for $\hat{u}$. We then calculate an estimate $\hat{R}_u(t)$ of $R_u(t)$, and invert $\hat{R}_u(t)$ using Eq. (1) (with $u = \hat{u}$) to estimate $R(t)$. The process of inversion using Eq. (1) does not preserve non-negative definiteness of covariance functions, so that the resulting estimate of $R(t)$ cannot be used for texture simulation. In fact, the estimated covariance matrix for the Gaussian vector corresponding to the observation lattice in the excursion set model can have a significant proportion of negative eigenvalues. One way of overcoming this problem is to adopt some parametric form for the covariance function, $R(t;\theta)$ say (where $\theta \in \mathbb{R}^d$ is some parameter), and to do a ‘least squares’ fit using Eq. (1) as a non-linear regression equation in order to estimate $\theta$. Hence $\theta$ is chosen to match as closely as possible the second-order statistical properties of the observed image. The question of whether the statistical structure of the data is captured in other respects relates to model adequacy.

The above approach to covariance estimation is statistically inefficient, since no account is taken of the correlations between empirical covariance function values. In the following we also adopt a parametric form $R(t;\theta)$ for the covariance function. However, we discuss some more sophisticated likelihood-based methods for estimating $\theta$ which are computationally intensive and may be generalized to some new situations.
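As a rough illustration of the moment-based estimator built on Eq. (1), the following Python sketch estimates the level from the areal fraction and inverts Eq. (1) numerically. The function names, the substitution $z = \sin\phi$, the Gauss-Legendre quadrature and the bisection search are illustrative assumptions and are not taken from the paper.

```python
import numpy as np
from statistics import NormalDist


def estimate_level(binary_image):
    """Solve p_hat = 1 - Phi(u_hat) for u_hat, where p_hat is the areal fraction."""
    p_hat = float(np.mean(binary_image))
    return NormalDist().inv_cdf(1.0 - p_hat)    # requires 0 < p_hat < 1


def indicator_cov(r, u, n_nodes=64):
    """Right-hand side of Eq. (1) for a Gaussian covariance value r in [-1, 1].

    Substituting z = sin(phi) removes the 1/sqrt(1 - z^2) factor, leaving
    (1 / 2 pi) * integral_0^{arcsin r} exp(-u^2 / (1 + sin phi)) d phi.
    """
    nodes, weights = np.polynomial.legendre.leggauss(n_nodes)
    upper = np.arcsin(r)
    phi = 0.5 * upper * (nodes + 1.0)            # map [-1, 1] onto [0, upper]
    integrand = np.exp(-u ** 2 / (1.0 + np.sin(phi)))
    return 0.5 * upper * np.sum(weights * integrand) / (2.0 * np.pi)


def invert_indicator_cov(r_u, u, tol=1e-10):
    """Recover r from r_u = indicator_cov(r, u) by bisection (Eq. (1) is monotone)."""
    lo, hi = -1.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if indicator_cov(mid, u) < r_u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With $u = 0$ the quadrature reduces to the arcsine formula of Eq. (2), which gives a simple check of the implementation. As noted in the text, covariances obtained lag by lag in this way need not form a non-negative definite function.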

First we establish some more notation. Let $\mathcal{L}$ be the rectangular integer lattice

$$\mathcal{L} = \{(j,k) : 0 \le j < N_1,\ 0 \le k < N_2\}.$$

Let $N = N_1 N_2$ and write $s_i$, $1 \le i \le N$, for some ordering of the sites of $\mathcal{L}$. Also, write $Y(\mathcal{L})$ (or simply $Y$ where there is no confusion) for the $N$-dimensional random vector with components $Y_i = Y(s_i)$ for $i = 1, \ldots, N$. A binary image on $\mathcal{L}$ is a sequence of binary variables $B = (B_1, \ldots, B_N)$, where $B_i$ is associated with the $i$th site $s_i$ of $\mathcal{L}$. In an excursion set model for $B$ we assume that

$$B_i = \begin{cases} 1, & \text{if } Y_i \ge u, \\ 0, & \text{otherwise.} \end{cases}$$

We will describe methods for estimating $\theta$ from an observation of $B$. One natural approach to this problem is to use the EM algorithm (Section 2) with $B$ as the incomplete data and $Y$ as the complete data. However, certain difficulties are encountered in the implementation of this idea.

The first difficulty is that the E step in the EM algorithm is hard to perform, a problem we overcome using Markov chain Monte Carlo methods. A further obstacle is the computational burden of the function evaluations which are needed to perform the M step of the EM procedure. To overcome this, we describe a modification of our EM algorithm which introduces no approximations and involves changing our choice for the complete data. The basic idea is given in Section 3, where we explain a method for embedding a stationary Gaussian lattice process into a toroidal process with a likelihood which can be evaluated by fast Fourier transform (FFT). This ‘circulant embedding’ construction has been used (together with the EM algorithm) for estimation of the covariance matrix in stationary Gaussian time series [8] and for the simulation of stationary Gaussian processes [6,11,21]. By using a fictitious toroidal process for the complete data we can perform the function evaluations needed in the M step via FFT. This allows our Monte Carlo EM algorithm to be used in problems of typical size.

In Section 4 we discuss some modelling issues and describe a small simulation study comparing the performance of a stochastic EM algorithm to that of the least squares procedure discussed above. The stochastic EM estimator appears to have superior properties. Some extensions are discussed in Section 5. In particular, our methods can be applied to certain non-Gaussian fields, to parameter estimation from contour data at more than one level, to the Bayesian analysis of excursion sets, and to the modelling of binary and categorical time series.

2. A Monte Carlo EM algorithm for excursion sets

A natural approach to estimating $\theta$ from an observation of $B$ is to use the EM algorithm, which we now describe. The EM algorithm [9] is an iterative technique for calculating maximum likelihood estimates in missing data problems. EM stands for ‘expectation/maximization’, and the name describes the steps which are performed at each iterate of the procedure.

Suppose that we observe data $z$ having a distribution depending on a parameter vector $\theta$, and that the calculation of maximum likelihood estimates from $z$ is difficult. Suppose that $z$ is a function of unobserved data $x$, which also has a distribution depending on $\theta$. That is, the observed data $z$ can be recovered from the unobserved data $x$ (which contains some extra information). We call $z$ the incomplete data, and $x$ the complete data, and write $L(z;\theta)$ and $L(x;\theta)$ for their respective likelihoods. The EM algorithm produces a sequence of estimates $\{\theta^{(n)}\}$ starting from some initial estimate $\theta^{(1)}$. At the $n$th iteration of the algorithm, we generate $\theta^{(n+1)}$ from $\theta^{(n)}$ by a two step procedure: in the expectation step (E step) we calculate

$$Q(\theta \mid \theta^{(n)}) = E_{\theta^{(n)}}(\log L(x;\theta) \mid z) \qquad (3)$$

and then we maximize $Q(\theta \mid \theta^{(n)})$ with respect to $\theta$ (maximization or M step) to obtain $\theta^{(n+1)}$. At each step of the algorithm we are maximizing our best approximation to the complete data log-likelihood in the sense of a conditional expectation given the incomplete data and the current fit. An EM iteration never decreases the incomplete data log-likelihood, and the maximum likelihood estimate is a fixed point of the EM iteration. The EM algorithm finds many applications to problems involving missing data in image processing. For instance, it has been used in approaches to image reconstruction based on hidden Markov models [3,7]. The attraction of the EM algorithm often lies in the fact that the complete data log-likelihood is easily computed, whereas the incomplete data log-likelihood may be intractable. Computations in the EM algorithm involve only the complete data log-likelihood.
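The statement that an EM iteration never decreases the incomplete data log-likelihood follows from a standard argument, sketched here in the notation of Eq. (3). The function $H$ and the conditional density $p(x \mid z;\theta)$ are auxiliary symbols introduced only for this sketch, and it is assumed that such a conditional density exists.

```latex
% p(x | z; \theta) = L(x;\theta) / L(z;\theta) for complete data x consistent with z,
% so \log L(z;\theta) = \log L(x;\theta) - \log p(x | z;\theta).
\begin{align*}
\log L(z;\theta) &= Q(\theta \mid \theta^{(n)}) - H(\theta \mid \theta^{(n)}),
  \qquad
  H(\theta \mid \theta^{(n)}) = E_{\theta^{(n)}}\bigl(\log p(x \mid z;\theta) \mid z\bigr), \\
H(\theta \mid \theta^{(n)}) &\le H(\theta^{(n)} \mid \theta^{(n)})
  \quad \text{for all } \theta \text{ (Jensen's inequality)}, \\
\log L(z;\theta^{(n+1)}) - \log L(z;\theta^{(n)})
  &\ge Q(\theta^{(n+1)} \mid \theta^{(n)}) - Q(\theta^{(n)} \mid \theta^{(n)}) \ge 0 .
\end{align*}
```

The final inequality holds because $\theta^{(n+1)}$ maximizes $Q(\cdot \mid \theta^{(n)})$ in the M step.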

It is clear that the estimation of parameters from excursion set data is a situation in which the EM algorithm might be applicable. The incomplete data is $B$, and the missing complete data is $Y$, which is jointly Gaussian. An iterative approach to maximum likelihood via the EM algorithm enables us to use the convenient Gaussian likelihood for $Y$ in calculations. Computing the maximum likelihood estimate directly from $B$ is difficult, since calculation of the likelihood for $B$ involves a very high dimensional integration. In implementing the EM algorithm several problems are encountered. The first problem is that the E step is hard to perform. We overcome this by the use of Markov chain Monte Carlo (MCMC) methods. In particular, the Gibbs sampler can be used for simulating truncated Gaussian vectors [12] so that instead of calculating $Q(\theta \mid \theta^{(n)})$ at the $n$th step of the EM algorithm, we can simulate realizations $y_1, \ldots, y_s$ of $Y \mid B$ with $\theta = \theta^{(n)}$ and use

$$\frac{1}{s} \sum_{i=1}^{s} \log L(y_i;\theta) \qquad (4)$$

as an approximation to $Q(\theta \mid \theta^{(n)})$. This strategy has been called the Monte Carlo EM algorithm [20]. The case where $s = 1$ is referred to as the stochastic EM (SEM) algorithm. Diebolt and Ip [10] give a recent review of the SEM algorithm and its applications. The SEM algorithm generates a random sequence of estimates, and it is hoped that the mean of the stationary distribution of the sequence is a good estimator of $\theta$. Large values of $s$ in Eq. (4) allow us to approximate the true EM trajectory closely, but the greater variability of the SEM sequence has been observed to confer some advantages: it allows us to escape from local modes, and the problem of slow convergence of the EM algorithm in the neighbourhood of a local mode is avoided. In Section 4 we consider only the SEM algorithm.
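To make the simulation of $Y \mid B$ concrete, here is a minimal single-site Gibbs sampler for a small truncated Gaussian vector. It is a sketch under stated assumptions rather than the authors' implementation: the function names are hypothetical, the precision matrix is obtained by dense inversion (feasible only for tiny examples), and each truncated normal draw uses inverse-CDF sampling, which loses accuracy far in the tails.

```python
import numpy as np
from statistics import NormalDist

_STD = NormalDist()


def truncated_normal(mean, sd, lower, upper, rng):
    """Draw from N(mean, sd^2) restricted to (lower, upper) by inverting the CDF."""
    a = 0.0 if lower == -np.inf else _STD.cdf((lower - mean) / sd)
    b = 1.0 if upper == np.inf else _STD.cdf((upper - mean) / sd)
    p = float(rng.uniform(a, b))
    p = min(max(p, 1e-12), 1.0 - 1e-12)      # keep inv_cdf away from 0 and 1
    return mean + sd * _STD.inv_cdf(p)


def gibbs_truncated_gaussian(C, B, u, n_sweeps, rng):
    """Approximate draw of Y | B for Y ~ N(0, C) with B_i = 1 iff Y_i >= u."""
    P = np.linalg.inv(C)                     # precision matrix (small examples only)
    n = len(B)
    # Start from a point that satisfies every threshold constraint.
    y = np.where(B == 1, u + 0.5, u - 0.5).astype(float)
    for _ in range(n_sweeps):
        for i in range(n):
            cond_var = 1.0 / P[i, i]
            # Conditional mean of Y_i given the rest: -(1/P_ii) * sum_{j != i} P_ij y_j
            cond_mean = -cond_var * (P[i] @ y - P[i, i] * y[i])
            lower, upper = (u, np.inf) if B[i] == 1 else (-np.inf, u)
            y[i] = truncated_normal(cond_mean, np.sqrt(cond_var), lower, upper, rng)
    return y


rng = np.random.default_rng(0)
idx = np.arange(6)
C = 0.6 ** np.abs(idx[:, None] - idx[None, :])   # illustrative covariance matrix
B = np.array([1, 1, 0, 0, 1, 0])
y = gibbs_truncated_gaussian(C, B, u=0.0, n_sweeps=200, rng=rng)
```

In an SEM iteration ($s = 1$), a single draw of this kind would replace the conditional expectation in the E step.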

Implementing the Monte Carlo EM algorithm for excursion sets involves maximizing the approximation (Eq. (4)), which presents some major difficulties. The problem is that if $Y$ is a very high dimensional Gaussian vector, calculating the log-likelihood of $Y$ is extremely difficult. Apart from an additive constant, negative twice the log-likelihood for $Y$ is specified by

$$\log(\det(C)) + y^T C^{-1} y, \qquad (5)$$

where $C = C(\mathcal{L})$ is the covariance matrix of $Y$ and $\det(C)$ is its determinant. Since $Y(t)$ is stationary, the covariance matrix $C$ has some structure: if the sites of $\mathcal{L}$ are appropriately ordered, then $C$ is block Toeplitz with Toeplitz blocks. However, even with all this structure, it is impossible to calculate Eq. (4) in problems of typical size using the current best algorithms, and especially the determinant of $C$ in Eq. (5). To overcome this, we suggest changing the choice of the complete data: we embed $Y$ into a fictitious vector $W = (Y, U)$ which is even larger, but which has a likelihood which is convenient for calculations. We can then use this vector $W$ as the complete data in the EM algorithm. It must be emphasized that no statistical approximations are introduced by this embedding idea.

3. Circulant embedding

We now describe a construction for embedding a stationary Gaussian lattice process into a toroidal process with a likelihood which can be evaluated via FFT. This so-called circulant embedding construction is not new: the circulant embedding idea has been used by Dembo et al. [8] for the estimation of the covariance matrix in stationary Gaussian time series, and by Davies and Harte [6], Dietrich and Newsam [11] and Wood and Chan [21] for the efficient simulation of stationary Gaussian processes.

To aid intuition we describe the construction for a one dimensional process first. Let $\{Y_t, t \in \mathbb{Z}\}$ be a zero mean stationary Gaussian process, and let $Y = (Y_0, \ldots, Y_{N-1})$. The covariance matrix $C$ of $Y$ is a symmetric Toeplitz matrix with first column $(c_0, \ldots, c_{N-1})^T$ say. Now consider a circulant matrix $S$ with first column $(c_0, \ldots, c_{N-1}, c_{N-2}, \ldots, c_1)^T$. Under mild conditions on $Y_t$, $S$ will be positive definite for $N$ large enough [21] and hence we can define a zero mean Gaussian random vector $W$ with covariance matrix $S$. $W$ can be considered to be a stationary process on a circle, whose covariance function is obtained by periodically repeating part of the covariance function of $Y_t$. The first $2N - 2$ values of the covariance function of the periodic process are traced out by the first column of $S$. Clearly $W$ contains a subvector identically distributed as $Y$ (since $C$ is a submatrix of $S$). Furthermore, since $S$ is circulant, it can be diagonalized by a discrete Fourier transform, which can be done efficiently with fast Fourier transform (FFT) algorithms when $N$ is highly composite. This last fact (which is well known and can be deduced as a special case in our later discussion) is the basis for efficient calculation of the likelihood of $W$. Hence we are able to embed $Y$ into a random vector whose likelihood is easily calculated.
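The one-dimensional construction can be checked numerically. The sketch below builds the circulant matrix $S$ from the first column of $C$, confirms that $C$ is its leading submatrix, and compares the eigenvalues obtained from a single FFT with those from a dense eigensolver. The covariance sequence $c_k = 0.5^k$ and the helper names are illustrative assumptions.

```python
import numpy as np


def circulant_first_column(c):
    """Given (c_0, ..., c_{N-1}), return (c_0, ..., c_{N-1}, c_{N-2}, ..., c_1)."""
    c = np.asarray(c, dtype=float)
    return np.concatenate([c, c[-2:0:-1]])


def circulant(col):
    """Dense circulant matrix whose first column is `col`."""
    col = np.asarray(col, dtype=float)
    m = len(col)
    idx = (np.arange(m)[:, None] - np.arange(m)[None, :]) % m
    return col[idx]


N = 8
c = 0.5 ** np.arange(N)                      # illustrative covariances c_k = 0.5^k
lags = np.abs(np.arange(N)[:, None] - np.arange(N)[None, :])
C = c[lags]                                  # symmetric Toeplitz covariance of Y

s = circulant_first_column(c)                # length 2N - 2
S = circulant(s)                             # covariance of the process on a circle

eig_fft = np.fft.fft(s).real                 # eigenvalues of S from one FFT
eig_dense = np.linalg.eigvalsh(S)            # same eigenvalues, computed densely
```

For this particular sequence all eigenvalues are positive, so $S$ is a valid covariance matrix; for other covariance functions the check `eig_fft.min() > 0` may fail, in which case a larger embedding would be needed.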

Now consider the two dimensional situation. Here we will try to embed a two-dimensional stationary Gaussian sequence into an observation of a toroidal process having periodic behaviour in the principal directions of the lattice. Again the toroidal process will have a likelihood which is easily calculated. Suppose we observe the vector $Y = Y(\mathcal{L})$ discussed in Section 1 with covariance matrix $C = C(\mathcal{L})$. Suppose we order the sites of $\mathcal{L}$ as $s_1 = (0,0)$, $s_2 = (1,0)$, $\ldots$, $s_{N_2} = (N_2 - 1, 0)$, $s_{N_2+1} = (0,1)$, $s_{N_2+2} = (1,1)$ and so on. With this ordering of sites, we have that $C$ is block Toeplitz with Toeplitz blocks of size $N_2 \times N_2$. Write $(C^{1T}, \ldots, C^{N_1 T})^T$ for the first block column of $C$, where each $C^i$ is Toeplitz of size $N_2 \times N_2$. (The first block column determines $C$ since $C$ is symmetric.) We will embed $C$ into a block circulant matrix with circulant blocks. There are three different constructions described in the literature for doing this. Dietrich and Newsam [11] describe a construction which is valid when $C$ has symmetric blocks. This will occur, for instance, when the covariance function of $Y(t)$ is isotropic, or more generally when it is invariant under reflections through the $x$ and $y$ axes. However, the off-diagonal blocks of $C$ are not symmetric in general. Wood and Chan [21] give two variations of a construction which applies in the general stationary case. In situations where Dietrich and Newsam’s construction is applicable, all three constructions are equivalent. Suppose that the Toeplitz matrix $C^i$ is specified by

$$C^i = \begin{pmatrix} c^i_0 & c^i_1 & \cdots & c^i_{N_2-1} \\ c^i_{-1} & c^i_0 & \cdots & c^i_{N_2-2} \\ \vdots & \vdots & \ddots & \vdots \\ c^i_{-(N_2-1)} & c^i_{-(N_2-2)} & \cdots & c^i_0 \end{pmatrix}.$$

Define the circulant matrix $S^i$ with first column $(c^i_0, c^i_{-1}, \ldots, c^i_{-(N_2-1)}, 0, c^i_{N_2-1}, \ldots, c^i_1)^T$ and define $S = S(\mathcal{L})$ to be the block circulant matrix with first block column

$$(S^{1T}, \ldots, S^{N_1 T}, 0, S^{N_1}, \ldots, S^2)^T,$$

where $0$ is a $2N_2 \times 2N_2$ matrix of zeros. This choice of $S$ is equivalent to a special case of one of Wood and Chan’s constructions. If $S$ is positive definite, then we can define a zero mean, Gaussian random vector $W$ with covariance matrix $S$. $W$ contains a subvector which is identically distributed as $Y$ (since $C$ is embedded in $S$), and corresponds to an observation of a toroidal process. It cannot be guaranteed that $S(\mathcal{L})$ is positive definite. If $S(\mathcal{L})$ is not positive definite, we can consider a rectangular integer lattice $\mathcal{L}^+ \supseteq \mathcal{L}$ and the matrix $S(\mathcal{L}^+)$, which may be positive definite even when $S(\mathcal{L})$ is not. Wood and Chan [21] provide motivation for doing this, by showing that under weak conditions on $Y(t)$, $S(\mathcal{L}^+)$ will be positive definite if the dimensions of the lattice $\mathcal{L}^+$ are large enough. There may also be computational reasons (related to the efficiency of fast Fourier transform algorithms in calculating likelihoods) for considering the matrix $S(\mathcal{L}^+)$ for some lattice $\mathcal{L}^+ \supseteq \mathcal{L}$.

Since the construction we have just described is rather complex, it will be instructive to consider the case of a $2 \times 2$ lattice. Here we have $N_1 = N_2 = 2$ and $N = 4$, with $s_1 = (0,0)$, $s_2 = (1,0)$, $s_3 = (0,1)$ and $s_4 = (1,1)$. The covariance matrix of $Y$ is a block Toeplitz matrix of size $4 \times 4$, with Toeplitz blocks of size $2 \times 2$. $C$ has the form

$$C = \begin{pmatrix} c_0 & c_1 & k_0 & k_{-1} \\ c_1 & c_0 & k_1 & k_0 \\ k_0 & k_1 & c_0 & c_1 \\ k_{-1} & k_0 & c_1 & c_0 \end{pmatrix},$$

so

$$C^1 = \begin{pmatrix} c_0 & c_1 \\ c_1 & c_0 \end{pmatrix}$$

is symmetric and Toeplitz, and

$$C^2 = \begin{pmatrix} k_0 & k_1 \\ k_{-1} & k_0 \end{pmatrix}$$

is Toeplitz (but not necessarily symmetric). We embed $C^1$ into the circulant matrix $S^1$ with first column $(c_0, c_1, 0, c_1)^T$, and embed $C^2$ into the circulant matrix $S^2$ with first column $(k_0, k_{-1}, 0, k_1)^T$. We define $S$ to be block circulant with first block column $(S^{1T}, S^{2T}, 0, S^2)^T$ where $0$ is a $4 \times 4$ matrix of zeros. By considering the first, second, fifth and sixth columns of $S$ we can see that a random vector with covariance matrix $S$ (assuming $S$ is positive definite) will contain a subvector identically distributed as $Y$.
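The $2 \times 2$ example can be verified directly. The following sketch assembles $S$ from $S^1$ and $S^2$ and extracts the rows and columns numbered 1, 2, 5 and 6. The numerical values chosen for $c_0, c_1, k_0, k_1, k_{-1}$ are arbitrary placeholders, and no claim is made that they give a positive definite $S$.

```python
import numpy as np


def circulant(col):
    """Dense circulant matrix whose first column is `col`."""
    col = np.asarray(col, dtype=float)
    m = len(col)
    idx = (np.arange(m)[:, None] - np.arange(m)[None, :]) % m
    return col[idx]


def block_circulant(blocks):
    """Block circulant matrix whose first block column is the list `blocks`."""
    n = len(blocks)
    return np.block([[blocks[(r - c) % n] for c in range(n)] for r in range(n)])


c0, c1 = 1.0, 0.5                 # placeholder values for c_0, c_1
k0, k1, km1 = 0.4, 0.3, 0.1       # placeholder values for k_0, k_1, k_{-1}

C = np.array([[c0,  c1, k0, km1],
              [c1,  c0, k1, k0],
              [k0,  k1, c0, c1],
              [km1, k0, c1, c0]])

S1 = circulant([c0, c1, 0.0, c1])
S2 = circulant([k0, km1, 0.0, k1])
# First block column (S^{1T}, S^{2T}, 0, S^2)^T, i.e. the stacked blocks
# S^1, S^2, 0 and the transpose of S^2.
S = block_circulant([S1, S2, np.zeros((4, 4)), S2.T])

keep = [0, 1, 4, 5]               # first, second, fifth and sixth rows/columns
sub = S[np.ix_(keep, keep)]       # should reproduce C exactly
```

The extracted submatrix equals $C$, which is the sense in which $C$ is embedded in $S$.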

We have claimed that a zero mean Gaussian random vector with a covariance matrix which is block circulant with circulant blocks has a likelihood which is easy to calculate. To explain why this is the case, we need some more notation. Let $K$ be the block circulant matrix with first block column $(K^{1T}, \ldots, K^{M_1 T})^T$ where each $K^i$, $1 \le i \le M_1$, is circulant of size $M_2 \times M_2$. We write $(k^i_0, \ldots, k^i_{M_2-1})^T$ for the first column of $K^i$. Let $M = M_1 M_2$, and for $1 \le l \le M$ let $k_l$ and $j_l$ be the (unique) pair of integers satisfying

$$l = j_l + 1 + k_l M_1,$$

where $0 \le j_l \le M_1 - 1$ and $0 \le k_l \le M_2 - 1$. The mapping $l \to (j_l, k_l)$ defines a one-to-one correspondence between $\{1, \ldots, M\}$ and $\{(j,k) : 0 \le j \le M_1 - 1,\ 0 \le k \le M_2 - 1\}$. Let $Q$ be the $M \times M$ matrix with $(l,m)$th element

$$\frac{1}{\sqrt{M}} \exp\left[-2\pi i \left(\frac{j_l j_m}{M_1} + \frac{k_l k_m}{M_2}\right)\right].$$

The effect of multiplying a vector by $Q$ can be described in terms of two-dimensional discrete Fourier transforms. In particular, if $z \in \mathbb{C}^M$ is an $M$-dimensional vector of complex variables, and $a = Qz$, then the two-dimensional sequence

$$\{a_{j+1+kM_1} : 0 \le j \le M_1 - 1,\ 0 \le k \le M_2 - 1\}$$

is simply the two-dimensional discrete Fourier transform of

$$\{z_{j+1+kM_1} : 0 \le j \le M_1 - 1,\ 0 \le k \le M_2 - 1\}.$$

Write $Q^*$ for the conjugate transpose of $Q$, and observe that $QQ^* = I$ (where $I$ denotes the identity matrix). Write $K_1$ for the first column of $K$, and write $\Lambda$ for the diagonal matrix with diagonal elements given by $\Lambda_{ii} = \sqrt{M}(QK_1)_i$. The diagonal elements of $\Lambda$ are in fact the eigenvalues of $K$. We can show (Appendix A) that

$$K = Q \Lambda Q^*.$$

This is an analogue of a well known result for circulant matrices (circulant matrices can be diagonalized by discrete Fourier transform). If $K$ is positive definite, negative twice the log-likelihood of a zero mean, Gaussian random vector with covariance matrix $K$ is specified by Eq. (5) with $C = K$. Since $\det(K) = \det(\Lambda)$ and $K^{-1} = Q \Lambda^{-1} Q^*$, we can simplify Eq. (5) to

$$\log(\det(\Lambda)) + (Q^* y)^* \Lambda^{-1} (Q^* y) = \sum_{i=1}^{M} \left( \log(\Lambda_{ii}) + \frac{|(Q^* y)_i|^2}{\Lambda_{ii}} \right).$$

The elements of $Q^* y$ and the diagonal elements of $\Lambda$ can be calculated by a discrete Fourier transform, which can be done efficiently with fast Fourier transform algorithms provided $M_1$ and $M_2$ are highly composite. Hence calculation of the likelihood for a Gaussian random vector with covariance matrix $K$ can be done efficiently.

Hence in a Monte Carlo EM algorithm for estimating $\theta$ from $B$, we can choose a random vector $W = (Y, U)$ with a block circulant covariance matrix for the complete data, and implement the M step of the Monte Carlo EM procedure efficiently (since the likelihood of $W$ can be calculated via fast Fourier transform). It must be emphasized that no statistical approximations are introduced into the EM algorithm by the use of this embedding procedure. Furthermore, without the use of circulant embedding, the use of a Monte Carlo EM algorithm could not be contemplated. For even small images consisting of just a few thousand pixels it is not possible to calculate Eq. (5) in a reasonable amount of time with the current best algorithms.
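The spectral form of Eq. (5) for a block circulant covariance matrix with circulant blocks can be checked against a dense computation on a small torus. In this sketch the function names are hypothetical, sites are flattened in NumPy's row-major order (an assumption made for convenience, which differs from the ordering used in the text but does not affect the value of the likelihood), and the separable covariance is chosen only because it is known to be positive definite.

```python
import numpy as np


def neg2_loglik_fft(w, base):
    """Eq. (5) for a zero-mean Gaussian array w on a torus, evaluated by FFT.

    base[j, k] is the covariance between site (0, 0) and site (j, k), i.e. the
    first column of K reshaped to the lattice.
    """
    lam = np.fft.fft2(base).real             # eigenvalues of K
    if lam.min() <= 0:
        raise ValueError("covariance is not positive definite")
    w_hat = np.fft.fft2(w) / np.sqrt(w.size)  # unitary two-dimensional DFT
    return np.sum(np.log(lam)) + np.sum(np.abs(w_hat) ** 2 / lam)


def dense_torus_cov(base):
    """Dense covariance matrix K of the toroidal process (row-major site order)."""
    m1, m2 = base.shape
    j, k = np.meshgrid(np.arange(m1), np.arange(m2), indexing="ij")
    j, k = j.ravel(), k.ravel()
    return base[(j[:, None] - j[None, :]) % m1, (k[:, None] - k[None, :]) % m2]


rho = 0.5
b1 = rho ** np.array([0, 1, 2, 1])           # wrapped lags on a circle of 4 sites
b2 = rho ** np.array([0, 1, 2, 3, 2, 1])     # wrapped lags on a circle of 6 sites
base = np.outer(b1, b2)

rng = np.random.default_rng(1)
w = rng.standard_normal(base.shape)          # any data array will do for the check

K = dense_torus_cov(base)
dense = np.linalg.slogdet(K)[1] + w.ravel() @ np.linalg.solve(K, w.ravel())
fast = neg2_loglik_fft(w, base)
```

The dense route costs on the order of $M^3$ operations, whereas the FFT route costs on the order of $M \log M$, which is why the embedding makes the M step feasible for images of realistic size.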

4. Implementation and modelling

It is of interest to compare the performance of an SEM algorithm to that of ad hoc least squares estimators such as the one described in Section 1. We discuss here a small simulation study comparing the SEM algorithm with covariance-based least squares fitting.

We consider zero mean, stationary, Gaussian random fields with variance one and a covariance function of the form

$$R(t) = \exp(-\theta(|t_1| + |t_2|)). \qquad (6)$$

The covariance function (Eq. (6)) has convenient Markov properties which make conditional simulation in the Monte Carlo EM algorithm more than usually straightforward [12]. Hence it is convenient for simulation studies.


We generated three test sets, each containing fifty images, by simulating excursion sets at the level $u = 0$ of a zero mean, stationary Gaussian random field on a $32 \times 32$ integer lattice. The three test sets corresponded to three different parameter values ($\theta = 0.5$, $\theta = 1.0$ and $\theta = 1.5$). For each of the images, three hundred iterates of the SEM algorithm were calculated; the first hundred iterates of each sequence were discarded, and the remaining two hundred iterates averaged to obtain an estimate of $\theta$.
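A test image of this kind could be generated as follows. This is only a sketch: the text does not say which simulation method was used for the study, so the circulant embedding sampler below, the minimal torus size $2n - 2$ per axis, and the function name are assumptions.

```python
import numpy as np


def simulate_excursion_set(theta, n=32, u=0.0, rng=None):
    """Threshold one circulant-embedding draw of a field with covariance Eq. (6)."""
    rng = np.random.default_rng() if rng is None else rng
    m = 2 * n - 2                                     # torus size per axis
    lag = np.minimum(np.arange(m), m - np.arange(m))  # wrapped lags 0, 1, ..., n-1, ..., 1
    base = np.exp(-theta * (lag[:, None] + lag[None, :]))
    lam = np.fft.fft2(base).real                      # eigenvalues of the torus covariance
    if lam.min() < -1e-10:
        raise ValueError("embedding is not non-negative definite; enlarge the torus")
    lam = np.clip(lam, 0.0, None)                     # remove round-off negatives
    noise = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
    # The real part of the transform has covariance `base` on the torus.
    field = np.fft.fft2(np.sqrt(lam / (m * m)) * noise).real
    y = field[:n, :n]                                 # sub-block distributed as Y
    return (y >= u).astype(int), y


image, latent = simulate_excursion_set(0.5, rng=np.random.default_rng(0))
```

Only `image` would be treated as observed in the estimation problem; `latent` is returned here purely so that the thresholding step can be inspected.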

For comparison, we estimated $\theta$ by a least squares procedure which we now describe. Observe that in Eq. (6) $R(0,t) = R(t,0)$. We call $R(t,0)$ and $R(0,t)$ the covariance functions in the horizontal and vertical directions, respectively. For each image in the test sets, we estimated the covariances in the horizontal and vertical directions for the excursion sets, averaged these estimates, and then inverted this final estimate using the procedure discussed in Section 1. We then did a least squares fit to the covariance function (Eq. (6)) in the horizontal direction in order to estimate $\theta$. The estimates obtained were not particularly sensitive to the design points used in the least squares fitting. Fig. 1 shows kernel density estimates of the distributions of the SEM and least squares estimators for the three different values of $\theta$ on the basis of the simulated test sets. Selection of the bandwidth was based on a rule of thumb due to Scott [18].

Fig. 1. Kernel estimates of densities of SEM and least squares estimators from test sets of fifty simulated images. (a) SEM estimator, $\theta = 0.5$; (b) least squares estimator, $\theta = 0.5$; (c) SEM estimator, $\theta = 1.0$; (d) least squares estimator, $\theta = 1.0$; (e) SEM estimator, $\theta = 1.5$; (f) least squares estimator, $\theta = 1.5$.

The SEM estimator appears to have superior properties. It seems to be less biased with a smaller variance. From the three test sets estimates of the bias for the stochastic EM estimator were 0.014 ($\theta = 0.5$), $-0.001$ ($\theta = 1.0$) and 0.021 ($\theta = 1.5$). The corresponding estimates of the bias for the least squares estimator were 0.082, 0.07 and 0.085 for $\theta = 0.5$, 1.0 and 1.5 respectively. Estimated standard errors of the estimators were 0.052, 0.091 and 0.192 (stochastic EM) and 0.104, 0.139 and 0.25 (least squares) for $\theta = 0.5$, 1.0 and 1.5, respectively. The estimated biases and standard errors can be used to estimate the relative efficiency of the SEM estimator with respect to the least squares estimator. We have estimated relative efficiencies of 6.05, 2.92 and 1.87 for $\theta = 0.5$, 1.0 and 1.5, respectively. The large gain in relative efficiency when $\theta = 0.5$ is due mainly to the bias of the least squares estimator. The benefits of using the SEM procedure seem to increase with increasing spatial dependence.

Despite their less than optimal statistical properties, ad hoc least squares methods of estimation are still useful. The computational effort required for their implementation is minimal. Furthermore, when the validity of the model is very doubtful, they can be used to ensure that a model reproduces certain chosen aspects of the data well. Of course, we can also use ad hoc estimators for generating a starting point in the stochastic EM algorithm.
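The relative efficiencies quoted above can be reproduced from the reported biases and standard errors if relative efficiency is taken to be the ratio of estimated mean squared errors (squared bias plus squared standard error). The text does not state this definition explicitly, so it is an assumption here, although it matches the reported figures to two decimal places.

```python
# Reported estimates for theta = 0.5, 1.0 and 1.5.
bias_sem = [0.014, -0.001, 0.021]
se_sem = [0.052, 0.091, 0.192]
bias_ls = [0.082, 0.07, 0.085]
se_ls = [0.104, 0.139, 0.25]


def mse(bias, se):
    """Estimated mean squared error: squared bias plus squared standard error."""
    return bias ** 2 + se ** 2


# Efficiency of SEM relative to least squares: MSE(least squares) / MSE(SEM).
rel_eff = [
    round(mse(bl, sl) / mse(bs, ss), 2)
    for bs, ss, bl, sl in zip(bias_sem, se_sem, bias_ls, se_ls)
]
# rel_eff -> [6.05, 2.92, 1.87]
```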

We discuss now a few modelling issues. The covariance function (Eq. (6)), while convenient for simulation studies, may not be useful for texture modelling. Cressie [5] (Section 2.5.1) gives a list of covariance function models which are used in geostatistics, and which can be used for the generation of excursion sets. The covariance functions listed below are just a few of the possibilities. Here, as before, $\theta \in \mathbb{R}^d$ is some vector of parameters.

$$R(t) = \exp(-\theta_1 \|t\|^{\theta_2}) \quad (d = 2,\ \theta_1 > 0,\ 0 < \theta_2 \le 2), \qquad (7)$$

$$R(t) = \begin{cases} 1 - \frac{3}{2}(\|t\|/\theta_1) + \frac{1}{2}(\|t\|^3/\theta_1^3), & \text{if } \|t\| \le \theta_1, \\ 0, & \text{otherwise,} \end{cases} \qquad (8)$$

$$R(t) = \theta_1 \|t\| K_1(\theta_1 \|t\|), \qquad (10)$$

where $d = 1$, and $K_1(\cdot)$ denotes the modified Bessel function of the second kind.

We will refer to Eqs. (7)-(10) as the exponential, spherical, rational quadratic and Whittle covariance functions, respectively.

Fig. 2 shows simulated excursion sets from some of these covariance function models at the level $u = 0.5$. By varying $u$ we can change the areal fraction of the simulations. Excursion set texture models are extremely flexible, and in general the parameters of the covariance function can be directly related to the global properties of the simulated textures (such as the behaviour of size distributions). Simulation of stationary Gaussian random fields (and hence their excursion sets) is easily done (see, for instance, [5] (Section 3.6.1), [11] and [21]).

Fig. 2. Simulated excursion sets with $u = 0.5$. (a) Exponential model with $\theta_1 = 0.02$, $\theta_2 = 1$; (b) exponential model with $\theta_1 = 0.02$, $\theta_2 = 2$; (c) rational quadratic model with $\theta_1 = 8$, $\theta_2 = 0.5$; (d) rational quadratic model with $\theta_1 = 12$, $\theta_2 = 2$; (e) Whittle model with $\theta_1 = 0.3$; (f) spherical model with $\theta_1 = 40$.

We have not yet discussed the question of model selection. That is, when confronted with binary image data, how do we select a covariance function model? One useful approach is to consider the estimate of the random field covariance discussed in Section 1. Although, as we have seen, this estimate is not a promising basis for statistically efficient parameter estimation, it is still a useful exploratory tool which may help us choose between competing covariance function models. We have also not yet discussed the question of model validation. Monte Carlo goodness of fit tests would appear to be a promising general approach.

5. Extensions

The methods that we have described can be used for estimating parameters in random fields from contour data. Let $k > 2$ and let $-\infty = u_0 < u_1 < \cdots < u_k = \infty$ be a series of levels. Suppose that we observe integer valued random variables $Z = (Z_1, \ldots, Z_n)$ satisfying

$$Z_i = m \quad \text{if } u_m < Y_i < u_{m+1}.$$

The random vector $Z$ contains information about the contours of $Y$ at more than one level. The Gibbs sampler can be used to simulate $W \mid Z$, where $W = (Y, U)$ is a Gaussian random vector with a block circulant covariance matrix and circulant blocks. Hence an SEM algorithm can be implemented for estimating $\theta$ from $Z$. Contours of known moving levels can also be considered; this corresponds to the so-called truncated Gaussian model [17].
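The mapping from field values to contour labels can be sketched in a few lines. The function name is hypothetical, and only the finite levels $u_1, \ldots, u_{k-1}$ are passed in, since $u_0 = -\infty$ and $u_k = \infty$ are implicit. A value exactly equal to a level is assigned to the lower category here, which is an arbitrary choice on an event of probability zero.

```python
from bisect import bisect_left


def contour_labels(y, levels):
    """Return Z_i = m when u_m < Y_i < u_{m+1}; levels must be sorted."""
    # bisect_left counts how many finite levels lie strictly below the value.
    return [bisect_left(levels, value) for value in y]


print(contour_labels([-1.2, 0.1, 0.7, 2.5], [0.0, 0.5, 1.0]))
```

With a single level the labels reduce to the indicator of the excursion set, so the binary case is a special case of this scheme.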

So far our discussion has dealt with two-dimensional random fields. Excursions or contours of one-dimensional Gaussian processes can be used to model binary and categorical time series. Binary time series models based on excursions have been considered by Kedem [14]. The modelling of three-dimensional images is also clearly a possibility, although the computational problems are acute.

There is also the possibility of considering parameter estimation for excursion sets and contours of a certain class of non-Gaussian random fields. The idea is to define a random field by a known function of Gaussian component fields, and to use the values of the Gaussian component fields as the complete data in the EM algorithm. More precisely, suppose that $f$ is a known function, that $Y^1(t), \ldots, Y^n(t)$ are independent, identically distributed, zero mean, stationary Gaussian random fields, and that

$$Y(t) = f(Y^1(t), \ldots, Y^n(t)). \tag{11}$$

Let $Y^i$, $i = 1, \ldots, n$, be the vector with $j$th component $Y^i_j = Y^i(s_j)$, and write $W^i$ for some circulant embedding of $Y^i$. We write $W = (W^1, \ldots, W^n)$, and consider $W$ as the complete data in an EM algorithm for estimating $\theta$ from $B$ (where $B$ is defined as before, but now $Y$ is given by Eq. (11)). Conditional simulations of $W \mid B$ can still be performed with the Gibbs sampler, so that we can approximate the expected log likelihood for $W$ in the EM algorithm. Performing the M step is not difficult, since the likelihood for $W$ is a product of $n$ separate Gaussian likelihoods, each of which can be evaluated via FFT. For models of this transformed Gaussian variety we might expect convergence in the EM algorithm to be slow, since there is a considerable amount of missing data. Also, for some choices of $f$, simple Markov chain Monte Carlo algorithms for performing the conditional simulations may experience problems (for instance, for some choices of $f$ the distribution of $W \mid B$ may be highly multimodal). But in principle, we can model binary images with excursion sets defined by known functions of Gaussian component fields. We point out a few special cases. If we choose
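To make the remark about evaluating each Gaussian likelihood via FFT concrete, here is a minimal sketch for a single zero mean vector with a circulant covariance matrix. It is one-dimensional for brevity (a block circulant matrix with circulant blocks would use a two-dimensional FFT instead), and the function name and interface are assumptions. The eigenvalues of the circulant matrix are the FFT of its first column, so both the log determinant and the quadratic form come at FFT cost.

```python
import numpy as np


def circulant_gaussian_loglik(w, c):
    """Log density of w ~ N(0, K), K circulant with symmetric first column c."""
    M = len(w)
    lam = np.fft.fft(c).real                      # eigenvalues of K
    if np.any(lam <= 0):
        raise ValueError("circulant matrix is not positive definite")
    w_hat = np.fft.fft(w)
    quad = np.sum(np.abs(w_hat) ** 2 / lam) / M   # equals w' K^{-1} w
    return -0.5 * (M * np.log(2 * np.pi) + np.sum(np.log(lam)) + quad)


M = 16
lags = np.minimum(np.arange(M), M - np.arange(M))  # wrapped lags
c = np.exp(-0.5 * lags)                            # illustrative covariance
w = np.random.default_rng(0).standard_normal(M)
print(circulant_gaussian_loglik(w, c))
```

For the product of $n$ likelihoods described above, the complete-data log likelihood would be the sum of this quantity over the component embeddings.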

$$f(y_1, \ldots, y_n) = \sum_{i=1}^{n} y_i^2,$$

then $Y(t)$ is a $\chi^2$ random field. Another interesting choice of $f$ is

$$f(y_1, \ldots, y_n) = \max_{1 \le i \le n} y_i.$$

In this case, observe that $t \in A_u(Y)$ if and only if $\max_{1 \le i \le n} Y^i(t) \ge u$, which occurs if and only if $t \in A_u(Y^i)$ for some $1 \le i \le n$. So

$$A_u(Y) = \bigcup_{i=1}^{n} A_u(Y^i).$$

That is, the excursion sets of $Y$ are unions of excursion sets of the component fields $Y^i$.
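Both special cases are easy to check numerically. The sketch below uses white-noise arrays as stand-ins for the Gaussian component fields, purely for illustration; the helper names are hypothetical. It confirms that thresholding the pointwise maximum gives the union of the component excursion sets.

```python
import numpy as np


def chi2_field(fields):
    """Pointwise sum of squares of the component fields (first axis)."""
    return (fields ** 2).sum(axis=0)


def max_field(fields):
    """Pointwise maximum of the component fields (first axis)."""
    return fields.max(axis=0)


def excursion_set(y, u):
    """Indicator image of the set where y >= u."""
    return y >= u


rng = np.random.default_rng(1)
fields = rng.standard_normal((3, 32, 32))   # stand-ins, not correlated fields
u = 0.5
union = np.any([excursion_set(f, u) for f in fields], axis=0)
assert np.array_equal(excursion_set(max_field(fields), u), union)
```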

There is also the possibility of a Bayesian analysis of binary images with excursion set models. If we assume a prior distribution $p(\theta)$ for the parameter $\theta$, we can sample from the joint posterior distribution of $(W, \theta) \mid B$ using standard MCMC methods. This can form the basis for a Bayesian analysis of excursion sets.

Work on the extensions we have discussed is currently in progress, the results of which will be reported elsewhere.

6. Conclusion

We have discussed a method of parameter estimation for excursion set and contour data for Gaussian (and certain transformed Gaussian) processes and fields. One use of the methods is in texture modelling, where excursion sets provide a useful class of binary texture models. Two advantages of excursion set texture models are that convenient estimation methods are available, and that the global properties of the simulated textures can be fairly directly related to the model parameters.

Acknowledgements

We acknowledge the support of the Cooperative Research Centre for Mining, Technology and Equipment through a project directed by Dr. G.J. Lyman of the Julius Kruttschnitt Mineral Research Centre, Isles Road, Indooroopilly, Queensland 4068, Australia. The first author thanks Lund University and the University of North Carolina at Chapel Hill, where parts of this work were completed.

Appendix A.

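In the notation of Section 3, we wish to prove that $K = Q \Lambda Q^*$. If $p, q \in \{1, \ldots, M\}$, then

$$(Q \Lambda Q^*)_{pq} = \sum_{m=1}^{M} \Lambda_{mm} Q_{pm} \overline{Q}_{qm} = \frac{1}{M} \sum_{m=1}^{M} \Lambda_{mm} \exp\left[-2\pi i \left(\frac{j_m (j_p - j_q)}{M_1} + \frac{k_m (k_p - k_q)}{M_2}\right)\right].$$

Using the definition of $\Lambda_{mm}$, we have

$$(Q \Lambda Q^*)_{pq} = \frac{1}{M} \sum_{l=1}^{M} K_{l,1} \sum_{j_m=0}^{M_1-1} \exp\left[-\frac{2\pi i\, j_m (j_l + j_p - j_q)}{M_1}\right] \sum_{k_m=0}^{M_2-1} \exp\left[-\frac{2\pi i\, k_m (k_l + k_p - k_q)}{M_2}\right].$$

Both the inner sums are geometric series, which are easily calculated. We have that

$$\sum_{j_m=0}^{M_1-1} \exp\left[-\frac{2\pi i\, j_m (j_l + j_p - j_q)}{M_1}\right] = \begin{cases} M_1 & \text{if } j_l = j_q - j_p \pmod{M_1}, \\ 0 & \text{otherwise}, \end{cases}$$

and

$$\sum_{k_m=0}^{M_2-1} \exp\left[-\frac{2\pi i\, k_m (k_l + k_p - k_q)}{M_2}\right] = \begin{cases} M_2 & \text{if } k_l = k_q - k_p \pmod{M_2}, \\ 0 & \text{otherwise}. \end{cases}$$

Hence

$$(Q \Lambda Q^*)_{pq} = K_{s(p,q),1},$$

where $s(p,q) = (j_q - j_p) \bmod M_1 + 1 + M_1 \left[(k_q - k_p) \bmod M_2\right]$. But $K_{s(p,q),1}$ is just equal to $K_{pq}$, since $K$ is block circulant with $M_1$ circulant blocks of size $M_2 \times M_2$.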
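The identity can also be checked numerically on a small example. The sketch below is an illustration under stated assumptions: it uses a row-major index $p = j_p M_2 + k_p$ (which need not match the ordering used in the paper), an exponential covariance on wrapped lags as a stand-in for $K$, and a hypothetical function name. The eigenvalues are taken from a two-dimensional FFT of the base array.

```python
import numpy as np


def block_circulant_check(M1, M2, theta1=0.3):
    """Return K and Q diag(lam) Q* for a symmetric block circulant K."""
    j = np.arange(M1)
    k = np.arange(M2)
    dj = np.minimum(j, M1 - j)[:, None]           # wrapped lags
    dk = np.minimum(k, M2 - k)[None, :]
    c = np.exp(-theta1 * np.hypot(dj, dk))        # symmetric base array
    jj, kk = np.divmod(np.arange(M1 * M2), M2)    # p -> (j_p, k_p), row-major
    K = c[(jj[:, None] - jj[None, :]) % M1, (kk[:, None] - kk[None, :]) % M2]
    F1 = np.exp(-2j * np.pi * np.outer(j, j) / M1)
    F2 = np.exp(-2j * np.pi * np.outer(k, k) / M2)
    Q = np.kron(F1, F2) / np.sqrt(M1 * M2)        # unitary 2-D Fourier matrix
    lam = np.fft.fft2(c).ravel()                  # diagonal of Lambda
    return K, Q @ np.diag(lam) @ Q.conj().T


K, reconstructed = block_circulant_check(4, 3)
print(np.allclose(K, reconstructed))
```

The check relies on the base array being symmetric under negation of the lags, which holds for any covariance function.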

References

[1] R.J. Adler, A spectral moment estimation problem in two dimensions, Biometrika 64 (2) (1977) 367-373.

[2] R.J. Adler, The Geometry of Random Fields, Wiley Pub. Math. Statist., New York, 1982.

[3] B. Chalmond, An iterative Gibbsian technique for reconstruction of m-ary images, Pattern Recognition 22 (6) (1989) 747-761.

[4] H. Cramér, M.R. Leadbetter, Stationary and Related Stochastic Processes, Wiley, New York, 1967.

[5] N.A.C. Cressie, Statistics for Spatial Data (Revised Edition), Wiley Pub. Math. Statist., New York, 1993.

[6] R.B. Davies, D.S. Harte, Tests for Hurst effect, Biometrika 74 (1) (1987) 95-101.

[7] A.R. De Pierro, A modified EM algorithm for penalized likelihood estimation in emission tomography, IEEE Transactions on Medical Imaging 14 (1) (1995) 132-137.

[8] A. Dembo, C.L. Mallows, L.A. Shepp, Embedding nonnegative definite Toeplitz matrices in nonnegative definite circulant matrices, with application to covariance estimation, IEEE Transactions on Information Theory 35 (6) (1989) 1206-1212.

[9] A.P. Dempster, N.M. Laird, D.B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, J. Roy. Statist. Soc. Ser. B 39 (1) (1977) 1-38.

[10] J. Diebolt, E.H.S. Ip, Stochastic EM: method and application, in: W.R. Gilks, S. Richardson, D.J. Spiegelhalter (Eds.), Markov Chain Monte Carlo in Practice, Chapman and Hall, London, 1996, pp. 259-273.

[11] C.R. Dietrich, G.N. Newsam, A fast and exact method for multidimensional Gaussian stochastic simulations, Water Resources Research 29 (8) (1993) 2861-2869.

[12] X. Freulon, Ch. de Fouquet, Conditioning a Gaussian model with inequalities, in: A. Soares (Ed.), Geostatistics Troia '92, Vol. 1, Kluwer Academic Publishers, Dordrecht, 1993, pp. 61-72.

[13] M. Haindl, Texture synthesis, CWI Quarterly 4 (1991) 305-331.

[14] B. Kedem, Binary Time Series, Marcel Dekker, New York, 1980.

[15] Ch. Lantuejoul, Substitution random functions, in: A. Soares (Ed.), Geostatistics Troia '92, Vol. 1, Kluwer Academic Publishers, Dordrecht, 1993, pp. 37-48.

[16] G. Lindgren, I. Rychlik, How reliable are contour curves - confidence sets for level contours, Bernoulli 1 (4) (1995) 301-319.

[17] G. Matheron, H. Beucher, C. de Fouquet, A. Galli, D. Guerillot, C. Ravenne, Conditional simulation of the geometry of fluvio-deltaic reservoirs, SPE paper No. 16753, 62nd Conference of the Society of Petroleum Engineers, Dallas, TX, 1987.

[18] D.W. Scott, Multivariate Density Estimation. Theory, Practice and Visualization, Wiley, New York, 1992.

[19] D.O. Siegmund, K.J. Worsley, Testing for a signal with unknown location and scale in a stationary Gaussian random field, Ann. Statist. 23 (2) (1995) 608-639.

[20] G.C.G. Wei, M.A. Tanner, A Monte Carlo implementation of the EM algorithm and poor man’s data augmentation algorithms, J. Amer. Statist. Assoc. 85 (411) (1990) 699-704.

[21] A.T.A. Wood, G. Chan, Simulation of stationary Gaussian processes in $[0,1]^d$, J. Comp. Graph. Statist. 3 (4) (1994) 409-432.

[22] K.J. Worsley, Local maxima and the expected Euler characteristic of excursion sets of $\chi^2$, $F$ and $t$ fields, Adv. in Appl. Probab. 26 (1994) 13-42.

[23] K.J. Worsley, Estimating the number of peaks in a random field using the Hadwiger characteristic of excursion sets, with applications to medical images, Ann. Statist. 23 (2) (1995) 640-669.
