ENVIRONMETRICS
Environmetrics 2003; 14: 129–147 (DOI: 10.1002/env.571)
Bayesian hierarchical models in ecological studies of health–environment effects
Sylvia Richardson*,† and Nicky Best
Department of Epidemiology and Public Health, Imperial College School of Medicine at St Mary’s, Norfolk Place, London W2 1PG, U.K.
SUMMARY
We describe Bayesian hierarchical models and illustrate their use in epidemiological studies of the effects of environment on health. The framework of Bayesian hierarchical models refers to a generic model building strategy in which unobserved quantities (e.g. statistical parameters, missing or mismeasured data, random effects, etc.) are organized into a small number of discrete levels with logically distinct and scientifically interpretable functions, and probabilistic relationships between them that capture inherent features of the data. It has proved to be successful for analysing many types of complex epidemiological and biomedical data. The general applicability of Bayesian hierarchical models has been enhanced by advances in computational algorithms, notably those belonging to the family of stochastic algorithms based on Markov chain Monte Carlo techniques.
In this article, we review different types of design commonly used in studies of environment and health, give details on how to incorporate the hierarchical structure into the different components of the model (baseline risk, exposure) and discuss the model specification at the different levels of the hierarchy with particular attention to the problem of aggregation (ecological) bias. Copyright © 2003 John Wiley & Sons, Ltd.
key words: aggregation; Bayesian graphical models; ecological bias; exposure measurement error; spatial dependence; time series
1. INTRODUCTION
The study of the effects of environmental exposures on health presents many challenges. Among these
are the practical and methodological problems of collecting suitable data, and the need for
sophisticated statistical models to capture the complex nature of the underlying health–exposure
relationships and to acknowledge the type and quality of the available data. In this article, our aim is to
show how Bayesian hierarchical models provide a unifying framework for modelling such health–
environment relationships, and to illustrate how this framework may be elaborated to incorporate the
additional complexity demanded by non-standard features of the data. These include availability of
data at individual and group level and the associated inconsistencies between individual and aggregate
level disease–exposure relationships, the need for exposure measurement models relating individual
and ambient measures of exposure, and the occurrence of complex patterns of spatial and temporal
dependence in exposure and outcome data and of missing data and unmeasured confounders.
Received 3 December 2001; Accepted 4 March 2002
*Correspondence to: Sylvia Richardson, Department of Epidemiology and Public Health, Imperial College School of Medicine at St Mary’s, Norfolk Place, London W2 1PG, U.K.
†E-mail: [email protected]
Contract/grant sponsor: U.K. Medical Research Council; contract/grant number: G9803841.
Contract/grant sponsor: U.K. Small Area Health Statistics Unit.
The article is organised as follows. In Section 2, we introduce the hierarchical modelling
framework and show how a graphical representation of the local dependence relations between model
quantities can facilitate construction of complex statistical models by combining several sub-graphs
representing different features of the substantive problem. In Section 3, we elaborate on the functional
specification of the relationships implicated in our hierarchical models, focusing on non-standard
aspects relating particularly to environment and health problems. We also draw on a recent example
from the literature to illustrate the practical use of hierarchical models for investigating environment–
health relationships in the context of air pollution. In Section 4, we make some concluding remarks
about the potential benefits of using a hierarchical Bayesian modelling strategy when faced with the
complex problem of studying health–environment effects.
2. BAYESIAN HIERARCHICAL MODELLING FRAMEWORK
The framework of Bayesian hierarchical modelling refers to a generic model building strategy in which
unobserved quantities (e.g. statistical parameters, missing or mismeasured data, random effects, etc.) are
organized into a small number of discrete levels with logically distinct and scientifically interpretable
functions and probabilistic relationships between them that capture inherent features of the data. It has
proved to be successful for analysing many types of complex epidemiological, biomedical, environ-
mental and other data, as illustrated by the case studies in Gilks et al. (1996) and the wide range of
examples in the literature (for example, Morris and Normand (1991); Wakefield (1996); Su et al. (2001);
Rosenberg et al. (1999)). The general applicability of Bayesian hierarchical models has been enhanced
by advances in computational algorithms, notably those belonging to the family of stochastic algorithms
based on Markov chain Monte Carlo (MCMC) techniques (see Green, 2001, for a recent review).
When specifying a hierarchical model, it is often convenient to start with a graphical representation
of the structural assumptions relating the quantities in the model. Such models are commonly referred
to as Bayesian graphical models and have become increasingly popular as ‘building blocks’ for
constructing complex statistical models of biological and other phenomena (Spiegelhalter, 1998).
These graphs consist of nodes representing the variables in the model, linked by directed or undirected
edges representing the dependence relationships between the variables. A graph where all the edges are
directed and where there is no loop is known as a directed acyclic graph (DAG). DAGs have been
extensively used in modelling situations where the relationships between the variables are asymmetric,
from ‘cause’ to ‘effect’. A unique and easily computed joint distribution can be written for any DAG
(see, for example, Lauritzen, 1996). There are other cases where the links between the variables are
symmetric. For example, in spatial epidemiology, one might simply want to formulate that incidence
rates in neighbouring areas are correlated. This type of symmetric dependence is encoded via an
undirected graph, where by convention, the absence of a link between two variables signifies that the
variables are conditionally independent given all the others in the graph. This latter type of graph is
referred to as a conditional independence graph, and extra conditions are required to guarantee that the
joint distribution of all the variables exists. When a particular context involves a mixture of both types of
links, the corresponding graphs are called chain graphs. In our review, we shall mostly display DAGs,
but chain graphs would be required for representing health–environment studies exploring the spatial
variability of outcomes and exposure (see Spiegelhalter et al., 1995, for more details of such graphs).
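As a reminder of why a DAG yields a unique joint distribution, the standard recursive factorization can be written as below. This is a sketch in our own notation: $v_k$ denotes a node and $\mathrm{pa}(v_k)$ its parents in the graph, neither of which appears in the article. The second line applies the same rule to an ecological model of the kind discussed in Section 2.1.3, under the added assumption that the top-level nodes receive independent priors.

```latex
p(v_1, \ldots, v_n) = \prod_{k=1}^{n} p\bigl(v_k \mid \mathrm{pa}(v_k)\bigr)

p(Y_1, \ldots, Y_G, \lambda_{01}, \ldots, \lambda_{0G}, \beta^*)
  = p(\beta^*) \prod_{g=1}^{G} p(\lambda_{0g}) \, p\bigl(Y_g \mid \lambda_{0g}, Z_g, \beta^*, N_g\bigr)
```

Here $Z_g$ and $N_g$ are treated as known constants, so they appear only as conditioning quantities.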
Throughout, we adopt the notation used by Spiegelhalter (1998), denoting variable quantities in the
model (whether observed or not) by circular nodes in the graph, and constants by rectangular nodes;
arrows indicate directed dependencies between nodes. Repetitive structures are known as ‘plates’ and
are represented by large rectangles enclosing the repeated nodes. Figure 1 shows some examples of
DAGs representing various environment–health models; these are discussed in more detail below.
2.1. Study designs
There are three main designs used to study relationships between environmental exposures and health,
depending on the type and resolution of the available data. We will refer to these as individual, semi-
ecological and ecological (or aggregate) designs. We first provide some basic notation and then
describe the generic structure of each design using graphical models.
Let $i = 1, \ldots, N_g$ denote individuals within groups $g = 1, \ldots, G$, where $g$ may index, for example,
time units, geographical (spatial) units, socioeconomic groups, etc. We denote the total population at
risk by $N$; health outcomes by $y_i$ or $Y_g$ for individuals or groups, respectively; the environmental
exposure of interest by $X_i$ if measured at the individual level and by $Z_g$ if measured at the group level;
the regression coefficient measuring the effect of exposure on disease risk by $\beta$; and the baseline risk
(possibly adjusted for known risk factors such as age and sex) by $\lambda_{0i}$ (individual-level) or $\lambda_{0g}$ (group
level). Note that, for simplicity of presentation in this section, we assume that no other concomitant
risk factor data are available, although multiple exposures are easily accommodated by extending $X_i$,
$Z_g$ and $\beta$ to vector notation.
2.1.1. Individual design. This design is appropriate if both exposure and health outcome data are
measured at the individual level on the same set of subjects. Figure 1a shows the graph corresponding
to such a model. The functional form of the local dependence relationships implied by this graph may
be represented as
$$y_i \sim p(f(\lambda_{0i}, X_i, \beta)), \quad i = 1, \ldots, N$$
Figure 1. Graphical models (DAGs) illustrating the structure of basic environment–health relationships for (a) individual design;
(b) semi-ecological design; (c) ecological design. Note that, in order for each graph to represent a full joint distribution, we also
assume that the unknown quantities at the top of each graph (i.e. the regression coefficient and baseline risk parameters, such as $\beta$ and $\lambda_0$) are given appropriate prior
probability distributions (usually chosen to be minimally informative); however, for clarity we suppress these dependencies in
the graphical representation
where $p(\cdot)$ is an appropriate probability distribution (taken to be Bernoulli if $y_i$ is dichotomous) and $f(\cdot)$ is a suitable link function specifying the disease risk $\lambda_i$ for individual $i$ as a function of the baseline risk
$\lambda_{0i}$ and the exposure $X_i$ (see Section 3.1 for further details).
2.1.2. Semi-ecological design. If health outcome data are measured at the individual level, but
measurements of the environmental exposure are only available at the group level (for example,
mean daily or annual ambient pollution concentrations, or spatially averaged exposures), then a semi-
ecological design is appropriate. The baseline risk may be specified at either the individual or group
level. Figure 1b shows the graph corresponding to such a model (illustrated assuming individual-level
baseline risk). Again, the functional form of the local dependence relationships implied by this graph is
$$y_{ig} \sim p(f(\lambda_{0ig}, Z_g, \beta^*)), \quad i = 1, \ldots, N_g;\ g = 1, \ldots, G$$
As before $p(\cdot)$ will be taken as Bernoulli if $y_{ig}$ is dichotomous, and $f(\cdot)$ specifies the form of the
functional dependence of the overall disease risk $\lambda_{ig}$ for individual $i$ in group $g$ on the baseline risk $\lambda_{0ig}$
and exposure $Z_g$. However, note that the coefficient $\beta^*$ representing the effect of exposure $Z_g$ is not
necessarily the same as the coefficient in the individual-level model. The difference will depend on the
relationship between the group-level measured exposure $Z_g$ and the (unobserved) individual exposures
$X_{ig}$ in the group (see Section 3.4 for further details).
2.1.3. Ecological design. The ecological design is appropriate when both the health outcome data and
exposure data are only available at the group level (for example, counts of health events in space or
time, and average exposure measurements over space or time). Figure 1c shows the graph correspond-
ing to such a model. The functional form of the local dependence relationships is given below:
$$Y_g \mid N_g \sim p(f(\lambda_{0g}, Z_g, \beta^*)), \quad g = 1, \ldots, G$$
Here $p(\cdot)$ will typically be taken as binomial if $Y_g$ represent counts of cases or, if the disease is rare, a
Poisson approximation may be assumed. $f(\cdot)$ specifies the form of the functional dependence of the
average disease risk $\lambda_g$ for group $g$ on the average baseline risk $\lambda_{0g}$ and the ecological exposure $Z_g$.
The coefficient $\beta^*$ representing the effect of this exposure is not necessarily the same as in the previous
two models. In this case, the difference depends on a number of factors (Greenland, 1992) including:
(i) the relationship between $Z_g$ and $X_{ig}$ (see above and Section 3.4); (ii) the functional form of the
underlying individual-level relationship $f(\lambda_{0ig}, X_{ig}, \beta)$ between disease risk and exposure (see Section
3.1); (iii) the presence of group-level confounders (which we may attempt to adjust for by modelling
between-group variations in baseline risk; see Section 2.2.1 and 3.2); and (iv) the presence of group-
level effect modifiers (which we may attempt to adjust for by modelling between-group variations in
the exposure coefficient; see Section 2.2.2 and 3.3).
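The ecological design above can be illustrated with a small simulation. The sketch below is not the authors' implementation: it assumes a log-linear form for $f$ (baseline risk times the exponential of coefficient times exposure), and the function names, population sizes, baseline risks and exposures are all hypothetical values chosen for illustration.

```python
import math
import random


def group_risk(baseline, exposure, coef):
    """Assumed log-linear risk: baseline * exp(coef * exposure), capped at 1."""
    return min(1.0, baseline * math.exp(coef * exposure))


def simulate_counts(pop_sizes, baselines, exposures, coef, seed=0):
    """Draw Y_g | N_g ~ Binomial(N_g, risk_g) for each group g."""
    rng = random.Random(seed)
    counts = []
    for n, lam0, z in zip(pop_sizes, baselines, exposures):
        risk = group_risk(lam0, z, coef)
        # A binomial draw as a sum of n independent Bernoulli trials.
        counts.append(sum(rng.random() < risk for _ in range(n)))
    return counts


# Hypothetical group-level data: three groups with known sizes and exposures.
pop_sizes = [2000, 3000, 5000]
baselines = [0.010, 0.012, 0.008]
exposures = [0.0, 1.0, 2.0]

counts = simulate_counts(pop_sizes, baselines, exposures, coef=0.3)
expected = [n * group_risk(l, z, 0.3)
            for n, l, z in zip(pop_sizes, baselines, exposures)]
print(counts, [round(e, 1) for e in expected])
```

Only the group totals and group exposures enter the simulation, which is exactly the information available to an analyst working with an ecological design.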
2.2. Incorporating hierarchical structure
Many statistical applications involve multiple parameters that can be regarded as related or connected
in some way by the structure of the problem. For example, in the present context, we wish to estimate
the parameters $\boldsymbol{\Theta}_k$ of an exposure–response relationship for each of a large number of individuals or
groups (where $\boldsymbol{\Theta}_k$ denotes unknown quantities of interest, including regression parameters, mis-
measured or unobserved risk factors etc., and $k$ is a generic index representing individuals or groups).
The $\boldsymbol{\Theta}_k$ are likely to be similar (but not necessarily identical) across units $k$ since the same exposure
and response are of interest in each case. Adjustment can be made for factors such as age and sex that
are known to influence the relationship, and spatial and/or temporal proximity of the units may suggest
that some of the $\boldsymbol{\Theta}_k$ are likely to be more closely related than others. The dependence among the $\boldsymbol{\Theta}_k$
can be represented by a joint probability model for these parameters. That is, we regard the unknown
quantities $\boldsymbol{\Theta}_k$ as drawn from a (prior) probability distribution that may depend on known covariates
plus additional parameters $\boldsymbol{\phi}$ that represent the overall mean and variances/covariances among the $\boldsymbol{\Theta}_k$.
This crucial step of specifying a probability distribution relating unknown quantities across units leads
to a hierarchical or multilevel model (Good, 1987; Gelman et al., 1995; Breslow, 1990; Goldstein,
1995). The links between parameters implied by this joint distribution enable the estimates of $\boldsymbol{\Theta}_k$ for
any one unit to ‘borrow strength’ from information on related parameters for other units, as well as
depending on the (often sparse) information contained in the data for unit k. This leads to improved
parameter estimates over those obtained from non-hierarchical models that treat each unit indepen-
dently, or pool all units together ignoring between-unit variability (Gelman et al., 1995).
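The idea of 'borrowing strength' can be made concrete with the simplest possible case. The sketch below assumes a normal-normal model in which each unit-level estimate is Gaussian around its own parameter and the parameters share a Gaussian prior whose mean and variance are treated as known; in a full hierarchical analysis these would themselves be given priors and estimated. The function name and the numbers are hypothetical.

```python
def partial_pooling(estimates, sampling_vars, prior_mean, prior_var):
    """Posterior means of unit-level parameters under a normal-normal model.

    Assumes estimate_k ~ N(theta_k, sampling_var_k) and
    theta_k ~ N(prior_mean, prior_var), with prior_mean and prior_var
    treated as known (a simplification made for this illustration).
    """
    pooled = []
    for y, s2 in zip(estimates, sampling_vars):
        weight = prior_var / (prior_var + s2)  # weight on the unit's own data
        pooled.append(weight * y + (1.0 - weight) * prior_mean)
    return pooled


# Hypothetical unit-level estimates with very different precisions.
estimates = [0.10, 0.60, -0.20]
sampling_vars = [0.01, 0.25, 1.00]

print(partial_pooling(estimates, sampling_vars, prior_mean=0.2, prior_var=0.04))
```

Units with sparse data (large sampling variance) are pulled strongly towards the common mean, while well-estimated units are left almost unchanged. This sits between the two non-hierarchical extremes of treating each unit independently and pooling all units together.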
Below we discuss how to extend the above models in hierarchical fashion by considering
elaboration of three different aspects of the basic model. Our focus will be on the full ecological
design, since this is a common design in environmental epidemiology and is the most challenging
statistically. However, similar hierarchical extensions to the individual and semi-ecological designs
are also possible.
2.2.1. Hierarchical modelling of baseline risk. The baseline risk of disease typically depends on the
age and sex of the individual, plus possibly other factors, for example, genetic susceptibility. The
baseline risk will therefore vary from individual to individual, a concept that is often termed ‘frailty’ in
the survival analysis literature. Baseline disease risk is also likely to vary from group to group. For
example, groups defined by small areas or socioeconomic categories are likely to share similar
lifestyle or cultural factors that may lead to between-group variations in $\lambda_{0g}$. Likewise, seasonal
effects may manifest as between-group variations in baseline risk if temporal groupings are used. If
data are available on any of these factors, these may be included as additional observed covariates in
the model. However, typically, many risk factors that influence baseline risk are unobserved, or are not
even known. Nonetheless, it may be reasonable to assume that the baseline risks are similar across
individuals or groups, in which case a hierarchical model specifying a joint probability distribution for
the unknown baseline across units is appropriate.
Figure 2a shows the part of the graph corresponding to a hierarchical model for baseline risk in the
ecological study design. The parameters $\lambda_{0g}$ are usually interpreted as representing the inherent part of
the risk due to the aging process as well as that of unobserved group-level risk factors. However,
Knorr-Held and Besag (1998) show that this model also corresponds to the situation in which baseline
risk varies within groups, providing that the individuals in each of the baseline risk ‘strata’ are
distributed randomly within the group. If there is clustering (e.g. in space or time) of the factors
influencing baseline risk within groups, then inclusion of a group level baseline risk parameter $\lambda_{0g}$
should be viewed as only a rough approximation to the ‘true’ model (see Wakefield et al., 2000, for
further details). Specification of the joint probability distribution $p(\boldsymbol{\Lambda}_0 \mid \boldsymbol{\phi}_\lambda)$, $\boldsymbol{\Lambda}_0 = \{\lambda_{0g}\}_{g=1,\ldots,G}$,
implied by the graph will be discussed in Section 3.2.
2.2.2. Hierarchical modelling of exposure risk. The effect of an environmental exposure on risk of
disease may be modified by a number of factors. For example, lifestyle factors such as diet or poor
housing conditions, or climatic factors such as temperature may interact with the exposure of interest,
leading to non-uniformity of the exposure–response relationship across values of the other factors. This
suggests that our statistical model should allow the coefficient representing risk of disease associated
with the exposure of interest to be different across groups. In general, such a model is not fully
identifiable unless repeated pairs of measurements of the exposure and response are available for each
group. In the absence of replicate data within groups, some authors have attempted to estimate group-
specific coefficients $\beta^*_g$ by ‘borrowing strength’ across groups using a hierarchical model to induce
similarity between the coefficients (King et al., 1999). In this case, the hierarchical model is used, not
only to improve the precision of the parameter estimates, but to make the coefficients identifiable. The
validity of the inference achieved therefore depends crucially on the appropriateness of the joint
probability distribution assumed for the coefficients $\{\beta^*_g\}$. This issue is discussed further in Section 3.3.
A more satisfactory alternative may be to specify a priori subsets of groups $S_j \subset \{g;\ g = 1, \ldots, G\}$, $j = 1, \ldots, J < G$, that share a common value $\beta^*_j$ for the exposure coefficient, and
assume a hierarchical model for these coefficients across subsets. For example, if groups g represent
weeks or months, then it may be reasonable to allow the exposure coefficients to vary across quarters
or years. Such a model also depends crucially on the appropriateness of the prior choice of subsets but
has the advantage that the subset-specific coefficients for the exposure effect are fully identifiable,
since the exposure and response data for each group $g \in S_j$ form replicate pairs of measurements
within the appropriate subset. Figure 2b shows the part of the graph corresponding to such a
hierarchical model for the regression coefficient associated with exposure in the ecological study
design. Specification of the joint distribution $p(\boldsymbol{\beta}^* \mid \boldsymbol{\phi}_\beta)$, $\boldsymbol{\beta}^* = \{\beta^*_j\}_{j=1,\ldots,J}$, implied by the graph is
discussed in Section 3.3.
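The subset structure described above amounts to a lookup from each group to the subset whose coefficient it shares. A minimal sketch follows, assuming weekly groups nested in four 13-week quarters; the function name and the coefficient values are hypothetical and serve only to show the indexing.

```python
def subset_of_week(week):
    """Map week 1..52 to a quarter index 0..3 (13 weeks per quarter)."""
    if not 1 <= week <= 52:
        raise ValueError("week must be between 1 and 52")
    return (week - 1) // 13


# Hypothetical subset-specific exposure coefficients, one per quarter.
coef_by_quarter = [0.02, 0.05, 0.03, 0.01]

# Each weekly group inherits the coefficient of the quarter it belongs to.
coef_for_week = [coef_by_quarter[subset_of_week(w)] for w in range(1, 53)]

# Number of replicate (exposure, response) pairs available per subset.
pairs_per_quarter = [sum(1 for w in range(1, 53) if subset_of_week(w) == j)
                     for j in range(4)]
print(pairs_per_quarter)
```

Because every quarter contains several weekly groups, each subset-specific coefficient is informed by replicate pairs of measurements, which is what makes it identifiable.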
2.2.3. Hierarchical modelling of exposure measurement error. Uncertainties in exposure assessment
remain one of the major constraints on studying environment–health relationships. The complex
pathways linking measurements of a particular environmental pollutant (say) to ambient concentra-
tions, and in turn, to the internally absorbed (biological) dose present severe methodological challenges
(see Briggs, 2000, for a discussion). This problem may be partly addressed by elaborating the graph to
specify a hierarchical model relating the measured and ‘true’ exposures. Focusing on the ecological
Figure 2. Graphical model illustrating hierarchical extensions to the basic ecological model of environment–health relation-
ships: (a) hierarchical model for baseline risk; (b) hierarchical model of risk associated with exposure; (c) hierarchical model of
exposure measurement error. The term $I_{g \in S_j}$ is an indicator variable of whether group $g$ belongs to subset $S_j$
design, we might assume a joint probability distribution $p(\mathbf{Z} \mid \mathbf{Z}^*, \boldsymbol{\phi}_z) \times p(\mathbf{Z}^* \mid \boldsymbol{\phi}^*_z)$ relating the observed
and ‘true’ ecological exposures $\mathbf{Z}$ and $\mathbf{Z}^*$, respectively. Here $\boldsymbol{\phi}_z$ represents the measurement error
variances and covariances, while $\boldsymbol{\phi}^*_z$ represents the mean and (co)variances of the distribution of the
true ecological exposure across groups. Specification of this measurement error model is considered in
Section 3.4, and the corresponding elaboration to the graphical model is shown in Figure 2c.
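To make the two-stage structure of the exposure model concrete, the sketch below simulates true group exposures and then observed exposures given the true ones. It assumes the simplest special case, which the article does not prescribe: independent Gaussian true exposures across groups and independent additive Gaussian measurement error. All names and numerical values are hypothetical.

```python
import random
import statistics


def simulate_exposures(n_groups, true_mean, true_sd, error_sd, seed=0):
    """Draw true exposures Z*, then observed exposures Z given Z*.

    Assumed model (illustration only):
      Z*_g ~ N(true_mean, true_sd^2), independently across groups
      Z_g | Z*_g ~ N(Z*_g, error_sd^2), independent additive error
    """
    rng = random.Random(seed)
    true_z = [rng.gauss(true_mean, true_sd) for _ in range(n_groups)]
    observed_z = [z + rng.gauss(0.0, error_sd) for z in true_z]
    return true_z, observed_z


true_z, observed_z = simulate_exposures(20000, true_mean=10.0,
                                        true_sd=1.0, error_sd=0.5)

# Under these assumptions the observed exposures are more dispersed than the
# true ones: Var(Z) = true_sd^2 + error_sd^2 = 1.25 here.
print(statistics.pvariance(true_z), statistics.pvariance(observed_z))
```

Spatial or temporal dependence in either stage would replace the independent draws by correlated ones.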
3. MODEL SPECIFICATION AT DIFFERENT LEVELS OF THE HIERARCHY
In the previous section, we have discussed the structural assumptions that underpin hierarchical models.
Here, we come to discuss the functional specifications of these models. In the context of studies of
health–environment effects, some functional specifications are straightforward and guided by the usual
considerations for expressing sources of variability in the generalized linear model framework. We
would like to focus our discussion on a number of non-standard aspects: the specification of aggregated
level rather than individual level dose–effect relationships, modelling of between-group variability of
baseline risk and risk due to exposure, and the need to build exposure measurement models. These
functional specifications will involve parameters common to all the groups, represented at the top of
Figure 2 by the nodes $\boldsymbol{\phi}$. These parameters will themselves be given suitable prior distributions,
although we will not discuss these here. It is not our purpose to give details of the Bayesian estimation
framework that allows joint estimation of these parameters together with those which are group
specific. The necessary calculations are often numerically intractable, and so simulation approaches
based on MCMC techniques are generally the only feasible method. The reader is referred to Gilks et al.
(1996) and Green (2001) for further details. All the models discussed here may also be implemented
using the WinBUGS software (Spiegelhalter et al., 2001) for MCMC estimation.
3.1. Aggregating dose–effect relationship from the individual to the group level
Let us consider the following situation. At the individual level, we let $\lambda_{ig}$ be the risk that individual $i$
belonging to group $g$ and having received exposure $X_{ig}$ contracts a disease $D$. For ease of presentation,
we assume a single exposure of interest. However, extension of the models discussed below to the case
of multivariate exposures $X_{ig} = (X_{1ig}, X_{2ig}, \ldots, X_{pig})$ is straightforward. For an exposure $X$ that is
common to all individuals in the group, as is the case in the semi-ecological model defined in Section
2.1.2, we have $X_{ig} = X_g$.
In general, $\lambda_{ig}$ is a function of the baseline risk and the exposure $X_{ig}$: $\lambda_{ig} = f(\lambda_{0ig}, X_{ig}, \beta)$. Note
that here we have allowed the baseline risk to be individual specific. This could refer to intrinsic
differences between individuals, in other words different frailties, and/or include the influence of latent
(unmeasured) covariates. In many situations, the simplifying assumption that the baseline risk is
constant over the group, i.e. that $\lambda_{0ig} = \lambda_{0g}$, is made.
As before, let Yg be the total number of cases in group g composed of Ng individuals i. The group g
could represent, for example, persons living in a specific location, like an electoral ward, over a period
of time. For the present, we do not need to state precisely how the group is defined with respect to
space and/or time.
In order to specify meaningfully a functional relation between the disease rate in the group,
$E(Y_g)/N_g$, and a measure of exposure defined for the group, we investigate how an individual level
model can be aggregated into a full ecological model. Several cases, corresponding to different
functional forms for $f(\cdot)$, have been discussed in the literature. For each case, we express the
aggregation problem in two forms. In the first one (semiparametric), we condition on the observed
values of individual baseline risk and exposures and we simply aggregate by summing the risk $\lambda_{ig}$ over
all the individuals in the group. In the second situation (model-based), we do not assume that we
necessarily have individual level data observed. We model instead the joint distribution of baseline
risk and exposure in the group: $p_g(\lambda_0, X)$. We thus derive the group disease rate by integrating
$f(\lambda_0, X, \beta)$ with respect to that joint distribution. Throughout, we shall make the additional simplifying
assumption that $\lambda_0$ and $X$ are independent, so that their joint distribution is simply the product
$p_g(\lambda_0)\,p_g(X)$, where $p_g(\lambda_0)$ and $p_g(X)$ denote respectively the within-group distribution of $\lambda_0$ and $X$.
Thus, we seek to evaluate
$$E(Y_g)/N_g = \int\!\!\int f(\lambda_0, x, \beta)\, p_g(\lambda_0)\, p_g(x)\, d\lambda_0\, dx \tag{1}$$
for different functional forms of $f(\cdot)$.
3.1.1. Linear dose–effect relationship. Suppose that
$$f(\lambda_{0ig}, X_{ig}, \beta) = \lambda_{0ig} + \beta X_{ig} \tag{2}$$
where $\lambda_{0ig}$ represents the baseline risk of disease for individual $i$ and $\beta$ is the vector of regression
coefficients of interest. Note that this formulation supposes that suitable constraints are operating on
the range of $X_{ig}$ and $\beta$, so that the function specified in (2) still defines a risk. Then, we obtain that
$$E(Y_g) = \sum_{i \in g} \lambda_{0ig} + \beta \sum_{i \in g} X_{ig}$$
We can rewrite this as
$$E(Y_g) = N_g\left(\bar{\lambda}_{0g} + \beta \bar{X}_g\right) \tag{3}$$
where $\bar{\lambda}_{0g}$ represents the mean baseline risk and $\bar{X}_g$ equals a straightforward average of the individual
exposures in group $g$. Thus, because of the linearity of Equation (2), the functional form of the
aggregated dose–effect relationship is the same as the individual level one; in particular, the same
coefficients $\beta$ relate both the individual level and the group exposure to the disease.
Alternatively, integrating (2) as in (1), we get
$$E(Y_g)/N_g = \int \lambda_0\, p_g(\lambda_0)\, d\lambda_0 + \beta \int x\, p_g(x)\, dx = E_g(\lambda_0) + \beta E_g(X) \tag{4}$$
a similar expression to (3).
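The equivalence between summing the individual linear risks and the aggregated form (3) can be checked numerically. The sketch below uses hypothetical individual baseline risks and exposures drawn at random; the function name and the values are ours, chosen so that each individual risk stays between 0 and 1.

```python
import random


def linear_risk(baseline, exposure, beta):
    """Individual-level linear dose-effect relationship, Equation (2)."""
    return baseline + beta * exposure


rng = random.Random(1)
n = 1000          # hypothetical group size N_g
beta = 0.005      # hypothetical coefficient
baselines = [rng.uniform(0.01, 0.02) for _ in range(n)]
exposures = [rng.uniform(0.0, 1.0) for _ in range(n)]

# Expected number of cases obtained by summing individual risks.
summed = sum(linear_risk(l, x, beta) for l, x in zip(baselines, exposures))

# Aggregated form (3): N_g * (mean baseline + beta * mean exposure).
aggregated = n * (sum(baselines) / n + beta * sum(exposures) / n)

print(summed, aggregated)
```

The two quantities agree up to floating point rounding, whatever the within-group distribution of exposure, which is the sense in which the linear model aggregates without bias.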
3.1.2. Exponential dose–effect relationship. This is the most common form of dose–effect relation-
ship used in epidemiology. We assume that
$$f(\lambda_{0ig}, X_{ig}, \beta) = \lambda_{0ig} \exp(\beta X_{ig}) \tag{5}$$
By summing (5) over the group, we obtain
$$E(Y_g) = \sum_{i \in g} \lambda_{0ig} \exp(\beta X_{ig}) \tag{6}$$
an expression that is not a simple function of $\exp(\beta \bar{X}_g)$. With reference to (1), we would obtain
$$E(Y_g)/N_g = \int\!\!\int \lambda_0 \exp(\beta x)\, p_g(\lambda_0)\, p_g(x)\, d\lambda_0\, dx = E_g(\lambda_0)\, E_g(\exp(\beta X)) \tag{7}$$
again an expression that does not involve $E_g(X)$. To progress, different assumptions can be made.
(a) Small regression coefficient
When $\beta$ is small, we can linearize the exponential in (6) to obtain
$$E(Y_g) \approx \sum_{i \in g} \lambda_{0ig} + \beta \left( \sum_{i \in g} \lambda_{0ig} X_{ig} \right)$$
$$E(Y_g) \approx N_g \bar{\lambda}_{0g} \left( 1 + \beta \tilde{X}_g \right)$$
where $\tilde{X}_g = \left( \sum \lambda_{0ig} X_{ig} \right) / \sum \lambda_{0ig}$ is the baseline risk-weighted average exposure. Using again a first
order approximation, we thus obtain
$$E(Y_g)/N_g \approx \bar{\lambda}_{0g} \exp\left( \beta \tilde{X}_g \right) \tag{8}$$
Comparing (5) and (8), we see that, to first order approximation, the regression coefficient $\beta$ also
measures the effect of exposure at the group level, if the average group exposure is appropriately
weighted. The equivalent linearization in (7) leads to
$$E(Y_g)/N_g \approx E_g(\lambda_0) \exp(\beta E_g(X)) \tag{9}$$
which has a form that again corresponds to expression (5). Note that (9) involves the unweighted group
average exposure, $E_g(X)$, rather than the baseline-risk-weighted one because we have assumed
independence between $\lambda_0$ and $X$.
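How small the coefficient must be for the first order approximation (8) to be adequate can be explored numerically. The sketch below compares the exact sum (6) with the approximation based on the baseline risk-weighted average exposure, for one small and one large coefficient. The individual baseline risks and exposures are hypothetical random draws, and the function names are ours.

```python
import math
import random


def exact_cases(baselines, exposures, beta):
    """Exact expected number of cases, Equation (6)."""
    return sum(l * math.exp(beta * x) for l, x in zip(baselines, exposures))


def weighted_exposure(baselines, exposures):
    """Baseline risk-weighted average exposure."""
    return sum(l * x for l, x in zip(baselines, exposures)) / sum(baselines)


def approx_cases(baselines, exposures, beta):
    """First order approximation: N_g times the right-hand side of (8)."""
    n = len(baselines)
    mean_baseline = sum(baselines) / n
    return n * mean_baseline * math.exp(
        beta * weighted_exposure(baselines, exposures))


rng = random.Random(2)
baselines = [rng.uniform(0.005, 0.015) for _ in range(1000)]
exposures = [rng.uniform(0.0, 2.0) for _ in range(1000)]

for beta in (0.05, 1.0):
    exact = exact_cases(baselines, exposures, beta)
    approx = approx_cases(baselines, exposures, beta)
    print(beta, (exact - approx) / exact)  # relative shortfall of (8)
```

By Jensen's inequality the approximation can never exceed the exact value; the shortfall is negligible for the small coefficient and substantial for the large one.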
(b) Normally distributed exposure
If we now consider the case where $X$ is normally distributed with mean $\mu_g$ and variance $\sigma_g^2$, i.e.
$p_g(X) = N(\mu_g, \sigma_g^2)$, then we obtain
$$E_g(\exp(\beta X)) = \exp\left( \beta \mu_g + 0.5\, \beta^2 \sigma_g^2 \right)$$
(Note that, in the case of multivariate exposures, the full variance–covariance matrix of the joint
exposure distribution enters here.) In general, we can evaluate the expression $E_g(\exp(\beta X))$ for any
distribution $p_g(X)$ for which the moment generating function (Laplace transform) is known explicitly,
as was noted in Richardson et al. (1987). Substituting in (7) with $\mu_g = E_g(X)$, we obtain an aggregated
form of (5) in the Gaussian case,
$$E(Y_g)/N_g = E_g(\lambda_0) \exp\left( 0.5\, \beta^2 \sigma_g^2 \right) \exp(\beta E_g(X)) \tag{10}$$
We see that in this case the aggregated form corresponding to (5) is also a function of the within-area
variance of the exposure variable (or the within-area covariance matrix for multiple exposures).
Neglecting this term thus leads to a biased functional relation at the group level. This bias is often
referred to as specification bias (see Richardson and Monfort, 2000). Nevertheless, as discussed by
Plummer and Clayton (1996), there are several situations where this bias can become negligible. This is
the case if either (i) $\sigma_g^2$ is small or (ii) $\sigma_g^2$ hardly varies with $g$ and thus can be absorbed in a constant
term. Essentially, for (i) to hold, the exposure has to be nearly uniform over the group, which is rarely
the case, except if the group is fairly small, whilst it is difficult for (ii) to hold if the mean $\mu_g$ also varies
between groups. Thus, it is important to have some knowledge of the within-area (co)variance of the
exposure(s), and to input this into the specification of the dose–effect relationship at the group level.
Note that our calculations in the Gaussian case can be easily extended to other within-area distributions
for X (Wakefield and Salway, 2001). However, in general, the moment generating function of X will
involve higher order moments of the within-area distribution of X, and these would be hard to estimate.
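The role of the within-group variance term in (10) can be checked by simulation. The sketch below compares a Monte Carlo estimate of the expectation of the exponential of coefficient times exposure with the Gaussian closed form, and contrasts the group rate in (10) with a naive rate that drops the variance term. The function names, the number of draws and the parameter values are hypothetical.

```python
import math
import random


def mgf_normal(beta, mu, sigma2):
    """Closed form of E[exp(beta * X)] for X ~ N(mu, sigma2)."""
    return math.exp(beta * mu + 0.5 * beta ** 2 * sigma2)


def monte_carlo_mgf(beta, mu, sigma2, n_draws=100000, seed=0):
    """Monte Carlo estimate of E[exp(beta * X)] for X ~ N(mu, sigma2)."""
    rng = random.Random(seed)
    sd = math.sqrt(sigma2)
    total = sum(math.exp(beta * rng.gauss(mu, sd)) for _ in range(n_draws))
    return total / n_draws


def group_rate(mean_baseline, beta, mu, sigma2):
    """Aggregated rate in the Gaussian case, Equation (10)."""
    return mean_baseline * math.exp(0.5 * beta ** 2 * sigma2) * math.exp(beta * mu)


def naive_group_rate(mean_baseline, beta, mu):
    """Rate obtained when the within-group variance term is neglected."""
    return mean_baseline * math.exp(beta * mu)


beta, mu, sigma2 = 0.5, 1.0, 1.0
print(monte_carlo_mgf(beta, mu, sigma2), mgf_normal(beta, mu, sigma2))
print(group_rate(0.01, beta, mu, sigma2) / naive_group_rate(0.01, beta, mu))
```

The ratio printed on the last line is the multiplicative factor by which the naive group-level relation understates the rate; it tends to 1 as the within-group variance or the coefficient shrinks.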
3.2. Modelling the baseline risk
In this section, we discuss the modelling of the between-group variability of the average baseline risk:
$E_g(\lambda_0)$. To simplify the notation, we now let this random quantity be denoted by $\lambda_{0g}$, as in Section 2.
3.2.1. Exchangeability. The simplest form of between-group variability is to suppose that the $\lambda_{0g}$ are
exchangeable between the groups, in other words that all the $\lambda_{0g}$ come from a common distribution
and are independent. Since $\lambda_{0g}$ is positive, it is easier to work with its log-transform:
$$\log \lambda_{0g} \sim p(\boldsymbol{\phi}_\lambda), \quad \text{independently for } g = 1, \ldots, G$$
Examples of commonly adopted parametric forms for $p(\boldsymbol{\phi}_\lambda)$ are a Gaussian or a Student t-distribution
with a chosen small number of degrees of freedom; the latter distribution is advisable if outliers are suspected in
the variability of the baseline risk. In both cases, $\boldsymbol{\phi}_\lambda$ consists of a mean and a variance parameter, that
will be given suitable weakly informative priors, and estimated jointly with the $\{\lambda_{0g}\}$.
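The exchangeable specification can be simulated directly. The sketch below draws independent log baseline risks from either a Gaussian or a scaled Student t-distribution, with the location and scale treated as fixed for illustration (in the full model they would receive priors). The function name and numerical values are hypothetical; the t draw is built from a standard normal divided by the square root of a scaled chi-squared variable.

```python
import math
import random


def draw_log_baselines(n_groups, location, scale, df=None, seed=0):
    """Draw exchangeable log baseline risks, independently across groups.

    With df=None the draws are Gaussian; otherwise they follow a Student
    t-distribution with df degrees of freedom, shifted and scaled.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_groups):
        z = rng.gauss(0.0, 1.0)
        if df is not None:
            # chi-squared with df degrees of freedom = Gamma(df / 2, scale 2)
            chi2 = rng.gammavariate(df / 2.0, 2.0)
            z /= math.sqrt(chi2 / df)
        draws.append(location + scale * z)
    return draws


gaussian_logs = draw_log_baselines(5000, location=math.log(0.01), scale=0.3)
heavy_tailed_logs = draw_log_baselines(5000, location=math.log(0.01),
                                       scale=0.3, df=4)

# Back-transforming guarantees positive baseline risks.
baseline_risks = [math.exp(v) for v in gaussian_logs]
print(min(baseline_risks), max(baseline_risks))
print(min(heavy_tailed_logs), max(heavy_tailed_logs))
```

The heavy-tailed version will typically produce a few groups far from the bulk, which is the behaviour one wants the prior to tolerate when outlying baseline risks are suspected.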
3.2.2. Spatial or temporal dependence. The simple exchangeable structure is appropriate if there is no
reason to suspect that some of the groups are more closely related than others in terms of their baseline
risk. In many ecological designs, though, the structure of the group renders this assumption
implausible. As mentioned previously, the groups often represent temporal or spatial units. In air
pollution studies, a common study design is to follow the population of one city on a daily, weekly or
monthly basis and to relate counts of chosen health events to levels of air pollution measured at
different monitoring stations. In geographical epidemiology studies, a particular geographical scale of
analysis is chosen first, like electoral wards or districts in the U.K., giving a set of predefined areas.
Then the map of health events per area cumulated over a time period is related to the spatial
distribution of environmental factors. Whether the units are indexed by time, $\lambda_{0g} = \lambda_{0t}$, or by space,
$\lambda_{0g} = \lambda_{0s}$, it is clear that an appropriate dependence structure has to be built.
(a) Temporal structure
A flexible temporal structure for the $\log \lambda_{0t}$ is to assume that they follow either a first order random
walk, $\log \lambda_{0t} = \log \lambda_{0,t-1} + u_t$, or a second order random walk, $\log \lambda_{0t} = 2 \log \lambda_{0,t-1} - \log \lambda_{0,t-2} + u_t$,
where $u_t$ is assumed to be Gaussian white noise, i.e. independent and identically distributed Gaussian
variables with mean 0 and variance $\sigma^2$. Both these models make the assumption that neighbouring
time points share some common component, the second order random walk being particularly well
suited for predictions. The size of the variance $\sigma^2$ controls the smoothness of the temporal pattern, a
small value corresponding to stronger time dependence. These models have been used in studies of
time trends for cancer rates; see Knorr-Held and Besag (1998) and Fahrmeir and Lang (2001).
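The two recursions can be simulated directly. The sketch below fixes the starting values at zero, which is an assumption made for convenience (in a Bayesian analysis the initial levels would typically be given a diffuse prior), and the function name `simulate_random_walk` is hypothetical.

```python
import numpy as np

def simulate_random_walk(n_times, sigma, order=1, rng=None):
    """Simulate a first or second order Gaussian random walk for the log baseline risk."""
    if order not in (1, 2):
        raise ValueError("order must be 1 or 2")
    rng = np.random.default_rng() if rng is None else rng
    u = rng.normal(0.0, sigma, size=n_times)   # Gaussian white noise u_t
    x = np.zeros(n_times)                      # starting values fixed at 0 (assumption)
    for t in range(order, n_times):
        if order == 1:
            x[t] = x[t - 1] + u[t]
        else:
            x[t] = 2.0 * x[t - 1] - x[t - 2] + u[t]
    return x

rw1 = simulate_random_walk(2000, sigma=0.1, order=1, rng=np.random.default_rng(0))
rw2 = simulate_random_walk(2000, sigma=0.1, order=2, rng=np.random.default_rng(0))

# First differences of the RW1 path and second differences of the RW2 path
# recover the white noise, so their spread is governed by sigma.
sd_first_diff = np.std(np.diff(rw1))
sd_second_diff = np.std(np.diff(rw2, n=2))
```

Reducing `sigma` shrinks these increments, which is the sense in which a small variance yields a smoother temporal pattern.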
(b) Spatial structure
A spatial structure for $\log \lambda_{0s}$ can be built along similar lines. Instead of using the natural time
ordering, it relies on defining a neighbourhood structure between areas $s$ and $s'$, symbolized as $s \sim s'$, a
common choice being to say that $s \sim s'$ when the areas $s$ and $s'$ are contiguous. Then, a commonly used
model of spatial dependence, referred to as an intrinsic or conditional autoregressive (CAR) model,
specifies the distribution of $\theta_s = \log \lambda_{0s}$ by
$$\theta_s \mid \{\theta_{s'},\ s' \neq s\} \sim N(\bar{\theta}_s,\ \sigma^2 / n_s)$$
where $\sigma^2$ is an unknown variance parameter, $\bar{\theta}_s = \sum_{s' \sim s} \theta_{s'} / n_s$, and $n_s$ denotes the number of
neighbours of area $s$. The CAR model has been extensively used in disease mapping studies concerned
with rare diseases after its introduction by Besag et al. (1991). The resulting estimates of the $\{\lambda_{0s}\}$
borrow strength from the neighbouring areas and are smoothed towards a local mean. Note that to
account for spatial and non-spatial structure in the $\{\log \lambda_{0s}\}$, one could model their distribution as the
sum of a CAR and an exchangeable model, as suggested by Besag et al. (1991) in the disease mapping
context. The associated graph is a chain graph containing both directed and undirected links (see
Spiegelhalter et al. (1995) and Bernardinelli et al. (1997) for illustrations).
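The conditional specification of the CAR model can be made concrete with a small sketch that computes the conditional mean and variance for one area from a binary adjacency matrix. The four-area map, the values of the log baseline risks and the function name `car_conditional` are hypothetical.

```python
import numpy as np

def car_conditional(theta, adjacency, s, sigma2):
    """Conditional mean and variance of theta[s] given all other areas
    under the intrinsic CAR model with 0/1 contiguity weights."""
    neighbours = np.flatnonzero(adjacency[s])
    n_s = neighbours.size
    if n_s == 0:
        raise ValueError("area has no neighbours; the conditional is undefined")
    return theta[neighbours].mean(), sigma2 / n_s

# Four areas arranged in a line: 0 - 1 - 2 - 3 (hypothetical map).
adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
theta = np.array([0.0, 1.0, 3.0, 2.0])   # hypothetical log baseline risks

cond_mean, cond_var = car_conditional(theta, adjacency, s=1, sigma2=0.5)
```

Area 1 has two neighbours (areas 0 and 2), so its conditional mean is the average of their values and its conditional variance is `sigma2 / 2`; areas with more neighbours are smoothed more strongly.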
3.3. Modelling the effect of exposure across groups
We now turn to a discussion of models for between-group variability in the effect of exposure on
disease risk. As noted in Section 2.2.2, if we wish to allow group-specific exposure coefficients $\beta^*_g$ in
ecological models, it becomes important to specify some form of hierarchical prior distribution for the
$\{\beta^*_g\}$ to attempt to improve the identifiability of the model. Recent work on ecological inference in
sociological applications may provide some useful insight on this issue.
King et al. (1999) and Wakefield (2001) consider estimation of the cell probabilities of 2 × 2 tables
where only the table margins are observed. In the present context, this corresponds to a scenario in
which each 2 × 2 table represents a group $g$, with columns and rows representing disease status and a
binary exposure, respectively (see Table 1). If individual-level data (equivalent in this case to the
cell-specific counts $n_{\cdot\cdot g}$) are available, then estimation of the group-specific probabilities
$\pi_{0g} = \Pr(\text{disease} \mid \text{unexposed})$ and $\pi_{1g} = \Pr(\text{disease} \mid \text{exposed})$ is straightforward, from which it is possible to obtain an
exposure coefficient for individuals within group $g$, i.e. $\beta^*_g = f(\pi_{1g}) - f(\pi_{0g})$, where $f(\cdot)$ is an
appropriate link function (usually log or logit). Note also that $\pi_{0g}$ corresponds to what we have
termed the baseline risk of disease in group $g$, denoted $\lambda_{0g}$ in previous sections. In ecological studies
we only observe the margins of the underlying 2 × 2 table (i.e. total number of disease cases $Y_g$, total
proportion exposed $Z_g$, and total population at risk $N_g$). Nonetheless, depending on the particular data
set, there may be some information in the margins to allow the construction of bounds on the
proportion of diseased individuals who are or are not exposed. For example, Wakefield (2001) cites
the case in which one of the table margins is near zero. That is, if most or all individuals in a group are
exposed (i.e. $Z_g \to 1$), then knowledge of the total number of diseased and disease-free individuals in
the group provides considerable information about $\pi_{1g}$ (although virtually no information about $\pi_{0g}$).
Likewise, if most or all individuals in a group are unexposed (i.e. $Z_g \to 0$), considerable information is
available about $\pi_{0g}$ but not $\pi_{1g}$. In the worst case scenario, if exactly half the individuals in a group
have the disease, and exactly half are exposed, then the margins contain no information about $\pi_{0g}$ and
$\pi_{1g}$ since any combination of cell counts $n_{\cdot\cdot g}$ yielding row and column sums equal to $N_g/2$ is possible
(Wakefield, 2001).
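The information carried by the margins can be illustrated with a short calculation of the deterministic bounds they imply. The sketch below is an elementary bounds computation written for this illustration, not the estimation procedure of King et al. (1999) or Wakefield (2001); the function name and the example counts are hypothetical.

```python
def cell_probability_bounds(n_cases, n_total, prop_exposed):
    """Bounds on Pr(disease | unexposed) and Pr(disease | exposed)
    implied by the margins of a 2 x 2 table.

    n_cases      -- total number of disease cases in the group
    n_total      -- total population at risk in the group
    prop_exposed -- proportion of the group that is exposed
    """
    n_exposed = round(n_total * prop_exposed)
    n_unexposed = n_total - n_exposed
    if n_exposed == 0 or n_unexposed == 0:
        raise ValueError("both exposure categories must be non-empty")
    # Feasible range for the number of exposed cases.
    n11_min = max(0, n_cases - n_unexposed)
    n11_max = min(n_cases, n_exposed)
    bounds_exposed = (n11_min / n_exposed, n11_max / n_exposed)
    bounds_unexposed = ((n_cases - n11_max) / n_unexposed,
                        (n_cases - n11_min) / n_unexposed)
    return bounds_unexposed, bounds_exposed

# Worst case: half diseased, half exposed.
worst_unexposed, worst_exposed = cell_probability_bounds(50, 100, 0.5)

# Nearly everyone exposed: tight bounds for the exposed, none for the unexposed.
high_unexposed, high_exposed = cell_probability_bounds(30, 100, 0.95)
```

In the first call both intervals are (0, 1), whereas in the second the interval for the exposed narrows to (25/95, 30/95) while the interval for the unexposed remains (0, 1).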
In order to facilitate estimation of $\pi_{0g}$ and $\pi_{1g}$ (and hence, in our case, $\beta^*_g$) for tables (groups)
containing little information, King et al. (1999) propose a hierarchical model. The idea is to borrow
strength across groups where the proportion of exposed individuals is close to 1 to estimate $\pi_{1g}$ for all
$g = 1, \dots, G$, and to borrow strength across groups where the proportion of exposed individuals is
close to 0 to estimate $\pi_{0g}$ for all $g = 1, \dots, G$. King et al. (1999) assume independent beta distributions
for the joint distributions of $\pi_{jg}$, $j = 0, 1$; $g = 1, \dots, G$. Alternatively, Wakefield (2001) considers both
independent normal and Student t priors for the joint distributions of the logit-transformed
probabilities $\operatorname{logit}(\pi_{jg})$, $j = 0, 1$; $g = 1, \dots, G$, and discusses the extension to a bivariate normal prior
distribution to incorporate dependence between $\operatorname{logit}(\pi_{0g})$ and $\operatorname{logit}(\pi_{1g})$. In the present context, it seems more
natural to specify the hierarchical model directly on the regression coefficients $\beta^*_g$ and the baseline
risks $\lambda_{0g}$ than on the probabilities $\pi_{0g}$ and $\pi_{1g}$. For example, Assuncao et al. (2001) assume
independent CAR distributions (see Section 3.2.2) for the $\{\lambda_{0g}\}$ and $\{\beta^*_g\}$ in a spatial regression
model of human fertility rates (where g represents small areas). However, the scenario they consider is
a special case in which the data for each area (including the ‘exposure’ of interest) are available by
age-group. Hence there is replication of exposure and response data within areas, leading to a fully
identified model. Whether such a model would be estimable more generally is not clear.
In general, although the work by King et al. (1999) and Wakefield (2001) indicates that ecological
data may contain some information by which to estimate random coefficient models for categorical
exposures, it is not clear that this will be sufficient to derive useful estimates of group-varying effects
of environmental exposures on disease risk. It is rare to find environmental epidemiological examples
with close to 100 per cent of individuals in a group exposed or vice versa (the most information-rich
situations), so even using a hierarchical model to borrow strength across groups is unlikely to produce
reliable estimates of the group-specific regression coefficients. The examples presented by both
Wakefield and King et al. also suggest that the probabilities for some groups may be sensitive to the
choice of prior distribution at the second level. Furthermore, Wakefield notes that the case of a
continuous exposure is even more difficult since knowledge of the average exposure for a group does
not provide any information about the distribution of individual exposure values within that group.

Table 1. 2 × 2 table summarizing the potential data available on disease status and a binary exposure for group g. In an ecological study only the margins are observed

              No disease     Disease        Total
Unexposed     $n_{00g}$      $n_{10g}$
Exposed       $n_{01g}$      $n_{11g}$      $N_g Z_g$
                             $Y_g$          $N_g$

As noted in Section 2.2.2, a pragmatic solution to the problem of estimating group-varying effects of
exposure is to introduce an additional level of aggregation into the model and allow the exposure
coefficients to vary at this coarser resolution. Denoting the subsets of groups by $S_j \subset \{g;\ g = 1, \dots, G\}$,
$j = 1, \dots, J < G$, as before, we may model the variability between exposure coefficients by
assuming a joint distribution $p(\boldsymbol{\beta}^* \mid \phi_\beta)$ for $\boldsymbol{\beta}^* = \{\beta^*_j\}$. Appropriate distributional choices include the
exchangeable, random walk (for temporally defined subsets) and CAR (for spatially defined subsets)
distributions discussed in Section 3.2 in the context of modelling the baseline risk. This approach is
illustrated in Section 3.6, where we briefly discuss an example by Dominici et al. (2000). These authors
consider a hierarchical model with groups defined by time (daily counts of mortality and mean air
pollution concentration) nested within space (20 large cities in the U.S.A.). Each city thus represents a
spatially defined subset of temporal groups, and the authors then assume a hierarchical model for the
city-specific regression coefficients, representing risk of mortality associated with specific air pollutants.
If we are unwilling to assign groups to subsets a priori, a more flexible approach is to assume a
mixture model for the joint distribution of the group-specific coefficients. Such a model also assumes
that the groups $g$ belong to subsets $S_j$, $j = 1, \dots, J \le G$, with a common exposure effect $\beta^*_j$ for all
groups $g \in S_j$. However, the number and composition of the subsets $S_j$ is assumed to be unknown
a priori, and is instead estimated as part of the model. Such an approach has been used by Hurn et al.
(2002) for various applications, including to estimate the effect of daily ambient nitrogen dioxide
concentrations on risk of hospital admissions for circulatory and respiratory diseases.
3.4. Measurement error model between group average and ecological measures
of environmental exposure
Whatever the form (4), (9) or (10) of the disease–exposure relationship at the group level, we have seen
that the average group exposure $E_g(X)$ is involved. In the majority of studies, there is not enough
information to estimate $E_g(X)$ for each group, and this random quantity is commonly replaced by a
group level ecological surrogate, $Z_g$. For example, $Z_g$ could represent a recorded measure of ambient
air pollution or a level of chlorine in drinking water supplied to a city or region. Thus, the investigation
of the health effect of a specific environmental exposure relies in many cases on linking $E(Y_g)$ to $Z_g$ by
a coefficient $\beta^*$ (see Figure 2), whereas the true interest lies in the more interpretable coefficient $\beta$
linking $E(Y_g)$ to $E_g(X)$ (modulo the within-area variability).
Modelling the different sources of variability between the recorded measures Zg of environmental
exposure and the relevant individual or group averaged exposure is a key component of any statistical
analysis of environmental effects. Such analyses typically involve building several model components
expressing links between individual exposure $X_{ig}$ and true environmental exposure $Z^*_g$, and between measured
$Z_g$ and true $Z^*_g$ environmental exposure, and combining these sub-models to derive the link between
$E_g(X)$ and $E(Y_g)$.
Misclassification of exposure has long been recognized as a limitation of many epidemiological
studies. Indeed, measurement error can distort the quantification of the dose–effect relationship
investigated. The extent and the nature of the distortion depend on several factors, including the
study design, the type of error and the relationship between the outcome and the covariates; see, for
example, the review by Thomas et al. (1993) or the general Bayesian framework outlined in
Richardson and Gilks (1993). Several types of error models have been traditionally considered. In
the classical measurement error formulation, the conditional distribution of the surrogate exposure
given the true exposure is specified. In doing this, one is essentially characterizing the measuring
instrument. In the Berkson error model, it is instead the conditional distribution of the true exposure
given the surrogate exposure that is specified. For example, if $Z^*_g$ is the ‘true’ mean ambient
concentration of a particular pollutant in a city on a specific day and $Z_g$ are the recorded concentrations
at one or several monitoring stations, then it might be appropriate to formulate a classical error model
$p(Z_g \mid Z^*_g, \phi_z)$, where $\phi_z$ quantifies both the sensitivity of the recording instrument and the network
coverage of the monitoring stations (see Zeger, 2000, for a detailed discussion of measurement error
models in the air pollution context). On the other hand, the model between the exposure $X_{ig}$ of an
individual to this pollutant and the true mean ambient concentration $Z^*_g$ is more of the Berkson type
and would be formulated using a physiologically based dose absorption model and allowing for the
individual’s activity patterns. In this case, it is $p(X_{ig} \mid Z^*_g, \theta)$, the conditional distribution of $X_{ig}$ given $Z^*_g$,
that is modelled, where $\theta$ are the dose absorption parameters. From the discussion above, one can see
that the relationship between $E_g(X)$ and $Z^*_g$ is a combination of the measurement error model relating
monitored and true ambient exposures, the physiologically based dose absorption model and some
group measure of time-activity data aiming to characterize the proportion of time spent in different
ambient environments (e.g. outdoors, in traffic, in the home, etc.). Thus it combines both classical and
Berkson components, each component requiring careful specification. Cases of measurement error
situations combining both Berkson and classical features have been discussed in the occupational
context of job-exposure matrices by Gilks and Richardson (1992), and Richardson (1996), and in the
environmental context of radon exposure by Reeves et al. (1998).
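The practical difference between the two error structures can be seen in a deliberately simplified setting. The sketch below uses a linear Gaussian outcome model rather than the log-linear Poisson models discussed in this article, so it is only an analogy: under classical error the regression slope on the surrogate is attenuated by the factor var(X) / (var(X) + var(error)), whereas under Berkson error the slope is preserved in this linear case (which need not hold in non-linear models). All names and parameter values are hypothetical.

```python
import numpy as np

def ols_slope(w, y):
    """Least-squares slope of y on a single covariate w."""
    w_c = w - w.mean()
    return float(w_c @ (y - y.mean()) / (w_c @ w_c))

rng = np.random.default_rng(1)
n, beta = 200_000, 1.0
sd_exposure, sd_error, sd_outcome = 1.0, 1.0, 0.5

# Classical error: surrogate = true exposure + independent noise.
x_true = rng.normal(0.0, sd_exposure, n)
w_classical = x_true + rng.normal(0.0, sd_error, n)
y_classical = beta * x_true + rng.normal(0.0, sd_outcome, n)
slope_classical = ols_slope(w_classical, y_classical)

# Berkson error: true exposure = surrogate + independent noise.
w_berkson = rng.normal(0.0, sd_exposure, n)
x_berkson = w_berkson + rng.normal(0.0, sd_error, n)
y_berkson = beta * x_berkson + rng.normal(0.0, sd_outcome, n)
slope_berkson = ols_slope(w_berkson, y_berkson)

# Theoretical attenuation factor under classical error.
attenuation = sd_exposure**2 / (sd_exposure**2 + sd_error**2)
```

With these settings `slope_classical` is close to `beta * attenuation` (about 0.5) while `slope_berkson` stays close to `beta`, which is why the two components need to be specified separately when they are combined.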
To estimate the parameters of these models, quantitative information on the measurement error has
to be introduced. This information may come from different sources: a priori external information on
the measurement instrument or purpose built validation or replication sub-studies. For example, the
relationship between indoor and outdoor air pollution levels can be studied if recording of the indoor
pollution is made in a representative sample of houses. Moreover, to quantify the link between group
exposure and environmental exposure, specifically designed surveys collecting time-activity data in
order to characterize the proportion of time spent in different environments for different age-groups
and categories of occupations will be needed. The information contributed by these sub-studies can
then be built as an additional part of the graph of the hierarchical model and provide the necessary
information to identify the relevant parameters of the measurement error model. For example, Figure 3
shows how part (c) of the graph in Figure 2 could be extended to incorporate data from a sub-study on
replicate measurements of ambient pollution concentrations in different environments, and data from a
time-activity sub-study measuring the amount of time spent by a sample of individuals in the various
environments. This illustrates how hierarchical models may be used to help construct more realistic
models of the complex relationships involved in studying the effect of the environment on health.
3.5. Summary of hierarchical structures
In summary, in full ecological designs involving counts of cases of a rare disease and a single
environmental exposure, the following core 2-level hierarchy is usually specified:
$$Y_g \sim \text{Poisson}(\lambda_g)$$
$$\log(\lambda_g) = \log(\lambda_{0g}) + \beta^*_j Z^*_g$$
accompanied by hierarchical substructures describing the variability of $\log(\lambda_{0g})$, $\beta^*_j$ and $Z^*_g$.
For the baseline risk, we have described three different structures, the choice being dictated by the
underlying dependence structure which is related to the constitution of the groups:
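$$\log(\lambda_{0g}) \sim p(\phi_\lambda), \quad \text{independently for } g = 1, \dots, G \quad \text{(exchangeability)}$$
or, when $g = t$:
$$\log(\lambda_{0t}) = \log(\lambda_{0,t-1}) + u_t, \quad u_t \sim p(\phi_u), \ \text{independently} \quad \text{(first order random walk)}$$
or, when $g = s$:
$$\log(\lambda_{0s}) = \theta_s + u_s, \quad \theta_s \sim \text{spatial CAR model}, \quad u_s \sim p(\phi_u), \ \text{independently}$$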
For modelling the group-varying effect of exposure, we have described an additional level in terms of
the subsets $S_j \subset \{g;\ g = 1, \dots, G\}$, $j = 1, \dots, J < G$: $\{\beta^*_j\} \sim p(\phi_\beta)$, in order to be able to identify the
$\{\beta^*_j\}$. $p(\phi_\beta)$ may take the form of an exchangeable, random walk or spatial CAR distribution as
appropriate, or, if the composition and number of subsets Sj is unknown a priori, a mixture distribution
may be specified.
To allow for measurement error between the true environmental level of the covariates $Z^*_g$ and the
recorded levels $Z_g$, and/or the average group exposure $E_g(X)$, we have described a classical error
model $p(Z_g \mid Z^*_g, \phi_z)$ and a combined Berkson and classical model $p(E_g(X) \mid Z^*_g, \theta, T_g)$.
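To show how these pieces fit together, the following sketch simulates data from one simple instance of the hierarchy: an exchangeable baseline risk, a single common exposure coefficient, a true environmental level and a surrogate recorded with classical error. It is a generative illustration only, with hypothetical parameter values, and it treats the Poisson parameter as an expected count; it does not perform any estimation.

```python
import numpy as np

rng = np.random.default_rng(7)
n_groups = 50                      # hypothetical number of groups

# Second level: exchangeable Gaussian model for the log baseline risk.
log_baseline = rng.normal(loc=np.log(20.0), scale=0.2, size=n_groups)

# True (unobserved) environmental level and its recorded surrogate,
# linked by a classical error model.
z_true = rng.normal(loc=1.0, scale=0.5, size=n_groups)
z_recorded = z_true + rng.normal(0.0, 0.2, size=n_groups)

# First level: Poisson counts, log-linear in the true environmental level.
beta_star = 0.3                    # common exposure coefficient (assumed)
expected_counts = np.exp(log_baseline + beta_star * z_true)
y = rng.poisson(expected_counts)
```

An analysis would observe only `y` and `z_recorded`; the hierarchical model is what allows the unobserved `z_true` and `log_baseline` to be treated as random quantities rather than replaced by plug-in values.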
Figure 3. Extension of part of the graphical model to include information from sub-studies to improve estimation of the
relationship between $E_g(X)$ and $Z_g$. $Z_{ger}$ represents replicate measurements $r$, and $Z^*_{ge}$ the true value, of the ambient pollutant
concentration in different environments $e$ within group $g$; $T_{ige}$ represents measurements of the amount of time individual $i$ in
group $g$ spends in environment $e$; the dashed arrow between $X_{ige}$ and $E_g(X)$ represents a deterministic dependence as opposed to
the stochastic relationships implied by the solid arrows

3.6. Combined analyses of several data sets
In this section, we give an illustration of hierarchical modelling of health–environmental effects in the
context of air pollution. In this domain, it is important to be able to combine the analyses of several
data sets to increase the power of the analyses. Indeed, the relative risks associated with environmental
exposure are often small and the results of separate studies will be associated with large uncertainties.
Combined analyses can be done straightforwardly by extending the hierarchical framework discussed
in previous sections to include a model of between-datasets variability. The resulting estimates will be
different from a simple pooled estimate that assumes no variability between studies. To illustrate the
power of the hierarchical modelling strategy, we discuss in this section a recent study published by
Dominici et al. (2000) that concerns the effect of air pollution on mortality. Precisely, the authors
analysed time series of daily air pollution and mortality in the 20 largest U.S. cities, using a two-stage
model building strategy:
1. Building the time series model for each city. Here, the data consist of the daily total number of
deaths $Y_t$ for an age group (the analysis is repeated for several age groups, but for simplicity of
exposition, we do not include an age index). It is thus aggregated by day, $t$, and will be further indexed
by the city, $s$. The covariates of interest are the air pollution levels $Z_{ts} = (Z_{1ts}, Z_{2ts})$, where $Z_{1ts}$ is the
recorded level of particulates (PM10) and $Z_{2ts}$ is the recorded ozone (O3) level. The aim is to study the
short term effect of these pollutants on mortality and not the chronic effects. Thus, there is a need to
account for time trends and seasonal patterns in mortality to avoid confounding the short term effects
by the long term trends that might be due to changes in the characteristics of the population. The
authors chose to model these effects by a flexible function of time, that we will denote globally by $S_{ts}$
and not detail further. Thus, for each city $s$, the following model is specified:
$$Y_{ts} \sim \text{Poisson}(\lambda_{ts})$$
$$\log(\lambda_{ts}) = S_{ts} + {\beta^*_s}^{T} Z_{ts} + u_t$$
where $u_t$ is a Gaussian white noise, the time dependence in the daily counts having been absorbed in
the flexible function of time $S_{ts}$.
If, as discussed previously, external information is available on the link between $Z_{ts}$ and the average
group exposure, this information can be included at this stage.
2. Building the between city model for combining the data. A pooled analysis of the effect of air
pollution on mortality for the 20 cities would assume that the effects quantified by $\beta^*_s$ are the same for
all the cities, $\beta^*_s = \beta^*$ for all $s$, and combine the separate estimates with weights inversely
proportional to their variance. This simplifying assumption can be misleading as there are many
sources of variability between the cities that can create variability of the $\beta^*_s$.
In a hierarchical framework, this variability is explicitly modelled; in particular, site-specific
explanatory variables Cs such as the percentage of people living in poor socio-economic conditions or
even the average pollution level in the study period are taken into account. The following model is thus
adopted at the next hierarchical level:
$$\beta^*_s \sim N(\beta^* + C_s,\ \Sigma), \quad \text{independently for each } s \qquad (11)$$
In (11), the intercept $\beta^*$ represents a synthesis of the information from the different cities, and
estimates the overall effects of air pollution on mortality after accounting for within and between site
confounders. Moreover, the between-city variability $\Sigma$ is also quantified in the hierarchical analysis,
and this is of interest per se. Dominici et al. (2000) further extend (11) to a multivariate normal
distribution for $\boldsymbol{\beta}^* = \{\beta^*_s\}$ with spatially structured covariance matrix $\Sigma$ allowing the correlation
between cities $s$ and $s'$ to depend on distance. Besides giving a sensible estimate of the overall effects,
the hierarchical framework also leads to improved estimation of the parameters $\beta^*_s$ for each city
through shrinkage and borrowing of strength between the data sets. In their combined study, Dominici
et al. (2000) found an overall short term effect of PM10 after adjustment for within and between city
covariates.
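The contrast between a pooled estimate and hierarchical shrinkage can be sketched with the standard normal-normal calculations for a scalar coefficient. This is a simplification and not the analysis of Dominici et al. (2000): it omits city-level covariates, treats the within-city variances as known and plugs in a fixed between-city variance `tau2`, whereas a full Bayesian analysis would assign priors to these quantities. The function names and the three hypothetical city estimates are assumptions made for the illustration.

```python
import numpy as np

def pooled_estimate(beta_hat, var_hat):
    """Inverse-variance weighted estimate assuming a common effect in all cities."""
    w = 1.0 / var_hat
    return float(np.sum(w * beta_hat) / np.sum(w))

def shrinkage_estimates(beta_hat, var_hat, tau2):
    """Overall mean and city-specific posterior means under a normal-normal
    model with known within-city variances and between-city variance tau2."""
    w = 1.0 / (var_hat + tau2)
    overall = float(np.sum(w * beta_hat) / np.sum(w))
    # Weight given to the overall mean; larger for imprecise city estimates.
    shrink = var_hat / (var_hat + tau2)
    return overall, shrink * overall + (1.0 - shrink) * beta_hat

# Hypothetical city-specific estimates and their sampling variances.
beta_hat = np.array([0.2, 0.5, 0.8])
var_hat = np.array([0.01, 0.04, 0.01])

pooled = pooled_estimate(beta_hat, var_hat)
overall, shrunk = shrinkage_estimates(beta_hat, var_hat, tau2=0.01)
```

Each city-specific estimate is pulled towards the overall mean, and the pull is strongest for the cities whose own estimates are least precise; setting `tau2` to zero recovers the pooled analysis.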
4. DISCUSSION
There are many benefits of using a hierarchical Bayesian modelling strategy when faced with the
complex problem of studying health–environment effects. We group these under the following
headings, noting that there is some obvious overlap between them: (a) modular model
elaboration; (b) integration of different sources of information; (c) coherent propagation of
uncertainty; (d) borrowing of strength; (e) integrated treatment of information at different levels.
(a) The current trend of increasing sophistication and deployment of measurement instruments for quantifying environmental exposure has led to increasing availability of more abundant and better quality data. It is important that the analyst uses a framework where each new type of data, for example a new series of indoor measures of air pollution, can be treated in a modular fashion. This is exactly what hierarchical model building provides. Suppose that there is already a core model that can be represented by a DAG and that a new type of data becomes available. Firstly, a separate model is built to account for the specific characteristics of the new type of data. Subsequently, this ‘module’ is linked to the rest of the variables by extending the DAG of the original model around the pivotal variables that are common to the ‘module’ and the original graph. In our example of indoor pollution, it is the unobserved individual exposure $X_{ig}$ that is pivotal both to the core model and to the module describing the indoor pollution variability, as was illustrated in Figure 3.
(b) A related benefit of the modularity just described is that it renders possible the simultaneous integration of different sources of information. Indeed, by always building a joint model of all the variables that encompasses any number of different modules, all sources of information are integrated to contribute to the estimation of the dose–effect relationships of interest. For example, in the air pollution context, separate sub-studies might be available: (i) linking personal (badge-measured) and indoor exposure, (ii) relating indoor and outdoor exposure, and (iii) quantifying proportion of time spent outdoors for different categories of people, all contributing to building the overall model for studying the effect of environmental air pollution on health at the group level.
(c) In parallel, the joint hierarchical model leads automatically to a correct propagation of all sources of uncertainty that have been quantified in each ‘module’ onto the estimation of the parameters of interest. We stress that this is not the case for methods that would proceed by ‘substitution’, for instance by replacing an unknown exposure $X_{ig}$ by an estimated one $\hat{X}_{ig}$ using an equation calibrated in a sub-study. In the present framework, unknown exposures are treated as random variables, their distribution is informed by the different sub-studies to which they are related and the associated uncertainty is then propagated on the distribution of the coefficients $\beta$ or $\beta^*$.
(d) One important aspect of hierarchical models that has been widely discussed in the previous sections is that they allow the borrowing of strength between different data sets, thus leading to improved and more stable estimates of the parameters of interest. The simplest model for borrowing strength is the exchangeable model, but we have also discussed several extensions that permit flexible modelling of dependence between the data sets, dependence that can extend in time, in space or through the sharing of a higher level grouping structure. This is an active area of research at the moment with natural extension to space–time or space–time–activity models.
(e) In some cases, health data might be available at an individual level, while contextual or environmental exposure variables are expressed at the aggregated level of geographical units. Thus, there is a need to build a framework that can, in principle, accommodate data observed at different scales. One framework that is useful for that purpose is that of point process models. In spatial epidemiology, one might consider that the location of the individuals at risk is being modelled by a baseline demographic process, on which is superimposed a disease process that ‘picks’ out the cases with a probability that is dependent both on individual level risk factors and area-level variables measuring individual exposure (Richardson, 2002). In a study of the effect of traffic pollution on respiratory disorders of children carried out by Best et al. (2000), a point process is incorporated in a hierarchical model, making it possible to account both for individual risk factors of the children and for environmental levels of exposure to air pollution. There is much scope for further work in this direction.
ACKNOWLEDGEMENTS
The authors would like to acknowledge support from the U.K. Medical Research Council (Career Establishment Grant G9803841) and the U.K. Small Area Health Statistics Unit. They are grateful to their colleagues, Peter Green, David Spiegelhalter and Jon Wakefield for stimulating discussions. SR thanks the organizers of the ISCEP conference for the invitation to speak.
REFERENCES
Assuncao RM, Potter JE, Cavenaghi SM. 2001. A Bayesian space varying parameter model applied to estimating fertilityschedules. Technical Report, Departamento de Estatistica, UFMG, Brazil.
Bernardinelli L, Pascutto C, Best NG, Gilks WR. 1997. Disease mapping with errors in covariates. Statistics in Medicine 16:741–752.
Besag J, York J, Mollie A. 1991. Bayesian image restoration, with two applications in spatial statistics (with discussion). Annalsof the Institute of Statistical Mathematics 43: 1–59.
Best N, Ickstadt K, Wolpert R. 2000. Spatial Poisson regression for health and exposure data measured at disparate resolutions.Journal of the American Statistical Society 95: 1076–1088.
Breslow N. 1990. Biostatistics and Bayes (with discussion). Statistical Science 5: 269–298.Briggs DB. 2000. Exposure assessment. In Spatial Epidemiology: Methods and Applications, Elliott P, Wakefield JC, Best NG,
Briggs DB (eds). Oxford University Press: Oxford, UK; 335–359.Dominici F, Samet JM, Zeger SL. 2000. Combining evidence on air pollution and daily mortality from the 20 largest us cities: a
hierarchical modelling strategy. Journal of the Royal Statistical Society, Series A 163(3): 263–302.Fahrmeir L, Lang S. 2001. Bayesian inference for generalized additive mixed models based on Markov random field priors.
Journal of the Royal Statistical Society, Series C 50(2): 201–220.Gelman A, Carlin JB, Stern HS, Rubin DB (eds). 1995. Bayesian Data Analysis. Chapman & Hall: London, UK.Gilks W, Richardson S. 1992. Analysis of disease risks using ancillary risk factors, with application to job–exposure matrices.
Statistics in Medicine 11: 1443–1463.Gilks WR, Richardson S, Spiegelhalter DJ (eds). 1996. Markov chain Monte Carlo in Practice. Chapman and Hall: London, UK.Goldstein H (ed.). 1995. Multilevel Models in Educational and Social Research, 2 edn. Arnold: London, UK.Good I. 1987. Hierarchical Bayesian and empirical Bayesian methods with discussion. American Statistician 41.Green PJ. 2001. A primer on Markov chain Monte Carlo. In Complex Stochastic Systems, Barndorff-Nielsen OE, Cox DR,
Kluppelberg K (eds). Chapman & Hall: London, UK.Greenland S. 1992. Divergent biases in ecologic and individual-level studies. Statistical Medicine 11: 1200–1223.Hurn M, Justel A, Robert CP. 2002. Estimating mixtures of regressions. J. Comput. Graph. Stat. (to appear).King G, Rosen O, Tanner MA. 1999. Binomial-beta hierarchical models for ecological inference. Sociological Methods and
Research 28: 61–90.Knorr-Held L, Besag J. 1998. Modelling risk from a disease in time and space. Statistics in Medicine 17: 2045–2060.Lauritzen SL. 1996. Graphical Models. Clarendon Press: Oxford, UK.Morris CN, Normand SL. 1991. Hierarchical models form combining information and for meta-analyses. In Bayesian Statistics
IV, Bernardo JO, Berger JP, Dawid AP, Smith AFM (eds). Oxford University Press: Oxford, UK; 321–344.Plummer M, Clayton D. 1996. Estimation of population exposure in ecological studies (with discussion). Journal of the Royal
Statistical Society, Series B 58: 113–126.
146 S. RICHARDSON AND N. BEST
Copyright # 2003 John Wiley & Sons, Ltd. Environmetrics 2003; 14: 129–147
Reeves G, Cox D, Darby S, Whitley E. 1998. Some aspects of measurement error in explanatory variables for continuous and binary regression models. Statistics in Medicine 17: 2157–2177.
Richardson S. 1996. Measurement error. In Markov chain Monte Carlo in Practice. Chapman & Hall: London, UK; 401–417.
Richardson S. 2002. Spatial models in epidemiological applications. In Highly Structured Stochastic Systems, Green P, Hjort N, Richardson S (eds). Oxford University Press: Oxford, UK (in press).
Richardson S, Gilks W. 1993. Conditional independence models for epidemiological studies with covariate measurement error. Statistics in Medicine 12: 1703–1722.
Richardson S, Monfort C. 2000. Ecological correlation studies. In Spatial Epidemiology: Methods and Applications, Elliott P, Wakefield J, Best N, Briggs D (eds). Oxford University Press: Oxford, UK; 205–220.
Richardson S, Stucker I, Hemon D. 1987. Comparison of relative risks obtained in ecological and individual studies: some methodological considerations. International Journal of Epidemiology 16: 111–120.
Rosenberg MA, Andrews RW, Lenk PJ. 1999. A hierarchical Bayesian model for predicting the rate of nonacceptable in-patient hospital utilization. J. Bus. Econ. Stat. 17: 1–8.
Spiegelhalter DJ. 1998. Bayesian graphical modelling: a case study in monitoring health outcomes. Applied Statistics 47(1): 115–133.
Spiegelhalter DJ, Thomas A, Best NG. 1995. Computation on Bayesian graphical models. In Bayesian Statistics 5, Bernardo JM, Berger JO, Dawid AP, Smith AFM (eds). Oxford University Press: Oxford, UK; 407–425.
Spiegelhalter DJ, Thomas A, Best NG, Lunn D. 2001. WinBUGS Version 1.4 User Manual. Imperial College, London and MRC Biostatistics Unit, Cambridge. Available from www.mrc-bsu.cam.ac.uk/bugs.
Su ZM, Adkison MD, Van Alen BW. 2001. A hierarchical Bayesian model for estimating historical salmon escapement and escapement timing. Can. J. Fish. Aquat. Sci. 58: 1648–1662.
Thomas D, Stram D, Dwyer J. 1993. Exposure measurement error: influence on exposure–disease relationships and methods of correction. Annual Review of Public Health 14: 69–93.
Wakefield J, Salway R. 2001. A statistical framework for ecological and aggregate studies. Journal of the Royal Statistical Society, Series A 164(1): 119–137.
Wakefield JC. 1996. The Bayesian analysis of population pharmacokinetic models. Journal of the American Statistical Association 91: 62–75.
Wakefield JC. 2001. Ecological inference for 2 × 2 tables. Technical Report 12, Centre for Statistics and the Social Sciences. University of Washington: Seattle.
Wakefield JC, Best NG, Waller L. 2000. Bayesian approaches to disease mapping. In Spatial Epidemiology: Methods and Applications, Elliott P, Wakefield JC, Best NG, Briggs DB (eds). Oxford University Press: Oxford, UK; 104–127.
Zeger S. 2000. Exposure measurement error in time-series studies of air pollution: concepts and consequences. Environmental Health Perspectives 108: 419–426.