health-related behaviour in context: a multilevel modelling approach


Pergamon 0277-9536(95)00181-6 Soc. Sci. Med. Vol. 42, No. 6, pp. 817-830, 1996

Copyright © 1996 Elsevier Science Ltd. Printed in Great Britain. All rights reserved

0277-9536/96 $15.00 + 0.00

HEALTH-RELATED BEHAVIOUR IN CONTEXT: A MULTILEVEL MODELLING APPROACH

CRAIG DUNCAN,¹ KELVYN JONES¹ and GRAHAM MOON²

¹Department of Geography, University of Portsmouth, Buckingham Building, Lion Terrace, Portsmouth PO1 3HE, England and ²School of Social and Historical Studies, University of Portsmouth, Milldam, Burnaby Road, Portsmouth PO1 3AS, England

Abstract--Recent attempts to place individual health-related behaviour in context have been judged largely unsuccessful. This paper examines how this situation might be improved and is especially concerned with the role of quantitative methodologies. It is argued that, whilst recent developments in social theory help provide important theoretical guidelines, they can only be implemented with difficulty in empirical health-related behaviour research if traditional quantitative methodologies are used. It is suggested that the best way to implement social theory within a quantitative framework is to apply the newly developed technique of multilevel modelling. This paper offers an overview of the multilevel approach and outlines its significance for health-related behaviour research. In addition, it details a number of ways in which the multilevel framework can be extended so as to achieve further improvements in the conceptualization of health-related behaviour. To illustrate the value of the technique, the paper finishes by considering one of these extensions in detail and applying it to data recording smoking behaviour in the United Kingdom.

Key words--health-related behaviour, smoking, context, multilevel model

INTRODUCTION

The behaviour of individuals has come to be closely implicated in the incidence of diseases characterized by chronic, degenerative conditions. This view was reflected in British national health policy in a Department of Health and Social Security document published in 1976 entitled Prevention and Health: Everybody's Business which declared that the diseases of modern populations "related less to man's (sic) outside environment than to his (sic) own personal behaviour or what might be termed our lifestyle" [1] (p. 95). Today this same view dominates governmental thinking. The most recent national policy document, The Health of the Nation: A Strategy for England [2], gives primacy to behavioral factors [3].

Historically, there have been two main perspectives on health-related behaviour [4]. First, a biomedical perspective which, following a doctrine of specific aetiology, emphasizes a narrow range of behaviours, principally smoking, drinking, diet and exercise. Martin and McQueen's description of these as "the holy four" [5] (p. 6) reflects the prevalence of this limited definition and the significance attached to it. Within this perspective behaviour is regarded as a matter of free choice and individual responsibility. The risk factors are therefore seen as being autonomous and unconnected with broader socio-structural factors. Reinforcing this bio-medical position is a second view of health-related behaviour constructed from theoretical developments in behavioral science. Drawing on social psychology and most frequently articulated by health educationalists, this relates human behaviour to either fixed personality traits or pre-programmed psychological mechanisms. As with its counterpart, this second perspective views behaviour as being fixed and unchanging and its focus is entirely upon the individual.

In recent years these two perspectives have been severely criticized and held to encourage a tendency towards victim-blaming and the attribution of guilt [6, 7]. It has been argued that human agency has been grossly misrepresented with people being portrayed as automata rather than complex, reflexive, knowledgeable agents [8]. Furthermore, the reductionist and individualistic approach which characterized both perspectives ensured that individual behaviour was entirely divorced from the social and situational context in which it occurred [4, 9, 10].

An attempt to provide an improved conceptualization of health-related behaviour has been a major concern of the "new public health" [11] and in particular health promotion which "grew out of the legacy of health education" [12, p. 10]. These two inter-related movements have argued that health-related behaviour needs to be placed within a broader perspective which emphasizes structural constraints as well as choices [13]. Just as the health risks associated with nineteenth century infectious disease had been regarded as a function of social conditions it is now argued that contemporary health risks associated with behaviour need to be seen as being intimately connected with the broader social world [14]. The notion that individuals are unaffected by social, cultural, economic or legislative factors and freely choose their behaviour is to be rejected. As the Research Unit in Health and Behavioral Change (RUHBC) puts it, individual health-related behaviour "can only be adequately explained by taking into account the context in which it occurs" [4] (p. 84).

Sadly, many writers suggest that the attempts made so far to articulate a new conceptualization of health and health-related behaviour have failed [4, 13, 15-17]. It has been argued that the pure individualistic lifestyle approach remains predominant and that health promotion professionals are really just "health educators in thin disguise" [15] (p. 8). In addition, British governmental health policy remains unchanged. Thomas declares that The Health of the Nation [2] views behaviour as "a matter of individual responsibility" and that "behaviours are not placed in context" but rather are conceptualized in traditionally narrow epidemiological terms [3] (p. 304).

Undoubtedly, then, a key imperative is that health-related behaviour research should achieve the declared aim of articulating the connections between the actions of individuals and the socio-ecological context in which these actions are performed. There would seem to be two parts to this project: firstly, the construction of a robust theoretical framework and secondly, the methodological implementation of that framework. This paper will give attention to the former by highlighting the recent work in social theory which can provide useful guidelines for health-related behaviour research. Its main concern, however, is with methodological issues. Poland has described health promotion as sitting:

at the crossroads between continued reliance on methods and models rooted in positivist traditions of 'scientific research' on the one hand, and on the other the exploration of more qualitative and explicitly critical perspectives [18] (p. s31).

Poland concludes that the way ahead is to follow the qualitative road. The present paper challenges this conclusion by showing how the recently developed quantitative technique, multilevel modelling [19], can be used to reflect a socio-ecological interpretation of health-related behaviour. We are not suggesting that the qualitative direction is misguided or inferior. Rather we hope that a spirit of mutual trust, cooperation and cross-fertilization can be fostered. We do not envisage a crossroads, rather a carriage way consisting of two lanes running side by side.

FOUNDATIONS FOR IMPROVEMENT: GUIDELINES FROM SOCIAL THEORY

Within recent years there have been a number of significant developments within social science that parallel the issues that have been raised within health-related behaviour research. Social theory has experienced a contextual re-orientation and the advances made can help provide foundations with which to underpin contemporary health-related behaviour research which, as Thomas [3] realises, is in some senses, theoretically under-developed.

Anthony Giddens' structuration theory [20] draws attention to the way in which knowledgeable individuals will draw upon social structures in day-to-day living. Human agents operate within particular socio-cultural milieux which will contain a number of specific structural factors (conceptualized as rules and resources by Giddens) that stimulate and shape behaviour. By drawing on these, social activity produces patterns of behaviour and also reproduces the structural factors that were involved. However, as these factors are as much an outcome as a medium there is always the possibility that they will be re-created differently. This on-going, recursive process is fundamentally context-specific and Giddens' theory emphasizes the way in which the interplay between individuals (agency) and social factors (structure) will be constituted differently at different times in different settings. Context is therefore crucial both to the immediate manifestation of behaviour and the sedimentation of longer term behavioral structures. Giddens' work is complex and abstract and, whilst its implications in terms of health-related behaviour research are difficult to assess exactly, it does offer some important general guidelines. Research must consider the vitally important connections that exist between phenomena at a number of different levels--structures and institutions operating at macro-levels need to be set alongside knowledgeable and capable human agents behaving at a micro-level. As these connections are inherently situation-specific, a contextualized understanding of health-related behaviour is imperative.

The need for an emphasis on context and setting has been amplified by the critique of natural science that has been conducted by critical realist philosophers [21-23]. Focusing on the way in which social objects have been conceptualized and the need to appreciate the nature of relations between different types of social object, critical realist philosophy has outlined some crucial differences between natural science and social science. The inherent tendency for social objects to undergo both internal and external change means that the search for order and regularity that characterizes natural science and is enshrined within its philosophical companion, positivism, cannot be uncritically and automatically followed in social research. Critical realism emphasizes the likelihood of contextual variation and reveals the inadequacy of methodologies that assume invariance and universal applicability. Human agents are likely to behave quite differently in different contexts due to their active transformative nature and the ability they have to connect in different ways at different times with the external social world which itself is continually changing. This implies a need to abandon transhistorical and transcultural explanations; relationships are always embedded in particular times and particular places. The recent postmodern turn within social theory has further emphasized this need for contextually sensitive social research [17].

Given this considerable re-orientation within social theory, the desire to articulate a new socio-ecological model of health-related behaviour seems perfectly justified. It is also not surprising that workers are starting to reflect critically on the research methodologies that they use. In particular, when the connections between human agency and social structure are stressed and emphasis is placed on the importance of considering the context in which processes operate, traditional quantitative methodologies are highly problematic. The newly developed technique, multilevel modelling, can, to an extent, operationalize some of the concepts that are now being advanced with regards to health-related behaviour. This task seems particularly appropriate given the large amount of resources that have already been directed towards the collection of quantitative lifestyle information at both national and health authority level [24-26].

QUANTITATIVE MODELLING OF CONTEXTUALITY: A MULTILEVEL APPROACH

The theoretical developments reviewed in the last section emphasize the complex, multi-layered nature of reality and the need for research to articulate the connections between phenomena at several different levels--the micro-scale of people and the macro-scale of contextual settings. Furthermore, it suggests that patterns of behaviour will be characterized by variability and specificity. Recently, multilevel models have been developed which work at several levels simultaneously. As is increasingly being recognized, these models offer a seemingly robust and efficient approach to the study of contextual effects within a quantitative framework [27, 28].

In the following section we outline the conceptual development of multilevel modelling techniques to date. First we show how the multilevel framework can be used to provide a contextual analysis of behaviour which is more robust than traditional methods. A second part outlines the range of possible extensions to the basic multilevel model and assesses their importance for health-related behaviour research. In the third part one extension is considered in detail and applied to data recording smoking behaviour. As the discussion is intended for a general audience it concentrates on the application of conceptual developments to practical research areas focusing on substantive advantages rather than technical details.

The multilevel framework

Consider a simple regression model in which it is hypothesized that cigarette consumption (the response variable) is a function of a person's age (the predictor variable). A traditional single-level analysis might generate the relationship shown in Fig. 1(a). Here the cigarette consumption/age relationship is shown as a straight line with a positive slope: older people consume more cigarettes. In this model the context in which the behaviour occurs is completely ignored: one single relationship is held to exist everywhere. In effect the model has explained everything in general and nothing in particular. This can be rectified by recognising the communities in which individuals live and using a two-level model with individuals at level 1 nested within communities at level 2. One possible result is shown in Fig. 1(b), a two-level 'random-intercepts' model. Here each of six different communities has its own cigarette consumption/age relation represented by a separate line. The single, thicker line represents the general relationship across all six communities. The parallel lines imply that, while cigarette consumption increases with age at the same rate in each place, some places have uniformly higher consumption rates than others. With the multilevel approach, therefore, we can see both the general relationship across all places and the particular relationship in specific places. In Fig. 1(c) and (d) the situation is more complicated as the steepness of the lines varies from place to place. In Fig. 1(c) the pattern is such that place makes very little difference for the elderly but there is a high degree of between-community variation in the cigarette consumption of the young. In Fig. 1(d) there is a complex interaction between age and place. In some communities it is the young who have relatively high rates; in others it is the old.

Fig. 1. [Figure not reproduced: panels (a)-(d) plot cigarette consumption against age.]
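The difference between the parallel lines of Fig. 1(b) and the converging lines of Fig. 1(c) can be mimicked with simple arithmetic. The sketch below is illustrative only: the function names, the three communities and every intercept and slope are invented, not estimates from the paper, and age is assumed to be centred on its mean.

```python
# Illustrative sketch: all numbers are hypothetical, not estimates from the paper.

def predicted_consumption(age_centred, intercept, slope):
    """Straight-line prediction for one community; age is centred on its mean."""
    return intercept + slope * age_centred

# Fig. 1(b)-style pattern: place-specific intercepts, one common slope.
random_intercepts = {"A": (12.0, 0.2), "B": (15.0, 0.2), "C": (18.0, 0.2)}

# Fig. 1(c)-style pattern: higher intercepts go with flatter slopes, so places
# differ a lot for the young but converge for the old.
fanning_in = {"A": (12.0, 0.30), "B": (15.0, 0.20), "C": (18.0, 0.10)}

def gap_between(places, first, second, age_centred):
    """Difference in predicted consumption between two places at a given age."""
    a = predicted_consumption(age_centred, *places[first])
    b = predicted_consumption(age_centred, *places[second])
    return b - a

for age in (-20, 0, 20):
    print(age,
          round(gap_between(random_intercepts, "A", "C", age), 2),
          round(gap_between(fanning_in, "A", "C", age), 2))
```

With parallel lines the gap between two places is the same at every age; with varying slopes the gap depends on age, which is why a single "place effect" no longer summarizes the pattern.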

The differing patterns of Fig. 1(b)-(d) are simply achieved by varying the slopes and intercepts of the lines. Since the vertical axis is centred at the mean age of individuals, the intercept represents the number of cigarettes consumed by a person of average age. The slope represents the increase in cigarette consumption associated with a unit increase in age. The key feature of multilevel models is that the communities are treated as a sample drawn from a population and their potentially different intercepts and slopes are treated as coming from two distributions at a higher level. A multilevel analysis summarizes these higher level distributions in terms of two parts--a 'fixed' part which is unchanging across contexts, and a 'random' part which is allowed to vary. The fixed part gives the mean value of each distribution--the average slope and intercept across all communities (shown by the thick lines in Fig. 1)--while the random part consists of variances which summarize the degree to which the community-specific slopes and intercepts differ from these average values. In addition, the random part also summarizes the degree of co-variability between the two higher-level distributions [29].
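In symbols, the model described above is commonly written as follows. The notation is the conventional one for two-level models and is an assumption here, since the paper states the model only in words: $y_{ij}$ is the cigarette consumption of individual $i$ in community $j$, $x_{ij}$ is that person's age centred on the mean, and the Normal distributions are the usual working assumption.

```latex
\begin{aligned}
y_{ij} &= \beta_{0j} + \beta_{1j} x_{ij} + e_{ij}, \\
\beta_{0j} &= \beta_0 + u_{0j}, \qquad \beta_{1j} = \beta_1 + u_{1j}, \\
\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix}
  &\sim N\!\left(
     \begin{pmatrix} 0 \\ 0 \end{pmatrix},
     \begin{pmatrix} \sigma^2_{u0} & \sigma_{u01} \\ \sigma_{u01} & \sigma^2_{u1} \end{pmatrix}
   \right),
\qquad e_{ij} \sim N(0, \sigma^2_e).
\end{aligned}
```

Here $\beta_0$ and $\beta_1$ make up the fixed part (the average intercept and slope), while $\sigma^2_{u0}$, $\sigma^2_{u1}$ and $\sigma_{u01}$ are the between-community variances and covariance of the random part, alongside the individual-level variance $\sigma^2_e$.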

By adopting a multilevel approach researchers are no longer restricted to working at a single level and this provides a number of substantive advantages. First, by combining individual and aggregate levels together in one analysis both the ecological fallacy [30] and the atomistic fallacy [31] can be avoided. Working solely at an individual level means the context of local cultures is ignored, whilst working just at the aggregate level fails to capture individual variation fully. For ease of exposition we have only referred to two-level models here but software is currently available that can accommodate three levels and one package, ML3, is soon to be extended to n levels [32].

Second, by working at more than one level, the approach can start to separate compositional from contextual differences. Taking our example of smoking consumption, there may, nationally, be a tendency for older males to be heavy smokers. Consequently, high smoking places may simply result from the concentration of older males in certain locations. Alternatively, they could be a result of regional cultures that encourage smoking in all types of people. The former is a compositional difference related to the type of people contained within particular contexts. The latter is a contextual effect and refers to the difference arising irrespective of compositional make-up. A multilevel approach is able to separate these two effects and therefore has an important role to play in the examination of the regional behavioral stereotypes which have taken root in both popular and professional literature [33].
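The distinction between compositional and contextual differences can be made concrete with a toy calculation. The sketch below uses direct standardization as a simple stand-in; it is not a multilevel model, and the places, population mixes and consumption figures are all invented for illustration.

```python
# Illustrative sketch with invented figures; not data from the paper.

# Daily cigarettes by type of person, and each place's population mix.
# Place B behaves like A but has fewer older males (a compositional difference).
# Place C has A's mix but everyone smokes more (a contextual difference).
places = {
    "A": {"rates": {"older_male": 20.0, "other": 10.0},
          "mix": {"older_male": 0.6, "other": 0.4}},
    "B": {"rates": {"older_male": 20.0, "other": 10.0},
          "mix": {"older_male": 0.2, "other": 0.8}},
    "C": {"rates": {"older_male": 24.0, "other": 14.0},
          "mix": {"older_male": 0.6, "other": 0.4}},
}

def crude_mean(place):
    """Average consumption given the place's own population mix."""
    return sum(place["rates"][g] * place["mix"][g] for g in place["mix"])

def standardised_mean(place, reference_mix):
    """Average consumption if the place had the reference population mix."""
    return sum(place["rates"][g] * reference_mix[g] for g in reference_mix)

reference = places["A"]["mix"]
for name, place in places.items():
    print(name, round(crude_mean(place), 2),
          round(standardised_mean(place, reference), 2))
```

Places A and B differ in their crude means but not once composition is held constant, whereas the gap between A and C survives the adjustment. A multilevel model makes the same separation by including individual-level predictors and then estimating what between-place variation remains.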

Third, by working at several levels simultaneously it becomes possible to allow for contextual variation in the predictor variables. If we accept a socio-ecological conception of health-related behaviour then we need to anticipate that similar types of people (on the basis of individual characteristics) may not necessarily be behaving in the same way everywhere. As shown in the example, we can see both the behaviour of people on average across all places and their specific behaviour in particular places. Furthermore, we can begin to model any variation found between places by including variables at higher levels which reflect contextual characteristics. For example, we could include a measure of community deprivation to see whether it was a significant predictor of the variation between places in cigarette consumption by a person of average age. Additionally, by including cross-level interaction variables we can assess the potentially important contextual effect whereby the characteristics of people and the characteristics of contexts interact to produce substantively different expressions of behaviour. For example, people of low social status may consume varying amounts of cigarettes depending upon the social composition of the area in which they live.
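One conventional way to write a model with a higher-level predictor and a cross-level interaction is sketched below. The notation is assumed rather than taken from the paper: $y_{ij}$ is consumption for individual $i$ in community $j$, $x_{ij}$ is an individual characteristic (such as centred age or social status), $w_j$ is a community characteristic (such as deprivation or social composition), and $u_{0j}$, $u_{1j}$, $e_{ij}$ are the community-level and individual-level random terms.

```latex
\begin{aligned}
\beta_{0j} &= \beta_0 + \alpha_0 w_j + u_{0j}, \\
\beta_{1j} &= \beta_1 + \alpha_1 w_j + u_{1j}, \\
y_{ij} &= \beta_{0j} + \beta_{1j} x_{ij} + e_{ij} \\
       &= \beta_0 + \beta_1 x_{ij} + \alpha_0 w_j + \alpha_1 w_j x_{ij}
          + u_{0j} + u_{1j} x_{ij} + e_{ij}.
\end{aligned}
```

In this form $\alpha_0$ measures how far the contextual variable accounts for differences between places in the intercept, and $\alpha_1$ is the cross-level interaction: the extent to which the effect of the individual characteristic depends on the kind of place.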

Finally, the multilevel approach also allows attention to focus on the difference in variability that exists between particular types of individuals. For example, using a categorical predictor, lower status individuals may consume more cigarettes on average than higher status individuals, a fixed effect, but they may also be more variable amongst themselves in their consumption, a random effect, with some smoking much more and some smoking much less. Unlike conventional regression modelling techniques, which concentrate on the fixed part of the model and assume simple random variation capable of being captured by a single constant variance term, multilevel analysis allows the specification of a complex random part which allows attention to focus on differential variability as well as fixed differences.
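A minimal way to express differential variability in symbols is given below. The notation is an assumption for illustration: $z_{ij}$ equals 1 for a lower status individual and 0 otherwise, $u_j$ is a community-level random term and $e_{ij}$ the individual-level one.

```latex
\begin{aligned}
y_{ij} &= \beta_0 + \beta_1 z_{ij} + u_j + e_{ij}, \\
\operatorname{var}(e_{ij} \mid z_{ij} = 0) &= \sigma^2_{e0}, \qquad
\operatorname{var}(e_{ij} \mid z_{ij} = 1) = \sigma^2_{e1}.
\end{aligned}
```

The fixed difference between the two groups is $\beta_1$; the conventional regression model corresponds to the special case $\sigma^2_{e0} = \sigma^2_{e1}$, in which a single variance term describes everyone.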

It is important to realize that these substantive advantages are made within a robust technical framework. From the graphs in Fig. 1 it appears as though a separate line is fitted in each place. This would be equivalent to procedures based on traditional single level OLS regression in which the fixed part of the model is expanded to include a slope and intercept term for each individual community. If there were 200 communities, however, this approach would involve fitting a model with 400 parameters and a very large sample size would be needed to obtain reliable estimates. Traditional quantitative approaches to contextual analysis are, therefore, highly inefficient. In contrast, multilevel techniques involve estimating the statistical characteristics of the higher-level intercept and slope distributions for the population using the communities as a sample. Consequently, it is the random part of the model that is expanded and, in the example above, a multilevel analysis would involve estimating only two fixed part terms giving the average intercept and slope across all 200 places and three random terms summarizing the variability between specific places. It should be noted, however, that predictions of place-specific intercepts and slopes can be obtained and since these are made using the entire sample of places they are more precise than those from a traditional approach in which each place is estimated separately [28]. Since the multilevel approach involves estimating more than one random term, traditional OLS estimation strategies cannot be used and special multilevel modelling software is required. The approach also has a number of other technical strengths which have been discussed elsewhere [29].
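The bookkeeping behind this comparison, and the sense in which place-specific predictions borrow strength from the whole sample, can be sketched as follows. The function names are hypothetical, the variances are invented, and the shrinkage formula is the standard result for the simpler random-intercepts case rather than anything stated in the paper.

```python
# Sketch only: names and numbers are hypothetical; the shrinkage formula is the
# standard random-intercepts result, not something given in the paper.

def separate_lines_terms(n_communities):
    """Fixed-part terms when each community gets its own intercept and slope."""
    return 2 * n_communities

def multilevel_terms():
    """(fixed, random) terms: the average intercept and slope, plus the
    between-place intercept variance, slope variance and their covariance."""
    return 2, 3

def shrinkage_factor(var_between, var_within, n_in_place):
    """Weight given to a place's own data when predicting its intercept.

    Places with few respondents are pulled strongly towards the overall
    average; places with many respondents keep most of their own signal.
    """
    return var_between / (var_between + var_within / n_in_place)

print(separate_lines_terms(200), multilevel_terms())
for n in (4, 25, 100):
    print(n, round(shrinkage_factor(4.0, 16.0, n), 3))
```

The count shows why the separate-lines approach scales badly with the number of communities, while the shrinkage weight shows how a multilevel prediction for a sparsely sampled place leans on information from all the others.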

Extending the framework

So far we have only considered how a two level model can provide a meaningful vehicle for expressing a contextualized interpretation of health-related behaviour in quantitative, empirical research. We will now consider a number of extensions that can be made to the multilevel framework which can help further improve our conceptualization of health-related behaviour.

Repeated measures, longitudinal models and behavioural change. Much research has treated health-related behaviour as being stable and unchanging and there has been a tendency to rely on static, cross-sectional surveys [34]. Recently, however, there has been a marked shift with many workers realizing the need for a dynamic model and advocating the use of longitudinal research designs to reflect the process of behavioural change [4]. This would seem imperative given the results of one national survey in which 47% of the respondents identified themselves as having changed some aspect of their health-related behaviour over the previous 10 years [35].

Traditionally, the whole area of longitudinal data analysis and the assessment of change over time has been plagued by an array of methodological problems [36]. The multilevel approach offers a rewarding way out of the resulting impasse since data collected over time can be represented by multilevel structures. Two possibilities arise depending on the level of unit that is repeatedly measured. When individuals are repeatedly measured in a panel design, the behavioral measurement taken at different times, for example cigarettes smoked, forms level 1. This is nested within individuals at level 2, which in turn nest within a further higher level unit such as the community. This structure is shown in Fig. 2(a). Alternatively, if repeated cross-sectional surveys are undertaken then communities could be monitored every year, producing a structure with individuals at level 1, years within communities at level 2, and communities at level 3. This is shown in Fig. 2(b). The first case allows the assessment of individual change within a contextual setting. The second case permits the examination of trends within settings having controlled for their compositional make-up.

Substantively, the multilevel approach allows the flexible specification of variance and covariance structures through which it becomes possible to assess both which sort of individuals and which sort of communities change their behaviour. Technically, multilevel analysis is not affected by the restrictive data requirements that have hampered conventional repeated measures analyses. Within a multilevel structure both the number of observations per unit and the spacing among observations may vary. This flexibility enables efficient use to be made of all the data available.
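The panel structure, with occasions nested within persons nested within places, can be represented directly as nested data. The sketch below is a minimal illustration assuming long-format records; the identifiers, years and cigarette counts are invented, and person identifiers are assumed to be unique across places.

```python
from collections import defaultdict

# Hypothetical long-format records: (place, person, year, cigarettes per day).
records = [
    ("place1", "p1", 1990, 15), ("place1", "p1", 1991, 15),
    ("place1", "p1", 1992, 12), ("place1", "p1", 1993, 10),
    ("place1", "p2", 1990, 20), ("place1", "p2", 1993, 0),
    ("place2", "p3", 1990, 5), ("place2", "p3", 1992, 5),
    ("place2", "p3", 1993, 8),
]

def nest_panel(rows):
    """Group occasions (level 1) within persons (level 2) within places (level 3)."""
    tree = defaultdict(lambda: defaultdict(list))
    for place, person, year, cigarettes in rows:
        tree[place][person].append((year, cigarettes))
    return tree

panel = nest_panel(records)
for place, persons in panel.items():
    for person, occasions in persons.items():
        years = [year for year, _ in occasions]
        print(place, person, len(occasions), years)
```

Nothing in this structure requires every person to be observed on every occasion or at equal intervals, which is the flexibility the text describes: each person simply contributes however many level 1 records exist.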

Cross-classified models and multiple contexts. Individuals live their lives in a number of different settings--the workplace, the home, the residential neighbourhood. If a socio-ecological conception of health-related behaviour is followed there is every reason to suppose that any or all of these environments will be influential. This creates a situation in which contexts do not produce a neat hierarchical structure. Rather, a number of different settings overlap at the same level. This poses problems as all the models we have considered so far have been strictly hierarchical with contextual effects nesting within each other. Recent developments, however, allow the estimation of multilevel, cross-classified models which can be used to assess the relative importance of different contexts occurring at the same level [37]. For example a cross-classified model of smoking behaviour could be formulated with individuals at level 1 and both neighbourhoods and workplaces at level 2 and this is shown in Fig. 2(c). This approach is extremely valuable as it can identify contextual settings which are having a confounding influence. In the example above it may be discovered that what appears as between-workplace variation is in fact really between-neighbourhood variation.
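Whether a data set is strictly hierarchical or cross-classified can be checked mechanically before any model is chosen. The sketch below assumes each person is recorded with one neighbourhood and one workplace; the identifiers and the `is_nested` helper are hypothetical.

```python
# Hypothetical memberships: person -> (neighbourhood, workplace).
memberships = {
    "p1": ("n1", "w1"),
    "p2": ("n1", "w2"),
    "p3": ("n2", "w1"),
    "p4": ("n2", "w3"),
}

def is_nested(pairs):
    """True if every lower unit (first item) has exactly one upper unit (second item)."""
    parent = {}
    for lower, upper in pairs:
        if parent.setdefault(lower, upper) != upper:
            return False
    return True

# Do workplaces sit wholly inside neighbourhoods, or the reverse? If neither,
# the two classifications cross and a strictly hierarchical model does not apply.
workplace_in_neighbourhood = is_nested((w, n) for n, w in memberships.values())
neighbourhood_in_workplace = is_nested((n, w) for n, w in memberships.values())
cross_classified = not (workplace_in_neighbourhood or neighbourhood_in_workplace)
print(cross_classified)
```

Here workplace w1 draws people from two neighbourhoods and neighbourhood n1 sends people to two workplaces, so neither classification nests within the other.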

Multivariate, multilevel models: behavioural clustering. Research has tended to treat health-related behaviour as consisting of a number of simple, discrete, unrelated actions. This has been seen as a consequence of biomedicine's reductionist and compartmentalizing tendency which has manifested itself in research funding mechanisms that encourage work to focus on separate behaviours [4]. The outcome is that researchers have largely failed to consider the extent to which different behaviours interrelate [38]. This is a serious oversight for it appears that the accumulation of risk factors may have a multiplicative effect thereby significantly increasing the degree of absolute risk [39]. Paradoxically, however, there is also a collective concept known as 'lifestyle': a belief in behavioural stereotypes which is applied to both people and places. Given the dearth of research that has considered the connections between different behaviours, it is difficult to know whether 'lifestyle' is a meaningful conception or whether it is simply a 'taxonomic collective', a convenient term that groups disparate entities.

Fig. 2. A range of multilevel structures. [Figure not reproduced. Panels: (a) panel design (level 1 time, level 2 person, level 3 place); (b) repeated cross-sectional design (level 1 person, level 2 time, level 3 place); (c) cross-classified structure (level 1 person, level 2 workplace); (d) multivariate responses (level 1 response, level 2 person, level 3 place); (e) mixed multivariate responses (level 1 response, level 2 person, level 3 place).]

By extending the multilevel framework to a multivariate model [19] it becomes possible to assess the degree to which the different behaviours are connected. If we collect information for individuals on each behaviour then we can produce a multivariate, multilevel structure in which level 1 is a set of response variables, one for each behaviour, which nest within individuals at level 2, who nest within communities at level 3. This form of multilevel structure is given in Fig. 2(d). In substantive terms, two main benefits arise from a multilevel, multivariate approach. First, the behaviours are directly comparable in terms of how each is related to individual-level characteristics. Answers to complex questions can be given: for example, is smoking related to age and socio-economic status in the same way as unhealthy eating? Second, the residual covariance matrix can be estimated both at the level of the individual, showing whether those who drink heavily also smoke heavily, controlling for individual characteristics, and at the level of the community, indicating whether high drinking places are also high smoking places, having controlled for the characteristics of the people within them.

All multilevel models can use both continuous and categorical data. Importantly, therefore, this multivariate multilevel framework can be applied to continuous response variables, categorical response variables, and also a combination of the two. Consequently, it can be applied in a wide variety of situations, which significantly increases its value for health-related behaviour research. First, if the response variables are all continuous, it can be used to model a group of measurements (e.g. number of cigarettes and units of alcohol consumed). Second, if the response variables are all categorical, a group of dichotomous classifications can be modelled (e.g. low smoker/high smoker and low drinker/high drinker). Third, a group of dichotomous classifications can also be used to represent a single response variable which consists of multiple categories. For example, people may be classified as never having smoked, as currently smoking or as having given up smoking. This can be represented as a multivariate structure in which there are three categorical (0,1) variables for each person, one for each of the categories, only one of which is 1, corresponding to the category which that person is in. This produces what is termed a multinomial, multivariate model and provides a way of modelling multiple response category data [40]. Finally, the responses can be a mixture of both categorical and continuous variables. This variant of the multivariate model can provide a key distinction which is often missing in quantitative health-related behaviour research. Consequently, it will now be outlined in detail and applied to data recording smoking behaviour in the United Kingdom to illustrate the value of multilevel techniques.
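The three-category coding described above can be sketched in a few lines of Python. This is an illustration only: the category labels and the function name `to_indicators` are assumptions made for the example and do not come from the ML3 software.

```python
# Illustrative sketch: represent a three-category smoking status as three
# (0,1) indicator variables, exactly one of which is 1 for each person.
CATEGORIES = ("never", "current", "ex")  # hypothetical labels


def to_indicators(status):
    """Return one 0/1 indicator per category for a single person."""
    if status not in CATEGORIES:
        raise ValueError(f"unknown status: {status!r}")
    return [1 if status == category else 0 for category in CATEGORIES]


# Four hypothetical respondents.
people = ["never", "current", "ex", "current"]
coded = [to_indicators(status) for status in people]

for status, row in zip(people, coded):
    print(status, row)
```

Each person contributes three level 1 records, which is what gives the multinomial response its multivariate form.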

Mixed, multivariate multilevel models and behavioural dimensions: an outline and application.

Data on health-related behaviours frequently record both occurrence and quantity. The analysis of such data needs to be considered carefully as its distributive form is often problematic. For example, when either the number of cigarettes smoked daily or the number of units of alcohol consumed weekly are recorded, a substantial number of the people questioned do not practice these behaviours. This produces responses with a 'spiked' distribution: a large number of values at zero followed by a distribution of non-zero values. These features are shown in Fig. 3 for data recording smoking quantity from the Health and Lifestyle Survey [24], a large-scale survey of 9003 adults conducted across mainland Britain in 1984/5. Here 65% of respondents did not smoke, producing the characteristic 'spike', whilst the responses for the remaining 35% of the sample, who did smoke, produce a distribution of continuous values.

If such data is treated as a single continuous response variable and a simple linear model is fitted, the results obtained provide poor descriptors. In this instance, for example, if the model

$$y_i = \beta_0 + \beta_1 x_i + e_i \qquad (1)$$

is fitted, where $\beta_0$ represents the average cigarette consumption for females and $\beta_1$ the contrast in average cigarette consumption for males, women are estimated as smoking 4.7 cigarettes per day with males smoking an additional 1.4 cigarettes per day. However, as Fig. 3 reveals, very few people actually smoke these amounts.
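The difficulty can be illustrated with a minimal sketch. The sixteen values below are invented for the example and are not survey records; the function `fit_simple_ols` is likewise a hypothetical helper that applies the usual closed-form least squares estimates for model (1) with a single 0/1 predictor.

```python
# Illustrative sketch with invented data: a spiked response fitted by a
# simple linear model with a male indicator (x = 1 for males).
def fit_simple_ols(x, y):
    """Closed-form least squares estimates (b0, b1) for y = b0 + b1*x + e."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical cigarettes per day: most people are at zero (the spike).
females = [0, 0, 0, 0, 0, 0, 15, 25]
males = [0, 0, 0, 0, 0, 0, 20, 28]
x = [0] * len(females) + [1] * len(males)
y = females + males

b0, b1 = fit_simple_ols(x, y)
fitted = {b0, b0 + b1}

print("female estimate:", b0)
print("male contrast:", b1)
print("anyone observed at a fitted value?", any(v in fitted for v in y))
```

With a single indicator predictor the fitted values are simply the two group means, which sit between the spike at zero and the smokers' actual consumption, so they describe nobody in the invented sample.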

To avoid this situation the distribution needs to be considered in terms of its two key features. First, the spike, which can be identified by simply distinguishing zero values from non-zero values and secondly, the non-zero values which have a continuous distribution. These elements correspond with two distinct and substantively interesting processes. The first reflects the occurrence of smoking and distinguishes those who do not smoke from those who do and the second indicates the actual quantity of cigarettes consumed by those people who are smokers. The first can be regarded as an indicator of the prevalence or acceptability of smoking behaviour whilst the second is a quantitative dimension that constitutes a direct behavioural measure of consumption [41]. If the original data is treated as consisting of a mixed response (an 'on/off' binary switch relating to occurrence and, for the 'on' setting of the switch, a continuous variable relating to quantity) each component can be identified and then modelled. This could be done in two separate generalized linear models. However, as the multilevel, multivariate structure can handle a mixture of categorical and continuous response variables it allows a more sophisticated strategy in which the two components can be differentiated yet handled simultaneously within a single overall model.

The mixed, multivariate structure that can be applied to the data in Fig. 3 is shown in Fig. 2(e). Those people who smoke have both a categorical response variable set to '1', indicating that they are a smoker, and a continuous response variable showing the number of cigarettes consumed daily. As the multilevel approach does not require balanced data, the non-smokers have only the categorical response variable, set to '0', indicating that they do not smoke. The result of this is that the occurrence of smoking can be separated from, yet considered simultaneously with, the quantity smoked.

Fig. 3. Distribution of cigarette smoking quantity (source: 1984/5 Health & Lifestyle Survey). Number of respondents plotted against cigarettes smoked per day, in bands running from 0 and 0.5 through 1-5, 6-10 and so on up to 60+; the bar at zero contains 5848 respondents.

Adopting this form of multilevel approach brings a number of substantive advantages for health-related behaviour research. Previously, studies have tended either to conflate the two dimensions of behaviour or to study one to the exclusion of the other. Furthermore, studies focusing on the quantity of behaviour have often been forced to reduce consumption measures to simplistic dichotomies. For example, in studies of smoking, individuals have been classified as either heavy or light smokers [42]. The multilevel approach offers a way in which the different dimensions of behaviour can be identified whilst maintaining the inherent richness of the data available. In addition, by adopting such an approach it becomes possible to establish which explanatory variables are most strongly related to which parts of the distribution. It may be that age is strongly implicated in the probability of an individual being a smoker, yet plays little role in influencing how many cigarettes smokers actually consume. A gender gap, meanwhile, may be present in both the probability of being a smoker and the number of cigarettes consumed. A final advantage is that attention can focus on contextual variation in the two mixed response variables. When a third level is added to the structure, as shown in Fig. 2(e), by recognizing that individuals live in places, it becomes possible to assess whether both the proportion of smokers and the quantity of cigarettes consumed vary across communities, controlling for place composition. Importantly, a covariance term can also be estimated at this third level which shows the relationship between the proportion of smokers and the quantity smoked. This allows the assessment of whether places with high proportions of smokers are also characterized by high quantities of cigarette consumption. This form of contextual relationship has been of recent interest amongst researchers investigating the effect of different workplace environments upon smoking behaviour [43].

This approach was applied to the data shown in Fig. 3 with a three level model being formulated with the two response variables at level 1, nested within individuals at level 2, who are classified according to their electoral ward of residence at level 3 [44]. Hence, the structure corresponded exactly to that shown in Fig. 2(e) and after excluding those respondents with missing information [45] the final model consisted of 12112 responses at level 1, nested within 8980 individuals at level 2, nested within 396 electoral wards at level 3. The model was estimated using the software package ML3 [32] and the results obtained are presented in Table 1. This table contains the estimates for both the fixed effects and the random effects, with the left hand column referring to the estimates for the categorical response variable (the occurrence of smoking) and the right hand column referring to the estimates for the continuous response variable (the quantity of cigarettes smoked daily by those who are smokers).

Table 1. Estimates from a mixed multivariate model of smoking behaviour as recorded by the 1984/5 Health & Lifestyle Survey

| Fixed effects | Categorical | Continuous |
|---|---|---|
| Intercept | -0.36 | 15.49 |
| Age: linear | 0.0694 (1.57) | 0.953 (3.28) |
| Age: quadratic | 0.00216 (2.24) | -0.0163 (2.61) |
| Age: cubic | -0.0000207 (3.17) | 0.0000745 (1.76) |
| Gender: male | 0.107 (1.60) | 3.801 (8.24) |
| Age-gender: linear | 0.130 (2.12) | 0.221 (0.52) |
| Age-gender: quadratic | -0.0031 (2.29) | -0.00065 (0.07) |
| Age-gender: cubic | 0.0000229 (2.48) | 0.0000169 (0.26) |
| Social class: I & II | -0.622 (10.13) | 0.633 (1.43) |
| Social class: IV & V | 0.163 (2.69) | -0.162 (0.40) |
| Social class: III non-manual | -0.423 (5.62) | -1.756 (3.29) |
| Social class: other | 0.289 (1.06) | 0.787 (0.41) |
| Social class: armed forces | 0.170 (0.64) | 0.0159 (0.009) |
| Social class: student | 1.074 (3.19) | -5.127 (1.96) |
| Marital status: single | 0.12 (1.56) | 0.030 (0.06) |
| Marital status: widowed | 0.171 (1.67) | 0.843 (1.11) |
| Marital status: divorced | 0.584 (6.23) | 2.193 (3.79) |

| Random effects (variance) | Categorical | Continuous |
|---|---|---|
| Level 3: intercept | 0.108 (4.86) | 1.099 (1.41) |
| Level 3: covariance between the two intercepts | 0.265 (2.78) | |
| Level 2: intercept | | 78.44 (37.33) |

Note: categorical estimates represent logit values; continuous estimates are number of cigarettes smoked per day; figures in parentheses represent ratio of estimates to standard error.

Fixed effects. As three individual level variables (sex, social class and marital status) were included as a set of indicator variables and age was centred around the sample mean of 45.86 years, the intercept terms represent the stereotypical individual who is a married woman of average age in social group III-manual. Consequently, the intercept for the categorical response, which is given in logit form [46] at the top of the first column in Table 1, gives a nation-wide average estimate of the log-odds that this stereotypical individual is a smoker. When transformed from its logit value this represents a probability of 41%. The intercept for the continuous response is given at the top of the second column in Table 1. This represents the average daily cigarette consumption by the stereotypical individual who smokes; here it is 15.49 cigarettes per day.
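The transformation from the logit scale can be checked with a short sketch. The function name `inverse_logit` is an assumption for the example; the estimates are those reported in Table 1, and adding the social class I & II contrast to the intercept is shown only as an illustration of how a contrast would be applied.

```python
import math


def inverse_logit(logit):
    """Convert a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-logit))


# Intercept for the categorical response (Table 1): the stereotypical
# individual, a married woman of average age in social group III-manual.
p_base = inverse_logit(-0.36)

# Illustration: the same individual in social group I & II, obtained by
# adding that contrast (-0.622) to the intercept on the logit scale.
p_class_1_2 = inverse_logit(-0.36 + -0.622)

print(f"base probability of smoking: {p_base:.2f}")  # 0.41
print(f"social group I & II: {p_class_1_2:.2f}")
```

Contrasts must be added on the logit scale before transforming, since differences in probability are not constant across the range of the logit.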

To interpret the other fixed effects estimates in Table 1 it must be remembered that these represent contrasts on each variable from the base categories (female, III-manual, married, average age) that characterize the stereotypical individual. A simple test of their significance can be performed by dividing each estimate by its standard error. If the absolute value of the ratio exceeds 2, the estimate is judged significantly different from zero at the 0.05 level. If this test is performed we find a number of interesting differences in the relationships between the predictor variables and the two responses.
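As a small sketch of this rule of thumb, the ratios reported in Table 1 for the 'male' term can be screened as follows. The function name `is_significant` and the dictionary layout are assumptions made for the illustration.

```python
def is_significant(ratio, threshold=2.0):
    """Rule of thumb: |estimate / standard error| above 2 is significant."""
    return abs(ratio) > threshold


# Ratios of estimate to standard error for the 'male' term in Table 1.
male_ratios = {"categorical": 1.60, "continuous": 8.24}
flags = {response: is_significant(r) for response, r in male_ratios.items()}

for response, flag in flags.items():
    print(response, "significant" if flag else "not significant")
```

The threshold of 2 is an approximation to the 1.96 critical value of the standard normal distribution.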


The estimates for the fixed effect titled 'male' give the contrasts in each response variable should the respondent be male rather than female. For the categorical response, the male term is not significant, showing that in terms of the probability of being a smoker there are no important differences between men and women of average age. However, at this average age men do seem to be consuming a significantly larger number of cigarettes each day (the male term for the continuous response is significant and is estimated as an extra 3.8). Thus, there appears to be a very different gender gap between the two dimensions.

There appear to be significant differences in the relationship between each response and the age of individuals. This relationship was estimated for men and women separately through the creation of age-gender interaction variables and for each gender category linear, quadratic and cubic age terms were fitted to allow the assessment of non-linearity. The relationship for women is given by the terms under the heading 'age' in Table 1 and for men by the terms under the heading 'age-gender' which represent contrasts from the equivalent female terms. Examining the estimates and the estimate to standard error ratios reveals that for women, the cubic age term is emphasized with regards to the probability of smoking whilst the linear age term is emphasized with regards to the amount consumed. For men the relationship with age seems to be different from that for women for the probability of being a smoker but not for the quantity of cigarettes consumed (all the age-gender terms are significant for the categorical response and non-significant for the continuous response).

These relationships between the responses and age are most easily appreciated by calculating a composite function of the linear, quadratic and cubic terms and graphing the results. This has been done for the occurrence (categorical) response variable and for the quantity (continuous) response variable in Fig. 4 and Fig. 5 respectively. As can be seen, in terms of the occurrence of smoking the shape of the curve is more complex for women than men. It seems as though older women are more likely to be non-smokers than older men and so in later years there is a gender gap. For the quantity of cigarettes consumed the overall shape of the curves is approximately the same (quadratic) but there is a significant gender gap which is most extreme in middle life though it does reduce for the young and even reverses for the very elderly. It should be noted that in a cross-sectional analysis such as this it is not possible to disentangle age and cohort effects.

Finally, both the estimates for social class effects and marital status effects are statistically more significant with regards to the probability of individuals being smokers than they are with regards to the quantity consumed. The case of people in social group IV and V is interesting. Compared to people in the base category, social group III-manual, people in social group IV and V seem more likely to be smokers but there is a tendency, although the estimate is not statistically significant, for them to actually consume less.

Random effects. As can be seen in Table 1 the random part of the model is comparatively simple. At level 2 only the continuous intercept term is random and this gives the degree of variation between individuals (level 2 units) in terms of the number of cigarettes consumed. The value obtained is very large (78.44) suggesting that most variation in the number of cigarettes smoked is at the individual level.

Fig. 4. The relationship between age and the occurrence of smoking for men and women. (Horizontal axis: age, deviated around the mean of 45.86 years; separate curves for males and females.)

Fig. 5. The relationship between age and the quantity of smoking for men and women. (Horizontal axis: age, deviated around the mean of 45.86 years; separate curves for males and females.)

At level 3, the intercept terms for both response variables are allowed to vary. On the basis of the simple significance test performed earlier the between-place variation in the quantity of cigarettes consumed by a smoker does not appear to be significant (estimate to standard error ratio = 1.41) and is seen to be considerably smaller than the between-individual variation considered earlier. However, the between-place variation in the categorical response intercept is significant (estimate to standard error ratio = 4.86) and so there does seem to be some contextual variation between electoral wards in terms of the log-odds of an individual being a smoker controlling for compositional make-up.
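The relative size of the two variance estimates for the continuous response can be made concrete with a short calculation. Treating the level 3 and level 2 intercept variances from Table 1 as a simple two-part variance partition is an assumption of this sketch rather than a calculation reported in the paper, and the level 3 estimate is itself not significant.

```python
# Variance estimates for the continuous (quantity) response from Table 1.
place_variance = 1.099    # level 3: between electoral wards
person_variance = 78.44   # level 2: between individuals

# Share of the total variance that lies between places (assumed partition).
place_share = place_variance / (place_variance + person_variance)

print(f"between-place share of variance: {place_share:.3f}")  # 0.014
```

On this reading, under 2% of the variation in the number of cigarettes smoked lies between wards, which is consistent with the statement that most variation is at the individual level.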

The remaining random term at level 3 represents the covariance between the two random intercept terms at the place level, controlling for compositional variations at the individual level, and the value estimated is positive and statistically significant. The correlation between the intercepts for the two responses at this higher level can be calculated as the ratio of their covariance to the square root of the product of their variances. When this is done, the value obtained is 0.77, showing a very high positive correlation between the two response variables. Thus, places with high numbers of smokers seem also to be high smoking places in terms of the average cigarette consumption by those who smoke. This can be represented graphically on a scatterplot of the two sets of place-specific residuals which represent differences from the national average values on each response variable as predicted by the model for each of the 396 electoral wards sampled (Fig. 6). Here, zero on each axis represents the national average, each point represents an electoral ward, and as can be seen there is a very strong positive association between the occurrence of smoking and the quantity smoked. It would appear, therefore, that smoking cultures develop in local neighbourhoods whereby the co-presence of similarly behaving people influences the number of times people practice that behaviour. In places where there are few smokers consumption is discouraged; when there are many it is stimulated [47].
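The correlation quoted above can be reproduced from the level 3 estimates in Table 1. The function name `correlation_from_covariance` is an assumption made for this sketch.

```python
import math


def correlation_from_covariance(covariance, variance_a, variance_b):
    """Correlation = covariance / sqrt(product of the two variances)."""
    return covariance / math.sqrt(variance_a * variance_b)


# Level 3 random effects from Table 1.
r = correlation_from_covariance(
    covariance=0.265,   # between the two place-level intercepts
    variance_a=0.108,   # categorical (occurrence) intercept variance
    variance_b=1.099,   # continuous (quantity) intercept variance
)

print(f"place-level correlation: {r:.2f}")  # 0.77
```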

CONCLUSIONS

In this paper we have outlined the general multilevel framework and shown how it can help empirical health-related behaviour research reflect the theoretical move away from biomedical based notions to a concern for contextual, situational factors. We have demonstrated its value by applying one type of model that can be developed within a multilevel framework: the mixed, multivariate, multilevel model. It has been shown that this model affords the opportunity to distinguish separate dimensions of behaviour whilst considering them simultaneously. This type of multilevel model can be applied more generally as it can handle polychotomous response variables together with several continuous response variables. For example, this would be useful in the situation where we wanted to model a 2 × 2 classification of individuals on the basis of their smoking (yes/no) and drinking behaviour (yes/no), with the quantity of each behaviour also measured by a continuous response variable.

The mixed, multivariate model is only one of many that can be developed within the multilevel framework and applied to health-related behaviour research. Consequently, we believe multilevel procedures have a great deal to offer. As the RUHBC puts it:

The future developments of such [multilevel] techniques may well be crucial to the very 'essence' of a changing public health [4] (p. 26).

Fig. 6. Scatterplot of the place-specific residuals for the two response variables (note: the residuals for the categorical (occurrence) response represent logit values; the residuals for the continuous (quantity) response represent the number of cigarettes smoked per day).

However, we do not wish to overvalorize the technique as it is still open to many of the criticisms that have been made of traditional quantitative methods. It remains crude, reductionist and mechanistic and does not authentically capture the complex way in which health-related behaviour is embedded in the flow of situated daily routines. Nevertheless, within these limitations, multilevel modelling techniques constitute a considerable improvement on existing quantitative methodologies. Furthermore, they offer the possibility of achieving a reconciliation with qualitative research. Since they consider both the general and the specific they can reveal the broad patterns of health-related behaviour whilst also disclosing those situations in which these patterns do not hold. Such situations would benefit from more intensive, in-depth research. Consequently, multilevel modelling can be seen as offering one way in which quantitative and qualitative research designs can be connected [47].

Acknowledgements--The authors would like to acknowledge the extremely useful comments of two anonymous referees. The Health and Lifestyle Survey data were obtained through the ESRC Data Archive at the University of Essex.

REFERENCES

1. Department of Health and Social Security. Prevention and Health: Everybody's Business. HMSO, London, 1976.

2. Department of Health. The Health of the Nation: A Strategy for Health in England. HMSO, London, 1992.

3. Thomas C. Public health strategies in Sheffield and England: a comparison of conceptual foundations. Hlth Promot. Int. 8, 299, 1993.

4. Research Unit in Health and Behavioural Change. Changing the Public Health. John Wiley, Chichester, 1989.

5. Martin C. and McQueen D. Framework for a new public health. In Readings for a New Public Health (Edited by Martin C. and McQueen D.), p. 1. Edinburgh University Press, Edinburgh, 1989.

6. Allison K. Health education: self responsibility versus blaming the victim. Hlth Ed. 20, 11, 1982.

7. Naidoo J. Limits to individualism. In The Politics of Health Education (Edited by Rodmell S. and Watt A.), p. 17. RKP, London, 1986.

8. Stainton-Rogers W. Explaining Health and Illness: An Exploration of Diversity. Harvester Wheatsheaf, Hemel Hempstead, 1991.

9. Dean K. Methodological issues in the study of health-related behaviour. In Health Behaviour Research and Health Promotion (Edited by Anderson R. et al.), p. 83. Oxford Univ. Press, 1988.

10. McQueen D. Editorial. Hlth Ed. Res. 6, 137, 1991.

11. Ashton J. and Seymour H. The New Public Health. Open University Press, Milton Keynes, 1988; Martin C. and McQueen D. Readings for a New Public Health. Edinburgh University Press, Edinburgh, 1989.

12. Macdonald G. and Bunton R. Health promotion: discipline or disciplines? In Health Promotion: Disciplines and Diversity (Edited by Bunton R. and Macdonald G.), p. 6. Routledge, London, 1992.

13. Thorogood N. What is the relevance of sociology for health promotion? In Health Promotion: Disciplines and Diversity (Edited by Bunton R. and Macdonald G.), p. 42. Routledge, London, 1992.

14. Fitzpatrick R. Society and changing patterns of disease. In Sociology as Applied to Medicine (Edited by Scambler G.), p. 3. Baillière Tindall, London, 1991.

15. Rodmell S. and Watt A. Conventional health education: problems and possibilities. In The Politics of Health Education (Edited by Rodmell S. and Watt A.), p. 1. RKP, London, 1986.

16. Rawson D. The growth of health promotion theory and its rational reconstruction: lessons from the philosophy of science. In Health Promotion: Disciplines and Diversity (Edited by Bunton R. and Macdonald G.), p. 202. Routledge, London, 1992.

17. Fox N. Postmodernism, Sociology and Health. Open University Press, Milton Keynes, 1993.

18. Poland B. D. Learning to 'walk our talk': the implications of sociological theory for research methodologies in health promotion. Can. J. Publ. Hlth 83, s31, 1992.

19. Goldstein H. Multilevel Models in Social and Educational Research. Griffin, London, 1987.

20. Giddens A. The Constitution of Society: Outline of a Theory of Structuration. Polity Press, Milton Keynes, 1984.

21. Bhaskar R. A Realist Theory of Science. Leeds Books, Leeds, 1975.

22. Harre R. Social Being. Blackwell, Oxford, 1979.

23. Sayer A. Method in Social Science: A Realist Approach (2nd Edn). Routledge, London, 1992.

24. Cox B. Health and Lifestyle Survey, 1984-5 (computer file). ESRC Data Archive, Colchester, 1988. Cox B. et al. The Health and Lifestyle Survey. Health Promotion Research Trust, London, 1987.

25. Cox B. et al. The Health and Lifestyle Survey: Seven Years On. Dartmouth, Aldershot.

26. Health Education Authority, Office of Population Censuses and Surveys. Health and Lifestyle Surveys: Towards a Common Approach. HEA, London, 1990.

27. Hox J. J. and Kreft I. G. Multilevel analysis methods. Sociol. Meth. Res. 22, 283, 1994.

28. Jones K. and Bullen N. Contextual models of house prices: a comparison of fixed- and random-coefficient models developed by expansion. Econ. Geog. 70, 252, 1994.

29. A more detailed discussion of these concepts can be found in Jones K. Using multilevel models for survey analysis. J. Market Res. Soc. 35, 249, 1993.

30. Robinson W. Ecological correlations and the behaviour of individuals. Am. Sociol. Rev. 15, 351, 1950.

31. Alker H. A typology of ecological fallacies. In Quantitative Ecological Analysis (Edited by Dogan M. and Rokkan S.). MIT Press, Mass., 1969.

32. Prosser R. et al. ML3: Software for Three-Level Analysis. Institute of Education, University of London, 1991. It should also be noted that the precision of estimation depends on the number of units at each level. Guidelines are presented in Paterson L. and Goldstein H. New statistical methods for analyzing social structures: an introduction to multilevel models. Br. Ed. Res. J. 17, 387, 1991.

33. Duncan C. et al. Do places matter? A multilevel analysis of regional variations in health-related behaviour. Soc. Sci. Med. 37, 725, 1993.

34. McQueen D. Directions for research in health behaviour related to health promotion: an overview. In Health Behaviour Research and Health Promotion (Edited by Anderson R. et al.), p. 251. Oxford University Press, 1988.

35. Cartwright A. and Anderson R. General Practice Revisited: A Second Study of Patients and their Doctors. Tavistock, London, 1981.

36. Bryk A. and Raudenbush S. Hierarchical Linear Models: Applications and Data Analysis Methods. Sage Publications, Newbury Park, 1992.

37. Goldstein H. Multilevel cross-classified models. Mimeo. Inst. of Education, Univ. of London, 1992.

38. Blaxter M. Health and Lifestyles. Tavistock/Routledge, London, 1990.

39. Kok F. J. et al. Characteristics of individuals with multiple behavioural risk factors for coronary heart disease: The Netherlands. Am. J. Publ. Hlth 72, 986, 1982.

40. Goldstein H. Multilevel Multinomial Response Models. Working Paper, Multilevel Models Project, University of London, 1992; Multilevel Models Project. A Guide to ML3 Macros: Multilevel Multinomial Response Logistic Models. Inst. of Education, Univ. of London, 1993.

41. Colby J. P. et al. Social stress and state-to-state differences in smoking and smoking related mortality in the United States. Soc. Sci. Med. 38, 373, 1994.

42. Serxner S. et al. Tobacco use: selection, stress, or culture? J. Occupational Med. 33, 1035, 1991.

43. Serxner S. et al. Influence on cigarette smoking quantity: selection, stress or culture? J. Occupational Med. 34, 934, 1992.

44. Details of the general data design and model specification procedure on which this example is based are given in Appendix 1. Further details can be found in Multilevel Models Project. A Guide to ML3 Macros: Multilevel Multivariate Mixed Models with both Multinomial and Continuous Responses. Inst. of Education, Univ. of London, 1993. As these models are based on the general multivariate structure the reader is also encouraged to see Cresswell M. A multivariate bivariate model. In Data Analysis with ML3 (Edited by Prosser R. et al.), p. 76. Inst. of Education, Univ. of London, 1991; Duncan C. et al. Blood pressure, age and gender. In A Guide to ML3 for New Users (Edited by Woodhouse G.), p. 55. Inst. of Education, Univ. of London, 1994.

45. Twenty-three individuals failed to respond to the smoking questions and were excluded. In total there were 5848 non-smokers, and 2942 regular smokers. The survey also recorded 190 individuals as 'occasional smokers' (fewer than one a day). Here, these people are coded as smokers with a quantity of 0.5 cigarettes per day.

46. Due to a number of technical reasons those parts of the model relating to the categorical response are specified as a logit formulation. See Healy M. J. R. GLIM: An Introduction. Oxford Univ. Press, 1988.

47. It should be noted that some caution is required with this interpretation, however, as the non-significance of the intercept term for the continuous response will mean that the place-specific predictions for cigarette consumption will have large standard errors associated with them. See Duncan C. Modelling Contextuality in Health-related Behaviour, Ph.D. thesis (in preparation), Department of Geography, University of Portsmouth.

48. Jones K. and Duncan C. People and places: the multilevel model as a general framework for the quantitative analysis of geographical data. In Geographic Information and Society (Edited by Poiker T.). National Centre for Geographic Information and Analysis, Friday Harbor, Washington.

APPENDIX

Mixed multivariate multilevel models: data design and model specification

To show how a mixed, multivariate, multilevel model can be developed, we start by considering the data matrix that can be formed from the smoking data presented earlier, recognizing both individuals and their place of residence. To begin, we have two response variables for each respondent, one recording occurrence and the other quantity, as shown below:

Person  Place  Smoker  Amount  Age
1       1      1       25      32
2       1      0       0       46
3       2      1       15      23

If we make a single column vector containing both of the response variables, Smoker and Amount, rather than having each of them on one record, then we create a multivariate structure. When we do this it is also necessary to duplicate the values of any individual-level variables which we wish to use as predictors. In this instance we have included just one, Age. This produces the matrix of interleaved responses shown below:

Person  Place  Response  Age
1       1      1         32
1       1      25        32
2       1      0         46
2       1      0         46
3       2      1         23
3       2      15        23
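The interleaving just described can be sketched in a few lines of Python. This is only an illustration using the three example respondents; the function name `interleave` and the helper column `Type` are assumptions introduced here and are not part of the original analysis, which was carried out in ML3.

```python
# Wide table from the appendix: one record per person.
wide = [
    {"Person": 1, "Place": 1, "Smoker": 1, "Amount": 25, "Age": 32},
    {"Person": 2, "Place": 1, "Smoker": 0, "Amount": 0, "Age": 46},
    {"Person": 3, "Place": 2, "Smoker": 1, "Amount": 15, "Age": 23},
]


def interleave(records):
    """Return two records per person: the Smoker response, then the Amount response."""
    rows = []
    for rec in records:
        for response_type in ("Smoker", "Amount"):
            rows.append({
                "Person": rec["Person"],
                "Place": rec["Place"],
                "Response": rec[response_type],
                # The individual-level predictor is duplicated on both records.
                "Age": rec["Age"],
                # Hypothetical helper column, not in the original table.
                "Type": response_type,
            })
    return rows


interleaved = interleave(wide)
for row in interleaved:
    print(row["Person"], row["Place"], row["Response"], row["Age"])
```

Keeping a record of the response type at this stage makes it straightforward to derive indicator variables later.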

It is necessary to perform three additional steps before we have a suitable data matrix for modelling. First, as stated earlier the multilevel approach does not require balanced data and so we delete the continuous response record for those people who are not smokers, thus distinguishing the two dimensions of behaviour. Secondly, we need to recognize the type of response variable that each record contains and this can be done by creating indicator variables which will also represent the intercepts. Finally, any further explanatory variables have to be defined by multiplying these indicator variables by individual level explanatory variables. When these three steps are performed we produce the final data matrix shown below:

               Response  Categorical  Continuous  Categorical  Continuous
Person  Place  F_ijk     z_1jk        z_2jk       z_1jk x_1jk  z_2jk x_1jk
1       1      1         1            0           32           0
1       1      25        0            1           0            32
2       1      0         1            0           46           0
3       2      1         1            0           23           0
3       2      15        0            1           0            23

This table reveals that we wish to model a response vector, $F_{ijk}$, consisting of $i$ responses which are a mixture of both categorical and continuous variables for $j$ individuals in $k$ places, with the structure of the vector being given by the two indicator variables, $z_{1jk}$ and $z_{2jk}$. If we use a three-level model with level 1 defining the multivariate structure using the indicator variables, level 2 describing between-individual variation and level 3 describing between-place variation, we can specify the following relationship between the mixed response variables and the one explanatory variable:

$$F_{ijk} = F_{ijk}^{(1)} + F_{ijk}^{(2)} \tag{A1}$$

where $F_{ijk}^{(1)}$ expresses the model for the categorical response and $F_{ijk}^{(2)}$ expresses the model for the continuous response. We can now write a series of models for each. Thus, we can write the following random-intercepts model for the categorical response, $p_{jk}$, the true underlying probability that an individual is a smoker, which for a number of technical reasons [46] is specified as having a non-linear form which is linearized for estimation by taking a logit formulation so that:

$$F_{ijk}^{(1)} = E(L_{jk})$$

where

$$E(L_{jk}) = E[\log_e(p_{jk}/(1 - p_{jk}))] = \beta_0^{(1)} z_{1jk} + \beta_1^{(1)} z_{1jk} x_{1jk} + (\mu_{0k}^{(1)}) \tag{A2}$$

For the continuous response variable, the number of cigarettes smoked per day, we can also specify a random-intercepts model but with the standard Gaussian error structure:

$$F_{ijk}^{(2)} = \beta_0^{(2)} z_{2jk} + \beta_1^{(2)} z_{2jk} x_{1jk} + (e_{0jk}^{(2)} + \mu_{0k}^{(2)}) \tag{A3}$$

The level 3 between-place random terms, $\mu_{0k}^{(1)}$ and $\mu_{0k}^{(2)}$, represent respectively the place-specific differences in the log-odds that an individual is a smoker and the place-specific differences in average daily cigarette consumption by smokers. Thus, considering the model overall there are two variables ($\beta_0^{(1)}$ and $\beta_0^{(2)}$) with higher-level distributions. Consequently, besides estimating the mean and variance of each of these we can also summarize their joint distribution by the covariance term, $\sigma_{\mu 0 \mu 0}^{(1)(2)}$, and it is this term that allows us to assess the relationship between the proportion of smokers in a place and the quantity of cigarettes consumed by smokers in a place controlling for compositional make-up.
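Written out, the joint distribution of the two level 3 random terms takes the following form. This is a sketch rather than a formula taken from the paper: bivariate normality is the usual assumption in multilevel modelling and is assumed here, and the symbols $\Omega_\mu$, $\sigma_{\mu 0}^{2(1)}$, $\sigma_{\mu 0}^{2(2)}$ and $\rho$ are introduced only for illustration.

```latex
\begin{pmatrix} \mu_{0k}^{(1)} \\ \mu_{0k}^{(2)} \end{pmatrix}
\sim N(\mathbf{0}, \Omega_\mu),
\qquad
\Omega_\mu =
\begin{pmatrix}
\sigma_{\mu 0}^{2(1)} & \sigma_{\mu 0 \mu 0}^{(1)(2)} \\
\sigma_{\mu 0 \mu 0}^{(1)(2)} & \sigma_{\mu 0}^{2(2)}
\end{pmatrix},
\qquad
\rho = \frac{\sigma_{\mu 0 \mu 0}^{(1)(2)}}
            {\sqrt{\sigma_{\mu 0}^{2(1)} \, \sigma_{\mu 0}^{2(2)}}}
```

Under these assumptions, a positive $\rho$ would indicate that places with a higher log-odds of smoking also tend to have higher average consumption among smokers, after allowing for the individual-level predictors in the model.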

In general, the fixed part of such models can include a range of individual-level explanatory variables. Any of the coefficients from both of these sets can be allowed to vary and co-vary at level 3.
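The data-matrix construction described earlier in this appendix extends directly to several individual-level predictors, since each predictor is simply multiplied by both indicator variables. The following Python sketch illustrates the three preparation steps for the three example respondents. The function name `design_matrix`, its `predictors` argument and the column names `z1`, `z2`, `z1_Age` and `z2_Age` are assumptions made for this illustration; the original analysis used ML3 macros.

```python
# Wide table from the appendix: one record per person.
WIDE = [
    {"Person": 1, "Place": 1, "Smoker": 1, "Amount": 25, "Age": 32},
    {"Person": 2, "Place": 1, "Smoker": 0, "Amount": 0, "Age": 46},
    {"Person": 3, "Place": 2, "Smoker": 1, "Amount": 15, "Age": 23},
]


def design_matrix(records, predictors=("Age",)):
    """Build the interleaved, unbalanced data matrix with indicator variables."""
    rows = []
    for rec in records:
        for response_type in ("Smoker", "Amount"):
            # Step 1: non-smokers contribute no continuous (Amount) record.
            if response_type == "Amount" and rec["Smoker"] == 0:
                continue
            # Step 2: indicators marking the response type; they also
            # serve as the intercepts of the two sub-models.
            z1 = 1 if response_type == "Smoker" else 0
            z2 = 1 - z1
            row = {
                "Person": rec["Person"],
                "Place": rec["Place"],
                "Response": rec[response_type],
                "z1": z1,
                "z2": z2,
            }
            # Step 3: multiply every individual-level predictor by both indicators.
            for name in predictors:
                row["z1_" + name] = z1 * rec[name]
                row["z2_" + name] = z2 * rec[name]
            rows.append(row)
    return rows


matrix = design_matrix(WIDE)
for row in matrix:
    print(list(row.values()))
```

Passing a longer tuple, for example `predictors=("Age", "Sex")` with a corresponding `Sex` field on each record, would add one pair of columns per predictor without changing the rest of the procedure.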