
    Journal of Archaeological Science (1997) 24, 347-354

    Some Archaeological Applications of Kernel Density Estimates

    M. J. Baxter and C. C. Beardah

    Department of Mathematics, Statistics and Operational Research, The Nottingham Trent University, Nottingham NG11 8NS, U.K.

    R. V. S. Wright

    Prehistoric and Historical Archaeology, University of Sydney, NSW 2006, Australia

    (Received 10 November 1995, manuscript accepted 11 March 1996)

    Kernel density estimates, which at their simplest can be viewed as a smoothed form of histogram, have been widely studied in the statistical literature in recent years but used hardly at all within archaeology. They provide an effective method of data presentation for univariate and particularly bivariate data and this is illustrated with a range of examples. The methodology can be used as an informal approach to spatial cluster analysis, and one example suggests that it is competitive with other approaches in this area. A reason for the lack of use of kernel density estimates by archaeologists may be the lack of accessible software. The analyses described here were undertaken in the MATLAB package using routines developed by the second author, which are available on request. © 1997 Academic Press Limited

    Keywords: KERNEL DENSITY ESTIMATES, BIVARIATE DATA, CONTOURING, SPATIAL CLUSTERING, MATLAB.

    Introduction

    Kernel density estimates (KDEs) at their simplest can be thought of as an alternative to the histogram. They typically provide a smoother representation of the data and, unlike the histogram, their appearance does not depend on a choice of starting point. In this sense KDEs alleviate problems with the histogram that have been perceived by some archaeologists (Whallon, 1987).

    The smoothness of the KDE means that it is aesthetically more pleasing than the histogram. It also facilitates the presentation of several data sets in a single figure, and makes it easier to compare data sets. This has been argued and illustrated in Baxter & Beardah (1995b).

    It might be argued that, with univariate data, the advantages of using a KDE as opposed to a histogram for data representation are not so great as to cause them to be preferred on a routine basis. For bivariate data the case for using KDEs is much stronger, and the purpose of this paper is to illustrate this by example. Two-dimensional histograms require large amounts of data, are unwieldy, may be difficult to interpret, and cannot easily be used as the basis for other methods of data representation such as contouring. This paper will illustrate how KDEs readily overcome these problems.

    Although the possibility of using KDEs for archaeological data presentation is implicit in Orton's (1988) comments on Whallon's (1987) paper, we are not aware of any such uses outside our own work. An example of an application to bivariate data is given in Baxter & Beardah (1995a). This arose when one of us (MJB) wished to explore the potential of the methodology for representing results from a principal component analysis of archaeometric compositional data and asked the second author (CCB) if it was possible to do this in the MATLAB package. Subsequent collaboration, described in Beardah & Baxter (1995) and Baxter & Beardah (1995b), has led to the development of a set of MATLAB routines that include many of the approaches described in the recent book by Wand & Jones (1995). That book, the earlier text of Silverman (1986), and the paper by Bowman & Foster (1993) may be referred to for the technical developments that underpin the work described here.

    The main ideas of kernel density estimation necessary for this paper are presented in the next section, with more technical detail and discussion of computational matters in the appendix. The main section of the paper illustrates applications of the methodology, and the concluding section summarizes what we think are its merits.

    0305-4403/97/040347+08 $25.00/0/as960119 © 1997 Academic Press Limited

    Kernel density estimation

    Histograms are among the most common methods of data presentation in archaeology. Anyone who has drawn a histogram by hand will know that its appearance may be crucially affected both by the point at which the histogram is started (the origin) and the width of the intervals used, or bin-width. Good computer software packages will make automatic and sensible choices for the origin and bin-width, but it should be possible to vary these and this will affect the results obtained.
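    The dependence on the origin can be seen in a small sketch. The data values and the function name `histogram_counts` are hypothetical, chosen only for illustration; the same eight observations are binned with the same bin-width but two different origins.

```python
import math

def histogram_counts(data, origin, width):
    """Count observations in bins [origin + k*width, origin + (k+1)*width)."""
    counts = {}
    for x in data:
        k = math.floor((x - origin) / width)
        counts[k] = counts.get(k, 0) + 1
    # Return the counts in bin order, filling empty bins with zero.
    lo, hi = min(counts), max(counts)
    return [counts.get(k, 0) for k in range(lo, hi + 1)]

# Hypothetical measurements (illustrative values only).
data = [1.1, 1.9, 2.1, 2.9, 3.1, 3.9, 4.1, 4.9]

# Same bin-width, two different origins.
counts_a = histogram_counts(data, origin=0.0, width=2.0)
counts_b = histogram_counts(data, origin=1.0, width=2.0)
print(counts_a)
print(counts_b)
```

    With the origin at 0 the histogram has a central peak, whereas with the origin at 1 it is flat, although the data are unchanged.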

    Let the origin of the histogram be $m_0$, with subsequent interval boundaries at $m_1$, $m_2$, etc. and assume that $(m_j - m_{j-1}) = c$ for some constant $c$ for $j = 1, 2, \ldots$ (i.e. intervals are of equal width). Let $\delta$ and $q$ be values such that $\delta$ is small and $q\delta = c$. It is then possible to imagine the construction of successive histograms with origins at $(m_0 + i\delta)$ for $i = 0, 1, \ldots, q - 1$. If the $q$ histograms so obtained are averaged then an average shifted histogram (ASH) (Scott, 1992) is obtained. The appearance of the ASH will not be dependent on the choice of $m_0$. Its smoothness will depend on $c$, and increases as $c$ increases. The limiting form of the ASH, as $\delta \to 0$, is a kernel density estimate. An example is given in Baxter & Beardah (1995b).

    Another way to think of KDEs is as follows. Given $n$ points $X_1, X_2, \ldots, X_n$ situated on a line, a KDE can be obtained by placing a bump at each point and then summing the height of each bump at each point on the X-axis. The shape of the bump is defined by a mathematical function, the kernel $K(x)$, that integrates to 1. The spread of the bump is determined by a window- or band-width, $h$, that is analogous to the bin-width, $c$, of a histogram. The kernel is usually a symmetric probability density function.

    The shape of the resulting KDE does not depend on a choice of origin and is relatively insensitive to the exact form of K(x), which is taken to be a normal density function in the rest of the paper. The choice of h is more critical and will be considered shortly.
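    The averaged shifted histogram can be sketched as follows. This is a minimal illustration rather than the routine used for the paper: the function name `ash`, the data values and the evaluation grid are assumptions, and the symbol `delta` stands for the small shift between successive origins.

```python
import math

def ash(data, grid, m0, c, q):
    """Average q histograms of bin-width c whose origins are m0 + i*delta."""
    delta = c / q
    n = len(data)
    estimate = []
    for x in grid:
        total = 0.0
        for i in range(q):
            origin = m0 + i * delta
            # Left edge of the bin of this shifted histogram that contains x.
            k = math.floor((x - origin) / c)
            left = origin + k * c
            in_bin = sum(1 for xi in data if left <= xi < left + c)
            total += in_bin / (n * c)   # histogram height on the density scale
        estimate.append(total / q)
    return estimate

# Hypothetical measurements (illustrative values only).
data = [1.23, 1.91, 2.44, 2.62, 3.17, 5.01, 5.38]

# Evaluate at the mid-points of cells of width 0.05 covering 0 to 8.
grid = [0.025 + 0.05 * j for j in range(160)]
density = ash(data, grid, m0=0.0, c=1.0, q=20)
print(round(sum(density) * 0.05, 6))   # area under the estimate
```

    Moving `m0` by a whole number of shifts `delta` reproduces the same set of origins and therefore the same estimate, which illustrates why the origin ceases to matter as `delta` shrinks.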

    We have presented two simple ways of conceptualising what a KDE is. Mathematically, the latter approach gives the KDE as

    $$\hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x - X_i}{h}\right),$$

    where $\hat{f}(x)$ is an estimate of the density underlying the data.
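    The bump-summing description translates directly into code. The sketch below assumes a normal kernel, as used in the rest of the paper; the function names, the data values and the band-width of 0.5 are hypothetical.

```python
import math

def normal_kernel(u):
    """Standard normal density, used here as the kernel K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kde(x, data, h):
    """Evaluate the estimate (1 / (n h)) * sum_i K((x - X_i) / h) at x."""
    n = len(data)
    return sum(normal_kernel((x - xi) / h) for xi in data) / (n * h)

# Hypothetical measurements (illustrative values only).
data = [2.1, 2.4, 2.5, 3.0, 3.2, 5.8, 6.1, 6.3]
h = 0.5

# Evaluate the estimate on a regular grid wide enough to cover the tails.
step = 0.01
grid = [-2.0 + step * j for j in range(1201)]   # -2.0 to 10.0
density = [kde(x, data, h) for x in grid]

# Each bump contributes 1/n of the area, so the total area is close to 1.
area = sum(density) * step
print(round(area, 3))
```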

    Large values of h over-smooth, while small values under-smooth the data. A variety of approaches can be used to select h, including subjective choice, and it may often be sensible to look at KDEs for several values of h.
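    One informal way to see the effect of h is to count the modes of the estimate for several band-widths. The sketch below assumes a normal kernel; the data values, the grid and the three band-widths are hypothetical and chosen only to show under-smoothing, a moderate choice and over-smoothing.

```python
import math

def kde(x, data, h):
    """Normal-kernel density estimate at x with band-width h."""
    n = len(data)
    total = sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)
    return total / (n * h * math.sqrt(2.0 * math.pi))

def count_modes(values):
    """Count interior local maxima of a sequence of density heights."""
    return sum(
        1
        for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] >= values[i + 1]
    )

# Hypothetical data forming two groups (illustrative values only).
data = [1.0, 1.5, 2.0, 2.5, 3.0, 7.0, 7.5, 8.0, 8.5, 9.0]
grid = [-5.0 + 0.01 * j for j in range(2001)]   # -5.0 to 15.0

modes = {}
for h in (0.05, 1.0, 5.0):
    heights = [kde(x, data, h) for x in grid]
    modes[h] = count_modes(heights)
print(modes)
```

    The smallest band-width gives a separate spike for every observation, the middle one recovers the two groups, and the largest merges everything into a single mode.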

    More objective or data-driven choices of h can be made, and a wide range of methods have been proposed for this. These are described in detail in Wand & Jones (1995) and in summary form in Baxter & Beardah (1995b). An outline of a subset of these methods is given here.

    The data can be thought of as a sample of $n$ from an underlying and unknown true density, $f(x)$. It is possible to define a measure of closeness between the KDE and the true density, leading to an estimate of $h$ that maximizes the closeness. If it is assumed that the true density is normal then it can be shown that an optimal choice of $h$ is

    $$h = 1.06\,\hat{\sigma}\,n^{-1/5},$$

    where $\hat{\sigma}$ is an estimate (possibly robust) of $\sigma$, the S.D. of the normal distribution. This is the normal scale rule and will typically over-smooth the data if the underlying density is not normal.
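    A minimal sketch of the normal scale rule follows. The function name is hypothetical, and the robust option (the smaller of the sample S.D. and the interquartile range divided by 1.349) is one common choice assumed here for illustration; the paper does not say which robust estimate was used.

```python
import statistics

def normal_scale_bandwidth(data, robust=False):
    """Normal scale rule: h = 1.06 * sigma_hat * n ** (-1/5)."""
    n = len(data)
    sigma = statistics.stdev(data)
    if robust:
        # Assumed robust scale: IQR / 1.349 equals sigma for normal data.
        q1, _, q3 = statistics.quantiles(data, n=4)
        sigma = min(sigma, (q3 - q1) / 1.349)
    return 1.06 * sigma * n ** (-1 / 5)

# Hypothetical measurements with one outlying value (illustrative only).
data = [4.1, 4.4, 4.6, 4.9, 5.0, 5.2, 5.3, 5.7, 6.0, 12.5]
print(normal_scale_bandwidth(data))
print(normal_scale_bandwidth(data, robust=True))
```

    With an outlier present the robust version gives the smaller band-width, since the interquartile range is little affected by the extreme value.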

    The estimate of h depends, in general, on properties of the true density that are unknown, and in particular on a quantity that may be interpreted as the roughness of the density. A family of direct plug-in (DPI) estimates can be defined in which an estimate of h can be obtained by plugging-in an estimate of roughness into the equation that defines h. More details are given in the Appendix.
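    The plug-in idea can be sketched in a simplified one-stage form. Everything specific here is an assumption made for illustration and is not taken from the paper: a normal kernel, a roughness estimate built from the fourth derivative of the normal density, a pilot band-width `g` taken from a normal reference, and all function names. More refined versions use additional stages of estimation (see Wand & Jones, 1995).

```python
import math
import statistics

SQRT_2PI = math.sqrt(2.0 * math.pi)

def phi4(u):
    """Fourth derivative of the standard normal density."""
    return (u ** 4 - 6.0 * u ** 2 + 3.0) * math.exp(-0.5 * u * u) / SQRT_2PI

def roughness_estimate(data, g):
    """Kernel estimate of the integral of f''(x)^2, with pilot band-width g."""
    n = len(data)
    total = sum(phi4((xi - xj) / g) for xi in data for xj in data)
    return total / (n * n * g ** 5)

def normal_scale(data):
    return 1.06 * statistics.stdev(data) * len(data) ** (-1 / 5)

def dpi_bandwidth(data):
    """One-stage plug-in band-width for a normal kernel (simplified sketch)."""
    n = len(data)
    sigma = statistics.stdev(data)
    # Pilot band-width from a normal reference (an assumption of this sketch).
    g = (96.0 / (15.0 * math.sqrt(2.0)) / n) ** (1.0 / 7.0) * sigma
    roughness = roughness_estimate(data, g)
    r_kernel = 1.0 / (2.0 * math.sqrt(math.pi))   # integral of K^2, normal K
    return (r_kernel / (n * roughness)) ** 0.2

# Idealized samples built from normal quantiles (illustrative only).
scores = [statistics.NormalDist().inv_cdf((i + 0.5) / 100) for i in range(100)]
one_group = scores
two_groups = [x - 3.0 for x in scores] + [x + 3.0 for x in scores]

for name, sample in (("one group", one_group), ("two groups", two_groups)):
    print(name, round(normal_scale(sample), 3), round(dpi_bandwidth(sample), 3))
```

    For the single normal-like group the two rules roughly agree, while for the two-group sample the plug-in band-width is clearly smaller, in line with the remark that the normal scale rule over-smooths non-normal data.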

    A related approach is the solve-the-equation (STE) method, in which an equation that relates h to a function of the unknown density is defined. In essence, an initial estimate of h leads to an estimate of the density, that in turn leads to a new value for h and a new density estimate. The process continues until the estimate of h converges. Wand & Jones (1995: 96) suggest that a suitable data analytic strategy is to look at several different estimates of h, but that if a single value is required DPI and STE estimates appear to be among the more suitable.

    The prime purpose of the paper is to illustrate the use of bivariate KDEs and the generalization to these is relatively straightforward. By analogy with the previous discussion of univariate KDEs we may think in terms of $n$ points in a plane defined by co-ordinates $X_{(i)} = (X_i, Y_i)$, for $i = 1, 2, \ldots, n$. Locating a bump at each point corresponds in this case to centering a three-dimensional bump or hill at each point and then, at each point in the plane, summing the height of the bumps. The bump, or kernel, is taken in this paper to be a bivariate normal distribution.

    For two variables, X and Y, a bivariate normal distribution is defined by the means of X and Y, taken to be zero; their S.D.; and their correlation, which determines the orientation of the bump. If this correlation is taken to be zero, as we do here, then smoothing will be in the direction of the coordinate axes and the degree of smoothing is determined by the S.D. One will often not lose much by taking the correlation to be zero, whereas smoothing equally in both directions, by using the same window-widths, is not generally to be recommended (Wand & Jones, 1995: 108).

    The theory underlying the optimal choice of window-widths is not as well developed for the bivariate as for the univariate case. The examples in this paper use window-widths for the X and Y directions determined as for the univariate case, using either STE estimates or the normal scale estimates.


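    With the assumption of zero correlation the representation of the bivariate KDE, $\hat{f}(x, y)$, is given by

    $$\hat{f}(x, y) = \frac{1}{n h_1 h_2} \sum_{i=1}^{n} K\left(\frac{x - X_i}{h_1}, \frac{y - Y_i}{h_2}\right),$$

    where $K$ is now the bivariate normal kernel and $h_1$ and $h_2$ are the window-widths in the X and Y directions.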
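    With zero correlation the bivariate normal bump is the product of two univariate normal bumps, one in each coordinate direction, which makes the estimate simple to compute. In the sketch below the function name, the three points and the window-widths are hypothetical.

```python
import math

def bivariate_kde(x, y, points, h1, h2):
    """Bivariate normal-kernel estimate with zero correlation.

    h1 and h2 are the window-widths in the X and Y directions.
    """
    n = len(points)
    total = 0.0
    for xi, yi in points:
        u = (x - xi) / h1
        v = (y - yi) / h2
        # Standard bivariate normal density with zero correlation.
        total += math.exp(-0.5 * (u * u + v * v)) / (2.0 * math.pi)
    return total / (n * h1 * h2)

# Hypothetical find-spot coordinates (illustrative values only).
points = [(0.0, 0.0), (1.0, 2.0), (-1.0, 1.0)]
h1, h2 = 1.0, 1.5

# Heights of the estimated surface at a data point and far from the data.
print(bivariate_kde(0.0, 0.0, points, h1, h2))
print(bivariate_kde(6.0, 6.0, points, h1, h2))
```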

    An attraction of using KDEs is that they can be used as a basis for producing contour plots of the data and this leads to graphical representations of data of a kind that archaeologists should find familiar. The following discussion of how contouring can be used is based on the paper by Bowman & Foster (1993).

    After a bivariate KDE has been obtained each (two-dimensional) data point is associated with a density height that may be ranked from largest to smallest. The first 50% ranked observations, for example, may be used to define contours that enclose the densest 50% of the data. The level of contouring can be varied to contain any specified proportion of the data, and several contours can be superimposed on a plot, with the original data if this is helpful. Bowman & Foster (1993: 173) note that in some ways this provides a two-dimensional analogy to the one-dimensional boxplot, and also that the approach is useful for looking for modes or clusters in the data.
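    The ranking step can be sketched as follows: the density height of the observation at the required rank gives the level at which the contour is drawn. The function names, the points and the window-widths are hypothetical, and the way ties and rounding of the rank are handled is an assumption of this sketch.

```python
import math

def bivariate_kde(x, y, points, h1, h2):
    """Bivariate normal-kernel estimate with zero correlation."""
    n = len(points)
    total = 0.0
    for xi, yi in points:
        u = (x - xi) / h1
        v = (y - yi) / h2
        total += math.exp(-0.5 * (u * u + v * v)) / (2.0 * math.pi)
    return total / (n * h1 * h2)

def inclusion_level(points, h1, h2, proportion):
    """Density height whose contour encloses the densest share of the points."""
    heights = sorted(
        (bivariate_kde(x, y, points, h1, h2) for x, y in points), reverse=True
    )
    k = max(1, math.ceil(proportion * len(heights)))
    return heights[k - 1]

# Hypothetical scatter: a tight group of six points and two isolated points.
points = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (0.2, 0.2),
          (0.1, 0.1), (0.1, 0.3), (5.0, 5.0), (-4.0, 6.0)]
h1 = h2 = 0.5

level_75 = inclusion_level(points, h1, h2, 0.75)
inside = [p for p in points
          if bivariate_kde(p[0], p[1], points, h1, h2) >= level_75]
print(level_75, inside)
```

    Contouring the estimated surface at `level_75` would enclose the six clustered points and exclude the two isolated ones; repeating the calculation for other proportions gives the nested contours described above.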

    A further extension, noted in the same paper, occurs when the data points can be classified, by period or context for example. In this case a particular contour level such as 75% might be selected and then contours at this level drawn for each group separately, to reveal how similar or distinct they are. This will also be illustrated in the next section.

    Examples

    There are many ways in which univariate KDEs might be used in archaeology, and several of these have been illustrated in our previous work. Data presentation for a single data set and comparison between the distributions of different data sets are obvious uses. It is worth remarking that the boxplot, another good way of looking at and comparing univariate data, does not work well with multi-modal data. Bounded data, in the sense that certain values are impossible, and data affected by outliers can be handled using boundary kernels and adaptive estimates respectively, and this is discussed and illustrated in Beardah & Baxter (1995).

    For practical purposes a distinction may be drawn between kernel density estimation as applied to simple, or simply transformed, variables, and as applied to composite variables such as those derived in principal component and other forms of multivariate analysis. This latter greatly extends the potential for the use of KDEs and is illustrated in Examples 1, 3 and 4.

    Example 1

    Principal component analysis is one of the more commonly used multivariate methods in archaeology and a detailed account and bibliography is given in Baxter (1994). Typically, data are standardized and an analysis results in new, linear combinations of the original variables, called principal components, that can be inspected for structure using plots (usually) based on the first two or three components. If there is structure in the data it will often show in the first component and it can be useful to examine this using a KDE.
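    The scores on the first component can be obtained as sketched below. This is an illustration only: the data are hypothetical, the function name is an assumption, and power iteration on the correlation matrix is used here for brevity where statistical software would normally compute a full eigen-decomposition.

```python
import math
import statistics

def first_principal_component(rows, iterations=200):
    """First component of standardized data, by power iteration.

    Returns the eigenvalue, the loadings and the component scores.
    """
    n, p = len(rows), len(rows[0])
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    sds = [statistics.stdev(col) for col in columns]
    # Standardize each variable to zero mean and unit S.D.
    z = [[(row[j] - means[j]) / sds[j] for j in range(p)] for row in rows]
    corr = [[sum(zi[a] * zi[b] for zi in z) / (n - 1) for b in range(p)]
            for a in range(p)]
    v = [1.0 / (j + 1) for j in range(p)]        # arbitrary starting vector
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * corr[a][b] * v[b]
                     for a in range(p) for b in range(p))
    scores = [sum(zi[j] * v[j] for j in range(p)) for zi in z]
    return eigenvalue, v, scores

# Hypothetical compositional measurements on six specimens (illustrative only).
rows = [(1.0, 3.0, 2.0), (2.0, 5.0, 1.0), (3.0, 7.0, 4.0),
        (4.0, 9.0, 3.0), (5.0, 11.0, 6.0), (6.0, 13.0, 5.0)]
eigenvalue, loadings, scores = first_principal_component(rows)
print(round(eigenvalue, 3), [round(s, 3) for s in scores])
```

    The resulting scores are an ordinary univariate sample, so a KDE can be applied to them exactly as to any measured variable.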

    The data used for the first example are 105 specimens of Roman waste glass, with a principal component analysis based on their chemical composition with respect to 11 oxides. The data are given, and extensively analysed, in Baxter (1994). The specimens come from two sites and the statistical analyses suggest that there are perhaps three clusters in the data that are related to, but do not exactly coincide with, the site classification.

    As a first illustration of kernel density estimation Figure 1 shows two KDEs for the principal component scores, based on the normal scale estimate of h and an STE estimate of h. The normal scale estimate over-smooths the data, as expected, and misses the central and smaller mode suggested by the STE approach.

    The usual bivariate component plot can be represented by a KDE in various ways. Figure 2 shows a scatter plot of the scores on the first two components and Figure 3 shows a KDE using the STE estimate of h. Three main concentrations are evident. For this example inspection of the scatterplot has led one of us (Baxter, 1994) to the same conclusion, so that a KDE is not essential. In Examples 3 and 4 much larger data sets are used for which the scatterplot is a less useful tool.

    Figure 1. Two univariate kernel density estimates for scores on the first principal component of an analysis of the chemical composition of 105 specimens of Romano-British waste glass. One curve uses the STE rule and the other the normal scale rule. [Axes: First component; Relative frequency.]


    Example 2

    An obvious use for bivariate KDEs is in the presentation and interpretation of spatial data in the form of coordinates of find spots, for example. To illustrate this an ethnoarchaeological data set, Binford's (1978) Mask Site data, is used. The data are taken from appendix A of Blankholm (1991), who uses them to test a variety of approaches to intrasite spatial analysis. The data, as presented by Blankholm, consist of the spatial coordinates of five classes of find that might occur in the archaeological record, such as artefacts, large bones and bone splinters. We use the subset based on the coordinates of the locations of 276 bone splinters.

    Figures 4 and 5 show analyses in which the normal scale rule and STE estimates have been used to determine window widths separately for the two coordinate directions. Both analyses show 25, 50, 75 and 100% contours superimposed on the distribution of the bone splinters. Once again the normal scale analysis produces a smoother picture. There are clearly three main concentrations in the data with the STE analysis suggesting a subdivision of one of these, in the bottom right of the graph, into two groups and a fifth group in the upper left of the figure.

    It is instructive to compare our results with those obtained by a variety of methods in Blankholm (1991). His figure 9, using contouring at equal heights (rather than encompassing specified proportions of the data), is less revelatory of structure than our figures, while a k-means cluster analysis (his figure 17) suggests a three cluster distribution. Contour maps or clustering arising from local density analysis (his figure 32) and nearest neighbour analysis (his figure 39) are also given. We think that our figures, and particularly that for the STE analysis, suggest structure as well as, or more clearly than, the analyses in Blankholm (1991).

    Figure 2. Principal component plot for the first two components from an analysis of the chemical composition of 105 specimens of Romano-British waste glass. [Axes: Component 1; Component 2.]

    Figure 3. A KDE estimate, based on an STE rule for the selection of h, for the data. [Axes: Component 1; Component 2; Relative frequency.]

    Figure 4. A KDE of the Mask Site data using the normal scale rule. The contours are for 25, 50, 75 and 100% inclusion levels. [Panel title: Normal scale rule. Axes: Component 1; Component 2.]

    Figure 5. As for Figure 4 but using an STE estimate. [Panel title: STE rule. Axes: Component 1; Component 2.]


    How real is the structure suggested? In fact the location of hearths, activity areas and features such as rocks is known, and Blankholm provides a map of these that can be overlaid on his figures. There are five hearths and two of them are associated with concentrations detected in all analyses (those to the left of our figures). Two other hearths that are adjacent, and at the bottom left, are associated with the third main concentration. Only our STE analysis suggests two subdivisions of this group. The fifth hearth is associated with a less dense area of bone splinters in the upper right of the diagram and is suggested by our STE analysis and some of those reported by Blankholm. From this discussion we conclude that, for this example at least, the KDE approach is competitive with other statistical approaches to spatial analysis in archaeology of the kind that seeks clustering in artefact scatters.

    From the foregoing discussion it is obvious that contouring of artefact scatters can be undertaken without reference to kernel density estimation. The merits, or otherwise, of different approaches will be discussed in the concluding section. It will also be obvious that KDEs can be used as an informal means of cluster analysis for this kind of data, and in this sense compete with more formal methods such as k-means cluster analysis. It is known that k-means analysis has a tendency to produce spherical clusters, whether or not the real structure has this form. This difficulty is avoided by the clusters (or contours) suggested by KDEs. Determining the number of clusters is a problem for any clustering approach. With KDEs it should be informative to examine contours at different levels of inclusion as a means of looking for structure at different scales of spatial resolution.

    Figure 6 shows the alternative representation, for the STE estimate, of the KDE as a three-dimensional diagram. There are four clear concentrations, or modes, with a much gentler hillock, visible behind the front peaks, that is associated with the fifth hearth.

    Example 3

    This third example is based on anthropometric rather than archaeological data, but is ideal for showing how KDEs can be used to illuminate the message of large data sets. The data are discussed and analysed in Relethford & Crawford (1995) and consist of 17 body and craniofacial measurements from 7214 male adults in 31 birth counties in Ireland. The data were originally used to investigate the genetic distances between the populations defined by the counties.

    It was of interest for one of us (RVSW) to investigate the performance of a principal component analysis, in order to see how the first two principal components relate to geography. Some strong correlations of this sort, but for different data, have been reported by Wright (1992). An obvious problem, in terms of the usual component plots presented from such an analysis, is that there are too many data points to plot the data sensibly in the usual way. Here we concentrate on what KDEs have to offer in terms of handling such a mass of data, without going into aspects of substantive interpretation, and note that any two-dimensional scatter of data can be handled in a similar way.

    Figures 7-10 show four different representations of these data. Figure 7 is an attempted scatter plot that shows how hopeless it is to try to discern structure in the data in this way; Figure 8 is a three-dimensional plot along the lines of Figures 3 & 6; and Figure 9 is a contour plot along the lines of Figure 5. An STE estimate has been used. The interesting feature of these last two plots is that there are no interesting features; there is no evidence of any kind of grouping in the data. The final plot in Figure 10 shows separate 75% contours for three of the counties, and there is no evidence of any difference between them. Although the plot becomes very crowded this remains the case if all counties are represented in a similar way on the same plot. The plots indicate that the correlation (if any) that exists between the principal components and geography must be a low one.

    Figure 6. A density plot of the Mask Site data using the STE estimate. [Axes: Component 1; Component 2; Relative frequency.]

    Figure 7. A scatterplot of the scores from a principal component analysis of the Irish body and craniofacial measurement data. Based on 7214 individuals, the purpose is to illustrate that such plots are of limited use for looking at large data sets. [Axes: Component 1; Component 2.]

    Example 4

    The final example is similar in kind to the previous example, and arises from a correspondence analysis originally undertaken by one of us. It possesses additional features of interest.

    The data are the frequencies of eight different types of archaeological site, recorded for 4712 km² in the Australian state of Victoria. The resulting 4712 correspondence analysis object scores are plotted in Figure 11.

    It is not possible to get a sensible looking plot here because the structure of the data is such that many of the points represent multiple occurrences. This overprinting happens because many of the kilometre squares have identical frequencies of sites, though multiple occurrences are not evident from the plot. Solutions such as jittering exist, in which points are displaced by a small and random amount, which would tend to give a separate point for each site, but this then leads to problems similar to that evidenced in Figure 7.
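    Jittering can be sketched in a few lines. The function name, the uniform displacement, the seed and the coordinates are all assumptions made for illustration.

```python
import random

def jitter(points, amount, seed=0):
    """Displace each point by a small uniform random amount in x and y."""
    rng = random.Random(seed)
    return [
        (x + rng.uniform(-amount, amount), y + rng.uniform(-amount, amount))
        for x, y in points
    ]

# Hypothetical object scores with repeated coordinates (illustrative only).
points = [(1.0, 2.0)] * 5 + [(3.0, 4.0)] * 3
moved = jitter(points, amount=0.1)

print(len(set(points)), "distinct positions before jittering")
print(len(set(moved)), "distinct positions after jittering")
```

    Each overprinted observation now occupies its own position, but with several thousand points the plot becomes as crowded as an ordinary scatterplot of a large data set.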

    An alternative approach is to use a KDE and contour it at some suitable level in order to see where the points pile up, and this is done in Figure 12. Here 90, 95 and 100% contours are shown using an STE estimate, and are suggestive of, perhaps, nine groups.

    Figure 8. The data of Figure 7 represented as a density plot. [Axes: Component 1; Component 2; Relative frequency.]

    Figure 9. The data of Figure 7 represented as a contour plot showing 25, 50, 75 and 100% levels of inclusion. [Axes: Component 1; Component 2.]

    Figure 10. The data of Figure 7 showing 75% contours for three of 31 counties. The contours largely overlap, and this is the case however the counties are selected. [Axes: Component 1; Component 2.]

    Figure 11. A scatterplot based on the first two components of a correspondence analysis of the Victorian sites data. Many of the points displayed represent multiple occurrences.

    Discussion

    For simple data presentation and comparison of univariate data KDEs can be regarded as an alternative to the histogram. We think that there are aesthetic and interpretational benefits to be obtained from using KDEs and have argued this elsewhere (Baxter & Beardah, 1995b), but others may regard it as a matter of taste.

    For bivariate data the case for regarding KDEs as a tool to be routinely used is a strong one. Even when an ordinary scatter plot could be used, as in Examples 1 and 2, KDEs can be more effective for showing concentrations of points or modes in the data. For very large data sets as in Examples 3 and 4, where scatterplots are uninterpretable, it is unarguable that KDEs or other methods with similar aims are needed to get a view of the data.

    Of course it would be quite possible to generate contour plots and density plots of the kind shown in the figures without recourse to KDEs, so why use KDEs? One answer is that statistical theory associated with KDEs provides guidance as to the appropriate degree of smoothing to use and, as the first two examples show, this can have a critical effect on the interpretation of the data. Another reason for using KDEs is that they lend themselves naturally to contouring in terms of inclusion of specified percentages of the most densely clustered points. As the examples show, this means that KDEs can be used as an informal kind of clustering method that does not impose structure on the data in the way that more formal methods often do.

    Acknowledgements

    Caroline Jackson is thanked for providing the glass compositional data used in Example 1. John Relethford and Michael Crawford are thanked for providing the data used in Example 3, and for permission to use it. Richard MacNeill and Stewart Simmons (Aboriginal Affairs Victoria) are thanked for the sites data used in Example 4. Responsibility for the use to which these data sets have been put in the paper, and interpretations, rests with the present authors.

    References

    Baxter, M. J. (1994). Exploratory Multivariate Analysis in Archaeology. Edinburgh: Edinburgh University Press.

    Baxter, M. J. & Beardah, C. C. (1995a). Graphical presentation of results from principal components analysis. In (J. Huggett & N. Ryan, Eds) Computer Applications and Quantitative Methods in Archaeology 1994. Oxford: BAR International Series 600, pp. 63-67.

    Baxter, M. J. & Beardah, C. C. (1995b). Beyond the Histogram: Archaeological Applications of Kernel Density Estimation. Research Report 6/95, Department of Mathematics, Statistics and Operational Research, Nottingham Trent University, Nottingham, U.K.

    Beardah, C. C. & Baxter, M. J. (1995). MATLAB Routines for Kernel Density Estimation and the Graphical Representation of Archaeological Data. Research Report 2/95, Department of Mathematics, Statistics and Operational Research, Nottingham Trent University, Nottingham, U.K.

    Binford, L. R. (1978). Dimensional analysis of behavior and site structure: learning from an Eskimo hunting stand. American Antiquity 43, 330-361.

    Blankholm, H. P. (1991). Intrasite Spatial Analysis in Theory and Practice. Aarhus: Aarhus University Press.

    Bowman, A. & Foster, P. (1993). Density based exploration of bivariate data. Statistics and Computing 3, 171-177.

    Orton, C. R. (1988). Review of Quantitative Research in Archaeology, Aldenderfer, M. S. (Ed.). Antiquity 62, 597-598.

    Relethford, J. H. & Crawford, M. H. (1995). Anthropometric variation and the population history of Ireland. American Journal of Physical Anthropology 96, 25-38.

    Scott, D. W. (1992). Multivariate Density Estimation. New York: Wiley.

    Silverman, B. (1986). Density Estimation for Statistics and Data Analysis. London: Chapman and Hall.

    Wand, M. P. & Jones, M. C. (1995). Kernel Smoothing. London: Chapman and Hall.

    Whallon, R. (1987). Simple statistics. In (M. S. Aldenderfer, Ed.) Quantitative Research in Archaeology: Progress and Prospects. London: Sage, pp. 135-150.

    Wright, R. V. S. (1992). Correlation between cranial form and geography in Homo sapiens: CRANID, a computer program for forensic and other applications. Archaeology in Oceania 27, 128-134.

    Figure 12. The same data as in Figure 11 in the form of 90, 95 and 100% contours arising from a KDE based on an STE rule for window-width selection. [Axes: Component 1; Component 2.]

    Appendix: Technical Details

    The closeness of a KDE to the true density can be defined in terms of the asymptotic mean integrated square error (AMISE) and the value of $h$ which minimizes this, $h_{\mathrm{AMISE}}$, can be shown to have the form

    $$h_{\mathrm{AMISE}} = \left[\frac{g(K)}{n\,R(f'')}\right]^{1/5},$$

    where $n$ is the sample size, $g(K)$ is a function of the known kernel and

    $$R(f'') = \int f''(x)^2\,dx$$

    is a function of the unknown true density that can be interpreted as its roughness. Assuming that the true density is normal leads to the normal scale rule for choice of $h$. Estimating $R(f'')$, which can be done with varying degrees of refinement, leads to the family of direct plug-in (DPI) estimates. Details are given in Wand & Jones (1995).
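    As a worked check, the normal scale rule can be recovered from the AMISE-optimal band-width in its standard form. The derivation below assumes a normal kernel and takes $g(K) = R(K)/\mu_2(K)^2$, where $R(K)$ is the integral of $K^2$ and $\mu_2(K)$ is the second moment of $K$; this explicit form of $g(K)$ follows Wand & Jones (1995) and is an assumption here rather than something stated in the text.

```latex
\begin{align*}
g(K) &= \frac{R(K)}{\mu_2(K)^2} = \frac{1}{2\sqrt{\pi}}
  && \text{(normal kernel: } R(K) = 1/(2\sqrt{\pi}),\ \mu_2(K) = 1\text{)} \\
R(f'') &= \frac{3}{8\sqrt{\pi}\,\sigma^{5}}
  && \text{(normal density with S.D. } \sigma\text{)} \\
h_{\mathrm{AMISE}} &= \left[\frac{g(K)}{n\,R(f'')}\right]^{1/5}
  = \left[\frac{8\sqrt{\pi}\,\sigma^{5}}{2\sqrt{\pi}\cdot 3\,n}\right]^{1/5}
  = \left(\frac{4}{3}\right)^{1/5} \sigma\, n^{-1/5}
  \approx 1.06\,\sigma\, n^{-1/5}
\end{align*}
```

    Replacing $\sigma$ by an estimate $\hat{\sigma}$ gives the normal scale rule quoted in the main text.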

    Solve-the-equation (STE) estimates are closely related to DPI estimates. The formula for $h_{\mathrm{AMISE}}$ is the starting point, and $R(f'')$ is replaced by an estimate that depends on $h$ and can be determined for an initial choice of $h$. This leads to the estimate of $h_{\mathrm{AMISE}}$ that in turn is used to estimate a new $R(f'')$ and a new $h_{\mathrm{AMISE}}$. This process continues until $h$ converges.

    These and other techniques have been implemented in the MATLAB package by one of the authors (CCB) and are freely available to anyone who wants them (email [email protected]). The routines were developed because our interest in KDEs occurred at a time when nothing else was obviously and easily available to us. We believe that kernel density estimation is a valuable tool for data analysis that can be fruitfully deployed by archaeologists. We are also aware that software to implement the ideas involved, including our own, is not readily available and is expensive. It is likely that this situation will change, and that kernel density estimation will become available in accessible software packages. Our hope is that the present paper will encourage the use of such methodology when it becomes more readily available.
