tesi di dottorato di ricerca malvezzi


  • 8/7/2019 Tesi di Dottorato di Ricerca Malvezzi

    1/45

    UNIVERSITÀ DEGLI STUDI DI MILANO

    Facoltà di Medicina e Chirurgia

    Istituto di Statistica e Biometria GA Maccacaro

    Dottorato in Statistica Biomedica Ciclo XXII

    TESI DI DOTTORATO DI RICERCA

    Analisi dei Tassi di Mortalità con Modelli Età Periodo Coorte

    MED/01

    Matteo Charles Malvezzi

    Prof. Adriano Decarli

    Prof. Silvano Milani

    Anno Accademico 2008-2009


    Index

    Introduction
    Materials and Methods
        Age-Standardised-Rates
            Instantaneous Incidence Rates
            Mortality Rates
            Standard Rates
            Rate Estimate Accuracy
        Age-Period-Cohort Analysis
            Introduction
            Presenting Data: Tabular View
            Presenting Data: Graphical Methods
            Classical Modelling Approach
            The Age Model
            The Age-Period Model
            The Age-Cohort Model
            The Age-Drift Model
            The Age-Period-Cohort Model
                Added Constraints
                Successive Iteration Modelling
            Penalised Likelihood Age-Period-Cohort Method
            Confidence Intervals for APC Models
            Gastric Cancer Data
            Oral Cancer Data
    Results
        Gastric Cancer Mortality
        Oral Cancer Mortality
    Discussion
        The Penalised Likelihood APC Model
        Gastric Cancer Mortality
        Oral Cancer Mortality


    Age Period Cohort Analysis of Mortality Rates

    Introduction

    Descriptive epidemiology is seen as a first approach aimed at defining the scope for a research

    problem. Its techniques are primarily aimed at exploratory studies, in that its main concern is to

    generate hypotheses rather than verifying them.

    The basic techniques of descriptive epidemiology were borrowed from demography, with morbidity

    and mortality rates being the key descriptive tools and standardisation being the only method used

    for comparative purposes, usually ignoring issues of variability, or acknowledging the problem and

    working around it with simplistic devices, such as the aggregation of data over longer time frames.

    The improvement and greater availability of epidemiological data over the years is possibly the

    main factor that brought about the development of modern descriptive epidemiology. Mortality and

    incidence data (in particular oncological data) have improved greatly in both quantity and quality

    over the years. This is mainly due to the proliferation of cancer registries and the concerted

    effort to standardise procedures of data collection and classification, not to mention the

    improvement in demographic data, which have become available for an ever greater number of

    populations and are published on a more regular basis.

    As a consequence of this accumulation of time-based incidence and mortality data, time series

    modelling techniques were developed to analyse the different factors underlying the changes in

    rates, for both explanatory and predictive purposes.

    This thesis describes methods and modelling techniques for the study of mortality rates. In particular,

    age-period-cohort (APC) analysis is presented for its contribution to the aetiological purpose of

    descriptive epidemiology: making inference from the group to the individual.

    The methods described herein were used to analyse gastric and oral cancer mortality in Europe, both to

    illustrate the benefits and shortcomings of the methods and for the intrinsic value of these

    analyses for aetiological research and for the indications for cancer prevention that arise from their

    results.

    Materials and Methods

    Age-Standardised-Rates

    Instantaneous Incidence Rates

    The probability of developing a condition between the ages $t_0$ and $t_1$, with $\lambda(u)$ being the age-specific

    rate and $S(u)$ the probability of survival without disease, is given by:

    $$\pi = \int_{t_0}^{t_1} \lambda(u)\, S(u)\, du .$$

    We are interested in the conditional probability $\pi_c$ of developing the disease given that a subject is

    still at risk. This probability is not influenced by general survival up to the examined age $t_0$, and very

    little by survival between $t_0$ and $t_1$ as long as this interval is small, giving us

    $$\pi_c = \int_{t_0}^{t_1} \lambda(u)\, \frac{S(u)}{S(t_0)}\, du .$$

    If the interval is small enough that $\lambda(u)$ and $S(u)$ can be considered constant (equal to $\lambda(t_0)$ and $S(t_0)$), then

    the equation can be rewritten as:

    $$\pi_c \approx \lambda(t_0)\,(t_1 - t_0) .$$

    If $k$ is the number of cases observed between $t_0$ and $t_1$, and $n_{t_0}$ is the number of subjects at risk at $t_0$,

    we have:

    $$\hat{\pi}_c = \frac{k}{n_{t_0}} ,$$

    which gives us the estimate of $\lambda(t_0)$ as:

    $$\hat{\lambda}(t_0) \approx \frac{k}{n_{t_0}\,(t_1 - t_0)} .$$

    That is to say, the estimate of the instantaneous rate at $t_0$ is the number of cases observed

    divided by the number $m$ of person-years accumulated between $t_0$ and $t_1$, which translates to the more familiar

    $$\hat{\lambda}(t_0) \approx \frac{k}{m} .$$

    Of course this approximation is not very good if $\lambda(u)$ varies sharply between $t_0$ and $t_1$, or if the

    ratio $S(u)/S(t_0)$ is very different from unity.

    Mortality Rates

    The annual incidence rate of a given disease, for a group, over a given time period is equal to the

    ratio between the number of new occurrences of the disease over the given time and the number of

    person-years accumulated by the members of the group over the same time period.

    As seen above, rates are more meaningful when the examined group is homogeneous and there is a

    constant risk over the time period. Under these conditions, the observed incidence rate can be

    considered an estimate of the underlying instantaneous rate. These conditions are the reasons why

    incidence is calculated separately by age and sex to obtain specific incidence rates.

    In the case of annual mortality rates, the numerator of the rate is the number of deaths from the

    observed condition during the calendar year. In descriptive statistics, particularly in

    mortality studies, the denominator is made up of population estimates derived from official censuses;

    that is, the provided population value is taken to be the population at the mid point of the year, or the yearly

    average. This gives us:

    $$\hat{\lambda}_x = \frac{k_x}{[\,n_x(t_0) + n_x(t_1)\,]/2} ,$$

    with $\hat{\lambda}_x$ being the annual incidence estimate for the studied age group, $k_x$ the number of

    observed cases in the observed population, and $n_x(t_0)$ and $n_x(t_1)$ the numbers of subjects at risk at the

    beginning and end of the studied period. The expression in the denominator corresponds to the

    value $m_x$ provided by the statistical bureaux.
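    The mid-year approximation above can be illustrated with a minimal sketch. The function name and the counts are hypothetical and are not taken from the thesis; the only assumption is that person-years for one calendar year are approximated by the mean of the populations at the start and end of the year.

```python
def annual_rate(deaths, pop_start, pop_end, per=100_000):
    """Annual rate, using the mean of the start and end populations as person-years."""
    person_years = (pop_start + pop_end) / 2
    if person_years <= 0:
        raise ValueError("person-years must be positive")
    return per * deaths / person_years


# Hypothetical age group: 120 deaths, 498 000 people at the start of the year
# and 502 000 at the end (illustrative numbers only).
rate = annual_rate(120, 498_000, 502_000)
print(f"{rate:.1f} deaths per 100 000 person-years")
```

    When the statistical bureau already publishes a mid-year population, that value would be used directly as the denominator.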

    Standard Rates

    To compare incidence rates measured on different populations, some form of standardisation is

    necessary. The direct method of standardisation consists in calculating the annual rate that would

    have been observed in a theoretical standard population, had it been subjected to the same force

    of incidence as the population being studied. To obtain this result, the expected number of cases in each age

    group of the standard population is calculated from the corresponding person-years and the

    estimated rate for the studied population. The sum of these numbers, which is the total number of expected

    cases in the theoretical population, is divided by the total person-years of this population. If $L$ is the

    size of the total standard population, $L_x$ the number of subjects in the $x$th age group, and $k_x$ and $m_x$

    respectively the number of cases and the number of person-years in the corresponding age group of the

    studied population, the age-specific rate for the observed population is $t_x = k_x/m_x$. Thus, the number

    of expected cases in the $x$th age group of the standard population, if it were subject to the level

    of risk defined by $t_x$, would be $L_x t_x$. The standardised rate, for a total number of age groups $g$, is

    then:

    $$t = \frac{1}{L} \sum_{x=1}^{g} L_x t_x ,$$

    which may also be written as:

    $$t = \sum_{x=1}^{g} w_x t_x ,$$

    where $w_x = L_x/L$ is the proportion of subjects in the $x$th age group of the standard population,

    giving:

    $$\sum_{x=1}^{g} w_x = \frac{1}{L} \sum_{x=1}^{g} L_x = 1 .$$

    Therefore t is a weighted average of age specific rates, with the weights being the proportion of

    individuals per age group in the standard population. The standard populations used in this thesis

    are the World Standard Population and the World Standard Truncated (35-64) Population [1].

    Table 1. Age group weights for the world standard and truncated (35-64) populations.

    Age Group    World Standard Population    World Truncated Population
    0-4                     12
    5-9                     10
    10-14                    9
    15-19                    9
    20-24                    8
    25-29                    8
    30-34                    6
    35-39                    6                          19.35
    40-44                    6                          19.35
    45-49                    6                          19.35
    50-54                    5                          16.13
    55-59                    4                          12.91
    60-64                    4                          12.91
    65-69                    3
    70-74                    2
    75-79                    1
    80+                      1
    Total                  100                         100
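    As a worked illustration of direct standardisation, the following minimal sketch computes a truncated (35-64) world standardised rate. The function and variable names are hypothetical; the weights are those of Table 1 and the age-specific rates are the 2002 column of Table 2 (Italian men, gastric cancer). Renormalising the world weights over the selected age groups is assumed to be how the truncated weights are obtained, which agrees with Table 1 (6/31 is about 19.35%).

```python
# World standard weights from Table 1, keyed by the lower bound of each age group.
WORLD_STANDARD = {0: 12, 5: 10, 10: 9, 15: 9, 20: 8, 25: 8, 30: 6, 35: 6, 40: 6,
                  45: 6, 50: 5, 55: 4, 60: 4, 65: 3, 70: 2, 75: 1, 80: 1}


def standardised_rate(rates, standard=WORLD_STANDARD):
    """Directly standardised rate: weighted mean of the age-specific rates.

    `rates` maps the lower bound of each age group to its rate t_x. Only the
    age groups present in `rates` are used, and their weights are renormalised
    to sum to 1, which yields a truncated rate when a subset of ages is given.
    """
    total_weight = sum(standard[age] for age in rates)
    return sum(standard[age] * t_x for age, t_x in rates.items()) / total_weight


# Age-specific rates per 100 000 for ages 35-64, 2002 column of Table 2.
italy_2002 = {35: 1.44, 40: 3.34, 45: 6.58, 50: 11.55, 55: 21.23, 60: 37.11}
truncated = standardised_rate(italy_2002)
print(round(truncated, 2))  # 11.59 per 100 000
```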

    Rate Estimate Accuracy

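    The accuracy of rates depends on the accuracy of the number of observed cases $K$. $K$ follows a

    Poisson distribution whose expectation and variance are both equal to the theoretical rate $\lambda$ multiplied by

    the number of person-years $m$ accumulated, so $E(K) = \lambda m$ and $\mathrm{Var}(K) = \lambda m$. Therefore the variance

    of the rate estimator $K/m$ is:

    $$\mathrm{Var}\!\left(\frac{K}{m}\right) = \frac{\mathrm{Var}(K)}{m^2} = \frac{\lambda}{m} .$$

    Its estimate is obtained by replacing $\lambda$ with $\hat{\lambda} = k/m$:

    $$\widehat{\mathrm{Var}}\!\left(\frac{K}{m}\right) = \frac{k}{m^2} = \frac{\hat{\lambda}^2}{k} .$$

    From here we see that the variance for the specific rate $t_x$ is:

    $$\mathrm{Var}(t_x) = \frac{\mathrm{Var}(K_x)}{m_x^2} = \frac{\lambda_x}{m_x} .$$

    For the standardised rate this becomes:

    $$\mathrm{Var}(t) = \sum_{x=1}^{g} w_x^2\, \mathrm{Var}(t_x) = \sum_{x=1}^{g} w_x^2\, \frac{\lambda_x}{m_x} .$$

    $\lambda_x$ being unknown, the variance must be estimated with:

    $$\widehat{\mathrm{Var}}(t) = \sum_{x=1}^{g} \frac{w_x^2}{m_x^2}\, k_x .$$

    Therefore, if the theoretical standard rate is:

    $$\tau = \sum_{x} w_x \lambda_x$$

    and $s$ is the standard error of $t$, $(t - \tau)/s$ is approximately a standard normal variable, for which we can write

    a $1-\alpha$ confidence interval in the usual form:

    $$\left[\, t - Z_{\alpha/2} \sqrt{\widehat{\mathrm{Var}}(t)} \;;\; t + Z_{\alpha/2} \sqrt{\widehat{\mathrm{Var}}(t)} \,\right]$$

    For practical purposes rates are usually given as $t$ per 100 000 person-years ($10^5\, t$); consequently the

    variance needs to be presented as $10^{10}\, \mathrm{Var}(t)$.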
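    The variance estimate and the normal-approximation confidence interval can be sketched as follows. This is only an illustration under the Poisson assumption stated above: the function name is hypothetical, the counts are invented rather than taken from the thesis, and 1.96 is used as the normal quantile for a 95% interval.

```python
import math


def standardised_rate_ci(cases, person_years, weights, z=1.96):
    """Return (t, lower, upper) for a directly standardised rate.

    cases[x] = k_x, person_years[x] = m_x, weights[x] = w_x (summing to 1).
    The variance is estimated as the sum over x of w_x^2 * k_x / m_x^2.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    cells = list(zip(cases, person_years, weights))
    t = sum(w * k / m for k, m, w in cells)
    var_t = sum(w ** 2 * k / m ** 2 for k, m, w in cells)
    half_width = z * math.sqrt(var_t)
    return t, t - half_width, t + half_width


# Hypothetical counts for three age groups (not data from the thesis).
k = [40, 90, 150]
m = [400_000, 300_000, 250_000]
w = [0.5, 0.3, 0.2]
t, lower, upper = standardised_rate_ci(k, m, w)
print(f"{1e5 * t:.2f} per 100 000 (95% CI {1e5 * lower:.2f} to {1e5 * upper:.2f})")
```

    Multiplying the three returned values by 100 000 is equivalent to multiplying the variance by $10^{10}$, as noted in the text.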

    Age-Period-Cohort Analysis

    Introduction

    Standardised rates are a useful descriptive tool when comparing different populations, but offer no

    analytical insight. On the other hand, inferential methods applied to mortality rate time

    trends, such as joinpoint analysis, offer analytical insight on trends and their change in time, but

    completely ignore information on the age and cohort structure of the analysed data [2]. APC

    analysis performs a simultaneous study of the effects of age, period and cohort. The age effects

    correspond to the variability explained by physiological differences that characterise the different

    age groups; period effects are brought about by changes over time that influence all

    age groups in the same way; finally, cohort effects explain changes that affect groups of subjects

    belonging to the same birth cohort. Period effects usually portray the consequences of the

    introduction of new therapies or screening interventions; they are also affected by exposures that

    manifest their consequences on the studied event over short periods of time. Cohort effects are

    usually determined by gestational exposures, or by exposures that influence event frequencies

    over long time periods.

    However useful, the APC model suffers from an intrinsic structural issue: the age, period and cohort

    variables have an exact linear dependence that can be expressed as age = period - cohort ($A = P - C$).

    This causes the infamous identifiability problem that makes this model complicated to treat. In

    recent decades many efforts have been made to overcome this issue; a few examples

    are the use of estimable functions, adding extra constraints to the model, or performing sequential

    fittings of a two-variable model and then regressing the residuals on the remaining variable [3]

    [4][5][6][7][8]. More recently, solutions using modern techniques such as smoothers and Bayesian

    methods have also been proposed [3][8][9][10][11]. All these solutions have their merits and

    setbacks: estimable functions are epidemiologically hard to interpret; adding constraints requires

    biological and statistical a priori knowledge and is not always justifiable; while cascade regression

    results do not always agree with each other, and also require biological justification. In the following

    paragraphs some of these techniques will be illustrated, and the penalised likelihood method, which

    offers a good balance between trade-offs and functionality, will be described in detail [12][13].

    Presenting Data: Tabular View

    Incidence and mortality data are often shown in a two-way table, where rows

    represent age groups ($i = 1, \ldots, I$) and columns the time periods ($j = 1, \ldots, J$). Usually the groupings of age

    and period are of the same size, but this is not necessarily the case. Cells can portray the following

    information: the number of events ($O_{ij}$), which is either the number of deaths for mortality or of new

    occurrences of a disease in the case of incidence; the number of person-years at risk ($N_{ij}$); or the

    age-specific rate ($r_{ij} = O_{ij}/N_{ij}$), which synthesises the aforementioned information. Table 2 shows Italian male

    gastric cancer mortality rates; age groups and time periods are both grouped in quinquennia, and the

    cells contain the corresponding age-specific rates per 100 000 inhabitants. Observing the table, it

    can be seen that moving down the rows we study the variation along the age groups ($I = 12$), while

    moving horizontally along the columns we can observe the temporal variation through the 5-year

    periods ($J = 11$). Moving diagonally down the table, changes through the cohorts ($K = I + J - 1$, $k = I - i + j$)

    can be observed; the cohort with the central year of birth in 1930 ($C = P - A$) is highlighted.

    In the example it is easy to see how the rates get bigger at older ages and in earlier periods of death;

    the intrinsic dependence of cohort effects on the age and period variables also appears clearly in

    this example.

    Table 2. Tabulated age-specific mortality rates (per 100 000) for gastric cancer in Italian men.
    Rows: central year of the age quinquennium. Columns: central year of the quinquennium of death.

    Age    1952    1957    1962    1967    1972    1977    1982    1987    1992    1997    2002
    22     0.27    0.33    0.45    0.14    0.22    0.25    0.23    0.16    0.12    0.12    0.04
    27     0.82    0.85    0.93    0.73    0.70    0.63    0.51    0.33    0.38    0.24    0.38
    32     2.21    2.03    2.35    2.18    1.81    1.65    1.27    1.00    0.92    0.89    0.57
    37     5.86    5.18    5.17    4.92    4.61    4.09    3.32    2.83    2.28    1.82    1.44
    42    15.02   13.83   11.50   10.49    9.30    7.73    7.94    5.69    5.15    4.11    3.34
    47    32.04   27.65   25.50   20.57   19.18   17.12   13.98   12.23    9.87    8.32    6.58
    52    58.09   56.65   48.80   43.05   34.97   31.39   26.81   21.90   18.12   14.53   11.55
    57   100.97   95.88   89.69   80.01   68.43   53.30   47.74   41.05   33.43   26.27   21.23
    62   159.17  151.69  148.13  136.19  114.49   95.31   77.77   66.89   55.28   45.77   37.11
    67   235.55  223.04  220.72  205.64  182.18  145.83  131.09  105.86   88.14   72.06   61.06
    72   317.74  313.15  301.82  296.15  259.80  208.33  191.36  171.75  138.05  108.03   92.54
    77   366.57  367.31  393.92  367.70  355.83  281.51  267.09  237.90  196.14  158.47  133.61

    The cohort with central year of birth 1930 lies on the diagonal from age 22 in 1952 to age 72 in 2002.
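    The way cohorts run along the diagonals of the age by period table can be made concrete with a short sketch. It holds the rates of Table 2 in a nested list and extracts one birth cohort through the relation $C = P - A$; the helper names `cohort_index` and `cohort_diagonal` are hypothetical and are introduced only for this illustration.

```python
AGES = [22, 27, 32, 37, 42, 47, 52, 57, 62, 67, 72, 77]  # central ages (rows)
PERIODS = [1952, 1957, 1962, 1967, 1972, 1977, 1982, 1987, 1992, 1997, 2002]

RATES = [  # Table 2: deaths per 100 000, one row per age group
    [0.27, 0.33, 0.45, 0.14, 0.22, 0.25, 0.23, 0.16, 0.12, 0.12, 0.04],
    [0.82, 0.85, 0.93, 0.73, 0.70, 0.63, 0.51, 0.33, 0.38, 0.24, 0.38],
    [2.21, 2.03, 2.35, 2.18, 1.81, 1.65, 1.27, 1.00, 0.92, 0.89, 0.57],
    [5.86, 5.18, 5.17, 4.92, 4.61, 4.09, 3.32, 2.83, 2.28, 1.82, 1.44],
    [15.02, 13.83, 11.50, 10.49, 9.30, 7.73, 7.94, 5.69, 5.15, 4.11, 3.34],
    [32.04, 27.65, 25.50, 20.57, 19.18, 17.12, 13.98, 12.23, 9.87, 8.32, 6.58],
    [58.09, 56.65, 48.80, 43.05, 34.97, 31.39, 26.81, 21.90, 18.12, 14.53, 11.55],
    [100.97, 95.88, 89.69, 80.01, 68.43, 53.30, 47.74, 41.05, 33.43, 26.27, 21.23],
    [159.17, 151.69, 148.13, 136.19, 114.49, 95.31, 77.77, 66.89, 55.28, 45.77, 37.11],
    [235.55, 223.04, 220.72, 205.64, 182.18, 145.83, 131.09, 105.86, 88.14, 72.06, 61.06],
    [317.74, 313.15, 301.82, 296.15, 259.80, 208.33, 191.36, 171.75, 138.05, 108.03, 92.54],
    [366.57, 367.31, 393.92, 367.70, 355.83, 281.51, 267.09, 237.90, 196.14, 158.47, 133.61],
]


def cohort_index(i, j, n_ages=len(AGES)):
    """1-based cohort index k = I - i + j for 1-based age index i and period index j."""
    return n_ages - i + j


def cohort_diagonal(birth_year):
    """Return (age, period, rate) for every cell whose period minus age equals birth_year."""
    return [(age, period, RATES[i][j])
            for i, age in enumerate(AGES)
            for j, period in enumerate(PERIODS)
            if period - age == birth_year]


for age, period, rate in cohort_diagonal(1930):
    print(age, period, rate)
```

    The oldest age group in the first period gives `cohort_index(12, 1) == 1` (central year of birth 1875), and the table contains $I + J - 1 = 22$ distinct cohorts.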


    Presenting Data: Graphical Methods

    To explore the data structure further, a series of two-dimensional representations on a logarithmic scale

    is used. Figure 1 shows death rates per 100 000 inhabitants plotted against period of death, with

    data stratified by age at death. In the case shown (Italian male gastric cancer mortality) parallelism

    between age groups is very strong, underlining the coherence of mortality variation over time. This

    parallelism is also seen in figure 2, where the grouping and x-axis variables are swapped, i.e. with

    data grouped by period of death and death rates plotted against age at death, underlining the

    coherence of the variation in death rates across the age groups.

    This homogeneity can also be seen when plotting involves the cohort of birth. In figure 3

    death rates are plotted against cohort of birth, stratified by age group. The parallelism seen is

    very similar to that seen in figure 1, which is expected since the strata are identical in the two

    figures; the difference lies in the fact that in figure 3 they are plotted against the cohort of birth

    instead of the period (figure 1).

    Figure 1: Death rates per 100 000 inhabitants plotted against period of death with data stratified by age at death

    Figure 2: Death rates per 100 000 inhabitants plotted against age at death with data stratified by period of death

    Figure 3: Death rates per 100 000 inhabitants plotted against cohort of birth with data stratified by age at death

    In figure 4 we see death rates per 100 000 grouped by birth cohort and plotted against age at death;

    the graph has the same shape as that in figure 2 because it plots the same data points, but these are

    joined according to different strata. It is immediately noticeable that the broken lines do not all share

    the same length; that is because older and younger cohorts are based on fewer cells than the

    central ones, since they only exist at the extremes of the studied age spectrum. Nonetheless, good

    parallelism between the lines is evident, showing that, with respect to cohorts, data variation over

    age is homogeneous. From the four figures representing Italian male gastric cancer mortality, age

    appears to be the most important factor, as would be expected with mortality from this cancer. The

    parallelism seen in the graphs makes it complicated to discern whether there is a stronger age-period effect (shown

    in figures 1 and 2) or a stronger age-cohort effect (as shown in

    figures 3 and 4).

    Many more data representation techniques exist; the more explanatory

    ones tend to be combinations of those shown so far. In the example (figure 5) cohort strata are

    added to the graph in figure 1; the effect of this procedure is to create a surface with a varying

    topology from which information regarding the roles of age, period and cohort can be gained. In the

    example, parallelism and linearity on a logarithmic scale dominate the data.

    Figure 4: Death rates per 100 000 inhabitants plotted against age at death with data stratified by cohort of birth

    The more spectacular representations are undoubtedly three-dimensional plots, of which we give an

    example (figure 6), portraying the same data with the logarithm of the death rate per 100 000

    inhabitants on the z axis and age and period on the x and y axes respectively.

    Figure 5: Death rates per 100 000 inhabitants plotted against period of death with data stratified by age at death (black) and cohort of birth (red)

    Figure 6: Death rate per 100 000 inhabitants on the z axis, age at death on the x axis and period of death on the y axis

    The strength of the age effect can also be seen in this graph, but it suffers from the limitation of being a

    three-dimensional representation projected onto a two-dimensional surface, which makes the choice of

    viewpoint an important and tricky issue.

    Classical Modelling Approach

    To explain the methods and issues that arise from the application of APC models it is best to start

    from the classical illustration of the problem that Clayton and Schifflers give in their two-part paper

    titled Models for Temporal Variation in Cancer Rates [14][5]. In their work, the authors tackle the

    problem by performing analyses of variance on successively more complex models. They start with a

    simple age model, move on to a linear drift model that adds either a cohort or a period drift term to

    the age model, then apply the age-period (AP) and age-cohort (AC) models and only in the final

    instance, if justified by the analysis of variance, apply one of the possible APC solutions. The flow

    chart they present in their paper (figure 7) is a synthesis of their proposed method.

    Figure 7: ANOVA procedure schematic for age-period-cohort analysis. Degrees of freedom (DoF) in APC models.

    Step  Model                      DoF
    1     Age Model                  I(J-1)
    2     Age-Drift Model            I(J-1)-1
    3     Age-Period Model           (I-1)(J-1)
    3     Age-Cohort Model           (I-1)(J-2)
    4     Age-Period-Cohort Model    (I-2)(J-2)

    Change in DoF along each arrow: Age to Age-Drift, -1; Age-Drift to Age-Period, -(J-2);
    Age-Drift to Age-Cohort, -(I+J-3); Age-Period to Age-Period-Cohort, -(I+J-3);
    Age-Cohort to Age-Period-Cohort, -(J-2).

    This schematic highlights the fact that the AP and AC models are not directly comparable; this is

    due to the AC model using more degrees of freedom than the AP model. The resulting ANOVA is

    consequently complicated to interpret. Applying this ANOVA method and the aforementioned

    models, which will be explained in greater detail in the following sections, to the Italian male gastric

    cancer mortality rates from the previous examples, we obtain the results shown in figure 8.

    Examining figure 8, it can be seen that the age-drift model explains most of the variance using the

    fewest degrees of freedom; therefore it gives the most economical representation of the data. This

    result is also apparent upon close inspection of the graphs presented in the Graphical Methods

    section (figures 1-6), where the plots are dominated by strong parallelism and linearity on a

    logarithmic scale.
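    The degrees of freedom of the schematic and the successive deviance comparisons can be checked with a short sketch. The formulas are those of figure 7 and the deviances are those reported in figure 8; the function name is hypothetical, and the table dimensions $I = 10$ and $J = 11$ are an assumption inferred from the DoF values in figure 8 rather than stated in the text.

```python
def residual_dof(n_ages, n_periods):
    """Residual degrees of freedom of each model for an I x J table of rates."""
    I, J = n_ages, n_periods
    return {
        "age": I * (J - 1),
        "age-drift": I * (J - 1) - 1,
        "age-period": (I - 1) * (J - 1),
        "age-cohort": (I - 1) * (J - 2),
        "age-period-cohort": (I - 2) * (J - 2),
    }


# Residual deviances reported in figure 8 (Italian male gastric cancer mortality).
DEVIANCE = {"age": 70183, "age-drift": 4714, "age-period": 1713,
            "age-cohort": 927, "age-period-cohort": 306}

# Nested comparisons, i.e. the arrows of the schematic.
STEPS = [("age", "age-drift"), ("age-drift", "age-period"),
         ("age-drift", "age-cohort"), ("age-period", "age-period-cohort"),
         ("age-cohort", "age-period-cohort")]

dof = residual_dof(10, 11)  # assumed I = 10, J = 11 (reproduces the DoF in figure 8)
for simpler, richer in STEPS:
    d_dev = DEVIANCE[simpler] - DEVIANCE[richer]
    d_dof = dof[simpler] - dof[richer]
    print(f"{simpler} -> {richer}: deviance falls by {d_dev} on {d_dof} DoF")
```

    The first step alone, adding the single drift parameter, removes 65469 of the 70183 units of deviance left by the age model, which is what makes the age-drift model such an economical description of these data.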

    Figure 8: ANOVA schematic applied to Italian male gastric cancer mortality. Degrees of freedom (DoF) and deviance (Dev) in APC models of Italian male gastric cancer mortality.

    Step  Model                      DoF    Dev
    1     Age Model                  100    70183
    2     Age-Drift Model             99     4714
    3     Age-Period Model            90     1713
    3     Age-Cohort Model            81      927
    4     Age-Period-Cohort Model     72      306

    Change in DoF along each arrow: Age to Age-Drift, -1; Age-Drift to Age-Period, -9;
    Age-Drift to Age-Cohort, -18; Age-Period to Age-Period-Cohort, -18;
    Age-Cohort to Age-Period-Cohort, -9.

    Null model DoF: 110 Total Deviance:1479717519

    The Age Model

    The model that analyses the rate trends using age as a factor can be expressed with the following

    function:

    $$Y = \log E[r_i] = \mu + \alpha_i$$

    where $\mu$ is the intercept and $\alpha_i$ is the additive effect of the $i$th age group on the logarithm of the base rate, which

    can be expressed as $\exp(\mu) \times 100\,000$ per 100 000 inhabitants; i.e. $\exp(\alpha_i)$ is the relative risk as compared to that of a

    reference age group $\alpha_0$.

    For our purposes the model without the intercept, where the $\alpha_i$ are the log-estimates of the age-specific

    rates $r_i$, is more useful:

    $$Y = \log E[r_i] = \alpha_i$$

    Applying the model to the example data (figure 9), the slightly wider confidence intervals in the

    younger age groups indicate the greater variability present in these estimates; this is due

    to the very small numbers of deaths recorded in these age groups in the studied pathologies.

    Figure 9: Age model for Italian male gastric cancer mortality

    The Age-Period Model

    This model assumes that the age-specific rates maintain the same structure during all the studied

    periods, but vary in magnitude according to the time period. It can be written as:

    $$Y = \log E[r_{ij}] = \alpha_i + \beta_j$$

    where $\alpha_i$ is the log-estimate of the $i$th age-specific rate in the reference period $\beta_0$, $\beta_j$ is the additive

    effect of the $j$th period on the logarithm of the age-specific rates, and $r_{ij}$ is the estimate of the

    rate of the $i$th age group in the $j$th period.

    This model is over-parametrised as it has a parameter for each age group and one for each period,

    but this is easily solved by constraining one of the period effects to be equal to 0 and setting it as the

    reference $\beta_0$, compared to which the other period effects $\beta_j$ are to be considered as measures of risk.

    In our example (Italian male gastric cancer mortality rates) the fifth period, corresponding to the 1970-74

    quinquennium, was chosen as the reference; therefore the $\alpha_i$ are the logarithms of the age-specific

    mortality rates in the reference period $\beta_0$ (1970-74) and $\beta_j$ is the log relative risk to be added to the

    age-specific log-rates in the $j$th period.

    Figure 10: Age period model for Italian male gastric cancer mortality

    The Age-Cohort Model

    This model is structurally similar to the AP model, replacing the period factor with the cohort one:

    $$Y = \log E[r_{ik}] = \alpha_i + \gamma_k$$

    where $\alpha_i$ is the log-estimate of the $i$th age-specific rate in the reference cohort $\gamma_0$, $\gamma_k$ is the additive

    effect of the $k$th cohort on the logarithm of the age-specific rates and, similarly to the previous model, $r_{ik}$

    is the estimate of the rate of the $i$th age group, but in the $k$th cohort instead of the $j$th period. The

    over-parametrisation problem is solved, similarly to the previous model, by taking a reference cohort $\gamma_0$

    where the age-specific rates per 100 000 take on the values given by $\exp(\alpha_i) \times 100\,000$, relative to which $\exp(\gamma_k)$

    is the relative risk for the $k$th cohort. In our example (Italian male gastric cancer mortality rates) the central

    reference cohort is the 11th, which has its central year of birth in 1930.

    The confidence intervals for the cohorts, resulting from the application of the AC model to the

    example data (figure 11), show greater variability in the oldest and youngest cohorts. This is due to

    these cohorts being built on fewer observations, as can be seen in figure 4. Furthermore, the most

    recent cohorts include the youngest age groups, which have the smallest numbers of recorded

    deaths; consequently they are affected by the greatest variability.

    Figure 11: Age cohort model for Italian male gastric cancer mortality

    The Age-Drift Model

    Observing the previous two models with the period and cohort effects expressed as relative risks,

    the question arises as to what would happen if these parameter estimates were substituted with a

    log-linear trend. This is called the linear drift and can be expressed either as a function of period or

    as a function of cohort, as can be seen respectively in the following formulae:

    Y=log E[rij ]=i j j 0

    Y=logE[rik]=ikk0

    The resulting effects are age-specific rates calculated in the reference period or cohort (j0 and k0

    respectively) and a linear trend component representing the relative risk as a linear drift on the

    logarithmic scale.

    The main feature of these two models is that they yield the same expected values, so the residual

    deviance is identical in both. The only real difference between the two models is the reference

    temporal scale: the interpretation changes according to whether the model is period or cohort

    centred, and the estimated age-specific rates in the reference parameter have a different structure

    to reflect this.

    Figure 12: Age drift models for Italian male gastric cancer mortality

    This model is interpreted as an age-specific rate structure that is influenced by a constant linear

    cohort or period effect. In the example (Italian male gastric cancer mortality rates), the two models

    are superimposed in figure 12 to illustrate the features mentioned above, with the references j0 and

    k0 being the same as in the previous AP and AC examples (figures 10 and 11).
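    The equivalence of the two drift parametrisations can be checked numerically. The following is a

    minimal Python sketch, not the code used for this thesis: the function names, the symbols alpha (age

    effects) and delta (drift slope), the toy values and the zero-based cohort index k = (I - 1 - i) + j

    are all assumptions made for illustration. It shows that moving the reference from a period j0 to a

    cohort k0 only changes the age parameters, while every fitted log-rate stays the same.

```python
def period_drift_log_rates(alpha, delta, n_periods, j0):
    """log E[r_ij] = alpha_i + delta * (j - j0), returned as [age][period]."""
    return [[a + delta * (j - j0) for j in range(n_periods)] for a in alpha]


def cohort_drift_log_rates(alpha_c, delta, n_periods, k0):
    """log E[r_ik] = alpha_c_i + delta * (k - k0), with k = (I - 1 - i) + j."""
    n_ages = len(alpha_c)
    return [
        [alpha_c[i] + delta * ((n_ages - 1 - i) + j - k0) for j in range(n_periods)]
        for i in range(n_ages)
    ]


def to_cohort_reference(alpha, delta, j0, k0):
    """Shift the age effects so that they refer to cohort k0 instead of period j0."""
    n_ages = len(alpha)
    return [alpha[i] + delta * (k0 - j0 - (n_ages - 1 - i)) for i in range(n_ages)]


# Toy values (hypothetical): three age groups, four periods, a falling drift.
alpha = [-9.0, -8.0, -7.0]
delta = -0.1
period_fit = period_drift_log_rates(alpha, delta, n_periods=4, j0=1)
alpha_c = to_cohort_reference(alpha, delta, j0=1, k0=3)
cohort_fit = cohort_drift_log_rates(alpha_c, delta, n_periods=4, k0=3)
```

    Because the fitted values coincide cell by cell, any deviance computed from them is identical too;

    only the reading of the age parameters differs.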

    The Age-Period-Cohort Model

    As mentioned in the introduction, the full APC model suffers from a non-identifiability problem

    given by the linear relation A=P-C. Numerous techniques have been used to solve or circumvent

    this issue over the years, from simple added constraint solutions to more exotic methods. In the

    following paragraphs some of these methods will be illustrated in order to underline their strengths

    and weaknesses, before moving on to the chosen method for APC analysis.
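    The non-identifiability can be made concrete with a small numerical check. The sketch below is an

    illustration under stated assumptions: the effect values, the dictionary layout and the function names

    are hypothetical. It adds a linear trend of arbitrary slope to the age and cohort effects and

    subtracts it from the period effects; since cohort = period - age, the three trends cancel and the

    fitted log-rates do not change, so the data cannot distinguish between the two parameter sets.

```python
def apc_log_rate(age_eff, period_eff, cohort_eff, age, period):
    """Log-rate of a full APC model; the cohort follows from C = P - A."""
    return age_eff[age] + period_eff[period] + cohort_eff[period - age]


def tilt(age_eff, period_eff, cohort_eff, lam):
    """Add a linear trend of slope lam that cancels because A - P + C = 0."""
    return (
        {a: v + lam * a for a, v in age_eff.items()},
        {p: v - lam * p for p, v in period_eff.items()},
        {c: v + lam * c for c, v in cohort_eff.items()},
    )


# Hypothetical effects on the log scale, keyed by age, calendar year and birth year.
age_eff = {30: -9.0, 35: -8.5}
period_eff = {1950: 0.0, 1955: -0.2}
cohort_eff = {1915: 0.3, 1920: 0.0, 1925: -0.1}

tilted = tilt(age_eff, period_eff, cohort_eff, lam=0.05)
cells = [(a, p) for a in age_eff for p in period_eff]
original_fit = [apc_log_rate(age_eff, period_eff, cohort_eff, a, p) for a, p in cells]
tilted_fit = [apc_log_rate(*tilted, a, p) for a, p in cells]
```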

    Added Constraints

    This method solves the non-identifiability problem by adding further constraints to the model. As

    well as choosing a cohort 0 and a period 0 as references to solve the ordinary over-parametrisation

    problem, an additional constraint is imposed, usually forcing the effects of two parameters

    to be the same.

    This method has some drawbacks, the most obvious of which is that changing the constraint heavily

    and visibly affects the resulting model; a priori biological and statistical knowledge is therefore

    needed to justify such an intervention.

    In the two examples we apply different constraints to the model, and the resulting differences are

    plain to see. The first model (figure 13) constrains the effects of the two earliest cohorts (central

    years 1875 and 1880) so that c1=c2, while the second model (figure 14) constrains the last two age

    groups (70-74 and 75-79 years of age) so that a9=a10. As anticipated, the estimates of the two models

    are incompatible even though the chosen constraints are closely related: the earliest cohorts and

    oldest age groups overlap greatly. This underlines the fact that this kind of solution should be

    avoided unless very strong a priori knowledge is held.

    Successive Iteration Modelling

    In the case where age is the most important scale (as is the case in most cancer mortality studies)

    and cohort or period can be prioritised according to a priori biological or statistical information, a

    full APC model can be approximated by regressing an AP or AC model first, then obtaining the

    effect of the remaining factor by regressing it over the residuals of the first model.

    Figure 13: Age period cohort model constraining the first two cohorts

    Figure 14: Age period cohort model constraining the oldest two age groups

    In the case of an AC model followed by a period model, what is obtained is the residual

    period effect conditional on the age and cohort estimates from the preceding model:

    $$\log E[r_{ijk}] = \alpha_i + \gamma_k + \beta_j$$

    which for the expected cases becomes:

    $$\log E[O_{ijk}] = \alpha_i + \gamma_k + \log N_{ijk} + \beta_j .$$

    This is similar to the general expression of a Poisson model, with the difference that the offset

    now includes the log of the fitted values from the AC model.

    This procedure does not provide maximum likelihood estimates for the parameters; instead it estimates

    marginal age and cohort effects and conditional estimates of the period effects. The technique

    allows for the calculation of confidence intervals, which consequently are not maximum likelihood

    intervals either, but marginal intervals for the age and cohort effects and conditional

    intervals for the period effects.
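    The second step of this procedure has a simple closed form when the period factor is the only free

    term. With the offset fixed at the expected deaths from the first model, the Poisson likelihood for

    each period is maximised when the period multiplier equals observed deaths divided by expected

    deaths. The sketch below illustrates this; the function name, the table layout ([age][period]) and

    the toy numbers are assumptions, and no reference period is imposed (subtracting the value of a

    chosen period j0 would express the effects relative to it).

```python
import math


def conditional_period_effects(deaths, expected):
    """Log period effects given fixed expected deaths from a first-step model.

    deaths[i][j]   -- observed deaths in age group i and period j
    expected[i][j] -- N_ij times the rate fitted by the age-cohort model
    """
    n_periods = len(deaths[0])
    effects = []
    for j in range(n_periods):
        observed_j = sum(row[j] for row in deaths)
        expected_j = sum(row[j] for row in expected)
        # Poisson score equation: sum(O) - exp(beta_j) * sum(E) = 0
        effects.append(math.log(observed_j / expected_j))
    return effects


# Hypothetical 2 x 2 example.
deaths = [[10, 20], [30, 40]]
expected = [[20.0, 10.0], [20.0, 20.0]]
effects = conditional_period_effects(deaths, expected)
```

    In this toy example the first period matches its expectation (effect 0 on the log scale), while the

    second shows twice the expected deaths (effect log 2).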

    Figure 15: Period model regressed over the residuals of an age cohort model

    When the procedure starts with an AP model and the cohort factor is regressed afterwards, the

    following applies:

    $$\log E[r_{ijk}] = \alpha_i + \beta_j + \gamma_k$$

    and the expected cases become:

    $$\log E[O_{ijk}] = \alpha_i + \beta_j + \log N_{ijk} + \gamma_k .$$

    The same holds for effect and standard error estimates: age and period now have marginal

    estimates, while the cohort ones are conditional on the AP model.

    From figures 15 and 16 we can see that the two models are similar but not equivalent. Obviously

    the effects of the parameters of the starting models are those of the AP and AC models; since the

    data examined in this case is strongly driven by drift, the cohort and period parameters for these

    first models are very similar to the corresponding age-drift models. Therefore, choosing whether to

    start with an AC or AP model depends on where we choose to assign the drift term, and which

    factor we want to study without the effects of linear drift.

    Figure 16: Cohort model regressed over the residuals of an age period model

    This method can also be used by fitting one of the possible age-drift models and then regressing

    over the residuals with either the period or cohort parameters. For an age-drift-period model, the

    age-drift model using a cohort linear drift is fitted first; the model for the period effects is then

    regressed using the fitted estimates of the first model as the offset. The results of this procedure

    are period estimates conditional on the age and cohort drift estimates of the drift model. This gives

    us a set of period effect estimates unaffected by linear drift, as this has already been absorbed by

    the first model. This translates to:

    $$\log E[r_{ijk}] = \alpha_i + \delta\,(k - k_0) + \beta_j$$

    and for the expected cases:

    $$\log E[O_{ijk}] = \alpha_i + \delta\,(k - k_0) + \log N_{ijk} + \beta_j .$$

    Analogously to the previous examples, this results in marginal estimates for the effects and errors of

    age and cohort drift and conditional estimates for the period ones. The same thing can be done to

    obtain an age-drift-cohort model where the cohort estimates are drift free.

    Figure 17: Period model regressed over the residuals of an age cohort drift model

    The period and cohort effects resulting from these two models appear to be virtually identical to the

    ones obtained from regressing AP or AC models first (figures 17 and 18). This is because the linear

    drift is absorbed completely by the first model fitted. The usefulness of these models is that they

    enable us to study the shape of the last factor once the linear-drift effect has been subtracted.

    Penalised likelihood Age-Period-Cohort Method

    As previously illustrated, the issues with fitting a full APC model are solving the identification

    problem and dealing with the linear drift term.

    An APC model can be considered as a parameter estimate problem in a log-linear Poisson model,

    such as those illustrated up till now:

    $$\log O_{ijk} = \log N_{ijk} + \log a_i + \log p_j + \log c_k$$

    where a_i, p_j and c_k are the multiplicative parameters for age, period and cohort respectively. These

    can be estimated by minimising the following expression using the weighted least squares method:

    $$f(a, p, c) = \sum_{i,j,k} O_{ijk}\,\bigl(\log O_{ijk} - \log N_{ijk} - \log a_i - \log p_j - \log c_k\bigr)^2$$
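    As an illustration of the weighted least squares criterion, the following Python sketch evaluates

    the sum of squared log-scale residuals, each weighted by the observed deaths in its cell. The

    function name, the [age][period] table layout, the zero-based cohort index k = (I - 1 - i) + j and the

    toy numbers are assumptions introduced for the example; cells with zero deaths are skipped because

    they carry zero weight.

```python
import math


def wls_objective(deaths, population, a, p, c):
    """Sum over cells of O * (log O - log N - log a_i - log p_j - log c_k)^2."""
    n_ages = len(deaths)
    total = 0.0
    for i, row in enumerate(deaths):
        for j, observed in enumerate(row):
            if observed == 0:
                continue  # zero weight, and log(0) is undefined
            k = (n_ages - 1 - i) + j  # cohort index implied by age and period
            residual = (
                math.log(observed)
                - math.log(population[i][j])
                - math.log(a[i])
                - math.log(p[j])
                - math.log(c[k])
            )
            total += observed * residual ** 2
    return total


# Hypothetical multiplicative effects and populations; deaths are built to fit exactly.
a = [0.001, 0.002]
p = [1.0, 0.9]
c = [1.0, 1.1, 1.2]
population = [[50_000, 52_000], [40_000, 41_000]]
deaths = [
    [population[i][j] * a[i] * p[j] * c[(len(a) - 1 - i) + j] for j in range(len(p))]
    for i in range(len(a))
]
exact = wls_objective(deaths, population, a, p, c)
worse = wls_objective(deaths, population, [2 * x for x in a], p, c)
```

    With deaths generated from the parameters themselves the criterion is numerically zero, and it grows

    as soon as the age effects are distorted.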

    Figure 18: Cohort model regressed over the residuals of an age period drift model

    Seeing as there is a linear relation between A, P and C, the solution set X(a, p, c) is infinite, but

    can be parametrised in the following manner:

    $$\log a'_i = \log a_i + \lambda\,(I - i)$$

    $$\log p'_j = \log p_j + \lambda\,j$$

    $$\log c'_k = \log c_k - \lambda\,k .$$

    This parametrisation makes it possible to calculate a goodness of fit statistic ($G^2$) that is

    independent of $\lambda$.

    The solutions estimated by the three two-factor models are:

    $$X_c = (a_c, p_c, c_0); \quad X_p = (a_p, p_0, c_p); \quad X_a = (a_0, p_a, c_a);$$

    where c_0 and p_0 are unit vectors of length k and j respectively and a_0 takes the form:

    $$a_{0i} = \exp\left[ \sum_j O_{ij}\,(\log O_{ij} - \log N_{ij}) \Big/ \sum_j O_{ij} \right] .$$
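    The expression for a0 is a geometric mean of the observed rates in each age group, weighted by the

    number of deaths. A minimal Python sketch follows; the function name, the [age][period] layout and

    the toy numbers are assumptions made for illustration.

```python
import math


def marginal_age_effects(deaths, population):
    """Death-weighted geometric mean of the rates O/N within each age group."""
    effects = []
    for o_row, n_row in zip(deaths, population):
        weighted_log = sum(
            o * (math.log(o) - math.log(n)) for o, n in zip(o_row, n_row) if o > 0
        )
        effects.append(math.exp(weighted_log / sum(o_row)))
    return effects


# Hypothetical data: the first age group has a constant rate of 0.01,
# the second has rates of 0.03 and 0.06 in the two periods.
deaths = [[10, 20], [30, 60]]
population = [[1000, 2000], [1000, 1000]]
a0 = marginal_age_effects(deaths, population)
```

    For the first age group the result equals the common rate; for the second it falls between the two

    period-specific rates, closer to the one supported by more deaths.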

    The natural logarithms of these solutions are then placed in the real space $\mathbb{R}^{i+j+k}$, where

    their Euclidean distances from the parametrised saturated model solutions $X(\lambda) = (a, p, c, \lambda)$

    are defined as:

    $$d_c(\lambda) = \lVert X_c - X(\lambda) \rVert$$

    $$d_p(\lambda) = \lVert X_p - X(\lambda) \rVert$$

    $$d_a(\lambda) = \lVert X_a - X(\lambda) \rVert .$$

    The sum of these distances, weighted by the inverse of the degrees-of-freedom-scaled goodness of

    fit:

    $$g(X(\lambda)) = \frac{d_c(\lambda)}{G_c^2 / [(I-1)(J-1)]} + \frac{d_p(\lambda)}{G_p^2 / [(I-1)(J-2)]} + \frac{d_a(\lambda)}{G_a^2 / [(I-2)(J-1)]}$$

    can be minimised in $\lambda$. This results in a solution $X'(\lambda)$ that minimises the distance of the

    saturated model from the three two-factor models, constructing a geometrical weighted average of the

    three two-factor models. This way the identification problem is solved and the drift is distributed

    according to the goodness of fit statistics.
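    The minimisation step can be sketched in a few lines of Python. This is not the implementation used

    for the thesis: it assumes that the saturated solutions form a straight line X(lambda) = base + lambda

    * direction on the log scale, that each two-factor solution is a plain vector, and that each scale is

    the goodness of fit statistic divided by its degrees of freedom; all names and numbers are

    hypothetical. Because a weighted sum of distances to a line is convex in lambda, a ternary search is

    enough to locate the minimum.

```python
import math


def distance(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def penalty(lam, base, direction, solutions, scales):
    """Sum of distances from X(lam) to each two-factor solution, divided by its scale."""
    x_lam = [b + lam * d for b, d in zip(base, direction)]
    return sum(distance(x_m, x_lam) / s for x_m, s in zip(solutions, scales))


def minimise_penalty(base, direction, solutions, scales, lo=-10.0, hi=10.0, tol=1e-9):
    """Ternary search for the lambda that minimises the convex penalty."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if penalty(m1, base, direction, solutions, scales) < penalty(
            m2, base, direction, solutions, scales
        ):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2


# One-dimensional toy case: the three solutions sit at 1, 2 and 3 on the line.
solutions = [[1.0], [2.0], [3.0]]
equal = minimise_penalty([0.0], [1.0], solutions, scales=[1.0, 1.0, 1.0])
skewed = minimise_penalty([0.0], [1.0], solutions, scales=[1.0, 1.0, 0.25])
```

    With equal scales the minimum sits at the middle solution; when the third model fits much better

    (smaller scale), the minimum moves onto it, which is how the goodness of fit decides where the drift

    is assigned.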

    Confidence Intervals for APC Models

    Many APC methods do not allow for the straightforward production of confidence intervals, and the

    penalised likelihood method is one of them. To gain some insight into the variability of the

    estimated parameters we resorted to a parametric bootstrap simulation method [15]. Published

    mortality data is usually stratified by age group and time period. Being count data, the value in

    every cell, specified by age group and calendar period, can be treated as Poisson distributed. With

    this in mind, for each cell, 1 000 values were randomly drawn from a Poisson distribution with mean

    equal to the observed value of that cell. The simulated datasets were then fed through the

    penalised likelihood APC model, yielding 1 000 sets of estimates for the age, period and cohort

    parameters.

    For each parameter, the 2.5th and 97.5th percentiles of the 1 000 estimates were taken as an

    estimate of the 95% confidence interval for that parameter.
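    The bootstrap procedure can be outlined as follows. This is a hedged sketch rather than the code

    behind the results: the helper names are hypothetical, the Poisson sampler is Knuth's simple

    multiplication method (adequate only for moderate cell counts), and the estimator passed in is a

    stand-in that returns a single number, whereas the thesis re-fits the penalised likelihood APC model

    to every simulated table.

```python
import math
import random


def poisson_draw(mean, rng):
    """Draw one Poisson value with Knuth's multiplication method."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def percentile(sorted_values, q):
    """Percentile with linear interpolation, q between 0 and 100."""
    pos = (len(sorted_values) - 1) * q / 100
    lower = int(math.floor(pos))
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac


def bootstrap_interval(observed, estimator, n_sim=1000, seed=0):
    """2.5th and 97.5th percentiles of the estimator over simulated tables."""
    rng = random.Random(seed)
    estimates = sorted(
        estimator([[poisson_draw(o, rng) for o in row] for row in observed])
        for _ in range(n_sim)
    )
    return percentile(estimates, 2.5), percentile(estimates, 97.5)


# Hypothetical table of deaths; the stand-in estimator is the total count.
observed = [[40, 55], [60, 45]]
low, high = bootstrap_interval(observed, lambda table: sum(map(sum, table)))
```

    The same loop applies to a vector of parameters: the percentiles are then taken separately for each

    age, period and cohort estimate.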

    Figure 19: Age period cohort model fitted using the penalised likelihood method

    Gastric Cancer Data

    Official death certification data for the period 1950-2004 was derived for gastric cancer, whenever

    available, from the World Health Organization (WHO) database [16], for 35 countries of the

    European Region (according to the WHO definition), of which 23 were EU countries (Belgium,

    Cyprus, Luxembourg and Slovakia were excluded), the other 12 being Albania, Armenia,

    Azerbaijan, Belarus, Croatia, Georgia, Kazakhstan, Norway, Moldova, Russia, Switzerland and

    Ukraine, and the EU as a whole (27 member states as defined in January 2007). Mortality data was

    coded according to the Seventh Revision of the International Classification of Diseases (ICD-7)

    from the 1950s to the end of the 1960s. The Eighth Revision (ICD-8) was used throughout the

    1970s, while the Ninth Revision (ICD-9) was used up to the mid 1990s (with the exception of

    Switzerland and Denmark, which skipped this revision and moved to the tenth in 1995 and 1994

    respectively). The Tenth Revision of the International Statistical Classification of Diseases and

    Related Health Problems (ICD-10) was adopted between the late 1990s and early 2000s by most

    countries [17], with the exceptions of Albania, Armenia, Bulgaria, Greece and Ireland, which have

    still not adopted it.

    Figure 20: Age period cohort model fitted using the penalised likelihood method with confidence intervals from the bootstrap procedure

    Encoding for stomach cancer mortality is straightforward and its transition between revisions did

    not present particular issues: it was coded as 151 in ICD-7, ICD-8 and ICD-9, and as C16 in ICD-10.

    Estimates of the resident populations for the corresponding calendar periods, based on official

    censuses, were extracted from the same WHO database.

    From the matrices of certified deaths and resident population we computed the age-specific

    mortality rates per 100 000 inhabitants for 5-year age groups (from 30-34 to 75-79 years), for the 11

    5-year periods considered (from 1950-54 to 2000-04) where data was available. No extrapolations

    were made for missing data.

    From the same matrices, we computed age-specific rates for each 5-year age group (from 0, 1-4 to

    85+ years), in order to construct age-standardized mortality rates per 100 000 men and women

    using the direct method on the basis of the world standard population at all ages, and the

    corresponding percent change in rates over the 1994-2004 decade [1].
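    The computations described above can be summarised in a short Python sketch. The function names and

    the two-group toy data are hypothetical, and the weights used here are not the world standard

    population; they only illustrate the arithmetic of age-specific rates, direct standardisation and

    percent change.

```python
def age_specific_rates(deaths, population, per=100_000):
    """Deaths divided by resident population in each age group, per 100 000."""
    return [per * d / n for d, n in zip(deaths, population)]


def direct_standardised_rate(deaths, population, standard, per=100_000):
    """Average of the age-specific rates weighted by a standard population."""
    rates = age_specific_rates(deaths, population, per)
    return sum(r * w for r, w in zip(rates, standard)) / sum(standard)


def percent_change(old, new):
    """Relative change between two rates, in percent."""
    return 100 * (new - old) / old


# Toy example with two age groups and hypothetical standard weights 3:1.
rates = age_specific_rates([5, 50], [100_000, 100_000])
asr = direct_standardised_rate([5, 50], [100_000, 100_000], standard=[3, 1])
change = percent_change(20.0, 15.0)
```

    Here the age-specific rates are 5 and 50 per 100 000, the standardised rate is 16.25 per 100 000, and

    a fall from 20 to 15 corresponds to a change of -25%.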

    Cohorts were defined according to their central year of birth. Thus, the earliest possible cohort (the

    1875 one) relates to individuals aged 75 to 79 who died in the quinquennium 1950-54; they could

    have been born in any of the 10 years from 1870 to 1879.
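    The cohort definition can be written out explicitly. In the sketch below (function names are

    hypothetical, and 5-year age groups and periods are assumed), the central year of birth is the

    first year of the period minus the lower bound of the age group, and the possible birth years span

    ten calendar years.

```python
def central_birth_year(period_start, age_start):
    """Central year of birth for a 5-year period and 5-year age group."""
    return period_start - age_start


def birth_year_range(period_start, age_start, width=5):
    """Earliest and latest possible calendar years of birth for the cell."""
    # Oldest case: death in the first year of the period before the birthday,
    # at the highest completed age of the group.
    earliest = period_start - (age_start + width - 1) - 1
    # Youngest case: death in the last year of the period at the lowest age.
    latest = (period_start + width - 1) - age_start
    return earliest, latest


cohort = central_birth_year(1950, 75)
born_between = birth_year_range(1950, 75)
```

    For individuals aged 75 to 79 who died in 1950-54 this returns the 1875 cohort, with birth years

    from 1870 to 1879, as stated in the text.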

    Oral Cancer Data

    Official death certification data from oral and pharyngeal cancer for 38 European countries and for

    the EU in the period 1950-2007 was derived from the World Health Organization (WHO) database

    available in electronic form [16].

    The EU was defined as the 27 member states as of January 2007, i.e., Austria, Belgium, Bulgaria,

    the Czech Republic, Cyprus, Denmark, Estonia, Finland, France, Germany, Greece, Hungary,

    Ireland, Italy, Latvia, Lithuania, Luxembourg, Malta, the Netherlands, Poland, Portugal, Romania,

    Slovakia, Slovenia, Spain, Sweden, United Kingdom. Data for Cyprus was not available and for

    Belgium it was only available up to 1997, and was therefore excluded.

    During the calendar period considered (1950-2004), four different Revisions of the International


    Classification of Diseases (ICD) were used. Since differences in classifications between various

    Revisions were minor, oral and pharyngeal cancer deaths were re-coded for all countries according

    to the Tenth Revision of the ICD (ICD-10: C00-C14) [17].

    Estimates of the resident population, based on official censuses, were obtained from the same WHO

    database. From the matrices of certified deaths and resident populations, we computed age-specific

    rates for each 5-year age group (from 0, 1-4 to 85+ years) and calendar period. Age-standardized

    rates per 100 000 men and women, at all ages and truncated 35-64 years, were computed using the

    direct method, on the basis of the world standard population [1]. In a few countries, mortality data

    was missing for one or more calendar years. No extrapolation was made for missing data.

    From these same matrices, we computed the age-specific mortality rates per 100 000 inhabitants for

    5-year age groups (from 30-34 to 80-84 years), for the 12 5-year periods considered (from 1950-54

    to 2000-04 plus 2007), where data was available. No extrapolations were made for missing data.

    Figure 21: Age period cohort models of male gastric cancer mortality in countries representative of

    the European region and the EU as a whole.

    Cohorts were defined according to their central year of birth. Thus the earliest possible cohort (the

    1865 one) relates to individuals aged 80 to 84 who died in the quinquennium 1950-54; they could

    have been born in any of the 10 years from 1860 to 1869.

    Results

    Gastric Cancer Mortality

    Table 3 shows age-standardized rates for men and women in 1994, 1999 and 2004, with percentage

    differences between periods. For the EU as a whole, the age-standardized mortality rate in 2004 was

    9.09/100 000 for men and 4.17/100 000 for women, but this overall rate is the result of many

    differing contributions. Central and northern European countries like France, Finland and Sweden

    showed low rates, around 5/100 000 in men and between 2 and 4/100 000 in women in 2004, while

    southern and Mediterranean countries like Italy and Portugal had mortality rates of 10.43 and

    16.36/100 000 in men and 5.04 and 7.30/100 000 in women respectively. Eastern countries belonging to

    the EU showed even higher mortality rates, with Latvia and Lithuania peaking slightly above

    20/100 000 in men and at nearly 8/100 000 in women, and Estonia showing the highest value for women

    in the EU at 9.66/100 000.

    Figure 22: Age period cohort models of female gastric cancer mortality in countries representative

    of the European region and the EU as a whole.

    Table 3. Age standardised mortality rates for gastric cancer in men and women in countries from

    the European geographical region.

                                           MEN                                      WOMEN
    Country                 Calendar      1994   1999   2004  % Change  % Change     1994   1999   2004  % Change  % Change
                            years^a                            1999/94   2004/99                          1999/94   2004/99
    EUROPE
    Albania                 1987-2004        .  16.29  13.92         .    -14.59        .   7.54   7.73         .      2.55
    Armenia (2003)          1981-2003    15.91  13.35  16.79    -16.08     25.76     6.98   5.82   7.09    -16.61     21.81
    Austria                 1980-2005    14.10   9.89   8.07    -29.83    -18.38     7.65   5.35   4.63    -30.12    -13.42
    Azerbaijan (2002)       1981-2002    24.32  25.24  21.64      3.81    -14.28    10.27  10.06  11.06     -2.03      9.90
    Belarus (2003)          1981-2003    35.51  32.97  27.44     -7.17    -16.76    15.07  13.02  10.51    -13.63    -19.22
    Bulgaria                1980-2004    18.55  14.91  13.25    -19.62    -11.15     8.75   7.49   6.25    -14.43    -16.50
    Croatia                 1985-2005    21.15  22.79  16.38      7.77    -28.12     8.27   7.49   6.21     -9.50    -17.09
    Czech Republic          1986-2005    16.57  11.73  10.15    -29.20    -13.53     7.34   5.88   4.78    -19.79    -18.82
    Denmark (2001)          1980-2001     6.40   5.19   5.13    -18.81     -1.20     3.02   2.83   2.75     -6.25     -2.85
    Estonia                 1981-2005    25.92  23.45  18.94     -9.52    -19.24    13.40  12.09   9.66     -9.75    -20.10
    Finland                 1980-2005    10.81   7.89   6.31    -27.06    -20.02     4.64   3.64   3.86    -21.65      6.06
    France                  1980-2005     7.30   6.53   5.63    -10.46    -13.85     2.81   2.46   2.08    -12.45    -15.16
    Georgia (2001)          1981-2001    11.20  12.02   8.51      7.36    -29.20     5.25   5.10   4.99     -2.74     -2.26
    Germany                 1980-2004    12.88   9.94   7.95    -22.81    -20.08     6.72   5.19   4.23    -22.75    -18.44
    Greece                  1980-2005        .   8.70   7.25         .    -16.68        .   3.73   3.65         .     -2.18
    Hungary                 1980-2005    21.64  17.79  13.49    -17.79    -24.16     9.45   7.50   6.49    -20.63    -13.51
    Iceland                 1980-2005    10.29   7.68   6.88    -25.35    -10.42     4.04   3.30   3.06    -18.20     -7.48
    Ireland                 1980-2005    10.26   8.32   6.90    -18.87    -17.04     5.51   4.29   2.64    -22.17    -38.55
    Italy (2003)            1980-2003    15.06  11.95  10.43    -20.64    -12.73     7.23   5.46   5.04    -24.46     -7.68
    Israel (2003)           1980-2003     8.87   7.49   7.81    -15.59      4.34     5.07   4.33   3.41    -14.50    -21.42
    Kazakhstan              1981-2005    33.92  30.57  25.71     -9.88    -15.92    13.94  12.65  10.77     -9.25    -14.87
    Kyrgyzstan              1981-2005    30.58  24.49  23.24    -19.92     -5.13    10.68  10.84   8.30      1.44    -23.37
    Latvia                  1980-2005    26.38  22.49  20.55    -14.74     -8.62    11.48   9.83   7.85    -14.40    -20.18
    Lithuania               1981-2005    25.67  22.86  21.09    -10.92     -7.77     9.70   9.36   7.98     -3.51    -14.76
    Luxembourg              1980-2005     9.04   7.52   7.03    -16.83     -6.55     4.40   3.55   2.86    -19.33    -19.54
    Macedonia TFYR (2003)   1991-2003    22.01  17.17  17.77    -22.00      3.50    10.52   7.80   7.98    -25.88      2.28
    Malta                   1980-2005     9.99  12.25   8.03     22.66    -34.42     5.77   3.70   3.65    -35.78     -1.55
    Netherlands             1980-2004    10.75   8.47   6.53    -21.17    -22.99     4.43   3.55   3.37    -19.79     -5.04
    Norway                  1980-2005     9.61   8.55   5.57    -10.96    -34.90     4.68   4.28   2.94     -8.41    -31.47
    Poland                  1980-2005    19.68  16.52  14.28    -16.05    -13.57     6.97   5.70   5.00    -18.27    -12.23
    Portugal (2003)         1980-2003    21.02  18.69  16.36    -11.09    -12.46     9.66   8.93   7.30     -7.57    -18.24
    Republic of Moldova     1981-2005    19.74  16.51  16.89    -16.35      2.33     9.51   6.69   7.52    -29.63     12.41
    Romania                 1980-2004    17.79  16.67  16.00     -6.33     -4.02     6.83   6.08   6.29    -10.95      3.43
    Russian Federation      1980-2005    37.46  31.70  27.16    -15.40    -14.32    15.58  13.02  11.18    -16.43    -14.14
    Slovakia                1992-2005        .  16.69  13.63         .    -18.34        .   5.84   5.78         .     -0.92
    Slovenia                1985-2005    21.18  17.82  14.84    -15.86    -16.75     7.82   7.67   5.42     -1.86    -29.44
    Spain                   1980-2005    13.16  10.90   9.02    -17.14    -17.26     5.69   4.67   3.79    -17.99    -18.74
    Sweden                  1980-2004     7.05   6.13   4.94    -13.12    -19.32     3.75   3.05   2.73    -18.78    -10.29
    Switzerland             1980-2005     7.98   5.53   4.50    -30.75    -18.55     3.58   2.71   2.10    -24.48    -22.35
    Tajikistan              1981-2004    19.25  17.99  14.88     -6.53    -17.28    10.63  10.82   9.26      1.80    -14.38
    Ukraine                 1981-2005    28.85  24.86  21.23    -13.84    -14.61    11.24   9.58   8.28    -14.76    -13.65
    United Kingdom          1980-2005    10.41   8.08   6.09    -22.36    -24.68     4.22   3.26   2.75    -22.80    -15.65
    Uzbekistan              1981-2005    15.93  11.98  12.34    -24.77      2.93     7.11   5.56   6.22    -21.86     12.04
    EU                                   13.22  10.85   9.09    -17.90    -16.28     5.98   4.81   4.17    -19.42    -13.45

    This geographical division of results is also reflected in the curves resulting from the APC analysis

    that can be seen in Figures 21 (men) and 22 (women). The age curves for France, Sweden and the

    UK are relatively shallow in both men and women as compared to those found in southern and

    eastern countries such as Italy, Portugal and Hungary. The central and northern countries with the

    more favourable rates also have distinctive cohort and period effect curves: they have strong

    descending period effect trends, and cohort effects that fall steeply from the earliest cohorts until

    about the 1940s and then stabilize. Southern and eastern countries, on the other hand, display a later

    start of the fall in cohort effects, and eastern countries do not show the stabilization in the younger

    cohorts that southern European countries share with central and northern ones. The difference between

    the sexes is mainly seen in the age curves, which are much shallower in women than in men throughout

    the observed countries.

    Oral Cancer Mortality

    Table 4 gives the overall age-standardized mortality rates from oral and pharyngeal cancer in men

    and women from various European countries and the EU as a whole in 1990-1994 and 2000-2004

    with the corresponding percent changes.

    For EU men, rates declined by 8% between the early 1990s and 2000-2004, to reach an overall

    age-standardized rate of 6.1/100 000 in 2000-2004. In 1990-1994, the highest male rates were in

    Hungary (17.1/100 000), Slovakia (16.0/100 000) and France (12.4/100 000); the lowest ones were in

    Greece and Iceland (1.8/100 000) and Finland (2.2/100 000). In 2000-2004, the highest male rates

    were in Hungary (21.1/100 000) and other countries from central and eastern Europe, such as

    Slovakia (16.9/100 000), the Republic of Moldova, Lithuania, Ukraine and Croatia (around

    10-11/100 000), while the lowest ones were in Nordic countries, such as Iceland, Sweden and Finland

    (around 2/100 000), the United Kingdom (2.8/100 000) and Greece (1.8/100 000). Oral and

    pharyngeal cancer mortality declined over the last decade in several large European countries,

    including France (with a rate of 8.6/100 000 in the early 2000s), Spain (6.0/100 000), Germany

    (5.7/100 000) and Italy (4.3/100 000). Persisting rises were, however, observed in several central

    and eastern European countries, including in particular Hungary, but also Belarus, Lithuania and

    Romania. Rates were much lower in women, but increased moderately from 1.08 to 1.14/100 000

    over the last decade in the EU as a whole. In 2000-2004 the highest rates for women were also in

    Hungary (3.3/100 000), followed by Denmark (1.6/100 000) and Scotland (1.4/100 000); the

    lowest ones were in Bulgaria (0.8/100 000) and Greece (0.7/100 000).

    Figure 23: Male age-specific mortality rates from oral and pharyngeal cancer from 30-34 to 80-84 years plotted against the year of birth

    Table 4. Age standardised mortality rates for oral cancer in men and women in countries from the European geographical region.

                                         Men                                Women
    Countries                            1990-94 2000-04         % Change   1990-94 2000-04         % Change
                                                          2000-04/1990-94                    2000-04/1990-94
    Albania (1992-94)                       3.81    2.41           -36.77      1.35    1.15           -14.94
    Austria                                 6.07    6.28             3.47      0.96    1.35            40.49
    Belarus (2000-03)                       8.28    9.70            17.25      0.72    0.72            -0.23
    Bulgaria (2000-03)                      4.27    4.48             5.07      0.69    0.81            17.54
    Croatia                                11.88   10.43           -12.15      1.15    1.07            -7.44
    Czech Republic                          6.53    7.13             9.14      0.97    1.06             9.03
    Denmark (2000-01)                       4.28    4.93            15.30      1.38    1.64            19.16
    Estonia                                 8.71    8.83             1.31      1.06    1.30            22.56
    Finland                                 2.21    2.31             4.15      0.83    0.83             0.24
    France                                 12.43    8.57           -31.01      1.30    1.27            -2.37
    Germany                                 6.63    5.67           -14.47      1.13    1.18             4.97
    Greece                                  1.80    1.84             2.60      0.49    0.66            32.62
    Hungary                                17.13   21.12            23.25      2.22    3.25            46.34
    Iceland                                 1.81    2.09            15.23      0.66    1.45           117.85
    Ireland                                 4.52    3.51           -22.26      1.09    1.13             3.27
    Italy (2000-03)                         5.81    4.31           -25.78      1.00    1.01             1.67
    Latvia                                  6.96    8.06            15.80      0.81    0.92            13.85
    Lithuania                               7.94   10.57            33.03      0.95    0.96             0.16
    Luxembourg                              8.49    7.70            -9.31      1.40    1.47             4.98
    Macedonia (2000-03)                     3.22    3.04            -5.41      0.60    0.83            37.50
    Malta                                   3.49    2.86           -18.04      1.40    0.78           -44.30
    Netherlands                             2.78    2.96             6.29      1.04    1.27            22.16
    Norway                                  3.14    2.76           -11.93      0.93    0.98             5.47
    Poland                                  6.27    5.98            -4.70      1.06    1.12             5.39
    Portugal (2000-03)                      5.87    6.81            16.10      0.91    0.83            -9.18
    Republic of Moldova                    10.57   11.21             6.13      1.02    1.10             7.72
    Romania                                 6.45    9.99            55.00      1.02    1.23            20.38
    Russian Federation                      8.92    8.85            -0.75      1.03    1.08             4.15
    Slovakia                               16.01   16.85             5.28      1.10    1.19             8.32
    Slovenia                               11.41    8.35           -26.82      0.95    1.08            14.39
    Spain                                   6.94    5.98           -13.76      0.83    0.85             2.06
    Sweden                                  2.54    2.15           -15.39      0.91    0.88            -2.85
    Switzerland                             6.14    4.60           -25.06      1.17    1.23             5.15
    Ukraine                                 9.73   10.55             8.32      0.94    0.88            -6.36
    United Kingdom                          2.98    2.75            -7.64      1.15    1.07            -6.72
    United Kingdom, England and Wales       2.83    2.58            -8.96      1.11    1.05            -5.41
    United Kingdom, Northern Ireland        3.16    2.74           -13.30      0.96    1.12            16.43
    United Kingdom, Scotland                4.48    4.19            -6.46      1.52    1.37            -9.69
    European Union                          6.62    6.09            -7.92      1.08    1.14             6.23


    selection of European countries and the EU. Examining the graph for the EU, age-specific mortality

    rate estimates for this cancer rise linearly with age, reaching a value lower than 50/100 000 in the

    oldest age group. Period estimates rise steadily up to the late 1980s and early 1990s, then invert the

    trend and fall until the most recent studied period. Cohort estimates rise sharply from the earliest

    cohort up to the beginning of the 20th century, fall until the 1920s, rise up to the 1960s and

    then fall again for the more recent cohorts. With the exception of Portugal, which shows a fall up to

    the 1920s and then a continuous rise, cohort effects for the studied countries closely resemble

    those of the EU. There are, however, differences in variability: Lithuania and the Republic of

    Moldova display wider confidence intervals than other countries for their cohort effects. Age effects

    reflect the standard rates reported for the countries: Hungary's age curve reaches its highest point

    at about 100/100 000, while Germany's does not reach 40/100 000. The bigger EU

    countries, like France, Germany, Italy and Spain, show period estimate trends that reflect the

    pattern already seen in the EU, with a rise for the earlier periods, an inversion in the 1990s and

    then a continued descent up to the last studied period. Countries that showed rises in standard rate

    trends, like Lithuania, Romania and Ukraine, have rising period effect trends for the more recent

    periods.

    Discussion

    The Penalised Likelihood APC Model

    Before proceeding to the interpretation of results, a few general considerations on the model are

    due. To start with, random variation problems differ between age, period and cohort estimates.

    They are minor for period of death estimates, since these are based upon relatively

    similar total numbers of events over subsequent calendar periods; for age estimates the issue tends

    to manifest in the younger age groups, where the absolute numbers of deaths for most oncological

    pathologies are low. In cohort effects these problems are potentially greater at both ends of the

    curve: the earliest and latest cohorts are based on very few observations, while moving towards the

    central cohorts the number of observations rises. In addition to this, the more recent cohorts are

    based on smaller numbers of deaths because they contain the youngest age groups. It follows that

    changes in trends in these recent cohorts should be examined with caution, as is reflected by the

    wider confidence intervals for their estimates, even though they provide important information

    on future trends.

    Another limitation of the model is that it has difficulty discerning whether the major underlying

    trend is a cohort or a period one when both estimated effects share the same direction [18]. The

    model also has a systematic tendency to favour cohort effects, owing to the greater weight that the

    larger number of cohort parameters has in the modelling process.

    Gastric Cancer Mortality

    The widespread favourable trends in the cohort of birth and period of death effects in gastric cancer

    mortality are not clearly understood, but most certainly reflect the effects of a more affluent diet

    that is richer in fresh fruit and vegetables, of better food conservation (including refrigeration),

    and of better hygiene, with a lower level of Helicobacter pylori infection [19][20]. There is also

    consistent evidence of a correlation between gastric cancer and the consumption of salt and salted

    foods [21][22][23]. As well as salting, other methods of food conservation, such as smoking and

    curing, have been linked to stomach cancer, but the evidence is less consistent. Tobacco smoking is

    also an important risk factor in stomach cancer mortality [24].

    In countries with more advanced health infrastructures, improved and newly developed methods of

    diagnosis and treatment may also have played a role over the most recent calendar periods, but this

    effect remains hard to quantify [25].

    These favourable changes, which are essentially the result of a more developed socio-economic

    environment and a healthier lifestyle, have been slower to occur and are less widespread in some

    countries of southern and eastern Europe. In particular, the high prevalence of H. pylori, aspects of

    diet, nutrition and food conservation, and a higher prevalence of tobacco smoking are most

    probably the causes of the differences observed between central-northern Europe and the

    south-eastern countries.

    The conclusions that can be drawn from this study are that, although in some countries such as

    France, Sweden and Switzerland gastric cancer mortality seems to be approaching an asymptote, given

    the levelling seen in the effects of the most recent cohorts, there appears to be a lot of room for

    improvement in the mortality rates of this cancer. This is particularly true in southern and eastern

    European countries where rates are high and, as in the Czech Republic, Estonia, Latvia and Italy,

    the effects of the more recent cohorts have not shown signs of slowing down.

    Oral Cancer Mortality

    The main finding from this analysis of oral and pharyngeal cancer mortality in Europe is the strong

    excess mortality in Hungary, where the rate for middle-aged men was 55/100,000, comparable to

    that of lung cancer in several western countries (i.e., 50 to 60/100,000 in Germany, Italy and the

    United Kingdom) [26][27]. In the most recent cohorts (i.e., those born after 1960) there were signs

    of a possible near future reversal in trends. Male rates appreciably decreased in southern European

    countries, such as France, Italy and Spain, which had the highest rates in the past, but not in several

    northern European countries, such as Denmark, the United Kingdom and the Netherlands [28][29].

    Oral and pharyngeal cancer mortality is comparatively low in European women, though trends were

    upwards over the last decades [30], and rates in some countries (Hungary in particular, but also

    Denmark and Romania) have reached relatively high levels, especially in middle-aged women,

    reflecting female drinking and smoking patterns in those populations. Across European countries,

    there was still a more than 10-fold variation in male oral and pharyngeal cancer mortality between the highest

    rates in Hungary (21.1/100,000) and Slovakia (16.9/100,000), and the lowest ones in Greece

    (1.8/100,000) and Sweden (2.2/100,000).
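
    The fold variation quoted above can be checked directly from the four rates given in the text. The following is a minimal sketch in Python; the dictionary and function names are illustrative assumptions, and only the rates reported in this paragraph are used.

```python
# Male oral and pharyngeal cancer mortality rates per 100,000,
# exactly as quoted in the paragraph above (no other data assumed).
rates = {
    "Hungary": 21.1,
    "Slovakia": 16.9,
    "Sweden": 2.2,
    "Greece": 1.8,
}


def rate_ratio(high, low):
    """Ratio of two rates expressed on the same scale (hypothetical helper)."""
    return high / low


# Countries with the highest and lowest quoted rates.
highest = max(rates, key=rates.get)
lowest = min(rates, key=rates.get)

ratio = rate_ratio(rates[highest], rates[lowest])
print(f"{highest} vs {lowest}: {ratio:.1f}-fold")
```

    With these figures the ratio between Hungary and Greece is about 11.7, consistent with the variation of more than 10-fold described in the text.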

    The diverging trends in the two sexes essentially reflect different patterns in tobacco smoking and

    alcohol drinking. The favourable trends in male mortality rates reflect the fall in tobacco

    consumption in men from most (western) European countries over the last few decades. A

    favourable effect of stopping smoking is in fact already evident within a few years of smoking

    cessation, while the risk may remain persistently high for several years after stopping drinking [31]

    [32]. Conversely, tobacco consumption has increased in women in several countries and this has led

    to unfavourable oral and pharyngeal cancer mortality trends [33].

    The geographic pattern of oral and pharyngeal cancer mortality trends also appears to be related to

    changes in alcohol consumption [34]. The exceedingly high rates in Hungary and in a few other

    countries of central and eastern Europe (Slovakia, Moldova, Lithuania, Croatia, Romania) can be

    related to the overall quantity of alcohol consumed, but also to the drinking patterns (drinking outside meals,

    binge drinking) and to the type of alcohol consumed. In these countries, in fact, a substantial

    proportion of alcohol derives from fruit (plums, peaches, apricots) and home-made alcoholic

    beverages are widespread [35][36]. These may contain high levels of acetaldehyde, which is an

    established human carcinogen [37].

    Age and cohort-specific analyses appear to reflect available information on the prevalence of

    tobacco smoking in subsequent generations of men from major European countries, as well as

    possibly the patterns of alcohol drinking, though generation-specific data on alcohol consumption are

    not available for most countries.

    Although oral and pharyngeal mortality in Europe has declined in the last decade in men, there were

    still rises in a few central and eastern European countries, reaching exceedingly high rates in

    Hungary and Slovakia, which now have the highest rates on a European scale. The control of oral

    and pharyngeal cancer, as well as of other alcohol- and tobacco-related cancers, remains therefore a

    major public health problem in those areas of the continent.

    24 IARC: IARC Monographs on the evaluation of carcinogenic risks to humans. Vol. 83. Tobacco smoke and involuntary smoking. 2004.

    25 Shibata A, Personnet J, Schottenfeld D, Fraumeni JF Jr: Stomach cancer. In Cancer epidemiology and prevention. 3rd edition. 2006:659-673.

    26 Levi F, Lucchini F, Negri E, La Vecchia C: Trends in mortality from major cancers in the European Union, including acceding countries, in 2004. Cancer 2004, 101:2843-2850.

    27 Bosetti C, Bertuccio P, Levi F, Lucchini F, Negri E, La Vecchia C: Cancer mortality in the European Union, 1970-2003, with a joinpoint analysis. Ann Oncol 2008, 19:631-640.

    28 Olsen AH, Parkin DM, Sasieni P: Cancer mortality in the United Kingdom: projections to the year 2025. Br J Cancer 2008, 99:1549-1554.

    29 Braakhuis BJM, Visser O, Leemans CR: Oral and oropharyngeal cancer in The Netherlands between 1989 and 2006: Increasing incidence, but not in young adults. Oral Oncol 2009, 45:e85-9.

    30 Garavello W, Bertuccio P, Levi F, Lucchini F, Bosetti C, Malvezzi M, Negri E, La Vecchia C: The oral cancer epidemic in central and eastern Europe. Int J Cancer 2009.

    31 La Vecchia C, Franceschi S, Bosetti C, Levi F, Talamini R, Negri E: Time since stopping smoking and the risk of oral and pharyngeal cancers. J Natl Cancer Inst 1999, 91:726-728.

    32 Franceschi S, Levi F, Dal Maso L, Talamini R, Conti E, Negri E, La Vecchia C: Cessation of alcohol drinking and risk of cancer of the oral cavity and pharynx. Int J Cancer 2000, 85:787-790.

    33 Shafey O, Dalwick S, Guindon G: Tobacco control country profiles 2003. American Cancer Society; 2003.

    34 World Health Organization Statistical Information System: Health topics. Alcohol drinking. Available at: http://www.who.int/topics/alcohol_drinking/en/ 2006.

    35 Boffetta P, Hashibe M, La Vecchia C, Zatonski W, Rehm J: The burden of cancer attributable to alcohol drinking. Int J Cancer 2006, 119:884-887.

    36 Lachenmeier DW, Ganss S, Rychlak B, Rehm J, Sulkowska U, Skiba M, Zatonski W: Association between quality of cheap and unrecorded alcohol products and public health consequences in Poland. Alcohol Clin Exp Res 2009, 33:1757-1769.

    37 IARC: IARC Monographs on the evaluation of carcinogenic risks to humans. Vol. 71. Re-evaluation of some organic chemicals, hydrazine and hydrogen peroxide. Lyon: International Agency for Research on Cancer; 1999.