tesi di dottorato di ricerca malvezzi
8/7/2019 Tesi di Dottorato di Ricerca Malvezzi
UNIVERSITÀ DEGLI STUDI DI MILANO
Facoltà di Medicina e Chirurgia
Istituto di Statistica e Biometria GA Maccacaro
Dottorato in Statistica Biomedica Ciclo XXII
TESI DI DOTTORATO DI RICERCA
Analisi dei Tassi di Mortalità Con Modelli Età Periodo Coorte
MED/01
Matteo Charles Malvezzi
Prof. Adriano Decarli
Prof. Silvano Milani
Anno Accademico 2008-2009
Index
Introduction
Materials and Methods
  Age-Standardised-Rates
    Instantaneous Incidence Rates
    Mortality Rates
    Standard Rates
    Rate Estimate Accuracy
  Age-Period-Cohort Analysis
    Introduction
    Presenting Data: Tabular View
    Presenting Data: Graphical Methods
    Classical Modelling Approach
    The Age Model
    The Age-Period Model
    The Age-Cohort Model
    The Age-Drift Model
    The Age-Period-Cohort Model
      Added Constraints
      Successive Iteration Modelling
    Penalised likelihood Age-Period-Cohort Method
    Confidence Intervals for APC models
    Gastric Cancer Data
    Oral Cancer Data
Results
  Gastric Cancer Mortality
  Oral Cancer Mortality
Discussion
  The Penalised Likelihood APC Model
  Gastric Cancer Mortality
  Oral Cancer Mortality
Age Period Cohort Analysis of Mortality Rates
Introduction
Descriptive epidemiology is seen as a first approach aimed at defining the scope of a research
problem. Its techniques are primarily aimed at exploratory studies, in that its main concern is to
generate hypotheses rather than to verify them.
The basic techniques of descriptive epidemiology were borrowed from demography, with morbidity
and mortality rates being the key descriptive tools and standardisation being the only method used
for comparative purposes, usually ignoring issues of variability, or acknowledging the problem and
working around it with simplistic devices, such as the aggregation of data over longer time frames.
The improvement and greater availability of epidemiological data over the years is possibly the
main factor that brought about the development of modern descriptive epidemiology. Mortality and
incidence data (in particular oncological data) have improved by leaps and bounds in both quantity
and quality over the years. This is mainly due to the proliferation of cancer registries and the
concerted effort to standardise procedures of data collection and classification, not to mention the
improvement in demographic data, which have become available for an ever greater number of
populations and are published on a more regular basis.
As a consequence of this accumulation of time-based incidence and mortality data, time series
modelling techniques were developed to analyse the different factors underlying the changes in
rates for both explanatory and predictive purposes.
This thesis describes methods and modelling techniques for the study of mortality rates. In
particular, age-period-cohort (APC) analysis is presented for its contribution to the aetiological
purpose of descriptive epidemiology: making inference from the group to the individual.
The methods described herein were used to analyse gastric and oral cancer mortality in Europe, both
to illustrate the benefits and shortcomings of the methods and for the intrinsic value of these
analyses for aetiological research and for the indications for cancer prevention that arise from
their results.
Materials and Methods
Age-Standardised-Rates
Instantaneous Incidence Rates
The probability of developing a condition between the ages $t_0$ and $t_1$, with $\lambda(u)$ being
the age-specific rate and $S(u)$ the probability of survival without disease, is given by:
$$\pi = \int_{t_0}^{t_1} \lambda(u)\, S(u)\, du .$$
We are interested in the conditional probability $\pi_c$ of developing the disease given that a
subject is still at risk. This probability is not influenced by general survival until the examined
age $t_0$, and very little by survival between $t_0$ and $t_1$ as long as this interval is small,
giving us
$$\pi_c = \int_{t_0}^{t_1} \lambda(u)\, \frac{S(u)}{S(t_0)}\, du .$$
If the interval is small enough that $\lambda(u)$ and $S(u)$ can be considered constant (equal to
$\lambda(t_0)$ and $S(t_0)$), the equation can be rewritten as:
$$\pi_c \approx \lambda(t_0)\,(t_1 - t_0) .$$
If $k$ is the number of cases observed between $t_0$ and $t_1$, and $n_{t_0}$ is the number of
subjects at risk at $t_0$, we have:
$$\hat{\pi}_c = \frac{k}{n_{t_0}} ,$$
which gives us the estimate of $\lambda$ as:
$$\hat{\lambda}(t_0) \approx \frac{k}{n_{t_0}\,(t_1 - t_0)} .$$
That is, the estimate of the instantaneous rate at $t_0$ is the number of cases observed divided by
the number $m$ of person-years between $t_0$ and $t_1$, which translates to the more familiar
$$\hat{\lambda}(t_0) \approx \frac{k}{m} .$$
Of course this approximation is not very good if $\lambda(u)$ varies sharply between $t_0$ and
$t_1$, or if the ratio $S(u)/S(t_0)$ is very different from unity.
Mortality Rates
The annual incidence rate of a given disease, for a group, over a given time period is equal to the
ratio between the number of new occurrences of the disease over the given time and the number of
person-years accumulated by the members of the group over the same time period.
As seen above, rates are more meaningful when the examined group is homogeneous and there is a
constant risk over the time period. Under these conditions, the observed incidence rate can be
considered an estimate of the underlying instantaneous rate. These conditions are the reasons why
incidence is calculated separately by age and sex to obtain specific incidence rates.
In the case of annual mortality rates, the numerator of the rate is the number of deaths from the
observed condition during the calendar year. In descriptive statistics, particularly in mortality
studies, the denominator is made of population estimates derived from official censuses; i.e. the
provided population value is considered to be the mid-year population, or the yearly average. This
gives us:
$$\hat{\lambda}_x = \frac{k_x}{[\,n_x(t_0) + n_x(t_1)\,]/2} ,$$
with $\hat{\lambda}_x$ being the annual incidence estimate for the studied age group, $k_x$ the
number of observed cases in the observed population, and $n_x(t_0)$ and $n_x(t_1)$ the numbers of
subjects at risk at the beginning and end of the studied period. The expression in the denominator
corresponds to the value $m_x$ provided by the statistical bureaux.
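As a small illustration of the formula above, the following sketch computes an annual rate from a death count and the populations at the start and end of the year. The function name and all figures are hypothetical and serve only to show the arithmetic.

```python
def annual_rate(cases, pop_start, pop_end):
    """Annual rate: cases divided by the mean of the start- and end-of-year populations."""
    person_years = (pop_start + pop_end) / 2.0   # approximates the person-years m_x
    return cases / person_years

# Hypothetical figures, for illustration only.
rate = annual_rate(cases=42, pop_start=99_000, pop_end=101_000)
print(f"{rate * 1e5:.1f} per 100 000 person-years")  # 42.0 per 100 000 person-years
```

When a statistical bureau already publishes the mid-year population, that value can be used directly as the denominator.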
Standard Rates
To compare incidence rates measured on different populations, some form of standardisation is
necessary. The direct method of standardisation consists in calculating the annual rate that would
have been observed in a theoretical standard population, had it been subjected to the same force
of incidence as the population being studied. To obtain this result, expected numbers of cases for
each age group are calculated for the standard population from the corresponding person-years and
the estimated rate for the studied population. The sum of these numbers, which corresponds to the
number of expected cases in the theoretical population, is divided by the total person-years of this
population. If $L$ is the size of the total standard population, $L_x$ the number of subjects in the
$x$th age group, and $k_x$ and $m_x$ respectively the number of cases and the number of person-years
in the corresponding age group of the studied population, the age-specific rate for the observed
population is $t_x = k_x/m_x$. Thus, the number of expected cases in the $x$th age group of the
standard population, if it were subject to the level of risk defined by $t_x$, would be $L_x t_x$.
The standardised rate, for a total number of age groups $g$, is then:
$$t = \frac{1}{L} \sum_{x=1}^{g} L_x t_x ,$$
which may also be written as:
$$t = \sum_{x=1}^{g} w_x t_x ,$$
where $w_x = L_x/L$ is the proportion of subjects in the $x$th age group of the standard population,
giving:
$$\sum_{x=1}^{g} w_x = \frac{1}{L} \sum_{x=1}^{g} L_x = 1 .$$
Therefore t is a weighted average of age specific rates, with the weights being the proportion of
individuals per age group in the standard population. The standard populations used in this thesis
are the World Standard Population and the World Standard Truncated (35-64) Population [1].
Table 1. Age group weights for the world standard and truncated (35-64) populations.
| Age Group | World Standard Population | World Truncated Population |
|---|---|---|
| 0-4 | 12 | |
| 5-9 | 10 | |
| 10-14 | 9 | |
| 15-19 | 9 | |
| 20-24 | 8 | |
| 25-29 | 8 | |
| 30-34 | 6 | |
| 35-39 | 6 | 19.35 |
| 40-44 | 6 | 19.35 |
| 45-49 | 6 | 19.35 |
| 50-54 | 5 | 16.13 |
| 55-59 | 4 | 12.91 |
| 60-64 | 4 | 12.91 |
| 65-69 | 3 | |
| 70-74 | 2 | |
| 75-79 | 1 | |
| 80+ | 1 | |
| Total | 100 | 100 |
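The weighted average can be sketched in a few lines. The example below is only an illustration: it uses the World Standard weights of Table 1 and, as input, the age-specific rates for ages 35-64 from the last column of Table 2 (Italian men, central year 2002). The function name and the dictionary layout are assumptions made for this sketch.

```python
# World Standard Population weights (Table 1), keyed by the lower bound of each age group.
WORLD_STANDARD = {0: 12, 5: 10, 10: 9, 15: 9, 20: 8, 25: 8, 30: 6, 35: 6, 40: 6,
                  45: 6, 50: 5, 55: 4, 60: 4, 65: 3, 70: 2, 75: 1, 80: 1}

def standardised_rate(age_rates, standard=WORLD_STANDARD):
    """Directly standardised rate t = sum_x w_x t_x over the age groups in age_rates.

    The weights are rescaled to sum to 1 over the age groups actually supplied,
    which is how a truncated standard is obtained.
    """
    total = sum(standard[a] for a in age_rates)
    return sum(standard[a] / total * r for a, r in age_rates.items())

# Age-specific rates per 100 000, ages 35-64, last column of Table 2.
rates_2002 = {35: 1.44, 40: 3.34, 45: 6.58, 50: 11.55, 55: 21.23, 60: 37.11}
truncated = standardised_rate(rates_2002)
print(round(truncated, 2))  # 11.59
```

Rescaling the weights 6, 6, 6, 5, 4, 4 by their sum (31) reproduces the truncated weights of Table 1, for example 6/31 = 19.35%.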
Rate Estimate Accuracy
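The accuracy of rates depends on the accuracy of the number of observed cases $K$. $K$ follows a
Poisson distribution whose expectation and variance are equal to the theoretical rate $\lambda$
multiplied by the number of person-years $m$ accumulated, so $E(K) = \lambda m$ and
$Var(K) = \lambda m$. Therefore the variance of the rate estimator $K/m$ is:
$$Var\left(\frac{K}{m}\right) = \frac{Var(K)}{m^2} = \frac{\lambda}{m} .$$
Its estimate is obtained by replacing $\lambda$ with $\hat{\lambda} = k/m$:
$$\widehat{Var}\left(\frac{K}{m}\right) = \frac{k}{m^2} = \frac{\hat{\lambda}^2}{k} .$$
From here we see that the variance for the specific rate $t_x$ is:
$$Var(t_x) = \frac{Var(K_x)}{m_x^2} = \frac{\lambda_x}{m_x} .$$
For the standardised rate this becomes:
$$Var(t) = \sum_{x=1}^{g} w_x^2\, Var(t_x) = \sum_{x=1}^{g} w_x^2\, \frac{\lambda_x}{m_x} .$$
$\lambda_x$ being unknown, the variance must be estimated with:
$$\widehat{Var}(t) = \sum_{x=1}^{g} \frac{w_x^2}{m_x^2}\, k_x .$$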
Therefore, if the theoretical standard rate is:
$$\tau = \sum_{x} w_x \lambda_x$$
and $s$ is the standard error of $t$, $(t - \tau)/s$ is approximately a standard normal variable,
for which we can write a $1 - \alpha$ confidence interval in the usual form:
$$\left[\; t - Z_{\alpha/2} \sqrt{\widehat{Var}(t)} \;;\; t + Z_{\alpha/2} \sqrt{\widehat{Var}(t)} \;\right]$$
For practical purposes rates are usually given per 100 000 person-years ($10^5\, t$); consequently
the variance needs to be presented as $10^{10}\, Var(t)$.
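The variance estimate and the confidence interval can be put together as in the following sketch. The function name and the counts are hypothetical; the normal quantile comes from the Python standard library.

```python
from math import sqrt
from statistics import NormalDist

def standardised_rate_ci(cases, person_years, weights, alpha=0.05):
    """Directly standardised rate with its normal-approximation confidence interval.

    cases, person_years and weights hold one entry per age group;
    weights are rescaled to sum to 1.
    """
    total_w = sum(weights)
    w = [wx / total_w for wx in weights]
    # t = sum_x w_x k_x / m_x
    t = sum(wx * kx / mx for wx, kx, mx in zip(w, cases, person_years))
    # estimated Var(t) = sum_x w_x^2 k_x / m_x^2
    var_t = sum(wx ** 2 * kx / mx ** 2 for wx, kx, mx in zip(w, cases, person_years))
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half = z * sqrt(var_t)
    return t, var_t, (t - half, t + half)

# Hypothetical counts for two age groups, for illustration only.
t, var_t, (lo, hi) = standardised_rate_ci(
    cases=[10, 20], person_years=[1000, 2000], weights=[1, 1])
print(f"{1e5 * t:.0f} per 100 000 (95% CI {1e5 * lo:.0f} to {1e5 * hi:.0f})")
```

Multiplying the rate and the interval limits by $10^5$ is equivalent to multiplying the variance by $10^{10}$.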
Age-Period-Cohort Analysis
Introduction
Standardised rates are a useful descriptive tool when comparing different populations, but offer no
analytical insight. On the other hand, inferential methods applied to mortality rate time trends,
such as joinpoint analysis, offer analytical insight on trends and their change over time, but
completely ignore information on the age and cohort structure of the analysed data [2]. APC
analysis performs a simultaneous study of the effects of age, period and cohort. The age effects
correspond to the variability explained by the physiological differences that characterise the
different age groups; period effects are brought about by changes over time that influence all
age groups in the same way; finally, cohort effects explain changes that affect groups of subjects
belonging to the same birth cohort. Period effects usually portray the consequences of the
introduction of new therapies or screening interventions; they are also affected by exposures that
manifest their consequences on the studied event over short periods of time. Cohort effects are
usually determined by gestational exposures, or by exposures that influence event frequencies
over long time periods.
However useful, the APC model suffers from an intrinsic structural issue: the age, period and cohort
variables have an exact linear dependence that can be expressed as age = period - cohort ($A = P - C$).
This causes the infamous identifiability problem that makes this model complicated to treat. In
recent decades many efforts have been made to overcome this issue; a few examples are the use of
estimable functions, adding extra constraints to the model, or performing sequential fittings of a
two-variable model and then regressing the remaining variable over the residuals [3][4][5][6][7][8].
More recently, solutions using modern techniques such as smoothers and Bayesian methods have also
been proposed [3][8][9][10][11]. All these solutions have their merits and setbacks: estimable
functions are epidemiologically hard to interpret; adding constraints requires biological and
statistical a priori knowledge and is not always justifiable; cascade regression results do not
always agree with each other, and also require biological justification. In the following
paragraphs some of these techniques will be illustrated, and the penalised likelihood method, which
offers a good balance between trade-offs and functionality, will be described in detail [12][13].
Presenting Data: Tabular View
Incidence and mortality data are often shown in a two-way table, where rows represent age groups
($i = 1, \dots, I$) and columns the time periods ($j = 1, \dots, J$). Usually the groupings of age
and period are of the same size, but this is not necessarily the case. Cells can portray the
following information: the number of events ($O_{ij}$), which is either the number of deaths for
mortality or the number of new occurrences of a disease in the case of incidence; the number of
person-years at risk ($N_{ij}$); or the age-specific rate ($r_{ij} = O_{ij}/N_{ij}$), which
synthesises the aforementioned information. Table 2 shows Italian male gastric cancer mortality
rates; age groups and time periods are both grouped in quinquennia, and the cells contain the
corresponding age-specific rates per 100 000 inhabitants. Moving down the rows of the table we
study the variation along the age groups ($I = 12$), while moving horizontally along the columns we
observe the temporal variation through the 5-year periods ($J = 11$). Moving diagonally down the
table, changes through the cohorts ($K = I + J - 1$, $k = I - i + j$) can be observed; the cohort
with the central year of birth in 1930 ($C = P - A$) is highlighted.
In the example it is easy to see how the rates are higher at older ages and in earlier periods of
death; the intrinsic dependence of cohort effects on the age and period variables also appears
clearly in this example.
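Table 2. Tabulated age specific mortality rates for gastric cancer in Italian men.
Rows: central year of the age quinquennium. Columns: central year of the quinquennium of death.
Cells marked with * lie on the diagonal of the cohort with central year of birth 1930 (C = P - A).
| Age | 1952 | 1957 | 1962 | 1967 | 1972 | 1977 | 1982 | 1987 | 1992 | 1997 | 2002 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 22 | 0.27* | 0.33 | 0.45 | 0.14 | 0.22 | 0.25 | 0.23 | 0.16 | 0.12 | 0.12 | 0.04 |
| 27 | 0.82 | 0.85* | 0.93 | 0.73 | 0.7 | 0.63 | 0.51 | 0.33 | 0.38 | 0.24 | 0.38 |
| 32 | 2.21 | 2.03 | 2.35* | 2.18 | 1.81 | 1.65 | 1.27 | 1 | 0.92 | 0.89 | 0.57 |
| 37 | 5.86 | 5.18 | 5.17 | 4.92* | 4.61 | 4.09 | 3.32 | 2.83 | 2.28 | 1.82 | 1.44 |
| 42 | 15.02 | 13.83 | 11.5 | 10.49 | 9.3* | 7.73 | 7.94 | 5.69 | 5.15 | 4.11 | 3.34 |
| 47 | 32.04 | 27.65 | 25.5 | 20.57 | 19.18 | 17.12* | 13.98 | 12.23 | 9.87 | 8.32 | 6.58 |
| 52 | 58.09 | 56.65 | 48.8 | 43.05 | 34.97 | 31.39 | 26.81* | 21.9 | 18.12 | 14.53 | 11.55 |
| 57 | 100.97 | 95.88 | 89.69 | 80.01 | 68.43 | 53.3 | 47.74 | 41.05* | 33.43 | 26.27 | 21.23 |
| 62 | 159.17 | 151.69 | 148.13 | 136.19 | 114.49 | 95.31 | 77.77 | 66.89 | 55.28* | 45.77 | 37.11 |
| 67 | 235.55 | 223.04 | 220.72 | 205.64 | 182.18 | 145.83 | 131.09 | 105.86 | 88.14 | 72.06* | 61.06 |
| 72 | 317.74 | 313.15 | 301.82 | 296.15 | 259.8 | 208.33 | 191.36 | 171.75 | 138.05 | 108.03 | 92.54* |
| 77 | 366.57 | 367.31 | 393.92 | 367.7 | 355.83 | 281.51 | 267.09 | 237.9 | 196.14 | 158.47 | 133.61 |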
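The diagonal structure of the cohorts can be made explicit with a short sketch. It uses only the row and column labels of Table 2; the variable and function names are assumptions made for illustration.

```python
AGES = list(range(22, 78, 5))          # central ages 22, 27, ..., 77   (I = 12 rows)
PERIODS = list(range(1952, 2003, 5))   # central years 1952, ..., 2002  (J = 11 columns)

def cohort_index(i, j, n_ages=len(AGES)):
    """Cohort number k = I - i + j for 1-based age index i and period index j."""
    return n_ages - i + j

cells_by_cohort = {}
for i, age in enumerate(AGES, start=1):
    for j, period in enumerate(PERIODS, start=1):
        birth_year = period - age      # C = P - A
        cells_by_cohort.setdefault(birth_year, []).append((age, period))
        # the index k and the central year of birth label the same diagonal
        assert birth_year == 1875 + 5 * (cohort_index(i, j) - 1)

print(len(cells_by_cohort))            # 22 cohorts, K = I + J - 1
print(cells_by_cohort[1930][:3])       # [(22, 1952), (27, 1957), (32, 1962)]
```

The number of cells per cohort, `len(cells_by_cohort[c])`, ranges from 1 for the extreme cohorts to 11 for the central ones.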
Presenting Data: Graphical Methods
To explore the data structure further, a series of two-dimensional logarithmic-scale
representations are used. Figure 1 shows death rates per 100 000 inhabitants plotted against period
of death, with data stratified by age at death. In the case shown (Italian male gastric cancer
mortality) parallelism between age groups is very strong, underlining the coherence of mortality
variation over time. This parallelism is also seen in figure 2, where the grouping and x-axis
variables are inverted, i.e. data are grouped by period of death and death rates are plotted
against age at death, underlining the coherence of the variation in death rates across age groups.
This homogeneity can also be seen when plotting involves the cohort of birth. In figure 3 death
rates are plotted against cohort of birth, stratified by age group. The parallelism is very similar
to that seen in figure 1, which is expected since the strata are identical in the two figures; the
difference lies in the fact that in figure 3 the rates are plotted against the cohort of birth
instead of the period of death.
Figure 1: Death rates per 100 000 inhabitants plotted against period of death, with data stratified by age at death
Figure 2: Death rates per 100 000 inhabitants plotted against age at death, with data stratified by period of death
Figure 3: Death rates per 100 000 inhabitants plotted against cohort of birth, with data stratified by age at death
In figure 4 death rates per 100 000 are grouped by birth cohort and plotted against age at death.
The graph has the same shape as that in figure 2 because it plots the same data points, but these
are joined according to different strata. It is immediately noticeable that the broken lines do not
all share the same length; this is because the oldest and youngest cohorts are based on fewer cells
than the central ones, since they only exist at the extremes of the studied age spectrum.
Nonetheless, good parallelism between the lines is evident, showing that, with respect to cohorts,
data variation over age is homogeneous. From the four figures representing Italian male gastric
cancer mortality, age appears to be the most important factor, as would be expected with mortality
from this cancer. The parallelism seen in the graphs makes it difficult to discern whether there is
a stronger age-period effect (figures 1 and 2) or a stronger age-cohort effect (figures 3 and 4).
Figure 4: Death rates per 100 000 inhabitants plotted against age at death, with data stratified by cohort of birth
There are more data representation techniques than can usefully be discussed here; the more
informative ones tend to be combinations of those shown so far. In the example (figure 5) cohort
strata are added to the graph in figure 1. The effect of this procedure is to create a surface with
a varying topology from which information regarding the roles of age, period and cohort can be
gained; in the example, parallelism and linearity on a logarithmic scale dominate the data.
Figure 5: Death rates per 100 000 inhabitants plotted against period of death, with data stratified by age at death (black) and cohort of birth (red)
The most striking representations are undoubtedly three-dimensional plots, of which we give an
example (figure 6), portraying the same data with the logarithm of the death rate per 100 000
inhabitants on the z axis and age and period on the x and y axes respectively.
Figure 6: Death rate per 100 000 inhabitants on the z axis, age at death on the x axis and period of death on the y axis
The strength of the age effect can also be seen in this graph, but it suffers from the limitation of
being a three-dimensional representation drawn on a two-dimensional surface, which makes the choice
of viewpoint an important and tricky issue.
Classical Modelling Approach
To explain the methods and issues that arise from the application of APC models it is best to start
from the classical illustration of the problem that Clayton and Schifflers give in their two-part
paper titled Models for Temporal Variation in Cancer Rates [14][5]. In their work, the authors
tackle the problem by performing variance analyses of successively more complex models. They start
with a simple age model, move on to a linear drift model that adds either a cohort or a period
drift term to the age model, then apply the age-period (AP) and age-cohort (AC) models, and only as
a final step, if justified by the variance analysis, apply one of the possible APC solutions. The
flow chart they present in their paper (figure 7) is a synthesis of their proposed method.
Figure 7: Anova procedure schematic for age period cohort analysis (degrees of freedom in APC models)
  1. Age Model, DoF: I(J-1)
  2. Age-Drift Model, DoF: I(J-1)-1
  3. Age-Period Model, DoF: (I-1)(J-1)
  3. Age-Cohort Model, DoF: (I-1)(J-2)
  4. Age-Period-Cohort Model, DoF: (I-2)(J-2)
  Changes in DoF between models: Age to Age-Drift: -1; Age-Drift to Age-Period: -(J-2);
  Age-Drift to Age-Cohort: -(I+J-3); Age-Period to Age-Period-Cohort: -(I+J-3);
  Age-Cohort to Age-Period-Cohort: -(J-2).
This schematic highlights the fact that the AP and AC models are not directly comparable, because
the AC model uses more degrees of freedom than the AP model. The resulting ANOVA is consequently
complicated to interpret. Applying this ANOVA method to the aforementioned models, which will be
explained in greater detail in the following sections, on the Italian male gastric cancer mortality
rates from the previous examples, we obtain the results shown in figure 8.
Examining figure 8, it can be seen that the age-drift model explains the most variance using the
fewest degrees of freedom, and therefore gives the most economical representation of the data. This
result is also apparent upon close inspection of the graphs presented in the Graphical Methods
section (figures 1-6), where the plots are dominated by strong parallelism and linearity on a
logarithmic scale.
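The comparison between nested models can be tabulated from the residual degrees of freedom and deviances reported in figure 8. The sketch below only does the subtraction along the arrows of the schematic; the dictionary layout and function name are assumptions made for illustration, and no significance test is implied.

```python
# Residual degrees of freedom and deviances as reported in figure 8.
MODELS = {
    "age": (100, 70183),
    "age-drift": (99, 4714),
    "age-period": (90, 1713),
    "age-cohort": (81, 927),
    "age-period-cohort": (72, 306),
}
# Nested comparisons along the arrows of the schematic.
STEPS = [("age", "age-drift"), ("age-drift", "age-period"), ("age-drift", "age-cohort"),
         ("age-period", "age-period-cohort"), ("age-cohort", "age-period-cohort")]

def deviance_table(models=MODELS, steps=STEPS):
    """Return (simpler, richer, d.f. used, deviance explained) for each nested step."""
    rows = []
    for simple, rich in steps:
        df_s, dev_s = models[simple]
        df_r, dev_r = models[rich]
        rows.append((simple, rich, df_s - df_r, dev_s - dev_r))
    return rows

for simple, rich, d_df, d_dev in deviance_table():
    print(f"{simple} -> {rich}: {d_dev} deviance on {d_df} d.f. ({d_dev / d_df:.0f} per d.f.)")
```

The first step (age to age-drift) removes 65469 units of deviance with a single degree of freedom, far more per degree of freedom than any later step.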
Figure 8: Anova schematic applied to Italian male gastric cancer mortality (degrees of freedom in APC models of Italian male gastric cancer mortality)
  Null model, DoF: 110, Total Deviance: 1479717519
  1. Age Model, DoF: 100, Dev: 70183
  2. Age-Drift Model, DoF: 99, Dev: 4714
  3. Age-Period Model, DoF: 90, Dev: 1713
  3. Age-Cohort Model, DoF: 81, Dev: 927
  4. Age-Period-Cohort Model, DoF: 72, Dev: 306
  Changes in DoF between models: Age to Age-Drift: -1; Age-Drift to Age-Period: -9;
  Age-Drift to Age-Cohort: -18; Age-Period to Age-Period-Cohort: -18;
  Age-Cohort to Age-Period-Cohort: -9.
The Age Model
The model that analyses the rate trends using age as a factor can be expressed with the following
function:
$$Y = \log E[r_i] = \mu + \alpha_i$$
where $\mu$ is the intercept and $\alpha_i$ is the additive effect of the $i$th age group on the
base rate, which can be expressed as $\exp(\mu) \times 100\,000$ inhabitants, i.e. $\alpha_i$ is the
relative risk, on the logarithmic scale, as compared to that of a reference age group $\alpha_0$.
For our purposes the model without the intercept, where the $\alpha_i$ are the log-estimates of the
age-specific rates $r_i$, is more useful:
$$Y = \log E[r_i] = \alpha_i$$
Applying the model to the example data (figure 9), the slightly wider confidence intervals in the
younger age groups indicate the greater variability present in these estimates; this is due to the
very small numbers of deaths recorded in these age groups for the studied pathologies.
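For a Poisson model with age as the only factor, the estimate of each age-specific rate is the total number of deaths divided by the total person-years of that age group, and the approximate standard error of its logarithm is the inverse square root of the number of deaths. The sketch below illustrates why small death counts widen the intervals; the function name and the counts are hypothetical, and 1.96 is used as the 95% normal quantile.

```python
from math import exp, log, sqrt

def age_model(deaths, person_years, z=1.96):
    """Fit log E[r_i] = alpha_i: one log-rate per age group, pooled over periods.

    deaths[i][j] and person_years[i][j] hold the data for age group i and period j.
    Returns, per age group, (alpha_i, lower, upper) on the log scale.
    """
    out = []
    for o_row, n_row in zip(deaths, person_years):
        o, n = sum(o_row), sum(n_row)
        alpha = log(o / n)     # Poisson maximum likelihood estimate of the log-rate
        se = 1 / sqrt(o)       # approximate standard error of a Poisson log-rate
        out.append((alpha, alpha - z * se, alpha + z * se))
    return out

# Hypothetical counts (2 age groups x 3 periods), for illustration only.
deaths = [[4, 3, 2], [400, 350, 300]]
person_years = [[1e6, 1e6, 1e6], [1e6, 1e6, 1e6]]
fit = age_model(deaths, person_years)
for alpha, lo, hi in fit:
    print(f"rate per 100 000: {1e5 * exp(alpha):.2f} "
          f"({1e5 * exp(lo):.2f} to {1e5 * exp(hi):.2f})")
```

With 9 deaths in the first age group and 1050 in the second, the interval on the log scale is roughly ten times wider for the first.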
Figure 9: Age model for Italian male gastric cancer mortality
The Age-Period Model
This model assumes that the age-specific rates maintain the same structure during all the studied
periods, but vary in magnitude according to the time period. It can be written as:
$$Y = \log E[r_{ij}] = \alpha_i + \beta_j$$
where $\alpha_i$ is the log-estimate of the $i$th age-specific rate in the reference period
$\beta_0$, $\beta_j$ is the additive effect of the $j$th period on the logarithm of the
age-specific rates, and $r_{ij}$ is the estimate of the rate of the $i$th age group in the $j$th
period.
This model is over-parametrised, as it has a parameter for each age group and one for each period,
but this is easily solved by constraining one of the period effects to be equal to 0 and setting it
as the reference $\beta_0$, relative to which the other period effects $\beta_j$ are to be
considered as measures of risk. In our example (Italian male gastric cancer mortality rates) the
fifth period, corresponding to the 1970-74 quinquennium, was chosen as the reference; therefore the
$\alpha_i$ are the logarithms of the age-specific mortality rates in the reference period $\beta_0$
(1970-74) and $\beta_j$ is the relative risk, on the logarithmic scale, to be applied to the
age-specific log-rates in the $j$th period.
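One way to see how the reference constraint works is to fit the multiplicative model directly. The sketch below is not the software used in the thesis: it is a minimal alternating-update scheme for the Poisson likelihood equations, with hypothetical data generated exactly from the model, and all names are assumptions made for illustration.

```python
from math import exp, log

def fit_age_period(deaths, person_years, ref=0, n_iter=200):
    """Fit log E[r_ij] = alpha_i + beta_j by alternating updates of the two margins.

    Each update solves one set of Poisson likelihood equations exactly: observed and
    fitted deaths agree on every age total, then on every period total.
    Returns (alpha, beta) with beta[ref] = 0.
    """
    n_age, n_per = len(deaths), len(deaths[0])
    a = [1.0] * n_age   # exp(alpha_i)
    b = [1.0] * n_per   # exp(beta_j)
    for _ in range(n_iter):
        for i in range(n_age):
            a[i] = sum(deaths[i]) / sum(person_years[i][j] * b[j] for j in range(n_per))
        for j in range(n_per):
            b[j] = (sum(deaths[i][j] for i in range(n_age))
                    / sum(person_years[i][j] * a[i] for i in range(n_age)))
    # Move the arbitrary scale into the age effects so the reference period has beta = 0.
    alpha = [log(ai * b[ref]) for ai in a]
    beta = [log(bj / b[ref]) for bj in b]
    return alpha, beta

# Hypothetical data generated exactly from the model, for illustration only.
true_rate = [1e-5, 1e-4]      # age-specific rates in the reference period
true_rr = [1.0, 0.8, 0.5]     # period relative risks
person_years = [[2e6, 3e6, 4e6], [1e6, 2e6, 3e6]]
deaths = [[n * r * rr for n, rr in zip(row, true_rr)]
          for row, r in zip(person_years, true_rate)]
alpha, beta = fit_age_period(deaths, person_years)
print([round(exp(bj), 3) for bj in beta])
```

Choosing a different `ref` rescales the age effects and the period effects in opposite directions and leaves the fitted rates unchanged, which is the over-parametrisation described above.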
Figure 10: Age period model for Italian male gastric cancer mortality
The Age-Cohort Model
This model is structurally similar to the AP model, replacing the period factor with the cohort one:
$$Y = \log E[r_{ik}] = \alpha_i + \gamma_k$$
where $\alpha_i$ is the log-estimate of the $i$th age-specific rate in the reference cohort
$\gamma_0$ and $\gamma_k$ is the additive effect of the $k$th cohort on the logarithm of the
age-specific rates; similarly to the previous model, $r_{ik}$ is the estimate of the rate of the
$i$th age group, but in the $k$th cohort instead of the $j$th period. The over-parametrisation
problem is solved, as in the previous model, by taking a reference cohort $\gamma_0$ in which the
age-specific rates take on the values given by $\exp(\alpha_i) \times 100\,000$, and relative to
which $\gamma_k$ is the relative risk, on the logarithmic scale, for the $k$th cohort. In our
example (Italian male gastric cancer mortality rates) the central reference cohort is the 11th,
which has its central year of birth in 1930.
The confidence intervals for the cohorts, resulting from the application of the AC model to the
example data (figure 11), show greater variability in the oldest and youngest cohorts. This is due
to these cohorts being built on fewer observations, as can be seen in figure 4. Furthermore, the
most recent cohorts include the youngest age groups, which have the smallest numbers of recorded
deaths; consequently they are affected by the greatest variability.
Figure 11: Age cohort model for Italian male gastric cancer mortality
The Age-Drift Model
Observing the previous two models, with the period and cohort effects expressed as relative risks,
the question arises as to what would happen if these parameter estimates were replaced with a
log-linear trend. This is called the linear drift and can be expressed either as a function of
period or as a function of cohort, as can be seen respectively in the following formulae:
$$Y = \log E[r_{ij}] = \alpha_i + \delta\,(j - j_0)$$
$$Y = \log E[r_{ik}] = \alpha_i + \delta\,(k - k_0)$$
The resulting effects are age-specific rates calculated in the reference period or cohort ($j_0$
and $k_0$ respectively) and a linear trend component representing the relative risk as a linear
drift on the logarithmic scale.
The main feature of these two models is that, as far as the expected value estimates are concerned,
the residual deviance is identical in both. The only real difference between the two models is the
reference temporal scale, which changes the model interpretation according to whether it is period
or cohort centred; consequently the estimated age-specific rates for the reference period or cohort
have a different structure to reflect this.
This model is interpreted as an age-specific rate structure that is influenced by a constant linear
cohort or period effect. In the example (Italian male gastric cancer mortality rates), the two
models are superimposed in figure 12 to illustrate the previously mentioned features, with the
references $j_0$ and $k_0$ being the same as in the previous AP and AC examples (figures 10 and 11).
Figure 12: Age drift models for Italian male gastric cancer mortality
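The equality of the fitted values in the two drift models follows from the cohort index $k = I - i + j$ introduced in the tabular view. A short derivation is sketched below; the superscripts $C$ and $P$, which distinguish the age parameters of the cohort-drift and period-drift formulations, are notation introduced here for illustration only.

```latex
\begin{aligned}
\alpha_i^{C} + \delta\,(k - k_0)
  &= \alpha_i^{C} + \delta\,(I - i + j - k_0) \\
  &= \underbrace{\alpha_i^{C} + \delta\,(I - i + j_0 - k_0)}_{\alpha_i^{P}}
     + \delta\,(j - j_0)
\end{aligned}
```

Every cell therefore receives the same fitted value under both formulations, which is why the deviance is identical, while the two sets of age parameters differ by a term that is linear in the age index $i$.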
The Age-Period-Cohort Model
As mentioned in the introduction, the full APC model suffers from a non-identifiability problem
given by the linear relation A=P-C. Numerous techniques have been used to solve or circumvent
this issue over the years, from simple added constraint solutions to more exotic methods. In the
following paragraphs some of these methods will be illustrated in order to underline their strengths
and weaknesses, before moving on to the chosen method for APC analysis.
Added Constraints
This method solves the non-identifiability problem by adding constraints to the model. As well as
choosing a cohort $\gamma_0$ and a period $\beta_0$ as references to solve the ordinary
over-parametrisation problem, an additional constraint is added to the model, usually forcing the
effects of two parameters to be the same.
This method has some drawbacks, the most obvious of which is that changing the constraints heavily
and visibly affects the resulting model; therefore a priori biological and statistical knowledge is
needed to justify such an intervention.
In the two examples we apply different constraints to the model, and the resulting differences are
plain to see. The first model (figure 13) constrains the effects of the two earliest cohorts
(central years 1875 and 1880) so that $c_1 = c_2$, while the second model (figure 14) constrains the
last two age groups (70-74 and 75-79 years of age) so that $a_9 = a_{10}$. As anticipated, the
estimates of the two models are incompatible even though the chosen constraints are closely
related: the earliest cohorts and oldest age groups overlap greatly. This underlines the fact that
this kind of solution should be avoided unless very strong a priori knowledge is held.
Successive Iteration Modelling
In the case where age is the most important scale (as is the case in most cancer mortality studies)
and cohort or period can be prioritised according to a priori biological or statistical information, a
full APC model can be approximated by regressing an AP or AC model first, then obtaining the
effect of the remaining factor by regressing it over the residuals of the first model.
23
Figure 13: Age period cohort model constraining the first two cohorts
Figure 14: Age period cohort model constraining the oldest two age groups
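In the case of an AC model followed by a period model, what is being obtained is the residual
period effect conditional on the age and cohort estimates from the preceding model:
$$\log E[r_{ijk}] = \hat{\alpha}_i + \hat{\gamma}_k + \beta_j$$
which for the expected cases becomes:
$$\log E[O_{ijk}] = \hat{\alpha}_i + \hat{\gamma}_k + \log N_{ijk} + \beta_j .$$
Here $\hat{\alpha}_i$ and $\hat{\gamma}_k$ are the age and cohort effects estimated by the AC model
and $\beta_j$ is the period effect. This is similar to the general expression of a Poisson model,
with the difference that the offset now includes the log of the fitted values from the AC model.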
This procedure does not provide maximum likelihood estimates of the parameters; instead it yields
marginal estimates of the age and cohort effects and conditional estimates of the period effects.
Confidence intervals can still be calculated, but they are likewise not maximum likelihood
intervals: they are marginal for the age and cohort effects and conditional for the period effects.
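The two-stage procedure can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the thesis implementation: the age-cohort Poisson model is fitted with simple alternating proportional updates (a stand-in for a GLM routine), the cohort index is assumed to be (I - 1 - i) + j with 0-based indices, and the function names and the toy data are made up. The second stage uses the closed-form Poisson estimate for a single factor with an offset: observed deaths divided by the deaths fitted by the first model, period by period.

```python
def fit_age_cohort(deaths, pop, n_iter=500):
    """Poisson fit of rate = a[i] * c[k] by alternating proportional updates."""
    n_age, n_per = len(deaths), len(deaths[0])
    n_coh = n_age + n_per - 1
    a = [1.0] * n_age
    c = [1.0] * n_coh
    for _ in range(n_iter):
        # Update age effects given the current cohort effects.
        for i in range(n_age):
            num = sum(deaths[i][j] for j in range(n_per))
            den = sum(pop[i][j] * c[n_age - 1 - i + j] for j in range(n_per))
            a[i] = num / den
        # Update cohort effects given the current age effects.
        num_c = [0.0] * n_coh
        den_c = [0.0] * n_coh
        for i in range(n_age):
            for j in range(n_per):
                k = n_age - 1 - i + j
                num_c[k] += deaths[i][j]
                den_c[k] += pop[i][j] * a[i]
        c = [n / d for n, d in zip(num_c, den_c)]
    return a, c


def conditional_period(deaths, pop, a, c):
    """Period multipliers with the AC fitted values used as the offset."""
    n_age, n_per = len(deaths), len(deaths[0])
    effects = []
    for j in range(n_per):
        observed = sum(deaths[i][j] for i in range(n_age))
        fitted = sum(pop[i][j] * a[i] * c[n_age - 1 - i + j] for i in range(n_age))
        effects.append(observed / fitted)
    return effects


# Toy data: made-up age-specific rates and a made-up bump in the third period.
n_age, n_per = 4, 5
pop = [[100000.0] * n_per for _ in range(n_age)]
age_rate = [0.001, 0.002, 0.004, 0.008]
bump = [1.0, 1.0, 1.3, 1.0, 1.0]
deaths = [[pop[i][j] * age_rate[i] * bump[j] for j in range(n_per)]
          for i in range(n_age)]

a_hat, c_hat = fit_age_cohort(deaths, pop)
print([round(e, 3) for e in conditional_period(deaths, pop, a_hat, c_hat)])
```

When the data follow an age-cohort model exactly, the residual period multipliers come out as 1; any departure from 1 is the part of the period pattern that the first model could not represent.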
Figure 15: Period model regressed over the residuals of an age cohort model
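To use this procedure starting with an AP model and then regressing the cohort factor, the following
applies:
$$\log E[r_{ijk}] = \hat{\alpha}_i + \hat{\beta}_j + \gamma_k$$
and the expected cases become:
$$\log E[O_{ijk}] = \hat{\alpha}_i + \hat{\beta}_j + \log N_{ijk} + \gamma_k ,$$
where the hats denote the age and period effects estimated by the AP model and $\gamma_k$ is the
cohort effect. The same holds for effect and standard error estimates: now the age and period
estimates are marginal, while the cohort ones are conditional on the AP model.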
From figures 15 and 16 we can see that the two models are similar but not equivalent. The effects
estimated by the starting models are, of course, those of the AP and AC models; since the data
examined here is strongly driven by drift, the cohort and period parameters of these first models
are very similar to those of the corresponding age-drift models. Therefore, choosing whether to
start with an AC or AP model depends on where we choose to assign the drift term, and which
factor we want to study without the effects of linear drift.
Figure 16: Cohort model regressed over the residuals of an age period model
This method can also be used by fitting one of the possible age-drift models and then regressing
either the period or the cohort parameters over the residuals. For an age-drift-period model, the
age-drift model with a linear cohort drift is fitted first; the model for the period effects is then
fitted using the fitted values of the first model as the offset. The results of this procedure are
period estimates conditional on the age and cohort drift estimates of the drift model. This gives us
a set of period effect estimates unaffected by linear drift, as the drift has already been absorbed
by the first model. This translates to:
$$\log E[r_{ijk}] = \hat{\alpha}_i + \hat{\delta}(k - k_0) + \beta_j$$
and for the expected cases:
$$\log E[O_{ijk}] = \hat{\alpha}_i + \hat{\delta}(k - k_0) + \log N_{ijk} + \beta_j ,$$
where $\hat{\alpha}_i$ is the estimated age effect, $\hat{\delta}$ the estimated drift slope, $k_0$
the reference cohort and $\beta_j$ the period effect.
Analogously to the previous examples, this results in marginal estimates for the effects and errors
of age and cohort drift and conditional estimates for the period ones. The same thing can be done to
obtain an age-drift-cohort model where the cohort estimates are drift free.
Figure 17: Period model regressed over the residuals of an age cohort drift model
The period and cohort effects resulting from these two models appear to be virtually identical to the
ones obtained from regressing AP or AC models first (figures 17 and 18). This is because the linear
drift is absorbed completely by the first model fitted. The usefulness of these models is that they
enable us to study the shape of the last factor once the linear-drift effect has been subtracted.
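Why the first model absorbs the drift follows from the linear dependence between the three time scales. As a sketch, assume unit-spaced indices with cohort index $k = I - i + j$ (an indexing convention assumed here for illustration) and write $\delta$ for a hypothetical drift slope. A term that is linear in period can then be rewritten as:

```latex
\delta\, j \;=\; \delta\,(k + i - I) \;=\; \delta\,(i - I) \;+\; \delta\, k ,
\qquad\text{and likewise}\qquad
\delta\, k \;=\; \delta\,(I - i) \;+\; \delta\, j .
```

The first identity shows that a linear period trend is the sum of a term linear in age and a term linear in cohort, so an AC model can represent it in full and nothing linear is left for the period step. The second identity gives the same result for a linear cohort trend fitted by an AP model.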
Penalised likelihood Age-Period-Cohort Method
As previously illustrated, the issues with fitting a full APC model are solving the identification
problem and dealing with the linear drift term.
An APC model can be considered as a parameter estimation problem in a log-linear Poisson model,
such as those illustrated up to now:
$$\log O_{ijk} = \log N_{ijk} + \log a_i + \log p_j + \log c_k$$
where $a_i$, $p_j$ and $c_k$ are the multiplicative parameters for age, period and cohort
respectively. These can be estimated by minimising the following expression using the weighted least
squares method:
$$f(a, p, c) = \sum_{i,j} O_{ijk} \left( \log O_{ijk} - \log N_{ijk} - \log a_i - \log p_j - \log c_k \right)^2$$
Figure 18: Cohort model regressed over the residuals of an age period drift model
Since there is a linear relation between A, P and C, the solution set X(a,p,c) is infinite, but it
can be parametrised in the following manner:
$$\log a'_i = \log a_i + \lambda (I - i)$$
$$\log p'_j = \log p_j + \lambda j$$
$$\log c'_k = \log c_k - \lambda k .$$
With the cohort index $k = I - i + j$, the three shifts cancel in every cell, so the fitted values
do not change with $\lambda$.
This parametrisation makes it possible to calculate a goodness of fit statistic ($G^2$) that is
independent of $\lambda$.
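The invariance of the fit along this family of solutions can be verified directly. The sketch below evaluates the weighted least squares expression for one set of multiplicative parameters and for a shifted set. It assumes that the lost shift parameter is a scalar (called `lam` here), that the shifts are +lam(I - i) for age, +lam j for period and -lam k for cohort, and that the cohort index is k = I - i + j with 1-based indices. The function names and all numbers are made up for illustration.

```python
import math


def wls_objective(deaths, pop, a, p, c):
    """Sum over cells of O * (log O - log N - log a_i - log p_j - log c_k)^2."""
    n_age = len(a)
    total = 0.0
    for i in range(1, n_age + 1):          # 1-based indices, as in the text
        for j in range(1, len(p) + 1):
            k = n_age - i + j              # assumed cohort index
            o = deaths[i - 1][j - 1]
            n = pop[i - 1][j - 1]
            resid = (math.log(o) - math.log(n) - math.log(a[i - 1])
                     - math.log(p[j - 1]) - math.log(c[k - 1]))
            total += o * resid ** 2
    return total


def shift(a, p, c, lam):
    """Move along the assumed one-parameter family of equivalent solutions."""
    n_age = len(a)
    a2 = [x * math.exp(lam * (n_age - i)) for i, x in enumerate(a, start=1)]
    p2 = [x * math.exp(lam * j) for j, x in enumerate(p, start=1)]
    c2 = [x * math.exp(-lam * k) for k, x in enumerate(c, start=1)]
    return a2, p2, c2


# Made-up table with I = 2 age groups, J = 3 periods, K = I + J - 1 = 4 cohorts.
deaths = [[12.0, 15.0, 11.0], [30.0, 28.0, 35.0]]
pop = [[1000.0, 1100.0, 1200.0], [900.0, 950.0, 1000.0]]
a = [0.010, 0.030]
p = [1.0, 1.1, 0.9]
c = [1.0, 0.9, 1.2, 0.8]

f0 = wls_objective(deaths, pop, a, p, c)
f1 = wls_objective(deaths, pop, *shift(a, p, c, 0.37))
print(f0, f1)
```

The two printed values agree up to rounding error, whatever value of `lam` is used: the data alone cannot distinguish between members of the family.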
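The solutions estimated by the three two-factor models are:
$$X_c = (a_c, p_c, c_0); \quad X_p = (a_p, p_0, c_p); \quad X_a = (a_0, p_a, c_a);$$
where $c_0$ and $p_0$ are unit vectors (all elements equal to 1) of length K and J respectively and
$a_0$ takes the form:
$$a_{0i} = \exp\left[ \sum_j O_{ij} \left( \log O_{ij} - \log N_{ij} \right) \Big/ \sum_j O_{ij} \right] .$$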
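The expression for a0 is a geometric mean of the age-specific rates across periods, weighted by the observed deaths. A minimal sketch, with a hypothetical function name and made-up numbers, and assuming every cell has at least one death so that the logarithm is defined:

```python
import math


def age_only_solution(deaths, pop):
    """Death-weighted geometric mean of the rates O/N over periods, per age group."""
    result = []
    for o_row, n_row in zip(deaths, pop):
        weighted_log = sum(o * (math.log(o) - math.log(n))
                           for o, n in zip(o_row, n_row))
        result.append(math.exp(weighted_log / sum(o_row)))
    return result


# Two age groups, two periods; the rate is constant within each age group.
print(age_only_solution([[10.0, 20.0], [5.0, 5.0]],
                        [[1000.0, 2000.0], [100.0, 100.0]]))
```

When the rate of an age group does not change over the periods, the function returns that rate, which is a quick way to check the formula.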
The natural logarithms of these solutions are then placed in the real space $\mathbb{R}^{I+J+K}$,
where their Euclidean distances from the parametrised saturated model solutions
$X(\lambda) = (a, p, c, \lambda)$ are defined as:
$$d_c(\lambda) = \lVert X_c - X(\lambda) \rVert$$
$$d_p(\lambda) = \lVert X_p - X(\lambda) \rVert$$
$$d_a(\lambda) = \lVert X_a - X(\lambda) \rVert .$$
The sum of these distances, each weighted by the inverse of the goodness of fit statistic scaled by
its degrees of freedom:
$$g(X(\lambda)) = \frac{d_c(\lambda)}{G_c^2 / [(I-1)(J-1)]} + \frac{d_p(\lambda)}{G_p^2 / [(I-1)(J-2)]} + \frac{d_a(\lambda)}{G_a^2 / [(I-2)(J-1)]}$$
can be minimised in $\lambda$. This results in a solution $X'(\lambda)$ that minimises the distance
of the saturated model from the three two-factor models, constructing a geometrical weighted average
of the three two-factor models. This way the identification problem is solved and the drift is
distributed according to the goodness of fit statistics.
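The minimisation step can be sketched as a one-dimensional search. The sketch assumes that, in log space, the family of equivalent solutions is a straight line `base + lam * direction`, so that the weighted sum of distances is convex in the scalar `lam` and a ternary search is enough. The function names, the toy vectors and the goodness of fit values are all hypothetical; weights are taken as degrees of freedom divided by the goodness of fit statistic.

```python
import math


def weighted_distance_sum(lam, base, direction, solutions, weights):
    """Sum over models of w_m * || X_m - (base + lam * direction) ||."""
    total = 0.0
    for x_m, w_m in zip(solutions, weights):
        squared = sum((x - (b + lam * d)) ** 2
                      for x, b, d in zip(x_m, base, direction))
        total += w_m * math.sqrt(squared)
    return total


def minimise_lambda(base, direction, solutions, weights,
                    lo=-10.0, hi=10.0, tol=1e-8):
    """Ternary search for the minimiser; valid because the sum is convex in lam."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        g1 = weighted_distance_sum(m1, base, direction, solutions, weights)
        g2 = weighted_distance_sum(m2, base, direction, solutions, weights)
        if g1 <= g2:
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2.0


# Toy example in R^(I+J+K) with I = 2, J = 2, K = 3.
direction = [1.0, 0.0, 1.0, 2.0, -1.0, -2.0, -3.0]   # (I - i), j, -k
base = [0.0] * 7                                      # made-up log-solution
solutions = [[b + t * d for b, d in zip(base, direction)]
             for t in (0.0, 1.0, 5.0)]                # stand-ins for X_c, X_p, X_a
g2_stats = [120.0, 95.0, 300.0]                       # made-up G^2 values
dfs = [90, 81, 80]                                    # (I-1)(J-1), (I-1)(J-2), (I-2)(J-1) for I=10, J=11
weights = [d / g for d, g in zip(dfs, g2_stats)]

print(minimise_lambda(base, direction, solutions, weights))
```

A two-factor model that fits well has a small G2 relative to its degrees of freedom and therefore a large weight, pulling the chosen solution towards it. This is how the drift ends up being shared according to goodness of fit.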
Confidence Intervals for APC models.
Many APC methods don't allow for the straightforward production of confidence intervals. Such is
the case for the penalised likelihood method. To gain some insight into the variability of the
estimated parameters we resorted to a parametric bootstrap simulation method [15]. Published
mortality data is usually stratified by age group and time period. Being count data, the value in
every cell, specified by age group and calendar period, can be treated as Poisson distributed. With
this in mind, for each cell, 1 000 values were randomly drawn from a Poisson distribution with mean
equal to the observed value of that cell. The 1 000 simulated datasets were then fed through the
penalised likelihood APC model, yielding 1 000 sets of estimates for the age, period and cohort
parameters.
For each parameter the 2.5th and 97.5th percentiles of the 1 000 estimates were taken as an estimate
of the 95% confidence interval for that parameter.
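The bootstrap procedure can be sketched as follows. This is an illustration rather than the code used for the thesis: the penalised likelihood fit is replaced by a hypothetical stand-in estimator (crude age-specific rates), the function names are invented, and the death counts and populations are made up. Poisson values are drawn by counting unit-rate exponential arrivals, which is exact but only practical for moderate counts.

```python
import random
import statistics


def poisson_draw(mean, rng):
    """Poisson draw: number of unit-rate arrivals falling in [0, mean]."""
    count = 0
    t = rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def bootstrap_intervals(deaths, estimator, n_rep=1000, seed=1):
    """Percentile 95% intervals for each value returned by estimator(table)."""
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_rep):
        simulated = [[poisson_draw(cell, rng) for cell in row] for row in deaths]
        replicates.append(estimator(simulated))
    intervals = []
    for values in zip(*replicates):
        cuts = statistics.quantiles(values, n=40)   # cut points at 2.5%, 5%, ..., 97.5%
        intervals.append((cuts[0], cuts[-1]))
    return intervals


# Made-up table: 2 age groups by 2 periods.
POP = [[50000.0, 52000.0], [40000.0, 41000.0]]
deaths = [[120, 110], [300, 280]]


def crude_age_rates(table):
    """Stand-in estimator: deaths per 100 000 in each age group, all periods pooled."""
    return [100000.0 * sum(row) / sum(n_row) for row, n_row in zip(table, POP)]


for estimate, (low, high) in zip(crude_age_rates(deaths),
                                 bootstrap_intervals(deaths, crude_age_rates)):
    print(round(estimate, 1), round(low, 1), round(high, 1))
```

Replacing `crude_age_rates` with a function that fits the APC model and returns its parameter vector gives the intervals described above; the resampling and percentile steps stay the same.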
Figure 19: Age period cohort model fitted using the penalised likelihood
method
Gastric Cancer Data
Official death certification data for gastric cancer for the period 1950-2004 was derived, whenever
available, from the World Health Organization (WHO) database [16] for 35 countries of the European
Region (according to the WHO definition) and for the EU as a whole (27 member states as defined in
January 2007). Of these countries, 23 were EU members (Belgium, Cyprus, Luxembourg and Slovakia were
excluded); the other 12 were Albania, Armenia, Azerbaijan, Belarus, Croatia, Georgia, Kazakhstan,
Norway, Moldova, Russia, Switzerland and Ukraine. Mortality data was coded according to the Seventh
Revision of the International Classification of Diseases (ICD-7) from the 1950's to the end of the
1960's; the Eighth Revision (ICD-8) was used throughout the 1970's, and the Ninth (ICD-9) up to the
mid 1990's (with the exception of Switzerland and Denmark, which skipped this revision and moved to
the Tenth in 1995 and 1994 respectively). The Tenth Revision of the International Statistical
Classification of Diseases and Related Health Problems (ICD-10) was adopted between the late 1990's
and early 2000s by most countries [17], with the exceptions of Albania, Armenia, Bulgaria, Greece
and Ireland, which have still not adopted it.
Figure 20: Age period cohort model fitted using the penalised likelihood method with confidence intervals from the bootstrap procedure
Coding of stomach cancer mortality is straightforward and the transition between revisions did not
present particular issues: it was coded as 151 in ICD-7, ICD-8 and ICD-9 and as C16 in ICD-10.
Estimates of the resident populations for the corresponding calendar periods, based on official
censuses, were extracted from the same WHO database.
From the matrices of certified deaths and resident population we computed the age-specific
mortality rates per 100 000 inhabitants for 5-year age groups (from 30-34 to 75-79 years), for the 11
5-year periods considered (from 1950-54 to 2000-04) where data was available. No extrapolations
were made for missing data.
From the same matrices, we computed age-specific rates for each 5-year age group (from 0, 1-4 to
85+ years), in order to construct age-standardized mortality rates per 100 000 men and women
using the direct method on the basis of the world standard population at all ages, and the
corresponding percent change in rates over the 1994-2004 decade [1].
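The rate calculations described above reduce to a few arithmetic steps. The sketch below uses hypothetical function names and made-up numbers for three age bands; the weights are illustrative only and are not the world standard population.

```python
def age_specific_rates(deaths, pop, per=100000):
    """Deaths per `per` inhabitants in each age group."""
    return [per * d / n for d, n in zip(deaths, pop)]


def directly_standardised_rate(rates, std_weights):
    """Direct method: average of age-specific rates weighted by a standard population."""
    return sum(r * w for r, w in zip(rates, std_weights)) / sum(std_weights)


def percent_change(new, old):
    """Percent change of `new` relative to `old`."""
    return 100.0 * (new - old) / old


# Made-up data for three age bands in two calendar years.
weights = [50000, 30000, 20000]                     # illustrative standard population
rates_start = age_specific_rates([5, 40, 150], [400000, 350000, 250000])
rates_end = age_specific_rates([4, 30, 120], [410000, 360000, 270000])

asr_start = directly_standardised_rate(rates_start, weights)
asr_end = directly_standardised_rate(rates_end, weights)
print(round(asr_start, 2), round(asr_end, 2),
      round(percent_change(asr_end, asr_start), 2))
```

Because the same weights are applied to every country and period, differences between standardised rates reflect differences in age-specific mortality rather than in age structure.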
Cohorts were defined according to their central year of birth. Thus, the earliest possible cohort (the
1875 one) relates to individuals aged 75 to 79 who died in the quinquennium 1950-54; they could
have been born in any of the 10 years from 1870 to 1879.
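The cohort definition can be written as a small helper. The function names are hypothetical; the only assumption is that age groups and periods are both 5 years wide, as in the data described above.

```python
def birth_cohort(age_start, period_start, width=5):
    """Central, earliest and latest birth year for one age-group by period cell."""
    central = period_start - age_start
    earliest = period_start - (age_start + width)        # oldest subject, first year
    latest = (period_start + width - 1) - age_start      # youngest subject, last year
    return central, earliest, latest


def n_cohorts(n_age_groups, n_periods):
    """Number of distinct diagonal cohorts in an age by period table."""
    return n_age_groups + n_periods - 1


print(birth_cohort(75, 1950))   # (1875, 1870, 1879)
print(n_cohorts(10, 11))        # 20
```

Each cohort spans 10 birth years while consecutive central years are 5 years apart, so adjacent cohorts overlap by half their width.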
Oral Cancer Data
Official death certification data from oral and pharyngeal cancer for 38 European countries and for
the EU in the period 1950-2007 was derived from the World Health Organization (WHO) database
available on electronic support [16].
The EU was defined as the 27 member states as of January 2007, i.e., Austria, Belgium, Bulgaria,
the Czech Republic, Cyprus, Denmark, Estonia, Finland, France, Germany, Greece, Hungary,
Ireland, Italy, Latvia, Lithuania, Luxembourg, Malta, the Netherlands, Poland, Portugal, Romania,
Slovakia, Slovenia, Spain, Sweden, United Kingdom. Data for Cyprus was not available and data for
Belgium was only available up to 1997; both countries were therefore excluded.
During the calendar period considered (1950-2004), four different Revisions of the International
Classification of Diseases (ICD) were used. Since differences in classifications between various
Revisions were minor, oral and pharyngeal cancer deaths were re-coded for all countries according
to the Tenth Revision of the ICD (ICD-10: C00-C14) [17].
Estimates of the resident population, based on official censuses, were obtained from the same WHO
database. From the matrices of certified deaths and resident populations, we computed age-specific
rates for each 5-year age group (from 0, 1-4 to 85+ years) and calendar period. Age-standardized
rates per 100 000 men and women, at all ages and truncated 35-64 years, were computed using the
direct method, on the basis of the world standard population [1]. In a few countries, mortality data
was missing for one or more calendar years. No extrapolation was made for missing data.
Figure 21: Age period cohort models of male gastric cancer mortality in countries representative of
the European region and the EU as a whole.
From these same matrices, we computed the age-specific mortality rates per 100 000 inhabitants for
5-year age groups (from 30-34 to 80-84 years), for the 12 5-year periods considered (from 1950-54
to 2000-04 plus 2007), where data was available. No extrapolations were made for missing data.
Cohorts were defined according to their central year of birth. Thus the earliest possible cohort (the
1870 one) relates to individuals aged 80 to 84 who died in the quinquennium 1950-54; they could
have been born in any of the 10 years from 1865 to 1874.
Results
Gastric Cancer Mortality
Table 3 shows age-standardized rates for men and women in 1994, 1999 and 2004 with percentage
differences between periods. For the EU taken as a whole, the age-standardized mortality rate in
2004 was 9.09/100 000 for men and 4.17/100 000 for women, but this overall rate is the result of
many differing contributions. Central and northern European countries like France, Finland and
Sweden showed low rates in 2004, around 5/100 000 in men and between 2 and 4/100 000 in women, while
southern and Mediterranean countries like Italy and Portugal had mortality rates of 10.43 and
16.36/100 000 in men and 5.04 and 7.30/100 000 in women respectively. Eastern countries belonging to
the EU show even higher mortality rates, with Latvia and Lithuania peaking slightly above 20/100 000
in men and at nearly 8/100 000 in women, and Estonia showing the highest value for women in the EU
with 9.66/100 000.
Figure 22: Age period cohort models of female gastric cancer mortality in countries representative
of the European region and the EU as a whole.
Table 3. Age standardised mortality rates for gastric cancer in men and women in countries from
the European geographical region.
| Country | Calendar years^a | Men 1994 | Men 1999 | Men 2004 | Men % change 1999/94 | Men % change 2004/99 | Women 1994 | Women 1999 | Women 2004 | Women % change 1999/94 | Women % change 2004/99 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| EUROPE | | | | | | | | | | | |
| Albania | 1987-2004 | . | 16.29 | 13.92 | . | -14.59 | . | 7.54 | 7.73 | . | 2.55 |
| Armenia (2003) | 1981-2003 | 15.91 | 13.35 | 16.79 | -16.08 | 25.76 | 6.98 | 5.82 | 7.09 | -16.61 | 21.81 |
| Austria | 1980-2005 | 14.10 | 9.89 | 8.07 | -29.83 | -18.38 | 7.65 | 5.35 | 4.63 | -30.12 | -13.42 |
| Azerbaijan (2002) | 1981-2002 | 24.32 | 25.24 | 21.64 | 3.81 | -14.28 | 10.27 | 10.06 | 11.06 | -2.03 | 9.90 |
| Belarus (2003) | 1981-2003 | 35.51 | 32.97 | 27.44 | -7.17 | -16.76 | 15.07 | 13.02 | 10.51 | -13.63 | -19.22 |
| Bulgaria | 1980-2004 | 18.55 | 14.91 | 13.25 | -19.62 | -11.15 | 8.75 | 7.49 | 6.25 | -14.43 | -16.50 |
| Croatia | 1985-2005 | 21.15 | 22.79 | 16.38 | 7.77 | -28.12 | 8.27 | 7.49 | 6.21 | -9.50 | -17.09 |
| Czech Republic | 1986-2005 | 16.57 | 11.73 | 10.15 | -29.20 | -13.53 | 7.34 | 5.88 | 4.78 | -19.79 | -18.82 |
| Denmark (2001) | 1980-2001 | 6.40 | 5.19 | 5.13 | -18.81 | -1.20 | 3.02 | 2.83 | 2.75 | -6.25 | -2.85 |
| Estonia | 1981-2005 | 25.92 | 23.45 | 18.94 | -9.52 | -19.24 | 13.40 | 12.09 | 9.66 | -9.75 | -20.10 |
| Finland | 1980-2005 | 10.81 | 7.89 | 6.31 | -27.06 | -20.02 | 4.64 | 3.64 | 3.86 | -21.65 | 6.06 |
| France | 1980-2005 | 7.30 | 6.53 | 5.63 | -10.46 | -13.85 | 2.81 | 2.46 | 2.08 | -12.45 | -15.16 |
| Georgia (2001) | 1981-2001 | 11.20 | 12.02 | 8.51 | 7.36 | -29.20 | 5.25 | 5.10 | 4.99 | -2.74 | -2.26 |
| Germany | 1980-2004 | 12.88 | 9.94 | 7.95 | -22.81 | -20.08 | 6.72 | 5.19 | 4.23 | -22.75 | -18.44 |
| Greece | 1980-2005 | . | 8.70 | 7.25 | . | -16.68 | . | 3.73 | 3.65 | . | -2.18 |
| Hungary | 1980-2005 | 21.64 | 17.79 | 13.49 | -17.79 | -24.16 | 9.45 | 7.50 | 6.49 | -20.63 | -13.51 |
| Iceland | 1980-2005 | 10.29 | 7.68 | 6.88 | -25.35 | -10.42 | 4.04 | 3.30 | 3.06 | -18.20 | -7.48 |
| Ireland | 1980-2005 | 10.26 | 8.32 | 6.90 | -18.87 | -17.04 | 5.51 | 4.29 | 2.64 | -22.17 | -38.55 |
| Italy (2003) | 1980-2003 | 15.06 | 11.95 | 10.43 | -20.64 | -12.73 | 7.23 | 5.46 | 5.04 | -24.46 | -7.68 |
| Israel (2003) | 1980-2003 | 8.87 | 7.49 | 7.81 | -15.59 | 4.34 | 5.07 | 4.33 | 3.41 | -14.50 | -21.42 |
| Kazakhstan | 1981-2005 | 33.92 | 30.57 | 25.71 | -9.88 | -15.92 | 13.94 | 12.65 | 10.77 | -9.25 | -14.87 |
| Kyrgyzstan | 1981-2005 | 30.58 | 24.49 | 23.24 | -19.92 | -5.13 | 10.68 | 10.84 | 8.30 | 1.44 | -23.37 |
| Latvia | 1980-2005 | 26.38 | 22.49 | 20.55 | -14.74 | -8.62 | 11.48 | 9.83 | 7.85 | -14.40 | -20.18 |
| Lithuania | 1981-2005 | 25.67 | 22.86 | 21.09 | -10.92 | -7.77 | 9.70 | 9.36 | 7.98 | -3.51 | -14.76 |
| Luxembourg | 1980-2005 | 9.04 | 7.52 | 7.03 | -16.83 | -6.55 | 4.40 | 3.55 | 2.86 | -19.33 | -19.54 |
| Macedonia TFYR (2003) | 1991-2003 | 22.01 | 17.17 | 17.77 | -22.00 | 3.50 | 10.52 | 7.80 | 7.98 | -25.88 | 2.28 |
| Malta | 1980-2005 | 9.99 | 12.25 | 8.03 | 22.66 | -34.42 | 5.77 | 3.70 | 3.65 | -35.78 | -1.55 |
| Netherlands | 1980-2004 | 10.75 | 8.47 | 6.53 | -21.17 | -22.99 | 4.43 | 3.55 | 3.37 | -19.79 | -5.04 |
| Norway | 1980-2005 | 9.61 | 8.55 | 5.57 | -10.96 | -34.90 | 4.68 | 4.28 | 2.94 | -8.41 | -31.47 |
| Poland | 1980-2005 | 19.68 | 16.52 | 14.28 | -16.05 | -13.57 | 6.97 | 5.70 | 5.00 | -18.27 | -12.23 |
| Portugal (2003) | 1980-2003 | 21.02 | 18.69 | 16.36 | -11.09 | -12.46 | 9.66 | 8.93 | 7.30 | -7.57 | -18.24 |
| Republic of Moldova | 1981-2005 | 19.74 | 16.51 | 16.89 | -16.35 | 2.33 | 9.51 | 6.69 | 7.52 | -29.63 | 12.41 |
| Romania | 1980-2004 | 17.79 | 16.67 | 16.00 | -6.33 | -4.02 | 6.83 | 6.08 | 6.29 | -10.95 | 3.43 |
| Russian Federation | 1980-2005 | 37.46 | 31.70 | 27.16 | -15.40 | -14.32 | 15.58 | 13.02 | 11.18 | -16.43 | -14.14 |
| Slovakia | 1992-2005 | . | 16.69 | 13.63 | . | -18.34 | . | 5.84 | 5.78 | . | -0.92 |
| Slovenia | 1985-2005 | 21.18 | 17.82 | 14.84 | -15.86 | -16.75 | 7.82 | 7.67 | 5.42 | -1.86 | -29.44 |
| Spain | 1980-2005 | 13.16 | 10.90 | 9.02 | -17.14 | -17.26 | 5.69 | 4.67 | 3.79 | -17.99 | -18.74 |
| Sweden | 1980-2004 | 7.05 | 6.13 | 4.94 | -13.12 | -19.32 | 3.75 | 3.05 | 2.73 | -18.78 | -10.29 |
| Switzerland | 1980-2005 | 7.98 | 5.53 | 4.50 | -30.75 | -18.55 | 3.58 | 2.71 | 2.10 | -24.48 | -22.35 |
| Tajikistan | 1981-2004 | 19.25 | 17.99 | 14.88 | -6.53 | -17.28 | 10.63 | 10.82 | 9.26 | 1.80 | -14.38 |
| Ukraine | 1981-2005 | 28.85 | 24.86 | 21.23 | -13.84 | -14.61 | 11.24 | 9.58 | 8.28 | -14.76 | -13.65 |
| United Kingdom | 1980-2005 | 10.41 | 8.08 | 6.09 | -22.36 | -24.68 | 4.22 | 3.26 | 2.75 | -22.80 | -15.65 |
| Uzbekistan | 1981-2005 | 15.93 | 11.98 | 12.34 | -24.77 | 2.93 | 7.11 | 5.56 | 6.22 | -21.86 | 12.04 |
| EU | | 13.22 | 10.85 | 9.09 | -17.90 | -16.28 | 5.98 | 4.81 | 4.17 | -19.42 | -13.45 |
This geographical division of results is also reflected in the curves resulting from the APC analysis
that can be seen in Figures 21 (men) and 22 (women). The age curves for France, Sweden and the
UK are relatively shallow in both men and women as compared to those found in southern and
eastern countries such as Italy, Portugal and Hungary. The central and northern countries with the
more favourable rates also have distinctive cohort and period effect curves: they have strongly
descending period effect trends, and cohort effects that fall steeply from the earliest cohorts
until about the 1940s and then stabilize. Southern and eastern countries, on the other hand, display
a later start of the fall in cohort effects, with eastern countries not showing the stabilization in
the younger cohorts that southern European countries share with central and northern states. The
difference between sexes is mainly seen in the age curves, which are much shallower in women than in
men throughout the observed countries.
Oral Cancer Mortality
Table 4 gives the overall age-standardized mortality rates from oral and pharyngeal cancer in men
and women from various European countries and the EU as a whole in 1990-1994 and 2000-2004
with the corresponding percent changes.
For EU men, rates declined by 8% between the early 1990s and 2000-2004, to reach an overall
age-standardized rate of 6.1/100 000 in 2000-2004. In 1990-1994, the highest male rates were in
Hungary (17.1/100 000), Slovakia (16.0/100 000) and France (12.4/100 000); the lowest ones were in
Greece and Iceland (1.8/100 000) and Finland (2.2/100 000). In 2000-2004, the highest male rates
were in Hungary (21.1/100 000) and other countries from central and eastern Europe, such as
Slovakia (16.9/100 000), the Republic of Moldova, Lithuania, Ukraine and Croatia (around
10-11/100 000), while the lowest ones were in Nordic countries, such as Iceland, Sweden and Finland
(around 2/100 000), the United Kingdom (2.8/100 000) and Greece (1.8/100 000). Oral and
pharyngeal cancer mortality declined over the last decade in several large European countries,
including France (with a rate of 8.6/100 000 in the early 2000s), Spain (6.0/100 000), Germany
(5.7/100 000) and Italy (4.3/100 000). Persisting rises were, however, observed in several central
and eastern European countries, including in particular Hungary, but also Belarus, Lithuania and
Romania. Rates were much lower in women, but increased moderately from 1.08 to 1.14/100 000
over the last decade in the EU as a whole. In 2000-2004 the highest rates for women were also in
Hungary (3.3/100 000), followed by Denmark (1.6/100 000) and Scotland (1.4/100 000); the
lowest ones were in Bulgaria (0.8/100 000) and Greece (0.7/100 000).
Figure 23: Male age-specific mortality rates from oral and pharyngeal cancer from 30-34 to 80-84 years plotted against the year of birth
Table 4. Age standardised mortality rates for oral cancer in men and women in countries from the European geographical region.

| Countries | Men 1990-94 | Men 2000-04 | Men % change 2000-04/1990-94 | Women 1990-94 | Women 2000-04 | Women % change 2000-04/1990-94 |
|---|---|---|---|---|---|---|
| Albania (1992-94) | 3.81 | 2.41 | -36.77 | 1.35 | 1.15 | -14.94 |
| Austria | 6.07 | 6.28 | 3.47 | 0.96 | 1.35 | 40.49 |
| Belarus (2000-03) | 8.28 | 9.70 | 17.25 | 0.72 | 0.72 | -0.23 |
| Bulgaria (2000-03) | 4.27 | 4.48 | 5.07 | 0.69 | 0.81 | 17.54 |
| Croatia | 11.88 | 10.43 | -12.15 | 1.15 | 1.07 | -7.44 |
| Czech Republic | 6.53 | 7.13 | 9.14 | 0.97 | 1.06 | 9.03 |
| Denmark (2000-01) | 4.28 | 4.93 | 15.30 | 1.38 | 1.64 | 19.16 |
| Estonia | 8.71 | 8.83 | 1.31 | 1.06 | 1.30 | 22.56 |
| Finland | 2.21 | 2.31 | 4.15 | 0.83 | 0.83 | 0.24 |
| France | 12.43 | 8.57 | -31.01 | 1.30 | 1.27 | -2.37 |
| Germany | 6.63 | 5.67 | -14.47 | 1.13 | 1.18 | 4.97 |
| Greece | 1.80 | 1.84 | 2.60 | 0.49 | 0.66 | 32.62 |
| Hungary | 17.13 | 21.12 | 23.25 | 2.22 | 3.25 | 46.34 |
| Iceland | 1.81 | 2.09 | 15.23 | 0.66 | 1.45 | 117.85 |
| Ireland | 4.52 | 3.51 | -22.26 | 1.09 | 1.13 | 3.27 |
| Italy (2000-03) | 5.81 | 4.31 | -25.78 | 1.00 | 1.01 | 1.67 |
| Latvia | 6.96 | 8.06 | 15.80 | 0.81 | 0.92 | 13.85 |
| Lithuania | 7.94 | 10.57 | 33.03 | 0.95 | 0.96 | 0.16 |
| Luxembourg | 8.49 | 7.70 | -9.31 | 1.40 | 1.47 | 4.98 |
| Macedonia (2000-03) | 3.22 | 3.04 | -5.41 | 0.60 | 0.83 | 37.50 |
| Malta | 3.49 | 2.86 | -18.04 | 1.40 | 0.78 | -44.30 |
| Netherlands | 2.78 | 2.96 | 6.29 | 1.04 | 1.27 | 22.16 |
| Norway | 3.14 | 2.76 | -11.93 | 0.93 | 0.98 | 5.47 |
| Poland | 6.27 | 5.98 | -4.70 | 1.06 | 1.12 | 5.39 |
| Portugal (2000-03) | 5.87 | 6.81 | 16.10 | 0.91 | 0.83 | -9.18 |
| Republic of Moldova | 10.57 | 11.21 | 6.13 | 1.02 | 1.10 | 7.72 |
| Romania | 6.45 | 9.99 | 55.00 | 1.02 | 1.23 | 20.38 |
| Russian Federation | 8.92 | 8.85 | -0.75 | 1.03 | 1.08 | 4.15 |
| Slovakia | 16.01 | 16.85 | 5.28 | 1.10 | 1.19 | 8.32 |
| Slovenia | 11.41 | 8.35 | -26.82 | 0.95 | 1.08 | 14.39 |
| Spain | 6.94 | 5.98 | -13.76 | 0.83 | 0.85 | 2.06 |
| Sweden | 2.54 | 2.15 | -15.39 | 0.91 | 0.88 | -2.85 |
| Switzerland | 6.14 | 4.60 | -25.06 | 1.17 | 1.23 | 5.15 |
| Ukraine | 9.73 | 10.55 | 8.32 | 0.94 | 0.88 | -6.36 |
| United Kingdom | 2.98 | 2.75 | -7.64 | 1.15 | 1.07 | -6.72 |
| United Kingdom, England and Wales | 2.83 | 2.58 | -8.96 | 1.11 | 1.05 | -5.41 |
| United Kingdom, Northern Ireland | 3.16 | 2.74 | -13.30 | 0.96 | 1.12 | 16.43 |
| United Kingdom, Scotland | 4.48 | 4.19 | -6.46 | 1.52 | 1.37 | -9.69 |
| European Union | 6.62 | 6.09 | -7.92 | 1.08 | 1.14 | 6.23 |
selection of European countries and the EU. Examining the graph for the EU, age-specific mortality
rate estimates for this cancer rise linearly with age, reaching a value lower than 50/100 000 in the
oldest age group. Period estimates rise steadily up to the late 1980's/early 1990's, then invert the
trend and fall until the most recent studied period. Cohort estimates rise sharply from the earliest
cohort up to the beginning of the 20th century, then fall until the 1920's, rise up to the 1960's
and fall again for the more recent cohorts. With the exception of Portugal, which shows a fall up to
the 1920's and then a continuous rise, cohort effects for the studied countries closely resemble the
ones from the EU. There are, however, differences in variability: Lithuania and the Republic of
Moldova display wider confidence intervals than other countries for their cohort effects. Age effects
reflect the standard rates reported for the countries: Hungary's age curve reaches its highest point
at about 100/100 000, while Germany's does not reach 40/100 000. The bigger EU countries, like
France, Germany, Italy and Spain, show period estimate trends that reflect the patterns already seen
in the EU, with a rise for the earlier periods, an inversion in the 1990's and a continued descent
up to the last studied period. Countries that showed rises in standard rate trends, like Lithuania,
Romania and Ukraine, have rising period effect trends for the more recent periods.
Discussion
The Penalised Likelihood APC Model
Before proceeding to the interpretation of results, a few general considerations on the model are
due. To start with, random variation problems differ in relation to age, period and cohort estimates.
These are minor for period of death estimates, since these are based upon relatively similar total
numbers of events over subsequent calendar periods; for age estimates the issue tends to manifest in
the younger age groups, where the absolute numbers of deaths for most oncological pathologies are
low. In cohort effects these problems are potentially greater at both ends of the curve: the
earliest and latest cohorts are based on very few observations, while moving towards the central
cohorts the number of observations rises. In addition to this, the more recent cohorts are based on
smaller numbers of deaths because they contain the youngest age groups. It follows that changes in
trends in these recent cohorts should be examined with caution, as is reflected by the wider
confidence intervals for their estimates, even though they provide important information on future
trends.
Another limit of the model is that it has difficulty discerning whether the major underlying trend
is a cohort or a period one when both their estimated effects share the same direction [18]. This
model also has a systematic tendency to favour cohort effects, because the larger number of cohort
parameters gives them greater weight in the modelling process.
Gastric Cancer Mortality
The widespread favourable trends in the cohort of birth and period of death effects in gastric cancer
mortality are not clearly understood, but most likely reflect the effects of a more affluent diet
that is richer in fresh fruit and vegetables, of better food conservation (including refrigeration),
and of better hygiene, with a lower level of Helicobacter pylori infection [19][20]. There is also
consistent evidence of a correlation between the consumption of salt and salted foods and gastric
cancer [21][22][23]. Other methods of food conservation, such as smoking and curing, have also been
linked to stomach cancer, but the evidence is less consistent. Tobacco smoking is also an important
risk factor in stomach cancer mortality [24].
In countries with more advanced health infrastructures, improved and newly developed methods of
diagnosis and treatment may also have played a role over the most recent calendar periods, but this
effect remains hard to quantify [25].
These favourable changes, which are essentially the result of a more developed socio-economic
environment and a healthier lifestyle, have been slower to occur and are less widespread in some
countries of southern and eastern Europe. In particular the high prevalence of H. pylori, aspects of
diet, nutrition and food conservation, as well as a higher prevalence of tobacco smoking, are the
most probable causes of the differences observed between central-northern Europe and the
south-eastern countries.
The conclusions that can be drawn from this study are that, although in some countries such as
France, Sweden and Switzerland there seems to be an asymptote in gastric cancer mortality due to
the levelling seen in the effects of the most recent cohorts, there appears to be a lot of room for
improvement for the mortality rates of this cancer, particularly in southern and eastern European
countries, where these rates are high and the effects of the more recent cohorts have not shown
signs of slowdown such as in the Czech Republic, Estonia, Latvia and Italy.
Oral Cancer Mortality
The main finding from this analysis of oral and pharyngeal cancer mortality in Europe is the strong
excess mortality in Hungary, where the rate for middle aged men was 55/100,000, comparable to
those of lung cancer in several western countries (i.e., 50 to 60/100,000 in Germany, Italy and the
United Kingdom) [26][27]. In the most recent cohorts (i.e., those born after 1960) there were signs
of a possible near future reversal in trends. Male rates appreciably decreased in southern European
countries, such as France, Italy and Spain, which had the highest rates in the past, but not in several
northern European countries, such as Denmark, the United Kingdom and the Netherlands [28][29].
Oral and pharyngeal cancer mortality is comparatively low in European women, though trends were
upwards over the last decades [30], and rates in some countries (Hungary in particular, but also
Denmark and Romania) have reached relatively high levels, especially in middle aged women,
reflecting female drinking and smoking patterns in those populations. Across European countries,
there was still an over 10-fold variation in male oral and pharyngeal mortality between the highest
rates in Hungary (21.1/100,000) and Slovakia (16.9/100,000), and the lowest ones in Greece
(1.8/100,000) and Sweden (2.2/100,000).
The diverging trends in the two sexes essentially reflect different patterns in tobacco smoking and
alcohol drinking. The favourable trends in male mortality rates reflect the fall in tobacco
consumption in men from most (western) European countries over the last few decades. A
favourable effect of stopping smoking is in fact already evident within a few years of smoking
cessation, while the risk may remain persistently high for several years after stopping drinking
[31][32]. Conversely, tobacco consumption has increased in women in several countries and this has
led to unfavourable oral and pharyngeal cancer mortality trends [33].
The geographic pattern of oral and pharyngeal cancer mortality trends also appears to be related to
changes in alcohol consumption [34]. The exceedingly high rates in Hungary and in a few other
countries of central and eastern Europe (Slovakia, Moldova, Lithuania, Croatia, Romania) can be
related to the overall quantity of alcohol consumed, but also to the drinking patterns (out of meals,
binge drinking) and to the type of alcohol consumed. In these countries, in fact, a substantial
proportion of alcohol derives from fruit (plums, peaches, apricots) and home-made alcoholic
beverages are widespread [35][36]. These may include high levels of acetaldehyde, which is an
established human carcinogen [37].
Age and cohort-specific analyses appear to reflect available information on the prevalence of
tobacco smoking in subsequent generations of men from major European countries, as well as
possibly the patterns of alcohol drinking, though generation specific data on alcohol consumption is
not available for most countries.
Although oral and pharyngeal mortality in Europe has declined in the last decade in men, there were
still rises in a few central and eastern European countries, reaching exceedingly high rates in
Hungary and Slovakia, which now have the highest rates on a European scale. The control of oral
and pharyngeal cancer, as well as of other alcohol- and tobacco-related cancers, therefore remains a
major public health problem in those areas of the continent.
24 IARC: IARC Monographs on the evaluation of carcinogenic risks to humans. Vol. 83. Tobacco smoke and involuntary smoking. 2004.
25 Shibata A, Parsonnet J, Schottenfeld D, Fraumeni JF Jr: Stomach cancer. In Cancer epidemiology and prevention. 3rd edition. 2006:659-673.
26 Levi F, Lucchini F, Negri E, La Vecchia C: Trends in mortality from major cancers in the European Union, including acceding countries, in 2004. Cancer 2004, 101:2843-2850.
27 Bosetti C, Bertuccio P, Levi F, Lucchini F, Negri E, La Vecchia C: Cancer mortality in the European Union, 1970-2003, with a joinpoint analysis. Ann Oncol 2008, 19:631-640.
28 Olsen AH, Parkin DM, Sasieni P: Cancer mortality in the United Kingdom: projections to the year 2025. Br J Cancer 2008, 99:1549-1554.
29 Braakhuis BJM, Visser O, Leemans CR: Oral and oropharyngeal cancer in The Netherlands between 1989 and 2006: Increasing incidence, but not in young adults. Oral Oncol 2009, 45:e85-9.
30 Garavello W, Bertuccio P, Levi F, Lucchini F, Bosetti C, Malvezzi M, Negri E, La Vecchia C: The oral cancer epidemic in central and eastern Europe. Int J Cancer 2009.
31 La Vecchia C, Franceschi S, Bosetti C, Levi F, Talamini R, Negri E: Time since stopping smoking and the risk of oral and pharyngeal cancers. J Natl Cancer Inst 1999, 91:726-728.
32 Franceschi S, Levi F, Dal Maso L, Talamini R, Conti E, Negri E, La Vecchia C: Cessation of alcohol drinking and risk of cancer of the oral cavity and pharynx. Int J Cancer 2000, 85:787-790.
33 Shafey O, Dalwick S, Guindon G: Tobacco control country profiles 2003. American Cancer Society; 2003.
34 World Health Organization Statistical Information System: Health topics. Alcohol drinking. Available at: http://www.who.int/topics/alcohol_drinking/en/. 2006.
35 Boffetta P, Hashibe M, La Vecchia C, Zatonski W, Rehm J: The burden of cancer attributable to alcohol drinking. Int J Cancer 2006, 119:884-887.
36 Lachenmeier DW, Ganss S, Rychlak B, Rehm J, Sulkowska U, Skiba M, Zatonski W: Association between quality of cheap and unrecorded alcohol products and public health consequences in Poland. Alcohol Clin Exp Res 2009, 33:1757-1769.
37 IARC: IARC Monographs on the evaluation of carcinogenic risks to humans. Vol. 71. Re-evaluation of some organic chemicals, hydrazine and hydrogen peroxide. Lyon: International Agency for Research on Cancer; 1999.