STOCHASTIC MORTALITY MODELLING
By
Xiaoming Liu
A thesis submitted in conformity with the requirements
for the degree of Doctor of Philosophy
Graduate Department of Statistics
University of Toronto
© Copyright by Xiaoming Liu 2008
Stochastic Mortality Modelling
Xiaoming Liu
Department of Statistics, University of Toronto
Ph.D. Thesis, 2008
Abstract
For life insurance and annuity products whose payoffs depend on future mortality rates,
there is a risk that realized mortality rates will differ from the anticipated rates
assumed in their pricing and reserving calculations. This is termed mortality risk.
Since mortality risk is difficult to diversify and has significant financial impacts on
insurance policies and pension plans, it is now well accepted that stochastic
approaches should be adopted to model mortality risk and to evaluate mortality-linked
securities.
The objective of this thesis is to propose the use of a time-changed Markov process
to describe stochastic mortality dynamics for pricing and risk management purposes.
Analytical and empirical properties of these dynamics have been investigated using a
matrix-analytic methodology. Applications of the proposed model in the evaluation of
fair values for mortality-linked securities have also been explored.
To be more specific, we consider a finite-state Markov process with one absorbing
state. This Markov process is related to an underlying aging mechanism, and the survival
time is viewed as the time until absorption. The resulting distribution for the survival time
is a so-called phase-type distribution. This approach is different from traditional curve-fitting
mortality models in the sense that the survival probabilities are now linked with an
underlying Markov aging process. Markov process and phase-type distribution theories
therefore provide us with a flexible and tractable framework to model the mortality
dynamics. The time-changed Markov process further allows us to incorporate the uncertainties
embedded in the future mortality evolution.
The proposed model has been applied to price the EIB/BNP Longevity Bonds and
other mortality derivatives under the assumption that interest rates and mortality
rates are independent. A calibration method for the model is suggested so that it can utilize both the market
price information involving the relevant mortality risk and the latest mortality projection.
The proposed model has also been fitted to various types of population mortality data for
empirical study. The fitting results show that our model can capture the stylized mortality
patterns very well.
Acknowledgements
It is certainly a long road to become a doctor. When I look back, I feel grateful
for the help and support I have received from many people during my doctoral study. I
would like to take this opportunity to thank all of them.
In particular, I wish to thank Professor X. Sheldon Lin. As my supervisor, Professor Lin's
guidance was not limited to how to write a thesis; he also shared his views on how to
build a research career. I have learned a lot from
his wisdom. To Professor Samuel Broverman, I wish to thank him for his thoughtful
comments and for being devoted and patient; in particular, he spent a huge amount of
time proofreading my thesis. I really appreciate his help. To Professor Sebastian
Jaimungal, I wish to thank him for giving me my first exposure to the Variance Gamma
process and to applications of Markov models in credit risk, not to mention all those
interesting discussions with him. I also wish to thank the external examiner, Professor
Jun Cai of the University of Waterloo, for his insightful comments.
I would like to thank the University of Toronto and the Department of Statistics
in particular for their support throughout my graduate studies. Special appreciation
goes to Professors Nancy Reid, Jeffrey S. Rosenthal, Radford Neal and Radu V. Craiu.
Thanks also go to Andrea Carter, Dermot Whelan, Laura Kerr, and Ram Mohabir,
who are always willing to help.
To all the other friends: Xiaobin, Hanna, Mylene, Zheng, Hadus, Mohammed,
Zi, Shuying, Tao, Longhai, Meng, Shelly, Mark, Ana-maria, Angelo, and Samuel for
being around and helpful; Patrice for his advice and suggestions; Sig, for being a very
special friend I was lucky to make. All these people made my time at U of T so memorable.
I also wish to thank my dear daughter, Angela, for the inspiration she has brought
me with her enthusiasm for the Harry Potter book series and J.K. Rowling.
Last, but not least, my very special thanks go to my husband, Owen, for bringing
me the idea of treating the research like a project. This helped me survive the most
difficult time in writing the thesis. Without you, this could never have been done!
London, Ontario Xiaoming Liu
Dec. 16, 2007
Table of Contents
Abstract ii
Acknowledgements iv
Table of Contents vi
Introduction 1
1 Mortality Risk — From Deterministic to Stochastic Approach 5
1.1 Mortality Modelling . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.1.1 Survival Distributions . . . . . . . . . . . . . . . . . . . . . . 5
1.1.2 Empirical Studies on Mortality Trend . . . . . . . . . . . . . . 11
1.1.3 Mathematical Mortality Models . . . . . . . . . . . . . . . . . 19
1.2 Mortality Projection Methods Review . . . . . . . . . . . . . . . . . . 22
1.2.1 Deterministic Projection Models . . . . . . . . . . . . . . . . . 24
1.2.2 Stochastic Projection Models . . . . . . . . . . . . . . . . . . 30
1.2.3 Assessing Mortality Projection Models . . . . . . . . . . . . . 37
1.3 Mortality Risk and Stochastic Approaches . . . . . . . . . . . . . . . 41
1.3.1 Mortality Risk — Definition and Properties . . . . . . . . . . 41
1.3.2 Financial Implication of Mortality Risk . . . . . . . . . . . . . 47
1.3.3 Stochastic Approach of Dealing with Mortality Risk . . . . . . 51
2 Arbitrage-free Pricing Framework for Mortality Contingent Claims 57
2.1 Arbitrage free pricing theory . . . . . . . . . . . . . . . . . . . . . . . 57
2.1.1 Basic Ideas of Arbitrage Free Pricing . . . . . . . . . . . . . . 57
2.1.2 The Term Structure of Interest Rate . . . . . . . . . . . . . . 63
2.2 The Term Structure of Mortality Under Arbitrage Free Framework . 67
2.2.1 Basic Building Blocks . . . . . . . . . . . . . . . . . . . . . . . 67
2.2.2 The Generalized Financial/Insurance Market . . . . . . . . . . 72
2.3 Review of stochastic mortality models under the no-arbitrage framework 75
2.3.1 Criteria for Term Structure of Mortality Models . . . . . . . . 75
2.3.2 A Brief Review of Existing Stochastic Mortality Models . . . . 78
2.3.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3 The Time-Changed Markovian Mortality Model 89
3.1 Dynamic Approach of Mortality Modelling . . . . . . . . . . . . . . . 89
3.1.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
3.1.2 Phase-type distributions . . . . . . . . . . . . . . . . . . . . . 90
3.1.3 Phase-type distributions as mortality models . . . . . . . . . . 98
3.2 Time-changed Markovian Survival Model . . . . . . . . . . . . . . . . 100
3.2.1 Time-changed Markovian Process . . . . . . . . . . . . . . . . 101
3.2.2 The Gamma process . . . . . . . . . . . . . . . . . . . . . . . 102
3.2.3 Survival functions for time-changed model . . . . . . . . . . . 104
3.2.4 Illustrations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
3.3 Pricing Longevity Bonds . . . . . . . . . . . . . . . . . . . . . . . . . 116
3.3.1 The EIB/BNP Longevity Bonds . . . . . . . . . . . . . . . . . 116
3.3.2 How was the EIB/BNP LB priced? . . . . . . . . . . . . . . . 118
3.3.3 Proposed method for pricing the EIB/BNP Longevity Bonds . 120
3.3.4 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . 123
3.3.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
3.4 Miscellaneous . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
3.4.1 GAOs and Longevity Risk Crisis . . . . . . . . . . . . . . . . 131
3.4.2 Pricing Guaranteed Annuity Options – Preliminary Study . . 133
3.4.3 A Snapshot of Mortality Derivative Market . . . . . . . . . . . 137
4 Deterministic Fitting 148
4.1 Aging Process and Physiological Age . . . . . . . . . . . . . . . . . . 149
4.2 The Proposed Mortality Model . . . . . . . . . . . . . . . . . . . . . 152
4.3 Fitting Swedish Cohort Data . . . . . . . . . . . . . . . . . . . . . . . 157
4.4 Fitting U.S. Social Security Administration Mortality Data . . . . . . 162
4.5 Analysis of Goodness-of-Fit . . . . . . . . . . . . . . . . . . . . . . . 166
4.6 Qualitative Analysis of the Model . . . . . . . . . . . . . . . . . . . . 168
4.7 Conclusion and Discussion . . . . . . . . . . . . . . . . . . . . . . . . 174
A Matrix Algebra 177
A.1 Matrix Exponential . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
A.2 The Kronecker product ⊗ and the Kronecker sum ⊕ . . . . . . . . . 178
Bibliography 182
Introduction
Actuaries have been using extrapolative methods to project mortality rates for cen-
turies, with the implication that the past represents the future. Traditional actuarial
approaches for the pricing and risk management of life insurance and annuity prod-
ucts treat mortality rates deterministically. In other words, a deterministic projected
mortality schedule is chosen in advance, and then used in the calculation of premi-
ums and risk reserves, with the belief that the differences between the projected rates
and realized rates, the so-called mortality risk, can be diversified among individuals
and/or over time.
Those are indeed very strong assumptions. Over the last century, evidence has
emerged revealing that mortality risk is neither predictable nor diversifiable. In fact,
it has been shown that the mortality projections of the last fifty years have systematically
underestimated the overall mortality improvement. The consequent adverse
financial impacts caused by mis-assessing mortality risk have been blamed as one of
the main reasons for the insolvency of the Equitable Life Assurance Society of the UK,
the world's oldest life office. For empirical studies on mortality trend changes
and mortality projections, see Willets (1999), CMIB reports (2002; 2004; 2005), GAD
National Statistics Quality Review Report No. 8 (2001), and the papers presented
at the 2005 Society of Actuaries “Living to 100 and Beyond Symposium”. Also see
Pitacco (2003), Olivieri (2001), Boyle and Hardy (2003) and Ballotta and Haberman
(2003) for the financial impacts of the mortality risk on life insurance and annuities.
As a result, the traditional deterministic actuarial approach is now seen to be
inadequate for the calculation of fair values and reserves. Great efforts have been
made in the past few years to explore the use of stochastic approaches to model the
mortality dynamics and to evaluate the mortality-linked securities, many of which
are similar to approaches in finance. The frontier work in this aspect can be found in
Milevsky and Promislow (2001), Dahl (2004), Dahl and Møller (2006), Biffis (2005),
Biffis and Millossovich (2006), Ballotta and Haberman (2006), and Cairns, Blake, and
Dowd (2006a,b).
The aforementioned researchers make use of the similarities between mortality
risk and interest rate risk. They suggest modifying the models arising in the interest
rate sector to obtain mortality rate models. However, while mathematically similar
at a certain conceptual level, mortality rates behave very differently from interest
rates. For example, the term structure of mortality rates should only be increasing,
to reflect the biological reasonableness of the age-specific pattern of mortality, whilst
interest rates can reverse in some situations. And while mean reversion is a
desirable property for interest rates, it is doubtful that mean reversion is realistic for
mortality dynamics (see Cairns, Blake, and Dowd, 2006a).
In this thesis, we propose an alternative approach to modelling stochastic mortality.
To be more specific, we start with a finite-state Markov process with one
absorbing state. This Markov process is related to an underlying aging mechanism,
and the survival time is viewed as the time until absorption. The resulting distribution
for the survival time is a so-called phase-type distribution. This approach is
very different from traditional curve-fitting mortality models in the sense that
the survival probabilities are now linked with an underlying Markov aging process.
Markov process and phase-type distribution theories then provide us with a flexible
and tractable framework to model the mortality dynamics.
To introduce an uncertainty measure into the mortality dynamics, we further
consider a time-changed Markov process. The time change is described by a
Gamma process, subordinated to the underlying aging process to generate stochastic
mortality curves. The time-changing idea was first used by Madan and
his collaborators in modelling the dynamics of the logarithm of the stock price (see
Madan and Milne, 1991; Madan, Carr, and Chang, 1998). However, the time-change
technique, associated with an underlying Markov process, has not been considered
in mortality modelling. In this project, we make use of properties of phase-type
distributions, which have been extensively explored in queueing theory and risk theory
(see Neuts, 1981; Asmussen, 1987). Here, the computational advantages of the
matrix-analytic method for phase-type distributions are extended to the time-changed
stochastic models and presented in a different form. Most interestingly, the expectation
and variance of the resulting survival curves can be explicitly expressed in terms of
the original process characteristics. Unlike the mortality models adapted from their
interest rate counterparts, which must rely on simulation methods to obtain numerical
results, our model is mathematically tractable yet still retains a biologically reasonable
mortality pattern.
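To make the phase-type building block concrete, the following sketch evaluates a phase-type survival function S(t) = α exp(Tt)1, where T is the sub-intensity matrix over the transient (aging) states and α is the initial distribution. The three-state matrix and all parameter values are hypothetical, chosen only to illustrate the matrix-analytic computation; they are not the calibrated model developed later in the thesis.

```python
# A minimal sketch of a phase-type survival function, assuming a hypothetical
# 3-state aging chain with one additional absorbing (death) state.
import numpy as np
from scipy.linalg import expm

# Sub-intensity matrix T over the transient states: off-diagonal entries are
# transition rates between aging phases; each row sum plus that state's exit
# (death) rate equals zero.  The values below are illustrative only.
T = np.array([[-0.10,  0.09,  0.00],
              [ 0.00, -0.20,  0.18],
              [ 0.00,  0.00, -0.40]])
alpha = np.array([1.0, 0.0, 0.0])    # start in the first (youngest) phase

def survival(t):
    """S(t) = alpha * exp(T t) * 1, the probability of not yet being absorbed."""
    return float(alpha @ expm(T * t) @ np.ones(3))

for t in (10, 30, 50):
    print(f"S({t}) = {survival(t):.4f}")
```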
The proposed model has been illustrated by pricing the EIB/BNP Longevity Bonds
and other mortality derivatives under the assumption that interest rates and
mortality rates are independent. We propose a calibration method that can utilize both the market
price information involving the relevant mortality risk and the available mortality
projection that best incorporates current knowledge regarding mortality trends.
Calibrating a risk-neutral model using market information is common practice in
financial modelling. However, similar studies for mortality risk have not been carried
out seriously. The scarcity of research in this area is partly due to the complexity
of mortality models and partly due to the relative dearth of market information. We
hope that the calibrated approach based on our time-changed Markov model can provide a
tractable framework for no-arbitrage, market-consistent pricing and hedging problems
for the mortality risk concerned.
The proposed model has been fitted to various types of population mortality data
for empirical study. The fitting results show that our model can capture the stylized
mortality patterns very well.
The thesis is organized as follows. In chapter 1, we present empirical facts with
respect to the historical mortality data. We also examine the various projection
methods for mortality. These help us understand the features of mortality risk and
the reasons why it has taken so long for the actuarial profession and industry to pay attention
to the stochastic and systematic aspects of mortality risk. We then set up the
general pricing framework and basic building blocks of stochastic mortality models
in chapter 2. Our proposed time-changed Markov model is presented in chapter 3.
This chapter also includes the discussion of the properties of the model, illustrations,
examples and applications. Finally, we give a detailed empirical study in chapter 4,
to show that a phase-type distribution can fit the whole mortality schedule fairly well.
A hypothetical underlying mechanism for the model is also suggested.
Chapter 1
Mortality Risk — From Deterministic to Stochastic Approach
1.1 Mortality Modelling
Life insurance and annuities are products designed to manage financial uncertainty
related to how long an individual will survive. Hence, the lifetime random variable
X and its associated mortality model are the basic building blocks in actuarial math-
ematics. In this section, we first introduce basic concepts and actuarial notation
related to mortality modelling. We then present empirical results on human mor-
tality trends, leading to a discussion of current problems in mortality modelling and
projecting approaches.
1.1.1 Survival Distributions
We begin by considering a continuous age-at-death variable X. Specifically, X is a
nonnegative random variable representing the lifetime of an individual in a cohort or
population.
All distribution functions related to the random variable X, unless stated other-
wise, are defined over the interval [0,∞). Let f(x) denote the probability density
function (p.d.f.) of X and let the cumulative distribution function (c.d.f.) be
F(x) = P(X ≤ x) = ∫_0^x f(t) dt.    (1.1.1)
The probability of an individual surviving to age x is given by the survival function
(s.f.)
s(x) = P(X > x) = ∫_x^∞ f(t) dt.    (1.1.2)
A very important concept in mortality modelling is the force of mortality (often
referred to as the hazard function in other fields such as in reliability theory), which
is defined as:
µ(x) = lim_{∆x→0} P(x < X ≤ x + ∆x | X > x) / ∆x    (1.1.3)
     = f(x) / s(x).    (1.1.4)
The force of mortality specifies the instantaneous rate of death at age x, given that
the individual survives up to age x.
Any one of the functions f(x), F (x), s(x), or µ(x) can be used to specify the
distribution of X. It is easy to see that, given an expression for any one of the above
four functions, the other three can be derived. For example, in terms of the force of
mortality µ(x), we have:
s(x) = exp(−∫_0^x µ(t) dt),    (1.1.5)
F(x) = 1 − s(x) = 1 − exp(−∫_0^x µ(t) dt),    (1.1.6)
and
f(x) = µ(x) exp(−∫_0^x µ(t) dt).    (1.1.7)
We will often make use of the future lifetime random variable (or residual lifetime)
τx. τx is the time-until-death variable measured from the date that a contract has been
issued to an individual of age x. In the following, the symbol (x) is used to denote
a life-aged-x. The distribution function for τx can be derived from the distribution
function for X. In actuarial science, special symbols have been assigned to denote
the distribution function for τx as follows.
Actuarial Notation
For t ≥ 0, we define
tqx = P(τx ≤ t) = P(x < X ≤ x + t | X > x) = [s(x) − s(x + t)] / s(x),    (1.1.8)
tpx = P(τx > t) = P(X > x + t | X > x) = s(x + t) / s(x).    (1.1.9)
The symbol tqx can be interpreted as the probability that (x) will die within t years;
that is, tqx is the c.d.f. of τx. Similarly, tpx can be interpreted as the probability that
(x) will attain age x + t, that is, tpx is the s.f. of τx. Note that both tqx and tpx are
conditional probabilities in terms of random variable X; they are probabilities that
are conditional on the event that an individual survived to age x.
In terms of the force of mortality,
tpx = exp(−∫_x^{x+t} µ(s) ds) = exp(−∫_0^t µ(x + s) ds),    (1.1.10)
tqx = 1 − tpx = 1 − exp(−∫_x^{x+t} µ(s) ds) = 1 − exp(−∫_0^t µ(x + s) ds),    (1.1.11)
and the p.d.f. for τx is
fτ(t) = d(tqx)/dt = −d(tpx)/dt.    (1.1.12)
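As a concrete check of relations (1.1.5) and (1.1.10), the following sketch computes s(x) and tpx by numerically integrating a force of mortality. The Gompertz-type hazard and its parameter values are purely illustrative assumptions, not fitted quantities.

```python
# A minimal sketch, assuming an illustrative Gompertz-type force of mortality
# mu(x) = A * exp(theta * x); any nonnegative hazard function would work.
import numpy as np
from scipy.integrate import quad

A, theta = 1e-4, 0.09          # illustrative parameters only

def mu(x):
    return A * np.exp(theta * x)

def s(x):
    """s(x) = exp(-int_0^x mu(t) dt), formula (1.1.5)."""
    integral, _ = quad(mu, 0.0, x)
    return np.exp(-integral)

def tpx(t, x):
    """tpx = exp(-int_0^t mu(x + s) ds), formula (1.1.10)."""
    integral, _ = quad(lambda u: mu(x + u), 0.0, t)
    return np.exp(-integral)

print(s(65.0))          # probability a newborn survives to age 65
print(tpx(10.0, 65.0))  # probability a life aged 65 survives 10 more years
```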
For the special case of t = 1, the prefix in the symbols will be omitted and we have
qx = P [(x) will die within 1 year], (1.1.13)
Table 1.1: An Illustrative Life Table

Age x      qx         lx      dx      Lx        Tx       ex
 ...       ...        ...     ...     ...       ...      ...
  60     0.011894   80472     957   79993   1627170    20.22
  61     0.012912   79515    1027   79002   1547176    19.46
  62     0.014167   78488    1112   77932   1468175    18.71
 ...       ...        ...     ...     ...       ...      ...
px = P [(x) will attain age x + 1]. (1.1.14)
qx plays an important role in mortality analysis as we will show in the following
sections, and is often referred to as the mortality rate or death rate at age x.
Life Table Model
A life table model is an alternative way to specify the distribution for age-at-death
random variable X. A standard life table usually contains tabulations of the basic
functions qx, lx, dx, and, possibly, additional derived functions for integer ages. For
illustration, see Table 1.1, which is extracted from U.S. Social Security Area Life
Tables (1992).
To construct a life table, we start with a given group of l0 newborns (l0 = 100, 000,
for instance). Suppose that each newborn’s age-at-death random variable follows a
distribution specified by s.f. s(x), and all newborns are mutually independent. Let
lx denote the expected number of survivors to age x from the l0 newborns. Then we
have
lx = l0 s(x). (1.1.15)
It is easy to see that lx is proportional to s(x) so lx can be viewed as the discrete
version of survival function s(x). Similarly, the expected number of deaths over each
age interval (x, x + 1] can be expressed as
dx = lx − lx+1    (1.1.16)
   = l0 [s(x) − s(x + 1)]    (1.1.17)
   ≈ l0 f(x).    (1.1.18)
Thus, the curve of deaths dx approximates the probability density function f(x).
Rewriting expression (1.1.16), we obtain
lx+1 = lx − dx    (1.1.19)
     = l0 s(x) − l0 [s(x) − s(x + 1)]    (1.1.20)
     = l0 s(x) [1 − (s(x) − s(x + 1))/s(x)]    (1.1.21)
     = lx (1 − qx).    (1.1.22)
Formulas (1.1.16) and (1.1.22) show that the table values lx and dx can be recursively
obtained for all ages, given the initial group size l0 and the mortality rates qx. In
practice, we usually don’t know the underlying survival function of a population. The
construction of a life table requires the estimation of mortality rates qx at all ages
from data. The procedure is as follows. First, define the central death rate over the
interval from x to x + 1, denoted by mx, as
mx = [∫_0^1 lx+t µ(x + t) dt] / [∫_0^1 lx+t dt] = (lx − lx+1) / Lx,    (1.1.23)
where Lx = ∫_0^1 lx+t dt is interpreted as the total expected number of years lived
between ages x and x+1 by survivors from the initial l0 newborns. Lx usually can be
estimated directly from population data, and mx as well. Then, assuming a uniform
distribution of deaths for ages greater than 1, qx can be obtained from mx by
qx = mx / (1 + 0.5 mx),   for x ≥ 1.    (1.1.24)
The mortality rate at age less than one has to be dealt with differently, and details
are omitted here.
The symbol Tx in a life table denotes the total number of years lived beyond age
x by the survivorship group with l0 initial members. We have
Tx = ∫_0^∞ lx+t dt = Lx + Lx+1 + Lx+2 + · · · .    (1.1.25)
Let ex denote the life expectancy of (x), i.e. the average number of years of future
lifetime lived by (x). ex can be calculated as
ex = Tx / lx.    (1.1.26)
Life expectancy is an important indicator for the mortality level of a population. It
has been widely used to measure overall mortality changes in a region or to compare
mortality differences between cohorts.
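To make the recursion (1.1.22) and the definitions of Lx, Tx and ex concrete, the sketch below builds a small life table from a vector of mortality rates qx, using the approximation Lx ≈ lx − 0.5 dx that follows from a uniform distribution of deaths within each year of age. The qx schedule is synthetic and serves only to illustrate the computation.

```python
# A minimal sketch of life-table construction from mortality rates q_x,
# using the recursion l_{x+1} = l_x (1 - q_x) and L_x ~ l_x - 0.5 d_x.
import numpy as np

l0 = 100_000
ages = np.arange(0, 111)
# Synthetic, Gompertz-like mortality rates (illustrative, not real data).
qx = np.minimum(0.0005 * np.exp(0.08 * ages), 0.999)
qx[-1] = 1.0                                # close the table at the highest age

lx = np.empty(len(ages))
lx[0] = l0
for x in ages[:-1]:
    lx[x + 1] = lx[x] * (1.0 - qx[x])       # formula (1.1.22)

dx = lx * qx                                 # expected deaths in (x, x+1]
Lx = lx - 0.5 * dx                           # person-years lived in (x, x+1]
Tx = Lx[::-1].cumsum()[::-1]                 # T_x = L_x + L_{x+1} + ..., formula (1.1.25)
ex = Tx / lx                                 # life expectancy, formula (1.1.26)

print(f"e_0 = {ex[0]:.2f},  e_65 = {ex[65]:.2f}")
```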
Finally, we would like to remark that life table models only provide partial infor-
mation on a survival distribution. To extract full distributional information from a
life table, one has to consider fractional age assumptions for death rates over each age
interval [x, x + 1). Nonetheless, due to data availability on human mortality, the life
table model is one of the most popular methods of expressing an age-specific mortality
pattern, and lx, dx and qx are the discrete counterparts of the continuous functions
s(x), f(x) and µ(x). In traditional life insurance, construction and projection of
life tables have been the central topics of mortality analysis. We will present some
historical results on age-specific patterns of human mortality (presented through life
tables) in the following section.
1.1.2 Empirical Studies on Mortality Trend
Over past centuries, human mortality has improved dramatically at all ages and
has shown many common features in its moving trend for different populations. In
this section, we will present some aspects of the past mortality experience referring
to Swedish male mortality data. All data used in this project are obtained from
www.mortality.org, unless otherwise specified.
Figure 1.1 illustrates the curves of deaths dx estimated at different times. As we
can see, 1) the mode of the curve of deaths moves towards older ages, and 2) the
dispersion of deaths around the mode reduces, giving rise to the so-called "expansion"
and "concentration" properties. These two properties of the curves of deaths correspond
to the survival function moving towards a rectangular shape (see Figure 1.2),
from which the term "rectangularization" comes. In Figure 1.3, logarithms of age-specific
mortality rates are presented, while in Figure 1.4 mortality rates qx above age 60 are
plotted against age x.
The overall mortality experience can also be depicted by other quantities such
as the life expectancy (at birth or at higher ages) and the mortality rate ratios. In
Figure 1.5, the behavior of the life expectancy at birth is compared with the life
expectancy at age 65. In Figure 1.6, the mortality ratios at ages 40, 60, 80, i.e. the
mortality rate qx(y) in various calendar years y divided by the mortality rate qx(1901)
in the year 1901, for x = 40, 60, 80, are presented.
Results are self-evident. In particular the following aspects can be pointed out: an
overall increase in the most probable age of death (i.e. the mode in deaths curve), an
overall decrease in mortality rates at all ages, an overall increase in the life expectancy
(at birth as well as at old ages). However, we also should notice that mortality changes
Figure 1.1: Curves of deaths dx for Swedish male population (years 1901, 1921, 1941, 1961, 1981, 2001)

Figure 1.2: Survival functions lx for Swedish male population (years 1901, 1921, 1941, 1961, 1981, 2001)
Figure 1.3: Log mortality rates ln(qx) for Swedish male population (years 1901, 1921, 1941, 1961, 1981, 2001)

Figure 1.4: Death rates qx, 60 ≤ x ≤ 110, for Swedish male population (years 1901, 1921, 1941, 1961, 1981, 2001)
Figure 1.5: Life expectancy at birth (e0) and at age 65 (e65) for Swedish male population, 1901–2001

Figure 1.6: Mortality ratio qx(y)/qx(1901) for Swedish male population, 1901–2001, for x = 40, 60, 80
Figure 1.7: Curves of deaths dx for Swedish male cohorts (birth years 1811, 1831, 1851, 1871, 1891, 1911)
are not happening evenly. Specifically, the rate of improvement in mortality rates has
varied significantly over time, and the improvement has varied substantially between
different age groups, as shown in Figure 1.5 and Figure 1.6. Furthermore, combining
these with Figure 1.3 and Figure 1.4, we can see that the mortality decreases in relative
terms are bigger at young adult ages than at very old ages, while the mortality
decrements in absolute values are bigger at very old ages than at younger ages in the
last century.
The above observations are made based on cross-sectional or period life tables.
A period life table is based on, or represents, the mortality experience of an entire
population during a relatively short period of time, usually one to three years. Life
tables based directly on population data are generally constructed as period life tables
because death and population data are most readily available on a time period basis.
Figure 1.8: Survival functions lx for Swedish male cohorts (birth years 1811, 1831, 1851, 1871, 1891, 1911)

Figure 1.9: Log mortality rates ln(qx) for Swedish male cohorts (birth years 1811, 1831, 1851, 1871, 1891, 1911)
Figure 1.10: Death rates qx, 60 ≤ x ≤ 100, for Swedish male cohorts (birth years 1811, 1831, 1851, 1871, 1891, 1911)

Figure 1.11: Life expectancy at birth (e0) and at age 65 (e65) for Swedish male cohorts, birth years 1811–1911
In contrast, a cohort life table is based on, or represents, mortality experience
over the entire lifetime of a cohort of persons born during a relatively short period
of time, usually one year. Cohort life tables based directly on population experience
data are relatively rare because of the need for data of consistent quality over a very
long period of time. Cohort tables can, however, be readily produced from a series of
period tables.
However, when it comes to actuarial calculations, cohort life tables are usually
more suitable than period tables. For example, when we wish to calculate the pre-
mium or reserves for some product issued to (x) in year y, the relevant life table
shall be the corresponding cohort life table for the cohort born in the year y − x.
For this reason, we reproduce the graphs we have demonstrated so far using cohort
life tables. See Figure 1.7 to Figure 1.11. Similar properties of the mortality trends
can be observed in those graphs as in the period life tables, though usually to
somewhat different degrees.
Before we end this section, we would like to draw readers' attention to the ways
in which random shocks appear differently in mortality profiles based on period or
cohort life tables. In the course of mortality improvement, catastrophic events like
the tsunami of December 2004, or epidemic diseases like the influenza pandemic of
1918, may cause a sudden rise in mortality over a short time period. If period life
tables are used, these kinds of shocks are reflected as the whole curve (of lx, dx, or qx) in
that particular year being distorted from its normal shape, as shown in Figure 1.12 for
the effect of the 1918 flu outbreak, based on the logarithm of the mortality rate (log(qx)).
It is not hard to imagine how the curves of lx and dx are affected accordingly.
For cohort life tables, this kind of shock will be reflected as the scattered jumps
Figure 1.12: Log mortality rates ln(qx) for Swedish males, years 1901–2001, with the year 1918 curve shown separately
on subsequent cohort curves (see again Figure 1.7 and Figure 1.9 for the detectable
jumps). The overall effect is that the time series of e0 based on cohort life
tables is much more stable than the corresponding time series based on period tables.
1.1.3 Mathematical Mortality Models
The development of a law of mortality, in which an analytic expression is used to
represent all or part of the age pattern of mortality (in terms of, say, µ(x) or qx), has
been of interest since the development of the first life table, which was compiled by
the renowned English astronomer Edmund Halley (1693).
Probably the first mathematical mortality model is the one proposed by Abraham
De Moivre in 1725, who suggested that the probability of survival from birth until
age x could be expressed as a linear function of age. In terms of the hazard rate, the
model can be written as:
µ(x) = 1 / (ω − x),   0 ≤ x < ω,
where ω is the highest attainable age.
However, the most successful and influential mortality law belongs to Benjamin
Gompertz. In 1825, Gompertz found that an exponentially increasing function of age can
approximately capture the behavior of human mortality rates over large portions of the life
table. He therefore proposed, in terms of the hazard rate, that:
µ(x) = A exp (θx), x > 0,
where A and θ are positive constants.
The Gompertz law has played a central role in the development of theoretical
hypotheses about the pattern of mortality. The close fit of the Gompertz function
to empirical data seems to suggest that a law of mortality may exist to explain the
age patterns of death for human populations. This has stimulated much research
to modify or generalize the Gompertz formula.
For example, in 1860, William Makeham noticed that the Gompertz equation
failed to capture the behavior of mortality at higher ages and added a constant term,
B, in order to correct for this deficiency. The constant can be thought of representing
the risk of death by causes that are independent of age. Hence Makeham’s model can
be expressed as
µ(x) = B + A exp (θx).
Another good extension from Gompertz’s model is Perks’s model (1932),
µ(x) = [B + A exp(θx)] / [1 + C exp(θx)],    (1.1.27)
which allows the curve to more closely approximate the slower rate of increase in
mortality at older ages.
In 1963, Beard also developed a model reflecting the effect of heterogeneity in
mortality risks on the shape of the increase in mortality rates:
µ(x) = A exp(θx) / [1 + C exp(θx)].    (1.1.28)
In 1980, Heligman and Pollard extended Gompertz’s law to an eight parameter
formula that can better fit the whole mortality curve,
qx / px = A^((x+B)^C) + D exp(−E (ln x − ln F)²) + G H^x,    (1.1.29)
where qx is the probability that a person at age x will die before attaining age x + 1,
px = 1 − qx, and A to H are parameters.
So far, we have introduced the models that are based on or proceed from Gom-
pertz’s idea. Those models constitute an important field in traditional mortality
study.
Weibull’s model is something developed along a different line. In 1951, Weibull
proposed a model which originally may be used as a failure model due to wear and
tear of a technical system in engineering. The analog is obvious: death occurs when
the (first) failure of human organs occurs. In terms of the hazard rate, Weibull’s
model is
µ(x) = Axθ.
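For reference, the sketch below collects the hazard functions of the laws above as plain Python functions; the parameter values in the example call are arbitrary placeholders rather than fitted values.

```python
# A minimal sketch of the classical mortality laws as hazard (or odds) functions.
# All parameter values supplied by the caller are illustrative placeholders.
import numpy as np

def de_moivre(x, omega):
    return 1.0 / (omega - x)                       # mu(x) = 1/(omega - x), 0 <= x < omega

def gompertz(x, A, theta):
    return A * np.exp(theta * x)                   # mu(x) = A e^{theta x}

def makeham(x, A, B, theta):
    return B + A * np.exp(theta * x)               # mu(x) = B + A e^{theta x}

def perks(x, A, B, C, theta):
    return (B + A * np.exp(theta * x)) / (1.0 + C * np.exp(theta * x))   # (1.1.27)

def beard(x, A, C, theta):
    return A * np.exp(theta * x) / (1.0 + C * np.exp(theta * x))         # (1.1.28)

def weibull(x, A, theta):
    return A * x ** theta                          # mu(x) = A x^theta

def heligman_pollard_odds(x, A, B, C, D, E, F, G, H):
    """The mortality odds q_x / p_x of formula (1.1.29)."""
    return A ** ((x + B) ** C) + D * np.exp(-E * (np.log(x) - np.log(F)) ** 2) + G * H ** x

x = np.arange(30, 91, 10)
print(gompertz(x, A=1e-4, theta=0.09))             # Gompertz hazard at selected ages
```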
A distinguishing feature of all the above models is that they describe mortality
in an age-continuous context. Moving from life tables to mathematical formulas is
a very important step. In this section, we only intend to list a few well-recognized
models from our perspective, and some of them will be applied to develop projection
methods in the next section. For readers who are interested in a good review of the
development of mathematical models, please refer to Higgins (2003), Pitacco (2003),
and the references therein.
1.2 Mortality Projection Methods Review
Since life annuities and other life benefits are products involving future mortality rates
of lives, mortality improvement has to be carefully considered when it comes to
actuarial calculations, such as pricing or reserving. Ignoring mortality improvement
seriously underestimates the value and related liabilities of those products.
Without doubt, the time series of various mortality profile curves have made
a strong impression on us: there exists a discernible downward trend with some
minor fluctuations in the evolution of age-specific mortality patterns over time. This
impression certainly has affected actuarial research and the actuarial profession. For a
long time, actuarial science has been taking a deterministic approach when mortality
trends are concerned, using projected life tables for actuarial calculations, for instance,
the CMIR tables, and SSA Life tables.
Here, the term “deterministic” means that the projected life tables are constructed
and used without accounting for any uncertainty. Conversely, a projection model with
consideration of uncertainty will be categorized as a stochastic model.
A projected mortality model aims at describing a future age-specific mortality
pattern, based on analyzing the past mortality trend. The basic idea of projecting
mortality is to first express some age-specific measure of mortality as a function of
both age x and calendar year y, denoted by Ψ(x, y). The relevance or appropriate-
ness of the model is usually justified by applying statistical procedures to the past
available data of Ψ(x, y). Then parameters in the expression of Ψ(x, y) are estimated
and extrapolated to obtain a projecting model. As a result, the projecting methods
developed in this way are also referred to as extrapolative projection models.
In concrete terms, Ψ(x, y) may represent mortality rates, mortality odds, central
death rates, the force of mortality, survival function, some transform of the above
functions, etc. Sometimes, Ψ(x, y) can be viewed as entries in a matrix whose rows
correspond to ages and columns to calendar years. For example, let Ψ(x, y) = qx(y).
Then, the mortality rates can be read according to three arrangements:
(1) a “vertical” arrangement (i.e. by columns),
q0(y), q1(y), · · · , qx(y), · · ·
corresponding to a sequence of period life tables, each table referring to a given
calendar year y;
(2) a “diagonal” arrangement,
q0(y), q1(y + 1), · · · , qx(y + x), · · ·
corresponding to a sequence of cohort life tables, each table referring to a cohort
born in year y;
(3) a “horizontal” arrangement (i.e. by rows),
· · · , qx(y − 1), qx(y), qx(y + 1), · · ·
yielding a time series of mortality rates referring to a given age x.
As will emerge from the discussion below, thinking in terms of the various arrange-
ments can help in understanding different approaches to the interpolation of mortality
data.
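The three arrangements are easy to express as array operations. In the sketch below, Q is a hypothetical matrix of mortality rates with rows indexed by age x and columns by calendar year y; the column, diagonal and row slices correspond to the period, cohort and age-specific readings, respectively.

```python
# A minimal sketch of the three readings of a mortality-rate matrix Q[x, y]:
# rows are ages, columns are calendar years; all rates here are hypothetical.
import numpy as np

ages = np.arange(0, 101)                      # ages 0..100
years = np.arange(1901, 2001)                 # calendar years 1901..2000
# Hypothetical rates: Gompertz-like in age with a mild downward trend over time.
Q = np.minimum(1.0,
               5e-4 * np.exp(0.085 * ages)[:, None]
                    * np.exp(-0.01 * (years - years[0]))[None, :])

# (1) "vertical" (by columns): the period life table for calendar year 1950.
period_1950 = Q[:, years == 1950].ravel()

# (2) "diagonal": the cohort born in 1901, i.e. q_x(1901 + x).
cohort_1901 = np.array([Q[x, x] for x in ages if x < len(years)])

# (3) "horizontal" (by rows): the time series q_65(y) across calendar years.
age_65_series = Q[65, :]

print(period_1950[60], cohort_1901[60], age_65_series[49])
```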
Although it seems quite reasonable that mortality projections are based on past
mortality experience, a number of broad projection approaches exist, for instance,
models based on underlying biomedical processes, causal models involving economet-
ric relationships, etc. In practice, most adopted projection methods fall into the
category of extrapolative models, and we will only review these types of models in
this section.
1.2.1 Deterministic Projection Models
The types of extrapolative models can be briefly summarized as follows:
(a) Models based on the independent projection of age-specific mortality or hazard
rates, including mortality reduction factor models (CMI, 1990, 1999; Willets,
1999; Renshaw and Haberman, 2000).
(b) Models based on the projection of parameters for some adopted mathemat-
ical law, including Gompertz-based projecting model (Wetterstrand, 1981),
Makeham-based projecting model (Cramer and Wold, 1935), and Heligman-
Pollard-based projecting model (Forfar and Smith, 1988; Benjamin and Pollard,
1993).
(c) The model tables or relational models that associate life table measures with
some standard life tables, including Brass’ logit model (1974) and Lee-Carter’s
model (1992).
Age-by-Age Basis Projection
As we have shown in the previous section, mortality changes are not happening
evenly: the rate of improvement in mortality rates has varied significantly over time,
and the improvement has varied substantially between different age groups. For
this reason, many professional projections prefer method (a) and model the future
mortality rates on an age-by-age basis. For example, the projection formula currently
used by CMIB for annuitants and pensioners mortality tables assumes the following:
qx(y) = qx(y0) {α(x) + [1 − α(x)] · [1 − f(x)]^(y−y0)},
where y0 denotes the base year of the projection, and α(x) qx(y0) represents the limit-
ing mortality rate at age x (as a percentage of the base year mortality rate). Basically,
this model assumes that mortality rate qx(y) (at age x) in calendar year y decreases
exponentially to the limiting value, with 1− f(x) specifying the speed of convergence
to approach this limit; hence, 1− f(x) is often referred to as “reduction factor”. De-
termination of α(x) and f(x) is based on the analysis of historical trends, sometimes
combined with expert (scientific and/or subjective) opinion on recent developments in
medical science, trends in the incidence of diseases, and so on.
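A hedged sketch of this age-by-age reduction-factor projection is given below. The base rates, α(x) and f(x) values are invented placeholders rather than CMIB figures; the formula is the one quoted above.

```python
# A minimal sketch of an age-by-age reduction-factor projection:
# q_x(y) = q_x(y0) * { alpha(x) + [1 - alpha(x)] * [1 - f(x)]^(y - y0) }.
# All inputs below are illustrative placeholders.
import numpy as np

y0 = 2000
q_base = np.array([0.010, 0.012, 0.014])      # q_x(y0) for three sample ages
alpha = np.array([0.30, 0.35, 0.40])          # limiting proportion of the base rate
f = np.array([0.020, 0.025, 0.030])           # 1 - f(x) is the annual reduction factor

def project_q(year):
    t = year - y0
    return q_base * (alpha + (1.0 - alpha) * (1.0 - f) ** t)

for year in (2000, 2010, 2030):
    print(year, project_q(year))              # converges towards alpha * q_base
```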
Many projections are of this kind. Examples include the early model proposed by
the Institute of Actuaries in London in 1924
qx(y) = ax + bx cx^y
and the later target model used by the GAD for the 1992-based projections, with a similar
formula of exponential interpolation from current levels to the target level. Since
this method allows each age-specific rate to change at its own individual rate, the
projected age profile of mortality may depart from the plausible, historically observed
pattern (Keyfitz, 1981). Subjective modification may be necessary when implausibility
occurs. Another disadvantage is that the number of parameters to be estimated is
very high, equal to the number of age groups multiplied by the number of parameters
in each formula.
Parameter-by-Parameter Basis Projection
When a mathematical mortality law is used to summarize the age pattern of mor-
tality, the high “dimension” of the forecasting problem can be dramatically reduced.
In the last decades of the 20th century, various mortality-law-based projection methods
have been considered. These include Wetterstrand (1981)’s Gompertz-based pro-
jection, Poulin (1980)’s Makeham-based projection, and Forfar and Smith (1988)’s
Heligman and Pollard-based projection (see also Benjamin and Soliman, 1993). In
such a case, the age pattern of mortality in the calendar year y is expressed via the
parameters of the mathematical law. Hence, the projection procedure is applied to
the set of parameters, instead of the set of age-specific mortality rates.
To illustrate the basic idea of the law-based projection method, we use the Makeham-based
model, which defines a dynamic Makeham law as
µx(y) = γ(y) + α(y) β(y)^x.    (1.2.1)
Here, the three parameters are viewed as functions of the calendar year y. The
projecting procedure involves first applying the formula to data for each period table
to obtain estimated values γ(y), α(y) and β(y) for calendar year y, and then fitting these
estimated values in the "horizontal" direction to obtain an extrapolative formula for each
parameter. Note that formula (1.2.1) can also be used on a cohort basis if the empirical
data support the model, as proposed by Davidson and Reid (1927), where y then
represents the birth year.
Although the law-based projection method reduces the dimension of the forecast-
ing model to a few parameters, another problem may arise when projecting those
parameters because there usually exist some complex correlations among those pa-
rameters. Moreover, this method requires that the proposed formula describes the
mortality pattern in a consistently satisfactory way in the past as well as in the
future. Sometimes this may not be true, thus implying high model risk in this ap-
proach. As a result, this method may generate implausible projected mortality trends
for age-specific mortality rates.
Pattern-by-Pattern Basis Projection
An alternative approach to summarizing the age pattern of mortality without
resorting to mathematical laws is the use of “model tables”. Due to the fact that
mortality demonstrates different patterns at different historical stages, it may not be
realistic to use a common mathematical law to represent changing mortality patterns.
Rather, it may be more appropriate if a specific representative of mortality table is
used when mortality reaches a certain level. This is the idea of model tables.
The first set of model tables was constructed by the United Nations in 1955, with
mortality level indicated by a "marker", where the marker is the expectation of life
at birth, e0. Model tables can be used for mortality projection as follows. First, a
set of model tables is chosen, representing the mortality for a given population in
several past stages, and also in the projected period for that population. Trends
in the marker are then analyzed and projected, possibly using some mathematical
formula, to predict their future values. Then, the projected age-specific mortality
rates are obtained by combining the projected values of the markers with the system
of model tables accordingly.
The idea of model tables is very important since, for the first time, it took the
viewpoint that mortality projection amounts to forecasting the "correct" pattern and the "correct"
level of future mortality. It separated the projection task into analyzing
two components: pattern and level.
This idea has been developed into the so-called “relational method” by W. Brass
in 1974, who focussed on the logit transform of the survival function, namely
Λx = (1/2) ln[(1 − s(x)) / s(x)].
Brass noted empirically that the mortality pattern, expressed by Λx, has a very nice
linear relationship with Λx^stand, the logit pertaining to a "standard" population, i.e.
Λx = α + β Λx^stand,    (1.2.2)
whose parameters are (almost) independent of age. Thus these parameters can serve
as the indices for the represented population.
For the purpose of projecting mortality, formula (1.2.2) can be used in a dynamic
sense. In a dynamic survival modelling context, the Brass logit transformation is
particularly interesting since the empirically linear property of logits applies well
when referring to successive birth-year cohorts. That is, denoting by Λx(y) the logit
of the survival function for the cohort born in year y, s(x, y), we have
Λx(y + 1) = αy + βy Λx(y) (1.2.3)
Again, parameters αy and βy are assumed independent of age, but vary for different
cohorts. So, the problem of projecting mortality converts to the problem of projecting
two parameters αy and βy which carry the information about how mortality will
change from one pattern to the other. Projected values of various life table functions
can then be derived from the inverse logit transformation:
s(x, y) = 1 / (1 + exp(2Λx(y))).
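The logit transform and its inverse are straightforward to compute. The sketch below applies the relational formula (1.2.2) to a hypothetical standard survival curve with illustrative values of α and β; none of these inputs are taken from Brass's work.

```python
# A minimal sketch of the Brass relational (logit) method.
# The standard survival curve and the (alpha, beta) values are illustrative only.
import numpy as np

def brass_logit(s):
    """Lambda_x = 0.5 * ln((1 - s(x)) / s(x))."""
    return 0.5 * np.log((1.0 - s) / s)

def inverse_logit(lam):
    """s(x) = 1 / (1 + exp(2 * Lambda_x))."""
    return 1.0 / (1.0 + np.exp(2.0 * lam))

ages = np.arange(1, 100)
s_standard = np.exp(-((ages / 90.0) ** 6))     # hypothetical "standard" survival curve
alpha, beta = -0.10, 1.02                      # illustrative relational parameters

lam_related = alpha + beta * brass_logit(s_standard)   # formula (1.2.2)
s_related = inverse_logit(lam_related)

print(s_related[[29, 64, 84]])                 # related survival at ages 30, 65, 85
```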
The main feature of model tables or relational models is that they project mortal-
ity on a pattern-by-pattern basis. Structurally, methods on a pattern-by-pattern basis
are preferable since a plausible age pattern of mortality can usually be maintained. This
is unlike the situation with age-by-age or parameter-by-parameter projections. This
approach works well when the assumed relationship continues to hold over the projected
period; otherwise it may result in model risk. This is the same as for the other extrapolative
methods.
Further Discussion
Model risk is actually the type of risk inherent in an extrapolative projection
approach in general. When one employs a parametric model to extrapolate the past
trends into the future, the implicit assumption is that the historical patterns will
still hold for the future and no structural change will occur. This is certainly not
true. Over the past century, we have observed the changes of mortality patterns
due to the transition in major causes of death from infectious diseases to chronic
diseases (see Tuljapurkar and Boe, 1998). Reflected in death rates, those pattern
changes can be seen from the crossover in mortality decline rates between different
age groups in different historical periods (see Figure 1.6). Pure extrapolation may
result in systematic underestimation or overestimation of mortality improvement.
Therefore, we must keep in mind that extrapolative projection only describes one
scenario of the future, one in which the past trends continue. It is also important to inquire
about the ways in which, and the degree to which, the future will differ from the past.
The answer to such an inquiry should be probabilistic. As we have discussed before,
mortality changes can involve both pattern changes and level changes. Some of those
changes are due to continuous development (for example, general improvement in
nutrition), but some are a result of discontinuous shocks (such as the introduction of
antibiotics). As a result, some are predictable (or easier to predict), while others
are not. Consequently, uncertainty is inherent in the process of mortality
development, no matter what projection method is taken. This is the motivation
for the stochastic projection approach becoming an active topic in recent actuarial
research. It is critical to ask the following questions about any projection method:
to what extent can future mortality be approximated by the proposed model? What
is the chance that future mortality may deviate from the projected trend, to what
degree, and how do we measure it?
1.2.2 Stochastic Projection Models
The projection procedures proposed before 1990 mainly focus on providing point esti-
mates of future mortality rates (or other age-specific quantities). Although concerns
for the mortality uncertainty in future trends have grown rapidly, those models don’t
facilitate the discussion of this aspect. This is a consequence of the way in which
the models were designed.
For example, McNown and Rogers (1989) extended the Heligman and Pollard-based
projection model to a stochastic projection model by allowing the dynamics of the 8
estimated parameters A to H (see formula (1.1.29)) to be modelled by ARIMA
processes. Although their model can give forecasts of future mortality rates, the
complexity of the parameter relationships makes it impossible to measure forecasting
errors. This is why the projection models given in the last section cannot simply be
extended to accommodate the stochastic feature.
The Lee-Carter Model
In 1992, Lee and Carter proposed a remarkably simple model which seemed to
solve the trade-off between plausibility of the projected age pattern and ease of mea-
suring the uncertainty. The Lee-Carter method works with the central death rates.
Let mx(t) denote the age-specific central death rate for age x at time t. Lee and Carter assume
that the logarithms of the central death rates satisfy
ln mx(t) = ax + bx k(t) + εx,t,    (1.2.4)
for appropriately chosen sets of age-specific constants ax and bx, and time-varying
index k(t).
• The age parameters ax’s can be interpreted as describing the general shape of
ln mx(t) across age, while the time parameter k(t) describes the variation in the
mortality level with time t. If k(t) falls, mortality improves, and if k(t) rises,
mortality worsens.
• The coefficient bx determines how this level change in mortality affects the rate
at a specific age. If the bx is particularly high for some age x, then this means
that the mortality rate improves faster at this age than in general. If it were
negative at some ages, this would mean that mortality was getting worse at
those ages. If bx’s were all equal then mortality rates would decline at the same
rate. As the bx’s are controlled to sum to 1, this is a relative measure.
• The error term εx,t reflects the particular age-time variation of historical influences
not captured by the model. If the model specification is correct, the εx,t should be
i.i.d. random variables with mean 0 and variance σ²_ε.
It is worth noting that if k(t) decreases linearly, then mx(t) decreases exponentially
at each age, at a rate that depends on bx, thus reducing to a special age-by-age
projection model. However, the Lee-Carter model is very different from age-by-age
projection. Under the Lee-Carter model setting, the age specific rates are determined
by the three parameter sets ax, bx and k(t) together. As a result, the parameter k(t)
itself is a kind of compromise among the trends in all the individual age-specific rates.
This will lead to different forecasts of the individual rates than would be obtained
by modelling them individually. In this sense, the Lee-Carter model is in essence a
relational model, and the future mortality pattern can be generated by the forecasts
for parameter k(t), combined with the estimates of ax and bx (obtained from the model
fitting step). The Lee-Carter model thus provides a parsimonious way to express the
pattern change of mortality in terms of the variation of a single variable k(t).
The most distinguishable aspect of the Lee-Carter model is that the model allows
for uncertainty in forecasts. In fact, the variable k(t) is intrinsically viewed as a
stochastic process (not as a deterministic quantity that can be expressed by some
interpolative formula as in the previous section), thus the values of k(t) form a time
series over time.
Standard statistical procedure (see Box and Jenkins, 1970) can be applied to find
an appropriate autoregressive integrated moving average (ARIMA) model for the time
series of k(t). In Lee and Carter’s original paper, k(t) is found to decline at a roughly
constant rate and have roughly constant variability, therefore can be well modelled
using a simple random walk with drift. That is
k(t) = k(t − 1) − c + et,    (1.2.5)
or
k(t) − k(0) = −ct + ∑_{0<s≤t} es.    (1.2.6)
In this specification, c is the drift term, and k(t) declines linearly with increments
of size c. The deviation from this path is captured by a white noise et. This simple
expression again can facilitate the forecasting and the discussion of uncertainty.
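A hedged sketch of the usual Lee-Carter estimation recipe follows: ax is the time average of ln mx(t); bx and k(t) come from the leading term of a singular value decomposition of the centred log rates, with bx normalized to sum to 1; and k(t) is then fitted as the random walk with drift (1.2.5). The log-rate matrix here is synthetic, and Lee and Carter's own procedure involves further adjustment steps not shown.

```python
# A minimal sketch of Lee-Carter estimation via SVD and a random-walk-with-drift
# fit for k(t).  The log central death rates below are synthetic placeholders.
import numpy as np

rng = np.random.default_rng(1)
n_ages, n_years = 90, 50
ages, years = np.arange(n_ages), np.arange(n_years)
# Synthetic log rates: an age profile plus a downward time trend plus noise.
log_m = (-9.0 + 0.09 * ages)[:, None] - 0.02 * years[None, :] \
        + 0.05 * rng.standard_normal((n_ages, n_years))

a = log_m.mean(axis=1)                           # a_x: time average of ln m_x(t)

U, S, Vt = np.linalg.svd(log_m - a[:, None], full_matrices=False)
b = U[:, 0]
k = S[0] * Vt[0, :]
scale = b.sum()                                  # impose sum_x b_x = 1
b, k = b / scale, k * scale

increments = np.diff(k)                          # fit k(t) = k(t-1) - c + e_t
c = -increments.mean()                           # drift c
sigma_e = increments.std(ddof=1)                 # innovation standard deviation

s = 20                                           # forecast horizon (years)
k_forecast = k[-1] - c * s
log_m_forecast = a + b * k_forecast              # point forecast of ln m_x(t+s)
print(f"drift c = {c:.3f}, sigma_e = {sigma_e:.3f}, ln m_65(t+{s}) = {log_m_forecast[65]:.3f}")
```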
Measure of Uncertainty in the Lee-Carter Model
Now assume an ARIMA process (1.2.5) has been recognized as the model for k(t).
Let k(t + s) be the s-period-ahead forecast from base year t. Then the s-period-ahead
forecast of ln(mx(t + s)) is given by
ln(mx(t + s)) = ax + bx k(t + s).    (1.2.7)
The true value of ln(mx(t+s)), assuming the model specification and data are correct,
is given by
ln(mx(t + s)) = (ax + αx) + (bx + βx)(k(t + s) + ut+s) + εx,t+s. (1.2.8)
where αx and βx are the errors in estimating ax and bx respectively, εx,t+s is the error
in fitting the model for age group x, and ut+s is the error in forecasting k ahead s
periods from base year t.
The total forecast error, Ex,t+s, is the difference between expression (1.2.8) and
(1.2.7):
Ex,t+s = αx + εx,t+s + (bx + βx)ut+s + βxk(t + s). (1.2.9)
Unfortunately, the correlation among these different sources of error is not clear.
Hence, in the computation of the variance of the forecast error, we have to assume
independence between all the terms on the right hand side of equation (1.2.9). Under
this assumption, the variance of Ex,t+s is given by
σ²_{E,x,t+s} = σ²_{α,x} + σ²_{ε,x,t+s} + (b²_x + σ²_{β,x}) σ²_{k,t+s} + σ²_{β,x} k²(t + s),    (1.2.10)
where σ²_{α,x} and σ²_{β,x} are the variances of the errors αx and βx in estimating ax and bx,
respectively; σ²_{ε,x,t+s} is the variance of εx,t+s, and σ²_{k,t+s} is the variance of ut+s.
Now we specify how to estimate the variance of each component in the right hand
side of equation (1.2.10). Since ax is the average over time of the log of the death rate
for age x, its error variance is the variance of ln(mx(t)) divided by T , the number of
observations of mx. σ²_{ε,x,t+s} can be estimated by the variance of the error in fitting age
group x within the sample period. σ²_{β,x} can be obtained by a small-scale bootstrap
(see Lee and Carter (1992) for details). The variance σ²_{k,t+s} depends on the form of
the ARIMA process in general. In terms of formula (1.2.5), we have
σ²_{k,t+s} = s σ²_e + s² σ²_c,
where σ²_e is the variance contained in forecasting k due to et, and σ²_c is the variance
of the error in estimating the drift term c.
The numerical results (see Lee and Carter, 1992, Table B1 in Appendix B) actually
show that σ²_{k,t+s} is significantly bigger than all the other variances put together, and
becomes dominant when the forecasting horizon gets long enough, say more than 15
years. That means the error in forecasting the mortality index dominates
the errors from the other sources. Hence the variance of the forecast
error, Ex,t+s, can be approximated solely by the error of forecasting k.
To generate the interval forecast, we need to assume normality for variable Ex,t+s.
This gives the following 95% confidence interval estimate for mx(t + s):
(mx(t + s) e^{−1.96 σ_{E,x,t+s}},  mx(t + s) e^{+1.96 σ_{E,x,t+s}}).    (1.2.11)
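Continuing in the same spirit, the snippet below evaluates the dominant variance term σ²_{k,t+s} = s σ²_e + s² σ²_c and the interval (1.2.11), approximating the total forecast error by the error in forecasting k alone, as the text suggests. All fitted quantities are illustrative placeholders rather than outputs of an actual Lee-Carter fit.

```python
# A minimal sketch of the forecast-error variance and the 95% interval (1.2.11),
# keeping only the dominant k-forecast error term of (1.2.10).
# The "fitted" inputs below are hypothetical placeholders.
import numpy as np

a_x, b_x = -4.5, 0.012         # a_x and b_x for one age group
k_last = -15.0                 # last fitted value of the mortality index k(t)
c, sigma_e = 0.35, 0.60        # drift and innovation s.d. of the random walk (1.2.5)
sigma_c = 0.05                 # standard error of the estimated drift c

s = 20                                         # forecast horizon in years
var_k = s * sigma_e**2 + s**2 * sigma_c**2     # sigma^2_{k,t+s}
sigma_E = abs(b_x) * np.sqrt(var_k)            # sigma_{E,x,t+s} ~ |b_x| * sigma_{k,t+s}

m_forecast = np.exp(a_x + b_x * (k_last - c * s))    # central forecast of m_x(t+s)
lower = m_forecast * np.exp(-1.96 * sigma_E)         # interval (1.2.11)
upper = m_forecast * np.exp(+1.96 * sigma_E)
print(f"{lower:.5f} < m_x(t+s) = {m_forecast:.5f} < {upper:.5f}")
```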
In summary, Lee and Carter proposed a simple linear transformation (formula
(1.2.4)) to represent the age-specific mortality pattern change in terms of the varia-
tion of a period-specific mortality level index k(t). More strikingly, k(t) empirically
declines in a linear manner during the period of interest, and thus can be modelled as
(1.2.5). As a result, the uncertainty for the forecasted mortality rate can be easily
derived from the variance of k (given in formula (1.2.10) or (1.2.11)).
Further Discussion
The Lee-Carter methodology represents one of the most influential proposals in
the field of mortality forecasts. It is designed to address the issue of measuring
the uncertainty in mortality forecasting. However, because the uncertainty given
by (1.2.10) does not reflect uncertainty about whether the model specification is
correct, nor uncertainty about whether the future will look like the past, many people
believe that the confidence intervals (1.2.11) given by the Lee-Carter model are too narrow
(Renshaw and Haberman, 2003; Booth, Maindonald, and Smith, 2002; Li, Hardy, and
Tan, 2006). This narrowness may result in underestimation of the risk of more extreme
outcomes, and this may defeat the original purpose of moving on to a stochastic
framework, as commented in Li, Hardy, and Tan (2006).
This narrowness of the confidence intervals can be interpreted as the model risk
inherent in the extrapolative method as we have discussed before. The Lee-Carter
methodology will only work well when the mortality change maintains the same pat-
tern over the fitting and projecting period. When a changeover of mortality from one
pattern to another takes place, this method will fail. The model mis-specification
can be tested on the error term ε_{x,t}. It is found that there exist substantial and
persistent correlations across age groups when fitting the Lee-Carter model to U.S.
death rates from 1933 to 1987. This is a negative result because the ε_{x,t} are supposed to
be independent of one another. For further model examinations, see Lee and Miller
(2001) for details.
Another weakness of the Lee-Carter model concerns forecasting errors for more
integrated quantities, such as the life expectancy e_x. Since e_x is a nonlinear function
of µ(x) (or m_x(t)) and k(t), the computation of the standard errors requires
the use of asymptotic approximations or a bootstrap procedure. The complexity of
such computations is of the same order of magnitude as would be involved under
other parametric forecasting methods like McNown and Rogers' (1989). Therefore, in
practical applications, the Lee-Carter model still cannot provide an analytic form for its
solution, and numerical illustration requires cumbersome simulation.
Finally, we would like to remark that, regarding the use of projected life tables in
actuarial applications, cohort life tables are directly relevant. However, most proposed
methods (including the Lee-Carter’s model) are based on analyzing period data. This
choice mainly depends on the availability of data and/or the empirical facts. Projected
period life tables are often constructed first. Cohort tables are then generated by
reading the matrix in a diagonal direction as we have discussed before. Though it
may not be a big issue to obtain cohort tables from period tables, it is much more
difficult to transfer the uncertainty measure from the period to the cohort setting
for life table functions, such as the life expectancy e_x, the survival probabilities _s p_x and so on.
This is because, when passing from errors in forecasting the age-specific death rates
to errors in forecasting life table functions, the error in forecasting k applies to all
age groups in the period setting, whereas the autocovariance structure of the errors in k
has to be taken into account in the cohort setting, which is less straightforward.
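As a small illustration of reading a projected period table diagonally to obtain cohort quantities, the following Python sketch converts a wholly hypothetical period table of death probabilities q_x(t) into cohort survival probabilities; the numbers are invented for illustration only.

import numpy as np

# Hypothetical projected period table: rows are ages 65..68, columns are years 0..3.
q_period = np.array([
    [0.0150, 0.0147, 0.0144, 0.0141],   # q_65(t)
    [0.0168, 0.0164, 0.0161, 0.0158],   # q_66(t)
    [0.0188, 0.0184, 0.0180, 0.0176],   # q_67(t)
    [0.0211, 0.0206, 0.0202, 0.0198],   # q_68(t)
])

def cohort_survival(q, start_age_idx=0, start_year=0, horizon=4):
    """Read the period table along a diagonal: the cohort at age index
    start_age_idx in calendar year start_year uses q_{x+j}(start_year + j)."""
    probs = []
    surv = 1.0
    for j in range(horizon):
        q_diag = q[start_age_idx + j, start_year + j]
        surv *= (1.0 - q_diag)
        probs.append(surv)        # j-year cohort survival probabilities
    return probs

print(cohort_survival(q_period, start_age_idx=0, start_year=0, horizon=4))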
1.2.3 Assessing Mortality Projection Models
Criteria for assessing forecasting methodologies (see Keyfitz 1981, GAD 2001) include
• the accuracy with which the forecasts match the eventual realizations of the
actual data
• the ability of the model to generate measures of forecasting uncertainty,
• the transparency of assumptions used to generate the forecasts,
• the quality of the data on which the forecasts are based,
• the ease of use of the model, robustness of the model, needs of users, and so on.
The Government Actuary's Department (2001) has carried out an extensive examina-
tion of various projection methods. Here, due to the limit of space, we only quote
the accuracy test based on the Lee-Carter model, which is viewed as one of the
most popular methods, is highly credited for satisfying the assessment criteria well, and
is likely the only model capable of providing an uncertainty measure.
To test the accuracy, the Lee-Carter method is fitted to base data for England
& Wales from 1941 to 1970. The estimated model is then used to project mortality
rates for the period from 1971 to 1999. They then calculate the ratio of actual over
projected rates and plot the numbers using a Lexis chart developed by Andreev (2001).
We reproduce their Figure 9 for males from GAD Report No. 8 here.
The presence of large areas of dark blue and dark red reveals that projected age-
specific rates were very different from the actual rates. In particular, projected rates
are lower than actual rates at ages up to around 45 and higher at older ages, through-
out the period 1971 to 1999. Moreover, this phenomenon is also found for the female
projections and for projections based on different base data. Therefore, we assert
that severe systematic deviation exists in the projections produced by the Lee-Carter
model.
Sources of Uncertainty
Several sources of uncertainty may influence the modelling of mortality rates, and
their projection into the future. Three well-known categories associated specifically
with the use of statistical models (see Cairns, 2000) are:
(a) uncertainty due to the stochastic nature of a given model (that is, a stochastic
model produces randomness in its output), namely the “process risk”;
(b) uncertainty in the values of the parameters (if we have a finite set of data then
we cannot estimate parameter values exactly), giving rise to the "parameter risk";
(c) uncertainty in the model underlying what we can observe (the actual trend is not
represented by the proposed model); thus the so-called "model risk" arises.
If all types of uncertainties have been well addressed, we expect only unsystem-
atic errors in the comparison of projected rates and actual rates. In concrete terms
(see Figure 1.13), when we put projected mortality rates (the continuous line) at a
given age x with its possible future mortality experience (the dots) together, we ex-
pect pattern 1, where deviations from the projected mortality rates can be sensibly
explained in terms of random fluctuations of the outcomes (the observed mortality
rates) around the relevant expected values (the projected mortality rates). The ex-
perience depicted in pattern 2 can be hardly attributed to random fluctuations only.
More likely, this profile can be explained as the result of an actual mortality trend
different from the forecasted one.
Similar systematic deviations can also be pictured for a fixed cohort but along the
age axis.
Figure 1.13: Experienced Mortality with its Projection (Pattern 1: random fluctuation; Pattern 2: systematic deviation; vertical axis: q_x(y))
In the Lee-Carter model, the mortality pattern is estimated from past experience,
expressed in terms of ax and bx, and fixed throughout the projecting period. Mortality
uncertainty in the Lee-Carter model setting is mainly attributed to the random nature
of index k(t), i.e. the process risk of the mortality level. As we have analyzed
before, this creates large model risk due to the nature of mortality changes. This
might explain what we find in GAD’s report about the Lee-Carter model’s projection
performance.
Parameter risk can be reduced by improving the data, hence it may not
present a big problem when the model specification is appropriate. But sometimes process
risk can be described in terms of parameter uncertainty, as in McNown and Rogers'
model. From the discussion we had before, it seems that model risk could be a severe
problem, especially for the models based on extrapolative method. We refer interested
readers to Cairns (2000) for some general methods to deal with parameter and model
risk.
The Complexity of the Mortality Forecasting Problem
The problem of mortality forecasting is actually a very difficult one.
- First of all, we are dealing with moving curves. Precisely, we need to specify a
mortality pattern for each future time t ∈ [0, T], which in turn implies an age-specific mortality
structure.
- Secondly, the curve is moving non-linearly and randomly.
- Thirdly, the forecasting schedule has to satisfy two-dimensional biological con-
straints: that is, at a fixed time, older people usually have higher mortality rates than
younger people; and for a fixed age, the younger cohort is expected to experience
lower mortality than the older one. Yet, for an actual sample pattern or sample path,
it is possible that at an older age we may observe a lower mortality rate than at a
younger age, or a younger cohort may present higher mortality at some age than
an older cohort, due to random fluctuation. But the general trend and general shape
have to be kept; at least based on our current knowledge this has to be true.
The first two properties make mortality modelling very similar to the modelling
of the term structure of interest rates. However, the term structure of interest rates does not
have to be restricted to the structures and patterns implied in mortality. In ad-
dition, the expectation we have for interest rate models is quite different from the
one for mortality models. For example, no one expects that interest rate models can
predict the term structure of interest rates more than 1 year ahead with high precision.
However, actuaries have been forecasting mortality rates for future periods of more
than 20 years ahead, and have been assessing the accuracy of their projections in some
way.
What, then, is our particular problem with mortality modelling? This is the question in
our mind, and we will try our best to address it in this thesis.
1.3 Mortality Risk and Stochastic Approaches
1.3.1 Mortality Risk — Definition and Properties
As we have shown so far, mortality development embraces a great deal of uncertainty
per se. When a projection is concerned, this uncertainty presents itself as the sys-
tematic deviation of the realized mortality rates from the projected rates. The risk
that such systematic deviations may happen is often referred to as mortality risk in the
recent literature.
In practice, mortality changes occur in both young and old ages. For different age
groups, mortality risk usually manifests different characteristics. To help differentiate
between them we will employ the following terminology. The term mortality risk
will be used to cover all forms of deviations in aggregate mortality rates from those
projected at different ages and over different time horizons. Longevity risk will be
used to refer to the risk that, in the long term, aggregate survival rates for identified
cohorts are higher than projected. Short-term, catastrophic mortality risk will refer
to the risk that, over short periods of time, mortality rates are very much higher than
would normally be experienced.
In the following, we will use a simple representation provided by Pitacco (2003) to
illustrate the peculiar statistical aspect of the mortality risk. Using the notation from
section 1.2, let us denote by Ψ(x, y) a projected mortality model, i.e. a (real-valued)
function of age x and calendar year y, which expresses, in some way, the mortality of
people aged x in the (future) calendar year y, i.e. born in year z = y − x. When the
future changes in mortality are unknown at the time of valuation, the future mortality
evolution can be considered as a family of projected mortality models including all
possible outcomes. Denote by K(z) a given assumption about the mortality trend
for people born in year z, and by 𝒦(z) the set of such hypotheses. Then, the family
of projected models is

{ Ψ[x, z + x | K(z)] ; K(z) ∈ 𝒦(z) }, (1.3.1)

with Ψ[x, z + x | K(z)] representing the projected model conditional on the specific
hypothesis K(z).
If the mortality is described through a mathematical law, such as those proposed
by Gompertz, Makeham, Heligman and Pollard, etc., the law itself is characterized by
a vector-valued parameter θ(z). An appropriate choice of the vector-valued function
θ(z) may reflect a given hypothesis about the mortality trend. Furthermore, if we
are only concerned with the projection of one cohort, i.e. a given year of birth, z
can be dropped from the notation. Therefore, the family of projected models can be
expressed simply as

{ Ψ[x | θ] ; θ ∈ Θ }. (1.3.2)

The parameter space can be either a discrete or a continuous set. In the former
case, the parameter space is, in particular, a finite set:

Θ = {θ_1, θ_2, · · · , θ_m}, (1.3.3)
where m choices are made for the parameter, each one expressing a particular mortal-
ity trend. When m = 1, we are dealing with one (projected) model; hence, uncertainty
in future mortality is not allowed for. Conversely, when m > 1, we are dealing with
several projected models, expressing alternative views on future mortality trends.
Consider assigning a weight g_i (g_i > 0) to each parameter value θ_i in Θ (i.e. to each
projection hypothesis), such that Σ_{i=1}^m g_i = 1. Hence, the set {g_i}_{i=1,2,··· ,m} can be
interpreted as a probability distribution for Θ, describing a "degree of belief" about
the uncertainty in future mortality evolution. A stochastic model setting is thus
formulated (in the sense that the projection is non-deterministic).
Now we will use this simple stochastic model representation to distinguish between
the unsystematic mortality risk stemming from the randomness of a given (parameter)
survival function, and the systematic mortality risk stemming from the uncertainty
associated with the parameter and model specification. In order to do so, let us
consider the residual lifetime random variable τx. According to a given choice θi of the
parameter, the (conditional) moments of τx may be evaluated. In an age-continuous
context, we have for example:
E(τ_x | θ_i) = ∫_0^∞ t f_x(t | θ_i) dt, (1.3.4)

Var(τ_x | θ_i) = ∫_0^∞ (t − E(τ_x | θ_i))² f_x(t | θ_i) dt, (1.3.5)

where f_x(t | θ_i) is the (conditional) p.d.f. of the residual lifetime τ_x.
The unconditional moments are then as follows:

E(τ_x) = ∫_0^∞ Σ_{i=1}^m t f_x(t | θ_i) g_i dt, (1.3.6)

Var(τ_x) = ∫_0^∞ Σ_{i=1}^m (t − E(τ_x))² f_x(t | θ_i) g_i dt. (1.3.7)
In particular, developing expression (1.3.7), the following well-known result can be
obtained for the variance:

Var(τ_x) = Σ_{i=1}^m Var(τ_x | θ_i) g_i + Σ_{i=1}^m (E(τ_x | θ_i) − E(τ_x))² g_i (1.3.8)

= E(Var(τ_x | θ)) + Var(E(τ_x | θ)), (1.3.9)

where θ denotes the random value of the parameter. Comparing expression (1.3.5)
with expression (1.3.8) or (1.3.9), it is obvious that expression (1.3.5) only allows for
pure randomness arising from a given model in which the model and parameter values
have been uniquely specified, whilst formula (1.3.8) (or (1.3.9)) explicitly deals with
both pure randomness, in a more general form than in formula (1.3.5) (the first term
in (1.3.8) or (1.3.9)), and systematic deviations from the best estimate
(the second term in (1.3.8) or (1.3.9)). It is worth noting that traditional actuarial
approaches concentrate mostly on the risk expressed by (1.3.5), with the systematic
deviations being largely ignored.
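The decomposition (1.3.8)–(1.3.9) is easy to verify numerically. The sketch below assumes, purely for illustration, that τ_x is exponentially distributed under each of m = 3 trend hypotheses; the means and weights are invented.

import numpy as np

# Hypothetical mixture: conditional on theta_i, tau_x ~ Exponential with the stated mean.
means = np.array([18.0, 21.0, 25.0])     # E(tau_x | theta_i), illustrative values
g = np.array([0.2, 0.6, 0.2])            # weights g_i, the "degree of belief"

cond_mean = means                          # E(tau_x | theta_i)
cond_var = means**2                        # Var(tau_x | theta_i) for an exponential law

overall_mean = np.sum(g * cond_mean)                        # (1.3.6)
pooling = np.sum(g * cond_var)                              # E(Var(tau_x | theta)), first term of (1.3.8)
non_pooling = np.sum(g * (cond_mean - overall_mean) ** 2)   # Var(E(tau_x | theta)), second term

print("E(tau_x)     =", overall_mean)
print("E[Var] term  =", pooling)
print("Var[E] term  =", non_pooling)
print("Var(tau_x)   =", pooling + non_pooling)              # (1.3.9)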
Further insight into the impact of mortality risk on actuarial practice can be gained as follows.
Let us consider a portfolio of immediate life annuity contracts, which are identical
in terms of the age of the annuitant, annual amount, etc. Assume at time 0 the
contracts are issued to N(t_0) = N individuals. The random number of contracts
in force at time t (calendar year t_0 + t) is N(t_0 + t). The random lifetime of the i-th
annuitant (i = 1, 2, · · · , N) at time 0 is denoted by τ_x, where x is the age at entry,
and the lifetimes are assumed to be independently and identically distributed among the N annuitants.
Consider a generic annuitant in the portfolio. The random present value at time
0 of the benefits is denoted by

Y = R a_{τ_x|}, (1.3.10)
where R is the annual amount, and a_{τ_x|} is calculated with a given rate of interest i.
Given a survival function S(t) or its related parameter θ, the (conditional) distribution
function, expected value and variance of Y can be expressed as:

F_Y(y | θ) = Pr[Y ≤ y | θ], (1.3.11)

E(Y | θ) = R E(a_{τ_x|} | θ), (1.3.12)

Var(Y | θ) = R² Var(a_{τ_x|} | θ). (1.3.13)
Now denote by 𝒴 the random present value of future benefits for the portfolio.
Clearly, it can be expressed as follows:

𝒴 = Σ_{i=1}^N Y^{(i)}, (1.3.14)

where each Y^{(i)} has the same distribution as Y. The conditional distribution function of
𝒴 can then be obtained as the convolution of the distribution functions of the variables
Y^{(i)}, i.e.

F_𝒴(y | θ) = F_{Y^{(1)}} ∗ · · · ∗ F_{Y^{(N)}}(y | θ) (1.3.15)

= [F_Y(y | θ)]^{N∗}. (1.3.16)

The conditional expected value and variance of 𝒴 given θ are

E(𝒴 | θ) = N E(Y | θ), (1.3.17)

Var(𝒴 | θ) = N Var(Y | θ). (1.3.18)
Note that, up to now, what we obtain is the variance for the portfolio when the future
mortality is described by a given survival model, that is, when we take a deterministic
approach. In this situation, the variance of the portfolio increases linearly as the
portfolio size increases.
The unconditional expected value and variance of 𝒴 can be calculated similarly
to the unconditional moments of τ_x:

E(𝒴) = E_P[E(𝒴 | θ)] (1.3.19)

= N E_P[E(Y | θ)] = N E(Y), (1.3.20)

Var(𝒴) = E_P(Var(𝒴 | θ)) + Var_P(E(𝒴 | θ)) (1.3.21)

= N E_P(Var(Y | θ)) + N² Var_P(E(Y | θ)). (1.3.22)

In the above calculation, P means the probability distribution specified by {g_i}_{i=1,2,··· ,m}.
The relationship between the size of the portfolio and the total riskiness borne by
the insurer can be better illustrated by the coefficient of variation, i.e. the relative
standard deviation of a random variable. Under the deterministic model setting, we
have, for the portfolio benefits 𝒴, the coefficient of variation

r = √Var(𝒴 | θ) / E(𝒴 | θ) = (1/√N) √Var(Y | θ) / E(Y | θ). (1.3.23)

The coefficient of variation for 𝒴 under the stochastic approach is

r = √Var(𝒴) / E(𝒴) (1.3.24)

= ( (1/N) E_P(Var(Y | θ)) / E²(Y) + Var_P(E(Y | θ)) / E²(Y) )^{1/2}. (1.3.25)
Therefore, in relative terms, expression (1.3.23) shows that the riskiness of the
portfolio decreases with the size of the portfolio. This is the reason for the common
opinion that the larger the portfolio, the less risky it is. As the portfolio size goes to
infinity, the risk measured by r goes to zero, reflecting the property of "pooling
risk". On the contrary, the second component in expression (1.3.25) shows that the
risk shared by each policy can never be reduced to zero by increasing the portfolio
size. For this reason, this latter risk is called "non-pooling risk".
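A small numerical sketch of (1.3.23) and (1.3.25): with hypothetical conditional moments for the single-policy present value Y and illustrative weights g_i, it shows the deterministic coefficient of variation shrinking like 1/√N while the stochastic one levels off at the non-pooling component.

import numpy as np

# Hypothetical conditional moments of the single-policy present value Y (placeholders).
g = np.array([0.2, 0.6, 0.2])                 # weights g_i
EY_cond = np.array([11.8, 12.5, 13.3])        # E(Y | theta_i)
VarY_cond = np.array([9.0, 10.0, 11.0])       # Var(Y | theta_i)

EY = np.sum(g * EY_cond)                       # E(Y)
E_Var = np.sum(g * VarY_cond)                  # E_P(Var(Y | theta))
Var_E = np.sum(g * (EY_cond - EY) ** 2)        # Var_P(E(Y | theta))

for N in (100, 1_000, 10_000, 100_000):
    r_det = np.sqrt(VarY_cond[1]) / EY_cond[1] / np.sqrt(N)         # (1.3.23), middle hypothesis
    r_stoch = np.sqrt(E_Var / (N * EY**2) + Var_E / EY**2)          # (1.3.25)
    print(f"N={N:7d}   r_deterministic={r_det:.5f}   r_stochastic={r_stoch:.5f}")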
1.3.2 Financial Implication of Mortality Risk
In this subsection, we will provide some numerical evidence to show the severe finan-
cial impact of the non-pooling risk. To this end, we investigate the percentiles of
𝒴, which has been defined in the previous subsection as the random present value of
future benefits for a portfolio of immediate life annuity contracts. In particular, for
the deterministic case (that is, the survival model is specified by a given parameter
θ), we define

y_α = inf{y : Pr{𝒴 > y | θ} ≤ 1 − α}. (1.3.26)

Following the approach of Pitacco (2003) and Olivieri (2001), it is assumed that θ takes one of the values
θ[min], θ[med], θ[max], with probability distribution P = {p[min], p[med], p[max]}. Define

y_α = inf{y : Pr{𝒴 > y} ≤ 1 − α}. (1.3.27)

In general, there is no analytic form for the percentiles. Therefore, a simulation method
is required.
For numerical convenience, we adopt a mathematical law for mortality:

q_x / p_x = G H^x. (1.3.28)
Note that the right-hand side of (1.3.28) is the third term in the Heligman-Pollard
law, i.e. the term describing the old-age pattern of mortality. It is a well-accepted
fact that formula (1.3.28) can provide a good approximation for the mortality rates
when ages over 50 are considered. In particular, G expresses the level of senescent
mortality and H the rate of increase of senescent mortality itself. From (1.3.28) the
survival function S_x(t) (or just S(t) for simplicity) can be derived. Hence, the model
is specified via the parameter θ = (G, H).

Table 1.2: Parameters of the projected survival functions

        [C]          [min]        [med]        [max]
  G     0.00028      0.000042     0.000002     0.0000001
  H     1.07319      1.09803      1.13451      1.17215
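The following Python sketch derives the one-year death probabilities and the survival curve implied by (1.3.28) for the parameter sets of Table 1.2; the starting age of 65 and the age cap of 120 are illustrative choices, not part of the original computation.

import numpy as np

# Parameters of the projected survival functions (Table 1.2).
params = {
    "[C]":   (0.00028,   1.07319),
    "[min]": (0.000042,  1.09803),
    "[med]": (0.000002,  1.13451),
    "[max]": (0.0000001, 1.17215),
}

def survival_curve(G, H, x0=65, max_age=120):
    """From q_x/p_x = G*H^x: q_x = G*H^x / (1 + G*H^x); S(t) is the running product of p_{x0+j}."""
    ages = np.arange(x0, max_age)
    odds = G * H ** ages
    q = odds / (1.0 + odds)
    return np.cumprod(1.0 - q)           # S(1), S(2), ... for a life aged x0

for label, (G, H) in params.items():
    S = survival_curve(G, H)
    print(f"{label:6s}  20-year survival from age 65: {S[19]:.4f}")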
In order to represent the uncertainty in the mortality trend, we have assumed three
alternative projected survival models, characterizing three different degrees of
mortality improvement from the current mortality level (denoted by S[C](t)). That is,
S[min](t), S[med](t), S[max](t) are projected survival functions expressing, respectively,
a small, a medium and a large improvement in mortality. When implementing the
model, the parameter (G,H) for the survival function S[C](t) is first estimated from
the current available mortality table. Then the relevant parameters for S[min](t),
S[med](t), S[max](t) are carefully chosen so that they can represent different degrees
of mortality improvement, and at the same time describe the general phenomena
of rectangularization and expansion (see section 1.1.2). The parameters for those
survival models are given in Table 1.2 and their corresponding schedules are displayed
in Figure 1.14.
Next, we need to assign probabilities to the three projected functions S[min](t),
S[med](t), and S[max](t), which we have interpreted as the “degree of belief” attributed
to the corresponding projection.

Figure 1.14: Curves of deaths based on formula (1.3.28) for the survival models [C], [min], [med] and [max] (horizontal axis: age, from 50 to 110)

For the numerical results shown in Table 1.3, θ =
θ[med] is used for the deterministic case, P = {p[min], p[med], p[max]} = {0.2, 0.6, 0.2} for
the stochastic case, and α = 0.95 for both.
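A Monte Carlo sketch of how percentiles of the kind reported in Table 1.3 can be estimated: lifetimes are simulated under the law (1.3.28) with the Table 1.2 parameters, the portfolio present value is accumulated, and (1.3.26)/(1.3.27) are approximated by empirical quantiles. The interest rate, annual amount, entry age, portfolio size and number of scenarios are all hypothetical, so the output is not meant to reproduce Table 1.3.

import numpy as np

rng = np.random.default_rng(0)
params = {"[min]": (0.000042, 1.09803), "[med]": (0.000002, 1.13451), "[max]": (0.0000001, 1.17215)}
probs = {"[min]": 0.2, "[med]": 0.6, "[max]": 0.2}
x0, R, i, N, n_sim = 65, 1.0, 0.03, 1000, 2000   # entry age, annual amount, interest, portfolio size, scenarios
v = 1.0 / (1.0 + i)

def simulate_portfolio_pv(G, H, n_scen):
    """Present value of an immediate annuity portfolio (payment at each year end while alive)."""
    ages = np.arange(x0, 121)
    odds = G * H ** ages
    q = odds / (1.0 + odds)
    pvs = np.empty(n_scen)
    for s in range(n_scen):
        alive = np.full(N, True)
        pv = 0.0
        for t, qt in enumerate(q, start=1):
            alive &= rng.random(N) > qt        # who survives year t
            n_alive = alive.sum()
            if n_alive == 0:
                break
            pv += R * n_alive * v ** t
        pvs[s] = pv
    return pvs

# deterministic case: theta = theta[med]
pv_det = simulate_portfolio_pv(*params["[med]"], n_sim)
y_alpha_det = np.quantile(pv_det, 0.95)          # (1.3.26)

# stochastic case: draw theta from P scenario by scenario
labels = rng.choice(list(params), size=n_sim, p=[probs[k] for k in params])
pv_sto = np.concatenate([simulate_portfolio_pv(*params[k], np.sum(labels == k)) for k in params])
y_alpha_sto = np.quantile(pv_sto, 0.95)          # (1.3.27)

print(f"(y_a / E) x 100  deterministic: {100 * y_alpha_det / pv_det.mean():.2f}")
print(f"(y_a / E) x 100  stochastic   : {100 * y_alpha_sto / pv_sto.mean():.2f}")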
It is clear from Table 1.3 that there is a slight decrease in both cases as the size of
the portfolio increases, because the random fluctuation component of the mortality risk
vanishes as the portfolio enlarges. We also notice the dramatic increase in the value
of the percentiles in the stochastic case compared to the deterministic case. Moreover,
this increase is quite stable in magnitude and seems to tend to a large positive amount.
This is due to the fact that a stochastic approach allows us to analyze not only the
risk of random fluctuations in the number of survivors around the relevant expected
value, but also that of systematic deviations. Since systematic deviations concern
all the insureds in the same direction, this risk cannot be hedged by increasing the
portfolio size. In other words, this risk is non-diversifiable. As a result, systematic
deviation usually requires a risk premium to be taken into account when pricing and
reserving for life insurance and annuities.

Table 1.3: Relative values of percentiles for the annuity portfolio with α = 0.95

  N                               1000       2000       3000       4000       5000
  (y_α / E(𝒴 | θ[min])) × 100     102.166    101.510    101.226    101.071    100.944
  (y_α / E(𝒴 | θ[med])) × 100     101.662    101.160    100.938    100.823    100.725
  (y_α / E(𝒴 | θ[max])) × 100     101.242    100.878    100.701    100.610    100.541
  (y_α / E(𝒴)) × 100              114.030    113.860    113.783    113.740    113.705
It must be pointed out that this example is just illustrative, so the choice of param-
eters is meant to demonstrate the strong effects of projected tables and uncertainties,
while for real applications deeper investigation is required and the effect may not
be so significant. Although the numerical results provided here rely on specific
assumptions concerning the uncertainty structure in mortality, the conclusion drawn in
this section is a general one. The main feature that concerns actuarial practice is
the non-pooling aspect of the mortality risk. Because of it, the possibility of benefiting
from offsetting effects by holding a large enough portfolio of policies is ruled out.
This problem has been amplified by a dramatic decline in interest rates over the
last decades, affecting particularly those contracts providing joint financial and de-
mographic guarantees (see Boyle and Hardy, 2003; Ballotta and Haberman, 2006).
The underpricing of such guarantees has caused severe solvency problems, requiring
the setting up of extra reserves, and leading one large mutual life insurer (Equitable
Life, the world's oldest life insurance company) to be closed to new business in 2000.
This has increasingly prompted actuaries to question whether mortality risk has
been treated properly.
1.3.3 Stochastic Approach of Dealing with Mortality Risk
It is well known from the basics of actuarial science that the prices of any insur-
ance products contingent on the duration of life are affected by two main factors:
demographic and financial risks. Traditionally, actuaries have treated both the
demographic and the financial risk factors in a deterministic way, via a so-called
best-estimate mortality table describing the future evolution of mortality, as
well as a best-estimate interest rate for discounting cash flows over time.
This is often referred to as the actuarial approach.
The principle of the actuarial approach is easy to use at first sight. However, it
leaves a lot of hard questions unanswered. For example
• What discount rate should be used and how should it be affected by the char-
acteristics of the financial products (such as long term or short term)?
• What is the best projected mortality schedule? What if the eventual rates are
different from projected rates?
• In general, how should we express, evaluate and manage the risk implied in the
products including both financial and demographic factors?
• Practically, how are actuarial values resulting from the actuarial approach re-
lated to the observable values of the (readily tradable) assets that typically
appear on a life office’s balance sheet?
An alternative approach, based upon applying the no-arbitrage principle and
the construction of a replicating portfolio, has attracted a lot of attention recently;
we refer to it as the contingent claim pricing approach, or sometimes simply the
financial approach. The underlying assumption behind the contingent claim pricing
approach is that values should be calculated consistently with the prices of traded
assets so that the market is free of arbitrage. The basic idea of the approach is
therefore not to find an objective valuation, but instead to seek a relative valuation,
compatible with the observable market values of portfolios (of assets and liabilities)
with similar risk characteristics.
Financial theory and models were first introduced into actuarial valuation to deal
with the financial risk involved in complex insurance products such as Equity-Index
Annuities (EIAs), where equity risk, interest rate risk and mortality risk are present
and the market is incomplete. Early works in this field include Follmer and Son-
dermann (1986), Møller (1998, 2001a,b) and Lin and Tan (2003), which introduce risk-
minimizing strategies for the pricing and hedging of EIAs in an attempt to
manage the financial risk.
Now there is a demand to develop more sophisticated approaches to deal with
mortality risk as well, especially the non-pooling aspect of the mortality risk. The
advantages of adopting the stochastic mortality view and exploiting financial
theory can be seen in the following fields, among others.
Performance valuation
For performance investigation, the use of valuations that are as objective as possi-
ble is considered critical. This perspective has been reflected in the recently proposed
fair valuation methodologies in the newly issued International Financial Reporting
Standard (IFRS) by the International Accounting Standards Committee (IASC). In
an attempt to make the assessment of companies' financial performance much
more realistic and reliable, the use of a market-price-consistent approach to obtain
fair value is encouraged. "Fair value is the amount for which an asset could be ex-
changed or a liability settled between knowledgeable willing parties in an arm's length
transaction," as defined in IFRS.
While most investment strategies have been assessed on a market-price-consistent
basis, it seems necessary that the same treatment is applied to the liability (mortality-
related) side of insurance business as well.
Capital adequacy and solvency requirements
Solvency regulations usually require the insurer to have the capital capacity to meet
the future obligations committed to in the insurance contract. This actually implies a
comparison between the random profile of the portfolio fund and the random profile
of the portfolio liability, due to the uncertainty in future mortality trends and
uncertainty in the future financial markets. It is thus difficult to judge the appro-
priateness of a reserve profile based on a deterministic view of the future scenario.
Hence, stochastic mortality modelling is a necessary step to assess the future
obligations on realistic grounds.
The development of mortality-related derivative market
A large number of products offered by life insurance companies involve a range
of complex contingent claims involving equity risk, interest rate risk and mortality
risk. Historical experience shows that very long-term products, like GAOs or EIAs,
are significantly exposed to unanticipated changes over time in the mortality rates
(mortality risk) as well as to the financial risk. This means that the fair valuation tech-
niques well accepted in the financial market need to be integrated with an accurate
assessment of future mortality rates.
The advantage of an integrated framework of market consistent valuation is twofold.
Firstly, it gives more realistic premiums and reserves, and secondly, it quantifies the
risk associated with the underlying mortality dynamics. Having introduced stochastic
mortality dynamics, we can further study possible ways of transferring the system-
atic mortality risk to other parties. One possibility is to introduce mortality linked
securities and/or derivatives (such as the recently innovative issuance of the longevity
bonds by the European Investment Bank and BNP Paribas and the 3-year short-term
mortality-linked securities by Swiss Re). Here, the risk premium is linked to the evo-
lution of the mortality dynamics, thereby transferring the systematic mortality risk
to the holders of the contracts. More importantly, mortality risk, which in essence
is not much different from equity risk or interest rate risk, can
be incorporated into the (generalized) financial market through the development of a
secondary mortality risk market. We believe this can in turn enhance the overall
risk management in the insurance business.
FAQ
The financial approach is, of course, not free from trouble when applied to insur-
ance and annuity products. Here is a list of some concerns that have been raised about
applying a financial approach to the insurance market.
(1) Q: A deep transparent and active insurance market does not exist. Actually,
most insurance products are non-tradable. What is the implication of arbitrage
free prices for insurance products and how can we obtain market prices for the
underlying contracts?
A: When developing the mathematical framework, we describe how insurance con-
tracts should be priced if they were traded in a perfectly liquid, frictionless and
arbitrage-free market.
Naturally, we don't claim that real-world markets are perfectly liquid or fric-
tionless. However, the logic behind this is that if prices are calculated in
an arbitrage-free manner, then even an illiquid market with frictions will be
arbitrage-free. Conversely, if we were to propose a pricing framework that vi-
olates the no-arbitrage conditions, then the possibility of arbitrage would
emerge over time as the market becomes more liquid or trading costs begin to
fall, according to Cairns, Blake, and Dowd (2006a).
However, other techniques must be used to estimate the “market value”, which
we will discuss in more detail when we come to a more concrete context.
(2) Q: Historically, the life insurance companies have dealt with the financial and
(systematic) mortality risks by choosing both the interest rate and the mortal-
ity rate on the safe side, as seen from the insurers’ point of view. When the
real mortality rate and investment payoff are experienced over time, the result
is usually a surplus, which is the so-called with-profit principle of insurance
pricing. Some therefore argue that risk premiums charged in insurance prod-
ucts are too high, and calibrating to those with-profit market prices may give
over-estimated mortality rates.
A: It is true that, in the past, when the tools of investment and risk management
were simple and market competition was modest, the actuarial with-profit valuation
approach was in effect for a long time. Now things have changed a lot:
financial markets have become more sophisticated and competition in insurance
markets has become stronger. Therefore the profit margin is more reasonable
nowadays.
On the other hand, we need to be careful about the meaning of “with profit”.
Indeed, nobody takes on uncertainty for free. Therefore, from a different point
of view, it is actually a risk/return problem.
A martingale measure is sought to make pricing market-consistent. Precisely,
the risk/return trade-off is ensured to be consistent among contracts involving the
same type of risk in the market. Hence, a calibrated martingale measure de-
scribes the market view of the mortality rates after risk adjustment, so it shall not
be compared directly with the physical mortality rates. The risk premium implied
by the mortality rates under the martingale measure is a reflection of the market's
risk attitude toward mortality risk.
(3) Q: What is the relationship between financial risk and mortality risk?
A: Whether interest rates and mortality rates are related is a topic of debate.
Some catastrophic events may result in large mortality losses and affect the economy
as well (such as 9/11 in 2001 or the Kobe earthquake in 1995). In those situa-
tions, it seems reasonable to think of mortality risk as being related to financial
markets. Miltersen and Persson (2005) presented a forward-mortality frame-
work that explicitly allows for interest and mortality to be related. In this
thesis, we simply assume that the two risks are independent.
Chapter 2
Arbitrage-free Pricing Framework for Mortality Contingent Claims
In this chapter, we construct the arbitrage-free pricing framework that we will use
to evaluate mortality contingent claims. In order to do so, we will first review the
arbitrage-free pricing theory, including the basic principle and key concepts. Then the
interest rate models and mortality models under the arbitrage-free pricing framework
will be briefly discussed.
2.1 Arbitrage free pricing theory
2.1.1 Basic Ideas of Arbitrage Free Pricing
A contingent claim is a contract whose payoff at time T depends on the evolution of
some underlying asset(s). For this reason, it is also called a derivative to emphasize
the fact that it is a contract written on other asset(s). In the following, we will use
the two terms interchangeably. A formal definition for contingent claims will be given
later on. Before we do that, we would like to give the basic philosophy about arbitrage
free pricing.
Using standard economic reasoning, the price of a contingent claim, like the price
of any other commodity, will be determined by market forces. In particular, it will
be determined by the supply and demand of the market, and supply and demand
will in their turn be influenced by factors such as aggregate risk aversion, liquidity
preferences, etc. Therefore it seems impossible to say anything concrete about the
"absolute" value of a contingent claim.
However, the principle of no-arbitrage and the construction of a replicating port-
folio state that we should relate the value of a contingent claim to the underlying
price(s) in a specific way if we want to avoid mispricing between the derivative and
the underlying price(s) existing in the market. In other words, the contingent claim
pricing should be consistent with the market. Therefore the task here under the ar-
bitrage free principle is not to price the derivative in some “absolute” sense, which is
very important to bear in mind. Instead, it is to seek a relative value of the contingent
claim which usually can be expressed in terms of the market prices of the underlying
asset(s).
We now turn to the mathematical introduction of the above described idea, which
includes the concepts of self-financing portfolio, arbitrage, arbitrage free principle,
and so on. The notation from Bjork’s book (1998) is adopted.
Let us consider a financial market consisting of different assets such as stocks,
bonds with different maturities, or various kinds of financial derivatives. The filtered
probability space for this market is denoted by (Ω, F, {F_t}, P). Assume that there is an
adapted short-rate process r(t) (such that ∫_0^t |r(s)| ds < ∞ for every t ≥ 0, P-a.s.)
representing the continuously compounded rate of interest on riskless securities. This
can be formalized by assuming the presence in the market of a money-market account,
with account value process B(t) defined by B(t) = exp(∫_0^t r(s) ds), representing the
amount of money available at time t from investing one unit at time 0 in a risk-free
deposit account and "rolling over" the proceeds until time t.
For the moment, we will also take the price dynamics of the various assets as given.
Furthermore, we assume that the assets under consideration do not pay dividends for
the sake of simplicity. Then we give the following definition.
Definition 2.1.1. Let the N-dimensional price process be denoted by {S(t); t ≥ 0}.

1. A portfolio strategy (most often simply called a portfolio) is any F_t^S-adapted
N-dimensional process {h(t); t ≥ 0}.

2. The value process V^h corresponding to the portfolio h is given by

V^h(t) = Σ_{i=1}^N h_i(t) S_i(t).

3. A portfolio h is called self-financing if the value process V^h satisfies the con-
dition

dV^h(t) = Σ_{i=1}^N h_i(t) dS_i(t), or dV^h(t) = h(t) dS(t).
A self-financing portfolio is an important financial concept. It basically means
that, after initiation, the portfolio allows neither exogenous infusion nor withdrawal
of money. In other words, the purchase of a new portfolio must be financed solely by
selling assets already in the portfolio. Based on the notion of a self-financing portfolio, the definition
of arbitrage follows.
Definition 2.1.2. An arbitrage possibility on a financial market is a self-financing
portfolio h such that

V^h(0) = 0, (2.1.1)

P(V^h(T) ≥ 0) = 1, (2.1.2)

P(V^h(T) > 0) > 0. (2.1.3)
We say that the market is arbitrage free if there are no arbitrage possibilities.
An arbitrage possibility can be seen as the possibility of making a positive amount
of money out of nothing without taking any risk. It is thus essentially a riskless money
making machine or, as we used to say, a free lunch on the financial market. Therefore,
an arbitrage possibility is a serious case of mispricing in the market, and our main
assumption is that the market is efficient in the sense that no arbitrage is possible.
From the principle of no-arbitrage, we can conclude that
Proposition 2.1.3. (The law of one price) If two portfolios A and B give rise to
identical (but possibly random) future cash flows with certainty, then A and B must
have the same value at the present time.
It is natural to ask how we can identify whether a market admits an arbitrage possibility.
This is answered by the Fundamental Theorem of Asset Pricing, the result
that is central to everything under the no-arbitrage pricing approach.
Theorem 2.1.4. The Fundamental Theorem of Asset Pricing (Harrison
and Pliska, 1981)
Consider a market model consisting of a money-market account B(t) and N asset
price processes S1(t), S2(t), · · · , SN(t) on the time interval [0, T ].
(i) The market model is free of arbitrage if and only if there exists an equivalent
martingale measure, i.e. a measure Q ∼ P such that the discounted price
processes

S_1(t)/B(t), S_2(t)/B(t), · · · , S_N(t)/B(t) (2.1.4)

are martingales under Q for all 0 < t ≤ T.

(ii) If (i) holds, then the market is complete if and only if Q is unique.
It is thus clear that the requirement of an arbitrage free market will impose some
restrictions on the behavior of all price (value) processes that exist in the market.
That is, the martingale property of (2.1.4) has to be applied to the price process of
each asset simultaneously.
The existence of an equivalent martingale measure under which the discounted price
processes are martingales implies that we can find a Radon-Nikodym density process f on
F_T which transforms the P-measure into a risk-neutral measure Q. In a risk-neutral
world, all security prices grow on average at the risk-free rate; this requires an
adjustment in the dynamics of the uncertainty to reflect the investors' risk aversion
toward the underlying risk. This adjustment can be related to a so-called market
price of risk, and it ensures that a risk premium is charged consistently with regard
to the same risk in the market.
This approach has been extended to use more general numeraires (basically any
tradable assets, including the risk-free bank account) as the reference, often referred
to as the martingale approach. The martingale valuation approach is, so far, the
most general approach for arbitrage free pricing. It is also extremely efficient from
a computational point of view. We now turn to the pricing problem for contingent
claims.
Definition 2.1.5. Let {S(t); t ≥ 0} be the N-dimensional price process. A contin-
gent claim with date of maturity (exercise date) T, also called a T-claim, is
any stochastic variable X ∈ F_T^S. A contingent claim X is called a simple claim if it
is of the form

X = Φ(S(T)).
The function Φ is called the contract function.
Simply, a contingent claim can be viewed as a contract, which stipulates that the
holder of the contract will obtain X (which can be positive or negative) at the time
of maturity T. The requirement that X ∈ F_T^S means that, at time T, it will actually
be possible to determine the amount of money to be paid out based on the evolution
of the price process S(t) up to and including time T. We see that the European call is a
simple contingent claim, for which the contract function is given by

Φ(x) = max[x − K, 0],

where K is the strike price.
We will use the standard notation Π(t;X )(0 ≤ t < T ) for the price process of the
claim X , where we sometimes suppress the X . In the case of a simple claim we will
sometimes write Π(t; Φ).
To obtain a “reasonable” price process Π(t;X ), we consider the “primary” market
B,S1, · · · , Sn as given a priori. We assume that the primary market is arbitrage free.
Then the derivative should be priced in a way that is consistent with the prices of
the underlying assets. More precisely, we should demand that the extended market
Π(t;X ), B, S1, · · · , Sn is also free of arbitrage possibilities. As an application of the
Fundamental Theorem of Asset Pricing, we obtain the following.
Theorem 2.1.6. (Risk Neutral Valuation Formula) The arbitrage free price pro-
cess for the T-claim X is given by

Π(t; X) = E_Q[ e^{−∫_t^T r(s) ds} X | F_t ], (2.1.5)

where Q is a (not necessarily unique) martingale measure.
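As a sketch of how (2.1.5) is applied in practice, the following Python example values a simple T-claim Φ(S(T)) = max(S(T) − K, 0) by Monte Carlo under an assumed risk-neutral model (geometric Brownian motion with a constant short rate, chosen only for illustration) and compares the result with the corresponding closed-form benchmark; all parameter values are hypothetical.

import numpy as np
from math import log, sqrt, exp, erf

# Hypothetical inputs under the risk-neutral measure Q (placeholders only).
S0, K, r, sigma, T = 100.0, 105.0, 0.03, 0.20, 2.0
n_sim = 200_000
rng = np.random.default_rng(1)

# Monte Carlo version of (2.1.5): discount the simple claim X = max(S(T) - K, 0).
Z = rng.standard_normal(n_sim)
S_T = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)   # GBM under Q
payoff = np.maximum(S_T - K, 0.0)
price_mc = np.exp(-r * T) * payoff.mean()

# Closed-form benchmark (Black-Scholes) for the same simple claim.
Phi = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))
d1 = (log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
d2 = d1 - sigma * sqrt(T)
price_cf = S0 * Phi(d1) - K * exp(-r * T) * Phi(d2)

print(f"Monte Carlo price: {price_mc:.4f}   closed form: {price_cf:.4f}")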
In practical applications, there is no guarantee that any specific stochastic asset
model will be arbitrage free, and an analytic solution for the Radon-Nikodym (in
short, RN) density may be difficult to derive. For these reasons, the first step is
usually to hypothesize a tractable analytic form for the stochastic asset model in
tandem with development of the RN density. Then the model is calibrated to market
prices or estimated prices of the underlying asset.
If the market is incomplete, there are infinitely many equivalent martingale mea-
sures making (2.1.5) hold. Indeed, expression (2.1.5) determines a whole (open)
interval of prices consistent with the absence of arbitrage. To narrow down the price,
we again need to identify measure Q through calibration to observed prices. It is
worth stressing that this approach is quite effective because the practical application
requires interpolating rather than extrapolating. As a result, the no-arbitrage pric-
ing approach is regarded as being close to reality, and can provide a very valuable
benchmark in helping to determine and examine insurance company share prices and
transaction values.
2.1.2 The Term Structure of Interest Rate
In this section, we restrict ourselves to interest rate risk and the bond market, and we
investigate a number of modelling aspects of interest rate models.
We consider a filtered probability space (Ω, F, {F_t}, P) and assume there is an
adapted short-rate process r(t) (such that ∫_0^t |r(s)| ds < ∞ for every t ≥ 0, P-a.s.) as
in the previous section.
The simplest financial product involving interest rate is the zero-coupon bond.
Definition 2.1.7. A T -year zero-coupon bond (denoted as T -bond) is a contract
which pays $1 to the bond holder at time T . Here, T is the maturity time of the bond.
Now let us assume there exists a continuously trading (frictionless) bond market
for every T > 0. Let D(t, T) denote the T-bond price at time t. Before we begin to
discuss the modelling and pricing of the T -bonds, some insight about the relationship
between D(t, T ) and two variables: t (current time) and T (term to maturity), might
be useful and interesting (see Bjork, 1998).
• For fixed time t, D(t, T ) as a function of T gives the prices for bonds of all
possible maturities at time t. The pattern of this function in terms of T is
closely related to “the term structure of interest rates at t”. Typically it will
be a smooth curve, which allows us to make the assumption that, for each t,
D(t, T ) is differentiable w.r.t. T .
• For fixed time T , D(t, T ) is a scalar stochastic process. This process gives the
prices, at different times, of the bond with fixed maturity T , and the trajectory
will typically be irregular depending on how the prevailing interest rate changes
in response to the market.
Hence, we can expect that D(t, T ) presents its relationship with the underlying term
structure of interest rates through the current time t and the term to maturity T . In
particular, D(T, T ) = 1 to avoid arbitrage.
Now we will use the fundamental theorem of asset pricing to derive a relationship
between D(t, T ) and r(t). As we have shown before, the absence of arbitrage for the
bond market implies the existence of a martingale or risk-neutral probability measure
Q such that for all T > 0, the discounted price process
V(t, T) = e^{−∫_0^t r(u) du} D(t, T), 0 ≤ t ≤ T, (2.1.6)
is a martingale. Thus we have
V (t, T ) = EQ[V (T, T )|Ft],
which leads to a relationship between the bond price and the short rate:

D(t, T) = E_Q[ e^{−∫_t^T r(u) du} | F_t ]. (2.1.7)
Furthermore, for each fixed t, we assume the bond price D(t, T ) is differentiable
w.r.t. the time of maturity T (this is reasonable as we have discussed before). Then
define

f(t, T) = −∂ ln D(t, T) / ∂T. (2.1.8)

f(t, T) is called the time-t instantaneous forward interest rate. Obviously, (2.1.8) can
be rewritten as

D(t, T) = e^{−∫_t^T f(t,u) du}. (2.1.9)

Taking the limit as T → t, we have

r(t) = lim_{T→t} f(t, T). (2.1.10)
For a fixed t, the relations (2.1.7), (2.1.8), (2.1.9) and (2.1.10) show that, among the
dynamics of bond prices, forward rates and short rates, any one uniquely determines the
other two. More precisely, on the one hand, the term structure of
interest rates in terms of r(t) or f(t, T) can be derived from the bond price term
structure; on the other hand, the price of a zero-coupon bond can be determined from
the term structure of short rates or the term structure of forward rates. Moreover, the
dynamics of short rates can be uniquely derived from the dynamics of forward rates,
and vice versa. As a result, the above relationships (2.1.7), (2.1.8), (2.1.9) and (2.1.10)
allow us to model the bond market in many different ways.
1. We may specify the dynamics of the short rate, then derive bond price processes
using the no-arbitrage pricing formula (2.1.7). The models developed from the
short rate are usually referred to as short-rate models, such as those of Vasicek
(1977), Cox, Ingersoll, and Ross (1985) and Hull and White (1990); a small numerical sketch of this approach is given after this list.
2. We may specify the dynamics of all possible forward rates, and then use (2.1.9)
to obtain bond prices, as in the models developed by Heath, Jarrow, and Morton
(1992). Usually the forward rate dynamics have to satisfy certain conditions
required by arbitrage-free arguments.
3. We may directly specify the dynamics of all possible bonds as the models devel-
oped in Flesaker and Hughston (1996), Rogers (1997) and Rutkowski (1997).
4. We may specify the market modelling framework for the forward bond prices
such as the LIBOR and swap market models in Brace, Gatarek and Musiela
(1997) and Jamshidian (1997).
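A minimal sketch of the first modelling route: assume Vasicek dynamics for the short rate under Q, estimate the T-bond price by Monte Carlo through (2.1.7), and compare with the Vasicek closed-form price. All parameter values below are hypothetical.

import numpy as np

# Hypothetical Vasicek parameters dr = a(b - r) dt + sigma dW under Q (placeholders).
a, b, sigma, r0, T = 0.15, 0.05, 0.01, 0.03, 10.0
n_steps, n_sim = 1_000, 20_000
dt = T / n_steps
rng = np.random.default_rng(2)

# Simulate the short rate, accumulate the integral of r, and apply (2.1.7).
r = np.full(n_sim, r0)
integral = np.zeros(n_sim)
for _ in range(n_steps):
    integral += r * dt
    r += a * (b - r) * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_sim)
D_mc = np.mean(np.exp(-integral))

# Vasicek closed-form bond price for comparison.
B = (1.0 - np.exp(-a * T)) / a
A = np.exp((b - sigma**2 / (2 * a**2)) * (B - T) - sigma**2 * B**2 / (4 * a))
D_cf = A * np.exp(-B * r0)

print(f"Monte Carlo D(0,T): {D_mc:.4f}   closed form: {D_cf:.4f}")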
2.2 The Term Structure of Mortality Under the Arbitrage Free Framework
In this section, we aim at developing a theoretical framework to price products whose
benefits are contingent on the uncertainties not only from financial risk sources (like
interest rate and/or equity risk) but also from mortality risk. Here and throughout
we have borrowed the relevant stochastic mortality notation of Cairns, Blake, and
Dowd (2006a).
2.2.1 Basic Building Blocks
Now consider a generic individual aged x at time 0, whose random residual lifetime
is denoted as τx. Suppose the individual belongs to a homogeneous group of persons
(in particular, of the same age and with the same health status) whose random
residual lifetimes can be considered identically distributed, and let N denote the
number of persons in the initial group. We assume there exists a stochastic process
µ(t, x + t), representing the instantaneous hazard rate for an individual aged x + t at
time t. This can be formalized by assuming the presence of a survival index S(t, x),
which, theoretically, can be compiled from observing the number, N(t), of remaining
survivors of the group at time t. In particular,

S(t, x) = N(t) / N. (2.2.1)
Moreover, we assume N(t) is large enough throughout the observation period for the
concerned group so that S(t, x) is a left-limited, right-continuous curve, which allows
us to write down the following expression

S(t, x) = exp( −∫_0^t µ(s, x + s) ds ) (2.2.2)

to relate the hazard rate process µ(t, x + t) with the survival index. For a more formal
introduction of µ(t, x + t) in an intensity-based framework, see Biffis (2005) and Biffis
and Millossovich (2006).
We note that if µ(t, x + t) is deterministic then S(t, x) is equal to the survival
probability that an individual aged x at time 0 will survive to age x + t. In this
situation, µ(t, x + t) coincides with the force of mortality as defined in section 1.1.1.
However, here µ(t, x+ t) is stochastic. Then looking forward from time 0, this means
that S(t, x) is a random variable. In this case, S(t, x) can only be regarded as a
survival probability if one observes at time t rather than at time 0. In the following
the law of iterated expectations will be used to obtain the time-s survival probabilities
over the time horizon (t, T ] (for fixed 0 ≤ s ≤ t ≤ T ), and conditional on the event
τx > t.
First, let us denote by P(s, t, T, x) the survival probability, measured at time s, that a life aged x at time
0 survives from time t to T. Specifically, P(0, 0, T, x) is the
survival probability that (x) survives to time T as measured at time 0; P (T +1, T, T +
1, x) the survival probability from time T to T + 1 as measured at T + 1, which is
actually deterministic since we have observed the mortality experience up to time
T + 1.
Let M_s be the filtration generated by µ(u, x + u) up to time s; that is, M_s
includes full information about changes in mortality up to and including time s, but
no information about how mortality rates will develop after time s. Define
Y_x(T) = 1 if the individual is alive at time T, and Y_x(T) = 0 if the individual is dead at time T.
Then we have

P(s, t, T, x) = P[τ_x > T | τ_x > t, M_s]
= P[Y_x(T) = 1 | Y_x(t) = 1, M_s]
= E[Y_x(T) | Y_x(t) = 1, M_s]
= E[ E{Y_x(T) | Y_x(t) = 1, M_T} | M_s ]
= E[ S(T, x) / S(t, x) | M_s ]
= E[ exp( −∫_t^T µ(u, x + u) du ) | M_s ].
Taking specific values for s, t, or T, we have

P(0, 0, T, x) = P[τ_x > T] (2.2.3)
= E[S(T, x)] (2.2.4)
= E[ exp( −∫_0^T µ(s, x + s) ds ) ], (2.2.5)

P(t, t, T, x) = P[τ_x > T | τ_x > t, M_t] (2.2.6)
= E[ S(T, x) / S(t, x) | M_t ] (2.2.7)
= E[ exp( −∫_t^T µ(s, x + s) ds ) | M_t ]. (2.2.8)

And

P(T + 1, T, T + 1, x) = exp( −∫_T^{T+1} µ(s, x + s) ds ). (2.2.9)
Note
1. We haven’t specified the probability measure we use in the above calculation
of probabilities and expectations. You can imagine we use a real-world (or
physical) probability measure P so far. In the future, we will mainly work
under a risk-neutral or some other martingale measure when it is relevant.
2. Let P_P(t, t, T, x) denote the survival probability calculated under measure P.
P_P(t, t, T, x) can be compared with its deterministic analogue, T−t p_{x+t}. On the
one hand, let µ̄(t, x + t) be a deterministic unbiased estimate of the force of
mortality, that is, µ̄(t, x + t) = E(µ(t, x + t)). Then, in usual actuarial notation,
we have

T−t p_{x+t} = exp( −∫_t^T µ̄(s, x + s) ds ). (2.2.10)

On the other hand, define

µ_P(t, T, x + T) = −d ln P_P(t, t, T, x) / dT. (2.2.11)

Hence,

P_P(t, t, T, x) = exp( −∫_t^T µ_P(t, u, x + u) du ). (2.2.12)
Since, by Jensen’s inequality,
E
[exp
(−∫ T
t
µ(s, x + s) ds
)]> exp
(−∫ T
t
E(µ(s, x + s)) ds
).
Therefore, we usually have
µP (t, u, x + u) 6= µ(u, x + u).
3. Let P_Q(t, t, T, x) denote the survival probability calculated under a risk-neutral
measure Q. We will show later that P_Q(t, t, T, x) can be backed out from
market prices of pure endowments or other related products. For this reason,
P_Q(t, t, T, x) can be viewed as a market pricing survival probability, also referred
to as a spot survival probability as in Cairns, Blake, and Dowd (2006a).
Similarly, we can define

m(t, T, x + T) = −d ln P_Q(t, t, T, x) / dT, (2.2.13)

or,

P_Q(t, t, T, x) = exp( −∫_t^T m(t, u, x + u) du ). (2.2.14)

m(t, T, x + T) is then called the forward force of mortality, or sometimes short-
ened to forward mortality, analogous to the time-t instantaneous forward
interest rate. If T = t, we get the spot force of mortality:

µ(t, x + t) = lim_{T→t} m(t, T, x + T). (2.2.15)

Also, P_Q(s, t, T, x) is referred to as a forward survival probability, the probability
that an individual aged x + t at time t, if alive, will survive to time T, based
on the information on mortality rates available up to time s, M_s.
4. We notice that there are similarities in mathematical structure between the
interest rate setting and the mortality rate setting (see (2.1.7), (2.1.8), (2.1.9),
(2.1.10) together with (2.2.8), (2.2.13), (2.2.14) and (2.2.15)).
5. Define:

B(t, T, x) = E[ exp( −∫_0^T µ(s, x + s) ds ) | M_t ] = e^{−∫_0^t µ(s,x+s) ds} P(t, t, T, x).

Then B(t, T, x) is a martingale under the corresponding measure. This property
may allow us to derive the PDE for P(t, t, T, x) on [0, T] × R_+ provided the
stochastic process µ(t, x + t) is defined by a diffusion process (see Dahl, 2004,
p. 116, formula (3.2)), which is analogous to the differential equation for zero-
coupon bond prices obtained when working with a stochastic interest rate model
(see e.g. Bjork (1997, Proposition 3.4)).
Also,

P_Q(s, t, T, x) = exp( −∫_t^T m(s, u, x + u) du ) = B(s, T, x) / B(s, t, x). (2.2.16)
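A toy sketch of P(0, 0, T, x) = E[exp(−∫_0^T µ(s, x+s) ds)] under a stochastic force of mortality: here µ is taken, purely for illustration, to be a Gompertz-type baseline multiplied by a single lognormal level with mean one. The sketch also displays the Jensen-type gap relative to using the expected force of mortality, as noted in point 2 above. All parameter values are hypothetical.

import numpy as np

rng = np.random.default_rng(3)
x, T, n_steps, n_sim = 65, 20, 200, 20_000
dt = T / n_steps
ages = x + dt * np.arange(n_steps)

# Toy stochastic force of mortality: Gompertz baseline times one lognormal factor per path.
A_gomp, c_gomp, sigma = 5e-5, 1.10, 0.15          # hypothetical parameters
baseline = A_gomp * c_gomp ** ages                 # deterministic Gompertz shape
Z = rng.standard_normal(n_sim)
level = np.exp(sigma * Z - 0.5 * sigma**2)         # E(level) = 1, so E(mu) equals the baseline

# P(0,0,T,x) = E[exp(-integral of mu)] versus exp(-integral of E(mu)) (Jensen's inequality).
integral = np.outer(level, baseline).sum(axis=1) * dt
P_stochastic = np.mean(np.exp(-integral))
P_expected_mu = np.exp(-baseline.sum() * dt)

print(f"E[exp(-int mu)]   = {P_stochastic:.4f}")
print(f"exp(-int E(mu))   = {P_expected_mu:.4f}   (smaller, by Jensen)")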
2.2.2 The Generalized Financial/Insurance Market
We are now in a position to consider the arbitrage free pricing approach in a gener-
alized financial/insurance market. Our generalized market includes two fundamental
types of financial contract:
• zero coupon bonds for a full range of terms to maturity;
• pure endowment contracts for a full range of ages and terms to maturity.
Briefly, a zero-coupon bond with maturity T is a contract which pays $1 to the
bond holder at time T . We denote it by T -bond. A pure endowment with maturity
T for (x) is a contract which pays $1 to the policyholder at time T if he/she is still
alive at time T . We denote it by (T, x)-endowment.
We assume there exists a continuously trading (liquid, frictionless) bond and en-
dowment market for every T > 0 and every age x > 0. For the sake of simplicity,
default risk of the contract issuer is ignored for both types of products. Let D(t, T)
denote the time t T -bond price, and E(t, T, x) the time t (T, x)-endowment price. Let
B_t be the combined filtration for both the term structure of interest rates and the mor-
tality rates, e.g. B_t = F_t ∨ M_t. The absence of arbitrage is essentially equivalent
to the existence of an equivalent martingale measure Q under which the discounted
prices of any contingent claims in this combined market are martingales. Then we
obtain

D(t, T) = E_Q[ e^{−∫_t^T r(u) du} | F_t ], (2.2.17)

and

E(t, T, x) = E_Q[ e^{−∫_t^T r(u) du} I_{τ_x>T} | I_{τ_x>t} = 1, B_t ], (2.2.18)

where I_{τ_x>t} is the indicator function of the event {τ_x > t}.
Assuming, furthermore, independence between the interest rate and the
mortality rate, we have

E(t, T, x) = E_Q[ e^{−∫_t^T r(u) du} | F_t ] E_Q[ I_{τ_x>T} | I_{τ_x>t} = 1, M_t ]
= E_Q[ e^{−∫_t^T r(u) du} | F_t ] E_Q[ e^{−∫_t^T µ(u, x+u) du} | M_t ]
= D(t, T) P(t, t, T, x). (2.2.19)
Here, P (t, t, T, x) is a spot survival probability, calculated under the martingale mea-
sure Q with Q being omitted for simplicity of notation. Assuming pure endowment
prices are available from the insurance market, we then can derive the spot (market
pricing) survival probability from bond and endowment prices:
P(t, t, T, x) = E(t, T, x) / D(t, T). (2.2.20)
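A small sketch of (2.2.19)–(2.2.20): given zero-coupon bond prices and pure endowment prices (both invented below), the implied spot survival probabilities follow by division; conversely, under the independence assumption an endowment price is the product D(t, T) P(t, t, T, x).

# Hypothetical market quotes at time t = 0 for maturities T = 5, 10, 15 (illustrative only).
bond_prices = {5: 0.86, 10: 0.74, 15: 0.63}          # D(0, T)
endowment_prices = {5: 0.80, 10: 0.62, 15: 0.45}     # E(0, T, x) for a fixed age x

# (2.2.20): implied spot (market-pricing) survival probabilities.
implied_survival = {T: endowment_prices[T] / bond_prices[T] for T in bond_prices}
print("implied P(0,0,T,x):", {T: round(p, 4) for T, p in implied_survival.items()})

# (2.2.19): re-price a pure endowment from a survival model and the bond curve.
model_survival_20 = 0.55                              # hypothetical model value P(0,0,20,x)
bond_price_20 = 0.54                                  # hypothetical D(0,20)
print("model endowment price E(0,20,x):", bond_price_20 * model_survival_20)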
Therefore, if the (T, x)-endowment prices were available from the market, we
could use P(t, t, T, x) to calibrate the mortality model, in the same way as D(t, T) is
used to calibrate the interest rate model. However, the problem here is that a pure
endowment is not a tradable asset like a T-bond. In general, the insurance market is
not considered as a liquid, frictionless market as is the bond market. In particular,
the insurance market is a market where insurers take short positions in insurance
contracts, whilst insureds take only long positions. However, these trading constraints
can be weakened where (re)insurers exchange books of policies, so that both long and
short positions can be taken on insurance contracts. Hence, obtaining the "fair value"
for the (T, x)-endowment is mainly a matter of implementation.
Thus, in the following, depending on the type of contracts under valuation, suit-
able basic insurance contracts will be assumed to be traded continuously in the market
and represent the primitive securities used for arbitrage pricing. For example, when
valuing annuities, pure endowments of (possibly) every maturity will be implicitly
taken as primitive securities.
We stress at this point, though, that the risk neutral measure Q might not be
uniquely determined due to market incompleteness. Instead the choice of Q becomes
part of the modelling process. As mortality-linked securities begin to emerge and
as we gather market price data we can then test for the validity of our assumptions
about Q. For further discussion of the relationship between P and Q, the reader is
referred to Dahl (2004), Biffis (2005), and Cairns, Blake, and Dowd (2006b).
2.3 Review of stochastic mortality models under
the no-arbitrage framework
“We must always be prepared to demonstrate that a proposed model gives a good
approximation to what we observe in reality and that it is appropriate for the task in
hand. All models are approximations to reality but, of course, some are better than
others.”
Cairns: Interest Rate Models: An Introduction (2004)
In this section, we set out criteria for stochastic mortality models and review the
recently proposed models, highlighting the specific requirements that arise for mortality.
2.3.1 Criteria for Term Structure of Mortality Models
To model mortality as a stochastic process, it is reasonable to require that any “plau-
sible” mortality model would meet the following criteria (Cairns, Blake, and Dowd,
2006a):
1. The model should keep the force of mortality positive.
2. The model should be consistent with historical data.
3. The model should be comprehensive enough to deal appropriately with the
pricing, hedging, or risk managing problem.
For the valuation of mortality-related derivatives, the numerical results should
be consistent with our intuition about the risk/return relationship. That is, the
value of the derivative should reflect the stochastic evolution of µ(t, x): the greater
the volatility in mortality rates, the greater the value of a mortality option.
4. It will absolutely be an advantage if the model makes it possible to value the
most common mortality-linked derivatives using analytical methods or using
fast numerical methods. However, this criterion is one of convenience only, and
should not be allowed to override the other criteria, which are more important
because they are criteria of principle. In other words, we should not drop one
of the other criteria merely to obtain an easy (for example, analytical) solution
to the problem at hand.
Now we would like to put more concrete meanings on criterion 2 of being consistent
with historical data.
• In general, the model should give a biologically reasonable age-specific mortality
pattern for a cohort, and should be coherent with observed cohort relationships.
For example, one might rule out the possibility of an “inverted” mortality curve
(that is, one in which mortality rates for the elderly fall with age). An “inverted”
curve is not only unreasonable a priori, but also conflicts with the normal
upward-sloping curves that are always observed in historical data.
At the same time, younger generations normally experience lower mortality rates
than their older peers. This should rule out deteriorating mortality; in other words,
for a fixed age x, the mortality rate should be decreasing over time (i.e. across
consecutive cohorts).
• The model should be flexible enough to reflect the rectangularization and ex-
pansion phenomena: while the former reduces the variability of time-until-death
randomness around its expectation, the latter increases the expectation of the
future lifetime. As a consequence, the risk due to random fluctuations in mor-
tality tends to decrease, while at the same time the longevity risk due to the
randomness in future mortality trends increases.
It is clear that any candidate model for µ must be able to capture the dynamics
just described.
• Recall that mortality risk comes from the difference between actual realized
mortality rates and those implied by an adopted projection. In terms of a pa-
rameterized model, the difference can be categorized into two sources: random
fluctuations from a given parameter (survival function), and the systematic
deviations of the parameter (survival function).
The model should thus focus more on the systematic deviations related to the
choice of a specific projection, as described in Figure 1.13. So far, we have seen
that Pitacco's model and the Lee-Carter model are two different methods of
describing systematic deviations. In the following subsection, we will see that
Dahl's mean-reversion model focuses mainly on the random fluctuations around
a pre-specified long-term target, which is viewed as a disadvantage in a model
for stochastic mortality.
In particular, long-term dynamics in the course of mortality improvements
should not be mean-reverting to a pre-determined target level, even if this target
is time dependent and incorporates mortality improvements.
As Cairns, Blake, and Dowd (2006a) noted: the inclusion of mean-reversion
would mean that if mortality improvement has been faster than anticipated
in the past then the potential for further mortality improvements will be sig-
nificantly reduced in the future. In extreme cases, significant past mortality
improvements might be reversed if the degree of mean reversion is too strong.
Such extreme mean reversion is difficult to justify on the basis of past mortality
experience. As we look into the future, it is even more difficult to predict what
medical advances there might be, when they will happen, and what impacts
they will have on survival rates. All of these uncertainties rule out strong mean
reversion in a model for stochastic mortality.
2.3.2 A Brief Review of Existing Stochastic Mortality Models
In this subsection, we review the existing stochastic mortality models recently pro-
posed. We classify those models into three groups according to their model formulation.
Reduction factor approach
Ballotta and Haberman (2006) proposed, under the real probability measure P ,
that the dynamics of the hazard process µ(y, u) for a person attaining age y in future
year u follows:
µ(y, u) = µ(y, 0)RF (y, u) (2.3.1)
where µ(y, 0) is the hazard rate for a person aged y in the base year (i.e. year 0),
and RF(y, u) is the reduction factor for period [0, u] for age y. In particular, taking
y = x + z and u = t + z, formula (2.3.1) can be rewritten as
    µ(x + z, t + z) = µ(x + z, 0) e^{(α + β(x+z))(t+z) + σ_h Y_{t+z}},             (2.3.2)
where (Y_t, t ≥ 0) is a stochastic process on (Ω, F, F_t, P) describing random variations
in the projected trend:
    dY_t = −a Y_t dt + dB_t,   Y_0 = 0,                                          (2.3.3)
where B_t is a standard one-dimensional P-Brownian motion, independent of the
sources of randomness existing in the financial market. µ(x + z, 0) is modelled as
    µ(x + z, 0) = a_1 + a_2 R + e^{b_1 + b_2 R + b_3(2R^2 − 1)},   R = ((x + z) − 70)/50,   x ≥ 50.   (2.3.4)
It is seen that the random variation process Y_t is an Ornstein-Uhlenbeck process,
and has the property of mean reversion, with the parameter a measuring the speed
of mean reversion to the long-run mean, which is set equal to zero.
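The reduction factor dynamics (2.3.2)-(2.3.3) are straightforward to simulate. The following Python sketch uses the exact one-step transition of the Ornstein-Uhlenbeck process; the parameter values and the Gompertz-type base curve standing in for (2.3.4) are purely illustrative, not the fitted CMI values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (not the fitted values of Ballotta and Haberman, 2006).
a, sigma_h = 0.1, 0.05          # OU mean-reversion speed and volatility loading
alpha, beta = -0.03, 0.0002     # trend parameters in the exponent of (2.3.2)
x, dt, n_steps = 65, 1.0, 25    # cohort age, annual step, horizon in years

def mu0(age):
    """Base-year hazard; a simple Gompertz placeholder for the CMI curve (2.3.4)."""
    return 5e-5 * np.exp(0.1 * age)

# Exact one-step transition of the OU process dY = -a Y dt + dB (unit Brownian volatility).
Y = 0.0
mu_path = []
for k in range(1, n_steps + 1):
    Y = Y * np.exp(-a * dt) + np.sqrt((1 - np.exp(-2 * a * dt)) / (2 * a)) * rng.standard_normal()
    age = x + k                                   # cohort attains age x + k after k years
    mu_path.append(mu0(age) * np.exp((alpha + beta * age) * k + sigma_h * Y))

print(np.round(mu_path[:5], 6))
```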
The model (2.3.4) for the hazard rate for the base year has been proposed by
the CMI Bureau (1999) for the UK standard tables for annuitant and pensioner
population for the period 1991-1994.
Ballotta and Haberman's model can be viewed as an extension of Milevsky and
Promislow's mean reverting Brownian Gompertz (MRBG) model (2001), where the
dynamics of the hazard rate process for a fixed cohort is described by:
    µ(t) = µ(0) e^{gt + σY_t},   g, σ, µ(0) > 0,                                  (2.3.5)
    dY_t = −a Y_t dt + dB_t.                                                     (2.3.6)
Ballotta and Haberman's model is obtained from Milevsky and Promislow's model if
the parameters µ(0) and g are made age-dependent.
It is worth pointing out that the feature of mean reversion in the process Y_t (see
formula (2.3.3) or (2.3.6)), and particularly reverting to a long-run value equal to zero,
ensures that the fluctuation is centered around the pre-specified curve. For this reason,
this type of mean-reverting process captures largely just random fluctuations. In
order to enhance the model's ability to describe systematic deviations in mortality
dynamics, Ballotta and Haberman introduced a random parameter α using the same
idea as in formula (1.3.3). Ad hoc assumptions are made: α ∈ {−0.01, −0.03, −0.05}
with probabilities p = (1/3, 1/3, 1/3) or p = (0.2, 0.2, 0.6) to represent different beliefs
about future mortality development.
There is no closed form for survival probabilities, because the (integral) sum of
lognormal variates is not lognormal. Thus, numerical calculation is conducted by
Monte Carlo simulation for both models.
Affine approach
Dahl (2004) proposed a special affine mortality dynamics.
Definition 2.3.1. (Affine mortality structure). If, for fixed x, the survival probabil-
ities are given by
P (t, t, T, x) = eA(t,x,T )−B(t,x,T )µx+t (2.3.7)
for deterministic function A(t, x, T ) and B(t, x, T ), then the model for the mortality
intensity is said to possess an affine mortality structure for cohort x. If (2.3.7) holds
for all x, then the model is simply said to possess an affine mortality structure.
Affine mortality structures are of interest, since they allow survival probabilities
to be expressed by the relatively simple expression (2.3.7). Dahl
further showed that a sufficient condition for having an affine mortality structure is
to have the following form for the mortality intensity:
    dµ(x + t) = α^µ(t, x, µ(x + t)) dt + σ^µ(t, x, µ(x + t)) dB^µ_t,              (2.3.8)
    α^µ(t, x, µ(x + t)) = δ^α(t, x) µ(x + t) + ζ^α(t, x),                         (2.3.9)
    σ^µ(t, x, µ(x + t)) = sqrt( δ^σ(t, x) µ(x + t) + ζ^σ(t, x) ).                 (2.3.10)
In other words, for fixed x, an affine structure for α^µ and (σ^µ)^2 in t ensures an
affine mortality structure, with A and B solving the following differential equations:
    ∂_t B(t, x, T) + δ^α(t, x) B(t, x, T) − (1/2) δ^σ(t, x) (B(t, x, T))^2 = −1,   B(T, x, T) = 0,   (2.3.11)
    ∂_t A(t, x, T) = ζ^α(t, x) B(t, x, T) − (1/2) ζ^σ(t, x) (B(t, x, T))^2,        A(T, x, T) = 0.   (2.3.12)
Thus, provided we can solve the differential equations for A and B, an affine mortality
structure provides a closed form expression for the survival probabilities for cohort x.
If in addition α^µ and σ^µ are time independent, the sufficient condition is necessary
as well; see Duffie (2001).
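For constant coefficients, the Riccati-type equations (2.3.11)-(2.3.12) are easy to integrate numerically backwards from the terminal conditions. The sketch below does this with hypothetical coefficient values and then evaluates the survival probability (2.3.7); it is an illustration of the structure only, not a calibrated model.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical constant coefficients of the affine dynamics (2.3.9)-(2.3.10).
delta_a, zeta_a = 0.08, 1e-4     # drift coefficients delta^alpha, zeta^alpha
delta_s, zeta_s = 4e-4, 0.0      # squared-diffusion coefficients delta^sigma, zeta^sigma
T, mu_now = 20.0, 0.01           # horizon and current intensity mu_{x+t}

def rhs(t, y):
    B, A = y
    dB = -1.0 - delta_a * B + 0.5 * delta_s * B**2     # rearranged (2.3.11)
    dA = zeta_a * B - 0.5 * zeta_s * B**2              # (2.3.12)
    return [dB, dA]

# Integrate backwards from the terminal conditions B(T) = A(T) = 0 down to time 0.
sol = solve_ivp(rhs, (T, 0.0), [0.0, 0.0], rtol=1e-8)
B0, A0 = sol.y[0, -1], sol.y[1, -1]
print("survival probability from (2.3.7):", np.exp(A0 - B0 * mu_now))
```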
Although the affine mortality structure provides mathematical convenience, special
care is needed in the model specification to incorporate the statistical features of
historical mortality data. Under the affine structure, Dahl considers a model similar
to Ballotta and Haberman (2006)'s reduction factor formula. In particular, Dahl and
Møller (2006) propose a dynamic for the mortality intensity process as
    µ(x, t) = µ(x + t, 0) RF(x, t),                                              (2.3.13)
where µ(x + t, 0) is some smooth initial mortality intensity curve, e.g. µ(x + t, 0) =
α + β c^{x+t}, which can be estimated by standard statistical methods, and the reduction
factor is modelled by a special extended Cox-Ingersoll-Ross model:
    dRF(x, t) = (γ(x, t) − δ(x, t) RF(x, t)) dt + σ(t, x) sqrt(RF(x, t)) dB_t,     (2.3.14)
where γ(x, t), δ(x, t) and σ(t, x) are positive bounded functions. It can be shown
that the extended CIR model ensures strict positivity of the mortality intensity for
cohort x provided that for fixed x we have 2γ(x, t) ≥ (σ(t, x))2, for all t ∈ [0, T ], see
Maghsoodi (1996). Furthermore, the model is mean reverting around the time and
cohort dependent level γ(x, t)/δ(x, t). It then follows via Itô's formula that
    dµ(x, t) = (γ^µ(x, t) − δ^µ(x, t) µ(x, t)) dt + σ^µ(t, x) sqrt(µ(x, t)) dB_t,   (2.3.15)
where
    γ^µ(x, t) = γ(x, t) µ(x + t, 0),                                             (2.3.16)
    δ^µ(x, t) = δ(x, t) − [ (d/dt) µ(x + t, 0) ] / µ(x + t, 0),                   (2.3.17)
    σ^µ(t, x) = σ(t, x) sqrt(µ(x + t, 0)).                                        (2.3.18)
This shows that µ also follows a time-inhomogeneous CIR model. The model specifi-
cation ensures that the mortality intensity given by (2.3.15) admits an affine mortality
structure. Model performance of (2.3.15) has been investigated by simulation. The
histogram for the expected lifetime of a policyholder aged 30 is plotted. The variation
of e_30 can be viewed as an overall indicator of the uncertainty related to the mortality
schedule for (30). Specifying γ(x, t) = δ e^{−γt}, δ(x, t) = δ and σ(t, x) = σ with
parameter values (γ, δ, σ) = (0.008, 0.2, 0.02), the histogram shows that there is only
a relatively small variation associated with the expected lifetime. This actually
explains why the reserves obtained from their stochastic model show little difference
from the compatible deterministic model. We believe this partially justifies the
non-mean-reversion criterion for mortality model specification: as the mean-reverting
speed parameter approaches zero, the model generates greater variation in the expected
lifetime. See Figure 7 in Dahl and Møller (2006).
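A rough sense of this simulation exercise can be obtained from the following sketch, which applies a full-truncation Euler scheme to the reduction-factor dynamics (2.3.14) with the parameter values quoted above; the Gompertz-Makeham base curve µ(x + t, 0) used here is a hypothetical placeholder, so the resulting distribution of the expected lifetime is only indicative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Parameter values quoted above for (2.3.14); the base curve below is a placeholder.
gamma_, delta_, sigma_ = 0.008, 0.2, 0.02
alpha_g, beta_g, c_g = 5e-4, 7e-5, 1.09          # hypothetical mu(x+t, 0) = alpha + beta * c**(x+t)
x0, dt, horizon, n_paths = 30, 0.25, 90.0, 500

def mu0(age):
    return alpha_g + beta_g * c_g ** age

steps = int(horizon / dt)
e30 = np.empty(n_paths)
for p in range(n_paths):
    RF, H, life = 1.0, 0.0, 0.0                  # reduction factor, cumulative hazard, e_30
    for k in range(steps):
        t = k * dt
        mu = mu0(x0 + t) * RF                    # intensity (2.3.13)
        life += np.exp(-H) * dt                  # Riemann sum for the expected lifetime
        H += mu * dt
        # full-truncation Euler step of (2.3.14) with gamma(x,t) = delta * exp(-gamma * t)
        drift = (delta_ * np.exp(-gamma_ * t) - delta_ * RF) * dt
        RF = max(RF + drift + sigma_ * np.sqrt(max(RF, 0.0) * dt) * rng.standard_normal(), 0.0)
    e30[p] = life

print("mean of e_30:", round(e30.mean(), 2), "  std of e_30:", round(e30.std(), 3))
```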
Along another line, Biffis (2005) proposed a two-dimensional affine process Y =
(µ, µ̄), whose first component is the random intensity of mortality µ itself, while the
second component µ̄ describes the dynamics of its stochastic drift:
    dµ_t = γ_1 (µ̄_t − µ_t) dt + σ_1 sqrt(µ_t) dB^1_t,                              (2.3.19)
    dµ̄_t = γ_2 (m(t) − µ̄_t) dt + σ_2 sqrt(µ̄_t − m*(t)) dB^2_t,                    (2.3.20)
where B = (B^1, B^2) is a standard Brownian motion in R^2; γ_1, γ_2 are parameters rep-
resenting the ‘speed of mean reversion’ of µ to µ̄ and of µ̄ to m, after any fluctuations
due to B occur; the functions m and m* are bounded and continuous. The function
m is a suitable demographic basis, such as an available mortality table, acting as a
time-varying target for the stochastic drift µ̄. The function m* is a time-varying lower
boundary for the stochastic drift µ̄. It is interpreted as a more optimistic assump-
tion (in terms of mortality improvements) than that implied by m. Indeed, to make
sure that Y = (µ, µ̄) is well-defined, i.e. that µ ≥ 0 a.s. and µ̄ ≥ m*, the following
conditions are imposed: m ≥ m* ≥ 0, µ̄_0 ≥ m*(0) and µ_0 ≥ 0. As a consequence,
we see that m and µ̄ always dominate m*. Note, however, that the conditions stated
above only ensure that µ is nonnegative, so that some paths of µ may actually fall
below m*.
Since the model takes into account the risk of random fluctuations around µ̄ and
around the drift's target m, it allows some degree of systematic mortality risk.
However, the biological implication of mean reversion still lacks justification.
For numerical presentation purposes, it is assumed that
    m(x + t) = (c/θ^c) (x + t)^{c−1},   θ, c > 0,                                 (2.3.21)
which is fitted to the table COH48, a projection relative to Italian males born in 1948.
The asymptotic Weibull intensity m∗ is set to be low enough to allow fluctuations of
µ below m, yet large enough to prevent mortality improvements from being unrea-
sonable. The simulation result in Biffis (2005) shows that the model can capture the
rectangularization and expansion properties well.
Time series approach
Cairns, Blake, and Dowd (2006a) work differently from the above two approaches.
They follow the spirit of Lee-Carter’s model, starting from fitting the empirical mor-
tality data to obtain time series variables. Specifically, the following model is adopted:
    q(x, t) = 1 − p(t + 1, t, t + 1, x) = e^{A_1(t) + A_2(t)(x+t)} / (1 + e^{A_1(t) + A_2(t)(x+t)}).   (2.3.22)
In this equation, A1(t) and A2(t) are two stochastic factors that are assumed to be
measurable at time t. The first affects mortality at all ages in an equal manner,
whereas the second has an effect on mortality that is proportional to age. A(t) =
(A1(t), A2(t))′ is then modelled by a bivariate ARIMA time-series model to describe
the evolvement of the curve over time:
A(t + 1) = A(t) + β + CZ(t + 1) (2.3.23)
where β is a constant 2× 1 vector, C is a constant 2× 2 upper triangular matrix and
Z(t) is a 2-dimensional standard normal random variable. The following parameters
are estimated based on England and Wales data from 1961 to 2002, which is available
from the Government Actuary’s Department website www.gad.gov.uk.
    β = \begin{pmatrix} −0.0434 \\ 0.000367 \end{pmatrix}   and   V = CC′ = \begin{pmatrix} 0.01067 & −0.0001617 \\ −0.0001617 & 0.000002590 \end{pmatrix}.   (2.3.24)
Based on data from 1982 to 2002, the estimated parameters are
    β = \begin{pmatrix} −0.0669 \\ 0.000590 \end{pmatrix}   and   V = CC′ = \begin{pmatrix} 0.00611 & −0.0000939 \\ −0.0000939 & 0.000001509 \end{pmatrix}.   (2.3.25)
These results show a trend change after 1982, with β1 and β2 both becoming larger
in magnitude. This reminds us of the model risk problem with Lee-Carter model.
For the model to be biologically reasonable, some requirements on parameter val-
ues should be imposed. The negative value for β1 indicates generally improving mor-
tality, and the numerical results shows this improvement is strengthened after 1982.
The positive value for β2 means that mortality rates at higher ages are improving at
a slower rate. Theoretically, there is a chance, under the current model specification
with positive β2, that, after a certain age (referred to as crossover point), the model
predicts deteriorating mortality (in other words, mortality rates at ages higher than
the crossover point will rise over time rather than fall), and this might not
be viewed as realistic. Practically, since this point is a very high age (e.g. age 113)
based on the estimated parameter values, this might not impose a serious problem as
the number of lives involved is very low after age 113.
Another aspect of biological reasonableness is to check if period tables reflect the
observed pattern. That is, for fixed t, q(t, x) should normally be an increasing function
of x. This requires that A2(t) remain positive. Based on the above model parameter
values, A_2(t) seems very unlikely to become negative, although theoretically it can.
Thus, practically, their model can be regarded as satisfying this aspect of biological
reasonableness as well.
The "short rate" dynamics (in discrete time) of q(t, x) for a cohort is the most
important aspect and can be investigated via the following form:
    log[ q(t + 1, x)/p(t + 1, x) ]                                               (2.3.26)
      = A_1(t + 1) + A_2(t + 1)(x + t + 1)                                       (2.3.27)
      = (1, x + t + 1)[ A(t) + β + CZ(t + 1) ]                                   (2.3.28)
      = log[ q(t, x)/p(t, x) ] + (β_1 + β_2(x + t + 1) + A_2(t)) + (1, x + t + 1) CZ(t + 1).   (2.3.29)
It is noted that A_2 in 2002 is 0.1058 and that the standard deviation 0.006 of A_2(t) is
very small over the time horizons concerned. Thus, β_1 + β_2(x + t + 1) + A_2(t) is initially
positive and is expected to stay positive. As a consequence, the cohort will experience
generally increasing rates of mortality, with occasional falls in years in which a large
random mortality improvement occurs across the board (that is, when (1, x + t + 1) CZ(t + 1) ≪ 0).
In order to obtain the dynamics of S(t + 1), the following approximation is used:
    S(t + 1) = S(t)(1 − m(t, x)),                                                (2.3.30)
where m(t, x) is the central death rate, given by
    m(t, x) = q(t, x) / (1 − (1/2) q(t, x)).                                      (2.3.31)
The model is not analytically tractable, so it is necessary to resort to Monte Carlo
simulation for most purposes.
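The following sketch illustrates such a simulation: one path of (2.3.23) is generated with the 1961-2002 estimates (2.3.24), and the cohort survivor index is built up through (2.3.22), (2.3.30) and (2.3.31). The starting value A_1(0) = −11.0 is a hypothetical choice made only to give a plausible base-year q(65); only A_2(0) = 0.1058 is quoted in the text.

```python
import numpy as np

rng = np.random.default_rng(2)

# Estimated drift and covariance (2.3.24), England & Wales males, 1961-2002.
beta = np.array([-0.0434, 0.000367])
V = np.array([[0.01067, -0.0001617],
              [-0.0001617, 0.000002590]])
C = np.linalg.cholesky(V)                    # any C with C C' = V generates the same law

# A2(0) = 0.1058 is quoted in the text; A1(0) = -11.0 is a hypothetical starting value.
A = np.array([-11.0, 0.1058])

x, years = 65, 25
S, surv = 1.0, [1.0]
for t in range(years):
    A = A + beta + C @ rng.standard_normal(2)          # one step of (2.3.23)
    eta = A[0] + A[1] * (x + t + 1)
    q = np.exp(eta) / (1.0 + np.exp(eta))              # realized mortality rate, as in (2.3.22)
    m = q / (1.0 - 0.5 * q)                            # central death rate (2.3.31)
    S *= (1.0 - m)                                     # survivor index update (2.3.30)
    surv.append(S)

print(np.round(surv[::5], 4))
```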
Cairns, Blake, and Dowd (2006b) present the simulated spot survival probability
P (0, 0, t, 65) (or simply S(t)) with its associated confidence interval. The confidence
interval grows in quite a different way from, say, that which we may associate with
an investment in equities. This point is best illustrated by looking at the variance
of the logarithm of S(t), as illustrated in Figure 4 of their original paper. It can be
seen that the variance is very low in the early years indicating that we can predict
mortality rates with reasonable precision over the near future. However, after time 10
the variance starts to grow very rapidly (almost “exponentially”). This contrasts with
equities, where we would expect to see linear, rather than exponential, growth in
the variance if the price process follows geometric Brownian motion.
The explanation for this variance growth is that longer-term survival prob-
abilities incorporate the compounding of year-by-year mortality shocks: the survival
probability for year t depends on shocks applied to mortality rates in each year from
1 to t, and each individual shock affects survival probabilities in all subsequent years.
This property will affect the price of an annuity, in the sense that the premium
charged for the 25-year payment will be much larger than, say, that for the 10-year
payment.
2.3.3 Summary
With regard to the criteria for mortality models, we show which of these characteris-
tics have been displayed by the various models in Table 2.1.
Other Remarks
1. Currently, most (if not all) approaches are proposed in the framework of a short
rate model, starting from theoretical convenience, such as an affine structure. Few
connect their models with the statistical properties of observed mortality data.
2. In order to be consistent with historical data and satisfy biological requirements,
Table 2.1: Key Characteristics of Stochastic Mortality Models

Criteria                          B&P   Dahl   Biffis   Cairns et al.
µ(x, t) > 0                        Y     Y      Y        Y
Biological reasonableness          Y     Y      Y        Y (based on properly
                                                            chosen parameter values)
Rectangularization & expansion     -     -      Y        -
Cohort relationship                -     -      -        Y
Trend deviation                    N     N      Y        Y
Non mean reversion                 N     N      N        Y
Tractability                       N     Y      N        N
many proposed models start from the estimated or projected mortality sched-
ule, and the stochastic feature is introduced via adding random fluctuations
into a deterministic target. However, empirical study on mortality shows that
mortality risk mainly comes from trend variation rather than random fluctua-
tion.
3. We want to point out here that trend deviation has not been well addressed
by the currently proposed stochastic mortality models. It is thus important
to stress again that it is the deviation pattern 2 in Figure 1.13 that we are
trying to model, while the mean reverting processes are just simulating the
pattern 1. In this regard, Lee-Carter approach is superior in the sense that the
trend deviation is described by the level uncertainty. However, model risk in
Lee-Carter approach is a problem with which we need to be concerned.
Chapter 3
The Time-Changed Markovian Mortality Model
3.1 Dynamic Approach of Mortality Modelling
As we have shown earlier, survival analysis has focused on modelling the survival
function or the hazard rate (the force of mortality) of a life or a population, as seen
in Gompertz's model and its various extensions. In spite of their apparent simplicity,
these concepts are highly aggregated and affected by many factors. Therefore, predictive
mortality models based on them are difficult to interpret. A process-based
approach, viewing the survival time of a life as the occurrence time of certain events
of a process, may have certain advantages, as it may provide biological interpretations
to a model and may utilize mathematical tools developed in the theory of stochastic
processes.
In this chapter, we consider a finite-state continuous-time Markov process with
constant intensities of transition for modelling the survival time. The state space of
the Markov process is assumed to consist of a set of transient states and a single
absorbing state. The initial probability distribution is defined on the transient state
space and represents the unobservable health status of the life, and the survival time
is the time of absorption. As a result, the distribution of the survival time follows a
phase type distribution.
The use of a phase type distribution and its underlying Markov process as a failure
time model has applications in engineering and medical statistics. For
example, AIDS patients usually progress through various stages of severity during
the incubation period, which can be characterized by changes in CD4+ levels and
modelled by a Markov process accordingly. Unfortunately, this approach has rarely
been used for the modelling and analysis of human mortality. This chapter
and the next chapter are an attempt in this direction.
3.1.1 Notation
Hereafter, vectors and matrices are always written in bold letters: lower case and
upper case, respectively. Subscripts may be used to indicate dimensions, for instance,
Am×n means a matrix with m rows and n columns. We may write A = (aij) to
emphasize a matrix with the entry on the ith row and the jth column being aij.
Specifically, the identity matrix is denoted by I, the vector with all entries equal to 1
is denoted by e, and the vector with all entries equal to 0 is denoted by 0. The symbol
D(d), or (dj)diag, is used to denote the diagonal matrix with d = (d1, d2, · · · , dn) being
the diagonal entries.
3.1.2 Phase-type distributions
Definition 3.1.1. Let J_t be a time-homogeneous Markov process on a finite state
space S = E ∪ {∆} = {1, 2, · · · , n} ∪ {∆}, where ∆ is absorbing and the states in E are
transient. The initial distribution is given by (α, 0) (written as a row vector), and the
infinitesimal generator is expressed as
    \begin{pmatrix} Λ & q \\ 0 & 0 \end{pmatrix}.                                (3.1.1)
Let τ denote the time until absorption or the time until death in the survival
analysis context. Then τ is said to follow a phase-type (PH) distribution with a
representation (α,Λ) of order n.
In other words, the matrix Λ = (λij)n×n is the matrix of transition rates among
the transient states, and q = (qi)n×1 is the column vector of absorption rates into
state ∆ from the transient states:
    P(J_{t+ε} = j | J_t = i) = λ_{ij} ε + o(ε),   i, j ∈ E,  i ≠ j,
    P(J_{t+ε} = ∆ | J_t = i) = q_i ε + o(ε),      i ∈ E.
Therefore Λ is a sub-intensity matrix, meaning that λ_{ii} < 0, λ_{ij} ≥ 0 for i ≠ j and
∑_{j∈E} λ_{ij} ≤ 0. Further, q = −Λe, where e is the column vector of ones.
Some useful references on phase type distributions are Neuts (1981), Asmussen
(1987, 2000a,b), and O’Cinneide (1989, 1990, 1999). Statistical fitting of phase type
distributions using the EM algorithm is presented in Asmussen, Nerman, and Olsson
(1996). A survey from a survival analysis point of view is given by Aalen (1995) who
focused on the connection between the underlying process and various shapes of the
hazard rate. Applications using phase type distributions as survival models can be
found in Kay (1986), Longini, Clark, Gardner, and Brundage (1991).
A huge advantage of phase type distributions is their mathematical tractability. It
is possible to compute the various quantities of interest associated with a phase type
distribution as seen in the following theorem. Typically, analytical forms for these
quantities are expressed as matrix exponentials and matrix inverses. The calculation
of these matrices may be cumbersome by hand. However, many symbolic and numerical
programs such as Mathematica, Maple, and Matlab are available for deriving analytical
formulas as well as carrying out the matrix computations numerically.
In this thesis, most of the numerical work involving phase type distributions is
carried out in Matlab, making this approach convenient and attractive.
Theorem 3.1.2. Let τ have a phase-type distribution with representation (α, Λ). Then
we have:
survival function
    s(t) = α exp(Λt) e,   t > 0;                                                 (3.1.2)
density function
    f(t) = −s′(t) = α exp(Λt) q,   t > 0;                                        (3.1.3)
moment generating function
    M(s) = α(−sI − Λ)^{−1} q;                                                    (3.1.4)
and non-central moments
    m_k = (−1)^k k! α Λ^{−k} e,   k = 1, 2, · · · .                               (3.1.5)
Proof. Let p_{ij}(t) = Pr(J_t = j | J_0 = i) denote the transition probabilities from i to
j over time t, for t ≥ 0, i, j ∈ E. Then P(t) = (p_{ij}(t)) satisfies the Kolmogorov
forward and backward equations
    (d/dt) P(t) = P(t) Λ = Λ P(t).
Since P(0) = I, the equation has a unique solution
    P(t) = exp(Λt).
Hence
    s(t) = P_α(τ > t) = P_α(J_t ∈ E) = ∑_{i,j∈E} α_i p_{ij}(t) = α P(t) e = α exp(Λt) e,
which proves (a). (b) follows from
    f(t) = −s′(t) = −α (d/dt) P(t) e = −α exp(Λt) Λ e = α exp(Λt) q.
For part (c), using formula (A.1.3) for integrating matrix exponentials, we have
    M(s) = ∫_0^∞ e^{st} α e^{Λt} q dt = α ( ∫_0^∞ e^{(sI+Λ)t} dt ) q = α(−sI − Λ)^{−1} q.
The last equality holds because all the eigenvalues of the sub-intensity matrix Λ have
negative real parts.
Part (d) follows by differentiating the m.g.f. M(s):
    M^{(k)}(s) = (d^k/ds^k) α(−sI − Λ)^{−1} q = (−1)^{k+1} k! α(sI + Λ)^{−k−1} q,
    M^{(k)}(0) = (−1)^{k+1} k! α Λ^{−k−1} q = (−1)^k k! α Λ^{−k} e.
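As an illustration of Theorem 3.1.2, the following Python sketch evaluates these quantities for a small, made-up phase-type representation using matrix exponentials and inverses (the thesis's own computations of this kind are done in Matlab).

```python
import math

import numpy as np
from scipy.linalg import expm

# A small illustrative phase-type representation (alpha, Lambda) of order 3;
# the rates are made-up numbers, not fitted values.
alpha = np.array([1.0, 0.0, 0.0])
Lam = np.array([[-0.40,  0.30,  0.00],
                [ 0.00, -0.43,  0.32],
                [ 0.00,  0.00, -0.50]])
e = np.ones(3)
q = -Lam @ e                                  # absorption rates, q = -Lambda e

def survival(t):
    return alpha @ expm(Lam * t) @ e          # s(t) = alpha exp(Lambda t) e, (3.1.2)

def density(t):
    return alpha @ expm(Lam * t) @ q          # f(t) = alpha exp(Lambda t) q, (3.1.3)

def moment(k):
    # m_k = (-1)^k k! alpha Lambda^{-k} e, (3.1.5)
    Lam_inv_k = np.linalg.matrix_power(np.linalg.inv(Lam), k)
    return (-1) ** k * math.factorial(k) * alpha @ Lam_inv_k @ e

print(survival(5.0), density(5.0), moment(1), moment(2))
```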
In the following, we provide two more important theorems on phase type distri-
butions without proof.
Theorem 3.1.3. The class of phase-type distributions is closed under the forma-
tion of finite mixtures, finite convolutions, finite minima and maxima, and geometric
compounds.
For a proof, see Asmussen (2000a, p353).
Theorem 3.1.4. The class of phase-type distributions is dense (in the sense of weak
convergence) in the class of all distributions on (0,∞).
For a proof, see Asmussen (2000b, p201).
The closure property is important to guarantee that the underlying Markov struc-
ture of a phase type distribution is preserved after certain operations. This results
in a general rule that if a probabilistic problem involving exponential distributions
has an explicit solution, the problem will also have an explicit solution when the
exponential distributions are replaced by phase type distributions. As a result, phase
type distributions have become a convenient computational tool in applied probabil-
ity. For typical examples, see Neuts (1981), Asmussen and Rolski (1991), Asmussen
(2000b), and references therein. Furthermore, due to the denseness property of phase
type distributions, there is no essential loss in generality by assuming a phase type
distribution instead of an arbitrary distribution from a practical viewpoint, since any
distribution on [0, ∞) can be approximated by a phase type distribution to any desired
degree of accuracy.
The potential structure of phase type distributions is very rich. We will only
discuss some of the most basic types of phase type distributions.
Commonly used phase type distributions
(1) Hyper-exponential distribution corresponds to a Markov process that can
    start in any state (all elements in α are allowed to be non-zero), but terminates
    in the absorbing state without visiting any other state (all off-diagonal entries
    in Λ are zero). A hyper-exponential distribution is also known as a mixture of
    exponentials (and thus is a phase-type distribution by the closure property); its
    density is ∑_{i=1}^n α_i λ_i e^{−λ_i t}.
(2) Erlang distribution corresponds to a Markov process that must start from
state 1, and then visit each state 2,· · · , n, in that order, and terminates when it
leaves state n. The transition rates are all the same and equal to λ. As a result,
we obtain a Gamma distribution with integer n as its shape parameter and λ
as its scale parameter. In this case, the phase type distribution is actually the
convolution of exponential distributions, and its density is λ^n t^{n−1} e^{−λt}/(n − 1)!.
It is easy to generalize the Erlang distribution by assigning different transition
rates between each pair of consecutive states.
The Markov process associated with a phase type distribution can usually be
illustrated by a phase diagram. For a generalized Erlang distribution, the dia-
gram is
    Diagram 3.1. Phase diagram for a generalized Erlang distribution
    [diagram: states 1, 2, · · · , n visited in order, with transition rates λ_1, λ_2, · · · , λ_n; not reproduced in this transcript]
corresponding to a phase representation of α = (1, 0, · · · , 0)_{1×n} and
    Λ = \begin{pmatrix}
        −λ_1 & λ_1 & 0 & · · · & 0 \\
        0 & −λ_2 & λ_2 & · · · & 0 \\
        0 & 0 & −λ_3 & \ddots & 0 \\
        \vdots & \vdots & \ddots & \ddots & \vdots \\
        0 & 0 & 0 & · · · & −λ_n
    \end{pmatrix}.                                                               (3.1.6)
(3) Coxian distribution can be constructed from generalized Erlang distribution
with the exception that the absorbing state can be reached from all of the other
states. Thus, from state i the Markov process can either jump to state i + 1 or
to the absorbing state. It has a phase diagram of the following form:
    Diagram 3.2. Phase diagram for a Coxian distribution
    [diagram: from state i the process moves to state i + 1 at rate λ_i′ or is absorbed at rate q_i; not reproduced in this transcript]
corresponding to a phase representation of α = (1, 0, · · · , 0) and
    Λ = \begin{pmatrix}
        −λ_1 & λ_1′ & 0 & · · · & 0 \\
        0 & −λ_2 & λ_2′ & · · · & 0 \\
        0 & 0 & −λ_3 & \ddots & 0 \\
        \vdots & \vdots & \ddots & \ddots & \vdots \\
        0 & 0 & 0 & · · · & −λ_n
    \end{pmatrix},                                                               (3.1.7)
where
    λ_i = λ_i′ + q_i.                                                            (3.1.8)
A generalized Coxian distribution corresponds to a Markov process that is the same
as the above but can start from any state (a sketch constructing such generators in code
is given after this list).
(4) Triangular phase type distributions are phase type distributions of special
    interest. They can be defined as those with a triangular representation (α, Λ), Λ
    being an upper-triangular matrix. In terms of the Markov process, this means that
    there is no "feedback" in the state space, i.e. no state can be visited more than
    once. When the states are ordered in a suitable manner, the transition matrix
    has only zero elements below the main diagonal.
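The sketch below, referred to in item (3), builds the Coxian sub-intensity matrix (3.1.7) from the rates (λ_i′, q_i); the numerical rates used are arbitrary examples.

```python
import numpy as np

def coxian_generator(lam_next, q_abs):
    """Build the sub-intensity matrix (3.1.7) of a Coxian distribution.

    lam_next[i] is the rate lambda_i' of moving from state i+1 to state i+2
    (with a trailing 0 for the last state), and q_abs[i] is the absorption
    rate q_i; by (3.1.8) the total exit rate of state i+1 is lambda_i' + q_i.
    """
    lam_next, q_abs = np.asarray(lam_next, float), np.asarray(q_abs, float)
    return np.diag(-(lam_next + q_abs)) + np.diag(lam_next[:-1], k=1)

# Example: a 3-state Coxian; alpha = (1, 0, 0) would start the process in state 1.
q = np.array([0.10, 0.11, 0.50])
Lam = coxian_generator([0.30, 0.32, 0.0], q)
print(Lam)
print("row sums + absorption rates:", Lam @ np.ones(3) + q)   # should be ~0
```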
Remarks
On one hand, it is easily seen that types (1), (2) and (3) are all triangular. On
the other hand, it is proved in O'Cinneide (1989) that any phase type distribution
with an order-n triangular representation has a Coxian representation of the same
order. This result suggests that the Coxian class is more general than it seems, and that
all models without feedback can be reduced to it.
More surprisingly, for the models with feedback, in particular, birth-death pro-
cesses with absorption, it has been proved by Keilson (1979, p59) that the absorption
time distribution for those models can also be expressed as a generalized Erlang dis-
tribution. In other words, this type of distribution can also be reduced to a (special)
Coxian-type distribution.
The non-uniqueness of phase type distributions may be of interest in stochastic
modelling. In many applications, it might be necessary to allow the process to move
back and forth between states before being absorbed. For instance, this may be real-
istic for some diseases where recovery may take place. However, the result mentioned
above implies that some feedback models are equivalent to the one without feedback
from a distributional viewpoint (see Aalen, 1995, p455 for more examples and condi-
tions). Therefore, it is often sufficient to assume a Coxian structure for the underlying
Markov process when one considers a phase type mortality model.
3.1.3 Phase-type distributions as mortality models
The discussions in the previous section provide some reasoning, from a mathematical
point of view, on why it is of interest to consider a Coxian distribution as our ba-
sic mortality model. In this section, we will provide some biological and statistical
reasoning to further support this hypothesis.
By assuming a Coxian structured phase type distribution, the underlying Markov
process can be interpreted as an aging process. Although there is not yet any common
consensus in biological science on what the appropriate markers of an aging process
might be, our model simply implies that aging is an irreversible process passing through
a number of stages. The aging process that we are trying to model may be perceived
as the representation of a health index of a life.
The general aging theory thus permits us to construct a Coxian structured phase-
type distribution to model age specific patterns of human mortality (see Lin and Liu,
2007). A full description of this work is given in the next chapter, in which we show
that a Coxian distribution and its associated Markov process (see Diagram 3.2 in
the previous section) can be used to describe the underlying aging process and the
absorbing time distribution can provide a good fit to various mortality patterns for
human populations.
The transition matrix of the proposed aging process has a strict Coxian form
(see Section 4.2 for details). By "strict", we mean that, in expressions (3.1.7) and (3.1.8),
λ_i > λ_i′ for all i = 1, · · · , n − 1. In addition, for any i ≠ j, we have λ_i ≠ λ_j. Under this
model setting, Λ is diagonalizable with n distinct eigenvalues −λ_i (i = 1, · · · , n). Let
ν_1, · · · , ν_n be the corresponding left eigenvectors and h_1, · · · , h_n the corresponding
right eigenvectors. Then, the resulting survival function S(t) for the time-till-death
random variable can be expressed as
    S(t) = ∑_{i=1}^n e^{−λ_i t} α (h_i ⊗ ν_i) e,                                  (3.1.9)
where ⊗ is the Kronecker product. For more details on Kronecker product, see
Appendix A.
In contrast, the survival function of a general phase type distribution may be
written in the following form:
    S(t) = ∑_{k=1}^r ∑_{j=0}^{m_k−1} c_{kj} (t^j / j!) e^{ρ_k t},                 (3.1.10)
where ρ_1, · · · , ρ_r are the eigenvalues of Λ with multiplicities m_1, · · · , m_r, while the
c_{kj} are constants (see Aalen, 1995; Harrison, 1990).
To obtain a good fit to the entire schedule of human mortality, the estimated
phase type distribution has a relatively high dimension (see Section 4.3). However, in
this chapter, we will adopt lower-dimensional phase type distributions with the same
intensity structure as the ones presented in the next chapter. There are two reasons for
this. First, in some actuarial practice, and in the actuarial evaluation of mortality-linked
products in particular, the focus is on the population of age 65 or older. Thus,
only part of the mortality schedule is relevant and hence low-dimensional phase type
distributions are sufficient. Second, low-dimensional phase type distributions allow us
to use mathematical programs like Matlab to obtain results efficiently and precisely.
3.2 Time-changed Markovian Survival Model
In this section, we present a stochastic mortality model that can be used for the
evaluation of mortality-linked products. As discussed in the previous chapters, there
are two approaches for incorporating stochastic mortality. One is what has been done
in Pitacco (2003) and Olivieri (2001), where the uncertainty is specified by assigning a
probability distribution to a finite set of potential mortality schedules. This approach
is good for illustrating the idea, but is too subjective to be realistic. The other one
can be summarized by the so-called short-rate-modelling approach (see Cairns, Blake,
and Dowd, 2006a), under which the spot mortality rates, q(x, t), or the spot force of
mortality µ(x, t) are described by a stochastic dynamic system (using time series, or
diffusion processes). The models using this approach as well as their advantages and
disadvantages have been reviewed in Section 2.3.2.
Without doubt, more work needs to be done in this area to better incorporate
mortality risk for the purposes of mortality risk pricing and hedging. In the following,
we present an alternative approach where the random feature of the future survival
distribution will be introduced by a time-change process, and the choice of the model
parameters will be determined by calibrating to the market prices of the relevant
products.
3.2.1 Time-changed Markovian Process
In order to introduce a stochastic dynamic into a phase type mortality model, consider
the subordinated aging process Z_t,
    Z_t = J_{γ_t},
where J_t is the associated Markov aging process and γ_t is a nondecreasing continuous-
time stochastic process. Define τ = inf{t : Z_t = ∆}. Thus, τ is the absorption time of
the stochastic process Z_t. Obviously, the distribution of the absorption time τ is governed
by both the underlying Markov process Jt and the subordinator process γt.
This idea is similar to that in the work of Madan and collaborators who used
a subordinated Brownian motion to model the dynamics of the logarithm of stock
prices (see Madan and Milne, 1991; Madan, Carr, and Chang, 1998). Their process,
called the Variance Gamma (VG) process, is obtained by evaluating Brownian mo-
tion at a random time-change. Thereby each unit of the calendar time is viewed as
having an economically related trading time. In our model setting, the interpretation
is different. One can think of the underlying aging process being influenced by an
improved or worsened living environment. Improvement may arise from advances in
health science, better social system in health care, breakthrough in genetic engineer-
ing, and so on. Deterioration may come from global warming, catastrophic events,
or epidemic diseases like SARS, bird flu, and so on. Our method is to model those
random effects in the future evolution of mortality rates via imposing a time-change
process γt to the time homogenous aging process Jt. The aggregate result is that the
future death rate will be higher or lower than anticipated by S(t), conditional on the
realization of γt. That is, the original deterministic survival function is now random
in the form of S(γt), depending on the random time process γt.
As a time-change process, γ_t must be non-decreasing. For tractability, indepen-
dent and stationary increments are preferable. The class of processes with these
properties is that of Lévy subordinators (see Cont and Tankov, 2004). The most
well-known and widely used subordinator is the Gamma process. Note that our
choice of the Gamma process is largely motivated by tractability and familiarity.
Time-changed Markov processes have also been used by Hurd and Kuznetsov (2006)
in credit risk modelling, where the credit migration of a company is described by a
finite-state Markov process combined with a stochastic time change. From a mathe-
matical point of view, they point out that a wide class of processes are potential
candidates for the time-change process; for example, a positive mean-reverting dif-
fusion process, a positive mean-reverting pure jump process, or a combination
of both (see equations (13) and (14) in Hurd and Kuznetsov, 2006, for more details).
Under those model settings, many results in Hurd and Kuznetsov (2006)'s paper
are of affine structure. In contrast, we make use of a matrix-analytic method and
express the results in terms of phase-type representations. Nonetheless, it is of interest to
investigate the appropriateness of other time-change processes in mortality modelling
in our future work.
3.2.2 The Gamma process
A random variable, X, is said to have a Gamma distribution with shape parameter
α > 0 and scale parameter β > 0 if it has density function fX as
    f_X(x) = (β^α / Γ(α)) x^{α−1} e^{−βx},   x > 0.                               (3.2.1)
We simply write X ∼ Γ(α, β). The moment generating function of the Gamma
distribution is given by
    φ(u) = E(e^{uX}) = (1 − u/β)^{−α}.                                            (3.2.2)
It is well known that X has mean α/β and variance α/β^2. In particular, when
α = n, n = 1, 2, · · · , a positive integer, X has an Erlang distribution with the phase-
type representation b = (1, 0, · · · , 0)_{1×n},
    Γ = \begin{pmatrix}
        −β & β & 0 & · · · & 0 \\
        0 & −β & β & · · · & 0 \\
        \vdots & \vdots & \ddots & \ddots & \vdots \\
        0 & 0 & 0 & · · · & −β
    \end{pmatrix}_{n×n}.                                                         (3.2.3)
Then, the density function f_X(x) can be written in terms of the phase-type represen-
tation:
    f_X(x) = b e^{Γx} I_β,                                                        (3.2.4)
where I_β = (0, · · · , 0, β)′_{n×1}. This property is useful when we compute the moments of
the survival index at discrete time points in later sections.
The Gamma process γt can be constructed to have the following properties:
1. γ0 = 0;
2. it has independent increments, i.e., for any 0 ≤ t0 < t1 < · · · < tn the random
variables γt1 − γt0 , · · · , γtn − γtn−1 are independent; and
3. γt+s − γt ∼ Γ(αs, β) for any s, t ≥ 0.
For simplicity, we denote it by Γ(t; α, β). For the time-change purpose, we let
α = β = 1/ν, where ν > 0 is a parameter. In other words, the Gamma process under
consideration is Γ(t; 1/ν, 1/ν). This is because we must have E(γ_t) = t for any time t.
Consequently, the Gamma process γ_t at fixed t has mean t, variance νt and skewness
2 sqrt(ν/t).
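A Gamma time change with these properties is simple to sample, as the following sketch shows; it only verifies numerically that E(γ_t) = t and Var(γ_t) = νt for an illustrative value of ν.

```python
import numpy as np

rng = np.random.default_rng(3)

def gamma_time_change(t_grid, nu, rng):
    """Sample one path of the Gamma process Gamma(t; 1/nu, 1/nu) on a time grid.

    Increments over [t, t + s] are Gamma(shape = s/nu, scale = nu), so that
    E[gamma_t] = t and Var(gamma_t) = nu * t, as required of the time change.
    """
    dt = np.diff(t_grid, prepend=0.0)
    increments = rng.gamma(shape=dt / nu, scale=nu)
    return np.cumsum(increments)

t_grid = np.arange(1, 26)
paths = np.array([gamma_time_change(t_grid, nu=0.5, rng=rng) for _ in range(5000)])
print("E[gamma_25] ~", paths[:, -1].mean(), "  Var(gamma_25) ~", paths[:, -1].var())
```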
3.2.3 Survival functions for time-changed model
We have now introduced a subordinated Markov process Zt = Jγt, with Jt following
a terminating Markov process with phase-type representation (α,Λ) and γt being
the Gamma process Γ(t; 1/ν, 1/ν). Let τ be the absorbing time random variable of
process Zt. In this section, we study the distribution of τ .
First, one has
    P(τ > t) = P(Z_t ≠ ∆) = P(J_{γ_t} ≠ ∆) = E[ E( 1_{J_s ≠ ∆} | γ_t = s ) ] = E[ S(γ_t) ].
In the last equality, S(t) is the survival function of the absorption time of J_t.
Following the notation in Section 2.2, we see that the survival function of τ is the
spot survival probability P(0, 0, t, x) = E[ S(γ_t) ] for (x), assuming that Z_t is the
underlying aging process for (x). The realized survival function at the future time t
will depend on the realization of γ_t, i.e. S(γ_t), and is therefore itself random. Based
on the information on J_t and γ_t at time t = 0, the unconditional survival function
E[ S(γ_t) ] can be calculated according to the following theorem.
Theorem 3.2.1. Assume Λ is diagonalizable, i.e. there exist n distinct eigenvalues
λ_1, · · · , λ_n with corresponding right eigenvectors h_1, · · · , h_n and left eigenvectors
ν_1, · · · , ν_n. Then Λ = ∑_{i=1}^n λ_i h_i ⊗ ν_i (see Proposition A.2.3), and P(0, 0, t, x) can be com-
puted by calculating the survival function of a phase-type distribution with represen-
tation (α, Λ̃), where Λ̃ is defined as Λ̃ = ∑_{i=1}^n λ̃_i h_i ⊗ ν_i with each λ̃_i given by
    λ̃_i = − ln(1 − ν λ_i) / ν.
That is,
    P(0, 0, t, x) = α ( ∑_{i=1}^n e^{λ̃_i t} h_i ⊗ ν_i ) e = α H (e^{λ̃_i t})_{diag} H^{−1} e = α e^{Λ̃ t} e.   (3.2.5)
Proof. By Theorem 3.1.2 and formula (3.1.9),
    P(0, 0, t, x) = E[ S(γ_t) ]
                  = ∫_0^∞ α exp(Λs) e · f_{γ_t}(s) ds                             (3.2.6)
                  = ∫_0^∞ ∑_{i=1}^n e^{λ_i s} α (h_i ⊗ ν_i) e · f_{γ_t}(s) ds
                  = ∑_{i=1}^n ( ∫_0^∞ e^{λ_i s} f_{γ_t}(s) ds ) α (h_i ⊗ ν_i) e
                  = ∑_{i=1}^n (1 − ν λ_i)^{−t/ν} α (h_i ⊗ ν_i) e
                  = ∑_{i=1}^n e^{λ̃_i t} α (h_i ⊗ ν_i) e.                          (3.2.7)
The second-to-last step uses the Laplace transform of the Gamma process Γ(t; 1/ν, 1/ν).
Defining λ̃_i = − ln(1 − ν λ_i)/ν, we then obtain formula (3.2.5).
Note that Λ can be equivalently written as Λ = HD(λ)H^{−1}, while Λ̃ = HD(λ̃)H^{−1}.
Therefore, Theorem 3.2.1 shows that the distribution of τ still has a ‘phase-type
representation’ with ‘transition matrix’ Λ̃. Detailed examples are given in the next section.
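A minimal sketch of this computation, assuming Λ is diagonalizable as in the theorem, is given below; the 2-dimensional generator used as input is a made-up example of the kind considered in the illustrations of the next section.

```python
import numpy as np
from scipy.linalg import expm

def time_changed_generator(Lam, nu):
    """Compute Lambda-tilde of Theorem 3.2.1 for a diagonalizable sub-intensity Lam.

    Each eigenvalue lambda_i of Lam is mapped to -log(1 - nu * lambda_i) / nu,
    while the eigenvectors are left unchanged.
    """
    eigvals, H = np.linalg.eig(Lam)                 # columns of H are right eigenvectors
    tilde = -np.log(1.0 - nu * eigvals) / nu
    return (H @ np.diag(tilde) @ np.linalg.inv(H)).real

# Hypothetical 2-dimensional strict Coxian generator.
Lam = np.array([[-0.40, 0.30],
                [ 0.00, -0.43]])
Lam_tilde = time_changed_generator(Lam, nu=1.0)
alpha, e = np.array([1.0, 0.0]), np.ones(2)
print(Lam_tilde)
print("P(0, 0, 10, x) =", alpha @ expm(Lam_tilde * 10.0) @ e)   # formula (3.2.5)
```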
From Proposition A.2.2 and Proposition A.1.1, we can further write the second
moment of the survival index random variable S(γ_t) as
    E[ (S(γ_t))^2 ] = E[ S(γ_t) S(γ_t) ] = E[ α e^{Λγ_t} e · α e^{Λγ_t} e ]
                    = E[ (α ⊗ α)( e^{Λγ_t} ⊗ e^{Λγ_t} )(e ⊗ e) ]
                    = (α ⊗ α) E[ e^{(Λ⊕Λ)γ_t} ] (e ⊗ e).                          (3.2.8)
Since Λ = HDH^{−1} (here D is short for D(λ)), we obtain
    Λ ⊕ Λ = Λ ⊗ I + I ⊗ Λ
          = HDH^{−1} ⊗ I + I ⊗ HDH^{−1}
          = HDH^{−1} ⊗ HIH^{−1} + HIH^{−1} ⊗ HDH^{−1}
          = (H ⊗ H)(D ⊗ I)(H^{−1} ⊗ H^{−1}) + (H ⊗ H)(I ⊗ D)(H^{−1} ⊗ H^{−1})
          = (H ⊗ H)(D ⊗ I + I ⊗ D)(H^{−1} ⊗ H^{−1})
          = (H ⊗ H)(D ⊕ D)(H^{−1} ⊗ H^{−1}).                                      (3.2.9)
It is straightforward to show that
    (H ⊗ H)^{−1} = H^{−1} ⊗ H^{−1}
(see the proof in Appendix A), and
    D ⊕ D = diag( D + λ_1 I, D + λ_2 I, · · · , D + λ_n I ).
The latter is still a diagonal matrix. Let us denote it by D ⊕ D = (λ_{ij})_{diag}, where
λ_{ij} = λ_i + λ_j denotes the i-th element in the j-th block. Thus, the second moment of
S(γ_t) can be calculated in the same way as in Theorem 3.2.1, and we have

Theorem 3.2.2.
    E[ (S(γ_t))^2 ] = (α ⊗ α)(H ⊗ H) E[ e^{(D⊕D)γ_t} ] (H^{−1} ⊗ H^{−1})(e ⊗ e)
                    = (α ⊗ α)(H ⊗ H)( e^{λ̃_{ij} t} )_{diag} (H^{−1} ⊗ H^{−1})(e ⊗ e),   (3.2.10)
where λ̃_{ij} = − ln(1 − ν λ_{ij})/ν.
At the points t_j where t_j/ν is an integer, we have an alternative formula for the
second moment E[ (S(γ_{t_j}))^2 ]. Since f_{γ_{t_j}}(s) = b e^{Γs} I_β (see (3.2.4)), where Γ is a
(t_j/ν) × (t_j/ν) matrix and β = 1/ν,
    E[ (S(γ_{t_j}))^2 ] = ∫_0^∞ (α ⊗ α)( e^{Λs} ⊗ e^{Λs} )(e ⊗ e) b e^{Γs} I_β ds
                        = (α ⊗ α ⊗ b) ( ∫_0^∞ e^{(Λ⊕Λ⊕Γ)s} ds ) (e ⊗ e ⊗ I_{1/ν})
                        = − (α ⊗ α ⊗ b)(Λ ⊕ Λ ⊕ Γ)^{−1}(e ⊗ e ⊗ I_{1/ν}).

Theorem 3.2.3. If t_j/ν is an integer, then
    E[ (S(γ_{t_j}))^2 ] = − (α ⊗ α ⊗ b)(Λ ⊕ Λ ⊕ Γ)^{−1}(e ⊗ e ⊗ I_{1/ν}).           (3.2.11)
When the mean and the second moment are calculated using Theorem 3.2.2 or
Theorem 3.2.3, we can then obtain the variance of S(γ_t) by
    Var(S(γ_t)) = E[ (S(γ_t))^2 ] − ( E[ S(γ_t) ] )^2.
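The first two moments can be assembled directly from the eigen-decomposition, as in the following sketch of Theorems 3.2.1 and 3.2.2; the 2-state generator and the values of ν and t are arbitrary illustrations.

```python
import numpy as np

def moments_of_survival_index(alpha, Lam, nu, t):
    """First two moments of S(gamma_t) via Theorems 3.2.1 and 3.2.2 (diagonalizable Lam)."""
    lam, H = np.linalg.eig(Lam)
    Hinv = np.linalg.inv(H)
    e = np.ones(len(alpha))

    tilde = lambda z: -np.log(1.0 - nu * z) / nu       # Laplace exponent of the Gamma time change

    # First moment: alpha exp(Lambda-tilde t) e, formula (3.2.5).
    m1 = (alpha @ H @ np.diag(np.exp(tilde(lam) * t)) @ Hinv @ e).real

    # Second moment via the Kronecker-sum representation (3.2.10).
    lam_sum = np.add.outer(lam, lam).ravel()           # entries lambda_i + lambda_j
    m2 = (np.kron(alpha, alpha) @ np.kron(H, H)
          @ np.diag(np.exp(tilde(lam_sum) * t))
          @ np.kron(Hinv, Hinv) @ np.kron(e, e)).real
    return m1, m2

alpha = np.array([1.0, 0.0])
Lam = np.array([[-0.40, 0.30], [0.00, -0.43]])         # hypothetical 2-state generator
m1, m2 = moments_of_survival_index(alpha, Lam, nu=0.5, t=10.0)
print("E[S] =", m1, "  Var[S] =", m2 - m1**2)
```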
3.2.4 Illustrations
In this section, we illustrate how to use the results in the previous section to compute
the ‘transition matrix’ Λ̃ under the time-changed model. We assume the underlying
Markov process is of Coxian type, i.e., the transition matrix Λ is of the form (3.1.7).
Illustration 1 Consider a 2-dimensional transition matrix
    Λ = \begin{pmatrix} −a_1 & b_1 \\ 0 & −a_2 \end{pmatrix},                     (3.2.12)
where a_1 ≠ a_2, a_1, a_2 > 0 and a_1 > b_1, so that Λ is a strict Coxian transition matrix.
For this Λ, the two eigenvalues are −a_1 and −a_2. One can easily find their
corresponding right eigenvectors h_1 and h_2, and put them in H as follows:
    H = (h_1, h_2) = \begin{pmatrix} 1 & b_1/(a_1 − a_2) \\ 0 & 1 \end{pmatrix}.   (3.2.13)
The corresponding left eigenvectors v_1 and v_2 are given by
    H^{−1} = \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \begin{pmatrix} 1 & −b_1/(a_1 − a_2) \\ 0 & 1 \end{pmatrix}.   (3.2.14)
After the time change with parameter ν, the matrix Λ̃ = HD(λ̃)H^{−1} for the time-
changed Markov process in Theorem 3.2.1 can be shown to be
    Λ̃ = HD(λ̃)H^{−1}
       = H \begin{pmatrix} −ln(1 + νa_1)/ν & 0 \\ 0 & −ln(1 + νa_2)/ν \end{pmatrix} H^{−1}
       = \begin{pmatrix} −ln(1 + νa_1)/ν & \dfrac{ln(1 + νa_2) − ln(1 + νa_1)}{a_2 − a_1} \dfrac{b_1}{ν} \\ 0 & −ln(1 + νa_2)/ν \end{pmatrix}.   (3.2.15)
The matrix Λ̃ in (3.2.15) can be shown to be of Coxian type (3.1.7) too, since
    ln(1 + νa_1)/ν > 0,   ln(1 + νa_2)/ν > 0,
and
    0 < \dfrac{ln(1 + νa_2) − ln(1 + νa_1)}{a_2 − a_1} \dfrac{b_1}{ν} < \dfrac{ln(1 + νa_1)}{ν}
for any a_1, a_2 > 0 with a_1 ≠ a_2 and any ν > 0. Therefore, Λ̃ will always be a generator
for any 2-dimensional time-changed Markov process.
Illustration 2 Consider a 3-dimensional transition matrix
    Λ = \begin{pmatrix} −a_1 & b_1 & 0 \\ 0 & −a_2 & b_2 \\ 0 & 0 & −a_3 \end{pmatrix},   (3.2.16)
where a_1, a_2, a_3 are distinct and positive, and a_i > b_i for i = 1, 2, so that Λ is a strict
Coxian transition matrix.
We can work on the transition matrix (3.2.16) in a similar way to Illustration 1:
    H = (h_1, h_2, h_3) = \begin{pmatrix} 1 & \dfrac{b_1}{a_1 − a_2} & \dfrac{b_1}{a_1 − a_3} \dfrac{b_2}{a_2 − a_3} \\ 0 & 1 & \dfrac{b_2}{a_2 − a_3} \\ 0 & 0 & 1 \end{pmatrix},   (3.2.17)
    H^{−1} = \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} = \begin{pmatrix} 1 & −\dfrac{b_1}{a_1 − a_2} & \dfrac{b_1}{a_1 − a_2} \dfrac{b_2}{a_1 − a_3} \\ 0 & 1 & −\dfrac{b_2}{a_2 − a_3} \\ 0 & 0 & 1 \end{pmatrix},   (3.2.18)
    Λ̃ = HD(λ̃)H^{−1}
       = H \begin{pmatrix} −ln(1 + νa_1)/ν & 0 & 0 \\ 0 & −ln(1 + νa_2)/ν & 0 \\ 0 & 0 & −ln(1 + νa_3)/ν \end{pmatrix} H^{−1}
       = \begin{pmatrix}
           −\dfrac{ln(1 + νa_1)}{ν} & \dfrac{ln(1 + νa_1) − ln(1 + νa_2)}{a_1 − a_2} \dfrac{b_1}{ν} & \dfrac{−ln(1 + νa_1)(a_2 − a_3) + ln(1 + νa_2)(a_1 − a_3) − ln(1 + νa_3)(a_1 − a_2)}{(a_1 − a_2)(a_1 − a_3)(a_2 − a_3)} \dfrac{b_1 b_2}{ν} \\
           0 & −\dfrac{ln(1 + νa_2)}{ν} & \dfrac{ln(1 + νa_2) − ln(1 + νa_3)}{a_2 − a_3} \dfrac{b_2}{ν} \\
           0 & 0 & −\dfrac{ln(1 + νa_3)}{ν}
         \end{pmatrix}.   (3.2.19)
Illustration 3 We now consider a concrete model with a 5-dimensional transition
matrix given by
    Λ = \begin{pmatrix}
        −0.4 & 0.3 & 0 & 0 & 0 \\
        0 & −0.43 & 0.32 & 0 & 0 \\
        0 & 0 & −0.5 & 0.36 & 0 \\
        0 & 0 & 0 & −0.55 & 0.45 \\
        0 & 0 & 0 & 0 & −0.6
    \end{pmatrix}.
The 5 distinct eigenvalues of Λ are λ_i = −0.4, −0.43, −0.5, −0.55 and −0.6, for
i = 1, · · · , 5. Consequently, the following matrices are obtained:
    H = \begin{pmatrix}
        1 & −10 & 13.714 & −38.4 & 91.482 \\
        0 & 1 & −4.5714 & 19.2 & −60.988 \\
        0 & 0 & 1 & −7.2 & 32.4 \\
        0 & 0 & 0 & 1 & −9 \\
        0 & 0 & 0 & 0 & 1
    \end{pmatrix}
and
    H^{−1} = \begin{pmatrix}
        1 & 10 & 32 & 76.8 & 172.8 \\
        0 & 1 & 4.5714 & 13.714 & 36.303 \\
        0 & 0 & 1 & 7.2 & 32.4 \\
        0 & 0 & 0 & 1 & 9 \\
        0 & 0 & 0 & 0 & 1
    \end{pmatrix}.
If a time-change process is applied to the Markov process with Λ as the underlying
transition matrix, we can obtain the time-changed ‘transition matrix’ Λ̃ for the time-
changed Markov process according to Theorem 3.2.1. The following are Λ̃_1, Λ̃_2, Λ̃_3
and Λ̃_4, which correspond to ν = 0.25, 0.5, 1 and 2, respectively.
    Λ̃_1 = \begin{pmatrix}
        −0.38124 & 0.2718 & 0.0097255 & 0.00051604 & 3.8078e−05 \\
        0 & −0.40842 & 0.28668 & 0.011413 & 0.00074867 \\
        0 & 0 & −0.47113 & 0.31824 & 0.015651 \\
        0 & 0 & 0 & −0.51533 & 0.39345 \\
        0 & 0 & 0 & 0 & −0.55905
    \end{pmatrix}
    Λ̃_2 = \begin{pmatrix}
        −0.36464 & 0.24845 & 0.016084 & 0.00153 & 0.00020064 \\
        0 & −0.38949 & 0.25965 & 0.018536 & 0.0021612 \\
        0 & 0 & −0.44629 & 0.28516 & 0.024918 \\
        0 & 0 & 0 & −0.48589 & 0.34953 \\
        0 & 0 & 0 & 0 & −0.52473
    \end{pmatrix}
    Λ̃_3 = \begin{pmatrix}
        −0.33647 & 0.21202 & 0.023056 & 0.0036335 & 0.00077941 \\
        0 & −0.35767 & 0.21847 & 0.02585 & 0.0049307 \\
        0 & 0 & −0.40547 & 0.23609 & 0.033732 \\
        0 & 0 & 0 & −0.43825 & 0.28574 \\
        0 & 0 & 0 & 0 & −0.47
    \end{pmatrix}
    Λ̃_4 = \begin{pmatrix}
        −0.29389 & 0.16395 & 0.02701 & 0.0063389 & 0.0019937 \\
        0 & −0.31029 & 0.16588 & 0.029242 & 0.0081792 \\
        0 & 0 & −0.34657 & 0.17564 & 0.036776 \\
        0 & 0 & 0 & −0.37097 & 0.20934 \\
        0 & 0 & 0 & 0 & −0.39423
    \end{pmatrix}
We see that all Λ̃_i, i = 1, · · · , 4, are valid transition matrices. The survival curves
corresponding to Λ and Λ̃_i, i = 1, · · · , 4, are given as follows:
    P(0, 0, t, x) = 292.6 e^{−0.4t} − 555.882353 e^{−0.43t} + 556.8 e^{−0.5t} − 384 e^{−0.55t} + 91.48235292 e^{−0.6t},
    Figure 3.1: Time-changed survival curves of P(0, 0, t, x) for ν = 0, 1, 2
    [plot of P(0, 0, t, x) against time t for Λ, Λ̃_3 and Λ̃_4; figure not reproduced in this transcript]
    P_i(0, 0, t, x) = 292.6 e^{λ̃_1 t} − 555.882353 e^{λ̃_2 t} + 556.8 e^{λ̃_3 t} − 384 e^{λ̃_4 t} + 91.48235292 e^{λ̃_5 t},   i = 1, · · · , 4,
where the λ̃_j are the diagonal entries of the corresponding Λ̃_i.
The survival curves corresponding to ν = 1 and 2 and the original one are displayed
in Figure 3.1. The variance curves for various values of ν are presented in Figure 3.2.
The survival curves with their corresponding one-σ confidence intervals for ν = 0.5
and ν = 1 are given in Figures 3.3 and 3.4, respectively. We also provide groups of
simulated survival curves in Figures 3.5 and 3.6, corresponding to different values of ν.
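The simulated curves of Figures 3.5 and 3.6 can be reproduced in spirit by sampling a Gamma time-change path and evaluating the underlying survival function along it, as in the sketch below (20 paths on an annual grid; the random seed and path count are arbitrary).

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(4)

# The 5-dimensional generator of Illustration 3.
Lam = np.array([[-0.40, 0.30, 0.00, 0.00, 0.00],
                [ 0.00, -0.43, 0.32, 0.00, 0.00],
                [ 0.00, 0.00, -0.50, 0.36, 0.00],
                [ 0.00, 0.00, 0.00, -0.55, 0.45],
                [ 0.00, 0.00, 0.00, 0.00, -0.60]])
alpha, e = np.eye(5)[0], np.ones(5)
S = lambda u: alpha @ expm(Lam * u) @ e                # deterministic survival function S(u)

def simulated_curve(nu, years=30):
    """One realized curve S(gamma_t): evaluate S along a sampled Gamma time-change path."""
    increments = rng.gamma(shape=1.0 / nu, scale=nu, size=years)   # Gamma(t; 1/nu, 1/nu) steps
    gamma_path = np.cumsum(increments)
    return np.array([S(u) for u in gamma_path])

curves_nu1 = np.array([simulated_curve(nu=1.0) for _ in range(20)])
print(np.round(curves_nu1[:, [4, 14, 29]], 3))         # a few curves at t = 5, 15, 30
```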
    Figure 3.2: Variance curves for the time-changed survival models P_i(0, 0, t, x) when ν = 0.5, 1, 2
    [plot of Var[S(t)] against time t; figure not reproduced in this transcript]
    Figure 3.3: Survival curve P_i(0, 0, t, x) with its one-σ confidence interval (ν = 0.5)
    [plot of s(t) with s(t) ± σ(t) against time t; figure not reproduced in this transcript]
    Figure 3.4: Survival curve P_i(0, 0, t, x) with its one-σ confidence interval (ν = 1)
    [plot of s(t) with s(t) ± σ(t) against time t; figure not reproduced in this transcript]
    Figure 3.5: Simulated survival curves with ν = 0.5; the dotted line is the original underlying survival function
    [plot of S(t, x) against time; figure not reproduced in this transcript]
    Figure 3.6: Simulated survival curves with ν = 1; the dotted line is the original underlying survival function
    [plot of S(t, x) against time; figure not reproduced in this transcript]
3.3 Pricing Longevity Bonds
As we have discussed in Chapter 1, annuity providers and pension plans are subject
to a great deal of longevity risk: the risk that policy holders or plan participants
might live longer on average than anticipated. This problem can be illustrated by the
fact that life expectancy for men aged 65 in the year 2000 in England and Wales is about
four and a half years higher than was anticipated in the mortality projections made in
the 1980s. For references, see CMI Reports 3, 10, 17.
What makes the situation even worse is that the amount of liabilities exposed
to longevity risk is often huge. For example, the amounts at risk for the state and
private sectors in the UK at the end of 2003 aggregate to £2520 billion, that is,
nearly £40,000 for every man, woman and child in the UK (see Pensions Commission,
2005, Fig 5.17, p181).
Exposure to longevity risk is therefore a very serious issue and needs to
be well managed. Apart from reinsurance and portfolio diversification, mortality-linked
securitization is a newly developed approach for hedging longevity risk exposures. In
this section, we focus on the longevity bonds (LB) issued by the
European Investment Bank (EIB) in November 2004, with BNP Paribas as the de-
signer and originator and Partner Re as the longevity risk reinsurer. We analyze
the basic features of this innovative mortality-linked product and discuss its pricing
methodology.
3.3.1 The EIB/BNP Longevity Bonds
The EIB/BNP LB is a financial contract in which annual coupon payments are pro-
portionally linked to the realization of the survivor index for a reference population
    Figure 3.7: Projected cash flow of the EIB/BNP Longevity Bond
    [plot of the projected coupons P(0, 0, t, x) × £50 million over 25 years; figure not reproduced in this transcript]
over the next 25 years. In this thesis, the definition of survivor index first appeared
in the expression (2.2.1) of section 2.2. As its name suggests, the survivor index,
S(t, x), is the proportion of some initial reference population aged x at time t = 0
who are still alive at some future time t. In particular, the reference population for
the EIB/BNP LBs is the English and Welsh males aged 65 in 2003.
Let us denote December 31, 2004 as time t = 0, December 31, 2005 as t = 1 etc.
Also let q(t) be the mortality rate between t and t+1 for the members of the reference
population. Then the relationship between the sequence q(t) and S(t, x) is given by:
S(t, x) = (1 − q(0))(1 − q(1)) · · · (1 − q(t − 1)). (3.3.1)
Since the practical issuance of the LBs is only with regard to the cohort of (65) (that
is, x = 65) in England and Wales, we will simply denote the survivor index as S(t).
The contract's cash flow is well defined. The notional amount of the bond is set
at £50 million. According to the terms of the contract, the coupon payments
are £50 · S(t) million at times t = 1, 2, · · · , 25, payable at the end of each
year for 25 years. Figure 3.7 illustrates the cash flow, using the estimates
of S(t) produced recently by the UK's Government Actuary's Department (GAD) as
the realizations of the survival index.
UK pension funds and life offices were the intended main investors of the LBs.
These bonds are designed to offer investors a perfect hedge against the longevity risk
exposures from their commitments to provide annuity payments. The main question
is how to determine the purchase price for such bonds. In the next two sections we
will discuss how these bonds are priced by their issuer and propose a pricing method
based on the Markov mortality model in the previous sections.
3.3.2 How was the EIB/BNP LB priced?
In this section, we will examine how BNP determined the price for the EIB/BNP LB.
In the offer document issued by BNP Paribas in November 2004, BNP specified some
important components that are relevant to pricing:
• the projected survival rates used in the pricing of the bond are given by the
latest GAD’s projection, referred to as S(0, T ) in the following.
• the projected cash flow (Figure (3.7)) will be discounted at LIBOR minus 35
basis points to obtain the issue price.
It is known that conventional fixed-interest EIB bonds are usually issued at
LIBOR minus 15 basis points in the primary market. Hence the 20-basis-point spread is the premium
for protection from mortality risk and can be interpreted as follows.
Let P(0) be the purchase price of the EIB/BNP LB with a notional amount of
one monetary unit. Also let D_L(0, T) denote the discount factor that corresponds
to the LIBOR curve and D_E(0, T) the discount factor that corresponds to the EIB
curve, for T = 1, 2, ..., 25. Then the two curves can be related approximately by
D_E(0, T) = D_L(0, T) e^{0.0015T}. Further, suppose that the GAD's projections are unbiased
estimates of the actual survivor index for the reference population, that is,
S(0, T) = E^P[S(T)|M_0], where P represents the physical measure. Thus, the BNP
formula for pricing the LBs can be written as
formula for pricing the LBs can be written as
P (0) =25∑
T=1
DL(0, T )e0.0035T S(0, T )
=25∑
T=1
DE(0, T )e0.0020T EP [S(T )|M0]. (3.3.2)
For a given stochastic mortality model and assuming the independence between
the dynamics of interest rates and the dynamics of mortality rates, the risk-neutral
pricing approach that is discussed in section 2.2 and Formula (2.2.19) implies that
P(0) = ∑_{T=1}^{25} D_E(0, T) E^Q[S(T)|M_0].    (3.3.3)
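As a minimal numerical sketch of formula (3.3.3), the snippet below simply sums the discounted risk-neutral survival expectations. Both the flat 4% discount curve (the same convention used later in Section 3.3.4) and the survival curve are hypothetical placeholders, not market data.

```python
import numpy as np

T = np.arange(1, 26)                   # coupon dates T = 1, ..., 25
D_E = 1.04 ** (-T)                     # hypothetical EIB discount factors (flat 4% curve)
EQ_S = np.cumprod(np.full(25, 0.96))   # hypothetical risk-neutral curve E^Q[S(T)|M_0]

P0 = np.sum(D_E * EQ_S)                # formula (3.3.3), price per unit notional
print(f"P(0) per unit notional = {P0:.4f}")
print(f"price for a 50 million notional = {50 * P0:.2f} million")
```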
A comparison of equations (3.3.2) and (3.3.3) shows that 20 basis points can be
interpreted as an average risk premium per annum. Since the EIB/BNP longevity
bond links its coupon payments proportionally to the survivor index, it is expected
that the risk premium attached to each respective annual payment at time t should be
closely related to the level of uncertainty associated with S(t). Consequently, some
may argue that the risk premium for the 15th annual payment should be much larger
than that for the 5th annual payment. This is because survival probabilities are measures
that compound year-by-year mortality shocks from all the years before t. In other
words, the survival probability for year t depends on shocks applied to mortality
rates in each of the years from 1 to t, and each individual shock affects survival
probabilities in all subsequent years (see Cairns, Blake, and Dowd, 2006b; Cairns,
Blake, Dawson, and Dowd, 2005). Also, it is quite possible that some negative
shocks will be corrected in the following years, while most positive shocks tend
to persist thereafter. As a result, the volatility embedded in the uncertainty of
S(t) is usually very low in the first few years, and then it picks up quickly in
a non-linear and unsystematic manner. If this is the case, then a constant spread
would underprice the long-dated annual payments but overprice the short-dated annual
payments. Hence, it is fair to say that the 20-basis-point spread is just a compromise
market price for the longevity risk over the entire 25 years. For a more precise and
meaningful approach, we have to exploit the market term structure of mortality rates
E^Q[S(T)|M_0] (= P(0, 0, T, x)) as in formula (3.3.3).
3.3.3 Proposed method for pricing the EIB/BNP Longevity
Bonds
We consider that information regarding the term structure of the risk premium over
S(t) should be obtained from market prices, given the specified dynamics of the
underlying. In this section, based on the time-changed mortality model introduced
in Section 3.2, we therefore propose a method that can, on the one hand, utilize the
projected mortality schedule to reflect the general view regarding the future mortality
trend and, on the other hand, capture the market information (e.g. market risk
premiums) regarding the uncertainties surrounding mortality trend changes and
future survival probabilities.
To be specific, we assume that a projected mortality table for the reference pop-
ulation of the LBs is available, and has been fitted by a phase-type distribution with
representation (α,Λ). We then propose that, under the risk neutral probability mea-
sure Q, the aging process is given by
Z_t = J^Q_{γ_t}    (3.3.4)

where J^Q_t is a phase-type Markov process with representation (α, Λ^Q),

Λ^Q = uΛ,  u > 0,    (3.3.5)
and γt is a Gamma process Γ(t; 1/ν, 1/ν). This approach is similar to that by Jarrow,
Lando, and Turnbull (1997) in credit risk modelling.
Hence, based on the projected mortality schedule, the time-changed Markov pro-
cess (3.3.4) generates a stochastic survival model S(γ_t). The parameters (u, ν) then
characterize the market price of longevity risk associated with the specific projection
(α, Λ).
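The following sketch simulates the stochastic survival model S(γ_t) implied by (3.3.4)-(3.3.5). The phase-type representation (α, Λ) and the parameters (u, ν) used here are illustrative placeholders, and we read Γ(t; 1/ν, 1/ν) as a gamma subordinator with mean t and variance νt; that reading is our assumption, since the exact parameterization is fixed in Section 3.2.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)

# Illustrative phase-type representation (alpha, Lam); placeholders, not fitted values
alpha = np.array([1.0, 0.0, 0.0])
Lam = np.array([[-0.30,  0.29,  0.00],
                [ 0.00, -0.31,  0.30],
                [ 0.00,  0.00, -0.32]])
e = np.ones(3)

u, nu = 0.95, 1.25        # illustrative risk adjustment u and time-change parameter nu
LamQ = u * Lam            # risk-neutral generator, as in (3.3.5)

def survivor_index_samples(t, n_paths=5000):
    """Draw S(gamma_t) = alpha exp(LamQ * gamma_t) e, treating gamma_t as a gamma
    subordinator with mean t and variance nu*t (our reading of Gamma(t; 1/nu, 1/nu))."""
    g = rng.gamma(shape=t / nu, scale=nu, size=n_paths)
    return np.array([alpha @ expm(LamQ * s) @ e for s in g])

for t in (5, 15, 25):
    S = survivor_index_samples(t)
    print(f"t = {t:2d}   E[S(gamma_t)] = {S.mean():.4f}   Var[S(gamma_t)] = {S.var():.6f}")
```

The sample variance computed this way corresponds to the Var[S(t)] curve discussed later in connection with Figure 3.9.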
It is important to stress that the choice of an equivalent martingale measure is
not unique. In general, suppose a Markov process is specified in terms of its k × k transition
rate matrix

   [ q_1         q_12        q_13       ...   q_1k
     q_21        q_2         q_23       ...   q_2k
     ...         ...         ...        ...   ...
     q_{k-1,1}   q_{k-1,2}   q_{k-1,3}  ...   q_{k-1,k}
     0           0           0          ...   0        ]    (3.3.6)

Then, under the equivalent martingale measure, the transition rates can be defined as

   q_{ij}(t) = u_{ij}(t) q_{ij},    (3.3.7)
where u_{ij}(t) are strictly positive deterministic functions of t that satisfy

   ∫_0^T u_{ij}(t) dt < +∞  for any i, j.    (3.3.8)
There will be infinitely many ways to assign the u_{ij} (see Jarrow, Lando, and Turnbull, 1997).
Therefore, if necessary, the relationship in equation (3.3.5) could be relaxed to be
both state and time dependent as in equation (3.3.7). The entries u_{ij}(t) would then
represent the risk adjustment from state i to state j over the period (0, t]. The relaxed model
would be more powerful in calibration, but with the trade-off of being more complicated.
The proposed model (3.3.4) can be calibrated to market price information. As
we discussed in Section 2.2, we can use the pure endowment market prices
E(0, t, x) at time 0 for (x) for all maturities t as the primary market. Theorem 3.2.1
then allows us to calibrate to the market term structure of spot survival probabilities
P(0, 0, t, x). To be more specific, we can obtain the parameters (u, ν) in the time-changed
Markov model (3.3.4) by solving the following equality:
E(0, t, x) / D(0, t) = P̃(0, 0, t, x) ≃ P(0, 0, t, x) = α e^{Λ^Q t} e,  for all t > 0,    (3.3.9)

where P̃(0, 0, t, x) denotes the market spot survival probabilities and P(0, 0, t, x) the
model values of the survival probabilities for any t > 0. The calibrated P(0, 0, t, x) can then be
inserted back into formula (3.3.3) to derive the LB's price.
This proposed idea of calibrating to market prices is similar to those in Lin and
Cox (2006) and Cairns, Blake, and Dowd (2006b), though they adopt different models
and data sources.
3.3.4 Implementation
In this section, we will demonstrate how to use the proposed model (3.3.5) and for-
mula (3.3.3) to calculate the longevity bond price. In order to do so, we need mortality
projection data and the market term structure of spot survival probabilities P(0, 0, t, x)
as input. For simplicity, and also for comparison purposes, we consider the data used
and provided by Cairns, Blake, and Dowd (2006b), rather than looking for the mar-
ket bond price and pure endowment price term structure at the time the EIB/BNP LB was
launched.
In Cairns, Blake, and Dowd (2006b), Cairns et al propose a two-factor time series
model for the development of mortality pattern through time. Under the assumption
that their model gives unbiased estimates at time 0 to the survival rates EP [S(t)|M0],
they further introduce a method to obtain the risk-neutral probability measure Q and
the corresponding risk-neutral survival rates EQ[S(t)|M0]. The data used in their
paper are provided in the first two columns in Table 3.1.
We illustrate our method by assuming that column one is the real projection at
time 0 of the survival rate E^P[S(t)|M_0] and that column two is the market term
structure of mortality, which could be stripped from a given market price system of
pure endowments. The steps to calibrate model (3.3.5) are given as follows:
1. Fit a Coxian phase-type distribution to the projected mortality rates in column
   one. The fitting algorithm is the EMpht program provided by Asmussen, Nerman, and
   Olsson (1996). As a result, we obtain a fitted phase-type distribution with
   representation (α, Λ):

   α = ( 0.836874  0  0  0.1066376  0.05648823 )    (3.3.10)

   Λ = [ -0.2381937   0.2378846   0           0           0
          0          -0.2384834   0.237982    0           0
          0           0          -0.2390828   0.2380828   0
          0           0           0          -0.2423468   0.2413468
          0           0           0           0          -0.2433471 ]    (3.3.11)

Table 3.1: Survival rates P(0, 0, t, x) under physical and risk-neutral measures

  t    E^P[S(t)|M_0]   E^Q[S(t)|M_0]   E^Q[S(t)|M_0]       E^Q[S(t)|M_0]
       (column 1)      (column 2)      based on (3.3.12)   based on (3.3.13)
  1      0.9836          0.9837          0.9853              0.9850
  2      0.9661          0.9662          0.9696              0.9696
  3      0.9475          0.9477          0.9533              0.9523
  4      0.9278          0.9281          0.9361              0.9345
  5      0.9068          0.9074          0.9179              0.9154
  6      0.8845          0.8856          0.8982              0.8947
  7      0.8610          0.8626          0.8766              0.8722
  8      0.8360          0.8384          0.8529              0.8476
  9      0.8095          0.8129          0.8269              0.8209
 10      0.7816          0.7862          0.7682              0.7921
 11      0.7522          0.7583          0.7358              0.7614
 12      0.7213          0.7292          0.7358              0.7291
 13      0.6888          0.6989          0.7017              0.6952
 14      0.6548          0.6675          0.6663              0.6603
 15      0.6195          0.6350          0.6299              0.6246
 16      0.5828          0.6015          0.5930              0.5885
 17      0.5448          0.5672          0.5558              0.5524
 18      0.5059          0.5321          0.5189              0.5165
 19      0.4661          0.4965          0.4825              0.4812
 20      0.4258          0.4606          0.4469              0.4467
 21      0.3853          0.4245          0.4125              0.4133
 22      0.3450          0.3885          0.3793              0.3810
 23      0.3054          0.3530          0.3476              0.3502
 24      0.2667          0.3180          0.3174              0.3209
 25      0.2297          0.2841          0.2890              0.2932
2. Solve for the optimal parameters (u, ν) such that the model values E^Q[S(t)|M_0]
   match the market term structure of mortality in column two. That is, solve

   min_{(u, ν)}  ∑_t ( Ê^Q[S(t)|M_0] − E^Q[S(t)|M_0] )^2    (3.3.12)

   where Ê^Q[S(t)|M_0] denotes the model value and E^Q[S(t)|M_0] the market value
   (a calibration sketch in this spirit is given after this list). The results for (3.3.12) are

   u = 0.9483659,   ν = 1.2481193,

   and the corresponding survival rates are provided in column 3 of Table 3.1.
3. In Cairns, Blake, and Dowd (2006b), the longevity bond price is derived to
   be P = 11.442 using formula (3.3.3), with the zero-coupon prices being set as
   D(0, t) = 1.04^{−t}. It is therefore of interest to also look for the parameters (u, ν)
   which give the closest price for the LBs to the one obtained by Cairns et al.
   That is, solve for the parameters (u, ν) that minimize

   | ∑_{T=1}^{25} D(0, T) E^Q[S(T)|M_0] − 11.442 |    (3.3.13)

   The results for (3.3.13) are

   u = 0.95544785,   ν = 1.0513425,
Figure 3.8: Phase-type survival curves under the physical and risk-neutral measures (curves shown: GAD projection, fitted phase-type curve, Cairns risk-neutral curve, calibrated risk-neutral curve)
and the corresponding survival rates are provided in column 4 of Table 3.1.
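As referenced in step 2, the sketch below illustrates a least-squares calibration in the spirit of (3.3.12), using the fitted (α, Λ) from (3.3.10)-(3.3.11) and the market curve in column two of Table 3.1. The model survival curve is evaluated under our reading of the gamma time change (mean t, variance νt), so the resulting (u, ν) need not reproduce the values reported above exactly.

```python
import numpy as np
from scipy.optimize import minimize

# Fitted phase-type representation from (3.3.10)-(3.3.11)
alpha = np.array([0.836874, 0.0, 0.0, 0.1066376, 0.05648823])
Lam = np.array([
    [-0.2381937,  0.2378846,  0.0,        0.0,        0.0      ],
    [ 0.0,       -0.2384834,  0.237982,   0.0,        0.0      ],
    [ 0.0,        0.0,       -0.2390828,  0.2380828,  0.0      ],
    [ 0.0,        0.0,        0.0,       -0.2423468,  0.2413468],
    [ 0.0,        0.0,        0.0,        0.0,       -0.2433471]])
e = np.ones(5)
t = np.arange(1, 26)

# Market term structure E^Q[S(t)|M0], t = 1, ..., 25 (column two of Table 3.1)
market = np.array([0.9837, 0.9662, 0.9477, 0.9281, 0.9074, 0.8856, 0.8626, 0.8384,
                   0.8129, 0.7862, 0.7583, 0.7292, 0.6989, 0.6675, 0.6350, 0.6015,
                   0.5672, 0.5321, 0.4965, 0.4606, 0.4245, 0.3885, 0.3530, 0.3180,
                   0.2841])

def model_survival(u, nu):
    """E[alpha exp(u*Lam*gamma_t) e], reading gamma_t as a gamma subordinator with
    mean t and variance nu*t, so E[exp(lam*gamma_t)] = (1 - nu*lam)^(-t/nu);
    evaluated through the eigendecomposition of u*Lam."""
    lam, H = np.linalg.eig(u * Lam)
    w = (alpha @ H) * np.linalg.solve(H, e)          # weights (alpha h_i)(v_i e)
    vals = [(np.power(1.0 - nu * lam, -s / nu) * w).sum() for s in t]
    return np.real(np.array(vals))

def objective(par):                                  # least-squares criterion as in (3.3.12)
    u, nu = par
    if u <= 0 or nu <= 0:
        return 1e6
    return np.sum((model_survival(u, nu) - market) ** 2)

res = minimize(objective, x0=[1.0, 1.0], method="Nelder-Mead")
print("calibrated (u, nu) =", res.x)
```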
In Figure 3.8, the fitted and calibrated (via (3.3.12)) survival curves are presented
together with the curves of E^P[S(t)|M_0] and E^Q[S(t)|M_0]. As one may see,
the match is not perfect. We suggest that the result can be improved
by using a higher-dimensional phase-type distribution and/or by considering a generalized
risk-neutral model with state- and time-dependent risk-adjusting parameters as
in (3.3.7).
The variances associated with the time-changed mortality model are plotted in
Figures 3.9 and 3.10. It is of great interest to remark that the resulting
variance curve from our time-changed mortality model is of quite a different form from
what is obtained in Cairns, Blake, and Dowd (2006b). In Figure 3.9, we observe that
the variance peaks at time t = 19, which corresponds to age 84. Afterwards, the variance drops
slowly, eventually approaching zero at ages above 120.

Figure 3.9: Variances of the time-changed survival probabilities (Var[S(t)] plotted against time t)

Figure 3.10: Time-changed survival curve with one-σ confidence intervals (P(0, 0, t, x) ± σ(t) plotted against time t)
The interpretation of this phenomenon is that our model predicts uncertainty
about the long-term survival probabilities only up to a certain age. There exists a
threshold age ω under the time-changed mortality model; beyond this age ω (say,
age 120), the survival probabilities carry little uncertainty (at least under the
market pricing measure). This makes sense to us because, although we are unsure
about the path of mortality evolution, the majority still believe that death presumably
remains as inevitable as it always was (Benjamin Franklin). The extreme threshold
age may be pushed further back, but there always exists one for the current cohort
under consideration.
In contrast, Cairns et al’s two factor model predicts the variance increasing expo-
nentially to infinity as future time extends, as can be seen from Figure 4 in Cairns,
Blake, and Dowd (2006b).
Therefore, our model differs from the two-factor model at extreme ages: under our
model, the potential to improve the survival probabilities at extreme ages is limited,
while under Cairns et al's model, this potential is unlimited. We think our model offers
an alternative vision regarding future mortality development. The reasonableness of
this property is subject to further investigation from both theoretical and empirical
aspects.
3.3.5 Discussion
In November 2004, the EIB/BNP longevity bond was announced with a total issue value
of £540 million. However, this security was not well received by investors.
Due to a lack of sufficient demand for the bond to be launched,
it was withdrawn for redesign in late 2005. It is therefore of interest to examine
some implementation details, since we strongly believe that demand for further such
instruments and innovations can still be expected.
• The survival index that is central to determining the coupon payments will be
provided by the Office for National Statistics (ONS, a UK government agency).
These death rates are considered a reliable and easily obtainable public
source. This arrangement should help allay investors' fears that the
index could be manipulated by insurance companies.

However, this also presents a problem that is often referred to as basis risk:
the risk that the reference population differs from the particular population of a
pension plan or an insurance policy.
The reason is simple: if a company wishes to use such contracts to hedge its
mortality risk, mortality improvement in the reference population has to match
that of the lives it has insured; otherwise, the company will be exposed to significant basis risk
and the mortality derivative might not provide an adequate hedge.
Usually, the larger the pension fund, the smaller the basis risk.
• Also, in the offer document, BNP Paribas suggests calculating the index using
crude death rates rather than mortality rates. Let m(x, y) represent the crude
central death rate for (x) published by the ONS in year y. The survival
index S(t) is proposed to be calculated as follows:

   S(0) = 1
   S(1) = S(0) × (1 − m(65, 2003))
   S(t) = S(0) × (1 − m(65, 2003)) × ··· × (1 − m(64 + t, 2002 + t))    (3.3.14)
whereas, according to the terms of the contract, m(x, y) in the above expression should
really be the mortality rate q(x, y).
Using crude death rates to approximate the survival index may help avoid sub-
jectivity in smoothing methodologies; however, it will result in underesti-
mation of the true survival index. It has been suggested that the crude death
rates be converted to mortality rates using the usual approximation
q(x, t) = m(x, t)/[1 + (1/2) m(x, t)] to reduce this kind of bias (see Blake, Cairns,
and Dowd, 2006).
• However, we feel that the failure of the EIB/BNP LB is probably due not to
implementation reasons (discussed above or elsewhere by other authors,
for example, Blake, Cairns, and Dowd (2006) and references therein), but rather
to product design flaws.

The bond is designed to provide an ideal hedge to those who bear the same type
of longevity risk over a period of 25 years. However, it is well known that,
on one hand, the early coupon payments (say, those in the first 10 years) have very
low longevity risk attached to them; on the other hand, these cash flows are
also the most expensive part of the bond. Consequently, for users who wish to
use these bonds as hedging instruments, such bonds use up a large amount
of capital to cover a long period of low-risk payments. As a result, hedging
turns into a full take-over of the liability, which seems not to be what hedgers want
in most situations.
In addition, since the bond covers only a single cohort (males aged 65), it leaves
behind many liabilities that are linked to different age cohorts, different
terms of maturity, and the other gender (females). This means that the bond
might not be as effective as it is supposed to be, and naturally it is not so attractive
to investors.
To sum up, we conclude that, although the first innovative longevity bond certainly
has many merits in satisfying the needs of the financial and insurance markets, it also seems
clear that more highly geared contracts for longevity risk should be developed to
attract investors.
It is also worth mentioning that mortality-linked securities are attractive to more
general investors (other than mortality risk bearers) as well. This is because mortality
risk is usually considered to be uncorrelated with other types of assets. Therefore,
a long position in mortality-linked assets may improve risk diversification in a general
investment portfolio (see, for example, Cox, Fairchild and Pedersen, 2000). For those
investors, liquidity and transparency of the products are key properties when consid-
ering portfolio construction, yet currently these are the main concerns investors have
about the mortality market. Putting all these facts together, it is not surprising at all that
the market needs more time to accept mortality derivative products.
3.4 Miscellaneous
3.4.1 GAOs and Longevity Risk Crisis
In the previous section, we studied one specific mortality-linked security: the
EIB/BNP longevity bond. As we have shown, the longevity bond is a security de-
signed to provide life offices and pension plans with an instrument to hedge the
very-long-term longevity risk that they face. In fact, it is one of the two recently
launched financial securities designed specifically to help manage mortal-
ity risk; one is the LB, and the other is the Swiss Re short-term catastrophe bond. Many
other instruments, though, have been suggested in the literature or have been traded
over-the-counter for the same purpose.
Concern about longevity risk, or more generally about mortality risk, has been
growing, probably since the world's oldest life office, the Equitable Life Assurance
Society (ELAS), was forced to close to new business in December 2000. Between 1957
and 1988, ELAS had sold a type of pension annuity with so-called "guaranteed
annuity options (GAOs)" as an embedded feature of the contracts. A guaranteed
annuity option (GAO) is a right of the policyholder to convert his accumu-
lated fund at retirement at a guaranteed rate rather than at the market annuity rate.
At the time of issuance, those GAOs were considered essentially worthless, but they have
turned out to be very valuable due to a combination of reductions in market interest rates and
unanticipated falls in mortality rates at the oldest ages. The liability emerging from
the guarantees thus seriously raised solvency concerns for ELAS, requiring the setting
up of extra reserves, and finally resulted in financial difficulties for ELAS.
Some might blame the crisis on poor risk management by the company, and deem
that it could have been avoided if ELAS had hedged its exposure to both interest-
rate risk and longevity risk. However, as Blake, Cairns, and Dowd (2006) have pointed
out, even if ELAS had anticipated the problem, it still lacked good instruments
at that time to hedge its exposure to both risks, particularly longevity risk.
Therefore, this is in fact not an isolated problem of ELAS. In the UK during
the late 1970s and 1980s, a guaranteed annuity rate between cash and pension was a
common feature of individual pension policies, and such policies were sold by more than 40
companies in the market. Although these pension policies are no longer being sold
in the UK, similar guarantees exist in corresponding policies in
other countries. For example, in the United States' variable annuity market, there
are guaranteed annuity rate (GAR) contracts and guaranteed minimum accumula-
tion benefit (GMAB) contracts. A GAR contract is identical to a GAO. A GMAB
contract includes the additional feature that the cash benefit available at retirement
is guaranteed to be at least a pre-specified amount. Thus, the problem that dragged
ELAS down is still haunting the market. Fortunately, this incident has stimulated
a great deal of research into the issues related to mortality risk and has
opened the door to the development of a mortality derivative market.
The next subsection is devoted to a preliminary study on pricing GAOs. It
is then followed by a general introduction to the development of the mortality-linked
security market, with which we end this chapter. As one can see, many research projects can be
developed from here.
3.4.2 Pricing Guaranteed Annuity Options – Preliminary Study
A guaranteed annuity option (GAO) is a contract which provides the policyholder
the right to convert his/her cash benefit to an annuity at a guaranteed conversion
rate of g (g = 9, normally) at maturity time T. The cash amount which can be used
for conversion is equity-linked, denoted as A(T ); that is, it equals the current market
value of the reference portfolio.
Let ax(T ) represent the actuarial present value at time T of a whole life annuity
which pays $1 per year throughout the remaining lifetime of the policyholder of age
x. Therefore the value of the GAO at maturity T depends on the equity-linked value
of A(T), the prevailing market price of a_x(T), and the conversion rate g, as follows:

   ( A(T)/g · a_x(T) − A(T) )^+  =  (A(T)/g) ( a_x(T) − g )^+ .
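To make the payoff concrete, here is a tiny sketch with made-up inputs: a fund of A(T) = 10,000, the guaranteed rate g = 9, and a market annuity price a_x(T) = 11 give a GAO payoff of (10000/9)(11 − 9) ≈ 2222.

```python
def gao_payoff(A_T, a_x, g=9.0):
    """GAO value at maturity: (A(T)/g) * max(a_x(T) - g, 0); all inputs illustrative."""
    return A_T / g * max(a_x - g, 0.0)

print(gao_payoff(10_000.0, 11.0))   # about 2222.22; worthless whenever a_x(T) <= g
```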
The problem of pricing a GAO is similar to that of pricing a call option written on a
coupon-bearing bond. Problems of this kind in finance are often approached by first pricing a call option
on a zero-coupon bond and then combining such options into a portfolio. Therefore, in the following,
we will briefly study the problem of pricing call options on zero longevity bonds under a
constant interest rate. The rest of the work will be part of our future ongoing projects. At
least two directions of future work follow from the results in this section. One
is to price GAOs from the prices of zeros; another is to price under the assumption of a
stochastic interest rate model.
Now let us first introduce zero longevity bonds. A zero LB (or just a zero), denoted
by L(0, t), is a single-payment contract whose payoff at time t is
proportional to the survivor index S(t, x). A zero can be viewed as a deferred, one-
period LB. A call option, C(t, K), written on the zero L(t, t + T), offers the holder
the right to buy, at strike price K and at time t, a zero which matures at date t + T with
the payment linked to S(t + T, x + t).
From now on, we will work in the risk-neutral world. Assume that the survivor index
S(t, x) of interest refers to a cohort of (x) at time 0. For this cohort, the
aging process follows a time-changed Markov process Z_t = J_{γ_t}, where the underlying
Markov aging process J_t has phase representation (α, Λ) and γ_t is a Gamma process
Γ(t; 1/ν, 1/ν). Furthermore, let S(t, t + T; x) denote the random survival probability
for the same cohort, conditional on being alive at time t, that one will survive another
T years from time t. Sometimes we simply write S(t, t + T) for short. Hence

   S(t, t + T) = S(t + T, x) / S(t, x) = (α e^{Λ γ_{t+T}} e) / (α e^{Λ γ_t} e).

Note that γ_t has independent increments (so the increment γ_{t+T} − γ_t has the same distribution as γ_T), and we have

   S(t, t + T) = (α e^{Λ γ_t} e^{Λ γ_T} e) / (α e^{Λ γ_t} e) = π(x + t) e^{Λ γ_T} e,

where π(x + t) is defined as

   π(x + t) = α e^{Λ γ_t} / (α e^{Λ γ_t} e).
We can interpret π(x + t) as the initial distribution for the surviving cohort at time
t.
Let P(t, t, t + T, x) denote the conditional expectation of S(t, t + T) given the
filtration M_t, i.e.

   P(t, t, t + T, x) = E[S(t, t + T) | M_t].

Applying Theorem 3.2.1, it follows that

   P(t, t, t + T, x) = E[S(t, t + T) | M_t]
                     = E[π(x + t) e^{Λ γ_T} e | M_t]
                     = π(x + t) E[e^{Λ γ_T}] e
                     = π(x + t) e^{Λ T} e.    (3.4.1)

From formula (3.4.1), we know that P(t, t, t + T, x) has phase representation (π(x + t), Λ).
Now consider pricing the call option C(t, K). For simplicity, we assume a constant
risk-free interest rate r. Then the value of this call option at time t can be expressed as

   ( P(t, t, t + T, x) e^{−rT} − K )^+,

and the call option price at time 0 can be calculated as

   E[ e^{−rt} S(t, x) ( P(t, t, t + T, x) e^{−rT} − K )^+ | F_0 ]    (3.4.2)

according to the fundamental theorem of asset pricing.
Without loss of generality, the problem in (3.4.2) is equivalent to evaluating

   E[ S(t, x) ( P(t, t, t + T, x) − K )^+ | M_0 ]    (3.4.3)
   = E[ S(t, x) ( P(t, t, t + T, x) − K )^+ ]    (3.4.4)
   = E[ ( α e^{Λ γ_t} e^{Λ T} e − K α e^{Λ γ_t} e )^+ ].    (3.4.5)
Now write

   e^{Λ T} = H ( e^{λ_i T} )_diag H^{−1},
   e^{Λ γ_t} = H ( e^{λ_i γ_t} )_diag H^{−1}.

Then

   e^{Λ γ_t} e^{Λ T} = H ( e^{λ_i γ_t} )_diag ( e^{λ_i T} )_diag H^{−1}
                     = H ( e^{λ_i γ_t} e^{λ_i T} )_diag H^{−1}.

In a similar way, write K = e^k and

   K = H ( e^k )_diag H^{−1},
   K · e^{Λ γ_t} = H ( e^{λ_i γ_t} )_diag ( e^k )_diag H^{−1}
                 = H ( e^{λ_i γ_t} e^k )_diag H^{−1}.

Hence

   α e^{Λ γ_t} e^{Λ T} e − K α e^{Λ γ_t} e
     = α H ( e^{λ_i γ_t} e^{λ_i T} − e^{λ_i γ_t} e^k )_diag H^{−1} e
     = ∑_{i=1}^n ( e^{λ_i γ_t} e^{λ_i T} − e^{λ_i γ_t} e^k ) α (h_i ⊗ ν_i) e
     = ∑_{i=1}^n e^{λ_i γ_t} ( e^{λ_i T} − e^k ) α (h_i ⊗ ν_i) e.

Define γ_i = ( e^{λ_i T} − e^k ) α (h_i ⊗ ν_i) e. Then we obtain

   E[ S(t, x) ( P(t, t, t + T, x) − K )^+ | M_0 ]
     = E[ ( ∑_{i=1}^n e^{λ_i γ_t} γ_i )^+ ]
     = ∫_0^∞ ( ∑_{i=1}^n e^{λ_i s} γ_i )^+ f_{γ_t}(s) ds,    (3.4.6)
which can be easily calculated.
Therefore, formula (3.4.6) provides an explicit form for the price of a call option
written on zero longevity bonds.
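The sketch below evaluates formula (3.4.6) numerically. The phase-type representation, strike and maturities are hypothetical placeholders, and the density f_{γ_t} is taken to be that of a gamma distribution with mean t and variance νt (our reading of Γ(t; 1/ν, 1/ν)); the discount factors dropped in going from (3.4.2) to (3.4.3) are not reinstated here.

```python
import numpy as np
from scipy.stats import gamma as gamma_dist
from scipy.integrate import quad

# Hypothetical phase-type representation (alpha, Lam) and contract data; placeholders only
alpha = np.array([1.0, 0.0, 0.0])
Lam = np.array([[-0.30,  0.29,  0.00],
                [ 0.00, -0.31,  0.30],
                [ 0.00,  0.00, -0.32]])
e = np.ones(3)
t, T, K, nu = 10.0, 5.0, 0.5, 1.2     # option maturity t, deferment T, strike K, parameter nu

# Spectral decomposition Lam = H diag(lam_i) H^{-1}
lam, H = np.linalg.eig(Lam)
Hinv = np.linalg.inv(H)

# Coefficients gamma_i = (exp(lam_i T) - K) * (alpha h_i)(v_i e), as in the derivation above
coef = (np.exp(lam * T) - K) * (alpha @ H) * (Hinv @ e)

# Formula (3.4.6): integrate (sum_i exp(lam_i s) gamma_i)^+ against the density of gamma_t
def integrand(s):
    return max(float(np.real(np.sum(np.exp(lam * s) * coef))), 0.0) \
           * gamma_dist.pdf(s, a=t / nu, scale=nu)

value, _ = quad(integrand, 0.0, np.inf)
print(f"E[S(t, x)(P(t, t, t+T, x) - K)^+] = {value:.6f}")
```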
3.4.3 A Snapshot of Mortality Derivative Market
Since 2000, there have been many discussions proposing various forms of mortality-
linked contracts for the market. We therefore feel it is necessary to briefly intro-
duce the developments in this area. Generally speaking, these contracts can be
classified into the following four types: (i) mortality bonds, (ii) mortality swaps, (iii)
mortality futures, and (iv) mortality options. However, we have no intention of giving
full descriptions of each of them. Instead, we just want to highlight the basic features
of these contracts and point out how they might be used to help manage mortality
risk. The following content is mainly based on the material given in Blake, Cairns,
and Dowd (2006).
1. Mortality bonds
These securities have the usual features we would expect of bonds. The only
difference is that the payments (coupons or principal) are now related to mortality
rates in one way or another. Accordingly, mortality bonds fall under two broad
categories. The first are "principal-at-risk" bonds, in which investors risk losing all
or part of the principal if the relevant mortality event occurs. The second are "coupon-
based" mortality bonds, in which the coupon payment is mortality dependent. The
EIB/BNP LB is an example of a coupon-based bond. One can also imagine various
types of hybrid bonds in which both principal and coupon are at risk if specified
mortality events occur.
The term of the bonds can be a preset finite period (like the EIB/BNP LB) or a
perpetuity. For example, Blake and Burrows (2001) have proposed a type of longevity
bond in which the payments last until the death of the last surviving member of the
reference cohort. The nature of the dependence on mortality can vary too: the
payment might be a smooth function of a mortality index, or it might be specified
in "at risk" terms, i.e. the investor loses some or all of the coupon or principal if the
mortality index crosses some threshold.
Here are some prominent examples of mortality bonds:
• the Swiss Re Mortality bond
In December 2003, Swiss Re issued a three-year life catastrophe bond, maturing
on January 1, 2007. It is a principal-at-risk bond, and its principal is repaid
according to the following schedule:

   100%                                    if q_t < 1.3 q_0
   [ (1.5 q_0 − q_t) / (0.2 q_0) ] × 100%  if 1.3 q_0 ≤ q_t < 1.5 q_0
   0%                                      if q_t ≥ 1.5 q_0
where q_t is a specifically constructed index of mortality rates across five coun-
tries (the United States of America, the U.K., France, Italy and Switzerland) in year t,
and q_0 is its base level. The issue size is $400 million, and quarterly coupons are
paid at three-month U.S. dollar LIBOR plus 135 basis points.
The bond is designed to help reduce Swiss Re's exposure to extreme mor-
tality risk (e.g. that associated with a repeat of the 1918 Spanish Flu
pandemic). There are some other advantages too. First, Swiss Re can improve
its credit rating and reassure rating agencies about its mortality risk manage-
ment capability. Second, by issuing the bond itself, Swiss Re does not need to
rely on other counterparties should an extreme mortality event occur. Thus, the
bond gives Swiss Re some protection against extreme mortality risk while avoiding
the credit risk it might face in reinsurance.
In contrast with the EIB/BNP LB, the Swiss Re bond was well accepted by
the market. In April 2005, Swiss Re announced a second life catastrophe bond
which will mature in 2010 with an issue value of $362 million.
• Zero-Coupon Longevity Bonds
The contract terms of a zero-coupon LB (or simply a "zero") have
been discussed in the previous subsection. It is easy to imagine that single-
payment longevity bonds might be issued, or financially engineered by stripping
"standard" longevity bonds, as has been done in the fixed-income market. The
attraction of zeros is that they provide building blocks for tailor-made positions
or for theoretical studies.
A two-dimensional spectrum of such bonds could be issued: one dimension is re-
lated to the cohorts being followed and the other related to the maturity dates.
The availability of a sufficient variety of bonds from this two-dimensional spec-
trum would then enable insurance companies to construct portfolios of longevity
bonds that provide close fits to the size/age features of their particular annu-
ity books. The development of such products can also enhance the appeal of the
mortality derivative market as an investment resource in general.
• Deferred Longevity Bonds
As we have discussed before, we might need to have more highly geared bonds
which enable users to meet their hedging demands with a much reduced capital
outlay. One way of increasing gearing is to issue bonds with deferred payment
dates. The deferments would save a large amount of capital, and so make such
longevity bonds much more attractive as hedging instruments.
2. Survivor swaps
A mortality swap is an agreement where counterparties swap a fixed series of
payments in return for a series of payments linked to a survivor or mortality index.
Typically, the preset rate leg is linked to a published mortality projection, and the
floating leg is linked to the counterparty’s realized mortality.
Mortality swaps have certain advantages over longevity bonds. They can be ar-
ranged at a lower transaction cost than issuing a bond. They are more flexible and they
can be tailor-made to suit diverse circumstances. They do not require the existence
of a liquid market, just the willingness of counterparties to exploit their comparative
advantages or to trade views on the development of mortality over time.
Actually, there is an embedded over-the-counter (OTC) mortality swap arrange-
ment in the issuance of the EIB/BNP bond. Here is what goes on behind the scenes. By
issuing the bond, the EIB has a commitment to make mortality-linked payments in
sterling. To guarantee that the bond commitment is met, the EIB engages in two
swaps. The first one is a cross-currency interest rate swap with BNP to exchange
floating euro payments for fixed sterling payments. In the meantime, the EIB
also sets up a deal with Partner Re, in which the EIB exchanges the fixed sterling payments for
the mortality-linked floating sterling payments S(t). As a result, the EIB exchanges its
commitment from the longevity bond for a commitment to pay floating euros, and
gets rid of the mortality risk exposure.
The above-mentioned mortality swap used by the EIB is called a vanilla mortality
swap (VMS). A VMS is analogous to a vanilla interest-rate swap (IRS), which involves
one fixed leg and one floating leg typically tied to a market rate such as LIBOR.
However, the IRS can be easily valued owing to the existence of a liquid bond market.
This is not the case for the VMS at present, due to the lack of a liquid and transparent
spot mortality risk market.
Mortality swaps have a number of possible uses. As Lin and Cox (2006) explain, a
mortality swap can be used to help firms that run both annuity and life books manage
the natural hedges implicit in their positions. The type of swap in this case might
be a floating-for-floating swap, with one floating leg tied to the annuity provider’s
annuity payments and the other to the life assurer’s insurance payouts. In the same
way, firms could use such swaps to manage their exposures across both reference
populations and across the “mortality term structure”. Also a mortality swap can
be used between an insurer, wishing to manage the risks on its annuity book, and
a capital market institution wishing to acquire longevity risk exposure (see Blake,
Cairns, and Dowd, 2006).
3. Mortality futures
As in the financial market, a futures contract is an agreement between counter-
parties to buy or sell a security at a future time for a preset price. The basic form
of a futures contract involves defining (a) the underlying (typically price) process,
X(t), that will determine the payoff and the value of the futures contract, and (b)
the delivery date, T , of the contract. When X(t) itself represents the price of a traded
asset, the advantage of a futures market is normally that it allows stakeholders to
trade in the underlying risk with lower transaction costs and in a market with greater
liquidity than is usually possible from trading in the underlying in the spot market.
The factors that might jeopardize the success of a mortality futures contract have
been pinned down to the following:
• There must be a large, active and liquid spot market for the underlying with
good price transparency. This is by far the most important factor: indeed it is
extremely rare for a futures contract to survive without a spot market satisfying
these conditions.
• Spot prices must be sufficiently volatile to create both hedging needs and spec-
ulative interest.
• The market in either the underlying or the futures must not be heavily concen-
trated on either the buy or sell side, because this can lead to price manipulation.
• The underlying must be homogeneous or have a well-defined grading system.
• Liquidity costs (i.e. bid-ask spreads and execution risk — the risk of adverse
price movements before trade execution) in the futures contract must not be
significantly higher than those operating in any existing cross-hedge futures
contract.
The challenge of finding suitable mortality-linked products to serve as the underlying has
been discussed in Blake, Cairns, and Dowd (2006). As a concrete example,
Cairns, Blake, and Dowd (2006a) have introduced the concept of annuity futures.
Suppose ARM(t, x) and ARF(t, x) represent the market-level annuity rates available
at time t per $10,000 single premium for males and females at age x, respectively.
That is, the single premium of $10,000 will purchase an annuity of AR(t, x) per
annum (here we drop the index for the corresponding gender). Then an annuity futures
contract would have AR(t, x) as its underlying index. There would be a variety of
maturity dates. The spot market for immediate annuities is a fairly active one, but
the problem is that this market is neither liquid nor transparent. It is also quite
inefficient in the sense that market annuity rates do not track changes in market
interest rates as closely as they would in an efficient financial market.
Another example of mortality futures is the one suggested by Blake, Cairns, and
Dowd (2006): if a liquid market in longevity bonds develops in time,
then it might be possible for a futures market to develop which uses the price or prices
of longevity bonds as the underlying. Day-to-day volatility in longevity bond prices
will be driven by changes in interest rates, whereas the risk associated with changes
in longevity emerges over longer periods of time. It is also possible to use a survivor
index as the underlying.
4. Mortality options
A mortality option is a type of contract with option characteristics whose payoff
depends on the underlying mortality schedule at a preset date. Mortality options
are of great interest since they can provide (a) protection for investors against the
downside exposure of the risk while leaving the upside potential intact, and (b) an instrument
for speculators who want to trade views on volatility rather than views on the level
of mortality rates. For both of these purposes, options are (usually) the best type of
instrument.
• Survivor Caps and Floors

A possible market in survivor caplets and floorlets has been suggested by Blake,
Cairns, and Dowd (2006). The basic idea is to use a survivor index S(t, x) as
the underlying. Let s_c(t) be the cap rate for exercise date t. The caplet pays
max{S(t, x) − s_c(t), 0} at time t. Similarly, a floorlet pays max{s_f(t) − S(t, x), 0}.
Survivor caplets and floorlets can then be packaged into survivor caps and floors.
Alternative names could be longevity caps and floors.
• Mortality Swaptions
A more sophisticated contract would be a mortality swaption in which the
underlying instrument would be a mortality swap of specified type and maturity.
The swaption might be American, European or Bermudan in nature, and would
give the holder the right to enter into the swap on one or other side. For example,
a payer swaption gives the holder the right to enter as the fixed-rate payer; vice
versa, a receiver swaption gives the holder the right to enter as the fixed-rate
receiver. As with conventional swaptions, a payer swaption can be regarded as
a put on survivor rates, because its value would go up when survivor rates fall,
and a receiver swaption can be regarded as a call on survivor rates, because its
value would increase when survivor rates rise.
Mortality swaptions can be used for various risk management purposes. An
obvious use is to provide the option to lock-in future swap rates, which might
assist insurance companies in managing the risks of positions in instruments
such as guaranteed annuity options.
• Guaranteed Annuity Options

Finally, the guaranteed annuity option mentioned before is another example of
a mortality option; however, it is a more complicated product which involves
interest rate risk as well. Contracts of this type are probably the most
discussed mortality-linked products in the pioneering stochastic mortality research
so far. Interested readers are referred to the works by Boyle and Hardy (2003),
Ballotta and Haberman (2003, 2006) and the references therein.
We would like to finish our introduction to mortality-linked securities here. Al-
though there are still many teething problems to overcome, there is no doubt that the mor-
tality derivative market is becoming the next big frontier for the financial market.
For issues related to implementation and securitization, we would like to refer
interested readers to the works by Cowley and Cummins (2005), Dowd et al. (2006),
Lin and Cox (2005a), Blake, Cairns and Dowd (2006), and the references therein.
The valuation problem of mortality derivatives as well as their risk management
require the use of a good stochastic mortality model. Our proposed time-changed
Markov model may provide an alternative toolkit to handle this type of problem.
Another interesting direction we would like to pursue in the future is to expand
the Markovian framework developed in this chapter to incorporate stochastic inter-
est rates. The motivation for this generalized Markov framework is two-fold. In the
first place, current interest rate models may not be adequate for very long-term
products (the average duration can be more than 40 years for longevity products). For
example, in Boyle and Hardy (2003) and Ballotta and Haberman (2006), the interest
rate is modelled by the Hull-White model and the Heath-Jarrow-Morton model, re-
spectively. Their results showed that the value of guaranteed annuity options (GAOs) is
dominated by the current high interest rate no matter how long the maturity of the
GAO is. This may not be a reasonable result. In the second place, it is mathemat-
ically tractable to incorporate multiple risk factors under the same framework.
To be specific, the short rate process can be modelled as a function of a finite
Markov process which represents the “states of the economy”, similar to the models
proposed in Norberg (2003), Elliott, Hunter, and Jamieson (2001), and Elliott and
Mamon (2002). Under this Markovian framework, explicit expressions for the prices
of zero-coupon bonds and other securities can usually be obtained in terms of matrix
exponentials, which is quite similar to the results for the survival bonds under
the time-changed Markov mortality model. It should be interesting to integrate the two
Markov models together. The augmented framework could be used to price and hedge
long-term mortality-risk-related products, and it could also be used to evaluate
insurance liabilities and risk measures.
Chapter 4
Deterministic Fitting
In this chapter, we set out to show that a phase-type distribution and its associated
Markov process can be a very good candidate for a survival model. In addition to
providing statistical evidence to support this idea, we also propose an aging mechanism
as the interpretation of the model.
This chapter is organized as follows. In Section 4.1, we briefly introduce the aging
process and the key concept — the physiological age. In Section 4.2, we propose and
discuss a deterministic survival model. We present the fitting results for the Swedish
cohorts’ data in Section 4.3, and the fitting results for the mortality data compiled
by the United States Social Security Administration in Section 4.4. Goodness-of-fit
analysis is performed in Section 4.5. We discuss analytical properties of the pro-
posed model in Section 4.6, which illustrates the usefulness of the model and the
matrix-analytic method in pricing insurances and annuities. A short discussion on
this approach is given in Section 4.7.
4.1 Aging Process and Physiological Age
We discussed in Section 1.2 how extrapolative methods have been used exten-
sively in mortality projection. When one employs a parametric model to extrapolate
the past trend into the future, it is implicitly assumed that the historical patterns
will still hold in the future and that no structural change will occur. This is certainly not
true. Over the past century, we have observed changes in mortality patterns due
to the transition in major causes of death from infectious diseases to chronic diseases
(see Tuljapurkar and Boe, 1998). Moreover, the end of the 20th century has been
marked by declines in death rates from chronic and degenerative diseases (see Wil-
lets, 1999, for an empirical study on mortality reduction). In order to partly correct
the problems in the extrapolative projection, it is necessary that expert opinions on
medical, behavioral, or social impacts on mortality be incorporated into the projec-
tion model. However, since there are no direct links between the parameters in the
mortality models and the aging mechanism, such incorporation is often difficult.
Hence, it is desirable to have a mortality model that (i) fits observed mortality
data; (ii) can link its parameters to the biological/physiological mechanism of aging
to a certain extent, which allows for easier input of expert opinions; and (iii) allows
quantitative analysis on causes of death to be made at a more fundamental level. The
capability of constructing a mortality model with these desirable properties has been
considered important; see Tenenbein and Vanderhoof (1980) and Gutterman and Van-
derhoof (1998). However, to the best of our knowledge, a satisfactory model of this kind has
not yet been developed. The model we propose is a new attempt in this direction. We
start with modelling the underlying aging process using a finite-state continuous-time
Markov process with a single absorbing state (representing death). Consequently, the
survival functions can be derived from the model setting.
A complete description of aging theory is beyond the scope of this paper. Inter-
ested readers are referred to the classical books by Comfort (1964) and Finch (1990),
and also the papers by Collatz (1986) and Hayflick (2002). In the following, we high-
light some key features of an aging process and in particular we will focus on the
relationship between age-specific mortality patterns and the aging process.
As quoted from Jones (1956), the term aging process, as applied to living organ-
isms, is the genetically determined, progressive, and essentially irreversible diminution
with the passage of time of the ability of an organism or of one of its parts to adapt
to its environment, manifested as diminution of its capacity to withstand the stresses
to which it is subjected (that is, the increase of susceptibility to certain diseases with
age), and culminating in the death of the organism.
Clearly, human aging is associated with a wide range of physiological changes: the
general lessening of the intensity of perfusion of blood through the various tissues,
the disturbances of lipid metabolism and the growth of atherosclerotic deposits, and
the rarefaction of the bony structure, to name a few. Such changes make a human
life not only more susceptible to a number of diseases but also more susceptible to
death. The deterioration of various physiological functions as a whole can be viewed
as the worsening of health status, suggesting that the increasing mortality rate with
age shall be largely linked to the health status rather than directly to the age of a
life.
A number of experimental studies have been conducted to seek the understanding
of various human physiological functional changes. A recent experimental study con-
ducted by Sehl and Yates (2001) calculated the loss rates for 445 human structural
and functional variables from 13 organs, and 24 more integrative variables. One in-
teresting finding is that a linear model can provide a fit to the data and the fit is as
good as the best polynomial fit. Other experimental studies on physiological func-
tions may be found in Shock (1974), Bafitis and Sargent (1977), Strehler (1999), and
references therein. Main findings in these studies on human physiological functional
changes can be summarized as: (1) most functional variables reach their maximal
capacities roughly between age 3 and 20; (2) after age 30, most human functional
variables decline linearly, contrary to the exponential increasing of mortality rates;
(3) the decline of physiological functions varies among individuals of a cohort, and this
variation increases somewhat with age. These findings suggest that the aging process
could be modelled in terms of changes in the physiological functions qualitatively and
quantitatively.
Motivated by these studies, we introduce a hypothetical physiological age that
marks a detectable physiological change resulting from one or more afore-mentioned
physiological functions. This physiological age may be interpreted as a relative health
index representing the degree of aging in a human body. This idea is similar to that in
Jones (1959), Sacher and Trucco (1962), Featherman (1986), Yashin et al (1995), and
Zuev et al (2000), where hypothetical health indices are also proposed but in different
contexts. Unlike some mortality models that focus on particular health factors (for
example, morbidity and disability. See Crimmins et al, 1994) and their connections
with mortality, we define the physiological age at a fundamental level and assume
that it only develops in one direction. Since the various physiological functions decline
approximately linearly, we assume that linearity in time also holds for the physiological
ages. Further, the change of health status or the transition from one physiological
age to the next is random, which is fundamentally different from the chronological
age. Lastly, mortality is viewed as both a reflection of the intrinsic aging process
and a response to the “environmental” factors. The interplay between the intrinsic
aging process and the external factors of death is the susceptibility described in the
aging definition. The increased physiological age implies the increased susceptibility
to certain fetal diseases, due to the deterioration of physiological capacity. Sometimes
this is interpreted as the “weakest link” of the whole organism failing, which results in
death and terminating the aging process as described in Austad (1997) and Strehler
(1999).
4.2 The Proposed Mortality Model
In the spirit of the discussions in the previous section, we now propose a finite-
state continuous-time Markov process to model the hypothetical aging process (see
Diagram 4.1). Each state represents a physiological age and aging is described as a
process of consecutive transitions from one physiological age to the next physiological
age. There is one absorbing state and the transition from any other state to the
absorbing state is interpreted as the aging process being terminated due to an early
death, either from a random cause or from a fatal disease.
Diagram 4.1. Proposed physiological aging process (states 1, 2, ..., n, with aging transition rates λ_1, ..., λ_{n−1} between consecutive states and death rates q_1, ..., q_n from each state to the absorbing state)
As shown in Diagram 4.1, for each physiological age i two parameters are used: one is
to describe the development of aging process and the other to reflect the susceptibility
to death at that physiological status. Specifically, parameter λi represents the transi-
tion rate from physiological age i to physiological age i + 1 and can be interpreted as
an aging rate that measures how fast an individual advances from age i to age i+1. It
should be pointed out that λi is an integrated measure of the deteriorating intensity
of aging for a population, not for a specific individual. Probabilistically speaking, the
time of staying at state i before moving to state i + 1 follows an exponential distri-
bution with mean 1/λi. Thus, the larger λi is, the faster the progression of aging of
a population is.
Parameter qi describes the susceptibility, that is, the chance of death, at each
physiological age i. There are two different types of hazard threats at each age. One
is an aging-independent cause of death such as a fatal injury. The rate of death from
this type of causes is denoted by h1(i) (there are more discussions on how to specify
h1(i) in later sections). The other is the increasing susceptibility to death due to the
deterioration in physiological functions as a result of aging. The rate of death from
this type of cause is denoted by h2(i), and is an increasing function of i. We assume
that these two rates are additive: q_i = h_1(i) + h_2(i). Hence, death is the competing
result of two forces: the internal irreversible decline of physiological functions and the
instantaneously random threat of death at each physiological age.
Since the Markov process has only a single absorbing state, the time of death (the
time of absorption) follows a phase-type distribution. Furthermore, the transition rate
matrix of the Markov process is given by

   Λ = [ −(λ_1 + q_1)   λ_1            0              ···   0
          0             −(λ_2 + q_2)   λ_2            ···   0
          0              0             −(λ_3 + q_3)   ⋱     0
          ⋮              ⋮              ⋱              ⋱     ⋮
          0              0              0             ···   −q_n ]    (4.2.1)

and the initial distribution is α = (1, 0, ..., 0).
We now propose special structures on the parameters λi, h1(i), and h2(i), not
only to make the model feasible for estimation but also to ensure that the model and
the parameters are given reasonable physiological meanings for the aging dynamics.
Moreover, a structured model avoids the non-uniqueness problem in the estimation
of the phase-type distribution.
Since the experimental findings show that various physiological functions follow
a slow age-wise linear decline on average, it is thus reasonable to assume a constant
transition rate between states, that is:
λi = λ, for i = k + 1, . . . , n,
where k will be defined later.
For the aging-independent hazard rate, we propose

   h_1(i) = b + a   for i_1 < i ≤ i_2
            b       otherwise          (4.2.2)

where the constant b is interpreted as a background rate and the constant a is inter-
preted as a behavior-related accident rate. The accident rate appears only between
ages i_1 and i_2 since the behavior-related rate is age dependent, but the background
rate is a general reflection of the living environment. We propose that parameter h_2(i)
is an increasing function of i to reflect the increasing susceptibility to hazard threats with
physiological age. As our first simple hypothesis, we assume h_2(i) to be a power
function of the form h_2(i) = i^p · q, where q is a scale parameter and p is a measure of
the relative impact of aging on susceptibility. Together, the parameter q_i is expressed
as

   q_i = b + a + i^p · q   for i_1 < i ≤ i_2
         b + i^p · q       otherwise          (4.2.3)

Although the analytical form of q_i seems very simple, the model performs very
well when fitted to mortality data, as illustrated in later sections.
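A small sketch of how the structured sub-generator (4.2.1) can be assembled from the parameters (n, λ, b, a, i_1, i_2, q, p) of (4.2.2)-(4.2.3). The parameter values below are rough illustrative numbers of the same order as those in Table 4.1, not the fitted estimates, and the developmental states of Diagram 4.2 are omitted.

```python
import numpy as np

def build_generator(n, lam, b, a, i1, i2, q, p):
    """Sub-generator (4.2.1) for aging states 1, ..., n with constant aging rate lam and
    q_i = b + a*1{i1 < i <= i2} + q * i**p as in (4.2.3)."""
    i = np.arange(1, n + 1)
    qi = b + q * i.astype(float) ** p
    qi[(i > i1) & (i <= i2)] += a
    Lam = np.zeros((n, n))
    lam_out = np.append(np.full(n - 1, lam), 0.0)   # no aging transition out of state n
    Lam[np.arange(n), np.arange(n)] = -(lam_out + qi)
    Lam[np.arange(n - 1), np.arange(1, n)] = lam    # transitions i -> i + 1
    return Lam

Lam = build_generator(n=200, lam=2.5, b=3e-3, a=2e-3, i1=42, i2=99, q=1e-8, p=3)
alpha = np.zeros(200); alpha[0] = 1.0               # start in physiological age 1
```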
In order to fit the model to death rates of all ages, we consider including a devel-
opmental period in which newborns adapt to the environment and develop to reach
their maximal physiological performance. This is also a period during which mor-
tality decreases to its minimal point, starting from a comparatively high mortality
rate at birth. k additional states are added before the aging process to model the
developmental period. The augmented Markov process is given in Diagram 4.2 with
an illustrative 2-state developmental period:
Diagram 4.2. The augmented Markov model of the physiological aging process (developmental states I, II followed by aging states 3, 4, ..., n, with transition rates λ_I, λ_II, λ_3, ..., λ_{n−1} and death rates q_I, q_II, q_3, ..., q_n)
Here, the Roman numbered states represent the developmental period. The aug-
mented Markov process has the same structure as the previous one. In practice, a
small number for k (2 to 4) is sufficient for the developmental period. As we will show
in the next section, k = 4 is sufficient when we fit the model to the Swedish cohort
data of Years 1811, 1861, and 1911.
Some remarks are now made:
(i) A technical advantage of this model is that the time of death has a phase-type
distribution. Phase-type distributions have been studied extensively in the queuing
context. They can easily be analyzed using the matrix-geometric method developed
by Neuts (1981). The density, survival function and moments of a phase-type distri-
bution have a simple analytical form as we have shown in Chapter 3. Further details
can be found in Neuts (1981) and Asmussen (1987). The matrix-geometric method
also enables us to derive the closed-form expression of the net single premiums of the
whole life insurance and the whole life annuity, and to perform qualitative analysis
on the model (see Section 4.6). In contrast, two widely used mortality models, the
Heligman-Pollard model in actuarial science and the Lee-Carter model in survival
analysis, cannot identify the distribution of the time of death explicitly. No analyt-
ical tool is available for either model. As a result, most studies using these
models have to rely on simulation for numerical results or on statistical experiments.
(ii) The proposed model implicitly assumes that there is a maximum physiological
age n. Theoretically, it may not be necessary for aging to be a finite process, but
when n is large enough, the model can provide an excellent fit to data and can
also accommodate the needs for mortality projection and insurance valuation. Our
numerical experiments in this paper show that n needs to be as large as 200 to
provide a good fit for the data sets.
(iii) When n = 200, we could encounter the dimensionality problem. However,
since the total number of parameters in the model is relatively small (9 to 13), the
dimensionality problem is not a serious issue. As a comparison, the Lee-Carter model
may need more than 300 parameters in order to fit mortality data. See Li, Hardy
and Tan (2007).
4.3 Fitting Swedish Cohort Data
In this section, we will fit the proposed model to three sets of Swedish mortality data.
The main reason we choose the Swedish mortality data to illustrate the implementa-
tion of our model is the reliability of the data as well as the high life expectancy of
the Swedish population. As described in the previous section, we are to model the
aging process in terms of physiological ages. For this reason, cohort data are more
suitable for fitting the model than cross-sectional (period) mortality data.
The Swedish cohort data for decennial years 1811 through 1911 are downloaded
from www.mortality.org, and their graphs are shown in Figure 4.1.
Some observations are worth noting. From Year 1811 to Year 1861, there was no
significant improvement in infant-childhood mortality and in mortality at advanced
ages. A noticeable mortality improvement took place between early 20’s and late
60’s. However, a significant mortality decline at all ages was observed from Year
1861 to Year 1911. In particular, the cohorts of Years 1811, 1861 and 1911 represent
three very different mortality patterns as shown by the bold curves in Figure 4.1.
For cohort 1811, the mortality first declines to a minimal point at around age 14
and then steadily increases. For cohort 1861, the mortality increases moderately
from the minimal point for about 30 years and then increases steeply. Cohort 1911
presents a “noticeable” accident hump at youth ages and begins its steady increase
earlier than cohort 1861, but from a comparatively lower rate. All three cohorts show approximately parallel mortality curves from age 60. We apply our model to the cohorts of Years 1811, 1861 and 1911.

[Figure 4.1: Death rate qx for Swedish Cohorts from year 1811 to year 1911, plotting ln(10000·qx) against age (0 to 120) for the cohorts of 1811, 1821, . . . , 1911.]

Table 4.1: Parameter values for the Swedish cohorts of years 1811, 1861, and 1911

Cohort   λ        b            a            [i1, i2]   q            p
1811     2.5657   3.1504e-03   1.9888e-03   [42, 99]   9.3157e-09   3
1861     2.4794   4.4825e-03   1.9033e-03   [42, 89]   2.6351e-13   5
1911     2.3707   9.0987e-04   2.8939e-03   [33, 70]   1.8872e-15   6
The parameters in the transition matrix (4.2.1) are estimated by minimizing the sum of weighted squared errors:

F = ∑_{x=0}^{ω} (q̂x − qx)² s(x),   (4.3.1)

where qx and s(x) are the observed death rate and survival probability at age x, and q̂x is the corresponding model value for qx. Let ŝ(x) be the model survival function of the time of death. Since the time of death is of phase type with the phase-type representation (α, Λ), it can be expressed as

ŝ(x) = α exp(Λx) e,   (4.3.2)

where exp(Λx) is the matrix exponential of the transition matrix Λ. The probability of death q̂x can thus be calculated using

q̂x = (ŝ(x) − ŝ(x + 1)) / ŝ(x).   (4.3.3)
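To make the computations in (4.3.2) and (4.3.3) concrete, the following is a minimal sketch in Python (the thesis itself works in Matlab) of how a phase-type survival function and the corresponding death rates can be evaluated with a matrix exponential. The generator constructed below is only illustrative: lam[i] plays the role of the transition rate from physiological age i to i + 1 and qdeath[i] the absorbing (death) rate at state i; the thesis builds these rates from its own parameters λ, a, b, [i1, i2], q and p, which are not reproduced here.

    import numpy as np
    from scipy.linalg import expm

    def generator(lam, qdeath):
        # upper-bidiagonal generator Lambda of the aging chain (absorbing state dropped)
        n = len(qdeath)
        Lam = np.zeros((n, n))
        for i in range(n):
            up = lam[i] if i < n - 1 else 0.0          # no further aging in the last state
            Lam[i, i] = -(up + qdeath[i])
            if i < n - 1:
                Lam[i, i + 1] = up
        return Lam

    def survival(alpha, Lam, x):
        # s(x) = alpha exp(Lambda x) e, as in formula (4.3.2)
        return float(alpha @ expm(Lam * x) @ np.ones(Lam.shape[0]))

    def death_rate(alpha, Lam, x):
        # q_x = (s(x) - s(x+1)) / s(x), as in formula (4.3.3)
        sx = survival(alpha, Lam, x)
        return (sx - survival(alpha, Lam, x + 1)) / sx

    # toy illustration with 10 physiological ages
    n = 10
    lam = np.full(n, 2.5)                              # hypothetical aging rates
    qdeath = 1e-4 + 1e-6 * np.arange(1, n + 1) ** 3    # hypothetical absorption (death) rates
    alpha = np.zeros(n); alpha[0] = 1.0                # everyone starts in the first state
    Lam = generator(lam, qdeath)
    print([round(death_rate(alpha, Lam, x), 6) for x in range(5)])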
The estimates of the parameters are obtained using the simplex algorithm (Nelder and Mead, 1965) built into Matlab. The algorithm is applied to different combinations of the values n, i1 and i2 to determine the most suitable maximal physiological age and the
age range with a significantly high incidence of accidental death. We found that n = 200 provides the best fit for all three data sets. Estimated values are given in Table 4.1 for the aging-related parameters and in Table 4.2 for the developmental-period parameters. Four states are used for the developmental period. The fitted curves are shown in Figure 4.2.

[Figure 4.2: Fitted curves of qx for the Swedish cohorts of years 1811, 1861 and 1911, plotting ln(10000·qx) against age for the observed and fitted death rates of each cohort.]
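As a rough illustration of the calibration step described above, the sketch below minimizes the weighted squared-error criterion (4.3.1) with the Nelder-Mead simplex method from scipy (the thesis uses the simplex routine built into Matlab). The two free parameters, the hypothetical absorption rates inside build_generator and the synthetic "observed" rates are stand-ins for illustration only; an actual calibration would use the full parameter set and the downloaded cohort data.

    import numpy as np
    from scipy.linalg import expm
    from scipy.optimize import minimize

    n = 30                                    # illustrative number of physiological ages

    def build_generator(lam_age, b):
        # hypothetical absorption rates: background rate b plus a power of the state index
        qdeath = b + 1e-9 * np.arange(1, n + 1) ** 4
        Lam = np.zeros((n, n))
        for i in range(n):
            up = lam_age if i < n - 1 else 0.0
            Lam[i, i] = -(up + qdeath[i])
            if i < n - 1:
                Lam[i, i + 1] = up
        return Lam

    def model_qx(params, max_age):
        lam_age, b = params
        Lam = build_generator(lam_age, b)
        P1 = expm(Lam)                        # one-year transition matrix among transient states
        a_x = np.zeros(n); a_x[0] = 1.0       # alpha: everyone starts in the first state
        s = np.empty(max_age + 2)
        for x in range(max_age + 2):
            s[x] = a_x.sum()                  # s(x) = alpha exp(Lambda x) e
            a_x = a_x @ P1
        return (s[:-1] - s[1:]) / s[:-1]      # q_x = (s(x) - s(x+1)) / s(x)

    def objective(params, q_obs, s_obs):
        # F = sum_x (qhat_x - q_x)^2 s(x), the weighted squared errors of (4.3.1)
        return np.sum((model_qx(params, len(q_obs) - 1) - q_obs) ** 2 * s_obs)

    ages = np.arange(0, 91)
    q_obs = np.minimum(0.0005 * np.exp(0.07 * ages), 0.7)     # synthetic "observed" death rates
    s_obs = np.cumprod(np.r_[1.0, 1.0 - q_obs[:-1]])          # observed survival weights
    fit = minimize(objective, x0=[2.0, 1e-3], args=(q_obs, s_obs),
                   method="Nelder-Mead", options={"maxiter": 300})
    print(fit.x, fit.fun)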
Table 4.2: Parameter values for the developmental period

Cohort     qI       λI       qII      λII      qIII     λIII     qIV      λIV
1811       0.6051   2.4906   0.0726   0.6376   0.0503   0.7040   0.0285   0.4416
1861       0.3086   1.6773   0.0291   0.7793   0.0627   0.7954   0.0276   0.7531
1911       0.1671   1.7958   0.0097   0.5543   0.0003   3.5061   0.0149   0.6535
SSA1950B   0.1031   5.5796   0.0750   5.5454   0.0037   0.7419   0.0017   0.6189
The estimated parameters reconfirm the observed trend for death rates qx, as
shown in Figure 4.1. The internal aging rate λ decreases over time. This trend
coincides with the medical improvement since the beginning of the twentieth century,
along with the general improvements in the standard of living. The improvements
are also reflected by the significant decrease in the background death rate b for 1911.
Note that cohort 1911 has a much higher accidental death rate a, which accounts for
the observed hump. In Figure 4.3, we give the curve of h₂(i) = q · i^p for the three data sets, since it represents the increasing susceptibility with aging. It is interesting to see that there exists a crossover among the three curves. This may be interpreted as indicating that younger cohorts are more subject to chronic diseases at old ages, which is consistent with the exchange theory: the elimination of acute infectious diseases results in more people dying from chronic diseases (Fries, 1980; Jones, 1956 and 1959).
[Figure 4.3: Estimated patterns of h₂(i) = q · i^p for the three cohorts (1811, 1861, 1911), plotted against physiological age 0 to 200.]
4.4 Fitting U.S. Social Security Administration Mortality Data
In this section, we fit the same model and parameter sets to the (cohort) life table for
births in 1950 in Actuarial Study No. 107 that was compiled by the U.S. Social Secu-
rity Administration. This particular life table is referred to as SSA Life Table 1950B.
All the life tables in Actuarial Study No. 107 were compiled in 1992, reflecting both
the historical mortality experience before 1990 and projected future improvement
from 1990 to 2080. Those tables are designed for the valuation of the insurances and
benefits in the Old-Age and Survivors Insurance and Disability Insurance (OASDI)
program. Hence, future mortality improvement has been taken into account.
Table 4.3: Estimated parameter values for SSA 1950B

Method     λ        b            a            [i1, i2]    q            p
Basic      2.2484   2.7671e-04   1.5059e-03   [37, 120]   2.1710e-15   6
Modified   2.2502   3.7874e-04   2.3635e-03   [37, 120]   2.1710e-15   6
[Figure 4.4: Fitting curve of qx for SSA 1950B, plotting ln(10000·qx) against age (0 to 100) for the observed values and the fitted curve.]
We again use the Nelder-Mead simplex algorithm to minimize the sum of weighted
squared errors (see formula 4.3.1). The estimated parameters for the SSA Life Table
1950B are given in the first row of Table 4.3. The fitted curve of the death rates
qx’s with the corresponding observed values are given in Figure 4.4. The fitting is
reasonably good at most ages except two periods: ages from age 15 to the late 30’s,
and ages 100 and older. The rates at extreme old ages in the life tables in Actuarial
Study No. 107 were constructed by a geometric extrapolation of the probabilities of
death. Since it has been widely discussed that death rates do not increase in such
a fast manner at extreme old ages (Kannisto, 1994), we view this extrapolation as
unreliable and thus didn’t make any effort to adjust our model to provide a better fit
for this period.
For the other period (from age 15 to late 30’s), special adjustments were made
to accommodate high accidental rates. In Table 4.4, we list the age-group accidental
death rates provided in the U.S. National Vital Statistics Report (Table 10, Vol. 47,
No.19, June 30, 1999). It is shown in the table that the accidental death rate has
a sudden high jump at age 15 and remains approximately constant for age groups
15-24, 25-34, and 35-44; it then drops for the next two age groups, followed by an increase after age 65. We hence introduce a step function to replace the parameter a in the model:

a →  a       for [i1, j],
     a · c   for [j, i2].

The model is re-estimated with c = 0.51; the revised fitting curve is given in Figure 4.5 and the other estimated parameters are given in the second row of Table 4.3.
As can be seen, the fitting is improved noticeably.
Table 4.4: Accidental death rates by age group, the United States, 1997

Age group   lx        dx       Accidents #   Accident percent   Cond. acc. rate
under 1     2313844   28045    1139          0.0406133          0.000492254
1-4         2285799   5501     2420          0.439920015        0.001058711
5-14        2280298   8061     4183          0.518918248        0.001834409
15-24       2272237   31544    24059         0.762712402        0.010588244
25-34       2240693   45538    24061         0.528371909        0.010738196
35-44       2195155   89408    26236         0.293441303        0.011951776
45-54       2105747   144882   17866         0.123314145        0.0084844
55-64       1960865   231993   11128         0.047966965        0.005675046
65-74       1728872   464274   11910         0.025652955        0.006888885
75-84       1264598   670530   14791         0.02205867         0.011696207
85+         594068    594068   11713         0.019716598        0.019716598
[Figure 4.5: Adjusted fitting curve of qx for SSA 1950B, plotting ln(10000·qx) against age (0 to 100) for the observed values and the revised fitted curve.]
4.5 Analysis of Goodness-of-Fit
In this section, we consider three goodness-of-fit measures. The first measure is the
R-square measure that is normally used to test the goodness-of-fit of the Lee-Carter
model. See Wilmoth (1993) and references therein. The second measure is the mean
squared error. We compare the mean squared error from our model with the mean
squared error from Heligman-Pollard’s model. The third measure is the actuarial
present values of a particular insurance and annuity. We calculate the actuarial
present value of whole life insurance and the actuarial present value of whole life
annuity for various interest rates and ages, using the model. We then compare them
with the same values obtained directly from a life table.
R-square or the coefficient of determination measures the percent of the (weighted)
variance explained by a model. Its value is an indicator of how well the model fits
the data: the closer the value is to 1, the better. In our case, the total weighted sum
of squares is given by
SST = ∑_{x=0}^{ω−1} (qx − q̄)² s(x),   (4.5.1)

where q̄ = (∑_{x=0}^{ω−1} qx)/ω is the grand mean of qx across ages and ω is the maximum age. The residual weighted sum of squares is given by

SSE = ∑_{x=0}^{ω−1} (q̂x − qx)² s(x).   (4.5.2)

Then the percent of variance explained by the model is given by

R² = 1 − SSE/SST.   (4.5.3)
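For completeness, a small sketch of the weighted R-square computation in (4.5.1)-(4.5.3); q_obs, q_fit and s_obs are assumed to be numpy arrays of observed death rates, fitted death rates and observed survival probabilities over ages 0, 1, . . . , ω − 1.

    import numpy as np

    def weighted_r_square(q_obs, q_fit, s_obs):
        q_bar = q_obs.mean()                        # grand mean of q_x across ages
        sst = np.sum((q_obs - q_bar) ** 2 * s_obs)  # total weighted sum of squares (4.5.1)
        sse = np.sum((q_fit - q_obs) ** 2 * s_obs)  # residual weighted sum of squares (4.5.2)
        return 1.0 - sse / sst                      # R-square of (4.5.3)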
The R-square values of our model for the data sets in the previous two sections
are given in the first row of Table 4.5. As we can see, the R-square values obtained
for all cohorts are very close to 1, indicating that the proposed model has explained
almost all the variation of death rates over ages. In Lee and Carter (1992), the R-
square value for their model is 0.927. Normally an R-square value greater than 0.9 is considered satisfactory.
Table 4.5: The goodness-of-fit values

                 Cohort 1811   Cohort 1861   Cohort 1911   SSA1950B
R-square value   0.9982        0.9988        0.9993        0.9999
H-P's MSE        0.000293      0.000131      0.00007278    0.000031
Our MSE          0.000448      0.0003077     0.0000382     0.00000108
We next compare the mean squared error from our model with the mean squared
error from Heligman-Pollard’s model. We first fit Heligman-Pollard’s model (1.1.29)
to the three data sets discussed in the previous two sections. Since the estimation is
a routine procedure, we omit the detail. The mean squared errors (over the ages from
0 to 100) for Heligman-Pollard’s model and our model are then calculated, and the
results are also given in Table 4.5. Since Heligman-Pollard's model is a well-accepted mortality model for all ages, the comparison of the errors from the two models provides another way to assess how well our model fits the data. Our model outperforms Heligman and Pollard's model for cohort 1911 and SSA 1950B, but theirs is better for the others. In all the cases, the difference is insignificant. However, we want to emphasize
that the real technical advantage of our model compared with Heligman and Pollard’s
model is the simplicity of the model structure and the analytical properties that will
be discussed in the next section.
Lastly in this section, we calculate the actuarial present value Ax of the whole
life insurance and the actuarial present value ax of the whole life annuity, using two
approaches. One is to use our model and the other is to use a life table directly. The
SSA life table 1950B is used for this purpose. Based on this life table, the values of
Ax and ax can be obtained directly (e.g. Bowers et al., 1997). On the other hand, since the time of death follows the phase-type distribution with transition matrix Λ, the continuous versions Āx and āx of Ax and ax have closed-form expressions that are given by (4.6.4) in Section 4.6. Assuming the uniform distribution of deaths (UDD) within each year of age, Ax and ax then also have simple closed-form expressions and their values are obtained immediately. (Note that assuming UDD is not necessary, as closed-form expressions for the probabilities of death and survival also exist, but we use the UDD assumption for simplicity.) With the estimated parameters in Table 4.3 for the
revised model, we obtain the values of Ax and ax for 5 different ages and 3 different
interest rates. These values are compared with the corresponding values obtained
from the SSA life table 1950B directly. These values, together with the relative errors, are given in Table 4.6 and Table 4.7 respectively. They again demonstrate how good the fit is.
4.6 Qualitative Analysis of the Model
One of the advantages of using phase-type distributions in survival analysis is that the underlying Markov process Jt, t ≥ 0, can provide a population dynamics over time. In our model, Jt represents the physiological age of an individual at (chronological) age t.
Let Pi(t) be the probability that the individual at age t is at physiological age i, i.e.
Pi(t) = P (Jt = i, T > t), t ≥ 0, i = 1, 2, · · · , n.
Table 4.6: Actuarial present value Ax of whole life insurance at different ages and interest rates

Interest rate   Age x   Life table value for Ax   Model value for Ax   Relative error
i=1%            30      0.63592                   0.63675              0.0013044
                40      0.69484                   0.696                0.0016695
                50      0.75586                   0.75798              0.0028041
                60      0.81762                   0.81895              0.0016273
                70      0.87414                   0.87358              -0.00064002
i=4%            30      0.19287                   0.19349              0.0032242
                40      0.26543                   0.2669               0.005534
                50      0.35934                   0.36401              0.012981
                60      0.4777                    0.48151              0.007979
                70      0.60854                   0.60713              -0.002312
i=7%            30      0.076294                  0.076279             -0.00019177
                40      0.12337                   0.12395              0.00472
                50      0.19578                   0.20055              0.02436
                60      0.30535                   0.31029              0.016157
                70      0.44719                   0.44562              -0.0035071
Table 4.7: Actuarial present value ax of whole life annuity at different ages and interest rates

Interest rate   Age x   Life table value for ax   Model value for ax   Relative error
i=1%            30      35.772                    35.689               -0.0023419
                40      29.821                    29.704               -0.0039288
                50      23.658                    23.444               -0.0090486
                60      17.421                    17.286               -0.0077141
                70      11.712                    11.768               0.0048247
i=4%            30      19.985                    19.969               -0.000809
                40      18.099                    18.061               -0.0021102
                50      15.657                    15.536               -0.0077459
                60      12.58                     12.481               -0.0078779
                70      9.178                     9.2146               0.0039856
i=7%            30      13.12                     13.12                1.7046e-05
                40      12.4                      12.391               -0.00071783
                50      11.293                    11.22                -0.0064551
                60      9.6182                    9.5428               -0.007841
                70      7.4501                    7.4741               0.0032178
Then, for i = 1, 2, · · · , n, Pi(t) = [αeΛt]i, where [ · ]i is the ith component of vector [ · ].
Obviously, the survival function s(t) can be expressed as s(t) = ∑_{i=1}^{n} Pi(t). Consider now the corresponding conditional probability πi(t) = P(Jt = i | T > t). Then, πi(t) can be expressed as

πi(t) = Pi(t)/s(t) = [α exp(Λt)]i / (α exp(Λt) e),   t ≥ 0, i = 1, 2, · · · , n.   (4.6.1)
The distribution π(t) = [π1(t), π2(t), . . . , πn(t)] may be used to describe the hetero-
geneity or frailty in health status among the cohort of individuals at age t, where
the heterogeneity is measured by the physiological age. Furthermore, the aging pro-
cess of the heterogeneous cohort can still be modelled in the same manner: it is a
Markov process with the same transition matrix Λ but the initial distribution is now
π(t). As a result, all the desirable properties are preserved and the same mathemat-
ical/statistical analysis can be carried out.
To illustrate the heterogeneity, in Figure 4.6 we present the distribution π(t) for
ages 30, 50, and 70 for the Swedish Cohorts of years 1811, 1861, and 1911.
Figure 4.6 shows that (1) the degree of heterogeneity increases with age among survivors; and (2) the distribution π(t) shifts to the left from cohort 1811 to cohort 1911, which means that the younger cohorts are younger in physiological age, reflecting the health improvement of the population over time.
The relationship between the force of mortality of the time of death and the absorbing rate qi (see formula (4.2.3)), which represents the death rate at physiological age i, can also be identified. It follows from (3.1.2) and (3.1.3) that the force of mortality µ(t) is given by

µ(t) = α exp(Λt) q / (α exp(Λt) e).   (4.6.2)
[Figure 4.6: Heterogeneous distributions π(t) for the three cohorts (1811, 1861, 1911) at ages 30, 50 and 70, plotted against physiological age 0 to 200.]
Thus, with formula (4.6.1) we obtain
µ(t) = ∑_{i=1}^{n} qi · πi(t).   (4.6.3)
In other words, the force of mortality at age t is a weighted average of the death rates
qi’s with the heterogeneous distribution π(t) as the weights.
This model also allows us to investigate the impact of the absorption rates qi on the
distribution of the time of death. Suppose that the absorption rates qi's are subject to a constant change or perturbation ε. That is, the new absorption rates are q_i^ε = qi + ε for all i, while the other parameters remain unchanged. This is the case when the background death rate b is changed to b + ε. Let (α, Λ_ε) denote the corresponding phase-type representation. Then we have

Λ_ε = Λ − εI.

Thus,

exp(Λ_ε t) = exp((Λ − εI)t) = e^{−εt} exp(Λt).

It follows from (3.1.2) that the survival function S_ε(t) with the perturbation can be expressed as

S_ε(t) = e^{−εt} s(t).

Interestingly, the distribution π(t) that measures frailty remains unchanged:

π_ε(t) = α exp(Λ_ε t) / (α exp(Λ_ε t) e) = α exp(Λt) / (α exp(Λt) e) = π(t).

The force of mortality µ_ε(t) is obtained by

µ_ε(t) = ∑_{i=1}^{n} q_i^ε · πi(t) = ∑_{i=1}^{n} qi · πi(t) + ∑_{i=1}^{n} ε · πi(t) = µ(t) + ε,

using (4.6.3). Thus, if the absorption rates are increased (or decreased) by ε, so is the force of mortality.
The idea of changing the absorption rates qi’s is useful for determining a loading
to the net premium of an insurance or an annuity. The traditional approach is to
adjust the probabilities of death qx’s at all ages with a fixed percentage increase or
decrease as shown in the CET and GAM tables. A similar approach can be used by
letting q_i^ε = qi + ε_i. Here ε_i may be viewed as a mortality loading to physiological age i. All the premium calculations can then be carried out in the same manner.
Finally, we present the following closed-form expressions for the actuarial present
values of the whole life insurance, the term life insurance and the whole life annuity,
to further illustrate the technical advantage of the model. The symbols are self-
explanatory.
Āx = ∫₀^∞ e^{−δt} π(x) exp(Λt) q dt = −π(x)(Λ − δI)^{−1} q,

Ā¹_{x:n|} = ∫₀^n e^{−δt} π(x) exp(Λt) q dt = π(x)(Λ − δI)^{−1}(e^{(Λ−δI)n} − I) q,

and

āx = ∫₀^∞ e^{−δt} π(x) exp(Λt) e dt = −π(x)(Λ − δI)^{−1} e.   (4.6.4)
We note that these closed-form expressions are used to calculate the values in Table 4.6
and Table 4.7. In contrast, Heligman and Pollard’s model and the Lee-Carter model
cannot produce closed-form expressions for the above.
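To illustrate why (4.6.4) is convenient, the sketch below evaluates the continuous whole life insurance and annuity values by a single linear solve instead of numerical integration. The generator, initial distribution and force of interest are assumptions made for the example, not the fitted SSA 1950B parameters.

    import numpy as np
    from scipy.linalg import expm, solve

    n = 20
    lam = 2.5
    qdeath = 1e-4 + 1e-6 * np.arange(1, n + 1) ** 3    # illustrative absorption (death) rates
    out = np.r_[np.full(n - 1, lam), 0.0]
    Lam = np.diag(-(out + qdeath)) + np.diag(np.full(n - 1, lam), 1)
    alpha = np.zeros(n); alpha[0] = 1.0
    e = np.ones(n)

    def pi(x):
        # distribution of physiological age among survivors at chronological age x
        p = alpha @ expm(Lam * x)
        return p / p.sum()

    def whole_life_insurance(x, delta):
        # Abar_x = -pi(x) (Lambda - delta I)^(-1) q
        return float(-pi(x) @ solve(Lam - delta * np.eye(n), qdeath))

    def whole_life_annuity(x, delta):
        # abar_x = -pi(x) (Lambda - delta I)^(-1) e
        return float(-pi(x) @ solve(Lam - delta * np.eye(n), e))

    delta = np.log(1.04)                               # force of interest equivalent to i = 4%
    for age in (30, 40, 50, 60, 70):
        print(age, round(whole_life_insurance(age, delta), 5),
              round(whole_life_annuity(age, delta), 4))

As a quick sanity check, Āx + δ · āx evaluates to 1 in this construction, since q = −Λe.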
4.7 Conclusion and Discussion
The idea of using Markov processes and phase-type distributions to model human
mortality is not new at all. For instance, Gavrilov and Gavrilova (1991) use a Markov
process to derive a Makeham-Gompertz formula under special assumptions. Aalen
(1995) explores the theoretical potential of the use of phase-type distributions to
model different shapes of hazard rates, suggesting that such models should find greater application in survival analysis. Other examples of using Markov models and
phase-type distributions for survival analysis can be found in Kay (1986), Longini et
al. (1989 and 1991), and Guihenniuc-Jouyaux, Richardson, and Longini (2000) under
the heading of Markov Processes.
In this chapter, we have proposed a hypothetical modelling framework using a
finite-state continuous-time Markov process with a single absorbing state (death)
to describe a physiological aging process of a human body. A special structure is
suggested on the transition matrix to characterize the Markov process such that the
process can be linked with the underlying aging mechanism. Furthermore, the use
of the finite-state continuous-time Markov process ensures that the time of death
follows a phase-type distribution. As a result, many analytical methods developed
for phase-type distributions can be applied to the proposed mortality model.
We have shown that the model is capable of explaining some stylized facts of
observed mortality very well. We have fitted the model to the Swedish population
cohort data and life tables compiled by the U.S. Social Security Administration in
Actuarial Study No. 107. The fitting results are satisfactory.
One of the potential applications of the model could be for mortality projection,
which is of central importance to the insurance industry and social security sys-
tems. Current mortality projection methods normally involve certain extrapolation
techniques. However, several recent reviews (CMI, 2002, and GAD, 2001) have shown that these approaches often underestimate mortality improvements. In addi-
tion, certain important issues, including the cohort effect and the future uncertainties
in mortality rate, are not well addressed by extrapolative methods. This process-
based mortality model could provide a framework to (at least partly) address these
practical issues. Moreover, it is possible to extend this model to provide a unified
framework for all cohorts. This will be very useful when we are dealing with cross-
sectional (period) mortality data. Also, we have shown in Chapters 3 and 4 that we can introduce a time-change factor to this model to generate stochastic mortality.
Appendix A
Matrix Algebra
As we have seen, our results rely heavily on matrix computations and the matrix exponential; therefore we give a brief summary of relevant matrix operations and some useful results in this appendix.
A.1 Matrix Exponential
The matrix exponential e^A of an n × n matrix A is defined as

e^A = ∑_{n=0}^{∞} A^n / n!.
Some fundamental properties for matrix exponential are:

1.
   d/dt e^{At} = A e^{At} = e^{At} A.   (A.1.1)

2. If the matrix A is invertible, then

   ∫₀^t e^{As} ds = A^{−1}(e^{At} − I);   (A.1.2)

   If all the eigenvalues of A are negative (or have negative real parts), then

   ∫₀^∞ e^{As} ds = −A^{−1}.   (A.1.3)

3. If there exists a matrix H such that A = HDH^{−1}, where D = (d_j)_{diag}, then

   A^n = H D^n H^{−1}.   (A.1.4)

   Note that equation (A.1.4) holds for negative n as well as positive. D^n is easy to calculate since it only involves the powers of a diagonal matrix. Moreover,

   e^A = e^{HDH^{−1}} = H e^D H^{−1} = H (e^{d_i})_{diag} H^{−1}.   (A.1.5)
4. In general, the equality e^{A+B} = e^A e^B does not hold except when A and B commute.1 However, Kronecker operations allow the exponential function to be generalized to matrices, as shown in Proposition A.1.1.
Proposition A.1.1.

e^{A⊕B} = e^A ⊗ e^B.   (A.1.6)

For a proof, see Asmussen (2000a, p. 345).
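A quick numerical check of this identity (not a substitute for the proof), using numpy's Kronecker product and scipy's matrix exponential on two small arbitrary matrices:

    import numpy as np
    from scipy.linalg import expm

    A = np.array([[-1.0, 0.5], [0.2, -0.7]])
    B = np.array([[-2.0, 1.0, 0.0], [0.3, -1.5, 0.4], [0.0, 0.2, -0.9]])

    kron_sum = np.kron(A, np.eye(3)) + np.kron(np.eye(2), B)   # A (+) B = A (x) I + I (x) B
    lhs = expm(kron_sum)
    rhs = np.kron(expm(A), expm(B))
    print(np.allclose(lhs, rhs))   # True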
A.2 The Kronecker product ⊗ and the Kronecker
sum ⊕
In the proposition (A.1.1), the Kronecker product ⊗ and the Kronecker sum ⊕ have
been used. Their definition and some properties are given in the following.
1 Note that the Baker-Campbell-Hausdorff formula can be used to calculate an approximate value for e^{A+B} when A and B do not commute. For more details, see Bakhturin (2001). We want to thank Professor Sebastian Jaimungal for pointing this out to us.
Definition A.2.1. (The Kronecker product ⊗ and the Kronecker sum ⊕)
Let A be a k1 × m1 and B be a k2 × m2 matrix. Then the Kronecker product A ⊗ B is the (k1·k2) × (m1·m2) matrix defined blockwise as

A ⊗ B = | a_{11}B    · · ·   a_{1,m1}B  |
        |  · · ·     · · ·    · · ·     |
        | a_{k1,1}B  · · ·   a_{k1,m1}B |   (A.2.1)

If A and B are both square (k1 = m1 and k2 = m2), then the Kronecker sum A ⊕ B is defined by

A ⊕ B = A ⊗ I_{k2} + I_{k1} ⊗ B.   (A.2.2)
Note that when A reduces to a column vector h of dimension k × 1 and B reduces to a row vector ν of dimension 1 × m, h ⊗ ν is the k × m matrix with (i, j)th element h_i ν_j, i.e. h ⊗ ν = hν in standard matrix notation.
Proposition A.2.2.
(A1B1C1) ⊗ (A2B2C2) = (A1 ⊗ A2)(B1 ⊗ B2)(C1 ⊗ C2)
In particular, if A1 = ν1,A2 = ν2 are row vectors and C1 = h1,C2 = h2 are column
vectors, then ν1B1h1 and ν2B2h2 are real numbers, and
(ν1B1h1) · (ν2B2h2) = (ν1B1h1) ⊗ (ν2B2h2) = (ν1 ⊗ ν2)(B1 ⊗ B2)(h1 ⊗ h2)
Exponential representation
Borrowing the Kronecker notation, we can express the matrix exponential of a
diagonalizable square matrix as the sum of exponential functions in terms of its right
and left eigenvectors.
Proposition A.2.3. (Exponential representation) Suppose the transition matrix A has n distinct eigenvalues λ1, · · · , λn. Let ν1, · · · , νn be the corresponding left (row) eigenvectors and h1, · · · , hn the corresponding right (column) eigenvectors (that is, νiA = λiνi, Ahi = λihi with νihj = 0 for i ≠ j, and νihi = 1 after normalization). Then the transition matrix A has diagonal form, and

A = ∑_{i=1}^{n} λi hi νi = ∑_{i=1}^{n} λi hi ⊗ νi,

e^{At} = ∑_{i=1}^{n} e^{λi t} hi νi = ∑_{i=1}^{n} e^{λi t} hi ⊗ νi.
In this Proposition, the key step is to find a decomposition of the matrix A such that A = H (λi)_{diag} H^{−1}. This is possible when A has nondegenerate eigenvalues λ1, · · · , λn and corresponding linearly independent eigenvectors h1, h2, · · · , hn. Then H = [h1, · · · , hn]. Using formula (A.1.5) and Kronecker operations, the result follows.
The advantage of the above property is that we have an explicit formula for A and e^{At} once the λi, hi, νi have been computed, and it is a combination of exponentials.
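A small sketch of this exponential representation applied to an arbitrary diagonalizable matrix: the rows of H^{−1} serve as the normalized left eigenvectors νi, and the spectral sum is compared against scipy's matrix exponential.

    import numpy as np
    from scipy.linalg import expm

    A = np.array([[-1.0, 0.6, 0.0],
                  [0.1, -0.8, 0.5],
                  [0.0, 0.2, -0.4]])
    t = 2.0

    evals, H = np.linalg.eig(A)          # columns of H are right eigenvectors h_i
    Nu = np.linalg.inv(H)                # rows of inv(H) are the normalized left eigenvectors nu_i
    spectral = sum(np.exp(evals[i] * t) * np.outer(H[:, i], Nu[i, :]) for i in range(3))
    print(np.allclose(spectral, expm(A * t)))   # True, up to round-off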
There are, however, two serious drawbacks of this approach (see Asmussen (2000a)):

1. Numerical instability: if the λi are too close, the representation of A contains terms which almost cancel and the loss of digits may be disastrous. This phenomenon can occur even when the dimension n is not large.

2. Complex calculations: if not all λi are real, one must either do the calculations with complex numbers or perform the cumbersome translation into real and imaginary parts, both of which are messy.
Proposition A.2.4. For any m × m square matrix B with inverse B^{−1},

(B ⊗ B)^{−1} = B^{−1} ⊗ B^{−1}.

Proof. Denote B = (b_{ij}) and B^{−1} = (c_{ij}), i, j = 1, · · · , m, which satisfy

∑_{j=1}^{m} b_{ij} c_{jk} = 0,   i ≠ k,   (A.2.3)

∑_{j=1}^{m} b_{ij} c_{jk} = 1,   i = k.   (A.2.4)

Also,

B ⊗ B = | b_{11}B   · · ·   b_{1m}B |
        |  · · ·    · · ·    · · ·  |
        | b_{m1}B   · · ·   b_{mm}B |

B^{−1} ⊗ B^{−1} = | c_{11}B^{−1}   · · ·   c_{1m}B^{−1} |
                  |    · · ·       · · ·      · · ·     |
                  | c_{m1}B^{−1}   · · ·   c_{mm}B^{−1} |

Applying relations (A.2.3) and (A.2.4), it is straightforward to see that (B ⊗ B)(B^{−1} ⊗ B^{−1}) is the identity matrix of order m².
Bibliography
Aalen, O. O., 1995, “Phase type distributions in survival analysis,” The Scandinavian
Journal of Statistics, 22, 447–463.
Andreev, K., 2001, Overview of the Program Lexis 1.0, Odense University, Denmark, and Max Planck Institute for Demographic Research, Germany.
Asmussen, S., 1987, Applied Probability and queues, Wiley, New York.
Asmussen, S., 2000a, “Matrix-analytic models and their analysis,” The Scandinavian
Journal of Statistics, 27, 193–226.
Asmussen, S., 2000b, Ruin Probabilities, World Scientific Publishing, Singapore.
Asmussen, S., O. Nerman, and M. Olsson, 1996, “Fitting phase-type distributions via
the EM algorithm,” The Scandinavian Journal of Statistics, 23, 419–441.
Asmussen, S., and T. Rolski, 1991, “Computational methods in risk theory: a matrix
algorithmic approach,” Insurance: Mathematics and Economics, 10, 259–274.
Austad, S. N., 1997, Why We Age: What Science Is Discovering about the Body’s
Journey through Life, John Wiley & Sons, New York.
Bafitis, H., and F. Sargent, 1977, “Human physiological adaptability through the life
sequence,” Journal of gerontology, 32, 402–410.
Bakhturin, Y., 2001, "Campbell-Hausdorff formula," in M. Hazewinkel (ed.), Encyclopaedia of Mathematics, Kluwer Academic Publishers.
Ballotta, L., and S. Haberman, 2003, “Valuation of guaranteed annuity conversion
options,” Insurance: Mathematics and Economics, 33, 87–108.
Ballotta, L., and S. Haberman, 2006, “The fair valuation problem of guaranteed annu-
ity options: The stochastic mortality environment case,” Insurance: Mathematics
and Economics, 38, 195–214.
Benjamin, B., and J. Pollard, 1993, The analysis of mortality and other actuarial
statistics, The Institute of Actuaries, Oxford.
Benjamin, B., and A. Soliman, 1993, Mortality on the Move, Actuaries Education
Service, Oxford.
Biffis, E., 2005, “Affine processes for dynamic mortality and actuarial valuation,”
Insurance: Mathematics and Economics, 37, 443–468.
Biffis, E., and P. Millossovich, 2006, “The fair value of guaranteed annuity options,”
Scandinavian Actuarial Journal, 1, 23–41.
Bjork, T., 1998, Arbitrage Theory in Continuous Time, Oxford University Press.
Blake, D., and W. Burrows, 2001, “Survivor bonds: Helping to hedge mortality risk,”
Journal of Risk and Insurance, 68, 339–348.
Blake, D., A. J. G. Cairns, and K. Dowd, 2006, “Living with mortality: longevity
bonds and other mortality-linked securities,” British Actuarial Journal, 12, 153–
197.
Booth, H., J. Maindonald, and L. Smith, 2002, “Applying Lee-Carter under conditions
of variable mortality decline,” Population Studies, 56, 325–336.
Bowers, N. L., H. U. Gerber, J. C. Hickman, D. A. Jones, and C. J. Nesbitt, 1997,
Actuarial Mathematics, The Society of Actuaries. Second Edition. Schaumburg,
Illinois.
Box, G., and G. Jenkins, 1970, Time series analysis: Forecasting and control, San
Francisco: Holden-Day.
Boyle, P., and M. Hardy, 2003, “Guaranteed annuity options,” Astin Bulletin, 33,
125–152.
Brace, A., D. Gatarek, and M. Musiela, 1997, “The market model of interest-rate
dynamics,” Mathematical Finance, 7, 127–155.
Brass, W., 1974, “Mortality models and their uses in demography,” Transactions of
the Faculty of the Actuaries, 33, 123–132.
Cairns, A. J., 2000, “A discussion of parameter and model uncertainty in insurance,”
Insurance: Mathematics and Economics, 27, 313–330.
Cairns, A. J. G., D. Blake, P. Dawson, and K. Dowd, 2005, “Pricing the risk on
longevity bonds,” Life and Pensions, pp. 41–44.
Cairns, A. J. G., D. Blake, and K. Dowd, 2006a, “Pricing Death: Frameworks for the
Valuation and Securitization of Mortality Risk,” Astin Bulletin, 36, 79–120.
Cairns, A. J. G., D. Blake, and K. Dowd, 2006b, “A two-factor model for stochastic
mortality with parameter uncertainty,” Journal of Risk and Insurance, 11, 687–718.
CMI, 1990, Continuous Mortality Investigation Reports No.10.
CMI, 1999, Continuous Mortality Investigation Reports No. 17.
CMI, 2002, “An Interim Basis for Adjusting the ‘92’ Series Mortality Projections for
Cohort Effects,” Continuous Mortality Investigation Working paper 1.
CMI, 2004, “Projecting future mortality: a Discussion paper,” Continuous Mortality
Investigation Working paper 3.
CMI, 2005, “Projecting future mortality: Towards a proposal for a stochastic method-
ology,” Continuous Mortality Investigation Working paper 15.
Collatz, K.-G., 1986, "Towards a Comparative Biology of Aging," in K.-G. Collatz and R. S. Sohal (eds), Insect Aging, pp. 1–8, Springer-Verlag, Berlin.
Comfort, A., 1964, Ageing: The Biology of Senescence, Routledge and Kegan Paul,
London.
Cox, J., J. Ingersoll, and S. Ross, 1985, "A theory of the term structure of interest rates," Econometrica, 53, 385–408.
Cox, S. H., J. Fairchild, and H. Pedersen, 2000, “Economic Aspects of Securitization
of Risk,” Astin Bulletin, 30, 157–193.
Cramer, H., and H. Wold, 1935, “Mortality variations in Sweden: a study in gradua-
tion and forecasting,” Skandinavisk Aktuarietidskrift, 18, 161–241.
Crimmins, E., M. Hayward, and Y. Saito, 1994, "Changing Mortality and Morbid-
ity Rates and the Health Status and Life Expectancy of the Older Population,”
Demography, 31, 159–175.
Dahl, M., 2004, “Stochastic mortality in life insurance: market reserves and mortality-
linked insurance contracts,” Insurance: Mathematics and Economics, 35, 113–136.
Dahl, M., and T. Møller, 2006, “Valuation and hedging of life insurance liabilities with
systematic mortality risk,” Insurance: Mathematics and Economics, 39, 193–217.
Davidson, A. R., and A. R. Reid, 1927, “On the calculation of rates of mortality,”
Transactions of the Faculty of Actuaries, 11, 183–232.
Duffie, D., 2001, Dynamic Asset Pricing Theory, Third Edition, Princeton University
Press.
Elliott, R. J., W. C. Hunter, and B. M. Jamieson, 2001, “Financial signal processing:
A self calibrating model,” International Journal of Theoretical and Applied Finance,
4, 567–584.
Elliott, R. J., and R. S. Mamon, 2002, “A complete yield curve description of a Markov
interest rate model,” International Journal of Theoretical and Applied Finance, 6,
317–326.
Featherman, D. L., 1986, “Marker of Aging,” Research on Aging, 8, 339–365.
Finch, C. E., 1990, Longevity, Senescence, and the Genome, The University of
Chicago Press, Chicago and London.
Flesaker, B., and L. Hughston, 1996, “Positive interest,” Risk, 9, 46–49.
Follmer, H., and D. Sondermann, 1986, "Hedging of Non-Redundant Contingent Claims," in W. Hildenbrand and A. Mas-Colell (eds), Contributions to Mathematical Economics, pp. 205–223, North-Holland.
Forfar, D., and D. Smith, 1988, “The changing shape of English Life Tables,” Trans-
actions of the Faculty of Actuaries, 40, 98–134.
Fries, J., 1980, “Aging, Natural Death, and the Compression of Mortality,” The New
England Journal of Medicine, 303, 130–135.
GAD, 2001, National population projections: Review of methodology for projecting mortality, National Statistics Quality Review Series, Report No. 8, Government Actuary's Department.
Guihenniuc-Jouyaux, C., S. Richardson, and I. Longini, 2000, “Modeling Markers of
Disease Progression by a Hidden Markov Process: Application to Characterizing
CD4 Cell Decline,” Biometrics, 56, 733–741.
Gutterman, S., and I. Vanderhoof, 1998, “Forecasting Changes in Mortality: a Search
for a Law of Causes and Effects,” North American Actuarial Journal, 2.
Harrison, J., and S. Pliska, 1981, “Martingales and stochastic integrals in the theory
of continuous trading,” Stochastic processes and Applications, 11, 215–260.
Harrison, P. G., 1990, “Laplace transform inversion and passage-time distributions in
Markov processes,” Journal of Applied Probability, 27, 74–87.
Hayflick, L., 2002, “Longevity determination and aging,” Living to 100 and Beyond:
Survival at Advanced Ages Symposium, Schaumburg, Ill.: Society of Actuaries.
Heath, D., R. Jarrow, and A. Morton, 1992, "Bond pricing and the term structure of
interest rates: A new methodology for contingent claims valuation,” Econometrica,
60, 77–105.
Heligman, L., and J. Pollard, 1980, “The age pattern of mortality,” Journal of Insti-
tute of Actuaries, 107, 49–75.
Higgins, T., 2003, “Mathematical Models of Mortality,” Paper presented at the Work-
shop on Mortality Modelling and Forecasting Australian National University.
Hull, J., and A. White, 1990, “Pricing Interest rate derivative securities,” Review of
Financial Studies, 3, 573–592.
Hurd, T., and A. Kuznetsov, 2006, “Affine Markov Chain model of multifirm credit
migration,” Working paper.
Jamshidian, F., 1997, “LIBOR and swap market models and measures,” Finance and
Stochastics, 1, 293–330.
Jarrow, R. A., D. Lando, and S. M. Turnbull, 1997, “A Markov model for the term
structure of credit risk spreads,” The Review of Financial Studies, 10, 481–523.
Jones, H., 1956, "A special consideration of the aging process, disease and life expectancy," in J. H. Lawrence and C. A. Tobias (eds), Advances in Biological and Medical Physics, vol. 4, pp. 281–337, Academic Press Inc., New York.
Jones, H., 1959, "The relation of human health to age, place and time," in J. E. Birren (ed.), Handbook of Aging and the Individual, University of Chicago Press, Chicago.
Kannisto, V., 1994, Development of Oldest-Old Mortality, 1950-1990: Evidence from
28 Developed Countries, Odense University Press, Odense, Denmark.
Kay, R., 1986, “A Markov Model for analysing cancer markers and disease states in
survival studies,” Biometrics, 42, 855–865.
Keilson, J., 1979, Markov Chain Models — Rarity and Exponentiality, Springer-
Verlag, New York.
Keyfitz, N., 1981, “The Limits of Population Forecasting,” Population and Develop-
ment Review, 7, 579–593.
Lee, R. D., and L. R. Carter, 1992, “Modeling and forecasting U.S. mortality,” Journal
of the American Statistical Association, 87, 659–675.
Lee, R. D., and T. Miller, 2001, “Evaluating the performance of the Lee-Carter
Method for Forecasting mortality,” Demography, 38.
Li, S.-H., M. R. Hardy, and K. S. Tan, 2006, “Uncertainty in Mortality forecasting:
an extension to the classical Lee-Carter approach,” In press, 5, 1–20.
Lin, X., and K. Tan, 2003, “Valuation of equity-indexed annuities under stochastic
interest rate,” North American Actuarial Journal, 7, 72–91.
Lin, X. S., and X. Liu, 2007, “Markov aging process and phase-type law of mortality,”
North American Actuarial Journal, 11, 92–109.
Lin, Y., and S. H. Cox, 2006, “A mortality securitization model,” Journal of Risk
and Insurance, 76, 22–52.
Longini, I., W. Clark, R. Byers, J. Ward, W. Darrow, G. Lemp, and H. Hethcote,
1989, “Statistical Analysis of the Stages of HIV-Infections Using a Markov Model,”
Statistics in Medicine, 8, 831–843.
Longini, I. M., W. S. Clark, L. Gardner, and J. F. Brundage, 1991, “The dynamics
of CD4+ T-lymphocyte decline in HIV-infected individuals: a Markov modelling
approach,” Journal of Acquired Immune Deficiency Syndromes, 4, 1141–1147.
Madan, D. B., P. Carr, and E. C. Chang, 1998, “The Variance Gamma process and
option pricing,” European Finance Review, 2, 79–105.
Madan, D. B., and F. Milne, 1991, “Option pricing with V.G. martingale compo-
nents,” Mathematical Finance, 1, 39–55.
Maghsoodi, Y., 1996, “Solution of the extended CIR term structure and bond option
valuation,” Mathematical Finance, 6, 89–109.
McNown, R., and A. Rogers, 1989, “Forecasting mortality: a parameterized time
series approach,” Demography, 26.
Milevsky, M. A., and S. D. Promislow, 2001, “Mortality derivatives and the option
to annuitise,” Insurance: Mathematics and Economics, 29, 299–318.
Miltersen, K. R., and S.-A. Persson, 2005, “Is Mortality dead? Stochastic Forward
Force of Mortality Rate Determined by No Arbitrage,” Working Paper.
Møller, T., 1998, “Risk-minimizing hedging strategies for unit-linked life insurance
contracts,” Astin Bulletin, 28, 17–47.
Møller, T., 2001a, “Hedging Equity-linked life insurance contracts,” North American
Actuarial Journal, 5, 79–95.
Møller, T., 2001b, “On transformations of actuarial valuation principles,” Insurance:
Mathematics and Economics, 28, 281–303.
Nelder, J., and R. Mead, 1965, “A Simplex Method for Function Minimization,”
Computer Journal, 7, 308–313.
Neuts, M. F., 1981, Matrix-Geometric Solutions in Stochastic Models, Johns Hopkins
University Press, Baltimore, London.
Norberg, R., 2003, “The Markov Chain Market,” Astin Bulletin, 33, 265–287.
O’Cinneide, C. A., 1989, “On Non-uniqueness of representations of phase-type distri-
butions,” Commun. Statist.-Stochastic Models, 5, 247–259.
O’Cinneide, C. A., 1990, “Characterization of phase-type distributions,” Commun.
Statist.-Stochastic Models, 6, 1–57.
O’Cinneide, C. A., 1999, “Phase-type distributions: open problems and a few prop-
erties,” Commun. Statist.-Stochastic Models, 15, 731–757.
Olivieri, A., 2001, “Uncertainty in mortality projections: an actuarial perspective,”
Insurance: Mathematics and Economics, 29, 231–245.
Pitacco, E., 2003, “Survival Models in Actuarial Mathematics: From Halley to
Longevity Risk,” Invited talk at 7th International Congress Insurance: Mathe-
matics & Economics, ISFA, Lyon.
Renshaw, A., and S. Haberman, 2000, “Modelling for Mortality Reduction Factors,”
Actuarial Research Paper No. 127, Department of Actuarial Science and Statistics,
City University, London.
Renshaw, A., and S. Haberman, 2003, “Lee-Carter mortality forecasting with age-
specific enhancement,” Insurance: Mathematics and Economics, 33, 255–272.
Rogers, L., 1997, “The potential approach to the term-structure of interest rates and
foreign exchange rates,” Mathematical Finance, 7, 157–164.
Rutkowski, M., 1997, “A note on the Flesaker & Hughston model of the term structure
of interest rates,” Applied Mathematical Finance, 4, 151–163.
Sacher, G., and E. Trucco, 1962, “The stochastic theory of mortality,” Annals of New
York Academy of Sciences, 96, 985–1007.
Sehl, M. E., and R. E. Yates, 2001, “Kinetics of Human aging: I. Rates of Senescence
between ages 30 and 70 years in healthy people,” Journal of gerontology, 56B,
198–208.
Shock, N., 1974, “Physiological theories of aging,” Theoretical aspects of aging, M.
Rockstein Ed. Academic Press, New York:119-136.
SSA, 1992, Life Tables for the United States Social Security Area 1900-2080, Actuarial Study No. 107, August 1992.
Strehler, B. L., 1999, Time, Cells, and Aging, Demetriades Brothers, Larnaca.
Tenenbein, A., and I. Vanderhoof, 1980, “New Mathematical Laws of Select and
Ultimate Mortality,” Transactions of the Society of Actuaries, 32, 119–158.
The National Vital Statistics Report, U.S., 1999, Vol. 47, No. 19, June 30, http://www.cdc.gov/nchs/products/pubs/pubd/nvsr/nvsr.htm.
Tuljapurkar, S., and C. Boe, 1998, “Mortality Changes and Forecasting: How Much
and How Little do we Know?,” North American Actuarial Journal, 2, 13–47.
Vasicek, O., 1977, “An equilibrium characterisation of the term structure,” Journal
of Financial Economics, 5, 177–188.
Wetterstrand, W. H., 1981, “Parametric models for life insurance mortality data:
Gompertz’s law over time,” Transactions of the Society of Actuaries, 33, 159–175.
Willets, R., 1999, “Mortality in the next millennium,” Presented at the Staple Inn
Actuarial Society on 7 December.
Wilmoth, J. R., 1993, Computational methods for fitting and extrapolating the Lee-Carter model of mortality change, Technical report, Department of Demography, University of California, Berkeley.
Yashin, A. I., K. G. Manton, M. Woodbury, and E. Stallard, 1995, “The Effects of
Health Histories on Stochastic Process Model of Aging and Mortality,” Journal of
Mathematical Biology, 34, 1–16.
Zuev, S. M., A. I. Yashin, K. G. Manton, and E. Dowd, 2000, “Vitality Index in Sur-
vival Modeling: how physiological Aging Influences Mortality,” Journal of geron-
tology, 55A, 10–19.