STOCHASTIC MORTALITY MODELLING
By
Xiaoming Liu
A thesis submitted in conformity with the requirements
for the degree of Doctor of Philosophy
Graduate Department of Statistics
University of Toronto
© Copyright by Xiaoming Liu 2008
Stochastic Mortality Modelling
Xiaoming Liu
Department of Statistics, University of Toronto
Ph.D. Thesis, 2008
Abstract
For life insurance and annuity products whose payoffs depend on future mortality rates,
there is a risk that realized mortality rates will differ from the anticipated rates
assumed in their pricing and reserving calculations. This is termed mortality risk.
Since mortality risk is difficult to diversify and has significant financial impacts on
insurance policies and pension plans, it is now well accepted that stochastic
approaches should be adopted to model mortality risk and to evaluate mortality-linked
securities.
The objective of this thesis is to propose the use of a time-changed Markov process
to describe stochastic mortality dynamics for pricing and risk management purposes.
Analytical and empirical properties of these dynamics have been investigated using a
matrix-analytic methodology. Applications of the proposed model in the evaluation of
fair values for mortality-linked securities have also been explored.
To be more specific, we consider a finite-state Markov process with one absorbing
state. This Markov process is related to an underlying aging mechanism, and the survival
time is viewed as the time until absorption. The resulting distribution for the survival time
is a so-called phase-type distribution. This approach is different from traditional curve-fitting
mortality models in the sense that the survival probabilities are now linked with an
underlying Markov aging process. Markov process and phase-type distribution theories
therefore provide us with a flexible and tractable framework to model the mortality
dynamics. The time-changed Markov process further allows us to incorporate the uncertainties
embedded in the future mortality evolution.
The proposed model has been applied to price the EIB/BNP Longevity Bonds and
other mortality derivatives under the assumption that interest rates and mortality
rates are independent. A calibration method for the model is suggested so that it can utilize both the market
price information involving the relevant mortality risk and the latest mortality projection.
The proposed model has also been fitted to various types of population mortality data for
empirical study. The fitting results show that our model can capture the stylized mortality
patterns very well.
Acknowledgements
It is certainly a long road to become a doctor. When I look back, I feel grateful
for the help and support I have received from many people during my doctoral study. I
would like to take this opportunity to thank all of them.
In particular, I wish to thank Professor X. Sheldon Lin. As my supervisor, Professor Lin's
guidance was not limited to how to write a thesis; he also shared his views on how to
build a research career. I have learned a lot from
his wisdom. To Professor Samuel Broverman, I wish to thank him for his thoughtful
comments and for being devoted and patient; in particular, he spent a huge amount of
time proofreading my thesis. I really appreciate his help. To Professor Sebastian
Jaimungal, I wish to thank him for giving me my first exposure to the Variance Gamma
process and to applications of Markov models in credit risk, not to mention all those
interesting discussions with him. I also wish to thank the external examiner, Professor
Jun Cai of the University of Waterloo, for his insightful comments.
I would like to thank the University of Toronto and the Department of Statistics
in particular for their support throughout my graduate studies. Special appreciation
goes to Professors Nancy Reid, Jeffrey S. Rosenthal, Radford Neal and Radu V. Craiu.
Thanks also go to Andrea Carter, Dermot Whelan, Laura Kerr, and Ram Mohabir,
who are always willing to help.
To all the other friends: Xiaobin, Hanna, Mylene, Zheng, Hadus, Mohammed,
Zi, Shuying, Tao, Longhai, Meng, Shelly, Mark, Ana-maria, Angelo, and Samuel for
being around and helpful; Patrice for his advice and suggestions; Sig, for being a very
special friend I was lucky to make. All these people made my time at U of T so memorable.
I also wish to thank my dear daughter, Angela, for the inspiration she has brought
me with her enthusiasm for the Harry Potter book series and J.K. Rowling.
Last, but not least, my very special thanks go to my husband, Owen, for bringing
me the idea of treating the research like a project. This helped me survive the most
difficult time in writing the thesis. Without you, this could never have been done!
London, Ontario Xiaoming Liu
Dec. 16, 2007
Table of Contents
Abstract ii
Acknowledgements iv
Table of Contents vi
Introduction 1
1 Mortality Risk — From Deterministic to Stochastic Approach 5
1.1 Mortality Modelling . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.1.1 Survival Distributions . . . . . . . . . . . . . . . . . . . . . . 5
1.1.2 Empirical Studies on Mortality Trend . . . . . . . . . . . . . . 11
1.1.3 Mathematical Mortality Models . . . . . . . . . . . . . . . . . 19
1.2 Mortality Projection Methods Review . . . . . . . . . . . . . . . . . . 22
1.2.1 Deterministic Projection Models . . . . . . . . . . . . . . . . . 24
1.2.2 Stochastic Projection Models . . . . . . . . . . . . . . . . . . 30
1.2.3 Assessing Mortality Projection Models . . . . . . . . . . . . . 37
1.3 Mortality Risk and Stochastic Approaches . . . . . . . . . . . . . . . 41
1.3.1 Mortality Risk — Definition and Properties . . . . . . . . . . 41
1.3.2 Financial Implication of Mortality Risk . . . . . . . . . . . . . 47
1.3.3 Stochastic Approach of Dealing with Mortality Risk . . . . . . 51
2 Arbitrage-free Pricing Framework for Mortality Contingent Claims 57
2.1 Arbitrage free pricing theory . . . . . . . . . . . . . . . . . . . . . . . 57
2.1.1 Basic Ideas of Arbitrage Free Pricing . . . . . . . . . . . . . . 57
2.1.2 The Term Structure of Interest Rate . . . . . . . . . . . . . . 63
2.2 The Term Structure of Mortality Under Arbitrage Free Framework . 67
2.2.1 Basic Building Blocks . . . . . . . . . . . . . . . . . . . . . . . 67
2.2.2 The Generalized Financial/Insurance Market . . . . . . . . . . 72
2.3 Review of stochastic mortality models under the no-arbitrage framework 75
2.3.1 Criteria for Term Structure of Mortality Models . . . . . . . . 75
2.3.2 A Brief Review of Existing Stochastic Mortality Models . . . . 78
2.3.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3 The Time-Changed Markovian Mortality Model 89
3.1 Dynamic Approach of Mortality Modelling . . . . . . . . . . . . . . . 89
3.1.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
3.1.2 Phase-type distributions . . . . . . . . . . . . . . . . . . . . . 90
3.1.3 Phase-type distributions as mortality models . . . . . . . . . . 98
3.2 Time-changed Markovian Survival Model . . . . . . . . . . . . . . . . 100
3.2.1 Time-changed Markovian Process . . . . . . . . . . . . . . . . 101
3.2.2 The Gamma process . . . . . . . . . . . . . . . . . . . . . . . 102
3.2.3 Survival functions for time-changed model . . . . . . . . . . . 104
3.2.4 Illustrations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
3.3 Pricing Longevity Bonds . . . . . . . . . . . . . . . . . . . . . . . . . 116
3.3.1 The EIB/BNP Longevity Bonds . . . . . . . . . . . . . . . . . 116
3.3.2 How was the EIB/BNP LB priced? . . . . . . . . . . . . . . . 118
3.3.3 Proposed method for pricing the EIB/BNP Longevity Bonds . 120
3.3.4 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . 123
3.3.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
3.4 Miscellaneous . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
3.4.1 GAOs and Longevity Risk Crisis . . . . . . . . . . . . . . . . 131
3.4.2 Pricing Guaranteed Annuity Options – Preliminary Study . . 133
3.4.3 A Snapshot of Mortality Derivative Market . . . . . . . . . . . 137
4 Deterministic Fitting 148
4.1 Aging Process and Physiological Age . . . . . . . . . . . . . . . . . . 149
4.2 The Proposed Mortality Model . . . . . . . . . . . . . . . . . . . . . 152
4.3 Fitting Swedish Cohort Data . . . . . . . . . . . . . . . . . . . . . . . 157
4.4 Fitting U.S. Social Security Administration Mortality Data . . . . . . 162
4.5 Analysis of Goodness-of-Fit . . . . . . . . . . . . . . . . . . . . . . . 166
4.6 Qualitative Analysis of the Model . . . . . . . . . . . . . . . . . . . . 168
4.7 Conclusion and Discussion . . . . . . . . . . . . . . . . . . . . . . . . 174
A Matrix Algebra 177
A.1 Matrix Exponential . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
A.2 The Kronecker product ⊗ and the Kronecker sum ⊕ . . . . . . . . . 178
Bibliography 182
Introduction
Actuaries have been using extrapolative methods to project mortality rates for cen-
turies, with the implication that the past represents the future. Traditional actuarial
approaches for the pricing and risk management of life insurance and annuity prod-
ucts treat mortality rates deterministically. In other words, a deterministic projected
mortality schedule is chosen in advance, and then used in the calculation of premi-
ums and risk reserves, with the belief that the differences between the projected rates
and realized rates, the so-called mortality risk, can be diversified among individuals
and/or over time.
Those are indeed very strong assumptions. Over the last century, evidence has
emerged revealing that mortality risk is neither predictable nor diversifiable. In fact,
it has been shown that the mortality projections of the last fifty years have systematically
underestimated the overall mortality improvement. The consequent adverse
financial impacts caused by mis-assessing mortality risk have been blamed as one of
the main reasons for the insolvency of the Equitable Life Assurance Society of the UK,
the world's oldest life office. For empirical studies on mortality trend changes
and mortality projections, see Willets (1999), CMIB reports (2002; 2004; 2005), GAD
National Statistics Quality Review Report No. 8 (2001), and the papers presented
at the 2005 Society of Actuaries “Living to 100 and Beyond Symposium”. Also see
Pitacco (2003), Olivieri (2001), Boyle and Hardy (2003) and Ballotta and Haberman
(2003) for the financial impacts of the mortality risk on life insurance and annuities.
As a result, the traditional deterministic actuarial approach is now seen to be
inadequate for the calculation of fair values and reserves. Great efforts have been
made in the past few years to explore the use of stochastic approaches to model the
mortality dynamics and to evaluate the mortality-linked securities, many of which
are similar to approaches in finance. The frontier work in this aspect can be found in
Milevsky and Promislow (2001), Dahl (2004), Dahl and Møller (2006), Biffis (2005),
Biffis and Millossovich (2006), Ballotta and Haberman (2006), and Cairns, Blake, and
Dowd (2006a,b).
The aforementioned researchers make use of the similarities between mortality
risk and interest rate risk. They suggest modifying the models arising in the interest
rate sector to obtain mortality rate models. However, while mathematically similar
at a certain conceptual level, mortality rates behave very differently from interest
rates. For example, the term structure of mortality rates should only be increasing,
to reflect the biological reasonableness of the age-specific pattern of mortality, whilst
interest rates can reverse in some situations. And while mean reversion is a
desirable property for interest rates, it is doubtful that mean reversion is realistic for
mortality dynamics (see Cairns, Blake, and Dowd, 2006a).
In this thesis, we propose an alternative approach to modelling stochastic mortality.
To be more specific, we start with a finite-state Markov process with one
absorbing state. This Markov process is related to an underlying aging mechanism,
and the survival time is viewed as the time until absorption. The resulting distribution
for the survival time is a so-called phase-type distribution. This approach is
very different from traditional curve-fitting mortality models in the sense that
the survival probabilities are now linked with an underlying Markov aging process.
Markov process and phase-type distribution theories then provide us with a flexible
and tractable framework to model the mortality dynamics.
To introduce an uncertainty measure into the mortality dynamics, we further
consider a time-changed Markov process. The time change is described by a
Gamma process, subordinated to the underlying aging process to generate stochastic
mortality curves. The time-changing idea was first used by Madan and
his collaborators in modelling the dynamics of the logarithm of the stock price (see
Madan and Milne, 1991; Madan, Carr, and Chang, 1998). However, the time-change
technique, associated with an underlying Markov process, has not been considered
in mortality modelling. In this project, we make use of properties of phase-type
distributions, which have been extensively explored in queueing theory and risk theory
(see Neuts, 1981; Asmussen, 1987). Here, the computational advantages of the
matrix-analytic method for phase-type distributions are extended to the time-changed
stochastic models and presented in a different form. Most interestingly, the expectation
and variance of the resulting survival curves can be explicitly expressed in terms of
the original process characteristics. Unlike the mortality models adapted from their
interest rate counterparts, which must rely on simulation methods to obtain numerical
results, our model is mathematically tractable yet still retains a biologically reasonable
mortality pattern.
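To make the phase-type building block concrete, the following sketch evaluates a phase-type survival function S(t) = α exp(Tt)1, where T is the sub-intensity matrix over the transient (aging) states and α is the initial distribution. The three-state matrix and all parameter values are hypothetical, chosen only to illustrate the matrix-analytic computation; they are not the calibrated model developed later in the thesis.

```python
# A minimal sketch of a phase-type survival function, assuming a hypothetical
# 3-state aging chain with one additional absorbing (death) state.
import numpy as np
from scipy.linalg import expm

# Sub-intensity matrix T over the transient states: off-diagonal entries are
# transition rates between aging phases; each row sum plus that state's exit
# (death) rate equals zero.  The values below are illustrative only.
T = np.array([[-0.10,  0.09,  0.00],
              [ 0.00, -0.20,  0.18],
              [ 0.00,  0.00, -0.40]])
alpha = np.array([1.0, 0.0, 0.0])    # start in the first (youngest) phase

def survival(t):
    """S(t) = alpha * exp(T t) * 1, the probability of not yet being absorbed."""
    return float(alpha @ expm(T * t) @ np.ones(3))

for t in (10, 30, 50):
    print(f"S({t}) = {survival(t):.4f}")
```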
The proposed model has been illustrated by pricing the EIB/BNP Longevity Bonds
and other mortality derivatives under the assumption that interest rates and
mortality rates are independent. We propose a calibration method that can utilize both the market
price information involving the relevant mortality risk and the available mortality
projection that best incorporates current knowledge regarding mortality trends.
Calibrating a risk-neutral model using market information is common practice in
financial modelling. However, similar studies for mortality risk have not been carried
out seriously. The scarcity of research in this area is partly due to the complexity
of mortality models and partly due to the relative dearth of market information. We
hope that the calibrated approach based on our time-changed Markov model can provide a
tractable framework for no-arbitrage, market-consistent pricing and hedging problems
for the mortality risk concerned.
The proposed model has been fitted to various types of population mortality data
for empirical study. The fitting results show that our model can capture the stylized
mortality patterns very well.
The thesis is organized as follows. In chapter 1, we present empirical facts with
respect to the historical mortality data. We also examine the various projection
methods for mortality. These help us understand the features of mortality risk and
the reasons why it has taken so long for the actuarial profession and industry to pay attention
to the stochastic and systematic aspects of mortality risk. We then set up the
general pricing framework and basic building blocks of stochastic mortality models
in chapter 2. Our proposed time-changed Markov model is presented in chapter 3.
This chapter also includes the discussion of the properties of the model, illustrations,
examples and applications. Finally, we give a detailed empirical study in chapter 4,
to show that a phase-type distribution can fit the whole mortality schedule fairly well.
A hypothetical underlying mechanism for the model is also suggested.
Chapter 1
Mortality Risk — From Deterministic to Stochastic Approach
1.1 Mortality Modelling
Life insurance and annuities are products designed to manage financial uncertainty
related to how long an individual will survive. Hence, the lifetime random variable
X and its associated mortality model are the basic building blocks in actuarial math-
ematics. In this section, we first introduce basic concepts and actuarial notation
related to mortality modelling. We then present empirical results on human mor-
tality trends, leading to a discussion of current problems in mortality modelling and
projecting approaches.
1.1.1 Survival Distributions
We begin by considering a continuous age-at-death variable X. Specifically, X is a
nonnegative random variable representing the lifetime of an individual in a cohort or
population.
All distribution functions related to the random variable X, unless stated other-
wise, are defined over the interval [0,∞). Let f(x) denote the probability density
function (p.d.f.) of X and let the cumulative distribution function (c.d.f.) be
F(x) = P(X ≤ x) = ∫_0^x f(t) dt.    (1.1.1)
The probability of an individual surviving to age x is given by the survival function
(s.f.)
s(x) = P(X > x) = ∫_x^∞ f(t) dt.    (1.1.2)
A very important concept in mortality modelling is the force of mortality (often
referred to as the hazard function in other fields such as in reliability theory), which
is defined as:
µ(x) = lim_{∆x→0} P(x < X ≤ x + ∆x | X > x) / ∆x    (1.1.3)
     = f(x) / s(x).    (1.1.4)
The force of mortality specifies the instantaneous rate of death at age x, given that
the individual survives up to age x.
Any one of the functions f(x), F (x), s(x), or µ(x) can be used to specify the
distribution of X. It is easy to see that, given an expression for any one of the above
four functions, the other three can be derived. For example, in terms of the force of
mortality µ(x), we have:
s(x) = exp(−∫_0^x µ(t) dt),    (1.1.5)
F(x) = 1 − s(x) = 1 − exp(−∫_0^x µ(t) dt),    (1.1.6)
and
f(x) = µ(x) exp(−∫_0^x µ(t) dt).    (1.1.7)
We will often make use of the future lifetime random variable (or residual lifetime)
τx. τx is the time-until-death variable measured from the date that a contract has been
issued to an individual of age x. In the following, the symbol (x) is used to denote
a life-aged-x. The distribution function for τx can be derived from the distribution
function for X. In actuarial science, special symbols have been assigned to denote
the distribution function for τx as follows.
Actuarial Notation
For t ≥ 0, we define
tqx = P(τx ≤ t) = P(x < X ≤ x + t | X > x) = [s(x) − s(x + t)] / s(x),    (1.1.8)
tpx = P(τx > t) = P(X > x + t | X > x) = s(x + t) / s(x).    (1.1.9)
The symbol tqx can be interpreted as the probability that (x) will die within t years;
that is, tqx is the c.d.f. of τx. Similarly, tpx can be interpreted as the probability that
(x) will attain age x + t, that is, tpx is the s.f. of τx. Note that both tqx and tpx are
conditional probabilities in terms of random variable X; they are probabilities that
are conditional on the event that an individual survived to age x.
In terms of the force of mortality,
tpx = exp(−∫_x^{x+t} µ(s) ds) = exp(−∫_0^t µ(x + s) ds),    (1.1.10)
tqx = 1 − tpx = 1 − exp(−∫_x^{x+t} µ(s) ds) = 1 − exp(−∫_0^t µ(x + s) ds),    (1.1.11)
and the p.d.f. for τx is
fτ(t) = d(tqx)/dt = −d(tpx)/dt.    (1.1.12)
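As a concrete check of relations (1.1.5) and (1.1.10), the following sketch computes s(x) and tpx by numerically integrating a force of mortality. The Gompertz-type hazard and its parameter values are purely illustrative assumptions, not fitted quantities.

```python
# A minimal sketch, assuming an illustrative Gompertz-type force of mortality
# mu(x) = A * exp(theta * x); any nonnegative hazard function would work.
import numpy as np
from scipy.integrate import quad

A, theta = 1e-4, 0.09          # illustrative parameters only

def mu(x):
    return A * np.exp(theta * x)

def s(x):
    """s(x) = exp(-int_0^x mu(t) dt), formula (1.1.5)."""
    integral, _ = quad(mu, 0.0, x)
    return np.exp(-integral)

def tpx(t, x):
    """tpx = exp(-int_0^t mu(x + s) ds), formula (1.1.10)."""
    integral, _ = quad(lambda u: mu(x + u), 0.0, t)
    return np.exp(-integral)

print(s(65.0))          # probability a newborn survives to age 65
print(tpx(10.0, 65.0))  # probability a life aged 65 survives 10 more years
```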
For the special case of t = 1, the prefix in the symbols will be omitted and we have
qx = P [(x) will die within 1 year], (1.1.13)
Table 1.1: An Illustrative Life Table

Age x      qx         lx      dx      Lx        Tx       ex
 ...       ...        ...     ...     ...       ...      ...
  60     0.011894   80472     957   79993   1627170    20.22
  61     0.012912   79515    1027   79002   1547176    19.46
  62     0.014167   78488    1112   77932   1468175    18.71
 ...       ...        ...     ...     ...       ...      ...
px = P [(x) will attain age x + 1]. (1.1.14)
qx plays an important role in mortality analysis as we will show in the following
sections, and is often referred to as the mortality rate or death rate at age x.
Life Table Model
A life table model is an alternative way to specify the distribution for age-at-death
random variable X. A standard life table usually contains tabulations of the basic
functions qx, lx, dx, and, possibly, additional derived functions for integer ages. For
illustration, see Table 1.1, which is extracted from U.S. Social Security Area Life
Tables (1992).
To construct a life table, we start with a given group of l0 newborns (l0 = 100, 000,
for instance). Suppose that each newborn’s age-at-death random variable follows a
distribution specified by s.f. s(x), and all newborns are mutually independent. Let
lx denote the expected number of survivors to age x from the l0 newborns. Then we
have
lx = l0 s(x). (1.1.15)
It is easy to see that lx is proportional to s(x) so lx can be viewed as the discrete
version of survival function s(x). Similarly, the expected number of deaths over each
age interval (x, x + 1] can be expressed as
dx = lx − lx+1    (1.1.16)
   = l0 [s(x) − s(x + 1)]    (1.1.17)
   ≈ l0 f(x).    (1.1.18)
Thus, the curve of deaths dx approximates the probability density function f(x).
Rewriting expression (1.1.16), we obtain
lx+1 = lx − dx    (1.1.19)
     = l0 s(x) − l0 [s(x) − s(x + 1)]    (1.1.20)
     = l0 s(x) [1 − (s(x) − s(x + 1))/s(x)]    (1.1.21)
     = lx (1 − qx).    (1.1.22)
Formulas (1.1.16) and (1.1.22) show that the table values lx and dx can be recursively
obtained for all ages, given the initial group size l0 and the mortality rates qx. In
practice, we usually don’t know the underlying survival function of a population. The
construction of a life table requires the estimation of mortality rates qx at all ages
from data. The procedure is as follows. First, define the central death rate over the
interval from x to x + 1, denoted by mx, as
mx = [∫_0^1 lx+t µ(x + t) dt] / [∫_0^1 lx+t dt] = (lx − lx+1) / Lx,    (1.1.23)
where Lx = ∫_0^1 lx+t dt is interpreted as the total expected number of years lived
between ages x and x+1 by survivors from the initial l0 newborns. Lx usually can be
estimated directly from population data, and mx as well. Then, assuming a uniform
distribution of deaths for ages greater than 1, qx can be obtained from mx by
qx = mx / (1 + 0.5 mx),   for x ≥ 1.    (1.1.24)
The mortality rate at age less than one has to be dealt with differently, and details
are omitted here.
The symbol Tx in a life table denotes the total number of years lived beyond age
x by the survivorship group with l0 initial members. We have
Tx = ∫_0^∞ lx+t dt = Lx + Lx+1 + Lx+2 + · · · .    (1.1.25)
Let ex denote the life expectancy of (x), i.e. the average number of years of future
lifetime lived by (x). ex can be calculated as
ex = Tx / lx.    (1.1.26)
Life expectancy is an important indicator for the mortality level of a population. It
has been widely used to measure overall mortality changes in a region or to compare
mortality differences between cohorts.
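To make the recursion (1.1.22) and the definitions of Lx, Tx and ex concrete, the sketch below builds a small life table from a vector of mortality rates qx, using the approximation Lx ≈ lx − 0.5 dx that follows from a uniform distribution of deaths within each year of age. The qx schedule is synthetic and serves only to illustrate the computation.

```python
# A minimal sketch of life-table construction from mortality rates q_x,
# using the recursion l_{x+1} = l_x (1 - q_x) and L_x ~ l_x - 0.5 d_x.
import numpy as np

l0 = 100_000
ages = np.arange(0, 111)
# Synthetic, Gompertz-like mortality rates (illustrative, not real data).
qx = np.minimum(0.0005 * np.exp(0.08 * ages), 0.999)
qx[-1] = 1.0                                # close the table at the highest age

lx = np.empty(len(ages))
lx[0] = l0
for x in ages[:-1]:
    lx[x + 1] = lx[x] * (1.0 - qx[x])       # formula (1.1.22)

dx = lx * qx                                 # expected deaths in (x, x+1]
Lx = lx - 0.5 * dx                           # person-years lived in (x, x+1]
Tx = Lx[::-1].cumsum()[::-1]                 # T_x = L_x + L_{x+1} + ..., formula (1.1.25)
ex = Tx / lx                                 # life expectancy, formula (1.1.26)

print(f"e_0 = {ex[0]:.2f},  e_65 = {ex[65]:.2f}")
```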
Finally, we would like to remark that life table models only provide partial infor-
mation on a survival distribution. To extract full distributional information from a
life table, one has to consider fractional age assumptions for death rates over each age
interval [x, x + 1). Nonetheless, due to data availability on human mortality, the life
table model is one of the most popular methods of expressing an age-specific mortality
pattern, and lx, dx and qx are the discrete counterparts of the continuous functions
s(x), f(x) and µ(x). In traditional life insurance, construction and projection of
life tables have been the central topics of mortality analysis. We will present some
historical results on age-specific patterns of human mortality (presented through life
tables) in the following section.
1.1.2 Empirical Studies on Mortality Trend
Over past centuries, human mortality has improved dramatically at all ages and
has shown many common features in its moving trend for different populations. In
this section, we will present some aspects of the past mortality experience referring
to Swedish male mortality data. All data used in this project are obtained from
www.mortality.org, unless otherwise specified.
Figure 1.1 illustrates the curves of deaths dx estimated at different times. As we
can see, 1) the mode of the curve of deaths moves towards older ages, and 2) the
dispersion of deaths around the mode reduces, giving rise to the so-called "expansion"
and "concentration" properties. These two properties of the curves of deaths correspond
to the survival function moving towards a rectangular shape (see Figure 1.2),
from which the term "rectangularization" comes. In Figure 1.3, logarithms of age-specific
mortality rates are presented, while in Figure 1.4 mortality rates qx above age 60 are
plotted against age x.
The overall mortality experience can also be depicted by other quantities such
as the life expectancy (at birth or at higher ages) and the mortality rate ratios. In
Figure 1.5, the behavior of the life expectancy at birth is compared with the life
expectancy at age 65. In Figure 1.6, the mortality ratios at ages 40, 60, 80, i.e. the
mortality rate qx(y) in various calendar years y divided by the mortality rate qx(1901)
in the year 1901, for x = 40, 60, 80, are presented.
Results are self-evident. In particular the following aspects can be pointed out: an
overall increase in the most probable age of death (i.e. the mode in deaths curve), an
overall decrease in mortality rates at all ages, an overall increase in the life expectancy
(at birth as well as at old ages). However, we also should notice that mortality changes
Figure 1.1: Curves of deaths dx for Swedish male population (years 1901, 1921, 1941, 1961, 1981, 2001)

Figure 1.2: Survival functions lx for Swedish male population (years 1901, 1921, 1941, 1961, 1981, 2001)
Figure 1.3: Log mortality rates ln(qx) for Swedish male population (years 1901, 1921, 1941, 1961, 1981, 2001)

Figure 1.4: Death rates qx, 60 ≤ x ≤ 110, for Swedish male population (years 1901, 1921, 1941, 1961, 1981, 2001)
Figure 1.5: Life expectancy at birth (e0) and at age 65 (e65) for Swedish male population, 1901–2001

Figure 1.6: Mortality ratio qx(y)/qx(1901) for Swedish male population, 1901–2001, for x = 40, 60, 80
Figure 1.7: Curves of deaths dx for Swedish male cohorts (birth years 1811, 1831, 1851, 1871, 1891, 1911)
are not happening evenly. Specifically, the rate of improvement in mortality rates has
varied significantly over time, and the improvement has varied substantially between
different age groups, as shown in Figure 1.5 and Figure 1.6. Furthermore, combining
these with Figure 1.3 and Figure 1.4, we can see that the mortality decreases in relative
terms are bigger at young adult ages than at very old ages, while the mortality
decrements in absolute values are bigger at very old ages than at younger ages in the
last century.
The above observations are made based on cross-sectional or period life tables.
A period life table is based on, or represents, the mortality experience of an entire
population during a relatively short period of time, usually one to three years. Life
tables based directly on population data are generally constructed as period life tables
because death and population data are most readily available on a time period basis.
Figure 1.8: Survival functions lx for Swedish male cohorts (birth years 1811, 1831, 1851, 1871, 1891, 1911)

Figure 1.9: Log mortality rates ln(qx) for Swedish male cohorts (birth years 1811, 1831, 1851, 1871, 1891, 1911)
Figure 1.10: Death rates qx, 60 ≤ x ≤ 100, for Swedish male cohorts (birth years 1811, 1831, 1851, 1871, 1891, 1911)

Figure 1.11: Life expectancy at birth (e0) and at age 65 (e65) for Swedish male cohorts, birth years 1811–1911
In contrast, a cohort life table is based on, or represents, mortality experience
over the entire lifetime of a cohort of persons born during a relatively short period
of time, usually one year. Cohort life tables based directly on population experience
data are relatively rare because of the need for data of consistent quality over a very
long period of time. Cohort tables can, however, be readily produced from a series of
period tables.
However, when it comes to actuarial calculations, cohort life tables are usually
more suitable than period tables. For example, when we wish to calculate the pre-
mium or reserves for some product issued to (x) in year y, the relevant life table
shall be the corresponding cohort life table for the cohort born in the year y − x.
For this reason, we reproduce the graphs we have demonstrated so far using cohort
life tables. See Figure 1.7 to Figure 1.11. Similar properties of the mortality trends
can be observed in those graphs as in the period life tables, though usually to
somewhat different degrees.
Before we end this section, we would like to draw readers' attention to the ways
in which random shocks appear differently in mortality profiles based on period or
cohort life tables. In the course of mortality improvement, catastrophic events like
the tsunami of December 2004, or epidemic diseases like the influenza pandemic of
1918, may cause a sudden rise in mortality over a short time period. If period life
tables are used, these kinds of shocks are reflected as the whole curve (of lx, dx, or qx) in
that particular year being distorted from its normal shape, as shown in Figure 1.12 for
the effect of the 1918 flu outbreak, based on the logarithm of the mortality rate (log(qx)).
It is not hard to imagine how the curves of lx and dx are affected accordingly.
For cohort life tables, this kind of shock will be reflected as the scattered jumps
Figure 1.12: Log mortality rates ln(qx) for Swedish males, years 1901–2001, with the year 1918 curve shown separately
on subsequent cohort curves (see again Figure 1.7 and Figure 1.9 for the detectable
jumps). The overall effect is that the time series of e0 based on cohort life
tables is much more stable than the corresponding time series based on period tables.
1.1.3 Mathematical Mortality Models
The development of a law of mortality, in which an analytic expression is used to
represent all or part of the age pattern of mortality (in terms of, say, µ(x) or qx), has
been of interest since the development of the first life table, which was compiled by
the renowned English astronomer Edmund Halley (1693).
Probably the first mathematical mortality model is the one proposed by Abraham
De Moivre in 1725, who suggested that the probability of survival from birth until
age x could be expressed as a linear function of age. In terms of the hazard rate, the
model can be written as:
µ(x) = 1 / (ω − x),   0 ≤ x < ω,
where ω is the highest attainable age.
However, the most successful and influential mortality law belongs to Benjamin
Gompertz. In 1825, Gompertz found that an exponentially increasing function of age can
approximately capture the behavior of human mortality rates over large portions of the life
table. He therefore proposed, in terms of the hazard rate, that:
µ(x) = A exp (θx), x > 0,
where A and θ are positive constants.
The Gompertz law has played a central role in the development of theoretical
hypotheses about the pattern of mortality. The close fit of the Gompertz function
to empirical data seems to suggest that a law of mortality may exist to explain the
age patterns of death for human populations. This has stimulated much research
to modify or generalize the Gompertz formula.
For example, in 1860, William Makeham noticed that the Gompertz equation
failed to capture the behavior of mortality at higher ages and added a constant term,
B, in order to correct for this deficiency. The constant can be thought of representing
the risk of death by causes that are independent of age. Hence Makeham’s model can
be expressed as
µ(x) = B + A exp (θx).
Another good extension from Gompertz’s model is Perks’s model (1932),
µ(x) = [B + A exp(θx)] / [1 + C exp(θx)],    (1.1.27)
which allows the curve to more closely approximate the slower rate of increase in
mortality at older ages.
In 1963, Beard also developed a model reflecting the effect of heterogeneity in
mortality risks on the shape of the increase in mortality rates:
µ(x) = A exp(θx) / [1 + C exp(θx)].    (1.1.28)
In 1980, Heligman and Pollard extended Gompertz’s law to an eight parameter
formula that can better fit the whole mortality curve,
qx / px = A^((x+B)^C) + D exp(−E (ln x − ln F)²) + G H^x,    (1.1.29)
where qx is the probability that a person at age x will die before attaining age x + 1,
px = 1 − qx, and A to H are parameters.
So far, we have introduced the models that are based on or proceed from Gom-
pertz’s idea. Those models constitute an important field in traditional mortality
study.
Weibull’s model is something developed along a different line. In 1951, Weibull
proposed a model which originally may be used as a failure model due to wear and
tear of a technical system in engineering. The analog is obvious: death occurs when
the (first) failure of human organs occurs. In terms of the hazard rate, Weibull’s
model is
µ(x) = Axθ.
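For reference, the sketch below collects the hazard functions of the laws above as plain Python functions; the parameter values in the example call are arbitrary placeholders rather than fitted values.

```python
# A minimal sketch of the classical mortality laws as hazard (or odds) functions.
# All parameter values supplied by the caller are illustrative placeholders.
import numpy as np

def de_moivre(x, omega):
    return 1.0 / (omega - x)                       # mu(x) = 1/(omega - x), 0 <= x < omega

def gompertz(x, A, theta):
    return A * np.exp(theta * x)                   # mu(x) = A e^{theta x}

def makeham(x, A, B, theta):
    return B + A * np.exp(theta * x)               # mu(x) = B + A e^{theta x}

def perks(x, A, B, C, theta):
    return (B + A * np.exp(theta * x)) / (1.0 + C * np.exp(theta * x))   # (1.1.27)

def beard(x, A, C, theta):
    return A * np.exp(theta * x) / (1.0 + C * np.exp(theta * x))         # (1.1.28)

def weibull(x, A, theta):
    return A * x ** theta                          # mu(x) = A x^theta

def heligman_pollard_odds(x, A, B, C, D, E, F, G, H):
    """The mortality odds q_x / p_x of formula (1.1.29)."""
    return A ** ((x + B) ** C) + D * np.exp(-E * (np.log(x) - np.log(F)) ** 2) + G * H ** x

x = np.arange(30, 91, 10)
print(gompertz(x, A=1e-4, theta=0.09))             # Gompertz hazard at selected ages
```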
A distinguishing feature of all the above models is that they describe mortality
in an age-continuous context. Moving from life tables to mathematical formulas is
a very important step. In this section, we only intend to list a few well-recognized
models from our perspective, and some of them will be applied to develop projection
methods in the next section. For readers who are interested in a good review of the
development of mathematical models, please refer to Higgins (2003), Pitacco (2003),
and the references therein.
1.2 Mortality Projection Methods Review
Since life annuities and other life benefits are products involving future mortality rates
of lives, mortality improvement has to be carefully considered when it comes to
actuarial calculations, such as pricing or reserving. Ignoring mortality improvement
seriously underestimates the value and related liabilities of those products.
Without doubt, the time series of various mortality profile curves have made
a strong impression on us: there exists a discernible downward trend with some
minor fluctuations in the evolution of age-specific mortality patterns over time. This
impression certainly has affected actuarial research and the actuarial profession. For a
long time, actuarial science has been taking a deterministic approach when mortality
trends are concerned, using projected life tables for actuarial calculations, for instance,
the CMIR tables, and SSA Life tables.
Here, the term “deterministic” means that the projected life tables are constructed
and used without accounting for any uncertainty. Conversely, a projection model with
consideration of uncertainty will be categorized as a stochastic model.
A projected mortality model aims at describing a future age-specific mortality
pattern, based on analyzing the past mortality trend. The basic idea of projecting
mortality is to first express some age-specific measure of mortality as a function of
both age x and calendar year y, denoted by Ψ(x, y). The relevance or appropriate-
ness of the model is usually justified by applying statistical procedures to the past
available data of Ψ(x, y). Then parameters in the expression of Ψ(x, y) are estimated
and extrapolated to obtain a projecting model. As a result, the projecting methods
developed in this way are also referred to as extrapolative projection models.
In concrete terms, Ψ(x, y) may represent mortality rates, mortality odds, central
death rates, the force of mortality, survival function, some transform of the above
functions, etc. Sometimes, Ψ(x, y) can be viewed as entries in a matrix whose rows
correspond to ages and columns to calendar years. For example, let Ψ(x, y) = qx(y).
Then, the mortality rates can be read according to three arrangements:
(1) a “vertical” arrangement (i.e. by columns),
q0(y), q1(y), · · · , qx(y), · · ·
corresponding to a sequence of period life tables, each table referring to a given
calendar year y;
(2) a “diagonal” arrangement,
q0(y), q1(y + 1), · · · , qx(y + x), · · ·
corresponding to a sequence of cohort life tables, each table referring to a cohort
born in year y;
(3) a “horizontal” arrangement (i.e. by rows),
· · · , qx(y − 1), qx(y), qx(y + 1), · · ·
yielding a time series of mortality rates referring to a given age x.
As will emerge from the discussion below, thinking in terms of the various arrange-
ments can help in understanding different approaches to the interpolation of mortality
data.
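The three arrangements are easy to express as array operations. In the sketch below, Q is a hypothetical matrix of mortality rates with rows indexed by age x and columns by calendar year y; the column, diagonal and row slices correspond to the period, cohort and age-specific readings, respectively.

```python
# A minimal sketch of the three readings of a mortality-rate matrix Q[x, y]:
# rows are ages, columns are calendar years; all rates here are hypothetical.
import numpy as np

ages = np.arange(0, 101)                      # ages 0..100
years = np.arange(1901, 2001)                 # calendar years 1901..2000
# Hypothetical rates: Gompertz-like in age with a mild downward trend over time.
Q = np.minimum(1.0,
               5e-4 * np.exp(0.085 * ages)[:, None]
                    * np.exp(-0.01 * (years - years[0]))[None, :])

# (1) "vertical" (by columns): the period life table for calendar year 1950.
period_1950 = Q[:, years == 1950].ravel()

# (2) "diagonal": the cohort born in 1901, i.e. q_x(1901 + x).
cohort_1901 = np.array([Q[x, x] for x in ages if x < len(years)])

# (3) "horizontal" (by rows): the time series q_65(y) across calendar years.
age_65_series = Q[65, :]

print(period_1950[60], cohort_1901[60], age_65_series[49])
```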
Although it seems quite reasonable that mortality projections are based on past
mortality experience, a number of broad projection approaches exist, for instance,
models based on underlying biomedical processes, causal models involving economet-
ric relationships, etc. In practice, most adopted projection methods fall into the
category of extrapolative models, and we will only review these types of models in
this section.
1.2.1 Deterministic Projection Models
The types of extrapolative models can be briefly summarized as follows:
(a) Models based on the independent projection of age-specific mortality or hazard
rates, including mortality reduction factor models (CMI, 1990, 1999; Willets,
1999; Renshaw and Haberman, 2000).
(b) Models based on the projection of parameters for some adopted mathemat-
ical law, including Gompertz-based projecting model (Wetterstrand, 1981),
Makeham-based projecting model (Cramer and Wold, 1935), and Heligman-
Pollard-based projecting model (Forfar and Smith, 1988; Benjamin and Pollard,
1993).
(c) The model tables or relational models that associate life table measures with
some standard life tables, including Brass’ logit model (1974) and Lee-Carter’s
model (1992).
Age-by-Age Basis Projection
As we have shown in the previous section, mortality changes are not happening
evenly: the rate of improvement in mortality rates has varied significantly over time,
and the improvement has varied substantially between different age groups. For
this reason, many professional projections prefer method (a) and model the future
mortality rates on an age-by-age basis. For example, the projection formula currently
used by CMIB for annuitants and pensioners mortality tables assumes the following:
qx(y) = qx(y0) {α(x) + [1 − α(x)] · [1 − f(x)]^(y−y0)},
where y0 denotes the base year of the projection, and α(x) qx(y0) represents the limit-
ing mortality rate at age x (as a percentage of the base year mortality rate). Basically,
this model assumes that mortality rate qx(y) (at age x) in calendar year y decreases
exponentially to the limiting value, with 1− f(x) specifying the speed of convergence
to approach this limit; hence, 1− f(x) is often referred to as “reduction factor”. De-
termination of α(x) and f(x) is based on the analysis of historical trends, sometimes
combined with expert (scientific and/or subjective) opinion on recent developments in
medical science, trends in the incidence of diseases, and so on.
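A hedged sketch of this age-by-age reduction-factor projection is given below. The base rates, α(x) and f(x) values are invented placeholders rather than CMIB figures; the formula is the one quoted above.

```python
# A minimal sketch of an age-by-age reduction-factor projection:
# q_x(y) = q_x(y0) * { alpha(x) + [1 - alpha(x)] * [1 - f(x)]^(y - y0) }.
# All inputs below are illustrative placeholders.
import numpy as np

y0 = 2000
q_base = np.array([0.010, 0.012, 0.014])      # q_x(y0) for three sample ages
alpha = np.array([0.30, 0.35, 0.40])          # limiting proportion of the base rate
f = np.array([0.020, 0.025, 0.030])           # 1 - f(x) is the annual reduction factor

def project_q(year):
    t = year - y0
    return q_base * (alpha + (1.0 - alpha) * (1.0 - f) ** t)

for year in (2000, 2010, 2030):
    print(year, project_q(year))              # converges towards alpha * q_base
```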
Many projections are of this kind. Examples include the early model proposed by
the Institute of Actuaries in London in 1924
qx(y) = ax + bx cx^y
and the later target model used by the GAD for the 1992-based projections, with a similar
formula of exponential interpolation from current levels to the target level. Since
this method allows each age-specific rate to change at its own individual rate, the
projected age profile of mortality may depart from the plausible, historically observed
pattern (Keyfitz, 1981). Subjective modification may be necessary when implausibility
occurs. Another disadvantage is that the number of parameters to be estimated is
very high, equal to the number of age groups multiplied by the number of parameters
in each formula.
Parameter-by-Parameter Basis Projection
When a mathematical mortality law is used to summarize the age pattern of mor-
tality, the high “dimension” of the forecasting problem can be dramatically reduced.
In the last decades of the 20th century, various mortality-law-based projection methods
have been considered. These include Wetterstrand (1981)’s Gompertz-based pro-
jection, Poulin (1980)’s Makeham-based projection, and Forfar and Smith (1988)’s
Heligman and Pollard-based projection (see also Benjamin and Soliman, 1993). In
such a case, the age pattern of mortality in the calendar year y is expressed via the
parameters of the mathematical law. Hence, the projection procedure is applied to
the set of parameters, instead of the set of age-specific mortality rates.
To illustrate the basic idea of the law-based projection method, we use the Makeham-based
model, which defines a dynamic Makeham law as
µx(y) = γ(y) + α(y) β(y)^x.    (1.2.1)
Here, the three parameters are viewed as functions of the calendar year y. The
projecting procedure involves first applying the formula to data for each period table
to obtain estimated values γ(y), α(y) and β(y) for calendar year y, and then fitting these
estimated values in the "horizontal" direction to obtain an extrapolative formula for each
parameter. Note that formula (1.2.1) can also be used on a cohort basis if the empirical
data support the model, as proposed by Davidson and Reid (1927), where y then
represents the birth year.
Although the law-based projection method reduces the dimension of the forecast-
ing model to a few parameters, another problem may arise when projecting those
parameters because there usually exist some complex correlations among those pa-
rameters. Moreover, this method requires that the proposed formula describes the
mortality pattern in a consistently satisfactory way in the past as well as in the
future. Sometimes this may not be true, thus implying high model risk in this ap-
proach. As a result, this method may generate implausible projected mortality trends
for age-specific mortality rates.
Pattern-by-Pattern Basis Projection
An alternative approach to summarizing the age pattern of mortality without
resorting to mathematical laws is the use of “model tables”. Due to the fact that
mortality demonstrates different patterns at different historical stages, it may not be
realistic to use a common mathematical law to represent changing mortality patterns.
Rather, it may be more appropriate if a specific representative of mortality table is
used when mortality reaches a certain level. This is the idea of model tables.
The first set of model tables was constructed by the United Nations in 1955, with
mortality level indicated by a "marker", where the marker is the expectation of life
at birth, e0. Model tables can be used for mortality projection as follows. First, a
set of model tables is chosen, representing the mortality for a given population in
several past stages, and also in the projected period for that population. Trends
in the marker are then analyzed and projected, possibly using some mathematical
formula, to predict their future values. Then, the projected age-specific mortality
rates are obtained by combining the projected values of the markers with the system
of model tables accordingly.
The idea of model tables is very important since, for the first time, it took the
viewpoint that mortality projection amounts to forecasting the "correct" pattern and the "correct"
level of future mortality. It separated the projection task into analyzing
two components: pattern and level.
This idea has been developed into the so-called “relational method” by W. Brass
in 1974, who focussed on the logit transform of the survival function, namely
Λx = (1/2) ln[(1 − s(x)) / s(x)].
Brass noted empirically that the mortality pattern, expressed by Λx, has a very nice
linear relationship with Λx^stand, the logit pertaining to a "standard" population, i.e.
Λx = α + β Λx^stand,    (1.2.2)
whose parameters are (almost) independent of age. Thus these parameters can serve
as the indices for the represented population.
For the purpose of projecting mortality, formula (1.2.2) can be used in a dynamic
sense. In a dynamic survival modelling context, the Brass logit transformation is
particularly interesting since the empirically linear property of logits applies well
when referring to successive birth-year cohorts. That is, denoting by Λx(y) the logit
of the survival function for the cohort born in year y, s(x, y), we have
Λx(y + 1) = αy + βy Λx(y) (1.2.3)
Again, parameters αy and βy are assumed independent of age, but vary for different
cohorts. So, the problem of projecting mortality converts to the problem of projecting
two parameters αy and βy which carry the information about how mortality will
change from one pattern to the other. Projected values of various life table functions
can then be derived from the inverse logit transformation:
s(x, y) = 1 / (1 + exp(2Λx(y))).
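The logit transform and its inverse are straightforward to compute. The sketch below applies the relational formula (1.2.2) to a hypothetical standard survival curve with illustrative values of α and β; none of these inputs are taken from Brass's work.

```python
# A minimal sketch of the Brass relational (logit) method.
# The standard survival curve and the (alpha, beta) values are illustrative only.
import numpy as np

def brass_logit(s):
    """Lambda_x = 0.5 * ln((1 - s(x)) / s(x))."""
    return 0.5 * np.log((1.0 - s) / s)

def inverse_logit(lam):
    """s(x) = 1 / (1 + exp(2 * Lambda_x))."""
    return 1.0 / (1.0 + np.exp(2.0 * lam))

ages = np.arange(1, 100)
s_standard = np.exp(-((ages / 90.0) ** 6))     # hypothetical "standard" survival curve
alpha, beta = -0.10, 1.02                      # illustrative relational parameters

lam_related = alpha + beta * brass_logit(s_standard)   # formula (1.2.2)
s_related = inverse_logit(lam_related)

print(s_related[[29, 64, 84]])                 # related survival at ages 30, 65, 85
```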
The main feature of model tables or relational models is that they project mortal-
ity on a pattern-by-pattern basis. Structurally, methods on a pattern-by-pattern basis
are preferable since a plausible age pattern of mortality can usually be maintained. This
is unlike the situation with age-by-age or parameter-by-parameter projections. This
approach works well when the assumed relationship continues to hold over the projected
period; otherwise it may result in model risk. This is the same as for the other extrapolative
methods.
Further Discussion
Model risk is actually the type of risk inherent in an extrapolative projection
approach in general. When one employs a parametric model to extrapolate the past
trends into the future, the implicit assumption is that the historical patterns will
still hold for the future and no structural change will occur. This is certainly not
true. Over the past century, we have observed the changes of mortality patterns
due to the transition in major causes of death from infectious diseases to chronic
diseases (see Tuljapurkar and Boe, 1998). Reflected in death rates, those pattern
changes can be seen from the crossover in mortality decline rates between different
age groups in different historical periods (see Figure 1.6). Pure extrapolation may
result in systematic underestimation or overestimation of mortality improvement.
Therefore, we must keep in mind that extrapolative projection only describes one
scenario of the future, one in which the past trends continue. It is also important to inquire
about the ways in which, and the degree to which, the future will differ from the past.
The answer to such an inquiry should be probabilistic. As we have discussed before,
mortality changes can involve both pattern changes and level changes. Some of those
changes are due to continuous development (for example, general improvement in
nutrition), but some are a result of discontinuous shocks (such as the introduction of
antibiotics). As a result, some are predictable (or easier to predict), while others
are not. Consequently, uncertainty is inherent in the process of mortality
development, no matter what projection method is taken. This is the motivation
for the stochastic projection approach becoming an active topic in recent actuarial
research. It is critical to ask the following questions about any projection method:
to what extent can future mortality be approximated by the proposed model? What
is the chance that future mortality may deviate from the projected trend, to what
degree, and how do we measure it?
1.2.2 Stochastic Projection Models
The projection procedures proposed before 1990 mainly focus on providing point esti-
mates of future mortality rates (or other age-specific quantities). Although concerns
for the mortality uncertainty in future trends have grown rapidly, those models don’t
facilitate the discussion of this aspect. This is a consequence of the way in which
the models were designed.
For example, McNown and Rogers (1989) extended the Heligman and Pollard-based
projection model to a stochastic projection model by allowing the dynamics of the 8
estimated parameters A to H (see formula (1.1.29)) to be modelled by ARIMA
processes. Although their model can give forecasts of future mortality rates, the
complexity of the parameter relationships makes it impossible to measure forecasting
errors. This is why the projection models given in the last section cannot simply be
extended to accommodate the stochastic feature.
The Lee-Carter Model
In 1992, Lee and Carter proposed a remarkably simple model which seemed to
solve the trade-off between plausibility of the projected age pattern and ease of mea-
suring the uncertainty. The Lee-Carter method works with the central death rates.
Let mx(t) denote the age-specific central death rate for age x at time t. Lee and Carter assume
that the logarithms of the central death rates satisfy
ln mx(t) = ax + bx k(t) + εx,t,    (1.2.4)
for appropriately chosen sets of age-specific constants ax and bx, and time-varying
index k(t).
• The age parameters ax’s can be interpreted as describing the general shape of
ln mx(t) across age, while the time parameter k(t) describes the variation in the
mortality level with time t. If k(t) falls, mortality improves, and if k(t) rises,
mortality worsens.
• The coefficient bx determines how this level change in mortality affects the rate
at a specific age. If the bx is particularly high for some age x, then this means
that the mortality rate improves faster at this age than in general. If it were
negative at some ages, this would mean that mortality was getting worse at
those ages. If bx’s were all equal then mortality rates would decline at the same
rate. As the bx’s are controlled to sum to 1, this is a relative measure.
• The error term εx,t reflects the particular age-time variation of historical influences
not captured by the model. If the model specification is correct, the εx,t should be
i.i.d. random variables with mean 0 and variance σ²_ε.
It is worth noting that if k(t) decreases linearly, then mx(t) decreases exponentially
at each age, at a rate that depends on bx, thus reducing to a special age-by-age
projection model. However, the Lee-Carter model is very different from age-by-age
projection. Under the Lee-Carter model setting, the age specific rates are determined
by the three parameter sets ax, bx and k(t) together. As a result, the parameter k(t)
itself is a kind of compromise among the trends in all the individual age-specific rates.
This will lead to different forecasts of the individual rates than would be obtained
by modelling them individually. In this sense, the Lee-Carter model is in essence a
relational model, and the future mortality pattern can be generated by the forecasts
for parameter k(t), combined with the estimates of ax and bx (obtained from the model
fitting step). The Lee-Carter model thus provides a parsimonious way to express the
pattern change of mortality in terms of the variation of a single variable k(t).
The most distinguishable aspect of the Lee-Carter model is that the model allows
for uncertainty in forecasts. In fact, the variable k(t) is intrinsically viewed as a
stochastic process (not as a deterministic quantity that can be expressed by some
interpolative formula as in the previous section), thus the values of k(t) form a time
series over time.
Standard statistical procedure (see Box and Jenkins, 1970) can be applied to find
an appropriate autoregressive integrated moving average (ARIMA) model for the time
series of k(t). In Lee and Carter’s original paper, k(t) is found to decline at a roughly
constant rate and have roughly constant variability, therefore can be well modelled
using a simple random walk with drift. That is
k(t) = k(t − 1) − c + et,    (1.2.5)
or
k(t) − k(0) = −ct + ∑_{0<s≤t} es.    (1.2.6)
In this specification, c is the drift term, and k(t) declines linearly with increments
of size c. The deviation from this path is captured by a white noise et. This simple
expression again can facilitate the forecasting and the discussion of uncertainty.
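A hedged sketch of the usual Lee-Carter estimation recipe follows: ax is the time average of ln mx(t); bx and k(t) come from the leading term of a singular value decomposition of the centred log rates, with bx normalized to sum to 1; and k(t) is then fitted as the random walk with drift (1.2.5). The log-rate matrix here is synthetic, and Lee and Carter's own procedure involves further adjustment steps not shown.

```python
# A minimal sketch of Lee-Carter estimation via SVD and a random-walk-with-drift
# fit for k(t).  The log central death rates below are synthetic placeholders.
import numpy as np

rng = np.random.default_rng(1)
n_ages, n_years = 90, 50
ages, years = np.arange(n_ages), np.arange(n_years)
# Synthetic log rates: an age profile plus a downward time trend plus noise.
log_m = (-9.0 + 0.09 * ages)[:, None] - 0.02 * years[None, :] \
        + 0.05 * rng.standard_normal((n_ages, n_years))

a = log_m.mean(axis=1)                           # a_x: time average of ln m_x(t)

U, S, Vt = np.linalg.svd(log_m - a[:, None], full_matrices=False)
b = U[:, 0]
k = S[0] * Vt[0, :]
scale = b.sum()                                  # impose sum_x b_x = 1
b, k = b / scale, k * scale

increments = np.diff(k)                          # fit k(t) = k(t-1) - c + e_t
c = -increments.mean()                           # drift c
sigma_e = increments.std(ddof=1)                 # innovation standard deviation

s = 20                                           # forecast horizon (years)
k_forecast = k[-1] - c * s
log_m_forecast = a + b * k_forecast              # point forecast of ln m_x(t+s)
print(f"drift c = {c:.3f}, sigma_e = {sigma_e:.3f}, ln m_65(t+{s}) = {log_m_forecast[65]:.3f}")
```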
Measure of Uncertainty in the Lee-Carter Model
Now assume an ARIMA process (1.2.5) has been recognized as the model for k(t).
Let k(t + s) be the s-period-ahead forecast from base year t. Then the s-period-ahead
forecast of ln(mx(t + s)) is given by
ln(mx(t + s)) = ax + bx k(t + s).    (1.2.7)
The true value of ln(mx(t+s)), assuming the model specification and data are correct,
is given by
ln(mx(t + s)) = (ax + αx) + (bx + βx)(k(t + s) + ut+s) + εx,t+s. (1.2.8)
where αx and βx are the errors in estimating ax and bx respectively, εx,t+s is the error
in fitting the model for age group x, and ut+s is the error in forecasting k ahead s
periods from base year t.
The total forecast error, Ex,t+s, is the difference between expression (1.2.8) and
(1.2.7):
Ex,t+s = αx + εx,t+s + (bx + βx)ut+s + βxk(t + s). (1.2.9)
Unfortunately, the correlation among these different sources of error is not clear.
Hence, in the computation of the variance of the forecast error, we have to assume
independence between all the terms on the right hand side of equation (1.2.9). Under
this assumption, the variance of Ex,t+s is given by
σ²_{E,x,t+s} = σ²_{α,x} + σ²_{ε,x,t+s} + (b²_x + σ²_{β,x}) σ²_{k,t+s} + σ²_{β,x} k²(t + s),    (1.2.10)
where σ²_{α,x} and σ²_{β,x} are the variances of the errors αx and βx in estimating ax and bx,
respectively; σ²_{ε,x,t+s} is the variance of εx,t+s, and σ²_{k,t+s} is the variance of ut+s.
Now we specify how to estimate the variance of each component in the right hand
side of equation (1.2.10). Since ax is the average over time of the log of the death rate
for age x, its error variance is the variance of ln(mx(t)) divided by T , the number of
observations of mx. σ²_{ε,x,t+s} can be estimated by the variance of the error in fitting age
group x within the sample period. σ²_{β,x} can be obtained by a small-scale bootstrap
(see Lee and Carter (1992) for details). The variance σ²_{k,t+s} depends on the form of
the ARIMA process in general. In terms of formula (1.2.5), we have
σ²_{k,t+s} = s σ²_e + s² σ²_c,
where σ²_e is the variance contained in forecasting k due to et, and σ²_c is the variance
of the error in estimating the drift term c.
The numerical results (see Lee and Carter, 1992, Table B1 in Appendix B) actually
show that σ²_{k,t+s} is significantly bigger than all the other variances put together, and
becomes dominant when the forecasting horizon gets long enough, say more than 15
years. That means the error in forecasting the mortality index dominates
the errors from the other sources. Hence the variance of the forecast
error, Ex,t+s, can be approximated solely by the error of forecasting k.
To generate the interval forecast, we need to assume normality for variable Ex,t+s.
This gives the following 95% confidence interval estimate for mx(t + s):
(mx(t + s) e^{−1.96 σ_{E,x,t+s}},  mx(t + s) e^{+1.96 σ_{E,x,t+s}}).    (1.2.11)
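Continuing in the same spirit, the snippet below evaluates the dominant variance term σ²_{k,t+s} = s σ²_e + s² σ²_c and the interval (1.2.11), approximating the total forecast error by the error in forecasting k alone, as the text suggests. All fitted quantities are illustrative placeholders rather than outputs of an actual Lee-Carter fit.

```python
# A minimal sketch of the forecast-error variance and the 95% interval (1.2.11),
# keeping only the dominant k-forecast error term of (1.2.10).
# The "fitted" inputs below are hypothetical placeholders.
import numpy as np

a_x, b_x = -4.5, 0.012         # a_x and b_x for one age group
k_last = -15.0                 # last fitted value of the mortality index k(t)
c, sigma_e = 0.35, 0.60        # drift and innovation s.d. of the random walk (1.2.5)
sigma_c = 0.05                 # standard error of the estimated drift c

s = 20                                         # forecast horizon in years
var_k = s * sigma_e**2 + s**2 * sigma_c**2     # sigma^2_{k,t+s}
sigma_E = abs(b_x) * np.sqrt(var_k)            # sigma_{E,x,t+s} ~ |b_x| * sigma_{k,t+s}

m_forecast = np.exp(a_x + b_x * (k_last - c * s))    # central forecast of m_x(t+s)
lower = m_forecast * np.exp(-1.96 * sigma_E)         # interval (1.2.11)
upper = m_forecast * np.exp(+1.96 * sigma_E)
print(f"{lower:.5f} < m_x(t+s) = {m_forecast:.5f} < {upper:.5f}")
```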
In summary, Lee and Carter proposed a simple linear transformation (formula
(1.2.4)) to represent the age-specific mortality pattern change in terms of the varia-
tion of a period-specific mortality level index k(t). More strikingly, k(t) empirically
declines in a linear manner during the period of interest, and thus can be modelled as
(1.2.5). As a result, the uncertainty for the forecasted mortality rate can be easily
derived from the variance of k (given in formula (1.2.10) or (1.2.11)).
Further Discussion
The Lee-Carter methodology represents one of the most influential proposals in
the field of mortality forecasts. It is designed to address the issue of measuring
the uncertainty in mortality forecasting. However, because the uncertainty given
by (1.2.10) does not reflect uncertainty about whether the model specification is
correct, nor uncertainty about whether the future will look like the past, many people
believe that the confidence intervals (1.2.11) given by the Lee-Carter model are too narrow
(Renshaw and Haberman, 2003; Booth, Maindonald, and Smith, 2002; Li, Hardy, and
Tan, 2006). This narrowness may result in underestimation of the risk of more extreme
outcomes, and this may defeat the original purpose of moving on to a stochastic
framework, as commented in Li, Hardy, and Tan (2006).
This narrowness of the confidence intervals can be interpreted as the model risk
inherent in the extrapolative method as we have discussed before. The Lee-Carter
methodology will only work well when the mortality change maintains the same pat-
tern over the fitting and projecting period. When a changeover of mortality from one
pattern to another takes place, this method will fail. The model mis-specification
can be tested on the error term ε_{x,t}. It is found that there exist substantial and
persistent correlations across age groups when fitting the Lee-Carter model to U.S.
death rates from 1933 to 1987. This is a negative result because the ε_{x,t} are supposed to
be independent of one another. For further model examinations, see Lee and Miller
(2001) for details.
Another weakness of the Lee-Carter model concerns forecasting errors for more
integrated quantities, such as the life expectancy e_x. Since e_x is a nonlinear function
of µ(x) (or m_x(t)) and k(t), the computation of the standard errors requires
the use of asymptotic approximations or a bootstrap procedure. The complexity of
such computations is of the same order of magnitude as would be involved under
other parametric forecasting methods like McNown and Rogers' (1989). Therefore, in
practical applications, the Lee-Carter model still cannot provide an analytic form for its
solution, and numerical illustration requires cumbersome simulation.
Finally, we would like to remark that, regarding the use of projected life tables in
actuarial applications, cohort life tables are directly relevant. However, most proposed
methods (including the Lee-Carter’s model) are based on analyzing period data. This
choice mainly depends on the availability of data and/or the empirical facts. Projected
period life tables are often constructed first. Cohort tables are then generated by
reading the matrix in a diagonal direction as we have discussed before. Though it
may not be a big issue to obtain cohort tables from period tables, it is much more
difficult to transfer the uncertainty measure from the period to the cohort setting
for life table functions, such as the life expectancy e_x, the survival probabilities _s p_x and so on.
This is because, when passing from errors in forecasting the age-specific death rates
to errors in forecasting life table functions, the error in forecasting k applies to all
age groups in the period setting, whereas the autocovariance structure of the errors in k
has to be taken into account in the cohort setting, which is less straightforward.
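As a small illustration of reading a projected period table diagonally to obtain cohort quantities, the following Python sketch converts a wholly hypothetical period table of death probabilities q_x(t) into cohort survival probabilities; the numbers are invented for illustration only.

import numpy as np

# Hypothetical projected period table: rows are ages 65..68, columns are years 0..3.
q_period = np.array([
    [0.0150, 0.0147, 0.0144, 0.0141],   # q_65(t)
    [0.0168, 0.0164, 0.0161, 0.0158],   # q_66(t)
    [0.0188, 0.0184, 0.0180, 0.0176],   # q_67(t)
    [0.0211, 0.0206, 0.0202, 0.0198],   # q_68(t)
])

def cohort_survival(q, start_age_idx=0, start_year=0, horizon=4):
    """Read the period table along a diagonal: the cohort at age index
    start_age_idx in calendar year start_year uses q_{x+j}(start_year + j)."""
    probs = []
    surv = 1.0
    for j in range(horizon):
        q_diag = q[start_age_idx + j, start_year + j]
        surv *= (1.0 - q_diag)
        probs.append(surv)        # j-year cohort survival probabilities
    return probs

print(cohort_survival(q_period, start_age_idx=0, start_year=0, horizon=4))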
1.2.3 Assessing Mortality Projection Models
Criteria for assessing forecasting methodologies (see Keyfitz 1981, GAD 2001) include
• the accuracy with which the forecasts match the eventual realizations of the
actual data
• the ability of the model to generate measures of forecasting uncertainty,
• the transparency of assumptions used to generate the forecasts,
• the quality of the data on which the forecasts are based,
• the ease of use of the model, robustness of the model, needs of users, and so on.
The Government Actuary's Department (2001) has carried out an extensive examina-
tion of various projection methods. Here, due to the limit of space, we only quote
the accuracy test based on the Lee-Carter model, which is viewed as one of the
most popular methods, is highly credited for satisfying the assessment criteria well, and
is likely the only model capable of providing an uncertainty measure.
To test the accuracy, the Lee-Carter method is fitted to base data for England
& Wales from 1941 to 1970. The estimated model is then used to project mortality
rates for the period from 1971 to 1999. They then calculate the ratio of actual over
projected rates and plot the numbers using a Lexis chart developed by Andreev (2001).
We reproduce their Figure 9 for males from GAD Report No. 8 here.
The presence of large areas of dark blue and dark red reveals that projected age-
specific rates were very different from the actual rates. In particular, projected rates
are lower than actual rates at ages up to around 45 and higher at older ages, through-
out the period 1971 to 1999. Moreover, this phenomenon is also found for the female
projections and for projections based on different base data. Therefore, we assert
that severe systematic deviation exists in the projections produced by the Lee-Carter
model.
Sources of Uncertainty
Several sources of uncertainty may influence the modelling of mortality rates, and
their projection into the future. Three well-known categories associated specifically
with the use of statistical models (see Cairns, 2000) are:
(a) uncertainty due to the stochastic nature of a given model (that is, a stochastic
model produces randomness in its output), namely the “process risk”;
(b) uncertainty in the values of the parameters (if we have a finite set of data then
we cannot estimate parameter values exactly), giving rise to the "parameter risk";
(c) uncertainty in the model underlying what we can observe (the actual trend is not
represented by the proposed model); thus the so-called "model risk" arises.
If all types of uncertainties have been well addressed, we expect only unsystem-
atic errors in the comparison of projected rates and actual rates. In concrete terms
(see Figure 1.13), when we put projected mortality rates (the continuous line) at a
given age x with its possible future mortality experience (the dots) together, we ex-
pect pattern 1, where deviations from the projected mortality rates can be sensibly
explained in terms of random fluctuations of the outcomes (the observed mortality
rates) around the relevant expected values (the projected mortality rates). The ex-
perience depicted in pattern 2 can be hardly attributed to random fluctuations only.
More likely, this profile can be explained as the result of an actual mortality trend
different from the forecasted one.
Similar systematic deviations can also be pictured for a fixed cohort but along the
age axis.
Figure 1.13: Experienced Mortality with its Projection (Pattern 1: random fluctuation; Pattern 2: systematic deviation; vertical axis: q_x(y))
In the Lee-Carter model, the mortality pattern is estimated from past experience,
expressed in terms of ax and bx, and fixed throughout the projecting period. Mortality
uncertainty in the Lee-Carter model setting is mainly attributed to the random nature
of index k(t), i.e. the process risk of the mortality level. As we have analyzed
before, this creates large model risk due to the nature of mortality changes. This
might explain what we find in GAD’s report about the Lee-Carter model’s projection
performance.
Parameter risk can be reduced by improving the data, hence it may not
present a big problem when the model specification is appropriate. But sometimes process
risk can be described in terms of parameter uncertainty, as in McNown and Rogers'
model. From the discussion we had before, it seems that model risk could be a severe
problem, especially for the models based on extrapolative method. We refer interested
readers to Cairns (2000) for some general methods to deal with parameter and model
risk.
The Complexity of the Mortality Forecasting Problem
The problem of mortality forecasting is actually a very difficult one.
- First of all, we are dealing with moving curves. Precisely, we need to specify a
mortality pattern for each future time t ∈ [0, T], which in turn implies an age-specific mortality
structure.
- Secondly, the curve is moving non-linearly and randomly.
- Thirdly, the forecasting schedule has to satisfy two-dimensional biological con-
straints: that is, at a fixed time, older people usually have higher mortality rates than
younger people; and for a fixed age, the younger cohort is expected to experience
lower mortality than the older one. Yet, for an actual sample pattern or sample path,
it is possible that at an older age we may observe a lower mortality rate than at a
younger age, or a younger cohort may present higher mortality at some age than
an older cohort, due to random fluctuation. But the general trend and general shape
have to be kept; at least based on our current knowledge this has to be true.
The first two properties make mortality modelling very similar to the modelling
of the term structure of interest rates. However, the term structure of interest rates does not
have to be restricted to the structures and patterns implied in mortality. In ad-
dition, the expectation we have for interest rate models is quite different from the
one for mortality models. For example, no one expects that interest rate models can
predict the term structure of interest rates more than 1 year ahead with high precision.
However, actuaries have been forecasting mortality rates for future periods of more
than 20 years ahead, and have been assessing the accuracy of their projections in some
way.
What, then, is our particular problem with mortality modelling? This is the question in
our mind, and we will try our best to address it in this thesis.
1.3 Mortality Risk and Stochastic Approaches
1.3.1 Mortality Risk — Definition and Properties
As we have shown so far, mortality development embraces a great deal of uncertainty
per se. When a projection is concerned, this uncertainty presents itself as the sys-
tematic deviation of the realized mortality rates from the projected rates. The risk
that such systematic deviations may happen is often referred to as mortality risk in the
recent literature.
In practice, mortality changes occur in both young and old ages. For different age
groups, mortality risk usually manifests different characteristics. To help differentiate
between them we will employ the following terminology. The term mortality risk
will be used to cover all forms of deviations in aggregate mortality rates from those
projected at different ages and over different time horizons. Longevity risk will be
used to refer to the risk that, in the long term, aggregate survival rates for identified
cohorts are higher than projected. Short-term, catastrophic mortality risk will refer
to the risk that, over short periods of time, mortality rates are very much higher than
would normally be experienced.
In the following, we will use a simple representation provided by Pitacco (2003) to
illustrate the peculiar statistical aspect of the mortality risk. Using the notation from
section 1.2, let us denote by Ψ(x, y) a projected mortality model, i.e. a (real-valued)
function of age x and calendar year y, which expresses, in some way, the mortality of
people aged x in the (future) calendar year y, i.e. born in year z = y − x. When the
future changes in mortality are unknown at the time of valuation, the future mortality
evolution can be considered as a family of projected mortality models including all
possible outcomes. Denote by K(z) a given assumption about the mortality trend
for people born in year z, and by 𝒦(z) the set of such hypotheses. Then, the family
of projected models is

{ Ψ[x, z + x | K(z)] ; K(z) ∈ 𝒦(z) }, (1.3.1)

with Ψ[x, z + x | K(z)] representing the projected model conditional on the specific
hypothesis K(z).
If the mortality is described through a mathematical law, such as those proposed
by Gompertz, Makeham, Heligman and Pollard, etc., the law itself is characterized by
a vector-valued parameter θ(z). An appropriate choice of the vector-valued function
θ(z) may reflect a given hypothesis about the mortality trend. Furthermore, if we
are only concerned with the projection of one cohort, i.e. a given year of birth, z
can be dropped from the notation. Therefore, the family of projected models can be
expressed simply as

{ Ψ[x | θ] ; θ ∈ Θ }. (1.3.2)

The parameter space can be either a discrete or a continuous set. In the former
case, the parameter space is, in particular, a finite set:

Θ = {θ_1, θ_2, · · · , θ_m}, (1.3.3)
where m choices are made for the parameter, each one expressing a particular mortal-
ity trend. When m = 1, we are dealing with one (projected) model; hence, uncertainty
in future mortality is not allowed for. Conversely, when m > 1, we are dealing with
several projected models, expressing alternative views on future mortality trends.
Consider assigning a weight g_i (g_i > 0) to each parameter value θ_i in Θ (i.e. to each
projection hypothesis), such that Σ_{i=1}^m g_i = 1. Hence, the set {g_i}_{i=1,2,··· ,m} can be
interpreted as a probability distribution for Θ, describing a "degree of belief" about
the uncertainty in future mortality evolution. A stochastic model setting is thus
formulated (in the sense that the projection is non-deterministic).
Now we will use this simple stochastic model representation to distinguish between
the unsystematic mortality risk stemming from the randomness of a given (parameter)
survival function, and the systematic mortality risk stemming from the uncertainty
associated with the parameter and model specification. In order to do so, let us
consider the residual lifetime random variable τx. According to a given choice θi of the
parameter, the (conditional) moments of τx may be evaluated. In an age-continuous
context, we have for example:
E(τ_x | θ_i) = ∫_0^∞ t f_x(t | θ_i) dt, (1.3.4)

Var(τ_x | θ_i) = ∫_0^∞ (t − E(τ_x | θ_i))² f_x(t | θ_i) dt, (1.3.5)

where f_x(t | θ_i) is the (conditional) p.d.f. of the residual lifetime τ_x.
The unconditional moments are then as follows:

E(τ_x) = ∫_0^∞ Σ_{i=1}^m t f_x(t | θ_i) g_i dt, (1.3.6)

Var(τ_x) = ∫_0^∞ Σ_{i=1}^m (t − E(τ_x))² f_x(t | θ_i) g_i dt. (1.3.7)
In particular, developing expression (1.3.7), the following well-known result can be
obtained for the variance:

Var(τ_x) = Σ_{i=1}^m Var(τ_x | θ_i) g_i + Σ_{i=1}^m (E(τ_x | θ_i) − E(τ_x))² g_i (1.3.8)

= E(Var(τ_x | θ)) + Var(E(τ_x | θ)), (1.3.9)

where θ denotes the random value of the parameter. Comparing expression (1.3.5)
with expression (1.3.8) or (1.3.9), it is obvious that expression (1.3.5) only allows for
pure randomness arising from a given model in which the model and parameter values
have been uniquely specified, whilst formula (1.3.8) (or (1.3.9)) explicitly deals with
both pure randomness, in a more general form than in formula (1.3.5) (the first term
in (1.3.8) or (1.3.9)), and systematic deviations from the best estimate
(the second term in (1.3.8) or (1.3.9)). It is worth noting that traditional actuarial
approaches concentrate mostly on the risk expressed by (1.3.5), with the systematic
deviations being largely ignored.
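The decomposition (1.3.8)–(1.3.9) is easy to verify numerically. The sketch below assumes, purely for illustration, that τ_x is exponentially distributed under each of m = 3 trend hypotheses; the means and weights are invented.

import numpy as np

# Hypothetical mixture: conditional on theta_i, tau_x ~ Exponential with the stated mean.
means = np.array([18.0, 21.0, 25.0])     # E(tau_x | theta_i), illustrative values
g = np.array([0.2, 0.6, 0.2])            # weights g_i, the "degree of belief"

cond_mean = means                          # E(tau_x | theta_i)
cond_var = means**2                        # Var(tau_x | theta_i) for an exponential law

overall_mean = np.sum(g * cond_mean)                        # (1.3.6)
pooling = np.sum(g * cond_var)                              # E(Var(tau_x | theta)), first term of (1.3.8)
non_pooling = np.sum(g * (cond_mean - overall_mean) ** 2)   # Var(E(tau_x | theta)), second term

print("E(tau_x)     =", overall_mean)
print("E[Var] term  =", pooling)
print("Var[E] term  =", non_pooling)
print("Var(tau_x)   =", pooling + non_pooling)              # (1.3.9)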
Further insight into the impact of mortality risk on actuarial practice can be gained as follows.
Let us consider a portfolio of immediate life annuity contracts, which are identical
in terms of the age of the annuitant, annual amount, etc. Assume at time 0 the
contracts are issued to N(t_0) = N individuals. The random number of contracts
in force at time t (calendar year t_0 + t) is N(t_0 + t). The random lifetime of the i-th
annuitant (i = 1, 2, · · · , N) at time 0 is denoted by τ_x, where x is the age at entry,
and the lifetimes are assumed to be independently and identically distributed among the N annuitants.
Consider a generic annuitant in the portfolio. The random present value at time
0 of the benefits is denoted by

Y = R a_{τ_x|}, (1.3.10)
where R is the annual amount, and a_{τ_x|} is calculated with a given rate of interest i.
Given a survival function S(t) or its related parameter θ, the (conditional) distribution
function, expected value and variance of Y can be expressed as:

F_Y(y | θ) = Pr[Y ≤ y | θ], (1.3.11)

E(Y | θ) = R E(a_{τ_x|} | θ), (1.3.12)

Var(Y | θ) = R² Var(a_{τ_x|} | θ). (1.3.13)
Now denote by 𝒴 the random present value of future benefits for the portfolio.
Clearly, it can be expressed as follows:

𝒴 = Σ_{i=1}^N Y^{(i)}, (1.3.14)

where each Y^{(i)} has the same distribution as Y. The conditional distribution function of
𝒴 can then be obtained as the convolution of the distribution functions of the variables
Y^{(i)}, i.e.

F_𝒴(y | θ) = F_{Y^{(1)}} ∗ · · · ∗ F_{Y^{(N)}}(y | θ) (1.3.15)

= [F_Y(y | θ)]^{N∗}. (1.3.16)

The conditional expected value and variance of 𝒴 given θ are

E(𝒴 | θ) = N E(Y | θ), (1.3.17)

Var(𝒴 | θ) = N Var(Y | θ). (1.3.18)
Note that, up to now, what we obtain is the variance for the portfolio when the future
mortality is described by a given survival model, that is, when we take a deterministic
approach. In this situation, the variance of the portfolio increases linearly as the
portfolio size increases.
The unconditional expected value and variance of 𝒴 can be calculated similarly
to the unconditional moments of τ_x:

E(𝒴) = E_P[E(𝒴 | θ)] (1.3.19)

= N E_P[E(Y | θ)] = N E(Y), (1.3.20)

Var(𝒴) = E_P(Var(𝒴 | θ)) + Var_P(E(𝒴 | θ)) (1.3.21)

= N E_P(Var(Y | θ)) + N² Var_P(E(Y | θ)). (1.3.22)

In the above calculation, P means the probability distribution specified by {g_i}_{i=1,2,··· ,m}.
The relationship between the size of the portfolio and the total riskiness borne by
the insurer can be better illustrated by the coefficient of variation, i.e. the relative
standard deviation of a random variable. Under the deterministic model setting, we
have, for the portfolio benefits 𝒴, the coefficient of variation

r = √Var(𝒴 | θ) / E(𝒴 | θ) = (1/√N) √Var(Y | θ) / E(Y | θ). (1.3.23)

The coefficient of variation for 𝒴 under the stochastic approach is

r = √Var(𝒴) / E(𝒴) (1.3.24)

= ( (1/N) E_P(Var(Y | θ)) / E²(Y) + Var_P(E(Y | θ)) / E²(Y) )^{1/2}. (1.3.25)
Therefore, in relative terms, expression (1.3.23) shows that the riskiness of the
portfolio decreases with the size of the portfolio. This is the reason for the common
opinion that the larger the portfolio, the less risky it is. As the portfolio size goes to
infinity, the risk measured by r goes to zero, reflecting the property of "pooling
risk". On the contrary, the second component in expression (1.3.25) shows that the
risk shared by each policy can never be reduced to zero by increasing the portfolio
size. For this reason, this latter risk is called "non-pooling risk".
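A small numerical sketch of (1.3.23) and (1.3.25): with hypothetical conditional moments for the single-policy present value Y and illustrative weights g_i, it shows the deterministic coefficient of variation shrinking like 1/√N while the stochastic one levels off at the non-pooling component.

import numpy as np

# Hypothetical conditional moments of the single-policy present value Y (placeholders).
g = np.array([0.2, 0.6, 0.2])                 # weights g_i
EY_cond = np.array([11.8, 12.5, 13.3])        # E(Y | theta_i)
VarY_cond = np.array([9.0, 10.0, 11.0])       # Var(Y | theta_i)

EY = np.sum(g * EY_cond)                       # E(Y)
E_Var = np.sum(g * VarY_cond)                  # E_P(Var(Y | theta))
Var_E = np.sum(g * (EY_cond - EY) ** 2)        # Var_P(E(Y | theta))

for N in (100, 1_000, 10_000, 100_000):
    r_det = np.sqrt(VarY_cond[1]) / EY_cond[1] / np.sqrt(N)         # (1.3.23), middle hypothesis
    r_stoch = np.sqrt(E_Var / (N * EY**2) + Var_E / EY**2)          # (1.3.25)
    print(f"N={N:7d}   r_deterministic={r_det:.5f}   r_stochastic={r_stoch:.5f}")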
1.3.2 Financial Implication of Mortality Risk
In this subsection, we will provide some numerical evidence to show the severe finan-
cial impact of the non-pooling risk. To this end, we investigate the percentiles of
𝒴, which has been defined in the previous subsection as the random present value of
future benefits for a portfolio of immediate life annuity contracts. In particular, for
the deterministic case (that is, the survival model is specified by a given parameter
θ), we define

y_α = inf{y : Pr{𝒴 > y | θ} ≤ 1 − α}. (1.3.26)

Following the approach of Pitacco (2003) and Olivieri (2001), it is assumed that θ takes one of the values
θ[min], θ[med], θ[max], with probability distribution P = {p[min], p[med], p[max]}. Define

y_α = inf{y : Pr{𝒴 > y} ≤ 1 − α}. (1.3.27)

In general, there is no analytic form for the percentiles. Therefore, a simulation method
is required.
For numerical convenience, we adopt a mathematical law for mortality:

q_x / p_x = G H^x. (1.3.28)
Note that the right-hand side of (1.3.28) is the third term in the Heligman-Pollard
law, i.e. the term describing the old-age pattern of mortality. It is a well-accepted
fact that formula (1.3.28) can provide a good approximation for the mortality rates
when ages over 50 are considered. In particular, G expresses the level of senescent
mortality and H the rate of increase of senescent mortality itself. From (1.3.28) the
survival function S_x(t) (or just S(t) for simplicity) can be derived. Hence, the model
is specified via the parameter θ = (G, H).

Table 1.2: Parameters of the projected survival functions

        [C]          [min]        [med]        [max]
  G     0.00028      0.000042     0.000002     0.0000001
  H     1.07319      1.09803      1.13451      1.17215
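The following Python sketch derives the one-year death probabilities and the survival curve implied by (1.3.28) for the parameter sets of Table 1.2; the starting age of 65 and the age cap of 120 are illustrative choices, not part of the original computation.

import numpy as np

# Parameters of the projected survival functions (Table 1.2).
params = {
    "[C]":   (0.00028,   1.07319),
    "[min]": (0.000042,  1.09803),
    "[med]": (0.000002,  1.13451),
    "[max]": (0.0000001, 1.17215),
}

def survival_curve(G, H, x0=65, max_age=120):
    """From q_x/p_x = G*H^x: q_x = G*H^x / (1 + G*H^x); S(t) is the running product of p_{x0+j}."""
    ages = np.arange(x0, max_age)
    odds = G * H ** ages
    q = odds / (1.0 + odds)
    return np.cumprod(1.0 - q)           # S(1), S(2), ... for a life aged x0

for label, (G, H) in params.items():
    S = survival_curve(G, H)
    print(f"{label:6s}  20-year survival from age 65: {S[19]:.4f}")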
In order to represent the uncertainty in the mortality trend, we have assumed three
alternative projected survival models, characterizing three different degrees of
mortality improvement from the current mortality level (denoted by S[C](t)). That is,
S[min](t), S[med](t), S[max](t) are projected survival functions expressing, respectively,
a small, a medium and a large improvement in mortality. When implementing the
model, the parameter (G,H) for the survival function S[C](t) is first estimated from
the current available mortality table. Then the relevant parameters for S[min](t),
S[med](t), S[max](t) are carefully chosen so that they can represent different degrees
of mortality improvement, and at the same time describe the general phenomena
of rectangularization and expansion (see section 1.1.2). The parameters for those
survival models are given in Table 1.2 and their corresponding schedules are displayed
in Figure 1.14.
Next, we need to assign probabilities to the three projected functions S[min](t),
S[med](t), and S[max](t), which we have interpreted as the “degree of belief” attributed
to the corresponding projection.

Figure 1.14: Curves of deaths based on formula (1.3.28) for the survival models [C], [min], [med] and [max] (horizontal axis: age, from 50 to 110)

For the numerical results shown in Table 1.3, θ =
θ[med] is used for the deterministic case, P = {p[min], p[med], p[max]} = {0.2, 0.6, 0.2} for
the stochastic case, and α = 0.95 for both.
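A Monte Carlo sketch of how percentiles of the kind reported in Table 1.3 can be estimated: lifetimes are simulated under the law (1.3.28) with the Table 1.2 parameters, the portfolio present value is accumulated, and (1.3.26)/(1.3.27) are approximated by empirical quantiles. The interest rate, annual amount, entry age, portfolio size and number of scenarios are all hypothetical, so the output is not meant to reproduce Table 1.3.

import numpy as np

rng = np.random.default_rng(0)
params = {"[min]": (0.000042, 1.09803), "[med]": (0.000002, 1.13451), "[max]": (0.0000001, 1.17215)}
probs = {"[min]": 0.2, "[med]": 0.6, "[max]": 0.2}
x0, R, i, N, n_sim = 65, 1.0, 0.03, 1000, 2000   # entry age, annual amount, interest, portfolio size, scenarios
v = 1.0 / (1.0 + i)

def simulate_portfolio_pv(G, H, n_scen):
    """Present value of an immediate annuity portfolio (payment at each year end while alive)."""
    ages = np.arange(x0, 121)
    odds = G * H ** ages
    q = odds / (1.0 + odds)
    pvs = np.empty(n_scen)
    for s in range(n_scen):
        alive = np.full(N, True)
        pv = 0.0
        for t, qt in enumerate(q, start=1):
            alive &= rng.random(N) > qt        # who survives year t
            n_alive = alive.sum()
            if n_alive == 0:
                break
            pv += R * n_alive * v ** t
        pvs[s] = pv
    return pvs

# deterministic case: theta = theta[med]
pv_det = simulate_portfolio_pv(*params["[med]"], n_sim)
y_alpha_det = np.quantile(pv_det, 0.95)          # (1.3.26)

# stochastic case: draw theta from P scenario by scenario
labels = rng.choice(list(params), size=n_sim, p=[probs[k] for k in params])
pv_sto = np.concatenate([simulate_portfolio_pv(*params[k], np.sum(labels == k)) for k in params])
y_alpha_sto = np.quantile(pv_sto, 0.95)          # (1.3.27)

print(f"(y_a / E) x 100  deterministic: {100 * y_alpha_det / pv_det.mean():.2f}")
print(f"(y_a / E) x 100  stochastic   : {100 * y_alpha_sto / pv_sto.mean():.2f}")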
It is clear from Table 1.3 that there is a slight decrease in both cases as the size of
the portfolio increases, because the random fluctuation component of the mortality risk
vanishes as the portfolio enlarges. We also notice the dramatic increase in the value
of the percentiles in the stochastic case compared to the deterministic case. Moreover,
this increase is quite stable in magnitude and seems to tend to a large positive amount.
This is due to the fact that a stochastic approach allows us to analyze not only the
risk of random fluctuations in the number of survivors around the relevant expected
value, but also that of systematic deviations. Since systematic deviations concern
all the insureds in the same direction, this risk cannot be hedged by increasing the
portfolio size. In other words, this risk is non-diversifiable. As a result, systematic
deviation usually requires a risk premium to be taken into account when pricing and
reserving for life insurance and annuities.

Table 1.3: Relative values of percentiles for the annuity portfolio with α = 0.95

  N                               1000       2000       3000       4000       5000
  (y_α / E(𝒴 | θ[min])) × 100     102.166    101.510    101.226    101.071    100.944
  (y_α / E(𝒴 | θ[med])) × 100     101.662    101.160    100.938    100.823    100.725
  (y_α / E(𝒴 | θ[max])) × 100     101.242    100.878    100.701    100.610    100.541
  (y_α / E(𝒴)) × 100              114.030    113.860    113.783    113.740    113.705
It must be pointed out that this example is just illustrative, so the choice of param-
eters is meant to demonstrate the strong effects of projected tables and uncertainties,
while for real applications deeper investigation is required and the effect may not
be so significant. Although the numerical results provided here rely on specific
assumptions concerning the uncertainty structure in mortality, the conclusion drawn in
this section is a general one. The main feature that concerns actuarial practice is
the non-pooling aspect of the mortality risk. Because of it, the possibility of benefiting
from offsetting effects by holding a large enough portfolio of policies is ruled out.
This problem has been amplified by a dramatic decline in interest rates over the
last decades, affecting particularly those contracts providing joint financial and de-
mographic guarantees (see Boyle and Hardy, 2003; Ballotta and Haberman, 2006).
The underpricing of such guarantees has caused severe solvency problems, requiring
the setting up of extra reserves, and leading one large mutual life insurer (Equitable
Life, the world's oldest life insurance company) to be closed to new business in 2000.
This has increasingly prompted actuaries to question whether mortality risk has
been treated properly.
1.3.3 Stochastic Approach of Dealing with Mortality Risk
It is well known from the basics of actuarial science that the prices of any insur-
ance products contingent on the duration of life are affected by two main factors:
demographic and financial risks. Traditionally, actuaries have treated both the
demographic and the financial risk factors in a deterministic way, via a so-called
best-estimate mortality table describing the future evolution of mortality, as
well as a best-estimate interest rate for discounting cash flows over time.
This is often referred to as the actuarial approach.
The principle of the actuarial approach is easy to use at first sight. However, it
leaves a lot of hard questions unanswered. For example
• What discount rate should be used and how should it be affected by the char-
acteristics of the financial products (such as long term or short term)?
• What is the best projected mortality schedule? What if the eventual rates are
different from projected rates?
• In general, how should we express, evaluate and manage the risk implied in the
products including both financial and demographic factors?
• Practically, how are actuarial values resulting from the actuarial approach re-
lated to the observable values of the (readily tradable) assets that typically
appear on a life office’s balance sheet?
An alternative approach, based upon applying the no-arbitrage principle and
the construction of a replicating portfolio, has attracted a lot of attention recently;
we refer to it as the contingent claim pricing approach, or sometimes simply the
financial approach. The underlying assumption behind the contingent claim pricing
approach is that values should be calculated consistently with the prices of traded
assets so that the market is free of arbitrage. The basic idea of the approach is
therefore not to find an objective valuation, but instead to seek a relative valuation,
compatible with the observable market values of portfolios (of assets and liabilities)
with similar risk characteristics.
Financial theory and models were first introduced into actuarial valuation to deal
with the financial risk involved in complex insurance products such as Equity-Index
Annuities (EIAs), where equity risk, interest rate risk and mortality risk are present
and the market is incomplete. Early works in this field include Follmer and Son-
dermann (1986), Møller (1998, 2001a,b) and Lin and Tan (2003), which introduce risk-
minimizing strategies for the pricing and hedging of EIAs in an attempt to
manage the financial risk.
Now there is a demand to develop more sophisticated approaches to deal with
mortality risk as well, especially the non-pooling aspect of the mortality risk. The
advantages of adopting the stochastic mortality view and exploiting financial
theory can be seen in the following fields, among others.
Performance valuation
For performance investigation, the use of valuations that are as objective as possi-
ble is considered critical. This perspective has been reflected in the recently proposed
fair valuation methodologies in the newly issued International Financial Reporting
Standard (IFRS) by the International Accounting Standards Committee (IASC). In
an attempt to make the assessment of companies' financial performance much
more realistic and reliable, the use of a market-price-consistent approach to obtain
fair value is encouraged. "Fair value is the amount for which an asset could be ex-
changed or a liability settled between knowledgeable willing parties in an arm's length
transaction," as defined in IFRS.
While most investment strategies have been assessed on a market-price-consistent
basis, it seems necessary that the same treatment is applied to the liability (mortality-
related) side of insurance business as well.
Capital adequacy and solvency requirements
Solvency regulations usually require the insurer to have the capital capacity to meet
the future obligations committed to in the insurance contract. This actually implies a
comparison between the random profile of the portfolio fund and the random profile
of the portfolio liability, due to the uncertainty in future mortality trends and
uncertainty in the future financial markets. It is thus difficult to judge the appro-
priateness of a reserve profile based on a deterministic view of the future scenario.
Hence, stochastic mortality modelling is a necessary step to assess the future
obligations on realistic grounds.
The development of mortality-related derivative market
A large number of products offered by life insurance companies involve a range
of complex contingent claims involving equity risk, interest rate risk and mortality
risk. Historical experience shows that very long-term products, like GAOs or EIAs,
are significantly exposed to unanticipated changes over time in the mortality rates
(mortality risk) as well as to the financial risk. This means that the fair valuation tech-
niques well accepted in the financial market need to be integrated with an accurate
assessment of future mortality rates.
The advantage of an integrated framework of market consistent valuation is twofold.
Firstly, it gives more realistic premiums and reserves, and secondly, it quantifies the
risk associated with the underlying mortality dynamics. Having introduced stochastic
mortality dynamics, we can further study possible ways of transferring the system-
atic mortality risk to other parties. One possibility is to introduce mortality linked
securities and/or derivatives (such as the recently innovative issuance of the longevity
bonds by the European Investment Bank and BNP Paribas and the 3-year short-term
mortality-linked securities by Swiss Re). Here, the risk premium is linked to the evo-
lution of the mortality dynamics, thereby transferring the systematic mortality risk
to the holders of the contracts. More importantly, mortality risk, which in essence
is not much different from equity risk or interest rate risk, can
be incorporated into the (generalized) financial market through the development of a
secondary mortality risk market. We believe this can in turn enhance the overall
risk management in the insurance business.
FAQ
The financial approach is, of course, not free from trouble when applied to insur-
ance and annuity products. Here is a list of some concerns that have been raised about
applying a financial approach to the insurance market.
(1) Q: A deep transparent and active insurance market does not exist. Actually,
most insurance products are non-tradable. What is the implication of arbitrage
free prices for insurance products and how can we obtain market prices for the
underlying contracts?
A: When developing the mathematical framework, we describe how insurance con-
tracts should be priced if they were traded in a perfectly liquid, frictionless and
arbitrage-free market.
Naturally, we don't claim that real-world markets are perfectly liquid or fric-
tionless. However, the logic behind this is that if prices are calculated in
an arbitrage-free manner, then even an illiquid market with frictions will be
arbitrage-free. Conversely, if we were to propose a pricing framework that vi-
olates the no-arbitrage conditions, then the possibility of arbitrage would
emerge over time as the market becomes more liquid or trading costs begin to
fall, according to Cairns, Blake, and Dowd (2006a).
However, other techniques must be used to estimate the “market value”, which
we will discuss in more detail when we come to a more concrete context.
(2) Q: Historically, the life insurance companies have dealt with the financial and
(systematic) mortality risks by choosing both the interest rate and the mortal-
ity rate on the safe side, as seen from the insurers’ point of view. When the
real mortality rate and investment payoff are experienced over time, the result
is usually a surplus, which is the so-called with-profit principle of insurance
pricing. Some therefore argue that risk premiums charged in insurance prod-
ucts are too high, and calibrating to those with-profit market prices may give
over-estimated mortality rates.
A: It is true that, in the past, when the tools of investment and risk management
were simple and market competition was modest, the actuarial with-profit valuation
approach was in effect for a long time. Now things have changed a lot:
financial markets have become more sophisticated and competition in insurance
markets has become stronger. Therefore the profit margin is more reasonable
nowadays.
On the other hand, we need to be careful about the meaning of “with profit”.
Indeed, nobody takes on uncertainty for free. Therefore, from a different point
of view, it is actually a risk/return problem.
A martingale measure is sought to make pricing market-consistent. Precisely,
the risk/return trade-off is ensured to be consistent among contracts involving the
same type of risk in the market. Hence, a calibrated martingale measure de-
scribes the market view of the mortality rates after risk adjustment, so it shall not
be compared directly with the physical mortality rates. The risk premium implied
by the mortality rates under the martingale measure is a reflection of the market's
risk attitude toward mortality risk.
(3) Q: What is the relationship between financial risk and mortality risk?
A: Whether interest rates and mortality rates are related is a topic of debate.
Some catastrophic events may result in large mortality losses and affect the economy
as well (such as 9/11 in 2001 or the Kobe earthquake in 1995). In those situa-
tions, it seems reasonable to think of mortality risk as being related to financial
markets. Miltersen and Persson (2005) presented a forward-mortality frame-
work that explicitly allows for interest and mortality to be related. In this
thesis, we simply assume that the two risks are independent.
Chapter 2
Arbitrage-free Pricing Framework for Mortality Contingent Claims
In this chapter, we construct the arbitrage-free pricing framework that we will use
to evaluate mortality contingent claims. In order to do so, we will first review the
arbitrage-free pricing theory, including the basic principle and key concepts. Then the
interest rate models and mortality models under the arbitrage-free pricing framework
will be briefly discussed.
2.1 Arbitrage free pricing theory
2.1.1 Basic Ideas of Arbitrage Free Pricing
A contingent claim is a contract whose payoff at time T depends on the evolution of
some underlying asset(s). For this reason, it is also called a derivative to emphasize
the fact that it is a contract written on other asset(s). In the following, we will use
the two terms interchangeably. A formal definition for contingent claims will be given
later on. Before we do that, we would like to give the basic philosophy about arbitrage
free pricing.
Using standard economic reasoning, the price of a contingent claim, like the price
of any other commodity, will be determined by market forces. In particular, it will
be determined by the supply and demand of the market, and supply and demand
will in their turn be influenced by factors such as aggregate risk aversion, liquidity
preferences, etc. Therefore it seems impossible to say anything concrete about the
"absolute" value of a contingent claim.
However, the principle of no-arbitrage and the construction of a replicating port-
folio state that we should relate the value of a contingent claim to the underlying
price(s) in a specific way if we want to avoid mispricing between the derivative and
the underlying price(s) existing in the market. In other words, the contingent claim
pricing should be consistent with the market. Therefore the task here under the ar-
bitrage free principle is not to price the derivative in some “absolute” sense, which is
very important to bear in mind. Instead, it is to seek a relative value of the contingent
claim which usually can be expressed in terms of the market prices of the underlying
asset(s).
We now turn to the mathematical introduction of the above described idea, which
includes the concepts of self-financing portfolio, arbitrage, arbitrage free principle,
and so on. The notation from Bjork’s book (1998) is adopted.
Let us consider a financial market consisting of different assets such as stocks,
bonds with different maturities, or various kinds of financial derivatives. The filtered
probability space for this market is denoted by (Ω, F, {F_t}, P). Assume that there is an
adapted short-rate process r(t) (such that ∫_0^t |r(s)| ds < ∞ for every t ≥ 0, P-a.s.)
representing the continuously compounded rate of interest on riskless securities. This
can be formalized by assuming the presence in the market of a money-market account,
with account value process B(t) defined by B(t) = exp(∫_0^t r(s) ds), representing the
amount of money available at time t from investing one unit at time 0 in a risk-free
deposit account and "rolling over" the proceeds until time t.
For the moment, we will also take the price dynamics of the various assets as given.
Furthermore, we assume that the assets under consideration do not pay dividends for
the sake of simplicity. Then we give the following definition.
Definition 2.1.1. Let the N-dimensional price process be denoted by {S(t); t ≥ 0}.

1. A portfolio strategy (most often simply called a portfolio) is any F_t^S-adapted
N-dimensional process {h(t); t ≥ 0}.

2. The value process V^h corresponding to the portfolio h is given by

V^h(t) = Σ_{i=1}^N h_i(t) S_i(t).

3. A portfolio h is called self-financing if the value process V^h satisfies the con-
dition

dV^h(t) = Σ_{i=1}^N h_i(t) dS_i(t), or dV^h(t) = h(t) dS(t).
A self-financing portfolio is an important financial concept. It basically means
that, after initiation, the portfolio allows neither exogenous infusion nor withdrawal
of money. In other words, the purchase of a new portfolio must be financed solely by
selling assets already in the portfolio. Based on the notion of a self-financing portfolio, the definition
of arbitrage follows.
Definition 2.1.2. An arbitrage possibility on a financial market is a self-financing
portfolio h such that

V^h(0) = 0, (2.1.1)

P(V^h(T) ≥ 0) = 1, (2.1.2)

P(V^h(T) > 0) > 0. (2.1.3)
We say that the market is arbitrage free if there are no arbitrage possibilities.
An arbitrage possibility can be seen as the possibility of making a positive amount
of money out of nothing without taking any risk. It is thus essentially a riskless money
making machine or, as we used to say, a free lunch on the financial market. Therefore,
an arbitrage possibility is a serious case of mispricing in the market, and our main
assumption is that the market is efficient in the sense that no arbitrage is possible.
From the principle of no-arbitrage, we can conclude that
Proposition 2.1.3. (The law of one price) If two portfolios A and B give rise to
identical (but possibly random) future cash flows with certainty, then A and B must
have the same value at the present time.
It is natural to ask how we can identify whether a market admits an arbitrage possibility.
This is answered by the Fundamental Theorem of Asset Pricing, the result
that is central to everything under the no-arbitrage pricing approach.
Theorem 2.1.4. The Fundamental Theorem of Asset Pricing (Harrison
and Pliska, 1981)
Consider a market model consisting of a money-market account B(t) and N asset
price processes S1(t), S2(t), · · · , SN(t) on the time interval [0, T ].
(i) The market model is free of arbitrage if and only if there exists an equivalent
martingale measure, i.e. a measure Q ∼ P such that the discounted price
processes

S_1(t)/B(t), S_2(t)/B(t), · · · , S_N(t)/B(t) (2.1.4)

are martingales under Q for all 0 < t ≤ T.

(ii) If (i) holds, then the market is complete if and only if Q is unique.
It is thus clear that the requirement of an arbitrage free market will impose some
restrictions on the behavior of all price (value) processes that exist in the market.
That is, the martingale property of (2.1.4) has to be applied to the price process of
each asset simultaneously.
The existence of an equivalent martingale measure under which the discounted price
processes are martingales implies that we can find a Radon-Nikodym density process f on
F_T which transforms the P-measure into a risk-neutral measure Q. In a risk-neutral
world, all security prices grow on average at the risk-free rate; this requires an
adjustment in the dynamics of the uncertainty to reflect the investors' risk aversion
toward the underlying risk. This adjustment can be related to a so-called market
price of risk, and it ensures that a risk premium is charged consistently with regard
to the same risk in the market.
This approach has been extended to use more general numeraires (basically any
tradable assets, including the risk-free bank account) as the reference, often referred
to as the martingale approach. The martingale valuation approach is, so far, the
most general approach for arbitrage free pricing. It is also extremely efficient from
a computational point of view. We now turn to the pricing problem for contingent
claims.
Definition 2.1.5. Let {S(t); t ≥ 0} be the N-dimensional price process. A contin-
gent claim with date of maturity (exercise date) T, also called a T-claim, is
any stochastic variable X ∈ F_T^S. A contingent claim X is called a simple claim if it
is of the form

X = Φ(S(T)).
The function Φ is called the contract function.
Simply, a contingent claim can be viewed as a contract, which stipulates that the
holder of the contract will obtain X (which can be positive or negative) at the time
of maturity T. The requirement that X ∈ F_T^S means that, at time T, it will actually
be possible to determine the amount of money to be paid out based on the evolution
of the price process S(t) up to and including time T. We see that the European call is a
simple contingent claim, for which the contract function is given by

Φ(x) = max[x − K, 0],

where K is the strike price.
We will use the standard notation Π(t;X )(0 ≤ t < T ) for the price process of the
claim X , where we sometimes suppress the X . In the case of a simple claim we will
sometimes write Π(t; Φ).
To obtain a “reasonable” price process Π(t;X ), we consider the “primary” market
B,S1, · · · , Sn as given a priori. We assume that the primary market is arbitrage free.
Then the derivative should be priced in a way that is consistent with the prices of
the underlying assets. More precisely, we should demand that the extended market
Π(t;X ), B, S1, · · · , Sn is also free of arbitrage possibilities. As an application of the
Fundamental Theorem of Asset Pricing, we obtain the following.
Theorem 2.1.6. (Risk Neutral Valuation Formula) The arbitrage free price pro-
cess for the T-claim X is given by

Π(t; X) = E_Q[ e^{−∫_t^T r(s) ds} X | F_t ], (2.1.5)

where Q is a (not necessarily unique) martingale measure.
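As a sketch of how (2.1.5) is applied in practice, the following Python example values a simple T-claim Φ(S(T)) = max(S(T) − K, 0) by Monte Carlo under an assumed risk-neutral model (geometric Brownian motion with a constant short rate, chosen only for illustration) and compares the result with the corresponding closed-form benchmark; all parameter values are hypothetical.

import numpy as np
from math import log, sqrt, exp, erf

# Hypothetical inputs under the risk-neutral measure Q (placeholders only).
S0, K, r, sigma, T = 100.0, 105.0, 0.03, 0.20, 2.0
n_sim = 200_000
rng = np.random.default_rng(1)

# Monte Carlo version of (2.1.5): discount the simple claim X = max(S(T) - K, 0).
Z = rng.standard_normal(n_sim)
S_T = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)   # GBM under Q
payoff = np.maximum(S_T - K, 0.0)
price_mc = np.exp(-r * T) * payoff.mean()

# Closed-form benchmark (Black-Scholes) for the same simple claim.
Phi = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))
d1 = (log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
d2 = d1 - sigma * sqrt(T)
price_cf = S0 * Phi(d1) - K * exp(-r * T) * Phi(d2)

print(f"Monte Carlo price: {price_mc:.4f}   closed form: {price_cf:.4f}")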
In practical applications, there is no guarantee that any specific stochastic asset
model will be arbitrage free, and an analytic solution for the Radon-Nikodym (in
short, RN) density may be difficult to derive. For these reasons, the first step is
usually to hypothesize a tractable analytic form for the stochastic asset model in
tandem with development of the RN density. Then the model is calibrated to market
prices or estimated prices of the underlying asset.
If the market is incomplete, there are infinitely many equivalent martingale mea-
sures making (2.1.5) hold. Indeed, expression (2.1.5) determines a whole (open)
interval of prices consistent with the absence of arbitrage. To narrow down the price,
we again need to identify measure Q through calibration to observed prices. It is
worth stressing that this approach is quite effective because the practical application
requires interpolating rather than extrapolating. As a result, the no-arbitrage pric-
ing approach is regarded as being close to reality, and can provide a very valuable
benchmark in helping to determine and examine insurance company share prices and
transaction values.
2.1.2 The Term Structure of Interest Rate
In this section, we restrict ourselves to interest rate risk and the bond market, and we
investigate a number of modelling aspects of interest rate models.
We consider a filtered probability space (Ω, F, {F_t}, P) and assume there is an
adapted short-rate process r(t) (such that ∫_0^t |r(s)| ds < ∞ for every t ≥ 0, P-a.s.) as
in the previous section.
The simplest financial product involving interest rate is the zero-coupon bond.
Definition 2.1.7. A T -year zero-coupon bond (denoted as T -bond) is a contract
which pays $1 to the bond holder at time T . Here, T is the maturity time of the bond.
Now let us assume there exists a continuously trading (frictionless) bond market
for every T > 0. Let D(t, T) denote the T-bond price at time t. Before we begin to
discuss the modelling and pricing of the T -bonds, some insight about the relationship
between D(t, T ) and two variables: t (current time) and T (term to maturity), might
be useful and interesting (see Bjork, 1998).
• For fixed time t, D(t, T ) as a function of T gives the prices for bonds of all
possible maturities at time t. The pattern of this function in terms of T is
closely related to “the term structure of interest rates at t”. Typically it will
be a smooth curve, which allows us to make the assumption that, for each t,
D(t, T ) is differentiable w.r.t. T .
• For fixed time T , D(t, T ) is a scalar stochastic process. This process gives the
prices, at different times, of the bond with fixed maturity T , and the trajectory
will typically be irregular depending on how the prevailing interest rate changes
in response to the market.
Hence, we can expect that D(t, T ) presents its relationship with the underlying term
structure of interest rates through the current time t and the term to maturity T . In
particular, D(T, T ) = 1 to avoid arbitrage.
Now we will use the fundamental theorem of asset pricing to derive a relationship
between D(t, T ) and r(t). As we have shown before, the absence of arbitrage for the
bond market implies the existence of a martingale or risk-neutral probability measure
Q such that for all T > 0, the discounted price process
V(t, T) = e^{−∫_0^t r(u) du} D(t, T), 0 ≤ t ≤ T, (2.1.6)
is a martingale. Thus we have
V (t, T ) = EQ[V (T, T )|Ft],
which leads to a relationship between the bond price and the short rate:

D(t, T) = E_Q[ e^{−∫_t^T r(u) du} | F_t ]. (2.1.7)
Furthermore, for each fixed t, we assume the bond price D(t, T ) is differentiable
w.r.t. the time of maturity T (this is reasonable as we have discussed before). Then
define

f(t, T) = −∂ ln D(t, T) / ∂T. (2.1.8)

f(t, T) is called the time-t instantaneous forward interest rate. Obviously, (2.1.8) can
be rewritten as

D(t, T) = e^{−∫_t^T f(t,u) du}. (2.1.9)

Taking the limit as T → t, we have

r(t) = lim_{T→t} f(t, T). (2.1.10)
For a fixed t, the relations (2.1.7), (2.1.8), (2.1.9) and (2.1.10) show that, among the
dynamics of bond prices, forward rates and short rates, any one uniquely determines the
other two. More precisely, on the one hand, the term structure of
interest rates in terms of r(t) or f(t, T) can be derived from the bond price term
structure; on the other hand, the price of a zero-coupon bond can be determined from
the term structure of short rates or the term structure of forward rates. Moreover, the
dynamics of short rates can be uniquely derived from the dynamics of forward rates,
and vice versa. As a result, the above relationships (2.1.7), (2.1.8), (2.1.9) and (2.1.10)
allow us to model the bond market in many different ways.
1. We may specify the dynamics of the short rate, then derive bond price processes
using the no-arbitrage pricing formula (2.1.7). The models developed from the
short rate are usually referred to as short-rate models, such as those of Vasicek
(1977), Cox, Ingersoll, and Ross (1985) and Hull and White (1990); a small numerical sketch of this approach is given after this list.
2. We may specify the dynamics of all possible forward rates, and then use (2.1.9)
to obtain bond prices, as in the models developed by Heath, Jarrow, and Morton
(1992). Usually the forward rate dynamics have to satisfy certain conditions
required by arbitrage-free arguments.
3. We may directly specify the dynamics of all possible bonds as the models devel-
oped in Flesaker and Hughston (1996), Rogers (1997) and Rutkowski (1997).
4. We may specify the market modelling framework for the forward bond prices
such as the LIBOR and swap market models in Brace, Gatarek and Musiela
(1997) and Jamshidian (1997).
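A minimal sketch of the first modelling route: assume Vasicek dynamics for the short rate under Q, estimate the T-bond price by Monte Carlo through (2.1.7), and compare with the Vasicek closed-form price. All parameter values below are hypothetical.

import numpy as np

# Hypothetical Vasicek parameters dr = a(b - r) dt + sigma dW under Q (placeholders).
a, b, sigma, r0, T = 0.15, 0.05, 0.01, 0.03, 10.0
n_steps, n_sim = 1_000, 20_000
dt = T / n_steps
rng = np.random.default_rng(2)

# Simulate the short rate, accumulate the integral of r, and apply (2.1.7).
r = np.full(n_sim, r0)
integral = np.zeros(n_sim)
for _ in range(n_steps):
    integral += r * dt
    r += a * (b - r) * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_sim)
D_mc = np.mean(np.exp(-integral))

# Vasicek closed-form bond price for comparison.
B = (1.0 - np.exp(-a * T)) / a
A = np.exp((b - sigma**2 / (2 * a**2)) * (B - T) - sigma**2 * B**2 / (4 * a))
D_cf = A * np.exp(-B * r0)

print(f"Monte Carlo D(0,T): {D_mc:.4f}   closed form: {D_cf:.4f}")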
2.2 The Term Structure of Mortality Under the Arbitrage Free Framework
In this section, we aim at developing a theoretical framework to price products whose
benefits are contingent on the uncertainties not only from financial risk sources (like
interest rate and/or equity risk) but also from mortality risk. Here and throughout
we have borrowed the relevant stochastic mortality notation of Cairns, Blake, and
Dowd (2006a).
2.2.1 Basic Building Blocks
Now consider a generic individual aged x at time 0, whose random residual lifetime
is denoted as τx. Suppose the individual belongs to a homogeneous group of persons
(in particular, of the same age and with the same health status) whose random
residual lifetimes can be considered identically distributed, and let N denote the
number of persons in the initial group. We assume there exists a stochastic process
µ(t, x + t), representing the instantaneous hazard rate for an individual aged x + t at
time t. This can be formalized by assuming the presence of a survival index S(t, x),
which, theoretically, can be compiled from observing the number, N(t), of remaining
survivors of the group at time t. In particular,

S(t, x) = N(t) / N. (2.2.1)
Moreover, we assume N(t) is large enough throughout the observation period for the
concerned group so that S(t, x) is a left-limited, right-continuous curve, which allows
us to write down the following expression

S(t, x) = exp( −∫_0^t µ(s, x + s) ds ) (2.2.2)

to relate the hazard rate process µ(t, x + t) with the survival index. For a more formal
introduction of µ(t, x + t) in an intensity-based framework, see Biffis (2005) and Biffis
and Millossovich (2006).
We note that if µ(t, x + t) is deterministic then S(t, x) is equal to the survival
probability that an individual aged x at time 0 will survive to age x + t. In this
situation, µ(t, x + t) coincides with the force of mortality as defined in section 1.1.1.
However, here µ(t, x+ t) is stochastic. Then looking forward from time 0, this means
that S(t, x) is a random variable. In this case, S(t, x) can only be regarded as a
survival probability if one observes at time t rather than at time 0. In the following
the law of iterated expectations will be used to obtain the time-s survival probabilities
over the time horizon (t, T ] (for fixed 0 ≤ s ≤ t ≤ T ), and conditional on the event
τx > t.
First, let us denote by P(s, t, T, x) the survival probability, measured at time s, that a life aged x at time
0 survives from time t to T. Specifically, P(0, 0, T, x) is the
survival probability that (x) survives to time T as measured at time 0; P (T +1, T, T +
1, x) the survival probability from time T to T + 1 as measured at T + 1, which is
actually deterministic since we have observed the mortality experience up to time
T + 1.
Let M_s be the filtration generated by µ(u, x + u) up to time s; that is, M_s
includes full information about changes in mortality up to and including time s, but
no information about how mortality rates will develop after time s. Define
Y_x(T) = 1 if the individual is alive at time T, and Y_x(T) = 0 if the individual is dead at time T.
Then we have

P(s, t, T, x) = P[τ_x > T | τ_x > t, M_s]
= P[Y_x(T) = 1 | Y_x(t) = 1, M_s]
= E[Y_x(T) | Y_x(t) = 1, M_s]
= E[ E{Y_x(T) | Y_x(t) = 1, M_T} | M_s ]
= E[ S(T, x) / S(t, x) | M_s ]
= E[ exp( −∫_t^T µ(u, x + u) du ) | M_s ].
Taking specific values for s, t, or T, we have

P(0, 0, T, x) = P[τ_x > T] (2.2.3)
= E[S(T, x)] (2.2.4)
= E[ exp( −∫_0^T µ(s, x + s) ds ) ], (2.2.5)

P(t, t, T, x) = P[τ_x > T | τ_x > t, M_t] (2.2.6)
= E[ S(T, x) / S(t, x) | M_t ] (2.2.7)
= E[ exp( −∫_t^T µ(s, x + s) ds ) | M_t ]. (2.2.8)

And

P(T + 1, T, T + 1, x) = exp( −∫_T^{T+1} µ(s, x + s) ds ). (2.2.9)
Note
1. We haven’t specified the probability measure we use in the above calculation
of probabilities and expectations. You can imagine we use a real-world (or
physical) probability measure P so far. In the future, we will mainly work
under a risk-neutral or some other martingale measure when it is relevant.
2. Let P_P(t, t, T, x) denote the survival probability calculated under measure P.
P_P(t, t, T, x) can be compared with its deterministic analogue, T−t p_{x+t}. On the
one hand, let µ̄(t, x + t) be a deterministic unbiased estimate of the force of
mortality, that is, µ̄(t, x + t) = E(µ(t, x + t)). Then, in usual actuarial notation,
we have

T−t p_{x+t} = exp( −∫_t^T µ̄(s, x + s) ds ). (2.2.10)

On the other hand, define

µ_P(t, T, x + T) = −d ln P_P(t, t, T, x) / dT. (2.2.11)

Hence,

P_P(t, t, T, x) = exp( −∫_t^T µ_P(t, u, x + u) du ). (2.2.12)
Since, by Jensen’s inequality,
E
[exp
(−∫ T
t
µ(s, x + s) ds
)]> exp
(−∫ T
t
E(µ(s, x + s)) ds
).
Therefore, we usually have
µP (t, u, x + u) 6= µ(u, x + u).
3. Let P_Q(t, t, T, x) denote the survival probability calculated under a risk-neutral
measure Q. We will show later that P_Q(t, t, T, x) can be backed out from
market prices of pure endowments or other related products. For this reason,
P_Q(t, t, T, x) can be viewed as a market pricing survival probability, also referred
to as a spot survival probability as in Cairns, Blake, and Dowd (2006a).
Similarly, we can define

m(t, T, x + T) = −d ln P_Q(t, t, T, x) / dT, (2.2.13)

or,

P_Q(t, t, T, x) = exp( −∫_t^T m(t, u, x + u) du ). (2.2.14)

m(t, T, x + T) is then called the forward force of mortality, or sometimes short-
ened to forward mortality, analogous to the time-t instantaneous forward
interest rate. If T = t, we get the spot force of mortality:

µ(t, x + t) = lim_{T→t} m(t, T, x + T). (2.2.15)

Also, P_Q(s, t, T, x) is referred to as a forward survival probability, the probability
that an individual aged x + t at time t, if alive, will survive to time T, based
on the information on mortality rates available up to time s, M_s.
4. We notice that there are similarities in mathematical structure between the
interest rate setting and the mortality rate setting (see (2.1.7), (2.1.8), (2.1.9),
(2.1.10) together with (2.2.8), (2.2.13), (2.2.14) and (2.2.15)).
5. Define:

B(t, T, x) = E[ exp( −∫_0^T µ(s, x + s) ds ) | M_t ] = e^{−∫_0^t µ(s,x+s) ds} P(t, t, T, x).

Then B(t, T, x) is a martingale under the corresponding measure. This property
may allow us to derive the PDE for P(t, t, T, x) on [0, T] × R_+ provided the
stochastic process µ(t, x + t) is defined by a diffusion process (see Dahl, 2004,
p. 116, formula (3.2)), which is analogous to the differential equation for zero-
coupon bond prices obtained when working with a stochastic interest rate model
(see e.g. Bjork (1997, Proposition 3.4)).
Also,

P_Q(s, t, T, x) = exp( −∫_t^T m(s, u, x + u) du ) = B(s, T, x) / B(s, t, x). (2.2.16)
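A toy sketch of P(0, 0, T, x) = E[exp(−∫_0^T µ(s, x+s) ds)] under a stochastic force of mortality: here µ is taken, purely for illustration, to be a Gompertz-type baseline multiplied by a single lognormal level with mean one. The sketch also displays the Jensen-type gap relative to using the expected force of mortality, as noted in point 2 above. All parameter values are hypothetical.

import numpy as np

rng = np.random.default_rng(3)
x, T, n_steps, n_sim = 65, 20, 200, 20_000
dt = T / n_steps
ages = x + dt * np.arange(n_steps)

# Toy stochastic force of mortality: Gompertz baseline times one lognormal factor per path.
A_gomp, c_gomp, sigma = 5e-5, 1.10, 0.15          # hypothetical parameters
baseline = A_gomp * c_gomp ** ages                 # deterministic Gompertz shape
Z = rng.standard_normal(n_sim)
level = np.exp(sigma * Z - 0.5 * sigma**2)         # E(level) = 1, so E(mu) equals the baseline

# P(0,0,T,x) = E[exp(-integral of mu)] versus exp(-integral of E(mu)) (Jensen's inequality).
integral = np.outer(level, baseline).sum(axis=1) * dt
P_stochastic = np.mean(np.exp(-integral))
P_expected_mu = np.exp(-baseline.sum() * dt)

print(f"E[exp(-int mu)]   = {P_stochastic:.4f}")
print(f"exp(-int E(mu))   = {P_expected_mu:.4f}   (smaller, by Jensen)")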
2.2.2 The Generalized Financial/Insurance Market
We are now in a position to consider the arbitrage free pricing approach in a gener-
alized financial/insurance market. Our generalized market includes two fundamental
types of financial contract:
• zero coupon bonds for a full range of terms to maturity;
• pure endowment contracts for a full range of ages and terms to maturity.
Briefly, a zero-coupon bond with maturity T is a contract which pays $1 to the
bond holder at time T . We denote it by T -bond. A pure endowment with maturity
T for (x) is a contract which pays $1 to the policyholder at time T if he/she is still
alive at time T . We denote it by (T, x)-endowment.
We assume there exists a continuously trading (liquid, frictionless) bond and en-
dowment market for every T > 0 and every age x > 0. For the sake of simplicity,
default risk of the contract issuer is ignored for both types of products. Let D(t, T)
denote the time t T -bond price, and E(t, T, x) the time t (T, x)-endowment price. Let
B_t be the combined filtration for both the term structure of interest rates and the mor-
tality rates, e.g. B_t = F_t ∨ M_t. The absence of arbitrage is essentially equivalent
to the existence of an equivalent martingale measure Q under which the discounted
prices of any contingent claims in this combined market are martingales. Then we
obtain

D(t, T) = E_Q[ e^{−∫_t^T r(u) du} | F_t ], (2.2.17)

and

E(t, T, x) = E_Q[ e^{−∫_t^T r(u) du} I_{τ_x>T} | I_{τ_x>t} = 1, B_t ], (2.2.18)

where I_{τ_x>t} is the indicator function of the event {τ_x > t}.
Assuming, furthermore, independence between the interest rate and the
mortality rate, we have

E(t, T, x) = E_Q[ e^{−∫_t^T r(u) du} | F_t ] E_Q[ I_{τ_x>T} | I_{τ_x>t} = 1, M_t ]
= E_Q[ e^{−∫_t^T r(u) du} | F_t ] E_Q[ e^{−∫_t^T µ(u, x+u) du} | M_t ]
= D(t, T) P(t, t, T, x). (2.2.19)
Here, P (t, t, T, x) is a spot survival probability, calculated under the martingale mea-
sure Q with Q being omitted for simplicity of notation. Assuming pure endowment
prices are available from the insurance market, we then can derive the spot (market
pricing) survival probability from bond and endowment prices:
P(t, t, T, x) = E(t, T, x) / D(t, T). (2.2.20)
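A small sketch of (2.2.19)–(2.2.20): given zero-coupon bond prices and pure endowment prices (both invented below), the implied spot survival probabilities follow by division; conversely, under the independence assumption an endowment price is the product D(t, T) P(t, t, T, x).

# Hypothetical market quotes at time t = 0 for maturities T = 5, 10, 15 (illustrative only).
bond_prices = {5: 0.86, 10: 0.74, 15: 0.63}          # D(0, T)
endowment_prices = {5: 0.80, 10: 0.62, 15: 0.45}     # E(0, T, x) for a fixed age x

# (2.2.20): implied spot (market-pricing) survival probabilities.
implied_survival = {T: endowment_prices[T] / bond_prices[T] for T in bond_prices}
print("implied P(0,0,T,x):", {T: round(p, 4) for T, p in implied_survival.items()})

# (2.2.19): re-price a pure endowment from a survival model and the bond curve.
model_survival_20 = 0.55                              # hypothetical model value P(0,0,20,x)
bond_price_20 = 0.54                                  # hypothetical D(0,20)
print("model endowment price E(0,20,x):", bond_price_20 * model_survival_20)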
Therefore, if the (T, x)-endowment prices were available from the market, we
could use P(t, t, T, x) to calibrate the mortality model, in the same way as D(t, T) is
used to calibrate the interest rate model. However, the problem here is that a pure
endowment is not a tradable asset like a T-bond. In general, the insurance market is
not considered as a liquid, frictionless market as is the bond market. In particular,
the insurance market is a market where insurers take short positions in insurance
contracts, whilst insureds take only long positions. However, these trading constraints
can be weakened where (re)insurers exchange books of policies, so that both long and
short positions can be taken on insurance contracts. Hence, obtaining the "fair value"
for the (T, x)-endowment is mainly a matter of implementation.
Thus, in the following, depending on the type of contracts under valuation, suit-
able basic insurance contracts will be assumed to be traded continuously in the market
and represent the primitive securities used for arbitrage pricing. For example, when
valuing annuities, pure endowments of (possibly) every maturity will be implicitly
taken as primitive securities.
We stress at this point, though, that the risk neutral measure Q might not be
uniquely determined due to market incompleteness. Instead the choice of Q becomes
part of the modelling process. As mortality-linked securities begin to emerge and
as we gather market price data we can then test for the validity of our assumptions
about Q. For further discussion of the relationship between P and Q, the reader is
referred to Dahl (2004), Biffis (2005), and Cairns, Blake, and Dowd (2006b).
2.3 Review of stochastic mortality models under
the no-arbitrage framework
“We must always be prepared to demonstrate that a proposed model gives a good
approximation to what we observe in reality and that it is appropriate for the task in
hand. All models are approximations to reality but, of course, some are better than
others.”
Cairns: Interest Rate Models: An Introduction (2004)
In this section, we set out criteria for stochastic mortality models and review the
recently proposed models, highlighting the specific requirements that arise for mortality.
2.3.1 Criteria for Term Structure of Mortality Models
To model mortality as a stochastic process, it is reasonable to require that any “plau-
sible” mortality model would meet the following criteria (Cairns, Blake, and Dowd,
2006a):
1. The model should keep the force of mortality positive.
2. The model should be consistent with historical data.
3. The model should be comprehensive enough to deal appropriately with the
pricing, hedging, or risk managing problem.
For the valuation of mortality-related derivatives, the numerical results should
be consistent with our intuition about the risk/return relationship. That is, the
value of the derivative should reflect the stochastic evolution of µ(t, x): the greater
the volatility in mortality rates, the greater the value of a mortality option.
4. It will absolutely be an advantage if the model makes it possible to value the
most common mortality-linked derivatives using analytical methods or using
fast numerical methods. However, this criterion is one of convenience only, and
should not be allowed to override the other criteria, which are more important
because they are criteria of principle. In other words, we should not drop one
of the other criteria merely to obtain an easy (for example, analytical) solution
to the problem at hand.
Now we would like to put more concrete meanings on criterion 2 of being consistent
with historical data.
• In general, the model should give a biologically reasonable age-specific mortality
pattern for a cohort, and should be coherent with observed cohort relationships.
For example, one might rule out the possibility of an “inverted” mortality curve
(that is, one in which mortality rates for the elderly fall with age). An “inverted”
curve is not only unreasonable a priori, but also conflicts with the normal
upward-sloping curves that are always observed in historical data.
At the same time, younger generations normally experience lower mortality rates
than their older peers. This should rule out deteriorating mortality; in other words,
for a fixed age x, the mortality rate should be decreasing over time (i.e. across
consecutive cohorts).
• The model should be flexible enough to reflect the rectangularization and ex-
pansion phenomena: while the former reduces the variability of time-until-death
randomness around its expectation, the latter increases the expectation of the
future lifetime. As a consequence, the risk due to random fluctuations in mor-
tality tends to decrease, while at the same time the longevity risk due to the
randomness in future mortality trends increases.
It is clear that any candidate model for µ must be able to capture the dynamics
just described.
• Recall that mortality risk comes from the difference between actual realized
mortality rates and those implied by an adopted projection. In terms of a pa-
rameterized model, the difference can be categorized into two sources: random
fluctuations from a given parameter (survival function), and the systematic
deviations of the parameter (survival function).
The model should thus focus more on the systematic deviations related to the
choice of a specific projection, as described in Figure 1.13. So far, we have seen
that Pitacco's model and the Lee-Carter model are two different methods of
describing systematic deviations. In the following subsection, we will see that
Dahl's mean-reversion model focuses mainly on the random fluctuations around
a pre-specified long-term target, which is viewed as a disadvantage in a model
for stochastic mortality.
In particular, long-term dynamics in the course of mortality improvements
should not be mean-reverting to a pre-determined target level, even if this target
is time dependent and incorporates mortality improvements.
As Cairns, Blake, and Dowd (2006a) noted: the inclusion of mean-reversion
would mean that if mortality improvement has been faster than anticipated
in the past then the potential for further mortality improvements will be sig-
nificantly reduced in the future. In extreme cases, significant past mortality
improvements might be reversed if the degree of mean reversion is too strong.
Such extreme mean reversion is difficult to justify on the basis of past mortality
experience. As we look into the future, it is even more difficult to predict what
medical advances there might be, when they will happen, and what impacts
they will have on survival rates. All of these uncertainties rule out strong mean
reversion in a model for stochastic mortality.
2.3.2 A Brief Review of Existing Stochastic Mortality Models
In this subsection, we review the existing stochastic mortality models recently pro-
posed. We classify those models into three groups according to their model formulation.
Reduction factor approach
Ballotta and Haberman (2006) proposed, under the real probability measure P ,
that the dynamics of the hazard process µ(y, u) for a person attaining age y in future
year u follows:
µ(y, u) = µ(y, 0)RF (y, u) (2.3.1)
where µ(y, 0) is the hazard rate for a person aged y in the base year (i.e. year 0),
and RF(y, u) is the reduction factor for period [0, u] for age y. In particular, taking
y = x + z and u = t + z, formula (2.3.1) can be rewritten as
    µ(x + z, t + z) = µ(x + z, 0) e^{(α + β(x+z))(t+z) + σ_h Y_{t+z}},             (2.3.2)
where (Y_t, t ≥ 0) is a stochastic process on (Ω, F, F_t, P) describing random variations
in the projected trend:
    dY_t = −a Y_t dt + dB_t,   Y_0 = 0,                                          (2.3.3)
where B_t is a standard one-dimensional P-Brownian motion, independent of the
sources of randomness existing in the financial market. µ(x + z, 0) is modelled as
    µ(x + z, 0) = a_1 + a_2 R + e^{b_1 + b_2 R + b_3(2R^2 − 1)},   R = ((x + z) − 70)/50,   x ≥ 50.   (2.3.4)
It is seen that the random variation process Y_t is an Ornstein-Uhlenbeck process,
and has the property of mean reversion, with the parameter a measuring the speed
of mean reversion to the long-run mean, which is set equal to zero.
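The reduction factor dynamics (2.3.2)-(2.3.3) are straightforward to simulate. The following Python sketch uses the exact one-step transition of the Ornstein-Uhlenbeck process; the parameter values and the Gompertz-type base curve standing in for (2.3.4) are purely illustrative, not the fitted CMI values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (not the fitted values of Ballotta and Haberman, 2006).
a, sigma_h = 0.1, 0.05          # OU mean-reversion speed and volatility loading
alpha, beta = -0.03, 0.0002     # trend parameters in the exponent of (2.3.2)
x, dt, n_steps = 65, 1.0, 25    # cohort age, annual step, horizon in years

def mu0(age):
    """Base-year hazard; a simple Gompertz placeholder for the CMI curve (2.3.4)."""
    return 5e-5 * np.exp(0.1 * age)

# Exact one-step transition of the OU process dY = -a Y dt + dB (unit Brownian volatility).
Y = 0.0
mu_path = []
for k in range(1, n_steps + 1):
    Y = Y * np.exp(-a * dt) + np.sqrt((1 - np.exp(-2 * a * dt)) / (2 * a)) * rng.standard_normal()
    age = x + k                                   # cohort attains age x + k after k years
    mu_path.append(mu0(age) * np.exp((alpha + beta * age) * k + sigma_h * Y))

print(np.round(mu_path[:5], 6))
```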
The model (2.3.4) for the hazard rate for the base year has been proposed by
the CMI Bureau (1999) for the UK standard tables for annuitant and pensioner
population for the period 1991-1994.
Ballotta and Haberman's model can be viewed as an extension of Milevsky and
Promislow's mean reverting Brownian Gompertz (MRBG) model (2001), where the
dynamics of the hazard rate process for a fixed cohort is described by:
    µ(t) = µ(0) e^{gt + σY_t},   g, σ, µ(0) > 0,                                  (2.3.5)
    dY_t = −a Y_t dt + dB_t.                                                     (2.3.6)
Ballotta and Haberman's model is obtained from Milevsky and Promislow's model if
the parameters µ(0) and g are made age-dependent.
It is worth pointing out that the feature of mean reversion in the process Y_t (see
formula (2.3.3) or (2.3.6)), and particularly reverting to a long-run value equal to zero,
ensures that the fluctuation is centered around the pre-specified curve. For this reason,
this type of mean-reverting process captures largely just random fluctuations. In
order to enhance the model's ability to describe systematic deviations in mortality
dynamics, Ballotta and Haberman introduced a random parameter α using the same
idea as in formula (1.3.3). Ad hoc assumptions are made: α ∈ {−0.01, −0.03, −0.05}
with probabilities p = (1/3, 1/3, 1/3) or p = (0.2, 0.2, 0.6) to represent different beliefs
about future mortality development.
There is no closed form for survival probabilities, because the (integral) sum of
lognormal variates is not lognormal. Thus, numerical calculation is conducted by
Monte Carlo simulation for both models.
Affine approach
Dahl (2004) proposed a special affine mortality dynamics.
Definition 2.3.1. (Affine mortality structure). If, for fixed x, the survival probabil-
ities are given by
P (t, t, T, x) = eA(t,x,T )−B(t,x,T )µx+t (2.3.7)
for deterministic function A(t, x, T ) and B(t, x, T ), then the model for the mortality
intensity is said to possess an affine mortality structure for cohort x. If (2.3.7) holds
for all x, then the model is simply said to possess an affine mortality structure.
Affine mortality structures are of interest, since they allow survival probabilities
to be expressed by the relatively simple expression (2.3.7). Dahl
further showed that a sufficient condition for having an affine mortality structure is
to have the following form for the mortality intensity:
    dµ(x + t) = α^µ(t, x, µ(x + t)) dt + σ^µ(t, x, µ(x + t)) dB^µ_t,              (2.3.8)
    α^µ(t, x, µ(x + t)) = δ^α(t, x) µ(x + t) + ζ^α(t, x),                         (2.3.9)
    σ^µ(t, x, µ(x + t)) = sqrt( δ^σ(t, x) µ(x + t) + ζ^σ(t, x) ).                 (2.3.10)
In other words, for fixed x, an affine structure for α^µ and (σ^µ)^2 in t ensures an
affine mortality structure, with A and B solving the following differential equations:
    ∂_t B(t, x, T) + δ^α(t, x) B(t, x, T) − (1/2) δ^σ(t, x) (B(t, x, T))^2 = −1,   B(T, x, T) = 0,   (2.3.11)
    ∂_t A(t, x, T) = ζ^α(t, x) B(t, x, T) − (1/2) ζ^σ(t, x) (B(t, x, T))^2,        A(T, x, T) = 0.   (2.3.12)
Thus, provided we can solve the differential equations for A and B, an affine mortality
structure provides a closed form expression for the survival probabilities for cohort x.
If in addition α^µ and σ^µ are time independent, the sufficient condition is necessary
as well; see Duffie (2001).
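For constant coefficients, the Riccati-type equations (2.3.11)-(2.3.12) are easy to integrate numerically backwards from the terminal conditions. The sketch below does this with hypothetical coefficient values and then evaluates the survival probability (2.3.7); it is an illustration of the structure only, not a calibrated model.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical constant coefficients of the affine dynamics (2.3.9)-(2.3.10).
delta_a, zeta_a = 0.08, 1e-4     # drift coefficients delta^alpha, zeta^alpha
delta_s, zeta_s = 4e-4, 0.0      # squared-diffusion coefficients delta^sigma, zeta^sigma
T, mu_now = 20.0, 0.01           # horizon and current intensity mu_{x+t}

def rhs(t, y):
    B, A = y
    dB = -1.0 - delta_a * B + 0.5 * delta_s * B**2     # rearranged (2.3.11)
    dA = zeta_a * B - 0.5 * zeta_s * B**2              # (2.3.12)
    return [dB, dA]

# Integrate backwards from the terminal conditions B(T) = A(T) = 0 down to time 0.
sol = solve_ivp(rhs, (T, 0.0), [0.0, 0.0], rtol=1e-8)
B0, A0 = sol.y[0, -1], sol.y[1, -1]
print("survival probability from (2.3.7):", np.exp(A0 - B0 * mu_now))
```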
Although the affine mortality structure provides mathematical convenience, special
care is needed in the model specification to incorporate the statistical features of
historical mortality data. Under the affine structure, Dahl considers a model similar
to Ballotta and Haberman (2006)'s reduction factor formula. In particular, Dahl and
Møller (2006) propose a dynamic for the mortality intensity process as
    µ(x, t) = µ(x + t, 0) RF(x, t),                                              (2.3.13)
where µ(x + t, 0) is some smooth initial mortality intensity curve, e.g. µ(x + t, 0) =
α + β c^{x+t}, which can be estimated by standard statistical methods, and the reduction
factor is modelled by a special extended Cox-Ingersoll-Ross model:
    dRF(x, t) = (γ(x, t) − δ(x, t) RF(x, t)) dt + σ(t, x) sqrt(RF(x, t)) dB_t,     (2.3.14)
where γ(x, t), δ(x, t) and σ(t, x) are positive bounded functions. It can be shown
that the extended CIR model ensures strict positivity of the mortality intensity for
cohort x provided that for fixed x we have 2γ(x, t) ≥ (σ(t, x))2, for all t ∈ [0, T ], see
Maghsoodi (1996). Furthermore, the model is mean reverting around the time and
cohort dependent level γ(x, t)/δ(x, t). It then follows via Itô's formula that
    dµ(x, t) = (γ^µ(x, t) − δ^µ(x, t) µ(x, t)) dt + σ^µ(t, x) sqrt(µ(x, t)) dB_t,   (2.3.15)
where
    γ^µ(x, t) = γ(x, t) µ(x + t, 0),                                             (2.3.16)
    δ^µ(x, t) = δ(x, t) − [ (d/dt) µ(x + t, 0) ] / µ(x + t, 0),                   (2.3.17)
    σ^µ(t, x) = σ(t, x) sqrt(µ(x + t, 0)).                                        (2.3.18)
This shows that µ also follows a time-inhomogeneous CIR model. The model specifi-
cation ensures that the mortality intensity given by (2.3.15) admits an affine mortality
structure. Model performance of (2.3.15) has been investigated by simulation. The
histogram for the expected lifetime of a policyholder aged 30 is plotted. The variation
of e_30 can be viewed as an overall indicator of the uncertainty related to the mortality
schedule for (30). Specifying γ(x, t) = δ e^{−γt}, δ(x, t) = δ and σ(t, x) = σ with
parameter values (γ, δ, σ) = (0.008, 0.2, 0.02), the histogram shows that there is only
a relatively small variation associated with the expected lifetime. This actually
explains why the reserves obtained from their stochastic model show little difference
from the compatible deterministic model. We believe this partially justifies the
non-mean-reversion criterion for mortality model specification: as the mean-reverting
speed parameter approaches zero, the model generates greater variation in the expected
lifetime. See Figure 7 in Dahl and Møller (2006).
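A rough sense of this simulation exercise can be obtained from the following sketch, which applies a full-truncation Euler scheme to the reduction-factor dynamics (2.3.14) with the parameter values quoted above; the Gompertz-Makeham base curve µ(x + t, 0) used here is a hypothetical placeholder, so the resulting distribution of the expected lifetime is only indicative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Parameter values quoted above for (2.3.14); the base curve below is a placeholder.
gamma_, delta_, sigma_ = 0.008, 0.2, 0.02
alpha_g, beta_g, c_g = 5e-4, 7e-5, 1.09          # hypothetical mu(x+t, 0) = alpha + beta * c**(x+t)
x0, dt, horizon, n_paths = 30, 0.25, 90.0, 500

def mu0(age):
    return alpha_g + beta_g * c_g ** age

steps = int(horizon / dt)
e30 = np.empty(n_paths)
for p in range(n_paths):
    RF, H, life = 1.0, 0.0, 0.0                  # reduction factor, cumulative hazard, e_30
    for k in range(steps):
        t = k * dt
        mu = mu0(x0 + t) * RF                    # intensity (2.3.13)
        life += np.exp(-H) * dt                  # Riemann sum for the expected lifetime
        H += mu * dt
        # full-truncation Euler step of (2.3.14) with gamma(x,t) = delta * exp(-gamma * t)
        drift = (delta_ * np.exp(-gamma_ * t) - delta_ * RF) * dt
        RF = max(RF + drift + sigma_ * np.sqrt(max(RF, 0.0) * dt) * rng.standard_normal(), 0.0)
    e30[p] = life

print("mean of e_30:", round(e30.mean(), 2), "  std of e_30:", round(e30.std(), 3))
```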
Along another line, Biffis (2005) proposed a two-dimensional affine process Y =
(µ, µ̄), whose first component is the random intensity of mortality µ itself, while the
second component µ̄ describes the dynamics of its stochastic drift:
    dµ_t = γ_1 (µ̄_t − µ_t) dt + σ_1 sqrt(µ_t) dB^1_t,                              (2.3.19)
    dµ̄_t = γ_2 (m(t) − µ̄_t) dt + σ_2 sqrt(µ̄_t − m*(t)) dB^2_t,                    (2.3.20)
where B = (B^1, B^2) is a standard Brownian motion in R^2; γ_1, γ_2 are parameters rep-
resenting the ‘speed of mean reversion’ of µ to µ̄ and of µ̄ to m, after any fluctuations
due to B occur; the functions m and m* are bounded and continuous. The function
m is a suitable demographic basis, such as an available mortality table, acting as a
time-varying target for the stochastic drift µ̄. The function m* is a time-varying lower
boundary for the stochastic drift µ̄. It is interpreted as a more optimistic assump-
tion (in terms of mortality improvements) than that implied by m. Indeed, to make
sure that Y = (µ, µ̄) is well-defined, i.e. that µ ≥ 0 a.s. and µ̄ ≥ m*, the following
conditions are imposed: m ≥ m* ≥ 0, µ̄_0 ≥ m*(0) and µ_0 ≥ 0. As a consequence,
we see that m and µ̄ always dominate m*. Note, however, that the conditions stated
above only ensure that µ is nonnegative, so that some paths of µ may actually fall
below m*.
Since the model takes into account the risk of random fluctuations around µ̄ and
around the drift's target m, it allows some degree of systematic mortality risk.
However, the biological implication of mean reversion still lacks justification.
For numerical presentation purposes, it is assumed that
    m(x + t) = (c/θ^c) (x + t)^{c−1},   θ, c > 0,                                 (2.3.21)
which is fitted to the table COH48, a projection relative to Italian males born in 1948.
The asymptotic Weibull intensity m∗ is set to be low enough to allow fluctuations of
µ below m, yet large enough to prevent mortality improvements from being unrea-
sonable. The simulation result in Biffis (2005) shows that the model can capture the
rectangularization and expansion properties well.
Time series approach
Cairns, Blake, and Dowd (2006a) work differently from the above two approaches.
They follow the spirit of Lee-Carter’s model, starting from fitting the empirical mor-
tality data to obtain time series variables. Specifically, the following model is adopted:
    q(x, t) = 1 − p(t + 1, t, t + 1, x) = e^{A_1(t) + A_2(t)(x+t)} / (1 + e^{A_1(t) + A_2(t)(x+t)}).   (2.3.22)
In this equation, A1(t) and A2(t) are two stochastic factors that are assumed to be
measurable at time t. The first affects mortality at all ages in an equal manner,
whereas the second has an effect on mortality that is proportional to age. A(t) =
(A1(t), A2(t))′ is then modelled by a bivariate ARIMA time-series model to describe
the evolvement of the curve over time:
A(t + 1) = A(t) + β + CZ(t + 1) (2.3.23)
where β is a constant 2× 1 vector, C is a constant 2× 2 upper triangular matrix and
Z(t) is a 2-dimensional standard normal random variable. The following parameters
are estimated based on England and Wales data from 1961 to 2002, which is available
from the Government Actuary’s Department website www.gad.gov.uk.
    β = \begin{pmatrix} −0.0434 \\ 0.000367 \end{pmatrix}   and   V = CC′ = \begin{pmatrix} 0.01067 & −0.0001617 \\ −0.0001617 & 0.000002590 \end{pmatrix}.   (2.3.24)
Based on data from 1982 to 2002, the estimated parameters are
    β = \begin{pmatrix} −0.0669 \\ 0.000590 \end{pmatrix}   and   V = CC′ = \begin{pmatrix} 0.00611 & −0.0000939 \\ −0.0000939 & 0.000001509 \end{pmatrix}.   (2.3.25)
These results show a trend change after 1982, with β1 and β2 both becoming larger
in magnitude. This reminds us of the model risk problem with Lee-Carter model.
For the model to be biologically reasonable, some requirements on parameter val-
ues should be imposed. The negative value for β1 indicates generally improving mor-
tality, and the numerical results shows this improvement is strengthened after 1982.
The positive value for β2 means that mortality rates at higher ages are improving at
a slower rate. Theoretically, there is a chance, under the current model specification
with positive β2, that, after a certain age (referred to as crossover point), the model
predicts deteriorating mortality (in other words, mortality rates at ages higher than
the crossover point will rise over time rather than fall), and this might not
be viewed as realistic. Practically, since this point is a very high age (e.g. age 113)
based on the estimated parameter values, this might not impose a serious problem as
the number of lives involved is very low after age 113.
Another aspect of biological reasonableness is to check if period tables reflect the
observed pattern. That is, for fixed t, q(t, x) should normally be an increasing function
of x. This requires that A2(t) remain positive. Based on the above model parameter
values, A_2(t) seems very unlikely to become negative, although theoretically it can.
Thus, practically, their model can be regarded as satisfying this aspect of biological
reasonableness as well.
The "short rate" dynamics (in discrete time) of q(t, x) for a cohort is the most
important aspect and can be investigated via the following form:
    log[ q(t + 1, x)/p(t + 1, x) ]                                               (2.3.26)
      = A_1(t + 1) + A_2(t + 1)(x + t + 1)                                       (2.3.27)
      = (1, x + t + 1)[ A(t) + β + CZ(t + 1) ]                                   (2.3.28)
      = log[ q(t, x)/p(t, x) ] + (β_1 + β_2(x + t + 1) + A_2(t)) + (1, x + t + 1) CZ(t + 1).   (2.3.29)
It is noted that A_2 in 2002 is 0.1058 and that the standard deviation 0.006 of A_2(t) is
very small over the time horizons concerned. Thus, β_1 + β_2(x + t + 1) + A_2(t) is initially
positive and is expected to stay positive. As a consequence, the cohort will experience
generally increasing rates of mortality, with occasional falls in years in which a large
random mortality improvement occurs across the board (that is, when (1, x + t + 1) CZ(t + 1) ≪ 0).
In order to obtain the dynamics of S(t + 1), the following approximation is used:
    S(t + 1) = S(t)(1 − m(t, x)),                                                (2.3.30)
where m(t, x) is the central death rate, given by
    m(t, x) = q(t, x) / (1 − (1/2) q(t, x)).                                      (2.3.31)
The model is not analytically tractable, so it is necessary to resort to Monte Carlo
simulation for most purposes.
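The following sketch illustrates such a simulation: one path of (2.3.23) is generated with the 1961-2002 estimates (2.3.24), and the cohort survivor index is built up through (2.3.22), (2.3.30) and (2.3.31). The starting value A_1(0) = −11.0 is a hypothetical choice made only to give a plausible base-year q(65); only A_2(0) = 0.1058 is quoted in the text.

```python
import numpy as np

rng = np.random.default_rng(2)

# Estimated drift and covariance (2.3.24), England & Wales males, 1961-2002.
beta = np.array([-0.0434, 0.000367])
V = np.array([[0.01067, -0.0001617],
              [-0.0001617, 0.000002590]])
C = np.linalg.cholesky(V)                    # any C with C C' = V generates the same law

# A2(0) = 0.1058 is quoted in the text; A1(0) = -11.0 is a hypothetical starting value.
A = np.array([-11.0, 0.1058])

x, years = 65, 25
S, surv = 1.0, [1.0]
for t in range(years):
    A = A + beta + C @ rng.standard_normal(2)          # one step of (2.3.23)
    eta = A[0] + A[1] * (x + t + 1)
    q = np.exp(eta) / (1.0 + np.exp(eta))              # realized mortality rate, as in (2.3.22)
    m = q / (1.0 - 0.5 * q)                            # central death rate (2.3.31)
    S *= (1.0 - m)                                     # survivor index update (2.3.30)
    surv.append(S)

print(np.round(surv[::5], 4))
```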
Cairns, Blake, and Dowd (2006b) present the simulated spot survival probability
P (0, 0, t, 65) (or simply S(t)) with its associated confidence interval. The confidence
interval grows in quite a different way from, say, that which we may associate with
an investment in equities. This point is best illustrated by looking at the variance
of the logarithm of S(t), as illustrated in Figure 4 of their original paper. It can be
seen that the variance is very low in the early years indicating that we can predict
mortality rates with reasonable precision over the near future. However, after time 10
the variance starts to grow very rapidly (almost “exponentially”). This contrasts with
equities, where we would expect to see linear, rather than exponential, growth in
the variance if the price process follows geometric Brownian motion.
The explanation for this variance growth is that longer-term survival prob-
abilities incorporate the compounding of year-by-year mortality shocks: the survival
probability for year t depends on shocks applied to mortality rates in each year from
1 to t, and each individual shock affects survival probabilities in all subsequent years.
This property will affect the price of an annuity, in the sense that the premium
charged for the 25-year payment will be much larger than, say, that for the 10-year
payment.
2.3.3 Summary
With regard to the criteria for mortality models, we show which of these characteris-
tics have been displayed by the various models in Table 2.1.
Other Remarks
1. Currently, most (if not all) approaches are proposed in the framework of a short
rate model, starting from theoretical convenience, such as an affine structure. Few
connect their models with the statistical properties of observed mortality data.
2. In order to be consistent with historical data and satisfy biological requirements,
Table 2.1: Key Characteristics of Stochastic Mortality Models

Criteria                          B&P   Dahl   Biffis   Cairns et al.
µ(x, t) > 0                        Y     Y      Y        Y
Biological reasonableness          Y     Y      Y        Y (based on properly
                                                            chosen parameter values)
Rectangularization & expansion     -     -      Y        -
Cohort relationship                -     -      -        Y
Trend deviation                    N     N      Y        Y
Non mean reversion                 N     N      N        Y
Tractability                       N     Y      N        N
many proposed models start from the estimated or projected mortality sched-
ule, and the stochastic feature is introduced via adding random fluctuations
into a deterministic target. However, empirical study on mortality shows that
mortality risk mainly comes from trend variation rather than random fluctua-
tion.
3. We want to point out here that trend deviation has not been well addressed
by the currently proposed stochastic mortality models. It is thus important
to stress again that it is the deviation pattern 2 in Figure 1.13 that we are
trying to model, while the mean reverting processes are just simulating the
pattern 1. In this regard, Lee-Carter approach is superior in the sense that the
trend deviation is described by the level uncertainty. However, model risk in
Lee-Carter approach is a problem with which we need to be concerned.
Chapter 3
The Time-Changed Markovian Mortality Model
3.1 Dynamic Approach of Mortality Modelling
As we have shown earlier, survival analysis has focused on modelling the survival
function or the hazard rate (the force of mortality) of a life or a population, as seen
in Gompertz's model and its various extensions. In spite of their apparent simplicity,
these concepts are highly aggregated and affected by many factors. Therefore, predictive
mortality models based on them are difficult to interpret. A process-based
approach, viewing the survival time of a life as the occurrence time of certain events
of a process, may have certain advantages, as it may provide biological interpretations
to a model and may utilize mathematical tools developed in the theory of stochastic
processes.
In this chapter, we consider a finite-state continuous-time Markov process with
constant intensities of transition for modelling the survival time. The state space of
the Markov process is assumed to consist of a set of transient states and a single
absorbing state. The initial probability distribution is defined on the transient state
space and represents the unobservable health status of the life, and the survival time
is the time of absorption. As a result, the distribution of the survival time follows a
phase type distribution.
The use of a phase type distribution and its underlying Markov process as a failure
time model has applications in engineering and medical statistics. For
example, AIDS patients usually progress through various stages of severity during
the incubation period, which can be characterized by changes in CD4+ levels and
modelled by a Markov process accordingly. Unfortunately, this approach has rarely
been used for the modelling and analysis of human mortality. This chapter
and the next chapter are an attempt in this direction.
3.1.1 Notation
Hereafter, vectors and matrices are always written in bold letters: lower case and
upper case, respectively. Subscripts may be used to indicate dimensions, for instance,
Am×n means a matrix with m rows and n columns. We may write A = (aij) to
emphasize a matrix with the entry on the ith row and the jth column being aij.
Specifically, the identity matrix is denoted by I, the vector with all entries equal to 1
is denoted by e, and the vector with all entries equal to 0 is denoted by 0. The symbol
D(d), or (dj)diag, is used to denote the diagonal matrix with d = (d1, d2, · · · , dn) being
the diagonal entries.
3.1.2 Phase-type distributions
Definition 3.1.1. Let J_t be a time-homogeneous Markov process on a finite state
space S = E ∪ {∆} = {1, 2, · · · , n} ∪ {∆}, where ∆ is absorbing and the states in E are
transient. The initial distribution is given by (α, 0) (written as a row vector), and the
infinitesimal generator is expressed as
    \begin{pmatrix} Λ & q \\ 0 & 0 \end{pmatrix}.                                (3.1.1)
Let τ denote the time until absorption or the time until death in the survival
analysis context. Then τ is said to follow a phase-type (PH) distribution with a
representation (α,Λ) of order n.
In other words, the matrix Λ = (λij)n×n is the matrix of transition rates among
the transient states, and q = (qi)n×1 is the column vector of absorption rates into
state ∆ from the transient states:
    P(J_{t+ε} = j | J_t = i) = λ_{ij} ε + o(ε),   i, j ∈ E,  i ≠ j,
    P(J_{t+ε} = ∆ | J_t = i) = q_i ε + o(ε),      i ∈ E.
Therefore Λ is a sub-intensity matrix, meaning that λ_{ii} < 0, λ_{ij} ≥ 0 for i ≠ j and
∑_{j∈E} λ_{ij} ≤ 0. Further, q = −Λe, where e is the column vector of ones.
Some useful references on phase type distributions are Neuts (1981), Asmussen
(1987, 2000a,b), and O’Cinneide (1989, 1990, 1999). Statistical fitting of phase type
distributions using the EM algorithm is presented in Asmussen, Nerman, and Olsson
(1996). A survey from a survival analysis point of view is given by Aalen (1995) who
focused on the connection between the underlying process and various shapes of the
hazard rate. Applications using phase type distributions as survival models can be
found in Kay (1986), Longini, Clark, Gardner, and Brundage (1991).
A huge advantage of phase type distributions is their mathematical tractability. It
is possible to compute the various quantities of interest associated with a phase type
distribution as seen in the following theorem. Typically, analytical forms for these
quantities are expressed as matrix exponentials and matrix inverses. The calculation
of these matrices may be cumbersome by hand. However, many symbolic and numerical
programs such as Mathematica, Maple, and Matlab are available for deriving analytical
formulas as well as carrying out the matrix computations numerically.
In this thesis, most of the numerical work involving phase type distributions is
carried out in Matlab, making this approach convenient and attractive.
Theorem 3.1.2. Let τ have a phase-type distribution with representation (α, Λ). Then
we have:
survival function
    s(t) = α exp(Λt) e,   t > 0;                                                 (3.1.2)
density function
    f(t) = −s′(t) = α exp(Λt) q,   t > 0;                                        (3.1.3)
moment generating function
    M(s) = α(−sI − Λ)^{−1} q;                                                    (3.1.4)
and non-central moments
    m_k = (−1)^k k! α Λ^{−k} e,   k = 1, 2, · · · .                               (3.1.5)
Proof. Let p_{ij}(t) = Pr(J_t = j | J_0 = i) denote the transition probabilities from i to
j over time t, for t ≥ 0, i, j ∈ E. Then P(t) = (p_{ij}(t)) satisfies the Kolmogorov
forward and backward equations
    (d/dt) P(t) = P(t) Λ = Λ P(t).
Since P(0) = I, the equation has a unique solution
    P(t) = exp(Λt).
Hence
    s(t) = P_α(τ > t) = P_α(J_t ∈ E) = ∑_{i,j∈E} α_i p_{ij}(t) = α P(t) e = α exp(Λt) e,
which proves (a). (b) follows from
    f(t) = −s′(t) = −α (d/dt) P(t) e = −α exp(Λt) Λ e = α exp(Λt) q.
For part (c), using formula (A.1.3) for integrating matrix exponentials, we have
    M(s) = ∫_0^∞ e^{st} α e^{Λt} q dt = α ( ∫_0^∞ e^{(sI+Λ)t} dt ) q = α(−sI − Λ)^{−1} q.
The last equality holds because all the eigenvalues of the sub-intensity matrix Λ have
negative real parts.
Part (d) follows by differentiating the m.g.f. M(s):
    M^{(k)}(s) = (d^k/ds^k) α(−sI − Λ)^{−1} q = (−1)^{k+1} k! α(sI + Λ)^{−k−1} q,
    M^{(k)}(0) = (−1)^{k+1} k! α Λ^{−k−1} q = (−1)^k k! α Λ^{−k} e.
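As an illustration of Theorem 3.1.2, the following Python sketch evaluates these quantities for a small, made-up phase-type representation using matrix exponentials and inverses (the thesis's own computations of this kind are done in Matlab).

```python
import math

import numpy as np
from scipy.linalg import expm

# A small illustrative phase-type representation (alpha, Lambda) of order 3;
# the rates are made-up numbers, not fitted values.
alpha = np.array([1.0, 0.0, 0.0])
Lam = np.array([[-0.40,  0.30,  0.00],
                [ 0.00, -0.43,  0.32],
                [ 0.00,  0.00, -0.50]])
e = np.ones(3)
q = -Lam @ e                                  # absorption rates, q = -Lambda e

def survival(t):
    return alpha @ expm(Lam * t) @ e          # s(t) = alpha exp(Lambda t) e, (3.1.2)

def density(t):
    return alpha @ expm(Lam * t) @ q          # f(t) = alpha exp(Lambda t) q, (3.1.3)

def moment(k):
    # m_k = (-1)^k k! alpha Lambda^{-k} e, (3.1.5)
    Lam_inv_k = np.linalg.matrix_power(np.linalg.inv(Lam), k)
    return (-1) ** k * math.factorial(k) * alpha @ Lam_inv_k @ e

print(survival(5.0), density(5.0), moment(1), moment(2))
```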
In the following, we provide two more important theorems on phase type distri-
butions without proof.
Theorem 3.1.3. The class of phase-type distributions is closed under the forma-
tion of finite mixtures, finite convolutions, finite minima and maxima, and geometric
compounds.
For a proof, see Asmussen (2000a, p353).
Theorem 3.1.4. The class of phase-type distributions is dense (in the sense of weak
convergence) in the class of all distributions on (0,∞).
For a proof, see Asmussen (2000b, p201).
The closure property is important to guarantee that the underlying Markov struc-
ture of a phase type distribution is preserved after certain operations. This results
in a general rule that if a probabilistic problem involving exponential distributions
has an explicit solution, the problem will also have an explicit solution when the
exponential distributions are replaced by phase type distributions. As a result, phase
type distributions have become a convenient computational tool in applied probabil-
ity. For typical examples, see Neuts (1981), Asmussen and Rolski (1991), Asmussen
(2000b), and references therein. Furthermore, due to the denseness property of phase
type distributions, there is no essential loss in generality by assuming a phase type
distribution instead of an arbitrary distribution from a practical viewpoint, since any
distribution on [0, ∞) can be approximated by a phase type distribution to any desired
degree of accuracy.
The potential structure of phase type distributions is very rich. We will only
discuss some of the most basic types of phase type distributions.
Commonly used phase type distributions
(1) Hyper-exponential distribution corresponds to a Markov process that can
    start in any state (all elements in α are allowed to be non-zero), but terminates
    in the absorbing state without visiting any other state (all off-diagonal entries
    in Λ are zero). A hyper-exponential distribution is also known as a mixture of
    exponentials (and thus is a phase-type distribution by the closure property); its
    density is ∑_{i=1}^n α_i λ_i e^{−λ_i t}.
(2) Erlang distribution corresponds to a Markov process that must start from
state 1, and then visit each state 2,· · · , n, in that order, and terminates when it
leaves state n. The transition rates are all the same and equal to λ. As a result,
we obtain a Gamma distribution with integer n as its shape parameter and λ
as its scale parameter. In this case, the phase type distribution is actually the
convolution of exponential distributions, and its density is λ^n t^{n−1} e^{−λt}/(n − 1)!.
It is easy to generalize the Erlang distribution by assigning different transition
rates between each pair of consecutive states.
The Markov process associated with a phase type distribution can usually be
illustrated by a phase diagram. For a generalized Erlang distribution, the dia-
gram is
    Diagram 3.1. Phase diagram for a generalized Erlang distribution
    [diagram: states 1, 2, · · · , n visited in order, with transition rates λ_1, λ_2, · · · , λ_n; not reproduced in this transcript]
corresponding to a phase representation of α = (1, 0, · · · , 0)_{1×n} and
    Λ = \begin{pmatrix}
        −λ_1 & λ_1 & 0 & · · · & 0 \\
        0 & −λ_2 & λ_2 & · · · & 0 \\
        0 & 0 & −λ_3 & \ddots & 0 \\
        \vdots & \vdots & \ddots & \ddots & \vdots \\
        0 & 0 & 0 & · · · & −λ_n
    \end{pmatrix}.                                                               (3.1.6)
(3) Coxian distribution can be constructed from generalized Erlang distribution
with the exception that the absorbing state can be reached from all of the other
states. Thus, from state i the Markov process can either jump to state i + 1 or
to the absorbing state. It has a phase diagram of the following form:
    Diagram 3.2. Phase diagram for a Coxian distribution
    [diagram: from state i the process moves to state i + 1 at rate λ_i′ or is absorbed at rate q_i; not reproduced in this transcript]
corresponding to a phase representation of α = (1, 0, · · · , 0) and
    Λ = \begin{pmatrix}
        −λ_1 & λ_1′ & 0 & · · · & 0 \\
        0 & −λ_2 & λ_2′ & · · · & 0 \\
        0 & 0 & −λ_3 & \ddots & 0 \\
        \vdots & \vdots & \ddots & \ddots & \vdots \\
        0 & 0 & 0 & · · · & −λ_n
    \end{pmatrix},                                                               (3.1.7)
where
    λ_i = λ_i′ + q_i.                                                            (3.1.8)
A generalized Coxian distribution corresponds to a Markov process that is the same
as the above but can start from any state (a sketch constructing such generators in code
is given after this list).
(4) Triangular phase type distributions are phase type distributions of special
    interest. They can be defined as those with a triangular representation (α, Λ), Λ
    being an upper-triangular matrix. In terms of the Markov process, this means that
    there is no "feedback" in the state space, i.e. no state can be visited more than
    once. When the states are ordered in a suitable manner, the transition matrix
    has only zero elements below the main diagonal.
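The sketch below, referred to in item (3), builds the Coxian sub-intensity matrix (3.1.7) from the rates (λ_i′, q_i); the numerical rates used are arbitrary examples.

```python
import numpy as np

def coxian_generator(lam_next, q_abs):
    """Build the sub-intensity matrix (3.1.7) of a Coxian distribution.

    lam_next[i] is the rate lambda_i' of moving from state i+1 to state i+2
    (with a trailing 0 for the last state), and q_abs[i] is the absorption
    rate q_i; by (3.1.8) the total exit rate of state i+1 is lambda_i' + q_i.
    """
    lam_next, q_abs = np.asarray(lam_next, float), np.asarray(q_abs, float)
    return np.diag(-(lam_next + q_abs)) + np.diag(lam_next[:-1], k=1)

# Example: a 3-state Coxian; alpha = (1, 0, 0) would start the process in state 1.
q = np.array([0.10, 0.11, 0.50])
Lam = coxian_generator([0.30, 0.32, 0.0], q)
print(Lam)
print("row sums + absorption rates:", Lam @ np.ones(3) + q)   # should be ~0
```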
Remarks
On one hand, it is easily seen that types (1), (2) and (3) are all triangular. On
the other hand, it is proved in O'Cinneide (1989) that any phase type distribution
with an order-n triangular representation has a Coxian representation of the same
order. This result suggests that the Coxian class is more general than it seems, and that
all models without feedback can be reduced to it.
More surprisingly, for the models with feedback, in particular, birth-death pro-
cesses with absorption, it has been proved by Keilson (1979, p59) that the absorption
time distribution for those models can also be expressed as a generalized Erlang dis-
tribution. In other words, this type of distribution can also be reduced to a (special)
Coxian-type distribution.
The non-uniqueness of phase type distributions may be of interest in stochastic
modelling. In many applications, it might be necessary to allow the process to move
back and forth between states before being absorbed. For instance, this may be real-
istic for some diseases where recovery may take place. However, the result mentioned
above implies that some feedback models are equivalent to the one without feedback
from a distributional viewpoint (see Aalen, 1995, p455 for more examples and condi-
tions). Therefore, it is often sufficient to assume a Coxian structure for the underlying
Markov process when one considers a phase type mortality model.
3.1.3 Phase-type distributions as mortality models
The discussions in the previous section provide some reasoning, from a mathematical
point of view, on why it is of interest to consider a Coxian distribution as our ba-
sic mortality model. In this section, we will provide some biological and statistical
reasoning to further support this hypothesis.
By assuming a Coxian structured phase type distribution, the underlying Markov
process can be interpreted as an aging process. Although there is not yet any common
consensus in biological science on what the appropriate markers of an aging process
might be, our model simply implies that aging is an irreversible process passing through
a number of stages. The aging process that we are trying to model may be perceived
as the representation of a health index of a life.
The general aging theory thus permits us to construct a Coxian structured phase-
type distribution to model age specific patterns of human mortality (see Lin and Liu,
2007). A full description of this work is given in the next chapter, in which we show
that a Coxian distribution and its associated Markov process (see Diagram 3.2 in
the previous section) can be used to describe the underlying aging process and the
absorbing time distribution can provide a good fit to various mortality patterns for
human populations.
The transition matrix of the proposed aging process has a strict Coxian form
(see Section 4.2 for details). By "strict", we mean that, in expressions (3.1.7) and (3.1.8),
λ_i > λ_i′ for all i = 1, · · · , n − 1. In addition, for any i ≠ j, we have λ_i ≠ λ_j. Under this
model setting, Λ is diagonalizable with n distinct eigenvalues −λ_i (i = 1, · · · , n). Let
ν_1, · · · , ν_n be the corresponding left eigenvectors and h_1, · · · , h_n the corresponding
right eigenvectors. Then, the resulting survival function S(t) for the time-till-death
random variable can be expressed as
    S(t) = ∑_{i=1}^n e^{−λ_i t} α (h_i ⊗ ν_i) e,                                  (3.1.9)
where ⊗ is the Kronecker product. For more details on Kronecker product, see
Appendix A.
In contrast, the survival function of a general phase type distribution may be
written in the following form:
    S(t) = ∑_{k=1}^r ∑_{j=0}^{m_k−1} c_{kj} (t^j / j!) e^{ρ_k t},                 (3.1.10)
where ρ_1, · · · , ρ_r are the eigenvalues of Λ with multiplicities m_1, · · · , m_r, while the
c_{kj} are constants (see Aalen, 1995; Harrison, 1990).
To obtain a good fit to the entire schedule of human mortality, the estimated
phase type distribution has a relatively high dimension (see Section 4.3). However, in
this chapter, we will adopt lower-dimensional phase type distributions with the same
intensity structure as the ones presented in the next chapter. There are two reasons for
this. First, in some actuarial practice, and in the actuarial evaluation of mortality-linked
products in particular, the focus is on the population of age 65 or older. Thus,
only part of the mortality schedule is relevant and hence low-dimensional phase type
distributions are sufficient. Second, low-dimensional phase type distributions allow us
to use mathematical programs like Matlab to obtain results efficiently and precisely.
3.2 Time-changed Markovian Survival Model
In this section, we present a stochastic mortality model that can be used for the
evaluation of mortality-linked products. As discussed in the previous chapters, there
are two approaches for incorporating stochastic mortality. One is what has been done
in Pitacco (2003) and Olivieri (2001), where the uncertainty is specified by assigning a
probability distribution to a finite set of potential mortality schedules. This approach
is good for illustrating the idea, but is too subjective to be realistic. The other one
can be summarized by the so-called short-rate-modelling approach (see Cairns, Blake,
and Dowd, 2006a), under which the spot mortality rates, q(x, t), or the spot force of
mortality µ(x, t) are described by a stochastic dynamic system (using time series, or
diffusion processes). The models using this approach as well as their advantages and
disadvantages have been reviewed in Section 2.3.2.
Without doubt, more work needs to be done in this area to better incorporate
mortality risk for the purposes of mortality risk pricing and hedging. In the following,
we present an alternative approach where the random feature of the future survival
distribution will be introduced by a time-change process, and the choice of the model
parameters will be determined by calibrating to the market prices of the relevant
products.
3.2.1 Time-changed Markovian Process
In order to introduce a stochastic dynamic into a phase type mortality model, consider
the subordinated aging process Z_t,
    Z_t = J_{γ_t},
where J_t is the associated Markov aging process and γ_t is a nondecreasing continuous-
time stochastic process. Define τ = inf{t : Z_t = ∆}. Thus, τ is the absorption time of
the stochastic process Z_t. Obviously, the distribution of the absorption time τ is governed
by both the underlying Markov process Jt and the subordinator process γt.
This idea is similar to that in the work of Madan and collaborators who used
a subordinated Brownian motion to model the dynamics of the logarithm of stock
prices (see Madan and Milne, 1991; Madan, Carr, and Chang, 1998). Their process,
called the Variance Gamma (VG) process, is obtained by evaluating Brownian mo-
tion at a random time-change. Thereby each unit of the calendar time is viewed as
having an economically related trading time. In our model setting, the interpretation
is different. One can think of the underlying aging process being influenced by an
improved or worsened living environment. Improvement may arise from advances in
health science, better social system in health care, breakthrough in genetic engineer-
ing, and so on. Deterioration may come from global warming, catastrophic events,
or epidemic diseases like SARS, bird flu, and so on. Our method is to model those
random effects in the future evolution of mortality rates via imposing a time-change
process γt to the time homogenous aging process Jt. The aggregate result is that the
future death rate will be higher or lower than anticipated by S(t), conditional on the
realization of γt. That is, the original deterministic survival function is now random
in the form of S(γt), depending on the random time process γt.
As a time-change process, γ_t must be non-decreasing. For tractability, indepen-
dent and stationary increments are preferable. The class of processes with these
properties is that of Lévy subordinators (see Cont and Tankov, 2004). The most
well-known and widely used subordinator is the Gamma process. Note that our
choice of the Gamma process is largely motivated by tractability and familiarity.
Time-changed Markov processes have also been used by Hurd and Kuznetsov (2006)
in credit risk modelling, where the credit migration of a company is described by a
finite-state Markov process combined with a stochastic time change. From a mathe-
matical point of view, they point out that a wide class of processes are potential
candidates for the time-change process; for example, a positive mean-reverting dif-
fusion process, a positive mean-reverting pure jump process, or a combination
of both (see equations (13) and (14) in Hurd and Kuznetsov, 2006, for more details).
Under those model settings, many results in Hurd and Kuznetsov (2006)'s paper
are of affine structure. In contrast, we make use of a matrix-analytic method and
express the results in terms of phase-type representations. Nonetheless, it is of interest to
investigate the appropriateness of other time-change processes in mortality modelling
in our future work.
3.2.2 The Gamma process
A random variable, X, is said to have a Gamma distribution with shape parameter
α > 0 and scale parameter β > 0 if it has density function fX as
    f_X(x) = (β^α / Γ(α)) x^{α−1} e^{−βx},   x > 0.                               (3.2.1)
We simply write X ∼ Γ(α, β). The moment generating function of the Gamma
distribution is given by
    φ(u) = E(e^{uX}) = (1 − u/β)^{−α}.                                            (3.2.2)
It is well known that X has mean α/β and variance α/β^2. In particular, when
α = n, n = 1, 2, · · · , a positive integer, X has an Erlang distribution with the phase-
type representation b = (1, 0, · · · , 0)_{1×n},
    Γ = \begin{pmatrix}
        −β & β & 0 & · · · & 0 \\
        0 & −β & β & · · · & 0 \\
        \vdots & \vdots & \ddots & \ddots & \vdots \\
        0 & 0 & 0 & · · · & −β
    \end{pmatrix}_{n×n}.                                                         (3.2.3)
Then, the density function f_X(x) can be written in terms of the phase-type represen-
tation:
    f_X(x) = b e^{Γx} I_β,                                                        (3.2.4)
where I_β = (0, · · · , 0, β)′_{n×1}. This property is useful when we compute the moments of
the survival index at discrete time points in later sections.
The Gamma process γt can be constructed to have the following properties:
1. γ0 = 0;
2. it has independent increments, i.e., for any 0 ≤ t0 < t1 < · · · < tn the random
variables γt1 − γt0 , · · · , γtn − γtn−1 are independent; and
3. γt+s − γt ∼ Γ(αs, β) for any s, t ≥ 0.
For simplicity, we denote it by Γ(t; α, β). For the time-change purpose, we let
α = β = 1/ν, where ν > 0 is a parameter. In other words, the Gamma process under
consideration is Γ(t; 1/ν, 1/ν). This is because we must have E(γ_t) = t for any time t.
Consequently, the Gamma process γ_t at fixed t has mean t, variance νt and skewness
2 sqrt(ν/t).
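A Gamma time change with these properties is simple to sample, as the following sketch shows; it only verifies numerically that E(γ_t) = t and Var(γ_t) = νt for an illustrative value of ν.

```python
import numpy as np

rng = np.random.default_rng(3)

def gamma_time_change(t_grid, nu, rng):
    """Sample one path of the Gamma process Gamma(t; 1/nu, 1/nu) on a time grid.

    Increments over [t, t + s] are Gamma(shape = s/nu, scale = nu), so that
    E[gamma_t] = t and Var(gamma_t) = nu * t, as required of the time change.
    """
    dt = np.diff(t_grid, prepend=0.0)
    increments = rng.gamma(shape=dt / nu, scale=nu)
    return np.cumsum(increments)

t_grid = np.arange(1, 26)
paths = np.array([gamma_time_change(t_grid, nu=0.5, rng=rng) for _ in range(5000)])
print("E[gamma_25] ~", paths[:, -1].mean(), "  Var(gamma_25) ~", paths[:, -1].var())
```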
3.2.3 Survival functions for time-changed model
We have now introduced a subordinated Markov process Zt = Jγt, with Jt following
a terminating Markov process with phase-type representation (α,Λ) and γt being
the Gamma process Γ(t; 1/ν, 1/ν). Let τ be the absorbing time random variable of
process Zt. In this section, we study the distribution of τ .
First, one has
    P(τ > t) = P(Z_t ≠ ∆) = P(J_{γ_t} ≠ ∆) = E[ E( 1_{J_s ≠ ∆} | γ_t = s ) ] = E[ S(γ_t) ].
In the last equality, S(t) is the survival function of the absorption time of J_t.
Following the notation in Section 2.2, we see that the survival function of τ is the
spot survival probability P(0, 0, t, x) = E[ S(γ_t) ] for (x), assuming that Z_t is the
underlying aging process for (x). The realized survival function at the future time t
will depend on the realization of γ_t, i.e. S(γ_t), and is therefore itself random. Based
on the information on J_t and γ_t at time t = 0, the unconditional survival function
E[ S(γ_t) ] can be calculated according to the following theorem.
Theorem 3.2.1. Assume Λ is diagonalizable, i.e. there exist n distinct eigenvalues
λ_1, · · · , λ_n with corresponding right eigenvectors h_1, · · · , h_n and left eigenvectors
ν_1, · · · , ν_n. Then Λ = ∑_{i=1}^n λ_i h_i ⊗ ν_i (see Proposition A.2.3), and P(0, 0, t, x) can be com-
puted by calculating the survival function of a phase-type distribution with represen-
tation (α, Λ̃), where Λ̃ is defined as Λ̃ = ∑_{i=1}^n λ̃_i h_i ⊗ ν_i with each λ̃_i given by
    λ̃_i = − ln(1 − ν λ_i) / ν.
That is,
    P(0, 0, t, x) = α ( ∑_{i=1}^n e^{λ̃_i t} h_i ⊗ ν_i ) e = α H (e^{λ̃_i t})_{diag} H^{−1} e = α e^{Λ̃ t} e.   (3.2.5)
Proof. By Theorem 3.1.2 and formula (3.1.9),
    P(0, 0, t, x) = E[ S(γ_t) ]
                  = ∫_0^∞ α exp(Λs) e · f_{γ_t}(s) ds                             (3.2.6)
                  = ∫_0^∞ ∑_{i=1}^n e^{λ_i s} α (h_i ⊗ ν_i) e · f_{γ_t}(s) ds
                  = ∑_{i=1}^n ( ∫_0^∞ e^{λ_i s} f_{γ_t}(s) ds ) α (h_i ⊗ ν_i) e
                  = ∑_{i=1}^n (1 − ν λ_i)^{−t/ν} α (h_i ⊗ ν_i) e
                  = ∑_{i=1}^n e^{λ̃_i t} α (h_i ⊗ ν_i) e.                          (3.2.7)
The second-to-last step uses the Laplace transform of the Gamma process Γ(t; 1/ν, 1/ν).
Defining λ̃_i = − ln(1 − ν λ_i)/ν, we then obtain formula (3.2.5).
Note that Λ can be equivalently written as Λ = HD(λ)H^{−1}, while Λ̃ = HD(λ̃)H^{−1}.
Therefore, Theorem 3.2.1 shows that the distribution of τ still has a ‘phase-type
representation’ with ‘transition matrix’ Λ̃. Detailed examples are given in the next section.
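A minimal sketch of this computation, assuming Λ is diagonalizable as in the theorem, is given below; the 2-dimensional generator used as input is a made-up example of the kind considered in the illustrations of the next section.

```python
import numpy as np
from scipy.linalg import expm

def time_changed_generator(Lam, nu):
    """Compute Lambda-tilde of Theorem 3.2.1 for a diagonalizable sub-intensity Lam.

    Each eigenvalue lambda_i of Lam is mapped to -log(1 - nu * lambda_i) / nu,
    while the eigenvectors are left unchanged.
    """
    eigvals, H = np.linalg.eig(Lam)                 # columns of H are right eigenvectors
    tilde = -np.log(1.0 - nu * eigvals) / nu
    return (H @ np.diag(tilde) @ np.linalg.inv(H)).real

# Hypothetical 2-dimensional strict Coxian generator.
Lam = np.array([[-0.40, 0.30],
                [ 0.00, -0.43]])
Lam_tilde = time_changed_generator(Lam, nu=1.0)
alpha, e = np.array([1.0, 0.0]), np.ones(2)
print(Lam_tilde)
print("P(0, 0, 10, x) =", alpha @ expm(Lam_tilde * 10.0) @ e)   # formula (3.2.5)
```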
From Proposition A.2.2 and Proposition A.1.1, we can further write the second
moment of the survival index random variable S(γ_t) as
    E[ (S(γ_t))^2 ] = E[ S(γ_t) S(γ_t) ] = E[ α e^{Λγ_t} e · α e^{Λγ_t} e ]
                    = E[ (α ⊗ α)( e^{Λγ_t} ⊗ e^{Λγ_t} )(e ⊗ e) ]
                    = (α ⊗ α) E[ e^{(Λ⊕Λ)γ_t} ] (e ⊗ e).                          (3.2.8)
Since Λ = HDH^{−1} (here D is short for D(λ)), we obtain
    Λ ⊕ Λ = Λ ⊗ I + I ⊗ Λ
          = HDH^{−1} ⊗ I + I ⊗ HDH^{−1}
          = HDH^{−1} ⊗ HIH^{−1} + HIH^{−1} ⊗ HDH^{−1}
          = (H ⊗ H)(D ⊗ I)(H^{−1} ⊗ H^{−1}) + (H ⊗ H)(I ⊗ D)(H^{−1} ⊗ H^{−1})
          = (H ⊗ H)(D ⊗ I + I ⊗ D)(H^{−1} ⊗ H^{−1})
          = (H ⊗ H)(D ⊕ D)(H^{−1} ⊗ H^{−1}).                                      (3.2.9)
It is straightforward to show that
    (H ⊗ H)^{−1} = H^{−1} ⊗ H^{−1}
(see the proof in Appendix A), and
    D ⊕ D = diag( D + λ_1 I, D + λ_2 I, · · · , D + λ_n I ).
The latter is still a diagonal matrix. Let us denote it by D ⊕ D = (λ_{ij})_{diag}, where
λ_{ij} = λ_i + λ_j denotes the i-th element in the j-th block. Thus, the second moment of
S(γ_t) can be calculated in the same way as in Theorem 3.2.1, and we have

Theorem 3.2.2.
    E[ (S(γ_t))^2 ] = (α ⊗ α)(H ⊗ H) E[ e^{(D⊕D)γ_t} ] (H^{−1} ⊗ H^{−1})(e ⊗ e)
                    = (α ⊗ α)(H ⊗ H)( e^{λ̃_{ij} t} )_{diag} (H^{−1} ⊗ H^{−1})(e ⊗ e),   (3.2.10)
where λ̃_{ij} = − ln(1 − ν λ_{ij})/ν.
At the points t_j where t_j/ν is an integer, we have an alternative formula for the
second moment E[ (S(γ_{t_j}))^2 ]. Since f_{γ_{t_j}}(s) = b e^{Γs} I_β (see (3.2.4)), where Γ is a
(t_j/ν) × (t_j/ν) matrix and β = 1/ν,
    E[ (S(γ_{t_j}))^2 ] = ∫_0^∞ (α ⊗ α)( e^{Λs} ⊗ e^{Λs} )(e ⊗ e) b e^{Γs} I_β ds
                        = (α ⊗ α ⊗ b) ( ∫_0^∞ e^{(Λ⊕Λ⊕Γ)s} ds ) (e ⊗ e ⊗ I_{1/ν})
                        = − (α ⊗ α ⊗ b)(Λ ⊕ Λ ⊕ Γ)^{−1}(e ⊗ e ⊗ I_{1/ν}).

Theorem 3.2.3. If t_j/ν is an integer, then
    E[ (S(γ_{t_j}))^2 ] = − (α ⊗ α ⊗ b)(Λ ⊕ Λ ⊕ Γ)^{−1}(e ⊗ e ⊗ I_{1/ν}).           (3.2.11)
When the mean and the second moment are calculated using Theorem 3.2.2 or
Theorem 3.2.3, we can then obtain the variance of S(γ_t) by
    Var(S(γ_t)) = E[ (S(γ_t))^2 ] − ( E[ S(γ_t) ] )^2.
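The first two moments can be assembled directly from the eigen-decomposition, as in the following sketch of Theorems 3.2.1 and 3.2.2; the 2-state generator and the values of ν and t are arbitrary illustrations.

```python
import numpy as np

def moments_of_survival_index(alpha, Lam, nu, t):
    """First two moments of S(gamma_t) via Theorems 3.2.1 and 3.2.2 (diagonalizable Lam)."""
    lam, H = np.linalg.eig(Lam)
    Hinv = np.linalg.inv(H)
    e = np.ones(len(alpha))

    tilde = lambda z: -np.log(1.0 - nu * z) / nu       # Laplace exponent of the Gamma time change

    # First moment: alpha exp(Lambda-tilde t) e, formula (3.2.5).
    m1 = (alpha @ H @ np.diag(np.exp(tilde(lam) * t)) @ Hinv @ e).real

    # Second moment via the Kronecker-sum representation (3.2.10).
    lam_sum = np.add.outer(lam, lam).ravel()           # entries lambda_i + lambda_j
    m2 = (np.kron(alpha, alpha) @ np.kron(H, H)
          @ np.diag(np.exp(tilde(lam_sum) * t))
          @ np.kron(Hinv, Hinv) @ np.kron(e, e)).real
    return m1, m2

alpha = np.array([1.0, 0.0])
Lam = np.array([[-0.40, 0.30], [0.00, -0.43]])         # hypothetical 2-state generator
m1, m2 = moments_of_survival_index(alpha, Lam, nu=0.5, t=10.0)
print("E[S] =", m1, "  Var[S] =", m2 - m1**2)
```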
3.2.4 Illustrations
In this section, we illustrate how to use the results in the previous section to compute
the ‘transition matrix’ Λ̃ under the time-changed model. We assume the underlying
Markov process is of Coxian type, i.e., the transition matrix Λ is of the form (3.1.7).
Illustration 1 Consider a 2-dimensional transition matrix
    Λ = \begin{pmatrix} −a_1 & b_1 \\ 0 & −a_2 \end{pmatrix},                     (3.2.12)
where a_1 ≠ a_2, a_1, a_2 > 0 and a_1 > b_1, so that Λ is a strict Coxian transition matrix.
For this Λ, the two eigenvalues are −a_1 and −a_2. One can easily find their
corresponding right eigenvectors h_1 and h_2, and put them in H as follows:
    H = (h_1, h_2) = \begin{pmatrix} 1 & b_1/(a_1 − a_2) \\ 0 & 1 \end{pmatrix}.   (3.2.13)
The corresponding left eigenvectors v_1 and v_2 are given by
    H^{−1} = \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \begin{pmatrix} 1 & −b_1/(a_1 − a_2) \\ 0 & 1 \end{pmatrix}.   (3.2.14)
After the time change with parameter ν, the matrix Λ̃ = HD(λ̃)H^{−1} for the time-
changed Markov process in Theorem 3.2.1 can be shown to be
    Λ̃ = HD(λ̃)H^{−1}
       = H \begin{pmatrix} −ln(1 + νa_1)/ν & 0 \\ 0 & −ln(1 + νa_2)/ν \end{pmatrix} H^{−1}
       = \begin{pmatrix} −ln(1 + νa_1)/ν & \dfrac{ln(1 + νa_2) − ln(1 + νa_1)}{a_2 − a_1} \dfrac{b_1}{ν} \\ 0 & −ln(1 + νa_2)/ν \end{pmatrix}.   (3.2.15)
The matrix Λ̃ in (3.2.15) can be shown to be of Coxian type (3.1.7) too, since
    ln(1 + νa_1)/ν > 0,   ln(1 + νa_2)/ν > 0,
and
    0 < \dfrac{ln(1 + νa_2) − ln(1 + νa_1)}{a_2 − a_1} \dfrac{b_1}{ν} < \dfrac{ln(1 + νa_1)}{ν}
for any a_1, a_2 > 0 with a_1 ≠ a_2 and any ν > 0. Therefore, Λ̃ will always be a generator
for any 2-dimensional time-changed Markov process.
Illustration 2 Consider a 3-dimensional transition matrix
    Λ = \begin{pmatrix} −a_1 & b_1 & 0 \\ 0 & −a_2 & b_2 \\ 0 & 0 & −a_3 \end{pmatrix},   (3.2.16)
where a_1, a_2, a_3 are distinct and positive, and a_i > b_i for i = 1, 2, so that Λ is a strict
Coxian transition matrix.
We can work on the transition matrix (3.2.16) in a similar way to Illustration 1:
    H = (h_1, h_2, h_3) = \begin{pmatrix} 1 & \dfrac{b_1}{a_1 − a_2} & \dfrac{b_1}{a_1 − a_3} \dfrac{b_2}{a_2 − a_3} \\ 0 & 1 & \dfrac{b_2}{a_2 − a_3} \\ 0 & 0 & 1 \end{pmatrix},   (3.2.17)
    H^{−1} = \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} = \begin{pmatrix} 1 & −\dfrac{b_1}{a_1 − a_2} & \dfrac{b_1}{a_1 − a_2} \dfrac{b_2}{a_1 − a_3} \\ 0 & 1 & −\dfrac{b_2}{a_2 − a_3} \\ 0 & 0 & 1 \end{pmatrix},   (3.2.18)
    Λ̃ = HD(λ̃)H^{−1}
       = H \begin{pmatrix} −ln(1 + νa_1)/ν & 0 & 0 \\ 0 & −ln(1 + νa_2)/ν & 0 \\ 0 & 0 & −ln(1 + νa_3)/ν \end{pmatrix} H^{−1}
       = \begin{pmatrix}
           −\dfrac{ln(1 + νa_1)}{ν} & \dfrac{ln(1 + νa_1) − ln(1 + νa_2)}{a_1 − a_2} \dfrac{b_1}{ν} & \dfrac{−ln(1 + νa_1)(a_2 − a_3) + ln(1 + νa_2)(a_1 − a_3) − ln(1 + νa_3)(a_1 − a_2)}{(a_1 − a_2)(a_1 − a_3)(a_2 − a_3)} \dfrac{b_1 b_2}{ν} \\
           0 & −\dfrac{ln(1 + νa_2)}{ν} & \dfrac{ln(1 + νa_2) − ln(1 + νa_3)}{a_2 − a_3} \dfrac{b_2}{ν} \\
           0 & 0 & −\dfrac{ln(1 + νa_3)}{ν}
         \end{pmatrix}.   (3.2.19)
Illustration 3 We now consider a concrete model with a 5-dimensional transition
matrix given by
    Λ = \begin{pmatrix}
        −0.4 & 0.3 & 0 & 0 & 0 \\
        0 & −0.43 & 0.32 & 0 & 0 \\
        0 & 0 & −0.5 & 0.36 & 0 \\
        0 & 0 & 0 & −0.55 & 0.45 \\
        0 & 0 & 0 & 0 & −0.6
    \end{pmatrix}.
The 5 distinct eigenvalues of Λ are λ_i = −0.4, −0.43, −0.5, −0.55 and −0.6, for
i = 1, · · · , 5. Consequently, the following matrices are obtained:
    H = \begin{pmatrix}
        1 & −10 & 13.714 & −38.4 & 91.482 \\
        0 & 1 & −4.5714 & 19.2 & −60.988 \\
        0 & 0 & 1 & −7.2 & 32.4 \\
        0 & 0 & 0 & 1 & −9 \\
        0 & 0 & 0 & 0 & 1
    \end{pmatrix}
and
    H^{−1} = \begin{pmatrix}
        1 & 10 & 32 & 76.8 & 172.8 \\
        0 & 1 & 4.5714 & 13.714 & 36.303 \\
        0 & 0 & 1 & 7.2 & 32.4 \\
        0 & 0 & 0 & 1 & 9 \\
        0 & 0 & 0 & 0 & 1
    \end{pmatrix}.
If a time-change process is applied to the Markov process with Λ as the underlying
transition matrix, we can obtain the time-changed ‘transition matrix’ Λ̃ for the time-
changed Markov process according to Theorem 3.2.1. The following are Λ̃_1, Λ̃_2, Λ̃_3
and Λ̃_4, which correspond to ν = 0.25, 0.5, 1 and 2, respectively.
    Λ̃_1 = \begin{pmatrix}
        −0.38124 & 0.2718 & 0.0097255 & 0.00051604 & 3.8078e−05 \\
        0 & −0.40842 & 0.28668 & 0.011413 & 0.00074867 \\
        0 & 0 & −0.47113 & 0.31824 & 0.015651 \\
        0 & 0 & 0 & −0.51533 & 0.39345 \\
        0 & 0 & 0 & 0 & −0.55905
    \end{pmatrix}
    Λ̃_2 = \begin{pmatrix}
        −0.36464 & 0.24845 & 0.016084 & 0.00153 & 0.00020064 \\
        0 & −0.38949 & 0.25965 & 0.018536 & 0.0021612 \\
        0 & 0 & −0.44629 & 0.28516 & 0.024918 \\
        0 & 0 & 0 & −0.48589 & 0.34953 \\
        0 & 0 & 0 & 0 & −0.52473
    \end{pmatrix}
    Λ̃_3 = \begin{pmatrix}
        −0.33647 & 0.21202 & 0.023056 & 0.0036335 & 0.00077941 \\
        0 & −0.35767 & 0.21847 & 0.02585 & 0.0049307 \\
        0 & 0 & −0.40547 & 0.23609 & 0.033732 \\
        0 & 0 & 0 & −0.43825 & 0.28574 \\
        0 & 0 & 0 & 0 & −0.47
    \end{pmatrix}
    Λ̃_4 = \begin{pmatrix}
        −0.29389 & 0.16395 & 0.02701 & 0.0063389 & 0.0019937 \\
        0 & −0.31029 & 0.16588 & 0.029242 & 0.0081792 \\
        0 & 0 & −0.34657 & 0.17564 & 0.036776 \\
        0 & 0 & 0 & −0.37097 & 0.20934 \\
        0 & 0 & 0 & 0 & −0.39423
    \end{pmatrix}
We see that all Λ̃_i, i = 1, · · · , 4, are valid transition matrices. The survival curves
corresponding to Λ and Λ̃_i, i = 1, · · · , 4, are given as follows:
    P(0, 0, t, x) = 292.6 e^{−0.4t} − 555.882353 e^{−0.43t} + 556.8 e^{−0.5t} − 384 e^{−0.55t} + 91.48235292 e^{−0.6t},
    Figure 3.1: Time-changed survival curves of P(0, 0, t, x) for ν = 0, 1, 2
    [plot of P(0, 0, t, x) against time t for Λ, Λ̃_3 and Λ̃_4; figure not reproduced in this transcript]
    P_i(0, 0, t, x) = 292.6 e^{λ̃_1 t} − 555.882353 e^{λ̃_2 t} + 556.8 e^{λ̃_3 t} − 384 e^{λ̃_4 t} + 91.48235292 e^{λ̃_5 t},   i = 1, · · · , 4,
where the λ̃_j are the diagonal entries of the corresponding Λ̃_i.
The survival curves corresponding to ν = 1 and 2 and the original one are displayed
in Figure 3.1. The variance curves for various values of ν are presented in Figure 3.2.
The survival curves with their corresponding one-σ confidence intervals for ν = 0.5
and ν = 1 are given in Figures 3.3 and 3.4, respectively. We also provide groups of
simulated survival curves in Figures 3.5 and 3.6, corresponding to different values of ν.
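The simulated curves of Figures 3.5 and 3.6 can be reproduced in spirit by sampling a Gamma time-change path and evaluating the underlying survival function along it, as in the sketch below (20 paths on an annual grid; the random seed and path count are arbitrary).

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(4)

# The 5-dimensional generator of Illustration 3.
Lam = np.array([[-0.40, 0.30, 0.00, 0.00, 0.00],
                [ 0.00, -0.43, 0.32, 0.00, 0.00],
                [ 0.00, 0.00, -0.50, 0.36, 0.00],
                [ 0.00, 0.00, 0.00, -0.55, 0.45],
                [ 0.00, 0.00, 0.00, 0.00, -0.60]])
alpha, e = np.eye(5)[0], np.ones(5)
S = lambda u: alpha @ expm(Lam * u) @ e                # deterministic survival function S(u)

def simulated_curve(nu, years=30):
    """One realized curve S(gamma_t): evaluate S along a sampled Gamma time-change path."""
    increments = rng.gamma(shape=1.0 / nu, scale=nu, size=years)   # Gamma(t; 1/nu, 1/nu) steps
    gamma_path = np.cumsum(increments)
    return np.array([S(u) for u in gamma_path])

curves_nu1 = np.array([simulated_curve(nu=1.0) for _ in range(20)])
print(np.round(curves_nu1[:, [4, 14, 29]], 3))         # a few curves at t = 5, 15, 30
```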
    Figure 3.2: Variance curves for the time-changed survival models P_i(0, 0, t, x) when ν = 0.5, 1, 2
    [plot of Var[S(t)] against time t; figure not reproduced in this transcript]
    Figure 3.3: Survival curve P_i(0, 0, t, x) with its one-σ confidence interval (ν = 0.5)
    [plot of s(t) with s(t) ± σ(t) against time t; figure not reproduced in this transcript]
    Figure 3.4: Survival curve P_i(0, 0, t, x) with its one-σ confidence interval (ν = 1)
    [plot of s(t) with s(t) ± σ(t) against time t; figure not reproduced in this transcript]
    Figure 3.5: Simulated survival curves with ν = 0.5; the dotted line is the original underlying survival function
    [plot of S(t, x) against time; figure not reproduced in this transcript]
    Figure 3.6: Simulated survival curves with ν = 1; the dotted line is the original underlying survival function
    [plot of S(t, x) against time; figure not reproduced in this transcript]
3.3 Pricing Longevity Bonds
As we have discussed in Chapter 1, annuity providers and pension plans are subject
to a great deal of longevity risk: the risk that policy holders or plan participants
might live longer on average than anticipated. This problem can be illustrated by the
fact that life expectancy for men aged 65 in the year 2000 in England and Wales is about
four and a half years higher than was anticipated in the mortality projections made in
the 1980s. For references, see CMI Reports 3, 10, 17.
What makes the situation even worse is that the amount of liabilities exposed
to longevity risk is often huge. For example, the amounts at risk for the state and
private sectors in the UK at the end of 2003 aggregate to £2520 billion, that is,
nearly £40,000 for every man, woman and child in the UK (see Pensions Commission,
2005, Fig 5.17, p181).
Exposure to longevity risk is therefore a very serious issue and needs to
be well managed. Apart from reinsurance and portfolio diversification, mortality-linked
securitization is a newly developed approach for hedging longevity risk exposures. In
this section, we focus on the longevity bonds (LB) issued by the
European Investment Bank (EIB) in November 2004, with BNP Paribas as the de-
signer and originator and Partner Re as the longevity risk reinsurer. We analyze
the basic features of this innovative mortality-linked product and discuss its pricing
methodology.
3.3.1 The EIB/BNP Longevity Bonds
The EIB/BNP LB is a financial contract in which annual coupon payments are pro-
portionally linked to the realization of the survivor index for a reference population
    Figure 3.7: Projected cash flow of the EIB/BNP Longevity Bond
    [plot of the projected coupons P(0, 0, t, x) × £50 million over 25 years; figure not reproduced in this transcript]
over the next 25 years. In this thesis, the definition of survivor index first appeared
in the expression (2.2.1) of section 2.2. As its name suggests, the survivor index,
S(t, x), is the proportion of some initial reference population aged x at time t = 0
who are still alive at some future time t. In particular, the reference population for
the EIB/BNP LBs is the English and Welsh males aged 65 in 2003.
Let us denote December 31, 2004 as time t = 0, December 31, 2005 as t = 1 etc.
Also let q(t) be the mortality rate between t and t+1 for the members of the reference
population. Then the relationship between the sequence q(t) and S(t, x) is given by:
S(t, x) = (1 − q(0))(1 − q(1)) · · · (1 − q(t − 1)). (3.3.1)
Since the practical issuance of the LBs is only with regard to the cohort of (65) (that
is, x = 65) in England and Wales, we will simply denote the survivor index as S(t).
The contract's cash flow is well defined. The notional amount of the bond is set
at £50 million. According to the terms of the contract, the coupon payments
are £50 · S(t) million at times t = 1, 2, · · · , 25, payable at the end of each
year for 25 years. Figure 3.7 illustrates the cash flow, using the estimates
of S(t) produced recently by the UK's Government Actuary's Department (GAD) as
the realizations of the survival index.
UK pension funds and life offices were the intended main investors of the LBs.
These bonds are designed to offer investors a perfect hedge against the longevity risk
exposures from their commitments to provide annuity payments. The main question
is how to determine the purchase price for such bonds. In the next two sections we
will discuss how these bonds are priced by their issuer and propose a pricing method
based on the Markov mortality model in the previous sections.
3.3.2 How was the EIB/BNP LB priced?
In this section, we will examine how BNP determined the price for the EIB/BNP LB.
In the offer document issued by BNP Paribas in November 2004, BNP specified some
important components that are relevant to pricing:
• the projected survival rates used in the pricing of the bond are given by the
latest GAD’s projection, referred to as S(0, T ) in the following.
• the projected cash flow (Figure (3.7)) will be discounted at LIBOR minus 35
basis points to obtain the issue price.
It is known that conventional fixed-interest EIB bonds are usually issued at
LIBOR minus 15 basis points in the primary market. Hence the 20-basis-point spread is the premium
for protection from mortality risk and can be interpreted as follows.
Let P(0) be the purchase price of the EIB/BNP LB with a notional amount of
one monetary unit. Also let D_L(0, T) denote the discount factor that corresponds
to the LIBOR curve and D_E(0, T) the discount factor that corresponds to the EIB
curve, for T = 1, 2, ..., 25. Then the two curves can be related approximately by
D_E(0, T) = D_L(0, T) e^{0.0015T}. Further, suppose that the GAD's projections are unbiased
estimates of the actual survivor index for the reference population, that is,
S(0, T) = E^P[S(T)|M_0], where P represents the physical measure. Thus, the BNP
formula for pricing the LBs can be written as
formula for pricing the LBs can be written as
P (0) =25∑
T=1
DL(0, T )e0.0035T S(0, T )
=25∑
T=1
DE(0, T )e0.0020T EP [S(T )|M0]. (3.3.2)
For a given stochastic mortality model and assuming the independence between
the dynamics of interest rates and the dynamics of mortality rates, the risk-neutral
pricing approach that is discussed in section 2.2 and Formula (2.2.19) implies that
P(0) = ∑_{T=1}^{25} D_E(0, T) E^Q[S(T)|M_0].    (3.3.3)
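As a minimal numerical sketch of formula (3.3.3), the snippet below simply sums the discounted risk-neutral survival expectations. Both the flat 4% discount curve (the same convention used later in Section 3.3.4) and the survival curve are hypothetical placeholders, not market data.

```python
import numpy as np

T = np.arange(1, 26)                   # coupon dates T = 1, ..., 25
D_E = 1.04 ** (-T)                     # hypothetical EIB discount factors (flat 4% curve)
EQ_S = np.cumprod(np.full(25, 0.96))   # hypothetical risk-neutral curve E^Q[S(T)|M_0]

P0 = np.sum(D_E * EQ_S)                # formula (3.3.3), price per unit notional
print(f"P(0) per unit notional = {P0:.4f}")
print(f"price for a 50 million notional = {50 * P0:.2f} million")
```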
A comparison of equations (3.3.2) and (3.3.3) shows that 20 basis points can be
interpreted as an average risk premium per annum. Since the EIB/BNP longevity
bond links its coupon payments proportionally to the survivor index, it is expected
that the risk premium attached to each respective annual payment at time t should be
closely related to the level of uncertainty associated with S(t). Consequently, some
may argue that the risk premium for the 15th annual payment should be much larger
than that for the 5th annual payment. This is because survival probabilities are measures
that compound year-by-year mortality shocks from all the years before t. In other
words, the survival probability for year t depends on shocks applied to mortality
rates in each of the years from 1 to t, and each individual shock affects survival
probabilities in all subsequent years (see Cairns, Blake, and Dowd, 2006b; Cairns,
Blake, Dawson, and Dowd, 2005). Also, it is quite possible that some negative
shocks will be corrected in the following years, while most positive shocks tend
to persist thereafter. As a result, the volatility embedded in the uncertainty of
S(t) is usually very low in the first few years, and then it picks up quickly in
a non-linear and unsystematic manner. If this is the case, then a constant spread
would underprice the long-dated annual payments but overprice the short-dated annual
payments. Hence, it is fair to say that the 20-basis-point spread is just a compromise
market price for the longevity risk over the entire 25 years. For a more precise and
meaningful approach, we have to exploit the market term structure of mortality rates
E^Q[S(T)|M_0] (= P(0, 0, T, x)) as in formula (3.3.3).
3.3.3 Proposed method for pricing the EIB/BNP Longevity
Bonds
We consider that information regarding the term structure of the risk premium over
S(t) should be obtained from market prices, given the specified dynamics of the
underlying. In this section, based on the time-changed mortality model introduced
in Section 3.2, we therefore propose a method that can, on the one hand, utilize the
projected mortality schedule to reflect the general view regarding the future mortality
trend and, on the other hand, capture the market information (e.g. market risk
premiums) regarding the uncertainties surrounding mortality trend changes and
future survival probabilities.
To be specific, we assume that a projected mortality table for the reference pop-
ulation of the LBs is available, and has been fitted by a phase-type distribution with
representation (α,Λ). We then propose that, under the risk neutral probability mea-
sure Q, the aging process is given by
Z_t = J^Q_{γ_t}    (3.3.4)

where J^Q_t is a phase-type Markov process with representation (α, Λ^Q),

Λ^Q = uΛ,  u > 0,    (3.3.5)
and γt is a Gamma process Γ(t; 1/ν, 1/ν). This approach is similar to that by Jarrow,
Lando, and Turnbull (1997) in credit risk modelling.
Hence, based on the projected mortality schedule, the time-changed Markov pro-
cess (3.3.4) generates a stochastic survival model S(γ_t). The parameters (u, ν) then
characterize the market price of longevity risk associated with the specific projection
(α, Λ).
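The following sketch simulates the stochastic survival model S(γ_t) implied by (3.3.4)-(3.3.5). The phase-type representation (α, Λ) and the parameters (u, ν) used here are illustrative placeholders, and we read Γ(t; 1/ν, 1/ν) as a gamma subordinator with mean t and variance νt; that reading is our assumption, since the exact parameterization is fixed in Section 3.2.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)

# Illustrative phase-type representation (alpha, Lam); placeholders, not fitted values
alpha = np.array([1.0, 0.0, 0.0])
Lam = np.array([[-0.30,  0.29,  0.00],
                [ 0.00, -0.31,  0.30],
                [ 0.00,  0.00, -0.32]])
e = np.ones(3)

u, nu = 0.95, 1.25        # illustrative risk adjustment u and time-change parameter nu
LamQ = u * Lam            # risk-neutral generator, as in (3.3.5)

def survivor_index_samples(t, n_paths=5000):
    """Draw S(gamma_t) = alpha exp(LamQ * gamma_t) e, treating gamma_t as a gamma
    subordinator with mean t and variance nu*t (our reading of Gamma(t; 1/nu, 1/nu))."""
    g = rng.gamma(shape=t / nu, scale=nu, size=n_paths)
    return np.array([alpha @ expm(LamQ * s) @ e for s in g])

for t in (5, 15, 25):
    S = survivor_index_samples(t)
    print(f"t = {t:2d}   E[S(gamma_t)] = {S.mean():.4f}   Var[S(gamma_t)] = {S.var():.6f}")
```

The sample variance computed this way corresponds to the Var[S(t)] curve discussed later in connection with Figure 3.9.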
It is important to stress that the choice of an equivalent martingale measure is
not unique. In general, suppose a Markov process is specified in terms of its k × k transition
rate matrix

   [ q_1         q_12        q_13       ...   q_1k
     q_21        q_2         q_23       ...   q_2k
     ...         ...         ...        ...   ...
     q_{k-1,1}   q_{k-1,2}   q_{k-1,3}  ...   q_{k-1,k}
     0           0           0          ...   0        ]    (3.3.6)

Then, under the equivalent martingale measure, the transition rates can be defined as

   q_{ij}(t) = u_{ij}(t) q_{ij},    (3.3.7)
where u_{ij}(t) are strictly positive deterministic functions of t that satisfy

   ∫_0^T u_{ij}(t) dt < +∞  for any i, j.    (3.3.8)
There will be infinitely many ways to assign the u_{ij} (see Jarrow, Lando, and Turnbull, 1997).
Therefore, if necessary, the relationship in equation (3.3.5) could be relaxed to be
both state and time dependent as in equation (3.3.7). The entries u_{ij}(t) would then
represent the risk adjustment from state i to state j over the period (0, t]. The relaxed model
would be more powerful in calibration, but with the trade-off of being more complicated.
The proposed model (3.3.4) can be calibrated to market price information. As
we discussed in Section 2.2, we can use the pure endowment market prices
E(0, t, x) at time 0 for (x) for all maturities t as the primary market. Theorem 3.2.1
then allows us to calibrate to the market term structure of spot survival probabilities
P(0, 0, t, x). To be more specific, we can obtain the parameters (u, ν) in the time-changed
Markov model (3.3.4) by solving the following equality:
E(0, t, x) / D(0, t) = P̃(0, 0, t, x) ≃ P(0, 0, t, x) = α e^{Λ^Q t} e,  for all t > 0,    (3.3.9)

where P̃(0, 0, t, x) denotes the market spot survival probabilities and P(0, 0, t, x) the
model values of the survival probabilities for any t > 0. The calibrated P(0, 0, t, x) can then be
inserted back into formula (3.3.3) to derive the LB's price.
This proposed idea of calibrating to market prices is similar to those in Lin and
Cox (2006) and Cairns, Blake, and Dowd (2006b), though they adopt different models
and data sources.
3.3.4 Implementation
In this section, we will demonstrate how to use the proposed model (3.3.5) and for-
mula (3.3.3) to calculate the longevity bond price. In order to do so, we need mortality
projection data and the market term structure of spot survival probabilities P(0, 0, t, x)
as input. For simplicity, and also for comparison purposes, we consider the data used
and provided by Cairns, Blake, and Dowd (2006b), rather than looking for the mar-
ket bond price and pure endowment price term structure at the time the EIB/BNP LB was
launched.
In Cairns, Blake, and Dowd (2006b), Cairns et al propose a two-factor time series
model for the development of mortality pattern through time. Under the assumption
that their model gives unbiased estimates at time 0 to the survival rates EP [S(t)|M0],
they further introduce a method to obtain the risk-neutral probability measure Q and
the corresponding risk-neutral survival rates EQ[S(t)|M0]. The data used in their
paper are provided in the first two columns in Table 3.1.
We illustrate our method by assuming that column one is the real projection at
time 0 of the survival rate E^P[S(t)|M_0] and that column two is the market term
structure of mortality, which could be stripped from a given market price system of
pure endowments. The steps to calibrate model (3.3.5) are given as follows:
1. Fit a Coxian phase-type distribution to the projected mortality rates in column
   one. The fitting algorithm is the EMpht program provided by Asmussen, Nerman, and
   Olsson (1996). As a result, we obtain a fitted phase-type distribution with
   representation (α, Λ):

   α = ( 0.836874  0  0  0.1066376  0.05648823 )    (3.3.10)

   Λ = [ -0.2381937   0.2378846   0           0           0
          0          -0.2384834   0.237982    0           0
          0           0          -0.2390828   0.2380828   0
          0           0           0          -0.2423468   0.2413468
          0           0           0           0          -0.2433471 ]    (3.3.11)

Table 3.1: Survival rates P(0, 0, t, x) under physical and risk-neutral measures

  t    E^P[S(t)|M_0]   E^Q[S(t)|M_0]   E^Q[S(t)|M_0]       E^Q[S(t)|M_0]
       (column 1)      (column 2)      based on (3.3.12)   based on (3.3.13)
  1      0.9836          0.9837          0.9853              0.9850
  2      0.9661          0.9662          0.9696              0.9696
  3      0.9475          0.9477          0.9533              0.9523
  4      0.9278          0.9281          0.9361              0.9345
  5      0.9068          0.9074          0.9179              0.9154
  6      0.8845          0.8856          0.8982              0.8947
  7      0.8610          0.8626          0.8766              0.8722
  8      0.8360          0.8384          0.8529              0.8476
  9      0.8095          0.8129          0.8269              0.8209
 10      0.7816          0.7862          0.7682              0.7921
 11      0.7522          0.7583          0.7358              0.7614
 12      0.7213          0.7292          0.7358              0.7291
 13      0.6888          0.6989          0.7017              0.6952
 14      0.6548          0.6675          0.6663              0.6603
 15      0.6195          0.6350          0.6299              0.6246
 16      0.5828          0.6015          0.5930              0.5885
 17      0.5448          0.5672          0.5558              0.5524
 18      0.5059          0.5321          0.5189              0.5165
 19      0.4661          0.4965          0.4825              0.4812
 20      0.4258          0.4606          0.4469              0.4467
 21      0.3853          0.4245          0.4125              0.4133
 22      0.3450          0.3885          0.3793              0.3810
 23      0.3054          0.3530          0.3476              0.3502
 24      0.2667          0.3180          0.3174              0.3209
 25      0.2297          0.2841          0.2890              0.2932
2. Solve for the optimal parameters (u, ν) such that the model values E^Q[S(t)|M_0]
   match the market term structure of mortality in column two. That is, solve

   min_{(u, ν)}  ∑_t ( Ê^Q[S(t)|M_0] − E^Q[S(t)|M_0] )^2    (3.3.12)

   where Ê^Q[S(t)|M_0] denotes the model value and E^Q[S(t)|M_0] the market value
   (a calibration sketch in this spirit is given after this list). The results for (3.3.12) are

   u = 0.9483659,   ν = 1.2481193,

   and the corresponding survival rates are provided in column 3 of Table 3.1.
3. In Cairns, Blake, and Dowd (2006b), the longevity bond price is derived to
   be P = 11.442 using formula (3.3.3), with the zero-coupon prices being set as
   D(0, t) = 1.04^{−t}. It is therefore of interest to also look for the parameters (u, ν)
   which give the closest price for the LBs to the one obtained by Cairns et al.
   That is, solve for the parameters (u, ν) that minimize

   | ∑_{T=1}^{25} D(0, T) E^Q[S(T)|M_0] − 11.442 |    (3.3.13)

   The results for (3.3.13) are

   u = 0.95544785,   ν = 1.0513425,
Figure 3.8: Phase-type survival curves under the physical and risk-neutral measures (curves shown: GAD projection, fitted phase-type curve, Cairns risk-neutral curve, calibrated risk-neutral curve)
and the corresponding survival rates are provided in column 4 of Table 3.1.
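As referenced in step 2, the sketch below illustrates a least-squares calibration in the spirit of (3.3.12), using the fitted (α, Λ) from (3.3.10)-(3.3.11) and the market curve in column two of Table 3.1. The model survival curve is evaluated under our reading of the gamma time change (mean t, variance νt), so the resulting (u, ν) need not reproduce the values reported above exactly.

```python
import numpy as np
from scipy.optimize import minimize

# Fitted phase-type representation from (3.3.10)-(3.3.11)
alpha = np.array([0.836874, 0.0, 0.0, 0.1066376, 0.05648823])
Lam = np.array([
    [-0.2381937,  0.2378846,  0.0,        0.0,        0.0      ],
    [ 0.0,       -0.2384834,  0.237982,   0.0,        0.0      ],
    [ 0.0,        0.0,       -0.2390828,  0.2380828,  0.0      ],
    [ 0.0,        0.0,        0.0,       -0.2423468,  0.2413468],
    [ 0.0,        0.0,        0.0,        0.0,       -0.2433471]])
e = np.ones(5)
t = np.arange(1, 26)

# Market term structure E^Q[S(t)|M0], t = 1, ..., 25 (column two of Table 3.1)
market = np.array([0.9837, 0.9662, 0.9477, 0.9281, 0.9074, 0.8856, 0.8626, 0.8384,
                   0.8129, 0.7862, 0.7583, 0.7292, 0.6989, 0.6675, 0.6350, 0.6015,
                   0.5672, 0.5321, 0.4965, 0.4606, 0.4245, 0.3885, 0.3530, 0.3180,
                   0.2841])

def model_survival(u, nu):
    """E[alpha exp(u*Lam*gamma_t) e], reading gamma_t as a gamma subordinator with
    mean t and variance nu*t, so E[exp(lam*gamma_t)] = (1 - nu*lam)^(-t/nu);
    evaluated through the eigendecomposition of u*Lam."""
    lam, H = np.linalg.eig(u * Lam)
    w = (alpha @ H) * np.linalg.solve(H, e)          # weights (alpha h_i)(v_i e)
    vals = [(np.power(1.0 - nu * lam, -s / nu) * w).sum() for s in t]
    return np.real(np.array(vals))

def objective(par):                                  # least-squares criterion as in (3.3.12)
    u, nu = par
    if u <= 0 or nu <= 0:
        return 1e6
    return np.sum((model_survival(u, nu) - market) ** 2)

res = minimize(objective, x0=[1.0, 1.0], method="Nelder-Mead")
print("calibrated (u, nu) =", res.x)
```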
In Figure 3.8, the fitted and calibrated (via (3.3.12)) survival curves are presented
together with the curves of E^P[S(t)|M_0] and E^Q[S(t)|M_0]. As one may see,
the match is not perfect. We suggest that the result can be improved
by using a higher-dimensional phase-type distribution and/or by considering a generalized
risk-neutral model with state- and time-dependent risk-adjusting parameters as
in (3.3.7).
The variances associated with the time-changed mortality model are plotted in
Figures 3.9 and 3.10. It is of great interest to remark that the resulting
variance curve from our time-changed mortality model is of quite a different form from
what is obtained in Cairns, Blake, and Dowd (2006b). In Figure 3.9, we observe that
the variance peaks at time t = 19, which corresponds to age 84. Afterwards, the variance drops
slowly, eventually approaching zero at ages above 120.

Figure 3.9: Variances of the time-changed survival probabilities (Var[S(t)] plotted against time t)

Figure 3.10: Time-changed survival curve with one-σ confidence intervals (P(0, 0, t, x) ± σ(t) plotted against time t)
The interpretation of this phenomenon is that our model predicts uncertainty
about the long-term survival probabilities only up to a certain age. There exists a
threshold age ω under the time-changed mortality model; beyond this age ω (say,
age 120), the survival probabilities carry little uncertainty (at least under the
market pricing measure). This makes sense to us because, although we are unsure
about the path of mortality evolution, the majority still believe that death presumably
remains as inevitable as it always was (Benjamin Franklin). The extreme threshold
age may be pushed further back, but there always exists one for the current cohort
under consideration.
In contrast, Cairns et al’s two factor model predicts the variance increasing expo-
nentially to infinity as future time extends, as can be seen from Figure 4 in Cairns,
Blake, and Dowd (2006b).
Therefore, our model differs from the two-factor model at extreme ages: under our
model, the potential to improve the survival probabilities at extreme ages is limited,
while under Cairns et al's model, this potential is unlimited. We think our model offers
an alternative vision regarding future mortality development. The reasonableness of
this property is subject to further investigation from both theoretical and empirical
aspects.
3.3.5 Discussion
In November 2004, the EIB/BNP longevity bond was announced with a total issue value
of £540 million. However, this security was not well received by investors.
Due to a lack of sufficient demand for the bond to be launched,
it was withdrawn for redesign in late 2005. It is therefore of interest to examine
some implementation details, since we strongly believe that demand for further such
instruments and innovations can still be expected.
• The survival index that is central to determining the coupon payments will be
provided by the Office for National Statistics (ONS, a UK government agency).
These death rates are considered a reliable and easily obtainable public
source. This arrangement should help allay investors' fears that the
index could be manipulated by insurance companies.

However, this also presents a problem that is often referred to as basis risk:
the risk that the reference population differs from the particular population of a
pension plan or an insurance policy.
The reason is simple: if a company wishes to use such contracts to hedge its
mortality risk, mortality improvement in the reference population has to match
that of the lives it has insured; otherwise, the company will be exposed to significant basis risk
and the mortality derivative might not provide an adequate hedge.
Usually, the larger the pension fund, the smaller the basis risk.
• Also, in the offer document, BNP Paribas suggests calculating the index using
crude death rates rather than mortality rates. Let m(x, y) represent the crude
central death rate for (x) published by the ONS in year y. The survival
index S(t) is proposed to be calculated as follows:

   S(0) = 1
   S(1) = S(0) × (1 − m(65, 2003))
   S(t) = S(0) × (1 − m(65, 2003)) × ··· × (1 − m(64 + t, 2002 + t))    (3.3.14)
whereas, according to the terms of the contract, m(x, y) in the above expression should
really be the mortality rate q(x, y).
Using crude death rates to approximate the survival index may help avoid sub-
jectivity in smoothing methodologies; however, it will result in underesti-
mation of the true survival index. It has been suggested that the crude death
rates be converted to mortality rates using the usual approximation
q(x, t) = m(x, t)/[1 + (1/2) m(x, t)] to reduce this kind of bias (see Blake, Cairns,
and Dowd, 2006).
• However, we feel that the failure of the EIB/BNP LB is probably due not to
implementation reasons (discussed above or elsewhere by other authors,
for example, Blake, Cairns, and Dowd (2006) and references therein), but rather
to product design flaws.

The bond is designed to provide an ideal hedge to those who bear the same type
of longevity risk over a period of 25 years. However, it is well known that,
on one hand, the early coupon payments (say, those in the first 10 years) have very
low longevity risk attached to them; on the other hand, these cash flows are
also the most expensive part of the bond. Consequently, for users who wish to
use these bonds as hedging instruments, such bonds use up a large amount
of capital to cover a long period of low-risk payments. As a result, hedging
turns into a full take-over of the liability, which seems not to be what hedgers want
in most situations.
In addition, since the bond covers only a single cohort (males aged 65), it leaves
behind many liabilities that are linked to different age cohorts, different
terms of maturity, and the other gender (females). This means that the bond
might not be as effective as it is supposed to be, and naturally it is not so attractive
to investors.
To sum up, we conclude that, although the first innovative longevity bond certainly
has many merits in satisfying the needs of the financial and insurance markets, it also seems
clear that more highly geared contracts for longevity risk should be developed to
attract investors.
It is also worth mentioning that mortality-linked securities are attractive to more
general investors (other than mortality risk bearers) as well. This is because mortality
risk is usually considered to be uncorrelated with other types of assets. Therefore,
a long position in mortality-linked assets may improve risk diversification in a general
investment portfolio (see, for example, Cox, Fairchild and Pedersen, 2000). For those
investors, liquidity and transparency of the products are key properties when consid-
ering portfolio construction, yet currently these are the main concerns investors have
about the mortality market. Putting all these facts together, it is not surprising at all that
the market needs more time to accept mortality derivative products.
3.4 Miscellaneous
3.4.1 GAOs and Longevity Risk Crisis
In the previous section, we studied one specific mortality-linked security: the
EIB/BNP longevity bond. As we have shown, the longevity bond is a security de-
signed to provide life offices and pension plans with an instrument to hedge the
very-long-term longevity risk that they face. In fact, it is one of the two recently
launched financial securities designed specifically to help manage mortal-
ity risk; one is the LB, and the other is the Swiss Re short-term catastrophe bond. Many
other instruments, though, have been suggested in the literature or have been traded
over-the-counter for the same purpose.
Concern about longevity risk, or more generally about mortality risk, has been
growing, probably since the world's oldest life office, the Equitable Life Assurance
Society (ELAS), was forced to close to new business in December 2000. Between 1957
and 1988, ELAS had sold a type of pension annuity with so-called "guaranteed
annuity options (GAOs)" as an embedded feature of the contracts. A guaranteed
annuity option (GAO) is a right of the policyholder to convert his accumu-
lated fund at retirement at a guaranteed rate rather than at the market annuity rate.
At the time of issuance, those GAOs were considered essentially worthless, but they have
turned out to be very valuable due to a combination of reductions in market interest rates and
unanticipated falls in mortality rates at the oldest ages. The liability emerging from
the guarantees thus seriously raised solvency concerns for ELAS, requiring the setting
up of extra reserves, and finally resulted in financial difficulties for ELAS.
Some might blame the crisis on poor risk management by the company, and deem
that it could have been avoided if ELAS had hedged its exposure to both interest-
rate risk and longevity risk. However, as Blake, Cairns, and Dowd (2006) have pointed
out, even if ELAS had anticipated the problem, it still lacked good instruments
at that time to hedge its exposure to both risks, particularly longevity risk.
Therefore, this is in fact not an isolated problem of ELAS. In the UK during
the late 1970s and 1980s, a guaranteed annuity rate between cash and pension was a
common feature of individual pension policies, and such policies were sold by more than 40
companies in the market. Although these pension policies are no longer being sold
in the UK, similar guarantees exist in corresponding policies in
other countries. For example, in the United States' variable annuity market, there
are guaranteed annuity rate (GAR) contracts and guaranteed minimum accumula-
tion benefit (GMAB) contracts. A GAR contract is identical to a GAO. A GMAB
contract includes the additional feature that the cash benefit available at retirement
is guaranteed to be at least a pre-specified amount. Thus, the problem that dragged
ELAS down is still haunting the market. Fortunately, this incident has stimulated
a great deal of research into the issues related to mortality risk and has
opened the door to the development of a mortality derivative market.
The next subsection is devoted to a preliminary study on pricing GAOs. It
is then followed by a general introduction to the development of the mortality-linked
security market, with which we end this chapter. As one can see, many research projects can be
developed from here.
3.4.2 Pricing Guaranteed Annuity Options – Preliminary Study
A guaranteed annuity option (GAO) is a contract which provides the policyholder
the right to convert his/her cash benefit to an annuity at a guaranteed conversion
rate of g (g = 9, normally) at maturity time T. The cash amount which can be used
for conversion is equity-linked, denoted as A(T ); that is, it equals the current market
value of the reference portfolio.
Let ax(T ) represent the actuarial present value at time T of a whole life annuity
which pays $1 per year throughout the remaining lifetime of the policyholder of age
x. Therefore the value of the GAO at maturity T depends on the equity-linked value
of A(T), the prevailing market price of a_x(T), and the conversion rate g, as follows:

   ( A(T)/g · a_x(T) − A(T) )^+  =  (A(T)/g) ( a_x(T) − g )^+ .
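To make the payoff concrete, here is a tiny sketch with made-up inputs: a fund of A(T) = 10,000, the guaranteed rate g = 9, and a market annuity price a_x(T) = 11 give a GAO payoff of (10000/9)(11 − 9) ≈ 2222.

```python
def gao_payoff(A_T, a_x, g=9.0):
    """GAO value at maturity: (A(T)/g) * max(a_x(T) - g, 0); all inputs illustrative."""
    return A_T / g * max(a_x - g, 0.0)

print(gao_payoff(10_000.0, 11.0))   # about 2222.22; worthless whenever a_x(T) <= g
```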
The problem of pricing a GAO is similar to that of pricing a call option written on a
coupon-bearing bond. Problems of this kind in finance are often approached by first pricing a call option
on a zero-coupon bond and then combining such options into a portfolio. Therefore, in the following,
we will briefly study the problem of pricing call options on zero longevity bonds under a
constant interest rate. The rest of the work will be part of our future ongoing projects. At
least two directions of future work follow from the results in this section. One
is to price GAOs from the prices of zeros; another is to price under the assumption of a
stochastic interest rate model.
Now let us first introduce zero longevity bonds. A zero LB (or just a zero), denoted
by L(0, t), is a single-payment contract whose payoff at time t is
proportional to the survivor index S(t, x). A zero can be viewed as a deferred, one-
period LB. A call option, C(t, K), written on the zero L(t, t + T), offers the holder
the right to buy, at strike price K and at time t, a zero which matures at date t + T with
the payment linked to S(t + T, x + t).
From now on, we will work in the risk-neutral world. Assume that the survivor index
S(t, x) of interest refers to a cohort of (x) at time 0. For this cohort, the
aging process follows a time-changed Markov process Z_t = J_{γ_t}, where the underlying
Markov aging process J_t has phase representation (α, Λ) and γ_t is a Gamma process
Γ(t; 1/ν, 1/ν). Furthermore, let S(t, t + T; x) denote the random survival probability
for the same cohort, conditional on being alive at time t, that one will survive another
T years from time t. Sometimes we simply write S(t, t + T) for short. Hence

   S(t, t + T) = S(t + T, x) / S(t, x) = (α e^{Λ γ_{t+T}} e) / (α e^{Λ γ_t} e).

Note that γ_t has independent increments (so the increment γ_{t+T} − γ_t has the same distribution as γ_T), and we have

   S(t, t + T) = (α e^{Λ γ_t} e^{Λ γ_T} e) / (α e^{Λ γ_t} e) = π(x + t) e^{Λ γ_T} e,

where π(x + t) is defined as

   π(x + t) = α e^{Λ γ_t} / (α e^{Λ γ_t} e).
We can interpret π(x + t) as the initial distribution for the surviving cohort at time
t.
Let P(t, t, t + T, x) denote the conditional expectation of S(t, t + T) given the
filtration M_t, i.e.

   P(t, t, t + T, x) = E[S(t, t + T) | M_t].

Applying Theorem 3.2.1, it follows that

   P(t, t, t + T, x) = E[S(t, t + T) | M_t]
                     = E[π(x + t) e^{Λ γ_T} e | M_t]
                     = π(x + t) E[e^{Λ γ_T}] e
                     = π(x + t) e^{Λ T} e.    (3.4.1)

From formula (3.4.1), we know that P(t, t, t + T, x) has phase representation (π(x + t), Λ).
Now consider pricing the call option C(t, K). For simplicity, we assume a constant
risk-free interest rate r. Then the value of this call option at time t can be expressed as

   ( P(t, t, t + T, x) e^{−rT} − K )^+,

and the call option price at time 0 can be calculated as

   E[ e^{−rt} S(t, x) ( P(t, t, t + T, x) e^{−rT} − K )^+ | F_0 ]    (3.4.2)

according to the fundamental theorem of asset pricing.
Without loss of generality, the problem in (3.4.2) is equivalent to evaluating

   E[ S(t, x) ( P(t, t, t + T, x) − K )^+ | M_0 ]    (3.4.3)
   = E[ S(t, x) ( P(t, t, t + T, x) − K )^+ ]    (3.4.4)
   = E[ ( α e^{Λ γ_t} e^{Λ T} e − K α e^{Λ γ_t} e )^+ ].    (3.4.5)
Now write

   e^{Λ T} = H ( e^{λ_i T} )_diag H^{−1},
   e^{Λ γ_t} = H ( e^{λ_i γ_t} )_diag H^{−1}.

Then

   e^{Λ γ_t} e^{Λ T} = H ( e^{λ_i γ_t} )_diag ( e^{λ_i T} )_diag H^{−1}
                     = H ( e^{λ_i γ_t} e^{λ_i T} )_diag H^{−1}.

In a similar way, write K = e^k and

   K = H ( e^k )_diag H^{−1},
   K · e^{Λ γ_t} = H ( e^{λ_i γ_t} )_diag ( e^k )_diag H^{−1}
                 = H ( e^{λ_i γ_t} e^k )_diag H^{−1}.

Hence

   α e^{Λ γ_t} e^{Λ T} e − K α e^{Λ γ_t} e
     = α H ( e^{λ_i γ_t} e^{λ_i T} − e^{λ_i γ_t} e^k )_diag H^{−1} e
     = ∑_{i=1}^n ( e^{λ_i γ_t} e^{λ_i T} − e^{λ_i γ_t} e^k ) α (h_i ⊗ ν_i) e
     = ∑_{i=1}^n e^{λ_i γ_t} ( e^{λ_i T} − e^k ) α (h_i ⊗ ν_i) e.

Define γ_i = ( e^{λ_i T} − e^k ) α (h_i ⊗ ν_i) e. Then we obtain

   E[ S(t, x) ( P(t, t, t + T, x) − K )^+ | M_0 ]
     = E[ ( ∑_{i=1}^n e^{λ_i γ_t} γ_i )^+ ]
     = ∫_0^∞ ( ∑_{i=1}^n e^{λ_i s} γ_i )^+ f_{γ_t}(s) ds,    (3.4.6)
which can be easily calculated.
Therefore, formula (3.4.6) provides an explicit form for the price of a call option
written on zero longevity bonds.
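The sketch below evaluates formula (3.4.6) numerically. The phase-type representation, strike and maturities are hypothetical placeholders, and the density f_{γ_t} is taken to be that of a gamma distribution with mean t and variance νt (our reading of Γ(t; 1/ν, 1/ν)); the discount factors dropped in going from (3.4.2) to (3.4.3) are not reinstated here.

```python
import numpy as np
from scipy.stats import gamma as gamma_dist
from scipy.integrate import quad

# Hypothetical phase-type representation (alpha, Lam) and contract data; placeholders only
alpha = np.array([1.0, 0.0, 0.0])
Lam = np.array([[-0.30,  0.29,  0.00],
                [ 0.00, -0.31,  0.30],
                [ 0.00,  0.00, -0.32]])
e = np.ones(3)
t, T, K, nu = 10.0, 5.0, 0.5, 1.2     # option maturity t, deferment T, strike K, parameter nu

# Spectral decomposition Lam = H diag(lam_i) H^{-1}
lam, H = np.linalg.eig(Lam)
Hinv = np.linalg.inv(H)

# Coefficients gamma_i = (exp(lam_i T) - K) * (alpha h_i)(v_i e), as in the derivation above
coef = (np.exp(lam * T) - K) * (alpha @ H) * (Hinv @ e)

# Formula (3.4.6): integrate (sum_i exp(lam_i s) gamma_i)^+ against the density of gamma_t
def integrand(s):
    return max(float(np.real(np.sum(np.exp(lam * s) * coef))), 0.0) \
           * gamma_dist.pdf(s, a=t / nu, scale=nu)

value, _ = quad(integrand, 0.0, np.inf)
print(f"E[S(t, x)(P(t, t, t+T, x) - K)^+] = {value:.6f}")
```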
3.4.3 A Snapshot of Mortality Derivative Market
Since 2000, there have been many discussions proposing various forms of mortality-
linked contracts for the market. We therefore feel it is necessary to briefly intro-
duce the developments in this area. Generally speaking, these contracts can be
classified into the following four types: (i) mortality bonds, (ii) mortality swaps, (iii)
mortality futures, and (iv) mortality options. However, we have no intention of giving
full descriptions of each of them. Instead, we just want to highlight the basic features
of these contracts and point out how they might be used to help manage mortality
risk. The following content is mainly based on the material given in Blake, Cairns,
and Dowd (2006).
1. Mortality bonds
These securities have the usual features we would expect of bonds. The only
difference is that the payments (coupons or principal) are now related to mortality
rates in one way or another. Accordingly, mortality bonds fall under two broad
categories. The first are "principal-at-risk" bonds, in which investors risk losing all
or part of the principal if the relevant mortality event occurs. The second are "coupon-
based" mortality bonds, in which the coupon payment is mortality dependent. The
EIB/BNP LB is an example of a coupon-based bond. One can also imagine various
types of hybrid bonds in which both principal and coupon are at risk if specified
mortality events occur.
The term of the bonds can be a preset finite period (like the EIB/BNP LB) or a
perpetuity. For example, Blake and Burrows (2001) have proposed a type of longevity
bond in which the payments last until the death of the last surviving member of the
reference cohort. The nature of the dependence on mortality can vary too: the
payment might be a smooth function of a mortality index, or it might be specified
in "at risk" terms, i.e. the investor loses some or all of the coupon or principal if the
mortality index crosses some threshold.
Here are some prominent examples of mortality bonds:
• the Swiss Re Mortality bond
In December 2003, Swiss Re issued a three-year life catastrophe bond, maturing
on January 1, 2007. It is a principal-at-risk bond, and its principal is repaid
according to the following schedule:

   100%                                    if q_t < 1.3 q_0
   [ (1.5 q_0 − q_t) / (0.2 q_0) ] × 100%  if 1.3 q_0 ≤ q_t < 1.5 q_0
   0%                                      if q_t ≥ 1.5 q_0
where q_t is a specifically constructed index of mortality rates across five coun-
tries (the United States of America, the U.K., France, Italy and Switzerland) in year t,
and q_0 is its base level. The issue size is $400 million, and quarterly coupons are
paid at three-month U.S. dollar LIBOR plus 135 basis points.
The bond is designed to help reduce Swiss Re's exposure to extreme mor-
tality risk (e.g. that associated with a repeat of the 1918 Spanish Flu
pandemic). There are some other advantages too. First, Swiss Re can improve
its credit rating and reassure rating agencies about its mortality risk manage-
ment capability. Second, by issuing the bond itself, Swiss Re does not need to
rely on other counterparties should an extreme mortality event occur. Thus, the
bond gives Swiss Re some protection against extreme mortality risk while avoiding
the credit risk it might face in reinsurance.
In contrast with the EIB/BNP LB, the Swiss Re bond was well accepted by
the market. In April 2005, Swiss Re announced a second life catastrophe bond
which will mature in 2010 with an issue value of $362 million.
• Zero-Coupon Longevity Bonds
The contract terms of a zero-coupon LB (or simply a "zero") have
been discussed in the previous subsection. It is easy to imagine that single-
payment longevity bonds might be issued, or financially engineered by stripping
"standard" longevity bonds, as has been done in the fixed-income market. The
attraction of zeros is that they provide building blocks for tailor-made positions
or for theoretical studies.
A two-dimensional spectrum of such bonds could be issued: one dimension is re-
lated to the cohorts being followed and the other related to the maturity dates.
The availability of a sufficient variety of bonds from this two-dimensional spec-
trum would then enable insurance companies to construct portfolios of longevity
bonds that provide close fits to the size/age features of their particular annu-
ity books. The development of such products can also enhance the appeal of the
mortality derivative market as an investment resource in general.
• Deferred Longevity Bonds
As we have discussed before, we might need to have more highly geared bonds
which enable users to meet their hedging demands with a much reduced capital
outlay. One way of increasing gearing is to issue bonds with deferred payment
dates. The deferments would save a large amount of capital, and so make such
longevity bonds much more attractive as hedging instruments.
2. Survivor swaps
A mortality swap is an agreement where counterparties swap a fixed series of
payments in return for a series of payments linked to a survivor or mortality index.
Typically, the preset rate leg is linked to a published mortality projection, and the
floating leg is linked to the counterparty’s realized mortality.
Mortality swaps have certain advantages over longevity bonds. They can be ar-
ranged at a lower transaction cost than issuing a bond. They are more flexible and they
can be tailor-made to suit diverse circumstances. They do not require the existence
of a liquid market, just the willingness of counterparties to exploit their comparative
advantages or to trade views on the development of mortality over time.
Actually, there is an embedded over-the-counter (OTC) mortality swap arrange-
ment in the issuance of the EIB/BNP bond. Here is what goes on behind the scenes. By
issuing the bond, the EIB has a commitment to make mortality-linked payments in
sterling. To guarantee that the bond commitment is met, the EIB engages in two
swaps. The first one is a cross-currency interest rate swap with BNP to exchange
floating euro payments for fixed sterling payments. In the meantime, the EIB
also sets up a deal with Partner Re, in which the EIB exchanges the fixed sterling payments for
the mortality-linked floating sterling payments S(t). As a result, the EIB exchanges its
commitment from the longevity bond for a commitment to pay floating euros, and
gets rid of the mortality risk exposure.
The above-mentioned mortality swap used by the EIB is called a vanilla mortality
swap (VMS). A VMS is analogous to a vanilla interest-rate swap (IRS), which involves
one fixed leg and one floating leg typically tied to a market rate such as LIBOR.
However, the IRS can be easily valued owing to the existence of a liquid bond market.
This is not the case for the VMS at present, due to the lack of a liquid and transparent
spot mortality risk market.
Mortality swaps have a number of possible uses. As Lin and Cox (2006) explain, a
mortality swap can be used to help firms that run both annuity and life books manage
the natural hedges implicit in their positions. The type of swap in this case might
be a floating-for-floating swap, with one floating leg tied to the annuity provider’s
annuity payments and the other to the life assurer’s insurance payouts. In the same
way, firms could use such swaps to manage their exposures across both reference
populations and across the “mortality term structure”. Also a mortality swap can
be used between an insurer, wishing to manage the risks on its annuity book, and
a capital market institution wishing to acquire longevity risk exposure (see Blake,
Cairns, and Dowd, 2006).
3. Mortality futures
As in the financial market, a futures contract is an agreement between counter-
parties to buy or sell a security at a future time for a preset price. The basic form
of a futures contract involves defining (a) the underlying (typically price) process,
X(t), that will determine the payoff and the value of the futures contract, and (b)
the delivery date, T , of the contract. When X(t) itself represents the price of a traded
asset, the advantage of a futures market is normally that it allows stakeholders to
trade in the underlying risk with lower transaction costs and in a market with greater
liquidity than is usually possible from trading in the underlying in the spot market.
The factors that might jeopardize the success of a mortality futures contract have
been pinned down to the following:
• There must be a large, active and liquid spot market for the underlying with
good price transparency. This is by far the most important factor: indeed it is
extremely rare for a futures contract to survive without a spot market satisfying
these conditions.
• Spot prices must be sufficiently volatile to create both hedging needs and spec-
ulative interest.
• The market in either the underlying or the futures must not be heavily concen-
trated on either the buy or sell side, because this can lead to price manipulation.
• The underlying must be homogeneous or have a well-defined grading system.
• Liquidity costs (i.e. bid-ask spreads and execution risk — the risk of adverse
price movements before trade execution) in the futures contract must not be
significantly higher than those operating in any existing cross-hedge futures
contract.
The challenge of finding suitable mortality-linked products to serve as the underlying has
been discussed in Blake, Cairns, and Dowd (2006). As a concrete example,
Cairns, Blake, and Dowd (2006a) have introduced the concept of annuity futures.
Suppose ARM(t, x) and ARF(t, x) represent the market-level annuity rates available
at time t per $10,000 single premium for males and females at age x, respectively.
That is, the single premium of $10,000 will purchase an annuity of AR(t, x) per
annum (here we drop the index for the corresponding gender). Then an annuity futures
contract would have AR(t, x) as its underlying index. There would be a variety of
maturity dates. The spot market for immediate annuities is a fairly active one, but
the problem is that this market is neither liquid nor transparent. It is also quite
inefficient in the sense that market annuity rates do not track changes in market
interest rates as closely as they would in an efficient financial market.
Another example of mortality futures is the one suggested by Blake, Cairns, and
Dowd (2006): if a liquid market in longevity bonds develops in time,
then it might be possible for a futures market to develop which uses the price or prices
of longevity bonds as the underlying. Day-to-day volatility in longevity bond prices
will be driven by changes in interest rates, whereas the risk associated with changes
in longevity emerges over longer periods of time. It is also possible to use a survivor
index as the underlying.
4. Mortality options
A mortality option is a type of contract with option characteristics whose payoff
depends on the underlying mortality schedule at a preset date. Mortality options
are of great interest since they can provide (a) protection for investors against the
downside exposure of the risk while leaving the upside potential intact, and (b) an instrument
for speculators who want to trade views on volatility rather than views on the level
of mortality rates. For both of these purposes, options are (usually) the best type of
instrument.
• Survivor Caps and Floors

A possible market in survivor caplets and floorlets has been suggested by Blake,
Cairns, and Dowd (2006). The basic idea is to use a survivor index S(t, x) as
the underlying. Let s_c(t) be the cap rate for exercise date t. The caplet pays
max{S(t, x) − s_c(t), 0} at time t. Similarly, a floorlet pays max{s_f(t) − S(t, x), 0}.
Survivor caplets and floorlets can then be packaged into survivor caps and floors.
Alternative names could be longevity caps and floors.
• Mortality Swaptions
A more sophisticated contract would be a mortality swaption in which the
underlying instrument would be a mortality swap of specified type and maturity.
The swaption might be American, European or Bermudan in nature, and would
give the holder the right to enter into the swap on one or other side. For example,
a payer swaption gives the holder the right to enter as the fixed-rate payer; vice
versa, a receiver swaption gives the holder the right to enter as the fixed-rate
receiver. As with conventional swaptions, a payer swaption can be regarded as
a put on survivor rates, because its value would go up when survivor rates fall,
and a receiver swaption can be regarded as a call on survivor rates, because its
value would increase when survivor rates rise.
Mortality swaptions can be used for various risk management purposes. An
obvious use is to provide the option to lock-in future swap rates, which might
assist insurance companies in managing the risks of positions in instruments
such as guaranteed annuity options.
• Guaranteed Annuity Options

Finally, the guaranteed annuity option mentioned before is another example of
a mortality option; however, it is a more complicated product which involves
interest rate risk as well. Contracts of this type are probably the most
discussed mortality-linked products in the pioneering stochastic mortality research
so far. Interested readers are referred to the works by Boyle and Hardy (2003),
Ballotta and Haberman (2003, 2006) and the references therein.
We would like to finish our introduction to mortality-linked securities here. Al-
though there are still many teething problems to overcome, there is no doubt that the mor-
tality derivative market is becoming the next big frontier for the financial market.
For issues related to implementation and securitization, we would like to refer
interested readers to the works by Cowley and Cummins (2005), Dowd et al. (2006),
Lin and Cox (2005a), Blake, Cairns and Dowd (2006), and the references therein.
The valuation problem of mortality derivatives as well as their risk management
require the use of a good stochastic mortality model. Our proposed time-changed
Markov model may provide an alternative toolkit to handle this type of problem.
Another interesting direction we would like to pursue in the future is to expand
the Markovian framework developed in this chapter to incorporate stochastic inter-
est rates. The motivation for this generalized Markov framework is two-fold. In the
first place, current interest rate models may not be adequate for very long-term
products (the average duration can be more than 40 years for longevity products). For
example, in Boyle and Hardy (2003) and Ballotta and Haberman (2006), the interest
rate is modelled by the Hull-White model and the Heath-Jarrow-Morton model, re-
spectively. Their results showed that the value of guaranteed annuity options (GAOs) is
dominated by the current high interest rate no matter how long the maturity of the
GAO is. This may not be a reasonable result. In the second place, it is mathemat-
ically tractable to incorporate multiple risk factors under the same framework.
To be specific, the short rate process can be modelled as a function of a finite
Markov process which represents the “states of the economy”, similar to the models
proposed in Norberg (2003), Elliott, Hunter, and Jamieson (2001), and Elliott and
Mamon (2002). Under this Markovian framework, explicit expressions for the prices
of zero-coupon bonds and other securities can usually be obtained in terms of matrix
exponentials, which is quite similar to the results for the survival bonds under
the time-changed Markov mortality model. It should be interesting to integrate the two
Markov models together. The augmented framework could be used to price and hedge
long-term mortality-risk-related products, and it could also be used to evaluate
insurance liabilities and risk measures.
Chapter 4
Deterministic Fitting
In this chapter, we set out to show that a phase-type distribution and its associated
Markov process can be a very good candidate for a survival model. In addition to
providing statistical evidence to support this idea, we also propose an aging mechanism
as the interpretation of the model.
This chapter is organized as follows. In Section 4.1, we briefly introduce the aging
process and the key concept — the physiological age. In Section 4.2, we propose and
discuss a deterministic survival model. We present the fitting results for the Swedish
cohorts’ data in Section 4.3, and the fitting results for the mortality data compiled
by the United States Social Security Administration in Section 4.4. Goodness-of-fit
analysis is performed in Section 4.5. We discuss analytical properties of the pro-
posed model in Section 4.6, which illustrates the usefulness of the model and the
matrix-analytic method in pricing insurances and annuities. A short discussion on
this approach is given in Section 4.7.
4.1 Aging Process and Physiological Age
We discussed in Section 1.2 how extrapolative methods have been used exten-
sively in mortality projection. When one employs a parametric model to extrapolate
the past trend into the future, it is implicitly assumed that the historical patterns
will still hold in the future and that no structural change will occur. This is certainly not
true. Over the past century, we have observed changes in mortality patterns due
to the transition in major causes of death from infectious diseases to chronic diseases
(see Tuljapurkar and Boe, 1998). Moreover, the end of the 20th century has been
marked by declines in death rates from chronic and degenerative diseases (see Wil-
lets, 1999, for an empirical study on mortality reduction). In order to partly correct
the problems in the extrapolative projection, it is necessary that expert opinions on
medical, behavioral, or social impacts on mortality be incorporated into the projec-
tion model. However, since there are no direct links between the parameters in the
mortality models and the aging mechanism, such incorporation is often difficult.
Hence, it is desirable to have a mortality model that (i) fits observed mortality
data; (ii) can link its parameters to the biological/physiological mechanism of aging
to a certain extent, which allows for easier input of expert opinions; and (iii) allows
quantitative analysis on causes of death to be made at a more fundamental level. The
capability of constructing a mortality model with these desirable properties has been
considered important; see Tenenbein and Vanderhoof (1980) and Gutterman and Van-
derhoof (1998). However, to the best of our knowledge, a satisfactory model of this kind has
not yet been developed. The model we propose is a new attempt in this direction. We
start with modelling the underlying aging process using a finite-state continuous-time
Markov process with a single absorbing state (representing death). Consequently, the
survival functions can be derived from the model setting.
A complete description of aging theory is beyond the scope of this paper. Inter-
ested readers are referred to the classical books by Comfort (1964) and Finch (1990),
and also the papers by Collatz (1986) and Hayflick (2002). In the following, we high-
light some key features of an aging process and in particular we will focus on the
relationship between age-specific mortality patterns and the aging process.
As quoted from Jones (1956), the term aging process, as applied to living organ-
isms, is the genetically determined, progressive, and essentially irreversible diminution
with the passage of time of the ability of an organism or of one of its parts to adapt
to its environment, manifested as diminution of its capacity to withstand the stresses
to which it is subjected (that is, the increase of susceptibility to certain diseases with
age), and culminating in the death of the organism.
Clearly, human aging is associated with a wide range of physiological changes: the
general lessening of the intensity of perfusion of blood through the various tissues,
the disturbances of lipid metabolism and the growth of atherosclerotic deposits, and
the rarefaction of the bony structure, to name a few. Such changes make a human
life not only more susceptible to a number of diseases but also more susceptible to
death. The deterioration of various physiological functions as a whole can be viewed
as the worsening of health status, suggesting that the increasing mortality rate with
age shall be largely linked to the health status rather than directly to the age of a
life.
A number of experimental studies have been conducted to seek the understanding
of various human physiological functional changes. A recent experimental study con-
ducted by Sehl and Yates (2001) calculated the loss rates for 445 human structural
and functional variables from 13 organs, and 24 more integrative variables. One in-
teresting finding is that a linear model can provide a fit to the data and the fit is as
good as the best polynomial fit. Other experimental studies on physiological func-
tions may be found in Shock (1974), Bafitis and Sargent (1977), Strehler (1999), and
references therein. Main findings in these studies on human physiological functional
changes can be summarized as: (1) most functional variables reach their maximal
capacities roughly between age 3 and 20; (2) after age 30, most human functional
variables decline linearly, contrary to the exponential increasing of mortality rates;
(3) the decline of physiological functions varies among individuals of a cohort, and this
variation increases somewhat with age. These findings suggest that the aging process
could be modelled in terms of changes in the physiological functions qualitatively and
quantitatively.
Motivated by these studies, we introduce a hypothetical physiological age that
marks a detectable physiological change resulting from one or more afore-mentioned
physiological functions. This physiological age may be interpreted as a relative health
index representing the degree of aging in a human body. This idea is similar to that in
Jones (1959), Sacher and Trucco (1962), Featherman (1986), Yashin et al (1995), and
Zuev et al (2000), where hypothetical health indices are also proposed but in different
contexts. Unlike some mortality models that focus on particular health factors (for
example, morbidity and disability. See Crimmins et al, 1994) and their connections
with mortality, we define the physiological age at a fundamental level and assume
that it only develops in one direction. Since the various physiological functions decline
approximately linearly, we assume that linearity in time also holds for the physiological
ages. Further, the change of health status or the transition from one physiological
age to the next is random, which is fundamentally different from the chronological
age. Lastly, mortality is viewed as both a reflection of the intrinsic aging process
and a response to the “environmental” factors. The interplay between the intrinsic
aging process and the external factors of death is the susceptibility described in the
aging definition. The increased physiological age implies the increased susceptibility
to certain fetal diseases, due to the deterioration of physiological capacity. Sometimes
this is interpreted as the “weakest link” of the whole organism failing, which results in
death and terminating the aging process as described in Austad (1997) and Strehler
(1999).
4.2 The Proposed Mortality Model
In the spirit of the discussions in the previous section, we now propose a finite-
state continuous-time Markov process to model the hypothetical aging process (see
Diagram 4.1). Each state represents a physiological age and aging is described as a
process of consecutive transitions from one physiological age to the next physiological
age. There is one absorbing state and the transition from any other state to the
absorbing state is interpreted as the aging process being terminated due to an early
death, either from a random cause or from a fatal disease.
Diagram 4.1. Proposed physiological aging process (states 1, 2, ..., n, with aging transition rates λ_1, ..., λ_{n−1} between consecutive states and death rates q_1, ..., q_n from each state to the absorbing state)
As shown in Diagram 4.1, for each physiological age i two parameters are used: one is
to describe the development of aging process and the other to reflect the susceptibility
to death at that physiological status. Specifically, parameter λi represents the transi-
tion rate from physiological age i to physiological age i + 1 and can be interpreted as
an aging rate that measures how fast an individual advances from age i to age i+1. It
should be pointed out that λi is an integrated measure of the deteriorating intensity
of aging for a population, not for a specific individual. Probabilistically speaking, the
time of staying at state i before moving to state i + 1 follows an exponential distri-
bution with mean 1/λi. Thus, the larger λi is, the faster the progression of aging of
a population is.
Parameter qi describes the susceptibility, that is, the chance of death, at each
physiological age i. There are two different types of hazard threats at each age. One
is an aging-independent cause of death such as a fatal injury. The rate of death from
this type of causes is denoted by h1(i) (there are more discussions on how to specify
h1(i) in later sections). The other is the increasing susceptibility to death due to the
deterioration in physiological functions as a result of aging. The rate of death from
this type of cause is denoted by h2(i), and is an increasing function of i. We assume
that these two rates are additive: q_i = h_1(i) + h_2(i). Hence, death is the competing
result of two forces: the internal irreversible decline of physiological functions and the
instantaneously random threat of death at each physiological age.
Since the Markov process has only a single absorbing state, the time of death (the
time of absorption) follows a phase-type distribution. Furthermore, the transition rate
matrix of the Markov process is given by

   Λ = [ −(λ_1 + q_1)   λ_1            0              ···   0
          0             −(λ_2 + q_2)   λ_2            ···   0
          0              0             −(λ_3 + q_3)   ⋱     0
          ⋮              ⋮              ⋱              ⋱     ⋮
          0              0              0             ···   −q_n ]    (4.2.1)

and the initial distribution is α = (1, 0, ..., 0).
We now propose special structures on the parameters λi, h1(i), and h2(i), not
only to make the model feasible for estimation but also to ensure that the model and
the parameters are given reasonable physiological meanings for the aging dynamics.
Moreover, a structured model avoids the non-uniqueness problem in the estimation
of the phase-type distribution.
Since the experimental findings show that various physiological functions follow
a slow age-wise linear decline on average, it is thus reasonable to assume a constant
transition rate between states, that is:
λi = λ, for i = k + 1, . . . , n,
where k will be defined later.
For the aging-independent hazard rate, we propose

   h_1(i) = b + a   for i_1 < i ≤ i_2
            b       otherwise          (4.2.2)

where the constant b is interpreted as a background rate and the constant a is inter-
preted as a behavior-related accident rate. The accident rate appears only between
ages i_1 and i_2 since the behavior-related rate is age dependent, but the background
rate is a general reflection of the living environment. We propose that parameter h_2(i)
is an increasing function of i to reflect the increasing susceptibility to hazard threats with
physiological age. As our first simple hypothesis, we assume h_2(i) to be a power
function of the form h_2(i) = i^p · q, where q is a scale parameter and p is a measure of
the relative impact of aging on susceptibility. Together, the parameter q_i is expressed
as

   q_i = b + a + i^p · q   for i_1 < i ≤ i_2
         b + i^p · q       otherwise          (4.2.3)

Although the analytical form of q_i seems very simple, the model performs very
well when fitted to mortality data, as illustrated in later sections.
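A small sketch of how the structured sub-generator (4.2.1) can be assembled from the parameters (n, λ, b, a, i_1, i_2, q, p) of (4.2.2)-(4.2.3). The parameter values below are rough illustrative numbers of the same order as those in Table 4.1, not the fitted estimates, and the developmental states of Diagram 4.2 are omitted.

```python
import numpy as np

def build_generator(n, lam, b, a, i1, i2, q, p):
    """Sub-generator (4.2.1) for aging states 1, ..., n with constant aging rate lam and
    q_i = b + a*1{i1 < i <= i2} + q * i**p as in (4.2.3)."""
    i = np.arange(1, n + 1)
    qi = b + q * i.astype(float) ** p
    qi[(i > i1) & (i <= i2)] += a
    Lam = np.zeros((n, n))
    lam_out = np.append(np.full(n - 1, lam), 0.0)   # no aging transition out of state n
    Lam[np.arange(n), np.arange(n)] = -(lam_out + qi)
    Lam[np.arange(n - 1), np.arange(1, n)] = lam    # transitions i -> i + 1
    return Lam

Lam = build_generator(n=200, lam=2.5, b=3e-3, a=2e-3, i1=42, i2=99, q=1e-8, p=3)
alpha = np.zeros(200); alpha[0] = 1.0               # start in physiological age 1
```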
In order to fit the model to death rates of all ages, we consider including a devel-
opmental period in which newborns adapt to the environment and develop to reach
their maximal physiological performance. This is also a period during which mor-
tality decreases to its minimal point, starting from a comparatively high mortality
rate at birth. k additional states are added before the aging process to model the
developmental period. The augmented Markov process is given in Diagram 4.2 with
an illustrative 2-state developmental period:
Diagram 4.2. The augmented Markov model of the physiological aging process (developmental states I, II followed by aging states 3, 4, ..., n, with transition rates λ_I, λ_II, λ_3, ..., λ_{n−1} and death rates q_I, q_II, q_3, ..., q_n)
Here, the Roman numbered states represent the developmental period. The aug-
mented Markov process has the same structure as the previous one. In practice, a
small number for k (2 to 4) is sufficient for the developmental period. As we will show
in the next section, k = 4 is sufficient when we fit the model to the Swedish cohort
data of Years 1811, 1861, and 1911.
Some remarks are now made:
(i) A technical advantage of this model is that the time of death has a phase-type
distribution. Phase-type distributions have been studied extensively in the queuing
context. They can easily be analyzed using the matrix-geometric method developed
by Neuts (1981). The density, survival function and moments of a phase-type distri-
bution have a simple analytical form as we have shown in Chapter 3. Further details
can be found in Neuts (1981) and Asmussen (1987). The matrix-geometric method
also enables us to derive the closed-form expression of the net single premiums of the
whole life insurance and the whole life annuity, and to perform qualitative analysis
on the model (see Section 4.6). In contrast, two widely used mortality models, the
Heligman-Pollard model in actuarial science and the Lee-Carter model in survival
analysis, cannot identify the distribution of the time of death explicitly. No analyt-
ical tool is available for either model. As a result, most studies using these
models have to rely on simulation for numerical results or on statistical experiments.
(ii) The proposed model implicitly assumes that there is a maximum physiological
age n. Theoretically, it may not be necessary for aging to be a finite process, but
when n is large enough, the model can provide an excellent fit to data and can
also accommodate the needs for mortality projection and insurance valuation. Our
numerical experiments in this paper show that n needs to be as large as 200 to
provide a good fit for the data sets.
(iii) When n = 200, we could encounter the dimensionality problem. However,
since the total number of parameters in the model is relatively small (9 to 13), the
dimensionality problem is not a serious issue. As a comparison, the Lee-Carter model
may need more than 300 parameters in order to fit mortality data. See Li, Hardy
and Tan (2007).
4.3 Fitting Swedish Cohort Data
In this section, we will fit the proposed model to three sets of Swedish mortality data.
The main reason we choose the Swedish mortality data to illustrate the implementa-
tion of our model is the reliability of the data as well as the high life expectancy of
the Swedish population. As described in the previous section, we are to model the
aging process in terms of physiological ages. For this reason, cohort data are more
suitable for fitting the model than cross-sectional (period) mortality data.
The Swedish cohort data for decennial years 1811 through 1911 are downloaded
from www.mortality.org, and their graphs are shown in Figure 4.1.
Some observations are worth noting. From Year 1811 to Year 1861, there was no
significant improvement in infant-childhood mortality and in mortality at advanced
ages. A noticeable mortality improvement took place between early 20’s and late
60’s. However, a significant mortality decline at all ages was observed from Year
1861 to Year 1911. In particular, the cohorts of Years 1811, 1861 and 1911 represent
three very different mortality patterns as shown by the bold curves in Figure 4.1.
For cohort 1811, the mortality first declines to a minimal point at around age 14
and then steadily increases. For cohort 1861, the mortality increases moderately
from the minimal point for about 30 years and then increases steeply. Cohort 1911
presents a “noticeable” accident hump at youth ages and begins its steady increase
earlier than cohort 1861, but from a comparatively lower rate. All three cohorts show approximately parallel mortality curves from age 60. We apply our model to the cohorts of Years 1811, 1861 and 1911.

[Figure 4.1: Death rate qx for Swedish Cohorts from year 1811 to year 1911, plotting ln(10000·qx) against age (0 to 120) for the cohorts of 1811, 1821, . . . , 1911.]

Table 4.1: Parameter values for the Swedish cohorts of years 1811, 1861, and 1911

Cohort   λ        b            a            [i1, i2]   q            p
1811     2.5657   3.1504e-03   1.9888e-03   [42, 99]   9.3157e-09   3
1861     2.4794   4.4825e-03   1.9033e-03   [42, 89]   2.6351e-13   5
1911     2.3707   9.0987e-04   2.8939e-03   [33, 70]   1.8872e-15   6
The parameters in the transition matrix (4.2.1) are estimated by minimizing the sum of weighted squared errors:

F = ∑_{x=0}^{ω} (q̂x − qx)² s(x),   (4.3.1)

where qx and s(x) are the observed death rate and survival probability at age x, and q̂x is the corresponding model value for qx. Let ŝ(x) be the model survival function of the time of death. Since the time of death is of phase type with the phase-type representation (α, Λ), it can be expressed as

ŝ(x) = α exp(Λx) e,   (4.3.2)

where exp(Λx) is the matrix exponential of the transition matrix Λ. The probability of death q̂x can thus be calculated using

q̂x = (ŝ(x) − ŝ(x + 1)) / ŝ(x).   (4.3.3)
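To make the computations in (4.3.2) and (4.3.3) concrete, the following is a minimal sketch in Python (the thesis itself works in Matlab) of how a phase-type survival function and the corresponding death rates can be evaluated with a matrix exponential. The generator constructed below is only illustrative: lam[i] plays the role of the transition rate from physiological age i to i + 1 and qdeath[i] the absorbing (death) rate at state i; the thesis builds these rates from its own parameters λ, a, b, [i1, i2], q and p, which are not reproduced here.

    import numpy as np
    from scipy.linalg import expm

    def generator(lam, qdeath):
        # upper-bidiagonal generator Lambda of the aging chain (absorbing state dropped)
        n = len(qdeath)
        Lam = np.zeros((n, n))
        for i in range(n):
            up = lam[i] if i < n - 1 else 0.0          # no further aging in the last state
            Lam[i, i] = -(up + qdeath[i])
            if i < n - 1:
                Lam[i, i + 1] = up
        return Lam

    def survival(alpha, Lam, x):
        # s(x) = alpha exp(Lambda x) e, as in formula (4.3.2)
        return float(alpha @ expm(Lam * x) @ np.ones(Lam.shape[0]))

    def death_rate(alpha, Lam, x):
        # q_x = (s(x) - s(x+1)) / s(x), as in formula (4.3.3)
        sx = survival(alpha, Lam, x)
        return (sx - survival(alpha, Lam, x + 1)) / sx

    # toy illustration with 10 physiological ages
    n = 10
    lam = np.full(n, 2.5)                              # hypothetical aging rates
    qdeath = 1e-4 + 1e-6 * np.arange(1, n + 1) ** 3    # hypothetical absorption (death) rates
    alpha = np.zeros(n); alpha[0] = 1.0                # everyone starts in the first state
    Lam = generator(lam, qdeath)
    print([round(death_rate(alpha, Lam, x), 6) for x in range(5)])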
The estimates of the parameters are obtained using the simplex algorithm (Nelder and Mead, 1965) built into Matlab. The algorithm is applied to different combinations of the values n, i1 and i2 to determine the most suitable maximal physiological age and the
age range with a significantly high incidence of accidental death. We found that n = 200 provides the best fit for all three data sets. Estimated values are given in Table 4.1 for the aging-related parameters and in Table 4.2 for the developmental-period parameters. Four states are used for the developmental period. The fitted curves are shown in Figure 4.2.

[Figure 4.2: Fitted curves of qx for the Swedish cohorts of years 1811, 1861 and 1911, plotting ln(10000·qx) against age for the observed and fitted death rates of each cohort.]
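As a rough illustration of the calibration step described above, the sketch below minimizes the weighted squared-error criterion (4.3.1) with the Nelder-Mead simplex method from scipy (the thesis uses the simplex routine built into Matlab). The two free parameters, the hypothetical absorption rates inside build_generator and the synthetic "observed" rates are stand-ins for illustration only; an actual calibration would use the full parameter set and the downloaded cohort data.

    import numpy as np
    from scipy.linalg import expm
    from scipy.optimize import minimize

    n = 30                                    # illustrative number of physiological ages

    def build_generator(lam_age, b):
        # hypothetical absorption rates: background rate b plus a power of the state index
        qdeath = b + 1e-9 * np.arange(1, n + 1) ** 4
        Lam = np.zeros((n, n))
        for i in range(n):
            up = lam_age if i < n - 1 else 0.0
            Lam[i, i] = -(up + qdeath[i])
            if i < n - 1:
                Lam[i, i + 1] = up
        return Lam

    def model_qx(params, max_age):
        lam_age, b = params
        Lam = build_generator(lam_age, b)
        P1 = expm(Lam)                        # one-year transition matrix among transient states
        a_x = np.zeros(n); a_x[0] = 1.0       # alpha: everyone starts in the first state
        s = np.empty(max_age + 2)
        for x in range(max_age + 2):
            s[x] = a_x.sum()                  # s(x) = alpha exp(Lambda x) e
            a_x = a_x @ P1
        return (s[:-1] - s[1:]) / s[:-1]      # q_x = (s(x) - s(x+1)) / s(x)

    def objective(params, q_obs, s_obs):
        # F = sum_x (qhat_x - q_x)^2 s(x), the weighted squared errors of (4.3.1)
        return np.sum((model_qx(params, len(q_obs) - 1) - q_obs) ** 2 * s_obs)

    ages = np.arange(0, 91)
    q_obs = np.minimum(0.0005 * np.exp(0.07 * ages), 0.7)     # synthetic "observed" death rates
    s_obs = np.cumprod(np.r_[1.0, 1.0 - q_obs[:-1]])          # observed survival weights
    fit = minimize(objective, x0=[2.0, 1e-3], args=(q_obs, s_obs),
                   method="Nelder-Mead", options={"maxiter": 300})
    print(fit.x, fit.fun)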
Table 4.2: Parameter values for the developmental period

Cohort     qI       λI       qII      λII      qIII     λIII     qIV      λIV
1811       0.6051   2.4906   0.0726   0.6376   0.0503   0.7040   0.0285   0.4416
1861       0.3086   1.6773   0.0291   0.7793   0.0627   0.7954   0.0276   0.7531
1911       0.1671   1.7958   0.0097   0.5543   0.0003   3.5061   0.0149   0.6535
SSA1950B   0.1031   5.5796   0.0750   5.5454   0.0037   0.7419   0.0017   0.6189
The estimated parameters reconfirm the observed trend for death rates qx, as
shown in Figure 4.1. The internal aging rate λ decreases over time. This trend
coincides with the medical improvement since the beginning of the twentieth century,
along with the general improvements in the standard of living. The improvements
are also reflected by the significant decrease in the background death rate b for 1911.
Note that cohort 1911 has a much higher accidental death rate a, which accounts for
the observed hump. In Figure 4.3, we give the curve of h₂(i) = q · i^p for the three data sets, since it represents the increasing susceptibility with aging. It is interesting to see that there exists a crossover among the three curves. This may be interpreted as indicating that younger cohorts are more subject to chronic diseases at old ages, which is consistent with the exchange theory: the elimination of acute infectious diseases results in more people dying from chronic diseases (Fries, 1980; Jones, 1956 and 1959).
[Figure 4.3: Estimated patterns of h₂(i) = q · i^p for the three cohorts (1811, 1861, 1911), plotted against physiological age 0 to 200.]
4.4 Fitting U.S. Social Security Administration Mortality Data
In this section, we fit the same model and parameter sets to the (cohort) life table for
births in 1950 in Actuarial Study No. 107 that was compiled by the U.S. Social Secu-
rity Administration. This particular life table is referred to as SSA Life Table 1950B.
All the life tables in Actuarial Study No. 107 were compiled in 1992, reflecting both
the historical mortality experience before 1990 and projected future improvement
from 1990 to 2080. Those tables are designed for the valuation of the insurances and
benefits in the Old-Age and Survivors Insurance and Disability Insurance (OASDI)
program. Hence, future mortality improvement has been taken into account.
Table 4.3: Estimated parameter values for SSA 1950B

Method     λ        b            a            [i1, i2]    q            p
Basic      2.2484   2.7671e-04   1.5059e-03   [37, 120]   2.1710e-15   6
Modified   2.2502   3.7874e-04   2.3635e-03   [37, 120]   2.1710e-15   6
[Figure 4.4: Fitting curve of qx for SSA 1950B, plotting ln(10000·qx) against age (0 to 100) for the observed values and the fitted curve.]
We again use the Nelder-Mead simplex algorithm to minimize the sum of weighted
squared errors (see formula 4.3.1). The estimated parameters for the SSA Life Table
1950B are given in the first row of Table 4.3. The fitted curve of the death rates
qx’s with the corresponding observed values are given in Figure 4.4. The fitting is
reasonably good at most ages except two periods: ages from age 15 to the late 30’s,
and ages 100 and older. The rates at extreme old ages in the life tables in Actuarial
Study No. 107 were constructed by a geometric extrapolation of the probabilities of
death. Since it has been widely discussed that death rates do not increase in such
a fast manner at extreme old ages (Kannisto, 1994), we view this extrapolation as
unreliable and thus didn’t make any effort to adjust our model to provide a better fit
for this period.
For the other period (from age 15 to late 30’s), special adjustments were made
to accommodate high accidental rates. In Table 4.4, we list the age-group accidental
death rates provided in the U.S. National Vital Statistics Report (Table 10, Vol. 47,
No.19, June 30, 1999). It is shown in the table that the accidental death rate has
a sudden high jump at age 15 and remains approximately constant for age groups
15-24, 25-34, and 35-44; it then drops for the next two age groups, followed by an increase after age 65. We hence introduce a step function to replace the parameter a in the model:

a →  a       for [i1, j],
     a · c   for [j, i2].

The model is re-estimated with c = 0.51; the revised fitting curve is given in Figure 4.5 and the other estimated parameters are given in the second row of Table 4.3.
As can be seen, the fitting is improved noticeably.
Table 4.4: Accidental death rates by age group, the United States, 1997

Age group   lx        dx       Accidents #   Accident percent   Cond. acc. rate
under 1     2313844   28045    1139          0.0406133          0.000492254
1-4         2285799   5501     2420          0.439920015        0.001058711
5-14        2280298   8061     4183          0.518918248        0.001834409
15-24       2272237   31544    24059         0.762712402        0.010588244
25-34       2240693   45538    24061         0.528371909        0.010738196
35-44       2195155   89408    26236         0.293441303        0.011951776
45-54       2105747   144882   17866         0.123314145        0.0084844
55-64       1960865   231993   11128         0.047966965        0.005675046
65-74       1728872   464274   11910         0.025652955        0.006888885
75-84       1264598   670530   14791         0.02205867         0.011696207
85+         594068    594068   11713         0.019716598        0.019716598
[Figure 4.5: Adjusted fitting curve of qx for SSA 1950B, plotting ln(10000·qx) against age (0 to 100) for the observed values and the revised fitted curve.]
4.5 Analysis of Goodness-of-Fit
In this section, we consider three goodness-of-fit measures. The first measure is the
R-square measure that is normally used to test the goodness-of-fit of the Lee-Carter
model. See Wilmoth (1993) and references therein. The second measure is the mean
squared error. We compare the mean squared error from our model with the mean
squared error from Heligman-Pollard’s model. The third measure is the actuarial
present values of a particular insurance and annuity. We calculate the actuarial
present value of whole life insurance and the actuarial present value of whole life
annuity for various interest rates and ages, using the model. We then compare them
with the same values obtained directly from a life table.
R-square or the coefficient of determination measures the percent of the (weighted)
variance explained by a model. Its value is an indicator of how well the model fits
the data: the closer the value is to 1, the better. In our case, the total weighted sum
of squares is given by
SST = ∑_{x=0}^{ω−1} (qx − q̄)² s(x),   (4.5.1)

where q̄ = (∑_{x=0}^{ω−1} qx)/ω is the grand mean of qx across ages and ω is the maximum age. The residual weighted sum of squares is given by

SSE = ∑_{x=0}^{ω−1} (q̂x − qx)² s(x).   (4.5.2)

Then the percent of variance explained by the model is given by

R² = 1 − SSE/SST.   (4.5.3)
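For completeness, a small sketch of the weighted R-square computation in (4.5.1)-(4.5.3); q_obs, q_fit and s_obs are assumed to be numpy arrays of observed death rates, fitted death rates and observed survival probabilities over ages 0, 1, . . . , ω − 1.

    import numpy as np

    def weighted_r_square(q_obs, q_fit, s_obs):
        q_bar = q_obs.mean()                        # grand mean of q_x across ages
        sst = np.sum((q_obs - q_bar) ** 2 * s_obs)  # total weighted sum of squares (4.5.1)
        sse = np.sum((q_fit - q_obs) ** 2 * s_obs)  # residual weighted sum of squares (4.5.2)
        return 1.0 - sse / sst                      # R-square of (4.5.3)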
The R-square values of our model for the data sets in the previous two sections
are given in the first row of Table 4.5. As we can see, the R-square values obtained
for all cohorts are very close to 1, indicating that the proposed model has explained
almost all the variation of death rates over ages. In Lee and Carter (1992), the R-
square value for their model is 0.927. Normally an R-square value greater than 0.9 is considered satisfactory.
Table 4.5: The goodness-of-fit values

                 Cohort 1811   Cohort 1861   Cohort 1911   SSA1950B
R-square value   0.9982        0.9988        0.9993        0.9999
H-P's MSE        0.000293      0.000131      0.00007278    0.000031
Our MSE          0.000448      0.0003077     0.0000382     0.00000108
We next compare the mean squared error from our model with the mean squared
error from Heligman-Pollard’s model. We first fit Heligman-Pollard’s model (1.1.29)
to the three data sets discussed in the previous two sections. Since the estimation is
a routine procedure, we omit the detail. The mean squared errors (over the ages from
0 to 100) for Heligman-Pollard’s model and our model are then calculated, and the
results are also given in Table 4.5. Since Heligman-Pollard's model is a well-accepted mortality model for all ages, the comparison of the errors from the two models provides another way to assess how well our model fits the data. Our model outperforms Heligman and Pollard's model for cohort 1911 and SSA 1950B, but theirs is better for the others. In all the cases, the difference is insignificant. However, we want to emphasize
that the real technical advantage of our model compared with Heligman and Pollard’s
model is the simplicity of the model structure and the analytical properties that will
be discussed in the next section.
Lastly in this section, we calculate the actuarial present value Ax of the whole
life insurance and the actuarial present value ax of the whole life annuity, using two
approaches. One is to use our model and the other is to use a life table directly. The
SSA life table 1950B is used for this purpose. Based on this life table, the values of
Ax and ax can be obtained directly (e.g. Bowers et al., 1997). On the other hand, since the time of death follows the phase-type distribution with transition matrix Λ, the continuous versions Āx and āx of Ax and ax have closed-form expressions that are given by (4.6.4) in Section 4.6. Assuming the uniform distribution of deaths (UDD) within each year of age, Ax and ax then also have simple closed-form expressions and their values are obtained immediately. (Note that assuming UDD is not necessary, as closed-form expressions for the probabilities of death and survival also exist, but we use the UDD assumption for simplicity.) With the estimated parameters in Table 4.3 for the
revised model, we obtain the values of Ax and ax for 5 different ages and 3 different
interest rates. These values are compared with the corresponding values obtained
from the SSA life table 1950B directly. These values, together with the relative errors, are given in Table 4.6 and Table 4.7 respectively. They again demonstrate how good the fit is.
4.6 Qualitative Analysis of the Model
One of the advantages of using phase-type distributions in survival analysis is that the underlying Markov process Jt, t ≥ 0, can provide a population dynamics over time. In our model, Jt represents the physiological age of an individual at (chronological) age t.
Let Pi(t) be the probability that the individual at age t is at physiological age i, i.e.
Pi(t) = P (Jt = i, T > t), t ≥ 0, i = 1, 2, · · · , n.
Table 4.6: Actuarial present value Ax of whole life insurance at different ages and interest rates

Interest rate   Age x   Life table value for Ax   Model value for Ax   Relative error
i=1%            30      0.63592                   0.63675              0.0013044
                40      0.69484                   0.696                0.0016695
                50      0.75586                   0.75798              0.0028041
                60      0.81762                   0.81895              0.0016273
                70      0.87414                   0.87358              -0.00064002
i=4%            30      0.19287                   0.19349              0.0032242
                40      0.26543                   0.2669               0.005534
                50      0.35934                   0.36401              0.012981
                60      0.4777                    0.48151              0.007979
                70      0.60854                   0.60713              -0.002312
i=7%            30      0.076294                  0.076279             -0.00019177
                40      0.12337                   0.12395              0.00472
                50      0.19578                   0.20055              0.02436
                60      0.30535                   0.31029              0.016157
                70      0.44719                   0.44562              -0.0035071
Table 4.7: Actuarial present value ax of whole life annuity at different ages and interest rates

Interest rate   Age x   Life table value for ax   Model value for ax   Relative error
i=1%            30      35.772                    35.689               -0.0023419
                40      29.821                    29.704               -0.0039288
                50      23.658                    23.444               -0.0090486
                60      17.421                    17.286               -0.0077141
                70      11.712                    11.768               0.0048247
i=4%            30      19.985                    19.969               -0.000809
                40      18.099                    18.061               -0.0021102
                50      15.657                    15.536               -0.0077459
                60      12.58                     12.481               -0.0078779
                70      9.178                     9.2146               0.0039856
i=7%            30      13.12                     13.12                1.7046e-05
                40      12.4                      12.391               -0.00071783
                50      11.293                    11.22                -0.0064551
                60      9.6182                    9.5428               -0.007841
                70      7.4501                    7.4741               0.0032178
Then, for i = 1, 2, · · · , n, Pi(t) = [αeΛt]i, where [ · ]i is the ith component of vector [ · ].
Obviously, the survival function s(t) can be expressed as s(t) = ∑_{i=1}^{n} Pi(t). Consider now the corresponding conditional probability πi(t) = P(Jt = i | T > t). Then, πi(t) can be expressed as

πi(t) = Pi(t)/s(t) = [α exp(Λt)]i / (α exp(Λt) e),   t ≥ 0, i = 1, 2, · · · , n.   (4.6.1)
The distribution π(t) = [π1(t), π2(t), . . . , πn(t)] may be used to describe the hetero-
geneity or frailty in health status among the cohort of individuals at age t, where
the heterogeneity is measured by the physiological age. Furthermore, the aging pro-
cess of the heterogeneous cohort can still be modelled in the same manner: it is a
Markov process with the same transition matrix Λ but the initial distribution is now
π(t). As a result, all the desirable properties are preserved and the same mathemat-
ical/statistical analysis can be carried out.
To illustrate the heterogeneity, in Figure 4.6 we present the distribution π(t) for
ages 30, 50, and 70 for the Swedish Cohorts of years 1811, 1861, and 1911.
Figure 4.6 shows that (1) the degree of heterogeneity increases with age among survivors; and (2) the distribution π(t) shifts to the left from cohort 1811 to cohort 1911, which means that the younger cohorts are younger in physiological age, reflecting the health improvement of the population over time.
The relationship between the force of mortality of the time of death and the absorbing rate qi (see formula (4.2.3)), which represents the death rate at physiological age i, can also be identified. It follows from (3.1.2) and (3.1.3) that the force of mortality µ(t) is given by

µ(t) = α exp(Λt) q / (α exp(Λt) e).   (4.6.2)
[Figure 4.6: Heterogeneous distributions π(t) for the three cohorts (1811, 1861, 1911) at ages 30, 50 and 70, plotted against physiological age 0 to 200.]
Thus, with formula (4.6.1) we obtain
µ(t) = ∑_{i=1}^{n} qi · πi(t).   (4.6.3)
In other words, the force of mortality at age t is a weighted average of the death rates
qi’s with the heterogeneous distribution π(t) as the weights.
This model also allows us to investigate the impact of the absorption rates qi on the
distribution of the time of death. Suppose that the absorption rates qi's are subject to a constant change or perturbation ε. That is, the new absorption rates are q_i^ε = qi + ε for all i, while the other parameters remain unchanged. This is the case when the background death rate b is changed to b + ε. Let (α, Λ_ε) denote the corresponding phase-type representation. Then we have

Λ_ε = Λ − εI.

Thus,

exp(Λ_ε t) = exp((Λ − εI)t) = e^{−εt} exp(Λt).

It follows from (3.1.2) that the survival function S_ε(t) with the perturbation can be expressed as

S_ε(t) = e^{−εt} s(t).

Interestingly, the distribution π(t) that measures frailty remains unchanged:

π_ε(t) = α exp(Λ_ε t) / (α exp(Λ_ε t) e) = α exp(Λt) / (α exp(Λt) e) = π(t).

The force of mortality µ_ε(t) is obtained by

µ_ε(t) = ∑_{i=1}^{n} q_i^ε · πi(t) = ∑_{i=1}^{n} qi · πi(t) + ∑_{i=1}^{n} ε · πi(t) = µ(t) + ε,

using (4.6.3). Thus, if the absorption rates are increased (or decreased) by ε, so is the force of mortality.
The idea of changing the absorption rates qi’s is useful for determining a loading
to the net premium of an insurance or an annuity. The traditional approach is to
adjust the probabilities of death qx’s at all ages with a fixed percentage increase or
decrease as shown in the CET and GAM tables. A similar approach can be used by
letting q_i^ε = qi + ε_i. Here ε_i may be viewed as a mortality loading to physiological age i. All the premium calculations can then be carried out in the same manner.
Finally, we present the following closed-form expressions for the actuarial present
values of the whole life insurance, the term life insurance and the whole life annuity,
to further illustrate the technical advantage of the model. The symbols are self-
explanatory.
Āx = ∫₀^∞ e^{−δt} π(x) exp(Λt) q dt = −π(x)(Λ − δI)^{−1} q,

Ā¹_{x:n|} = ∫₀^n e^{−δt} π(x) exp(Λt) q dt = π(x)(Λ − δI)^{−1}(e^{(Λ−δI)n} − I) q,

and

āx = ∫₀^∞ e^{−δt} π(x) exp(Λt) e dt = −π(x)(Λ − δI)^{−1} e.   (4.6.4)
We note that these closed-form expressions are used to calculate the values in Table 4.6
and Table 4.7. In contrast, Heligman and Pollard’s model and the Lee-Carter model
cannot produce closed-form expressions for the above.
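To illustrate why (4.6.4) is convenient, the sketch below evaluates the continuous whole life insurance and annuity values by a single linear solve instead of numerical integration. The generator, initial distribution and force of interest are assumptions made for the example, not the fitted SSA 1950B parameters.

    import numpy as np
    from scipy.linalg import expm, solve

    n = 20
    lam = 2.5
    qdeath = 1e-4 + 1e-6 * np.arange(1, n + 1) ** 3    # illustrative absorption (death) rates
    out = np.r_[np.full(n - 1, lam), 0.0]
    Lam = np.diag(-(out + qdeath)) + np.diag(np.full(n - 1, lam), 1)
    alpha = np.zeros(n); alpha[0] = 1.0
    e = np.ones(n)

    def pi(x):
        # distribution of physiological age among survivors at chronological age x
        p = alpha @ expm(Lam * x)
        return p / p.sum()

    def whole_life_insurance(x, delta):
        # Abar_x = -pi(x) (Lambda - delta I)^(-1) q
        return float(-pi(x) @ solve(Lam - delta * np.eye(n), qdeath))

    def whole_life_annuity(x, delta):
        # abar_x = -pi(x) (Lambda - delta I)^(-1) e
        return float(-pi(x) @ solve(Lam - delta * np.eye(n), e))

    delta = np.log(1.04)                               # force of interest equivalent to i = 4%
    for age in (30, 40, 50, 60, 70):
        print(age, round(whole_life_insurance(age, delta), 5),
              round(whole_life_annuity(age, delta), 4))

As a quick sanity check, Āx + δ · āx evaluates to 1 in this construction, since q = −Λe.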
4.7 Conclusion and Discussion
The idea of using Markov processes and phase-type distributions to model human
mortality is not new at all. For instance, Gavrilov and Gavrilova (1991) use a Markov
process to derive a Makeham-Gompertz formula under special assumptions. Aalen
(1995) explores the theoretical potential of the use of phase-type distributions to
model different shapes of hazard rates, suggesting that such models should find greater application in survival analysis. Other examples of using Markov models and
phase-type distributions for survival analysis can be found in Kay (1986), Longini et
al. (1989 and 1991), and Guihenniuc-Jouyaux, Richardson, and Longini (2000) under
the heading of Markov Processes.
In this chapter, we have proposed a hypothetical modelling framework using a
finite-state continuous-time Markov process with a single absorbing state (death)
to describe a physiological aging process of a human body. A special structure is
suggested on the transition matrix to characterize the Markov process such that the
process can be linked with the underlying aging mechanism. Furthermore, the use
of the finite-state continuous-time Markov process ensures that the time of death
follows a phase-type distribution. As a result, many analytical methods developed
for phase-type distributions can be applied to the proposed mortality model.
We have shown that the model is capable of explaining some stylized facts of
observed mortality very well. We have fitted the model to the Swedish population
cohort data and life tables compiled by the U.S. Social Security Administration in
Actuarial Study No. 107. The fitting results are satisfactory.
One of the potential applications of the model could be for mortality projection,
which is of central importance to the insurance industry and social security sys-
tems. Current mortality projection methods normally involve certain extrapolation
techniques. However, several recent reviews (CMI, 2002, and GAD, 2001) have shown that these approaches often underestimate mortality improvements. In addi-
tion, certain important issues, including the cohort effect and the future uncertainties
in mortality rate, are not well addressed by extrapolative methods. This process-
based mortality model could provide a framework to (at least partly) address these
practical issues. Moreover, it is possible to extend this model to provide a unified
framework for all cohorts. This will be very useful when we are dealing with cross-
sectional (period) mortality data. Also, we have shown in Chapters 3 and 4 that we can introduce a time-change factor to this model to generate stochastic mortality.
Appendix A
Matrix Algebra
As we have seen, our results rely heavily on matrix computations and the matrix exponential; therefore we give a brief summary of relevant matrix operations and some useful results in this appendix.
A.1 Matrix Exponential
The matrix exponential e^A of an n × n matrix A is defined as

e^A = ∑_{n=0}^{∞} A^n / n!.
Some fundamental properties for matrix exponential are:

1.
   d/dt e^{At} = A e^{At} = e^{At} A.   (A.1.1)

2. If the matrix A is invertible, then

   ∫₀^t e^{As} ds = A^{−1}(e^{At} − I);   (A.1.2)

   If all the eigenvalues of A are negative (or have negative real parts), then

   ∫₀^∞ e^{As} ds = −A^{−1}.   (A.1.3)

3. If there exists a matrix H such that A = HDH^{−1}, where D = (d_j)_{diag}, then

   A^n = H D^n H^{−1}.   (A.1.4)

   Note that equation (A.1.4) holds for negative n as well as positive. D^n is easy to calculate since it only involves the powers of a diagonal matrix. Moreover,

   e^A = e^{HDH^{−1}} = H e^D H^{−1} = H (e^{d_i})_{diag} H^{−1}.   (A.1.5)
4. In general, the equality e^{A+B} = e^A e^B does not hold except when A and B commute.1 However, Kronecker operations allow the exponential function to be generalized to matrices, as shown in Proposition A.1.1.
Proposition A.1.1.

e^{A⊕B} = e^A ⊗ e^B.   (A.1.6)

For a proof, see Asmussen (2000a, p. 345).
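A quick numerical check of this identity (not a substitute for the proof), using numpy's Kronecker product and scipy's matrix exponential on two small arbitrary matrices:

    import numpy as np
    from scipy.linalg import expm

    A = np.array([[-1.0, 0.5], [0.2, -0.7]])
    B = np.array([[-2.0, 1.0, 0.0], [0.3, -1.5, 0.4], [0.0, 0.2, -0.9]])

    kron_sum = np.kron(A, np.eye(3)) + np.kron(np.eye(2), B)   # A (+) B = A (x) I + I (x) B
    lhs = expm(kron_sum)
    rhs = np.kron(expm(A), expm(B))
    print(np.allclose(lhs, rhs))   # True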
A.2 The Kronecker product ⊗ and the Kronecker
sum ⊕
In the proposition (A.1.1), the Kronecker product ⊗ and the Kronecker sum ⊕ have
been used. Their definition and some properties are given in the following.
1 Note that the Baker-Campbell-Hausdorff formula can be used to calculate an approximate value for e^{A+B} when A and B do not commute. For more details, see Bakhturin (2001). We want to thank Professor Sebastian Jaimungal for pointing this out to us.
Definition A.2.1. (The Kronecker product ⊗ and the Kronecker sum ⊕)
Let A be a k1 × m1 and B be a k2 × m2 matrix. Then the Kronecker product A ⊗ B is the (k1·k2) × (m1·m2) matrix defined blockwise as

A ⊗ B = | a_{11}B    · · ·   a_{1,m1}B  |
        |  · · ·     · · ·    · · ·     |
        | a_{k1,1}B  · · ·   a_{k1,m1}B |   (A.2.1)

If A and B are both square (k1 = m1 and k2 = m2), then the Kronecker sum A ⊕ B is defined by

A ⊕ B = A ⊗ I_{k2} + I_{k1} ⊗ B.   (A.2.2)
Note that when A reduces to a column vector h of dimension k × 1 and B reduces to a row vector ν of dimension 1 × m, h ⊗ ν is the k × m matrix with (i, j)th element h_i ν_j, i.e. h ⊗ ν = hν in standard matrix notation.
Proposition A.2.2.
(A1B1C1) ⊗ (A2B2C2) = (A1 ⊗ A2)(B1 ⊗ B2)(C1 ⊗ C2)
In particular, if A1 = ν1,A2 = ν2 are row vectors and C1 = h1,C2 = h2 are column
vectors, then ν1B1h1 and ν2B2h2 are real numbers, and
(ν1B1h1) · (ν2B2h2) = (ν1B1h1) ⊗ (ν2B2h2) = (ν1 ⊗ ν2)(B1 ⊗ B2)(h1 ⊗ h2)
Exponential representation
Borrowing the Kronecker notation, we can express the matrix exponential of a
diagonalizable square matrix as the sum of exponential functions in terms of its right
and left eigenvectors.
Proposition A.2.3. (Exponential representation) Suppose the transition matrix A has n distinct eigenvalues λ1, · · · , λn. Let ν1, · · · , νn be the corresponding left (row) eigenvectors and h1, · · · , hn the corresponding right (column) eigenvectors (that is, νiA = λiνi, Ahi = λihi with νihj = 0 for i ≠ j, and νihi = 1 after normalization). Then the transition matrix A has diagonal form, and

A = ∑_{i=1}^{n} λi hi νi = ∑_{i=1}^{n} λi hi ⊗ νi,

e^{At} = ∑_{i=1}^{n} e^{λi t} hi νi = ∑_{i=1}^{n} e^{λi t} hi ⊗ νi.
In this Proposition, the key step is to find a decomposition of the matrix A such that A = H (λi)_{diag} H^{−1}. This is possible when A has nondegenerate eigenvalues λ1, · · · , λn and corresponding linearly independent eigenvectors h1, h2, · · · , hn. Then H = [h1, · · · , hn]. Using formula (A.1.5) and Kronecker operations, the result follows.
The advantage of the above property is that we have an explicit formula for A and e^{At} once the λi, hi, νi have been computed, and it is a combination of exponentials.
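A small sketch of this exponential representation applied to an arbitrary diagonalizable matrix: the rows of H^{−1} serve as the normalized left eigenvectors νi, and the spectral sum is compared against scipy's matrix exponential.

    import numpy as np
    from scipy.linalg import expm

    A = np.array([[-1.0, 0.6, 0.0],
                  [0.1, -0.8, 0.5],
                  [0.0, 0.2, -0.4]])
    t = 2.0

    evals, H = np.linalg.eig(A)          # columns of H are right eigenvectors h_i
    Nu = np.linalg.inv(H)                # rows of inv(H) are the normalized left eigenvectors nu_i
    spectral = sum(np.exp(evals[i] * t) * np.outer(H[:, i], Nu[i, :]) for i in range(3))
    print(np.allclose(spectral, expm(A * t)))   # True, up to round-off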
There are, however, two serious drawbacks of this approach (see Asmussen (2000a)):

1. Numerical instability: if the λi are too close, the representation of A contains terms which almost cancel and the loss of digits may be disastrous. This phenomenon can occur even when the dimension n is not large.

2. Complex calculations: if not all λi are real, one must either do the calculations with complex numbers or perform the cumbersome translation into real and imaginary parts, both of which are messy.
Proposition A.2.4. For any m × m square matrix B with inverse B^{−1},

(B ⊗ B)^{−1} = B^{−1} ⊗ B^{−1}.

Proof. Denote B = (b_{ij}) and B^{−1} = (c_{ij}), i, j = 1, · · · , m, which satisfy

∑_{j=1}^{m} b_{ij} c_{jk} = 0,   i ≠ k,   (A.2.3)

∑_{j=1}^{m} b_{ij} c_{jk} = 1,   i = k.   (A.2.4)

Also,

B ⊗ B = | b_{11}B   · · ·   b_{1m}B |
        |  · · ·    · · ·    · · ·  |
        | b_{m1}B   · · ·   b_{mm}B |

B^{−1} ⊗ B^{−1} = | c_{11}B^{−1}   · · ·   c_{1m}B^{−1} |
                  |    · · ·       · · ·      · · ·     |
                  | c_{m1}B^{−1}   · · ·   c_{mm}B^{−1} |

Applying relations (A.2.3) and (A.2.4), it is straightforward to see that (B ⊗ B)(B^{−1} ⊗ B^{−1}) is the identity matrix of order m².
Bibliography
Aalen, O. O., 1995, “Phase type distributions in survival analysis,” The Scandinavian
Journal of Statistics, 22, 447–463.
Andreev, K., 2001, Overview of the Program Lexis 1.0, Odense University, Denmark, and Max Planck Institute for Demographic Research, Germany.
Asmussen, S., 1987, Applied Probability and queues, Wiley, New York.
Asmussen, S., 2000a, “Matrix-analytic models and their analysis,” The Scandinavian
Journal of Statistics, 27, 193–226.
Asmussen, S., 2000b, Ruin Probabilities, World Scientific Publishing, Singapore.
Asmussen, S., O. Nerman, and M. Olsson, 1996, “Fitting phase-type distributions via
the EM algorithm,” The Scandinavian Journal of Statistics, 23, 419–441.
Asmussen, S., and T. Rolski, 1991, “Computational methods in risk theory: a matrix
algorithmic approach,” Insurance: Mathematics and Economics, 10, 259–274.
Austad, S. N., 1997, Why We Age: What Science Is Discovering about the Body’s
Journey through Life, John Wiley & Sons, New York.
Bafitis, H., and F. Sargent, 1977, “Human physiological adaptability through the life
sequence,” Journal of gerontology, 32, 402–410.
Bakhturin, Y., 2001, "Campbell-Hausdorff formula," in M. Hazewinkel (ed.), Encyclopaedia of Mathematics, Kluwer Academic Publishers.
Ballotta, L., and S. Haberman, 2003, “Valuation of guaranteed annuity conversion
options,” Insurance: Mathematics and Economics, 33, 87–108.
Ballotta, L., and S. Haberman, 2006, “The fair valuation problem of guaranteed annu-
ity options: The stochastic mortality environment case,” Insurance: Mathematics
and Economics, 38, 195–214.
Benjamin, B., and J. Pollard, 1993, The analysis of mortality and other actuarial
statistics, The Institute of Actuaries, Oxford.
Benjamin, B., and A. Soliman, 1993, Mortality on the Move, Actuaries Education
Service, Oxford.
Biffis, E., 2005, “Affine processes for dynamic mortality and actuarial valuation,”
Insurance: Mathematics and Economics, 37, 443–468.
Biffis, E., and P. Millossovich, 2006, “The fair value of guaranteed annuity options,”
Scandinavian Actuarial Journal, 1, 23–41.
Bjork, T., 1998, Arbitrage Theory in Continuous Time, Oxford University Press.
Blake, D., and W. Burrows, 2001, “Survivor bonds: Helping to hedge mortality risk,”
Journal of Risk and Insurance, 68, 339–348.
Blake, D., A. J. G. Cairns, and K. Dowd, 2006, “Living with mortality: longevity
bonds and other mortality-linked securities,” British Actuarial Journal, 12, 153–
197.
Booth, H., J. Maindonald, and L. Smith, 2002, “Applying Lee-Carter under conditions
of variable mortality decline,” Population Studies, 56, 325–336.
Bowers, N. L., H. U. Gerber, J. C. Hickman, D. A. Jones, and C. J. Nesbitt, 1997,
Actuarial Mathematics, The Society of Actuaries. Second Edition. Schaumburg,
Illinois.
Box, G., and G. Jenkins, 1970, Time series analysis: Forecasting and control, San
Francisco: Holden-Day.
Boyle, P., and M. Hardy, 2003, “Guaranteed annuity options,” Astin Bulletin, 33,
125–152.
Brace, A., D. Gatarek, and M. Musiela, 1997, “The market model of interest-rate
dynamics,” Mathematical Finance, 7, 127–155.
Brass, W., 1974, “Mortality models and their uses in demography,” Transactions of
the Faculty of the Actuaries, 33, 123–132.
Cairns, A. J., 2000, “A discussion of parameter and model uncertainty in insurance,”
Insurance: Mathematics and Economics, 27, 313–330.
Cairns, A. J. G., D. Blake, P. Dawson, and K. Dowd, 2005, “Pricing the risk on
longevity bonds,” Life and Pensions, pp. 41–44.
Cairns, A. J. G., D. Blake, and K. Dowd, 2006a, “Pricing Death: Frameworks for the
Valuation and Securitization of Mortality Risk,” Astin Bulletin, 36, 79–120.
Cairns, A. J. G., D. Blake, and K. Dowd, 2006b, “A two-factor model for stochastic
mortality with parameter uncertainty,” Journal of Risk and Insurance, 11, 687–718.
CMI, 1990, Continuous Mortality Investigation Reports No.10.
CMI, 1999, Continuous Mortality Investigation Reports No. 17.
CMI, 2002, “An Interim Basis for Adjusting the ‘92’ Series Mortality Projections for
Cohort Effects,” Continuous Mortality Investigation Working paper 1.
CMI, 2004, “Projecting future mortality: a Discussion paper,” Continuous Mortality
Investigation Working paper 3.
CMI, 2005, “Projecting future mortality: Towards a proposal for a stochastic method-
ology,” Continuous Mortality Investigation Working paper 15.
Collatz, K.-G., 1986, "Towards a Comparative Biology of Aging," in K.-G. Collatz and R. S. Sohal (eds), Insect Aging, pp. 1–8, Springer-Verlag, Berlin.
Comfort, A., 1964, Ageing: The Biology of Senescence, Routledge and Kegan Paul,
London.
Cox, J., J. Ingersoll, and S. Ross, 1985, "A theory of the term structure of interest rates," Econometrica, 53, 385–408.
Cox, S. H., J. Fairchild, and H. Pedersen, 2000, “Economic Aspects of Securitization
of Risk,” Astin Bulletin, 30, 157–193.
Cramer, H., and H. Wold, 1935, “Mortality variations in Sweden: a study in gradua-
tion and forecasting,” Skandinavisk Aktuarietidskrift, 18, 161–241.
Crimmins, E., M. Hayward, and Y. Saito, 1994, "Changing Mortality and Morbid-
ity Rates and the Health Status and Life Expectancy of the Older Population,”
Demography, 31, 159–175.
Dahl, M., 2004, “Stochastic mortality in life insurance: market reserves and mortality-
linked insurance contracts,” Insurance: Mathematics and Economics, 35, 113–136.
Dahl, M., and T. Møller, 2006, “Valuation and hedging of life insurance liabilities with
systematic mortality risk,” Insurance: Mathematics and Economics, 39, 193–217.
Davidson, A. R., and A. R. Reid, 1927, “On the calculation of rates of mortality,”
Transactions of the Faculty of Actuaries, 11, 183–232.
Duffie, D., 2001, Dynamic Asset Pricing Theory, Third Edition, Princeton University
Press.
Elliott, R. J., W. C. Hunter, and B. M. Jamieson, 2001, “Financial signal processing:
A self calibrating model,” International Journal of Theoretical and Applied Finance,
4, 567–584.
Elliott, R. J., and R. S. Mamon, 2002, “A complete yield curve description of a Markov
interest rate model,” International Journal of Theoretical and Applied Finance, 6,
317–326.
Featherman, D. L., 1986, “Marker of Aging,” Research on Aging, 8, 339–365.
Finch, C. E., 1990, Longevity, Senescence, and the Genome, The University of
Chicago Press, Chicago and London.
Flesaker, B., and L. Hughston, 1996, “Positive interest,” Risk, 9, 46–49.
Follmer, H., and D. Sondermann, 1986, "Hedging of Non-Redundant Contingent Claims," in W. Hildenbrand and A. Mas-Colell (eds), Contributions to Mathematical Economics, pp. 205–223, North-Holland.
Forfar, D., and D. Smith, 1988, “The changing shape of English Life Tables,” Trans-
actions of the Faculty of Actuaries, 40, 98–134.
Fries, J., 1980, “Aging, Natural Death, and the Compression of Mortality,” The New
England Journal of Medicine, 303, 130–135.
GAD, 2001, National population projections: Review of methodology for projecting mortality, National Statistics Quality Review Series, Report No. 8, Government Actuary's Department.
Guihenniuc-Jouyaux, C., S. Richardson, and I. Longini, 2000, “Modeling Markers of
Disease Progression by a Hidden Markov Process: Application to Characterizing
CD4 Cell Decline,” Biometrics, 56, 733–741.
Gutterman, S., and I. Vanderhoof, 1998, “Forecasting Changes in Mortality: a Search
for a Law of Causes and Effects,” North American Actuarial Journal, 2.
Harrison, J., and S. Pliska, 1981, “Martingales and stochastic integrals in the theory
of continuous trading,” Stochastic processes and Applications, 11, 215–260.
Harrison, P. G., 1990, “Laplace transform inversion and passage-time distributions in
Markov processes,” Journal of Applied Probability, 27, 74–87.
Hayflick, L., 2002, “Longevity determination and aging,” Living to 100 and Beyond:
Survival at Advanced Ages Symposium, Schaumburg, Ill.: Society of Actuaries.
Heath, D., R. Jarrow, and A. Morton, 1992, "Bond pricing and the term structure of
interest rates: A new methodology for contingent claims valuation,” Econometrica,
60, 77–105.
Heligman, L., and J. Pollard, 1980, “The age pattern of mortality,” Journal of Insti-
tute of Actuaries, 107, 49–75.
Higgins, T., 2003, “Mathematical Models of Mortality,” Paper presented at the Work-
shop on Mortality Modelling and Forecasting Australian National University.
Hull, J., and A. White, 1990, “Pricing Interest rate derivative securities,” Review of
Financial Studies, 3, 573–592.
Hurd, T., and A. Kuznetsov, 2006, “Affine Markov Chain model of multifirm credit
migration,” Working paper.
Jamshidian, F., 1997, “LIBOR and swap market models and measures,” Finance and
Stochastics, 1, 293–330.
Jarrow, R. A., D. Lando, and S. M. Turnbull, 1997, “A Markov model for the term
structure of credit risk spreads,” The Review of Financial Studies, 10, 481–523.
Jones, H., 1956, "A special consideration of the aging process, disease and life expectancy," in J. H. Lawrence and C. A. Tobias (eds), Advances in Biological and Medical Physics, vol. 4, pp. 281–337, Academic Press Inc., New York.
Jones, H., 1959, "The relation of human health to age, place and time," in J. E. Birren (ed.), Handbook of Aging and the Individual, University of Chicago Press, Chicago.
Kannisto, V., 1994, Development of Oldest-Old Mortality, 1950-1990: Evidence from
28 Developed Countries, Odense University Press, Odense, Denmark.
Kay, R., 1986, “A Markov Model for analysing cancer markers and disease states in
survival studies,” Biometrics, 42, 855–865.
Keilson, J., 1979, Markov Chain Models — Rarity and Exponentiality, Springer-
Verlag, New York.
Keyfitz, N., 1981, “The Limits of Population Forecasting,” Population and Develop-
ment Review, 7, 579–593.
Lee, R. D., and L. R. Carter, 1992, “Modeling and forecasting U.S. mortality,” Journal
of the American Statistical Association, 87, 659–675.
Lee, R. D., and T. Miller, 2001, “Evaluating the performance of the Lee-Carter
Method for Forecasting mortality,” Demography, 38.
Li, S.-H., M. R. Hardy, and K. S. Tan, 2006, “Uncertainty in Mortality forecasting:
an extension to the classical Lee-Carter approach,” In press, 5, 1–20.
Lin, X., and K. Tan, 2003, “Valuation of equity-indexed annuities under stochastic
interest rate,” North American Actuarial Journal, 7, 72–91.
Lin, X. S., and X. Liu, 2007, “Markov aging process and phase-type law of mortality,”
North American Actuarial Journal, 11, 92–109.
Lin, Y., and S. H. Cox, 2006, “A mortality securitization model,” Journal of Risk
and Insurance, 76, 22–52.
Longini, I., W. Clark, R. Byers, J. Ward, W. Darrow, G. Lemp, and H. Hethcote,
1989, “Statistical Analysis of the Stages of HIV-Infections Using a Markov Model,”
Statistics in Medicine, 8, 831–843.
Longini, I. M., W. S. Clark, L. Gardner, and J. F. Brundage, 1991, “The dynamics
of CD4+ T-lymphocyte decline in HIV-infected individuals: a Markov modelling
approach,” Journal of Acquired Immune Deficiency Syndromes, 4, 1141–1147.
Madan, D. B., P. Carr, and E. C. Chang, 1998, “The Variance Gamma process and
option pricing,” European Finance Review, 2, 79–105.
Madan, D. B., and F. Milne, 1991, “Option pricing with V.G. martingale compo-
nents,” Mathematical Finance, 1, 39–55.
Maghsoodi, Y., 1996, “Solution of the extended CIR term structure and bond option
valuation,” Mathematical Finance, 6, 89–109.
McNown, R., and A. Rogers, 1989, “Forecasting mortality: a parameterized time
series approach,” Demography, 26.
Milevsky, M. A., and S. D. Promislow, 2001, “Mortality derivatives and the option
to annuitise,” Insurance: Mathematics and Economics, 29, 299–318.
Miltersen, K. R., and S.-A. Persson, 2005, “Is Mortality dead? Stochastic Forward
Force of Mortality Rate Determined by No Arbitrage,” Working Paper.
Møller, T., 1998, “Risk-minimizing hedging strategies for unit-linked life insurance
contracts,” Astin Bulletin, 28, 17–47.
Møller, T., 2001a, “Hedging Equity-linked life insurance contracts,” North American
Actuarial Journal, 5, 79–95.
Møller, T., 2001b, “On transformations of actuarial valuation principles,” Insurance:
Mathematics and Economics, 28, 281–303.
Nelder, J., and R. Mead, 1965, “A Simplex Method for Function Minimization,”
Computer Journal, 7, 308–313.
Neuts, M. F., 1981, Matrix-Geometric Solutions in Stochastic Models, Johns Hopkins
University Press, Baltimore, London.
Norberg, R., 2003, “The Markov Chain Market,” Astin Bulletin, 33, 265–287.
O’Cinneide, C. A., 1989, “On Non-uniqueness of representations of phase-type distri-
butions,” Commun. Statist.-Stochastic Models, 5, 247–259.
O’Cinneide, C. A., 1990, “Characterization of phase-type distributions,” Commun.
Statist.-Stochastic Models, 6, 1–57.
O’Cinneide, C. A., 1999, “Phase-type distributions: open problems and a few prop-
erties,” Commun. Statist.-Stochastic Models, 15, 731–757.
Olivieri, A., 2001, “Uncertainty in mortality projections: an actuarial perspective,”
Insurance: Mathematics and Economics, 29, 231–245.
Pitacco, E., 2003, “Survival Models in Actuarial Mathematics: From Halley to
Longevity Risk,” Invited talk at 7th International Congress Insurance: Mathe-
matics & Economics, ISFA, Lyon.
Renshaw, A., and S. Haberman, 2000, “Modelling for Mortality Reduction Factors,”
Actuarial Research Paper No. 127, Department of Actuarial Science and Statistics,
City University, London.
Renshaw, A., and S. Haberman, 2003, “Lee-Carter mortality forecasting with age-
specific enhancement,” Insurance: Mathematics and Economics, 33, 255–272.
Rogers, L., 1997, “The potential approach to the term-structure of interest rates and
foreign exchange rates,” Mathematical Finance, 7, 157–164.
Rutkowski, M., 1997, “A note on the Flesaker & Hughston model of the term structure
of interest rates,” Applied Mathematical Finance, 4, 151–163.
Sacher, G., and E. Trucco, 1962, “The stochastic theory of mortality,” Annals of New
York Academy of Sciences, 96, 985–1007.
Sehl, M. E., and R. E. Yates, 2001, “Kinetics of Human aging: I. Rates of Senescence
between ages 30 and 70 years in healthy people,” Journal of gerontology, 56B,
198–208.
Shock, N., 1974, “Physiological theories of aging,” Theoretical aspects of aging, M.
Rockstein Ed. Academic Press, New York:119-136.
SSA, 1992, Life Tables for the United States Social Security Area 1900-2080, Actuarial Study No. 107, August 1992.
Strehler, B. L., 1999, Time, Cells, and Aging, Demetriades Brothers, Larnaca.
Tenenbein, A., and I. Vanderhoof, 1980, “New Mathematical Laws of Select and
Ultimate Mortality,” Transactions of the Society of Actuaries, 32, 119–158.
The National Vital Statistics Report, U.S., 1999, Vol. 47, No. 19, June 30, http://www.cdc.gov/nchs/products/pubs/pubd/nvsr/nvsr.htm.
Tuljapurkar, S., and C. Boe, 1998, “Mortality Changes and Forecasting: How Much
and How Little do we Know?,” North American Actuarial Journal, 2, 13–47.
Vasicek, O., 1977, “An equilibrium characterisation of the term structure,” Journal
of Financial Economics, 5, 177–188.
Wetterstrand, W. H., 1981, “Parametric models for life insurance mortality data:
Gompertz’s law over time,” Transactions of the Society of Actuaries, 33, 159–175.
Willets, R., 1999, “Mortality in the next millennium,” Presented at the Staple Inn
Actuarial Society on 7 December.
Wilmoth, J. R., 1993, Computational methods for fitting and extrapolating the Lee-Carter model of mortality change, Technical report, Department of Demography, University of California, Berkeley.
Yashin, A. I., K. G. Manton, M. Woodbury, and E. Stallard, 1995, “The Effects of
Health Histories on Stochastic Process Model of Aging and Mortality,” Journal of
Mathematical Biology, 34, 1–16.
Zuev, S. M., A. I. Yashin, K. G. Manton, and E. Dowd, 2000, “Vitality Index in Sur-
vival Modeling: how physiological Aging Influences Mortality,” Journal of geron-
tology, 55A, 10–19.