
Modelling Longevity Dynamics for Pensions and Annuity Business

MATHEMATICS TEXTS FROM OXFORD UNIVERSITY PRESS

David Acheson: From Calculus to Chaos: An introduction to dynamics
Norman L. Biggs: Discrete Mathematics, second edition
Bisseling: Parallel Scientific Computation
Cameron: Introduction to Algebra
A.W. Chatters and C.R. Hajarnavis: An Introductory Course in Commutative Algebra
René Cori and Daniel Lascar: Mathematical Logic: A Course with Exercises, Part 1
René Cori and Daniel Lascar: Mathematical Logic: A Course with Exercises, Part 2
Davidson: Turbulence
D'Inverno: Introducing Einstein's Relativity
Garthwaite, Jolliffe, and Jones: Statistical Inference
Geoffrey Grimmett and Dominic Welsh: Probability: An Introduction
G.R. Grimmett and D.R. Stirzaker: Probability and Random Processes, third edition
G.R. Grimmett and D.R. Stirzaker: One Thousand Exercises in Probability, second edition
G.H. Hardy and E.M. Wright: An Introduction to the Theory of Numbers
John Heilbron: Geometry Civilized
Hilborn: Chaos and Nonlinear Dynamics
Raymond Hill: A First Course in Coding Theory
D.W. Jordan and P. Smith: Nonlinear Ordinary Differential Equations
Richard Kaye and Robert Wilson: Linear Algebra
J.K. Lindsey: Introduction to Applied Statistics: A modelling approach, second edition
Mary Lunn: A First Course in Mechanics
Jiří Matoušek and Jaroslav Nešetřil: Invitation to Discrete Mathematics
Tristan Needham: Visual Complex Analysis
John Ockendon, Sam Howison, et al.: Applied Partial Differential Equations
H.A. Priestley: Introduction to Complex Analysis, second edition
H.A. Priestley: Introduction to Integration
Roe: Elementary Geometry
Ian Stewart and David Tall: The Foundations of Mathematics
W.A. Sutherland: Introduction to Metric and Topological Spaces
Dominic Welsh: Codes and Cryptography
Robert A. Wilson: Graphs, Colourings and the Four Colour Theorem
Adrian F. Tuck: Atmospheric Turbulence
André Nies: Computability and Randomness
Pitacco, Denuit, Haberman, and Olivieri: Modelling Longevity Dynamics for Pensions and Annuity Business

Modelling Longevity Dynamics for Pensions and Annuity Business

Ermanno Pitacco
University of Trieste (Italy)

Michel Denuit
UCL, Louvain-la-Neuve (Belgium)

Steven Haberman
City University, London (UK)

Annamaria Olivieri
University of Parma (Italy)


Great Clarendon Street, Oxford OX2 6DP

Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide in

Oxford New York
Auckland Cape Town Dar es Salaam Hong Kong Karachi Kuala Lumpur Madrid Melbourne Mexico City Nairobi New Delhi Shanghai Taipei Toronto

With offices in
Argentina Austria Brazil Chile Czech Republic France Greece Guatemala Hungary Italy Japan Poland Portugal Singapore South Korea Switzerland Thailand Turkey Ukraine Vietnam

Oxford is a registered trade mark of Oxford University Pressin the UK and in certain other countries

Published in the United Statesby Oxford University Press Inc., New York

© Ermanno Pitacco, Michel Denuit, Steven Haberman, and Annamaria Olivieri 2009

The moral rights of the authors have been asserted
Database right Oxford University Press (maker)

First published 2009

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above

You must not circulate this book in any other binding or cover and you must impose the same condition on any acquirer

British Library Cataloguing in Publication Data
Data available

Library of Congress Cataloging in Publication Data
Data available

Typeset by Newgen Imaging Systems (P) Ltd., Chennai, India
Printed in Great Britain on acid-free paper by CPI Antony Rowe, Chippenham, Wiltshire

ISBN 978–0–19–954727–2

10 9 8 7 6 5 4 3 2 1

Preface

Actuarial science effectively began with the bringing together of compound interest and life tables, some of which had been derived from observed mortality rates. One of the first categories of financial problems that early actuaries tackled was the calculation of annuity values. Thus, the subject matter of this book can be traced back to the beginnings of the discipline of actuarial science in the mid-17th century. At this time, states and cities often raised money for public purposes by the sale of life annuities to their citizens. One of the first to write on this subject was Jan de Witt, who was the Prime Minister of the States of Holland, and who demonstrated in 1671 how to calculate the value of annuities using a constant rate of interest and a hypothetical life table (that was piecewise linear). Another early investigation of the calculation method for annuity values is the seminal work of Edmund Halley in 1693, which uses population mortality rates.
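De Witt's procedure amounts to discounting each potential annuity payment by both interest and the probability of survival. The following sketch illustrates the idea; the linear survival law, the limiting age, and the interest rate are purely illustrative choices made here, not de Witt's actual figures.

```python
def life_annuity_epv(age, omega=86, i=0.04):
    """Expected present value of an immediate life annuity of 1 per
    year, paid at each year-end while the annuitant survives.

    Assumes, for illustration only, a linear survival law
    l(x) = omega - x (a piecewise-linear table in the spirit of
    de Witt's hypothetical one); omega and i are arbitrary choices.
    """
    v = 1.0 / (1.0 + i)  # annual discount factor at constant interest
    n = omega - age      # number of remaining possible payments
    # survival probability t_p_x = l(age + t) / l(age) = (n - t) / n
    return sum(v ** t * (n - t) / n for t in range(1, n + 1))
```

Under these assumptions, the value at any age is lower than that of an annuity-certain over the same horizon, since each payment is weighted by a survival probability below one.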

From an overview of this early history, we can identify two key developments that feature in this book. First, there was the recognition of the importance of using life tables that were not hypothetical and were not based on general population data but rather were based on observed mortality data from registers of annuitants. This came in the mid-18th century – through the work of Nicholas Struyck in 1740, William Kersseboom in 1742, and Antoine Deparcieux in 1746 (Haberman and Sibbett, 1995). In modern terminology, we would present this in the context of adverse selection among the holders of life annuities – the tendency for purchasers of life annuities to live longer than the general population. It was the work of John Finlaison (the UK's first Government Actuary) in 1829 which cogently demonstrated the financial problems that could emerge from overlooking this fundamental phenomenon. During the first two decades of the 19th century, the British Government had been selling annuities in an attempt to reduce the National Debt. The annuities were priced using mortality rates from a contemporary population-based life table, which failed to allow for the adverse selection effect, and hence the mortality rates were too high for a portfolio of annuitants. Finlaison uncovered this problem and showed that the annuities were being sold at a loss. He identified the problem in 1819 and then produced scientific evidence, based on a painstaking analysis of a range of data sets, that led to recommendations that were accepted by the British Government in 1829. Thus, subsequently, Government annuities were sold on a sound basis in line with Finlaison's recommendations.

It is noteworthy that issues connected with mortality, annuities, and adverse selection are common features of western 19th century literature. As pointed out by Skwire (1997), the novels of Jane Austen are a particularly rich source of actuarial references. In the words of Fanny Dashwood in Sense and Sensibility, 'people always live for ever when there is any annuity to be paid them; . . . An annuity is a very serious business and there is no getting rid of it'.

The second development related to the understanding that, in the presence of a downward secular trend in mortality rates, mortality tables for application to annuities should include some allowance for the expected future improvements in mortality rates, in order to protect the seller of annuities against future loss. The first tables to make such an allowance were those produced in the United Kingdom, based on insurance company data covering the period 1900–1920.

Thus, we see that this book is closely related to fundamental practical problems that actuaries have been trying to address for some years. But the more immediate history of this book can be traced to the research that has been carried out by the four authors, and to two Summer Schools on which the authors taught and which were organized by the Groupe Consultatif Actuariel Européen (i.e. the Consultative Group of Actuarial Organizations in the European Union) in 2005 at the MIB School of Management of Trieste and in 2006 at the University of Parma. The presentations at the summer schools were centered on disseminating the results of this research work in a manner that was accessible to both practitioner and academic audiences. The book is specifically aimed at final year undergraduate students, MSc students, research students preparing for an MPhil or PhD degree, and practising actuaries (as part of their continuing professional development).

This book deals with some very important topics in the field of actuarial mathematics and life insurance techniques. These concern mortality improvements, the uncertainty in future mortality trends, and their impact on life annuities and pension plans. In particular, we consider the actuarial calculations concerning pensions and life annuities. As we have noted above, the insurance company (or the pension plan) must adopt an appropriate forecast of future mortality in order to avoid underestimation of the related liabilities. These concepts and models could be extended to apply to other living benefits, which are provided, for example, by long-term care insurance products and whole life sickness covers.


Considerable attention is currently being devoted in actuarial work to the management of life annuity portfolios, both from a theoretical and a practical point of view, because of the growing importance of annuity benefits paid by private pension schemes. In particular, the progressive shift in many countries from defined benefit to defined contribution pension schemes has increased the interest in life annuities with a guaranteed annual amount.

This book aims at providing a comprehensive and detailed description of methods for projecting mortality, and an extensive introduction to some important issues concerning longevity risk in the area of life annuities and pension benefits. The following topics are dealt with: life annuities in the framework of post-retirement income strategies; the basic mortality model; recent mortality trends; general features of projection models; a discussion of stochastic projection models, with numerical illustrations; and measuring and managing longevity risk.

Chapter 1 has an introductory role, and aims to present the basic structure of life annuity products. Moving from the simple model of the annuity-certain, typical features of life annuity products are presented. From an actuarial point of view, the presentation progressively shifts from the traditional deterministic models to the more modern stochastic models. With an appropriate stochastic approach, we are able to capture the riskiness inherent in a life annuity portfolio, and in particular the risks that arise from random mortality. Cross-subsidy mechanisms which may operate in life annuity portfolios and pension plans are then described. Our presentation of the actuarial structure of life annuities focuses on a very simple annuity model, namely the immediate life annuity. So, problems arising in the so-called accumulation phase (as well as problems regarding the annuitization of the accumulated amount) are initially disregarded. The chapter then provides a comprehensive description of a number of life annuity models; actuarial aspects are only briefly mentioned, in favour of some more practical issues, with the objective, in particular, of paving the way for the subsequent formal presentation.

Some elements of the basic mortality model underlying life insurance, life annuities, and pensions are introduced in Chapter 1, while presenting the structure of life annuities. In Chapter 2, the mortality model is described in more depth, by adopting a more structured presentation of the fundamental ideas. At the same time we introduce some new concepts. In particular, an age-continuous framework is defined, in order to provide some tools needed when dealing with mortality projection models. Indices summarizing the probability distribution of the lifetime are described, and parametric models (often called mortality 'laws' in the literature) are presented. Transforms of the survival function are briefly addressed. We also consider two further topics that are of great importance in the context of life annuities and mortality forecasts, but which are less traditional as far as actuarial books are concerned. These are mortality at the very old ages (i.e. the problem of 'closing' the life table) and the concept of 'frailty' as a tool to represent heterogeneity in populations due to unobservable risk factors.

Chapter 3 considers mortality trends during the past century. The well-known background is that average human life span has roughly tripled over the course of human history. Compared to all of the previous centuries, the 20th century has been characterized by a huge increase in average longevity. As we demonstrate in several chapters, there is no evidence which shows that improvements in longevity are tending to slow down. This chapter aims to illustrate the observed decline in mortality over the 20th century, on the basis of Belgian mortality statistics, using several of the mortality indices that have been introduced in Chapters 1 and 2. We also illustrate the trends in mortality indices for insurance data from the Belgian insurance market, which have been provided by the Banking, Finance and Insurance Commission (in Brussels). We note the key point that emerges from actuarial history: in order to protect an insurance company from mortality improvements, actuaries need to resort to life tables incorporating a forecast of the future trends of mortality rates (the so-called projected tables). The building of these projected life tables is the main topic of the next chapters.

Chapter 4 aims at describing the various methods that have been proposed by actuaries and demographers for projecting mortality. Many of these have been used in an actuarial context, in particular for pricing and reserving in relation to life annuity products and pension products and plans, and in the demographic field, mainly for population projections. First, the idea of a 'dynamic' approach to mortality modelling is introduced. Then, projection methods are presented, and our starting point is the extrapolation procedures which are still widely used in current actuarial practice. More complex methods follow, in particular those based on mortality laws, on model tables, and on relations between life tables. The Lee–Carter method, which has been recently proposed, and some relevant extensions are briefly introduced (while a more detailed discussion, together with various examples of its implementation, is presented in Chapters 5 and 6). The presentation is thematic rather than following a strict chronological order. In order to obtain an insight into the historical evolution of mortality forecasts, the reader can refer to the final section of this chapter, in which some landmarks in the history of dynamic mortality modelling are identified.
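As a pointer for the reader, the simplest of these extrapolation procedures can be sketched as follows; the notation here is generic, chosen for illustration rather than taken from the chapters themselves. A typical exponential formula projects the annual probability of death at age $x$ from a base year via a constant, age-specific annual reduction factor:
\[
q_x(t) = q_x(0)\, r_x^{\,t}, \qquad 0 < r_x \le 1,
\]
where $q_x(t)$ is the annual probability of death at age $x$ in calendar year $t$, $q_x(0)$ its value in the base year, and $r_x$ the annual reduction factor, so that mortality at each age is assumed to decline geometrically over time.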

There is a variety of statistical models used for mortality projection, ranging from basic regression models, in which age and time are viewed as continuous covariates, to sophisticated robust non-parametric models. In Chapter 5, we adopt the age-period framework and we first consider the Lee–Carter log-bilinear projection model. The key difference from the classical generalized linear regression model approach centers on the interpretation of time, which in the log-bilinear approach is modelled as a factor and under the generalized linear regression approach is modelled as a known covariate. In addition to the Lee–Carter model, we also consider the alternative Cairns–Blake–Dowd mortality forecasting method. Compared with the Lee–Carter approach, the Cairns–Blake–Dowd model includes two time factors. This allows the model to capture the imperfect correlation in mortality rates at different ages from one year to the next. This approach can also be seen as a compromise between the generalized regression and the Lee–Carter views of mortality modelling, in that age enters the Cairns–Blake–Dowd model as a continuous covariate, whereas the effect of calendar time is captured by two factors (time-varying intercept and slope parameters). These two approaches are applied to Belgian mortality statistics and the results are interpreted.
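For orientation, the two model structures compared in this chapter can be written, in the notation commonly used in the literature (identifiability constraints omitted), as
\[
\ln m_{x,t} = \alpha_x + \beta_x\,\kappa_t \qquad \text{(Lee–Carter)},
\]
\[
\operatorname{logit} q_{x,t} = \kappa_t^{(1)} + \kappa_t^{(2)}\,(x - \bar{x}) \qquad \text{(Cairns–Blake–Dowd)},
\]
where $m_{x,t}$ and $q_{x,t}$ denote the central death rate and the annual probability of death at age $x$ in year $t$, $\alpha_x$ and $\beta_x$ are age-specific parameters, $\kappa_t$, $\kappa_t^{(1)}$, and $\kappa_t^{(2)}$ are time indices, and $\bar{x}$ is the average age over the fitting range. The single time factor $\kappa_t$ versus the pair $\kappa_t^{(1)}, \kappa_t^{(2)}$ is precisely the contrast between the two approaches.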

In Chapter 6, our aim is to extend the mortality models described in Chapter 5 in order to incorporate cohort effects as well as age and period effects. The cohort effect is a prominent feature of mortality trends in several developed countries, including the United Kingdom, the United States, Germany, and Japan. It relates to the favourable mortality experience that has been observed for those born during the decades between the two world wars. Given that this is a significant feature of past experience, it is necessary first to be able to model it and then to forecast its impact on future mortality trends. First, we discuss the evidence for the cohort effect, with particular reference to the United Kingdom. The age–period–cohort version of the Lee–Carter model is then introduced, along with a discussion of the error structure, model fitting, and forecasting. A detailed case study is then presented, involving historic data from England and Wales. The cohort versions of the Cairns–Blake–Dowd and P-splines models are also presented and their principal features are reviewed.

In Chapter 7, we deal with the mortality risks borne by an annuity provider, and in particular with the longevity risk which originates from the uncertain evolution of mortality at adult and old ages. First, we describe possible approaches to a stochastic representation of mortality, as is required when modelling longevity risk. Then, an analysis of the impact of longevity risk on the risk profile of the provider of immediate life annuities is developed. Taking a risk management perspective, possible solutions for risk mitigation are then examined. Risk transfers as well as capital requirements for the risk retained are discussed. As far as the latter are concerned, some rules which could be implemented within internal models are tested, and a comparison is also developed with the requirement for longevity risk set by Solvency 2, in its current state of development. With regard to risk transfers, particular attention is devoted to capital market solutions, that is, to longevity bonds. The possible design of reinsurance arrangements is examined in connection with the hedging opportunities arising from some of these capital market solutions. The main issues concerning policy design and the pricing of longevity risk are sketched. The possible behaviour of the annuitant with respect to the planning of her/his retirement income, which should be carefully considered in order to choose an appropriate design of life annuity products, is also examined.

Our approach to writing this book has been to allocate prime responsibility for each chapter to one or two authors and then for us all to provide comments and input. Thus, Chapters 1 and 4 were written by Ermanno Pitacco; Chapter 2 by Ermanno Pitacco and Annamaria Olivieri jointly; Chapters 3 and 5 by Michel Denuit; Chapter 6 by Steven Haberman; and Chapter 7 by Annamaria Olivieri. We would like to add that a book like this will never be the result of the inputs of just the authors. Thus, we each would like to acknowledge the support that we have received from a range of colleagues. First, we would each like to thank our respective institutions for the stimulating environment that has enabled us to complete this project.

Michel Denuit would like to acknowledge the inputs by Natacha Brouhns and Antoine Delwarde, who both worked on the topic of this book as PhD students under his supervision at UCL. Andrew Cairns kindly provided detailed comments on an earlier version of Chapters 3 and 5, which led to significant improvements, in particular with regard to mortality projection models. Discussions and/or collaborations with many esteemed colleagues helped to clarify the analysis of mortality and its consequences for insurance risk management, including Enrico Biffis, Hélène Cossette, Claudia Czado, Pierre Devolder, Jan Dhaene, Paul Eilers, Esther Frostig, Anne-Cécile Goderniaux, Montserrat Guillen, Étienne Marceau, Christian Partrat, Christian Robert, Jeroen Vermunt, and Jean-François Walhin. Luc Kaiser, Actuary at the BFIC, kindly supplied mortality data about the Belgian life insurance market. Particular thanks go to all the participants in the 'Mortality' task force of the Royal Society of Belgian Actuaries, directed by Philippe Delfosse. Interesting discussions with the practising actuaries involved also helped to clarify some issues. In that respect, Michel Denuit would like to thank Pascal Schoenmaekers from Munich Re for stimulating exchanges. Michel Denuit would like to stress his beneficial involvement in the working party appointed by the Belgian federal government in order to produce projected life tables for Belgium. Special thanks in this regard go to Micheline Lambrecht and Benoît Paul from the FPB. Also, Michel Denuit has benefited from partnerships with (re)insurance companies, especially with Daria Khachakidze and Laure Olié from SCOR, and with Lucie Taleyson from AXA. The financial support of the Communauté française de Belgique under contract 'Projet d'Actions de Recherche Concertées' ARC 04/09-320 and of the Banque Nationale de Belgique under grant 'Risk measures and Economic capital' is gratefully acknowledged.

Steven Haberman would like to express his deep gratitude to his long-term research collaborator, Arthur Renshaw, for his contributions to their joint work, which has underpinned the ideas in Chapters 5 and 6, and for stimulating discussions about mortality trends. He would also like to thank his close colleague, Richard Verrall, for his contributions and advice on modelling mortality, as well as their recent PhD students, Terry Sithole and Marwa Khalaf-Allah, and their research assistant, Zoltan Butt, who have all worked on the subject of mortality trends and their impact on annuities and pensions. Steven Haberman would also like to thank Adrian Gallop from the Government Actuary's Department for providing mortality data for England and Wales (by individual year of age and calendar year) that facilitated the modelling of trends by cohort. The financial support, provided through annual research grants, received from the Continuous Mortality Investigation Bureau of the UK Actuarial Profession is gratefully acknowledged.

Annamaria Olivieri and Ermanno Pitacco would like to thank Enrico Biffis and Pietro Millossovich for stimulating exchanges and collaborations, Patrizia Marocco and Fulvio Tomè from Assicurazioni Generali for interesting discussions on various practical aspects of longevity, and Marco Vesentini from Cattolica Assicurazioni, Verona, for providing useful material. The financial support from the Italian Ministero dell'Università e della Ricerca is gratefully acknowledged; thanks to the research project 'Income protection against longevity and health risks: financial, actuarial and economic analysis of pension and health products. Market trends and perspectives', coordinated by Ermanno Pitacco, various stimulating meetings have been held.

Finally, special thanks go to all the participants of the Summer School of the Groupe Consultatif Actuariel Européen on the topic 'Modelling mortality dynamics for pensions and annuity business', held twice in Italy (Trieste, 2005; Parma, 2006). Their feedback and comments have been very useful, and such Continuing Professional Development initiatives offer to the lecturers involved exciting opportunities for the merging of theoretical approaches and practical issues, which we hope have been retained as a theme in this book.


Contents

Preface v

1 Life annuities  1
  1.1 Introduction  1
  1.2 Annuities-certain versus life annuities  2
    1.2.1 Withdrawing from a fund  2
    1.2.2 Avoiding early fund exhaustion  5
    1.2.3 Risks in annuities-certain and in life annuities  6
  1.3 Evaluating life annuities: deterministic approach  8
    1.3.1 The life annuity as a financial transaction  8
    1.3.2 Actuarial values  9
    1.3.3 Technical bases  12
  1.4 Cross-subsidy in life annuities  14
    1.4.1 Mutuality  14
    1.4.2 Solidarity  16
    1.4.3 'Tontine' annuities  18
  1.5 Evaluating life annuities: stochastic approach  20
    1.5.1 The random present value of a life annuity  20
    1.5.2 Focussing on portfolio results  21
    1.5.3 A first insight into risk and solvency  24
    1.5.4 Allowing for uncertainty in mortality assumptions  27
  1.6 Types of life annuities  31
    1.6.1 Immediate annuities versus deferred annuities  31
    1.6.2 The accumulation period  33
    1.6.3 The decumulation period  36
    1.6.4 The payment profile  38
    1.6.5 About annuity rates  40
    1.6.6 Variable annuities and GMxB features  41
  1.7 References and suggestions for further reading  43


2 The basic mortality model  45
  2.1 Introduction  45
  2.2 Life tables  46
    2.2.1 Cohort tables and period tables  46
    2.2.2 'Population' tables versus 'market' tables  47
    2.2.3 The life table as a probabilistic model  48
    2.2.4 Select mortality  49
  2.3 Moving to an age-continuous context  51
    2.3.1 The survival function  51
    2.3.2 Other related functions  53
    2.3.3 The force of mortality  55
    2.3.4 The central death rate  57
    2.3.5 Assumptions for non-integer ages  57
  2.4 Summarizing the lifetime probability distribution  58
    2.4.1 The life expectancy  59
    2.4.2 Other markers  60
    2.4.3 Markers under a dynamic perspective  62
  2.5 Mortality laws  63
    2.5.1 Laws for the force of mortality  64
    2.5.2 Laws for the annual probability of death  66
    2.5.3 Mortality by causes  67
  2.6 Non-parametric graduation  67
    2.6.1 Some preliminary ideas  67
    2.6.2 The Whittaker–Henderson model  68
    2.6.3 Splines  69
  2.7 Some transforms of the survival function  73
  2.8 Mortality at very old ages  74
    2.8.1 Some preliminary ideas  74
    2.8.2 Models for mortality at highest ages  75
  2.9 Heterogeneity in mortality models  77
    2.9.1 Observable heterogeneity factors  77
    2.9.2 Models for differential mortality  78
    2.9.3 Unobservable heterogeneity factors. The frailty  80
    2.9.4 Frailty models  83
    2.9.5 Combining mortality laws with frailty models  85
  2.10 References and suggestions for further reading  87


3 Mortality trends during the 20th century  89
  3.1 Introduction  89
  3.2 Data sources  90
    3.2.1 Statistics Belgium  91
    3.2.2 Federal Planning Bureau  91
    3.2.3 Human mortality database  92
    3.2.4 Banking, Finance, and Insurance Commission  92
  3.3 Mortality trends in the general population  93
    3.3.1 Age-period life tables  93
    3.3.2 Exposure-to-risk  95
    3.3.3 Death rates  96
    3.3.4 Mortality surfaces  101
    3.3.5 Closure of life tables  101
    3.3.6 Rectangularization and expansion  105
    3.3.7 Life expectancies  111
    3.3.8 Variability  113
    3.3.9 Heterogeneity  115
  3.4 Life insurance market  116
    3.4.1 Observed death rates  116
    3.4.2 Smoothed death rates  118
    3.4.3 Life expectancies  122
    3.4.4 Relational models  123
    3.4.5 Age shifts  127
  3.5 Mortality trends throughout EU  129
  3.6 Conclusions  135

4 Forecasting mortality: an introduction  137
  4.1 Introduction  137
  4.2 A dynamic approach to mortality modelling  139
    4.2.1 Representing mortality dynamics: single-figures versus age-specific functions  139
    4.2.2 A discrete, age-specific setting  140
  4.3 Projection by extrapolation of annual probabilities of death  141
    4.3.1 Some preliminary ideas  141
    4.3.2 Reduction factors  144
    4.3.3 The exponential formula  145
    4.3.4 An alternative approach to the exponential extrapolation  146
    4.3.5 Generalizing the exponential formula  147
    4.3.6 Implementing the exponential formula  148
    4.3.7 A general exponential formula  149
    4.3.8 Some exponential formulae used in actuarial practice  149
    4.3.9 Other projection formulae  151
  4.4 Using a projected table  152
    4.4.1 The cohort tables in a projected table  152
    4.4.2 From a double-entry to a single-entry projected table  153
    4.4.3 Age shifting  155
  4.5 Projecting mortality in a parametric context  156
    4.5.1 Mortality laws and projections  156
    4.5.2 Expressing mortality trends via Weibull's parameters  160
    4.5.3 Some remarks  162
    4.5.4 Mortality graduation over age and time  163
  4.6 Other approaches to mortality projections  165
    4.6.1 Interpolation versus extrapolation: the limit table  165
    4.6.2 Model tables  166
    4.6.3 Projecting transforms of life table functions  167
  4.7 The Lee–Carter method: an introduction  169
    4.7.1 Some preliminary ideas  169
    4.7.2 The LC model  171
    4.7.3 From LC to the Poisson log-bilinear model  172
    4.7.4 The LC method and model tables  173
  4.8 Further issues  173
    4.8.1 Cohort approach versus period approach. APC models  173
    4.8.2 Projections and scenarios. Mortality by causes  175
  4.9 References and suggestions for further reading  175
    4.9.1 Landmarks in mortality projections  175
    4.9.2 Further references  178


5 Forecasting mortality: applications and examples of age-period models  181
  5.1 Introduction  181
  5.2 Lee–Carter mortality projection model  186
    5.2.1 Specification  186
    5.2.2 Calibration  188
    5.2.3 Application to Belgian mortality statistics  200
  5.3 Cairns–Blake–Dowd mortality projection model  203
    5.3.1 Specification  203
    5.3.2 Calibration  206
    5.3.3 Application to Belgian mortality statistics  207
  5.4 Smoothing  209
    5.4.1 Motivation  209
    5.4.2 P-splines approach  210
    5.4.3 Smoothing in the Lee–Carter model  212
    5.4.4 Application to Belgian mortality statistics  213
  5.5 Selection of an optimal calibration period  214
    5.5.1 Motivation  214
    5.5.2 Selection procedure  216
    5.5.3 Application to Belgian mortality statistics  217
  5.6 Analysis of residuals  218
    5.6.1 Deviance and Pearson residuals  218
    5.6.2 Application to Belgian mortality statistics  220
  5.7 Mortality projection  221
    5.7.1 Time series modelling for the time indices  221
    5.7.2 Modelling of the Lee–Carter time index  223
    5.7.3 Modelling the Cairns–Blake–Dowd time indices  228
  5.8 Prediction intervals  229
    5.8.1 Why bootstrapping?  229
    5.8.2 Bootstrap percentiles confidence intervals  230
    5.8.3 Application to Belgian mortality statistics  232
  5.9 Forecasting life expectancies  234
    5.9.1 Official projections performed by the Belgian Federal Planning Bureau (FPB)  235
    5.9.2 Andreev–Vaupel projections  235
    5.9.3 Application to Belgian mortality statistics  237
    5.9.4 Longevity fan charts  240
    5.9.5 Back testing  240

6 Forecasting mortality: applications and examples of age-period-cohort models  243
  6.1 Introduction  243
  6.2 LC age–period–cohort mortality projection model  246
    6.2.1 Model structure  246
    6.2.2 Error structure and model fitting  248
    6.2.3 Mortality rate projections  253
    6.2.4 Discussion  253
  6.3 Application to United Kingdom mortality data  254
  6.4 Cairns–Blake–Dowd mortality projection model: allowing for cohort effects  263
  6.5 P-splines model: allowing for cohort effects  265

7 The longevity risk: actuarial perspectives  267
  7.1 Introduction  267
  7.2 The longevity risk  268
    7.2.1 Mortality risks  268
    7.2.2 Representing longevity risk: stochastic modelling issues  270
    7.2.3 Representing longevity risk: some examples  273
    7.2.4 Measuring longevity risk in a static framework  276
  7.3 Managing the longevity risk  293
    7.3.1 A risk management perspective  293
    7.3.2 Natural hedging  299
    7.3.3 Solvency issues  303
    7.3.4 Reinsurance arrangements  318
  7.4 Alternative risk transfers  330
    7.4.1 Life insurance securitization  330
    7.4.2 Mortality-linked securities  332
    7.4.3 Hedging life annuity liabilities through longevity bonds  337
  7.5 Life annuities and longevity risk  343
    7.5.1 The location of mortality risks in traditional life annuity products  343
    7.5.2 GAO and GAR  346
    7.5.3 Adding flexibility to GAR products  347
  7.6 Allowing for longevity risk in pricing  350
  7.7 Financing post-retirement income  354
    7.7.1 Comparing life annuity prices  354
    7.7.2 Life annuities versus income drawdown  356
    7.7.3 The 'mortality drag'  359
    7.7.4 Flexibility in financing post-retirement income  363
  7.8 References and suggestions for further reading  369

References 373

Index 389


1 Life annuities

1.1 Introduction

Great attention is currently devoted to the management of life annuity portfolios, both from a theoretical and a practical point of view, because of the growing importance of annuity benefits paid by private pension schemes. In particular, the progressive shift from defined benefit to defined contribution pension plans has increased the interest in life annuities, which are the principal delivery mechanism of defined contribution pension plans.

Among the risks which affect life insurance and life annuity portfolios, longevity risk deserves a deep and detailed investigation and requires the adoption of proper management solutions. Longevity risk, which arises from the random future trend in mortality at adult and old ages, is a rather novel risk. Careful investigations are required to represent and measure it, and to assess its impact on the financial results of life annuity portfolios and pension plans.

This book provides a comprehensive and detailed description of methods for projecting mortality, and an extensive introduction to some important issues concerning longevity risk in the area of life annuities and pension benefits.

The present chapter, conversely, plays a mainly introductory role, aiming at presenting the basic structure of life annuity products. Starting from the simple model of the annuity-certain, typical features of life annuity products are presented (Section 1.2). From an actuarial point of view, the presentation progressively shifts from very traditional deterministic models (Section 1.3) to more modern stochastic models (Section 1.5). An appropriate stochastic approach allows us to capture the riskiness inherent in a life annuity portfolio, and in particular the risks arising from random mortality.

Cross-subsidy mechanisms which work (or may work) in life annuity portfolios and pension plans are described in Section 1.4.


The presentation of the actuarial structure of life annuities focusses on a very simple annuity model, namely the immediate life annuity. So, problems arising in the so-called accumulation phase (as well as problems regarding the annuitization of the accumulated amount) are initially disregarded. Conversely, in Section 1.6 a comprehensive description of a number of life annuity models is provided. In that section, actuarial aspects are just mentioned, in favour of more practical issues aiming in particular at paving the way for the formal presentation that follows.

A list of references and some suggestions for further reading conclude the chapter (Section 1.7).

1.2 Annuities-certain versus life annuities

1.2.1 Withdrawing from a fund

Assume that the amount S is available at a given time, say at retirement, and is used to build up a fund. Denote the retirement time with t = 0, and assume that the year is the time unit. In order to get her/his post-retirement income, the retiree withdraws from the fund at time t the amount bt (t = 1, 2, . . . ). Suppose that the fund is managed by a financial institution which guarantees a constant annual rate of interest i.

Denote with Ft the fund at time t, immediately after the payment of the annual amount bt. Clearly:

Ft = Ft−1(1 + i) − bt for t = 1, 2, . . . (1.1)

with F0 = S. Thus, the annual variation in the fund is given by

Ft − Ft−1 = Ft−1 i − bt for t = 1, 2, . . . (1.2)

Figure 1.1 illustrates the causes explaining the behaviour of the fund throughout time, formally expressed by equation (1.2).

The behaviour of the fund throughout time obviously depends on the sequence of withdrawals b1, b2, . . . . In particular, if for all t the annual withdrawal is equal to the annual interest credited by the fund manager, that is,

bt = Ft−1 i (1.3)

then, from (1.1) we immediately find

Ft = S (1.4)


Figure 1.1. Annual variation in the fund providing an annuity-certain (components: + interest, − annual payment).

for all t, whence a constant withdrawal

b = S i (1.5)

follows.

Conversely, if we assume a constant withdrawal b,

b > S i (1.6)

(as will probably be needed to obtain a reasonable post-retirement income) the drawdown process will, sooner or later, exhaust the fund. Indeed, from equation (1.2) we have

F0 > F1 > · · · > Ft > · · · (1.7)

and we can find a time m such that

Fm ≥ 0 and Fm+1 < 0 (1.8)

Clearly, the exhaustion time m depends on the annual amount b (and on the interest rate i as well), as can easily be seen from equation (1.2).

The sequence of m constant annual withdrawals b (with m defined by conditions (1.8), and possibly completed by the exhausting withdrawal at time m + 1) constitutes an annuity-certain.

Example 1.1 Assume S = 1000. Figure 1.2 illustrates the behaviour of the fund when i = 0.03 and for different annual amounts b. Conversely, Fig. 1.3 shows the behaviour of the fund for various interest rates i, assuming b = 100. □

Figure 1.2. The fund providing an annuity-certain (i = 0.03); curves for b = 50, 75, 100, 125.

Figure 1.3. The fund providing an annuity-certain (b = 100); curves for i = 0.02, 0.03, 0.04, 0.05.
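The withdrawal recursion (1.1) and the exhaustion time m defined by conditions (1.8) can be sketched in a few lines of code. The function names below are ours, and the figures are those of Example 1.1 (S = 1000, i = 0.03):

```python
def fund_path(S, i, b, horizon):
    """Fund values F_0, ..., F_horizon under F_t = F_{t-1}(1 + i) - b, with F_0 = S."""
    F = [S]
    for _ in range(horizon):
        F.append(F[-1] * (1 + i) - b)
    return F

def exhaustion_time(S, i, b, horizon=200):
    """The m of conditions (1.8): F_m >= 0 and F_{m+1} < 0 (None if never exhausted)."""
    F = fund_path(S, i, b, horizon)
    for m in range(horizon):
        if F[m] >= 0 and F[m + 1] < 0:
            return m
    return None

print(exhaustion_time(1000, 0.03, 100))  # finite, since b > S*i (condition (1.6))
print(exhaustion_time(1000, 0.03, 30))   # None: b = S*i, so the fund stays at S, as in (1.4)
```

For b = 100 the exhaustion time turns out to be m = 12, consistent with the curve for i = 0.03 in the figures above.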

It is interesting to compare the exhaustion time m with the remaining lifetime of the retiree. Assume that her/his age at retirement is x, for example x = 65. Of course the lifetime is a random variable. Denote with Tx the random remaining lifetime for a person age x. Let ω denote the maximum attainable age (or limiting age), say ω = 110. Hence, Tx can take all values between 0 and ω − x. If Tx < m then the amount Fm is available as a bequest. Conversely, if Tx > m there are up to ω − x − m years with no possibility of withdrawal (and hence no income).


In practice, the annual amount b (for a given interest rate i) could be chosen by comparing the related exhaustion time m with some quantity which summarizes the remaining lifetime. For example, a synthetic value is provided by the expected remaining lifetime E[Tx]; another possibility is given by the remaining lifetime with the maximum probability, that is, the mode of the remaining lifetime, Mod[Tx]. Note that, to find E[Tx] or Mod[Tx], assumptions about the probability distribution of the lifetime Tx are needed (see Section 1.3.2).

For example, the value b may be chosen such that

m ≈ Mod[Tx] (1.9)

Thus, with a high probability the exhaustion time will coincide with the residual lifetime. Nevertheless, events like Tx > m, or Tx < m, may occur, and hence the retiree bears the risk originating from the randomness of her/his lifetime. Conversely, the choice

m = ω − x (1.10)

obviously removes the risk of remaining alive with no withdrawal possibility, but this choice would result in a low annual amount b.

1.2.2 Avoiding early fund exhaustion

Risks related to random lifetimes can be transferred from the annuitants to the annuity provider thanks to a different contractual structure, that is, the life annuity. To provide a simple introduction to the technical features of life annuities, we adopt now a (very) traditional model; in the following sections, more modern and general models will be described.

Consider the following transaction: an individual age x pays to a life annuity provider (e.g. an insurer) an amount S to receive a (life) annuity consisting in a sequence of annual benefits b, paid at the end of every year while she/he is alive. Assume that the same type of annuity is purchased at time t = 0 by a given number, say lx, of individuals all age x.

Let lx+t denote an estimate (at time 0) of the number of individuals (annuitants) alive at age x + t (t = 1, 2, . . . , ω − x), out of the initial ‘cohort’ of lx individuals. As ω denotes the (integer) maximum age, we have by definition lω > 0 and lω+1 = 0. The following (estimated) cash flows of the annuity provider are then defined:

(a) an income lx S at time 0;
(b) a sequence of outgoes lx+t b at time t, t = 1, 2, . . . , ω − x.


Let Vt denote the fund pertaining to a generic annuitant at time t. The total fund of the annuity provider is given by lx+t Vt, and is defined for t = 1, 2, . . . , ω − x as follows:

lx+t Vt = lx+t−1 Vt−1 (1 + i) − lx+t b (1.11)

clearly with lx V0 = lx S.

From (1.11), we find the following recursion describing the evolution of the individual fund:

Vt = (lx+t−1/lx+t) Vt−1 (1 + i) − b (1.12)

with V0 = S. Recursion (1.12) can also be written as follows:

Vt = Vt−1 (1 + i) + ((lx+t−1 − lx+t)/lx+t) Vt−1 (1 + i) − b (1.13)

Thus, the annual variation in the fund is given by

Vt − Vt−1 = Vt−1 i + ((lx+t−1 − lx+t)/lx+t) Vt−1 (1 + i) − b (1.14)

It is worth noting from (1.14) that the annual variation in the individual fund can be split into three contributions (see Figure 1.4):

(a) a positive contribution provided by the interest Vt−1 i;
(b) a positive contribution provided by the share of the funds released because of the death of lx+t−1 − lx+t annuitants in the t-th year, the share being credited to the lx+t annuitants alive at time t;
(c) a negative contribution given by the benefit b.

Contribution (b), which does not appear in the model describing the annuity-certain (see Figure 1.1), is maintained thanks to a cross-subsidy among annuitants, that is, the so-called mutuality effect. For more details, see Section 1.4.1.

In the case of life annuities, the individual fund Vt (as defined by recursion (1.12)) is called the reserve.
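A one-step numerical sketch of recursion (1.12) and of the split (1.14) may help; the survivor numbers and amounts below are illustrative round figures, not taken from the book:

```python
def next_reserve(V_prev, i, l_prev, l_next, b):
    """One step of recursion (1.12): V_t = (l_{t-1}/l_t) V_{t-1} (1 + i) - b."""
    return (l_prev / l_next) * V_prev * (1 + i) - b

# Illustrative data: 1000 annuitants at time t-1, 20 deaths during the year
V0, i, b = 1000.0, 0.03, 70.0
l0, l1 = 1000, 980

V1 = next_reserve(V0, i, l0, l1, b)

# The three contributions of (1.14): interest, mutuality credit, benefit
interest  = V0 * i
mutuality = ((l0 - l1) / l1) * V0 * (1 + i)
assert abs((V1 - V0) - (interest + mutuality - b)) < 1e-9
print(V1 - V0)  # negative here: interest plus mutuality do not cover the benefit
```

The assertion checks, on these figures, that the decomposition (1.14) reproduces the variation implied by (1.12).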

1.2.3 Risks in annuities-certain and in life annuities

First, let us focus on the simple model of the annuity-certain we have dealt with in Section 1.2.1, and consider the perspectives of the retiree and of the financial institution providing the annuity.

Figure 1.4. Annual variation in the (individual) fund of a life annuity (components: + interest, + mutuality, − annual payment).

The provider of an annuity-certain does not bear any risk inherent in the random lifetime of the retiree, as, whatever this lifetime may be, the annuity will be paid up to the exhaustion of the fund. Conversely, the annuity provider takes financial risks, which can be singled out by looking at the two causes of change in the fund level (see Fig. 1.1). The risks are as follows:

– market risk, more precisely interest rate risk, as we have assumed that i is the guaranteed interest rate which must be credited to the fund whatever the return from the investment of the fund itself may be;

– liquidity risk, as the annual payment obviously requires cash availability.

Conversely, the retiree does not take any financial risk, thanks to the guaranteed interest rate, whereas she/he bears the risk related to her/his random lifetime, as seen above.

Now, let us move to the life annuity. According to the structure of this product (at least as defined in Section 1.2.2), the annuitant does not bear any risk. Actually, the annuity is paid throughout the whole lifetime and the amount of the annual payment is guaranteed.

Conversely, the annuity provider first bears the market risk and the liquidity risk, as in the annuity-certain model. Further, if the actual lifetimes of annuitants lead to numbers of survivors greater than the estimated ones, the cross-subsidy mechanism (see Section 1.2.2 and Fig. 1.4) cannot finance the payments to the annuitants still alive. In other words, contribution (b), which is required to maintain the individual fund Vt, must in this case be partially funded by the annuity provider. Conversely, numbers of survivors less than the estimated ones lead to a provider’s profit. Hence, the annuity provider takes risks related to the mortality of the annuitants.

1.3 Evaluating life annuities: deterministic approach

1.3.1 The life annuity as a financial transaction

Purchasing a life annuity constitutes a financial transaction, whose cash flows are

– a price, or premium, paid by the annuitant to the annuity provider;
– a sequence of amounts, namely the annuity, paid by the annuity provider to the annuitant while he/she is alive; the payment frequency may be monthly, quarterly, semi-annual, or annual.

In what follows, we only refer to annual payments, hence disregarding annuities payable more frequently than once a year (which require special treatment; see the references cited in Section 1.7). Further, we will assume (if not otherwise specified) that payments are made at the end of each year (annuity in arrears).

In the life annuity structure presented in Section 1.2.2, the amount S represents the premium paid against the annuity with b as the annual payment. Clearly, the life annuity structure we have described requires a single premium at time 0, as the annuity is an immediate one. Conversely, for other annuity models different premium arrangements are feasible, as we will see in Section 1.6.

The relation between S and b is implicitly defined by recursion (1.11) (or (1.12)). Solving with respect to S (or b), when b (or S) has been assigned, leads to an explicit relation between the two amounts. In particular, S is the expected present value of the life annuity, as we will see in Section 1.3.2.

Indeed, a reasonable starting point (but not necessarily the only one) for determining the single premium is given by the calculation of the expected present value of the life annuity. In particular, when the so-called equivalence principle is adopted, the single premium is set equal to the expected present value. Other premium calculation principles will be dealt with in Section 7.6.


1.3.2 Actuarial values

For a given i and a given sequence lx, lx+1, . . . , lω, from recursion (1.11), with lx V0 = lx S, we find

lx S = ∑_{t=1}^{ω−x} b lx+t (1 + i)−t (1.15)

and, referring to a single annuitant,

S = ∑_{t=1}^{ω−x} b (lx+t/lx) (1 + i)−t (1.16)

In formula (1.16), S turns out to be the present value of the sequence of amounts b ‘weighted’ with the ratios lx+t/lx. The numbers of survivors lx+t (and the interest rate i, as well) are assumed deterministic. Hence the model relying on these assumptions, and leading in particular to expression (1.16), is a deterministic one.

Some comments can help in understanding the features of the deterministic model. First, a point in favour of the model is that, in spite of its deterministic nature, the risk borne by the life annuity provider, arising from random lifetimes, clearly emerges, although it is not explicitly accounted for (see Section 1.2.3).

Second, equation (1.16) can be rewritten in ‘probabilistic’ terms, since lx+t/lx can be interpreted as the estimate of the probability of an individual age x being alive at age x + t. Denoting with tpx this probability, it is formally defined as follows:

tpx = P[Tx > t] (1.17)

and we have

S = b ∑_{t=1}^{ω−x} tpx (1 + i)−t (1.18)

An alternative expression is provided by the following formula:

S = b ∑_{h=1}^{ω−x} ah| hpx qx+h (1.19)

where


– the symbol ah|, defined as follows:

ah| = (1 − (1 + i)−h)/i (1.20)

denotes the present value of a temporary annuity-certain consisting of h unitary annual payments in arrears;

– the symbol qx+h denotes the probability of an individual age x + h dying within one year; formally

qx+h = P[Tx+h < 1] (1.21)

we note that, assuming ω as the maximum age, qω = 1;
– hence, hpx qx+h is the probability of an individual currently age x dying between ages x + h and x + h + 1; in symbols

hpx qx+h = P[h ≤ Tx < h + 1] (1.22)

Note that

hpx = (1 − qx)(1 − qx+1) . . . (1 − qx+h−1) (1.23)

The equivalence of (1.18) and (1.19) can be proved using the following relation:

tpx = 1 − ∑_{h=0}^{t−1} hpx qx+h (1.24)

where the sum expresses the probability of dying before age x + t.

Clearly, the right-hand side of expression (1.19) represents the expected present value, or actuarial value, of the life annuity; thus:

S = b E[aKx|] (1.25)

where Kx denotes the random curtate remaining lifetime of an individual age x, namely the integer part of Tx. The quantities hpx qx+h, h = 0, 1, . . . , ω − x, constitute the probability distribution of the discrete random variable Kx.

With the symbol commonly used to denote the actuarial value of the life annuity, we have:

S = b ax (1.26)

where, according to (1.18),

ax = ∑_{t=1}^{ω−x} tpx (1 + i)−t (1.27)


Finally, the quantity Vt can be interpreted as the mathematical reserve of the life annuity, whose evolution throughout time is described by recursion (1.12), namely, in probabilistic terms:

Vt = (1/1px+t−1) Vt−1 (1 + i) − b (1.28)

It should be noted that recursion (1.28) expresses the reserve Vt as the result of the decumulation process, driven by financial items (the interest rate i and the payment b) and a demographic item (the probability 1px+t−1). Under this perspective, the reserve Vt can be interpreted as assets pertaining to the generic annuitant. Conversely, the annuitant has the right to receive the annual amount b while she/he is alive. This obligation of the life annuity provider, viz. a liability, can be expressed as the expected present value at time t (and hence referred to the annuitant assumed to be alive at time t) of future annual payments:

b ax+t = b ∑_{h=1}^{ω−x−t} hpx+t (1 + i)−h (1.29)

It is easy to prove, replacing Vt and Vt−1 in equation (1.28) with b ax+t and b ax+t−1 respectively, that equation (1.28) itself is satisfied. Thus,

Vt = b ax+t (1.30)

whence the amount Vt can be interpreted as the amount of assets exactly meeting the provider’s liability. Note that the reserve Vt is exhausted only at the maximum age ω.

Example 1.2 In Fig. 1.5 the mathematical reserve Vt is plotted against time t. We have assumed S = 1000, i = 0.03, x = 65. The estimated numbers of survivors can be drawn from various data sets. For example, assume that the probabilities qx+h, h = 0, 1, . . . , ω − x, where x is a given initial age of interest, have been assigned. From the qx+h’s, the estimated numbers of survivors can be derived via the following recursion:

lx+h+1 = lx+h (1 − qx+h) (1.31)

starting from a (notional) initial value lx. For example, assume for qx+h the following expression:

qx+h = G Hx+h/(1 + G Hx+h) if x + h < 110, qx+h = 1 if x + h = 110 (1.32)

Figure 1.5. Mathematical reserve of a life annuity.

with the parameters G = 0.000002, H = 1.13451. From the data assumed, we obtain a65 = 14.173 and hence b = 70.559. □

Remark The first expression on the right-hand side of (1.32) approximately describes mortality at older ages according to the first and second Heligman–Pollard laws, as we will see in Section 2.5.2. □
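As a numerical check, the sketch below recomputes Example 1.2’s actuarial value under the mortality law (1.32), using (1.23) for survival probabilities and both forms (1.27) and (1.19) of the actuarial value. The function names are ours, not the book’s notation:

```python
G, H, OMEGA = 0.000002, 1.13451, 110

def q(age):
    """One-year death probability (1.32); q = 1 at the maximum age omega."""
    if age >= OMEGA:
        return 1.0
    g = G * H ** age
    return g / (1 + g)

def tpx(x, t):
    """t-year survival probability (1.23): product of (1 - q) over t years."""
    p = 1.0
    for h in range(t):
        p *= 1 - q(x + h)
    return p

def a(x, i):
    """Actuarial value (1.27): sum over t of tpx * (1+i)^(-t)."""
    v = 1 / (1 + i)
    return sum(tpx(x, t) * v ** t for t in range(1, OMEGA - x + 1))

def a_via_deaths(x, i):
    """Alternative form (1.19): sum over the year of death of a_h| * hpx * qx+h."""
    total = 0.0
    for h in range(OMEGA - x + 1):
        ah = (1 - (1 + i) ** (-h)) / i      # annuity-certain value (1.20)
        total += ah * tpx(x, h) * q(x + h)
    return total

a65 = a(65, 0.03)
b = 1000 / a65                              # from S = b * a_x (1.26)
print(round(a65, 3), round(b, 3))           # Example 1.2 reports a65 = 14.173, b = 70.559
```

The two forms agree numerically, which illustrates the equivalence of (1.18) and (1.19) proved via (1.24).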

1.3.3 Technical bases

The relation between S (the single premium) and b (the annual benefit) relies on the equivalence principle, as S is the expected present value of the sequence of annual amounts b. The adoption of this principle complies with common (but not necessarily sound) actuarial practice. Actually, when the equivalence principle is used for pricing insurance products, and life annuities in particular, a safe-side technical basis (or prudential basis, or first-order basis) is chosen, namely an interest rate i lower than the estimated investment yield, and a set of probabilities expressing a mortality level lower than that expected in the life annuity portfolio. The estimated investment yield and the mortality actually expected constitute the scenario technical basis (or realistic basis, or second-order basis).

For simplicity, assume a constant estimated investment yield i∗; denote with q∗x+h, h = 0, 1, . . . , ω − x, the realistic probabilities of death. The survival probabilities tp∗x can be calculated from the q∗x+h as stated by relation (1.23). The resulting actuarial value of the life annuity, a∗x, is clearly given (see (1.27)) by

a∗x = ∑_{t=1}^{ω−x} tp∗x (1 + i∗)−t (1.33)

The difference ax − a∗x can be interpreted as the expected present value (at time t = 0) of the profit generated by the life annuity contract. Note that, if i∗ > i, the yield from investment contributes to the profit. Usually, profit participation mechanisms assign a (large) part of the investment profit to policyholders, and so the expected profit ax − a∗x should be taken as gross of the profit participation.

Example 1.3 Assume i = 0.03 and the qx+h adopted in Example 1.2 as the items of the safe-side technical basis (i.e. the pricing basis); conversely, for the scenario basis assume i∗ = 0.05 as the estimated investment yield, and the mortality level described by probabilities q∗x+h given by expression (1.32) implemented with the parameters G∗ = 0.0000023, H∗ = 1.134. With these assumptions, we have i∗ > i and q∗x+h > qx+h. We find that a∗65 = 11.442, and hence the expected present value of the profit produced by a life annuity with a unitary annual payment, that is, with b = 1, is a65 − a∗65 = 2.731. □
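The two actuarial values of Example 1.3 can be sketched with a single helper (the function name is ours); each year’s death probability comes from (1.32), with all ages involved lying below ω so the q = 1 branch is never needed:

```python
OMEGA = 110

def annuity_value(x, i, G, H):
    """a_x under the mortality law (1.32): sum over t of tpx * (1+i)^(-t)."""
    v, p, total = 1 / (1 + i), 1.0, 0.0
    for t in range(1, OMEGA - x + 1):
        age = x + t - 1                     # survive year t: from age x+t-1 to x+t
        g = G * H ** age
        p *= 1 - g / (1 + g)                # (1.23) with q from (1.32), ages < omega
        total += p * v ** t
    return total

a65  = annuity_value(65, 0.03, 0.000002,  1.13451)   # first-order (pricing) basis
a65s = annuity_value(65, 0.05, 0.0000023, 1.134)     # scenario (realistic) basis

profit = a65 - a65s   # expected present value of profit per unit annual payment
print(round(a65, 3), round(a65s, 3), round(profit, 3))  # book: 14.173, 11.442, 2.731
```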

An appropriate choice of the first-order basis, for a given scenario basis, also provides the insurer with a safety loading in order to face an adverse mortality experience (and/or an adverse yield from investments). In other words, while the spread between the technical bases produces a (positive) profit if the insurer experiences a mortality and an investment yield as described by the scenario basis, the spread itself, increasing the single premium for a given annual payment or conversely reducing the annual payment for a given premium, can avoid losses when an adverse experience occurs.

As regards the choice of the age-patterns of mortality to adopt as the first-order and the second-order technical basis respectively, it should be kept in mind that life annuities may involve very long time intervals, say 25–30 years or even more. Indeed, survival probabilities (i.e. probabilities tpx and tp∗x) should express reasonable mortality assumptions referring to the future lifetime of an individual who is currently age x.

Age-patterns of mortality are commonly available as the result of statistical observations, and usually express the mortality at various ages as it emerges at the time of the observation itself. As mortality is affected by evident trends (see Chapter 3), observed mortality (even when resulting from recent investigations) cannot be directly used to express long-term future mortality, as required when dealing with life annuities. Thus, projection models (see Chapter 4) are needed to forecast future mortality.

1.4 Cross-subsidy in life annuities

Although insurance transactions can be analysed at an individual level (e.g. in terms of the equivalence principle), in practice these transactions usually involve a group of insureds transferring the same type of risk to an insurer. This is also the case for life annuity products, and actually these products have been introduced in Section 1.2.2 referring to a cohort of annuitants.

Thanks to the existence of an insured population, money transfers inside the population itself (i.e. among the policyholders) are possible, causing a cross-subsidy among the insureds (or annuitants). The term cross-subsidy broadly refers to some arrangement adopted for sharing among a given population the cost of a set of benefits. However, various types of cross-subsidy can be recognized. While mutuality underpins the management of any insurance portfolio (see Section 1.2.2, as regards life annuities), other types of cross-subsidy are not necessarily involved, for example, solidarity. Further, special cross-subsidy structures may occur with particular policies; this is the case of tontine schemes in the context of life annuities.

In the following parts of this section, we deal with cross-subsidy mechanisms (mutuality, solidarity, and tontines), focussing on life annuity portfolios.

1.4.1 Mutuality

The mutuality principle underpins the insurance process (whether it is run by a ‘mutual’ insurance company or by a proprietary insurance company, which is owned by shareholders), and arises from the pooling of a number of risks. Moreover, the mutuality effect also works in ‘mutual associations’ of individuals exposed to the same type of risk, even without resorting (at least in principle) to an insurance company.

The mutuality effect leads to money transfers from insureds (or annuitants) who, in terms of actuarial value, paid premiums greater than the benefits received, to insureds in the opposite situation. For example, in a non-life portfolio the insureds without claims transfer money to the insureds with claims.


Referring to a life annuity portfolio, it is interesting to focus on the annual equilibrium between assets available and liabilities. This equilibrium relies on an asset transfer among annuitants, namely from annuitants dying in the year to annuitants alive at the end of the year. This clearly appears from recursion (1.11), where the accumulated fund pertaining to the lx+t−1 annuitants alive at time t − 1, whose amount is lx+t−1 Vt−1 (1 + i), is used to finance benefits to the lx+t annuitants (out of the lx+t−1) alive at time t, namely the payment of the amount lx+t b and the maintenance of the fund lx+t Vt for future payments. So, resources needed at time t are made available (also) thanks to this cross-subsidy, namely the mutuality effect.

Let us now look at the technical equilibrium from an individual perspective. Recursion (1.13) can be rewritten, in more compact terms, as

Vt = Vt−1 (1 + i) (1 + θx+t) − b (1.34)

where

θx+t = (lx+t−1 − lx+t)/lx+t (1.35)

In terms of survival probabilities, as emerges from (1.28), we have θx+t = (1/1px+t−1) − 1.

Looking at recursion (1.34), θx+t can be interpreted as an ‘extra yield’ required to maintain the decumulation process of the individual reserve Vt, and hence as a measure of the mutuality effect. The extra yield θx+t is also called the mortality drag, or interest from mutuality. As already seen in Section 1.2.2, θx+t determines the share of the funds released because of the death of lx+t−1 − lx+t annuitants in the t-th year, and credited to the lx+t annuitants alive at time t.

Remark The (annual) extra yield provided by the mutuality effect is clearly a function of the current age x + t (see (1.35)). Referring to a given age interval (x, x + m), the sequence θx, θx+1, . . . , θx+m can be summarized in an index, depending on x, m, and the interest rate i, called the implied longevity yield (ILY).1 This index plays an important role in the analysis of annuitization alternatives, as we will see in Section 7.7. □

Example 1.4 In Fig. 1.6 the quantity θx+t is plotted for x = 65 and t = 0, 1, . . .. The underlying technical basis is the first-order basis, with i = 0.03 and the qx+t defined in Example 1.2. It is interesting to note that,

1 The expression ‘Implied Longevity Yield’ and its acronym ‘ILY’ are registered trademarks and property of CANNEX Financial Exchanges.

Figure 1.6. A measure of the mutuality effect: θx+t plotted against age, from 65 to 105.

when moderately old ages are involved (say, in the interval 65–75), the values of θ are rather small. In such a range of ages, they could be ‘replaced’ with a higher yield from investments (provided that riskier investments can be accepted), and so, in that age interval, a withdrawal process could be preferred to a life annuity. Conversely, as the age increases, θ reaches very high values, which obviously cannot be replaced by investment yields. So, when old and very old ages are concerned, the life annuity is the only technical tool which guarantees a lifelong constant income. As regards theoretical results showing that annuitization constitutes the optimal choice, see Section 1.7. □
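The extra yield is easy to tabulate; the sketch below uses the mortality law (1.32) with the parameters of Example 1.2 (the function name is ours). Note that under this law θ simplifies to G·Hage exactly, since q/(1 − q) = G Hage:

```python
G, H = 0.000002, 1.13451

def theta(age):
    """Extra yield (1.35) over the year from `age` to `age` + 1, via theta = q/(1 - q)."""
    q = (G * H ** age) / (1 + G * H ** age)   # (1.32), for ages below omega
    return q / (1 - q)                        # equals 1/p - 1, as after (1.35)

for age in (65, 75, 85, 95, 105):
    print(age, round(theta(age), 4))          # small below ~75, above 100% near 105
```

The printed values reproduce the pattern of Figure 1.6: a fraction of a percentage point around age 65, but well above any plausible investment yield at very old ages.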

1.4.2 Solidarity

Assume that a population consisting of (potential or actual) insureds is split into risk classes. Each risk class groups individuals with the same probability of claim (or death, or survival, etc.).

Risk classes could be directly referred to for pricing purposes, namely charging the individuals belonging to a given risk class with a specific premium rate. Conversely, two or more risk classes can be grouped into a rating class, the aim being to charge all individuals belonging to a given rating class with the same premium rate. The premium rate attributed to a rating class should be an appropriate weighted average of the premiums pertaining to the risk classes grouped into the rating class itself. The weighting should reflect the expected numbers of (future) insureds belonging to the various risk classes.

Assume that, as far as pricing is concerned, the population is split into rating classes rather than into risk classes. The rationale of this grouping may be, for example, a simplification in the tariff structure.

When two or more risk classes are aggregated into one rating class, some insureds pay a premium higher than their ‘true’ premium, that is, the premium resulting from the risk classification, while other insureds pay a premium lower than their ‘true’ premium. Thus, the equilibrium inside a rating class relies on a money transfer among individuals belonging to different risk classes. This transfer is usually called solidarity (among the insureds).

Clearly, such a premium system may cause adverse selection, as individuals forced to provide solidarity to other individuals can reject the policy, moving to other insurance solutions (or, more generally, risk management actions). The severity of this self-selection phenomenon depends on how people perceive the solidarity mechanism, as well as on the premium systems adopted by competitors in the insurance market. In any event, self-selection can jeopardize the technical equilibrium inside the portfolio, which depends on actual versus expected numbers of insureds belonging to the various risk classes grouped into a rating class. So, in practice, solidarity mechanisms can work provided that they are compulsory (e.g. imposed by insurance regulation) or constitute a common market practice.

As regards life annuities, risk classes are usually based on age and gender. In particular, it is well known that females experience a mortality lower than males and a higher expected lifetime. So, if for some reason the same premium rates (only depending on age) are applied to all annuitants, a solidarity effect arises, implying a money transfer from males to females.
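A toy numerical illustration of the weighted-average premium for a unisex rating class; all figures below are made up for illustration, not taken from the book:

```python
# Made-up actuarial values and expected class sizes (illustrative only)
a_male, a_female = 13.0, 15.0   # 'true' single premiums per unit annual benefit
n_male, n_female = 600, 400     # expected numbers of future insureds per class

# Unisex premium for the rating class: weighted average of the class premiums
unisex = (n_male * a_male + n_female * a_female) / (n_male + n_female)

overpaid  = n_male * (unisex - a_male)      # total solidarity provided by males
underpaid = n_female * (a_female - unisex)  # total solidarity received by females
print(unisex, overpaid, underpaid)          # the two transfers balance (up to rounding)
```

With these figures, males pay 13.8 instead of 13.0 and females pay 13.8 instead of 15.0; the aggregate overpayment exactly funds the aggregate underpayment, which is the solidarity transfer described above.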

The solidarity effect is stronger when the number of rating classes is smaller, compared with the number of risk classes. In the private insurance field, an extreme case is reached when one rating class only relates to a large number of underlying risk classes. Outside of the private insurance area, the solidarity principle is commonly applied in social security. In this field, the extreme case arises when the whole national population contributes to fund the benefits, even if only a part of the population itself is eligible to receive benefits; so, the burden of insurance is shared among the community.

Finally, it is interesting to stress the implications of this argument. Mutuality affects the benefit (or claim) payment phase, so that the ‘direction’ and ‘measure’ of the mutuality effect in a portfolio are only known ex-post. Conversely, solidarity affects the premium income phase, and hence its direction and measure are known ex-ante.

1.4.3 ‘Tontine’ annuities

Assume that each one of lx individuals, all aged x at time t = 0, pays at that time the amount S to a financial institution. Against the total amount 𝒮 = lx S, the financial institution will pay at the end of each year, that is, at times t = 1, 2, . . ., the (total) constant amount B, while at least one of the individuals of the group is alive.

Each year the amount B is divided equally among the survivors. Hence, each individual (out of the initial l_x) alive at time t receives a benefit b_t which depends on the actual number of survivors at that time. Denoting, as usual, with l_{x+t} the estimated number of survivors, an estimate of b_t is given by B/l_{x+t}. Clearly,

\[ \frac{B}{l_{x+1}} \le \frac{B}{l_{x+2}} \le \cdots \le \frac{B}{l_{x+t}} \le \cdots \tag{1.36} \]

The mechanism of dividing B among the survivors is called a tontine scheme, whereas the sequence (1.36) is called a tontine annuity.

The relation between \mathcal{S} (the initial income) and B (the annual payment) can be stated (at least in theory) on the basis of the equivalence principle. To this purpose, first note that the duration, K, of the annuity paid by the financial institution is random, being defined as follows:

\[ K = \max\{K_x^{(1)}, K_x^{(2)}, \ldots, K_x^{(l_x)}\} \tag{1.37} \]

where K_x^{(j)} denotes the random curtate residual lifetime of the j-th individual. Hence, the equivalence principle requires

\[ \mathcal{S} = B\,\mathrm{E}\big[a_{\overline{K}|}\big] \tag{1.38} \]

The calculation of E[a_{\overline{K}|}] is extremely difficult. In practice, a reasonable approximation could be provided by a_{\overline{\omega-x}|}. While in general a_{\overline{\omega-x}|} > E[a_{\overline{K}|}], the larger l_x is, the better this approximation becomes, as there is a higher probability that some individual reaches, or at least approaches, the maximum age ω.
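The expectation E[a_{\overline{K}|}] can at least be estimated by simulation. The sketch below is illustrative only: it assumes a Gompertz-type law q_age = G·H^age (parameters borrowed from the scenario basis of Example 1.12) in place of the book's actual tables, a 5% rate, and hypothetical function names.

```python
import random

random.seed(42)

def q(age):
    # Assumed Gompertz-type law q_age = G * H^age; illustrative stand-in
    # for the book's mortality tables (parameters from Example 1.12, scenario 2)
    return min(1.0, 2.3e-6 * 1.134 ** age)

def curtate_lifetime(x):
    # Simulate the curtate residual lifetime K_x year by year
    k = 0
    while random.random() > q(x + k):
        k += 1
    return k

def annuity_certain(n, i=0.05):
    # a_n| = present value of n unit payments in arrears
    return sum((1 + i) ** -t for t in range(1, n + 1))

x, lx, n_sims = 65, 100, 1000
# K = max of the lx curtate lifetimes: duration of the tontine annuity (1.37)
ks = [max(curtate_lifetime(x) for _ in range(lx)) for _ in range(n_sims)]
e_aK = sum(annuity_certain(k) for k in ks) / n_sims

omega = 105  # first age with q = 1 under the assumed law
a_upper = annuity_certain(omega - x)
print(round(e_aK, 3), round(a_upper, 3))
```

As the text notes, a_{\overline{\omega-x}|} overstates E[a_{\overline{K}|}], but with l_x = 100 lives the gap is already modest.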

Example 1.5 The tontine annuity derives its name from Lorenzo Tonti (a Neapolitan banker who lived most of his life in Paris) who, around 1650, proposed a plan for raising monies to Cardinal Mazzarino, the Chief Minister of France at the time of King Louis XIV. In this plan, a fund was raised by subscriptions. Let \mathcal{S} denote the amount collected by the State. Then, the State had to pay each year the interest on \mathcal{S}, at a given annual interest rate i. The constant annual payment \mathcal{S} i was to be divided equally among the surviving members of the group and would terminate with the death of the last survivor. Thus, according to our notation, the duration of the annuity is K (see definition (1.37)), and we have B = \mathcal{S} i. Note that

\[ \frac{B}{\mathcal{S}} = i = \frac{1}{a_{\overline{\infty}|}} \tag{1.39} \]

where a_{\overline{\infty}|} = 1/i is the present value of a perpetuity (given the discount rate i). As

\[ \frac{\mathcal{S}}{a_{\overline{\infty}|}} < \frac{\mathcal{S}}{a_{\overline{\omega-x}|}} < \frac{\mathcal{S}}{\mathrm{E}[a_{\overline{K}|}]} \tag{1.40} \]

(assuming that the same discount rate is used for all the present values), we find that Tonti's original scheme did not fulfil the equivalence principle, whilst being favourable to the issuer (i.e., to the State). □

Turning back to the general tontine scheme, two points should be stressed.

(a) The tontine scheme clearly implies a cross-subsidy among the annuitants, and in particular a mutuality effect arises as each dying annuitant releases a share of the amount B, which is divided among the surviving annuitants.

(b) A basic difference between tontine annuities and ordinary life annuities should be recognized. In an ordinary life annuity, the annual (individual) benefit b is stated and guaranteed, in the sense that the life annuity provider has to pay the amount b to the annuitant for her/his whole residual lifetime, whatever the mortality experienced in the portfolio (or pension plan) may be. Conversely, in a tontine scheme the sequence of amounts b_1, b_2, … paid to each annuitant depends on the actual size of the surviving tontine group. Note that, when managing an ordinary life annuity portfolio the annuity provider takes the risk of a poor mortality experience in the portfolio (see Section 1.2.3), whereas in a tontine scheme the only cause of risk is the lifetime of the last survivor. Further, it should be noted that, for a given technical basis and a given amount S, the annual benefit b of an ordinary life annuity is likely to be much higher than the initial payments in a tontine scheme. Actually (using the approximation a_{\overline{\omega-x}|}), from

\[ B = \frac{\mathcal{S}}{a_{\overline{\omega-x}|}} = \frac{l_x\,S}{a_{\overline{\omega-x}|}} \tag{1.41} \]

we obtain, for small values of t (such that \frac{l_x}{l_{x+t}} < \frac{a_{\overline{\omega-x}|}}{a_x}),

\[ b_t = \frac{l_x\,S}{l_{x+t}\,a_{\overline{\omega-x}|}} < \frac{S}{a_x} = b \tag{1.42} \]

From inequality (1.42) it follows that achieving a ‘good’ amount b_t (when compared with b) relies on the mortality experienced in the tontine group. Mainly for this reason, tontine annuities were suppressed by many governments, and are at present prohibited in most countries.

Nevertheless, ideas underlying tontine schemes survive in some mechanisms of profit participation, especially when mortality profits are also involved, as we will see in Section 7.5.3.

1.5 Evaluating life annuities: stochastic approach

1.5.1 The random present value of a life annuity

It should be noted that, although formulae (1.18) and (1.19) involve probabilities, the model built up so far is a deterministic model, as probabilities are only used to determine expected values. A first step towards stochastic models follows.

Equation (1.19) implicitly involves the random present value Y,

\[ Y = a_{\overline{K_x}|} \tag{1.43} \]

of a life annuity (see also (1.25)). The possible outcomes of the random variable Y are as follows:

\[
\begin{aligned}
y_0 &= a_{\overline{0}|} = 0 \\
y_1 &= a_{\overline{1}|} = (1+i)^{-1} \\
&\;\;\vdots \\
y_{\omega-x} &= a_{\overline{\omega-x}|} = (1+i)^{-1} + (1+i)^{-2} + \cdots + (1+i)^{-(\omega-x)}
\end{aligned}
\]

and we have

\[ \mathrm{P}\big[a_{\overline{K_x}|} = y_h\big] = \mathrm{P}[K_x = h] \tag{1.44} \]
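A minimal sketch of how the distribution (1.44) can be tabulated, assuming an illustrative Gompertz-type law in place of the q*_{x+h} of Example 1.3 (so the numbers below will not match the book's):

```python
def q(age):
    # Assumed Gompertz-type law, an illustrative stand-in for the book's tables
    return min(1.0, 2.3e-6 * 1.134 ** age)

def lifetime_pmf(x, max_age=105):
    # P[K_x = h] = h p_x * q_{x+h}
    pmf, surv = [], 1.0
    for h in range(max_age - x + 1):
        qh = q(x + h)
        pmf.append(surv * qh)
        surv *= 1.0 - qh
    return pmf

def annuity_certain(n, i=0.05):
    return sum((1 + i) ** -t for t in range(1, n + 1))

x, i = 65, 0.05
pmf = lifetime_pmf(x)
outcomes = [annuity_certain(h, i) for h in range(len(pmf))]   # y_h = a_h|
mean = sum(p * y for p, y in zip(pmf, outcomes))              # E[a_{K_x}|]
var = sum(p * (y - mean) ** 2 for p, y in zip(pmf, outcomes)) # Var(a_{K_x}|)
print(round(sum(pmf), 6), round(mean, 3), round(var, 3))
```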

Figure 1.7. Probability distribution of a_{\overline{K_{65}}|}.

Calculating the probability distribution of Y = a_{\overline{K_x}|} requires the choice of a technical basis, for example the scenario basis. Moments other than the expected value can then be calculated, for example, the variance of a_{\overline{K_x}|}.

Example 1.6 Figure 1.7 illustrates the probability distribution of a_{\overline{K_{65}}|}, calculated adopting the probabilities q^*_{x+h} and the interest rate i^*, as specified in Example 1.3. In particular, for the variance we find Var(a_{\overline{K_{65}}|}) = 12.889. □

1.5.2 Focussing on portfolio results

Interesting insights into the features of a stochastic approach to life annuity modelling can be achieved by focussing on a group (a portfolio, a pension plan, etc.) of annuitants.

For a given initial number l_x of annuitants, all age x and all with the same age-pattern of mortality, for example, expressed by the q^*_{x+h}'s, the numbers l_{x+t}, t = 1, 2, …, ω − x, can be interpreted as expected numbers of survivors at age x + t, out of the initial cohort (see (1.31)).

Actually, the numbers of annuitants alive at time t, t = 1, 2, …, ω − x, constitute a random sequence,

\[ L_{x+1}, L_{x+2}, \ldots, L_{\omega} \tag{1.45} \]

Figure 1.8. Probability distributions of L70 and L85.

It is interesting to find the probability distribution of the generic random number L_{x+t}. If we assume that the lifetimes of the annuitants are independent (and identically distributed), then the probability distribution of L_{x+t} is binomial, namely

\[ \mathrm{P}[L_{x+t} = k] = \binom{l_x}{k}\,\big({}_{t}p^*_x\big)^k\,\big(1 - {}_{t}p^*_x\big)^{l_x - k}; \quad k = 0, 1, \ldots, l_x \tag{1.46} \]

and, in particular, we have

\[ \mathrm{E}[L_{x+t}] = l_x\,{}_{t}p^*_x \tag{1.47} \]

Example 1.7 Figure 1.8(a) and (b) illustrate the probability distributions of L70 and L85 respectively, under the following assumptions: x = 65, l_65 = 100, q^*_{x+t} as specified in Example 1.3. □
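The binomial distribution (1.46) and its mean (1.47) can be computed directly. The sketch below assumes an illustrative Gompertz-type mortality law rather than the book's tables, with the l_65 = 100, t = 20 setting of Example 1.7:

```python
from math import comb

def q(age):
    # Assumed Gompertz-type law, standing in for the q*_{x+h} of Example 1.3
    return min(1.0, 2.3e-6 * 1.134 ** age)

def tpx(x, t):
    # t-year survival probability: t p*_x = prod_h (1 - q_{x+h})
    p = 1.0
    for h in range(t):
        p *= 1.0 - q(x + h)
    return p

x, lx, t = 65, 100, 20                 # distribution of L85 out of l65 = 100
p = tpx(x, t)
pmf = [comb(lx, k) * p**k * (1 - p)**(lx - k) for k in range(lx + 1)]  # (1.46)
expected = lx * p                      # (1.47)
mode = max(range(lx + 1), key=pmf.__getitem__)
print(round(p, 4), round(expected, 2), mode)
```

As expected for a binomial law, the mode sits within one unit of the mean l_x · t p*_x.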

Further insights can be obtained from a consideration of the insurer's cash flows. First, the probability distribution of the annual random payout may be of interest. If we assume that all annuitants receive an annual amount b, the random payout at time t is given by b L_{x+t}, and the related probability distribution of the annual payment is immediately derived from (1.46).

When various individual annual amounts are concerned, deriving the probability distribution of the annual payout is more difficult. In any event, various numerical procedures and approximations are available. As an alternative, Monte Carlo simulation procedures can be used. Simulation procedures can also be used to obtain other results relating to a portfolio of life annuities, or a pension plan.

1.5 Evaluating life annuities: stochastic approach 23

Consider now the random behaviour over time of the fund Z_t, defined for t = 1, 2, …, ω − x as follows:

\[ Z_t = Z_{t-1}\,(1 + i^*) - L_{x+t}\,b \tag{1.48} \]

with Z_0 = l_x S. Suppose that the relation between b and S is given by formula (1.26), where a_x has been calculated assuming the first-order technical basis, given by i = 0.03 and the q_{x+h}'s used in Example 1.2.

A ‘path’ of the fund Z_t can be obtained via simulation of the random numbers L_{x+t}, which in turn can be obtained by simulating the random lifetimes of the annuitants. Indeed, denoting with T_x^{(j)} the remaining lifetime of the j-th annuitant, we have

\[ L_{x+t} = \sum_{j=1}^{l_x} I_{\{T_x^{(j)} > t\}} \tag{1.49} \]

where I_E is the indicator function of the event E.

Note that the expected path E[Z_t], t = 1, 2, …, ω − x, can be immediately derived as

\[ \mathrm{E}[Z_t] = \mathrm{E}[Z_{t-1}]\,(1 + i^*) - \mathrm{E}[L_{x+t}]\,b \tag{1.50} \]

the expected numbers E[L_{x+t}] being given by (1.47).
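The recursion (1.48) together with the indicator representation (1.49) translates into a short simulation. The sketch is illustrative: it assumes a Gompertz-type mortality law in place of the book's tables, with b = 1 and the i = 0.03 / i* = 0.05 rates of the examples:

```python
import random

random.seed(1)

def q(age):
    # Assumed Gompertz-type law (illustrative stand-in for the book's tables)
    return min(1.0, 2.3e-6 * 1.134 ** age)

def annuity_value(x, i):
    # a_x = sum_h (1+i)^-h * h p_x  (whole life annuity in arrears)
    val, surv, h = 0.0, 1.0, 1
    while True:
        surv *= 1.0 - q(x + h - 1)
        if surv == 0.0:
            return val
        val += surv * (1 + i) ** -h
        h += 1

x, lx, i_star, b = 65, 100, 0.05, 1.0
S = annuity_value(x, 0.03)           # single premium on the first-order basis
alive, Z, path = lx, lx * S, []
for t in range(1, 21):
    # each survivor dies this year with probability q(x+t-1): new L_{x+t}
    alive = sum(1 for _ in range(alive) if random.random() > q(x + t - 1))
    Z = Z * (1 + i_star) - alive * b         # recursion (1.48)
    path.append(Z)
print(round(S, 3), alive, round(path[-1], 2))
```

Repeating the loop gives the bundle of paths illustrated by Figure 1.9.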

Example 1.8 Figures 1.9(a) and (b) illustrate 10 paths of Z_t, for t = 0, 1, …, 5 and t = 15, …, 20, respectively. The data already described have been assumed as the input of the simulation procedure, in particular i* = 0.05 and the q^*_{x+h}'s used in Example 1.3. Figures 1.10(a) and (b) illustrate the (simulated) statistical distributions of Z_5 and Z_20 respectively, based on a sample of 1000 simulated paths. □

Further interesting aspects may emerge from comparing the behaviour of the fund Z_t with the (random) portfolio reserve, whose amount is L_{x+t} V_t, with V_t given by (1.30) (traditionally implemented with the first-order basis). As the assets actually available are given by Z_t, the (random) quantity

\[ M_t = Z_t - L_{x+t}\,V_t \tag{1.51} \]

represents the assets in excess of the level required (according to the first-order basis) to meet expected future obligations.

Example 1.9 Figures 1.11(a) and (b) represent the (simulated) statistical distribution of M_5 and M_20 respectively, based on the simulated sample previously adopted. The erratic behaviour in these figures (as well as in Figures 1.11(a), 1.11(b), 1.12(a), 1.12(b), and 1.14) is clearly due to the simulation procedure; smoother results can be obtained by increasing the number of simulations. □

Figure 1.9. Some paths of Zt.

Figure 1.10. Statistical distributions of Z5 and Z20.

1.5.3 A first insight into risk and solvency

From the exercise developed in Examples 1.7–1.9, an important feature of stochastic models clearly emerges. Allowing for randomness provides us with a tool for assessing the ‘risk’ inherent in a life annuity portfolio or a pension plan. As we can see in Figure 1.9(a) and (b), random fluctuations affect the portfolio behaviour, and these are caused (in this example) by the randomness in the number of survivors throughout time. The risk we are now focussing on is usually named the risk of mortality random fluctuation, or the process risk due to mortality (see also Section 7.2).

Figure 1.11. Statistical distributions of M5 and M20.

Figures 1.10(a) and (b) suggest measures which can be used for assessing the riskiness of a life annuity portfolio in terms of the ‘dispersion’ of the fund Z_t. Analogous considerations emerge from Figure 1.11(a) and (b) in relation to the quantity M_t. For example, the variance or the standard deviation, estimated from the statistical distributions, can be used as (traditional) risk measures.

The possibility of quantifying portfolio riskiness suggests ‘operational’ applications of our stochastic model, provided that it is properly generalized. For example, let us focus on the quantity M_t. From Figure 1.11(a) and (b) it emerges that, with a positive probability, M_5 and M_20 take negative values. Of course, the event M_t < 0 indicates an insolvency situation.

So, probabilities of events like M_t < 0 for some t, at least within a stated time horizon, should be kept reasonably small. In particular, an initial allocation of (shareholders') capital, leading to Z_0 > l_x S, clearly lowers the probability of being insolvent.

Example 1.10 Allocating the amount M_0 = 3000, so that Z_0 = 100 S + 3000, leads to the distributions of M_5 and M_20 depicted in Figure 1.12(a) and (b), from which a smaller probability of insolvency clearly emerges. □
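Because the recursion (1.48) is linear in Z_0, an initial capital M_0 simply shifts every simulated margin at time t by M_0 (1 + i*)^t, so the effect of capital on the insolvency frequency can be read off a single simulated sample. A sketch under the same illustrative assumptions as before (assumed Gompertz-type law, b = 1, small sample, hypothetical capital amount):

```python
import random

random.seed(2024)

def q(age):
    # Assumed Gompertz-type law, illustrative stand-in for the book's tables
    return min(1.0, 2.3e-6 * 1.134 ** age)

def a(x, i):
    # whole life annuity in arrears a_x
    val, surv, h = 0.0, 1.0, 1
    while True:
        surv *= 1.0 - q(x + h - 1)
        if surv == 0.0:
            return val
        val += surv * (1 + i) ** -h
        h += 1

def final_margin(lx, x, horizon, i_first=0.03, i_star=0.05):
    # one simulated outcome of M_t = Z_t - L_{x+t} V_t at t = horizon (b = 1)
    alive, Z = lx, lx * a(x, i_first)
    for t in range(1, horizon + 1):
        alive = sum(1 for _ in range(alive) if random.random() > q(x + t - 1))
        Z = Z * (1 + i_star) - alive
    return Z - alive * a(x + horizon, i_first)

margins = [final_margin(100, 65, 20) for _ in range(400)]
boost = 30.0 * 1.05 ** 20      # hypothetical capital M_0 = 30, accrued at i*
p0 = sum(m < 0 for m in margins) / len(margins)
p_cap = sum(m + boost < 0 for m in margins) / len(margins)
print(p0, p_cap)
```

By construction p_cap ≤ p0: the capital shifts the whole distribution of M_20 to the right.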

Of course, causes of risk other than mortality could be introduced into our model, typically the investment risk, in particular arising from random fluctuations (i.e. ‘volatility’) in the investment yield. To this purpose, the sequence of annual investment yields must be simulated, on the basis of an appropriate model for stochastic interest rates, and used in place of the estimated yield i*. We do not deal with these problems, which are beyond the scope of the present chapter.

Figure 1.12. Statistical distributions of M5 and M20.

Let us now refer to the random present value at time t = 0, Y_0^{(Π)}, of future benefits in a portfolio consisting of one generation of life annuities. We have

\[ Y_0^{(\Pi)} = b \sum_{t=1}^{\omega-x} L_{x+t}\,(1+i)^{-t} \tag{1.52} \]

If we calculate the expected value of Y_0^{(Π)} using the first-order basis, we have

\[ \mathrm{E}\big[Y_0^{(\Pi)}\big] = b \sum_{t=1}^{\omega-x} \mathrm{E}[L_{x+t}]\,(1+i)^{-t} = b\,l_x \sum_{t=1}^{\omega-x} {}_{t}p_x\,(1+i)^{-t} = l_x\,V_0 \tag{1.53} \]

Formula (1.53) provides the (traditional) portfolio reserve, given by

\[ V_0^{(\Pi)} = \mathrm{E}\big[Y_0^{(\Pi)}\big] = l_x\,V_0 \tag{1.54} \]

Obvious generalizations lead to Y_t^{(Π)} and E[Y_t^{(Π)}], for t ≥ 0.
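Equality (1.53) can be verified numerically. The sketch assumes an illustrative Gompertz-type law and b = 1; both sides are computed from the same survival probabilities:

```python
def q(age):
    # Assumed Gompertz-type law (illustrative stand-in for the book's tables)
    return min(1.0, 2.3e-6 * 1.134 ** age)

def tpx(x, t):
    # t-year survival probability
    p = 1.0
    for h in range(t):
        p *= 1.0 - q(x + h)
    return p

x, lx, i, b, omega = 65, 100, 0.03, 1.0, 105
# left side: E[Y_0] = b * sum_t E[L_{x+t}] v^t, with E[L_{x+t}] = lx * t p_x  (1.47), (1.53)
ey = b * sum(lx * tpx(x, t) * (1 + i) ** -t for t in range(1, omega - x + 1))
# right side: lx * V_0, with V_0 the individual reserve b * a_x
V0 = b * sum(tpx(x, t) * (1 + i) ** -t for t in range(1, omega - x + 1))
print(round(ey, 6), round(lx * V0, 6))
```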

However, in a stochastic context, the portfolio reserve can be defined in different ways, in particular in order to allow for the riskiness inherent in the life annuity portfolio. For example, the reserve can be defined as the α-percentile of the probability distribution of Y_0^{(Π)} (see Fig. 1.13):

\[ V_0^{(\Pi;\alpha)} = y_\alpha \tag{1.55} \]

with y_α such that

\[ \mathrm{P}\big[Y_0^{(\Pi)} > y_\alpha\big] = 1 - \alpha \tag{1.56} \]

Figure 1.13. Probability distribution of Y_0^{(Π)}; α-percentile.

Example 1.11 Using the data of the previous examples, from the simulated distribution of Y_0^{(Π)} (see Fig. 1.14) we find the results shown in Table 1.1. Note that, conversely, we have P[Y_0^{(Π)} > V_0^{(Π)}] = 0.209 (where V_0^{(Π)} = E[Y_0^{(Π)}] = 100000). □

It is worth noting that the calculation of the portfolio reserve V_0^{(Π)} (and, in general, V_t^{(Π)}) according to (1.54) represents the traditional approach adopted in actuarial practice. In this context, the presence of risks is taken into account simply via the first-order basis adopted in implementing formula (1.54). Conversely, the reserving approach based on the probability distribution of Y_0^{(Π)} (and Y_t^{(Π)} in general), leading to the portfolio reserve V_0^{(Π;α)} (V_t^{(Π;α)}), allows for risks via the choice of an appropriate percentile of the distribution itself.
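An empirical version of the percentile reserve (1.55)–(1.56) can be obtained from a simulated sample of Y_0^{(Π)}. The sketch is illustrative (assumed Gompertz-type law, b = 1, modest sample size):

```python
import random

random.seed(7)

def q(age):
    # Assumed Gompertz-type law (illustrative stand-in for the book's tables)
    return min(1.0, 2.3e-6 * 1.134 ** age)

def simulate_Y0(lx, x, i):
    # one outcome of Y_0 = b * sum_t L_{x+t} (1+i)^{-t}, with b = 1  (1.52)
    alive, t, y = lx, 0, 0.0
    while alive > 0:
        t += 1
        alive = sum(1 for _ in range(alive) if random.random() > q(x + t - 1))
        y += alive * (1 + i) ** -t
    return y

sample = sorted(simulate_Y0(100, 65, 0.03) for _ in range(500))

def percentile_reserve(alpha):
    # empirical y_alpha such that P[Y_0 > y_alpha] ~ 1 - alpha  (1.55)-(1.56)
    return sample[int(alpha * len(sample)) - 1]

v90, v99 = percentile_reserve(0.90), percentile_reserve(0.99)
print(round(v90, 2), round(v99, 2))
```

The higher the chosen α, the larger (and the more prudent) the resulting reserve, mirroring Table 1.1.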

1.5.4 Allowing for uncertainty in mortality assumptions

As already mentioned in Section 1.3.3, experience suggests that we should adopt projected mortality tables (or laws) for the actuarial appraisal of life annuities (and other living benefits), that is, use mortality assumptions which include a forecast of future mortality trends. Notwithstanding, whatever hypothesis is assumed, the future trend in mortality is random, and hence an uncertainty risk arises, namely a risk due to uncertainty in the representation of the future mortality scenario.

Figure 1.14. Statistical distribution of Y_0^{(Π)}.

Table 1.1. Percentiles of the probability distribution of Y_0^{(Π)}

α      y_α
0.75   92067.033
0.90   101553.815
0.95   102608.253
0.99   104480.738

Example 1.12 Assume the first-order basis already used in the previous examples. To describe the (future) mortality scenario, use the model (1.32) with the following alternative parameters:

(1) G^{(1)} = 0.0000025; H^{(1)} = 1.13500
(2) G^{(2)} = 0.0000023 (= G^*); H^{(2)} = 1.13400 (= H^*)
(3) G^{(3)} = 0.0000019; H^{(3)} = 1.13300

We assume that scenario (2) (which coincides with the scenario adopted as the second-order basis in previous examples) represents the best-estimate mortality hypothesis. Scenario (1) involves a higher mortality level and hence can be considered ‘optimistic’ from the point of view of the annuity provider. Conversely, scenario (3) expresses a lower mortality level and thus constitutes a ‘pessimistic’ mortality forecast. We obtain:

a_{65}^{(1)} = 11.046,  a_{65}^{(2)} = 11.442,  a_{65}^{(3)} = 12.102

(with obvious meaning of the notation). □
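The scenario comparison can be reproduced in outline. The sketch below assumes that model (1.32) has the Gompertz-type form q_age = G·H^age (an assumption made here for illustration — only the parameter values are taken from Example 1.12), so the resulting figures will not match the book's exactly; the ordering of the three actuarial values, however, does:

```python
def annuity_value(x, i, G, H):
    # a_x under an assumed Gompertz-type law q_age = G * H^age
    val, surv, h = 0.0, 1.0, 1
    while surv > 0.0:
        surv *= 1.0 - min(1.0, G * H ** (x + h - 1))
        val += surv * (1 + i) ** -h
        h += 1
    return val

scenarios = {                    # parameter values from Example 1.12
    1: (0.0000025, 1.13500),     # 'optimistic' for the provider (higher mortality)
    2: (0.0000023, 1.13400),     # best-estimate scenario
    3: (0.0000019, 1.13300),     # 'pessimistic' (lower mortality)
}
values = {k: annuity_value(65, 0.03, G, H) for k, (G, H) in scenarios.items()}
print({k: round(v, 3) for k, v in values.items()})
```

Higher mortality means fewer expected payments, hence the cheapest annuity under scenario (1) and the dearest under scenario (3).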

The coexistence of more than one mortality scenario (namely, three in Example 1.12) depicts a new modelling framework. When no uncertainty in the future mortality trend is allowed for, and hence just one age-pattern of mortality is assumed (e.g. in terms of probabilities of dying), a deterministic actuarial value of the life annuity follows. Conversely, if we recognize uncertainty in the future pattern of mortality, randomness in actuarial values follows.

Figure 1.15 illustrates three different approaches to uncertainty in mortality assumptions. The first approach (A) simply disregards uncertainty, so that the related result is a deterministic actuarial value of the life annuity. In the second case (B), a finite set of scenarios is used to express uncertainty, from which a finite set of actuarial values follows; clearly, this approach has been adopted in Example 1.12. Note that, according to this approach, each actuarial value should be regarded as an expected value conditional on a given scenario. Finally, the third approach (C) allows for uncertainty via a continuous set of scenarios and a consequent interval for the (conditional) actuarial value of the life annuity; this approach can be implemented, for example, assuming a given interval as the set of possible values for a parameter of the mortality law.

Clearly, the uncertainty risk coexists with the risk of mortality random fluctuations. As regards the present value of a life annuity, random fluctuations lead to the probability distribution depicted, for example, in Fig. 1.14 (see Example 1.11). When allowing also for uncertainty in future mortality, a set of probability distributions should be addressed. Thus, referring to approach B (see Fig. 1.15), a finite set of conditional distributions is involved, each one relating to an alternative mortality scenario (see Fig. 1.16).

A comprehensive description of the riskiness inherent in a life annuity product (still excluding financial risks arising from investment performance) requires a further step. By assigning an appropriate probability description of the scenario space, we can move from conditional probability distributions to an unconditional distribution, which ‘summarizes’ both components of risk, namely the uncertainty risk and the risk of random fluctuations. This topic will be the focus of Chapter 7, while dealing with the assessment of longevity risk.

Figure 1.15. Mortality scenarios and actuarial values.

Figure 1.16. Conditional probability distributions of the random present value of the life annuity.

1.6 Types of life annuities

In the previous sections we have dealt with an immediate life annuity in arrears, that is, a life annuity whose first payment is due one period (one year, according to our assumptions) from the date of purchase, while the last payment is made at the end of the period preceding the death of the annuitant. Although this structure is rather common, a number of other types of life annuities are sold on insurance markets, and paid by pension plans as well.

So, the purpose of this section is to describe a range of annuity types, looking at features of both the accumulation period and the decumulation period (also called the liquidation period, or payout period); see Fig. 1.17.

1.6.1 Immediate annuities versus deferred annuities

Let us continue to focus on an immediate life annuity, and denote with b the annual benefit and S the net single premium (i.e., disregarding expense loadings). It is natural to look at the amount S as the result of an accumulation process carried out during (a part of) the working life of the annuitant.

Let us now denote with x the age at the beginning of the accumulation process, that is, at time 0. The accumulation process stops at time n, so that x + n is the age at the beginning of the decumulation phase.

The relation between S and b is given, according to the equivalence principle, by

\[ S = b\,a_{x+n} \tag{1.57} \]

(see (1.26) and the previous equations).

Figure 1.17. Accumulation and decumulation phases.

As regards the accumulation process, this can be carried out via various tools, for example insurance policies providing a survival benefit at maturity (time n). Some policy arrangements will be described in Section 1.6.2.

Conversely, it is possible to look jointly at the accumulation and the decumulation phase, even in actuarial terms. Consider a deferred life annuity of one monetary unit per annum, with a deferred period of n years. Assume now that each annual payment is due at the beginning of the year (annuity in advance). The actuarial value at time 0, {}_{n|}a_x, is given by

\[ {}_{n|}a_x = \sum_{h=n}^{\omega-x} (1+i)^{-h}\, {}_{h}p_x \tag{1.58} \]

In this context, it is natural that the accumulation period coincides with the deferred period. In particular, the deferred annuity can be financed via a sequence of n annual level premiums P, paid at times 0, 1, …, n − 1. The annual premium for a deferred life annuity of b per annum, according to the equivalence principle, is then given by

\[ P = b\,\frac{{}_{n|}a_x}{a_{x:\overline{n}|}} \tag{1.59} \]

where

\[ a_{x:\overline{n}|} = \sum_{h=0}^{n-1} (1+i)^{-h}\, {}_{h}p_x \tag{1.60} \]
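Formulae (1.58)–(1.60) translate directly into code. The sketch assumes an illustrative Gompertz-type mortality law and hypothetical ages (x = 40, n = 25), so the premium shown is purely indicative:

```python
def q(age):
    # Assumed Gompertz-type law (illustrative stand-in for the book's tables)
    return min(1.0, 2.3e-6 * 1.134 ** age)

def hpx(x, h):
    # h-year survival probability h p_x
    p = 1.0
    for k in range(h):
        p *= 1.0 - q(x + k)
    return p

def deferred_annuity(x, n, i, omega=105):
    # n|a_x = sum_{h=n}^{omega-x} v^h * h p_x, payments in advance  (1.58)
    return sum((1 + i) ** -h * hpx(x, h) for h in range(n, omega - x + 1))

def temporary_due(x, n, i):
    # a_{x:n} = sum_{h=0}^{n-1} v^h * h p_x  (1.60)
    return sum((1 + i) ** -h * hpx(x, h) for h in range(n))

x, n, i, b = 40, 25, 0.03, 1.0
P = b * deferred_annuity(x, n, i) / temporary_due(x, n, i)   # (1.59)
print(round(P, 4))
```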

Two important aspects of the actuarial structure of deferred life annuities financed by annual level premiums, as is apparent from equations (1.58) and (1.60), should be stressed:

(a) Formulae (1.58) and (1.60) rely on the assumption that the technical basis is chosen at time 0, when the insured is aged x. If, for example, x = 40, this means that the technical rate of interest will be guaranteed throughout a period of maybe fifty years or even more. Further, the life table adopted should keep its validity throughout the same period.

(b) In the case that the policyholder dies before time n, no benefit is due. This is, of course, a straight consequence of the policy structure, according to which the only benefit is the deferred life annuity.

Feature (b) is likely to have a negative impact on the appeal of the annuity product. However, the problem can be easily removed by adding to the policy a rider benefit such as the return of premiums in case of death during the deferred period, or including some death benefit with term n.

The problems arising from aspect (a) are much more complex, and require a re-thinking of the structure and design of the life annuity product. As a first step, we provide an analysis of the main features of life annuity products, addressing separately the accumulation period and the decumulation period.

1.6.2 The accumulation period

The deferred life annuity, as described above, can be interpreted as a pure endowment at age x with maturity at age x + n, ‘followed’ (in the case of survival at age x + n) by an immediate life annuity, with benefits due at the beginning of each year. In formal terms, from (1.58) we obtain

\[ {}_{n|}a_x = (1+i)^{-n}\,{}_{n}p_x \sum_{h=0}^{\omega-x-n} (1+i)^{-h}\, {}_{h}p_{x+n} = {}_{n}E_x\, a_{x+n} \tag{1.61} \]

where {}_{n}E_x = (1+i)^{-n}\,{}_{n}p_x denotes the actuarial value of a pure endowment with a unitary amount insured.
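Identity (1.61) can be checked numerically. The sketch uses an illustrative Gompertz-type law and hypothetical ages; both sides are evaluated from the same survival probabilities:

```python
def q(age):
    # Assumed Gompertz-type law (illustrative stand-in for the book's tables)
    return min(1.0, 2.3e-6 * 1.134 ** age)

def hpx(x, h):
    # h-year survival probability h p_x
    p = 1.0
    for k in range(h):
        p *= 1.0 - q(x + k)
    return p

def deferred_due(x, n, i, omega=105):
    # n|a_x with payments in advance from time n  (1.58)
    return sum((1 + i) ** -h * hpx(x, h) for h in range(n, omega - x + 1))

def whole_due(x, i, omega=105):
    # annuity in advance on (x): sum_{h>=0} v^h * h p_x
    return sum((1 + i) ** -h * hpx(x, h) for h in range(omega - x + 1))

x, n, i = 40, 25, 0.03
nEx = (1 + i) ** -n * hpx(x, n)        # pure endowment value nE_x
lhs = deferred_due(x, n, i)
rhs = nEx * whole_due(x + n, i)        # right-hand side of (1.61)
print(round(lhs, 6), round(rhs, 6))
```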

Clearly, relation (1.61) relies on the assumption that the same technical basis is adopted for both the accumulation and the decumulation period. As already noted, this implies a huge risk for the life annuity provider. So, an important idea is to address the two periods separately, possibly delaying the choice of the technical basis to be adopted for the life annuity.

As regards the accumulation period, the pure endowment can be replaced by a purely financial accumulation, via an appropriate savings instrument. The loss in terms of the mutuality effect is then very limited, since only (part of) the working period is concerned. Hence, a very modest extra yield can replace the mortality drag.

Example 1.13 In Fig. 1.18, the function θ (see Section 1.4.1) is plotted against age in the range 40–64. Note that θ is consistent with formula (1.34), with given values for the mathematical reserve, however with b = 0. The underlying technical basis is the first-order basis adopted in Example 1.4. It is interesting to compare the graph in Fig. 1.18 (noting the scale on the vertical axis) with the behaviour of the function θ throughout the decumulation period, illustrated in Fig. 1.6. □

Figure 1.18. Mutuality effect during the accumulation period.

Of course, various insurance products including a benefit in case of life at time n can replace the pure endowment throughout the accumulation period. Examples are given by the traditional endowment assurance policy, by various types of unit-linked endowments, and so on. In many cases, some minimum guarantee is provided: for example, the technical rate of interest in traditional insurance products like pure endowments and endowment assurances, a minimum death benefit and/or a minimum maturity benefit in unit-linked products.

Whatever the insurance product may be, the benefit at maturity can be used to purchase an immediate life annuity. However, the ‘quality’ of the insurance product used for the accumulation can be improved, from the perspective of the policyholder, by including in the product itself an option to annuitize. This option is the possibility of converting the lump sum at maturity into an immediate life annuity, without the need to cash in the sum and pay the expense charges related to the underwriting of the life annuity.

Clearly, when an option to annuitize is included in the policy, the insurer first takes the adverse selection risk, as the policyholders who choose the conversion into a life annuity will presumably be in good health, with a life expectancy higher than the average. However, a further risk may arise, due to the uncertainty in the future mortality trend, that is, the longevity risk.

If the annuitization rate, that is the quantity 1/a_{x+n}, which is applied to convert the sum available at maturity into an immediate life annuity, is stated (and hence guaranteed) only at maturity, the time interval throughout which the insurer bears the longevity risk clearly coincides with the time interval during which the life annuity is paid.

However, more ‘value’ can be added to the annuity product if the annuitization rate is guaranteed during the accumulation period, the limiting case being represented by the annuitization rate guaranteed at time 0, that is, at policy issue. The opposite limit is clearly given by stating the guaranteed rate at time n, that is, at maturity.

The so-called guaranteed annuity option (GAO) is a policy condition which provides the policyholder with the right to receive at retirement either a lump sum (the maturity benefit) or a life annuity, whose annual amount is calculated at a guaranteed rate. The annuity option will be exercised by the policyholder if the current annuity rate (i.e. the annuity rate applied by insurers at time n for pricing immediate life annuities) is worse than the guaranteed one.

As regards the accumulation period, the severity of the longevity risk borne by the life annuity provider can be reduced (with respect to the severity involved in a GAO with a guaranteed rate stated at the policy issue date) if the annuity purchase is arranged according to a single-recurrent premium scheme. In this case, with the premium paid at time h (h = 0, 1, …, n − 1) a deferred life annuity of annual amount b_h, with deferred period n − h, is purchased. In actuarial terms:

\[ P_h = b_h\; {}_{n-h|}a^{[h]}_{x+h} \tag{1.62} \]

Note that the actuarial value {}_{n-h|}a^{[h]}_{x+h} is calculated according to the technical basis adopted at time h. Hence, the annuity total benefit b, given by

\[ b = \sum_{h=0}^{n-1} b_h \tag{1.63} \]

is ultimately determined and guaranteed only at time n − 1. According to this step-by-step procedure, the technical basis, used in (1.62) to determine the amount b_h purchased with the premium P_h, can change every year, so reflecting possible adjustments in the mortality forecast.

1.6.3 The decumulation period

Let us denote with n the starting point of the decumulation period, and with x + n the annuitant's age. Let S be the amount, available at time n, to finance the life annuity. In the case of a deferred life annuity, S is given by the mathematical reserve at time n of the annuity itself.

The relation between S and the annual payment b depends on the policy conditions which define the (random) number of payments, and hence the duration of the decumulation period. Let us denote with K the number of payments. Focussing on a life annuity in arrears only, the following cases are of practical interest:

(a) If the number of annual payments is stated in advance, say K = m, we have an annuity-certain, that is, a simple withdrawal process. Then, the annual benefit b is defined by the following relation:

\[ S = b\,a_{\overline{m}|} \tag{1.64} \]

(b) In the case of a whole life annuity, the annual payments cease with the death of the annuitant. Thus, K = K_{x+n} (where K_{x+n} denotes the curtate remaining lifetime), and

\[ S = b\,a_{x+n} \tag{1.65} \]

(c) The m-year temporary life annuity pays the annual benefit while the annuitant survives during the first m years. Then K = min{m, K_{x+n}}, and

\[ S = b\,a_{x+n:\overline{m}|} = b \sum_{h=1}^{m} (1+i)^{-h}\, {}_{h}p_{x+n} \tag{1.66} \]

(d) If the annuitant dies soon after time n, neither the annuitant nor the annuitant's estate receive much benefit from the purchase of the life annuity. In order to mitigate (at least partially) this risk, it is possible to buy a life annuity with a guarantee period (5 or 10 years, say), in which case the benefit is paid for the guarantee period regardless of whether the annuitant is alive or not. Hence, for a guarantee period of r years we have K = max{r, K_{x+n}}, and

\[ S = b\,a_{\overline{r}|} + b\,{}_{r|}a_{x+n} \tag{1.67} \]
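The four arrangements (a)–(d) differ only in the annuity factor dividing S. A sketch with an illustrative Gompertz-type mortality law and hypothetical figures (S = 1000, age 70, m = 20, r = 10):

```python
def q(age):
    # Assumed Gompertz-type law (illustrative stand-in for the book's tables)
    return min(1.0, 2.3e-6 * 1.134 ** age)

def hpx(x, h):
    # h-year survival probability h p_x
    p = 1.0
    for k in range(h):
        p *= 1.0 - q(x + k)
    return p

def a_certain(m, i):
    # annuity-certain a_m| in arrears
    return sum((1 + i) ** -t for t in range(1, m + 1))

def a_life(x, i, omega=105):
    # whole life annuity in arrears a_x
    return sum((1 + i) ** -h * hpx(x, h) for h in range(1, omega - x + 1))

def a_temp(x, m, i):
    # m-year temporary life annuity in arrears  (1.66)
    return sum((1 + i) ** -h * hpx(x, h) for h in range(1, m + 1))

S, age, i, m, r = 1000.0, 70, 0.03, 20, 10
b_certain = S / a_certain(m, i)                     # (1.64)
b_whole = S / a_life(age, i)                        # (1.65)
b_temp = S / a_temp(age, m, i)                      # (1.66)
# r|a_{x+n} = a_{x+n} - a_{x+n:r}; guarantee period r  (1.67)
b_guar = S / (a_certain(r, i) + a_life(age, i) - a_temp(age, r, i))
print(round(b_certain, 2), round(b_whole, 2), round(b_temp, 2), round(b_guar, 2))
```

The temporary annuity yields the highest benefit per unit of S, the guaranteed one the lowest among the life-contingent forms, since its payments are partly certain.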

We have so far assumed that the annuity payment depends on the lifetime of one individual only, namely the annuitant. However, it is possible to define annuity models involving two (or more) lives. Some examples (referring to two lives) follow:

(e) Consider an annuity payable as long as at least one of two individuals (the annuitants) survives, namely a last-survivor annuity. Let us now denote by y and z respectively the ages of the two lives at the annuity commencement, and by $K^{(1)}_y$, $K^{(2)}_z$ their curtate remaining lifetimes. Thus, $K = \max\{K^{(1)}_y, K^{(2)}_z\}$. The actuarial value of this annuity is usually denoted by $a_{\overline{y,z}}$, and can be expressed as

$a_{\overline{y,z}} = a^{(1)}_y + a^{(2)}_z - a_{y,z}$   (1.68)

where the suffixes (1), (2) denote the life tables (e.g. referring to males and females respectively) used for the two lives, whereas $a_{y,z}$ denotes the actuarial value of an annuity of 1 per annum, payable while both individuals are alive (namely a joint-life annuity). Hence,

$S = b\, a_{\overline{y,z}} = b\, \bigl(a^{(1)}_y + a^{(2)}_z - a_{y,z}\bigr)$   (1.69)

Note that, if we accept the hypothesis of independence between the two random lifetimes, we have

$a_{y,z} = \sum_{h=1}^{+\infty} (1 + i)^{-h}\, {}_{h}p^{(1)}_y\, {}_{h}p^{(2)}_z$   (1.70)

In equation (1.69) it has been assumed that the annuity continues with the same annual amount until the death of the last survivor. A modified form provides that the amount, initially set to b, will be reduced following the first death: to $b'$ if individual (2) dies first, and to $b''$ if individual (1) dies first. Thus

$S = b'\, a^{(1)}_y + b''\, a^{(2)}_z + (b - b' - b'')\, a_{y,z}$   (1.71)

with $b' < b$, $b'' < b$. Conversely, in many pension plans the last-survivor annuity commonly provides that the annual payment is reduced only if the retiree, say life (1), dies first. Formally, $b' = b$ (instead of $b' < b$) in equation (1.71).

(f) A reversionary annuity (on two individuals) is payable while a given individual, say individual (2), is alive, but only after the death of the other individual. In this case, the number of payments is $K = \max\{0, K^{(2)}_z - K^{(1)}_y\}$, and the first payment (if any) is made at time $K^{(1)}_y + 1$. Such an annuity can be used, for example, as a death benefit in pension plans, to be paid to a surviving spouse or dependant.
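Under the independence hypothesis, relations (1.68) and (1.70) can be sketched numerically. In the fragment below the survival probabilities are made-up geometric values and the function name is ours; it is an illustration, not the book's method.

```python
# Illustrative sketch of relations (1.68) and (1.70), assuming independent
# lifetimes. The survival probabilities are made-up geometric values.

def pv_annuity(survival_probs, i):
    """Actuarial value of 1 per year in arrears: sum over h of v^h * _h p."""
    return sum(s * (1 + i) ** -h
               for h, s in enumerate(survival_probs, start=1))

i = 0.03
hp1 = [0.98 ** h for h in range(1, 41)]   # _h p for life (1), hypothetical
hp2 = [0.99 ** h for h in range(1, 41)]   # _h p for life (2), hypothetical

a1 = pv_annuity(hp1, i)                                     # a_y^(1)
a2 = pv_annuity(hp2, i)                                     # a_z^(2)
a_joint = pv_annuity([u * v for u, v in zip(hp1, hp2)], i)  # (1.70)
a_last = a1 + a2 - a_joint                                  # (1.68)
```

As expected, the joint-life value lies below each single-life value, while the last-survivor value lies between the larger single-life value and the sum of the two.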

1.6.4 The payment profile

Level annuities (sometimes called standard annuities) provide an income which is constant in nominal terms. Thus, the payment profile is flat.

A number of models of 'varying' annuities have been derived, mainly with the purpose of protecting the annuitant against the loss of purchasing power because of inflation. First, we focus on escalating annuities.

(a) In the fixed-rate escalating annuity (or constant-growth annuity) the annual benefit increases at a fixed annual rate, α, so that the sequence of payments is

$b_1,\quad b_2 = b_1 (1 + \alpha),\quad b_3 = b_1 (1 + \alpha)^2,\; \ldots$

Usually, the premium is calculated accounting for the annual increase in the benefit. Thus, for a given amount S (the single premium of the immediate life annuity), the starting benefit $b_1$ is lower than the benefit the annuitant would get from a level annuity.
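The pricing logic can be sketched as follows: for a given S, the starting benefit $b_1$ solves $S = b_1 \sum_h (1+\alpha)^{h-1} (1+i)^{-h}\, {}_h p_x$, and the level annuity is the special case α = 0. The inputs below are invented.

```python
# Sketch with invented inputs: the starting benefit b_1 of a fixed-rate
# escalating life annuity, where payment h equals b_1 (1 + alpha)^(h-1).

def starting_benefit(S, hp, i, alpha):
    """b_1 such that S = b_1 * sum_h (1+alpha)^(h-1) (1+i)^-h * _h p_x."""
    a_esc = sum((1 + alpha) ** (h - 1) * s * (1 + i) ** -h
                for h, s in enumerate(hp, start=1))
    return S / a_esc

hp = [0.98 ** h for h in range(1, 41)]               # toy _h p_x values
b_level = starting_benefit(100_000, hp, 0.03, 0.00)  # level annuity
b_escal = starting_benefit(100_000, hp, 0.03, 0.02)  # 2% annual growth
# as stated in the text, b_escal < b_level for the same single premium S
```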

Various types of index-linked escalating annuities are sold in annuity and pension markets. Two examples follow:

(b) Inflation-linked annuities provide annual benefits varying in line with some index, for example a retail-price index (like the RPI in the UK), usually with a stated upper limit. An annuity provider should invest the premiums in inflation-linked assets, so that these assets back the annuities whose payments are linked to a price index.

(c) Equity-indexed annuities earn annual interest that is linked to a stock or other equity index (e.g. the Standard & Poor's 500). Usually, the annuity promises a minimum interest rate.

Moving to investment-linked annuities, we focus on the following models:


(d) In a with-profit annuity (typical of the UK market), the single premium is invested in an insurer's with-profit fund. Annual benefits depend on an assumed annual bonus rate (e.g. 5%), and on the sequence of actual declared bonus rates, which in turn depend on the performance of the fund. In each year, the annual rate of increase in the annuity depends on the spread between the actual declared bonus and the assumed bonus. Clearly, the higher the assumed bonus rate, the lower the rate of increase in the annuity. The benefit decreases when the actual declared bonus rate is lower than the assumed bonus rate. Although the annual benefit can fluctuate, with-profit annuities usually provide a guaranteed minimum benefit.

(e) Various profit participation mechanisms (other than the bonus mechanism described above in respect of with-profit annuities) are adopted, for example, in many continental European countries. A share (e.g. 80%) of the difference between the yield from the investments backing the mathematical reserves and the technical rate of interest (i.e. the minimum guaranteed interest, say 2% or 3%) is credited to the reserves. This leads to increasing benefits, thanks to the extra yield.

(f) The single premium of a unit-linked life annuity is invested in unit-linked funds. Generally, the annuitant can choose the type of fund, for example medium-risk managed funds or, conversely, higher-risk funds. Each year, a fixed number of units is sold to provide the benefit payment. Hence, the benefit is linked directly to the value of the underlying fund, and thus it fluctuates in line with unit prices. Some unit-linked annuities, however, work in a similar way to with-profit annuities. An annual growth rate (e.g. 6%) is assumed. If the fund value grows at the assumed rate, the benefit stays the same. If the fund value growth is higher than assumed, the benefit increases, whilst if it is lower the benefit falls. Some unit-linked funds guarantee a minimum performance in line with a given index.

We conclude this section by addressing some policy conditions which provide a 'final' payment, namely some benefit after the death of the annuitant.

The complete life annuity (or apportionable annuity) is a life annuity payable in arrears which provides a pro-rata adjustment on the death of the annuitant, consisting of a final payment proportional to the time elapsed since the last payment date. Clearly, this feature is more important if the annuity is paid annually, and less important in the case of, say, monthly payments.


Capital protection represents an interesting feature of some annuity policies, usually called value-protected annuities. Consider, for example, a single-premium, level annuity. In the case of early death of the annuitant, a value-protected annuity will pay to the annuitant's estate the difference (if positive) between the single premium and the cumulated benefits paid to the annuitant. Usually, capital protection expires at some given age (75, say), after which nothing is paid even if the above-mentioned difference is positive. The capital protection benefit can be provided in two ways:

– in a cash-refund annuity, the balance is paid as a lump sum;
– in an instalment-refund annuity, the balance is paid in a sequence of instalments.

Adding capital protection clearly reduces the annuity benefit (for a given single premium).

Remark Note that capital protection constitutes a death benefit, which decreases as the age at death increases and hence as the number of annual benefits paid to the annuitant increases. For this reason, capital protection can help in building up a (partial) 'natural hedging' of mortality and longevity risks inside the annuity product. See Section 7.3.2. □

1.6.5 About annuity rates

The price of life annuities depends on several 'risk factors'. In particular, the following are important:

(a) age at the time of annuity purchase;
(b) gender;
(c) voluntary annuities versus pension annuities;
(d) information available to the insurer about the annuitant's expected lifetime.

The importance of factor (a) is self-evident. Risk factor (b) is usually taken into account because of the difference between the age-pattern of mortality in males and females. However, in unisex annuities the same annuity rate (for a given age at entry) is adopted for males and females. These annuities involve a solidarity effect (see Section 1.4.2), in the sense that men cross-subsidize women.

The term voluntary annuities (see point (c)) usually denotes annuities bought as a consequence of individual choice, that is, exercised on a voluntary basis. Conversely, the term pension annuities refers to benefits paid to people as a direct consequence of their membership of an occupational pension plan, or to annuities bought because a compulsory purchase mechanism operates. Voluntary annuities are usually purchased by people with a high life expectancy, whereas individuals who know that they have a low expected lifetime are unlikely to purchase an annuity. The consequence is that actual voluntary annuitants have a mortality pattern different from that of the population as a whole. This fact is known as adverse selection (from the point of view of the life insurer). In terms of annuity rates, adverse selection leads to higher premiums for voluntary annuities compared with pension annuities.

As regards point (d), insurers offer lower prices, that is, sell special-rate annuities, to people with an expected lifetime lower than the average (or, equivalently, a higher annual benefit for a given single premium). In particular,

– impaired-life annuities can be sold to people with health problems certified by a doctor (e.g. diabetes, chronic asthma, high blood pressure, cancer, etc.);

– enhanced annuities can be purchased by people who self-certify the presence of some cause of a higher mortality level, like being overweight or being a regular smoker.

Remark Enhanced annuities should not be confused with enhanced pensions, which provide an uplift of the annual benefit if the annuitant enters a senescent disability state (namely, in the case of a 'Long-Term Care' claim). □

1.6.6 Variable annuities and GMxB features

In the previous sections, various 'guarantees' have been addressed, for example: minimum guarantees like the guaranteed interest rate in the accumulation period (Section 1.6.2), a guaranteed minimum annual benefit in with-profit annuities, a minimum interest rate in equity-indexed annuities, a minimum performance in unit-linked annuities, and a minimum total payout via capital protection mechanisms (Section 1.6.4).

Packaging a range of guarantees is a feature of variable annuities. These products are unit-linked investment policies providing deferred annuity benefits. The annuity can be structured as a level annuity or a unit-linked annuity (see Section 1.6.4).

The guarantees, commonly referred to as GMxBs (namely, Guaranteed Minimum Benefits of type 'x'), include minimum benefits both in the case of death and in the case of life. The GMxBs are usually defined in terms of the amount resulting from the accumulation process (the account value) at some point in time, compared with a given benchmark (which may be expressed in terms of an interest rate, a fixed benefit amount, etc.).

One or more GMxBs can be included in the policy as riders to the basic variable annuity product. A brief description of some GMxBs follows:

(a) GMDB = Guaranteed Minimum Death Benefit. The GMDB guarantees a minimum lump sum benefit payable upon the annuitant's death. The GMDB can be defined in several ways; for example:
– return of premiums consists in the payment of the greater of the amount of premiums paid and the account value;
– highest anniversary value pays the greater of the highest account value at past anniversaries and the current account value (hence, according to a ratchet mechanism);
– roll-up consists in the payment of the higher of an amount equal to the premiums paid accumulated at a given interest rate (say, 5%) and the account value.
The GMDB typically expires either at the end of the accumulation period, or when a given time (say, 10 years) has elapsed since the commencement of the decumulation period.

(b) GMAB = Guaranteed Minimum Accumulation Benefit. The GMAB can be exercised at pre-fixed dates (during the accumulation period); the policyholder receives, as the surrender value, a lump sum equal to the higher of the guaranteed amount and the account value. The guaranteed amount can be determined, for instance, as the premiums paid accumulated at a given interest rate (say, 5%) according to a roll-up rule, and can be paid, for example, at the 10th anniversary (measured from the beginning of the accumulation period).

(c) GMIB = Guaranteed Minimum Income Benefit. The term 'income' refers to (annual) amounts payable to the annuitant. The policyholder receives the higher of the guaranteed amount and the account value, payable as an annuity whose annual benefit is determined according to a given interest rate and life table. The guaranteed amount is typically calculated according to a roll-up accumulation or an annual ratchet. Hence, the GMIB guarantees a minimum annual income upon annuitization.

(d) GMWB = Guaranteed Minimum Withdrawal Benefit. The policyholder receives the greater of the return of premiums and the account value, payable as a sequence of periodic withdrawals over time. For example, the GMWB might guarantee that the policyholder will receive for 20 years an annual amount equal to 5% of the premiums paid. Some policies do not allow the policyholder to withdraw money after the commencement of the annuity payments.
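The three GMDB definitions described above can be expressed as simple payoff functions. In the sketch below the account-value path, the premium amount and the roll-up rate are invented, purely for illustration.

```python
# Sketch of the three GMDB definitions described in the text; the
# account-value path, premiums and rates are invented for illustration.

def gmdb_return_of_premiums(premiums_paid, account_value):
    return max(premiums_paid, account_value)

def gmdb_highest_anniversary(past_anniversary_values, account_value):
    # ratchet: greater of the highest past anniversary value and current value
    return max(max(past_anniversary_values), account_value)

def gmdb_roll_up(premiums_paid, years, rate, account_value):
    return max(premiums_paid * (1 + rate) ** years, account_value)

path = [100.0, 112.0, 95.0, 130.0, 118.0]   # hypothetical anniversary values
current = path[-1]

d1 = gmdb_return_of_premiums(100.0, current)        # max(100, 118) -> 118
d2 = gmdb_highest_anniversary(path[:-1], current)   # max(130, 118) -> 130
d3 = gmdb_roll_up(100.0, 4, 0.05, current)          # max(121.55..., 118)
```

With this invented path, the ratchet locks in the year-four peak of 130 even though the account has since fallen back to 118.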

GMAB, GMIB, and GMWB are commonly referred to as GLBs, namely Guaranteed Living Benefits.

All GMxBs have option-like characteristics. However, the possible utilization of the GMDB follows the age-pattern of mortality, and hence can be assessed using a life table (together with assumptions about the performance of the financial market). Conversely, the utilization of a GLB depends on the policyholder's behaviour, and hence the assessment of its impact is much more difficult.

1.7 References and suggestions for further reading

In this section, we only quote textbooks and papers dealing with general aspects of life annuity products. Studies particularly devoted to longevity risk in life annuity portfolios and pension plans will be quoted in the relevant sections of the following chapters.

Basic actuarial aspects of life annuities (namely expected present values, premium calculation, mathematical reserves) are dealt with in almost all of the main textbooks on actuarial mathematics and life insurance techniques. The reader can refer, for example, to Bowers et al. (1997), Gerber (1995), Gupta and Varga (2002), and Rotar (2007).

As regards the notation, the use of symbols like $a_{\overline{K_x}|}$ (see (1.43)) can be traced back to de Finetti. Actually, de Finetti (1950, 1957) focussed on the random present value of insured benefits. For example, in the age-continuous context,

– the random present value of the whole life assurance (with a unitary sum assured) is $(1 + i)^{-T_x}$, and then, according to usual actuarial notation, the expected present value is

$A_x = \mathrm{E}\bigl[(1 + i)^{-T_x}\bigr]$

– the random present value of the standard endowment is $(1 + i)^{-\min\{T_x, n\}}$, and hence

$A_{x,\overline{n}|} = \mathrm{E}\bigl[(1 + i)^{-\min\{T_x, n\}}\bigr]$


As regards the stochastic approach to actuarial values, see also the seminal contribution by Sverdrup (1952). Mortality risks in life annuities are analysed by McCrory (1986).

The objectives and main design features of life annuity products are extensively dealt with by Black and Skipper (2000). We have mainly referred to this textbook in Section 1.6. Various papers and reports have recently been devoted to innovation in life annuity products, especially addressing the impact of longevity risk. See, for example, Cardinale et al. (2002), Department for Work and Pensions (2002), Retirement Choice Working Party (2001), Richard and Jones (2004), Wadsworth et al. (2001), Swiss Re (2007), and Blake and Hudson (2000). Variable annuities are addressed in particular by Sun (2006) and O'Malley (2007).

The book by Milevsky (2006) constitutes an up-to-date reference in the context of life annuities and post-retirement choices.

Great effort has been devoted to the analysis of life annuities from an economic perspective, in particular in the framework of wealth management and human life cycle modelling. We only cite the seminal contribution by Yaari (1965), whereas for other bibliographic suggestions the reader can refer to Milevsky (2006). The extra yield defined in Section 1.4.1 is the key element behind the seminal result of Yaari (1965). He shows that a risk-averse, life cycle consumer facing an uncertain time of death would, under certain assumptions (e.g. the absence of bequest, and the absence of other sources of randomness), find it optimal to invest 100% of his/her wealth in an annuity (priced on an actuarially fair basis).

An extensive discussion of the concepts of mutuality and solidarity (albeit with some terms used with a meaning different from that adopted in the present chapter) is provided by Wilkie (1997).

Finally, some references concerning the history of life annuities and the related actuarial modelling follow. For the early history of life annuities the reader can refer to Kopf (1926). The paper by Hald (1987) is more oriented to actuarial aspects, and constitutes an interesting introduction to the early history of life insurance mathematics. Haberman (1996) provides extensive information about the history of actuarial science up to 1919, while in Haberman and Sibbett (1995) the reader can find the reproduction of a number of milestone papers in actuarial science. The papers by Pitacco (2004a) and Pitacco (2004c) mainly deal with the evolution of mortality modelling, ranging from Halley's contributions to the awareness of longevity risk.

2 The basic mortality model

2.1 Introduction

Some elements of the basic mortality model underlying life insurance, life annuities and pensions have already been introduced in Chapter 1, while presenting the structure of life annuities; see in particular Sections 1.2 and 1.3. In Chapter 2, we consider the mortality model in more depth. We adopt a more structured presentation of the fundamental ideas, which means that some repetition of elements from Chapter 1 is unavoidable.

However, new concepts are also introduced. In particular, an age-continuous framework is defined in Section 2.3, in order to provide some tools needed when dealing with mortality projection models.

Indices summarizing the probability distribution of the lifetime are described in Section 2.4, whereas parametric models (i.e. mortality 'laws') are presented in Section 2.5. Basic ideas concerning non-parametric graduation are introduced in Section 2.6. Transforms of the survival function are briefly addressed in Section 2.7.

Less traditional topics, yet of great importance in the context of life annuities and mortality forecasts, are dealt with in Sections 2.8 and 2.9, respectively: mortality at very old ages (i.e. the problem of 'closing' the life table), and the concept of 'frailty' as a tool to represent heterogeneity in populations due to unobservable risk factors.

A list of references and suggestions for further reading (Section 2.10) concludes the chapter. As regards references to the actuarial and statistical literature, in order to improve readability we have avoided the use of citations throughout the first sections of this chapter, namely the sections devoted to traditional issues. Conversely, important contributions on more recent issues are cited within the text of Sections 2.8 and 2.9.


2.2 Life tables

2.2.1 Cohort tables and period tables

The life table is a (finite) decreasing sequence $l_0, l_1, \ldots, l_\omega$. The generic item $l_x$ refers to the integer age x and represents the estimated number of people alive at that age in a properly defined population (from an initial group of $l_0$ individuals aged 0). The exact meaning of the $l_x$'s will be explained after discussing two approaches to the calculation of these numbers.

First, assume that the sequence $l_0, l_1, \ldots, l_\omega$ is provided by statistical evidence, that is, by a longitudinal observation of the actual numbers of individuals alive at ages $1, 2, \ldots, \omega$, out of a given initial cohort consisting of $l_0$ newborns. The (integer) age ω is the limiting age (say, ω = 115), that is, the age such that $l_\omega > 0$ and $l_{\omega+1} = 0$. The sequence $l_0, l_1, \ldots, l_\omega$ is called a cohort life table. Clearly, the construction of a cohort table takes ω + 1 years.

Assume, conversely, that the statistical evidence consists of the frequency of death at the various ages, observed throughout a given period, for example one year. Assume that the frequency of death at age x (possibly after a graduation with respect to x) is an estimate of the probability $q_x$.

Then, for $x = 0, 1, \ldots, \omega - 1$, define

$l_{x+1} = l_x\, (1 - q_x)$   (2.1)

with $l_0$ (the radix) assigned (e.g. $l_0 = 100{,}000$). Hence, $l_x$ is the expected number of survivors out of a notional cohort (also called a synthetic cohort) initially consisting of $l_0$ individuals. The sequence $l_0, l_1, \ldots, l_\omega$, defined by recursion (2.1), is called a period life table, as it is derived from period observations.
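Recursion (2.1) and definition (2.2) translate directly into code. In the sketch below the $q_x$ values are toy numbers, so the table is truncated before a true limiting age ω is reached.

```python
# Recursion (2.1) in code: building a period life table from one-year
# death probabilities q_x. The q_x values are toy numbers, so the table
# stops before a true limiting age omega is reached.

def build_life_table(q, radix=100_000.0):
    l = [radix]
    for qx in q:
        l.append(l[-1] * (1 - qx))     # l_{x+1} = l_x (1 - q_x)    (2.1)
    return l

q = [0.001 + 0.0005 * x for x in range(100)]   # hypothetical q_0 ... q_99
l = build_life_table(q)

d = [l[x] - l[x + 1] for x in range(len(q))]   # d_x = l_x - l_{x+1}  (2.2)
# run to the limiting age, the d_x would sum to l_0 as in (2.3); in this
# truncated toy table the sum equals l_0 - l[-1] instead
```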

Remark Period observations are also called cross-sectional observations, as they analyse an existing population (in terms of the frequency of death) 'across' the various ages (or age groups). □

An important hypothesis underlying recursion (2.1) should be stressed. As the $q_x$'s are assumed to be estimated from mortality experience in a given period (say, one year), the calculation of the $l_x$'s relies on the assumption that the mortality pattern does not change in the future.

As is well known, statistical evidence shows that human mortality, in many countries, declined over the 20th century, and in particular over its last decades (see Chapter 3). So, the hypothesis of 'static' mortality cannot be assumed in principle, at least when long periods of time are referred to. Hence, in life insurance applications, the use of period life tables should be restricted to products involving short or medium durations (5 to 10 years, say), like term assurances and endowment assurances, whilst it should be avoided when dealing with life annuities and pension plans. Conversely, these products require life tables which allow for the anticipated future mortality trend, namely projected tables constructed on the basis of the experienced mortality trend.

For any given sequence l0, l1, . . . , lω it is usual to define

$d_x = l_x - l_{x+1}; \qquad x = 0, 1, \ldots, \omega$   (2.2)

thus, $d_x$ is the expected number of individuals dying between exact ages x and x + 1, out of the initial $l_0$ individuals. Clearly,

$\sum_{x=0}^{\omega} d_x = l_0$   (2.3)

2.2.2 ‘Population’ tables versus ‘market’ tables

Mortality data, and hence life tables, can originate from observations concerning a whole national population, a specific part of a population (e.g. retired workers, disabled people, etc.), an insurer's portfolio, and so on.

Life tables constructed on the basis of observations involving a whole national population (usually split into females and males) are commonly referred to as population tables.

Market tables are constructed using mortality data arising from a collection of insurance portfolios and/or pension plans. Usually, distinct tables are constructed for assurances (i.e. insurance products with a positive sum at risk, for example term and endowment assurances), annuities purchased on an individual basis, and pensions (i.e. annuities paid to the members of a pension plan).

The rationale for distinct market tables lies in the fact that mortality levels may differ significantly as we move from one type of insurance product to another. The case of different types of life annuities has been discussed in Section 1.6.5.

Market tables provide experience-based data for premium and reserve calculations and for the assessment of expected profits. Population tables can provide a starting point when market tables are not available. Moreover, population tables usually reveal mortality levels higher than those expressed by market tables, and hence are likely to constitute a prudential (or 'conservative', or 'safe-side') assessment of mortality in assurance portfolios. Thus, population tables can be used when pricing assurances in order to include a profit margin (or an implicit safety loading) in the premiums. Indeed, in the early history of life insurance, population life tables were used in the calculation of premiums, and this prudential assessment of mortality led to many insurance companies making unanticipated profits.

2.2.3 The life table as a probabilistic model

We consider a person aged x, and denote by $T_x$ the random variable representing his/her remaining lifetime. In actuarial calculations, probabilities like $P[T_x > h]$ and $P[h < T_x \le h + k]$ are usually involved. When a life table is available, these probabilities can be derived immediately from the life table itself, provided that the ages and durations are integers.

In life insurance mathematics, a specific notation is commonly used for the probabilities of survival and death. The notation for the survival probability is as follows:

${}_{h}p_x = P[T_x > h]$   (2.4)

where h is an integer. In particular, ${}_{1}p_x$ can simply be denoted by $p_x$; clearly ${}_{0}p_x = 1$.

The notation for the probability of death is as follows:

${}_{h|k}q_x = P[h < T_x \le h + k]$   (2.5)

If h = 0 the notation ${}_{k}q_x$ is used and, in particular, when h = 0 and k = 1, the symbol $q_x$ is commonly adopted. Trivially, ${}_{0}q_x = 0$.

Note that, in all symbols, the right-hand subscript denotes the age being considered. Conversely, the left-hand subscript denotes a duration, whose meaning depends on the specific probability addressed.

Starting from recursion (2.1), which defines the life table, and using well-known theorems of probability theory, we can calculate probabilities of survival and death.

Obviously, for the probability $q_x$ (called the annual probability of death) we have

$q_x = 1 - \dfrac{l_{x+1}}{l_x} = \dfrac{d_x}{l_x}$   (2.6)

and hence, for the probability $p_x$ (called the annual survival probability),

$p_x = 1 - q_x$   (2.7)


Remark Sometimes the one-year probabilities $q_x$ and $p_x$ are called 'mortality rate' and 'survival rate' respectively. We do not use these expressions to denote probabilities of death and survival, as the term 'rate' should refer to a counter expressing the number of events per unit of time. □

In general, for the survival probability we have

${}_{h}p_x = p_x\, p_{x+1} \cdots p_{x+h-1} = \dfrac{l_{x+h}}{l_x}$   (2.8)

while for the probabilities of dying we have

${}_{k}q_x = 1 - {}_{k}p_x = 1 - \dfrac{l_{x+k}}{l_x}$   (2.9)

and

${}_{h|k}q_x = {}_{h}p_x\; {}_{k}q_{x+h} = \dfrac{l_{x+h} - l_{x+h+k}}{l_x}$   (2.10)
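Relations (2.8), (2.9) and (2.10) can be checked numerically from any tabulated $l_x$. The geometric $l_x$ values in the sketch below are invented, just to exercise the formulas.

```python
# Relations (2.8)-(2.10) computed from a tabulated l_x; the geometric
# l_x values below are invented, just to exercise the formulas.

l = [100_000.0 * 0.99 ** x for x in range(61)]   # toy l_x for x = 0..60

def hpx(x, h):
    return l[x + h] / l[x]                       # _h p_x            (2.8)

def kqx(x, k):
    return 1 - l[x + k] / l[x]                   # _k q_x            (2.9)

def hkqx(x, h, k):
    return (l[x + h] - l[x + h + k]) / l[x]      # _{h|k} q_x        (2.10)

# identities such as (2.12) and (2.13) can then be verified numerically:
check = hkqx(30, 5, 10) - (hpx(30, 5) - hpx(30, 15))   # should be ~ 0
```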

Note that the sequence

${}_{0|1}q_x,\; {}_{1|1}q_x,\; \ldots,\; {}_{\omega-x|1}q_x$   (2.11)

constitutes the probability distribution of the random variable $K_x$, usually called the curtate remaining lifetime and defined as the integer part of $T_x$; thus, the possible outcomes of $K_x$ are $0, 1, \ldots, \omega - x$.

Further useful relations are as follows:

${}_{h|k}q_x = {}_{h+k}q_x - {}_{h}q_x$   (2.12)

${}_{h|k}q_x = {}_{h}p_x - {}_{h+k}p_x$   (2.13)

When $q_x$ is expressed as $q_x = \phi_x/(1 + \phi_x)$, the function $\phi_x$ represents the so-called mortality odds, namely

$\phi_x = \dfrac{q_x}{p_x}$   (2.14)

From $0 < q_x < 1$ (for $x < \omega$), it follows that $\phi_x > 0$. Thus, focussing on the odds, rather than on the annual probabilities of dying, can make the choice of a mathematical formula fitting the age-pattern of mortality easier (see Section 2.5), as the only constraint is the positivity of the odds.

2.2.4 Select mortality

Consider, for example, a group of insureds, all aged 45, deriving from a population whose mortality can be described by a given life table. Is $q_{45}$ (drawn from the assumed life table) a reasonable assessment of the probability of dying for each insured in the group?

In order to answer this question, the following points should be addressed:

(a) When starting a life insurance policy with an insurance company, an individual may be subject to medical screening and, possibly, to a medical examination. An individual who passes such tests and who is not charged any extra premium is often called a 'standard risk'.

(b) It has been observed that the mortality experienced by policyholders recently accepted (as standard risks) is lower than the mortality experienced by policyholders (of the same age) with a longer duration since policy issue.

So, the answer to the above question is negative if the insureds have entered insurance in different years: it is reasonable to expect that an individual who has just bought insurance will be in better health than an individual who bought insurance several years ago.

Hence, the attained age (45, in the example) should be split as follows:

attained age = age at entry + time since policy issue

The following notation is usually adopted to denote the annual probabilities of death for an insured aged 45:

$q_{[45]},\; q_{[44]+1},\; \ldots,\; q_{[40]+5},\; \ldots$

where the number in square brackets denotes the age at policy issue, whereas the second number denotes the time since policy issue. In general, $q_{[x]+u}$ denotes the probability of an individual currently aged x + u, who bought insurance at age x, dying within one year.

According to point (b), it is usual to assume:

$q_{[45]} < q_{[44]+1} < \cdots < q_{[40]+5} < \cdots$

However, experience shows that it is reasonable to assume that the selection effect vanishes after some years, say r years after policy issue. So, in general terms, we can assume:

$q_{[x]} < q_{[x-1]+1} < \cdots < q_{[x-r]+r} = q_{[x-r-1]+r+1} = \cdots = q'_x$   (2.15)

where $q'_x$ denotes the probability of an individual currently aged x, who bought insurance more than r years ago, dying within one year. The period r is called the select period.


Referring now to a person who bought insurance at age x, and assuming a select period of r = 3 years, the following probabilities should be used:

$q_{[x]},\; q_{[x]+1},\; q_{[x]+2},\; q'_{x+3},\; q'_{x+4},\; \ldots$   (2.16)

We denote by $x_{\min}$ and $x_{\max}$ the minimum and maximum ages at entry, respectively. The set of sequences (2.16), for $x = x_{\min}, x_{\min}+1, \ldots, x_{\max}$, is called a select table. In particular, the table used after the select period is called an ultimate life table.

Conversely, life tables in which mortality depends on the attained age only (as is the case for the life tables described in Section 2.2.1) are called aggregate tables.
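A select table is, in effect, a two-dimensional lookup: by entry age during the select period, and by attained age once the selection effect has worn off. The sketch below illustrates rule (2.15)-(2.16) with a select period of r = 3; all probabilities and the age range are invented.

```python
# A minimal select-table lookup with a select period of r = 3 years,
# following the q_[x]+u notation of the text. All probabilities and the
# age range are invented for illustration.

SELECT_PERIOD = 3

q_select = {              # q_[x]+u for u = 0, 1, 2, indexed by entry age
    40: [0.0010, 0.0013, 0.0016],
    41: [0.0011, 0.0014, 0.0017],
}
q_ultimate = {            # ultimate probabilities q'_x by attained age
    40: 0.0015, 41: 0.0017, 42: 0.0019, 43: 0.0021, 44: 0.0023,
}

def q(entry_age, u):
    """Annual death probability for an insured who entered at entry_age,
    u complete years ago (attained age = entry_age + u)."""
    if u < SELECT_PERIOD:
        return q_select[entry_age][u]
    return q_ultimate[entry_age + u]   # selection effect has worn off
```

Note that q(41, 0) < q(40, 1): at the same attained age 41, the more recently selected life has the lower probability of dying, as in (2.15).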

Select mortality also concerns life annuities. The person purchasing a life annuity is likely to be in a state of good health, and hence it is reasonable to assume that her/his probabilities of death, for a certain period after policy issue, are lower than the probabilities of other individuals of the same age. In this case, a self-selection effect operates.

Remark The selection effect, due to medical ascertainment (in the case of insurances with death benefit) or to self-selection (in the case of life annuities), operates during the first years after policy issue, and the related age-pattern of mortality is often called issue-select. Another type of selection is allowed for when some contingency can adversely affect individual mortality. For example, in actuarial calculations regarding insurance benefits in the case of disability, the mortality of disabled policyholders is usually considered to be dependent on the time elapsed since disablement inception (as well as on the attained age). In this case, the mortality is called inception-select. □

2.3 Moving to an age-continuous context

2.3.1 The survival function

Suppose that we have to evaluate survival and death probabilities (like (2.8), (2.9) and (2.10)) when ages and times are real numbers. Tools other than the life table (as described in Section 2.2) are then needed.

Assume that the function S(t), called the survival function and defined for t ≥ 0 as follows:

$S(t) = P[T_0 > t]$   (2.17)


has been assigned. Clearly, T0 denotes the random lifetime for a new-born. In the age-continuous framework, it is usual to assume that thepossible outcomes of Tx lie in (0,+∞); nonetheless, we can assume thatthe probability measure outside the interval (0,ω) is zero, where ω is thelimiting age.

Consider the probability (2.4); we have

P[Tx > h] = P[T0 > x + h | T0 > x] = P[T0 > x + h] / P[T0 > x] (2.18)

we then find

hpx = S(x + h) / S(x) (2.19)

For probability (2.5), via the same reasoning, we obtain

h|kqx = (S(x + h) − S(x + h + k)) / S(x) (2.20)

and, in particular

kqx = (S(x) − S(x + k)) / S(x) (2.21)

Turning back to the life table, we note that, since lx is the expected number of people alive at age x out of a cohort initially consisting of l0 individuals, we have:

lx = l0 P[T0 > x] (2.22)

and, in terms of the survival function,

lx = l0 S(x) (2.23)

(provided that all individuals in the cohort have the same age-pattern of mortality, described by S(x)). Thus, the lx's are proportional to the values which the survival function takes at integer ages x, and so the life table can be interpreted as a tabulation of the survival function.

Remark If a mathematical formula has been chosen to express the function S(t), 'exact' survival and death probabilities can be calculated, with ages and times given by real numbers. Conversely, when the survival function is tabulated at integer ages only, for example, derived from the life table setting S(x) = lx/l0 (see (2.23)), approximate methods are needed to calculate survival and death probabilities at fractional ages. Some of these methods are described in Section 2.3.5. □
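As a sketch of how relations (2.19)-(2.21) and (2.23) are applied in practice, the fragment below builds a small illustrative life table (the assumed annual death probabilities are invented for the example, not taken from any real mortality observation) and derives survival and deferred death probabilities from the tabulated S(x) = lx/l0:

```python
# Illustrative life table; the assumed annual death probabilities are
# hypothetical, chosen only to produce a plausibly decreasing l_x.
OMEGA = 110
l = [100_000.0]
for age in range(OMEGA):
    q = min(1.0, 0.0005 + 0.00005 * 1.09 ** age)
    l.append(l[-1] * (1.0 - q))

def S(x):
    """Survival function tabulated at integer ages: S(x) = l_x / l_0 (2.23)."""
    return l[x] / l[0]

def p(h, x):
    """h-year survival probability h_p_x = S(x + h) / S(x) (2.19)."""
    return S(x + h) / S(x)

def deferred_q(h, k, x):
    """Deferred death probability h|k_q_x = (S(x+h) - S(x+h+k)) / S(x) (2.20)."""
    return (S(x + h) - S(x + h + k)) / S(x)

# Consistency check: surviving h years and then dying within k years
# is the same event as dying between ages x+h and x+h+k.
assert abs(p(10, 40) * (1 - p(5, 50)) - deferred_q(10, 5, 40)) < 1e-12
```

The final assertion checks the multiplication rule that links (2.19) and (2.20), a useful sanity test whenever a life table is loaded from data.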

Figure 2.1(a) illustrates the typical behaviour of the survival function S(x). This behaviour reflects results of statistical observations on mortality, as we will see in Chapter 3.


Figure 2.1. Survival functions.

Figure 2.1(b) focusses on the dynamic aspects of mortality. In particular, two aspects (which emerge from mortality observations throughout time) can be singled out:

– the survival curve moves (in a north-easterly direction over time) towards a rectangular shape, and hence the term rectangularization is used to describe this feature;

– the point of maximum downward slope of the survival curve progressively moves towards the very old ages; this feature is called the expansion of the survival function.

These aspects will be considered in more detail in Chapter 7, when dealing with longevity risk.

2.3.2 Other related functions

Other functions can be involved in age-continuous actuarial calculations. The most important is the force of mortality (or mortality intensity), dealt with in Section 2.3.3. In the present section we introduce the probability density function (pdf) and the distribution function of the random variable Tx, x ≥ 0.

First, we focus on the random lifetime T0. Let f0(t) and F0(t) denote, respectively, the pdf and the distribution function of T0. In particular, F0(t) expresses, by definition, the probability of a newborn dying within t years. Hence,

F0(t) = P[T0 < t] (2.24)

or, according to the actuarial notation,

F0(t) = tq0 (2.25)


Of course, we have

F0(t) = 1 − S(t) (2.26)

The following relation holds between the pdf f0(t) and the distribution function F0(t):

F0(t) = ∫_0^t f0(u) du (2.27)

Usually it is assumed that, for t > 0, the pdf f0(t) is a continuous function. Then, we have

f0(t) = d/dt F0(t) = −d/dt S(t) (2.28)

The pdf f0(t) is frequently called the curve of deaths.

Figure 2.2(a) illustrates the typical behaviour of the pdf f0(t). Equation (2.28) justifies the relation between the curve of deaths and the survival curve (see Fig. 2.1(a)). In particular, we note that the point of maximum downward slope in the survival curve corresponds to the modal point (at adult-old ages) in the curve of deaths.

Moving to the remaining lifetime at age x, Tx (x > 0), the following relations link the distribution function and the pdf of Tx with the analogous functions relating to T0:

Fx(t) = P[Tx < t] = P[x < T0 ≤ x + t] / P[T0 > x] = (F0(x + t) − F0(x)) / S(x) (2.29)

fx(t) = d/dt Fx(t) = (d/dt F0(x + t)) / S(x) = f0(x + t) / S(x) (2.30)

From functions Fx(t) and fx(t) (and in particular, via (2.29) and (2.30), from F0(t) and f0(t)), all of the probabilities involved in actuarial

Figure 2.2. Probability density function and force of mortality.


calculations can be derived. For example:

tpx = 1 − Fx(t) = ∫_t^+∞ fx(u) du = (1/S(x)) ∫_t^+∞ f0(x + u) du (2.31)

2.3.3 The force of mortality

We refer to an individual age x, and consider the probability of dying before age x + t (with x and t real numbers), namely tqx. The force of mortality (or mortality intensity) is defined as follows:

µx = lim_{t↘0} P[Tx ≤ t] / t = lim_{t↘0} tqx / t (2.32)

and hence it represents the instantaneous rate of mortality at a given age x. In reliability theory, this concept is usually referred to as the failure rate or the hazard function.

From

P[Tx ≤ t] = Fx(t) = (F0(x + t) − F0(x)) / S(x) (2.33)

we obtain

µx = lim_{t↘0} (F0(x + t) − F0(x)) / (t S(x)) = f0(x) / S(x) (2.34)

or

µx = −(d/dx S(x)) / S(x) = −d/dx ln S(x) (2.35)

Hence, once the survival function S(x) has been assigned, the force of mortality can be derived. Thus, the force of mortality does not add any information concerning the age-pattern of mortality, provided that this has been described in terms of S(x) (or f0(x), or F0(x)). Conversely, the role of the force of mortality is to provide a tool for a fundamental statement of assumptions about the behaviour of individual mortality as a function of the attained age. The Gompertz model for the force of mortality (see Section 2.5.1) provides an excellent example.

Note that, as µx = f0(x)/S(x) (see (2.34)), the relation between the graph of µx and the graph of f0(x) (see Fig. 2.2) can be explained in terms of the behaviour of S(x). When S(x) is close to 1, the two graphs are quite similar, whereas when S(x) decreases strongly, µx increases markedly.

From (2.35), with the obvious boundary condition S(0) = 1, we obtain:

S(x) = exp{ −∫_0^x µu du } (2.36)


As clearly appears from (2.36), the survival function S(x) can be obtained once the force of mortality has been chosen. Clearly, the possibility of finding a 'closed' form for S(x) strictly depends on the structure of µx.
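As a sketch of relation (2.36), the fragment below recovers S(x) by numerically integrating an assigned force of mortality (a Gompertz intensity with invented, illustrative parameters) and checks the result against the closed form available in this case:

```python
import math

# Illustrative Gompertz intensity mu_x = B c^x; parameters are invented.
B, c = 0.00003, 1.1

def mu(x):
    return B * c ** x

def S_numeric(x, steps=10_000):
    """S(x) = exp(-int_0^x mu_u du), integral by the trapezoidal rule."""
    h = x / steps
    integral = 0.5 * h * (mu(0.0) + mu(x)) + h * sum(mu(i * h) for i in range(1, steps))
    return math.exp(-integral)

def S_closed(x):
    """For Gompertz, int_0^x B c^u du = B (c^x - 1) / ln c in closed form."""
    return math.exp(-B * (c ** x - 1.0) / math.log(c))

# The numerical and the closed-form survival functions agree closely.
assert abs(S_numeric(70.0) - S_closed(70.0)) < 1e-6
```

For intensities with no closed-form integral, the numerical route is the one actually used in practice; the Gompertz case simply lets us verify it.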

Relations between the force of mortality and the basic mortality functions relating to an individual age x can be easily found. For example, from (2.34) and (2.30), we obtain

µx+t = f0(x + t) / S(x + t) = fx(t) / (1 − Fx(t)) (2.37)

and hence

fx(t) = tpx µx+t (2.38)

Finally, the cumulative force of mortality (or cumulative hazard function) is defined as follows:

H(x) = ∫_0^x µu du (2.39)

Remark A link between the quantities used in an age-discrete context (like lx, dx, etc.) and the quantities used in age-continuous circumstances (like S(x), f0(x), etc.) may be of interest, especially when comparing and interpreting graphical representations of data provided by statistical experience.

The analogy between lx and S(x) immediately emerges from (2.23). As regards dx (see equation (2.2)), the analogy with the pdf f0(x) follows from the fact that the former is minus the first-order difference of the function lx, while the latter is minus the derivative of the survival function S(x).

Finally, an interesting link can be found between the probabilities h|1qx and the pdf fx(t). The quantities

h|1qx = hpx qx+h; h = 0, 1, . . . ,ω

constitute the probability distribution of the curtate lifetime Kx (see (2.10) and (2.11)). Conversely, in age-continuous circumstances, the pdf of the probability distribution of Tx is given by

fx(t) = tpx µx+t; t ≥ 0

(see (2.38)). The analogy between the right-hand sides of the two expressions is evident. Note, however, that fx(t) (as well as µx+t) does not represent a probability, the probability of a person age x dying between age x + t and x + t + dt being given by fx(t) dt. □


2.3.4 The central death rate

The behaviour of the force of mortality over the interval (x, x + 1) can be summarized by the central death rate at age x, which is usually denoted by mx. The definition is as follows:

mx = ∫_0^1 S(x + u) µx+u du / ∫_0^1 S(x + u) du = (S(x) − S(x + 1)) / ∫_0^1 S(x + u) du (2.40)

We note that mx is defined as the (age-continuous) weighted arithmetic mean of the force of mortality over (x, x + 1), the weighting function being the probability of being alive at age x + u, 0 < u ≤ 1, expressed in terms of the survival function S(x + u).

The integral ∫_0^1 S(x + u) du can be approximated using the trapezoidal rule (and an approximation has to be used when only a life table is available). Then, we obtain an approximation to the central death rate:

mx = (S(x) − S(x + 1)) / ((S(x) + S(x + 1))/2) (2.41)

Note that mx can also be expressed in terms of the annual probability of survival or the annual probability of death. Indeed, from (2.41) we immediately obtain:

mx = 2 (1 − px) / (1 + px) = 2 qx / (2 − qx) (2.42)

2.3.5 Assumptions for non-integer ages

Assume that a life table (as described in Section 2.2) is available. How can we obtain the survival function for all real ages x, and probabilities of death and survival for all real ages x and durations t? In what follows, we describe three approximate methods widely used in actuarial practice:

(a) Uniform distribution of deaths. Relation (2.23) suggests a practicable approach. First, set S(x) = lx/l0 for all integer x using the available life table. Then, for x = 0, 1, . . . , ω − 1 and 0 < t < 1, define

S(x + t) = (1 − t) S(x) + t S(x + 1) (2.43)

and assume S(x) = 0 for x > ω, so that the survival function is a piecewise linear function. It is easy to prove that from (2.43) we obtain, in particular, tqx = t qx, that is, a uniform distribution of deaths between

58 2 : The basic mortality model

exact ages x and x + 1, whence the name of this approximation. It is also easy to prove that, from (2.43) and (2.35),

µx+t = qx / (1 − t qx) (2.44)

so that µx+t is an increasing function of t in the interval 0 < t < 1.

(b) Constant force of mortality. Let us assume, for 0 < t ≤ 1

µx+t = µ(x) (2.45)

where µ(x) denotes a value estimated from mortality observations. It follows, in particular, that tpx = e^(−t µ(x)). This assumption, consisting in a piecewise constant force of mortality, is frequently adopted in actuarial calculations. We note that, from (2.40),

mx = µ(x) (2.46)

(c) The Balducci assumption. Let us define, for 0 < t ≤ 1

tqx = t qx / (1 − (1 − t) qx) (2.47)

The Balducci assumption has an important role in traditional actuarial techniques for constructing life tables from mortality observations. However, it is possible to prove that, from (2.47) and (2.35),

µx+t = qx / (1 − (1 − t) qx) (2.48)

so that µx+t is a decreasing function of t in the interval 0 < t < 1: for most ages, this would be an undesirable consequence of the Balducci assumption.
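The three assumptions (a)-(c) can be compared numerically; the annual probability qx and fraction t below are illustrative values, not drawn from any table:

```python
# Sketch comparing the three fractional-age assumptions (a)-(c) for an
# illustrative annual death probability qx and a fraction t of a year.
qx, t = 0.02, 0.4

tq_udd = t * qx                                  # (a) uniform distribution of deaths
tq_cfm = 1.0 - (1.0 - qx) ** t                   # (b) constant force: 1 - (px)^t
tq_balducci = t * qx / (1.0 - (1.0 - t) * qx)    # (c) Balducci assumption (2.47)

# All three agree at the end of the year (t = 1):
for formula in (
    lambda s: s * qx,
    lambda s: 1.0 - (1.0 - qx) ** s,
    lambda s: s * qx / (1.0 - (1.0 - s) * qx),
):
    assert abs(formula(1.0) - qx) < 1e-15

# Under UDD the force (2.44) increases over the year, under Balducci (2.48)
# it decreases -- the undesirable feature noted in the text.
mu_udd = lambda s: qx / (1.0 - s * qx)
mu_bal = lambda s: qx / (1.0 - (1.0 - s) * qx)
assert mu_udd(0.9) > mu_udd(0.1)
assert mu_bal(0.9) < mu_bal(0.1)
```

For small qx the three fractional probabilities are very close; the differences only matter at the high ages where qx is large.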

2.4 Summarizing the lifetime probability distribution

Age-specific functions are usually needed in actuarial calculations. For example, in the age-discrete context functions like lx, qx, etc. are commonly used, whereas, for age-continuous calculations, the survival function S(x) or the force of mortality µx are the usual starting points.

Nevertheless, the role of single-figure indices (or markers), summarizing the lifetime probability distribution, should not be underestimated. In particular, important features of past mortality trends can be singled out by focussing on the behaviour of some indices over time, as we will see in Chapter 3.


2.4.1 The life expectancy

In the age-continuous context, the life expectancy (or expected lifetime) for a newborn, denoted by e0, is defined as follows:

e0 = E[T0] = ∫_0^∞ t f0(t) dt (2.49)

Integrating by parts, we also find, in terms of the survival function:

e0 = ∫_0^∞ S(t) dt (2.50)

The definition can be extended to all (real) ages x. So, the expected remaining lifetime at age x is given by

ex = E[Tx] = ∫_0^∞ t fx(t) dt (2.51)

and also, integrating by parts, by

ex = (1/S(x)) ∫_0^∞ S(x + t) dt (2.52)

Note that, for an individual age x, the random age at death can be expressed as x + Tx, and then the expected age at death is given by

x + E[Tx] = x + ex (2.53)

For all x, x > 0, the following inequality holds:

x + ex ≥ e0 (2.54)

The expected lifetime is often used to compare mortality in various populations. In this regard, the following aspects should be stressed. The definition of ex is based on the probability distribution of the lifetime conditional on being alive at age x. Thus, when x = 0 the probability distribution involved has the pdf f0(t) (see (2.49)), and hence mortality at all ages contributes to the value of e0, including, for example, infant mortality. Conversely, if x > 0 the conditional pdf fx(t) is involved, and so only the age-pattern of mortality beyond age x determines the value of ex.

The expected value of the curtate lifetime Kx is called the curtate expectation of life at age x. It is usually denoted by ex, and is defined as follows:

ex = E[Kx] = Σ_{k=0}^{ω−x} k · k|1qx (2.55)


From (2.55), the following simpler expression can be derived:

ex = Σ_{k=1}^{ω−x} kpx (2.56)

Another interesting quantity is the so-called complete expectation of life at age x, defined as follows:

◦ex = E[Kx + 1/2] = ex + 1/2 (2.57)

This quantity can be taken as an approximation to the expected remaining lifetime ex defined by (2.51), and is useful when only a life table is available. Indeed, it is possible to prove that ◦ex approximates ex by applying the trapezoidal rule to the integral in (2.52).

Remark Age-specific functions (namely, functions of age x), like lx, qx, ex, etc. in the age-discrete context, and S(x), f0(x), µx, ex, etc. in the age-continuous context, are frequently named biometric functions (or life table functions, even in the age-continuous context). It should be noted that, once one of these functions has been assigned, the other functions (in the same context) can be derived. For example, in age-discrete calculations, from the lx values we can derive the functions qx, ex, etc.; in the age-continuous framework, from the force of mortality µx the survival function can be calculated and then all of the probabilities of interest. □

2.4.2 Other markers

As is well known in probability theory, the expected value provides a location measure of a probability distribution, and this is also the case for the random lifetime T0 (or Tx in general). Other location measures can be used to summarize the probability distribution of the random lifetime. In particular:

– the modal value (at adult ages) of the curve of deaths, Mod[T0], also called the Lexis point;

– the median value of the probability distribution of T0, Med[T0], or median age at death.

A number of variability measures can be used to summarize the dispersion of the probability distribution of the lifetime. As we will see in Chapter 3, in a dynamic context interesting information about the rectangularization


process can be obtained from these characteristics. Some examples follow:

– A traditional variability measure is provided by the variance of the random lifetime, Var[T0], or its standard deviation,

σ0 = √Var[T0] (2.58)

– The coefficient of variation, defined as

CV[T0] = √Var[T0] / E[T0] = σ0 / e0 (2.59)

provides a relative measure of variability.

– The entropy H[T0] is defined as follows:

H[T0] = −∫_0^∞ S(x) ln S(x) dx / ∫_0^∞ S(x) dx (2.60)

thus, the entropy is minus the mean value of ln S(x), weighted by S(x); it is possible to prove that, as deaths become more concentrated, the value of H declines and, in particular, H = 0 if the survival function has a perfectly rectangular shape.

– As deaths become more concentrated in an increasingly narrow interval, the slope of the survival curve becomes steeper. A simple variability measure is thus the maximum downward slope of the graph of S(x) in the adult and old age range. Thus, a lower variability implies a steeper slope. Formally, the slope at the point of fastest decline is

max_x {−d/dx S(x)} = max_x {S(x) µx} = max_x {f0(x)} (2.61)

Note that the point of fastest decline is Mod[T0], that is, the Lexis point.

Further characteristics of the random lifetime follow:

– the probability of a newborn dying before a given age x1,

x1q0 = 1 − S(x1) (2.62)

which, for x1 small (say 1, or 5), provides a measure of infant mortality;

– the percentiles of the probability distribution of T0; in particular, the 10-th percentile, usually called endurance, is defined as the age ξ such that

S(ξ) = 0.90 (2.63)

– the interquartile range is defined as follows:

IQR[T0] = x′′ − x′ (2.64)


where x′ and x′′ are respectively the first quartile (the 25-th percentile) and the third quartile (the 75-th percentile) of the probability distribution of T0, namely the ages such that S(x′) = 0.75 and S(x′′) = 0.25; note that the IQR decreases as the lifetime distribution becomes less dispersed.

While most markers refer to the probability distribution of T0, it is also interesting to single out some characteristics referring to individuals alive at a chosen age x, that is, concerning the distribution of Tx, say with x = 65 (of obvious interest when analysing the age-pattern of mortality of annuitants and pensioners). An example is provided by the expected remaining lifetime at age x, ex (or the expected age at death, x + ex) (see (2.52), (2.53)). Other examples are given by

– the variance Var[Tx], the standard deviation σx = √Var[Tx], and the coefficient of variation CV[Tx];
– the interquartile range IQR[Tx].

For example, the analysis of the values of IQR[T65] related to various subsequent mortality observations allows us to check whether the rectangularization phenomenon occurs even when only old ages are addressed.

Figure 2.3 illustrates some markers of practical interest.
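Several of these markers can be computed directly once a survival function has been assigned. The sketch below assumes a Gompertz survival function with invented parameters; the closed-form expression for the Lexis point used here is specific to the Gompertz case:

```python
import math

# Sketch: some single-figure markers of Section 2.4.2, computed from an
# assumed Gompertz survival function; parameter values are illustrative.
B, c = 0.00003, 1.1
LN_C = math.log(c)

def S(x):
    # Closed-form Gompertz survival: exp(-B (c^x - 1)/ln c), from (2.36).
    return math.exp(-B * (c ** x - 1.0) / LN_C)

def percentile_age(prob, lo=0.0, hi=150.0):
    """Age xi with S(xi) = prob, found by bisection (S is decreasing)."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if S(mid) > prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

endurance = percentile_age(0.90)                    # 10-th percentile of T0, (2.63)
median = percentile_age(0.50)                       # Med[T0]
iqr = percentile_age(0.25) - percentile_age(0.75)   # x'' - x', (2.64)

# For Gompertz mortality the Lexis point (mode of f0) solves mu_x = ln c,
# giving x = ln(ln c / B) / ln c.
lexis = math.log(LN_C / B) / LN_C

assert endurance < median < lexis
assert iqr > 0.0
```

For a tabulated survival function the same markers would be obtained by inverse interpolation on the lx column rather than by bisection.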

2.4.3 Markers under a dynamic perspective

Information provided by markers calculated on the basis of a period observation must be carefully interpreted, in particular keeping in mind mortality trends.

Consider, in particular, the complete expectation of life at age x (see (2.57)), namely

◦ex = Σ_{k=1}^{ω−x} kpx + 1/2 (2.65)

Probabilities kpx are derived from the qx's according to (2.7) and (2.8), and, in turn, the qx's are determined as the result of a (recent) period mortality observation. The quantity ◦ex is usually called the (complete) period life expectancy.

The life expectancy drawn from a period life table can be taken as a reasonable estimate of the remaining lifetime for an individual currently age x only if we accept the hypothesis that, from now on, the age-pattern of mortality will remain unchanged. See also the comments in Section 2.2.1 regarding the construction of the life table in terms of lx.


Figure 2.3. Some markers.

When the hypothesis of unchanging future mortality is rejected, the calculation of period quantities like ◦ex (as well as other markers) and the corresponding 'cohort' quantities requires the use of appropriate mortality forecasts, and hence of projected life tables. This aspect will be dealt with in Section 4.4.1.

2.5 Mortality laws

Since the earliest attempt to describe a mortality schedule in analytical terms (due to A. De Moivre and dating back to 1725), great effort has been devoted by demographers and actuaries to the construction of analytical formulae (or laws) that fit the age-pattern of mortality. When a mortality law is used to fit observed data, the age-pattern of mortality is summarized by a small number of parameters (two to ten, say, in the mortality laws commonly used in actuarial and demographical models). This exercise has the advantage of reducing the dimensionality of the problem: thus, we could replace the 120, say, items of a life table by a small number of parameters without sacrificing much information.


It is beyond the scope of this book to present an extensive list of mortality laws. Instead, we focus only on some important laws, which are interesting because of their possible use in a dynamic context, that is, to summarize observed mortality trends and to project the age-pattern of mortality in future years.

2.5.1 Laws for the force of mortality

A number of mortality laws refer to the force of mortality, µx (although some of them were originally proposed in different terms, for example, in terms of the life table function lx).

The Gompertz law, proposed in 1825, is as follows:

µx = B c^x (2.66)

Sometimes the following equivalent notation is used:

µx = α e^(β x) (2.67)

It is interesting to look at the hypothesis underlying the Gompertz law. Assume that, moving from age x to age x + ∆x, the increment of the mortality intensity is proportional to its initial value, µx, and to the length of the interval, ∆x; thus

∆µx = β µx ∆x (2.68)

This assumption leads to the differential equation

dµx/dx = β µx, β > 0 (2.69)

and finally to (2.67), with α > 0. The Gompertz law is used to represent the age progression of mortality at the old ages, that is, the senescent mortality.
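The defining property (2.68)-(2.69) means that the intensity grows exponentially, so the ratio µ(x + d)/µ(x) is the same at every age x; the parameters below are illustrative only:

```python
import math

# Sketch of the Gompertz law (2.67) with illustrative parameters.
alpha, beta = 0.00003, 0.09

def mu(x):
    return alpha * math.exp(beta * x)

# The intensity ratio over a fixed age gap d is constant in x:
d = 8.0
ratios = [mu(x + d) / mu(x) for x in (30.0, 50.0, 70.0)]
assert all(abs(r - math.exp(beta * d)) < 1e-12 for r in ratios)

def S(x):
    # Closed form from (2.36): int_0^x mu_u du = (alpha/beta)(e^(beta x) - 1).
    return math.exp(-(alpha / beta) * (math.exp(beta * x) - 1.0))

assert 0.0 < S(80.0) < S(40.0) < 1.0
```

This constant-ratio property (a fixed "doubling pattern" of mortality over equal age gaps) is the usual empirical motivation for the Gompertz law at adult ages.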

The (first) Makeham law, proposed in 1867, is a generalization of the Gompertz law, namely

µx = A + B c^x (2.70)

where the term A > 0 (independent of age) represents non-senescent mortality, for example, because of accidents. An interpretation in more general terms can be found in Section 2.5.3. The following equivalent notation is also used:

µx = γ + α e^(β x) (2.71)

The second Makeham law, proposed in 1890, is as follows:

µx = A + H x + B c^x (2.72)

and hence constitutes a further generalization of the Gompertz law.


The Thiele law, proposed in 1871, can represent the age-pattern of mortality over the whole life span:

µx = A e^(−B x) + C e^(−D (x − E)^2) + F G^x (2.73)

The first term decreases as the age increases and represents the infant mortality. The second term, which has a 'Gaussian' shape, represents the mortality hump (mainly due to accidents) at young-adult ages. Finally, the third term (of Gompertz type) represents the senescent mortality.

In 1932 Perks proposed two mortality laws. The first Perks law is as follows:

µx = (α e^(β x) + γ) / (δ e^(β x) + 1) (2.74)

The second Perks law has the following more general structure:

µx = (α e^(β x) + γ) / (δ e^(β x) + ε e^(−β x) + 1) (2.75)

As we will see in Section 2.8, Perks' laws have an important role in representing the mortality pattern at very old ages (say, beyond 80); moreover, the first Perks law can be reinterpreted in the context of the 'frailty' models (see Section 2.9.5).

The Weibull law, proposed in 1951 in the context of reliability theory, is given by

µx = A x^B (2.76)

or, in equivalent terms:

µx = (α/β) (x/β)^(α−1) (2.77)

The GM class of models (namely, the Gompertz–Makeham class of models), proposed by Forfar et al. (1988), has the following structure:

µx = Σ_{i=0}^{r−1} αi x^i + exp( Σ_{j=0}^{s−1} βj x^j ) (2.78)

with the proviso that when r = 0 the polynomial term is absent, and when s = 0 the exponential term is absent. The general model in the class (2.78) is usually labelled as GM(r, s). Note that, in particular, GM(0, 2) denotes the Gompertz law, GM(1, 2) the first Makeham law, and GM(2, 2) the second Makeham law. Models used by the Continuous Mortality Investigation Bureau in the UK to graduate the force of mortality µx are of the GM(r, s) type. In particular, models GM(0, 2), GM(2, 2), and GM(1, 3) have been widely used.


2.5.2 Laws for the annual probability of death

Various mortality laws have been proposed in terms of the annual probability of death, qx, and in terms of the odds φx (see (2.14)). For example, Beard proposed in 1971 the following law:

qx = (A + B c^x) / (E c^(−2x) + 1 + D c^x) (2.79)

Barnett proposed, in 1974, the following law for the odds:

φx = A − H x + B c^x (2.80)

The odds can also be graduated using the following formula:

φx = e^(Px) (2.81)

where Px is a polynomial in x. For example, with a first-degree polynomial, we have

φx = e^(a + b x) (2.82)

Heligman and Pollard (1980) proposed a class of formulae which aim to represent the age-pattern of mortality over the whole span of life (as done by Thiele, see (2.73)). The first Heligman–Pollard law, expressed in terms of the odds, is

φx = A^((x + B)^C) + D e^(−E (ln x − ln F)^2) + G H^x (2.83)

while the second Heligman–Pollard law, in terms of qx, is given by

qx = A^((x + B)^C) + D e^(−E (ln x − ln F)^2) + G H^x / (1 + G H^x) (2.84)

Note that, in both cases, at higher ages we have

qx ≈ G H^x / (1 + G H^x) (2.85)

The third Heligman–Pollard law, which generalizes the second one, is as follows:

qx = A^((x + B)^C) + D e^(−E (ln x − ln F)^2) + G H^x / (1 + K G H^x) (2.86)

Another generalization of the second law is provided by the fourth Heligman–Pollard law, which is given by

qx = A^((x + B)^C) + D e^(−E (ln x − ln F)^2) + G H^(x^k) / (1 + G H^(x^k)) (2.87)


2.5.3 Mortality by causes

When various (say, r) causes of death are singled out, the force of mortality µx can be expressed in terms of 'partial' forces of mortality, each force pertaining to a specific cause:

µx = Σ_{k=1}^{r} µx^(k) (2.88)

where µx^(k) refers to the k-th cause of death.

Makeham proposed a reinterpretation of his first law (see (2.70)) in terms of partial forces of mortality. Let

A = Σ_{k=1}^{m} Ak (2.89)

and

B = Σ_{k=m+1}^{m+n} Bk (2.90)

whence

µx = Σ_{k=1}^{m} Ak + c^x Σ_{k=m+1}^{m+n} Bk = Σ_{k=1}^{m+n} µx^(k) (2.91)

2.6 Non-parametric graduation

2.6.1 Some preliminary ideas

The term ‘graduation’ denotes an adjustment procedure applied to a set of estimated quantities, in order to obtain adjusted quantities which are close to a reasonable pattern and, in particular, do not exhibit an erratic behaviour. We note that previous experience and intuition suggest a smooth progression.

In actuarial science, graduation procedures are typically applied to raw mortality rates which result from statistical observation. Graduated series of period mortality rates should exhibit a progressive change over a series of ages, without sudden and/or huge jumps, which cannot be explained by intuition or supported by past experience.

A detailed analysis of the various aspects of graduation is beyond the scope of this book. So, we only focus on some topics which constitute starting points for the projection models presented in Chapters 5 and 6.


Various approaches to graduation can be adopted. In particular, two broad categories can be recognized:

– parametric approaches, involving the use of mortality laws;
– non-parametric approaches.

According to a parametric approach, a functional form is chosen (e.g. Makeham's law, Heligman–Pollard's law, and so on; see Section 2.5), and the relevant parameters are estimated in order to find the parameter values which provide the best fit to the observed data, for example, to mortality rates. Various fitting criteria can be adopted for parameter estimation, for example maximum likelihood, based on a Generalized Linear Models formulation.

The choice of a particular functional form is avoided when a non-parametric graduation method is adopted. Important methods in this category are: weighted moving average methods, kernel methods, the Whittaker–Henderson model, and methods based on spline functions. In what follows, we restrict our attention to the latter two methods only.

2.6.2 The Whittaker–Henderson model

The Whittaker–Henderson approach to graduation is based on the minimization of an objective function. We denote by z1, z2, . . . , zn the observed values of a given quantity, and by y1, y2, . . . , yn the corresponding graduated values. For example, referring to the graduation of mortality rates, zh could represent the raw mortality rate at age xh, namely mxh, and yh the corresponding graduated value.

The objective function (to be minimized with respect to y1, y2, . . . , yn) is defined as follows:

F(y1, y2, . . . , yn) = Σ_{h=1}^{n} wh (yh − zh)^2 + λ Σ_{h=1}^{n−k} (∆^k yh)^2 (2.92)

where

– w1, w2, . . . , wn are weights attributed to the squared deviations;
– ∆^k yh is the k-th forward difference of yh, defined as follows:

∆^k yh = Σ_{i=0}^{k} (−1)^i (k choose i) yh+k−i (2.93)

– λ is a (constant) parameter.

2.6 Non-parametric graduation 69

The first term on the right-hand side of formula (2.92) provides a measure of the discrepancy between observed and graduated values. The choice of each weight wh allows us to attribute more or less importance to the squared deviation related to the h-th observation. In particular, referring to the graduation of mortality rates, an appropriate choice of the weights should reflect a low importance attributed to the raw mortality rates concerning very old ages, at which few individuals are alive, and hence the observed values could be affected by erratic behaviour. To this purpose, the weights can be chosen to be inversely proportional to the estimated variance of the observed mortality rates.

The second term on the right-hand side of (2.92) quantifies the degree of roughness in the set of graduated values. Usually, the value of k is set equal to 2, 3, or 4. Finally, the parameter λ allows us to express our 'preference' regarding features of the graduation results: higher values of λ denote a stronger preference for a smooth behaviour of the graduated values, whereas lower values express more interest in the fidelity of the graduated values to the observed ones.

The objective function can be generalized and modified. For example, it has been proposed to replace, in the first term on the right-hand side of (2.92), the squared deviations with other powers. As regards the second term, a mixture of differences of various orders can be used instead of the k-th differences only.
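A sketch of the minimization of (2.92) in matrix form: setting the gradient to zero gives the linear system (W + λ K'K) y = W z, where K is the k-th order forward-difference matrix of (2.93). The raw values below are synthetic noisy "rates", for illustration only:

```python
import math
import numpy as np

# Whittaker-Henderson graduation: solve (W + lam K'K) y = W z.
n, k, lam = 20, 2, 100.0
rng = np.random.default_rng(0)
z = 0.01 * np.exp(0.08 * np.arange(n)) + rng.normal(0.0, 0.001, n)  # raw values
w = np.ones(n)                                                       # equal weights w_h

# Difference matrix: (K y)_h = Delta^k y_h = sum_i (-1)^i C(k, i) y_{h+k-i}.
K = np.zeros((n - k, n))
for h in range(n - k):
    for i in range(k + 1):
        K[h, h + k - i] = (-1) ** i * math.comb(k, i)

W = np.diag(w)
y = np.linalg.solve(W + lam * K.T @ K, w * z)   # graduated values

# The graduated series is smoother: its squared k-th differences shrink.
assert np.sum((K @ y) ** 2) < np.sum((K @ z) ** 2)
```

Increasing lam pulls y towards a polynomial of degree k − 1 (for which ∆^k y vanishes), while lam → 0 reproduces the raw data, mirroring the smoothness/fidelity trade-off described above.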

2.6.3 Splines

A spline is a function defined piecewise by polynomials. We denote by [a, b] an interval of real numbers, and by ξ0, ξ1, . . . , ξm, ξm+1 real numbers such that

a = ξ0 < ξ1 < · · · < ξm < ξm+1 = b (2.94)

Let s denote the spline function, and p0, p1, . . . , pm the polynomials. Thus, the spline function is defined as follows:

s(x) =
  p0(x);  ξ0 ≤ x < ξ1
  p1(x);  ξ1 ≤ x < ξ2
  . . .
  pm(x);  ξm ≤ x ≤ ξm+1   (2.95)

The m + 2 numbers ξ0, ξ1, . . . , ξm+1 are called the knots. In particular, ξ1, . . . , ξm are the internal knots. If the knots are equidistantly distributed in [a, b], the spline is called a uniform spline (a non-uniform spline otherwise).


As regards the behaviour of s(x) in a neighbourhood of the generic knot ξh, a measure of smoothness is provided by the maximum order of the derivative such that the polynomials ph−1 and ph have common derivative values; if the maximum order is k, the spline is said to have smoothness (or continuity) of class Ck at ξh.

When all polynomials have degree at most r, the spline is said to be of degree r. A spline of degree 0 is a step function. A spline of degree 1 is also called a linear spline. An example of a linear spline is provided by the piecewise linear survival function, constructed by assuming as knots all of the integer ages and adopting the hypothesis of uniform distribution of deaths over each year of age (see point (a) in Section 2.3.5).

A spline of degree 3 is a cubic spline. In particular, a natural cubic spline has continuity C2 at all of the knots, and the second derivatives of the polynomials equal to 0 at a and b; thus, the spline is linear outside the interval [a, b].

It can be proved that, for a given interval [a, b] and a given set of m internal knots, the set of splines of degree r constitutes a (real) vector space of dimension d = m + r + 1. A basis for this space is provided by the following d functions:

1, x, . . . , x^r, [(x − ξ1)+]^r, . . . , [(x − ξm)+]^r (2.96)

where

(x − ξh)+ = 0 if x < ξh;  x − ξh if x ≥ ξh (2.97)

for h = 1, . . . , m. The corresponding representation of the spline function is given by:

s(x) = Σ_{j=0}^{r} αj x^j + Σ_{h=1}^{m} βh [(x − ξh)+]^r (2.98)

where the αj’s and the βh’s are the coefficients of the linear combination.

If d is the dimension of the space, then any basis consists of d elements. We denote by b1, b2, . . . , bd a basis. Hence, any spline s in the space can be represented as a linear combination of these functions, namely

$$s(x) = \gamma_1\, b_1(x) + \gamma_2\, b_2(x) + \cdots + \gamma_d\, b_d(x) \qquad (2.99)$$

where the coefficients γ1, γ2, . . . , γd are uniquely determined by the function s.

The choice of a basis constitutes a crucial step in the graduation process through splines. The starting point of this process is the choice of the result we want to achieve by using a spline function, and the related objective function to optimize.

We assume that our target is a ‘best fit’ graduation, namely we require that the spline function is as close as possible (according to a stated criterion) to our data set, consisting of n points,

$$(x_1, z_1),\; (x_2, z_2),\; \ldots,\; (x_n, z_n) \qquad (2.100)$$

with a ≤ xh ≤ b for h = 1, 2, . . . , n. For example, referring to actuarial applications, the xh’s may represent ages, whereas the zh’s are the corresponding observed mortality rates (namely the $m_{x_h}$’s referred to in Section 2.6.2).

As regards the best-fit criterion, we focus on the weighted mean square error, expressed by the quantity

$$\sum_{h=1}^{n} w_h\, [s(x_h) - z_h]^2 \qquad (2.101)$$

where the wh’s are positive weights. Using (2.99) to express the spline function, our best-fit problem can be stated as follows: find the coefficients γ1, γ2, . . . , γd which minimize the function

$$G(\gamma_1, \gamma_2, \ldots, \gamma_d) = \sum_{h=1}^{n} w_h \Bigl[ \sum_{j=1}^{d} \gamma_j\, b_j(x_h) - z_h \Bigr]^2 \qquad (2.102)$$

Although minimizing the function G is, in principle, a simple exercise which consists in solving a set of simultaneous equations, in practice computational difficulties may arise. However, the complexity of the minimization problem can be reduced if a particular basis is chosen in order to express the spline function s, namely the one consisting of the so-called B-splines.
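As a concrete illustration, the minimization of (2.102) can be carried out directly with the truncated power basis (2.96)–(2.98) and weighted least squares. The following sketch is ours, not from the text: the function names, the simulated data, and the choice of knots are illustrative assumptions.

```python
import numpy as np

def truncated_power_basis(x, knots, r):
    """Evaluate the d = m + r + 1 basis functions (2.96) at the points x."""
    x = np.asarray(x, dtype=float)
    cols = [x**j for j in range(r + 1)]                    # 1, x, ..., x^r
    cols += [np.clip(x - k, 0.0, None)**r for k in knots]  # [(x - xi_h)_+]^r
    return np.column_stack(cols)

def fit_spline_wls(x, z, w, knots, r=3):
    """Minimise (2.102): weighted least squares in the coefficients gamma."""
    B = truncated_power_basis(x, knots, r)
    sw = np.sqrt(np.asarray(w, dtype=float))
    gamma, *_ = np.linalg.lstsq(B * sw[:, None], z * sw, rcond=None)
    return lambda t: truncated_power_basis(t, knots, r) @ gamma

# noisy observations of a smooth (Gompertz-like) curve, with unit weights
x = np.linspace(40.0, 90.0, 51)
rng = np.random.default_rng(1)
z = np.exp(0.08 * (x - 40.0)) + rng.normal(0.0, 0.05, x.size)
s = fit_spline_wls(x, z, np.ones_like(x), knots=[50.0, 60.0, 70.0, 80.0])
```

With B-splines in place of the truncated power basis, the design matrix becomes banded, which is exactly the computational advantage discussed in the text.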

A formal definition of the B-splines and a detailed discussion of their use as a basis in graduation problems through splines is beyond the scope of this Section. The interested reader can refer, for example, to McCutcheon (1981). We just mention that the idea underlying the B-splines is to choose a basis such that each spline in the basis is zero outside a short interval. Typically, the basis consists of cubic polynomial pieces, smoothly joined together. In particular, when the spline function is uniform (i.e. the knots are equidistantly distributed), the B-splines are (for a given degree) just shifted copies of each other. The advantage provided by B-splines in the minimization problem (2.102) derives from the fact that, as each B-spline is zero outside a given short interval, the matrix involved in solving the related set of simultaneous equations has many entries equal to zero, and this improves the tractability of the best-fit problem.

Spline functions can be introduced by adopting a different approach, namely the ‘variational approach’. Following Champion et al. (2004), we start by defining an interpolation problem. Assume that we need to find a function f interpolating the n data points (x1, z1), (x2, z2), . . . , (xn, zn), that is, such that

$$f(x_h) = z_h; \quad h = 1, 2, \ldots, n \qquad (2.103)$$

Among all functions f fulfilling condition (2.103), we are interested in those which have a continuous second derivative and a ‘limited’ oscillation (i.e. a smooth behaviour) in the interval [x1, xn]. We introduce the functional

$$\Phi[f] = \int_{x_1}^{x_n} [f''(x)]^2\, dx \qquad (2.104)$$

(where f''(x) denotes the second derivative of f) as a measure of oscillation. Then, it is possible to prove that a natural cubic spline is the only such interpolating function which minimizes the functional (2.104).

We now shift from the interpolation problem to a graduation problem. To this purpose, we use the following functional in order to express our objective:

$$\Psi[f] = \sum_{h=1}^{n} [z_h - f(x_h)]^2 + \lambda \int_{x_1}^{x_n} [f''(x)]^2\, dx \qquad (2.105)$$

Clearly, the functional (2.105) generalizes the functional (2.104). The first term on the right-hand side of (2.105) provides a measure of the discrepancy between the data zh’s and the graduated values f(xh)’s, whereas the second term can be interpreted as a measure of smoothness. The parameter λ allows us to express our preference in the trade-off between closeness to data and smoothness. The analogy with the structure of formula (2.92) is self-evident.

It can be proved that, among all functions f with continuous second derivatives, there is a unique function which minimizes the functional (2.105).
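A minimal numerical sketch of the trade-off in (2.105) replaces the integral of $[f''(x)]^2$ by a sum of squared second differences over an equally spaced age grid (the idea behind the P-splines mentioned below). The function name, the simulated crude rates, and the value of λ are illustrative assumptions of ours.

```python
import numpy as np

def graduate_penalized(z, lam):
    """Discrete analogue of (2.105): minimise sum_h (z_h - f_h)^2
    + lam * sum of squared second differences of f, on an equally spaced grid.
    The minimiser solves the linear system (I + lam * D'D) f = z."""
    n = len(z)
    D = np.diff(np.eye(n), n=2, axis=0)     # (n-2) x n second-difference matrix
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, np.asarray(z, dtype=float))

# simulated crude mortality rates at ages 60..90, with multiplicative noise
rng = np.random.default_rng(0)
ages = np.arange(60, 91)
crude = 0.01 * np.exp(0.09 * (ages - 60)) * rng.lognormal(0.0, 0.1, ages.size)
smooth = graduate_penalized(crude, lam=10.0)
```

Larger λ gives a smoother (eventually nearly linear) sequence; λ = 0 returns the crude rates unchanged.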

Finally, it is worth noting that the spline functions so far dealt with are ‘univariate’ splines, as their domains consist of intervals of real numbers. Extension to a bivariate context is possible; an example will be presented in Section 5.4, together with the more general concept of P-splines (namely, ‘Penalized’ splines).


2.7 Some transforms of the survival function

Some transforms of life table functions may help us in reaching a better understanding of some aspects of the age-pattern of mortality (and of mortality trends as well). Two examples will be provided: the logit transform of the survival function S(x), and the so-called resistance function. Some aspects of their use in mortality projections will be addressed in Section 4.6.3.

The logit transform of the survival function is defined as follows:

$$\Lambda(x) = \frac{1}{2}\, \ln\!\Bigl( \frac{1 - S(x)}{S(x)} \Bigr) \qquad (2.106)$$

Features of this transform have been analysed by Brass (see e.g. Brass (1974)). In particular, Brass noted empirically that $\Lambda(x)$ can be expressed in terms of the logit of the survival function describing the age-pattern of mortality in a ‘standard’ population, $\Lambda^*(x)$, via a linear relation, that is,

$$\Lambda(x) = \alpha + \beta\, \Lambda^*(x) \qquad (2.107)$$

whose parameters are (almost) independent of age.
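The transform (2.106) and the Brass relation (2.107) are straightforward to compute. In the sketch below (ours, not from the text), the standard survival function is an illustrative Gompertz-type curve and the values of α and β are arbitrary:

```python
import numpy as np

def logit_transform(S):
    """Brass logit (2.106): Lambda(x) = 0.5 * ln((1 - S(x)) / S(x))."""
    return 0.5 * np.log((1.0 - S) / S)

def survival_from_logit(L):
    """Invert (2.106): S(x) = 1 / (1 + exp(2 * Lambda(x)))."""
    return 1.0 / (1.0 + np.exp(2.0 * L))

# illustrative Gompertz-type standard survival function at ages 1..99
x = np.arange(1, 100)
S_std = np.exp(-0.0003 / 0.1 * (np.exp(0.1 * x) - 1.0))

# derive a new survival curve via the linear relation (2.107)
alpha, beta = -0.2, 1.25
S_new = survival_from_logit(alpha + beta * logit_transform(S_std))
```

With α = 0 and β = 1, the standard survival function is recovered exactly.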

Figures 2.4–2.6 show the effect of various choices for the parameters α and β.

A different transform of the survival function S(x) has been proposed by Petrioli and Berti (1979). The proposed transform is the so-called resistance function, defined as follows:

$$\rho(x) = \frac{S(x)/(\omega - x)}{(1 - S(x))/x} \qquad (2.108)$$

Figure 2.4. Logit transforms and survival functions, for (α, β) = (0, 1), (0.2, 1), (−0.2, 1), over ages 0–100.


Figure 2.5. Logit transforms and survival functions, for (α, β) = (0, 1), (0, 0.75), (0, 1.25), over ages 0–100.

Figure 2.6. Logit transforms and survival functions, for (α, β) = (0, 1), (−0.2, 1.25), over ages 0–100.

where ω denotes, as usual, the limiting age. Thus, the transform is the ratio of the average annual probability of death beyond age x to the average annual probability of death prior to age x (both probabilities being referred to a newborn).
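The resistance function (2.108) is immediate to evaluate once a survival function is given. The curve and the limiting age below are illustrative assumptions of ours:

```python
import numpy as np

def resistance(S, x, omega=110.0):
    """Resistance function (2.108) of Petrioli and Berti."""
    return (S / (omega - x)) / ((1.0 - S) / x)

# illustrative Gompertz-type survival curve at ages 1..109
x = np.arange(1.0, 110.0)
S = np.exp(-0.0003 / 0.1 * (np.exp(0.1 * x) - 1.0))
rho = resistance(S, x)
```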

2.8 Mortality at very old ages

2.8.1 Some preliminary ideas

Several problems arise when analysing the mortality experience of very old population segments. A first problem obviously concerns the observed old-age mortality rates, which are heavily affected by random fluctuations because of the scarcity of data at those ages. In the past, mortality at very old ages was largely hypothetical and assumptions were normally made as the result of extrapolations from younger ages, based on models such as the Gompertz or the Makeham law. In recent times, mortality statistics have been improved


Figure 2.7. Mortality at highest ages: force of mortality mx against age x; curves labelled Gompertz–Makeham–Thiele, Lindbergson, and (e.g.) logistic.

in many countries, and provide stronger evidence about the shape of the mortality curve at old and very old ages.

In particular, it has been observed that the force of mortality is slowly increasing at very old ages, approaching a rather flat shape. In other words, the exponential rate of mortality increase at very old ages is not constant, as for example in Gompertz’s law (see (2.66)), but declines (see Fig. 2.7). However, a basic problem arises when discussing the appropriateness of mortality laws in representing the pattern of mortality at old ages: ‘what’ force of mortality are we dealing with? We will return to this important issue in Section 2.9.3.

As classical mortality laws may fail in representing the very old-age mortality, shifting from the exponential assumption may be necessary in order to fit the relevant pattern of mortality.

2.8.2 Models for mortality at highest ages

Several alternative models have been proposed. In Section 2.5.2 we have addressed the Heligman–Pollard family of laws, which aim to represent the age-pattern of mortality over the whole span of life. As regards old ages, according to the first and the second Heligman–Pollard law, $q_x$ can be approximated by $\frac{G H^x}{1 + G H^x}$ (see (2.85)). Conversely, the third Heligman–Pollard law, when applied to old ages, reduces to

$$q_x = \frac{G H^x}{1 + K\, G H^x} \qquad (2.109)$$

In Perks’ laws (see (2.74) and (2.75)), the denominators have the effect of reducing the mortality especially at old and very old ages. In particular, the graph of the first law is a logistic curve.


The logistic model for the force of mortality proposed by Thatcher (1999) assumes that

$$\mu_x = \delta\, \frac{\alpha\, e^{\beta x}}{1 + \alpha\, e^{\beta x}} + \gamma \qquad (2.110)$$

Its simplified version, used in particular for studying long-term trends and forecasting mortality at very old ages, has δ = 1 and hence only three parameters, namely α, β, and γ:

$$\mu_x = \frac{\alpha\, e^{\beta x}}{1 + \alpha\, e^{\beta x}} + \gamma \qquad (2.111)$$
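The plateau implied by (2.111) is easy to verify numerically: as x grows, the logistic term tends to 1, so µx approaches 1 + γ. A sketch with illustrative parameter values of our choosing:

```python
import numpy as np

def thatcher_mu(x, alpha, beta, gamma):
    """Simplified Thatcher logistic force of mortality (2.111)."""
    g = alpha * np.exp(beta * x)
    return g / (1.0 + g) + gamma

# illustrative parameter values
alpha, beta, gamma = 2e-5, 0.11, 5e-4
ages = np.array([70.0, 90.0, 110.0, 130.0])
mu = thatcher_mu(ages, alpha, beta, gamma)
```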

A modified version of the Makeham law has been proposed by Lindbergson (2001), replacing the exponential growth with a straight line at very old ages:

$$\mu_x = \begin{cases} a + b\, c^x & \text{if } x \le w \\ a + b\, c^w + d\,(x - w) & \text{if } x > w \end{cases} \qquad (2.112)$$

The model proposed by Coale and Kisker (see Coale and Kisker (1990)) relies on the so-called exponential age-specific rate of change of central death rates, defined as follows:

$$k_x = \ln \frac{m_x}{m_{x-1}} \qquad (2.113)$$

The model assumes that $k_x$ is linear above age 85:

$$k_x = k_{85} - (x - 85)\, s \qquad (2.114)$$

as documented by statistical evidence. The parameter s is determined assuming that $k_{85}$ is calculated from empirical data, whereas a predetermined value is given to the mortality rate $m_{110}$. For given values of $k_x$, x = 85, 86, . . . , 110, we find from (2.113)

$$m_x = m_{85}\, \exp\Bigl( \sum_{h=86}^{x} k_h \Bigr) \qquad (2.115)$$

From (2.115) it follows that the Coale–Kisker model implies an exponential-quadratic function for central death rates at the relevant ages, that is,

$$m_x = \exp(a\, x^2 + b\, x + c) \qquad (2.116)$$

which is clearly in contrast with the Gompertz assumption.
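The Coale–Kisker recipe (2.113)–(2.115) can be sketched as follows; here s is obtained in closed form from the constraint that (2.115) reproduces the predetermined value of $m_{110}$. The function name and the numerical inputs are illustrative assumptions of ours:

```python
import numpy as np

def coale_kisker_extend(m85, k85, m110=1.0, x_max=110):
    """Extrapolate central death rates above age 85 via (2.113)-(2.115),
    with k_x = k85 - (x - 85) * s and s fixed by the target value m110."""
    n = x_max - 85                       # number of steps (ages 86..x_max)
    # require sum_{j=1}^{n} (k85 - j*s) = ln(m110 / m85) and solve for s
    s = (n * k85 - np.log(m110 / m85)) / (n * (n + 1) / 2)
    ks = k85 - np.arange(1, n + 1) * s   # k_86, ..., k_110
    return m85 * np.exp(np.cumsum(ks))   # m_86, ..., m_110

m = coale_kisker_extend(m85=0.15, k85=0.09)
```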


2.9 Heterogeneity in mortality models

It is well known that any given population is affected by some degree of heterogeneity, as far as individual mortality is concerned. Heterogeneity in populations should be approached addressing two main issues:

(i) detecting and modelling observable heterogeneity factors (e.g. age, gender, occupation, etc.);

(ii) allowing for unobservable heterogeneity factors.

2.9.1 Observable heterogeneity factors

As regards observable factors, mortality depends on:

(1) biological and physiological factors, such as age, gender, genotype;
(2) features of the living environment; in particular: climate and pollution, nutritional standards (mainly with reference to excesses and deficiencies in diet), population density, hygienic and sanitary conditions;

(3) occupation, in particular in relation to professional disabilities or exposure to injury, and educational attainment;

(4) individual lifestyle, in particular with regard to nutrition, alcohol and drug consumption, smoking, physical activities and pastimes;

(5) current health conditions, personal and/or family medical history, civil status, and so on.

Item 2 affects the overall mortality of a population. That is why mortality tables are typically considered specifically for a given geographic area. The remaining items concern the individual and, when dealing with life insurance, they can be observed at policy issue. Their assessment is performed through appropriate questions in the application form and, as to health conditions, possibly through a medical examination.

The specific items considered for insurance rating depend on the types of benefits provided by the insurance contract (see also Section 2.2.2). The aim of the insurer is to group people in classes within which insured lives bear the same expected mortality profile. Age is always considered, due to the apparent variability of mortality in this regard. Gender is usually accounted for, especially when living benefits are involved, given that females on average live longer than males. As far as genetic aspects are concerned, the evolving knowledge in this area has raised a lively debate (which is still running) on whether it is legitimate for insurance companies to resort to genetic tests for underwriting purposes. Applicants for living benefits are usually in good health condition, so a medical examination is not necessary; on the contrary, a proper investigation is needed for those who buy death benefits, given that people in poorer health conditions may be more interested in them and hence more likely to buy such benefits.

When death benefits are dealt with, health conditions, occupation, and smoking status lead to a classification into standard and substandard risks; for the latter (also referred to as impaired lives), a higher premium level is adopted, given that they bear a higher probability of becoming eligible for the benefit. In some markets, standard risks are further split into regular and preferred risks, the latter having a better profile than the former (e.g. because they never smoked); as such, they are allowed to pay a reduced premium rate.

Mortality for people in poorer or better conditions than the average is usually expressed in relation to average (or standard) mortality. This allows us to deal with only one life table (or one mortality law), properly adjusted when substandard or preferred risks are dealt with. For life annuities, usually specific tables are constructed for each subpopulation.

2.9.2 Models for differential mortality

Let us index with (S) standard mortality and with (D) a different (higher or lower) mortality. Some examples of differential mortality models follow.

$$q_x^{(D)} = a\, q_x^{(S)} + b \qquad (2.117)$$
$$\mu_x^{(D)} = a\, \mu_x^{(S)} + b \qquad (2.118)$$
$$q_x^{(D)} = q_{x+z}^{(S)} \qquad (2.119)$$
$$\mu_x^{(D)} = \mu_{x+z}^{(S)} \qquad (2.120)$$
$$q_x^{(D)} = q_x^{(S)}\, \varphi(x) \qquad (2.121)$$
$$q_{[x-t]+t}^{(D)} = q_x^{(S)}\, \rho(x - t,\, t) \qquad (2.122)$$
$$q_{[x-t]+t}^{(D)} = q_{[x-t]+t}^{(S)}\, \nu(t) \qquad (2.123)$$
$$q_{[x-t]+t}^{(D)} = q_{[x-t]+t}^{(S)}\, \eta(x,\, t) \qquad (2.124)$$

In any case, x is the current age and t the time elapsed since policy issue (t ≥ 0), whence x − t is the age at policy issue.


Models (2.117) and (2.118) are usually adopted for substandard risks. Letting a = 1 and $b = \delta\, q_{x-t}^{(S)}$, δ > 0, in (2.117) (respectively $b = \delta\, \mu_{x-t}^{(S)}$ in (2.118)), the so-called additive model is obtained, where the increase in mortality depends on the initial age. An alternative model is obtained choosing b = θ, θ > 0, that is, a mortality increase which is constant and independent of the initial age; such a model is consistent with extra-mortality due to accidents (related either to occupation or to extreme sports). Letting a = 1 + γ, γ > 0, and b = 0, the so-called multiplicative model is derived, where the mortality increase depends on the current age. When risk factors are only temporarily effective (e.g. some diseases which either lead to an early death or have a short recovery time), the extra-mortality parameters may apply only up to some proper time τ; for t > τ, standard mortality is assumed (a = 1, b = 0).

Models (2.119) and (2.120) are very common in actuarial practice, both for substandard and preferred risks, due to their simplicity; they are called age rating or age shifting models. Model (2.120), in particular, can be formally justified by assuming the Gompertz law for the standard force of mortality and the multiplicative model for differential mortality. Actually, if $\mu_x^{(S)} = \alpha\, e^{\beta x}$ (see (2.67)), we have from (2.118), with a = 1 + γ and b = 0,

$$\mu_x^{(D)} = (1 + \gamma)\, \alpha\, e^{\beta x} = \alpha\, e^{\beta (x+z)} = \mu_{x+z}^{(S)} \qquad (2.125)$$

where $e^{\beta z} = 1 + \gamma$. In insurance practice, the age-shifting is often applied directly to premium rates.
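The equivalence (2.125) between the multiplicative loading and the age shift can be checked numerically; the parameter values below are illustrative assumptions of ours:

```python
import numpy as np

# Under Gompertz, a multiplicative loading 1 + gamma on the force of mortality
# is exactly an age shift z = ln(1 + gamma) / beta, as in (2.125).
alpha, beta, gamma = 3e-5, 0.1, 0.30
z = np.log(1.0 + gamma) / beta
x = np.arange(40.0, 100.0)
mu_D = (1.0 + gamma) * alpha * np.exp(beta * x)   # multiplicative model
mu_shift = alpha * np.exp(beta * (x + z))         # age-shifted standard force
```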

In (2.121), mortality is adjusted in relation to age. Such a choice is common when annuities are dealt with. For example, ϕ(x) may be a step-wise linear function.

The other models listed above concern the effect on mortality of the time elapsed since policy issue, t (see Section 2.2.4). Model (2.122) expresses issue-select mortality in terms of aggregate mortality (so that differential mortality simply means, in this case, select mortality). Conversely, models (2.123) and (2.124) express issue-select differential mortality through a transform of the issue-select standard probabilities of death; in particular, ν(t) and η(x, t) may be chosen to be linear.

A particular implementation of model (2.117) (with b = 0) is given by the so-called numerical rating system, introduced in 1919 by New York Life Insurance and still adopted by many insurers. A set of m risk factors is referred to. The annual probability of death specific for a given individual is

$$q_x^{(spec)} = q_x^{(S)} \Bigl( 1 + \sum_{h=1}^{m} \gamma_h \Bigr) \qquad (2.126)$$


where the parameters $\gamma_h$ lead to a higher or lower death probability for the individual in relation to the values assumed by the chosen risk factors (clearly, with the constraint $-1 < \sum_{h=1}^{m} \gamma_h < 1/q_x^{(S)} - 1$). Note that an additive effect of each of the risk factors is assumed.
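A minimal sketch of the rating computation (2.126), with hypothetical loadings of our choosing; the admissibility constraint is checked explicitly:

```python
def rated_probability(q_std, loadings):
    """Numerical rating system (2.126): q_spec = q_std * (1 + sum of loadings)."""
    total = sum(loadings)
    if not (-1.0 < total < 1.0 / q_std - 1.0):
        raise ValueError("loadings outside the admissible range")
    return q_std * (1.0 + total)

# hypothetical loadings for three risk factors (e.g. occupation, smoking, history)
q_spec = rated_probability(0.004, [0.25, 0.40, -0.10])
```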

2.9.3 Unobservable heterogeneity factors. The frailty

Heterogeneity of a population in respect of mortality can be explained by differences among the individuals; some of these are observable, as discussed in the previous section, whilst others (e.g. the individual’s attitude towards health, some congenital personal characteristics) are unobservable.

When allowing for unobservable heterogeneity factors, two approaches can be adopted:

– A discrete approach, according to which heterogeneity is expressed through a (finite) mixture of appropriate functions.

– A continuous approach, based on a non-negative real-valued variable, called the frailty, whose role is to include all unobservable factors influencing individual mortality.

The second approach is the most interesting, and we will deal with it only. In the following discussion, the term heterogeneity refers to unobservable risk factors only; in respect of the observable risk factors, the population is instead assumed to be homogeneous.

In order to develop a continuous model for heterogeneity, a proper characterization of the unobservable risk factors must be introduced. In their seminal paper, Vaupel et al. (1979) extend the earlier work of Beard (1959, 1971) and define the frailty as a non-negative quantity whose level expresses the unobservable risk factors affecting individual mortality. The underlying idea is that people with a higher frailty die on average earlier than others. Several models can be developed, which are susceptible to interesting actuarial applications.

With reference to a population (defined at age 0, and as such closed to new entrants), we consider people currently age x. They represent a heterogeneous group, because of the unobservable factors. Let us assume that, for any individual, such factors are summarized by a non-negative variable, viz. the frailty. The specific value of the frailty of the individual does not change over time, but remains unknown. On the contrary, because of deaths, the distribution of people in respect of frailty does change with age, given that people with low frailty are expected to live longer; we denote by $Z_x$ the random frailty at age x, for which a continuous probability distribution with pdf $g_x(z)$ is assumed. It must be mentioned that the hypothesis of unvarying individual frailty, which is reasonable when thinking of genetic aspects, seems weak when referring to environmental factors, which may change over time affecting the risk of death; however, there is empirical evidence which validates this assumption quite satisfactorily.

For a person currently age x with frailty level z, the (conditional) force of mortality (see (2.32)) is defined as

$$\mu_x(z) = \lim_{t \searrow 0} \frac{P[T_x \le t \,|\, Z_x = z]}{t} \qquad (2.127)$$

Now the task is to look at possible relations between $\mu_x(z)$ and a standard force of mortality, given that mortality analysis requires the joint distribution of $(T_x, Z_x)$. For brevity, conditioning on $Z_x = z$ will be denoted simply by z.

In Vaupel et al. (1979) a multiplicative model for the force of mortality has been proposed:

$$\mu_x(z) = z\, \mu_x \qquad (2.128)$$

where $\mu_x$ represents the force of mortality for an individual with z = 1; $\mu_x$ is considered as the standard force of mortality. If z < 1, then $\mu_x(z) < \mu_x$, which suggests that the person is in good condition; vice versa if z > 1. Note that (2.128) may be adopted also when the standard frailty level is other than 1. Let a, a ≠ 1, be the standard frailty level and $\mu'_x$ the standard force of mortality; according to the multiplicative model, $\mu_x(a) = \mu'_x$, whence (replacing in (2.128)) $\mu_x = \frac{1}{a}\, \mu'_x$. So, following (2.128), the force of mortality for a person age x and frailty level z may be written as

$$\mu_x(z) = \frac{z}{a}\, \mu'_x = z'\, \mu'_x \qquad (2.129)$$

which coincides with (2.128) using an appropriate definition of the standard force of mortality and a scaling of the frailty level. A simple generalization may further be adopted to represent a mortality component independent of age and frailty (e.g. accident mortality). The model

$$\mu_x(z) = b + z\, \mu_x \qquad (2.130)$$

may be considered for this purpose. For brevity, in the following we refer just to (2.128).

We denote with H(x) the cumulative standard force of mortality in (0, x) (see (2.39)).

Let us refer to age 0. The survival function for a person with frailty z is

$$S(x \,|\, z) = e^{-\int_0^x \mu_t(z)\, dt} = e^{-z H(x)} \qquad (2.131)$$


The pdf of $T_0$ conditional on a frailty level z, given by $f_0(x \,|\, z) = S(x \,|\, z)\, \mu_x(z)$, can be expressed as

$$f_0(x \,|\, z) = e^{-z H(x)}\, z\, \mu_x = -\frac{d}{dx}\, S(x \,|\, z) \qquad (2.132)$$

The joint pdf of $(T_0, Z_0)$, denoted by $h_0(x, z)$, can then be easily obtained. We have

$$h_0(x, z) = f_0(x \,|\, z)\, g_0(z) = S(x \,|\, z)\, \mu_x(z)\, g_0(z) \qquad (2.133)$$

Referring to the whole population, we can define the average survival function as

$$\bar{S}(x) = \int_0^\infty S(x \,|\, z)\, g_0(z)\, dz \qquad (2.134)$$

Note that $\bar{S}(x)$ represents the share of people alive at age x out of the initial newborns.

We now refer to a given age x, x ≥ 0. The pdf of $Z_x$ may be derived from the distribution of $(T_0, Z_0)$ considering that, as was mentioned earlier, the distribution of $Z_x$ changes because of a varying composition of the population due to deaths. We can then relate $Z_x$ to $Z_0$ as follows:

$$Z_x = Z_0 \,|\, T_0 > x \qquad (2.135)$$

For the pdf of $Z_x$ we obtain

$$g_x(z) = \lim_{\Delta z \searrow 0} \frac{P[z < Z_x \le z + \Delta z]}{\Delta z} = \lim_{\Delta z \searrow 0} \frac{P[T_0 > x \,|\, z < Z_0 \le z + \Delta z]\; P[z < Z_0 \le z + \Delta z]}{\Delta z\; P[T_0 > x]} \qquad (2.136)$$

from which, under usual conditions, we obtain

$$g_x(z) = \frac{S(x \,|\, z)\, g_0(z)}{\bar{S}(x)} = \frac{S(x \,|\, z)\, g_0(z)}{\int_0^\infty S(x \,|\, z)\, g_0(z)\, dz} \qquad (2.137)$$

Note that the pdf of $Z_x$ is given by the pdf of $Z_0$, adjusted by the ratio $S(x \,|\, z)/\bar{S}(x)$, which updates at age x the proportion of people with frailty z. It is also interesting to stress that the assessment of $g_x(z)$ is based on an update of $g_0(z)$ with regard to the number of survivors with frailty z compared to what would be expected over the whole population.

We define the average force of mortality in the population as

$$\bar{\mu}_x = \frac{\int_0^\infty \mu_x(z)\, S(x \,|\, z)\, g_0(z)\, dz}{\int_0^\infty S(x \,|\, z)\, g_0(z)\, dz} = \frac{\int_0^\infty h_0(x, z)\, dz}{\bar{S}(x)} \qquad (2.138)$$


Thanks to (2.128) and (2.137) we obtain

$$\bar{\mu}_x = \mu_x \int_0^\infty z\, g_x(z)\, dz \qquad (2.139)$$

that is,

$$\bar{\mu}_x = \mu_x\, \bar{z}_x \qquad (2.140)$$

where $\bar{z}_x = \int_0^\infty z\, g_x(z)\, dz = E[Z_x]$ represents the expected frailty at age x. Note that the average force of mortality coincides with the standard one only if $\bar{z}_x = 1$. A similar relation holds for model (2.130): we easily find $\bar{\mu}_x = b + \mu_x\, \bar{z}_x$.

It is easy to show that

$$\frac{d}{dx}\, \bar{z}_x = -\mu_x\, \mathrm{Var}[Z_x] < 0 \qquad (2.141)$$

Then, according to (2.140), $\bar{\mu}_x$ varies less rapidly than $\mu_x$. This is due to the fact that those with a high frailty die earlier, therefore leading to a reduction of $\bar{z}_x$ with age. If one disregards the presence of heterogeneity, on average an underestimation of the force of mortality follows when one cohort only is addressed.

2.9.4 Frailty models

In order to get to numerical valuations (and further analytical results as well), the distribution of $Z_0$ must be chosen. In Vaupel et al. (1979), a Gamma distribution has been suggested, due to its nice features. Let then $Z_0 \sim \mathrm{Gamma}(\delta, \theta)$. The pdf $g_0(z)$ is therefore

$$g_0(z) = \frac{\theta^\delta\, z^{\delta - 1}}{\Gamma(\delta)}\, e^{-\theta z} \qquad (2.142)$$

We have in particular

$$E[Z_0] = \bar{z}_0 = \frac{\delta}{\theta} \qquad (2.143)$$

$$\mathrm{Var}[Z_0] = \frac{\delta}{\theta^2} \qquad (2.144)$$

The coefficient of variation of $Z_0$,

$$\mathrm{CV}[Z_0] = \frac{\sqrt{\mathrm{Var}[Z_0]}}{E[Z_0]} = \frac{1}{\sqrt{\delta}} \qquad (2.145)$$

shows that δ plays the role of measuring, in relative terms, the level of heterogeneity in the population. If δ ↗ ∞, then CV[Z0] ↘ 0, that is, the population can be considered homogeneous; for small values of δ, on the contrary, the value of CV[Z0] is high, representing a wide dispersion, that is, heterogeneity, in the population.

It can be shown that also $Z_x$, x > 0, has a Gamma distribution, with one of the two parameters updated to the current age. In order to check this, we need the expression of the average survival function at age x. Substituting (2.142) into (2.134), and using (2.131), we have

$$\bar{S}(x) = \Bigl( \frac{\theta}{\theta + H(x)} \Bigr)^{\!\delta} \int_0^\infty \frac{(\theta + H(x))^\delta\, z^{\delta - 1}}{\Gamma(\delta)}\, e^{-(\theta + H(x))\, z}\, dz \qquad (2.146)$$

Note that $\frac{(\theta + H(x))^\delta z^{\delta - 1}}{\Gamma(\delta)}\, e^{-(\theta + H(x)) z}$ is the pdf of a random variable Gamma-distributed with parameters $(\delta, \theta + H(x))$; hence, the integral in (2.146) reduces to 1. Therefore,

$$\bar{S}(x) = \Bigl( \frac{\theta}{\theta + H(x)} \Bigr)^{\!\delta} \qquad (2.147)$$

Replacing in (2.137) and rearranging, we have

$$g_x(z) = \frac{(\theta + H(x))^\delta\, z^{\delta - 1}}{\Gamma(\delta)}\, e^{-(\theta + H(x))\, z} \qquad (2.148)$$

which is the pdf of a random variable $\mathrm{Gamma}(\delta, \theta + H(x))$. Thus, the Gamma distribution has a self-replicating property, and the relevant parameters only need to be chosen with reference to the distribution at age 0.

So it follows that

$$E[Z_x] = \bar{z}_x = \frac{\delta}{\theta + H(x)} \qquad (2.149)$$

$$\mathrm{Var}[Z_x] = \frac{\delta}{(\theta + H(x))^2} \qquad (2.150)$$

$$\mathrm{CV}[Z_x] = \frac{\sqrt{\mathrm{Var}[Z_x]}}{E[Z_x]} = \frac{1}{\sqrt{\delta}} \qquad (2.151)$$

Note that whilst the expected value of the frailty reduces with age, its relative variability keeps constant.

We can give an interesting interpretation of the average survival function. Rearranging (2.146) we find

$$\bar{S}(x) = \Bigl( \frac{\theta}{\delta}\, \frac{\delta}{\theta + H(x)} \Bigr)^{\!\delta} = \Bigl( \frac{\bar{z}_x}{\bar{z}_0} \Bigr)^{\!\delta} \qquad (2.152)$$


and then we argue that the average survival function at age x, that is, the average probability of newborns attaining age x, depends on the comparison between the expected frailty level at age x and at age 0; this result is independent of the particular mortality law that we adopt for the standard force of mortality, which actually has not yet been introduced, and is simply due to the properties of the Gamma distribution.

The population force of mortality is

$$\bar{\mu}_x = \mu_x\, \frac{\delta}{\theta + H(x)} \qquad (2.153)$$

Usually, the initial values of the parameters of the Gamma distribution are chosen so that $\bar{z}_0 = 1$, that is, θ = δ. So we have

$$\bar{\mu}_x = \mu_x\, \frac{\delta}{\delta + H(x)} \qquad (2.154)$$

Only the parameter δ has to be assigned, in a manner which is consistent with the level of heterogeneity in the population. Finally, the unconditional pdf of $T_x$ may be easily obtained from previous results.
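Formula (2.154) is easy to explore numerically. The sketch below (with illustrative Gompertz parameters of our choosing) compares the population force of mortality with the standard one, showing the mortality deceleration induced by frailty:

```python
import numpy as np

def gompertz_H(x, a, b):
    """Cumulative Gompertz force of mortality: H(x) = (a / b) * (e^{bx} - 1)."""
    return a / b * (np.exp(b * x) - 1.0)

def population_mu(x, a, b, delta):
    """Average force of mortality (2.154) under Gamma frailty with mean 1 at age 0."""
    return a * np.exp(b * x) * delta / (delta + gompertz_H(x, a, b))

a, b, delta = 3e-5, 0.1, 4.0     # illustrative values
x = np.arange(0.0, 111.0)
mu_bar = population_mu(x, a, b, delta)   # population (average) force
mu_ind = a * np.exp(b * x)               # standard (individual) Gompertz force
```

The ratio of the population to the standard force is δ/(δ + H(x)): equal to 1 at age 0, and strictly decreasing thereafter, since those with high frailty die earlier.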

An alternative choice for the distribution of $Z_0$ is the Gaussian-Inverse distribution. Like the Gamma, this distribution is self-replicating, so that $Z_x$ is Gaussian-Inverse for any age x; hence, the relevant parameters need to be chosen only with reference to the distribution at age 0. When a Gaussian-Inverse distribution is used, the variability of $Z_x$, in relative terms, decreases with age, which can be justified by the fact that as time passes those with a low (and similar) frailty keep on living, hence reducing the heterogeneity of the population. In this regard, the Gaussian-Inverse hypothesis is more interesting than the Gamma. However, some authors (see e.g. Butt and Haberman (2004) and Manton and Stallard (1984)) note that individual frailty is unlikely to remain unchanged over the lifetime, but should increase with age. So the assumption that, within the population, the relative variability keeps constant can be accepted. In the following, we will mainly deal with the Gamma case.

2.9.5 Combining mortality laws with frailty models

Referring to adult ages, we can assume the Gompertz law (see (2.67)) for describing the standard force of mortality. So the cumulative standard force of mortality is

$$H(x) = \int_0^x \alpha\, e^{\beta t}\, dt = \frac{\alpha}{\beta}\, (e^{\beta x} - 1) \qquad (2.155)$$


If we accept the Gamma assumption for $Z_0$, then the population force of mortality is

$$\bar{\mu}_x = \frac{\alpha\, \delta\, e^{\beta x}}{\bigl( \theta - \frac{\alpha}{\beta} \bigr) + \frac{\alpha}{\beta}\, e^{\beta x}} \qquad (2.156)$$

Rearrange as

$$\bar{\mu}_x = \frac{1}{\theta - \frac{\alpha}{\beta}}\; \frac{\alpha\, \delta\, e^{\beta x}}{1 + \frac{\alpha}{\beta \theta - \alpha}\, e^{\beta x}} \qquad (2.157)$$

Let $\alpha' = \frac{\alpha\, \delta}{\theta - \alpha/\beta}$ and $\delta' = \frac{\alpha}{\beta \theta - \alpha}$; so

$$\bar{\mu}_x = \frac{\alpha'\, e^{\beta x}}{1 + \delta'\, e^{\beta x}} \qquad (2.158)$$

which is the first Perks law (see (2.74)), with γ = 0. Hence, (2.156) has a logistic shape; see Fig. 2.8.
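The reparametrization leading from (2.156) to the Perks form (2.158) can be verified numerically; the parameter values are illustrative assumptions of ours:

```python
import numpy as np

# Gompertz standard force with Gamma frailty versus its Perks/logistic form.
alpha, beta, theta, delta = 3e-5, 0.1, 4.0, 4.0
x = np.arange(0.0, 111.0)
H = alpha / beta * (np.exp(beta * x) - 1.0)                 # (2.155)
mu_gamma = alpha * np.exp(beta * x) * delta / (theta + H)   # (2.153) with Gompertz
a_p = alpha * delta / (theta - alpha / beta)                # alpha'
d_p = alpha / (beta * theta - alpha)                        # delta'
mu_perks = a_p * np.exp(beta * x) / (1.0 + d_p * np.exp(beta * x))  # (2.158)
```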

The logistic model for describing mortality within a heterogeneous population may also be built adopting a different approach (see Cummins et al. (1983); Beard (1971)). With reference to a heterogeneous population, assume that the individual force of mortality is Gompertz, with unknown ‘base’ mortality; hence

$$\mu_x = A\, e^{\beta x} \qquad (2.159)$$

where A (the parameter for base mortality) is a random quantity, specific to the individual, whilst β (the parameter for senescent mortality) is common to all individuals and known. Let ϕ(a) denote the pdf of A; the population force of mortality is then

$$\bar{\mu}_x = \int_0^\infty a\, e^{\beta x}\, \varphi(a)\, da = e^{\beta x}\, E[A] \qquad (2.160)$$

Figure 2.8. Gamma distributions (at x = 0 and x = 85) and forces of mortality (Gompertz and Perks).


If $A \sim \mathrm{Gamma}(\rho, \nu)$, then

$$\bar{\mu}_x = e^{\beta x}\, \frac{\rho}{\nu} \qquad (2.161)$$

Letting $\rho = \frac{\alpha}{\delta \beta}$ and $\nu = \frac{\frac{1}{\delta} + e^{\beta x}}{\beta}$, we find

$$\bar{\mu}_x = \frac{\alpha\, e^{\beta x}}{1 + \delta\, e^{\beta x}} \qquad (2.162)$$

which is still a particular case of (2.74), with γ = 0. Note, however, that this choice implies that the probability distribution of A depends on age.

What we have just described can be easily classified under the multiplicative frailty model. Actually, if A in (2.159) is replaced with αz (with α certain and z random), one finds (2.128). The Perks model then follows by choosing a Gamma distribution for $Z_0$, with appropriate parameters. However, this approach is less elegant than that proposed by Vaupel et al. (1979), given that in (2.159) the distribution of A is not forced to depend on age. Actually, the multiplicative model allows for extensions and generalizations; further, it does not require a Gompertz force of mortality.

2.10 References and suggestions for further reading

As regards the ‘traditional’ mortality model, that is, the model disregarding specific issues such as mortality at very old ages and frailty, we restrict ourselves to general references. In some of these, the reader can find references to the original papers and reports, for example, by Gompertz, Makeham, Thiele, Perks, and so on.

A number of textbooks of actuarial mathematics deal with life tables and mortality models, in both an age-discrete and an age-continuous context. The reader can refer for example to Bowers et al. (1997), Gerber (1995), Gupta and Varga (2002), Rotar (2007). The textbook by Benjamin and Pollard (1993) is particularly devoted to mortality analysis and mortality laws.

The articles by Forfar (2004a) and Forfar (2004b) provide a compact and effective presentation of life tables and mortality laws respectively.

Graduation methods are dealt with by many actuarial and statistical textbooks. Besides the textbook by Benjamin and Pollard (1993) already cited, the reader should consult, for example, London (1985), and the article by Miller (2004), which also provides an extensive list of references. As regards spline functions and their use to graduate mortality rates, the reader can refer to McCutcheon (1981), and Champion et al. (2004) and references therein.

Historical aspects are dealt with by Haberman (1996), Haberman and Sibbett (1995), and Smith and Keyfitz (1977). In particular, in Haberman and Sibbett (1995) the reader can find the reproduction of milestone papers in mortality analysis up to 1919.

In relation to mortality at old and very old ages, the deceleration in the rate of mortality increase is analysed in detail in the demographic literature. In particular, the reader can refer to Horiuchi and Wilmoth (1998), where the problem is attacked in the context of the frailty models. A discussion about non-Gompertzian mortality at very old ages is provided by Olshansky and Carnes (1997).

Allowing for heterogeneity in population mortality (and, in particular, for non-observable heterogeneity) constitutes, together with mortality dynamics modelling, one of the most important issues in the evolution of survival models (see e.g. Pitacco (2004a)). Modelling frailty can suggest new ways to forecast mortality. Although the earliest contribution to this topic probably came from the actuarial field (Beard (1959) proposed the idea of individual frailty for capturing heterogeneity due to unobservable risk factors), the topic itself was ignored by actuaries until some time ago. Conversely, seminal contributions have come from demography and biostatistics, also concerning the dynamics of mortality and longevity limits (see Vaupel et al. (1979), Hougaard (1984), and Yashin and Iachine (1997)). However, very recent contributions show interest in this topic within the actuarial community; see Butt and Haberman (2002), Butt and Haberman (2004), and Olivieri (2006).

Conversely, the interest of actuaries in observable factors (like gender, health condition, etc.) can be traced back to the first scientific models for life insurance. For example, see Cummins et al. (1983) as regards risk classification in life insurance, and the numerical rating system in particular, which was pioneered by the New York Life Insurance Company.

3 Mortality trends during the 20th century

3.1 Introduction

Life expectancy at birth among early humans was likely to be between 20 and 30 years, as testified by evidence that has been gleaned from tombstone inscriptions, genealogical records, and skeletal remains. Around 1750, the first national population data began being collected in the Nordic countries. At that time, life expectancy at birth was around 35–40 years in the more developed countries. It then rose to about 40–45 by the mid-1800s. Rapid improvements began at the end of the 19th century, so that, by the middle of the 20th century, it was approximately 60–65 years. By the beginning of the 21st century, life expectancy at birth had reached about 70 years. The average life span has thus roughly tripled over the course of human history. Much of this increase has happened in the past 150 years: the 20th century has been characterized by a huge increase in average longevity compared to all of the previous centuries. Broadly speaking, the average life span increased by 25 years in the 10,000 years before 1850. Another 25-year increase took place between 1850 and 2000. And there is no evidence that improvements in longevity are tending to slow down.

The first half of the 20th century saw significant improvement in the mortality of infants and children (and their mothers), resulting from improvements to public health and nutrition that helped to withstand infectious diseases. Since the middle of the 20th century, gains in life expectancy have been due more to medical factors that have reduced mortality among older persons. Reductions in deaths due to the 'big three' killers (cardiovascular disease, cancer, and strokes) have gradually taken place, and life expectancy continues to improve.

The population of the industrialized world underwent a major mortality transition over the course of the 20th century. In recent decades, the populations of developed countries have grown considerably older, because of two factors: increasing survival to older ages as well as the smaller numbers of births (the so-called 'baby bust', which started in the 1970s). In this new demographic context, questions about the future of human longevity have acquired a special significance for public policy and fiscal planning. In particular, social security systems, which in many industrialized countries are organized according to the pay-as-you-go method, are threatened by the ageing of the population due to the baby bust combined with the increase in life expectancy. As a consequence, many nations are discussing adjustments or deeper reforms to address this problem.

Thus, mortality is a dynamic process and actuaries need appropriate tools to forecast future longevity. We believe that any sound procedure for projecting mortality must begin with a careful analysis of past trends. This chapter aims to illustrate the observed decline in mortality, on the basis of Belgian mortality statistics. The mortality experience during the 20th century is carefully studied by means of several demographic indicators which have been introduced in Chapter 2. Specifically, after having presented the different sources of mortality statistics, we compute age-specific death rates, life expectancies, median lifetimes, and interquartile ranges, inter alia, as well as survival curves. We also compare statistics gathered by the insurance regulatory authorities with general population figures in order to measure adverse selection. A comparison between the mortality experience of some EU member countries is performed in Section 3.5.

Before proceeding, let us say a few words about the notation used in this chapter. Here, we analyse mortality in an age-period framework. This means that we use two dimensions: age and calendar time. Both age and calendar time can be either discrete or continuous variables. In discrete terms, a person aged x, x = 0, 1, 2, . . ., has an exact age between x and x + 1. This concept is also known as 'age last birthday' (i.e. the age of an individual as a whole number of years, obtained by rounding down to the age at the most recent birthday). Similarly, an event that occurs in calendar year t occurs during the time interval [t, t + 1]. This two-dimensional setting is formally defined in Section 4.2.1; see Table 4.1. Otherwise, we follow the notation introduced in the previous chapters.

3.2 Data sources

In this chapter, we use three different sources of mortality data: official data coming from a National Institute of Statistics or another governmental agency, data available from a scientific demographic database allowing for international comparisons, and market data provided by national regulatory authorities.


3.2.1 Statistics Belgium

Statistics Belgium is the official statistical agency for Belgium. Formerly known as NIS-INS, the Directorate General Statistics Belgium is part of the Federal Public Service Economy. It is based in Brussels. Its mission is to deliver timely, reliable, and relevant figures to the Belgian government, international authorities (like the EU), academics, and the public. For more information, we refer the reader to the official website at http://www.statbel.fgov.be. A national population register serves as the centralizing database in Belgium and provides official population figures. Statistics on births and deaths are available from this register by basic demographic characteristics (e.g. age, gender, marital status).

Statistics Belgium constructs period life tables, separately for men and women. These life tables are available for the periods 1880–1890, 1928–1932, 1946–1949, 1959–1963, 1968–1972, 1979–1982, 1988–1990, 1991–1993, and 1994–1996. After 1996, period life tables have been provided each year based on a moving triennium, starting from the 1997–1999 life table, and continuing with the 1998–2000 life table, the 1999–2001 life table, etc. The last available life table relates to the period 2002–2004. In each case, the mortality experienced by the Belgian population is represented as a set of one-year death probabilities qx (see Section 2.2.3 for a formal definition). Here, we use the life tables of the periods 1880–1890, 1928–1932, 1968–1972, and 2000–2002 to investigate the long-term evolution of mortality in Belgium.

Even though the figures are computed from Belgian mortality experience, the analysis conducted in this chapter applies to any industrialized country, and the findings would be very similar.

3.2.2 Federal Planning Bureau

The Federal Planning Bureau (FPB) is a public utility institution based in Brussels. The FPB makes studies and projections on socio-economic and environmental policy issues for the Belgian government. The population plays an important role in numerous themes examined by the FPB. This is why the FPB produces regularly updated projected life tables for Belgium.

The official mortality statistics for Belgium come from the FPB together with Statistics Belgium. Specifically, from 1948 to 1993, annual death probabilities were computed by the FPB. From 1994 onwards, annual death probabilities have been computed by Statistics Belgium and published on a yearly basis. The annual death probabilities are now available for calendar years


t = 1948, 1949, . . . , 2004 and ages

x = 0, 1, . . . , 100, for t = 1948, 1949, . . . , 1993
x = 0, 1, . . . , 101, for t = 1994, 1995, . . . , 1998
x = 0, 1, . . . , 105, for t = 1999, 2000, . . .

3.2.3 Human mortality database

The Human Mortality Database (HMD) was launched in May 2002 to provide detailed mortality and population data to those interested in the history of human longevity. It has been put together by the Department of Demography at the University of California, Berkeley, USA, and the Max Planck Institute for Demographic Research in Rostock, Germany. It is freely available at http://www.mortality.org and provides a highly valuable source of mortality statistics.

The HMD contains original calculations of death rates and life tables for national populations, as well as the raw data used in constructing those tables. The HMD includes life tables by single years of age up to 109, with an open age interval for 110+. These period life tables represent the mortality conditions at a specific moment in time. We refer readers to the methods protocol available from the HMD website for a detailed exposition of the data processing and table construction.

For Belgium, data were compiled by Dana Glei, Isabelle Devos, and Michel Poulain. They cover the period starting in 1841 and ending in 2005. However, data are missing during World War I. This is why we have decided to restrict the study conducted in this chapter to the period 1920–2005.

3.2.4 Banking, Finance, and Insurance Commission

In addition to general population data, we also analyse mortality statistics from the Belgian insurance market. Any difference between the general population and the insured population is due to adverse selection, as explained in Section 1.6.5.

Market data are provided by the Banking, Finance and Insurance Commission (BFIC), based in Brussels. The BFIC was created as a result of the integration of the Insurance Supervisory Authority (ISA) into the Banking and Finance Commission (BFC). Since 1 January 2004, it has been the single supervisory authority for the Belgian financial sector. For more information, we refer readers to the official website http://www.cbfa.be.


Annual tabulations of the number of deaths by age, by gender, and by policy type are made by the BFIC based on information supplied by insurance companies. Together with the number of deaths, the corresponding (central) risk exposure is also available in each case. These data allow us to calculate age-gender-type-of-product specific (central) death rates. We do not question the quality of the data provided by the BFIC.

3.3 Mortality trends in the general population

3.3.1 Age-period life tables

As explained in Section 2.2, life table analyses are based upon an analytical framework in which death is viewed as an event whose occurrence is probabilistic in nature. Life tables create a hypothetical cohort (or group) of, say, 100,000 persons at age 0 (usually for males and females separately) and subject it to the age-gender-specific annual death probabilities (the number of deaths per 1,000 or 10,000 or 100,000 persons of a given age and gender) observed in a given population. In doing this, researchers can trace how the 100,000 hypothetical persons (called a synthetic cohort) would shrink in numbers due to deaths as the group ages.

As stressed in Section 2.2.1, there are two basic types of life tables: period life tables and cohort life tables. A period life table represents the mortality experience of a population during a relatively short period of time, usually between one and three years. Life tables based on population data are generally constructed as period life tables, because death and population data are most readily available on a time-period basis. Such tables are useful in analysing changes in the mortality experienced by a population through time. These are the tables used in the present chapter.

We analyse the changes in mortality as a function of both age x and calendar time t. This is the so-called age-period approach. In this chapter, we assume that the age-specific forces of mortality are constant within bands of age and time, but allowed to vary from one band to the next. This extends to a dynamic setting the constant force of mortality assumption (b) in Section 2.3.5.

Specifically, let us denote as Tx(t) the remaining lifetime of an individual aged x at time t. Compared to Section 2.2.3, we supplement the notation Tx for the remaining lifetime of an x-aged individual with an extra index t representing calendar time. This individual will die at age x + Tx(t) in year t + Tx(t). Then, qx(t) is the probability that an x-aged individual in calendar year t dies before reaching age x + 1, that is, qx(t) = P[Tx(t) ≤ 1]. Similarly, px(t) = 1 − qx(t) is the probability that an x-aged individual in calendar year t reaches age x + 1, that is, px(t) = P[Tx(t) > 1]. The force of mortality µx(t) at age x and time t is formally defined as

µx(t) = lim Δ↘0 (1/Δ) P[x < T0(t − x) ≤ x + Δ | T0(t − x) > x]   (3.1)

Compare (3.1) to (2.32)–(2.34). Now, given any integer age x and calendar year t, we assume that

µx+ξ1(t + ξ2) = µx(t) for 0 ≤ ξ1, ξ2 < 1 (3.2)

This is best illustrated with the aid of a coordinate system that has calendar time as abscissa and age as ordinate, as in Fig. 3.1. Such a representation is called a Lexis diagram, after the German demographer who introduced it. Both time scales are divided into yearly bands, which partition the Lexis plane into square segments. Formula (3.2) assumes that the mortality rate is constant within each square, but allows it to vary between squares; see Fig. 3.1 for a graphical interpretation. Since life tables do not include mortality measures at non-integral ages or for non-integral durations, (3.2) can also be seen as a convenient interpolation method to expand a life table for estimating such values.

Under (3.2), we have for integer age x and calendar year t that

px(t) = exp(−∫₀¹ µx+ξ(t + ξ) dξ) = exp(−µx(t))   (3.3)

Figure 3.1. Illustration of the basic assumption (3.2) with a Lexis diagram (calendar time on the horizontal axis, age on the vertical axis).


which extends (2.36). For durations s less than 1 year, we have under assumption (3.2) that

spx(t) = exp(−∫₀ˢ µx+ξ(t + ξ) dξ) = exp(−s µx(t)) = (px(t))^s   (3.4)

Moreover, the forces of mortality and the central death rates (see Section 2.3.4 for formal definitions) coincide under (3.2), that is, µx(t) = mx(t). This makes statistical inference much easier, since rates are estimated by dividing the number of occurrences of a selected demographic event in a (sub-)population by the corresponding number of person-years at risk (see the next section).
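Under assumption (3.2), these relations are easy to put to work. The sketch below (with an illustrative death probability, not data from this chapter) converts a one-year death probability into the corresponding constant force of mortality by inverting (3.3), and evaluates a fractional-age survival probability via (3.4):

```python
import math

def force_from_q(qxt):
    """Constant force of mortality mu_x(t) implied by the one-year death
    probability q_x(t), inverting p_x(t) = exp(-mu_x(t)) from (3.3)."""
    return -math.log(1.0 - qxt)

def fractional_survival(qxt, s):
    """s-year survival probability for 0 <= s <= 1: by (3.4),
    s p_x(t) = (p_x(t))**s under the piecewise-constant assumption (3.2)."""
    return (1.0 - qxt) ** s

qxt = 0.01                            # illustrative one-year death probability
mu = force_from_q(qxt)                # slightly above 0.01
half_year_survival = fractional_survival(qxt, 0.5)
```

Note that µx(t) slightly exceeds qx(t), since −ln(1 − q) > q for q > 0.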

3.3.2 Exposure-to-risk

When working with death rates, the appropriate notion of risk exposure is the person-years of exposure, called the (central) exposure-to-risk in the actuarial literature. The exposure-to-risk refers to the total number of 'person-years' in a population over a calendar year. It is similar to the average number of individuals in the population over a calendar year, adjusted for the length of time they are in the population.

Let us denote as ETRxt the exposure-to-risk at age x last birthday during year t, that is, the total time lived by people aged x last birthday in calendar year t. There is an easy expression for the average exposure-to-risk that is valid under (3.2). As in (1.45), let Lxt be the number of individuals aged x last birthday on January 1 of year t. Then,

E[ETRxt | Lxt = l] = l ∫₀¹ (px(t))^ξ dξ = (l/µx(t)) (1 − px(t)) = −l qx(t) / ln(1 − qx(t))   (3.5)

Hence, provided the population size is large enough, we get the approximation

ETRxt ≈ −Lxt qx(t) / ln(1 − qx(t))   (3.6)

that can be used to reconstitute the ETRxt's from the Lxt's and the qx(t)'s in the case where the ETRxt's are not readily available. This formula appears to be useful since, in the majority of applications to general population data, the exposure-to-risk is not provided. When the actuary works with market data, or with statistics gathered from a given insurance portfolio, the exposures-to-risk are easily calculated, so that there is no need for the approximation formula (3.6).
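Formula (3.6) is straightforward to implement. In the sketch below, the population count and death probability are hypothetical illustrative values:

```python
import math

def exposure_to_risk(Lxt, qxt):
    """Approximate the central exposure ETR_xt from the count L_xt of
    individuals aged x last birthday on January 1 and the one-year death
    probability q_x(t), using (3.6): ETR_xt ~ -L_xt q_x(t) / ln(1 - q_x(t))."""
    if qxt == 0.0:
        return float(Lxt)  # no deaths: every individual contributes a full year
    return -Lxt * qxt / math.log(1.0 - qxt)

# Hypothetical inputs: 100,000 persons aged x, one-year death probability 2%
etr = exposure_to_risk(100_000, 0.02)   # a bit less than 100,000 person-years
```

The result is below Lxt because the individuals who die during the year contribute less than a full person-year each.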

3.3.3 Death rates

We consider the estimation of µx(t) under assumption (3.2). We will see that the maximum likelihood estimator of µx(t) is obtained by dividing the number of deaths recorded at age x in year t by the corresponding exposure-to-risk ETRxt. This is an expected result, since µx(t) and mx(t) coincide under (3.2).

To get this result in a formal way, let us associate to each of the Lxt individuals alive at the beginning of the period an indicator variable δi defined as

δi = 1 if individual i dies at age x, and δi = 0 otherwise,   (3.7)

i = 1, 2, . . . , Lxt. Furthermore, let τi be the fraction of the year lived by individual i, and let Dxt be the number of deaths recorded at age x last birthday during calendar year t, from an exposure-to-risk ETRxt. We obviously have that

∑_{i=1}^{Lxt} δi = Dxt   and   ∑_{i=1}^{Lxt} τi = ETRxt   (3.8)

Note that the method of recording the calendar year of death and the age last birthday at death means that the death counts Dxt cover individuals born between January 1 of calendar year t − x − 1 and December 31 of calendar year t − x (i.e. two successive calendar years), with a peak representation around January 1 of calendar year t − x.

Under the assumption (3.2) and using (3.3), the contribution of individual i to the likelihood may be written as

px(t) = exp(−µx(t))   (3.9)

if he survives, and

τipx(t) µx+τi(t + τi) = exp(−τi µx(t)) µx(t)   (3.10)

if he dies at time τi during year t. Combining expressions (3.9)–(3.10), the contribution of individual i to the likelihood can be transformed into

exp(−τi µx(t)) (µx(t))^δi   (3.11)


If the individual lifetimes are mutually independent, the likelihood for the Lxt individuals aged x is then equal to

L(µx(t)) = ∏_{i=1}^{Lxt} exp(−τi µx(t)) (µx(t))^δi = exp(−µx(t) ETRxt) (µx(t))^Dxt   (3.12)

Note that this likelihood is proportional to the one based on the Poisson distributional assumption for Dxt. Setting the derivative of ln L(µx(t)) = −µx(t) ETRxt + Dxt ln µx(t) equal to 0, we find the maximum likelihood estimate µ̂x(t) of the force of mortality µx(t), which is given by

µ̂x(t) = Dxt / ETRxt = m̂x(t)   (3.13)

The m̂x(t)'s are referred to as crude (i.e. unsmoothed) death rates for age x in calendar year t. The death rate is, thus, the proportion of people of a given age expected to die within the year, expressed in terms of the expected number of life-years rather than in terms of the number of individuals initially present in the group. Often, ETRxt is approximated by an estimate of the population aged x last birthday in the middle of the calendar year. This quantity is estimated by a national institute of statistics, taking account of recorded births and deaths and net immigration. Formula (3.6) can also be used to reconstitute the exposure-to-risk under assumption (3.2).
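The estimator (3.13) can be checked with a small Monte Carlo experiment: simulate individual lifetimes under a constant force of mortality, accumulate the death count Dxt and the exposure ETRxt as in (3.8), and verify that their ratio recovers the true force. The force 0.05 and the cohort size below are arbitrary illustrative choices:

```python
import random

random.seed(1)                       # reproducible illustration
mu_true, Lxt = 0.05, 200_000         # arbitrary force of mortality and cohort size
deaths, exposure = 0, 0.0
for _ in range(Lxt):
    lifetime = random.expovariate(mu_true)   # time to death under constant mu
    if lifetime < 1.0:               # dies during the year: delta_i = 1
        deaths += 1
        exposure += lifetime         # tau_i = fraction of the year lived
    else:                            # survives the year: contributes tau_i = 1
        exposure += 1.0

m_hat = deaths / exposure            # crude death rate (3.13), estimates mu_true
```

With a cohort of this size, m_hat falls within a few parts per thousand of the true force.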

Figure 3.2 displays the logarithm of the death rates mx(t) for males and females for four selected periods. They come from the official life tables constructed by Statistics Belgium, and cover the last 120 years. For each period, death rates are relatively high in the first year after birth, decline rapidly to a low point around age 10, and thereafter rise, in a roughly exponential fashion, before decelerating (or slowing their rate of increase) at the end of the life span. This is the typical shape of a set of death rates.

From Fig. 3.2, it is obvious that dramatic changes in mortality have occurred over the 20th century. The striking features of the evolution of mortality are the downward trends and the substantial variations in shape. We see that the greatest relative improvement in mortality during the 20th century occurred at the young ages, which has resulted largely from the control of infectious diseases. The decrease over time at ages 20–30 for females reflects the rapid decline in childbearing mortality. The hump in mortality around ages 18–25 has become increasingly important, especially for young males. Accidents, injuries, and suicides account for the majority of the excess mortality of males over females at ages under 45 (this is why this hump is often referred to as the accident hump).


Figure 3.2. Death rates (on the log scale) for Belgian males (top panel) and Belgian females (bottom panel) from period life tables 1880–1890, 1928–1932, 1968–1972, and 2000–2002. Source: Statistics Belgium.

The trend in the logarithm of the mx(t)'s for some selected ages is depicted in Figs 3.3 and 3.4. An examination of Fig. 3.3 reveals distinct behaviours for age-specific death rates affecting Belgian males. At age 20, a rapid reduction took place after a peak which occurred in the early 1940s due to World War II. A structural break seems to have occurred, with a relatively high level of mortality before World War II, and a much lower level after 1950. Since the mid-1950s, only modest improvements have occurred for the m20(t)'s. This is typical for ages around the accident hump, where male mortality has not really decreased since the 1970s. At age 40, the same decrease after World War II is apparent, followed by a much slower reduction after 1960. The decrease after 1970 is nevertheless more marked than

Figure 3.3. Trend in observed death rates (on the log scale) for Belgian males at ages 20, 40, 60, and 80, period 1920–2005. Source: HMD.

Figure 3.4. Trend in observed death rates (on the log scale) for Belgian females at ages 20, 40, 60, and 80, period 1920–2005. Source: HMD.


at age 20. At ages 60 and 80, mortality rates have declined rapidly after 1970, whereas the decrease during 1920–1970 was rather moderate. We note that the effect of World War II is much more important at younger ages than at older ages. This clearly shows that gains in longevity were concentrated at younger ages during the first half of the 20th century, and then moved to older ages after 1950.

The analysis for Belgian females illustrated in Fig. 3.4 parallels that for males for ages 20 and 40, but with several differences. At age 20, modest improvements are visible after the mid-1950s. At age 40, more pronounced reductions occurred after 1960. At older ages, the rate of decrease is more regular, and has tended to accelerate after 1980.

This acceleration is a feature seen in a number of Western European countries. Kannisto et al. (1994) report an acceleration in the late 1970s in the rate of decrease of mortality rates at ages over 80, in an analysis of mortality rates for 9 European countries with reliable mortality data at these ages over an extended period.

3.3.4 Mortality surfaces

The dynamic analysis of mortality is often based on the modelling of the mortality surfaces that are depicted in Fig. 3.5. Such a surface consists of a three-dimensional plot of the logarithm of the mx(t)'s viewed as a function of both age x and time t. Fixing the value of t, we recognize the classical shape of a mortality curve visible in Fig. 3.2. Specifically, along cross sections when t is fixed (or along diagonals when cohorts are followed), one observes relatively high mortality rates around birth, the well-known presence of a trough at about age 10, a ridge in the early 20s (which is less pronounced for females), and an increase at middle and older ages.

Mortality does not vary uniformly over the age-year plane, and the advantage of plots as in Fig. 3.5 is that they facilitate an examination of the way that mortality changes with year and cohort as well as with age. In addition to random deviation from the underlying smooth mortality surface, the surface is subject to period shocks corresponding to wars, epidemics, harvests, summer heat waves, etc. Roughness of the surface indicates volatility, and ridges along cross sections at given years mark brief episodes of excess mortality. For instance, higher mortality rates are clearly visible for the years around World War II.

3.3.5 Closure of life tables

At higher ages (above 80), the death rates displayed in Fig. 3.5 appear rather smooth. This is a consequence of the smoothing procedure implemented


Figure 3.5. Observed death rates (on the log scale) for Belgian males (top panel) and Belgian females (bottom panel), ages 0 to 109, period 1920–2005. Source: HMD.


in the HMD. Death rates for ages 80 and above were estimated according to the logistic formula and were then combined with death rates from younger ages in order to reconstitute life tables. To get an idea of the behaviour of mortality rates at the higher ages, we have plotted in Fig. 3.6 the rough death rates observed for the Belgian population. As discussed in Section 2.8, we clearly see from Fig. 3.6 that data at old ages produce suspect results (because of small risk exposures): the pattern at old and very old ages is heavily affected by random fluctuations because of the scarcity of data. Sometimes, data above some high age are not available at all.

Recently, some in-depth demographic studies have provided a more sound knowledge about the slope of the mortality curve at very old ages. It has been documented that the force of mortality is slowly increasing at very old ages, approaching a rather flat shape. The deceleration in the rate of increase in mortality rates can be explained by the selective survival of healthier individuals at older ages (see, e.g., Horiuchi and Wilmoth (1998) for more details, as well as the discussion about frailty in Section 2.9.3). Demographers and actuaries have suggested various techniques for estimating the force of mortality at old ages and for completing the life table. See Section 2.8.2 for examples and references. Here, we apply a simple and powerful method proposed by Denuit and Goderniaux (2005).

The starting point is standard: there is ample empirical evidence that the one-year death probabilities behave like the exponential of a quadratic polynomial at older ages, that is, qx(t) = exp(at + bt x + ct x²). Hence, a log-quadratic regression model of the form

ln qx(t) = at + bt x + ct x² + εxt   (3.14)

for the observed one-year death probabilities, with εxt independent and Normally distributed with mean 0 and variance σ², is fitted separately to each calendar year t (t = t1, t2, . . . , tm) and to ages x*_t and over. Then, constraints are imposed to mimic the observed behaviour of mortality at old ages. First, a closure constraint

q130(t) = 1 for all t (3.15)

which retains as a working assumption that the limit age 130 will not be exceeded. Secondly, an inflection constraint

∂qx(t)/∂x |x=130 = 0   for all t   (3.16)


Figure 3.6. Observed death rates (on the log scale) for Belgian males (top panel) and Belgian females (bottom panel), period 1950–2004. Source: Statistics Belgium.

which is used to ensure that the behaviour of the ln qx(t)'s will be ultimately concave. This is in line with empirical studies that provide evidence of a decrease in the rate of mortality increase at old ages. One explanation proposed for this deceleration is the selective survival of healthier individuals to older ages, as noted above.


Note that both constraints are imposed here at age 130. In general, the closing age could also be treated as a parameter and selected from the data (together with the starting age x*_t, thereby determining the optimal fitting age range).

These two constraints yield the following relation between the at's, bt's, and ct's for each calendar time t:

at + bt x + ct x² = ct (130 − x)²   (3.17)

for x = x*_t, x*_t + 1, . . . and t = t1, t2, . . . , tm. Indeed, (3.15) and (3.16) force bt = −260 ct and at = 16,900 ct, so that the quadratic collapses to ct (130 − x)². The ct's are then estimated on the basis of the series {qx(t), x = x*_t, x*_t + 1, . . .} relating to year t from equation (3.14), noting the constraints imposed by (3.17). It is worth mentioning that the two constraints underlying the modelling of the qx(t) for high x are in line with empirical demographic evidence.
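A minimal sketch of this closure step for a single calendar year follows. The function name and the synthetic old-age death probabilities are ours, for illustration only; since under constraints (3.15)–(3.16) the regression (3.14) reduces to ln qx(t) = ct (130 − x)², ct can be estimated by least squares through the origin:

```python
import math

def close_life_table(q_obs, x_start=75, omega=130):
    """Fit ln q_x = c * (omega - x)**2 by least squares on ages >= x_start,
    then return death probabilities extended up to the closing age omega
    (where q_omega = 1 automatically, matching constraint (3.15))."""
    ages = [x for x in sorted(q_obs) if x >= x_start]
    z = [(omega - x) ** 2 for x in ages]           # regressor (omega - x)^2
    y = [math.log(q_obs[x]) for x in ages]         # response ln q_x
    c = sum(zi * yi for zi, yi in zip(z, y)) / sum(zi * zi for zi in z)
    return {x: math.exp(c * (omega - x) ** 2) for x in range(x_start, omega + 1)}

# Synthetic old-age death probabilities (c = -0.001, in the range of Fig. 3.7)
q_obs = {x: math.exp(-0.001 * (130 - x) ** 2) for x in range(75, 100)}
q_closed = close_life_table(q_obs)                 # q_closed[130] == 1 by design
```

In practice one would keep the observed qx(t) below some age (85 in the text) and splice in the fitted values above it.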

Let us now apply this method to the data displayed in Fig. 3.6. The optimal starting age is selected from the age range 75–89. It turns out to be around 75 for all of the calendar years. Therefore, we fix it to be 75 for both genders and for all calendar years. The R² corresponding to the fitted regression models (3.14), as well as the estimated regression parameters ct, are displayed in Fig. 3.7. We keep the original qx(t) for ages below 85, and we replace the death probabilities for ages over 85 with the fitted values coming from the constrained quadratic regression (3.14). The results for calendar years 1950, 1960, 1970, 1980, 1990, and 2000 can be seen in Fig. 3.8 for males and in Fig. 3.9 for females. The completed mortality surfaces are displayed in Fig. 3.10.

3.3.6 Rectangularization and expansion

Figure 3.11 shows the rectangularization phenomenon. It presents the population survival functions based on period life tables for, from left to right, 1880–1890, 1928–1932, 1968–1972, and 2000–2002. Survival functions have been formally introduced in Section 2.3.1. Broadly speaking, they give the proportion of individuals reaching the age displayed along the x-axis, where this proportion is computed on the basis of the set of age-specific mortality rates corresponding to the different period life tables.

As we have noted in the introduction, considerable progress has been made in the 20th century towards eliminating the hazards to survival which existed at the young ages in the early 1900s. This is clearly visible from Fig. 3.11, where the proportion of the population still alive at some given age increases as we move forward in calendar time. As a consequence, the slope of the survival function has become more rectangular (less diagonal) through time. This is the so-called 'curve squaring' concept,

106 3 : Mortality trends during the 20th century


Figure 3.7. Adjustment coefficients and estimated regression parameters for model (3.14)–(3.17).

which has been the subject of passionate debate among demographers in recent years.

Let us now consider the age corresponding to a value of 0.5 for the survival curve. This age is called the median age at birth and is one of the standard demographic markers; see Section 2.4.2. Broadly speaking, the median is the age reached by half of a hypothetical population with mortality experience reflected by that particular period life table. Figure 3.12 (top panel) shows the increasing trend in the median life at birth: median lifetimes are depicted by gender and calendar year, based on period life tables. Figure 3.12 (bottom panel) is the analogue for the median remaining lifetime at age 65.
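The median age at birth can be read off a tabulated period survival curve by locating where the curve crosses the level 0.5. A minimal sketch, using linear interpolation between tabulated ages (the survival values below are hypothetical, not the Belgian data):

```python
def age_at_survival_level(ages, survival, level=0.5):
    """Linearly interpolate the age at which the survival curve crosses `level`."""
    for x0, s0, x1, s1 in zip(ages, survival, ages[1:], survival[1:]):
        if s0 >= level >= s1 and s0 > s1:
            return x0 + (s0 - level) / (s0 - s1) * (x1 - x0)
    raise ValueError("level not crossed by the survival curve")

# Hypothetical period survival curve
ages = [0, 20, 40, 60, 80, 100]
survival = [1.00, 0.95, 0.90, 0.75, 0.40, 0.02]
median_age = age_at_survival_level(ages, survival)  # about 74.3 here
```

The same routine, called with other levels, yields the quartiles used later for the interquartile range.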


Figure 3.8. Completed life tables for Belgian males, years 1950, 1960, 1970, 1980, 1990, and 2000, together with empirical death probabilities (broken line), on the log scale.


Figure 3.9. Completed FPB life tables for Belgian females, years 1950, 1960, 1970, 1980, 1990, and 2000, together with empirical death probabilities (broken line), on the log scale.

3.3 Mortality trends in the general population 109


Figure 3.10. Completed death rates (on the log scale) for Belgian males (top panel) and Belgian females (bottom panel), period 1920–2005.



Figure 3.11. Survival curves for Belgian males (top panel) and Belgian females (bottom panel) corresponding to the 1880–1890, 1928–1932, 1968–1972, and 2000–2002 period life tables. Source: Statistics Belgium.

Rectangularization of survival curves is associated with a reduction in the variability of age at death. As deaths become concentrated in an increasingly narrow age range, the slope of the survival curve in that range becomes steeper, and the curve itself begins to appear more rectangular. A simple measure of rectangularity is thus the maximum downward slope of the survival curve S in the adult age range, which has been formally defined in (2.61). Increasing rectangularity according to this measure implies a survival curve which becomes increasingly vertical at older ages.

Figure 3.13 displays the distribution of ages at death (the empirical version of the theoretical probability density function f defined in (2.28)). It can be seen that the distribution of ages at death has shifted to the right and has



Figure 3.12. Observed median lifetimes at birth (top panel) and at age 65 (bottom panel), period 1920–2005. Source: HMD.

become less variable and less obviously bimodal. We clearly observe that the point of fastest decline increases with time, which empirically supports the expansion phenomenon.

3.3.7 Life expectancies

The index, life expectancy, has been formally defined in Section 2.4.1. Life expectancy statistics are very useful as summary measures of mortality, and they have an intuitive appeal. However, it is important to interpret data on life expectancy correctly when their computation is based on period life tables. Period life expectancies are calculated using a set of age-specific mortality rates for a given period (either a single year, or a run of years), with



Figure 3.13. Observed proportion of ages at death for Belgian males (top panel) and Belgian females (bottom panel) corresponding to the 1880–1890, 1928–1932, 1968–1972, and 2000–2002 period life tables. Source: Statistics Belgium.

no allowance for any future changes in mortality. Cohort life expectancies are calculated using a cohort life table, that is, using a set of age-specific mortality rates which allow for known or projected changes in mortality at later ages (in later years).

Period life expectancies are a useful measure of the mortality rates that have been actually experienced over a given period and, for past years, provide an objective means of comparison of the trends in mortality over time, between areas of a country, and with other countries. Official life tables which relate to past years are generally period life tables for these reasons. Cohort life expectancies, even for past years, may require projected mortality rates for their calculation. As such, they are less objective because they are subject to substantial model risk and forecasting error.


In this chapter, we only compute period life expectancies. Cohort life expectancies will be derived in Chapter 5 using appropriate mortality projection methods. Let e↑x(t) be the period life expectancy at age x in calendar year t. Here, we have used a superscript '↑' to recall that we work along a vertical band in the Lexis diagram, considering death rates associated with a given period of time. Specifically, e↑x(t) is computed from the period life table for year t, given by the set µx+k(t), k = 0, 1, . . .. The formula giving e↑x(t), under assumption (3.2), is

    e↑x(t) = ∫_{ξ≥0} exp( −∫_0^ξ µx+η(t) dη ) dξ

           = (1 − exp(−µx(t))) / µx(t)
             + Σ_{k≥1} ∏_{j=0}^{k−1} exp(−µx+j(t)) · (1 − exp(−µx+k(t))) / µx+k(t)        (3.18)

In this formula, the ratio (1 − exp(−µx+k(t))) / µx+k(t) is the average fraction of the year lived by an individual alive at age x + k, and the product ∏_{j=0}^{k−1} exp(−µx+j(t)) is the probability kp↑x(t) of reaching age x + k computed from the period life table.
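Formula (3.18) translates directly into code: survive whole years with probability ∏ exp(−µ), and live an average fraction (1 − exp(−µ))/µ of the year in which death occurs. A minimal sketch under a piecewise-constant force of mortality; the hazard schedule used for the sanity check is illustrative, not taken from the life tables discussed here:

```python
import math

def period_life_expectancy(mu):
    """Period life expectancy from a schedule of death rates mu[0], mu[1], ...,
    following the structure of equation (3.18)."""
    e, p_survive = 0.0, 1.0
    for m in mu:
        e += p_survive * (1.0 - math.exp(-m)) / m   # expected time lived this year
        p_survive *= math.exp(-m)                   # probability of completing the year
    return e

# Sanity check: with a constant hazard mu over a long horizon, the life
# expectancy approaches 1/mu (50 years for mu = 0.02).
e_flat = period_life_expectancy([0.02] * 1000)
```

Feeding the routine the completed schedule µx(t), µx+1(t), . . . for a fixed year t reproduces the period quantity e↑x(t).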

Figure 3.14 shows the trend in the period life expectancies at birth e↑0(t) and at retirement age e↑65(t) by gender. The period life expectancy at a particular age is based on the death rates for that and all higher ages that were experienced in that specific year. For life expectancies at birth, we observe a regular increase after 1950, with an effect due to World War II which is visible before that time (especially at the beginning and at the end of the conflict for e↑0(t), and during the years preceding the conflict as well as during the war itself for e↑65(t)). Little increase was experienced from 1930 to 1945. It is interesting to note that period life expectancies are affected by sudden and temporary events, such as a war or an epidemic.

3.3.8 Variability

Wilmoth and Horiuchi (1999) have studied different measures of variability for the distribution of ages at death. These authors favour the interquartile range for both its ease of calculation and its straightforward interpretation. The interquartile range measures the distance between the lower and the upper quartiles of the distribution of ages at death in a life table. This range is formally defined as the difference between the age corresponding



Figure 3.14. Observed period life expectancies at birth (top panel) and at age 65 (bottom panel) for Belgian males (continuous line) and Belgian females (dotted line), period 1920–2005. Source: HMD.

to the value 0.25 of the survival curve minus the age corresponding to the value 0.75 of this curve; see (2.64). The former age (called the third quartile) is attained by 25% of the population, whereas 75% of the population reaches the latter age (called the first quartile). The interquartile range is thus the width of the age interval containing the central 50% of deaths in the population. As age at death becomes less variable, we would expect this measure to decrease. It is very simple to calculate because it equals the difference between the ages where the survival curve S crosses the probability levels 0.25 and 0.75. Being the length of the span of ages containing the middle 50% of deaths, it possesses a simple interpretation. Note that the rectangularization of survival curves is associated with a decreasing interquartile range.
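Computing the interquartile range from a tabulated survival curve amounts to two level crossings of S. A minimal sketch with a hypothetical curve (not the Belgian data):

```python
def crossing_age(ages, survival, level):
    """Age at which the survival curve crosses `level` (linear interpolation)."""
    for x0, s0, x1, s1 in zip(ages, survival, ages[1:], survival[1:]):
        if s0 >= level >= s1 and s0 > s1:
            return x0 + (s0 - level) / (s0 - s1) * (x1 - x0)
    raise ValueError("level not crossed")

def interquartile_range(ages, survival):
    # third quartile (S = 0.25) minus first quartile (S = 0.75)
    return crossing_age(ages, survival, 0.25) - crossing_age(ages, survival, 0.75)

# Hypothetical survival curve
ages = [0, 20, 40, 60, 80, 100]
survival = [1.00, 0.97, 0.92, 0.80, 0.45, 0.05]
iqr = interquartile_range(ages, survival)
```

As the curve rectangularizes, both crossings move to higher ages but closer together, and the computed range shrinks.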



Figure 3.15. Observed interquartile range at birth (top panel) and at age 65 (bottom panel) for Belgian males (continuous line) and Belgian females (dotted line), period 1920–2005. Source: HMD.

Figure 3.15 depicts the interquartile range at birth and at age 65. Whereas the interquartile range at birth clearly decreases over time, there is an upward trend at age 65. This suggests that even if variability is decreasing for the entire lifetime, this may not be the case for the remaining lifetime at age 65.

3.3.9 Heterogeneity

Within populations, differences in life expectancy exist with regard to gender. Females tend to outlive males in all populations, and have lower mortality rates at all ages, starting from infancy. This is clear from all of the figures examined so far in this chapter. Another difference in life expectancy


occurs because of social class, as assessed through occupation, income, or education.

In recent decades, population data have shown widening mortality differentials by socio-economic class. The mortality of the better-off classes has improved more rapidly. The major cause of death responsible for the widening differential is cardiovascular disease: persons of higher social classes have experienced much larger declines in deaths due to cardiovascular disease than persons of lower classes. Other possible explanations include cigarette smoking (which is known to vary significantly according to social class) as well as differences in diet, selection mechanisms, poorer quality housing conditions, and occupation. In general, individuals with higher socio-economic status live longer than those in lower socio-economic groups. This heterogeneity can be accounted for as discussed in Section 2.9.

We will see below that the effect of social class is significant for insurance market mortality statistics. Indeed, the act of purchasing life insurance products often reveals that the individual belongs to an upper socio-economic class, which in turn yields lower mortality (even in the case of death benefits).

3.4 Life insurance market

3.4.1 Observed death rates

Figure 3.16 displays the period life tables for the Belgian individual life insurance market, group life insurance market, and the general population observed in the calendar years 1995, 2000, and 2005. The variability in the set of death rates is clearly much higher for the insurance market, as exposures-to-risk are considerably smaller. This is why smoothing the market experience to make the underlying trend more apparent is desirable. This is achieved as explained below.

The standardized mortality ratio (SMR) is a useful index for comparing mortality experiences: actual deaths in a particular population are compared with those which would be expected if 'standard' age-specific rates applied. Precisely, the SMR is defined as

    SMR = Σ_{(x,t)∈D} ETRxt mx(t) / Σ_{(x,t)∈D} ETRxt mstandx(t)
        = Σ_{(x,t)∈D} Dxt / Σ_{(x,t)∈D} ETRxt mstandx(t)

where D is the set of ages and calendar years of interest.

Here are the SMRs by calendar year for the life insurance market: computed over 1993–2005, the estimated SMR is equal to 0.5377419 for ages


Figure 3.16. General population (broken line) and individual (circle) and group (triangle) life insurance market death rates (on the log scale) observed in 1995, 2000, and 2005 for Belgian males (top panel) and females (bottom panel). Source: HMD for the general population and BFIC for insured lives.


45–64 and to 0.3842981 for ages 65 and over for individual policies, and to 0.495525 and 0.8042604 for group policies. The same values computed over 2000–2005 are equal to 0.4796451, 0.3699633, 0.4963897, and 0.8692767, respectively. Note that the values for group contracts, ages 45–64, have been computed by excluding calendar year 2001, which appeared to be atypical for group life contracts before retirement age. We see that SMRs are around 50% for individual and group life insurance contracts before retirement age, and that they then decrease to about 40% for individual policies and increase to about 80% for group life policies.

3.4.2 Smoothed death rates

It is clear from Fig. 3.16 that death rates based on market data exhibit considerable variations. This is why some smoothing is desirable in order to obtain a better picture of the underlying mortality experienced by insured lives. Since possible changes in underwriting practices or tax reforms are likely to affect market death rates, we smooth the death rates across ages by calendar year, as in Hyndman and Ullah (2007). To this end, we use local regression techniques.

Local regression is used to model the relation between a predictor variable (or variables) x and a response Y. Typically, x represents age in the application that we have in mind in this chapter, while Y is some (suitably transformed) demographic indicator, such as the logarithm of the death rate or the logit of the death probability. The logarithmic and logit transformations involved in these models ensure that the dependent variables can assume any real value.

As pointed out by Loader (1999), smoothing methods and local regression originated in actuarial science in the late 19th and early 20th centuries, in the problem of graduation. See Section 2.6 for an introduction to these concepts. Having observed (x1, Y1), (x2, Y2), . . ., (xm, Ym), we assume a model of the form Yi = f(xi) + εi, i = 1, 2, . . . , m, where f(·) is an unknown function of x, and εi is an error term, assumed to be Normally distributed with mean 0 and variance σ2. This term represents the random departures from f(·) in the observations, or variability from sources not included in the xi's. No strong assumptions are made about f, except that it is a smooth function that can be locally well approximated by simple parametric functions. For instance, invoking Taylor's theorem, any differentiable function can be approximated locally by a straight line, and a twice differentiable function can be approximated locally by a quadratic polynomial.

In order to estimate f at some point x, the observations are weighted in such a way that the largest weights are assigned to observations close to


x. In many cases, the weight wi(x) assigned to (xi, Yi) to estimate f(x) is obtained from the formula

    wi(x) = W( (xi − x) / h(x) )        (3.19)

where W(·) is chosen to be continuous, symmetric, peaked at 0, and supported on [−1, 1]. A common choice is the tricube weight function defined as

    W(u) = (1 − |u|³)³ for −1 < u < 1, and 0 otherwise        (3.20)

The bandwidth h(x) defines a smoothing window (x − h(x), x + h(x)), and only observations in that window are used to estimate f(x). Within the smoothing window, f is approximated by a polynomial. The coefficients of this polynomial are then estimated via weighted least squares.

The bandwidth h(x) has a critical effect on the local regression. If h(x) is too small, insufficient data fall within the smoothing window and a noisy fit results. On the other hand, if h(x) is too large, the local polynomial may not fit the data well within the smoothing window, and important features of the mean function may be distorted or even lost. The nearest-neighbour bandwidth is often used: specifically, h(x) is selected so that the smoothing window contains a specified number of points.

A high polynomial degree can always provide a better approximation to f than a low polynomial degree. But high-order polynomials have large numbers of coefficients to estimate, and the result is increased variability in the estimate. To some extent, the effects of the polynomial degree and bandwidth are confounded. It often suffices to choose a low-degree polynomial and to concentrate on choosing the bandwidth in order to obtain a satisfactory fit. The most common choices are local linear and local quadratic. A local linear estimate usually produces better fits, especially at the boundaries. The weight function W(·) has much less effect on the bias-variance trade-off, and the tricube weight function (3.20) is routinely used.

Let us approximate f by a linear function β0(x) + β1(x)x in the smoothing window (x − h(x), x + h(x)). This leads to local linear regression. The coefficients β0(x) and β1(x) are estimated by minimizing the local residual sum of squares

    OW(x) = Σ_{i=1}^m wi(x) ( Yi − β0(x) − β1(x)xi )²        (3.21)

Denoting as

    x̄w = Σ_{i=1}^m wi(x)xi / Σ_{i=1}^m wi(x)        (3.22)

the weighted average of the xi's in the smoothing window, the minimization of the objective function OW(x) gives

    f̂(x) = β̂0(x) + β̂1(x)x
         = Σ_{i=1}^m wi(x)Yi / Σ_{i=1}^m wi(x)
           + (x − x̄w) · Σ_{i=1}^m wi(x)(xi − x̄w)Yi / Σ_{i=1}^m wi(x)(xi − x̄w)²        (3.23)

Let us give an interpretation to this expression for f̂(x). The first term in f̂(x) is the well-known Nadaraya–Watson kernel estimate that is obtained by approximating f by a constant in the smoothing window (x − h(x), x + h(x)). The second term is a correction for the local slope of the data and the skewness of the xi's. A local constant estimate would exhibit bias if the mean function f had high curvature.
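The estimate (3.23) can be sketched in a few lines: tricube weights (3.19)–(3.20) inside the window, then the weighted mean plus the local-slope correction. A minimal, self-contained sketch (not the authors' actual fitting code):

```python
def tricube(u):
    """Tricube weight function W(u) of (3.20)."""
    return (1.0 - abs(u) ** 3) ** 3 if -1.0 < u < 1.0 else 0.0

def local_linear(x, xs, ys, h):
    """Local linear estimate of f(x) as in (3.23): the Nadaraya-Watson term
    plus a correction for the local slope, with bandwidth h."""
    w = [tricube((xi - x) / h) for xi in xs]
    sw = sum(w)
    xw = sum(wi * xi for wi, xi in zip(w, xs)) / sw          # weighted mean (3.22)
    slope_num = sum(wi * (xi - xw) * yi for wi, xi, yi in zip(w, xs, ys))
    slope_den = sum(wi * (xi - xw) ** 2 for wi, xi in zip(w, xs))
    return sum(wi * yi for wi, yi in zip(w, ys)) / sw + (x - xw) * slope_num / slope_den

# A local linear fit reproduces exactly linear data.
xs = list(range(10))
ys = [2.0 * xi + 1.0 for xi in xs]
estimate = local_linear(4.5, xs, ys, h=3.0)   # 2 * 4.5 + 1 = 10
```

Sliding x over the age range and exponentiating the fitted values gives a smoothed death rate curve of the kind shown in Fig. 3.17.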

Let us now apply this methodology to the life insurance market data. For a fixed calendar year t, we use the model

    ln mx(t) = f(x) + εxt,  x = 40, 41, . . . , 98        (3.24)

where mx(t) is the observed death rate in the insurance market. Hence, the smoothed death rates are given by exp(f̂(x)), x = 40, 41, . . . , 98. The model is fitted separately to males and females, and to group and individual mortality experiences. The result is visible in Fig. 3.17, which is the analogue of Fig. 3.16, leading to smoothed mortality curves for the insurance market. We see that the individual life experience is consistently better than the general population mortality. The experience for group life contracts is better than the general population mortality before retirement age, but then deteriorates and becomes comparable to the general population mortality after retirement.

Remark Alternatively, f can be estimated by minimizing the objective function (2.105), that is,

    Oλ(f) = Σ_{i=1}^m ( yi − f(xi) )² + λ ∫_{u∈R} ( f″(u) )² du        (3.25)

The first term ensures that f(·) will fit the data as well as possible. The second term penalizes roughness of f(·); it imposes some smoothness on the estimated f(·). The factor λ quantifies the amount of smoothness: if λ ↗ +∞ then f″ = 0 and we get a linear fit, and if λ ↘ 0 then f perfectly interpolates the data points.


Figure 3.17. General population (broken line) death rates and individual (circle) and group (triangle) life insurance market smoothed death rates (on the log scale) observed in 1994 for Belgian males (top panel) and females (bottom panel).


If x1 < x2 < · · · < xm, then the solution fλ is a cubic spline with knots x1, x2, . . . , xm; see Section 2.6.3. This means that fλ coincides with a third-degree polynomial on each interval (xi, xi+1) and possesses continuous first and second derivatives at each xi.

Remark Instead of working in a Gaussian regression model, we could also move to the generalized linear modelling framework by implementing a local likelihood maximization principle. Consider for instance the Bernoulli model where P[Yi = 1] = 1 − P[Yi = 0] = p(xi). The contribution of the ith observation to the log-likelihood is

    l(Yi, p(xi)) = Yi ln p(xi) + (1 − Yi) ln(1 − p(xi))
                 = Yi ln[ p(xi) / (1 − p(xi)) ] + ln(1 − p(xi))        (3.26)

A local polynomial approximation for p(xi) is difficult since the inequalities 0 ≤ p(xi) ≤ 1 must be fulfilled. Therefore, we prefer to work on the logit scale, defining the new parameter from the logit transformation

    θ(x) = ln[ p(x) / (1 − p(x)) ]        (3.27)

Note that θ(x) can assume any real value as p(x) moves from 0 to 1. The local polynomial log-likelihood at x is then

    Σ_{i=1}^m wi(x) ( Yi (β0(x) + β1(x)xi) − ln(1 + exp(β0(x) + β1(x)xi)) )        (3.28)

The estimate of p(x) is then obtained from

    p̂(x) = exp( β̂0(x) + β̂1(x)x ) / ( 1 + exp( β̂0(x) + β̂1(x)x ) )        (3.29)

3.4.3 Life expectancies

Figure 3.18 gives the life expectancy at age 65 for the general population and for insured lives, computed on the basis of observed death rates.

We see that the life expectancies for the group life insurance market are close to the general population ones. This is due to the moderate adverse selection present in the collective contracts, where the insurance coverage is made compulsory by the employment contract, noting that there is a selection effect through being employed (the so-called 'healthy worker effect'). On the contrary, the effect of adverse selection seems to be much stronger for individual policies. This is due to the particular situation prevailing



Figure 3.18. Life expectancy at age 65 for males (top panel) and females (bottom panel): general population (diamond) and individual (circle) and group (triangle) life insurance market. Source: HMD for the general population and BFIC for insured lives.

in Belgium, where no tax incentives are offered for buying life annuities or other life insurance products after retirement. This explains why only people with improved health status consider insurance products as valuable assets. Note that this situation has recently changed in Belgium, where purchasing life annuities at retirement age is now encouraged by the government.

3.4.4 Relational models

Actuaries are aware that the nominee of a life annuity is, with a high probability, a healthy person with a particularly low mortality in the first years of life annuity payment and, generally, with an expected lifetime higher than


average. In order to account for this phenomenon, Delwarde et al. (2004) have suggested a method for adjusting a reference life table to the experience of a given portfolio, based on non-linear regression models using local likelihood for inference.

Denoting as mHMDx(t) the population death rates contained in the HMD, and as mBFICx(t) their analogue for the life insurance market computed from BFIC statistics, we consider models of the form

    ln mBFICx(t) = f( ln mHMDx(t) ) + εxt        (3.30)

for ages x = 40, 41, . . . , 98 and calendar years 1994–2005. The similarity with (3.24) is clearly apparent. Now, population death rates are used as explanatory variables, instead of age x. Note that both variables could enter the model as covariates, but we need here to establish a link between population and insurance market mortality statistics that will be exploited in Chapter 5. Figure 3.19 describes the result of the procedure for males, whereas Fig. 3.20 is the analogue for females.

Figures 3.19 and 3.20 suggest that a linear relationship exists between population and market death rates (at least at the older ages). If we fit the regression model

    ln mBFICx(t) = a + b ln mHMDx(t) + εxt        (3.31)

to the observed pairs (ln mHMDx(t), ln mBFICx(t)) that are available for ages 60–98 and calendar years 1994 to 2005, we obtain estimated values for b that are significantly less than 1 (for group and individual policies, males and females). Moreover, the estimates are very sensitive to the age and time ranges included in the analysis. Let us briefly explain why b < 1 seems inappropriate.

Mortality reduction factors express the decrease in mortality at some future time t + k compared with the current mortality experience at time t. They are widely used to produce projected life tables and are formally introduced in Section 4.3.2. The link between the regression model (3.31) and the mortality reduction factors for the insurance market is as follows. It is easily seen that if the linear relationship given above indeed holds true, then

    ln( mBFICx(t + k) / mBFICx(t) ) = b ln( mHMDx(t + k) / mHMDx(t) )        (3.32)

    ⇔ mBFICx(t + k) / mBFICx(t) = ( mHMDx(t + k) / mHMDx(t) )^b        (3.33)


Figure 3.19. Relational models for males: observed pairs (ln mHMDx(t), ln mBFICx(t)) are displayed in the left panels, the estimated functions f in (3.30) are displayed in the middle panels, and the resulting fits are displayed in the right panels; individual policies in the top panels, group policies in the bottom panels.


Figure 3.20. Relational models for females: observed pairs (ln mHMDx(t), ln mBFICx(t)) are displayed in the left panels, the estimated functions f in (3.30) are displayed in the middle panels, and the resulting fits are displayed in the right panels; individual policies in the top panels, group policies in the bottom panels.


so that the mortality reduction factor for the market is equal to the mortality reduction factor for the general population raised to the power b. The same reasoning obviously holds for the group life insurance market. We note that the mortality reduction factors are less than 1 in the presence of decreasing trends in mortality rates.

As socio-economic class mortality differentials have widened over time, we expect mortality improvements for assured lives to have been greater than in the general population. This statement is based on the fact that the socio-economic class mix of this group is higher than the population average. Of course, there may be distortion factors, like changes in underwriting practices or reforms in tax systems. Considering that the estimated values for the parameter b are less than 1, the interpretation is that the speed of the future mortality improvements in the insured population is somewhat smaller than the corresponding speed for the general population. This is not desirable and only reflects the changes in the tax regimes in Belgium, lowering adverse selection.
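A quick numerical check of (3.33): if the log-linear relation (3.31) holds exactly, the market reduction factor is the population factor raised to the power b, and with b < 1 the factor moves closer to 1, i.e. the market improvement is slower. The coefficients and rates below are made up for illustration:

```python
import math

a, b = -0.6, 0.9                  # hypothetical coefficients of (3.31)
m_pop_t, m_pop_tk = 0.010, 0.008  # hypothetical population rates at t and t + k

def market_rate(m_pop):
    """Market death rate implied by the exact log-linear relation (3.31)."""
    return math.exp(a + b * math.log(m_pop))

market_factor = market_rate(m_pop_tk) / market_rate(m_pop_t)
pop_factor = m_pop_tk / m_pop_t    # 0.8, i.e. a 20% improvement
# market_factor equals pop_factor ** b: a smaller improvement when b < 1
```

The intercept a cancels in the ratio, which is exactly the content of (3.32)–(3.33).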

This is why we now consider the following model:

    ln mBFICx(t) = f(x) + ln mHMDx(t) + εxt        (3.34)

We fit (3.34) to the observed pairs (ln mHMDx(t), ln mBFICx(t)) over calendar years 1994–2005 and ages 60–98. This produces estimated SMRs of the form exp(f̂(x)) that can be used to adapt mortality projections to the insurance market. Note that in (3.34), we force the speed of mortality improvements to be equal to the one for the general population. The quality of the fit of (3.34) is remarkable, as can be seen from the high values of the R2's: 99.8% for males, individual policies; 97.2% for males, group policies; 99.8% for females, individual policies; and 97.8% for females, group policies. The estimated SMRs are displayed in Fig. 3.21.

3.4.5 Age shifts

Another approach to quantifying adverse selection consists in determining age shifts or Rueff's adjustments. More details can be found in Section 4.4.3. Here, we determine the age shift Δ(t) to minimize the objective function

    Ot(Δ) = Σ_{x=65}^{80} ( eBFICx(t) − eHMDx+Δ(t)(t) )²        (3.35)

We select the optimal value of Δ(t) by a grid search over {−10, −9, . . . , 10}. Then, the overall age shift Δ is determined by minimizing O(Δ) = Σ_{t=1994}^{2005} Ot(Δ). This gives the values displayed in Table 3.1.
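The grid search behind (3.35) is straightforward to sketch. The life expectancy schedules below are hypothetical stand-ins for the eBFICx and eHMDx values, built so that insured lives behave like population lives five years younger; the search should then recover a shift of −5:

```python
def best_age_shift(e_market, e_population, ages=range(65, 81), grid=range(-10, 11)):
    """Age shift d minimizing sum over x of (e_market[x] - e_population[x + d])^2,
    in the spirit of objective (3.35), by grid search."""
    def objective(d):
        return sum((e_market[x] - e_population[x + d]) ** 2 for x in ages)
    return min(grid, key=objective)

# Hypothetical schedules: a toy linear pattern for the population, with the
# insured behaving like population lives five years younger.
e_pop = {x: 85.0 - 0.8 * x for x in range(40, 100)}
e_mkt = {x: e_pop[x - 5] for x in range(60, 90)}
shift = best_age_shift(e_mkt, e_pop)   # recovers -5
```

A negative shift means the insured of age x are treated like population lives aged x + Δ, i.e. younger lives with lighter mortality.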



Figure 3.21. Estimated SMRs from (3.34) for males (top panels) and females (bottom panels), individual (left) and group (right) life insurance market.

Table 3.1. Optimal age shifts obtained from the objective functions Ot in (3.35), t = 1994, 1995, . . . , 2005, and O = Σ_{t=1994}^{2005} Ot.

Year t       Ind., males   Ind., females   Group, males   Group, females
1994             −8             −6              −4              −1
1995             −7             −6              −1               0
1996             −9             −8              −2              −1
1997             −8             −5              −2              −1
1998             −6             −4              −1              −1
1999             −9             −6              −1              −1
2000             −8             −5              −1               0
2001             −9             −8              −2               0
2002             −9             −5               0               1
2003             −9             −4              −1               0
2004             −8             −4               1               3
2005             −6             −3               1               1
1994–2005        −9             −5              −1               0

Considering the period 1994–2005, we see that the actuarial computations for males, individual policies, should be based on general population life tables with the age decreased by 9 years. The corresponding shift for group life policies is reduced to −1 year. For females, the values are −5 years for individual policies, with no adjustment for group life contracts.

Let us now briefly explain another approach to obtain these age shifts. To this end, we assume that the observed number of deaths at age x in calendar year



Figure 3.22. Log-likelihood L in function of the age shift for males (top panels) and females(bottom panels), individual (left) and group (right) life insurance market.

t in the insurance market, D^BFIC_{xt}, is Poisson distributed, with a mean equal to the product of the exposure to risk ETR^BFIC_{xt} of the market multiplied by the population death rate m^HMD_{x+Δ}(t) at age x + Δ. This distributional assumption is not restrictive, as the likelihood (3.12) has been seen to be proportional to a Poisson one. The age shift Δ is then determined by maximizing the likelihood obtained by considering the D^BFIC_{xt}'s as mutually independent, that is, by maximizing the objective function

L(Δ) = ∏_{x,t} exp(−ETR^BFIC_{xt} m^HMD_{x+Δ}(t)) (ETR^BFIC_{xt} m^HMD_{x+Δ}(t))^{D^BFIC_{xt}} / D^BFIC_{xt}!

over Δ by a grid search.

The results are displayed in Fig. 3.22, where all calendar years 1994–2005 are considered. The log-likelihood L = ln L is given as a function of the age shift Δ. We clearly see that the log-likelihoods peak around the age shifts given in Table 3.1.
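The likelihood approach lends itself to the same grid search, now maximizing a Poisson log-likelihood. In the sketch below everything is synthetic and invented for illustration (Gompertz-type rates, flat exposures, and deaths set equal to their expected values under a true shift of −9), so the maximizer is known in advance; it is not the chapter's actual data.

```python
import numpy as np

def loglik(D, ETR, m_pop, ages, years, shift):
    """Poisson log-likelihood of the deaths D[x, t] with mean
    ETR[x, t] * m_pop[x + shift, t], dropping the log(D!) term,
    which is constant in the shift."""
    ll = 0.0
    for x in ages:
        for t in years:
            mu = ETR[(x, t)] * m_pop[(x + shift, t)]
            ll += -mu + D[(x, t)] * np.log(mu)
    return ll

ages, years = range(60, 96), range(1994, 2006)
# Synthetic Gompertz-type population rates with a mild downward time trend
m_pop = {(a, t): 1e-4 * np.exp(0.09 * a) * 0.98 ** (t - 1994)
         for a in range(45, 111) for t in years}
ETR = {(x, t): 1000.0 for x in ages for t in years}
# Deaths equal to their expected values under a true shift of -9
D = {(x, t): ETR[(x, t)] * m_pop[(x - 9, t)] for x in ages for t in years}

best = max(range(-10, 11),
           key=lambda d: loglik(D, ETR, m_pop, ages, years, d))
print(best)  # -> -9
```

Because each Poisson term is maximized when its mean equals the observed count, the log-likelihood peaks exactly at the shift used to generate the data, mirroring the peaks seen in Fig. 3.22.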

3.5 Mortality trends throughout EU

This section compares the Belgian mortality experience with that of several other EU members: Sweden, Italy, Spain, West Germany, France, and England & Wales. Even if the trend is comparable in these countries, regional


differences in death rates produce gaps in life expectancies, especially at retirement age. All the data used to perform the multinational comparison come from the HMD and all the analysis is performed on a period basis.

Figure 3.23 shows the trend in the period life expectancy at birth for males in Sweden, Italy, Spain, West Germany, France, and England & Wales, compared with Belgium. Figure 3.25 is the analogue for females. Figures 3.24 and 3.26 display the period life expectancy at age 65.

Sweden has had a complete and continually updated register of its population for more than two centuries. In general, life expectancies at birth have increased from 1750 to the present. In 1771–1772, a harvest failure led to famine and epidemics, resulting in a drastic reduction of life expectancy at birth. Another increase in mortality took place during the first decade of the 19th century because of the Finnish war of 1808–1809 and related epidemics. The effect of the 1918 Spanish influenza epidemic is also clearly visible. Because Sweden remained neutral during both world wars, life expectancies were minimally affected relative to other European countries. Compared to Belgium, we see that life expectancy at birth is higher in Sweden for both genders, but that the gap tends to narrow over time. Considering age 65, we notice an important difference for males, with a clear advantage for Sweden over Belgium.

Considering the Italian experience, the effect of the Spanish influenza epidemic is clearly visible, as well as the impact of World War II. The speed of longevity improvement seems to be greater in Italy: until the 1950s, life expectancy at birth was higher in Belgium, but this changed in the 1960s. Italy now has a slightly higher life expectancy at birth. The advantage of Italy is even more apparent for life expectancies at age 65.

In Spain, in addition to the marked effect of the 1918 influenza epidemic, the civil war (1936–1939) and the post-war period (1941–1942) caused an important decline in life expectancy at birth. As has been observed with Italy, the Belgian advantage over Spain disappeared in the 1960s. Bigger differences exist for life expectancies at age 65, in favour of Spain.

Let us now consider Germany. Instead of considering the whole country, we restrict ourselves to the territory of the former Federal Republic of Germany (called West Germany), starting after the end of World War II. We see that the mortality trends in Belgium and in West Germany are similar: life expectancies at birth and at age 65 closely agree in these two areas.

The trends in life expectancies at birth in France and Belgium are almost identical, despite the fact that the effect of World War II is more pronounced in France. Note also that the conjunction of World War I and the Spanish flu epidemic had a very strong effect on life expectancies in the second half

Figure 3.23. Life expectancy at birth for males in, from upper left to lower right, Sweden, Italy, Spain, West Germany, France, and England & Wales, compared to Belgium (broken line). Source: HMD.

Figure 3.24. Life expectancy at age 65 for males in, from upper left to lower right, Sweden, Italy, Spain, West Germany, France, and England & Wales, compared to Belgium (broken line). Source: HMD.

Figure 3.25. Life expectancy at birth for females in, from upper left to lower right, Sweden, Italy, Spain, West Germany, France, and England & Wales, compared to Belgium (broken line). Source: HMD.

Figure 3.26. Life expectancy at age 65 for females in, from upper left to lower right, Sweden, Italy, Spain, West Germany, France, and England & Wales, compared to Belgium (broken line). Source: HMD.


of the 1910s. Quite surprisingly, significant differences appear between the French and Belgian life expectancies at age 65, with a clear advantage for France.

Mortality in England and Wales has been significantly influenced by the two world wars as well as by the 1918 flu epidemic. We see that the trends in life expectancies at birth and at retirement age are very similar in Belgium and in England and Wales.

3.6 Conclusions

As clearly demonstrated in this chapter, mortality at adult and old ages reveals decreasing annual death probabilities throughout the 20th century. There is an ongoing debate among demographers about whether human longevity will continue to improve in the future as it has done in the past. Demographers such as Tuljapurkar and Boe (2000) and Oeppen and Vaupel (2002) argue that there is no natural upper limit to the length of human life. The approach that these demographers use is based on an extrapolation of recent mortality trends. The complexity and historical stability of the changes in mortality suggest that the most reliable method of predicting the future is merely to extrapolate past trends. However, this approach has come in for criticism because it ignores factors relating to lifestyle and the environment that might influence future mortality trends. Olshansky et al. (2005) have suggested that future life expectancy might level off or even decline. This debate clearly indicates that there is considerable uncertainty about future trends in longevity.

Mortality improvements are viewed as a positive change for individuals and as a substantial social achievement. Nevertheless, they pose a challenge for the planning of public retirement systems as well as for the private life annuities business. Longevity risk is also a growing concern for companies faced with off-balance-sheet or on-balance-sheet pension liabilities. More generally, all the components of social security systems are affected by mortality trends, and their impact on social welfare, health care, and societal planning has become a more pressing issue. And the threat has now become a reality, as testified by the failure of Equitable Life, the world's oldest life insurance company, in the UK in 2001. Equitable Life sold deferred life annuities with guaranteed mortality rates, but failed to predict the improvements in mortality between the date the life annuities were sold and the date they came into effect.

Despite the fact that the study of mortality has been core to the actuarial profession from the beginning, booming stock markets and high interest


rates and inflation have largely hidden this source of risk. In the recent past, with the lowering of inflation, interest rates, and expected equity returns, mortality risks are no longer obscured.

Low nominal interest rates have made increasing longevity a much bigger issue for insurance companies. When living benefits are concerned, the calculation of expected present values (which are needed in pricing and reserving) requires an appropriate mortality projection in order to avoid underestimation of future costs. This is because mortality trends at adult/old ages reveal decreasing annual death probabilities. In order to protect the company from mortality improvements, actuaries have to resort to life tables including a forecast of the future trends of mortality (the so-called projected tables). The building of such life tables will be the topic of the next chapters.

4 Forecasting mortality: An introduction

4.1 Introduction

This chapter aims at describing various methods proposed by actuaries and demographers for projecting mortality. Many of these have actually been used in the actuarial context, in particular for pricing and reserving in relation to life annuity products and pensions, and in the demographic field, mainly for population projections.

First, the idea of a 'dynamic' approach to mortality modelling is introduced. Then, projection methods are presented, starting from extrapolation procedures which are still widely used in current actuarial practice. More complex methods follow, in particular methods based on mortality laws, on model tables, and on relations between life tables. The Lee–Carter method, recently proposed, and some relevant extensions are briefly introduced, whereas a more detailed discussion, together with some examples of implementation, is presented in Chapters 5 and 6.

The presentation does not follow a chronological order. In order to obtain an insight into the historical evolution of mortality forecasts, the reader should refer to Section 4.9.1, in which some landmarks in the history of dynamic mortality modelling are identified.

Allowing for future mortality trends (and, possibly, for the relevant uncertainty of these trends) is required in a number of actuarial calculations and applications. In particular, actuarial calculations concerning pensions, life annuities, and other living benefits (provided, e.g. by long-term care covers and whole life sickness products) are based on survival probabilities which extend over a long time horizon. To avoid underestimation of the relevant liabilities, the insurance company (or the pension plan) must adopt an appropriate forecast of future mortality, which should account for the most important features of past mortality trends.

Various aspects of mortality trends can be captured by looking at the behaviour, through time, of functions representing the age-pattern of


mortality. The examples discussed in Chapter 3 clearly witness this possibility.

Particular emphasis has been placed by many researchers on the behaviour, for each integer age x, of the quantity qx (i.e. the probability of dying within one year), drawn from a sequence of life tables relating to the same kind of population (e.g. males living in a given country, annuitants of an insurance company, etc.). The graph constructed by plotting qx, for any given age x, against time is usually called the mortality profile. Mortality profiles are often declining, in particular at adult and old ages.

Further, mortality experience over the last decades shows some aspects affecting the shape of curves representing mortality as a function of the attained age, such as the curve of deaths (i.e. the graph of the probability density function of the random lifetime, in the age-continuous setting) and the survival function. In particular (see also Section 2.3.1):

(a) an increasing concentration of deaths around the mode (at old ages) of the curve of deaths is evident; so the graph of the survival function moves towards a rectangular shape, whence the term rectangularization to denote this aspect; see Fig. 3.11 for an actual illustration, and Fig. 4.1(a) for a schematic representation;

(b) the mode of the curve of deaths (which, owing to the rectangularization, tends to coincide with the maximum age ω) moves towards very old ages; this aspect is usually called the expansion of the survival function; see Fig. 3.13 for an actual illustration, and Fig. 4.1(b) for a schematic representation;

(c) higher levels and a larger dispersion of accidental deaths at young ages (the so-called young mortality hump) have been more recently observed; see Fig. 3.2 for an illustration.

Figure 4.1. Mortality trends in terms of the survival function: (a) rectangularization; (b) expansion.


From the above aspects, the need for a dynamic approach to mortality assessment clearly arises. Addressing the age-pattern of mortality as a dynamic entity underpins, from both a formal and a practical point of view, any mortality forecast and hence any projection method.

4.2 A dynamic approach to mortality modelling

4.2.1 Representing mortality dynamics: single-figures versus age-specific functions

When working in a dynamic context (in particular when projecting mortality), the basic idea is to express mortality as a function of the (future) calendar year t. When a single-figure representation of mortality is concerned (see Sections 2.4.1 and 2.4.2), a dynamic model is a real-valued function Ψ(t). For example, the expected lifetime for a newborn, denoted by e0 in a non-dynamic context, is represented by e0(t), a function of the calendar year t (namely the year of birth), when the mortality trend is allowed for. Similarly, the general probability of death in a given population can be represented by a function q(t), where t denotes the calendar year in which the population is considered.

In actuarial calculations, however, age-specific measures of mortality are usually needed. Then, in a dynamic context, mortality is assumed to be a function of both the age x and the calendar year t. In a rather general setting, a dynamic mortality model is a real-valued or a vector-valued function Ψ(x, t). In concrete terms, a real-valued function may represent one-year probabilities of death, mortality odds, the force of mortality, the survival function, some transform of the survival function, etc. This concept has already been introduced in Section 3.3. Further, a vector-valued function would be involved when causes of death are allowed for.

The projected mortality model is given by the restriction Ψ(x, t)|t > t′, where t′ denotes the current calendar year, or possibly the year for which the most recent (reliable) period life table is available. The calendar year t′ is usually called the base year. The projected mortality model (and, in particular, the underlying parameters) is constructed by applying appropriate statistical procedures to past mortality experience.

Although age-specific functions are needed in actuarial calculations, the interest in single-figure indexes as functions of the calendar year should not be underestimated. In particular, important features of past mortality trends can be singled out by focussing on the behaviour of some indexes that


are intended to be markers of the probability distribution of the random lifetime at birth, T0, or at some given age x, Tx (see Section 2.4). In a dynamic context, all such markers should be noted to be functions of the calendar year t, for example, e0(t), σ0(t), ξ(t), etc.

4.2.2 A discrete, age-specific setting

Turning back to age-specific functions, we now assume that both age and calendar year are integers. Hence, Ψ(x, t) can be represented by a matrix whose rows correspond to ages and columns to calendar years. In particular, let Ψ(x, t) = qx(t), where qx(t) denotes the probability of an individual aged x in the calendar year t dying within one year (namely, the one-year probability of death in a dynamic context).

The elements of the matrix (see Table 4.1) can be read according to three arrangements:

(a) a vertical arrangement (i.e. by columns),

q0(t), q1(t), . . . , qx(t), . . .   (4.1)

corresponding to a sequence of period life tables, with each table referring to a given calendar year t;

(b) a diagonal arrangement,

q0(t), q1(t + 1), . . . , qx(t + x), . . .   (4.2)

corresponding to a sequence of cohort life tables, with each table referring to the cohort born in year t;

(c) a horizontal arrangement (i.e. by rows),

. . . , qx(t − 1), qx(t), qx(t + 1), . . .   (4.3)

yielding the mortality profiles, with each profile referring to a given age x.

Table 4.1. One-year probabilities of death in a dynamic context

Age x    . . .   t − 1          t            t + 1          . . .
0        . . .   q0(t − 1)      q0(t)        q0(t + 1)      . . .
1        . . .   q1(t − 1)      q1(t)        q1(t + 1)      . . .
. . .    . . .   . . .          . . .        . . .          . . .
x        . . .   qx(t − 1)      qx(t)        qx(t + 1)      . . .
x + 1    . . .   qx+1(t − 1)    qx+1(t)      qx+1(t + 1)    . . .
. . .    . . .   . . .          . . .        . . .          . . .
ω − 1    . . .   qω−1(t − 1)    qω−1(t)      qω−1(t + 1)    . . .
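The three arrangements are simply three ways of slicing the same matrix. On a small synthetic surface (the numbers below are invented for illustration, with rows indexed by age and columns by calendar year):

```python
import numpy as np

ages = np.arange(0, 6)                  # ages 0..5
years = np.arange(2000, 2006)           # calendar years 2000..2005
# Toy surface q_x(t): increasing in age, declining 2% per calendar year
q = 0.01 * (1 + ages)[:, None] * 0.98 ** (years - 2000)[None, :]

# (a) vertical: the period life table for calendar year 2003 (one column)
period_2003 = q[:, years == 2003].ravel()

# (b) diagonal: the cohort life table for the cohort born in 2000,
#     i.e. q_0(2000), q_1(2001), q_2(2002), ...
cohort_2000 = np.array([q[x, x] for x in ages])

# (c) horizontal: the mortality profile at age 3 (one row)
profile_age3 = q[3, :]
```

Reading along the diagonal steps one year forward in both age and time, which is exactly what following a birth cohort means.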


4.3 Projection by extrapolation of annual probabilities of death

4.3.1 Some preliminary ideas

An extrapolation procedure for mortality simply aims at deriving future mortality patterns (e.g. future probabilities of death) from a database that expresses past mortality experience. The database typically consists of cross-sectional observations and, possibly, (partial) cohort observations. This idea is sketched in Figs. 4.2 and 4.3.

However, a number of points should be addressed. In particular, consider the following:

1. How are the items in the database interpreted? Are they correctly interpreted as observed outcomes of random variables (e.g. frequencies of death), or, conversely, are they simply taken as 'numbers'?

2. The projected table, resulting from the extrapolation procedure, is a two-dimensional array of numbers, providing point estimates of future mortality. How do we get further information, namely, interval estimates?

If the answer to question (1) is 'data are simply numbers', then the extrapolation procedure does not allow for any statistical feature of the information available, such as, for example, the reliability of the data. Conversely,

Figure 4.2. The projected table.

142 4 : Forecasting mortality: An introduction

Figure 4.3. From the data set to the projected table.

when the data are interpreted as the outcomes of random variables, the extrapolation procedure must rely on sound statistical assumptions and, as a consequence, future mortality can be represented in terms of both point and interval estimates (whilst only point estimates can be provided by extrapolation procedures based only on 'numbers').

Various traditional projection methods consist in extrapolation procedures simply based on 'numbers'. First, we will describe these methods which, in spite of several deficiencies, offer a simple and intuitive introduction to mortality forecasts.

Let us assume that several period observations (or 'cross-sectional' observations) are available for a given population (e.g. males living in a country, pensioners who are members of a pension plan, etc.). Each observation consists of the age-pattern of mortality for a given set X of ages, say X = {xmin, xmin + 1, . . . , xmax}. The observation referred to calendar year t is expressed by

{qx(t)}x∈X = {qxmin(t), qxmin+1(t), . . . , qxmax(t)}   (4.4)

Let us focus on the set of observation years T = {t1, t2, . . . , tn}. Then, we assume that the matrix

{qx(t)}x∈X; t∈T = {qx(t1), qx(t2), . . . , qx(tn)}x∈X   (4.5)


Figure 4.4. Extrapolation of the mortality profile.

constitutes the database for mortality projections. Note that each sequence on the right-hand side of (4.5) represents the observed mortality profile at age x.

We assume that the trend observed in past years (i.e. in the set of years T) can be graduated, for example, via an exponential function. Further, we suppose that the observed trend will continue in future years. Then, future mortality can be estimated by extrapolating the trend itself (see Fig. 4.4).

Remark The choice of the set T is a crucial step in building up a mortality projection procedure. Even if a long sequence of cross-sectional observations is available (throughout a time interval of, say, more than 50 years), a choice restricted to recent observations (over, say, 30–50 years) may be more reasonable than the whole set of data. Actually, a very long statistical sequence can exhibit a mortality trend in which recent causes of mortality improvement have a relatively small weight, whereas causes of mortality improvement whose effect should be considered extinguished are still included in the trend itself (see Fig. 4.5). For more information, see Section 5.5. □

Extrapolation of the qx's (namely of the mortality profiles) represents a particular case of the horizontal approach for mortality forecasts (see Fig. 4.6). The horizontal approach can be applied to quantities other than the annual probabilities of death, for example, the mortality odds φx, the central death rates mx, etc.

Adopting the horizontal approach means that extrapolations are performed independently for each qx (or other age-specific quantity), so that


Figure 4.5. Extrapolation results depending on the graduation period.

Figure 4.6. The horizontal approach.

the result is a function ψx(t) for each age x. This may lead to inconsistencies with regard to the projected age-pattern of mortality, as we will see in Section 4.5.3.

4.3.2 Reduction factors

As far as future mortality is concerned, let us express the relation between the probability of death at age x, referred to a given year t′ (e.g. t′ = tn) and to a generic year t (t > t′) respectively, as follows:

qx(t) = qx(t′) Rx(t − t′)   (4.6)

4.3 Projection by extrapolation of annual probabilities of death 145

The quantity Rx(t − t′) is called the variation factor (and usually the reduction factor, as it is expected to be less than 1 because of the prevailing downward trends in probabilities of death) at age x for the interval (t′, t).

A simplification can be obtained by assuming that the reduction factor does not depend on the age x, that is, by assuming for all t and x

Rx(t − t′) = R(t − t′)   (4.7)

Mortality forecasts can then be obtained through an appropriate modelling procedure applied to the reduction factor. The structure as well as the parameters of Rx(t − t′) should be carefully chosen. Then, projected mortality will be obtained via (4.6) (provided that we assume that the observed trend, on which the reduction factors are based, will continue in the future).

Remark The approach to projection by extrapolation which we are describing is based on a mathematical formula, namely, the formula for the reduction factor (examples are provided in Sections 4.3.3–4.3.8). Conversely, extrapolation may be based on a graphical method. The graphical approach to extrapolation consists in drawing, for each age x, a smooth curve representing the past trend in probabilities of death, assumed to continue after the calendar year t′, and then reading the projected probabilities from the extrapolated part of the curve. □

4.3.3 The exponential formula

Let us suppose that the observed mortality profiles are such that the behaviour over time of the logarithms of the qx's is, for each age x, approximately linear (see Fig. 4.7). Then, we can find a value δx such that, for h = 1, 2, . . . , n − 1, we have approximately:

ln qx(th+1) − ln qx(th) ≈ −δx (th+1 − th)   (4.8)

Hence

qx(th+1) / qx(th) ≈ e^(−δx (th+1 − th))   (4.9)

or, defining rx = e^(−δx):

qx(th+1) / qx(th) ≈ rx^(th+1 − th)   (4.10)

Assume that, for each age x, the parameter δx (or rx) is estimated, for example via a least squares procedure. The graduated probabilities q̂x(t) can then be calculated. The constraint q̂x(tn) = qx(tn) is usually applied in the estimation procedure.
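A least-squares fit of the slope in (4.8), with the graduated values anchored at the last observation, might be sketched as follows. The data below are synthetic, with a true decline of 2% per year, and the helper name is invented for the example:

```python
import numpy as np

def fit_delta(years, q_obs):
    """Fit ln q_x(t) ~ a - delta_x * t by least squares (cf. (4.8)), then
    shift the fitted line so that the graduated value reproduces the last
    observation, i.e. the constraint q-hat_x(t_n) = q_x(t_n)."""
    years = np.asarray(years, dtype=float)
    logq = np.log(np.asarray(q_obs, dtype=float))
    slope, intercept = np.polyfit(years, logq, 1)
    fitted = intercept + slope * years
    fitted += logq[-1] - fitted[-1]        # anchor at the last observation
    return -slope, np.exp(fitted)          # (delta_x, graduated q's)

t = np.arange(1980, 2001)
rng = np.random.default_rng(1)
q_obs = 0.02 * np.exp(-0.02 * (t - 1980) + rng.normal(0.0, 0.01, t.size))
delta, q_grad = fit_delta(t, q_obs)
print(round(delta, 3))  # close to the true value 0.02
```

In practice the fit would be repeated independently for each age x, which is precisely the horizontal approach described earlier.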


Figure 4.7. Behaviour of the qx's along time.

Relation (4.10) suggests a natural extrapolation formula. Set t′ = tn, and assume for t > t′:

qx(t) = qx(t′) rx^(t − t′)   (4.11)

from which we can express the reduction factor as follows:

Rx(t − t′) = rx^(t − t′) = e^(−δx (t − t′))   (4.12)

The extrapolation formula (4.11) (as well as, for instance, formula (4.17) in Section 4.3.5) originates from the analysis of the mortality profiles, and hence constitutes an example of the horizontal approach.

4.3.4 An alternative approach to the exponential extrapolation

For the calculation of the parameters rx (or δx), procedures other than least squares estimation can be used. An example follows.

Suppose, as above, that n period tables are available. For each age x and for h = 1, 2, . . . , n − 1, calculate the quantities rx^(h) as follows:

rx^(h) = [qx(th+1) / qx(th)]^(1/(th+1 − th))   (4.13)

Then, for each x, we calculate rx as the weighted geometric average of the quantities rx^(h):

rx = ∏_{h=1}^{n−1} (rx^(h))^wh   (4.14)

The weights must, of course, fulfill the conditions: wh ≥ 0, h = 1, 2, . . . , n − 1; ∑_{h=1}^{n−1} wh = 1.

4.3 Projection by extrapolation of annual probabilities of death 147

Each weight wh should be chosen in a way that reflects both the length of the time interval between observations and the statistical reliability attaching to the observations themselves. Trivially, if we set wh = (th+1 − th)/(tn − t1) for all h, only the lengths of the time intervals are accounted for, and so expression (4.14) reduces to

rx = [qx(tn) / qx(t1)]^(1/(tn − t1))   (4.15)

so that rx is determined only by the first and last values of qx(t) in the past data.
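The weighted geometric average (4.14), and its collapse to the endpoint formula (4.15) under interval-length weights, can be checked directly. The numbers and the helper name below are invented for the sketch:

```python
import numpy as np

def annual_reduction_factor(years, q_profile, weights=None):
    """r_x as the weighted geometric average (4.14) of the one-step
    ratios (4.13); with weights proportional to the interval lengths
    the result collapses to the endpoint formula (4.15)."""
    years = np.asarray(years, dtype=float)
    q = np.asarray(q_profile, dtype=float)
    gaps = np.diff(years)
    r_h = (q[1:] / q[:-1]) ** (1.0 / gaps)          # eq. (4.13)
    if weights is None:
        weights = gaps / (years[-1] - years[0])     # length-based weights
    return float(np.prod(r_h ** weights))           # eq. (4.14)

years = [1980, 1985, 1995, 2000]
q_profile = [0.020, 0.018, 0.015, 0.013]
r = annual_reduction_factor(years, q_profile)
# with length-based weights this equals (q(2000)/q(1980))**(1/20), eq. (4.15)
print(round(r, 4))  # -> 0.9787
```

Passing explicit weights instead lets the intermediate observations matter, for instance to down-weight years with small exposures.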

4.3.5 Generalizing the exponential formula

Let us turn back to the exponential formula. From (4.11) it follows that, if rx < 1, then

qx(∞) = 0   (4.16)

where qx(∞) = lim_{t→+∞} qx(t). Although the validity of mortality forecasts should be restricted to a limited time interval, it may be more realistic to assign a positive limit to the mortality at any age x. To this purpose, the following formula with an assigned asymptotic mortality can be adopted:

qx(t) = qx(t′) [αx + (1 − αx) rx^(t − t′)]   (4.17)

where αx ≥ 0 for all x; see Fig. 4.8. The reduction factor is thus given by

Rx(t − t′) = αx + (1 − αx) rx^(t − t′)   (4.18)

Clearly, (4.17) is a generalization of (4.11). From (4.17) we have:

qx(∞) = αx qx(t′)   (4.19)

The exponential formula expressed by equation (4.17) can be simplified by assuming that rx = r for all x, from which we obtain:

qx(t) = qx(t′) [αx + (1 − αx) r^(t − t′)]   (4.20)

Although the mortality decline is not necessarily uniform across a given (wide) age range, this assumption can be reasonable when a limited set of ages is involved in the mortality forecast. This would be the case for mortality projections concerning annuitants or pensioners. In any case, some flexibility is provided by the parameters αx.


Figure 4.8. Asymptotic mortality in exponential formulae.

4.3.6 Implementing the exponential formula

An alternative version of the exponential formula (4.17) can help in directly assigning estimates to the parameters rx. Without loss of generality, we address the simplified structure represented by equation (4.20), so that r is independent of the age x.

The total (asymptotic) mortality decline, from time t′ on, is given by qx(t′) − qx(∞), whereas the decline in the first m years is given by qx(t′) − qx(t′ + m). Let us define the ratio fx(m) as follows:

fx(m) = (qx(t′) − qx(t′ + m)) / (qx(t′) − qx(∞))   (4.21)

Then fx(m) is the proportion of the total mortality decline assumed to occur by time m. Dividing both numerator and denominator by qx(t′), we obtain:

fx(m) = (1 − Rx(m)) / (1 − Rx(∞)) = (1 − αx)(1 − r^m) / (1 − αx) = 1 − r^m   (4.22)

Note that, since we have assumed rx = r for all x, we have fx(m) = f(m). Hence

r = (1 − f(m))^(1/m)   (4.23)

The choice of the couple (m, f(m)) unambiguously determines the parameter r. Finally, we have

Rx(t − t′) = αx + (1 − αx) (1 − f(m))^((t − t′)/m)   (4.24)

For example, if we assume that 60% of the total mortality decline occurs in the first 20 years, we set (m, f(m)) = (20, 0.60), and so r = 0.40^(1/20) = 0.9552.


4.3.7 A general exponential formula

The exponential formulae discussed in Sections 4.3.3 and 4.3.5 can be placed in a more general context. We assume the following expression for the annual probability of death:

qx(t) = ax + bx cx^t   (4.25)

in which the parameters ax, bx, cx depend on age x and are independent of the calendar year t. Thus, qx(t) is an exponential function of t. Equation (4.25) then represents a general exponential formula for projections via extrapolation.

The projection formulae which are currently used in actuarial practice constitute particular cases of formula (4.25). For instance, with ax = 0, bx = qx(t′) rx^(−t′), cx = rx, we obtain formula (4.11). With ax = αx qx(t′), bx = (1 − αx) qx(t′) rx^(−t′), cx = rx, we find the more general formula (4.17).

The projection formula

qx(t) = qx(t′) a_{x+b}^(t − t′)   (4.26)

(called the Sachs formula), where a and b are constants and a_{x+b}^(t − t′) represents the reduction factor, also constitutes a particular case of (4.25), as can be easily proved.

Note that formulae (4.11) and (4.17) (and some related expressions) explicitly refer to the base year t′ (usually related to the most recent observation, that is, t′ = tn). Conversely, formula (4.25) as well as other formulae presented in Section 4.3.9 do not explicitly address a fixed calendar year. Nonetheless, a link with a given calendar year can be introduced via parameters, as illustrated, for example, by formula (4.26).

4.3.8 Some exponential formulae used in actuarial practice

Exponential formulae have been widely used in actuarial practice. Implementations of these formulae can be found, for instance, in the USA, Great Britain, Germany and Austria. Some examples follow.

Example 4.1 In the UK, formula (4.11) was used for forecasting the mortality of life office pensioners and annuitants; see CMIB (1978). In particular, a simplified version with the same reduction factor at all ages (see (4.7)) was implemented, that is,

$$q_x(t) = q_x(t')\, r^{t-t'} \quad (4.27)$$


The approximation was considered acceptable from age x = 60 upwards. □

Example 4.2 Formula (4.20) was also proposed in the UK; see CMIB (1990). The reduction factor $R_x(t-t')$, with t′ = 1980 as the base year, is given by

$$R_x(t-t') = \alpha_x + (1 - \alpha_x)(1 - f)^{(t-t')/20} \quad (4.28)$$

with f = f(20) = 0.60 [see formula (4.24)] and:

$$\alpha_x = \begin{cases} 0.50 & \text{if } x < 60 \\ \dfrac{x - 10}{100} & \text{if } 60 \le x \le 110 \\ 1 & \text{if } x > 110 \end{cases} \quad (4.29)$$

It is easy to see that, for any year t, the reduction factor increases (i.e. the mortality improvement reduces) linearly with increasing age, from $0.50 + 0.50\,(0.40)^{(t-t')/20}$ at age 60 and below, to unity at age 110 and above.

For any given age x, the rate of improvement decreases as t increases. Further, following the analysis in Section 4.3.6, it is easy to prove that expression (4.28) for the reduction factor, with f = 0.60, implies that 60% of the total (asymptotic) mortality improvement (at any age x) is assumed to occur in the first 20 years. □
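The CMIB (1990) reduction factor of Example 4.2 can be coded directly from (4.28) and (4.29); the snippet below is an illustrative sketch, not the Bureau's software:

```python
# Sketch of the CMIB (1990) reduction factor, formulae (4.28)-(4.29);
# the piecewise alpha_x and f = 0.60 are taken from the text.

def alpha_cmib90(x):
    if x < 60:
        return 0.50
    if x <= 110:
        return (x - 10) / 100.0
    return 1.0

def R_cmib90(x, s, f=0.60, m=20):
    """R_x(s) = alpha_x + (1 - alpha_x)(1 - f)^(s/m), with s = t - t'."""
    a = alpha_cmib90(x)
    return a + (1.0 - a) * (1.0 - f) ** (s / m)

# The factor is continuous in age at x = 60 and reaches unity at x = 110:
print(round(R_cmib90(60, 20), 4))   # 0.5 + 0.5*0.40 = 0.7
print(R_cmib90(110, 20))            # 1.0
```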

Example 4.3 A recent implementation of formula (4.17) by the Continuous Mortality Investigation Bureau is as follows (see CMIB (1999)). In this case, the reduction factor is given by

$$R_x(t-t') = \alpha_x + (1 - \alpha_x)(1 - f_x)^{(t-t')/20} \quad (4.30)$$

The functions $\alpha_x$, $f_x$ have been chosen as follows:

$$f_x = \begin{cases} c & \text{if } x < 60 \\ 1 + (1 - c)\,\dfrac{x - 110}{50} & \text{if } 60 \le x \le 110 \\ 1 & \text{if } x > 110 \end{cases} \quad (4.31)$$

$$\alpha_x = \begin{cases} h & \text{if } x < 60 \\ \dfrac{(110 - x)\,h + (x - 60)\,k}{50} & \text{if } 60 \le x \le 110 \\ k & \text{if } x > 110 \end{cases} \quad (4.32)$$

where c = 0.13, h = 0.55, and k = 0.29. Parameters have been adjusted so that t′ = 1992 is the base year. □
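A sketch of the CMIB (1999) functions (4.31) and (4.32), useful for checking that $f_x$ and $\alpha_x$ are continuous at the breakpoints x = 60 and x = 110 (the code is illustrative, not the Bureau's implementation):

```python
# Sketch of the CMIB (1999) choice of f_x and alpha_x, formulae (4.31)-(4.32),
# with c = 0.13, h = 0.55, k = 0.29 as stated in the text.

C, H, K = 0.13, 0.55, 0.29

def f_cmib99(x):
    if x < 60:
        return C
    if x <= 110:
        return 1.0 + (1.0 - C) * (x - 110) / 50.0
    return 1.0

def alpha_cmib99(x):
    if x < 60:
        return H
    if x <= 110:
        return ((110 - x) * H + (x - 60) * K) / 50.0
    return K

def R_cmib99(x, s, m=20):
    """R_x(s) = alpha_x + (1 - alpha_x)(1 - f_x)^(s/m)."""
    a = alpha_cmib99(x)
    return a + (1.0 - a) * (1.0 - f_cmib99(x)) ** (s / m)

# Both functions are continuous at the breakpoints x = 60 and x = 110:
print(round(f_cmib99(60), 4), round(alpha_cmib99(60), 4))    # 0.13, 0.55
print(round(f_cmib99(110), 4), round(alpha_cmib99(110), 4))  # 1.0, 0.29
```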


Example 4.4 An exponential formula has also been used in the United States. The Society of Actuaries published the 1994 probabilities of death as the base table and the annual improvement factors $1 - r_x$; see Group Annuity Valuation Table Task Force (1995). The projected probabilities of death are determined as follows:

$$q_x(t) = q_x(1994)\, r_x^{\,t-1994} \quad (4.33)$$

The parameter $r_x$ varies from 0.98 to 1, being equal to 1 for x > 100, for both males and females. □

4.3.9 Other projection formulae

Mortality improvements resulting from observed data may suggest assumptions other than the exponential decline of annual probabilities of death. Thus, a formula different from the exponential one (see (4.25)) can be used to express the probabilities $q_x(t)$. Conversely, the exponential formula can be used to express other life table functions, or a transform of a life table function, such as the odds $\varphi_x(t)$.

Below we present some formulae which have been suggested or used inapplications:

$$q_x(t) = a_x + \frac{b_x}{t} \quad (4.34)$$

$$q_x(t) = \sum_{h=0}^{p} a_{x,h}\, t^h \quad (4.35)$$

$$q_x(t) = \frac{e^{G_x(t)}}{1 + e^{G_x(t)}} \quad (4.36)$$

where $G_x(t)$ is, for each age x, a polynomial in t, that is,

$$G_x(t) = \sum_{h=0}^{p} c_{x,h}\, t^h \quad (4.37)$$

Some comments about these formulae follow. Formula (4.35) with p = 1 represents the linear extrapolation method:

$$q_x(t) = a_{x,0} + a_{x,1}\, t \quad (4.38)$$

with $a_{x,1} < 0$ to express mortality decline. This formula is not usually adopted because of its obvious disadvantage that for large t a negative probability is predicted. The polynomial extrapolation formula (4.35) with p = 3 is called the Esscher formula.


Referring to formula (4.36), note that it can be expressed as follows:

$$\ln \frac{q_x(t)}{p_x(t)} = G_x(t) \quad (4.39)$$

If observed mortality improvements suggest a linear behaviour of the logarithms of the odds, and thus an exponential behaviour of the odds, then we can use formula (4.36) with

$$G_x(t) = c_{x,0} + c_{x,1}\, t \quad (4.40)$$

and so we have the following expression:

$$q_x(t) = \frac{e^{c_{x,0} + c_{x,1} t}}{1 + e^{c_{x,0} + c_{x,1} t}} \quad (4.41)$$

4.4 Using a projected table

4.4.1 The cohort tables in a projected table

A projected mortality table is a rectangular matrix $\{q_x(t)\}_{x \in X;\, t \ge t'}$, where t′ is the base year. The appropriate use of the projected table requires that, in each year t, probabilities concerning the lifetime of a person age x in that year are derived from the diagonal

$$q_x(t),\; q_{x+1}(t+1),\; \dots \quad (4.42)$$

that is, from the relevant cohort table (see also Section 4.2.2). Then, the probability of a person age x in year t being alive at age x + k is given by:

$$_{k}p_x^{\nearrow}(t) = \prod_{j=0}^{k-1} \left[1 - q_{x+j}(t+j)\right] \quad (4.43)$$

where the superscript ↗ recalls that we are working along a diagonal band in the Lexis diagram (see Section 3.3, and Fig. 3.1 in particular), or, similarly, along a diagonal of the matrix in Table 4.1, with the proviso that the ordering of the lines is inverted. Note that explicit reference to the year of birth τ is omitted, as this is trivially given by τ = t − x.
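As a minimal sketch, the diagonal product (4.43) and the corresponding period product can be computed on a toy projected table; the table below (Gompertz-type base probabilities with a 2% annual decline) is an assumption of this example, not data from the book.

```python
# Illustrative projected table {q_x(t)}: q_x(t) = q_x(t') * 0.98^(t - t'),
# with a Gompertz-like base pattern; all values are toy assumptions.

T0 = 2000
ages = range(60, 121)
q = {(x, T0 + s): min(1.0, 0.004 * 1.1 ** (x - 60) * 0.98 ** s)
     for x in ages for s in range(0, 61)}

def kp_cohort(x, t, k):
    """Survival along the diagonal q_x(t), q_{x+1}(t+1), ... (formula (4.43))."""
    p = 1.0
    for j in range(k):
        p *= 1.0 - q[(x + j, t + j)]
    return p

def kp_period(x, t, k):
    """Survival along the column of year t (the period table)."""
    p = 1.0
    for j in range(k):
        p *= 1.0 - q[(x + j, t)]
    return p

# With declining mortality the cohort (diagonal) survival exceeds the period one:
print(kp_cohort(65, 2000, 20) > kp_period(65, 2000, 20))   # True
```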

For example, to calculate, in the calendar year t, the expected remaining lifetime of an individual age x in that year, the following formula should be adopted, rather than formula (2.65) (which relies on the assumption of unchanging mortality after the observation period from which the life table was drawn):

$$\mathring{e}_x^{\nearrow}(t) = \sum_{k=1}^{\omega - x} {}_{k}p_x^{\nearrow}(t) + \frac{1}{2} \quad (4.44)$$

The quantity $\mathring{e}_x^{\nearrow}(t)$ is usually called the (complete) cohort life expectancy for a person age x in year t. If a decline in future mortality is expected (and hence represented by the projected cohort table), the following inequality holds:

$$\mathring{e}_x^{\nearrow}(t) > \mathring{e}_x \quad (4.45)$$

where $\mathring{e}_x$ denotes the period life expectancy (see Section 2.4.3).

Note that, in a dynamic framework, the period life expectancy should be denoted as follows:

$$\mathring{e}_x^{\uparrow}(t) = \sum_{k=1}^{\omega - x} {}_{k}p_x^{\uparrow}(t) + \frac{1}{2} \quad (4.46)$$

with

$$_{k}p_x^{\uparrow}(t) = \prod_{j=0}^{k-1} \left[1 - q_{x+j}(t)\right] \quad (4.47)$$

where the superscript ↑ recalls that we are working along a vertical band in the Lexis diagram, or, similarly, along a column of the matrix in Table 4.1.
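A sketch comparing the two life expectancies (4.44) and (4.46) on an illustrative projected table (the exponential-decline toy assumption below is this example's, not the book's):

```python
# Cohort vs period complete life expectancies on a toy projected table.

OMEGA = 120
T0 = 2000

def q(x, t):
    # illustrative projected probabilities: Gompertz-like base, 2% annual decline
    return min(1.0, 0.004 * 1.1 ** (x - 60) * 0.98 ** (t - T0))

def e_cohort(x, t):
    """Complete cohort life expectancy, formula (4.44)."""
    total, p = 0.0, 1.0
    for k in range(1, OMEGA - x + 1):
        p *= 1.0 - q(x + k - 1, t + k - 1)
        total += p
    return total + 0.5

def e_period(x, t):
    """Complete period life expectancy, formula (4.46)."""
    total, p = 0.0, 1.0
    for k in range(1, OMEGA - x + 1):
        p *= 1.0 - q(x + k - 1, t)
        total += p
    return total + 0.5

# Inequality (4.45): a projected decline makes the cohort expectancy larger.
print(e_cohort(65, 2000) > e_period(65, 2000))   # True
```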

The same cohort-based approach should be adopted to calculate actuarial values of life annuities, for both pricing and reserving. Hence, various cohort tables should be used simultaneously, according to the year of birth of the individuals addressed in the calculations.

4.4.2 From a double-entry to a single-entry projected table

From a strictly practical point of view, the simultaneous use of various cohort tables may have some disadvantages. Moreover, probabilities concerning people with the same age x at policy issue vary according to the issue year t. These disadvantages have often led, in actuarial practice, to the adoption of a single-entry table only, throughout a period of some (say 5, or 10) years. The single-entry table must be drawn, in some way, from the projected double-entry table.

Single-entry tables can be derived, in particular, as follows (see alsoFig. 4.9):

(1) A birth year τ is chosen, and only the cohort table pertaining to the generation born in year τ is addressed; so, the probabilities

$$q_{x_{\min}}(\tau + x_{\min}),\; q_{x_{\min}+1}(\tau + x_{\min} + 1),\; \dots,\; q_x(\tau + x),\; \dots \quad (4.48)$$


Figure 4.9. Two approaches to the choice of a single-entry projected table.

where $x_{\min}$ denotes the minimum age of interest, are used in actuarial calculations. Thus, just one diagonal of the matrix $\{q_x(t)\}$ is actually used. The choice of τ should reflect the average year of birth of the annuitants or pensioners to whom the table is referred.

(2) A (future) calendar year t is chosen, and only the projected period table referring to year t is addressed; and so the probabilities

$$q_{x_{\min}}(t),\; q_{x_{\min}+1}(t),\; \dots,\; q_x(t),\; \dots \quad (4.49)$$

are adopted in actuarial calculations. Thus, just one column of the matrix is used. The choice of t should be broadly appropriate to the mix of life annuity business in force over the medium-term future.

Following approach (1), and using the superscript [τ]↗ to denote reference to the cohort table for the generation born in year τ, the probability of being alive at age x + k is given (for any year of birth τ = t − x) by

$$_{k}p_x^{[\tau]\nearrow} = \prod_{j=0}^{k-1} \left[1 - q_{x+j}(\tau + x + j)\right] \quad (4.50)$$

Adopting approach (2), and denoting by [t]↑ the reference to the period table for year t, the probability of being alive at age x + k is conversely given (for any year of birth τ = t − x) by

$$_{k}p_x^{[t]\uparrow} = \prod_{j=0}^{k-1} \left[1 - q_{x+j}(t)\right] \quad (4.51)$$

Of course, both approaches lead to biased evaluations. Notwithstanding this deficiency, approach (1) can be ‘adjusted’ to reduce such a bias. A common adjustment is described in the following section.

4.4.3 Age shifting

For people born in year τ = t − x, the probabilities (4.43) (which are related to the year of birth τ) should be used, whereas approach (1) leads to the use of probabilities (4.50), which are independent of the actual year of birth. To reintroduce a dependence on τ, at least to some extent, we use the following probabilities:

$$q_{x_{\min}+h(\tau)}(\bar\tau + x_{\min} + h(\tau)),\; q_{x_{\min}+1+h(\tau)}(\bar\tau + x_{\min} + 1 + h(\tau)),\; \dots,\; q_{x+h(\tau)}(\bar\tau + x + h(\tau)),\; \dots \quad (4.52)$$

where $\bar\tau$ denotes the birth year chosen within approach (1). Note that all the probabilities involved belong to the same diagonal referred to within approach (1).

This adjustment (often called Rueff's adjustment) involves an age shift of h(τ) years. Assuming a mortality decline, the function h(τ) must satisfy the following relations:

$$h(\tau) \begin{cases} \ge 0 & \text{for } \tau < \bar\tau \\ = 0 & \text{for } \tau = \bar\tau \\ \le 0 & \text{for } \tau > \bar\tau \end{cases} \quad (4.53)$$

The survival probability is then calculated as follows (instead of using formula (4.50)):

$$_{k}p_x^{[\bar\tau;\, h(\tau)]\nearrow} = \prod_{j=0}^{k-1} \left[1 - q_{x+h(\tau)+j}(\bar\tau + x + h(\tau) + j)\right] \quad (4.54)$$

where the superscript also recalls the age shift. Probabilities given by formula (4.54) can be adopted to approximate the cohort life expectancy (see Section 4.4.1) as well as actuarial values of life annuities.


Table 4.2. Age-shifting function (table TPRV; $\bar\tau$ = 1950; i = 0)

τ             h(τ)
1901–1910       5
1911–1920       4
1921–1929       3
1930–1937       2
1938–1946       1
1947–1953       0
1954–1960      −1
1961–1967      −2
1968–1975      −3
1976–1984      −4
≥ 1985         −5

As regards the determination of the age-shift function h(τ), various criteria can be adopted. We just mention that most criteria are based on the analysis of the actuarial values of life annuities calculated using the appropriate probabilities, given by (4.43), and, respectively, the probabilities (4.54), with the aim of minimizing a (conveniently defined) ‘distance’ between the two sets of actuarial values. When a criterion of this type is adopted, the function h(τ) depends on the interest rate used in calculating the actuarial values.

Example 4.5 Table 4.2 shows the age-shifting function used in connection with the French projected table TPRV. The interest rate assumed for the construction of the function is i = 0. □
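The TPRV age-shifting function of Table 4.2 can be coded as a simple step function of the year of birth; the sketch below transcribes the table (mapping years before 1901 to the first band is an assumption of this sketch):

```python
# Age-shifting function h(tau) transcribed from Table 4.2 (table TPRV).
# Each entry is (upper birth year of the band, shift).

_BANDS = [(1910, 5), (1920, 4), (1929, 3), (1937, 2), (1946, 1),
          (1953, 0), (1960, -1), (1967, -2), (1975, -3), (1984, -4)]

def h(tau):
    """Age shift h(tau); cohorts born in or after 1985 get -5."""
    for upper, shift in _BANDS:
        if tau <= upper:
            return shift
    return -5

print(h(1905), h(1950), h(1990))   # 5 0 -5
```

A person born in 1905 is thus treated as 5 years older within the reference cohort table, consistent with the sign conditions (4.53).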

Remark It is worth noting that adjustments via an age-shifting mechanism are rather common in life insurance actuarial technique. For example, an increment in the insured's age is often used to account for the effects of impairments on the age-pattern of mortality; see Section 2.9.1 and, in particular, formulae (2.119) and (2.120). A further example of age-shifting in the context of mortality projections is presented in Section 4.5.1 (see Example 4.8). □

4.5 Projecting mortality in a parametric context

4.5.1 Mortality laws and projections

When a mortality law is used to fit observed data, the age-pattern of mortality is summarized by some parameters (see Section 2.5). Then, the projection procedure can be applied to the set of parameters (instead of the set of age-specific probabilities), with a dramatic reduction in the dimension of the forecasting problem, namely in the number of ‘degrees of freedom’.

Consider a law, for example, describing the force of mortality:

$$\mu_x = \varphi(x; \alpha, \beta, \dots) \quad (4.55)$$

In a dynamic context, the calendar year t enters the model via its parameters:

$$\mu_x(t) = \varphi(x; \alpha(t), \beta(t), \dots) \quad (4.56)$$

Let $T = \{t_1, t_2, \dots, t_n\}$ denote the set of observation years. Hence, for a given set X of ages, the data base is represented by the set of observed values

$$\{\mu_x(t)\}_{x \in X;\, t \in T} = \{\mu_x(t_1), \mu_x(t_2), \dots, \mu_x(t_n)\}_{x \in X} \quad (4.57)$$

For each calendar year $t_h$, we estimate the parameters to fit the model

$$\mu_x(t_h) = \varphi(x; \alpha_h, \beta_h, \dots) \quad (4.58)$$

(e.g. via least squares, minimum $\chi^2$, or maximum likelihood), so that a set of n functions of age x is obtained

$$\{\mu_x(t_1), \mu_x(t_2), \dots, \mu_x(t_n)\} \quad (4.59)$$

Trends in the parameters are then graduated via some mathematical formula, and hence a set of functions of time t is obtained:

$$\alpha_1, \alpha_2, \dots, \alpha_n \Rightarrow \alpha(t)$$
$$\beta_1, \beta_2, \dots, \beta_n \Rightarrow \beta(t)$$
$$\dots$$

(see Fig. 4.10).

It is worth noting that the above projection procedure follows a vertical approach to mortality forecasting, as the parameters of the chosen law are estimated for each period table based on the experienced mortality (see Fig. 4.11).

Conversely, a diagonal approach can be adopted, starting from parameter estimation via a cohort graduation (see Fig. 4.12). In this case, the parameters depend on the year of birth τ:

$$\mu_x(\tau) = \varphi(x; \gamma(\tau), \delta(\tau), \dots) \quad (4.60)$$


Figure 4.10. Projection in a parametric framework.

Figure 4.11. The vertical approach.

For each year of birth $\tau_h$, h = 1, 2, ..., m, we estimate the parameters to fit the model

$$\mu_x(\tau_h) = \varphi(x; \gamma_h, \delta_h, \dots) \quad (4.61)$$

so that a set of m functions of age x is obtained

$$\{\mu_x(\tau_1), \mu_x(\tau_2), \dots, \mu_x(\tau_m)\} \quad (4.62)$$


Figure 4.12. The diagonal approach.

Trends in the parameters are then graduated via some mathematical formula, and hence a set of functions of time τ is obtained:

$$\gamma_1, \gamma_2, \dots, \gamma_m \Rightarrow \gamma(\tau)$$
$$\delta_1, \delta_2, \dots, \delta_m \Rightarrow \delta(\tau)$$
$$\dots$$

Example 4.6 A Makeham law (see (2.70)), representing mortality dynamics according to the vertical approach, can be defined as follows:

$$\mu_x(t) = A(t) + B(t)\, c(t)^x \quad (4.63)$$

where t represents the calendar year.

When the diagonal approach is adopted, the dynamic Makeham law is defined as follows:

$$\mu_x(\tau) = A(\tau) + B(\tau)\, c(\tau)^x \quad (4.64)$$

where τ = t − x denotes the year of birth of the cohort. □
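A sketch of the vertical approach for a dynamic law of this family, using the Gompertz special case A(t) = 0 (an assumption of this example, chosen so that each period fit reduces to a linear regression of ln µ on age); the synthetic data are illustrative:

```python
# Step 1 of the vertical approach: fit mu_x(t_h) = B_h * c_h^x for each
# observation year by least squares on ln mu_x = ln B + x ln c.
# Step 2 (indicated in a comment): graduate and extrapolate the fitted
# parameter series over calendar time. All data below are synthetic.

import math

def fit_gompertz(ages, mus):
    """Least-squares fit of ln mu_x = ln B + x ln c; returns (B, c)."""
    n = len(ages)
    y = [math.log(m) for m in mus]
    xbar = sum(ages) / n
    ybar = sum(y) / n
    sxx = sum((x - xbar) ** 2 for x in ages)
    slope = sum((x - xbar) * (yy - ybar) for x, yy in zip(ages, y)) / sxx
    intercept = ybar - slope * xbar
    return math.exp(intercept), math.exp(slope)

ages = list(range(60, 91))
# Synthetic period data: the level B declines over the years, c is stable.
tables = {t: [0.00002 * 0.99 ** (t - 2000) * 1.11 ** x for x in ages]
          for t in (2000, 2005, 2010)}

fits = {t: fit_gompertz(ages, mus) for t, mus in tables.items()}
# Step 2 would graduate B(t) (e.g. as an exponential decline) and
# extrapolate it to future calendar years, keeping c(t) constant here.
for t, (B, c) in sorted(fits.items()):
    print(t, B, c)
```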

Example 4.7 In some law-based projection models it has been assumed that the age-pattern of mortality is represented by one of the Heligman–Pollard laws (see (2.83) to (2.87)), and that various relevant parameters are functions of the calendar year. Thus, according to a vertical approach, functions A(t), B(t), C(t), ... are used to express the dependency of the age-pattern of mortality on the calendar year t. □

Example 4.8 We assume that, for each past calendar year t, the odds $\varphi_x(t) = q_x(t)/p_x(t)$ are graduated using (2.81). Then, we have

$$\varphi_x(t) = e^{P_x(t)} \quad (4.65)$$

where $P_x(t)$ denotes, for each t, a polynomial in x. Further, we assume that the odds are extrapolated, for t > t′, via an exponential formula, that is,

$$\varphi_x(t) = \varphi_x(t')\, r^s \quad (4.66)$$

where s = t − t′ and r < 1.

As far as the age-pattern of mortality in the base year t′ is concerned, we assume:

$$P_x(t') = \alpha + \beta x \quad (4.67)$$

Then, from (4.66) we have:

$$\ln \varphi_x(t) = \alpha + \beta x + s \ln r \quad (4.68)$$

Defining

$$w = -\frac{\ln r}{\beta} \quad (4.69)$$

we finally obtain:

$$\ln \varphi_x(t) = \alpha + \beta\,(x - ws) = P_{x-ws}(t') \quad (4.70)$$

By assumption r < 1, and, given the behaviour of the probabilities $q_x(t')$ and $p_x(t')$ as functions of the age x, it is sensible to suppose β > 0. Then we find w > 0. Hence, a constant reduction factor applied to the odds leads to an age reduction w for each of the s projection years. If this result is transferred from the odds $\varphi_x(t)$ to the probabilities $q_x(t)$, we have approximately:

$$q_x(t) \approx q_{x-ws}(t') \quad (4.71)$$

Formulae (4.70) and (4.71) provide examples of approximate evaluation via age-shifting. See also the Remark in Section 4.4.3. □
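The equivalence (4.70) can be checked numerically; the parameter values below are illustrative:

```python
# Check of Example 4.8: with log-odds linear in age in the base year and an
# exponential odds reduction, projection equals an age shift of
# w = -ln r / beta per projection year. Parameter values are illustrative.

import math

alpha, beta, r = -12.0, 0.11, 0.98
w = -math.log(r) / beta

def log_odds(x, s):
    """ln phi_x(t' + s) = alpha + beta*x + s*ln r, formula (4.68)."""
    return alpha + beta * x + s * math.log(r)

# formula (4.70): same value as the base-year log-odds at the shifted age x - w*s
s = 10
for x in (65.0, 75.0):
    assert abs(log_odds(x, s) - log_odds(x - w * s, 0)) < 1e-9

print(round(w, 4))   # age reduction per projection year
```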

4.5.2 Expressing mortality trends via Weibull’s parameters

Assume that the probability distribution of the random lifetime at birth, $T_0$, is represented (for a given cohort of lives) by the Weibull law, hence with force of mortality given by (2.77). The corresponding pdf is then

$$f_0(x) = \frac{\alpha}{\beta}\left(\frac{x}{\beta}\right)^{\alpha-1} e^{-(x/\beta)^{\alpha}}; \quad \alpha, \beta > 0 \quad (4.72)$$

whereas the survival function is given by

$$S(x) = e^{-(x/\beta)^{\alpha}} \quad (4.73)$$

It is well known that, whilst the Weibull law does not fit well the age-pattern of mortality throughout the whole life span (especially because of the specific features of infant and young-adult mortality), it provides a reasonable representation of mortality at adult and old ages. Moreover, the choice of the Weibull law is supported by the possibility of easily expressing, in terms of its parameters, the mode (at adult ages) of the distribution of the random lifetime $T_0$, that is, the Lexis point,

$$\operatorname{Mod}[T_0] = \beta\left(\frac{\alpha - 1}{\alpha}\right)^{1/\alpha}; \quad \alpha > 1 \quad (4.74)$$

as well as the expected value and the variance,

$$\mathrm{E}[T_0] = \beta\, \Gamma\!\left(\frac{1}{\alpha} + 1\right) \quad (4.75)$$

$$\mathrm{Var}[T_0] = \beta^2 \left[\Gamma\!\left(\frac{2}{\alpha} + 1\right) - \left(\Gamma\!\left(\frac{1}{\alpha} + 1\right)\right)^{2}\right] \quad (4.76)$$

where Γ denotes the complete gamma function (see, e.g. Kotz et al. (2000)). Moments for the remaining lifetime at age x > 0, $T_x$, can similarly be derived.

The above possibility facilitates the choice of laws which reflect specific future trends of mortality. When a dynamic mortality model is concerned, the force of mortality must be addressed as a function of the (future) calendar year t (according to the vertical approach), or of the year of birth τ (diagonal approach). Hence, referring for example to the diagonal approach, we generalize formula (2.77) as follows:

$$\mu_x(\tau) = \frac{\alpha(\tau)}{\beta(\tau)}\left(\frac{x}{\beta(\tau)}\right)^{\alpha(\tau)-1} \quad (4.77)$$

Functions α(τ) and β(τ) should be chosen in order to reflect the assumed trends in the rectangularization and expansion processes. To this purpose, formulae (4.74) to (4.76) provide us with a tool for checking the validity of a choice of the above functions.
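A sketch of such a check: the markers (4.74) to (4.76) computed as functions of (α, β) with Python's `math.gamma` (the parameter values are illustrative). Increasing α with β fixed moves the Lexis point towards β and shrinks the variance, i.e. rectangularization; increasing β produces expansion.

```python
# Weibull markers (4.74)-(4.76) as functions of the parameters (alpha, beta).

from math import gamma

def lexis_point(a, b):
    """Mod[T0] = beta * ((alpha - 1)/alpha)^(1/alpha), alpha > 1."""
    return b * ((a - 1.0) / a) ** (1.0 / a)

def mean_lifetime(a, b):
    """E[T0] = beta * Gamma(1/alpha + 1)."""
    return b * gamma(1.0 / a + 1.0)

def var_lifetime(a, b):
    """Var[T0] = beta^2 * [Gamma(2/alpha + 1) - Gamma(1/alpha + 1)^2]."""
    return b * b * (gamma(2.0 / a + 1.0) - gamma(1.0 / a + 1.0) ** 2)

# Increasing alpha (beta fixed) concentrates deaths around the mode:
for a in (6.0, 9.0, 12.0):
    print(a, round(lexis_point(a, 90.0), 2), round(var_lifetime(a, 90.0), 2))
```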


Figure 4.13. A possible inconsistency in mortality profile extrapolation.

4.5.3 Some remarks

Comparing mortality profile extrapolations (i.e. the horizontal approach) with law-based projections (i.e. the vertical and the diagonal approaches), we note the following points. First, when the projection consists in a straight extrapolation of the mortality profiles, inconsistencies may emerge as a result of the extrapolation itself. For example, we may find that a future calendar year t* exists such that, for t > t* and for some ages $x_1$, $x_2$ with $x_1 < x_2$, we have $q_{x_1}(t) > q_{x_2}(t)$ (see Fig. 4.13), even at old ages. Hence, appropriate adjustments may be required. On the other hand, mortality profile extrapolations have the advantage of simple calculation procedures.

A further disadvantage of mortality profile extrapolations is that they do not ensure the representation of sensible future mortality scenarios. On the contrary, such outcomes can be rather easily produced by controlling the behaviour of the projected parameters in a law-based context (see, in particular, Section 4.5.2).

As already noted in Section 4.5.1, law-based mortality projections lead to a dramatic reduction in the dimension of the forecasting problem, namely in the number of degrees of freedom. However, the age-pattern of mortality can be summarized without resorting to mathematical laws (and hence avoiding the choice of an appropriate mortality law). In particular, some typical values, or markers (see Section 4.2.1), of the mortality pattern can be used to this purpose; this aspect is dealt with in Section 4.6.2.

Finally, many authors note that the parameters of most mortality laws are often strongly dependent, for example the B and c parameters in Makeham's law (see (2.70)). Hence, univariate extrapolation (as in the vertical and the diagonal approaches) may be misleading. Conversely, a multivariate approach may provide a better representation of mortality trends, although problems in computational tractability may arise.

4.5.4 Mortality graduation over age and time

As seen in the previous sections, the construction of projected quantities (e.g. the one-year probabilities of death, or the force of mortality) is usually worked out in two separate steps.

First, mortality tables are built up for various past calendar years and possibly graduated, in particular using mathematical formulae, for example in order to obtain the force of mortality for each calendar year (see Section 4.5.1).

Second, when no mortality law is involved, mortality profiles are analysed in order to construct a formula for extrapolating probabilities of death. Conversely, when a law-based projection model is used, the behaviour of the parameters over time is analysed, in order to obtain formulae for parameter extrapolation.

In conclusion, the construction of the projected mortality is performed with respect to age and calendar year separately.

The above approach is computationally straightforward, in particular thanks to the possibility of using well-known techniques while performing the first step. Despite this feature, recent research work has shown that models which incorporate (simultaneously) both the age variation in mortality and the time trends in mortality have considerable advantages in terms of goodness-of-fit and hence, presumably, in terms of forecast reliability.

Mortality projections based on models incorporating age variation and time trends represent the surface approach to mortality forecasts (see Fig. 4.14).

We focus on the so-called Gompertz–Makeham class of formulae, denoted by GM(r, s) and defined in Section 2.5.1 (see (2.78)). Formulae of the GM(r, s) type can be included in models allowing for mortality trends. In this section, as an illustration, we introduce the model proposed by Renshaw et al. (1996), implemented also by Sithole et al. (2000) and Renshaw and Haberman (2003b), albeit in a modified form.

Consider the following model:

$$\mu_x(t) = \exp\!\left(\sum_{j=0}^{s} \beta_j\, L_j(x)\right) \exp\!\left(\sum_{i=1}^{r} \left(\alpha_i + \sum_{j=1}^{s} \gamma_{ij}\, L_j(x)\right) t^i\right) \quad (4.78)$$


Figure 4.14. The surface approach.

with the proviso that some of the $\gamma_{ij}$ may be preset to 0. The $L_j(x)$ are Legendre polynomials. The variables x and t are the transformed ages and transformed calendar years, respectively, such that both x and t are mapped onto [−1, +1]. Note that the first of the two multiplicative terms on the right-hand side is a graduation model GM(0, s + 1), while the second one may be interpreted as an age-specific trend adjustment term (provided that at least one of the $\gamma_{ij}$ is not preset to zero). Formula (4.78) has been proposed by Renshaw et al. (1996) for modelling with respect to age and time, noting that, for forecasting purposes, low values of r should be preferred – that is, polynomials in t with a low degree.

A further implementation of this model has been carried out by Sithole et al. (2000). Trend analysis of UK immediate annuitants' and pensioners' mortality experiences (provided by the CMIB) suggested the adoption of the following particular formula (within the class of models (4.78)):

$$\mu_x(t) = \exp\!\left(\beta_0 + \sum_{j=1}^{3} \beta_j\, L_j(x) + (\alpha_1 + \gamma_{11}\, L_1(x))\, t\right) \quad (4.79)$$

where we note that r = 1.

Moreover, the reduction factor $R_x(t-t')$ related to the force of mortality (rather than to the probabilities of death) has been addressed:

$$\mu_x(t) = \mu_x(t')\, R_x(t-t') \quad (4.80)$$

where, as usual, t′ is the base year for the mortality projection. From (4.79) and (4.80) we obtain:

$$R_x(t-t') = \exp\!\left[\frac{t-t'}{w}\,(\alpha_1 + \gamma_{11}\, x)\right] \quad (4.81)$$

where w denotes half of the calendar year range for the investigation period. Hence:

$$R_x(t-t') = \exp\!\left[(a + b\,x)(t-t')\right] \quad (4.82)$$

(with a < 0 and b > 0, which result from the fitting of the observed data).

Renshaw and Haberman (2003b) consider a regression-based forecasting model of the following simple structure:

$$\ln m_x(t) = a_x + b_x\,(t-t') \quad (4.83)$$

Then, introducing a reduction factor that is related to the central death rate, and interpreting the term $a_x$ as $\ln m_x(t')$, the logarithm of the central death rate for the base year, we have that

$$m_x(t) = m_x(t')\, R_x(t-t') \quad (4.84)$$

and

$$R_x(t-t') = \exp[b_x\,(t-t')] \quad (4.85)$$

Renshaw and Haberman (2003b) also experiment with a series of break-point predictors (equivalent to linear splines) in order to model changes of slope in the mortality trend that have been observed in the past data. With one such term, the reduction factor would be

$$R_x(t-t') = \exp[b_x\,(t-t') + b'_x\,(t-t_0)_+] \quad (4.86)$$

where $(t-t_0)_+ = t-t_0$ for $t > t_0$, and 0 otherwise.
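A sketch of the break-point reduction factor (4.86) for a single age (the parameter values are illustrative):

```python
# Break-point (linear spline) reduction factor, formula (4.86), for one age x.

from math import exp

def R_breakpoint(s, s0, b, b_prime):
    """R_x = exp[b*s + b'*(s - s0)_+], with s = t - t' and s0 = t0 - t'."""
    return exp(b * s + b_prime * max(0.0, s - s0))

# Before the break the log reduction factor declines at rate b; after it,
# at rate b + b'. Values below are illustrative only.
b, bp, s0 = -0.02, -0.01, 10
print(round(R_breakpoint(5, s0, b, bp), 4))    # exp(-0.02*5)
print(round(R_breakpoint(15, s0, b, bp), 4))   # exp(-0.02*15 - 0.01*5)
```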

4.6 Other approaches to mortality projections

4.6.1 Interpolation versus extrapolation: the limit table

From Sections 4.3 and 4.5, it clearly emerges that a number of projection methods are based on the extrapolation of observed mortality trends, possibly via the parameters of some mortality law. Important examples are provided by formulae (4.11), (4.20), and (4.63). Although it seems quite natural that mortality forecasts are based on past mortality observations, different approaches to the construction of projected tables can be adopted.


We suppose that an ‘optimal’ limiting life table can be assumed. The relevant age-pattern of mortality is to be interpreted as the limit pattern to which mortality improvements can lead. Let $\bar q_x$ denote the limit probability of death at age x, whereas $q_x(t')$ denotes the current mortality. Then, we assume that the projected mortality $q_x(t)$ is expressed as follows:

$$q_x(t) = I[\bar q_x, q_x(t')] \quad (4.87)$$

where the symbol I denotes some interpolation model.

Example 4.9 Adopting an exponential interpolation formula, we have:

$$q_x(t) = \bar q_x + [q_x(t') - \bar q_x]\, r^{t-t'} \quad (4.88)$$

with r < 1. Note that formula (4.20) can be easily linked to (4.88), choosing $\alpha_x$ such that $q_x(t')\,\alpha_x = \bar q_x$. □
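A sketch of the interpolation (4.88) towards the limit probability (the values of the limit probability, the current probability, and r below are illustrative):

```python
# Exponential interpolation between the current table and a limit table,
# formula (4.88); all numerical values are illustrative.

def q_projected(q_bar, q0, r, s):
    """q_x(t' + s) = q_bar + (q_x(t') - q_bar) * r^s, with r < 1."""
    return q_bar + (q0 - q_bar) * r ** s

q_bar, q0, r = 0.010, 0.025, 0.95
print(round(q_projected(q_bar, q0, r, 0), 4))    # the current table value
print(round(q_projected(q_bar, q0, r, 50), 4))   # close to the limit value
```

The projected probability decreases monotonically from $q_x(t')$ towards $\bar q_x$, which it approaches asymptotically.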

Determining a limit table requires a number of assumptions about the trend in various mortality causes, so that an analysis of mortality by causes of death should be carried out as a preliminary step (see Section 4.8.2).

4.6.2 Model tables

As noted in Section 4.5.1, when a mortality law is used to fit mortality experience, the age-pattern of mortality is summarized by some parameters. Then, the projection procedure can be applied to each parameter (instead of each mortality profile), with a dramatic reduction in the dimension of the forecasting problem. However, the age-pattern of mortality can be summarized without resorting to mathematical laws, and, in particular, some markers of the mortality pattern can be used to this purpose (see Section 4.2.1). The possibility of summarizing the age-pattern of mortality by using some markers underpins the use of model tables.

The first set of model tables was constructed in 1955 by the United Nations. A number of mortality tables was chosen, with the aim of representing the age-pattern of mortality corresponding to various degrees of social and economic development, health status, etc. The set was indexed on the expectation of life at birth, $\mathring{e}_0$, so that each table was summarized by the relevant value of this marker.

Procedures based on model tables can be envisaged also for mortality forecasts relating to a given population. With this objective in mind, we choose a set of tables representing the mortality in the population at several epochs, and assumed to represent also the future mortality for that population.


Figure 4.15. Model tables for mortality forecasts.

Trends in some markers are analysed and then projected, possibly using some mathematical formula, in order to predict their future values. Projected age-specific probabilities of death are then obtained by entering the system of model tables at the various projected values of the markers. The procedure is sketched in Fig. 4.15.

4.6.3 Projecting transforms of life table functions

A number of methods for mortality forecasts require that the projection procedure starts from the analysis of trends in mortality, in terms of one-year probabilities of death or other fundamental life table functions, such as the force of mortality (in an age-continuous context) or the survival function. An alternative approach is to use some transforms of life table functions, which may help us reach a better understanding of some features of mortality trends. Two examples will be provided: the relational method and the projection of the resistance function.

The relational method was proposed by Brass (1974), who focussed on the logit transform of the survival function; see Section 2.7.

For the purpose of forecasting mortality, equation (2.107) can be used in a dynamic sense. In a dynamic context, the Brass logit transform is particularly interesting when applied to cohort data, as the logits for successive birth-year cohorts seem to be linearly related (see Pollard (1987)). Hence,


denoting by Λ(x, τ) the logit of the survival function S(x, τ) for the cohort born in calendar year τ, we have:

$$\Lambda(x, \tau) = \frac{1}{2} \ln\!\left(\frac{1 - S(x, \tau)}{S(x, \tau)}\right) \quad (4.89)$$

Referring to a pair of birth years, $\tau_k$ and $\tau_{k+1}$, we assume

$$\Lambda(x, \tau_{k+1}) = \alpha_k + \beta_k\, \Lambda(x, \tau_k) \quad (4.90)$$

So, the problem of projecting mortality reduces to the problem of extrapolating the two series $\alpha_k$ and $\beta_k$. Projected values of various life table functions can be derived from the inverse logit transform:

$$S(x, \tau) = \frac{1}{1 + \exp[2\,\Lambda(x, \tau)]} \quad (4.91)$$

Figures 2.4–2.6 show how the rectangularization and expansion phenomena, in particular, can be represented by choices of the parameters α and β.

Application of the Brass transform to cohort-based projections requires a long sequence of mortality observations, in order to build up cohort survival functions. Further, inconsistencies may appear, since the method does not ensure that, for any year of birth τ, $S(x_1, \tau) > S(x_2, \tau)$ for all pairs $(x_1, x_2)$ with $x_1 < x_2$. So, negative values for mortality rates $q_x(\tau + x)$ may follow, and hence appropriate adjustments in the linear extrapolation procedure are required.
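The transform pair (4.89)/(4.91) and one relational step (4.90) can be sketched as follows (the survival values and $(\alpha_k, \beta_k)$ are illustrative; $\alpha_k < 0$ with $\beta_k = 1$ produces a uniform mortality improvement):

```python
# Brass relational method: logit transform, linear relation between cohorts,
# and inverse transform. All numerical values are illustrative.

import math

def logit(S):
    """Lambda = 0.5 * ln((1 - S)/S), formula (4.89)."""
    return 0.5 * math.log((1.0 - S) / S)

def inv_logit(lam):
    """S = 1 / (1 + exp(2*Lambda)), formula (4.91)."""
    return 1.0 / (1.0 + math.exp(2.0 * lam))

# Round trip of the transform pair:
assert abs(inv_logit(logit(0.85)) - 0.85) < 1e-12

# One relational step (4.90): next cohort's survival from the current one.
alpha_k, beta_k = -0.05, 1.0   # alpha_k < 0 raises survival at every age
S_now = [0.99, 0.95, 0.85, 0.60, 0.20]
S_next = [inv_logit(alpha_k + beta_k * logit(s)) for s in S_now]
print(all(b > a for a, b in zip(S_now, S_next)))   # True
```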

A different transform of the survival function S(x) has been addressed by Petrioli and Berti (see Petrioli and Berti (1979); see also Keyfitz (1982)). The proposed transform is the resistance function, defined in Section 2.7 (see (2.108)). The resistance function has been graduated with the formula:

$$\rho(x) = x^{\alpha} (\omega - x)^{\beta}\, e^{Ax^2 + Bx + C} \quad (4.92)$$

and, in particular, with the three-parameter formula:

$$\rho(x) = k\, x^{\alpha} (\omega - x)^{\beta} \quad (4.93)$$

Model tables have been constructed using combinations of the three parameters, by focussing on the values of some markers. In a dynamic context, the mortality trend is represented by assuming that (some of) the parameters of the resistance function depend on the calendar year t. Thus, referring to equation (4.93), we have:

$$\rho(x, t) = k(t)\, x^{\alpha(t)} (\omega - x)^{\beta(t)} \quad (4.94)$$


Note that, when a model for the resistance function (see (4.92) and (4.93)) is assumed, the resulting projection model can be classified as an analytical model, even though it does not directly address the survival function.

The Petrioli–Berti model has been used to project the mortality of the Italian population, and has then been adopted by the Italian Association of Insurers in order to build up projected mortality tables for life annuity business.

4.7 The Lee–Carter method: an introduction

4.7.1 Some preliminary ideas

In general, most of the projection formulae presented in the previous sections do not allow for the stochastic nature of mortality. Actually, a number of projection methods used in actuarial practice simply consist in graduation–extrapolation procedures (see e.g. (4.11), (4.17), (4.63)).

A more rigorous approach to mortality forecasts should take into account the stochastic features of mortality. In particular, the following points should underpin a stochastic projection model:

– observed mortality rates are outcomes of random variables representing past mortality;

– forecasted mortality rates are estimates of random variables representing future mortality.

Hence, stochastic assumptions about mortality are required, that is, probability distributions for the random numbers of deaths, and a statistical structure linking forecasts to observations must be specified (see Fig. 4.16).

In a stochastic framework, the results of the projection procedures consist in

• Point estimates
• Interval estimates

of future mortality rates (see Fig. 4.17) and other life table functions. Clearly, traditional graduation–extrapolation procedures, which do not explicitly allow for randomness in mortality, produce just one numerical value for each future mortality rate (or some other age-specific quantity). Moreover, such values can hardly be interpreted as point estimates, because of the lack of an appropriate statistical structure and model.


Figure 4.16. From past to future: a statistical approach. [Diagram: up to the current year t′, the observed values of qx(t) form the sample, that is, observed outcomes of the random mortality frequency; beyond t′, paths of a stochastic process represent possible future outcomes of the random mortality frequency; a model links the probabilistic structure of the stochastic process to the sample.]

Figure 4.17. Mortality forecasts: point estimation vs interval estimation. [Diagram: observations of qx(t) up to time t′, with the corresponding graduation; beyond t′, a point estimate surrounded by an interval estimate.]

An effective graphical representation of randomness in future mortality is given by the so-called fan charts; see Fig. 4.18, which refers to the projection of the expected lifetime. The fan chart depicts a ‘central projection’ together with some ‘prediction intervals’. The narrowest interval, namely the one with the darkest shading, corresponds to a low probability prediction, say 10%, and is surrounded by prediction intervals with higher probabilities, say 30%, 50%, etc. See also Section 5.9.4.

The Lee–Carter (LC) method (see Lee and Carter (1992); Lee (2000)) represents a significant example of the stochastic approach to mortality forecasts and constitutes one of the most influential proposals in recent times. A number of generalizations and improvements have been proposed, which follow and build on the basic ideas of the LC methodology.


Figure 4.18. Forecasting expected lifetime: fan chart. [Diagram: e65(t) against time; beyond t′, the central projection (point estimate) is surrounded by widening prediction intervals.]

4.7.2 The LC model

In order to represent the age-specific mortality we address the central death rate. Let mx(t) denote the central death rate for age x at time t, and assume the following log-bilinear form:

ln mx(t) = αx + βx κt + εx,t   (4.95)

where the αx’s describe the age-pattern of mortality averaged over time, whereas the βx’s describe the deviations from the averaged pattern when κt varies. The change in the level of mortality over time is described by the (univariate) mortality index κt. Finally, the quantity εx,t denotes the error term, with mean 0 and variance σ²ε, reflecting particular age-specific historical influences that are not captured by the model. Expression (4.95) constitutes the starting point of the LC method.

It is worth stressing that the LC model differs from ‘parametric models’ (namely, mortality laws, see Section 2.5), because in (4.95) the dependence on age is non-parametric and is represented by the sequences of αx’s and βx’s.

The model expressed by (4.95) cannot be fitted by simple regression, since there is no observable variable on its right-hand side. A least squares solution can be found by using the first element of the singular value decomposition. The parameter estimation is based on a matrix of available death rates, and we note that the system implied by (4.95) is undetermined without additional constraints. Lee and Carter (1992) propose the normalization ∑x βx = 1, ∑t κt = 0, which in turn forces each αx to be an average of the log-central death rates over calendar years.
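The SVD-based least squares fit just described can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function name is hypothetical and the data are a small synthetic log-bilinear surface, chosen so that the fit is exact.

```python
import numpy as np

def fit_lee_carter(log_m):
    """Fit ln m_x(t) = alpha_x + beta_x * kappa_t (eq. 4.95) by least squares,
    using the first term of the SVD, under the Lee-Carter normalization
    sum_x beta_x = 1, sum_t kappa_t = 0."""
    log_m = np.asarray(log_m, dtype=float)   # shape: (ages, calendar years)
    alpha = log_m.mean(axis=1)               # average of log rates over years
    U, s, Vt = np.linalg.svd(log_m - alpha[:, None], full_matrices=False)
    beta = U[:, 0]                           # first left singular vector
    kappa = s[0] * Vt[0, :]                  # first right singular vector, scaled
    scale = beta.sum()                       # impose sum(beta) = 1, keeping beta*kappa fixed
    return alpha, beta / scale, kappa * scale

# Synthetic example: an exact log-bilinear surface, 4 ages x 5 years
beta_true = np.array([0.4, 0.3, 0.2, 0.1])            # sums to 1
kappa_true = np.array([2.0, 1.0, 0.0, -1.0, -2.0])    # sums to 0
alpha_true = np.array([-6.0, -5.0, -4.0, -3.0])
log_m = alpha_true[:, None] + beta_true[:, None] * kappa_true[None, :]
alpha, beta, kappa = fit_lee_carter(log_m)
```

Because the synthetic surface satisfies (4.95) exactly (with zero error term), the procedure recovers the true parameters; on real data the first SVD term gives the least squares approximation instead.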

Once the parameters αx, βx, and κt are estimated, the mortality forecast follows by modelling the values of κt as a time series, for example as a random walk with drift. Starting from a given year t′, forecasted mortality rates are then computed, for t > t′, as follows:

mx(t) = exp(αx + βx κt) = mx(t′) exp[βx (κt − κt′)]   (4.96)

It is worth noting that mx(t) is modelled as a stochastic process, driven by the stochastic process κt, from which interval estimates can be computed for the projected values of mortality rates.

4.7.3 From LC to the Poisson log-bilinear model

The LC method implicitly assumes that the random errors are homoskedastic. This assumption, which follows from the ordinary least squares estimation method that is used as the main statistical tool, seems to be unrealistic, as the logarithm of the observed mortality rate is much more variable at older ages than at younger ages, because of the much smaller number of deaths observed at old and very old ages.

In Brouhns et al. (2002b) and Brouhns et al. (2002a), possible improvements of the LC method are investigated, using a Poisson random variation for the number of deaths. This is instead of using the additive error term εx,t in the expression for the logarithm of the central mortality rate (see (4.95)).

In terms of the force of mortality µx(t), the Poisson assumption means that the random number of deaths at age x in calendar year t is given by

Dx(t) ∼ Poisson(ETRx(t) µx(t))   (4.97)

where ETRx(t) is the central number of exposed to risk. In order to define the Poisson parameter ETRx(t) µx(t), Brouhns et al. (2002a) and Brouhns et al. (2002b) assume a log-bilinear force of mortality, that is,

ln µx(t) = αx + βx κt   (4.98)

hence with the structure expressed by (4.95), apart from the error term. The meaning of the parameters αx, βx, κt is essentially the same as for the corresponding parameters in the LC model. The parameters are then determined by maximizing the log-likelihood based on (4.97) and (4.98).

Brouhns et al. (2002b) do not modify the time series part of the LC method. Hence, the estimates αx and βx are used with the forecasted κt in order to generate future mortality rates (as in (4.96)), as well as other age-specific quantities.
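The Poisson log-likelihood based on (4.97)–(4.98) can be maximized by alternating one-dimensional Newton updates over the three parameter sets, in the spirit of the updating scheme used by Brouhns et al. (2002). The sketch below is an assumed, simplified implementation (hypothetical function name, synthetic data), not a reproduction of the authors' algorithm.

```python
import numpy as np

def fit_poisson_lc(D, E, n_iter=200):
    """Poisson log-bilinear fit: D_x(t) ~ Poisson(E_x(t) exp(alpha_x + beta_x kappa_t)),
    estimated by alternating Newton updates on alpha, kappa, beta in turn."""
    D, E = np.asarray(D, float), np.asarray(E, float)
    n_age, n_year = D.shape
    alpha = np.log(D.sum(axis=1) / E.sum(axis=1))      # crude starting values
    beta = np.full(n_age, 1.0 / n_age)
    kappa = np.linspace(1.0, -1.0, n_year)
    for _ in range(n_iter):
        Dhat = E * np.exp(alpha[:, None] + beta[:, None] * kappa[None, :])
        alpha += (D - Dhat).sum(axis=1) / Dhat.sum(axis=1)
        Dhat = E * np.exp(alpha[:, None] + beta[:, None] * kappa[None, :])
        kappa += ((D - Dhat) * beta[:, None]).sum(axis=0) / (Dhat * beta[:, None]**2).sum(axis=0)
        kappa -= kappa.mean()                          # sum_t kappa_t = 0
        Dhat = E * np.exp(alpha[:, None] + beta[:, None] * kappa[None, :])
        beta += ((D - Dhat) * kappa[None, :]).sum(axis=1) / (Dhat * kappa[None, :]**2).sum(axis=1)
        s = beta.sum()                                 # sum_x beta_x = 1
        beta, kappa = beta / s, kappa * s
    return alpha, beta, kappa

# Synthetic check: deaths set equal to their Poisson means, so the ML fit
# should recover the true parameters (which satisfy both constraints)
E = np.full((2, 3), 1000.0)
alpha_t = np.array([-5.0, -4.0])
beta_t = np.array([0.7, 0.3])            # sums to 1
kappa_t = np.array([1.0, 0.0, -1.0])     # sums to 0
D = E * np.exp(alpha_t[:, None] + beta_t[:, None] * kappa_t[None, :])
alpha, beta, kappa = fit_poisson_lc(D, E)
```

Each update is a Newton step on the Poisson log-likelihood for one parameter set, holding the others fixed; unlike ordinary least squares, the fitted deaths Dhat weight the residuals, so ages with few deaths contribute less, addressing the heteroskedasticity issue noted above.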


4.7.4 The LC method and model tables

An interesting example of projecting mortality patterns using the LC method is provided by Buettner (2002). The LC method is used to project mortality patterns on the basis of model tables that are indexed on the expectation of life at birth e̊0 (see Section 4.6.2). Since model tables do not contain any explicit time reference, the LC model has been implemented replacing the time index κt with an index reflecting the level of life expectancy. Then, the model is

ln mx(e) = αx + βx κe + εx,e   (4.99)

where the parameter κe represents the trend in the level of life expectancy at birth.

4.8 Further issues

In this section we address some issues of mortality forecasts, part of which are, at least to some extent, beyond the main scope of this book, whereas others will be developed in the following chapters.

4.8.1 Cohort approach versus period approach. APC models

First, consider the following projection model referred to the mortality odds φx(t) = qx(t)/px(t):

φx(t) = φx(t′) r^(t−t′)   (4.100)

where the first term on the right-hand side does not depend on t, whereas the second term does not depend on x. Denoting the first term with A(x) and the second term with B(t), equation (4.100) can be rewritten as follows:

φx(t) = A(x) B(t)   (4.101)
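A virtue of the odds model (4.100) is that the projected probabilities always stay in (0, 1), since q = φ/(1 + φ). A minimal sketch (hypothetical function name and illustrative numerical values):

```python
def project_q(q0, r, years):
    """Project a one-year death probability via the odds model (4.100):
    phi_x(t' + s) = phi_x(t') * r**s, with phi = q / (1 - q)."""
    phi = q0 / (1.0 - q0) * r ** years
    return phi / (1.0 + phi)

# Illustrative values: base-year probability and an assumed annual
# odds-reduction factor r < 1 (odds fall by 2% per year)
q_now, r = 0.10, 0.98
q_in_10 = project_q(q_now, r, 10)
```

With r < 1 the projected probabilities decrease monotonically in the projection horizon, consistently with an improving mortality trend.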

Then, we consider the so-called K-K-K hypothesis (formulated in 1934 by Kermack, McKendrick, and McKinlay), according to which the following factorization is assumed:

µx(τ) = C(x) D(τ) (4.102)

where τ denotes, as usual, the year of birth of the cohort.

In projection model (4.101), the future mortality structure is split into:

– a factor A(x), expressing the age effect;
– a factor B(t), expressing the year of occurrence effect or period effect.


Conversely, in model (4.102) it is assumed that the future mortality structure can be split into:

– a factor C(x), expressing the age effect;
– a factor D(τ), expressing the year of birth effect or cohort effect.

Recently, models including both the period effect and the cohort effect (as well as the age effect) have been proposed. These models are commonly called APC (Age-Period-Cohort) models. An APC model, referring to the force of mortality, can be expressed as follows:

µx(t) = Q(x) R(t) S(t − x) (4.103)

(where t − x = τ) or, in logarithmic terms:

ln µx(t) = ln Q(x) + ln R(t) + ln S(t − x)   (4.104)

A slightly modified version of (4.104), referring to central death rates (see Willets (2004)), is as follows:

ln mx(t) = m + αx + βt + γt−x   (4.105)

with finite sets for the values of x and t. Constraints are usually as follows:

∑x αx = ∑t βt = ∑t−x γt−x = 0   (4.106)

The model can be estimated using Poisson maximum likelihood, or weighted least squares methods. However, no unique set of parameters results in an optimal fit because of the trivial relation

cohort + age = period
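This identifiability problem can be demonstrated numerically: the sketch below (synthetic, randomly generated parameters; helper names are ours) builds two different parameter sets, both satisfying the zero-sum constraints (4.106), that produce exactly the same fitted surface (4.105).

```python
import numpy as np

ages = np.arange(3)          # x = 0, 1, 2
years = np.arange(5)         # t = 0, ..., 4
cohorts = np.arange(-2, 5)   # c = t - x ranges over -2, ..., 4

rng = np.random.default_rng(0)
def centred(n):
    v = rng.normal(size=n)
    return v - v.mean()      # zero-sum, as required by (4.106)

m0 = -5.0
alpha, beta, gamma = centred(len(ages)), centred(len(years)), centred(len(cohorts))

def fitted(m, alpha, beta, gamma):
    """ln m_x(t) = m + alpha_x + beta_t + gamma_{t-x}  (4.105); here the
    age/year values double as array indices because both start at 0."""
    X, T = np.meshgrid(ages, years, indexing="ij")
    return m + alpha[X] + beta[T] + gamma[T - X - cohorts[0]]

# 'Tilt' the parameters by an arbitrary delta, exploiting cohort = period - age:
# the three centred shifts cancel in the sum, so the fit is unchanged
delta = 0.3
alpha2 = alpha + delta * (ages - ages.mean())
beta2 = beta - delta * (years - years.mean())
gamma2 = gamma + delta * (cohorts - cohorts.mean())
m02 = m0 - delta * (years.mean() - ages.mean() - cohorts.mean())
```

Both parameter sets are admissible under (4.106), yet they yield identical fitted log rates, so extra constraints (or penalties) must be imposed before the age, period, and cohort effects can be interpreted separately.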

Further weak points can be found in APC models like (4.102) and (4.103). In particular, these models assume an age-independent period effect, or an age-independent cohort effect, whereas the impact of mortality improvements over time (or between cohorts) may vary with age.

As far as statistical evidence is concerned, both period and cohort effects seem to impact on mortality improvements. In particular, it is reasonable that period effects summarize contemporary factors, for example, the general health status of the population, availability of healthcare services, critical weather conditions, etc. Conversely, cohort effects quantify historical factors, for example, World War II, diet, smoking habits, etc.

From a practical point of view, the main difficulty in implementing projection models allowing for cohort effects obviously lies in the fact that statistical data for a very long period are required, and such data are rarely available.


Conversely, from a general point of view, the role of period and cohort effects in quantifying factors that affect mortality improvements suggests that we consider future likely scenarios and, in particular, causes of death.

4.8.2 Projections and scenarios. Mortality by causes

When projecting mortality, the collateral information available to the forecaster can be allowed for. Information may concern, for example, trends in smoking habits, trends in the prevalence of some illness, improvements in medical knowledge and surgery, etc. Thus, projections can be performed according to an assumed scenario.

The introduction of relationships between causes (e.g. advances in medical science) and effects (mortality improvements) underpins mortality projections which are carried out according to assumed scenarios. Obviously, some degree of arbitrariness follows, affecting the results.

The projection methods that we have described refer to mortality in aggregate. Nonetheless, many of them can be used to project mortality by different causes separately.

Projections by cause of death offer a useful insight into the changing incidence of the various causes. Conversely, some important problems arise when this type of projection is adopted. In particular, it should be stressed that complex interrelationships exist among causes of death, whilst the classic assumption of independence is commonly accepted. For example, mortality from heart diseases and lung cancer are positively correlated, as both are linked to smoking habits. A further problem concerns the difficult identification of the cause of death for elderly people.

A final issue concerns a phenomenon arising in long-term projections by cause of death: the projected aggregate mortality rate is eventually dominated by the trend for the cause of death whose rates are declining at the slowest speed.

For these reasons, many forecasters prefer to carry out mortality projections only in aggregate terms.

4.9 References and suggestions for further reading

4.9.1 Landmarks in mortality projections

4.9.1.1 The antecedents

As noted by Cramér and Wold (1935), the earliest attempt to project mortality is probably due to the Swedish astronomer H. Gyldén. In a work presented to the Swedish Assurance Association in 1875, he fitted a straight line to the sequence of general death rates of the Swedish population concerning the years 1750–1870, and then extrapolated the behaviour of the general death rate. A similar graphical fitting was proposed in 1901 by T. Richardt for sequences of the life annuity values a60 and a65, calculated according to various Norwegian life tables, and then projected via extrapolation for application to pension plan calculations. Note that both the proposal of Gyldén and that of Richardt concerned the projection of a single-figure index.

Mortality trends and the relevant effects on life assurance and pension annuities were clearly identified at the beginning of the 20th century, as witnessed by various initiatives in the actuarial field. In particular, it is worth noting that the subject ‘Mortality tables for annuitants’ was one of the topics discussed at the 5th International Congress of Actuaries, held in Berlin in 1906. Nordenmark (1906), for instance, pointed out that improvements in mortality must be carefully considered when pricing life annuities and, in particular, cohort mortality should be addressed to avoid underestimation of the related liabilities. The 7th International Congress of Actuaries, held in Amsterdam in 1912, included the subject ‘The course, since 1800, of the mortality of assured persons’.

As Cramér and Wold (1935) note, a life table for annuities was constructed in 1912 by A. Lindstedt, who used data from Swedish population experience and, for each age x, extrapolated the sequence of annual probabilities of death, namely the mortality profile qx(t), hence adopting a horizontal approach. Probably, this work constitutes the earliest projection of an age-specific function.

4.9.1.2 Early seminal contributions

Blaschke (1923) proposed a Makeham-based projected mortality model (see Section 4.5.1). In particular he adopted a vertical approach, consisting in the estimation of Makeham’s parameters for each period table based on the experienced mortality, and then in fitting the estimated values. Hence, projected values for the three parameters were obtained via extrapolation.

In 1924, the Institute of Actuaries in London proposed a horizontal method for mortality projection (see Cramér and Wold (1935)), assuming that probabilities of death are exponential functions of the calendar year, from which comes the name ‘exponential model’ frequently used to denote this approach to mortality projections. Various extrapolation formulae used by UK actuaries in recent times for annuitants’ and pensioners’ tables are particular cases of the early exponential model (see Section 4.3.8).


We now turn to the diagonal approach. In 1927 A. R. Davidson and A. R. Reid proposed a Makeham-based projection model, in which Makeham’s law refers to cohort mortality experiences. The relevant parameters were estimated via a cohort graduation (see Reid and Davidson (1927)).

The use of Makeham-based projections is thoroughly discussed by Cramér and Wold (1935), dealing with the graduation and extrapolation of Swedish mortality. In particular, the vertical (i.e. period-based, see (4.63)) and the diagonal (i.e. cohort-based, see (4.64)) approaches are compared. Let

µx(z) = γ(z) + α(z) β(z)^x

denote the force of mortality in both the vertical (with z = t) and the diagonal (with z = t − x) approach. For the graduation of the parameters, Cramér and Wold (1935) assumed that, in both the vertical and the diagonal approach, α(z) is linear while ln β(z) and ln γ(z) are logistic.

The assumption formulated in 1934 by Kermack, McKendrick, and McKinlay constitutes another example of the diagonal approach to mortality projections. As Pollard (1949) notes, these authors showed that, for some countries, it was reasonable to assume that the force of mortality depended on the attained age x and the year of birth τ = t − x, and they deduced that µx(t) = C(x) D(τ), where C(x) is a function of age only and D(τ) is a function of the year of birth only; see also Section 4.8.1.

4.9.1.3 Some modern contributions

Seminal contributions to mortality modelling, and mortality projections in particular, have been produced by demographers throughout the latter half of the 20th century. The ‘optimal’ life table, model tables, and relational methods probably constitute three of the most influential proposals in recent times, in the framework of mortality analysis.

The idea of an ‘optimal’ table (see Section 4.6.1) was proposed by Bourgeois-Pichat (1952). The question was: ‘can mortality decline indefinitely or is there a limit, and if so, what is this limit?’ While a number of projection methods are based on the extrapolation of observed mortality trends, focussing on optimal tables provides an alternative approach to mortality forecasts, as an interpolation procedure between past data and the limit table is required.

The possibility of summarizing the age-pattern of mortality by using some markers underpins the use of ‘model tables’ in mortality projections (see Section 4.6.2). Model tables were first constructed by the United Nations, in 1955. Each table is summarized by the relevant value of the expectation of life at birth.

A new way to mortality forecasts was paved by the ‘relational method’ proposed by W. Brass (see e.g. Brass (1974)), who focussed on the logit transform of the survival function (see Section 2.7). A different transform of the survival function, namely the ‘resistance function’, has been addressed by Petrioli and Berti (1979); see also Keyfitz (1982). In a dynamic context, the mortality trend is represented by assuming that (some of) the parameters of the resistance function depend on the calendar year t.

4.9.1.4 Recent contributions

In the last decades of the 1900s, various mortality law-based projection models have been proposed. In particular, Forfar and Smith (1988) have fitted the Heligman–Pollard curve to the graduated English life tables ELT1 to ELT13, for both males and females, and then have analysed the behaviour of the relevant parameters. Mortality projections have been performed assuming that various parameters of the Heligman–Pollard law are functions of the calendar year (see Benjamin and Soliman (1993) and Congdon (1993) for examples).

In the 1990s, a new method for forecasting the age-pattern of mortality was proposed and then extended by L. Carter and R.D. Lee (see Lee and Carter (1992) and Lee (2000)). The LC method addresses the central death rate to represent the age-specific mortality (see Section 4.7.2). While traditional projection models provide the forecaster with point estimates of future mortality rates (or other age-specific quantities), the LC method explicitly allows for random fluctuations in future mortality, representing the related effect in terms of interval estimates. The LC methodology constitutes one of the most influential proposals in recent times, in the field of mortality projections. Indeed, much research work as well as many recent applications to actuarial problems are directly related to this methodology (for detailed references see Section 4.9.2).

Finally, frailty models in the context of mortality forecasting have been addressed by Butt and Haberman (2004) and Wang and Brown (1998).

4.9.2 Further references

There are a number of both theoretical and practical papers dealing with mortality forecasts, produced by actuaries as well as by demographers. The reader interested in various perspectives on forecasting mortality should refer to Tabeau et al. (2001) and Booth (2006), in which a number of approaches to mortality projections are discussed and several applications are described. Interesting reviews of mortality forecast methods can also be found in Benjamin and Pollard (1993), Benjamin and Soliman (1993), National Statistics - Government Actuary’s Department (2001), Olshansky (1988), Pollard (1987), and Wong-Fupuy and Haberman (2004).

Mortality projections via reduction factors represent a practical and widely adopted approach to mortality forecasting. As regards formulae used by UK actuaries, the reader should refer to CMIB (1978, 1990, 1999). Recent contributions to the modelling of reduction factors have been given by Renshaw and Haberman (2000, 2003a), and Sithole et al. (2000).

In the field of law-based mortality projections, Felipe et al. (2002) have used the Heligman–Pollard law 2 for fitting and projecting mortality trends in the Spanish population. Also more traditional mortality laws have been used for analysing mortality trends and producing mortality forecasts. For example, Barnett (1960) has analysed mortality trends through the parameters of a modified Thiele’s formula, whereas Buus (1960) has used the Makeham law, focussing on the interdependence between the parameters. Poulin (1980) has proposed a Makeham-based projection formula, whereas Wetterstrand (1981) has used Gompertz’s law. Functions other than the force of mortality can also be addressed. For example, Beard (1952) built up a projection model by fitting a Pearson Type III curve to the curve of deaths, and then taking some parameters (in particular the maximum age) as functions of the year of birth. The Weibull law has been used by Olivieri and Pitacco (2002a) and Olivieri (2005), in order to express, via the relevant parameters, various assumptions about the expansion and rectangularization of the survival function.

The use of a law-based approach to mortality forecasting is rather controversial. For interesting discussions on this issue, the reader should consult Keyfitz (1982) and Pollard (1987). Brouhns et al. (2002b) stress that the estimated parameters are often strongly dependent. Hence, univariate extrapolation of the parameters may be misleading, whereas a multivariate time series for the parameters is theoretically possible but can lead to computational intractability. Of course, a distribution-free approach to mortality projections avoids these problems. Very important examples of the distribution-free approach are provided by the LC model and several models aiming to improve the LC methodology.

The practical use of projected tables deserves special attention, especially when just one cohort table is actually adopted in pricing and reserving (see Section 4.4.2). In particular, the optimal choice of the age-shifting function (see Section 4.4.3) has been dealt with by Delwarde and Denuit (2006); see also Chapter 3.

Considerable research work has recently been devoted to improving and generalizing the LC methodology. In particular, the reader should refer to Carter (1996), Alho (2000), Renshaw and Haberman (2003a, b, c), Brouhns and Denuit (2002), and Brouhns et al. (2002b). See also the list of references in Lee (2000).

Among the extensions of the LC method, we note the following developments. Carter (1996) incorporates in the LC methodology uncertainty about the estimated trend of mortality κt, through a specific model for the trend itself. Renshaw and Haberman (2003c) have noted that the standard LC methodology fails to capture and then project the recent upturn in crude mortality rates in the age range 20–39 years. So, an extension of the LC methodology is proposed, in order to incorporate in the LC model specific age-differential effects.

Booth et al. (2002) have developed systematic methods for choosing the most appropriate subset of the data to use for modelling – the graduation subset of Fig. 4.4. The importance of ensuring that the estimates αx and βx are smooth with respect to age, so that irregularities are not magnified via extrapolations into the future, has been discussed by Renshaw and Haberman (2003a), Renshaw and Haberman (2003c), De Jong and Tickle (2006), and Delwarde et al. (2007).

A cause-of-death projection study was proposed by Pollard (1949), based on Australian population data.

As regards scenario-based mortality forecasts, Gutterman and Vanderhoof (1998) stress that a projection methodology should allow for relationships between causes (e.g. advances in medical science) and effects (mortality improvements).

5 Forecasting mortality: applications and examples of age-period models

5.1 Introduction

As explained in Chapter 4, actuaries working in life insurance and pensions have been using projected life tables for some decades. But the problem confronting actuaries is that people have been living much longer than they were expected to according to the life tables being used for actuarial computations. What was missing was an accurate estimation of the speed of the mortality improvement: thus, most of the mortality projections performed during the second half of the 20th century have underestimated the gains in longevity. The mortality improvements seen in practice have quite consistently exceeded the projected improvements. As a result, insurers have, from time to time, been forced to allocate more capital to support their in-force annuity business, with adverse effects on free reserves and profitability. From the point of view of the actuarial approach to risk management, the major problem is that mortality improvement is not a diversifiable risk. Traditional diversifiable mortality risk is the random variation around a fixed, known life table. Mortality improvement risk, though, affects the whole portfolio and can thus not be managed using the law of large numbers (see Chapter 7 for a detailed discussion of systematic and non-systematic risks). In this respect, longevity resembles investment risk, in that it is non-diversifiable: it cannot be controlled by the usual insurance mechanism of selling large numbers of policies, because they are not independent in respect of that source of uncertainty. However, longevity is different from investment risk in that there are currently no large traded markets in longevity risk, so that it cannot easily be hedged. The reaction to this problem is twofold. First, actuaries are trying to produce better models for mortality improvement, paying more attention to the levels of uncertainty involved in the forecasts. The second part of the reaction is to look to the capital markets to share the risk, through the emergence of mortality-linked derivatives or longevity bonds. This kind of securitization will be discussed in Chapter 7.

As explained in the preceding chapter, there is a variety of statistical models used for mortality projection, ranging from basic regression models, in which age and time are viewed as continuous covariates, to sophisticated robust non-parametric models. Mortality forecasting is a hazardous yet essential enterprise for life annuity providers. This chapter examines the problem in the favourable circumstances encountered in developed countries, where extensive historical data are often easily available. A statistical model (in the form of a regression or a time series) is used to describe historical data and extrapolate past trends to the future.

In this chapter, we first consider the log-bilinear projection model pioneered by Lee and Carter (1992) that has been introduced in Section 4.7.2. The method describes the log of a time series of age-specific death rates as the sum of an age-specific component that is independent of time and another component that is the product of a time-varying parameter, reflecting the general level of mortality, and an age-specific component that represents how rapidly or slowly mortality at each age varies when the general level of mortality changes. This model is fitted to historical data. The resulting estimate of the time-varying parameter is then modelled and projected as a stochastic time series using standard Box–Jenkins or ARIMA methods. From this forecast of the general level of mortality, the future death rates are derived using the estimated age effects. The key difference between the classical generalized linear regression model approach (see Section 4.5.4) and the method pioneered by Lee and Carter (1992) centres on the interpretation of time, which in the log-bilinear approach is modelled as a factor and under the generalized linear regression approach is modelled as a known covariate.

The model proposed by Lee and Carter (1992) has now been widely adopted. However, it is of course not the only candidate for extrapolating mortality to the future. It should be stressed that some models are designed to project specific demographic indicators, and that the forecast horizon may depend on the type of model. In this respect, the model proposed by Lee and Carter (1992) is typically meant for long-term projections of aggregate mortality indicators like life expectancies. It is not intended to produce reliable forecasts of series of death rates for a particular age. This is why this model is so useful for actuaries, who are interested in life annuity premiums and reserves, which are weighted versions of life expectancies (the weights being the financial discount factors). Some extensions incorporating features specific to each cohort are proposed in the next chapter.


In addition to the Lee–Carter model, we also consider a powerful alternative mortality forecasting method proposed by Cairns et al. (2006a). It includes two time factors (whereas only one time factor drives the future death rates in the Lee–Carter case) with a smoothing of age effects using a logit transformation of one-year death probabilities. Specifically, the logit of the one-year death probabilities is modelled as a linear function of age, with intercept and slope parameters following some stochastic process. Compared with the Lee–Carter approach, the Cairns–Blake–Dowd model includes two time factors. This allows the model to capture the imperfect correlation in mortality rates at different ages from one year to the next. This approach can also be seen as a compromise between the generalized regression approach and the Lee–Carter views of mortality modelling, in that age enters the Cairns–Blake–Dowd model as a continuous covariate whereas the effect of calendar time is captured by a couple of factors (time-varying intercept and slope parameters).

The Cairns–Blake–Dowd model is fitted to historical data. The resulting estimates for the time-varying parameters are then projected using a bivariate time series model. From this forecast of the future intercept and slope parameters, the future one-year death probabilities are computed in combination with the linear age effect.
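The fitting and central-projection steps just outlined can be sketched as follows. This is a deliberately simplified illustration (hypothetical function names, synthetic data, least squares per calendar year, drift-only projection of the two time factors rather than a full bivariate time series model).

```python
import numpy as np

def logit(p):
    return np.log(p / (1.0 - p))

def fit_cbd(q, ages):
    """For each calendar year, regress the logit of the one-year death
    probabilities on age: logit q(x, t) = kappa1_t + kappa2_t * x."""
    X = np.column_stack([np.ones_like(ages, dtype=float), ages])
    coef, *_ = np.linalg.lstsq(X, logit(q), rcond=None)   # one fit per column/year
    return coef[0], coef[1]                               # kappa1_t, kappa2_t

def forecast_cbd(k1, k2, ages, horizon):
    """Central projection only: drift the two time factors forward
    (the drifts play the role of the bivariate random walk's mean vector)
    and rebuild one-year death probabilities via the inverse logit."""
    d1, d2 = np.diff(k1).mean(), np.diff(k2).mean()
    steps = np.arange(1, horizon + 1)
    k1f, k2f = k1[-1] + d1 * steps, k2[-1] + d2 * steps
    eta = k1f[None, :] + k2f[None, :] * ages[:, None]
    return 1.0 / (1.0 + np.exp(-eta))

# Synthetic data: an exact CBD surface over ages 60..64 and 6 years
ages = np.arange(60, 65, dtype=float)
k1_true = np.linspace(-10.0, -10.5, 6)      # intercept (level) drifting down
k2_true = np.linspace(0.10, 0.105, 6)       # slope drifting up slightly
q = 1.0 / (1.0 + np.exp(-(k1_true[None, :] + k2_true[None, :] * ages[:, None])))
k1, k2 = fit_cbd(q, ages)
q_fore = forecast_cbd(k1, k2, ages, horizon=5)
```

Because the intercept and slope move separately, the projected improvements differ across ages, which is exactly the imperfect age-to-age correlation the two-factor structure is designed to capture; a full implementation would simulate the bivariate random walk to obtain interval estimates as well.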

Mortality forecasts performed by demographers are traditionally based on the forecaster’s subjective judgements, in the light of historical data and expert opinions. This traditional method has been widely used for official mortality forecasts, and by international agencies. A range of uncertainty is indicated by high and low scenarios (surrounding the medium scenario which is meant to be the best estimate), which are also constructed through subjective judgements.

In the hands of a skilled and knowledgeable forecaster, the traditional method has the advantage of drawing on the full range of relevant information for the medium forecast and the high–low range. However, it also has certain deficiencies. First, mortality projections in industrialized countries have been found to under-predict mortality declines and gains in life expectancy when compared to subsequent outcomes, as pointed out by Lee and Miller (2001). Thus, a systematic downward bias has been observed for this traditional approach during the 20th century. A second difficulty is that it is not clear how to interpret the high–low range of a variable unless a corresponding probability for the range is stated. We will come back to this issue in Section 5.8.

Both the Lee–Carter and the Cairns–Blake–Dowd models greatly reduce the role of subjective judgement since standard diagnostic and statistical modelling procedures for time series analysis are followed. Nonetheless, decisions must be taken about a number of elements of these models – for example, how far back in history to begin, or exactly what time series model to use.

It should be noted that the models investigated in this chapter do not attempt to incorporate assumptions about advances in medical science or specific environmental changes: no information other than previous history is taken into account. The (tacit) underlying assumption is that all of the information about the future is contained in the past observed values of the death rates. This means that this approach is unable to forecast sudden improvements in mortality due to the discovery of new medical treatments, revolutionary cures such as antibiotics, or public health innovations. Similarly, future deteriorations caused by epidemics, the appearance of new diseases, or the aggravation of pollution cannot enter the model. The actuary has to keep this in mind when using these models and making decisions on the basis of their outputs, for example, in the setting of a reinsurance programme.

Some authors have severely criticized the purely extrapolative approach because it seems to ignore the underlying mechanisms of a social, economic, or biological nature. As pointed out by Wilmoth (2000), such a critique is valid only insofar as such mechanisms are understood with sufficient precision to offer a legitimate alternative method of prediction. Since our understanding of the complex interactions of social and biological factors that determine mortality levels is still imprecise, we believe that the extrapolative approach to prediction is particularly compelling in the case of human mortality.

The R software has been found convenient for performing the analyses described in this chapter (as well as those in Chapter 3). R is a free language and environment for statistical computing and graphics. R is a GNU project which is similar to the S language and environment developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. For more details, we refer the interested reader to http://www.r-project.org/.

In addition to our own R code, we have benefitted from the demography package for R created by Rob J. Hyndman, Heather Booth, Leonie Tickle, and John Maindonald. This package contains functions for various demographic analyses. It provides facilities for demographic statistics, modelling, and forecasting. In particular, it implements the forecasting model proposed by Lee and Carter (1992) and several variations of it, as well as the forecasting model proposed by Hyndman and Ullah (2007).


After the Crédit Suisse longevity index based on the expectation of life derived from US data, the more comprehensive JPMorgan LifeMetrics has innovated by producing publicly available indices on population longevity. LifeMetrics is a toolkit for measuring and managing longevity and mortality risk. LifeMetrics advisors include Watson Wyatt and the Pensions Institute at Cass Business School. The LifeMetrics Index provides mortality rates and period life expectancy levels across various ages, by gender, for each national population covered. Currently the LifeMetrics Index publishes index values for the United States, England & Wales, and The Netherlands. All of the methodology, algorithms, and calculations are fully disclosed and open. The LifeMetrics toolkit includes a set of computer-based models that can be used in forecasting mortality and longevity. These models have been evaluated in the research paper 'A quantitative comparison of eight stochastic mortality models using data from England & Wales and the United States' by Cairns et al. (2007). The R source code required to run the forecast models is available for download along with a user guide.

We also mention two other resources which are available from the web (but which were not used in the present book). Federico Girosi and Gary King offer the YourCast software, which makes forecasts by running sets of linear regressions together in a variety of sophisticated ways. This open-source software is freely available from http://gking.harvard.edu/yourcast/. It implements the methods introduced in Federico Girosi and Gary King's manuscript on Demographic Forecasting, to be published by Princeton University Press.

Further, we note the recent initiative of the British CMIB (Continuous Mortality Investigation Bureau), that is, the bureau affiliated to the UK actuarial profession, with the function of producing mortality tables for use by insurers and pension plans. The CMIB has made available software running on R with the aim of illustrating the P-spline methodology for projecting mortality. The CMIB software now allows the fitting of the Lee–Carter model as well, but with restricted ARIMA specifications. For more details, please consult http://www.actuaries.org.uk.

Before embarking on the presentation of the Lee–Carter and the Cairns–Blake–Dowd approaches, let us say a few words about the material not included in the present chapter. First, we do not consider possible cohort effects, and limit our analysis to the age and period dimensions. For countries like Belgium, cohort effects are weak enough to be neglected. However, for countries like the UK, cohort effects are significant and must be accounted for. Chapter 6 is devoted to the inclusion of cohort effects in the Lee–Carter and Cairns–Blake–Dowd models discussed here.


We also do not consider continuous-time models for mortality, which are inherited from the interest-rate management and credit risk literature. We refer the reader to the works by Biffis and Millossovich (2006a), Biffis and Millossovich (2006b), Biffis and Denuit (2006), and Biffis (2005) for more information and further references about this approach. See also Chapter 7 in this book.

5.2 Lee–Carter mortality projection model

5.2.1 Specification

Lee and Carter (1992) proposed a simple model for describing the secular change in mortality as a function of a single time index. Throughout this chapter, we assume that assumption (3.2) is fulfilled, that is, that the age-specific mortality rates are constant within bands of age and time, but allowed to vary from one band to the next. Recall that under (3.2), the force of mortality µx(t) and the death rate mx(t) coincide.

Lee and Carter (1992) specified a log-bilinear form for the force of mortality µx(t), that is,

ln µx(t) = αx + βxκt   (5.1)

The specification (5.1) differs structurally from parametric models given that the dependence on age is non-parametric, being represented by the sequences of αx's and βx's. Interpretation of the parameters is quite simple: exp αx is the general shape of the mortality schedule, and the actual forces of mortality change according to an overall mortality index κt, modulated by an age response βx (the shape of the βx profile tells which rates decline rapidly and which slowly over time in response to changes in κt). The parameter βx represents the age-specific patterns of mortality change. It indicates the sensitivity of the logarithm of the force of mortality at age x to variations in the time index κt. In principle, βx could be negative at some ages x, indicating that mortality at those ages tends to rise while falling at other ages. In practice, this does not seem to happen over the long run, except sometimes at the very oldest ages. There is also some evidence of negative βx estimates for males at young adult ages in certain industrialized countries. This has been attributed to an increase in mortality due to AIDS in the late 1980s and 1990s.

In a typical population, age-specific death rates have a strong tendency to move up and down together over time. The specification (5.1) uses this tendency by modelling the changes over time in age-specific death rates as driven by a scalar factor κt. This strategy implies that the modelled death rates are perfectly correlated across ages, which is both the strength and the weakness of the approach. As pointed out by Lee (2000), the rates of decline in the ln µx(t)'s at different ages are given by βx(κt − κt−1), so that they always maintain the same ratio to one another over time. In practice, the relative speed of decline at different ages may vary. In such a case, the extended version of the Lee–Carter model introduced by Booth et al. (2002) – see equation (5.14) – or the Cairns–Blake–Dowd approach might be preferable.

Remark Hyndman and Ullah (2007) extend the principal components approach by adopting a functional data paradigm combined with non-parametric smoothing (penalized regression splines) and robust statistics. Univariate time series are then fitted to each component coefficient (or level parameter). The Lee–Carter method then appears as a particular case of this general approach. □

Remark Many models produce projected death rates that tend to 0. Hence, some constraint should be imposed on the long-term behaviour of the death rates. In that respect, limit life tables as discussed in Section 4.6.1 may be specified, or we can use a forecast that incorporates a theoretical maximum achievable life expectancy. This feature implies a slowdown in the rate of mortality decline as the theoretical maximum life expectancy is approached. If we denote by µ∞x the limiting force of mortality, the model becomes ln(µx(t) − µ∞x) = αx + βxκt. □

Remark Considering the global convergence in mortality levels, and the common trends evidenced in Section 3.5 of Chapter 3, it may seem appropriate to prepare mortality forecasts for individual national populations in tandem with one another. Li and Lee (2005) have modified the original projection model of Lee and Carter (1992) for producing mortality forecasts for a group of populations. To this end, the central tendencies for the group are first identified using a common factor approach, and national historical particularities are then taken into account.

Note that the most direct application of this approach is to forecast mortality for the two sexes within a single population. The same βx and κt can be used for both males and females, letting the αx's depend on gender, as in Li and Lee (2005). Alternatively, Carter and Lee (1992) used the same κt's for males and females but allowed the αx's and βx's to be gender-specific.

Delwarde et al. (2006) have analysed the pattern of mortality decline in the G5 countries (France, Germany, Japan, UK, and USA). Each G5 country is viewed as the value of a covariate. This model allows us to analyse the level and age pattern of mortality by country, the general time pattern of mortality change, and the speed and age pattern of mortality change by country. As for the Lee–Carter model, the extrapolation of the estimated κt's gives future mortality rates for given gender, age, time, and country. The main interest of this method lies in the estimation of a unique time series (or two if each gender is treated separately) which gives mortality rates for all countries and age–time categories.

As expected, the analysis conducted by Delwarde et al. (2006) reveals that age is the most important factor determining mortality rates. The time effect is more relevant than the country effect if weights are taken into account, which is a sign of convergence. In other words, the time horizon is more important than the country but, since the country effect is not negligible, the differences between country-specific death rates increase with time. These results allow us to compare the mortality experience observed in the G5 countries through the same model and also to produce forecasts. An estimated average death rate and a common index of mortality decline can be obtained from the analysis, which is essential for economists. Most financial and insurance decisions are taken on the basis of a worldwide view, more than on a regional or particular location. From this analysis, one can obtain baseline mortality forecasts from the pooled G5 population but, at the same time, one can see the influence of each gender, age, time trend, and country on the mortality forecast. In this way, the observed past behaviour of the G5 is summarized in a single model and the identification and comparison of each country-specific effect becomes much easier. □

5.2.2 Calibration

5.2.2.1 Identifiability constraints

Let us assume that we have observed data for a set of calendar years t = t1, t2, . . . , tn and for a set of ages x = x1, x2, . . . , xm. On the basis of these observations, we would like to estimate the corresponding αx's, βx's, and κt's. However, this is not possible unless we impose additional constraints.

In (5.1), the αx parameters can only be identified up to an additive constant, the βx parameters up to a multiplicative constant, and the κt parameters up to a linear transformation. Precisely, if we replace βx with cβx and κt with κt/c for any c ≠ 0, or if we replace αx with αx − cβx and κt with κt + c for any c, we obtain the same values for the death rates. This means that we cannot distinguish between the two parametrizations: different values of the parameters produce the same mx(t)'s. To see that two constraints are needed to ensure identification, note that if (5.1) holds true, we also have

ln µx(t) = α̃x + β̃xκ̃t   (5.2)

with α̃x = αx + c1βx, β̃x = βx/c2, and κ̃t = c2(κt − c1). Therefore, we need to impose two constraints on the parameters αx, βx, and κt in order to prevent the arbitrary selection of the constants c1 and c2.

A pair of additional constraints is thus required on the parameters for estimation to circumvent this problem. To some extent, the choice of the constraints is a subjective one, although some choices are more natural than others. In the literature, the parameters in (5.1) are usually subject to the constraints

∑_{t=t1}^{tn} κt = 0   and   ∑_{x=x1}^{xm} βx = 1   (5.3)

ensuring model identification. Under this normalization, βx is the proportion of change in the overall log mortality attributable to age x. We also note that other sets of constraints can be found in the literature, for instance, κ_{tn} = 0 or ∑_{x=x1}^{xm} βx² = 1.

Note that the lack of identifiability of the Lee–Carter model is not a real problem. It just means that the likelihood associated with the model has an infinite number of equivalent maxima, each of which would produce identical forecasts. Adopting the constraints (5.3) amounts to picking one of these equivalent maxima. The important point is that the choice of constraints has no impact on the quality of the fit, or on forecasts of mortality. Some care is needed, however, in any bootstrap procedures used for simulation (see Section 5.8).
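The invariance described above is easy to verify numerically. The following sketch (our own, with arbitrary synthetic parameters) checks that a reparametrized triplet reproduces exactly the same log death rates, and that rescaling to the constraints (5.3) maps both parametrizations to the same representative:

```python
import numpy as np

rng = np.random.default_rng(1)
alpha = rng.normal(-5.0, 1.0, size=10)   # hypothetical age effects
beta = rng.uniform(0.5, 1.5, size=10)
kappa = rng.normal(0.0, 2.0, size=15)

log_m = alpha[:, None] + np.outer(beta, kappa)

# Reparametrize with arbitrary constants c1, c2: same fitted log-rates.
c1, c2 = 3.0, 0.5
alpha2 = alpha + c1 * beta
beta2 = beta / c2
kappa2 = c2 * (kappa - c1)
assert np.allclose(log_m, alpha2[:, None] + np.outer(beta2, kappa2))

# Impose the constraints (5.3): sum(kappa) = 0 and sum(beta) = 1.
def normalize(a, b, k):
    kbar, bsum = k.mean(), b.sum()
    return a + b * kbar, b / bsum, (k - kbar) * bsum

a1, b1, k1 = normalize(alpha, beta, kappa)
a2, b2, k2 = normalize(alpha2, beta2, kappa2)
# Both parametrizations collapse to the same normalized representative.
assert np.allclose(a1, a2) and np.allclose(b1, b2) and np.allclose(k1, k2)
```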

5.2.2.2 Least-squares estimation

Statistical model The model classically used to estimate the αx's, βx's, and κt's is

ln mx(t) = αx + βxκt + εx(t) (5.4)

for x = x1, x2, . . . , xm and t = t1, t2, . . . , tn, where mx(t) denotes the observed force of mortality at age x during year t computed according to (3.13), and where the εx(t)'s are homoskedastic centred error terms. The error term εx(t), with mean 0 and variance σ²ε, reflects any particular age-specific historical influences that are not captured in the model. Note that the errors have the same variance over age, which is sometimes a questionable assumption: the logarithm of the observed force of mortality is usually much more variable at the older ages than at the younger ages because of the much smaller absolute number of deaths at the older ages. However, if the mortality surface has been previously completed (i.e. extrapolated to the oldest ages using a parametric model), the homoskedasticity assumption is not a problem provided that the actuary restricts the age range for modelling to 50 and over, say, in order to avoid the instability around the accident hump.

It is worth mentioning that model (5.4) is not a simple regression model, since there are no observed quantities on the right-hand side. Specifically, age x and calendar time t are treated as factors and their effect on mortality is quantified by the sequences αx1, αx2, . . . , αxm and βx1, βx2, . . . , βxm for age, and by the sequence κt1, κt2, . . . , κtn for calendar time. Note that the model (5.4) is particularly useful when the actuary only has a set of death rates mx(t) at his disposal. In the case where more detailed information is available, the Poisson approach described in the next section makes effective use of observed death counts and exposures-to-risk.

Objective function The model (5.4) is fitted to a matrix of age-specific observed forces of mortality using singular value decomposition. Specifically, the αx's, βx's, and κt's are such that they minimize

OLS(α, β, κ) = ∑_{x=x1}^{xm} ∑_{t=t1}^{tn} (ln mx(t) − αx − βxκt)²   (5.5)

This is equivalent to maximum likelihood estimation provided that the εx(t)'s obey the Normal distribution.

Remark Wilmoth (1993) suggested a weighted least-squares procedure for estimating the (α, β, κ) parameters. Specifically, the objective function (5.5) is replaced with

OWLS(α, β, κ) = ∑_{x=x1}^{xm} ∑_{t=t1}^{tn} wxt (ln mx(t) − αx − βxκt)²   (5.6)

Empirical studies reveal that using the observed death counts dxt as weights (i.e. wxt = dxt) has the effect of bringing the estimated parameters into close agreement with the Poisson-response-based estimates (discussed below). However, the choice of the death counts as weights is questionable, and the Poisson maximum likelihood approach described in the next section has better statistical properties, and should therefore be preferred for inference purposes. The reason is that a valid weighted least-squares approach must use exogenous weights, but the number of deaths is obviously a random variable. As such, estimates resulting from the minimization of OWLS have no known statistical properties and can be strongly biased. □


Effective computation: Singular value decomposition Setting ∂OLS/∂αx equal to 0 yields

∑_{t=t1}^{tn} ln mx(t) = (tn − t1 + 1)αx + βx ∑_{t=t1}^{tn} κt   (5.7)

Since ∑_{t=t1}^{tn} κt = 0 by the constraint (5.3), we get

αx = (1/(tn − t1 + 1)) ∑_{t=t1}^{tn} ln mx(t)   (5.8)

The minimization of (5.5) thus consists in taking for αx the row average of the ln mx(t)'s. When the model (5.4) is fitted by ordinary least squares, the fitted value of αx exactly equals the average of ln mx(t) over time t, so that exp αx represents the general shape of the mortality schedule. We then obtain the βx's and κt's from the first term of a singular value decomposition of the matrix of the ln mx(t) − αx's.

Specifically, death rates can be combined to form a matrix

M = [ mx1(t1)  · · ·  mx1(tn)
         ⋮       ⋱       ⋮
      mxm(t1)  · · ·  mxm(tn) ]   (5.9)

of dimension (xm − x1 + 1) × (tn − t1 + 1). Model (5.1) is then fitted so that it reproduces M as closely as possible. Now, let us create the matrix

Z = ln M − α
  = [ ln mx1(t1) − αx1  · · ·  ln mx1(tn) − αx1
              ⋮            ⋱             ⋮
      ln mxm(t1) − αxm  · · ·  ln mxm(tn) − αxm ]   (5.10)

of dimension (xm − x1 + 1) × (tn − t1 + 1). Approximating the entries zxt of Z with their Lee–Carter expression βxκt indicates that the absence of age–time interactions is assumed, that is, the βx's are fixed over time and the κt's are fixed over ages. Most data sets do not comply with the time-invariance of the βx's, unless the optimal fitting period has been selected as explained below.

Now, the βx's and κt's are such that they minimize

OLS(β, κ) = ∑_{x=x1}^{xm} ∑_{t=t1}^{tn} (zxt − βxκt)²   (5.11)

The solution is given by the singular value decomposition of Z. More precisely, let us define the square matrices ZᵀZ of dimension (tn − t1 + 1) × (tn − t1 + 1) and ZZᵀ of dimension (xm − x1 + 1) × (xm − x1 + 1). Let u1 be the eigenvector corresponding to the largest eigenvalue λ1 of ZᵀZ, and let v1 be the corresponding eigenvector of ZZᵀ. The best approximation of Z in the least-squares sense is known to be

Z ≈ Z⁽¹⁾ = √λ1 v1 u1ᵀ   (5.12)

from which we deduce

β = v1 / ∑_{j=1}^{xm−x1+1} v1j   and   κ = √λ1 (∑_{j=1}^{xm−x1+1} v1j) u1   (5.13)

provided that ∑_{j=1}^{xm−x1+1} v1j ≠ 0. The constraints (5.3) are then satisfied by the βx's and κt's. Note that the second and higher terms of the singular value decomposition together comprise the residuals. Typically, for low-mortality populations, the first-order approximation (5.12) behind the Lee–Carter model accounts for about 95% of the variance of the ln mx(t)'s.
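The estimation procedure just described can be sketched in a few lines (our own illustration on synthetic data, not code from the book; numpy's SVD stands in for the decomposition, and the normalization follows (5.13)):

```python
import numpy as np

rng = np.random.default_rng(2)
n_age, n_year = 40, 25
# Synthetic log death rates with a rank-one structure plus small noise.
true_a = np.linspace(-8.0, -2.0, n_age)
true_b = np.full(n_age, 1.0 / n_age)
true_k = np.linspace(20.0, -20.0, n_year)
log_m = (true_a[:, None] + np.outer(true_b, true_k)
         + rng.normal(0.0, 0.01, (n_age, n_year)))

alpha = log_m.mean(axis=1)            # (5.8): row averages over time
Z = log_m - alpha[:, None]            # centred log death rates

U, s, Vt = np.linalg.svd(Z, full_matrices=False)
v1, u1, sqrt_l1 = U[:, 0], Vt[0, :], s[0]   # first singular triple

beta = v1 / v1.sum()                  # (5.13): sum(beta) = 1
kappa = sqrt_l1 * v1.sum() * u1       # (5.13): sum(kappa) ≈ 0
```

Because the rows of Z have zero mean by construction, the first right singular vector u1 sums to zero, so the constraints (5.3) hold automatically.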

Remark As pointed out by Booth et al. (2002), the original approach by Lee and Carter (1992) makes use of only the first term of the singular value decomposition of the matrix of centred log death rates. In principle, the second- and higher-order terms could be incorporated in the model. The fully expanded model is

ln mx(t) = αx + ∑_{j=1}^{r} βx⁽ʲ⁾ κt⁽ʲ⁾   (5.14)

where r is the rank of the matrix of the ln mx(t) − αx's. In this case, βx⁽ʲ⁾κt⁽ʲ⁾ is referred to as the jth-order term of the approximation. Any systematic variation in the residuals from fitting only the first term would be captured by the second and higher terms. In their empirical illustration, Booth et al. (2002) find a diagonal pattern in the residuals that was interpreted as a cohort–period effect. We will come back to the modelling of cohort effects in the next chapter. Brouhns et al. (2002b) have tested whether the inclusion of a second log-bilinear term significantly improves the quality of the fit, and this was not the case in their empirical illustrations.

Renshaw and Haberman (2003a) report on the failure of the first-order Lee–Carter model to capture important aspects of the England and Wales mortality experience (despite explaining about 95% of the total variance), together with the presence of noteworthy residual patterns in the second-order term. As a consequence, Renshaw and Haberman (2003b) have investigated the feasibility of constructing mortality forecasts on the basis of the first two sets of SVD vectors, rather than just on the first set of such vectors, as in the Lee–Carter approach. Whereas Renshaw and Haberman (2003b) have applied separate univariate ARIMA processes to the first two period components, Renshaw and Haberman (2005) have used a bivariate time series. □

Effective computation: Newton–Raphson The estimates of the parameters αx, βx, and κt can also be obtained recursively using a Newton–Raphson algorithm, avoiding the singular value decomposition.

The system to solve in order to obtain the estimated values of the parameters αx, βx, and κt is obtained by equating to 0 the partial derivatives of OLS(α, β, κ) given in (5.5) with respect to αx, κt, and βx, that is,

0 = ∑_{t=t1}^{tn} (ln mx(t) − αx − βxκt),        x = x1, x2, . . . , xm

0 = ∑_{x=x1}^{xm} βx (ln mx(t) − αx − βxκt),     t = t1, t2, . . . , tn   (5.15)

0 = ∑_{t=t1}^{tn} κt (ln mx(t) − αx − βxκt),     x = x1, x2, . . . , xm

Each of these equations is of the form f(ξ) = 0, where ξ is one of the parameters αx, βx, and κt.

The idea is to update each parameter in turn using a univariate Newton–Raphson recursive scheme. Starting from some initial value ξ⁽⁰⁾, the (k+1)th iteration gives ξ⁽ᵏ⁺¹⁾ from ξ⁽ᵏ⁾ by

ξ⁽ᵏ⁺¹⁾ = ξ⁽ᵏ⁾ − f(ξ⁽ᵏ⁾) / f′(ξ⁽ᵏ⁾)

Each time one of the Lee–Carter parameters αx, βx, and κt is updated, the already revised values of the other parameters are used in the iterative formulas. The recurrence relations are thus as follows:

αx⁽ᵏ⁺¹⁾ = αx⁽ᵏ⁾ + [ ∑_{t=t1}^{tn} (ln mx(t) − αx⁽ᵏ⁾ − βx⁽ᵏ⁾κt⁽ᵏ⁾) ] / (tn − t1 + 1)

κt⁽ᵏ⁺¹⁾ = κt⁽ᵏ⁾ + [ ∑_{x=x1}^{xm} βx⁽ᵏ⁾ (ln mx(t) − αx⁽ᵏ⁺¹⁾ − βx⁽ᵏ⁾κt⁽ᵏ⁾) ] / [ ∑_{x=x1}^{xm} (βx⁽ᵏ⁾)² ]   (5.16)

βx⁽ᵏ⁺¹⁾ = βx⁽ᵏ⁾ + [ ∑_{t=t1}^{tn} κt⁽ᵏ⁺¹⁾ (ln mx(t) − αx⁽ᵏ⁺¹⁾ − βx⁽ᵏ⁾κt⁽ᵏ⁺¹⁾) ] / [ ∑_{t=t1}^{tn} (κt⁽ᵏ⁺¹⁾)² ]


This alternative to singular value decomposition does not require a rectangular array of data (it suffices to let the summation indices range over the available observations). Further, estimation can proceed in the presence of empty cells, as these receive a zero weight and are then simply excluded from the computations.
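A minimal sketch of the scheme (5.16) (our own, on synthetic data) with a 0/1 weight matrix standing in for possibly empty cells; each update is the exact per-coordinate Newton step of the quadratic objective:

```python
import numpy as np

rng = np.random.default_rng(3)
n_age, n_year = 20, 15
# Synthetic log death rates with a rank-one structure plus noise.
log_m = (np.linspace(-7.0, -3.0, n_age)[:, None]
         + np.outer(np.full(n_age, 1.0 / n_age), np.linspace(5.0, -5.0, n_year))
         + rng.normal(0.0, 0.01, (n_age, n_year)))
w = np.ones_like(log_m)                 # 0 would mark an empty (age, year) cell

alpha = np.zeros(n_age)
beta = np.full(n_age, 1.0 / n_age)
kappa = np.zeros(n_year)

for _ in range(200):
    r = w * (log_m - alpha[:, None] - np.outer(beta, kappa))
    alpha += r.sum(axis=1) / w.sum(axis=1)                    # alpha update
    r = w * (log_m - alpha[:, None] - np.outer(beta, kappa))
    kappa += (beta @ r) / (w * beta[:, None] ** 2).sum(axis=0)  # kappa update
    r = w * (log_m - alpha[:, None] - np.outer(beta, kappa))
    beta += (r @ kappa) / (w * kappa[None, :] ** 2).sum(axis=1)  # beta update

# Rescale to satisfy the constraints (5.3).
alpha += beta * kappa.mean()
kappa = (kappa - kappa.mean()) * beta.sum()
beta /= beta.sum()
```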

Identifiability constraints The estimates for αx, βx, and κt produced by the methods described above (the singular value decomposition or the Newton–Raphson procedure (5.16)) do not satisfy the constraints (5.3). To fulfil the identifiability constraints, we replace αx with αx + βxκ̄, κt with (κt − κ̄)β•, and βx with βx/β•, where β• is the sum of the βx's coming out of the singular value decomposition or the Newton–Raphson procedure (5.16), and κ̄ is the average of the κt's coming out of the same procedure.

Adjustment of the κt's by refitting to the total observed deaths Instead of keeping the κt's obtained from the singular value decomposition or the Newton–Raphson algorithm, Lee and Carter (1992) suggested that the κt's (taking the αx's and βx's as given) be adjusted in order to reproduce the observed total number of deaths ∑_{x=x1}^{xm} Dxt in year t. This avoids discrepancies arising from modelling on the logarithmic scale.

Since it is desirable that the differences between the actual and expected total deaths in each year are zero, as in the construction and graduation of period life tables, the adjusted κt solves the equation

∑_{x=x1}^{xm} Dxt = ∑_{x=x1}^{xm} ETRxt exp(αx + βxζ)   (5.17)

in ζ. So, the κt's are re-estimated in such a way that the resulting death rates (with the previously estimated αx and βx), applied to the actual exposures-to-risk, produce the total number of deaths actually observed in the data for the year t in question. There are several advantages to making this second-stage estimate of the parameters κt. In particular, it avoids sizeable discrepancies between predicted and actual deaths (which may occur because the model (5.4) is specified by means of logarithms of death rates). We note that no explicit solution is available for (5.17), which thus has to be solved numerically (using a Newton–Raphson procedure, for instance).
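A sketch of this second-stage adjustment for a single year (our own illustration; the exposures, first-stage estimates, and observed death total below are hypothetical), solving (5.17) for ζ by Newton–Raphson:

```python
import numpy as np

# Hypothetical first-stage estimates and data for one calendar year t.
alpha = np.linspace(-6.0, -2.0, 30)
beta = np.full(30, 1.0 / 30)
etr = np.full(30, 10_000.0)             # exposures-to-risk ETR_{xt}
deaths_total = 12_000.0                 # observed sum over x of D_{xt}

# Newton-Raphson on g(z) = sum_x ETR * exp(alpha + beta*z) - deaths_total;
# g is increasing in z when all beta_x > 0, so the iteration is well behaved.
z = 0.0
for _ in range(50):
    mu = etr * np.exp(alpha + beta * z)     # expected deaths by age
    g, dg = mu.sum() - deaths_total, (beta * mu).sum()
    z -= g / dg
    if abs(g) < 1e-8:
        break

kappa_adj = z   # the adjusted kappa_t for this year
```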

It is worth mentioning that more than one solution of (5.17) may arise when the βx's do not all have the same sign. A non-uniform sign for the βx's implies that mortality is increasing at some ages while decreasing at others. This is not normally expected to happen, except sometimes at advanced ages (but the phenomenon disappears when the actuary starts the modelling by closing the life tables). Therefore, solving (5.17) usually does not pose any problem.

Adjustment of the κt's by refitting to the observed period life expectancies Whereas Lee and Carter (1992) have suggested that the κt's be adjusted as in (5.17) by refitting to the total observed deaths, Lee and Miller (2001) have proposed an adjustment procedure that reproduces the period life expectancy at some selected age (instead of the total number of deaths recorded during the year).

In practice, the actuary first selects an age x0. In population studies, it is common to take x0 = 0, but in mortality projections for annuitants, taking x0 = 60 or 65 may be more meaningful. Considering (3.18), the estimated κt is adjusted to match the observed life expectancy at age x0 in year t, given the estimated αx's and βx's obtained from the singular value decomposition or from the Newton–Raphson algorithm. Thus, the adjusted κt solves the equation

e↑x0(t) = [1 − exp(−exp(αx0 + βx0ζ))] / exp(αx0 + βx0ζ)
          + ∑_{k≥1} { ∏_{j=0}^{k−1} exp(−exp(αx0+j + βx0+jζ)) } × [1 − exp(−exp(αx0+k + βx0+kζ))] / exp(αx0+k + βx0+kζ)   (5.18)

in ζ.

The advantage of this second adjustment procedure is that it requires neither exposures-to-risk nor death counts and is thus generally applicable. Note that, as before, numerical problems may arise when the βx's do not have the same sign, but we believe that this problem is unlikely to occur in practice.

Adjustment of the κt's by refitting to the observed age distribution of deaths Booth et al. (2002) have suggested another procedure for adjusting the κt's. Rather than fitting the yearly total number of deaths ∑_{x=x1}^{xm} Dxt as in (5.17), this variant fits the age distribution of deaths Dxt, assuming the Poisson distribution for the age-specific death counts and using the deviance statistic to measure the goodness-of-fit. Specifically, for a fixed calendar year t, the Dxt's are considered as independent random variables obeying the Poisson distribution with respective means ETRxt exp(αx + βxκt), where the values of the αx's and βx's are those coming from either the singular value decomposition or the Newton–Raphson iterative method, and where κt has to be determined in order to make the observed Dxt's as likely as possible. This means that κt maximizes the Poisson log-likelihood

∑_{x=x1}^{xm} ( Dxt ln(ETRxt exp(αx + βxζ)) − ETRxt exp(αx + βxζ) )   (5.19)

over ζ, or equivalently, minimizes the deviance

D = 2 ∑_{x=x1}^{xm} ( Dxt ln(Dxt/D̂xt) − (Dxt − D̂xt) )   (5.20)

where D̂xt = ETRxt exp(αx + βxζ) is the expected number of deaths, keeping the αx's and βx's unchanged.

Identifiability constraints The identifiability constraints (5.3) are no longer satisfied by the adjusted κt's. Therefore, we replace κt with κt − κ̄ and αx with αx + βxκ̄, where κ̄ is the average of the adjusted κt's. This simple method only works because we are dealing with an identification constraint (not a model restriction).

5.2.2.3 Poisson maximum likelihood estimation

Statistical model Let us now assume that the actuary has at his/her disposal observed death counts Dxt and corresponding exposures ETRxt. The least-squares approach can then be applied to the ratio of the death counts to the exposures (i.e. to the mx(t) = Dxt/ETRxt's, as explained above). The method presented in this section better exploits the available information, and does not assume that the variability of the mx(t)'s is the same whatever the age x. Specifically, we assume that the number of deaths at age x in year t has Poisson random variation. To justify this approach, we prove that assumption (3.2) is compatible with Poisson modelling for death counts. To this end, let us focus on a particular pair: age x – calendar year t. We observe Dxt deaths among Lxt individuals aged x on January 1 of year t. We assume that the remaining lifetimes of these individuals are independent and identically distributed. The likelihood function (3.12) is proportional to the Poisson likelihood, that is, the one obtained under the assumption that Dxt is Poisson distributed with mean ETRxt µx(t) = ETRxt exp(αx + βxκt), where the parameters are still subject to the constraints (5.3). Therefore, provided that we resort to maximum likelihood estimation, working on the basis of the 'true' likelihood (3.12) or working on the basis of the Poisson likelihood are equivalent, once assumption (3.2) has been made.

Objective function The parameters αx, βx, and κt are now estimated by maximizing the log-likelihood based on the Poisson distributional assumption. This is given by

L(α, β, κ) = Σ_{x=x1}^{xm} Σ_{t=t1}^{tn} ( Dxt(αx + βxκt) − ETRxt exp(αx + βxκt) ) + constant. (5.21)

Equivalently, the parameters are estimated by minimizing the associated deviance defined as

D = −2(L(α,β, κ) − Lf ) (5.22)

where Lf is the log-likelihood of the full or saturated model (characterized by equating the fitted and actual numbers of deaths).
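Substituting the two Poisson log-likelihoods into (5.22) yields the familiar closed form D = 2 Σ ( Dxt ln(Dxt/D̂xt) − (Dxt − D̂xt) ), where D̂xt denotes the fitted number of deaths. A minimal sketch (names are ours):

```python
import numpy as np

def poisson_deviance(d_obs, d_fit):
    """Deviance (5.22) for Poisson death counts: twice the gap between
    the saturated log-likelihood (fitted deaths = actual deaths) and
    the model log-likelihood.  d_obs and d_fit are arrays of observed
    and fitted deaths; the convention 0 * ln(0) = 0 is applied."""
    d_obs = np.asarray(d_obs, dtype=float)
    d_fit = np.asarray(d_fit, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        term = np.where(d_obs > 0, d_obs * np.log(d_obs / d_fit), 0.0)
    return 2.0 * np.sum(term - (d_obs - d_fit))
```

The deviance is zero exactly when fitted and observed deaths coincide, and positive otherwise.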

Effective computation Because of the presence of the bilinear term βxκt, it is not possible to estimate the proposed model with commercial statistical packages that implement Poisson regression. We can nevertheless easily solve the likelihood equations with the help of a uni-dimensional, or elementary, Newton–Raphson method, as implemented in (5.16) in the least-squares case.

The updating scheme is as follows: starting with α_x^(0) = 0, β_x^(0) = 1, and κ_t^(0) = 0 (random values can also be used), the sequences of α_x^(k), β_x^(k), and κ_t^(k) are obtained from the formulas

α_x^(k+1) = α_x^(k) − [ Σ_{t=t1}^{tn} ( D_xt − ETR_xt exp( α_x^(k) + β_x^(k) κ_t^(k) ) ) ] / [ − Σ_{t=t1}^{tn} ETR_xt exp( α_x^(k) + β_x^(k) κ_t^(k) ) ]

κ_t^(k+1) = κ_t^(k) − [ Σ_{x=x1}^{xm} ( D_xt − ETR_xt exp( α_x^(k+1) + β_x^(k) κ_t^(k) ) ) β_x^(k) ] / [ − Σ_{x=x1}^{xm} ETR_xt exp( α_x^(k+1) + β_x^(k) κ_t^(k) ) ( β_x^(k) )² ]     (5.23)

β_x^(k+1) = β_x^(k) − [ Σ_{t=t1}^{tn} ( D_xt − ETR_xt exp( α_x^(k+1) + β_x^(k) κ_t^(k+1) ) ) κ_t^(k+1) ] / [ − Σ_{t=t1}^{tn} ETR_xt exp( α_x^(k+1) + β_x^(k) κ_t^(k+1) ) ( κ_t^(k+1) )² ]

each update being a one-dimensional Newton step (score divided by the corresponding second derivative), performed with the other parameters held at their latest values.

β(k+1)x = β(k)

x −∑tn

t=t1

(Dxt − ETRxt exp

(k+1)x + β

(k)x κ

(k+1)t

))κ(k+1)t

−∑tnt=t1 ETRxt exp

(k+1)x + β

(k)x κ

(k+1)t

) (κ(k+1)t

)2

198 5 : Age-period projection models

The criterion used to stop the procedure is a relative increase in the log-likelihood function that is smaller than a pre-selected, sufficiently small, fixed number.

The maximum likelihood estimates of the parameters coming out of (5.23) have to be adapted in order to fulfill the constraints (5.3): specifically, we replace κt with (κt − κ̄) Σ_{x=x1}^{xm} βx, βx with βx / Σ_{x=x1}^{xm} βx, and αx with αx + βxκ̄.
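As a concrete, deliberately simplified sketch of the scheme (5.23) followed by the constraint adjustment, the three Newton steps can be coded in a few lines. The function name and the array layout (D[x, t] for deaths, E[x, t] for exposures) are our own choices, not the authors'; a production implementation would add safeguards:

```python
import numpy as np

def fit_lc_poisson(D, E, n_iter=200, tol=1e-10):
    """Elementary Newton-Raphson scheme (5.23) for the Poisson
    Lee-Carter model.  Returns (alpha, beta, kappa) normalized so that
    sum(beta) = 1 and sum(kappa) = 0, as required by (5.3)."""
    nx, nt = D.shape
    alpha, beta, kappa = np.zeros(nx), np.ones(nx), np.zeros(nt)

    def loglik():
        eta = alpha[:, None] + beta[:, None] * kappa[None, :]
        return np.sum(D * eta - E * np.exp(eta))

    ll_old = loglik()
    for _ in range(n_iter):
        # one Newton step per alpha_x, with beta and kappa held fixed
        Dhat = E * np.exp(alpha[:, None] + beta[:, None] * kappa[None, :])
        alpha += (D - Dhat).sum(axis=1) / Dhat.sum(axis=1)
        # one Newton step per kappa_t, using the updated alpha
        Dhat = E * np.exp(alpha[:, None] + beta[:, None] * kappa[None, :])
        kappa += ((D - Dhat) * beta[:, None]).sum(axis=0) / (
            (Dhat * beta[:, None] ** 2).sum(axis=0))
        # one Newton step per beta_x, using the updated alpha and kappa
        Dhat = E * np.exp(alpha[:, None] + beta[:, None] * kappa[None, :])
        beta += ((D - Dhat) * kappa[None, :]).sum(axis=1) / (
            (Dhat * kappa[None, :] ** 2).sum(axis=1))
        # stop on a sufficiently small relative log-likelihood increase
        ll_new = loglik()
        if abs(ll_new - ll_old) < tol * (abs(ll_old) + tol):
            break
        ll_old = ll_new

    # impose the identifiability constraints (5.3)
    kbar, bsum = kappa.mean(), beta.sum()
    return alpha + beta * kbar, beta / bsum, (kappa - kbar) * bsum
```

Because each αx step solves the corresponding score equation, the fitted deaths reproduce the observed totals per age at convergence, in line with (5.24) below.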

Remark As pointed out by Renshaw and Haberman (2006), the error structure can be imposed by specifying the second-moment properties of the model, as in the framework of generalized linear modelling. This allows for a range of options for the choice of the error distribution, including Poisson, both with and without dispersion, as well as Gaussian, as used in the original approach by Lee and Carter (1992). □

Remark In contrast to the classical least-squares approach to estimating the parameters, the error applies directly to the number of deaths in the Poisson regression approach. There is, thus, no need for a second-stage estimation like (5.17) for the κt's.

Note that differentiating the log-likelihood (5.21) with respect to αx gives the equation

Σ_{t=t1}^{tn} Dxt = Σ_{t=t1}^{tn} ETRxt exp(αx + βxκt) (5.24)

which is similar to (5.17), except that the sum is now over calendar time instead of age. So, the estimated parameters are such that the resulting death rates applied to the actual risk exposure reproduce the total number of deaths actually observed in the data for each age x. Sizable discrepancies between predicted and actual deaths are thus avoided. □

5.2.2.3 Alternative estimation procedures for logbilinear models

Brillinger (1986) showed that under reasonable assumptions about the processes governing births and deaths, the Poisson distribution is a good candidate for modelling the numbers of deaths at different ages. This provides a sound justification for the Poisson model for estimating the (α, β, κ) parameters. There are nevertheless (at least) two alternatives for estimating the parameters.

Binomial maximum likelihood estimation Cossette and Marceau (2007) have proposed a Binomial regression model for estimating the parameters in log-bilinear mortality projection models. The annual number Dxt of recorded deaths is then assumed to follow a Binomial distribution, with a death probability qx(t), which is expressed in terms of the force of mortality (5.1) via qx(t) = 1 − exp(−µx(t)).

The number of deaths Dxt at age x during year t has a Binomial distribution with parameters Lxt and qx(t). The specification for µx(t) gives

qx(t) = 1 − exp( − exp(αx + βxκt) ) (5.25)

To ensure identifiability, we adhere to the set of constraints (5.3). Assuming independence, the likelihood for the entire data set is the corresponding product of Binomial probability factors. The log-likelihood is then given by

L(α, β, κ) = Σ_{t=t1}^{tn} Σ_{x=x1}^{xm} ( (Lxt − dxt) ln(1 − qx(t)) + dxt ln qx(t) ) + constant (5.26)

As in the Poisson case, the presence of the bilinear term βxκt makes commercial statistical packages that implement linear Binomial regression useless. An iterative procedure has been proposed in Cossette et al. (2007) for estimating the parameters. A parallel analysis is provided by Haberman and Renshaw (2008), with an investigation of a number of alternative specifications to (5.25).
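Writing out the Binomial log-likelihood explicitly, with the dxt deaths contributing ln qx(t) and the Lxt − dxt survivors contributing ln(1 − qx(t)), gives the following sketch (function and variable names are ours):

```python
import numpy as np

def binomial_loglik(d, L, alpha, beta, kappa):
    """Binomial log-likelihood (up to an additive constant) for the
    log-bilinear model: d[x, t] deaths among L[x, t] lives, with the
    one-year death probability q_x(t) = 1 - exp(-exp(alpha_x +
    beta_x * kappa_t)) of (5.25)."""
    q = 1.0 - np.exp(-np.exp(alpha[:, None] + beta[:, None] * kappa[None, :]))
    return np.sum(d * np.log(q) + (L - d) * np.log(1.0 - q))
```

For a single cell the likelihood is maximized when the implied qx(t) equals the crude rate dxt/Lxt, which provides a quick sanity check of any fitting routine built on top of this function.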

Overdispersed Poisson and Negative Binomial maximum likelihood estimation Poisson modelling induces equidispersion. We know from Section 3.3.9 that populations are heterogeneous with respect to mortality. Heterogeneity tends to increase the variance compared to the mean (a phenomenon termed overdispersion), which rules out the Poisson specification and favours a mixed Poisson model. Besides gender, age x, and year t, there are many other exogenous factors affecting mortality. It is, therefore, natural to extend the Lee–Carter model in order to take this feature into account. One approach, advocated by Renshaw and Haberman (2003b), Renshaw and Haberman (2003c), and Renshaw and Haberman (2006), is to postulate that the random number of deaths Dxt has an overdispersed Poisson distribution. Thus, it is suggested that

Var[Dxt] = φE[Dxt] (5.27)

where φ is a parameter that measures the degree of overdispersion. Clearly, φ = 1 reduces to the standard Poisson case.
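A common moment estimate of φ, used in quasi-Poisson modelling, divides the Pearson chi-square statistic by the residual degrees of freedom. This is one standard choice, not necessarily the estimator used by the authors cited above:

```python
import numpy as np

def overdispersion(d_obs, d_fit, n_params):
    """Moment estimate of the dispersion parameter phi in (5.27):
    Pearson chi-square statistic divided by the residual degrees of
    freedom (number of cells minus number of free parameters)."""
    d_obs = np.asarray(d_obs, dtype=float)
    d_fit = np.asarray(d_fit, dtype=float)
    pearson = np.sum((d_obs - d_fit) ** 2 / d_fit)
    return pearson / (d_obs.size - n_params)
```

An estimate markedly above 1 signals overdispersion and points towards the mixed Poisson or Negative Binomial alternatives discussed next.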

An alternative approach is to take the exogenous factors into account by adding a random effect εxt superimposed on the Lee–Carter predictor αx + βxκt, exactly as in (5.4). More precisely, Delwarde et al. (2007b) have suggested the replacement of the Poisson model with a Mixed Poisson one. Given εxt, the number of deaths Dxt is assumed to be Poisson distributed with mean ETRxt exp(αx + βxκt + εxt). Unconditionally, Dxt obeys a mixture of Poisson distributions. The εxt's are assumed to be independent and identically distributed. A prominent example consists in taking the Dxt's to be Negative Binomial distributed. See also Renshaw and Haberman (2008).

Mortality data from the life insurance market often exhibit overdispersion because of the presence of duplicates. It is common for individuals to hold more than one life insurance or annuity policy and hence to appear more than once in the counts of exposed-to-risk or deaths. In such a case, the portfolio is said to contain duplicates, that is, the portfolio contains several policies concerning the same lives. It is well known that the variance becomes inflated in the presence of duplicates. Consequently, even if the portfolio (or one of its risk classes) is homogeneous, the presence of duplicates would increase the variance and cause overdispersion. The overdispersed Poisson and Negative Binomial models for estimating the parameters of log-bilinear models for mortality projections are thus particularly promising for actuarial applications.

5.2.3 Application to Belgian mortality statistics

Before embarking on a mortality projection case study, we have to decide about the type of mortality statistics that will be used. In some countries (such as the UK), extensive data are available for policyholders, according to the type of contract. In such a case, we might wonder whether the forecast should be based on population or market data.

Using market data allows us to take adverse selection into account. However, basing mortality projections on market data would implicitly assume that no structural breaks have occurred because of changes to the character of the market, or modifications in the tax system or in the level of adverse selection, for instance. Thus, this is not always the best strategy. Assume, for example, that the government starts offering incentives to individuals from the lower socio-economic classes to buy life annuities in order to supplement public pensions. Using market data would result in a worsening in mortality because of a modification in the profile of the insured lives (as lower socio-economic classes usually experience higher mortality rates). Hence, this would artificially modify the mortality trends for the market. It is, thus, impossible to separate long-term mortality trends from modifications in the structure of the insured population. If, however, we need to undertake forecasts based on market data, covariates are often helpful, such as the amount of the annuity (reflecting individuals' socio-economic class).

Actuaries sometimes weight their calculations by policy size to account for socio-economic differentials amongst policyholders. These 'amount-based' measures usually produce lower mortality rates than their 'lives-based' equivalents, due to the tendency for wealthier policyholders to live longer. The pension size is thus used as a proxy for socio-economic group. However, this approach is somewhat ad hoc, and the amount of pension would be better included explicitly as a covariate in the regression models used for mortality projections.

For the reason given above, we prefer to use general population data for mortality forecasting. Relational models introduced in Section 3.4.4 would allow us to take adverse selection into account, and to exploit the long-term changes in population mortality. Specifically, the overall mortality trend would be estimated from the general population, and a regression model then used to switch from the general population to the insurance market. Proceeding in this way would separate the long-term mortality trends from the particular features of the insured population.

We begin by fitting the log-bilinear model to the HMD data set by the least-squares method. We only consider males; the analysis for females is similar. The calendar years 1920–2005 and ages 0–104 are included in the analysis. The reason for restricting the highest age to 104 is that the Belgian 2002–2004 population life table that will serve as the basis for the forecast (as explained below) does not extend beyond this age. Note that the data at high ages have been processed in the HMD, so that the independence assumption is no longer valid at these ages and the corresponding results have to be interpreted with care. Figure 5.1 (top panels) plots the estimated αx's, βx's, and κt's. The estimated αx's exhibit the typical shape of a set of log death rates, with relatively high values around birth, a decrease at infant ages, the accident hump, and finally the increase at adult ages with an ultimately concave behaviour. The estimated βx's appear to decrease with age, suggesting that most of the mortality decreases are concentrated at the young ages. The estimated κt's are adjusted to reproduce the observed period life expectancies at birth. The estimated κt's are affected by World War II, with comparatively higher values in the early 1940s. We note that the model explains 92.09% of the total variance.

We now restrict ourselves to ages above 60. Figure 5.1 (bottom panels) plots the estimated αx, βx, and κt. The model now explains 90.18% of the total variance. Note that, compared to the case where all the ages 0–104 were included in the analysis, the adjusted κt's are much more similar to the initial ones coming from the singular value decomposition.

Figure 5.1. Estimated αx, βx, and κt (from left to right), x = 0, 1, . . . , 104 (top panels) and x = 60, 61, . . . , 104 (bottom panels), t = 1920, 1921, . . . , 2005, obtained with HMD data by minimizing the sum of squares (5.5), with the estimated κt's adjusted by refitting to the period life expectancies at birth or at age 60 (for the estimated κt's, the values before adjustment are displayed in broken line).

5.3 Cairns–Blake–Dowd mortality projection model 203

It is important to mention that the sole use of the proportion of the total temporal variance (as measured by the ratio of the first singular value to the sum of singular values) is not a satisfactory diagnostic indicator. An examination of the residuals is needed to check for model adequacy (see below).
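This diagnostic can be computed directly from a singular value decomposition; a minimal sketch, where the centred log-rate matrix Z (our name) is assumed to have been built beforehand:

```python
import numpy as np

def first_sv_ratio(Z):
    """Proportion-of-variance diagnostic: ratio of the first singular
    value of the centred log-rate matrix Z[x, t] = ln m_x(t) - alpha_x
    to the sum of all its singular values."""
    s = np.linalg.svd(Z, compute_uv=False)
    return s[0] / s.sum()
```

A ratio close to 1 only says that a rank-1 approximation captures most of the temporal variance; as noted above, residual analysis is still required.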

The fitted mortality surfaces are depicted in Fig. 5.2. These surfaces should be compared with Fig. 5.3. The mortality experience appears rather smooth, with some ridges around 1940–1945.

We now fit the log-bilinear model to the HMD data set by the method of Poisson maximum likelihood. All of the ages 0–104 are included in the analysis. Figure 5.3 (top panels) plots the estimated αx, βx, and κt. The estimated parameters are compared with those obtained by minimizing the sum of the squared residuals (5.5). We see that the least-squares and Poisson maximum likelihood procedures produce very similar sets of estimated parameters αx, βx, and κt.

As above, we restrict ourselves to ages above 60. Figure 5.3 (bottom panels) plots the estimated αx, βx, and κt. The estimated parameters are compared with those obtained by least-squares. We observe sizeable discrepancies between the βx's produced by the least-squares and Poisson maximum likelihood procedures, whereas the αx's and κt's remain similar.

5.3 Cairns–Blake–Dowd mortality projection model

5.3.1 Specification

Empirical analyses suggest that ln qx(t)/px(t) is reasonably linear in x for fixed t (sometimes with a small degree of curvature in the plot of ln qx(t)/px(t) against x), except at younger ages. This is why Cairns et al. (2006a) assume that

ln( qx(t) / px(t) ) = κ_t^[1] + κ_t^[2] x ⇔ qx(t) = exp( κ_t^[1] + κ_t^[2] x ) / ( 1 + exp( κ_t^[1] + κ_t^[2] x ) ) (5.28)

where κ_t^[1] and κ_t^[2] are themselves stochastic processes. This specification does not suffer from any identifiability problems, so that no constraints need to be specified.

We see that age is now treated as a continuous covariate and enters the model in a linear way on the logit scale. The intercept κ_t^[1] and slope κ_t^[2] parameters make up a bivariate time series, the future path of which governs the projected life tables. The intercept period term κ_t^[1] is generally declining

Figure 5.2. Fitted death rates (on the log scale) for Belgian males, ages 0–104 (top panel) and ages 60–104 (bottom panel), period 1920–2005.

Figure 5.3. Estimated αx, βx, and κt (from left to right), x = 0, 1, . . . , 104 (top panels) and x = 60, 61, . . . , 104 (bottom panels), t = 1920, 1921, . . . , 2005, obtained with HMD data by maximizing the Poisson log-likelihood (5.21) (the values obtained by least-squares are displayed in broken line).


over time, which corresponds to the feature that mortality rates have been decreasing over time at all ages. Hence, the upward-sloping plot of the logit of death probabilities against age is shifting downwards over time. If, during the fitting period, the mortality improvements have been greater at lower ages than at higher ages, the slope period term κ_t^[2] would be increasing over time. In such a case, the plot of the logit of death probabilities against age would become steeper as it shifts downwards over time.

Sometimes, the logit of the death probabilities qx(t) plotted against age x exhibits a slight curvature after retirement age. This curvature can be modelled by including a quadratic term in age in the Cairns–Blake–Dowd model. However, the dynamics of the time factor associated with this quadratic effect often remain unclear and, when combined with the quadratic age term, its contribution to mortality dynamics is highly complex.

The Cairns–Blake–Dowd model has two time series, κ_t^[1] and κ_t^[2], which affect different ages in different ways. This is a fundamental difference compared with the 1-factor Lee–Carter approach, where a single time series induces perfect correlation in mortality rates at different ages from one year to the next. There is empirical evidence to suggest that changes in the death rates are imperfectly correlated, which supports the Cairns–Blake–Dowd model or the 2-factor Lee–Carter model represented by equation (5.14) with r = 2. Compared to the 1-factor Lee–Carter model, the Cairns–Blake–Dowd model thus allows changes in underlying mortality rates that are not perfectly correlated across ages. Also, the longer the run of data that the actuary uses, the better the 2-factor model performs relative to its 1-factor counterpart. For example, if we consider the entire 20th century, mortality improvements concentrate on younger ages during the first half of the century and on higher ages during the second half. We need a 2-factor model to capture these two different dynamics. Note, however, that the restriction to the optimal fitting period in the Lee–Carter case favours recent past history, so that the inclusion of a second factor may not be needed.

Note that the switch from a unique time series to a pair of time-dynamic factors has far-reaching consequences when we discuss securitization, as the existence of an imperfect correlation structure implies, for example, that hedging longevity-linked liabilities would require more than one hedging instrument.

5.3.2 Calibration

We assume that we have observed data for a set of calendar years t = t1, t2, . . . , tn and for a set of ages x = x1, x2, . . . , xm. On the basis of these observations, we would like to estimate the intercept κ_t^[1] and slope κ_t^[2] parameters. This can be done by least-squares. This means that the regression model

ln( qx(t) / px(t) ) = κ_t^[1] + κ_t^[2] x + εx(t) (5.29)

is fitted to the observations of calendar year t, where the qx(t)'s are the crude one-year death probabilities, and where the error terms εx(t) are independent and Normally distributed, with mean 0 and constant variance σ²_ε. The objective function

Ot(κ) = Σ_{x=x1}^{xm} ( ln( qx(t) / px(t) ) − κ_t^[1] − κ_t^[2] x )² (5.30)

has to be minimized for each calendar year t, giving the estimates of the κ_t^[1] and κ_t^[2] parameters. Note that, in contrast to the Lee–Carter case, where the estimated time index κt depends on the observation period, the time indices κ_t^[1] and κ_t^[2] are estimated separately for each calendar year t in the Cairns–Blake–Dowd model.
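Since (5.30) is an ordinary least-squares problem for each calendar year separately, the calibration reduces to a year-by-year linear regression of the logits on age. A minimal sketch (the function name and array layout are our assumptions):

```python
import numpy as np

def fit_cbd_ls(q, ages):
    """Least-squares calibration (5.30) of the Cairns-Blake-Dowd model:
    for each calendar year, regress logit q_x(t) = ln(q/(1-q)) on age.
    q is an (n_ages, n_years) array of crude one-year death
    probabilities.  Returns kappa1[t], kappa2[t] and the per-year R^2
    (the 'adjustment coefficient' reported in the text)."""
    y = np.log(q / (1.0 - q))                     # logit, one column per year
    X = np.column_stack([np.ones_like(ages), ages])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)  # shape (2, n_years)
    kappa1, kappa2 = coef[0], coef[1]
    resid = y - X @ coef
    r2 = 1.0 - resid.var(axis=0) / y.var(axis=0)
    return kappa1, kappa2, r2
```

Note that `lstsq` handles all calendar years at once, since each column of the logit matrix is an independent right-hand side.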

The Cairns–Blake–Dowd model can also be calibrated in a number of alternative ways, as was the case for the Lee–Carter model. For instance, a Poisson regression model can be specified by assuming that the observed death counts are independent and Poisson distributed, with a mean equal to the product of the exposure-to-risk times the population death rate of the form

µx(t) = − ln(1 − qx(t)) = ln( 1 + exp( κ_t^[1] + κ_t^[2] x ) ) (5.31)

Estimation based on a Binomial or Negative Binomial error structure can also be envisaged.

5.3.3 Application to Belgian mortality statistics

As for the implementation of the Lee–Carter approach, we fit the Cairns–Blake–Dowd model by least-squares to the HMD data set, using Belgian males from the general population. The results of the fit are displayed in Fig. 5.4.

The top panels of Fig. 5.4 display the results when all of the ages 0–104 are included in the analysis. Note that the Cairns–Blake–Dowd model was never designed to cover all ages, certainly not down to age 0. The linearity in x means that this model is not able to capture the levelling off around age 30 and the accident hump around age 20. From left

Figure 5.4. Estimated κ_t^[1] and κ_t^[2] parameters, together with the values of the adjustment coefficient by calendar year (from left to right), for ages x = 0, 1, . . . , 104 (top panels) and x = 60, 61, . . . , 104 (bottom panels), t = 1920, 1921, . . . , 2005, obtained with HMD data by least-squares.

5.4 Smoothing 209

to right, we see the estimated κ_t^[1]'s, the estimated κ_t^[2]'s, and the value of the adjustment coefficient R²(t) for each calendar year t. The bottom panels give the corresponding results for the restricted age range 60, 61, . . . , 104.

When all of the ages are considered, the estimated κ_t^[1]'s exhibit a downward trend, which expresses the improvement in mortality rates over time for all ages. A peak around 1940–1945 indicates a higher mortality experience during World War II. The estimated κ_t^[2]'s tend to increase over time, indicating that mortality improvements have been comparatively greater at younger ages over the period 1920–2005. We note that World War II also affected the estimated κ_t^[2]'s, with a decrease in the early 1940s. The values of the adjustment coefficient R²(t) indicate that the Cairns–Blake–Dowd model explains from about 80% of the variance in 1920 to about 95% in the early 2000s.

If we restrict the age range to 60, 61, . . . , 104, we see that the goodness-of-fit is greatly increased, with adjustment coefficients larger than 99%. The Cairns–Blake–Dowd model takes advantage of the approximate linearity in age (on the logit scale) at higher ages to provide a parsimonious representation of one-year death probabilities. The adjustment coefficients close to 1 demonstrate the ability of the Cairns–Blake–Dowd model to describe the mortality experienced in Belgium. The trend in the estimated intercept and slope parameters is less clear, unless we restrict our interest to the latter part of the 20th century, where the estimated κ_t^[1]'s and κ_t^[2]'s become markedly linear (with a decreasing trend for the former, and an increasing one for the latter).

5.4 Smoothing

5.4.1 Motivation

Actuaries use projected life tables in order to compute life annuity prices and life insurance premiums, as well as the reserves that have to be held by insurance companies to enable them to pay the future contractual benefits. Any irregularities in these life tables would then be passed on to the price list and to balance sheets, which is not desirable. Therefore, as long as these irregularities do not reveal particular features of the risk covered by the insurer, but are likely to be caused by sampling errors, actuaries prefer to resort to statistical techniques to produce life tables that exhibit a regular progression, in particular with respect to age.


5.4.2 P-splines approach

Durban and Eilers (2004) have smoothed death rates with P-splines in the context of a Poisson model. The P-spline approach is an example of a regression model and is similar to the generalized linear modelling discussed in Section 4.5.4. But unlike generalized linear models, P-splines allow for more flexibility in modelling observed mortality.

Regression models take a family of basis functions, and choose a combination of them that best fits the data according to some criterion. The P-spline approach uses a spline basis, with a penalty function that is introduced in order to avoid oversmoothing. P-splines are related to B-splines, which have been discussed in Section 2.6.3. Recall that univariate, or uni-dimensional, B-splines are a set of basis functions, each of which depends on the placement of a set of 'knot' points providing full coverage of the range of data. Defining B-splines in two dimensions is straightforward. We define knots in each dimension, and each set of knots gives rise to a univariate B-spline basis. The two-dimensional B-splines are then obtained by multiplying the respective elements of these two bases.
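The construction of a two-dimensional basis by multiplying the elements of two univariate bases can be sketched as follows. The univariate basis matrices are taken as given, however they were produced, and the names are ours:

```python
import numpy as np

def tensor_basis(B_age, B_year):
    """Two-dimensional basis built by multiplying the elements of a
    univariate age basis B_age (n_x x k_a) with those of a univariate
    year basis B_year (n_t x k_y).  Returns an (n_x*n_t) x (k_a*k_y)
    design matrix whose columns play the role of the B_ij(x, t)."""
    n_x, k_a = B_age.shape
    n_t, k_y = B_year.shape
    # B[x, t, i, j] = B_age[x, i] * B_year[t, j]
    B = np.einsum("xi,tj->xtij", B_age, B_year)
    return B.reshape(n_x * n_t, k_a * k_y)
```

Row (x, t) of the result holds the values of every product basis function at that age–year cell, which is exactly the design matrix needed for the regression (5.32) below.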

Durban and Eilers (2004) have suggested a decomposition of µx(t) as follows:

ln µx(t) = Σ_{i,j} θij Bij(x, t) (5.32)

for some prespecified two-dimensional B-splines Bij in age x and calendar time t, with regularly-spaced knots, and where the θij's are parameters to be estimated from historical data.

If we use a large number of knots in the year and age dimensions, then we can obtain an extremely accurate fit. However, such a fit does not smooth out the random variations present in the data, and the resulting death rates become less reliable. Switching to P-splines helps to overcome this problem, because of the presence of the penalty function.

The method of P-splines suggested by Eilers and Marx (1996) is now well-established as a method of smoothing in Generalized Linear Models. It consists of using B-splines as the basis for the regression and of modifying the log-likelihood by a difference penalty that relates to the regression coefficients. The inclusion of a penalty with appropriate weight means that the number of knots can be increased without radically altering the smoothness of the fit. Penalties can be calculated separately in each dimension, involving sums of (θi,j − 2θi−1,j + θi−2,j)² in the age dimension, sums of (θi,j − 2θi,j−1 + θi,j−2)² in the calendar year dimension, and sums of (θi+1,j−1 − 2θi,j + θi−1,j+1)² across cohorts. The CMI Bureau in the UK has suggested the use of age and cohort penalties (see also Chapter 6). Each of these penalties involves an unknown weight coefficient that has to be selected from the data.

Note that there is a difference in the structural assumption behind the P-spline approach, compared with the Lee–Carter and Cairns–Blake–Dowd alternative approaches: the P-spline approach assumes that there is smoothness in the underlying mortality surface in the period effects, as well as in the age and cohort effects. Some further extensions have recently been proposed to account for period shocks.

The P-spline approach is a powerful smoothing procedure for the observed mortality surface. Using the penalty to project the θij's into the future, it is also possible to use this tool to forecast future mortality rates, by extrapolating the smooth mortality surface. However, as pointed out by Cairns et al. (2007), the P-spline approach to mortality forecasting is not transparent. Its output is a smooth surface fitted to historical data and then projected into the future. An important difference (compared with the Lee–Carter and Cairns–Blake–Dowd alternatives) is that forecasting with the P-spline approach is a direct consequence of the smoothing process. The choice of the penalty then corresponds to a view of the future pattern of mortality. In contrast, the two stages of fitting the data and extrapolating past trends are kept separate in the Lee–Carter and Cairns–Blake–Dowd approaches. This is an advantage for actuarial applications, since it allows for more flexibility.

Moreover, the form of the penalty is usually difficult to infer from the data, whereas it entirely drives the P-spline mortality forecast (a similar feature occurs in period-based mortality graduation using splines, when mortality rates are extrapolated beyond the data to the oldest ages). The degree of smoothing in empirical applications depends on the variability of the observed death rates. The size of the population under study, as well as the range of ages considered, thus both influence the smoothing coefficient and, possibly, the choice of the penalty. In the Lee–Carter and Cairns–Blake–Dowd approaches, these features of the data do not directly affect the projection of the time index. As the order of the penalty has no discernible effect on the smoothness of the fit to the observed data, it is hard to deduce it from the observed data. The choice of the penalty, in fact, corresponds to a view of the future pattern of mortality: future mortality continuing at a constant level, future mortality improving at a constant rate, or future mortality improving at an accelerating (quadratic) rate.


5.4.3 Smoothing in the Lee–Carter model

As can be seen from Fig. 5.1, the estimated βx's exhibit an irregular pattern. This is undesirable from an actuarial point of view, since the resulting projected life tables will also show some erratic variations across ages.

Bayesian formulations assume some sort of smoothness of the age and period effects in order to improve estimation and facilitate prediction. A Bayesian treatment of mortality projections has been proposed by Czado et al. (2005).

Note that the estimated αx’s are usually very smooth, since they representan average effect of mortality at age x (however, Renshaw and Haberman(2003a) experiment with different choices for αx, representing differentaveraging periods and hence different levels of smoothing, as well as explicitgraduation of the αx estimates). The estimated κt’s are often rather irregular,but the projected κt’s, obtained from some time series model (as explainedbelow), will be smooth. Hence, we only need to smooth the βx’s in orderto get projected life tables with mortality varying smoothly across the ages.This can be achieved by penalized least-squares or maximum likelihoodmethods.

The estimated Lee–Carter parameters are traditionally obtained by minimizing (5.5). This has produced estimated βx's and κt's with an irregular shape in the majority of empirical studies. In order to smooth the estimated βx's, we can use the objective function

O_PLS(α, β, κ) = Σ_{x=x1}^{xm} Σ_{t=t1}^{tn} ( ln µx(t) − αx − βxκt )² + πβ Σ_{x=x1}^{xm−2} ( βx+2 − 2βx+1 + βx )² (5.33)

where πβ is the smoothing parameter. This is the penalized least-squares approach proposed in Delwarde et al. (2007a). The second term penalizes irregular βx's. The objective function can therefore be seen as a compromise between goodness-of-fit (first term) and smoothness of the βx's (second term). The penalty involves the sum of the squared second-order differences of the βx's, that is, the sum of the squares of βx+2 − 2βx+1 + βx. Second-order differences penalize deviations from a linear trend. The trade-off between fidelity to the data (governed by the sum of squared residuals) and smoothness (governed by the penalty term) is controlled by the smoothing parameter πβ. The larger the smoothing parameter, the smoother the resulting fit. In the limit (πβ ↗ ∞) we obtain a linear fit. The choice of the smoothing parameter is crucial, as we may obtain quite different


fits by varying the smoothing parameter πβ. The choice of the optimal πβ is based on the observed data, using cross-validation techniques. See Delwarde et al. (2007a) for more details. We note that equation (5.33) is similar to the objective function used in the Whittaker–Henderson graduation discussed in Section 2.6.2, a non-parametric graduation method that has been commonly used in the United States.
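The second-order difference penalty can be written in matrix form as πβ‖Pβ‖², with P the second-difference matrix. The following sketch builds P and applies a simplified ridge-type smoother to an already estimated β vector; it is a stand-in for the full penalized fit, and the names are ours:

```python
import numpy as np

def second_diff_penalty(n):
    """Second-order difference matrix P of shape (n-2, n), so that
    (P @ b)[x] = b[x+2] - 2*b[x+1] + b[x]; the roughness penalty is
    then pi_beta * (P @ b) @ (P @ b)."""
    return np.diff(np.eye(n), n=2, axis=0)

def smooth_beta(beta_hat, pi_beta):
    """One-shot smoothing of an estimated beta vector: minimize
    ||b - beta_hat||^2 + pi_beta * ||P b||^2, whose closed-form
    solution is (I + pi_beta * P'P)^{-1} beta_hat."""
    n = beta_hat.size
    P = second_diff_penalty(n)
    return np.linalg.solve(np.eye(n) + pi_beta * P.T @ P, beta_hat)
```

Since second differences of a linear sequence vanish, a βx profile that is already linear in x passes through the smoother unchanged, whatever the value of πβ, in line with the limiting behaviour noted above.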

If the parameters are estimated using Poisson maximum likelihood, the penalized least-squares method becomes a penalized log-likelihood approach. Specifically, following Delwarde et al. (2007a), the log-likelihood (5.21) is replaced with

L(α, β, κ) − (1/2) πβ Σ_{x=x1}^{xm−2} ( βx+2 − 2βx+1 + βx )² (5.34)

As above, the selection of the optimal value for the roughness penalty coefficient πβ is based on cross-validation.

Here, we adopt a very simple strategy in our case study: instead of fitting the Lee–Carter model to the rough mortality surface, we first smooth it using the methods described in Section 3.4.2 and then fit the model to the resulting surface.

Remark An alternative approach to smoothing the βx's has also been suggested. It is more ad hoc in nature than the above, in that it introduces an extra stage in the modelling process. Thus, Renshaw and Haberman (2003a,c) smooth the Lee–Carter βx estimates using linear regression as well as cubic B-splines and natural cubic splines and the method of least-squares. □

Remark An advantage of the Cairns–Blake–Dowd model is that it automatically produces smooth projected life tables, because future death probabilities depend on age in a linear way, and on the projected time indices κ[1]t and κ[2]t. □

5.4.4 Application to Belgian mortality statistics

We fit the Lee–Carter model to the set of smoothed HMD death rates by least-squares for ages 0–104 and the years 1920–2005. Figure 5.5 (top panels) plots the estimated αx, βx, and κt. The estimated κt's are then adjusted in order to reproduce the observed period life expectancies at birth. The values obtained without smoothing (i.e. those displayed in Fig. 5.1) are plotted using a broken line. We see that the prior smoothing of the death rates does not impact on the estimated αx's, except just before the accident hump, nor on the estimated κt's (mainly because of the adjustment procedure). Prior smoothing does, however, impact on the estimated βx's, which now behave very regularly with age. The model explains 93.70% of the total variance.

We now restrict ourselves to ages above 60. Figure 5.5 (bottom panels) plots the estimated αx, βx, and κt. We see that prior smoothing has almost no impact on the estimated αx's nor on the estimated κt's, whereas the estimated βx's are smoothed in an appropriate way. The model now explains 91.37% of the total variance.

5.5 Selection of an optimal calibration period

5.5.1 Motivation

Many actuarial studies have based the projections of mortality on the statistics relating to the years from 1950 to the present. The question then becomes why the post-1950 period better represents expectations for the future than does the post-1900 period, for example. There are several justifications for the use of the second half of the 20th century. First, the pace of mortality decline was more even across all ages over the 1950–2000 period than over the 1900–2000 period. Second, the quality of mortality data, particularly at the older ages, for the 1900–1950 period is questionable. Third, infectious diseases were an uncommon cause of death by 1950, while heart disease and cancer were the two most common causes, as they are today. This view seems to imply that the diseases affecting death rates from 1900 through 1950 are less applicable to expectations for the future than the dominant causes of death from 1950 through 2000.

According to Lee and Carter (1992), the length of the mortality time series was not critical as long as it was more than about 10–20 years. However, Lee and Miller (2001) obtained better fits by restricting the start of the calibration period to 1950 in order to reduce structural shifts. Specifically, in their evaluation of the Lee–Carter method, Lee and Miller (2001) noted that for US data the forecast was biased when using the fitting period 1900–1989 to forecast the period 1990–1997. The main source of error was the mismatch between fitted rates for the last year of the fitting period (1989 in their study) and actual rates in that year. This is why a bias correction is applied. It was also noted that the βx pattern did not remain stable over the whole 20th century. In order to obtain more stable βx's, Lee and Miller (2001) adopted 1950 as the first year of the fitting period. Their conclusion is that restricting the

Figure 5.5. Estimated αx, βx, and κt (from left to right), x = 0, 1, . . . , 104 (top panels) and x = 60, 61, . . . , 104 (bottom panels), t = 1920, 1921, . . . , 2005, obtained with smoothed HMD death rates by minimizing the sum of squares (5.5), with the resulting κt's adjusted by refitting to the period life expectancies at birth (corresponding values obtained without smoothing are displayed in broken line).

fitting period to the years from 1950 onwards avoids outlier data. Similarly, Lundström and Qvist (2004) have reduced the 1901–2001 period to the past 25 years with Swedish data.

Baran and Pap (2007) have applied the Lee–Carter method to forecast mortality rates in Hungary for the period 2004–2040 on the basis of either mortality data between 1949 and 2003 or a restricted data set corresponding to the period 1989–2003. The model fitted to the data of the period 1949–2003 forecasts increasing mortality rates for men between ages 45 and 55, indicating that the Lee–Carter method may not be applicable for countries where mortality rates exhibit trends as peculiar as in Hungary. However, models fitted to the data for the last 15 years, both for men and women, forecast decreasing trends, similar to the case of countries where the method has been successfully applied. This clearly shows that the selection of an optimal fitting period is of paramount importance.

5.5.2 Selection procedure

Booth et al. (2002) have designed procedures for the selection of an optimal calibration period which identify the longest period for which the estimated mortality index parameter κt is linear. Specifically, these authors seek to maximize the fit of the overall model by restricting the fitting period in order to maximize the fit to the linearity assumption. The choice of the fitting period is based on the ratio of the mean deviances of the fit of the underlying Lee–Carter model to the overall linear fit. This ratio is computed by varying the starting year (but holding the jump-off year fixed) and the chosen fitting period is that for which the ratio is substantially smaller than for periods starting in previous years.

More specifically, Booth et al. (2002) assume, a priori, that the trend in the adjusted κt's is linear, based on the 'universal pattern' of mortality decline that has been identified by several researchers, including Lee and Carter (1992) and Tuljapurkar and Boe (2000). When the κt's depart from linearity, this assumption may be better met by appropriately restricting the fitting period. As noted above, the ending year is kept equal to tn and the fitting period is then determined by the starting year (henceforth denoted as tstart).

Restricting the fitting period to the longest recent period (tstart, tn) for which the adjusted κt's do not deviate markedly from linearity has several advantages. Since systematic changes in the trend in κt are avoided, the uncertainty in the forecast is reduced accordingly. Moreover, the βx's are more likely to satisfy the assumption of time invariance. Finally, the estimate of the drift parameter more clearly reflects the recent experience.

An ad hoc procedure for selecting tstart has been suggested in Denuit and Goderniaux (2005). Precisely, the calendar year tstart ≥ t1 is selected in such a way that the series {κt, t = tstart, tstart + 1, . . . , tn} is best approximated by a straight line. To this end, the adjustment coefficient R², the classical goodness-of-fit criterion in linear regression, is maximized as a function of the number of observations included in the fit.
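This ad hoc rule can be sketched in a few lines of Python. The function names (r_squared_of_linear_fit, select_t_start) and the minimum-length safeguard min_points are our own illustrative choices, not part of the procedure as published:

```python
def r_squared_of_linear_fit(kappa):
    """R^2 of an ordinary least-squares straight line fitted to kappa."""
    n = len(kappa)
    ts = list(range(n))
    t_bar = sum(ts) / n
    k_bar = sum(kappa) / n
    sxx = sum((t - t_bar) ** 2 for t in ts)
    sxy = sum((t - t_bar) * (k - k_bar) for t, k in zip(ts, kappa))
    sst = sum((k - k_bar) ** 2 for k in kappa)
    if sst == 0.0:
        return 1.0
    return (sxy ** 2 / sxx) / sst   # R^2 = SSR / SST in simple regression

def select_t_start(kappa, years, min_points=5):
    """Pick the starting year whose trailing subseries is most linear."""
    best_year, best_r2 = years[0], -1.0
    for i in range(len(years) - min_points + 1):
        r2 = r_squared_of_linear_fit(kappa[i:])
        if r2 > best_r2:
            best_year, best_r2 = years[i], r2
    return best_year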

Note that in Denuit and Goderniaux (2005), the κt's are replaced by a linear function of t, and a parametric regression model (using a linear effect term for the continuous covariate calendar time with an interaction with the categorical variable age, together with a term for the categorical variable age) is then used. Even if this approach produces almost the same projections as the Lee–Carter method, it underestimates the uncertainty in mortality forecasts. The resulting confidence intervals are artificially narrow because of the imposition of the linear trend in the κt's.

The situation is slightly different in the Cairns–Blake–Dowd model. As the time-varying parameters are estimated separately for each calendar year, they remain unaffected if we modify the range of calendar years under consideration. Considering Fig. 5.4, we clearly see that the slope and intercept parameters become linear only in the last part of the observation period (especially for ages 60 and over). Therefore, it is natural to extrapolate their future path on the basis of recent experience only. The approach suggested by Denuit and Goderniaux (2005) is easily extended to the Cairns–Blake–Dowd setting, by selecting the starting year as the maximum of the starting years for each time factor. The deviance approach proposed by Booth et al. (2002) can also easily be adapted to the Cairns–Blake–Dowd model.

Note, however, that the selection of the optimal fitting period is open to criticism, in the sense that it could lead to an underestimation of the uncertainty in forecasts, and artificially favours the Lee–Carter specification. The same comment applies in the Cairns–Blake–Dowd approach. We do not share this view, and we believe that the selection of the optimal fitting period is an essential part of the mortality forecast.

5.5.3 Application to Belgian mortality statistics

We first consider the Lee–Carter fit. Applying the method of Booth et al. (2002) gives tstart = 1978. The ad hoc method suggested in Denuit and Goderniaux (2005) roughly confirms this choice. Restricting the age range to 60 and over yields tstart = 1974. Again, the ad hoc method agrees with this choice.

Whereas common practice would consist of taking all of the available data 1920–2005, we discard here observations for the years 1920–1977 when all of the ages are considered, and observations for the years 1920–1973 when the analysis is restricted to ages 60 and over. Here, short-term trends are preferred even if long-term forecasts are needed for annuity pricing. The reason is that past long-term trends are not expected to be relevant to the long-term future. Note that because the optimal fitting period is selected on the basis of goodness-of-fit criteria relative to the linear model, deviations from this short-term linear trend are relatively small, but the shorter fitting period results in a more rapid widening of the confidence intervals.

The final estimates, based on the observations in the optimal fitting period, are displayed in Fig. 5.6, which plots the estimated αx, βx, and κt. We see that the estimated αx's and κt's obtained with and without prior smoothing closely agree, whereas the estimated βx's are smoothed in an appropriate way. For ages 0–104, the model explains 67.70% of the total variance for males on the basis of unsmoothed data, and 90.57% on the basis of smoothed data. For ages 60 and over, the model explains 92.62% of the total variance for males on the basis of unsmoothed data, and 95.74% on the basis of smoothed data.

For the Cairns–Blake–Dowd model, the optimal projection periods now become 1969–2005 when all of the ages are included in the analysis and 1979–2005 when the ages are restricted to the range 60–104. Note that the estimated time indices are not influenced by the restriction of the time period, so that those displayed in Fig. 5.4 remain valid.

5.6 Analysis of residuals

5.6.1 Deviance and Pearson residuals

Since we work in a regression framework, it is essential to inspect the residuals. Model performance is assessed in terms of the randomness of the residuals. A lack of randomness would indicate the presence of systematic variations, such as age–time interactions. We note that the adjustment of the κt's in the Lee–Carter case may have introduced systematic changes to

Figure 5.6. Estimated αx, βx, and κt (from left to right), x = 0, 1, . . . , 104 (top panels) and x = 60, 61, . . . , 104 (bottom panels), obtained by minimizing the sum of squares (5.5) over the optimal fitting period 1978–2005 for ages 0–104 and 1974–2005 for ages 60–104 with smoothed HMD death rates (corresponding values obtained without smoothing are displayed in broken line).

the residuals, so that the examination of model performance is in fact based on the residuals computed with the adjusted κt's.

When the parameters are estimated by least-squares, Pearson residuals have to be inspected. In the Lee–Carter case, these residuals are given by

rxt = εx(t) / √[ (1/((xm − x1)(tn − t1 − 1))) Σ_{x=x1}^{xm} Σ_{t=t1}^{tn} (εx(t))² ]   (5.35)

where εx(t) = ln mx(t) − (αx + βxκt). In the Cairns–Blake–Dowd case, these residuals are given by

rxt = εx(t) / √[ (1/((xm − x1 − 1)(tn − t1 + 1))) Σ_{x=x1}^{xm} Σ_{t=t1}^{tn} (εx(t))² ]   (5.36)

where εx(t) = ln(qx(t)/px(t)) − (κ[1]t + κ[2]t x).

If the residuals rxt exhibit some regular pattern, this means that the model is not able to describe all of the phenomena appropriately. In practice, plotting (x, t) ↦ rxt and discovering no structure in these graphs gives reassurance that the time trends have been correctly captured by the model.
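A minimal sketch of the Lee–Carter Pearson residuals (5.35) in Python follows; the function name is ours, the age-year grid is indexed from 0, and the scaling divisor follows the formula above:

```python
import math

def pearson_residuals(log_m, alpha, beta, kappa):
    """Pearson residuals (5.35) for a least-squares Lee-Carter fit.

    log_m[x][t] holds ln m_x(t) on an (age x year) grid.  The raw
    residuals eps_x(t) are scaled by an estimate of their common
    standard deviation.  (Illustrative sketch, not the book's code.)
    """
    n_ages, n_years = len(alpha), len(kappa)
    eps = [[log_m[x][t] - (alpha[x] + beta[x] * kappa[t])
            for t in range(n_years)] for x in range(n_ages)]
    # (x_m - x_1)(t_n - t_1 - 1) in the 0-indexed grid:
    denom = (n_ages - 1) * (n_years - 2)
    scale = math.sqrt(sum(e * e for row in eps for e in row) / denom)
    return [[e / scale for e in row] for row in eps]
```

The resulting matrix can then be displayed as a map over (x, t), as done for Fig. 5.7 below.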

With a Poisson, Binomial, or Negative Binomial random component, it is more appropriate to consider the deviance residuals in order to monitor the quality of the fit. These residuals are defined as the signed square root of the contribution of each observation to the deviance statistic. They should also be displayed as a function of time at different ages, or as a function of both age and calendar year.

5.6.2 Application to Belgian mortality statistics

We find that the residuals computed from the model fitted to ages 0–104 reveal systematic patterns and comparatively large values at young ages. In the Lee–Carter case, the fit around the accident hump is very poor, with large negative residuals for ages below 20. The residuals are positive for all of the higher ages. The same phenomenon appears with the Cairns–Blake–Dowd fit, with huge positive residuals around age 0. Overall, we find that the inclusion of young ages significantly deteriorates the quality of the fit at the higher ages. The presence of a trend in the residuals violates the independence assumption, and homoskedasticity does not hold, as the graph exhibits clustering. The large residuals before the accident hump suggest that the Lee–Carter and Cairns–Blake–Dowd approaches are not able to account for the particular mortality dynamics at younger ages. Since older ages are the most relevant in pension and annuity applications, we restrict the analysis to ages 60 and over.


Residuals can be displayed as a function of both age and calendar time, and inspected with the help of maps as displayed in Fig. 5.7 for the Lee–Carter fit (top panel) and for the Cairns–Blake–Dowd fit (bottom panel), for ages 60–104. The particular patterns at the oldest ages come from the closing procedure applied to HMD mortality statistics, and do not invalidate the fit. The residuals are unstructured, except for a moderate cohort effect for the generations reaching age 60 around 1980. Thus, apart from these cohorts, born just after World War I and the 1918–1920 influenza epidemics, there is no significant diagonal pattern in the residuals. We find that the cohort effect revealed in the residuals is too weak to reject the age-period Lee–Carter model. In some countries, cohort effects are stronger and need to be included in the mortality modelling. This is the case, for instance, in the United Kingdom, as will be seen in the next chapter, where cohort effects will be included in the models discussed in the present chapter.

We now turn to the residuals for the Cairns–Blake–Dowd model. These residuals are less dispersed than those for the Lee–Carter fit. The generations born around 1920 again emerge as a notable feature in the residuals plot. We now observe a clustering of negative residuals for the generations born after this particular one, whereas positive residuals are associated with the older generations. This suggests that the inclusion of a cohort effect could be envisaged in the Cairns–Blake–Dowd setting. We postpone the analysis of this kind of effect to the next chapter.

5.7 Mortality projection

5.7.1 Time series modelling for the time indices

An important aspect of both the Lee–Carter and the Cairns–Blake–Dowd methodology is that the time factor (κt in the Lee–Carter case, and (κ[1]t, κ[2]t) in the Cairns–Blake–Dowd case) is intrinsically viewed as a stochastic process. Box–Jenkins techniques are then used to estimate and forecast the time factor within an ARIMA time series model. These forecasts in turn yield projected age-specific mortality rates, life expectancies, and single premiums for life annuities.

In the Lee–Carter model, the estimated κt's are viewed as a realization of a time series that is modelled using the classical autoregressive integrated moving average (ARIMA) models. Such models explain the dynamics of a time series by its history and by contemporaneous and past shocks.

Figure 5.7. Residuals for Belgian males, Lee–Carter model, ages x = 60, 61, . . . , 104 (top panel), and Cairns–Blake–Dowd model, ages x = 60, 61, . . . , 104 (bottom panel).

The dynamics of the κt’s is described by an ARIMA(p, d, q) process if itis stationary and

∇dκt = φ1∇dκt + · · · + φp∇dκt + ξt + ψ1ξt−1 + · · · + ψqξt−q (5.37)

with φp �= 0, ψq �= 0, and where ξt is a Gaussian white noise processsuch that σ2

ξ > 0.


There are a few basic steps to fitting ARIMA models to time series data. The main point is to identify the values of the autoregressive order p, the order of differencing d, and the moving average order q. If the time index is not stationary, then taking a first difference (i.e. d = 1) can help to remove the time trend. If this proves unsuccessful then it is standard to take further differences (i.e. investigate d = 2 and so on). Preliminary values of p and q are chosen by inspecting the autocorrelation function and the partial autocorrelation function of the κt's. More details can be found in standard textbooks devoted to time series analysis.

The appropriateness of the Lee–Carter approach has been questioned by several authors. The rigid structure imposed by the model necessitates the selection of an optimal fitting period (which is also conservative in the context of life annuities, that is, it tends to overstate the expected value of annuities). The Gaussian distributional assumption imposed on the κt's means that large jumps are unlikely to occur. This feature can be problematic for death benefits, where negative jumps correspond to events which threaten the financial strength of the insurance company. For instance, insurers currently worry about an avian influenza pandemic which could cause the death of many policyholders. On the basis of vital registration data gathered during the 1918–1920 influenza pandemic, extrapolations indicate that if the mortality were concentrated in a single year, it would increase global mortality by 114%. However, neglecting such jumps is conservative for life annuities. Positive jumps corresponding to sudden improvements in mortality, thanks to the availability of new medical treatments, are considered unlikely to occur, since it would take some time for the population to benefit from these innovative treatments. Hence, the assumptions behind the Lee–Carter model are compatible with mortality projections for life annuity business, and we do not need to acknowledge explicitly period shocks in the stochastic mortality model. We note also that the optimal fitting period, as widely used, has tended to start after the three pandemics of the 20th century (1918–1920, 1957–1958, and 1968–1970).

5.7.2 Modelling of the Lee–Carter time index

5.7.2.1 Stationarity

Time series analysis procedures require that the variables being studied be stationary. We recall that a time series is (weakly) stationary if its mean and variance are constant over time, and the covariance for any two time periods (t and t + k, say) depends only on the length of the interval between the two time periods (here k), not on the starting time (here t).


Nonstationary series may be the result of two different data-generating processes:

1. The non-stationarity can reflect the presence of a deterministic component. Such a trending series can be rendered stationary by simply setting up a regression on time and working on the resulting residuals. These series are said to be trend stationary.

2. The non-stationarity can result from a 'non-discounted' accumulation of stochastic shocks. In this case, stationarity may be achieved by differencing the series one or more times. These series are said to be difference stationary.

A first check for stationarity consists of displaying the data in graphic form and looking to see whether the series has an upward or downward trend. We have observed a gradually decreasing underlying trend in the estimated κt's. The series of the estimated κt's is thus clearly not stationary: it tends to decrease over time on average. Figure 5.8 displays the estimated autocorrelation function of the estimated κt's (on the left panel). The classic signature of a nonstationary series is a set of very strong correlations that decay slowly as the lag length increases. Specifically, if the time series is stationary, then its autocorrelation function declines at a geometric rate. As a result, such processes have short memory, since observations far apart in time are essentially independent. Conversely, if the time series needs to be differenced once, then its autocorrelation function declines at a linear rate and observations far apart in time are not independent. The sample autocorrelation coefficients of the κt's in Fig. 5.8 clearly exhibit a linear decay, which supports nonstationarity.
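This linear-versus-geometric decay diagnostic is easy to reproduce; the sample autocorrelation function below is an illustrative helper, not code from the book:

```python
def sample_acf(series, max_lag):
    """Sample autocorrelation function r_1, ..., r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    c0 = sum((v - mean) ** 2 for v in series) / n   # lag-0 autocovariance
    acf = []
    for k in range(1, max_lag + 1):
        ck = sum((series[t] - mean) * (series[t + k] - mean)
                 for t in range(n - k)) / n
        acf.append(ck / c0)
    return acf
```

Applied to a trending series (for instance, a κt series declining roughly linearly in t), the coefficients stay large and decay roughly linearly with the lag; after first differencing they collapse towards zero.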

In addition to these graphical procedures, several formal tests for (non)stationarity have been developed. Stationarity tests take as the null hypothesis that a time series is trend stationary. Taking the null hypothesis as a stationary process and differencing as the alternative hypothesis is in accordance with a conservative testing strategy: if we reject the null hypothesis then we can be confident that the series indeed needs to be differenced (at least once). The Kwiatkowski–Phillips–Schmidt–Shin test with a linear deterministic trend has a test statistic equal to 0.168 with 3 lags, and 0.1529 with 9 lags. This leads to rejecting trend stationarity for males (at 5%) and to the conclusion that the κt's need to be differenced.

Since the estimated κt’s are difference stationary, we compute the firstdifferences of the estimated κt’s for males and females. In order to checkwhether a second difference is needed, we test the resulting series for(non)stationarity using unit root tests. The Augmented Dickey–Fuller p-value is less than 1%, so that we conclude that the first differences of theκt’s are stationary and so do not need further differencing.

Figure 5.8. Autocorrelation function (on the left) and partial autocorrelation function (on the right) of the estimated κt's obtained with completed data for the ages 60 and over.

5.7.2.2 Random walk with drift model for the time index

As no autocorrelation coefficient nor partial autocorrelation coefficient of the differenced time index appears to be significantly different from 0, an ARIMA(0,1,0) process seems to be appropriate for the estimated κt's. The Ljung–Box–Pierce test supports this model. Running a Shapiro–Wilk test yields a p-value of 23.08%, which indicates that the residuals seem to be approximately Normal. The corresponding Jarque–Bera p-value equals 48.27%, which confirms that there is no significant departure from Normality.
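For reference, the Ljung–Box portmanteau statistic used in such checks is simple to compute by hand; the helper below is an illustrative sketch with our own names, and the resulting Q is compared against a χ² critical value with (approximately) h degrees of freedom:

```python
def ljung_box_q(residuals, h):
    """Ljung-Box portmanteau statistic Q = n(n+2) sum_k r_k^2/(n-k)."""
    n = len(residuals)
    mean = sum(residuals) / n
    c0 = sum((v - mean) ** 2 for v in residuals) / n
    q = 0.0
    for k in range(1, h + 1):
        # lag-k sample autocorrelation of the residuals
        ck = sum((residuals[t] - mean) * (residuals[t + k] - mean)
                 for t in range(n - k)) / n
        q += (ck / c0) ** 2 / (n - k)
    return n * (n + 2) * q
```

For white-noise residuals Q should be small; for instance, the 95% critical value of χ² with 10 degrees of freedom is about 18.3, so a Q well below that at h = 10 is consistent with the ARIMA(0,1,0) choice.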

The previous analysis suggests that for Belgian mortality statistics, a random walk with drift model is suitable for modelling the estimated κt's (as is the case in many of the empirical studies in the literature). In this case, the dynamics of the estimated κt's are given by

κt = κt−1 + d + ξt (5.38)

where the ξt’s are independent and Normally distributed with mean 0 andvariance σ2, and where d is known as the drift parameter. In this case,

κtn+k = κtn + kd + Σ_{j=1}^{k} ξtn+j   (5.39)

The point forecast of the time index is thus

κtn+k = E[κtn+k|κt1 , κt2 , . . . , κtn] = κtn + kd (5.40)

which follows a straight line as a function of the forecast horizon k, with slope d. The conditional variance of the forecast is

Var[κtn+k|κt1 , κt2 , . . . , κtn] = kσ2 (5.41)

Therefore, the conditional standard error of the forecast increases with the square root of the forecast horizon k.

Using the random walk with drift model for forecasting κt is equivalent to forecasting each age-specific death rate to decline at its own rate. Indeed, it follows from (5.38) that the difference in expected log-mortality rates between times t + 1 and t is

ln μx(t + 1) − ln μx(t) = βx E[κt+1 − κt] = βx d   (5.42)

The ratio of death rates in two subsequent years of the forecast is equal to exp(βxd) and is thus invariant over time. The product βxd is therefore equal to the rate of mortality change over time at age x. In such a case, the parameter βx can be interpreted as a normalized schedule of age-specific rates of mortality change over time.
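A one-line consequence of (5.42) is that each age-specific death rate is projected to decline at its own constant rate exp(βx d) per year. A hypothetical helper (names are ours):

```python
import math

def project_death_rate(m_x_t, beta_x, d, k):
    """Project an age-specific death rate k years ahead, each age
    declining at its own constant annual rate exp(beta_x * d)."""
    return m_x_t * math.exp(beta_x * d * k)
```

For example, with βx = 0.02 and d ≈ −0.587 (the Belgian male estimate reported below), exp(βx d) ≈ 0.988, i.e. roughly a 1.2% mortality improvement per year at that age.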

It is important to notice that the future mortality age profile produced by the Lee–Carter model always becomes less smooth over time, as pointed out by Girosi and King (2007). This explains why this approach has been designed to forecast aggregate demographic indicators, such as life expectancies (or actuarial indicators like annuity values), and not future period or cohort life tables. This comes from the fact that the forecast of the log-death rates is linear over time from (5.42): as the βx's vary with age, the age profile of log-mortality will eventually become less smooth over time, since the distance between log-mortality rates in adjacent age groups can only increase. Each difference in βx is amplified as we forecast further into the future. Sometimes, the forecast lines converge for a period, but after converging they cross and the age profile pattern becomes inverted.

The dynamics (5.38) ensure that κt − κt−1, t = t2, t3, . . . , tn, are independent and Normally distributed, with mean d and variance σ². The maximum likelihood estimators of d and σ² are therefore given by the sample mean and variance of the κt − κt−1's, that is,

d = (1/(tn − t1)) Σ_{t=t2}^{tn} (κt − κt−1) = (κtn − κt1)/(tn − t1)   (5.43)

and

σ² = (1/(tn − t1)) Σ_{t=t2}^{tn} (κt − κt−1 − d)²   (5.44)

This gives d = −0.5867698 and σ² = 0.3985848 for Belgian males for the optimal fitting period 1974–2005.
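The estimators (5.43)-(5.44) and the forecasts (5.40)-(5.41) amount to only a few lines of code; the function names below are our own illustrative choices:

```python
def fit_random_walk_with_drift(kappa):
    """ML estimates (5.43)-(5.44) of the drift d and variance sigma^2."""
    diffs = [kappa[t] - kappa[t - 1] for t in range(1, len(kappa))]
    n = len(diffs)                 # = t_n - t_1
    d = sum(diffs) / n             # equals (kappa_tn - kappa_t1)/(t_n - t_1)
    sigma2 = sum((u - d) ** 2 for u in diffs) / n
    return d, sigma2

def forecast(kappa_last, d, sigma2, k):
    """Point forecast (5.40) and conditional variance (5.41) at horizon k."""
    return kappa_last + k * d, k * sigma2
```

With the Belgian male estimates quoted above (d ≈ −0.587, σ² ≈ 0.399), the 10-year-ahead forecast standard error is √(10 × 0.399) ≈ 2.0.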

This approach is known as the ruler method of forecasting, as it connects the first and last points of the available data with a ruler and then extends the resulting line further in order to produce a forecast. Considering the expression for d, the actuary has to check the value of κtn for reasonableness. For instance, if a summer heat wave occurs during calendar year tn, producing excess mortality at older ages, then κtn might be implausibly high, resulting in an estimated d that is too small in magnitude and biasing downwards the future improvements in longevity (as noted by Lee (2000), Renshaw and Haberman (2003a), and Renshaw and Haberman (2003c)). As noted above, Lee and Carter (1992) did not prescribe the random walk with drift model for all situations. However, this model has been judged to be appropriate in very many cases. For instance, Tuljapurkar et al. (2000) find that the decline in the κt's is in accordance with the random walk with drift model for the G7 countries. Even when a different model is indicated, the more complex model is found to give results which are close to those obtained with the random walk with drift.

Remark Building on the random walk with drift model for the κt's, Girosi and King (2007) propose that we should model directly the ln mx(t)'s using a multivariate random walk with drift model. In this reformulation of the Lee–Carter model, the drift vector and the covariance matrix of the innovations are arbitrary. □

Remark Carter (1996) has developed a method in which the drift d in the random walk forecasting equation for κt is itself allowed to be a random variable. This is done using state-space methods for modelling time series. Nevertheless, it is noteworthy that the forecast and the probability intervals remain virtually unchanged compared with the simple random walk with drift model. □

Remark Booth et al. (2006) compare the original Lee–Carter method with the different adjustments for the estimated κt's, as well as the extensions proposed by Hyndman and Ullah (2007) and De Jong and Tickle (2006). They find that, from the forecasting point of view, there are no significant differences between the five methods. See also Booth et al. (2005). □

5.7.3 Modelling the Cairns–Blake–Dowd time indices

The analysis of each time index in isolation parallels the analysis performed for the Lee–Carter time index. These preliminary results now have to be supplemented with a bivariate analysis of the time series κt = (κ[1]t, κ[2]t)ᵀ, which goes beyond the scope of this book. When fitted to data, the changes over time in κt have often been approximately linear, at least in the recent past. This suggests that the dynamics of the time factor κt could be appropriately described by a bivariate random walk with drift of the form

κ[1]t = κ[1]t−1 + d1 + ξ[1]t
κ[2]t = κ[2]t−1 + d2 + ξ[2]t   (5.45)

where d1 and d2 are the drift parameters, and ξt = (ξ[1]t, ξ[2]t)ᵀ are independent bivariate Normally distributed random pairs, with zero mean and variance-covariance matrix

Σ = ( σ²1  σ12
      σ12  σ²2 )   (5.46)

The drift parameters are estimated as

di = (κ[i]tn − κ[i]t1)/(tn − t1),   i = 1, 2   (5.47)

the marginal variances are estimated as

σ²i = (1/(tn − t1)) Σ_{t=t2}^{tn} (κ[i]t − κ[i]t−1 − di)²,   i = 1, 2   (5.48)


and the covariance is estimated as

σ12 = (1/(tn − t1)) Σ_{t=t1}^{tn−1} (κ[1]t+1 − κ[1]t − d1)(κ[2]t+1 − κ[2]t − d2)   (5.49)

This gives d1 = −0.0757558, d2 = 0.0007619443, σ²1 = 0.01563272, σ²2 = 3.3048 × 10⁻⁶, and σ12 = −0.0002247978 for Belgian males for the period 1979–2005.
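The estimators (5.47)-(5.49) can be sketched analogously to the univariate case; the function name and the arguments k1, k2 (the two estimated time index series) are our own illustrative choices:

```python
def fit_bivariate_rwd(k1, k2):
    """Estimates (5.47)-(5.49) for the bivariate random walk with drift."""
    n = len(k1) - 1                                  # = t_n - t_1
    d1 = (k1[-1] - k1[0]) / n
    d2 = (k2[-1] - k2[0]) / n
    # Demeaned first differences of each time index.
    u1 = [k1[t + 1] - k1[t] - d1 for t in range(n)]
    u2 = [k2[t + 1] - k2[t] - d2 for t in range(n)]
    s11 = sum(v * v for v in u1) / n                 # marginal variances
    s22 = sum(v * v for v in u2) / n
    s12 = sum(a * b for a, b in zip(u1, u2)) / n     # covariance
    return d1, d2, s11, s22, s12
```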

While a bivariate random walk with drift model has been used in connection with the Cairns–Blake–Dowd approach to mortality forecasting, mean-reverting alternatives might have a stronger biological justification. Andrew Cairns pointed out in a personal communication that negative autocorrelation coefficients between the κ[2]t − κ[2]t−1's indicate that at higher ages good years and bad years alternate. This can be explained as follows: if a flu epidemic kills many of the unhealthy older people, it leaves the healthy older people, and then the next year mortality is low.

5.8 Prediction intervals

5.8.1 Why bootstrapping?

The projections made so far, while interesting, reveal nothing about the uncertainty attached to the future mortality. In forecasting, it is important to provide information on the error affecting the forecasted quantities. In the traditional demographic approach to mortality forecasting, a range of uncertainty is indicated by high and low scenarios, around a medium forecast that is intended to be a best estimate. However, it is not clear how to interpret this high-low range unless a corresponding probability distribution is specified.

In this respect, prediction intervals are particularly useful. This section explains how to get such margins on demographic indicators in the Lee–Carter setting. The ideas are easily extended to the Cairns–Blake–Dowd setting.

In the current application, it is impossible to derive the relevant prediction intervals analytically. The reason for this is that two very different sources of uncertainty have to be combined: sampling errors in the parameters αx, βx, and κt, and forecast errors in the projected κt's. An additional complication is that the measures of interest – life expectancies or life annuity premiums and reserves – are complicated non-linear functions of the parameters αx, βx, and κt and of the ARIMA parameters. The key idea behind the bootstrap is to resample from the original data (either directly or via a fitted model) in order to create replicate data sets, from which the variability of the quantities of interest can be assessed. Because this approach involves repeating the original data analysis procedure with many replicate sets of data, it is sometimes called a computer-intensive method. Bootstrap techniques are particularly useful when, as in our problem, theoretical calculation with the fitted model is too complex.

If we ignore the other sources of errors, then the confidence bounds on future κt's can be used to calculate prediction intervals for demographic indicators. Even if, for long-run forecasts (over 25 years), the error in forecasting the mortality index clearly dominates the errors in fitting the mortality matrix, prediction intervals based on κt alone seriously understate the errors in forecasting over shorter horizons. We know from Lee and Carter (1992), Appendix B, that prediction intervals based on κt alone are a reasonable approximation only for forecast horizons greater than 10–25 years. If there is a particular interest in forecasting over the shorter term, then we cannot make a precise analysis of the forecast errors.

Because of the importance of appropriate measures of uncertainty in an actuarial context, the next sections aim to derive prediction intervals taking into account all of the sources of errors. To fix the ideas, we will consider a cohort life expectancy e↗_x(t) as defined in Section 4.4.1 or in (5.57) below, but the approach is easily adapted to other demographic or actuarial indicators.

5.8.2 Bootstrap percentiles confidence intervals

To avoid any (questionable) Normality assumption, we use the bootstrap percentile interval to construct a confidence interval for the predicted life expectancy. The bootstrap procedure yields B samples of the αx, βx, and κt parameters, denoted as α_x^b, β_x^b, and κ_t^b, b = 1, 2, ..., B. This procedure can be carried out in several ways:

Monte Carlo simulation Brouhns et al. (2002b), Brouhns et al. (2002a) sample directly from the approximate multivariate Normal distribution of the Poisson maximum likelihood estimators (α̂, β̂, κ̂), that is, those obtained by maximizing the log-likelihood (5.21). Invoking the large sample properties of the maximum likelihood estimators, we know that (α̂, β̂, κ̂) is asymptotically multivariate Normally distributed, with mean (α, β, κ) and covariance matrix given by the inverse of the Fisher information matrix, whose elements equal minus the expected value of the second derivatives of the log-likelihood with respect to the parameters of interest. We refer to Appendix B in Brouhns et al. (2002b) for the expression of the information matrix and how to sample values from the multivariate Normal distribution of interest.
As reported by Renshaw and Haberman (2008), the results of this first approach rely heavily on the identifiability constraints. Given that the choice of constraints is not unique and that this choice materially affects the resulting simulations, this first approach should not be used for risk assessment purposes unless there are compelling reasons for selecting a particular set of identifiability constraints.

Poisson bootstrap Starting from the observations (ETR_xt, D_xt), Brouhns et al. (2005b) create B bootstrap samples (ETR_xt, D_xt^b), b = 1, 2, ..., B, where the D_xt^b's are realizations from the Poisson distribution with mean ETR_xt µ_x(t) = D_xt. The bootstrapped death counts D_xt^b are thus obtained by applying a Poisson noise to the observed numbers of deaths. For each bootstrap sample, the αx's, βx's, and κt's are estimated.
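The resampling step of the Poisson bootstrap can be sketched as follows; the refitting of the Lee–Carter parameters to each replicate is omitted, and the function name is ours:

```python
import numpy as np

def poisson_bootstrap_samples(D, B, seed=0):
    """Generate B bootstrap death-count matrices by applying Poisson noise
    to the observed death counts D[x, t]; exposures are kept fixed."""
    rng = np.random.default_rng(seed)
    # Each replicate D^b has entries drawn from Poisson(D[x, t]).
    return [rng.poisson(D) for _ in range(B)]
```

Each replicate would then be passed to the Lee–Carter (or Cairns–Blake–Dowd) fitting routine to obtain one set of bootstrapped parameters.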

Residuals bootstrap Another possibility is to bootstrap from the residuals of the fitted model, as suggested by Koissi et al. (2006). The residuals should be independent and identically distributed (provided that the model is well specified). From these, it is possible to reconstitute bootstrapped residuals, and then bootstrapped mortality data. A good fit resulting in a set of pattern-free random residuals for sampling repeatedly with replacement is a basic requirement for this approach. When this is not the case, distortions can occur in the simulated histogram of the quantity of interest.
Specifically, we create the matrix R of residuals, with elements r_xt as defined in Section 5.6. Then, we generate B replications R^b, b = 1, 2, ..., B, by sampling with replacement from the elements of the matrix R. The inverse formula for the residuals is then used to obtain the corresponding matrices of death counts D_xt^b; we refer the reader to Koissi et al. (2006) for further explanation about the inversion of the residuals, as well as to Renshaw and Haberman (2008) for further comments. This leads to the computation of B sets of estimated parameters α_x^b, β_x^b, and κ_t^b.

We then estimate the time series model using the κ_t^b as data points. This yields a new set of estimated ARIMA parameters. We can then generate a projection κ_t^b, t ≥ t_n + 1, using these ARIMA parameters. The future errors ξ_t^b are sampled from a univariate Normal distribution with a mean of 0 and a standard deviation of σ_ε^b. Note that the κt's are projected on the basis of the reestimated ARIMA model: we do not select a new ARIMA model but keep the ARIMA model selected on the basis of the original data. Nevertheless, the parameters of this model are reestimated with the bootstrapped data.

The first step is meant to take into account the uncertainty in the parameters αx, βx, and κt. The second step deals with the fact that the uncertainty in the ARIMA parameters depends on the uncertainty in the αx, βx, and κt parameters. The third step ensures that the uncertainty of the forecasted κt's depends not only on the ARIMA standard error, but also on the uncertainty of the ARIMA parameters themselves. Finally, in the computation of the relevant measures in step four, all sources of uncertainty are taken into account.

This yields B realizations α_x^b, β_x^b, κ_t^b and projected κ_t^b on the basis of which we can compute the measure of interest e↗_x(t). Assume that B bootstrap estimates e_x^b(t), b = 1, 2, ..., B, have been computed. The (1 − 2α) percentile interval for e↗_x(t) is given by (e_x^{b(α)}(t), e_x^{b(1−α)}(t)), where e_x^{b(ζ)}(t) is the 100 × ζth empirical percentile of the bootstrapped values for e↗_x(t), that is, the (B × ζ)th value in the ordered list of replications e_x^b(t), b = 1, 2, ..., B. For instance, in the case of B = 1,000 bootstrap samples, the 0.95th and the 0.05th empirical percentiles are, respectively, the 950th and 50th numbers in the increasing ordered list of 1,000 replications of e↗_x(t).
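The percentile interval itself reduces to order statistics of the bootstrapped values. A minimal sketch (the function name is ours; rounding of B × ζ to the nearest integer is one of several conventions):

```python
import numpy as np

def percentile_interval(boot_values, alpha=0.05):
    """(1 - 2*alpha) bootstrap percentile interval: the (B*alpha)-th and
    (B*(1-alpha))-th values in the increasing ordered list of replications,
    e.g. the 50th and 950th of B = 1,000."""
    srt = np.sort(np.asarray(boot_values))
    B = len(srt)
    lo = srt[int(round(B * alpha)) - 1]        # (B*alpha)-th order statistic
    hi = srt[int(round(B * (1 - alpha))) - 1]  # (B*(1-alpha))-th order statistic
    return lo, hi
```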

Note that these bootstrap procedures account for parameter uncertainty as well as Arrowian uncertainty (also known as risk, in which the set of future outcomes is known and probabilities can be assigned to each of the possible outcomes based on a known model with known parameters generating the distribution of future outcomes). Knightian uncertainty, by comparison, acknowledges the presence of both model uncertainty and parameter uncertainty. Allowing for model uncertainty would require the consideration of several mortality projection models and the assignment to these of probabilities in line with their relative likelihoods.

Remark Empirical studies conducted in Renshaw and Haberman (2008) reveal varying magnitudes of the Monte Carlo based confidence and prediction intervals under different sets of identifiability constraints. Such diverse results are attributed by these authors to the overparametrization present in the model rather than to the non-linearity of the parametric structure. □

5.8.3 Application to Belgian mortality statistics

In the approach proposed by Lee and Carter (1992), future age-specific death rates are obtained using extrapolated κt's and fixed αx's and βx's; that is, the pointwise projections κ_{t_n+s} of the κ_{t_n+s}'s, s = 1, 2, ..., are inserted into the formula giving the force of mortality and provide

µ_x(t_n + s) = exp(α_x + β_x κ_{t_n+s})    (5.50)

In this case, the jump-off rates (i.e. the rates in the last year of the fitting period, or jump-off year) are fitted rates. The basic Lee–Carter method has been criticized by Bell (1997) for the fact that a discontinuity is possible between the observed mortality rates and life expectancies for the jump-off year and the forecast values for the first year of the forecast period. The bias arising from this discontinuity would then persist throughout the forecast.

As suggested by Bell (1997), Lee and Miller (2001), Lee (2000), Renshaw and Haberman (2003a), and Renshaw and Haberman (2003c), the forecast could be started with observed death rates rather than with fitted ones. This would help to eliminate a jump between the observed and forecasted death rates in the first year of the forecast, as the model does not fit age-specific death rates exactly in the last year. If the fitting period is sufficiently long, then the difference between the observed and the fitted death rates can be appreciable. Specifically, the forecast mortality rates are aligned to the latest available empirical mortality rates as

µ_x(t_n + s) = m_x(t_n) exp(β_x (κ_{t_n+s} − κ_{t_n})) = m_x(t_n) RF(x, t_n + s)    (5.51)

Note that here, m_x(t_n) denotes the death rate observed at age x in year t_n and not the fitted one, and RF denotes the reduction factor as introduced in equation (4.6).
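The alignment in (5.51) is a simple rescaling of the jump-off rates by the reduction factor; a sketch with hypothetical inputs:

```python
import numpy as np

def forecast_rates_aligned(m_last, beta, kappa_fc, kappa_last):
    """Forecast death rates aligned to the latest observed rates m_x(t_n),
    as in (5.51): mu_x(t_n + s) = m_x(t_n) * RF(x, t_n + s)."""
    m_last = np.asarray(m_last)[:, None]    # observed jump-off rates, one row per age
    beta = np.asarray(beta)[:, None]
    # Reduction factor RF(x, t_n + s) = exp(beta_x * (kappa_{t_n+s} - kappa_{t_n}))
    rf = np.exp(beta * (np.asarray(kappa_fc) - kappa_last))
    return m_last * rf                      # ages x horizons matrix
```

With kappa_fc equal to kappa_last in the first entry, the first column reproduces the observed jump-off rates exactly, which is the point of the alignment.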

If the latest empirical mortality rates were judged to be atypical in level or shape, an alternative would be to average across a few years at the end of the observation period, or to resort to a recent population life table, as advocated by Renshaw and Haberman (2003a,c). In the example here, we use the Belgian 2002–2004 population life table released in 2006 by Statistics Belgium as the base for the mortality forecast.

Here, we bootstrap the residuals displayed in Figure 5.7 (top panel). With 10,000 replications, we obtain the histograms displayed in Figure 5.9 for the cohort life expectancy e↗_65(2006) for males. The point estimate is 18.17. The mean of the bootstrapped values is 18.05, with a standard deviation of 0.3802. The bootstrap percentile confidence interval at level 90% is (17.41183, 18.66094).

We have also applied a Poisson bootstrap. The results are shown in Fig. 5.9, lower panel. The mean and the standard deviation are almost equal to those of the residuals bootstrap (respectively, 18.03 and 0.3795). The bootstrap percentile confidence interval at level 90% is (17.40993, 18.65580). The histograms obtained with the Poisson bootstrap and with the residuals bootstrap have very similar shapes, and the confidence intervals closely agree.

Figure 5.9. Histograms for the 10,000 bootstrapped values of the cohort life expectancies at age 65 in year 2006 for the general population, males: residuals bootstrap in the top panel, Poisson bootstrap in the bottom panel.

5.9 Forecasting life expectancies

In this section, we consider the computation of projected life expectancies at retirement age 65, obtained from the Lee–Carter and Cairns–Blake–Dowd approaches, by replacing the death rates with their forecasted values. Moreover, the results are then compared with other projections performed for the Belgian population.


5.9.1 Official projections performed by the Belgian Federal Planning Bureau (FPB)

The FPB was asked in 2003 by the Pension Ministry to produce (in collaboration with Statistics Belgium) projected life tables to be used to convert pension benefits into life annuities in the second pillar. A working party was set up by the FPB with representatives from Statistics Belgium, BFIC, the Royal Society of Belgian Actuaries, and UCL. The results are summarized in the Working Paper 20-04 available from http://www.plan.be.

The FPB model specifies q_x(t) = exp(α_x + β_x t), where α_x = ln q_x(0) and β_x is the rate of decrease of q_x(t) over time. Thus, each age-specific death probability is assumed to decline at its own exponential rate. The α_x's and β_x's are first estimated by the least-squares method, that is, minimizing the objective function

Σ_{x=x_1}^{x_m} Σ_{t=t_1}^{t_n} (ln q_x(t) − α_x − β_x t)^2    (5.52)

Then, the resulting β_x's are smoothed using geometric averaging. Finally, the α_x's are adjusted to represent the recent mortality experience. This methodology is similar to the generalized linear modelling regression-based approach proposed by Renshaw and Haberman (2003b).
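The least-squares objective (5.52) decouples across ages, so each (α_x, β_x) pair is an ordinary linear regression of ln q_x(t) on t. A sketch of this first step (the geometric smoothing and the final adjustment of the α_x's are omitted):

```python
import numpy as np

def fit_fpb(q, years):
    """Fit ln q_x(t) = alpha_x + beta_x * t by least squares, one regression
    per age (one row of q per age); returns the alpha_x and beta_x arrays."""
    t = np.asarray(years, dtype=float)
    alphas, betas = [], []
    for log_qx in np.log(q):           # loop over ages
        b, a = np.polyfit(t, log_qx, 1)  # slope beta_x, intercept alpha_x
        alphas.append(a)
        betas.append(b)
    return np.array(alphas), np.array(betas)
```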

The death rates m_x(t) and the death probabilities q_x(t) are typically very close to one another in value. This is why we would expect that the FPB approach would lead to projections similar to the Lee–Carter method once the optimal fitting period has been selected. However, no such selection is performed in the FPB analysis, which may result in some differences in the forecasts.

5.9.2 Andreev–Vaupel projections

The method used by Andreev and Vaupel (2006) is based on Oeppen and Vaupel (2002). Plotting the highest period female life expectancy attained for each calendar year from 1840 to 2000, Oeppen and Vaupel (2002) noticed that the points fall close to a straight line, starting at about 45 years in Sweden and ending at about 85 years in Japan. They find that record female life expectancy has increased linearly by 2.43 years per decade from 1840 to 2000 (with a coefficient of determination R² = 99.2%). The record male life expectancy has increased linearly from 1840 to 2000 at a rate of 2.22 years per decade (with R² = 98%). Moreover, there is no indication of either an acceleration or a deceleration in the rates of change. If the trend continues, they predict that female record life expectancy will be 97.5 years by mid-century and 109 years by 2100. Life expectancy can be forecast for a given country by considering the gap between national performance and the best-practice level. See also Lee (2003).

Andreev and Vaupel (2006) combine the approach due to Oeppen and Vaupel (2002) with the Lee–Carter model to gain stability over the long run. More precisely, they assume that the linear trend in the best-practice female life expectancy continues into the future and also that the difference between the life expectancy of a particular country and the general trend stays constant over time. Then, the life expectancy at birth can be forecast as

e↑_0(t) = e↑_0(t_n) + s (t − t_n)    (5.53)

where s is the pace of increase in the best-practice life expectancy over time that has been estimated by Oeppen and Vaupel (2002), and e↑_0(t) is the life expectancy at birth in the particular country. Andreev and Vaupel (2006) do not use separate values of s for males and females but the female value of 0.243 for both genders.

Andreev and Vaupel (2006) consider ages 50 and over, so they need to deduce the value of e↑_50(t) from e↑_0(t). To do so, they start with a forecast of death rates by the linear decline model (according to which each age-specific death rate is forecast to decline at its own independent rate) along the lines of

µ_x(t_n + s) = µ_x(t_n) exp(−g_x s)    (5.54)

where g_x is the annual rate of decline for the mortality rate at age x.

Then, the forecasted death rates are multiplied by a constant factor so that the life expectancy at birth matches the e↑_0(t) values coming from (5.53). The value of e↑_50(t) is then obtained from these adjusted death rates.

Given the estimated value of e↑_50(t), we need to calculate the set of mortality rates at ages over 50 that correspond to this value. Andreev and Vaupel (2006) use the Kannisto model

µ_x(t) = a_t exp(b_t x) / (1 + a_t exp(b_t x))    (5.55)

which is fitted to data for ages 50 and over by the method of Poisson maximum likelihood. The a_t's are then projected into the future from the linear model

ln a_t = β_0 + β_1 t    (5.56)

Then, for each t ≥ t_n + 1, the parameter b_t is determined to match e↑_50(t) given the value of a_t obtained from (5.56).
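Determining b_t so that the Kannisto rates reproduce a target e↑_50(t) is a one-dimensional root-finding problem. The following rough sketch uses a period life expectancy under piecewise constant forces and measures the age variable from 50 onwards; both are simplifying assumptions of ours, not details from Andreev and Vaupel (2006):

```python
import numpy as np

def kannisto_mu(a, b, x):
    """Kannisto force of mortality (5.55), logistic in the age variable x."""
    z = a * np.exp(b * x)
    return z / (1.0 + z)

def e50_from_mu(mu):
    """Life expectancy at the first age of the mu vector, assuming a
    piecewise constant force of mortality within each year of age."""
    surv = np.concatenate(([1.0], np.exp(-np.cumsum(mu))))[:-1]
    return float(np.sum(surv * (1.0 - np.exp(-mu)) / mu))

def solve_b(a, target_e50, ages=np.arange(50, 111)):
    """Bisection for the b_t matching a target e_50, given a_t from (5.56)."""
    lo, hi = 1e-4, 0.5
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        # Higher b means higher mortality, hence lower e_50.
        if e50_from_mu(kannisto_mu(a, mid, ages - 50)) > target_e50:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```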

5.9 Forecasting life expectancies 237

This method may produce a jump in death rates. To avoid this drawback, the death rates can be blended with the death rates produced by the Lee–Carter method over a short period of time. Specifically, the Lee–Carter model is fitted to data for ages 50 and over, and the estimated κt's are adjusted by refitting to the e↑_50(t)'s. The bias correction ensures that the forecasted death rates closely agree with the latest available death rates in the first years of the forecast. The weight assigned to the Lee–Carter death rates is 1 − k/(n + 1) for year t_n + k, k = 1, 2, ..., n, where n is the length of the blending period. The value of n ranges from 10 for countries where the model (5.55) provides a good fit to 40 where this is not the case.
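Under the reading 1 − k/(n + 1), the Lee–Carter weight decays linearly from n/(n + 1) in the first forecast year to 1/(n + 1) in the last year of the blend; a one-line sketch:

```python
def blended_rates(mu_lc, mu_kannisto, k, n):
    """Blend Lee-Carter and Kannisto-based forecast rates in year t_n + k,
    giving the Lee-Carter rates weight 1 - k/(n + 1) over an n-year blend."""
    w = 1.0 - k / (n + 1.0)
    return w * mu_lc + (1.0 - w) * mu_kannisto
```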

5.9.3 Application to Belgian mortality statistics

Life expectancies are often used by demographers to measure the evolution of mortality. Specifically, e↗_x(t) is the average number of years that an x-aged individual in year t will survive, allowing for the evolution of mortality rates with time after t. We thus expect that this person will die in year t + e↗_x(t) at age x + e↗_x(t). The formula giving e↗_x(t) under (3.2) is

e↗_x(t) = ∫_{ξ≥0} exp( −∫_0^ξ µ_{x+η}(t + η) dη ) dξ

= (1 − exp(−µ_x(t))) / µ_x(t)
  + Σ_{k≥1} [ Π_{j=0}^{k−1} exp(−µ_{x+j}(t + j)) ] (1 − exp(−µ_{x+k}(t + k))) / µ_{x+k}(t + k).    (5.57)
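Formula (5.57) can be evaluated directly once the forces of mortality along the cohort's diagonal are available; a minimal sketch, with the function name and inputs ours:

```python
import numpy as np

def cohort_e(mu):
    """Cohort life expectancy (5.57), where mu[k] = mu_{x+k}(t+k) is the
    force of mortality the cohort aged x in year t experiences k years
    later, assumed constant within each year of age and calendar year."""
    mu = np.asarray(mu, dtype=float)
    # Survival from age x to age x+k along the cohort's diagonal:
    # the product of exp(-mu_{x+j}(t+j)) for j < k.
    surv = np.concatenate(([1.0], np.exp(-np.cumsum(mu))))[:-1]
    # Expected time lived within each year, (1 - exp(-mu)) / mu.
    within = (1.0 - np.exp(-mu)) / mu
    return float(np.sum(surv * within))
```

With a constant force µ the formula collapses to 1/µ, which gives a quick sanity check.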

It is interesting to compare (5.57) with the expression (3.18) previously obtained for the period life expectancy. The actual computation of the projected cohort life expectancies at (retirement) age 65 is made using formula (5.57), where the future death rates are replaced with their forecast values. First, the cohort life expectancies obtained in the Lee–Carter model are compared with the values coming from the Cairns–Blake–Dowd forecast. Then, the Lee–Carter projections are compared with two projections performed for the Belgian population, by the Federal Planning Bureau and by Andreev and Vaupel (2006).

Figure 5.10 displays the values of the cohort life expectancies at age 65 obtained from the Lee–Carter and Cairns–Blake–Dowd mortality projections. For the Lee–Carter forecast, we also present a 90% prediction interval. We see that the values obtained from the two approaches are in close agreement, with slightly larger values coming from the Lee–Carter approach.

Figure 5.10. Forecast of cohort life expectancies at age 65 for the general population (circle) with 90% confidence intervals (gray-shaded area), together with values obtained from the Cairns–Blake–Dowd model (triangle).

Figure 5.11 displays the cohort life expectancies at age 65 resulting from the Lee–Carter forecast for the general population, together with the official FPB values and the corresponding values obtained by Andreev and Vaupel (2006). The small differences (of less than 6 months) between the FPB forecasts and the projections obtained in this chapter remain stable over time. The official FPB forecasts lie inside the 90% confidence interval for the cohort life expectancy at age 65. Hence, the FPB forecast is as plausible as the Lee–Carter projection performed in this chapter. These two projections, however, differ significantly from the results derived in Andreev and Vaupel (2006), which are either implausibly small or become rapidly significantly larger than the present forecasts.

Considering the values obtained by Andreev and Vaupel (2006) using the Lee–Carter methodology, the differences relative to the forecast obtained in the present study can be explained as follows. First, Andreev and Vaupel (2006) use age groups 50–54, 55–59, ..., 100+ and not single years of age. Next, the optimal fitting period is not determined by Andreev and Vaupel (2006), who routinely used 1950–2000. Finally, the forecast starts with death rates observed in the last year with available data (i.e. 2000 in their case). We see that the projections obtained in this chapter from the Lee–Carter model after the optimal fitting period has been selected exceed those produced by Andreev and Vaupel (2006) by the same methodology from calendar years 1950–2000. The selection of the optimal fitting period may thus have a dramatic effect on the forecast, and is in line with the conservative actuarial approach.

Figure 5.11. Forecast of cohort life expectancies at age 65 for the general population (circle) with 90% confidence intervals (gray-shaded area), together with official FPB values (triangle), with values obtained by Andreev and Vaupel (2006) using the Lee–Carter methodology (square), and with values obtained by Andreev and Vaupel (2006) using the Oeppen and Vaupel (2002) modified methodology (diamond).

Andreev and Vaupel (2006) apply the same rate of decrease, 0.243, for both genders in order to forecast future life expectancies using the Oeppen–Vaupel line of record life expectancies. The life expectancy at age 50 is deduced from the projected life expectancy at birth using a forecast of death rates by the linear decline model (i.e. letting each age-specific death rate decline at its own independent rate by fitting a random walk with drift model separately to the log of death rates in each age group). Finally, the Lee–Carter projection is combined with a Kannisto model to produce projected life tables. We see from Fig. 5.11 that this method yields a much higher life expectancy at age 65 than the other approaches. Moreover, the speed of improvement exceeds the other forecasts.

It is interesting to note that all of the mortality forecasting models considered in the present chapter (Lee–Carter with optimal fitting period, Cairns–Blake–Dowd, FPB, and Oeppen–Vaupel) agree about the forecasts of e↗_65(t) in the next few years. Significant differences compared with the Oeppen–Vaupel approach emerge from 2013, this approach suggesting significantly higher values for the life expectancy at retirement age than its competitors.


5.9.4 Longevity fan charts

Following Blake and Dowd (2007), we produce longevity fan charts for e↗_65(2006 + t), t = 0, 1, ..., 20, based on the residuals bootstrap (with B = 10,000). The result is shown in Fig. 5.12. Such charts depict some central projection of the forecasted variable, together with bounds around this showing the probabilities that the variable will lie within specified ranges. The chart in Fig. 5.12 shows the central 10% prediction interval with the heaviest shading, surrounded by the 20%, 30%, ..., 90% prediction intervals with progressively lighter shading. The shading becomes stronger as the prediction interval narrows. We can therefore interpret the degree of shading as reflecting the likelihood of the outcome: the darker the shading, the more likely the outcome. The fan in Fig. 5.12 consists of 9 grey bands of varying intensity. The upper and lower boundaries correspond to paths of the forecast 95% and 5% quantiles, and the inner edges of the bands in the fan correspond to the 10%, 15%, ..., 90% quantiles. The darkest band in the middle is bounded by the 45% and 55% quantiles. Note that the quantiles are calculated for each year in isolation. The fan chart in Fig. 5.12 shows that longevity risk is rather low. The question as to whether these narrow confidence bounds are realistic remains an open one.
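Since the quantiles are calculated for each year in isolation, the band edges can be computed in one pass over the bootstrapped trajectories; a sketch (the function name is ours):

```python
import numpy as np

def fan_chart_bands(boot_paths, quantiles=np.linspace(0.05, 0.95, 19)):
    """Edges of the fan chart bands: the 5%, 10%, ..., 95% quantiles of the
    bootstrapped trajectories, computed year by year in isolation.
    boot_paths has shape (B, horizon); the result has one row per quantile."""
    return np.quantile(np.asarray(boot_paths), quantiles, axis=0)
```

Plotting consecutive pairs of rows as shaded regions, with shading darkest around the median rows, reproduces the fan of Fig. 5.12.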

5.9.5 Back testing

Figure 5.12. Fan chart for the cohort life expectancies at age 65, e↗_65(2006 + t), t = 0, 1, ..., 20, for the general population.

Let us now forecast the period life expectancies for calendar years 1981–2005, 1991–2005, and 2001–2005 on the basis of the observations relating to calendar years 1950–1980, 1950–1990, and 1950–2000, respectively. We thus investigate the predictive power of the Lee–Carter approach if it were applied in the past to forecast future mortality for ages 60–104. Table 5.1 summarizes the features of the Lee–Carter fits to each of these three periods.

Table 5.1. Summary of the Lee–Carter fit to the periods 1950–1980, 1950–1990, and 1950–2000.

                        1950–1980     1950–1990     1950–2000
Opt. fitting period     1960–1980     1968–1980     1968–1980
% var. explained        65.34         91.63         94.78
d                       −0.3526104    −0.6126091    −0.5541269
σ²                      5.509462      0.6228406     0.4072292

Figure 5.13. Observed period life expectancies at age 65 for the general population (circles), together with forecasts based on the 1950–1980, 1950–1990, and 1950–2000 periods (triangles) surrounded by 90% prediction intervals.

We see that the fit is rather poor when the observation period is restricted to 1950–1980, with only 65% of the total variance explained by the Lee–Carter decomposition. Also, the drift parameter gives a higher yearly improvement in longevity for the two subsequent periods. Moreover, the estimated variance of the random walk with drift model is considerably larger for the 1950–1980 period compared with the two subsequent ones.

Figure 5.13 displays the forecast of the period life expectancies at age 65, together with observed values and 90% prediction intervals (grey areas, with progressively heavier shading). We see that using 1950–1980 data gives a point forecast far below the actual life expectancies observed during 1981–2005. Moreover, the prediction intervals are wider compared to the 1950–1990 and 1950–2000 periods. The Lee–Carter model would thus clearly have underestimated the actual gains in longevity after 1980 on the basis of the 1950–1980 observation period. The forecast becomes better when the 1950–1990 and 1950–2000 periods are considered. The Lee–Carter model captures the trends in the observed period life expectancies, which remain in the prediction intervals.

6 Forecasting mortality: applications and examples of age-period-cohort models

6.1 Introduction

In this chapter, we consider the proposal that the models introduced in Chapter 5 should be extended to include components that represent a cohort effect, as well as how this proposal has been implemented. We illustrate this implementation with a case study based on the UK experience. The justification for this proposal comes initially from some descriptive studies of mortality trends in the United Kingdom which demonstrate that there is a strong birth cohort effect present. Thus, the Government Actuary's Department, which was responsible at the time for the official UK population projections, has highlighted, in a series of reports (GAD 1995, 2001, and 2002), the existence of cohorts in the United Kingdom who have experienced rapid mortality improvement, relative to those born previously or more recently. The generations (of both sexes) born approximately between 1925 and 1945 (and centered on the generation with year of birth 1931) seem to have experienced this more rapid mortality improvement.

Further evidence has also come from the Continuous Mortality Investigation Bureau (CMIB) in the United Kingdom. In an analysis of the mortality experience of males with life insurance over an extended period, CMIB (2002) notes the existence of a similar effect, although this seems to be centered on a slightly earlier cohort, that is, that born in 1926. A similar cohort effect is also noted in an investigation into the mortality rates of male pensioners who are members of insured pension schemes – again the highest rates of mortality decrease are noted for the 1926 cohort.

The reasons for this so-called cohort effect are not precisely understood. A number of explanatory factors have been suggested in the literature – a helpful review is provided by Willets (2004). Among the most plausible factors are the following. First, the diet in the United Kingdom during the 1940s and early 1950s may have had a beneficial effect on the health of children growing up during this time. Although this was a time of food rationing, there is evidence that the average consumption of fresh vegetables, bread, milk, and fish was higher during those years than in a recent period like the early 1990s – and, at the same time, average consumption of cheese and meat was lower. Second, the introduction of a universal social security system in 1948 (following the Beveridge Report of 1942), the introduction of free secondary school education for all in 1944, and the establishment of the National Health Service in 1948 meant that the social conditions for children growing up in the early 1950s would have been very different from those experienced by earlier generations. Third, there are strong cohort-related patterns in mortality from diseases that are linked to smoking, for example, lung cancer and heart disease (ONS 1997). It is clear that in the United Kingdom (and elsewhere) different generations have had different smoking histories. Those born around 1920 may have started smoking during the 1930s, they may have been given free cigarettes during World War II, and would have been smoking for some considerable time when the deleterious health impact of smoking was first identified in the late 1950s and early 1960s. There is a marked contrast with those born some 20 years later who would have reached adulthood just when these research findings were being widely discussed.

Given the close association of lung cancer with smoking, Willets (2004) also examines the patterns of cause-specific mortality rates from lung cancer in the United Kingdom. He argues that, for males, lung cancer death rates plotted for different cohorts indicate an upward trend for those born from 1870 onwards, with the peak rates for those born between 1900 and 1905 and the greatest average annual improvement for those born in the period 1930–1935.

These findings are supported by published analyses; for example, Evandrou and Falkingham (2002) have studied smoking prevalence rates. They estimate that approximately 95% of men born in 1916–1920 had smoked at some point by the time they reached age 60 – while, for the cohort born in 1931–1935, the corresponding figure was 25%. Finally, varying birth cohort sizes may confer benefits in that those born at a time of low birth rates may acquire social and economic advantages relative to those born at times of higher birth rates. In this regard, we note that the period from 1925 to 1945 was a period of falling birth rates sandwiched between the two post-war 'baby booms'.

Other hypotheses have been suggested. For example, Catalano and Bruckner (2006) have tested the ‘diminished entelechy hypothesis’ – this postulates that cohorts who experience relatively many or relatively virulent environmental insults (e.g. infectious diseases, extreme weather, poor diet in terms of quality and/or quantity) in their early years then suffer a reduced subsequent life span. From a thorough time series analysis, they find a positive association between mortality in the first five years of life and average lifespan at age 5 for those born in Denmark (in the period 1835–1913), England and Wales (in the period 1841–1912) and Sweden (in the period 1751–1912). It is not clear, however, whether this hypothesis would be relevant to cohorts born later in the 20th century in these countries, where early childhood mortality rates have been much reduced through the control of infectious diseases via antibiotics and immunization. Other studies have also looked at the role played by cohort effects in cause-specific rather than all-causes mortality.

Similarly, Crimmins and Finch (2006) investigate the association between exposure to infections and late-life health outcomes within the same cohort. In particular, they consider the relationship between mortality decline among older persons in a cohort and earlier mortality decline in childhood within the same cohort, using childhood mortality as an index of environmental exposure to infections. The analysis focuses on four western European countries (Sweden, England, France, and Denmark) where cohort-based mortality data for cohorts born before 1900 are of a high quality. The authors find that mortality declines among older persons tend to occur in the same cohorts that had experienced mortality decline as children. The choice of cohorts born before 1900 was made to avoid the confounding influence of smoking, immunization, and antibiotics. Although this may mean that the results have reduced relevance for developed countries, it is clear that there are implications for developing countries, where childhood mortality has reduced markedly in the last few decades, suggesting, for example, that cohort effects may emerge in the future.

Davy Smith et al. (1998) have looked at the association between adverse social circumstances in childhood and adult mortality from a range of major causes – they find a positive association between deprivation in childhood and adult mortality from stroke and stomach cancer (and less strong associations with coronary heart disease and respiratory disease). They suggest that the re-emergence of child poverty in the last 20 years may well lead to a cohort effect that will be observed in the future.

We find also that there is corresponding evidence from the United States, Japan, and Germany that there is a cohort effect present in national mortality data during the last 40 years, and also for Sweden over a much longer period: see the analyses of data in Cairns et al. (2007), Richards et al. (2007), and Willets (2004).

246 6 : Forecasting mortality

In the following sections, we investigate the extension of the Lee–Carter (LC) and the Cairns–Blake–Dowd models in order to incorporate a cohort effect. We thus build on the detailed discussions in Chapter 5 – in particular, in Sections 5.2 and 5.3.

6.2 LC age–period–cohort mortality projection model

6.2.1 Model structure

We begin the discussion by presenting the LC model in a wider setting. Given the important role played by the mortality reduction factors in generating mortality projections for actuarial applications, we emphasize the targeting of the mortality reduction factor, as opposed to the force of mortality. Thus, while the parametric structure is expanded to allow for age–cohort as well as the familiar LC effects, the error structure is imposed by specifying the second-moment properties of the model. This allows for a range of options for the choice of error distribution, including Poisson, both with and without dispersion, as well as Gaussian, as used in the original LC approach. We then review the methods of fitting such models and expand on them. As in Chapter 5, extrapolation is conducted using the standard approach advocated by LC of parametric time series forecasting.

As before, we let the random variable Dxt denote the number of deaths in a population at age x and time t. A rectangular data array (dxt, ETRxt) is available for analysis, where dxt is the actual number of deaths and ETRxt is the matching exposure to the risk of death. The force of mortality and empirical mortality rates are denoted by µx(t) and mx(t) (= dxt/ETRxt) respectively. Cross-classification is by individual calendar year t ∈ [t1, tn] (range n) and by age x ∈ [x1, xm], either grouped into m (ordered) categories, or by individual year (range m), in which case the year-of-birth or cohort year z = t − x ∈ [t1 − xm, tn − x1] (range n + m − 1) is defined. We assume that this is the case throughout.

In terms of the force of mortality (as opposed to the central rate of mortality), the LC model structure is

ln µx(t) = αx + βx κt   (6.1)

subject to the most commonly used (but non-unique) constraints, which are adopted to ensure the identifiability of the parameters:

∑_{t=t1}^{tn} κt = 0,   ∑_{x=x1}^{xm} βx = 1.   (6.2)

As discussed in Chapter 5, the LC model structure reduces the dimensionality of the problem by identifying a single time index, which affects the force of mortality at time t at all ages simultaneously. The first constraint under (6.2) has the effect of centring the κt values over the range t ∈ [t1, tn]. The model structure is designed to capture age–period effects, with the αx terms incorporating the main age effects, averaged over time, and the bilinear terms βxκt incorporating the age-specific period trends (relative to the main age effects). We rewrite equation (6.1) in the following form:

µx(t) = exp(αx + ln RF(x, t))   (6.3)

in general, where, specifically, the mortality reduction factor RF

ln RF(x, t) = βx κt   (6.4)

is defined under LC modelling. We subsequently adjust the constraints (6.2), so that

ln RF(x, tn) = 0, for all x   (6.5)

when extrapolating mortality rates, as described in Chapter 4.

We now generalize the model structure in order to incorporate an age–cohort term. Thus, we consider the age–period–cohort (APC) version of the LC model, first introduced by Renshaw and Haberman (2006),

ln µx(t) = αx + β(0)x ιt−x + β(1)x κt   (6.6)

and the related mortality reduction factor

RF(x, t) = exp(β(0)x ιt−x + β(1)x κt)   (6.7)

with an extra pair of bilinear terms β(0)x ιt−x introduced in order to represent the cohort effects. We can see that (6.6) is then a natural extension of equation (4.98), which was introduced in Chapter 4.

It is clear that the structure represented by equations (6.6) and (6.7) gives rise to a rich sub-structure of models:

Lee–Carter age–period (LC) model: β(0)x = 0
Age–Cohort (AC) model: β(1)x = 0

plus versions where either or both of the β(j)x = 1 for j = 0, 1: these are the cases where the application of age adjustments to one or both of the main period-effects and cohort-effects terms is not found to be significant.

In formulating these structures, in each of the above cases, we may partition the force of mortality as follows:

µx(t) = exp(αx) RF(x, t)   (6.8)

that is, into the product of a static term, representing the age profile of mortality and incorporating the main age effects αx, and a dynamic parameterized mortality reduction factor RF(x, t), which contains both the period (κt) and cohort (ιt−x) effects.

6.2.2 Error structure and model fitting

6.2.2.1 Introduction

In Chapter 5, a number of approaches to specifying the error structure for the model and to model fitting have been described. In this section, we set aside the standard least-squares and singular value decomposition approach, and focus on selecting a Poisson response model and using maximum likelihood estimation (as in Section 5.2.2.3).

Thus, we model the random number of deaths, Dxt, as a Poisson response variable. As noted in the previous chapter, direct modelling of Dxt is very useful in many practical applications where, for example, we might need to simulate the future cash flows of a life annuity or pensions portfolio. We allow also for over-dispersion and the allocation of prior weights, which can be important in the presence of empty data cells. This is formalized by following the approach of generalized linear models and by specifying the first two moments of the responses Yxt, where

Yxt = Dxt,

E(Yxt) = ETRxt µx(t) = ETRxt exp(αx) RF(x, t)   (6.9)

Var(Yxt) = φ E(Yxt)   (6.10)

with a scale parameter φ, variance function V(E(Yxt)) = E(Yxt) and prior weights wxt = 1 (or 0 if the data cell is empty). Then, under the log link, the non-linear predictor ηxt is defined as

(ln E(Yxt) =) ηxt = ln ETRxt + αx + ln RF(x, t)   (6.11)
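As a brief illustration of the Poisson response assumption (6.9) – useful, as noted above, when simulating the future cash flows of an annuity or pensions portfolio – the following Python sketch draws death counts with mean ETRxt µx(t). The function name and array shapes are ours, not the authors'; this is a hedged sketch of the sampling step only.

```python
import numpy as np

def simulate_deaths(etr, mu, n_sims, seed=0):
    """Draw D_xt ~ Poisson(ETR_xt * mu_x(t)), the response model (6.9),
    independently over ages x, years t, and simulation replicates."""
    rng = np.random.default_rng(seed)
    etr, mu = np.asarray(etr), np.asarray(mu)
    # one (ages x years) array of simulated death counts per replicate
    return rng.poisson(etr * mu, size=(n_sims,) + etr.shape)
```

Each replicate can then be fed through the portfolio cash-flow calculation of interest.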

It is also of interest to note a possible alternative error structure, in that the original LC model with a Gaussian error structure is re-established on replacing (6.9) and (6.10) with

Yxt = ln(Dxt/ETRxt),  E(Yxt) = αx + ln RF(x, t),  Var(Yxt) = φ/wxt.   (6.12)

This formulation comprises a free-standing scale parameter φ (= σ²), variance function V(E(Yxt)) = 1 and prior weights wxt = 1. Then, under the identity link, the non-linear predictor is given by

(E(Yxt) =) ηxt = αx + βx κt   (6.13)

which is the standard LC structure (6.1).


Given the non-linear nature of the parametric predictors (ηxt), we focus on two alternative model-fitting procedures: Method A, which is based on an unpublished technical report by Wilmoth (1993); and Method B, which is based on a method of mortality analysis incorporating age–year interactions, from the field of medical statistics, attributable to James and Segal (1982), and which predates the LC model. We present Methods A and B in the context of the LC model and then go on to describe the fitting procedures for the new APC and AC versions of the model, which have been described above.

6.2.2.2 Fitting the LC model by Method A

We adapt the approach of Section 5.2.2.3, which is based on Wilmoth (1993) and Brouhns et al. (2002b), and obtain maximum likelihood estimates under the original LC Gaussian error structure given by (6.12) and (6.13) using an iterative process, which can be re-expressed as follows:

Set starting values αx, βx, κt; compute ŷxt
↓
update αx; compute ŷxt
update κt, adjusting so that ∑_{t=t1}^{tn} κt = 0; compute ŷxt
update βx; compute ŷxt
compute D(yxt, ŷxt)
↓
Repeat the updating cycle; stop when D(yxt, ŷxt) converges

where

yxt = ln mx(t),  ŷxt = αx + βx κt   (6.14)

D(yxt, ŷxt) = ∑_{x,t} dev(x, t) = ∑_{x,t} 2 wxt ∫_{ŷxt}^{yxt} [(yxt − u)/V(u)] du = ∑_{x,t} wxt (yxt − ŷxt)²   (6.15)

with weights

wxt = { 1, ETRxt > 0
      { 0, ETRxt = 0.   (6.16)

The updating of a typical parameter θ proceeds according to

updated(θ) = u(θ) = θ − (∂D/∂θ) / (∂²D/∂θ²)   (6.17)


where D is the deviance of the current model. Table 6.1 provides fuller details. Effective starting values, conforming to the usual LC constraints (6.2), are κt = 0, βx = 1/k, coupled with the SVD estimate

αx = [1/(tn − t1 + 1)] ∑_{t=t1}^{tn} ln mx(t)   (6.18)

so that αx is estimated by the logarithm of the geometric mean of the empirical mortality rates. The model has ν = (k − 1)(n − 2) degrees of freedom.
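To make the updating cycle concrete, here is a minimal Python sketch of Method A under the Gaussian error setting, using the Table 6.1 updating relationships, the starting values just described, and the deviance (6.15) as the convergence criterion. The function name and synthetic-data interface are our own; this is an illustrative sketch, not the authors' code.

```python
import numpy as np

def fit_lc_gaussian(m, tol=1e-12, max_iter=1000):
    """Method A for ln m_xt = alpha_x + beta_x*kappa_t (Gaussian errors).
    m: (k ages x n years) array of empirical rates; cells with m == 0
    receive weight 0, as in (6.16)."""
    w = (m > 0).astype(float)
    y = np.where(m > 0, np.log(np.where(m > 0, m, 1.0)), 0.0)
    k, n = y.shape
    alpha = (w * y).sum(1) / w.sum(1)          # SVD-style start, cf. (6.18)
    beta, kappa = np.full(k, 1.0 / k), np.zeros(n)
    dev_old = np.inf
    for _ in range(max_iter):
        yhat = alpha[:, None] + np.outer(beta, kappa)
        alpha = alpha + (w * (y - yhat)).sum(1) / w.sum(1)
        yhat = alpha[:, None] + np.outer(beta, kappa)
        kappa = kappa + ((w * (y - yhat)) * beta[:, None]).sum(0) \
                        / (w * beta[:, None] ** 2).sum(0)
        kappa = kappa - kappa.mean()           # re-impose sum(kappa) = 0
        yhat = alpha[:, None] + np.outer(beta, kappa)
        beta = beta + ((w * (y - yhat)) * kappa[None, :]).sum(1) \
                      / (w * kappa[None, :] ** 2).sum(1)
        yhat = alpha[:, None] + np.outer(beta, kappa)
        dev = (w * (y - yhat) ** 2).sum()      # Gaussian deviance (6.15)
        if abs(dev_old - dev) < tol:
            break
        dev_old = dev
    s = beta.sum()                             # rescale so sum(beta) = 1
    return alpha, beta / s, kappa * s, dev
```

On data generated exactly from an LC structure the cycle converges in a couple of iterations; on real data the stopping rule on D does the work.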

Table 6.1. Parameter updating relationships (yxt denotes the observed response and ŷxt the current fitted value)

LC model
  Gaussian:
    u(αx) = αx + ∑t wxt(yxt − ŷxt) / ∑t wxt
    u(κt) = κt + ∑x wxt(yxt − ŷxt)βx / ∑x wxt βx²
    u(βx) = βx + ∑t wxt(yxt − ŷxt)κt / ∑t wxt κt²
  Poisson:
    u(αx) = αx + ∑t wxt(yxt − ŷxt) / ∑t wxt ŷxt
    u(κt) = κt + ∑x wxt(yxt − ŷxt)βx / ∑x wxt ŷxt βx²
    u(βx) = βx + ∑t wxt(yxt − ŷxt)κt / ∑t wxt ŷxt κt²

APC model
  Gaussian:
    u(ιz) = ιz + ∑_{x,t: t−x=z} wxt(yxt − ŷxt)β(0)x / ∑_{x,t: t−x=z} wxt (β(0)x)²
    u(β(0)x) = β(0)x + ∑t wxt(yxt − ŷxt)ιt−x / ∑t wxt ι²t−x
    u(κt) = κt + ∑x wxt(yxt − ŷxt)β(1)x / ∑x wxt (β(1)x)²
    u(β(1)x) = β(1)x + ∑t wxt(yxt − ŷxt)κt / ∑t wxt κt²
  Poisson:
    u(ιz) = ιz + ∑_{x,t: t−x=z} wxt(yxt − ŷxt)β(0)x / ∑_{x,t: t−x=z} wxt ŷxt (β(0)x)²
    u(β(0)x) = β(0)x + ∑t wxt(yxt − ŷxt)ιt−x / ∑t wxt ŷxt ι²t−x
    u(κt) = κt + ∑x wxt(yxt − ŷxt)β(1)x / ∑x wxt ŷxt (β(1)x)²
    u(β(1)x) = β(1)x + ∑t wxt(yxt − ŷxt)κt / ∑t wxt ŷxt κt²

AC model
  u(αx): computed as for the LC model (Gaussian or Poisson, as appropriate)
  Gaussian:
    u(ιz) = ιz + ∑_{x,t: t−x=z} wxt(yxt − ŷxt)βx / ∑_{x,t: t−x=z} wxt βx²
    u(βx) = βx + ∑t wxt(yxt − ŷxt)ιt−x / ∑t wxt ι²t−x
  Poisson:
    u(ιz) = ιz + ∑_{x,t: t−x=z} wxt(yxt − ŷxt)βx / ∑_{x,t: t−x=z} wxt ŷxt βx²
    u(βx) = βx + ∑t wxt(yxt − ŷxt)ιt−x / ∑t wxt ŷxt ι²t−x


This iterative fitting process generates maximum likelihood estimates underthe Poisson error structure presented in (6.9) and (6.10) on setting

yxt = dxt,  ŷxt = ETRxt exp(αx + βx κt)   (6.19)

D(yxt, ŷxt) = ∑_{x,t} dev(x, t) = ∑_{x,t} 2 wxt ∫_{ŷxt}^{yxt} [(yxt − u)/V(u)] du
            = ∑_{x,t} 2 wxt { yxt ln(yxt/ŷxt) − (yxt − ŷxt) }   (6.20)

As noted by Renshaw and Haberman (2003a, 2003b, 2006), we can attribute the iterative method for estimating log-linear models with bilinear terms to Goodman (1979). Table 6.1 provides full details of the parameter updating relationships.
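The Poisson deviance (6.20) is straightforward to compute directly, and serves as the convergence criterion D(yxt, ŷxt) in the iterative cycle. A hedged sketch (our own helper, adopting the usual convention that y ln(y/ŷ) = 0 when y = 0):

```python
import numpy as np

def poisson_deviance(d, dhat, w=None):
    """Deviance (6.20): sum over cells of 2*w*{y*ln(y/yhat) - (y - yhat)},
    with the y*ln(y/yhat) term taken as 0 where y = 0."""
    d, dhat = np.asarray(d, float), np.asarray(dhat, float)
    w = np.ones_like(d) if w is None else np.asarray(w, float)
    log_term = np.where(d > 0, d * np.log(np.where(d > 0, d, 1.0) / dhat), 0.0)
    return float((2.0 * w * (log_term - (d - dhat))).sum())
```

The deviance is zero when fitted deaths match observed deaths exactly, and strictly positive otherwise.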

6.2.2.3 Fitting the LC model by Method B

Following James and Segal (1982), we use the iterative procedure:

Set starting values βx
↓
given βx, update αx, κt
given κt, update αx, βx
compute D(yxt, ŷxt)
↓
Repeat the updating cycle; stop when D(yxt, ŷxt) converges

Given βx or κt, updating is by selecting the desired generalized linear model and fitting the predictor, which is linear in the respective remaining parameters. Thus, log-link Poisson responses yxt = dxt with offsets ln ETRxt are set in order to generate the same results as in the iterative fitting process of Section 5.2.2.3. The respective predictors are declared by accessing the model formulae (design matrices), a feature which is available in GLIM (Francis et al., 1993), for example, and in other software packages. In specifying the model formulae, we impose the constraints

κt1 = 0,  ∑x βx = 1,   (6.21)

and then revert to the standard LC constraints (6.2) once convergence is attained.
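Because the LC predictor is invariant under (αx, βx, κt) → (αx + cβx, βx, κt − c), reverting from the working constraints (6.21) to the standard constraints (6.2) is simply a relocation of a constant. A minimal sketch (function name ours):

```python
import numpy as np

def to_standard_constraints(alpha, beta, kappa):
    """Map estimates obeying kappa[0] = 0, sum(beta) = 1 (6.21) to the
    standard LC constraints sum(kappa) = 0, sum(beta) = 1 (6.2), leaving
    every fitted value alpha_x + beta_x*kappa_t unchanged."""
    c = kappa.mean()
    return alpha + beta * c, beta, kappa - c
```

The check that the fitted log rates are untouched is a useful unit test for any such reparameterization.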


6.2.2.4 Fitting the APC LC model

It is well known that APC modelling is problematic, since the three factorsare constrained by the relationship

cohort = period − age
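The consequence of this relation can be seen directly for the main-effects substructure (β(0)x = β(1)x = 1): a linear trend c can be moved among the three parameter sets without changing a single fitted value, which is why constraints such as (6.22) are needed. A small, self-contained Python illustration (indexing convention and random values are our own):

```python
import numpy as np

A, T = 5, 10                       # number of ages and calendar years
rng = np.random.default_rng(0)
alpha = rng.normal(size=A)         # main age effects
kappa = rng.normal(size=T)         # main period effects
iota = rng.normal(size=A + T - 1)  # main cohort effects

def eta(alpha, iota, kappa):
    """Predictor alpha_x + iota_{t-x} + kappa_t on the age-by-year grid;
    cohorts are indexed so that the oldest cohort has index 0."""
    a, y = np.meshgrid(np.arange(A), np.arange(T), indexing="ij")
    return alpha[a] + iota[y - a + A - 1] + kappa[y]

c = 0.37                           # an arbitrary linear trend
shifted = eta(alpha + c * np.arange(A),
              iota + c * np.arange(A + T - 1),
              kappa - c * (np.arange(T) + A - 1))
# 'shifted' equals eta(alpha, iota, kappa) exactly: the data cannot identify c.
```

The trend terms cancel cell by cell (c·a + c·(y − a + A − 1) − c·(y + A − 1) = 0), so no amount of data resolves the ambiguity; only constraints do.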

To ensure a unique set of parameter estimates, we resort to a two-stage fitting strategy in which αx is estimated first, typically as in (6.18), corresponding to the original LC SVD approach. Then the remaining parameters, those of the reduction factor RF, may be estimated by suitably adapting Method B: declaring log-link Poisson responses yxt = dxt and the augmented offsets ln ETRxt + αx, and adapting the design matrices, together with the constraints

∑x β(0)x = 1,  ∑x β(1)x = 1, and either ιt1−xk = 0 (or κt1 = 0).   (6.22)

Obvious simplifications to the design matrices are needed when fitting the associated sub-models with β(0)x = 1 or β(1)x = 1, while the iterative element in the fitting procedure is redundant when fitting the model with β(0)x = β(1)x = 1 for all x. We note that the APC model has ν = k(n − 3) − 2(n − 2) degrees of freedom (excluding any provision for the first-stage modelling of αx). We find that effective starting values are β(0)x = β(1)x = 1/k. Fitting is also possible under Method A, once αx has been estimated, using the extended definitions of yxt and adapting the core of the iterative cycle in accordance with the relevant updating relationships (Table 6.1). Effective starting values may be obtained by setting β(0)x = β(1)x = 1 and fitting this restricted version of the APC model to generate starting values for ιz and κt.

6.2.2.5 Fitting the AC LC model

Model identification is conveniently achieved by means of the parameter constraints

ιt1−xk = 0,  ∑x β(0)x = 1   (6.23)

Model fitting is then possible by reformulating Method A in terms of αx, β(0)x and ιt−x. Thus, ιt−x instead of κt is updated in the core of the iterative cycle (subject to the adjustment ιt1−xk = 0), using the replacement updating relationships of Table 6.1. Fitting is also possible using Method B, by replacing κt with ιt−x and modifying the design matrices accordingly. A possible strategy for generating starting values is to set β(0)x = 1 and additionally fit the main-effects structure αx + ιt−x in accordance with the distributional assumptions under Method A. There are ν = (k − 1)(n − 3) degrees of freedom in this model.


6.2.3 Mortality rate projections

Projected mortality rates

mx(tn + s) = mx(tn) · RF(x, tn + s),  s > 0   (6.24)

are computed by alignment with the latest available mortality rates mx(tn). Here, the projected reduction factor

RF(x, tn + s) = exp{ β(0)x (ιtn−x+s − ιtn−x) + β(1)x (κtn+s − κtn) },  s > 0   (6.25)

for which

lim_{s→0} RF(x, tn + s) = 1,

is based on the parameter estimates β(i)x, ιz, κt and the time series forecasts

{ ιz : z ∈ [t1 − xk, tn − x1] } ↦ { ιtn−x1+s : s > 0 },
{ κt : t ∈ [t1, tn] } ↦ { κtn+s : s > 0 }   (6.26)

where the cohort parameter ιtn−x+s is taken as the fitted estimate for 0 < s ≤ x − x1 (cohorts already observed in the data) and as the time series forecast for s > x − x1.

As we have seen in Section 5.7, the time series forecasts are typically generated using univariate ARIMA processes. The random walk with drift (or ARIMA(0,1,0) process) features prominently in many of the published applications of the LC model. If no provision for alignment with the latest available mortality rates is made (as in equation (6.24)), the extrapolated mortality rates decompose multiplicatively as

mx(tn + s) = exp(αx + β(0)x ιtn−x + β(1)x κtn) · RF(x, tn + s),  s > 0   (6.27)

which has the same functional form as (6.8), and can be directly compared with (6.24). This was the approach originally proposed in Lee and Carter (1992).
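As a concrete special case, the following Python sketch implements the alignment projection (6.24) for the plain LC reduction factor (cohort term suppressed, β(0)x = 0), with κt extrapolated by a random walk with drift whose drift is estimated by (κtn − κt1)/(n − 1). Function and variable names are ours, and the drift estimator is the usual simple one, not the authors' fitted ARIMA model.

```python
import numpy as np

def project_rates_lc(m_last, beta, kappa, horizon):
    """Aligned projection (6.24): m_x(t_n + s) = m_x(t_n) * RF(x, t_n + s),
    with RF(x, t_n + s) = exp(beta_x * (kappa_{t_n+s} - kappa_{t_n})) and
    kappa forecast by a random walk with drift (ARIMA(0,1,0))."""
    drift = (kappa[-1] - kappa[0]) / (len(kappa) - 1)
    dk = drift * np.arange(1, horizon + 1)   # kappa_{t_n+s} - kappa_{t_n}
    rf = np.exp(np.outer(beta, dk))          # reduction factors, s = 1..horizon
    return m_last[:, None] * rf              # (ages x horizon) projected rates
```

By construction the projection starts exactly from the latest observed rates, which is the point of the alignment in (6.24).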

6.2.4 Discussion

By specifying the second-moment distributional properties when defining the model error structure, the choice of distribution is not restricted to the Poisson and Gaussian distributions, and may indeed be expanded by selecting different variance functions (within the exponential family of distributions). Empirical evidence suggests that, for all practical purposes, maximum likelihood estimates obtained for the LC model using the iterative fitting processes under the Gaussian error structure given by (6.12) are the same as those obtained under fitting by SVD. Unlike modelling by SVD, however, the choice of weights (6.16) means that estimation can proceed, in the presence of empty data cells, under the Gaussian, Poisson, and any of the other viable error settings. Wilmoth (1993) uses weights wxt = dxt in combination with the Gaussian error setting. Empirical studies reveal that this has the effect of bringing the parameter estimates into close agreement with the Poisson-response-based estimates. When comparing a range of results obtained under both modelling approaches (with identical model structures), we have found that the same number of iterations is required to induce convergence. However, convergence is slow when fitting the APC model.

As discussed in Section 5.6, diagnostic checks on the fitted model are very important. For consistency with the model specification, we consider plots of the standardized deviance residuals

rxt = sign(yxt − ŷxt) √(dev(x, t)/φ)   (6.28)

where

φ = D(yxt, ŷxt)/ν   (6.29)

The sole use of the proportion of the total temporal variance, as measured by the ratio of the first singular value to the sum of singular values under SVD, is not a satisfactory diagnostic indicator. However, this index is widely quoted in the demographic literature: see, for example, Tuljapurkar et al. (2000).
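For the Gaussian setting, where dev(x, t) = wxt(yxt − ŷxt)², the residuals (6.28)–(6.29) reduce to the following sketch (a helper of our own); by construction the sum of squared residuals then equals ν, which provides a quick sanity check.

```python
import numpy as np

def deviance_residuals_gaussian(y, yhat, w, dof):
    """Standardized deviance residuals (6.28), with phi estimated by (6.29),
    for the Gaussian case dev(x, t) = w*(y - yhat)^2."""
    dev = w * (y - yhat) ** 2
    phi = dev.sum() / dof                    # scale estimate (6.29)
    return np.sign(y - yhat) * np.sqrt(dev / phi)
```

Plotting these residuals against calendar year, age, and year of birth is exactly the diagnostic used in Fig. 6.1.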

The parameters αx are estimated simultaneously with the parameters of the reduction factor RF in both the LC and AC models. A two-stage estimation process is necessary, however, in which αx is estimated separately to condition on the estimation of RF, when fitting the APC model (and its substructures). This two-stage approach can also be applied when fitting the LC and AC models. In the case of the former, empirical studies show that this has little practical material effect, due to the robust nature of the αx estimate (6.18).

6.3 Application to United Kingdom mortality data

To explore the potential of the APC model, we present results for the United Kingdom 1961–2000 mortality experiences for each gender, with cross-classification by individual year of age from 0 to 99. This data set has been provided by the Government Actuary’s Department – the availability of data cross-classified by single year of age and by individual calendar year facilitates the construction of cohort-based mortality measures. We make a direct comparison with the standard age–period LC and the AC models. In this application, all of the models are fitted under the Poisson error setting, represented by equations (6.9) and (6.10).

The implications of the choice of model structure are immediately apparent from the respective residual plots, illustrated for the UK female experience (Fig. 6.1). Here the distinctive ripple effects in the year-of-birth residual plots under LC modelling (Fig. 6.1(a), RH frame) signify a failure of the model to capture cohort effects. This is then transferred to the calendar-year residual plots under AC modelling (Fig. 6.1(b), LH frame), signifying a reciprocal failure to capture period effects. However, these distinctive ripple effects are largely removed under APC modelling (Fig. 6.1(c)) – this feature indicates that the model captures all three main effects relatively successfully and represents a significant improvement over the fitted LC model.

[Figure 6.1. Female mortality experience: residual plots for (a) LC model; (b) AC model; and (c) APC model.]

Similar patterns are observed in the residual plots for the UK male experience (not reproduced here, but the details are available from the authors).

Turning first to the parameter estimates for the APC modelling approach (Fig. 6.2), we believe that it is helpful and informative to compare matching frames between the sexes. Thus, the main age-effect plots (αx vs x) display the familiar characteristics, including the ‘accident’ humps, of static cross-sectional life tables (on the log scale), with a more pronounced accident hump and heavier mortality for males than for females. We recall that these effects are estimated separately, by averaging crude mortality rates over t for each x, to condition for both period and cohort effects.

The main period-effects plot (κt vs t) is linear for females but exhibits mild curvature for males, which can be characterized as piece-wise linear with a knot or hinge positioned in the first half of the 1970s. This effect is also present in the separate LC analysis of mortality data of the G7 countries (Tuljapurkar et al., 2000) and has been discussed further for the United Kingdom by Renshaw and Haberman (2003a).

[Figure 6.2. Parameter estimates and forecasts for the APC model: (a) females; (b) males.]

The forecasts for κt are based on the auto-regressive time series model

yt = d + φ1 yt−1 + εt,  where yt = κt − κt−1   (6.30)

which is the equivalent of ARIMA(1, 1, 0) modelling. There are noteworthy differences in the β(1)x patterns, which control the rate of decline by period of the age-specific rates of mortality in the projections. In particular, the trough in the male β(1)x pattern in the 20–40 age range is consistent with similar findings from trends in the male England and Wales mortality rates (Renshaw and Haberman, 2003a).
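A hedged sketch of the forecast step implied by (6.30): here d and φ1 are estimated by ordinary least squares on the first differences of κt (a simplification of full ARIMA maximum likelihood), and the recursion is then iterated forward. Names are ours.

```python
import numpy as np

def forecast_kappa_ar1(kappa, horizon):
    """Forecast kappa_t via (6.30): first differences y_t = kappa_t - kappa_{t-1}
    follow y_t = d + phi1*y_{t-1} + eps_t, i.e. ARIMA(1,1,0).  d and phi1 are
    estimated by OLS conditional on the first observed difference."""
    y = np.diff(kappa)
    Y, X = y[1:], np.column_stack([np.ones(len(y) - 1), y[:-1]])
    (d, phi1), *_ = np.linalg.lstsq(X, Y, rcond=None)
    out, y_prev, level = [], y[-1], kappa[-1]
    for _ in range(horizon):
        y_prev = d + phi1 * y_prev   # one-step-ahead difference forecast
        level += y_prev              # accumulate back to the kappa level
        out.append(level)
    return np.array(out)
```

Setting φ1 = 0 recovers the random-walk-with-drift forecast used for the female ι series below.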

The plots of the main cohort effects (ιz vs z = t − x) are particularly revealing. Thus, noteworthy discontinuities occur corresponding to the ending of World Wars I and II. While it is possible to identify the first of these with the 1919 influenza epidemic, we are not aware of the likely cause of the second discontinuity. (The 1886–1887 discontinuity can be traced to a set of outliers, and is possibly due to mis-stated exposures for this particular cohort.) The pronounced decline in the ιz profile in the inter-war years is consistent with the reported rapid mortality improvements experienced by generations born between 1925 and 1945 (for both sexes) and discussed at the start of this chapter. The apparently stable linear trends in the ιz profiles, present since the late 1940s, form the basis of the depicted time series forecasts, generated using an ARIMA(0, 1, 0) process for females and an ARIMA(1, 1, 0) process for males. The β(0)x patterns, which control the age-specific cohort contributions to the mortality projections, are similar, for both sexes, for ages up to 65.

We illustrate the implications of these projections in Fig. 6.3 by plotting current log crude mortality rates (for the calendar year 2000) against age for each gender and comparing these with projections for the calendar year 2020 under three different models: the LC (or standard Lee–Carter) model, the AC model, and the APC model. In Fig. 6.3(a), we show the LC–AC comparison and in Fig. 6.3(b) the LC–APC comparison. We note the marked mortality reductions projected for 2020 under the AC and APC models at ages which correspond to the cohorts identified at the start of this chapter (based on descriptive analyses): those born between 1925 and 1945, and hence aged 75–95 in 2020.

[Figure 6.3. Current (2000) and projected (2020) ln µx(t) age profiles: (a) LC and AC models; (b) LC and APC models.]

In order to illustrate the impact of such diverse projections under age–period (LC) and age–period–cohort (APC) modelling, we have calculated complete life expectancies e65(t) at age 65 (Fig. 6.4) and immediate life annuity values a65(t) at age 65, assuming a 5% per annum fixed interest rate (Fig. 6.5), for a range of years t, using both the cohort and the period method of computing. (We note that the annuity values represent the expected present value of an income of one paid annually in arrears while the individual initially aged 65 remains alive.) For the cohort method of computing, we use the following formulae, which are analogous to (5.57):

ex(t) = [ ∑_{h≥0} lx+h(t + h) {1 − ½ qx+h(t + h)} ] / lx(t),

ax(t) = [ ∑_{h≥1} lx+h(t + h) vʰ ] / lx(t)   (6.31)

where

qx(t) ≈ 1 − exp(−µx(t)),  lx+1(t + 1) = {1 − qx(t)} lx(t)   (6.32)
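A minimal Python rendering of (6.31)–(6.32), taking as input the projected forces of mortality along one cohort diagonal (truncation at the end of the array stands in for a limiting age); the function name and interface are ours, not the authors'.

```python
import numpy as np

def cohort_e_and_a(mu, i=0.05):
    """Cohort life expectancy e_x(t) and annuity value a_x(t) via (6.31)-(6.32).
    mu[h] = mu_{x+h}(t + h) along the cohort diagonal, h = 0, 1, ...;
    the survivorship function is truncated at the end of the array."""
    mu = np.asarray(mu, float)
    q = 1.0 - np.exp(-mu)                              # (6.32)
    l = np.concatenate(([1.0], np.cumprod(1.0 - q)))   # l_{x+h}(t+h), l_x(t) = 1
    v = 1.0 / (1.0 + i)                                # annual discount factor
    e = float((l[:-1] * (1.0 - 0.5 * q)).sum())        # (6.31), first formula
    a = float((l[1:] * v ** np.arange(1, len(l))).sum())  # (6.31), second formula
    return e, a
```

Feeding in mortality diagonals projected under different models (LC, AC, APC) then reproduces the kind of comparison shown in Figs. 6.4 and 6.5.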

with annual discount factor v, where v = 1/(1 + i) is calculated usinga constant interest rate. Thus, in the cohort versions, we allow fully forthe dynamic aspect of the mortality rates, with the summations proceeding(diagonally) along a cohort. We illustrate values up to the year 2005 calcu-lated by the cohort method and this requires extrapolation up to the year

6.3 Application to United Kingdom mortality data 259

26by period: LCby period: APCby cohort: LCby cohort: APC

by period: LCby period: APCby cohort: LCby cohort: APC

UK female population study

UK male population study

24

22

20

18

e(65

,t)e(

65,t)

16

14

12

26

24

22

20

18

16

14

12

1960 1965 1970 1975 1980 1985 1990 1995Period t

2000 2005 2010 2015 2020

1960 1965 1970 1975 1980 1985 1990 1995Period t

2000 2005 2010 2015 2020

Figure 6.4. Projected life expectancies at age 65, computed by period and by cohort methods forage-period (LC) and age-period-cohort (APC) models.

2040. In contrast, under the period method of calculation, the mortalityrates are treated as a sequence of (annual) static life tables, and computingproceeds by suppressing the variation in t in expressions (6.31) and (6.32),with (marginal) summation over age (≥x) for each fixed t, as for examplein (3.18). We illustrate values up to the year 2020 using this method basedon the empirical mortality rates µx(t) = mx(t) in the period up to 2000 andrequiring extrapolation for subsequent years up to 2020. The periodmethodof computation fails to capture the full dynamic impact of the evolvingmortality rates under the modelling assumptions and generates less uncer-tainty than the cohort method of calculation. The latter necessarily requires

260 6 : Forecasting mortality

14by period: LCby period: APCby cohort: LCby cohort: APC

UK female population study

13

12

11

10a(65

,t)

9

8

1960 1965 1970 1975 1980 1985 1990Period t

1995 2000 2005 2010 2015 20207

14by period: LCby period: APCby cohort: LCby cohort: APC

UK male population study

13

12

11

10a(65

,t)

9

8

1960 1965 1970 1975 1980 1985 1990Period t

1995 2000 2005 2010 2015 20207

Figure 6.5. Projected life annuity values at age 65 (calculated using a 5% per annum fixed interestrate), computed by period and by cohort under age-period (LC) and age-period-cohort (APC)models.

more lengthy extrapolations and this contributes a source of increasinguncertainty. One means of quantifying this uncertainty is through the adop-tion of boot-strapping simulation methods, as described in Section 5.8, inthe context of LC modelling. This and other methods are currently underinvestigation for the case of the APC model. The reserves that insurancecompanies selling life annuities and pension funds would have to hold inorder to meet their future contractual liabilities are directly related to termslike a65(t); see Booth et al. (2005). The financial implications of the upwardtrends in cohort-based life annuity values (which are the most relevant forpricing and reserving calculations) in Fig. 6.5 are clear and significant and

6.3 Application to United Kingdom mortality data 261

[Figure 6.6 shows two frames of life expectancy predictions with intervals: e(65, t) computed by period for t = 2004, 2008, 2012, 2016, 2020 (upper frame), and e(x, 2000) computed by cohort for x = 65, 70, 75, 80, 85 (lower frame), under the age-period (LC) and age-period-cohort (APC) models.]

Figure 6.6. E+W male mortality: comparison of life expectancy predictions using (i) age-period-cohort and (ii) age-period Poisson structures. Predictions with intervals by bootstrapping the time series prediction error in the period (and cohort) components, and selecting the resulting 2.5, 50, 97.5 percentiles.

indicate the burden that increasing longevity may place on such financial institutions.

As we have discussed in Section 5.8, it is important to be able to qualify any projections of key mortality indices with measures of the error or uncertainty present. Because of the complexities of the structure of the APC version of the LC model, the indices of interest are non-linear functions of the parameters αx, βx, κt, ι_{t−x}, and hence analytical derivations of prediction intervals are not possible. It is therefore necessary to employ bootstrapping techniques.

In Figs. 6.6 and 6.7, we use the LC and APC models fitted to England and Wales male mortality rates over the period 1961–2000 in order to compare estimates of life expectancy and 95% prediction intervals. Specifically, we show in Fig. 6.6(a) computations of the period life expectancy at age 65 for various future periods (equivalent to the median of the simulated distributions) and the corresponding 2.5 and 97.5 percentiles from the simulated


[Figure 6.7 shows two frames of 4% fixed rate annuity predictions with intervals: a(65, t) computed by period for t = 2004, 2008, 2012, 2016, 2020 (upper frame), and a(x, 2000) computed by cohort for x = 65, 70, 75, 80, 85 (lower frame), under the age-period (LC) and age-period-cohort (M) models.]

Figure 6.7. E+W male mortality: comparison of 4% fixed rate annuity predictions using (i) age-period-cohort and (ii) age-period Poisson structures. Predictions with intervals by bootstrapping the time series prediction error in the period (and cohort) components, and selecting the resulting 2.5, 50, 97.5 percentiles.

distributions. In this case, the simulated distributions have been calculated by bootstrapping only the time series prediction error, as experiments reveal that this is the most important component of the uncertainty in the model (as originally suggested by Lee and Carter, 1992). In this way, we avoid using the detailed bootstrapping strategies discussed in Section 5.8, which can be rather slow to converge.
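As a computational illustration of this idea, the following Python sketch bootstraps only the time series prediction error of a period index modelled as a random walk with drift, and reads off the 2.5, 50, and 97.5 percentiles at a chosen horizon. The function name, the artificial κ-series, and all parameter values are ours, purely for illustration; this is not the fitted E+W model.

```python
import random
import statistics

def simulate_kappa_paths(kappa_hist, horizon, n_sims=1000, seed=42):
    """Simulate future values of a period index kappa_t, assuming a random
    walk with drift fitted to the historical first differences, and return
    the 2.5th, 50th and 97.5th percentiles at the given horizon."""
    diffs = [b - a for a, b in zip(kappa_hist, kappa_hist[1:])]
    drift = statistics.mean(diffs)
    sigma = statistics.stdev(diffs)
    rng = random.Random(seed)
    finals = []
    for _ in range(n_sims):
        k = kappa_hist[-1]
        for _ in range(horizon):
            k += drift + rng.gauss(0.0, sigma)  # one simulated annual step
        finals.append(k)
    finals.sort()

    def pct(p):  # empirical percentile of the simulated distribution
        return finals[int(p * (n_sims - 1))]

    return pct(0.025), pct(0.5), pct(0.975)

# an artificial, roughly linearly declining period index
hist = [10.0 - 0.3 * t + 0.05 * (-1) ** t for t in range(40)]
lo, med, hi = simulate_kappa_paths(hist, horizon=20)
```

As in the text, the interval (lo, hi) widens with the horizon, since the simulated variance grows linearly with the number of projected steps.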

The results in Fig. 6.6 (upper frame) show that the central estimates for life expectancy are higher when using the APC model as against using the LC model (as illustrated in Fig. 6.4), and that the prediction intervals are wider for the APC model for each year. As we move forward in time from 2004 to 2020, we note that both pairs of prediction intervals become wider, indicating a greater level of uncertainty present in the estimates for future years. Thus, the calculations for 2004 involve 4 years of forward projections whereas the calculations for 2020 involve 20 years of projections.


The results in Fig. 6.6 (lower frame) show the corresponding figures for cohort life expectancy for five cohorts of males aged 65, 70, 75, 80, and 85 in 2000. The younger cohorts have estimates of cohort life expectancy that are higher under the APC model than under the LC model (as in Fig. 6.4). The prediction intervals under the APC model are much wider for the younger cohorts. As we consider the older cohorts, we note that the central estimates and the prediction intervals become more similar under the two models, indicating the particular incidence of the cohort effect which affects those aged 55–75 in 2000. Expectedly, under both models, the prediction intervals are wider for the cohorts aged 65 and 70 in 2000 than for the older cohorts, and the width decreases in stages as age in 2000 increases. This reflects the underlying level of projection involved in the calculations – if we regard age 110 as approximately the terminal age in the underlying survival model, then the cohort estimates at age 65 would involve 45 years of projected quantities while the cohort estimates at age 85 would involve only 25 years of projections.

Figure 6.7 reproduces the calculations of Fig. 6.6 but for immediate life annuities calculated using a constant interest rate of 4% per annum. We can regard Fig. 6.7 as extending the results of Fig. 6.5 by including prediction intervals and a more detailed comparison. Figure 6.7 shows the same principal features as Fig. 6.6.

6.4 Cairns–Blake–Dowd mortality projection model: allowing for cohort effects

In Section 5.3, we introduced the Cairns–Blake–Dowd mortality projection model, which is motivated by the empirical observation that logit qx(t) is a reasonably linear function of x for fixed t. The model introduced by Cairns et al. (2007) is the following:

\[ \ln \frac{q_x(t)}{p_x(t)} = \kappa_t^{(1)} + \kappa_t^{(2)}\, x \tag{6.33} \]

which can be regarded as a specific example of a more general class of models

\[ \ln \frac{q_x(t)}{p_x(t)} = \beta_x^{(1)} \kappa_t^{(1)} + \beta_x^{(2)} \kappa_t^{(2)} \tag{6.34} \]
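The predictor (6.33) is straightforward to evaluate: inverting the logit gives q_x(t) as a logistic function of κ_t^(1) + κ_t^(2) x. A minimal Python sketch, with illustrative (not fitted) parameter values:

```python
import math

def cbd_q(kappa1, kappa2, x):
    """Death probability q_x(t) from the Cairns-Blake-Dowd predictor (6.33):
    logit q_x(t) = kappa1_t + kappa2_t * x, inverted via the logistic function."""
    eta = kappa1 + kappa2 * x
    return 1.0 / (1.0 + math.exp(-eta))

# illustrative (not fitted) parameter values for one calendar year
k1, k2 = -10.5, 0.11
qs = {x: cbd_q(k1, k2, x) for x in range(60, 91, 10)}
```

With κ_t^(2) positive, the implied q_x(t) increases in age x, consistent with the observed roughly linear logit profile at adult ages.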

Responding to the need to consider the cohort effect observed in the historic mortality data for a number of countries, Cairns et al. (2007) introduce an AC term into the predictor as follows, in an analogous manner to the

264 6 : Forecasting mortality

Renshaw and Haberman (2006) enhancement of the original Lee-Cartermodel. Thus, Cairns et al. (2007) propose the following family of models:

\[ \ln \frac{q_x(t)}{p_x(t)} = \beta_x^{(1)} \kappa_t^{(1)} + \beta_x^{(2)} \kappa_t^{(2)} + \beta_x^{(3)} \iota_{t-x} \tag{6.35} \]

where the ι_{t−x} term represents a cohort effect as in (6.6). Having considered the goodness-of-fit of this family of models to historic data from England and Wales and the USA, Cairns et al. (2007) investigate two specific versions in some detail.

The special cases are

\[ \text{I.}\quad \beta_x^{(1)} = 1,\quad \beta_x^{(2)} = x - \bar{x},\quad \beta_x^{(3)} = 1 \tag{6.36} \]

\[ \text{II.}\quad \beta_x^{(1)} = 1,\quad \beta_x^{(2)} = x - \bar{x},\quad \beta_x^{(3)} = x_c - x \tag{6.37} \]

where x̄ is the average age in the data set and x_c is a constant parameter that needs to be estimated. As with the APC version of the Lee–Carter model in Section 6.2, we need to introduce some identifiability constraints to ensure that the parameters can be uniquely estimated. Version II is motivated by the observation that, in the applications of the APC model of Section 6.2, the coefficient of the cohort term ι_{t−x} is often found to be a decreasing function of age: (6.37) incorporates the simplest such specification of β_x^{(3)}.

Cairns et al. (2007) fit the models by the method of maximum likelihood, assuming that D_{xt} has a Poisson distribution, as assumed earlier in Sections 5.2.2.3 and 6.2.2. For England and Wales data comprising the calendar years 1961–2004 inclusive and ages 60–89, they find that the best-fitting model is (6.37). For US data comprising the calendar years 1968–2003 inclusive and ages 60–89 (although only data for ages 85–89 are used for 1980–2003), they find that the best-fitting model is (6.36). When robustness to the choice of fitting period is considered, the best fits to the historic data from both countries are obtained for an augmented version of (6.35), viz.

\[ \ln \frac{q_x(t)}{p_x(t)} = \beta_x^{(1)} \kappa_t^{(1)} + \beta_x^{(2)} \kappa_t^{(2)} + \beta_x^{(3)} \kappa_t^{(3)} + \beta_x^{(4)} \iota_{t-x} \tag{6.38} \]

with the specific choices β_x^{(1)} = 1, β_x^{(2)} = x − x̄, β_x^{(3)} = (x − x̄)² − σ_x², and β_x^{(4)} = 1. Here σ_x² is the average value of (x − x̄)². This development of the model is inspired by the observation that there is some curvature in the age-profile of logit q_x(t) in the United States data.

As in Sections 5.3 and 6.2, we could use the Cairns–Blake–Dowd class of models for projection purposes. This would require models to be postulated and estimated for the dynamics of the period and cohort effects


terms in (6.35)–(6.38). An obvious approach would be to employ standard time series methods based on ARIMA models, as discussed earlier. A complication with the models represented by equations (6.35)–(6.38) is that they involve three or four stochastic (time series) terms. We could follow Section 6.2 and postulate that the κ_t^{(i)} (for different values of i) and ι_{t−x} are independent. Still, the presence of two or three period terms would mean that we would need to consider multivariate time-series modelling to estimate the underlying dependency structure. This would be likely to involve vector autoregressive models and co-integration techniques, as discussed by Hamilton (1994).
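The kind of multivariate simulation involved can be sketched as follows: two dependent period indices are generated as correlated random walks with drift, with the dependence induced through a Cholesky factor of the 2×2 innovation correlation. All parameter values are hypothetical, and this is of course a simplification of what a full vector-autoregressive analysis would deliver.

```python
import math
import random

def simulate_bivariate_rw(k0, drift, vols, corr, horizon, seed=1):
    """Simulate two dependent period indices (e.g. kappa^(1), kappa^(2)) as
    correlated random walks with drift; the correlation between the annual
    innovations is induced via a 2x2 Cholesky factor."""
    rng = random.Random(seed)
    a = math.sqrt(1.0 - corr ** 2)  # Cholesky: z2 = corr*e1 + a*e2
    path = [tuple(k0)]
    k1, k2 = k0
    for _ in range(horizon):
        e1, e2 = rng.gauss(0, 1), rng.gauss(0, 1)
        z1 = e1
        z2 = corr * e1 + a * e2
        k1 += drift[0] + vols[0] * z1
        k2 += drift[1] + vols[1] * z2
        path.append((k1, k2))
    return path

# hypothetical parameters, purely for illustration
path = simulate_bivariate_rw(k0=(-10.5, 0.11), drift=(-0.05, 0.0002),
                             vols=(0.08, 0.0008), corr=-0.5, horizon=20)
```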

6.5 P-splines model: allowing for cohort effects

As noted in Section 5.4.2, Currie et al. (2004) have introduced a two-dimensional graduation methodology based on B-splines, which is fitted to observational data using a regression framework. The two-dimensional version of univariate B-splines is obtained by multiplying the respective elements of the univariate B-splines in the age and time dimensions. Thus, the model is

\[ \ln \mu_x(t) = \sum_{i,j} \theta_{ij} B_{ij}(x, t) \tag{6.39} \]

where B_{ij}(x, t) = B_i(x) · B_j(t), the θ_{ij} are parameters to be estimated from the data, and B_i and B_j are the respective univariate splines. In reality, B-splines can provide a very good fit to the data if we employ a large number of knots in the year and age dimensions. But this excellent level of goodness of fit is achieved by sacrificing smoothness in the resulting fit. The method of P-splines (or penalized splines) has been suggested by Eilers and Marx (1996) to overcome this problem: in this case, the log-likelihood is adjusted by a penalty function, with appropriate weights.

Schematically, the penalized log-likelihood would have the following form for an LC model:

\[ PL(\theta) = L(\theta) - \lambda_x P_x(\theta) - \lambda_t P_t(\theta) \tag{6.40} \]

where λ_x and λ_t are weighting parameters, P_x(θ) is a penalty function in the age dimension, and P_t(θ) is a penalty function in the calendar time dimension. An alternative formulation would involve an AC model:

\[ PL(\theta) = L(\theta) - \lambda_x P_x(\theta) - \lambda_z P_z(\theta) \tag{6.41} \]

where, as in Section 6.2, we use z = t − x to index cohorts. The λ's are estimated from the data. As noted in Section 5.4.2, typical choices for quadratic

266 6 : Forecasting mortality

penalty functions would be

\[ P_x(\theta) = \sum_{i,j} (\theta_{i,j} - 2\theta_{i-1,j} + \theta_{i-2,j})^2 \tag{6.42} \]

\[ P_t(\theta) = \sum_{i,j} (\theta_{i,j} - 2\theta_{i,j-1} + \theta_{i,j-2})^2 \tag{6.43} \]

\[ P_z(\theta) = \sum_{i,j} (\theta_{i+1,j-1} - 2\theta_{i,j} + \theta_{i-1,j+1})^2 \tag{6.44} \]

Thus, the B-splines are used as the basis for the underlying regression, and the log-likelihood is modified by penalty functions like the above, which depend on the smoothness of the θ_{ij} parameters.
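The penalties (6.42)–(6.44) are straightforward to evaluate for a given coefficient grid θ_{ij}; the following Python helper is our own illustration, not part of the original methodology. A surface that is linear in both indices has all second differences equal to zero, so all three penalties vanish.

```python
def penalties(theta):
    """Quadratic roughness penalties on a coefficient grid theta[i][j]:
    second differences along age (6.42), along period (6.43), and along
    cohort diagonals (6.44)."""
    n, m = len(theta), len(theta[0])
    Px = sum((theta[i][j] - 2 * theta[i - 1][j] + theta[i - 2][j]) ** 2
             for i in range(2, n) for j in range(m))
    Pt = sum((theta[i][j] - 2 * theta[i][j - 1] + theta[i][j - 2]) ** 2
             for i in range(n) for j in range(2, m))
    Pz = sum((theta[i + 1][j - 1] - 2 * theta[i][j] + theta[i - 1][j + 1]) ** 2
             for i in range(1, n - 1) for j in range(1, m - 1))
    return Px, Pt, Pz

# a coefficient surface linear in i and j has zero second differences
flat = [[0.5 * i + 0.2 * j for j in range(6)] for i in range(5)]
```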

The idea of using P-spline regression not just for graduating mortality data but also for mortality projections was first suggested by CMIB (2005). In this application, that is, projecting mortality rates, the choice of the P(θ) function plays a critical role in extending the mortality surface beyond the range of the data, so that projections are a direct consequence of the smoothing process. Thus, a quadratic penalty function effectively leads to linear extrapolation – in the age and time dimensions for (6.40) combined with the choices (6.42) and (6.43), or in the age and year of birth dimensions for (6.41) combined with the choices (6.42) and (6.44). Different choices for P(θ) would be possible; these may have little impact on the quality of fit to the historic data and hence would be difficult to infer from the data. However, the impact on the projected mortality surface is considerable. The choice of P(θ) corresponds to a decision on the projected trend. We have seen the implications of a quadratic penalty. Similarly, a linear penalty function would lead to constant log mortality rates being projected in the appropriate dimensions, and a cubic penalty function would lead to quadratic log mortality rates being projected in the appropriate dimensions.
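The link between a quadratic penalty and linear extrapolation can be seen in one dimension: with the data-region coefficients fixed, minimizing the second-difference penalty over the coefficients beyond the data is achieved by setting every new second difference to zero, so each new coefficient lies on the straight line through the previous two. A small sketch (our own illustration):

```python
def extrapolate_linear_by_penalty(theta, steps):
    """Extend a coefficient sequence by minimising the quadratic
    second-difference penalty: each new coefficient is the unique value that
    zeroes the newest second difference, i.e. theta_new = 2*t[-1] - t[-2],
    which is exactly linear extrapolation."""
    out = list(theta)
    for _ in range(steps):
        out.append(2 * out[-1] - out[-2])
    return out
```

A linear penalty (first differences) would instead zero each new first difference, holding the last coefficient constant, matching the constant-rate projection described above.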

Detailed applications of the P-spline methodology indicate that it is better suited to graduation and smoothing of historic observational data than to projection: see, for example, Cairns et al. (2007) and Richards et al. (2007). Further, we should note that P-spline models can be used to generate percentiles for the measurement of uncertainty but, unlike the LC and Cairns–Blake–Dowd families of models, P-spline models are not able to generate sample paths. In many asset–liability modelling applications in insurance and pensions, the production of sample paths is an important feature, and could be useful elsewhere, such as in the pricing of longevity-linked financial instruments – see Chapter 7.

7 The longevity risk: actuarial perspectives

7.1 Introduction

In this chapter we deal with the mortality risks borne by an annuity provider, and in particular with the longevity risk originating from the uncertain evolution of mortality at adult and old ages.

The assessment of longevity risk requires a stochastic representation of mortality. Possible approaches are described in Section 7.2, which is also devoted to an analysis of the impact of longevity risk on the risk profile of the provider. In Sections 7.3 and 7.4 we take a risk management perspective, and we investigate possible solutions for risk mitigation. In particular, risk transfers as well as capital requirements for the risk retained are discussed. Policy design and the pricing of life annuities allowing for longevity risk are dealt with in Sections 7.5 and 7.6; such aspects, owing to commercial pressure and modelling difficulties, are rather controversial. We do not develop an in-depth analysis, but we instead remark on the main issues. To reach a proper arrangement of the policy conditions of a life annuity, the possible behaviour of the annuitant in respect of the planning of her/his retirement income has to be considered. In Section 7.7 we describe possible choices available to the annuitant in this respect.

The topics dealt with in this chapter are rather new and not well established either in practice or in the literature. So the chapter is based on recent research. To give a comprehensive view of the available literature, most contributions are cited in Section 7.8, which is devoted to comments on further readings; for some specific issues, however, references are also quoted in the previous sections.

In this chapter, we usually refer to annuitants and insurers. Such terms are anyhow used in a generic sense. The discussion could also be applied to pensioners, with a proper adjustment of the parameters of the relevant


mortality models, and to annuity providers other than an insurer. Just for brevity, only annuitants and insurers are mentioned.

7.2 The longevity risk

7.2.1 Mortality risks

Mortality risk may emerge in different ways. Three cases can in particular be envisaged.

(a) One individual may live longer or less than the average lifetime in the population to which she/he belongs. In terms of the frequency of deaths in the population, this may result in observed mortality rates higher than expected in some years, lower than expected in others, with no apparent trend in such deviations.

(b) The average lifetime of a population may be different from what is expected. In terms of the frequency of deaths, it turns out that mortality rates observed in time in the population are systematically above or below those coming from the relevant mortality table.

(c) Mortality rates in a population may experience sudden jumps, due to critical living conditions, such as influenza epidemics, severe climatic conditions (e.g. hot summers), natural disasters and so on.

In all three cases, deviations in mortality rates with respect to what is expected are experienced; an illustration is sketched in Fig. 7.1 where, with reference to a given cohort, in each panel dots represent mortality rates observed over time, whereas the solid line plots their forecasted level.

Case (a) is the well-known situation of possible deviations around expected mortality rates; the mortality risk here comes out as a risk of

[Figure 7.1 shows three panels, cases (a), (b), and (c): mortality rates plotted against time.]

Figure 7.1. Experienced (dots) vs expected (solid line) mortality rates for a given cohort.


random fluctuations, which is traditional in the insurance business, in both the life and the non-life area (actually, it is the basic grounds of the insurance business). It is often named process risk or also insurance risk. It concerns the individual position, and as such its severity reduces as the single position becomes negligible in respect of the overall portfolio. The process risk can be hedged through the realization of a proper pooling effect, since it reduces as soon as the portfolio is made of similar policies and its size is large enough, as well as through traditional risk transfer arrangements.
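The pooling effect can be quantified for a homogeneous group: if the number of deaths among n identical lives is binomial with death probability q, its coefficient of variation is √((1−q)/(nq)), which shrinks like 1/√n as the portfolio grows. A short Python sketch with illustrative numbers:

```python
import math

def death_count_cv(n, q):
    """Coefficient of variation of the binomial number of deaths among
    n identical lives, each with one-year death probability q:
    sd/mean = sqrt(n*q*(1-q)) / (n*q) = sqrt((1 - q) / (n * q))."""
    return math.sqrt((1.0 - q) / (n * q))

# relative variability of deaths for portfolios of increasing size
cvs = {n: death_count_cv(n, q=0.02) for n in (100, 10_000, 1_000_000)}
```

Multiplying the portfolio size by 100 divides the relative fluctuation by 10, which is exactly the sense in which process risk is diversifiable; no such reduction applies to the systematic risks of cases (b) and (c).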

Under case (b), deviations are from expected values, rather than around them; hence their systematic nature. This may be the result of either a misspecification of the relevant mortality model (e.g. because the time-pattern of actual mortality differs from that implied by the adopted mortality table) or a biased assessment of the relevant parameters (e.g. due to a lack of data). The former aspect is referred to as model risk, the latter as parameter risk. The term uncertainty risk is often used to refer to model and parameter risk jointly, meaning uncertainty in the representation of a phenomenon (e.g. future mortality). When adult-old ages are concerned, uncertainty risk may emerge in particular because of an unanticipated reduction in mortality rates (as is presented in the mid-panel of Fig. 7.1, where the mortality profile of the cohort is better captured by the dashed line than by the solid line). In this case, the term longevity risk is used instead of uncertainty risk. It must be stressed that longevity risk concerns aggregate mortality; so pooling arguments do not apply for its hedging.

In case (c), a catastrophe risk emerges, namely the risk of a sudden and short-term rise in the frequency of deaths. Similar to case (b), aggregate mortality is concerned; however, when compared with longevity risk, the time-span involved in the emergence of the risk needs to be stressed: short-term in case (c), long-term (possibly, permanent) in case (b). Clearly, a proper hedging of catastrophe risk is required when death benefits are dealt with (whilst when dealing with life annuities, profit arises because of the higher mortality experienced). The usual pooling arguments do not apply; however, diversification effects may be realized and risk transfers can be conceived as well. Some remarks in this regard are given in Sections 7.3.2 and 7.4.2.

Apart from some short remarks on the management of process and catastrophe risk, in the following we focus on longevity risk. Before moving to the relevant discussion, it is necessary to make a comment on terminology.

The vocabulary introduced above for mortality risks is commonly acknowledged in the literature. In some risk classification systems, however, the meaning of some terms may be different, and this may lead to possible misunderstandings. We mention in particular the evolving Solvency


2 system, where (see CEIOPS (2007)) both the mortality and the longevity risks are meant to result from uncertainty risk. Mortality risk addresses possible situations of extra-mortality; concern here is for a business with death benefits. On the contrary, longevity risk addresses the possible realization of extra-survivorship; clearly, in this case concern is for a business with living benefits, life annuities in particular. In the following, we disregard this meaning; reference is therefore to what we have described under items (a)–(c) above and the relevant remarks.

7.2.2 Representing longevity risk: stochastic modelling issues

Whenever we aim at representing a risk, a stochastic valuation is required. In general terms, a stochastic mortality model should allow for the several types of possible deviations in the frequency of death in respect of the forecasted mortality rate, namely:

(a) random fluctuations (to represent process risk);
(b) deviations due to the shape of the time-pattern implied by the mortality model, in respect of both age and calendar year (model risk);
(c) deviations due to the level of the parameters of the mortality model (parameter risk);
(d) shocks due to period effects (catastrophe risk).

As to the shape of the time-pattern of mortality rates in respect of calendar year, we recall that by longevity risk we mean the risk of an unanticipated decrease in mortality rates at adult ages (see Section 7.2.1); hence, some projection must be adopted. Except for the Lee–Carter model, projected mortality models do not allow explicitly for risk (see Chapter 4). So, given the purpose of this chapter, we need to attack mortality modelling from a new perspective.

Embedding four sources of randomness in the mortality model is a tricky business. So some modelling choices are required. In this section, we explore general aspects of stochastic modelling. Notation is stated in general terms. More specific examples are then presented in Section 7.2.3.

Let Y be the random number of deaths in a given cohort at a given age. We assume that Y depends on two input variables, say X1, X2; so Y = φ(X1, X2). The quantity X1 could be meant to represent the probability of death or the force of mortality in the cohort in a given year in the absence of extreme situations. Possible shocks are then represented by X2.


Various approaches can be conceived for investigating Y. A graphical illustration is provided by Fig. 7.2.

Approach 1 is purely deterministic. Assigning specific values x1, x2 to the two input variables, the corresponding outcome y of the result variable is simply calculated as y = φ(x1, x2). In our example, x1 is a normal (projected) probability of death or force of mortality, whilst x2 is a given shock (possibly set to zero). It is interesting to note that classical actuarial calculations follow this approach, replacing random variables with their expected or best-estimate value. In a more modern perspective, this approach is adopted for example when performing stress testing (assigning to some variables 'extreme' values), or scenario testing.

Randomness in input variables is, to some extent, acknowledged when approach 2 is adopted. Reasonable ranges for the outcomes of the input variables are chosen (e.g. the interval of possible values for a shock in a given year), and consequently a range (ymin, ymax) for Y is derived. As far as X1 is concerned, the range of possible values may represent randomness due to random fluctuations, as well as to the unknown trend of the cohort. Note, however, that the valuation is fundamentally deterministic; the main difference between approach 1 and approach 2 is the number of possible outcomes which is considered.

Approach 3 provides a basic example of stochastic modelling, typically adopted for assessing the impact of process risk. The probabilistic structure assigned to X1 is meant here to represent the intrinsic stochastic nature of mortality, that is, random fluctuations. Assuming a continuous probability distribution function, the probability density function fX1 can be obtained, for example, by first assigning the probability distribution function of the lifetime of each individual (based on some projected mortality model with given parameters), and then aggregating the relevant results. Note that, setting fixed parameters for the mortality model, a deterministic assumption for the trend is understood. The probability distribution of Y (and of X1 as well) can be found using only analytical tools just in very simple (or simplified) circumstances. Numerical methods or stochastic simulation procedures help in most cases.

Approach 4 addresses, albeit in a naive manner, the risk of systematic deviations. The three probability distributions assigned to X1 are intended to be based on alternative models for the lifetime of each individual. In practical terms, the same mortality projection model may be assumed, but with alternative sets of values chosen for the relevant parameters to represent alternative mortality trends. Approach 4 then simply consists of iterating the procedure implied by approach 3, each iteration corresponding to a specific assumption about the probability distribution of an input variable.


[Figure 7.2 sketches approaches 1–6, from deterministic inputs x1, x2, through the probability distributions fX1 and fX2, the conditional distributions fX1|Ah and fY|Ah under assumptions A1, A2, A3, to the resulting output distribution fY.]

Figure 7.2. Modelling approaches to stochastic valuations.


A set of conditional distributions of Y is determined. Note that in respect of systematic deviations a representation similar to approach 2 is gained; the difference concerns process risk, which is explicitly addressed under approach 4.

Under approach 5 a probability distribution is assigned over the set of trend assumptions. Hence, the unconditional distribution of the output variable Y can be calculated. Note that, this way, both process and uncertainty risk are allowed for. In the graphical representation of Fig. 7.2 a discrete setting is considered in respect of uncertainty risk; more complex models may attack the problem within a continuous framework.
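Approach 5 can be sketched numerically. With a discrete set of trend assumptions A_h and weights, the unconditional variance of the number of deaths splits, by the conditional variance formula Var(Y) = E[Var(Y|A)] + Var(E[Y|A]), into a pooled (process-risk) component and a systematic (uncertainty-risk) component. A Python illustration, taking Y binomial given the trend, with hypothetical probabilities and weights:

```python
def mixture_moments(n, q_by_trend, weights):
    """Unconditional mean and variance of the number of deaths Y among n
    lives when the one-year death probability q depends on which trend
    assumption A_h holds, with P(A_h) = weights[h] (approach 5).

    Var(Y) = E[Var(Y | A)] + Var(E[Y | A]): the first term is the pooled
    process-risk part, the second the systematic uncertainty-risk part."""
    mean = sum(w * n * q for q, w in zip(q_by_trend, weights))
    ev = sum(w * n * q * (1 - q) for q, w in zip(q_by_trend, weights))
    ve = sum(w * (n * q - mean) ** 2 for q, w in zip(q_by_trend, weights))
    return mean, ev + ve

# hypothetical trend-specific probabilities and weights
m, v = mixture_moments(n=10_000, q_by_trend=[0.015, 0.020, 0.025],
                       weights=[0.25, 0.50, 0.25])
```

The second variance term does not shrink relative to the first as n grows, which is the numerical counterpart of the statement that pooling does not hedge uncertainty risk.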

Finally, under approach 6 a probabilistic structure is assigned to all of the input variables. In this case, either the joint distribution may be directly addressed, or the marginal distributions of the input variables as well as the relevant correlation assumptions (as is depicted in Fig. 7.2). The problem can be handled just through stochastic simulation; difficulties arise with reference to the choice of the probability distribution of the uncertainty risk and of the catastrophe risk, as well as with regard to the dependencies among the various sources of randomness.

7.2.3 Representing longevity risk: some examples

We now specifically address longevity risk. Clearly, approach 5 (or 6) in Fig. 7.2 is required, but some insights into the problem may be gained also from approach 4.

Let Φ(x, t) denote a projected mortality quantity, where x is the age attained in calendar year t by the cohort born in year τ = t − x. The projected quantity may be the probability of death, qx(t), the mortality odds, qx(t)/px(t), the force of mortality, µx(t), and so on.

To develop approach 4, alternative hypotheses about future mortality evolution must be chosen. Such alternative assumptions may originate from different sets of the relevant parameters of the projection model; in this way, parameter risk is addressed. Otherwise, the alternative assumptions may be given by mortality projections obtained under different procedures; in this case, model risk would also be addressed. However, it is intrinsically difficult to perform an explanatory comparison of different models (e.g. it is not easy to state whether the different outcomes of two models are mainly due to the implied time-pattern or to the relevant parameters). For this reason, we focus in the following discussion on parameter risk. In any case, unless it is explicitly addressed (as was depicted in Fig. 7.2), the risk of catastrophe mortality is not considered.


Let A(τ) denote a given assumption about the mortality trend for people born in year τ, and A(τ) the set of such assumptions. The notation Φ(x, τ + x | A(τ)) refers to the projected mortality quantity Φ conditional on the specific assumption A(τ). The set of all mortality projections is denoted as the family {Φ(x, τ + x | A(τ)); A(τ) ∈ A(τ)}.

In principle, the set A(τ) can be either discrete or continuous. The former case is anyhow more practicable. Examples may be found in the projections developed by the CMIB, addressing the cohort effect and assuming three hypotheses about the persistence in the future of such an effect; see CMI (2002) and CMI (2006).

Let us then suppose that a discrete set has been designed for A(τ). A scenario testing, and possibly a stress testing, can be performed. In particular, the sensitivity of some quantities, such as reserves, profits, and so on, in respect of future mortality trends can be investigated. As mentioned in Section 7.2.2, process risk can be explicitly appraised through the probability distribution function of the lifetime of all the individuals in the cohort, conditional on a given trend assumption. However, the approach in respect of parameter risk is deterministic. Some examples are described in Section 7.2.4.

A step forward consists of assigning a (non-negative and normalized) weighting structure to A(τ) (see approach 5 in Fig. 7.2). In this way, unconditional valuations can be performed, thus accounting explicitly for parameter risk. Let

A(τ) = {A1(τ),A2(τ), . . . ,Am(τ)} (7.1)

be the set of alternative mortality assumptions; then, let ρ_h be the weight attached to assumption A_h(τ), such that 0 ≤ ρ_h ≤ 1 for h = 1, 2, . . . , m and \( \sum_{h=1}^{m} \rho_h = 1 \). The set

\[ \{\rho_h\}_{h=1,2,\ldots,m} \tag{7.2} \]

can be intended as a probability distribution on A(τ). Unfortunately, experience providing data for estimating such weights is rarely available, and so personal judgement is often required. See Section 7.2.4 for some examples.

We now address possible ways of attacking the problem within a continuous setting. To define A(τ) as a continuous set, a continuous probability distribution must be assigned to the parameters of the mortality model. Difficulties, here, concern the appropriate correlation assumptions among the parameters, and hence the complexity of the overall model is clearly greater than in the discrete case. Because there is likely a paucity of data allowing us to make a reliable estimate of the correlations, simplifying hypotheses


would have to be accepted. Hence, the setting would not necessarily be more powerful than the discrete one. For this reason, we do not provide examples in respect of a continuous approach.

Whatever setting is referred to, either discrete or continuous, the framework discussed above can, to some extent, be classified as a static one. Actually, the notation indicates that the set A(τ) is fixed. Uncertainty is expressed in terms of which of the assumptions A(τ) ∈ A(τ) is the better one for describing the aggregate mortality behaviour of the cohort, that is, the relevant prevailing trend. Irrespective of the setting, either discrete or continuous, no future shift from such a trend is allowed for in the probabilistic distribution (see also panels 5 and 6 in Fig. 7.2). A critical aspect can be found in the fact that assumptions about the temporal correlation of changes in the probabilities of death are implicitly involved; see, for example, Tuljapurkar and Boe (1998). Further, we note that mortality shocks are not embedded into the static representation, which is not a serious problem given that we are addressing the longevity risk. Finally, we mention that, while keeping the setting as a static one, possible updates to the weights (7.2) based on experience could be introduced; an example in this respect is described in Section 7.2.4.
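One natural way of updating the weights (7.2) in the light of emerging experience is Bayes' rule with a binomial likelihood for the observed deaths. The following Python sketch is our own illustration, with hypothetical trend-specific probabilities and observed data; it is not the specific updating scheme of Section 7.2.4.

```python
def update_weights(weights, q_by_trend, n_exposed, deaths):
    """Update the weights rho_h on trend assumptions A_h after observing
    `deaths` among `n_exposed` lives, via Bayes' rule with a binomial
    likelihood (the binomial coefficient cancels on normalisation)."""
    liks = [q ** deaths * (1 - q) ** (n_exposed - deaths) for q in q_by_trend]
    post = [w * l for w, l in zip(weights, liks)]
    total = sum(post)
    return [p / total for p in post]

# hypothetical prior weights and trend-specific death probabilities
prior = [1 / 3, 1 / 3, 1 / 3]
post = update_weights(prior, q_by_trend=[0.015, 0.020, 0.025],
                      n_exposed=2_000, deaths=30)  # 1.5% observed mortality
```

The posterior concentrates on the assumption most consistent with the observed frequency, while retaining some weight on the alternatives; repeated over years of experience, this gradually resolves part of the parameter risk.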

According to a dynamic probabilistic approach, either the probability of death or the force of mortality (or possibly some other quantity) is modelled as a path of a random process. In this context, the probabilistic model consists of assumptions concerning the random process and the relevant parameters. In the current literature, many authors have been focussing on this approach. Most investigations, which are, in particular, motivated by the problem of setting a pricing structure for longevity securities, move from assumed similarities between the force of mortality and interest rates, or simply from the assumption that the market for longevity securities should behave like other aspects of the capital market. The application to mortality of some stochastic models developed originally for financial purposes is then tested. In particular, interest rate models and credit risk models have been considered. However, financial models are not necessarily suitable for describing mortality; actually, the force of mortality and interest rates do not necessarily behave in a similar manner. Therefore, the basic building blocks of the new theory still require careful discussion and investigation. Some examples are quoted in Section 7.6.

7 : The longevity risk: actuarial perspectives

It is important to note that the Lee–Carter model (see Chapters 5 and 6) is an early example of mortality modelled as a stochastic process. In its original version, deviations originating from sampling errors are in particular addressed, and hence process risk is considered. Indeed, when stochastic processes are adopted, the intrinsic stochastic nature of mortality, that is, random fluctuations, must certainly be acknowledged. To represent aggregate mortality risk as well, a second source of randomness must be introduced. So, in recent proposals, mortality is described as a doubly stochastic process. In particular, when, moving from financial modelling, diffusion processes are considered for the force of mortality, unexpected movements in the mortality curve may be accounted for through stochastic jumps. See Section 7.8 for some references.

7.2.4 Measuring longevity risk in a static framework

In this section we highlight the impact of longevity risk. With reference to a portfolio comprising one cohort of annuitants, the distribution of both the present value of future payments and of annual outflows is investigated. What follows can also be applied to a cohort of pensioners, with a proper adjustment of the parameters of the mortality model; in either case, a homogeneous group is considered. As mentioned in Section 7.1, for brevity explicit reference is made to annuitants only. Similarly, the provider could be an insurer or a pension fund; however, we refer explicitly just to an insurer.

A static representation is considered for evolving mortality and, in particular, parameter risk is addressed. To better understand the impact of longevity risk, a comparison is made with process risk.

We assume

$\dfrac{q_x(t)}{p_x(t)} = G(\tau)\,(K(\tau))^x$    (7.3)

where $\tau = t - x$ is the year of birth of the cohort. Hence, the third term of the first Heligman–Pollard law, that is, the one describing the old-age pattern of mortality, is adopted to express the time-pattern of mortality (see Section 2.5.2). Note, in particular, that the relevant parameters are cohort-specific.

Whilst the age-pattern of mortality for cohort $\tau$ is accepted to be logistic, namely

$q_x(t) = \dfrac{G(\tau)\,(K(\tau))^x}{1 + G(\tau)\,(K(\tau))^x}$    (7.4)

see (2.85) (see also the second Heligman–Pollard law in Section 2.5.2), uncertainty concerns the level of the parameters $G(\tau)$, $K(\tau)$. Actually, our investigation focusses on parameter risk, and we note that such uncertainty may, in particular, arise from an underlying unknown cohort effect.

We define five alternative sets of parameters, quoted in Table 7.1, which also shows the expected lifetime $E[T_{65}|A_h(\tau)]$ and the standard deviation $\sqrt{Var[T_{65}|A_h(\tau)]}$ of the lifetime at age 65 conditional on a given set of parameters.

Table 7.1. Parameters for the Heligman–Pollard law

                          A1(τ)       A2(τ)       A3(τ)       A4(τ)       A5(τ)
G(τ)                   6.378E-07   3.803E-06   2.005E-06   1.060E-06   3.149E-06
K(τ)                     1.14992     1.12347     1.13025     1.13705     1.11962
E[T65|Ah(τ)]              20.170      20.743      21.849      22.887      24.187
√Var[T65|Ah(τ)]            7.796       8.780       8.707       8.602       9.910

It emerges that, in terms of the survival function itself, the alternative assumptions imply different levels of rectangularization (i.e. squaring of the survival function, as witnessed by $\sqrt{Var[T_{65}|A_h(\tau)]}$) and of expansion (i.e. forward shift of the adult age at which most deaths occur, which is reflected in the value of $E[T_{65}|A_h(\tau)]$); see Sections 3.3.6 and 4.1 for the meaning of rectangularization and expansion. The relevant survival functions and curves of deaths are plotted in Fig. 7.3.

Assumption A3(τ) will be referred to as the best-estimate description of the mortality trend for cohort $\tau$; its parameters have been obtained by fitting (7.3) to the current Italian projected market table for immediate annuities (named IPS55). When comparing the values taken by $E[T_{65}|A_h(\tau)]$ and $\sqrt{Var[T_{65}|A_h(\tau)]}$ (quoted in Table 7.1) under the various assumptions, it turns out that, relative to A3(τ) at age 65:

– assumption A1(τ) implies a lighter expansion (i.e. a lower expected lifetime) joint with a stronger rectangularization (i.e. a lower standard deviation of the lifetime);

– assumption A2(τ) implies both a lighter expansion and a lighter rectangularization;

– assumption A4(τ) implies both a stronger expansion and a stronger rectangularization;

– assumption A5(τ) implies a stronger expansion joint with a lighter rectangularization.

In each case, the maximum attainable age has been set equal to 117, according to the reference projected table.
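The quantities in Table 7.1 can be sketched numerically. The following Python fragment is our own illustrative sketch (not the authors' code): it evaluates the one-year death probabilities (7.4) under the best-estimate parameters A3(τ) and computes the curtate expectation of life at age 65, truncating at the maximum age 117 as in the text.

```python
# Illustrative sketch of the mortality law (7.3)-(7.4); not the authors' code.
# Parameters of the best-estimate assumption A3(tau), from Table 7.1.
G, K, OMEGA = 2.005e-06, 1.13025, 117  # OMEGA = maximum attainable age

def q(x):
    # One-year death probability: q_x / p_x = G*K^x  =>  q_x = G*K^x / (1 + G*K^x)
    gk = G * K ** x
    return gk / (1.0 + gk)

def survival_curve(x0):
    # k-year survival probabilities kp_{x0}, k = 0, 1, ..., OMEGA - x0
    probs, p = [1.0], 1.0
    for x in range(x0, OMEGA):  # truncation at OMEGA handled by the loop bound
        p *= 1.0 - q(x)
        probs.append(p)
    return probs

kp = survival_curve(65)
e65 = sum(kp[1:])  # curtate expectation of life at 65 (close to E[T65] - 0.5)
```

Running the same fragment with the parameter pairs of A1(τ), ..., A5(τ) should reproduce the ordering of the expected lifetimes shown in Table 7.1.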

The portfolio we refer to consists of one cohort of immediate life annuities. We assume that all annuitants are aged $x_0$ at the time $t_0$ of issue. To shorten the notation, time $t$ will be recorded as the time elapsed since policy issue, that is, the policy duration; hence, at policy duration $t$ the underlying calendar year is $t_0 + t$. The lifetimes of annuitants are assumed, conditional on any given survival function, to be independent of each other and identically distributed. Since our objective is the measurement of longevity risk only, we disregard uncertainty in financial markets; hence, a given flat yield curve is considered. All of the annuitants are entitled to a fixed annual amount (participating mechanisms are not allowed for). Finally, we focus on net outflows; therefore, expenses and related expense loadings are not accounted for.

Figure 7.3. Survival functions (top panel: number of survivors against age) and curves of deaths (bottom panel: number of deaths against age) under the Heligman–Pollard law, assumptions A1–A5, ages 65–115.


Let $N_t$ be the random number of annuitants at time $t$, $t = 0, 1, \dots$, with $N_0$ a specified number (viz. the initial size of the portfolio). Whenever the current size of the portfolio is an observed quantity, we denote it by $n_t$; so $N_0 = n_0$. Quantities relating to the generic annuitant are labelled with the superscript $(j)$, $j = 1, 2, \dots, n_0$. The in-force portfolio at policy time $t$ is defined as

$\Psi_t = \{\, j \,|\, T^{(j)}_{x_0} > t \,\}$    (7.5)

Quantities relating to the portfolio are then labelled with the superscript $(\Psi)$.

Annual outflows for the portfolio are defined, for $t = 1, 2, \dots$, as

$B^{(\Psi)}_t = \sum_{j \in \Psi_t} b^{(j)}$    (7.6)

where $b^{(j)}$ is the annual amount paid to annuitant $j$.

The present value of future payments at time $t$, $t = 0, 1, \dots$, may first be defined for one annuitant as

$Y^{(j)}_t = b^{(j)} \, a_{\overline{K^{(j)}_{x_0} - t}\rceil}$    (7.7)

(see Section 1.5.1), where $K^{(j)}_{x_0}$ denotes the curtate lifetime of annuitant $j$. By summing over in-force policies, we obtain the present value of future payments for the portfolio:

$Y^{(\Psi)}_t = \sum_{j \in \Psi_t} Y^{(j)}_t$    (7.8)

We are interested in investigating some typical values of $B^{(\Psi)}_t$ and $Y^{(\Psi)}_t$, as well as the coefficient of variation and some percentiles. We will in particular consider the impact of longevity risk in relation to the size of the portfolio. So, unless otherwise stated, a portfolio homogeneous in respect of annual amounts is considered; that is, we set $b^{(j)} = b$ for all $j$. Note that in this case (7.6) may be rewritten as

$B^{(\Psi)}_t = b \, N_t$    (7.9)

whilst the present value of future payments for the portfolio may also be expressed as

$Y^{(\Psi)}_t = \sum_{h=t+1}^{\omega - x_0} B^{(\Psi)}_h \, (1+i)^{-(h-t)} = \sum_{h=t+1}^{\omega - x_0} b \, N_h \, (1+i)^{-(h-t)}$    (7.10)

where $i$ is the annual interest rate. For a homogeneous portfolio, in the following $Y^{(1)}_t$ is used to denote the present value of future payments to a generic annuitant.


We first adopt approach 4 described in Section 7.2.2 (see also Fig. 7.2). All valuations are then conditional on a given mortality assumption. We have

$E[Y^{(\Psi)}_t \,|\, A_h(\tau), n_t] = n_t \, E[Y^{(1)}_t \,|\, A_h(\tau)]$    (7.11)

Because we are assuming independence of the annuitants' lifetimes conditional on a given mortality trend, the following results hold:

$Var[Y^{(\Psi)}_t \,|\, A_h(\tau), n_t] = n_t \, Var[Y^{(1)}_t \,|\, A_h(\tau)]$    (7.12)

$CV[Y^{(\Psi)}_t \,|\, A_h(\tau), n_t] = \dfrac{1}{\sqrt{n_t}} \, \dfrac{\sqrt{Var[Y^{(1)}_t | A_h(\tau)]}}{E[Y^{(1)}_t | A_h(\tau)]}$    (7.13)

where $n_t$ is the size of the in-force portfolio, observed at the valuation policy time $t$. (Expressions for $E[Y^{(1)}_t | A_h(\tau)]$ and $Var[Y^{(1)}_t | A_h(\tau)]$ are straightforward and therefore omitted.)

The coefficient of variation, in particular, allows us to investigate the effect of the size of the portfolio on the overall riskiness. Expression (7.13) shows that, in relative terms, the riskiness of the portfolio decreases as $n_t$ increases. Thus, we have

$\lim_{n_t \to \infty} CV[Y^{(\Psi)}_t \,|\, A_h(\tau), n_t] = 0$    (7.14)

This represents the well-known result that the larger the portfolio, the less risky it is, since with high probability the observed values will be close to the expected ones. The quantity $CV[Y^{(\Psi)}_t | A_h(\tau), n_t]$ is sometimes called the risk index.
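Since, for a homogeneous portfolio, $B^{(\Psi)}_t = b\,N_t$ with $N_t$ binomial conditional on the trend, the $1/\sqrt{n_t}$ behaviour behind (7.13)–(7.14) can be checked in closed form. A minimal sketch (our own illustration; the survival probability is an arbitrary placeholder):

```python
import math

def cv_outflow(n0, p_surv):
    # Conditional CV of B_t = b * N_t with N_t ~ Binomial(n0, p_surv):
    # CV = sqrt(n0*p*(1-p)) / (n0*p) = sqrt((1-p) / (n0*p)), independent of b.
    return math.sqrt((1.0 - p_surv) / (n0 * p_surv))

p = 0.8  # illustrative survival probability from issue to time t, given the trend
cvs = {n0: cv_outflow(n0, p) for n0 in (100, 1_000, 10_000)}
# each tenfold increase in n0 divides the risk index by sqrt(10)
```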

Conditional on a given mortality assumption, and because of the independence among the lifetimes of the annuitants and the assumption of homogeneity of annual amounts, the percentiles of $Y^{(\Psi)}_t$ could be assessed through a process of convolution. In practice, however, due to the number of random variables constituting $Y^{(\Psi)}_t$ (i.e. due to the magnitude of $n_t$), analytical calculations are not practicable and so we must resort to stochastic simulation. The $\varepsilon$-percentile of the distribution of $Y^{(\Psi)}_t$, conditional on assumption $A_h(\tau)$ and an observed size $n_t$ of the in-force portfolio at time $t$, is defined as

$y_{t,\varepsilon}[A_h(\tau), n_t] = \inf\{\, u \ge 0 \,\big|\, P[\,Y^{(\Psi)}_t \le u \,|\, A_h(\tau), n_t\,] > \varepsilon \,\}$    (7.15)

In particular, we are interested in investigating the right tail of $Y^{(\Psi)}_t$; therefore, high values for $\varepsilon$ should be considered.
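The simulation of (7.15) can be sketched as follows. This is our own minimal implementation (the mortality law, seed, and sample size are illustrative): each scenario generates a path of survivors and discounts the resulting payments as in (7.10), and the empirical percentile approximates the conditional quantile.

```python
import random

random.seed(12345)

def q(x, G=2.005e-06, K=1.13025):
    # best-estimate law A3(tau), eq. (7.4)
    gk = G * K ** x
    return gk / (1.0 + gk)

def simulate_pv(n0, i=0.03, b=1.0, x0=65, omega=117):
    # one simulated outcome of Y_0: amount b per survivor, paid in arrears,
    # discounted at the flat rate i
    v, alive, pv, disc = 1.0 / (1.0 + i), n0, 0.0, 1.0
    for x in range(x0, omega):
        alive = sum(1 for _ in range(alive) if random.random() > q(x))
        disc *= v
        pv += b * alive * disc
        if alive == 0:
            break
    return pv

def percentile(sample, eps):
    # empirical version of (7.15): smallest u with P[Y <= u] > eps
    s = sorted(sample)
    return s[min(len(s) - 1, int(eps * len(s)))]

sample = [simulate_pv(100) for _ in range(2_000)]
y75, y95 = percentile(sample, 0.75), percentile(sample, 0.95)
```

With the A3(τ) parameters, the sample mean per policy should be close to the time-0 value quoted for A3(τ) in Table 7.2.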

As far as the distribution of annual outflows $B^{(\Psi)}_t$ is concerned, remarks similar to those for $Y^{(\Psi)}_t$ can be made. Thus, due to independence and homogeneity, the random variables $B^{(\Psi)}_t$ have (under the information available at time 0) a binomial distribution, with parameters $n_0$ and the survival probability from issue time to policy duration $t$ calculated under the given mortality assumption. For reasons of space, we omit the relevant expressions (which are straightforward).

Table 7.2. Expected present value of future payments, conditional on a given scenario, per policy in-force at time $t$: $E[Y^{(\Psi)}_t | A_h(\tau), n_t]/n_t = E[Y^{(1)}_t | A_h(\tau)]$

Time t    A1(τ)     A2(τ)     A3(τ)     A4(τ)     A5(τ)
  0      14.462    14.651    15.259    15.817    16.413
  5      12.004    12.374    12.956    13.500    14.238
 10       9.504    10.076    10.599    11.097    11.981
 15       7.102     7.862     8.294     8.714     9.724
 20       4.962     5.846     6.167     6.484     7.570
 25       3.221     4.127     4.336     4.543     5.626
 30       1.944     2.766     2.877     2.988     3.980
 35       1.099     1.765     1.807     1.849     2.681

Example 7.1 In the following tables we provide an example in which the age at entry is $x_0 = 65$, the interest rate is 3% p.a., and the annual amount of each annuity is $b^{(j)} = 1$. It then follows that $B^{(\Psi)}_t = N_t$.

In Table 7.2, the expected present value of future payments is presented, per annuitant (having set, under each assumption $A_h(\tau)$, $n_t = E[N_t|A_h(\tau)]$ for each valuation time $t$, $t = 0, 5, \dots, 35$). As was clear from the assumptions (see also Table 7.1), at the time of issue the five assumptions (ordered from 1 to 5) imply an increasing expected present value of future payments. The comparison may change in later years, due to the shape of the survival function under a given assumption (actually, some survival functions cross over each other; see Fig. 7.3, top panel). From these results, we get an idea of the possible range of variation of the current value of the liabilities due to uncertainty about the mortality trend.

In Table 7.3, we present the variance of the present value of future payments, per annuitant. The illustrated variability is a consequence of the rectangularization level implied by the different assumptions. We recall that only process risk is accounted for in this assessment; so when addressing longevity risk such information is not of intrinsic interest, but is helpful for comparison with the impact of longevity risk.

To compare longevity risk with process risk, we make some further calculations involving process risk only. Thus, Tables 7.4 and 7.5 show, respectively, the coefficient of variation for some initial sizes of the portfolio and some percentiles of the present value of future payments, per unit of expected value. Only the best-estimate assumption is considered. As far as the coefficient of variation is concerned, we note that at any given time it decreases rapidly as the size of the portfolio increases, as commented on earlier. For a given initial portfolio size, the coefficient of variation increases in time; this is due both to the decreasing residual size of the portfolio and to the annuitants becoming older. A similar result is found when analysing the right tail of the distribution, as emerges in Table 7.5.

Table 7.3. Variance of the present value of future payments, conditional on a given scenario, per policy in-force at time $t$: $Var[Y^{(\Psi)}_t | A_h(\tau), n_t]/n_t = Var[Y^{(1)}_t | A_h(\tau)]$

Time t    A1(τ)     A2(τ)     A3(τ)     A4(τ)     A5(τ)
  0      20.838    25.301    23.804    22.250    25.315
  5      20.858    24.858    23.994    22.985    26.102
 10      18.963    22.607    22.375    21.970    25.229
 15      15.314    18.780    19.008    19.095    22.581
 20      10.777    14.092    14.505    14.838    18.493
 25       6.550     9.497     9.855    10.181    13.726
 30       3.479     5.771     5.969     6.159     9.198
 35       1.677     3.217     3.277     3.337     5.594

Table 7.4. Coefficient of variation of the present value of future payments, conditional on the best-estimate scenario: $CV[Y^{(\Psi)}_t | A_3(\tau), n_t]$

Time t    n0 = 1     n0 = 100   n0 = 1,000   n0 = 10,000   …   n0 → ∞
  0       31.974%     2.982%      0.969%       0.027%      …     0%
  5       38.514%     3.618%      1.156%       0.030%      …     0%
 10       47.039%     4.452%      1.397%       0.038%      …     0%
 15       58.973%     5.626%      1.734%       0.056%      …     0%
 20       77.647%     7.469%      2.259%       0.103%      …     0%
 25      111.894%    10.853%      3.218%       0.250%      …     0%
 30      189.580%    18.541%      5.379%       0.883%      …     0%
 35      424.200%    41.832%     11.815%       5.202%      …     0%

Tables 7.6 and 7.7 shed light on the distribution of annual outflows. In particular, Table 7.6 quotes the expected value of annual outflows under the different assumptions; we recall that, having set $b^{(j)} = 1$, what is shown is the expected number of annuitants (not rounded, to avoid too many approximations). The remarks are similar to those discussed for the present value of future payments. □


Table 7.5. Some percentiles of the present value of future payments, conditional on the best-estimate scenario, per unit of expected value: $y_{t,\varepsilon}[A_3(\tau), n_t] \,/\, E[Y^{(\Psi)}_t | A_3(\tau), n_t]$

Initial portfolio size: n0 = 100
Time t   ε = 0.75   ε = 0.90   ε = 0.95   ε = 0.99
  0       2.159%     3.995%     4.983%     6.739%
  5       2.500%     4.863%     6.266%     8.554%
 10       3.074%     6.110%     7.904%    11.289%
 15       3.738%     7.161%     9.857%    13.801%
 20       5.418%    10.393%    12.898%    17.866%
 25       8.319%    15.577%    20.338%    26.503%
 30      13.658%    26.115%    33.982%    47.540%
 35      32.386%    63.107%    83.067%   130.409%

Initial portfolio size: n0 = 1,000
Time t   ε = 0.75   ε = 0.90   ε = 0.95   ε = 0.99
  0       0.635%     1.286%     1.631%     2.286%
  5       0.820%     1.531%     1.934%     2.668%
 10       0.898%     1.923%     2.423%     3.386%
 15       1.131%     2.221%     2.854%     4.472%
 20       1.354%     2.692%     3.781%     6.223%
 25       2.117%     4.281%     5.443%     7.967%
 30       3.638%     7.355%     9.765%    14.334%
 35       9.155%    18.426%    22.253%    31.641%

Initial portfolio size: n0 = 10,000
Time t   ε = 0.75   ε = 0.90   ε = 0.95   ε = 0.99
  0       0.200%     0.407%     0.523%     0.733%
  5       0.238%     0.461%     0.609%     0.850%
 10       0.332%     0.622%     0.786%     1.051%
 15       0.415%     0.739%     0.967%     1.385%
 20       0.518%     0.968%     1.239%     1.761%
 25       0.670%     1.414%     1.765%     2.705%
 30       1.165%     2.317%     3.090%     4.309%
 35       2.323%     4.661%     6.048%    10.430%

We now assign the (naive) probability distribution (7.2) on the set $\mathcal{A}(\tau)$. The unknown mortality trend, assumed to lie in $\mathcal{A}(\tau)$, is denoted by $A(\tau)$.

For the unconditional expected present value of future payments, the following relations hold (the suffix $\rho$ denotes that the underlying probability distribution is given by (7.2)):

$E[Y^{(\Psi)}_t | n_t] = E_\rho\big[E[Y^{(\Psi)}_t | A(\tau), n_t]\big] = n_t \, E_\rho\big[E[Y^{(1)}_t | A(\tau)]\big] = n_t \sum_{h=1}^{m} E[Y^{(1)}_t | A_h(\tau)] \, \rho_h = n_t \, E[Y^{(1)}_t]$    (7.16)


where $E[Y^{(1)}_t] = \sum_{h=1}^{m} E[Y^{(1)}_t | A_h(\tau)] \, \rho_h$.

Table 7.6. Expected value of annual outflows, conditional on a given scenario: $E[B^{(\Psi)}_t | A_h(\tau)]$; initial portfolio size: n0 = 1,000

Time t     A1(τ)      A2(τ)      A3(τ)      A4(τ)      A5(τ)
  5      963.105    954.252    963.630    971.087    969.636
 10      893.255    877.810    900.177    918.558    918.556
 15      768.675    756.711    794.479    826.872    835.463
 20      570.930    581.960    632.539    678.377    707.957
 25      319.516    367.226    418.752    468.784    530.954
 30      105.929    165.716    200.690    237.508    323.526
 35       14.160     43.221     55.764     70.089    139.572

Table 7.7. Coefficient of variation of annual outflows, conditional on the best-estimate scenario: $CV[B^{(\Psi)}_t | A_3(\tau)]$

Time t   n0 = 100   n0 = 1,000   n0 = 10,000
  5       1.943%      0.614%       0.194%
 10       3.330%      1.053%       0.333%
 15       5.086%      1.608%       0.509%
 20       7.622%      2.410%       0.762%
 25      11.782%      3.726%       1.178%
 30      19.957%      6.311%       1.996%
 35      41.150%     13.013%       4.115%

The unconditional variance of $Y^{(\Psi)}_t$ can be calculated as

$Var[Y^{(\Psi)}_t | n_t] = E_\rho\big[Var[Y^{(\Psi)}_t | A(\tau), n_t]\big] + Var_\rho\big[E[Y^{(\Psi)}_t | A(\tau), n_t]\big]$
$\qquad = n_t \, E_\rho\big[Var[Y^{(1)}_t | A(\tau)]\big] + n_t^2 \, Var_\rho\big[E[Y^{(1)}_t | A(\tau)]\big]$
$\qquad = n_t \sum_{h=1}^{m} Var[Y^{(1)}_t | A_h(\tau)] \, \rho_h + n_t^2 \sum_{h=1}^{m} \big(E[Y^{(1)}_t | A_h(\tau)] - E[Y^{(1)}_t]\big)^2 \rho_h$    (7.17)

The first term in the expression for the variance reflects random deviations around the expected value; so it can be thought of as a measure of process risk. The second term, instead, reflects systematic deviations from the expected value, and so it may be thought of as a measure of longevity risk (namely, parameter risk in our example). Under the unconditional valuation, the coefficient of variation now takes the following expression:

$CV[Y^{(\Psi)}_t | n_t] = \dfrac{\sqrt{Var[Y^{(\Psi)}_t | n_t]}}{E[Y^{(\Psi)}_t | n_t]} = \sqrt{\dfrac{1}{n_t} \, \dfrac{E_\rho\big[Var[Y^{(1)}_t | A(\tau)]\big]}{E^2[Y^{(1)}_t]} + \dfrac{Var_\rho\big[E[Y^{(1)}_t | A(\tau)]\big]}{E^2[Y^{(1)}_t]}}$    (7.18)

The first term under the square root shows that random fluctuations represent a pooling risk, since (in relative terms) their effect is absorbed by the size of the portfolio. This result is similar to that obtained under the valuation conditional on a given mortality trend (see (7.13)). The second term, instead, shows that systematic deviations constitute a non-pooling risk, which is not affected by changes in the portfolio size. In particular, the asymptotic value of the risk index,

$\lim_{n_t \to \infty} CV[Y^{(\Psi)}_t | n_t] = \sqrt{\dfrac{Var_\rho\big[E[Y^{(1)}_t | A(\tau)]\big]}{E^2[Y^{(1)}_t]}}$    (7.19)

can be thought of as a measure of that part of the mortality risk which is not affected by simply changing the size of the portfolio.
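The decomposition (7.17) and the limit (7.19) can be checked numerically from per-policy moments. The sketch below (our own illustration) takes the time-0 values of $E[Y^{(1)}_0|A_h(\tau)]$ and $Var[Y^{(1)}_0|A_h(\tau)]$ from Tables 7.2 and 7.3 and the weights $\rho_h = (0.1, 0.1, 0.6, 0.1, 0.1)$ used in Example 7.2 below.

```python
# Decomposition (7.17) of the unconditional variance; our own illustration.
# Per-policy inputs at t = 0: E[Y1|Ah] (Table 7.2) and Var[Y1|Ah] (Table 7.3).
rho   = [0.1, 0.1, 0.6, 0.1, 0.1]
mean1 = [14.462, 14.651, 15.259, 15.817, 16.413]
var1  = [20.838, 25.301, 23.804, 22.250, 25.315]

m1 = sum(r * e for r, e in zip(rho, mean1))                    # E[Y1]
pool1 = sum(r * v for r, v in zip(rho, var1))                  # E_rho[Var[Y1|A]]
nonpool1 = sum(r * (e - m1) ** 2 for r, e in zip(rho, mean1))  # Var_rho[E[Y1|A]]

def variance_parts(n):
    # Var[Y|n] = n * pooling + n^2 * non-pooling, eq. (7.17)
    return n * pool1, n * n * nonpool1

def nonpool_share(n):
    p, np_ = variance_parts(n)
    return np_ / (p + np_)
```

With these inputs, $m1 \approx 15.290$ (the $t = 0$ entry of Table 7.9), the total per-policy variance for $n = 1$ is $\approx 23.916$, the non-pooling share grows from about 1.1% at $n = 1$ to about 99.1% at $n = 10{,}000$ (matching the $t = 0$ rows of Table 7.10, up to rounding of the inputs), and the asymptotic risk index (7.19) is $\sqrt{nonpool1}/m1 \approx 3.36\%$, as in Table 7.11.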

The $\varepsilon$-percentile of the unconditional probability distribution of $Y^{(\Psi)}_t$, given an observed size $n_t$ of the in-force portfolio at time $t$, is defined as

$y_{t,\varepsilon}[n_t] = \inf\{\, u \ge 0 \,\big|\, P[\,Y^{(\Psi)}_t \le u \,|\, n_t\,] > \varepsilon \,\}$    (7.20)

To assess this quantity, stochastic simulation is required: first the mortality trend is randomly drawn from $\mathcal{A}(\tau)$, and then the lifetimes of the annuitants are generated.
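The two-stage simulation just described can be sketched as follows (our own illustration; the parameters come from Table 7.1 and the weights from Example 7.2, while the seed and sample size are arbitrary). Each scenario first draws a trend assumption and then the number of survivors, so that the resulting sample of annual outflows mixes process and longevity risk.

```python
import random
from statistics import mean, pstdev

random.seed(7)

ASSUMPTIONS = [  # (G, K) pairs for A1..A5, from Table 7.1
    (6.378e-07, 1.14992), (3.803e-06, 1.12347), (2.005e-06, 1.13025),
    (1.060e-06, 1.13705), (3.149e-06, 1.11962)]
RHO = [0.1, 0.1, 0.6, 0.1, 0.1]

def surv_prob(x0, t, G, K):
    # t-year survival probability under (7.4): product of p_x = 1 / (1 + G*K^x)
    p = 1.0
    for x in range(x0, x0 + t):
        p *= 1.0 / (1.0 + G * K ** x)
    return p

def simulate_B(n0, t, n_sims, x0=65, b=1.0):
    out = []
    for _ in range(n_sims):
        G, K = random.choices(ASSUMPTIONS, weights=RHO)[0]        # stage 1: trend
        p = surv_prob(x0, t, G, K)
        alive = sum(1 for _ in range(n0) if random.random() < p)  # stage 2: lives
        out.append(b * alive)
    return out

sample = simulate_B(1_000, 20, 500)
```

The sample mean should be close to the unconditional value at $t = 20$ quoted in Table 7.13, and the sample standard deviation is markedly larger than the purely binomial one, reflecting the non-pooling component.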

As regards annual outflows, similar valuations can be performed and similar comments apply.

Example 7.2 We now describe a numerical example of the results presented above. We consider the same inputs as in Example 7.1. We assign to $\mathcal{A}(\tau)$ the weights quoted in Table 7.8. The best-estimate assumption A3(τ) has been given the highest weight; the residual weight has been spread uniformly over the remaining assumptions.

Table 7.9 shows the unconditional expected value of future payments. Its magnitude is driven by the best-estimate assumption, as seen by comparison with the results in Table 7.2.


Table 7.8. Probability distribution on $\mathcal{A}(\tau)$

Assumption   Weight ρh
A1(τ)          0.1
A2(τ)          0.1
A3(τ)          0.6
A4(τ)          0.1
A5(τ)          0.1

Table 7.9. (Unconditional) expected present value of future payments, per policy in-force at time $t$: $E[Y^{(\Psi)}_t | n_t]/n_t = E[Y^{(1)}_t]$

Time t   E[Y(1)t]
  0       15.290
  5       12.985
 10       10.625
 15        8.317
 20        6.187
 25        4.353
 30        2.894
 35        1.824

In Table 7.10, the unconditional variance of $Y^{(\Psi)}_t$ for some portfolio sizes is shown, split into its pooling and non-pooling components. For comparison with the conditional valuation, the case $n_0 = 1$ is also quoted. We note the increase in the magnitude of the variance, due to the non-pooling part, as the portfolio size increases. Whenever the portfolio is large at policy issue, the non-pooling component remains important relative to the pooling component even at high policy durations.

The behaviour of the coefficient of variation in respect of the portfolio size is illustrated in Table 7.11. When compared with the case allowing for process risk only (see Table 7.4), the risk index decreases more slowly as the portfolio size increases. We note, in particular, its positive limiting value, which is evidence of the magnitude of the systematic risk.

In Table 7.12 the right tail of the distribution of the present value of future payments is investigated, for some portfolio sizes. We note that the tail is rather heavier than in the case allowing for process risk only (see Table 7.5).

Finally, in Tables 7.13–7.15 the distribution of annual outflows is investigated. Remarks similar to those made above for the distribution of future payments hold. □


Table 7.10. (Unconditional) variance of the present value of future payments per policy in-force at time $t$, and its components: variance $Var[Y^{(\Psi)}_t | n_t]/n_t$; pooling part $E_\rho[Var[Y^{(\Psi)}_t | A(\tau), n_t]] / Var[Y^{(\Psi)}_t | n_t]$; non-pooling part $Var_\rho[E[Y^{(\Psi)}_t | A(\tau), n_t]] / Var[Y^{(\Psi)}_t | n_t]$

Initial portfolio size: n0 = 1
Time t    Variance    Pooling part   Non-pooling part
  0        23.916       98.90%           1.10%
  5        23.303       98.73%           1.27%
 10        20.369       98.56%           1.44%
 15        15.322       98.43%           1.57%
 20         9.331       98.45%           1.55%
 25         4.202       98.75%           1.25%
 30         1.221       99.30%           0.70%
 35         0.187       99.79%           0.21%

Initial portfolio size: n0 = 100
Time t    Variance    Pooling part   Non-pooling part
  0        50.026       47.28%          52.72%
  5        52.493       43.83%          56.17%
 10        49.436       40.61%          59.39%
 15        39.209       38.46%          61.54%
 20        23.670       38.81%          61.19%
 25         9.391       44.18%          55.82%
 30         2.062       58.81%          41.19%
 35         0.226       82.60%          17.40%

Initial portfolio size: n0 = 1,000
Time t    Variance    Pooling part   Non-pooling part
  0       287.390        8.23%          91.77%
  5       317.858        7.24%          92.76%
 10       313.680        6.40%          93.60%
 15       256.365        5.88%          94.12%
 20       154.023        5.96%          94.04%
 25        56.568        7.33%          92.67%
 30         9.707       12.49%          87.51%
 35         0.580       32.20%          67.80%

Initial portfolio size: n0 = 10,000
Time t    Variance    Pooling part   Non-pooling part
  0     2,661.023        0.89%          99.11%
  5     2,971.508        0.77%          99.23%
 10     2,956.118        0.68%          99.32%
 15     2,427.922        0.62%          99.38%
 20     1,457.548        0.63%          99.37%
 25       528.334        0.79%          99.21%
 30        86.159        1.41%          98.59%
 35         4.120        4.53%          95.47%

We finally address the problem of choosing the weights (7.2). As we have already mentioned, data to estimate such weights are rarely available. However, some numerical tests suggest that the weights do not deeply affect the results of the investigation, unless only process risk is allowed for. We show this effect in Example 7.3.


Table 7.11. (Unconditional) coefficient of variation of the present value of future payments: $CV[Y^{(\Psi)}_t | n_t]$

Time t    n0 = 1      n0 = 100   n0 = 1,000   n0 = 10,000   …   n0 → ∞
  0       31.985%      4.626%      3.506%       3.374%      …   3.359%
  5       38.579%      5.790%      4.506%       4.356%      …   4.339%
 10       47.188%      7.351%      5.856%       5.685%      …   5.665%
 15       59.241%      9.477%      7.663%       7.457%      …   7.434%
 20       78.060%     12.432%     10.029%       9.756%      …   9.725%
 25      112.448%     16.811%     13.048%      12.610%      …  12.560%
 30      190.275%     24.726%     16.965%      15.983%      …  15.870%
 35      425.348%     46.751%     23.680%      19.957%      …  19.499%

Table 7.12. Some (unconditional) percentiles of the present value of future payments, per unit of expected value: $y_{t,\varepsilon}[n_t] \,/\, E[Y^{(\Psi)}_t | n_t]$

Initial portfolio size: n0 = 100
Time t   ε = 0.75   ε = 0.90   ε = 0.95   ε = 0.99
  0       3.473%     6.175%     7.796%    11.465%
  5       3.453%     8.010%    10.041%    14.405%
 10       4.638%    10.290%    14.492%    19.236%
 15       5.293%    13.350%    18.412%    25.884%
 20       7.968%    17.498%    25.518%    34.996%
 25      14.135%    23.243%    30.693%    52.379%
 30      17.964%    37.964%    46.114%    75.150%
 35      33.817%    70.810%    92.929%   138.271%

Initial portfolio size: n0 = 1,000
Time t   ε = 0.75   ε = 0.90   ε = 0.95   ε = 0.99
  0       1.057%     5.164%     7.357%     8.591%
  5       1.378%     6.300%     9.654%    11.276%
 10       1.878%     7.501%    12.471%    14.889%
 15       2.108%     9.426%    16.712%    19.332%
 20       2.797%    11.838%    22.206%    25.939%
 25       3.417%    14.771%    29.772%    34.880%
 30       5.100%    19.791%    37.774%    47.734%
 35      14.933%    27.796%    48.299%    72.891%

Initial portfolio size: n0 = 10,000
Time t   ε = 0.75   ε = 0.90   ε = 0.95   ε = 0.99
  0       0.261%     4.124%     7.298%     7.726%
  5       0.189%     4.800%     9.695%    10.062%
 10       0.254%     5.417%    12.743%    13.304%
 15       0.316%     6.102%    16.754%    17.552%
 20       0.461%     6.499%    22.162%    23.146%
 25       0.799%     6.417%    28.886%    30.326%
 30       1.571%     7.292%    36.976%    40.782%
 35       2.902%    13.794%    47.056%    52.805%


Table 7.13. (Unconditional) expected value of annual outflows: $E[B^{(\Psi)}_t]$; initial portfolio size: n0 = 1,000

Time t   E[B(Ψ)t]
  5      963.986
 10      900.924
 15      795.459
 20      633.446
 25      419.899
 30      203.682
 35       60.162

Table 7.14. Components of the (unconditional) variance of annual outflows: pooling part $E_\rho[Var[B^{(\Psi)}_t | A(\tau)]] / Var[B^{(\Psi)}_t]$; non-pooling part $Var_\rho[E[B^{(\Psi)}_t | A(\tau)]] / Var[B^{(\Psi)}_t]$

Initial portfolio size: n0 = 100
Time t   Pooling part   Non-pooling part
  5        99.488%          0.512%
 10        98.652%          1.348%
 15        97.119%          2.881%
 20        94.229%          5.771%
 25        89.724%         10.276%
 30        85.729%         14.271%
 35        86.181%         13.819%

Initial portfolio size: n0 = 1,000
Time t   Pooling part   Non-pooling part
  5        95.104%          4.896%
 10        87.976%         12.024%
 15        77.124%         22.876%
 20        62.016%         37.984%
 25        46.613%         53.387%
 30        37.528%         62.472%
 35        38.409%         61.591%

Initial portfolio size: n0 = 10,000
Time t   Pooling part   Non-pooling part
  5        66.013%         33.987%
 10        42.253%         57.747%
 15        25.214%         74.786%
 20        14.036%         85.964%
 25         8.030%         91.970%
 30         5.667%         94.333%
 35         5.870%         94.130%


Table 7.15. (Unconditional) coefficient of variation of annual outflows: $CV[B^{(\Psi)}_t]$

Time t   n0 = 100   n0 = 1,000   n0 = 10,000
  5       6.143%      1.943%       0.614%
 10      10.531%      3.330%       1.053%
 15      16.084%      5.086%       1.608%
 20      24.103%      7.622%       2.410%
 25      37.257%     11.782%       3.726%
 30      63.110%     19.957%       6.311%
 35     130.126%     41.150%      13.013%

Table 7.16. Alternative probability distributions on $\mathcal{A}(\tau)$

Weight   (a)    (b)    (c)     (d)
ρ1        0     0.1    0.15    0.2
ρ2        0     0.1    0.15    0.2
ρ3        1     0.6    0.4     0.2
ρ4        0     0.1    0.15    0.2
ρ5        0     0.1    0.15    0.2

Example 7.3 In this example, we compare the right tail of the distribution of the present value of future payments under the alternative weighting systems for (7.2) presented in Table 7.16. System (a) allows for process risk only (see Example 7.1). System (b) is the one adopted in Example 7.2. System (c) is similar to (b), with the highest weight assigned to the best-estimate assumption, but that weight has been reduced. System (d), finally, consists of a uniform distribution of weights.

We focus on the right tail of the distribution of the present value of future payments (and not on the other risk measures considered previously, such as the risk index) due to its practical importance: actually, reserving or capital allocation could be based on this quantity (see also Section 7.3.3). From the details presented for this example in Table 7.17, it seems that whenever parameter risk is allowed for, the magnitude of the right tail is not deeply affected by the weighting system (although, of course, the actual figure does depend on the specific weights). Indeed, an apparent difference emerges between the results found under system (a), on the one hand, and systems (b)–(d), on the other. This suggests that, when information is poor, allowing for longevity risk at all is more important than the actual choice of the weights. □


Table 7.17. Some (unconditional) percentiles of the present value of future payments, per unit of expected value: $y_{t,\varepsilon}[n_t] \,/\, E[Y^{(\Psi)}_t | n_t]$, under alternative weighting systems; n0 = 1,000

System (a)
Time t   ε = 0.75   ε = 0.90   ε = 0.95   ε = 0.99
  0       0.635%     1.286%     1.631%     2.286%
  5       0.820%     1.531%     1.934%     2.668%
 10       0.898%     1.923%     2.423%     3.386%
 15       1.131%     2.221%     2.854%     4.472%
 20       1.354%     2.692%     3.781%     6.223%
 25       2.117%     4.281%     5.443%     7.967%
 30       3.638%     7.355%     9.765%    14.334%
 35       9.155%    18.426%    22.253%    31.641%

System (b)
Time t   ε = 0.75   ε = 0.90   ε = 0.95   ε = 0.99
  0       1.057%     5.164%     7.357%     8.591%
  5       1.378%     6.300%     9.654%    11.276%
 10       1.878%     7.501%    12.471%    14.889%
 15       2.108%     9.426%    16.712%    19.332%
 20       2.797%    11.838%    22.206%    25.939%
 25       3.417%    14.771%    29.772%    34.880%
 30       5.100%    19.791%    37.774%    47.734%
 35      14.933%    27.796%    48.299%    72.891%

System (c)
Time t   ε = 0.75   ε = 0.90   ε = 0.95   ε = 0.99
  0       2.912%     6.785%     7.652%     8.539%
  5       3.206%     8.850%     9.822%    11.384%
 10       3.615%    11.893%    13.119%    15.178%
 15       3.881%    15.645%    17.404%    19.643%
 20       4.258%    20.997%    23.363%    26.253%
 25       5.047%    27.391%    31.364%    35.474%
 30       6.192%    35.431%    41.423%    49.504%
 35      16.809%    41.965%    54.794%    74.366%

System (d)
Time t   ε = 0.75   ε = 0.90   ε = 0.95   ε = 0.99
  0       3.697%     7.142%     7.642%     8.508%
  5       4.609%     9.408%    10.362%    11.702%
 10       5.079%    12.195%    13.497%    15.037%
 15       5.480%    16.397%    17.687%    19.671%
 20       5.929%    21.825%    23.725%    26.542%
 25       7.025%    29.249%    32.239%    35.218%
 30       8.782%    36.966%    42.059%    50.565%
 35      19.443%    46.965%    60.689%    74.138%

We have noted that the most important aspect is to allow for parameter risk by assigning positive weights to trend assumptions alternative to the best-estimate one. However, the specific weights do affect the magnitude of the quantities of interest (such as the tail of the distribution of future payments).


A Bayesian inferential model could provide an appropriate method for updating the weights. We briefly discuss how one could structure such a procedure.

We still refer to a cohort of annuitants which is homogeneous and whose lifetimes, conditional on a given trend, are independent and identically distributed. The observed number of annuitants at time $t$ is $n_t$. As in the static approach to stochastic mortality evolution, we assume that the trend of the cohort is unknown but fixed (i.e. subject neither to shocks nor to unanticipated shifts). The set of trend assumptions is given by (7.1). In the current context, the set of weights (7.2) will be denoted as

$\{\rho(A_h(\tau))\}_{h=1,2,\dots,m}$    (7.21)

We let $f_0(t|A(\tau))$ denote the probability density function (briefly: pdf) of the lifetime at birth of one individual, conditional on assumption $A(\tau)$ about the mortality trend. We then let $S(t|A(\tau))$ denote the relevant survival function.

Within the inferential procedure, the sampling pdf is defined as follows:

$f_t(z|A(\tau)) = \begin{cases} 0 & \text{for } z \le t \\[4pt] \dfrac{f_0(z|A(\tau))}{S(t|A(\tau))} & \text{for } z > t \end{cases}$    (7.22)

The multivariate sampling pdf is then given by

$f_t(z^{(1)}, z^{(2)}, \dots, z^{(n_t)} \,|\, A(\tau)) = \prod_{j=1}^{n_t} f_t(z^{(j)} | A(\tau))$    (7.23)

Note that

$f_t(z) = \sum_{h=1}^{m} f_t(z | A_h(\tau)) \, \rho(A_h(\tau))$    (7.24)

represents the (prior) predictive pdf restricted to the age interval $[t, \omega - x_0]$.

Assume now the observation period $[t, t']$, and let $d$ denote the number of deaths observed in that period. With an appropriate renumbering, let

$x = \{x^{(1)}, x^{(2)}, \dots, x^{(d)}\}$    (7.25)

denote the array of ages at death. We note that the observation procedure just defined implies Type I-censored sampling (see, for instance, Namboodiri and Suchindran (1987)).

Using the information provided by the pair $(d, x)$, the (posterior) predictive pdf $f_t(z|d, x)$ can be constructed. With this objective in mind, we can adopt the following procedure (usual in the Bayesian context):


1. Update the initial opinion about the possible evolution of mortality, and hence about the probability distribution over the set of trend assumptions $\mathcal{A}(\tau)$, by calculating the posterior weights

$\rho(A_h(\tau) | d, x) \propto \rho(A_h(\tau)) \, L(A_h(\tau) | d, x)$    (7.26)

where $L(A_h(\tau) | d, x)$ denotes the likelihood function;

2. Calculate the (posterior) predictive pdf as

$f_t(z | d, x) = \sum_{h=1}^{m} f_t(z | A_h(\tau)) \, \rho(A_h(\tau) | d, x)$    (7.27)

Step 1 requires the construction of the likelihood function $L(A_h(\tau) | d, x)$. We have (see, e.g., Namboodiri and Suchindran (1987)):

$L(A_h(\tau) | d, x) \propto \prod_{k=1}^{d} f_t(x^{(k)} - t \,|\, A_h(\tau)) \cdot \left(\dfrac{S(t' | A_h(\tau))}{S(t | A_h(\tau))}\right)^{n_t - d}$    (7.28)

The inferential procedure described above could be adopted within internal solvency models, whenever alternative projected mortality tables are available. Some numerical investigations in this regard are discussed by Olivieri and Pitacco (2002a).
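Steps 1–2 can be sketched as follows. This is our own minimal illustration, with a discrete-age likelihood in place of the continuous pdf in (7.28): each observed death contributes the conditional probability of dying at the recorded integer age, and each of the $n_t - d$ survivors contributes the survival probability over the observation period; the function names and arguments are ours.

```python
import math

def posterior_weights(priors, log_liks):
    # Bayes update (7.26): rho(A_h|d,x) proportional to rho(A_h) * L(A_h|d,x),
    # computed from log-likelihoods for numerical stability
    logs = [math.log(p) + ll for p, ll in zip(priors, log_liks)]
    top = max(logs)
    w = [math.exp(l - top) for l in logs]
    s = sum(w)
    return [x / s for x in w]

def censored_log_lik(q_fn, age_t, age_tp, death_ages, n_t):
    # Discrete-age analogue of the Type I-censored likelihood (7.28):
    # a death at integer age x contributes P(survive from age_t to x) * q_x,
    # each of the n_t - d survivors contributes P(alive at age_tp | alive at age_t)
    def surv(a, b):
        p = 1.0
        for x in range(a, b):
            p *= 1.0 - q_fn(x)
        return p
    d = len(death_ages)
    ll = sum(math.log(surv(age_t, x) * q_fn(x)) for x in death_ages)
    ll += (n_t - d) * math.log(surv(age_t, age_tp))
    return ll
```

For instance, with two trend assumptions, uniform priors, and log-likelihoods $(0, \log 3)$, the posterior weights are $(0.25, 0.75)$; with the five Heligman–Pollard assumptions of Table 7.1, heavier-than-expected observed mortality shifts weight towards the heavier-mortality assumptions such as A1(τ).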

7.3 Managing the longevity risk

7.3.1 A risk management perspective

Several tools can be developed to manage longevity risk. These tools can be placed and analysed in a risk management (RM) framework.

As sketched in Fig. 7.4, the RM process consists of three basic steps, namely the identification of risks, the assessment (or measurement) of the relevant consequences, and the choice of the RM techniques. In what follows we refer to the RM process applied to life insurance, in general, and to life annuity portfolios, in particular.

The identification of risks affecting an insurer can follow, for example, theguidelines provided by IAA (2004) or those provided within the Solvency 2project (see CEIOPS, 2007 and CEIOPS, 2008). Mortality/longevity risksbelong to underwriting risks; the relevant components have already beendiscussed (see Section 7.2.1). Obviously, for an insurer the importance ofthe longevity risk within the class of mortality risks is strictly related to the

294 7 : The longevity risk: actuarial perspectives

[Figure content: a flow diagram. IDENTIFICATION: underwriting risk (mortality/longevity risk — volatility, level uncertainty, trend uncertainty, catastrophe; lapse risk; …), market risk, …. ASSESSMENT: deterministic models (sensitivity testing, scenario testing); stochastic models (risk index, VaR, probability of default, …). RISK MANAGEMENT TECHNIQUES: loss control — loss prevention (frequency control), loss reduction (severity control); loss financing — hedging, transfer, retention. RISK MITIGATION via portfolio strategies: product design — pricing (life table, guarantees, options, expense loading, etc.), participation mechanism; portfolio protection — natural hedging, reinsurance, ART, no advance funding, capital allocation.]

Figure 7.4. The risk management process.

relative weight of the life annuity portfolio with respect to the overall life business.

A rigorous assessment of the longevity risk requires the use of stochastic models (i.e. approach 5 in Fig. 7.2). In Section 7.2.4 we have provided some examples of risk measurement, viz. the variance, the coefficient of variation, and the right tail of liabilities – these need to be appropriately defined; in Section 7.2.4 they were stated in terms of the present value of future payments and of annual outflows. A further example is given by


[Figure content: annual outflows plotted against time, showing expected values, actual outflows, and a constant threshold level; some actual outflows exceed the threshold.]

Figure 7.5. Annual outflows in a portfolio of immediate life annuities (one cohort).

the probability of default (or ruin probability, in the traditional language), which will be considered in Section 7.3.3 when dealing with the solvency problem. As discussed in Section 7.2.2, deterministic models (i.e. approach 4 in Fig. 7.2) can provide useful, although rough, insights into the impact of longevity risk on portfolio results. In particular, as outlined in Sections 7.2.3 and 7.2.4, deterministic models allow us to calculate the range of values that some quantities (present value of future payments, annual outflows, or others) may assume in respect of the outcome of the underlying random quantity.
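Such risk measures are typically estimated from a simulated distribution of the present value of future payments. The sketch below is a minimal illustration, not the book's computation: the simulated outcomes `Y` are drawn from an arbitrary hypothetical distribution standing in for the output of a stochastic mortality model.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical simulated present values of future payments for a life
# annuity portfolio (in practice obtained by simulating lifetimes under
# a stochastic mortality model).
Y = rng.gamma(shape=50.0, scale=300.0, size=100_000)

mean = Y.mean()
var = Y.var()
cv = np.sqrt(var) / mean              # risk index: coefficient of variation
right_tail_99 = np.quantile(Y, 0.99)  # right tail of the liabilities

print(f"CV = {cv:.4%}, 99th percentile = {right_tail_99:,.0f}")
```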

Risk management techniques for dealing with longevity risk include a wide set of tools, which can be interpreted, from an insurance perspective, as portfolio strategies aimed at risk mitigation.

A number of portfolio results can be taken as ‘metrics’ to assess the effectiveness of portfolio strategies. In what follows, we focus on annual outflows relating to annuity payments only, which, in any event, constitute the starting point from which other quantities (e.g. profits) may be derived.

In Fig. 7.5, we present a sequence of outflows, together with a barrier (the ‘threshold’) which represents a maintainable level of benefit payment. The threshold amount is financed first by premiums via the portfolio technical provision, and then by shareholders’ capital as the result of the allocation policy (consisting of specific capital allocations as well as an accumulation of undistributed profits).


The situation occurring in Fig. 7.5, namely some annual outflows being above the threshold level, should clearly be avoided. To lower the probability of such critical situations, the insurer can resort to various portfolio strategies, in the framework of the RM process.

Figure 7.6 illustrates a wide range of portfolio strategies which aim at risk mitigation, in terms of lowering the probability and the severity of events like the situation depicted in Fig. 7.5. In practical terms, a portfolio strategy can have as targets

(i) an increase in the maintainable annual outflow, and thus a higher threshold level;

(ii) lower (and smoother) annual outflows in the case of unanticipated improvements in portfolio mortality.

Both loss control and loss financing techniques (according to the RM language) can be adopted to achieve targets (i) and (ii). Loss control techniques are mainly performed via the product design, that is, via an appropriate choice of the various items which constitute an insurance product. In particular, loss prevention is usually interpreted as the RM technique which aims to mitigate the loss frequency, whereas loss reduction aims at lowering the severity of the possible losses.

The pricing of insurance products provides a tool for loss prevention. This portfolio strategy is represented by path (1) → (a) in Fig. 7.6. Referring to

[Figure content: a diagram of portfolio strategies, with numbered tools acting on two targets — (a) the threshold, backed by the reserve and shareholders’ capital, and (b) the net outflow, i.e. the gross outflow of annual benefits less transfers: (1) single premiums, (2) allocation, (3) undistributed profits, (4) profit participation, (5) [reduction], (6) reinsurance, (7) swaps, (8) longevity bonds.]

Figure 7.6. Portfolio strategies for risk mitigation.


a life annuity product, the following issues, in particular, should be taken into account.

– Mortality improvements require the use of a projected life table for pricing life annuities.

– Because of the uncertainty in the future mortality trend, a premium formula other than the traditional one based on the equivalence principle (see Section 1.6.1, and formula (1.57) in particular) should be adopted. It should be noted that, by adopting the equivalence principle, the longevity risk can be accounted for only via a (rough) safety loading, which is calculated by increasing the survival probabilities resulting from the projected life table. Indeed, this approach is often adopted in current actuarial practice.

– The presence, in an accumulation product such as an endowment, of an option to annuitize at a fixed annuitization rate (the so-called Guaranteed Annuity Option, briefly GAO – see Section 1.6.2) requires an accurate pricing model accounting for the value of the option itself.

To pursue loss reduction, it is necessary to control the annuity amounts paid out. Hence, some flexibility must be added to the life annuity product. One action could be the reduction of the annual amount as a consequence of an unanticipated mortality improvement (path (5) → (b) in Fig. 7.6). However, in this case the product would be a non-guaranteed life annuity, although possibly with a reasonable minimum amount guaranteed. A more practicable tool, consistent with the features of a guaranteed life annuity, consists of reducing the level of investment profit participation when the mortality experience is adverse to the annuity provider (path (4) → (b)). It is worth stressing that undistributed profits also increase the shareholders’ capital within the portfolio, hence increasing the maintainable threshold (path (3) → (a)).

Loss financing techniques require specific strategies involving the whole portfolio, and in some cases even other portfolios of the insurer. Risk transfer can be realized via (traditional) reinsurance arrangements (path (6) → (b)), swap-like reinsurance ((7) → (b)) and securitization, that is, Alternative Risk Transfer (ART). In the case of life annuities, ART requires the use of specific financial instruments, for example, longevity bonds ((8) → (b)), whose performance is linked to some measure of longevity in a given population.

A comment is required on traditional risk transfer tools. Traditional reinsurance arrangements (e.g. surplus reinsurance, XL reinsurance, and so on) can, at least in principle, be applied also to life annuity portfolios.


But, it should be stressed that such risk transfer solutions mainly rely on the improved diversification of risks when these are taken by the reinsurer, thanks to a stronger pooling effect. Notably, such an improvement can be achieved in relation to process risk (i.e. random fluctuations in the number of deaths), whilst uncertainty risk (leading to systematic deviations) cannot be diversified ‘inside’ the insurance–reinsurance process. Hence, to become more effective, reinsurance transfers must be completed with a further transfer, that is, a transfer to capital markets. Such a transfer can be realized via bonds, whose yield is linked to some mortality/longevity index, so that the bonds themselves generate flows which hedge the payment of life annuity benefits. While mortality bonds (hedging the risk of a mortality higher than expected) already exist, longevity bonds (hedging the risk of a mortality lower than expected) are yet to appear in the market.

To the extent that mortality/longevity risks are retained by an insurer, the impact of a poor experience falls on the insurer itself. To meet an unexpected amount of obligations, an appropriate level of advance funding may provide a substantial help. To this purpose, shareholders’ capital must be allocated to the life annuity portfolio (path (2) → (a), as well as (3) → (a) in Fig. 7.6), and the relevant amount should be determined to achieve insurer solvency. Conversely, the expression ‘no advance funding’ (see Fig. 7.4) should be referred to the situations where no specific capital allocation is provided in respect of mortality/longevity risks. In the case of adverse experience, the unexpected amount of obligations has to be met (at least partially) by the available residual assets, which are not tied up to specific liabilities.

Hedging strategies in general consist of assuming the existence of a risk which offsets another risk borne by the insurer. In some cases, hedging strategies involve various portfolios or lines of business (LOBs), or even the whole insurance company, so that they cannot be placed in the portfolio framework as depicted in Fig. 7.6. In particular, natural hedging (see Fig. 7.4) consists of offsetting risks in different LOBs. For example, writing both life insurance providing death benefits and life annuities for similar groups of policyholders may help to provide a hedge against longevity risk. Such a hedge is usually named across LOBs. A natural hedge can be realized even inside a life annuity portfolio, allowing for a death benefit (possibly decreasing as the age at death increases) combined with the life annuity; see Section 1.6.4. Clearly, in the case of a higher than anticipated mortality improvement, death benefits which are lower than expected will be paid. Such a hedge is usually called across time.

Clearly, mortality/longevity risks should be managed by the insurer through an appropriate mix of the tools described above. The choice of the


RM tools is also driven by various interrelationships among the tools themselves. For example, the possibility of purchasing profitable reinsurance is strictly related to the features of the insurance product and, in particular, the life tables underlying the pricing, as well as to the availability of ART for the reinsurer.

The following sections are devoted to an in-depth analysis of the RM tools which currently seem to be the most practicable.

7.3.2 Natural hedging

In the context of life insurance, natural hedging refers to a diversification strategy combining ‘opposite’ benefits with respect to the duration of life. The main idea is that if mortality rates decrease then life annuity costs increase while death benefit costs decrease (and vice versa). Hence the mortality risk inherent in a life annuity business could be offset, at least partially, by taking a position also on some insurance products providing benefits in the case of death. We discuss two situations, one concerning hedging across time and one across LOBs.

We first consider hedging across time. We assume that at time 0 (i.e. calendar year x_0 + t_0) an immediate life annuity is issued to a person aged x_0, with the proviso that at death (e.g. at the end of the year of death) the mathematical reserve therein set up to meet the life annuity benefit (only) is paid back to the beneficiaries. Reasonably, the reserving basis concerning the death benefit should be stated at policy issue so that the death benefit, although decreasing over time, is guaranteed.

At time 0, the random present value of future (life annuity and death) benefits for an individual (generically, individual j) is defined as follows:

\[
Y_0^{(j)} = b^{(j)}\, a_{\overline{K_{x_0}^{(j)}}\rvert} + (1+i)^{-\left(K_{x_0}^{(j)}+1\right)}\, C^{(j)}_{K_{x_0}^{(j)}+1} \qquad (7.29)
\]

where C_t^{(j)} is the death benefit payable at time t if death occurs in (t − 1, t), defined as follows:

\[
C_t^{(j)} = b^{(j)}\, a^{[A]}_{x_0+t} = b^{(j)} \sum_{h=1}^{\omega-x_0-t} (1+i)^{-h}\, {}_h p^{[A]}_{x_0+t} \qquad (7.30)
\]

The benefit C_t^{(j)} is therefore the mathematical reserve set up at time t to meet the life annuity benefit, calculated according to the mortality assumption A(τ) and the annual interest rate i. Note that the individual reserve (meeting both the life annuity and the death benefit) to be set up at time t according


to the (traditional) equivalence principle is

\[
V_t^{(j)} = b^{(j)}\, a_{x_0+t} + \sum_{h=0}^{\omega-x_0-t} {}_{h/1}q_{x_0+t}\, (1+i)^{-(h+1)}\, C^{(j)}_{t+h+1} \qquad (7.31)
\]

(calculated according to a proper technical basis, possibly other than that assumed in the calculation of C_t^{(j)}). The sum at risk, C_t^{(j)} − V_t^{(j)}, in each year (t − 1, t) is intended to be close to 0.
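The reserve-as-death-benefit construction of (7.30) can be sketched as follows. The mortality law below is a hypothetical Gompertz-type table standing in for the projected assumption A(τ), and all numerical inputs are illustrative, not the book's data.

```python
import numpy as np

omega, x0, i, b = 110, 65, 0.03, 1.0   # hypothetical inputs
v = 1.0 / (1.0 + i)

# Hypothetical one-year death probabilities q_x (Gompertz-type shape,
# standing in for the projected life table under A(tau)).
ages = np.arange(0, omega + 1)
q = np.clip(0.0005 * np.exp(0.1 * (ages - 30)), 0.0, 1.0)
q[omega] = 1.0                         # omega is the limiting age

def hp(x, h):
    """h-year survival probability at age x: h p_x."""
    return np.prod(1.0 - q[x:x + h])

def annuity_immediate(x):
    """a_x = sum_{h=1}^{omega-x} v^h * h p_x (the sum in eq. 7.30)."""
    return sum(v ** h * hp(x, h) for h in range(1, omega - x + 1))

def death_benefit_C(t):
    """C_t = b * a_{x0+t}: the annuity reserve paid back at death (eq. 7.30)."""
    return b * annuity_immediate(x0 + t)

# The death benefit decreases as the annuity runs on, which is the source
# of the risk reduction discussed in the text.
print([round(death_benefit_C(t), 3) for t in (0, 10, 20)])
```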

Intuitively, when dealing with both a life annuity and a death benefit the insurer benefits from a risk reduction, given that the longer is the annuity payment period, the lower is the amount of the death benefit. However, the risk reduction cannot be total, because of the definition of the death benefit (which is in particular guaranteed). The tricky point of this package is the cost to the annuitant. Intuitively, we expect that the death benefit (7.30) will be expensive (given that the consequence – which is the insurer’s target as well – is a strong reduction of the cross-subsidy effect); so commercial difficulties may arise.

For the sake of brevity we do not give analytical details; for discussion, we only provide a numerical example.

Example 7.4 We take the assumptions adopted in Examples 7.1 and 7.2. We assume that the death benefit is calculated according to the annual interest rate i = 0.03 and the mortality assumption A_3(τ). Table 7.18 quotes the risk index (i.e. the coefficient of variation of the present value of future payments), when a given mortality assumption is adopted. The reduction in the risk profile of the insurer is apparent (compare with Table 7.4). The reduction of the riskiness can be noticed also in the unconditional case; see Table 7.19, which should be compared with Table 7.11. However, the death benefit requires a 22.730% increase in the single premium at age 65 (according to a pricing basis given by i = 0.03 and the mortality assumption A_3(τ)). Actually, the mutuality effect is weaker in this case than when just a life annuity benefit is involved.

For the sake of brevity, we do not investigate further risk measures. □

From the point of view of the annuitant, the previous policy structure has the advantage of paying back the assets (in terms of the amount stated under policy conditions) remaining at her/his death, hence meeting bequest expectations. On the other hand, the death benefit is rather expensive. Further solutions can be studied, in order to reconcile the risk reduction purposes of the insurer with the request by the annuitant for a high level of the ratio between the annual amount and the single premium. However, the lower is the death benefit, the lower is the risk reduction gained by the insurer. To


Table 7.18. Coefficient of variation of the present value of future payments, conditional on the best-estimate scenario: CV[Y_t^{(P)} | A_3(τ), n_t], in the presence of death benefit (7.30)

                         Initial portfolio size
Time t   n0 = 1     n0 = 100   n0 = 1,000   n0 = 10,000   …   n0 ↗ ∞
0        10.714%    1.071%     0.339%       0.107%        …   0%
5        13.364%    1.336%     0.423%       0.134%        …   0%
10       16.722%    1.672%     0.529%       0.167%        …   0%
15       20.925%    2.093%     0.662%       0.209%        …   0%
20       26.105%    2.610%     0.826%       0.261%        …   0%
25       32.390%    3.239%     1.024%       0.324%        …   0%
30       39.960%    3.996%     1.264%       0.400%        …   0%
35       49.174%    4.917%     1.555%       0.492%        …   0%

Table 7.19. (Unconditional) coefficient of variation of the present value of future payments: CV[Y_t^{(P)} | n_t], in the presence of death benefit (7.30)

                         Initial portfolio size
Time t   n0 = 1     n0 = 100   n0 = 1,000   n0 = 10,000   …   n0 ↗ ∞
0        10.804%    1.764%     1.442%       1.405%        …   1.401%
5        13.489%    2.256%     1.866%       1.822%        …   1.817%
10       16.902%    2.924%     2.455%       2.403%        …   2.397%
15       21.193%    3.830%     3.273%       3.213%        …   3.206%
20       26.511%    5.043%     4.390%       4.319%        …   4.312%
25       33.009%    6.620%     5.858%       5.776%        …   5.767%
30       40.884%    8.574%     7.680%       7.585%        …   7.575%
35       50.490%    10.859%    9.789%       9.675%        …   9.663%

give an example that can be commercially practicable, we consider a death benefit defined as the difference (if positive) between the single premium S funding the life annuity benefit and the number of annual amounts paid up to death (see also Section 1.6.4); so we have

\[
C_t^{(j)} = \max\left\{ S - (t-1)\, b^{(j)},\, 0 \right\} \qquad (7.32)
\]

See Example 7.5.
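The money-back death benefit (7.32) can be sketched directly; the figures for S and b below are hypothetical, chosen only to show the linear run-off of the benefit.

```python
S, b = 15.259, 1.0   # single premium and annual amount (hypothetical figures)

def death_benefit(t):
    """C_t = max{S - (t - 1) * b, 0}: premium minus amounts already paid."""
    return max(S - (t - 1) * b, 0.0)

# The benefit decreases linearly and vanishes once the annual amounts
# paid up to death total the single premium.
print([round(death_benefit(t), 3) for t in (1, 5, 16, 40)])
```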

Example 7.5 With the same inputs as Example 7.4, we quote, in Tables 7.20 and 7.21, the risk index. In Table 7.20 the calculation is conditional on mortality assumption A_3(τ) and in Table 7.21 it is based on the unconditional probability distribution. The single premium has been calculated as the expected present value of future payments, conditional on assumption A_3(τ); hence, \(S = b^{(j)}\, E\big[a_{\overline{K_{x_0}^{(j)}}\rvert} \mid A_3(\tau)\big]\). When compared with Tables 7.4 and 7.11, we note a reduction in the risk profile to the insurer in the early policy


Table 7.20. Coefficient of variation of the present value of future payments, conditional on the best-estimate scenario: CV[Y_t^{(P)} | A_3(τ), n_t], in the presence of death benefit (7.32)

                         Initial portfolio size
Time t   n0 = 1     n0 = 100   n0 = 1,000   n0 = 10,000   …   n0 ↗ ∞
0        18.877%    1.888%     0.597%       0.189%        …   0%
5        26.330%    2.633%     0.833%       0.263%        …   0%
10       37.817%    3.782%     1.196%       0.378%        …   0%
15       52.312%    5.231%     1.654%       0.523%        …   0%
20       61.755%    6.175%     1.953%       0.618%        …   0%
25       72.408%    7.241%     2.290%       0.724%        …   0%
30       84.929%    8.493%     2.686%       0.849%        …   0%
35       100.172%   10.017%    3.168%       1.002%        …   0%

Table 7.21. (Unconditional) coefficient of variation of the present value of future payments: CV[Y_t^{(P)} | n_t], in the presence of death benefit (7.32)

                         Initial portfolio size
Time t   n0 = 1     n0 = 100   n0 = 1,000   n0 = 10,000   …   n0 ↗ ∞
0        19.010%    3.129%     2.568%       2.505%        …   2.498%
5        26.497%    4.386%     3.609%       3.522%        …   3.512%
10       38.040%    6.363%     5.263%       5.140%        …   5.126%
15       52.659%    9.063%     7.594%       7.431%        …   7.413%
20       62.362%    11.512%    9.918%       9.745%        …   9.725%
25       73.394%    14.493%    12.766%      12.581%       …   12.560%
30       86.413%    18.000%    16.096%      15.893%       …   15.870%
35       102.214%   21.929%    19.756%      19.525%       …   19.499%

years; of course, when the death benefit is zero, we find again the case of the stand-alone life annuity benefit. The risk reduction is lower than in Example 7.4, due to the lower death benefit. The increase in the single premium required at age 65 is lower as well; according to the usual pricing basis (i = 0.03, mortality assumption A_3(τ)), a 7.173% increase is required with respect to the case of the stand-alone life annuity. □

Death benefits like (7.32) are included in the so-called money-back annuities; see Boardman (2006).

One further, very well-known, example of natural hedging across time is given by reversionary annuities (see Section 1.6.3). In this case, the longer is the payment period to the leading annuitant, the lower should be the number of payments to the reversionary annuitant. However, some increased longevity risk arises in this case, due to the fact that two (or more) lives are involved instead of just one (with a possibly correlated mortality trend).


We now address natural hedging across LOBs. A risk reduction could be pursued by properly mixing positions in life insurances and life annuities. The offset result is unlikely to be as good as those mentioned previously, given that life insurances usually concern a different range of ages than life annuities. Further, we would point out that mortality trends emerge differently within life insurance and life annuity blocks of business.

Some empirical investigations have been performed (see Cox and Lin, 2007), considering a set of whole life insurances and a set of life annuities. Some interesting effects in terms of risk reduction can be gained when, at issue, the magnitude of the costs of life insurances is similar to those of life annuities.

A satisfactory offsetting effect between sets of life insurances and life annuities is difficult to obtain. Only large insurance companies could be partially effective in this regard. Reinsurers, in particular, could offer proper support, also through swap-like agreements (see Section 7.3.4).

7.3.3 Solvency issues

Appropriate capital allocation policies should be undertaken to deal with the longevity risk which has been retained by the insurer. In particular, the adoption of internal models addressing longevity risk should be considered. In what follows we investigate some internal models in this regard and compare the main results with the requirements embedded in Solvency 2 for longevity risk (only). We focus mainly on longevity risk and refer to conventional immediate life annuities, so that there is no allowance for participation in financial or other profits. To make the results easier to understand, we further assume that no risk transfer (i.e. neither reinsurance nor ART) has been undertaken. Where not specified, we adopt the notation and assumptions introduced in Section 7.2.4.

With reference to time t, let W_t be the amount of portfolio assets and V_t^{(P)} the portfolio reserve (or technical provision). These quantities are random at the valuation time, because of the risks, mortality and investment risks in particular, facing the portfolio. Let z be the valuation time (z = 0, 1, …). The random path of portfolio assets is recursively described as follows:

\[
W_t = W_{t-1}\,(1+i_t) - B_t^{(P)}; \qquad t = z+1, z+2, \dots \qquad (7.33)
\]

where i_t is the investment yield in year (t − 1, t) and W_z is given (including both the reserve and capital in the size required according to a chosen solvency rule).
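Recursion (7.33) is straightforward to simulate. The sketch below is a minimal illustration with a constant yield and a hypothetical random benefit outflow path; none of the figures are the book's data.

```python
import numpy as np

rng = np.random.default_rng(7)

T, i = 30, 0.03
W_z = 16_000.0                        # initial assets: reserve + capital (hypothetical)
# Hypothetical random annual benefit outflows B_t, declining as the
# cohort shrinks (e.g. survivors times the annual amount).
B = 1_000.0 * np.exp(rng.normal(0.0, 0.02, size=T)) * np.linspace(1.0, 0.2, T)

W = [W_z]
for t in range(1, T + 1):
    # Eq. (7.33): W_t = W_{t-1} * (1 + i_t) - B_t
    W.append(W[-1] * (1.0 + i) - B[t - 1])

print(round(W[-1], 2))
```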


According to legislation, the portfolio reserve is normally calculated as the expected present value (using an appropriate technical basis) of future payments, increased by an appropriately defined risk margin. If the risk margin is a function of the expected present value of future payments, then (at least in principle) the mathematical reserve can be calculated by aggregating individual reserves. In this case, the reserve at time t is random because it is the sum of a random number of individual reserves. If V_t^{(j)} denotes the individual reserve at time t, we have

\[
V_t^{(P)} = \sum_{j:\, j \in \mathcal{P}_t} V_t^{(j)} \qquad (7.34)
\]

The quantity

\[
M_t = W_t - V_t^{(P)} \qquad (7.35)
\]

represents the assets available to meet the residual risks, having allowed for those risks met by the portfolio reserve; in short, we will refer to W_t as the total portfolio assets and to M_t as the capital assets in the portfolio (conversely, W_t − M_t represents assets backing the portfolio reserve).

In line with common practice, we consider solvency to be the ability of the insurer to meet, with an assigned (high) probability, random liabilities as they are described by a realistic probabilistic structure. To implement such a concept, choices are needed in respect of the following items:

1. The quantity expressing the ability of the insurer to meet liabilities; reasonable choices are either the total portfolio assets W_t or, as is more usual in practice, the capital assets M_t, which (clearly) is supposed to be positive when the insurer is solvent.

2. The time span T which the above results are referred to; it may range from a short-medium term (1–5 years, say) to the residual duration of the portfolio.

7.3 Managing the longevity risk 305

3. The timing of the results, in particular annual results (e.g. the amount of portfolio assets at every integer time within T years) versus single figure results (e.g. the amount of portfolio assets at the end of the time horizon under consideration, that is, after T years).

Further choices concern how to define the portfolio (just in-force policies or also future entrants). To make these choices, the point of view from which solvency is ascertained must be stated. Policyholders, investors and the supervisory authority represent possible viewpoints in respect of the insurance business. However, the perspectives of the (current or potential) policyholders and investors involve profitability requirements possibly higher than those implied by the need of just meeting current liabilities. Such requirements would lead to a concept of insurer’s solidity, rather than solvency. So, we restrict our attention to the supervisory authority’s perspective.

The supervisory authority is charged to protect mainly the interests of current policyholders. So a run-off approach should be adopted (hence disregarding future entrants). Further, no profit release should be allowed for within the solvency time-horizon T, nor should any need for capital allocation be delayed.

Let z be the time at which solvency is ascertained (z = 0, 1, …). The capital required at time z could be assessed according to one of the following (alternative) models:

\[
P\left[\bigwedge_{t=z+1}^{z+T} \left(M_t \ge 0\right)\right] = 1 - \varepsilon_1 \qquad (7.36)
\]

\[
P\left[M_{z+T} \ge 0\right] = 1 - \varepsilon_2 \qquad (7.37)
\]

\[
P\left[\bigwedge_{t=z+1}^{z+T} \left(W_t - Y_t^{(P)} \ge 0\right)\right] = 1 - \varepsilon_3 \qquad (7.38)
\]

where ε_i (i = 1, 2, 3) is the accepted default probability under the chosen requirement and Y_t^{(P)} is defined as in (7.8). Clearly, in all the solvency models above (i.e. (7.36)–(7.38)), the relevant probability is assessed conditional on the current information at time z.
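Requirements of this kind are typically checked by stochastic simulation: for a given initial capital, simulate many portfolio paths and estimate the probability that the relevant quantity stays non-negative at every year of the horizon. The sketch below estimates the probability appearing in (7.36) under purely hypothetical dynamics for the annual result; the drift and shock parameters are placeholders, not the book's model.

```python
import numpy as np

rng = np.random.default_rng(42)

def default_free_prob(M_z, T=20, n_sim=20_000):
    """Estimate P[ M_t >= 0 for all t = z+1,...,z+T ], as in (7.36),
    under hypothetical capital dynamics: yield, drift, mortality shocks."""
    M = np.full(n_sim, M_z)
    ok = np.ones(n_sim, dtype=bool)
    for _ in range(T):
        shock = rng.normal(0.0, 30.0, size=n_sim)  # hypothetical annual result
        M = M * 1.03 + 5.0 + shock
        ok &= (M >= 0.0)                           # must hold at EVERY year
    return ok.mean()

print(round(default_free_prob(100.0), 3))
```

The required capital M_z^{[R1]}(T) is then the smallest M_z for which this probability reaches 1 − ε_1.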

With reference to requirement (7.38), first note that recursion (7.33) can be rewritten as

\[
W_t = W_z\, \frac{1}{v(z,t)} - \sum_{h=z+1}^{t} B_h^{(P)}\, \frac{1}{v(h,t)} \qquad (7.39)
\]


where

\[
\frac{1}{v(h,k)} = (1+i_{h+1})\,(1+i_{h+2})\cdots(1+i_k) \qquad (7.40)
\]

is the accumulation factor based on investment returns from time h to time k, and

\[
v(h,k) = \big((1+i_{h+1})\,(1+i_{h+2})\cdots(1+i_k)\big)^{-1} \qquad (7.41)
\]

is the discount factor, based on the annual investment yields, from time k to time h. Referring to one cohort only, the quantity Y_t^{(P)} can also be written as (see (7.10))

\[
Y_t^{(P)} = \sum_{h=t+1}^{\omega-x_0} B_h^{(P)}\, v(t,h) \qquad (7.42)
\]

Requirement (7.38) can be rewritten as

\[
P\left[\bigwedge_{t=z+1}^{z+T} \left( W_z\, \frac{1}{v(z,t)} - \sum_{h=z+1}^{t} B_h^{(P)}\, \frac{1}{v(h,t)} - \sum_{h=t+1}^{\omega-x_0} B_h^{(P)}\, v(t,h) \ge 0 \right)\right] = 1 - \varepsilon_3 \qquad (7.43)
\]

Assume, for brevity, that the annual investment yields are constant, that is, i_h = i for all h. Then we can write (7.43) as

\[
P\left[\bigwedge_{t=z+1}^{z+T} \left( W_z\, (1+i)^{t-z} - \sum_{h=z+1}^{\omega-x_0} B_h^{(P)}\, (1+i)^{t-h} \ge 0 \right)\right] = 1 - \varepsilon_3 \qquad (7.44)
\]

or also as

\[
P\left[\bigwedge_{t=z+1}^{z+T} (1+i)^{t-(\omega+1-x_0)} \left( W_z\, (1+i)^{\omega+1-x_0-z} - \sum_{h=z+1}^{\omega-x_0} B_h^{(P)}\, (1+i)^{\omega+1-x_0-h} \right) \ge 0 \right] = 1 - \varepsilon_3 \qquad (7.45)
\]

We note that

\[
W_z\, (1+i)^{\omega+1-x_0-z} - \sum_{h=z+1}^{\omega-x_0} B_h^{(P)}\, (1+i)^{\omega+1-x_0-h} = W_{\omega+1-x_0} \qquad (7.46)
\]


represents the amount of portfolio assets available when the cohort is exhausted, and so the following result can be easily justified:

\[
P\left[\bigwedge_{t=z+1}^{z+T} \left((1+i)^{t-(\omega+1-x_0)}\, W_{\omega+1-x_0} \ge 0\right)\right] = P\left[\bigwedge_{t=z+1}^{z+T} \left(W_{\omega+1-x_0} \ge 0\right)\right] = P\left[W_{\omega+1-x_0} \ge 0\right] = 1 - \varepsilon_3 \qquad (7.47)
\]

Hence, requirement (7.38) can be replaced by the following:

\[
P\left[W_{\omega+1-x_0} \ge 0\right] = 1 - \varepsilon_3 \qquad (7.48)
\]

Before commenting on the above results from the perspective of solvency, it is useful to note that such results hold in particular because: (a) the portfolio is closed to new entrants; (b) the probability in requirement (7.38) (as well as in (7.36) and (7.37)) is assessed according to the natural probability distribution of assets and liabilities (so that no risk-adjustment is applied, for example, in a risk-neutral sense) and it is implicitly conditional on the information available at time z on the relevant variables (current number of survivors, investment yields, and so on). The results described in (7.44)–(7.47) could then be generalized to the case where more than one cohort is addressed and the investment yield is not constant.

Turning back to the solvency requirements (7.36)–(7.38), the difference between requirements (7.36) and (7.37) is clear. The same quantity is addressed in both, but whilst under requirement (7.36) it is checked at every year within the solvency time-horizon, under (7.37) it is checked just at its end. We note that requirement (7.37) allows, in particular, for temporary shortages of money within the solvency time-horizon. In the context of a portfolio of immediate life annuities, possible deficiencies of assets may be self-financed only by healthy financial profits and, also in this case, only when the participation mechanisms for such profits (when present) are under the control of the insurer (i.e. if the insurer can reduce the participation in some years to recover more easily past or future losses). In the case of immediate life annuities, therefore, the outputs of requirement (7.37) should be close to those of (7.36). Hence, in the following we will disregard requirement (7.37).

The apparent difference between (7.36) and (7.38) arises from the way that the liabilities are defined. In (7.38), the liabilities are stated in terms of the random present value of future payments, whilst in (7.36) they are stated as the expected value of such a quantity (plus possibly a risk margin). So


whilst in (7.38) a consistent assessment of assets and liabilities is performed, under (7.36) some intermediate step is required.

To compare further (7.36) with (7.38), it is useful to note that the capital assets build up because of specific capital allocations, and also because of the annual profits which are released according to the reserve profile and, in our setting, retained within portfolio assets. On the other hand, the amount of portfolio assets at the natural maturity of the cohort represents the surplus left to the insurer at the expiry of the cohort itself. Given that, under (7.38), the maximum available time-horizon is implicitly considered (see (7.47)), we can argue that such a requirement takes care of the overall losses possibly deriving from the portfolio. Assume that a time-horizon T = ω + 1 − x_0 − z is chosen in requirement (7.36); the difference between (7.36) and (7.38) lies in the fact that, under the latter, only the total amount of the surplus (and loss) is considered (see (7.48)), whilst, under the former, also the timing of their emergence is taken into consideration. According to valuation terminology, requirement (7.36) is based on a ‘deferral and matching’ logic, whilst (7.38) on an ‘asset and liability’ approach. Further, whenever a shorter time-horizon is chosen in (7.36), we note that just profits (and losses) emerging in the first T years are accounted for.

Note that because of the differences among the three requirements, it is reasonable that they are implemented with different levels of the accepted default probability; in particular, we can imagine ε_2 ≥ ε_1. The comparison between ε_1 and ε_3 is not straightforward in general, given that, in a life portfolio, short-term losses could be recovered in the long run. Referring to a portfolio of immediate life annuities, however, we can imagine that ε_1 ≥ ε_3 whenever T < ω + 1 − x_0 − z. Should T = ω + 1 − x_0 − z, then ε_1 = ε_3 could be a reasonable choice.

Solving (7.36), through stochastic simulation, one finds the amount of capital assets required at time z; we will denote such an amount by $M^{[R1]}_z(T)$. Then $W^{[R1]}_z(T) = V^{(\mathcal{P})}_z + M^{[R1]}_z(T)$ is the amount of total portfolio assets required at time z. Solving (7.48), again through stochastic simulation, one finds the amount of total portfolio assets required at time z, denoted as $W^{[R3]}_z$; the required amount of capital assets at time z is then:

$M^{[R3]}_z = W^{[R3]}_z - V^{(\mathcal{P})}_z$
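The simulation logic behind such requirements can be sketched in a few lines. Everything below is an illustrative assumption rather than the model used in the examples: the mortality rates are stylized, the lognormal multiplier is just one way to mimic a systematic trend shock, and the sketch works with present values of outflows rather than accumulated values. The point is the mechanics: the required total assets are read off the simulated distribution as its (1 − ε)-quantile.

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_pv_outflows(n0, q, i=0.03, n_sims=2_000):
    """Present value of annuity outflows (benefit 1 per survivor, paid at
    each year end) for a closed cohort of n0 annuitants.
    q: one-year death probabilities up to the limiting age (illustrative).
    A lognormal multiplier on q mimics systematic (trend) deviations;
    the binomial draws capture random fluctuations."""
    v = 1.0 / (1.0 + i)
    pv = np.zeros(n_sims)
    for s in range(n_sims):
        k = rng.lognormal(0.0, 0.1)        # scenario-wide trend shock (assumed form)
        alive = n0
        for t, qt in enumerate(q, start=1):
            alive = rng.binomial(alive, 1.0 - min(1.0, k * qt))
            pv[s] += alive * v**t
    return pv

def required_assets(pv, eps=0.005):
    """The smallest asset amount covering the simulated liability with
    default probability not exceeding eps."""
    return float(np.quantile(pv, 1.0 - eps))

q = np.minimum(0.02 * 1.1 ** np.arange(50), 1.0)   # stylized mortality rates
pv = simulate_pv_outflows(100, q)
W = required_assets(pv)           # plays the role of the total assets W_z
M = W - pv.mean()                 # capital beyond the best-estimate reserve
```

In this sketch the sample mean of the simulated present values stands in for the best-estimate reserve; larger cohorts shrink the relative weight of the binomial fluctuations, in line with the pooling effect discussed in the text.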

Example 7.6 Let us adopt the inputs of Example 7.2; so, in particular, we refer to a homogeneous cohort. To focus on mortality, we disregard financial risk; so we set $i_t = i = 0.03$ for all t (i = 0.03 is adopted in the reserving basis as well). To facilitate the comparisons among the results obtained under the different requirements, we define the individual reserve as the

7.3 Managing the longevity risk 309

Table 7.22. Individual reserve

Time z   Reserve $V^{(1)}_z$
0        15.259
5        12.956
10       10.599
15       8.294
20       6.167
25       4.336
30       2.877
35       1.807

expected value of future payments, under the best-estimate assumption; then

$V^{(j)}_t = E[Y^{(j)}_t \mid A3(\tau)]$   (7.49)

Further, the same default probability is set for all the requirements, so ε1 = ε3 = 0.005. Such a level has been chosen to be consistent with the developing Solvency 2 system (see CEIOPS (2007) and CEIOPS (2008)). We note that under Solvency 2 a risk margin should be added to (7.49), calculated according to the Cost of Capital approach; see CEIOPS (2007) and CEIOPS (2008) for details.

Table 7.22 quotes the individual reserve. Clearly, at any time z the portfolio reserve is simply $V^{(\mathcal{P})}_z = n_z\, V^{(1)}_z$, where $V^{(1)}_z$ is the reserve at time z for a generic annuitant.

In Table 7.23, we state the amount of the capital (per unit of portfolio reserve) required according to (7.36) and (7.38) for several portfolio sizes. For (7.36), the maximum possible time-horizon has been chosen. As we would expect from the previous discussion, the two requirements lead to similar outputs, at least when mortality only is addressed. In this case, at least, the outputs suggest that requirement (7.36) is to some extent independent of the reserve when T takes the maximum possible value for the time-horizon. It should be stressed that in our investigation no risk margin is included in $V^{(\mathcal{P})}_z$. Thus, a share of the required capital quoted in Table 7.23 should be included in the reserve and, possibly, charged to annuitants through an appropriate safety loading at the issue of the policy. When interpreting the size of the required capital per unit of the portfolio reserve, we also point out that the reserve is lower than what would be required by the supervisory authority, and so the ratios in Table 7.23 would be higher than what we would find in practice.

310 7 : The longevity risk: actuarial perspectives

Table 7.23. Required capital based on requirements (7.36) and (7.38), facing longevity risk and mortality random fluctuations

          Required capital based on (7.36):                Required capital based on (7.38):
          $M^{[R1]}_z(\omega+1-x_0-z)/V^{(\mathcal{P})}_z$   $M^{[R3]}_z/V^{(\mathcal{P})}_z$

Time z   n0 = 100   n0 = 1,000   n0 = 10,000   n0 = 100   n0 = 1,000   n0 = 10,000
0        12.744%    9.243%       8.103%        12.744%    9.241%       8.103%
5        16.510%    11.938%      10.525%       16.492%    11.938%      10.525%
10       21.474%    15.630%      13.890%       21.333%    15.621%      13.890%
15       28.097%    20.372%      18.282%       28.007%    20.372%      18.281%
20       37.722%    27.031%      24.131%       37.456%    27.008%      24.131%
25       53.980%    36.129%      31.832%       53.378%    36.113%      31.832%
30       82.980%    50.605%      42.152%       81.037%    50.476%      42.140%
35       171.782%   79.024%      56.968%       165.842%   77.890%      56.968%

It is worthwhile to comment on the similar magnitude of the ratios $M^{[R3]}_z/V^{(\mathcal{P})}_z$ and $y_{z,\varepsilon}[n_z]/E[Y^{(\mathcal{P})}_z \mid n_z]$ (see Table 7.12), when the probability ε considered for the calculation of the percentile $y_{z,\varepsilon}[n_z]$ is very close to (or, better, the same as) the non-default probability 1 − ε3 adopted for calculating $M^{[R3]}_z$. As an example, we can compare the ratio $M^{[R3]}_z/V^{(\mathcal{P})}_z$ in Table 7.23 (where 1 − ε3 = 0.995) with the ratio $y_{z,0.99}[n_z]/E[Y^{(\mathcal{P})}_z \mid n_z]$ in Table 7.12 (thus, we are setting ε = 0.99); we can note that the two ratios have a similar magnitude at each time z. First, we note that, as pointed out in Example 7.2, $V^{(\mathcal{P})}_z$ (given by $n_z\, E[Y^{(1)}_z \mid A3(\tau)] = E[Y^{(\mathcal{P})}_z \mid A3(\tau), n_z]$) and $E[Y^{(\mathcal{P})}_z \mid n_z]$ are very close (compare in particular Tables 7.2 and 7.9). So, given the similar values of the two ratios, the quantities $M^{[R3]}_z$ and $y_{z,\varepsilon}[n_z]$ are also likely to be close to one another. Actually, under requirement (7.38) what is measured is the accumulated value of annual payments, whilst with $y_{z,\varepsilon}[n_z]$ the relevant present value is accounted for. Indeed, in Section 7.2.4 we mentioned the practical importance of investigating the right tail of the distribution of the present value of future payments; this comes from the fact that the quantity $y_{z,\varepsilon}[n_z]$ may be taken as a measure of the capital required to meet liabilities with a low default probability (and according to the maximum possible solvency time horizon).

In Table 7.24, outputs from requirement (7.36) are investigated for shorter time-horizons. Comparing Table 7.23 with Table 7.24, the long-term nature of longevity risk clearly emerges. We note that, both in Tables 7.23 and 7.24, at each valuation time and for each requirement, the size of the required capital decreases when a larger portfolio is considered. This is due to the fact that random fluctuations are also accounted for in the assessment.


Table 7.24. Required capital based on requirement (7.36), per unit of portfolio reserve: $M^{[R1]}_z(T)/V^{(\mathcal{P})}_z$, facing longevity risk and mortality random fluctuations

          Time-horizon T = 1                       Time-horizon T = 3

Time z   n0 = 100   n0 = 1,000   n0 = 10,000   n0 = 100   n0 = 1,000   n0 = 10,000
0        0.574%     0.473%       0.242%        1.834%     1.076%       0.581%
5        1.058%     0.743%       0.397%        3.358%     1.711%       0.983%
10       1.951%     1.159%       0.649%        5.162%     2.568%       1.738%
15       3.600%     1.903%       1.226%        8.689%     4.463%       3.399%
20       6.639%     3.265%       2.306%        13.796%    8.003%       6.403%
25       12.246%    6.070%       4.465%        22.727%    14.314%      11.790%
30       22.588%    12.168%      8.655%        44.454%    26.438%      21.145%
35       41.664%    26.210%      16.739%       124.167%   51.506%      36.973%

We have obtained Table 7.25 by addressing random fluctuations only. In particular, the required capital has been calculated adopting only the best-estimate mortality assumption A3(τ). In Table 7.26, in contrast, only longevity risk has been accounted for, by assuming that, whatever the realized mortality trend, the actual number of deaths in each year coincides with what is expected under the relevant trend assumption. We note that in the latter case the amount of the required capital per unit of portfolio reserve is independent of the size of the portfolio – this occurs because, as noted previously, longevity risk is systematic. Regarding Table 7.25, we point out that the random fluctuations accounted for there are not fully comparable to those embedded in Tables 7.23 and 7.24. Actually, in Tables 7.23 and 7.24, a mixture of the random fluctuations which can be appraised under the several mortality assumptions in A(τ) is accounted for. When comparing Table 7.25 (lower panels) with Table 7.24, we can see that, if requirement (7.36) is implemented with a short time-horizon, in practice we are mainly accounting for random fluctuations, rather than systematic deviations; this is due to the long-term nature of longevity risk. Tables 7.25 and 7.26 do provide us with some useful information. However, it must be pointed out that implementing an internal model allowing for only one component of a risk represents an improper use of the model itself. As an illustration, we note that on summing the results in Tables 7.25 and 7.26, for a given requirement and portfolio size, we do not find the corresponding results in Table 7.23 or 7.24. Thus, some aspects are missed when working with marginal distributions only (as is the case when we address either random fluctuations or systematic deviations only).

Finally, it is interesting to compare the findings described by the previous tables with some legal requirements. We refer here to the developing Solvency 2 system, which is one of the few explicitly considering longevity


Table 7.25. Required capital based on requirements (7.36) and (7.38), facing mortality random fluctuations only; mortality assumption A3(τ)

          Required capital based on (7.36):                Required capital based on (7.38):
          $M^{[R1]}_z(\omega+1-x_0-z)/V^{(\mathcal{P})}_z$   $M^{[R3]}_z/V^{(\mathcal{P})}_z$

Time z   n0 = 100   n0 = 1,000   n0 = 10,000   n0 = 100   n0 = 1,000   n0 = 10,000
0        7.813%     2.832%       0.879%        7.031%     2.698%       0.800%
5        9.983%     3.071%       1.067%        9.436%     2.949%       1.040%
10       12.144%    4.040%       1.217%        11.543%    3.759%       1.193%
15       16.153%    5.202%       1.544%        14.982%    4.921%       1.462%
20       22.343%    6.938%       2.091%        21.292%    6.554%       1.936%
25       29.728%    10.388%      3.072%        28.546%    9.642%       2.983%
30       54.183%    16.871%      5.547%        51.253%    16.807%      5.152%
35       155.859%   36.795%      11.715%       144.058%   34.809%      11.207%

          Required capital based on (7.36):    Required capital based on (7.36):
          $M^{[R1]}_z(1)/V^{(\mathcal{P})}_z$   $M^{[R1]}_z(3)/V^{(\mathcal{P})}_z$

Time z   n0 = 100   n0 = 1,000   n0 = 10,000   n0 = 100   n0 = 1,000   n0 = 10,000
0        0.574%     0.473%       0.171%        1.834%     0.983%       0.378%
5        1.058%     0.743%       0.271%        3.358%     1.443%       0.479%
10       1.951%     0.932%       0.388%        5.162%     1.957%       0.657%
15       3.600%     1.642%       0.583%        8.604%     2.630%       0.932%
20       6.639%     2.458%       0.806%        13.304%    3.775%       1.329%
25       12.246%    4.633%       1.379%        19.609%    7.129%       2.166%
30       22.588%    7.878%       2.804%        41.023%    13.181%      4.168%
35       41.664%    21.058%      7.321%        124.167%   32.954%      10.176%

risk. The capital required to deal with such risk is the change expected in the net asset value against a permanent reduction by 25% in the current and all future mortality rates (we do not discuss further details, such as possible reductions of this amount; see CEIOPS (2007) and CEIOPS (2008)). Under our hypotheses (we are considering just one cohort, there is no profit participation, we are disregarding risks other than those deriving from mortality, and so on), the requirement reduces to the difference between the best-estimate reserve and a reserve set up with a mortality table embedding probabilities of death 25% lower than in the best-estimate assumption. The relevant results are quoted in Table 7.27, where the required capital at time z is denoted by $M^{[Solv2]}_z$. It is clear that, in relative terms, such an amount is independent of the portfolio size. We further recall that, under Solvency 2, no specific capital allocation is required for the risk of random fluctuations, since they are treated as hedgeable risks. □
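Under these simplifying hypotheses, the Solvency 2 longevity requirement reduces to a reserve difference and can be computed deterministically. A minimal sketch follows; the mortality rates and the helper names are illustrative assumptions, not the calculation behind Table 7.27.

```python
import numpy as np

def annuity_reserve(q, i=0.03):
    """Expected present value of a unit life annuity in arrears for one
    annuitant, given the vector q of one-year death probabilities."""
    q = np.asarray(q, dtype=float)
    p_surv = np.cumprod(1.0 - q)              # survival to year 1, 2, ...
    t = np.arange(1, len(q) + 1)
    return float(np.sum(p_surv * (1.0 + i) ** -t))

def solvency2_longevity_capital(q, i=0.03, shock=0.25):
    """Reserve increase under a permanent reduction of all current and
    future mortality rates (25% in the standard longevity shock)."""
    best = annuity_reserve(q, i)
    shocked = annuity_reserve((1.0 - shock) * np.asarray(q), i)
    return shocked - best, best

q = np.minimum(0.01 * 1.1 ** np.arange(45), 1.0)   # stylized rates (assumed)
M_solv2, V_best = solvency2_longevity_capital(q)
ratio = M_solv2 / V_best   # in relative terms, independent of portfolio size
```

Because both reserves scale linearly with the number of annuitants, the ratio of the required capital to the reserve does not depend on the portfolio size, as noted in the text.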


Table 7.26. Required capital based on requirements (7.36) and (7.38), facing longevity risk only

Required capital:

Time z   $M^{[R1]}_z(\omega+1-x_0-z)/V^{(\mathcal{P})}_z$   $M^{[R3]}_z/V^{(\mathcal{P})}_z$   $M^{[R1]}_z(1)/V^{(\mathcal{P})}_z$   $M^{[R1]}_z(3)/V^{(\mathcal{P})}_z$
0        7.562%    7.562%    0.125%    0.389%
5        9.895%    9.895%    0.205%    0.651%
10       13.040%   13.040%   0.437%    1.394%
15       17.239%   17.239%   0.922%    2.857%
20       22.745%   22.745%   1.883%    5.621%
25       29.762%   29.762%   3.727%    10.564%
30       38.348%   38.348%   7.110%    18.745%
35       48.330%   48.330%   12.949%   30.875%

Table 7.27. Required capital according to Solvency 2

Time z   $M^{[Solv2]}_z/V^{(\mathcal{P})}_z$
0        7.274%
5        9.080%
10       11.377%
15       14.293%
20       18.000%
25       22.767%
30       29.102%
35       38.065%

Tables 7.26 and 7.27 may suggest that a deterministic approach can be adopted for allocating capital to deal with longevity risk. In particular, the assessment of the required capital could be based on a comparison between the actual reserve and a reserve calculated under a more severe mortality trend assumption (as turns out to be the case under Solvency 2).

Let $V^{(\mathcal{P})[B]}_z$ be a reserve calculated according to the same valuation principle adopted for $V^{(\mathcal{P})}_z$ (the equivalence principle, in our implementation), but based on a worse mortality assumption, so that

$V^{(\mathcal{P})}_z \le V^{(\mathcal{P})[B]}_z$   (7.50)

The required capital would be

$M^{[R4]}_z = V^{(\mathcal{P})[B]}_z - V^{(\mathcal{P})}_z$   (7.51)

We note that requirement (7.51) would deal with longevity risk only. Further, no default probability is explicitly mentioned; however, the mortality assumption adopted in $V^{(\mathcal{P})[B]}_z$ clearly implies some (not explicit) default probability. The time-horizon implicitly considered is the maximum residual duration of the portfolio, given that this is the time-horizon referred to in the calculation of the reserve. We also point out that, to simplify the assessment of the required capital and to avoid any duplication of risk margins as well, it is reasonable that the reserves in (7.51) are actually based on the equivalence principle. So the required capital $M^{[R4]}_z$ turns out to be linear with respect to the portfolio size $n_z$.

To compare requirements (7.36) and (7.38) with (7.51), let us define the following ratios:

$QM^{[R1]}_z(T; n_z) = \dfrac{M^{[R1]}_z(T)}{V^{(\mathcal{P})}_z}$   (7.52)

$QM^{[R3]}_z(n_z) = \dfrac{M^{[R3]}_z}{V^{(\mathcal{P})}_z}$   (7.53)

$QV_z = \dfrac{M^{[R4]}_z}{V^{(\mathcal{P})}_z}$   (7.54)

Accounting also for the risk of random fluctuations, the ratios $QM^{[R1]}_z(T; n_z)$ and $QM^{[R3]}_z(n_z)$ depend on the size of the portfolio while, in contrast, the ratio $QV_z$, which considers just longevity risk, is independent of portfolio size. On the other hand, requirements (7.36) and (7.38) could be implemented considering only the risk of random fluctuations or only the longevity risk, as we have illustrated in the calculations in Tables 7.25 and 7.26, respectively. As noted previously, when addressing longevity risk only, the ratios $QM^{[R1]}_z(T; n_z)$ and $QM^{[R3]}_z(n_z)$ are independent of the size of the portfolio (as emerges from Table 7.26). However, we have already commented on the fact that addressing just a component of the mortality risk represents an improper use of requirements (7.36) and (7.38). A further difference between the ratio $QM^{[R1]}_z(T; n_z)$ and $QV_z$ lies in the possibility of setting a preferred time-horizon; indeed, time-horizons other than the maximum one may be chosen only when requirement (7.36) is adopted.

It is not possible to derive general conclusions regarding the comparison between the resulting levels of the ratios $QM^{[R1]}_z(T; n_z)$ and $QM^{[R3]}_z(n_z)$, on one hand, and $QV_z$, on the other. However, we comment further through an example.

Example 7.7 Figure 7.7 plots the ratios (7.53) and (7.54), for several portfolio sizes, based on calculations performed at time 0. In particular:


Figure 7.7. Ratios $QM^{[R3]}_0(n_0)$ and $QV_0$ (required capital, per unit of reserve, against portfolio size). (1): $QM^{[R3]}_0(n_0)$; (2): $QV_0$, with $V^{(\mathcal{P})[A5(\tau)]}_0$; (3): $QM^{[R3]}_0(n_0)$, with $M^{[R3]}_0$ accounting for random fluctuations only; (4): $QV_0 + QM^{[R3]}_0(n_0)$, with $M^{[R3]}_0$ accounting for random fluctuations only.

– case (1) plots the ratio $QM^{[R3]}_0(n_0)$;

– case (2) plots the ratio $QV_0$, obtained by choosing the mortality trend A5(τ) as an assumption more severe than the best-estimate;

– case (3) plots the ratio $QM^{[R3]}_0(n_0)$ where, in contrast to case (1), the required capital $M^{[R3]}_0$ has been obtained by addressing random fluctuations only (the best-estimate assumption has been used to describe mortality);

– case (4) plots the required capital obtained by summing the results in case (2) (accounting for longevity risk only) and in case (3) (accounting for random fluctuations only).

We first note that the outputs found under case (2) are very similar to (indeed, in our example they coincide with) those found adopting requirement (7.38), as well as requirement (7.36) with T = ω + 1 − x0 (the ratio $QV_0$, with $V^{(\mathcal{P})[A5(\tau)]}_0$, plotted in Fig. 7.7, amounts to 7.562% for each portfolio size; compare this outcome with the ratios $QM^{[R1]}_0(\omega+1-x_0;\, n_0) = M^{[R1]}_0(\omega+1-x_0)/V^{(\mathcal{P})}_0$ and $QM^{[R3]}_0(n_0) = M^{[R3]}_0/V^{(\mathcal{P})}_0$ in Table 7.26). This is explained by the fact that the (left) tail of the distribution of assets (addressed in (7.38) and (7.36)) is heavily affected by the worst scenario (A5(τ), in our example) when low probabilities (of default) are addressed.


Thus, when allowing for longevity risk only, requirement (7.36) adopted with the maximum possible time-horizon and requirement (7.38) reduce to (7.51). This is why a practicable idea could be to split the capital allocation process into two steps:

– one for longevity risk only, based on a comparison between reserves calculated according to different mortality assumptions (i.e. adopting requirement (7.51));

– one for random fluctuations only, adopting an internal model or some other standard formula.

Case (4) in Fig. 7.7 is intended to represent such a choice. We note, however, that an unnecessary allocation may result from this procedure; as we have already commented, working separately on the components of mortality risk is improper and may lead to an inaccurate capital allocation. □

Undoubtedly, the advantage of requirement (7.51) is its simplicity, and we note that it seems this requirement will be adopted by Solvency 2 in respect of many risks. Of course, it is also possible to choose the reserving basis so as to avoid the situation plotted in Fig. 7.7 (but, to be sure, one should first perform the valuation through an internal model, at least for some typical compositions of the portfolio). Another possibility supporting the separate treatment of the mortality risk components is to adopt different solvency time-horizons for the different components of mortality risk. So we could choose the maximum possible value of T for longevity risk (adopting (7.51)) and a short-medium time-horizon for random fluctuations (if requirement (7.36) is adopted, with, say, T = 1 to 5 years). For practical purposes, this approach could represent a good compromise, on condition that the relevant assumptions are properly disclosed. If valuation tools other than an internal model are available or required for the risk of random fluctuations (as should be the case for Solvency 2), then requirement (7.51) is certainly able to capture properly the features of longevity risk (only).

Example 7.8 We conclude this section with a final example. So far, just homogeneous portfolios have been investigated. We now consider the case of a portfolio with some heterogeneity in annual amounts. A stronger dispersion of annual amounts usually leads to a poorer pooling effect. There is also the danger that, if the annuitants living longer are those with higher annual amounts, the impact of longevity risk could be more severe. Even though it is reasonable to assume that, because of adverse selection, those with higher annual amounts live longer (as is supported by some evidence), in this example we do not account for this dependence. The impact


Table 7.28. Classes of annual amounts in five portfolios

         Portf. 1        Portf. 2        Portf. 3        Portf. 4        Portf. 5
Class    Amount  Freq.   Amount  Freq.   Amount  Freq.   Amount  Freq.   Amount  Freq.
1        1       100%    0.75    40%     0.25    20%     0.75    90%     0.5625  80%
2                        1       50%     0.75    20%     3.25    10%     2       15%
3                        2       10%     1       20%                     5       5%
4                                        1.25    20%
5                                        1.75    20%

Distribution of the annual amount:
Average value        1       1         1      1       1
Standard deviation   0       0.35355   0.5    0.75    1.0503

of the dispersion of annual amounts is checked through the calculation of the capital required to meet mortality risks, assuming a zero correlation between the annual amount and the lifetime of the annuitant.

We test the five portfolios described in Table 7.28. We note that, to facilitate comparisons, the same average annual amount per annuitant has been assumed. The specific annual amount paid to each annuitant may, however, be different from the average value, depending on the insurance class (each class grouping people with the same annual amount). We note that the portfolios are ordered with respect to the degree of heterogeneity, as measured by the standard deviation of the distribution of the annual amounts.
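The average value and standard deviation rows of Table 7.28 follow directly from the class amounts and frequencies, and can be checked in a few lines (the data layout below is just an illustration):

```python
import math

# (amount, frequency) pairs for each portfolio, from Table 7.28
portfolios = {
    1: [(1.0, 1.00)],
    2: [(0.75, 0.40), (1.0, 0.50), (2.0, 0.10)],
    3: [(0.25, 0.20), (0.75, 0.20), (1.0, 0.20), (1.25, 0.20), (1.75, 0.20)],
    4: [(0.75, 0.90), (3.25, 0.10)],
    5: [(0.5625, 0.80), (2.0, 0.15), (5.0, 0.05)],
}

def moments(classes):
    """Mean and standard deviation of the annual amount distribution."""
    mean = sum(a * f for a, f in classes)
    var = sum(f * (a - mean) ** 2 for a, f in classes)
    return mean, math.sqrt(var)

stats = {k: moments(v) for k, v in portfolios.items()}
# every portfolio has average amount 1; the standard deviations are
# 0, 0.35355, 0.5, 0.75, 1.0503, as quoted in the table
```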

Adopting the inputs of Example 7.6, we have calculated the capital required based on requirement (7.38). The assessment has been performed at time 0 only, for several portfolio sizes. The outputs are plotted in Fig. 7.8. A stronger requirement emerges when portfolios with a wider dispersion of annual amounts are considered: portfolio 5 versus portfolio 1, for example. We note that the portfolio reserve at time 0 is the same in all portfolios, due to the assumption about the average annual amount. It is interesting to compare Fig. 7.8 with Fig. 7.9, where only random fluctuations have been considered. It seems that most of the change in the capital required when changing the portfolio composition is due to random fluctuations. We note, in particular, the width of the range of variation of the capital required across the several portfolios when longevity risk is also accounted for, relative to what happens when only random fluctuations are accounted for. Indeed, comparing Figs. 7.8 and 7.9 in detail, we note that, although the scale of the y-axis is different, the length of the range is the same. So we can conclude that, to some extent, longevity risk is independent of the heterogeneity of the portfolio. It is important to note, again, that this result is also due to the


Figure 7.8. Required capital, per unit of reserve: $QM^{[R3]}_0(n_0)$, against portfolio size, for portfolios 1–5 of Table 7.28.

model, which does not explicitly account for any dependence of the lifetime of the individual on her/his annual amount. □

7.3.4 Reinsurance arrangements

Various reinsurance arrangements can be conceived, at least in principle, to transfer longevity risk. At the time of writing, reinsurers are reluctant to accept such a transfer, due to the systematic nature of the risk of unanticipated aggregate mortality. Actually, only some slight offset (through natural hedging) can be gained by dealing with longevity risk just within the insurance-reinsurance process. Longevity-linked securities, transferring the risk to the capital market, could back the development of a longevity reinsurance market (see Section 7.4). So in the following we describe several arrangements, some of which could be effective in particular when linked to longevity securities. At the same time, we disregard any arrangement designed to deal with random fluctuations. To be consistent with the previous discussion, we refer to immediate life annuities (which in any case are the most interesting type of annuity when a transfer of longevity risk is being considered).

It must be pointed out that when mortality risk is reinsured in a life annuity portfolio, one cannot be sure that just longevity risk is transferred. Indeed, random fluctuations also contribute to deviations in mortality rates,


Figure 7.9. Required capital, per unit of reserve: $QM^{[R3]}_0(n_0)$, facing mortality random fluctuations only; mortality assumption A3(τ) (portfolios 1–5 of Table 7.28, against portfolio size).

as we have highlighted previously. If the reinsurance arrangement is meant to deal mainly with longevity risk, then before underwriting it the risk of random fluctuations has to be reduced; for example, some levelling of the annual amounts has to be achieved through a first-step surplus reinsurance. For this reason, in the following we will implicitly refer to portfolios which are homogeneous in respect of the amount of benefits.

The most natural way for an annuity provider to transfer longevity risk is to truncate the duration of each annuity. To this purpose, an Excess-of-Loss (XL) reinsurance can be designed. Under such an arrangement, the reinsurer would pay to the cedant the 'final' part of the life annuity in excess of a given age xmax. Such an age should be reasonably old, but not too close to the maximum age (otherwise the transfer would be ineffective); xmax could, for example, be set equal to the Lexis point in the current projected table. Note that xmax defines the deductible of the XL arrangement. See Fig. 7.10, where x0 = 65 and xmax = 85.
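The split of each annuity implied by this XL design can be sketched as follows; the function, its name, and its unit-benefit yearly-payment convention are illustrative assumptions, not part of the treaty description:

```python
def xl_split(K, x0=65, x_max=85):
    """Split a unit life annuity in arrears between cedant and reinsurer
    under the XL design: the reinsurer pays every instalment falling due
    after age x_max. K is the number of instalments actually paid, i.e.
    the annuitant's curtate lifetime from the issue age x0."""
    cedant = min(K, x_max - x0)           # temporary annuity up to age x_max
    reinsurer = max(K - (x_max - x0), 0)  # the 'final' part beyond x_max
    return cedant, reinsurer
```

For x0 = 65 and xmax = 85 as in Fig. 7.10, an annuitant who receives 25 payments costs the cedant 20 instalments and the reinsurer 5, so the cedant effectively holds a temporary annuity.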

From the point of view of the cedant, this reinsurance treaty converts immediate life annuities payable for the whole residual lifetime into immediate temporary life annuities. From the point of view of the reinsurer, a heavy charge of risk emerges. Actually, the reinsurer takes the 'worst part' of each annuity, being involved at the oldest ages only. Therefore, from a practical point of view, the reinsurance treaty would be acceptable to the


Figure 7.10. An XL reinsurance arrangement (annuitants' lifetimes from age x0 = 65, with the reinsurer's intervention covering payments beyond age xmax = 85).

reinsurer only if it were compulsory for some annuity providers. This could be the case, for example, with pension funds, which may be forced by the supervisory authority to back their liabilities through arrangements with (re-)insurers.

The XL arrangement is clearly defined on a long-term basis, so implying a heavy longevity risk charged to the reinsurer. In more realistic terms, reinsurance arrangements defined on a short-medium period basis could be addressed. With this objective in mind, stop-loss arrangements could provide interesting solutions. According to the stop-loss rationale, the reinsurer's interventions are aimed at preventing the default of the cedant, caused by (systematic) mortality deviations.

The effect of mortality deviations can be identified, in particular, by comparing the total portfolio assets at a given time with the portfolio reserve required to meet the insurer's obligations. A Stop-Loss reinsurance on assets can then be designed, according to which the reinsurer funds (at least partially) the possible deficiency in assets; Fig. 7.11 sketches this idea (in a run-off perspective).

Let z be the time of issue (or revision) of the reinsurance arrangement. Adopting the notation introduced earlier, in practical terms the reinsurer's intervention can be limited to the case

$W_{z+k} < (1 - \pi)\, V^{(\mathcal{P})}_{z+k}, \quad \pi \ge 0$   (7.55)


Figure 7.11. A Stop-Loss reinsurance arrangement on assets (assets available and required portfolio reserve over time; the reinsurer intervenes when available assets fall short of the reserve).

where the amount $\pi V^{(\mathcal{P})}_{z+k}$ represents the 'priority' of the stop-loss treaty and k is a given number of years. We note that setting π > 0 may limit the possibility of random fluctuations being transferred. However, thanks to the fact that the assets and the reserve of a life annuity portfolio have long-term features, the flows of the arrangement should not be heavily affected by random fluctuations, at least up to some time. In fact, close to the natural maturity of the portfolio we may expect that random fluctuations become predominant relative to systematic deviations; see also Section 7.2.4. Setting k > 1 (e.g. k = 3 or k = 5) ensures that the reinsurer intervenes in the more severe situations, and not when the lack of assets may be recovered by the subsequent flows of the portfolio. However, k should not be set too high, otherwise the funding to the cedant in the critical cases would turn out to be too delayed in time.
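Condition (7.55) can be stated compactly. How much the reinsurer funds once the trigger is breached is left open above ('at least partially'); topping assets back up to the full reserve, as in this sketch, is only one assumed design, and the function name is illustrative:

```python
def stoploss_on_assets(W, V, pi=0.05):
    """Per (7.55): the reinsurer intervenes only when assets W fall below
    (1 - pi) * V, so pi * V acts as the priority retained by the cedant.
    Funding the full deficiency V - W is an assumed design choice."""
    if W < (1.0 - pi) * V:
        return V - W          # deficiency funded by the reinsurer
    return 0.0                # no intervention above the trigger
```

For example, with V = 100 and π = 0.05, assets of 96 trigger no payment, while assets of 90 would be topped up by 10.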

A technical difficulty in this treaty concerns the definition of the assets and the reserve to be referred to for ascertaining the loss. Further, some control of the investment policy adopted by the cedant in relation to these assets could be requested by the reinsurer. For these reasons, the treaty can be conceived as an 'internal' arrangement, that is, within an insurance group (where the holding company takes the role of reinsurer of the affiliates) or when there is some partnership between a pension fund and an insurance company (the latter then acting as the reinsurer, the former as the cedant).

A Stop-Loss reinsurance may be designed on annual outflows, instead of assets. The rationale, in this case, is that, at a given point in time, longevity risk is perceived if the amount of benefits to be currently paid to annuitants is (significantly) higher than expected. A transfer arrangement can then be


designed so that the reinsurer takes charge of such an extra amount, or 'loss'. As in the previous case, the loss may be due to random fluctuations – here, this situation is more likely, given that annual outflows are directly referred to, instead of some accrual of outflows. By setting a trigger level for the reinsurer's intervention higher than the expected value of the amount of benefits, we would reduce the possible transfer of such a random risk component.

Reinsurance conditions should concern the following items:

– Let z be the time of issue (or revision) of the arrangement. The time horizon k of the reinsurance coverage should be stated, as well as the timing of the possible reinsurer's intervention within it. Within the time horizon k, policy conditions (i.e. premium basis, mortality assumptions, and so on) should be guaranteed. As to the timing of the intervention of the reinsurer, since reference is to annual outflows, it is reasonable to assume that a yearly timing is chosen. Hence, in the following, we will make this assumption.

– The mortality assumption for calculating the expected value of the outflow, required to define the loss of the cedant. Reasonably, we will adopt the current mortality table, which will be generically denoted as A(τ) in what follows.

– The minimum amount $\Lambda'_t$ of benefits (at time t, t = z+1, z+2, …, z+k) below which there is no payment by the reinsurer. For example,

  $\Lambda'_t = E[B^{(\mathcal{P})}_t \mid A(\tau), n_z]\,(1 + r) = b\, E[N_t \mid A(\tau), n_z]\,(1 + r)$   (7.56)

  with r ≥ 0 and b the annual amount for each annuitant; thus the amount $\Lambda'_t$ represents the priority of the Stop-Loss arrangement.

– The Stop-Loss upper limit, that is, an amount $\Lambda''_t$ such that $\Lambda''_t - \Lambda'_t$ is the maximum amount paid by the reinsurer at time t. From the point of view of the cedant, the amount $\Lambda''_t$ should be set high enough so that only situations of extremely high survivorship are charged to the cedant. However, the reinsurer reasonably sets $\Lambda''_t$ in connection with the available hedging opportunities. We will come back to this issue in Section 7.4.3. As to the cedant, a further reinsurance arrangement may be underwritten, if available, for the residual risk, possibly with another reinsurer; in this case, the amount $\Lambda''_t - \Lambda'_t$ operates as the first layer.

In Fig. 7.12, a typical situation is represented.

When we consider the features of this treaty, especially in relation to the Stop-Loss arrangement on assets, we note that measuring annual outflows is relatively easy, since this relies on some direct information about the


Figure 7.12. A Stop-Loss reinsurance arrangement on annual outflows (actual outflows against expected values over time, with the reinsurer's intervention between the priority and the upper limit).

portfolio (viz. the number of living annuitants, together with the annual amount of their benefits). On the other hand, as already pointed out, it is more difficult to avoid the transfer of random fluctuations as well.

We now define in detail the flows paid by the reinsurer. Let $B^{(SL)}_t$ denote such flow at time t, t = z+1, z+2, …, z+k. We have

$B^{(SL)}_t = \begin{cases} 0 & \text{if } B^{(\mathcal{P})}_t \le \Lambda'_t \\ B^{(\mathcal{P})}_t - \Lambda'_t & \text{if } \Lambda'_t < B^{(\mathcal{P})}_t \le \Lambda''_t \\ \Lambda''_t - \Lambda'_t & \text{if } B^{(\mathcal{P})}_t > \Lambda''_t \end{cases}$   (7.57)

The net outflow of the cedant at time t (gross of the reinsurance premium), denoted as $OF^{(SL)}_t$, is then

$OF^{(SL)}_t = B^{(\mathcal{P})}_t - B^{(SL)}_t = \begin{cases} B^{(\mathcal{P})}_t & \text{if } B^{(\mathcal{P})}_t \le \Lambda'_t \\ \Lambda'_t & \text{if } \Lambda'_t < B^{(\mathcal{P})}_t \le \Lambda''_t \\ B^{(\mathcal{P})}_t - (\Lambda''_t - \Lambda'_t) & \text{if } B^{(\mathcal{P})}_t > \Lambda''_t \end{cases}$   (7.58)
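Payoffs (7.57) and (7.58) form a standard reinsurance layer between the priority and the upper limit, which can be sketched in two lines (the function names are illustrative):

```python
def reinsurer_payment(B, lo, hi):
    """B_t^(SL) per (7.57): the layer of the outflow B between the
    priority lo and the upper limit hi."""
    return min(max(B - lo, 0.0), hi - lo)

def cedant_net_outflow(B, lo, hi):
    """OF_t^(SL) per (7.58): total benefits less the reinsured layer."""
    return B - reinsurer_payment(B, lo, hi)
```

With priority 100 and upper limit 150, an outflow of 120 leaves the cedant paying 100, while an outflow of 200 leaves it paying 150: the layer in between is fully transferred to the reinsurer.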

The net outflow of the cedant is clearly random but, unless some 'extreme' survivorship event occurs, it is protected with a cap. It is interesting (especially for comparison with the swap-like arrangement described subsequently) to comment on this outflow. First of all, it must be stressed that $B^{(\mathcal{P})}_t \le \Lambda'_t$ represents a situation of profit or small loss to the insurer. On the contrary, the event $B^{(\mathcal{P})}_t > \Lambda''_t$ corresponds to a huge loss. Whenever $\Lambda'_t < B^{(\mathcal{P})}_t \le \Lambda''_t$ a loss results for the insurer, whose severity may range from small (if $B^{(\mathcal{P})}_t$ is close to $\Lambda'_t$) to high (if $B^{(\mathcal{P})}_t$ is close to $\Lambda''_t$). So the effect of the Stop-Loss arrangement is to transfer to the reinsurer all of the loss situations, except for the lowest and the heaviest ones; any situation of profit, on the contrary, is kept by the cedant.

To further reduce the randomness of the annual outflow, the cedant may be willing to transfer to the reinsurer not only losses, but also profits. Thus, a reinsurance-swap arrangement on annual outflows can be designed. Let $B^*_t$ be a target value for the outflows of the insurer at time t, t = z+1, z+2, ..., z+k; for example,
\[
B^*_t = E[B_t \mid A(\tau), n_z]
\tag{7.59}
\]

where A(τ) is an appropriate mortality assumption and z is the time of issue of the reinsurance swap. Under the swap, if $B_t > B^*_t$ the cedant receives money from the reinsurer; otherwise, if $B_t < B^*_t$, then the cedant gives money to the reinsurer, so that the target outflow is reached.

Let $B^{(swap)}_t$ be the payment from the reinsurer to the cedant, defined as follows:
\[
B^{(swap)}_t = B_t - B^*_t
\tag{7.60}
\]

The annual outflow (gross of the reinsurance premium) for the cedant at time t is
\[
OF^{(swap)}_t = B_t - B^{(swap)}_t = B^*_t
\tag{7.61}
\]

The advantage for the cedant is to convert a random flow, $B_t$, into a certain flow, $B^*_t$; hence the term 'reinsurance-swap' that we have assigned to this arrangement. Figure 7.13 depicts a possible situation. Note that, ceteris paribus, this arrangement should be less expensive than the Stop-Loss treaty on outflows, given that the reinsurer participates not only in the losses, but also in the profits.

Although one advantage for the cedant of the reinsurance-swap is a possible price reduction, the cedant may be unwilling to transfer profits. On the other hand, the arrangement may be interesting for the reinsurer depending on the hedging tools available in the capital market (so that it could even be the only arrangement available on the reinsurance market); see Section 7.4.3 in this regard.

The design of the reinsurance-swap can be generalized by assigning two barriers $\Lambda'_t, \Lambda''_t$ (with $\Lambda'_t \le B^*_t \le \Lambda''_t$) such that
\[
B^{(swap\text{-}b)}_t =
\begin{cases}
B_t - \Lambda'_t & \text{if } B_t \le \Lambda'_t \\
0 & \text{if } \Lambda'_t < B_t \le \Lambda''_t \\
B_t - \Lambda''_t & \text{if } B_t > \Lambda''_t
\end{cases}
\tag{7.62}
\]

Figure 7.13. A reinsurance-swap arrangement. [Figure: actual and target annual outflows over time, with the payments from/to the cedant closing the gap between them.]

Clearly, when setting $\Lambda'_t = \Lambda''_t = B^*_t$ in (7.62), one finds (7.60) again. The net outflow (gross of the reinsurance premium) to the cedant is then
\[
OF^{(swap\text{-}b)}_t = B_t - B^{(swap\text{-}b)}_t =
\begin{cases}
\Lambda'_t & \text{if } B_t \le \Lambda'_t \\
B_t & \text{if } \Lambda'_t < B_t \le \Lambda''_t \\
\Lambda''_t & \text{if } B_t > \Lambda''_t
\end{cases}
\tag{7.63}
\]

It is interesting to compare (7.63) with (7.58). We have already commented on the implications of (7.58) for the profit/loss left to the cedant. Under (7.63), large losses as well as large profits are transferred to the reinsurer; therefore, both a floor and a cap are now applied to the profits/losses of the cedant.
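The reinsurance-swap with barriers, (7.62) and (7.63), admits an equally short sketch (again, names are ours, not from the book); note how the net outflow is both floored at the lower barrier and capped at the upper one.

```python
def swap_barrier_flows(b, lam1, lam2):
    """Return (reinsurer payment, cedant net outflow) for a
    reinsurance-swap with barriers lam1 <= lam2 on the annual outflow b."""
    if b <= lam1:
        b_swap = b - lam1        # negative: the cedant transfers a profit
    elif b <= lam2:
        b_swap = 0.0             # inside the barriers: no exchange
    else:
        b_swap = b - lam2        # positive: the reinsurer covers the excess loss
    return b_swap, b - b_swap    # net outflow: lam1 / b / lam2 in the three cases

# Setting lam1 = lam2 equal to the target outflow recovers the pure swap:
# the payment is b minus the target, and the net outflow is certain.
```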

So far we have not commented on the pricing of the reinsurance arrangements which have been examined. Actually, we will not enter into details regarding this subject, but just make some remarks.

The critical issue in pricing a reinsurance arrangement involving aggregate mortality risk is the pricing of longevity risk. As already commented in Section 7.2.3, many attempts have been devoted to this issue, but no generally accepted proposal is yet available. The stochastic model used extensively in this chapter, although useful for internal purposes (such as capital allocation), is not appropriate in general for pricing, due to the wide set of items to be chosen (alternative mortality scenarios, weights attached to such scenarios, and so on), as well as to the intrinsically static representation of stochastic mortality.

As far as the XL arrangement and the Stop-Loss treaty on assets are concerned, the adoption of traditional actuarial pricing methods, such as the percentile principle, is reasonable because of the traditional structure of the arrangement. Due to the context within which they could have to be realized (such as a compulsory backing of pension fund liabilities by a life insurer), the stochastic model used so far, with the set A(τ) suggested by some independent institution, can offer an acceptable representation of stochastic mortality also for pricing purposes.

The Stop-Loss arrangement on outflows and the reinsurance-swap, in contrast, have features very close to those of financial derivatives. As we have already noted, these arrangements can develop if they are properly backed by longevity-linked securities. So their pricing will depend on the pricing of the backing securities; attempts in this respect are still at an early stage.

The choice of a particular reinsurance arrangement clearly depends at first on what is available in the reinsurance market. In the case that more than one solution is available, attention should not be paid just to the price, but also to the benefits obtained in terms of reduction of the required capital. For the reasons discussed above, we are not going to compare the arrangements in terms of their price, but we conduct some numerical investigations concerning the capital requirements resulting from various reinsurance arrangements. Due to the practical interest that they might have, we consider just the Stop-Loss treaty on outflows and the reinsurance-swap arrangement. Given the earlier comments about solvency issues, we make use of an internal model, to account jointly (and consistently) for the risk of random fluctuations and the longevity risk.

Example 7.9 We refer to the assumptions of Example 7.6. As we have highlighted in discussing this example, if one wants to deal with a model recording the overall longevity risk, the proper time-horizon is the maximum residual duration of the portfolio. So we adopt requirement (7.38). At any valuation time, we assume that the flows until the end of the reinsurance period are the net outflows $OF^{(\cdot)}_t$, whilst after that time they are simply the annual payments $B_t$. Therefore, when assessing the required capital, we do not assume that the reinsurance arrangement will be automatically renewed.

Policy conditions have to be chosen specifically for the reinsurance arrangement; in particular, the two bounds $\Lambda'_t$ and $\Lambda''_t$ must be set differently in the Stop-Loss arrangement and in the reinsurance-swap with barriers. The following choices are adopted:
\[
\Lambda'_t = 1.1\, E[B_t \mid A3(\tau), n_z], \qquad
\Lambda''_t = 2\, E[B_t \mid A3(\tau), n_z]
\tag{7.64}
\]
for the former;
\[
\Lambda'_t = 0.75\, E[B_t \mid A3(\tau), n_z], \qquad
\Lambda''_t = 1.25\, E[B_t \mid A3(\tau), n_z]
\tag{7.65}
\]
for the latter. For the reinsurance-swap we set:
\[
B^*_t = E[B_t \mid A3(\tau), n_z]
\tag{7.66}
\]

For all of the arrangements, a 5-year reinsurance period has been chosen. To allow for some comparisons, we have assumed that at the beginning of each reinsurance period a premium must be paid by the cedant, assessed as the (unconditional) expected present value of future reinsurance flows. We should point out that this pricing principle does not make practical sense, given that no risk margin is included; however, with this approach, we can at least take into account the magnitude of the reinsurance premium. We assume further that the reinsurer and the cedant adopt the same mortality model, with the same parameters, and that the reserve must be fully set up by the cedant. The possible default of the reinsurer is disregarded when assessing the required capital.

In Table 7.29, we give the required capital (per unit of reserve) for the three arrangements, for different portfolio sizes, as well as for the case of no reinsurance arrangement (these latter results are taken from Table 7.23). Because of the increased certainty of the outflows during the reinsurance period, the lowest amount of required capital is found under the reinsurance-swap (with no barriers); but clearly, in such an arrangement the premium for the risk (which we have not considered) could be higher than in other cases. As already noted, due to the different parameter values, the outflows under the alternative arrangements are not directly comparable. It is interesting to note that most of the reduction in the required capital is gained at the oldest ages, roughly after the Lexis point. Indeed, the most severe part of the longevity risk is expected to emerge after this age. So, we can argue that the need for reinsurance emerges in particular at the oldest ages; at earlier ages, the risk could be managed through other RM tools. □

We conclude this section by describing an arrangement which (at least in principle) could help in realizing natural hedging across LOBs.

Table 7.29. Required capital, per unit of reserve: $M^{[R3]}_z / V_z$, with and without reinsurance

                 No reinsurance                      Stop-Loss on outflows
Time z    n0=100     n0=1,000   n0=10,000     n0=100     n0=1,000   n0=10,000
  0      12.744%     9.241%     8.103%       12.744%     9.241%     8.103%
  5      16.492%    11.938%    10.525%       16.492%    11.938%    10.525%
 10      21.333%    15.621%    13.890%       21.333%    15.621%    13.890%
 15      28.007%    20.372%    18.281%       27.603%    20.372%    18.281%
 20      37.456%    27.008%    24.131%       35.246%    26.230%    23.739%
 25      53.378%    36.113%    31.832%       44.356%    31.746%    28.433%
 30      81.037%    50.476%    42.140%       51.771%    35.389%    30.687%
 35     165.842%    77.890%    56.968%       58.926%    30.841%    25.540%

                 Reinsurance-swap, no barriers       Reinsurance-swap, with barriers
Time z    n0=100     n0=1,000   n0=10,000     n0=100     n0=1,000   n0=10,000
  0      12.451%     9.088%     8.002%       12.744%     9.241%     8.103%
  5      15.819%    11.571%    10.241%       16.492%    11.938%    10.525%
 10      20.138%    14.731%    13.196%       21.333%    15.621%    13.890%
 15      24.683%    18.440%    16.548%       28.007%    20.372%    18.281%
 20      30.776%    22.168%    19.918%       37.299%    27.008%    24.131%
 25      37.998%    25.280%    22.112%       49.855%    35.183%    31.413%
 30      45.167%    26.260%    21.452%       62.390%    41.945%    36.373%
 35      66.244%    27.762%    17.984%       89.438%    48.414%    37.579%

As was mentioned in Section 7.3.2, an appropriate diversification effect between life insurance and life annuities may be difficult to obtain by an insurer on its own. Intervention of a reinsurer can help in reaching the target and, inter alia, could provide a way for reinsurers to hedge the accepted longevity risk.

We sketch a simple situation, involving two insurers, labelled IA and IB respectively, and a reinsurer.

Insurer IA deals with life annuities. At time 0 a (total) single premium $S^A$ is collected from the issue of immediate life annuities; the overall annual amount paid at time t is $B_t$ (t = 1, 2, ...). Insurer IB deals with whole life insurances. Let us assume that annual premiums are payable up to the time of death and the benefit is paid at the end of the year of death; the total amount of premiums collected at time t is $P^B_t$ (t = 0, 1, ...), whilst the benefits falling due at time t (t = 1, 2, ...) over the portfolio are denoted as $C_t$. A reinsurance arrangement is underwritten by the two insurers with the same reinsurer, according to which

– at time t (t = 0, 1, ...) the reinsurer pays to insurer IA an amount equal to $P^B_t$, and at time 0 an amount equal to $S^A$ to insurer IB;

– at each time t (t = 1, 2, ...) the reinsurer receives from insurer IA an amount equal to $C_t$, and at time t (t = 1, 2, ...) receives from insurer IB an amount equal to $B_t$.

This would be a swap-like arrangement between life annuities and life insurances; Fig. 7.14 gives a graphical idea of the overall flows.

Figure 7.14. Flows in the swap-like arrangement between life annuities and life insurances. [Figure: premiums and benefits exchanged among annuitants, insureds, insurers IA and IB, and the reinsurer.]

Let us assume that the quantities introduced above are defined for each time t; in particular, $S^A_0 = S^A$ and $S^A_t = 0$ for t = 1, 2, ..., whilst $C_0 = 0$ and $B_0 = 0$. Then it turns out that at any time t, t = 0, 1, ..., the net cashflow for both insurer IA and insurer IB is $S^A_t + P^B_t - B_t - C_t$, whilst for the reinsurer the net cashflow is $B_t + C_t - S^A_t - P^B_t$. Each party has both a position in life annuities and one in life insurances, and therefore gains the benefit of natural hedging.

Practical difficulties inherent in such an arrangement are self-evident. Advantages may be weak, especially because of the incomplete hedging provided. It must also be pointed out that the actual duration of the life insurance covers may be shortened because of surrenders. Further, some reward has to be acknowledged to the reinsurer, which can reduce the advantages gained from the new position. However, this structure could represent a useful management framework within an insurance group, where the holding company could play the part of the reinsurer (with reduced fees charged to the counterparties).

A similar swap arrangement is described by Cox and Lin (2007), however without the explicit intervention of a reinsurer. Consider homogeneous portfolios, both for insurer IA and insurer IB. Therefore: $B_t = b\,N_t$ and $C_t = c\,D_t$, where b denotes the annual amount paid to each annuitant, c the death benefit paid to each whole life policyholder, $N_t$ the number of annuitants at time t in the portfolio of insurer IA, and $D_t$ the number of deaths in year (t−1, t) in the portfolio of insurer IB. Let $n^*_t$ and $d^*_t$ be two given benchmarks for the number of annuitants at time t for insurer IA and the number of deaths in year (t−1, t) for insurer IB, respectively. Insurers IA and IB agree that the flow $b \cdot \max\{N_t - n^*_t, 0\}$ is paid at time t by insurer IB to insurer IA, whilst the flow $c \cdot \max\{D_t - d^*_t, 0\}$ is paid at the same time by insurer IA to insurer IB. This way, insurer IA is protected against excess survivorship, whilst insurer IB is protected in respect of excess mortality. However, insurer IA is then exposed to excess mortality, whilst insurer IB to excess survivorship. Cox and Lin (2007) show through numerical assessments that some natural hedging effects are gained by both insurers, provided that the present values of future payments for life annuities and for life insurances are the same at the time of issue.
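The two mutual payments in this Cox–Lin style swap are simple option-like flows; the sketch below (our own naming, not from the paper or the book) computes both legs for a given year.

```python
def mutual_swap_payments(b, c, n_t, d_t, n_star, d_star):
    """Year-t payments in the mutual swap between an annuity writer (IA)
    and a life insurer (IB): IB pays IA on excess survivorship,
    IA pays IB on excess mortality."""
    to_ia = b * max(n_t - n_star, 0)   # paid by IB if annuitants exceed the benchmark
    to_ib = c * max(d_t - d_star, 0)   # paid by IA if deaths exceed the benchmark
    return to_ia, to_ib
```

With, say, 5 more annuitants alive than the benchmark and fewer deaths than expected, only the survivorship leg pays, protecting IA while leaving IB's mortality leg worthless that year.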

7.4 Alternative risk transfers

7.4.1 Life insurance securitization

Securitization consists in packaging a pool of assets or, more generally, a sequence of cash flows into securities traded on the market. The aims of a securitization transaction can be:

– to raise liquidity by selling future flows (such as the recovery of acquisition costs or embedded profits);

– to transfer risks whenever contingent payments or random cash flows are involved.

We note that, since new securities are issued, a counterparty risk arises (for the investor).

The organizational aspects of a securitization transaction are rather complex. Figure 7.15 sketches a simple design for a life insurance deal, focussing on the main agents involved. The transaction starts in the insurance market, where policies underwritten give rise to the cash flows which are securitized (at least in part). The insurer then sells the right to some cash flows to a special purpose vehicle (SPV), which is a financial entity that has been established to link the insurer to the capital market. Securities backed by the chosen cash flows are issued by the SPV, which raises monies from the capital market. Such funds are (at least partially) available to the insurer.

According to the specific features of the transaction, further items may be added to the structure. For example, a fixed interest rate could be paid to investors, so that the intervention of a swap counterparty is required; see Fig. 7.16.

Figure 7.15. The securitization process in life insurance: a simplified structure. [Figure: policyholders pay premiums to the insurer, which sells cash flows to the SPV; the SPV issues securities to the capital market in exchange for funding.]

Figure 7.16. The securitization process in life insurance: a more composite structure. [Figure: as in Fig. 7.15, with a credit enhancement mechanism providing a guarantee to the SPV against a premium, and a swap counterparty exchanging floating for fixed interest.]

As has been pointed out above, some counterparty risk is originated by the securitization transaction. This is due to the possible default of the insurer with respect to the obligations assumed against the SPV, as well as of the policyholders in respect of the insurer, in the form of surrenders and lapses (which may possibly affect the securitized cash flows). To reduce such default risks, some form of credit enhancement may be introduced, both internal (e.g. transferring to the SPV higher cash flows than those required by the actual size of the securities) and external, through the intervention of a specific entity (issuing, for example, credit insurance, letters of credit, and so on); see again Fig. 7.16. Further counterparty risk emerges from the other parties involved, similarly to any financial transaction. We note that the intervention of a third financial institution may result in an increase of the rating of the securities.


Further details of the securitization transaction concern services for payments provided by external bodies, investment banks trading the securities on the market, and so on. Since we are only interested in the main technical aspects of the securitization process, we do not go deeper into these topics (which, nevertheless, do play an important role in the success of the overall transaction).

7.4.2 Mortality-linked securities

Mortality-linked securities are securities whose pay-off is contingent on the mortality experienced in a given population; this is obtained, in particular, by embedding some derivatives whose underlying is a mortality index assessed on the given population. These securities may serve two opposite purposes: to hedge extra-mortality or extra-survivorship. In the former case, we will refer to them as mortality bonds, in the latter case as longevity bonds. We restrict the terminology to 'bond', without making explicit reference (in the name) to the derivative which is included in the security (which could be option-like, swap-like, or other), because we are more interested in the hedging opportunities rather than in the organizational aspects of the deal. We are aware of the importance that such aspects play from a practical point of view, but their discussion goes beyond the aims of this book.

Both for mortality and longevity bonds, a reference population is chosen, whose mortality rates are observed during the lifetime of the bond. The population may consist of a given cohort (as can be the case for longevity bonds) or a given mix of populations, possibly of different countries (typically this applies to mortality bonds). A mortality or a survivor index is defined, whose performance is assessed according to the mortality experienced in the reference population. Possible examples of an index are: the average mortality rate in one year's time (or a longer period) across the population, the number of survivors relative to the size of the population at the time of issue of the bond, and so on. The amount of the coupon is contingent on this index; in particular, the coupon may be higher or lower the higher is the index, depending on the specific bond design. In some cases, the principal may vary (in particular, be reduced) according to the mortality index. Specific cases are discussed below, separately for mortality and longevity bonds. We point out that, to avoid a lack of confidence in the way that the pay-off of the mortality-linked security is determined, mortality data should be collected and calculated by independent analysts; so typically general population mortality data are referred to instead of insurance data (we will come back later to this aspect).


Mortality bonds are designed as catastrophe bonds. The purpose is to provide liquidity in the case of mortality being in excess of what is expected, possibly owing to epidemics or natural disasters. So, typically a short position on the bond may hedge liabilities of an insurer/reinsurer dealing with life insurances.

Mortality bonds are typically short term (3–5 years) and they are linked to a mortality index expressing the frequency of mortality observed in the reference population in a given period. Some thresholds are normally set at bond issue. If the mortality index exceeds a threshold, then either the principal or the coupon is reduced. Although it is outside the scope of the discussion to deal with mortality risk in the portfolios of life insurances, we discuss in some detail possible structures for mortality bonds to give a comprehensive picture of the developing mortality-linked securities. In what follows, 0 is the time of issue of the bond and T its maturity. With $I_t$ we denote the mortality index after t years from bond issue (t = 0, 1, ..., T). Further, $S_t$ denotes the principal of the bond at time t and $C_t$ the coupon due at time t.

Mortality bond – example 1. The bond is designed to protect against high mortality experienced during the lifetime of the bond itself. This is obtained by reducing the principal at maturity. Although just some ages could be considered in detecting situations of high mortality, it is reasonable to address a range of ages. Further, the index should account for mortality over the whole lifetime of the bond. So the following quantities represent possible examples of a mortality index:
\[
I_T = \max_{t=1,2,\dots,T} \{ q(t) \}
\tag{7.67}
\]
\[
I_T = \frac{\sum_{t=1}^{T} q(t)}{T}
\tag{7.68}
\]
where q(t) is the annual frequency of death averaged across the chosen population in year t (we stress that although in our notation t is the time since the issue of the bond, the frequencies of death in (7.67) and (7.68) are recorded in specific calendar years, namely, in the 'calendar year of issue + t'). It is then reasonable to set $I_0 = q(0)$.

At maturity the principal paid back to investors is
\[
S_T = S_0 \times
\begin{cases}
1 & \text{if } I_T \le \lambda' I_0 \\
\Phi(I_T) & \text{if } \lambda' I_0 < I_T \le \lambda'' I_0 \\
0 & \text{if } I_T > \lambda'' I_0
\end{cases}
\tag{7.69}
\]

where $\lambda', \lambda''$ are two parameters (stated under bond conditions), with $1 \le \lambda' < \lambda''$, and $\Phi(I_T)$ is a decreasing function, such that $\Phi(\lambda' I_0) = 1$ and $\Phi(\lambda'' I_0) = 0$. For example,
\[
\Phi(I_T) = \frac{\lambda'' I_0 - I_T}{(\lambda'' - \lambda')\, I_0}
\tag{7.70}
\]

Note that λ′I0 and λ′′I0 represent two thresholds for the mortality index.
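The principal write-down (7.69), with the linear reduction (7.70), can be sketched as follows (a minimal illustration with our own names; i0 and i_T stand for the index at issue and at maturity):

```python
def principal_at_maturity(s0, i_T, i0, lam1, lam2):
    """Principal repaid at maturity under (7.69), using the linear
    reduction function (7.70); parameters satisfy 1 <= lam1 < lam2."""
    if i_T <= lam1 * i0:
        frac = 1.0                                       # full principal repaid
    elif i_T <= lam2 * i0:
        frac = (lam2 * i0 - i_T) / ((lam2 - lam1) * i0)  # linear write-down
    else:
        frac = 0.0                                       # principal wiped out
    return s0 * frac
```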

The coupon is independent of mortality; it could be defined as follows:
\[
C_t = S_0\,(i_t + r)
\tag{7.71}
\]
where $i_t$ is the market interest rate in year t (defined by the bond conditions) and r is an extra yield rewarding investors for taking mortality risk.

We note that for an insurer/reinsurer dealing with life insurances and taking a short position in the bond, in the case of high mortality experience, the high frequency of payment of death benefits is counterbalanced by a reduced payment to investors.

An example of this security is the mortality bond issued by Swiss Re; see, for example, Blake et al. (2006a).

Mortality bond – example 2. The flows of the bond described in the previous example try to match the flows in the life insurance portfolio just at the end of a period of some years. An alternative design of the mortality bond may provide a match on a yearly basis. This is obtained by letting the coupon depend on mortality. For example,
\[
C_t = S_0 \times
\begin{cases}
i_t + r & \text{if } I_t \le \Lambda'_t \\
(i_t + r)\,\varphi(I_t) & \text{if } \Lambda'_t < I_t \le \Lambda''_t \\
0 & \text{if } I_t > \Lambda''_t
\end{cases}
\tag{7.72}
\]

where $\Lambda'_t, \Lambda''_t$ set two mortality thresholds. For example,
\[
\Lambda'_t = \lambda'\, E[D_t \mid A], \qquad
\Lambda''_t = \lambda''\, E[D_t \mid A], \qquad
1 \le \lambda' < \lambda''
\tag{7.73}
\]
where $D_t$ is the number of deaths in year (t−1, t) in the reference population and $E[D_t \mid A]$ is its expected value according to the mortality assumption A. Clearly, in this structure the mortality index $I_t$ should measure the number of deaths in year (t−1, t). The function $\varphi(\cdot)$ should then be decreasing; for example,
\[
\varphi(I_t) = \frac{\Lambda''_t - I_t}{\Lambda''_t - \Lambda'_t}
\tag{7.74}
\]

As in (7.71), the rate r in (7.72) is an extra investment yield rewarding investors for the mortality risk inherent in the pay-off of the bond.
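The coupon rule (7.72), with thresholds built as in (7.73) and the linear function (7.74), can be sketched as follows (our own naming; the index is taken to be the observed number of deaths, as suggested in the text):

```python
def coupon_mortality_bond(s0, i_rate, r, d_t, e_deaths, lam1, lam2):
    """Coupon under (7.72), with thresholds (7.73) built from the expected
    number of deaths e_deaths, and the linear reduction (7.74)."""
    thr1 = lam1 * e_deaths                    # lower threshold
    thr2 = lam2 * e_deaths                    # upper threshold
    if d_t <= thr1:
        phi = 1.0                             # full coupon
    elif d_t <= thr2:
        phi = (thr2 - d_t) / (thr2 - thr1)    # linearly reduced coupon
    else:
        phi = 0.0                             # coupon wiped out
    return s0 * (i_rate + r) * phi
```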


For longevity bonds the critical situation is a mortality lower than expected or, in other terms, people outliving their expected lifetime. In contrast to the situation of extra-mortality, excess survivorship is not a sudden phenomenon, but rather a persistent situation. So longevity bonds are, by nature, long term.

Remark It is worthwhile stressing the difference between longevity bonds and (fixed-income) long-term bonds. While the former are financial securities whose performance is linked to some longevity index (see below for details), the latter are traditional bonds with, say, a 20–25 year maturity, and (usually) a fixed annual interest (or possibly an annual interest linked to some economic or financial index, for example an inflation index). Although not tailored to the specific needs arising from the longevity risk, long-term bonds can help in meeting obligations related to a life annuity portfolio. Actually, one of the most important problems in managing portfolios of life annuities (with a guaranteed benefit) consists in mitigating the investment risk through the availability of fixed-income long-term assets, to match the long-term liabilities. Clearly, this problem becomes more dramatic as the expected duration of the life annuities increases. □

Depending on its design, the longevity bond may offer hedging opportunities to an insurer/reinsurer dealing with life annuities through either a long or a short position. In the first case, the pay-off of the bond increases with decreasing mortality; vice versa in the second case. Given the long-term maturity, it is reasonable that the link is realized through the coupon, hence providing liquidity on a yearly basis. In the following, we therefore assume that the principal is fixed.

The reference population should be a given cohort, possibly close to retirement, that is, with age 60–65 at bond issue. Let $L_t$ be the number of individuals in the cohort after t years from issue, t = 0, 1, ...; viz, $L_0 = l_0$ is a known value. A maturity T may be chosen for the bond, with T high (e.g. T ≥ 85 − initial age). In the following, some possible designs for the coupons are examined.

Longevity bond – example 1. The easiest way to link the coupon to the longevity experience in the reference population is to let it be proportional to the observed survival rate. So
\[
C_t = C \times \frac{L_t}{l_0}
\tag{7.75}
\]

where C is a given amount (linking the size of the coupon to the principal of the bond). We note that in the case of unanticipated longevity the coupon increases faster than expected; so a long position should be taken by an insurer/reinsurer dealing with life annuities. A similar bond has been proposed by EIB/BNP Paribas, although it has not been traded on the market; see Blake, Cairns and Dowd (2006a) for details.

Longevity bond – example 2. In a similar way to the mortality bond (example 1 or 2), two thresholds may be assigned, expressing survival levels. If the number of survivors in the cohort exceeds such thresholds, then the amount of the coupon is reduced, possibly to 0. The following definition can be adopted:
\[
C_t = C \times
\begin{cases}
\dfrac{l''_t - l'_t}{l_0} & \text{if } L_t \le l'_t \\[6pt]
\dfrac{l''_t - L_t}{l_0} & \text{if } l'_t < L_t \le l''_t \\[6pt]
0 & \text{if } L_t > l''_t
\end{cases}
\tag{7.76}
\]

where $l'_t, l''_t$ are the two thresholds, expressing a given number of survivors. For example: $l'_t = \lambda'\, E[L_t \mid A(\tau)]$, $l''_t = \lambda''\, E[L_t \mid A(\tau)]$, where $1 \le \lambda' < \lambda''$ and A(τ) is a given mortality assumption for the reference cohort (assumed to be born in year τ). We note that, in this case, the lower is the mortality (i.e. the higher is $L_t$), the lower is the amount of the coupon. A short position should be taken to hedge life annuity outflows. A similar bond is described by Lin and Cox (2005).
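The coupon rule (7.76) can be sketched as follows (a minimal illustration with our own names; l_t is the observed number of survivors, thr1 < thr2 the two survival thresholds):

```python
def coupon_longevity_bond(c_base, l_t, l0, thr1, thr2):
    """Coupon under (7.76): the more survivors observed, the lower the coupon."""
    if l_t <= thr1:
        return c_base * (thr2 - thr1) / l0   # maximum coupon
    elif l_t <= thr2:
        return c_base * (thr2 - l_t) / l0    # coupon shrinks as survivors grow
    else:
        return 0.0                           # survivors above thr2: no coupon

# A short position in this bond hedges a life annuity portfolio: when
# survivorship is high, the coupon the hedger must pay out is low or zero.
```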

Longevity bond – example 3. The coupon can be set proportional to the number of deaths observed in the reference cohort from issue. For example,
\[
C_t = C \times \frac{l_0 - L_t}{l_0}
\tag{7.77}
\]

where $l_0 - L_t$ is the observed number of deaths up to time t. In contrast to the previous case, no target is set for such a number. Clearly, also in this case a short position should be taken to hedge longevity risk.

We will discuss in more detail how to hedge longevity risk through longevity bonds in Section 7.4.3. We now address some market issues.

There are many difficulties in developing a market for longevity bonds. A first issue concerns who might be interested in issuing/investing in bonds that offer hedging opportunities to insurers/reinsurers. In general terms, one could argue that such securities may offer diversification opportunities, in particular because of their low correlation with standard financial market risk factors. Further, they may give long-term investment opportunities, which may be rarely available. From the point of view of the issuer of bonds like example 1, the possibility of building a longevity bond depends, however, on the availability of financial securities with an appropriate maturity to match the payments promised under the longevity bond.


A further issue, already mentioned, concerns the choice of mortality data. To encourage confidence in the linking mechanism, reference to insurance data should be avoided. Data recorded and analysed by an independent body should rather be adopted. This raises an issue of basis risk for hedgers (see Section 7.4.3). Conversely, there are many weak points in a mechanism linking the pay-off of the bond to insurance data; among these, we mention the following: insurance data may be affected in particular by insurers/reinsurers with large portfolios, so that some manipulation of data may be feared by investors; due to commercial reasons, the mix of the insured population may change over time, whilst reference to the general population offers more stability.

A final aspect (but not the least in terms of importance) concerns the pricing of the longevity risk transferred to the capital market. Also in this respect there are many difficulties. First, a generally accepted model for stochastic mortality is not yet available (see Section 7.2.3). Second, a market is not yet developed, nor are similar risks traded in the market itself. So, even if there were common agreement on a pricing model, data to estimate the relevant parameters are not yet available. Three theoretical approaches have been proposed in the literature: distortion measures, risk-neutral modelling, and incomplete markets. Research in this respect is still at an early stage, and open issues remain that require careful investigation. See Section 7.6 for some examples, and Section 7.8 for references.

7.4.3 Hedging life annuity liabilities through longevity bonds

We refer here to an insurer or a reinsurer dealing with immediate life annuities. In the case of an insurer, we refer to the portfolio described in Section 7.2.4; in the case of a reinsurer, we assume that support is provided to an insurer with a portfolio like the one described in Section 7.2.4. We have already noted (see Section 7.3.3) that heterogeneous annual amounts mainly impact on random fluctuations. Therefore, when managing mortality risk in a life annuity portfolio, the insurer should first underwrite some traditional surplus reinsurance to reduce the dispersion of annual amounts in its portfolio. In the following, we will assume that such action has been taken; so, unless otherwise stated, we make reference to a homogeneous life annuity portfolio, where $b^{(j)} = b$ for each annuitant j. We recall that in this case $B_t = b\,N_t$.

The insurer/reinsurer faces the random outflows $B^{(\cdot)}_t$ and counterbalances them with random flows $F_t$, such that the net outflows $B^{(\cdot)}_t - F_t$ are close to some target outflows $OF^*_t$. If the hedging is pursued by an insurer, then reference is to the original outflows $B_t$ of the life annuity portfolio. If the hedging is realized by a reinsurer, then reference is to the outflows $B^{(SL)}_t$, $B^{(swap)}_t$, or $B^{(swap\text{-}b)}_t$, depending on the reinsurance arrangement dealt with. In the following, we discuss how the target $OF^*_t$ can be set and reached according to the hedging tools available in the market. For the sake of brevity, we assume that the longevity bond is issued at the same time as the life annuities; some comments will follow in this regard. Thus, unless otherwise stated, time 0 will be the time of issue of the life annuities and the bond.

We first consider the case of a longevity bond with coupon (7.75). An insurer dealing with immediate life annuities should buy k units of such a bond at time 0, so that F_t = k C_t > 0 at time t = 1, 2, ... . The net outflow for the insurer at time t, t = 1, 2, ..., is then

OF_t^{(LB)} = B_t − k C_t        (7.78)

which can be rewritten as

OF_t^{(LB)} = b n_0 (N_t / n_0) − k C (L_t / l_0)        (7.79)

We assume that N_t/n_0 = L_t/l_0 for any time t; this means that the mortality of annuitants is perfectly replicated by the mortality in the reference population. The net outflow to the insurer then becomes

OF_t^{(LB)} = (L_t / l_0) (b n_0 − k C)        (7.80)

Note that the net outflow is still random because of the dependence on L_t. However, if k = b n_0 / C then the term b n_0 − k C reduces to zero, and a situation of certainty is achieved (i.e. the hedging would be perfect); the target outflow for this situation is therefore OF*_t = 0.

In practical terms, perfect hedging is difficult to realize. Although we can rely on some positive correlation between the survival rate in the reference population, L_t/l_0, and that in the annuitants' cohort, N_t/n_0, it is unrealistic that they coincide in each year, because the annuitants are usually not representative of the reference population. In particular, the year of birth of the reference cohort and that of the annuitants may differ. This mismatching leads to basis risk in the strategy for hedging longevity risk.
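The mechanics of (7.78)–(7.80), and the residual exposure left by basis risk, can be sketched numerically. All figures below (annual amount, portfolio and cohort sizes, coupon scale C) are illustrative assumptions, not values taken from the text:

```python
# Illustrative sketch of the longevity bond hedge (7.78)-(7.80).
# All parameter values are assumptions chosen for the example.

b = 100.0      # annual amount per annuitant
n0 = 1000      # initial number of annuitants
l0 = 100000    # initial size of the reference cohort
C = 50000.0    # coupon scale: coupon C_t = C * L_t / l0

k = b * n0 / C  # bond units for a perfect hedge, k = b n0 / C

def net_outflow(N_t, L_t):
    """OF_t = b N_t - k C L_t / l0, as in (7.79)."""
    return b * N_t - k * C * L_t / l0

# Perfect replication, N_t/n0 = L_t/l0: the net outflow vanishes.
print(round(net_outflow(900, 90000), 6))   # 0.0
# Basis risk: annuitants survive more than the reference cohort.
print(round(net_outflow(920, 90000), 6))   # 2000.0 left unhedged
```

With k = b n_0/C the hedge is exact only while the annuitants' survival tracks the reference cohort; the second call shows the amount left unhedged when it does not.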

A second aspect concerns the lifetime of the bond. Typically, the bond is not issued when the life annuity payments start. If it is issued earlier, the previous relations still hold, just with an appropriate redefinition of the quantities l_0 and L_t; the problem in this case would consist in the availability of the bond, in the required size, in the secondary market. If the bond is issued later than the life annuities, the longevity risk of the insurer would be unhedged for some years (but in a period when annuitants are still young, and longevity risk is therefore not too severe). In both cases, the basis risk may be stronger, since it is more likely that the years of birth of the annuitants and of the reference population differ. The critical aspect of the lifetime of the bond is its maturity, T. Realistically, T is a finite time, so that the hedge in (7.79) can be realized just up to time T (and not for any time t). The insurer has to plan a further purchase of longevity bonds after time T; however, the availability of bonds, in particular with the features required for the hedging, is not certain. If further longevity bonds are available in the future, the basis risk may worsen in time, given that each bond issue is likely to refer to a new cohort of retirees.

7.4 Alternative risk transfers 339

We now move to longevity bonds with coupons (7.76) and (7.77). As already mentioned in Section 7.4.2, such bonds require a short position to hedge longevity risk. This position is, however, difficult for an insurer (or other annuity provider) to realize on its own, because of the complexity of the deal. It is reasonable to assume that some form of reinsurance is purchased by the annuity provider. The reinsurer, who transacts business on a larger scale than the insurer, then hedges its position through longevity bonds, typically issued by an SPV (see Fig. 7.17).

Let us assume that a reinsurer is able to issue a bond with coupon (7.76). The reinsurer should be willing, in this case, to underwrite the Stop-Loss arrangement on annual outflows, whose reinsurance flows are described

Figure 7.17. Longevity risk transfer from the annuity provider to the capital market. [Diagram: annuitants pay premiums to the annuity provider and receive annual payments; the annuity provider pays a premium to the reinsurer and receives benefits; the reinsurer pays a premium to the SPV and receives benefits; the SPV sells the longevity bond to the capital market, receiving the income from the bond sale and paying coupons and principal.]


by (7.57). Thus, the longevity bond should offer hedging opportunities against the liabilities of the reinsurer in respect of the insurer, as we will demonstrate.

Assume that the reinsurer matches the outflow B_t^{(SL)} arising from the reinsurance arrangement with a short position on k units of the longevity bond with coupon (7.76). In this case, F_t = −k C_t < 0 at time t = 1, 2, ... . If the underlying life annuity portfolio is homogeneous in respect of annual amounts, the net outflow of the reinsurer, NF_t^{(SL)}, is

NF_t^{(SL)} = B_t^{(SL)} + k C_t

            = { 0                  if b N_t ≤ Λ'_t
              { b N_t − Λ'_t       if Λ'_t < b N_t ≤ Λ''_t
              { Λ''_t − Λ'_t       if b N_t > Λ''_t

            + k C × { (l''_t − l'_t)/l_0   if L_t ≤ l'_t
                    { (l''_t − L_t)/l_0    if l'_t < L_t ≤ l''_t
                    { 0                    if L_t > l''_t        (7.81)

Since we are aiming at perfect hedging, the thresholds Λ'_t, Λ''_t in the reinsurance arrangement are reasonably chosen according to the features of the longevity bond. So we assume that Λ'_t = (l'_t/l_0) b n_0 and Λ''_t = (l''_t/l_0) b n_0.

We can rewrite (replacing the relevant quantities and rearranging)

NF_t^{(SL)} = b n_0 × { 0                       if N_t/n_0 ≤ l'_t/l_0
                      { N_t/n_0 − l'_t/l_0      if l'_t/l_0 < N_t/n_0 ≤ l''_t/l_0
                      { (l''_t − l'_t)/l_0      if N_t/n_0 > l''_t/l_0

            + k C × { (l''_t − l'_t)/l_0        if L_t/l_0 ≤ l'_t/l_0
                    { (l''_t − L_t)/l_0         if l'_t/l_0 < L_t/l_0 ≤ l''_t/l_0
                    { 0                         if L_t/l_0 > l''_t/l_0        (7.82)

If N_t/n_0 = L_t/l_0, this reduces to

NF_t^{(SL)} = { k C (l''_t − l'_t)/l_0                          if L_t/l_0 ≤ l'_t/l_0
              { b n_0 (L_t − l'_t)/l_0 + k C (l''_t − L_t)/l_0  if l'_t/l_0 < L_t/l_0 ≤ l''_t/l_0
              { b n_0 (l''_t − l'_t)/l_0                        if L_t/l_0 > l''_t/l_0        (7.83)

Figure 7.18. Flows for a reinsurer dealing with a Stop-Loss arrangement on annual outflows and issuing a longevity bond – example 2. [Plot of annual outflows against time: the annuity outflows are split by the priority and the upper limit into a flow to investors and a flow to the insurer.]

Further, if k = b n_0/C, then

NF_t^{(SL)} = { b n_0 (l''_t − l'_t)/l_0   if L_t/l_0 ≤ l'_t/l_0
              { b n_0 (l''_t − l'_t)/l_0   if l'_t/l_0 < L_t/l_0 ≤ l''_t/l_0
              { b n_0 (l''_t − l'_t)/l_0   if L_t/l_0 > l''_t/l_0

            = b n_0 (l''_t − l'_t)/l_0        (7.84)

which is a non-random situation. A graphical representation is provided in Fig. 7.18.

The assumptions on which such a perfect hedging strategy is based are the same as those adopted for the longevity bond – example 1, that is:

– the survival rate in the annuitant population, N_t/n_0, is the same as that observed in the reference population, L_t/l_0;

– the lifetime of the bond coincides with the lifetime of the life annuity portfolio; in particular, no maturity has been set.

It is clear that such conditions are unrealistic, so that the reinsurer transfers just part of the longevity risk to investors. In any case, the target outflow in setting the hedging strategy in this case is OF*_t = b n_0 (l''_t − l'_t)/l_0. A similar strategy is described by Lin and Cox (2005), albeit without calling explicitly for a reinsurance arrangement between an insurer and a reinsurer.


We note that the unavailability of a longevity bond which perfectly matches the reinsurer's liability suggests that the reinsurance arrangement should be underwritten just for a finite time, as we have considered in Section 7.3.4. At any renewal time, the pricing of the arrangement, as well as the relevant conditions, can be updated to take account of the current availability of hedging tools.

If the reinsurer is able to issue a bond with coupon (7.77), then the reinsurance-swap arrangement can be hedged. We assume that the reinsurer takes a short position on k units of the longevity bond with coupon (7.77) (note that in this case, similarly to the previous one, F_t = −k C_t < 0). Underwriting jointly a reinsurance-swap arrangement, the net flow of the reinsurer is

NF_t^{(swap)} = B_t − B*_t + k C_t        (7.85)

First, we refer to a homogeneous life annuity portfolio and note that the target outflow (7.59) for the insurer under the reinsurance-swap can be restated as

B*_t = b E[N_t | A(τ), n_z] = b n*_t        (7.86)

and so the net flow for the reinsurer can be rewritten as

NF_t^{(swap)} = b n_0 (N_t/n_0) − b n*_t + k C (l_0 − L_t)/l_0

              = k C − b n*_t + b n_0 (N_t/n_0) − k C (L_t/l_0)        (7.87)

If N_t/n_0 = L_t/l_0 and k = b n_0/C, then

NF_t^{(swap)} = k C − b n*_t = b (n_0 − n*_t)        (7.88)

which is again non-random. We note that the net outflow of the reinsurer is proportional to the number of deaths assumed as a target in the reinsurance-swap, namely, n_0 − n*_t. Clearly, OF*_t = b (n_0 − n*_t) is the target outflow for the hedging strategy. A graphical representation is provided in Fig. 7.19. Remarks on the possibility of realizing a perfect hedging are as in the previous cases.

The impossibility of relying on a perfect hedging strategy suggests adopting the reinsurance-swap arrangement with flows (7.62) instead of (7.60). The arrangement (7.62) could also be justified by a hedging strategy involving several positions on longevity-linked securities.

Figure 7.19. Flows for a reinsurer dealing with a reinsurance-swap arrangement and issuing a longevity bond – example 3. [Plot of annual outflow against time: around the level b n_0, the flows split into a flow to investors, a flow from/to the cedant, and the net outflow.]

We conclude this section by recalling that whenever longevity risk is transferred to some other entity, either to the issuer of a longevity bond or to a reinsurer, a default risk arises for the insurer. This aspect should be accounted for when allocating capital for the residual longevity risk borne by the insurer itself.

7.5 Life annuities and longevity risk

7.5.1 The location of mortality risks in traditional life annuity products

So far in this chapter we have dealt with longevity risk referring to a portfolio of immediate life annuities. The need for taking into account uncertainty in future mortality trends, and hence for a sound management of the impact of longevity risk, has clearly emerged.

However, life annuity products other than immediate life annuities are sold in a number of insurance markets and, in many products, the severity of longevity risk can be even higher than what has emerged in the previous investigations. We now introduce some remarks considering cases other than immediate life annuities.

The technical features of several types of life annuities have already been examined in Chapter 1, and the relevant traditional pricing tools as well (see, in particular, Section 1.6). Unsatisfactory features of such models can be easily understood if one analyses the models themselves from the perspective of a dynamic mortality scenario. In this section, we develop some general comments on the pricing of life annuities allowing for longevity risk; a few examples are then mentioned in Section 7.6.

In Section 1.6, we recalled that in the traditional guaranteed life annuity product the technical basis is stated when the premiums are fixed. So

(a) a deferred life annuity with (level) annual premiums implies the highest longevity risk borne by the insurer, as the technical basis is stated at policy issue (hence, well before retirement);

(b) a single premium immediate life annuity implies the lowest longevity risk, as the technical basis is stated at retirement time only;

(c) the arrangement with single recurrent premiums represents an intermediate solution, given that the technical basis can be stated specifically for each premium.

It follows that a stronger safety loading is required for solution (a) than for (b), with solution (c) at some intermediate level. Clearly, in order to calculate properly the safety loading required for the implied longevity risk, some pricing model is needed. Alternatively, policy conditions that allow for a revision of the technical basis should be included in the policy, as will be commented on later.

As recalled in Section 1.6, in case (b) the accumulation of the amount funding an immediate life annuity can be obtained through some insurance saving product, for example, an endowment insurance. In particular, a package can be offered in which an endowment for the accumulation period is combined with an immediate life annuity for the decumulation period.

Combining an endowment insurance with a life annuity provides the policyholder with

(a) an insurance cover against the risk of early death during the working period;

(b) a saving instrument for accumulating a sum at retirement, to be (partly) converted into a life annuity;

(c) a life annuity throughout the whole residual lifetime.

It is interesting to analyse the risks involved in this product from the point of view of the insurance company (see Fig. 7.20); we refer just to the flows given by net premiums and benefits (hence we disregard risks connected to expenses and other aspects). Consistent with the notation in Section 1.6, we let 0 be the time of issue of the endowment, n the maturity of the endowment and the retirement time as well, and x the age at time 0.

Figure 7.20. Risks in an endowment combined with a life annuity. [Diagram over time from 0 to n and beyond: during the accumulation period, the reserve and the sum at risk (death benefit C_1) carry mortality risk and risk of surrender; at time n, annuitization risk; during the post-retirement period, the reserve carries mortality risk (longevity risk included) and investment risk.]

During the accumulation period, that is, throughout the policy duration of the endowment, the insurer in particular bears:

– the investment risk, related to the mathematical reserve of the endowment, if some financial guarantee operates, involving for example a minimum interest rate guarantee;

– the (extra-)mortality risk, related to the sum at risk;

– the risk of surrender, related to the amount of the reserve, if some guarantee on the surrender price (usually expressed as a share of the reserve) is given.

During the decumulation period, as the annual amount is usually guaranteed, the insurer bears:

– the investment risk, related to the mathematical reserve of the annuity, if a minimum interest rate guarantee is operating;

– the (under-)mortality risk, and in particular the longevity risk.

At retirement time, if some guarantee has been given on the annuitization rate, the insurer bears the risk connected to the option to annuitize. This aspect is discussed in more detail in Section 7.5.2.


As regards the longevity risk, the time interval throughout which the insurer bears the risk clearly coincides with the time interval involved by the immediate life annuity, if the annuity rate 1/a_{x+n} is stated, and hence guaranteed, at retirement time only. We recall that the annuity rate converts the sum at maturity S (used as a single premium) into a life annuity of annual amount b according to the relation b = S/a_{x+n} (see (1.57)).

Even if the annuity rate is stated at time n only, it is worth noting that the endowment policy contains an 'option to annuitize'. Apart from the severity of the longevity risk implied by the guarantee on the annuity rate, the presence of this option exposes the insurer to the risk of adverse selection, as most of the policyholders annuitizing the maturity benefit will be in good health (see Section 1.6.5).

7.5.2 GAO and GAR

The so-called guaranteed annuity option (GAO) (see Section 1.6.2) entitles the policyholder to choose at retirement between the current annuity rate (i.e. the annuity rate applied at time n for pricing immediate life annuities) and the guaranteed one.

By definition, the GAO condition implies a guaranteed annuity rate (GAR). In principle, the GAR can be stated at any time t, 0 ≤ t ≤ n. In practice, the GAR stated at policy issue, that is, at time 0, constitutes a more appealing feature of the life insurance product. If the GAR is stated at time n only, the GAO vanishes and the insurance product simply provides the policyholder with a life annuity with a guaranteed annual amount. Whatever the time at which the GAR is stated, the life annuity provides a guaranteed benefit, so that it can be referred to as a guaranteed annuity (see Fig. 7.21).

Conversely, the expression non-guaranteed annuity denotes a life annuity product in which the technical basis (and in particular the mortality basis) can be changed during the annuity payment period; in practice, this means that the annual amount of the annuity can be reduced, according to the mortality experience. Clearly, such an annuity is a rather poor product from the point of view of the annuitant.

Figure 7.21. GAO, GAR and Guaranteed Annuity. [Diagram: a GAO implies a GAR at some time t (0 ≤ t ≤ n); a GAR at time n alone, without the option, likewise yields a guaranteed annuity.]

As a consequence of the GAR, the insurer bears the longevity risk (and the market risk, as the guarantee concerns both the mortality table and the rate of interest) from the time at which the guaranteed rate is stated onwards. Obviously, the longevity (and the market) risk borne by the insurer decreases as the time at which the guaranteed rate is stated increases.

The importance of an appropriate pricing of a GAO, and therefore of an appropriate setting of a GAR, is witnessed by the default of Equitable Life. The unanticipated decrease in interest and mortality rates experienced during the 1990s let the GAOs issued by Equitable during the 1980s become deeply in the money at the end of the 1990s. As a consequence, in 2000 Equitable was forced to close to new life and pension business.

Pricing a life annuity product within the GAR framework requires the use of a projected mortality table. The more straightforward (and traditional) approach to pricing the guarantee consists of adopting a table that includes a safety loading to meet mortality improvements higher than expected. One should, however, be aware that the possibility of unanticipated mortality improvements reduces the reliability of such a safety loading (as happened to Equitable). A more appropriate approach requires a pricing model explicitly allowing for the longevity risk borne by the insurer, rather than a roughly determined safety loading; see Section 7.6.

7.5.3 Adding flexibility to GAR products

A rigorous approach to pricing a GAR product usually leads to high premium rates, which would not be attractive to potential clients. Conversely, lower premiums leave the insurer heavily exposed to unexpected mortality improvements. In both cases, however, adding some flexibility to the life annuity product can provide interesting solutions to the problem of pricing guaranteed life annuities. In what follows we focus on some practicable solutions.

Assume that the insurer decides to set the GAR 1/a[1]_{x+n}(h) at time h (0 ≤ h < n) for a deferred life annuity to be paid from time n. Suppose that a[1]_{x+n}(h) is lower than the corresponding output of a rigorous approach to GAR pricing. If an amount S is paid at time n as a single premium, the resulting annual amount of the life annuity is given by

b[1] = S / a[1]_{x+n}(h)        (7.89)

Figure 7.22. Annual amount in a conditional GAR product. [Plot of the annual amount against time: b[1] applies from time n; if a new projected table appears at time r (h < r ≤ n), the amount is reduced to b'[1].]

Assume that the insurer promises to pay the annual amount b[1] from time n on, with the proviso that no dramatic improvement in the mortality experienced occurs before time n. Conversely, if such an improvement is experienced (and it results, for example, from a new projected life table available at time r, h < r ≤ n), then the insurer can reduce the annual amount to a lower level b'[1] (see Fig. 7.22). So a policy condition must be added, leading to a conditional GAR product. Some constraints are usually imposed (e.g. by the supervisory authority); in particular:

(a) the mortality improvement must exceed a stated threshold (e.g. in terms of the increase in the life expectancy at age 65);

(b) r ≤ n − 2, say;

(c) no more than one reduction can be applied in a given number of years;

(d) whatever the mortality improvements may be, the reduction in the annual amount must be less than or equal to a given share ρ, that is,

(b[1] − b'[1]) / b[1] ≤ ρ        (7.90)

so that, combining (c) and (d), a guarantee of minimum annual amount works. Conversely, from time n the annual amount is guaranteed, irrespective of any mortality improvement which may be recorded afterwards.
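The guarantee of a minimum annual amount obtained by combining (c) and (d) can be sketched as follows; the premium S, the rate a[1]_{x+n}(h) and the share ρ are hypothetical values, not from the text:

```python
# Illustrative sketch of the conditional GAR constraints (c)-(d):
# by (7.90), the paid amount can never fall below b[1]*(1 - rho).
# S, the annuity rate and rho are hypothetical figures.

S = 100000.0
a1 = 16.0        # hypothetical value of a[1]_{x+n}(h)
rho = 0.10       # maximum admissible relative reduction

b1 = S / a1      # guaranteed annual amount, as in (7.89): 6250.0

def paid_amount(b_proposed):
    """Floor any proposed reduction at the minimum b1*(1 - rho)."""
    return max(b_proposed, b1 * (1.0 - rho))

print(paid_amount(6000.0))             # 6000.0: a 4% cut is admissible
print(round(paid_amount(5000.0), 2))   # 5625.0: capped by (7.90)
```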

Figure 7.23. Annual amount in a participating GAR product. [Plot of the annual amount against time: from the guaranteed level b[2], the amount is raised to b'[2] at time s (s > n), when the experienced mortality turns out to be higher than expected.]

Let us now turn to the case in which the insurer charges a rigorous (i.e. lower) annuity rate 1/a[2]_{x+n}(h). Hence, the annuity amount is given by

b[2] = S / a[2]_{x+n}(h)        (7.91)

with b[2] < b[1].

Suppose that, at time s (s > n), statistical observations reveal that the experienced mortality is higher than expected, because of a mortality improvement lower than forecasted. Hence, a mortality profit is going to emerge from the life annuity portfolio. Then, the insurer can decide to share part of the emerging profit among the annuitants, by raising the annual amount from the (initial) guaranteed level b[2] to b'[2] (see Fig. 7.23). This mechanism leads to a with-profit GAR product (or participating GAR product).

Participation mechanisms work successfully in a number of life insurance and life annuity products as far as distributing investment profits is concerned. Conversely, mortality profit participation is less common. Notwithstanding this, important examples are provided by mortality profit sharing in group life insurance and, as regards the life annuity business, by the participation mechanisms adopted in the German annuity market. The critical point is that, in contrast to what happens for products with participation in investment profits and in mortality profits in life insurance, the people participating in mortality profits in life annuity portfolios are not those who have generated such profits; so, a tontine scheme emerges (see Section 1.4.3).


Figure 7.24. Annual amount in a product with conditional GAR in the decumulation period. [Plot of the annual amount against time: the amount b[3] is reduced to b[2] at time s (s > n), when the experienced mortality turns out to be lower than expected.]

It is worth noting that, from a technical point of view, a policy condition similar to the conditional GAR may work also during the decumulation period. In this case, the amount of the benefit (possibly assessed at retirement time with an annuity rate higher than that resulting from a rigorous approach to GAR pricing) may be reduced in the case of strong unanticipated improvements in mortality. It would be reasonable to fix a minimum benefit level in this case.

As an illustration, assume that the amount b[2] resulting from (7.91) is considered the level of benefit consistent with a rigorous approach to GAR pricing. However, considering that the implied safety loading could turn out to be too severe with respect to the actual mortality experienced, the insurer is willing to pay the annual benefit b[3], with b[3] > b[2]. If, after time n, a strong mortality improvement is recorded, then the insurer will reduce the annual amount down to b[2] (see Fig. 7.24). Constraints similar to (a) and (c) for the conditional GAR in the accumulation period should be applied. From a commercial point of view, care should be taken in making clear to the annuitant that the guaranteed benefit is b[2] and not b[3]. However, a tontine scheme emerges, given that in some sense a participation in losses is realized.

7.6 Allowing for longevity risk in pricing

As already pointed out, we are not going to discuss in detail the problem of pricing long-term living benefits allowing for longevity risk. Indeed, the unsolved issues are too important and complex to allow for a complete description in the present chapter: for example, there are different opinions on evolving mortality, and hence on the appropriate stochastic model to allow for uncertain mortality trends, and the data for estimating the main parameters are unavailable.

On the other hand, pricing models for longevity risk are required when dealing with life annuities and longevity bonds. Therefore, in this section we summarize a few of the main proposals which have been described in the literature. However, this is a subject which has been developing in the recent literature, and we do not aim to give a comprehensive illustration of the several proposals that have been put forward.

We first address the present value of life annuities. Denuit and Dhaene (2007) and Denuit (2007) allow for randomness in the probabilities of death within a Lee–Carter framework. Due to the importance of such a framework, we briefly describe their approach. Let us adopt the standard Lee–Carter framework, where the future forces of mortality are decomposed in a log-bilinear way (see Section 4.7.2). Specifically, the death rate at age x in calendar year t is of the form exp(α_x + β_x κ_t), where κ_t, in particular, is a time index reflecting the general level of mortality.

We denote by hP_{x0}(t0) the random h-year survival probability for an individual aged x0 in year t0, that is, the conditional probability that this individual reaches age x0 + h in year t0 + h, given the κ_t's. Adopting assumptions (3.2) (from which (3.13) holds), such probability is formally defined as

hP_{x0}(t0) = exp( − Σ_{s=0}^{h−1} m_{x0+s}(t0 + s) )

            = exp( − Σ_{s=0}^{h−1} exp(α_{x0+s} + β_{x0+s} κ_{t0+s}) )        (7.92)

We refer to a basic life annuity contract paying the annual amount b = 1 at the end of each year, as long as the annuitant survives. The present value of such an annuity is the expectation of the payments made to an annuitant aged x0 in year t0, conditional on a given time index; it is calculated as

a_{x0}(t0) = Σ_{h=1}^{ω−x0} hP_{x0}(t0) v(0, h)

           = Σ_{h=1}^{ω−x0} exp( − Σ_{s=0}^{h−1} exp(α_{x0+s} + β_{x0+s} κ_{t0+s}) ) v(0, h)        (7.93)

352 7 : The longevity risk: actuarial perspectives

where v(0, h) is the discount factor, that is, the present value at time 0 of a unit payment made at time h. We note that a_{x0}(t0) is a random variable, since it depends on the future trajectory of the time index (i.e. on κ_{t0}, κ_{t0+1}, κ_{t0+2}, ...). We note also that (7.93) generalizes (1.27).

The distribution function of a_{x0}(t0) is difficult to obtain. Useful approximations have been proposed by Denuit and Dhaene (2007) and Denuit (2007). Specifically, Denuit and Dhaene (2007) have proposed comonotonic approximations for the quantiles of the random survival probabilities hP_{x0}(t0). Since the expression for a_{x0}(t0) involves a weighted sum of the hP_{x0}(t0) terms, Denuit (2007) supplemented the first comonotonic approximation with a second one. This second approximation is based on the fact that the hP_{x0}(t0) terms are expected to be closely dependent for increasing values of h, so that it may be reasonable to approximate the vector of random survival probabilities with its comonotonic version.
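Alongside the comonotonic approximations, the distribution of a_{x0}(t0) can also be explored by brute-force simulation. The sketch below assumes a toy Lee–Carter schedule (α_x, β_x) and a random walk with drift for κ_t; none of the parameter values comes from the text:

```python
# Illustrative sketch of (7.92)-(7.93): simulating the random annuity
# value a_x0(t0) under a toy Lee-Carter model. The schedules alpha_x,
# beta_x and the kappa dynamics are assumptions, not fitted values.
import math
import random

random.seed(1)
x0, omega = 65, 110
i = 0.03                                  # flat discount: v(0,h) = (1+i)^-h
alpha = {x: -9.0 + 0.09 * x for x in range(x0, omega)}
beta = {x: 0.01 for x in range(x0, omega)}
drift, sigma = -0.5, 0.6                  # random walk with drift for kappa

def annuity_value():
    """One draw of a_x0(t0): simulate a kappa path and apply (7.93)."""
    kappa, value, surv = 0.0, 0.0, 1.0
    for h in range(1, omega - x0 + 1):
        x = x0 + h - 1
        kappa += drift + sigma * random.gauss(0.0, 1.0)
        surv *= math.exp(-math.exp(alpha[x] + beta[x] * kappa))   # (7.92)
        value += surv * (1 + i) ** (-h)                           # (7.93)
    return value

draws = [annuity_value() for _ in range(2000)]
mean = sum(draws) / len(draws)
print(round(mean, 2), round(min(draws), 2), round(max(draws), 2))
```

Each draw corresponds to one trajectory of the time index; the spread between the minimum and the maximum draw gives a rough picture of the longevity risk left undiversified by pooling.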

Interesting information can be obtained from a further investigation of the distribution of a_{x0}(t0). We consider a homogeneous portfolio of n0 annuitants at time t0. We now refer to the random variable a_{K^{(j)}_{x0}⌉}, the present value of an annuity-certain paid over the curtate lifetime K^{(j)}_{x0} of individual j. Given the time index, the K^{(j)}_{x0}'s are assumed to be independent and identically distributed, with common conditional h-year survival probability hP_{x0}(t0).

We recall from Denuit et al. (2005) that a random variable X is said to precede another one Y in the convex order, denoted X ⪯_cx Y, if the inequality E[g(X)] ≤ E[g(Y)] holds for all the convex functions g for which the expectations exist. Since X ⪯_cx Y ⇒ E[X] = E[Y] and Var[X] ≤ Var[Y], X ⪯_cx Y intuitively means that X is 'less variable', or 'less dangerous', than Y.

Now, since the a_{K^{(j)}_{x0}⌉}'s are exchangeable, we have from Proposition 1.1 in Denuit and Vermandele (1998) that

a_{x0}(t0) = E[ a_{K^{(j)}_{x0}⌉} | κ_{t0+k}, k = 1, 2, ... ] ⪯_cx · · · ⪯_cx ( Σ_{j=1}^{n0+1} a_{K^{(j)}_{x0}⌉} ) / (n0 + 1) ⪯_cx ( Σ_{j=1}^{n0} a_{K^{(j)}_{x0}⌉} ) / n0        (7.94)

Increasing the size of the portfolio makes the average payment per annuity less variable (in the ⪯_cx sense), but this average remains random whatever the number of policies comprising the portfolio, being bounded from below by a_{x0}(t0) in the ⪯_cx sense. We note that, despite the positive dependence existing between the Lee–Carter lifetimes, there is still some diversification effect in the portfolio.

Biffis (2005) calculates the single premium of a life annuity adopting affine jump-diffusions for modelling the force of mortality and the short interest rate. In this way, one deals simultaneously with financial and mortality risks and calculates values based on no-arbitrage arguments. The setting is also applied to portfolio valuations in Biffis and Millossovich (2006a) and to the valuation of GAOs in Biffis and Millossovich (2006b). Affine mortality structures are also addressed by Dahl (2004) and Dahl and Møller (2006), where, in particular, hedging strategies for life insurance liabilities are investigated.

Turning to the problem of pricing longevity bonds, Lin and Cox (2005) consider that the market is incomplete and adopt the Wang transform (see, e.g., Wang (2002) and Wang (2004)). Given the future random flow X with cumulative probability distribution function (briefly, cdf) F(x), the one-factor Wang transform is the distorted cdf F*(x) such that

F*(x) = Φ(Φ^{−1}(F(x)) + λ)        (7.95)

where Φ(·) is the standard normal cdf and λ is the market price of risk (longevity risk included). The fair price of X is the present value of the expected value of X, calculated with the risk-free rate and the distorted cdf F*(x).

Lin and Cox (2005) take X as the lifetime of an annuitant and calibrate λ using life annuity quotations in the market (assuming that the price of a life annuity is the present value of future payments, based on the risk-free rate and the distorted cdf of the lifetime). They then apply the approach to price mortality-linked securities.

The one-factor Wang transform assumes that the underlying distribution is known. However, usually F(x) is only the best estimate of the underlying unknown distribution. The two-factor Wang transform is the cdf F**(x) such that

F**(x) = Q(Φ^{−1}(F(x)) + λ)        (7.96)

where Q is the cdf of the t-distribution with k degrees of freedom. Lin and Cox (2008) adopt this latter approach for pricing mortality-linked securities, with k = 6.

Cairns, Blake, and Dowd (2006a) assume similarities between the force of mortality and interest rates, and adapt arbitrage-free pricing frameworks developed for interest-rate derivatives to price mortality-linked securities. In Cairns, Blake, and Dowd (2006b) they introduce the two-factor model described in Section 5.3 and price longevity bonds with different terms to maturity, referenced to different cohorts. In particular, they develop a method for calculating the market risk-adjusted price of a longevity bond, which allows for mortality trend uncertainty and parameter risk as well.

We finally address the problem of the valuation of a GAO. The GAO (see Section 7.5.2) consists of a European call option whose underlying asset is the retail market value of a life annuity at retirement time, and whose strike is the GAR set when the GAO was underwritten. The pay-off of the option by itself depends on the comparison between the guaranteed and the current annuity rate. However, the actual exercise of the option also depends on the preference that the holder expresses for a life annuity instead of self-annuitization. The intrinsic structure of the pay-off of the option is, therefore, uncertain, because it depends on individual preferences, with possible adverse selection in respect of the insurer. When assessing the value of the GAO, individual preferences are usually disregarded in the current literature. The pricing problem is therefore attacked by assuming that the policyholder will decide to exercise the option just by comparing the current market quotes for life annuities and the GAR. Ballotta and Haberman (2003) address this problem, assuming that the overall mortality risks (and hence also the longevity risk) are diversified. In Ballotta and Haberman (2006) the analysis is extended to the case in which mortality risk is incorporated via a stochastic model for the evolution over time of the underlying force of mortality.

7.7 Financing post-retirement income

7.7.1 Comparing life annuity prices

We refer to a person buying an immediate life annuity. Let S be the capital converted into the annuity and b the annual amount. The annuity rate b/S is a function of:

• the discount rate, i;
• the reference mortality table, A(τ);
• a safety loading (possibly explicit) for longevity risk.

The buyer may be interested in comparing the annuity rates applied by different providers, and in explaining the relevant differences. However, it may not be straightforward to understand the reasons for such differences, due to the interaction of the items building up the annuity rate and the complexity of the pricing model for longevity risk.


Typically, the discount rate is disclosed; this is in particular required when participation in investment profits occurs during the annuity payment. The comparison of annuity rates then concerns the incidence of mortality and the relevant interaction with the discount rate. Some equivalent parameters should be produced by the annuity provider (or by some other entity) to provide better information in this regard.

It is reasonable that the comparison among annuity rates makes reference to traditional pricing models. In particular, the actuarial value of a life annuity, ax = ∑_{t=1}^{ω−x} (1 + i)^{−t} tpx, and the present value of an annuity certain, a_{k|i} = ∑_{t=1}^{k} (1 + i)^{−t} = (1 − (1 + i)^{−k})/i, may be addressed.

Given the discount rate i assumed in the annuity rate b/S, we can determine the equivalent number of payments of an annuity certain, that is, the number k such that a_{k|i} = S/b. If i > 0, we easily find

k = −ln(1 − i S/b) / ln(1 + i)    (7.97)

Conversely, if i = 0 then k is simply given by S/b and, according to a traditional actuarial valuation of the life annuity, it coincides with the expected lifetime assumed by the annuity provider for the annuitant. Clearly, the stronger is the cost of longevity embedded in the annuity rate b/S, the lower is k. Note that if i > 0, k depends also on i.
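Equation (7.97) is straightforward to check numerically. The sketch below (the function name is ours, for illustration only) recovers the equivalent number of payments from a quoted annuity rate; with the rate 1/15.259 and i = 0.03 it reproduces the value 20.707 reported in Table 7.31.

```python
import math

def equivalent_payments(annuity_rate, i):
    """Equivalent number of payments k of an annuity certain, eq. (7.97):
    the k solving a_{k|i} = S/b, where annuity_rate = b/S."""
    s_over_b = 1.0 / annuity_rate
    if i == 0:
        return s_over_b                      # a_{k|0} = k
    return -math.log(1.0 - i * s_over_b) / math.log(1.0 + i)

k = equivalent_payments(1 / 15.259, 0.03)    # ≈ 20.707
```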

In the case where there is a prevailing mortality table referred to for the traditional actuarial valuation of life annuities, one can calculate the equivalent entry age x′ such that, according to this table and having set the discount rate i, 1/ax′ coincides with the annuity rate quoted by the annuity provider. Such an age should then be compared with the actual entry age, say x0.

Similarly, in the case where there is a prevailing mortality table referred to for the traditional actuarial valuation of life annuities, an alternative possibility is to refer to the actual age x0 and to calculate the equivalent discount rate, that is, the rate i′ such that 1/ax0 (based on the reference mortality table) coincides with the quoted annuity rate b/S, as is done, for example, by Verrall et al. (2006).
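The equivalent discount rate can be found by simple bisection, since ax0 decreases as i increases. A minimal sketch follows; the survival probabilities used are purely hypothetical placeholders, not the A(τ) tables of the text.

```python
def annuity_value(p, i):
    """a_x = sum_t (1+i)^(-t) * tpx, from a list of one-year survival probabilities."""
    a, tpx = 0.0, 1.0
    for t, pt in enumerate(p, start=1):
        tpx *= pt
        a += tpx / (1.0 + i) ** t
    return a

def equivalent_discount_rate(p, annuity_rate, lo=0.0, hi=0.5, tol=1e-10):
    """Bisect for the rate i' such that 1/a_x(i') equals the quoted annuity rate."""
    target = 1.0 / annuity_rate
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if annuity_value(p, mid) > target:   # a_x too large -> raise the rate
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Self-consistency check with a hypothetical table: quote a rate at i = 0.03,
# then recover i' = 0.03 from the quote alone.
p = [0.99 * 0.998 ** t for t in range(40)]
quoted_rate = 1.0 / annuity_value(p, 0.03)
i_prime = equivalent_discount_rate(p, quoted_rate)
```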

Example 7.10 With reference to the expected values quoted in Table 7.2 for time 0, we perform the comparisons discussed above. We assume that the prevailing mortality table referred to for the traditional actuarial valuation of the life annuity is given by assumption A3(τ). All of the other assumptions are as in Example 7.1; in particular, the actual entry age is x0 = 65.


Table 7.30. Equivalent number of payments of an annuity certain; discount rate: i = 0.03

Mortality assumption   Annuity rate          Equivalent number of payments
A1(τ)                  1/14.462 = 0.06915    19.247
A2(τ)                  1/14.651 = 0.06825    19.587
A3(τ)                  1/15.259 = 0.06554    20.707
A4(τ)                  1/15.817 = 0.06322    21.767
A5(τ)                  1/16.413 = 0.06093    22.938

Table 7.31. Equivalent number of payments of an annuity certain; mortality assumption: A3(τ)

Discount rate   Annuity rate          Equivalent number of payments
i = 0           1/21.853 = 0.04576    21.853
i = 0.01        1/19.238 = 0.05198    21.473
i = 0.02        1/17.071 = 0.05858    21.091
i = 0.03        1/15.259 = 0.06554    20.707
i = 0.04        1/13.733 = 0.07282    20.321

Tables 7.30 and 7.31 give the equivalent number of payments of an annuity certain, for several quoted prices of the life annuity. In particular, in Table 7.30 the discount rate has been kept fixed, while alternative mortality assumptions have been used; in Table 7.31 the annuity rate is based on the mortality assumption A3(τ) while alternative levels of the discount rate are chosen. Clearly, given the mortality table, the equivalent number of payments of an annuity certain is higher the lower is the discount rate. With a fixed discount rate, the equivalent number of payments is higher the stronger is the mortality improvement implied by the table.

In Table 7.32 the reference mortality assumption is A3(τ) and the reference discount rate is i = 0.03. First, the equivalent discount rate relating to different mortality assumptions is calculated (third column); then the equivalent rounded entry age is quoted (fourth column). We note that a lower equivalent discount rate and a lower equivalent entry age emerge from a stronger assumption about mortality improvements.

7.7.2 Life annuities versus income drawdown

When planning post-retirement income, some basic features of the life annuity product should be accounted for. In particular,


Table 7.32. Equivalent discount rate, equivalent entry age; reference parameters: mortality A3(τ), discount rate: i = 0.03

Mortality assumption   Annuity rate          Equivalent discount rate   Equivalent entry age x′
A1(τ)                  1/14.462 = 0.06915    3.501%                     67
A2(τ)                  1/14.651 = 0.06825    3.379%                     66
A3(τ)                  1/15.259 = 0.06554    3%                         65
A4(τ)                  1/15.817 = 0.06322    2.673%                     64
A5(τ)                  1/16.413 = 0.06093    2.343%                     62

(a) a life annuity provides the annuitant with an inflexible income, in the sense that, if the whole fund available to the annuitant at retirement is converted into a life annuity, the annual income is stated as defined by the annuity rate (apart from the effect of possible profit participation mechanisms);

(b) a more flexible income can be obtained via a partial annuitization of the fund, or partially delaying the annuitization itself; the part of the income not provided by the life annuity is then obtained by drawdown from the non-annuitized fund;

(c) the life annuity product benefits from a mortality cross-subsidy, as each life annuity in a given portfolio (or pension plan) is annually credited with ‘mortality interests’, that is, a share of the technical provisions released by the deceased annuitants, according to the mutuality principle (see Sections 1.4 and 1.4.1 in particular).

Let us start with point (c). We refer to a life annuity issued at age x0 with annual amount b, whose technical provision (simply denoted by Vt) is calculated according to rule (7.49) (adopting a mortality assumption A(τ)). Recursively, we may express the technical provision as follows:

V0 = S

Vt−1 (1 + i) = (Vt + b) px0+t−1, t = 1, 2, . . . (7.98)

where i is the technical interest rate, px0+t−1 is based on mortality assumption A(τ) and S is the single premium (see (1.28)). According to a traditional pricing structure, we may further assume

S = b ax0 (7.99)

where ax0 is calculated according to the same assumptions adopted in (7.98).
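Recursion (7.98) and the premium (7.99) can be illustrated in a few lines of code. The one-year survival probabilities below are hypothetical placeholders for A(τ); the point is only that the reserves Vt = b · ax0+t satisfy (7.98) exactly.

```python
def annuity_value(p, i):
    """a_x = sum_t (1+i)^(-t) * tpx over a finite list of one-year p's."""
    a, tpx = 0.0, 1.0
    for t, pt in enumerate(p, start=1):
        tpx *= pt
        a += tpx / (1.0 + i) ** t
    return a

i, b = 0.03, 1.0
p = [0.99 * 0.998 ** t for t in range(40)]   # hypothetical p_{x0}, p_{x0+1}, ...

S = b * annuity_value(p, i)                               # single premium, eq. (7.99)
V = [b * annuity_value(p[t:], i) for t in range(len(p))]  # V_t = b * a_{x0+t}

# Check eq. (7.98): V_{t-1}(1+i) = (V_t + b) * p_{x0+t-1}, and V_0 = S
assert abs(V[0] - S) < 1e-12
for t in range(1, len(V)):
    assert abs(V[t - 1] * (1 + i) - (V[t] + b) * p[t - 1]) < 1e-9
```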

To be more realistic, we consider a (financial) profit participation mechanism. We denote as b0 the amount of the benefit set at policy issue (so,


b0 = b, where b comes from (7.99)). Assume that in each policy year a constant (to shorten notation) extra interest rate r is credited to the reserve. As a consequence, the annual amounts b1, b2, . . . , bt, . . . are paid out, at times 1, 2, . . . , t, . . . , where bt is assessed as follows

bt = bt−1 (1 + r), t = 1, 2, . . . (7.100)

The recursion describing the behaviour of the reserve then becomes

Vt−1 (1 + i) (1 + r) = (Vt + bt) px0+t−1, t = 1, 2, . . . (7.101)

or, defining 1 + i′ = (1 + i) (1 + r), so that i′ represents the total annual interest rate credited to the reserve

Vt−1 (1 + i′) = (Vt + bt) px0+t−1, t = 1, 2, . . . (7.102)

Rearranging (7.102), we obtain

Vt − Vt−1 = −bt px0+t−1 + Vt−1 i′ + Vt qx0+t−1 (7.103)

which can be rewritten as

Vt − Vt−1 = Vt−1 i′ + (Vt + bt) qx0+t−1 − bt (7.104)

and, replacing Vt + bt according to (7.102), finally as

Vt − Vt−1 = Vt−1 i′ + (qx0+t−1/px0+t−1) Vt−1 (1 + i′) − bt (7.105)

We note that (7.105) generalizes (1.13).

Recalling that Vt − Vt−1 < 0, from (7.103) we find that the variation in the reserve is due to the following contributions:

(i) a positive contribution due to the (total) amount of interest assigned to the reserve;

(ii) a positive contribution due to mutuality;
(iii) a negative contribution by the payment bt.

The splitting of the variation of the reserve in a year is sketched in Fig. 1.4.
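The equivalence between the recursion (7.102) and the decomposition (7.105) is a pure identity, so it can be verified with arbitrary (here entirely hypothetical) numbers:

```python
i, r = 0.03, 0.02                 # technical rate and extra participation rate
ip = (1 + i) * (1 + r) - 1        # total rate i' credited to the reserve
p = 0.985                         # one-year survival probability (hypothetical)
q = 1 - p
V_prev, b_t = 14.0, 1.05          # reserve at t-1 and benefit at t (hypothetical)

V_t = V_prev * (1 + ip) / p - b_t   # reserve at t, from eq. (7.102)

# Eq. (7.105): interest on the reserve + mutuality share - benefit payment
delta = V_prev * ip + (q / p) * V_prev * (1 + ip) - b_t
assert abs((V_t - V_prev) - delta) < 1e-12
```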

We now address item (b) in the list at the beginning of this Section. As was discussed in Section 1.2.1, the annuitant may decide not to use S to buy a life annuity, but simply to invest it and receive the post-retirement income via a sequence of withdrawals (set at her/his choice). Suppose that the fund is credited each year with annual interest at the rate g. Further assume that the annuitant withdraws from the fund a sequence of amounts


set to be a (constant) proportion α of the annual payments she/he would have obtained under the life annuity, that is, of the sequence (7.100).

Let Ft be the fund available at time t. We have

F0 = S
Ft−1 (1 + g) = α bt + Ft, t = 1, 2, . . . (7.106)

simply generalizing (1.1).

As already noted in Section 1.2.1, there is a time m such that Fm ≥ 0 and Fm+1 < 0, that is, the withdrawals b1, b2, . . . , bm exhaust the fund. If the lifetime of the annuitant, Tx0, turns out to be lower than m, then the amount FTx0 is available at her/his death for bequest. However, if Tx0 > m then at time m the annuitant is unfunded. To avoid early exhaustion, the annuitant should set a low level for α or look for investments with a high yield g. In the former case, however, the annual income may then become insufficient to meet current needs; in the latter case, risky assets could be involved, so that possible losses may then emerge because of fluctuating values.
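The exhaustion time m of the drawdown recursion (7.106) can be found by direct simulation. A sketch (parameter values in the spirit of Example 7.11 below; the function name is ours):

```python
def time_to_exhaustion(S, g, alpha, benefits):
    """Largest m with F_m >= 0 under F_0 = S, F_t = F_{t-1}(1+g) - alpha*b_t,
    which is eq. (7.106) rearranged for F_t."""
    F = S
    for t, b_t in enumerate(benefits, start=1):
        F = F * (1 + g) - alpha * b_t
        if F < 0:
            return t - 1
    return len(benefits)

# S = 15.259, g = 0.05, full consumption (alpha = 1), withdrawals growing
# at the rate r = 0.01942, i.e. b_t = (1+r)^t as in (7.100) with b_0 = 1
r = 0.01942
benefits = [(1 + r) ** t for t in range(1, 81)]
m = time_to_exhaustion(15.259, 0.05, 1.0, benefits)
```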

Example 7.11 Let us assume that the amount S = 15.259 can be used to buy a life annuity with initial benefit b = b0 = 1, subject to profit participation. The annuity rate b/S = 1/15.259 is based on a traditional calculation of the actuarial value of the life annuity, under the mortality assumption A3(τ) and the annual interest rate i = 0.03 (see Table 7.32, second column). We set the actual annual interest rate gained in each year on investments to be i′ = 0.05, so that benefits are yearly increased by the rate r = 1.05/1.03 − 1 = 0.01942.

With the parameters mentioned above, we now refer to the case of drawdown, based on an annual consumption α bt, t = 1, 2, . . . . Setting g = i′ = 0.05, Fig. 7.25, panel (a), shows the share α as a function of the time m to fund exhaustion. Note that α becomes lower than 1 as soon as the time m is greater than the expected lifetime of the annuitant under scenario A3(τ) (which turns out to be 21.853 years; see also Table 7.31 for i = 0). Alternatively, setting α = 1, in panel (b) of Fig. 7.25 the required annual investment yield g is quoted, again as a function of the time to exhaustion of the fund. We note that, in this case, g exceeds i′ = 0.05 as soon as m is greater than the expected lifetime of the annuitant.

7.7.3 The ‘mortality drag’

The absence of mutuality in an income drawdown process can be compensated (at least partially) by a higher investment yield (see Section 1.4.1). The


Figure 7.25. Annual withdrawal (panel (a)) and annual investment yield (panel (b)) as a function of the time to fund exhaustion.

extra return required in each year for this purpose has been called the mortality drag. However, it is worth stressing that a fixed drawdown sequence leads in any case to wealth exhaustion in a given number of years (possibly the maximum residual lifetime), whatever the interest rate may be, as was depicted in Fig. 7.25, panel (b).


Conversely, the concept of mortality drag suggests an alternative arrangement for the post-retirement income. Assume that at time 0 no life annuity is purchased, whereas some amount will be converted into a life annuity at time k, thus with a delay of k years from the retirement time. We suppose that a traditional pricing method is adopted at time k by the insurer and that the mortality assumption for the trend of the cohort is not revised during the delay period. To facilitate a comparison, we assume that the amount to be annuitized at time k must provide the annuitant with the sequence bk+1, bk+2, . . . , whose items follow from (7.100) (assuming b0 = b as given by (7.99)). Hence, the amount to be converted at time k into the life annuity is

bk ax+k = Vk (7.107)

with Vk originated by (7.105). Therefore, an amount funding the reserve to be set up must be provided at time k.

If the annuitant aims at getting the same income as under the life annuity also during the delay period, then the drawdown process b1, b2, . . . , bk must be defined. Because of the absence of mutuality, if the individual investment provides the same yield as that which the insurer is willing to recognize, then the fund available at time k, Fk, is lower than the required amount to annuitize, Vk. However, an extra return may offset the loss of ‘mutuality (or mortality) returns’, thus leading to Fk = Vk. The size of the extra investment yield required so that Fk = Vk can be obtained from (7.106), considered with α = 1. If i′ is the yield on the life annuity product, then intuitively g − i′ is an average of the annual quantities θx+t defined in Section 1.4.1. It is worthwhile stressing that, given the deferment k, the extra yield g − i′ must be obtained in each of the k years of delay. Thus, g − i′ is like a yield to maturity, measuring the mortality interest in k years, whereas the quantity θx+t is the extra yield specific to year (t − 1, t) (see also (1.34) and (1.35)).

Example 7.12 Under the assumptions adopted in Example 7.11 for the life annuity, Fig. 7.26 plots the extra-yield required on individual investments in each of the k years of delay to compensate the loss of mutuality. Trivially, the higher is k, the higher is the required extra yield. Given that the extra yield must be realized in each of the k years of delay, this target may be very difficult to reach when the annuitization is planned for a distant time in the future. □

It is worthwhile to investigate in more detail how the average mortality drag g − i′ is affected by the annuity rate. From (7.106), having set α = 1


[The figure plots the extra investment yield, together with the life annuity yield, against the delay period k.]

Figure 7.26. Extra investment yield required by mortality drag.

we get

Ft = S (1 + g)^t − ∑_{h=1}^{t} bh (1 + g)^{t−h} (7.108)

Let gk be the rate g such that Fk = Vk for a given k. The rate gk is therefore defined by the following relation:

S (1 + gk)^k − ∑_{h=1}^{k} bh (1 + gk)^{k−h} = Vk (7.109)

Note that Fig. 7.26 actually plots the rate gk for several choices of k.

From (7.100), we can express the annual benefit at time t as

bt = b (1 + r)^t (7.110)

Replacing (7.110), (7.107), and (7.99) into (7.109), we obtain

b ax0 (1 + gk)^k − ∑_{h=1}^{k} b (1 + r)^h (1 + gk)^{k−h} = b (1 + r)^k ax0+k (7.111)

or equivalently

ax0 (1 + gk)^k − (1 + gk)^k (1 + r)/(gk − r) + (1 + r)^{k+1}/(gk − r) = (1 + r)^k ax0+k (7.112)
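The rate gk can be computed from (7.111) divided by b (the summation form, which avoids the removable singularity of (7.112) at gk = r) by bisection, assuming the left-hand side crosses the target once from below on the chosen bracket. The annuity values used below are hypothetical placeholders:

```python
def mortality_drag_rate(ax0, axk, r, k, lo=0.0, hi=1.0, tol=1e-10):
    """Solve eq. (7.111) divided by b for the average mortality drag rate g_k."""
    def f(g):
        s = sum((1 + r) ** h * (1 + g) ** (k - h) for h in range(1, k + 1))
        return ax0 * (1 + g) ** k - s - (1 + r) ** k * axk
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical inputs: a_{x0} = 15.259, a_{x0+k} = 11.0, k = 10; with r = 0
# the result corresponds to the Implied Longevity Yield for this delay.
g10 = mortality_drag_rate(15.259, 11.0, 0.0, 10)
```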


which suggests that gk depends on the annuity rate applied at time k, 1/ax0+k, but also on that applied at time 0, 1/ax0. The rate gk obtained with r = 0 has been named the Implied Longevity Yield (ILY)¹; see Milevsky (2005) and Milevsky (2006).

The delay in the purchase of the life annuity may have some advantages. In particular:

– in the case of death before time k, the fund available constitutes a bequest (which is not provided by a life annuity purchased at time 0, because of the implicit mortality cross-subsidy);

– more flexibility is gained, as the annuitant may change the annual income by modifying the drawdown sequence (with a possible change in the fund available at time k).

Conversely, a disadvantage is due to the risk of a shift to a different mortality assumption, leading to a conversion rate at time k which is less favourable to the annuity purchaser than the one in force at time 0. Further, as already noted, in the case where k is high, it may be difficult to gain the required mortality drag.

7.7.4 Flexibility in financing post-retirement income

Combining an income drawdown with a delay in the life annuity purchase constitutes an example of a post-retirement income arrangement which is more general than the one consisting of a life annuity-based income only. We now summarize what has emerged in the previous sections, thereby defining a general framework for a discussion of post-retirement income planning. Our focus will be mostly on mortality issues, to keep the presentation in line with the main scope of the chapter. Nevertheless, important financial aspects should not be disregarded when assessing and comparing the several opportunities for meeting post-retirement income needs.

We assume that an accumulation process takes place during the working period of an individual. After retirement, a decumulation process takes place and hence income requirements are met using, in some way, the accumulated fund.

Figure 7.27 illustrates the process consisting of:

1. the accumulation of contributions during the working period;

¹ Registered trademarks and property of CANNEX Financial Exchanges.


[The figure shows contributions (before retirement) feeding a non-annuitized fund, which earns interest and finances income drawdown (after retirement); annuity purchases move resources to an annuitized fund, which grows with interest and mortality and finances the annuity payment (after retirement).]

Figure 7.27. Accumulation process and post-retirement income.

2. a (possible) annuitization of (part of) the accumulated fund (before or after retirement);

3. receiving a post-retirement income from life annuities or through income drawdown.

The annuitization of (part of) the accumulated fund consists of purchasing a deferred life annuity if annuitization takes place during the accumulation period, and an immediate life annuity otherwise. Hence, at any time, the resources available for financing post-retirement income are shared between a non-annuitized and an annuitized fund. It is reasonable to assume that a higher degree of flexibility in selecting investment opportunities is attached to the non-annuitized fund.

We note that the non-annuitized fund builds up because of contributions and investment returns. Conversely, the annuitized fund builds up because of investment returns and mortality, as the fund coincides with the total mathematical reserve of the life annuities purchased, and hence it benefits from the cross-subsidy effect.

Figures 7.28 and 7.29 illustrate a possible behaviour of the non-annuitized and the annuitized fund, respectively. Effects of the life annuity purchase (jumps in the processes), of the income drawdown and of the annuity payment are identified.

The slope of the non-annuitized fund depends, while the fund itself is increasing, on both contributions and interest earnings, whereas it depends on the drawdown policy while the fund is decreasing. As regards the annuitized fund, as previously noted, its slope depends on interest and mortality,


Figure 7.28. The non-annuitized fund.

Figure 7.29. The annuitized fund.

while it is increasing, whereas it also depends on the annuity payment while decreasing.

Let us denote by F[NA]_t and F[A]_t the values of the non-annuitized and the annuitized fund, respectively, at time t. The ‘degree’ of the annuitization policy can be summarized by the annuitization ratio ar(t), defined as follows:

ar(t) = F[A]_t / (F[A]_t + F[NA]_t)    (7.113)

Note that, obviously, 0 ≤ ar(t) ≤ 1; ar(t) = 0 means that up to time t no life annuity has been purchased, whilst ar(t) = 1 means that at time t the whole fund available consists of reserves related to purchased life annuities.


[The figure plots the annuitization ratio (0% to 100%) against time, over the accumulation and post-retirement periods: arrangement (1), the deferred life annuity, sits at 100%; arrangement (2), income drawdown only, sits at 0%.]

Figure 7.30. Arrangements: (1) deferred life annuity; (2) income drawdown.

Example 7.13 Figures 7.30–7.33 illustrate some strategies for financing post-retirement income. In most cases, the technical tool provided by the life annuity is involved. The various strategies are described in terms of the annuitization ratio profile; thus, the value of ar(t) is plotted against time t.

To improve understanding, we suppose that a specified mortality assumption is adopted when annuitizing (a part of) the accumulated fund and that the assumption itself cannot be replaced in relation to the purchased annuity, whatever the mortality trend might be (so that a guaranteed annuity is involved).

Figure 7.30 illustrates two ‘extreme’ choices. Choice (1) consists of building up a traditional deferred life annuity. In this case, each amount paid to the accumulation fund (possibly a level premium, or a single recurrent premium) is immediately converted into a deferred life annuity; this way, the accumulated fund is completely annuitized. Post-retirement income requirements are met by the life annuity (a flat annuity or, possibly, a rising profile annuity, viz an escalating annuity or an inflation-linked annuity).

Choice (2) represents the opposite extreme. There is no annuitization operating, so that income requirements are fulfilled by income drawdown, which implies spreading the fund accumulated at retirement over the future life expectation, according to some spreading rule. Sometimes annuitants prefer this choice because of the high degree of freedom in selecting investment opportunities even during the post-retirement period.

It should be stressed that choice (1) leads to an inflexible post-retirement income, whilst choice (2) allows the annuitant to adopt a spreading rule


Figure 7.31. Immediate life annuity.

consistent with a specific income profile. Conversely, it is worth noting that arrangement (1) completely transfers the mortality risk (including its longevity component) to the insurer, whilst according to arrangement (2) the mortality risk remains completely with the annuitant (see Section 7.7.2).

In more general terms, the process of transferring mortality risk depends on the annuitization profile: thus, the portion of mortality risk transferred from the annuitant to the insurer increases as the annuitization ratio increases. The following arrangements provide practical examples of how mortality risk can be transferred, as time goes by, to the insurer.

The annuitization of the fund at retirement time only is illustrated in Fig. 7.31, which depicts the particular case of a complete annuitization of the fund available at retirement. This arrangement can be realized through purchasing a single-premium life annuity, and is characterized by flexibility in the investment choice during the accumulation period. Conversely, it produces an inflexible post-retirement income profile.

In Fig. 7.32, the annuitization ratio increases during the accumulation period because of positive jumps corresponding to the purchase of life annuities with various deferment periods. The behaviour of the annuitization ratio between jumps obviously depends on the contributions and the interest earnings affecting the non-annuitized fund as well as on the financial and mortality experience of the annuitized fund.

In contrast, Fig. 7.33 illustrates the case in which no annuitization is made throughout the accumulation period, whereas the fund available after the retirement date is partially used (with delays) to purchase life annuities;


Figure 7.32. Combined life annuities.

Figure 7.33. Staggered annuitization.

such a process is sometimes called staggered annuitization or staggered vesting. The behaviour of the ratio between jumps depends on the interest earnings and income drawdown as regards the non-annuitized fund as well as on the financial and mortality experience of the annuitized fund.

Arrangements like those illustrated by Figs. 7.32 and 7.33 are characterized by a high degree of flexibility as regards both the post-retirement income profile and the choice of investment opportunities available for the non-annuitized fund. □


The framework proposed above clearly shows the wide range of choices leading to different annuitization strategies. So, convenient investment and life annuity products can be designed to meet the different needs and preferences of the clients. An example in this regard is given by the solutions providing natural hedging across time (Section 7.3.2), such as the money-back annuity with death benefit (7.32), which is designed so that at some future time the death benefit reduces to zero. We note that, as long as the death benefit is positive, the fund can be identified as just partially annuitized. As soon as the death benefit reduces to zero, the fund turns out to be fully annuitized. Thus, an annuitization strategy is embedded in the structure of money-back annuities.

7.8 References and suggestions for further reading

In this section we summarize the main contributions on the topics dealt with in this chapter, some of which have already been mentioned while addressing specific issues. However, the purpose is to add references to those that have been previously cited.

An informal and comprehensive description of longevity risk, and in particular of the relevant financial impact on life annuities, is provided by Richard and Jones (2004). See also Riemer-Hommel and Trauth (2000).

A static framework for representing the longevity risk according to a probabilistic approach has been used, for example, by Olivieri (2001), Olivieri and Pitacco (2002a), Olivieri and Pitacco (2003). Olivieri and Pitacco (2002a) suggest a Bayesian-inferential procedure for updating the weighting distribution. Marocco and Pitacco (1998) adopt a continuous probability distribution for weighting the alternative scenarios. A dynamic probabilistic approach to longevity risk modelling has been proposed, among others, by Biffis (2005), Dahl (2004), Cairns et al. (2006b). Biffis and Denuit (2006) introduce, in particular, a class of stochastic forces of mortality that generalize the Lee–Carter model. The static and the dynamic probabilistic approaches to randomness in mortality trend are addressed by Tuljapurkar and Boe (1998).

The investigation in Section 7.2, and in Section 7.2.3 in particular, is based on Olivieri (2001). The analysis of the random value of future benefits is addressed also by Biffis and Olivieri (2002), where a pension scheme (or a group insurance) providing a range of life and death benefits is referred to. Following Olivieri (2001), Coppola et al. (2000) provide an investigation


also addressing financial risk for life annuity portfolios. In the Lee–Carter framework, given that the future path of the time index is unknown and modelled as a stochastic process, the policyholders' lifetimes become dependent on each other. Consequently, systematic risk is involved. Denuit and Frostig (2007a) study this aspect of the Lee–Carter model, in particular considering solvency issues. Denuit and Frostig (2007b) further study the distribution of the present value of benefits in a run-off perspective. As the exact distribution turns out to be difficult to compute, various approximations and bounds are derived. Denuit (2008) summarizes the results obtained in this field.

The literature on risk management in industry and business in general is very extensive. For an introduction to the relevant topics the reader can refer, for example, to Harrington and Niehaus (1999), and to Williams, Smith and Young (1998). Various textbooks address specific phases of the risk management process. For example, Koller (1999) focuses on risk assessment in the risk management process for business and industry, whereas Wilkinson Tiller, Blinn and Kelly (1990) deal with the topic of risk financing. Pitacco (2007) addresses mortality and longevity risk within a risk management perspective.

Several investigations have been performed with regard to natural hedging. As far as portfolio diversification effects are concerned, the reader may refer to Cox and Lin (2007), where the results of an empirical investigation concerning the US market are discussed. With regard to arrangements on a per-policy basis, some possible designs referring to pension schemes with combined benefits are discussed in Biffis and Olivieri (2002). Gründl et al. (2006) analyse natural hedging from the perspective of the maximization of shareholder value and show, under proper assumptions, that natural hedging may not be optimal in this regard.

Solvency investigations in portfolios of life annuities are dealt with by Olivieri and Pitacco (2003). Solvency issues within a Lee–Carter framework are discussed by Denuit and Frostig (2007a). A review of solvency systems is provided by Sandström (2006); when the longevity risk is addressed, typically the required capital in this respect is set as a share of the technical provision. The most recent regulatory system is provided by the evolving Solvency 2 system, where the required capital is the change expected in the net asset value in case of a permanent shock in survival rates; see, for example, CEIOPS (2007) and CEIOPS (2008). The idea of assessing the required capital by comparing assets to the random value of future payments, examined in Section 7.3.3, has been put forward, for the life business in general, by Faculty of Actuaries Working Party (1986).


Reinsurance arrangements for longevity risk have not received much attention in the literature, due to the practical difficulty of transferring the systematic risk. A Stop-Loss reinsurance on the assets has been proposed by Marocco and Pitacco (1998), to which the reader is referred for some numerical examples, evaluated using both analytical and simulation methods. Olivieri (2005) deals, in a more formal setting, with both XL and Stop-Loss treaties, analysing the effectiveness of these arrangements in terms of the capital the insurer must allocate to face the residual longevity risk not covered by the reinsurer. Olivieri and Pitacco (2008) refer to a swap-like arrangement, in the context of the valuation of a life annuity portfolio. Cox and Lin (2007) also design a swap-like arrangement, based on natural hedging arguments.

In contrast, considerable attention has been devoted in the recent literature to longevity bonds. Securitization of risks in general is described by Cox et al. (2000). The life insurance case is considered by Cowley and Cummins (2005). A mortality-indexed bond is described in Morgan Stanley-Equity Research Europe (2003). Various structures for longevity bonds have been proposed by Lin and Cox (2005), Lin and Cox (2007), Blake and Burrows (2001), Dowd (2003), Blake et al. (2006a), Blake et al. (2006b), Dowd et al. (2006), Olivieri and Pitacco (2008), Denuit et al. (2007). Pricing problems are also dealt with in Cairns et al. (2006b) and Denuit et al. (2007), the latter, in particular, working within the classical Lee–Carter model.

The pricing of longevity risk has also been addressed in the framework of portfolio valuation. Biffis and Millossovich (2006a) consider new business in particular. Olivieri and Pitacco (2008) design a valuation setting, without however solving the problem of the appropriate stochastic mortality model to use. Friedberg and Webb (2005) analyse the pricing of the aggregate mortality risk in relation to the cost of capital of the insurance company. With reference to the problem of pricing a life annuity, Denuit and Frostig (2008) explain how to determine a conservative life table serving as a first-order mortality basis, starting from a best estimate of future mortality.

Many recent papers deal with the pricing and valuation of insurance products including an option to annuitize; see, for example, Milevsky and Promislov (2001), O'Brien (2002), Wilkie et al. (2003), Boyle and Hardy (2003), Ballotta and Haberman (2003), Pelsser (2003), Ballotta and Haberman (2006), and Biffis and Millossovich (2006b). Some of these deal mainly with the financial aspects.

Innovative ideas and proposals for structuring post-retirement benefits are presented and discussed in the reports by the Department for Work and Pensions (2002) in the United Kingdom, and the Retirement Choice Working Party (2001). The paper by Wadsworth et al. (2001) suggests a technical structure for a fund providing annuities. A comprehensive description of several annuities markets is provided by Cardinale et al. (2002). Piggot et al. (2005) describe Group Self-Annuitization schemes, which provide an example of flexible GAR; however, the benefit in this case is not guaranteed. Money-back annuities in the United Kingdom represent an interesting annuitization strategy; see Boardman (2006). Income drawdown issues within the context of defined contribution pension plans are discussed by Emms and Haberman (2008) and Gerrard et al. (2006). An extensive presentation of issues concerning the financing of post-retirement income is given by Milevsky (2006). An informal description of private solutions is provided by Swiss Re (2007).

The reader interested in the impact of longevity risk on living benefits other than life annuities can refer, for example, to Olivieri and Ferri (2003), Olivieri and Pitacco (2002c), and Olivieri and Pitacco (2002b). See also Pitacco (2004b), where both life insurance and other living benefits are considered.

References

Alho, J. M. (2000). Discussion of Lee (2000). North American Actuarial Journal, 4(1), 91–93.

Andreev, K. F. and Vaupel, J. W. (2006). Forecasts of cohort mortality after age 50. Technical report.

Ballotta, L. and Haberman, S. (2003). Valuation of guaranteed annuity conversion options. Insurance: Mathematics & Economics, 33, 87–108.

Ballotta, L. and Haberman, S. (2006). The fair valuation problem of guaranteed annuity options: The stochastic mortality environment case. Insurance: Mathematics & Economics, 38(1), 195–214.

Baran, S., Gall, J., Ispany, M., and Pap, G. (2007). Forecasting Hungarian mortality rates using the Lee–Carter method. Acta Oeconomica, 57, 21–34.

Barnett, H. A. R. (1960). The trends of population mortality and assured lives' mortality in Great Britain. In Transactions of the 16th International Congress of Actuaries, Volume 2, Bruxelles, pp. 310–326.

Beard, R. E. (1952). Some further experiments in the use of the incomplete gamma function for the calculation of actuarial functions. Journal of the Institute of Actuaries, 78, 341–353.

Beard, R. E. (1959). Note on some mathematical mortality models. In CIBA Foundation Colloquia on Ageing (ed. C. E. W. Wolstenholme and M. O'Connor), Volume 5, Boston, pp. 302–311.

Beard, R. E. (1971). Some aspects of theories of mortality, cause of death analysis, forecasting and stochastic processes. In Biological aspects of demography (ed. W. Brass), pp. 57–68. Taylor & Francis, London.

Bell, W. R. (1997). Comparing and assessing time series methods for forecasting age-specific fertility and mortality rates. Journal of Official Statistics, 13, 279–303.

Benjamin, B. and Pollard, J. H. (1993). The analysis of mortality and other actuarial statistics. The Institute of Actuaries, Oxford.

Benjamin, J. and Soliman, A. S. (1993). Mortality on the move. Actuarial Education Service, Oxford.

Biffis, E. (2005). Affine processes for dynamic mortality and actuarial valuations. Insurance: Mathematics & Economics, 37(3), 443–468.

Biffis, E. and Denuit, M. (2006). Lee–Carter goes risk-neutral: an application to the Italian annuity market. Giornale dell'Istituto Italiano degli Attuari, 69, 33–53.

Biffis, E. and Millossovich, P. (2006a). A bidimensional approach to mortality risk. Decisions in Economics and Finance, 29, 71–94.

Biffis, E. and Millossovich, P. (2006b). The fair value of guaranteed annuity options. Scandinavian Actuarial Journal, 1, 23–41.

Biffis, E. and Olivieri, A. (2002). Demographic risks in pension schemes with combined benefits. Giornale dell'Istituto Italiano degli Attuari, 65(1–2), 137–174.

Black, K. and Skipper, H. D. (2000). Life & health insurance. Prentice Hall, New Jersey.

Blake, D., Cairns, A. J., and Dowd, K. (2007). Facing up to the uncertainty of life: the longevity fan charts. Technical report.

Blake, D. and Burrows, W. (2001). Survivor bonds: helping to hedge mortality risk. The Journal of Risk and Insurance, 68(2), 339–348.

Blake, D., Cairns, A. J. G., and Dowd, K. (2006a). Living with mortality: longevity bonds and other mortality-linked securities. British Actuarial Journal, 12, 153–228.

Blake, D., Cairns, A. J. G., Dowd, K., and MacMinn, R. (2006b). Longevity bonds: financial engineering, valuation, and hedging. The Journal of Risk and Insurance, 73(4), 647–672.

Blake, D. and Hudson, R. (2000). Improving security and flexibility in retirement. Retirement Income Working Party, London.

Blaschke, E. (1923). Sulle tavole di mortalità variabili col tempo. Giornale di Matematica Finanziaria, 5, 1–31.

Boardman, T. (2006). Annuitization lessons from the UK: money-back annuities and other developments. The Journal of Risk and Insurance, 73(4), 633–646.

Booth, H. (2006). Demographic forecasting: 1980 to 2005 in review. International Journal of Forecasting, 22(3), 547–581.

Booth, H., Hyndman, R. J., Tickle, L., and De Jong, P. (2006). Lee–Carter mortality forecasting: a multi-country comparison of variants and extensions. Technical report.

Booth, H., Maindonald, J., and Smith, L. (2002). Applying Lee–Carter under conditions of variable mortality decline. Population Studies, 56(3), 325–336.

Booth, H., Tickle, L., and Smith, L. (2005). Evaluation of the variants of the Lee–Carter method of forecasting mortality: a multi-country comparison. New Zealand Population Review, 31, 13–34.

Booth, P., Chadburn, R., Haberman, S., James, D., Khorasanee, Z., Plumb, R., and Rickayzen, B. (2005). Modern actuarial theory and practice. Chapman & Hall/CRC, Boca Raton.

Bourgeois-Pichat, J. (1952). Essai sur la mortalité “biologique” de l'homme. Population, 7(3), 381–394.

Bowers, N. L., Gerber, H. U., Hickman, J. C., Jones, D. A., and Nesbitt, C. J. (1997). Actuarial mathematics. The Society of Actuaries, Schaumburg, Illinois.

Boyle, P. and Hardy, M. (2003). Guaranteed annuity options. ASTIN Bulletin, 33, 125–152.

Brass, W. (1974). Mortality models and their uses in demography. Transactions of the Faculty of Actuaries, 33, 123–132.

Brillinger, D. R. (1986). The natural variability of vital rates and associated statistics. Biometrics, 42, 693–734.

Brouhns, N. and Denuit, M. (2002). Risque de longévité et rentes viagères. II. Tables de mortalité prospectives pour la population belge. Belgian Actuarial Bulletin, 2, 49–63.

Brouhns, N., Denuit, M., and Van Keilegom, I. (2005). Bootstrapping the Poisson log-bilinear model for mortality forecasting. Scandinavian Actuarial Journal, (3), 212–224.

Brouhns, N., Denuit, M., and Vermunt, J. K. (2002a). Measuring the longevity risk in mortality projections. Bulletin of the Swiss Association of Actuaries, 2, 105–130.

Brouhns, N., Denuit, M., and Vermunt, J. K. (2002b). A Poisson log-bilinear approach to the construction of projected lifetables. Insurance: Mathematics & Economics, 31(3), 373–393.

Buettner, T. (2002). Approaches and experiences in projecting mortality patterns for the oldest-old. North American Actuarial Journal, 6(3), 14–25.

Butt, Z. and Haberman, S. (2002). Application of frailty-based mortality models to insurance data. Actuarial Research Paper No. 142, Dept. of Actuarial Science and Statistics, City University, London.

Butt, Z. and Haberman, S. (2004). Application of frailty-based mortality models using generalized linear models. ASTIN Bulletin, 34(1), 175–197.

Buus, H. (1960). Investigations on mortality variations. In Transactions of the 16th International Congress of Actuaries, Volume 2, Bruxelles, pp. 364–378.

Cairns, A. J. G., Blake, D., and Dowd, K. (2006a). Pricing death: frameworks for the valuation and securitization of mortality risk. ASTIN Bulletin, 36(1), 79–120.

Cairns, A. J. G., Blake, D., and Dowd, K. (2006b). A two-factor model for stochastic mortality with parameter uncertainty: theory and calibration. The Journal of Risk and Insurance, 73(4), 687–718.

Cairns, A., Blake, D., Dowd, K., Coughlan, G., Epstein, D., Ong, A., and Balevich, I. (2007). A quantitative comparison of stochastic mortality models using data from England and Wales and the United States. Pensions Institute Discussion Paper PI-0701, Cass Business School, City University.

Cardinale, M., Findlater, A., and Orszag, M. (2002). Paying out pensions. A review of international annuities markets. Research report, Watson Wyatt.

Carter, L. and Lee, R. D. (1992). Modelling and forecasting US sex differentials in mortality. International Journal of Forecasting, 8, 393–411.

Carter, L. R. (1996). Forecasting U.S. mortality: a comparison of Box–Jenkins ARIMA and structural time series models. The Sociological Quarterly, 37(1), 127–144.

Catalano, R. and Bruckner, T. (2006). Child mortality and cohort lifespan: a test of diminished entelechy. International Journal of Epidemiology, 35, 1264–1269.

CEIOPS (2007). QIS3. Technical specifications. Part I: Instructions.

CEIOPS (2008). QIS4. Technical specifications.

Champion, R., Lenard, C. T., and Mills, T. M. (2004). Splines. In Encyclopedia of actuarial science (ed. J. L. Teugels and B. Sundt), Volume 3, pp. 1584–1586. John Wiley & Sons.

CMI (2002). An interim basis for adjusting the “92” series mortality projections for cohort effects. Working Paper 1, The Faculty of Actuaries and Institute of Actuaries.

CMI (2005). Projecting future mortality: towards a proposal for a stochastic methodology. Working Paper 15, The Faculty of Actuaries and Institute of Actuaries.

CMI (2006). Stochastic projection methodologies: Further progress and P-spline model features, example results and implications. Working Paper 20, The Faculty of Actuaries and Institute of Actuaries.

CMIB (1978). Report no. 3. Continuous Mortality Investigation Bureau, Institute of Actuaries and Faculty of Actuaries.

CMIB (1990). Report no. 10. Continuous Mortality Investigation Bureau, Institute of Actuaries and Faculty of Actuaries.

CMIB (1999). Report no. 17. Continuous Mortality Investigation Bureau, Institute of Actuaries and Faculty of Actuaries.

Coale, A. and Kisker, E. E. (1990). Defects in data on old age mortality in the United States: new procedures for calculating approximately accurate mortality schedules and life tables at the highest ages. Asian and Pacific Population Forum, 4, 1–31.

Congdon, P. (1993). Statistical graduation in local demographic analysis and projection. Journal of the Royal Statistical Society, A, 156, 237–270.

Coppola, M., Di Lorenzo, E., and Sibillo, M. (2000). Risk sources in a life annuity portfolio: decomposition and measurement tools. Journal of Actuarial Practice, 8(1–2), 43–61.

Cossette, H., Delwarde, A., Denuit, M., Guillot, F., and Marceau, E. (2007). Pension plan valuation and dynamic mortality tables. North American Actuarial Journal, 11, 1–34.

Cowley, A. and Cummins, J. D. (2005). Securitization of life insurance assets and liabilities. The Journal of Risk and Insurance, 72(2), 193–226.

Cox, S. H., Fairchild, J. R., and Pedersen, H. W. (2000). Economic aspects of securitization of risk. ASTIN Bulletin, 30(1), 157–193.

Cox, S. H. and Lin, Y. (2007). Natural hedging of life and annuity mortality risks. North American Actuarial Journal, 11, 1–15.

Cramér, H. and Wold, H. (1935). Mortality variations in Sweden: a study in graduation and forecasting. Skandinavisk Aktuarietidskrift, 18, 161–241.

Crimmins, E. and Finch, C. (2006). Infection, inflammation, height and longevity. Proceedings of the National Academy of Sciences, 103, 498–503.

Cummins, J. D., Smith, B. D., Vance, R. N., and VanDerhei, J. L. (1983). Risk classification in life insurance. Kluwer-Nijhoff Publishing, Boston, The Hague, London.

Czado, C., Delwarde, A., and Denuit, M. (2005). Bayesian Poisson log-bilinear mortality projections. Insurance: Mathematics & Economics, 36(3), 260–284.

Dahl, M. (2004). Stochastic mortality in life insurance. Market reserves and mortality-linked insurance contracts. Insurance: Mathematics & Economics, 35(1), 113–136.

Dahl, M. and Møller, T. (2006). Valuation and hedging of life insurance liabilities with systematic mortality risk. Insurance: Mathematics & Economics, 39(2), 193–217.

Davidson, A. R. and Reid, A. R. (1927). On the calculation of rates of mortality. Transactions of the Faculty of Actuaries, 11(105), 183–232.

Davy Smith, G., Hart, C., Blane, D., and Hole, D. (1998). Adverse socio-economic conditions in childhood and cause specific adult mortality: a prospective observational study. British Medical Journal, 316, 1631–1635.

De Jong, P. and Tickle, L. (2006). Extending the Lee–Carter model of mortality projection. Mathematical Population Studies, 13, 1–18.

Delwarde, A. and Denuit, M. (2006). Construction de tables de mortalité périodiques et prospectives. Ed. Economica, Paris.

Delwarde, A., Denuit, M., and Eilers, P. (2007a). Smoothing the Lee–Carter and Poisson log-bilinear models for mortality forecasting. Statistical Modelling, 7, 29–48.

Delwarde, A., Denuit, M., and Partrat, Ch. (2007b). Negative binomial version of the Lee–Carter model for mortality forecasting. Applied Stochastic Models in Business and Industry, 23, 385–401.

Delwarde, A., Denuit, M., Guillen, M., and Vidiella, A. (2006). Application of the Poisson log-bilinear projection model to the G5 mortality experience. Belgian Actuarial Bulletin, 6, 54–68.

Delwarde, A., Kachakhidze, D., Olié, L., and Denuit, M. (2004). Modèles linéaires et additifs généralisés, maximum de vraisemblance local et méthodes relationnelles en assurance sur la vie. Bulletin Français d'Actuariat, 6, 77–102.

Denuit, M. (2007). Comonotonic approximations to quantiles of life annuity conditional expected present values. Insurance: Mathematics & Economics, 42, 831–838.

Denuit, M. (2008). Life annuities with stochastic survival probability: a review. Methodology and Computing in Applied Probability, to appear.

Denuit, M., Devolder, P., and Goderniaux, A.-C. (2007). Securitization of longevity risk: pricing survivor bonds with Wang transform in the Lee–Carter framework. The Journal of Risk and Insurance, 74(1), 87–113.

Denuit, M. and Dhaene, J. (2007). Comonotonic bounds on the survival probabilities in the Lee–Carter model for mortality projections. Computational and Applied Mathematics, 203, 169–176.

Denuit, M., Dhaene, J., Goovaerts, M. J., and Kaas, R. (2005). Actuarial theory for dependent risks: measures, orders and models. Wiley, New York.

Denuit, M. and Frostig, E. (2007a). Association and heterogeneity of insured lifetimes in the Lee–Carter framework. Scandinavian Actuarial Journal, 107, 1–19.

Denuit, M. and Frostig, E. (2007b). Life insurance mathematics with random life tables. WP 07-07, Institut des Sciences Actuarielles, Université Catholique de Louvain, Louvain-la-Neuve, Belgium.

Denuit, M. and Frostig, E. (2008). First-order mortality basis for life annuities. The Geneva Risk and Insurance Review, to appear.

Denuit, M. and Goderniaux, A.-C. (2005). Closing and projecting life tables using log-linear models. Bulletin of the Swiss Association of Actuaries, (1), 29–48.

Denuit, M. and Vermandele, C. (1998). Optimal reinsurance and stop-loss order. Insurance: Mathematics & Economics, 22, 229–233.

Department for Work and Pensions (2002). Modernising annuities. Technical report, Inland Revenue, London.

Dowd, K. (2003). Survivor bonds: A comment on Blake and Burrows. The Journal of Risk and Insurance, 70(2), 339–348.

Dowd, K., Blake, D., Cairns, A. J. G., and Dawson, P. (2006). Survivor swaps. The Journal of Risk and Insurance, 73(1), 1–17.

Currie, I. D., Durban, M., and Eilers, P. H. C. (2004). Smoothing and forecasting mortality rates. Statistical Modelling, 4, 279–298.

Eilers, P. H. C. and Marx, B. D. (1996). Flexible smoothing with B-splines and penalties. Statistical Science, 11, 89–121.

Emms, P. and Haberman, S. (2008). Income drawdown schemes for a defined contribution pension plan. Journal of Risk and Insurance, 75(3), 739–761.

Evandrou, E. and Falkingham, J. (2002). Smoking behaviour and socio-economic class: a cohort analysis, 1974 to 1998. Health Statistics Quarterly, 14, 30–38.

Faculty of Actuaries Working Party (1986). The solvency of life assurance companies. Transactions of the Faculty of Actuaries, 39(3), 251–340.

Felipe, A., Guillèn, M., and Perez-Marin, A. M. (2002). Recent mortality trends in the Spanish population. British Actuarial Journal, 8, 757–786.

Finetti, de, B. (1950). Matematica attuariale. Quaderni dell'Istituto per gli Studi Assicurativi (Trieste), 5, 53–103.

Finetti, de, B. (1957). Lezioni di matematica attuariale. Edizioni Ricerche, Roma.

Forfar, D. O. (2004a). Life table. In Encyclopedia of actuarial science (ed. J. L. Teugels and B. Sundt), Volume 2, pp. 1005–1009. John Wiley & Sons.

Forfar, D. O. (2004b). Mortality laws. In Encyclopedia of actuarial science (ed. J. L. Teugels and B. Sundt), Volume 2, pp. 1139–1145. John Wiley & Sons.

Forfar, D. O., McCutcheon, J. J., and Wilkie, A. D. (1988). On graduation by mathematical formulae. Journal of the Institute of Actuaries, 115, 1–149.

Forfar, D. O. and Smith, D. M. (1988). The changing shape of English Life Tables. Transactions of the Faculty of Actuaries, 40, 98–134.

Francis, B., Green, M., and Payne, C. (1993). The GLIM system: Release 4 Manual. Clarendon Press, Oxford.

Friedberg, L. and Webb, A. (2005). Life is cheap: using mortality bonds to hedge aggregate mortality risk. WP No. 2005-13, Center for Retirement Research at Boston College.

Gerber, H. U. (1995). Life insurance mathematics. Springer-Verlag.

Gerrard, R., Haberman, S., and Vigna, E. (2006). The management of decumulation risks in a defined contribution environment. North American Actuarial Journal, 10(1), 84–110.

Girosi, F. and King, G. (2007). Understanding the Lee–Carter mortality forecasting method. Technical report.

Government Actuary's Department (1995). National Population Projections 1992-based. HMSO, London.

Government Actuary's Department (2001). National Population Projections: review of methodology for projecting mortality. Government Actuary's Department, London.

Government Actuary's Department (2002). National Population Projections 2000-based. HMSO, London.

Goss, S. C., Wade, A., and Bell, F. (1998). Historical and projected mortality for Mexico, Canada and the United States. North American Actuarial Journal, 2(4), 108–126.

Group Annuity Valuation Table Task Force (1995). 1994 Group annuity mortality table and 1994 Group annuity reserving table. Transactions of the Society of Actuaries, 47, 865–913.

Gründl, H., Post, T., and Schulze, R. N. (2006). To hedge or not to hedge: managing demographic risk in life insurance companies. The Journal of Risk and Insurance, 73(1), 19–41.

Gupta, A. K. and Varga, T. (2002). An introduction to actuarial mathematics. Kluwer Academic Publishers.

Gutterman, S. and Vanderhoof, I. T. (1998). Forecasting changes in mortality: a search for a law of causes and effects. North American Actuarial Journal, 2(4), 135–138.

Haberman, S. (1996). Landmarks in the history of actuarial science (up to 1919). Actuarial Research Paper No. 84, Dept. of Actuarial Science and Statistics, City University, London.

Haberman, S. and Renshaw, A. (2008). On simulator-based approaches to risk measurement in mortality with specific reference to binomial Lee–Carter modelling. Presented to the Living to 100: Survival to Advanced Ages international symposium. Society of Actuaries, Orlando, Florida.

Haberman, S. and Renshaw, A. (2007). Discussion of “Pension plan valuation and mortality projection: a case study with mortality data”. North American Actuarial Journal, 11(4), 148–150.

Haberman, S. and Sibbett, T. A. (eds.) (1995). History of actuarial science. Pickering & Chatto, London.

Hald, A. (1987). On the early history of life insurance mathematics. Scandinavian Actuarial Journal, (1), 4–18.

Hamilton, J. (1994). Time series analysis. Princeton University Press, Princeton.

Harrington, S. E. and Niehaus, G. R. (1999). Risk management and insurance. Irwin/McGraw-Hill.

Heligman, L. and Pollard, J. H. (1980). The age pattern of mortality. Journal of the Institute of Actuaries, 107, 49–80.

Horiuchi, S. and Wilmoth, J. R. (1998). Deceleration in the age pattern of mortality at older ages. Demography, 35(4), 391–412.

Hougaard, P. (1984). Life table methods for heterogeneous populations: distributions describing the heterogeneity. Biometrika, 71, 75–83.

Hyndman, R. J. and Ullah, Md. S. (2007). Robust forecasting of mortality and fertility rates: a functional data approach. Computational Statistics and Data Analysis, 51, 4942–4956.

IAA (2004). A global framework for insurer solvency assessment. Research Report of the Insurer Solvency Assessment Working Party, International Actuarial Association.

James, I. R. and Segal, M. R. (1982). On a method of mortality analysis incorporating age–year interaction, with application to prostate cancer mortality. Biometrics, 38, 433–443.

Kannisto, V., Lauritsen, J., Thatcher, A. R., and Vaupel, J. W. (1994). Reductions in mortality at advanced ages: several decades of evidence from 27 countries. Population and Development Review, 20, 793–810.

Keyfitz, N. (1982). Choice of functions for mortality analysis: Effective forecasting depends on a minimum parameter representation. Theoretical Population Biology, 21, 329–352.

Koissi, M.-C., Shapiro, A. F., and Högnäs, G. (2006). Evaluating and extending the Lee–Carter model for mortality forecasting: Bootstrap confidence interval. Insurance: Mathematics & Economics, 38(1), 1–20.

Koller, G. (1999). Risk assessment and decision making in business and industry. CRC Press.

Kopf, E. W. (1926). The early history of life annuity. Proceedings of the Casualty Actuarial Society, 13(27), 225–266.

Kotz, S., Balakrishnan, N., and Johnson, N. L. (2000). Continuous multivariate distributions (2nd edn), Volume 1: Models and applications. John Wiley & Sons.

Lee, R. D. (2000). The Lee–Carter method for forecasting mortality, with various extensions and applications. North American Actuarial Journal, 4(1), 80–93.

Lee, R. D. (2003). Mortality forecasts and linear life expectancy trends. Technical report.

Lee, R. D. and Carter, L. R. (1992). Modelling and forecasting U.S. mortality. Journal of the American Statistical Association, 87(14), 659–675.

Lee, R. and Miller, T. (2001). Evaluating the performance of the Lee–Carter approach to modelling and forecasting. Demography, 38, 537–549.

Li, N. and Lee, R. D. (2005). Coherent mortality forecasts for a group of populations: an extension of the Lee–Carter method. Demography, 42, 575–594.

Lin, Y. and Cox, S. H. (2005). Securitization of mortality risks in life annuities. The Journal of Risk and Insurance, 72(2), 227–252.

Lin, Y. and Cox, S. H. (2008). Securitization of catastrophe mortality risks. Insurance: Mathematics & Economics, 42, 628–637.

Lindbergson, M. (2001). Mortality among the elderly in Sweden 1988–97. Scandinavian Actuarial Journal, (1), 79–94.

Loader, C. (1999). Local regression and likelihood. Springer, New York.

London, D. (1985). Graduation: the revision of estimates. ACTEX Publications.

Lundström, H. and Qvist, J. (2004). Mortality forecasting and trend shifts: an application of the Lee–Carter model to Swedish mortality data. International Statistical Review, 72, 37–50.

Manton, K. G. and Stallard, E. (1984). Recent trends in mortality analysis. Academic Press.

Marocco, P. and Pitacco, E. (1998). Longevity risk and life annuity reinsurance. In Transactions of the 26th International Congress of Actuaries, Birmingham, Volume 6, pp. 453–479.

McCrory, R. T. (1986). Mortality risk in life annuities. Transactions of the Society of Actuaries, 36, 309–338.

McCutcheon, J. J. (1981). Some remarks on splines. Transactions of the Faculty of Actuaries, 37, 421–438.

Milevsky, M. A. and Promislov, S. D. (2001). Mortality derivatives and the option to annuitise. Insurance: Mathematics & Economics, 29(3), 299–318.

Milevsky, M. A. (2005). The implied longevity yield: A note on developing an index for life annuities. The Journal of Risk and Insurance, 72(2), 301–320.

Milevsky, M. A. (2006). The calculus of retirement income. Financial models for pension annuities and life insurance. Cambridge University Press.

Miller, R. T. (2004). Graduation. In Encyclopedia of actuarial science (ed. J. L. Teugels and B. Sundt), Volume 2, pp. 780–784. John Wiley & Sons.

Morgan Stanley-Equity Research Europe (2003). Swiss Re: Innovative mortality-based security. Technical report, Morgan Stanley.

Namboodiri, K. and Suchindran, C. M. (1987). Life table techniques and their applications. Academic Press.

National Statistics-Government Actuary's Department (2001). National population projections: Review of methodology for projecting mortality. National Statistics Quality Review Series, Report No. 8.

Nordenmark, N. V. E. (1906). Über die Bedeutung der Verlängerung der Lebensdauer für die Berechnung der Leibrenten. In Transactions of the 5th International Congress of Actuaries, Volume 1, Berlin, pp. 421–430.

O'Brien, C. D. (2002). Guaranteed annuity options: five issues for resolution. British Actuarial Journal, 8, 593–629.

Oeppen, J. and Vaupel, J. W. (2002). Broken limits to life expectancy. Science, 296, 1029–1031.

Office of National Statistics (1997). The health of adult Britain 1841–1994. HMSO, London.

Olivieri, A. (2001). Uncertainty in mortality projections: an actuarial perspective. Insurance: Mathematics & Economics, 29(2), 231–245.

Olivieri, A. (2005). Designing longevity risk transfers: the point of view of the cedant. Giornale dell'Istituto Italiano degli Attuari, 68, 1–35. Reprinted in: ICFAI Journal of Financial Risk Management, 4 (March 2007), 55–83.

Olivieri, A. (2006). Heterogeneity in survival models. Applications to pensions and life annuities. Belgian Actuarial Bulletin, 6, 23–39. http://www.actuaweb.be/frameset/frameset.html

Olivieri, A. and Ferri, S. (2003). Mortality and disability risks in long term care insurance. IAAHS Online Journal. http://www.actuaries.org/members/en/IAAHS/OnlineJournal/2003-1/2003-1.pdf

Olivieri, A. and Pitacco, E. (2002a). Inference about mortality improvements in life annuity portfolios. In Transactions of the 27th International Congress of Actuaries, Cancun (Mexico).

Olivieri, A. and Pitacco, E. (2002b). Managing demographic risks in long term care insurance. Rendiconti per gli Studi Economici Quantitativi, 2, 15–37.

Olivieri, A. and Pitacco, E. (2002c). Premium systems for post-retirement sickness covers. Belgian Actuarial Bulletin, 2, 15–25.

Olivieri, A. and Pitacco, E. (2003). Solvency requirements for pension annuities. Journal of Pension Economics & Finance, 2, 127–157.

Olivieri, A. and Pitacco, E. (2008). Assessing the cost of capital for longevity risk. Insurance: Mathematics & Economics, 42, 1013–1021.

Olshansky, S. J., Passaro, D., Hershaw, R., Layden, J., Carnes, B. A., Brody, J., Hayflick, L., Butler, R. N., Allison, D. B., and Ludwig, D. S. (2005). A potential decline in life expectancy in the United States in the 21st century. New England Journal of Medicine, 352, 1103–1110.

Olshansky, S. J. (1988). On forecasting mortality. The Milbank Quarterly, 66(3), 482–530.

Olshansky, S. J. and Carnes, B. E. (1997). Ever since Gompertz. Demography, 34, 1–15.

O'Malley, P. (2007). Development of GMxB markets in Europe. In Transactions of the 1st IAA Life Colloquium, Stockholm.

Pelsser, A. (2003). Pricing and hedging guaranteed annuity options via static option replication. Insurance: Mathematics & Economics, 33(2), 283–296.

Petrioli, L. and Berti, M. (1979). Modelli di mortalità. Franco Angeli Editore, Milano.

Piggot, J., Valdez, E. A., and Detzel, B. (2005). The simple analytics of a pooled annuity fund. The Journal of Risk and Insurance, 72(3), 497–520.

Pitacco, E. (2004a). From Halley to “frailty”: a review of survival models for actuarial calculations. Giornale dell'Istituto Italiano degli Attuari, 67(1–2), 17–47.

Pitacco, E. (2004b). Longevity risks in living benefits. In Developing an annuity market in Europe (ed. E. Fornero and E. Luciano), pp. 132–167. Edward Elgar, Cheltenham.

Pitacco, E. (2004c). Survival models in a dynamic context: a survey. Insurance: Mathematics & Economics, 35(2), 279–298.

Pitacco, E. (2007). Mortality and longevity: a risk management perspective. In Proceedings of the 1st IAA Life Colloquium, Stockholm.

Pollard, A. H. (1949). Methods of forecasting mortality using Australian data. Journal of the Institute of Actuaries, 75, 151–182.

Pollard, J. H. (1987). Projection of age-specific mortality rates. Population Bulletin of the UN, 21–22, 55–69.

Poulin, C. (1980). Essai de mise au point d'un modèle représentatif de l'évolution de la mortalité humaine. In Transactions of the 21st International Congress of Actuaries, Volume 2, Zürich-Lausanne, pp. 205–211.

Renshaw, A. E. and Haberman, S. (2000). Modelling for mortality reduction factors. Actuarial Research Paper No. 127, Dept. of Actuarial Science and Statistics, City University, London.

Renshaw, A. E. and Haberman, S. (2003a). Lee–Carter mortality forecasting, a parallel generalized linear modelling approach for England & Wales mortality projections. Applied Statistics, 52, 119–137.

Renshaw, A. E. and Haberman, S. (2003b). Lee–Carter mortality forecasting with age specific enhancement. Insurance: Mathematics & Economics, 33(2), 255–272.

Renshaw, A. E. and Haberman, S. (2003c). On the forecasting of mortality reduction factors. Insurance: Mathematics & Economics, 32(3), 379–401.

Renshaw, A. E. and Haberman, S. (2005). Lee–Carter mortality forecasting incorporating bivariate time series for England and Wales mortality projections. Technical report.

Renshaw, A. E. and Haberman, S. (2006). A cohort-based extension to the Lee–Carter model for mortality reduction factors. Insurance: Mathematics & Economics, 38(3), 556–570.

Renshaw, A. E. and Haberman, S. (2008). On simulation-based approaches to risk measurement in mortality with specific reference to Poisson Lee–Carter modelling. Insurance: Mathematics & Economics, 42, 797–816.

Renshaw, A. E., Haberman, S., and Hatzopoulos, P. (1996). The modelling of recent mortality trends in United Kingdom male assured lives. British Actuarial Journal, 2(II), 449–477.

Retirement Choice Working Party (2001). Extending retirement choices. Retirement income options for modern needs. The Faculty and Institute of Actuaries.

Richards, S., Ellam, J., Hubbard, J., Lu, J., Makin, S., and Miller, K. (2007). Two-dimensional mortality data: patterns and projections. Presented to the Institute of Actuaries.

Richards, S. J. and Jones, G. L. (2004). Financial aspects of longevity risk. The Staple Inn Actuarial Society, London.

Riemer-Hommel, P. and Trauth, T. (2000). Challenges and solutions for the management of longevity risk. In Risk management. Challenge and opportunity (ed. M. Frenkel, U. Hommel, and M. Rudolf), pp. 85–100. Springer.

Rotar, V. I. (2007). Actuarial models. The mathematics of insurance. Chapman & Hall/CRC.

Sandström, A. (2006). Solvency. Models, assessment and regulation. Chapman & Hall/CRC.

Sithole, T. Z., Haberman, S., and Verrall, R. J. (2000). An investigation into parametric models for mortality projections, with applications to immediate annuitants and life office pensioners' data. Insurance: Mathematics & Economics, 27(3), 285–312.


Skwire, D. (1997). Actuarial issues in the novels of Jane Austen. North American Actuarial Journal, 1(1), 74–83.

Smith, D. and Keyfitz, N. (eds.) (1977). Mathematical demography. Selected papers. Springer Verlag, Berlin.

Sun, F. (2006). Pricing and risk management of variable annuities with multiple guaranteed minimum benefits. Actuarial Practice Forum. Society of Actuaries.

Sverdrup, E. (1952). Basic concepts in life assurance mathematics. Skandinavisk Aktuarietidskrift, 3–4, 115–131.

Swiss Re (2007). Annuities: a private solution to longevity risk. Sigma, 3.

Tabeau, E., van den Berg Jeths, A., and Heathcote, C. (eds.) (2001). Forecasting mortality in developed countries. Kluwer Academic Publishers.

Thatcher, A. R. (1999). The long-term pattern of adult mortality and the highest attained age. Journal of the Royal Statistical Society, A, 162, 5–43.

Tuljapurkar, S., Li, N., and Boe, C. (2000). A universal pattern of mortality decline in the G7 countries. Nature, 405, 789–792.

Tuljapurkar, S. and Boe, C. (1998). Mortality change and forecasting: how much and how little do we know. North American Actuarial Journal, 2, 13–47.

Vaupel, J. W., Manton, K. G., and Stallard, E. (1979). The impact of heterogeneity in individual frailty on the dynamics of mortality. Demography, 16(3), 439–454.

Verrall, R., Haberman, S., Sithole, T., and Collinson, D. (2006, September). The price of mortality. Life and Pensions, 35–40.

Wadsworth, M., Findlater, A., and Boardman, T. (2001). Reinventing annuities. The Staple Inn Actuarial Society, London.

Wang, S. H. (2002). A universal framework for pricing financial and insurance risks. ASTIN Bulletin, 32(2), 213–234.

Wang, S. H. (2004). Cat bond pricing using probability transforms. The Geneva Papers on Risk and Insurance: Issues and Practice, 278, 19–29.

Wang, S. S. and Brown, R. L. (1998). A frailty model for projection of human mortality improvement. Journal of Actuarial Practice, 6(1–2), 221–241.

Wetterstrand, W. H. (1981). Parametric models for life insurance mortality data: Gompertz's law over time. Transactions of the Society of Actuaries, 33, 159–175.

Wilkie, A. D., Waters, H. R., and Yang, S. Y. (2003). Reserving, pricing and hedging for policies with guaranteed annuity options. British Actuarial Journal, 9, 263–425.

Wilkie, A. D. (1997). Mutuality and solidarity: assessing risks and sharing losses. British Actuarial Journal, 3, 985–996.


Wilkinson Tiller, M., Blinn, J. D., and Kelly, J. J. (1990). Essentials of risk financing. Insurance Institute of America.

Willets, R. C. (2004). The cohort effect: insights and explanations. British Actuarial Journal, 10, 833–877.

Williams Jr., C. A., Smith, M. L., and Young, P. C. (1998). Risk management and insurance. Irwin/McGraw-Hill.

Wilmoth, J. R. (1993). Computational methods for fitting and extrapolating the Lee–Carter model of mortality change. Technical report.

Wilmoth, J. R. (2000). Demography of longevity: Past, present, and future trends. Journal of Experimental Gerontology, 35, 1111–1129.

Wilmoth, J. R. and Horiuchi, S. (1999). Rectangularization revisited: variability of age at death within human populations. Demography, 36(4), 475–495.

Wong-Fupuy, C. and Haberman, S. (2004). Projecting mortality trends: recent developments in the United Kingdom and the United States. North American Actuarial Journal, 8, 56–83.

Yaari, M. E. (1965). Uncertain lifetime, life insurance, and the theory of the consumer. Review of Economic Studies, 32(2), 137–150.

Yashin, A. I. and Iachine, I. A. (1997). How frailty models can be used for evaluating longevity limits: Taking advantage of an interdisciplinary approach. Demography, 34, 31–48.


Index

account value 42
accumulation period 31, 32, 33–6, 344, 350, 364, 367
actuarial value 9–12
additive model 79
adverse selection 41
age at death variability 113–15
age rating models 79
age shifts 79, 127–9, 155–6
age-patterns of mortality 13–14, 159–60, 178
age-period life tables 93–5
age-period-cohort models see APC (Age-Period-Cohort) models
age-specific functions 60, 139–40
aggregate table 51
alternative risk transfer (ART) 297
  see also risk transfer
Andreev–Vaupel life expectancy projections 235–7
annual probability of death 48
  laws for 66
  mortality modelling by extrapolation 141–52, 162
    versus interpolation 165–6
annual survival probability 48
annuities-certain 2–8, 36
  avoiding early fund exhaustion 5–6
  equivalent number of payments 355
  risks in 6–8
  withdrawing from fund 2–5
annuitization 35, 364–9
  staggered 368
annuity in advance 32–3
annuity in arrears 8, 31
APC (Age-Period-Cohort) models 173–5
  application to UK mortality data 254–63
  Lee–Carter APC model 246–54
    error structure and model fitting 248–52
    model structure 246–8
    mortality rate projections 253
apportionable annuity 39
asymptotic mortality 147
autoregressive integrated moving average (ARIMA) models 221–3, 231–2, 253

B-splines 71–2, 210, 265
Balducci assumption 58
Banking, Finance, and Insurance Commission (BFIC) 92–3
Barnett law 66
Beard law 66
Belgium 130–5
  Cairns–Blake–Dowd model application 207–9
  Lee–Carter model application 200–3
    prediction intervals 232–4
    smoothing 213–14
  life expectancy forecasting 237–9
  optimal calibration period selection 217–18
  residuals analysis 220–1
  see also Federal Planning Bureau (FPB), Belgium
Bernoulli model 122
binomial maximum likelihood estimation 198–9
  negative 199–200
bonus rates 39
bootstrapping 229–30
  application to Belgian mortality statistics 232–4
  bootstrap percentiles confidence intervals 230–2
Brass logit transform 167–8

Cairns–Blake–Dowd mortality projection model 183–4, 203–9
  allowing for cohort effects 263–5
  application to Belgian mortality statistics 207–9
  calibration 206–7
  optimal calibration period selection 217, 218
  residuals analysis 220–1
  specification 203–6
  time index modelling 228–9
  see also mortality modelling
calibration period selection 214–18
  application to Belgian mortality statistics 217–18
  motivation 214–16
  selection procedure 216–17
capital protection 40
cash-refund annuity 40
catastrophe risk 269
central death rate 57
Coale–Kisker model 76
coefficient of variation 61
cohort effect 243–5
  in Cairns–Blake–Dowd model 263–5
  in P-splines model 265–6
  UK 243–5
  see also APC (Age-Period-Cohort) models
cohort life expectancies 112–13, 153
cohort life table 46, 140
  in projected table 152–3
complete expectation of life 60
complete life annuity 39
conditional GAR products 348, 349
constant-growth annuity 38
Continuous Mortality Investigation Bureau (CMIB), UK 243
  software 185
cross-subsidy 14–20
  mutuality 14–16
  solidarity 16–18
  tontine annuities 18–20
cubic spline 70
curtate expectation of life 59
curtate remaining lifetime 49
curve of deaths 54
curve squaring 105–6

death
  age at, variability 113–15
  annual probability of 48
  curve of deaths 54
  death rates 96–101
    central 57
    observed 116–18
    smoothed 118–22, 209–14
  uniform distribution of deaths 57–8
  see also mortality
decumulation period 31, 32, 36–8, 344, 345, 350
deferred life annuity 32–3
diminished entelechy hypothesis 244–5
distribution function 53–4
dynamic mortality model 139

endowment 33–4, 344–5
endurance 61
England see United Kingdom
enhanced annuities 41
enhanced pensions 41
entropy 61
Equitable Life 135
equity-indexed annuity 38
equivalence principle 12
equivalent discount rate 355
equivalent entry age 355
equivalent number of payments 355
escalating annuities 38
Esscher formula 151
excess-of-loss (XL) reinsurance 319–20, 326
exhaustion time 5
expansion 138, 161, 168, 179
expected lifetime 59, 139, 152, 170
exponential formula 145–6, 149
  alternative approach 146–7
  formulae used in actuarial practice 149–51
  generalization 147
  implementation 148
exposure-to-risk (ETR) 95–6, 97

failure rate 55
fan charts 170, 240
Federal Planning Bureau (FPB), Belgium 91–2
  life expectancy projections 235
financing post-retirement income 354–69, 371–2
  comparing life annuity prices 354–6
  flexibility in 363–9
  life annuities versus income drawdown 356–9
  mortality drag 359–63
first-order basis 12, 13
fixed-rate escalating annuity 38
force of mortality 55–6, 58, 82–3, 94–5
  cumulative 56
  laws for 64–5
frailty 80–3
  models 83–5, 88
    combined with mortality laws 85–7
France 130–5
fund exhaustion
  avoiding 5–6
  exhaustion time 5

Gamma distribution 83–5, 87
Gaussian-Inverse distribution 85
Germany 130
GLB (Guaranteed Living Benefits) 43
GM (Gompertz-Makeham) models 65, 163–4
GMAB (Guaranteed Minimum Accumulation Benefit) 42, 43
GMDB (Guaranteed Minimum Death Benefit) 42
GMIB (Guaranteed Minimum Income Benefit) 42, 43
GMWB (Guaranteed Minimum Withdrawal Benefit) 42–3
GMxBs (Guaranteed Minimum Benefits of type 'x') 41–3
Gompertz model 55–6, 64, 85–6
  see also GM (Gompertz-Makeham) models
graduation 67–8, 87–8
  mortality graduation over age and time 163–5
  see also non-parametric graduation
guaranteed annuity 346
guaranteed annuity option (GAO) 35, 297, 346–7
  valuation of 354
guaranteed annuity rate (GAR) 346–7
  adding flexibility 347–50
  conditional GAR products 348, 349
  with-profit GAR products 349
Gyldén, H. 175–6

hazard function 55
  cumulative 56
healthy worker effect 122
hedging 298
  across LOBs 303
  across time 299–302
  life annuity liabilities through longevity bonds 337–43
  natural hedging 298, 299–303, 370
Heligman–Pollard laws 12, 66, 75, 178, 179, 276
highest anniversary value 42
Human Mortality Database (HMD) 92

impaired-life annuities 41
implied longevity yield (ILY) 15, 363
inception-select mortality 51
index-linked escalating annuity 38
inflation-linked annuity 38
instalment-refund annuity 40
insurance risk 269
insured population 14
internal knots 69, 70
interquartile range 61–2
investment-linked annuities 38–9
issue-select mortality 51
Italy 130

joint-life annuity 37

K-K-K hypothesis 173
knots 69–70
Kwiatkowski–Phillips–Schmidt–Shin test 224

last-survivor annuity 37
Lee–Carter (LC) model 169–73, 178–80, 182–4, 186–203
  age-period-cohort model 246–54
    see also APC (Age-Period-Cohort) models
  application to Belgian mortality statistics 200–3
  application to UK mortality statistics 254–63
  calibration 188–200
    alternative estimation procedures 198–200
    identifiable constraints 188–9
    least-squares estimation 189–98
    optimal calibration period selection 214–18
  extensions 172, 180, 192–200
  life expectancy forecasting 237–9, 241–2
  model tables and 173
  prediction intervals 232–4
  residuals analysis 218–21
  smoothing in 212–13
  specification 186–8
  time index modelling 221–8
    random walk with drift model 225–8
    stationarity 223–4
  see also mortality modelling
level annuities 38
Lexis diagram 94
Lexis point 60
liability 11
life annuities 2–8
  accumulation period 33–6
  as financial transactions 8
  avoiding early fund exhaustion 5–6
  cross-subsidy in 14–20
  decumulation period 36–8
  deterministic evaluation 8–14
    actuarial value 9–12
    technical bases 12–14
  immediate versus deferred annuities 31–3
  longevity risk and 343–50
  mortality risk location 343–6
  payment profile 38–40
  present value of 351–2
  price comparisons 354–6
  risks in 6–8
  stochastic evaluation 20–30
    focussing on portfolio results 21–4
    random present value 20–1
    risk assessment 24–7
    uncertainty in mortality assumptions 27–30
  temporary life annuity 36
  versus income drawdown 356–9
  whole life annuity 36
  with a guarantee period 37
  withdrawing from fund 2–5
life expectancy 59–60, 89
  Andreev–Vaupel projections 235–7
  Belgian Federal Planning Bureau (FPB) projections 235
  cohort life expectancies 112–13, 153
  forecasting 234–42
    application to Belgian mortality statistics 237–9
    back testing 240–2
    fan charts 240
  heterogeneity 115–16
  observed 122–3
  period life expectancies 62, 111–13
life insurance market 116–29
  age shifts 127–9
  life expectancies 122–3
  observed death rates 116–18
  smoothed death rates 118–22
life insurance securitization 330–2
life tables 46–51, 93
  aggregate table 51
  as probabilistic models 48–9
  closure 101–5
  cohort life table 46, 140
    in projected table 152–3
  limit table 165–6
  optimal 166, 177
  period life table 46–7, 93, 140
    age-period 93–5
  population versus market tables 47–8
  projected life table 47
  projecting transforms of life table functions 167–9
  ultimate life table 51
LifeMetrics 185
lifetime probability distribution 58
limiting age 4
linear spline 70
lines of business (LOBs) 298
  natural hedging across LOBs 303
liquidation period see decumulation period
liquidity risk 7
location measure 60
logit transform of the survival function 73
long-term bonds 335
longevity bonds 332, 335–7, 371
  hedging life annuity liabilities through 337–43
longevity risk 1, 267, 268–93, 369
  life annuities and 343–50
  management 293–330
    natural hedging 299–303
    reinsurance arrangements 318–30, 371
    risk management perspective 293–9
    solvency issues 303–18, 370
    see also risk management
  measurement in a static framework 276–93
  mortality risks 268–70
  pricing and 350–4, 371
  representation 273–6
  stochastic modelling issues 270–3
loss control techniques 296–7
loss financing techniques 297

Makeham laws 64, 67, 76, 159, 176–7, 179
  see also GM (Gompertz-Makeham) models
market risk 7
maximum downward slope 61
median age at death 60
model risk 269
model tables 165–6, 173, 177–8
money-back annuities 302
Monte Carlo simulation 22, 230–1
mortality
  age-patterns 13–14, 159–60, 178
  allowing for uncertainty 27–30
  asymptotic 147
  at very old ages 74–6, 88
  best estimate 29
  by causes 67, 175
  force of 55–6, 58, 64–5, 82–3, 94–5
    cumulative 56
  forecasting see mortality modelling
  graduation over age and time 163–5
  heterogeneity 77–87, 88
    frailty models 83–7
    models for differential mortality 78–80
    observable heterogeneity factors 77–8
    unobservable heterogeneity factors 80–3
  laws 63–7, 179
    combined with frailty models 85–7
    projections and 156–60
  risk of random fluctuation 25
  select 49–51
  trends see mortality trends
  see also death; life tables; survival
mortality bonds 332, 333–4
mortality drag 15, 359–63
mortality modelling 137–9, 175–80
  age-period models 181–242
  age-period-cohort models 243–66
  age-specific functions 139–40
  cohort versus period approach 173–5
  diagonal approach 157–9, 162, 177
  dynamic approach 137–41
  extrapolation of annual probabilities of death 141–52, 162
    versus interpolation 165–6
  horizontal approach 143–4, 162, 176
  life expectancy forecasting 234–42
  model tables 165–6, 173, 177–8
  mortality by causes 175
  mortality projection 221–9
    projection in parametric context 156–65
  prediction intervals 229–34
  projected table use 152–6
  projecting transforms of life table functions 167–9
  relational method 178
  surface approach 163
  vertical approach 157, 159–60, 162, 177
  see also Cairns–Blake–Dowd mortality projection model; Lee–Carter (LC) model
mortality odds 49
mortality profile 138, 140
mortality risks 268–70
  location in traditional life annuity products 343–6
mortality trends 93–116, 176
  age-period life tables 93–5
  closure of life tables 101–5
  death rates 96–101
  exposure-to-risk 95–6
  expression via Weibull's parameters 160–1
  heterogeneity 115–16
  life expectancies 111–13
  life insurance market 116–29
  mortality surfaces 101
  rectangularization and expansion 105–11
  throughout the EU 129–35
  variability 113–15
  see also mortality
mortality-linked securities 332–7
multiplicative model 79
mutuality 6, 14–16, 17–18, 357
  interest from 15

Nadaraya–Watson kernel estimate 120
natural cubic spline 70
natural hedging 298, 299–303, 370
  across LOBs 303
  across time 299–302
Newton–Raphson procedure 193–4
no advance funding 298
non-guaranteed annuity 346–7
non-parametric graduation 67–72
  splines 69–72
  Whittaker–Henderson model 68–9
non-pooling risk 285
numerical rating system 79–80

option to annuitize 35, 297, 346
overdispersed Poisson and negative binomial maximum likelihood estimation 199–200

P-splines model
  allowing for cohort effects 265–6
  smoothing approach 210–11
parameter risk 269
participating GAR products 349
payout period see decumulation period
Pearson residuals 220
pension annuities 40–1
period life expectancies 62, 111–13
period life table 46–7, 140
  age-period 93–5
Perks laws 65, 75–6, 86–7
Petrioli–Berti model 168–9
Poisson bootstrap 231
Poisson log-bilinear model 172
Poisson maximum likelihood estimation 196–8
  overdispersed 199–200
pooling risk 285
post-retirement income financing see financing post-retirement income
prediction intervals 229–34
  application to Belgian mortality statistics 232–4
premium 8
  return of premiums 35, 42
present value 351–2
pricing
  longevity risk and 350–4, 371
  reinsurance arrangements 325–6
probability density function (pdf) 53–4
probability of default 295
process risk 25, 269
profit participation mechanisms 13, 39
projected life table 47
projected mortality model 139
  extrapolation of annual probabilities of death 141–52, 162
    versus interpolation 165–6
  parametric context 156–65
  see also mortality modelling; projected mortality table
projected mortality table 152–6
  age shifting 155–6
  cohort tables in 152–3
  from double-entry to single-entry projected table 153–5
prudential basis 12

R software 184–5
random present value 20–1, 24, 43
random walk with drift model 225–8
ratchet 42
rating classes 16–17
realistic basis 12
rectangularization 51, 105–11, 138, 161, 168, 179
reduction factors 124, 144–5, 179, 233, 246–8, 252
reinsurance arrangements 318–30, 371
  excess-of-loss (XL) reinsurance 319–20, 326
  pricing 325–6
  reinsurance-swap arrangement on annual outflows 324–5
  stop-loss reinsurance
    on annual outflows 321–4, 326
    on assets 320–1, 326
  swap-like arrangement between life annuities and life insurances 329–30
Renshaw–Haberman model 165
Renshaw–Haberman–Hatzopoulos model 163–4
reserve 6, 27
residuals analysis 218–21
  application to Belgian mortality statistics 220–1
residuals bootstrap 231
resistance function 73–4, 178
return of premiums 35, 42
reversionary annuity 38
Richardt, T. 176
risk 6–8, 78
  assessment 24–7
  exposure-to-risk (ETR) 95–6, 97
  management see risk management
  of mortality random fluctuation 25
  process risk 25, 269
  uncertainty risk 28–9
  see also longevity risk; risk management (RM); risk transfer
risk classes 16–17
risk factors 40
risk index 280
risk management (RM) 293–9, 370
  natural hedging 299–303
  reinsurance arrangements 318–30, 371
  solvency issues 303–18, 370
risk transfer 297–8
  hedging life annuity liabilities through longevity bonds 337–43
  life insurance securitization 330–2
  mortality-linked securities 332–7
  see also risk management
roll-up 42
Rueff's adjustments 127, 155
ruin probability 295

safe-side technical basis 12, 13
safety loading 13
scenario technical basis 12
second-order basis 12
securitization 330
  life insurance 330–2
select mortality 49–51
select period 50
select table 51
self-selection 17, 51
single-entry projected table 153–5
Sithole–Haberman–Verrall model 164–5
smoothing 118–22, 209–14
  application to Belgian mortality statistics 213–14
  in Lee–Carter model 212–13
  motivation 209
  P-splines approach 210–12
solidarity 14, 16–18
solvency 303–18, 370
  assessment 24–7
Spain 130
special-rate annuities 41
splines 69–72
  B-splines 71–2, 210, 265
  P-splines model
    allowing for cohort effects 265–6
    smoothing approach 210–11
staggered annuitization 368
standard annuities 38
standardized mortality ratio (SMR) 116–18
stationarity 223–4
Statistics Belgium 91
stochastic valuation 270–3
  life annuity evaluation 20–30
stop-loss reinsurance
  on annual outflows 321–4, 326
  on assets 320–1, 326
survival, annual probability 48
  see also mortality
survival function 51–3
  expansion 51, 138
  rectangularization 51, 105–11, 138
  transforms of 73–4
Sweden 130

temporary life annuity 36
Thiele law 65
time series modelling 221–3
  Cairns–Blake–Dowd time indices 228–9
  Lee–Carter time index 221–8
    random walk with drift model 225–8
    stationarity 223–4
Tonti, Lorenzo 18–19
tontine annuities 14, 18–20

ultimate life table 51
uncertainty
  in mortality assumptions 27–30
  uncertainty risk 269, 298
uni-sex annuities 40
uniform spline 69
unit-linked life annuity 39
United Kingdom 135, 243–4
  APC model application 254–63
  cohort effect 243–5

value-protected annuities 40
variability measures 60–1
variable annuities 41–3
variance of the random lifetime 61
variation factor 145
voluntary annuities 40

Wales see United Kingdom
Wang transform 353
Weibull law 65, 160–1, 179
Whittaker–Henderson model 68–9
whole life annuity 36
with-profit annuity 39
with-profit GAR products 349

XL (excess-of-loss) reinsurance 319–20, 326

young mortality hump 138
YourCast software 185