portfolio optimization and performance analysis

CHAPMAN & HALL/CRC FINANCIAL MATHEMATICS SERIES

PortfolioOptimization and

PerformanceAnalysis

CHAPMAN & HALL/CRCFinancial Mathematics Series

Aims and scope:The field of financial mathematics forms an ever-expanding slice of the financial sector. This seriesaims to capture new developments and summarize what is known over the whole spectrum of thisfield. It will include a broad range of textbooks, reference works and handbooks that are meant toappeal to both academics and practitioners. The inclusion of numerical code and concrete real-worldexamples is highly encouraged.

Proposals for the series should be submitted to one of the series editors above or directly to:CRC Press, Taylor and Francis Group24-25 Blades CourtDeodar RoadLondon SW15 2NUUK

Series EditorsM.A.H. DempsterCentre for FinancialResearchJudge Business SchoolUniversity of Cambridge

Dilip B. MadanRobert H. Smith Schoolof BusinessUniversity of Maryland

Rama ContCenter for FinancialEngineeringColumbia UniversityNew York

Published TitlesAmerican-Style Derivatives; Valuation and Computation, Jerome Detemple

Financial Modelling with Jump Processes, Rama Cont and Peter Tankov

An Introduction to Credit Risk Modeling, Christian Bluhm, Ludger Overbeck, and

Christoph Wagner

Portfolio Optimization and Performance Analysis, Jean-Luc Prigent

Robust Libor Modelling and Pricing of Derivative Products, John Schoenmakers

Structured Credit Portfolio Analysis, Baskets & CDOs, Christian Bluhm and

Ludger Overbeck

CHAPMAN & HALL/CRC FINANCIAL MATHEMATICS SERIES

Boca Raton London New York

Chapman & Hall/CRC is an imprint of theTaylor & Francis Group, an informa business

PortfolioOptimization and

PerformanceAnalysis

Jean-Luc Prigent

Chapman & Hall/CRCTaylor & Francis Group6000 Broken Sound Parkway NW, Suite 300Boca Raton, FL 33487‑2742

© 2007 by Taylor & Francis Group, LLC Chapman & Hall/CRC is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government worksPrinted in the United States of America on acid‑free paper10 9 8 7 6 5 4 3 2 1

International Standard Book Number‑10: 1‑58488‑578‑5 (Hardcover)International Standard Book Number‑13: 978‑1‑58488‑578‑8 (Hardcover)

This book contains information obtained from authentic and highly regarded sources. Reprinted material is quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts have been made to publish reliable data and information, but the author and the publisher cannot assume responsibility for the validity of all materials or for the conse‑quences of their use.

No part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC) 222 Rosewood Drive, Danvers, MA 01923, 978‑750‑8400. CCC is a not‑for‑profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging‑in‑Publication Data

Prigent, Jean‑Luc, 1958‑Portfolio optimization and performance analysis / Jean‑Luc Prigent.

p. cm. ‑‑ (Chapman & Hall/CRC financial mathematics series ; 7)Includes bibliographical references and index.ISBN‑13: 978‑1‑58488‑578‑8 (alk. paper)ISBN‑10: 1‑58488‑578‑5 (alk. paper)1. Portfolio management. 2. Investment analysis. 3. Hedge funds. I. Title. II.

Series.

HG4529.5.P735 2007332.6‑‑dc22 2006100727

Visit the Taylor & Francis Web site athttp://www.taylorandfrancis.com

and the CRC Press Web site athttp://www.crcpress.com

Preface

Since the seminal mean-variance analysis was introduced by Markowitz (1952),the portfolio management theory has been expanded to take account of dif-ferent features:

• Dynamic portfolio optimization as per Merton (1962);

• Choice of new decision criteria, based on risk aversion (utility functions)or risk measures (VaR, CVaR and beyond);

• Market imperfections, e.g., transaction costs; and,

• Specific portfolio strategies, such as portfolio insurance or alternativemethods (hedge-funds).

At the same time, many new financial products has been introduced, basedin particular on financial derivatives.

Due to this intensive development and increasing complexity, this book hasfour purposes:

• First, to recall standard results and to provide new insights about theaxiomatics of the individual choice in an uncertain framework. A conciseintroduction to portfolio choice under uncertainty based on investors’preferences (usually represented by utility functions), and on severalkinds of risk measures. These theories are the fundamental basis ofportfolio optimization.

- Chapter 1 recalls the seminal approach of the utility maximization,introduced by Von Neumann and Morgenstern. It also deals withfurther extensions of this theory, such as weighted expected utilitytheory, non-expected utility theory, etc.

- Chapter 2 contains a survey about a new approach: the risk measureminimization. Such risk measures have been recently introducedin particular to take better account of nonsymmetric asset returndistributions.

• Second, to provide a precise overview on standard portfolio optimiza-tion. Both passive and active portfolio management are considered.Other results, such as risk measure minimization, are more recent.

V

VI Portfolio Optimization and Performance Analysis

- Chapter 3 is devoted to the very well-known Markowitz analysis.Some extensions are analyzed, in particular with risk minimizationconstraints such as safety criteria.

- Chapter 4 deals with two important standard fund managements:managing indexed funds and benchmarked portfolio optimization.In particular, statistical methods to replicate a financial index aredetailed and discussed. As regards benchmarking, the tracking er-ror is computed and analyzed.

- Chapter 5 recalls results about the main performance measures, suchas the Sharpe and Treynor ratios and the Jensen alpha.

• Third, to make accessible the literature about stochastic optimizationapplied to mathematical finance (see for example Part III) to students,to researchers who are not specialists on this subject, and to financialengineers. In particular, a review of the main standard results both forstatic and dynamic cases are provided. For this purpose, precise math-ematical statements are detailed without “too many” technicalities. Inparticular:

- Chapter 6 provides an introduction to dynamic portfolio optimiza-tion. The two main methods are the theory of stochastic controlbased on dynamic programming principle and, more recently, themartingale approach jointly used with convex duality.

- Chapter 7 gives two important applications of previous results: thesearch for an optimal portfolio profile and the long-term manage-ment.

- Chapter 8 is the more “technical” one. It provides an overview onportfolio optimization with market frictions, such as incomplete-ness, transaction costs, labor income, random time horizon, etc.

• Finally, to show how theoretical results can be applied to practical andoperational portfolio optimization (Part IV). This last part of the bookdeals with structured portfolio management which has grown signifi-cantly in the past few years.

Preface VII

- Chapter 9 is devoted to portfolio insurance and, in particular, toOBPI and CPPI strategies.

- Chapter 10 shows how common strategies, used by practitioners, maybe justified by utility maximization under, for example, guaranteeconstraints. It summarizes the main results concerning optimalportfolios when risk measures such as expected shortfall are intro-duced to limit downside risk.

- Chapter 11 recalls some problems when dealing with hedge funds, inparticular the choice of appropriate performance measures.

As a by-product, special emphasis is put on:

• Utility theory versus practice;

• Active versus passive management; and,

• Static versus dynamic portfolio management.

I hope this book will contribute to a better understanding of the modernportfolio theory, both for students and researchers in quantitative finance.

I am grateful to the CRC editorial staff for encouraging this project, in par-ticular Sunil Nair, and for the help during the preparation of the final version:Michele Dimont and Shashi Kumar.

Jean-Luc PRIGENT, PARIS, February 2007.

Contents

List of Tables XIII

List of Figures XV

I Utility and risk analysis 1

1 Utility theory 51.1 Preferences under uncertainty . . . . . . . . . . . . . . . . . 7

1.1.1 Lotteries . . . . . . . . . . . . . . . . . . . . . . . . . . 71.1.2 Axioms on preferences . . . . . . . . . . . . . . . . . . 8

1.2 Expected utility . . . . . . . . . . . . . . . . . . . . . . . . . 91.3 Risk aversion . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

1.3.1 Arrow-Pratt measures of risk aversion . . . . . . . . . 131.3.2 Standard utility functions . . . . . . . . . . . . . . . . 151.3.3 Applications to portfolio allocation . . . . . . . . . . . 17

1.4 Stochastic dominance . . . . . . . . . . . . . . . . . . . . . . 191.5 Alternative expected utility theory . . . . . . . . . . . . . . . 24

1.5.1 Weighted utility theory . . . . . . . . . . . . . . . . . 251.5.2 Rank dependent expected utility theory . . . . . . . . 271.5.3 Non-additive expected utility . . . . . . . . . . . . . . 321.5.4 Regret theory . . . . . . . . . . . . . . . . . . . . . . . 33

1.6 Further reading . . . . . . . . . . . . . . . . . . . . . . . . . 35

2 Risk measures 372.1 Coherent and convex risk measures . . . . . . . . . . . . . . 37

2.1.1 Coherent risk measures . . . . . . . . . . . . . . . . . 382.1.2 Convex risk measures . . . . . . . . . . . . . . . . . . 392.1.3 Representation of risk measures . . . . . . . . . . . . . 402.1.4 Risk measures and utility . . . . . . . . . . . . . . . . 412.1.5 Dynamic risk measures . . . . . . . . . . . . . . . . . 43

2.2 Standard risk measures . . . . . . . . . . . . . . . . . . . . . 482.2.1 Value-at-Risk . . . . . . . . . . . . . . . . . . . . . . . 482.2.2 CVaR . . . . . . . . . . . . . . . . . . . . . . . . . . . 542.2.3 Spectral measures of risk . . . . . . . . . . . . . . . . 59

2.3 Further reading . . . . . . . . . . . . . . . . . . . . . . . . . 62

IX

X Portfolio Optimization and Performance Analysis

II Standard portfolio optimization 65

3 Static optimization 673.1 Mean-variance analysis . . . . . . . . . . . . . . . . . . . . . 68

3.1.1 Diversification effect . . . . . . . . . . . . . . . . . . . 683.1.2 Optimal weights . . . . . . . . . . . . . . . . . . . . . 713.1.3 Additional constraints . . . . . . . . . . . . . . . . . . 783.1.4 Estimation problems . . . . . . . . . . . . . . . . . . . 82

3.2 Alternative criteria . . . . . . . . . . . . . . . . . . . . . . . 853.2.1 Expected utility maximization . . . . . . . . . . . . . 853.2.2 Risk measure minimization . . . . . . . . . . . . . . . 93

3.3 Further reading . . . . . . . . . . . . . . . . . . . . . . . . . 100

4 Indexed funds and benchmarking 1034.1 Indexed funds . . . . . . . . . . . . . . . . . . . . . . . . . . 103

4.1.1 Tracking error . . . . . . . . . . . . . . . . . . . . . . 1044.1.2 Simple index tracking methods . . . . . . . . . . . . . 1054.1.3 The threshold accepting algorithm . . . . . . . . . . . 1064.1.4 Cointegration tracking method . . . . . . . . . . . . . 112

4.2 Benchmark portfolio optimization . . . . . . . . . . . . . . . 1174.2.1 Tracking-error definition . . . . . . . . . . . . . . . . . 1184.2.2 Tracking-error minimization . . . . . . . . . . . . . . . 119

4.3 Further reading . . . . . . . . . . . . . . . . . . . . . . . . . 127

5 Portfolio performance 1295.1 Standard performance measures . . . . . . . . . . . . . . . . 130

5.1.1 The Capital Asset Pricing Model . . . . . . . . . . . . 1305.1.2 The three standard performance measures . . . . . . . 1325.1.3 Other performance measures . . . . . . . . . . . . . . 1405.1.4 Beyond the CAPM . . . . . . . . . . . . . . . . . . . . 145

5.2 Performance decomposition . . . . . . . . . . . . . . . . . . . 1515.2.1 The Fama decomposition . . . . . . . . . . . . . . . . 1515.2.2 Other performance attributions . . . . . . . . . . . . . 1535.2.3 The external attribution . . . . . . . . . . . . . . . . . 1535.2.4 The internal attribution . . . . . . . . . . . . . . . . . 155

5.3 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . 163

III Dynamic portfolio optimization 165

6 Dynamic programming optimization 1696.1 Control theory . . . . . . . . . . . . . . . . . . . . . . . . . . 169

6.1.1 Calculus of variations . . . . . . . . . . . . . . . . . . 1696.1.2 Pontryagin and Bellman principles . . . . . . . . . . . 1756.1.3 Stochastic optimal control . . . . . . . . . . . . . . . . 182

6.2 Lifetime portfolio selection . . . . . . . . . . . . . . . . . . . 187

Contents XI

6.2.1 The optimization problem . . . . . . . . . . . . . . . . 1876.2.2 The deterministic coefficients case . . . . . . . . . . . 1886.2.3 The general case . . . . . . . . . . . . . . . . . . . . . 1956.2.4 Recursive utility in continuous-time . . . . . . . . . . 203

6.3 Further reading . . . . . . . . . . . . . . . . . . . . . . . . . 205

7 Optimal payoff profiles and long-term management 2077.1 Optimal payoffs as functions of a benchmark . . . . . . . . . 207

7.1.1 Linear versus option-based strategy . . . . . . . . . . 2077.2 Application to long-term management . . . . . . . . . . . . . 214

7.2.1 Assets dynamics and optimal portfolios . . . . . . . . 2147.2.2 Exponential utility . . . . . . . . . . . . . . . . . . . . 2207.2.3 Sensitivity analysis . . . . . . . . . . . . . . . . . . . . 2237.2.4 Distribution of the optimal portfolio return . . . . . . 225

7.3 Further reading . . . . . . . . . . . . . . . . . . . . . . . . . 226

8 Optimization within specific markets 2298.1 Optimization in incomplete markets . . . . . . . . . . . . . . 230

8.1.1 General result based on martingale method . . . . . . 2308.1.2 Dynamic programming and viscosity solutions . . . . 238

8.2 Optimization with constraints . . . . . . . . . . . . . . . . . 2428.2.1 General result . . . . . . . . . . . . . . . . . . . . . . . 2428.2.2 Basic examples . . . . . . . . . . . . . . . . . . . . . . 249

8.3 Optimization with transaction costs . . . . . . . . . . . . . . 2568.3.1 The infinite-horizon case . . . . . . . . . . . . . . . . . 2568.3.2 The finite-horizon case . . . . . . . . . . . . . . . . . . 260

8.4 Other frameworks . . . . . . . . . . . . . . . . . . . . . . . . 2638.4.1 Labor income . . . . . . . . . . . . . . . . . . . . . . . 2638.4.2 Stochastic horizon . . . . . . . . . . . . . . . . . . . . 272

8.5 Further reading . . . . . . . . . . . . . . . . . . . . . . . . . 276

IV Structured portfolio management 279

9 Portfolio insurance 2819.1 The Option Based Portfolio Insurance . . . . . . . . . . . . . 282

9.1.1 The standard OBPI method . . . . . . . . . . . . . . . 2849.1.2 Extensions of the OBPI method . . . . . . . . . . . . 286

9.2 The Constant Proportion Portfolio Insurance . . . . . . . . . 2949.2.1 The standard CPPI method . . . . . . . . . . . . . . . 2959.2.2 CPPI extensions . . . . . . . . . . . . . . . . . . . . . 303

9.3 Comparison between OBPI and CPPI . . . . . . . . . . . . . 3059.3.1 Comparison at maturity . . . . . . . . . . . . . . . . . 3059.3.2 The dynamic behavior of OBPI and CPPI . . . . . . . 310

9.4 Further reading . . . . . . . . . . . . . . . . . . . . . . . . . 318

XII Portfolio Optimization and Performance Analysis

10 Optimal dynamic portfolio with risk limits 31910.1 Optimal insured portfolio: discrete-time case . . . . . . . . . 321

10.1.1 Optimal insured portfolio with a fixed number of assets 32110.1.2 Optimal insured payoffs as functions of a benchmark . 326

10.2 Optimal Insured Portfolio: the dynamically complete case . . 33310.2.1 Guarantee at maturity . . . . . . . . . . . . . . . . . . 33310.2.2 Risk exposure and utility function . . . . . . . . . . . 33510.2.3 Optimal portfolio with controlled drawdowns . . . . . 337

10.3 Value-at-Risk and expected shortfall based management . . . 34010.3.1 Dynamic safety criteria . . . . . . . . . . . . . . . . . 34010.3.2 Expected utility under VaR/CVaR constraints . . . . 347

10.4 Further reading . . . . . . . . . . . . . . . . . . . . . . . . . 350

11 Hedge funds 35111.1 The hedge funds industry . . . . . . . . . . . . . . . . . . . . 351

11.1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . 35111.1.2 Main strategies . . . . . . . . . . . . . . . . . . . . . . 352

11.2 Hedge fund performance . . . . . . . . . . . . . . . . . . . . 35411.2.1 Return distributions . . . . . . . . . . . . . . . . . . . 35411.2.2 Sharpe ratio limits . . . . . . . . . . . . . . . . . . . . 35511.2.3 Alternative performance measures . . . . . . . . . . . 36211.2.4 Benchmarks for alternative investment . . . . . . . . . 36811.2.5 Measure of the performance persistence . . . . . . . . 369

11.3 Optimal allocation in hedge funds . . . . . . . . . . . . . . . 37011.4 Further reading . . . . . . . . . . . . . . . . . . . . . . . . . 371

A Appendix A: Arch Models 373

B Appendix B: Stochastic Processes 381

References 397

Symbol Description 431

Index 433

List of Tables

1.1 Kahnemann and Tversky example . . . . . . . . . . . . . . . 251.2 Equivalence of the two problems . . . . . . . . . . . . . . . . 25

3.1 Expectations, variances and covariances . . . . . . . . . . . . 80

4.1 MADD minimization and weighting differences . . . . . . . . 111

5.1 Asymptotic standard deviation of the Sharpe ratio estimator 1405.2 Asset allocation (percentages) . . . . . . . . . . . . . . . . . . 1565.3 Contribution to asset classes . . . . . . . . . . . . . . . . . . . 1575.4 Asset selection effects . . . . . . . . . . . . . . . . . . . . . . 1575.5 Portfolio characteristics . . . . . . . . . . . . . . . . . . . . . 1585.6 Performance attribution . . . . . . . . . . . . . . . . . . . . . 1585.7 Performance attribution of portfolio 1 . . . . . . . . . . . . . 1605.8 Tracking-error volatilities . . . . . . . . . . . . . . . . . . . . 1625.9 Information ratios . . . . . . . . . . . . . . . . . . . . . . . . 162

7.1 Optimal weights for logarithmic utility function . . . . . . . . 2187.2 Optimal weights for CRRA utility function . . . . . . . . . . 2197.3 Optimal weights for HARA utility function . . . . . . . . . . 2207.4 Optimal weights for CARA utility function . . . . . . . . . . 2227.5 Asset allocation sensitivities for CRRA utility . . . . . . . . . 2237.6 Asset allocations for agressive, moderate and conservative in-

vestors (CRRA utility function) . . . . . . . . . . . . . . . . . 224

9.1 OBPI and Call-power moments . . . . . . . . . . . . . . . . . 2889.2 Comparison of the first four moments and semi-volatility . . . 3079.3 Probability P [∆OBPI > ∆CPPI ] for different m and σ . . . . 3139.4 Probability P [∆OBPI > ∆CPPI ] for different m and µ . . . . 313

11.1 HFR and CSFB classifications . . . . . . . . . . . . . . . . . . 35211.2 Characteristics of the portfolio S − (S −K)+ . . . . . . . . . 35711.3 Characteristics of the portfolio S + (H − S)+ . . . . . . . . . 35811.4 The four hedge funds characteristics . . . . . . . . . . . . . . 365

XIII

List of Figures

1.1 Risk aversion, certainty equivalence, and concavity . . . . . . 111.2 Stochastic dominance . . . . . . . . . . . . . . . . . . . . . . 201.3 Kahneman and Tversky functions . . . . . . . . . . . . . . . . 31

2.1 Pdf and cdf with VaR . . . . . . . . . . . . . . . . . . . . . . 502.2 Gaussian and Stable Paretian distributions with same VaR . 51

3.1 Diversification effect . . . . . . . . . . . . . . . . . . . . . . . 683.2 Mean-variance portfolios . . . . . . . . . . . . . . . . . . . . . 743.3 Efficient frontiers (Rf < A

C ) . . . . . . . . . . . . . . . . . . . 773.4 Efficient frontiers (Rf > A

C ) . . . . . . . . . . . . . . . . . . . 773.5 Efficient frontiers (Rf = A

C ) . . . . . . . . . . . . . . . . . . . 783.6 Efficient frontier with no shortselling . . . . . . . . . . . . . . 813.7 Efficient frontier with additional group constraints . . . . . . 813.8 Efficient frontier with maximum number of constraints . . . . 813.9 Roy’s portfolio . . . . . . . . . . . . . . . . . . . . . . . . . . 943.10 Telser’s portfolio . . . . . . . . . . . . . . . . . . . . . . . . . 963.11 Kataoka’s portfolio . . . . . . . . . . . . . . . . . . . . . . . . 97

4.1 MADD minimization with ten stocks . . . . . . . . . . . . . . 1104.2 MADD minimization: estimation and test . . . . . . . . . . . 1104.3 MADD minimization with constraints . . . . . . . . . . . . . 1114.4 Efficient and relative frontiers, RB > R0 . . . . . . . . . . . . 1214.5 Efficient and relative frontiers, RB < R0. . . . . . . . . . . . . 1224.6 Efficient, relative and beta frontiers . . . . . . . . . . . . . . . 1244.7 Efficient, relative and Φ frontiers . . . . . . . . . . . . . . . . 1254.8 Frontiers and iso-tracking curves . . . . . . . . . . . . . . . . 127

5.1 Security market line . . . . . . . . . . . . . . . . . . . . . . . 1315.2 SML and portfolios A, B, A’, and B’ . . . . . . . . . . . . . . 1365.3 Capital market line . . . . . . . . . . . . . . . . . . . . . . . . 1365.4 Capital market line and leverage effect . . . . . . . . . . . . . 1375.5 FAMA performance decomposition . . . . . . . . . . . . . . . 1525.6 Linear regression of RP on RM without market timing . . . . 1535.7 Linear regression of RP on RM with successful market timing 1545.8 Efficient and relative frontiers . . . . . . . . . . . . . . . . . . 159

7.1 Optimal portfolio profiles . . . . . . . . . . . . . . . . . . . . 212

XV

XVI Portfolio Optimization and Performance Analysis

7.2 Optimal portfolio profiles according to stock return . . . . . . 2137.3 Inverse cumulative distribution of the return at maturity . . . 226

8.1 Solvency region . . . . . . . . . . . . . . . . . . . . . . . . . . 2588.2 Optimal consumption as function of optimal wealth . . . . . 272

9.1 OBPI portfolio value as function of S . . . . . . . . . . . . . 2869.2 Call-power option profiles . . . . . . . . . . . . . . . . . . . . 2879.3 Call-power option paths . . . . . . . . . . . . . . . . . . . . . 2889.4 Call-power option pdf . . . . . . . . . . . . . . . . . . . . . . 2899.5 Call-power option cdf . . . . . . . . . . . . . . . . . . . . . . 2899.6 Convex case with linear constraints . . . . . . . . . . . . . . . 2929.7 Concave case with linear constraints . . . . . . . . . . . . . . 2929.8 Portfolio value and cushion . . . . . . . . . . . . . . . . . . . 2949.9 CPPI and OBPI payoffs as functions of S . . . . . . . . . . . 3059.10 CPPI and OBPI payoffs and probability of S . . . . . . . . . 3089.11 Cumulative distribution of OBPI/CPPI ratio . . . . . . . . . 3089.12 Multiple OBPI as function of S . . . . . . . . . . . . . . . . . 3119.13 OBPI multiple cumulative distribution . . . . . . . . . . . . . 3119.14 CPPI and OBPI delta as functions of S . . . . . . . . . . . . 3129.15 Cumulative distribution of OBPI/CPPI delta ratio . . . . . . 3149.16 CPPI and OBPI delta as functions of current time . . . . . . 3149.17 CPPI and OBPI gamma as functions of S for K = 100 . . . . 3159.18 CPPI and OBPI gamma as functions of S for K = 110 . . . . 3169.19 CPPI and OBPI vega as functions of S . . . . . . . . . . . . 317

10.1 Optimal portfolio profile (1) . . . . . . . . . . . . . . . . . . . 32210.2 Optimal portfolio profile (2) . . . . . . . . . . . . . . . . . . . 32210.3 Optimal portfolio profile (3) . . . . . . . . . . . . . . . . . . . 32310.4 Optimal portfolio profile (quadratic case) . . . . . . . . . . . 32510.5 Optimal portfolio weighting (quadratic case) . . . . . . . . . 32510.6 Dynamic Roy portfolio payoff . . . . . . . . . . . . . . . . . . 34510.7 Probability of success as function of the minimal return . . . 34510.8 Optimal portfolio value with VaR constraints . . . . . . . . . 348

11.1 The hedge funds development . . . . . . . . . . . . . . . . . . 35111.2 Portfolio profile (S − (S −K)+) as a function of stock value S. 35611.3 Portfolio profile (S + (H − S)+) . . . . . . . . . . . . . . . . 35811.4 Sharpe ratio as a function of the strike K . . . . . . . . . . . 36011.5 Portfolio profile maximizing the Sharpe ratio . . . . . . . . . 36111.6 The monthly returns of the four hedge funds . . . . . . . . . 36611.7 Omega ratio as function of the threshold . . . . . . . . . . . . 36611.8 Correlation of hedge funds/standard funds . . . . . . . . . . . 370

A.1 Random walk . . . . . . . . . . . . . . . . . . . . . . . . . . . 374

Part I

Utility and risk analysis

“[Under uncertainty] there is no scientific basis on which to form any cal-culable probability whatever. We simply do not know. Nevertheless, thenecessity for action and for decision compels us as practical men to do ourbest to overlook this awkward fact and to behave exactly as we should ifwe had behind us a good Benthamite calculation of a series of prospectiveadvantages and disadvantages, each multiplied by its appropriate probabilitywaiting to be summed.”

John Maynard Keynes, “General Theory of Employment,” Quarterly Journalof Economics, (1937).

1

2 Portfolio Optimization and Performance Analysis

Nowadays, financial theory is one of the major economic fields where decision-making under uncertainty plays a crucial part. Actually, many sources of risk(market, model, liquidity, operational, etc.) have to be taken into account andcarefully examined for most financial activities, such as pricing and hedgingderivatives, asset allocation, or credit portfolio management.

Assume that these risky events are identified with, for example, probabilitydistributions that may be objective or subjective. Nevertheless:

How can we model individual decisions under uncertainty?

Is it possible to rationalize traders or portfolio managers strategies? Canwe provide them with sufficiently operational and computational tools to tryto improve their decision process?

As it is well-known, a unified framework can be proposed to quantify uncer-tainty in financial modelling: the utility theory, and especially the expectedutility theory introduced by John von Neumann and Oskar Morgenstern in[400] and recognized for its usefulness and applicability.

Utility functions are based on risk aversion modelling from which the notionof risk premium can be defined. In Chapter 1, basic notions of the theory ofdecision under uncertainty are recalled. The emphasis is put on the expectedutility theory and various risk aversion notions. One of the advantages of theexpected utility is that it provides an operational tool to determine explicitportfolios under mild assumptions. In this framework, the risk-aversion allowsa calibration of the portfolio weights, as detailed in Part III.

Nevertheless, the increasing development of the so-called behavorial eco-nomics and finance, based on empirical evidence, justifies sections devotedto alternative preference representation theories. Indeed, many experimentalstudies have shown that individuals (in particular the investors) do not actaccording to the expected utility theory. This can partly explain investmentanomalies such as insufficient diversification, financial bubbles, etc.

However a new stream has emerged based on bank activity regulation. Itfocuses in particular on potential losses and downside risk.

In [373] and [374], Markowitz proposed to measure risk of portfolio returnsby means of their variances which involve judiciously the joint distributionof returns of all assets. Despite its simplicity and tractability, the Markowitzmodel has two pitfalls:

• First, the probability distribution of each asset return is characterizedonly by its first two moments. In the case of nonGaussian distributions

Part I 3

(even symmetrical), the Markowitz model and utility theories are mainlycompatible for quadratic utility functions.

• Second, the dependence structure is only described by the linear corre-lation coefficients of each pair of asset returns. As shown, for example,by Alexander [14], the linear correlation coefficient is not always appli-cable. It also may imply incorrect results when probability distributionsare not elliptic (see Joe [307]), as proved for instance by Embrechts et al.([201] and [202]). In that case, severe losses can be observed if extremeevents are too underestimated.

What kind of risk measures can be introduced?

Unlike dispersion risk measures such as the standard deviation, other mea-sures have been proposed, based rather on downside risks.

From the seminal paper by Artzner, Delbaen, Eber and Heath [31], specificaxioms have been introduced to model risk measures (coherent), and furtherexamined and generalized by Follmer and Schied [236] (convex measures).

Chapter 2 is devoted to the definitions and main properties of such riskmeasures. Note that, as for preference representation, the theory of risk mea-sures is not yet achieved, in particular when they have to be defined in adynamic framework. Besides, both approaches are linked, as shown by recentresults. Among the possible operational risk measures, the value-at-risk andits “coherent” extension, the expected shortfall, have emerged as importanttools to bank regulation and risk management.

This is the reason why in Chapter (2), some emphasis is put on thesemeasures, in particular on some results about their estimation and sensitivi-ties computation. Under some additional and “rational” specific axioms, riskmeasures can be defined from the expected shortfall (the so-called spectralrisk measures).

Portfolio management can also involve such measures to limit risk exposure,as detailed for instance in Part IV, Chapter 10.

Chapter 1

Utility theory

The importance of risk and uncertainty in economic analysis was suggested forthe first time by Frank H. Knight in his seminal treatise Risk, Uncertainty andProfit (see [328]). Previously, very few economists considered that risk anduncertainty might play a key role in economic theory, except for some notableexamples like Carl Menger [384], Irving Fisher [227] and Francis Edgeworth[186]. The problem was:

• First, to define precisely the notions of “uncertainty” or “risk” whenevents are random; and,

• Second, to model the choice process within risk and uncertainty.

The notion of choice under risk and uncertainty was not well modelled for along time, despite the results of Hicks [293] and Marschak [377], who under-stood that preferences should be defined also on distributions, to take accountsimultaneously of the evaluation of the level of risk or uncertainty, and of thepure preferences over outcomes. However, Bernoulli [54] had formerly intro-duced the notion of expected utility to solve the famous St.Petersburg paradoxposed in 1713. The expected utility theory allows the representation of thelevel of satisfaction by the sum of utilities from outcomes weighted by theprobabilities of these outcomes. Nevertheless, in that case, a gain can in-crease utility less than a decline can reduce it, which was not considered asrational.

In the seminal Theory of Games and Economic Behavior, John von Neu-mann and Oskar Morgenstern [400] succeeded in providing a rational founda-tion for decision-making under risk according to expected utility properties.At this time, this axiomatic foundation for decision-making under risk wasnot well understood. This theory was further developed by Marschak [378],Samuelson ([445] and [446]), Herstein and Milnor [291] and others.

The expected utility hypothesis was rehanced in the famous Foundationsof Statistics by Savage [449]. Savage proposed to deduce the expected utilityproperty without imposing prior objective probabilities but by determinat-ing implicit subjective probabilities. This approach was further studied byAnscombe and Aumann [26]. Thus the von Neumann-Morgenstern theorywas extended by the Savage-Anscombe-Aumann “subjective” approach.

5


The “state-preference” approach to uncertainty was introduced by Arrow[28] and Debreu [152]. It does not necessarily assume the existence of objec-tive or subjective probabilities. Rather, it concerns actual goods than moneyamounts and has been applied to study general economic equilibria.

The notion of “risk aversion” was introduced by Friedman and Savage [242]and by Markowitz [374]. The measures of risk aversion were examined byPratt [412] and Arrow [29], and later studied by Ross [434]. The differentnotions of risk aversion have been further developed by Yaari [506] and K-ihlstrom and Mirman [327]. The notions and definitions of “riskiness” basedon stochastic dominance were suggested by Rothschild and Stiglitz ([435] and[436]) and Diamond and Stiglitz [167].

The expected utility assumption for modelling choice under risk and un-certainty has been discussed and disputed, in particular by Allais [19] andEllsberg [196]. This debate has generated many alternative approaches toexpected utility theory. Some of them have been based on experimental expe-riences, such as in Kahneman and Tversky [314]. The main alternative mod-els are: weighted expected utility (Allais [20], Chew and McCrimmon [120]);rank-dependent expected utility (Quiggin [420], Yaari ([506], [507]), and morerecently, the cumulative prospect theory by Kahneman and Tversky [496]);non-linear expected utility (Machina [367]); and regret theory (Loomes andSugden [365]) to take account of preference reversals. Other alternative ex-pected utility theories have been developed within the framework of Savage’ssubjective probability: non-additive expected utility (Schmeidler [454]) andstate-dependent preferences (Karni [323]).

To summarize, the purpose of the decision theory is to provide analyticaltools of different degrees of formality in order to model the behavior of adecison maker who has to choose among a set of alternatives with differentconsequences. Typically, since Knight [328], three cases are distinguished,according to the degree of information:

• First, the environment is certain: the agent perfectly knows the eventthat will occur in the future.

• Second, the environmment is risky: this means existence of uncontrol-lable random events for which the modelling of a probability space canbe proposed, in particular a probability distribution can be determined.

• Finally, the environmment is uncertain: in that case, the probabilitydistribution is unknown.

Financial theory is mainly concerned with the second situation. However,for the third case, note that under some assumptions a subjective probabilitymay exist, as proved by Savage [449].

Utility theory 7

1.1 Preferences under uncertainty

In order to model any decision problem under risk, it is necessary to in-troduce a functional representation of preferences which measures the degreeof satisfaction of the decision maker. This is the purpose of the utility the-ory. The investor is supposed to be “rational”: this means that his choicesare made according to given “good” rules which are “stable” over time (insome sense). Thus a binary relation on possible outcomes can be proposed toanalyze his behavior. Specific axioms are introduced to describe his “ratio-nality.” Then, for this given identified choice functional, his optimal decision(for example his investment strategy) is determined from the “maximization”of this criterion.

1.1.1 Lotteries

Within a risky framework, first we must pick out all possible outcomeswhich may have an impact on the consequences of the decisions. Secondly,we must associate each random event with a probability.

Let Ω represent the set of possible outcomes. To simplify the exposition,we suppose Ω is finite:

Ω = ω1, ..., ωm.Let p = p1, ..., pm be the probability of occurence of Ω:

∀i, 0 ≤ pi ≤ 1 and∑i

pi = 1.

DEFINITION 1.1 A lottery L is defined by a vector (ω1, p1), ..., (ωm, pm).The set of all lotteries with the same given set of outcomes Ω is denoted byL. A compound lottery Lc is a lottery whose outcomes are also lotteries.

Consider for example a compound lottery with two outcomes: the lotteryLa and the lottery Lb with respective probabilities α and 1 − α. Then theprobability that the outcome of Lc will be ωi is given by:

pi = αpai + (1− α)pbi .

Therefore, Lc has the same vector of probabilities as the convex combination:

αLa + (1− α)Lb.


1.1.2 Axioms on preferences

The decision maker is assumed to be “rational” if his preference relation(denoted by ) over the set of lotteries L is a binary relation which satisfiesthe following axioms:

Axiom 1 :

• The relation is complete (all lotteries are always comparable by ):

∀La ∈ L, ∀Lb ∈ L, La Lb or Lb La.

The indifference relation ∼ associated to the relation is defined by:

∀La ∈ L, ∀Lb ∈ L, La ∼ Lb ⇐⇒ La Lb and Lb La.

• The relation is reflexive:

∀L ∈ L, L L.

• The relation is transitive:

∀La ∈ L, ∀Lb ∈ L, ∀Lc ∈ L, La Lb and Lb Lc =⇒ La Lc.

Another standard assumption is the continuity: small changes on probabilitiesdo not modify the ordering between two lotteries. This property is specifiedin the following axiom:

Axiom 2 : The preference relation on the set L of lotteries is such that∀La ∈ L, ∀Lb ∈ L, ∀Lc ∈ L, if La Lb Lc then there exists a scalarα ∈ [0, 1] such that:

Lb ∼ αLa + (1− α)Lc.

This continuity axiom implies the existence of a functional U : L −→ R suchthat:

∀La ∈ L, ∀Lb ∈ L, La Lb ⇐⇒ U(La) ≥ U(Lb).

The previous axioms were already well-known in the economic theory ofconsumer choice (sometimes called the “weak order” axioms). To develop theanalysis of economics under uncertainty, more properties must be imposedon the preferences. One of the most important conditions that can be addedto describe the behavior of the economic agent (here the “investor”) is theindependence axiom, which is the foundation of the standard theory underuncertainty, but considered more troublesome (stated as in Jensen [301]):

Axiom 3 : The preference relation on the set L of lotteries is such that∀La ∈ L, ∀Lb ∈ L, ∀Lc ∈ L and for all α ∈ [0, 1],

La Lb ⇐⇒ αLa + (1− α)Lc αLb + (1− α)Lc.

This property means that if two lotteries La and Lb are mixed in the sameway with any third lottery Lc then the preference ordering between the twonew mixed lotteries is not modified.

Utility theory 9

1.2 Expected utility

The independence axiom, implicitly introduced in von Neumann and Mor-genstern [400], characterizes the expected utility criterion: indeed, it impliesthat the preference functional U on the lotteries must be linear in the proba-bilities of the possible outcomes:

THEOREM 1.1

Assume that the preference relation on the set L of lotteries satisfies thecontinuity and independence axioms. Then, the relation can be represent-ed by a preference functional that is linear in probabilities: there exists afunction u defined (up to a positive linear transformation) on the space ofpossible outcomes Ω and with values in R such that for any two lotteriesLa = (pa1, ..., pan) and Lb = (pb1, ..., pbn), we have:

La Lb ⇐⇒n∑

i=1

u(ωi)pai ≥n∑

i=1

u(ωi)pbi . (1.1)

PROOF In what follows, only a sketch of the proof is presented. Wehave mainly to prove that for any two lotteries La and Lb, and any compoundlottery L = αLa + (1− α)Lb:

U [αLa + (1− α)Lb] = αU [La] + (1− α)U [Lb].

• First, consider the worst and best lotteries, Lwo and Lbe in L, withrespect to the preference functional U . They are obtained by minimizingand maximizing U on a closed preference interval of L. Then, for anylottery L in L, we have : Lwo L Lbe. Thus, by the continuityaxiom, there exist two scalars βa and βb in [0, 1] such that:

La ∼ βaLbe + (1− βa)Lwo and Lb ∼ βbLbe + (1− βb)Lwo.

Note that βa and βb are unique.

• (Mixture monotonicity.) We have:

La Lb ⇐⇒ βa ≥ βb.

Indeed, if βa ≥ βb, then the parameter β = βa−βb

1−βb is in [0, 1]. Then, wehave:

βaLwo + (1− βa)Lbe ∼ βLbe + (1− β)[βbLbe + (1 − βb)Lwo].


Moreover, by definition of Lbe, Lbe βbLbe + (1 − βb)Lwo. Therefore,using the independence axiom, we deduce:

βLbe + (1− β)[βbLbe + (1− βb)Lwo] β[βbLbe + (1 − βb)Lwo] + (1− β)[βbLbe + (1− βb)Lwo].

Consequently, La = βaLbe + (1 − βa)Lwo Lb = βbLbe + (1− βb)Lwo.This result is quite intuitive, since it means that if we construct twocompound lotteries La and Lb with different weights, then we prefer thecompound lottery in which the best lottery is given the higher weight.

• Consequently, we conclude that the functional U , such that for anylottery L ∼ βLbe + (1 − β)Lwo we have U(L) = β, satisfies exactlyDefinition (1.1.2) of a preference functional associated to the preferencerelation .

• Finally, we must prove that:

U [αLa + (1 − α)Lb] = αβa + (1 − α)βb.

This is equivalent to show that:

αLa+(1−α)Lb ∼ (αβa+(1−α)βb)Lbe+(α(1−βa)+(1−α)(1−βb))Lwo.

Using twice the independent axiom, we get:

αLa + (1− α)Lb ∼ α[βaLbe + (1− βa)Lwo] + (1− α)Lb

∼ α[βaLbe + (1− βa)Lwo] + (1 − α)[βbLbe + (1− βb)Lwo]

∼ (αβa + (1− α)βb)Lbe + (α(1 − βa) + (1− α)(1 − βb))Lwo.

Then, the previous result is extended to the whole space of lotteries Land the proof is finished.

REMARK 1.11) The property in Theorem (1.1) is an equivalence, i.e. expected utility im-plies the three axioms.

2) Note that the previous results have been proved for simple lotteries, i.e.probability distributions which take positive values only for a finite numberof outcomes. We have to extend these results to continuous spaces (“infinitesupport”): i.e. for any probability measure P on Ω, we must prove an analogueto the expected utility decomposition

U(P) =∫Ω

u(ω)dP(ω). (1.2)

For this purpose, the continuity axiom (or Archimedian axiom) has to besupplemented as shown in Fishburn [223].

Utility theory 11

1.3 Risk aversion

When facing alternatives with comparable returns, what are the attitudesof investors towards risk? Despite, for example, state-owned lotteries, usualobservations on financial or insurance markets show that generally humanbeings are risk-averse. For example, they choose to invest on risky assetsonly if their expected returns are significantly larger than the riskless one.To illustrate this notion, consider the construction of Friedman and Savage[242]: let X a random variable with only two values x1 and x2, and let p, theprobability of x1, and (1−p) be the probability of x2. Let u represent a utilityfunction defined on the outcomes. Consider the following two lotteries La andLb: lottery La pays the amount E[X] with probability equal to 1. Lottery Lb

pays x1 with probability p, and x2 with probability (1 − p). These lotterieshave the same expected income, but an investor who is averse to risk wouldselect La instead of Lb. Therefore, we have:

U [La] = u(E[X]) and U [Lb] = E[u(X)]. (1.3)

Then, as illustrated in Figure (1.1), the concavity of u implies that the utilityof expected return u(E[X]) is greater than the expected utility E[U(X)].

6

-

A

B

x1 x2E[X]

u(E[X])

E[u(X)]= u(C(X))

C[X]

u(x1)

u(x2)

π(X)-

FIGURE 1.1: Risk aversion, certainty equivalence, and concavity


Indeed, by the definition of concavity, we have:

∀λ ∈ [0, 1], ∀a ∈ R, ∀b ∈ R, λu(a) + (1− λ)u(b) ≤ u(λa + (1− λ)b).

The basic mathematical result is the well-known Jensen’s inequality: considera concave real-valued function f and a real-valued random variable Y withfinite expectation. Then we have E[f(Y )] ≤ f(E[Y ]).

Another way to represent this risk-aversion is to introduce the certaintyequivalent of the lottery Lb. This lottery, denoted by C[X ], is the sure orcertainty-equivalent lottery which yields the same utility as the random lot-tery Lb. Thus, the investor is indifferent to the choice of receiving C[X ]with certainty or investing on the risky lottery Lb with expected return E[X ].The risk-aversion is equivalent to the inequality C[X ] < E[X ]. The differ-ence π[X ] = E[X ] − C[X ] is called the risk-premium as introduced in Pratt[412]. It is the maximum amount that the investor accepts to lose in orderto get a riskless income. From these properties, we can propose the followingdefinitions:

DEFINITION 1.21) An investor is risk-averse if C[X ] ≤ E[X ], or equivalently if π[X ] ≥ 0, forall random variables X.2) An investor is risk-neutral if C[X ] = E[X ], or equivalently if π[X ] = 0, forall random variables X.3) An investor is risk-loving if C[X ] ≥ E[X ], or equivalently if π[X ] ≤ 0, forall random variables X.

Using the properties of concavity/convexity of utility functions, we deducea characterization of the risk-aversion:

THEOREM 1.2

Let u be a utility function representing preferences over the set of outcomes.Assume that u is increasing. Then:1) The function u is concave if and only if the investor is risk-averse.2) The function u is linear if and only if the investor is risk-neutral.3) The function u is convex if and only if the investor is risk-loving.

PROOF Consider for example the concavity case. Then: ∀λ ∈ [0, 1], ∀a ∈R, ∀b ∈ R, λu(a) + (1− λ)u(b) ≤ u(λa + (1− λ)b). But E[X ] = λa + (1− λ)band E[u(X)] = λu(a) + (1− λ)u(b). By definition, u(C[X ]) = E[u(X)]. Thus,u(C[X ]) ≤ u(E[X ]). Therefore, since u is increasing, we deduce: C[X ] ≤ E[X ],which is the definition of risk-aversion.

Utility theory 13

REMARK 1.2 As noted in [242], an investor’s utility function may havedifferent curvatures: For example, he may be risk-averse for small and veryhigh wealth levels, but risk-loving for intermediate levels. In that case, hisutility function has a double inflection. However, such behavior modelling canbe criticized as was done by Markowitz [374].

1.3.1 Arrow-Pratt measures of risk aversion

Human beings preferences are heterogeneous: some may prefer safety torisk, others do not. In the first case, they will invest significantly on “riskless”assets, treasury bonds for example. In the second case, they will purchasestocks that will represent a high proportion of their financial portfolios. But,how can we measure the “degree” of risk aversion of an investor? Since for ex-ample utility functions are defined up to linear transformations, the concavityitself is not sufficient to characterize this degree. Another possible approachis to examine the risk premia and to relate them to concavity. This is theway chosen by Pratt [412] and Arrow [29]. Consider the following result dueto Pratt:

DEFINITION 1.3 Let u and v be two utility functions representing pref-erences over wealth. The preference u has more risk-aversion than v if therisk-premia satisfy: πu(X) ≥ πv(X), for all random real-valued variables X.

THEOREM 1.3Let u and v be two utility functions representing preferences over wealth.Assume that they are continuous, monotonically increasing, and twice differ-entiable. Then the following properties are equivalent and characterize the“more risk-aversion”:1) The derivatives of both utility functions are such that: −u”

u′ (x) > − v”v′ (x),

for every x in R.2) There exists a concave function Φ such that: u(x) = Φ[v(x)], for every

x in R.3) The risk-premia satisfy: πu(X) ≥ πv(X), for all random real-valued

variables X.

PROOF Case 1: (1)⇒(2).Since the function v is monotonic and continuous, v has an inverse v−1. Definethe function Φ by: Φ(y) = u v−1(y). Then, by construction, we have:

u(x) = Φ[v(x)].

Since the functions u and v are twice differentiable, we deduce that Φ is alsotwice differentiable and:

u′(x) = Φ′[v(x)]v′(x).


Thus, Φ′ is non-negative. Differentiating again, we get:

u”(x) = Φ”[v(x)]v′2(x) + Φ′[v(x)]v”(x).

Therefore:u”(x) = Φ”[v(x)]v′2(x) + u′(x)v”(x)/v′(x),

and finally we get:

v”(x)/v′(x)− u”(x)/u′(x) = (−Φ”[v(x)])(v′2(x)/u′(x)),

with (v′2(x)/u′(x)) > 0. Now, from assumption (1), we have:

v”v′

(x) − u”u′ (x) > 0.

Consequently, Φ” is negative and Φ is concave.

Case 2: (2)⇒(3).By definition of C[X ], we have: u(Cu[X ]) = E[u(X)]. Therefore, since u(x) =Φ[v(x)], then u(Cu[X ]) = E[Φ[v(X)]]. Now, since Φ is concave, we have:

E[Φ[v(X)]] ≤ Φ[E[v(X)]],

which implies:u(Cu[X ]) ≤ Φ[v(Cv[X ])].

Finally, since u(.) = Φ[v(.)] is increasing, we deduce:

Cu[X ] ≤ Cv[X ],

which implies: πu(X) ≥ πv(X).

Case 3: (3)⇒(1). Equivalently, we can prove that not(1)⇒not(3). If “not(1)”,then we have:

−u”u′ (a) < −v”

v′(xa),

for some a in R. By continuity, there exists a neighborhood V(a) for whichthis inequality still holds for all x ∈ V(a). Consider a random variable Xwith values in V(a) and 0 otherwise. First, we can prove that the previoustranformation Φ is convex on the set V(a): In the proof (1)⇒(2), we haveseen that

v”v′

(x) − u”u′ (x) = −kΦ[v(x)].

But on (V)(a), we havev”v′

(x) − u”u′ (x) > 0,

which implies that Φ” > 0 on (V)(a). Second, since Φ is convex, using theresult (2)⇒(3), we deduce: πu(X) ≤ πv(X). Thus, not(1)⇒not(3).

Utility theory 15

REMARK 1.3 Using the Taylor approximation at E[X ], we have:

E[u(X)] u(E[X ]) + (1/2)u”(E[X ])E[(X − E[X ])2], (1.4)

and alsou(E[X ]− π[X ]) = u(E[X ])− u′(E[X ])π[X ]. (1.5)

But, by definition, u(E[X ]− π[X ]) = E[u(X)]. Denote σ2X = E[(X −E[X ])2].

Then:π[X ] = −[u”(E[X ])/u′(E[X ])]σ2

X . (1.6)

Since, σ2X > 0, the ratio −u”(E[X ])/u′(E[X ]) can be also considered as a

measure of risk-aversion. Note that it is positive, since the utility function uis increasing and concave. Note also that when X = V0 + Y with E[Y ] = 0,then E[X ] is equal to the initial amount V0 invested on the market.

DEFINITION 1.4 The term A(x) = −u”(x)/u′(x) is called the Arrow-Pratt Measure of Absolute Risk-Aversion (ARA). Another measure allows usto take account of the level of wealth: the ratio R(x) = −xu”(x)/u′(x) whichis called the Arrow-Pratt Measure of Relative Risk-Aversion (RRA).

REMARK 1.4 From the geometrical point of view, note that the curva-ture of the utility function u is equal to

ρ(u)(x) = u”(x)/[1 + u′2(x)]2/3. (1.7)

Taking the absolute value of ρ, we get the risk-aversion−u”(x)/[1+u′2(x)]2/3,which is very close to the ARA measure.

1.3.2 Standard utility functions

From the previous risk-aversion measures, we can characterize some stan-dard utility functions, and in particular the general class of HARA utilitieswhich are useful to get analytical results.

DEFINITION 1.5 A utility function u is said to have harmonic absoluterisk aversion (HARA) if the inverse of its absolute risk aversion is linear inwealth.

PROPOSITION 1.1

HARA utility functions u take the following form:

u(x) = a(b +

x

c

)1−c

, (1.8)


with u defined on the domain b + xc > 0. The constant parameters a, b, and

c satisfy the condition: a(1 − c)/c > 0.The ARA is given by:

A(x) =(b +

x

c

)−1

, (1.9)

which clearly has an inverse linear in wealth x. To ensure that u′ > 0 andu” < 0, it is assumed that a (1− c) /c > 0.

Usually, three subclasses are distinguished:

• Constant absolute risk aversion (CARA). If the parameter c goes toinfinity then we obtain A(x) = A constant. In that case, the utilityfunction u takes the form : u(x) = − exp[−Ax]

A . Note that the RRA isincreasing in wealth.

• Constant relative risk aversion (CRRA). If c = 0, then R(x) = c con-stant and, up to a linear transformation, we get:

u(x) =x1−c/(1− c) if c = 1,ln[x] if c = 1. (1.10)

Note that if c < 1, then utility goes from 0 to ∞, and if c > 1, thenutility goes from −∞ to 0. However, in all cases, this kind of utilityfunction exhibits a decreasing absolute risk aversion (DARA).

• Quadratic utility function. Consider the case c = −1. Then the utilityfunction u is quadratic. Note first that we have to restrict its domain,since u is decreasing on ]b,∞[. Second, the ARA is increasing (IARA)with wealth, which unfortunately implies that the risk premium π(.) isincreasing. Thus, as wealth increases, the unwillingness to take risksincreases.

REMARK 1.5 The decreasing absolute risk aversion (DARA) can becharacterized by the following property: denote V0 as the initial amount in-vested on the market and consider the random payoff X defined by X = V0+Ywith E[Y ] = 0. Then, the risk-premium πu(V0, x) satisfies: for all α > 0,πu(V0, x) > πu(V0 + α, x). This property is also equivalent to the existencefor all α > 0 of a functional denoted by Ψα(.), such that: u(x) = Ψα[u(x+α)]with Ψ′

α(.) > 0 and Ψ”α(.) < 0.

REMARK 1.6 The utility functions CARA and CRRA can be character-ized by invariance properties respectively with respect to multiplicative andadditive transformations of lotteries (see for example [162]).

Utility theory 17

1.3.3 Applications to portfolio allocation

Consider the following standard portfolio problem: the investor has anincreasing and concave utility function u. He can invest his initial endowmentin a risk-free asset M (for example, a monetary asset) and in risky asset S (forexample, a stock or a financial index). Denote by rM and rS the respectivereturns of assets M and S. Denote by λ the proportion of initial wealth V0

invested in the risky asset. Then, the value of the portfolio at maturity isgiven by:

(1− λ)V0(1 + rM ) + λV0(1 + rS) = V0 + λV0(rS − rM ). (1.11)

Therefore, using the standard assumption E[rS ] > rM , the expected portfolioreturn is increasing with respect to the proportion λ. The problem of theinvestor is to find λ maximizing the expected utility:

maxλU(λ) = E[u(V0 + λV0(rS − rM ))]. (1.12)

Assuming that u is twice differentiable, the optimum λ∗ (if it exists) is deducedfrom the first-order condition which is given by the implicit relation:

U ′(λ∗) = E[(rS − rM )u′(V0 + λV0(rS − rM ))] = 0. (1.13)

From the concavity of u, we get:

U”(λ∗) = E[(rS − rM )2u”(V0 + λV0(rS − rM ))] ≤ 0.

Thus, U ′ is decreasing. But

U ′(0) = u′(V0)E[(rS − rM ))]. (1.14)

Thus, from the condition E[rS ] > rM , we deduce that the portfolio proportionλ∗ invested in the risky asset is positive. Note that if U ′(0) = 0, then λ∗ = 0:it is optimal to invest the whole endowment V0 in the risk-free asset. Thus,the following result is proved:

PROPOSITION 1.2For the standard portfolio problem with a concave and differentiable utilityfunction, the investor invests a positive proportion in the risky asset if andonly if the excess expected return E[(rS − rM )] is positive.

REMARK 1.7 The previous result does not depend on the level of thevariance of the excess return. Even if the expected excess return is verysmall, it is optimal to invest part of the portfolio in the risky asset. However,the order of magnitude of λ∗ will depend on this variance. Note that if theutility function is not differentiable, then it may happen that λ∗ = 0, even ifE[(rS − rM )] > 0.


Consider the special case of HARA utility function u(x) =(b + x

c

)−c. De-note re = (rS − rM ) and λ∗∗ the optimal solution when

(b + V0

c

)= 1. We

have:

E

[re

(b +

V0+λ∗∗V0(b+V0c )re

c

)−c]

=(b + V0

c

)−cE

[re

(1 + λ∗∗V0re

c

)−c].

(1.15)

Therefore, since λ∗∗ satisfies E

[re

(1 + λ∗∗V0re

c

)−c]

= 0, the optimal solution

is λ∗ = λ∗∗(b+ V0c ), for any initial endowment V0. Thus λ∗ is a linear function

of the wealth V0. Moreover, when c→∞, λ∗ is independent of wealth.

The hypothesis of Arrow [29] is that individual preferences should be DARAand IRRA (Increasing relative risk aversion):

dA(x)dx

≤ 0 anddR(x)dx

≥ 0. (1.16)

The reasoning for DARA is that, for a given risk, wealthy investors are notmore risk-averse than poorer ones. IRRA implies that when both wealthand risk increase, then the readiness to bear risk should be reduced. Moreprecisely, for the previous standard portfolio problem with two assets, if theutility function u is twice-differentiable and exhibits DARA and IRRA, thenthe optimal proportion of initial wealth invested in the risky asset is increasingwith wealth; but it increases less than proportionally to the increase in wealth.

REMARK 1.8 When the two assets are risky, another risk aversionmeasure must be introduced to avoid some paradox, as shown in Ross [434].For the Ross risk aversion measure, a utility function u is said to displayhigher risk aversion than the utility function v if there exists a constant λ > 0such that for all x and y, we have:

u”v”

(x) ≥ λ ≥ u′

v′(y). (1.17)

This means that infxu”(x)/v”(x) ≥ supxu′(x)/v′(x). This condition is equiv-

alent to the existence of a constant a > 0 and a decreasing concave functionG such that for all x in R, v(x) = au(x) + G(x). Finally, it is also equivalentto the following inequality on the risk premia: πu(V0, X) ≥ πv(V0, X) for anyinitial wealth V0 and for any lottery X where E[X ] = V0.

Utility theory 19

1.4 Stochastic dominance

How can two random prospects X and Y be ranked? For a given investor,X is prefered to Y if the expected utility of E[u(X)] is higher than E[u(Y )].But how can they be compared if the utility is not observable? Besides, wehave seen in previous sections how a change in the utility function modifiesthe risk premium for a given lottery with payoff X . However, how does achange in X modify the risk premium independently of the utility function?The theory of stochastic dominance is introduced to answer these questions.It involves using the probability distributions of random prospects X and Y .In particular, it provides conditions under which E[u(X)] ≥ E[u(Y )] for allutility functions u in a given set. Depending on this set, several stochasticdominance orders are defined.

To simplify, assume that the supports of all random variables are in aninterval [a, b]. Denote respectively by FX and FY the cumulative distributionfunctions of X and Y .

DEFINITION 1.6 X is said to dominate Y according to first-orderstochastic dominance (“X 1 Y ”) if FX(w) ≤ FY (w), for all w ∈ [a, b].

This definition is consistent with the expected utility, since we have:

PROPOSITION 1.3X 1 Y if and only if

E[u(X)] =∫ b

a

u(w)dFX (w) ≥ E[u(Y )] =∫ b

a

u(w)dFY (w), (1.18)

for any utility function u which is monotonically increasing.

PROOF 1) Assume that X 1 Y . Consider any utility function u (mono-tonically increasing). Integrating by parts, we get:∫ b

a

u(w)[dFX (w) − dFY (w)]

= [u(z)(FX(w)− FY (w))]ba −∫ b

a

u′(w)[FX(w) − FY (w)]dw.

The first term is equal to 0. Moreover, by assumption, u′ > 0 and FX(w) −FY (w) ≤ 0. Consequently,∫ b

a

u(w)[dFX(w) − dFY (w)] ≥ 0,


which means that E[u(X)] ≥ E[u(Y )].

2) Conversely, suppose that there exists an w0 such that FX(w0) > FY (w0).By right continuity, there exists a neighborhood N (w0) such that FX(w) >FY (w), for all w in N (w0). Consider a utility function u which is constantoutside N (w0) and increasing inside (u′(w) = 0 for w /∈ N (w0) and u′(w) > 0for w ∈ N (w0)). Integrating by parts, we get:∫ b

a

u(w)[dFX(w)− dFY (w)] = −∫N(w0)

u′(w)[FX(w) − FY (w)]dw.

Since the right side is negative, we deduce∫ bau(w)dFX(w) <

∫ bau(w)dFY (w).

Thus “X 1 Y ” is false. Consequently, E[u(X)] ≥ E[u(Y )] for any utilityfunction u (monotonically increasing) implies X 1 Y .

In the next figure, the random prospects X and Y both stochastically dom-inate at the first order the random prospect Z. Since for all w, P[Z > w]is smaller than P[X > w] and P[Y > w], the expectation and the expectedutility of Z are smaller than those of X and Y .However, X and Y cannot be compared by this criteria. Indeed, the binaryrelation 1 is only a partial order on the space of random prospects.

6

- w0

1

a bc

FZ

FX

FX

FYFY

S1

S2

FIGURE 1.2: Stochastic dominance

From Figure (1.2), we see that the random prospect X is “less dispersed”

Utility theory 21

than Y : the area S(w) =∫ w

a[FY −FX ](s)ds between the two curves is always

positive (for w ≤ c, it is in [0, S1] strictly positive and, assuming −S2 ≤ S1,it is also positive for w ≥ c).

DEFINITION 1.7 X is said to dominate Y according to second-orderstochastic dominance (“X 2 Y ”) if

∫ w

aFX(s)ds ≤ ∫ w

aFY (s)ds, for all w ∈

[a, b] or, equivalently, the area S(w) =∫ w

a [FY − FX ](s)ds is always positive.

REMARK 1.9 As mentioned in Hadar and Russel [277], or in Rothschildand Stiglitz [435], this property is equivalent to the fact that all min-utilityinvestors prefer X to Y : for all w0 in [a, b], we have:

E[Min(X,w0)] ≥ E[Min(Y,w0)]. (1.19)

Indeed, this latter condition is equivalent to:∫ w0

a

wdFX(w) + w0(1 − FX(w0)) ≥∫ w0

a

wdFY (w) + w0(1− FY (w0)),

which is also equivalent to (by integrating by parts)

w0 −∫ w0

a

FX(w)dw ≥ w0 −∫ w0

a

FY (w)dw,

and, finally:∫ w0

a FX(w)dz ≤ ∫ w0

a FY (w)dw, for all w0 in [a, b]. In partic-ular, this property implies that if X 2 Y , then E[X ] ≥ E[Y ]. Note alsothat obviously first-order stochastic dominance implies second-order stochas-tic dominance but not vice-versa.

A complete caracterization of second-order stochastic dominance is providedby conditions under which E[u(X)] ≥ E[u(Y )] for all utility functions u in agiven set:

PROPOSITION 1.4X 2 Y if and only if E[u(X)] ≥ E[u(Y )], for any utility function u which ismonotonically increasing and concave.

PROOF 1) Assume that X 2 Y . Consider any utility function u in-creasing concave and twice-differentiable. Integrating by parts, we get:∫ b

a

u(w)[dFX (w) − dFY (w)] =

[u(w)(FX (w)− FY (w))]ba −∫ b

a

u′(w)[FX (w) − FY (w)]dw.


Since the first term is equal to 0, then:∫ b

a

u(w)[dFX (w) − dFY (w)] = −∫ b

a

u′(w)[FX (w) − FY (w)]dw.

Integrating by parts again, we deduce:∫ b

a


=[−u′(w)

∫ w

a

(FX(s)− FY (s))ds]ba

+∫ b

a

u”(w)[∫ w

a

(FX(s)− FY (s))ds]dw.

Recall that by assumption the area S(w) =∫ w

a [FY − FX ](s)ds is positive.Then we deduce: ∫ b

a


= [u′(w)S(w)]ba −∫ b

a

u”(w)S(w)dw = u′(b)S(b)−∫ b

a

u”(w)S(w)dw.

Since S, u′, and −u” are positive, we get∫ b

au(w)[dFX (w) − dFY (w)] ≥ 0,

which gives E[u(X)] ≥ E[u(Y )].2) Conversely, suppose that there exists w0 such that S(w0) < 0. By

continuity, there exists a neighborhood N (w0) such that S(w) < 0, for all win N (w0). Consider an increasing utility function u which is linear outsideN (w0) and concave inside (u”(w) = 0 for w /∈ N (w0) and u”(w) < 0 forw ∈ N (w0)). Then, we get:∫ b

a

u(w)[dFX(w) − dFY (w)] =∫N (w0)

u”(w)S(w)dw.

Since the right side is positive, we deduce∫ b

a u(w)dFX(w) <∫ b

a u(w)dFY (w).Thus “X 2 Y ” is false. Consequently, E[u(X)] ≥ E[u(Y )] for any increasingand concave utility function u implies X 2 Y .

REMARK 1.10 (See Hanoch and Levy [281]). Assume, as in Figure(1.2), that the two curves of FX and FY have only one intersection point.Suppose that there exists c such that:

FX(w) ≤ FY (w), for any w ≤ c and FX(w) ≥ FY (w), for any w ≥ c.

Then:X 2 Y if and only if E[X ] ≥ E[Y ]. (1.20)

Utility theory 23

In fact, for two random prospects X and Y with the same returns, thenotion of second-order stochastic dominance is equivalent to several definitionsof “riskiness,” introduced in [277], [281], [435], and [436]. Summing up:

PROPOSITION 1.5The three following statements are equivalent:1) Having the same expectation, the expected utility of X is greater than

the expected utility of Y for any utility function u which is monotonicallyincreasing and concave:

E[X ] = E[Y ] and E[u(X)] ≥ E[u(Y )]. (1.21)

2) Mean-preserving and increasing in spread: the area between the two cu-mulative distribution functions

S(w) =∫ w

a

[FY − FX ](s)ds (1.22)

satisfies: S(b) = 0 (i.e. E[X ] = E[Y ]), and S(w) is always positive (i.e.X 2 Y ).3) Adding a noise: there exists a random variable ε such that Y = X + ε

and E[ε|X ] = 0.

REMARK 1.11 Statement (3) justifies the diversification. Considern lotteries with net gains G1, ..., Gn which are assumed to be independentwith the same probability distribution. Consider any feasible strategy θ =(θ1, ..., θn) with weights θi such that

∑ni=1 θi = 1. Consider the final wealth

X =n∑

i=1

θiGi,

associated to the perfect diversification strategy

X =n∑

i=1

Gi/n

and the final wealth Y associated to any strategy θ. Then the perfect di-versification strategy second-order dominates any alternative one: X 2 Y .Indeed, we have:

Y =n∑

i=1

θiGi = X +n∑

i=1

(θi − 1/n)Gi.

Since, by symmetry, E [(θi − 1/n)Gi|X ] is independent from i, we deduce:

E

[n∑

i=1

(θi − 1/n)Gi|X]

= 0,

which implies X 2 Y by the previous statement (3).


1.5 Alternative expected utility theory

The key property linked to the expected utility theory is the IndependenceAxiom, which may fail empirically and can yield some paradoxes, as shownby Allais [19]: consider three outcomes g1 = 0, g2 = 100, and g3 = 500 andtwo pairs of probability distributions (P1,P2) and (Q1,Q2), defined by:

P1(g1) = 0%, P1(g2) = 100%, P1(g3) = 0%,P2(g1) = 1%, P2(g2) = 89%, P2(g3) = 10%,

Q1(g1) = 89%, Q1(g2) = 11%, Q1(g3) = 0%,Q1(g1) = 90%, Q1(g2) = 0%, Q1(g3) = 10%.

As experimental evidence often shows, usually people prefer lottery P1 to lot-tery P2, and lottery Q2 to lottery Q1, while the independence axiom impliesthat if you prefer P1 to P2, then you must prefer Q1 to Q2.

Besides some empirical violations of the Independence Axiom, the expectedutility theory is a problem for the interpretation of the utility function u, asproved for example in Cohen and Tallon [125]. In fact, the utility functionmust simultaneously give a representation of the choice among outcomes, andalso be the expression of the attitude towards risk. Therefore, for example,an investor who has a decreasing marginal utility u′ is necessarily risk-averse.

To argue with this kind of criticism, several responses have been given:

• First, following Marschak [378] or Savage [449], expected utility theoryis what “rational” people ought to do under uncertainty, and not nec-essarily what they actually do. This means that if they were perfectlyinformed and aware of their decisions, they will behave according toexpected utility theory.

• Second, new theories of choice under uncertainty can be introduced toavoid, for example, Allais paradox.

During the 80s, the revision of the expected utility paradigm has been intense-ly developed by slightly modifying or relaxing the original axioms. Amongseveral proposals are: the Weighted Utility Theory of Chew and MacCrimmon[120] that assumes a weaker form of the axiom of independence; the ProspectTheory of Kahnemann and Tversky [314] which modifies the axioms muchmore; the Non-Linear Expected Utility Theory of Machina [367]; the Antici-pated Utility Theory of Quiggin [420]; the Dual Theory of Yaari [507]; and theRegret Theory introduced by Loomes and Sugden [365].

Utility theory 25

1.5.1 Weighted utility theory

One of the first theories that was consistent with Allais paradox was the“weighted utility theory”, introduced by Chew and MacCrimmon [120] andfurther developed by Chew [119] and Fishburn [224]. The idea is to apply atransformation on the initial probability.

The basic result of Chew and MacCrimmon yields the following represen-tation of preferences over lotteries L = (ω1, p1), ..., (ωm, pm):

U(L) =∑i

u(xi)φ(pi) with φ(pi) = pi/[∑i

v(xi)pi], (1.23)

where u and v are two different elementary utility functions.

Another approach is the “prospect theory” introduced by Kahnemann andTversky [314]. Consider the following example where two lotteries are pro-posed.

Example 1.1

Most people choose A and D. Thus they violate the theory of expected utility

TABLE 1.1: Kahnemann and Tversky exampleProblem 1Assume you are 300 richer than you are today. Choose between:A. The certainty of earning 100B. 50% probability of winning 200 and 50% of not winning anythingProblem 2Assume you are 500 richer than today. Choose between:C. A sure loss of 100D. 50% chance of not losing anything and 50% chance of losing 200

(the independence axiom of the theory). However, in terms of expected utility,the two problems are equivalent:

TABLE 1.2: Equivalence of the two problemsProblem 1Case A: 400 with prob=1Case B: 300 with prob=0.5 or 500 with prob=0.5Problem 2Case C: 400 with prob=1Case D: 300 with prob=0.5 or 500 with prob=0.5


In fact, the available wealth was considered after the choice has been made.Thus, most people behave as risk takers when facing a problem presented interms of loss (Problem 2), while they behave as risk-averse when the sameproblem is presented in terms of gain (Problem 1).

This behavioral inconsistency is called the “framing effect,” and shows thatthe mental representation of a choice problem may be crucial.

Kahnemann and Tversky observe that “the preferences observed in the t-wo problems are of particular interest as they violate not only the theory ofexpected utility, but practically all choice models based on other normativetheories.”

The idea of the prospect theory is to represent the preferences by means ofa function φ such that the utility of a lottery

L = (x1, p1), ..., (xn, pn)

is given by:

U(L) =n∑i

u(xi)φ(pi), (1.24)

where φ is an increasing function defined on [0, 1] with values in [0, 1] andφ(0) = 0, φ(1) = 1.

The function φ(·) is a transformation of the initial probability and cor-responds to a decision weight functional. It allows us to take account of a“certainty effect.” For example, if the function φ is not left-continuous at 1,then φ(p) < p maybe in a neighborhood of 1. This is the result of the passagefrom certitude to uncertainty. Note that the equality

∑ni=1 φ(pi) = 1 may no

longer be true.

Using this transformation, the Allais paradox can be solved. Moreover,from experimental observations, Kahneman and Tversky argue that it is nec-essary to distinguish positive results (gains) from negative ones (losses) fromexperimental observations. However, the sub-additivity of φ which is induced:

∀p1, p2 ∈]0, 1[ φ(p1) + φ(p2) < φ(p1 + p2), (1.25)

may imply the violation of the first-order stochastic dominance, as well asother models with weighted probabilities.

To solve this problem, alternative approaches can be proposed.

Utility theory 27

1.5.2 Rank dependent expected utility theory

The “Rank Dependent Expected Utility” theory (RDEU) assumes that peo-ple consider cumulative distribution functions rather than probabilities them-selves. In this framework, it is possible to introduce preference representationsthat are compatible with the first-order stochastic dominance.

The functional representation of preferences is defined as follows:

DEFINITION 1.8 For all random variables X and Y which model resultsor consequences and with values in [−M,M ],

X Y ⇔ V (X) ≥ V (Y ), with V (Z) =∫ M

−M

u(z)dΦ(FZ(z)), (1.26)

where the function u(·) is continuous and differentiable, non-decreasing andunique up to a non-negative linear, and Φ(·) is a continuous function, non-decreasing from [0, 1] in [0, 1]. Without loss of generality, it can be assumedthat if Φ(0) = 0 and Φ(1) = 1, then Φ(·) is unique.

Note that for a discrete lottery L = (x1, p1), ..., (xm, pm) with

x1 ≤ x2 ≤ · · · ≤ xn,

the utility V is given by:

V (L) =∑n

i=1 u(xi)[Φ(∑i

j=1 pj)− Φ(∑i−1

j=1 pj)],

= u(x1) +∑n

i=2(u(xi)− u(xi−1))[1− Φ(

∑ij=1 pj)

].

(1.27)

Since the weights Φ(∑i

j=1 pj) depend on the ranking of the outcomes xi, thispreference representation is called “rank dependent expected utility.” Theseweights are calculated by first ranking outcomes from the worst to the bestthen by summing up the utilities weighted by the sequence (Φ(

∑ij=1 pj) −

Φ(∑i−1

j=1 pj))i. Thus, it is assumed that an objective probability exists, butindividuals transform this given law by using a function of its cdf. Contraryto transformation defined on the pdf itself, this allows us not to violate thefirst-order stochastic dominance.

REMARK 1.12 The RDEU is a generalization of the expected utilitycriterion (EU). Indeed, for Φ(p) = p, ∀p ∈ [0, 1], the functional representationof preferences is given by:

V (L) =n∑

i=1

u(xi)

(i∑

j=1

pj)− (i−1∑j=1

pj)

=n∑

i=1

piu(xi) = EU(L).


If Φ is not the identity but u(x) = x, then the RDEU is the dual theory ofYaari [507].

As mentioned in Tallon [488], the RDEU has several advantages:

• Contrary to the EU, the RDEU allows separation of the behavior to-wards wealth from the behavior towards risk. Therefore, the RDEU iscompatible with usual empirical observations which show that individu-als under- or overestimate probabilities of random events (i.e., are eitherpessimistic or optimistic).

• Contrary to the EU, the RDEU allows identification of two notionsof risk-aversions: the standard weak risk-aversion and the strong risk-aversion.

Indeed, in the RDEU framework, these two notions have to be differentiat-ed:

1) The weak risk-aversion of Arrow-Pratt: An individual prefers the expec-tation of the lottery to the lottery itself:

∀L = (xi, pi)i=1,··· ,n, E(L) =n∑

i=1

pixi L. (1.28)

In the ES context, it is equivalent to the concavity of the utility function.2) The strong risk-aversion of Rothschild and Stiglitz ([435] and [436]):

This definition is based on the notion of mean preserving spread. Considertwo random variables X and Y associated respectively to lotteries LX andLY .

DEFINITION 1.9 Y is said to be a mean preserving spread of X if:

E(LX) = E(LY )and

∀T ∈ [−M,M ],∫ T

−M ProbX < tdt ≤ ∫ T

−M ProbY < tdt.(1.29)

The latter condition corresponds to the second-order stochastic dominanceas seen in Proposition 1.5.

DEFINITION 1.10 A strong risk-averse individual prefers the mean p-reserving spread Y of X to X itself: LY LX .

REMARK 1.13 These two risk-aversion notions are identical in the ESframework. They correspond to the concavity of the utility function. In theRDEU context, they are distinct.

Utility theory 29

• Chateauneuf et al. [115] show that an individual satisfying RDEU with aconcave utility function u(·) is weakly averse to risk if an only if his trans-formation function Φ is such that Φ(p) ≤ p, ∀p ∈ [0, 1]. Furthermore, ifu(x) = 1− (1− x)n with n ≥ 1, he is weakly averse to risk if and only ifhis transformation function Φ is such that Φ(p) ≥ 1−(1−p)n, ∀p ∈ [0, 1].

• Chew, Karni and Safra [121] prove that an individual is strongly averseto risk if and only if his utility function u is concave and his transfor-mation function Φ is convex.

• Tallon [488] proves that strong risk-aversion allows us to give an inter-pretation of the RDEU: The beliefs of an individual are characterizedby a given set of probability distributions and his utility is the infimumof the expectations of his utility on this set. This kind of result is alsodeduced for characterization of risk measures, as shown in Section 2.1.3of Chapter 2.

Several models have been proposed in the (RDEU) framework, as recalledin what follows.

1.5.2.1 Anticipated utility theory

Quiggin [421] keeps three main properties of the ES theory: the transitivity,the first-order stochastic dominance, and the continuity. He has added thefollowing axiom:

DEFINITION 1.11 (Weak independence Axiom)Consider two lotteries

LX = (x1, p1), ..., (xn, pn) and LY = (y1, p1), ..., (yn, pn)

such thatx1 ≤ ... ≤ xn and y1 ≤ ... ≤ yn

and∀i ∈ 1, ..., n , P[X = xi] = P[Y = yi].

Assume that there exists a common value xi0 = yi0 . Consider two lotteriesLX′ and LY ′ which are equal respectively to LX and LY , except that xi0 andyi0 are replaced by another common value.The preference is weak independent if and only if:

LX LY ⇐⇒ LX′ LY ′ . (1.30)


PROPOSITION 1.6

Consider a lottery L = (x1, p1), ..., (xn, pn). Then a functional V whichsatisfies Quiggin’s conditions is given by:

V (L) =n∑

i=1

u(xi)

Φ(i∑

j=1

pj)− Φ(i−1∑j=1

pj)

, (1.31)

where Φ is non-decreasing on [0, 1] to [0, 1], and is concave on [0, 12 ], (Φ(pi) >pi) and convex on [12 , 1], (Φ(pi) < pi) with Φ(12 ) = 1

2 and Φ(1) = 1.

As proposed in Quiggin [420], the function Φ can be chosen as follows:

Φ(p) =pγ

(pγ + (1− p)γ)1γ

, (1.32)

with, for example, γ = 0, 6.

REMARK 1.14 Under the previous assumptions, the first-order stochas-tic dominance is satisfied. Moreover, the Allais paradox is solved. Finally, themodel is in accordance with the empirical result of Kahneman and Tversky[314]: individuals weight more events with small probabilities and weight lessthose with high probabilities.

REMARK 1.15 For the special case : u(x) = x, the RDEU approach isequivalent to the dual theory of Yaari [507]. The implication of such preferencerepresentation is that no investor will diversify: in the presence of one risklessasset and one risky asset, either he buys only the riskless one, or only therisky one. However, as shown by Eeckhoudt [185], when both assets are risky,and in the presence of a “background risk” (such as illness, accident, fire...),diversification can be observed.

Finally, different empirical experiences have shown that individuals do nothave the same attitude towards losses and gains: the utility on losses seemsto be convex, whereas the utility on gains seems to be concave. The valueof each component is computed by taking the expected utility with respectto distortions of the distribution function which may differ for the positiveand the negative parts of the distribution. This model can be viewed as ageneralization of the standard rank-dependent utility model, where the samedistortion function is used for the whole distribution.

This kind of behavior is modelled by the “Cumulative Prospect Theory.”

Utility theory 31

1.5.2.2 Cumulative prospect theory

Tversky and Kahneman [496] have introduced on one hand specific utilityfunctions for losses and gains, on the other hand a transformation function ofthe cumulative distributions. There exist two functions, w− and w+ definedon [0, 1], and a utility type function v such that the utility V on the lotteryL = (x1, p1), ..., (xn, pn) with x1 < ... < xm < 0 < xm+1 < ... < xn isdefined as follows: define Φ− and Φ+ by: Φ−

1 = w−(p1) and Φ+n = w+(pn),

Φ−i = w−

(∑ij=1 pj

)− w−

(∑i−1j=1 pj

), ∀i ∈ 2, ...,m ,

Φ+i = w+

(∑nj=i pj

)− w+

(∑nj=i+1 pj

), ∀i ∈ m + 1, ..., n .

(1.33)

Then, V is given by: V (L) = V −(L) + V +(L) with

V −(L) =m∑i=1

v(xi)Φ−i and V +(L) =

n∑i=m+1

v(xi)Φ+i . (1.34)

When the probability distribution F has a pdf f on [−M,M ], and the func-tions w− and w+ have derivatives w−′

and w+′, then:

V (L) =∫ 0

−M

v (x)w−′[F (x)]f(x)dx +∫ M

0

v (x)w+′[1− F (x)]f(x)dx.

As in Quiggin [420], both functions w− and w+ can be chosen as follows:w(p) = pγ

(pγ+(1−p)γ)1γ, with, for example, γ− = 0, 69 and γ+ = 0, 61.

-100 -50 0 50 100-1

-0.5

0

0.5

1

0 0.2 0.4 0.6 0.8 10

0.2

0.4

0.6

0.8

1

FIGURE 1.3: Kahneman and Tversky functionsThe utility function v is convex on losses and concave on gains. The weightingfunctions w+ and w− are above the identical function for small probabilitiesand under it for large probabilities.


1.5.3 Non-additive expected utility

The “Choquet expected utility” of Schmeidler [454] does not assume thatthere exists a probability to measure the likelihood of random events. Themodel is based on the so-called “Choquet Integral” (see Choquet [123]). Con-sider, for example, a finite probability set Ω equipped with a σ-algebra F .

DEFINITION 1.12 A Choquet measure (or “capacity”) C is a functiondefined on F with values in [0, 1] satisfying:

C(∅) = 0,C(Ω) = 1,∀A,B ∈ F , A ⊂ B ⇒ C(A) ≤ C(B).

(1.35)

Note that for mutually exclusive subsets A and B, C(A∪B) may be smalleror higher than C(A) + C(B).

The Choquet integral,∫Ωu(x)dC(x), of a function u (piecewise constant)

can be defined as follows:

DEFINITION 1.13 Let u be defined on the set of outcomes of a lotteryL = (x1, A1), ..., (xn, An) with x1 < ... < xn, and where for all i, the eventAi corresponds to the outcome xi,

CEU(L) = u(x1) +∑n

i=2[(u(xi)− u(xi−1)]C(Ai ∪ · · · ∪An).

=∑n

i=1 φiu(xi),(1.36)

where:

∀i = 1, · · · , n− 1, φi = C(Ai ∪ · · · ∪An)−C(Ai+1 ∪ · · · ∪An), φn = C(An).

The Choquet expected utility (CEU)) of an individual is a functional rep-resentation of preferences based on some particular axioms.

DEFINITION 1.14 1) A preference relation is said to be monotonicif, for any random variables X and Y defined on a probability set Ω:

X(ω) ≥ Y (ω) =⇒ X Y. (1.37)

2) Two random variables X and Y defined on a probability set Ω are said tobe comonotonic if there exist no distinct ω1 and ω2 in Ω such that:

X(ω1) > X(ω2) and Y (ω2) > Y (ω1). (1.38)

3) A preference relation is said to be comonotonic independent if, for allpairwise random variables X, Y , and Z, and for any λ in ]0, 1[:

X Y =⇒ λX + (1 − λ)Z λY + (1 − λ)Z. (1.39)

Utility theory 33

PROPOSITION 1.7Assume that the preference relation satisfies axioms 1 and 2 (see Section(1.1.2)) and monotonicity and comonotonic independence axioms. Then thereexists a unique Choquet capacity defined on the σ-algebra F and an affine realvalued function u such that :

LX LY ⇐⇒∫Ω

u[X(ω)]dC(ω) ≥∫Ω

u[Y (ω)]dC(ω). (1.40)

REMARK 1.16 The term[1− Φ(

∑ij=1 pj)

]in Definition (1.27) of RDEU

corresponds to the term C(Ai ∪· · ·∪An) in Definition (1.36) of CEU. Indeed,for the RDEU, the preference functional is given by

V (L) =∫

uRDEU (x)dΦ(F (x)),

while for the CEU, it is defined by

V (L) =∫

uCEU(x)dC(x).

This preference representation covers problems such as the Ellsberg paradox(see [196]).

1.5.4 Regret theory

Consider the following bet as examined in Lichtenstein and Slovic [356].Assume that there are two lotteries LX and LY such that:

LX has outcomes (A, a) with probabilities (p, (1− p)),LY has outcomes (B, b) with probabilities (q, (1− q)). (1.41)

The money amounts A and B are assumed to be large and a and b are small,possibly negative. The main assumption is that p > q (lottery LX has a higherprobability of a large outcome), and that B > A (lottery LY has the highestprobability of a large outcome). Thus, individuals who choose lottery LX facea relatively higher probability of a relatively low gain. Individuals who chooselottery LY have a relatively smaller probability of a relatively high gain. Forexample:

LX has outcomes (30, 0) with probabilities (p = 90%, (1− p) = 10%),LY has outcomes (100, 0) with probabilities (q = 30%, (1− q) = 70%).

(1.42)Note that the expectation of lottery LX is higher than that of lottery LY .Many experiments (see for example [356]) have shown that individuals tend


to choose lottery LX rather than lottery LY . Furthermore, they were willingto sell their right to play lottery LX for less than their right to play lotteryLY . Thus, whereas they prefer lottery LX , they were willing to accept alower certainty-equivalent amount of money for this lottery (25 for LX) thanthey do for the other lottery LY (27 for LY ). This violates the transitivityaxiom: indeed, one is indifferent between the certainty-equivalent of a lotteryand the lottery itself. Thus, for this example, we have: U(LX) = U(25) andU(LY ) = U(27). Since U is increasing, U(27) > U(25), which implies thatU(LY ) > U(LX). However the empirical choice is such that U(LX) > U(LY ),and the preference is not transitive.

Transitivity is often considered as quite rational. However, we can searchfor preference representation models which can “rationally” explain this “ir-rationality.” Loomes and Sugden [365] proposed a “regret/rejoice” functionfor pairwise lotteries which contain the outcomes of both the chosen and theforegone lottery. Let LX and LY be two lotteries. If LX is chosen and LY

is foregone and the outcome of LX turns out to be a and the outcome of LY

turns out to be b, then we can consider, for example, the difference betweenthe (elementary) utilities between the two outcomes, to be a measure of re-gret, i.e. r(x, y) = u(x) − u(y), which is negative if “regret” and positive if“rejoice.” Individuals thus faced with alternative lotteries do not seek to max-imize expected utility but rather to minimize expected regret. More generally,the regret theory is defined as follows: Let R(.) be a regret/rejoice functionwhich is assumed to be non-decreasing and such that R(0) = 0. Introduce thefunction Q given by

Q[z] = z + R[z]−R[−z]. (1.43)

The function Q is non-decreasing and such that Q[z] = −Q[−z]. Consider twolotteries LX = (x1, ..., xn) and LY = (y1, .., yn) with the same probabilities(p1, .., pn). The expected rejoice/regret is defined by:

DEFINITION 1.15 (see [365])1) Rejoice/regret of two lotteries:

E(Q(LX , LY )) =n∑

i=1

Q[u(xi)− u(yj)]pi. (1.44)

2) Preference on lotteries:

LX LY ⇐⇒ E(Q(LX , LY )) ≥ 0. (1.45)

It is obvious that when Q is linear, the regret theory is equivalent to theexpected utility criterion. Note that Q is usually assumed to be convex.

Utility theory 35

1.6 Further reading

Fishburn [223] provides basic results about utility theory for decision-making.In [225], Fishburn recalls the story of expected utility theory based on specificaxioms and examines violations of these axioms, which are at the origin ofnew theories. Gollier [258] provides an overview about expected utility theoryand comments on its interest with respect to alternative approaches. More-over, applications to static portfolio optimization are detailed. Levy [352]empirically examines the absolute and relative risk aversion to check Arrow’sassumption: “investors reveal decreasing absolute risk aversion (DARA) andincreasing relative risk aversion (IRRA).” The purpose of the experience is totest these two hypotheses when the individual’s wealth varies depending onhis/her investment performance. The empirical result is that DARA is indeedstrongly supported, but IRRA is rejected.

From the 1980s, and in particular during the 1990s, new theories based onalternative axioms have been introduced such that the Allais hypothesis wouldemerge as a result. The debate about expected or non-expected utility hasnot yet ended. A comprehensive and detailed survey of the literature on alter-native expected utility theories is found in Fishburn [226]. Among them, thenon-linear expected utility of Machina [367] appealed to the notion of “localexpected utility.” A generalization of rank dependent expected utility is in-troduced in Segal ([457],[458]): there is no strict separation between behaviortowards outcomes and their probabilities (joint function of outcomes and cdf).

Maccheroni et al. [369] introduce dynamic variational preferences. Theygeneralize the multiple priors preferences of Gilboa and Schmeidler [254] whomodel ambiguity averse agents. They provide conditions under which dynamicvariational preferences are time consistent and have a recursive representation.

Experiments have been conducted to check that individual choices do notsatisfy the linearity property with respect to probabilities. Schoemaker [455]provides several examples showing that the expected utility property is oftenviolated: the choice process is examined from economic, decision theoretic,and psychological perspectives. Multiple factors are examined which mayinduce variance in risk-bearing: including portfolio constraints and marketincompleteness. In the RDEU framework, the major difficulty is to estimateseparately the utility function and the transformation function. Wakker andDeneffe [501] provide a method to dissociate these two estimations. In [160]and [161], another approach is proposed: the idea is to test the invarianceof choices with respect to a given transformation (for example, additive ormultiplicative). This allows a characterization of the utility functions, inde-pendently of the weighting of the utility. Jaffray [295] and Cohen and Tallon


[125] examine the individual’s behavior when he focuses simultaneously onthe worst and best outcomes. They propose special weighting of the utilitiesof these outcomes.

Many recent studies are devoted to behavorial finance and its applicationto portfolio management. For example, in Benzion and Yagil [53], portfo-lio choices are investigated experimentally. Other consequences for financialmarkets are analyzed, for instance in De Bondt and Wolff [151]. In Levy andLevy [353], it is shown that, when diversification between assets are allowed,mean-variance and prospect theory determine almost the same efficient sets,while prospect theory supposes that investors’ choices are based on changeof wealth rather than on total wealth, and that objectice probabilities aresubjectively distorted.

Another approach, which has not yet been fully applied in financial mod-elling, is based on the discrete choice theory as detailed by MacFadden [366].The idea is that individual preferences are random utility functions: one com-ponent is observable and deterministic. The other component, which is notobservable, takes account of imperfections such as lack of information, indi-viduals’ heterogeneity, etc. The Logit multinomial model is frequently usedto model the uncertainty about the utility function.

Chapter 2

Risk measures

Recent years have seen increasing development of new tools for risk manage-ment analysis. Many souces of risk have been identified, such as market risk,credit risk, counterparty default, liquidity risk, operational risk and others.One of the main problems concerning the evaluation and optimization of riskexposure is the choice of “good” risk measures. Value-at-Risk has been in-troduced for bank regulation purposes. Nevertheless, due to some of its defi-ciencies, other risk measures have been proposed. But what axioms must beimposed to determine a “rational” risk measure?

2.1 Coherent and convex risk measures

In portfolio theory, many risk measures alternative to the variance have beenintroduced. As noted by Giacometti and Ortobelli [252], we can distinguishtwo kinds of risk measures:

• First, the dispersion measures: these functions of risks X are increasing,positive, and positively homogeneous. Among them: standard deviationand mean-absolute deviation. Note that, if X > 0 and λ > 1, then

ρ(λX) = λρ(X) > ρ(X).

Thus, they are not consistent with the first-order stochastic dominance.

• Second, the safety measures: these were introduced in portfolio theoryby Roy [437], Telser [490], and Kataoka [324]. The safety-first rules arebased on risk measures which involve the probability that the portfolioreturn falls under a given level. These kinds of measures are consistentwith the first-order stochastic dominance.

Among the safety-first measures, a special class has been emphasized: thecoherent risk measures.

37


2.1.1 Coherent risk measures

In their seminal contribution, Artzner, Delbaen, Eber, and Heath [31] in-troduce axioms that are based on regulation of private banks by a centralbank. Such risk measures determine the minimal amount of capital that mustbe added to make the future value of a position acceptable. Consider the setΩ of states of nature. This set may describe, for example, the possible valuesof all asset prices. The probabilities of each state ω may be unknown. LetX(ω) be the discounted future net worth of the position for each “scenario”ω. Let X be the set of all risks; that is a given set of real random variablesdefined on Ω. The set X is a linear space which may be, for instance, the setof all bounded random variables.

Assume that there exists a reference asset of a constant total return r.Recall the axiomatics of Artzner et al. for a risk measure ρ defined on the setX :

• Axiom T. Translation invariance: for all X in X and all real numbersα, we have

ρ(X + α.r) = ρ(X)− α.

This means that if the sure amount α is initially invested in the referenceasset, then the variation of the risk measure is equal to α itself. Thisproperty is quite in accordance with a monetary interpretation of themeasure ρ. Note that in particular, we have:

ρ(X + ρ(X).r) = 0.

• Axiom S. Subadditivity: for all X1 and X2 in X :

ρ(X1 + X2) ≤ ρ(X1) + ρ(X2).

In particular, this property means that “a merger does not create extrarisk.” Note also that, for a global risk X1 + X2, the amount ρ(X1) +ρ(X2) is sufficient to guarantee the position. This property does notallow reduction of risk by dividing the total position into smaller ones,which is highly desirable for regulation purposes, but does not satisfythe diversification principle.

• Axiom PH. Positive homogeneity: for all X in X and for all λ ≥ 0,

ρ(λX) = λρ(X).

This property implies that the risk measure is a linear function of thesize of the position. It requires that there is no liquidity risk.

• Axiom M. Monotonicity: for all X1 and X2 in X :

X1 ≤ X2 =⇒ ρ(X1) ≥ ρ(X2).

Risk measures 39

• Axiom R. Relevance: for all X in X with X ≤ 0 and X = 0,

ρ(X) > 0.

This means that if there actually exists a risk, it must be taken intoaccount.

DEFINITION 2.1 A risk measure satisfying the axioms of translationinvariance, subadditivity, positive homogeneity, and monotonicity is called co-herent.

2.1.2 Convex risk measures

When the risk measure is not assumed to have variations proportional tothe risk variations themselves, the positive homogeneity is no longer satisfied.Alternative axioms can be proposed. This leads to the notion of convexity,as introduced in Heath [286] for finite sample spaces (“weakly coherent riskmeasures”) and Follmer and Schied [234] for general spaces (see also [236]).

• Axiom C Convexity: for all X1 and X2 in X and for all 0 ≤ λ ≤ 1,

ρ(λX1 + (1− λ)X2) ≤ λρ(X1) + (1 − λ)ρ(X2).

DEFINITION 2.2 A risk measure satisfying the axioms of translationinvariance, monotonicity, and convexity is called convex.

PROPOSITION 2.1A convex risk measure is coherent if it satisfies the positive homogeneity. Notealso that the positive homogeneity and the subadditivity implies the convexity.

As mentioned in Artzner et al. [31], a class Aρ of “acceptable positions”can be associated to each risk measure ρ. This class, called the acceptance setof ρ, contains all positions X which do not induce positive risk:

Aρ = X ∈ X|ρ(X) ≤ 0. (2.1)

Conversely, given a class A in X of acceptable positions, a risk measure ρAcan be defined by:

ρA(X) = infm ∈ R |m + X∈A . (2.2)

The following proposition (see [236] and [238]) indicates the relations betweenconvex risk measures and the corresponding acceptance sets.


PROPOSITION 2.2If ρ is a convex risk measure with acceptance set Aρ, then ρAρ = ρ. Besides,the acceptance set A = Aρ has the following properties:1) The set A is non-empty and convex.2) If X ∈ A and Y ∈ X , then Y ≥ X implies Y ∈ A.3) If the risk measure ρ is coherent, then A is a convex cone.Conversely, if A is a non-empty convex subset of X which satisfies property

(2) and such that the corresponding functional ρA satisfies ρA(0) > −∞, then:1) ρA is a convex measure of risk.2) If A is a cone then ρA is a coherent measure of risk.

2.1.3 Representation of risk measures

Assume that X is the linear space of all bounded measurable functionson a measurable space (Ω,F). Denote by M1 = M1(Ω,F) the class of allprobabilities on (Ω,F). Denote also by M1,f = M1,f (Ω,F) the class of allfinitely additive and non-negative functions Q on F such that Q(Ω) = 1. Thena characterization of coherent risk measures can be deduced (see [30] and[31]for finite probability spaces, and [154] for general spaces):

PROPOSITION 2.3A functional ρ is a coherent measure of risk if and only if there exists a subsetQ ofM1,f such that:

ρ(X) = supQ∈Q

EQ[−X ], X ∈ X . (2.3)

Besides, Q can be chosen such that it is convex and the supremum is attained.

A similar representation result is obtained in [237] for convex measures ofrisk. Let α : M1(Ω,F) → R ∪ ∞ be a functional which is bounded frombelow and not identically equal to ∞. Define the measure of risk ρ by:

ρ(X) = supQ∈M1,f

(EQ[−X ]− α(Q)) . (2.4)

The measure ρ associated to the function α is convex.

DEFINITION 2.3 The functional α is called a penalty function for therisk measure ρ defined onM1,f .

PROPOSITION 2.4Any convex measure of risk ρ on X has the following form:

ρ(X) = maxQ∈M1,f

(EQ[−X ]− αmin(Q)) , X ∈ X , (2.5)

Risk measures 41

where the penalty function αmin is given by:

αmin(Q) = supY ∈Aρ

EQ[−Y ], for Q ∈ M1,f . (2.6)

Additionally, the minimal penalty function αmin represents the risk measureρ: for any penalty function α satisfying relation (2.5), α(Q) ≥ αmin(Q), forall Q ∈M1,f .

REMARK 2.1 - The minimal penalty function αmin of a coherent mea-sure of risk ρ takes only the values 0 and ∞ and:

ρ(X) = maxQ∈Qmax

EQ[−X ], X ∈ X ,

for the convex set

Qmax = Q ∈ M1,f |αmin(Q) = 0,

which is the largest set for which the representation (2.3) holds.- If Q is the set of all probability measures on (Ω,F), then the coherent riskmeasure induced by Q is given by the worst case measure:

ρmax(X) = supQ∈Q

EQ[−X ] = − infω∈Ω

X(ω), for all X ∈ X .

- Using continuity argument such as, if Xn X then ρ(Xn) ρ(X), themeasure of risk ρ can be represented as the maximum:

ρ(X) = maxQ∈M1

(EQ[−X ]− αmax(Q)) , X ∈ X ,

which may be too restrictive (see [237]).

2.1.4 Risk measures and utility

As detailed in Chapter 1, under some specific assumptions, the investor’spreference on the possible positions X can be represented by a utility functionU . Nevertheless, the expected utility criterion, which assumes in particularthat one single probability measure is determined, may be not appropriateto model the investor’s decision under uncertainty. Maybe the investor hasin mind a larger class of measures of occurrence of scenarios, represented byelements of M1,f . Then, he may consider the worst case for the expectedlosses of his investments. In this framework, under a set of axioms for hispreference order (see [253] and [254]), there exists a Savage representation ofU associated to a utility function u defined on the outcomes such that:

U(X) = infQ∈Q

EQ[X ], (2.7)


where Q is a set of probability measures on (Ω,F). A coherent measure ofrisk ρ can be associated to this functional U from the relation:

U(X) = −ρ(u(X)), for all X ∈ X . (2.8)

As detailed in [237] (see also [303]), another approach is to associate a lossfunctional L to the utility U by letting L = −U . Then, there exists a lossfunction l, convex and increasing, defined by l(x) = −u(−x) and such that:

L(X) = supQ∈Q

EQ[l(−X)]. (2.9)

A position X is acceptable if the loss functional L(X) is not higher thana given reference level xr . Then, a convex class of acceptable positions isdeduced:

AL = X ∈ X|L(X) = supQ∈Q

EQ[l(−X)] ≤ xr.

The set A defines a convex measure of risk ρL with the representation:

ρL(X) = supQi∈M1

(EQ[−X ]− αL(Q)) .

Then, the penalty function αL has to be determined. As proved in [238], ageneral expression can be provided by using the Fenchel-Legendre transforml∗ of the loss function l which is defined by:

l∗(y) = supx∈R

[yx− l(x)] .

PROPOSITION 2.5The convex risk measure ρL associated to the acceptance set AL has a penaltyfunction αL given by:

αL(P ) = infλ>0

1λ

(xr + inf

Q∈QEQ

[l∗(λdP

dQ

)]), (2.10)

where dPdQ

is a generalized density in the sense of Lebesgue decomposition:αL(P ) < ∞ only if P is absolutely continuous with respect to at least someQ ∈ Q (notation: P# Q).

Example 2.1For the exponential loss function l(x) = ex and xr = 1, the penalty functionαL is given by:

αL(P) = infQ∈Q

H(P/Q), (2.11)

where H(P/Q) denotes the relative entropy of P with respect to Q:

H(P/Q) =

EQ

[dPdQ

log dPdQ

]if P # Q,

+∞ otherwise.

Risk measures 43

2.1.5 Dynamic risk measures

In Cvitanic and Karatzas [140], the idea of risk measures is introduced in adynamic setting. They assume that a complete financial market is given andthey define the risk of a position as the highest expected shortfall under someset of probability measures. The corresponding risk measure is given by:

ρ(X)V0 = supQ∈Q

infθ∈Θ(V0)

EQ

[(X − V θ

T )]+

, (2.12)

where Q is a given set of probability measures (“the possible scenarios”),Θ(V0) is the class of all admissible portfolios with initial endowment V0, andV θT is the value of the portfolio θ at maturity T .

Thus, as mentioned in Frittelli [243], the dynamic property of such a mea-sure is due to the possible portfolio rebalancing, and not to a dynamic valueof the measure itself. In Wang [502], the risk measure is introduced in adynamic way. The class of “likelihood-based risk measures” is introduced totake account of dynamic readjustments (for discrete time and finite samplespaces). Further results can be found in [32], where two different dynamicmeasures of risk are proposed.

In Riedel [425], the class of coherent risk measures is extended to the dy-namic framework by considering a predictable translation invariance to takenew information into account, and also by introducing a dynamic consistencyto avoid possible contradictions between judgements over time. This leads tothe following representation:

Consider a sequence of time periods t = 0, ..., T and a finite set Ω of statesof the world. Suppose that there exists a sequence of random variables (Zt)twhich reveals the information at time t.

Denote by (Ft) the filtration generated by the process Z:

Ft = σ(Z1, ..., Zt). (2.13)

A position D = (Dt)t is a Ft-adapted process, which corresponds to a se-quence of random payments at any time t. It is assumed that there exists anexogenous interest rate r. Then, under the axioms given in [425], the dynamicrisk measure assigns to the sequence D of payments the risk:

ρt(D) = maxQ∈Q

EQ

[−

T∑s=t

Ds

(1 + r)T−s|Ft

]. (2.14)


The notion of dynamic risk measures can be defined as follows (see [243]):

- First, a whole process (ρt)t must be introduced to represent the risk of theposition at any time, conditionally to the information available at any time talong the period [0, T ].

- Second, some boundary conditions must be satisfied. At time 0, the riskmeasure ρ0 is a static risk measure as defined in previous sections. Addition-ally, at maturity T , ρT is the opposite of the worth of the financial position.

Let (Ft)t be a filtration defined on the probability space (Ω,F ,P). Denoteby Lp = Lp(Ω,FT ,P) the space of all real-valued, FT -measurable, and p-integrable random variables, and by L0

t = L0(Ω,Ft,P) the space of all randomvariables defined on (Ω,Ft,P).

DEFINITION 2.4 A dynamic risk measure is a process (ρt)t such that:i) ρt : Lp → L0

t , for all t ∈ [0, T ];ii) ρ0 is a static measure; and,iii) ρT (X) = −X, P− a.s., for all X ∈ Lp.

Most of the “rational” dynamic axioms that can be imposed on the process(ρt)t are similar to those in the static case.

Axiom T. Translation invariance: ∀t ∈ [0, T ], ∀a ∈ Lp and Ft-measurable,∀X ∈ Lp,

ρt(X + a) = ρt(X)− a, P− a.s.

Note that at time t, the random variable a can be considered as a constantsince it is observable given the information Ft.

Axiom S. Subadditivity: ∀t ∈ [0, T ], ∀X1, X2 ∈ Lp,

ρt(X1 + X2) ≤ ρt(X1) + ρt(X2), P− a.s.

Axiom PH. Positive homogeneity: ∀t ∈ [0, T ], ∀X ∈ Lp and ∀λ ≥ 0,

ρt(λX) = λρt(X), P− a.s.

Axiom M. Monotonicity: ∀X1, X2 ∈ Lp,

X1 ≤ X2 =⇒ ∀t ∈ [0, T ], ρt(X1) ≥ ρt(X2), P− a.s.

Axiom Ct. Constancy: ∀c ∈ R, ∀t ∈ [0, T ],

ρt(c) = −c, P− a.s.

Axiom P. Positivity: ∀X ∈ Lp,

X ≥ 0 =⇒ ∀t ∈ [0, T ], ρt(X) ≤ ρt(0), P− a.s.

Risk measures 45

Axiom C. Convexity: ∀t ∈ [0, T ], ∀X1, X2 ∈ Lp, and ∀λ such that 0 ≤ λ ≤ 1,

ρt(λX1 + (1 − λ)X2) ≤ λρt(X1) + (1− λ)ρt(X2).

The following definitions are proposed for coherent and convex risk mea-sures.

DEFINITION 2.5 (see [243])i) A dynamic risk measure (ρt)t is called convex if it satisfies axiom (C)

and ρt(0) = 0.ii) A dynamic risk measure (ρt)t is called coherent if it satisfies axioms (T),

(S), (PH), and (P).iii) A dynamic risk measure (ρt)t is said to be time-consistent if it satisfies

the following axiom: ∀t ∈ [0, T ], ∀X ∈ Lp, ∀A ∈ Ft,

ρ0(XIA) = ρ0(−ρt(X)IA).

This time-consistency condition is the condition of the “filtration-consistency”of Coquet et al. [129] adapted to the risk measure framework. It is linkedto the recursivity property defined in Artzner et al. [32] (not their time-consistency definition).

There exists mainly two ways to provide dynamic risk measures:- First, by using robust representations as in the static case, as shown in

([32], [33]).- Second, by relying on dynamic risk measures to backward stochastic differ-

ential equations (BSDE) through the notion of the “conditional g-expectation,”as introduced in Peng [403].

The first approach can be summarized into the two following results:

Dynamic convex risk measure: Let Q be a convex set of P-absolutelycontinuous probability measures defined on (Ω,Ft). For any t ∈ [0, T ], letαt : Q → R be a convex functional such that infQ∈Q αt(Q) = 0. Then, theprocess (ρt)t is defined by: ∀t ∈ [0, T ], ∀X ∈ Lp,

ρt(X) = ess. supQ∈Q (EQ[−X |Ft]− αt(Q)) . (2.15)

This relation is the dynamic version of relation (2.5): any such process (ρt)tis a dynamic convex risk measure. In addition, it satisfies the axioms (T),(Ct), and (P) (i.e., translation invariance, constancy and positivity).

Dynamic coherent risk measure: Let Q be a convex set of P-absolutelycontinuous probability measures defined on (Ω,Ft). Then, the process (ρt)tis defined by: ∀t ∈ [0, T ], ∀X ∈ Lp,

ρt(X) = ess. supQ∈Q (EQ[−X |Ft]) . (2.16)


This is the dynamic version of relation (2.3): any such process (ρt)t is a dy-namic coherent risk measure.

The second approach is based on the following BSDE:

Consider a standard d-dimensional Brownian motion (Bt)t, and denote(Ft)t the augmented filtration generated by B.Denote by L2

F = L2F (T,Rn) the space of all Rn-valued, adapted processes θ

such that:

E

[∫ T

0

‖θt‖2ndt]>∞, (2.17)

where ‖.‖n stands for the Euclidean norm on Rn.

Let g : [0, T ]× R × Rd → R be a function which satisfies usual conditionsto guarantee existence and uniqueness of the solution (Y, Z) of the followingBSDE (see [403] or [129]):

−dYt = g(t, Yt, Zt)dt− ZtdBt, 0 ≤ t ≤ T and YT = X.

The notion of g-expectation introduced in Peng [403] is as follows:

DEFINITION 2.6 For any X ∈ L2 and for all t ∈ [0, T ], the conditionalg-expectation of X under Ft, denoted by Eg[X |Ft] is defined by:

Eg[X |Ft] = Yt, (2.18)

where Y is the first component of the solution of the previous BSDE with valueX at maturity T . For t = 0, Eg[X ] = Y0 is called the g-expectation.

Then, a risk measure can be defined as a family of maps from to L2 to L2F .

Consider the process (ρt)t given by: for all X ∈ L2,

ρt(X) = Eg[−X |Ft]. (2.19)

From the representation of risk measures by conditional g-expectations, suffi-cient conditions of convexity and coherence can be deduced (see [243]):

PROPOSITION 2.6i) If the functional g is convex in (y, z) ∈ (R × Rd), then the process (ρt)tdefined in (2.19) is a convex dynamic risk measure. It is also time-consistentand satisfies axioms (T), (P), and (Ct).ii) If the functional g is sublinear in (y, z) ∈ (R,Rd), then the process (ρt)t de-fined in (2.19) is a coherent dynamic risk measure. It is also time-consistent.

Risk measures 47

Conversely, sufficient conditions can be given to prove that a given riskmeasure is associated to a g-expectation.

For this purpose, recall the definition of Eµ-dominance given in [129]: themeasure ρ0 is Eµ-dominated if for any X1 and X2 in L2,

ρ0(X1 + X2)− ρ0(X1) ≤ Egµ [−Y ], for some gµ = µ|z|, with µ > 0. (2.20)

Assume that d = 1. Then the following result can be deduced:

PROPOSITION 2.7Consider a dynamic convex risk measure (ρt)t on L2 satisfying translationinvariance (T) and monotonicity (M).1) If:i) (ρt)t is time-consistent;ii) ρ0 is strictly monotone, i.e.:

X1 ≥ X2 and ρ0(X1) = ρ0(X2)⇐⇒ X1 = X2; (2.21)

iii) and, ρ0 is Eµ-dominated,

then there exists a unique functional g : Ω× [0, T ]×R→ R, independent of y,satisfying usual conditions, such that |g(t, z)| ≤ µ|z| and associated to ρ by:for all X in L2,

ρ0(X) = Eg(−X) and ρt(X) = Eg[−X |Ft]. (2.22)

Besides, if g is P-a.s. continuous in t for all z ∈ R then g is convex in z.

2) Consider a dynamic coherent risk measure (ρt)t on L2 satisfying properties(i), (ii), and (iii). Then there exists a unique functional g, independent of y,satisfying (2.22). Also, if g is P-a.s. continuous in t for all z ∈ R, then g issublinear in z.

In [44], a general axiomatic of dynamic convex risk measures is relatedto the notions of consistent convex price systems and non-linear expectationsintroduced in El Karoui and Quenez [194] and Peng [403]. Then their dynamicrisk measures (so-called “market risk measures”) are also related to quadraticBSDE.


2.2 Standard risk measures

2.2.1 Value-at-Risk

The need to introduce new specific risk measures, alternative to dispersionmeasures such as standard deviation, has been implied by the regulation offinancial institutions and the development of risk management. The Value-at-Risk (VaR) has been the first attempt to provide a quantitative tool todetermine capital requirements (see for example JP Morgan [393]), and tobetter take account of fat-tailed and non-normal returns; for example, whenvolatilities are random and possible jumps may occur, when financial optionsare involved in the position, when default-risks are not negligible, when cross-dependence between assets are complex, etc.

2.2.1.1 Definition of the VaR

To illustrate the VaR concept, consider the following model:

Example 2.2

Let a = (a1, ..., aN )′ be a portfolio allocation, where ai denotes the allocationinvested on the i-th financial asset. Denote by Si,t the price of the asset i attime t. Then, the portfolio value V (a) is given by:

Vt(at) =N∑i=1

ai,tSi,t = a′.St.

Then, its change between dates t and t + 1 is equal to:

∆Vt+1(at) = (Vt+1 − Vt) (at) =N∑i=1

ai,t (Si,t+1 − Si,t) = a′.∆St+1.

Assume that the future asset prices St = (S1,t, ..., SN,t) has a continuousconditional distribution PSt,t given the information at time t. Then, for a lossprobability level α, with usual values between 1% and 5%, the Value-at-RiskV aRt,α(at) is defined by:

PSt,t [(Vt+1 − Vt) (at) + V aRt,α(at) < 0] = α. (2.23)

Thus, the VaR corresponds to a reserve amount added to the portfolio value,such that the probability of global loss is small (equal to the level α). TheVaR depends on the information available at current date t, on the conditionaldistributions of financial assets, on the portfolio weights, and on the lossprobability level α.

Risk measures 49

It can be also viewed as an upper quantile at level (1− α) of the potentialportfolio loss since:

PSt,t [(Vt − Vt+1) (at) > V aRt,α(at)] = α. (2.24)

Assume that returns are normally distributed. Denote by µt and Σt its con-ditional mean and covariance matrix. Then, from equation (2.24), the expres-sion of VaR is deduced:

V aRt,α(at) = −a′t.µt + (a′t.Σt.at)12 q1−α, (2.25)

where q1−α is the quantile of level (1−α) of the standard normal distribution.Thus the VaR is the sum of conditional expected negative returns and theirstandard deviation, weighted by a positive constant q1−α for a given (small)threshold α. Therefore, as seen in Chapter 3, minimizing VaR for the Gaussiancase is equivalent to searching for a mean-variance efficient portfolio.

For a general position X (corresponding to a P&L), upper and lower Value-at-Risk can be defined at a given level α (see for example Acerbi [2]). Denoteby FX the cumulative distribution function (cdf) of the random variable X(FX(x) = P [X ≤ x]). Let qα(X) and qα(X) be respectively the lower andupper quantiles of X .

DEFINITION 2.7 The lower α−Value-at-Risk (usual VaR) is given by:V aRα(X) = − sup x |FX(x) < α = −qα(X).

The upper α−Value-at-Risk is given by:V aRα(X) = − inf x |FX(x) < α = −qα(X). (2.26)

REMARK 2.2 Obviously, when the cdf of X is continuous and strictlyincreasing, the two quantiles are equal. However, for discrete distributionfunctions, they may be distinct. Consider for example a P&L X with possiblenegative values: −n% for −n in E = −100,−95, ...,−5, 0 with respectiveprobabilities p−n. Then, for αn =

∑−nk=−100 pk,

V aRαn(X) = n% and V aRαn(X) = (n + 1)%.

REMARK 2.3 As mentioned in [5], note that these definitions are am-biguous. For example, a 5% lower VaR is the amount that the position maylose in the best of the 5% worst cases (and may gain in the worst of the 95%best cases). Thus, from the loss point of view, the VaR underestimates the riskat a given level α, since it provides only the smallest loss at this probabilitylevel.


-20 0 20 40 60 80

0.2

0.4

0.6

0.8

-20 0 20 40 60 80

0.2

0.4

0.6

0.8

-20 -10 0 10 20

0.2

0.4

0.6

0.8

FIGURE 2.1: Probability distribution (pdf) and cumulative distribution(cdf) with V aR5% = 13.5%. The 5% worst cases (resp. 95%) are shadowedin dark (resp. light) grey. In the cdf plot, VaR is the opposite of the abscissaof the intersection point of the cdf with the horizontal line α = 5%.

Figure (2.1) illustrates the determination of the VaR for a probability dis-tribution having a density (pdf). For a given threshold α, the V aRα is theopposite of the quantile qα of the distribution: the highest (“best”) value suchthat the probability to be under this value is smaller than α.

Figure (2.2) shows the pdf of returns and the pdf of returns conditional onexceeding the VaR. Two statistical models are considered for modelling theposition: the Gaussian distribution, which has thin tails, and the Pareto-Levydistribution, whose “fat” tails decrease by a power. The two distributions haveparameters such that they have the same VaR.

The Gaussian distribution is assumed to be centered and its standard de-viation is chosen equal to 1%. In this case, the VaR corresponding to a 99%probability of overshoot is equal to 2.33% of the value of the position.

The Levy-stable distribution is centered on zero and is symmetrical. Itscharacteristic exponent is equal to 1.5 (which determines the decrease in thedistribution tail). The value of the scale parameter is equal to 1. It indicatesthe dispersion of the distribution. Thus the VaR of the stable Paretian dis-tribution for a 99% probability of overshoot is also equal to 2.33%. Note thatthe values of the losses (lower than VaR) are near the VaR for the Gaussiandistribution contrary to the stable Paretian case.

Risk measures 51

0% 5%-5%-10% 10%0.00

0.50

1.00

1.50

2.00

2.50Probability p = 99%

VaR=2.33%

Density distributionsof the returns

Density distributionsof the returns

exceeding the VaR

FIGURE 2.2: Level of returns for Gaussian and Stable Paretiandistributions with same VaR

2.2.1.2 Convexity of the VaR

While the VaR satisfies translation invariance (T), positive homogeneity(PH), and monotonocity (M), the VaR fails to be convex for particular prob-ability distributions. More precisely, it may not satisfy the subadditivityproperty, as shown in the following example provided by Danielsson et al.[144]:

Example 2.3Consider two financial assets A and B having Gaussian probability distribu-tions but with independent “shocks”:

A = ε + η, ε ∼ N (′,∞) and η =

0 with probability 0, 991,−10 with probability 0, 009.

Then VaR(A) at the level 1% is equal to 3.1. Suppose that B has thesame probability distribution as A but is independent from A. In that case,VaR(A + B) at the level 1% is equal to 9.8, since for the asset A + B theprobability of getting −10 draw from A or B is higher than 1%.Then we have:

V aR(A + B) = 9.8 > V aR(A) + V aR(B) = 3.1 + 3.1 = 6.2.

Nevertheless, it may be subaddtive (and thus convex) for other probabilitydistributions (see [30] and [144]):


Example 2.4If the joint distribution of (X,Y ) is Gaussian, then the VaR is subadditivefor usual values of the level α. Indeed, as seen previously previous in Example(2.2), for a normal random variable Z, the VaR satisfies:

V aRα(Z) = − (E [Z] + qασ(Z)) , (2.27)

where qα is the quantile of the standard Gaussian distribution and is negativesince α is smaller than 0.5. Therefore, V aRα(Z) is an increasing function ofthe standard deviation σ(Z). Since for any (X,Y ), σ(X +Y ) ≤ σ(X) +σ(Y ),the subaddivity property is deduced.

2.2.1.3 Sensitivity of the VaR

The sensitivity of VaR previously in Example (2.2) has been examined inGourieroux et al. [261], under the following assumptions: the increment of as-set prices Yt+1 = (S1,t+1−S1,t; ...;SN,t+1−SN,t) has a continuous conditionaldistribution with positive density and finite second-order moments.

PROPOSITION 2.8Consider the portfolio with value V (a) given by Vt(at) = a′.St. Under previ-ous assumptions, V aRt,α(at) has a first-order derivative with respect to theallocation a given by:

∂V aRt,α(at)∂α

= −Et [Yt+1 |a′Yt+1 = −V aRt,α(at) ] . (2.28)

Denote by Vt the conditional variance and by ga,t the conditional pdf ofa′Yt+1. The second-order derivative of VaR is equal to:

∂2V aRt,α(at)∂α∂a′

=∂Log [ga,t]

∂x(−V aRt,α(at)) Vt [Yt+1 |a′Yt+1 = −V aRt,α(at) ]

−

∂

∂xVt [Yt+1 |a′Yt+1 = −x ]

x=V aRt,α(at)

. (2.29)

PROOF (see [261] for more details)1) The VaR is defined from relation:

PSt,t [Ai,t+1 + ai,tBi,t+1 > V aRt,α(at)] = α,

where Ai,t+1 = −∑j =i aj,tYj,t+1 and Bi,t+1 = −Yi,t+1.

Besides, the following property holds: if (A,B) is a bivariate continuous vec-tor, then the quantile q(b, α) defined by P [A + bB > q(b, α)] = α, has a first-order derivative with respect to b given by:

∂

∂bq(b, α) = E [B |A + bB = q(b, α) ] .

Risk measures 53

Using this result, the first-order derivative of the VaR is deduced.2) The second-order derivative is determined from a first-order expansion

of the first-order derivative around a given allocation a0 (see [261], AppendixB).

REMARK 2.4 For the Gaussian case, the second-order derivative of theVaR is equal to

∂2V aRt,α(at)∂α∂a′

=q1−α

(a′t.Σt.at)12Vt [Yt+1 |a′Yt+1 = −V aRt,α(at) ] ,

where q1−α is the quantile of level (1 − α) of the standard normal distribu-tion. Thus, as soon as α < 0.5, the second-order derivative is positive whichimplies the convexity of the VaR. This result can be extended, for example,to a Gaussian model with unobserved heterogeneity for which the probabilitydistribution is a mixture of Gaussian distributions (see [261]).

2.2.1.4 VaR estimation

VaR estimation is obviously based on quantile estimation and tail analysis.First, parametric methods have been developed, generally using a Gaussianassumption on the joint distribution of asset returns (see e.g., JP MorganRiskmetrics). Second, to better take account of fat tails, non-parametricapproaches have been introduced to determine the historical VaR which is anempirical quantile (see e.g., [216], [283], [308]...). Semiparametric methods,based in particular on extreme value approximation for the tails have beenproposed, by Bassi et al. [49] and McNeil ([380], [381]).

When observations are independent and stationary (iid), non-parametricmethods based on kernel estimators can be introduced as in [261]. Considerthe sequence of returns (Zt)t defined by: for any asset i,

Zi,t = (Si,t+1 − Si,t) /Si,t. (2.30)

Let θ be the vector of allocations measured in values instead of shares. Then,the VaR is defined by:

PSt,t [−θt.Zt+1) > V aRt,α(at)] = α. (2.31)

From iid assumption, this is equivalent to:

P [−θt.Zt+1) > V aRt,α(at)] = α. (2.32)

Therefore, it can be consistently estimated from T observations by using aGaussian kernel. The estimated VaR, denoted by V aR, is solution of:

α =1T

T∑t=1

N

(−θ′.Zt − V aR

h

), (2.33)


where N is the cdf of the standard normal distribution, and h is the selectedbandwidth.

Equation (2.33) is solved numerically by a Gauss-Newton recursive algo-rithm. Let var(p) be the value of the approximation at step p, then:

var(p+1) = var(p) +

1T

T∑t=1

N(

−θ′.Zt−var(p)

h

)1Th

T∑t=1

f(

θ′.Zt+var(p)

h

) ,

where f is the pdf of the standard normal distribution.The starting value var(0) can be chosen equal to the VaR calculated under

a Gaussian assumption on the historical VaR. As mentioned in [261], thismethod is quite appealing and can be applied for large portfolios.

2.2.2 CVaR

As it can be easily seen, VaR is a risk measure that only takes account of theprobability of losses, and not of their sizes. Moreover, VaR is usually basedon the assumption of normal asset returns and has to be carefully evaluatedwhen there are extreme price fluctuations. Furthermore, VaR may be notconvex for some probability distributions. Due to these deficiencies, otherrisk measures have been proposed. Among them, the Expected Shortfall (ES)as defined in Acerbi et al. [3], also called Conditional Value-at-Risk (CVaR)in [427] or TailVaR in [31]. Note that in Acerbi and Tasche [5], several riskmeasures related to ES are considered and the coherence of ES is proved.

2.2.2.1 Definition of the ES

The ES can be expressed as follows (see [2]):

DEFINITION 2.8 Let←−FX be the generalized inverse of the cdf FX of X

defined by: ←−FX(p) = sup x |FX(x) < p .

Then, the ES is defined as the average in probability of all possible outcomesof X in the probability range 0 ≤ p ≤ α:

ESα(X) = − 1α

α∫0

←−FX(p)dp. (2.34)

Then, for continuous cdf, the ES is given by:

ESα(X) = − (E [X |X ≤ qα(X) ]) , (2.35)

where qα(X) is the quantile of X at the level α.

Risk measures 55

REMARK 2.5 Figure (2.2) illustrates how two probability distributionscan have the same VaR, whereas one is thin-tailed (the Gaussian case) andthe other is fat-tailed (the Pareto-Levy case). In this example, the VaR cor-responding to a 99% probability of overshoot is equal to 2.33% of the valueof the position. Nevertheless, the comparison of the expected shortfalls showsthe different risk levels.

As mentioned in Longin [362], for the standard Gaussian distribution, theES is approximatively equal to V aR+ 1

21

V aR (here equal to 2.66%). Thus, thehigher the VaR, the nearer the possible losses beyond the VaR. For a stableParetian distribution, with location and skewness parameters equal to 0, withcharacteristic exponent c > 1, the ES is approximatively equal to V aR+ V aR

c−1(here equal to 3.22%). Therefore, the higher the VaR, the wider the spreadof the losses beyond the VaR.

REMARK 2.6 The expected shortfall ES is also equal to:

ESα(X) = − 1α

[E[X1X≤qα(X)

]− qα(X) (P [X ≤ qα(X)]− α)]. (2.36)

Thus, when FX(x) has a jump at α, the term qα(X) (P [X ≤ qα(X)]− α)must be substracted to the expected value E

[X1X≤qα(X)

].

REMARK 2.7 When there are discontinuities in the loss distribution,the α−tail conditional expectation TCEα is defined by:

TCEα(X) = − (E [X |X ≤ qα(X) ]) , (2.37)

is not always equal to ESα(X). In that case, it may be not coherent (notsubadditive) as shown in [2].

The evaluation of marginal impacts of positions on risk measures and reg-ulatory capital is a key point for risk management analysis (see e.g., exampleJorion [308]).

When sensitivities are known, the global portfolio risk can be decomposedcomponent by component. Then, the largest risk contributions can be iden-tified. Additionally, it is no longer necessary to recompute the risk measureeach time the portfolio composition is slightly modified.

The sensitivity analysis of expected shortfall and its (non-parametric) es-timation has been examined in Scaillet [450]. Estimators for both expectedshortfall and its first-order derivative are provided and their consistenciesproved.


2.2.2.2 Sensitivity of the ES

PROPOSITION 2.9Consider the portfolio with value V (a) given by Vt(at) = a′.St. The expectedshortfall ESt,α(at) has a first-order derivative with respect to the allocation agiven by:

∂ESt,α(at)∂α

= −Et [Yt+1 |a′Yt+1 > −V aRt,α(at) ] . (2.38)

For the Gaussian case, the expected shortfall is given by:

ESt,α(at) = −a′t.µt + (a′t.Σt.at)12f(q1−α)

α, (2.39)

where f is the pdf of the standard Gaussian Law. Thus, its first-order deriva-tive is equal to:

∂ESt,α(at)∂α

= −µt +Σt.at

(a′t.Σt.at)12

f(q1−α)α

. (2.40)

PROOF (See [450]). The proof is based on the following property: consid-er the loss function L1 = X−εY where ε is a positive real number and (X,Y )is a random vector admitting a continuous conditional density with respect tothe Lebesgue measure. Then the first derivative of expected shortfall is equalto:

∂V aR(L1)/∂ε = −E[Z|L1 > V aR(L1)]. (2.41)

Note that for the Gaussian case, the first-order derivative is an affine func-tion of the expected shortfall itself since we have:

∂ESt,α(at)∂α

= −µt +Σt.at

(a′t.Σt.at)(ESt,α(at) + a′.µt) . (2.42)

2.2.2.3 Estimation of the ES

This analysis has been mainly based on estimation of the mean excess func-tion in extreme value theory (see Embrechts et al. [201] or McNeil [380]).Thus, data have been assumed to be iid and usually univariate. In Scaillet[450], the estimation method is based on a kernel approach. This allows us totake account of more general dependencies, such as strong mixing (as definedfor example in [169]).

Assume that the data have been generated by a sequence of random vari-ables (Yt)t which models the “risks” along the time period. The process Y issupposed to be strictly stationary. It may be, for example, a VARMA or amultivariate GARCH process.

Risk measures 57

Consider the following Kernel estimator:

[(Yt, a′tYt); ξ] = (Th)−1

T∑t=1

YtK(ξ − at′Yt/h). (2.43)

Introduce also:

I[ξ] =

ξ∫−∞

[(Yt, a′tYt);u]du. (2.44)

The bandwidth h is assumed to be a function of the number of observationsT which converges to 0 as T goes to infinity. The kernel estimator q(a, α) ofthe quantile −V aRα(a) is determined from the relation:

q(a,α)∫−∞

[(1, a′tYt);u]du = α. (2.45)

Then, the ratio I[q(a, α)]/α is an estimator of the conditional expectationE[Yt |Yt < −V aRα(a) ]. Therefore an estimator of the ES is given by:

m = −a′.I[q(a, α)]/α. (2.46)

Similarly, an estimator of the first-order derivative of the ES is given by:

m1 = I[q(a, α)]/α. (2.47)

Note that m is simply equal to m1 multiplied by the allocation a. Under mildassumptions, asymptotic normality of the estimators are proved in [450].

2.2.2.4 Numerical computation of the ES

Often, the dependence between asset prices and their particular payoffsimplies that no tractable analytical expression of the portfolio distribution isavailable. In such case, Monte Carlo simulations can be used. Consider Nindependent simulations Xi of the same position value X . They allow thesimulation FN

X of the cdf X defined by:

FNX (x) =

1N

N∑i=1

1Xi≤x. (2.48)

Let us introduce the ordered statistics (Xi,N )i, i.e., the values Xi,N areequal to the values Xi sorted in increasing order:

X1,N ≤ ... ≤ XN,N . (2.49)


Denote by ESNα (X) the α−expected shortfall of the simulated distribution.

For any real number x, denote by [x] the integer part of x. Then:

ESNα (X) = − 1

Nα

(Nα∑i=1

Xi,N

). (2.50)

REMARK 2.8 Note that we have also:

ESNα (X) = − 1

Nα

[Nα]∑i=1

Xi,N + (Nα− [Nα]) X[Nα]+1,N

. (2.51)

For example, consider a sample of about 250 daily outcomes of a financialreturn, observed over one year. Then, with α = 5%, Nα = 12.5 and theα−expected shortfall of the estimated distribution is equal to

ES2505% (X) = − 1

12.5

(12∑i=1

Xi,250 + (0.5) X13.250

).

Thus, the last term (0.5) X13,250 is not negligible.

Using results about convergence of ordered statistics, the following result isproved:

PROPOSITION 2.10The α−expected shortfall of the simulated distribution ESN

α (X) converges al-most surely to the α−expected shortfall ESα(X) of the true distribution. Notethat when the lower and upper α−Value-at-Risk are equal, the convergenceof the α−VaR holds: V aRN

α (X) = −X[Nα]+1,N converges to V aRα(X) =V aRα(X).However, if V aRα(X) = V aRα(X), then V aRN

α (X) does not converge asN converges to infinity, but flips between V aRα(X) and V aRα(X).

In practice, it is necessary to also use explicit estimators of the first or-der derivatives of the risk measure. These derivatives are also of particularrelevance in the portfolio selection problem. They help to characterize andevaluate efficient portfolio allocations when VaR and ES are substituted forvariance as measures of risk. In fact, numerical constrained optimization al-gorithms for computations of optimal allocations usually require consistentestimates of first order derivatives in order to converge properly. Therefore,it is necessary to search for risk measures which are not only “rational,” butalso easily estimated and computed. This is one of the reasons to considerthe spectral risk measures.

Risk measures 59

2.2.3 Spectral measures of risk

The expected shortfall is a “natural” coherent extension of the Value-at-Risk. Both are based on quantiles of positions X . As seen in relation (2.34),the ES is an average of the outcomes of X with respect to a particular prob-ability which is uniform on the interval [0, α], i.e., it assigns equal weightsdp/α to the worst 100α% and zero to the others.

2.2.3.1 Definition of spectral risk measures

More general averages of type can be introduced as in Acerbi [1]:

Mφ(X) = −1∫

0

φ(p)←−FX(p)dp, (2.52)

where φ is a weighting function defined on the set of quantiles. In [1], thefunction φ is called risk spectrum or risk aversion function, associated to themeasure Mφ. For the ES, the function φ is given by:

φESα(p) =1α

1p≤α =

1α if p ≤ α,

0 else. (2.53)

As any convex combination of coherent measures of risk is clearly coherent,we can search for the convex hull of a given set of such measures, among themthe family of ESα indexed by the level α ∈ [0, 1]. In Acerbi [1], any measurein this convex hull is called spectral risk measure and the following result isproved:

PROPOSITION 2.11Any spectral risk measure is of the type:

Mφ,c(X) = cES0(X)− (1− c)

1∫0

φ(p)←−FX(p)dp, (2.54)

with c ∈ [0, 1] and φ : [0, 1]→ R satisfying the following conditions:1) Positivity: φ(p) ≥ 0 for all p ∈ [0, 1].

2) Normalization:1∫0

φ(p)dp = 1.

3) Monotonicity: φ(p1) ≥ φ(p2) for all 0 ≤ p1 ≤ p2 ≤ 1.The measure Mφ,c is coherent if and only if these assumptions are satisfied.

Exponential risk-aversion functions φ can be used:

φγ(p) =e−p/γ

γ(1− e−p/γ), γ ∈ (0,+∞).


The smaller the parameter γ, the steeper the risk-aversion function φ, whichallows us to better take account of the left tail of the distribution.

2.2.3.2 Characterization of spectral risk measures

The set of spectral risk measures does not contain all coherent risk mea-sures. To characterize spectral risk measures, comonotonic additivity, andlaw-invariance have to be introduced as in [342] and [489].

DEFINITION 2.9 1) (Comonotonicity) Two random variables X and Yare said to be comonotonic if they are non-decreasing functions of the samerandom variable Z :

X = f(Z) and Y = g(Z), with f and g : R→ R non-decreasing.

2) (Comonotonic additivity) A measure of risk ρ is said to be comonotonicadditive if, for any comonotonic random variables X and Y :

ρ(X + Y ) = ρ(X) + ρ(Y ). (2.55)

REMARK 2.9 The comonotonicity is a very strong dependence propertybetween two random variables. In particular, if financial asset returns arecomonotonic, they cannot hedge each other and diversification is inefficient.Thus, in that case, property (2.55) is rather intuitive.

The law-invariance property is also “natural”:

DEFINITION 2.10 A measure of risk ρ is said to be law-invariant if,for any random variables X and Y having the same probability distributions:

ρ(X) = ρ(Y ). (2.56)

This means that the measure of risk ρ is a functional defined on the set of cdf:

ρ(X) = ρ(FX [.]).

REMARK 2.10 As mentioned in [2], a law-invariant measure of risk givesthe same value to two empirically indistinguishable random variables. Thus,a measure ρ which is not law-invariant cannot be estimated from empiricaldata.

PROPOSITION 2.12(see [2]) The class of spectral risk measures is the set of all coherent riskmeasures which are comonotonic additive and law-invariant.

Risk measures 61

This result is in favor of the spectral risk measures, since both conditionsare quite rational. Basic examples of coherent measures of risk which do notsatisfy one of these conditions are:

• The risk measure, introduced by Fisher [228], defined by:

ρp,a = −E[X ] + aσ−p (X), (2.57)

where 0 ≤ a ≤ 1, and where the one-side pth moment is given by:

σ−p (X) = p

√E[Max[0, (E[X ]−X ])]p]. (2.58)

For a = 0, this measure is not comonotonic additive. Note that, fora = 1 and p = 2, the minimization of this measure is equivalent to themean-semivariance analysis.

• The worst conditional expectation (WCE) defined in Artzner et al. [31]:

WCEα(X) = − inf E[X |A] : A ∈ A,P[A] > α ] , (2.59)

where A is a given σ-algebra. This measure takes the infimum of condi-tional risk measures only on events A with probability larger than thelevel α. Thus, generally it is not a law-invariant risk measure, exceptwhen the probability space is non-atomic (in that case, it is equal to theexpected shortfall ESα).

As seen in Chapter 1, the notion of stochastic dominance (in particular ofthe first-order) allows the comparison of two random financial positions byordering their probability distributions. Thus, it is interesting to search forrisk measures which are compatible with stochastic dominance.

DEFINITION 2.11 A map ρ : X → ρ(X) ∈ R is said to be monotonicwith respect to first-order stochastic dominance if it satisfies:

ρ(X) ≥ ρ(Y ) if Y 1 X.

Note that if Y 1 X , then X always has a left tail higher than Y : V aRα(X) ≥V aRα(Y ) at any level α. Therefore, the previous monotonicity property israther desirable.

Acerbi and Tasche prove that spectral risk measures are characterized bysuch property:

PROPOSITION 2.13The class of spectral risk measures is the set of all coherent risk measures whichare comonotonic additive and monotone with respect to first-order stochasticdominance.


2.2.3.3 Estimation of spectral risk measures

The ordered statistics (Xi,N )i allow definition of a consistent estimatorMN

φ,c(X) of the spectral risk measure Mφ,c(X), as follows: define by ESNα (X)

the α−expected shortfall of the simulated distribution by:

MNφ,c(X) = −

N∑i=1

Xi,N φi with φi =∫ i/N

(i−1)/N

φ(p)dp. (2.60)

Then the sequence(MN

φ,c(X))N

converges almost surely to the spectral

risk measure Mφ,c(X).

2.3 Further reading

Szego ([484] and [485]) provides a general overview of risk measures andtheir applications to regulation, capital allocation, and portfolio optimization.Time-consistency for preferences has been introduced in Koopmans [332]. Forfurther contributions, see Epstein and Schneider [211], Epstein and Zin [212],and Wang [502]. Examples and characterizations of time-consistent coher-ent risk measures are also given in Weber [503] and Cheridito et al. ([117],[118]), where coherent and convex monetary risk measures are defined on thespace of all cadlag processes that are adapted to a given filtration, bounded ornon-bounded. In Barrieu and El Karoui [43], inf-convolution of risk measuresare introduced and applied to examine optimal risk transfer. When infor-mation is partial or asymmetrical with no a priori given probability, robustrepresentation of convex conditional risk measures can also be deduced as inBion-Nadal [71] (see also Detlefsen and Scandolo [165] in a dynamic frame-work). Law-invariant risk measures are examined in particular in Kusuoka[342]. Vector-valued coherent risk measures are introduced in Jouini et al.[311] to take account of difficulty in aggregating portfolio positions due toliquidity problems or transaction costs. Another approach to generate riskmeasures is proposed in Gooverts et al. [259]. This is based on insurancepremium principles and uses the Markov inequality for tail probabilities. Itdoes not always lead to coherent risk measures. This point is further analyzedby Denuit et al. [159], who generate a large class of risk measures from theactuarial equivalent utility pricing principle.

As examined in Wilson [504], three main methods can be used to determinethe VaR:

• The Variance/Covariance method which assumes normal probabilitydistribution of returns (thus with deterministic volatilities). This as-sumption makes easy its implementation. It is easy to interpret, to

Risk measures 63

compute, and to implement. However, it does not capture the fat tailsof returns. Besides, the problem of non-linearity of derivatives payoffshas to be (partially) solved by a Delta/Gamma approximation.

• The Historic Simulation based on past data which are generated by“true” distributions. Fat tails and non-linearity can be taken into ac-count. Its interpretation is straightforward, but the use of large numbersof data can be time-consuming. Moreover, the implicit assumption thatpast data are “good” predictors of future returns is not always valid.

• The Monte Carlo Simulation which has the purpose to generate valuesof the position from specific assumptions on returns and investmentstrategies. This method can be applied for quite general distributionsand for different time periods. It can take non-linearity into account.However, its interpretation and computation are more involved.

Duffie and Pan [176] also provide a broad overview of VaR, in particular inthe presence of market risk: asset returns have Markovian stochastic volatili-ties (GaRCH type, with possible regime-switching, etc.), and/or are modelledby jump diffusions. Factors models are also considered when the portfolio ofpositions has a market value which is sensitive to risk factors variations, suchas major equity indices and treasury rates. A Delta/Gamma approach is alsoproposed to simplify the calculation methods, in particular when using MonteCarlo simulations for large portfolios.

The first derivative of the VaR and the ES with respect to portfolio allo-cation can be also derived when netting between positions exists, as in creditrisk management (see Fermanian and Scaillet [221]).

Conditional parametric estimations of VaR and CVaR based on GARCHmodelling of financial time series are examined in Alexander and Tasche [16],in McNeil et al. ([380], [381]), and Engle and Manganelli [207] propose a quan-tile estimation that does not assume normality or iid returns. They introducea conditional autoregressive value at risk so-called CAViaR model where theevolution of the quantile over time is modeled by an autoregressive process.

For other detailed analyses of Value-at-Risk and its application to risk man-agement, see Jorion [309], Dowd [170], and Stulz [482].

Part II

Standard portfoliooptimization

“The rule that the investor does (or should) maximize discounted expected,or anticipated, returns...is rejected both as a hypothesis to explain, and asa maximum to guide investment behavior. We next consider the rule thatthe investor does (or should) consider expected return a desirable thing andvariance of return an undesirable thing. This rule has many sound points, bothas a maxim for, and hypothesis about, investment behavior...Diversificationis both observable and sensible, a rule of behavior which does not imply thesuperiority of diversification must be rejected both as a hypothesis and as amaxim”.

Harry Markowitz, “Portfolio Selection”, Journal of Finance, (1952).

65


Most of the time, an investor (private or fund manager) has to choose acombination of securities in an uncertain framework. Two main objectivescan be pursued:

• Passive portfolio management. The purpose is to replicate a given fi-nancial index as best as possible. The implicit assumption is that finan-cial markets are efficient; no financial strategy can regularly outperformtheir performances. Clearly, for a given portfolio horizon, a private in-vestor will choose a passive management style if the market is presumedto be bullish, with relatively high probability.

• Active portfolio management. If the financial market is not efficient,better asset allocations may exist. In this case, financial assets must beselected first, second their weighting must be optimized. A fund manag-er may also choose between two strategical approaches: the bottom-upprocess where individual securities are selected from their individual per-formances or the top-down analysis where, for example, macroeconomicand financial forecasts determine the global asset allocation among in-ternational securities.

For the first step, asset selection, the investor needs financial data andappropriate estimation methods. The selection of common stock is usuallyalso based on earning forecasts and valuation process, such as the discountedcash flow models which allow the evaluation of, for instance, price earningratios.

For the second step, asset weighting, the investor can select a decision rule tooptimally choose the percentage invested on each asset. This choice criterioncan be rationalized by using such axiomatics as detailed in Part I of this book.

However, specific constraints may be imposed on the portfolio allocationwhich induces more involved computational problems.

This part provides an overview of static portfolio optimization and standardperformance analysis:

- First, the optimal weighting for active (but static) portfolio management,based on the seminal Markowitz model is discussed. Extensions of this theoryare also presented.

- Second, indexed funds (passive management) and benchmark optimization(mixture of active and passive asset allocation) are introduced. Most of thefunds are based on this latter method.

- Finally, standard performance measures to analyze and rank mutual fundsare enumerated.

Chapter 3

Static optimization

This chapter is mainly devoted to the mean-variance analysis introduced byMarkowitz [373]:

- The fundamental lesson of the Markowitz analysis is to show that investorsmust take care not only of the realized returns but also of the “risk” of theirposition, represented by the standard deviation of their portfolio return.

- Markowitz proposes to measure the risk of return R by its standard devi-ation σ(R), and to determine the minimal σ(R) for any fixed expected returnE[R]. The dual criterion can also be considered: maximize the expected returnfor any fixed standard deviation.

The first part of the chapter details the determination and the analysis ofthe efficient frontier:

1. Diversification property which allows for reduction of portfolio risk.

2. Optimal weights computation with the main properties of two-fund sep-aration and efficient frontier.

3. Additional constraints on strategies, taking into account specific boundson portfolio weights.

4. Parameter estimation problems.

The second part examines expected utility maximization and/or risk mea-sures minimization, such as safety criteria or CVaR minimization:

1. In particular, conditions ensuring that these additional criteria allowchoosing one and only one efficient portfolio are examined.

2. For expected utility maximization, results about the important propertyof two-fund separation are detailed.

3. For VaR/CVaR minimization, convexity properties are detailed in orderto use algorithms based on convex programming.

67


3.1 Mean-variance analysis

Consider an investor with an initial amount V0 to invest on given financialassets. The time horizon is fixed, the strategy is assumed to be static (buy-and-hold). Expectations of returns and their correlation matrix are assumedto be known.

3.1.1 Diversification effect

Example 3.1To illustrate the impact of diversification, consider the following case: let S1

and S2 be two financial assets. Let R1 and R2 respectively be their returns.Denote by E [R1] and E [R2] the expectations of their returns, and by σ2

1 andσ2

2 their variances. Finally, their covariance is denoted by:

σ12 = Cov (R1, R2) = ρ12σ1σ2,

where ρ12 (−1 ≤ ρ12 ≤ +1) is the correlation coefficient between the two assetsS1 and S2. Consider a portfolio P with a percentage of wealth x invested onasset S1 and (1− x) invested on S2. Its return RP is such that:

E [RP ] = xE [R1] + (1− x) E [R2] ,

σ2P = x2σ2

1 + (1− x)2 σ22 + 2x (1− x)σ1σ2ρ12.

The following figure indicates the set of all such combinations of the twoassets, according to the value of the correlation coefficient ρ12. The first axiscorresponds to the values of σP , and the second to the values of E [RP ] .

Parameter values: E [R1] = 10%, E [R2] = 20%, σ1 = 0, 12, σ2 = 0, 17.

6

-0.00 0.12

0.15

0.10

0.20

0.17Standard deviation

Mean return

S1

S2

FIGURE 3.1: Diversification effect

Static optimization 69

We note that:

1) If ρ12 = 1, both expected return and standard deviation of the portfolioP are linear combinations of those of the two assets. Therefore, the set of allpossible portfolios is the segment which lies between the two assets S1 andS2. Additionally, if 0 ≤ x ≤ 1, no portfolio has a lower standard deviationthan the minimal value, min(σ1, σ2).

2) If −1 < ρ12 < +1, we search for the minimal variance portfolio by solvingthe following equation:

∂σ2 (RP )∂x

= xσ21 − (1− x) σ2

2 + (1− 2x)σ1σ2ρ12 = 0. (3.1)

The optimal percentage is given by:

x∗ =σ22 − σ1σ2ρ12

σ21 + σ2

2 − 2σ1σ2ρ12. (3.2)

Its variance satisfies:

σ2 (RP∗) =

(1− ρ212

)σ22σ

21

σ21 + σ2

2 − 2σ1σ2ρ12. (3.3)

Assume, for example, that the security S1 is less risky than S2 (σ1 < σ2).

Then:

σ2 (RP∗)− σ21 = − σ2

1 (σ1 − ρ12σ2)2

σ21 + σ2

2 − 2σ1σ2ρ12. (3.4)

If ρ12 = σ1σ2

, from(3.4) we see that the difference σ2 (RP∗) − σ21 is always

negative, whatever the value of the correlation coefficient. Thus, the minimalstandard deviation is smaller than σ1.

If ρ12 < σ1σ2

, then σ22 − σ1σ2ρ12 > 0 and x∗ > 0: the percentage invested on

the less risky asset is non-negative.If ρ12 > σ1

σ2, then σ2

2 − σ1σ2ρ12 < 0 and x∗ < 0: this represents the shortposition on the less risky asset.

If ρ12 = σ1σ2

, the minimal standard deviation is equal to σ1. In that case,there is no diversification effect.

If ρ12 = −1, there exists a portfolio with no risk: x = σ2/(σ1 + σ2).

The diversification allows the reduction of the risk as soon as the correlationcoefficient between the two assets is strictly smaller than 1, except whenρ12 = σ1

σ2.


Example 3.2Consider a more general case with n risky securities (see Merton [387]).

Notations: for i = 1, ..., n,w = (w1, ..., wn) is the vector of portfolio weights.R = (R1, ..., Rn) is the vector of asset returns.R = (R1, ..., Rn) is the vector of asset returns expectations.e = (1, ..., 1) is the vector with all components equal to 1.V = [σij ]1≤i,j≤n is the (n× n) variance-covariance matrix of returns. The

matrix V is supposed to be invertible.

Denote by A′ the vector deduced from transposition of the vector A. Foreach given expected return, we have to determine the minimal variance port-folio.

The expected return of any portfolio P with weights w is given by:

E [RP ] =n∑

i=1

wiE [Ri] = w.R′. (3.5)

The variance of the return of P is equal to:

σ2 (RP ) = w′Vw =n∑

i=1

n∑j=1

wiwjσij = 2n−1∑i=1

n∑j=i+1

wiwjσij +n∑

i=1

w2i σ

2i . (3.6)

The previous relation shows the decomposition of the variance of the portfolioreturn into two components. This relation proves that the marginal contri-bution of a given asset to the risk of the whole portfolio is not reduced to itsown risk (its variance), but also takes account of its potential correlations toother securities. This latter property induces the diversification effect.

From relation (3.6), the partial derivative with respect to any weight wi isdeduced:

∂σ2 (RP )∂wi

= 2n∑

j=1

wjσij .

Denote by σiP the correlation coefficient between asset i and portfolio P.Then:

n∑j=1

wjσij =n∑

j=1

wjCov(Ri, Rj) = Cov(Ri,

n∑j=1

wj .Rj) = Cov(Ri, RP ) = σiP ,

and finally:∂σ2 (RP )

∂wi= 2σiP .


Thus, the marginal contribution of asset i to the portfolio risk is twice itscorrelation with the portfolio.

Suppose now that the portfolio weights are equal to 1n . Then, its variance

is given by:

σ2 (RP ) =n∑

i=1

(1n

)2

σ2i + 2

n−1∑i=1

n∑j=i+1

(1n

)2

σij . (3.7)

It can be written as follows:

σ2 (RP ) =(

1n

) n∑i=1

[(1n

)σ2i

]+

2n2

n−1∑i=1

n∑j=i+1

σij . (3.8)

The first term converges to 0 when n goes to infinity (the variances σ2i are

assumed to be uniformly upper bounded). Therefore, the individual contri-bution of each asset to the whole portfolio is negligible for large portfolios.If returns are independent, then the portfolio risk converges to 0; and thediversification effect is to eliminate the risk.

The second expression involves n(n−1)2 covariances. Thus this term does not

converge to 0 if the covariances σij are also assumed to have absolute valuesin a given interval [cmin, cmax] with cmin > 0. It converges to the asymptoticmean of covariances.

3.1.2 Optimal weights

3.1.2.1 Case 1: no riskless asset

Following the Markowitz approach (see also [387]), we have to determinethe set of portfolios which minimize the variance for given expected returnsE [RP ]. This leads to the following quadratic optimization problem:

minw

w′Vw,

with w′R = E [RP ] ,w′e =1.

(3.9)

The first constraint corresponds to the fixed expectation level. The secondconstraint is simply that w is a vector of weights. However, shortselling isallowed and no other specific constraints are introduced.

To solve Problem (3.9), consider the following Lagrangian functional:

L (w,λ,δ) = w′Vw+λ(E [RP ]−w′R

)+ δ (1−w′e) , (3.10)

where λ and δ are the usual Lagrangian multipliers which are constant pa-rameters. Then, Problem (3.9) is equivalent to:

minw,λ,δ

L (w,λ,δ) = w′Vw+λ(E [RP ]−w′R

)+ δ (1−w′e) . (3.11)


The first-order conditions are:

∂L (w,λ,δ)∂w

= 2Vw−λR−δe = 0, (3.12)

∂L (w,λ,δ)∂λ

= E [RP ]−w′R = 0, (3.13)

∂L (w,λ,δ)∂δ

= 1−w′e = 0. (3.14)

Moreover, by assumption, the variance-covariance matrix V is invertible.Thus, the previous first-order conditions are necessary and sufficient to deter-mine the unique solution of this linear system.

Define the four following real numbers A,B,C, and D by:

A = e′V−1R,B = R′V−1R,C = e′V−1e and D = BC −A2.

Then, the optimal portfolio weights at the level E [RP ] are given by:

w =1D

(BV−1e−AV−1R

)+ E [RP ]

1D

(CV−1R−AV−1e

). (3.15)

Introduce w1 and w2:

w1 =1d

(BV−1e−AV−1R

),

w2 =1d

(CV−1R−AV−1e

).

Both w1 and w2 do not depend on the given level E [RP ]. They are only de-termined from financial market parameters: the vector of return expectationsR and the variance-covariance matrix V . Using w1 and w2, we get:

w = w1 + E [RP ] .w2. (3.16)

Therefore we deduce:

PROPOSITION 3.1For any given return expectation level E [RP ], the optimal portfolio exists andis unique. Moreover, it can be decomposed as a combination of two basicportfolios, since we have:

w = (1− E [RP ]) .w1 + E [RP ] . (w1 +w2) . (3.17)

Any two distinct optimal portfolios generate the set of optimal portfolios.


PROOF Examine the latter assertion. Consider two optimal portfolios gand h with respective weights wf and wg. Let q be any optimal portfolio. Wehave to prove that portfolio q is a combination of portfolios g and h. Moreprecisely, we search for a real number α such that wq = αwg + (1− α)wh.- Since E [Rg] = E [Rh], there exists a (unique) solution α of the equation:

E [Rq] = αE [Rg] + (1− α) E [Rh] .

- The portfolio p, with weights α, (1− α) invested on g and h, satisfies:

wp = α (w1 +w2E [Rg]) + (1− α) (w1 +w2E [Rh]) ,= w1 +w2E [Rq] .

Thus wp = wq.

REMARK 3.1 The portfolio w1 is associated to the expectation levelE [RP ] = 1. The portfolio (w1 +w2) corresponds to the level 0. This relationproves the so-called “two mutual funds separation” property.

REMARK 3.2 When the expectation level E [RP ] is varying, the set ofoptimal portfolios is a half-line included in the set of all possible portfolios.Thus, whatever the number n of securities, the optimal set is one-dimensional,whereas the set of all portfolios is (n− 1)-dimensional (since

∑ni=1 wi = 1).

From relation (3.15), the minimal variance at the level E (RP ) is computed.This yields to the fundamental implicit relation between risk and expectedreturns:

σ2 (RP )1/C

− (E (RP )−A/C)2

D/C2= 1. (3.18)

Geometrical interpretation.

This relation defines an arc of hyperbole in the plane with axis (σ (RP ) ,E (RP )).

• Its summit is the point with components(√

1/C,A/C).

• Its asymptotes are defined from the equation:

E (RP ) =A

C± D/C2

1/Cσ (RP ) .


6

-

mvpA/C

1/√C

Standard deviation

Mean return

FIGURE 3.2: Mean-variance portfolios

Therefore:

• There exists one and only one portfolio with the minimal standard de-viation among all possible portfolios. This corresponds to the summitof the hyperbole. It is usually called the mean-variance portfolio (mvp).Its expectation is equal to A/C, and its standard deviation is equal to√

1/C.

• The set of all possible portfolios is delimited by the arc of the hyperbole.Thus, this one is called the portfolio frontier.

• Optimal portfolios with expected returns smaller than A/C are domi-nated in the mean-variance sense by those which have expected returnshigher than A/C: if q is optimal and such that E (Rp) < A/C, thenthere exists an optimal portfolio p with the same risk (σRp = σRq ) andsuch that E (Rp) > A/C.

• Consequently, for the mean-variance analysis, the “rational” portfoliosare those for which the expected return is higher than the expectedreturn of the mvp. The set of such portfolios is called the efficientfrontier .

REMARK 3.3 (Conditions for which there exists at least an efficientportfolio with all weights that are positive.) In that case, there is no short-selling. Such results are proposed in [66], [432], and [440].


3.1.2.2 Case 2: one riskless asset

The same kind of analysis can be used when there exists a riskless asset.Denote by w the vector of weights of the n risky assets, and by R the vectorof returns. The riskless asset has a return denoted by Rf . The percentage ofwealth invested on this riskless asset is w0. The budget constraint is:

w′e+ w0 = 1⇐⇒ w0 = 1−w′e. (3.19)

Therefore, the new optimization program is:

minw

w′Vw,

with w′R+ (1−w′e)Rf = E [RP ] .(3.20)

The Lagrangian functional associated to Problem (3.20) is:

L (w,λ) = w′Vw+λ(E [RP ]−w′R− (1−w′e)Rf

). (3.21)

Thus, we have to solve:minw,λ

L (w,λ) . (3.22)

The first-order conditions, which are also necessary and sufficient, are givenby:

∂L (w,λ)∂w

= 2Vw−λ (R− eRf

)= 0, (3.23)

∂L (w,λ)∂λ

= E [RP ]−w′R− (1−w′e)Rf = 0. (3.24)

Then the optimal portfolio at the level E [RP ] satisfies:

w = V−1(R− eRf

) E [RP ]−Rf(R− eRf

)′V−1

(R− eRf

) . (3.25)

Its variance is given by:

σ2 (RP ) = w′Vw =(E [RP ]−Rf )2

J, (3.26)

where J = B − 2ARf + CR2f is non-negative.

Therefore, its standard deviation is defined as a function of the expectedreturn level E [RP ]:

σ (RP ) =

+ (E[RP ]−Rf )√

Jif E [RP ] ≥ Rf ,

− (E[RP ]−Rf )√J

if E [RP ] < Rf .(3.27)

PROPOSITION 3.2When a riskless asset is available, the optimal frontier is the union of twohalf-lines starting from the new mvp (0, Rf), and with slopes

√J and −√J .


REMARK 3.4 The two-fund separation property shows that any efficientportfolio is a combination of the riskless asset and the tangent portfolio, whichis the only efficient portfolio with nil weight on the riskless asset. The efficientfrontier with the riskless asset can be decomposed into two parts:

• First, the segment which lies between theriskless portfolio and the tan-gential point t. This is the set of portfolios for which the weight w0

invested on the riskless asset is positive. Investors who choose suchportfolios are risk-averse: they prefer a small risk rather than a highexpected return.

• Second, the half-line with origin t is the set of all efficient portfolioswith a short position on the riskless asset. Investors who choose suchportfolios are less risk-averse than those of the previous case: they searchfor higher expected returns despite higher risk.

Several cases can occur:

Case 1) Rf < A/C: the portfolio with the smallest risk (the mvp) has anexpected return higher than the riskless one. This means that the financialmarket is favorable enough to invest on the mvp (nevertheless, its risk is notequal to 0).

PROPOSITION 3.3In that case, the two efficient frontiers with and without the riskless asset haveone intersection point t corresponding to the portfolio weight wt.

They are tangential at this point, defined by:

wt =V−1

(R− eRf

)(A− CRf )

. (3.28)

The expectation and standard deviation of this portfolio are given by:

E [Rt] = w′tR

B −ARf

A− CRf, (3.29)

σ2 [Rt] = w′tVwt=

J

(A− CRf )2.


This property is illustrated by the following figure.

6

-σ(Rp)

E(Rp)

1/√C

A/C

Rf

FIGURE 3.3: Efficient frontiers (Rf <AC )

Case 2) Rf > A/C: the portfolio with the smallest risk (the mvp) has anexpected return smaller than the riskless one. In that case, we have:

6

-σ(Rp)

E(Rp)

1/√C

A/C

Rf

FIGURE 3.4: Efficient frontiers (Rf >AC

)


Case 3) Rf = A/C: Relation (3.27) gives:

E (RP ) =A

C±√D

Cσ (RP ) . (3.30)

This is the equation of the two asymptotes of the efficient frontier withoutthe riskless asset. Thus, the two efficient frontiers have no intersection point.

6

-.5σ(Rp)

E(Rp)

1/√C

A/CRf =

FIGURE 3.5: Efficient frontiers (Rf = AC

)

Obviously, the interesting and usual case is the first case. Nevertheless, fora given period (bearish market), case (2) can be observed.

3.1.3 Additional constraints

Transaction costs and specific constraints, such as no shortselling (see Green[265], Dybvig [181], Alexander [17]) induce a set of constraints on the investor’sportfolio. When these constraints are linear, explicit solutions are still avail-able. The optimization problem is solved by using Lagrangian methods, asshown in the previous section.

Otherwise, numerical methods must be used. Fortunately, most constraintsyield to convex programming, such as the following problem:

minw

Φ(w)

with w ∈ H ,w ∈ K,

(3.31)

where H is a hyperplane, K is a convex set, and Φ is a convex function.


Usually, H is determined from given matrix Λ and vector ν:

H = w such that Λw = ν ,

and K is defined from a set of inequalities on convex functions:

K = w such that Ψ(w) ≤ ν where Ψ is a convex function.

The convex optimization problems have numerical solutions1, but the num-ber of assets and the type of constraints may induce computational difficulties.

However, a special subclass of such constrained optimization problems aremore easy to handle, i.e., cone programming. Methods based on extension ofthe interior point algorithm can be provided, as for example in Nesterov andNemirovski [399]. The functional Ψ is linear or quadratic, and the feasible setdetermined from constraints is the intersection of a hyperplane H and a coneC.

For example, the standard mean-variance problem corresponds to:

Ψ(w) = w′Vw,

and

Λ =(Re

), ν =

(E [RP ]

1

).

A no shortselling condition is taken into account by setting C = R+n. Moregeneral conditions, such as bounds on portfolios weights, can be introduced: wmin ≤ Γw ≤ wmax (component by component), where wmin and wmax arefixed vectors and Γ is a given matrix. In that case, the corresponding convexK is the intersection of two cones. The choice of wmin, wmax, and Γ maydepend on specific constraints, such as:

• Particular allocation on bonds and stocks, limits on international diver-sification, etc., for institutional investors;

• Anticipated scenarios about industrial sector returns;

• In order to diversify, an upper bound of 2% can be imposed on eachsecurity; and,

1Special algorithms have been proposed for the mean-variance constrained optimization byFrank and Wolfe [240], and Perold [405]. See also Dantzig ([145], [146]) for linear pro-gramming, and Boyd and Vandenberghe [86]. Standard softwares propose such constrainedoptimization, but for relatively simple constraints.


• To limit the amount of purchase when rebalancing the portfolio betweentwo dates t1 and t2, the following constraint can be imposed: let Vt1 thewealth at time t1. Then the portfolio weight wt2 at time t2 satisfies:

Vt1 ×∑i

Max(wt2 − wt2 , 0) ≤ s,

where s is a fixed amount.

Example 3.3Consider a portfolio manager who must allocate the fund among five assetclasses:

1. Small caps; 2. Big caps; 3. Growth; 4. Value; and, 5. Others,

with expectations E and standard deviations σ as follows (percentage %):

TABLE 3.1: Expectations, variances and covariancesE σ CovariancesR1 = 22 σ1 = 21 σ12 = 2.8 σ13 = 4.00 σ14 = 2.30 σ15 = 2.70R2 = 10 σ2 = 14 σ23 = 2.60 σ24 = 2.00 σ25 = 2.10R3 = 20 σ3 = 20 σ34 = 2.05 σ35 = 2.75R4 = 14 σ4 = 12 σ45 = 1.70R5 = 13 σ5 = 15

• Figure 3.6 corresponds to the case where only shortselling is forbidden.

• Assume now that the investor wants to buy at least 10% of small caps,to invest at least 10% on big caps, and at most 80% on the group whichcontains the big caps and the values. Then, the new constrained efficientfrontier is dominated by the previous one (Figure 3.7).

• Finally, if the investor wants to invest at least 10% on small caps, at least10% on the fifth class, at least 20% on each other class, and finally, atmost 80% on the group of big caps and values, then, a third constrainedefficient frontier is dominated by all the previous ones (Figure 3.8).


0.2 0.25 0.3 0.35 0.4 0.45 0.5

0.15

0.16

0.17

0.18

0.19

0.2

0.21

0.22

0.23

Active Risk (Standard Deviation in Percent)

Active R

eturn (P

ercent)

Tracking Error Efficient Frontier

FIGURE 3.6: Efficient frontier with no shortselling

0.2 0.25 0.3 0.35 0.4 0.45

0.15

0.16

0.17

0.18

0.19

0.2

0.21

0.22

0.23


Active R

eturn (P

ercent)


FIGURE 3.7: Efficient frontier with additional group constraints

0.225 0.23 0.235 0.240.148

0.15

0.152

0.154

0.156

0.158

0.16

0.162

0.164

0.166

0.168


Active R

eturn (P

ercent)


FIGURE 3.8: Efficient frontier with maximum number of constraints


3.1.4 Estimation problems

Mean-variance analysis heavily relies on “good” prediction of the mean andvariance-covariance matrix of the securities. Indeed, mean-variance optimalportfolios are significantly sensitive to parameter values, as shown, for in-stance, in Best and Grauer [65]. These values can be estimated from financialdata. However:

• Using past data to predict future returns means that we assume stabilityof parameters through time. Generally, the variance-covariance matrixis more stable than the mean return.

• Parameter estimations may be involved when, for example, the portfoliois composed of 150 to 250 securities.

• Numerical problems are also posed by the computation of the inversevariance-covariance matrix, as illustrated in particular by Stevens [480].

The most efficient method in terms of saving calculation time is to simplifythe matrix structure. For this reason, specific hypotheses can be introduced:

• The assumption that factor models such as Sharpe’s market model arevalid to predict asset returns from financial indices or macroeconomicsfactors.

• The assumption that some correlation coefficients are identical; for ex-ample for stocks belonging to the same industrial group.

In this framework, some of the following methods can be used:

1) Single-index and multi-index models (see Chapter 5).

For example, assume that asset returns satisfy the Sharpe market model:for all i ∈ 1, ..., n , for any t ∈ [0, T ],

E[Ri,t] = αi,t + βi,t RM,t + εi,t, (3.32)

where RM,t denotes the return on the market index and εi,t denotes the spe-cific risk on asset i, non-correlated with the market return. The parametersαi,t and βi,t are determined from linear regression of the market returns onthe asset returns for the same period. In particular:

βi,t = Cov(Ri,t, RM,t)/V ar(RM,t). (3.33)

Then the total risk of the asset i has two components: a term for systematicrisk (“the market risk”) and a term for non-systematic risk (“the diversifiablerisk”):

V ar(Ri,t) = β2i,t σ

2M,t + σ2

ej with V ar(RM,t) = σ2M,t and σ2

ej = V ar(εi,t).


Therefore, for a single-index model, it is sufficient to calculate the covari-ances of each asset with the index: for n assets, the number of operations isn instead of n(n− 1)/2. Then, the matrix inversion is greatly simplified (seeBartlett [45]).

2) The method of Elton, Gruber, and Padberg ([198],[199]) (see also Chapter9 of [200] for more details).

The method uses an optimal ranking of the assets, with the help of thesimplified correlation structure. A threshold C∗ is determined. Then returnsbelow this threshold are rejected.

- The ratios E[Ri]−Rfβi

are ranked from the highest to the lowest. Thisinduces the ranking of assets: i1 < ... < in. The higher the value of the ratio,the more desirable to include the asset in the portfolio. Therefore, if an assetis rejected, then all following assets are also rejected.

- The cut-off ratio C∗ is determined through an iterative procedure, asfollows. Consider the successive ratios associated to portfolios containing thel first assets. These ratios are given by:

Cl =σ2M

∑lj=1

E[Rij ]−Rf

σ2eij

βij

1 + σ2M

∑lj=1

βijσ2eij

. (3.34)

Note that the ratio Cl is also equal to:

Cl =βlM (E[RM ]−Rf )

βl, (3.35)

where βlM is the expected change in the rate of return on asset l with 1%change in the return on the optimal portfolio.

- Consider the unique l∗ for which all assets il such that il ≤ l∗ have ratios(E[Ril ]−Rf ) /βil higher than Cil , and for which all assets with il > l∗ haveratios smaller than Cil . Therefore, an asset il is included in the portfolioif its mean excess return, E[Ril ] − Rf , is higher than the optimal portfoliocontaining the first il assets.

3) Average correlation models.

The averaging data in the historical correlation matrix has been examinedin [197] to forecast future values. This method assumes that the past cor-relation matrix provides information about what the average correlation willbe in the future but contains no information about individual correlations.Therefore, the main assumption is that usual groups of industrial stocks havecommon correlations. Such problems can be also examined for internationaldiversification, as in Longin and Solnik ([363],[364]).


4) Bayesian approach.

The Bayesian estimation process is based on one hand on prior knowledgeof market parameters (“the investor’s experience”), and on the other hand, oninformation from market observations. Then, a posterior return probabilitydistribution can be determined. Such an approach has been examined by Stein[478], Frost and Savarino [244] and others. Black and Litterman [74] also useBayes rule to reduce the sensitivity of the optimal allocation to parameterchoices. Meucci [390] provides details about the Bayesian procedure.

REMARK 3.5 The empirical observations lead to the following conclu-sions (see, e.g., Elton et al. ([198],[199]), Chan et al. [113], Zeng and Zhang[511].) :

• The estimation of the variance-covariance matrix is easier than the meanreturn estimation. Estimation from individual return data does notprovide efficient forecasts. Multi-index models and average correlationmodels give more accurate values by reducing the impact of fluctuations(see, for example, Eun and Resnick [214]). However, significant estima-tion errors may remain, as mentioned by Jobson and Korkie ([305],[306]),and Chopra and Ziemba [122].

• Taking account of additional constraints, in particular the no short-selling condition, partly reduces estimation errors, as shown by Best andGrauer ([66],[67]) and by Frost and Savarino [245]. Moreover, takingmore account of actual strategies of fund managers, these additionalconstraints often improve portfolio performances.


3.2 Alternative criteria

The Markowitz approach is not directly based on expected utility maxi-mization or risk measure minimization. However, under specific assumptions,the optimal solutions of the two previous problems are mean-variance efficientportfolios.

3.2.1 Expected utility maximization

3.2.1.1 Expected utility and mean-variance analysis

The mean-variance analysis implicitly assumes that the investor’s utility,defined on the portfolio return RP , is a function V (RP ) which depends onlyon the mean and the variance:

V (RP ) = f(E (RP ) , σ2 (RP )

), (3.36)

and is increasing with respect to the mean(

∂f∂E > 0

), and decreasing with

respect to the variance(

∂f∂σ2 < 0

). This kind of utility function is defined as

a mean-variance utility function.Consider an investor with an initial amount V0 and a time horizon T who

uses a “buy and hold” strategy (static portfolio optimization). In that case,the expected utility maximization is equivalent to the search of a vector ofweights, w = (w1, ..., wn), invested on n securities, which is the solution of:

maxw

EP [U (VT )] . (3.37)

Epstein [210] proves that mean-variance utility functions are implied by aset of decreasing absolute risk aversion postulates.

For two main cases, the optimal solution is indeed mean-variance efficient:

• The utility is quadratic: U(x) = x− k2x

2, with k > 0 defined on the set]−∞, 1/k] on which it is increasing. Then:

EP [U (VT )] = E (V0.RP )− k

2.V 2

0 .E(R2

P

)= f

(E (RP ) , σ2 (RP )

),

(3.38)

with

f(E (RP ) , σ2 (RP )

)= U(V0.E (RP ))− k

2.σ2 (RP ) . (3.39)

Since the portfolio value is assumed to be in the set ] −∞, 1/k], thenthe quantity U(V0.E (RP )) is indeed an increasing function of the meanE (RP ). Therefore, any optimal solution is mean-variance efficient.


• The utility is exponential: U(x) = − exp[−Ax]A with A > 0, defined on R+.

Moreover, the return distribution R is supposed to be Gaussian. Thus,the portfolio return RP also has a Gaussian law since RP = w.R andthe Gaussian probability distribution is stable. Therefore, the utilityU (VT ) has a lognormal distribution. Recall the following property: ifthe random variable X has a Gaussian law N (m,σ), then

E[eX

]= exp

[m +

σ2

2

].

Using this expression, we deduce:

EP [U (VT )] = −(1/A)exp[−A (E (RP )−A/2.σ2 (RP )

)]. (3.40)

Therefore, the maximization of this expected utility is also equivalentto the maximization of a quadratic utility with:

f(E (RP ) , σ2 (RP )

)= E (RP )−A/2.σ2 (RP ) . (3.41)

REMARK 3.6 For the previous cases, the functional to maximize hasthe following form:

V (RP ) = E (RP )− φ

2.σ2 (RP ) where φ > 0. (3.42)

The parameter φ is the marginal substitution rate between the mean and thevariance. It can be viewed as an aversion to the variance.

Let us examine this kind of problem. For a given aversion to variance φ,we have to solve:

maxw

w′R− φ2 .w

′Vw,

with w′e =1 .(3.43)

The Lagrangian associated to Problem (3.43) is defined by:

L (w,λ) = w′R−φ

2.w′Vw+λ (1−w′e) (3.44)

The first-order conditions are given by:

∂L (w,λ)∂w

= R−φVw−λe = 0, (3.45)

∂L (w,λ)∂λ

= 1−w′e = 0. (3.46)

Solving this linear system, we deduce:


PROPOSITION 3.4The optimal solution is equal to:

w =1φ

(V−1R− A− φ

CV−1e

). (3.47)

Its mean and variance are given by:

E (RP ) = w′R=d

φ.C+

A

C, (3.48)

σ2 (RP ) = w′Vw =d

φ2.C+

1C. (3.49)

When the parameter φ is varying in R+, the set of optimal solutions isexactly equal to the set of efficient portfolios.

REMARK 3.7 When the aversion to variance φ goes to infinity, we have:

E (RP )→ A

Cand σ2 (RP )→ 1

C. (3.50)

In that case, the investor chooses the mvp portfolio.When the aversion φ goes to 0, we have:

E (RP )→ +∞ and σ2 (RP )→ +∞. (3.51)

The smaller φ, the higher the mean, but also the higher the risk.

REMARK 3.8 As can be seen, the main advantage of mean-varianceutility functions is the simplicity of the determination of optimal solutions.Moreover, when convex constraints are added, such problems as

maxw

w′R− φ2 .w

′Vw,

with w′e =1 and w ∈ K,(3.52)

have numerical solutions computed with efficient algorithms (see Section 3.1.3and [405]).

3.2.1.2 Optimal weights for expected utility maximization

The first-order condition of problem (3.37) is given by: for all i ∈ 1, ..., n ,E[U ′ (V0RP ) (Ri −Rf )] = 0, (3.53)

where the portfolio return RP is equal to Rf +∑n

i=1 wi(Ri −Rf ).The solutions of the previous system of equations are generally implicit, and

numerical algorithms are needed to analyze them. However, for an importantclass of utility functions, the HARA family, qualitative results can be derived,in particular the two-fund separation.


3.2.1.3 Two-fund separation

The mean-variance efficient portfolios have the two-fund separation proper-ty. This is not the only case for which this property is satisfied: for example, ifthere exists a riskless asset, other utility functions imply two-fund separation.In addition, particular probability distributions can also induce separationproperty.

3.2.1.3.1 Two-fund separation and utility function Indeed, Cassand Stiglitz [108] provide necessary and sufficient conditions on utility func-tions to get the two-fund separation.

PROPOSITION 3.5For the static portfolio optimization problem:1) The two-fund separation property is satisfied for any probability distri-

bution of returns if and only if the investor has a quadratic utility function.

2) If there exists a riskless asset, then a necessary and sufficient conditionfor investors to have the same percentages of each risky asset is that they havean absolute risk aversion (ARA), such that for any wealth level V ,

ARA(V ) =1

α + βV, (3.54)

where the parameter β is the same for all investors and is assumed to bepositive. Thus, they have the same HARA utility function (up to a lineartransformation). They differ only by their wealth V0.

PROOF We examine only the sufficient condition (see [108] for the com-plete proof):

Note that Condition (3.54) implies that the marginal utility U ′ has theform:

1) If β = 0, denote α = A/(βB). Then:

U ′(V ) = (A + BV )−1β .

2) If β = 0:

U ′(V ) =1α

exp(−V

α

).

For the HARA utility functions, note that the first-order condition (3.53)is equivalent to: if β = 0,

E

(A + BV0

[Rf +

n∑i=1

wi(Ri −Rf )

])− 1β

(Ri −Rf )

= 0. (3.55)


If A = 0 (CRRA case), then the optimal weights wf and w = (w1, ..., wn)invested on the n risky assets, which are the solutions of the system of e-quations (3.55), do not depend on BV0. They are the same for all investorshaving CRRA utility functions with the same parameter β. Denote them by:

wCRRAf and wCRRA = (wCRRA

1 , ..., wCRRAn ).

If A = 0, define C by:

C =BV0

A + BV0(Rf ).

Then, Equation (3.55) is equivalent to:

(BV0

C

)− 1γ

E

(1 +n∑

i=1

Cwi(Ri −Rf )

)− 1γ

(Ri −Rf )

= 0.

Therefore, for all i ∈ 1, ..., n , the value of Cwi does not depend on theparameters of the HARA utility function.

Thus, the ratios of optimal weights, wHARAi , invested on the risky assets do

not also depend on BV0. For parameter β, they satisfy: for all i, j ∈ 1, ..., n ,

wHARAi

wHARAj

=wCRRA

i

wCRRAj

.

The same result is deduced when β = 0.

REMARK 3.9 (Determination of the utility function)1) If β = 0, the utility function is given by:

U(V ) =β

(β − 1)B(A + BV )

β−1β if β = 1,

U(V ) =1B

ln (A + BV ) if β = 1.

2) If β = 0:

U(V ) = − exp[−V

α

].

Examine the properties of optimal weights for the HARA case. Denote byw∗ = (w∗

1 , ..., w∗n) the percentages of risky assets in the mutual fund which

depend only on the parameter β. Note that∑n

i=1 w∗i = 1. This portfolio is

the optimal portfolio of an investor with a CRRA utility function (A = 0).


PROPOSITION 3.6For HARA functions, the optimal weights are functions of the CRRA optimalweights.

PROPOSITION 3.7(Determination of the optimal weights)1) If β = 0, the optimal weights are given by:

wi = νw∗i and wf = 1− ν,

with ν = (1− λ)(

1 +A

BV0Rf

),

where the parameter λ is equal to wf for the CRRA utility function.Thus, ν is the percentage of risky investment, and (1− ν) the percentage of

riskless one.2) If β = 0, the optimal weights are given by:

wi =(

(1− λ)αV0

)w∗

i ,

wf =λα

V0+

V0 − α

V0.

REMARK 3.10 For the first case, note that all investors with CRRAutility functions have the same optimal portfolio (wf and w) whatever theirinitial wealth. For A = 0, the percentage ν of risky investment is decreasingw.r.t. the initial wealth. More precisely, the demand upon each risky asset i(i.e., the amount invested on i) is linear in wealth. For the second case, thedemand upon each risky asset i is constant.

Another approach is based on the characterization of probability distribu-tions that imply two-fund separation.

3.2.1.3.2 Two-fund separation and probability distribution Multi-variate Gaussian distributions are particular solutions of this problem. Theyare special cases of the spherical distributions , defined as follows:

DEFINITION 3.1 A random vector R has a spherical distribution if forevery orthogonal map L ∈ Rn×n(i.e., L′L = LL′ = In where In is the n-dimentional identity matrix), the random variables R and L.R have the samedistribution.

This means that the distribution of a spherical random variable is invariantto rotation of the coordinates.


DEFINITION 3.2 A random vector R is determined from its character-istic function ϕR, which must satisfy: for any vector t ∈ Rn,

ϕR(t) = E [exp (it′.R)] = ψ(t′.t),

where ψ is some scalar function and usually called the characteristic generatorof the spherical distribution (see Fang et al. [220] for properties of such adistribution).

Examples of the spherical distributions are the Gaussian distributions, thestudent-t distributions, the logistic distributions, etc.

Chamberlain [112] provides the complete family of distributions that arenecessary and sufficient for the expected utility of final wealth to be a mean-variance utility function.

PROPOSITION 3.8

1) If there is a riskless asset, then the distribution of any portfolio is deter-mined by its mean and variance if and only if the random vector of returns Ris a linear transformation of a spherically distributed random vector.

2) If there is no riskless asset, then the spherically distributed random vectoris replaced by a random vector in which the last n− 1 components are spher-ically distributed conditional on the first component, which has an arbitrarydistribution. If the number of assets is infinite, then there must exist randomvariables X,Y, and varepsilon, where the distribution of ε conditional on Xand Y is standard normal, such that every portfolio is distributed as somelinear combination of X and Y ε. (If there is a riskless asset, then X has zerovariance.)

The general result is provided by Ross [433].

PROPOSITION 3.9

For a risk-averse investor, necessary and sufficient conditions to have thetwo-fund separation property are the following ones:

Case 1: There is no riskless asset. The condition is that there exist tworandom variables X and Y , and two portfolios w(1) and w(2), such that therisky asset returns Ri can be written as follows: for all i ∈ 1, ..., n ,

Ri = X + aiY + εi, (3.56)


with:

E[εi

∣∣∣X, Y]

= 0,n∑

i=1

w(1)i εi =

n∑i=1

w(2)i εi = 0, (3.57)

a(1) =n∑

i=1

w(1)i ai = a(2) =

n∑i=1

w(2)i ai. (3.58)

This condition is equivalent to the existence of two portfolios w(1) and w(2),such that the risky asset returns Ri can be written as follows: for all i ∈1, ..., n ,

Ri = Rw(1)

i + bi(Rw(2)

i −Rw(1)

i ) + εi,

with:E[εi

∣∣∣Rw(2)

i , Rw(1)

i

]= 0.

Case 2: There is a riskless asset with return Rf . The conditions are thesame if we set X = Rf and w

(1)f = 1, w(1) = (0, ..., 0).

PROOF We examine only the sufficient condition (see [433] for the nec-essary condition).

Using the risk notion of Rothschild and Stiglitz ([435],[436]) (see Chapter1), it is sufficient to prove that any portfolio wP is more risky than thecombination w(1,2) of w(1) and w(2), which has the same mean return. FromCondition (3.56), we have: for all i ∈ 1, ..., n ,

RP =n∑

i=1

wPi Ri = X +

n∑i=1

wPi aiY +

n∑i=1

wPi εi.

From Condition (3.58), there exists always a real number γP :

n∑i=1

wPi ai = γP

(n∑

i=1

w(1)i ai

)+(1− γP

)( n∑i=1

w(2)i ai

)= γPa(1)+

(1− γP

)a(2).

Consequently, the portfolio return RP can be written as:

RP = γP

(X +

n∑i=1

w(1)i aiY

)+(1− γP

)(X +

n∑i=1

w(2)i aiY

)+

n∑i=1

wPi εi.

= γPRw(1)

i +(1− γP

)Rw(2)

i +n∑

i=1

wPi εi. (3.59)

Consider now the portfolio w(1,2) = γPw(1) +(1− γP

)w(2). Then:

RP = Rw(1,2)+ εP with εP =

n∑i=1

wPi εi.


From Condition (3.57),

E

[n∑

i=1

wPi εi

∣∣∣Rw(1,2)

]=

n∑i=1

wPi E

[n∑

i=1

wPi εi

∣∣∣∣∣X +n∑

i=1

wPi aiY

]= 0.

Therefore, since we have:

RP = Rw(1,2)+ εP with E

[εP

∣∣∣Rw(1,2)]

= 0,

the portfolio RP is more risky than Rw(1,2)(according to Rothschild and

Stiglitz) and cannot be optimal. Case(2) is proved in the same manner.

REMARK 3.11 When the investor is risk-averse and no riskless asset isavailable, the two-fund separation property is based on two conditions:

- First, the return of any asset i is generated by two common factors.- Second, there exist two portfolios, w(1) and w(2), with distinct mean

returns and without residual risks.- As proved in Chamberlain [112], if for any asset i, there exists ai ∈ R and

bi ∈ R such that:

Ri = aiX + biY ε with E [ε |X,Y ] = 0, (3.60)

then, the property of two-fund separability is satisfied, even if the utilityfunction is not concave.

3.2.2 Risk measure minimization

If investors are wary of downside risks, they want to choose portfolios withsmall probabilities of loss. As seen in Chapter 2, they may search to minimizethe probability of having returns under a given level (this refers to VaR), orthe expectation of the losses under this level (this refers to CVaR).

Attempts to solve this kind of problem are found in Roy [437], Telser [490],and Kataoka [324], when portfolio returns have Gaussian probability distri-butions.

3.2.2.1 Safety first

3.2.2.1.1 Roy’s criterion Let Rmin be the minimum return fixed by theinvestor or by statutory conditions for particular funds. Roy’s criterion isthe minimization of the probability to get a portfolio return lower than Rmin.Then, the optimization problem is:

minw

P(RP < Rmin). (3.61)

Assume that the vector of returns has a multivariate Gaussian distributionand no riskless asset is available. Since any portfolio return also has a Gaussian


law N (E[RP ], σP )., then, by normalizing the return RP , the Roy problem isequivalent to:

minw

P

(RP − E[RP ]

σP<

Rmin − E[RP ]σP

).

Now the probability distribution of the random variableRP − E[RP ]

σPisN (0, 1).

Thus, the problem is equivalent to the minimization ofRmin − E[RP ]

σP, and

also to the maximization of the quantity a =E[RP ]−Rmin

σP. From a geomet-

rical point of view, in the Markowitz plane (σP ,E [RP ]), this latter ratio isthe slope of a half-line starting from point (0, Rmin).

Its equation is:

E[RP ] = a σP + Rmin.

Therefore, under the Gaussian assumption, the Roy portfolio is on the half-line starting from point (0, Rmin), having the maximum slope and a non-emptyintersection with the set of all possible portfolios.

The solution is tangent to the efficient portfolios curve, as illustrated bythe following graph:

R

RMin

σ(Rp)

E(Rp)

E[RP ] = RMin + K1σ(Rp)



FIGURE 3.9: Roy’s portfolio


The half-line with the lowest slope is mean-variance dominated by the tan-gent, which would be dominated by the third half-line, but this one has anempty intersection with the set of portfolios.

The determination of Roy portfolio is, for example, deduced by searching

the value of the maximum ratio a =E[RP ]−Rmin

σP:

1) Substituting a σP + Rmin for E[RP ] in the hyperbole’s equation inducesa quadratic polynomial equation with unknown variable σ. The half-line istangent to the efficient frontier if and only if its discriminant ∆ is equal to 0.

2) ∆ is itself a quadratic function of the ratio a. Thus, we have to solve thissecond equation. The optimal ratio a is the highest solution.

Tangent conditions can also be used to determine Roy’s portfolio.

3.2.2.1.2 Telser criterion Telser’s portfolio is the portfolio which hasthe highest return expectation under the safety conditions. We have to solve:

max E[RP ],

with P(RP < Rmin) ≤ ε, (3.62)

where both the minimum return Rmin, and the threshold ε, are fixed.

Under the Gaussian assumption, the safety condition is equivalent to:

Rmin − E[RP ]σP

≤ xε,

where xε is the quantile of the standard Gaussian distribution at the level ε.

In the Markowitz plane (σP ,E [RP ]), the set of solutions is the area abovethe line defined by E[RP ] = Rmin + (−xε) σP .

For example, if ε = 5%, we have xε −1.65. Then the safety condition isapproximately:

E[RP ] ≥ Rmin + 1.65 σp.

The Telser portfolio is the intersection point of the efficient frontier withthe half-line having the equation:

E[RP ] = Rmin + (−xε) σP ,

as illustrated by the following graph.


T

RMin

σ(Rp)

E(Rp)

Angle=1.65

FIGURE 3.10: Telser’s portfolio

3.2.2.2 Kataoka criterion

The idea of Kataoka is to search for the maximum value of the minimumreturn Rmin, guaranteed at a fixed probability threshold ε. We have to solve:

maxw

Rmin,

with P(RP < Rmin) ≤ ε. (3.63)

Under Gaussian assumptions, this problem is equivalent to:

maxw

Rmin

withE[RP ]−Rmin

σP≥ −xε.

In the Markowitz plane, consider the half-lines with same fixed slope (−xε).Then, the Kataoka portfolio is the tangent point of the efficient frontier withthe unique half-line having the given slope −xε, and the value of Rmin is givenby the intersection with the vertical axis (see Rmin 2 in the Figure 3.11.)


K

RMin1

RMin2

RMin3

σ(Rp)

E(Rp)

FIGURE 3.11: Kataoka’s portfolio

REMARK 3.12 The three previous criteria are based on a fixed levelguaranteed at a given probability threshold. Under Gaussian assumptions, alloptimal solutions are necessarily mean-variance efficient. As a consequence,a unique efficient portfolio can be determined by choosing the values of Rmin

and/or the threshold ε.

• Roy criterion is based on the search of a true guarantee, since it is forthis criterion that the level ε is the lowest. For example, when marketvolatility is high, an investor may want to reduce market risk.

• If market parameters are assumed to be well estimated, and the min-imum level Rmin and the threshold ε can be easily determined, then“rationally,” the investor maximizes the return expectation. This is thepurpose of Telser criterion.

• In the same framework, Kataoka portfolio has the most attractive guar-antee for a given probability threshold.

REMARK 3.13 Obviously, other probability distributions than the Gaus-sian one must be used if asset returns exhibit skewness and fat tails, in par-ticular when some payoffs are not linear with respect to given basic securities.Then, optimal solutions may no longer be mean-variance efficient. Numer-ical algorithms are generally needed to determine and analyze the optimalsolutions, as shown in what follows.


3.2.2.3 CVaR Minimization

Alternatively, we can search to minimize the CVaR (“expected shortfall”)with or without specific constraints. For normally distributed loss functions,mean-variance (MV) and VaR/CVaR minimizations are equivalent (see com-putation of VaR and CVaR for the Gaussian case in Chapter 2). Otherwise,especially for asymmetrical distributions, MV and VaR/CVaR portfolio opti-mizations may lead to significantly different solutions. Note also that unlikein the mean-VaR optimization problem, mean-shortfall optimization can besolved efficiently as a convex optimization problem, as shown in Bertsimas etal. [55].

General results about VaR and CVaR minimizations are provided in Rock-afellar and Uryasev ([427], [428]). They show that these problems can bebased on a particular representation of the performance function, which thenallows use of analytical or scenario-based optimization algorithms. When thenumber of scenarios is fixed, the problem is solved by linear programming orby non-smooth optimization methods.

Consider the loss function l defined on weights w and return R. Let usassume that the vector of returns R has a density fR (this assumption is notcritical). For α ∈ (0, 1), the α-VaR denoted by qα(w) is given by:

qα(w) = min q ∈ R : P[l(w, R) ≥ α] .

The α-CVaR denoted by φα(w) is equal to: (notation: x+ = max(x, 0))

φα(w) = (1− α)−1

∫l(w,u)≥qα(w)

l(w, u)fR(u)du. (3.64)

As can be seen, the α-CVaR φα(w) is the conditional expectation of theloss associated to the vector of weights w relative to that loss being equal toqα(w) or greater than qα(w).

The key idea is to characterize both α-VaR qα(w) and α-CVaR φα(w) bymeans of a convex function Fα(w, q) defined by:

Fα(w, q) = q + (1 − α)−1

∫Rn

[l(w, u)− q]+ fR(u)du. (3.65)

We have (see [427]):

PROPOSITION 3.10The function Fα(w, q) is convex and continuously differentiable w.r.t. q. Theα-CVaR φα(w) of the loss associated with any vector of weights is given by:

φα(w) = minq∈R

Fα(w, q). (3.66)


The set Aα(w) of values q for which the minimum is attained, namely

Aα(w) = arg minq∈R

Fα(w, q),

is a bounded interval which is non-empty and closed. It can be reduced to asingle point. Its left endpoint is the α-VaR qα(w). Using convexity results(see e.g., Rockafellar [426]), the existence of a unique solution can be provedunder specific assumptions which eliminate a local but not global minimum.

Then, a characterization of the α-CVaR can be deduced (see [427]):

PROPOSITION 3.11The minimization of the α-CVaR φα(w) is equivalent to minimizing Fα(w, q)over all (w, q):

minw∈Λ

φα(w) = min(w,q)∈Λ

Fα(w, q). (3.67)

Moreover, any pair (w∗, q∗) is an optimal solution of the right-hand sideif and only if w∗ is an optimum of the left-hand side. When the intervalAα(w) reduces to a single point, any solution (w∗, q∗) of the minimization ofthe function Fα(w, q) is such that w∗ minimizes the α-CVaR, and q∗ is thecorresponding α-VaR qα(w).

Note also that Fα(w, q) is convex w.r.t. (w, q), and φα(w) is convex w.r.t.w if the loss function l(w, q) is convex w.r.t. w. Moreover, if the set Λ ofportfolios weights is convex, the optimization problem relies on convex pro-gramming (see Section 3.1.3).

REMARK 3.14 This approach allows us to avoid the VaR computation.The function Fα(w, q) can also be estimated from financial data or simulatedfrom the Monte Carlo method for a given density fR.

3.2.2.4 Efficient frontier

When loss functions are normally distributed, MV and VaR/CVaR opti-mizations generate the same efficient frontier.

Another problem is the equivalence representations of efficient frontiers withconcave reward and convex risk functions.

This kind of result is known for the mean-variance case (see Steinbach [479]),and for mean-regret performance functions, see Dembo and Rosen [157].


Krokhmal et al. [336] provide a general result about this equivalence, asshown in the next proposition.

PROPOSITION 3.12

Let us consider the functions Φ(.) (“the risk”) and Ψ(.) (“the reward”) w.r.t.the vector of weights. Consider the following three problems:

(P1) minw Φ(w) − aΨ(w), a ≥ 0, w ∈ Λ,

(P2) minw Φ(w), Ψ(w) ≥ b, w ∈ Λ,

(P3) minw−Ψ(w), Φ(w) ≤ c, w ∈ Λ.

When the parameters a, b, and c are varying, three corresponding efficientfrontiers are generated.Assume that constraints Ψ(w) ≥ b and Φ(w) ≤ c have internal points

(under some regularity conditions from duality theory). If Φ(.) is convex,Ψ(.) is concave, and Λ is convex, then the three efficient frontiers are equal.This is the case when Φ(.) is the α-CVaR φα(w) and Ψ(.) is the mean return.

3.3 Further reading

There is a huge amount of literature concerning static portfolio manage-ment and, in particular, mean-variance analysis, both from the theoreticaland empirical points of view. The book of Elton and Gruber [200] provides ageneral overview about standard problems. The classic texts on portfolio op-timization are the books of Markowitz ([375] and [376]). The book of Meucci[390] contains details about the Bayesian approach and computational meth-ods when additional constraints are introduced on portfolio weights. Thediversification property for mean-variance efficient portfolios is analyzed inGreen and Hollifield [266]. Large-scale optimization is studied in Perold [405],Best and Kale [69], Bixby et al. [72], Levkovitz and Mitra [351], and Mulveyet al. [395]. Extension of mean-variance to more general complete marketsare examined in Dybvig and Ingersoll [182]. Nowadays, mean-variance op-timization is also applied on asset class level. This is due to the increasingrange of asset classes, and also to the easier estimation of Markowitz inputsthan for individual securities. As illustrated and mentioned for instance inLederman and Klein [345], global asset allocation, in particular internationaldiversification, and strategic asset allocation for long-term investment can bebased on this approach.

Michaud [391] examines statistical properties and their significant impacton portfolio optimization. Ledoit and Wolf [346] examine variance-covariancematrix tests when dimensionality is large. Arch models can also give betterex-post predictions of the variance-covariance matrix. They allow for intro-


duction of factor models based on GARCH processes, which describe volatil-ities and correlations of securities, as shown in Herzog et al. [292]. Basicresults about ARCH models can be found in Bollerslev et al. [80], Gourieroux[260].

Utility maximization is mainly developed in a dynamic framework, as canbe seen in Chapters 6 and 7. Kroll et al. [338] examine the relation betweenmean-variance analysis and utility maximization. Kallberg and Ziemba [315]compare optimal portfolio weights for different utility functions. Portfolio op-timization can also be based on higher-moments, involving skewness and kur-tosis. A fourth-order Taylor approximation of a utility function leads to suchcriterion. A three-moments portfolio choice has been analyzed by Athaydeand Flores [35] for a maximum skewness portfolio. The four-moments case isexamined by Jurczenko and Maillet [312], and by Malevergne and Sornette([371],[372]). The latter articles also show how centered (absolute) momentsand some cumulants are consistent measures of risks which can be used togeneralize the mean-variance approach. Efficient frontiers for stochastic dom-inance can be also examined, as in Ruszczynski and Vanderbai [444], andDarinka and Ruszczynski [147]. Prospect theory and its connection to mean-variance analysis is examined in Levy and Levy [353].

When probability distributions are not Gaussian, the safety first conditioncan be studied by means of extreme value theory as in de Haan et al. [278];or also by using the theory of great deviations. Giacometti and Lozza [252]examine risk measures for asset allocation. They compare, in particular, port-folio choice on both historical data and simulated returns with jointly stablenon-Gaussian returns. Differences between MV and CVaR efficient frontiersare illustrated in Mausser and Rosen [379], and Anderson et al. [25] for creditrisk portfolio management, and Larsen et al. [344], and Chabaane et al. [111]for hedge funds management. Gaivoronski and Pflug [251] show that VaRand CVaR efficient frontiers can be quite different. Alexander and Baptista[18] compare the mean-VaR approach with mean-variance analysis.

Other criteria can also be examined such as mean-absolute deviation (seeKonno et al. ([329],[331]), minimax rule (see Cai et al. [101], Teo and Yang[491], and Young [508]). Athayde [34] studies the minimization of downsiderisk. Chekhlov et al. [116] provide portfolio optimization results with draw-down constraints.

Acerbi and Prospero [4] examine the minimization problem of spectral mea-sures of risk. They prove that minimizing risks with constrained returns, ormaximizing returns with constrained risks (standard risk-reward problem)coincides with the unsconstrained optimization of a single suitable spectralmeasure. This means that “minimizing a spectral measure turns out to bealready an optimization process itself, where risk minimization and returnsmaximization cannot be disentangled from each other.”

Chapter 4

Indexed funds and benchmarking

4.1 Indexed funds

The goal of passive management is to achieve returns identical to a spe-cific benchmark. It is based on holding a basket of securities designed, forexample, to track a broad-market index which matches as closely as possiblethe return of the overall stock market. The most commonly used strategy ismarket capitalization weighting. No specific judgment is needed, and the riskof variation from the accepted asset-class benchmark is reduced as much aspossible.

Why invest in global passive funds?The first justification is the cost: Systems and personnel costs are lower

than with active management since, once the procedure is in place, it is s-traightforward to use it. Transaction costs are also reduced. Indeed, sincetransactions are informationless, commissions are significantly smaller. Taxesare reduced since active portfolio strategies have a higher turnover. More-over, bid-offer spreads are smaller because passive funds involve less small-capitalization stocks. Funds such as “trackers” also have low transactioncosts.

The second is the performance: According to financial data, global indexessuch as the S&P 500 and EAFE beat the median manager performance mostof the time. The implicit assumption is that financial markets are efficient,and no manager has a superior performance. In fact, less than 20% of activelymanaged mutual funds, diversified on large-capitalizations, have outperformedthe S&P 500 over the last 10 years. According to SEI Investments, more thanhalf of the funds of the Euro area, which had performances above the averageduring the period 1998-2000, had lower performances than the average duringthe next two years.

Indexed funds are relatively recent. Until 1993, only about 5% of the totalamount of mutual funds were invested on indexed funds. However, duringthe year 1999, about 38% of new investments have been based on passivemanagement, which has become very popular.

When the goal is to reproduce as closely as possible a given financial index,the passive management is called an index tracking strategy. Two methods

103


can be considered: first to have the same assets with the same weights asthe index itself (exact replication); or, second to choose a smaller subset ofassets while minimizing the replication or tracking error for some given crite-rion (partial replication). For the second case, an objective function must beselected and specific constraints can be introduced, such as bounds upon in-dividual weights, number of assets to be included in the portfolio, restrictionson transaction costs, etc.

4.1.1 Tracking error

Most of the time, the objective function T of the tracking error is defined asthe variance of the difference between tracking portfolio return RP and indexreturn RI :

T = σ(RP −RI). (4.1)

Such criterion is used by Toy and Zurach [493], Connor and Leland [126],and Larsen and Resnick [344]. Other criteria, based on absolute deviationsinstead of squared deviations, can also be used, as in Consiglio and Zenios[127], Rockafellar and Uryasev [428], and Konno and Wijayanayake [330].

Rudolf et al. [443], for example, introduce the following “linear” tracking-errors :

• MAD (Mean Absolute Deviation).

Let Y ∈ RT be the index return time series. Let X ∈ Rn×T be thereturn matrix of the n securities which are included in the replicatingportfolio. Let β ∈ Rn the vector of weights to be determined.

The optimal weights are deduced from the minimization of the sumof absolute deviations between index return and replicating portfolioreturn:

β = Argminβ

T∑t=1

(∣∣∣∣∣n∑

i=1

Xitβi − Yt

∣∣∣∣∣),

where Xt = (X1t, ..., Xnt).

• MinMax.

The optimal weights are determined against the worst case:

β = Argminβ

(max

t|Xtβ − Yt|

).

These two measures of tracking-errors can be extended by selecting onlythe values Xtβ smaller than Yt (i.e., Downside Risk). This latter criterion iscalled MADD (“Mean Absolute Downside Deviation”). Similarly, we definethe DMinMax criterion (“Downside MinMax”).To summarize, we have:

Indexed funds and benchmarking 105

• TEMAD : minβ

∑t (|Xβ − Y |)

• TEMADD : minβ

∑t

([Y −Xβ]+

)• TEMinMax : min

βmax

t|Xtβ − Yt|

• TEDMinMax : minβ

maxt

[Yt −Xβt]+

4.1.2 Simple index tracking methods

Index tracking-errors can be based on statistical methods, such as the coin-tegration approach, or on calibration-type algorithms, such as the thresholdaccepting algorithm.

4.1.2.1 Exact market capitalization weighting

This method consists of investing in all assets of the index proportionalto their shares in the index. This perfect replication seems to be the idealmethod. However, some difficulties may appear:

• The invested amounts must be sufficiently high in order to avoid roundweighting problems.

• Depending on management style, liquidity problems can occur, for ex-ample on small-capitalizations.

• If the index is modified, high transaction costs reduce the performance.

4.1.2.2 Sratified replication

This method is based on the decomposition of the index according to a setof characteristics which can be weighted. Then, the replicating portfolio musthave the same decomposition. For example, consider an equity index with i =1, ..., n stocks belonging to j = 1, ...,m industrial sectors, and k = 1, ..., l styles(small-cap, big-cap, growth, value, etc.). Then, the replicating portfolio musthave the same respective percentages (pj%, qk%) of sectors and styles, butnot necessarily the same stocks with the same weights. However, this methoddoes not indicate the types and the optimal number of these characteristics.

4.1.2.3 Synthetic replication

Derivatives written on the index may also be used (if they exist). However,if maturity dates do not coincide, options must be rolled over. This inducesadditional costs since, for example, when options are not always at-the-money,it is more difficult to forecast their prices.


4.1.2.4 Optimum sampling replication

Meade and Salkin ([382],[383]) use quadratic programming to determinethe optimal tracking portfolio weights. However, they use a pre-selected set ofsecurities. As mentioned in Tabata and Takeda [486], index fund managementrequires:

• Minimization of the number of assets in the tracking portfolio.

• Minimization of a function of the tracking-error between portfolio andindex.

They propose an algorithm which determines a locally optimal trackingportfolio.

4.1.3 The threshold accepting algorithm

Dueck et al. ([171],[172]) have proposed the so-called threshold acceptingalgorithm for the risk and reward optimization problem. Gilli and Kellezi[255] examine the performance of the threshold accepting algorithm for in-dex tracking. They assume the existence of transaction costs for portfoliorebalancing. Their approach is presented in this section.

4.1.3.1 The model

Assume that there are (nA + 1) securities in the index to be replicated.Asset 0 is assumed to be the risk free asset with constant rate r. Let pit bethe price at time t of asset i, i = 1, ..., nA.

Let It be the index value at time t. Its return on time period [t− 1, t] isgiven by:

rIt = ln(

ItIt−1

).

Let xit be the quantity invested on the ith asset in the tracking portfolioPt at time t:

Pt = xit | i = 0, 1, ..., nA .Consider the set of indices corresponding to assets included in portfolio Pt:

Jt = i | xit = 0 .Denote by Vt the value of portfolio Pt:

Vt =nA∑i=0

xitpit =∑i∈Jt

xitpit.

The corresponding weights are given by:

wit =xitpitVt

.


Denote also by Vt−the tracking portfolio value before t:

Vt− =nA∑i=0

xi,t−1pit.

Without transaction costs, the portfolio return rPt on time period [t− 1, t] isgiven by:

rPt = ln(

Vt−

Vt−1

)= ln

( ∑nAi=0 xi,t−1pit∑nA

i=0 xi,t−1pi,t−1

). (4.2)

Assume that transaction costs Ct are proportional to absolute changes in eachsecurity i:

Ct = c

nA∑i=0

pit |xit−xi,t−1| , (4.3)

where c is a given positive real number.Then, the portfolio return, taking account of transaction costs, is given by:

rPt = ln(

Vt−

Vt−1

)= ln

(Vt−

V(t−1)−−Ct−1

). (4.4)

4.1.3.2 Objective function

Given the observations of prices pit and It at times t1, ..., t2, we searchfor the portfolio which would have replicated as closely as possible the indexreturn over the period [t1, t2]. Therefore:

- First we have to choose a criterion Ft1,t2 which is a function of the trackingerror.

- Second, we search for quantities xit, i = 1, ..., nA, such that the corre-sponding portfolio minimizes Ft1,t2 . As seen previously, several measures canbe used. For example, the α-norm:

Et1,t2

=

(∑t2t=t1

∣∣rPt − rIt∣∣α) 1

α

t2 − t1, (4.5)

where α > 0. Gilli and Kellezi [255] use this criterion for α = 1.

4.1.3.3 Constraints

Several additional constraints can be introduced.

- No shortselling:xit 0, i = 0, ..., nA.

- Minimum and maximum amount on individual securities:

εi ≤ xitpit∑i∈Jt

xitpit≤ δi, i ∈ Jt, (4.6)


where εi and δi are exogenous.- Upper bound K on the number of assets:

# Jt ≤ K.

- Constraints on transaction costs: for a given non-negative coefficient γ,

Ct ≤ γVt− .

Other constraints can also be considered, such as limits on asset groups.

4.1.3.4 Optimization problem

Assume that at time t2 we have the portfolio Pt−2=xi,t−2

with value Vt−2

.

We attempt to “optimally” rebalance this portfolio in order to replicate theindex during the time period [t2, t3] . Thus, we search for the “ideal” portfolioP ∗t1 =

x∗i,t1

, unchanged on [t1, t2], which minimizes the function Ft1,t2 of

the tracking error on the time period [t1, t2] . Therefore, we have to solve:

minFt1,t2

with Ct ≤ γVt− ,∑

i∈Jtxit1pit1 + Ct1 = Vt−1

,

εi ≤ xi,t1pi,t1∑i∈Jt xi,t1pi,t1

≤ δi, i ∈ Jt,

# Jt ≤ K.

Then, assuming that P ∗t1 will be also optimal for the time period [t2, t3], at

time t2 we choose the portfolio Pt2 such that wi,t2 = w∗i,t1

.

4.1.3.5 Implementation

The TA algorithm is a local search algorithm which avoids local extremaby accepting solutions which are not worse by more than a given threshold.This latter one is gradually decreasing and is equal to 0 after a given numberof steps. The optimization problem can be defined as follows.

Let f : P → R be the objective function where P is the discrete set of allfeasible replicating portfolios. Consider fopt the minimum of f over the setP :

fopt = minx∈P

f (x) .

Consider the set of all optimal solutions:

Pmin = x ∈ P | f (x) = fopt .

The TA algorithm provides either a solution in Pmin, or close to an elementof Pmin. It is based on the use of a neighborhood function, defined as follows:


DEFINITION 4.1 -Let X be the set of all acceptable configurations fora given problem. We call “neighborhood” any function N : X → 2X , where2X is the class of all subsets of X.- A “mechanism of exploration of the neighborhood” is a procedure that

indicates how we pass from a configuration s ∈ X to a configuration s′ ∈ N(s).- A configuration s is a local minimum with respect to the function N if

f(s) ≤ f(s′) for any configuration s′ ∈ N(s).

For the TA algorithm, at any iteration r, the acceptance of the followingconfiguration s′ ∈ N(s) is only based on a function r(s′, s) and a thresholdTr; s′ is accepted if r(s′, s) < Tr.

The sequence (Tr)r is decreasing and Tr → 0. We start with a given port-folio P0. Then, the procedure indicates how to pass from one configuration,which is randomly chosen, to another which is in the neighborhood of theprevious one. It takes account of the objective function. The process stops assoon as the fixed number of iterations is reached (it allows for limitation ofcomputation time), or if another given condition is satisfied.

For example, Gilli and Kellezi [255] use the following function:

r(s′, s) = f(s′)− f(s).

The portfolio P1 is determined as follows: an asset index i1 is randomly chosenin JP0 and a fixed amount of the corresponding asset is sold and convertedinto number of assets. Next, an asset j1 is randomly bought. If the sizeconstraints are satisfied by the portfolio P0, asset j1 is chosen in JP0 . Afterselling i1 and buying j1, the portfolio constraints are examined. Adjustmentsare made if the constraints are not satisfied.

REMARK 4.1 The choice of assets i and j are random. However, wecan search for special probability distributions in order to improve the con-vergence. From the theoretical point of view, results about stochastic conver-gence have been proved by Van Laarhoven and Aarts [497], and Zhigljavsky[512]. Beasley et al. [52] introduce an index tracking method based on geneticalgorithms. A complete proof of convergence of genetic algorithms is given inCerf [109].

4.1.3.6 Empirical results

Gilli and Kellezi [255] test the algorithm when the exact solution is known.Using different constraints and objective functions, they prove that the TAalgorithm is an efficient method of solving the index tracking and also bench-marking problems. Such a result is also shown for example in [416], bothfor actual financial data and for Garch simulations. To illustrate the efficien-cy of the TA Algorithm, consider a simulation of a multidimensional DCC-MVGARCH model (see Appendix A). The correlation matrix is estimated


from data on the period from February 2002 to May 2004. Each of the 40stocks is modelled by a GARCH(1,1) process. The following figures illustratethe random behavior of the objective function MADD and its performance.

FIGURE 4.1: MADD minimization with ten stocks

On the first period, the manager applies the TA algorithm to determine thereplicating portfolio weights. On the second period, the value of this portfoliois compared with the index value. The MADD minimization is quite efficient.

FIGURE 4.2: MADD minimization: estimation and test


However, in order to limit transaction costs, we also have to take care ofthe weighting stability if the manager decides to rebalance the portfolio. Toillustrate this constraint, consider another DCC-MVGARCH simulation forwhich the manager rebalances the portfolio when one stock weight in theportfolio deviates by more than 1.5 percent (w.r.t. the corresponding weightin the index). For example, consider an initial invested amount equal to onemillion euros, and a proportional transaction cost of 0.5 percent. Then, wenote that generally the manager will rebalance the replicating portfolio twice(rebalancing times: T1and T2,) for which the differences in the weighting aregiven in next table. The replicating result is quite satisfactory.

TABLE 4.1: MADD minimization and weighting differencesStocks Initial weighting Differences in the Differences in the

n0 weighting at T1 weighting at T2

1 15.45% +0.91%2 9.15% +1.73%3 15.45%4 13.29% -0.64%5 11.52% -1.18%6 8.67% +0.87% -1.51%7 7.05% +0.17% +0.42%8 6.24% +0.44% +1.37%9 6.55% -0.21%10 17.63% -1.08% -1.84%

FIGURE 4.3: MADD minimization with constraints


4.1.4 Cointegration tracking method

Nowadays, the notion of cointegration is widely applied in time series anal-ysis with applications to macroeconomics and finance. Practitioners haverecently begun to use this method for indexed fund management or detectionof statistical arbitrage (see Alexander and Dimitriu [15] and Dunis and Ho[179]).

Initially, the financial modelling used the analysis of return correlation co-efficients. Correlation and cointegration are two related but distinct notions:correlation measures the short term dependency of two return series, whilecointegration measures the long term dependency. Indeed, the correlationanalysis implies dealing with stationary time series. Therefore, data seriesmust be transformed into stationary series by deleting the trend or by dif-ferentiating them (see Granger and Joyeux [263]). Consequently, commonstochastic trends between securities are ignored. The cointegration approachcan use more information. Thus, this method can better describe long term re-lations. It is also powerful since it allows for the use of simple statistical meth-ods such as least mean squares regressions, in order to study non-stationaryprocesses.

In this section, some basic properties of cointegration are discussed (forbasic statistical properties, see Greene [267]).

4.1.4.1 Tests of unit roots

To determine the type of non-stationarity, we have to introduce stationarytests (see Appendix A for basic statistical definitions and properties):

• Deterministic trend: function of time (t, t2, log (t) , ...). In this case, themean is increasing (or decreasing) but the variance is constant.

• Stochastic trend: the process X is such that

Xt = α0 + α1.t + εt, with εt ∼ ARMA.

Such process is stationary “around its mean.” This stochastic trend isdue to the existence of a unit root in the autoregressive process X :

Xt = φ0 + Xt−1 + εt, with εt ∼ ARMA.

Then, its variance may explode with time and its perturbations arepersistent. We have to differentiate it to get a stationary time series:

∆Xt = β0 + β1.t + (φ− 1)Xt−1 + εt,with εt ∼WN(0, σ2).

The stationary test (called the Dickey-Fuller test) is the following:

• Null hypothesis: (H0) : φ− 1 = 0,


• Alternative hypothesis: (H1) : φ− 1 < 1.

If assumption H1 is rejected, then the process is considered non-stationary.Coefficients β1 and β0 allow us to check if there exists a deterministic/stochastictrend. A special statistic has been introduced by Dickey and Fuller [180] (seealso Davidson and Mac-Kinnon [178]). However, this test is limited since itassumes that the noise is a white noise. Thus, we have to use the “AugmentedDickey-Fuller” (ADF) test.

Additional lagged variables are introduced:

∆Xt = β0 + β1.t + (φ− 1)Xt−1 +p∑

j=1

λj .∆Xt−j + εt, with εt ∼WN(0, σ2).

The number of lags p for the variable ∆Yt−j is chosen such that ε is a whitenoise.

REMARK 4.2 For a stationary time series, we have φ = 0. When d rootsexist, we call I(d) the time series X . It must be differentiated d times in orderto get a stationary series (see Granger [264]). Note that financial time seriesare generally such that d ≤ 2.

4.1.4.2 Cointegration definition

Consider two time series (Xt)t and (Yt)t. If they are both I(1), any linearcombination of these two variables is generally not stationary. However, insome cases, this combination is I(0), which means that it is stationary. Inthat case, the two series (Xt)t and (Yt)t are said to be cointegrated :

Yt = a + b.Xt + Zt where Yt ∼ I(1), Xt ∼ I(1) and Zt is stationary. (4.7)

This means that these two series have similar trends such that, for a specificlinear combination, their trends can be compensated to provide a stationarytime series. If Zt is strongly stationary, then the difference Yt− (a+ b.Xt) hasalways the same probability distribution. There exists a “long term equilibri-um” between the two series.

From relation (4.7), to test the cointegration property, it is sufficient to testthe stationarity of the residual terms zt.

This can be done by using Dickey-Fuller tests such as:

∆zt = (ϕ− 1)zt−1 + νt.


4.1.4.3 Tracking portfolio determination

This problem is solved by using the following three steps:

• First step. We have to select a given subset of assets, either by meansof, for example, “stock picking,”or by statistical methods. This step isessential to get a significant property of cointegration, which determinesthe quality of the tracking method. It cannot be based on cointegration.The manager has to test different securities combinations to be includedin the replicating portfolio. In particular, he must select the number ofassets. The higher this number, the more stable the cointegration (inthe absence of costs).

• Second step. The logarithm of the index value is regressed by leastsquares estimation (LSE) on the logarithms of the previously selectedasset prices:

log (It) = c1 +n∑

k=1

ck+1 log (Pk,t) + εt. (4.8)

Note that by the logarithm transformation, time series are more homo-geneous. If time series are cointegrated, then their logarithms are alsocointegrated. The residuals are stationary if and only if log (I) and thetracking portfolio

∑nk=1 ck. log (Pk) are cointegrated. The coefficients ck

are the weights of the tracking portfolio. In order to have a cointegra-tion relation, the residual term ε in relation (4.8) must be I(0) and allthe asset prices in the tracking portfolio are I(d) with d 1. Otherwise,the coefficients ck, determined from LSE, are not efficient and may befallacious.

• Third step. We have to estimate the following dynamic of index return:

∆ log (It) = γεt−1+n∑

h=1

αh∆ log (It−h)+∑k∈CI

Γk∆ log (Pk,t)+ut. (4.9)

This king of regression is an “error correction model”: it corrects shortterm effects.

The coefficient γ models the speed of mean-reverting to the long termvalue, given by the cointegration relation. It be must non-positive. Co-efficients Γk are normalized such that

∑nk=1 Γk = 1, since they represent

the weights of the tracking portfolio.

4.1.4.3.1 First step: causality To determine the assets to introduce inthe tracking portfolio, the notion of causality between the tracking portfolioand the index can be introduced. This notion has been considered in Granger[262]. A series induces causality on one another “if the knowledge of the firstone improves the forecast of the second one.”


Granger causality test. This is based on linear regressions.

DEFINITION 4.2 Let X be a multivariate process Xt = (X1,t, ...Xn,t)′.

The random variable Xj does not cause Xk if and only if, at any time t, theknowledge of the past of Xj, Xj,t−1,does not improve the forecast of Xk,t+H ,for any time horizon H:

EL (Xk,t+H/Xt−1; 1 ≤ i ≤ n) = EL(Xk,t+H/Xj,t−1

), (4.10)

where EL (./.) is the linear regression operator, and Xj,t−1 is the set of randomvariables Xi,t−1; i = j.

In practice, for any asset i, consider the following regression on the rates ofreturn:

rIt = α0 +k∑

p=1

αprIt−p +

k∑p=1

βprit−p + εt. (4.11)

To determine the optimal value of the number k of assets, the Akaike criterion(AIC) [11] can be used:

AIC = −2(L

T) + 2

k

T, (4.12)

where L is the likelihood, and T is the number of observations. The Grangertest on Equation (4.11) for any asset i, are based on a Fisher test on the jointhypothesis:

H0 : β1 = β2 = ... = βk = 0. (4.13)

Sims causality Another definition of causality is given by Sims [471]. Itis based on impulse analysis. A time series causes another one if a randomshock on the first one has an impact on the other one, especially on thevariance of its forecast errors about its future values. To characterize theSims causality between two stationary time series, consider the Wold canonicaldecomposition:

Xt = (X1t, ..., Xnt)′ = C(L)εt = C(L)(ε1t, ..., εnt)′, (4.14)

where L is the lag operator, C(0) = Id, and εt is the canonical innovation ofXt which has the variance-covariance matrix Σ = (σij)1≤i,j≤n :

εit = Xit − EL(Xit/Xi,t−1

). (4.15)

The random variable Xt can be decomposed on two moving averages, accord-ing to:

Xit =n∑

j=1

Cij (L) εjt. (4.16)


According to Sims, at any time t, past shocks on variables Xj have an impacton Xit, through the variable Cij(L)εjt. This approach allows us to measurethe influence of different impulses on variables. Therefore, the manager canselect securities to be included in the tracking portfolio by using two causalitycriteria: the first one based on forecast improvement, the second on impulsereaction.

4.1.4.3.2 Second step: cointegration tests The first method, due toEngle and Granger [205], uses the following approach. We begin to search fora long term relation between a dependent variable and explicative variables.Then we have to introduce an error correcting model for the cointegrated vari-ables. Nevertheless, this method assumes prior relations and special causalitybetween the variables.

The second method uses the maximum likelihood, introduced by Johansen[304]. Since this method uses a multivariate model, it allows for differentiatingof several cointegrated vectors. Stock and Watson [481] prove that under thecointegration hypothesis, (i.e., ε is stationary), the estimators of the coeffi-cients ci in Equation (4.8) have a very fast speed of convergence: it is equalto T and not to

√T , as is usual.

4.1.4.3.3 Third step: stability tests After the estimation of the track-ing portfolio weights, the fund manager faces the stability problem of thesecoefficients. One possible stability test is the “Cumulative Sum Test” definedby Brown et al. [92].

From Equation (4.9), consider the recursive residual terms wt defined by:

wt =(yt − x′

tb)(1 + x′

t (A′tAt)

−1xt

) 12, (4.17)

where x′t is the observation vectors at time t, At the regressor matrix from

time 1 to time t, b the vector of regression coefficients, and yt−A′tb the value

of forecast errors. Its variance is given by:

σ2(1 + x′

t (A′tAt)

−1xt

). (4.18)

The CUSUM test is based on the statistics:

Wt =t∑

r=k+1

wr

s, (4.19)

where wr is the vector of recursive residual terms, and s is the standarddeviation of the regression on the whole sample. If the tracking portfolioweights are constant, then E (Wt) = 0.


REMARK 4.3 Empirical tests are mitigated. For example, Alexanderand Dimitriu [15] compare properties of tracking portfolios based on coin-tegration and those based on tracking error quadratic minimization. Theyconclude that there is no significant advantage to use cointegration. Howev-er, on other financial data, Dunis and Ho [179] show that the cointegrationmethod reduces the number of rebalancing times. In [417] and [487], suchanalyses are made on the main French stock index CAC40. The cointegrationapproach appears to be an efficient method.

4.2 Benchmark portfolio optimization

As seen in Chapter 3, asset allocation can be based on mean-variance op-timization. This allows for the determination of a set of asset class weights.This can induce a “long-term” investment plan, often called the portfolio’sstrategic asset allocation. It can be viewed as a benchmark for portfolio opti-mization. Indeed, the choice of the benchmark is crucial.

However, typically, the portfolio manager has the task of “beating” thebenchmark. He may have short-term forecasts which deviate from the long-term forecasts associated to the benchmark. In that case, market timing (ac-tive asset allocation) may lead to a divergent portfolio choice. This approachis usually called tactical asset allocation. Another example of such diver-gence from an index or a benchmark is the core/satellite approach: accordingto some specialists, an investor would invest on one hand on an index fundwhich is considered as the “core” portfolio, and on the other hand choosea skillful manager to add value. This second investment usually called the“satellite” is based on active management.

But, generally, the portfolio manager must choose a portfolio that doesnot deviate too much from the benchmark. The tracking-error between theportfolio and the benchmark must be controlled. As seen in previous section-s, several objective functions can be introduced to measure the risk of thetracking. The most common function is standard deviation. Note that forbenchmark portfolio optimization, the tracking error is not necessarily min-imized, as it is for index fund management. For instance, it can be fixedat a given “rational” level and other decision criteria can be introduced todetermine optimal solutions.


4.2.1 Tracking-error definition

4.2.1.1 Tracking ex-ante and tracking ex-post

The tracking error is defined as the difference between portfolio and bench-mark returns. It represents a relative risk of the portfolio with respect to itsbenchmark. However, as mentioned by Pope and Yadav [411] and Satchelland Hwang [448], the tracking error can be tricky:

• First, the tracking-error can be “anticipated.” The risk is measured ex-ante (tracking ex-ante). It is a statistical measure. It is a measure ofriskiness w.r.t. the benchmark.

• Second, the risk is measured on “realized” risk (tracking ex-post). Thismeasures the time series standard deviation of the realized active re-turns.

For a tracking ex-ante, the fund manager chooses an allocation differentfrom the benchmark’s in order to get a better return expectation than thebenchmark. To do so, the anticipated function of the tracking, such as stan-dard deviation, may be significantly above 0. For example, for an ex-antetracking error of 2%, the portfolio returns will fall within +/ − 2% of thebenchmark with a two-thirds probability. At maturity, ideally, the fund man-ager will have succeeded in having a realized return at a given level x% abovethe benchmark, while the tracking error was almost constant. Unfortunately,the tracking error is often underestimated. Pope and Yadav [411] indicatethat autocorrelation of excess returns is a reason for this underestimation.Satchell and Hwang [448] argue that ex-ante and ex-post tracking errors d-iffer when the realized benchmark volatility is high. Haar and van Straalen[279] confirm that ex-post tracking errors are higher than predicted becauseof changes in volatility, but also because of the concentration in systematicrisk factors in a portfolio.

4.2.1.2 Tracking-error, correlation and beta coefficients

We first must distinguish the tracking error from the correlation coefficientand the beta coefficient:

• The correlation coefficient (ρ) measures a “linear risk” between the port-folio return RP and the benchmark return RB. For a given standarddeviation σP of the portfolio return, the correlation coefficient decreas-es when the specific risk increases. Here, the specific risk indicates theportfolio risk which is independent from the benchmark return.

• The beta coefficient (βP ) measures the risk exposure to the benchmark.For example, if βP > 1, then the fund manager incurs a systematic riskw.r.t. the benchmark.


If the objective function on the tracking error T is the variance, denoted byT 2, then we have:

T 2 = σ2 (RP −RB) = σ2B + σ2

P − 2ρσBσP ,

= σ2P + σ2

B (1− 2βP ) . (4.20)

Therefore, the tracking-error volatility (TEV) is a function of the totalportfolio volatility, of its exposition to the benchmark, and also of the totalbenchmark volatility. As a consequence, if we focus on excess returns whileincluding benchmark securities, the total risk of the portfolio may be high.

REMARK 4.4 The Sharpe market model allows for the introduction ofthe relation between the tracking-error, the systematic risk βPσB , and thespecific risk σεP .

This model is:RP = αP + βP .RB + εP . (4.21)

Thus, we have:RP −RB = αP + (βP − 1) .RB + εP , (4.22)

where εP has a standard Gaussian distribution. Then, we deduce:

T 2 = (βP − 1)2 σ2B + σ2

εP . (4.23)

4.2.2 Tracking-error minimization

Several portfolio optimization programs can be introduced to control thetracking-error, depending on what kind of objective function is used. Specificconstraints are also often introduced. In what follows, the volatility of thetracking-error is considered, since it is the basic criterion.

Roll [431] has studied two optimizations programs under a constraint onthe tracking-error’s volatility: minimization of the tracking-error volatility(TEV), and minimization of TEV under a beta constraint.

4.2.2.1 TEV minimization under a mean-return constraint

Consider the same notations as in Chapter 3 for the mean-variance analysis,in particular the definitions of A,B,C, and D (see 3.1.2.1). In addition,denote:b: the benchmark weight vectorx=(w − b): the vector of differences between the weights of the portfolio

and those of the benchmarkRB = b′R: the expected benchmark returnσ2B = b′V b: the variance of the benchmark return


The tracking-error volatility (TEV), denoted also by T , is:

T =√

(w − b)′V (w − b) =√x′Vx, (4.24)

where V is the variance-covariance matrix of asset returns.The first optimization problem is the following one:

P (1):

minw

(w − b)′ V (w − b)

with (w− b)′ R = G,w′e =1.

(4.25)

The term G is the excess expected return of the portfolio w.r.t. the bench-mark. Roll [431] proves the following result.

PROPOSITION 4.11) The optimal solution is given by:

w = b+ D (q1 − q0) , (4.26)

whereD =

G

R1 −R0

and q0 = V−1BC, q1 = V−1R

A. (4.27)

The parameter coefficient D can be viewed as a relative performance coef-ficient. The portfolio q0 is the mvp, and q1 is also a mean-variance efficientportfolio with:

R0 =A

C, R1 =

B

A,

σ20 =

1C, σ2

1 =B

A2.

2) The tracking-error variance and the total variance of optimal solutionsfrom Problem P (1) are given by:

T 2 = D2(σ21 − σ2

0

),

σ2 = σ2B + T 2 + 2Dσ2

0

(RB/R0 − 1

). (4.28)

REMARK 4.5 Optimal portfolios given in relation (4.26) are the sums ofthe benchmark and of a deviation term. This one depends only on two specialmean-variance efficient portfolios and on the anticipated excess return, whichis independent from the benchmark. The tracking-error volatility (relation(4.28)) is an increasing linear function w.r.t. the anticipated excess return,which does not depend on the benchmark.


DEFINITION 4.3 The set of optimal solutions associated to ProblemP (1) is generated by the variation of the excess return G. In risk-returnspace, this set is called the relative frontier.

REMARK 4.6 Most of the time, the benchmark is not mean-varianceefficient, as empirically shown by Grinold [272]. If the benchmark is efficient,then all optimal solutions of Problem P (1) are also mean-variance efficient.Otherwise, they are not.

Portfolio solutions of Problem P (1) have two shortcomings:

• First, generally they do not mean-variance dominate the benchmark.

• Second, their beta coefficients w.r.t. the benchmark are higher than 1.

These properties are examined in what follows.

A) Mean-variance dominance.

- Case 1: RB > R0. Solutions of Problem P (1) do not dominate the bench-mark. This case is illustrated by the following figure.

FIGURE 4.4: Efficient and relative frontiers, RB > R0

In that case, the benchmark has an expected return above the mvp expectedreturn. Then, if the excess return of a portfolio w.r.t the benchmark is non-negative (G > 0), the portfolio total risk is higher than the benchmark total


risk (see relation (4.28)). If a riskless asset is introduced, such property is stillsatisfied for benchmark having an expected return higher than the riskless one.Therefore, for this first case which is the most usual, none of the solutions ofProblem P (1) mean-variance dominate the benchmark (which also does notdominate them).

- Case 2 : RB < R0. Some of the optimal solutions dominate the bench-mark. For small tracking error volatility, the total risk is smaller than thebenchmark total risk. However, for a higher excess return G, optimal solu-tions no longer dominate the benchmark. These properties are illustrated bythe following figure, which includes both mean-variance and relative efficientfrontiers.

FIGURE 4.5: Efficient and relative frontiers, RB < R0

Indeed, the term D measures a relative gain when taking a sufficient tracking-error in order to generate an excess return. When D is small (the portfolio isclose to the benchmark), a first-order Taylor’s expansion leads to the followingapproximation:

σ2P σ2

B + 2D(RB/R0 − 1

). (4.29)

The portfolio total risk σ2P is smaller than the benchmark total risk σ2

B .

This explains why the portfolio manager gets a portfolio which dominatesthe benchmark (according to mean-variance criterion).


B) Beta coefficients w.r.t. the benchmark.The beta coefficients of optimal solutions w.r.t. the benchmark are given

by:

βP = 1 + D

(σ20

σ2B

)(RB/R0 − 1

). (4.30)

Consider the usual case when the benchmark has an expected return higherthan the expected return of the mvp (RB > R0). Then, βP > 1. Therefore,the systematic risk w.r.t. the benchmark is significant, as soon as the portfoliomanager has a benchmark-timing strategy.

4.2.2.2 TEV minimization under mean-return and beta constraints

In order to mitigate the previous drawbacks, Roll [431] proposes a relativeoptimization problem with an additional constraint on the portfolio beta. Fora given portfolio exposition to the benchmark (the beta), we can search forportfolios which minimize the tracking error. The interesting point is to getoptimal solutions with beta smaller than 1, according to the given constraint.

The new optimization problem is:

P (2):

minw

(w − b)′ V (w − b)

with (w− b)′ R = G,w′e =1,

w′Vbσ2B

= β.

(4.31)

PROPOSITION 4.2The solution of Problem P (2) is given by:

w∗ = b+ νb+V−1(γR+ δe

), (4.32)

where the Lagrange multipliers are given by:

γ =G(1− Cσ2

B

)− σ2B (β − 1) (B − CRB)

(A−BRB) (1− Cσ2B)− (B − CRB) (RB −Bσ2

B), (4.33)

δ =σ2B (β − 1) (A−BRB)−G

(RB −Bσ2

B

)(A−BRB) (1− Cσ2

B)− (B − CRB) (RB −Bσ2B)

, (4.34)

ν = −γB − δC. (4.35)

Roll [431] proves that the optimal solution is a convex combination of the threeportfolios q0, q1, and the benchmark b.

The following figure presents three “beta frontiers.”Note that for some fixed beta values (for example β < 1 ), some solutions

of Problem P (2) mean-variance dominate the relative frontier associated to


FIGURE 4.6: Efficient, relative, and beta frontiers

Problem P (1). Nevertheless, these portfolios have tracking-error volatilitieshigher than those of the relative frontier.

REMARK 4.7 By fixing beta values smaller than 1, the benchmark-timing risk is limited. However, for β = 1, the beta frontier does not containthe benchmark, which may be a drawback.

4.2.2.3 Mean-variance optimization under TEV constraint

As mentioned by Roll [431], excess return optimization leads to optimalportfolios with systematically higher total risk than the benchmark. Thus,we can search for a criterion which controls both the total risk and the rela-tive risk (TEV).

For example, Bertrand et al. [61] consider that investors (or fund man-agers) maximize a mean-variance criterion under a tracking-error volatilityconstraint. In this framework, the manager has to solve the following opti-mization problem:

P (3):

max w′R− φ2w

′Vwwith (α− b)′V (α− b) = T 2,

w′e =1.(4.36)

The parameter φ is the marginal substitution rate between the return andthe variance. It can be viewed as an aversion w.r.t. the variance. In whatfollows, we call it the “variance aversion.”


PROPOSITION 4.3The optimal solution is given by (see Bertand et al. [61]):

w∗∗ = b+ 1φ+λ

(−φb+V−1(R− µe

))λ = −φ +

√φ2(σ2

B−σ20)+2φ(R0−RB)+ d

C

T ,

µ = B−φC .

REMARK 4.8 The set of portfolio solutions of the previous ProblemP (3) is the same as the set of portfolio solutions of Roll program, as soonas the expected rate of return of the benchmark is greater than that of theminimum variance portfolio.

For a fixed variance aversion φ, the φ-frontier associated to Problem P (3)is generated when the TEV is varying.

FIGURE 4.7: Efficient, relative, and Φ frontiers

When the variance aversion φ increases, the φ-frontiers move to the left(area of portfolios with small total risk). If the variance aversion is high,the portfolio manager chooses portfolios with a total risk smaller than thebenchmark total risk. If the variance aversion goes to infinity, then the optimalportfolio converges to the mvp, if the TEV is sufficiently high. For some valuesof φ (for example φ = 2.5), some optimal portfolios mean-variance dominatethe benchmark. Note also that all these frontiers contain the benchmark.


REMARK 4.9 When φ = 0, the set of optimal solutions P (3) is equalto the set of solutions of Problem P (1). The relative frontier corresponds toinvestors who are “neutral” w.r.t. the total risk (φ = 0). Nevertheless, wehave to choose an aversion risk level. For example, we can consider values ofφ such that there exists a tangency point between the φ-frontier (solution ofP (3)) and the mean-variance frontier. An optimal portfolio can be chosen onthat curve, according to the TEV. The set of optimal solutions of ProblemsP (2) and P (3), which have mean returns higher than the mvp return, areequal: any solution of P (2), associated to a pair (G, β), is equal to one andonly one solution of P (3), associated to a pair (φ, T ).

Example 4.1Consider a fund manager who chooses a TEV equal to 3.00%. For the param-eter values considered in this example, we have:

1) For φ = 2.4145, the portfolio solution of Problem P (3) has the followingcharacteristics:

Standard deviation Mean TEV Beta18.00% 9.70% 3.00% 0.975

This portfolio mean-variance dominates the benchmark. Its beta is smallerthan 1. Therefore, this portfolio is closer to the benchmark than the corre-sponding portfolio belonging to the relative frontier.

2) Consider now the portfolio belonging to the relative frontier with thesame TEV. Its characteristics are:


The fund manager may choose to lose 0.32% in the mean return in orderto reduce the total volatility (−1.24%) while also reducing the exposure beta(from 1.045 > 1 to 0.975 < 1).

3) We can also consider the TEV-frontier with TEV=3.00% (the constant-TEV-frontiers are ellipses in the risk/return space). Then, we can select theportfolio belonging to this curve with the same total risk as the benchmark.Its characteristics are:


Contrary to the solution of Problem P (3) with φ = 2.4145, this porfoliodoes not mean-variance dominate the benchmark.

The following figure shows the comparison between the three criteria andthe associated solutions.


FIGURE 4.8: Frontiers and iso-tracking curves

4.3 Further reading

As examined in Sorenson et al. [474], investors and fund managers have toallocate between active and passive management.

Strategic allocation consists of defining the orientation of the investment:By determining the risk aversion and horizon in order to choose the bench-

mark;By choosing the portfolio composition: asset class allocation among cash,

fixed-income products (domestic or international) and stocks (by sectors orstyles), alternative investments, etc; and,

By fixing the tolerance of the fund manager with respect to the benchmark.

Tactical allocation is more concerned with the effective management process.It consists of searching for the best way to achieve the previous goal:

By choosing the weighting asset class with respect to the benchmark. Thiscan be done by market timing (which induces the determination of the expo-sure to the benchmark), and by asset allocation among sectors, lands, etc.

By using asset valuation models.


Main Index tracking methods are based on statistical replication which goesback to the 70s. Many articles have examined this method. Rudd [441] ex-amines the selection of passive portfolios. Rudd and Rosenberg [442] considerportfolio optimization problems and their realistic implementation. Bambergand Wagner [40] search for robust regression estimators to replicate equityindexes. Dash et al.[148] attempt to measure the efficiency and costs of suchportfolio optimization.

Evolutionary algorithms are appropriate candidates for being able to con-tinuously track the movement of the optimum through time. In order to solvelarge scale optimization problems with constraints, “metaheuristic” method-s have been introduced, e.g., by Derigs and Nickel [163] for index tracking.They are based on neighborhood methods and evolutionary algorithms suchas genetic algorithms. These latter methods are also used in other applica-tions in finance, as shown by Bauer [51]. Conditions insuring the convergenceof the

threshold-accepting algorithm are also provided in Althofer and Koschnick[21]. Applications of this algorithm in econometrics is provided in Winker[505].

An overview of some different benchmark portfolio optimization proceduresis given in Wagner [500] (see also Wagner [499]). Other references are: Franks[241] and Clarke et al. [124] who study tracking errors and tactical asset al-location, and, Jorion [310] who proves that the set of tracking-error volatilityconstrained portfolios is an ellipse on the mean-variance plane.

Gaivoronski and Krylov [250] present several portfolio selection algorithmsin a dynamic setting which take into account different risk/target measures.They search for the best tradeoff between tracking precision and rebalancingfrequency in order to control costs.

Chapter 5

Portfolio performance

Competition among financial institutions has led to the performance analysiswhich examines the qualities of managers who use active investment strate-gies. The investment process is overall evaluated.

The fundamental question is:Does a given manager provide an actual value-added service with respect to

a simple index replication or to a given benchmark?As a by-product, the performance analysis is a test of the market efficiency

which is based on different kinds of information, as introduced by Fama [217].An active manager tries to take opportunities from particular information

(public or private) which may not be reflected by market prices.

To analyze a manager’s performance, it is necessary:

• First, to introduce specific performance measures, taking account of d-ifferent types of risk. Therefore, several risk measures can be consideredin order to evaluate the risk associated to the performance.

• Second, to attribuate this performance:

- What are the exact contributions of each investment decision to theoverall portfolio performance?- Is the manager skillful or lucky?- Compared with the benchmark, does the performance come from stockpicking or from market timing?

• Finally, to examine the performance consistency? More precisely, is pastperformance a good indicator of future performance?

129


5.1 Standard performance measures

In order to ensure that fund comparisons are legitimate:

• First, funds must have the same nature.

• Second, a standardization of performance measures and their resultsmust be introduced.

The AIMR (Association for Investment Management and Research) provid-ed such standards, further developed by the GIPS (Global Investment Perfor-mance Standard). They focus mainly on equities and fixed-income securities.Generally, two measures are considered: a measure of total risk, such as thestandard deviation, and a measure of market risk, such as the beta. The firstone is “absolute,” the second one is “relative.”

For the latter measure, we have to evaluate the sensitivity of securities tothe market. This can be done by using the fundamental CAPM introducedby Sharpe ([462],[463]).

5.1.1 The Capital Asset Pricing Model

Suppose that there exists a riskless asset with return Rf . As seen in Chapter3, under the Markowitz assumptions, the two-fund separation theorem is valid:any efficient portfolio P is a combination of the riskless asset and of the marketportfolio M , which corresponds to the point of tangency between the twoefficient frontiers (with and without the riskless asset).

Then, we have:

RP = xRf + (1 − x)RM and RP −Rf = (1− x)(RM −Rf

). (5.1)

The choice of x depends on the risk aversion. Therefore, its variance isequal to:

σP = (1− x)σM . (5.2)

Consequently,

RP = Rf + σP

(RM −Rf

σM

). (5.3)

In the Markowitz plane (σ(R), R), this equation defines the efficient frontier,which is also called the capital market line.

Following Sharpe’s demonstration, at equilibrium the prices of assets aresuch that the market portfolio is made up of all assets in proportion to theirmarket capitalizations (in practice, a stock exchange index). Then, consider

Portfolio performance 131

the portfolio Pi (not necessarily efficient) with a proportion x invested onasset i, and (1− x) invested on M . We have:

RP = xRi + (1 − x)RM , (5.4)

σP =[x2 + σ2

i + (1− x)2σ2M + 2x(1− x)σiM

]1/2. (5.5)

Consider now the curve of all possible portfolios Pi when x is varying. Thecoefficient of the tangent to this curve at the point associated to x is givenby:

∂RP

∂σP=

∂RP /∂x

∂σP /∂x=

(Ri −RM

)σP

x (σ2i + σ2

M − 2σiM ) + σiM − σ2M

. (5.6)

Since for the market portfolio M we have x = 0, we deduce from Equation(5.3) that: (

Ri −RM

)σM

σiM − σ2M

=(RM −Rf

σM

),

which is equivalent to:

Ri −Rf = βi(RM −Rf ) with βi = σiM/σ2M . (5.7)

The coefficient beta represents the systematic risk which is due to expositionto the market variations. In the plane (βM (R), R]), the previous equationdefines a straight line, the so-called security market line. At equilibrium, allassets (thus all portfolios) are located on this line.

Beta

Mean-return

RM

Rf

Market

1

Security market line

FIGURE 5.1: Security market line

Note also that, at equilibrium, the market portfolio is optimal, which isin favor of passive management based on index funds. As shown by Black


[73], the CAPM is still valid without the riskless asset, which is replaced bya zero-beta portfolio Z:

Ri −RZ = βi(RM −RZ) with βi = σiM/σ2M .

Taxes can also be taken into account, as in Brennan [87]:

Ri −Rf − T (Di −Rf ) = βi(RM −Rf − T (DM −Rf )) with T =

Td − Tg

1− Tg,

where Td denotes the average taxation rate for dividends, Tg denotes theaverage taxation rate for capital gains, Di and DM denote respectively thedividend yields of asset i and market portfolio M .

The CAPM highlights the relationship between the excess mean return (the“reward”) and the exposure coefficient beta (the “risk”). Other factors canalso be introduced to estimate the excess return, as shown in the next sectionon performance decomposition.

5.1.2 The three standard performance measures

These performance measures are based on the previous properties: Relation(5.3) which defines the capital market line and relation (5.7) which definesthe security market line. The higher these performance measures, the moreinteresting the portfolio.

5.1.2.1 The Sharpe measure

This measure is based on the the capital market line. For all efficientportfolios, we have the following equality between (excess reward)/(total risk)ratios:

RP −Rf

σ (RP )=

RM −Rf

σ (RM ). (5.8)

The previous ratio is the slope of the capital market line. As for Markowitzportfolio optimization, we search for such portfolios, since this is equivalentto the maximization of the ratio RS = RP−Rf

σ(RP ) . Thus, as proposed by Sharpe[464], this can be actually considered as a performance measure. It is equal tothe excess mean return (w.r.t. the riskless asset) and the measure ot total risk(the standard deviation), and is defined as the “reward-to-variability ratio.”For example, the manager can check if the excess mean return of the portfoliois sufficient to compensate a higher risk than the market portfolio. If theportfolio is well-diversified, its Sharpe ratio is close to the market portfolio’s(see also comments in Sharpe [466]).


5.1.2.2 The Treynor measure

The Treynor’s ratio [494] is directly based on the CAPM:

RT =RP −Rf

βP. (5.9)

It can also be viewed as a reward-to-risk ratio where the “risk” is the ex-position to market risk. At equilibrium, this ratio is constant and equal toRM −Rf . The Treynor ratio allows us to evaluate the performance of a well-diversified portfolio, since it only involves the systematic risk. It can be usedto examine performance of portfolio which is only a part of the investor’sassets. Due to previous diversification, the investor takes care only of thesystematic risk.

5.1.2.3 The Jensen measure

This measure is also based on the CAPM, but mainly when it is not satisfied.Indeed, consider the difference αP between the mean excess return of portfolioP and the return explained by the CAPM. Then:

αP =(RP −Rf

)− βP

(RM −Rf

). (5.10)

Consequently:RP −Rf = αP + βP

(RM −Rf

). (5.11)

The coefficient αP is called the performance measure introduced by Jensen[300]. Therefore, the manager searches for portfolios with αP > 0. For ex-postmeasures, the manager will be judged skillful if the coefficient αP is signifi-cantly above 0.

If αP = 0, the return of portfolio P is at equilibrium and the manager’sforecasts have not beat the market performance: the portfolio has the samealpha as any combination of the riskless asset and the market portfolio. Unlikethe Sharpe and Treynor ratios, the Jensen measure contains the benchmarkitself. As with the Treynor ratio, it only takes account of the systematic risk.Due to the particular form of alpha, only portfolios with the same risk betacan be compared. Otherwise, we can consider, for example, the Black-Treynorratio:

RBT =αP

βP.

REMARK 5.1 Previous ratios are defined ex-ante. However, ex-postformulas can be also introduced, as shown by Jensen [300]. Consider themarket model:

RPt = γP + βPRMt + εPt, (5.12)


where εPt has zero expectation, is independent from RMt, and:

Cov(εPt, εPs) = 0, ∀t = s.

The return expectation of portfolio P is given by:

RP = γP + βPRM .

Thus:RPt = RP + βP

(RMt −RM

)+ εPt.

Finally:RPt −Rf = αP + βP (RMt −Rf ) + εPt. (5.13)

Assume that Rf and βP are constant. Then, we can get an unbiased estima-tion of the coefficient αP by using a least squares regression. The estimatorsof the unknown parameters αP and βP are given by

βP =Cov (RM −Rf , RP −Rf )σ2 (RM −Rf )

,

αP =(RP −Rf

)− βP

(RM −Rf

),

where the symbol X denotes the empirical mean of the sample which containsT observations of the random variable α. The significance of the estimator αP

can be based on a Student test on the hypothesis H0 : αP = 0. The ex-postSharpe and Treynor ratios are given by:

RS =RP −Rf

σ (RP )and RT =

RP −Rf

βP

. (5.14)

REMARK 5.2 Jensen [300] considers the following beta specification:

βPt = E [βP ] + εPt,

where εPt is a white noise. The manager has a mean beta target E [βP ]. Inthis framework, a particular estimation model of the alpha coefficient αP isintroduced (see [300]). The manager can adjust the beta from the globalmarket forecasts.


5.1.2.4 Performance measures comparison

From Equation 5.11, relations between all these measures can be deduced.

PROPOSITION 5.1(Relations between Sharpe, Treynor, and Jensen measures)

- The Sharpe ratio is also equal to:

RS =RP −Rf

σ (RP )=

αP

σ (RP )+

ρ (Rp, RM )σ (RM )

(RM −Rf

). (5.15)

For any efficient portfolio P , we have: ρ (Rp, RM ) = 1. Therefore, we get:

RS =αP

σ (RP )+

(E [RM ]−Rf )σ (RM )

. (5.16)

- The Treynor ratio is simply equal to:

RT =E [RP ]−Rf

βP=

αP

βP+ (E [RM ]−Rf ) . (5.17)

- If the portfolio is well-diversified, we have:

ρ (Rp, RM ) 1. Thus: βP σP

σM. Then: RS RT

σM.

The following example illustrates properties of the three performance mea-sures.

Example 5.1Suppose that the fund manager has private information about a given firm.The portfolio denoted by A contains only this stock. We have:

RA = αA + Rf + βA (RM −Rf ) + εA.

We assume that this information is sufficiently relevant such that:

E [RA](4.9 %)

= αA(0.5 %)

+ Rf(2 %)

+ βA(0.687)

(E [RM ](5.49 %)

−Rf

).

The total risk is assumed to be equal to 6% and the specific risk equal to4.72%. These risks are assumed to not be modified by the information.

A second fund manager has a well-diversified portfolio B due to the “good”knowlegde of many stocks with total risk equal to 8.71% and the specific riskequal to 4.73%. We have:

RB = αB + Rf + βB (RM −Rf ) + εB.


with:

E [RB](7.24 %)

= αB(0.5 %)

+ Rf(2 %)

+ βB(1.359)

(E [RM ](5.49 %)

−Rf). (5.18)

The next figure illustrates these assumptions. Points A′ and B′ correspondto portfolios A and B at equilibrium (i.e., with 0 alpha).

6

-Beta

Mean-return

Rf

RM

1


0.7 1.4

= 2%

A′

AMarket

B′

B

FIGURE 5.2: SML and portfolios A, B, A′ and B′

The following figure represents the efficient frontier without riskless assetand the CML.

6

-

Market

mvp

Capital market line

Standard deviation

Mean-return

A′

A

B′B

FIGURE 5.3: Capital market line


The well-diversified fund B′ is assumed to be efficient, while the fund A′,which contains only one stock, is not on this frontier. Using the alpha measure,funds A and B cannot be distinguished since both fund managers have thesame excess return w.r.t. the CAPM (equal to 0.5%). For this special case,the Jensen measure does not allow for the comparison of these two portfolios,having too different risks.

As mentioned by Modigliani and Pogue [392], the Treynor ratio allows usto reduce the bias due to the leverage effect which is contained in the Jensenalpha. Indeed, fund A has an excess return per systematic risk unit whichis higher than fund B. Assuming that we can borrow the riskless asset, theleverage effect allows us to reach A∗ (from A), which has the same beta asfund B but a higher mean return. The following figure shows A, A∗, and B.Funds A and A∗ have same Treynor ratio (equal to 4.22%) which is higherthan fund B (3.86%).

6

-Standard deviation

Mean-return

Rf=

1


2%

A

Market

BA∗

FIGURE 5.4: Capital market line and leverage effect

The Sharpe ratio applied to funds A and B allows us to take account ofthe diversification. The Sharpe ratio of fund A is equal to 0.4833, while forfund B it is equal to 0.6025. Therefore, the manager who chooses the morediversified portfolio has a higher performance ranking.

PROPOSITION 5.2

(A condition ensuring the same ranking by Sharpe and Treynor ratios)

Consider two funds A and B. Their rankings w.r.t. Sharpe and Treynorratios are equivalent if they have the same correlation coefficient with the


market portfolio:

(RTA > RTB ⇐⇒ RSA > RSB) if ρAM = ρBM . (5.19)

PROOF We have:

RTA > RTB ⇐⇒ RA −Rf

βA>

RB −Rf

βB,

⇐⇒ σM

ρAM

RA −Rf

σA>

σM

ρBM

RB −Rf

σB⇐⇒ RSA

ρAM>

RSB

ρBM.

REMARK 5.3 If RTA > RTB and ρAM > ρBM , then RSA > RSB.Otherwise, no general comparison is possible. Using the market model, wehave:

σ2A = β2

Aσ2M + σ2

εA ⇐⇒ σ2A = ρ2AMσ2

A + σ2εA ⇐⇒ ρ2AM = 1− σ2

εA

σ2A

. (5.20)

Thus, the higher the correlation coefficient, the smaller the specific risk for agiven total risk. Then, if fund A has a higher Treynor ratio than fund B anda smaller (specific risk)/(total risk) ratio, then fund A has a higher Sharperatio than fund B.

PROPOSITION 5.3(A condition ensuring the same ranking by Sharpe and Jensen measures)If σA < σB and ρAM > ρBM , then:

αA > αB ⇐⇒ RSA > RSB. (5.21)

PROOF The Sharpe ratios are given by:

RSA =αA

σA+

βA

σA(E [RM ]−Rf ) , RSB =

αB

σB+

βB

σB(E [RM ]−Rf ) .

Therefore:

RSA > RSB ⇐⇒ αA − σA

σB.αB > RSM (ρBM − ρAM )σA.

Using the assumption σA < σB and ρAM > ρBM , we deduce:αA > αB ⇐⇒ RSA > RSB.


REMARK 5.4 The Sharpe ratio does not modify the Treynor ratio rank-ing RTA > RTB if ρAM > ρBM . In order to get such general relation betweenthe Sharpe ratio and the Jensen measure, we have to introduce an additionalcondition on the total risk: σA < σB.

5.1.2.5 Roll’s criticism

According to Roll [430]), it is very difficult to determine the true marketportfolio. Moreover, the market portfolio has to be mean-variance efficient.But, the true market portfolio cannot be observed since it must contain allthe risky assets, even those which are not traded. Since a stock exchangeindex has to be substituted, the empirical studies depend on this choice. TheCAPM is validated if this index is mean-variance efficient.

Therefore, Roll’s criticism concerns performance measures based on theCAPM, such as the Treynor ratio and the Jensen measure. Using a portfoliowhich is not the true market portfolio may lead to estimation errors in thebetas. However, the measurement errors can be corrected, as proposed byShanken [461]. The Sharpe ratio avoids this problem, since it is based on thetotal risk instead of the beta. However, if we compare the Sharpe ratio ofa given portfolio to the Sharpe ratio of an index, this comparison obviouslydepend on the choice of index. In addition, some statistical problems mayalso appear when estimating the Sharpe ratio.

REMARK 5.5 What is the precision of Sharpe ratio estimation when us-ing financial data? Lo [359] analyzes the statistical distribution of the Sharperatio, assuming different properties on the financial process which generatesthe data. If portfolio returns are supposed to be independent and identicallydistributed (iid) then, from central limit theorem, the asymptotic distributionof the Sharpe ratio is such that:

√T(RS −RS

)a∼ N

(0, 1 +

12RS2

).

Table 5.1 indicates some standard deviation values of the asymptotic Sharpe

ratio(SE

(RS

)a=√(

1 + 12RS2

)/T

).

For example, for 60 observations, if the true value of the Sharpe ratio isequal to 1.50, then the standard deviation of the Sharpe ratio estimator isequal to 0.188. If the true value of the Sharpe ratio is equal to 3, then thestandard deviation of the Sharpe ratio estimator is equal to 0.303. Thus, forinstance, the hedge funds, which search for high Sharpe ratios, often have ahigher standard deviation of the Sharpe ratio estimator than usual funds.


TABLE 5.1: Asymptotic standard deviation of theSharpe ratio estimator

Number T of observations

Sharpe ratio 12 24 36 48 60 120

0.5 0.306 0.217 0.177 0.153 0.137 0.0970.75 0.327 0.231 0.189 0.163 0.146 0.1031 0.354 0.250 0.204 0.177 0.158 0.1121.25 0.385 0.272 0.222 0.193 0.172 0.1221.5 0.421 0.298 0.243 0.210 0.188 0.1331.75 0.459 0.325 0.265 0.230 0.205 0.1452 0.500 0.354 0.289 0.250 0.224 0.1582.25 0.542 0.384 0.313 0.271 0.243 0.1722.5 0.586 0.415 0.339 0.293 0.262 0.1852.75 0.631 0.446 0.364 0.316 0.282 0.2003 0.677 0.479 0.391 0.339 0.303 0.214

However, most of the time, portfolio returns are not iid, such as hedgefunds. Autocorrelations are high. Lo [359] shows that neglecting this featuremay lead to overestimation of the Sharpe ratio (65% more for example), andthat specific estimators must be used to avoid this problem.

5.1.3 Other performance measures

5.1.3.1 The information ratio

5.1.3.1.1 Information ratio definition As seen in Chapter 4, manyfunds are based on benchmark optimization. Thus, such funds must be evalu-ated by taking account of this feature. Recall that the set of optimal portfoliosis another frontier, called the relative frontier. For all efficient portfolios ofthe standard mean-variance frontier, the Sharpe ratio is constant. For opti-mal portfolios relative to a benchmark, the new risk measure is usually thetracking error volatility. Then, a new performance measure adjusted to therelative risk can be introduced.

DEFINITION 5.1 The information ratio is defined by:

RI =RP −RB

T, (5.22)

where T = σ (RP −RB) is the tracking-error volatility. An ex-post ratio canalso be introduced:

RI =RP − RB

σ (RP −RB). (5.23)


PROPOSITION 5.4

As proved in Bertrand et al. [61], all portfolios belonging to the relative fron-tier (except the benchmark itself) have the same information ratio. Moreover,this ratio does not depend on the benchmark.

PROOF Using relation T = D√

σ21 − σ2

0 (see Chapter 2), we deduce:

RI =(RP −RB)D√

σ21 − σ2

0

=(RP −RB)

(RP−RB)(R1−R0)

√σ21 − σ2

0

=R1 −R0√σ21 − σ2

0

= cste. (5.24)

Therefore, the information ratio of any portfolio of the relative frontier de-pends only on characteristics of portfolios 0 and 1 (which belong to the mean-variance frontier).

REMARK 5.6 Due to previous properties, the information ratio is adapt-ed to measure performance of benchmark funds. Consider portfolios of iso-beta curves in Chapter 4. Their information ratios depend on beta:

- If β = 1, portfolios of the iso-beta frontier (except the benchmark), havethe same information ratio absolute value.

- If β = 1, portfolios of the iso-beta frontier have different information ratios.

This is a drawback since, for a given beta, the portfolio which has the higherinformation ratio corresponds to an infinite standard deviation.

On the contrary, portfolios belonging to a same φ-frontier (see mean-varianceoptimization with tracking error constraint in Chapter 4) have the same infor-mation ratio, whatever the value of the aversion to variance φ. With respectto the relative performance, they are indistinguishable.

5.1.3.2 Information ratio and statistical test of excess performance

The information ratio can be viewed as a statistical test of the excess per-formance w.r.t. the benchmark. Assume that the monthly portfolio andbenchmark returns are iid. The excess return is a random variable ER, withexpectation µ and unknown standard deviation.

Let ER = RP − RB be the empirical mean, and s = σ (RP −RB) theempirical standard deviation of the excess return, estimated from a sample ofn observations. Then, we can consider the following univariate test:

H0 : µ ≤ 0,H1 : µ > 0. (5.25)


Under the null hypothesis, tc =√n(ER− 0

)/s =

√n.RI. The t-statistic

t = ER−µ

s/√nhas a (n − 1)−Student distribution. Therefore, the information

ratio corresponds to the statistical test (divided by the square root of thenumber of observations) of the null hypothesis, which corresponds to the caseof no overperformance of the portfolio w.r.t. the benchmark.

Assume, for example, that n is equal to 60 (5 years of monthly data). Then,to reject hypothesis H0 for the threshold 5%, it is necessary that tc > 1.671.The monthly information ratio must be higher than 0.2157 (which correspondsto 0.747 for one year).

Suppose also that this information ratio corresponds to a mean ratio withmanagement cost equal to 0.36. In order to reject hypothesis µ ≤ 0 and toconclude that the fund is performant with a confidence level equal to 95%,we must have 0.36√

12

√n = 1.645. This corresponds to n = 250.5 months, or

equivalently to about 21 years. Thus, this result shows that a long periodof performance persistence is needed to conclude that the better performancew.r.t. the benchmark is significant.

5.1.3.3 Adjusted measures

5.1.3.3.1 The Sortino ratio Indicators based on standard deviations donot indicate whether the value is above or below the mean. Risk-adjustedmeasures based, for example, on semi-variance can separate these two cases.Moreover, they can better take account of asymmetrical return distributions.This is the purpose of the Sortino ratio, introduced in Sortino and Price [477].It is defined similar to the Sharpe ratio, but with a minimum acceptable return(MAR) instead of the riskless return. Moreover, the standard deviation isreplaced by the semi-standard deviation of the return below the MAR.

Sortino ratioP=RP −MAR√

1T

∑Tt=0

([RPt −MAR]+

)2. (5.26)

5.1.3.3.2 The Morningstar rating system The Morningstar rating isalso called the risk-adjusted rating (RAR). It is used in particular in theUSA. It is based on a system of stars, and is devoted to the ranking of fundsbelonging to the same peer group. It is calculated as the difference betweenrelative returns and relative risks:

RARP = RRP −RRiskP . (5.27)

Relative return and risk are defined as follows:

RRP =RP

BRG, RRiskP =

RiskPBRiskG

, (5.28)


where P is the portfolio belonging to the peer group G, such as domesticstock funds, international stock funds, taxable bond funds, and tax-exemptmunicipal bond funds as well as equity funds classified by style: capitalization(large-cap, mid-cap, small cap) or growth/value.

RP is the return of fund P in excess of the risk-free rate, and RiskP is therisk of fund P .

BRG denotes the base which is used to calculate the relative returns offunds in the group G, and BRiskG is the base which is used to calculate therelative risks of funds in the group.

The risk of the fund can be measured by first determining the average ofthe negative values of the fund’s monthly returns in excess of the short-termriskless rate, then by setting:

RiskP = − 1T

T∑t=0

min [RPt , 0] , (5.29)

where T is the number of months.

Risk can also be measured by taking account of both downside risk and up-side volatility, as well as downward volatility. Then, such criterion penalizesfunds with highly volatile returns (both upside and downside). Indeed, ex-treme gains (upside volatility) may be associated to potential extreme losses(such as Internet funds in the early 2000s). Thus, this kind of performancemeasure reduces the potential interest in high-risk funds.

The base is calculated by taking the average return of the n funds containedin the group. It can be compared with the riskless rate Rf :

BRG = max

(1n

n∑i=1

RPi , Rf

). (5.30)

The base for the relative risk is defined as the average of the funds in thepeer group:

BRiskG =1n

n∑i=1

RiskPi . (5.31)

Sharpe [467] examines properties and limitations of this measure, which isnot appropriate for measuring the risk of funds held over a long period.


5.1.3.3.3 The Dowd ratio The performance analysis can be also basedon other risk measures, such as the Value-at-Risk measure (see Chapter 2).For example, we can consider a Sharpe-type ratio in which the standard de-viation is replaced with a risk measured based on the VaR:

RP −Rf

V aRP /VP,0, (5.32)

where V aRP denotes the VaR of portfolio P , and VP,0 is the initial value ofthis portfolio.

Dowd [170] introduces a measure of investment process based on the VaR.

Consider an investor who holds a portfolio which can be modified by intro-ducing a new asset. To analyze the potential benefit to introduce this assetin the portfolio, the investor defines the risk in terms of the increase in theportfolio’s VaR. The asset will be included if the excess return that it gives isnot compensated by a too high incremental VaR.

Assuming for example that portfolio returns are normally distributed, theVaR of the portfolio P is proportional to the standard deviation:

V aR = −ασ(RP )N, (5.33)

where α denotes the confidence parameter for which the VaR is estimated,σ(RP ) is the standard deviation of the portfolio returns, and N is a parameterlinked to the size of the portfolio. Then, asset A will be included in theportfolio with weight a if and only if:

RA ≥ RP +RP

a

(V aRPA

V aRP− 1

), (5.34)

where PA denotes the new portfolio with asset A. Denote the incrementalVaR by:

IV aR = V aRPA − V aRP .

Define the function γA by:

γA(V aR) =1a

IV aR

V aRP.

The function γA is the percentage increase in the VaR due to the purchase ofasset A. Then, asset A will be included in the portfolio with weight a if andonly if:

RA ≥ (1 + γA(V aR))RP . (5.35)


5.1.4 Beyond the CAPM

Criticisms of the CAPM have led to the introduction of more sophisticatedstatistical models, such as Arch models. The volatility may depend on pastasset values, or the coefficient beta may evolve according to given factors.

5.1.4.1 Performance measurement using a conditional beta

Amenc and Lesourd [22] propose to introduce ARCH modelling for identi-fied factors which influence asset return dynamics (not for the assets them-selves, since the number of assets is much higher than the number of factors).

Denote by n the number of assets and by Rt the vector of the returns at timet. Consider the conditional expectation E [Rt |Ft−1 ] and variance-covariancematrix V [Rt |Ft−1 ] of the vector of returns. Assume that these terms aregiven by the following equation:

Rt = βft + ut, (5.36)

with

E [ut |Ft−1 ] = 0,V [Rt |Ft−1 ] = σ2. (5.37)

The factor ft satisfiesft = µ + εt, (5.38)

with (εt)t and (ut)t independent and

E [εt |Ft−1 ] = 0,

V [εt |Ft−1 ] = c +q∑

i=1

aiε2t−1 +

p∑j=1

bjht−j . (5.39)

ThusE [Rt |Ft−1 ] = δβ where δ is a scalar.

In addition, the variance-covariance matrix of the n assets can be writtenas follows:

V [Rt |Ft−1 ] =

β21 ...β1βn

...β1βn...β

2n

V [εt |Ft−1 ] +

σ12 0...0

0...0...0 β σ2

n

. (5.40)

Therefore, for a portfolio with weights (w1, ..., wn), we have:

RPt =n∑

i=1

wi,tRi,t =n∑

i=1

wi,t (βifi,t + ui,t) . (5.41)


Set βP,t =n∑

i=1

wi,tβi and uP,t =n∑

i=1

wi,tui,t. Then, the Jensen measure is

estimated from the relation

RPt = αP + βP,tft + uP,t. (5.42)

The coefficients αP and βP,t are estimated by using regression from a timeseries on the portfolio and the market.

5.1.4.2 Performance measurement using a conditional beta

A conditional version of the CAPM can be introduced, as in Ferson andSchadt [222]. The conditional CAPM is such that, for any asset i:

Ri,t+1 − Rf,t = βi,M,t

(RM,t+1 −Rf,t

)+ ui,t+1, (5.43)

with

E [ui,t+1 |Ft ] = 0,E[ui,t+1

(RM,t+1 −Rf,t

) |Ft

]= 0. (5.44)

The filtration Ft represents the public information available at time t. Thebeta βi,M,t of the regression is a conditional beta which depends on the infor-mation Ft, generated by a given set of factors.

If only information Ft is used, then the alpha term is null. The error termui,t+1 in the regression is independent from information Ft. Therefore, thismodel is coherent with the efficient market hypothesis. If Ft is the σ−algebragenerated by a process (It)t, then, using Taylor’s approximation, the beta canbe estimated through a linear function:

βP,M,t = b0,P + B′P (It − It), (5.45)

where b0,P is an average beta which is equal to the unconditional mean of theconditional beta: b0,P = E [βP,M,t |Ft ] . The components of the vector BP arethe response coefficients of the conditional beta w.r.t. the random variable It.Then, a conditional formulation of portfolio return can be deduced:

RP,t+1 −Rf,t+1 = b0,P(RM,t+1 −Rf,t+1

)+B′

P It(RM,t+1 −Rf,t+1

)+uP,t+1,

(5.46)with

E [uP,t+1 |Ft ] = 0,E[uP,t+1

(RM,t+1 −Rf,t+1

) |Ft

]= 0. (5.47)

Thus, this is a stochastic factor model which is a linear function of theexcess market return, the coefficients of which depend linearly on It. It is


a time-dependent generalization of the standard CAPM and can be used toexamine, for example, the Jensen measure:

RP,t+1 −Rf,t+1 =

αCP + b0,P(RM,t+1 − Rf,t+1

)+ B′

P It(RM,t+1 −Rf,t+1

)+ eP,t+1, (5.48)

where αCP is the average difference between the excess return of the portfolioand the excess return of a dynamic reference strategy. To improve the alphaforecast, Ferson and Schadt [222] assume a relationship between the portfoliorisk and the market indicators: for example, the market index dividend yield(DYt)t and the return on short-term T-bills (TBt)t.

Denote dyt and tbt the random variables equal to the differentials comparedto the average of the variables DYt and TBt:

dyt = DYt − E [DYt] ,tbt = TBt − E [TBt] . (5.49)

Therefore, we have: It − It =[dyttbt

]and BP =

[b1Pb2P

]. The conditional

beta can be written as:

βP = b0 + b1dyt + b2tbt.

Then, the conditional formulation of the Jensen model is given by:

RP,t+1 −Rf,t+1 = αCP + b0,P(RM,t+1 −Rf,t+1

)(5.50)

+ (b1dyt + b2tbt)(RM,t+1 −Rf,t+1

)+ eP,t+1, (5.51)

where αCP is the conditional performance measure, b0,P is the conditionalbeta, and b1 and b2 indicate the variations in the conditional beta comparedwith the dividend yield and the return on the T-bills. These coefficients areestimated through regression from the time series of the variables.

5.1.4.3 Performance measurement independent of the market mod-el

Due to Roll’s criticism, measures that do not depend on the market modelhave been developed, such as the Cornell measure and the Grinblatt andTitman measure. The goal of these measures is to evaluate the managers’scapacity to select stocks that have higher returns than the average. Thisaverage must be defined, which is one drawback of these measures.

5.1.4.3.1 The Cornell measure The Cornell measure (see Cornell [130])is defined as the average difference between the return on the investor’s port-folio during the management period, and the return of the reference portfolio


with the same weighting but observed on a different period than the investor’smanagement period. The calculation is made after the investment horizon.

The Cornell measure can be written as:

C = RP − βP RB , (5.52)

where the symbol XP denotes the P-limit of 1T

∑Tt=1 XP,t.

Therefore, we get:

C = P− limit of

[1T

T∑t=1

βP,t

(RB,t − RB,t

)]+ εP . (5.53)

This means that C is the sum of the selectivity and timing components inthe Jensen measure decomposition. Thus, if the investor has no particularskill in terms of timing or selectivity, Jensen and Cornell measures give a nullperformance.

5.1.4.3.2 The Grinblatt and Titman measures Grinblatt and Tit-man [269], [270]) propose a measure to improve upon the Jensen measure byallowing us to better take account of market timing, without any informationabout portfolio weighting. This method assumes that, if a manager has a mar-ket timing skill, then his performance would be observed over several periods.The measure attributes a null performance to uninformed fund managers. Itis defined as follows. Let wP,t denote the weighting attributed to the returnfor period t. Consider the return RB,t of the reference portfolio for the periodt. We have:

∑Tt=1 wP,t (RB,t −RF,t) = 0. Then, a first Grinblatt and Titman

measure is given by:

GB1 =T∑

t=1

wP,t (RP,t −Rf,t) . (5.54)

Therefore, a positive GB indicates that the fund manager has good fore-casts on market dynamics. However, this measure supposes that the portfolioweighting is well determined.

For this reason, Grinblatt and Titman [271] introduce another measurewhich takes account of the portfolio’s composition evolution. This approachsupposes that an informed fund manager modifies the portfolio weights ac-cording to his forecast on the market’s evolution. Thus, covariances betweenasset returns and their weights are not null. The measure involves these co-variances:

GB2 =1T

n∑i=1

T∑t=1

(wi,t − wi,t−k) (Ri,t −Rf,t) , (5.55)

where wi,t and wi,t−k are the weights of asset i at times t and t− k.


The expectation of this measure is null if the fund manager is not informed,and positive otherwise. No reference portfolio is used, but this method re-quires a large amount of data and calculation.

5.1.4.4 Performance measurement and multi-factor models

Generalizations of the CAPM can be based on multi-factor models, whichare also linear models but do not make any assumption on the investor’s riskaversion.

A first model, introduced by Ross [439], is based on a specific arbitragevaluation. It is called the Arbitrage Pricing Theory (APT). It does not assumenormality of returns and supposes only that investors are risk-averse, withoutspecifying a particular utility function. We have:

Ri,t = E [Ri] +K∑

k=1

bi,kFk,t + εi,t, (5.56)

where bi,k denotes the sensitivity of asset i to factor k, Fk,t denotes the returnof factor k with E [Fk,t] = 0, and εi,t denotes the residual return of asset i. Itis the specific risk of asset i which is not explained by the factors and satisfies:

E [εi,t] = 0.

• The APT model supposes that markets are perfectly efficient and thatthe factor model is the same for all investors.

• The number n of assets is assumed to be very large w.r.t. the numberK of factors.

The residuals are independent from each other and independent from thefactors:

Cov(εi,t, εj,t) = 0, ∀i = j,

Cov(εi,t, Fk,t) = 0, ∀i, ∀k. (5.57)

Arbitrage conditions lead to the existence of factor risk premia λk suchthat:

E [Ri]−Rf =K∑

k=1

λkbi,k. (5.58)

Denote δk the expected return of a portfolio with a sensitivity to factor kequal to 1, and null sensitivity to other factors. Then:

λk = δk −Rf and E [Ri]−Rf =K∑

k=1

(δk −Rf ) bi,k, (5.59)

where bi,k = Cov(Ri,δk)V ar(δk)

are the sensitivities to the factor loadings.


When there exists only one factor corresponding to the market return, thismodel is the CAPM. The problem is to identify the number of factors. Numer-ous empirical studies are devoted to the determination of the macroeconomicor financial factors. For example, the three-factor model of Fama and French[219] takes account of the book-to-market ratio and the company’s size mea-sured by its market capitalization:

RPt −Rft = αP + βP (RMt −Rft) + bS.SMBt + bH .HMLt + εPt, (5.60)

where SMBt indicates “small (cap) minus big.” It measures the excess returnof the small-capitalization returns w.r.t. large-capitalization returns.

The term HMLt is the “high (book/price) minus low.” It denotes the d-ifference between returns on portfolios with high book-to-market ratios andportfolios with low book-to-market ratios.

Fama and French assume that the market is efficient, but that more thanone factor is needed to explain asset returns.

Carhart’s four factor model [105] introduces one additional factor: thePRIY R which denotes the difference between the average of the highest re-turns and the average of the lowest returns from the previous year. Thus, wehave the following decomposition:

RPt−Rft = αP +βP (RMt −Rft)+bS .SMBt+bH .HMLt+bP .PRIY Rt+εPt

(5.61)The Barra multi-factor model (see Barra [42] and Scheikh [453]) supposes

that asset returns are determined by the firm’s characteristics: size, earnings,industrial sectors, etc., which are the factors used jointly with risk indices.

REMARK 5.7 As can be seen, the application of multi-factor modelto performance measurement requires the choice of a specific model to takeaccount of industrial and financial factors. Endogeneous factors can also beextracted through a factor analysis, either based on the principle of the maxi-mum likelihood, or based on principal component analysis. It also provides forfactor loadings. However, in that case, these factors are not well-identified.Therefore, we can search for explanation of these factors from known indi-cators. Then, we can deduce an explicit decomposition from the implicitdecomposition.


5.2 Performance decomposition

Since nowadays investors request more information about the investmentprocess of fund managers, alternative performance measures are necessary tosearch for the performance causes. The performance attribution tries to de-compose the excess performance into identified terms by taking more accountof the management process.

5.2.1 The Fama decomposition

Fama [218] proposes a performance decomposition which separates the fundperformance into two parts:

• The selectivity

and

• The risk.

This analysis is made in the CAPM framework. A portfolio P is comparedwith a “naıve” portfolio C, which consists of investing a weight x on theriskless asset, and (1 − x) on the market portfolio, such that the portfoliobeta of C is equal to the beta of P : βC = βP . Then, the portfolio C is theefficient portfolio which has the same systematic risk as portfolio P and doesnot require a forecast ability. Then, we have:

RC = (1− βP )Rf + βP RM . (5.62)

The portfolio P may be less diversified. This risk may allow for an excessperformance w.r.t. portfolio C.

Fama proposes the following decomposition:

Total performance︷︸︸︷RP −Rf =

Selectivity︷︸︸︷[RP − RC

]+

Risk︷︸︸︷[RC − Rf

]. (5.63)

The selectivity term measures the performance part due to the systematicrisk, born by the fund manager. It is equal to the Jensenalpha. Portfolios Pand C have the same systematic risk. However, their total risks are different.If portfolio P is the main investor’s wealth, then the total risk is the perti-nent risk measure. In that case, portfolio P must be compared with a naıveportfolio with the same total risk rather than the same sytematic risk.

Thus, Fama proposes the following selectivity decomposition:


]= Net Selectivity +

Diversification︷︸︸︷[RC′ − RC

], (5.64)


where

Net Selectivity =


]−

Diversification︷︸︸︷[RC′ − RC

]. (5.65)

By definition, RC′ is the return of a portfolio which contains the riskless assetand the market portfolio with the same total risk as portfolio P . The portfolioC ′ is also efficient. If the net selectivity is negative, then the fund managerhas a diversified risk which has not been compensated by excess return. Thediversification term takes account of this additional return (i.e., σ (RP )−βP ).It is always positive.

The risk RC −Rf can also be decomposed as follows:

Risk︷︸︸︷[RC − Rf

]=

Manager’s risk︷︸︸︷RC − RT +

Investor’s risk︷︸︸︷[RT −Rf

]. (5.66)

The investor has a risk objective βT which leads to a portfolio with returnRP on the security market line. The fund manager chooses a portfolio withrisk βP . Therefore, the performance component due to the total risk is dueto the risk level fixed by the investor, and to the risk level fixed by the fundmanager.

6

-

Mean-return

Beta

βT βP

Selectivity

Manager’srisk

Investor’srisk

P

C

Rf

RT

RC

RP6

?6

?6?

6?6?

Net selectivity

Diversification

6

?

Total risk

6

?

AlphaJensen

FIGURE 5.5: FAMA performance decomposition


5.2.2 Other performance attributions

Two kinds of performance attribution can be distinguished:

• The external attribution, which uses exogeneous information w.r.t. thetime series of the portfolio and benchmark returns.

• The internal attribution, which uses the time series of the portfolio andbenchmark weightings.

5.2.3 The external attribution

The following methods allow for the determination of the fund manager’sability to forecast the global evolution of the market and to select assets well.

5.2.3.1 Treynor-Mazuy

In order to determine the quality of the forecast concerning the global evolu-tion of the market, Treynor and Mazuy[495] examine the beta evolution w.r.t.market evolution. Typically, a successful market timing strategy is associatedto a beta which is higher than 1 when the market is bullish, and smaller than1 when the market is bearish. If the fund manager has no market timinggoal, then βP = cste. Then, without specific risk, points with components(RMt, RPt) would be on the straight line with equation: RPt = βPRMt. Withspecific risk, but still with βP = cste, one would observe a set of points suchas in the following figure.

RP

RM

••

•

•

•

••

•

••

••

••

•• •

••

FIGURE 5.6: Linear regression of RP on RM without market timing


However, as soon as the fund manager tries to forecast the market evo-lution and modifies his beta according to this, one would observe points inthe previous figure above the regression line for various market return valuesRMt. The set of points would be adjusted w.r.t. a convex curve, if the markettiming strategy is successful, as shown in Figure (5.7).

RP

RM

•

••

••

••

••

•••• •

••

•• •

FIGURE 5.7: Linear regression of RP on RM with successful markettiming

Treynor and Mazuy propose a statistical method based on this property.More precisely, in order to allow for a convex relation between RMt and RPt,they introduce the following equation:

RPt −Rft = αP + βP (RMt −Rft) + δP (RMt −Rft)2 + εPt. (5.67)

If coefficient δP is significantly different from zero, then we can conclude thatthe fund manager has used a market-timing strategy (with success if δP > 0and failure if δP < 0).

Moreover, the alpha coefficient is still viewed as the fund manager’s abilityto select the assets with returns above the values given by the CAPM.


5.2.3.2 Henrickson-Merton

Henrickson and Merton [290] assume that the fund manager can forecast acomparison between RMt and Rft: RMt > Rft or RMt < Rft. This assump-tion leads to the following regression model:

RPt −Rft = αP + βP (RMt −Rft) + δPD (RMt −Rft) + εPt, (5.68)

with D = 1 if RMt > Rft, and D = 0 if RMt ≤ Rft.Then, the beta can take two values: βP + δP (resp. βP ) is the portfolio betawhen the market return is higher (resp. smaller) than the riskless return. Amarket-timing strategy is successful if δP is significantly positive. The fundmanager’s ability to select assets is determined from the Jensen alpha.

5.2.4 The internal attribution

The purpose of this performance attribution is to explain the excess perfor-mance w.r.t. a benchmark, jointly determined by the investor and the fundmanager. As seen in Chapter 4, the management process can be divided in-to the asset allocations and the tactical strategy based on stock picking andmarket timing.

5.2.4.1 The Brinson model

Brinson et al. [90] propose the following method: consider a portfolio Pand a benchmark B which contain n asset classes i = 1, ..., n.

Denote:- U the set of assets, l = 1, ...,m = #(U), U1, ..., Un is a partition of the

set U and is the set of the n asset classes i = 1, ..., n ≤ m;- Zl the return of asset l (vector: Z). This is a random variable if we

consider ex-ante values. This is a scalar if ex-post data are analyzed;- Rl the expected return of asset l (vector : R);- σkl the covariance between assets k and l (matrix : V );- zpl (resp. zbl) the weight on asset l in the portfolio (resp. the benchmark)

(vector: zp and zb);- wpi =

∑l∈Ui

zpl (resp. wbi =∑l∈Ui

zbl) the weight of asset class i in the

portfolio (resp. the benchmark);- Mpl = zpl∑

l∈Uizpl

and Mbl = zbl∑l∈Ui

zblare the weights of asset l into the asset

class i of the portfolio and the benchmark;- Zpi =

∑l∈Ui

Mpl.Zl (resp. Zbi =∑l∈Ui

Mbl.Zl) is the return of asset class i in

the portfolio (resp. the benchmark); and,- Rpi =

∑l∈Ui

Mpl.Rl (resp. Rbi =∑l∈Ui

Mbl.Rl) is the mean return of asset

class i in the portfolio (resp. in the benchmark).


Then, the excess global performance of the portfolio w.r.t. the benchmarkis given by:

S = Rp −Rb =n∑

i=1

(wpi.Rpi − wbi.Rbi) =m∑l=1

(zpl − zbl)Rl. (5.69)

This excess performance has to be decomposed into main management pro-cesses: asset allocation and asset selection.

5.2.4.1.1 Asset allocation effect When searching for the origin of ex-cess performance, we encounter a problem due to the allocation effect for oneparticular asset class. Indeed, the high weighting (resp. the low weighting) ofan asset class leads to the low or high weighting of at least one another class.

One would expect to consider the following measure for asset class i:

(wpi − wbi)Rbi. (5.70)

However the following example shows that this measure is not well adapted.

Example 5.2

TABLE 5.2: Asset allocation (percentages)

Class wpi wbi Rbi (wpi − wbi)Rbi

1 25 15 8 0.82 40 45 10 −0.53 35 40 12 −0.6

Total 100 100 Rb = 10.50 −0.3

The last column leads to the conclusion that the high weighting of assetclass 1 is judicious, and that the relatively bad global performance of theportfolio is due to the low weighting of asset classes 2 and 3. Nevertheless, wecan see that the fund manager has highly weighted an asset class which hasa return smaller than the mean return of the benchmark (8% versus 10.5%).

This example proves that we must take into account the difference betweenthe class return and the mean benchmark return. Therefore, the contributionof asset class i to the excess performance is defined by:

AAi = (wpi − wbi) (Rbi −Rb) . (5.71)


The following table indicates the different cases:

TABLE 5.3: Contribution to asset classes

Excess performance Bad performanceclass i: class i:

(Rbi −Rb)> 0 (Rbi −Rb)< 0High weighting Good decision Bad decision

class i: (wpi − wbi) . (wpi − wbi) .(wpi − wbi) > 0 (Rbi −Rb)> 0 (Rbi −Rb) < 0Low weighting Bad decision Good decision

class i: (wpi − wbi) . (wpi − wbi) .(wpi − wbi) < 0 (Rbi −Rb) < 0 (Rbi −Rb) > 0

Applying this method to the previous example, we get the following table:

TABLE 5.4: Asset selection effects

Class wpi wbi Rbi (wpi − wbi)(Rbi −Rb)1 25 15 8 −0.252 40 45 10 0.033 35 40 12 −0.08

Total 100 100 Rb = 10.50 −0.30

Note that the bad performance of the portfolio w.r.t. the benchmark ismainly due to the high weighting of asset class 1, which is rather intuitive.Additionally, the global bad performance is the same as previously (−0.3%).This is due to:

n∑i=1

(wpi − wbi)Rb = 0 sincen∑

i=1

wpi =n∑

i=1

wbi = 1.

5.2.4.1.2 Asset selection effect The contribution to the global excessreturn of asset selection within each asset class is given by:

SEi = wbi (Rpi −Rbi) . (5.72)

The choice of the benchmark weight for asset class i is justified in order tonot interfere with the allocation effect. The difference (Rpi −Rbi) is not nullas soon as the fund manager’s weighting of the assets included in the assetclass i is different from the benchmark’s. Therefore, this allows us to measurethe selection effect.


5.2.4.1.3 The interaction term Since the sum of the two previous ef-fects is not equal to the global excess performance of asset class i, an additionalterm must be introduced:

Ii = (wpi − wbi) (Rpi −Rbi) . (5.73)

Then we deduce:

S =n∑

i=1

(wpi.Rpi − wbi.Rbi) =n∑

i=1

(AAi + SEi + Ii) . (5.74)

Example 5.3Consider a fund with benchmark: 35% domestic stocks; 50% domestic bonds;and 15% international stocks. Suppose that the management period corre-sponds to one year, and that the weighting of asset classes has been determinedthrough a strategical asset allocation which is not modified during the givenperiod. The numerical values are given in the following tables:

TABLE 5.5: Portfolio characteristics

Return ReturnPortfolio Benchmark class i : class i :weights weights portfolio benchmark

Class wpi wbi Rpi Rbi

Stock (do.) 40 35 13.0 12.0Bond (do.) 40 50 6.75 7.0Stock (int.) 20 15 11.0 11.0Total 100 100 Rb = 9.35

The measures of the three effects are given by:

TABLE 5.6: Performance attribution

Measure of Measure of Measure ofAllocation selection interactioneffect effect effect

Stocks (do.) 0.13 0.35 0.05Bonds (do.) 0.24 −0.13 0.03Stocks (Int.) 0.08 0.00 0.00

Total 0.45 0.23 0.075

The benchmark and the portfolio have returns respectively equal to 9.35%and 10.10%. This excess performance is decomposed into 0.45% for the al-


location effect, 0.23% for the asset selection and 0.075% for the interactioneffect.

5.2.4.2 Limit of attribution method

The Brinson method does not sufficiently take into account the risk of fi-nancial investments. Consider, for instance, a fund manager who chooses aportfolio according to the Markowitz criterion applied on the tracking error(see Chapter 4 and Roll [431]).

Suppose that the benchmark contains three asset classes, each with twoassets. Suppose that this benchmark is not efficient.

The minimization of the tracking error is illustrated by the following figure,where the benchmark (resp. the managed portfolio 1) has a mean-return equalto 8.60% (resp. 9.60%) and a standard deviation 9.30% (resp. 10.57%). Thetracking error volatility is equal to 2.24%.

FIGURE 5.8: Efficient and relative frontiers

Despite the optimal choice of portfolio 1, the performance attribution is notnecessarily good. The following table indicates the performance attributionprocess.


TABLE 5.7: Performance attribution of portfolio 1

Asset Portfolio Portfolio Allocation Selection Interactionclass weighting weighting Effect Effect Effect

1 65.80 40.00 0.232 −0.027 −0.0172 14.73 40.00 0.278 0.012 −0.0073 19.47 20.00 −0.002 0.546 −0.014

Total 100.00 100.00 0.508 0.531 −0.039

The excess portfolio return w.r.t. the benchmark is equal to 1.00%: 0.5081%is due to the allocation process, 0.5308% is due to the asset selection and−0.0389% is due to the interaction effect.

Whereas the portfolio is optimal, some of the performance indicators arenegative. For example, the low weighting of asset class 3 leads to a negativecontribution equal to −0.0002% of the global portfolio performance. However,these results are not too opposed to the portfolio optimality since performanceattribution does not take account of the tracking error itself.

5.2.4.2.1 The risk attribution To keep the coherence with the bench-mark optimization, Bertrand [56] proposes a performance attribution methodbased on the decomposition of the tracking-error volatility.

In particular, we can note that information ratios for each decision processare equal (asset allocation, asset selection, and interaction effect).

The tracking-error volatility is given by:

T =T 2

T=

(zp − zb)′V (zp − zb)√

(zp − zb)′V (zp − zb)

=Cov (S, S)

T,

=1TCov

(∑m

l=1(zpl − zbl)Zl,

∑m

l=1(zpl − zbl)Zl

),

=1T

[Cov

(n∑

i=1

(wpi − wbi) (Zbi − Zb) , S

)

+Cov

(n∑

i=1

wbi (Zbi − Zbi) , S

)

+ Cov

(n∑

i=1

(wpi − wbi) (Zpi − Zbi) , S

)],

=1T

[Cov (AA,S) + Cov (SE, S) + Cov (I, S)] .


From the previous relation, the tracking-error volatility T is decomposedinto three terms:

• The contribution to the total risk of the asset allocation, which is mea-sured by Cov (AA,S) /T ;

• The contribution to the relative risk of the asset selection, which ismeasured by Cov (SE, S) /T ;

• The contribution to the relative risk of the interaction effect, which ismeasured by Cov (I, S) /T .

PROPOSITION 5.5According to Bertrand [56]), for any portfolio belonging to the relative frontier,we have:

• Each term in the risk attribution decomposition has the same sign as thecorresponding term in the perfomance attribution decomposition. Addi-tionally, it is equal to the component of performance attribution dividedby the information ratio:

Cov (AAi, S)T

=(wpi − wbi) (Rbi −Rb)

RIP,

Cov (SEi, S)T

=wbi (Rbi −Rbi)

RIP,

Cov (Ii, S)T

=(wpi − wbi) (Rpi −Rbi)

RIP.

• Each term in the risk attribution decomposition has the same informa-tion ratio as the information ratio of the portfolio RIP :

RI (AAi) =(wpi − wbi) (Rbi −Rb)

Cov(AAi,S)T

= RIP ,

RI (SEi) =wbi (Rbi −Rbi)

Cov(SEi,S)T

= RIP ,

RI (Ii) =(wpi − wbi) (Rpi −Rbi)

Cov(Ii,S)T

= RIP .

The previous results show that for portfolios belonging to the relative fron-tier, the appropriate risk attribution measure is the tracking-error volatilityof each term of the decomposition.


The following table indicates the tracking-error volatility of each componentof the performance attribution process proposed in the previous example.

TABLE 5.8: Tracking-error volatilities

Cov (AAi, S) Cov (SEi, S) Cov (Ii, S) TotalAsset class 1 0.5205 −0.0596 −0.0385 0.422Asset class 2 0.6231 0.0258 −0.0163 0.633Asset class 3 −0.0047 1.2234 −0.0323 1.186

Total 1.1388 1.1896 −0.0871 2.241

The total tracking-error volatility of the portfolio is equal to 2.241%. Notethat each decision which leads to a bad performance (for example, the lowweighting of asset class 3 or the weighting into asset class 1) is now clearlyidentified as a decision which contributes to the reduction of the relative riskand thus is justified.

The constance of information ratio is illustrated by the following table:

TABLE 5.9: Information ratios

RI (AAi) RI (SEi) RI (Ii)Asset class 1 0.44617 0.44617 0.44617Asset class 2 0.44617 0.44617 0.44617Asset class 3 0.44617 0.44617 0.44617

Note that Grinold and Kahn [273] consider that such a value for an infor-mation ratio (=0.44617) is a good indicator for the active fund manager.


5.3 Further Reading

General results about portfolio performance are presented in Grinold andKahn [273], and Amenc and Le Sourd [22].

Performance measures taking management style into account are intro-duced in Sharpe [465], and in Lobosco [360], who proposes the SRAP measure(style/risk-adjusted performance). Muralidhar [396] introduces a specific mea-sure to compare performance of different managers within funds with the sameobjectives (so belonging to the same peer group). International diversifica-tion can be taken into account by using the IAPM (international asset pricingmodel) introduced by Solnik [472].

The problem of performance persistence is related to market efficiency.However, from the professional point of view, investment performances forindividual fund managers are examined (skillful or lucky?) rather than theglobal market inefficiency. As mentioned in Kahn and Rudd [313], the earli-est empirical studies seem to suggest no performance persistence while recentarticles conclude that a certain level of performance persistence exists.

According to Brown et al. [93], short-term performance is persistent, butthe survivorship bias affects the results (i.e., bad funds tend to disappear),since it is in favor with performance persistence (see also [93]). Jegadeshand Titman [299] examine HYSE and AMEX securities over the period 1965-1989. They show that a momentum strategy, which is based on buying thebest funds and selling the worst ones from the previous six months, providesa 1% per month excess return over the following six months.

The difference in results concerning the performance persistence can be dueto seasonal or daily effects. Note also that stock markets are subject to cycles.Thus, a given management process may be good for a given cycle and bad foranother.

Other recent methods involving new risk measures have been introduced tostudy risk attribution and portfolio performance. The total risk of a portfoliocan be decomposed into terms that can be interpreted as the risk contributionof the corresponding subsets of the portfolio. An overview of such methodol-ogy is provided in Rachev and Zhang [422].

Part III

Dynamic portfoliooptimization

“Finance is a highly analytical subject, and nowhere more so than incontinuous-time analysis. Indeed, the mathematics of the continuous-timefinance model contains some of the most beautiful applications of probabilityand optimization theory. But, of course, not all that is beautiful in scienceneed also be practical. And surely, not all that is practical in science is beau-tiful. Here we have both. With all its seemingly abstruse mathematics, thecontinuous-time model has nevertheless found its way into the mainstream offinance practice. Perhaps its most visible influence on practice has been in thepricing and hedging of financial instruments, an area that has experienced anexplosion of real-world innovations over the last decade. In fact, much of theapplied research on using the continuous-time model in this area now takesplace within practicing financial institutions.”

Robert Merton, “Continuous-Time Finance,” Blackwell Publishers, (1990).

165


Portfolio optimization is said to be “myopic” when the investor does notknow what will happen beyond the immediate next period. In this framework,basic results about one-period portfolio optimization, such as mean-varianceanalysis, were described in Part II. Such an approach can be justified forshort-term horizons without portfolio rebalancing, or for special utility func-tions such as power utilities (portfolio choice is myopic when the relative riskaversion is constant and returns are iid).

However, for long-term investment, the investor can benefit from dynamicportfolio optimization, which allows him to take account of important oppor-tunities:

• The investor can modify the portfolio weighting along the whole man-agement period, contrary to a “buy and hold” strategy. Therefore, forexample, the portfolio value at maturity can be a quite general functionof financial indexes (no longer necessarily linear).

• The investor can use the information delivered at any time by observingfinancial or economic indicators. In particular, the portfolio strategycan take into account variations of main financial indices.

Despite the unrealistic assumption that a portfolio is actually rebalancedin continuous-time, this approximation can be justified by several reasons:

• First, looking at rebalancing times during a time period [0, T ] (for ex-ample, intraday market), we observe that they do not correspond to thesame deterministic moments, but look like marked point process withvalues in the whole time interval [0, T ].

• Second, when examining financial market properties, we can fix a timescaling so that derivatives hedging and pricing seem to be in continuous-time, and perhaps choose another time scaling in order to assume thatportfolio strategies are in continuous-time.

• Finally, under some mild assumptions, discrete-time financial modelsconverge to continuous-time ones, in particular, optimal portfolio strate-gies (see, e.g., Prigent [414]).

Indeed, continuous-time modelling leads to more complexity than one-period modelling, requiring introduction of stochastic processes, Ito lem-ma, martingales, etc. However, for standard models, dynamic complete-ness leads to significant simplification. For utility maximization, explicitsolutions can be deduced and analyzed, while, for an one-period incom-plete financial market, this analysis may be more involved.

Part III 167

This part is devoted to continuous-time optimization:

• Chapter 6 provides a “brief” summary of standard dynamic optimiza-tion. The two main approaches are illustrated by basic examples:

- The first one is based on the dynamic programming method, using thePontryagin and Bellman principles.

- The second one is based on martingale methods and duality.

• Chapter 7 is devoted to the search of optimal payoff profiles and long-term portfolio management:

- Search of an optimal portfolio value, which is assumed to be a functionof a given security price, for example a financial index.

- Determination of a long-term portfolio, which is composed of threemain assets: cash, bond, and stock.

• Finally, Chapter 8 describes main results of financial portfolio optimiza-tion when “frictions” are considered:

- Market incompleteness and/or convex constraints;

- Transaction costs; and,

- Other extensions, such as the existence of a labor income or a randomtime horizon.

Chapter 6

Dynamic programming optimization

6.1 Control theory

In what follows, we consider calculation of variations in the deterministicframework to introduce the main ideas of dynamic programming.

6.1.1 Calculus of variations

6.1.1.1 General problem of the calculus of variations

Very often, this problem takes the following form:

- Let [0, T ] be the time period.- Let U be a function w.r.t. three variables (t, x, y) with real values, assumed

to be continuously-differentiable on [0, T ]× Rd × Rd.

- We search for a function x(.), continuously-differentiable on [0, T ] × Rd

solution of the optimization problem (P):

max U(x(.)) =∫ T

0

U(t, x(t),·x(t))dt,

under : x(0) = x0 and x(T ) = xT , (6.1)

where·x(t) denotes the derivative of x(.) w.r.t. the current time, and x0 and

xT are given.In what follows, the general results are illustrated by some basic examples.

6.1.1.2 Standard consumption-saving problems

Example 6.1

The function x(.) denotes the wealth invested on a riskless asset with rate ofreturn r. Suppose that there exists an income i(.) which is divided into savingss(.) and consumption c(.): i(t) = s(t) + c(t). Then, the wealth dynamics aregiven by:

·x(t) = rx(t) + s(t). (6.2)

169


Assume that, at a given initial date, the wealth x(0) is equal to x0, and thatthe goal at horizon T is to hold xT . Suppose that U(t, x, y) = e−ρtu(x), whereρ denotes a psychological discount factor, and u(.) the instantaneous utility(i.e., on consumption). Then, the optimization problem is:

max∫ T

0

e−ρtu(t, x(t))dt,

under : x(0) = x0 and x(T ) = xT . (6.3)

Since c(t) = i(t) + rx(t) − ·x(t), then U(t, x, y) = e−ρtu(i(t) + rx − y).

6.1.1.2.1 Euler equation We search for a solution x(.) of the previousproblem (P) (with the same assumptions).

PROPOSITION 6.1(Euler equation)If x∗(.) is the solution of (P), then it satisfies the equation:

d

dt

(∂U

∂y

(t, x∗(t),

·x∗(t)

))=

∂U

∂x

(t, x∗(t),

·x∗(t)

). (6.4)

PROOF Let x∗(.) be the solution of (P), and let x(.) be another func-tion such that x(0) = x(T ) = 0. Then, for any s > 0, the function t →(x∗(t) + sx(t)) satisfies the constraints of (P). Therefore, we have:

1s

[U (x∗ + sx)− U (x∗)] ≤ 0.

Taking the limit when s→ 0+, we deduce:∫ T

0

[α(t)x(t) + β(t)

·x(t)

]dt ≤ 0,

with

α(t) =∂U

∂x

(t, x∗(t),

·x∗(t)

); β(t) =

∂U

∂y

(t, x∗(t),

·x∗(t)

).

Integrating by part and using x(0) = x(T ) = 0, there exists a scalar c suchthat ∫ T

0

[β(t)−

(c +

∫ t

0

α(u))du

]·x(t)dt ≤ 0.

Now, consider the particular function x(t) =∫ t

0

[β(v) − ∫ v

0α(u)du

]dv with

c such that∫ T

0

[β(v) − (

c +∫ v

0 α(u))du]dv = 0.

Dynamic programming optimization 171

Then, we have:

x(0) = x(T ) = 0; β(t)−(c +

∫ t

0

α(u))

=·x(t) and

∫ T

0

[ ·x(t)

]2dt ≤ 0.

Thus, necessarily·x(t) = 0 and

·β(t)−α(t) = 0, which is the Euler equation.

6.1.1.2.2 Application to the consumption-saving plan (see Demangeand Rochet [156])

Since U(t, x, y) = e−ρtu [i(t) + rx − y] and c∗(t) = i(t) + rx∗(t)−·x∗(t), we

get:

∂U

∂x

(t, x∗(t),

·x∗(t)

)= re−ρtu′ [c∗(t)] ,

∂U

∂y

(t, x∗(t),

·x∗(t)

)= −e−ρtu′ [c∗(t)] .

Thus, the Euler equation is:

d

dt

(−e−ρtu′ [c∗(t)])

= re−ρtu′ [c∗(t)] .

Then:

−ρu′ [c∗(t)] +d

dt(u′ [c∗(t)]) = ru′ [c∗(t)]⇐⇒

d

dt(lnu′ [c∗(t)]) = ρ− r ⇐⇒

u′ [c∗(t)] = k exp [(ρ− r) t] ,

where k is a non-negative constant. Assuming that u′ has an inverse functionj(.), we have:

c∗(t) = j [k exp [(ρ− r) t]] . (6.5)

Therefore, the optimal consumption is a monotonous function of currenttime. It is increasing if and only if the interest rate r is higher than thepsychological discount rate ρ.

When the solutions are not necessarily continuously differentiable but “suf-ficiently” regular, a modified version of the Euler equation can be deduced.Examine solutions which are continuously differentiable, except at a count-able number of points where they have left-hand and right-hand derivatives.Denote this set by E.


PROPOSITION 6.2(Piece-wise differentiable solutions) If x∗(.) belongs to E, then we have:

∂U

∂y

(t, x∗(t),

·x∗(t)

)=∫ T

0

∂U

∂x

(t, x∗(t),

·x∗(t)

)dt + constant, (6.6)

with:- If x∗(.) is differentiable at t, we recover the Euler equation.

- If x∗(.) is not differentiable at t, ∂U∂y

(t, x∗(t),

·x∗(t)

)is continuous (Erdman-

Weierstrass condition)

If the terminal value xT is relaxed, the optimal solution x∗(.) obviously cor-responds to a solution of the previous problem (P) with xT = x∗(T ). There-fore, this solution must satisfy the Euler and Erdman-Weierstrass conditions.Nevertheless, since this terminal value is not fixed, it must be compared withall solutions of (P) when xT is varying. We can also introduce an additionnalfunction A(.) defined on the terminal value and continuously differentiable.This leads to the following condition:

PROPOSITION 6.3(Transversality condition)If x∗∗(.) is solution of the following problem (P ′):

max U(x(.)) =∫ T

0

U(t, x(t),·x(t))dt + A [x(T )] ,

under : x(0) = x0, (6.7)

then

∂U

∂y

(t, x∗∗(t),

·x∗∗(t)

)= −A′ [x∗∗(t)]−

∫ T

0

∂U

∂x

(t, x∗∗(t),

·x∗∗(t)

)dt, (6.8)

with- If x∗∗(.) is differentiable at t, we recover the Euler equation:

d

dt

(∂U

∂y

(t, x∗∗(t),

·x∗∗(t)

))=

∂U

∂x

(t, x∗∗(t),

·x∗∗(t)

). (6.9)

- If x∗∗(.) is not differentiable at t, ∂U∂y

(t, x∗∗(t),

·x∗∗(t)

)is continuous

(Erdman-Weierstrass condition).- At terminal value:

∂U

∂y

(t, x∗∗(t),

·x∗∗(t)

)+−A′ [x∗∗(t)] = 0.


PROOF The proof is similar to the previous one: let x∗∗(.) be a solutionof problem (P ′), and x(.) be a function (continuously differentiable except ata countable number of points) such that x(0) = 0. Then, for any s > 0, wehave:

1s

[U (x∗µ + sx)− U (x∗µ)] ≤ 0.

PROOF Taking the limit when s→ 0+, we deduce:∫ T

0

[α(t)x(t) + β(t)

·x(t)

]dt + γx(T ) ≤ 0,

with

α(t) =∂U

∂x

(t, x∗(t),

·x∗(t)

); β(t) =

∂U

∂y

(t, x∗(t),

·x∗(t)

); γ = A′(x(T )).

Integrating by part, we have:∫ T

0

[β(t)−

(γ +

∫ t

0

α(u))du

]·x(t)dt ≤ 0.

Now, consider the particular function x(t) =∫ t

0

[β(v)− ∫ v

0α(u)du

]dv, we

have:∫ T

0

[ ·x(t)

]2dt ≤ 0.

Thus, necessarily·x(t) = 0 and

·β(t) = α(t), which proves the result.

REMARK 6.1 If the time horizon is no longer fixed, then the transver-sality condition is modified (see Seierstadt and Sydseter [459]). If the horizonT is infinite, the problem is more involved, since the transversality condition“at infinity” may be not necessary.

Example 6.2 Consumption-saving problem without fixed wealth atmaturity(see [156])

max U(x(.)) =∫ T

0

e−ρtu(i(t) + rx(t) − ·

x(t))dt + A [x(T )] ,

under : x(0) = x0, (6.10)

The Euler equation is valid at every point where the control x(.) is contin-uous. Then:

u′(c∗∗(t)) = l exp [(ρ− r) t] .


The solution is similar to the previous one when the terminal value is fixed.However, the two constants k and l differ.

Suppose that u(c) = ln c and A(x) = exp (−ρT ) ln x. Then,

c∗(t) =1k

exp [(ρ− r) t] and c∗∗(t) =1l

exp [(ρ− r) t] .

For both cases c(t) = c∗(t) and c(t) = c∗∗(t), the wealth x(.) is given by:

x(t) = x0e(ρ−r)t +

∫ t

0

[i(s)− c(s)] er(t−s)ds,

= x0e(ρ−r)t +

∫ t

0

[i(s)] er(t−s)ds +1hρ

ert(1− e−ρt)

),

where h = k or l, according to the optimization problem.Denote by VT the discounted total wealth:

VT = x0e(ρ−r)t +

∫ t

0

[i(s)] er(t−s)ds.

- For (P), we get:

x∗T e

−rT = VT +1kρ

ert(1− e−ρt

).

- For (P ′), we get:

A′(x∗∗(t)) +∂U

∂y

(t, x∗∗(t),

·x∗∗(t)

)=⇒ e−ρT

x∗∗(T )− e−ρT

c∗∗(T )= 0.

Thus,1le−ρT = VT +

1lρ

(1− e−ρT

).

Finally, the first problem has a rational solution (k > 0) if the discountedtotal wealth VT is higher than the discounted wealth x∗

T e−rT fixed at time T .

Then,1k

=

(VT −XT e

−rT)ρ

(1− e−ρT ).

For the second problem, we deduce:

1l

=VTρ

(ρ− 1) e−ρT + 1.


6.1.2 Pontryagin and Bellman principles

Assume now that the time derivative·x(.) is no longer a true control variable,

but must satisfy an ordinary differential equation (ODE): for t in [0, T ],·x(t) = f (t, x(t), v(t)) , (6.11)

where the variable x(.) is the state variable and v(.) is the control variable.Then, the usual optimization problem is:

- Let [0, T ] be the time period.- Let U be a function w.r.t. three variables (t, x, v) with real values, assumed

to be continuously differentiable on [0, T ]× Rd × Rd′.

- Let f be a function w.r.t. three variables (t, x, v) with values in Rd′,

assumed to be continuously differentiable on [0, T ]× Rd × Rd′.

- Let A be a function w.r.t. x, with values in R, assumed also to be contin-uously differentiable on Rd.

We search for a state function x(.) continuously differentiable on [0, T ],and a control function x(.), continuous on [0, T ] solution of the optimizationproblem (P”):

max U(x(.)) =∫ T

0

U(t, x(t), v(t))dt + A [x(T )] , (6.12)

under :·x(t) = f (t, x(t), v(t)) ,

x(0) = x0

and v(T ) ∈ V (set of constraints on control),

where V is the set of constraints on the control variable, supposed to be anopen convex subset of Rd′

.

When f(t, x, v) = v, the optimization problem is the previous calculus ofvariations. Otherwise, another method must be introduced. Two “indepen-dent” approaches have been proposed:

• The Pontryagin method considers that Problem (P”) is a calculus ofvariations with an infinite number of constraints corresponding to thestate equation (6.11).

• The Bellman approach uses the dynamic programming principle.

However, both use the notion of Hamiltonian of P”:

DEFINITION 6.1 The Hamiltonian of P” is defined by: for any (t, x, p, v)in [0, T ]× Rd × Rd × Rd′

,

H(t, x, p, v) = U(t, x, v) + 〈p, f(t, x, v)〉 , (6.13)

where 〈., .〉 denotes the scalar product on Rd.


6.1.2.1 The Pontryagin principle

This can be deduced from the Lagrange theorem and the Euler equation,as shown in what follows. For this purpose, define the Lagrange multiplierp(t) associated to the constraint (6.11) at current time t. The Lagrangian Lof P” is:

L =∫ T

0

[U(t, x(t), v(t)) + 〈p(t), f(t, x(t), v(t))〉 −

⟨p(t),

·x(t)

⟩]dt + A [x(T )] ,

(6.14)which is equivalent to:

L =∫ T

0

[H(t, x(t), p(t), v(t)) −

⟨p,

·x(t)

⟩]dt + A [x(T )] . (6.15)

Therefore, the maximization w.r.t. (x,·x, v) can be decomposed into two

steps:

• Step 1: Maximization w.r.t. v.

Any optimal control must satisfy the following condition:

v∗(t) = arg maxv

H(t, x(t), p(t), v(t)), ∀t a.s.

Denote H∗(t, x(t), p(t), v(t)) the maximum of the Hamiltonian.

• Step 2: Maximization w.r.t. (x,·x).

We have to solve the following problem of calculus of variations: forx(0) = x0,

max∫ T

0

[H∗(t, x(t), p(t), v(t)) −

⟨p(t),

·x(t)

⟩]dt + A [x(T )] .

Since the terminal value xT is not fixed, the optimality conditions aregiven by Proposition 6.4:

- If v(.) is continuous at t, we recover the Euler equation:

·p(t) = −∂H∗

∂x(t, x(t), p(t)) . (6.16)

- If v(.) is discontinuous at t, t → p(t) is continuous (Erdman-Weierstrasscondition).

- At terminal value: (transversality condition)

p(T ) = A′ (x(T )) .


·x(T ) = −∂H∗

∂p(t, x(t), p(t)) .

Therefore, we have to solve the system of Hamilton-Jacobi equations:

·x(T ) = −∂H∗

∂p(t, x(t), p(t)) , (6.17)

·p(T ) = −∂H∗

∂x(t, x(t), p(t)) , (6.18)

with p(T ) = A′(x(t)) and x(0) = x0.

Note that:- When H∗ does not depend on current time, the Hamiltonian is constant

for solutions of:

d

dt[H∗ (x(t), p(t))] =

∂H∗

∂x

·x +

∂H∗

∂p

·p = 0.

- The boundary condition implies that both the initial values of x(.) andthe terminal value of p(.) are fixed.

THEOREM 6.1 Pontryagin principle

Assume that:- The utility function U , the function f , and the function A are continuously

differentiable on their respective domains.- In addition, the function f is bounded and Lipschitzian w.r.t. x, uniformly

w.r.t. (t, v).- The set V is a convex and compact subset of Rd′.Then, if (x∗(.), v∗(.)) is solution of P”, there exists p∗(.) , belonging to E,

such that:1) v∗(t) = arg maxv H(t, x∗(t), p∗(t), v), ∀t a.s. Denote this maximum by

H∗(t, x∗(t), p∗(t), v).2) The pair (x∗, p∗) is the solution of the Hamilton-Jacobi equation: ·

x(T ) = −∂H∗∂p (t, x(t), p(t)) ,

·p(T ) = −∂H∗

∂x (t, x(t), p(t)) ,with

p(T ) = A′(x(t)),

x(0) = x0.(6.19)

REMARK 6.2 This theorem is applied through two steps:1) Hamiltonian maximization: we search for v∗(t, x(t), p(t)), which deter-

mines H∗(t, x(t), p(t), v).2) Hamilton-Jacobi equations solution: we search for (x∗, p∗), which deter-

mines v∗(t, x∗(t), p∗(t)).


Example 6.3 Deterministic portfolio optimization with transactioncosts(see [156])

Assume that an individual invests his wealth on cash with a rate of returnr(t), and stock with price S(t) and dividend rate d(t).

Each transaction (buying or selling) has a proportional cost θ and thereexists an upper bound N on the number of traded stocks.

Denote: C(t) the cash amount, n(t) the number of purchased stocks, andq(t) the number of sold stocks.

The state variable is x(t) = (C(t), n(t)) and the control variable is v(t) =q(t).

The optimization problem is:

max (C(T ) + n(T )S(T ))·C = rV + dn + S(q − θ |q|),·n = −q,

C(0) = C0 and n(0) = p0,

q ∈ [−N,N ].

The Hamiltonian does not depend on current time and is given by:

H(x, p, v) = H(C, n, p1, p2, q) = p1 (rV + dn + S(q − θ |q|))− p2q.

Step 1: Hamiltonian maximization, w.r.t. q on [−N,N ].The solution is given by: q∗ = N if p1S(1− θ) > p2,

q∗ = 0 if p1S(1− θ) < p2 < p1S(1 + θ),q∗ = −N if p2 > p1S(1 + θ),

and

H∗(x, p) = p1 (rV + dn) + N max (p1S(1− θ)− p2, p2 − p1S(1 + θ), 0) .

Step 2: Hamilton-Jacobi equations solution. ·p1 = −∂H∗

∂V = −rp1,·p2 = −∂H∗

∂n = −dp1,with

p1(T ) = 1,

p2(T ) = S(T ).

The solution is given by:p1 = exp

[∫ T

t r(s)ds],

p2 = S(T ) +∫ T

t d(s)p1(s)ds.

Therefore, p1(t) can be viewed as the value at time T of one monetary unitinvested on time t, invested on the cash. p2(t) is the value at time T of onestock invested from t to T .


The optimal control is “bang-bang” : it can have three values −N, 0, or Nand is piece-wise constant. Its value depends on the ratio τ = p2/p1S, whichis the cuurent value of one monetary unit invested on stock from t to T . Wehave:

τ(t) =exp

[− ∫ T

tr(s)ds

] (S(T ) +

∫ T

td(s)p1(s)ds

)S(t)

.

Thus, the optimal control satisfies: q∗ = N if τ(t) < (1− θ) ,q∗ = 0 if (1− θ) < τ(t) < (1 + θ),

q∗ = −N if τ(t) > (1+) θ.

Note that if there exists no friction (θ = 0 and N = ∞), then τ(t) must beequal to 1 in order to have a solution. In that case, we recover the standardno-arbitrage valuation:

S(t) = exp

[−∫ T

t

r(s)ds

](S(T ) +

∫ T

t

d(s)p1(s)ds

).

6.1.2.2 The Bellman principle

The Bellman approach considers that problem (P”) can be embedded in amore general class, parametrized by the initial date t0 and the initial statevalue xt0 , solved by the dynamic programming principle.

Consider the functional J defined by:

J (t0, xt0) = maxv(.)

∫ T

t0

U(t, x(t), v(t))dt + A [x(T )] , (6.20)

under :·x(t) = f (t, x(t), v(t)) if v(.) is continuous at t, (6.21)x(t0) = x0 and v(T ) ∈ V ,

where V is the set of piece-wise continuous functions.

- Assume that the ODE·x(t) = f (t, x(t), v(t)) has one and only one solution.

- Suppose that (t0, xt0) belongs to the domain D where J is defined. Con-sider the subdivision of the interval [t0, T ] into [t0, t0 + h[ and [t0 + h, T ].


The optimization problem can be divided into two steps:

- Maximization on [t0+h, T ], knowing the state value at time t0+h. Denotethe optimal value by J (t0 + h, xt0+h).

- Maximization on [t0, t0 + h[, taking account of its impact on J (t0 +h, xt0+h).

This leads to the dynamic programming principle:

J (t0, xt0) = maxv(.)

∫ t0+h

t0

U(t, x(t), v(t))dt + J (t0 + h, xt0+h). (6.22)

If (t0, xt0) belongs to the interior of the domain D, J (t0, xt0) exists forsufficient small values of h and we have:

0 = maxv(.)

1h

[∫ t0+h

t0

U(t, x(t), v(t))dt + J (t0 + h, xt0+h)− J (t0, xt0)

].

(6.23)Assuming that J is differentiable at (t0, xt0), we deduce (for h→ 0+):

0 ≥ maxv(.)

[U(t0, xt0 , v) +

∂

∂tJ (t0, xt0) +

∂

∂xJ (t0, xt0)f (t0, xt0 , v)

],

0 ≥ ∂

∂tJ (t0, xt0) + max

v(.)

[U(t0, xt0 , v) +

⟨∂

∂xJ (t0, xt0), f (t0, xt0 , v)

⟩].

Therefore, using the Hamiltonian, we have:

0 ≥ ∂

∂tJ (t0, xt0) + H∗(t0, xt0 ,

∂

∂xJ (t0 , xt0)). (6.24)

Under mild assumptions, the reverse inequality can be proved. Thus, forvarying t and x, J satisfies the Bellman partial derivative equation (PDE).

THEOREM 6.2 Bellman PDE

If the value function J associated to problem (P”) is defined and continu-ously differentiable on ]t0, T [×O where O is a convex open subset of Rd, thenit satisfies the Bellman PDE:For any (t, x) ∈]t0, T [×O,

∂

∂tJ (t, x) + H∗(t, x,

∂

∂xJ (t, x)) = 0 with J (T, x) = A(x). (6.25)


REMARK 6.3 The solution is deduced through two steps:

- Hamiltonian maximization.

- Bellman PDE solution.

The Bellman approach allows recovery of the maximum principle and theHamilton-Jacobi equations, introduced in the Pontryagin approach.

Indeed, consider p(t) = ∂∂xJ (t, x(t)) and assume that J is twice-continuously

differentiable. Then, we have to check that p(.) is solution of the Hamilton-Jacobi ODE, and also that p(.) satisfies the transversality condition.

1) p(.) is solution of the Hamilton-Jacobi ODE:

Since p(t) = ∂∂xJ (t, x(t)), by differentiating pi(t), we get:

·pi(t) =

d

dt

[∂J∂xi

(t, x(t))]

=∂2J∂t∂xi

(t, x(t)) +d∑

j=1

∂2J∂xi∂xj

(t, x(t)) .·xj(t).

Differentiating the Bellman equation ∂J∂t + H∗(t, x, ∂J

∂x ) = 0 w.r.t. xi, wededuce:

∂2J∂t∂xi

+∂H∗

∂xi+

d∑j=1

∂2J∂xi∂xj

.∂H∗

∂xi= 0.

Finally, since fj(t, x, v∗) = ∂H∗∂pj

(t, x, ∂J

∂x

)and

·xi = fj(t, x, v∗), the result is

proved:·pi(t) = −∂H∗

∂xi(t, x(t), p(t)).

2) p(T ) satisfies the transversality condition:

Differentiating w.r.t. xi, the terminal value of Bellman equation, we have:

∂J∂xi

(T, x) =∂A

∂xi(x),

which implies:

p(T ) =∂J∂x

(T, x(T )) =∂A

∂x(x(T )).

REMARK 6.4 The Bellman method allows us to more easily find thesolution when its form is anticipated. Otherwise, computation difficulties ofthe Bellman PDE and Hamilton-Jacobi ODE are often similar.


6.1.3 Stochastic optimal control

6.1.3.1 Introduction to stochastic optimal control

The theory of stochastic optimal control extends previous methods to thestochastic dynamical systems. For basic notations, definitions and propertiesof stochastic processes, we refer to Karatzas and Shreve ([320],[321]) and Jacodand Shiryaev [294]. Appendix B provides a short survey about these notions.

Consider the case of systems associated to stochastic differential equations(SDE):

dXt = f(t,Xt)dt + g(t,Xt)dWt with Xt0 = x0, (6.26)

where Xt is the state variable with values in Rd, v(t) is the control variablewith values in Rd′

, and W is a d”−dimensional Brownian motion with d” ≤ d.

The functions f(., .) and g(., .) are assumed to satisfy usual conditions inorder to ensure that the previous SDE has one and only one solution. Forexample:

- The functions f(., .) and g(., .) have linear growth:

||f(t, x)|| ≤ (1 + ||x||)Mt and ||g(t, x)|| ≤ (1 + ||x||)Mt,

where M(.) is a positive determistic function, upper bounded on each com-pact subset of [0, T ].

- The functions f(., .) and g(., .) are Lipschitzian:

||f(t, x)− f(t, y)|| ≤ (||x− y||)Kt and ||g(t, x)− g(t, y)|| ≤ (||x− y||)Kt,

where K(.) is also a positive determistic function, upper bounded on eachcompact subset of [0, T ].

Due to the observation of the path on time period [0, T ], the acquiredinformation may modify the control variable v(.). Thus, we assume that v(.) isnow a stochastic process (v(t, ω))t. Furthermore, if v(.) is supposed to be suchthat v(t, ω) = v(t,Xt(ω)), where v is a deterministic function v : [0, T ]× Rd

→ Rd′, then v(.) is said to be “feedback” and the system is Markovian:

dXt = f(t, v(t,Xt))dt + g(t, v(t,Xt))dWt with Xt0 = x0. (6.27)

The information is modelled by the filtration FXt generated by process X .

The matrix g(t, x) is assumed to satisfy the following: There exists ε > 0such that Trace (tg(t, x)g(t, x)) ≥ ε (non-degenerated).


Under previous assumptions, we are led to the following stochastic controlproblem Pc:

maxv(.)

E

[∫ T

t0

U(t,X(t), v(t))dt + A [X(T )] |X(t0) = x0

],(6.28)

under, for t0 ≤ t ≤ T,

dXt = f(t, v(t))dt + g(t, v(t)dWt, (6.29)X(t0) = x0 and v(t) = v(t,Xt(ω)) ∈ V a.s. ,

where V is an open convex subset of Rd′.

The objective function U is assumed to be twice-differentiable on [t0, T ]×Rd × Rd′

, and A is twice-differentiable on Rd.

The maximization is made on functions v which are piece-wise continuousw.r.t. the current time t and Lipschitzian w.r.t. the state variable x.

Problem (Pc ) is embedded in the class parametrized by (t, x). Thus, we canapply the Bellman approach. For this purpose, we define the new Hamiltonian.

DEFINITION 6.2 The Hamiltonian H(t, x, p, q, v) associated to problem(Pc ) is given by: for (t, x, p, q, v) ∈ [t0, T ]× Rd × Rd × (Rd × Rd)× Rd′

,

H(t, x, p, q, v) = U(t, x, v)+d∑

i=1

pifi(x, v)+12

d∑i,l=1

qi,l

d”∑j=1

σi,j(x, v)σl,j(x, v)

.

(6.30)

With respect to the deterministic case, any pair of state variables (xi, xl) isassociated a new variable qi,l with a coefficient equal to half the instantaneouscovariance of the random variables Xi and Xl. This term is due to the Dynkinoperator of the process X , denoted by D(X). Note that D(X) can be viewedas a “mean or expectation of variation rate” of the process X , “roughly” equalto the ordinary derivative w.r.t. current time as for deterministic functions:

E[dXt |Ft ]dt

.

The Hamiltonian can also be expressed as:

U(t, x, v) + 〈p, f(x, v)〉+12Trace

(g(t, x)qtg(t, x)

).


6.1.3.2 The stochastic Bellman principle

We apply the dynamic programming approach to the stochastic framework.

Denote J (t0, x0) as the value function associated to problem Pc.

Assume that (t0, x0) belongs to the interior domain of the operator J .

This leads to the stochastic dynamic programming principle:

J (t0, xt0) = maxv(.)

E

[∫ t0+h

t0

U(t,X(t), v(t))dt + J (t0 + h,Xt0+h) |Xt0 = x0

].

(6.31)Then:

0 = maxv(.)

1hE

[∫ t0+h

t0U(t, x(t), v(t))dt |Xt0 = x0

]+

+ 1hE [J (t0 + h,Xt0+h)− J (t0, Xt0) |Xt0 = x0 ]

, (6.32)

where the optimum is searched on the set of feedback controls v, and theprocess X is solution of the SDE:

dXt = f(t, v(t,Xt))dt + g(t, v(t,Xt)dWt with Xt0 = x0. (6.33)

For h→ 0+, we deduce:

limh→0+

1h

E

[∫ t0+h

t0

U(t, x(t), v(t))dt |Xt0 = x0

]= U(t0, Xt0 , v).

Then, applying Ito’s lemma, we get:

limh→0+

1h

E [J (t0 + h,Xt0+h)− J (t0, Xt0) |Xt0 = x0 ] = DJ (t0, Xt0).

Thus, we have:

0 ≥ maxv(.)

[U(t0, xt0 , v) +

∂J∂t

(t0, xt0) +⟨∂J∂x

(t0, xt0), f (t0, xt0 , v)⟩]

+12Trace

(g(t0, xt0 , v)tg(t0, xt0 , v)

).

Using mild assumptions, the equality is proved.Set

p =∂J∂x

(t0, xt0) and q =∂2J∂x2

(t0, xt0).

Therefore, using the Hamiltonian, we are led to the stochastic Bellmanequation.

0 =∂J∂t

(t0, xt0) + H∗(t0, xt0 ,∂J∂x

(t0, xt0),∂2J∂x2

(t0, xt0)), (6.34)


where

H∗(t0, xt0 ,∂J∂x

(t0, xt0),∂2J∂x2

(t0, xt0))

= maxv(.)H(t0, xt0 ,∂J∂x

(t0, xt0),∂2J∂x2

(t0, xt0), v).

Thus, for varying t and x, J satisfies the stochastic Bellman equation.

THEOREM 6.3 Stochastic Bellman equation

If the value function J associated to problem (Pc) is defined and continu-ously differentiable on ]t0, T [×O where O is a convex open subset of Rd, thenit satisfies the Bellman equation: for any (t, x) ∈]t0, T [×O,

∂J∂t

(t, x) + H∗(t, x,∂J∂x

(t, x),∂2J∂x2

(t, x)) = 0 with J (T, x) = A(x). (6.35)

REMARK 6.5 The solution is again deduced through two steps:

- Hamiltonian maximization:

H∗(t, x, p, q) = maxv(.)

H(t, x, p, q, v), and

- Solution of the Bellman equation:

For a solution v∗(.) of previous optimization problem, we have to solve theBellman PDE:

∂J∂t

(t, x) + H∗(t, x,∂J∂x

(t, x),∂2J∂x2

(t, x)) = 0 with J (T, x) = A(x).

Under previous assumptions, there exists a maximal domain on which thereexists one and only one solution. However, solutions are rarely explicit.

For stationary problems, explicit solutions may be determined.

Consider, for example, the case of an infinite horizon and suppose that theinvestor has an exponential utility function (CARA utility).


Example 6.4Suppose that T = ∞ and U is such that:

U(t, x, v) = e−ρtu(x, v),

with ρ > 0.

Suppose also that d = 1 (univariate case).

Consider

J (t, x) = maxv(.)

E

[∫ ∞

t

e−ρsu(X(s), v(s))ds |Xt = x

], (6.36)

with

dXs = f(Xs, v(s,Xs))ds + g(Xs, v(s,Xs))dWs, (6.37)

X(t) = x and v(s) = v(s,Xs) ∈ V .

Then, if J (0, x) = I(x) is defined and twice-continuously differentiable onthe open subset O, J (t, x) is defined on the whole space R×O and we have:

For any (t, x) ∈ R×O,

J (t, x) = e−ρtI(x),

where I is solution of the ODE:

ρI(x) = maxv∈V

[u(x, v) + I ′(x)f(x, v) +

12I”(x)g2(x, v)

].

The next section illustrates this approach to determine optimal financialportfolios.


6.2 Lifetime portfolio selection

6.2.1 The optimization problem

In what follows, we consider the approach introduced by Samuelson [447],and further studied by Merton ([385], [386] and [387]), who used dynamicprogramming methods in order to deduce explicit solutions when security pa-rameters are deterministic. A more general solution is also examined whensecurity parameters are no longer deterministic.

Assumptions (S) on securities:- Let S be the price vector of d securities. Let (Ω,P) represent the probabil-

ity space. Assume that it is defined from the following stochastic differentialequation (SDE):

dSt = St− . (µ(t, St)dt + σ(t, St)dWt) , (6.38)

where W = (W1, ...,Wn) is a d−multidimensional Brownian motion such that

∀i = j, Cov(Wi,t,Wj,t) = ρi,jt and V ar(Wi,t) = t. (6.39)

- The information is modelled by the filtration Ft generated by the Brow-nian motion (and, as usual, completed in order to contain all P-null sets).

- The time horizon is supposed to be finite and is denoted by T .

- The processes µ(., .) = (µ1, ..., µd)(., .), which model the instantaneousexpectations, and σ(., .) = (σ1, ..., σd)(., .), which model the volatilities, satisfyusual conditions which ensure that the previous SDE has one and only onesolution (see for example [294]). For instance:

i) These processes are measurable, Ft-adapted and uniformly bounded on[0, T ]× Ω.

ii) The matrix σ(t, .) is invertible with bounded inverse for any t ∈ [0, T ].The process σ is also predictable.

REMARK 6.6 Under the previous hypothesis, the financial market iscomplete and without arbitrage opportunity.


Assumptions (P) on portfolio strategies:

- At any time t, the investor chooses the positive amount ct per time unit,which is assigned to his consumption, and also the portfolio weighting wt.

- The investor’s strategy is supposed to be self-financing and the cumulativeconsumption

∫ t

0csds is an Ft-adapted-process with

∫ t

0csds <∞, P-a.s.

- The portfolio weighting wt is predictable and such that∫ t

0||ws||2 ds <∞,

P-a.s., where ||.|| denotes the norm.

Thus, the portfolio value Vt is an Ito process defined by:

Vt = V0 +d∑

i=1

∫ t

0

wi,sVsdSi,s

Si,s−∫ t

0

csds. (6.40)

Assumptions (U) on utility functions:

For any time t, let U(., t) and U(., t) be two utility functions satisfying:∀t ∈ [0, T ],

- U(., t) and U(., t) are defined on R+, strictly concave, non-decreasing andcontinuously differentiable.

- limx→∞ ∂∂xU(x, t) = 0 and limx→∞ ∂

∂x U(., t) = 0.

- U(., .) and U(., .) are continuous on R+ × [0, T ] (for example: U(x, t) =e−ρtu(x) with ρ positive scalar).

Note that, under the first assumption, the marginal utilities ∂∂xU(x, t) and

∂∂x U(., t) are non-decreasing. Therefore, their inverse functions J and J exist.

The maximization of the expected intertemporal utility along the time pe-riod [0, T ] is the following optimization problem:

maxc,w

E

[∫ T

0

U(cs, s)ds + U (VT , T )

]. (6.41)

There exists one constraint corresponding to the initial budget: at timet = 0, the portfolio value V is equal to a given value V0.

6.2.2 The deterministic coefficients case

The functional to be optimized is time-additive and strategies are assumedto be non-anticipative (i.e., they are functions of past or current informationand not of future observations).


Therefore, as seen in the previous section, the solution can be determinedthrough dynamic programming: “an optimal strategy on the whole time pe-riod [0, T ] must be optimal on any sub-period [t, T ]. ”

Then, at any time t, the consumption rate process C and the weightingprocess w are solutions of the following problem:

- Consider the functional J (“the value-function”) given by:

J (V, S, t) = maxc,w

Et

[∫ T

t


], (6.42)

where Et denotes the conditional expectation given the information at time t.

- Consider the functional Φ defined by:

Φ(c, w, V, S, t) = U(ct, t) +D(J ), (6.43)

where D denotes the Dynkin operator associated to price variable S and withvalue V for given control variables c and w:

D =∂

∂t+

(d∑

i=1

wi,tµi,tVt − ct

)∂

∂V+

d∑i=1

µi,tSi,t∂

∂Si(6.44)

+12

d∑i=1

d∑j=1

σi,j,twi,twj,tV2t

∂2

∂V 2+

12

d∑i=1

d∑j=1

σi,j,tSi,tSj,tV2t

∂2

∂Si,t∂Sj,t

+d∑

i=1

d∑j=1

Si,tVtσi,j,twi,t∂2

∂Si,t∂Vt.

Using standard results of dynamic programming (see Kushner [340]), wededuce:

PROPOSITION 6.4If U is strictly concave w.r.t. c, and if U is concave w.r.t V , then there

exists a set of optimal controls w∗ and c∗ such that:d∑

i=1

wi = 1, J (V, S, T ) =

U (VT , T ), and for any t in [0, T ],

0 = Φ(c∗, w∗, V, S, t) ≥ Φ(c, w, V, S, t). (6.45)

This first result shows that the optimization problem is equivalent to themaximization of the functional Φ(c, w, V, S, t) under the following constraint


on weights w:d∑

i=1

wi,t = 1. This can be done by using the Lagrangian:

L = Φ(c, w, V, S, t) + λ

(1−

d∑i=1

wi,t

),

where λ denotes the usual Lagrange parameter. Under previous assumptions,the first-order conditions are given by: For any i ∈ 1, ..., d,

0 = Lc(c∗, w∗) = Uc(c∗, t)− JV , (6.46)

0 = Lwi(c∗, w∗) = −λ + JV µiV + JV V

d∑j=1

σi,jw∗i,tV

2 +d∑

j=1

JSjV σi,jSjV,

0 = 1−d∑

i=1

wi,t. (6.47)

(notation: AX denotes the partial derivative of A w.r.t. X and AXY thepartial derivative of order 2 w.r.t. X and Y ).Note that Lcc = Ucc < 0, Lcwi = 0, Lwi,wi = σ2

i V2JV V , Lwi,wj = σi,jV

2JV V ,and that the volatility matrix [σi,j ]i,j is non-degenerate. Therefore, a suffi-cient condition for the existence of an interior solution is: JV V < 0 (meaningthat J is strictly concave w.r.t. V ). By differentiating the first previous e-quation, we deduce that the optimal consumption is an increasing functionw.r.t. V .

Next, we must determinate c∗, w∗, and λ as functions of S, V , and tand the derivatives of J by solving (n + 2) implicit equations. Then, wehave to substitute these solutions into Equation (6.45), in order to get a d-ifferential equation of the second order w.r.t. J with boundary condition:J (V, S, T ) = U (VT , T ) .

Denote by J , the inverse of the marginal utility U ′ w.r.t. the consumptionc:

J = (U ′)−1. (6.48)

From Equation (6.46), we deduce:

c∗t = J(JV , t). (6.49)

The optimal weights w∗i are determined from a linear system, which allows

us to get explicit solutions. For this purpose, denote:

Σ = [σi,j ]i,j ,Σ−1 = [νi,j ]i,j and Γ =

d∑i=1

d∑j=1

νi,j .


By eliminating the Lagrange parameter λ in the second equation (6.46), wedetermine the weights w∗

i :

w∗i,t = hi(St, t) + m(St, Vt, t)gi(St, t) + fi(St, Vt, t), (6.50)

withd∑

i=1

hi = 1,d∑

i=1

fi =d∑

i=1

gi = 0,

and

hi(St, t) =d∑

j=1

νi,jΓ

;m(St, Vt, t) = − JV

V JV V,

gi(St, t) =1Γ

n∑j=1

νi,j

(Γµj −

d∑h=1

d∑l=1

νh,lµl

),

fi(St, Vt, t) = −ΓJSiV Si −

d∑j=1

JSjV Sj

d∑h=1

νi,h

1ΓV JV V

.

From previous expressions of C∗ and w∗, we deduce the second-order dif-ferential equation satisfied by J . Its coefficients are functions of St, Vt, andt:

0 = U(J(JV , t)) + Jt + JV

d∑

i=1

d∑i=1

νi,jµi,tVt

Γ− J(JV , t)

(6.51)

+d∑

i=1

µi,tSi,tJSi +12

d∑i=1

d∑j=1

σi,j,tSi,tSj,tJSiSj +Vt

Γ

d∑j=1

Sj,tJSjV

− JV

ΓJV V

d∑i=1

Γµi,tSi,tJSiV −d∑

i=1

Si,tJSiV

d∑j=1

d∑h=1

ν,jlµl,t

+JV V V 2

2Γ

− 12ΓJV V

d∑i=1

d∑j=1

Γσi,j,tSi,tSj,tJSiV JSjV −(

d∑i=1

Si,tJSiV

)2

− J 2V

2ΓJV V

d∑i=1

d∑j=1

Γνi,j,tµi,tµj,t − d∑

i=1

d∑j=1

νi,j,tµi,t

2 .

In addition, the functional J satisfies the equation:

J (VT , ST , T ) = U (VT , T ) .


Then, the solution J ∗ of Equation (6.51) is introduced in equations (6.49)and (6.50), which allows us to deduce c∗ and w∗.

For the general case, the solutions of these equations are not explicit. More-over, for a large number of securities, numerical computations of solutions aretedious.

However, for some cases, explicit solutions can be identified and thus moreeasily analyzed; for example, if securities are defined from a Brownian multidi-mensional process with constant coefficients (in that case, they have lognormaldistributions).

Assume that functions µ(., .) = (µ1, ..., µd)(., .) and σ(., .) = (σ1, ..., σd)(., .)are constant. Equation (6.51) is:

0 = U(J(JV , t)) + Jt + JV

d∑

i=1

d∑i=1

νi,jµi,t

ΓVt − J(JV , t)

(6.52)

− J 2V

2ΓJV V

d∑i=1

d∑j=1

Γνi,j,tµi,tµj,t − d∑

i=1

d∑j=1

νi,j,tµi,t

2 .

The solution J depends only on V , not on t or S. In that case, optimalweights w∗ satisfy:

w∗i,t = hi + m(Vt, t)gi, (6.53)

where hi and gi are constant.

Therefore, at any time t, the weight w∗i,t belongs to a straight line included

in the hyperplaned∑

i=1

wi = 1. The term m(Vt, t) looks like a parameter which

determines the position of w∗i,t on this line. It depends on the utility function,

contrary to the terms hi and gi.

PROPOSITION 6.5

Under previous assumptions, the optimal solution w∗ has the following de-composition: There exist two mutual-funds, independent from the investor’spreferences, such that the investor is indifferent between a combination of thesetwo funds and a combination of the d given securities. These two funds arecharacterized (up to a multiplicative coefficient) by their respective weighting


α and β such that for any i ∈ 1, ..., d:

αi = hi +1− η

νgi, (6.54)

βi = hi − η

νgi, (6.55)

where η and ν are arbitrary constant parameters (ν = 0).The solution w∗ is such that there exists a scalar a(Vt, t) satisfying

a(Vt, t) = υm(Vt, t) + η,

andw∗

i,t = a(Vt, t)αi + [1− a(Vt, t)]βi. (6.56)

If there exists a riskless asset S0 with rate of return r, the proportions aregiven by: for any i ∈ 1, ..., d ,

αi =1− η

ν

d∑j=1

νij(µj − r), (6.57)

βi = −η

ν

d∑j=1

νij(µj − r), (6.58)

α0 = 1−d∑

j=1

αi et β0 = 1−d∑

j=1

βi. (6.59)

REMARK 6.7 For a financial market with a return vector having aGaussian distribution, two mutual funds are sufficient to generate all optimalportfolios. This dynamic two-fund separation property is analogous with theTobin-Markowitz separation property, proved in the mean-variance framework(see Chapter 3). When a riskless asset exists, it can be considered as the firstmutual fund. Note that the value of the second one also has a lognormaldistribution.

When considering HARA utility functions, the optimal solution is an explic-it function of the utility function parameters. Assume, to simplify, that thereexist two securities, one riskless and one risky. Consider an intertemporalutility defined as a function of the consumption c of the following type:

U(c, t) = exp(−ρt)× U(c) with U(c) =1− γ

γ

(βc

1− γ+ η

)γ

, (6.60)

where ρ > 0, γ = 1, β > 0, βC1−γ + η > 0, η = 1 if γ = −∞.


The optimal solutions of Equation (6.52) are given by:

J (V, t) =δβγ

γexp(−ρt)

(δ [1− exp [− (ρ− γυ) (T − t)/γ]]

ρ− γυ

)δ

(6.61)

×(V

δ+

η

βr[1− exp [−r(T − t)]]

)γ

where δ = 1− γ and ν = r + (µ− r)2/2δσ2.

Note that, for γ > 1, the solution holds only for

Vt ≤ (γ − 1)η [1− exp [−r(T − t)]] /βr.

The optimal consumption and weighting are respectively given by:

c∗t (Vt) =(ρ− γν)

(Vt + δη

βr [1− exp [−r(T − t)]])

δ(1− exp

[(ρ−γυ)(T−t)

δ

]) − δη

β, (6.62)

and,

w∗t (Vt)Vt =

µ− r

δσ2Vt +

η (α− r)βrσ2

[1− exp [−r(T − t)]] . (6.63)

REMARK 6.8 The optimal consumption c∗ and amounts w∗V are linearfunctions of the wealth V . The value process V also has coefficients which aredeterministic functions w.r.t. the current time. Note that, when the vectorof security logreturns is Gaussian, the HARA utility functions are the onlyutility functions for which these linearity properties are valid.

The final step is to search for the optimal wealth V ∗. It is determined byusing previous solutions c∗ and w∗ of Equation 6.62:

dVt = ([w∗t (µ− r) + r]Vt − c∗t ) dt + σw∗

t dWt.

Then, setting

Xt = Vt +δη

βr[1− exp [−r(T − t)]] ,

we have a stochastic process (Xt)t which is solution of the following standardSDE:

dXt

Xt= adt + bWt,

where a and b are constant. The solution X is a geometric Brownian mo-tion. Then, its probability distribution is lognormal. We deduce the optimal


portfolio value V ∗:

V ∗t = X∗

t −δη

βr[1− exp [−r(T − t)]] , (6.64)

with

X∗t = X0 exp

[(r − ρ− γν

δ+ (1− 2γ)

(µ− r)2

2σ2δ2

)t +

µ− r

σδWt

]×1− exp

[ρ−γν

δ (t− T )]

1− exp[− ρ−γν

δ (T )] . (6.65)

Then, the optimal consumption and allocations are functions c∗t (V∗t ) and

w∗t (V ∗

t ).

Examine the following two particular cases:

• Case 1: (logarithmic utility) (which results from the limit case whereγ = η = 0, and β = 1− γ = 1)

c∗t (V∗t ) = V ∗

t /(T − t) and w∗i,tV

∗t =

n∑j=1

νij(µj − r)

V ∗t . (6.66)

• Case 2: (CARA utility) (U(c, t) = − exp(−ηc)/η))

c∗t (V∗t ) = rV ∗

t +1ηr

[(µ− r)2

2σ2− r

]and w∗

t V∗t =

(µ− r)ηrσ2

. (6.67)

6.2.3 The general case

Consider the general financial model with all assumptions (A),(P), and (U).This model has been examined by Karatzas et al. [318], using the Bellman ap-proach developed in the previous section. However, since the financial marketis complete, another approach, based on the representation theorem for mar-tingales, can be used to solve the optimization problem, as shown by Lehoczkyet al. [350].

The investor searches for the solution of

maxC,w

E

[∫ T

0

U(Cs, s)ds + U (VT , T )

], (6.68)

under a positivity constraint on the wealth process V.


6.2.3.1 The positivity constraint

Indeed, the “infinite-dimensional” constraint Vt ≥ 0 is not easy to check.Thus, we can try to “reduce” it to one-dimension. For this purpose, introducethe risk-neutral probability Q with Radon-Nikodym derivative L defined by:

Lt = E

[dQ

dP|Ft

]= exp

[∫ t

0

η(s)dWs − 12

∫ t

0

||η(s)||2 ds], (6.69)

where: (I denotes the vector with all components equal to 1)

η(t) = − [σ(t)]−1 (µ(t)− r(t)I) . (6.70)

Under assumptions (A), the process L is bounded and the Girsanov theo-rem can be used.

The process W, defined by Wt = Wt −∫ t

0η(s)ds, is a Brownian motion

w.r.t. the risk-neutral probability Q and the filtration Ft. Under Q, the assetprices are solution of:

dSt = St− .(rtdt + σ(t, St)dWt

), (6.71)

and the wealth process, associated to strategy (c, w), is given by:

dV c,wt =

[V c,wt− (rt − c(t))

]dt +

∑i,j

wi(t)σi,j(t, St)dWj,t, (6.72)

V c,w(0) = v0. (6.73)

Denote by R the money market account:

Rt = exp[−∫ t

0

r(s)ds]. (6.74)

Then

V c,wt Rt = v0 −

∫ t

0

R(s)c(s)ds +∫ t

0

R(s)tw(s)σ(s)dWs.

DEFINITION 6.3 A strategy (c, w) is said to be “admissible” for a giveninitial wealth v0, if the process V c,w is positive a.s. Denote A(v0) the set ofsuch strategies.

PROPOSITION 6.6Let a strategy (c, w) in A(v0) and V c,w represent the associated wealth process.Then

EQ

[V c,wT RT +

∫ T

0

R(s)c(s)ds

]≤ v0. (6.75)


PROOF If (c, w) is admissible, the process Mt = v0+∫ t

0R(s)tw(s)σ(s)dWs

is a local Q-martingale which is equal to V c,wt Rt +

∫ t

0R(s)c(s)ds, and so is

positive. Thus, M is a Q-supermartingale, which implies EQ [MT ] ≤M0.

A converse property can also be proved, using the martingale representa-tion.

PROPOSITION 6.7Let c be a consumption process satisfying assumption (S2), and let X be apositive random variable FT -measurable (“contingent claim”) such that:

EQ

[XRT +

∫ T

0

R(s)c(s)ds

]= v0.

Then, there exists a portfolio weighting w which is predictable such that thepair (c, w) is admissible, and the terminal wealth V c,w

T is equal to X.

PROOF- First note that, if such weighting w exists, the local Q-martingale

Mt = v0 +∫ t

0

R(s)tw(s)σ(s)dWs , (6.76)

is positive since it can be written as:

Mt = V c,wt Rt +

∫ t

0

R(s)c(s)ds.

The supermartingale M is also a martingale such that:

Mt = EQ [MT |Ft ] = EQ

[V c,wT RT +

∫ T

0

R(s)c(s)ds |Ft

].

- Therefore, it is sufficient to prove that there exists w satisfying relation(6.76). For given pair (c,X), the process

Mt = EQ

[XRT +

∫ T

0

R(s)c(s)ds |Ft

](6.77)

is a Q-martingale.

The predictable representation theorem for martingales proves that thereexists a predictable process θ such that

∫ T

0||θt||2 dt <∞ a.s., and

Mt = M0 +∫ t

0

θ(s)dWs with M0 = EQ [MT ] = v0. (6.78)


Then, consider the process w given by: for any t ∈ [0, T ],

w(t) = − [R(t)]−1 [tσ(t)]−1 t

θ(t),

and the process V c,w defined by:

V c,wt Rt = v0 −

∫ t

0

R(s)c(s)ds +∫ t

0

θ(s)dWs.

We can easily check that the process V c,w is associated to the strategy(c, w), which is admissible.

6.2.3.2 Existence of an optimal strategy

The optimization problem is:

maxc,w

EP

[∫ T

0


], (6.79)

with the budget constraint:

EP

[V c,wT R(T )L(T ) +

∫ T

0

R(s)c(s)L(s)ds |Ft

]≤ v0.

Consider the value-function J :

J (V c,w, w, t, v0) = maxc,w

EP

[∫ T

t

U(cs, s)ds + U (V c,wT , T ) |Ft

]. (6.80)

We first have to search for optimal solution (c∗, V ∗), then to determine w∗.A useful duality result can be used.

PROPOSITION 6.8 Legendre transform

Under the assumptions (U), we have:

U(J(y))− yJ(y) = maxc≥0

[U(c)− cy] ,

where J denotes the inverse function of U ′. Note that

U(y) = maxc≥0

[U(c)− cy]

is the convex conjugate function of the investor’s utility.


PROOF Since U is concave, we have:

U(J(y))− U(c) ≥ U ′(J(y))(J(y)) − c).

If J(y) > 0, U ′(J(y)) = y. Therefore:

U(J(y))− U(c) ≥ y(J(y))− c).

If J(y) = 0, y ≥ U ′(0). Therefore:

U(0)− U(c) ≥ −U ′(0)c ≥ −yc.

To determine an optimal solution, we can use the Lagrange multipliers. Forλ ∈ R+, denote:

L(c, VT , λ) = EP

[∫ T

0


]

+λ

(v0 − EP

[VTR(T )L(T ) +

∫ T

0

R(t)c(t)L(t)dt

]).

A sufficient condition for (c∗, V ∗) to be optimal is that there exists a La-grange multiplier λ∗ ∈ R+ such that (c∗, V ∗

T , λ∗) satisfies: for any (c, VT , λ)

satisfying budget equation 6.75 and λ ∈ R+,

L(c, VT , λ∗) ≤ L(c∗, V ∗

T , λ∗) ≤ L(c∗, V ∗

T , λ).

The first inequality, which is due to the optimality of (c∗, V ∗T ), can be solved

by searching for the values (c∗t , V ∗T ) which satisfy: for any (t, ω),

- The consumption c∗tmaximizes U(ct(ω), t)− λ∗R(t)ct(ω)L(t); and,

- The wealth value V ∗T maximizes U(V ∗

T (ω), T )− λ∗R(T )V ∗T (ω)L(T ).

We deduce:

c∗t = J(λ∗(κ(t))) and V ∗T = J(λ∗(κ(T ))),

where κ(t) = R(t)L(t).

The Lagrange parameter λ∗ is determined from the budget equation. Itsexistence is deduced by assuming:

- (V): For any λ > 0, E[∫ T

0L(t)R(t)J(λκ(t))dt

]and E

[L(T )R(T )J(λκ(T ))

]are finite.


Under assumptions (U) and (V), the function F , defined by

F (y) = EP

[∫ T

0

κ(t)J(yκ(t), t)dt + κ(T )J (yκ(T ), T )

],

is continuous and non-increasing on [0, a] where a = inf y |F (y) = 0 . Itsatisfies also:

limy→0

F (y) = +∞; limy→+∞F (y) = 0.

Then, the function F has an inverse F−1. Define the function G by:

G(y) = H(F−1(y)),

where the function H is given by:

H(y) = EP

[∫ T

0

U(J(yκ(t)), t)dt + U[J (yκ(T ), T )

]].

Assume also:

- (W): the utility functions U and U are twice-continuously differentiableand U” and U” are increasing.

Then (see Karatzas et al. [318]): F and G are continuously-differentiableand H ′(y) = yF ′(y).

To summarize:

PROPOSITION 6.9Under assumptions (U),(V), and (W) for utility functions U and U , thereexists an optimal strategy (c∗, w∗) ∈ A(v0) such that:

J (c∗, w∗, v0) = maxc,w∈A(v0)

J (c, w, v0) (6.81)

where J (c, w, v0) = EP

[∫ T

0

U(cs, s)ds + U (V c,wT , T )

]. (6.82)

Using previous function F and the inverse functions J and J of marginalutility functions U ′ and U ′, we have:

c∗t = J(F−1(v0)κ(t)), (6.83)

V ∗T = J(F−1(v0)κ(T )), (6.84)

and the optimal weighting w∗ is deduced from the martingale representation(6.78).


Example 6.5Assuming that the coefficients r, µ, and σ are constant, that the utility func-tion U is a power function, U(x) = xγ

γ , and that U(x) = 0. We have:

1) The inverse function J is given by: J(y) =(

yγ

)1/(γ−1)

.

2) The function F−1 is such that:

F−1(v) = γ (bv)(γ−1) with b = 1/E

[∫ T

0

κ(t)γ/(γ−1)dt

].

3) The optimal consumption is given by:

c∗t = J(F−1(v0)κ(t)) = v0bκ(t)1/(γ−1).

4) The optimal wealth is null, since the utility function U is itself null.

5) The optimal weighting is determined from the predictable representationof the martingale

Mt = EQ

[∫ T

0

R(s)c∗(s) |Ft

]= v0 +

∫ t

0

θ∗(s)dWs ,

which, by identification, leads to the equality:

w∗(t) = [R(t)]−1θ∗(t) [σ(t)]−1 = ertσ−1θ∗(t).

To determine θ∗(t), we note that:

κ(t) = e−rt exp[ηWt − η2t/2

].

Then, settingδ = 1/ (γ − 1) ,

consider the exponential martingale:

Lt = exp[ηδWt − η2δ2t/2

].

Setζ = r(1 + δ)− 1

2η2δ(1 + δ).

Then,Rtc

∗(t) = v0bRtκ(t)δ = v0bLte−ζt.


Thus, the martingale Mt can be expressed by using L :

Mt = v0b

∫ t

0

Lse−ζsds + v0b

∫ T

t

EQ

[Ls |Ft

]e−ζsds.

Since L is a Ft-martingale w.r.t. Q, we have:

Mt = v0b

[∫ t

0

Lse−ζsds +

e−ζt − e−ζT

ζLt

].

Using dLt = ηbLtdWt, we deduce:

dMt =v0bηδ

ζ

[Lt

(e−ζt − e−ζT

)]dWt.

Finally,

w∗(t) =v0b(µ− r)σ2(γ − 1)ζ

[(e−ζ(T−t) − 1

)]κ(t)1/(γ−1),

and

H(y) =(y

γ

)γ/(γ−1) 1− e−ζT

ζ,

G(v) = (v)γ(

1− e−ζT

ζ

)(1−γ)

.

Note that if U(x) = U(x) = xγ

γ , we have:

c∗t =v0

b

(κ(t)γ

)1/(γ−1)

and V ∗T =

v0

b

(κ(T )γ

)1/(γ−1)

,

with

b = 1/E

[∫ T

0

κ(t)γ/(γ−1)dt + κ(T )γ/(γ−1)

]γ1/(γ−1).

REMARK 6.9 The previous method relies heavily on martingale meth-ods. In Chapter 7, this approach is used to determine optimal portfolio design,in particular by using the results of Cox and Huang [132].


6.2.4 Recursive utility in continuous-time

Epstein and Zin [212] introduce the notion of recursive preferences whichgeneralizes the standard time-separable power utility. This new preferencemodelling allows for the separation of the relative risk aversion from the elas-ticity of intertemporal substitution of consumption. Duffie and Epstein [173](see also Svensson [483]) introduce recursive utility in continuous-time.

Consider the following parametrization of recursive utility:

Ure,t = E

[∫ ∞

t

f (Cs, Ure,s) ds |Ft

], (6.85)

where f (., .) is a function which allows for the aggregation of the currentconsumption Cs and the continuation utility Is.

Duffie and Epstein [173] assume that this function is given by:

f (C, J) =β

1−(

1ψ

) (1− γ)Ure

[C

((1− γ)Ure)1

1−γ

]1− 1ψ

− 1

, (6.86)

where β > 0 denotes the rate of time preference, γ > 0 denotes the coefficientof relative rsik aversion, and ψ > 0 denotes the elasticity of intertemporalsubstitution.

Note that for ψ = 1γ , we recover the power utility. Duffie and Epstein [173]

prove that:

• First, the Bellman principle can still be applied.

• Second, it is sufficient to substitute the term f (C,Ure) for the instan-taneous utility function U (C)in Equation (6.43). Then, we have to usea new functional Φr defined by:

Φr(C,w, V, S, t) = f (C,Ure) + L(J ). (6.87)

Example 6.6In the recursive utily framework, Campbell and Viceira [102] propose to ex-amine the impact of a mean-reverting interest-rate. They assume that theinvestor can only choose between cash and a long-term real bond. They alsoassume that the instantaneous riskless interest rate rt follows an Ornstein-Uhlenbeck process given by:

drt = ar(br − rt)dt− σrdWr, (6.88)

where ar, br, and σr are positive constants, and Wr is a standard Brownianmotion. The market price of interest rate risk is assumed to be constant (seeVasicek [498]).


(1) Cash dynamics:dCt

Ct= rtdt. (6.89)

(2) The zero-coupon bond B with maturity T is solution of the followingSDE:

dB(rt, T − t)B(rt, T − t)

= [rt + θ(T − t)] dt + σ(T − t)dWr,t, (6.90)

where σT−t denotes the volatility at time t of the zero-coupon bond:

σT−t =σr(1− e−ar(T−t))

ar, (6.91)

which is decreasing with time.

Denote by λr the risk premium on the bond. Set

θ(T − t) = λrσT−t.

Using the Bellman equation associated to the functional Φr in Equation (6.87)with ψ = 1, Campbell and Viceira guess that the solution has the followingform:

Ure(Vt, rt) = I(rt)V 1−γt

1− γ. (6.92)

This leads to the following ordinary differential equation (ODE):

0 =β

1− γlog (I) +

(β log β − β +

λ2r

2γ+ r

)+

σ2r

2γ

(1I

∂I

∂r

)2

+(

ar1− γ

(br − r) − λrσr

γ

)1I

∂I

∂r+

σ2r

2(1− γ)1I

∂2I

∂r2. (6.93)

This equation has an exact solution. There exist two constants A0 and A1

such that:

I(rt) = exp [A0 + A1rt] .

The consumption-wealth ratio is constant equal to β, and the fraction ofwealth αt invested on the bond is given by:

αt =1γ

λr

σT−t+(

1− 1γ

)σr

σT−t (ar + β).


6.3 Further reading

Optimization theory and its applications based on ordinary equations arestudied in Cesari [110]. Multiperiod optimization based on discrete-time Bell-man principle is examined in Bertsekas ([63], [64]). Portfolio optimization inthe binomial model is solved in Bajeux [37].

Since this chapter provides only a brief overview on stochastic control the-ory, it refers to El Karoui [188] for a general and sound stochastic controlanalysis. Results about controlled diffusion process can be found in Krylov[339], Davis [150], and Dempster [158].

Optimization dealing with Markov models are detailed in [150]. Hamilton-Jacobi equations can be studied by using viscosity notion, as shown for ex-ample by Shreve and Soner [470], and Zariphopoulou [510].

Dynamic allocation problems are also examined in El Karoui and Karatzas[189]. Recursive portfolio analysis is provided in El Karoui, Peng and Quenez[195], using backward stochastic differential equations.

Portfolio optimization in a lognormal market is examined for a power utilityin Dexter et al. [166].

Portfolio optimization when assets are discontinuous is examined in Jean-blanc and Pontier [298], and in Shirakawa [468].

Numerical methods can be found in Kushner and Dupuis [341], Fitzpatrickand Fleming [229], and Rogers and Talay [429].

Monte Carlo methods for portfolio analysis are used in Cvitanic et al. [136],and in Detemple et al. [164].

Chapter 7

Optimal payoff profiles andlong-term management

This chapter deals with two main applications of some of the results obtainedin Chapter 6:

• First, the optimal portfolio value at maturity is assumed to be a func-tion of a given set of asset prices. Therefore, the problem consists indetermining an optimal function or “payoff profile.”

• Second, the determination of an optimal long-term portfolio, investedin cash, bonds, and stocks.

7.1 Optimal payoffs as functions of a benchmark

7.1.1 Linear versus option-based strategy

Assume that the investor maximizes his expected utility, but uses only abuy-and-hold strategy w = (wB,0, wS,0) to invest in a riskless asset B withrate r, and a risky asset S. Then, any solution of the optimization problem:

MaxwEP[U(VT )] with V0 = e−rTEQVT ],

is necessarily linear w.r.t. the risky asset:

VT = V0

(wB,0

BT

B0+ wS,0

ST

S0

).

However, he may search for nonlinear solutions, including, for example,derivatives in his portfolio.

To determine his optimal choice, V ∗T , he can search this solution assuming

that VT = h(ST ). Then, the portfolio optimization is now given by:

MaxhEP[U(h(ST ))] with V0 = e−rTEQ[h(ST )],

where P denotes the historical probability, and Q is the risk-neutral probabilitywhich is used to price options. According to Brennan and Solanki [88] (see

207


also Carr and Madan [106]), we deduce:

V ∗T = J (λg) , (7.1)

where g is the pdf of dQ/dP w.r.t. P, λ is the Lagrange parameter corre-sponding to the budget constraint, and J(x) = (U ′)−1(x).

7.1.1.1 The general model

Suppose, as in [415], that three basic financial assets are available:

• The cash associated to a discount factor N .

• The bond B.

• The stock or a financial index S. The investor is supposed to determinean optimal payoff h which is a function defined on all possible values ofthe assets (N,B, S) at maturity.

REMARK 7.1 If the market is complete, this payoff can be achievedby the investor. The market can be complete, for example, as shown in theprevious chapter if:

• The financial market evolves in continuous time and all options can bedynamically replicated by a perfect hedging strategy;

• Or, if, in one period setting, European options of all strikes are availableon the financial market. In this setting, the inability to continuouslytrade potentially induces investment in cash, asset B, asset S and allEuropean options with underlying assets B and S (if cash and bond arenon stochastic, only European options on S are required).

The market can be also incomplete. In that case, the solution given in thissection is only “theoretical,” but still interesting to know since the optimalpayoff can be approximated by investing on traded assets. (In practice, theinvestor defines an approximation method, which may take transaction costsor liquidity problems into account.)

The investor is assumed to be a pricetaker. For example, if his benchmarkS is the SP&500 then his investment is too weak to modify the index value.

Under the standard condition of no-arbitrage, the assets prices are calcu-lated under risk neutral probabilities. If markets exist for out-of-the-moneyEuropean puts and calls of all strikes, then it implies the existence of a uniquerisk-neutral probability that may be identified from option prices. Otherwise,if there is no continuous trading, generally the market is incomplete and one

Optimal payoff profiles and long-term management 209

particular risk-neutral probability Q is used to price the options. It is alsopossible that stock prices change continuously, but the market may be stilldynamically incomplete. Again, it is assumed that one risk-neutral probabil-ity is selected. Assume that prices are determined under measure Q. Denoteby dQ

dPthe Radon-Nikodym derivative of Q with respect to the historical prob-

ability P. Denote by NT the discount factor and by MT the product NTdQdP

.

Due to the no-arbitrage condition, the budget constraint corresponds to thefollowing relation:

V0 = EQ[h(NT , BT , ST )NT ] = EP[h(NT , BT , ST )MT ].

The investor has to solve the following optimization problem:

MaxhEP[U(h(NT , BT , ST )] under V0 = EP[h(NT , BT , ST )MT ]. (7.2)

To simplify the presentation of the main results, we suppose that the functionh fulfils: ∫

IR+3h2(n, b, s)P(NT ,BT ,ST )(dn, db, ds) <∞ .

This means that h ∈ L2(IR+3,PXT (dx)), where XT = (NT , BT , ST ), which isthe set of the measurable functions with squares that are integrable on IR+3

with respect to the distribution PXT (dx).

The utility function U is associated with a new functional ΦU , which isdefined on the space L2(IR+3,PXT (dx)) by:

For any Y ∈ L2(IR+3,PXT (dx)), ΦU (Y ) = EPXT[U(Y )].

ΦU is usually called the Nemitski functional associated with U (see Ekelandand Turnbull [187] for definition and basic properties).

PROPOSITION 7.1Introduce the conditional expectation of MT under the σ-algebra generated by(NT , BT , ST ). Denote its pdf by g. Assume that g is a function defined onthe set of the values of XT = (NT , BT , ST ), and g ∈ L2(IR+3,PXT ). Then,the optimization problem is reduced to:

Maxh∈L2(IR+3,PXT )

∫IR+3

[U(h(x))]PXT (dx) (7.3)

under : V0 =∫IR+3

h(x)g(x)PXT (dx).

We deduce that the optimal payoff he is given by:

he = J(λg), (7.4)


where λ is the scalar Lagrange multiplier such that

V0 =∫IR+3

J(λg(x))g(x)PXT (dx).

PROOF From the properties of the utility function U , the Nemitski func-tional ΦU is concave and differentiable (the Gateaux-derivative exists) onL2(IR+3,PXT ). Additionally, the budget constraint is a linear function of h.So there exists exactly one solution he.he is the solution of ∂L

∂h = 0 where the Lagrangian L is defined by:

L(h, λ) =∫IR+3

[U(h(x))]PXT (dx) + λ

(V0 −

∫IR+3

h(x)g(x)PXT (dx)),

where λ is the Lagrange multiplier associated to the budget constraint.So, he satisfies: U ′(he) = λg. Therefore, he = J(λg).

Suppose for example that cash and bonds are not stochastic. Then, theproperties of the optimal payoff h∗ as a function of the benchmark S can beanalyzed. Since the utility function U is concave, the marginal utility U ′ isdecreasing, then J also is decreasing, from which we deduce:

COROLLARY 7.1

h∗ is an increasing function of the benchmark ST iff the conditional expec-tation g of dQ

dPunder the σ-algebra generated by ST is a decreasing function

of ST . More precisely: assume that g is differentiable. From the optimalityconditions, the derivative of the optimal payoff is given by:

h′(s) =(− U ′(h(s))U ′′(h(s))

)×(−g(s)′

g(s)

).

Note that, in most cases, g is decreasing.

Introduce the tolerance of risk To(h(s)) equal to the inverse of the absoluterisk-aversion:

To(h(s)) = − U ′(h(s))U ′′(h(s))

.

As it can be seen, h′(s) depends on the tolerance of risk. The design of theoptimal payoff can also be specified. Denote Y (s) = − g′(s)

g(s) .Differentiating twice with respect to s, and from the previous corollary, we

deduce:


COROLLARY 7.2Assume that g is twice-differentiable.Then:

h′′(s) = [X ′(h(s)) +Y ′(s)Y (s)2

]× [X(h(s))Y 2(s)]. (7.5)

Therefore, usually, the higher the tolerance of risk, the higher h”(s).

Example 7.1In what follows, the optimal portfolio profile is examined for a special case.

• The utility function is CRRA:

U(x) =xα

α, 0 < α < 1,

J(x) = (U ′(x))−1 = x1

α−1 .

• The stock price S is a geometric Brownian motion. At each time t, itslogarithm has a Gaussian probability distribution with mean

(µ− 1

2σ2)t

and variance σ2t.

The optimal payoff profile is given by:

h∗ (s) =V0e

rT∫∞0

g(s)α

α−1 fP(s)dsg(s)

1α−1 ,

where s denotes all possible values of ST , and fP is the pdf of ST :

l(S, µ, σ)=Ix>0Sσ√

2πe− 1

2

[ln(S/S0)−(µ− 1

2σ2)

σ

]2

.

Within this framework, the density g of the Radon-Nikodym derivative dQdP

is given by:

g(s) = ψs−κ,

with θ = µ−rσ , A = − 1

2θ2T + θ

σ

(µ− 1

2σ2)T, κ = θ

σ and ψ = eA (S0)κ.

Therefore h∗ (s) can be written as a power function of the stock S:

h∗ (s) = d.sm,

with

d =V0e

rT∫∞0 g(s)

αα−1 fP(s)ds

ψ1

α−1 > 0 and m =κ

1− α> 0.

Note that h∗ (s) is an increasing function of s. Its profile only depends onthe comparison between the relative risk aversion 1−α and the ratio κ, whichlooks like a Sharpe ratio, depending only on µ, r, and σ:


• h∗ (s) is concave if κ < 1− α.

This means that, for a bearish market, the investor wants to receivehigher payoff than the stock value. However, he will get a smaller payoffif the financial market will be bullish.

• h∗ (s) is linear if κ = 1− α.

The investor buys the stock S itself.

• h∗ (s) is convex if κ > 1− α.

The investor has a higher risk exposure in order to benefit better froma bullish market. He is less protected against a bearish market.

The following figure illustrates the concavity/convexity of the optimal port-folio profiles according to the risk-tolerance:

6

-

Convex Profile

Concave Profile

Stock Value

Payoff Profile

FIGURE 7.1: Optimal portfolio profiles

For a fixed relative risk aversion α, the influence of the market parameterscan be examined: for example, the instantaneous expected return µ which is afundamental parameter when dealing with portfolio management. As shownin next figure, the higher the parameter µ, the more convex the optimalportfolio payoff. Note also that, for “extreme” values of µ smaller than theriskless rate r, the optimal profile is a decreasing function of the stock priceat maturity.


FIGURE 7.2: Optimal portfolio profiles according to stock return

REMARK 7.2 For an HARA utility function:

U(x) = a(b +

x

c

)1−c

, 0 < c < 1

J(x) = (U ′(x))−1 = c

[(cx

a(1− c)

)− 1c

− b

].

Then, the optimal portfolio profile has the following form:

V ∗T = h∗(ST ), with h∗ (s) = c

[(cs−κ

a(1− c)

)− 1c

− b

].

Consequently, the optimal portfolio profile is a linear function of a power ofthe stock value.

• This power is equal to κc . Note that c is the asymptotic relative risk

aversion (c = limx→∞−xU”(x)

′(x) ). Therefore, the discussion about concav-ity/convexity is the same as the previous one when dealing with CRRAutility functions.

• The term −(bc) corresponds to a fixed guaranteed amount if it is posi-tive, or a maximal loss if it is negative.


7.2 Application to long-term management

Consider an investor who has a specific goal such as retirement, paying forhis children’s education, etc. He is facing a variety of decisions:

Which amount of money should he invest initially? How should he investamong assets: cash, stock index, and bond funds? Should he use a markettiming or fixed strategy? For example, Dybvig [182] has shown that the costof undiversified strategies over time can be substantial.

As mentioned in Chapter 1, a well-known property of an optimal portfolio isthe mutual fund separation theorem: a rational investor divides his investmentbetween two assets, a riskless one and a risky mutual fund, the compositionof which is the same, whatever the investor’s risk aversion.

As noted by Canner, Mankiw, and Weil [104], popular investment advicedoes not conform to this property. Empirical studies show that allocationsbetween stocks, bonds and cash depend indeed on risk aversion. In particular,the ratios of bond/stock differ when considering conservative, moderate, oraggressive investors. For example, this ratio is 1.5 for a conservative investor,1.00 for a moderate one, and 0.5 for an aggressive one.

Bajeux-Besnainou, Jordan, and Portait [38] address this inconsistency issuebetween mutual fund property and popular advice. They consider that theinvestor’s horizon exceeds the maturity of the cash asset. They introducea continuous-time portfolio rebalancing. In that case, cash may be a moneymarket security with a short maturity (one to six months), and may no longerbe the common riskless asset in the standard theory. In particular, this istrue when dealing with long-term investments. Nevertheless, the investor cansynthesize a riskless asset (for example a zero-coupon bond maturing at thehorizon) by using a bond fund and cash. Consequently, bonds appear inboth synthetic riskless asset and in the risky mutual fund, which can justifya bond/stock ratio varying with risk aversion for any HARA investor.

7.2.1 Assets dynamics and optimal portfolios

We adopt the same framework. This is a generalization of the Black andScholes model and a variant of the Merton (1971) one state variable model.The model assumes normality of log returns, which is not a restrictive as-sumption when dealing with long term investment. In particular, this allowsus to provide explicit formulas, which greatly simplifies the computation ofutility and monetary losses. Obviously, other financial market models can beintroduced and examined, as seen in Chapter 8.


7.2.1.1 The financial market

The market is assumed to be arbitrage-free and without friction. Financialtransactions occur in continuous-time, along a time period [0, T ].

Three basic assets are available at any time on the market. (1) An instanta-neously riskless money market fund, the Cash, with a price denoted by C. (2)A Stock index fund with a price S. (3) A Bond fund with constant durationD, obtained by continuously rolling bonds throughout the investment period[0, T ]. It is denoted by BD, which is a zero-coupon bond with maturity (t+D)at time t.

As mentioned in [38], if inflation uncertainty is ignored, the interest raterisk means real estate rate risk. The omission of inflation uncertainty mayinduce some problems, when considering long horizons. Nevertheless, firstempirical evidence indicates, for example, that the US inflation volatility issmaller than real interest rate volatility. Second, special long term bonds in-dexed on inflation are now available on some financial markets (for example,in the US or UK). Finally, it is possible to diversify by investing in the housingmarket.

Thus, since D > d, the riskless asset is a zero-coupon nominal bond BT

that matures at the investor’s horizon, and which is replicated by a dynamiccombination of C, BD and S. Therefore, there exists a two-fund Cass-Stiglitzseparation with a synthetic riskless fund BT replicated by C, and BD and arisky fund replicated by C,BD, and S.

Since continuous-time rebalancing is allowed, financial markets can be as-sumed to be complete by introducing two sources of risk. In fact, as shownin Duffie and Huang [175], such assumption of dynamic market completenessis allowed when contingent claims can be synthesized by continuous-time re-balancing.

To illustrate the results, we assume that the instantaneous riskless interestrate rt follows an Ornstein-Uhlenbeck process given by:

drt = ar(br − rt)dt− σrdWr, (7.6)

where ar, br, and σr are positive constants, and Wr is a standard Brownianmotion. The market price of interest rate risk is assumed to be constant (seeVasicek [498]).

The asset price dynamics are given by:

(1) Cash:dCt

Ct= rtdt. (7.7)


(2) Stock index:

dSt

St= (rt + θS)dt + σ1dW + σ2dWr. (7.8)

(3) Bond fund:dBDt

BDt= (rt + θB)dt + σBdWr , (7.9)

where W is another standard Brownian motion, independent of Wr, and whereσ1, σ2, and σB are positive constants. The parameter θS is the constant riskpremium of the stock, and θB is the risk premium of the bond fund.

Since B is a zero-coupon bond indexed on the interest rate rt, the relationbetween their volatilities is given by:

σB =σr(1 − e−arD)

ar.

Denote also by σT−t the volatility at time t of the zero-coupon bond ma-turing at time T :

σT−t =σr(1− e−ar(T−t))

ar, (7.10)

which is decreasing with time (clearly σB coincides with σT−t when T−t = D).

Note that in this model, interest rates and stock index prices are negativelycorrelated1. Furthermore, the market is complete. Therefore, there exists aunique risk-neutral probability Q associated to two market risk premia, λ andλr, for which density η with respect to the initial probability P is given by:

ηt = exp[−(λWt + λrWr,t)− 12

(λ2 + λ2r)t].

The premia λ and λr are determined from the relation:

θS = σ1λ + σ2λr,

θB = σBλr.

In this setting, BT can be replicated using only the two assets C and BD.These two assets span the bond market. As noted in [38], synthesizing BT

requires a positive weight on BD and a weight on C, which is negative if D < Tand positive if D > T. This dynamic combination of fixed-income securities ofdifferent durations is referred to as the passive immunization (see Fong [239]or Fabozzi [215]).

1The instantaneous correlation between the interest rate and the stock index is equal to-σrσ2. This assumption is not necessary to solve the optimization problems.


7.2.1.2 Optimal portfolios

We recall the standard results about optimal portfolio computation. Port-folio weights are denoted by xC , xS , and xB . The portfolio value at time t isdenoted by Vt. Therefore, the portfolio value V follows the dynamics:

dVt

Vt= [rt + xS(t)θS + xB(t)θB ]dt + xS(t)σ1dW + [xS(t)σ2 + xB(t)σB ]dWr .

The investor’s preferences are described by utility function U , which embedshis risk aversion. He has an initial capital denoted by V0. He is assumed tomaximize expected utility over the time horizon T . Therefore, his optimalportfolio weights are the solutions of the following problem:

MaxxC ,xS,xB

E [U (VT )] .

For different utility functions, we show below how the optimal dynamicfactors depend on the investor risk aversion. In what follows, we present thesolutions for four standard utility functions:

7.2.1.2.1 Logarithmic Utility

U(x) = Log(x), x > 0, (7.11)

so that the absolute risk aversion is −U ′′ (x) /U ′ (x) = 1 /x , and the relativerisk aversion is constant and equal to −xU ′′ (x) /U ′ (x) = 1. In that case,the optimal portfolio is called the numeraire portfolio (see Long [361]), or thegrowth-optimal portfolio (see Merton [389]). Its value at maturity T, denotedby V log

T , is given by:

V logT = V0HT ,

where the numeraire portfolio HT is

HT =

ηT

exp(∫ T

0rsds

)−1

. (7.12)

The optimal weights at any time t are displayed in the next table.In the growth-optimal portfolio, hs and hB represent the weights of the

stock index and the constant maturation bond, respectively. In the case con-sidered by [38], “this optimal portfolio is highly aggressive and thus levered(negative weight in cash). Increasing risk aversion implies decreasing weightin the growth optimal portfolio and consequently increasing weight in cash.”Note that here the ratio xB /xS is constant since within the risky fund, theweights are the same regardless the level of risk aversion. Therefore, there


TABLE 7.1: Optimal weights for logarithmicutility functionxC = 1− (xS + xB)

xS = hS , with hS = λσ1

xB = hB, with hB =−λσ2+λrσ1σ1σB

is no inconsistency with the mutual fund separation theorem. However, theratio of all bonds BT to stock increases with risk aversion.

Some properties of the numeraire portfolio:Recall that the numeraire portfolio HT is equal to exp

(∫ T

0rs ds

)/ηT .

Thus, since in particular(∫ T

t rsds)

has a Gaussian distribution, the ra-tio Hz

T /Hzt is Lognormally distributed: it is equal to exp[Nz(t, T )], where

Nz(t, T ) has a Gaussian distribution.

Therefore, the expectation E [HzT /Hz

t ] is defined by:

Et[Hz

T

Hzt

] = exp[E(Nz)(t, T ) +12V ar(Nz)(t, T )], (7.13)

whereE(Nz)(t, T ) = z Φ(t,T ) and V ar(Nz)(t, T ) = z2Ψ(t,T ), (7.14)

with

Φ(t,T ) = (T − t) [br + (rt − br)σT−t

(T − t)σr+

λ2 + λ2r

2], (7.15)

Ψ(t,T ) = (T − t)×[σ2r

a2r(1− 2

σT−t

(T − t)σr+

σ2(T−t)

2(T − t)σr) + (λ2 + λ2

r)− 2λrσr

ar(1 − σT−t

(T − t)σr)].

(7.16)Denote respectively by ϕ(t,T ) and ψ(t,T ) the expectations E[Ht/HT ] and

E[Ht log(HT /Ht)/HT ]. We have:

ϕ(t,T ) = exp[−Φ(t,T ) +12

Ψ(t,T )] and ψ(t,T ) = φ(t,T ) ×(Φ(t,T ) −Ψ(t,T )

).

(7.17)To simplify the notations, we denote respectively E(Nz)(T ), V ar(Nz)(T ),ΦT ,

ΨT , ϕT , and ψT the values of E(Nz)(0, T ), V ar(Nz)(0, T ),Φ(0,T ), Ψ(0,T ),ϕ(0,T ), and ψ(0,T ).


7.2.1.2.2 CRRA utility We now consider the utility function with aconstant relative risk aversion, which generalizes the previous case.

U(x) =x1−γ

1− γ, γ > 0, (7.18)

so that −xU ′′ (x) /U ′ (x) = γ. The value at maturity T, denoted by V CRRAT

is

V CRRAT = (λ)−

1γ H

( 1γ )

T , (7.19)

where λ is a Lagrange multiplier determined by the initial investment V0. Inthis case, the optimal weights at any time t are deterministic functions of time(independent of the level of interest rate r). They are given in the next table.

TABLE 7.2: Optimal weights forCRRA utility function

xC = 1− 1γ (hS + hB)− (1− 1

γ )(σT−tσB )

xS = 1γ (hS)

xB = (1− 1γ )(σT−t

σB ) + 1γ (hB)

Furthermore, the optimal bond over stock ratio is also time-dependent:

xB(t)xS(t)

=hB

hS+ (γ − 1)

1hS

σT−t

σB.

Thus, whatever the time t and for any value of the parameters, this ratio isincreasing in investor risk aversion. This property is consistent with currentpractice. Moreover, if γ is larger than 1, the weight in cash, xC , is increasingin t, since it is a decreasing function of σT−t (see Equation (7.10)).

7.2.1.2.3 HARA utility The HARA utility function has a hyperbolic ab-solute risk aversion. It includes the CRRA utility function (described above),and the quadratic utility function as special cases. It is given by:

U(x) =(

γ

1− γ

)(x− x∗

γ

)1−γ

, (7.20)

where γ and x∗ are two parameters that cannot both be negative. We alsorequire that x∗BT (0) − V0 < 0. We have −U ′′ (x) /U ′ (x) = γ /(x− x∗) and−xU ′′ (x) /U ′ (x) = γx /(x− x∗) . Two cases arise according to the sign ofγ. If γ > 0, the absolute risk aversion is decreasing in x. In this case, the


amount x∗ represents a required lower bound for the terminal portfolio value(x > x∗). The initial investment V0 is at least equal to the discounted valueof this lower bound (x∗BT (0) < V0). If γ < 0, the absolute risk aversion isdecreasing in x. The amount x∗ is a required upper bound for the terminalportfolio value (x < x∗). The initial investment V0 is at most equal to thediscounted value of this lower bound (x∗BT (0) > V0).

In both cases, the risk aversion is an increasing function of the absolutevalue of γ. Note that the quadratic case is obtained with γ = −1 and x∗ > 0.The optimal portfolio at maturity T , V HARA

T , is given by:

V HARAT = (µ)−

1γ H

( 1γ )

T + x∗.

This expression can be interpreted as follows: the optimal portfolio is a com-bination of a CRRA fund with γ parameter and a zero coupon bond yieldingx∗ at time T. The optimal portfolio weights for the HARA utility functionare functions of Mt = 1−BT−t(t)x∗ /V HARA

t .

TABLE 7.3: Optimal weights forHARA utility function

xC = 1− (xS + xB)

xS = Mt × 1γ (hS)

xB = [Mt × ( 1γ )(hB − σT−t

σB)] + σT−t

σB

In these expressions, BT−t(t) represents the value at time t of a zero-couponbond maturing at time T . The factor Mt represents the proportion at time tof the risky fund in the total portfolio value. Therefore, the ratio Bond overStock is no longer deterministic. In this case, the optimal weights depend onmarket conditions (median, bullish, or bearish markets).

7.2.2 Exponential utility

Finally, we consider the exponential utility function:

U(x) = −e−ax

a, a > 0,

so that the absolute risk aversion is constant, −U ′′ (x) /U ′ (x) = a. Theoptimal portfolio value for the exponential utility function, V exp

T , is a functionof the numeraire portfolio (see Equation (7.12)):

V expT = A(V0) +

1a

log(HT ),


where

A(V0) =V0 − 1

aE[log(HT )/HT ]E[1/HT ]

.

The computations of the expressions for the optimal weights are more in-volved:

• To compute the replicating strategies for a given optimal portfolio, firstnote that Vt/Ht is a martingale. Thus, we obtain the martingalityrelation

Vt/Ht = EP,t [VT /HT ] . (7.21)

• Therefore, it is necessary to compute conditional expectations of quan-tities which are functions of VT /HT . For this purpose, the conditionalexpectations of the numeraire portfolio have to be used.

Using the martingality relation, the value V expt is given by:

V expt = HtEt

[A(V0) + 1

a log(HT )HT

].

Then, we obtain the expression of the optimal portfolio value at anytime t of the investment period:

V expt =

[A(V0)ϕ(t, rt) +

1aψ(t, rt)

]+

1a

ln(Ht)ϕ(t, rt). (7.22)

• To compute the optimal weights, we have to determine dV expt

V expt

, and tosearch xS , xB , and x

Csuch that

dV expt

V expt

= xSdSt

St+ xB

dBD,t

BD,t+ x

C

dCt

Ct.

• For this purpose, applying Ito’s formula to Equation (7.22), we obtain:

dV expt =

[A(V0)

∂ϕ

∂t(t, rt) +

1a

∂ψ

∂t(t, rt) +

1a

ln(Ht)∂ϕ

∂t(t, rt)

]dt

+12σ2r [A(V0)

∂2ϕ

∂r2(t, rt) +

1a

∂2ϕ

∂r2+

1a

ln(Ht)∂2ϕ

∂r2]dt

+[A(V0)

∂ϕ

∂r(t, rt) +

1a

∂ψ

∂r(t, rt) +

1a

ln(Ht)∂ϕ

∂r(t, rt)

]drt

+1aϕ(t, rt)

[dHt

Ht− 1

2H2t

d < Ht, Ht >

],


where < Ht, Ht > is the “instantaneous variance” of Ht. Define χ(t, rt)by:

χ(t, rt) = A(V0)∂ϕ

∂r(t, rt) +

1a

∂ψ

∂r(t, rt) +

1a

ln(Ht)∂ϕ

∂r(t, rt).

Then, dV expt has the form:

dV expt = V exp

t

[1a

ϕ(t, rt)V expt

(hSdSt

St+ hB

dBD,t

BD,t) +

χ(t, rt)V expt

drt + (...)dt].

• Using this property, it is possible to determine the optimal weights byidentifying the martingale components. This yields to the followingrelations:

xSσ1 =1a

(ϕ(t, rt)V expt

)hSσ1,

xSσ2 + xBσB =

1a

(ϕ(t, rt)V expt

)hSσ2 +

1a

(ϕ(t, rt)V expt

)hBσB +

(χ(t, rt)V expt

)(−σr),

from which we can compute the optimal weights for the exponentialutility function.

• Finally, we deduce the optimal portfolio weights for the CARA utilityfunction,

TABLE 7.4: Optimal weights for CARA utilityfunctionxC = 1− (xS + xB)

xS = 1a

(ϕ(t,T )

V expa,t

)hS

xB = 1a

(ϕ(t,T )

V expa,t

)(σ2−σ1

σBhS + hB)−

(χ(t,T )

V expa,t

σrσB

)

where

χ(t,T ) =(

1a

)[ϕ(t,T )

(σT−t

σr

)](1 + Ψ(t,T ) − Φ(t,T ) − aV0 − ψT

ϕT− ln(Ht)

).

Note that these optimal weights are deterministic functions of the instanta-neous interest rate rt, and the portfolio value V exp

a,t . These functions dependonly on market parameters. The ratio Bond over Stock is stochastic.


7.2.3 Sensitivity analysis

We consider a numerical base case. For simplicity, from now on we focus ourattention on the CRRA utility function (the other cases can be treated in asimilar way). For the base case, for the values of the short rate parameters, weuse the estimates of Chan [114]: speed of convergence, ar = 12%, asymptoticshort rate value, br = 4%, and volatility, σr = 4% (see Equation (7.6)). Thecurrent instantaneous interest rate is r0 = 4%. The risk premia are θB = 1.5%,and θS = 6% ; the index stock volatilities are σ1 = 19%, and σ2 = 6%(see Equations (7.8) and (7.9)). Finally, the Bond fund constant duration isD = 10 years.

TABLE 7.5: Asset allocation sensitivities for CRRA utilityMarket parametersand investor’s typeγ CASH BONDS STOCKS Bonds

Stocks

Decrease of the volatilityof Interest rate

from 4 to 2% % %

3.52 -13 70 43 1.635.28 -8 80 28 2.86

10.57 -2 88 14 6.28

Decrease of the speedof convergence ar

from 12 to 103.52 -10 65 45 1.445.28 -5 75 30 2.50

10.57 0 85 15 5.66

Change in the valueof brfrom 4 to 2

3.52 -10.5 65.5 45 1.455.28 -6 76 30 2.53

10.57 -2 87 15 5.80

A decreasing of the volatility of the interest rate shifts money from stock tocash and bonds (the instantaneous bond return is less risky). Decreasing thespeed of convergence ar is similar to increasing interest rate volatility, sinceinterest rates converge slowly to their long run value br. Decreasing the valueof br has almost no impact for the CRRA case, since portfolio weights areindependent of the interest rate level.


7.2.3.1 Utility of the optimum portfolio

Consider an investor with a HARA utility function, with relative risk aver-sion γ. His time horizon is T . In this case, the discounted expected utility(computed at time t = 0) is given by:

E[Uγ(V ∗(γ,T )

;V0)] =(

γγ

1− γ

)[(V0 − x∗αT )](1−γ) (7.23)

exp[γ

(E(Nz)(T ) +

12V ar(Nz)(T )

)], (7.24)

and, for the special case CRRA, we obtain:

E[Uγ(V ∗(γ,T )

;V0)] =(

γγ

1− γ

)V 1−γ0 exp

[γ

(E(Nz)(T ) +

12V ar(Nz)(T )

)],

(7.25)with z = (1− γ) /γ, where V0 is the initial investment, and where E(Nz)(T )and V ar(Nz)(T ) are defined by relation (7.14).

Consider now an investor with a CARA utility function, with absolute riskaversion a. We have:

E[Ua(V ∗(a,T )

;V0)] = −1a

exp[−aV0 + βT

αT

]× αT . (7.26)

Under the same numerical assumptions of the previous numerical base case,consider a financial institution that offers to its clients three standardizedportfolios: the first one is aggressive (45% Stock), the second one is a moderateportfolio (30% Stock), and the third is a conservative portfolio (15% Stock).Assuming CRRA utility functions, it is possible to recover two unknowns:the values of the risk aversion and of the investment period corresponding tothis portfolio. For this example, we find T = 25 years, and we provide belowvalues of the risk aversion which best fit these three portfolios.

TABLE 7.6: Asset allocations for agressive, moderate andconservative investors (CRRA utility function)

CASH % BONDS % STOCKS % Ratio ofBonds to Stocks

Investortype γ

-10 65 45 1.44 Aggressive 3.52-6 76 30 2.53 Moderate 5.28-2 87 15 5.8 Conservative 10.57


7.2.4 Distribution of the optimal portfolio return

The performance of an optimal strategy can be illustrated by the distribu-tion of return at maturity. Below, this distribution is computed for the CRRAutility function (the other cases can be derived in a similar way).

From Equation (7.19), we deduce the optimal portfolio return V CRRAT /V0:

R(γ) = H(1/γ)T /E[H(1/γ−1)

T ].

Since H(1/γ)T has a lognormal distribution, the cumulative distribution func-

tion FR(γ) (x) of the return R(γ) can be computed explicitly. Let YT denotethe random variable such that HT = exp[YT ]. The distribution of YT is Gaus-sian. Denote respectively by mT and sT its expectation and standard devia-tion.

The expectation E[H

(1/γ−1)T

]is defined by:

E[H

(1/γ−1)T

]= exp

[E(N(1−γ)/γ) +

12V ar(N(1−γ)/γ )

],

with

E(Nz) = z ΦT and V ar(Nz) = z2ΨT . (7.27)

Then, we have:

FR(γ) (x) = P

H(1/γ)T

E[H

(1/γ−1)T

] ≤ x

= P

X ≤γLog

[x E

[H

(1/γ−1)T

]]−mT

sT

,

where X is a random variable with a standard Gaussian distribution. Notethat mT and sT are constant, which only depend on market parameters (butnot on risk aversion γ).

The inverse cumulative distribution functions are displayed in the next fig-ure in the base case for three values of γ (γ = 3.52, γ = 5.28, and γ = 10.57)and for a time horizon T = 20 years.

At a given probability level 0.98, an investor with a aggressive portfoliohas a guarantee to recover only 70% of his initial investment. However, thisinvestor has a 20% chance to multiply his initial investment by a factor 5. Ifthis investor selects a conservative portfolio, he has a guarantee (up to 98%)to increase by half the value of his initial portfolio value. However, in thiscase, the probability to multiply his initial investment by 5 is almost zero(the probability to multiply the initial investment by 3 is only about 10%).Finally, the investor selecting the moderate portfolio (intermediary curve) hasa 98% probability to recover his initial investment after 20 years.


FIGURE 7.3: Inverse cumulative distribution of the return at maturity

7.3 Further reading

Long-term management depends on a large variety of factors. These in-clude: income, saving capacity, age, gender, know-how, experience, liquidity,real estate, investment horizon, attitudinal and personality factors, investmentobjectives (e.g. planned projects or retirements), initial investable assets andadditional funds. Campbell and Viceira [103] discuss to what extent life-cycleportfolio choice and saving are affected by the variables mentioned above.

The influence of risk aversion has been examined by Kallberg and Ziemba[315]; the optimal portfolio is more sensitive to the value of risk aversion thanto the functional form of utility function.

Similarly, Brennan and Xia [89] highlight the importance of considering in-vestors’ time horizons in the analysis of optimal portfolio policies.

Long-term portfolio optimization with fixed incomes is studied in Brennanand Xia [89], who examine the Bond-Stock Mix when stochastic interest ratesare involved. Fabozzi [215] provides an overview on bond markets analysis.Sorensen [473] and Lioui and Poncet [357] analyze optimal portfolio choiceand fixed income management under stochastic interest rates.

Battocchio et al. [50] examine in particular the role of the decumulation


phase in the determination of the optimality of asset allocation under mortal-ity risk.

Note also that financial institutions typically offer a limited number of stan-dardized portfolios which imperfectly match investor preferences. Jensen andSorensen [302] and De Palma and Prigent [161] quantify the efficiency losses ofan investor acquiring a standardized (e.g., conservative, balanced, and aggres-sive) versus a customized portfolio. A standardized portfolio may differ fromthe optimal one by its risk exposure or its time horizon. Numerical resultsshow that the monetary losses from not having access to a customized port-folio can be substantial, in particular when the time horizon of the investordiffers from the one selected by the financial institution.

Chapter 8

Optimization within specific markets

As seen in Chapter 6, the investor’s utility maximization of his terminal wealthcan be solved in the continuous-time setting. Using methods of stochasticoptimal control, Merton ([385], [386]) proves that the value function is thesolution of a non-linear partial differential equation: the Bellman equation.Then, closed-form solutions are available for the HARA utility. However, thedynamic programming method is based on Markovian assumptions. To avoidthis hypothesis, another approach has been introduced: the duality portfoliocharacterization by using the martingale measures, the so-called risk-neutralmeasures. For complete markets, this set is reduced to one point and theoptimal solution is determined from the fundamental result: the terminalwealth of the optimal portfolio is equal (up to a multiplicative constant) tothe marginal utility inverse of the density of the martingale measure. Thismethod, illustrated in Chapter 6, has been introduced by Pliska [410], Coxand Huang ([132],[133]), and Karatzas et al. [319]. This is in line with theoptimal investment problem for a one-period model with a finite set of randomevents solved by introducing the Arrow-Debreu state prices.

However:

• The assumption of financial market completeness is very strong. Rough-ly speaking, it is supposed that there are as many assets as randomsources.

• In addition, several market “frictions” must also be taken into account:

- Constraints, such as no-short selling;

- Transaction costs;

- Labor income stream;

- Partial information about security prices.

Thus, specific methods must be introduced to examine such optimizationproblems.

229


8.1 Optimization in incomplete markets

He and Pearson ([288], [289]) have studied this problem both in discrete-time and continuous-time frameworks. Karatzas et al. [319] have shownhow expected utility maximization can be solved by martingale methods andconvex duality. In what follows, first a general result, due to Kramkov andSchachermayer [335], is presented. Then, some standard financial models areconsidered and explicit optimal solutions are detailed.

8.1.1 General result based on martingale method

- Price process: Consider a financial market with (d + 1) securities, oneriskless bond with constant rate r, and d stocks with price process (Si,t)i,t,t ∈ [0, T ], and 1 ≤ i ≤ d. Note that the time horizon can be also infinite. Theprocess S is assumed to be a semimartingale on a filtered probability space(Ω,F , (Ft)t,P).

- Portfolio strategy and value: Recall that a self-financing portfolio strategy(θi,t)i,t, where θi,t denotes the amount invested on asset i at time t, is a pre-dictable process, which is integrable w.r.t. the price process S. The portfoliovalue process (Vt)t, associated to strategy (θi,t)i,t, is given by: for t ∈ [0, T ],

Vt = V0 +d∑

i=1

∫ t

0

θi,sdSi,s. (8.1)

Denote by V(V0) the family of wealth processes (Vt)t such that for t ∈ [0, T ],Vt ≥ 0 and with initial value V0.

- Equivalent local martingale: A probability measure Q, equivalent to P, iscalled an equivalent local martingale measure if any process V in V(1) is alocal martingale w.r.t. Q. Note that when the process S is bounded (resp.locally bounded), then under an equivalent martingale measure, the processS is a martingale (resp. locally martingale) (see Delbaen and Schachermayer[155] for more details about such property). Denote byM =Me(S) the set ofequivalent local martingale measures which is assumed to be non-empty dueto the absence of arbitrage opportunities, as shown by Harrison et al. ([284],[285]).

- Expected utility maximization: The investor has a utility function onwealth U : (0,+∞) → R. The investor searchs for the value function as-sociated to the primal problem:

J (V0) = supVT∈V(V0)

E [U(VT )] . (8.2)

Optimization within specific markets 231

Assumptions (A) on utility function: The function U is defined on R+,strictly increasing, strictly concave, continuously differentiable, and such that:

U ′(0 = limx→0+

U ′(x) = +∞,

U ′(∞) = limx→∞U ′(x) = 0.

The value function J is supposed to satisfy for some x > 0, J (x) <∞.In order to solve the optimization problem in this general framework, Kramkovand Schachermayer [335] introduce the following key assumption:- Asymptotic elasticity of the utility function: The utility function U has

asymptotic elasticity AE(U) strictly smaller than 1 if:

AE(U) = lim supx→∞

xU ′(x)U(x)

< 1. (8.3)

Examples of this utility function are the logarithm and power utilities: forU(x) = lnx, AE(U) = 0, and for U(x) = xα

α with 0 < α < 1, AE(U) = α.However, for instance, the utility function U(x) = x

ln x has AE(U) equal to 1.Kramkov and Schachermayer [335] show that the condition AE(U) < 1

is necessary and sufficient to get the following properties under the previousassumption on price process S:

• 1) The value function J is a utility function which is increasing, strictlyconcave, continuously differentiable, and such that:

limx→0+

J ′(x) = +∞ and limx→∞J

′(x) = 0.

• 2) The optimization problem has a solution.

Note also that, either the utility function U(.) satisfies AE(U) < 1, whichimplies that AE(J ) < 1, or AE(U) = 1 in which case there exists an R-valuedprice process S which is continuous, induces a complete market, and is suchthat J is not strictly concave (more precisely, there exists x0 such that J (x)is a straightline with slope one for x > x0).

- Legendre-transform and conjugate utility function: As seen in Chapter 6,it is useful to introduce the conjugate function of the utility function U :

U(y) = maxx>0

[U(x)− xy] , y > 0. (8.4)

The function U is the Legendre-transform of the function −U(−x) (seeRockafellar [426]). If the utility function U satisfies assumption (A), thefunction U is continuously differentiable, decreasing, strictly convex and suchthat:

limx→0+

U(x) = limx→∞U(x) and lim

x→∞ U(x) = limx→0+

U(x), (8.5)

limx→0+

U ′(x) = −∞ and limx→∞ U ′(x) = 0. (8.6)


Furthermore, we have the following bidual property:

U(x) = maxy>0

[U(y) + xy

], x > 0. (8.7)

The derivative of U(.) is the inverse function of the negative of the derivativeof U(.). Denote it by J . We then have: J = −U ′ = (U ′)−1.

Example 8.1Consider the power utility U(x) = xα

α with 0 < α < 1. Then,

U(y) =1− α

αy

α1−α .

Consider the exponential utility U(x) = − e−axa with 0 < a. Then,

U(y) =y

a(ln y − 1) .

- Optimal solutions: We refer to [335] for the three following theorems andtheir detailed proofs, corresponding first to the complete case, and second tothe incomplete case with AE(U) < 1 or AE(U) = 1.

THEOREM 8.1 Complete case

Suppose that previous assumptions are satisfied, and also that M = Q .Denote:

J (y) = E

(U

[ydQ

dP

]). (8.8)

We have:(i) The functions J and J are finite:

J (x) <∞, ∀x > 0 and J (y) <∞, ∀y > 0 sufficiently large.

Denote y0 = infy : J (y) <∞

. The function J (y) is continuously differ-

entiable and strictly convex on ]y0,∞[. Denote x0 = limy→y0

(−J ′(y)

). The

function J is continuously differentiable on ]0,∞[ and strictly concave on]0, x0[. The value functions J and J are conjugate:

J (y) = maxx>0

[J (x)− xy] , y > 0, (8.9)

J (x) = maxy>0

[J (y) + xy

], x > 0, (8.10)


and their derivatives satisfy:

limx→0+

J ′(x) =∞ and limy→∞ J

′(y) = 0.

(ii) If x < x0, the optimal solution V ∗T (x) is given by:

V ∗T (x) = I

(ydQ

dP

), for y < y0,

where x and y satisfy y = J ′(x), or equivalently x = − J ′(y). Note that, inthat case, the optimal solution process V ∗(x) is a uniformly integrable mar-tingale under the risk-neutral probability Q.(iii) For 0 < x < x0 and for y > y0, we have:

J ′(x) = E

[V ∗T (x)U ′ (V ∗

T (x))x

], J ′(y) = E

[dQ

dPU ′

(ydQ

dP

)].

In order to study the incomplete case, we introduce:

• The family Y(y) of non-negative semimartingales with Y0 = y and suchthat, for any X ∈ V(1), the product XY is a supermartingale. Note thatthis set contains the density processes of all equivalent local martingalesQ.

• The value function of the dual optimization problem is denoted anddefined by:

J (y) = infY ∈Y(y)

E(U [YT ]

). (8.11)

For the complete case, we have also J (y) = E(U[y dQ

dP

]). For the in-

complete case, they may differ.

Since the functions J and −J are concave, the right-continuous versionsof their derivatives J ′and −J ′ exist. Recall that the asymptotic elasticityAE(J ) is defined by:

AE(J ) = lim supx→∞

xJ (x)J (x)

.


The following two theorems (see [335]) provide general results for the in-complete case.

THEOREM 8.2 Incomplete case with general utility function U

Under previous assumptions, we have:(i) The function J has finite values, J (x) < ∞, ∀x > 0, and the function

J is finite ∀y > y0, y0 sufficiently large. Denote y0 = infy : J (y) <∞

.

The function J is continuously differentiable on ]0,∞[, and the function J (y)is strictly convex on ]0,∞[. The value functions J and J are conjugate:

J (y) = maxx>0

[J (x)− xy] , y > 0, (8.12)

J (x) = maxy>0

[J (x) + xy

], x > 0. (8.13)

and their derivatives satisfy:

limx→0+

J ′(x) =∞ and limy→∞ J

′(y) = 0, (8.14)

(ii) If J (y) <∞, then the optimal solution V ∗T (x) exists and is unique.

THEOREM 8.3 Incomplete case with AE(U) < 1

Under previous assumptions and the additional hypothesis AE(U) < 1, wehave:i) The function J is finite: J (y) < ∞, ∀y > 0 sufficiently large. The

functions J and J are continuously differentiable on ]0,∞[. The functionsJ ′ and -J ′ are strictly decreasing and satisfy

limx→∞J

′(x) = 0 and limy→0+

−J ′(y) = +∞. (8.15)

The asymptotic elasticity AE(J ) is also smaller than 1 and: (notationx+ = max(x, 0))

AE(J )+ ≤ AE(U)+ < 1. (8.16)

(ii) The optimal solution V ∗T exists and is unique. If Y ∗(y) ∈ Y(y) is

the optimal solution of problem (8.11), where y = J ′(x), we have the dualrelation:

V ∗T (x) = J (Y ∗

T (y)) . (8.17)

The process V ∗T (x)Y ∗

T (y) is a uniformly integrable martingale on [0, T ].(iii) The relations between J , J and V ∗

T , Y∗T are:

J (x) = E

[V ∗T (x)U ′ (V ∗

T (x))x

]; J ′(y) = E

[Y ∗T (y)U ′ (Y ∗

T (y))y

]. (8.18)


(iv) The value function J is given by:

J (y) = infQ∈M

E

[J(ydQ

dP

)]. (8.19)

Example 8.2 One-period modelConsider a one-period model where the investor has a logarithmic utility

function U(x) = lnx. Thus, we have:

AE(U) = lim supx→+∞

x1x

lnx= 0 < 1 and

J(y) = 1

y ,

U(y) = − ln y − 1.

Suppose that the financial market contains a riskless asset, taken as nu-meraire, and a stock S with only three return values. Set S0 = 1. The stockvalue at maturity T is given by:

ST =

u (“up”) with probability p,m (“mean”) with probability 1− p− q,

d (“down”) with probability q.

with d < 1,m < u. Assume that m = 1.Then, a supermartingale Y in Y(y) has the following form: Y0 = y and

YT =

yu (“up”) with probability p,

ym (“mean”) with probability 1− p− q,yd (“down”) with probability q,

where yd, ym, and yu are non-negative scalars.The primal optimization problem is associated to the value function J :

J (x) = supVT∈V(x)

E [lnVT ] .

The dual problem is associated to the value function J :

J (y) = infY ∈Y(y))

E [− lnVT − 1] .

Since Y and Y.S are in Y(y), J (y) can be written as:

J (y) = −1− supyd,ym,yu

E [− lnYT ] ,

= −1− supyd,ym,yu

[q ln yd + (1 − q − p) ln ym + p ln yu] ,

with E [YT ] = qyd + (1− q − p)ym + pyu ≤ y,

E [YTST ] = qdyd + (1 − q − p)ym + puyu ≤ y,yd, ym, yu ≥ 0.


Note that the optimum (y∗d, y∗m, y∗u) is necessarily such that the previous con-

straints are true equalities. Then, we deduce:

Y ∗T =

y∗u = y (p+q)(1−d)

p(u−d)

y∗m = y

y∗d = y (p+q)(u−1)q(u−d)

= y

(p+q)(1−d)

p(u−d)

1(p+q)(u−1)

q(u−d)

.

Finally, since V ∗T (x) = J (Y ∗

T (y)) where y = J ′(x) = 1x , we deduce:

V ∗T = x

p(u−d)

(p+q)(1−d)

1q(u−d)

(p+q)(u−1)

.

Another standard example is based on a d−dimensional Brownian motionwith dimension d higher than the number of available assets.

Example 8.3 Multidimensional Brownian motion

Consider now a continuous-time model where the investor still has a log-arithmic utility function U(x) = lnx. The financial market still contains ariskless asset, taken as numeraire, and one stock S. Set S0 = 1. Assume thatS is solution of the SDE:

dSt = St− . (µdt + σ1dW1,t +σ2dW2,t ) , (8.20)

where W = (W1,W2) is a 2−dimensional standard Brownian motion. Thus,the price S is equal to:

St = exp([

µ− 12(σ21 + σ2

2

)]t + σ1W1,t +σ2W2,t

), (8.21)

where µ, σ1, and σ2 are constant with σ1 > 0 and σ2 > 0.Due to the predictable representation theorem, the process Y has the fol-

lowing form:

Yt = Y0 exp(∫ t

0

[µY (s)− 1

2(σ21 + σ2

2

)]ds +

∫ t

0

σY1,sdW1,s +

∫ t

0

σY2,sdW2,s

).

(8.22)The processes Y and Y.S are in Y(y). Therefore, for any T ,

E [YT |Y0 = y ] = y exp

(∫ T

0

µY (t)dt

)≤ y,

E [YTST |Y0 = y ] = y exp

(∫ T

0

[µ + µY (t) + σ1σ

Y1,t + σ2σ

Y2,t

]dt

)≤ y.


Thus µY (.) ≡ 0. Then, we have:

E(lnVT ) = ln y − 12

∫ T

0

[(σY1,t

)2+(σY2,t

)2]dt.

Consequently,µ + σ1σ

Y1,t + σ2σ

Y2,t = 0.

Then,

E(lnVT ) = ln y − 12

∫ T

0

[µ2 + 2µσ2

(σY2,t

)+(σY2,t

)2 (σ21 + σ2

2

)]/ (σ1)

2dt.

Finally, we deduce that the optimal solution is given by:

σY1,t = − µσ2

(σ21 + σ2

2); σY

2,t = − µσ1(σ2

1 + σ22)

,

Y ∗T = y exp

[µ

(σ21 + σ2

2)

(−T

2µ− σ1W1,T −σ2W2,T

)],

J (y) = −1− E[lnY ∗T ] = −1− ln y +

µ2

(σ21 + σ2

2).

Since y = J ′(x) = 1x , we deduce

V ∗T = V0 exp

[µ

(σ21 + σ2

2)

(T

2µ + σ1W1,T σ2W2,T

)].

REMARK 8.1 The optimal portfolio value V ∗T is an increasing function

of the stock price at maturity. Indeed, using the relation:

ST exp(−[µ− 1

2(σ21 + σ2

2

)]T

)= exp (σ1W1,t +σ2W2,T ) ,

we deduce:

V ∗T = V0 × vT × S

µ

(σ21+σ2

2)T ,

where vT is a deterministic function given by:

vT = exp[

µ2T

2 (σ21 + σ2

2)

]exp

[ −µ(σ2

1 + σ22)

(µ− 1

2(σ21 + σ2

2

)T

)].

The concavity/convexity of V ∗T only depends on the comparison between µ

and σ21 + σ2

2 .


8.1.2 Dynamic programming and viscosity solutions

The standard dynamic approach can be extended to portfolio analysis with-in incomplete markets. The value function can be characterized as a viscositysolution of the associated Bellman equation. In general, the value function isnot smooth. However, as shown by Pham [408], for CRRA utility functions,the non-linearity of the Bellman equation can be reduced by using a logarith-mic transformation, as introduced by Fleming [230] in a stochastic controlsetting. This allows us to get a semilinear equation with quadratic growth onthe derivative term.

8.1.2.1 Stochastic volatility

In what follows, the results of Pham [408] are presented (see also Za-riphopoulou [510] for the univariate case).

- The financial market contains a riskless asset with rate r. The price S ofthe d risky assets is assumed to be a semimartingale on a filtered probabilityspace (Ω,F , (Ft)t,P) , and to be the solution of the SDE:

dSt = diag(St). [(µ (Ft) dt + σ1 (Ft) dW1,t +σ2 (Ft) dW2,t )] , (8.23)

where W1 is a d−dimensional standard Brownian motion, and W2 is an m−dimensional standard Brownian motion, independent of W1.

Diag(St) is the diagonal matrix d× d matrix with mi,i = Si,t.

The function σ1 (.) is Rd-valued, and σ2 (.) is a d ×m matrix. Both σ1 (.)and σ2 (.) are assumed to be continuous functions.

The process F refers to stochastic factors and is defined from:

dFt = η (Ft) dt + dW1,t ,

where the function η (.) is assumed to be Lipschitzian.

Denote µ(y) = µ(y) − rI, the excess rate of return w.r.t. the riskless rate(I is the vector of one in Rd).

Denote also Σ(y) = [σ1 (y)σ2 (y)] the matrix-valued volatility of the riskyassets. The matrix Σ is of full rank equal to d. Denote:

β(y) = infw∈Rd

||tΣ(y)w||2||w||2 .

Assume that there exists a positive constant C such that:

||µ(y)|| /√

β(y) ≤ C(1 + ||y||),||σ(y)|| /

√β(y) ≤ C(1 + ||y||).


- The investor has a weighting process (wt)t which is, as usual, assumed tobe predictable and integrable:

supt∈[0,T ]

E[exp

(a∣∣∣∣Σ(y)tw

∣∣∣∣)] <∞, for some constant a > 0.

Denote by A the set of admissible strategies. The investor has a powerutility U(x) = xα

α , with 0 < α < 1.

The value function of the investor is defined by: for any (t, x, f) ∈ [0, T ]×R+ × Rd,

J (t, x, f) = supw∈A

E [U(VT ) |Vt = x, Ft = f ] . (8.24)

- The dynamic programming equation (the Hamilton-Jacobi-Bellman equa-tion) associated to the stochastic control problem is the non linear PDE:

∂J∂t

+ rx∂J∂x

+t η(f)DfJ +12

∆fJ+ (8.25)

max[twµ(f)x

∂J∂x

+12

∣∣∣∣Σ(f)tw∣∣∣∣2 x2 ∂2J

∂t2+ twσ(f)xD2

xfx

]= 0, (8.26)

where Dfu denotes the gradient vector of u w.r.t. f and ∆fu is the Laplacianof of u w.r.t. f and D2

xf is the second derivative vector w.r.t. (x, f).

The terminal value is:J (T, x, f) =

xα

α.

Since the utility function is homogeneous, and the dynamics of the wealthprocess depends linearly on the control w, a logarithmic transformation canbe introduced and the value function can be searched from the form:

J (t, x, f) =xα

αexp [−ϕ(t, f)] .

Therefore, we have:

∂J∂t

= −∂ϕ

∂tJ ; x

∂J∂x

= αJ ; x2∂2J∂t2

= α(α − 1)J ,

DfJ = −J Dϕ; ∆fJ =[−∆ϕ + DtϕDϕ

]J ; xD2xfx = −αJDJ(8.27)

Thus, the function ϕ is the solution of the following semilinear PDE:

−∂ϕ

∂t− 1

2∆ϕ + H(f,Dϕ) = 0 with ϕ(T, f) = 0, (8.28)


and the function H is defined by:

H(f, p) =

12||f ||2− tpη(f) +αr+ max

[αtw (µ(f)− σ(f)p)x− α(1 − α)

2

∣∣∣∣Σ(f)tw∣∣∣∣2] .

PROPOSITION 8.1 Equivalence of both PDE

Assume that there exists a solution ϕ to the previous semilinear PDE (whichis supposed to be continuously differentiable w.r.t. current time t, and twice-continuously differentiable w.r.t.(x, f), with terminal value ϕ(T, f) = 0.

Then, the value function of the first problem is given by:

J (t, x, f) =xα

αexp [−ϕ(t, f)] .

In addition, an optimal portfolio is given by the Markov control

wt = w(t, Ft),

with

w(t, f) ∈ arg minw∈A

[1− α

2

(∣∣∣∣Σ(f) tw∣∣∣∣2) − tw (µ(f)− σ(y)Dϕ(t, f))

].

If there is no specific constraint on w, the optimal Markov control is givenby:

w(t, f) =1− α

2

(∣∣∣∣Σ(f)tw∣∣∣∣2)− tw (µ(f)− σ(y)Dϕ(t, f)) .

REMARK 8.2 In the case of constant coefficients, which corresponds toan extension of Merton’s model, the solution does not depend on y and isequal to:

ϕ(t, y) = λ(T − t),

where the parameter λ is given by:

λ = αr + maxw∈A

[αwtµ(y)− α (1− α)

2

(∣∣∣∣tΣw∣∣∣∣2)] .

The optimal proportion is constant given by:

w ∈ arg minw∈A

[(1− α)

2

∣∣∣∣Σ(y)tw∣∣∣∣2 − wtµ(y)

].

However, in a general stochastic volatility model, no closed-form solutionis available for the parabolic Cauchy problem (8.28).It must be solved numerically but is simpler that the initial Bellman equation.


Example 8.4 Hull-White and Scott modelsAssume that

dSt

St= (r + µ(Ft)) dt + ρeγFtdW1,t +

√1− ρ2eγFtdW2,t,

dFt = (a− θFt) dt + dW2,t,

where a and γ = 0 are constant and θ is the rate of mean-reversion.

The parameter ρ is a constant correlation coefficient.

In that case, we have:

η(f) = a− θf, σ(f) = ρeγf and ΣtΣ(f) = e2γf .

Therefore, without constraint on weighting w, the value function of theoptimization problem is given by

J (t, x, f) =xα

αexp [−ϕ(t, f)] ,

where ϕ(., .) is the unique solution of the previous Cauchy problem:

−∂ϕ

∂t− 1

2∂2ϕ

∂f2+

12

(1 +

α

1− αρ2)(

∂ϕ

∂f

)2

−(a− θf +

α

1− α

ρµ(f)eγf

)∂ϕ

∂f+ αr +

α

1− α

µ2(f)e2γf

= 0. (8.29)

Additionally, the optimal portfolio is given by:

wt =e−γFt

1− α

[µ (Ft)eγFt

− ρ∂ϕ

∂f(t, Ft)

], a.s., 0 ≤ t ≤ T.


8.2 Optimization with constraints

As seen in Chapter 3, specific constraints are often introduced, such as noshort-selling, lower and upper bounds on portfolio weights, etc.

In the continuous-time setting, Cvitanic and Karatzas [137] study the s-tochastic dynamic control problem associated to the maximization of the ex-pected utility of consumption and terminal value.

Mathematically speaking, the set of constraints is supposed to be a givenclosed and convex subset of Rd. As seen in what follows, the idea is toembed the constrained problem in a family of unsconstrained ones. Then, tofind an element of this family which satisfies the required constraints. Suchresults cover, in particular, incompleteness and no short-selling constraints.Again, this approach is based on martingale theory, duality theory, and convexanalysis.

8.2.1 General result

- The financial market contains a riskless asset with rate r. The price S ofthe d risky assets is assumed to be a semimartingale on a filtered probabilityspace (Ω,F , (Ft)t,P) solution of the SDE:

dSt = diag(St). [µ (t) dt + σ (t) dWt] , (8.30)

where W is a d−dimensional standard Brownian motion, and usual previousassumptions on coefficient functions are made.

This financial market is complete. Thus, there exists one and only onerisk-neutral probability Q defined as follows:

Denote the relative risk process η by:

η(t) = σ (t)−1 [µ (t)− r(t)I] ,

with

E

[∫ T

0

||η(t)||2 dt]<∞.

The exponential local martingale L:

L(t) = exp[−1

2

∫ t

0

||η(s)||2 ds−∫ t

0

η(s)dWs

],

is the Radon-Nikodym density of the risk-neutral probability Q w.r.t. theprobability P.


Denote R as the discount factor:

R(t) = exp[−∫ t

0

r(s)ds].

Denote M the product: M(t) = R(t)L(t).

- Set K of constraints: The constraints are assumed to be such that theset K is a non-empty, closed, and convex subset of Rd.

Denote by δ the support function of the convex set −K, defined on Rd, andwith values in Rd ∪ +∞:

δ(x) = δ(x,K) = supw∈K

(− tw.x).

The function δ is a closed, positively homogeneous, proper convex functionon Rd (see for example, Rockafellar [426] for details about these notions).

The effective domain of the function δ is the set K defined by:

K =x ∈ Rd, δ(x) <∞

,

=x ∈ Rd, ∃β ∈ R,− tw.x ≤ β, ∀w ∈ K

.

The set K is a convex cone, called the “barrier cone” of −K.

In what follows, it is assumed that the function δ(.,K) is continuous on Kand bounded below on Rd. For some δ0 ∈ R, δ(x,K) ≥ δ0 (for example, if Kcontains the origin, δ0 = 0 is convenient).

Note also that the function δ(.) is subadditive:

δ(x + y) ≤ δ(x) + δ(y).

- Utility functions: these functions satisfy all the same assumptions asshown in the previous section devoted to incomplete markets. In particular,the conjugate function of a utility function U is still denoted by U :

U(y) = maxx>0

[U(y)− xy] , y > 0, (8.31)

and the inverse of the derivative marginal utility U ′ is denoted by J.

- Constrained optimization problem: The investor searchs to maximize

E

[∫ T

0

U(t, ct)dt + U(VT )

],


on the set AK(v0) of all admissible strategies (c, w) satisfying usual assump-tions (see Chapter 6):

AK(v0) = (c, w), w(ω, t) ∈ K, for P⊗ dt a.s. (ω, t) .The value function JK is defined by:

JK = sup(c,w)∈AK(v0)

E

[∫ T

0

U(t, ct)dt + U(VT )

].

- When K = Rd (no constraint), recall that the solution is given by relation(6.83):

c∗t = J(λ∗(κ(t))) and V ∗T = J(λ∗(κ(T ))),

where κ(t) = R(t)L(t).

The Lagrange parameter λ∗ is determined from the budget equation. Itsexistence is deduced from assumptions (U) and (V), since the function F ,defined by

F (y) = EP

[∫ T

0

κ(t)J(yκ(t), t)dt + κ(T )J (yκ(T ))

],

is continuous and non-increasing on [0, a] where a = inf y |F (y) = 0 . Itsatisfies also:

limy→0

F (y) = +∞; limy→+∞F (y) = 0.

Then, the function F has an inverse F−1.

Define the function G by:

G(y) = H(F−1(y)),

where the function H is given by:

H(y) = EP

[∫ T

0

U(J(yκ(t)), t)dt + U[J (yκ(T ))

]].

Under assumptions (U) and (V) for utility functions U and U , there existsan optimal strategy (c∗, w∗) ∈ A(v0) such that:

J (c∗, w∗, v0) = maxc,w∈A(v0)

J (c, w, v0) (8.32)

where J (c, w, v0) = EP

[∫ T

0

U(cs, s)ds + U (V c,wT )

]. (8.33)


Using the previous function F and the inverse functions J and J of marginalutility functions U ′ and U ′, we have:

c∗t = J(F−1(v0)κ(t)),

V ∗T = J(F−1(v0)κ(T )),

and the optimal weighting w∗ is deduced from the martingale representa-tion 6.78.

- Auxiliary unsconstrained optimization problem: Cvitanic and Karatzas[137] introduce a family of unsconstrained optimization problems which em-beds the constrained problem. For this purpose:

• Consider the space H of Ft-progressively measurable processes (υt)twith values in Rd, and such that:

||υ||2 = E

[∫ T

0

||υt||2 dt]<∞.

• Introduce the class D of processes such that:

D =

ν ∈ H,

∫ T

0

δ (υt) dt ≤ ∞, (8.34)

where δ (.) is the support function of the set of constraints K. Notethat:

ν ∈ D ⇐⇒ ν(ω, t) ∈ K, for P⊗ dt a.s. (ω, t), (8.35)

where K is the barrier cone of K.

For any given ν ∈ D, consider a new financial market M, with one bondand d stocks:

dB(υ)t = B

(υ)t . [r(t) + δ (υt)] dt, (8.36)

dS(υ)i,t = S

(υ)i,t .

µi(t) + υi(t) + δ (υt) +d∑

j=1

σi,j(t)dWj,t

. (8.37)

Associated to the process ν ∈ D, denote also:

* The relative risk process η(υ) by:

η(υ)(t) = σ (t)−1 [µ (t) + υ(t) + δ (υt) I− (r(t) + δ (υt)) I] = η(t) + σ−1ν(t).(8.38)


* The exponential local martingale L(υ):

L(υ)(t) = exp[−1

2

∫ t

0

∣∣∣∣∣∣η(υ)(s)∣∣∣∣∣∣2 ds− ∫ t

0

η(υ)(s)dWs

], (8.39)

is the Radon-Nikodym density of the risk-neutral probability Q(υ) w.r.t. theprobability P.

* Denote R(υ) the discount factor:

R(υ)(t) = exp[−∫ t

0

(r(s) + δ (υs)) ds]. (8.40)

* Denote M (υ) the product: M (υ)(t) = R(υ)(t)L(υ)(t).

* Consider the new set of admissible strategies:

- The wealth process V (υ) satisfies:

dV(υ)t = .

[[r(t) + δ (υt)]V

(υ)t − c (t) dt

]+ .V

(υ)t

[tw(t)σ(t)dW (υ)

t

], (8.41)

where W(υ)t = Wt +

∫ t

0 η(υ)(s)ds is a standard Brownian motion under therisk-neutral probability Q(υ).

- The investor searchs to maximize (through the strategy (c(υ), w(υ))):

E

[∫ T

0

U(t, c(υ)t )dt + U(V (υ)T )

], (8.42)

on the set A(υ) of all admissible strategies (c(υ), w(υ)):

A(υ)(v0) =

(c(υ), w(υ)), w(υ)(ω, t) ∈ K, for P⊗ dt a.s. (ω, t). (8.43)

The value function J is still defined by:

J (υ) = sup(c(υ),w(υ))∈A(υ)

E

[∫ T

0

U(t, c(υ)t )dt + U(V (υ)T )

]. (8.44)

The Lagrange parameter λ(υ)∗ is determined from the budget equation. Itsexistence is from assumptions (U) and (V). Indeed, the function F (υ), definedby

F (υ)(y) = EP

[∫ T

0

κ(υ)(t)J(yκ(υ)(t), t)dt + κ(υ)(T )J(yκ(υ)(T )

)], (8.45)


has an inverse F (υ)−1.

Define the function G(υ) by:

G(υ)(y) = H(υ)(F (υ)−1(y)),

where the function H(υ) is given by:

H(υ)(y) = EP

[∫ T

0

U(J(yκ(υ)(t)), t)dt + U[J(yκ(υ)(T )

)]]. (8.46)

PROPOSITION 8.2Under assumptions (U) and (V) for utility functions U and U , there existsan optimal strategy (c

(υ)∗, w(υ)∗) ∈ A(υ)(v0) such that:

J (c(υ)∗, w

(υ)∗, v0) = maxc(υ),w(υ)∈A(υ)(v0)

J (c(υ), w(υ), v0) (8.47)

where J (c(υ), w(υ), v0) = EP

[∫ T

0

U(c(υ)s , s)ds + U(V

(υ)c,wT

)]. (8.48)

Using the previous function F (υ) and the inverse functions J and J ofmarginal utility functions U ′ and U ′, we have:

c(υ)∗t = J(F (υ)−1(v0)κ(υ)(t)),

V(υ)∗T = J(F (υ)−1(v0)κ(υ)(T )),

and the optimal weighting w(υ)∗ is deduced from the martingale representation(6.78).

Introduce the class D′ of processes:

D′ =ν ∈ D, F (υ)(y) <∞, for all y a.s;

, (8.49)

A(υ)(v0) = (c, w), w(ω, t) ∈ K, for P⊗ dt a.s. (ω, t) . (8.50)

- The optimization problem:

J (υ)(c(υ)∗, w(υ)∗, v0) = maxc(υ),w(υ)∈A(υ)(v0)

J (υ)(c(υ), w(υ), v0) (8.51)

with J (υ)(c(υ), w(υ), v0) = EP

[∫ T

0

U(c(υ)s , s)ds + U(V

(υ)c(υ),w(υ)

T

)].(8.52)


Then, under assumptions (U) and (V) for utility functions U and U , thereexists an optimal strategy (c(υ)∗, w(υ)∗) ∈ A(υ)(v0) such that,

Using previous function F and the inverse functions J and J of marginalutility functions U ′ and U ′, we have:

c(υ)∗t = J(F (υ)−1(v0)κ(υ)(t)),

V(υ)∗T = J(F

(υ)−1(v0)κ(υ)(T )).

- Equivalent optimization conditions:

PROPOSITION 8.3Suppose that for some process λ ∈ D′, we have:

w(λ)t ∈ K, δ (λ(ω, t)) + tw

(λ)t λ(ω, t) = 0. (8.53)

Then the pair(c(λ)∗, w(λ)∗) belongs to A(λ)(v0) and is optimal for the con-

strained optimization problem in the original market.

- Consider a solution (c∗K , w∗K) of the constrained optimization problem (A):

sup(cK ,wK)∈AK(v0)

E

[∫ T

0

U(t, cK,t)dt + U(VK,T )

].

Cvitanic and Karatzas [137] characterize the solution of Problem (A) byusing the following conditions (B)-(E) for a given process λ in the class D′:

* Financibility of(c(λ), V

(λ)T

):

There exists a portfolio process wK such that (c(λ), w(λ)) ∈ AK and:

w(λ)t (ω, t) ∈ K, δ (λ(ω, t)) + tw

(λ)t λ(ω, t) = 0.

V (v0,c(λ),w(λ))(ω, t) = V (λ)(ω, t)for P⊗ dt a.s. (ω, t).

* Minimality of λ. For every ν ∈ D, we have:

E

[∫ T

0

U(t, c(λ)t )dt + U(V (λ)T )

]≤ E

[∫ T

0

U(t, c(ν)t )dt + U(V (ν)T )

].

* Dual optimality of λ. For every ν ∈ D, we have:

EP

[∫ T

0

κ(υ)(t)J(yκ(υ)(t), t)dt + κ(υ)(T )J(yκ(υ)(T )

)]

≤ EP

[∫ T

0

κ(υ)(t)J(yκ(λ)(t), t)dt + κ(υ)(T )J(yκ(λ)(T )

)].


* Parsimony of λ. For every ν ∈ D, we have:

EP

[∫ T

0

κ(υ)(t))c(λ)t )dt + κ(υ)(T )V (λ)T

]≤ v0.

THEOREM 8.4 Equivalence of conditionsConditions (B)-(E) are equivalent and imply property (A) with

(c∗K , w∗K) = (c(λ), w(λ)). (8.54)

In addition, conversely, property (A) implies the existence of λ ∈ D′ whichsatisfies conditions (B)-(E) with w∗

K = w(λ) under the following assumptions:

(i) The utility functions satisfy:c→ cU ′(c)v → vU ′(v)

are nondecreasing on (0,∞), (8.55)

and for some α ∈ (0, 1), γ ∈ (1,∞), we have:

αU ′(x) ≥ U ′(γx), for all x > 0. (8.56)

From the previous result, we are led again to the dual stochastic controlproblem:

J (y) = infν∈D

E

[∫ T

0

U(t, yκ(ν)(t)

)dt +

U(yκ(ν)(T )

)]. (8.57)

THEOREM 8.5 Existence of a solution of the constrained opti-mizationUnder all previous assumptions on the utility functions, there exists an opti-mal pair (c∗K , w∗

K) for the constrained portfolio.

8.2.2 Basic examples

Examine the optimal solution for various constraints: no short-selling, up-per and lower bounds on the weighting vector, etc.

8.2.2.1 Standard strategy constraints

Example 8.5 General closed convex cone KWe have:

δ(x) =

0 if x ∈ K

∞ if x ∈ K.


Recall that K is the polar cone of −K :

K =x ∈ Rd,t wx ≥ 0, ∀w ∈ K

.

Note that, for the unsconstrained case, K = Rd,

δ(x) =

0 if x = 0∞ otherwise , with K = 0 .

Example 8.6 No short-sellingLet:

K =w ∈ Rd, wi ≥ 0, ∀i = 1, ..., d

.

Then:

δ(x) =

0 if x ∈ K∞ if x /∈ K

and K = K.

Example 8.7 Incomplete marketLet:

K =w ∈ Rd, wi = 0, ∀i = m + 1, ..., d

, for some m ∈ 1, ..., d− 1 .

Then:

δ(x) =

0, x1 = ... = xm = 0,∞, otherwise.

andK =

x ∈ Rd, xi = 0, ∀i = 1, ...,m

.

This case corresponds to an incomplete financial market driven by a multi-dimensional Brownian motion, which contains m stocks with m < d.

If, furthermore, there are no short-selling conditions then:

K =w ∈ Rd, wi ≥ 0, ∀i ≤ d and wi = 0, ∀i = m + 1, ..., d

,

δ(x) =

0, (x1, ..., xm) ∈ [0,∞[m

∞, otherwise,

andK =

x ∈ Rd, xi ≥ 0, ∀i = 1, ...,m

.


Example 8.8 Rectangular constraintsLet:

K =d∏

i=1

Ki with Ki = [αi, βi],

for some fixed numbers: −∞ ≤ αi ≤ βi ≤ +∞.Then:

δ(x) =d∑

i=1

βix−i −

d∑i=1

αix+i ,

andK =

x ∈ Rd, xi ≥ 0, ∀i ∈ S+ and xi ≤ 0, ∀i ∈ S− ,

where

S+ = i = 1, ..., d |βi = +∞ ,S− = j = 1, ..., d |αj = −∞ .

8.2.2.2 Logarithmic utility case

Assume that U(c) = ln c and U(v) = ln v. Then, we have: for every ν ∈ D,

c(υ)(t) =v0

T + 11

κ(υ)(t);V (υ)(T ) =

v0T + 1

1κ(υ)(T )

. (8.58)

Example 8.9Note that D = D′. Thus:

EP

[∫ T

0

κ(υ)(t)J(yκ(λ)(t), t)dt + κ(υ)(T )J(yκ(λ)(T )

)]

= −(1 + T )(

1 + ln1 + T

v0

)+ EP

[ln

1κ(υ)(T )

+∫ T

0

ln1

κ(υ)(t)dt

],

and,

EP

[ln

1κ(υ)(t)

]= EP

[∫ t

0

(r(s) + δ(ν(s)) +

12

∣∣∣∣η(s) + σ−1(s)v(s)∣∣∣∣2)] ds.

Therefore, the optimization problem is equivalent to a pointwise minimiza-tion of the convex function δ(x) + 1

2

∣∣∣∣η(t) + σ−1(t)x∣∣∣∣2 over x ∈ K, for every

t ∈ [0, T ].Thus the process λ is determined by:

λ(t) = arg minx∈K

[2δ(x) +

∣∣∣∣η(t) + σ−1(t)x∣∣∣∣2] . (8.59)


Finally, we deduce:

wK(t) = σ(t)tσ(t)−1 [λ(t) + µ(t)− r(t)I] ,

cK(t) =v0

T + 11

κ(υ)(t)=

V (λ)(t)1 + (T − t)

.

For all previous examples corresponding to various constraint sets K, wehave δ(.) = 0 on K. Thus, for the logarithmic case, the problem of determiningthe process λ ∈ D reduces to that of minimizing pointwise a simple quadraticform, over K :


[2δ(x) +

∣∣∣∣η(t) + σ−1(t)x∣∣∣∣2] . (8.60)

For the unsconstrained case, λ(t) = 0 and we recover the standard solution:

wK(t) = σ(t)tσ(t)−1 [λ(t) + µ(t)− r(t)I] .

Therefore, for the logarithmic case, we get explicit formulas:

• For the incomplete case, set:

σ(t) =[

Σ(t)ρ(t)

],

where Σ(ω, t) is an (m× d) matrix of full rank and ρ(ω, t) is an (n× d)matrix with orthogonal rows that span the kernel of Σ(ω, t) for every(ω, t) (we have: ρ(ω, t)tρ(ω, t) = In and Σ(ω, t)tρ(ω, t) = 0 and n =d−m).

Then, set:

M(t) = t (µ1(t), ..., µm(t)) ,a(t) = t (bm+1(t), ..., bd(t)) ,Λ(t) = tΣ(t)Σ(t)tΣ(t)−1 [B(t)− r(t)I] .

We have:η(t) = Λ(t) + tρ(t) [M(t)− r(t)I] .

For any υ ∈ K, necessarily of the form ν(t) =[

0m

N

]for some N ∈ Rm,

∣∣∣∣η(t) + σ−1(t)ν∣∣∣∣2 =

∣∣∣∣Λ(t) + tρ(t) (a(t)− r(t)In+N)∣∣∣∣2 .


Since Λ(t) and tρ(t) (a(t)− r(t)In+N) are orthogonal, we have:∣∣∣∣η(t) + σ−1(t)ν∣∣∣∣2 = ||Λ(t)||2 +

∣∣∣∣tρ(t) (a(t)− r(t)In+N)∣∣∣∣2 .

Therefore, the minimization 8.60 is achieved by the random vector,

λ(t) =[

0m

Λ(t)

]where Λ(t) = r(t)In − a(t).

Thus, we deduce (result in Karatzas et al. [319]):

wK(t) =[

Σ(t)tΣ(t)−1 [M(t)− r(t)Im]0n

].

• For the rectangular constraints, the process λ can be also determined:

- First case (only one stock): S+ = S− = ∅.

Since d = 1, K has the form [α, β] and δ(x) = βx− −αx+. The processλ is defined by:

λ(t) =

σ(t) [σ(t)β − η(t)] , if σ(t)β < η(t),σ(t) [σ(t)α − η(t)] , if σ(t)α > η(t),

0, otherwise.

Therefore, the optimal portfolio is given by:

wK(t) =

β if σ−1(t)η(t) > β,α if σ−1(t)η(t) < α,

σ−1(t)η(t), otherwise.

Note that the optimal portfolio wK(t) is equal to the unconstrainedoptimal one wRd(t), as long as wRd(t) is in the interval [α, β]. Otherwise,wK(t) is equal to the closest point to wRd(t).

- Second case (two stocks): suppose for example that α = t(0, 0), β =t(1, 1), η = t(1, 2) and

σ(t) =[

1 − 10−1 1

].

For the unsconstrained case, the optimal portfolio is given by:

wRd(t) = σ(t)−1η = t(−1/3,−4/3).

The optimal constrained portfolio wK(t) is no longer the portfolio in Kwhich is the closest one to the unsconstrained optimal portfolio wRd(t).Otherwise, it would be equal to t(0, 0).


Indeed, the minimization:


[12

∣∣∣∣η + σ−1x∣∣∣∣2 − tαx+ + tβx−

], x ∈ R2, (8.61)

leads to the value λ = t(13.5, 0), and the optimal portfolio is given by:

wK(t) = tσ−1(η + σ−1λ

)= t(0, 1/2),

which means that we must not invest on the first stock and invest halfon the second one.

8.2.2.3 Deterministic coefficients case

• We get feedback formulas: There exists a formal Hamilton-Jacobi-Bellmanequation associated with the dual optimization problem (8.11):

∂Q

∂t+ inf

x∈K

[12∂2Q

∂v∂v

∣∣∣∣η(t) + σ−1(t)x∣∣∣∣2 − ν

∂Q

∂vδ(x)

]−ν ∂Q

∂vr(t)+U (t, v) = 0,

(8.62)

andQ(T, v) =

U(v).

If there exists a solution Q which satisfies mild growth conditions, thenthe dual function satisfies (see Fleming and Rischel [231]).

Suppose that δ(.) = 0 on K, such as for all previous constraint sets K.Thus the process


[2δ(x) +

∣∣∣∣η(t) + σ−1(t)x∣∣∣∣2] , (8.63)

is deterministic and constant w.r.t. v. Then Equation (8.62) becomes:

∂Q

∂t+

12

∣∣∣∣∣∣η(λ)(t)∣∣∣∣∣∣2 ν2 ∂2Q

∂v∂v− r(t)y

∂Q

∂v) + U(t, v) = 0. (8.64)

Consider for instance the case U(x) = U(x) = xα

α . Then U(x) = U(x) =

y−δ

δ with δ = α1−α . Therefore, the Cauchy problem (8.62) has a solution

of the form:Q(t, v) =

1ρv−ρu(t),

where the function u(.) is the solution of the following ODE:

du

dt+ h(t)u(t) + 1 = 0, u(T ) = 1,


with:

h(t) = ρ infx∈K

[1 + ρ

2

∣∣∣∣η(t) + σ−1(t)x∣∣∣∣2 + δ(x)

]+ r(t)ρ.

The process λ is again deterministic and:


[2(1− α)δ(x) +

∣∣∣∣η(t) + σ−1(t)x∣∣∣∣2] . (8.65)

Cvitanic and Karatzas [137] provide a general result for the deterministiccase.

PROPOSITION 8.4Suppose that r(.), b(.), and σ(.) are deterministic and that there exists a deter-ministic process λ(.) ∈ D, which achieves the infimum of dual problem (8.11),and that: for some real numbers α > 0, β > 0, and M > 0,

J(t, y) + J(y) + |J ′(t, y)|+∣∣∣J ′(y)

∣∣∣ ≤M(yα + y−β), 0 < y <∞.

With all assumptions introduced in this section, the optimal process of con-sumption/investment (c∗K , w∗

K) is given in feedback form w.r.t. the wealthV (λ)(t) by:

c(λ)(t) = J(t, F−1(t, V (λ)(t))),

wK(t) = −σ(t)tσ(t)−1 [λ(t) + µ(t)− r(t)I]F−1(t, V (λ)(t))

V (λ)(t)F−1(t, V (λ)(t)).

REMARK 8.3 This result shows that, for the deterministic case, theweighting ratios are still constant, despite the additional constraints on theportfolio strategies:

wK,i(t)wK,j(t)

=σ(t)tσ(t)−1 [λ(t) + µ(t)− r(t)I]iσ(t)tσ(t)−1 [λ(t) + µ(t)− r(t)I]j

.


8.3 Optimization with transaction costs

As shown by Magill and Constantinides [368], Davis and Norman [149], andShreve and Soner [470], the standard approach to utility maximization undertransaction costs is based on the analytical study of the value function whichusually leads to an optimal strategy corresponding to:

• No transaction in a certain region. With a single risky asset, no tradeis made when the stock weight lies in a given interval.

• Minimal transactions at the boundary such that the weighting vectorstays in the region.

This kind of result holds for infinite horizon: it is always optimal to investin the stock (even a small amount) if the rate of return is positive.

For finite horizon, Cvitanic and Karatzas [139] show that the result maybe quite different. If the difference between the stock return and the risklessreturn is non-negative but small, and the time horizon is also relatively small,then it may be optimal not to trade at all.

8.3.1 The infinite-horizon case

Consider the model introduced by Davis and Norman [149], and furtherstudied by Shreve and Soner [470] who use the concept of viscosity solutionsto Hamilton-Jacobi-Bellman equations.- The financial market consists of one riskless asset, the bond B, and one

risky asset, the stock S, given by:

dBt = Btr(t)dt, dSt = St− . [µ(t)dt + σ(t)dWt] ,

where W is a standard Brownian motion, for t ∈ [0, T ], on a filtered proba-bility space (Ω,F , (Ft)t,P).

The processes r(t), µ(t), and σ(t) are assumed to be measurable, Ft-adapted,and uniformly bounded on [0, T ]×Ω. Besides, the process σ(t) is assumed tobe uniformly bounded away from zero.

- The trading strategy is a pair (L,M) of adapted and left-continuous pro-cesses on [0, T ] with non-decreasing paths and L(0) = M(0) = 0. The processL (respectively M) denotes the cumulative purchases (resp. sales). It is thetotal amount transferred from bank account to stock (resp. from stock tobank account).- The transaction costs are proportional : 0 < λ, µ < 1.


- The portfolio holdings (VB, VS) corresponding to a given strategy (L,U)evolve according to:

VB,t = VB,0 − (1 + λ)Lt + (1 − µ)Mt +∫ t

0

VB,u (ru − cu) du, (8.66)

VS,t = VS,0 + Lt −Mt +∫ t

0

VS,u [µ(u)du + σ(u)dWu] . (8.67)

- The solvency region is defined by:

Sλ,µ =(x, y) ∈ R2 |x + (1− µ)y ≥ 0 and x + (1 + λ)y ≥ 0

.

Denote respectively by ∂+µ and ∂−

λ the upper and lower boundaries of thesolvency region. The investor’s net wealth is equal to zero on ∂+

µ ∪ ∂−λ .

- The consumption rate is an adapted process denoted by c.- An admissible strategy (c, L,M) is such that

P [(L(t),M(t)) ∈ Sλ,µ, for all t ≥ 0] = 1. (8.68)

- The maximization of expected utility from consumption is defined by:

J (x, y) = sup(c,L,M)∈Sλ,µ

Ex=VB,0,y=VS,0

[∫ ∞

0

e−δtU(ct)dt]. (8.69)

8.3.1.1 A special case

Assume that the utility function is a power function: U(x) = xα

α . Supposealso that both processes L and U are absolutely continuous w.r.t. the currenttime:

Lt =∫ t

0

ltdt and Mt=∫ ∞

0

mtdt.

The Bellman equation to be solved for the value function J is

max(c,l,m)∈Sλ,µ

[12σ2y2

∂2J (x, y)∂y2

+ rx∂J (x, y)

∂x+ µy

∂J (x, y)∂y

+1αcα − c

∂J (x, y)∂x

]+[−(1 + λ)

∂J (x, y)∂x

+∂J (x, y)

∂y

]l +

[(1− µ)

∂J (x, y)∂x

− ∂J (x, y)∂y

]m

−δJ (x, y) = 0.

The maximum is achieved for:

c =(∂J (x, y)

∂x

)1/(1−α)

,


l =

K if ∂J (x,y)

∂y ≥ (1 + λ)∂J (x,y)∂x

0 if ∂J (x,y)∂y < (1 + λ)∂J (x,y)

∂x

and m =

0 if ∂J (x,y)

∂y > (1− µ)∂J (x,y)∂x

K if ∂J (x,y)∂y ≤ (1− µ)∂J (x,y)

∂x

.

Thus, the optimal strategy is bang-bang: buying and selling either are madeat maximum rate, or not at all.

The solvency region splits into three parts: “buy” (B) , “sell” (S), and “notransaction” (NT).

6

- x

y

x0 x1

B

NT

S

1

1

1

1-µ

1-λ

x+(1+λ)y=0

x+(1-µ)y=0

FIGURE 8.1: Bond/stock space: directions of finite transactions andsolvency region

At the boundary between (B) and (NT) subsets, we have:

∂J (x, y)

∂y= (1 + λ)

∂J (x, y)

∂x.

At the boundary between (S) and (NT) subsets, we have:

∂J (x, y)

∂y= (1− µ)

∂J (x, y)

∂x.

Then, these boundaries are to be determined precisely. For this purpose,assuming that the value function satisfies an homothetic property:

J (x, y) = yαΨ

(x

y

),


where Ψ(x) = J (x, 1). Therefore, J is constant along the lines of slope(1 + λ)−1 in B, and (1− µ)−1 in S. Thus:

Ψ(x) =1αA (x + 1− µ)α , x ≤ x0,

Ψ(x) =1αB (x + 1 + λ)α , x ≥ x1, (8.70)

for some constants A and B and x0 and x1 defined as in the previous figure.

THEOREM 8.6 Davis and Norman [149]Assume that:i) Well-posed condition: δ > α

[r + (µ− r)2/σ2(1− α)

].

ii) Transaction costs: λ ∈ [0,∞[, µ ∈ [0, 1[ and max(λ, µ) > 0.Then:1) Power utility case: (U(x) = xα/α). Suppose that:

r < µ < r + (1− α)σ2.

Then, the optimal solutions are defined from Equation (8.70): Let NTdenote the closed wedge

(x, y) ∈ R+2

∣∣∣ 1x1

< yx < 1

x0

. For (x, y) ∈ NT −

(0, 0) , definec∗(x, y) = y

[Ψ′

(x

y

)]1/(1−α)

.

Then, the process c∗t = c∗(L∗t ,M

∗t ), where both processes L∗

t and M∗t satisfy

Equations 8.66 and 8.67, is optimal for any initial investment (x, y) in NT. If(x, y) /∈NT, then an immediate transaction leads to the closest point in NT.The processes L∗

t and M∗t are the local times of the wealth at the boundaries

of the no transaction region NT with regions B and S:

Lt =∫ t

0

I(VB,s,VS,s)∈∂BdLs and Mt =∫ t

0

I(VB,s,VS,s)∈∂SdMs.

Thus, the optimal strategy is to trade minimally in order to keep the stockweight between (1 + x1)−1 and (1 + x0)−1.

2) Logarithmic utility case: (U(x) = lnx). Suppose that:

r < µ < r + σ2.

There are constants x0, x1, A and B such that:

Ψ(x) =1α

ln[A (x + 1− µ)

], x ≤ x0,

Ψ(x) =1α

ln[B (x + 1 + λ)

], x ≥ x1, (8.71)


Define the function c∗(., .) by:

c∗(x, y) = y

[Ψ′

(x

y

)]−1

.

Note that, if µ = r and δ > αr, then the optimal strategy is to close outany position in stock and to consume optimally from the bank account.

8.3.2 The finite-horizon case

Consider the model introduced by Cvitanic and Karatzas [139]. The finan-cial market consists of the same assets as in the previous case, transactioncosts are still proportional and the trading strategies (L,M) are also definedas previously.- Introduce the following auxiliary martingales : D denotes the class of pairs

of strictly positive (Ft)-martingales (Z0, Z1) with:

Z0(0) = 1, Z1(0) ∈ [S0(1 − µ), S0(1 + λ)], (8.72)and

(1− µ) ≤ R(t) =Z1(t)

Z0(t)P (t)≤ (1 + λ), (8.73)

where P is the discounted price of the stock: P (t) = S(t)B(t) .

The martingales Z0 and Z1are the feasible state-price densities for holdingsin bank and stock in the markets with transaction costs. From the martingalerepresentation theorem, there exist predictable processes θ0 and θ1 such that:

Z0(t) = Z0(0) exp[∫ t

0

θ0(u)dWu − 12

∫ t

0

θ20(u)du], (8.74)

Z1(t) = Z1(0) exp[∫ t

0

θ1(u)dWu − 12

∫ t

0

θ21(u)du]. (8.75)

Denote Zc0 the density process when there is no transaction cost. We have:

Zc0(t) = Zc

0(0) exp[∫ t

0

θc0(u)dWu − 12

∫ t

0

θc20 (u)du],

with

θc0(t) =r(t) − µ(t)

σ(t).

- The utility function on wealth U : (0,+∞) → R satisfies usual assump-tions as in previous sections. The inverse of the marginal utility is againdenoted by J : J = (U ′)−1.


- The maximization of expected utility from terminal wealth is solved asfollows.

The terminal wealth VB,T+ is defined by:

VB,T+ = VB,T + f(VS,T ) with f(u) =

(1 + λ)u if u ≤ 0,(1 − µ)u if u > 0.

This means that, at the end of the management period, the investor liqui-dates his position on stock and transfers it to a bank account with transactioncosts.

For an initial holding VS(0), we have to search for an optimal strategy(L∗, U∗) that maximizes expected utility from terminal wealth:

J (VB,0, VS,0) = sup(L,M)

E [VB,T + f(VS,T )] . (8.76)

Consider the dual problem

J (ζ, VS,0) = inf(Z0,Z1)∈D

E

[U

(ζZ0(T )B(T )

)+

VS,0

S0ζZ1(T )

], (8.77)

under the assumption that there exists a pair (Z∗0 , Z

∗1 ) ∈ D that achieves the

infimum, for all 0 < ζ <∞. Additionally, for all 0 < ζ <∞, we have:

J (ζ, VS,0) <∞ and E

[Z∗0 (T )B(T )

J

(ζZc0(T )

B(T )

)],

where Zc0(T ) is the density process when there is no transaction cost.

Note for example that this assumption is satisfied if VS,0 = 0, and eitherU(x) = ln(x) or U(x) = xα/α.

Consider the function Γ

ζ → Γ(ζ) = E

[Z∗0 (T )B(T )

J

(ζZ∗0 (T )B(T )

)].

Since the function Γ is continuous and strictly decreasing, there exists aunique ζ∗ such that

E

[Z∗0 (T )B(T )

J

(ζ∗

Z∗0 (T )B(T )

)]= VB,0 +

VS,0

S0E [Z∗

1 (T )] .

We deduce:


THEOREM 8.7(Cvitanic and Karatzas [139]) Under previous assumptions, there exists anoptimal pair (L∗,M∗) such that:

V ∗B,T+ = V ∗

B,T + f(V ∗S,T ) = J

(ζ∗

Z∗0 (T )B(T )

), (8.78)

with the following property:

L∗ is flat off the set 0 ≤ t ≤ T |R∗(t) = 1 + λ ,M∗ is flat off the set 0 ≤ t ≤ T |R∗(t) = 1− µ ,

and:V ∗B,t + R∗(t)V ∗

S,t

B(t)= EQ∗

0

[J

(ζ∗

Z∗0 (T )B(T )

)1

B(T )|Ft

],

where R∗(t) = Z∗1 (t)/Z∗

0 (t) and Q∗0 is defined by Z∗

0 = dQ∗0

dP.

The following examples illustrate the fact that for finite-horizon, it may beoptimal not to trade at all.

Example 8.10Assume that the rate r is deterministic and the inital amount VS,0 investedon stock is equal to 0. In that case, we have:

J (ζ, 0) = inf(Z0,Z1)∈D

E

[U

(ζZ0(T )B(T )

)]≥ U

(ζ

B(T )

).

Thus, the infimum is achieved by any pair (Z∗0 (.), Z∗

1 (.)) such that Z∗0 (.) ≡ 1

(then, with (1− µ) ≤ Z∗1 (t)

P (t) ≤ (1 + λ)).Consider for instance Z∗

1 (0) = (1 + λ)S0 and θ∗1(.) = σ(.). In that case,(1, Z∗

1 (.)) ∈ D if and only if:

0 ≤∫ t

0

[µ(s)− r(s)] ds ≤ ln[

1 + λ

1− µ

].

This condition is satisfied if

r(.) ≤ µ(.) ≤ r(.) + ρ for some 0 ≤ ρ ≤ 1T

ln[

1 + λ

1− µ

].

Also:

V ∗B,T+ = V ∗

B,T + f(V ∗S,T ) = J

(ζ∗

B(T )

)= VB,0B(T ). (8.79)

The no-trading strategy (L∗ ≡ 0,M∗ ≡ 0) is optimal and leads to

V ∗B,T = VB,0B(T ) and V ∗

S,T = 0. (8.80)


Note that if µ(.) ≡ r(.), even with no transaction cost, it is not optimal totrade. However, if µ(.) > r(.), the optimal stock weight is positive if there is notransaction cost. This is true even with transaction costs for infinite-horizonwith constant coefficients, as seen in previous section.

Example 8.11Consider the case µ(.) ≡ r(.) deterministic and a strictly positive amount V ∗

S,0

initially invested on the stock. In that case, we have:

Z∗0 (0) ≡ 1 and Z∗

1 (t) = (1− µ)S0 exp[∫ t

0

σ(u)dWu − 12

∫ t

0

σ2(u)du].

The optimal strategy is given by:

L∗(.) ≡ 0,M∗(.) ≡ VS,0I]0,T ](.).

This corresponds to an immediate liquidation of the stock position. Wehave:

V ∗B,T+ = V ∗

B,T + f(V ∗S,T ) = J

(ζ∗

B(T )

)= (VB,0 + VS,0(1− µ))B(T ). (8.81)

and

V ∗B,t =

(VB,0 + VS,0(1 − µ)I]0,T ](t)

)B(t), V ∗

S,t = VS,0I]0,T ](t). (8.82)

8.4 Other frameworks

8.4.1 Labor income

We examine the portfolio optimization problem of an investor who is en-dowed with a stochastic insurable stream throughout his lifetime. The in-vestor’s wealth is assumed to be non-negative over the lifetime interval, whichimplies that the wage income cannot be sold in the financial market. This liq-uidity constraint is examined by Cuoco [134], and by El Karoui and Jeanblanc-Picque [190], who succeed in finding a closed formula.

- The financial market:This consists of one “riskless” asset and d risky securities.

- The riskless asset B satisfies:

dBt = Btrtdt,


where rt is the short interest rate and B0 = 1.

- Let S be the price vector of d risky securities. Let (Ω,P) be the probabilityspace. Assume that S is defined from the following stochastic differentialequation (SDE):

dSt = St− . (µ(t, St)dt + σ(t, St)dWt) , (8.83)

where W = (W1, ...,Wn) is a standard d−multidimensional Brownian motion.

- The information is modelled by the filtration Ft generated by the Brow-nian motion (and as usual completed in order to contain all P-null sets).

- The processes µ(., .) = (µ1, ..., µd)(., .) and σ(., .) = [σi,j (., .)]i,j satisfyusual conditions which ensure that the previous SDE has one and only onesolution. These processes are assumed to be Ft-predictable and uniform-ly bounded. The matrix σ(t, .) is invertible with bounded inverse for anyt ∈ [0, T ].

- There exists a predictable and bounded process θ, the risk premium vector,such that:

σtθt = µt − rtI, a.s.

Under the previous hypothesis, the financial market is complete and withoutarbitrage opportunity.- Assumptions on portfolio strategies: At any time t, the investor receives

an income at the rate et and chooses the amount ct per time unit, whichis assigned to his consumption, and also the portfolio weighting wt which issupposed to be self-financing. The cumulative consumption

∫ t

0csds is an Ft-

adapted-process with∫ t

0csds <∞, P-a.s.

Note that if we impose ct ≥ max(et, 0), then the optimization problemlooks like the standard one, since the assumption of non-negative terminalwealth implies the liquidity constraint. When et ≥ 0, ∀t ≥ 0, the process ecan be interpreted as a labor wage. When et ≤ 0, ∀t ≥ 0, the process e can beviewed as a constraint on the consumption process: the excess consumption,ct = ct − et ≥ −et.

The portfolio weighting wt is predictable and such that∫ t

0 ||ws||2 ds < ∞,P-a.s., where ||.|| denotes the norm.

Thus, the portfolio valueVt is an Ito process defined by:

Vt = V0 +∫ t

0

rsVsds +d∑

i=1

∫ t

0

wi,tVsdSi,s

Si,s−∫ t

0

(cs − es) ds. (8.84)


Assumption on the income process: The income process (et)t is spanned bythe market assets. There is no “extra noise” on the income dynamics. Thismeans that (et)t is Ft-adapted.

The process (et)t is also supposed to be square-integrable.

- Assumption on liquidity constraint: The wealth process V satisfies Vt ≥ 0at any time during the management period [0, T ]. This condition implies thatthe investor cannot borrow against future labor income.

- Assumptions on utility functions: For any time t, let U(., t) and U(., t) betwo utility functions satisfying ∀t ∈ [0, T ],

- U(., t) and U(., t) are defined on R+, strictly concave, non-decreasing andcontinuously differentiable.

- limx→∞ ∂∂xU(x, t) = 0 and limx→∞ ∂

∂x U(., t) = 0.

- U(., .) and U(., .) are continuous on R+ × [0, T ] (for example: U(x, t) =e−ρtu(x) with ρ positive scalar).

Note that the marginal utilities ∂∂xU(x, t) and ∂

∂x U(., t) are non-decreasing.Therefore, their inverse functions J and J exist.

As in previous sections, U and U denote respectively the convex conjugate

functions of U and U .

The maximization of the expected intertemporal utility along the time pe-riod [0, T ] is the following optimization problem:

maxc,w

E

[∫ T

0


], (8.85)

for a given initial budget (at time t = 0, the portfolio value V is equal to agiven value V0) and for a given income process e.

Since the financial market is complete, all positive and square-integrableconsumption/wealth plans (c, ξT ) are replicated by square-integrable process-es (w, V ) such that:

dVt = rtVtdt +d∑

i=1

wtσtVt (dWt + θtdt)− (ct − et) dt, (8.86)

VT = ξT . (8.87)


Assume that the consumption/wealth plan (c, ξT ) is such that c is a non-negative Ft-progressively measurable process with

∫ t

0 |cs| ds < ∞, and ξT isa non-negative FT -measurable random variable. Then consider the minimalequation of 8.86 which is:

Vt = E

[∫ T

t

Zts(cs − es)ds + Zt

T |Ft

],

where (Zts)s≥t is the shadow state-price process defined by the following for-

ward equation:

dZts = −Zt

s (rsds + θsdWs) , s ≥ t, Ztt = 1. (8.88)

Thus, the liquidity constraint has the following form:

Vt ≥ −It, t ≥ 0, (8.89)

where It = E[∫ T

t Ztsesds |Ft

]is the unique solution of 8.86, when c = e = 0.

The existence of an optimal solution can be proved in a general setting, asshown in El Karoui and Jeanblanc-Picque [190] (see Theorems (3.3), (3.7), and(3.8)), by using the Kuhn-Tucker multiplier method to linearize the budgetconstraint, and by solving a stochastic control problem which is the dual ofthe unconstrained problem.

Consider the standard Markovian framework: the state-price density Z andthe income process e are Markov processes with

dZt = −Zt [rdt + θdWt] and det = −et [µedt + σedWt] , (8.90)

where r, θ, µe and σe are constant coefficients.

The free dual value function J is defined by:

J (t, ν, ε) = E

[∫ T

t

(U[νZt

s, s)]

+ νZtses

)ds +

U(νZt

T

) |et = ε

], (8.91)

withU [ν, s] + νε + L(J ) = 0, (8.92)

where

L(J ) =∂J∂t− νr

∂J∂ν

− εµe∂J∂ε

+12ν2θ2

∂2J∂ν2

+12ε2σ2 ∂

2J∂ε2

+ νθεσe∂2J∂ε∂ν

,

(8.93)

and the terminal condition J (T, ν, ε) = U(ν).


The dual value function Φ(t, ν, et) is defined by:

Φ(t, ν, et) =

minD∈D

E

[∫ T

t

(U[νZt

sDs, s)]

+ νZtsDses

)ds +

U(νZt

TDT−) |et = ε

], (8.94)

where D is the class of adapted, right-continuous, non-increasing processes Dsuch that D0 ≤ 1 and DT = 0.

Examine the particular case of absolutely continuous parameters D. Then,the problem can be characterized by using variational inequality:

∂Φ∂ν

≤ 0,

U [ν, t)] + νε + L(J ) ≤ 0,(∂Φ∂ν

)(U [ν, t)] + νε + L(J )

)= 0,

Φ(T, ν, ε) = 0,

which can be written as

max(∂Φ∂ν

, U [ν, t)] + νε + L(J ))

= 0. (8.95)

Consider the boundary between the set(t, ν, ε)

∣∣∂Φ∂ν (t, ν, ε) = 0

and the

continuation region(t, ν, ε)

∣∣∂Φ∂ν (t, ν, ε) < 0

.

The variational equation is associated with an American put:

Ψ(t, ν, ε) = supD∈D

E

[∫ T

t

Zts

(−U ′ [νZt

s, s)]− es

)ds− Zt

τ Iτ=TU

′ (νZt

T

) |et = ε

].

Under the risk-neutral probability Q, the process W = Wt+θt is a Brownianmotion and we have:

dZt = Zt

[(−r + θ2)dt− θdW t

]and det = −et

[(µe − σeθ) dt + σedW t

].

(8.96)Also,

Ψ(t, ν, ε) = supτ≥t

EQ

[∫ T

t

(−U ′ [νZt

s, s)]− es

)ds− Iτ=T

U

′ (νZt

T

) |et = ε

],

is solution ofmax (ΛΨ(t, ν)−Ψ(t, ν)) = 0, (8.97)


where

ΛΨ =∂Ψ∂t

+ (θ − r)∂Ψ∂ν

+ ε (−µe − θσe)∂Ψ∂ε

+12ν2θ2

∂2Ψ∂ν2

+12ε2σ2

e

∂2Ψ∂ε2

+ νθεσe∂2Ψ∂ε∂ν

− rΨ − U ′ − ε.

The optimal wealth is V ∗t = H(t, et, Z∗

t ), where

H(t, ε, ν) = E

[∫ T

t

Zts

(−U ′ [Zt∗

s (ν), s)]− es

)ds− Zt

TU

′ (Zt∗

T−(ν)) |et = ε

].

Using Ito’s formula, we deduce the optimal portfolio w∗ which is the solutionof: [

−Z∗t θ

∂H∂ν

− εσe∂H∂ε

](t, ε, Z∗

t ) = wt.

Example 8.12A closed formula can be given for the optimal consumption plan in the infinitehorizon case. Assume that the utility function U(c, t) is HARA:

U(c, t) = e−δt c1−α

1− α, α = 1, then − U ′(ν, t) =

(νeδt

)− 1α .

Suppose that:

- A1: The certainty equivalent present value I0 of the lifetime labor incomesatisfies:

I0(ε) = E

[∫ ∞

0

Ztetdt

]<∞,

which means that r+ µe − σeθ > 0. Then, we have: I0(ε) = Bε < ∞ withB = [r + µe − σeθ]

−1 .

-A2: The free dual problem is well-posed:

0 < A = E

[∫ ∞

0

Zt

(νeδtZt

)− 1α dt

]<∞.

Note that

A =[δ

α+(α− 1α

)(r +

θ2

2α

)]−1

.

For the free case, from Markovian properties, we deduce that the optimalfree wealth process V f is given by:

V ft (ν) = A

[νeδtZt

]− 1α −Bet,


with the multiplier ν such that the initial wealth is equal to V0:

A [ν]−1α = V0 + Bε.

We deduce that the optimal consumption c∗ is given by:

c∗t (ν) =V ft (ν) + Bet

A.

The optimal wealth process V f is the solution of:

dV ft = −Bdet + Ad

(νZte

δt)− 1

α

= Bet (µedt + σedWt)

+(V ft + Bet

) [( 1α

(r − δ) +1

2α

(α + 1α

θ2))

dt +θ

αdWt

].

The optimal portfolio is given by:

w∗t =

θ

αV ft (ν) +

(σe +

θ

α

)Bet.

For the constrained case, the wealth value Vt(ν) is deduced from the priceof an American option written on the negative part of the free value:

V0(ν) = V0f (ν) + J0(ν),

whereJ0(ν) = sup

τE[Zτ

(A(νeδτZτ

)− 1α −Beτ

)].

The associated one-dimensional stopping time problem is as follows. Thevalue function J0 corresponds to an optimal stopping problem with a twodimensional Markov diffusion processes (νeδtZt, et, t ≥ 0). Since the payof-f is homogeneous, this problem can be transformed into a one-dimensionalstopping time problem:

J0(ν)ε

= supτ

E

[Zτeτε

(AYτ −B)−]

with Yt =

(νeδtZt

)− 1α

et.

The function J0(ν)ε is the value function of an optimal stopping time problem

w.r.t. (AY −B), where the Markov process (Yt)t is a geometrical Brownianmotion under the risk-neutral probability Qe with characteristics given by:

dYt = Yt

(µY + σY dWY

t

), Y0 = ν−

1α ε−1,

µY = B−1 −A−1, σ2Y = θ2

α2 + σ2e + θσe

α .


The optimal stopping problem is solved as follows. Since the value functionJ0(ν)

ε is convex and monotonous, the optimal stopping problem is equivalentto the search of an entrance time τ(a) such that

τ(a) = inf t : Yt ≤ a .

To this stopping time is associated a “reward” given by:

Ψ(y, a) = EQe

[e−τ(a)/B

(AYτ(a) −B

)−] = (Amin(a, y)−B)− EQe

[e−τ(a)/B

].

We can use the Laplace transform of the law of a Brownian entrance timeto compute Ψ(y, a). Let T (b, µ) = inf t : µt + Wt = b be the entrance timefor the Brownian motion with drift (µt + Wt)t:

E[e−λT (b,µ)IT (b,µ)<∞

]= exp

[bµ− |b|

√µ2 + 2λ

], λ > 0.

The process ln(Y ) is a Brownian motion with drift ∇ given by:

∇ = µY − 12σ2Y =

r − δ − θ2/2α

+ µe − σ2e

2− θσe − θσe

2α.

Thus, the stopping time τ(a) corresponds to an entrance time for a Brow-nian motion:

τ(a) = T

(1σY

ln(a

y

),∇σY

), for a < y.

We have

EQ

[e−τ(a)/B

]=(

min(a, y)y

)∆

,

with

∆ =1σ2Y

[∇+

√∇2 + 2σ2

Y B−1

].

The optimal stopping time corresponds to the value of a which maximizesΨ(y, a) :

a∗ =∆B

A (1 + ∆)(note that: a∗ ≤ 1).

We have also:

J0(ν)ε

= Ψ(y,min(a∗, y)) = (B −Amin(a∗, y))+(

min(a∗, y)y

)∆

.

El Karoui and Jeanblanc [190] prove that there is a closed form solution forthe optimal wealth associated with the constrained dual problem. It is based


on the free boundary of this problem:

b(ε) =(A (1 + ∆)

∆Bε

)α

,

Z0(ν) = 0 if ν ≥ b(ε),

Z0(ν) =Bε

1 + ∆

(∆Bεν

1α

A (1 + ∆)

)∆

+ Aν−1α −Bε, otherwise.

Finally, we deduce that the optimal consumption function is given by: (x =V0)

c∗(νx) = min (νx, b(ε))− 1α = ε(max (y, a∗)) with y = ε−1ν

− 1α

x .

The optimal consumption and optimal wealth are linked through a feedbackformula:

x∗ = ε

(B

1 + ∆

(∆B

A (1 + ∆)ε

c∗

)∆

+ Ac∗

ε−B

). (8.98)

When the initial wealth value is equal to 0, the maximal value for the con-sumption function is a fraction equal to a∗ of the income stream.

The fraction a∗ is equal to ∆BA(1+∆) , and is strictly smaller than the fraction

BA corresponding to the free problem.

The following figure illustrates how the optimal consumption depends onthe optimal wealth, according to relation (8.98). It is drawn in the plane(x∗

ε , c∗ε ). The upper curve corresponds to the solution of the free problem.

The parameter values are:

r = 0.1, b = 0.2, σ = 0.1, µe = 1, σe = 0.1, γ = 0.7 and β = 1.


FIGURE 8.2: Optimal consumption as function of optimal wealth

8.4.2 Stochastic horizon

If the investor does not know with certainty the time of exiting the market(e.g., retirement, death, etc.), the time horizon becomes uncertain. Therefore,we have to examine optimization problem such as:

max E [U(Vτ∧T )] , (8.99)

where τ ∧ T is the investment horizon.

When τ is a stopping time and the financial market is complete, the solu-tion is provided, for example, by Karatzas and Wang [322] and Richard [424].Blanchet-Scalliet et al. [79] consider the case where the time τ is a randomtime horizon and the conditional distribution of τ given the available informa-tion at time t is known. The process Ft = P [τ ≤ t |Ft ] satisfies the so-called(G)-assumption. The process (Ft)t is non-decreasing and right-continuous. Inthe complete market framework, they examine in particular the case whereFt admits a density function ft w.r.t. the Lebesgue measure. Then, problem(8.99) is equivalent to:

max E

[∫ T

0

U(Vt)ftdt + U(VT )

(1−

∫ T

0

ftdt

)]. (8.100)

This latter problem is reduced to a standard stochastic control problem forwhich we can use dynamic programming. Using a mild assumption on thedensity f, this problem is solved through PDE (see Blanchet-Scalliet et al.


[79]) or BSDE (see El Karoui et al.[191]). However, they do not include eitherthe case where Ft = I[τ≤t], nor the case where F has no density w.r.t. theLebesgue measure.

This the reason why Bouchard and Pham [84] introduced a more generalwealth path dependent utility maximization problem:

max E

[∫ T

0

U(Vt)dFt

], (8.101)

where the process (Ft)t is non-decreasing and right-continuous. Note that

Ft = P [τ ∧ T ≤ t |Ft ] .

Note also that, contrary to fixed time horizon, the whole path of the port-folio process must be taken into account. Therefore, the dual variables arestochastic processes and not simple FT−measurable random variables. Thus,the optimization problem is not reduced to a static one. Bouchard and Pham[84] derive a dual formulation in a general incomplete semimartingale frame-work. Then, they provide a solution to the primal problem, using a calculusof variation on the primal problem. The model is defined as follows:

- The financial market consists of one bond B chosen as numeriare andd securities (Si,t)i,t. The process S is assumed to be a semimartingale on afiltered probability space (Ω,F , (Ft)t,P).- Portfolio strategy and value: The numbers of shares (θi,t)i,t, invested

on each asset i at time t, is a predictable process, which is integrable w.r.t.the price process S. The portfolio value process (Vt)t, associated to strategy(θi,t)i,t, is given by, for t ∈ [0, T ],

Vt = V0 +d∑

i=1

∫ t

0

θi,sdSi,s. (8.102)

The utility function U satisfies the same assumptions as in previous sections.The optimization is well-defined by assuming that the wealth process V is

in the set

A =

V : ∀t ∈ [0, T ], Vt ≥ 0, a.s. and E

[∫ T

0

U−(Vt)dFt

]<∞

.

To avoid degeneracy, it is assumed that P [FT > 0] > 0. The random variableFT is also supposed to satisfy E [FT ] < ∞ and, without loss of generality,E[∫ T

0 dFt

]= 1.


PROPOSITION 8.5 Bouchard and Pham [84]Under the previous hypothesis and assuming also that:i) The financial market has no arbitrage opportunity (there exists at least

one equivalent local martingale)ii) The value function J of the problem

J (y) = infY ∈D(y)

∫ T

0

U(Yt)dFt <∞,

where D(y) is the set of all processes Y such that Y ≥ 0 w.r.t. the measuredm, which is defined by:

m(E × F ) = E

[∫ T

0

IE×F dFt

].

Then, we have:i) Existence of a solution to the primal problem: for all V0 > 0, there exists

a unique optimal solution V ∗ in the set A.ii) Existence of a solution to the dual problem: for all Y0 > 0, there exists

a unique optimal solution Y ∗ in the set D(Y0).iii) Duality relations: (notation: J = (U ′)−1) for all V0 > 0,

V ∗t = J(Y ∗

t ) with V0Y∗0 = E

[∫ T

0

J(Y ∗t )Y ∗

t dFt

].

Example 8.13 Blanchet-Scalliet et al. [79]The financial market contains a riskless asset with rate r and a risky assetS which is assumed to be a semimartingale on a filtered probability space(Ω,F , (Ft)t,P) solution of the SDE:

dSt = St− . [µdt + σdWt] , (8.103)

where W is a standard Brownian motion and µ is constant and σ is a non-negative constant. This financial market is complete. Thus, there exists oneand only one risk-neutral probability Q defined with the relative risk processη by η = σ−1 [µ− rI] . Assume that the deterministic time horizon is infinite( T < +∞). Introduce:

J (t, V0) = supw

E

[∫ ∞

t

f(u)U(Vu)du].

Case 1: For some deterministic function f, F (t) =∫ t

0 f(u)du on [0, T ] andF (T ) = 1. The random time τ is independent from the filtration and has thedensity f .


• If U(x) = lnx, we have:

J (t, V0) = p(t) + q(t) ln V0,

w∗t =

µ− r

σ2V ∗t ,

with

q(t) = 1−∫ t

0

f(u)du, p(t) = λ− ct + c

∫ t

0

∫ s

0

f(u)du,

c = r +12

(µ− r)2

σ2, λ = E [τ ]

[r +

12

(µ− r)2

σ2

].

• If U(x) = xα

α , 0 < α < 1, we have:

J (t, V0) = p(t)V α0 ,

w∗t = − µ− r

(α− 1)σ2V ∗t ,

with

p(t) = e−at

∫ ∞

t

eauf(u)du, c = r − 12(α− 1)

(µ− r)2

σ2.

• If U(x) = e−x, r = 0 and E

[e

((µ−r)2σ2

)τ

]<∞, then we have:

J (t, V0) = p(t)e−V0 ,

w∗t =

µ

σ2,

with

p(t) = e−ct

∫ ∞

t

ecuf(u)du, c =12

(µ− r)2

σ2.

Case 2: F (t) = P [τ ≤ t |Ft ] =∫ t

0 f(u)du on [0, T ] where f is solution of thefollowing SDE:

dft = ft (adt + bdWt) .

The random time τ is a stopping time w.r.t. the filtration.

• If U(x) = lnx and a < 0, we have:

J (t, V0, f0) = f0

[C2e

−C1t

∫ T

t

q(s)eC1sds + q(t) lnV0

],

w∗t =

(µ− r + σb

σ2

)V ∗t ,


with

q(t) =(

1a

+1C1

)eC1(T−t) − 1

C1, C1 = b

(µ− r

σ+ b

),

C2 = r + (µ− r)(µ− r + σb

σ2

)− 1

2

(µ− r + σb

σ

)2

.

• If U(x) = xα

α and a < 0, we have:

J (t, V0) = f0V α0

α

(1a

+1CeCT

)e−Ct − 1

C,

w∗t = −µ− r + σb

(α− 1)σ2V ∗t ,

with C = α(r + (µ− r)

(µ−r+σb(α−1)σ2

))+ 1

2α (α− 1)(

µ−r+σb(α−1)σ

)2

−b(

µ−r+σb(α−1)σ

).

8.5 Further reading

Standard results about dynamic portfolio optimization can be found in Korn[333]. As shown in Schachermayer [452], the asymptotic elasticity conditionis related to the asymptotic relative risk aversion under mild assumptions.

Investment-consumption models with transaction costs are examined inAkian et al. ([12],[13]). Quadratic optimization with transaction costs is an-alyzed by Adcock and Meade [6]. Bielecki and Pliska [70] study risk-sensitivedynamic asset management with transaction costs. Optimal portfolio man-agement with transaction costs equal to a fixed fraction of the portfolio, cor-responding to a portfolio management fee, is examined in Morton and Pliska[394]; the asymptotic growth rate (the “Kelly criterion”) is maximized on aninfinite-horizon. It is shown that even with very low transaction costs, the op-timal solution leads to very infrequent trading. Duffie and Sun [177] examine amodel where the transaction is also a fixed fraction of portfolio value and alsowith a proportional cost for withdrawal of funds for consumption. Assumingthat the wealth is only observed at transaction times, it is optimal to tradeat fixed deterministic intervals. Fleming and Soner [232] also provide detailsabout viscosity solutions to portfolio optimization with transaction costs (seealso Lions [358] and Zariphopoulou [510]). Cadenillas et al. ([97],[98]) alsostudy portfolio optimization with transaction costs and possible taxes. Portfo-lio optimization with transaction costs based on impulse control is introducedin Korn [334]. Deelstra et al.[153] provide the dual formulation of the utilitymaximization under transaction costs.


Portfolio optimization with labor income has been also studied withoutthe assumption that wealth is non-negative over the period of trading, byKaratzas et al. [317]. The investor is allowed to capitalize his wage incomeat some interest rate which is simply added to the current wealth. He andPages [287] consider that the investor’s wealth must be non-negative. Theyprove that there exists an optimal solution by application of duality theory.The optimization problem is transformed in an unconstrained dual shadowprice problem. Assuming that the asset price is a Markov-diffusion process,they solve the problem by the dynamic programming method when the laborincome is a function of the asset price. Duffie et al. [174] examine the samekind of optimization problem by dynamic programming when the income pro-cess is uninsurable. Thus, the financial market is incomplete since the laborincome cannot be duplicated by a portfolio. They provide a quasi-explicitsolution of the H.J.B. equation when the utility function is HARA, and whencoefficients are deterministic. They show that at a zero wealth, the investorconsumes a fixed fraction of wealth while saving the remaining amount in theriskless asset.

Portfolio optimization under partial information has been studied by Lakner[343]. Nagai and Peng [397] consider an optimal investment problem withpartial information for a factor model. Jeanblanc et al. [296] examine utilitymaximization under partial information when asset dynamics are modelledby a jump-diffusion process. They assume that only the vector of stock priceis observable. However, they prove that the optimization problem can berewritten as a problem with coefficients depending on past history of observedprices.

Portfolio optimization with possible bankruptcy can also be analyzed. Forexample, the investor is under the obligation to pay a debt until he declaresbankruptcy. This type of problem has been examined by Cadenillas and Sethi[99], Lehoczky et al. [350], Sethi [460] and Jeanblanc and Lakner [297].

Dynamic benchmark optimization is analyzed by Browne [94], who searchesto outperform a stochastic benchmark. Buckley and Korn [96] use impulsecontrol to determine an optimal index tracking under transaction costs.

Part IV

Structured portfoliomanagement

“The sophistication of portfolio insurance users has grown as rapidly as theproduct itself... Sponsors are realizing the advantage of programs that pro-tect a fund’s surplus. Portfolio insurance is being applied to many differentclasses of assets besides equities; fixed-income and international investmentsare growing areas of application. An early criticism of portfolio insurancewas that it reduced return as well as reducing risk. But users are discover-ing that portfolio insurance can be used aggressively rather than simply toreduce risks. Long-run returns can actually be raised, with downside riskscontrolled, when insurance programs are applied to more aggressive activeassets. Pension, endowment, and educational funds can actually enhancetheir expected returns by increasing their commitment to equities and otherhigh-return sectors, while fulfilling their fiduciary responsibilities by insuringthis more aggressive portfolio. Compared with current static allocation tech-niques, annual expected returns can be raised by as much as 200 basis pointsper year... Dynamics can be used to mold a set of returns to virtually anyfeasible investor objective. It can be used to manage the risks of corporatebalance sheets as well as investment funds. As such, it may represent themost significant advance to date in the science of financial engineering.”

Hayne E. Leland and Mark Rubinstein, “The Evolution of Portfolio Insur-ance,” published inDynamic Hedging: A Guide to Portfolio Insurance, (1988).

279

Chapter 9

Portfolio insurance

The purpose of portfolio insurance is to limit portfolio losses when suddendrops occur in the financial market, while allowing investors to benefit frompotential rises of the market. Thus, very often, for insured portfolio values atmaturity:

• There exists a guaranteed amount (it means that the probability thatthe guarantee is violated must be equal to 0).

• When the market rises, the portfolio return must also rise at (at least)a predetermined percentage of a given index return.

Therefore, portfolio insurance generally requires to specify the guaranteeconstraint and the portfolio maturity.

Two standard portfolio insurance methods are the Option Based PortfolioInsurance (OBPI) and the Constant Proportion Portfolio Insurance (CPPI):

• The OBPI, introduced by Leland and Rubinstein [349], consists of aportfolio invested in a risky asset S (usually a financial index such asthe S&P) covered by a listed put written on it.

• The CPPI was introduced by Perold [406] (see also Perold and Sharpe[407]) for fixed-income instruments and Black and Jones [76] for equityinstruments. This method uses a simplified strategy to allocate assetsdynamically over time.

This chapter provides:

• First, some basic properties of OBPI and CPPI methods, such as theirpayoff functions, for instance in the geometric Brownian motion frame-work.

• Second, their payoffs at maturity are compared by means of stochasticdominance, by means of four first moments of their returns and of thecumulative distribution of their ratio.

• Finally their dynamics properties are examined and compared. It isproved that the OBPI method is a generalized CPPI where the multipleis allowed to vary. We will also focus on the dynamics of both methods,in particular their “Greeks.” In what follows, we use in particular resultsin Bertrand and Prigent ([60], [58], [59]).

281


9.1 The Option Based Portfolio Insurance

Leland and Rubinstein [349] introduce the OBPI method which uses com-binations of standard securities (such as bonds and stocks) and options.

To illustrate this method, let us examine the following example.

Example 9.1 Why do we need options?Consider a given time horizon T (for example one year).

- A first investor, denoted by I1, chooses to invest in a riskless asset B andin a risky security S (typically a financial index).

- A second investor, denoted by I2, chooses to invest in B and to buy a Calloption C, written on the asset S.

The amount in B of investor I2 corresponds to the discounted exercise priceof the option.

Assume that:- Both investors have the same initial total wealth V0 invested in the finan-

cial market.- Each of them wants to recover at maturity a same given percentage p of

his initial investment.Suppose that the riskless rate r is equal to 3%, and that the volatility of S is

equal to 20%. Assume also that B0 = 1 and S0 = 100. Within this framework,the value C0 of the at-the-money Call option is equal to C0 = 9.41.

- The initial value V0 of investor I2, who buys the option, is given by:

V0 = S0e−rT + C0 106.41.

Then, the portfolio value V(2)T of investor I2 at maturity T is equal to:

V(2)T = S0 + (ST − S0)+ = 100 + (ST − 100)+.

This value is higher than the amount S0 = 100, which corresponds to theguaranteed amount for investor I2. It is the minimal amount that he recoversat maturity if he cannot exercise the option (if the price of S decreases). Notethat the guaranteed percentage p is given by p = S0/V0 94%.

- The initial value V0 of investor I1, who chooses a simpler strategy: buyand hold quantities a and b respectively in B and in S, without using anoption, is given by

V0 = aB0 + bS0 106.41.

Since investor I1 wants the same guaranteed percentage as I2, he must investthe same amount in the riskless asset B. This implies that aB0 = S0e

−rT 97.Therefore, he chooses the same percentage of investment in B (here, about

91%).

Portfolio insurance 283

Thus, the portfolio value V(1)T of investor I1 at maturity T is equal to

V(1)T = aB0e

rT + bST 100 + 0.094× ST .

- Let us compare the two portfolio returns for the two cases:

* First case: the growth of the asset price S is equal to 30%.* Second case: the asset price S decreases by −30%.

For a rise of 30%, the portfolio return of investor I1 is approximately equalto p+ (0.09)× 1.3, which gives a rate of about 5.7%. For a decrease of −30%,the portfolio return of investor I1 is approximately equal to p + (0.09)× 0.7,which gives a rate of about 0.3%.

For a rise of 30%, the portfolio return of investor I2 is approximately equalto p + (0.09)× (option return), which gives a rate of about 22.7%. For adecrease of −30%, the portfolio return of investor I2 is approximately equalto p, which gives a rate of about −6%.

Therefore, when the risky asset price S increases significantly, the portfoliowhich contains the option has a much better return than the simple combi-nation of the two basic assets B and S.

However, it is the converse when the risky asset drops. Nevertheless, inthat case, the guarantee is always satisfied. Note also that, with respect toa riskless investment with return erT 3%, both portfolios clearly providesmaller returns when the financial market is bearish, but a riskless investmentcannot benefit from bullish market.

Note that the percentage that the portfolio of the first investor can acquirefrom a potential increase of the risky asset S is rather low. Indeed, we have:(

VT

V0

)= p + (1 − pe−rT )

(ST

S0

) 0.94 + (0.09)

(ST

S0

). (9.1)

Thus, only about 9 % of the risky return ST /S0 can be obtained.Concerning the portfolio including the option, note that if the option can

be exercised, its return is equal to:

(VT

V0

)= p + (1− pe−rT )

S0

(STS0− 1

)C0

0.94 + (0.95)×(ST

S0− 1

). (9.2)

Therefore, this portfolio can benefit from a high leverage when the financialmarket rises.


9.1.1 The standard OBPI method

The OBPI, introduced by Leland and Rubinstein [349], consists of a port-folio invested in a risky asset S (usually a financial index such as the S&P)covered by a listed put written on it. Whatever the value of S at the terminaldate T , the portfolio value will be always greater than the strike K of the put.At first glance, the goal of the OBPI method is to guarantee a fixed amountonly at the terminal date. In fact, as shown in what follows, the OBPI methodallows one to get portfolio insurance at any time. Nevertheless, the Europeanput with suitable strike and maturity may be not available on the market.Hence, it must be synthesized by a dynamic replicating portfolio invested ina risk-free asset (for instance, T-bills) and in the risky asset.

The portfolio manager is assumed to invest in two basic assets: a moneymarket account, denoted by B, and a portfolio of traded assets such as acomposite index, denoted by S. The period of time considered is [0, T ]. Thestrategies are self-financing.

The value of the riskless asset B evolves according to:

dBt = Btrdt,

where r is the deterministic interest rate.

The dynamics of the market value of the risky asset S are given by thestandard diffusion process:

dSt = St [µdt + σdWt] ,

where Wt is a standard Brownian motion.The OBPI method consists basically of purchasing q shares of the asset S

and q shares of European put options on S with maturity T and exercise priceK.

Thus, the portfolio value V OBPI is given at the terminal date by:

V OBPIT = qST + q(K − ST )+, (9.3)

which is also V OBPIT = qK + q(ST −K)+, due to the Put/Call parity. This

relation shows that the insured amount at maturity is the exercise price timesthe quantity q: qK.

The value V OBPIt of this portfolio at any time t in the period [0, T ] is:

V OBPIt = qSt + qP (t, St,K) = qK.e−r(T−t) + qC(t, St,K), (9.4)

where P (t, St,K) and C(t, St,K) are the Black-Scholes values of the Europeanput and call.


REMARK 9.1 Assume that standard no-arbitrage conditions hold to-gether with no market friction. Then, for all dates t before T , the portfoliovalue is always above the deterministic level qKe−r(T−t), which shows thatthe OBPI strategy also provides a dynamic guarantee.

REMARK 9.2 1) The amount insured at the final date is often expressedas a percentage p of the initial investment V0 (with p ≤ erT ). Since, here, thisamount is equal to the strike K itself, it is required that K is an increasingfunction of the percentage p, determined from the relation:

pV0(K) = p(qK.e−rT + qC(0, S0,K)) = qK. (9.5)

Indeed, we haveC0(K)K

=1− p e−rT

p, (9.6)

where C0(K) denotes the initial Call option value (for example determinedfrom Black and Scholes formula). Since the ratio C0(K)

K does not depend onS0K , we deduce that the higher the guaranteed percentage p, the higher mustbe the strike K (for a given value S0).

2) For a given initial portfolio value V0 and for given values S0 and P0(K),the number q of shares is a decreasing function of the exercice price K, sinceq is determined from

q =V0

S0 + P0(K), (9.7)

where P0(K) denotes the initial Put value.Finally, note that for a given investment V0 and a fixed guaranteed percent-

age p, the option strike K and the number of shares q are entirely determined.

To simplify the presentation and without loss of generality, we shall assumethat q is normalized and set equal to one.

Then:V OBPIT = ST + (K − ST )+ = K + (ST −K)+. (9.8)

This function is increasing and convex w.r.t. the risky asset price ST atmaturity. Therefore, it has the typical features of the portfolio payoffs withguarantee constraints. Indeed, such portfolio can benefit from a market rise,since if the risky asset price is higher than the strike at maturity, the returnis given by: (

VT

V0

)=(ST

S0

)×(

S0

S0 + P0(K)

).

Thus, in that case, the percentage which is obtained is equal to 11+P0(K)/S0

(for the previous example, this percentage is equal to 94.4%. Thus, for a returnST /S0 equal to 30%, the return VT /V0 is about 22.7%).


The portfolio profile V OBPIT is as follows (for K = 100):

100100

Risky asset value S

Portfolio value VT

FIGURE 9.1: OBPI portfolio value as function of S

9.1.2 Extensions of the OBPI method

Several extensions can be proposed. They are mainly based on other kindsof options to be included in the portfolio. Among them, more general polyno-mial options can be substituted for the standard European options. As shownin Chapter 10, the choice of such options can be based on risk aversion andutility maximization.

9.1.2.1 Polynomial options

Assume that the guarantee constraint is as follows. At maturity, the port-folio value VT must be always higher than

FT = hc (ST ) .

For example, consider a linear function hc (ST ) = aST + b. The investor hasa fixed guarantee b whatever the market evolution, and also a minimal per-centage a of the potential market rise. Moreover, the underlying asset of theoption is a power function of the risky asset price at maturity.

Then, we are led to the following combination:

VT = d.SmT + ([aST + b]− d.Sm

T )+ = [aST + b] + (d.SmT − [aST + b])+ . (9.9)

For a simple fixed guarantee, we get:

VT = d.SmT + (K − d.Sm

T )+ = K + (d.SmT −K)+ . (9.10)


The figure 9.2 provides an example of such a power option payoff.

FIGURE 9.2: Call-power option profiles

Figure 9.3 provides examples of paths of call-power portfolios and of stan-dard OBPI one for three types of paths of the risky asset S and three differentvalues of m. The OBPI strategy is obviously a special case of call power optionwith m = 1 (linear case).

In order to compare the two methods, the same path of S (drop, rise andstability) corresponds to each row.

We note that the guaranteed level strongly depends on the value of theparameter m. This result is quite intuitive since, in order to get a terminalpayoff more and more convex, the investor must more and more bear the riskof a portfolio value drop.

The first column provides the path of Call-power options with m = 0.6 < 1.Above the guaranteed level, the payoff is concave. For a bearish market, theportfolio value reachs the floor very quickly but remains always above theOBPI payoff. For a bullish market, the concavity of the payoff implies thatthe portfolio value is smaller than the OBPI one. Then, the concavity allowsthe “reduction” of the variations of the risky asset, as illustrated by the lastgraph in the first column.

The second column corresponds to a medium case with m = 2. The guar-anteed level is reduced from 98.96% (previous case) to 83.72% of the initialinvested amount, as illustrated by the first graph corresponding to a marketdecrease. The last graph shows that, contrary to the concave case, the risky


FIGURE 9.3: Call-power option paths

asset variations are amplified.The last column corresponds to the smaller guaranteed level with very

volatile paths. The next table summarizes informations about the four firstmoments: mean, standard deviation, skewness, and kurtosis, and also theguaranteed levels for the different values of m. Let V0 = 100.

TABLE 9.1: OBPI and Call-power momentsm Guarantee E(VT ) σ(VT ) Skew Kurto0.6 (concave) 98.96% 103.14 5.733 1.33 5.051 (linear = OBPI) 94.72% 103.71 9.934 1.4345 5.592 (convex) 83.72% 104.88 22.06 1.5971 6.173 (convex) 72.33% 106.31 36.52 1.98 8.824 (convex) 60.86% 108.06 55.07 2.61 14.02Asset S 0% 104.23 14.93 0.35 3.13

The call-power moments are increasing functions of m, whereas the guar-anteed level is decreasing w.r.t. the parameter m.


The following two figures correspond respectively to the pdf and cdf ofcall-power options.

FIGURE 9.4: Call-power option pdf

FIGURE 9.5: Call-power option cdf


Note that all cdf graphs intersect each other, which means that no stochas-tic dominance exists. Besides, the higher the possible gain, the smaller theguaranteed level.

Note that, for the case hc (ST ) = aST + b, the portfolio value VT also hasthe following form:

VT = aST + b +

d.SmT − aST︸︷︷︸Q(S)

− b

+

.

The study of such portfolio payoffs (see for example [417]) requires the ex-amination of polynomial option properties.

As shown by Macovschi and Quittard-Pinon [370], a polynomial option canbe decomposed as a sum of power options. Indeed, we have to search for rootsof the characteristic polynomial function associated to the payoff function.Let Q(S) =

∑nj=1 ajS

j be a polynomial function with degree n, and let bbe a positive scalar such that the polynomial function P (S) = Q(S) − b hasexactly p non-negative roots λ1, λ2, ..., λp with λ1 < λ2 <, ..., < λp. Then theEuropean call with strike b can be valued from the relation:

max (Q(S)− b, 0) =⇒ CQ(b) =n∑

j=1

aj

(p∑

k=1

(−1)k+1Cj

((λk)j

)), (9.11)

where Cj (H) is the power-j option, with strike H and maturity T.

The following example illustrates the possible payoffs of such portfoliosaccording to the values of parameters d and m.

Example 9.2Assume that the financial parameters are given by:

µ = 0.1, σ = 0.2S0 = 100 and r = 3%.

Assume also that the investor’s characteristics are the following ones:

V0 = 100, T = 5, and a = 0.7, b = 90, d = 8.7253.10−11, m = 5.8333.

The portfolio value with guarantee constraint is given by:

V ∗∗T = 0.7ST + 90 + max

(8.7253.10−11.S5.8333

T − 0.7ST − 90; 0).

The characteristic polynomial function associated to the payoff is equal to:

P (S) = 8.7253.10−11.S5.8333T − 0.7ST − 90.


It has only one positive root, λ = 129.1958.

From Equation (9.11), we deduce the polynomial option value:

CQ(90) =n∑

j=1

ajCj

((λ) ,j

)= −0.7.C1(129.1958) + 8.7253.10−11.C3

(129.19585.8333

),

where C1(K) and C3(K) are respectively equal to the standard Black-Scholescall value and the Black-Scholes value of the power-5.8333 option with strikeK.

Recall that the Black-Scholes power-m option value is given by:

V0 = e−r(1−m)T[S0N(d1)−Ke−rTN(d2)

],

= Sm0 e[(r+

12σ

2m)T (m−1)]N(d1)−Ke−rTN(d2),

with

d1 =ln(

αSm0 e12σ

2m(m−1)T

K

)+(mr + 1

2 (mσ)2)T

mσ√T

,

andd2 = d1 −mσ

√T .

Then, we get:CQ(90) = 18.9.

Thus, we deduce the initial portfolio value which allows the investors toreceive such guarantee at maturity:

V ∗∗0 = 0.7S0 + 90.e−rT + CQ(90) = 166.37.

This investor is sure to recover 54.09% of his initial investment, at maturity.


The following figure indicates the terminal values of the portfolio as functionof the risky asset values ST .

FIGURE 9.6: Convex case with linear constraints

An example of concave profile with linear constraint:

FIGURE 9.7: Concave case with linear constraints


9.1.2.2 Other possible extensions

Obviously, other options can be introduced to provide a percentage of thepotential market rise. Indeed, any portfolio with terminal value VT such that:

VT = K + HT , (9.12)

where HT is a positive random variable is always above the level K.

The choice of a particular payoff HT may depend on:

• Market predictions: rise or drop of the financial market, volatility levels,etc.

• The type of risky assets: financial index, hedge fund, etc.

• The insurance cost associated to each chosen derivative: lookback op-tions, corridor options, etc.

REMARK 9.3 As seen in the next chapter, from a theoretical point ofview, it may be optimal to use options on a power of the risky asset.

From Relations (9.1 and 9.2), the reduction of the insurance cost can allowthe investor to get a higher percentage of the market rise.

Among possible options, consider for example the capped Call CcappT :

CcappT = K + Min(Max(ST −K, 0);K ′), (9.13)

which always provide a guaranteed amount equal to K, is equal to the riskyasset value ST when K ≤ ST ≤ K ′, but remains constant equal to K + K ′

for high values of ST .

This allows reduction of the insurance cost and is profitable if the value ST

remains below K ′.

However, options must often be synthesized by a dynamic hedging strategywhen they are not available on the financial market.

Therefore, standard problems appear, such as imperfect hedging, transac-tion costs, and rebalancing strategy, which generally increase the insurancecost.


9.2 The Constant Proportion Portfolio Insurance

The CPPI, introduced by Perold [406], uses a simplified strategy to allocateassets dynamically over time. The investor starts by setting a floor equal tothe lowest acceptable value of the portfolio. Then, he computes the cushionas the excess of the portfolio value over the floor and determines the amountallocated to the risky asset by multiplying the cushion by a predeterminedmultiple. Both the floor and the multiple are functions of the investor’s risktolerance and are exogenous to the model. The total amount allocated to therisky asset is known as the exposure. The remaining funds are invested in thereserve asset, usually T-bills.

6

-

Floor F

Portfolio value V

Cushion

Time

F0

V0

FIGURE 9.8: Portfolio value and cushion

The higher the multiple, the more the investor will participate in a sustainedincrease in stock prices. Nevertheless, the higher the multiple, the faster theportfolio will approach the floor when there is a sustained decrease in stockprices. As the cushion approaches zero, exposure approaches zero too. Incontinuous time, this keeps the portfolio value from falling below the floor.Portfolio value will fall below the floor only when there is a very sharp dropin the market before the investor has a chance to trade.

In what follows, we refer mainly to Prigent [413].


9.2.1 The standard CPPI method

The CPPI method consists of managing a dynamic portfolio so that its valueis above a floor F at any time t. The value of the floor gives the dynamicalinsured amount. It is assumed to evolve as a riskless asset, according to:

dFt = Ftrdt. (9.14)

Obviously, the initial floor F0 is less than the initial portfolio value V CPPI0 .

The difference V CPPI0 −F0 is called the cushion, denoted by C0. Its value Ct

at any time t in [0, T ] is given by :

Ct = V CPPIt − Ft. (9.15)

Denote by et the exposure, which is the total amount invested in the riskyasset. The standard CPPI method consists of letting

et = mCt, (9.16)

where m is a constant called the multiple. Note that the interesting case forportfolio insurance corresponds to m > 1, that is, when the payoff function isconvex and can provide significant percentage of the market rise.

Assume that the risky asset price process (St)t is a diffusion with jumps:

dSt = St−[µ(t, St)dt + σ(t, St)dWt + δ(t, St)dγ], (9.17)

where (Wt)t is a standard Brownian motion, independent from the Poissonprocess with measure of jumps γ.

In particular, this means that:

• The sequence of random times (Tn)n corresponding to jumps satisfiesthe following properties: the interarrival times Tn+1−Tnare independentand have the same exponential distribution with parameter denoted byλ.

• The relative jumps of the risky asset ∆STnSTn

are equal to δ(Tn, STn). Theyare supposed to be strictly higher than −1 (in order for the price S tobe stricly positive).


We deduce the portfolio value:

PROPOSITION 9.1i) The value of this portfolio V CPPI

t at any time t in the period [0, T ] is givenby:

V CPPIt (m,St) = F0.e

rt + Ct, (9.18)

where the cushion value Ct is equal to:

Ct = C0exp

((1−m)rt + m

[∫ t

0

(µ− 1/2mσ2)(s, Ss)ds +∫ t

0

σ(s, Ss)dWs

])×

∏0≤Tn≤t

(1 + m δ(Tn, STn)). (9.19)

ii) When the risky asset price has no jump (δ = 0), and the coefficientsµ(., .) and σ(., .) are constant, we deduce the following standard formula:


rt + αt.Smt , (9.20)

where

αt =(

C0

Sm0

)exp [βt] and β =

(r −m

(r − 1

2σ2

)−m2σ

2

2

). (9.21)

PROOF - The portfolio value of the CPPI strategy is given by:

dV CPPIt = (V CPPI

t − et)dBt

Bt+ et

dSt

St.

The CPPI strategy is based on the following relations: V CPPIt = Ct + Ft,

et = mCt, and the floor value satisfies dFt = rdt. Therefore, the cushionvalue is given by:

dCt = d(V CPPIt − Ft),

= (V CPPIt − et)dBt

Bt+ (et)dSt

St− dFt,

= (Ct + Ft −mCt)dBtBt

+ (mCt)dStSt− dFt,

= (Ct −mCt)dBtBt

+ (mCt)dStSt

,

= Ct[r + m (µ(t, St)− r) dt + mσ(t, St)dWt + mδ(t, St)dγ].

Consequently, the cushion value is a stochastic exponential (the so-calledDoleans-Dade exponential as shown in Appendix B).

Thus, the cushion value Ct at any time t is given by:

C0exp

((1−m)rt + m

[∫ t

0

(µ− 1/2mσ2)(s, Ss)ds +∫ t

0

σ(s, Ss)dWs

])×

∏0≤Tn≤t

(1 + m δ(Tn, STn)).


When the risky asset price has no jump and the coefficients µ(., .) and σ(., .)are constant, we have:

Ct = C0exp[((m(µ− r) + r − m2σ2

2)t + mσWt)]. (9.22)

Then, using the relation St = S0 exp[σWt +

(µ− 1

2σ2)t], we deduce:

Wt =1σ

[ln(St

S0

)−(µ− 1

2σ2

)t

].

Therefore, substituting Wt in the expression of Ct, we have:

Ct (m,St) = C0

(St

S0

)m

exp[(

r −m

(r − 1

2σ2

)−m2σ

2

2

)t

]= αt.S

mt ,

where

αt =(

C0

Sm0

)exp [βt] and β =

(r −m

(r − 1

2σ2

)−m2σ

2

2

).

Finally, the CPPI portfolio value is given by:


rt + αt.Smt .

REMARK 9.4 Consequently, the guarantee constraint is satisfied as soonas the relative jumps are such that:

δ(Tn, STn) ≥ −1/m. (9.23)

Thus, when the risky asset jumps are higher than a constant, then the condi-tion 0 ≤ m ≤ −1/d allows the positivity of the cushion value. For example,if d is equal to −20%,we have m ≤ 5. If d is equal to −10%,we have m ≤ 10.Note that these upper bounds on the multiple do not depend on the proba-bility distribution of the jump times ∆STn .

REMARK 9.5 - When the risky asset price is a geometric Brownianmotion (δ = 0 and µ and σ are constant), the cushion value is given by:

Ct = C0emσWt+[r+m(µ−r)−m2σ2

2 ]t with C0 = V0 − P0. (9.24)

In that case, the portfolio and cushion values are path independent. Theyhave lognormal distributions, up to the deterministic floor value FT for theportfolio. The volatility is equal to mσ, and the instantaneous mean is givenby r + m(µ − r). This illustrates the leverage effect of the multiple m: thehigher the multiple, the higher the excess return but also the volatility. Sincewe have V CPPI

t (m,St) = F0.ert + αt.S

mt , the portfolio profile is convex as

soon as the multiple m is higher than 1.


9.2.1.1 Levy process case

Assume that δ(.) is not equal to 0, and µ and σ are constant. Supposealso that the jumps are iid with probability distribution H(dx), with finitemean E[δ(T1, ST1)] denoted by b and E[δ2(T1, ST1)] < ∞ equal to c. Thelogarithmic return of the risky asset S is a Levy process (see Appendix B).Then, we deduce:

PROPOSITION 9.2- The first two moments of the portfolio values are given by:

E[Vt] = (V0 − P0)e[r+m(µ+bλ−r)]t + P0ert,

V ar[Vt] = (V0 − P0)2e2[r+m(µ+bλ−r)]t[em2(σ2+cλ)t − 1].

(9.25)

- The pdf of the cushion Ct is determined as follows. Let g0,t denote the pdfwithout jump. The function g0,t is defined by:

g0,t(x) =Ix>0

x√

2πσ2m2te− 1

2σ2m2t

(ln[xC0

]−t(r+m(µ−r)−m2σ2

2 ))2

. (9.26)

When jumps can occur, we have:

gCt(x) =∞∑

n=0

e−λt (λt)n

n!

∫IRn

g0,t(x∏

i≤n(1 + myi))Kn(dy1, ..., dyn)∏

i≤n(1 + myi), (9.27)

where Hn denotes the joint distributon of the relative jumps δ(Ti, STi), i ≤ n.Since they are assumed to be independent with the same distribution H, wededuce the relation:

Hn(dy1, ..., dyn) = ⊗H(dyi). (9.28)

Consequently, the pdf of the portfolio value is given by:

fV t(x) = gCt(x− P0ert). (9.29)

REMARK 9.6 From the mean-variance point of view, note that bothmean and variance of the CPPI portfolio value are increasing functions of themultiple m and decreasing w.r.t. the initial floor value F0. It is not possibleto optimize w.r.t. the multiple m according to Markowitz criterion. Indeed,for a fixed mean level L the intial floor value F0 is a function of the multiplegiven by:

F0(m) =LV0 − V0e

[r+m(µ+bλ−r)]t

ert[1− em(µ+bλ−r)t], (9.30)

which implies

V ar[Vt/V0] = (ert − L)2[em

2(σ2+λc)t − 1][1− e−m(µ+bλ−r)t]2

, (9.31)


which is a function of the multiple not having a real minimum.In fact, the expectation of gains is not the first goal: the guarantee level

is crucial. When it has been determined, the multiple allows to adjust thepercentage of anticipated profit when the financial market rises.

The mean of the portfolio return is an increasing function of the multi-ple m. However, as mentioned previously in relation (9.23), the risky assetdiscontinuities lead to upper bounds on the multiple m.

In order to reduce this constraint, another weaker guarantee condition canbe considered which can be based on Value-at-Risk or Expected Shortfallcriterion. For example, consider the VaR type condition:

P[Ct ≥ 0, ∀t ≤ T ] ≥ 1− ε, (9.32)

where ε is “small.” This condition is equivalent to:

P[∀t ≤ T,∆St

St≥ −1

m] ≥ 1− ε. (9.33)

Denote by H the cdf of relative jumps and asume that it is strictly increas-ing. Then, we deduce:

PROPOSITION 9.3

The condition

P[Ct ≥ 0, ∀t ≤ T ] ≥ 1− ε

is equivalent to the following condition on the multiple m :

m ≤ −1

H(−1)(

1λT ln( 1

1−ε )) . (9.34)

PROOF The VaR guarantee condition fo a given threshold 1 − ε is thefollowing:

P[Ct ≥ 0, ∀t ≤ T ] ≥ 1− ε. (9.35)

The cushion value provided in relation (9.19) shows that the variation due tojumps is equal to

∏0≤Tn≤t(1 +m δ(Tn, STn). This term must remain positive

with a probability equal to 1 − ε at any time t. Therefore, each term in thisproduct must be positive. This is equivalent to:

P

[∩Tn≤T

δ(Tn, STn) ≥ −1

m

]≥ 1− ε. (9.36)


Denote by NT the number of jumps before maturity T . Since in the Levycase jumps are independent from their occurence times, we deduce:

P

[∩Tn≤T


m

]=∑

k

P

[∩n≤k


m

]P [NT = k] . (9.37)

The random variable NT , which counts the number of jumps during the timeperiod [0, T ], is a Poisson distribution with parameter λT. Therefore, the VaRguarantee constraint at the threshold 1− ε is equivalent to:

m ≤ −1

H(−1)(

1λT ln( 1

1−ε )) . (9.38)

REMARK 9.7 The latter condition (9.38) provides an upper bound onthe multiple which is obviously higher than the opposite of the infimum dof the range of the distribution H. Note also that this upper bound is adecreasing function of the jump intensity λ. For high values of λ, the upperbound converges to −1/d .

9.2.1.2 Discrete-time case

Suppose now that the investor trades according a discrete-time grid: tk,k ≤ n. For example, he wants to limit transaction costs or portfolio assetsare not sufficiently liquid. In this framework, the CPPI portfolio value can bedetermined as follows

Denote by Xk the opposite of the arithmetical return of the risky assetbetween tk−1 and tk. We have:

Xk = −Stk − Stk−1

Stk−1

. (9.39)

Consider the maximum M of these values. For each l, define:

Ml = Max(X1, ..., Xl).

Denote by Vk the portfolio value at time tk. The guarantee constraint is tokeep the portfolio value Vk above the floor Pk. The exposure ek invested inthe risky asset Sk is equal to mCk, where the cushion value Ck is equal toVk−Pk. The remaining amount (Vk−ek) is invested in the riskless asset withreturn rk on the time period [tk, tk+1] .


Therefore, the dynamic evolution of the portfolio value is given by (similarto the continuous-time case):

Vk+1 = Vk − ekXk+1 + (Vk − ek)rk+1, (9.40)

where the cushion value is defined by:

Ck+1 = Ck [1−m Xk+1 + (1−m)rk+1] . (9.41)

Since at any time tk, the cushion value must be positive, we get, for anyk ≤ n,

−mXk + (1−m)rk ≥ −1.

Besides, since rk is relatively small, the previous inequality is “approximately”equivalent to:

∀k ≤ n,Xk ≤ 1m

then also Mn = Max(Xk)k≤n ≤ 1m

.

We deduce that the guarantee condition is satisfied as soon as the multiplem is smaller than d (equal to the infimum f the range of the distribution ofthe random variables Xk).

For a VaR guarantee constraint at the level (1−ε) on the time period [0, T ]:

P[Ct > 0, ∀t ∈ [0, T ]] ≥ 1− ε,

the maximum Mn of the values Xk at times tk during [0, T ] must satisfy:

P[∀tk ∈ [0, T ] , Xk <1m

] = P[Mn <1m

] ≥ 1− ε. (9.42)

According to the distributions of the random variables Xk, upper boundson the multiple are deduced.

Suppose for example that the random variables Xk are iid with cdf H whichis assumed to have an inverse function H(−1).

Then, the following condition must be satisfied by the multiple m:

m <1

F−1(1− n√

1− ε). (9.43)


9.2.1.3 Random rebalancing case

When the cushion rises due to market fluctuations, the exposure may beclose to the maximum amount that the investor wants to invest in the riskyasset. When the exposure is below this limit, the investor can have a tolerancew.r.t. these market variations. It means that he can fix a percentage of marketdrops and rises such that, when market fluctuations are above this level, herebalances his portfolio.

Consider for example a lower bound m and an upper bound m on themultiple m∗. The investor begins by choosing an initial floor F0, a quantityθS0 , invested in the risky asset S and a quantity θB0 invested in the risklessasset B.From initial conditions, we deduce:

θS0 =m∗(V0 − P0)

S0. (9.44)

The portfolio is rebalanced as soon as the ratio etCt

is smaller than m orhigher than m. If θB0 < P0

B0, then, for the geometric Brownian case, the

conditionm ≤ et

Ct≤ m (9.45)

is equivalent to:A ≤ Xt ≤ B, (9.46)

where (Xt)t is a Brownian motion with drift, defined by:

Xt = (µ− r − 1/2σ2)t + σWt,

and the paramaters A and B are given by:A = Ln(

mm−1

(P0−θB0 B0)m∗(V0−P0)

)= Ln

(m

m−1(m∗−1)

m∗

),

B = Ln(

mm−1

(P0−θB0 B0)m∗(V0−P0)

)= Ln

(m

m−1(m∗−1)

m∗

).

The conditional distribution of the rebalancing times is characterized bythe exist time of the process (Xt)t from the corridor A,B. This probabilitydistribution is deduced from the trivariate distribution of the maximum, min-imum, and terminal value of the Brownian motion (see Revuz and Yor [423]or Borodine and Salminen [85]).

The pdf of this joint distribution with a constant drift ρ is defined by, forany x in A,B:

g(x,A,B) = exp[ρx

σ2− ρ2t

2σ2]×

+∞∑n=−∞

1σ√t

(φ(

x − 2n(B −A)σ√t

)− φ(x − 2n(B −A)− 2A

σ√t

)), (9.47)


where φ is the pdf of the standard Gaussian distribution and N its cdf.

If A < 0 and B > 0, then the distribution of the first passage time T1 isgiven by:

P[T1 ≤ t] = 1− P[Maxs≤tXs ≤ B,Mins≤tXs ≥ A] = 1−∑+∞n=−∞ e2nρ(B−A)/σ2

[N(B−ρt−2n(B−A)

σ√

t)−N(A−ρt−2n(B−A)

σ√

t)]

−e2Aρ/σ2[N(B−ρt−2n(B−A)−2A

σ√

t)−N(A−ρt−2n(B−A)−2A

σ√

t)].

After the firsttime T1, the new portfolio value is determined from the fol-lowing relations: The new initial floor is equal to P0e

rT1 . The quantities θST1

and θBT1respectively invested in S and B are determined from rebalancing

conditions. We have in particular:

θST1=

m∗(VT1 − P0erT1)

ST1

. (9.48)

From the Levy property, the distribution of the new interarrival time T2−T1

is deduced from the previous one, by stopping all processes at time T1. Notethat when jumps can occur but the logarithmic return is still a Levy process,the previous cdf can be defined from infinite expansions.

9.2.2 CPPI extensions

The CPPI method is based on an exposure e which is a simple linear func-tion w.r.t. the cushion. It can be extended by introducing a more generalexposure function e(t, x), defined on [0, T ]×R+, which is assumed to be pos-itive and continuous and has the following form:

et = e(t, Ct). (9.49)

Consequently, the cushion is the solution of the following SDE:

dCt = α(t, St, Ct)dt + β(t, St, Ct)dWt + γ(t, St, Ct)dµ,withα(t, St, Ct) = rCt + e(t, Ct))[a(t, St)− r],

β(t, St, Ct) = e(t, Ct)σ(t, St),γ(t, St, Ct) = e(t, Ct)δ(t, St).

(9.50)

The positivity of the cushion is controlled by an appropriate choice of thefunction e(., .):

• If the cushion is null then the exposure e(t, 0) must be equal to 0.

• If the relative jumps ∆SS− are higher than a fixed constant d (negative),

then for any (t, x), we must have e(t, x) ≤ − 1dx.


REMARK 9.8 The implications of this method can be analyzed by con-sidering conditions on buying and selling, probability distribution of the cush-ion value, etc. Indeed, a more general exposure function allows for a betterperformance since the portfolio manager has more available parameters toadjust the portfolio profile.

Such a choice is compatible with tactical allocation and can be applied withfixed income instruments or hedge funds. For example, the multiple m canbe dynamically chosen to take account of the implicit volatility of options ofthe financial market or other factors.

We can also impose a more general guarantee condition

VT ≥ FT

at maturity T , where FT is a contingent claim such that for example,

FT > P0erT .

Sometimes, the investor wants to keep until maturity some part of theportfolio gains. For this purpose, the Time Invariant Protection Insurance(TIPP), introduced by Ested and Kritzman [213] can be used. This methodallows to have at any time t:

Vt ≥ kMax(Ft, sups≤tVs), with 0 < k < 1. (9.51)

In particular, it means that the investor does not want to lose more than agiven percentage of the maximum of previous portfolios values. Denote:

Xt = Max(Ft, sups≤tVs).

In that case, the extended CPPI method is based on the new floor kXt andthe new exposure:

et = m (Vt − kXt) .

We can also consider a more general function of Max(Ft, sups≤tVs) :

et = e (t,Max(Ft, sups≤tVs)) .

As presented in the next chapter, we can weaken the constraint VT ≥ FT byimposing only this condition at a given probability threshold. The insurancecost is reduced but the guarantee is no longer sure.


9.3 Comparison between OBPI and CPPI

In what follows we refer to Bertrand and Prigent [60]. We assume thatthe same initial amount V0 is invested at time 0, and also that the sameguarantee K holds at maturity. We suppose also that the risky asset pricefollows a geometric Brownian motion.

9.3.1 Comparison at maturity

9.3.1.1 Comparison of the payoff functions

Is it possible that the payoff function of one of these two strategies liesabove the other for all ST values? Since the initial investments are equal(V OBPI

0 = V CPPI0 ), the absence of arbitrage implies the following result.

PROPOSITION 9.4Neither of the two payoffs is greater than the other for all terminal values ofthe risky asset. Therefore, the two payoff functions intersect one another.

Figure 9.9 illustrates a numerical example with typical values for the finan-cial market parameters: µ = 10%, σ = 20%, T = 1, r = 5%, K = S0 = 100.Note that as m increases, the payoff function of the CPPI becomes moreconvex.

FIGURE 9.9: CPPI and OBPI payoffs as functions of S


For this example, the two curves intersect one another for the differentvalues of m considered (m = 2, m = 4, m = 6, and m = 8).

CPPI performs better for large fluctuations of the market, while OBPIperforms better in moderate bullish markets.

9.3.1.2 Comparison with the stochastic dominance criterion

The first-order stochastic dominance allows us to take account of the riskydimension of the terminal payoff functions for both methods.

Recall that a random variable X stochastically dominates a random variableY at the first order (X Y ) if and only if the cumulative distribution functionof X , denoted by FX , is always below the cumulative distribution functionFY of Y.

PROPOSITION 9.5Neither of the two strategies stochastically dominates the other at first order.

9.3.1.3 Comparison of the expectation, variance, skewness, andkurtosis

Since option payoffs are not linear w.r.t. the underlying risky asset, themean-variance criterion is not always justified. Thus, we examine simultane-ously the first four moments and the semi-variance of the rates of portfolioreturns ROBPI

T and RCPPIT .

PROPOSITION 9.6The equality of return expectations of both strategies,

E[ROBPIT ] = E[RCPPI

T ],

leads to a unique value for the multiple, denoted by m∗ (K) , for any fixedguaranteed amount K. In the Black and Scholes framework, this multiple isequal to:

m∗ (K) = 1 +(

1(µ− r)T

)ln(C(0, S0,K, µ)C(0, S0,K, r)

), (9.52)

where C(0, S0,K, x) is the Black-Scholes value of the call option, and x de-notes all possible values of the riskless rate.

PROPOSITION 9.7For any parametrization of the financial markets (S0,K, µ, σ, r), there existsat least one value for m such that the OBPI strategy dominates (is dominated),in a mean-variance (mean-semivariance) sense, (by) the CPPI one.


The following example gives an illustration with the previous values of theparameters.

The multiple m, solution of E[ROBPIT ] = E[RCPPI

T ], is equal to 5.77647.

The next table contains the first four moments and the semi-volatility forthe OBPI with an at-the-money call, and for the corresponding CPPI withthis particular value of the multiple.

TABLE 9.2: Comparison of the firstfour moments and semi-volatility

OBPI CPPIexpectation 8.61176 % 8.61176 %volatility 16.8625 % 23.2395 %semi-volatility 9.1676% 7.7666%relative skewness 1.49114 9.70126relative kurtosis 5.4576 357.73

• The OBPI dominates the CPPI in a mean-variance sense, but is domi-nated by the CPPI if semi-volatility is considered. This is confirmed bythe relative skewness.

• Nevertheless, the CPPI has a higher positive relative skewness than theOBPI, and should be preferred to OBPI for this criterion.

• However, CPPI relative kurtosis is much higher than the OBPI one.This is due to the dominance of the CPPI payoff for small and highvalues of the risky asset S, as shown in Figure 9.9.

• Note that, here, owing to the insurance feature, kurtosis arises mainlyin the right tail of the distribution.

9.3.1.4 Comparison of “quantiles”

Since the distributions to be compared are strongly asymmetrical, the studyof the moments is not sufficient. The whole distribution has to be considered.The next figure loosely illustrates the situation, where both payoff functionsand risky asset density are depicted.


FIGURE 9.10: CPPI and OBPI payoffs and probability of S

To examine the effect of probabilities, we study the distribution of thequotient of the CPPI value to the OBPI one. The plot of the cumulativedistribution function of V OBPI

T

V CPPIT

for different values of K is:

FIGURE 9.11: Cumulative distribution of V OBPIT

V CPPIT


This figure shows in particular that:

• For the at-the-money call (K = 100 and thus p = 94.72%), the prob-ability that the CPPI portfolio value is higher than the OBPI one isapproximately 0.5, meaning that neither of the strategies “dominates”the other.

• This is no longer true for K = 90 (thus p = 87.97%), where the prob-ability that the CPPI portfolio value is above the OBPI one is about0.4.

• For K = 110 (thus p = 99.39%), this probability takes the value 0.7.

REMARK 9.9

• This arises because the probability of exercising the call decreases withthe strike. Recall that the strike K is an increasing function of theinsured percentage p of the initial investment. Thus, as the guaranteedpercentage p rises, the CPPI method is more desirable than the OBPImethod.

• Notice that, for in- and out-of-the-money calls, extreme values of thequotient are more likely to appear:

- On the one hand, the CPPI portfolio value can be at least equal to106% of the OBPI portfolio value with probability 5% (respectivelyabout 0%) when K = 90 (respectively K = 110).

- On the other hand, the CPPI portfolio value can be at most equal to94% of the OBPI portfolio value with probability about 0% (respectively18%) when K = 90 (respectively K = 110).

• The same qualitative results are obtained for other usual values of themultiple (m between 2 and 8), which confirms the key role played bythe insured percentage of the initial investment.


9.3.2 The dynamic behavior of OBPI and CPPI

In many situations, the use of traded options is not possible. For example,the portfolio to be insured may be a diversified fund for which no single optionis available. The insurance period may also not coincide with the maturityof a listed option. Thus, for all these reasons, the OBPI put often has tobe synthesized. In this framework, both CPPI and OBPI induce dynamicmanagement of the insured portfolio. In what follows:

• First, it is proven that the OBPI method is a generalized CPPI with avariable multiple. The study of this multiple allows quantification of itsrisk exposure.

• Second, portfolio rebalancing implies hedging risk. Hence, hedging prop-erties of both methods are to be analyzed, in particular the behavior ofthe quantity to invest on the risky asset at any time during the man-agement period.

For the CPPI method, the key parameter is the multiple. Indeed, it determinesthe amount invested in the risky asset at any time. Does there exist such an“implicit” parameter for the OBPI?

9.3.2.1 OBPI as a generalized CPPI

PROPOSITION 9.8For the geometric Brownian motion case, the OBPI method is equivalent tothe CPPI method in which the multiple is allowed to vary and is given by

mOBPI (t, St) =StN (d1 (t, St))C(t, St,K)

. (9.53)

This coefficient is the ratio of the delta of the Call N (d1 (t, St)) multipliedby the risky asset price St. It is equal to the risk exposure divided by thecushion value, which is equal to the Call value C(t, St,K).

In this framework, the OBPI method looks like a CPPI one.

Note that:

• The generalized multiple mOBPI , associated to the OBPI method, is adecreasing function of the risky asset price S, at any time t.

• The multiple mOBPI takes higher values than usual CPPI multiples,except when the associated Call is in-the-money.

• This implies that, for a bullish market, the OBPI method limits morethe risk exposure.


The following two figures illustrate these properties.

m

S

5

10

15

12080

FIGURE 9.12: Multiple OBPI as function of S

The OBPI multiple takes higher values than the standard CPPI multiple,except when the associated call is in-the-money. In particular, in a risingmarket, the OBPI method prevents the portfolio being over-invested in therisky asset, as the multiple is low.

FIGURE 9.13: OBPI multiple cumulative distribution


We now study the dynamic properties of the two strategies, and in particulartheir “greeks.”

9.3.2.2 The Delta

The delta of the OBPI is obviously the delta of the call. For the CPPI, itis given by:

∆CPPI =∂V CPPI

t

∂St= αmSm−1

t . (9.54)

The following figure shows the evolution of the delta as a function of therisky asset value St.

FIGURE 9.14: CPPI and OBPI delta as functions of S

It can be observed in the previous figure that the behavior of the deltaof the two strategies are different. For the CPPI, not surprisingly, the deltabecomes more convex with m and the delta can be greater than one.

For a large range of the values of the risky asset, the delta of the OBPIis greater than that of the CPPI. Moreover, this happens for the most likelyvalues of the underlying asset (i.e., around-the-money).


In order to be more precise, the probability that the delta of the OBPIis greater than that of the CPPI has to be calculated for various marketparametrizations.It can be observed that, in probability, CPPI is significantly less sensitive tothe risky asset than OBPI, as shown in the following tables. Notice that thisfinding has important practical implications.

TABLE 9.3: Probability P [∆OBPI > ∆CPPI ] fordifferent m and σ

m σ = 5% σ = 10% σ = 15% σ = 20% σ = 25%3 1.000 0.991 0.970 0.945 0.9214 1.000 0.987 0.961 0.930 0.8765 1.000 0.983 0.946 0.860 0.7596 0.999 0.978 0.884 0.748 0.6727 0.999 0.949 0.782 0.661 0.6368 0.999 0.881 0.685 0.616 0.6309 0.992 0.788 0.616 0.599 0.64010 0.96 0.69 0.58 0.60 0.66

TABLE 9.4: Probability P [∆OBPI > ∆CPPI ] fordifferent m and µ

m µ = 5% µ = 10% µ = 15% µ = 20% µ = 25%3 0.925 0.930 0.943 0.951 0.9535 0.861 0.860 0.850 0.830 0.8016 0.774 0.748 0.712 0.667 0.6167 0.706 0.661 0.610 0.554 0.4958 0.670 0.616 0.558 0.497 0.4369 0.657 0.599 0.537 0.475 0.41310 0.657 0.598 0.536 0.473 0.411

The previous features are made clearer by examining the distribution ofthe ratio ∆OBPI

∆CPPI . The next figure shows that the probability that the CPPIdelta is smaller than the OBPI one is a decreasing function of the strike K(or equivalently, of the insured amount). Note that, for small values of K, therange of possible values of the ratio ∆OBPI

t

∆CPPIt

spreads out.The following figure shows the evolution of the delta with time. Whatever

the level of S compared to the level of the insured level at maturity, K, thedelta of the CPPI is decreasing with time. For the OBPI, the evolution of thedelta obviously depends on the moneyness of the option.


FIGURE 9.15: Cumulative distribution of ∆OBPIt

∆CPPIt

at t = 0.5

FIGURE 9.16: CPPI and OBPI delta as functions of current time.

More surprisingly, the delta of the CPPI is decreasing with the actualvolatility (since m > 1). The same feature arises when examining the ve-ga of the CPPI, since they depend in the same way on this actual volatility.For the OBPI, the result depends on the moneyness of the option.


9.3.2.3 The Gamma

The gamma of the CPPI is equal to :

ΓCPPI =∂∆CPPI

t

∂St= αm(m− 1)Sm−2

t

FIGURE 9.17: CPPI and OBPI gamma as functions of S at t = 0.5, forK = 100

For the CPPI, it is always for high values of S that the gamma is impor-tant. Nevertheless, for usual values of m, the CPPI gamma is smaller thanthe OBPI one for a large range of values of S. This is particularly true forK = 110.

This fact is important, as the magnitude of transaction costs are directlylinked to the gamma. Again, the CPPI method seems to be better suitedwhen the insured percentage, p, of the initial investment is high.

Moreover, the gamma of the CPPI is monotonically decreasing with time,although it does not reach zero at maturity. Recall that, for a call, the gammawill go to zero as the expiration date approaches if the call is in-the-money orout-of-the-money, but will become very large if it is exactly at-the-money.


FIGURE 9.18: CPPI and OBPI gamma as functions of S at t = 0.5, forK = 110

9.3.2.4 The Vega

The vega of the CPPI is defined as1:

vegaCPPI = ∂V CPPIt

∂σ

= C(0, S0,K)(

StS0

)m ((m−m2)σt

)exp [βt]

=((m−m2)σt

)V CPPIt

Thus, the sensitivity of the CPPI value with respect to the actual volatilityis negative if m > 1.

The higher the multiple, the more it decreases.

REMARK 9.10 To summarize, comparison of OBPI and CPPI, withusual criteria such as first order stochastic dominance and various momentsof their rates of return, does not allow one to discriminate clearly between thetwo strategies:

- The standard CPPI method is based on dynamic portfolio managementand thus seems more flexible.

1In the following calculation, we do not take into account the effect of the volatility onC(0, S0, K) because the call enters in the CPPI formula only to insure the compatibilitywith the OBPI at time 0. Furthermore, C(0, S0, K) depends only on the expected volatilityand not on the actual one.


FIGURE 9.19: CPPI and OBPI vega as functions of S at t = 0.5

- However, to avoid sudden drops, the multiple must be not too high. Insuch case, the OBPI strategy seems more robust if the option is well hedged.The worst scenario for the CPPI strategy compared to the OBBI one is a sud-den drop during the management period, then a financial market rise. In thatcase, for the standard method, the cushion remains null and it is no longerpossible to benefit from risky asset price increases.

- For small risky asset values, the CPPI method provides higher returnsbut, in that case, a simple riskless investment can beat the CPPI portfolio.

- As the guaranteed percentage increases, the CPPI strategy is more rele-vant than the OBPI one. This arises mainly because the OBPI call has lesschance to be exercised.

- The analysis of the dynamic properties of these two methods shows inparticular how the OBPI method can be considered as a generalized CPPImethod. They differ mainly by their vegas.

Note that implicit volatilities can be considered to better analyze the influ-ence of such a parameter since the Call option is priced from implicit volatility.For the CPPI method, a conditional multiple can be considered, based on fac-tors such as the realized or implicit volatilities.


9.4 Further reading

As mentioned in Leland and Rubinstein (1988), long term returns can ben-efit from portfolio insurance methods since they allow investment in moreagressive securities, while satisfying guarantee constraints.

Bookstaber and Langsam [82] analyze properties of portfolio insurance mod-els. They focus on path dependence, showing that only option-replicatingstrategies provide path independence. They deal also with the problem of thetime horizon and, in particular, time-invariant or perpetual strategies (stud-ied also in Black and Perold [77]).

Black and Rouhani [75] compare CPPI with OBPI when the put optionhas to be synthesized. They compare the two payoffs and examine the role ofboth expected and actual volatilities. They show that “OBPI performs betterif the market increases moderately. CPPI does better if the market drops orincreases by a small or large amount.”

Bertrand and Prigent [58] introduce general marked point processes to mod-el stock price variations. Upper bounds are provided in such framework, usingVaR type criteria. Such an approach can be further extended with other riskmeasures.

Time varying multiples can be introduced according to market turbulencefactors, as proposed by Hamidi et al. [280] using quantile regression to esti-mate the potential losses, conditionally to these factors.

Cesari and Cremonini [110] examine various dynamic asset allocations byusing Monte Carlo simulations. Risk adjusted performance measures, such asSharpe ratio, Sortino ratio, and return to risk are introduced, and strategiesare compared for different market evolutions (bullish, bearish, and no-trend).CPPI strategies seems to perform better in bear and no-trend markets. Obvi-ously, benchmarking strategies are better in bullish markets, but such strate-gies provide no true “absolute” insurance against market drops, since they“only” must be close to the benchmark (relative performance versus absoluteperformance).

Chapter 10

Optimal dynamic portfolio with risklimits

Portfolio insurance payoff is designed to give the investor the ability to limitdownside risk while allowing some participation in upside markets. Suchmethods allow investors to recover, at maturity, a given percentage of theirinitial capital, in particular in falling markets.

This payoff is a function of the value at maturity of some specified port-folio of common assets, usually called the benchmark. As is well-known bypractitioners, specific insurance constraints on the horizon wealth must begenerally satisfied. For example, a minimum level of wealth and some partic-ipation in the potential gains of the benchmark can be guaranteed. However,institutional investors for instance may require more complicated insurancecontracts.

As seen previously in Chapter 9, two standard portfolio insurance methodsare the Option Based Portfolio Insurance (OBPI) and the Constant Propor-tion Portfolio Insurance (CPPI).

However, to what extent are these methods optimal?

The literature on portfolio optimality generally considers an investor whomaximizes the expected utility of his terminal wealth by trading in continuoustime, as seen in Part 3. The continuous-time setup is also usually introducedto study portfolio insurance (see e.g., Grossman and Vila [274], Basak [46],or Grossman and Zhou [276]). El Karoui, Jeanblanc and Lacoste [192] provethat, under a fixed guarantee at maturity, an option based portfolio strategyis optimal for quite general utility functions (see also Jensen and Sorensen[302] for a particular case and [415] for a more general guarantee). The keyassumption is that markets are complete which means that all portfolio pro-files at maturity can be perfectly hedged.

Other literature is devoted to the optimal positioning problem which hasbeen addressed in the partial equilibrium context by Brennan and Solanki [88]and by Leland [347]. The value of the portfolio is a function of the benchmarkin a one period set up. An optimal payoff, maximizing the expected utility,

319


is derived. It is shown that it depends crucially on the risk aversion of theinvestor. Following this approach, Carr and Madan [106] consider markets inwhich exist out-of-the-money European puts and calls of all strikes. As theymentioned, this assumption allows the examination of the optimal positioningin a complete market and is the counterpart of the assumption of continuoustrading. This approximation is justified when there is a large number of op-tion strikes (e.g. for the S&P500). Due to practical constraints, liquidity,transaction costs, etc., portfolios are in fact discretely rebalanced.

In this chapter, the optimal insured portfolio is determined for the twocases:

• In the first section, the insurance is perfect, since the probability thatthe portfolio value is above the guaranteed level is equal to 1.

- First, a one-period market is considered. Structured portfolios withpayoffs defined as functions of the risky asset (a financial stock in-dex for example) are examined. Constraints on the horizon wealthare included. In addition, markets can be incomplete. The insuredoptimal portfolio is characterized for arbitrary utility functions, re-turn distributions, and for any choice of a particular risk neutralprobability if the market is incomplete.

- Second, the financial market is assumed to be dynamically complete.In this framework, results concerning European guarantees are ex-tended to more general guarantee constraints at maturity. Forexample, this guarantee can no longer be a fixed percentage of theinitial investment, but can involve a stochastic component.

• In the second section, the insurance is satisfied at a given probabilitylevel.

- First, the maximization of the probability of success is studied.

- Second, the expected utility maximization under VaR/CVaR con-straints is examined.

In particular, the optimal portfolio is calculated for CRRA utility functions.

Optimal dynamic portfolio with risk limits 321

10.1 Optimal insured portfolio: discrete-time case

10.1.1 Optimal insured portfolio with a fixed number of as-sets

In this section, we consider an investor who chooses a buy-and-hold strategyand can invest in three available financial assets:

• A riskless asset B;

• A risky asset S; and,

• A put written on S with strike K and initial value P0(K).

Note that this portfolio strategy is an extension of the OBPI method forwhich the number of shares for the underlying asset S and the Put writtenon S are equal.

The investor’s strategy consists of setting constant porfolio shares where:

α is the number of shares invested in B;β is the number of shares invested in S; and,γ is the number of shares invested in the Put with strike K.

Denote by VT the portfolio value:

VT = αBT + βST + γ(K − ST )+. (10.1)

The budget constraint is given by:

V0 = αB0 + βS0 + γP0(K). (10.2)

Assume that the guarantee constraint at maturity is linear:

VT ≥ aST + b,

where a corresponds to a minimal percentage of the potential rise of the riskyasset, and b is a fixed insured amount.

Three main cases have to be considered.

Case 1: 0 < β < γ (the number of purchased puts is higher than the num-ber of purchased risky assets S).

This asset allocation allows provision of the guarantee αBT + βK. Notethat the portfolio profile is convex. However, this function is decreasing forsmall S values, which is not satisfactory for most investors.


FIGURE 10.1: Optimal portfolio profiles according to instantaneous stockreturn (convex but not increasing)

Case 2: β > γ > 0 (the number of purchased puts is smaller than thenumber of purchased risky assets S).

FIGURE 10.2: Optimal portfolio profiles according to instantaneous stockreturn (convex increasing)


In this case, the investor has a guarantee equal to αBT +γK. The portfoliopayoff is still convex.

Case 3: β > 0 > γ (the investor buys S and sells the Put)

FIGURE 10.3: Optimal portfolio profiles according to instantaneous stockreturn (concave)

The guarantee is still equal to αBT + γK. However, the portfolio profileis now concave. The investor will receive more money when the risky assetdecreases, but less money when it increases.

We now search the optimal portfolio for an investor who maximizes hisutility function U , which is assumed to satisfy usual assumptions.

REMARK 10.1 More generally, the guarantee constraints are infinite-dimensional since they must be satisfied for all S values (for example, if theprobability distribution of S is lognormal). However, since both portfolio pay-offs and guarantee constraints are piecewise linear, the guarantee constraintsare reduced only to three conditions: the portfolio values must be above theguarantee function for S = 0 and S = K, and the slope for high values of Smust be higher than the slope of the guarantee function.


Therefore, the optimization problem is the following:

Max E[U(VT )]

with(guaranteeconstraint)

αBT + γK ≥ bαBT + βK ≥ aK + bβ ≥ a

,

and(budgetconstraint) V0 = αB0 + βS0 + γP0(K).

The solution is determined by using the Kuhn and Tucker approach. TheLagrangian is defined by:

L = E[U(VT )] + λ1(αBT + γK − b) + λ2(αBT + βK − aK − b)+λ3(β − a) + λ4(V0 − αB0 − βS0 − γP0(K)),

where λ4 is the Lagrange multiplier associated to the budget constraint, andwhere parameters λi are associated to guarantee constraints. They are allpositive and satisfy the necessary conditions of local optimality given by:

∂L∂α = E[U ′(VT )] + λ1 + λ2 − λ4e

−rT = 0,

∂L∂β = E[U ′(VT )ST ] + λ2K + λ3 − λ4S0 = 0,

∂L∂γ = E[U ′(VT )(K − ST )+] + λ1K − λ4P0(K) = 0.

Therefore, eight cases have to be considered according to the three guar-antee constraints which can be (or not) saturated. Three main results areobserved:Strategy 1: The investor is weakly risk-averse.Strategy 2: The investor is risk-neutral.Strategy 3: The investor is rather strongly risk-averse.The weakly risk-averse investor chooses a portfolio which is equal to the

guarantee for values of the risky asset S smaller than the strike K. The risk-averse investor has higher returns for small risky asset values, but smallerreturns for higher values of S. According to risk aversion, the porfolio profileis concave or convex.

Assume for example that the utility function is quadratic:

U(x) = x− a

2x2, with a > 0.

Then, the number of shares can be examined according to the aversion to thevariance a:

The higher the aversion to the variance a, the higher the amount investedin the riskless asset, and the smaller both the numbers of shares invested inthe risky asset and in the put.

These properties are illustrated by the following figures.


FIGURE 10.4: Optimal portfolio profile as a function of the risky assetvalue

B0 = 150, S0 = 2500, V0 = 10000, K = 2400, r = 4%, σ = 20%, µ = 5%.

FIGURE 10.5: Optimal portfolio weighting as a function of the varianceaversion (quadratic case)


10.1.2 Optimal insured payoffs as functions of a benchmark

In this section, the financial market is to be composed of three basic financialassets: the cash associated to a discount factor N , the bond B, and the stockS (a financial index, for example). We suppose that the investor determinesan optimal payoff h which is a function defined on all possible values of theassets (N,B, S) at maturity. As in Section 7.1, when there is no insuranceconstraint, the optimization problem is solved as follows:

Assume that prices are determined under measure Q. Denote by dQdP

theRadon-Nikodym derivative of Q with respect to the historical probability P .Denote by NT the discount factor, and by MT the product NT

dQdP

.The investor has to solve the following optimization problem:

MaxhEP[U(h(NT , BT , ST )] under V0 = EP[h(NT , BT , ST )MT ]. (10.3)

Assume that h ∈ L2(R+3,PXT (dx)) where XT = (NT , BT , ST ) which is theset of the measurable functions with squares that are integrable on R+3 withrespect to the distribution PXT (dx).

Now, the investor introduces a specific guarantee, which can be institution-al, or may imply an additional insurance against risk. If, for example, theinterest rate is not stochastic, such a guarantee can be modelled by letting afunction h0 be defined on the possible values of the benchmark ST : whateverthe value of ST , the investor wants to get a final portfolio value above thefloor h0(ST ). For instance, if h0 is linear with h0(s) = as+ b, then, when thebenchmark falls, the investor is sure of getting at least b (equal to a fixed per-centage of his initial investment), and if the benchmark rises, he make profitsout of the rises at a percentage a.

The optimal payoff with insurance constraints on the terminal wealth is thesolution of the following problem:

MaxhEP[U(h(XT )]V0 = EP[h(XT )MT ],h(XT ) ≥ h0(XT ).

As can be seen, the initial investment V0 must be higher than EP[h0(XT )MT ]if the insurance constraint must be satisfied.

The solution of this problem is given in Prigent [415]. To solve it, introducethe sets

H1 = h ∈ L2(R+3,PXT )|V0 = EP[h(XT )MT ],and

H2 = h ∈ L2(R+3,PXT )|h ≥ h0.The set H = H1 ∩ H2 is a convex set of L2(R+3,PXT ). Consider the

following indicator function of H , denoted by δH and defined by:

δH(h) =

0 if h ∈ H,+∞ if h /∈ H.


Since H is closed and convex, δH is lower semi-continuous and convex.

Recall the notion of subdifferentiability (see, e.g., Ekeland and Turnbull[187] for the definition and properties of subdifferentials).

Let V denote a Banach space and < ., . > the duality symbol.

DEFINITION 10.1 1) For any function F defined on V with values inIR∪+∞, a continuous affine functional l : V → IR everywhere less than F .This means that:

∀v ∈ V, l(v) ≤ F (v) is exact at v∗ if l(v∗) = F (v∗). (10.4)

2) A function F : V → IR ∪ +∞ is subdifferentiable at v∗ if there exists acontinuous affine functional l(.) =< ., vl > −a, everywhere less than F , whichis exact at v∗. The slope vl of such an l is a subgradient of F at v∗. The setof all subgradients of F at v∗ is the subdifferential of F at v∗ and is denotedby ∂F (v∗).

Recall the following characterization:

vc ∈ ∂F (v∗) iff F (v∗) < +∞ and ∀v ∈ V,< v − v∗, vc > +F (v∗) ≤ F (v).(10.5)

Denote by ∂δH the subdifferential of δH . The optimization problem isequivalent to:

Maxh (E[U(h(XT )]− δK(h)) . (10.6)

The optimality conditions leads to:

PROPOSITION 10.1

There exists a scalar λc and a function hc defined on L2(IR+3, PXT ) such that:

h∗ = J(λcg + hc), (10.7)

where λc is the solution of:

y?, V0 =∫ ∞

0

J [yg(x) + hc(x)]g(x)f(x)dx, (10.8)

and hc ∈ ∂δH2(h∗).

To explain more precisely the condition hc ∈ ∂δH2(h∗), assume that thefunctions h0 and hc are continuous (such properties are always verified inpractice). Consequently, the optimal payoff h∗ is continuous.


COROLLARY 10.1Under the above assumption, the function hc satisfies the following property:1) If on a product I of intervals of values of XT , h∗(XT ) > h0(XT ) then hc

is equal to 0 on I.2) If on a product I of intervals of values of XT , h∗(XT ) = h0(XT ) then hc

is negative on I.

REMARK 10.2 From the previous results, the optimal payoff h∗ canbe determined by introducing the unconstrained optimal payoff he associatedto the modified coefficient λc (i.e., he = J(λcg) ). λc can also be consideredas a Lagrange multiplier associated to a non insured optimal portfolio butwith a modified initial wealth. Indeed, when he is greater than the insurancefloor h0, then h∗ = he. Otherwise, h∗ = h0. However, the payoff is usually acontinuous function of the values of the benchmark like any linear combinationof standard options. In that case, the optimal payoff is given by:

h∗ = Max(h0, he) = h0 + Max(he − h0, 0). (10.9)

Consequently, the optimal solution is the sum of the constraint h0 with acall of “strike” h0 and underlying he, which is the optimal solution withoutconstraint. This result is true even in the non-complete case, and with ageneral guarantee constraint. In the next section (“dynamic” case), the samekind of result is proved (see Proposition 10.3).

Since the general result in the previous proposition is an extension of theKuhn-Tucker theorem to infinite dimension, it is well-known that the determi-nation of h∗ implies the comparison of all possible solutions of the kind (10.9).However, this problem can easily be solved if the payoff must be continuous.

COROLLARY 10.2Under the previous assumptions on the utility U , there is one and only onecontinuous optimal payoff, associated to the unique solution λc of the budgetequation.

PROOF From the assumptions on the marginal utility U ′, we deduce thatits inverse J is a continuous and decreasing function with:

limo+J = +∞ and lim+∞J = 0.

Thus, for all s, the function λc −→ h∗(λc, x) = Max(h0(x), h(λc, x)) is con-tinuous and decreasing. Therefore, the function λc −→ EQ[h∗(λc, XT )] iscontinuous and decreasing from +∞ to EQ[h0(λc, XT )], which is lower thanthe initial investment V0. From the intermediate values theorem and by mono-tonicity, the result is deduced.


REMARK 10.3 Finally, an investment strategy is associated to the op-timal payoff, which can be computed by the following approach. Suppose forexample that the interest rate is non-stochastic. Then:

• First, the function h∗ is approximated by a sequence of twice differen-tiable payoff functions hn.

• Second, as proved in Carr and Madan (1997), it is possible to explicitlyidentify the position that must be taken in order to achieve a givenpayoff hn that is twice differentiable.

hn is duplicated by an unique initial position of hn(S0)− h′n(S0)S0 unit dis-

count bonds, h′n(S0) shares and hn(K)dK out-of-the-money options of all

strikes K:

hn(S) = [hn(S0)− h′n(S0)S0] + h′

n(S0)S

+∫ S0

0

h′′n(K)(K − S)+dK +

∫ ∞

S0

h′′n(K)(S −K)+dK.

Generally, as mentioned previously, h0 is increasing and he also. Therefore,the optimal payoff is an increasing function of the benchmark.

REMARK 10.4 From the previous theoretical result, we conclude that:- Generally, an optimal portfolio must include options in order to maximize

the expected utility of investors.- The solution is a combination of the optimal portfolio value without guar-

antee, and a put written on it with a strike equal to the floor. Under the stan-dard assumptions that the insurance constraints and the payoff are modelledby continuous functions of the risky asset, the solution is also the maximumbetween this function and the solution of the unconstrained problem but witha different initial wealth.

- In the no guarantee case, the concavity/convexity of the portfolio profileis determined from the degree of risk aversion and from the financial marketperformance, for example a Sharpe-type ratio. This kind of result still holdsaccording to the insurance constraint at maturity.

All the above properties are illustrated in the next example.


10.1.2.0.1 A special caseAssume that the utility function of the investor is a CRRA utility:

U(x) =xα

α,

with 0 < α < 1, from which we deduce J(x) = x1

α−1 .

Suppose that the interest rate r is constant, that the stock price evolves ina continuous time set up, and in particular that (St)t is a geometric Brownianmotion given by:

St = S0exp[(µ− 1/2σ2)t + σWt

].

Denote by f the density of ST .

Notations:θ =

µ− r

σ, A = −1

2θ2T +

θ

σ(µ− 1

2σ2)T,

ψ = eA(S0)θσ , κ =

θ

σ.

Recall that in the Black and Scholes model, the conditional expectation gof dQ

dPunder the σ-algebra generated by ST is given by:

g(s) = ψs−κ.

Therefore, he(s) satisfies:

he(s) = d× sm with d = cψ1

α−1 and m =κ

1− α> 0. (10.10)

We apply the previous general results to solve the optimization problem.Then, if there is no insurance constraint, the optimal payoff is given by:

he(s) =V0e

rT∫∞0 g(s)

αα−1 f(s)ds

× g(s)1

α−1 . (10.11)

If the insurance constraint is required then the optimal payoff must be thesolution of

MaxhE[ (h(ST ))α

α ]V0 = e−rTE[h(ST )]h(ST ) ≥ h0(ST )

Then:

PROPOSITION 10.2The optimal payoff with guarantee is given by:

h∗ = (λcg + hc)1

α−1 , (10.12)


where λc is the solution of:

y?, V0erT =

∫ ∞

0

[yg(s) + hc(s)]1

α−1 g(s)f(s)ds, (10.13)

and hc is a negative function satisfying the property of the previous corollary.

COROLLARY 10.3Assume as usual that h0 is increasing and continuous. Recall that we haveh∗ = Max(he, h0) and the solution he of the unconstrained problem associatedto the Lagrange multiplier λc is increasing. Then, the optimal payoff is anincreasing continuous function of the benchmark at maturity.

h∗ = Max(he, h0) and the solution he of the unconstrained problem asso-ciated to the Lagrange multiplier λc is increasing.

REMARK 10.5 As seen in Chapter 7, if there is no insurance constraint,the concavity/convexity of the optimal payoff is determined by the compar-ison between the risk-aversion and the ratio κ = µ−r

σ2 , which is the Sharperatio divided by the volatility σ.

i) he is concave if κ < 1− α.ii) he is linear if κ = 1− α.iii) he is convex if κ > 1− α.

The graph of the optimal payoff changes from concavity to convexity ac-cording to the increase of the risk-aversion of the investor. If, for example, theinsurance constraint is linear (h0(s) = as+ b), it looks like the unconstrainedcase, except when h∗ is equal to the constraint h0.

The previous theoretical result justifies the introduction of power optionsin the portfolio (see Chapter 9).

As seen in Section 9.2, if κ < 1−α, h∗∗ the optimal payoff is concave, andif κ > 1− α, h∗∗, it is convex.

As shown previously for the buy-and-hold case (one bond, one stock, andonly a finite number of options written on it), if the guaranteed payoff islinear, the optimal (polygonal) payoff is still convave/convex according to thedegree of risk aversion.


Example 10.1Consider the same parameter values as in Example 9.2.

Then, the optimal payoff profile can be examined according to values ofparameters d and m. Assuming that

µ = 0.1, σ = 0.2, V0 = 100, S0 = 100, r = 3, T = 5, a = 0.7, b = 90, α = 0.7,

the optimal payoff profile is equal to:

h∗ (s) = d.sm,

with d = 8.7253.10−11 and m = 5.8333.

The terminal portfolio value is given by:

V ∗∗T = 0.7ST + 90 + max

(8.7253.10−11.S5.8333

T − 0.7ST − 90; 0).

Note for example that:

α = 0.1⇒ d = 0.01 and m = 0.84,α = 0.9⇒ d = 1.55.10−36 and m = 17.5.

REMARK 10.6 When the number of available options is finite, the indi-rect expected utility is smaller than the previous one, where an infinite numberof options can be used to replicate the portfolio value h∗∗(s).

This static replication is based on the following result (see, e.g., Carr andMadan [106]):

Any payoff p(s) can be replicated by a position which is composed of

• [p(S0)− (∂p/∂s) (S0)S0] shares of riskless asset;

• [(∂p/∂s) (S0)] shares of risky asset S; and,

• [(∂2p/∂s2

)(K)] shares of out-of-the money options for all strikes K.

However, for a given finite set of available options, we can consider either acombination p(s) of these options which minimizes a given loss function w.r.t.the optimal payoff h∗∗(s), or the optimal solution for the given utility functionwith the finite set of available options.


10.2 Optimal Insured Portfolio: the dynamically com-plete case

10.2.1 Guarantee at maturity

Assume that the financial market is complete, arbitrage free, and friction-less. Asset prices are supposed to follow continuous time diffusion processes.According to the investor’s risk aversion and horizon, the portfolio managerchooses the proportions to invest on financial assets, among them all zero-coupon bonds for maturities T defined on an instantaneous interest rate (rt)t.The resulting portfolio value (Vt)t is self-financing. This means that the pro-cess (Vt exp(− ∫ t

0 rsds))t is a Q-martingale where Q is the risk-neutral proba-bility.

Denote by η = dQdP

the Radon-Nikodym derivative of Q with respect to thehistorical probability P. Denote also by MT the process ηT exp(− ∫ T

0 rsds).Due to the no-arbitrage condition, the budget constraint corresponds to thefollowing relation:

V0 = EQ[VT exp(−∫ T

0

rsds)] = EP[VTMT ].

Assume that the investor wants to maximize an expected utility under thestatistical probability P. As usual, the utility U of the investor is supposedto be increasing, concave, and twice-differentiable. Suppose also that themarginal utility U ′ satisfies:

limo+U′ = +∞ and lim+∞U ′ = 0.

Denote by J the inverse of the marginal utility U ′.

The guarantee constraint consists in letting the portfolio value VT at ma-turity above a floor FT . This floor may be deterministic, corresponding, forexample, to a predetermined percentage p of the initial invesment V0, or maybe stochastic if, for instance, the investor wants to benefit from potentialmarket rises. For example, this floor may be equal to

FT = aST + b,

where a is a given percentage of the benchmark S (a stock index, for instance)and b is a fixed guaranteed amount which corresponds usually to a fixedpercentage of the initial investment. In all cases, it is assumed that thereexists a portfolio that duplicates the floor FT .

Then, for a given initial investment V ∗0 , the investor wants to find the

portfolio θ solution of the following optimization problem:

Maxθ EP[U(VT )] under VT ≥ FT .


Due to market completeness, this problem is equivalent to (see Cox-Huang[132]):

MaxVT EP[U(VT )] (10.14)under VT ≥ FT and V ∗

0 = EP[VTMT ] ≥ EP[FTMT ].

PROPOSITION 10.3

The optimal solution V ∗T of problem (10.14) is given by the maximum of the

floor FT and the solution V eT of the non constrained problem for an initial in-

vestment V e0 such that V ∗

0 = EP[Max(V eT , FT )MT ]. Equivalently, this solution

can be viewed as a combination of the portfolio value V eT and a put written

on it with “strike” equal to the floor, or a combination of the floor and a callwritten on the portfolio value V e

T .

V ∗T = V e

T + (FT − V eT )+ = FT + (V e

T − FT )+.

PROOF Consider the solution V ∗T of the free problem (without guarantee

constraint). Using Cox and Huang [132] results, this solution is given by:

V eT = J(αMT ),

where the Lagrangian parameter α is such that V e0 = EP[V e

TMT ].Furthermore, for any portfolio VT with initial investment V0 satisfying VT ≥

FT , since the marginal utility U ′ is concave, we have:

U(VT )− U(V ∗T ) ≤ U ′(V ∗

T )(VT − V ∗T ),

and since U ′ is decreasing, we deduce:

U ′(V ∗T )(VT − V ∗

T ) = Min(αMT , U′(FT ))(VT − V ∗

T ).

Additionally,

Min(αMT , U′(FT ))(VT −V ∗

T ) = αMT (VT −V ∗T )− [αMT −U ′(FT )]+(VT −FT ).

Finally, since EP[VTMT ] = V ∗0 = EP[V ∗

T MT ], we get:

EP[Min(αMT , U′(FT ))(VT − V ∗

T )] = −EP[[αMT − U ′(FT )]+(VT − FT )] ≤ 0.

Therefore:EP[U(VT )] ≤ EP[U(V ∗

T )].


10.2.2 Risk exposure and utility function

Call-power options and CPPI strategy can be chosen according to optimal-ity criteria, as seen in the next examples. In what follows, we assume thatthe interest rate r is constant and that (St)t is a geometric Brownian motiongiven by:

St = S0exp[(µ− 1/2σ2)t + σWt

].

As done previously, we denote:

θ =µ− r

σ, A = −1

2θ2T +

θ

σ(µ− 1

2σ2)T,

ψ = eA(S0)θσ , κ =

θ

σ.

Recall that in the Black and Scholes model, the conditional expectation gof dQ

dPunder the σ-algebra generated by ST is given by:

g(s) = ψs−κ.

Example 10.2Assume that the investor has an HARA utility U given by:

U(x) =(x−K)

α

α

.

Then, the portfolio value VT which maximizes the expected utility is givenby: there exists a non-negative constant ζ such that

VT (ST ) = K + ζ(Sκ

1−αT ).

Thus, it corresponds to a CPPI portfolio value with guarantee K and a mul-tiple equal to κ

1−α .

Example 10.3Assume that the investor has a CRRA utility U given by:

U(x) =x

α

α.

Then, the portfolio value VT which maximizes the expected utility with theadditional guarantee constraint K at maturity is associated to a non-negativeconstant ξ such that:

VT (ST ) = K + (ξSκ

1−αT −K)+.

Thus, it corresponds to a call-power portfolio value with guarantee K and apower equal to κ

1−α .


Example 10.4Assume that the investor has a CRRA utility U given by:

U(x) =x

α

α.

Then, the portfolio value VT which maximizes the expected utility defined onthe cushion VT −K is given by: there exists a non-negative constant χ suchthat

VT (ST ) = K + (χSκ

1−αT ).

Thus, it corresponds also to a CPPI portfolio value with guarantee K and amultiple equal to κ

1−α .

REMARK 10.7- For a given profile h(ST ), under some mild assumptions, a utility functionU can be identified (up to a linear transformation) such that the solution ofthe expected utility maximization is the profile h(ST ).

Assume for instance that h is defined and invertible on a given interval ofR. Then, the condition: there exists a non-negative scalar λ such that

U ′(x) = λg(h−1(x))

insures that h(ST ) is the optimal profile for the expected U maximization forsome initial amount V0.

- For standard options (generally not invertible), U must be determined foreach case. For instance, if

VT = V0 + (ST −K)+,

then we can consider the utility function:

U(x) = (−∞)Ix≤V0 +x− V0 + K

m + 1g(x− V0 + K)Ix>V0 .

Therefore, the choice of a particular insured portfolio can determine animplicit utility function.


10.2.3 Optimal portfolio with controlled drawdowns

As in Grossman and Zhou [275], consider an investor who wants at any timeto lose no more than a fixed percentage of the maximum value his portfoliohas achieved up to that time.

Denote by Mt the maximum value of the investor’s wealth V invested onthe portfolio, on or before time t. Thus the constraint on the value processV is as follows: there exists a constant λ in [0, 1] such that at any time t in[0, T ],

Mt ≥ λVt.

- The financial market contains a riskless asset with rate r. The price ofthe risky asset S is the solution of the following SDE w.r.t. the probabilityspace (Ω,F , (Ft)t,P):

dSt = St. [µdt + σdWt] , (10.15)

where W is a standard Brownian motion, and the usual assumptions on co-efficient µ and σ are made. Denote µ = µ − r. This financial market iscomplete.

- The portfolio value V is the solution of:

dVt = Vt [rdt + wS (µdt + σdWt)] ,

where wS is the portfolio weight invested on the stock S which is assumed tobe predictable w.r.t. the filtration (Ft)t and such that the wealth process Vis always positive.

- The drawdown control: Denote M0 as a positive amount invested at time0 which evolves at a growth rate δ with δ ≤ r. Define:

Mt = max(M0e

δt, Vseδ(t−s), s ≤ t

). (10.16)

Note that, when δ = 0, Mt denotes the highest value between the initial val-ue M0 and all the portolio values on or before time t. If δ → −∞, Mt goes to 0.

The portfolio value must always be above the stochastic floor λM :

∀t ∈ [0, T ], Vt ≥ λMt, a.s. (10.17)

The term(1− Vt

Mt

)is called the drawdown. Therefore, the drawdown

control condition is: for a given level λ,(1− Vt

Mt

)≥ 1− λ. (10.18)

- The utility function: Assume that the investor has a power utility functionU(x) = xα

α with α < 1 and α = 0. Suppose that the objective is to maximize


the long-term growth rate of the expected utility. Its upper value is given by:

ξ∗ = supwS∈A

lim infT→∞

1αT

ln (E [αU(VT )]) , (10.19)

where A is the set of weights wS such that condition 10.17 is satisfied.

Recall that, when λ = 0, the optimal investment strategy is given by:

w∗S =

11− α

µ− r

σ2and ξ∗ = r +

12 (1− α)

(µ− r)2

σ2.

Note that when λ > 0, this strategy violates condition (10.17) with proba-bility one.

Denote Mt = e−δtMt and Vt = e−δtVt.

PROPOSITION 10.4 Grossman and Zhou [275]If there exists a constant ξ and a function J (V , M) such that:i) J (V , M) is solution of the Bellman equation:

J (V , M)t = supwS∈A

E[αJ (Vt, Mt)U(VT )e−αξt

].

ii) There exists a trading strategy w∗S which achieves the supremum; and,

iii) There exist positive constants C1 and C2 such that:

C1αU(V ) ≤ αJ (V , M) ≤ C2αU(V ),

then, the maximum long-term growth rate of the expected utility of final wealthis achieved by the strategy w∗

S . In addition, ξ is also the rate of the finite-horizon problem:

ξ = lim infT→∞

1αT

ln(α sup

wS∈AE [U(VT )]

), (10.20)

Moreover, consider J (V , M) defined by:

J (V , M) = supwS∈A

lim infT→∞

E[U(VT )e−αξT

].

Then, if J (V , M) is finite for V ≥ λM , we have:- The functional J (V , M) is homogeneous of degree α in V and M.

- J (V , M) is an increasing and concave function of V and a decreasingfunction of M.


Example 10.5 Case λ = 0 and δ = r- The optimal weight on stock is given by:

w∗S,t = k

Vt − λMt

Vt= k

(1− λ

Mt

Vt

),

with k =µ− r

σ2

1(1− λ)(1 − α) + λ

.

When λ = 0, we recover the Merton’s solution.

Thus, for CRRA utility functions, the optimal strategy is equivalent to aninvestment in the risky asset which is proportional to the cushion Vt − λMt.This is analogous to the CPPI method (see Chapter 9) but with a stochasticfloor λMt. When we assume Mt = K, we recover the standard CPPI method.

The optimal weight w∗S,t is an increasing function of the ratio Mt

Vt. When

the portfolio value Vt is close to the floor λMt, the optimal weight w∗S,t is close

to 0, since the guarantee must not be violated.

- The optimal portfolio value V is the solution of the following SDE:

dVt = k(Vt − λMt) [(µ− r)dt + σdWt] with Mt = max(M0, Vt).

The process ln(

VtMt− λ

)is a regulated Brownian motion. Then, we deduce:

Vt = λM0 exp ((1− λ)Lt)+(V0 − λM0) exp[−λLt +

(kµ− 1

2k2σ2

)t + kσWt

],

with

Lt = maxs≤t

[0 ; ln

(V0

M0− λ

)+(kµ− 1

2k2σ2

)s + kσWs − ln(1 − λ)

].


10.3 Value-at-Risk and expected shortfall based man-agement

In this section, the guarantee condition is satisfied at a given probabilitythreshold, not neccessarily equal to 1. Two cases are examined:

• First, the analysis of portfolio asset management under dynamic safetyconstraints which control the probability that the yield falls under agiven level.

• Second, the expected utility maximization under VaR/CVaR constraints.

10.3.1 Dynamic safety criteria

Consider safety criteria such as those of Roy [437] and Kataoka [324], whichare presented in a one-period setting, in Chapter 3. In what follows, weuse results from Prigent and Toumi [418] and Toumi [492]. To simplify theexposition, the financial market is assumed to evolve in continuous time andis supposed to be dynamically complete (see [492] for the the discrete-timeand incomplete cases).

10.3.1.1 Roy Criterion

There exists a riskless asset taken as numeraire. The risky asset price Sis supposed to be a semimartingale S = (St)t∈[0,T ] on a complete probabilityspace (Ω,F , P ) with a filtration (Ft)t∈[0,T ] .

The predictable strategy process ξ is assumed to be self-financing with aninitial invested amount V0 > 0. Such strategy (V0, ξ) is called admissible ifthe process defined by:

Vt = V0 +∫ t

0

ξsdSs, ∀t ∈ [0, T ] ,P− a.s. (10.21)

satisfies:Vt ≥ 0 ∀t ∈ [0, T ] ,P− a.s.

Since the financial market is assumed to be complete, there exists a uniqueequivalent martingale measure Q ≈ P .

According to Roy’s criterion, the investor searchs for the best admissiblestrategy (V0,ξ) which maximizes the probability that the portfolio terminalvalue verifies VT ≥ erTRMinV0, for a given minimal fixed return RMin.


Therefore, the investor has to solve the following optimization problem:

maxξ

P

[V0 +

∫ T

0

ξsdSs ≥ RMinV0

]. (10.22)

For fixed RMin, consider the “success set” A(V0,ξ) defined by:

AV0,ξ = VT ≥ RMinV0 . (10.23)

First step: We show that the Roy problem is equivalent to the determina-tion of a success set of maximal probability:

PROPOSITION 10.5Let A ∈ FT be a solution to the problem

max P[A], (10.24)

under the constraint:EQ [IA] ≤ 1

RMin. (10.25)

Let ζ be the perfect hedge of the option H = IA ∈ L1(Q), i.e,

EQ

[IA/Ft

]= EQ

[IA]

+∫ t

0

ζsdSs ∀t ∈ [0, T ] , P− a.s. (10.26)

Then(V0; ξ = V0

EQ[IA] ξ)is the solution of problem (10.22), and the corre-

sponding success set AV0,ξis equal almost surely to A.

REMARK 10.8 - The proof is detailed in [492]. It is based on resultsof quantile hedging, provided in Follmer and Leukert [234]. An alternativeproof is given in Browne [95] for the (Gaussian) complete case.

- The problem of constructing a maximal success set according to Roy cri-teria is then solved by applying Neyman-Pearson lemma, in a similar mannerto the case of quantile theory in Follmer and Leukert [234]. In fact, the Royproblem can be viewed as a quantile hedging of the constant H = RMinV0

under the constraint that the initial capital V0 to invest is smaller than H andthat RMin is higher than 1 (note that here the riskless return is equal to 1).

Since obviously Q[A] = E[IA], problem (10.24) is equivalent to the maxi-mization of P[A] under the constraint Q[A] ≤ 1

RMin. This is the reason why

we can apply the Neyman-Pearson lemma.Let a be the threshold defined by:

a = infa : Q[

dP

dQ> aH ] ≤ 1

RMin

, (10.27)


and consider the set

A ≤dP

dQ> a

. (10.28)

PROPOSITION 10.6Let’s assume that the set A defined by the two previous relations satisfies:

Q[A] = a. (10.29)

Then the optimal strategy(V0; ξ = V0

EQ[IA] ξ)is the solution to problem (10.22).

Second step: We maximize the expected success ratio.

The condition Q[A] = a is clearly satisfied when:

P

[dQ

dQ= a

]= 0. (10.30)

Generally, it is not easy to find a set A ∈ FT defined by (10.30). In thiscase, the Neyman-Pearson theory suggests the replacement of the critical re-gion A ∈ FT with a random test, i.e, by a function ϕ, FT -measurable, suchthat 0 ≤ ϕ ≤ 1.

LetR be the set of these functions ϕ and consider the following optimizationproblem:

E [ϕ] = maxϕ∈R

E [ϕ] (10.31)

under the constraintEQ [ϕ] ≤ 1

RMin. (10.32)

The Neyman-Pearson lemma proves that the solution ϕ of (10.31, 10.32)has the form:

ϕ = I dPdQ

>aH + γI dPdQ

=aH, (10.33)

where a is given by (10.29), and where γ is defined by:

γ =1/RMin −Q

[dPdQ

> a]

Q[dPdQ

= aH] . (10.34)

This allows a solution for the dynamic Roy problem:

DEFINITION 10.2 Let (V0, ξ) be an admissible strategy. We define the“success ratio” associated to this strategy as

ϕV0,ξ = IVT≥RMinV0 +VT

RMinV0IVT<RMinV0. (10.35)


Note that ϕV0,ξ ∈ R and the set ϕV0,ξ = 1 coincide with the success setAV0,ξ = VT ≥ V0 RMin associated to the strategy (V0, ξ) .

We search for the strategy maximizing the mean of the success rate E [ϕV0,ξ]under the probability P over the set of admissible strategies:

max E [ϕV0,ξ] : (V0, ξ) admissible . (10.36)

PROPOSITION 10.7Let ξ represent the process determining the perfect hedging of the FT−measurable

function ϕ defined by (10.33). Then the strategy(V0; V0

RMinξ)is a solution

to (10.36).

Example 10.6Examine the Roy criterion in the Black and Scholes framework.

The price process S of the risky asset is given by:

dSt = St(mdt + σ dWt),

where W is a Brownian motion under P and m is constant. To simplify, theinterest rate is assumed to be null r = 0.

The unique equivalent martingale measure is defined by:

dQT

dPT= exp

[−m

σWT − 1

2

(m

W

)2

T

],

The process W ∗t = Wt + m

σ t, is a Brownian motion under Q and the riskyasset price is equal to:

St = S0 exp(σW ∗

T −12σ2T

).

For fixed RMin, the optimal Roy strategy is the replicating strategy ofthe option H = IA where A is written as A = dPT

dQT> a, and where a is

determined from the relation

EQ[IA] =1

RMin. (10.37)

Since we have:dPT

dQT= β(ST )

mσ2 ,

with

β = (1/S0)(mσ2 ) exp

(−1

2

(m

σ2

)T +

(m2

)T

),


then we deduce:A = β(ST )

mσ2 > a.

We consider only the case m > 0 (the financial market has a positive trend).For the case m > 0, the success set A has the following form:

A = ST > c,where c is determined from the relation EQ[IA] = 1

RMin.

The Roy optimal portfolio has a payoff equal to the payoff of an option thatcan be written as:

H = IA = I ST > c,

and the Roy success probability is then given by:

P(A) = Φ(−b− mσ T√T

),

where Φ is the cdf of the standard Gaussian distribution, and b is such that:

c = S0 exp(σb− 12σ2T ).

Setting

d−(c, t) = − 1σ√TLn(

c

St)− 1

2σ√T − t,

the value Vt of the option to duplicate is equal to:

Vt = E∗t [IA] = Φ(d−(c, t))

andb = −

√TΦ−1(

1RMin

).

The expressions of the Delta ∆(t, St) =∂Vt

∂St(t, St) and of the Gamma

Γ(t, St) = ∂Vt∂S2

t(t, St) allow us to study the variation of the quantity invested

in the risky asset.

We obtain:

θqS(t) = ∆q(t, St) =1√

2πTσSt

exp(−d−(c, t)2/2

),

Γq(t, St) =−1√

2πTσS2t

(1 +1√Tσ

d−(c, t)) exp(−d−(c, t)2/2

).


Consider the following numerical values:

S0 = 100, m = 0.03, σ = 0.2, T = 1, and RMin = 1.2.

FIGURE 10.6: Dynamic Roy portfolio payoff

FIGURE 10.7: Probability of success as functionof the minimal return RMin


REMARK 10.9- In order to maximize the success probability, the investor chooses to get

exactly the minimum return for risky asset values above a given level. Indeed,if he would receive a higher return than the minimum one, the success prob-abililty would be reduced for the same initial investment.

- Additionally, we note that for small values of the risky asset, the optimalportfolio payoff is null, which induces extreme losses.

- To avoid such a problem, it would be preferable to use a criterion such asthe expected shortfall, which takes more account of the loss sizes.

- For large values of St, the investor chooses to reduce the quantity ∆q(t, St)invested in the risky asset. Once the condition

VT ≥ RMinV0

is achieved, the investor is satisfied and the increase of the value of the assetdoes not interest him any more. However, for small values of St, ∆q(t, St) isan increasing function of St.

- The success probability curve is obviously a decreasing function of RMin,since it is more difficult to guarantee a larger return. Besides,

P(A) = Φ(−b− mσ T√T

)

is an increasing function of m.

- Note also that P(A) is a decreasing function of the volatility σ. Actually,the event VT ≥ RMinV0 is less probable at the expiry date if the volatility islarger.


10.3.2 Expected utility under VaR/CVaR constraints

In what follows, we examine Value-at-Risk based management, as analyzedfor example in Basak and Shapiro [47].

- The financial market contains a riskless asset with rate r. The price S ofthe d risky assets is assumed to be a semimartingale on a filtered probabilityspace (Ω,F , (Ft)t,P) , and to be the solution of the SDE:

dSt = diag(St). [µ (t) dt + σ (t) dWt] , (10.38)

where W is a d−dimensional standard Brownian motion and usual previousassumptions on coefficient functions are made. This financial market is com-plete. Thus, there exists one and only one risk-neutral probability Q definedas follows.

Denote the relative risk process η by:

η(t) = σ (t)−1 [µ (t)− r(t)I] ,

with:

E

[∫ T

0

||η(t)||2 dt]<∞.

The exponential local martingale L:

L(t) = exp[−1

2

∫ t

0

||η(s)||2 ds−∫ t

0

η(s)dWs

],

is the Radon-Nikodym density of the risk-neutral probability Q w.r.t. theprobability P.

Denote R the discount factor:

R(t) = exp[−∫ t

0

r(s)ds].

Denote M as the product: M(t) = R(t)L(t).

- Expected utility maximization: the investor has a utility function on wealthU : (0,+∞)→ R. Using the martingale approach, the dynamic optimizationproblem under a VaR constraint is the following:

max E [U(VT )] ,under E [MTVT ] ≤ V0,P [VT ≥ Vmin] ≥ 1− ε.


PROPOSITION 10.8 Basak and Shapiro [47]

The optimal portfolio value V ∗T with VaR constraint is given by:

V ∗T =

J (yMT ) if MT < M,Mmin if M ≤MT < M,J (yMT ) if M ≤MT ,

where J = (U ′)−1, M is such that P

[MT ≥M

]= ε and the positive Lagrange

multiplier y is such that E [MTVT ] = V0.The VaR constraint is binding if and only if M< M . Also, the Lagrange

parameter y is decreasing in ε and y ∈ [yB, yPI ].

Define the quantity MMin by:

MMin =J(yM) if M < M,MMin otherwise.

The following figure plots the optimal portfolio value as a function of thestate price MT . The right thin curve corresponds to the portfolio B (case ofno VaR contraint: ε = 1). The left thin curve (PI) is the portfolio value whenthe insurance constraint VT ≥MMin must always be satisfied (ε = 0).

MMin

MMin

PI

VaR

B

V V arT

M M MT

FIGURE 10.8: Optimal portfolio value with VaR constraints


REMARK 10.10 Whenever the constraint is binding, the investor mustreduce portfolio losses to satisfy the VaR constraint. Then, he chooses toincrease portfolios losses in the “costly” states (i.e. Mt > M). Unfortunately,these events correspond to the largest losses when there is no insurance VaRconstraint. Therefore, the portfolio value with a VaR constraint has a fatterleft tail!

PROPOSITION 10.9(Power utility)Assume that U(V ) = V α

α with α < 1 and r and η are constant.

- The optimal wealth is given by:

V V aRt = eΓ(t)

(yMt)1

1−α

−[

eΓ(t)

(yMt)1

1−αΦ(−d1(M))−MMine

−r(T−t)Φ(−d2(M))]

+[

eΓ(t)

(yMt)1

1−αΦ(−d1(M))−MMine

−r(T−t)Φ(−d2(M))],

(10.39)

where Φ is the cdf of the standard normal distribution and

M =1

yM1−α ,

Γ(t) =α

1− α

(r +

||η||22

)(T − t) +

(α

1− α

)2 ||η||22

(T − t),

d2(x) =ln( x

Mt) +

(r − ||η||2

2

)(T − t)

||η||√T − t

d1(x) = d2(x) +||η||√T − t

1− α.

The fraction of wealth invested in stocks is given by:

wV aRt = qV aR

t wBt ,

where the value wBt and the exposure to risky assets relative to the benchmark

B are:wB

t =1

1− α[σ−1

t ]η,

qV aRt = 1− Me−r(T−t)(Φ(−d2(M))−Φ(−d2(M)))

V V aRt

+ (1−α)(M−MMin)e−r(T−t)(φ(d2(M))

V V aRt ||η||√T−t

,

where φ is the standard normal pdf.


10.4 Further reading

Portfolio insurance constraints of the American type are examined in ElKaroui et al.[192]. For example, for the CRRA case, the optimal solution isbased on American puts. Optimal portfolios with controlled drawdowns arealso determined in Cvitanic and Karatzas [138].

The dynamic minimization of expected shortfall is detailed in Pham [409]and Follmer and Leukert [235].

Cuoco et al. [135] introduce a dynamic VaR constraint. Contrary to theBasak and Shapiro results when the VaR constraint is static, they prove that,if the investor can fully use the current information, then the risk exposureof an investor, subject to a VaR constraint, is always lower than that of aninvestor with no VaR constraint.

Emmer et al. [203] also study VaR-type constraints applied to the determi-nation of optimal portfolios with bounded Capital-at-Risk. They examine thecontinuous-time maximization of the expected terminal wealth under an up-per bound on the Capital-at-Risk. In a Black-Scholes framework, they deduceexplicit formulae. They also consider generalized inverse Gaussian diffusionsfor which some qualitative results can be proved and simple simulation meth-ods can be proposed.

The equilibrium of portfolio insurance is analyzed in Grossman and Zhou[276] and Basak [48], who examine the impact of portfolio insurance methodson market volatility.

Portfolio insurance with stochastic dominance criterion is developed in ElKaroui and Meziou [193] using a general concave criterion over all martin-gales with American constraint. The result is also used to determine utility-maximizing strategies.

Chapter 11

Hedge funds

11.1 The hedge funds industry

11.1.1 Introduction

The development of hedge funds over recent years is due to different factors:

• During the 1990s, the equity bull market had been in favor of finan-cial investment development. Higher returns provided by more complexportfolios had earlier been requested by investors with relative weakrisk-aversion and who were searching for alternative investments.

• Since the year 2000, to protect their capitals, investors have searchedfor hedge funds in order to diversify, and in order to limit exposure tomain financial indices.

Figure 11.1 illustrates the development of the hedge funds industry.

FIGURE 11.1: The hedge funds development

351


A hedge fund is a pool where the investor may be a (standard) shareholderwith or without influence on the pool governance or may have a partnership.

11.1.2 Main strategies

The ranking of hedge funds by investment styles is not easy:

• The classification is not standardized.

• The number of subclasses is still growing.

• The relative transparency of some hedge funds does not facilitate theanalysis.

Recall two usual classifications: HFR and CSFB Tremont.

TABLE 11.1: HFR and CSFB classifications

HFR CSFB TremontEmerging markets Emerging marketsEquity hedge Long/Short equityDistressed securitiesEquity market neutral Market neutralEquity non hedgeEvent driven Event drivenFixed-income Fixed-income arbitrageMarket timing Managed futuresMerger arbitrageRegulation DRelative value arbitrageConvertible arbitrage Convertible arbitrageSectorShort selling Dedicated short biasStatistical arbitrageFunds of funds

Hedge funds 353

According to Amenc et al. [24], the classification can be also based onprincipal component analysis. This method of classification includes:

• Convertible arbitrage and volatility arbitrage

The goal is to use differences of prices for independent markets. Thedifferent risks are extracted: stock, rates, volatility, and credit. Theserelatively recent funds are rather complex. For example, their objectivecan be to provide the monetary rate plus a fixed excess return x%.

• Commodities trading advisors (CTA)

A CTA fund (for example a “Trend Follower” fund) is based on marketanticipations of futures and forwards, such as the exchange markets. Nocorrelation with usual financial markets is provided by strategies whichinsure diversification.

• Fusion/acquisition arbitrage

Such funds are, for instance, the “Merger Arbitrage,”“Risk Arbitrage,”and “Event Driven.” Simultaneously, long and short positions are takenon firms implied in merger or acquisition process.

• “Distressed” funds

Bonds and stocks of a firm which has a bankruptcy risk will maybe havehigher values in the future, if the firm can survive and grow again.

• “Long/Short Equity” funds

This is one of the oldest alternative strategies and one of the mostimportant, according to the amounts invested in such a way. It usesall types of assets (stocks, options, etc.). The idea is to shortsell midcapor bigcap stocks.

• “Fixed-Income Arbitrage” funds

These are based on arbitrages on fixed-income markets: treasury bonds,futures and options on interest rates (e.g., swaptions, caps, floors), creditderivatives, etc.

• “Global Macro” funds

This strategy is based on macroeconomic variables: stock indices, ex-change rates, inflation rate, taxation policy, etc. Using information frommacro-economic factors may allow better anticipation of unusual pricevariations.


11.2 Hedge fund performance

11.2.1 Return distributions

Hedge fund returns are not easy to estimate, since different biases can oc-cur, according to, for instance, Fung and Hsieh ([246],[247]):

- The “survival” bias ;- The “selection” bias ; and,- The “instant history” bias.

• Data are not necessarily public and, therefore, some of them are notavailable. A “self reporting” bias can also affect the results. The selec-tion bias is also due to non-standardized selection criteria.

• The survival bias is due to the fact that “badly” managed funds disap-pear, while the other ones remain. Therefore, funds for which we havedata for a relatively long period are more successful. For example, Gre-goriou [268] examined data on the Zurich market along the time periodfrom 1990-2001. The median of the life duration for a hedge fund isabout 5.5 years.

• The instant history bias is generated by differences between dates whenfunds are introduced in databases. New incorporated funds have gener-ally higher recent performances.

• Very often, the assumptions of independence and stationarity are notvalidated. The flexibility and the short life of such funds also inducestatistical problems. Hedge fund returns are not Gaussian; for examplewhen derivative assets are included or specific dynamic strategies areused.

Hedge funds also have risk exposures to volatility risk, default and creditrisk, and liquidity problems.

Therefore, all these sources of risk must be taken into account in order tomeasure the performance of alternative investments.

Hedge funds 355

11.2.2 Sharpe ratio limits

The performance measures introduced in Chapter 5 are mainly based onMarkowitz criterion. Therefore, they are validated when standard assump-tions on asset returns, such as normal distributions, are satisfied, or wheninvestors have utility functions depending only on the first two moments.

As soon as probability distributions are no longer symmetrical, performancemeasures such as the Sharpe ratio may no longer be adapted. Strategies whichdo not require any anticipation of the fund manager can get higher Sharperatios than “buy and hold” strategies.

Therefore:

• We can search for alternative performance measures as in Leland [348].

• We can also profit from this inadequacy to maximize the Sharpe ratioas shown by Goetzmann et al.[257].

11.2.2.1 Sharpe ratio inadequacy

11.2.2.1.1 Static strategy based on options Leland [348] considers aportfolio which is long on a stock index and short on a call written on thisindex.

Therefore:

• The portfolio payoff is a concave function.

• This strategy consists of selling the index when it rises, and buying itwhen it decreases.

• Then, the portfolio skewness is reduced.

• No “market timing” or “stock picking” are required.

• Then, this type of strategy provides fairly average returns but withpossible severe losses.

The following graph shows the profile of such a portfolio, for different strikesK (the call option is supposed to be priced by the Black-Scholes formula).


FIGURE 11.2: Portfolio profile (S−(S−K)+) as a function of stock valueS

The next table presents some of the characteristics of such a strategy forvarious values of the strike K.

Consider the following market parameter values:

S0 = 1, µ = 12%, σ = 18%, r = 3%, T = 1.

The corresponding mean and standard deviations of the portfolio returnper year are given respectively by 12.75% and 20.46%.

Notations :E(RP ) denotes the mean return of portfolio P .βP and αP denote the beta and Jensen alpha of portfolio P w.r.t. the index

S.RSP is the Sharpe ratio of portfolio P .σP and εP are respectively the total risk and specific risk of portfolio P .

Hedge funds 357

TABLE 11.2: Characteristics of the portfolio S − (S −K)+

E(RP ) (%) βP αP (%) RSP σP (%) εP (%)K = 0.9 4.63 0.076 0.845 0.419 3.22 2.82K = 1 6.30 0.199 1.329 0.482 6.18 4.65K = 1.1 8.20 0.376 1.514 0.508 9.70 5.92K = 1.2 9.89 0.566 1.358 0.511 13.11 6.15K = 1.3 11.13 0.729 1.015 0.504 15.88 5.46K = 1.4 11.92 0.846 0.658 0.495 17.84 4.31K = 1.5 12.35 0.920 0.381 0.487 19.07 3.11K→ ∞ 12.75 1 0 0.474 20.46 0

From the previous table, we can see that:

- The alpha of portfolio P is always positive, while none of these portfoliosrequires specific anticipation of the fund manager.

- Moreover, there exists an optimal portfolio with respect to alpha criterion.Besides, the Sharpe ratio is a function of the exercise price K.

- There exists also a value of K which maximizes the Sharpe ratio. Notethat the Sharpe ratio of the index is equal to 0.4743.

- Therefore, using the Sharpe ratio, it is possible to dominate a “buy-and-hold” strategy by a static one based on a long position on a call, and a shortposition on the underlying asset.

- In addition, the total risk of the portfolio is an increasing function of thestrike. At the limit, the portfolio risk converges to that of the index whenthe strike goes to infinity. Note also that the specific portfolio risk is firstincreasing then decreasing w.r.t. the strike K.

- Contrary to this strategy which always has a positive alpha, we can con-sider a strategy which always has a negative alpha. To do so, it is sufficient tointroduce a portfolio which is made up of the index and of a put written onthe index. As seen in Chapter 9, this portfolio corresponds to an insurancestrategy with a payoff that is a convex function of the index.


The following figure shows the profile of such a portfolio as function of theterminal value S of the index, according to various strikes H of the put.

FIGURE 11.3: Portfolio profile (S + (H − S)+)

The next table provides some of the characteristics of this strategy usingthe same market parameters as before.

TABLE 11.3: Characteristics of the portfolio S + (H − S)+

Portfolio : S + PE(RP ) (%) βP αP (%) RSP σP (%) εP (%)

H = 0 12.75 1 0 0.474 20.46 0H = 0.9 11.22 0.924 −0.789 0.437 19.11 2.818H = 1 9.41 0.801 −1.409 0.395 17.04 4.650H = 1.1 7.34 0.624 −1.765 0.339 14.08 5.922H = 1.2 5.58 0.434 −1.680 0.278 10.81 6.153H = 1.3 4.38 0.271 −1.300 0.217 7.79 5.464H = 1.4 3.68 0.154 −0.855 0.163 5.33 4.309

As with the previous strategy, this portfolio insurance strategy also does notrequire any anticipation by the fund manager. However, it generates alphaswhich are systematically negative and which depend on strikes.

Hedge funds 359

Note that the Sharpe ratio is decreasing w.r.t. the strike K. Thus, theportfolio insurance strategy which maximizes the Sharpe ratio corresponds tothe value K = 0. It is the index itself. Its Sharpe ratio is equal to 0.4743.

Consequently, as soon as an investment in the stock is hedged by a putwritten on it, its performance as measured by the Sharpe ratio is reduced andthis reduction is increasing with an increasing protection (i.e., the strike Kincreases).

However, as seen in Chapter 10, these types of strategy may be interestingfor investors.

Goetzmann et al. [257] propose an example of a strategy that requires aperfect knowledge of the financial market, but which has a weak Sharpe ratio.

11.2.2.2 Alternative measure and optimal Sharpe ratio

Two approaches can be proposed:

• First, as proposed by Leland (1999), we can search for a modified CAPMwith a beta that can better measure the portfolio risks for any probabil-ity distribution. Then, the alpha of strategies based on options wouldbe equal to 0.

• Second, it is also possible to determine the portfolio strategy whichmaximizes the Sharpe ratio, as in Goetzmann et al. [257].

11.2.2.2.1 Alternative definition of alpha and beta for the CAPMLeland [348] introduces an alternative definition for parameters alpha andbeta in the CAPM framework. This new valuation model takes account of allthe moments of the probability distribution.

It is based on a model introduced by Rubinstein [438], which is based on avaluation model through a power utility.

Under such assumptions, the CAPM is:

E [RP ] = Rf + BP (E [RM ]−Rf ) , (11.1)

where

BP =Cov [RP ,−(1 + RM )−γ ]Cov [RM ,−(1 + RM )−γ ]

. (11.2)

Note that for γ = −1, the utility function is quadratic. Then, we recoverthe standard CAPM formula, since:

BP =Cov [RP , RM ]Cov [RM , RM ]

. (11.3)

The risk aversion coefficient γ of the representative investor is given by:

γ = − ln (E [1 + RM ])− ln (E [1 + Rf ])V ar [ln (1 + RM )]

. (11.4)


The new definition of alpha is:

AP = (E [RP |IG]−Rf )−BP (E [RM ]−Rf ) . (11.5)

The term IG denotes the information available for the fund manager. Atmarket equilibrium, for the CAPM, AP must be null. It will be so if the fundmanager has private information and good anticipations. Note that the newalphas of portfolios S − C and S + P are null for all strikes.

11.2.2.2.2 Strategy which maximizes the Sharpe ratio Recall thatthe Sharpe ratio for the market index is equal to 0.4743. The following figurepresents the Sharpe ratio of portfolios which contain one unit of index and −aunits of call, as function of the strike K using the same financial parametervalues as before.

FIGURE 11.4: Sharpe ratio as a function of the strike K

The maximal Sharpe ratio is reached for the following values:

a = 0.7826 and K = 0.992.

This is equal to 0.5251 and so is higher than that of the index.Instead of searching for performance measures which cannot be manipulated

by strategies based on options, Goetzmann et al. [257] propose to determinestrategies which maximize the Sharpe ratio without requiring any skill.

Hedge funds 361

They prove that the best static strategy has a probability distribution whichis right truncated and has a left fat tail. Such a strategy can be approximatedby a combination of a put and a call.

The portfolio is based on an investment of one unit in the index, on theselling of a European calls with strikes K, and on the purchase of b Europeanputs with strikes H and (K > H).

The portfolio value at maturity is given by:

VT = ST − a.(ST −K)+ + b.(H − ST )+. (11.6)

Thus, its initial value is equal to:

P0 = 1− a.C (1, T, σ, r;K) + b.P (1, T, σ, r;H) . (11.7)

At maturity, the profile is concave as shown by figure 11.5, which corre-sponds to portfolios maximizing the Sharpe ratio.

FIGURE 11.5: Portfolio profile maximizing the Sharpe ratio

Using now a units of call and b units of put, we get a maximal Sharpe ratioequal to 0.531. This new maximum corresponds to the following values:

a = 0.705, K = 0.846, and b = 1.93, H = 1.126.

Note that the main part of the Sharpe ratio increase is obtained from onlyone option.


These results can be compared with those shown in Chapter 10 when max-imizing expected utility with insurance constraints.

In addition, Goetzmann et al. [257] show that the Sharpe ratio is not verysensitive to the choice of strikes. Therefore, options close to the spot indexprice can be used since they are more liquid.

To summarize, if we are only interested by the performance measure, resultsof Goetzmann et al.[257] suggest that the probability distribution of a fundwhich has a high Sharpe ratio, would be compared with the strategy whichmaximizes the Sharpe ratio. In particular, this is true when we consider theperformance measure of hedge funds based on style analysis.

11.2.3 Alternative performance measures

Chapter 2 deals with the choice of appropriate risk measures. In particular,downside risk measures are analyzed. Performance measures for hedge fundscan be based on such risk measures which focus on loss risk.

11.2.3.1 The semi-variance

The Sharpe ratio is based on a dispersion risk measure around the mean.Therefore, it does not allow us to determine if the variations are below orabove the mean. The semi-variance, SV , is a possible tool to avoid such aproblem:

SV (R) = E

[[(E[R]−R)+

]2]. (11.8)

The semi-variance takes account of such assymetry. Lower partial momentsare extensions of the semi-variance. They are defined by:

E[[(E[R]−R)+

]p], (11.9)

where the values of parameter p are above 2, which allows us to take betteraccount of assymetries and potential high risks.

11.2.3.2 Sortino ratio

This is one of the most famous ratios. It takes account of loss expectations(“downside risk”). It is defined by (see Sortino et al. [475], [476], [477]):

Sor(R) =E[R]− L√

E[[(L−R)+]2], (11.10)

where L denotes the minimal acceptable return level (MAR).This ratio is based on the same principle as the Sharpe ratio. However,

the riskless rate is replaced by the level L, that is by the minimal acceptablereturn level, while the standard deviation of the return σ(R) is replaced bythe standard deviation of those returns that are below the level L.

Hedge funds 363

The choice of the level L can be made according to different criteria:

• In order to control the loss risk, we can take L = 0.

• If we consider the riskless rate Rf as a benchmark, we can consider thevalue L = Rf .

• If we want to compare the performance of funds with each other, thelevel L can be chosen equal to the mean of return expectations of thesefunds.

As seen in what follows, (unfortunately) this choice is crucial to rank thefunds.

11.2.3.3 The Omega performance measure

This performance measure was introduced by Keating and Shadwick [326].Contrary to standard performance measures, such as the Treynor and Sharperatios or the Jensen alpha, it takes account of the whole probability distribu-tion.

The Omega measure considers both the gain and loss probabilities. It isdefined by:

ΩF (L) =

∫ b

L (1− F (x)) dx∫ L

aF (x) dx

. (11.11)

The function F (.) is the cumulative distribution function of the financialassets with range (a, b) and w.r.t. the probability distribution P and thereference level L chosen by the investor.

For a given level, the investor would always prefer the portfolio with thehighest Omega value.

The Omega ratio can be written as:

ΩFX (L) =EP

[(X − L)+

]EP

[(L−X)+

] . (11.12)

REMARK 11.1 Therefore, the Omega function ΩFX (L) is the ratio ofthe expectation of gains above the level L on the expectation of losses belowthe level L. As mentioned in Kazemi et al. [325], the Omega ratio can beviewed as the ratio of a call on a put having the same strike L written onthe same underlying asset (the portfolio value or its return) but with valuescomputed w.r.t. the historical probability P instead of a risk-neutral one.


The Omega function also satisfies the following properties:

• For L = EP [X ], ΩFX (L) = 1.

• ΩFX (.) is a decreasing function.

Thus, from the previous property, we deduce that:

- For levels of L smaller than the mean, the Omega ratio is positive.

- For values of X higher than the mean, the Omega ratio is negative.

• ΩFX (.) = ΩGX (.) if and only if F = G.

When portfolio returns always have identical Omega ratios, their prob-ability distributions are equal.

• The Omega ratio is compatible with the second-order stochastic domi-nance:

X 2 Y =⇒ ΩFX (L) ≥ ΩFY (L). (11.13)

As seen in Chapter 1, this is an interesting property.

• Kazemi et al. [325] define the Sharpe Omega ratio as follows:

SharpeΩ (L) =EP [X ]− L

EP

[(L−X)+

] = ΩFX (L)− 1. (11.14)

The upper term is the same as for the Sharpe ratio when the level L isequal to the riskless return. The risk measure EP

[(L−X)+

]replaces

the usual standard deviation.

Example 11.1Consider the following standard case: The payoff X is the value at maturityT of a stock S which is assumed to follow a geometric Brownian motion:

X = S0 exp[(µ− σ2/2)T + σWT ],

where (Wt)t is a standard Brownian motion which consequently is such thatWT has a Gaussian distribution N

(0,√T).

Then, we have:EP [X ] = S0 exp[µT ].

Therefore, the mean return does not depend on the volatility.

Hedge funds 365

Consequently:

- If S0 exp[µT ] < L, then the Sharpe Omega ratio is an increasing functionof the volatility (due to the vega of the put).

- If S0 exp[µT ] > L, then the Sharpe Omega ratio is a decreasing functionof the volatility.

Generally, the level L is chosen smaller than the mean (since it representsa loss w.r.t the expected value). Therefore, for the standard “buy-and-hold”strategy, the performance measure Omega is indeed decreasing w.r.t. theusual volatility risk.

In the following numerical example, the Omega performance measure is ap-plied to four hedge funds. The time period is 1992-2004.

Two types of hedge funds are examined:

1) Hedge/Convertible Arbitrage: A and D.

2) Hedge/Equity Market Neutral: B and C.

The estimation of parameters is made on monthly data.

TABLE 11.4: The four hedge funds characteristics

Characteristics Fund A Fund B Fund C Fund D

Mean 0.70 0.92 0.35 0.66Median 0.80 0.80 0.27 0.58Variance 0.54 0.51 11.74 2.13Skewness -0.74 0.65 0.35 0.56Kurtosis 4.33 3.11 4.56 4.50


Their monthly returns are represented as follows:

FIGURE 11.6: The monthly returns of the four hedge funds

The figure plots the Omega ratio as function of the reference level L:

FIGURE 11.7: Omega ratio as function of the threshold level L

Hedge funds 367

For this particular example, we can see that:

- None of the four funds dominate another one for all values of the thresholdL (fund D for example dominates the three other ones for small values ofL but it is dominated by the other ones for high values of L). However,fund B has a Omega ratio which is always above that of fund A, whichmay lead to stochastic dominance.

- Note that the choice of the reference level L is crucial, in particular around“rational” values such as the riskless rate where the ranking changesvery quickly. Therefore, if the level of L is not an absolute reference, itseems necessary to define an additional criterion to determine this level.For this purpose, we can take into account the investor’s risk aversion,the remuneration of the fund manager, or we can introduce weightedOmega ratios (judgment of financial experts, expectation for a givenprobability distribution on the level, etc.).

11.2.3.4 The ASRAP measure (“Alternative Style Risk AdjustedPerformance”)

In order to take into account the management style, Lobosco [360] proposesthe “style risk adjusted performance” measure (SRAP).

The risk is measured by the volatility. Since hedge funds generally haveasymetrical probability distributions with fat tails, it is necessary to examineat least their moments of orders 3 and 4.

For this purpose, the VaR developed by Cornish and Fisher [131] can beintroduced to measure the risk. This is an extension of the standard VaR,which includes the skewness and the excess kurtosis of the return distribution.

The first step consists of calculating the VaR based on a Gaussian distri-bution then considering the extended VaR of Cornish-Fisher.

Define q by :

q = qc +16

(q2c − 1)s +124

(q3c − 3qc)k − 136

(2q3c − 5qc)s2, (11.15)

where qc is the critical value at the probability level (1−α), s is the skewness,k is the excess kurtosis.

Then the adjusted VaR is given by:

V aRCF = −(µ− qσ), (11.16)

where µ and σ are respectively the mean and the standard deviation of theprobability distribution. Note that, if this distribution is Gaussian, then s = 0and K = 0, so qc = q. In that case, it is equal to the usual VaR.

We deduce that the ARAP measure is defined by:

ARAP =V aRCF (IFOF )V aRCF (HF )

(RHF −Rf ) + Rf , (11.17)


where IFOF is an index FOF, and HF denotes the hedge fund to be examined.The SRAP measure is determined by the difference between the measure

RAP of the portfolio and the measure RAP of the style benchmark, whichcorresponds to the portfolio style:

ASRAP = ARAP(fund)−ARAP(style index). (11.18)

11.2.4 Benchmarks for alternative investment

The alternative investment searches for absolute performance. However,the use of the riskless return as a benchmark is not always a convenient toolfor all styles of hedge funds, in particular when their beta is not null, and ifthe implicit assumptions which validate the CAPM are not satisfied.

Nowadays, hedge fund performance is measured more and more relative toa style benchmark.

Such a benchmark must satisfy the following properties:

- Transparency - The list of hedge funds that are included in the benchmarkmust be detailed and the way to compute their performance must bespecified.

- Representativeness - A large set of hedge funds must be considered whileexcluding those which are too small or the management of which is toohazardous.

- Weighting - The use of the respective capitalizations of different funds is noteasy to handle. The recent development of many funds and their lackof standardization does not facilitate this weighting. Therefore, equalweights are often considered.

- Accessability of the hedge funds.

- The “reporting” frequency.

To satisfy some of the previous required properties, we can:

• Analyze the return of a hedge fund w.r.t. the return of a given portfoliohaving the same strategy, but in a passive way (passive benchmark).Agarwal and Naik ([9],[10]) introduce multifactorial models in order toanalyze the types of assets and strategies used by the hedge funds.These factors include the strategies based on options on observable as-sets, as suggested in Agarwal and Naik [9]. This approach is furtherstudied in Schneeweis and Spurgin [456]. They prove that the perfor-mance measure of hedge funds can actually be based on option strate-gies.

Hedge funds 369

• Compare the return of a given fund with that of a representative index(active benchmark).

• Consider a pure style index. This type of index is defined as the trueindex which is not observable.

11.2.5 Measure of the performance persistence

Among studies about performance persistence of hedge funds, we have:

- Brown et al. [93] who examine if the future returns can be predicted frompast observations. Their conclusion is that this is not the case.

- Agarwal and Naik ([7], [8]) conclude that there exists a short term perfor-mance. They examine annual returns during the time period 1982-1998.Their results prove that the performance persistence level is significantfor a monthly observation, but it is reduced for annual observations.

Park and Staum [402] show that there exists a performance persistence forCTA’s.

- Elton et al. [198] and Brown et al. [93] argue that the survival bias mayinduce performance persistence.

- Harri and Brorsen [282] show that some funds are actually performant overa sufficiently long time period.

- Baquero et al. [41] take account of the “look-ahead bias” which is inducedby the multiperiodic sampling. This bias can induce a difference ofabout 3.8%. However they confirm the persistence.

The general conclusion is that there exists a performance persistence, forboth the “losers” and the “winners.” Note also that the performance is linkedto management costs received by the fund managers, as noted by exampleCaglayan and Edwards [100].

Usual statistical methods are based on:

• Regression of the returns; in particular, the behavior of the alpha coef-ficient is examined.

• Style analysis.

• Correlation tests, such as Spearman’s test.


11.3 Optimal allocation in hedge funds

The management of a fund which itself uses other funds such as hedge fundsallows for diversification. This is due to the weak correlation of hedge fundswith standard financial assets, as shown in the next figure, where a hedgefund index is compared with standard financial indexes.

FIGURE 11.8: Correlation of hedge funds/standard funds

We can see that returns of hedge funds can evolve independently fromtraditional assets (confirmed by the correlation values).

What is the percentage of hedge funds to include in a fund or a portfolio?Obviously, the answer is not easy:

• Some funds cannot include more than a given percentage (for example,10%). Otherwise, their category is changed.

• A minimal diversification is desirable to profit from weak beta and highalpha.

Note that nowadays most financial institutions include hedge funds. Froma theoretical point of view, as a first step, a mean-variance analysis can be de-veloped as in Schneeweis and Spurgin [456], who determine the mean-variance

Hedge funds 371

frontier, including the S&P500, the fixed-income index Lehman Brothers, anda hedge fund index, the EACM 100. They conclude that hedge funds must beused. This analysis is further extended by Cvitanic et al. [141], who introducea model based on a dynamic mean-variance criterion. In this framework, theyprove that the model risk reduces the percentage invested in hedge funds.However, as mentioned by Lhabitant [355], the mean-variance criterion doesnot take account of moments with orders higher than 2, which strongly limitsthe analysis. It is the reason why for example Chabaane et al. [111] introducevarious optimization criteria such as the expected shortfall.

11.4 Further reading

Lhabitant [354] provides an overview of hedge funds including description,classification, and performance. Fung and Hsieh [248] examine the hedgefund survival lifetimes, while in [249] they show how hedge fund strategiesmay involve risk. Brooks and Kat [91] examine statistical properties of hedgefund index returns and deduce their implications for portfolio managemen-t. The book of Gregoriou et al. [268] contains a survey about performancepersistence. Bookstaber and Roger [83] highlight the problem of measuringperformance of portfolios with options. Bacmann and Scholz [36] proposealternative performance measures for hedge funds. Bonnet and Nagot [81]propose a class of performance measures in order to evaluate alternative in-vestment, regardless of assumptions on payoff. The representation of thesemeasures involves the Log-Laplace transform of the asset distribution - amongthem: the squared Sharpe ratio, the Stutzer’s rank ordering index and theHodge’s Generalized Sharpe ratio. Cascon et al. [107] provide some mathe-matical properties of the Omega measure.

Avouyi et al. [27] apply the Omega to the portfolio allocation choice prob-lem, using Threshold Accepting. They introduce conditional copula to takeinto account non-Gaussian returns, extreme joint movements, time-varyingdependence, and volatility. They apply these methods to a portfolio com-posed of three total stock market indices (US, UK and Germany). Amencand Martellini [23] also examine portfolio optimization involving hedge fundsusing an improved estimator of the covariance structure of hedge fund indexreturns. Using data from CSFB-Tremont hedge fund indices, they concludethat ex-post volatility of minimum variance portfolios is between 1.5 and 6times lower than that of a value-weighted benchmark. Thus, inclusion ofhedge funds in a portfolio can potentially generate a dramatic decrease inthe portfolio volatility without lower expected returns. Krokhmal et al. [337]also endeaver to optimize a portfolio of hedge funds. They examine linearrebalancing strategies using Value-at-Risk and CVaR criteria.

Appendix A

Appendix A: Arch Models

Linear and non linear processes

Time series have been introduced in particular to describe and to predictdiscrete-time dynamics. One of the most popular models is the so-called au-toregressive moving average process (ARMA). The current value of the seriesis a linear function of its own lagged values and of current and past valuesof a “noise”process, usually called the innovation process. However, it is set-up in a linear framework (rather strong approximation), and no constraintis usually imposed on the moving average parameters (which does not allowtaking into account structural relations). Financial time series, for example,exhibit non-linear dynamics and are submitted to structural constraints suchas equilibrium conditions.

Therefore, a new family of time series have been introduced by Engle [204]:the ARCH (Autoregressive Conditionally Heteroscedastic) models. These mod-els allow consideration of nonlinear time series models. They can also beapplied to path dependent volatility models.

Weak and strong stationarity

A stationary time series is stationary if it has no trend and no seasonality.It is homogeneous with respect to time. More precisely:

DEFINITION A.1 A time series (Xt)t is weakly stationary if:

E(X1) = ... = E(Xt) = ... = µ,

Cov(Xt, Xt+h) = γ(h), ∀(t, h) ∈ N× Z. (A.1)

In particular, the autocovariances only depend on the time period h and noton current time t.A time series (Xt)t is strongly stationary if for any n and any t1 < ... < tn,

the probability distribution of the random vector (Xt1 , ..., Xtn) depends onlyon n. Strong stationarity obviously implies weak stationarity.

373


Example A.1A well-known non-stationary time series is the random walk (RW):

Xt = Xt−1 + εt,

where (εt)t is a standard Gaussian white noise (WN) (i.e. the random vari-ables εt are independent and have the same probability distribution, which isthe standard Gaussian law N (0, 1)).

Such series “diverge.”Consider for example a stationary time series whichis a first-order autoregressive process (AR1):

Yt = φ.Yt−1 + εt, with |φ| < 1.

The behaviors of the two time series are illustrated in the next figure (forX0 = Y0 = 0).

FIGURE A.1: Random walk (RW ) and autoregressive process of order 1(AR1)

The stationary series varies around its mean (equal to 0) whereas the ran-dom walk has a high variance since we have:

Xt = X0 +N∑t=1

εt =⇒ σ2(Xt) = σ2(N∑t=1

εt) = σ2N.

Note that the latter equality proves that the random walk is not stationary.

Standard econometrics analysis is based on stationarity and is not adaptedto non-stationary time series as shown by the following example of Grangerand Newbold [264].

Appendix A: Arch Models 375

Example A.2 Fallacious regressionsConsider a linear regression between two independent random walks RW1

and RW2 such that (the numbers between braces are the Student’s statisticvalues):

RW1t = 3.5167(7)

+ 0.59(72.25)

RW2t + εt, with an R2 = 0.72.

We note that the variable RW2 would allow us to explain the dynamics ofRW1: there exists a linear relation between these two random variables,whereas they are independent. This surprising result is due to their commonpropensity for diverging. However, if we consider two independent Gaussianwhite noises ε

(1)t and ε

(2)t (thus stationary processes), no linear relation be-

tween them appears:

ε(1)t = −0.0148

(−0.66)− 0.086

(−0.841)ε(2)t + εt, with an R2 = 0.018.

None of the previous coefficients is significant.

ARMA processes

The current value of Xt is a linear function of the past values of X and ofpast and current values of a white noise process ε. Define:

Xt−1 = (X1, ..., Xt−1).

Denote also by LE[Xt

∣∣Xt−1

]the linear regression of Xt (the best prediction

of Xt by means of a linear function of X1, ..., Xt−1).

DEFINITION A.2 A second order stochastic process is:1) an autoregressive process of order K if and only if:

E[Xt

∣∣Xt−1

]= E [Xt |Xt−1, .., Xt−K ] , ∀t.

2) a linear autoregressive process of order K if and only if:

LE[Xt

∣∣Xt−1

]= LE [Xt |Xt−1, .., Xt−K ] , ∀t.

An ARMA representation is defined as follows:

Xt = C + Φ1Xt−1 + ... + ΦpXt−p + εt −Θ1εt−1 − ...−Θqεt−q, (A.2)

where Φ1, ...,Φp and Θ1, ...,Θq are square matrices and C is a vector.The autoregressive and moving average lag-polynomials are given by:

Φ(L) = Id + Φ1L− ...− ΦpLp,

Θ(L) = Id−Θ1L− ...−ΘqLq, (A.3)


where L denotes the lag operator, L(Xt) = Xt−1. The ARMA representationis then defined by:

Φ(L)(Xt) = C + Θ(L)εt. (A.4)

The coefficients Φ1, ...,Φp and Θ1, ...,Θq are usually constrained. For ex-ample, the roots of the equations:

det (Φ(z)) = 0 and det (Θ(z)) = 0

are supposed to be outside the unit circle (|z| > 1). Indeed, under this latterassumption, the polynomials det (Φ(z)) and det (Θ(z)) can be inverted. Thisleads to alternative representations of the process X :

(infinite moving average representation) Xt = Φ(L)−1C + Φ(L)−1Θ(L)εt,(infinite autoregressive representation) Θ(L)−1Φ(L)Xt = Θ(1)−1C + εt.

Thus ARMA processes are relatively tractable. Note that they can beconsidered as truncated approximations of weakly stationary processes due toWold’s decomposition.

THEOREM A.1 Wold’s theorem

Consider a weakly stationary process (Xt)t such that

limk→∞

E [Xt+k |Xt ] = E [Xt] ,

which means that no information at an infinite horizon is given from theobservations prior to time t.Then, this process always has an infinite moving average representation:

Xt = C0 + A(L)εt, (A.5)

where (εt)t is a sequence of homoscedastic noise variables, V εt = Ω, whichare uncorrelated, Cov(εt, εt′) = 0, ∀t, ∀t′, with zero mean, E [εt] = 0, and suchthe coefficients A1, ...Aj , ...satisfy the following stability condition:

∞∑j=0

AjΩA′j <∞.

Note that the process ε satisfies:

εt = Xt − LE [Xt |εt ] ,

which is the reason why this process is called the innovation process.


ARCH model

The Autoregressive Conditionally Heteroscedastic model takes account oftime dependent conditional variances. For example:

Xt = ε2t−1εt,

where ε is a Gaussian white noise with variance σ2. The process X is weaklystationary and the conditional variance depends on lagged residuals:

V ar(Xt

∣∣Xt−1

)= V ar

(ε2t−1εt

∣∣Xt−1

)= σ2ε2t−1.

ARCH volatility models

The univariate case

A standard ARCH volatility process of order p can be written as:

Rt =k∑

n=1

ΦnRt−n + εt,

σ2t = α0 +

p∑i=1

αiε2t−i, (A.6)

where σ2t is the conditional variance of the innovation process εt. The previous

representation is based on a moving average of the squares of the innovations.The coefficients α0 > 0, αi 0, i = 1, . . . , p are assumed to be positive.Under the assumption

∑pi=1 αi < 1, the time series is stationary.

This representation has been extended by Bollerslev et al. [80] who intro-duce the GARCH(p, q) model for which the conditional variance satisfies thefollowing equation:

σ2t = α0 +

p∑i=1

αiε2t−i +

q∑j=1

βjσ2t−j . (A.7)

In that case, the variance is defined as the sum of an autoregressive termand of a moving average of the squares of the innovations. The conditionalvariance is stationary if the condition

∑pi=1 αi +

∑qi=1 βi < 1 is satisfied.

More generally, the term∑p

i=1 αi+∑q

i=1 βi indicates the persistence degreeof the variance, since the non-conditional variance can be written as:

σ2 =α0

1−∑pi=1 αi +

∑qi=1 βi

.


When∑p

i=1 αi +∑q

i=1 βi = 1, the non-conditional variance is infinite. Theconditional variance follows an IGARCH process (Integrated GARCH ).

Note that if the standardized innovations (zt = εtσt

) have a standard Gaus-sian distribution, then the marginal distribution of the innovation process εtis characterized by a kurtosis which is always higher than the standard Gaus-sian one. In order to fit fat tail distributions, other innovation processes canbe introduced, for example GED or Student distributions.

In order to take asymmetrical distributions into account, several modelshave been introduced by Nelson [398], Engle and Ng [208], Glosten et al.[256], and Zakoian [509], etc.

• GJR model (Glosten, Jagannathan, and Runkel [256]) defined by:

σ2t = α0 + αε2t−1 + γIt−1ε

2t−1 + βσ2

t−1,

with α0 > 0, α 0, α + γ 0 and β 0. The process is stationary ifβ + α + γ/2 < 1.

• TGARCH model (Threshold GARCH ), Zakoian [509] defined by:

σ2t = α0 + α |εt−1|+ γIt−1 |εt−1|+ βσt−1,

with α0 > 0, α 0, α + γ 0 and β 0. The volatility process isstationary if:

β2 +α2 + (α + γ)2

2+ 2β

2α + γ√2π

< 1.

• EGARCH model (Exponential GARCH ) (Nelson [398]) defined by:

lnσt2 = α0 + α(|zt−1| − E |zt−1|) + γzt−1 + β lnσ2

t−1,

where E |zt−1| =√

2/π under the normality assumption.

The volatility process is stationary if β < 1.

• If the long term volatility is non constant, we can consider “component-models.”

These models allow us to take into account mean-reverting properties:

σ2t − qt = α

(ε2t−1 − qt−1

)+ β

(σ2t−1 − qt−1

), (A.8)

qt = ω + ρ (qt−1 − ω) + φ(ε2t−1 − σ2

t−1

). (A.9)

The term qt is the long term volatility component. Equation (A.8)describes the transient volatility component, σ2

t − qt, which convergesto zero with speed (α + β).

Equation (A.9) defines the long term volatility component qt, whichconverges to ω with speed ρ.


• To take account of asymmetry, the TARCH model can be used:

σ2t − qt = α

(ε2t−1 − qt−1

)+ γ

(ε2t−1 − qt−1

)It−1 + β

(σ2t−1 − qt−1

).

(A.10)

In all previous relations, β is the autoregressive term, α is the effect of ashock on return, and γ is the asymmetry effect corresponding to an additionalimpact of a negative shock. Thus, for the GJR and TGARCH models, theeffect of a positive shock is measured by α, and the effect of a negative shockis measured by α + γ ( γ is assumed to be positive).

For the EGARCH model, the effect of a positive shock is measured by α+γand the effect of a negative shock is measured by α− γ. In that case, γ mustbe negative so that a negative shock has no higher impact than a positiveshock.

REMARK A.1 The comparison of these asymmetrical models can bebased on different response curves to innovations which illustrate the inno-vation effects on the conditional variance (News Impact Curves) proposed byEngle and Ng [208].

The multidimensional case

Consider a N -multidimensional GARCH model:

Yt = µ + εt, with εt | It−1 N (0, Ht) .

Two basic examples of such models are:

- Diagonal VECH:

Ht = A0 +p∑

i=1

Ai ⊗Ht−i +q∑

i=1

Bi ⊗ εt−1εTt−1.

- BEKK (see [206]) :

Ht = A0AT0 +

p∑i=1

AiHt−iATi +

q∑i=1

Bi

(εt−1ε

Tt−1

)BT

i .

Consider for example the multidimensional GARCH model which is called theDCC-MVGARCH (Dynamic Conditional Correlations Multivariate General-ized Auto Regressive Conditional Heteroscedastic model) introduced by Engleand Sheppard [209].

εt | It−1 N (0, Ht) ,Ht ≡ DtRtDt,


where εt are the residuals from the filtration, Dt is a diagonal matrix withcoefficients which are stochastic standard deviations generated by univariateGARCH processes, and Rt denotes a stochastic correlation matrix.

The log-likelihood is defined by:

L = −12

T∑t=1

(k log (2π) + log (|Ht|) + ε′tH

−1t εt

),

= −12

T∑t=1

(k log (2π) + log (|DtRtDt|) + ε′tD

−1t R−1

t D−1t εt

),

= −12

T∑t=1

(k log (2π) + log (|Dt|) + log (|Rt|) + η′tR

−1t ηt

),

where ηt N(0, Rt) are the residuals, standardized by their conditionalstandard deviations. In Engle and Sheppard [209], the coefficients of Dt areunivariate GARCH processes given by:

Hit = ωi +Pi∑p=1

αipε2it−p +

Qi∑p=1

βiqhit−p,

where Hit is the usual conditional GARCH variance and, for i = 1, 2, ..., k,usual non-negativity constraints hold together with the stationary condition:

Pi∑p=1

αip +Qi∑q=1

βiqhit−p < 1.

The dynamic correlations are:

Qt =

(1−

M∑m=1

αm −N∑

n=1

βn

)Q +

M∑m=1

αm

(ηt −mη′t−m

)+

N∑n=1

βnQt−n

Rt = Q∗−1t QtQ

∗−1t ,

where α and β are the weights, Q is the non conditional covariance of stan-dardized residuals, Q∗

t is a diagonal matrix with coefficients which are squaresof the diagonal coefficients qii of the matrix Qt, and M and N are the DCClags. Note that the coefficients of Rt are given by:

ρijt =qijt√qiiqjj

.

Appendix B

Appendix B: Stochastic Processes

Stochastic basis, filtration, stopping times

The notion of filtration is used to represent, for example, the flow of possibleobservations of prices on a financial market which is available for traders andportfolio managers.

DEFINITION B.1 Suppose (Ω,F ,P) is a probability space.A filtration (Ft)t is an increasing family of sub-sigma-fields Ft ⊂ F . A

stochastic basis is a probability space equipped with a filtration. Increasingmeans that if s ≤ t, then Fs ⊂ Ft. Ft is usually interpreted as the set ofevents that occur before or at time t. Generally, Ft represents the history ofsome process observed up to time t, but other possible histories are allowed.

It is assumed that the “usual conditions” are satisfied:

i) Complete : every P-null set in F belongs to F0 and so to all Ft.

ii) Right continuous : Ft = ∩s>tFs.

DEFINITION B.2 A continuous time stochastic process X is a family ofrandom variables (Xt)t defined on (Ω,F ,P), indexed by t, which take values in(E, E). Hence, for all t, Xt is a random variable with values in E. Moreover,for each fixed ω (which represents a “state of the world”), t → Xt(ω) is afunction defined on [0, T ], called a path or a trajectory of the process X.

The concept of Martingale is crucial in the modern theory of finance. Ifthe price process M of an asset is a martingale, the conditional expectationat time s of the future value Mt of the stock at time t is given by its currentvalue Ms.

DEFINITION B.3 A martingale (resp. submartingale, resp. super-martingale) is an adapted process X on the basis (Ω,F , (Ft)t,P) whose paths

381


are all right continuous and left limited (rcll) P−almost surely, such that everyXt is integrable and such that for s ≤ t:

Xs = EP[Xt|Fs] (resp.Xs ≤ EP[Xt|Fs], resp.Xs ≥ EP[Xt|Fs]).

In order to model dynamics of financial assets, several types of stochas-tic processes are introduced: Brownian motion, Poisson processes, and, moregenerally, on Levy processes, diffusions, diffusions with jumps, point process-es, and martingales (see Shiryaev [469]).

Semimartingales and stochastic integrals

The class of semimartingales is the class of stochastic processes that is“rich” enough and sufficiently “tractable.” It contains the previous processesand is stable under many of the usual transformations: localization, changeof measure, change of filtration, and change of time. Finally, it is possible todefine stochastic integrals with respect to semimartingales, which leads to thefamous Stochastic Calculus and its very powerful results, like the Ito formula.

DEFINITION B.4 1) A semimartingale is a process X which has thefollowing decomposition (not unique)

X = X0 + A + M,

where X0 is finite-valued and F0−measurable, M is a local martingale withM0 = 0, and A has finite-variation.2) A special semimartingale is a semimartingale X for which A is moreoverpredictable. Furthermore, this decomposition is unique and called the canoni-cal decomposition of the special semimartingale X.

There exist several ways to construct stochastic integrals (see e.g., Protter[419] for details).

If X has finite-variation and if H is a bounded process, the integral

H.Xt =∫ t

0

HsdXs

is directly defined as the Stieljes integration path-by-path. But this con-struction excludes such fundamental processes as, for instance, the Brownianmotion whose paths almost surely have no finite variations over each finiteinterval. Martingales in general, and also Markov processes, are similarly

Appendix B: Stochastic Processes 383

excluded. This is a standard problem with semimartingales X having nofinite-variation, since the measure dXs(ω) is not defined.So a special construction of a stochastic integral had to be developed whiletaking into account that stochastic integrals must be defined from limit ofsums as usual integrals. For this purpose, restrictions must be made on bothintegrand and integrator.

DEFINITION B.5 A process H is said to be simple predictable if H hasa representation

Ht = H010(t) +n∑

i=1

Hi1(Ti,Ti+1](t), (B.1)

where 0 = T0 < T1 < ... < Tn+1 < ∞ is a finite sequence of stopping times,Hi ∈ FTi with |Hi| <∞ a.s., and 0 ≤ i ≤ n.

DEFINITION B.6 Let X be a right continuous and left limited process.Define the linear mapping associated to stochastic integrals with respect toprocess X by, for any previous process H :

JX(H) = H0X0 +n∑

i=1

Hi(XTi+1 −XTi), (B.2)

whenever

Ht = H010(t) +n∑

i=1

Hi1(Ti,Ti+1](t),

where 0 = T0 < T1 < ... < Tn+1 < ∞ is a finite sequence of stopping times,Hi ∈ FTi , with |Hi| <∞ a.s., and 0 ≤ i ≤ n.

JX(H) is called the stochastic integral of H with respect to X.

Example B.1Consider the standard Brownian motion W . Recall that W can be charac-terized as a continuous stochastic process with independent and (Gaussian)stationary increments (i.e. for each s < t, Wt −Ws is independent of Fs andits distribution is the Gaussian law with mean 0 and variance equal to t) withW0 = 0.Let (Pin)n be a refining sequence (i.e. Pin ⊂ Pim if m > n) of partitions of[0,∞) with mesh sizes converging to 0 as n goes to infinity. Consider

Wnt =

∑tk∈Pin

11(tk,tk+1].


Wn converges to W . Now fix t and assume that t is a partition point of eachPin. Then,

JW (Wn) =∑

tk∈Pin,tk<t

Wtk(W tk+1 −W tk),

andJW (W )t = lim

n→∞ JW (Wn)t =∑

tk∈Pin,tk<t

Wtk(Wtk+1 −Wtk),

= limn→∞

∑tk∈Pin,tk<t

1/2(Wtk + Wtk+1)(Wtk+1 −Wtk)− 1/2(Wtk+1 −Wtk)2

= 1/2 W 2

t − 1/2 limn→∞

∑tk∈Pin,tk<t

(Wtk+1 −Wtk)2

.

Now, examine the last term on the right. Denote

Yk = (Wtk+1 −Wtk)2 − (tk+1 − tk).

The sequence (Yk)k is an iid sequence of random variables with zero mean.Thus

E

[ ∑tk∈Pin

(Wtk+1 −Wtk)2 − (tk+1 − tk)

]=∑k

E[Y 2k ].

Moreover,(FracWtk+1 −Wtk tk+1 − tk

)2 has the distribution of the squareof a Gaussian random variable Z with zero mean and variance equal to 1.Therefore

E[∑

tk∈Pin

(Wtk+1 −Wtk)2 − (tk+1 − tk)] ≤ E[(Z2 − 1)2] mesh(Pin)t,

which converges to 0 as n goes to infinity. Thus, the last term on the rightconverges to t (L2 convergence, but almost-surely convergence can also beproved (see Protter [419])). Consequently,∫ t

0

WsdWs = 1/2 W 2t − 1/2 t,

which distinctly differs from the Riemann-Stieljes integral formula.

For processes A with continuous paths of finite variations, the term∑tk∈Pin

(Atk+1 −Atk)2 converges to 0, which explains the difference betweenpath-by-path Riemann-Stieljes integrals with respect to processes A andstochastic integral with respect to processes such as the Brownian motion.

The quadratic variation of a semimartingale is a very convenient tool.


DEFINITION B.7 1) The quadratic co-variation of two semimartingalesX and Y , denoted by [X,Y ], is defined by:

[X,Y ] = XY −X0Y0 −X−.Y − Y−.X . (B.3)

2) The quadratic variation of a semimartingale X is [X,X ] and so:

[X,X ]t = X2t −X2

0 − 2∫ t

0

Xs−dXs. (B.4)

Ito’s formula for semimartingales

In what follows, Dif denotes the first partial derivative of the function fw.r.t. xi and Dijf denotes the second partial derivative of the function fw.r.t. xi and xj .

THEOREM B.1Let X = (X1, ..., Xd) be a d-dimensional semimartingale and f a class C2

twice continuously-differentiable function on Rd. Then f(X) is a semimartin-gale and:

f(Xt) = f(X0) +∑i≤d

Dif(X−).X i + 1/2∑i,j≤d

Dijf(X−).〈X i,c, Xj,c〉

+∑s≤t

f(Xs)− f(Xs−)−∑i≤d

Dif(Xs−)∆X is

.

Doleans-Dade exponential formula

This notion is very useful to the financial theory, since most asset price pro-cesses are in fact Doleans-Dade exponentials. Moreover, it is also relevant tochange of measures to compute, for example, the Radon-Nikodym derivativesof the risk-neutral probabilities.

Consider the equation

Y = 1 + Y−.X (or equivalently dY = Y−dX and Y0 = 1),

where X is a given semimartingale and Y an unknown rcll adapted process.

By analogy with the ordinary differential equation dydx = y, Y is called the

exponential of X .


THEOREM B.2

If X is a semimartingale then the above equation has one and only onercll adapted solution (up to indistinguishability) which is a semimartingale,denoted by E(X), and given by

E(X)t = exp (Xt −X0 − 1/2〈Xc, Xc〉)∏s≤t

(1 + ∆Xs)e−∆Xs . (B.5)

Recall some basic properties of this exponential:

PROPOSITION B.1

1) If X has finite variation, then so has E(X) which is given by

E(X)t = exp (Xt −X0)∏s≤t

(1 + ∆Xs)e−∆Xs .

2) If X is a local martingale then so is E(X).3) Let T = inft : ∆Xt = −1. Then, E(X) = 0 on [[0, T [[, E(X−) = 0 on[[0, T ]] and E(X) = 0 on [[T,∞[[.

Note also the following useful property:

PROPOSITION B.2

Let X and Y be two semimartingales with X0 = Y0 = 0. Then

E(X)E(Y ) = E(X + Y + [X,Y ]) . (B.6)

Example B.2

As in the Black and Scholes model, consider a rate of return X described bya Brownian motion with drift. Thus, we have :

Xt = µt + σWt,

where µ and σ are constants, and (Wt)t is a standard Brownian motion. Then,the stock price process S is solution of the equation S = S0+S−.X . Thus S isthe Doleans-Dade exponential of X and, since 〈σW, σW 〉t = σ2t, one obtainsthe very well-known geometric Brownian motion

St = S0 exp((µ− 1/2σ2)t + σWt

).


Markov processes and stochastic differential equations

This paragraph is devoted to the Markov property of solutions to stochasticdifferential equations. It contains a brief survey on infinitesimal generatorsfor diffusions.

One of the motivations for the development of stochastic integrals was thestudy of diffusions (i.e. the continuous strong Markov processes) as solutionsof differential equations of the form:

Xt = X0 +∫ t

0

f(s,Xs)ds +∫ t

0

g(s,Xs)dWs,

where (Wt)t is a standard Brownian motion, and f and g are sufficiently reg-ular functions to define a unique continuous solution which is strong Markov.From the general concept of semimartingale differentials, the terms ds anddWs can be replaced by general semimartingales that should have indepen-dent increments so that the solution is Markovian. As mentioned in Protter[419], a “naive” definition of a Markovian process looks like a weakening ofthe property of independent increments:

DEFINITION B.8 A process X with values in Rd and adapted is saidto be a simple Markov process with respect to the filtration (Ft)t, if for eachs ≥ 0 the σ−fields Fs and σ(Xt, t ≥ s) (information uniquely from time s)are conditionally independent given Xs.

This is in fact equivalent to: for t ≥ s and for every f bounded and Borelmeasurable,

E[f(Xt)|Fs] = E[f(Xt)|σ(Xs)]. (B.7)

“The best prediction of the future given the past and the present is thepresent.”

From the previous relation, one can define a transition function for a Markovprocess as follows : for s < t and f bounded and Borel measurable,

Ps,t(Xs, f) = E[f(Xt)|Fs] . (B.8)

Letting f(x) = 11C(x), the preceding relation reduces to

P(Xt ∈ C|Fs) = Ps,t(Xs, 1C). (B.9)

also denoted by Ps,t(Xs, C).

If for any s < t, the transition function satisfies the relationship

Ps,t = P0,t−s, also denoted by P (t− s) , (B.10)


the Markov process is said to be time homogeneous and the transitions func-tions are a semigroup of operators, known as the transition semigroup(P (t))t.In the time homogeneous case, the Markov property becomes, for any u ≥ 0:

P(Xt+u ∈ C|Ft) = P (u,Xt, C). (B.11)

Thus, a function P (t, x, C) defined on (0,∞)×Rd×B(Rd) is a time homo-geneous transition function if:1) For each (t, x), P (t, x, .) is a probability on (Rd,B(Rd)).2) For each x, P (0, x, .) is the Dirac measure δx(C) at x (since X0 = x).3) For each C ∈ B(Rd), P (., ., C) is real-valued, Borel measurable, and bound-ed.4) P (t, x, C) satisfies the following relation, called the Chapman-Kolmogorovproperty: for each (t, x, C), t and u ≥ 0:

P (t + u, x, C) =∫

P (u, y, C)P (t, x, dy). (B.12)

The probability measure m defined on (Rd,B(Rd), by m(A) = P[X0 ∈ C]is called the initial distribution of the process X .

A transition function for X and the initial distribution m determine thefinite-dimensional distributions of X by

P(X0 ∈ C0, Xt1 ∈ C1..., Xtn ∈ Cn) =∫C0

...∫Cn−1

P (tn − tn−1, yn−1, Cn)...P (t1, y0, dy1)m(dy0) .(B.13)

It can be required that the Markov Property holds for stopping times.

DEFINITION B.9 A time homogeneous simple Markov process X isStrong Markov if for any stopping time T with P[T <∞] = 1 and u ≥ 0,

P(XT+u ∈ C|FT ) = P (u,XT , C) , (B.14)

or equivalently, for any function f bounded and Borel measurable

E[f(XT+u)|FT ) = P (u,XT , f) . (B.15)

In the previous definition, the strong Markov property has been defined foronly time homogeneous processes but, if Xt is a Rd-valued simple Markovprocess, then Yt = (Xt, t) is a Rd+1-valued simple Markov process.

Such examples of Markov processes are the Brownian motion and the Pois-son process and typical solutions of stochastic differential equations.

A first standard example of such stochastic differential equations is givenby the Ito diffusion processes. Consider the case of the evolution of a stockprice subject to random shocks (resulting, for instance, from a multitude of


stock trade orders). If b(t, x) is the trend of the price at the point x and timet (a kind of “velocity”), then to describe the value Xt at time t, it may bereasonable (under some assumptions like independence of the shocks) to usean equation like

dXt

dt= b(t,Xt)dt + σ(t,Xt)εt ,

where (εt)t is a white noise and σ(t, x) measures the amplitude of the shocks(the very well-known volatility in finance).

The mathematical interpretation of this equation, due to Ito is

dXt = b(t,Xt)dt + σ(t,Xt)dWt , (B.16)

where (Wt)t is a Brownian motion.

This equation can be immediately extended to the case X is Rd-valued and(Wt)t is a p-dimensional Brownian motion. b(t, x) ∈ Rd is usually called thedrift, and σ(t, x) ∈ Rd×p the diffusion coefficient.

In fact, since the works of Ito, the above equation is written with the useof stochastic integrals : if X0 is given, then the solution is

Xt = X0 +∫ t

0

b(s,Xs)ds +∫ t

0

σ(s,Xs)dWs , (B.17)

where (Wt)t is a Brownian motion, and b(t, x), σ(t, x) are appropriately s-mooth to ensure the existence and the uniqueness of solutions.Intuitively speaking, if (Ft)t is the underlying filtration of the Brownian mo-tion, then for small ε and for all i, j ≤ n,

E[X it+ε −X i

t |Ft] = bi(t,Xt)ε + o(ε) ,

E[(X it+ε −X i

t − bi(t,Xt)ε)(Xjt+ε −Xj

t − bj(t,Xt)ε)|Ft] = σ′iσj(t,Xt)ε + o(ε)

(notation: σ′ denotes the transpose of the matrix σ)

The basic properties of Ito diffusions are :

1) The Markov property.2) The strong Markov property.3) The generator A of the process X can be expressed in terms of b and σ.

Two conditions are usually introduced on the coefficient functions b and σto guarantee the existence and the uniqueness of the solution. For example,

THEOREM B.3Let T > 0, b(., .) : [0, T ] × Rd → Rd and σ(., .) : [0, T ] × Rd×p → Rd bemeasurable functions satisfying


1)|b(t, x)|+ ‖ σ(t, x) ‖≤ C(1 + |x|) , (B.18)

for some constant C (where |.| is the Euclidian norm on Rd, and ‖ σ ‖2=∑σ2ij) and the functions t→ b(t, x) t→ σ(t, x) are continuous.

2)|b(t, y)− b(t, x)|+ ‖ σ(t, y)− σ(t, x) ‖≤ D|y − x| , (B.19)

for some constant D.Let Z be a random variable which is independent of the σ−field FW generatedby the Brownian W and such that E[Z2] <∞. Then the stochastic differentialequation

dXt = b(t,Xt)dt + σ(t,Xt)dWt , t ≤ T, with X0 = Z

has a unique solution X which is continuous with respect to t and with com-ponents X i that are adapted and satisfy E[Supt≤T (X i

t)2] <∞.

The first assumption is called a linear growth condition. The second is alocally Lipschitz condition.

Two standard examples of diffusions are described in what follows.

Example B.3 Diffusions1) The stochastic exponential eWt−1/2t is a diffusion with b(t, x) = 0 andσ(t, x) = x.

2) The Ornstein-Uhlenbeck process can be defined as follows:

dXt = k(m−Xt)dt + σdWt,

where k is a non-negative constant and X0 is independent of the BrownianW . Note that X is a Gaussian process (all finite-dimensional distributions areGaussian). This process is often used in the term structure modeling since ithas the mean-reverting property. In fact, since

Xt = m + (X0 −m)e−kt + σ

∫ t

0

e−k(t−s)dWt ,

then E[Xt] = m + (X0 −m)e−kt which tends to m as t goes to infinity.

In fact, the definition of a diffusion is not standardized. For example, aprocess may be called a diffusion if it has continuous sample paths and if itsatisfies the strong Markov property. Under the assumptions of the previous


theorem, the unique solution of the stochastic differential equation is a strongMarkov process and, by construction, its paths are continuous.

Examine now the generator of a diffusion, solution of the stochastic differ-ential equation

dXt = b(t,Xt)dt + σ(t,Xt)dWt , t ≤ T, with X0 = Z .

Then, we deduce :

PROPOSITION B.3If f is twice continuously differentiable functions with a compact support thenf is in the domain D(A) of the operator A and

Af(x) =∂f

∂t+∑i≤d

bi(x)∂f

∂xi+ 1/2

∑i,j≤d

(σσ′)i,j(x)∂2f

∂xi∂xj. (B.20)

PROOF It is based on the Ito’s formula applied to Y = f(X).

dY =∑i

∂f

∂xi(X)dXi + 1/2

∑i,j≤d

∂2f

∂xi∂xj(X)d〈Xc

i , Xcj 〉 ,

from which, denoting by Ex[.] the conditional expectation with respect to theevent X0 = x, it can be deduced that

Ex[f(Xt)] = f(x)

+Ex

∫ t

0

∂f

∂t(X) +

∑i≤d

bi(x)∂f

∂xi(X) + 1/2

∑i,j≤d


∂xi∂xj(X)

ds

.

Then, the proposition is established by applying the definition of A

Af(x) = limt→0

1tE[f(Xt)|X0 = x]− f(x).

Example B.4 Generators1) The d−dimensional Brownian motion, which of course is the solution of

dXt = dWt,

has a generator A given by, for any function f in C20 (Rd),

Af = 1/2∑i

∂2f

∂x2i(X).


Thus A = 1/2 ∆ where ∆ is the Laplace operator.

2) The Ornstein-Uhlenbeck process

dXt = k(m−Xt)dt + σdWt,

has a generator A given by

Af(x) = k(m− x)∂f

∂x+ 1/2σ2∂

2f

∂x2.

Other standard concepts for Markov processes are the following ones (seee.g. Oksendal [401]):

1) The Dynkin formula.

PROPOSITION B.4

Assume that f is in C20 (Rd). If T is a stopping time with E[T |X0 = x] < ∞

then

E[f(XT )|X0 = x] = f(x) + E[∫ T

0

Af(Xs)ds|X0 = x] . (B.21)

For example, if T is the first exit time of a bounded set, then E[T |X0 =x] <∞ and the above formula is valid.

Consider for example the d-dimensional Brownian motion W starting atx0 ∈ Rd. Assume that |a| < R. Then, by applying the Dynkin formula, theexpected value of the first exit time TR of the Brownian motion from the ball

x ∈ Rd; |x| < R

is given byE[TR|W0 = x0] = 1/d (R2 − |x0|2) .

2) The Kolmogorov’s backward equation.

Consider an Ito diffusion X in Rd with a generator A. If f is a function inC20 (Rd), then by using the Dynkin formula with T = t, it is obvious that

u(t, x) = E[f(XT )|X0 = x]


is differentiable with respect to t with a differential given by

∂u

∂t= Ex[Af(XT )] .

Therefore, we have the following result:

PROPOSITION B.5 Solution of the backward equation

1) If f is in C20 (Rd), then u(t, .) is in D(A) for each t and

∂u∂t = Au, t > 0, x ∈ Rd,u(0, x) = f(x), x ∈ Rd .

(B.22)

2) Let C1,2(Rd) be the set of all functions continuous with respect to t andtwice continuously differentiable with respect to x. If a function w(., .) isbounded and in C1,2(Rd) and satisfies (B.22), then w = u.

3) The Feynman-Kac formula.This is a generalization of the Kolmogorov’s backward equation.

PROPOSITION B.6If f is in C2

0 (Rd) and g is continuous on Rd and lower bounded. Then v(t, .)defined by

v(t, x) = Ex[exp(−∫ t

0

g(Xs)ds)f(Xt)]

is the solution of∂v∂t = Av − gv, t > 0, x ∈ Rd,v(0, x) = f(x), x ∈ Rd .

(B.23)

2) Let C1,2(Rd) be the set of all functions continuous with respect to t andtwice continuously differentiable with respect to x. If a function w(., .) is inC1,2(Rd), is bounded on K×Rd for each compact subset K in R, and satisfies(B.23) then w = v.

An application of this is the determination of the generator of a killeddiffusion. Consider an Ito process X solution of

dXt = b(Xt)dt + σ(Xt)dWt .

Its generator is given by

Af(x) =∑i≤d

bi(x)∂f

∂xi+ 1/2

∑i,j≤d


∂xi∂xj.


Consider now the process X which is the process X killed at the random timeT :

Xt = Xt, if t < T ,

and Xt is undefined if t ≥ T . Denote the function c(x), which is the killingrate defined by

c(x) = limt↓0

1/t P[X is killed in the time interval (0, t)|X0 = x] .

Then, X is also a strong Markov process and

Ex[f(Xt)] = Ex[f(Xt)1t<T ] = Ex[f(Xt)exp(−∫ t

0

c(Xs)ds)] ,

for all bounded continuous functions f on Rd. Thus, the generator A of X isgiven by

Af(x) =∑i≤d

bi(x)∂f

∂xi+ 1/2

∑i,j≤d


∂xi∂xj− c(x)f(x) .

As it can be seen, the use of the generators may lead to the resolutionof partial differential equations (PDE), in particular when calculating optionprices. Consider the following pricing problem:Let A be the operator

Af(t, x) =∂f

∂t(t, x) + b(t, x)

∂f

∂x(t, x) + 1/2σ2(t, x)

∂2f

∂x2(t, x) ,

where b and σ satisfy the assumptions of Theorem (B.3).For a given function g (the payoff of a European option, for example), findthe solutions f of the next parabolic equation (f(t,Xt) will be the price ofthe option at any time t according to the value of the stock Xt at time t):

Af(t, x) = rf(t, x), Forallt ∈ [0, T ], ∀x ∈ Rd,f(T, x) = g(x). (B.24)

Consider Xx,t the Ito process defined by, for all u ≥ t,

Xx,tu = Xx,t

t +∫ u

t

b(s,Xx,ts )ds +

∫ u

t

σ(s,Xx,ts )dWs,

with the initial condition Xx,tt = x. Then, if f is the solution of (B.24),

applying the Ito formula, f satisfies

f(u,Xx,tu )e−ru = e−rtf(t, x) +

∫ u

t

e−rs ∂f

∂x(s,Xx,t

s )σ(s,Xx,ts )dWs.


If the previous stochastic integral is a martingale (with a mild integrabilityassumption on ∂f

∂x), then it can be deduced that

f(t, x) = E[e−r(T−t)g(Xx,tT )] = E[e−r(T−t)g(XT )|Xt = x].

Example B.5 Basic models1) Black and Scholes model - The well-known equation is

Af(t, x) = ∂f∂t + rx∂f

∂x + 1/2σ2 ∂2f∂x2 = rf(t, x) ,

f(T, x) = g(x) = (x−K)+ .(B.25)

Sof(t, x) = E[e−r(T−t)(XT −K)+|Xt = x] .

2) Cox-Ingersoll-Ross model - This process is usually used in the term struc-ture modeling.Thus, it is important to calculate such expectations as E

[exp(− ∫ t

srudu|Fs)

]when the spot rate r is given by:

drt = a(b− rt)dt + ρ√rtdWt.

Using the Markov property, it remains to calculate E[exp(− ∫ t

0 rudu)]. For

this, consider the solution rx,tu of

drx,ts = a(b− rx,ts )ds + ρ

√rx,ts dWs, r

x,tt = x .

Introduce :

f(t, x) = E

[exp

(−∫ T

t

rx,tu du

)], f(T, x) = 1 .

Then, f is solution of the PDE :

∂f

∂t+ a(b− x)

∂f

∂x+ 1/2ρ2

∂2f

∂x2= xf .

Now, results concerning PDE are applied to determine the solution which isgiven by

f(t, x) = Φ(T − t)exp(−r(t)Ψ(T − t)) ,

with Φ(s) =(

2γe(γ+a)s/2

(γ+a)(eγs−1)+2γ

) 2abρ2 , γ2 = a2 + 2ρ2, Ψ(s) = 2(eγs−1)

(γ+a)(eγs−1)+2γ .

Ito diffusions are used in most financial models. Nevertheless, other semi-martingales can be introduced instead of the Brownian motion; other Levyprocesses for example (see Cont and Tankov [128]). So, it is interesting to s-tudy stochastic differential equations driven by more general semimartingales(see Protter [419]).

References

[1] Acerbi, C., (2002): Spectral measures of risk: a coherent representationof subjective risk aversion, Journal of Banking and Finance, 26, 1505-1518.

[2] Acerbi, C., (2004): Coherent representations of subjective risk-aversion,In: G. Szego (ed.), Risk Measures for the 21st Century, Wiley: NewYork.

[3] Acerbi, C., Nordio, C., and Sirtori, C., (2001): Expected short-fall as a tool for financial risk management, Working paper,http://www.gloriamundi.org.

[4] Acerbi, C. and Prospero, S., (2002): Portfolio optimization with spec-tral measures of risk, Abaxbank.

[5] Acerbi, C. and Tasche, D., (2002): On the coherence of expected short-fall, Journal of Banking and Finance, 26, 1487-1503.

[6] Adcock, C.J. and Meade, N., (1994): A simple algorithm to incorpo-rate transactions costs in quadratic optimization, European Journal ofOperational Research, 79, 85-94.

[7] Agarwal, V. and Naik, N.Y., (2000a): Multi-period performance per-sistence analysis of hedge funds, Journal of Financial and QuantitativeAnalysis, 35, 327-342.

[8] Agarwal, V. and Naik, N.Y., (2000b): On taking the alternative route:risks, rewards style and performance persistence of hedge funds, Journalof Alternative Investment, 2, 6-23.

[9] Agarwal, V. and Naik, N.Y., (2000c): Performance evaluation of hedgefunds with option-based and buy-and-hold strategies, working paper,London Business School.

[10] Agarwal, V. and Naik, N.Y., (2004): Risks and portfolio decisions in-volving hedge funds, Review of Financial Studies, 17, 63-98.

[11] Akaike, H., (1976): Canonical correlation analysis of time series andthe use of an information criterion, System identification: advancesand case studies, R.K. Mehra and D.G. Lainiotis, eds., Academic Press:New York, 27-96.

397

398 References

[12] Akian M., Menaldi J., and Sulem A., (1996): On an investment-consumption model with transaction costs, SIAM Journal Control andOptimization, 34, 329-364.

[13] Akian M., Sulem A., and Taksar M., (2001): Dynamic optimizationof long term growth rate for a portfolio with transaction costs: thelogarithmic utility case, Mathematical Finance, 11, 153-188.

[14] Alexander, C., (2002): Market Models, Wiley: New York.

[15] Alexander, C. and Dimitriu, A., (2005): Indexing and statistical arbi-trage, Journal of Portfolio Management, 32, 50-63.

[16] Alexander, C. and Tasche, D., (2002): On the covariance matrices usedin Value-at-Visk models, Journal of Derivatives, 4, 50-62.

[17] Alexander, G.J., (1993): Short selling and efficient sets, Journal ofFinance, 48, 1497-1506.

[18] Alexander, G. and Baptista, A., (2002): Economic implications of usinga mean-VaR model for portfolio selection: a comparison with mean-variance analysis, Journal of Economic Dynamics and Control, 34, 329-364.

[19] Allais, M., (1953): Le comportement de l’homme rationnel devant lerisque: critique des postulats et axiomes de l’ecole americaine, Econo-metrica, 21, 503-546.

[20] Allais, M., (1979): The so-called Allais paradox and rational decisionsunder uncertainty, in Expected Utility Hypothesis and the Allais Para-dox, Allais and Hagen (1979) (eds.), Dordrecht: D. Reidel.

[21] Althofer, I. and Koschnick, K.U., (1991): On the convergence of thresh-old accepting, Applied Mathematical Optimization, 24, 183-195.

[22] Amenc, N. and Le Sourd, V., (2003): Portfolio Theory and PerformanceAnalysis, Wiley Finance, Wiley.

[23] Amenc, N. and Martellini, L., (2002): Portfolio optimization and hedgefund style allocation decisions, Journal of Alternative Investments, 5,7-20.

[24] Amenc N., Bonnet S., Henry G., Martellini L., and Weytens, A., (2004):La Gestion Alternative, Economica: Paris.

[25] Andersson, F., Mausser, H., Rosen, D., and Uryasev, S., (2001): Creditrisk optimization with conditional Value-at-Risk criterion, Mathemati-cal Programming, 89, 273-291.

[26] Anscombe, F.J. and Aumann, R.J., (1963): A definition of subjectiveprobability, Annals of Mathematical Statistics, 34, 199-205.

References 399

[27] Avouyi-Dovi, S., Morin, S., and Neto, D., (2004): Optimal asset allo-cation with Omega function, working paper Banque de France, Paris,France.

[28] Arrow, K.J., (1953): Le role des valeurs boursieres pour la repartition lameilleure des risques, Econometrie, Paris: CNRS (translated as Arrow,K. J., (1964): The role of securities in the optimal allocation of risk-bearing. Review of Economics Studies, 31, 91-96.)

[29] Arrow, K.J., (1965): Aspects of the Theory of Risk-Bearing, Helsinki:Yrj Hahnsson Foundation.

[30] Artzner, P., Delbaen, F., Eber, J.-M., and Heath, D., (1997): Thinkingcoherently, Risk, 10, 68-71.

[31] Artzner, P., Delbaen, F., Eber, J.-M., and Heath, D., (1999): Coherentmeasures of risk, Mathematical Finance, 9, 203-228.

[32] Artzner, P., Delbaen, F., Eber, J.-M., Heath, D., and Hyejin Ku, (2002):Coherent multiperiod risk measurement, Working paper, ETH Zurich.

[33] Artzner, P., Delbaen, F., Eber, J.-M., Heath, D., and Hyejin Ku,(2003): Coherent multiperiod risk adjusted values and Bellman’s prin-ciple, Working paper, ETH Zurich.

[34] Athayde, G., (2001): Building a mean-downside risk portfolio frontier,in Sortino F. and S. Satchell, Managing Downside Risk in FinancialMarkets: Theory, Practice and Implementation, Butterworth Heine-mann.

[35] Athayde, G. and Flores, R. Jr., (2004): Finding a maximum skewnessportfolio: a general solution to three-moments portfolio choice, Journalof Economic Dynamics and Control, 28, 1335-1352.

[36] Bacmann, J.-F. and Scholz, S., (2003): Alternative performance mea-sures for hedge funds, AIMA Journal, June.

[37] Bajeux, I., (1989): Gestion de portefeuille dans un modele binomial,Annales d’Economie et de Statistiques, 13, 50-76.

[38] Bajeux-Besnainou, I., Jordan, J.V., and Portait, R., (2001): An asset al-location puzzle: comment, American Economic Review, 91, 1170-1179.

[39] Bajeux-Besnainou, I., Jordan, J.V., and Portait, R., (2003): Dynamicasset allocation for stocks, bonds and cash, Journal of Business, 76,263-287.

[40] Bamberg, G., and Wagner, N., (2000): Equity index replication withstandard and robust regression estimators, OR Spectrum, 22, 525-543.

[41] Baquero, G., Horst, J., and Verbeek, M., (2005): Survival, look-aheadbias, and persistence in hedge fund performance, Journal of Financialand Quantitative Analysis, 40, 493-517.

400 References

[42] Barra, (1996): The Barra E2 multiple-factor model, Barra web site.

[43] Barrieu, P. and El Karoui, N., (2002): Inf-convolution of risk measuresand optimal risk transfer, Finance and Stochastics, 9, 269-298.

[44] Barrieu, P. and El Karoui, N., (2005): Pricing, hedging and optimallydesigning derivatives via minimization of risk measures, L.S.E., Statis-tics Department, London, and CMAP, Ecole Polytechnique, France.

[45] Bartlett, M., (1951): An inverse matrix adjustment arising in discrimi-nant analysis, Annals of Mathematical Statistics, 22, 107-111.

[46] Basak, S., (1995): A general equilibrium model of portfolio insurance,Review of Financial Studies, 8, 1059-1090.

[47] Basak, S., and Shapiro, A., (2001): Value-at-Risk-based risk manage-ment: optimal policies and asset prices, Review of Financial Studies,14, 371-405.

[48] Basak, S., (2002): A comparative study of portfolio insurance, Journalof Economic Dynamics and Control, 26, 1217-1241.

[49] Bassi, F., Embrechts, P., and Kafetzaki, M., (1997): Risk managementand quantile estimation, in: Adler, R., Feldman, R., and Taqqu, M.(Eds.), Practical Guide to Heavy Tails, Birkhauser: Boston.

[50] Battocchio, P., Menoncin, F., and Scaillet, O., (2005): Optimal assetallocation for pension funds under mortality risk during the accumu-lation and decumulation phases, forthcoming in Annals of OperationsResearch.

[51] Bauer, R.J., (1994): Genetic Algorithms and Investment Strategies, Wi-ley.

[52] Beasley, J.E., Meade N., and Chang, T.-J., (2003): An evolutionaryheuristic for the index tracking problem, European Journal of Opera-tional Research, 148, 621-643.

[53] Benzion, U. and Yagil, J., (2003): Portfolio composition choice: a be-havioral approach, Journal of Behavioral Finance, 4, 85-95.

[54] Bernoulli, D., (1738): Exposition of a new theory on the measurementof risk, Comentarii Academiae Scientiarium Imperialis Petropolitanae,as reprinted in 1954, Econometrica, 22, 23-26.

[55] Bertsimas, D., Lauprete, G.J., and Samarov, A., (2004): Shortfall asa risk measure: properties, optimization and applications, Journal ofEconomic Dynamics and Control, 28, 1353-1381.

[56] Bertrand, P., (2005): A note on portfolio performance attribution: tak-ing risk into account, Journal of Asset Management, 5, 428-437.

References 401

[57] Bertrand, P., Lesne, J.-P., and Prigent, J.-L., (2001): Gestion de porte-feuille avec garantie: l’allocation optimale en actifs derives, Finance,22, 7-35.

[58] Bertrand, P. and Prigent, J.-L., (2002): Portfolio Insurance: the ex-treme value approach of the CPPI method, Finance, 23, 69-86.

[59] Bertrand, P. and Prigent, J.-L., (2003): Portfolio insurance strategies:a comparison of standard methods when the volatility of the stock isstochastic, Int. J. Business, 8, 461-472.

[60] Bertrand, P. and Prigent, J.-L., (2005): Portfolio insurance strategies:OBPI versus CPPI, Finance, 26, 5-32.

[61] Bertrand, P., Prigent, J.-L., and Sobotka, R., (2001): Optimisation deportefeuille sous contrainte de variance de la tracking-error, Banque &Marches, 13, 1826-1836.

[62] Bertrand, P. and Rousseau, P., (2005): L’attribution de performance:problematique, limite et extension, Revue Francaise de Gestion.

[63] Bertsekas, D.P., (1987): Dynamic Programming, Prentice-Hall.

[64] Bertsekas, D.P., (2005): Dynamic Programming and Optimal Control,vol I and II, Athena Scientific.

[65] Best, M.J. and Grauer, R., (1991): On the sensitivity of mean-varianceefficient portfolios to changes in asset means: some analytical and com-putational results, Review of Financial Studies, 4, 315-342.

[66] Best, M.J. and Grauer, R., (1992): Positively weighted minimum-variance portfolios and the structure of asset expected returns, Journalof Financial and Quantitative Analysis, 27, 513-537.

[67] Best, M.J. and Hlouskova, J., (2000): The efficient frontier for boundedassets, Mathematical Methods of Operations Research, 52, 195-212.

[68] Best, M.J., and Hlouskova, J., (2003): Portfolio selection and transac-tions costs, Computational Optimization and Applications, 24, 95-116.

[69] Best, M.J. and Kale, J., (2000): Quadratic programming for large-scaleportfolio optimization, in Financial Services Information Systems, ed.Jessica Keyes, CRC Press, 513-529.

[70] Bielecki, T.R., and Pliska, S. R., (2000): Risk-sensitive dynamic assetmanagement in the presence of transaction costs, Finance and Stochas-tics, 4, 1-33.

[71] Bion-Nadal, J., (2003): Conditional risk measure and robust represen-tation of convex conditional risk measures, C.M.A.P., Ecole Polytech-nique, Palaiseau, France.

402 References

[72] Bixby, R.E., Gregory, J. W., Lustig, I.J., Marsten, R.E., and Shanno,D.F., (1992): Very large scale linear programming: a case study incombining interior point and simplex methods, Operations Research,40, 885-897.

[73] Black, F., (1972): Capital market equilibrium with restricted borrow-ing, Journal of Business, 45, 444-455.

[74] Black, F. and Litterman, R., (1990): Asset allocation: combining in-vestor views with market equilibrium, Goldman Sachs Fixed IncomeResearch.

[75] Black, F. and Rouhani, R., (1989): Constant proportion portfolio in-surance and the synthetic put option: a comparison, in Institutionalinvestor focus on investment management, edited by Frank J. Fabozzi,Cambridge, Mass. : Ballinger, 695-708.

[76] Black, F. and Jones, R., (1987): Simplying portfolio insurance, Journalof Portfolio Management, 48-51.

[77] Black, F. and Perold, A.R., (1992): Theory of constant proportion port-folio insurance, Journal of Economic Dynamics and Control, 16, 403-426.

[78] Black, F. and Scholes, M., (1973): The pricing of options and corporateliabilities portfolio insurance, Journal of Political Economy, 81, 637-654.

[79] Blanchet-Scalliet, C., El Karoui, N., Jeanblanc, M., and Martellini,L., (2001): Optimal investment and consumption decisions when time-horizon is uncertain, working paper, University of Evry, France.

[80] Bollerslev, T., Engle, R., and Nelson, D., (1986): Arch Models, in: R. F.Engle, and D. McFadden (eds.), Handbook of Econometrics, Elsevier,Edition 1, volume 4, chapter 49, pages 2959-3038.

[81] Bonnet, A. and Nagot, I., (2005): Methodology of measuring perfor-mance in alternative investment, Working paper, University Paris 1,France.

[82] Bookstaber, R. and Langsam, J.A., (2000): Portfolio insurance tradingrules, Journal of Futures Markets, 8, 15-31.

[83] Bookstaber, R. and Roger, C., (1985): Problems in evaluating the per-formance of portfolios with options, Financial Analysts Journal, 41,48-62.

[84] Bouchard, B. and Pham, H., (2004): Wealth-path dependent utilitymaximization in incomplete markets, Finance and Stochastics, 8, 579-603.

[85] Borodine, A.N. and Salminen, P., (2002): Handbook of Brownian Mo-tion - Facts and Formulae, Birkhauser.

References 403

[86] Boyd, S. and Vandenberghe, L., (2004): Convex Optimization, Cam-bridge University Press.

[87] Brennan, M., (1970): Taxes, market valuation and corporate financialpolicy, National Tax Journal, 25, 417-427.

[88] Brennan, M. and Solanki, R., (1981): Optimal portfolio insurance,Journal of Financial and Quantitative Analysis, 16, 279-300.

[89] Brennan, M. and Xia, Y., (2000): Stochastic interest rates and theBond-Stock mix, European Finance Review, 4, 197-210.

[90] Brinson, G.P., Hood, L.R., and Beebower, G.L. (1986): Determinantsof portfolio performance, Financial Analysts Journal, 39-44.

[91] Brooks, C. and Kat, H., (2002): The statistical properties of hedge fundindex returns and their implications for investors, Journal of AlternativeInvestments, 5, 26-44.

[92] Brown, W., Durbin, J., and Evans, J.M., (1975): Techniques for testingthe constancy of regression relationships over time, Journal of the RoyalStatistical Society, 37, 149-192.

[93] Brown, S.J., Goetzmann, W., Ibbotson, R.G., and Ross, S.A., (1992):Survivorship bias in performance studies, Review of Financial Studies,5, 553-580.

[94] Browne, S., (1999a): Beating a moving target: optimal portfolio strate-gies for outperforming a stochastic benchmark, Finance and Stochastics,3, 275-294.

[95] Browne, S., (1999b): Reaching goals by a dealine: digital optionsand continuous-time active portfolio management, Advances in AppliedProbability, 31, 551-577.

[96] Buckley, I.R.C., and Korn, R., (1998): Optimal index tracking undertransaction costs and impulse control, International Journal of Theo-retical and Applied Finance, 1, 315-330.

[97] Cadenillas, A., (2000): Consumption-investment problems with trans-action costs: survey and open problems, Mathematical Methods Opera-tions Research, 51, 43-68.

[98] Cadenillas, A., and Pliska, S., (1999): Optimal trading of a securitywhen there are taxes and transaction costs, Finance and Stochastics, 2,137-165.

[99] Cadenillas, A. and Sethi, S.P., (1997): The consumption investmentproblems with subsistence consumption, bankruptcy and random mar-ket coefficients, Journal of Optimization Theory and Applications, 93.

404 References

[100] Caglayan, M. and Edwards, F., (2001): Hedge fund performance andmanager skill, Journal of Futures Markets, 21, 1003-1028.

[101] Cai, X., Teo, K. L., Yang, X. Q., and Zhou, X.Y., (2000): Portfoliooptimization under a minimax rule, Management Science, 46, 957-972.

[102] Campbell, J., Cocco, J., Gomes, F., Maenhout, P.J., and Viceira, L.,(2001): Stock market mean reversion and the optimal equity allocationof a long-lived investor, European Finance Review, 5,269-292.

[103] Campbell, J. Y. and Viceira, L., (2002): Strategic Asset Allocation:Portfolio Choice for Long-Term Investors, Oxford University Press: Ox-ford.

[104] Canner, N., Mankiw, N. G., and Weil, D.N., (1997): An asset allocationpuzzle, American Economic Review, 87, 181-191.

[105] Carhart, M.M., (1997): On persistence in mutual fund performance,Journal of Finance, 52, 57-82.

[106] Carr, P. and Madan, D., (2001): Optimal positioning in derivative se-curities, Quantitative Finance, 1, 19-37.

[107] Cascon, A., Keating, C., and Shadwick, F., (2002): The mathematics ofthe Omega measure, working paper, The Finance Development Center.

[108] Cass, D. and Stiglitz, J.E., (1970): The structure of investor preferencesand asset returns, and separability in portfolio allocation: a contribu-tion to the pure theory of mutual funds, Journal of Economic Theory,2, 122-160.

[109] Cerf, R., (1994): Une theorie asymptotique des algorithmes genetiques,Phd Thesis, University of Montpellier II, France.

[110] Cesari, L., (1983): Optimization Theory and Applications: Problemswith Ordinary Equations, Springer: New York.

[111] Chabaane, A., Laurent, J.-P., Malevergne, Y., and Turpin, F., (2003):Alternative risk measures for alternative investments, to appear in Jour-nal of Risk.

[112] Chamberlain, G., (1983): A characterization of the distributions thatimply mean-variance utility functions, Journal of Economic Theory, 29,185-201.

[113] Chan, L.K.C., Karceski, J., and Lakonishok, J., (1999): On portfo-lio optimization: forecasting covariances and choosing the risk model,Review of Financial Studies, 12, 937-974.

[114] Chan, K., Karolyi, G., Longstaff, F., and Sanders, A., (1992): An em-pirical comparison of alternative models of the short term interest rate,Journal of Finance, 47, 1209-1227.

References 405

[115] Chateauneuf, A., Cohen, M., and Meilijson, I., (1997): New tools to bet-ter model behavior under risk and uncertainty: an overview, Finance,18, 25-46.

[116] Chekhlov, A., Uryasev, S., and Zabarankin, M., (2003): Portfolio op-timization with drawdown constraints, in Asset and Liability Manage-ment Tools, (B. Scherer ed.), Risk Books, London, 263-278.

[117] Cheridito, P., Delbaen, F., and Kupper, M., (2004): Coherent and con-vex risk measures for bounded cadlag processes, Stochastic processesand their applications, 112, 1-22.

[118] Cheridito, P., Delbaen, F., and Kupper, M., (2005): Coherent and con-vex risk measures for unbounded cadlag processes, Finance and Stochas-tics, 9, 369-387.

[119] Chew, S.-H., (1983): A generalization of the quasilinear mean with ap-plications to the measurement of income inequality and decision theoryresolving the Allais paradox, Econometrica, 51, 1065-1092.

[120] Chew, S.-H. and MacCrimmon, K.R., (1979): Alpha-nu choice theory:a generalization of expected utility theory, Working Paper No. 669,University of British Columbia, Vancouver.

[121] Chew, S., Karni, E., and Safra, Z., (1987): Risk aversion in the theory ofexpected utility with rank dependent preferences, Journal of EconomicTheory, 42, 370-381.

[122] Chopra, V. K. and Ziemba, W.T., (1993): The effect of errors in means,variances and covariances on optimal portfolio choice, Journal of Port-folio Management, 19, 6-11.

[123] Choquet, G., (1953): Theorie des capacites, Annales Institut Fourier,5, 131-295.

[124] Clarke, R.C., Krase, S., and Statman, M. (1994). Tracking errors, re-gret, and tactical asset allocation, Journal of Portfolio Management,20, 16-24.

[125] Cohen, M. and Tallon, J.-M., (2000): Decision dans le risque etl’incertain : l’apport des modeles non-additifs, Revue d’Economie Poli-tique, 110.

[126] Connor, G. and Leland, H., (1995): Cash management for index track-ing, Financial Analysts Journal, 75-80.

[127] Consiglio, A. and Zenios, S., (1999): Designing portfolios of financialproducts via integrated simulation and optimization models, OperationsResearch, 47, 195-208.

[128] Cont, R. and Tankov, P., (2004): Financial Modelling with Jump Pro-cesses, Chapman & Hall: Boca Raton.

406 References

[129] Coquet, F., Hu, Y., Memin, J., and Peng, S., (2002): Filtration con-sistent nonlinear expectations and related g-expectations, ProbabilityTheory and Related Fields, 123, 381-411.

[130] Cornell, B., (1979): Asymmetric information and portfolio performancemeasurement, Journal of Financial Economics, 7, 381-390.

[131] Cornish, E.A. and Fisher, R.A., (1937): Moments and cumulants inthe specification of distributions, Review of the International StatisticalInstitute, 307-330.

[132] Cox, J. and Huang, C.-F., (1989): Optimal consumption and portfoliochoices when asset prices follow a diffusion process, Journal of EconomicTheory, 49, 33-83.

[133] Cox, J. and Huang, C.-F., (1991): A variational problem arising infinancial economics, Journal of Mathematical Economics, 20, 465-487.

[134] Cuoco, D., (1997): Optimal consumption and equilibrium prices withportfolio constraints and stochastic income, Journal of Economic The-ory, 72, 33-73.

[135] Cuoco D., He, H., and Issaneko, S., (2001): Optimal dynamic tradingstrategies with risk limits, Working paper, Wharton School.

[136] Cvitanic, J., Goukasian, L., and Zapatero, F., (2003): Monte Carlocomputation of optimal portfolios in complete markets, Journal ofEconomic Dynamics and Control, 27, 971-986.

[137] Cvitanic, J. and Karatzas, I., (1992): Convex duality in constrainedportfolio optimization, Annals of Applied Probability, 2, 767-818.

[138] Cvitanic, J. and Karatzas, I., (1995): On portfolio optimization underdrawdown constraints, IMA Lecture Notes in Mathematics and Appli-cations, 65, 77-88.

[139] Cvitanic, J. and Karatzas, I., (1996): Hedging and portfolio optimiza-tion under transaction costs: a martingale approach, Mathematical Fi-nance, 6, 133-165.

[140] Cvitanic, J. and Karatzas, I., (1999): On dynamic measures of risk.Finance and Stochastics, 3, 451-482.

[141] Cvitanic, J., Lazrak, A., Martellini, L., and Zapatero, F., (2003): Op-timal allocation to hedge funds: an empirical analysis, QuantitativeFinance, 3, 1-12.

[142] Cvitanic, J., Schachermayer, W., and Wang, H., (2001): Utility maxi-mization in incomplete markets with random endowment, Finance andStochastics, 5, 259-272.

References 407

[143] Daniel, K., Grinblatt, M., Titman, S., and Wermers, R., (1997): Mea-suring mutual fund performance with characteristic-based benchmarks,Journal of Finance, 52, 1035-1058.

[144] Danielsson, J., Jorgensen, B.N., Samorodnitsky, G., Sarma, M., and deVries, C.G., (2005): Subadditivity re-examined: the case for Value-at-Risk, Working paper, London School of Economics.

[145] Dantzig, G.B., (1998): Linear Programming and Extensions, PrincetonUniversity Press: Princeton, NJ.

[146] Dantzig, G.B. and Infanger, G., (1993): Multi-stage stochastic linearprograms for portfolio optimization, Annals of Operations Research, 45,59-76.

[147] Darinka, D. and Ruszczynski, A., (2006): Portfolio optimization withstochastic dominance constraints, Journal of Banking and Finance, 30,433-451.

[148] Dash, S., Wagner, N., Bruck, B., and Diller, C., (2004): Tracking anindex with narrow baskets: efficiency, costs and tradeoffs involved inoptimized portfolios, Standard and Poor’s Report, Munich, New York,Paris, London.

[149] Davis, M. and Norman, A., (1990): Portfolio selection with transactioncosts, Mathematics of Operations Research, 15, 676-713.

[150] Davis, M. (1993): Markov Models and Optimization, Chapman andHall: London.

[151] De Bondt, W. and Wolff, F.P.C., (2004): Introduction to the specialissue on behavorial finance, Journal of Empirical Finance, 11, 423-427.

[152] Debreu, G.,(1959): Theory of Value: An Axiomatic Analysis of Eco-nomic Equilibrium, Yale University Press: New Haven.

[153] Deelstra, G., Pham, H., and Touzi, N., (2001): Dual formulation ofthe utility maximization problem under transaction costs, Annals ofApplied Probability, 11, 1353-1383.

[154] Delbaen, F., (2002): Coherent risk measures on general probability s-paces, Advances In Finance and Stochastics, Essays in Honour of DieterSondermann, Springer: New York.

[155] Delbaen, F. and Schachermayer, W. (2006): The mathematics of Arbi-trage, Springer: Berlin.

[156] Demange, G. and Rochet, J.-C., (1992): Methodes Mathematiques de laFinance, Economica: Paris.

[157] Dembo, R.S. and Rosen, D., (1999): The practice of portfolio replica-tion: a practical overview of forward and inverse problems, Annals ofOperations Research, 85, 267-284.

408 References

[158] Dempster, M.A.H., ed., (1980): Stochastic Programming, AcademicPress: London.

[159] Denuit, M., Dhaene, J., Goovaerts, M., Kaas, R., and Laeven, R.,(2005): Risk measurement with the equivalent utility principles, work-ing paper, Universite Catholique de Louvain, Belgium.

[160] De Palma, A., and Picard, N., (2003): Measuring risk tolerance, work-ing paper, University of Cergy-Pontoise, THEMA, France.

[161] De Palma, A., and Prigent, J.-L., (2005): Standardized versus cus-tomized portfolio: a compensating variation approach, to appear inAnnals of Operations Research.

[162] De Palma, A., Picard, N., and Prigent, J.-L., (2005): Utility invarianceunder a family of transformations, Working paper 2005, University ofCergy-Pontoise, Thema, France.

[163] Derigs, U. and Nickel, N., (2004): On a local-search heuristic for aclass of tracking error minimization problems in portfolio management,Annals of Operations Research, 131, 45-77.

[164] Detemple, J. Garcia, R., and Rindishbacher, M., (2003): A Monte Carlomethod for optimal portfolios, Journal of Finance, 58, 401-446.

[165] Detlefsen, K. and Scandolo, G., (2005): Conditional and dynamic con-vex risk measures. Finance and Stochastics, 9, 539-561.

[166] Dexter, A. S., Yu, J. N. W., and Ziemba, W.T., (1980): Portfolio s-election in a lognormal market when the investor has a power utility:computational results, in Stochastic Programming, M.A.H. Dempster(Ed.), Academic Press, New-York, 507-523.

[167] Diamond, P. and Stiglitz, J. E., (1974): Increases in risk and in riskaversion, Journal of Economic Theory, 8, 337-360.

[168] Doherty, N. and Eeckhoudt, L., (1995):. Optimal insurance without ex-pected utility : the dual theory and the linearity of insurance contracts,Journal of Risk and Uncertainty, 10, 157-179.

[169] Doukhan, P., (1994): Mixing: Properties and Examples, Lecture Notesin Statistics, Springer: New York.

[170] Dowd, K., (1998): Beyond Value-at-Risk : The New Science of RiskManagement, Wiley: Chichester.

[171] Dueck, G. and Scheuer, T., (1990): Threshold accepting: a generalpurpose algorithm appearing superior to simulated annealing, Journalof Computational Physics, 90, 161-175.

[172] Dueck, G. and Winker, P., (1992): New concepts and algorithms forportfolio choice, Applied Stochastic Models and Data Analysis, 8, 159-178.

References 409

[173] Duffie, D. and Epstein, L.G., (1992): Stochastic differential utility, E-conometrica, 60, 353-394.

[174] Duffie, D., Fleming, W., Soner, H.M. and Zariphopoulou, T., (1997):Hedging in incomplete markets with Hara utility, Journal of EconomicDynamics and Control, 21, 753-782.

[175] Duffie, D. and Huang, C.F., (1985): Implementing Arrow-Debreu equi-libria by continuous trading a few long-lived securities, Econometrica,53, 1337-1356.

[176] Duffie, D. and Pan, J. (1997): An overview of value at risk, Journal ofDerivatives, 4, 7-49.

[177] Duffie, D. and Sun, T.S., (1990): Transactions costs and portfolio choicein a discrete-continuous time setting, Journal of Economic Dynamicsand Control, 14, 35-51.

[178] Davidson, R. and Mac Kinnon, J., (1993): Estimation and Inference inEconometrics, Oxford University Press: New York.

[179] Dunis, C. and Ho, R., (2005): Cointegration portfolios of Europeanequities for index tracking and market neutral strategies, Journal ofAsset Management, 6, 33-52.

[180] Dickey, D.A. and Fuller, W.A., (1979): Distribution of the estimatorsfor autoregressive time series with a unit root, Journal of the AmericanStatistical Association, 74, 427-431.

[181] Dybvig, P.H., (1984): Short sales restrictions and kinks on the meanvariance frontier, Journal of Finance, 39, 239-244.

[182] Dybvig, P.H., (1988): Inefficient dynamic portfolio strategies or how tothrow away a million dollars in the stock market, Review of FinancialStudies, 1, 67-88.

[183] Dybvig, P.H. and Ingersoll, J.E., (1982): Mean-variance theory in com-plete markets, Journal of Business, 55, 233-251.

[184] Dybvig, P.H. and Ross, S.A., (1985): The analytics of performancemeasurement using a security market line, Journal of Finance, 40, 401-416.

[185] Eeckhoudt, L., (1997): La theorie duale des choix risques et la diversi-fication: quelques reflexions, Finance, 18, 77-84.

[186] Edgeworth, F.Y., (1908): Papers relating to political economy, 3 vols(London: MacMillan for the Royal Economic Society, 1925), I, 19-20,fn. 3.

[187] Ekeland, I. and Turnbull, T., (1983): Infinite-Dimensional Optimiza-tion and Convexity, Chicago lectures in Mathematics, the University ofChicago Press.

410 References

[188] El Karoui, N., (1981): Les Aspects Probabilistes du Controle Stochas-tique, Lecture Notes in Math 876, Springer, Berlin, New York.

[189] El Karoui, N. and Karatzas, I., (1994): Dynamic allocation problemsin continuous time, Annals of Applied Probability, 4, 255-286.

[190] El Karoui, N., and Jeanblanc, M., (1998): Optimization of consumptionwith labor income, Finance and Stochastics, 2, 409-440.

[191] El Karoui, N., Jeanblanc, M., and Huang, S., (2002): Random horizon,working paper, University of Evry, France.

[192] El Karoui, N., Jeanblanc, M., and Lacoste, V., (2005): Optimal portfo-lio management with American capital guarantee, Journal of EconomicDynamics and Control, 29, 449-468.

[193] El Karoui, N. and Meziou A., (2006): Constrained optimization with re-spect to stochastic dominance: application to portfolio insurance, Math-ematical Finance, 16, 103-117.

[194] El Karoui, N. and Quenez, M.-C., (1996): Non-linear pricing theoryand backward stochastic differential equations in financial mathematics,(ed: W.J. Runggaldier), Lecture Notes in Mathematics 1656, SpringerVerlag, 191-246.

[195] El Karoui, N., S. Peng, S., and Quenez, M.-C., (1997): Backward s-tochastic differential equations in finance, Mathematical Finance, 7, 1-71.

[196] Ellsberg, D., (1961): Risk, ambiguity and the Savage axioms, QuarterlyJournal of Economics, 75, 643-669.

[197] Elton, E. and Gruber, M., (1973): Estimating the dependence structureof share prices-Implications for portfolio selection, Journal of Finance,8, 1203-1232.

[198] Elton, E., Gruber, M., and Padberg, M., (1976): Simple criteria foroptimal portfolio selection, Journal of Finance, 31, 1341-1357.

[199] Elton, E., Gruber, M., and Padberg, M., (1978): Optimal portfolio fromsimple ranking devices, Journal of Portfolio Management, 4, 15-19.

[200] Elton, E. and Gruber, M., (2002): Modern Portfolio Theory and In-vestment Analysis, 6th edition, Wiley: New York.

[201] Embrechts, P., Kluppelberg, C., and Mikosch, T., (1997): ModellingExtremal Events for Finance and Insurance, Springer: New York.

[202] Embrechts, P., Mc Neil, A., and Straumann, D., (1999): Correlation:pitfalls and alternatives, Risk Magazine, 69-71.

[203] Emmer, S., Kluppelberg, C., and Korn, R., (2001): Optimal portfolioswith bounded Capital-at-Risk, Mathematical Finance, 11, 365-384.

References 411

[204] Engle, R.F., (1982): Autoregressive conditional heteroscedasticity withestimates of the variance of U.K. inflation, Econometrica, 50, 987-1008.

[205] Engle, R.F. and Granger, C.W.J., (1987): Cointegration and error cor-rection: representation, estimation and testing, Econometrica, 55, 251-276.

[206] Engle, R.F. and Kroner, K., (1995): Multivariate simultaneousGARCH, Econometric Theory, 11, 122-150.

[207] Engle, R. and Manganelli, S., (2004): Caviar: conditional autoregressiveValue-at-Risk by regression quantiles, Journal of Business & EconomicStatistics, 22, 367-381.

[208] Engle, R. and Ng, V., (1993): Time varying volatility and the dynamicbehavior of the term structure, Journal of Money, Credit and Banking,25, 336-349.

[209] Engle, R. and Sheppard, K., (2001): Theoretical and empirical prop-erties of dynamic conditional correlation multivariate GARCH, UCSDworking paper, 2001-15.

[210] Epstein, L. G., (1985): Decreasing risk aversion and mean-varianceanalysis, Econometrica, 53, 945-962.

[211] Epstein, L.G. and Schneider, M., (2003): Recursive multiple-priors, J.Econom. Theory, 113, 1-31.

[212] Epstein, L.G. and Zin, S.E., (1989): Substitution, risk aversion, andthe temporal behavior of consumption and asset returns: a theoreticalframework, Econometrica, 57, 937-969.

[213] Ested, T. and Kritzman, M., (1988): TIPP: insurance without com-plexity, Journal of Portfolio Management, 14, 38-42.

[214] Eun, C. and Resnick, B., (1992): Forecasting the correlation structureof share prices: a test of new models, Journal of Banking and Finance,16, 643-656.

[215] Fabozzi, F., (1996): Bond Markets Analysis and Strategies, 3rd ed.Englewood Cliffs, NJ: Prentice Hall.

[216] Falk, M., (1985): Asymptotic normality of the kernel quantile estimator,Annals of Statistics, 13, 428-423.

[217] Fama, E.F., (1970): Efficient capital markets: a review of theory andempirical work, Journal of Finance, 25, 383-417.

[218] Fama, E.F., (1972): Components of investment performance, Journalof Finance, 27, 551-567.

412 References

[219] Fama, E.F. and French, K.R., (1993): Common risks factors in thereturns on stocks and bonds, Journal of Financial Economics, 33, 389-406.

[220] Fang, K.-T., Kotz, S., and Ng, K.W., ( 1990): Symmetric Multivariateand Related Distributions, Chapman and Hall, London.

[221] Fermanian, J.D. and Scaillet, O., (2005): Sensitivity analysis of VaRand expected shortfall for portfolios under netting agreements, J. Bank-ing and Finance, 29, 927-958.

[222] Ferson, W. E. and Schadt, R. W., (1996): Measuring fund strategy andperformance in changing economic conditions, Journal of Finance, 51,425-461.

[223] Fishburn, P.C., (1970): Utility Theory for Decision-Making, New York:Wiley.

[224] Fishburn, P.C., (1983): Transitive measurable utility, Journal of Eco-nomic Theory, 31, 293-317.

[225] Fishburn, P.C., (1988a): Expected utility: an anniversary and a newera, Journal of Risk and Uncertainty, (Historical archive), 1, 267-283.

[226] Fishburn, P.C., (1988a): Non-linear Preference and Utility Theory, Bal-timore: Johns Hopkins University Press.

[227] Fischer, I., (1906): The Nature of Capital and Income, MacMillan: NewYork.

[228] Fischer, T., (2001): Examples of coherent risk measures depending onhigher moments, Insurance: Mathematics and Economics, 32, 135-146.

[229] Fitzpatrick, B. and Fleming, W., (1991): Numerical methods for opti-mal investment-consumption models, Math. Oper. Res., 16, 657-823.

[230] Fleming, W., (1978): Exit probabilities and optimal stochastic control,Appl. Math. Optim., 4, 329-346.

[231] Fleming, W. and Rischel, R., (1975): Deterministic and Stochastic Op-timal Control, Springer Verlag: Berlin.

[232] Fleming, W. and Soner, M., (1993): Controlled Markov Processes andViscosity Solution, Springer Verlag: Berlin.

[233] Follmer, H. and Kabanov, Y., (1998): Optional decomposition and La-grange multipliers, Finance and Stochastics, 2, 69-81.

[234] Follmer, H. and Leukert, P., (1999): Quantile hedging, Finance andStochastics, 3, 251-273.

[235] Follmer, H. and Leukert, P., (2000): Efficient hedging: cost versusshortfall risk, Finance and Stochastics, 4, 117-146.

References 413

[236] Follmer, H. and Schied, A., (2002a): Convex measures of risk and trad-ing constraints, Finance and Stochastics, 6, 429-448.

[237] Follmer, H. and Schied, A. (2002b): Robust preferences and convexmeasures of risk, in K. Sandmann and P. J. Schonbucher, eds, Advancesin Finance and Stochastics, Springer Verlag: Berlin, 39-56.

[238] Follmer, H. and Schied, A., (2002c): Stochastic Finance - An Introduc-tion in Discrete Time, Walter de Gruyter: Berlin.

[239] Fong, F., (1990): Portfolio construction: fixed income, in John L. Mag-inn and Donald L. Tuttle, eds., Managing investment portfolios: a dy-namic process, 2nd ed. Boston: Warren, Gorham and Lamont.

[240] Frank, M. and Wolfe, P., (1956): An algorithm for quadratic program-ming, Naval Research Logistics Quarterly, 3, 149-154.

[241] Franks, E.C. (1992): Targeting excess-of-benchmark returns, Journalof Portfolio Management 18, 6-12.

[242] Friedman, M. and Savage, L., (1948): Utility analysis of choices involv-ing risk, Journal of Political Economics, 56, 279-304.

[243] Frittelli, M. and Rosazza, G.E., (2004): Dynamic convex risk measures.In: G. Szego (ed.), Risk Measures for the 21st Century, Wiley: NewYork.

[244] Frost, P. and Savarino, J., (1986): An empirical Bayes approach to effi-cient portfolio selection Journal of Financial and Quantitative Analysis,21, 293-306.

[245] Frost, P. and Savarino, J., (1988): For better performance: constraintportfolio weights, Journal of Portfolio Management, 14, 29-34.

[246] Fung, W., and Hsieh, D.A., (1997a): Empirical characteristics of dy-namic trading strategies: the case of hedge funds, Review of FinancialStudies, 10, 275-302.

[247] Fung, W. and Hsieh, D.A., (1997b): Survivorship bias and investmentstyle in the returns of CTA’s, Journal of Portfolio Management, 24,30-41.

[248] Fung, W. and Hsieh, D.A., (1997c): Hedge fund survival lifetimes, Jour-nal of Asset Management, 3, 237-252.

[249] Fung, W. and Hsieh, D.A., (2001): The risk in hedge fund strategies:theory and evidence from trend followers, Review of Financial Studies,14, 313-341.

[250] Gaivoronski, A. and Krylov, S., (2006): Decision support for benchmarktracking: dynamic rebalancing strategies, working paper, Department

414 References

of Industrial Economy and Technology Management, Norwegian Uni-versity of Science and Technology, Norway.

[251] Gaivoronski, A. and Pflug, G., (2005): Value-at-Risk in portfolio opti-mization: properties and computational approach, Journal of Risk, 7,1-31.

[252] Giacometti, R. and Lozza, S., (2004): Risk measures for asset allocationmodels, in the volume (eds. Szego), Risk measures for the 21st century,Wiley, 69-87.

[253] Gilboa, I., (1987): Expected utility with purely subjective non-additiveprobabilities, Journal of Mathematical Economics, 16, 65-88.

[254] Gilboa, I. and Schmeidler, D. (1989): Maxmin expected utility withnon-unique prior, Journal of Mathematical Economics, 18, 141-153.

[255] Gilli, M. and Kellezi, E., (2002): The threshold accepting heuristicfor index tracking, in Financial Engineering, E-Commerce, and Sup-ply Chain, (Eds. P. Pardalos and V.K. Tsitsiringos), Kluwer AppliedOptimization Series, 1-18.

[256] Glosten, L.R., Jagannathan, R., and Runkle, D., (1993): On the rela-tion between the expected value and the volatility of the normal excessreturn on stocks, Journal of Finance, 48, 1779-1801.

[257] Goetzmann, W., Ingersoll, J.E., Spiegel, M., and Welch, I., (2002):Sharpening Sharpe ratios, Working Paper Yale School of Management.

[258] Gollier, C., (2001): The Economics of Risk and Time, The MIT Press:Cambridge, MA.

[259] Goovaerts, M.J., Kaasb, R., Dhaenea, J., and Tangb, Q., (2003): Aunified approach to generate risk measures, Department of QuantitativeEconomics, University of Amsterdam.

[260] Gourieroux, C., (1997): ARCH Models and Financial Applications,Springer Series in Statistics, Springer: New york.

[261] Gourieroux, C., Laurent, J.-P., and Scaillet, O., (2000): Sensitivityanalysis of values-at-risk, Journal of Empirical Finance, 7, 225-245.

[262] Granger, C.W.J., (1969): Investigating causal relations by econometricmodels and cross spectral methods, Econometrica, 37.

[263] Granger, C.W.J. and Joyeux, R., (1980): An introduction to longmemo-ry time series models and fractional differencing, Journal of Time SeriesAnalysis, 1, 15-29.

[264] Granger, C.W.J. and Newbold, P., (1974): Spurious regressions in e-conometrics, Journal of Econometrics, 2, 111-120.

References 415

[265] Green, R., (1986): Positively weighted portfolios on the minimum-variance frontier, Journal of Finance, 41, 1051-1068.

[266] Green, R. and Hollifield, B., (1992): When will mean-variance efficientportfolios be well diversified?, Journal of Finance, 47, 1785-1809.

[267] Greene, W.H., (2002): Econometric Analysis, 5th ed., Prentice Hall.

[268] Gregoriou, G., Hubner, G., and Papageorgiou, N., (2005): Hedge fund-s: insights in performance measurement, risk analysis, and portfolioallocation, Wiley.

[269] Grinblatt, M. and Titman, S., (1989a): Mutual fund performance: ananalysis of quarterly portfolio holdings, Journal of Business, 62, 393-416.

[270] Grinblatt, M. and Titman, S., (1989b): Portfolio performance eval-uation: old issues and new insights, Review of Financial Studies, 2,393-421.

[271] Grinblatt, M. and Titman, S., (1993): Performance measurement with-out benchmarks: an examination of mutual fund returns, Journal ofBusiness, 66, 47-68.

[272] Grinold, R.C., (1992): Are benchmark portfolios efficient?, Journal ofPortfolio Management, 19, 34-40.

[273] Grinold, R.C., and Kahn, R., (2000): Active Portfolio Management: AQuantitative Approach for Producing Superior Returns and ControllingRisk, 2nd edn., Irwin: Chicago.

[274] Grossman, S. and Vila, J.L., (1989): Portfolio insurance in completemarkets: a note, Journal of Business, 62, 473-476.

[275] Grossman, S. and Zhou, J., (1993): Optimal investment strategies forcontrolling drawdowns, Mathematical Finance, 3, 241-276.

[276] Grossman, S. and Zhou, J., (1996): Equilibrium analysis of portfolioinsurance, Journal of Finance, 51, 1379-1403.

[277] Hadar, J. and Russel, W., (1969): Rules for ordering uncertain prospect-s, American Economic Review, 59, 25-34.

[278] de Haan, L., Jansen, D.W., Koedijk, K., and de Vries C.G., (1994) :Safety first portfolio selection, extreme value theory and long run assetrisks, Extreme Value Theory and Applications.

[279] Haar, S. and van Straalen, J.P., (2004): Managing tracking errors in adynamic environment, GARP Risk Review, 16.

[280] Hamidi, B., Jurczenko, E., and Maillet, B., (2006): D’un multiple condi-tionnel en assurance de portefeuille, working paper, University of Paris1.

416 References

[281] Hanoch, G. and Levy, H., (1969): The efficiency analysis of choicesinvolving risk, Review of Economic Studies, 36, 335-346.

[282] Harri, A. and Brorsen, B.W., (2004): Performance persistence and thesource of returns for hedge funds, Applied Financial Economics, 14,131-141.

[283] Harrel, F. and Davis, C., (1982): A new distribution free quantile esti-mation, Biometrika, 69, 635-640.

[284] Harrison, J.M. and Kreps, D., (1979): Martingales and arbitrage inmultiperiod securities markets, Journal of Economic Theory, 20, 381-408.

[285] Harrison, J.M. and Pliska, S., (1981): Martingales and stochastic inte-grals in the theory of continuous trading, Stoch. Proc. Appl., 11, 215-260.

[286] Heath, D., (2000): Back to the future, Plenary lecture at the first WorldCongress of the Bachelier Society, June, Paris.

[287] He, H. and Pages, H.,(1993): Labor income, borrowing constraints, andequilibrium asset prices, Economic Theory, 3, 663-696.

[288] He, H. and Pearson, N.D., (1991a): Consumption and portfolio poli-cies with incomplete markets and short-sale constraints: the infinite-dimensional case, Journal of Economic Theory, 54, 259-305.

[289] He, H. and Pearson, N.D., (1991b): Consumption and portfolio poli-cies with incomplete markets and short-sale constraints: the finite-dimensional case, Mathematical Finance, 1, 1-10.

[290] Henriksson, R.D., and Merton, R.C., (1981): On market timing andinvestment performance II. Statistical procedures for evaluationg fore-casting skills, Journal of Business, 54, 513-533.

[291] Herstein, I.N. and Milnor, J.,(1953) : An axiomatic approach to mea-surable utility, Econometrica, 21, 291-297.

[292] Herzog, F., Dondi, G., and Geering, H.P., (2004): Strategic asset alloca-tion with factor models for returns and GARCH models for volatilities,Proceedings of the 4th IASTED International Conference on Modelling,Simulation, and Optimization - MSO 2004, pp. 59-65, Kauai, HI.

[293] Hicks, J., (1931): The theory of uncertainty and profit, Economica, 11,170-189.

[294] Jacod, J. and Shiryaev, A.N., (1987): Limit Theorems for StochasticTheorems, Grundlehren der Math. Wissenschaften, 288, Springer (newedition 2003).

References 417

[295] Jaffray, J.Y., (1989): Linear utility for belief functions, Operations Re-search Letters, 8, 107-112.

[296] Jeanblanc, M., Lacoste, V., and Roland, S., (2005): Utility maximiza-tion under partial information in jump-diffusion models, working paper,University of Evry, France.

[297] Jeanblanc, Lakner, P., and Kadam, A., (20014): Optimal bankruptcytime and consumption/investment policies on an infinite horizon with acontinuous debt payment until bankruptcy, Mathematics of OperationsResearch, 29, 649-671.

[298] Jeanblanc, M. and Pontier, M., (1990): Optimal portfolio for a smallinvestor in a market with discontinuous prices, Applied mathematicsand optimization, 22, 287-310.

[299] Jegadeesh, N. and Titman, S., (1993): Returns to buying winners andselling losers: implications for stock market efficiency, Journal of Fi-nance, 48, 65-91.

[300] Jensen, M.C., (1968): The performance of mutual funds in the period1945-1964, Journal of Finance, 23, 389-416.

[301] Jensen, N.E., (1967): An introduction to Bernoullian utility theory, I:utility functions, Swedish Journal of Economics, 69, 163-83.

[302] Jensen, B.A. and Sorensen, C., (2001): Paying for minimum interestrate guarantees: who should compensate who? European FinancialManagement, 7, 183-211.

[303] Jia, J., and Dyer, S., (1996): A standard measure of risk and risk-valuemodels, Management Science, 42, 1691-1705.

[304] Johansen, S., (1988): Statistical analysis of cointegration vectors, Jour-nal of Economic Dynamics and Control, 12, 231-254.

[305] Jobson, J. and Korkie, B., (1980): Estimation for Markowitz efficientportfolios, Journal of the American Statistical Association, 75, 544-554.

[306] Jobson, J. and Korkie, B., (1981): Putting Markowitz theory to work,Journal of Portfolio Management, 7, 70-74.

[307] Joe, H., (1997): Multivariate Models and Dependence Concepts, Chap-mann & Hall: London.

[308] Jorion, P., (1996): Risk2: measuring the risk in Value-at-Tisk, FinancialAnalysts Journal, November/December, 47-56.

[309] Jorion, P., (1997): Value at Risk: The New Benchmark for ManagingFinancial Risk, McGraw-Hill: New York.

[310] Jorion, P., (2003): Portfolio optimization with tracking-error con-straints, Financial Analyst Journal, 70-82.

418 References

[311] Jouini, E., Meddeb, M. and Nizar T., (2004): Vector-valued coherentrisk measures, Finance and Stochastics, 8, 531-552.

[312] Jurczenko, E. and Maillet, B. (2004): The 4-CAPM: in between AssetPricing and Asset Allocation, in: Multi-moment Capital Asset PricingMoment and Relatd Topics, Adcock et al. (eds.), Springer, Chapter 3,forthcoming.

[313] Kahn, R.N. and Rudd, A., (1995): Does historical performance predictfuture performance?, Barra Newlsetter.

[314] Kahneman, D. and Tversky, A., (1979): Prospect theory: an analysisof decision under risk, Econometrica, 47, 263-291.

[315] Kallberg, J. G. and Ziemba, W.T., (1983): Comparison of alternativeutility functions in portfolio selection problems, Management Science,29, 1257-1276.

[316] Kan, Y. S. and Kibzun, A.I., (1996): Stochastic Programming Problemswith Probability and Quantile Functions, Wiley.

[317] Karatzas, I., Lakner, P., Lehoczky, J.P., and Shreve, S. E., (1991): Equi-librium in a simplified dynamics stochastic economy with heterogeneousagents, in: E. Meyer-Wolf, E. Merzbach, O. Zeitouni (eds), Stochasticanalysis Liber Amicorum for Moshe Zakai, Academic Press: New York,245-272.

[318] Karatzas, I., Lehoczky, J.P., Sethi, S.P., and Shreve, S. E., (1985):Explicit solution of a general consumption/investment problem, Math-ematics of Operations Research, 25, 157-186.

[319] Karatzas, I., Lehoczky, J.P., Shreve, S.E. and Xu, G.-L., (1991): Mar-tingale and duality methods for utility maximization in an incompletemarket, SIAM Journal of Control and Optimization, 29, 702-730.

[320] Karatzas, I. and Shreve, S.E., (1991): Brownian Motion and StochasticCalculus (2nd edition), Springer: New York.

[321] Karatzas, I. and Shreve, S.E., (1998): Methods of Mathematical Fi-nance, Springer: New York.

[322] Karatzas, I. and Wang, H., (2001): Utility maximization with discre-tionary stopping, SIAM Journal of Control and Optimization, 39, 306-329.

[323] Karni, E., (1985): Decision Making under Uncertainty: the Case ofState-Dependent Preferences, Harvard University Press: Cambridge,MA.

[324] Kataoka, S., (1963): A stochastic programming model, Econometrica,31, 181-196.

References 419

[325] Kazemi, H., Schneeweis, T. and Gupta, R., (2003): Omega as perfor-mance measure, CISDM University of Massachusetts, Amherst.

[326] Keating, C. and Shadwick, F., (2002): A universal performance mea-sure, Journal of Performance Measurement, 6, 59-84.

[327] Kihlstrom, R. and Mirman, L.J., (1974): Risk aversion with many com-modities, Journal of Economic Theory, 8, 361-388.

[328] Knight, F.H., (1921): Risk, Uncertainty and Profit, Hart, Schaffner &Marx; Houghton Mifflin company: Boston, MA.

[329] Konno, H, Shirakawa, H., and Yamazaki, H., (1993): A mean-absolutedeviation-skewness portfolio optimization model, Annals of OperationsResearch, 45, 205-220.

[330] Konno, H. and Wijayanayake, A., (2001): Portfolio optimization prob-lem under concave transaction costs and minimal transaction unit con-straints. Mathematical Programming, 89, 233-250.

[331] Konno, H. and Yamazaki, H., (1991): Mean-absolute deviation port-folio optimization models and its applications to Tokyo stock market,Management Science, 37, 519-531.

[332] Koopmans, T.C., (1960): Stationary ordinal utility and impatience,Econometrica, 28, 287-309.

[333] Korn, R., (1997): Optimal Portfolios - Stochastic Models for OptimalInvestment and Risk Management, World Scientific.

[334] Korn, R., (1998): Portfolio optimization with strictly positive transac-tion costs and impulse control, Finance and Stochastics, 2, 85-114.

[335] Kramkov, D. and Schachermayer, W., (1999): The asymptotic elastic-ity of utility functions and optimal investment in incomplete markets,Annals of Applied Probability, 9, 904-950.

[336] Krokhmal, P., Palmquist, J. and Uryasev, S., (2001): Portfolio op-timization with conditional Value-at-Risk objective and constraints,Journal of Risk, 4, 43-68.

[337] Krokhmal, P., Uryasev, S., and Zrazhevsky G., (2002): Risk manage-ment for hedge fund portfolios: a comparative analysis of linear rebal-ancing strategies, Journal of Alternative Investments, 10-29.

[338] Kroll, Y., Levy, H., and Markowitz, H., (1984): Mean-variance versusdirect utility maximization, Journal of Finance, 39, 47-61.

[339] Krylov, N.V., (1980): Controlled Diffusion Process, Springer: Berlin.

[340] Kushner, H.J., (1967): Optimal discounted stochastic control for diffu-sion processes, SIAM Journal of Control, 5, 520-531.

420 References

[341] Kushner, H.J. and Dupuis, P.G., (1992): Numerical Methods for S-tochastic Control Problems in Continuous Time. Springer: New York,Berlin.

[342] Kusuoka, S., (2001): On law invariant coherent risk measures, in: Ad-vances in Mathematical Economics, Springer: Tokyo, 3, 83-95.

[343] Lakner, P., (1998): Optimal trading strategy for an investor: the caseof partial information, Stochastic Processes and their Applications, 76,77-97.

[344] Larsen, N., Mauser H., and Uryasev, S., (2002): Algorithms for opti-mization of Value-at-Risk, in: P.Pardalos and V.K. Tsitsiringos, (Ed-s) Financial Engineering, e-commerce and Supply Chain. Kluwer Aca-demic Publishers, 129-157. (see http://www.ise.ufl.edu/uryasev/VaR-minimization.pdf).

[345] Lederman, J. and Klein, R., (1994): Global Asset Allocation: Tech-niques for Optimizing Portfolio Management, Wiley.

[346] Ledoit, O. and Wolf, M., (2002): Some hypothesis tests for the covari-ance matrix when the dimension is large compared to the sample size,Annals of Statistics, 30, 1081-1102.

[347] Leland, H.E., (1980): Who should buy portfolio insurance? Journal ofFinance, 35, 581-594.

[348] Leland, H.E., (1999): Beyond mean-variance: risk and performancemeasurement in a nonsymmetrical world, Financial Analysts Journal,54, Jan./Feb.

[349] Leland, H.E. and Rubinstein, M., (1988): The evolution of portfolio in-surance, in: D.L. Luskin, ed., Portfolio Insurance: a Guide to DynamicHedging, Wiley.

[350] Lehoczky, J.P., Sethi, S.P., and Shreve, S.E., (1983): Optimal con-sumption and investment policies allowing consumption constraints andbankruptcy, Mathematics of Operations Research, 8, 613-636.

[351] Levkovitz, R. and Mitra, G., (1993): Solution of large scale linear pro-grams: a review of hardware, software and algorithmic issues, Opti-mization in industry, Wiley, 139-172.

[352] Levy, H., (1994) : Absolute and relative risk aversion: An experimentalstudy. Journal of Risk and Uncertainty, 8, 289-307.

[353] Levy, H. and Levy, M., (2004) : Prospect theory and mean-varianceanalysis, Review of Financial Studies, 17, 1015-1041.

[354] Lhabitant, F.-S., (2001): Hedge funds investing: a quantitative lookinside the black box, working paper, Union Bancaire Privee.

References 421

[355] Lhabitant, F.-S., (2004): Hedge Funds: Quantitative Insights, Wiley.

[356] Lichtenstein, S. and Slovic, P., (1971): Reversal of preferences betweenbids and choices in gambling decisions, Journal of Experimental Psy-chology, 89, 46-55.

[357] Lioui, A. and Poncet, P., (2001): On optimal portfolio choice understochastic interest rates, Journal of Economic Dynamics and Control,25, 1841-1865.

[358] Lions, P.-L., (1983): Optimal control of diffusion processes andHamilton-Jacobi-Bellman equations. Part II: Viscosity solutions and u-niqueness, Comm. Partial Differential Equations, 8, 1229-1276.

[359] Lo, A., (2002): The statistics of Sharpe ratios, Financial Analysts Jour-nal, 58, 36-52.

[360] Lobosco, A., (1999): Style/risk-adjusted performance, Journal of Port-folio Management, 25, 65-68.

[361] Long, J.B., (1990): The numeraire portfolio, Journal of Financial Eco-nomics, 26, 29-69.

[362] Longin, F., (2001): Beyond the VaR, Journal of Derivatives, Summer,36-48.

[363] Longin, F. and Solnik, B., (1995): Is the correlation in internationalequity returns constant: 1960-1990?, Journal of International Moneyand Finance, 14, 3-26,

[364] Longin, F. and Solnik, B., (2001): Extreme correlation of internationalequity markets, Journal of Finance, 56, 651-678.

[365] Loomes, G. and Sugden, R., (1982): Regret theory: an alternativetheory of rational choice under uncertainty, Economic Journal, 92, 805-824.

[366] MacFadden, D., (1973): Conditional logit analysis of qualitative choicebehavior, in Zarembka, P. (ed.), Frontiers in Econometrics, AcademicPress: New York.

[367] Machina, M., (1982a): Expected utility analysis without the indepen-dence axiom, Econometrica, 50, 277-323.

[368] Magill, M.J.P., and Constantinides, G.M., (1976): Portfolio selectionwith transaction costs, Journal of Economic Theory, 13, 264-271.

[369] Maccheroni, F., Marinacci, M. and Rustichini, A., (2005): Dynamicvariational preferences, working paper Fondazione Collegio Carlo Al-berto.

[370] Macovschi, S. and Quittard-Pinon, F., (2005): Pricing power and otherpolynomial options, to appear in Journal of Derivatives.

422 References

[371] Malevergne, Y. and Sornette, D., (2004): High-Order moments andcumulants of multivariate asset returns: analytical theory and empiricaltests- II, Finance Letters.

[372] Malevergne, Y. and Sornette, D., (2005): Higher-moment portfolio the-ory: capitalizing on behavioral anomalies of stock markets, Journal ofPortfolio Management, 31, 49-55.

[373] Markowitz, H.M., (1952a): Portfolio selection, Journal of Finance, 7,77-91.

[374] Markowitz, H.M., (1952b): The utility of wealth, Journal of PoliticalEconomy, 60, 151-158.

[375] Markowitz, H.M., (1959): Portfolio Selection: Efficient Diversificationof Investments, Wiley: New York.

[376] Markowitz, H.M., (1987): Mean-variance Analysis in Portfolio Choiceand Capital Markets, Basil Blackwell: Cambridge, MA.

[377] Marschak, J., (1938): Money and the theory of asset, Econometrica, 6,311-325.

[378] Marschak, J., (1950): Rational behaviour, uncertain prospects and mea-surable utility, Econometrica, 18, 111-141.

[379] Mausser, H. and Rosen, D., (1999): Efficient risk/return fron-tiers for credit risk, ALGO Research Quarterly, 2, 35-48. (seewww.algorithmics.com/research/dec99/arq2-4-frontiers.pdf).

[380] McNeil, A., (1997): Estimating the tails of loss severity distributionsusing extreme value theory, ASTIN Bulletin, 27, 117-137.

[381] McNeil, A. and Frey, R., (2000): Estimation of tail-related risk for het-eroscedastic financial time series: an extreme value approaches, Journalof Empirical Finance, 7, 271-300.

[382] Meade, N. and Salkin, G.R., (1989): Index funds-construction and per-formance measurement, Journal of the Operational Research Society,40, 871-879.

[383] Meade, N. and Salkin G.R., (1990): Developing and maintaining anequity index fund,Journal of the Operational Research Society, 41, 599-607.

[384] Menger, C., (1871): Grundsatze der volkswirthschaftslehre, Vienna:Braumnalter. Translated as Principles of Economics, by J. Dingwalland B.F. Hoselitz, Glencoe, Ill.: Free Press, 1951. Reprinted in 1976and 1981, New York University Press: New York.

[385] Merton, R.C., (1969): Lifetime portfolio selection under uncertainty:the continuous time case, Review of Economics and Statistics, 51, 247-257.

References 423

[386] Merton, R.C., (1971): Optimum consumption and portfolio rules in acontinuous-time model, Journal of Economic Theory, 3, 373-413.

[387] Merton, R.C., (1972): An analytic derivation of the efficient portfoliofrontier, JJournal of Financial Quantitative Analysis, 7, 1851-1872.

[388] Merton, R.C., (1980): On estimating the expected return on the market,Journal of Financial Economics, 8, 323-361.

[389] Merton, R.C., (1990): Continuous Time Finance, Basil Blackwell:Cambridge, MA.

[390] Meucci, A., (2005): Risk and Asset Allocation, Springer: Berlin.

[391] Michaud, R.O., (1998): Efficient Asset Management: A Practical Guideto Stock, Portfolio Optimization and Asset Allocation, Harvard BusinessSchool Press: Boston, MA.

[392] Modigliani, F., and Pogue, G.A., (1974): An introduction to risk andreturn - II, Financial Analysts Journal, 30, 69-86.

[393] Morgan, J.P., (1996): Risk Metrics-technical document, 4th edition,J.P. Morgan: New York.

[394] Morton, A. J. and Pliska, S., (1995): Optimal portfolio managementwith fixed transaction costs, Mathematical Finance, 5, 337-356.

[395] Mulvey, J. M., Vanderbei, R. J., and Zenios, S. A., (1995): Robustoptimization of large scale systems, Operations Research, 43, 264-281.

[396] Muralidhar, A.S., (2001): Optimal risk-adjusted portfolios with multi-ple managers, Journal of Portfolio Management, 27, 97-104.

[397] Nagai, H. and Peng, S., (2001): Risk-sensitive dynamic portfolio opti-mization with partial information on infinite time horizon, Annals ofApplied Probability, 173-195.

[398] Nelson, D.B., (1991): Conditional heteroskedasticity in asset returns: anew approach, Econometrica, 59, 347-370.

[399] Nesterov, Y. and Nemirovski, A., (1995): Interior-Point PolynomialAlgorithms in Convex Programming, Society for Industrial and AppliedMathematics: Philadelphia.

[400] von Neumann, J. and Morgenstern, O., (1944): Theory of Games andEconomic Behavior, Princeton University Press.

[401] Oksendal, B., (1995): Stochastic Differential Equations, Springer: NewYork, 4th edition.

[402] Park, J.M. and Staum, J.-C., (1998): Performance persistence in thealternative industry, Paradigm capital management, working paper.

424 References

[403] Peng, S., (1997): Backward SDE and related g-expectations. In: Back-ward Stochastic Differential Equations, N. El Karoui and L. Mazliak(Eds), Pitman Res. Notes Math. Ser., 364, Longman, Harlow, 141-159.

[404] Pennachi, G., (1991): Identifying the dynamics of real interest ratesmodels and inflation: evidence using survey data, Review of Financialstudies, 4, 53-86.

[405] Perold, A., (1984): Large-scale portfolio optimization,Management Sci-ence, 30, 1143-1160.

[406] Perold, A., (1986): Constant portfolio insurance, Harvard BusinessSchool, unpublished manuscript.

[407] Perold, A. and Sharpe, W., (1988): Dynamic strategies for asset allo-cation, Financial Analysts Journal, 44, 16-27.

[408] Pham, H., (2002): Smooth solutions of optimal investment models withstochastic volatilities and portfolio constraints, Appl. Math. Optim., 46,55-78.

[409] Pham, H., (2002): Minimizing shortfall risk and applications to financeand insurance problems, Annals of Applied Probability, 12, 143-172.

[410] Pliska, S.R., (1986): A stochastic model of continuous trading: optimalportfolios, Mathematics of Operations Research, 11, 371-384.

[411] Pope, P.F. and Yadav, P.K., (1994): Discovering errors in tracking error,Journal of Portfolio Management, Winter, 27-32.

[412] Pratt, J.W., (1964): Risk aversion in the small and in the large, Econo-metrica, 32, 122-136.

[413] Prigent, J.-L., (2001): Assurance du portefeuille : analyse et extensionde la methode du coussin, Banque et Marches, 51, 33-39.

[414] Prigent, J.-L., (2003): Weak Convergence of Financial Markets,Springer: Berlin.

[415] Prigent, J.-L., (2004): Generalized option based portfolio insurance,working Paper Thema, University of Cergy, France.

[416] Prigent, J.-L. and Tahar, F., (2004): Indexed funds management basedon TA algorithm and cointegration, working paper Thema, Universityof Cergy, France.

[417] Prigent, J.-L. and Tahar, F., (2006): Optimal portfolios with guaranteeat maturity: computation and comparison, International Journal ofBusiness, 11, 171-185.

[418] Prigent, J.-L. and Toumi, S., (2003): Portfolio management with safetycriteria in complete financial markets, International Journal of Busi-ness, 10, 233-250.

References 425

[419] Protter, P., (1990): Stochastic Integration and Differential Equations,Springer: Berlin.

[420] Quiggin, J., (1982): A theory of anticipated utility, Journal of EconomicBehaviour and Organization, 3, 323-343.

[421] Quiggin, J., (1991): Comparative statics for rank-dependent expectedutility, Journal of Risk and Uncertainty, 4, 339-350.

[422] Rachev, S. and Zhang, Y., (2004): Risk attribution and portfolio perfor-mance measurements, Journal of Applied Functional Analysis, 4, 373-402.

[423] Revuz, D. and Yor, M., (1991): Continuous Martingales and BrownianMotion, Springer.

[424] Richard, S., (1975): Optimal consumption, portfolio and life insurancerules for an uncertain lived individual in a continuous time model, Jour-nal of Financial Economics, 1, 187-203.

[425] Riedel, F., (2004): Dynamic coherent risk measures, Stochastic process-es and their applications, 112, 185-200.

[426] Rockafellar, R.T., (1970): Convex Analysis, Princeton Mathematics,28, Princeton University Press.

[427] Rockafellar, R.T. and Uryasev, S.P., (2000): Optimization of condition-al Value-at-Risk, Journal of Risk, 2, 21-42.

[428] Rockafellar, R.T. and Uryasev, S.P., (2002): Conditional Value-at-Riskfor general loss distributions, Journal of Banking and Finance, 26, 1443-1471.

[429] Rogers, C. and Talay, D., (eds) (1997): Numerical Methods in Finance,Cambridge University Press, Cambridge.

[430] Roll, R., (1977): A critique of the asset pricing theory’s tests, Journalof Financial Economics, 4, 129-176.

[431] Roll, R., (1992): A mean-variance analysis of tracking error, The Jour-nal of Portfolio Management, 18, 13-23.

[432] Roll, R. and Ross, S.A., (1977) : Comments on qualitative results forinvestment proportions, Journal of Financial Economics, 5, 265-268.

[433] Ross, S., (1978): Mutual fund separation in financial theory: the sepa-ration distributions, Journal of Economic Theory, 17, 254-286.

[434] Ross, S., (1981): Some stronger measures of risk aversion in the smalland in the large with applications, Econometrica, 49, 621-639.

[435] Rothschild, M. and Stiglitz, J.E., (1970): Increasing risk: I. A definition,Journal of Economic Theory, 2, 225-243.

426 References

[436] Rothschild, M. and Stiglitz, J.E., (1971): Increasing risk: II. Its eco-nomic consequences, Journal of Economic Theory, 3, 66-84.

[437] Roy, A.D., (1952): Safety-first and the holding of assets, Econometrica,20, 431-449.

[438] Rubinstein, M., (1976): The valuation of uncertain income streams andthe pricing of options, Bell Journal of Economics, 7, 407-425.

[439] Ross, S.A., (1976): The arbitrage theory of capital asset pricing, Journalof Economic Theory, 13, 341-360.

[440] Rudd, A., (1977): A note on qualitative results for investment propor-tions, Journal of Financial Economics, 5, 259-263.

[441] Rudd, A., (1980): Optimal selection of passive portfolios, FinancialManagement, 9/10,57-66.

[442] Rudd, A. and Rosenberg, B., (1979): Realistic portfolio optimization,in: Elton, E. J., Gruber, M. J. (eds.): Portfolio Theory, 25 Years after,Studies in the Management Sciences 11, North-Holland, Amsterdam,21-46.

[443] Rudolf, M., Wolter, H.-J., and Zimmermann, H., (1999): A linear modelfor tracking error minimization, Journal of Banking and Finance, 23,85-103.

[444] Ruszczynski, A. and Vandenbei, R., (2003): Frontiers of stochasticallynondominated portfolios, Econometrica, 71, 1287-1297.

[445] Samuelson, P.A., (1950): Probability and attempts to measure utility,Economic Review, 1, 167-173.

[446] Samuelson, P.A., (1952): Probability, utility and the independence ax-iom, Econometrica, 20, 670-678.

[447] Samuelson, P.A., (1969): Lifetime portfolio selection by dynamic s-tochastic programming, The Review of Economics and Statistics, 51,239-246.

[448] Satchell, S.E. and Hwang, S., (2001): Tracking error: ex ante versus expost measures, Journal of Asset Management, 2, 241-246.

[449] Savage, L.J., (1954): The Foundations of Statistics, Wiley: New York.

[450] Scaillet, O., (2004): Non-parametric estimation and sensitivity analysisof expected shortfall, Mathematical Finance, 14, 115-129.

[451] Schachermayer, W., (2000): Optimal investment in incomplete financialmarkets, Mathematical Finance: Bachelier Congress 2000, H. Geman,D. Madan, St. Pliska, T. Vorst, editors, 427-462.

References 427

[452] Schachermayer, W., (2001): Portfolio optimization in incomplete finan-cial markets, Lecture Notes of “Scuola Normale Superiore di Pisa”.

[453] Scheikh, A., (1996): Barra’s risk model, Barra Research Insights.

[454] Schmeidler, D., (1989): Subjective probability and expected utilitywithout additivity, Econometrica, 57, 571-587.

[455] Schoemaker, P.J.H., (1993): Determinants of risk-taking: behavioraland economic views, Journal of Risk and Uncertainty, (HistoricalArchive), 6, 49-73.

[456] Schneeweis, T. and Spurgin, R., (2001): The benefits of index option-based strategies for institutional portfolios, Journal of Alternative In-vestments, 44-52.

[457] Segal, U., (1987): The Ellsberg paradox and risk aversion: An antici-pated utility approach, International Economic Review, 28, 175-202.

[458] Segal, U., (1989): Anticipated utility: a measure representation ap-proach, Annals of Operations Research, 19, 359-373.

[459] Seierstad, A. and Sydsaeter, K., (1987): Optimal Control Theory withEconomic Applications, North-Holland: Amsterdam.

[460] Sethi, S.P., (1997): Optimal Consumption and Investment withBankruptcy, Kluwer.

[461] Shanken, J., (1992): On the estimation of beta-pricing models, Reviewof Financial Studies, 5, 1-33.

[462] Sharpe, W., (1963): A simplified model for portfolio analysis, Manage-ment Science, 277-293.

[463] Sharpe, W., (1964): Capital asset prices: a theory of market equilibriumunder conditions of risk, Journal of Finance, 19, 425-442.

[464] Sharpe, W., (1966): Mutual fund performance, Journal of Business, 39,119-138.

[465] Sharpe, W., (1992): Asset allocation management style and perfor-mance measurement, Journal of Portfolio Management, 18, 7-19.

[466] Sharpe, W., (1994): The Sharpe ratio, Journal of Portfolio Manage-ment, 21, 49-58.

[467] Sharpe, W., (1998): Morningstar’s risk-adjusted ratings, Financial An-alysts Journal.

[468] Shirakawa, H., (1994): Optimal consumption and portfolio selectionwith upper and lower bounds cosntraints, Mathematical Finance, 4, 1-24.

[469] Shiryaev, A., (1984): Probability, Springer: New York.

428 References

[470] Shreve, S.E. and Soner, H.M., (1994): Optimal investment and con-sumption with transaction costs, Annals of Applied Probability, 4, 609-692.

[471] Sims, C.A., (1980): Macroeconomcs and reality, Econometrica, 48.

[472] Solnik, B., (1974): The international pricing of risk: an empirical in-vestigation of the world capital market structure, Journal of Finance,365-378.

[473] Sorensen, C., (1999): Dynamic asset allocation and fixed income man-agement, Journal of Financial and Quantitative Analysis, 34, 513-531.

[474] Sorenson, E.H., Miller, K.L. and Samak V., (1998): Allocating betweenactive and passive management, Financial Analysts Journal, 54, 18-31.

[475] Sortino, F., (2001): From Alpha to Omega, inManaging Downside Riskin Financial Markets. Frank A. Sortino and Stephen E. Satchell, eds.,Reed Educational and Professional Publishing Ltd., 2001.

[476] Sortino, F. and van der Meer, R.A.H., (1991): Downside risk, Journalof Portfolio Management, August, 27-31.

[477] Sortino, F. and Price, L.N., (1994): Performance measurement in adownside risk framework, Journal of Investing, 3, 59-64.

[478] Stein, C., (1955): Inadmissibility of the usual estimator for the meanof a multivariate normal distribution, Proceedings of the 3rd BerkeleySymposium on Probability and Statistics, 197-206.

[479] Steinbach, M.C., (1999): Markowitz revisited: single-period and multi-period mean-variance models, working paper SC-99-30, Konrad-Zuse-Zentrum fur Informationstechnik Berlin.

[480] Stevens, G., (1998): On the inverse of the covariance matrix in portfolioanalysis, Journal of Finance, 53, 1821-1827.

[481] Stock, J.H. and Watson, M.W., (1988): Testing for common trends,Journal of the American Statistical Association, 83, 1097-1107.

[482] Stulz, R., (1998): Derivatives, Risk Management and Financial Engi-neering, Southwestern Publishing.

[483] Svensson, L., (1989): Portfolio choice with non-expected utility incontinuous-time, Economic letters, 30, 313-317.

[484] Szego, G., (2002): Measures of risk, Journal of Banking and Finance,26, 1253-1272.

[485] Szego, G., (2004): Risk Measures for the 21st Century, Wiley: NewYork.

References 429

[486] Tabata, Y. and Takeda, E., (1995): Bicriteria optimization problem ofdesigning an index fund, Journal of the Operational Research Society,46, 1023-1032.

[487] Tahar, F., (2005): Elements de gestion de portefeuille: analyse et ex-tension de methodes en gestion indicielle et garantie, PhD Thesis, Uni-versity of Cergy, France.

[488] Tallon, J-M., (1997): Risque microeconomique et prix d’actifs dans unmodele d’equilibre general avec esperance d’utilite dependante du rang,Finance, 18, 139-153.

[489] Tasche, D., (2002): Expected shortfall and beyond, Journal of Bankingand Finance, 26, 1519-1533.

[490] Telser, L., (1955): Safety first and hedging, Review of Economic andStatistics, 23, 1-16.

[491] Teo, K.L. and Yang, X.Q., (2001): Portfolio selection problem withminimax type risk function, Annals of Operations Research, 101, 333-349.

[492] Toumi, S., (2003): Application de la theorie de l’equilibre general et desnotions de Value-at-Risk a la finance de marche, Phd Thesis, THEMA,University of Cergy, France.

[493] Toy, W.M. and Zurach, M.A., (1989): Tracking the Euro-Pac index,Journal of Portfolio Management, 15, 55-58.

[494] Treynor, J., (1965): How to rate management of investment funds,Harvard Business Review, 43, 63-75.

[495] Treynor, J. and Mazuy, K., (1966): Can mutual funds outguess themarket? Harvard Business Review, 43, 63-75.

[496] Tversky, A. and Kahneman, D., (1992): Advances in prospect theory:cumulative representation of uncertainty, Journal of Risk and Uncer-tainty, 5, 297-323.

[497] Van Laarhoven, P.J.M. and Aarts, E.H.L., (1987): Simulated Annealing:Theory and Applications, Kluwer Academic Publishers.

[498] Vasicek, O., (1977): An equilibrium characterization of the term struc-ture, Journal of Financial Economics, 5, 177-88.

[499] Wagner, N., (2002): On a model of portfolio selection with benchmark,Journal of Asset Management, 3, 55-65.

[500] Wagner, N. (2003): Methods of relative portfolio optimization, in:Satchell, S. E., Scowcroft, A. (eds.): New Advances in Portfolio Con-struction and Implementation, Butterworth and Heinemann, Oxford,pp. 333-341.

430 References

[501] Wakker, P., (1994): Separating marginal utility and risk aversion, The-ory and Decision, 36, 1-44.

[502] Wang, T., (1996): A characterization of dynamic risk measures, WPFaculty of Commerce and Business Administration, U.B.C.

[503] Weber, S., (2005): Distribution-invariant risk measures, information,and dynamic consistency, Mathematical Finance, 16, 419-44.

[504] Wilson, T.C., (1996): Calculating risk capital. In: Alexander, C. (Ed.),The Handbook of Risk Management and Analysis; Wiley, Chichester,193-232.

[505] Winker, P., (2000): Optimization Heuristics in Econometrics: Applica-tions of Threshold Accepting, Wiley, New York.

[506] Yaari, M.E., (1969): Some measures of risk aversion and their uses,Journal of Economic Theory, 1, 315-329.

[507] Yaari, M.E., (1987): The dual theory of choice under risk. Econometri-ca, 55, 95-115.

[508] Young, M.R., (1998): A minimax portfolio selection rule with linearprogramming solution, Management Science, 44, 673-683.

[509] Zakoian, J.-M., (1994): Threshold heteroskedastic models, Journal ofEconomic Dynamics and Control, 18, 931-955.

[510] Zariphopoulou, T., (1992): Investment-consumption models with trans-action fees and Markov-chain parameters, SIAM Journal of Control andOptimization, 30, 613-636.

[511] Zeng, B. and Zhang, J., (2001): An empirical assesment of asset corre-lation models, KMV LLC, San Francisco.

[512] Zhigljavsky, A., (1991): Theory of Random Search, Kluwer AcademicPublishers.

Symbol Description

Rd The real Euclidian space ofdimension d

Ω The set of random eventsP The historical probabilityF The σ-algebra correspond-

ing to known information(Ft)t The filtration correspond-

ing to increasing informa-tion along time t

E[X ] Expectation of the randomvariable X

L The set of lotteriesρ(X) Risk measure of the ran-

dom variable XRDEU Rank Dependent Expected

UtilityVaR Value-at-RiskCVaR Conditional Value-at-RiskES Expected shortfall Preference relationX i Y The random variable X s-

tochastically dominates Yat order i

Lp Space of all F -measurablerandom variables such thatEP[Xp] is finite

E Doleans-Dade stochasticexponential

RP Return of portfolio PA′ or At Transpose of A[X,Y ] Co-variation of processes X

and Y

[X,X ] Quadratic variation of pro-cess X

〈X,X〉 Predictable compensator ofprocess X

I The vector with all compo-nents equal to 1

IA The indicator function ofthe subset A

i.i.d. Independent and identical-ly distributed

r.c.l.l Right continuous and leftlimited

w.r.t. With respect toCPPI Constant Proportion Port-

folio InsuranceOBPI Option Based Portfolio In-

suranceCARA Constant Absolute Risk

AversionCRRA Constant Relative Risk

AversionHARA Hyperbolic Absolute Risk

AversionODE Ordinary Differential Equa-

tionPDE Partial Differential Equa-

tionSDE Stochastic Differential E-

quationBSDE Backward Stochastic Dif-

ferential Equation

431

Index

Acceptance set, 39Arch models, 100

conditional beta, 145general properties, 373volatility models, 377

Arrow-Pratt measures of risk-aversion,13

Brownian motion, 46geometric, 211multidimensional, 236properties, 383

Capital Asset Pricing Model, 130Certainty equivalent, 12Choquet capacity, 33Conditional Value-at-Risk, 54

minimization, 98CPPI, 294

discrete-time, 300extensions, 303standard, 295

Cumulative prospect theory, 31

Doleans-Dade exponential, 385Dynkin, 183

formula, 392operator, 189

Efficient frontier, 74no short-selling, 80relative, 122riskless asset, 76VaR/CVaR, 99

Ellsberg paradox, 33

Filtration, 43consistency, 45

definition, 381

Hedge funds, 351industry, 351main strategies, 352optimal allocation, 370performance, 354

Information ratio, 140definition, 140statistical test, 141

Jensen alpha, 133

Kahneman and Tversky, 6Kataoka criterion, 96

Lottery, 7compound, 7

Markovian, 63dynamic programming, 229stochastic differential equation,

387system, 182

Markowitz model, 67Martingale, 167

representation, 197semimartingale, 382submartingale, 381supermartingale, 197

Mean-variance analysis, 68two-fund separation, 88estimation problems, 82expected utility, 85optimization under TEV con-

straint, 124tracking error, 119

Merton portfolio selection, 187

433

434 Index

Monotonicity, 38

OBPI, 282extensions, 286generalized CPPI, 310standard, 284

Performance measure, 129alternative, 362comparison, 135extensions, 140Omega, 363standard, 130

Portfolio insurance, 281Positive homogeneity, 38

Regret theory, 33Relevance, 39Risk measure, 37

alternative, 362coherent, 38convex, 39dynamic, 43minimization, 93penalty function, 40representation, 40safety-first, 37spectral, 59standard, 48utility, 41

Risk-averse, 12Risk-loving, 12Risk-neutral, 12Roy criterion, 340

Semi-variance, 362Sharpe ratio, 132

criticism, 139definition, 132estimation, 139

Sortino ratio, 142downside risk, 362

Spherical distributions, 90Stochastic basis, 381Stochastic dominance, 6

efficient frontier, 101first-order, 19general properties, 19portfolio insurance, 350second order, 21

Stochastic integral, 382Stochastic process, 381Subadditivity, 38

Telser criterion, 95Tracking-error, 104

correlation and beta, 118definition, 118MAD, 104minimization, 119MinMax, 104volatility, 119

Transitivity, 29violation, 34

Translation invariance, 38Treynor ratio, 133

definition, 133Two-fund separation, 67

existence, 88

Utility theory, 5alternative expected, 24anticipated, 29expected, 9non-additive expected, 32rank dependent expected, 27standard utility functions, 15weighted, 25

Value-at-Risk, 48convexity, 51definition, 48estimation, 53Gaussian case, 49lower, 49sensitivity, 52upper, 49

portfolio optimization and performance analysis

Documents