pricing derivative securities

644

Upload: anh-tuan-nguyen

Post on 05-Dec-2014

219 views

Category:

Documents


14 download

DESCRIPTION

 

TRANSCRIPT

Page 1: Pricing derivative securities
Page 2: Pricing derivative securities

PRICING

DERIVATIVE

SECURITIES

S E C O N D E D I T I O N

Page 3: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in FM FA

This page intentionally left blankThis page intentionally left blank

Page 4: Pricing derivative securities

T W E P P S University of Virginia. USA

PRICING

DERIVATIVE

SECURITIES

S E C O N D E D I T I O N

World Scientific N E W J E R S E Y * LONDON * SINGAPORE * BElJlNG * SHANGHAI * HONG KONG * TAIPEI * CHENNAI

Page 5: Pricing derivative securities

Published by

World Scientific Publishing Co. Pte. Ltd.

5 Toh Tuck Link, Singapore 596224

USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601

UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

Library of Congress Cataloging-in-Publication DataEpps, T. W.

Pricing derivative securities / by Thomas W. Epps. --2nd ed.p. cm.

ISBN-13 978-981-270-033-9 -- ISBN-10 981-270-033-11. Derivative securities--Prices--Mathematical models. I. Title.

HG6024.A3E66 2007332.64'57--dc22

2007006660

British Library Cataloguing-in-Publication DataA catalogue record for this book is available from the British Library.

Copyright © 2007 by World Scientific Publishing Co. Pte. Ltd.

All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means,electronic or mechanical, including photocopying, recording or any information storage and retrievalsystem now known or to be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the CopyrightClearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission tophotocopy is not required from the publisher.

Typeset by Stallion PressEmail: [email protected]

Printed in Singapore.

PricingDerivative.pmd 7/19/2007, 1:31 PM1

Page 6: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in FM FA

Contents

Preface xiii

PART I PRELIMINARIES 1

1 Introduction and Overview 31.1 A Tour of Derivatives and Markets 4

1.1.1 Forward Contracts 41.1.2 Futures 61.1.3 “Vanilla” Options 81.1.4 Other Derivative Products 12

1.2 An Overview of Derivatives Pricing 141.2.1 Replication: Static and Dynamic 151.2.2 Approaches to Valuation when Replication is Possible 161.2.3 Markets: Complete and Otherwise 191.2.4 Derivatives Pricing in Incomplete Markets 20

2 Mathematical Preparation 212.1 Analytical Tools 22

2.1.1 Order Notation 222.1.2 Series Expansions and Finite Sums 232.1.3 Measures 252.1.4 Measurable Functions 27

v

Page 7: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in FM FA

vi Pricing Derivative Securities (2nd Ed.)

2.1.5 Variation and Absolute Continuity of Functions 282.1.6 Integration 292.1.7 Change of Measure: Radon-Nikodym

Theorem 412.1.8 Special Functions and Integral Transforms 42

2.2 Probability 472.2.1 Probability Spaces 472.2.2 Random Variables and Their Distributions 492.2.3 Mathematical Expectation 552.2.4 Radon-Nikodym for Probability Measures 662.2.5 Conditional Probability and Expectation 682.2.6 Stochastic Convergence 732.2.7 Models for Distributions 772.2.8 Introduction to Stochastic Processes 83

3 Tools for Continuous-Time Models 933.1 Wiener Processes 93

3.1.1 Definition and Background 933.1.2 Essential Properties 94

3.2 Ito Integrals and Processes 973.2.1 A Motivating Example 973.2.2 Integrals with Respect to Brownian Motions 993.2.3 Ito Processes 104

3.3 Ito’s Formula 1103.3.1 The Result, and Some Intuition 1103.3.2 Outline of Proof 1113.3.3 Functions of Time and an Ito Process 1133.3.4 Illustrations 1133.3.5 Functions of Higher-Dimensional Processes 1153.3.6 Self-Financing Portfolios in Continuous Time 117

3.4 Tools for Martingale Pricing 1183.4.1 Girsanov’s Theorem and Changes of Measure 1183.4.2 Representation of Martingales 1223.4.3 Numeraires, Changes of Numeraire, and Changes

of Measure 1233.5 Tools for Discontinuous Processes 125

3.5.1 J Processes 1253.5.2 More General Processes 130

Page 8: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in FM FA

Contents vii

PART II PRICING THEORY 139

4 Dynamics-Free Pricing 1414.1 Bond Prices and Interest Rates 143

4.1.1 Spot Bond Prices and Rates 1444.1.2 Forward Bond Prices and Rates 1474.1.3 Uncertainty in Future Bond Prices and Rates 149

4.2 Forwards and Futures 1504.2.1 Forward Prices and Values of Forward Contracts 1514.2.2 Determining Futures Prices 1534.2.3 Illustrations and Caveats 1564.2.4 A Preview of Martingale Pricing 158

4.3 Options 1594.3.1 Payoff Distributions for European Options 1604.3.2 Put-Call Parity 1614.3.3 Bounds on Option Prices 1654.3.4 How Prices Vary with T , X , and St 169

5 Pricing under Bernoulli Dynamics 1745.1 The Structure of Bernoulli Dynamics 1765.2 Replication and Binomial Pricing 1795.3 Interpreting the Binomial Solution 185

5.3.1 The P.D.E. Interpretation 1865.3.2 The Risk-Neutral or Martingale Interpretation 186

5.4 Specific Applications 1965.4.1 European Stock Options 1965.4.2 Futures and Futures Options 1995.4.3 American-Style Derivatives 2025.4.4 Derivatives on Assets That Pay Dividends 209

5.5 Implementing the Binomial Method 2205.5.1 Modeling the Dynamics 2205.5.2 Efficient Calculation 230

5.6 Inferring Trees from Prices of Traded Options 2395.6.1 Assessing the Implicit Risk-Neutral Distribution

of ST 2405.6.2 Building the Tree 2435.6.3 Appraisal 246

Page 9: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in FM FA

viii Pricing Derivative Securities (2nd Ed.)

6 Black-Scholes Dynamics 2476.1 The Structure of Black-Scholes Dynamics 2486.2 Approaches to Arbitrage-Free Pricing 250

6.2.1 The Differential-Equation Approach 2516.2.2 The Equivalent-Martingale Approach 254

6.3 Applications 2616.3.1 Forward Contracts 2616.3.2 European Options on Primary Assets 2626.3.3 Extensions of the Black-Scholes Theory 270

6.4 Properties of Black-Scholes Formulas 2736.4.1 Symmetry and Put-Call Parity 2746.4.2 Extreme Values and Comparative Statics 2756.4.3 Implicit Volatility 2796.4.4 Delta Hedging and Synthetic Options 2816.4.5 Instantaneous Risks and Expected Returns of

European Options 2846.4.6 Holding-Period Returns for European Options 287

7 American Options and “Exotics” 2927.1 American Options 292

7.1.1 Calls on Stocks Paying Lump-Sum Dividends 2937.1.2 Options on Assets Paying Continuous Dividends 2967.1.3 Indefinitely Lived American Options 308

7.2 Compound and Extendable Options 3117.2.1 Options on Options 3117.2.2 Options with Extra Lives 321

7.3 Other Path-Independent Claims 3277.3.1 Digital Options 3277.3.2 Threshold Options 3287.3.3 “As-You-Like-It” or “Chooser” Options 3297.3.4 Forward-Start Options 3307.3.5 Options on the Max or Min 3317.3.6 Quantos 336

7.4 Path-Dependent Options 3397.4.1 Extrema of Brownian Paths 3397.4.2 Lookback Options 3447.4.3 Barrier Options 3487.4.4 Ladder Options 3537.4.5 Asian Options 356

Page 10: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in FM FA

Contents ix

8 Models with Uncertain Volatility 3678.1 Empirical Motivation 367

8.1.1 Brownian Motion Does Not Fit Underlying Prices 3678.1.2 Black-Scholes No Longer Fits Option Prices 369

8.2 Price-Dependent Volatility 3708.2.1 Qualitative Features of Derivatives Prices 3718.2.2 Two Specific Models 3738.2.3 Numerical Methods 3798.2.4 Limitations of Price-Dependent Volatility 3798.2.5 Incorporating Dependence on Past Prices 380

8.3 Stochastic-Volatility Models 3828.3.1 Nonuniqueness of Arbitrage-Free Prices 3838.3.2 Specific S.V. Models 388

8.4 Computational Issues 3958.4.1 Inverting C.f.s 3968.4.2 Two One-Step Approaches 397

9 Discontinuous Processes 4019.1 Derivatives with Random Payoff Times 4029.2 Derivatives on Mixed Jump/Diffusions 409

9.2.1 Jumps Plus Constant-Volatility Diffusions 4099.2.2 Nonuniqueness of the Martingale Measure 4119.2.3 European Options under Jump Dynamics 4139.2.4 Properties of Jump-Dynamics Option Prices 4159.2.5 Options Subject to Early Exercise 417

9.3 Jumps Plus Stochastic Volatility 4189.3.1 The S.V.-Jump Model 4199.3.2 Further Variations 422

9.4 Pure-Jump Models 4269.4.1 The Variance-Gamma Model 4269.4.2 The Hyperbolic Model 4329.4.3 A Levy Process with Finite Levy Measure 4369.4.4 Modeling Prices as Branching Processes 4389.4.5 Assessing the Pure-Jump Models 445

9.5 A Markov-Switching Model 446

Page 11: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in FM FA

x Pricing Derivative Securities (2nd Ed.)

10 Interest-Rate Dynamics 45710.1 Preliminaries 457

10.1.1 A Summary of Basic Concepts 45710.1.2 Spot and Forward Measures 45810.1.3 A Preview of Things to Come 461

10.2 Spot-Rate Models 46210.2.1 Bond Prices under Vasicek 46310.2.2 Bond Prices under Cox, Ingersoll, Ross 468

10.3 A Forward-Rate Model 47210.3.1 The One-Factor HJM Model 47410.3.2 Allowing Additional Risk Sources 47810.3.3 Implementation and Applications 480

10.4 The LIBOR Market Model 49810.4.1 Deriving Black’s Formulas 50010.4.2 Applying the Model 504

10.5 Modeling Default Risk 51210.5.1 Endogenous Risk: The Black-Scholes-Merton Model 51410.5.2 Exogenous Default Risk 518

PART III COMPUTATIONAL METHODS 525

11 Simulation 52711.1 Generating Pseudorandom Deviates 529

11.1.1 Uniform Deviates 52911.1.2 Deviates from Other Distributions 532

11.2 Variance-Reduction Techniques 53711.2.1 Stratified Sampling 53711.2.2 Importance Sampling 54011.2.3 Antithetic Variates 54111.2.4 Control Variates 54411.2.5 Richardson Extrapolation 546

11.3 Applications 54711.3.1 “Basket” Options 54711.3.2 European Options under Stochastic Volatility 55111.3.3 Lookback Options under Stochastic Volatility 55311.3.4 American-Style Derivatives 554

Page 12: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in FM FA

Contents xi

12 Solving P.D.E.s Numerically 57712.1 Setting Up for Solution 579

12.1.1 Approximating the Derivatives 57912.1.2 Constructing a Discrete Time/Price Grid 58012.1.3 Specifying Boundary Conditions 581

12.2 Obtaining a Solution 58212.2.1 The Explicit Method 58212.2.2 A First-Order Implicit Method 58512.2.3 Crank-Nicolson’s Second-Order Implicit Method 58912.2.4 Comparison of Methods 591

12.3 Extensions 59112.3.1 More General P.D.E.s 59112.3.2 Allowing for Lump-Sum Dividends 594

13 Programs 59513.1 Generate and Test Random Deviates 596

13.1.1 Generating Uniform Deviates 59613.1.2 Generating Poisson Deviates 59613.1.3 Generating Normal Deviates 59613.1.4 Testing for Randomness 59713.1.5 Testing for Uniformity 59713.1.6 Anderson-Darling Test for Normality 59813.1.7 ICF Test for Normality 598

13.2 General Computation 59913.2.1 Standard Normal CDF 59913.2.2 Expectation of Function of Normal Variate 59913.2.3 Standard Inversion of Characteristic Function 60013.2.4 Inversion of Characteristic Function by FFT 600

13.3 Discrete-Time Pricing 60113.3.1 Binomial Pricing 60113.3.2 Solving PDEs under Geometric Brownian Motion 60213.3.3 Crank-Nicolson Solution of General PDE 602

13.4 Continuous-Time Pricing 60213.4.1 Shell for Black-Scholes with Input/Output 60213.4.2 Basic Black-Scholes Routine 60313.4.3 Pricing under the C.E.V. Model 603

Page 13: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in FM FA

xii Pricing Derivative Securities (2nd Ed.)

13.4.4 Pricing a Compound Option 60313.4.5 Pricing an Extendable Option 60413.4.6 Pricing under Jump Dynamics 604

Bibliography 605

Subject Index 617

Page 14: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in FM FA

Preface

This book was developed during many years of teaching derivatives to doc-toral students in financial economics. The aim in writing it was to helpfill the gap between books that offer a theoretical treatment without muchapplication and those that simply present the pricing formulas withoutderiving them. The project was guided by the beliefs that understanding isnot complete without practice at application and that applying results onedoesn’t understand is risky and unsatisfying. This book presents the theorybut directs it toward the goals of producing practical pricing formulas forderivative assets and of implementing them empirically.

Teaching the theory of derivatives to Ph.D. students in economics or todoctoral finance students in business requires teaching some math along theway. I have tried to make the book self-contained in this regard by present-ing the required basics of analysis, general probability theory, and stochasticprocesses. This development starts with chapter 2, which presents (and isdevoted entirely to) the general mathematical background needed through-out the book. It continues in chapter 3 with the specific tools requiredfor continuous-time finance, including the special theory needed to han-dle processes with discontinuous sample paths. Since teaching math is notthe book’s main purpose, the treatment is necessarily brief and emphasizesexamples rather than proofs. Because appendices (like prefaces) tend to beignored by students, I have chosen to present this background in the maintext. Paging through chapters 2 and 3 will serve as a refresher to those whohave had the required math. For those who have not, the material is therefor more careful thought, as a guide to the mathematical literature, and asa reference to be consulted in the later chapters. The math-prep chaptersand the general overview in chapter 1 comprise Part I: Preliminaries.

xiii

Page 15: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in FM FA

xiv Pricing Derivative Securities (2nd Ed.)

The book has two distinctive features besides the math lessons. First,the chapters in Part II: Pricing Theory, are organized around the assump-tions made about the dynamics of underlying assets on which the deriva-tives depend. This allows us to progress from the relatively simple modelsthat require little mathematical sophistication to the more complex onesthat require a great deal. Thus, chapter 4 begins with pricing bounds andrelations such as European put-call parity that apply very generally andare derived from simple static-replication arguments. Chapter 5 progressesto Bernoulli dynamics and the associated binomial approach to pricing indiscrete time. Modeling the evolution of prices in continuous time beginsin chapter 6 with the basic Black-Scholes theory for pricing European-stylederivatives under geometric Brownian motion. Chapter 7 applies the samedynamic framework to American options and some of the many “exotic”varieties of contingent claims. Chapters 8 and 9 deal with more elaboratemodels based on diffusions with ex ante uncertain volatility and with dis-continuous processes, respectively. Much of this material comes from veryrecent research literature. Finally, chapter 10 brings in stochastic modelsfor interest rates and introduces readers to the literature on pricing interest-sensitive instruments.

The book’s second distinctive feature is its detailed treatment of empir-ical and numerical methods for implementing the pricing procedures. Thismaterial is presented in Part III: Computational Methods, which com-prises chapters on simulation, on the numerical solution of partial differ-ential equations, and on computation. The last chapter is a summary listof FORTRAN, C++, and VBA programs on the accompanying CDROMthat implement many of the basic pricing procedures discussed in the text:binomial pricing, Black-Scholes, the constant-elasticity-of-variance model,options on assets following mixed jump-diffusions, and others. There arealso basic routines for generating pseudorandom numbers, for testing forrandomness and adherence to specific distributional forms, for numericalintegration and Fourier inversion, and for solving p.d.e.s numerically. TheFORTRAN and C++ programs are presented in source form in order toguide readers in producing their own applications. Code for many of theVisual Basic for Applications routines can be viewed as macros within thespreadsheets that facilitate the input and output.

I have tried to supply enough detail in the derivations of pricing formu-las to enable readers to follow the development without having to puzzleover each line. The amount of such detail declines as we move along, in

Page 16: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in FM FA

Preface xv

the expectation that readers are acquiring a mastery of technique as theyprogress.

The second edition is a complete revision of the first and adds severaltopics from recent research in the field. The chapter on fixed-income deriva-tives has been significantly expanded to include sections on the LIBORmarket model and default risk. The chapter on discontinuous processes nowincludes two new models, one that allows jumps in volatility as well as priceand another that accommodates stochastic variation in the mean frequencyof price breaks. A new section on regime-change models opens up a numberof flexible and computationally attractive strategies for parsimonious mod-eling of noisy processes. The chapter on stochastic volatility models presentsa new and very fast technique for pricing European-style derivatives off theunderlying characteristic function. Finally, the chapter on simulation nowincludes a detailed treatment of its application to American-style deriva-tives. Both to save space and to facilitate their use, the computer routinesincluded in the first edition have been placed on an accompanying CDROM.An outline and general description still appears as chapter 13.

Besides graduate students in economics and doctoral students in busi-ness, others who should find this book useful include masters students orupper-level undergraduates in applied math courses, students in masters-level computational finance and financial engineering programs, those withprior math background who are being trained as practitioners, and non-specialist researchers who are trying to acquire some familiarity with thefield of derivatives. Whether this book does help to fill the gap between thetoo-theoretical and the too-applied, I leave to the market to decide!

I am grateful to the many students who have commented on andcorrected the course notes from which this book evolved; to Hua Fangand, especially, to Todd Williams, who proofed and gave detailed com-ments on several chapters in the first edition; and to Michael Nahas, whopainstakingly translated my FORTRAN code into C++. The second edi-tion owes much to Ubbo Wiersema, who suggested several additions, pro-vided detailed comments on one of the chapters, and assisted in developingthe VBA programs. For his capable work in producing these, I thank BenKoulibali. Finally, I am indebted to my wife Mary Lee for her bounteoussupport, advice, and editorial assistance.

Queries about the book or reports of errors in the text or in the computercodes should be directed to [email protected]. Please include the initials“PDS” in the subject field. A current list of errata will be maintained athttp://www.worldscibooks.com/economics/6243.html.

Page 17: Pricing derivative securities

This page intentionally left blankThis page intentionally left blank

Page 18: Pricing derivative securities

April 6, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch01 FA

PART I

PRELIMINARIES

Page 19: Pricing derivative securities

April 6, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch01 FA

This page intentionally left blankThis page intentionally left blank

Page 20: Pricing derivative securities

April 6, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch01 FA

1Introduction and Overview

Derivative securities are financial instruments whose values are tied con-tractually to values of other assets, called underlying assets, at one or morepoints in time. One of the simplest examples is a forward contract. Thisis an agreement between two parties to exchange some specified quantityof a commodity or asset for a certain sum of money at a stated futuretime and location. For example, a forward buyer may agree to accept deliv-ery of $1,000,000 face value of U.S. Treasury bills of six months’ maturityfrom the forward seller on a specific date and to pay $968,000—the forwardprice—at that time. Delivery would likely be to an account with a bank ora broker. Once the bills are delivered and paid for, the agreement expires.The forward contract is thus a derivative on the underlying Treasury bills.To the buyer, the per-unit value of the contract at expiration is the dif-ference between the underlying asset’s spot-market price and the forwardprice of $968,000; to the seller, it is the negative of this. The spot-marketprice is the price at which one could buy 6-month Treasury bills on theopen market for immediate delivery.

Derivatives are distinguished from primary assets, such as stocks, bonds,currencies, and gold. Values of these are not tied contractually to otherprices, although they and other prices are surely interdependent as a conse-quence of the connections between markets in general economic equilibrium.We make no distinction between primary financial assets, such as stocks,bonds, and currencies, and primary “real” assets or commodities, such asgold. It is sometimes necessary to distinguish commodities that are held pri-marily for investment purposes—gold being a prime example—from othercommodities like wheat and petroleum that are often held in inventory foruse in production. Forward contracts and certain other derivatives on such

3

Page 21: Pricing derivative securities

April 6, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch01 FA

4 Pricing Derivative Securities (2nd Ed.)

underlying assets commonly come into being as firms and individuals dobusiness and attempt to limit risk.

Forward contracts, and derivatives generally, fall into the broader classof contingent claims—contracts that mandate payments that are contingenton uncertain events. As regards the forward contract, it is the commodity’sfuture spot price that is uncertain. Casualty and life insurance contracts,in which damage and death are the contingencies, are members of thisbroader class that are not normally considered derivative assets. However,the distinction begins to blur as the field of financial engineering becomesmore creative. For example, one can now buy and sell bets on outcomes oftemperature and snowfall in specific localities, and these are often referredto loosely as “derivative” products. Our use of the term derivative and ourconcern in this book will, however, be limited to those financial instrumentsdescribed in the first sentence of this chapter.

It is precisely because derivatives’ values are tied contractually to otherassets that “pricing” them is a distinct subfield within economic science.While economics helps us understand the determinants of prices of goods,services, and primary assets—for example, Coca Cola and the price of Cokestock—economists might disagree profoundly were they asked to specifythe prices at which these commodities “should” sell. Indeed, since no goodanswer is apt to be forthcoming, the question is not often asked. On theother hand, financial economists who agreed about the dynamic behaviorof the price of an underlying asset would likely supply very similar answerswhen asked to value, or “price”, a derivative. Before we can begin to seewhy this is so, we will need to know more about the kinds of derivativesthat are available, how they are used, and how they are bought and sold.

1.1 A Tour of Derivatives and Markets

1.1.1 Forward Contracts

Forward agreements to deliver and accept delivery of commodities or finan-cial instruments at a future date are alternatives to simply waiting andtrading cash for commodity at the spot price that prevails at that date.Such agreements are made because at least one party finds the risk ofan adverse price fluctuation undesirable. For example, the parties to theexchange of Treasury bills discussed in the introduction might be a bankand a dealer in government securities. The bank’s purpose in contracting toaccept forward delivery—that is, to take a “long” forward position—would

Page 22: Pricing derivative securities

April 6, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch01 FA

Introduction and Overview 5

be to lock in the prices of these instruments until the funds to purchasethem became available. The dealer’s purpose would be to earn a fee, in theform of a markup, for taking the opposite or “short” side of the transac-tion. Similarly, a U.S. importing firm that had contracted for a shipment ofgoods from a Japanese supplier might enter into a forward agreement witha bank for the purchase of Japanese yen, thus protecting against an increasein the $/ exchange rate. In the same way, a producer of cotton yarn mightcontract to deliver a shipment of such goods to a textile manufacturer at aspecified price, allowing both firms to avoid the risk of adverse price fluctu-ation. As these examples illustrate, forward agreements are typically madebetween financial institutions, between productive enterprises and finan-cial institutions, and between enterprises that produce commodities andthose that use them as inputs. Because the delivery terms—date, place,and quantity—are tailored to suit the demands of the originating parties,positions in forward agreements are rarely resold to others. In particular,they are not traded on any organized exchange.

Given the delivery terms, the forward price is set so that neither partyrequires compensation for entering the agreement. That is, the forwardprice is such that the initial value of the contract is zero to both parties.1

However, once the agreement is concluded and the spot price of the under-lying commodity (or other relevant factors) begins to change, the agreementstarts to acquire positive or negative value. Letting T be the time to deliv-ery for a contract initiated at time t = 0, we will represent the forwardprice itself as f(0, T ) (or simply as f if the times are understood) and thetime-t value of the long position in the contract as F(t, T ; f), t ∈ [0, T ].Thus, F(0, T ; f) = 0 and F(T, T ; f) = ST − f, where ST is the spot price atT . In chapter 4 we will see how to price a forward contract at any t ∈ [0, T ]and to find the forward price, f(0, T ), that satisfies F(0, T ; f) = 0. We willalso see that the special feature that makes these tasks very easy is thatthe contract’s terminal value is linear in the spot price.

1But why would they enter the agreement if it had no value to them? The statementin the text, like many others to follow in this book, pertains to conditions that wouldapply in markets free of impediments to borrowing and transacting. We shall see inchapter 4 that in such frictionless markets either party could replicate the agreement

by taking appropriate positions in the underlying commodity and riskless bonds. It isthe common value of this replicating portfolio that would be zero at the agreed-uponforward price.

Page 23: Pricing derivative securities

April 6, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch01 FA

6 Pricing Derivative Securities (2nd Ed.)

1.1.2 Futures

Futures contracts involve the same general commitments that forwardcontracts entail; namely, to exchange a commodity for cash at a futuredate. Unlike forward agreements, which are made through direct negotia-tion between private parties, futures commitments are bought and sold inactive markets conducted on organized exchanges. Examples of these arethe Chicago Board of Trade (CBOT), the Chicago Mercantile Exchange, theLondon International Financial Futures Exchange, the New York FuturesExchange, the Tokyo International Financial Futures Exchange, and theToronto Futures Exchange. Most of the distinguishing features of futurescontracts can be traced to the need to assure that markets for them will beactive and liquid, and therefore cheap to trade in.

One such feature that distinguishes futures from forwards is that theterms of futures contracts are standardized as to quantity and quality ofthe commodity, as to delivery dates, and as to delivery locations.2 Forexample, wheat contracts on the CBOT call for delivery of 5,000 bushelsof wheat of specified types and grades to approved warehouses in Chicago(and certain other specific locations) during a particular delivery month.Besides wheat contracts, one can buy and sell futures commitments formany other agricultural products, for certain industrial commodities, andfor various financial instruments. Among the last are U.S. Treasury bondsand bills, certain stock indexes, and major currencies. Trading of futuresis typically done through a broker, who transmits clients’ orders to theexchange. The transactions arising from the continual flow of buy and sellorders determine the futures price, which in turn determines the net cashoutlay that the buyer must provide at delivery. This is explained furtherbelow.

Besides standardizing the contracts, an important step to promote mar-ket liquidity is assuring buyers and sellers that counterparties will complywith their obligations to deliver or take delivery. To this end, the exchangesoperate clearing houses that serve as intermediaries to the parties and backthe compliance (“performance”) of buyers and sellers with their own capi-tal and that of member brokers. Thus, one who buys (takes a long positionin) wheat futures on the CBOT deals not directly with the seller of the

2As do some forward agreements, futures contracts typically allow the deliveringparty some flexibility in certain of these terms; for example, some contracts allow fordelivery on any business day of the expiration month.

Page 24: Pricing derivative securities

April 6, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch01 FA

Introduction and Overview 7

contract but with the CBOT’s clearinghouse. Since the house simultane-ously commits to deliver to the buyer and to take delivery from the seller,it has no net exposure to price risk; but it is exposed to risk of default byparties to the trade. It is to limit this risk that exchanges follow a practiceknown as marking to market, which from our perspective constitutes themajor difference between futures contracts and forward contracts.

Here is how marking to market works. An individual who places an orderto buy or sell futures must post a surety deposit, called the initial margin,with the broker who handles the trade. This deposit usually amounts to justa small proportion of the total value of the commodity for which deliveryis contracted, because, like forward contracts when they are first arranged,futures positions at first have no net value to either party. This changes,however, as the futures price fluctuates through time and one of the partiesacquires a liability—the obligation to sell at a price below the spot or to buyat a price above. To help keep unsecured losses from getting out of hand,futures contracts are, in effect, renegotiated at the end of each tradingday to provide for delivery at the current futures price. If this is lowerthan the original price, then the long position acquires negative value equalto the product of quantity and price difference, while the short positionacquires a positive value of equal magnitude. These losses and gains arethen deducted from and added to the respective traders’ accounts, whichare thereby “marked” to market.

For example, suppose one buys a contract for 5,000 bushels of July wheatat futures price $3.00, posting initial margin of $500. At the end of thefirst day, with the futures price settling at $2.97, the loss of ($3.00-$2.97) ×5,000 = $150 is deducted from the margin balance, the original commitmentto buy at $3.00 is terminated, and a new contract is opened for deliveryat the futures price $2.97. When losses reduce the balance below somethreshold, another deposit is required. If this deposit is not forthcoming,the position is closed out by the broker and the balance, if any, returnedto the client. A large intraday price fluctuation that exhausted the marginbalance would have to be made up by the broker were the client unable todo so.3

The net effect of marking to market can be seen as follows. Let Stt≥0

be the spot-price process and F(t, T )t∈[0,T ] be the futures price for a

3To reduce the chance of such an occurrence, exchanges limit daily price fluctuationsby halting trading for the day when prices make “limit moves” from the previous day’ssettlement value.

Page 25: Pricing derivative securities

April 6, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch01 FA

8 Pricing Derivative Securities (2nd Ed.)

contract expiring at T . Of course, at t = T the futures price and spot pricecoincide, so that F(T, T ) = ST . Taking the trading day as the unit of time, along position in a contract initiated at price F(0, T ) is worth F(1, T )−F(0, T )per commodity unit at the end of the first trading day, F(2, T )−F(0, T ) perunit at the end of the second day, and so on. On the delivery day itself thevalue of the position will be ST − F(0, T ), just as for a forward contractinitiated at F(0, T ). Upon paying F(T, T ) = ST at delivery, the net costto the long party is ST − [ST − F(0, T )] = F(0, T ) per unit, which was thefutures price when the contract was first created. One can see, then, thatthe effect of marking to market is merely to distribute the net gain or lossover time, rather than allowing it all to come due at T . We consider inchapters 4 and 10 under what conditions and how this difference in timingaffects the relative values of futures and forward positions and the relationbetween forward and futures prices.

The example just given pertained to futures positions that were heldopen to delivery, but in fact only a small proportion of positions are heldthis long. Instead, parties usually cancel their commitments to the clearinghouse by taking offsetting positions before expiration; that is, going long tocancel a short, and vice versa. For those using futures to hedge the risk ofadverse price fluctuations, this eliminates the requirement to meet the exactquantity, quality, location, and time requirements specified in the contract.The futures position still provides a partial hedge for spot purchases andsales in local markets, to the extent that local spot prices in the commodityare correlated with those on which futures contracts are written.4

One should bear in mind that futures prices, like forward prices, are notthemselves prices of traded assets. Instead, it is the futures position thatis the asset, the futures price being merely the quantifiable variable thatfluctuates so as to keep the values of new futures positions equal to zero.

1.1.3 “Vanilla” Options

Put and call options give those who hold them the rights to transact inunderlying assets at specified prices during specified periods of time. Thereare many variations on this theme, but we consider in this section just themost basic instruments, often called “vanilla” options. One who owns (islong in) a vanilla put has the right to sell the underlying asset at a fixed

4Kolb and Overdahl (2006) describes in detail the operation of futures markets andthe use of futures in hedging.

Page 26: Pricing derivative securities

April 6, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch01 FA

Introduction and Overview 9

price, called the “strike price” or “exercise price”, during or at the end ofa specified period. If this option is exercised, the counterparty who is shortthe put has the obligation to take the other side of the trade; that is, to buythe asset at the strike price. Calls work in just the opposite way: one who islong a call has the option to buy at the strike price during or at the end of aspecified period, while the counterparty must sell at this price if the holderchooses to exercise. Options not exercised by the end of the specified periodare said to expire. Even vanilla puts and calls come in two flavors, havingdifferent conventions as to when they can be exercised. American optionscan be exercised at the discretion of the holder at any time on or before theexpiration date, whereas European options can be exercised only at thatspecific time. The names have nothing to do with where these options aretraded today.

The seller of an option is often referred to as the “writer”, a term thatoriginated before the advent of exchange trading. In those days all optioncontracts were negotiated between buyers and sellers or purchased fromdealers, known as put and call brokers. Since opportunities for resale werelimited, options were almost always held until they either expired or wereexercised. Exchange trading of options was begun in 1973 by the ChicagoBoard Options Exchange (CBOE). Currently, exchanges in major financialcenters around the world list European- and American-style options on avariety of underlying assets, including common stocks, stock indexes, andcurrencies. In the U.S. exchange-traded stock options are almost alwaysAmerican,5 but there are European-style options on various stock indexes,such as the Dow Jones Industrials and the S&P 500. Index options, likeindex futures contracts, are settled in cash rather than through deliveryof the underlying.6 Exchange-traded options on spot currencies are alsoavailable, in both American and European style. Besides these spot optionsthere are also exchange-traded futures options. A futures option gives theright to assume a position in a futures contract that expires at some laterdate than the option itself. One who exercises a futures put gets a shortposition in a futures contract plus cash equal to the difference between thestrike price and the futures price. Exercising a futures call conveys a longposition in the futures plus the difference between the futures price and

5“Flex” options available through the CBOE can be obtained in European form.6One who exercises a call on an index receives from the seller in cash an amount

proportional to the difference between the index value and the strike price. For puts, theexercise value is proportional to the difference between the strike price and the index.

Page 27: Pricing derivative securities

April 6, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch01 FA

10 Pricing Derivative Securities (2nd Ed.)

the strike.7 There are exchange-traded futures options on many underlyingassets, including stock indexes, currencies, short- and long-term bonds, andvarious agricultural commodities.

As for futures contracts, the key features of most exchange-tradedoptions—time to expiration, strike price, and quantity—are standardizedin order to promote a liquid market. For example, stock options are typi-cally for 100 shares and expire in six months or less.8 Buying or selling anexchange-traded option (of any sort) works much like buying or selling anexchange-listed stock: an order placed through a broker is transmitted tothe exchange, where it is either matched with another individual’s order oraccepted by the market maker.

Sales of options either reduce or close out existing long positions arisingfrom previous purchases, or else they create short positions. Short posi-tions may be covered or uncovered. A covered short is one that is backedby a position in the underlying (or in another derivative) that automat-ically enables one to meet the obligation imposed by the option holder’sdecision to exercise. For example, a short position in calls would be coveredif backed by a long position in the underlying, which could be delivered ifthe call were exercised. By contrast, to meet the terms of an uncovered or“naked” short position in calls one must have the resources to purchase theunderlying at the prevailing spot price. To reduce the chance of nonper-formance, brokers demand deposits of cash or other securities as surety fornaked short positions, just as they do for short positions in stocks. How-ever, whether they are covered or not, short sales of options are not boundby the tick rules that restrict short sales of stocks on certain exchanges.9

As with futures, short positions in options can be eliminated simply bybuying options of the same type. As for futures also, option exchanges actas intermediaries to all transactions in order to reduce counterparty risk totraders.

7Note that, unlike the payoffs of spot options, payoffs of futures options are linked tofutures prices and therefore not to prices of traded assets. We will see that this differencehas important implications for pricing futures options.

8The CBOE’s flex options, designed to appeal to institutional investors, do offerflexible features, but they must be traded in large quantities. There are also long-termoptions, called “LEAPS” with lives of up to 39 months.

9Securities and Exchange Commission rules in the U.S. prohibit the short sale ofa stock at a price below that of the last preceding trade. A short sale at that price isallowed only if that price is itself above the last preceding different price.

Page 28: Pricing derivative securities

April 6, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch01 FA

Introduction and Overview 11

Introducing the general notation used throughout for prices and con-tract features of standard options, we let T represent the expiration date;X, the strike price; and St, the value at t of the market-determined price towhich the option’s payoff is tied at expiration—that is, the underlying price.We use more specific notation for the underlying price in certain cases; forexample, Ft for the underlying price of a futures option. Time-t prices ofgeneric calls and puts are represented as C(St, T − t) and P (St, T − t),where T − t is the remaining time until expiration. Other arguments areadded when it is necessary to specify contract features or other param-eters affecting value; for example, C (St, T − t; X1) , C (St, T − t; X2) forcalls with different strikes. Superscripts are used to identify specific typesof options, as CA, CE for the American and European varieties. Takingt = 0 as the initiation date of an option contract, the initial values of putsand calls are thus P (S0, T ) and C(S0, T ), and terminal values—values atexpiration if not previously exercised—are P (ST , 0) and C(ST , 0). As forfutures contracts, our convention is that these represent values to one whohas a long position in the option.

Terminal values of vanilla options can be stated explicitly in terms of theunderlying price at T and the strike alone. Since exercising an option is notobligatory, a call (which gives the right to buy at X) would not (rationally)be exercised at expiration unless ST > X. Otherwise, the option conveysnothing of value, and is said to be “out of the money”. When ST > X

an expiring call is “in the money” and, abstracting from transaction costs,has net value to the holder equal to ST − X . Likewise, a put would not beexercised at expiration unless ST < X and would then have value X − ST .

Terminal values of puts and calls can therefore be expressed as

P (ST , 0) = max(X − ST , 0) ≡ (X − ST ) ∨ 0 ≡ (X − ST )+

C(ST , 0) = max(ST − X, 0) ≡ (ST − X) ∨ 0 ≡ (ST − X)+.

Notice that when the underlying price is bounded below by zero andunbounded above (the usual case), then the terminal value of the put isbounded above by the strike price, whereas the call’s value has no upperbound. One who is short a put can therefore lose no more than X , whereasone who is short an uncovered call has unlimited potential loss.

Figure 1.1 represents these elementary payoff functions at expiration.Unlike payoffs of forwards, they are clearly not linear functions of ST . Wewill see in chapter 4 that this fact has far-reaching implications for pricingthe options at dates before expiration; that is, for finding P (St, T − t) and

Page 29: Pricing derivative securities

April 6, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch01 FA

12 Pricing Derivative Securities (2nd Ed.)

Fig. 1.1. Payoff functions of call and put options.

C(St, T−t) for t < T . Pricing American options faces the additional compli-cation that exercise can occur at any time before T. We begin in chapter 5to develop specific techniques for pricing both American and Europeanoptions, extending and developing these procedures in later chapters.

1.1.4 Other Derivative Products

Besides the many exchange-traded derivatives there are available over thecounter—that is, through negotiated transactions with dealers and otherprivate parties—an immense and rapidly growing array of structured deriva-tive products. Serving institutional investors, firms involved in manufac-turing, finance, and commerce, public enterprises, and governmental units,these products make it possible to hedge risks of price movements thatwould adversely affect cash flow and/or the values of assets. Of course, asseveral well-publicized incidents have demonstrated, derivatives also sup-port purely speculative activity that can have disastrous financial conse-quences if there is insufficient internal oversight.

One form of specialized derivatives sold over the counter are the so-called“exotic” options, which can be grouped according to how they modify the

Page 30: Pricing derivative securities

April 6, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch01 FA

Introduction and Overview 13

features of vanilla puts and calls. Here is a sampling. We treat most of thesein chapter 7.

1. Variations on the terminal payoff function. There are “digital” calls thatpay a fixed sum when the underlying price at expiration is above thestrike, and “threshold” calls that pay ST − X, but only when ST >

K > X for some threshold K. There are various path-dependent optionswhose payoffs depend on the entire path of price up to expiration; forexamples, (i) “down-and-out” puts that pay X−ST provided St remainsabove some level K < X for 0 ≤ t ≤ T ; (ii) “lookback” options whosepayoffs depend on the extrema of price over [0, T ]; and (iii) “Asian”options whose payoffs depend on the average price.

2. Variations on the underlying. There are options on other options. Thereare “basket” options with payoffs depending on the value of some port-folio; options on the minimum or maximum of two or more underlyingprices; and “quanto” derivatives, whose payoffs in domestic currency arein fixed proportion to prices of assets denominated in a foreign currency,irrespective of changes in exchange rates.

3. Variations on what right the option confers. There are “chooser” optionsthat let one choose at some future date whether the option is to be aput or a call.

4. Variations on the option’s term. There are “forward-start” options thatcome to life at some future date, t∗, as at-the-money puts or calls (thatis, with X = St∗) that expire at some T > t∗. There are “extendable”options that must be exercised if in the money at one or more discretedates but are otherwise extended one or more times. There are “Bermu-dan” options that can be exercised at certain discrete times but do nothave to be.

There is also a huge over-the-counter market in interest-rate deriva-tives, whose values depend primarily on prices of fixed-income assets. Theseinclude: (i) options on bonds; (ii) interest-rate “swaps”, which are exchangesof payments on floating-rate loans for payments on fixed-rate loans; (iii)currency swaps, which are exchanges of loans in one currency for loans inanother; (iv) options on swaps, called “swaptions”; (v) “caps” and “floors”,which provide upper and lower limits to variable-rate loans; and (vi) optionson caps and floors. Chapter 10 serves as an introduction to the large liter-ature on these instruments.

Page 31: Pricing derivative securities

April 6, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch01 FA

14 Pricing Derivative Securities (2nd Ed.)

1.2 An Overview of Derivatives Pricing

The theory of derivatives pricing is founded on two observations about theprices we find in markets for economic goods:

• Two identical commodities, offered for sale at the same time and place,usually sell for the same price.

• One cannot ordinarily obtain for free something that people value.

The first observation, minus the qualification usually, is often referred toin economics as the law of one price. The second observation—again minusthe qualification—corresponds to the trite expressions, “You can’t get some-thing for nothing” and “There’s no such thing as a free lunch”. Of course,we recognize that both qualifiers are needed, since we have all seen excep-tions to the unqualified statements. For example, the first condition can failwhen not everyone is aware of what goods are available in the marketplaceor not well informed about their characteristics; and the second can failwhen one who owns some commodity either has some charitable motiveor is simply unaware that the commodity has value to others. Of course,both conditions can fail when some coercive authority controls prices byfiat or limits individuals’ ability to transact. However, few would disagreethat the unqualified versions of these statements characterize situations towhich things tend over time in free, competitive markets. As individualslearn through experience or from others, their actions to obtain more ofthe things they value will cause prices of goods, services, and assets toattain levels consistent with their marginal social values.

The failure of either of these conditions to hold in a freely functioningand frictionless market for assets presents an opportunity for arbitrage.Specifically, there is an arbitrage opportunity if one can either:

• Obtain a sure, immediate return in cash (or other good) by trading assets,or

• Get for free a claim on cash (or other good) that has a positive chanceof paying off in the future.

In either case something of value—cash, good, or valuable claim—can beobtained for sure at no cost. Thus, when transacting is costless a failureof the law of one price confers the opportunity for a sure, immediate cashreturn. To earn it, one simply sells the more expensive asset and buys thecheaper one. Likewise, selling one asset and buying another that costs the

Page 32: Pricing derivative securities

April 6, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch01 FA

Introduction and Overview 15

same but adds potential cash payments conveys a valuable claim for no netoutlay.

1.2.1 Replication: Static and Dynamic

In its simplest form pricing some financial asset by arbitrage works by (i)finding a way to replicate its payoffs exactly by assembling a portfolio ofother traded assets whose prices are known, then (ii) invoking the law ofone price. Arbitrage pricing would thus allow a financial firm or its clientto judge the value of a derivative asset that is not already traded but thatthe firm might be asked to sell. The method’s usefulness relies on two keyconditions:

1. That markets function well enough to eliminate significant opportunitiesfor arbitrage, and

2. That the introduction of the new asset does not appreciably change thevalue of the replicating portfolio.

If these conditions hold, then the lawof one price equates the value of the assetto that of the replicating portfolio. What distinguishes derivatives from pri-mary assets is that it is sometimes possible to replicate their payoffs and pricethem by arbitrage. In chapter 4 we will state formally the sufficient conditionson markets, but—depending on the derivative—we shall see that other condi-tions may be needed on the dynamics of the underlying price as well.

The use of arbitrage arguments for pricing financial assets can be tracedat least to John B. Williams’ classic 1938 monograph, The Theory ofInvestment Value. Applying what he called the “principle of conservation ofinvestment value”, Williams argued that the true worth of a firm is invari-ant under changes in its capital structure—that is, changes in its mix ofdebt and equity. Modigliani and Miller (1958) developed a rigorous proofof this proposition for firms in a frictionless economy without taxes andfree of transaction costs and the legal costs associated with contracting andbankruptcy. In Miller and Modigliani (1961) they argued on similar groundsthat a firm’s dividend policy is also irrelevant to its total market value. Allof the arguments rely on the law of one price to show that the total value ofclaims on a stream of earnings should not depend on how the receipts areallocated between dividends and interest. For example, individuals couldin effect create their own desired debt/equity mix by combining the firm’sstock with borrowing and lending on their own account, thereby replicatingany preferred allocation.

Page 33: Pricing derivative securities

April 6, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch01 FA

16 Pricing Derivative Securities (2nd Ed.)

Modigliani-Miller’s recipe for combining stock and bonds to attain aparticular degree of financial leverage is an example of “static” replica-tion. It is static in the sense that no further trades are required once thereplicating portfolio is put in place. We shall see in chapter 4 that staticreplication is also possible for certain derivatives. For example, one canmimic a long position in a forward agreement by buying the underlyingasset now, borrowing the present value of the forward payment, and repay-ing the loan at delivery. Although just one transaction is involved, theportfolio’s market value nevertheless equals that of the forward position atany time until expiration. However, static replication is not usually possiblefor more complicated derivatives whose payoffs are not linear functions ofthe underlying price. It remained for Black and Scholes (1973) and Merton(1973) to show how nonlinear derivatives such as European options can bevalued by “dynamic” replication. Dynamic replication works by forming aself-financing portfolio of traded assets and trading back and forth betweenthem in such a way that the portfolio is sure to have the same value as thederivative when the derivative expires. The term self-financing means thatpurchases of one asset are always financed by sales of others, so that newfunds need never be added nor withdrawn. For example, we shall see inchapters 5 and 6 that the terminal payoff of a European stock option canbe replicated with a self-financing portfolio of stock and bonds, assumingthat the prices of these assets behave in certain ways. Again, by invokingthe law of one price we can conclude that the arbitrage-free value of thederivative in a frictionless market must be that of the replicating portfolio.

1.2.2 Approaches to Valuation when Replication is

Possible

This is all well and good, but how exactly does one find a dynamic replicat-ing portfolio? Paradoxically, once the existence of a self-financing, replicat-ing portfolio can be verified, it happens that the value of the derivative canbe worked out directly without first finding the portfolio weights. Indeed,the solution will tell us what those weights must be. There are two generalways to work out the derivative’s arbitrage-free value.

Pricing by Solving P.D.E.s

The first approach, following the path laid out by Black and Scholes (1973)and Merton (1973), is to infer from the replicating argument that the

Page 34: Pricing derivative securities

April 6, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch01 FA

Introduction and Overview 17

price of the derivative asset must follow a certain fundamental equationof motion. This fundamental equation is either a partial difference equationor a partial differential equation, depending on whether the underlying priceis modeled in discrete time or in continuous time. In either case it describeshow the arbitrage-free price of the derivative changes with respect to timeand the various state variables, such as the prices of assets that comprisethe replicating portfolio. The solution gives price as a weighted sum of theprices of the primary assets and thereby indicates precisely what the repli-cating portfolio is. Moreover, the solution specifies the portfolio weights asfunctions of time and the state variables, thus prescribing how the replicat-ing portfolio is to be adjusted as the state variables evolve.

Risk-Neutral/Martingale Pricing

The second way to find a derivative’s arbitrage-free price and replicat-ing portfolio relies on the no-free-lunch property of arbitrage-free markets.Adopting Arrow’s (1964) and Debreu’s (1959) characterization of assetsas bundles of state-contingent receipts, we can view the market value ofan asset in an arbitrage-free market as simply the sum of the values ofthe contingent receipts it offers. In other words, it is the sum of prod-ucts of the cash amounts received in various states and times multipliedby the market prices associated with those time-and-state-dependent pay-offs. Each time-state price is, of course, the value of a unit cash receipt.It reflects both the likelihood of the state and the subjective trade-offbetween having cash to spend now versus having a unit payoff at theparticular time and in the particular set of circumstances that the staterepresents.

These state prices have two important properties. First, state prices arepositive if and only if state probabilities are positive. This is because theright to a positive chance of receiving a receipt at some date T alwayscommands a positive price in an arbitrage-free market, whereas there canbe no value to a payoff that is tied to a contingency that has no chanceof occurring. Second, in an arbitrage-free market the sum of all the stateprices for contingent payoffs at T must equal the price of a default-free, T -maturing, discount “unit” bond that returns one unit of currency at T inevery state—i.e., a riskless bond. If we divide the state prices by the priceof such a riskless bond, the resulting normalized state prices then have allthe properties of probabilities; that is, they are nonnegative and sum tounity. Indeed, the normalized prices represent a new probability measure

Page 35: Pricing derivative securities

April 6, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch01 FA

18 Pricing Derivative Securities (2nd Ed.)

over states that is “equivalent” to the real one, in the technical sense thatboth measures assign positive values to the same sets of contingencies.

This equivalent measure affords another way to characterize an asset’sarbitrage-free price. Summing products of payoffs and state prices is equiv-alent to summing products of payoffs and pseudo-probabilities and thenmultiplying the result by the price of the riskless bond. Of course, mul-tiplying by the price of the bond is the same as discounting at the aver-age rate of interest that the bond pays over its lifetime. Therefore, theasset’s price can be thought of as the mathematical expectation of itspayoffs in the pseudo-probability measure, discounted to the present atthe riskless rate of interest. Since assets would actually be valued attheir discounted expected payoffs in the true measure if people were riskneutral, it makes sense to refer to this pseudo-measure as the “risk-neutral”measure.

Once one has found the equivalent risk-neutral measure, there is a sim-ple recipe for pricing a derivative security whose value at expiration (dateT ) is a known function of an underlying price: (i) find the mathematicalexpectation of the derivative’s value at T in the risk-neutral measure and(ii) discount it to the present at the riskless rate. Derivatives with payoffsat more than one date can be decomposed into the dated claims, which canbe priced individually in this way and then added together. Thus, we cancalculate an arbitrage-free price for any derivative if we can find a new prob-ability measure in which other assets, including its underlying, are pricedas though they were riskless. The one concern is whether the risk-neutralmeasure and the price obtained from it are unique. As we shall see, theability to replicate the derivative’s payoff with traded assets is preciselywhat is needed for uniqueness.

This risk-neutral approach to pricing followed from the insights of Coxand Ross (1976), who noted that the Black-Scholes (1973) and Merton(1973) formulas for values of European puts and calls can be interpretedas discounted expected values of their terminal payoffs—(X − ST )+ or(ST − X)+—in a different measure. Later, Harrison and Kreps (1979)showed that suitably normalized prices of assets behave as martingales inthe equivalent risk-neutral measure; that is, they are stochastic processeswhose current values equal the conditional expectations of their values atany future date. The usual normalizing factor or numeraire is the currentvalue of a savings account that grows continuously at the riskless rate ofinterest. From this perspective risk-neutral pricing of a derivative asset

Page 36: Pricing derivative securities

April 6, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch01 FA

Introduction and Overview 19

becomes “equivalent-martingale” pricing, and the recipe becomes:

• Choose a numeraire—a strictly positive price process by which to nor-malize all other relevant prices;

• Find a measure in which normalized prices of assets (and derivatives) aremartingales;

• Calculate the derivative’s current normalized price as the conditionalexpectation of its future value; and then

• Multiply by the current value of the numeraire to obtain the derivative’sarbitrage-free price in currency units.

We shall fill in the details of this procedure in later chapters and apply therecipe repeatedly.

1.2.3 Markets: Complete and Otherwise

In the abstract, at least, martingale pricing seems simple enough, assum-ing that one can find the mysterious martingale measure. But even if wetake for granted that it can be found, an obvious question is whether themeasure and the prices derived from it are unique. If not, then a derivativecould have more than one price that was consistent with the absence ofarbitrage opportunities—hardly a satisfactory situation. As it turns out,uniqueness depends ultimately on the dynamics followed by the underlyingassets. When they are such that the payoff of any contingent claim canbe replicated with a portfolio of traded assets (dynamically or otherwise),then the market for these claims is said to be “complete”. As shown byHarrison and Pliska (1981) and more rigorously by Delbaen and Schacher-mayer (1994), there is only one equivalent-martingale measure in a completemarket—and one set of prices that denies opportunities for arbitrage.

Our first applications of martingale pricing are to models for under-lying prices that do allow replication. Chapter 5 treats discrete mod-els, in which the underlying price evolves in discrete time and has dis-crete outcome states. The binomial model considered there both developsour intuitive understanding of martingale pricing and provides a practicalnumerical method for pricing many complicated derivatives. We take up inchapter 6 the continuous-time/continuous-state dynamics that Black andScholes used to price European options. Chapter 7, where we price variousexotic options under Black-Scholes dynamics, highlights the complementar-ity between their p.d.e. approach to valuation and the modern martingaleapproach. Other complete-markets situations are encountered in chapter 10,

Page 37: Pricing derivative securities

April 6, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch01 FA

20 Pricing Derivative Securities (2nd Ed.)

where we take up the modeling of stochastic interest rates and the pricingof interest-sensitive derivatives.

1.2.4 Derivatives Pricing in Incomplete Markets

The hard fact is that certain models that allow for more realistic behaviorof the dynamics of underlying assets do not allow all contingent claims to bereplicated with traded assets—even dynamically. In the incomplete marketsthat such models generate, infinitely many measures exist that could poten-tially price the claims. Of these measures the market actually uses just one,but we cannot discover which one it uses without more information. Whatthis means in practice is that derivatives cannot be priced by arbitragearguments alone when markets are incomplete. Instead, prices will dependon free parameters that are usually interpreted as reflecting peoples’ tastesfor risk bearing. While these parameters are in principle discoverable, theyare not readily observable.

Since the risk parameters are in fact embedded in the observed marketprices of traded derivatives, one way of discovering them is to see whatvalues of the parameters make predictions of the pricing model accordwith observation. That is, if we can somehow find the equivalent-martingaleprices that correspond to any given set of parameters—either by solving thep.d.e.s or by calculating the conditional expectations—then we can gropearound in the parameter space, repricing at each point until we land on oneat which there is a good fit with the prices quoted in the market. Chapters 8and 9 provide the first look at these incomplete-markets models, for whichthe relevant parameters for pricing must be filtered in this way from theobserved prices of traded derivatives.

The program ahead of us clearly involves some math. In the next twochapters we review the principal concepts of analysis, probability, andstochastic processes that will be applied in financial modeling, and we learnto use the special tools of stochastic calculus for treating processes thatevolve in continuous time.

Page 38: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

2Mathematical Preparation

This chapter summarizes the basic tools of analysis and probability theorythat are needed in the applications to financial derivatives. It also introducesthe continuous-time stochastic processes that will be the subject of the nextchapter. The following summarizes notation that will be used throughoutthe book:

a, a, b, a, b, . . . sets of discrete points in a space[a, b], (a, b), [a, b), (a, b] closed, open, half-open intervals of realsℵ the natural numbers, 1, 2, . . .ℵ0 ℵ with 0 added, 0, 1, 2, . . . the real numbers, (−∞,∞)+ the nonnegative reals, [0,∞)n n-dimensional space, n ∈ ℵQ the rational numbersC, Cn the complex numbers, n-vectors⊆,⊇;⊂,⊃ subset, superset; proper subset, superset∩,∪, \ intersect, union, and difference of sets∅ empty setAc = Ω\A complement of set A relative to space ΩAn∞n=1 ↑, An∞n=1 ↓ nondecreasing, nonincreasing sequencesa,b column vectors or matricesa′b,ab′ inner product, outer product

1A(x) indicator function =

1, x ∈ A

0, x ∈ Ac

f(x±) right/left-hand limits: limn→∞ f(x ± 1/n)a ∨ b, a ∧ b greater of a or b, lesser of a or b

convergence in distribution

21

Page 39: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

22 Pricing Derivative Securities (2nd Ed.)

2.1 Analytical Tools

We begin with some basic results from analysis and measure theory thatwill be essential in the study of derivatives.1

2.1.1 Order Notation

It is often necessary to approximate functions and to specify how theapproximation error varies with an argument of the function. Order nota-tion is useful for this purpose. Consider, for example, a sequence of realnumbers an∞n=1, which is a mapping from the positive integers, ℵ, tothe reals, . One often wants to approximate an to some specified degreeof accuracy by a function an that has simpler properties. Thus, since thefunction an = n

n−1 = 11−1/n can be expanded as

an = 1 + 1/n + 1/n2 + · · · + 1/nm−1 + 1/nm + 1/nm+1 + · · · ·

=m−1∑j=0

n−j + 1/nm + 1/nm+1 + · · · ·

=m−1∑j=0

n−j + n−m n

n − 1,

one could write an =∑m−1

j=0 n−j + O(n−m). This both represents an interms of an approximating polynomial of degree m−1 in n−1 and indicateshow the maximum discrepancy between an and the polynomial behaves asn increases. In this example we would say that discrepancy is of order n−m

as n → ∞. A quantity that is O(n−m) satisfies

limn→∞

nmO(n−m) = c,

where c (equal to unity in the example) is a constant not equal to zero.Thus, the statement an − an = O(n−m) specifies the slowest rate at whichthe difference vanishes or, if m < 0, the fastest rate at which it increaseswith n. The definition implies that a quantity is O(n0) or, equivalently,O(1) if it neither vanishes nor diverges as n → ∞.

In many cases it is necessary to know only that an approximationerror vanishes faster, or increases more slowly, than a specified rate.

1References that may be consulted for systematic treatment and proofs of thetheorems include Ash (1972), Billingsley (1986), Royden (1968), Rudin (1976), andTaylor (1966).

Page 40: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 23

The small-o notation is used for this. We would write an − an = o(n−m) iflimn→∞ nm(an − an) = 0 and would then say that the approximation erroris of order less than n−m. Thus, by adding one more term to the summationin the example above, we could write

an =m∑

j=0

n−j + o(n−m).

A quantity that is o(n0) or, equivalently, o(1) simply vanishes (at an unspec-ified rate) as n → ∞.

The following arithmetic of order notation is obvious from the defini-tions. For k, m ∈ 0,±1,±2, . . .

O(nk) ± O(nm) = O(nmax(k,m))

o(nk) ± o(nm) = o(nmax(k,m))

O(nk) ± o(nm) =

o(nm), m > k

O(nk), m ≤ k

O(nk) × O(nm) = O(nk+m)

o(nk) × o(nm) = o(nk+m)

O(nk) × o(nm) = o(nk+m).

In approximating functions of real variables a variant of this notationis often used in which a real variable h, corresponding to n−1, approacheszero. For example, f(h) = f(h) + O(hk) as h → 0 means that

limh→0

h−k[f(h) − f(h)] = c = 0,

while f(h) = f(h) + o(hk) as h → 0 means that

limh→0

h−k[f(h) − f(h)] = 0.

2.1.2 Series Expansions and Finite Sums

Taylor series gives polynomial approximations for differentiable func-tions, as

f(x) = f(a)+f ′(a)(x − a)+f ′′(a)

2(x − a)2 + · · ·+ f (k)(a)

k!(x − a)k + Rk+1,

=k∑

j=0

f (j)(a)j!

(x − a)j + Rk+1,

Page 41: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

24 Pricing Derivative Securities (2nd Ed.)

where

Rk+1 =f (k+1)(xa)(k + 1)!

(x − a)k+1

and |x−xa| < |x−a|. Thus, xa is a point between x and a, its precise valuenot specified. Two important applications (both with a = 0) are

ex =k∑

j=0

xj

j!+ o(xk) as x → 0 (2.1)

ln(1 + x) =k∑

j=1

(−1)j−1

jxj + o(xk) as x → 0. (2.2)

Formula (2.1) is valid for k = ∞ and any complex number x, while (2.2)holds for k = ∞ only when |x| < 1. We will also need the following equal-ities, which can be verified by comparing Taylor expansions of each sideabout ζ = 0:

sin ζ =eiζ − e−iζ

2i, cos ζ =

eiζ + e−iζ

2, i =

√−1. (2.3)

From these follows Euler’s formula,

eiζ = cos ζ + i sin ζ. (2.4)

Setting f(x) = (1 + x)s for s ∈ and |x| < 1 produces the binomialseries,

(1 + x)s =∞∑

j=0

(s

j

)xj . (2.5)

The binomial coefficients are given by

(s

j

)≡

s(s − 1) · · · (s − j + 1)/j!, j ∈ ℵ1, j = 00, else

(2.6)

When s ∈ ℵ the right side of (2.5) comprises just s + 1 terms, all terms inpowers higher than s vanishing by virtue of (2.6 ). In this case the expansionholds without restriction on x. Note that definition (2.6) implies the identity(

s

j − 1

)+

(s

j

)=

(s + 1

j

). (2.7)

Page 42: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 25

The following elementary expressions for finite sums will often be useful:n∑

j=1

j = n (n + 1) /2 (2.8)

n∑j=1

j2 = n (n + 1) (2n + 1) /6 (2.9)

n∑j=0

aj =

1 − an+1

1 − a, a = 1

n + 1, a = 1(2.10)

(a + b)n =n∑

j=0

(n

j

)ajbn−j. (2.11)

2.1.3 Measures

A measure is simply a special type of set function; that is, a function thatassigns numbers to sets. The explanation of the concept begins with anoverall space, Ω, of which everything we would want to measure is a subset.For example, Ω would be if we wanted to measure various sets of realnumbers. The next ingredient is a class M of measurable sets, the collectionof sets that actually can be measured, given the requirements we imposefor how a measure should behave. Measures with the right properties canbe defined on classes of sets known as σ-fields or σ-algebras. These arecollections that contain all the sets that can be built up by countably manyof the ordinary set operations (union, intersection, complementation) onsome fundamental constituent class. For example, the Borel sets, B, area σ -field of sets of the real line that contain all countable unions andintersections of the open intervals (x, y) (or closed intervals [x, y], or half-open intervals (x, y], etc.), together with their complements.2 The Borelsets of the line comprise all those sets (intervals, points, unions of intervals,. . . ) to which a meaningful notion of length could be assigned. Indeed, theyare the smallest such collection that can be built up from the intervals.Another way to put this is to say that B is the σ-field “generated” by theintervals in .

2For examples, (i) an open interval (x, y) can be built from the class of half-openintervals (x, z] : x ∈ , z ∈ as (x, y) = ∪∞

n=1(x, y− 1/n]; (ii) the closed interval [x, y]

can be built from open intervals (w, z) : w ∈ , z ∈ as [x, y] = ∩∞n=1(x−1/n, y+1/n);

and (iii) the singleton set (point) x can be constructed from intervals in any of thefollowing ways: ∩∞

n=1(x − 1/n, x], ∩∞n=1[x − 1/n, x], ∩∞

n=1[x, x + 1/n), ∩∞n=1[x, x + 1/n].

Page 43: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

26 Pricing Derivative Securities (2nd Ed.)

A measure µ on (Ω,M) is then one of the class of set functions thatmap M into the reals, but it has two special properties that set it apartfrom other functions in the class that are not measures. The required spe-cial properties are (i) µ(·) ≥ 0 (non-negativity), and (ii) µ(∪∞

j=1Aj) =∑∞j=1 µ(Aj), when Aj ∩ Aj′ = ∅ (countable additivity). In short, µ is

a countably additive mapping from M into + ∪ +∞ ≡ [0, +∞].Property (i) just rules out having negative lengths, masses, probabilities,etc.; while property (ii) assures that the measure of the whole is the sum ofthe measures of its (disjoint) parts. The triple (Ω,M, µ) forms a measurespace, all three ingredients being needed in order to specify exactly whatthe measure is. Typically, measures are defined in some intuitive way on arelatively simple class of sets that generate M, and then it is proved thatthey can be extended—uniquely—to M itself. With Ω = and M = B,Lebesgue measure is defined for a simple class of sets, the intervals, to con-form with our common notion of length; e.g., µ((a, b]) = b−a. The extensionof this measure to the class B (some sets of which are extremely compli-cated!) can then be made. There are, however, subsets of that are notin B and cannot be measured if we insist that measures have properties (i)and (ii).3

Our main interest will be in probability measures, which are introducedin the next section, but we will need to know a few facts and some termi-nology pertaining to measures generally and some properties of Lebesguemeasure in particular. First, we shall require a fundamental result aboutthe measures of monotone sets. A sequence of sets An∞n=1 is said to bemonotone if either An ⊆ An+1 or An ⊇ An+1 for each n. In the first casethe sequence is said to be weakly increasing or, equivalently, nondecreasing;and in the second, weakly decreasing or nonincreasing. These features ofthe sequence are also expressed symbolically as An ↑ and An ↓, respec-tively. If An ↑, then we define limn→∞ An as ∪∞

n=1An; and if An ↓, wedefine limn→∞ An as ∩∞

n=1An. For the fundamental result about measuresof such sets let An∞n=1 be a monotone sequence of sets in M, and letµ be a measure on (Ω,M). Then if either (i) An ↑ or (ii) An ↓ andµ (A1) < ∞, we have

µ( limn→∞

An) = limn→∞

µ(An). (2.12)

3While all sets in B are Lebesgue-measurable, there are some measurable sets thatare not in B (Taylor, 1966, p. 94). Examples of nonmeasurable sets are (happily) noteasily constructed. For one such, see Billingsley (1986, pp. 41–42).

Page 44: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 27

In words, this says that (under the stated condition) measures of limits ofmonotone sequences of sets equal limits of the sequences of their measures.This is referred to as the “monotone property” of µ. As an example of whyµ (A1) < ∞ is needed in the decreasing case, take (,B) as measure space,An = [0, n−1]∩Q (the rational numbers on [0, n−1]), and let # be countingmeasure (# (A) equals the number of elements of A). Thus, # (An) = ∞for each n but # (limn→∞ An) = #(0) = 1.

Now, some terminology. A measure µ is “finite” if µ(Ω) < ∞. It is“σ-finite” if there is a countable decomposition, Ω = ∪∞

n=1An, such thatµ(An) < ∞ for each n. Thus, Lebesgue measure on (,B) is not finite,but it is σ-finite. Next, suppose some condition C holds for all membersof Ω except those comprising a measurable set A0 such that µ(A0) = 0.Then we say that C holds “almost everywhere” (a.e.) with respect to µ.For example, a function g such that g > 0 except on such a set would besaid to be positive almost everywhere with respect to µ, or, for short, to bepositive a.e. µ.

Finally, consider some aspects of Lebesgue measure in particular. Definethe sequence of sets An as An = (a − 1/n, a] for n ∈ ℵ. Then with µ

as Lebesgue measure we have µ(An) = 1/n, conforming to the ordinarynotion of length; and, by the monotone property, µ(a) = µ (∩∞

n=1An) =limn→∞ µ(An) = limn→∞ 1/n = 0. Thus, the Lebesgue measure of anypoint in (any singleton set) is zero. Now let B be any countable set ofpoints in , so that B = ∪∞

n=1xn for real numbers x1, x2, . . . Then bycountable additivity of measures, µ(B) =

∑∞n=1 µ(xn) = 0. Countable

sets in are thus sets of Lebesgue measure zero. Applying the “a.e.” ter-minology, we would say that a condition that holds except on a countableset in holds a.e. with respect to Lebesgue measure.

2.1.4 Measurable Functions

A mapping f from a measurable space (Ω,M) to a measurable space (Ψ,N )is said to be a “measurable” mapping if the inverse image under f of eachset in N is a set in M; that is, if f−1(D) ≡ ω : f(ω) ∈ D ∈ M foreach D ∈ N . Similarly, a real-valued function f : Ω → is measurable(that is, M-measurable) if for each Borel set B ∈ B the inverse image,f−1(B) = ω : f(ω) ∈ B, is in M.4 Knowing that f is a measurable

4Notice that f here is not a set function, which is a mapping from M to . Instead, itis a “point” function that maps Ω into and therefore assigns real numbers to elementsω of Ω.

Page 45: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

28 Pricing Derivative Securities (2nd Ed.)

mapping means that we can associate with a statement like “f takes avalue in D” a probability, a length, or a value of whatever measure it isthat has been defined on (Ω,M). That a function f is M-measurable isoften expressed as f ∈ M for brevity.

For example, working with (,B, µ), where µ is Lebesgue measure, letf(x) = ex, so that f maps onto +\0 = (0,∞). Then the Borel sets(0, 1] and [1, e] have inverse images (−∞, 0] and [0, 1] under f , and these arethemselves Borel sets with µ((−∞, 0]) = +∞ and µ([0, 1]) = 1. For func-tions of real variables it can be shown that the condition f−1((−∞, x]) ∈ Mfor each x ∈ is sufficient for measurability. (Of course, the necessity ofthis is obvious.) This simplifies considerably the task of checking that afunction is measurable. The functions of real variables that arise in ordi-nary applications of analysis are, in fact, Borel measurable. Thus, right-and/or left-continuous real functions are measurable, as are compositions,sums, products, maxima and minima, suprema and infima, inverses of (non-vanishing) functions, and limits of sequences of measurable functions (onthe sets where the limits exist).

2.1.5 Variation and Absolute Continuity of Functions

A collection S of subsets of a set S constitutes a partition of S if eachelement of S belongs to one and only one member of S. Thus, if S is apartition that comprises the sets s1, s2, . . . , then sj ∩ sj′ = ∅ wheneverj = j′ and ∪jsj = S. Considering now subsets of , let S = (xj−1, xj ]n

j=1

be a partition of the interval (a, b] into a finite number of subintervals,where we take x0 = a and xn = b. The “variation” of a measurable functionF : → over (a, b] is defined as

V(a,b]F ≡ supn∑

j=1

|F (xj) − F (xj−1)| ,

where the supremum is over all such finite partitions. F is said to be of“bounded variation” over (a, b] if V(a,b]F < ∞. Monotone functions on(a, b], whether nondecreasing or nonincreasing, necessarily have boundedvariation, equal to |F (b)−F (a)|. Likewise, F has bounded variation if it isa sum or a difference of monotone functions, in which case it can always berepresented as the difference between two nondecreasing functions. Indeed,

Page 46: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 29

it is true that a function is of bounded variation on a finite interval if andonly if it can be represented as the difference in nondecreasing functions.5

We know from elementary calculus that a function F : → is contin-uous if |F (x′) − F (x)| can be made arbitrarily small (less than an arbitrarypositive ε) by taking |x′ − x| sufficiently small (less than some δ(ε)). Abso-lute continuity imposes the following more stringent restriction: for eachε > 0 there is a δ > 0 such that

∑nk=1 |F (x′

k) − F (xk)| < ε for each finitecollection of nonoverlapping intervals such that

∑nk=1 |x′

k − xk| < δ. Unlikesimple continuity, absolute continuity imposes extra smoothness require-ments that ensure differentiability a.e. with respect to Lebesgue measure.Clearly, no function of unbounded variation can be absolutely continu-ous, although simply continuous functions can have unbounded variation.Chapter 3 will present a prominent example of the latter—realizations ofsample paths of Brownian motions.

An important feature of absolutely continuous functions of real variablesis that they (and only they) have representations as indefinite integrals.That is, F : → is absolutely continuous if and only if there exists a“density” f such that

F (x) − F (a) =∫ x

a

f(t) · dt, (2.13)

where f = F ′ a.e. with respect to Lebesgue measure and where the inte-gral is interpreted in the Lebesgue sense.6 Our next task is to define suchintegrals.

2.1.6 Integration

This section defines several constructions of integrals of real-valued, measur-able functions that are encountered in the applications to financial deriva-tives. Following a quick review of the ordinary Riemann integral fromcalculus, we extend to Riemann-Stieltjes integrals of functions of real vari-ables and then develop the more general concept of integrals on abstractmeasure spaces, of which the Lebesgue and Lebesgue-Stieltjes integrals arespecial cases.

5For this see Billingsley (1986, pp. 435–436).6For a proof, see Ash (1972, theorem 2.3.4). We shall see in chapter 3 that the

only-if restriction of this theorem makes the Lebesgue construction unsuitable as aninterpretation of stochastic integrals with respect to processes of unbounded variationon finite intervals.

Page 47: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

30 Pricing Derivative Securities (2nd Ed.)

Riemann Integral

Defining the definite Riemann integral,∫ b

ag(x) · dx, for a suitable function

g : → begins by partitioning [a, b] as

a = x0 < x1 < · · · < xn−1 < xn = b

in such a way that δn ≡ max1≤j≤n |xj − xj−1| → 0 as n → ∞ and setting

gj ≡ supx∈(xj−1,xj]

g(x)

gj≡ inf

x∈(xj−1,xj]g(x).

It does not matter whether the intervals from xj−1 to xj are openor closed. Next, construct upper and lower approximating sums, Sn =∑n

j=1 gj(xj − xj−1) and Sn =∑n

j=1 gj(xj − xj−1). If limn→∞ Sn and

limn→∞ Sn both exist and are equal, then g is said to be Riemann inte-grable on [a, b], and

∫ b

a g(x) · dx is defined as the common value of the twolimits. Clearly, the limits will be the same if g is continuous, for then eachgj and g

jcan be made arbitrarily close by taking δn sufficiently small.

Bounded functions having a countable number of jump discontinuities arealso Riemann integrable, as are finite sums, differences, products, andcompositions of integrable functions. Since limm→∞ |

∫ a

a−1/mg(x) · dx| ≤

limm→∞ m−1 supx∈[a−1/m,a] |g(x)| = 0 if g is Riemann integrable, thereis no distinction between integrals over closed intervals, half-open inter-vals, and open intervals. (The vague notation

∫ b

ag(x) · dx admits any

of these interpretations.) The improper integral∫∞

ag(x) · dx exists if

limb→∞∫ b

a g(x) · dx exists; and∫ b

−∞ g(x) · dx and∫∞−∞ g(x) · dx are defined

in a like manner.The definition of the Riemann integral grants substantial flexibility in its

construction (when it exists). Since gj≤ g(x∗

j ) ≤ gj for any x∗j ∈ (xj−1, xj ],

the condition limn−→∞ Sn = limn−→∞ Sn guarantees that

limn−→∞

n∑j=1

g(x∗j )(xj − xj−1) =

∫ b

a

g(x) · dx.

Example 1 To illustrate the process of evaluating a Riemann integral fromthe definition, take g(x) = x, a = 0, b = 1, and xj = j/n. Using (2.8) to

Page 48: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 31

express Sn and Sn, we have

limn→∞

Sn = limn→∞

n∑j=1

j

n

(1n

)= lim

n→∞n(n + 1)

2n2

=12

= limn→∞

n(n − 1)2n2

= limn→∞

n∑j=1

j − 1n

(1n

)= lim

n→∞Sn.

The Riemann-Stieltjes Integral

Picking an arbitrary origin c and representing the Lebesgue measure of [c, x]as µ([c, x]) = x − c for x ≥ c, the Riemann integral can be expressed as∫ b

a

g(x) · dµ([c, x]) = limn→∞

n∑j=1

g(x∗j ) · ∆µ([c, xj ]),

where (i) c ≤ a, (ii) x∗j ∈ (xj−1, xj ], with xjn

j=0 partitioning [a, b] asbefore, and (iii) ∆µ([c, xj ]) = µ([c, xj ]) − µ([c, xj−1]) = xj − xj−1. Themessage is that in the Riemann sum the representative value of g in eachinterval is being weighted by the Lebesgue measure of that interval. TheStieltjes integral extends the concept of integration by allowing other mea-sures to be used as weights.

Suppose F is a nondecreasing function on . Then a measure µF on(Ω,B) can be defined by taking µF ((a, b]) = F (b) − F (a) for intervals andthen extending to the entire class B, just as Lebesgue measure is extendedfrom the same fundamental class. A function g is Riemann-Stieltjes inte-grable with respect to F on (a, b] if the limits

limn→∞

n∑j=1

gj [F (xj) − F (xj−1)] , limn→∞

n∑j=1

gj[F (xj) − F (xj−1)]

exist and are equal, in which case the common value is written as∫(a,b]

g(x) · dF (x).

Page 49: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

32 Pricing Derivative Securities (2nd Ed.)

Again, if the integral exists then g could as well be evaluated at anyx∗

j ∈ (xj−1, xj ]. Unlike the Riemann integral, however, Stieltjes inte-grals over open and closed sets may differ, because µF (a) = F (a) −limm→∞ F (a−1/m) ≡ F (a)−F (a−) is not necessarily zero. Thus, the nota-tion for Stieltjes integrals must indicate explicitly whether intervals includethe end points, as

∫(a,b]

g(x) · dF (x),∫[a,b]

g(x) · dF (x),∫[a,b)

g(x) · dF (x),etc. One should note also that

∫a g(x) · dF (a) = g(a) [F (a) − F (a−)] and∫

[a,b] g(x) · dF (x) =∫a g(x) · dF (a)+

∫(a,b) g(x) · dF (x)+

∫b g(x) · dF (a).

Two common special cases that allow other representations of the Stielt-jes integral are when (i) F is absolutely continuous and (ii) F is a right-continuous step function. In case (i) there exists a density f that allows arepresentation for F itself as a Riemann-integral:

F (b) − F (a) =∫ b

a

f(x) · dx.

Since absolute continuity implies differentiability with respect to Lebesguemeasure almost everywhere, we have f(x) = F ′(x) except possibly atisolated points where F ′ fails to exist. On that (countable) set of zeroLebesgue measure f can be defined arbitrarily. In the limiting sum thatdefines the integral the mean value theorem can be used to approximateF (xj)−F (xj−1) on (xj−1, xj ] as f(x∗

j )(xj −xj−1) for some x∗j ∈ (xj−1, xj).

Passing to the limit shows that the Stieltjes integral with respect to F

can be evaluated as an ordinary Riemann integral of the product g(x)f(x);that is,

limn→∞

n∑j=1

g(x∗j )f(x∗

j )(xj − xj−1) =∫ b

a

g(x)f(x) · dx.

The other common situation allowing an elementary representation ofthe Stieltjes integral on (a, b] is that F is a right-continuous step functionwith jump discontinuities at a countable number of points x′

1, x′2, . . . . At any

such point x′k we have µF (x′

k) = F (x′k) − F (x′

k−), so that the definitionof the integral in this case gives∫

(a,b]

g(x) · dF (x) = limn→∞

n∑j=1

g(x∗j ) [F (xj) − F (xj−1)]

=∑

k

g(x′k) [F (x′

k) − F (x′k−)] .

Page 50: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 33

The Stieltjes integral of g is thus evaluated as the sum of its values atthe points where F is discontinuous, weighted by the jumps in F at thesepoints.

Mixed cases in which F is absolutely continuous on some set C andhas jump discontinuities at points in a countable set D are handled bydecomposition:∫

(a,b]

g(x) · dF =∫

(a,b]∩C

g(x) · dF +∫

(a,b]∩D

g(x) · dF

=∫

(a,b]∩C

g(x)F ′(x) · dx +∑

x∈(a,b]∩D

g(x) [F (x) − F (x−)] .

A list of the important common properties of Riemann and abstractintegrals appears on page 35.

Integrals on Abstract Measure Spaces

A more general concept of the integral is sometimes needed than that pro-vided by the Riemann construction. A limitation of the Riemann construc-tion is that a set A over which the integral is evaluated (e.g., A = [a, b])must have a natural ordering, such as the real numbers have. This is neededbecause the set itself has to be partitioned according to the values of itsmembers. A more general construction is possible if the range of the inte-grand g over the domain A is subdivided, rather than A itself. As it turnsout, the more general construction achieved in this way also has propertiesthat are more convenient and robust in mathematical analysis.

To develop the abstract integral, consider a measure space (Ω,M, µ)and a measurable function g : Ω → . Here Ω could be (or k, k > 1),but it could as well be a collection of physical objects or of abstract enti-ties, such as the outcomes of a chance experiment. The function g assignsa real number to any such element ω of Ω. The abstract integral, writ-ten either as

∫g(ω) · dµ(ω) or as

∫g(ω) · µ(dω), is defined in a series of

steps in which the allowed class of integrands becomes increasingly moregeneral. Step 1 pertains to the class S+ of nonnegative simple functions,which are functions that take nonnegative, constant values on each of afinite number of subsets that partition Ω. Thus, we suppose initially thatΩ is the union of disjoint sets Ajn

j=1 such that g(ω) = gj ≥ 0 for eachω ∈ Aj . Using indicator functions that take the value unity on the desig-nated sets and zero elsewhere, simple functions can be written compactly

Page 51: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

34 Pricing Derivative Securities (2nd Ed.)

as g(ω) =∑n

j=1 gj1Aj (ω). For g ∈ S+ the integral is defined in a veryintuitive way: ∫

g(ω) · dµ(ω) =n∑

j=1

gjµ(Aj).

Here, the constant value of the function g on each set in the partition issimply weighted by the measure of the set, and the integral is just the sumof these weighted values.

Step two extends the definition to arbitrary nonnegative measurablefunctions g ∈ M+. This is done by showing that any member of M+ canbe attained as the limit of a sequence of functions gn in S+. A commonconstruction is to define sets

Aj,n =

ω : j−12n ≤ g(ω) <

j

2n

, j = 1, 2, . . . , 22n

ω : g(ω) ≥ 2n , j = 0

and take

gn(ω) =

j − 12n

, ω ∈ Aj,n, j > 0

2n, ω ∈ A0,n

.

Notice that the sets Aj,n are indeed based on a subdivision of the rangeof g, namely +. It is obvious that the sequence of such simple functionsgn has the first two of the following properties, and the third is not hardto show: (i) gn(ω) ≤ g(ω), (ii) gn(ω) ≤ gn+1(ω), and (iii) gn(ω) ↑ g(ω)for each ω ∈ Ω and each g ∈ M+. Since gn ∈ S+, the definition of theintegral extends naturally to this broader class of nonnegative measurablefunctions as the limit of the sequence of integrals of the simple functionsthat converge to g:∫

g(ω) · dµ(ω) = limn→∞

∫gn(ω) · dµ(ω).

It can be shown that the limit is unique even though there are many waysof constructing such a sequence of approximating simple functions.7 Thelimit may well be infinite, however. In that case the integral is said not toexist, and g is said to be not µ-integrable. This can happen if g(ω) ≥ ε > 0on some set Aε such that µ(Aε) = +∞. It can happen also if g is positive

7For a proof of uniqueness see Taylor (1966, pp. 112–113).

Page 52: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 35

and unbounded on a set of finite measure in such a way that the sequenceof integrals diverges for all approximating simple functions.

Finally, step three extends the definition to the entire class of measurablefunctions M without the positivity restriction. This is done very simply.Setting g+(ω) ≡ g(ω) ∨ 0 and g−(ω) ≡ g(ω) ∧ 0, we have g+ ∈ M+,−g− ∈ M+, and g = g+ − (−g−). The natural definition of the integral forg ∈ M is therefore∫

g(ω) · dµ(ω) =∫

g+(ω) · dµ(ω) −∫

[−g−(ω)] · dµ(ω).

As thus defined, the integral is said to exist if and only if the integrals ofg+ and −g− are both finite—a point of some importance that is discussedbelow.

Many familiar properties of the Riemann integral are preserved in thisconstruction. In the following list it is assumed that Aj∞j=0 ∈ M (theσ-field of sets on which the measure µ is defined), that Aj ∩ Aj′ = ∅ forj = j′, and that µ(A0) = 0. As well, a and b are arbitrary real numbers,and g and h are measurable functions.8

1.∫

A1g(ω) · dµ(ω) =

∫g(ω)1A1(µ) · dµ(ω)

2.∫∪∞

j=1Ajg(ω) · dµ(ω) =

∑∞j=1

∫Aj

g(ω) · dµ(ω)

3.∫

a · dµ(ω) = a∫

dµ(ω) = aµ(Ω)

4.∫

bg(ω) · dµ(ω) = b∫

g(ω) · dµ(ω)

5.∫[a + bg(ω) + ch(ω)] · dµ(ω) = a + b

∫g(ω) · dµ(ω) + c

∫h(ω) · dµ(ω)

6. g(ω) ≤ h(ω) a.e. µ ⇒∫

g(ω) · dµ(ω) ≤∫

h(ω) · dµ(ω)

7. g(ω) = h(ω) a.e. µ ⇒∫

g(ω) · dµ(ω) =∫

h(ω) · dµ(ω)

8.∣∣∫ g(ω) · dµ(ω)

∣∣ ≤ ∫|g(ω)| · dµ(ω)

9. µ(A0) = 0 ⇒∫

A0g(ω) · dµ(ω) = 0.

While these properties are familiar because they are shared by the Rie-mann integral, the abstract integral has other important properties that

8To apply these to the Riemann contruction (assuming that g and h are Riemannintegrable), take ω ∈ , Aj ∈ B, µ(Aj) as either Lebesgue measure or as the measureinduced by a nondecreasing function F on , and interpret dµ(ω) as either dω or dF (ω).

Note that property 1 implies thatR

g(ω) · dµ(ω) has the same meaning asRΩ g(ω) ·

dµ(ω). Property 7 just states that if g(ω) = h(ω) outside a set of measure zero, then theintegrals also are equal.

Page 53: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

36 Pricing Derivative Securities (2nd Ed.)

the Riemann construction does not necessarily confer. One of these is theimplication that

∫g(ω) · dµ(ω) exists if and only if

∫|g(ω)| · dµ(ω) exists.

Since |g| = g+ − g−, this property follows from the way the definitionis extended from nonnegative to arbitrary measurable functions. Anothervery important implication of the definition that does not hold in generalfor Riemann integrals is the dominated convergence theorem. This essen-tial tool in analysis gives a simple condition under which the operations ofintegrating and taking limits can be interchanged.9

Theorem 1 Suppose gn is a sequence of M-measurable functions converg-ing (pointwise) to a function g, with |gn| ≤ h for some integrable h. Theng itself is integrable, and

limn→∞

∫gn(ω) · dµ(ω) =

∫lim

n→∞gn(ω) · dµ(ω) =

∫g(ω) · dµ(ω).

Thus, the interchange of limits and integration is justified if there is adominating, integrable function h.

The Lebesgue Integral

The integral having been defined on an abstract measure space (Ω,M, µ),the Lebesgue integral can now be viewed just as the special case in whichΩ = , M = B, and µ is Lebesgue measure. Since then µ((c, x]) = x − c

for c ≤ x, the integral can be expressed simply as∫

g(x) · dx. Also, as forthe Riemann integral, µ(a) = µ(b) = 0 implies that

∫[a,b] g(x) · dx =∫

(a,b) g(x) · dx =∫[a,b) g(x) · dx, etc., so that the ambiguous representation∫ b

ag(x) · dx still works perfectly well. Finally, letting F : → be a

nondecreasing function and defining the measure µF ((c, x]) as F (x)−F (c)gives us the Lebesgue-Stieltjes integral:

∫ g(x) · dµF (x) =

∫ g(x) · dF (x).

As in the Riemann-Stieltjes construction the fact that µF (a) and µF (b)are not necessarily zero requires specifying whether sets are open or closed,as

∫[a,b] g(x) · dF (x),

∫(a,b] g(x) · dF (x), etc.

When g is Riemann integrable, the Riemann and Lebesgue construc-tions give the same values for

∫ b

ag(x) · dx. Likewise, when g is Riemann-

Stieltjes integrable with respect to F, then∫[a,b]

g(x) · dF is the same in

9The proof can be found in standard texts on real analysis and measure; e.g., Ash(1972), Billingsley (1986), Royden (1968), and Taylor (1966).

Page 54: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 37

either construction. Thus, when Riemann integrability holds, the ordinaryintegration formulas from calculus still apply to evaluate the Lebesgue inte-gral. The greater generality of the latter comes into play for integrals offunctions that are highly oscillatory or discontinuous.

Example 2 Set g(x) = 1 for x ∈ Q (the rational numbers) and g(x) = 0elsewhere, and partition (a, b) as a = x0 < x1 < · · · < xn = b so thatmaxj∈1,2,...,n xj −xj−1 → 0 as n → ∞. Then g is not Riemann integrable,since gj ≡ supx∈(xj−1,xj] g(x) = 1 and g

j≡ infx∈(xj−1,xj] g(x) = 0 for each

j and n imply that

limn→∞

n∑j=1

gj(xj − xj−1) = b − a = limn→∞

n∑j=1

gj(xj − xj−1) = 0.

However, g is Lebesgue integrable because the splitting of (a, b) into subsetsis based on values of g rather than of x :∫ b

a

g(x) · dx =∫

(a,b)∩Qg(x) · dx +

∫(a,b)∩(\Q)

g(x) · dx

= 1 · µ [(a, b) ∩ Q] + 0 · µ [(a, b) ∩ (\Q)]

= 1 · 0 + 0 · (b − a)

= 0.

This greater generality of the Lebesgue construction applies also whenintegrating over the entire real line. However, in cases where integrals overthe positive and negative axes are not both finite, it is possible to comeup with finite values for the Riemann integral when Lebesgue integrabil-ity fails, depending on how the limits are taken. For example, evaluating∫∞−∞ x(1+x2)−1 ·dx as lima→∞

∫ a

−ax(1+x2)−1 ·dx (which is Cauchy’s inter-

pretation of the improper Riemann integral from complex analysis) gives 0as the unambiguous result, even though the integrals over the positive andnegative axes individually are +∞ and −∞, respectively. That is, in theRiemann construction the positive and negative parts simply cancel out ifthe limits are taken symmetrically, whereas

∫∞−∞ |x|(1 + x2)−1 · dx fails to

exist no matter how the limits are taken. However, if a function g is abso-lutely integrable in the Riemann sense, meaning that

∫∞−∞ |g (x) | · dx < ∞,

then∫∞−∞ g (x) · dx has the same value in the Lebesgue and Riemann

constructions.

Page 55: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

38 Pricing Derivative Securities (2nd Ed.)

Uniform Integrability

Noting that property 2 of the integral implies that∫

g(x) · dx =∫x:|g(x)|>n

g(x) · dx +∫x:|g(x)|≤n

g(x) · dx

for each n, and that

limn→∞

∫x:|g(x)|≤n

|g(x)| · dx =∫|g(x)| · dx,

it is clear that we must have limn→∞∫x:|g(x)|>n |g(x)| ·dx = 0 if g is to be

integrable in the Lebesgue sense. For a collection gα(x)α∈A of integrablefunctions, it is therefore true that limn→∞

∫x:|gα(x)|>n |gα(x)| · dx = 0 for

each member of the collection. This means that each integral can be madesmaller than an arbitrary ε > 0 by taking n > N(α, ε), where N maydepend on α as well as ε. However, it may be that there is no single N(ε)that simultaneously bounds

∫x:|gα(x)|>n |gα(x)| ·dx for all members of the

collection. On the other hand, if

limn→∞

supα∈A

∫x:|gα(x)|>n

|gα(x)| · dx = 0 (2.14)

then this simultaneous control is in fact possible, and the family gα(x)α∈Ais said to be “uniformly integrable”. (The set A here is unrestricted; it maybe uncountable.)

Example 3 Functions gα(x) = x−11[1/α,1](x)1≤α<∞ are integrable, with∫gα(x) · dx = lnα, but are not uniformly so because supα

∫x:|gα(x)|>n

|gα(x)| · dx = +∞ for each n.

Multiple Integrals and Fubini’s Theorem

Taking Ω = 2 (the plane) and B2 the σ-field generated by the productsets (−∞, x1] × (−∞, x2] gives a measurable space (2,B2). On this spacethere exists a product measure, µ2((−∞, x1] × (−∞, x2]) = µ((−∞, x1]) ·µ((−∞, x2]), that, with µ as Lebesgue measure, corresponds to areas of theindicated rectangular sets. Given g : 2 → , construction of the integral∫

g(x1, x2) · dµ2(x1, x2) ≡∫ ∫

g(x1, x2) · dx1 dx2 proceeds in much thesame way as in the one-dimensional case. Fubini’s theorem states conditionsunder which the double integral can be evaluated as an iterated integral and

Page 56: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 39

in either order; that is, conditions under which∫ ∫g(x1, x2) · dx1 dx2 =

∫ [∫g(x1, x2) · dx1

]· dx2

=∫ [∫

g(x1, x2) · dx2

]· dx1. (2.15)

Fortunately, the conditions for this are very general. Equalities (2.15) holdunder any one of the following conditions: (i) g is a nonnegative function (inwhich case all the integrals may be infinite); (ii)

∫ ∫g(x1, x2)·dx1 dx2 < ∞;

(iii)∫ (∫

|g(x1, x2)| · dx1

)· dx2 < ∞.

Changes of Variable in the Integral

Given a definite integral∫

Ag(x) · dx, where A ∈ B (the Borel sets of ),

suppose we convert to a new variable y as y = h(x), where h is a (Borel)measurable function. Let h(A) be the image of A under h; that is, theset onto which h maps A. Suppose further that h is differentiable and that0 < |h′(x)| on A. This ensures that the transformation is one-to-one, so thata unique inverse, h−1(y), exists. It also ensures that the derivative of h−1(y)is finite; that is, |dy/dx| > 0 ⇒ |dx/dy| < ∞. Making the substitutionx = h−1(y) then gives the change-of-variable formula,∫

A

g(x) · dx =∫

h(A)

g[h−1(y)

]·∣∣dh−1(y)

∣∣ , (2.16)

where |dh−1(y)| ≡ |dh−1(y)/dy| · dy.

Example 4 In∫ a

0e−x2 · dx change variables as y ≡ h(x) = x2. Since

h′(x) > 0 on (0, a), there is a unique inverse, x = h−1(y) = +√

y, withdh−1(y) = 1

2y−1/2 · dy. Since h : (0, a) → (0, a2), we have∫ a

0

e−x2 · dx =12

∫ a2

0

y−1/2e−y · dy.

Now suppose that h is not one to one and has no unique inverse onA but that A can be decomposed into disjoint subsets as A = ∪k

j=0Aj

such that (i) h has a unique inverse on each of A1, A2, . . . , Ak and (ii)µ(A0) = 0, where µ is Lebesgue measure. Then since

∫A0

g(x) · dx = 0, we

have∫

A g(x) ·dx =∑k

j=1

∫Aj

g(x) ·dx by properties 9 and 2 of the integral,

Page 57: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

40 Pricing Derivative Securities (2nd Ed.)

so that ∫A

g(x) · dx =k∑

j=1

∫h(Aj)

g[h−1(y)

]· d

∣∣h(y)−1∣∣ .

Example 5 In∫(−∞,∞) e−x2 · dx change variables as y ≡ h(x) = x2.

Decomposing (−∞,∞) as (−∞, 0)∪0∪(0,∞), we note that (i) x = −√y

is one-to-one on (−∞, 0) with dh−1(y) = − 12y−1/2 · dy; (ii) x = +

√y is

one-to-one on (0,∞) with dh−1(y) = 12y−1/2 · dy; and (iii) µ(0) = 0.

Since h : (−∞, 0) → (0,∞) and h : (0,∞) → (0,∞) we then have∫(−∞,∞)

e−x2 · dx =∫ ∞

0

y−1/2

∣∣∣∣−12e−y

∣∣∣∣ · dy +∫ ∞

0

y−1/2

∣∣∣∣12e−y

∣∣∣∣ · dy

=∫ ∞

0

y−1/2e−y · dy.

Now let us extend the change-of-variable formula to the multivariatecase. Given the integral

∫A

g(x) · dx, where x = (x1, x2, . . . , xn)′, A ∈ Bn

(the Borel sets of n), and g : n → , introduce new variables as y = h(x),h : n → n. Let h(A) represent the image of A under h, which is the sety = h(x) : x ∈ A. Suppose that the partial derivatives ∂yj/∂xk exist andthat J , the determinant of the n × n matrix ∂yj/∂xk, is not zero on A.J is called the “Jacobian” of the transformation from x to y. That J isnot zero signifies that the transformation is one to one and guarantees thatthe determinant of the inverse transformation from y to x, denoted J−1,is a finite number. Conveniently, it happens that J−1 = 1/J. Then themultivariate change-of-variable formula in this one-to-one case becomes∫

A

g(x) · dx =∫

h(A)

g[h−1(y)

]|J−1| · dy. (2.17)

The standard integration formulas for shifts to polar coordinates are justspecial cases of (2.17).

Example 6 In∫∞0

∫∞0

(x1x2)−1/2e−x1−x2 · dx2 dx1 change variables asx1 = r2 cos2 θ and x2 = r2 sin2 θ, thus defining directly the inverse trans-formation from new variables r, θ to x1, x2. Then A = (0,∞) × (0,∞),h(A) = (0,∞) × (0, π/2), and

J−1 =

∣∣∣∣∣ 2r cos2 θ 2r sin2 θ

−2r2 cos θ sin θ 2r2 cos θ sin θ

∣∣∣∣∣ = 4r3 cos θ sin θ

Page 58: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 41

is finite on h(A). Therefore∫ ∞

0

∫ ∞

0

(x1x2)−1/2e−x1−x2 · dx2 dx1 = 4∫ π/2

0

[∫ ∞

0

re−r2 · dr

]· dθ

= 2∫ π/2

0

= π.

As in the univariate case, situations in which h is not one-to-one on A

can sometimes be handled by decomposition.

2.1.7 Change of Measure: Radon-Nikodym Theorem

Suppose there are two different measures, µ and ν, on the same measurablespace (Ω,M). For example, ν = 100µ would be one obvious possibility,corresponding to a simple change of units. Suppose, as in this simple case,it happens that ν(A) = 0 for any set A for which µ(A) = 0. That is, sets thatare “null” under µ are also null under ν. Then we say that ν is “absolutelycontinuous” with respect to µ and express this symbolically as ν << µ. Theterm continuity is apt here since the condition ν << µ implies (assumingthat ν is finite or σ-finite) that ν(A) can be made arbitrarily small bychoosing A so that µ(A) is sufficiently small. Two measures are said tobe “equivalent” if each is absolutely continuous with respect to the other.Equivalent measures thus have the same null sets.

With this terminology in mind and a measure space (Ω,M, µ) at hand,let us construct a new measure ν on (Ω,M) in a particular way. Introducinga nonnegative integrable function Q : Ω → +, define a set function ν as

ν(A) =∫

A

Q(ω) · dµ(ω), A ∈ M. (2.18)

Property 6 of the integral (page 35) implies that ν ≥ 0; property 2 impliesthat ν is countably additive; and the integrability of Q implies that ν isfinite. Therefore, ν thus constructed is a finite measure on (Ω,M). More-over, we have ν µ by property 9 of the integral. Thus, given a measureand an integrable function, it is easy to create a new measure that is abso-lutely continuous with respect to the first. What is not so easily seen to bepossible, but what is often desirable to do, is to go the other way; that is, toknow whether (and how) to convert one measure to another. The followingtheorem tells us that under appropriate conditions on the measures such aconversion is in fact possible.

Page 59: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

42 Pricing Derivative Securities (2nd Ed.)

Theorem 2 (Radon-Nikodym) If µ and ν are σ-finite measures on a mea-surable space (Ω,M), and if ν µ, then there exists an essentially uniquefunction Q : Ω → + such that (2.18) holds for each A ∈ M.

Here, essentially unique means that if there is another function Q′ thatsatisfies (2.18), then Q = Q′ a.e. µ. Clearly, absolute continuity is neces-sary here, for (2.18) could not possibly hold if ν(A) > 0 for a set for whichµ (A) = 0. By analogy with the integral representation for absolutely contin-uous functions of real variables, F (b)−F (a) =

∫ b

aF ′(t) ·dt =

∫ b

a(dF/dt) ·dt,

the function Q is sometimes written as dν/dµ in applications of the Radon-Nikodym theorem. Q is called the “Radon-Nikodym derivative”. The impli-cation of the theorem is that if one wants to change from one measure µ

to a given measure ν µ, then there is a function that can be integratedwith respect to µ that does this.10 In the case of equivalent measures, thefunction Q that takes us from µ to ν must clearly be strictly positive onsets of positive measure. In this case dµ/dν = Q−1. We will find that suchchanges from one probability measure to another are a critically importanttool in the pricing of derivative assets. Application of Radon-Nikodym toprobability measures specifically is treated in section 2.2.4 below.

2.1.8 Special Functions and Integral Transforms

Many useful functions in analysis are simply defined as definite or indefiniteintegrals of other functions or as solutions to differential equations. For ourpurposes the most important of these types are the gamma function, theerror function, and the modified Bessel functions. We will also encounterLaplace transforms and will have frequent contact with Fourier transforms.

Gamma and Incomplete Gamma Functions

The gamma function is defined as

Γ(t) ≡∫ ∞

0

xt−1e−x · dx, t > 0.

Integration by parts establishes the recursive property

Γ(t + 1) = tΓ(t), t > 0. (2.19)

10The Radon-Nikodym derivative that works in the introductory example, ν = 100µ,is obviously dv/dµ = Q(ω) = 100. Less trivial examples will be given in section 2.2.4 inconnection with probability measures.

Page 60: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 43

This then gives the explicit values Γ(n) = (n − 1)! for n ∈ ℵ and, since∫∞0 e−x · dx = 1, explains the convention that 0! = 1. Taking t = 1/2 and

making use of example 6 on page 40 yield another explicit form:

Γ(

12

)=

√(∫ ∞

0

x−1/2e−x · dx

)2

=

√∫ ∞

0

∫ ∞

0

x−1/2y−1/2e−x−y · dy dx

=√

π.

Using this and (2.19) gives explicitly for all odd multiples of 1/2

Γ(

2n− 12

)=

(2n − 3

2

)·(

2n − 52

)· . . . · 1

2·√

π, n = 1, 2, . . . .

The “incomplete” gamma function is defined as

Γ(t; y) =∫ y

0

xt−1e−x · dx, t > 0, y ≥ 0. (2.20)

Obviously,

Γ(t;∞) ≡ limy→∞

Γ(t; y) = Γ(t). (2.21)

Error Function

The error function of a positive real variable x, usually denoted “erf(x)”,is defined as

erf(x) =2√π

∫ x

0

e−t2 · dt, x ≥ 0.

By example 4 on page 39 the change of variable to y = t2 relates this tothe incomplete gamma function, as

erf(x) =Γ(1/2; x2)

Γ(1/2),

and this with (2.21) shows that erf(+∞) = 1.

Page 61: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

44 Pricing Derivative Securities (2nd Ed.)

Bessel Functions

Bessel functions are solutions of the differential equation

z2 d2f

dz2+ z

df

dz+ (z2 − ν2)f = 0, (2.22)

which is known as “Bessel’s equation of order ν”. Both z and ν may becomplex, although our applications will involve only real values of ν. Thetwo linearly independent solutions to this linear homogeneous second-orderequation are the Bessel functions of the first kind of order ν and secondkind of order ν, denoted Jν(z) and Yν(z), respectively.11 If z is replaced byiz, where i =

√−1, and if dkf/d(iz)k is interpreted as i−k · dkf/dzk, then

(2.22) becomes

z2 d2f

dz2+ z

df

dz− (z2 + ν2)f = 0. (2.23)

Applying (2.4) to write iz as eiπ/2z and normalizing the solution Jv(iz),one gets the “modified” Bessel function of the first kind of order ν, as

Iν(z) = e−iνπ/2Jν(eiπ/2z).

Among several representations is the ascending series

Iν(z) = (z/2)ν∞∑

k=0

(z/2)2k

k!Γ(ν + k + 1). (2.24)

The second solution to (2.23), a modified Bessel function of the secondkind, is usually written as

Kν(z) =π

2I−ν(z) − Iν(z)

sin(νπ)(2.25)

when ν is not an integer. When ν = n, an integer, this is defined by conti-nuity and has the representation12

Kn(z) = (−1)n+1 [γ + ln(z/2)] In(z) +12

n−1∑k=0

(−1)k (n − k − 1)!k!

(z/2)2k−n

+(−1)n

2

∞∑k=0

(z/2)n+k

k!(n + k)!

k∑j=1

j−1 +n+k∑j=1

j−1

, (2.26)

11For various representations of these as infinite series or as integrals, see Abramowitzand Stegun (1970) and Gradshteyn and Ryzhik (1980). General references for Besselfunctions are Kotz et al. (1982), McLachlan (1955), Relton (1946), and (for the definitivetreatment) Watson (1944).

12See McLachlan (1955, p. 115).

Page 62: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 45

where γ is Euler’s constant: γ = limn→∞(∑n

j=1 j−1 − lnn)

.= 0.577216.Useful differentiation/recurrence formulas for Kν(·) are

K ′ν(z) = νz−1Kν(z) − Kν+1(z) (2.27)

K ′ν(z) = −νz−1Kν(z) − Kν−1(z). (2.28)

Laplace Transforms

The two-sided Laplace transform of a function g(x) is defined as

L2g =∫ ∞

−∞eζxg(x) · dx ≡ λ2(ζ), ζ ∈ ,

when the integral exists, as it necessarily does at ζ = 0 if g itself is(Lebesgue) integrable. If g is defined just on the positive reals and ζ isrestricted to be negative, then the transform often exists even when g itselfis not integrable. Restricting to the one-sided case, the definition of theLaplace transform is commonly presented as

Lg =∫ ∞

0

e−ζxg(x) · dx ≡ λ(ζ), ζ > 0. (2.29)

Three examples of one-sided transforms are given in table 2.1.13

More general is the one-sided Laplace-Stieltjes transform,

LG =∫ ∞

0

e−ζx · dG(x),

which specializes to (2.29) when G is absolutely continuous with density g.

Table 2.1. Laplace transforms of certainfunctions on +.

g : + → Lg : →

xα, α > −1 Γ(α + 1)/ζα+1

xα−1e−x, α > 0 Γ(α)(1 + ζ)−α, ζ > −1

e−αx2, α > 0 eζ2/α

qπ4α

13Extensive tables are presented in Abramowitz and Stegun (1970) and Gradshteynand Ryzhik (1980).

Page 63: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

46 Pricing Derivative Securities (2nd Ed.)

Fourier Transforms

Related to the two-sided Laplace transform, but applicable to a muchbroader class of functions, is the Fourier transform. This is defined as

Fg =∫ ∞

−∞eiζxg(x) · dx ≡ f(ζ), (2.30)

where ζ ∈ and i =√−1. The identity (2.4) shows that |eiζx| = 1, so that

by property 8 of the integral (page 35), Fg exists for all ζ so long as g itselfis integrable. In the case

∫∞−∞ |f(ζ)| · dζ < ∞ the inversion formula

g(x) = F−1f(ζ) ≡ 12π

∫ ∞

−∞e−iζxf(ζ) · dζ (2.31)

allows g to be recovered from f . Some other important properties are (whereprimes denote derivatives)

1. F(bg + ch) = bFg + cFh, where b, c are constants2. Fg′ = −iζ · f(ζ) if lim|x|→∞ g(x) = 03. Fg′′ = −ζ2 · f(ζ) if lim|x|→∞ g′(x) = 0.

We also have the important convolution property:

F(g ∗ h) = Fg · Fh, (2.32)

where (g ∗ h)(x) ≡∫

g(x − y)h(y) · dy represents the convolution of thetwo real functions g and h. The fact that this rather complicated processof convolution is reduced to the multiplication of transforms accounts formuch of the usefulness of the Fourier transform in analysis and probability.We will encounter many applications. Three examples of Fourier transformpairs are in table 2.2, where C represents the complex numbers.14

Table 2.2. Fourier transforms of certain func-tions on +.

g : + → Fg : → C

1[α,β](x) (eiζβ − eiζα)/(iζ)

xα−1e−x1[0,∞)(x), α > 0 Γ(α)(1 − iζ)−ν

e−αx2, α > 0 e−ζ2/α

qπ4α

14Abramowitz and Stegun (1970) and Gradshteyn and Ryzhik (1980) give extensivetabulations.

Page 64: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 47

More general than (2.30) is the Fourier-Stieltjes transform,

FG =∫

eiζx · dG(x).

We encounter applications of Fourier-Stieltjes and Laplace-Stieltjes trans-forms in the next section.

2.2 Probability

This section presents the basic tools of probability from a measure-theoreticperspective and serves to introduce the continuous-time stochastic processesthat will be the main subject of chapter 3.15

2.2.1 Probability Spaces

We begin with a collection Ω of outcomes of some chance experiment anda σ-field, F , containing those subsets of Ω whose probabilities we will beable to determine. Although probabilities are just measures with particularproperties, some new terminology is applied. Sets in F are now “events”,and we say that event A “occurs” when the chance experiment has anoutcome ω that is an element of set A. Disjoint sets are now “exclusiveevents”. The σ-field F contains all countable unions and intersections ofsome basic class of events of interest, together with their complements.This implies, among other things, that Ω ∈ F . We take for granted that allevents in the discussion that follows do belong to F .

A probability measure P is a mapping from F into [0, 1]. P is thus a finitemeasure. The triple (Ω,F , P) represents a probability space. The usual (fre-quentist) interpretation is that P(A) represents the proportion of replica-tions of the experiment in which an event A ∈ F would occur if the chanceexperiment were repeated indefinitely while holding constant the conditionsunder one’s control. If P(A) = 1, then we say that A occurs “almost surely”(a.s.). The meaning is the same as “A occurs a.e. P”. If P(A) = 0, then A

is said to be a “null” set or, more explicitly, a “P-null” set.

15Good general references, roughly in order of prerequisite knowledge, are Bartoszyn-ski and Niewiadomska-Bugaj (1996), Williams (1991), Billingsley (1986), Chung (1974),and Laha and Rohatgi (1979).

Page 65: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

48 Pricing Derivative Securities (2nd Ed.)

The following fundamental properties of P are just assumed:

1. P(Ω) = 12. P(A) ≥ 03. P(∪∞

j=1Aj) =∑∞

j=1 P(Aj) whenever Aj ∩ Aj′ = ∅.

Of course, the first property is the only one that distinguishes P from othermeasures on (Ω,F) that are not also probability measures. From these threeproperties we then deduce the familiar complement rule, addition rule, etc.A familiarity with these basic ideas is assumed. Recall, also, that eventscomprising a finite collection Ajn

j=1 are independent if P(Aj1∩. . .∩Ajm ) =∏mi=1 P(Aji) for m ∈ 2, 3, . . . , n and each such subcollection Aji

mi=1 of

m distinct sets. Events in the infinite collection An∞n=1 are independentif those in each finite collection are independent. The practical implicationof independence is that the assessment of the probability of an event is notaltered by knowledge that other, independent events have occurred. Otheressential properties of P are

1. The monotone property. If An∞n=1 is a monotone sequence of eventsthat converges to a limit A, with either A = ∪∞

n=1An (in the case thatAn ↑) or A = ∩∞

n=1An (in the case that An ↓), then limn→∞ P(An) =P(A). This follows from the monotone property of measures generally,(2.12).

2. The law of total probability. If Ω = ∪∞j=1Aj with Aj ∩ Aj′ = ∅, then

P(B) =∑∞

j=1 P(B ∩ Aj).3. Borel-Cantelli lemma (convergence part).

∑∞n=1 P(An) < ∞ implies

P(infinitely many of the events An occur) = 0.4. Borel-Cantelli lemma (divergence part). If An∞n=1 are indepen-

dent,∑∞

n=1 P(An) = ∞ implies P( infinitely many of the eventsAn occur) = 1.

The event that infinitely many of the An occur is expressed symbolicallyas ∩∞

n=1 ∪∞m=n Am or as limn→∞ sup An. The complement of this event is

(limn→∞ sup An)c = limn→∞ inf Acn ≡ ∪∞

n=1 ∩∞m=n Ac

n. This represents theevent that no events in the sequence occur from some point forward.

Example 7 A sequence of independent, random draws is made from an urncontaining, at the nth stage of the sequence, one red ball and n − 1 whiteballs. At stage n the probability of event red is thus 1/n. A draw is madeand then another white ball is added before proceeding to stage n + 1. Whatis the probability that red occurs infinitely many times? Since

∑∞n=1 P(red

Page 66: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 49

at stage n) =∑∞

n=1 n−1 = ∞, and since events at different stages areindependent, the divergence part of Borel-Cantelli implies that a red ball iscertain to be drawn infinitely many times in an infinite sequence of draws.Put differently, this tells us that there is never a stage beyond which it iscertain that only white balls will be drawn.

Example 8 In an otherwise identical urn experiment, suppose the numberof white balls at stage n is 2n−1−1, so that the probability of red at stage n

is 2−n+1. A draw is made and then 2n−1 more white balls are added beforeproceeding to stage n + 1. Since

∑∞n=1 2−n+1 = 2 < ∞, the convergence

part of Borel-Cantelli tells us that a red ball is certain not to be drawninfinitely many times. Put differently, there is some (undetermined) stagebeyond which we can be sure that only white will be drawn.

2.2.2 Random Variables and Their Distributions

A (scalar-valued) random variable on a probability space (Ω,F , P) is anF -measurable mapping from Ω to . (Vector-valued random variables aretreated below.) That is, as a mapping from Ω to , a random variablejust assigns a number to an outcome; for example, the number 4 to theoutcome :: in the experiment of rolling a 6-sided die, or the number52 to the outcome A♠ in drawing a card from an ordinary deck. Suchfunctions are written as X(ω), Y (ω), Z(ω), . . . etc., or simply X, Y, Z, . . .

for short. Focusing on some particular random variable X , recall thatF -measurability requires that the inverse image under X of any Borel setB be contained in F ; that is, that X−1(B) ≡ ω : X(ω) ∈ B ∈ Ffor each B ∈ B. Recall, too, that this is guaranteed if sets of the specificform X−1((−∞, x]) ≡ ω : X(ω) ≤ x belong to F for each x ∈ . Theshorthand notation X ∈ F is often used to state that X is F -measurable.Examples of measurable and nonmeasurable functions are given below.

Since F is the collection of events to which probabilities can be assigned,the measurability of X implies that P

(X−1(B)

)exists for each Borel set

B and defines a new mapping, PX say, from B into [0, 1]. Since the single-valuedness of functions implies that X−1(Bj)∩X−1(Bj′ ) = ∅ whenever Bj∩Bj′ = ∅, and since P is countably additive, it follows that PX

(∪∞

j=1Bj

)=∑∞

j=1 PX(Bj) for a sequence of such exclusive events. Thus, PX is itselfa probability measure and is referred to as the measure “induced” by therandom variable X . Thus, introducing a random variable on a probabilityspace (Ω,F , P) gives rise to an entirely new, induced probability space,

Page 67: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

50 Pricing Derivative Securities (2nd Ed.)

(,B, PX). Stated formally in words, PX(B) represents the probability ofthe event that an outcome ω occurs to which X assigns a value in B. Statedmore simply, it is the probability that X takes a value in B. Often, we willexpress this in an abbreviated form as P(X ∈ B).

The fineness of the structure of the σ-field F governs the amount ofdetail we can have about the results of a chance experiment. How fine thestructure needs to be in modeling a given experiment is determined by ourinterest.

For example, suppose we are to roll an ordinary six-sided die. If whatmatters is only whether an even number or an odd turns up, then it isenough to know the probabilities of just ·, · · · , : · : and :, ::, :::—plusthe events that can be built up from these by intersection and union. Wecan then take as the overall space the set Ω2 that contains just these twoelementary outcomes and can work with the σ-field F2 (which in this caseis just a field16) that is generated from these two events. The collection F2

comprises the two events themselves together with their intersection andunion, ∅ and Ω2. A random variable Y (ω) with, say, Y (:, ::, :::) = 0 andY (·, · · · , : · :) = 1, is then measurable with respect to F2, since we cantell whether Y takes on a value in any arbitrary Borel subset of if weknow whether each event in F2 occurs.

On the other hand, the very limited information structure F2 will notsuffice if we need to know more about what happens. For example, supposewe want to determine the probability associated with each possible numberof dots that can turn up on the roll of the die. To define the randomvariable X(ω) = number of dots on side ω, we will have to work withan overall space that is at least as detailed as the set Ω6 that comprisesthe elementary outcomes ·, :, · · · , ::, : · :, :::. We will then need a field F6

that contains all 26 subsets of Ω6, including ∅ and Ω6 itself. Now X ∈ F6

(that is, X is measurable with respect to F6), whereas X /∈ F2 since onecannot tell whether events like X = 1 and 3 ≤ X ≤ 5 occur just fromthe events in F2. On the other hand, the random variable Y defined inthe previous paragraph is F6-measurable as well as F2-measurable, sinceF6 contains all the sets of F2. This makes F2 a subfield of F6. Being the

16“Fields” of subsets of some space Ω are collections closed under finitely manyset operations (unions, intersections, complementations), whereas σ-fields remain closed

under countably many such operations. When Ω comprises a finite number of elementaryoutcomes, the field of sets built up from these is also a σ-field, since any set can berepresented both as the union and as the intersection of countably many copies of itself.

Page 68: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 51

coarsest information structure that makes X measurable, F6 is said to bethe σ-field “generated” by X ; likewise, F2 is the σ-field generated by Y .

More detail about the die experiment might be required than even F6

provides. For instance, one might also want to assess probabilities of eventsdescribing the die’s orientation after it comes to rest or its position in somecoordinate system. Depending on how precisely these are to be measured, Ωmight then have to comprise an infinite number of elementary outcomes.17

Representations of Distributions

A “support” of a random variable X is a Borel set X such that P (X ∈ X ) ≡PX(X ) = P

(X−1 (X )

)= P (ω : X (ω) ∈ X) = 1. Clearly, itself is

always a support since PX() = P(X−1()

)= P (Ω) = 1, but X may

also be a proper subset of . Thus, X = 1, 2, . . . , 6 supports X in thedie example. The induced probabilities associated with events of the formX−1((−∞, x]) ≡ ω : X(ω) ≤ x show how the unit mass of probability isdistributed over the real line. Because it is often more convenient to workwith point functions than with measures, we can make use of an equivalentrepresentation in terms of the cumulative distribution function (c.d.f.) ofthe random variable X. The c.d.f. of X at some x ∈ is defined as

F (x) ≡ PX((−∞, x]) ≡ P (X ≤ x) .

This mapping from into [0, 1] has the following properties:

1. F (−∞) = 0, since (by the monotone property)

limn→∞

PX((−∞,−n]) = PX(∩∞n=1(−∞,−n]) = PX(∅).

2. F (+∞) = 1, since (again by the monotone property)

limn→∞

PX((−∞, n]) = PX(∪∞n=1(−∞, n]) = PX().

3. F (y) − F (x) ≥ 0 when x < y, since

PX((−∞, y]) − PX((−∞, x]) = PX((−∞, x] ∪ (x, y]) − PX((−∞, x])

= PX((x, y]).

17In that case the distinction between finite and countable additivity—between fieldsand σ-fields of subsets—would indeed arise.

Page 69: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

52 Pricing Derivative Securities (2nd Ed.)

4. F (y) − F (y−) ≥ 0, since

F (y) − limn→∞

F (y − 1/n) = limn→∞

PX((y − 1/n, y])

= PX(∩∞n=1(y − 1/n, y])

= PX(y).

5. F (y+) − F (y) = 0, since

limn→∞

F (y + 1/n) − F (y) = PX(∩∞n=1(y, y + 1/n]) = PX(∅).

The nature of F is the basis for a taxonomy of random variables. Whenthere is a countable collection of discrete points xj∞j=1, called “atoms”,such that

∑∞j=1 F (xj − xj−) ≡

∑∞j=1 PX(xj) = 1, then X is called a

“discrete” random variable. In other words X is a discrete random variablewhen an atomic set is a support. In this case F is a step function thatincreases only by discontinuous jumps at these discrete points. Since F issimply flat between the atoms, the probability distribution of X can berepresented more compactly by just identifying the atoms themselves andspecifying how much probability mass is concentrated at each one. Definingf(x) = F (x) − F (x−) as the “probability mass function” (p.m.f.) of thediscrete random variable X , we see that f(xj) > 0 for each xj in the atomicset and

∑∞j=1 f(xj) = 1. Models for the various discrete distributions we

encounter are described in section 2.2.7 (table 2.3). Most common are thelattice distributions, in which the atoms are equally spaced, typically com-prising some subset of the nonnegative integers, ℵ0 = 0, 1, 2, . . . ..

If the distribution of X contains no such atoms, then F is continu-ous, and X is a “continuous” random variable. If, in addition, F is abso-lutely continuous, then there exists a density f such that PX((a, b]) =F (b) − F (a) =

∫ b

a f(x) · dx for arbitrary a ≤ b. In that case X is an “abso-lutely continuous” random variable, and f is called the “probability densityfunction” (p.d.f.) of X .18 We will not encounter the other type of contin-uous random variable, “continuous singular”. Since absolute continuity ofF implies differentiability a.e. with respect to Lebesgue measure, f(x) canbe identified with F ′(x) except on the PX -null set of exceptional points

18C.d.f. F is absolutely continuous if and only if induced measure PX is absolutelycontinuous with respect to Lebesgue measure µ. Thus, sets that are µ-null are also PX -null, and p.d.f. f corresponds to Radon-Nikodym derivative dPX/dµ. If X is a discrete

random variable on a finite space, then PX is absolutely continuous with respect tocounting measure #, which maps subsets of countable spaces into ℵ0 ∪ ∞. P.m.f. fthen corresponds to Radon-Nikodym derivative dPX/d#.

Page 70: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 53

where F ′ fails to exist. At these exceptional points f can be defined to haveany value that makes the representation convenient.19 Clearly, properties 1and 2 of the c.d.f. imply that

∫∞−∞ f(x) · dx = 1. Models for the vari-

ous continuous distributions we encounter are summarized in section 2.2.7(table 2.3).

“Mixed” random variables that combine the two pure types are alsopossible, and indeed are encountered often in the study of financial deriva-tives. In these cases X has a countable set D of atoms such that 0 <∑

x∈D [F (x) − F (x−)] < 1, but there is also a density f such that

PX (B) =∫

B

f (x) · dx +∑

x∈D∩B

[F (x) − F (x−)] (2.33)

for each B ∈ B.

Example 9 A fair pointer on a circular scale of unit circumference is spunand eventually comes to rest at some physical point ω. Taking an arbitraryω0 as the origin, let X(ω) be the clockwise distance from ω0 to ω, as depictedin figure 2.1, and define Y (ω) = max[X(ω), 1/2]. Then if we model the c.d.f.of X as

FX(x) =

0, x < 0x, 0 ≤ x < 11, x ≥ 1

= x1[0,1) (x) + 1[1,∞) (x)

the implied model for the c.d.f. of Y is

FY (y) =

0, y < 1/2y, 1/2 ≤ y < 11, y ≥ 1.

= y1[.5,1) (y) + 1[1,∞) (y) .

The model for FX implies that X is absolutely continuous, with p.d.f.f(x) = 1[0,1] (x) (where we arbitrarily define f (x) = 1 at x = 0 and x = 1,where F ′ does not exist). However, the implied model for FY is such thatY is neither purely discrete nor purely continuous.

19Since F is nondecreasing by property 3 above, it follows that f ≥ 0 at pointswhere F ′ does exist, and for consistency it is conventional to choose nonnegative valuesalso at the exceptional points.

Page 71: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

54 Pricing Derivative Securities (2nd Ed.)

Fig. 2.1. Experiment and c.d.f for mixed random variable Y.

The Multidimensional Case

The concepts of random variables and the representations of their dis-tributions extend to higher dimension. Thus, on the space (Ω,F , P) theF -measurable mapping X = (X1, X2, . . . , Xk)′ : Ω → k defines a vector-valued random variable. For examples, an experiment could involve drawingone individual at random from a population and recording k quantitativecharacteristics (age, years of schooling, annual income, etc.), or it couldinvolve sampling k individuals at random from the same population andrecording one quantifiable characteristic for each. Induced probability mea-sure PX and joint c.d.f. F (x) and joint p.m.f. or p.d.f. f(x) are developedin ways that parallel the one-dimensional case. Thus, if B ∈ Bk (the Borelsets of k) then PX(B) = P(X−1(B)), and F (x) = F (x1, . . . , xk) is the PX

measure of the product set (−∞, x1] × . . . × (−∞, xk]. From the inducedmeasure PX one obtains the induced “marginal” probability measure cor-responding to a subset X1, . . . , Xj as

PX1,...,Xj (B) = PX1,...,Xj ,Xj+1,...,Xk(B ×k−j),

where B ∈ Bj and j < k. Likewise, the marginal c.d.f. is

FX1,...,Xj (x1, . . . , xj) = FX1,...,Xj ,Xj+1,...,Xk(x1, . . . , xj , +∞, . . . ,+∞).

In the absolutely continuous case the joint p.d.f. is developed as the deriva-tive of the c.d.f. where this exists, as it does a.e. PX :

f(x1, . . . , xk) = ∂kF (x1, . . . , xk)/∂x1 · . . . · ∂xk.

Page 72: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 55

Random variables X1, X2, . . . , Xk are independent if and only if eventsX−1

1 (B1), X−12 (B2), . . . , X−1

k (Bk) are independent for each finite collectionof sets Bjk

j=1 with Bj ∈ B. This is the case if and only if the joint inducedprobability measure of a product set factors as the product of marginalmeasures, as

PX(B1 × B2 × . . . × Bk) = PX1(B1)PX2(B2) · . . . · PXk(Bk).

Equivalently, independence holds if and only if the joint c.d.f. factors intomarginal c.d.f.s; that is,

FX(x) = F1(x1)F2(x2) · . . . · Fk(xk),

for each x ∈ k. Likewise, random variables that are all purely discrete orpurely absolutely continuous are independent if and only if the joint p.m.f.or p.d.f., respectively, factors as the product of the marginals:

fX(x) = f1(x1)f2(x2) · . . . · fk(xk).

The implication of independence is that assessments of probabilities ofevents defined in terms of any subset of X1, . . . , Xk are not altered byinformation about the realizations of the remaining variables.

2.2.3 Mathematical Expectation

Integrals of functions on abstract measure spaces were defined in section2.1.6, and the concept can be carried over directly to random variables.As usual, however, some new jargon is attached. The “expected value” or“mathematical expectation” of the random variable X , denoted EX , isdefined as

EX ≡∫

X(ω) · dP(ω). (2.34)

Recalling the properties of the abstract integral, we say that EX existswhen X is absolutely integrable; that is, when

∫|X(ω)| · dP(ω) < ∞. For

some g : → we can likewise define the expected value of the composi-tion as

Eg(X) ≡∫

g [X(ω)] · dP(ω) (2.35)

whenever E|g(X)| is finite. Of course, (2.35) encompasses (2.34) on takingg(X) = X . Eg(X) can be expressed in other ways. Translating to theinduced probability space (,B, PX), we have

Eg(X) =∫

g(x) · dPX(x),

Page 73: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

56 Pricing Derivative Securities (2nd Ed.)

or, in terms of the c.d.f.,

Eg(X) =∫

g(x) · dF (x), (2.36)

where the latter is interpreted as a Lebesgue-Stieltjes integral.In the case that X is a purely discrete random variable with atomic set

D expression (2.36) can be written in terms of the p.m.f. as

Eg(X) =∑x∈D

g(x)f(x);

while if X is absolutely continuous it can be written in terms of the p.d.f.:

Eg(X) =∫ ∞

−∞g(x)f(x) · dx.

In the mixed case the expectation can be expressed as

Eg(X) =∑x∈D

g(x) [F (x) − F (x−)] +∫ ∞

−∞g(x)f(x) · dx,

where D is the countable set of atoms and f is the density that appears in(2.33).

Example 10 With FY as in example 9 and g(Y ) = eY we have D = 0.5and F (.5) − F (.5−) = 0.5. Taking f(y) = 1(.5,1) (y) gives

EeY = .5e.5 +∫ 1

.5

ey · dy = e −√

e.

The Stieltjes representation, as in (2.36), is especially convenientbecause it encompasses all of these cases.

The nine properties of the integral listed on page 35 imply the followingproperties for expectation, where a, b, c are constants, Bj∞j=1 are disjointBorel sets, and PX (B0) = 0:

1. E1B1 (X) g (X) =∫

B1g (x) · dPX (x)

2. E[1∪∞

j=1Bj (X) g (X)]

=∑∞

j=1 E1Bj (X) g (X)

3. Ea ≡ E[a1(X)] = aPX () = a

4. E [bg (X)] = bEg (X)

5. E[a + bg(X) + ch(X)] = a + bEg(X) + cEh(X)

6. g (X) ≤ h (X) a.s. ⇒ Eg (X) ≤ Eh (X)

7. g (X) = h (X) a.s. ⇒ Eg (X) = Eh (X)

Page 74: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 57

8. |Eg(X)| ≤ E|g(X)|9. PX (B0) = 0 ⇒ E1B0 (X) g (X) = 0.

Property 1 implies that PX(B1) ≡ P(X ∈ B1) can be representedas an expectation through the use of indicator functions: PX(B1) =E1B1(X) =

∫B1

dPX(x) =∫

B1dF (x).

Expectation in the multivariate case is treated by regarding the vari-ables as vectors in the expressions above and treating sums and integralsas multiple. The new result in the case of independent random variables isthat the expectation of the product of functions of the individual variablesequals the product of the expectations, given that all the expectations exist.That is,

Eg1(X1) · . . . · gk(Xk)

=∫k

g1(x1) · . . . · gk(xk) · dFX(x)

=∫ ∫

. . .

∫g1(x1) · . . . · gk(xk) · dFX1(x1) . . . dFXk

(xk)

= Eg1(X1) · . . . · Egk(Xk), (2.37)

where the last line follows from Fubini’s theorem.

Uniformly Integrable Random Variables

The concept of uniform integrability of functions, defined in (2.14), carriesover to families of random variables. Random variables Xαα∈A definedon (Ω,F , P) are said to be uniformly integrable if

limn→∞

supα∈A

E|Xα|1|Xα|>n = 0. (2.38)

Since supα∈A E|Xα| = supα∈A E|Xα|1|Xα|>n + supα∈A E|Xα|1|Xα|≤nit follows that

limn→∞

supα∈A

[E|Xα| − E|Xα|1|Xα|≤n

]= 0

when (2.38) holds and therefore that supα∈A E|Xα| < ∞. Thus, uniformintegrability implies that the expectations E|Xα| are bounded; that is, thatthere exists b < ∞ such that E|Xα| ≤ b for each α ∈ A.20

20On the other hand, supα∈A E|Xα| < ∞ does not imply that the Xα are uni-formly integrable. For a counterexample see Williams (1991, p. 127). However, thecondition supα∈A E|Xα|p < ∞ for some p > 1 does suffice. Obviously, it is also truethat random variables that are a.s. uniformly bounded are uniformly integrable.

Page 75: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

58 Pricing Derivative Securities (2nd Ed.)

Moments

The kth “moment” about the origin of a scalar-valued random variable X isdefined as21 EXk =

∫xk ·dF (x) and denoted µ′

k. The kth moment exists ifthe corresponding absolute moment, νk ≡

∫|x|k ·dF (x), is finite. Of course,

µ′k ≤ νk, with equality guaranteed both when k is even and when + is

a support for X . While we sometimes encounter moments of negative orfractional order, the positive-integer moments (k ∈ ℵ) are usually the onesof interest. It is easy to see that the existence of µ′

k implies the existence ofµ′

j for j = 1, 2, . . . , k. Of course µ′0 = 1 invariably. The first moment about

the origin, EX , is represented simply as µ and is called the “mean” of X .This corresponds to the center of mass of the probability distribution of X .

The following representation of the mean is often useful and is easilyproved using integration by parts:

EX = −∫ 0

−∞F (x) · dx +

∫ ∞

0

[1 − F (x)] · dx. (2.39)

When X is supported on +, this becomes simply

EX =∫ ∞

0

[1 − F (x)] · dx =∫ ∞

0

P(X > x) · dx. (2.40)

In dealing with discrete random variables having integer support (thatis, variables with lattice distributions) we will sometimes encounter the“descending factorial moments”, defined as

µ(k) = EX(k) ≡ E [X(X − 1) · . . . · (X − k + 1)] .

“Central” moments, or moments about the mean, are defined as E(X−µ)k and denoted µk without the prime. The binomial expansion

(X − µ)k =k∑

j=0

(k

j

)Xk−j(−µ)j

shows that the kth central moment (and, for that matter, central momentsof all integer orders less than k) exists if µ′

k exists. The second centralmoment, µ2, is the “variance” of X and is usually denoted specially as

21Throughout, expressions like EXk or Eg(X) always have the same meaning as

E(Xk) or E[g(X)], parentheses or brackets being used only when there is ambiguityabout what the operation E applies to. Parentheses are invariably used when expecta-tions appear as arguments of functions, as (EX)k , g(EX), (EXj)k , etc.

Page 76: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 59

σ2 or as V X . The standard deviation, σ, is defined as +√

σ2 = +√

V X.The transformation X ′ = (X − µ) /σ puts random variable X in “stan-dard form”, with EX ′ = 0 and V X ′ = 1. The third and fourth momentsof standardized random variables are often used as summary indicators ofasymmetry (skewness) and tail thickness (kurtosis). Specifically, the coeffi-cients of skewness and kurtosis are

α3 ≡ µ3/σ3

α4 ≡ µ4/σ4.

Distributions with long right (left) tails are said to be right (left) skewed.Typically in such cases the sign of α3 indicates the direction of skewness—positive for right- and negative for left-skewed. Obviously, symmetric distri-butions possessing third moments have α3 = 0. The coefficient of kurtosisis a common (though often unreliable) indicator of tail thickness relativeto the normal distribution, for which α4 = 3.0. The quantity α4 − 3 issometimes called the “excess” kurtosis. One typically thinks of symmetricdistributions with α4 > 3.0 as having thicker tails than a normal distri-bution with the same µ and σ2, and higher density near the mean. Suchdistributions are said to be “leptokurtic”. Empirical marginal distributionsof price changes of financial assets are often of this sort.

The only additional concept needed in the multivariate case, X : Ω −→k, is that of “product moments”. These are expectations of the formE(Xj1 − aj1)m1 · . . . · (Xj

− aj)m , where the a’s are constants—usually

equal either to zero or to the respective means. By far the most importantcase is the bivariate concept of “covariance”. The covariance between ran-dom variables X1 and X2 is defined as E(X1−µ1)(X2−µ2) and denoted σ12

or Cov(X1, X2). The coefficient of correlation between X1 and X2, denotedρ12 or Corr(X1, X2), is the ratio of the covariance to the product of thestandard deviations: ρ12 ≡ σ12/(σ1σ2).

Inequalities

The following inequalities for random variables are frequently used:

1. Jensen: If g(·) is a convex function on a support X of X and E|X | < ∞,then Eg(X) ≥ g(EX). [Proof: let a and b be real numbers such thatg(x) ≥ a + bx for x ∈ X and g(EX) = a + bEX . Convexity implies thatsuch a and b exist, not necessarily uniquely. Then Eg(X) ≥ E(a+bX) =a + bEX = g(EX).]

Page 77: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

60 Pricing Derivative Securities (2nd Ed.)

2. Markov: If g(·) ≥ 0 on a support X of X and if ε > 0, then P(g(X) ≥ε) ≤ Eg(X)/ε. [Proof: let Gε ≡ x : g(x) ≥ ε. Then Eg(X) =

∫Gε

g(x) ·dF (x) +

∫X\Gε

g(x) · dF (x) ≥∫Gε

g(x) · dF (x) ≥ εPX(Gε).]3. Schwarz: If EX2 < ∞ and EY 2 < ∞, (EXY )2 ≤ EX2 · EY 2. [Proof:

For some n ∈ ℵ put Xn = |X | ∧ n, Yn = |Y | ∧ n. EXnYn exists sinceXnYn ≤ n2, so ∀t ∈ we have

0 ≤ E(Xn + tYn)2

= (1, t)

(EX2

n EXnYn

EXnYn EY 2n

)(1t

).

The matrix is thus positive semidefinite, which implies (EXnYn)2 ≤EX2

nEY 2n . Now let n −→ ∞ and apply dominated convergence to con-

clude that EX2n ↑ EX2, EY 2

n ↑ EY 2, and (EXnYn)2 ↑ (E|XY |)2 ≤EX2 · EY 2.] The special case P(X = aj , Y = bj) = 1/n for j ∈1, 2, . . . , n and P(X = x, Y = y) = 0 elsewhere gives the deterministicversion: n∑

j=1

ajbj

2

≤n∑

j=1

a2j

n∑j=1

b2j . (2.41)

The special case that X and Y are in standard form gives −1 ≤ ρXY ≤ 1as bounds for the coefficient of correlation.

Generating Functions and Characteristic Functions

The moment generating function (m.g.f.) of a random variable X is equiv-alent to the two-sided Laplace-Stieltjes transform of its c.d.f.:

M(ζ) = EeζX =∫

eζx · dF (x).

The m.g.f. exists when the integral is finite for all real ζ in some openinterval about ζ = 0. Of course, M(0) = 1 always. Differentiating M(·) k

times with respect to ζ and interchanging the operations of integration andderivation, as is justified by dominated convergence, shows that M(k)(ζ) =∫

xkeζx ·dF (x) and hence M(k)(0) = EXk. This is the sense in which M(·)“generates” the moments. Existence of M(·) implies the existence of allpositive integer moments of X, but the converse does not hold.22

22The lognormal distribution (section 2.2.7) has finite moments of all orders but nom.g.f.

Page 78: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 61

The natural log of the m.g.f., L(ζ) ≡ lnM(ζ), is the “cumu-lant generating function” (c.g.f.) of X . It can be expanded in Taylor seriesabout ζ = 0 to give

L(ζ) =∞∑

j=1

κjζj/j!,

where the κj, which are defined from this relation, are the “cumulants”of X . The first four cumulants are

κ1 = µ

κ2 = µ′2 − µ2 = µ2 = σ2

κ3 = µ′3 − 3µµ

′2 + 2µ2 = µ3

κ4 = µ′4 − 4µ

′3µ − 3(µ

′2)

2 + 12µ′2µ − 6µ4 = µ4 − 3σ4.

It follows that the first three derivatives of L(ζ) evaluated at ζ = 0 arethe mean, the variance, and the third central moment, respectively, whileL(4)(0) is the fourth central moment minus three times the square of thevariance. The coefficients of skewness and excess kurtosis can be calculatedfrom the c.g.f. as

α3 = L(3)

(0)/[L′′(0)]3/2 = κ3/κ

3/22 (2.42)

α4 − 3 = L(4)(0)/[L′′(0)]2 = κ4/κ2

2. (2.43)

The related “probability generating function” (p.g.f.) is useful in char-acterizing distributions supported on the nonnegative integers. This isdefined as

Π(ζ) ≡ EζX =∑x∈X

ζxf(x),

for ζ ∈ C (the complex numbers) and restricted to some region of con-vergence. Clearly, there is always convergence for |ζ| ≤ 1; in particular,Π(1) = 1. The obvious identity Π (ζ) = M(ln ζ) for real, positive ζ showsthat convergence in a neighborhood of ζ = 1 implies existence of the m.g.f.and hence of all the moments. In this case the derivatives of Π(·) evaluatedat ζ = 1 equal the descending factorial moments, as

Π(k)(1) =∑x∈X

x(x − 1) · . . . · (x − k + 1)f(x) = µ(k).

In any case the derivatives evaluated at ζ = 0 generate the probabilities,as Π(k)(0)/k! = f(k) = PX (k).

Page 79: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

62 Pricing Derivative Securities (2nd Ed.)

The most important of the generating functions for our purposes isthe “characteristic function” (c.f.). The c.f. of a random variable X is theFourier-Stieltjes transform of the c.d.f.:

Ψ(ζ) = Eeiζx =∫

eiζx · dF (x), ζ ∈ , i =√−1.

The joint c.f. of a vector-valued random variable X with joint c.d.f. F isgiven by

Ψ(ζ) = Eeiζ′X =∫k

eiζ′x · dF (x), ζ ∈ k,

where ζ ′x is the inner product. Since |eiζ′x| = 1, we have∣∣∣∫ eiζ′x · dF (x)

∣∣∣ ≤1, so existence of Ψ(ζ) is guaranteed for each real ζ. Focusing on the uni-variate case, here are the properties we will use:23

1. Ψ(0) = 12. |Ψ(ζ)| = [Ψ(ζ)Ψ(−ζ)]1/2 ≤ 13. Ψ(ζ) is uniformly continuous on .

[Proof: for B > 0

|Ψ(ζ + h) − Ψ(ζ − h)|2

=∣∣∣∣∫

eiζx sin(hx) · dF (x)

∣∣∣∣≤

∫|sin(hx)| · dF (x)

≤ F (−B) + 1 − F (B) +∫ B

−B

|sin(hx)| · dF (x)

≤ F (−B) + 1 − F (B) + h

∫(−B,B]

|x| · dF (x),

which can be made arbitrarily small by taking B sufficiently large andh sufficiently small, independently of ζ.]

4. C.f.s of absolutely continuous random variables approach (0, 0) as |ζ| →∞, whereas c.f.s of random variables supported on a lattice are periodicand never damp to the origin.

5. If kth derivative Ψ(k)(ζ) exists at ζ = 0, then moments to order k

exist if k is even, and moments to order k − 1 exist if k is odd. By thecontrapositive, when k is even the nonexistence of Ψ(k)(0) implies thatthe kth moment does not exist. If the kth moment about the origin does

23See Lukacs (1970) for proofs of those not given and not obvious.

Page 80: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 63

exist, it can be generated as µ′k = i−kdkΨ(ζ)dζk |ζ=0 . Similarly, the

logarithm of the c.f. serves to generate the cumulants:

κk = i−kdk ln Ψ(ζ)/dζk |ζ=0 . (2.44)

6. If kth derivative Ψ(k)(ζ) exists at ζ = 0, then Ψ can be expanded as

Ψ (ζ) = 1 + Ψ′ (0) ζ + Ψ′′ (0) ζ2/2 + . . . + Ψ(k) (0) ζk/k! + o(ζk

),

where ζ−ko(ζk

)→ 0 as ζ → 0—the crucial point being that the remain-

der vanishes faster than ζ−k even if the (k + 1)st derivative at ζ = 0does not exist.

Example 11 If X is a continuous random variable with p.d.f. f(x) =(2π)−1/2e−x2/2 for x ∈ , the c.f. is

Ψ(ζ) =∫ ∞

−∞(2π)−1/2eiζx−x2/2 · dx

= e−ζ2/2

∫ ∞

−∞(2π)−1/2e−(x−iζ)2/2 · dx

= e−ζ2/2 erf(+∞)

= e−ζ2/2,

where the next-to-last step follows by setting z = (x − iζ)/√

2. Note thatΨ(ζ) is real-valued and positive and that Ψ(ζ) → 0 as |ζ| → ∞.

Example 12 If X has a lattice distribution with p.m.f. f(x) = θ1−x(1−θ)x

for x ∈ 0, 1 and θ ∈ (0, 1), then Ψ(ζ) = 1 + θ(eiζ − 1). Applying (2.3)shows that |Ψ(ζ)|2 = (1− θ)2 + 2θ(1− θ) cos ζ, which oscillates indefinitelyas |ζ| increases.

A crucial property of c.f.s is expressed in the uniqueness theorem. Thisstates that if c.f.s of two random variables X and Y agree for every ζ ∈ then the distributions of X and Y are also the same, in the sense thatPX (B) = PY (B) for all B ∈ B. In other words, there is a one-to-one corre-spondence between c.f.s and probability distributions. Combined with theconvolution property of c.f.s and the continuity theorem for c.f.s this willmake it possible to deduce the distributions of sums of independent vari-ables and the limiting distributions of certain sequences of random vari-ables. We derive the convolution formula now and present the continuitytheorem in section 2.2.6.

Let us consider how one would find directly—that is, without the benefitof c.f.s—the distribution of the sum of two independent random variables

Page 81: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

64 Pricing Derivative Securities (2nd Ed.)

X and Y with marginal c.d.f.s FX and FY . Letting Z = X + Y , the c.d.f.of Z can be found by integrating over the half plane as

FZ(z) = P(X + Y ≤ z)

=∫

x+y≤z

dFY (y) dFX(x)

=∫

[∫(−∞,z−x]

dFY (y)

]· dFX(x).

This gives FZ(z) =∫ FY (z − x) · dFX(x) or, switching the roles of X and

Y, FZ(z) =∫ FX(z − y) · dFY (y). Either of these integrals represents the

convolution of FX and FY , denoted FX ∗FY . From this it follows that FZ(z)is absolutely continuous if either X or Y is a continuous random variable; forif Y is continuous, differentiating FZ gives the density as fZ(z) =

∫ fY (z−

x) · dFX(x), while if FX is absolutely continuous it gives the density asfZ(z) =

∫ fX(z − y) · dFY (y). If both FX and FY are continuous, either

expression is equivalent to

fZ(z) =∫

fY (z − x)fX(x) · dx, (2.45)

which is the convolution of the p.d.f.s: fX ∗ fY . These are the operationsinvolved in order to work out directly the distributions of sums of indepen-dent random variables. Sums of more than two have to be handled in thisway, one step at a time.

Working instead through the c.f. and using (2.37), we have

ΨZ(ζ) = EeiζZ = E(eiζX · eiζY

)= EeiζXEeiζY = ΨX(ζ)ΨY (ζ).

Therefore, the c.f. of the sum of independent random variables is just theproduct of their c.f.s. This extends to any finite number of independentrandom variables; for if Zn =

∑nj=1 Xj and the Xj are independent,

then ΨZn(ζ) =∏n

j=1 ΨXj (ζ). In particular, if Xjnj=1 are independent and

identically distributed (i.i.d.) then ΨZn(ζ) = ΨX1(ζ)n.Notice that if X1 and X2 are i.i.d. random variables with c.f. ΨX(ζ),

then |ΨX(ζ)|2 = ΨX(ζ)ΨX(−ζ) is the c.f. of X1 − X2. In general, as inthis case, random variables distributed symmetrically about the origin havereal-valued c.f.s.

Page 82: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 65

Example 13 In the pointer experiment of example 9 the c.f.s of X and Y

are

ΨX(ζ) =1iζ

(eiζ − 1)

ΨY (ζ) =eiζ/2

2+

eiζ − eiζ/2

iζ.

If the pointer is spun k times in such a way as to generate independentrandom variables Xjk

j=1 and Yjkj=1, then the c.f.s of SX ≡

∑kj=1 Xj

and SY are ΨSX (ζ) = ΨX(ζ)k and ΨSY (ζ) = ΨY (ζ)k.

Deriving the full power of the convolution property requires a way toretrieve the c.d.f. (or p.d.f. or p.m.f.) from the c.f. There are a numberof inversion formulas that enable this to be done under appropriate con-ditions. The easiest version, which corresponds to Fourier inversion for-mula (2.31), requires the c.f. to be absolutely integrable, meaning that∫∞−∞ |Ψ(ζ)| · dζ < ∞. In this case one obtains the p.d.f. as

f(x) =12π

∫ ∞

−∞e−iζxΨ(ζ) · dζ. (2.46)

Example 14 Applying (2.46) to the c.f. of the standard normal distributionin example 11 gives

f(x) =12π

∫ ∞

−∞e−iζxe−ζ2/2 · dζ

and (changing variables as z = (ζ + ix)/√

2)

f(x) =e−x2/2

√2π

∫ ∞

0

2√π

e−z2 · dz

= (2π)−1/2e−x2/2 · erf(+∞)

= (2π)−1/2e−x2/2.

The requirement that Ψ(ζ) be absolutely integrable clearly restricts for-mula (2.46) to continuous random variables, since only these satisfy thenecessary condition lim|ζ|→∞ |Ψ(ζ)| → 0. However, not all continuous vari-ables have absolutely integrable c.f.s. For example, X in example 13 doesnot. The following more general results retrieve from the c.f. the probabilityassociated with a finite or infinite interval under the condition that no endpoint of the interval is an atom.

Page 83: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

66 Pricing Derivative Securities (2nd Ed.)

Theorem 3

(i) If PX(a) = PX(b) = 0,

F (b) − F (a) = limc→∞

∫ c

−c

e−iζa − e−iζb

2πiζΨ(ζ) · dζ.

(ii) If PX(x) = 0,

F (x) =12− lim

c→∞

∫ c

−c

e−iζx

2πiζΨ(ζ) · dζ. (2.47)

Notice that the integrals correspond to the Cauchy interpretation of theimproper integral,

∫∞−∞, which need not exist in the Lebesgue sense when

Ψ is not absolutely integrable.

Example 15 With ΨX(ζ) = 1iζ (eiζ − 1) as in example 13 and a = 0, b = 1

F (1) − F (0) = limc→∞

12π

∫ c

−c

2 − (eiζ + e−iζ)ζ2

· dζ

=1π

∫ ∞

0

sin2(ζ/2)(ζ/2)2

· dζ

= 1.

2.2.4 Radon-Nikodym for Probability Measures

Given a σ-finite measure µ on a measurable space (Ω,M), we have seenthat an equivalent σ-finite measure ν can be constructed as

ν(A) =∫

A

Q(ω) · dµ(ω), A ∈ M,

where Q is a µ-integrable function that is strictly positive on sets of posi-tive µ measure. From the Radon-Nikodym theorem we know also that anessentially unique such Q exists that connects any pair of equivalent mea-sures, with Q = dν/dµ as the Radon-Nikodym derivative. Let us see howthese statements carry over to probability measures in particular and alsoto the probability measures induced by a random variable X . Of course,σ-finiteness always holds in this context because probability measures arefinite, but there are much more interesting things to see.

Page 84: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 67

Given a probability space (Ω,F , P), it is clear that any function Q > 0such that

∫Q(ω) · dP(ω) = 1 defines a new probability measure,

P(A) =∫

A

Q(ω) · dP(ω), A ∈ F ,

that is equivalent to (has the same null sets as) P. In this context themapping Q ≡ dP/dP from Ω to (0,∞) is just a positive random variablewith unitary expected value—the Radon-Nikodym derivative. Indeed, anypositive, integrable random variable can be used to create such a new mea-sure if the random variable is normalized by its expected value. Conversely,Radon-Nikodym tells us that for two such equivalent probability measuresthere always does exist a positive random variable Q that connects themin this way, and that it is essentially unique.24

Now introduce a random variable X and consider the probabilitymeasure PX on (,B) that it induces, together with its c.d.f., F (x) =PX((−∞, x]). Let q(·) be a positive function on a support of X such thatEq(X) < ∞, and set Q (X) = q (X) /Eq (X). Then from PX and F thenew measure PX and c.d.f. F can be generated as

PX(B) =∫

B

Q(x) · dPX(x) =∫

B

Q(x) · dF (x) (2.48)

F (x) = PX((−∞, x]).

Conversely, Radon-Nikodym tells us that given two induced mea-sures or two distribution functions there is a nonnegative and inte-grable Q = dF /dF that connects them. Finally, if g is such thatE |g (X)| =

∫|g (x)| · dPX (x) < ∞, then the expected value with respect

to PX is

Eg (X) = Eg (X)Q (X) (2.49)

=∫

g (x)Q (x) · dF (x)

=∫

g (x)Q (x) · dPX (x) .

Example 16 Consider the c.d.f.s

F (x) =(1 − e−x

)1(0,∞) (x)

F (x) =[1 − (1 + x)e−x

]1(0,∞) (x) .

24That is, if there are two such functions, Q and Q′, then they agree almost every-where with respect to P and (by equivalence) with respect to P.

Page 85: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

68 Pricing Derivative Securities (2nd Ed.)

The supports being the same, the corresponding induced measures PX andPX are equivalent, so there must be a PX-almost-surely positive Q such that(2.48) holds. Since F and F are absolutely continuous with densities f andf , we have

Q(x) =dF (x)dF (x)

=f(x)f(x)

=xe−x

e−x= x

for x > 0.

Example 17 Consider the c.d.f.s

F (x) = (1 − π)1[0,1) (x) + 1[1,∞) (x)

F (x) = (1 − π)1[0,1) (x) + 1[1,∞) (x)

where π, π ∈ (0, 1). Both being supported on 0, 1, the induced measuresare equivalent. Since F and F are step functions with corresponding p.m.f.sf and f , we have

Q(x) =dF (x)dF (x)

=f(x)f(x)

=

1 − π

1 − π, x = 0

π

π, x = 1

.

2.2.5 Conditional Probability and Expectation

Given a probability space (Ω,F , P) and two events A, A′ ∈ F , theconditional probability of A given A′, written P(A |A′), is defined asP(A ∩A′)/P(A′) whenever P(A′) > 0. Intuitively, P(A |A′) is the probabil-ity assigned to A when we have partial information about the experiment,to the extent of knowing that A′ has occurred.

Introducing the bivariate random variable Z = (X, Y )′ on (Ω,F , P),the familiar concepts of conditional p.m.f.s and conditional p.d.f.s can bedeveloped directly from the definition of conditional probability. Thus, inthe discrete case that Z is supported on a countable set of points Z =X × Y ⊂ 2 we can take A = ω : X (ω) = x , A′ = ω : Y (ω) = y, anddefine the conditional p.m.f. for (x, y) ∈ Z as

fX|Y (x | y) =fZ (x, y)fY (y)

≡ P (A ∩ A′)P (A′)

= P(A | A′).

For the continuous case, putting A = ω : X (ω) ∈ (−∞, x] and A′ =ω : Y (ω) ∈ (y − h′, y] for h′ > 0, and considering just the values of y

Page 86: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 69

such that P (A′) > 0 for all h′ > 0, we get the conditional c.d.f. of X giventhe zero-probability event Y = y as

FX|Y (x | y) = limh′−→0

P (A ∩ A′)P (A′)

= limh′−→0

P(A | A′).

The conditional p.d.f. is then found as fX|Y (x | y) = dFX|Y (x | y) /dx.

Alternatively, putting A = ω : X (ω) ∈ (x − h, x] for h > 0 and leavingA′ as before, the p.d.f. can be obtained as

fX|Y (x | y) =fZ (x, y)fY (y)

≡ limh,h′−→0

P (A ∩ A′) / (hh′)P (A′) /h′ = lim

h,h′−→0h−1P(A | A′).

From the conditional p.m.f.s and p.d.f.s conditional expectations can bedeveloped in the usual way as weighted sums or integrals; e.g.,

E [g (X, Y ) | Y = y] =

x∈Xg (x, y) fX|Y (x | y)∫

g (x, y) fX|Y (x | y) · dx

.

Of course, all this should be familiar from elementary probability theory.For the finance applications that lie ahead, we now need to extend theconcepts to probabilities and expectations conditional on collections of setsthat comprise σ-fields.

Probabilities Conditional on σ-fields

We have explained the sense in which σ-field F in a probability space(Ω,F , P) represents an information structure and have discussed sub-σ-fields that contain coarser (that is, less) information than F about theresults of the chance experiment. Letting F ′ ⊆ F be a sub-σ-field of F andexhibiting some event A ∈ F , let us consider the symbol P(A | F ′) to rep-resent the probability of A given that we know the information representedby F ′. “Knowing F ′” means that we know whether each (nonempty) eventA′ ∈ F ′ has occurred. This is equivalent to knowing which of the elementaryevents has occurred of which events in F ′ are composed.25 Thus, we canthink of P(A | F ′) as an F ′-measurable random variable—a mapping fromthe events that generate F ′ to the interval [0, 1]. P(A | F ′) is F ′-measurablein the sense that we will know the actual value of P(A | F ′) once we obtainthe F ′ information. Note that P (A) itself can be thought of as conditional

25For example, if σ-field F ′ were generated by a countable collection of disjoint setsthat partitioned Ω, then these would be the “elementary events” of F ′.

Page 87: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

70 Pricing Derivative Securities (2nd Ed.)

on the trivial σ-field T = ∅, Ω ⊆ F ′, which represents the knowledge justof what things are possible; thus, P (A) ≡ P (A | T ). At the other extreme,P (A | F) is a Bernoulli random variable that takes one of two values—unityor zero, since F includes the knowledge of whether ω ∈ A or not.

Example 18 Referring to the die example on page 50, take F = F6 to bethe σ-field generated by the random variable X(ω) = number of dots onside ω when a fair six-sided die is thrown, and let G = F2 be the sub-σ-field generated by the random variable Y , where Y (ω) = 0 or Y (ω) = 1as the number of dots is even or odd. Knowing F2 means that we knowwhich of the odd-even outcomes, ·, · · · , : · : or :, ::, :::, has occurred.Since : ∈ F6, we interpret P(: | F2) as a random variable taking thevalue P(: | :, ::, :::) = 1/3 if :, ::, ::: occurs or P(: | ·, · · · , : · :) = 0if ·, · · · , : · : occurs.

The same statements naturally apply to measures induced by randomvariables. Thus, if X is a random variable on (Ω,F , P) and (,B, PX) isthe induced space, then for B′ ⊂ B and B ∈ B we think of PX (B | B′) as aB′-measurable random variable. For each of the collection of events B′ thatgenerate B′ the variable PX (B | B′) expresses the probability that X ∈ B

given that X ∈ B′—i.e., P (X ∈ B | X ∈ B′) . As before, we can think ofPX (B) as conditioned on the trivial field T ′ = ∅, and can recognizePX (B | B) as a Bernoulli random variable.

Expectations Conditional on σ-fields

With these concepts in mind, we can now develop the concept of con-ditional expectation given a σ-field of events. For each fixed sub-σ-fieldF ′ ⊆ F , P(· | F ′) is a mapping from F to [0, 1], and so (Ω,F , P(· | F ′)) is aprobability space. Defining a random variable X on Ω, statements such asP(X ≤ x | F ′) can be interpreted as the probability that will be assigned tothe F -measurable event ω : X (ω) ≤ x once the information F ′ is known.Since P(· | F ′) is a probability measure, the conditional expectation of X

given F ′ can then be defined in the usual way as

E(X | F ′) =∫

X(ω) · dP(ω | F ′).

This conditional expectation can be interpreted as the F ′-measurable ran-dom variable that represents what the expected value of X will be once theinformation F ′ becomes available. In particular, E (X | T ) ≡ EX when T

Page 88: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 71

is trivial, while E (X | F) = X . Defined in this way, E (· | F ′) is automati-cally endowed with the nine properties of mathematical expectation givenon pages 56–57. In particular, corresponding to E [bg (X)] = bEg (X) forsome constant b we have E [Y g (X) | F ′] = Y E [g (X) | F ′] if Y ∈ F ′ (Y isan F ′-measurable random variable).

There is also the parallel interpretation in terms of induced measures.Thus, E (X | B′) =

∫x · dPX (x | B′) corresponds to the expectation of X

given the partial information that X ∈ B′ ⊆ B.There is one other very essential property, known as the “tower” prop-

erty of conditional expectation:

E [E(X | F ′)1A′ ] = E(X1A′), (2.50)

where A′ is any set in F ′. In particular, with A′ = Ω we haveE [E(X | F ′)] = EX . A way to understand the tower property intuitivelyis from the identity

E(X1A′) = E [E(X | F ′)1A′ ] + E [X − E(X | F ′)]1A′ .

Thinking of the conditional expectation E(X | F ′) as the best prediction26

of X given the information F ′, it is clear that the second term should bezero, since prediction error X−E(X | F ′) should be orthogonal to anythingin F ′. Likewise, writing EX as E [E (X | F ′)]+E [X − E(X | F ′)] , we canunderstand that the second term should be zero because it represents theexpectation given less information (T ) of the prediction error based onmore information (F ′ ⊇ T ).27

While this development of conditional expectation is very natural,the usual treatment, traced to Kolmogorov (1950), is simply to defineE(X | F ′) as an F ′-measurable random variable having the property (2.50).More precisely, E(X | F ′) is regarded as an “equivalence class” of ran-dom variables satisfying (2.50). This allows different “versions” to exist,members of any countable collection of which are equal with probabilityone. With conditional expectation thus defined, conditional probabilitiescan be deduced as conditional expectations of indicator functions—e.g.,P (A | F ′) = E (1A | F ′) for A ∈ F and PX (B | B′) = E (1B | B′) for any

26It is in fact the best prediction in the mean-square sense, in that E[(X −m)2 | F ′]attains a unique minimum (among all F ′-measurable m) at m = E (X | F ′).

27Williams (1991) gives a formal derivation of the tower property from conditionalprobabilities in the case that Ω is a finite set.

Page 89: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

72 Pricing Derivative Securities (2nd Ed.)

B ∈ B (the Borel sets) and any B′ ⊆ B. The fact that conditional expecta-tions have the usual properties of expectation can then be shown to followfrom the definition. While less insightful, this standard treatment has theadvantage of making it easy to verify that some F ′-measurable randomvariable Y is in fact a version of the conditional expectation E(X | F ′):simply show that E(Y 1A′) = E (X1A′) for any F ′-measurable A′.

Conditional Expectations under Changes of Measure

Now, given the measure P on (Ω,F) and the sub-σ-field F ′, let us see howto express conditional expectations in a new measure P. As in section 2.2.4,let P and P be equivalent probability measures on (Ω,F), with randomvariable Q = dP/dP as the Radon-Nikodym derivative. Thus, P (Q > 0) =P (Q > 0) = 1 and

∫Q · dP = 1. Let X be a random variable with E |X | ≡∫

|X | · dP =∫|X |Q · dP < ∞. Then if A′ ∈ F ′ ⊆ F ,

EXQ1A′ = EX1A′ = E[E (X1A′ | F ′)

]= E

[E (X1A′ | F ′)Q

],

where the second equality follows from the tower property and the first andthird equalities follow from (2.49). But, again by the tower property,

E[E (X1A′ | F ′)Q

]= E

E[E (X1A′ | F ′)Q] | F ′

= E

[E (X1A′ | F ′)E (Q | F ′)

],

since E (X1A′ | F ′) is F ′-measurable. That this holds for each A′ ∈ F ′

proves that the expression in brackets is in fact (a version of) the conditionalexpectation of XQ given F ′; i.e., E (XQ | F ′) = E (X | F ′)E (Q | F ′) a.s.Finally, since Q > 0 we have

E (X | F ′) =E (XQ | F ′)E (Q | F ′)

, (2.51)

which is sometimes called “Bayes’ rule for conditional expectations”.

Example 19 Let X = Y + Z, where Y and Z are independent randomvariables with p.d.f.s fY (y) = (2π)−1/2

e−(y−1)2/2 and fZ (z) = (2π)−1/2

e−z2/2. Take F to be the σ-field generated by Y and Z (so that X isF-measurable) and F ′ to be the sub-σ-field generated by Y alone.Convolution formula (2.45) shows the p.d.f. of X to be fX (x) =(4π)−1/2

e−(x−1)2/4. Thus, for any Borel set B we have PX (B) =

Page 90: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 73

∫B

(4π)−1/2e−(x−1)2/4 · dx. To define a new measure PX , put Q =

dPX/dPX = dFX/dFX = e−X/2+1/4. Then Q > 0 and

EQ =∫ ∞

−∞

1√4π

e−x/2+1/4−(x−1)2/4 · dx =∫ ∞

−∞

1√4π

e−x2/4 = 1,

as required. Thus, for integrable g = g (X) we have Eg = E (Qg) . In par-ticular, EX = 0 and EX2 = 1. Now let us use (2.51) to find E (X | F ′)and E

(X2 | F ′) . Since knowledge of F ′ confers knowledge of realization

Y (ω) = y, we can write these alternatively as E (X | y) and E(X2 | y

).

Under original measure PX the conditional p.d.f. of X given Y = y is

fX|Y (x | y) =d

dxFX|Y (x | y) =

d

dxFZ (x − y) =

1√2π

e−(x−y)2/2,

and so

E(Q | y) =∫

1√2π

e−x/2+1/4−(x−y)2/2 · dx

= e−y/2+3/8

∫1√2π

e−[x−(y−1/2)]2/2 · dx

= e−y/2+3/8.

Applying (2.51) for arbitrary, integrable g, we have

E [g (X) | y] =E [g(X)Q | y]

E(Q | y)

=e−y/2+3/8

∫1√2π

g(x)e−[x−(y−1/2)]2/2 · dx

e−y/2+3/8

=∫

1√2π

g (w + y − 1/2) e−w2/2 · dw

=

y − 1/2, g (X) = X

1 + (y − 1/2)2 , g (X) = X2 .

2.2.6 Stochastic Convergence

This section deals with sequences of random variables Xn∞n=1 that, insome sense, settle down or converge as they progress. There are varioussenses in which this can occur. We begin with the most basic one, in whichthe distributions of the Xn approach some particular form.

Page 91: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

74 Pricing Derivative Securities (2nd Ed.)

Convergence in Distribution

Definition 1 Let F be a c.d.f. The sequence Xn converges in distributionto F if the sequence of c.d.f.s Fn converges (pointwise) to F for each x ∈ that is a continuity point of F.

The term “weak convergence” is also used. We write Xn F to indi-cate convergence in distribution. Since there is a one-to-one correspondencebetween Fn, the c.d.f. at stage n, and its induced probability measure, PXn ,it is clear that the measures are approaching some common form that cor-responds to the c.d.f. F that represents the limiting distribution. The weakrequirement that the c.d.f.s converge to F just where F is continuous han-dles the awkward transitions from sequences of continuous distributions tolimiting forms with one or more atoms.

Example 20 Let X1 have c.d.f.

F1(x) = x1[0,1) (x) + 1[1,∞) (x) ,

corresponding to the random variable X in example 9, and define Xn = Xn1

for n ∈ ℵ. Then Xn has c.d.f.

Fn(x) = P(Xn1 ≤ x) = P(X1 ≤ x1/n) = x1/n1[0,1) (x) + 1[1,∞) (x) .

Plainly, the probability mass for Xn becomes more and more concentratednear the origin the larger is n. Taking limits,

Fn(x) → 1(0,∞) (x) ≡ F∞(x).

F∞ is not a c.d.f. since it is not right-continuous at x = 0, but definingF (x) = F∞(x) for x = 0 and F (0) = 1 does produce a c.d.f. to which Fnconverges except at discontinuity point 0. The continuous distribution of Xn

can therefore be approximated arbitrarily closely for large enough n by thedegenerate distribution F that puts unitary probability mass at the origin.

In many applications it is difficult to express the c.d.f.s of the Xn andtherefore hard to ascertain the limiting distribution from the definition.For example, when each Xn is a sum of independent variables, finding Fn

directly requires calculating a sequence of convolution integrals. In suchcases the following “continuity” theorem for characteristic functions tellsus that the limiting distributions can be deduced from the limits of thec.f.s.

Theorem 4 Xn converges in distribution to F if and only if the sequence ofcharacteristic functions Ψn∞n=1 converges (pointwise) to a function Ψ∞

Page 92: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 75

that is continuous at ζ = 0. In this case Ψ∞ is a c.f., and the one thatcorresponds to F.

Example 21 In example 12 on page 63 take θ = 1/2 and let Xj∞j=1 beindependent random variables, each having c.f. Ψ(ζ) = 1+ (eiζ − 1)/2. Thelinear transformation 2Xj − 1 sets the mean to zero and the variance tounity. Let Zn be n−1/2 times the sum of the first n of the scaled, centeredvariates: Zn = n−1/2

∑nj=1(2Xj −1). The limiting distribution of Zn can

be found via the continuity theorem, as follows:

Ψn(ζ) = EeiζZn =n∏

j=1

Eei(2Xj−1)ζ/√

n = e−iζ√

n[1 + (e2iζ/

√n − 1)/2

]n

.

Expanding e2iζ/√

n to order n−1 as 1+2iζ/√

n−2ζ2/n+o(n−1

)and taking

logs give

ln Ψn(ζ) = −iζ√

n + n ln[1 + iζ/

√n − ζ2/n + o

(n−1

)]= −iζ

√n + n

[iζ/

√n − ζ2/n + ζ2/(2n) + o

(n−1

)]= −ζ2/2 + o (1) ,

so that Ψn(ζ) → e−ζ2/2. One can either recognize this from example 11 asthe c.f. of the standard normal distribution, or else invert it as in example 14to see that the limiting distribution of Zn is represented by p.d.f. f(z) =(2π)−1/2e−z2/2.

Central limit theorems give conditions under which averages or sumsof random variables, appropriately centered and scaled, converge in distri-bution to the standard normal. In the case of independent and identicallydistributed random variables the Lindeberg-Levy theorem states the weak-est possible conditions for this result. Example 21 was just a special case,but the proof of the general result is even easier.

Theorem 5 If Xj∞j=1 are independent and identically distributed (i.i.d.)

with mean µ and finite variance σ2, then(∑n

j=1 Xj − nµ)

/ (σ√

n) con-verges in distribution to the standard normal as n → ∞.

Proof: Put Zn =(∑n

j=1 Xj − nµ)

/ (σ√

n) = n−1/2∑n

j=1 Yj , where

Yj ≡ (Xj − µ) /σ are i.i.d. with EY1 = 0, EY 21 = 1 and c.f. Ψ. Then

Page 93: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

76 Pricing Derivative Securities (2nd Ed.)

Zn has c.f.

Ψn(ζ) = Ψ(n−1/2ζ

)n

=[1 +

iζ√n

EY1 −ζ2

2nEY 2

1 + o(n−1

)]n

=[1 − ζ2

2n+ o

(n−1

)]n

,

and limn→∞ Ψn(ζ) = e−ζ2/2.

Stochastic Convergence to a Constant

In example 20 the limiting distribution of the sequence Xn was degener-ate, in the sense that all probability mass was concentrated at one point.In such cases we say that Xn “converges stochastically to a constant”.Again, there are several modes of such convergence.

Definition 2 The sequence of random variables Xn converges stochas-tically to the constant c

1. In probability, if the limiting distribution of Xn is F (x) = 1[c,∞) (x) .

2. In mean square, if limn−→∞ E(Xn − c)2 = 0.

3. Almost surely, if for each ε > 0

P(|Xn − c| > ε for infinitely many n) = 0.

The usual notations are Xn →P c, Xn →m.s. c, and Xn →a.s. c, respectively.An equivalent condition for convergence in probability is that P(|Xn −c| > ε) → 0 for each ε > 0. In other words, the requirement is that theprobability of a significant deviation at stage n be made arbitrarily smallby taking n sufficiently large. Almost-sure (a.s.). convergence is a strongerform, since it requires there to be some point beyond which there will neverbe a significant deviation. The convergence part of the Borel-Cantelli lemmaprovides a sufficient condition:

∞∑n=1

P(|Xn − c| > ε) < ∞ ⇒ P(|Xn − c| > ε for infinitely many n) = 0.

Taking g(Xn) = |Xn − c|2 in the Markov inequality shows that conver-gence in mean square implies convergence in probability. On the other hand,mean-square convergence is not implied by either of the other two forms,

Page 94: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 77

since neither a.s. convergence nor convergence in probability requires theexistence of moments.

Laws of large numbers give conditions under which averages of a col-lection of random variables converge stochastically to some constant µ. Fori.i.d. random variables with mean µ Kolmogorov’s strong law of large num-bers gives the best possible result:

Theorem 6 If Xj∞j=1 are i.i.d. with EXj = µ, then n−1∑n

j=1 Xj

→a.s. µ as n → ∞.

2.2.7 Models for Distributions

Important properties of most of the models for distributions we willencounter in applications to financial derivatives are summarized intable 2.3. For the moments the table gives whichever has the simplestform—moments about the origin, µ

′k; moments about the mean, µk; or

descending factorial moments, µ(k). Any of these suffices to determine theothers. “ /Σ >> 0” for the multinormal signifies positive definiteness of thematrix. Moment generating functions exist for all distributions in the tableexcept the lognormal and Cauchy. These can be obtained from the expres-sions for the c.f.s by replacing ζ by ζ/i. The lognormal c.f. has no closedform, and the Cauchy law has no positive-integer moments.

Normal and Lognormal Families

The normal and lognormal distributions are of special importance in thestudy of derivatives. If random variable X is distributed as (univariate) nor-mal with parameters θ1 and θ2

2 , we write X ∼ N(θ1, θ22) for short. Since X

then has mean θ1 and variance θ22 , we typically represent the distribution as

N(µ, σ2). The case N(0, 1) is the “standard” normal. The normal distribu-tion is often called the “Gaussian” law. One can see from the expressions forthe central moments in table 2.3 that the coefficients of skewness and kurto-sis are α3 = 0 and α4 = 3.0, respectively. Special symbols for the standardnormal p.d.f. and c.d.f. are φ(x) = (2π)−1/2e−x2/2 and Φ(x) =

∫ x

−∞ φ(z)·dz.A change of variables in the integral shows that erf(x/

√2) = Φ(x)−Φ(−x)

and therefore

Φ(x) =12

[1 + erf

(x/

√2)]

.

Page 95: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

78 Pricing Derivative Securities (2nd Ed.)

Table 2.3. Common distributions and their properties.

Name P.d.f./ Support Parameter Moments Characteristic

Abbrev. P.m.f. X Space µ′k, µk, µ(k) Function, Ψ(ζ)

Bernoulli θ1−xθx 0, 1 θ ∈ (0, 1) µ′k : 1 − θ + θeiζ

B(1, θ) θ ≡ 1 − θ θ

Binomial`n

x

´θn−xθx 0, 1, . . . , n θ ∈ (0, 1) µ(k) :

“1 − θ + θeiζ

”n

B(n, θ) θ ≡ 1 − θ n ∈ ℵ n!θk

(n−k)!

Poisson θxe−θ/x! ℵ0 θ > 0 µ(k) : exp“

θeiζ − θ”

P (θ) θk

Neg. Bin. Γ(ϕ+x)x!Γ(ϕ) θϕθx ℵ0 θ ∈ (0, 1) µ(k) :

“θ

1−θeiζ

”ϕ

NB(θ, ϕ) θ ≡ 1 − θ ϕ ≥ 1 Γ(ϕ+k)Γ(ϕ)

θk

θk

Uniform (ϕ − θ)−1 (θ, ϕ) θ < ϕ µ′k : eiζδ

δ sinh

ζ(ϕ+θ)2

i

U(θ, ϕ) ϕk+1−θk+1

(k+1)(ϕ−θ) δ ≡ ϕ−θ2

Gamma ϕ−θxθ−1e−x/ϕ

Γ(θ) + θ > 0 µ′k : (1 − iζϕ)−θ

Γ(θ, ϕ) ϕ > 0 ϕk Γ(θ+k)Γ(θ)

Exponential ϕ−1e−x/ϕ + ϕ > 0 µ′k : (1 − iζϕ)−1

Γ(1, ϕ) ϕkk!

Normal e−(x−µ)2/2σ2

σ√

2π µ ∈ µk : eiζµ−ζ2σ2/2

N(µ, σ2) σ > 0 0, k odd

σkk!2k/2[k/2]!

, else

Multinormal e−(x−µ)′ /Σ−1(x−µ)/2q(2π)k | /Σ|

k µ ∈ k µ′1 = µ eiζ′µ−ζ′ /Σζ/2

N(µ, /Σ) /Σ >> 0 µ2 = /Σ

Lognormal e−(ln x−µ)2/2σ2

xσ√

2π(0,∞) µ ∈ µ

′k : no closed form

LN(µ, σ2) σ > 0 ekµ+k2σ2/2

Cauchy 1

π

»1+

“x−θ

ϕ

”2– θ ∈ /∃ eiζθ−|ζ|ϕ

C(θ, ϕ) ϕ > 0

Two useful inequalities for bounding probabilities of large deviations are

P(Z > c) = 1 − Φ(c) ≤∫ ∞

c

(2π)−1/2ze−z2/2 · dz = φ(c) (2.52)

P(|Z| > c) = 2P(Z > c) < e−c2/2 (2.53)

for c ≥ 1.

Page 96: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 79

For the k-variate version of the normal we write X ∼ N(µ, /Σ), whereµ ≡ EX is the k-vector of means and /Σ ≡ E(X − µ)(X − µ)′ is thek×k matrix of variances and covariances. Thus, /Σ is a symmetric, positive-semidefinite matrix whose jth element, σj, is the covariance between Xj

and X when j = and is the variance of Xj when j = . The c.f. of theaffine function a0 +a′X = a0 +

∑kj=1 ajXj , where ajk

j=0 are constants, is

Ψa0+a′X(ζ) = eiζa0ΨX(ζa) = exp[iζ(a0 + a′µ) − ζ2a′ /Σa/2

],

which shows that a0 + a′X ∼ N(a0 + a′µ, a′ /Σa).When the covariances are all zero, /Σ then being a diagonal matrix, the

multivariate normal p.d.f. factors as the product of the marginal p.d.f.s ofX1, . . . , Xk; that is,

f(x) = (2π)−k/2| /Σ|−1/2 exp[−1

2(x − µ)′ /Σ−1(x − µ)

]

= (2π)−k/2∏k

j=1σ−1

j exp

−12

k∑j=1

(xj − µj

σj

)2

=k∏

j=1

1σj

√2π

exp

[−1

2

(xj − µj

σj

)2]

. (2.54)

Uncorrelated normals are therefore independent. Also, since independentrandom variables with normal marginal distributions have (2.54) as jointp.d.f., any collection of independent normals is necessarily multivariate nor-mal. A sum, S, of such independent normals is distributed as normal withmean

∑kj=1 µj and variance

∑kj=1 σ2

j . If the Xj are also identically dis-tributed, then S ∼ N(kµ, kσ2). The normal family is therefore closed underconvolution.

In the bivariate case, k = 2, the p.d.f. is

f(x1, x2) =exp

− 1

2(1−ρ2)

[(x1−µ1

σ1

)2 − 2ρ(

x1−µ1σ1

)(x2−µ2

σ2

)+

(x2−µ2

σ2

)2]2πσ1σ2

√1 − ρ2

,

where ρ ∈ (−1, 1) is the coefficient of correlation. The bivariate standardnormal form for random variables Z1, Z2 with zero means and unit vari-ances is

φ(z1, z2) =1

2π√

1 − ρ2exp

[−z2

1 − 2ρz1z2 + z22

2(1 − ρ2)

].

Page 97: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

80 Pricing Derivative Securities (2nd Ed.)

A little manipulation presents this as

φ(z1, z2) =

1√

2π√

1 − ρ2exp

[− (z1 − ρz2)2

2(1 − ρ2)

]·(

1√2π

e−z22/2

),

which shows that the conditional density of Z1 given Z2 is

φ(z1 | z2) =1√

2π√

1 − ρ2exp

[− (z1 − ρz2)2

2(1 − ρ2)

], (z1, z2) ∈ 2.

Thus, for a bivariate standard normal pair Z1, Z2 the conditional distri-bution of Z1 given Z2 = z2 is itself normal, with mean proportional toz2. Changing variables as X1 = µ1 + σ1Z1 and X2 = µ2 + σ2Z2 gives acorresponding result for the general case:

f(x1 | x2) =1√2πδ

exp[− (x1 − α − βx2)2

2δ2

],

where β ≡ ρσ1/σ2, α ≡ µ1−βµ2, and δ ≡ σ1

√1 − ρ2. Thus, the conditional

expectation of X1 given X2 is a linear function of X2 when (X1, X2) isbivariate normal, whereas the conditional variance does not depend on X2

at all.Introducing a standard normal variate, U ∼ N(0, 1), that is independent

of X2, the c.f. of α + βX2 + δU is

eiζαEeiζβX2+iζδU = eiζα exp(iζβµ2 − ζ2β2σ2

2/2)exp(−ζ2δ2/2)

= eiζ(µ1−βµ2) exp[iζβµ2 − ζ2ρ2σ2

1/2 − ζ2(1 − ρ2)σ21/2

]= eiζµ1−ζ2σ2

1/2

= ΨX1 (ζ) .

This shows that when X1, X2 are bivariate normal X1 is distributed as anaffine function of X2 plus an independent normal error. The affine functionof X2 is the “regression function” of X1 on X2.

We will often use the following relation between normal and bivariatenormal c.d.f.s:∫ γ

−∞Φ(α − βy) · dΦ(y) = Φ

(γ,

α√1 + β2

;β√

1 + β2

), (2.55)

where α, β, γ are arbitrary real numbers and Φ(·, ·; ρ) represents the bivari-ate normal c.d.f. To see that this holds, write the left side of (2.55) as∫ γ

−∞

∫ α−βy

−∞

12π

e−(w2+y2)/2 · dw dy

Page 98: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 81

and change variables as x = βy/√

1 + β2 + w/√

1 + β2. Defining ρ =ρ (β) ≡ β/

√1 + β2 then gives∫ γ

−∞

∫ α/√

1+β2

−∞

1

2π√

1 − ρ2exp

[−y2 − 2ρxy + x2

2 (1 − ρ2)

]· dx dy,

which corresponds to the right-hand side of (2.55).Noting that Φ(+∞, ·; ρ) is the marginal c.d.f. of the second variate,

sending γ → +∞ in (2.55) gives another useful relation:∫ ∞

−∞Φ(α ± βy) · dΦ(y) = Φ

(α/

√1 + β2

). (2.56)

If X ∼ N(µ, σ2) then Y ≡ eX is distributed as lognormal with thesame parameters: Y ∼ LN(µ, σ2). Lognormal variates are clearly strictlypositive. Their moments can be deduced from the m.g.f. of the normal, as

EY k = EekX = MX(k) = ekµ+k2σ2/2.

In particular,

EY = eµ+σ2/2

V Y = e2µ+σ2(eσ2 − 1).

While all the moments exist—even fractional and negative moments, them.g.f. does not, since E exp(ζeX) < ∞ only if ζ = 0.

Since the normal family is closed under convolution (i.e., sums of inde-pendent normals are normal), it is immediate that products of independentlognormals are lognormal.

Mixtures of Distributions

Distributions that serve as good models for financial data can sometimesbe constructed by mixing standard models probabilistically. The resultingforms are often referred to as “hierarchical” models. As a simple example,suppose realizations of a random variable X are generated by drawing withprobability p from the distribution N(0, σ2

1) and with probability 1−p fromthe distribution N(0, σ2

2), where σ21 = σ2

2 . Here the variance represents themixing parameter. The resulting mixture of normals that this generatescan be thought of in the following way. Regard the mixing parameter as adiscrete random variable V , with P(V = σ2

j ) = p2−j(1−p)j−1 for j ∈ 1, 2,

Page 99: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

82 Pricing Derivative Securities (2nd Ed.)

and think of X as having the distribution N(0, V ) conditional on V . Withthis interpretation the marginal c.d.f. of X is

F (x) = EP(X ≤ x|V ) = pΦ(x/σ1) + (1 − p)Φ(x/σ2).

Mixtures of this particular type are symmetric about the origin and bellshaped, but they have thicker tails than the normal and kurtosis exceed-ing 3.0.

More elaborate hierarchical schemes can be generated by choosing otherdistributions for V , including continuous distributions that are supportedon (0,∞). Of course, the mixture class is not limited to mixtures of normals.One has the freedom to choose among various models for the conditionaldistributions, among the mixing parameters, and among models for thedistributions of those parameters.

Infinitely Divisible Distributions

A distribution F is infinitely divisible if for each positive integer n it canbe expressed as the convolution of n identical distributions, Fn. In otherwords, X has an infinitely divisible distribution if for each n there arei.i.d. random variables Xjnn

j=1 such that X =∑n

j=1 Xjn. The c.f. of aninfinitely divisible distribution can therefore be written as Ψ(ζ) = Ψn(ζ)n

for some c.f. Ψn. Taking Ψn(ζ) = exp[n−1(iζµ − ζ2σ2/2)

]shows that the

normal distribution is infinitely divisible, as are the Poisson, gamma, andCauchy laws. We will make use of the following properties of infinitelydivisible distributions.

1. An infinitely divisible c.f. has no zeroes; i.e., Ψ(ζ) > 0 ∀ζ ∈ if Ψ(ζ)is infinitely divisible. [Proof: By infinite divisibility Ψ(ζ)1/n is a c.f. foreach n, and therefore Υn(ζ) ≡ Ψ(ζ)1/nΨ(−ζ)1/n = |Ψ(ζ)|2/n is a real-valued c.f. Since |Ψ(ζ)| ≤ 1, it follows that Υ∞(ζ) ≡ limn→∞ Υn(ζ) = 0or Υ∞(ζ) = 1 according as Ψ(ζ) = 0 or Ψ(ζ) = 0. But since Ψ(ζ) iscontinuous and Ψ(0) = 1 there is an ε > 0 such that Ψ(ζ) = 0 for|ζ| < ε. Thus, Υ∞(ζ) = 1 for |ζ| < ε and is therefore continuous there.But by the continuity theorem Υ∞(ζ) is a c.f. and therefore continuouseverywhere. Accordingly, Υ∞(ζ) = 1 and Ψ(ζ) = 0 for all ζ ∈ .]

2. If Ψ(ζ) is the c.f. of an infinitely divisible distribution with mean zero,then there exists a σ-finite measure M such that

ln Ψ(ζ) =∫

eiζx − 1 − iζx

x2M(dx). (2.57)

Page 100: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 83

Property 2, the Levy-Khintchine theorem, gives one of several canonicalrepresentations of an infinitely divisible c.f.28 Here are two examples ofinfinitely divisible distributions in the canonical form:

Example 22 Defining the integrand of (2.57) at zero by continuity andsetting M() = M(0) = σ2 give Ψ(ζ) = exp(−ζ2σ2/2), the c.f. ofN

(0, σ2

).

Example 23 Setting M() = M(1) = θ gives Ψ(ζ) = e−iζθ

exp[θ(eiζ − 1)

], the c.f. of a centered Poisson distribution.

Recall from (2.44) that the variance (second cumulant) of a random variableX with c.f. Ψ(ζ) is d2 ln Ψ(ζ)/d(iζ)2 |ζ=0. Differentiating the right side of(2.57) gives V X =

∫ M(dx), which shows that V X exists if and only if M

is a finite measure, as it was in the two examples.

2.2.8 Introduction to Stochastic Processes

A stochastic process is a family of random variables with some naturalordering. To represent the family requires an index and an index set thatindicates what the ordering is, such as Xjj∈J or Xtt∈T . Most appli-cations involve orderings in space or in time. For example, Xdd∈D mightrepresent a feature of a topographic map that shows the heights above sealevel at points on some curvilinear track on the earth’s surface; and Xtt∈Tmight represent the evolution through time of the concentration of a chem-ical reactant, of the incidence of solar radiation on a panel of receptors, orof the price of an asset. Since our interest is just in the last of these, we con-sider only processes that evolve in time. Among such there are discrete-timeprocesses, for which index set T is countable (usually with equally spacedelements), and continuous-time processes, for which T is a continuum (usu-ally a single bounded or unbounded interval). For example, a continuousrecord of bid prices for a common stock would represent a continuous-timeprocess, while a daily record of settlement prices for commodity futureswould be a process in discrete time.

Regardless of how time is recorded, any individual Xt in the familycan be either a discrete, a continuous, or a mixed random variable. Forexample, bid prices for stocks evolve in continuous time but could be mod-eled either as continuous or as discrete variables supported on a lattice of

28For proofs see Feller (1971, chapter 17) and Lukacs (1970, chapter 5).

Page 101: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

84 Pricing Derivative Securities (2nd Ed.)

integer multiples of the minimum price increment. Thus, “discrete-time”and “continuous-time” refer to how the process is indexed, while the statespace of the process defines the random variables’ support. Of course, every-thing in the real world is discretely measured, so how a particular processis regarded usually hinges on the predictive power and convenience of theprobability models from which we can choose to represent it.

Filtrations and Adapted Processes

The modeling of a stochastic process proceeds, as usual, with reference to aprobability space (Ω,F , P) that defines events and their measures. However,there are some subtleties. First, a random draw ω from Ω now produces onerealization of the entire process, Xt(ω)t∈T . Second, as observers of theprocess evolve along with it, they acquire new information that includes, atthe least, the current and past values of Xt. Thinking of F and various sub-σ-fields of F as information structures gives a natural way to characterizethis evolution as a progression of σ-fields—a “filtration”. Represented asFtt∈T , a filtration is an increasing family of sub-σ-fields of F , meaningthat Fs ⊆ Ft ⊆ F for s < t. The process Xt is then required to be“adapted” to Ft, meaning that all the history of the process up to andincluding t is Ft-measurable. A discrete-time example will help to showwhat all this means.

Example 24 Consider the experiment of flipping a coin three times insuccession, and let Ω comprise the sequences of heads or tails that could beobtained, as

Ω ≡ HHH, HHT, HTH, HTT, THH, THT, TTH, TTT .

Next, take as the general information structure F the collection of all 28

subsets of these eight elementary outcomes, and define the probability mea-sure P as

P(A) = #(A)/#(Ω) = #(A)/8, A ∈ F .

Here “#” is counting measure that enumerates the elementary outcomesin sets. Having fixed the overall probability space, we can now describe afiltration that represents how information evolves as one learns, succes-sively, what happens on each flip. We can also define an adapted processXtt∈0,1,2,3 that represents the total number of heads obtained after t

flips.

Page 102: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 85

At stage t = 0, before the first flip is made, we know only thatone of the outcomes in Ω will occur and that an outcome not in Ω will notoccur. In this case our initial information set is the trivial field F0 = ∅, Ω.Since we start off with X0 = 0, a known constant that does not depend onthe outcome of the experiment, it follows X0 ∈ F0 (is F0-measurable). Nowpartition Ω into the exclusive sets

HHH, HHT, HTH, HTT ︸ ︷︷ ︸H

, THH, THT, TTH, TTT︸ ︷︷ ︸T

.

Once the first flip is made and we see whether the coin came up heads ortails, we acquire the information F1 ≡ ∅, H, T, Ω. In other words, wethen know whether each of the 22 events generated by the partition hasoccurred. In particular, we know the value assumed by X1, which meansthat random variable X1 is measurable with respect to F1. Of course, weretain the trivial knowledge we had at the outset, so F0 ⊂ F1, and thusX0 is also F1-measurable. However, F1 clearly does not pin down the valueof X2, the total number of heads on the first two flips. That value will beknown once we get the information F2 that represents the history to thatstage. This comprises the 24 events generated by the finer partition

HHH, HHT ︸ ︷︷ ︸HH

, HTH, HTT ︸ ︷︷ ︸HT

, THH, THT ︸ ︷︷ ︸TH

, TTH, TTT ︸ ︷︷ ︸TT

.

But each set in F1 is also in F2, so we still know what happened on the firstflip. For example, H = HH ∪ HT ∈ F2. Therefore, all of X0, X1, and X2

are F2-measurable. Finally, making the last flip gives us F3 = F , the infor-mation structure of the 28 events generated by the finest possible partitionof Ω (given its original description) into the eight elementary outcomes.F3 tells the full story of the experiment and determines the values of allof X0, X1, X2, and X3. Our discrete-time stochastic process Xtt∈0,1,2,3is then adapted to the filtration Ftt∈0,1,2,3, in the sense that the infor-mation structure at each t reveals the entire history of the process to thattime.

Notice in the example that F2 is not the smallest information set thatdetermines X2, in the sense that it is not built from the coarsest possi-ble partition that tells the number of heads after two flips. This minimalσ-field, the field σ(X2) that is generated by X2, comprises just the 23 setsbuilt up from

TTH, TTT , THH, THT, HTH, HTT, HHH, HHT .

Page 103: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

86 Pricing Derivative Securities (2nd Ed.)

However, X1 is not measurable with respect to σ(X2) and is therefore nota random variable on the probability space (Ω, σ(X2), P2) (where P2 is theappropriate probability measure on (Ω, σ(X2)) ). In fact, F2 is the smallestinformation set that determines the entire history of Xt through t = 2; thatis, F2 = σ(X0, X1, X2). Similarly, σ(X3) is based on a coarser partition thanF3, namely

TTT , TTH, HTT, THT , HHT, HTH, THH, HHH,but F3 = σ(X0, X1, X2, X3).

It is worth emphasizing that Ft tells us the history up through t butdoes not support prophecy!

Example 25 Suppose we start with some fixed initial capital C0 and placebets on the outcomes of the coin flips in example 24, winning one unit ofcurrency each time a head turns up and losing one unit each time there isa tail. Then the process Ct = C0 + Xt is adapted to the filtration Ft,since we know the value of Ct and its past at each t; however, the processUt = C3 − Ct (the total amount yet to be won) is not adapted.

Does it make any sense to call Ft a “filtration”? Indeed, the sequenceof information sets really is like a series of filters. F0 catches just the coarsestpossible information that lets us distinguish between certainties and impos-sibilities but allows all the finer detail to pass through. The knowledge ofwhat happened on the first flip is caught by F1, yet F1 allows the still finerinformation about the second flip to pass, and so on. The fineness of infor-mation that resides in each Ft corresponds precisely to the fineness of thepartition of Ω from which Ft is constructed. It does makes sense, then, tothink of progressing through information sets as a filtering operation and todescribe the process Xtt∈T as being adapted to a filtration. Notice thatthe overall σ-field F in the underlying probability space (Ω,F , P) containsinformation that is at least as fine as the finest in Ft.

We often refer to a probability space endowed with a filtration as a“filtered probability space” and represent it as (Ω,F , Ft, P).

Probabilities and expectations conditioned on sub-σ-fields, as P(· | Ft)and E(· | Ft), play important roles in dealing with stochastic processes.The processes P(· | Ft) and E(· | Ft) record how our assessments ofprobabilities and expectations evolve over time. Thus, for an FT -measurablerandom variable XT we regard E(XT | Ft)0≤t≤T itself as a stochasticprocess that shows how the expectation progresses with new information.When applying these tools to real financial markets we must keep in mind

Page 104: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 87

that the nestedness of the filtration structure, Fs ⊆ Ft for s < t, impliesthat nothing is forgotten as time evolves. A joint implication of this featureand the tower property of conditional expectation, (2.50), is that

E [E(XT | Ft) | Fs] = E(XT | Fs), 0 ≤ s ≤ t ≤ T.

The practical interpretation is that the best guess now (time s) of what ourexpectation will be at t is our current expectation. For example, today’sbest forecast of the weather seven days from now is our best forecast oftomorrow’s forecast of the weather six days hence.

We will often use the following abbreviated notation for conditionalexpectation and conditional probability in the context of stochastic pro-cesses:

Et(·) ≡ E(·|Ft)

Pt(·) ≡ P(·|Ft).

This has the virtue of saving space, but it does require one to keep inmind that conditioning is on the entire history and not just on the currentstate. The more elaborate notation will be used when this fact needs specialemphasis and when space allows. When no conditioning is indicated, as E (·)and P (·), it is implicit that conditioning is on the trivial field ∅, Ω. Unlessotherwise specified, we shall always regard the initial σ-field F0 as trivial.This means that the initial value of a stochastic process is considered to bedeterministic. Thus, unless otherwise stated,

E(·) ≡ E0(·) ≡ E(·|F0)

P(·) ≡ P0(·) ≡ P(·|F0).

Two very important classes of stochastic processes are “martingales”and “Markov processes”. We will meet many examples of these, and mar-tingales in particular will be of central importance in the study of financialderivatives.

Martingales

An adapted process Xt on a filtered probability space (Ω,F , Ft, P) isa martingale if

1. E|Xt| < ∞ for all t ≥ 0 and2. EsXt = Xs for 0 ≤ s ≤ t.

Page 105: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

88 Pricing Derivative Securities (2nd Ed.)

The integrability requirement, property 1, just ensures that the condi-tional expectation in 2 always exists, since by the tower property E|Xt| ≡E0|Xt| = E0 (Es|Xt|). The really crucial feature, property 2, is called the“fair-game property”. Here is how it gets its name.

Example 26 Suppose one starts with a fixed amount X0 in capital andundertakes a sequence of fair bets at times t ∈ T ≡ 0, 1, 2, . . ., theoutcomes of which determine the capital available at t, Xt. Let Yt+1t∈Tbe i.i.d. random variables with zero mean that represent the payoffs per unitof currency of the bets placed at t, and let Ft be the σ-field generated byY0, Y1, Y2, . . . , Yt (where we define Y0 ≡ 0). The individual bets are fair inthe sense that their payoffs have zero expected value. Suppose a nonnegative,uniformly bounded amount bt, 0 ≤ bt ≤ B < ∞, is wagered at each t andthat bt ∈ Ft. The last means that bt is Ft-measurable and therefore that btis adapted to the filtration Ftt∈T . This allows the amount of the bet todepend in any way on the sequence of past gains and losses but rules out anyknowledge of the future. Since each bet placed at t produces a payoff btYt+1

at t+1, wealth evolves in the sequence of plays as Xt+1 = Xt+btYt+1, t ∈ T .Clearly, Xt itself is adapted to Ft, since the sizes and outcomes of allpast bets determine the capital at each stage. Also, the integrability of Ytand the boundedness of bt imply

E|Xt| ≤ X0 +t∑

j=1

BE|Yj | < ∞;

and the fact that the payoffs are independent with mean zero implies

EsXt = E(Xs + bsYs+1 + · · · + bt−1Yt|Fs)

= Xs + bsEsYs+1 + · · · + Es (bt−1Et−1Yt)

= Xs

for 0 ≤ s ≤ t. Thus, Xtt∈T is a discrete-time martingale. The messageis that so long as the outcome of each bet is fair, the game itself is fair,provided the amounts wagered are bounded and do not somehow anticipatefuture outcomes.

Example 27 If Z is an FT -measurable random variable with E |Z| < ∞and Xt = EtZ ≡ E(Z | Ft) for t ∈ [0, T ], then

E|Xt| = E |EtZ| ≤ E (Et|Z|) = E|Z| < ∞

Page 106: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 89

and

EsXt = Es(EtZ) = EsZ = Xs.

Thus, the conditional expectations process with respect to a filtration Ftis itself a martingale adapted to that filtration.

Example 28 Suppose c is a finite, F0-measurable constant. It is then obvi-ously adapted to Ft, and it clearly satisfies conditions 1 and 2. Such aconstant is therefore a martingale on any filtered probability space.

Submartingales and supermartingales are, respectively, favorable games(EsXt ≥ Xs) and unfavorable games (EsXt ≤ Xs). Thus, in reality thewealth processes of gambling houses are submartingales, while those of thepatrons are supermartingales. Notice that since the inequalities are weak,martingales are also sub- and supermartingales.

Stopping times (optional times, Markov times) are important conceptsin connection with both martingales and Markov processes. These areextended random variables τ : Ω → + ∪ +∞ for which events of theform τ ≤ t are Ft-measurable. As such, they model decisions that are madein connection with an adapted process Xt. For a gambling process aplan that decides from current and past conditions whether to terminateplay at each point is a stopping time. An example is the random variableτK = inft : Xt ≥ K, which is the decision to stop the first time wealthhits level K. At any time t one knows whether the event τ ≤ t has occurredby looking at the present and past of wealth, both of which are in the infor-mation set Ft. What the definition of a stopping time rules out is relianceon information about the future. For example, τ ′

K = supt ≤ T : Xt ≥ Krepresents the plan to stop the last time before T at which wealth is atleast K. This is not a stopping time, because one cannot know whether tostop at a given time without knowing the future.

In working with martingales stopping times can provide a means ofcontrolling the processes in some way that is advantageous to the purposeat hand. Localization is one such means of control. For example, letting

τn = inft : |Xt| ≥ n (2.58)

be a stopping time for a martingale Xt, then the stopped process Xt∧τnbecomes a (locally)-bounded martingale (adapted to Ft∧τn) that simplyterminates at the time it first hits ±n. If a process Xtt≥0 satisfies thefair-game property but not the integrability property, condition 1, it can

Page 107: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

90 Pricing Derivative Securities (2nd Ed.)

nevertheless be considered a local martingale if there is a sequence of stop-ping times τn ↑ +∞ such that Xt∧τn is (for each n) a martingale adaptedto Ft∧τn .29 The requirement that τn ↑ +∞ as n → ∞ just ensures that theprocess can be made to evolve stochastically for as long as desired by takingn sufficiently large. The stopping times that control Xt in this way aresaid to “reduce” the process. A local martingale Xt is also a martingaleproper if E sups≤t |Xs| < ∞ for each t ≥ 0.

One last variant on the martingale theme will be important in model-ing asset prices in continuous time. A “semimartingale” is a process thatwould be a local martingale if some other adapted process of finite vari-ation were extracted; in other words, it is a process that can be decom-posed into an adapted, finite-variation process and a local martingale.30

As a discrete-time example, the wealth process Xn of someone playinga sequence of fair games would not be a martingale if some extraneousincome process augmented wealth at each play, but Xn could still be asemimartingale.

An important result that explains much of martingales’ central role inmodern probability theory is the martingale convergence theorem:

Theorem 7 If Xt is a supermartingale with supt E|Xt| < ∞, then thelimit of Xt as t → ∞ exists a.s. and is finite.

In short, supermartingales (and therefore martingales) whose expected val-ues are bounded converge to either a finite constant or a random variable;that is, they neither diverge nor fluctuate indefinitely.31 We note that uni-form integrability of Xt implies supt E|Xt| < ∞. While the convergencetheorem is crucial for modern proofs of laws of large numbers and relatedasymptotic results, it is mostly the fair-game property of martingales thataccounts for their importance in finance.

29In a situation where the initial value X0 itself is not integrable (F0 now being nolonger the trivial σ-field), we would require Xt∧τn1(0,∞)(τn) for localization, wherethe indicator prevents an automatic stop at t = 0. An example is the wealth processof a gambler who gets to starts the play with a Cauchy-distributed initial stake, X0.E|Xt| < ∞ clearly fails here, yet it remains true (so long as the bets are boundedand the individual plays are fair) that EsXt = Xs for 0 ≤ s < t, so localization asXt∧τn1(0,∞)(τn) controls the process in the way we want.

30A “finite-variation” process is one whose sample paths have finite variation overfinite intervals.

31For a proof and applications see Williams (1991).

Page 108: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

Mathematical Preparation 91

Markov Processes

A Markov process is an adapted stochastic process in either discrete or con-tinuous time that is memoryless, in the sense that future behavior dependsonly on the present state and not on the past. Specifically, an adapted pro-cess Xt on a filtered probability space (Ω,F , Ft, P) is Markov if for anyBorel set B

Ps(Xt ∈ B) ≡ P(Xt ∈ B | Fs) = P (Xt ∈ B | σ(Xs)) ≡ P(Xt ∈ B | Xs).

In words, the probability of any event given the entire history to s—thatis, given Fs—is the same as the probability conditional on knowledge of Xs

alone.An example of a process that is not Markov is one that describes the

position over time of a massive object subject to unbalanced forces. In sucha case the future trajectory would depend on the current rate of changeas well as on the current position. The wealth process Xtt=1,2,... of thegambler who bets on a sequence of independent plays of a game of chanceis not Markov if amounts wagered at each play depend in any nontrivialway on the past sequence of outcomes. It is Markov if the bets depend oncurrent wealth only. Here are other examples of processes that do have theMarkov property.

Example 29 The first-order autoregressive process, Xt = ρXt−1 +Utt=1,2,..., where continuous random variables Ut are i.i.d. with zeromean, is a continuous-state, discrete-time Markov process.

Example 30 A branching process is a discrete-state, discrete-time Markovprocess that has been used to represent, among other things, the dynamicsof populations. Starting with an initial population of size P0 > 0, eachmember j ∈ 1, 2, . . . , P0 produces a random number Qj0 of offspringand then dies off. Together, these progeny constitute the next generationof P1 =

∑P0j=1 Qj0 individuals, who in turn bear offspring and die off. The

process continues either indefinitely or until some stage at which all indi-viduals remain childless. Thus, taking Q0n ≡ 0 to allow for the possibilityof extinction at generation n− 1, there are Pn =

∑Pn−1j=0 Qjn individuals in

the nth generation, where Qjnj∈1,2,...,Pn−1,n∈ℵ0 are i.i.d. and integer-valued.

Example 31 The Poisson process Ntt≥0 is a discrete-state, continuous-time Markov process that will be encountered often in succeeding chapters.

Page 109: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch02 FA

92 Pricing Derivative Securities (2nd Ed.)

It is defined by the conditions (i) N0 = 0, (ii) Nt − Ns independent ofNr −Nq for q < r ≤ s < t, and (iii) Nt+u −Nt distributed as Poisson withparameter θu; that is,

P(Nt+u − Nt = n) = e−θu(θu)n/n!, n ∈ ℵ0.

The process takes a Poisson-distributed number of unit jumps during a finiteperiod of time, the mean number being proportional to the elapsed time. Theindependence of increments in nonoverlapping intervals confers the Markovproperty.

Example 32 The Wiener or Brownian-motion process is a continuous-state, continuous-time Markov process with independent, normallydistributed increments. Whereas sample paths of the Poisson process arediscontinuous, those of Brownian motion are a.s. continuous. The stan-dard Wiener process Wtt≥0 is defined by the conditions (i) W0 = 0,(ii) Wt − Ws independent of Wr − Wq for q < r ≤ s < t, and (iii)Wt+u − Wt ∼ N(0, u).

Diffusion processes are continuous-time Markov processes with contin-uous sample paths. The Wiener process is an example. We will see shortlythat the more general Ito processes—some of which have the Markov prop-erty and some of which do not—can be built up from the Wiener process.These continuous processes can be combined in various ways with Poissonprocesses to model phenomena that are subject to unpredictable, possiblydiscontinuous changes. To use and construct such models requires at least abasic understanding of stochastic calculus, which is the subject of chapter 3.

Page 110: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

3Tools for Continuous-Time Models

This chapter adds to the tools of analysis and probability theory presentedin chapter 2 the concepts needed to value derivatives on assets whoseprices evolve in continuous time. We begin by surveying the key proper-ties of Brownian motions, which are the most basic building blocks forcontinuous-time models of asset prices. Sections 3.2–3.4 then develop thetools of stochastic calculus that permit us to work with such erratic pro-cesses. The final section of the chapter is devoted to processes whose samplepaths may be discontinuous.1

3.1 Wiener Processes

3.1.1 Definition and Background

Wiener processes are continuous-time, continuous-state-space versions ofthe random walk. A celebrated early application was by Einstein (1905) indescribing the phenomenon of molecular vibration, by which he explainedthe incessant, random-seeming motion of microscopic particles observedabout 80 years earlier by botanist Robert Brown. Louis Bachelier’s appli-cation to speculative prices in a 1900 doctoral thesis (published in English

1Sources for the technical detail that underlies the material in this chapter includeArnold (1974), Chung and Williams (1983), Durrett (1984, 1996), Elliott (1982),Elliott and Kopp (1999), Karatzas and Shreve (1991), Shreve (2004), and Musiela andRutkowski (2005). Baxter and Rennie (1996) offer a highly readable and intuitive treat-ment of Girsanov’s theorem (section 3.4.1) and stochastic calculus generally. Karlin and

Taylor (1975) give a more technical but non-measure-theoretic account of Brownianmotion in particular. Protter (1990) provides a rigorous treatment of stochastic calculusfor discontinuous processes.

93

Page 111: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

94 Pricing Derivative Securities (2nd Ed.)

in Cootner (1964)) predated Einstein’s work. The appellation “Wiener”honors the American probabilist Norbert Wiener, a pioneer in the mathe-matical analysis of these processes. In current literature the terms Wienerprocess and Brownian motion are used interchangeably.

The standard Wiener process was defined in chapter 2 as a stochasticprocess Wtt≥0 having the following properties: (i) W0 = 0; (ii) Wt − Ws

independent of Wr − Wq for q < r ≤ s < t; and (iii) Wt+u −Wt ∼ N(0, u).The process thus has time-stationary and independent increments. It isimportant to note that u is the variance of Wt+u − Wt. The choice ofzero for initial value is just a matter of convention, whereas the inde-pendence and normality of increments are crucial features. An implica-tion of (iii) that we will often put to use is that Wt+u − Wt = Z

√u,

where Z ∼ N(0, 1).2 Also, (ii) and (iii) together imply that incrementsWtj − Wtj−1n

j=1 over nonoverlapping intervals [tj−1, tj)nj=1 are dis-

tributed as Zj√

tj − tj−1nj=1, where the Zj are independent standard

normals.The definition of the Wiener process extends to higher dimension k

simply by regarding Wt as a vector in (i) and (ii) and writing (iii) as (iii′)Wt+u−Wt ∼ N(0, uIk), where Ik is the identity matrix. The k-dimensionalstandard form generalizes to allow mean vector proportional to u and anarbitrary positive-definite covariance matrix in place of Ik. However, themore general processes we need can be built from the standard form alone,and for now we consider just the one-dimensional case.

Proofs that processes meeting conditions (i)-(iii) do exist can be foundin many texts on stochastic processes; e.g., Durrett (1984) and Revuz andYor (1991). While the conditions are thus known to be mutually consistent,they do give rise to some strange properties. These will be the main sourceof distinction between stochastic calculus and deterministic calculus.

3.1.2 Essential Properties

With (Ω,F , Ft, P) as a filtered probability space, we consider a Brownianmotion adapted to Ftt≥0. Here are the main features.

1. Wiener processes are Markov. Although there are delicate technicalissues here3 the independence of increments makes this easy to believe.

2Expressions such as Wt = Z√

t and Wt ∼ Z√

t should be interpreted as “equal indistribution”.

3For these see Durrett (1984, section 1.2).

Page 112: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

Tools for Continuous-Time Models 95

For example, one gets the idea in a simple way by calculating the con-ditional characteristic function of Wt. For any ζ ∈ and s ≤ t we have

EseiζWt = E[eiζWseiζ(Wt−Ws) | Fs]

= eiζWsE[eiζ(Wt−Ws) | Fs]

= eiζWs−ζ2(t−s)/2

= E(eiζWt |Ws).

The uniqueness theorem for c.f.s then implies that P(Wt ∈ B | Fs) =P(Wt ∈ B | Ws) for any Borel set B.

2. Standard Wiener processes are martingales. First, Wt is integrable, since

E|Wt| = E|Z|√

t < ∞∀t,

where Z ∼ N(0, 1). Next, Wt has the fair-game property:

Es(Wt) = Es[Ws + (Wt − Ws)] = Ws, s ≤ t.

3. E(WsWt) = E(Ws · EsWt) = EW 2s = s, s ≤ t.

4. Sample paths Wt(ω)t≥0 are a.s. continuous. That is, the occurrence ofa discontinuity or jump anywhere in the realization of a Brownian pathis an event of zero probability. In symbols, if

D = ω : Wt(ω)t≥0 is not continuous

then P(D) = 0. A theorem due to Kolmogorov4 states that sample pathsof a stochastic process Xtt≥0 are a.s. continuous if there are positivenumbers a, b, c such that

E|Xt − Xs|a ≤ c(t − s)1+b

for each s < t. Taking a = 4 gives E(Wt−Ws)4 = EZ4(t−s)2 = 3(t−s)2,thus satisfying the requirement.

5. Although they are continuous, sample paths of Brownian motion are a.s.not differentiable anywhere. The following, although not a proof of thestatement, gets across the idea. Let ∆n ≡ n(Wt+1/n −Wt) ∼ N(0, n) bethe average rate of change of W over [t, t + 1/n). For K > 0, P(|∆n| >

K) = P(|Z| > Kn−1/2) → 1 as n → ∞, and since K is arbitrary itfollows that |∆n| → ∞ in probability. This nondifferentiability propertyhas an important implication for models of asset prices built up from

4For the proof see Karatzas and Shreve (1991, pp. 53–55). Strictly, what is proved isthat there is a continuous modification of Xt; that is, an a.s. continuous process X′

tsuch that P(Xt = X′

t) for each t ≥ 0.

Page 113: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

96 Pricing Derivative Securities (2nd Ed.)

Brownian motions: one cannot predict an asset’s future value even atthe next instant by looking at the current value and the change in theimmediate past.

6. Sample paths of Brownian motion are of unbounded variation. Thisimplies that the process covers an infinite distance as it evolves overany time interval, no matter how short! The following argument showsspecifically that W has infinite variation over [0, 1] (and extends to anyinterval through an affine transformation of the time scale). Partitioning[0, 1] into subintervals of length 1/n, let ∆kn ≡ Wk/n −W(k−1)/n be thenet change over the kth of these, and let V

[0,1]n W ≡

∑nk=1 |∆kn| be the

total variation over the discrete partition. Then V[0,1]W = supn V[0,1]n W .

Since the ∆knnk=1 are i.i.d. as N(0, n−1) we have

E|∆kn| = n−1/2√

2/π,

V |∆kn| = n−1(1 − 2/π),

and therefore

EV[0,1]n W =

√2n/π,

V V[0,1]n W = (1 − 2/π).

Finally, by the central limit theorem

Zn ≡ (V V[0,1]n W )−1/2(V[0,1]

n W − EV[0,1]n W ) N(0, 1),

so that for any K > 0

P(V[0,1]n W > K) = P

(Zn >

K −√

2n/π√1 − 2/π

)→ P(Z > −∞) = 1

as n → ∞.7. The net change in Brownian motion over an interval cannot be repre-

sented as an integral with respect to Lebesgue measure; that is, thereis no density w such that Wt =

∫ t

0ws · ds. For such a representation to

exist it is necessary and sufficient that W be an absolutely continuousfunction of t. That is, for any ε > 0 there must exist δ > 0 such that foreach finite k

k∑j=1

|tj − tj−1| ≤ δ ⇒k∑

j=1

|Wtj − Wtj−1 | < ε.

However, since W is of unbounded variation the sum on the right canbe made arbitrarily large by increasing k and partititioning any given

Page 114: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

Tools for Continuous-Time Models 97

interval of length δ more and more finely. Therefore, W is not absolutelycontinuous and there can be no conventional integral representation interms of a density function.

3.2 Ito Integrals and Processes

These properties of Brownian motion indicate that the conventional con-cepts of integration do not extend to stochastic integrals with respect toWt. Since sample paths of Brownian motion are continuous, an expressionsuch as

∫b(t) · dWt could not have meaning as a countable sum, as for

a Stieltjes integral with respect to a step function. But since Wt is notabsolutely continuous, neither could there be the other interpretation as aLebesgue integral of b times a density with respect to Lebesgue measure.A theory of stochastic integration that does provide a useful interpreta-tion of

∫b(t) · dWt was developed during the 1940s and early 1950s by the

Japanese probabilist Kiyosi Ito. Ito integrals and the processes constructedfrom them have since become essential tools in financial modeling.

3.2.1 A Motivating Example

What accounts for the importance of stochastic integrals in finance? Recallexample 26 on page 88, which showed that wealth after a sequence of fairbets is a discrete-time martingale. Beginning with a known stake X0, bt

is bet on the tth play and each unit of currency that is wagered returnsYt+1 at t + 1, so that wealth evolves as Xt+1 = Xt + btYt+1. Moving fromthe casino to a financial market, we can transform the sequence of gamesinto a sequence of investments at times t0, t1, . . . , tn−1, where tj = j∆t andn∆t ≡ T. Now think of btj as the number of units of an asset held at tjor, more aptly, as a vector of units of various assets—which is to say, aportfolio; and let Ytj+1 = ∆Stj ≡ Stj+1 − Stj be the corresponding vectorof price changes. After n = T/∆t periods the total change in wealth is5

XT − X0 =n−1∑j=0

btj ∆Stj .

5In this example Xt is, in general, no longer a martingale in the measure P that

governs the actual behavior of prices, since expected price changes are not generallyequal to zero. However, we will see later that there is an equivalent measure under whichthe martingale property does hold when Xt is appropriately normalized.

Page 115: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

98 Pricing Derivative Securities (2nd Ed.)

Since the dynamics of self-financing portfolios that replicate derivativeswill be of just this form, the relevance of expressions like this becomesobvious. And when we move to continuous time and allow portfolio weightsand prices to vary continuously, such expressions turn into integrals withrespect to the stochastic process St:

XT − X0 =∫ T

0

bt · dSt. (3.1)

Although Brownian motion per se is not a good model for the price of afinancial asset, it is necessary to define integration with respect to Wienerprocesses in order to develop integrators that are good models. We firstundertake to find a meaningful construction of integrals

∫ t

0 bs ·dWs for suit-able processes bt, then develop plausible models for asset prices in termsof such integrals. Before beginning, however, it is worthwhile to observe thatfinding integrals of Brownian motions, or of suitable functions of Brownianmotions, requires nothing new. Since Brownian paths Wt(ω)t≥0 are a.s.continuous, expressions (random variables) like

∫ t

0Ws·ds or

∫ t

0b(Ws)·ds can

be interpreted, path by path, as ordinary Riemann (or Lebesgue) integrals,provided that b(·) is itself Riemann (or Lebesgue) integrable. The followingexample provides a specific result that will be needed later in applications.

Example 33 Let us derive the distribution of∫ t

0 Ws ·ds from its Riemannconstruction, as limn→∞ tn−1

∑nk=1 Wkt/n. Note first that

Wkt/n ≡k∑

j=1

(Wtj/n − Wt(j−1)/n) ∼√

t/n

k∑j=1

Zj ,

where “∼” signifies “equal in distribution” and where the Zj are i.i.d. asN(0, 1). Therefore,

n∑k=1

Wkt/n ∼√

t/n[nZ1 + (n − 1)Z2 + · · · + Zn]

∼ N

(0,

t

n

n∑k=1

k2

)= N [0, t(n + 1)(2n + 1)/6].

Page 116: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

Tools for Continuous-Time Models 99

The characteristic function of∫ t

0 Ws · ds is

E exp(

∫ t

0

Ws · ds

)= E exp

[iζ lim

n→∞t

n

n∑k=1

Wkt/n

]

= limn→∞

E exp

[iζ

t

n

n∑k=1

Wkt/n

]

= limn→∞

exp[−ζ2t2

2n2· t(n + 1)(2n + 1)

6

]= e−ζ2t3/6,

where the second step follows by dominated convergence. From the unique-ness theorem for c.f.s (section 2.2.3) we conclude that∫ t

0

Ws · ds ∼ N(0, t3/3).

3.2.2 Integrals with Respect to Brownian Motions

Defining integrals of adapted processes with respect to Brownian motion onsome finite interval [0, t] proceeds in steps similar to those for the Lebesgueintegral, working from restricted to successively more general forms ofintegrands. At each stage the restrictions will guarantee that the processXtt≥0, with Xt ≡ X0+

∫ t

0 bs ·dWs viewed as a function of the upper limit,is at least locally a martingale. Henceforth, the integrands bt we considerare processes adapted to the same filtration Ftt≥0 as the Wiener processWt. With such integrands the process Xt thus defined belongs to theclass of Ito processes. In studying these processes we will always assumethat X0 ∈ F0 and that F0 is the trivial field, ∅, Ω.

Defining the Integral

Stage one begins with “simple” adapted processes. The process bnt,t0≤t

is simple if each sample path is constant (and bounded) on intervals. Thus,bnt,t is simple adapted if there exists for each t > 0 numbers 0 = t0 <

t1 < ... < tnt−1 < tnt = t, a finite constant K, and random variablesb0, b1, . . . , bnt−1 with |bj | ≤ K a.s. and bj ∈ Ftj for each j such that

bnt,s =

b0, s = 0∑nt

j=1 bj−11(tj−1,tj](s), 0 < s ≤ t.

Page 117: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

100 Pricing Derivative Securities (2nd Ed.)

For these simple processes the Ito integral∫ t

0 bnt,s · dWs is defined as

Xnt,t − X0 ≡∫ t

0

bnt,s · dWs =nt∑

j=1

bj−1(Wtj − Wtj−1 ). (3.2)

In words, the realization of the integral for outcome ω is the weighted sumof the (constant) values of b on the intervals, where the weights are therespective (positive or negative) increments to the Brownian motion. TheX0 in (3.2) is simply an arbitrary F0-measurable initial value.

Stage two extends to adapted processes bt that are well-enoughbehaved that E

∫ t

0 b2s · ds < ∞. For this it is first established that any

such process can be attained as the limit of a sequence of simple pro-cesses bnt,snt∈ℵ, in the sense that E

∫ t

0 (bnt,s − bs)2 · ds → 0 as nt → ∞.Then Xt = X0 +

∫ t

0 bs · dWs is defined as a random variable to which thesequence Xnt,t converges in mean square; that is, the variable such thatlimnt→∞ E(Xt − Xnt,t)

2 = 0. That this is a good definition follows fromthe fact that a random variable X

′t constructed from some other sequence

of simple adapted processes b′n′

t,s equals Xt a.s. (For proofs of these facts

consult the technical references listed in the introduction to this chapter.)A final extension is to adapted processes for which

∫ t

0b2s · ds < ∞ a.s.

This is a weaker condition than that in step two because E∫ t

0b2s · ds < ∞

implies∫ t

0b2s ·ds < ∞ but not conversely. The extension is made by defining

stopping times τK that produce a bounded version of the process, for whichthe expected value of

∫ t

0 b2s ·ds is necessarily finite, and then taking limits of

the resulting stochastic integrals of the bounded processes. Thus, for someK > 0 set τK ≡ inft :

∫ t

0 b2s · ds ≥ K, with τK ≡ +∞ if

∫∞0 b2

s · ds < K.This is a stopping time since the fact that bt is Ft-adapted means thatthe value of

∫ t

0 b2s ·ds is known at time t. Also, being bounded, the truncated

process bK,t ≡ bt1[0,τK)(t)t≥0 clearly satisfies E∫ t

0 b2K,s · ds < ∞. As a

result the Ito integral, XK,t −X0 =∫ t

0bK,s ·dWs, say, is defined for each K

and each t > 0. Since τK → ∞ a.s. as K → ∞, Xt − X0 =∫ t

0bs · dWs can

be defined as the random variable to which∫ t

0 (Xs − XK,s)2 · ds convergesin probability. Again, this random variable can be shown to be essentiallyunique.

Here are three examples that show how Ito integrals can be constructedfrom the definition. The first two examples illustrate the fact that the Itoand Riemann-Stieltjes constructions yield the same result when bt is aprocess of finite variation.

Page 118: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

Tools for Continuous-Time Models 101

Example 34 Taking bt = 1, which is already a simple process, (3.2) givesXt − X0 =

∫ t

0 dWs as Wt − W0 = Wt—the same as in the Stieltjes con-struction.

Example 35 Taking bt = t, let us evaluate Xt − X0 =∫ t

0s · dWs. First,

partition (0, t] as ∪nj=1(tj−1, tj] with tj = jt/n, and create the simple process

bn,s ≡ ∑n

j=1 tj−11(tj−1,tj](s)0<s≤t. The approximating sum is then

n∑j=1

tj−1(Wtj − Wtj−1 ) =t

n

n∑j=1

(j − 1)(Wtj − Wtj−1 )

=t

n[−Wt1 − · · · − Wtn−1 + (n − 1)Wt]

= −n∑

j=1

Wtj−1 (tj − tj−1) +n − 1

ntWt.

Taking limits path by path gives∫ t

0

s · dWs = tWt −∫ t

0

Ws · ds.

Again, this stochastic counterpart to the integration-by-parts formula is thesame as for an ordinary Stieltjes integral.

Example 36 Taking bt = Wt, let us evaluate Xt −X0 ≡∫ t

0Ws ·dWs. This

differs from the two previous examples in that both integrand and integra-tor have unbounded variation. Were Wt an ordinary absolutely continuousfunction, the Stieltjes construction would give W 2

s /2|t0 = W 2t /2, but this

time the result will be different because the noise in the integrand and inte-grator interact. Again partitioning (0, t] as ∪n

j=1(tj−1, tj ] with tj = jt/n

and forming the simple process

bn,s ≡n∑

j=1

Wtj−11(tj−1,tj ](s),

the Ito integral will emerge as the random variable to which

Xt − X0 ≡n∑

j=1

Wtj−1 (Wtj − Wtj−1 )

Page 119: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

102 Pricing Derivative Securities (2nd Ed.)

converges in probability. To find this limit, write the summation as

n∑j=1

Wtj−1Wtj −n∑

j=1

W 2tj−1

=n∑

j=1

Wtj−1Wtj −12

n∑j=1

W 2tj−1

+n∑

j=1

W 2tj− W 2

t

=

12

W 2t −

n∑j=1

(Wtj − Wtj−1

)2

∼ 12

W 2t − t

n

n∑j=1

Z2j

,

where the Zj are i.i.d. as N(0, 1). Since 1n

∑nj=1 Z2

j converges a.s. toEZ2 = 1 by the strong law of large numbers, we conclude that∫ t

0

Ws · dWs = W 2t /2 − t/2. (3.3)

In the second example it was not crucial to evaluate the integrand at tj−1

in the approximating sum for∫ t

0 s·dWs, because the integral would have thesame value for any sequence of simple adapted processes that converged tothe integrand process, t. In the last example, however, it is crucial thatthe integrand be evaluated at tj−1 rather than at some t∗j ∈ (tj−1, tj ]. Thisis so because the effect of interaction between Wtj −Wtj−1 and the changein the integrand over (tj−1, t

∗j ], Wtj − Wt∗j , does not become negligible as

tj − tj−1 approaches zero. To appreciate this point, the reader can showthat taking bn,s =

∑nj=1 Wtj 1(tj−1,tj](s) results in

n∑j=1

Wtj (Wtj − Wtj−1 ) →P W 2t /2 + t/2.

This feature shows that the Ito and Riemann-Stieltjes constructions dodiffer fundamentally, despite the superficial similarity.

Properties of the Integral

It is convenient to characterize the Ito integral in terms of the stochasticprocess Xt = X0 +

∫ t

0 bs · dWs that it generates. As usual, we assume thatX0 is deterministic. The step-wise development of the integral from simpleintegrands makes it easy to determine the key properties of this process.

Page 120: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

Tools for Continuous-Time Models 103

1. Xtt≥0 is a continuous process. This is a simple consequence of thecontinuity of Wtt≥0, since

limn→∞

|Xt+1/n − Xt| = limn→∞

|bt(Wt+1/n − Wt)| = 0.

2. Xtt≥0 has the fair-game property. With Xnt,t as in (3.2) and s ∈(t, t+1], we have

Xnt,t − Xns,s = b(Wt+1 − Ws) +nt−1∑

j=+2

bj−1(Wtj − Wtj−1 )

+ bnt−1(Wt − Wtnt−1).

Taking expectations and using the tower property of conditional expec-tation and the Ftj−1 -measurability of bj−1 give Es(Xnt,t − Xns,s) = 0,showing that EsXt = Xs. Likewise, E0Xt = X0.

3. Xtt≥0 is a continuous martingale when E∫ T

0b2t · dt < ∞. The fair-

game property having been demonstrated, it remains just to verify thatE|Xt| < ∞. Since E|Xt| ≤ E|Xt − X0| + E|X0| = E|Xt − X0| + |X0|,we need show only that E|Xt − X0| < ∞. Working again from theapproximating simple process, write (Xn,t − X0)2 as

nt−1∑j=1

b2j−1(Wtj − Wtj−1 )

2

+ 2nt−1∑j=2

j−1∑k=1

[bj−1(Wtj − Wtj−1 )bk−1(Wtk− Wtk−1)]. (3.4)

Taking expectations, apply the tower property by conditioning on Ftj−1

in each summand of the first term and conditioning on Ftk−1 in eachsummand of the second term. Each of the latter summands then van-ishes, leaving

E(Xnt,t − X0)2 = E

nt−1∑j=1

b2j−1(tj − tj−1),

which, on taking limits, gives

E(Xnt,t − X0)2 → E

∫ t

0

b2s · ds < ∞. (3.5)

Finally, E|Xt − X0| < ∞ follows from Jensen’s inequality.4. Xtt≥0 is a continuous local martingale when

∫ t

0 b2s ·ds < ∞ but E

∫ t

0 b2s ·

dt = ∞. In this case Xt is reduced by the stopping times τK = inft :|Xt| ≥ K.

Page 121: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

104 Pricing Derivative Securities (2nd Ed.)

5. If X0 = 0 the first two moments of Xt =∫ t

0 bs · dWs are EXt = 0 andV Xt = EX2

t = E∫ t

0 b2s · ds when E

∫ T

0 b2t · dt < ∞. The first of these

comes directly from the fair-game property, and the second follows inthe same way as (3.5).

6. Xt = X0 +∫ t

0bs ·dWs ∼ N(X0,

∫ t

0b2s ·ds) when bss≤t ∈ F0. When the

integrand process is F0-measurable, as in example 35, normality followsfrom the construction of the integral as a linear function of incrementsof Brownian motion.

Generalization to Higher Dimensions

Taking Wt to be a k-dimensional Brownian motion and bt to be anm× k adapted matrix process leads to a multidimensional extension of theintegral that makes Xt = X0+

∫ t

0 bs ·dWs an m-vector-valued process. Theconstruction proceeds in the same way, from (i) simple processes bt to (ii)processes with E

∫ t

0bsb

′s · ds < ∞ to (iii) processes with

∫ t

0bsb

′s · ds < ∞

a.s. Members of Xt are martingales under the condition (ii), and whenbt is nonstochastic the increments are distributed as multivariate normal:Xt − X0 ∼ N(0,

∫ t

0bsb′

s · ds) .

3.2.3 Ito Processes

Taking bt ≡ 1 as in example 34 gives Xt−X0 =∫ t

0bs ·dWs = Wt, which we

know to be distributed as N(0, t). As we have seen from the other exam-ples, more interesting integrands produce processes Xt that are them-selves more general, having variances E

∫ t

0 b2s ·ds (when these are finite) and

increments that are no longer necessarily normally distributed. In addition,such processes are no longer necessarily Markov since btt≥0 may dependon the entire past of Wtt≥0 as, for example, bt = sup0≤s≤t |Ws|. Onemajor restriction remains, however. By the fair-game property of the inte-gral a process of the form Xt = X0 +

∫ t

0bs · dWs drifts aimlessly and thus

lacks the tendency to upward movement that prices of financial assets usu-ally display over long periods. “Ito processes” are defined more generallyto allow for such systematic trend.

Definition and Examples

Let att≥0 be an adapted process with∫ t

0 |as| · ds < ∞ almost surely foreach finite t and (as heretofore) let btt≥0 be adapted with

∫ t

0 b2s · ds < ∞

Page 122: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

Tools for Continuous-Time Models 105

a.s. for each t < ∞. Then the processXt = X0 +

∫ t

0

as · ds +∫ t

0

bs · dWs

t≥0

with X0 ∈ F0 is called an “Ito process”. Notice that∫ t

0 as · ds is not astochastic integral but is interpreted pathwise in the Lebesgue sense. Also,note that the definition of the integral implies that the process

∫ t

0 as·dst≥0

is one of finite variation.6 Clearly, unless at = 0 for (almost) all t the processXt lacks the fair-game property and is not a martingale. However, sinceXt does have a local martingale component if

∫ t

0b2s · ds < ∞ for each t,

and since ∫ t

0as · ds is a continuous, adapted process of finite variation,

Xtt≥0 remains a continuous semimartingale.The Ito class is rich enough to provide reasonable models for prices of

many commodities and financial assets. Moreover, as we shall see, semi-martingales in this class can serve as integrators in other stochastic inte-grals, which in turn produce even richer models. The following examples—all continuous Markov (i.e., diffusion) processes—have been widely used infinancial modeling.

Example 37 Taking at = µXt and bt = σXt in the Ito integral, where µ

and σ > 0 are constants, gives the process known as “geometric Brownianmotion”:

Xt = X0 +∫ t

0

µXs · ds +∫ t

0

σXs · dWs

t≥0

. (3.6)

This is the process on which the Black-Scholes model for European options isbased, and it is still the benchmark continuous-time model for prices of manyfinancial assets. The parameter µ is called the (proportional) mean drift,and σ is the volatility parameter. Notice that the change in the process overan interval [t, t + ∆t), namely Xt+∆t − Xt =

∫ t+∆t

tµXs · ds +

∫ t+∆t

tσXs ·

dWs, damps out as the initial value Xt approaches zero. A process thatstarts off at X0 > 0 thus remains nonnegative, as is consistent with thelimited-liability feature of most financial assets. Also, the variance of therelative change in the process over [t, t+∆t), Xt+∆t/Xt−1, is approximately

6Under the integrability condition the integral process R t0

as · dst≥0 is a.s. of finitevariation even if the integrand process att≥0 is not. Recall that the definition of the

Lebesgue integral presentsR t0

as · ds asR t0

a+s · ds − R t

0(−a−

s ) · ds, where a+s = as ∨ 0

and a−s = as ∧ 0. Thus, on almost all sample paths the integral process is the difference

between two nondecreasing processes.

Page 123: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

106 Pricing Derivative Securities (2nd Ed.)

invariant to the initial price level. This implies that holding period returnsover short periods—ratios of terminal to initial prices—are approximatelystationary.

A slight variation on the theme is to allow mean drift and volatility tobe time-varying yet still F0-measurable (predictable as of t = 0).

Example 38 Taking at = (µ− νXt), where ν > 0, gives a mean-revertingor “Ornstein-Uhlenbeck” process:

Xt = X0 +∫ t

0

(µ − νXs) · ds +∫ t

0

bs · dWs

t≥0

. (3.7)

This has negative mean drift when Xt > µ/ν and positive mean drift whenXt < µ/ν, so that the process is attracted to µ/ν with strength proportionalto ν. The bt process can take various forms. Mean-reverting processeshave been used extensively in modeling interest rates, which do typicallyfluctuate about some long-term average value. We encounter mean-revertingprocesses first in chapter 8 in connection with processes whose volatilitycoefficients are themselves mean-reverting diffusions.

Example 39 Taking bt = σXγt for 0 ≤ γ < 1 gives a “constant-elasticity-

of-variance” (c.e.v.) process:Xt = X0 +

∫ t

0

as · ds +∫ t

0

σXγs · dWs

t≥0

. (3.8)

When γ = 1/2 this is commonly called a “square-root process”. The chiefcontrast with geometric Brownian motion is that the variance of the relativechange in the process over [t, t + ∆t), Xt+∆t/Xt − 1, is now decreasing inthe initial value. This corresponds to the empirical observation that returnsof high-priced stocks tend to be less variable than those of low-priced stocks.Mean-reverting square-root processes,

Xt = X0 +∫ t

0

(µ − νXs) · ds +∫ t

0

σX1/2s · dWs, (3.9)

have been important models for interest rates, as in Cox, Ingersoll, andRoss (1985); for stochastic volatility models, as in Heston (1993); and inmodeling prices of securities subject to default risk. These are referred tovariously as “Feller processes” after Feller (1951) and as “CIR” processesafter Cox et al. (1985).

Page 124: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

Tools for Continuous-Time Models 107

The Quadratic-Variation Process

The quadratic variation of a semimartingale Xtt≥0 over [0, T ] is the prob-ability limit as n → ∞ of

∑nj=1(Xtj − Xtj−1)2, where t0 = 0, tn = T , and

maxj |tj−tj−1| → 0 as n → ∞. The process that tracks the quadratic varia-tion over [0, t] for 0 ≤ t ≤ T is called the “(predictable) quadratic-variationprocess” and is denoted 〈X〉t. Of course, 〈X〉0 = 0. We have seen that Itoprocess Xt = X0 +

∫ t

0as ·ds+

∫ t

0bs ·dWs0≤t≤T is a semimartingale when∫ T

0b2s ·ds < ∞. We will need to determine its quadratic variation process in

order to develop Ito’s formula for the change over time in a smooth functionof Xt—expression (3.15) below.

To start with the simplest case, the quadratic variation of Brownianmotion over [0, T ] is

〈W 〉T = limn→∞

n∑j=1

(Wtj − Wtj−1 )2 = T · lim

n→∞n−1

n∑j=1

Z2j = T,

where (i) we set tj = jT/n for j ∈ 0, 1, . . . , n, (ii) the Zj are i.i.d. asN(0, 1), and (iii) convergence is with probability one by the strong law oflarge numbers. Likewise 〈W 〉t = t, so that the quadratic variation process〈W 〉t0≤t≤T for Brownian motion is perfectly predictable. More generally,for an Ito process Xt0≤t≤T with Xt = X0 +

∫ t

0as · ds +

∫ t

0bs · dWs we

will see that the quadratic variation over [0, T ] is 〈X〉T =∫ T

0b2t · dt, the

salient fact being that the drift term makes no contribution. To make thisplausible without belaboring the details, we prove it for the special casethat at and bt are a.s. bounded processes.7

Approximating Xtj −Xtj−1 =∫ tj

tj−1as · ds +

∫ tj

tj−1bs · dWs as atj−1(tj −

tj−1)+ btj−1(Wtj −Wtj−1), the quadratic variation of Xt on the grid tjis approximately

n∑j=1

a2tj−1

(tj − tj−1)2 + 2n∑

j=1

atj−1btj−1(tj − tj−1)(Wtj − Wtj−1 )

+n∑

j=1

b2tj−1

(Wtj − Wtj−1)2. (3.10)

Again setting tj = jT/n, the first term equals Tn−1∑n

j=1 a2tj−1

(tj − tj−1),

which converges a.s. to zero since∑n

j=1 a2tj−1

(tj − tj−1) →∫ T

0a2

t · dt < ∞

7For the general case see Durrett (1996, chapter 2).

Page 125: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

108 Pricing Derivative Securities (2nd Ed.)

as n → ∞ when at is bounded. Squaring the second term and apply-ing the Schwarz inequality, (2.41), give an upper bound proportional ton−1

∑nj=1 a2

tj−1b2tj−1

(tj − tj−1) ·∑n

j=1(Wtj − Wtj−1 )2. The first sum con-

verges to∫ T

0a2

t b2t · dt < ∞ and the second converges to T , so this term also

converges a.s. to zero. This confirms that 〈X〉T does not depend on at.The last term of (3.10) is where the action is. Writing it as

n∑j=1

b2tj−1

(tj − tj−1) +n∑

j=1

b2tj−1

(tj − tj−1)(Z2j − 1),

where the Zj are i.i.d. as N(0, 1), we see that the first sum converges to∫ T

0 b2t · dt. Letting Jn represent the second sum and setting J0 = 0, we see

that Jn is a discrete-time martingale since

Etn−1Jn = Jn−1 + b2tn−1

(tn − tn−1)Etn−1(Z2n − 1) = Jn−1.

Moreover, supn E|Jn| = E|Z2−1|∫ T

0Eb2

t ·dt < ∞, so theorem 7 (martingaleconvergence) implies that Jn converges a.s. to some finite quantity. Usingthe tower property of conditional expectation in the manner of (3.4) gives

EJ2n =

n∑j=1

Eb4tj−1

(tj − tj−1) · 2T/n → 0,

showing that Jn →a.s. 0 and therefore that 〈X〉T =∫ T

0 b2t · dt.

Integration with Respect to Ito Processes

Given an Ito process Xt = X0 +∫ t

0 as · ds +∫ t

0 bs · dWs and a processct adapted to the filtration Ft, assume that

∫ t

0 |ascs| · ds < ∞ and∫ t

0 b2sc

2s · ds < ∞ a.s. for each finite t. Then∫ t

0

cs · dXs =∫ t

0

csas · ds +∫ t

0

csbs · dWs.

With Y0 ∈ F0 the process Yt with

Yt = Y0 +∫ t

0

cs · dXs = Y0 +∫ t

0

csas · ds +∫ t

0

csbs · dWs

is therefore just another Ito process, and therefore another semimartingale.Generalizing to the multidimensional case, let Xt = X0 +

∫ t

0 as ·ds +

∫ t

0 bs · dWs be an m-vector process, where Wt is a k-dimensionalBrownian motion and at and bt are m-vector and m × k-matrix

Page 126: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

Tools for Continuous-Time Models 109

adapted processes, respectively. If ct is an × m adapted process with∫ t

0 |csas| · ds < ∞ and∫ t

0 (csbs)(b′sc

′s) · ds < ∞ for each finite t, then

Yt = Y0 +∫ t

0

csas · ds +∫ t

0

csbs · dWs

defines the -vector process Yt = Y0 +∫ t

0 cs · dXs.

Stochastic Differentials

The integral expression Xt − X0 =∫ t

0 as · ds +∫ t

0 bs · dWs that defines theIto process can be written in a compact differential form as

dXt = at · dt + bt · dWt. (3.11)

Although this is just a shorthand, it has the useful intuitive interpretationas the instantaneous change in the process at t—that is, the change over[t, t + dt). Differential forms of (3.6), (3.7), and (3.8) are, respectively,

dXt = µXt · dt + σXt · dWt (3.12)

dXt = (µ − νXt) · dt + bt · dWt

dXt = at · dt + σXγt · dWt.

The differential form of geometric Brownian motion is often written as

dXt/Xt = µ · dt + σ · dWt. (3.13)

In the special case that at and bt depend just on time and the currentvalue of the process, as at = a(t, Xt), bt = b(t, Xt), (3.11) represents astochastic differential equation (s.d.e.). The solution, up to an arbitraryF0-measurable X0, is the process Xt that satisfies8

Xt − X0 =∫ t

0

a(s, Xs) · ds +∫ t

0

b(s, Xs) · dWs.

The solution to (3.12) can be characterized explicitly by applying Ito’sformula, which we now develop.

8Conditions on a(·, ·) and b(·, ·) such that the s.d.e. has a unique solution are given by

Karatzas and Shreve (1991, pp. 286–291). Examples of s.d.e.s that meet these conditionsare (3.6) and (3.7). Duffie (1996, pp. 291–293) gives weaker conditions that accommodatethe constant-elasticity-of-variance process when γ ≥ 1/2.

Page 127: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

110 Pricing Derivative Securities (2nd Ed.)

3.3 Ito’s Formula

If f(t) is a continuously differentiable function of time, then the change inthe function over [0, t] can of course be expressed as an ordinary Riemannintegral,

f(t) − f(0) =∫ t

0

ft(s) · ds, (3.14)

where ft denotes the derivative. In differential form this would be df(t) =ft · dt. The Ito formula gives corresponding expressions for the change overtime in a smooth function of an Ito process. These now take the form ofstochastic integrals or stochastic differentials. What we find, most conve-niently, is that smooth functions of Ito processes are themselves Ito pro-cesses and therefore semimartingales. This simply stated (though not sosimply proven) fact is the primary reason that Ito processes—and, forthat matter, continuous-time stochastic models built up from Brownianmotions—have been so useful in finance and other fields.

3.3.1 The Result, and Some Intuition

With Xt−X0 =∫ t

0as ·ds+

∫ t

0bs ·dWs and assuming f : → to be twice

continuously differentiable, we will see that

f(Xt) − f(X0) =∫ t

0

fX(Xs) · dXs +12

∫ t

0

fXX(Xs) · d〈Xs〉, (3.15)

where fX and fXX are the first two derivatives and

d〈X〉t = d

∫ t

0

b2s · ds = b2

t · dt.

In abbreviated, differential form this is

df(Xt) = fX · dXt +12fXX · d〈X〉t. (3.16)

Going a little farther to write out dXt as at · dt + bt · dWt and collect termsgives

df(Xt) = (fXat + fXXb2t /2) · dt + fXbt · dWt, (3.17)

showing most clearly that smooth functions of Ito processes are, indeed,still Ito processes.

The key feature of (3.15), and what sets it apart from (3.14), is thesecond-order term in fXX . Before getting into the details it will be helpful

Page 128: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

Tools for Continuous-Time Models 111

to grasp the intuition for this distinction. For a smooth function of timeitself, f(t), we can think of df = ft · dt as approximating the motion of thefunction over [t, t+∆t) to order ∆t—that is, as an O(∆t) approximation to∆f(t) = f(t+∆t)− f(t). In the integral form, the fundamental theorem ofcalculus tells us that the next order term involving ftt does not contributeto the change in f because the effects of these o(∆t) components in the sumthat converges to the integral become negligible as ∆t → 0. Why, then, doesit not work this way for functions of an Ito process? Very simply, becausethe motion of Xt itself over [t, t + ∆t) is stochastically of order larger than∆t as ∆t → 0—meaning that on average it goes to zero more slowly than∆t. Recalling that V (Wt+∆t − Wt) = ∆t, it follows that E|Wt+∆t − Wt| =O(

√∆t). This extreme variability of Brownian motion, which is what causes

it to be nondifferentiable, carries over to the more general Ito processes aswell. Therefore, to get an O(∆t) approximation to ∆f(Xt) requires takingthe Taylor expansion of f out one more term to include fXX(∆Xt)2. Thatis precisely what Ito’s formula does.

3.3.2 Outline of Proof

We now sketch the derivation of the Ito formula. To keep it simple, weassume that at, bt, and fXX are a.s. bounded.9 The development issimilar to that for the quadratic variation of Xt. Begin with the usual gridtj = jt/nn

j=0 and the identity f(Xt)−f(X0) =∑n

j=1[f(Xtj )−f(Xtj−1)].Expanding each f(Xtj ) to the second order about Xtj−1 , we have

f(Xt) − f(X0) =n∑

j=1

fX(Xtj−1 )(Xtj − Xtj−1)

+12

n∑j=1

fXX(Xt∗j )(Xtj − Xtj−1)2.

Notice that the continuity of Xt assures that there is indeed a t∗j ∈(tj−1, tj) such that Xt∗j ∈ (Xtj−1 , Xtj ). Sending n → ∞ turns the first

term on the right into∫ t

0fX(Xs) · dXs. Concentrating on the second term,

approximating Xtj − Xtj−1 as atj−1(tj − tj−1) + btj−1(Wtj − Wtj−1 ) andsquaring leads to an expression involving three terms. We find their limitsone by one.

9For the general treatment see Durrett (1996) or Karatzas and Shreve (1991).

Page 129: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

112 Pricing Derivative Securities (2nd Ed.)

1.

12

n∑j=1

fXX(Xt∗j )a2tj−1

(tj − tj−1)2

=t

2n

n∑j=1

fXX(Xt∗j )a2tj−1

(tj − tj−1)

→ 0.

2.

n∑j=1

fXX(Xt∗j )atj−1btj−1(tj − tj−1)(Wtj − Wtj−1 )

√√√√tn−1

n∑j=1

fXX(Xt∗j )2a2tj−1

b2tj−1

(tj − tj−1)n∑

j=1

(Wtj − Wtj−1 )2

=

√√√√tn−1

n∑j=1

fXX(Xt∗j )2a2tj−1

b2tj−1

(tj − tj−1) · tn−1

n∑j=1

Z2j

→ 0,

where the Schwarz inequality is used in the second line.3.

12

n∑j=1

fXX(Xt∗j )b2tj−1

(Wtj − Wtj−1 )2

=12

n∑j=1

fXX(Xtj−1 )b2tj−1

(Wtj − Wtj−1 )2

+12

n∑j=1

[fXX(Xt∗j ) − fXX(Xtj−1)]b2tj−1

(Wtj − Wtj−1 )2.

The first term converges to 12

∫ t

0 fXX(Xs)b2s · ds ≡ 1

2

∫ t

0 fXX(Xs) · d〈X〉sby the same argument used to show that 〈X〉t =

∫ t

0 b2s · ds. Invoking

the continuity of fXX and the fact that Xt∗j − Xtj−1 → 0 as n → ∞, asimilar argument implies that the second term converges to zero.

Assembling the pieces leaves us with (3.15).

Page 130: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

Tools for Continuous-Time Models 113

3.3.3 Functions of Time and an Ito Process

The Ito formula extends as follows for a function f(t, Xt) that depends inde-pendently on time and an Ito process and is once continuously differentiablein t and twice continuously differentiable in X :

f(t, Xt) − f(0, X0) =∫ t

0

ft(s, Xs) · ds +∫ t

0

fX(s, Xs) · dXs

+12

∫ t

0

fXX(s, Xs) · d〈X〉s. (3.18)

Note that the existence of ftX is not required. The differential form is

df(t, Xt) = ft · dt + fX · dXt +12fXX · d〈X〉t (3.19)

=(

ft + fXat +12fXXb2

t

)· dt + fXbt · dWt.

3.3.4 Illustrations

When integrating ordinary functions of a real variable, we typically justuse differentiation formulas to find antiderivatives, rather than work outthe result from the definition of the integral as a limiting sum. Ito’s for-mula affords a similar way to find stochastic integrals and solve stochasticdifferential equations. Here are some examples.

Example 40 Example 36 used the definition of the Ito integral to showthat ∫ t

0

Ws · dWs = W 2t /2 − t/2. (3.20)

Since W0 = 0, this can be written as W 2t − W 2

0 = 2∫ t

0 Ws · dWs +∫ t

0 ds

or, in differential form, as dW 2t = 2Wt · dWt + dt. Setting Xt = Wt and

f(Wt) = W 2t and applying (3.16) with d〈W 〉t = dt yield the same result:

dW 2t = 2Wt · dWt + 1

2 (2) · dt = 2Wt · dWt + dt. Applying the formula inreverse gives Wt · dWt = dW 2

t /2 − dt/2.

Example 41 Example 35 showed that∫ t

0

s · dWs = tWt −∫ t

0

Ws · ds,

which has the differential form

d(tWt) = t · dWt + Wt · dt.

Page 131: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

114 Pricing Derivative Securities (2nd Ed.)

With Xt = Wt and f(t, Wt) = tWt (3.19) gives the same result.

Example 42 From expression (3.13) a differential form of geometric Brow-nian motion is dXt/Xt = µ · dt + σ · dWt. We can propose a solution tothis s.d.e. and use Ito’s formula to check it. We are used to thinking thatdx/x = d lnx, so our first guess might be Xt = f∗(t, Wt) ≡ X0e

µt+σWt .

This gives f∗t = µXt, f∗

W = σXt, f∗WW = σ2Xt and

dXt/Xt = µ · dt + σ · dWt +12σ2 · dt

= (µ + σ2/2) · dt + σt · dWt.

This misses—and incidentally tells us that d lnXt = dXt/Xt in stochasticcalculus, but it shows that we need just to subtract σ2/2 from µ to get theright solution:

Xt = X0e(µ−σ2/2)t+σWt . (3.21)

Given the importance of geometric Brownian motion as a model forassets’ prices, (3.21) will be an extremely useful result, to which we willrefer repeatedly in later chapters. It implies that

lnXt = lnX0 + (µ − σ2/2)t + σWt, (3.22)

so that,

lnXt ∼ N [lnX0 + (µ − σ2/2)t, σ2t]. (3.23)

Thus, if Xt represents the price of a financial asset, its future value aftert units of time is lognormally distributed. Moreover, the log-price processlnXt is a Gaussian process with independent increments and with auto-correlation function given by

Corr(ln Xs, lnXt) = Cov(ln Xs, lnXt)/√

V lnXs · V lnXt

=√

s/t, s ≤ t.

Example 43 Applying Ito with dXt = at · dt + bt · dWt and f(Xt) = lnXt

gives

d lnXt = X−1t · dXt −

12X−2

t · d〈X〉t

=(

X−1t at −

12X−2

t b2t

)· dt + X−1

t bt · dWt.

When at = µXt and bt = σXt, this is

d lnXt = (µ − σ2/2) · dt + σ · dWt,

in agreement with (3.22).

Page 132: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

Tools for Continuous-Time Models 115

The next example introduces two concepts whose importance will beseen in section 3.4 in connection with Girsanov’s theorem.

Example 44 Consider the trendless Ito process dXt = θt · dWt, where∫ t

0 θ2s · ds < ∞ for each finite t. Define a new process dQt = Qt · dXt =

Qtθt ·dWt. By analogy with example 42 we guess (and can verify using Ito’sformula) that solutions to this s.d.e. are of the form

Qt = Q0eXt−〈X〉t/2 = Q0 exp

(∫ t

0

θs · dWs −12

∫ t

0

θ2s · ds

)for arbitrary Q0. Taking Q0 = 1 gives the “Doleans exponential” or“stochastic exponential” of Xt. It can be shown that Qt is a contin-uous local martingale.10 Moreover, under the Novikov condition,

E exp(

12

∫ t

0

θ2s · ds

)< ∞, (3.24)

Qt is integrable and therefore a continuous proper martingale, withEQt = 1. The Novikov condition holds trivially when θtt≥0 ∈ F0 andθ2

t is integrable—a case that covers many of our applications.

3.3.5 Functions of Higher-Dimensional Processes

It is often necessary to consider the behavior of functions of several Itoprocesses. Let Xt = X0 +

∫ t

0 as · ds +∫ t

0 bs · dWs be an m-dimensionalIto process, where Wt is a k-dimensional standard Brownian motion andat and bt are an m-vector and an m × k matrix of adapted processes.The quadratic covariation of Xjs and Xs on [0, t] is the probabilitylimit as n → ∞ of

∑ni=1(Xjti −Xjti−1)(Xti −Xti−1) on a grid ti = it/n.

This equals

〈Xj , X〉t =∫ t

0

bjsb′s · ds,

10This is easily seen in two special cases. Write Qt = Qs exp[R ts

θu ·dWu− 12

R ts

θ2u ·du]

for 0 ≤ s < t, and suppose, as the first case, that θt is F0-measurable andR T0 θ2

t · dt <

∞. ThenR ts

θu ·dWu ∼ N(0,R ts

θ2u ·du), and EsQt = Qs follows at once from the form of

the normal moment generating function. Considering as the second case simple processesθu =

Pθj−11[tj−1,tj)(u) the result follows easily from the independence of increments

to Brownian motion. In both cases the Novikov condition, (3.24), clearly holds, so thatQt is in fact a martingale. For the more general case that θt = 〈X〉t and Xt is acontinuous local martingale see Karatzas and Shreve (1991, section 3.5).

Page 133: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

116 Pricing Derivative Securities (2nd Ed.)

where bjs and bs are the jth and th rows of bs and bjsb′s is the inner

product. Of course, when j =

〈Xj , X〉t = 〈Xj〉t =∫ t

0

bjsb′js · ds.

Note that 〈Xj , X〉t = 0 when the subsets of W1t, W2t, . . . , Wkt on whichXjt and Xt depend have no elements in common, in which case bjsb

′s =

0, s ∈ [0, t]. For a differentiable function f(t,Xt) the differential form ofIto’s formula is

df(t,Xt) = ft · dt +m∑

j=1

fXj · dXjt +12

m∑j=1

m∑=1

fXjX· d〈Xj , X〉t, (3.25)

or, written out,

df(t,Xt) =

ft +m∑

j=1

fXj ajt +12

m∑j=1

m∑=1

fXjXbjtb′

t

· dt

+m∑

j=1

fXjbjt · dWt.

The next two examples develop special cases that will be important in theapplications.

Example 45 Let X1t and X2t be Ito processes with

dXjt = ajt · dt + bjt · dWt

for j = 1, 2, where Wt is a k-dimensional standard Brownian motion.Taking f(Xt) = X1tX2t formula (3.25) gives

df(X1tX2t) = X2t · dX1t + X1t · dX2t + d〈X1, X2〉t. (3.26)

Since d〈X1, X2〉t = b1tb′2t · dt, this is

df(X1tX2t) = (X2ta1t + X1ta2t + b1tb′2t) · dt + (X2tb1t + X1tb2t) · dWt.

Notice that if either or both of X1t, X2t is of finite variation—in whichcase b1t = 0 and/or b2t = 0—then the quadratic-covariation process isidentically zero, and (3.26) reduces to the ordinary product formula fordifferentiation:

df(X1tX2t) = X2t · dX1t + X1t · dX2t. (3.27)

Page 134: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

Tools for Continuous-Time Models 117

Example 46 Let pt and St be two m-dimensional Ito processes,m ≥ 2, as

dpt = πt · dt + δt · dWt

dSt = µt · dt + σt · dWt,

where πt and µt are m vectors and δt and σt are m × k matrices, allFt-adapted. Let

Pt = f(pt,St) ≡ p′tSt =

m∑j=1

pjtSjt.

Then (3.25) gives

dPt = S′t · dpt + p′

t · dSt + d〈pt,St〉t (3.28)

=m∑

j=1

[Sjt · dpjt + pjt · dSjt + δjtσ′jt · dt]

=m∑

j=1

[(Sjtπjt + pjtµjt + δjtσ′jt) · dt + (Sjtδjt + pjtσjt) · dWt],

where δjt and σjt are the jth rows of δt and σt.

3.3.6 Self-Financing Portfolios in Continuous Time

Working now toward applications, think of pt and St in the last exampleas portfolio and asset-price vectors; that is, component pjt is the numberof units of asset j held at t and Sjt is its price. With P0 = p′

0S0 given,Pt = p′

tSt is then the value of the portfolio at t—the sum of shares timesprices. However, the pt process needs to be restricted if this interpretationis to make sense. Since instantaneous changes in the portfolio should beunder the investor’s control—hence, predictable—we will need to set δt = 0.This change makes pt of finite variation, with derivative dpt/dt = πtan Ft-adapted process. In this case (3.27) applies, and (3.28) reduces to

dPt = S′t · dpt + p′

t · dSt.

The first component represents the instantaneous change in Pt that isattributable to purchases and sales of assets at the current prices, whilethe second component is the change attributable to changes in prices ofassets already in the portfolio.

Now let us make the portfolio self-financing by insisting that instanta-neous purchases and sales balance out. This is done by restricting πt such

Page 135: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

118 Pricing Derivative Securities (2nd Ed.)

that S′tπt = S′

t · dpt/dt = 0, thereby eliminating the first component ofdPt. With these controls

dPt = p′t · dSt, (3.29)

or in integral form

PT − P0 =∫ T

0

p′t · dSt. (3.30)

3.4 Tools for Martingale Pricing

Previous sections of the chapter have shown how to construct Ito inte-grals and have established that integrals of suitable adapted processes withrespect to Brownian motions are continuous local martingales. Thus, given aWiener process Wt on a filtered probability space (Ω,F , Ft, P), the pro-cess Xt = X0 +

∫ t

0 bs ·dWs0≤t≤T is a continuous local martingale on [0, T ]when bt ∈ Ft for each t and

∫ T

0 b2t ·dt < ∞. Moreover, Xt is a proper mar-

tingale when E∫ T

0b2t ·dt < ∞. On the other hand, the more general Ito pro-

cesses that allow for systematic trend, as Xt = X0+∫ t

0as ·ds+

∫ t

0bs ·dWs,

are not P-martingales but, rather, are semimartingales. Girsanov’s theoremwill show us how to find a new measure in which a process with trend doesbecome a proper martingale.

3.4.1 Girsanov’s Theorem and Changes of Measure

Turning Ito processes that are semimartingales into Ito processes that arelocal martingales is just a matter of removing the drift term. Suppose it werepossible to replace the standard Brownian motion in dXt = at ·dt+ bt ·dWt

with a new process Wtt≥0 with Wt = Wt +∫ t

0(as/bs) ·ds. This would give

a formal representation for the stochastic differential of Xt as

dXt = at · dt + bt[dWt − (at/bt) · dt] = bt · dWt.

Obviously, this would require the extra restriction bt > 0, but even grantingthis one quickly realizes that this sleight of hand accomplishes nothing.Wt itself has systematic trend and is no longer a standard Brownianmotion under measure P, so the stochastic integral

∫ t

0 bs · dWs is still nota local P-martingale. But suppose there were a different measure P underwhich Wt actually was a Brownian motion. Then the process Xt =X0 +

∫ t

0 bs · dWs0≤t≤T would indeed be a local martingale under P—and

Page 136: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

Tools for Continuous-Time Models 119

a proper martingale, too, if E∫ T

0 b2t · dt < ∞. So how could one change the

measure so as to bring this about?Clearly, the Radon-Nikodym theorem will be involved here. A hint

toward the answer comes by considering how Radon-Nikodym would beused to change a normally distributed random variable with nonzero meaninto one with zero mean. For this, suppose Z ∼ N(0, 1) on a probabilityspace (Ω,F , P), and let PZ be the measure on the space (,B) inducedby Z. This is defined as

PZ(B) =∫

B

1√2π

e−z2/2 · dz, B ∈ B.

Let Z = Z + θ ∼ N(θ, 1) under P, with θ = 0. Then with φ(z) =(2π)−1/2e−z2/2 as the normal p.d.f., introduce the nonnegative random vari-able Q(Z) = exp(−θZ − θ2/2) and define the set function PZ as

PZ(B) ≡ EQ(Z)1B(Z) =∫

B

Q(z)φ(z) · dz

=∫

B

1√2π

e−(z+θ)2/2 · dz, B ∈ B. (3.31)

Thus defined, PZ is obviously nonnegative and countably additive, and sincePZ() = EQ(Z)1(Z) = EQ(Z) = 1 it follows that PZ(·) is a probabilitymeasure on (,B). As the form of the integral in (3.31) shows, this corre-sponds to some new measure P on (Ω,F) under which Z ∼ N(−θ, 1). Sincethe mean of Z is −θ under P it is not surprising that Z = Z + θ ∼ N(0, 1)under the new measure. This is shown explicitly as follows. Letting E denoteexpectation under P, the c.f. of Z under P is

EeiζZ =∫

eiζ(z+θ) 1√2π

e−(z+θ)2/2 · dz =∫

1√2π

eiζz−z2/2 · dz = e−ζ2/2,

showing that Z is indeed distributed as N(0, 1) under this new measure.Now let us try to extend this idea to Brownian motions. Consider-

ing a finite time horizon T , we work with the filtered probability space(Ω,FT , Ft0≤t≤T , P). Recalling example 44, let

Qt = exp(−

∫ t

0

θs · dWs −12

∫ t

0

θ2s · ds

)(3.32)

be the Doleans exponential of −∫ t

0 θs · dWs, where θt is an adaptedprocess on [0, T ] with

∫ T

0 θ2t · dt < ∞ and such that EQT = 1. Recall that

Page 137: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

120 Pricing Derivative Securities (2nd Ed.)

the Novikov condition,

E exp

(12

∫ T

0

θ2t · dt

)< ∞,

is sufficient for Qt0≤t≤T to be a P martingale. Now defining the measureP on FT as

P(A) = EQT1A =∫

A

QT (ω) · dP(ω), A ∈ FT , (3.33)

where QT = dP/dP is the Radon-Nikodym derivative, we have the conclu-sion of Girsanov’s theorem:

Theorem 8 (Girsanov) For fixed T ∈ [0,∞) the processWt = Wt +

∫ t

0

θs · ds

0≤t≤T

is a Brownian motion on (Ω, FT , Ft0≤t≤T , P), where P is defined by(3.33) and (3.32).

For a multidimensional version simply interpret Wt, Wt, and θtas k-dimensional processes and θ2

t as scalar product θ′tθt, making Wt =

Wt +∫ t

0θs · ds0≤t≤T a k-dimensional Brownian motion under P, with

dP

dP= exp

(−

∫ T

0

θs · dWs −12

∫ t

0

θ′sθs · ds

).

Notice that the martingale property of Qt implies that when A ∈ Ft

P(A) = E[1AE(QT | Ft)] = EQt1A,

Thus, EQt1A is the restriction of P to Ft in the sense that P(A) = EQt1A

when A ∈ Ft. Notice, too, the correspondence between how the measure ischanged to produce a Brownian motion without trend and how the measureis changed to produce a normal variate with zero mean.

The proof here is much more involved, however, because the behavior ofthe entire process Wt0≤t≤T must be established. The standard approachrelies on a characterization theorem of Paul Levy, which states that if Xtis a continuous martingale with X0 = 0 and if X2

t −t is also a martingale,then Xt is a Brownian motion. We will use Levy’s characterization toprove a simple one-dimensional version of Girsanov in which θt = θ, aconstant. In this case QT = exp(−θWT − θ2T/2) and Wt = Wt + θt.

Page 138: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

Tools for Continuous-Time Models 121

We need a preliminary result; namely, that if Yt0≤t≤T is Ft0≤t≤T

adapted and 0 ≤ s < t ≤ T, then

EsYt = Es(YtQt/Qs), (3.34)

where Es and Es are expectations conditional on Fs with respect to P andP, respectively. Given that Qt is a continuous P martingale, this followsat once from Bayes’ rule for conditional expectations, (2.51). Expression(3.34) tells us that the right variable by which to scale an Ft-measurablevariable when conditioning on Fs is

Qt

Qs= exp

[−

∫ t

s

θu · dWu − 12

∫ t

s

θ2u · du

].

Also, since Q0 = 1, it follows that EYt = EYtQt.Now let us use Levy’s characterization theorem to prove Girsanov for a

constant process θt = θ. This requires showing that

EsW2t − t − (W 2

s − s) = 0.

First, write

EsW2t − t − (W 2

s − s) = Es[2Ws(Wt − Ws) + (Wt − Ws)2] − (t − s)

= Es[2Ws(Wt − Ws) + (Wt − Ws)2]Qt/Qs − (t − s),

then set Wt = Wt + θt, Ws = Ws + θs, and Qt/Qs = exp[θ(Wt − Ws) −θ2(t − s)/2]. Finally, expressing Wt − Ws as Z

√t − s, where Z ∼ N(0, 1),

and making a further change of variables reduces this to

2Ws

√t − s

zφ(z) · dz + (t − s)∫

z2φ(z) · dz − (t − s) = 0,

thus establishing the conclusion of Girsanov in this special case.Returning to the problem posed at the beginning of this section, let

us now look specifically at how to apply Girsanov’s theorem. If there is ameasure P under which Wt = Wt +

∫ t

0 (as/bs) · ds is a Brownian motion,then under this measure the Ito process Xt with dXt = at ·dt+bt ·dWt =bt · dWt has zero drift. Taking θt = at/bt, we conclude from Girsanov thatsuch P exists provided Qt = exp[−

∫ t

0 θs ·dWs− 12

∫ t

0 θ2s ·ds] (with Q0 = 1)

is a martingale under the original measure P. The Novikov condition issufficient for this, and clearly bt = 0 is necessary.

Page 139: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

122 Pricing Derivative Securities (2nd Ed.)

3.4.2 Representation of Martingales

From what we have seen Ito processes are extremely resilient: smooth func-tions of Ito processes are still Ito processes, and integrals of suitable adaptedprocesses with respect to Ito processes are still Ito processes. Given thisremarkable fungibility, it is not surprising that it is possible to representeither of two different Ito processes in terms of the other if they are adaptedto the same filtration. Suppose there are two trendless Ito processes,

Xt = X0 +∫ t

0

as · dWs

0≤t≤T

Yt = Y0 +∫ t

0

bs · dWs

0≤t≤T

,

where at is a.s. nonzero for each t and∫ T

0 b2s·ds < ∞. Then setting ct = bt/at

shows that there exists a process ct such that Yt = Y0 +∫ t

0 cs · dXs and∫ T

0 a2t c

2t · dt < ∞ a.s. The first condition follows just from algebra:

Y0 +∫ t

0

cs · dXs = Y0 +∫ t

0

(bs/as)as · dWs = Yt;

and the almost-certain finiteness of∫ T

0a2

t c2t · dt =

∫ T

0b2t · dt comes from the

fact that Yt was defined to be an Ito process.Taking this one step farther, suppose there are two local martingales

Xt0≤t≤T and Yt0≤t≤T whose quadratic variation processes 〈X〉t and〈Y 〉t are absolutely continuous functions of t. Then it can be shown11

that there exist adapted processes at and bt with∫ T

0 a2t · dt < ∞ and∫ T

0 b2t · dt < ∞ such that Xt = X0 +

∫ t

0 as · dWs and Yt = Y0 +∫ t

0 bs · dWs.In other words, subject to the restriction on quadratic variations, it is alwayspossible to represent local martingales as Ito processes. Piecing togetherthese facts then gives the following result:

Theorem 9 (Martingale Representation) For local martingalesXt0≤t≤T and Yt0≤t≤T such that a2

t ≡ d〈X〉t/dt > 0 a.s. there exists anadapted process ct0≤t≤T such that Yt = Y0 +

∫ t

0 cs · dXs for t ∈ [0, T ] and∫ T

0a2

tc2t · dt < ∞ a.s.

11See Karatzas and Shreve (1991, theorem 4.2).

Page 140: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

Tools for Continuous-Time Models 123

3.4.3 Numeraires, Changes of Numeraire, and Changes

of Measure

Prices of domestic assets are normally specified in units of domesticcurrency—dollars, pounds, Euros, etc. If we allow trades only among domes-tic assets, there is no need to think about how the currency’s value relatesto values of foreign currencies. However, even when confining attention todomestic markets, finding arbitrage-free prices of derivatives will requireus to work with other numeraires than units of currency. The reasonis that currency per se is dominated by riskless, interest-bearing assetssuch as money-market funds and default-free bonds. In arbitrage-free mar-kets containing such (admittedly, idealized) riskless alternatives, currencywould simply not be held as an asset. Therefore, to explore arbitrage-freeeconomies, we need to express prices relative to the price of some asset orportfolio of assets that is a viable means of storing wealth. Any such priceprocess that remains always positive can serve as numeraire. Specifically,

Definition 3 A numeraire is an almost-surely positive price process.

We refer to the ratio of an asset’s price to a numeraire as a “normalized”price process. While the idealized money-market fund introduced in sec-tion 4.1—essentially a savings account—will usually serve our purpose asnumeraire, others will be more convenient in some applications, particu-larly in chapter 10, where we treat derivatives on interest-sensitive assets.This section shows how normalized price processes are changed by switch-ing from one numeraire to another. If a particular normalized price processis a martingale in some measure, we will see that it is no longer so after achange of numeraire, but we will also see how to change measures so as torestore the martingale property.

For some finite horizon T let the price process St0≤t≤T of a tradedasset evolve as dSt = stSt · dt + σ′

tSt · dWt, where σt is a k-dimensionalprocess and Wt is a vector-valued standard Brownian motion on(Ω,F , Ft0≤t≤T , P); and let Mt and Lt be other, strictly positive priceprocesses evolving as

dMt = mtMt · dt

dLt = ltLt · dt + λ′tLt · dWt.

These will be our numeraires. Note that Mt is represented as a process offinite variation, with Mt = M0 exp(

∫ t

0ms · ds). We take st, mt, lt, σt, λt

Page 141: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

124 Pricing Derivative Securities (2nd Ed.)

to be adapted processes with∫ T

0 σ′tσt · dt < ∞ and

∫ T

0 λ′tλt · dt < ∞.

Applying Ito’s formula to SLt ≡ St/Lt = f(St, Lt) and to SM

t ≡ St/Mt

gives

dSLt = fS · dSt + fL · dLt + fSL · d〈S, L〉t +

12fLL · d〈L〉t

=dSt

Lt− St

dLt

L2t

− StLtσ′tλt · dt

L2t

+StL

2t λ

′tλt · dt

L3t

= SLt

[dSt

St− dLt

Lt− (σt − λt)

′ λt · dt

]=

[st − lt − (σt − λt)

′ λt

]SL

t · dt + (σt − λt)′ SL

t · dWt

dSMt = (st − mt)SM

t · dt + σ′tS

Mt · dWt

Note that the volatility vector of SLt is the difference between the volatil-

ity vectors of St and of Lt and that the drift of SLt is influenced by

both the trend and the volatility of Lt, whereas the change from Stto SM

t is in the drift term only since the volatility of Mt is null. How-ever, since the volatilities of Lt and Mt differ, the change of numerairefrom Lt to Mt (or vice versa) alters both the drift and volatility of thenormalized version of St.

Now suppose there exists a measure PM under which both SMt and

LMt ≡ Lt/Mt are martingales. By the martingale representation theo-

rem there are under this measure a k-vector Brownian motion WMt and

adapted processes σMt , λM

t such that

dSMt = (σM

t )′SMt · dWM

t

dLMt = (λM

t )′LMt · dWM

t .

However, if we change to numeraire Lt, we have for SLt ≡ St/Lt =

SMt /LM

t

dSLt = −(σM

t − λMt )′λM

t SLt · dt + (σM

t − λMt )′SL

t · dWMt (3.35)

= (σMt − λM

t )′SLt · d

(WM

t −∫ t

0

λMs · ds

).

There is now a drift term, so SLt is not a martingale under PM ; but we can

ask whether there is a new measure PL under which the martingale propertyis restored. Girsanov’s theorem tells us that the answer is ‘yes’ if λt issuch that Qt ≡ exp(−

∫ t

0 λ′s ·dWM

t − 12

∫ t

0 λ′sλs ·ds)0≤t≤T is a martingale

under PM . In that case WLt ≡ WM

t −∫ t

0 λMs · ds is a k-dimensional

standard Brownian motion in the measure defined by PL(A) =∫

A QT ·dPM

Page 142: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

Tools for Continuous-Time Models 125

=∫

AdP

A

dPM · dPM , A ∈ FT ; and with EM , EL denoting expectation underPM , PL, we have EL

t SLT ≡ EL(SL

T | Ft) = EMt SL

T QT /Qt = SLt . Notice that

the change of measure has altered only the trend of SLt , not its volatility

process.To summarize,

• Changing from one numeraire to another that has different drift andvolatility alters both the drift and the volatility of an asset’s normalizedprice.

• Changing measures without changing numeraires changes the drift of theasset’s normalized price but not its volatility.

Conveniently, random variable dPL/dPM can be represented in terms ofobservables. As above, let A be any FT -measurable event, and for t ∈ [0, T ]set St ≡ MtE

Mt 1A = EM (Mt1A | Ft). In particular, ST = MT1A since

A ∈ FT . Then SMt 0≤t≤T is a conditional expectations process adapted

to Ft and therefore a PM martingale, with

S0 = M0EMSM

T = M0EM (MT1A/MT ) =

∫A

M0 · dPM .

But by hypothesis SLt is a PL martingale, so

S0 = L0ELSL

T = L0EL(MT 1A/LT ) =

∫A

L0MT

LT

dPL

dPM· dPM .

Since the two integral expressions agree for each A ∈ FT , we conclude that

dPL

dPM=

LT /L0

MT /M0, PM a.s. (3.36)

3.5 Tools for Discontinuous Processes

This section describes certain continuous-time processes with discontinuoussample paths and introduces the applicable tools of stochastic calculus. Theapplications of these processes will come in chapter 9. We begin with modelsfor a simple class of pure-jump processes having no continuous component,then turn to more general forms.

3.5.1 J Processes

Given a filtered probability space (Ω,F , Ftt≥0, P), an increasing sequenceof stopping times τk∞k=1, and a collection of random variables Uk∞k=1

Page 143: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

126 Pricing Derivative Securities (2nd Ed.)

with Uk ∈ Fτk, we introduce a class of pure-jump processes Jtt≥0 having

the representation

Jt = J0 +∞∑

k=1

Uk1[τk,τk+1)(t), (3.37)

where J0 ∈ F0. We shall refer to processes of this type as J processes. Theyare clearly right-continuous step functions, each sample path of which takesthe constant value Uk(ω) on [τk(ω), τk+1(ω)).

Example 47 The Poisson process, Ntt≥0, was introduced in chapter 2as an example of a continuous-time Markov process. Recall that Nt is anadapted process on (Ω,F , Ftt≥0, P) with the properties (i) N0 = 0; (ii)Nt − Ns independent of Nr − Nq for q < r ≤ s < t; and (iii) Nt+u − Nt ∼P (θu), with P(Nt+u − Nt = n) = (θu)ne−θu/n! for n ∈ ℵ0 and θ, u > 0.

Parameter θ is called the “intensity” or the “arrival rate”. Taking τk =inft : Nt ≥ k and Uk = k gives a representation in the form (3.37) asNt =

∑∞k=1 k1[τk,τk+1)(t); equivalently, Nt =

∑∞k=1 1[τk,∞)(t). The stopping

times τk are called the “arrival times”.Here are some features of the Poisson process that will be needed

in applications. Since ENt = V Nt = θt, it is clear from the independent-increments property that Nt − θt and (Nt − θt)2 − θt are martingales.Likewise, the form of the probability generating function,

Π(ζ) = EζNt = exp[θt(ζ − 1)],

shows that ξNte−θt(ξ−1) is a martingale.

Example 48 Taking τk∞k=1 as Poisson arrival times, let Uk∞k=1 bei.i.d. random variables independent of Nt (and therefore of the τk),and set U0 = 0. With J0 ∈ F0, we then have the following representationsof a “compound-Poisson” process Jt:

Jt = J0 +∞∑

k=1

k∑j=1

Uj

1[τk,τk+1)(t)

= J0 +∞∑

k=1

Uk1[τk,∞)(t)

= J0 +Nt∑

k=0

Uk.

Page 144: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

Tools for Continuous-Time Models 127

As the sum of a Poisson distributed number of independent randomvariables, Jt − J0 takes a jump of random size Uk at the k th Poissonarrival. Clearly, Jtt≥0 is a martingale if the jump sizes have mean zero.For later reference, if J0 = 0 and the jumps have characteristic functionΨU (ζ) ≡ EeiζU , then the c.f. of Jt is

ΨJt(ζ) = E

E

[exp

(iζ

Nt∑k=0

Uk

)| Nt

]= E exp[ΨU (ζ)Nt]

= expθt[ΨU (ζ) − 1]. (3.38)

Integration with Respect to J Processes

A special theory for integration with respect to Brownian motion—and Itoprocesses generally—is required because Brownian motion is of unboundedvariation. By contrast, sample paths of J processes necessarily have finitevariation on finite time intervals, since step functions can be representedas differences between nondecreasing functions. This means that integralsof continuous functions with respect to J processes can be defined path-wise as ordinary Riemann-Stieltjes integrals. That is, if ctt≥0 is a.s.continuous, then we can interpret

∫ t

0cs · dJs on a sample path ω as

limn→∞∑n

j=1 ct∗j (Jtj − Jtj−1) for any t∗j ∈ [tj−1, tj ], whenever maxj |tj −tj−1| → 0 as n → ∞. The following example shows that, contrary to thecase for the Ito integral, ct does not have to be evaluated at the left boundof each interval in the approximating sum if c is continuous.

Example 49 Taking Jt = Nt, ct = t, tj = jt/n and

∆n ≡n∑

j=1

ctj (Jtj − Jtj−1) −n∑

j=1

ctj−1(Jtj − Jtj−1 ),

we have

limn→∞

∆n = t limn→∞

n−1n∑

j=1

(Ntj − Ntj−1)

= t limn→∞

n−1Nt

= 0.

Page 145: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

128 Pricing Derivative Securities (2nd Ed.)

On the other hand, integrals with respect to Jt of a process ct thatis merely right-continuous (such as another J process) must be defined morerestrictively if the result is to be unambiguous.

Example 50 Taking Jt = Nt = ct and tj = jt/n gives

limn→∞

∆n = limn→∞

n∑j=1

Ntj (Ntj − Ntj−1 ) − limn→∞

n∑j=1

Ntj−1(Ntj − Ntj−1)

=Nt∑j=1

j −Nt∑j=1

(j − 1)

= Nt.

Adopting as the definition the left-end construction (we motivate thischoice momentarily) gives∫

(0,t]

cs− · dJs = limn→∞

n∑j=1

ctj−1(Jtj − Jtj−1 ) =∑

q

cq−(Jq − Jq−),

where cs− is the left hand limit (that is, limn→∞ cs−1/n) and the last sum-mation is over the arrival times τk.

Regarding the stochastic integral with respect to a J process as a func-tion of the upper limit leads to a new J process, Kt say, with

Kt = K0 +∫

(0,t]

cs− · dJs,

for some K0 ∈ F0. The differential shorthand is then

dKt = ct− · dJt.

If there is an absolutely continuous function12 ϑt such that J ′t ≡ Jt − ϑt

is a martingale adapted to Ft, then this left-end construction makes theintegral with respect to J ′

t a martingale also, for suitably restricted func-tions c. Thus, for Kt − K0 =

∫(0,t] cs− · dJ ′

s and a bounded c,

E(Kt − K0) = E limn→∞

n∑j=1

ctj−1(J′tj− J ′

tj−1)

= limn→∞

En∑

j=1

ctj−1(J′tj− J ′

tj−1)

12More generally, a function having finite variation on finite intervals.

Page 146: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

Tools for Continuous-Time Models 129

= limn→∞

E

n∑j=1

ctj−1Etj−1 (J′tj− J ′

tj−1)

= 0.

For example, with ct = e−Nt, Jt = Nt, ϑt = θt, and K0 ∈ F0 the processKt with

Kt = K0+∫

(0,t]

e−Ns− ·d(Ns−θs) = K0+∫

(0,t]

e−Ns− ·dNs−θ

∫ t

0

e−Ns− ·ds

is an Ft martingale.Integrals with respect to compound-Poisson processes can be written

explicitly as

Kt − K0 =∑

q

cq−U · (Nq − Nq−) =Nt∑

k=0

cτk−Uk, (3.39)

where the first summation is over jump times τ1, τ2, .... In differential formthis is

dKt = ct− · dJt = ct−U · dNt.

Of course, U ≡ 1 for a standard Poisson process.

Differentials of Functions of J Processes

Ito’s formula provides representations of smooth functions of Ito processesas stochastic integrals and of changes in these functions as stochastic dif-ferentials. Obtaining these representations for functions of J processes isstraightforward. For any measurable f : [0,∞) × → we have as anidentity for f(t, Jt) − f(0, J0) the expression

n∑j=1

[f(tj , Jtj ) − f(tj−1, Jtj )] +n∑

j=1

[f(tj−1, Jtj ) − f(tj−1, Jtj−1)],

where 0 = t0 < t1 < ... < tn = t. Assuming that f is continuously differen-tiable in the time argument, apply the mean value theorem to express thefirst summation as

n∑j=1

ft(t∗j , Jtj )(tj − tj−1),

Page 147: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

130 Pricing Derivative Securities (2nd Ed.)

where ft is the time derivative and t∗j ∈ (tj−1, tj). In the limit as maxj |tj −tj−1| approaches zero this gives

f(t, Jt) − f(0, J0) =∫ t

0

ft(s, Js) · ds +∑

q

[f(q, Jq) − f(q, Jq−)], (3.40)

where the summation is over jump times τ1, τ2, .... The differential form is

df(t, Jt) = ft(t, Jt) · dt + [f(t, Jt) − f(t, Jt−)]. (3.41)

When Jt is an integral with respect to a compound-Poisson process, as in(3.39), (3.40) is

f(t, Jt)−f(0, J0) =∫ t

0

ft(s, Js)·ds+Nt∑k=0

[f(τk, Jτk−+cτk−Uk)−f(τk, Jτk−)].

Consider next a function f(t, Yt, Jt) that depends on both a J process,Jtt≥0, and an Ito process, Ytt≥0, with

Yt = Y0 +∫ t

0

as · ds +∫ t

0

bs · dWs. (3.42)

If f is continuously differentiable with respect to t and twice continuouslydifferentiable with respect to Y , then we have the following extension ofIto’s formula, generalizing (3.18) and (3.40):

f(t, Yt, Jt) − f(0, Y0, J0) =∫ t

0

fs · ds +∫ t

0

fY · dYs +12

∫ t

0

fY Y · d〈Y 〉s

+∑

q

[f(q, Yq, Jq) − f(q, Yq, Jq−)].

Here the derivatives are evaluated at Js−; for example, fY = fY (s, Ys, Js−).Of course the continuity of Ytt≥0 makes Yt− = Yt a.s. This has the dif-ferential form

df = ft · dt + fY · dYt +fY Y

2· d〈Y 〉t + f(t, Jt, Yt) − f(t, Jt−, Yt) (3.43)

or equivalently

df =(

ft + atfY +b2t fY Y

2

)· dt + btfY · dWt + f(t, Jt, Yt) − f(t, Jt−, Yt).

3.5.2 More General Processes

We now consider generalizations of J processes that are subject to discon-tinuous movements but are not necessarily pure jump.

Page 148: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

Tools for Continuous-Time Models 131

Mixed Ito and J Processes

The discontinuous models most widely applied in finance have been simplecombinations of Ito processes and J processes built up from the compoundPoisson. These allow for continuous, stochastic movement of unboundedvariation on finite intervals, punctuated by randomly timed breaks of ran-dom size. With Yt as in (3.42), Kt as in (3.39), and X0 = Y0 + K0 ∈ F0,define the new process Xtt≥0 with

Xt = X0 +∫

(0,t]

(dYs + dKs) (3.44)

= X0 +∫ t

0

as · ds +∫ t

0

bs · dWs +∫ t

0

cs−U · dNs,

where Wtt≥0 and Ntt≥0 are independent of each other and of the Uk.Xtt≥0 has continuous paths governed by the Ito part until the indepen-dent Poisson process triggers a jump of size ct−U. Therefore, Xt − Xt− =ct−U(Nt − Nt−). For a differentiable function f(t, Xt) the Ito formula forthe mixed process becomes

df = ft · dt + fX · dYt +fXX

2· d〈Y 〉t + [f(t, Xt) − f(t, Xt−)] (3.45)

= (ft + atfX + b2t fXX/2) · dt + btfX · dWt + [f(t, Xt) − f(t, Xt−)],

where the derivatives are evaluated at (t, Xt−).We also require the following more general version of Ito’s formula for

functions of right-continuous semimartingales:13

df = ft·dt+fX ·dXt+fXX

2·d〈X〉ct+[f(t, Xt)−f(t, Xt−)−fX∆Xt]. (3.46)

Here, 〈X〉ct is the quadratic variation of the continuous part of Xt; thederivatives are evaluated at (t, Xt−); and ∆Xt ≡ Xt − Xt−. When Xt =Yt + Kt, where Yt is Ito and Kt is pure jump, then ∆Xt = ∆Kt,fX · dXt = fX · dYt + fX∆Kt, and 〈X〉ct = 〈Y 〉t, which agrees with (3.45).

Example 51 Constructing a process that will be important in theapplications, take at = (α + β2/2)Xt−, bt = βXt−, and ct = Xt andassume that P(1+U > 0) = 1. This results in a geometric Brownian motionpunctuated by Poisson-directed, multiplicative, positive shocks 1 + U . WithRt ≡ lnXt − lnX0 and R0 = 0 formula (3.45) gives

dRt = α · dt + β · dWt + ln(1 + U) · dNt (3.47)

13For the proof, consult Protter (1990, pp. 71–74).

Page 149: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

132 Pricing Derivative Securities (2nd Ed.)

or

Rt = αt + βWt +Nt∑

k=0

ln(1 + Uk) (3.48)

along a sample path with Nt jumps. Rt is therefore the sum of a Brownianmotion and a compound-Poisson process. Letting ΨU (ζ) = E(1 + U)iζ bethe characteristic function of ln(1 + U), the natural log of the c.f. of Rt is(for later reference)

ln ΨR(ζ) = tiζα − ζ2β2/2 + θ[ΨU (ζ) − 1] (3.49)

when the Poisson process has intensity θ.

Levy Processes

A Levy process is an adapted, right-continuous stochastic process Xtt≥0

with the following features:14

1. X0 = 02. Xt − Xs independent of Xr − Xq for q < r ≤ s < t

3. P(Xt+u − Xt ∈ B) independent of t for each t ≥ 0, u > 0 and B ∈ B.

Levy processes are thus right-continuous processes with stationary, inde-pendent increments. It is clear that Brownian motions, Poisson processes,and compound-Poisson processes all belong to this class, as does the processRt from (3.48); however, Ito processes, Yt = Y0 +

∫ t

0as ·ds+

∫ t

0bs ·dWs,

are not, in general, Levy.Together, the second and third properties impose a surprising degree

of structure on Levy processes. Letting Ψt(ζ) = EeiζXt denote the c.f.,stationarity and independence of increments implies that Ψt(ζ) = Ψt/n(ζ)n

for each positive integer n and each t ≥ 0. Thus, for each t the distributionof Xt is infinitely divisible. As such, as was shown in section 2.2.7, the c.f.cannot equal zero, so we can take logs and write

ln Ψt+1/n(ζ) − ln Ψt(ζ) = ln Ψ1/n(ζ) = n−1 ln Ψ1(ζ)

for each n > 0. Multiplying by n and letting n → ∞ shows that

d ln Ψt(ζ)/dt = ln Ψ1(ζ)

14Strictly, it is possible to show that there is a right-continuous version of a processwith these properties—e.g., Protter (1990, chapter I, theorem 30). We always deal withsuch a version.

Page 150: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

Tools for Continuous-Time Models 133

and, therefore, (by property 1) that

Ψt(ζ) = etΘ(ζ) (3.50)

for some continuous, complex-valued function Θ(ζ) = ln Ψ1(ζ). For stan-dard Brownian motions, Poisson processes, and compound-Poisson pro-cesses we have, respectively, Θ(ζ) = −ζ2/2, Θ(ζ) = θ(eiζ − 1), andΘ(ζ) = θ[ΨU (ζ) − 1], while

Θ(ζ) = iζα − ζ2β2/2 + θ[ΨU (ζ) − 1] (3.51)

for the mixed process Rt in (3.48).Observe that any infinitely divisible distribution with c.f. Ψ1(ζ) gener-

ates a Levy process as the collection of random variables Xtt≥0, whereXt has c.f. Ψt(ζ) = exp[t ln Ψ1(ζ)] = Ψ1(ζ)t. Assuming that E|Xt| < ∞, wecan appeal to the Levy-Khintchine theorem (section 2.2.7) for a canonicalrepresentation of Ψt(ζ) as (3.50) with

Θ(ζ) = iζµ +∫

eiζx − 1 − iζx

x2M(dx), (3.52)

where µ = EX1 and M is a σ-finite measure on (,B). The first twoexamples below correspond to those in section 2.2.7.

Example 52 With M(0) = σ2 ≥ 0 and M(\0) = 0 we haveΘ(ζ) = iζµ − ζ2σ2/2. If σ2 > 0 this corresponds to a Brownian motionwith mean rate of drift µ and variance rate σ2; otherwise, it corresponds toa deterministic process.

Example 53 With µ = M(1) = θ > 0 and M(\1) = 0 we haveΘ(ζ) = θ(eiζ − 1), the Poisson process.

Example 54 Let F be the c.d.f. of a random variable U with c.f. ΨU (ζ)and mean EU = υ, and set µ = θυ. Taking M(0) = 0 and settingM(dx) = θx2 · dF (x) give Θ(ζ) = θ[ΨU (ζ) − 1], which corresponds tocompound-Poisson process (3.38). In this case M(\0) = θ

∫x2 · dF (x),

which is finite if and only if V U exists.

Example 55 Let F be as in the previous example but takeM(0) = β2 > 0, so that M(\0) = θ

∫x2 · dF (x). Then putting µ =

α − θυ gives (3.51).

Page 151: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

134 Pricing Derivative Securities (2nd Ed.)

Example 56 Taking M(dx) = xe−x · dx for x > 0, M((−∞, 0]) = 0, andµ = 1 gives

Θ(ζ) = iζ +∫ ∞

0

eiζx − 1x

e−x · dx − iζ

∫ ∞

0

e−x · dx

=∫ ∞

0

eiζx − 1x

e−x · dx.

Differentiating with respect to ζ and recognizing the integral in the resultingexpression as the c.f. of the exponential distribution give

Θ′(ζ) = i

∫ ∞

0

eiζxe−x · dx = i(1 − iζ)−1,

whence Θ(ζ) = ln(1 − iζ)−1 and Ψt(ζ) = (1 − iζ)−t. Xtt≥0 is now agamma process, Xt having the density

ft(x) = Γ(t)−1xt−1e−x

for t > 0 and x > 0.

Notice that with M(\0) = 0, as in the first example, we get acontinuous process, whereas M(\0) > 0 produces either a pure-jumpprocess or a process with both continuous motion and jumps, according asM(0) = 0 or M(0) > 0. We will see that the gamma process in the lastexample is indeed pure jump, although not a member of the J class.

The applications we study will employ a variant of (3.52) involving adifferent measure. For a set of real numbers B ∈ B that is bounded awayfrom the origin (meaning that B∩[−ε, ε] = ∅ for some ε > 0) define Λ(B) =∫

Bx−2M(dx). Λ is then a σ-finite measure on the Borel sets of \0—the

“Levy measure” of the Levy process Xtt≥0. Setting M(0) = σ2 ≥ 0gives the canonical representation

Θ(ζ) = iζµ − ζ2σ2/2 +∫\0

(eiζx − 1 − iζx)Λ(dx). (3.53)

Expression (3.53) shows that Levy processes can be decomposed intothree pure types and combinations thereof: (i) a deterministic trend if σ = 0and Λ = 0; (ii) a trendless Brownian motion if µ = 0 and Λ = 0; and (iii)a pure-jump process if µ = σ = 0. If Λ(\0) = 0, as in example 52,then Θ(ζ) = iζµ − ζ2σ2/2, so that Xtt≥0 is either a deterministic trend(if σ2 = 0) or a Brownian motion. In either case the sample paths are a.s.continuous. The process will be discontinuous whenever Λ(\0) > 0—being pure jump if σ2 = 0 and µ =

∫\0 xΛ(dx) and otherwise the sum

Page 152: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

Tools for Continuous-Time Models 135

of a Brownian motion and an independent pure-jump process. The natureof the jump part depends on whether Λ(\0) is finite. If Λ(\0) = θ,where 0 < θ < ∞, then θ−1Λ is a probability measure on \0 and Xtis then a J process—either Poisson or compound-Poisson according as theentire mass θ is concentrated at a single point (as in example 53) or not (asin example 54). The remaining case, that Λ(\0) = ∞, is illustrated inexample 56, where the integral∫

\0Λ(dx) = lim

ε↓0

∫ ∞

ε

x−1e−x · dx (3.54)

has no finite value. Since, however, Λ is σ-finite, it is possible to parti-tion \0 into countably many sets of finite measure; for example, as∪∞

j=−∞Bj , where Bj = [2−j−1, 2−j) ∪ (−2−j,−2−j−1]. Letting Λ(Bj) = θj

and dFj(x) = θ−1j Λ(dx)1Bj (x) then gives∫

\0(eiζx − 1 − iζx)Λ(dx) =

∞∑j=−∞

∫Bj

θj(eiζx − 1 − iζx) · dFj(x)

=∞∑

j=−∞θj [Ψj(ζ) − 1] − iζυj.

This shows that a pure-jump Levy process with Λ(\0) = ∞ can berepresented as an infinite sum of independent compound-Poisson processes.Since the original measure M is finite whenever Xt has finite variance, wesee that

∫B Λ(dx) =

∫B x−2M(dx) is always finite whenever B is bounded

away from the origin.To gain an understanding of just what it is that Λ measures, let us

begin with example 53, the Poisson process, where Λ(\0) = Λ(1) =M(1) = θ. Here the measure is concentrated at the value (unity) that isthe size of jumps in the Poisson process and Λ(1) equals the expectednumber of jumps per unit time. Generalizing, suppose Λ is concentrated ona countable set xj∞j=1, with Λ(xj) = θj . Then

Θ(ζ) =∞∑

j=1

θj(eiζxj − 1 − iζxj),

so that Ψt(ζ) = etΘ(ζ) is the c.f. of the sum of independent scaled andcentered Poisson processes with jump sizes xj and expected frequenciesθj . If the xj are bounded away from the origin, then

∑∞j=1 θj < ∞, so

that the expected number of jumps per unit time is finite. In general, weinterpret Λ(B) as measuring the expected frequency of jumps of size in B,

Page 153: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

136 Pricing Derivative Securities (2nd Ed.)

whenever B is a set of real numbers whose closure excludes the origin. Theexpected number of jumps during any finite interval of time is infinite when

Λ(\0) = limε↑0

Λ((−∞,−ε]) + limε↓0

Λ([ε, +∞)) = ∞,

as is the case for the gamma process.15

Subordinated Processes

Let Xtt≥0 be a Levy process with c.f. ΨXt(ζ) = etΘ(ζ), and let Ttt≥0

be an a.s. nondecreasing process, independent of Xt, with T0 = 0. Let usthink of Tt as representing the “operational” time that has elapsed duringthe interval [0, t] of calendar time. The composite process XTtt≥0, rep-resenting the evolution of X in operational time, is called a “subordinatedprocess”, and Ttt≥0 is referred to as a “directing process” or “subordi-nator”. Since time itself is infinitely divisible, consistency requires that thedistribution of T1 be infinitely divisible as well, and therefore that Ttbe a Levy process. Since the sample paths are nondecreasing, there canbe no Brownian component, and the associated Levy measure ΛT must beconcentrated on +. Letting etΥ(ζ) be the c.f. of Tt and µ the mean of T1,the canonical representation is16

Υ(ζ) = iζµ +∫+

(eiζx − 1 − iζx)ΛT(dx).

Examples of subordinators are the Poisson and gamma processes. The c.f.of the subordinated process is then

ΨXT(ζ) = EE[eiζXTt | Tt]

= EeTtΘ(ζ)

= exptΥ[−iΘ(ζ)],

which shows that XTtt≥0 is again a Levy process. Thus, the family ofLevy processes is closed under subordination.

Example 57 Let Wtt≥0 and Ntt≥0 be, respectively, a standard Brow-nian motion and a Poisson process with intensity θ, and take Tt = t + Nt.

15Protter (1990) gives further details of Levy measure and Levy processes generally,and Bertoin (1996) provides an extensive technical treatment.

16For the proof see Karatzas and Shreve (1991, pp. 405–408).

Page 154: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

Tools for Continuous-Time Models 137

Then EeiζTt = exp[iζt + θt(eiζ − 1)], and composition WTt has c.f. corre-sponding to the sum of a Brownian motion and an independent compoundPoisson:

ΨWT(ζ) = expt[−ζ2/2 + θ(e−ζ2/2 − 1)]. (3.55)

Thus WTt has the same finite-dimensional distributions as the processWt +

∑Nt

j=0 Zj, where Ntt≥0 is a Poisson process, Z0 = 0, and Zj∞j=1

are i.i.d. as standard normal.

Example 58 Let us try to replicate the mixed process (3.48) via a timechange of Brownian motion when ln(1 + Uk) ≡ σZk∞k=1 are i.i.d. asN(0, σ2) and β = 0. Representing the value of the mixed process at time t as

Rt = αt + βWt + σ

Nt∑k=0

Zk,

the log of the c.f. is

ln ΨR(ζ) = tiζα − ζ2β2/2 + θ[e−ζ2σ2/2 − 1].

Setting Tt = t + Ntσ2/β2 and Yt = αt + βWTt , calculation of the c.f. of Yt

shows that Rt and Yt do have the same finite-dimensional distributions.

Page 155: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch03 FA

This page intentionally left blankThis page intentionally left blank

Page 156: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

PART II

PRICING THEORY

Page 157: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

This page intentionally left blankThis page intentionally left blank

Page 158: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

4Dynamics-Free Pricing

The work of pricing derivatives begins in this chapter. Here, using the sim-plest of mathematical tools, we look at what can be said about the valuationof forward contracts, futures contracts, and options without explicit mod-els for the dynamics of the prices of the underlying assets. This is done byrelying just on static replication for the arbitrage arguments. Recall fromchapter 1 that static replication involves duplicating a derivative’s payoffswith a portfolio that does not have to be rebalanced. We also consider herewhat the absence of arbitrage opportunities implies about the prices of risk-less bonds. In both cases the only assumption made about the dynamics isthat assets’ prices and interest rates are bounded below by zero. However,the arguments do rely on the Perfect-Market conditions that were referredto generally in chapter 1 and which we now spell out in detail, as follows.

PM1 It is costless to trade and to hold assets. Thus, there are no commis-sions to brokers, no transfer fees, no bid-asked spreads to compensatemarket makers, no margin requirements, no property taxes, and notaxes on capital gains, dividends, or interest.

PM2 Trades can be made instantaneously and at any time. Besidesthe obvious meaning, this implies that transactions in differentmarkets—for example, those for options and stocks—can be madesimultaneously.

PM3 Primary and derivative assets can be sold short without restriction.Thus, there are no special tick rules governing when a short salecan be consummated, and proceeds of sales are not escrowed but areavailable for other transactions. This also implies that riskless bondscan be issued as well as bought, so that individuals can, in effect,borrow at riskless rates.

141

Page 159: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

142 Pricing Derivative Securities (2nd Ed.)

PM4 Assets can be held in any quantity whatsoever. One can thereforebuy and sell fractional shares.

We also make the following additional assumption regarding what isknown about the underlying asset:

CCK Primary assets’ explicit Costs of Carry over the life of a derivativeare Known in advance.

While PM1-PM4 are self-explanatory, condition CCK requires special com-ment. Actually holding an asset for some period of time, as opposed totaking steps to secure ownership at a later date, can confer both positivecosts and benefits (negative costs). Contributing on the positive side are thedirect expenses to store and insure the asset, plus any losses from depreci-ation or deterioration. On the negative side are any receipts to which theowner of the asset is entitled. For financial assets these include dividendand interest payments and perhaps intangibles such as voting rights. Forcommodities like wheat or petroleum that can be used either directly orindirectly in consumption there is a convenience yield, discussed at lengthbelow. This is of much less significance for sterile commodities like gold thatare in nearly fixed supply and are held primarily for investment. Whateverthe source of the costs and benefits, condition CCK requires that theirvalues be known in advance over the lifetime of a derivative asset—eitherin actual currency units or as proportions of the evolving market value ofthe underlying asset or commodity. Notice that the explicit cost of carryexcludes the implicit opportunity cost associated with tying up investmentresources.

While there is no real market in which these five conditions strictly hold,the conditions are nevertheless useful idealizations. They are useful from amethodological standpoint because they permit simple arguments to gen-erate unambiguous predictions about prices of derivatives. They are usefulin practice because observation confirms that violations of the implied pric-ing relations in some real markets are neither persistent nor extreme. Onesuspects that this is true largely because in those markets the conditionsare more restrictive than needed for the results (although not for the argu-ments). For example, the tick rules and escrow requirements that do in factinhibit short sales of common stocks in some markets do not restrict salesby those who already own these assets. Likewise, while some of us maypay high commissions that would make otherwise profitable trades unprof-itable, it suffices that there are those with ample resources who can trade

Page 160: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

Dynamics-Free Pricing 143

very cheaply. On the other hand, we will see that there are prominent sit-uations in which the predictions based on these conditions—and CCK inparticular—are not well borne out in practice and in which naıve arbitragearguments alone lead to less useful results.

Since the ability to borrow and lend risklessly is fundamental to arbi-trage arguments, we begin with a discussion of interest rates and theirrelation to spot and forward prices of default-free bonds. Section 4.2 thentakes up forward and futures contracts. Because values of forward positionsdepend linearly on prices of traded primary assets, they can be replicatedwith static, buy-and-hold portfolios of the underlying and bonds. Since nosubsequent trading is involved, valuing forward positions does not require amodel for the underlying price dynamics. It is possible also to value futurespositions and determine arbitrage-free futures prices without such models,although this does require restrictions on the evolution of bond prices. Thefinal section of the chapter deals with options. Because their values at expi-ration are nonlinear functions of the underlying prices, arguments basedon static replication suffice merely to bound their pre-expiration values orrestrict the relative prices of such claims. These bounds and restrictionswill, however, guide us in later chapters as we adopt specific models forthe underlying dynamics and apply dynamic replication to develop sharpestimates of arbitrage-free prices.

4.1 Bond Prices and Interest Rates

Payoffs of derivative assets are typically combinations of sure claims orobligations and contingent claims or obligations. For example, we saw inchapter 1 that the payoff from contracting for forward delivery of a com-modity at future date T is ST −f. In such a deal one simply trades a deferredobligation to pay the sure amount f, the forward price, for a deferred claimon a commodity worth an uncertain amount ST . To replicate and price suchderivatives requires replicating and pricing the sure claims as well as thecontingent ones. Riskless discount bonds are just such sure claims, deliver-ing a single guaranteed receipt of the principal amount at expiration. In theUnited States, U.S. Treasury bills and Treasury strips come close to thisidealized concept. For simplicity we apply the term bonds to all such obli-gations, regardless of their maturity. Thus, a 3-month Treasury bill will bereferred to as a “bond” that matures in 0.25 years. Only in the next sectionand in chapter 10 will we have occasion to consider coupon bonds that makeperiodic payments to the holder before maturity. The terms bond, riskless

Page 161: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

144 Pricing Derivative Securities (2nd Ed.)

bond, and unit bond (see below) without further qualification always referto the discount variety.

Interest rates are merely secondary constructs that describe the tem-poral behavior of bond prices. We will see how average and instantaneousspot rates and forward rates can be deduced, in principle, from a collectionof bonds having a continuous range of maturities. Our convention through-out is to measure time to maturity in years and to express interest ratesper annum. The riskless bonds we consider are riskless only in the sensethat the nominal value of their contractual payoffs is certain; that is, theyare default-free. Purchasing-power risk arising from uncertainty in goods’prices may still be present. We also deal only with nominal interest rates,not real interest rates adjusted for anticipated changes in the purchasingpower of the currency.

4.1.1 Spot Bond Prices and Rates

Throughout, B(t, T ) represents the price at t of an idealized discount bondthat matures and pays the holder one unit of currency for sure at T ≥ t;thus, B(t, t) = B(T, T ) = 1. We sometimes refer to these as “unit” bondsto emphasize that they pay a single currency unit at maturity. Of course,no one really issues bonds with unitary “principal”, or “face”, value; but wecan simply think of B(t, T ) as the price per unit principal. As the price ofa unit bond, B(t, T ) is the appropriate discount factor for valuing at t anysure cash amount that is to be received at T . For example, a sure claim ona cash amount c at T would be worth B(t, T )c at time t.1 The time- t valueof a default-free coupon bond that promises to pay the bearer c currencyunits at each of dates T1, . . . , Tn and deliver also a final one-unit principalpayment at Tn would thus be

B(t, c; T1, . . . , Tn) = c

n∑j=1

B(t, Tj) + B(t, Tn).

Table 4.1 illustrates the term structure of prices of discount bondsthrough a sample of bid and asked quotations for U.S. Treasury strips atone-year maturity intervals, all in the month of August.2 Treasury strips

1This familiar fact is itself based on an arbitrage argument. Since one could securethe same receipt c by buying c of the unit, T -maturing bonds, a disparity between theprice of the claim and the cost of the bonds could be arbitraged by buying the cheaperof the two and selling the other.

2Source: Wall Street Journal, 8 June 2006.

Page 162: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

Dynamics-Free Pricing 145

Table 4.1. Price quotations for U.S. Treasury strips.

Maturity Bid Asked

2006 99:03 99:042007 94:09 94:102008 89:26 89:272009 85:16 85:172010 81:24 81:252011 77:19 77:192012 73:29 73:292013 70:07 70:082014 66:09 66:092015 62:29 62:302016 59:16 59:17

are discounted (non-coupon paying) claims backed by portfolios of coupon-paying Treasury notes and long-term bonds. Prices are quoted in percentagepoints of maturity value, with digits after the colon representing 32ds ofone percentage point.

Average and Instantaneous Rates

We know from first principles that an initial sum S0 invested at a contin-uously compounded average rate of return r per annum would be worthSτ = S0e

rτ after τ years. Equivalently, S0, Sτ , and τ together determinethe average continuously compounded return, as r = τ−1 ln(Sτ/S0). Thus,rτ equals the logarithm of the “total return”, which is the terminal valueof a unit initial investment. Applying these relations to default-free bonds,a unit bond purchased at t < T for an amount B(t, T ) and held the T − t

years to maturity gives total return B(T, T )/B(t, T ) = B(t, T )−1 and aver-age continuously compounded rate of return

r(t, T ) = (T − t)−1[lnB(T, T ) − lnB(t, T )] = −(T − t)−1 lnB(t, T ).

Since buying a bond at its current or “spot” price is the same as makinga loan of B(t, T ) for T − t years, r(t, T ) represents the average spot ratefor lending from t to T . It is also known as the bond’s (average) “yieldto maturity”. Current prices of bonds of various maturities being known,the corresponding yields or average spot rates for those maturities can bereadily calculated. The “yield curve” represents a plot of these yields againsttime to maturity, as of a point in time. Of course, a bond’s price can be

Page 163: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

146 Pricing Derivative Securities (2nd Ed.)

expressed in terms of its yield as

B(t, T ) = e−r(t,T )(T−t). (4.1)

Now consider a bond that is just on the verge of maturing. Lettingt−∆t be the current time and t the maturity date, the average continuouslycompounded spot rate is

r(t − ∆t, t) = (∆t)−1[ln B(t, t) − lnB(t − ∆t, t)].

Assuming now that B(t, T ) is differentiable with respect to both arguments,letting ∆t ↓ 0 in the above gives the derivative of lnB with respect to thecurrent date. This limit is the “instantaneous spot rate” at t:

rt ≡ lim∆t↓0

r(t − ∆t, t) =∂ lnB(t, T )

∂t

∣∣∣∣T=t

. (4.2)

For brevity we will often refer to the instantaneous spot rate as the “shortrate”. Conversely, taking the derivative with respect to the maturity dategives

lim∆t↓0

(∆t)−1[lnB(t, t + ∆t) − lnB(t, t)] ≡ ∂ lnB(t, T )∂T

∣∣∣∣T=t

.

Since∂ lnB(t, T )

∂t

∣∣∣∣T=t

· dt +∂ lnB(t, T )

∂T

∣∣∣∣T=t

· dT = d lnB(t, t) = d ln(1) = 0,

the short rate is also given by

rt = − ∂ lnB(t, T )∂T

∣∣∣∣T=t

. (4.3)

Since a unit of currency at a future date T could always be securedsimply by acquiring one now and holding it, a unit bond priced above unitywould afford an opportunity for arbitrage unless the explicit cost of carryfor cash money were positive. Given the advanced monetary and financialinstitutions of modern developed nations, we are justified in regarding theexplicit cost of carrying cash as zero. Under this condition arbitrage thenassures that B(t, T ) ≤ 1 for each t ≤ T or, equivalently, that the averagespot rate, r(t, T ), and the short rate, rt, are always nonnegative.3 Since

3Short-term U.S. Treasury bills did sell at prices slightly above maturity value duringthe late 1930s, perhaps because of weak confidence in depression-era banks and the low

coverage limits in the early years of federal deposit insurance. Other factors that couldcontribute to positive cost of carry are the costs of safekeeping currency and propertytaxes levied on deposits at financial institutions by some governmental units.

Page 164: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

Dynamics-Free Pricing 147

obtaining sure cash for free is (by definition) an arbitrage, it follows thatprices of bonds of any finite maturity are positive and therefore that aver-age spot rates at all maturities are finite. Henceforth, then, we take forgranted that bond prices and instantaneous spot rates satisfy the followingcondition:

RB Interest Rates are nonnegative, and bond prices are Bounded:

0 < B(t, T ) ≤ 1, 0 ≤ t ≤ T(4.4)

0 ≤ rt < ∞, t ≥ 0

The Money Fund

Some of the arguments we make in pricing derivative assets rely on the con-cept of an idealized savings account or money-market fund, whose value pershare is often used as a numeraire. This “money fund”, as we shall usuallycall it, always holds bonds that are just approaching maturity, continuouslyrolling over the proceeds from one cohort of maturing bonds into the next.Since this strategy earns the short rate on the fund’s assets, the share priceat t, Mt, grows at instantaneous rate rt, so that

Mt = M0 exp(∫ t

0

rs · ds

), t ≥ 0.

The initial value of one share, M0, can be specified arbitrarily.

4.1.2 Forward Bond Prices and Rates

Paying B(t, T ) for a T -maturing unit bond is a direct way of securingone currency unit at T . The currency unit can also be secured indirectlythrough a forward agreement. For this, one contracts forward at t to buy aT -maturing unit bond at a date t′ ∈ (t, T ) at some negotiated forward price,Bt(t′, T ). Here, subscript t indicates the time at which the forward priceis quoted, and the arguments indicate the initiation and maturity dates ofthe loan. Upon making the forward commitment, one also buys in the spotmarket Bt(t′, T ) units of t′-maturing bonds, each of which costs B(t, t′).This spot transaction at t thus provides the Bt(t′, T ) units of cash that isto be swapped at t′ for the T -maturing bond that will in turn provide thedesired unit payoff at T . Arbitrage will equate the costs of securing thecurrency unit in these direct and indirect ways. Assuming that the explicit

Page 165: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

148 Pricing Derivative Securities (2nd Ed.)

cost of carrying discount bonds is zero, it follows that the time-t forwardprice of a T -maturing bond at t′ must be

Bt(t′, T ) = B(t, t′)−1B(t, T ).

Notice that the forward price depends just on current spot prices and thatcondition RB assures that such a price always exists and is positive. Currentspot prices of discount bonds over a span of maturities thus determineimplicitly the prices of forward commitments to lend and borrow over thatspan of time.

Corresponding to the average spot rate, the average time-t forward rateover (t′, T ) is

rt(t′, T ) = −(T − t′)−1 lnBt(t′, T )

= (T − t′)−1[lnB(t, t′) − lnB(t, T )].

If t′ = t this is the same as the average spot rate, r(t, T ). Taking T = t′+∆t

and letting ∆t ↓ 0 give the time-t value of the instantaneous forward rateat time t′:

rt(t′) ≡ lim∆t↓0

rt(t′, t′ + ∆t) = − ∂ lnB(t, T )∂T

∣∣∣∣T=t′

.

If t′ = t, we have rt(t) = −∂ lnB(t, T )/∂T |T=t= rt, which is just the shortrate at t. The assumption that cash money has zero cost of carry impliesthat average and instantaneous forward rates are necessarily nonnegative,just as for spot rates.

The relations below follow trivially from the definitions of forward bondprices, average forward rates, and instantaneous forward rates:

Bt(t′, T ) = exp

(−

∫ T

t′rt(s) · ds

)Bt(t′, T ) = e−rt(t

′,T )(T−t′)

rt(t′, T ) = (T − t′)−1

∫ T

t′rt(s) · ds.

Example 59 Based on the asked quotes in table 4.1 the implied forwardprice as of June 2006 for August 2010 delivery of a two-year unit bond was

Bt(t + 4, t + 6) =B(t, 6)B(t, 4)

=(73 : 29)/100(81 : 25)/100

.= 0.9037,

and the corresponding average forward rate was − ln(0.9037)/2 .= 0.0506.

Page 166: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

Dynamics-Free Pricing 149

4.1.3 Uncertainty in Future Bond Prices and Rates

Although the values of default-free bonds are known both currently andat maturity, we obviously do not know what they will sell for betweennow and then. That is, while both B(t, T ) and B(T, T ) are known att, the value of B(t′, T ) at t′ ∈ (t, T ) will depend on market conditionsduring (t, t′]. It follows that the average continuously compounded rateof return from holding a T -maturing bond from t to t′ < T—namely(t′− t)−1 ln [B(t′, T )/B(t, T )]—is uncertain, and likewise the instantaneousrate of return at t′,

lim∆t→0

(∆t)−1 [lnB(t′, T ) − lnB(t′ − ∆t, T )] .

Likewise, current short rate rt is known, but its value at t′ > t is not.Although future spot bond prices and interest rates are indeed uncer-

tain, this uncertainty turns out to be of secondary importance in pricingshort-term derivatives on volatile underlying assets whose prices are notlinked directly to fixed-income instruments; for example, common stocks,stock indexes, and currencies. Since strategies for replicating a derivative’spayoffs are much simpler when there is a single source of risk, we will usu-ally assume that future short rates are known when treating derivatives onsuch volatile assets. Since we refer to this assumption often in this and laterchapters, we set it out as a separate condition:

RK The course of the short Rate (instantaneous spot rate) of interest isKnown over the life of the derivative asset.

In other words, for a derivative initiated at t = 0 and expiring at t = T ,the assumption is that the process rt0≤t≤T is F0-measurable. Of course,if future spot rates are known they must equal the future forward rates toprevent spot-forward arbitrages; that is, it must be true under RK that rt′ =rt(t′) for each t′ ≥ t. Likewise, if future spot rates are known, then so arefuture bond prices, so that B(t, t′)0≤t≤t′≤T ∈ F0. Moreover, future bondprices must equal the current forward prices as a condition for markets’being arbitrage-free. Therefore, condition RK also implies that

B(t′, T ) = Bt(t′, T ) = B(t, t′)−1B(t, T )

for 0 ≤ t ≤ t′ ≤ T . Finally, since current bond prices can alwaysbe expressed in terms of instantaneous forward rates, as B(t, T ) =exp(−

∫ T

t rt(s) · ds), they can also be expressed in terms of future spot

Page 167: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

150 Pricing Derivative Securities (2nd Ed.)

rates under RK; i.e., B(t, T ) = exp(−∫ T

t rs ·ds). We shall use this simplifi-cation frequently in what follows, but one must remember that it is just asimplification. Assumption RK is dispensed with in chapter 10, where weconsider stochastic models for interest rates and see how to price derivativeson interest-sensitive assets.

4.2 Forwards and Futures

In the previous section an arbitrage argument was used to express theprice at t of a forward bond, Bt(t′, T ), in terms of known (Ft-measurable)prices of bonds that mature at later dates t′ and T . The conclusion thatBt(t′, T ) = B(t, T )/B(t, t′) depended on the perfect-market assumptionsPM1-PM4, on the known-cost-of-carry assumption CCK, and on regularitycondition RB, but it did not require assumption RK. That is, we did notneed to know the future prices of bonds in order to determine the arbitrage-free forward prices. One is then led to ask whether forward prices for otherassets than default-free bonds are also fully determined in markets with noarbitrage. We will see that under the same conditions arbitrage considera-tions alone do indeed determine the precise values of forward prices (and offorward contracts as well) without specific models for the dynamics of eitherunderlying prices or discount bonds. The expression for the forward bondprice thus turns out to be a special case of a more general result. Underthe additional assumption RK, we will see that futures prices for interest-insensitive assets are determined also. We will go through the argumentsfor forwards, then turn to futures, and finally discuss an important class ofcases in which the manifest failure of condition CCK (known cost of carry)leaves us just with bounds on values and prices rather than exact relations.

Here are the conventions and notation to be used. The spot price ofthe underlying asset or commodity at t ≥ 0 is St. This is the price thatwould be paid for immediate delivery to a specified location. The partyto a forward or futures contract who is obligated to take delivery will besaid to have a “long” position—for example, to be “long the futures”—while the party who must make delivery is said to be “short”. We willdeal with forward contracts initiated at t = 0 and expiring at T. Recallingthe notation introduced in chapter 1, F(St, T − t; f) represents the value att ∈ [0, T ] of a forward contract initiated at forward price f ≡ f(0, T ), whichof course does not change over the life of the contract. “Value” is from theperspective of the party who will take delivery—i.e., the one with the longposition. The futures price, which does change through time, is representedby F(t, T ). When t denotes a specific time, it is the current time.

Page 168: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

Dynamics-Free Pricing 151

4.2.1 Forward Prices and Values of Forward Contracts

We use static replication to determine the value at any t ∈ [0, T ] of a forwardcontract initiated at some price f = f(0, T ). Since the actual forward pricein any contract is such that F(S0, T ; f) = 0, finding the value functionF(St, T − t; f) for arbitrary f and each t ∈ [0, T ] will enable us to solve forf(0, T ) as well.

In a forward contract cash and the commodity change hands only atexpiration (time T ). At that time the value (from the viewpoint of theparty who will take delivery) is F(ST , 0; f) = ST − f. To replicate this payoffat time t one would need to do three things: (i) buy one unit of the asset;(ii) finance the explicit cost of carrying it to T ;and (iii) borrow B(t, T )f ofcash by selling f unit bonds each requiring the repayment of one currencyunit at T . The net cost to replicate is therefore St + K(t, T ) − B(t, T )f,where K(t, T ) is the explicit cost of carry. As will be shown presently,under condition CCK K(t, T ) can itself be replicated with a portfolio ofthe underlying asset and/or riskless bonds—which to use depending onwhether the cost of carry is known in absolute terms or varies as a knownproportion of the value of the underlying. Therefore, St +K(t, T )−B(t, T )fis also the value of a portfolio of underlying asset and bonds. Since thisportfolio replicates the payoff of the forward contract, the two will have thesame value if there are no opportunities for arbitrage, and so

F(St, T − t; f) = St + K(t, T )− B(t, T )f (4.5)

for t ∈ [0, T ]. If interests in such forward contracts sold for more than this,one could secure a riskless profit by taking the short side of the contract(contracting to deliver) and buying the replicating portfolio. If interestssold for less, one could take the opposite positions. We can now use (4.5)to determine the forward price itself. Setting F(S0, T ; f) = 0 and solving forf(0, T ) give4

f(0, T ) = B(0, T )−1 [S0 + K(0, T )] . (4.6)

What remains is to see how to replicate the explicit cost of carry. Ifthe payments or receipts associated with holding the asset are of knowncash amounts and at known dates, the obligations can be replicated with

4In the case that the underlying commodity is a sure claim on one currency unit atT ′, then the current price is S0 = B(0, T ′). There being no explicit cost of carry for sucha claim, (4.6) gives B(0, T )−1B(0, T ′) as the forward price for delivery of the claim atT ≤ T ′, which coincides with the expression for the time-0 price of a forward discountbond, B0(T, T ′).

Page 169: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

152 Pricing Derivative Securities (2nd Ed.)

a portfolio of riskless bonds. For example, known payments ct1 , . . . , ctn forstorage and safekeeping at times t1, . . . , tn ∈ [t, T ] would be financed byacquiring a portfolio of unit bonds maturing at these times and now worth

K(t, T ) =n∑

j=1

B(t, tj)ctj .

Receipts of dividends or interest would correspond to negative values of thectj. These receipts could be used to redeem tjn

j=1-maturing bonds soldat t for |K(t, T )|. If the payments/receipts were of unknown cash value butwere known proportions of the future value of the asset, then the carryingcost could be financed through sales/purchases of the asset itself at thetimes tjn

j=1. For example, the funds needed to pay ctj Stj > 0 at tj couldbe generated then by selling ctj > 0 units of the asset for each one held,and a receipt of −ctjStj > 0 could be used to purchase −ctj > 0 units.Since for each unit held at t one would have

∏nj=1(1 − ctj ) units after n

such transactions, one would need to start with∏n

j=1(1 − ctj )−1 units inorder to wind up with one unit at T , as is required to replicate the forwardcommitment. Therefore, the total cost at t to assure the availability of oneunit at T would be

St + K(t, T ) = St

n∏j=1

(1 − ctj )−1,

and the value of the forward contract would be

F(St, T − t; f) = St

n∏j=1

(1 − ctj )−1 − B(t, T )f. (4.7)

Notice that the timing of the payments does not matter when they areknown proportions of the price at whatever times the payments are to bemade.

In some cases it is appropriate to think of such proportional amounts asbeing paid, received, or accruing continuously through time. For example,an investment in a default-free foreign certificate of deposit grows in valuecontinuously at a known rate in proportion to the (unknown) future valueof the currency. Extending (4.7) to allow for such continuous flows, lettj − tj−1 = (T − t)/n and make each ctj proportional to the time since theprevious payment, as ctj = κ(T − t)/n. Then

∏nj=1(1 − ctj )−1 → eκ(T−t)

as n → ∞ and

F(St, T − t; f) = Steκ(T−t) − B(t, T )f. (4.8)

Page 170: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

Dynamics-Free Pricing 153

More generally, if instantaneous rates of payment vary through time insuch a way that κs0≤s≤T is integrable and F0-measurable, then for eacht ∈ [0, T ]

F(St, T − t; f) = St exp

(∫ T

t

κs · ds

)− B(t, T )f, (4.9)

and initial value F(S0, T ; f) = 0 when

f(0, T ) = B(0, T )−1S0 exp

(∫ T

0

κs · ds

). (4.10)

4.2.2 Determining Futures Prices

Turning now to futures, recall that the parties to such contracts have obli-gations that differ slightly from those of parties to forward contracts. Onewho goes long the futures at time t = 0 when the futures price is F(0, T ) willactually wind up paying the time-T spot price, ST , to take delivery; but inthe meantime the net price difference, ST − F(0, T ), will have been postedto the buyer’s account through a succession of daily marks to market.5

The net result differs from that of a forward contract initiated at the sameprice, f(0, T ) = F(0, T ), because the intervening additions to the futuresaccount could be reinvested and the intervening withdrawals would haveto be financed. Despite this difference in timing we will show that futuresprices would nevertheless be identical to forward prices under the perfect-market and cost-of-carry conditions if, in addition, spot interest rates overthe life of the contract were known in advance—condition RK. Once this isdemonstrated, we can draw some qualitative conclusions as to how forwardand futures prices should differ given that bond prices are, in reality, uncer-tain. More than that cannot be done without explicit models for Stt≥0

and rtt≥0, of which we shall see some examples in chapter 10.Assume now that the course of the short rate during [0, T ] is known at

t = 0, which implies that bond prices, B(t, t′), are known at t = 0 for all t

and t′ such that 0 ≤ t ≤ t′ ≤ T . This is the first of many applications ofthe strategic assumption RK that was discussed in section 4.1.3. Under this

5Our discussion pertains to forward and futures contracts that are identical exceptfor provisions of marking to market. For example, we ignore the fact that some futurescontracts allow a range of delivery dates.

Page 171: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

154 Pricing Derivative Securities (2nd Ed.)

condition it turns out that dynamic adjustments to a futures position initi-ated at price F(0, T ) can offset the effect of marking to market and delivera terminal value of ST − F(0, T ). This implies that arbitrages between for-wards and futures will equate the time-0 futures prices and forward prices,F(0, T ) and f(0, T ), on contracts for delivery at T .

Let t1 < t2 < · · · < tn−1 be the times at which a position in T -expiringfutures is marked to market, and set t0 = 0 and tn = T . Begin with along position in B0(t1, T ) = B(0, T )/B(0, t1) < 1 contracts at futures priceF(0, T ). At t = t1 when the futures price is F(t1, T ), marking to marketwill generate a (positive or negative) receipt [F(t1, T ) − F(0, T )]B0(t1, T ).Investing this in (financing it with) T -expiring unit bonds, each costingB(t1, T ), will add

B0(t1, T )B(t1, T )

[F(t1, T ) − F(0, T )] = F(t1, T )− F(0, T )

to the value of the account at T . The equality holds because future bondprices, being known at t = 0, must equal the current forward prices, so thatB(t1, T ) = B0(t1, T ). Besides investing or financing the capital gain or lossat t1, we also increase the futures position by the factor B(t1, t2)−1, givingB(t1, T )/B(t1, t2) = Bt1(t2, T ) contracts in all. Under condition PM1 thisrequires no cash outlay. Marking to market at t2 posts an additional change

Bt1(t2, T )[F(t2, T )− F(t1, T )]

to the account, and investing or financing this with T -expiring bonds eachworth B(t2, T ) adds

Bt1(t2, T )B(t2, T )

[F(t2, T )− F(t1, T )] = F(t2, T ) − F(t1, T )

to the final value.Continuing, arranging to hold at each tj−1 a total of Btj−1 (tj , T ) con-

tracts and reinvesting at tj the subsequent the one-period gain,

Btj−1(tj , T )[F(tj, T ) − F(tj−1, T )],

in T -maturing bonds each worth B(tj , T ) adds F(tj , T ) − F(tj−1, T ) tothe value of the account at T. The total of the receipts generated by thisstrategy is

n∑j=1

[F(tj , T )− F(tj−1, T )] = F(T, T ) − F(0, T ) = ST − F(0, T ).

Page 172: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

Dynamics-Free Pricing 155

As we have argued, this implies that futures and forward prices for con-tracts of the same specifications must agree. The key requirement for thisresult is that the current forward price of a T -expiring bond acquired atthe next marking to market be equal to the actual, realized price at thatfuture time—a requirement that is met under condition RK. As futuresprice F(t, T ) evolves up to T it must therefore track the current forwardprices for new T -expiring contracts having the same specifications. Specifi-cally, in the case of continuous carrying costs we have

F(t, T ) = B(t, T )−1St exp

(∫ T

t

κs · ds

)= St exp

[∫ T

t

(rs + κs) · ds

](4.11)

under condition RK.Without explicitly modeling bond prices and Stt≥0 it is not possible

to say precisely how forward and futures prices should correspond whenRK fails to hold. However, when price movements in the underlying andbonds are strongly related, equilibrium arguments do at least support someeducated guesses. First, if the correlation is strongly negative, so that theterm structure of bond prices usually shifts in a direction opposite to thatof the spot price, then one expects futures prices to be higher than forwardprices, for the following reason. Positive changes in St lead typically toincreases in F(t, T ) and gains for those who are long the futures. If accom-panied by a decline in bond prices, these gains purchase more of the cheaperT -maturing bonds and thereby generate larger cash receipts at T than inthe absence of negative correlation. This makes long positions in futuresmore attractive than long positions in forwards when F(t, T ) = f(t, T ).Therefore, the futures price needs to be higher to offset this advantage andequilibrate the two markets. Conversely, if the correlation between move-ments in prices of bonds and the underlying is strongly positive, then gainson long futures positions usually buy fewer bonds and secure lower ter-minal receipts than in the absence of such correlation. In that case oneexpects futures prices to be lower than forward prices. However, it is notso clear what to expect when bond prices are uncertain but only weaklylinked to the underlying. We come back to this issue in chapter 10, wherewe show precisely how futures and forward prices are related under a par-ticular model for the instantaneous forward rate and the underlying asset.

Page 173: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

156 Pricing Derivative Securities (2nd Ed.)

4.2.3 Illustrations and Caveats

If the short rate is positive and an asset’s explicit cost of carry is nonneg-ative, so that

∫ T

t(rs + κs) · ds > 0 for each t ∈ [0, T ), then it is clear from

(4.11) that futures prices should increase with the remaining time untilexpiration. This is, in fact, what we usually observe for sterile assets likegold, which generate no income but are nevertheless held largely for invest-ment purposes as hedges against inflation. To illustrate, table 4.2 showsthat settlement prices for gold futures traded on the COMEX (a divisionof the New York Mercantile Exchange) as of t = 7 June 2006 do increasemonotonically with time to delivery.6

On the other hand, price would decrease monotonically with time todelivery if the cost of carry were sufficiently negative. For example, thiswould be true of currency futures if foreign interest rates were higher thandomestic rates. U.S. dollar prices of Mexican peso futures on 8 June 2006(table 4.3) illustrate such a monotonic decline, as is consistent with thedifference in average spot rates on Mexican and U.S. government securities.(For comparison, the last column shows the futures prices predicted fromthe simple cost-of-carry relation (4.11) under assumption RK, given a spotprice of St = .0882 $/peso.)

However, the situation is more complicated for commodities held impor-tantly as inventories for direct consumption or for the production of finalgoods, such as industrial metals, petroleum products, and agricultural prod-ucts. Although these generate no explicit receipts and are costly to store,futures prices are often poorly explained by naıve cost-of-carry arguments.

Table 4.2. Gold futures prices vs. length ofcontract.

Expiration, T F(t, T )($/oz.)

Jun 06 627.40Aug 06 632.60Dec 06 645.20Feb 07 651.50Jun 07 663.80Dec 07 682.30

6Sources for tables 4.2–4.4: Wall Street Journal ; Chicago Mercantile Exchange;Banco de Mexico (CETES rates); U.S. Treasury (government securities yield curve);Chicago Board of Trade: all as of 8 June 2006. Settlement prices are those officiallydetermined by the exchange in order to mark traders’ positions to market.

Page 174: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

Dynamics-Free Pricing 157

Table 4.3. Dollar/peso futures prices vs. length of contract.

Expiration, T F(t, T )($/peso) r(t, T )MX r(t, T )US Ste(rMX−rUS)(T−t)

Jun 06 0.0878Aug 06 0.0872 0.0701 0.0480 0.0878Oct 06 0.0869 0.0728 0.0486 0.0874Dec 06 0.0866 0.0750 0.0505 0.0871Feb 07 0.0862Apr 07 0.0857Jun 07 0.0851 0.0756 0.0503 0.0859

Table 4.4. Live cattle futures pricesvs. length of contract.

Expiration, T F(t, T )($.01/lb.)

Jun 06 79.875Aug 06 80.625Oct 06 84.100Dec 06 86.125Feb 07 88.525Apr 07 86.275Jun 07 81.700

For example, futures prices for commodities in seasonal supply or demandare often not monotonic in time to delivery. This is illustrated in table 4.4,which reports futures prices for live cattle on the Chicago Board of Tradeas of 8 June 2006. Prices increase progressively for delivery dates from Junethrough February then decline for deliveries during spring 2007.

Why is it that the arguments that apply to investment assets seem notto apply here? Two things are involved. First, there is a latent benefit toholding stocks of such commodities; namely, that inventories afford oppor-tunities for immediate use in consumption or production. Often describedas “convenience yields”, such opportunities are more accurately regardedas “real” (as opposed to financial) options. Thus, a beef processor who soldApril 2007 futures (to deliver then out of current inventory) and simulta-neously contracted to take delivery in June 2007 at a lower price wouldforego some options to market beef during that interval. Second, the factthat existing quantities of the commodity fluctuate as stocks are contin-ually being used up or augmented by new production makes the value ofthe utilization option variable and hard to determine. Thus, the value ofmaintaining inventories of live cattle for immediate use depends on the

Page 175: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

158 Pricing Derivative Securities (2nd Ed.)

uncertain domestic and foreign demand for beef, on changes in export andimport policies, and on cattlemen’s evolving plans to deplete or build uptheir herds. Contrary to what condition CCK requires, there is no way toassign a value to the utilization option—and thus no way to assess the costof carry—without a model for the dynamics of the spot price. The best wecan do with static replication for such consumable commodities is to setupper bounds on futures prices by taking account of just the known com-ponents of cost of carry. Clearly, a situation in which the futures price wasabove the explicit and opportunity cost of carrying live cattle to T wouldoffer a readily exploitable opportunity for arbitrage: short the futures andbuy the cows.

4.2.4 A Preview of Martingale Pricing

Let us now develop a useful interpretation of our findings about thevalues of forward contracts and about forward and futures prices. LetMt = M0 exp(

∫ t

0rs · ds) be the value at t of an initial investment of

M0 in a money-market fund earning the short rate continuously, and letS∗

t = M−1t [St + K(t, T )] be the relative value of the underlying asset plus

the explicit cost of carry during [t, T ]. Now suppose there exists a proba-bility measure P and a filtration Ft0≤t≤T such that S∗

t is a martingaleadapted to Ft. We refer to such P as a “martingale measure”. Then bythe fair-game property of martingales

EtS∗T ≡ E(S∗

T | Ft) = S∗t ,

where the carat on E denotes expectation under P. Assuming thatK(T, T ) = 0, this implies that

M−1t [St + K(t, T )] = M−1

T EtST

or

EtST

St + K(t, T )=

MT

Mt=

B(T, T )B(t, T )

= B(t, T )−1,

where the equality connecting money fund and bond prices holds underassumption RK. Under the martingale measure, therefore, the expectedtotal return from holding the risky asset during [t, T ] is the same as thesure total return of a riskless bond or, equivalently, of an investment in themoney fund. This same condition would hold in equilibrium if individualswere risk neutral, since no risk-neutral person would willingly hold bonds

Page 176: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

Dynamics-Free Pricing 159

or money fund if B(t, T )−1 = MT /Mt < EtST /[St + K(t, T )], and nonewould willingly hold the risky asset if the inequality went the other way.Therefore, we can refer to P also as a “risk-neutral” measure.

Applying (4.5), the value of a T -expiring forward contract can beexpressed in terms of the risk-neutral expectation as

F(St, T − t; f) = St + K(t, T ) − B(t, T )f

= B(t, T )[EtST − f(0, T )]

= B(t, T )EtF(ST , 0; f).

Letting F∗(St, T − t; f) ≡ M−1t F(St, T − t; f) and noting that B(t, T ) =

Mt/MT under assumption RK, we see that the above implies

F∗(St, T − t; f) = EtF∗(ST , 0; f).

Thus, the process representing the evolving normalized value of the forwardposition is also a martingale under P. Finally, the forward price that setsF(St, T − t; f) = 0 at t would be f(t, T ) = B(t, T )−1[St + K(t, T )] = EtST ;and, still abstracting from uncertainty about future bond prices, the evolv-ing futures price F(t, T ) would be the same. Since f(T, T ) = F(T, T ) = ST ,this means that processes f(t, T )0≤t≤T and F(t, T )0≤t≤T are themselvesmartingales under P.

These are remarkable and, as we shall come to see, very significantresults. One way of expressing them is that in the absence of possibilitiesfor arbitrage the present value of a forward contract is the discounted valueof its expected terminal value in a risk-neutral market; and the forwardprice on a new contract is the risk-neutral expectation of the future spot.Put another way, in the absence of arbitrage there is a probability measuresuch that processes representing normalized values of the derivative and theunderlying are martingales. We shall see that such a martingale measuredoes exist in the absence of arbitrage possibilities. We shall also see that themeasure is unique when it is possible to replicate the derivative’s payoffswith a portfolio of traded assets. Under these conditions we have a simplerecipe for pricing derivatives.

4.3 Options

Payoffs of options are harder to replicate than payoffs of forward contractsand futures because they are nonlinear functions of the underlying price.Replication, if possible at all, has to be done dynamically, and whether and

Page 177: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

160 Pricing Derivative Securities (2nd Ed.)

how this can be done depends on the price dynamics. Without a modelfor these, arbitrage arguments can merely establish loose bounds on pricesand determine qualitatively how prices depend on specific provisions of theoption contract, such as the strike price and time to expiration. On theother hand, under the perfect-market and known cost-of-carry conditions(PM1-PM4, CCK) static replication arguments alone do impose a tightparity relation between prices of puts and calls with the same strike andterm. Before developing these parity relations and bounds, it will be usefulto get a feel for the risks associated with option positions by relating theprobability distributions of option payoffs to the conditional distribution ofthe price of the underlying. Throughout the section the terms option, put,and call refer to the ordinary (“vanilla” ) European or American varieties.7

4.3.1 Payoff Distributions for European Options

We know from chapter 1 that the payoffs of European puts and calls struckat X and expiring at T are PE(ST , 0) = (X − ST )+ and CE(ST , 0) =(ST −X)+, respectively, where ST is the underlying price at T . It is obvious,then, that the put will wind up in the money with the same probability thatST is less than X and that the call will end in the money with the sameprobability that ST is greater than X. More generally, it is easy to representthe distributions of CE(ST , 0) and PE(ST , 0) in terms of the c.d.f. of ST .Regarding Stt≥0 as a stochastic process adapted to the filtered probabilityspace (Ω,F , Ft, P), it is clear that for each t ≤ T and p, c ∈

Pt[PE(ST , 0) ≤ p] ≡ P[PE(ST , 0) ≤ p | Ft] = Pt[ST ≥ X − p]

Pt[CE(ST , 0) ≤ c] ≡ P[CE(ST , 0) ≤ c | Ft] = Pt[ST ≤ X + c].

Therefore, letting FS(s) ≡ Pt(ST ≤ s), the conditional c.d.f.s of CE(ST , 0)and PE(ST , 0) are

FC(c) =

0, c < 0FS(X + c), c ≥ 0

FP (p) =

0, p < 01 − FS((X − p)−), 0 ≤ p < X

1, p ≥ X

,

7The European put-call parity relation was first pointed out by Stoll (1969). Merton’sseminal (1973) paper presented many of the arbitrage relationships derived here. Coxand Rubinstein (1985) give a highly readable account.

Page 178: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

Dynamics-Free Pricing 161

Fig. 4.1. C.d.f.s of terminal values of underlying, put, and call.

where (X − p)− is the left-hand limit. C.d.f.s for European puts and callsat expiration can therefore be read off the c.d.f. of ST , as illustrated infigure 4.1.8

It is also possible to infer from the graphs the mathematical expecta-tions of the options’ terminal values. Applying (2.40), EtP

E(ST , 0) andEtC

E(ST , 0) are represented by the areas between the unit lines and therespective c.d.f.s.

4.3.2 Put-Call Parity

Since being long a call gives the right to buy at X and being short a putwith the same strike gives the contingent obligation to buy at X , holdingthe two positions together is clearly akin to having a forward commitment

8With one exception these payoff distributions would not apply to American optionsconditional on their being held to term, because that condition would restrict the allow-

able sample paths of St0≤t≤T and alter the conditional distribution of ST . The excep-tional case is that of an American call on an underlying asset with nonnegative explicitcost of carry. As we show below, such an option would never be exercised early.

Page 179: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

162 Pricing Derivative Securities (2nd Ed.)

to take delivery at the price f = X . Indeed, for European puts and callswith the same term the correspondence is exact, and this imposes a strictrelation between the prices of the options before maturity. There is alsoa relation of this sort for American options, although the early-exercisefeature makes it a bit more complicated.

European Options

Being long a T -expiring European call and simultaneously short a put onthe same underlying asset and with the same term and same strike, X ,produces a payoff at T equal to

CE(ST , 0) − PE(ST , 0) = (ST − X)+ − (X − ST )+ = ST − X,

identical to that of a forward contract to buy at X. Under the perfect-market and known-cost-of-carry conditions arbitrage will therefore alignthe prices of the three claims at each t ∈ [0, T ] so that

CE(St, T − t) − PE(St, T − t) = F(St, T − t; X) (4.12)

= St + K(t, T )− B(t, T )X,

where K(t, T ) is the explicit cost of carrying the underlying asset fromt to T . Of course, the explicit cost of carry is typically zero or negative forfinancial assets. For example, suppose the underlying is a share of stockthat generates sure “lump-sum” dividends (that is, cash payments of fixedsums) at known times during [t, T ], and let ∆(t, T ) be the present value ofthe dividends at t. Thus, if the owner of the stock receives a single cashpayment of ∆ at t∗ ∈ [t, T ], then ∆(t, T ) = B(t, t∗)∆. The cost of carry isthen the negative of ∆(t, T ), and

CE(St, T − t) − PE(St, T − t) = St − ∆(t, T ) + B(t, T )X.

If the underlying is a currency or stock index that is regarded as payinga continuous proportional dividend at constant rate δ ≥ 0, then settingκ = −δ in (4.8) gives9

CE(St, T − t) − PE(St, T − t) = Ste−δ(T−t) − B(t, T )X. (4.13)

9If the payout rate varies with time but is known in advance then the right side of(4.13) is simply St exp(− R T

tδs · ds) − B(t, T )X.

Page 180: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

Dynamics-Free Pricing 163

Finally, if the underlying is a futures price or a no-dividend stock, thenthere is no explicit cost of carry at all, and

CE(St, T − t) − PE(St, T − t) = St − B(t, T )X. (4.14)

Notice that when interest rates are positive (4.14) implies that at-the-money European calls are worth more than the corresponding puts evenwhen the underlying generates no receipts, since

CE(X, T − t) − PE(X, T − t) = X − B(t, T )X > 0.

The higher value of at-the-money calls is sometimes mistakenly attributedto a bullish bias in the market, yet we see that arbitrage would bring thisabout even if the underlying asset’s expected rate of return (under “natural”measure P) were negative.

American Options

Because of the possibility of early exercise, the best one can do in the caseof American options is to provide bounds—albeit usually tight ones—onthe difference between the values of calls and puts. In the case of knownlump-sum dividends with present value ∆(t, T ) at t, the bounds are

St − ∆(t, T ) − X ≤ CA(St, T − t) − PA(St, T − t) ≤ St − B(t, T )X.

(4.15)

We shall show that the violation of either bound presents an opportunityfor arbitrage, referring to the underlying as a “stock” as we go through thearguments. To establish the lower bound, consider a portfolio that is longthe call, short the put, short the stock, and has X + ∆(t, T ) invested inriskless securities. The amount X is placed in shares of the money fund,now worth Mt each, that grow in value at short rate rt; while ∆(t, T )is distributed among discount bonds that mature at the stock’s dividenddates. If the lower bound is violated, this portfolio can be bought for

CA(St, T − t) − PA(St, T − t) − St + X + ∆(t, T ) < 0.

Since the put may be exercised at a time not under our control, we mustbe prepared to unwind all the positions when and if that occurs. Therefore,in analyzing the payoffs there are two relevant states: “put not exercised”and “put exercised”. Values of the portfolio in each state are laid out andexplained in table 4.5.

Page 181: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

164 Pricing Derivative Securities (2nd Ed.)

Table 4.5. Cash flows attainable from violation of lower bound, Americanput-call parity.

State Put Not Exercised by T Put Exercised at t∗ ∈ [t, T ]

Call ST − X CA(St∗ , T − t∗)Put 0 −X + St∗Stock −ST −St∗Money fund XMT /Mt XMt∗/Mt

Bonds 0 ∆(t∗, T )

Net X(MT /Mt − 1) ≥ 0 CA(St∗ , T − t∗) + X(Mt∗/Mt − 1)+∆(t∗, T ) ≥ 0

If the put is not exercised by T it must then be out of the money, in whichcase the call can be exercised to cover the shorted stock, paying the strikeprice out of the money-fund balance. Each stock dividend is repaid to theowner as it comes due using the principal from the bonds of that maturity.What is left is the residual money fund balance, X(MT /Mt − 1), and thisis positive unless the short rate was always zero during [t, T ]. (Recall thatcondition RB rules out negative short rates.) If the put is exercised atsome t∗, one uses the stock that has to be purchased to cover the shortposition, again paying the strike price out of the money account. Besidesthe nonnegative residual balance, X(Mt∗/Mt−1), one still has bonds worth∆(t∗, T ) that were dedicated to the remaining dividends, plus a call withremaining life T − t∗. Since the call can always be thrown away, its valueis clearly nonnegative. The net value of the portfolio being at least zerowhether the put is exercised or not, this strategy is indeed an arbitrage.

Violation of the upper bound of (4.15) affords such an opportunity aswell. Take a portfolio short the call, long the put, long the stock, and shortX units of T -maturing bonds worth B(t, T )X . Again, this portfolio can behad for a negative sum. Since it is short the call, the relevant states arenow “call not exercised” and “call exercised”. As dividends are paid, investthem in the money fund. Letting M(t, t′) be the resulting balance at timet′ ∈ [t, T ], table 4.6 shows that the portfolio’s value in each case is positiveor zero.

If the underlying is a currency or other asset that generates a continuousproportional receipt at rate δ, then a similar argument shows that the parityinequality is

Ste−δ(T−t)−X ≤ CA(St, T − t)−PA(St, T − t) ≤ St−B(t, T )X. (4.16)

To give some idea of the width of these bounds, the difference betweenprices of a 3-month at-the-money call and a corresponding put on a stock

Page 182: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

Dynamics-Free Pricing 165

Table 4.6. Cash flows attainable from violation of upper bound, Americanput-call parity.

State Call Not Exercised by T Call Exercised at t∗ ∈ [t, T ]

Call 0 −St∗ + XPut X − ST P A(St∗ , T − t∗)Stock ST St∗Money fund M(t, T ) M(t, t∗)Bonds −X −B(t∗ , T )X

Net M(t, T ) ≥ 0 P A(St∗ , T − t∗) + X[1 − B(t∗ , T )]+M(t, t∗) ≥ 0

index paying 2% per annum would be between about −0.5% and +1.0% ofthe index price if the 3-month average spot rate were 4% per annum.

Since the explicit cost of carry for a financial asset paying lump-sumdividends is

K(t, T ) = −∆(t, T ) (4.17)

and is

K(t, T ) = St[e−δ(T−t) − 1] (4.18)

for an asset paying continuous proportional dividends at constant rate δ >

0, the two expressions (4.15) and (4.16) are both contained in

St + K(t, T )− X ≤ CA(St, T − t) − PA(St, T − t) ≤ St − B(t, T )X.

(4.19)

4.3.3 Bounds on Option Prices

While arbitrage keeps the difference between prices of comparable calls andputs within fairly narrow limits, the bounds on prices of individual optionsremain quite wide unless structure is imposed on the prices of the under-lying and of default-free bonds. If we assume just that St is nonnegative,then under perfect-market conditions PM1-PM4, the known-cost-of-carryassumption (CCK), and the bounded-rate assumption (RB) arbitrage con-strains prices of puts and calls as shown in table 4.7. By applying either(4.17) or (4.18), this covers the two cases of known lump-sum dividends andcontinuous proportional dividends. In the table K stands for K(t, T ), theexplicit cost of carry, which is assumed to be negative or zero depending onwhether or not the underlying generates cash receipts; B stands for B(t, T ),the current price of a T -maturing bond; and each option is struck at X .

Page 183: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

166 Pricing Derivative Securities (2nd Ed.)

Table 4.7. Arbitrage bounds on prices of options.

Bound Lower Upper

European call, CE(St, T − t) max0, St + K − BX St + KAmerican call, CA(St, T − t) max0, St + K − BX, St − X St

European put, P E(St, T − t) max0, BX − St − K BXAmerican put, P A(St, T − t) max0, BX − St − K,X − St X

Here are the arguments, starting with the upper bounds. How to arbi-trage the condition CE(St, T − t) > St + K depends on how the dividendsare paid. If they are lump-sum, buy one share of stock and borrow a totalof −K = ∆(t, T ), matching the maturities of the loans with the variousdividend dates and repaying the loans as the dividends are received. If div-idends are proportional to the price of the stock, just buy St + K worthof stock and invest the dividends in stock as they are received. In bothcases the initial net cost of the purchases would be more than offset by theproceeds from the sale of one call. If the call is exercised, deliver the stockin return for the strike price. If it is not exercised, one is left with stockworth ST ≥ 0, thus at least breaking even either way. If an American callsells for more than St, buy the stock and sell a covered call, pocketing thedifference. If the call is exercised at or before T , deliver the stock at nocost. If it is not exercised, sell the stock. (If the stock does pay a dividend,the upper bound is a little higher for the American call because it couldbe exercised before any dividends are received.) If the European put sellsfor more than BX, sell it and put that amount into T -maturing bonds. Ifthe American put sells for more than X , then sell it and put the proceedsin the money fund. If either put is exercised, the respective bond or moneyfund balances will finance the payment of the strike, and the stock receivedin exchange can be worth no less than zero.

Moving to the lower bounds, none of the options can have negative valuesince, as we have argued before, they can always be thrown away. If either ofthe calls is worth less than St+K−BX, so that C(St, T−t)+BX < St+K,then buy the call and BX worth of bonds and short the stock. Since we havethe option of holding the American call all the way to T , we can considerjust that possibility. If the dividends are lump-sum, with K = −∆(t, T ), weshort one share of stock and buy bonds worth BX + ∆(t, T ), distributingthe maturities of those covering the dividends to match the payment datesand using principal values to repay the dividends on the shorted stock asthey come due. If the dividends are proportional to the stock’s price, weshort stock worth St+K (i.e., 1+K/St shares) and finance the dividends as

Page 184: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

Dynamics-Free Pricing 167

Table 4.8. Cash flows attainable from violationof 2nd lower bound for price of American call.

State ST ≤ X ST > X

Call 0 ST − XStock −ST −ST

Bonds X X

Net X − ST ≥ 0 0

they come due by further sales. For example, if proportional dividends werepaid continuously at rate δ, we would short e−δ(T−t) shares and continueto sell short at rate δ until time T . Either way, we wind up at T short oneshare and with X in cash. If ST ≤ X the call is worthless and we coverthe stock and come out ahead by X − ST . If ST > X we use the bondprincipal to exercise the call, then repay the borrowed stock and come outeven. Table 4.8 summarizes.

If 0 ≤ CA(St, T − t) < St − X , violating the third lower bound for theAmerican call, buy the call and exercise it immediately. Of course, this is notpossible for the European option. The bounds for puts can be establishedby similar arguments or, for European options, by applying put-call parityand the bounds on the call.

The bounds on prices of American calls, although very wide when St andX are positive, nevertheless have an important consequence for pricing theseinstruments. Notice that if the stock pays no dividend (has zero explicit costof carry), the second lower bound dominates the third so long as t < T,

since in that case B(t, T ) ≤ 1 and

St − B(t, T )X ≥ St − X.

Therefore, CA(St, T − t) ≥ St −X when t < T , and the inequality is strictif interest rates are strictly positive. This means that an American call ona no-dividend stock is always worth at least as much as its intrinsic value—its value if exercised immediately, and therefore it will not (rationally) beexercised early.10 It follows that when no dividends are to be paid duringthe option’s life an American call must have the same value as an other-wise identical European call. This is important because, as we shall see inlater chapters, it is much easier to value European options than Americanoptions. On the other hand, when the underlying is to pay a lump-sumdividend during the term of the option, it may well be optimal to exer-cise just before the stock begins to trade ex dividend. Moreover, when

10This is also true, a fortiori, if the explicit cost of carry is positive.

Page 185: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

168 Pricing Derivative Securities (2nd Ed.)

dividends are paid continuously, early exercise is possible at any time. Allthis is explained in chapters 5 and 7 as we consider ways to value Americanoptions on dividend-paying stocks in discrete and continuous time.

The situation is entirely different for the early exercise of American puts.Prior to expiration the third lower bound, X − St, dominates the second,BX − St − K, unless there is a dividend. This means that early exerciseof an American put could well be optimal. One can see at once that, if thestock’s price fell to zero (the lower bound) at t, nothing could be gained bywaiting, and interest on X would be lost. Since the early-exercise feature hasvalue, an American put is usually worth more than an otherwise identicalEuropean put.

The bounds on American and European options on underlying financialassets become sharp at certain extreme values of the underlying and strike,as summarized in table 4.9. (We take for granted that K(t, T ) ≤ 0 forfinancial assets.)

Notice that when X = 0 a call—even a European call on a no-dividendstock—effectively gives the stock for the taking. In fact, it is sometimeshelpful to think of a stock or other underlying asset as a call with a strikeprice of zero.

The bounds for some of the options also become sharp as the timeto expiration approaches infinity. Interestingly, the limiting behavior asT → ∞ depends critically on whether dividends are lump-sum or propor-tional to the stock’s price. Table 4.10 summarizes the results under the

Table 4.9. Values of options at extremes of underlying price and strike.

Case St = 0 X = 0

European call CE(St, T − t) = 0 CE(St, T − t) = St + KAmerican call CA(St, T − t) = 0 CA(St, T − t) = St

European put P E(St, T − t) = BX P E(St, T − t) = 0American put P A(St, T − t) = X P A(St, T − t) = 0

Table 4.10. Values and bounds of perpetual options.

Type of Dividend

Option Lump-Sum Proportional

European call CE(St,∞) = St CE(St,∞) = 0American call CA(St,∞) = St St − X ≤ CA(St,∞) ≤ St

European put P E(St,∞) = 0 P E(St,∞) = 0American put X − St ≤ P A(St,∞) ≤ X X − St ≤ P A(St,∞) ≤ X

Page 186: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

Dynamics-Free Pricing 169

assumptions that both the average spot rate, r(t, T ), and the dividend pro-portions are positive and nonvanishing (that is, bounded away from zero)and that the present value of the lump-sum dividend stream approacheszero. For example, for the European call when dividends are lump-sum,both K(t, T ) and B(t, T ) approach zero under these assumptions, so thatthe lower and upper bounds in table 4.7 coincide asymptotically. If, on theother hand, dividends are proportional to price, positive, and nonvanish-ing, then St +K(t, T ) → 0. For example, if dividends are paid continuouslyat constant rate δ > 0, then St + K(t, T ) = Ste

−δ(T−t). In this case asT → ∞ we can replicate the call’s maximum possible time-T payoff with avanishingly small fractional share.

4.3.4 How Prices Vary with T , X, and St

Let us now see what static-replication arguments can establish about therelations of option prices to time to expiration, strike price, and the currentprice of the underlying.

Variation with Term

A simple argument shows that the value of any vanilla American-style putor call must be a nondecreasing function of the time to expiration. Forexample, a call expiring at T2 > T1 is equivalent to a pair of options—aT1-expiring call plus a T2-expiring call that kicks in at T1 if the first callwas not exercised. Since the second option cannot have negative value, itfollows that CA(St, T2 − t) ≥ CA(St, T1 − t)and, likewise, PA(St, T2 − t) ≥PA(St, T1 − t). Obviously, this argument does not apply to Europeanoptions, since the one expiring at T2 cannot be exercised before T2. Ofcourse, if the underlying is a financial asset that pays no dividend (andhence has zero cost of carry) then we know that American and Euro-pean calls have the same prices. In this one case we can conclude thatCE(St, T2− t) ≥ CE(St, T1− t) when T2 > T1; otherwise, without imposingspecific structure on St the change in the price of a European option withterm is of ambiguous sign.

Variation with Strike Price

Arbitrage does enforce very specific relations of option prices to the strike,for European as well as American varieties. We will see that calls are

Page 187: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

170 Pricing Derivative Securities (2nd Ed.)

Table 4.11. Cash flows attainable if call values increased with strike.

State

Position ST ≤ X1 X1 < ST ≤ X2 ST > X2

Long call 0 ST − X1 ST − X1

Short call 0 0 X2 − ST

Net 0 ST − X1 > 0 X2 − X1 > 0

nonincreasing functions of X , that puts are nondecreasing functions, thatthe absolute values of rates of change of puts and calls alike are no morethan unity, and that both C(·, ·; X) and P (·, ·; X) are convex functions.Since all comparisons are for the same values of St and T , we simplifynotation by writing C(X) and P (X), omitting superscripts for relationsthat apply to both European and American options.

We first establish that C(X2) − C(X1) ≤ 0 and P (X2) − P (X1) ≥ 0when X2 > X1. Starting with the call, if C(X2) > C(X1) we could sellthe call struck at X2, buy the one struck at X1, and have money left over.In the European case, or if the options are American and the X2 call isnot exercised before T , then the payoffs of the arbitrage portfolio in thethree relevant terminal price states will be as shown in table 4.11. If theoptions are American and the shorted put is exercised at some t∗ ∈ [0, T ),then the long put would also be in the money and exercising it at t∗ wouldproduce the same positive payoff as that in the last column. The samekind of argument shows that the condition P (X2)− P (X1) < 0 would alsopresent an opportunity for arbitrage.

Having signed the slopes of the put and call functions, we can nowverify that their absolute values are bounded above by unity; that is,∆C/∆X ≥ −1 and ∆P/∆X ≤ 1. Starting with American calls we showthat CA(X2)−CA(X1) < −(X2 −X1) presents us with an arbitrage. Buy-ing the X2 call, shorting the X1 call, and putting ∆X ≡ X2 − X1 in themoney fund brings in a positive amount now and leads to the time-T pay-offs shown in table 4.12 if the shorted call is not exercised early. If the X1

call is exercised at some t∗ < T , then the payoffs are as in the last twocolumns of the table but with t∗ replacing T throughout. Either way thereis an arbitrage.

For European calls the bound on∣∣∆CE/∆X

∣∣ can actually be sharpeneda bit, to ∆CE/∆X ≥ −B(t, T ). If CE(X2)−CE(X1) < −B(t, T )(X2−X1)we can buy the X2 call and sell the X1 call as before, but since early exerciseis no longer a worry we can lock in a return on the net take from the calls as

Page 188: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

Dynamics-Free Pricing 171

Table 4.12. Cash flows attainable from violation of lower bound onslope of price of American call.

State

Position ST ≤ X1 X1 < ST ≤ X2 ST > X2

Long call 0 0 ST − X2

Short call 0 X1 − ST X1 − ST

Money fund MTMt

∆X MTMt

∆X MTMt

∆X

Net MTMt

∆X ≥ 0 X2 − ST +` MT

Mt− 1

´∆X ≥ 0` MT

Mt− 1

´∆X ≥ 0

Table 4.13. Cash flows attainable from violation of lower bound onslope of price of European call.

State

Position ST ≤ X1 X1 < ST ≤ X2 ST > X2

Long call 0 0 ST − X2

Short call 0 X1 − ST X1 − ST

Bonds ∆CE

B+ ∆X ∆CE

B+ ∆X ∆CE

B+ ∆X

Net ∆CE

B+ ∆X > 0 ∆CE

B+ X2 − ST > 0 ∆CE

B> 0

well as from the difference in the strikes. This would be done by buying T

-maturing bonds worth ∆CE + B(t, T )∆X. While this leaves us with zerocash at time t, table 4.13 shows that it buys strictly positive payoffs in allstates at T . Similar arguments for American and European puts show that∆PA/∆X ≤ 1 and ∆PE/∆X ≤ B(t, T ).

Not only does arbitrage bound the slopes of calls and puts with respectto X , it requires C(X) and P (X) to be (weakly) convex functions. We nowshow that for each X1 > 0, each X3 > X1, and each p ∈ (0, 1) and q = 1−p

C(pX1 + qX3) ≤ pC(X1) + qC(X3) (4.20)

P (pX1 + qX3) ≤ pP (X1) + qP (X3) (4.21)

for both American and European options. For this let X2 = pX1 + qX3, sothat p = (X3 − X2)/(X3 − X1) and q = (X2 − X1)/(X3 − X1). Startingwith the call, suppose, contrary to (4.20), that C(X2) > pC(X1)+ qC(X3).Selling one X2 call and buying p of the X1 calls and q of the X3 calls bringsin a positive amount of cash now. If the shorted call is not exercised beforeT then table 4.14 shows that the portfolio has nonnegative value for anyoutcome ST ≥ 0.

Page 189: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

172 Pricing Derivative Securities (2nd Ed.)

Table 4.14. Cash flows attainable if call’s price were concave function of strike.

State

Position ST ≤ X1 X1 < ST ≤ X2 X2 < ST ≤ X3 ST > X3

X1 calls 0 p(ST − X1) p(ST − X1) p(ST − X1)X2 call 0 0 −(ST − X2) −(ST − X2)X3 calls 0 0 0 q(ST − X3)

Net 0 p(ST − X1) > 0 q(X3 − ST ) ≥ 0 0

Table 4.15. Cash flows attainable if put’s price were concave function of strike.

State

Position ST ≤ X1 X1 < ST ≤ X2 X2 < ST ≤ X3 ST > X3

X1 puts p(X1 − ST ) 0 0 0X2 put −(X2 − ST ) −(X2 − ST ) 0 0X3 puts q(X3 − ST ) q(X3 − ST ) q(X3 − ST ) 0

Net 0 p(ST − X1) > 0 q(X3 − ST ) ≥ 0 0

This takes care of the European case. If the options are American andthe shorted X2 call is exercised at some t∗ < T , then the payoffs in theonly relevant states, X2 < St∗ ≤ X3 and St∗ > X3, will be as in the lasttwo columns with t∗ in place of T . Arbitraging the condition P (X2) >

pP (X1) + qP (X3) proceeds in the same way by buying the portfolio of X1

and X3 puts and shorting the X2 option. If the shorted put is not exercisedbefore T the payoffs are as shown in table 4.15, and if it is exercised att∗ < T the payoffs are as in the first two columns with T replaced by t∗.

Homogeneity of Option Prices in St and X

On any price path Stt≥0 that produces a positive payoff of a vanilla Euro-pean or American option the payoff will depend linearly on the underlyingprice and strike, as St∗ −X or X −St∗ for calls and puts, respectively. Theobvious implication of this is that call and put prices at any t ∈ [0, T ] willbe denominated in the same units as St and X . If we choose to measure allthe quantities in units of the strike price, then the options’ payoffs becomeSt∗/X − 1 and 1 − St∗/X , and the prices at t become C(St, T − t; X)/X

and P (St, T − t; X)/X. It follows that

C(St, T − t; X) = XC(St/X, T − t; 1) (4.22)

P (St, T − t; X) = XP (St/X, T − t; 1). (4.23)

Page 190: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch04 FA

Dynamics-Free Pricing 173

These relations will be of value in later chapters when we develop explicitexpressions for the prices of puts and calls and when we approximate themnumerically. For example, values of C(St, T − t; X) on a two-dimensionalgrid of price and strike values can be found by evaluating (4.22) on a one-dimensional grid of price/strike ratios. The relations are also useful in help-ing to spot erroneous pricing formulas. However, one must be careful notto misinterpret them. For example, (4.22) should not be taken to meanthat an at-the-money equity call when the stock’s price is St = 10 currencyunits has value ten times that of an at-the-money call when St = 1 currencyunit. The reason is that the dynamics of Suu≥t—that is, the nature ofthe possible price paths emanating from (t, St)—may well depend on theinitial price level.

With this caveat in mind, one should be able to spot the flaw in thefollowing

FALSE Theorem: Under conditions PM1-PM4, and given just that St ≥0 for all t ≥ 0, the price of a European or American put is a strictlydecreasing function of the current price of the underlying.

The “proof” of this statement might run as follows. Taking u > 1, (4.23)gives

P (uSt, T − t; X) = uP (St, T − t; X/u).

Applying (4.21) with X3 = X , X2 = X/u, and X1 = 0 then gives

P (uSt, T − t; X) ≤ u − 1u

P (St, T − t; 0) +1u

P (St, T − t; X)

=1u

P (St, T − t; X)

< P (St, T − t; X).

Bergman et al. (1996) give specific examples of price processes for whichthis sort of reasoning fails. We present one of their examples in section 8.2.1to show that call prices do not necessarily increase monotonically with St.In sum, without further restriction on the dynamics of the underlying pricenothing can be said about how option prices depend on St.

Page 191: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

5Pricing under Bernoulli Dynamics

In chapter 4, static replication was used to value forward commitments tobuy or sell an investment asset. This involved finding a position in tradedassets—the underlying itself plus riskless bonds—that produces the sameterminal payoff as the forward contract, ST − f(0, T ). Because this payoff islinear in the price of the underlying asset, and because values of portfoliosare also linear in the prices, replicating the forward contract in this wayrequires just a static portfolio—one that does not have to be adjusted astime passes. Also, since it is static, the replicating portfolio is necessarilyself-financing, in that no funds need be added or withdrawn along the way.Having found a self-financing replicating portfolio, we could apply the lawof one price to value the forward contract and, ultimately, to deduce theinitial arbitrage-free forward price, f(0, T ). The linearity of the contract’sterminal payoff makes the process very easy.

Imagine, however, trying to follow this program to determine the valueat time t < T of a T -expiring European call option, CE(St, T − t). If thestrike is X the option’s only potential payoff is CE(ST , 0) = (ST − X)+ atT . But since (ST −X)+ is not linear in ST , it will not ordinarily be possibleto construct a static replicating portfolio. If it turns out that ST > X wewould need to be long a full unit of the asset and short X currency unitsworth of bonds at T , whereas if ST ≤ X we would want to have no netposition in either asset. As we approached time T , the replicating portfoliowould have to be adjusted continually toward one or the other of theseextreme positions, depending on the path taken by St. Pricing a call byarbitrage therefore requires a recipe for constructing a dynamic replicatingportfolio. And since the call position that is being replicated requires noadditions or withdrawals of cash after the initial purchase, the replicating

174

Page 192: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 175

portfolio must also be self-financing once it is put in place, with subsequentpurchases of one asset being financed by sales of the other.

This chapter shows how to construct dynamic, self-financing replicatingportfolios for options and other nonlinear derivatives under a very sim-ple Bernoulli model for the underlying price. The method yields what arecalled “binomial” estimates of arbitrage-free prices. The central features ofBernoulli dynamics are that time is measured discretely and that assets’prices after any span of time are limited to a finite number of possiblestates. Although the Bernoulli model will at first seem highly unrealistic, itdoes have two very desirable features. First, it helps in understanding thecontinuous-time, continuous-state-space models taken up in later chapters,for which the underlying mathematics is far more difficult. Second, despiteits apparent lack of realism, the binomial method turns out to be capableof producing very useful quantitative estimates of prices.

We begin by explaining the motivation for and structure of the Bernoullimodel and the conditions for portfolios to be self-financing in this discrete-time setup. Section 5.2 then gives stylized applications of the binomialmethod to European-style derivatives, using an underlying asset and risk-less bonds to form the dynamic replicating portfolio. Although the methodis presented first as a recursive procedure, we will see that simple pric-ing formulas can be found for European-style derivatives. Section 5.3shows that these formulas—and binomial estimates generally—can begiven two different interpretations. One interpretation is that they aresolutions to difference equations that govern how the price of the deriva-tive evolves over time and with the underlying price. The other inter-pretation is that binomial estimates are discounted expected values ofthe derivative’s payoffs, relative to a special probability distribution. Thelatter viewpoint is that of equivalent-martingale or risk-neutral pricing.Both methods have counterparts in the continuous-time framework oflater chapters, but they are much easier to understand in the Bernoullicontext.

After these basic concepts are presented, we turn in section 5.4 to spe-cific applications: European options on stocks that pay no dividends, futuresprices and derivatives on futures, American-style derivatives, and deriva-tives on assets that yield cash receipts, such as dividend-paying stocks andpositions in foreign currencies. Section 5.5 provides further details on imple-mentation, demonstrates that binomial valuations of European options canapproximate the Black-Scholes formulas from continuous time, and sug-gests efficient computational methods. Of course, like all other models, the

Page 193: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

176 Pricing Derivative Securities (2nd Ed.)

standard binomial setup has its limitations. The final section shows how itcan be adapted to better fit the observed market prices of traded options.

5.1 The Structure of Bernoulli Dynamics

Let us consider the effort to replicate at some initial time t = 0 a genericEuropean-style derivative whose payoff at time T , D(ST , 0), depends juston the price of a single underlying traded asset. We have seen that it is notgenerally possible to construct a static replicating portfolio from just theunderlying asset and riskless bonds unless D is linear in ST . However, itis clear that any proper function is linear on a domain that comprises justtwo points! For example, f(s) = exp(s) is linear if restricted to the domains′, s′′. Thus, the payoff of a European call would also be linear in theterminal value, ST , if the dynamics were such that P(ST ∈ s′, s′′) = 1.

This simple observation is the basis of the binomial model for derivativespricing, proposed by Cox , Ross, and Rubinstein (1979).

Of course, it is hardly realistic to propose that the price of a stock, aforeign currency, or a precious metal, etc., could have one of just two valuesafter an interval of a month, a week, or even a day. However, suppose thetime interval [0, T ) on which the derivative lives is subdivided arbitrarilyfinely, into say n equal subperiods of length T/n. Letting sj ≡ SjT/n be theprice at time jT/n for j ∈ 0, 1, . . . , n− 1 , model sj+1 as sj+1 = sjRj+1,

where price relatives R1, R2, . . . , Rn are independent random variables withthe same (translated and rescaled) Bernoulli distribution:

P(Rj = u) = π = 1 − P(Rj = d)

for positive numbers u > d. In fact, that the distributions be the samefor each time step is not essential, and we will later consider schemesthat allow for deterministic variation. Although the notation at this pointdoes not indicate it, the uptick and downtick values of the price rela-tives, u and d, will certainly have to depend on n, since prices are lessvariable over short periods than over long. Starting with initial values0 = S0, there are two possible price states at t = T/n (s0u, s0d),three states at t = 2T/n (s0u

2, s0ud, s0d2), and n + 1 states at t = T

(s0un, s0u

n−1d, . . . , s0udn−1, s0dn). The multiplicative setup with positive

u and d has the virtue of keeping sj positive, which is appropriate for mostfinancial assets. Clearly, we must have u > 1 > d in order to allow for bothpositive and negative changes in price.

Page 194: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 177

Fig. 5.1. A recombinant binomial tree.

The tree diagram in figure 5.1 depicts the state space after n periods.Positions from which the two branches emanate—the attainable price statesat a given stage—will be called the “nodes”. Notice that with this arrange-ment the downtick from a given node at any step j leads to the same pricelevel at j+1 as does an uptick from the node below it; for example, at step 2we have s0u · d = s0d · u. A tree with this property is said to “recombine”.Later, we will encounter situations leading to nonrecombinant trees. Thetotal number of paths after n steps is 2n whether the tree recombines ornot. If it does fully recombine there are n + 1 terminal nodes; if not, therecan be up to 2n.

Figure 5.1 depicts a stochastic process sj adapted to a filtered proba-bility space, (Ω,F , Fjn

j=0, P). An element ω of Ω is one of the 2n samplepaths of sj, and F = Fn is the field of events generated by sjn

j=0. Like-wise, Fj conveys the whole history of the process through stage j, and wetake F0 to be the trivial field, ∅, Ω. The process sj is clearly Markovin view of the independence of the one-step price relatives. Finally, we notethat the probability associated with any attainable price level at stage j

has a binomial form. That is, since the event sj = s0uj−kdk occurs when

Page 195: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

178 Pricing Derivative Securities (2nd Ed.)

and only when there are k downticks in j independent trials, we have fork ∈ 0, 1, . . . , j

P(sj = s0uj−kdk) ≡ P(ω : sj(ω) = s0u

j−kdk) =(

j

k

)πj−k(1 − π)k

and, for j ∈ 0, 1, . . . , n and k ∈ 0, 1, . . . , n − j,

P(sn = sjun−j−kdk|Fj) =

(n − j

k

)πn−j−k(1 − π)k.

Before beginning to construct self-financing, replicating portfolios thatwill enable us to value derivatives, we should make very clear what it meansfor a portfolio to be self-financing in the discrete-time setup. Consider aportfolio of just two assets, worth Pj = pjsj + qjs

′j at time j. At j + 1

two things will ordinarily have happened: the prices will have changed, andthe portfolio weights will have been adjusted. The change in value of theportfolio between j and j +1, Pj+1−Pj , can be decomposed into these twoeffects, as

(pj+1sj+1 + qj+1s′j+1) − (pjsj + qjs

′j)

= [(pj+1 − pj)sj+1 + (qj+1 − qj)s′j+1]

+ [pj(sj+1 − sj) + qj(s′j+1 − s′j)]

= (∆P from portfolio adjustment) + (∆P from price change).

The portfolio-adjustment term would be zero for a self-financing portfolio.In this case purchases of one asset are financed by sales of the other or,to put it the other way, proceeds of sales of one asset are reinvested inthe other. Since no cash is either added or withdrawn, the change in theportfolio’s value between j and j + 1 would be ∆Pj = pj∆sj + qj∆s′j .Maintaining this discipline over n trading intervals, the net change in theportfolio’s value from j = 0 to j = n will be

Pn − P0 =n−1∑j=0

(pj∆sj + qj∆s′j). (5.1)

More generally, the net change in value of a self-financing portfoliopj = (p1j , p2j , . . . , pkj)′ chosen from k assets with step-j prices sj =(s1j , s2j , . . . , skj)′ would be

Pn − P0 =n−1∑j=0

p′j∆sj . (5.2)

This is the discrete counterpart to expression (3.30), which applies in con-tinuous time.

Page 196: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 179

5.2 Replication and Binomial Pricing

To see how the Bernoulli setup makes it possible to construct dynamic,self-financing replicating portfolios, let us replicate a generic European-style derivative that pays the holder D(ST , 0) upon expiration at stage n.Continuing to measure time in steps, we write D(sj , n− j) ≡ D(SjT/n, T −jT/n) for the value of the derivative at step j when the underlying price issj and there are n−j steps to go. With this convention D(sn, 0) = D(ST , 0)at step n. Similarly, for 0 ≤ j ≤ k ≤ n B(j, k) ≡ B(jT/n, kT/n) representsthe price at step j of a unit discount bond that matures at step k. Invokingassumption RK, current and future prices of bonds maturing at step n orearlier are supposed to be known at step 0. We refer to the underlyingasset as a “stock” and assume for now that it pays no dividend and has noexplicit cost of carry. The objective is to determine the arbitrage-free valueof the derivative at the initial step, D(s0, n), given knowledge of the initialprice of one share of the stock, s0.

As a second asset needed for replication we could use a unit discountbond maturing at step n or later, or else we could manage by rolling overbonds of shorter lives; however, it is a little easier notationally to use the“money” fund. Recall that this is like a savings account that rolls overinvestments in riskless bonds that are just on the verge of maturing. In thediscrete-time setup these are bonds that mature after one time step, eachbeing worth B(j, j + 1) at step j and B(j + 1, j + 1) = 1 at j + 1. Thus,the value of a unit of the money fund grows from mj ≡ MjT/n at step j tomj+1 = mjB(j, j+1)−1 ≡ mj(1+rj) at step j +1, where rj is the discrete-time counterpart of the short rate at step j. To keep the notation as simpleas possible for now, we fix rj = r for each j; and, as for u and d, we donot make explicit its dependence on n. Starting off the money fund at somearbitrary initial value m0 per share, the value at stage j is mj = m0(1+r)j.Under assumption RK the entire sequence of values mjn

j=0 is known asof step 0, and by assumption RB it is nondecreasing in j. Also, in accordwith assumptions PM1-PM4, trading and holding stock and money fundare costless, and any positive or negative number of shares of either can beheld.

With this setup the value of the derivative asset can be replicated ateach time step by working backwards through the tree. We work backwardsbecause the terminal value of the derivative at step n is a known, contractu-ally specified function of the underlying price, whereas the values at earlierstages are what we are attempting to determine. At the final stage there

Page 197: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

180 Pricing Derivative Securities (2nd Ed.)

are n + 1 possible values of the stock and up to n +1 corresponding knownvalues of the derivative, D(sn, 0). We begin the program at stage n − 1 byfinding D(sn−1, 1) at each of the n possible realizations of sn−1. Given anysuch sn−1 the stock will be worth either sn−1u or sn−1d next period, thederivative will be worth either D(sn−1u, 0) or D(sn−1d, 0), and the valueof the money-market share will be mn = mn−1(1+ r). The goal is to find aportfolio of the stock and money fund that can be bought at n− 1 and yet,without our adding or withdrawing funds, will be sure to have the samevalue as the derivative at stage n. D(sn−1, 1), the arbitrage-free value ofthe derivative at n−1 given sn−1, will then equal the value of this portfolio.

If the replicating portfolio contains some pn−1 shares of the stock andqn−1 units of the money fund, its value at stage n − 1 is

Pn−1 = pn−1sn−1 + qn−1mn−1.

If it is self-financing, (5.1) implies that its value at stage n will be

Pn = Pn−1 + pn−1(sn − sn−1) + qn−1(mn − mn−1)

= pn−1sn + qn−1mn.

Choosing portfolio shares to replicate the derivative requires

Pun ≡ pn−1sn−1u + qn−1mn = D(sn−1u, 0) (5.3)

P dn ≡ pn−1sn−1d + qn−1mn = D(sn−1d, 0), (5.4)

or

pn−1 =D(sn−1u, 0) − D(sn−1d, 0)

sn−1u − sn−1d(5.5)

qn−1 =D(sn, 0)

mn− pn−1

sn

mn. (5.6)

In (5.6) we express the step-n values of the derivative and the stock interms of the Fn-measurable random variable sn since qn−1 is the same forsn = sn−1u and sn = sn−1d.

Given the realization of sn−1 at stage n − 1, the portfolio pn−1, qn−1does replicate the derivative when it has one period to go, and it doesso without having to add or withdraw funds. The value of this portfo-lio, and therefore the arbitrage-free value of the derivative itself, is thenD(sn−1, 1) = pn−1sn−1 + qn−1mn−1, a function of the Fn−1-measurablequantities sn−1, u, d, and r. Once the value of D(sn−1, 1) is determined inthis way at each of the n possible realizations of sn−1 we can back up tostage n − 2 and value D(sn−2, 2) at each of the n − 1 realizations of sn−2.

Page 198: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 181

Continuing in this way leads ultimately to the value of D(s0, n) at theknown initial value s0. This is the essence of the binomial method.

It will be helpful to characterize this recursive process more generallybefore giving an example. Suppose we are at any stage j ∈ 0, 1, . . . , n− 1with sj known and the next-stage value function, D(·, n − j − 1), havingalready been found. Simplifying notation still further, let Dj = D(sj , n− j)be the value of the derivative at step j (which is to be found) and Du

j+1 =D(sju, n− j − 1) and Dd

j+1 = D(sjd, n− j − 1) be the known future valuesin the two states that can be reached from sj . Forming a portfolio worth

Pj = pjsj + qjmj (5.7)

now and

Pj+1 = pjsj+1 + qjmj+1 (5.8)

next period requires that pj , qj satisfy

Puj+1 ≡ pjsju + qjmj+1 = Du

j+1

P dj+1 ≡ pjsjd + qjmj+1 = Dd

j+1.

Substituting solutions

pj =Du

j+1 − Ddj+1

sju − sjd

qj =Dj+1

mj+1− pj

sj+1

mj+1=

1mj+1

(uDd

j+1 − dDuj+1

u − d

)into (5.7) and simplifying give as the derivative’s arbitrage-free price

Dj =mj

mj+1

[(mj+1/mj − d

u − d

)Du

j+1 +(

u − mj+1/mj

u − d

)Dd

j+1

]. (5.9)

Finally, by setting mj+1/mj = 1 + r and

π ≡ 1 + r − d

u − d

we obtain the simple expression

Dj = (1 + r)−1[πDu

j+1 + (1 − π)Ddj+1

], j ∈ 0, 1, . . . , n − 1. (5.10)

Both the forms (5.9) and (5.10) will prove useful later on, and the weights π

and 1− π will be seen to have special significance. The notation deliberatelysuggests a parallel to the state probabilities π and 1−π, but it is evident thatneither π nor the binomial estimate Dj depends directly on those actual

Page 199: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

182 Pricing Derivative Securities (2nd Ed.)

probabilities. Of course, behind the scenes is a market that has presumablytaken those probabilities into account in determining the interest rate, thestock’s current price, and the distribution of price relatives, all of which aretaken now as given.

Example 60 Pricing a European call struck at X = 10.0 on an underlyingstock worth s0 = 10.0, let us divide the option’s life into n = 3 subperiodsand choose u = 1.05, d = 0.97, and r = 0.01, so that π = 1

2 . (Later we willsee how to determine the move sizes in realistic applications.) Figure 5.2summarizes the steps to find the call’s replicating portfolio at each stageand its arbitrage-free value at t = 0. The first step is to determine the valueof the option at the terminal nodes. At stage 3 the possible realizationsof the stock’s price are 10.0(1.053) .= 11.576, 10.0(1.052)(.97) .= 10.694,

10.0(1.05)(.972) .= 9.879, and 10.0(.973) .= 9.127. Taking m0 = 1, themoney fund is worth m3 = 1.013 .= 1.030 in all four states. The call is out ofthe money at the two bottom nodes, and is worth about 11.576−10.0 = 1.576and 10.694−10.0 = 0.694 at the two upper nodes. Now proceed to value thecall recursively at each preceding stage. At the top node of stage 2 with the

Fig. 5.2. Binomial pricing of a European call.

Page 200: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 183

stock worth 10.0(1.052) = 11.025, the replicating portfolio contains

p2 =1.576− 0.694

11.576− 10.694= 1.0

shares of stock and

q2 =1.576− 11.576

1.013= − 10.0

1.013

units of the money-market fund. (Notice that it makes sense to hold a fullunit of the stock, since from this position the call must wind up in the moneyat step 3.) The values of the replicating portfolio and call are then

P2 = p2s2 + q2m2

= (1.0)10.0(1.052) − 10.01.013

(1.012).= 1.124.

Alternatively, the value of the portfolio and call can be found directly from(5.10) as

CE(11.025, 1) =πCE(11.576, 0) + (1 − π)CE(10.694, 0)

1 + r

.=(.5) 1.576 + (.5) 0.694

1.01.= 1.124.

At the middle node of stage 2, where the stock is worth 10.0(1.05)(0.97) =10.185, the call is no longer certain to wind up in the money. The replicatingportfolio here contains only

p2 =0.694− 0.0

10.694− 9.878.= 0.852

shares of stock and

q2 =0.694− (0.852) · 10.694

1.013

.= −8.170

units of the fund. The value of the portfolio and the call is then

CE(10.185, 1) =(.5)0.694 + (.5)0.0

1.01.= 0.344.

From the bottom node of stage 2 the call has no chance to finish in themoney, so the replicating portfolio is null (p2 = q2 = 0) and the call is

Page 201: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

184 Pricing Derivative Securities (2nd Ed.)

worthless. Similar calculations at stage 1 give call values of about 0.727 and0.170 in the two states. Finally, at stage 0 the replicating portfolio has

p0 =0.727 − 0.170

10.5 − 9.7.= 0.696

shares of stock and

q0 =0.727 − (0.696)10.5

1.01.= −6.516

units of the money fund, with total value P0.= 0.444. Starting out with

this portfolio and making the adjustments shown at each node in the figuredelivers a portfolio that has the same value as the expiring call in every stateat stage 3. Accordingly, the call’s initial arbitrage-free price is CE(10.0, 3) .=0.444.

Although the solution was found recursively in the example, the initialvalue of a European-style derivative can be expressed directly in terms ofs0 by means of the following formula:

D(s0, n) = (1 + r)−nn∑

k=0

(n

k

)πn−k(1 − π)kD(s0u

n−kdk, 0). (5.11)

To prove this by induction, setting n = 1 in (5.10) gives

D(s0, 1) = (1 + r)−1[πD(s0u, 0) + (1 − π)D(s0d, 0)],

which verifies (5.11) for that case. Now supposing (5.11) holds for anyn − 1 ∈ ℵ, set j = 0 in (5.10) to get

D(s0, n) = (1 + r)−1[πD(s0u, n − 1) + (1 − π)D(s0d, n − 1)]

= (1 + r)−n

[n−1∑k=0

(n − 1

k

)πn−k(1 − π)kD(s0u

n−kdk, 0)

+n−1∑j=0

(n − 1

j

)πn−1−j(1 − π)j+1D(s0u

n−1−jdj+1, 0)

.

Noting from (2.6) on page 24 that(n−1

n

)=

(n−1−1

)= 0, one can extend the

first sum to n and change variables in the second sum as k = j + 1 to get

D(s0, n) = (1+r)−nn∑

k=0

[(n − 1

k

)+(

n − 1k − 1

)]πn−k(1−π)kD(s0u

n−kdk, 0),

from which (5.11) follows by identity (2.7). In the next section we shalldiscover a much simpler way to show this.

Page 202: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 185

How do things change if short rates are allowed to vary over time butare still known in advance, so that mj+1/mj = B(j, j + 1)−1 = 1 + rj , andmn = m0(1 + r0), . . . , (1 + rn−1)? The same replicating argument leads to

Dj = (1 + rj)−1[πjDuj+1 + (1 − πj)Dd

j+1], (5.12)

where now

πj ≡ 1 + rj − d

u − d. (5.13)

The recursive procedure works almost as easily as before despite the timedependence of π, but there is no longer such a simple formula for the valueof a European derivative. Let

Π(n, k) ≡∑

j0+···+jn−1=k

π1−j00 (1 − π0)j0 · · · · · π1−jn−1

n−1 (1 − πn−1)jn−1 ,

(5.14)

where the summation is over values of j0, . . . , jn−1 ∈ 0, 1 such thatj0 + · · · + jn−1 = k. Then (5.11) becomes

D(s0, n) =m0

mn

n∑k=0

Π(n, k)D(s0un−kdk, 0).

This is equivalent to

D(s0, n) = B(0, n)n∑

j=0

Π(n, k)D(s0un−kdk, 0), (5.15)

since when interest rates are known in advance m0/mn must equal the priceof a discount bond maturing at step n.

If we think of the πj as pseudo-probabilities, then Π(n, k) equals theprobability of k down moves in n steps. When the π’s are time-invariant,this has the binomial form

(nk

)πn−k(1 − π)k. We are about to see that this

probabilistic interpretation is both justifiable and useful.

5.3 Interpreting the Binomial Solution

Whether obtained recursively or directly as in (5.11) or (5.15), binomialsolutions can be interpreted in two useful ways. Viewed as solutions toa certain difference equation, they are the discrete-time counterparts ofsolutions to partial differential equations that arise in the continuous-timeframework adopted in later chapters. Alternatively, thinking of the π’s aspseudo-probabilities, binomial solutions appear to be mathematical expec-tations of the derivatives’ normalized payoffs. This interpretation, which

Page 203: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

186 Pricing Derivative Securities (2nd Ed.)

also has its counterpart in continuous time, corresponds to risk-neutral ormartingale pricing. The recursive formula (5.12) is a good place from whichto start in order to attain both perspectives.

5.3.1 The P.D.E. Interpretation

Formula (5.12) really amounts to a partial difference equation that governsthe variation of D with respect to the discrete-time counter and the price ofthe underlying asset. To see this, apply some algebra and reduce (5.12) to

0 = −rjDj + rjsj

(Du

j+1 − Ddj+1

sju − sjd

)

+

[(u − 1)(Dd

j+1 − Dj) + (1 − d)(Duj+1 − Dj)

(u − 1) + (1 − d)

].

The factor in parentheses in the second term is ∆D/∆s, the instantaneousrate of change of D at step j+1 with respect to sj+1, holding time constant.The third term, a weighted average of the state-dependent changes in D

with time, amounts to the partial difference with respect to t, or ∆D/∆t.With these interpretations (5.12) can be written as the partial differenceequation

−rD + rs · ∆D/∆s + ∆D/∆t = 0.

The binomial estimate of D(s0, n) solves this equation subject to the termi-nal condition imposed by the derivative’s contractual relation to the stock’sprice at expiration, D(sn, 0). Equations (5.11) and (5.15) are explicit solu-tions for a European-style derivative.

5.3.2 The Risk-Neutral or Martingale Interpretation

Gaining an understanding of the equivalent-martingale interpretation is abit more involved, but this is by far the easiest setting in which to do it.It is also well worth the cost. The first step is to see that the ability toreplicate a derivative’s payoffs within the Bernoulli framework implies theexistence of a pseudo-probability measure, with respect to which normalizedprices of the derivative, the underlying asset, and the money fund are allmartingales.

The Martingale Measure

Begin by recognizing that weighting factors πj = (1 + rj − d)/(u − d)and 1 − πj = (u − 1 − rj)/(u − d) in (5.12) must be strictly positive

Page 204: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 187

in an arbitrage-free market. Weight πj could be zero or negative only if1 + rj ≤ d. In this case there would be an arbitrage against riskless bonds,since shorting bonds (borrowing) and buying the stock could produce a gainbut could not produce a loss. Likewise, 1− πj ≤ 0 would require u ≤ 1+ rj

and set up an arbitrage against the stock. The implication is that πj ∈ (0, 1)at each step j. Since it thus possesses all the features of a Bernoulli proba-bility, let us construct a new pseudo-probability space (Ω,F , Fj, P). Thishas the same information structure as the “real” one in this model butcontains a new measure P such that

P(sn = sjun−j−kdk | Fj) = Π(n − j, k), k ∈ 0, 1, . . . , n − j (5.16)

for each sj and each j ∈ 0, 1, . . . , n−1. Again, sj is an adapted processin this new filtered space.

It is clear that P and P are equivalent measures since P assigns a positivevalue to a given sample path if and only if P does. By the Radon-Nikodymtheorem (section 2.2.4) there is then at each step j ∈ 0, 1, . . . , n − 1 aunique random variable Qj such that P(A | Fj) =

∫A

Qj(ω) · dP(ω | Fj) foreach A ∈ F . The construction of Qj is very simple in this discrete Bernoulliframework. For each step j a sample path ω that passes through sj andterminates at step n is defined by a sequence of upticks and downticks.Under P this path has probability

P(ω | Fj) = π1−ij

j (1 − πj)ij · ... · π1−in−1n−1 (1 − πn−1)in−1 ,

where each index ij, ij+1, . . . , in−1 takes a value in 0, 1 according as thepath moves down or up at that stage. The expression for P(ω | Fj), thepath probability under P, is the same except that π’s replace π’s. Then Qj

is defined by

Qj(ω) =P(ω | Fj)P(ω | Fj)

.

For example, taking Ajk to be the set of sample paths emanating from sj

and having k downticks between stages j and n, we have

P(Ajk | Fj) ≡ P(sn = sjun−j−kdk | Fj)

= Π(n − j, k)

=∑

ω∈Ajk

Qj(ω)P(ω | Fj).

Here summing over ω ∈ Ajk is equivalent to summing over all(n−j

k

)indices

ij , ij+1, . . . , in−1 such that ij + ij+1 + · · · + in−1 = k.

Page 205: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

188 Pricing Derivative Securities (2nd Ed.)

Let us now see that the new measure P has the special property ofturning suitably normalized values of sj and mj—and of any self-financingportfolio of these—into martingales. Normalize the prices of both assetsby the current per-unit value of the money fund, which is thus treated asnumeraire. With “∗” denoting normalized values, we have s∗j = sj/mj andm∗

j = 1, both quantities being dimensionless. Of course, m∗j is trivially a

martingale under both P and P. As for s∗j, taking conditional expectationsin the new measure gives

Ejs∗j+1 = πj

sju

mj+1+ (1 − πj)

sjd

mj+1

=(

mj+1/mj − d

u − d

)sju

mj+1+

(u − mj+1/mj

u − d

)sjd

mj+1

=sj

mj

= s∗j .

This with the tower property of conditional expectation then implies that

E|s∗n| = Es∗n = s∗0 + E

n−1∑j=0

Ej(s∗j+1 − s∗j )

= s∗0

for all n ∈ ℵ. Therefore, s∗j is also a martingale adapted to Fj.1 Theappropriateness of the term “equivalent martingale measure” for P is nowevident.

Martingale/Risk-Neutral Pricing

Now comes the crucial implication for pricing a derivative on s. Given thatthe derivative can be replicated at each step with a self-financing portfolioof the underlying and the money fund, its own normalized value at j + 1,

D∗j+1 ≡ Dj+1/mj+1, is a linear function of s∗j+1 with weights that are

Fj-measurable. Therefore, the martingale property of s∗ and m∗ under P

carries over to the normalized derivative also. That is, if at step j thereare known portfolio weights pj and qj such that pjsj + qjmj = Dj andpjsj+1 + qjmj+1 = Dj+1 for sure, it follows that

EjD∗j+1 = Ej(pjs

∗j+1 + qj) = pjs

∗j + qj = D∗

j .

1Since s∗j is a Markov process in the Bernoulli setup, one could as well write

Ej(s∗j+1) ≡ E(s∗j+1 | Fj) as E(s∗j+1 | s∗j ).

Page 206: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 189

Indeed, in the Bernoulli setup one sees directly that the derivative’s nor-malized price is a martingale under P, since by (5.12)

D∗j = πj

Duj+1

mj+1+ (1 − πj)

Ddj+1

mj+1

= EjD∗j+1

for each j ∈ 0, 1, . . . , n − 1.The martingale property of D∗

j can be exploited to find the deriva-tive’s arbitrage-free price at any step j by writing

D∗j = EjD

∗n. (5.17)

Since D∗n is a known function of sn and mn, this expectation can be

evaluated—either recursively or directly. Next, multiply both sides of (5.17)by the current value of the money-market fund, mj , to restore to currencyunits:

Dj = mjD∗j = mjEj

(Dn

mn

). (5.18)

For the initial price at j = 0 this gives

D0 = m0E(Dn/mn),

or, since mjnj=0 are all F0-measurable under assumption RK,

D0 =m0

mnEDn = B(0, n)EDn (5.19)

under that condition. Finally, using (5.16) to evaluate EDn gives directlythe explicit formula that was proved earlier by brute-force induction:

D0 = B(0, n)n∑

k=0

Π(n, k)D(s0un−kdk, 0). (5.20)

These last expressions show that martingale pricing can be thought ofequivalently as “risk-neutral” pricing when the price of the money fund isnumeraire. Since s0 = B(0, n)Esn and D0 = B(0, n)EDn under assumptionRK, currentprices of both theunderlying and thederivative are justwhat theywould be in an equilibrium with risk-neutral agents who happened to regardP as the actual probability measure. For this reason P is often called the “risk-neutral measure” as well as an “equivalent-martingale measure”. As we shallsee later, this distinguishes P from other martingale measures that apply forother numeraires. By switching to this new measure, assets can be priced bypretending that the economy is made up of individuals who are indifferent to

Page 207: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

190 Pricing Derivative Securities (2nd Ed.)

risk. However, one must be careful not to misinterpret this procedure. Indi-viduals in the Bernoulli world would really regard measure P as driving thedynamics of s, and there is no reason whatever to think that P and P are thesame. The crucial point is that the validity of risk-neutral pricing does notdepend on the assumption of risk neutrality. Rather, given that replication ispossible, the method is justified by the law of one price and the observationthat markets do not afford systematic opportunities for arbitrage.

It is helpful to consider why pricing by expectation requires a change ofmeasure. Notice that, if individuals were really risk-neutral, prices would sat-isfy s0 = B(0, n)Esn and D0 = B(0, n)EDn, since expected returns fromboth positions under P would then equal the sure return from holding a risk-less bond to maturity or equivalently (under assumption RK) investing inthe money fund. However, since we think that most people display risk aver-sion in their investment decisions, this is not the right discount factor to usewhen expectations are taken with respect to P. To find the factor B∗ suchthat D0 = B∗EDn would require measuring the derivative’s risk and deter-mining how much compensation the representative investor demands to bearit—facts aboutwhich reasonablepeople (and plausible models) are apt to dis-agree.2 By contrast, if a self-financing, replicating portfolio exists, then arbi-trage will drive the prices of the derivative and the portfolio together and tellus what the derivative is worth—regardless of peoples’ tastes for risk.

We are now in a position to state in more detail the recipe for martingalepricing that was given in chapter 1. To find the arbitrage-free value of aderivative that makes a single payoff at some future date contingent on theprice of an underlying asset,

• Verify that there exists a self-financing portfolio that replicates thederivative’s payoff;

• Find a measure that (i) is equivalent to the real measure governing theprice of the underlying and (ii) makes the underlying price a martingaleafter normalization by some numeraire;

• Relying on the fair-game property of martingales, calculate the currentnormalized price of the derivative as the conditional expectation of itsnormalized payoff under the new measure; and then

• Multiply by the current value of the numeraire to obtain the arbitrage-free price in currency units.

2Of course, once D0 is found by martingale methods B∗ can be found indirectly asB∗ = BEDn/EDn, assuming that we know enough to determine EDn in measure P.

Page 208: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 191

When the price of the money fund is used as numeraire and assumption RKapplies, the last two steps together are equivalent to finding the expectationof the derivative’s unnormalized payoff under P and discounting back tothe present at the riskless rate. Discounting is equivalent to multiplyingby the current price of a riskless unit bond that matures on the date ofpayoff. If there are contingent payoffs at multiple dates, price each of themindividually in this way and add them up.3

Martingale Measures and Arbitrage-Free Markets

Within the Bernoulli framework it is easy to see that the existence of a mar-tingale measure implies and is implied by the absence of opportunities forarbitrage, a statement that is now regarded as the Fundamental Theorem ofAsset Pricing. Consider the simple one-period setup depicted in figure 5.3.An asset costs v0 initially, and its price can take values vu

1 and vd1 at t = 1

with probabilities P(v1 = vu1 ) = π > 0 and P(v1 = vd

1) = 1 − π > 0. Anyequivalent measure P will assign positive probabilities π and 1− π to thesesame two outcomes. In this framework there are two canonical conditions

Fig. 5.3. A simple Bernoulli tree.

3In the continuous-time setting of chapter 10 we will distinguish between the “spot”martingale measure, P, for numeraire Mt and the “time-T forward measure”, PT , fornumeraire B(t, T ). When the expectation is taken under PT , relation D(St, T − t) =B(t, T )ET [D(ST , 0) | Ft] holds even when future short rates are uncertain. When RKdoes hold, P and PT coincide.

Page 209: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

192 Pricing Derivative Securities (2nd Ed.)

for an arbitrage:

v0 < 0, vu1 ≥ 0, vd

1 ≥ 0; (5.21)

v0 = 0, vu1 > 0, vd

1 ≥ 0. (5.22)

Under the first condition the asset delivers a positive initial payout (hasnegative cost), yet can be sold for a nonnegative amount in the future. Anexample would be the opportunity at t = 0 to exchange one portfolio fora cheaper one that was sure to be worth at least as much at t = 1. Underthe second condition the asset costs nothing initially but has a chance toproduce a positive payoff. For example, portfolios could be exchanged foreven money with the result that future value might be greater and couldnot be less. In either case a rational person who prefers more wealth to lesswould work the trades on the highest possible scale. Now if (5.21) holds wewill have

Ev1 ≥ 0 > v0

under P and under any measure equivalent to P, while if (5.22) holds wewill have

Ev1 > 0 = v0.

In either case the martingale condition Ev1 = v0 fails to hold, showingthat the existence of an arbitrage rules out the possibility of a martingalemeasure that is equivalent to P.

Conversely, the existence of a martingale measure is inconsistent withboth forms of arbitrage; for if Ev1−v0 = 0 then either vu

1 −v0 > 0 > vd1−v0

in violation of (5.21) or else vu1 −v0 = vd

1−v0 = 0, violating both conditions.While we have been thinking here of prices denominated in some constantunit of exchange, it is obvious that these conclusions hold also for anynumeraire w (numeraires being positive by definition) and correspondingnormalized process, v∗j = vj/wj. Moreover, the condition

vd1

wd1

≤ v0

w0≤ vu

1

wu1

(5.23)

implies the existence of at least one measure equivalent to P that makesthe normalized price process a martingale, and when it is violated there isan opportunity for arbitrage.4

4That is, when (5.23) holds, there is necessarily a π such that v0/w0 = πvu1 /wu

1 +(1 − π)vd

1/wd1 . When (5.23) does not hold, then either vd

1/wd1 ≤ vu

1 /wu1 < v0/w0 or

v0/w0 < vd1/wd

1 ≤ vu1 /wu

1 . The first case implies vd1/v0 < wd

1/w0 and vu1 /v0 < wu

1 /w0,so that shorting v and investing the proceeds in w gives a costless, certain gain. Thesecond case is exploited by the opposite trade.

Page 210: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 193

It is not crucial that the price of the money fund be the numeraire. Infact, under assumption RK that future interest rates (and therefore bondand money-fund prices) are known, we can price an n-expiring derivativein exactly the same way using B(j, n)n

j=0 as numeraire.5 That is,sj

B(j, n)=

mnsj

mj= mns∗j

,

mj

B(j, n)= mn = mnm∗

j

are still martingales under P, and so

D0 = B(0, n)EDn

B(n, n)= B(0, n)EDn

as in (5.19). We can also normalize instead by sj, creating s∗∗j = 1,m∗∗

j = mj/sj, and D∗∗j = Dj/sj. Of course s∗∗j is automatically a

martingale in any measure equivalent to P and P, but for the others to bemartingales a new measure is required. Defining P such that

P(sj+1 = sju | Fj) = πj =mj/mj+1 − d−1

u−1 − d−1,

we have

E(m∗∗j+1|Fj) =

(mj/mj+1 − d−1

u−1 − d−1

)mj+1

sju+

(u−1 − mj/mj+1

u−1 − d−1

)mj+1

sjd

= mj/sj

= m∗∗j

Likewise, using (5.9) it is easy to show that E(D∗∗j+1|Fj) = D∗∗

j . However,we would no longer refer to measure P as a “risk-neutral measure”, sinces0 = B(0, n)Esn. Thus, while different martingale measures are, in general,associated with different numeraires, the price of any asset in the replicatingportfolio—indeed, of any traded asset—can serve as the normalizing factor,so long as it is strictly positive.

Complete Markets

While the existence of a martingale measure is tied to the absence of arbi-trage, the uniqueness of the measure for any given choice of numeraire istied to the completeness of markets. To see what completeness means withinthe Bernoulli framework, let the market’s information structure evolve over

5See footnote 3 in this connection.

Page 211: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

194 Pricing Derivative Securities (2nd Ed.)

a finite span of n periods according to the filtration Fjnj=0. Traded in this

market is a collection of k + 1 assets with prices sj = (s0j , s1j, . . . , skj)′ atstep j. The market is complete relative to the information structure Fjn

j=0

if each adapted payoff process yjnj=1 can be replicated by a self-financing

portfolio process pj = (p0j , p1j , . . . , pkj)′n−1j=0 . This means that, given an

appropriate initial endowment, there exists a portfolio that generates anyarbitrary stream of payoffs, y1, y2, . . . , yn.

Consider such a complete market. For any j ∈ 1, 2, . . . , n let Aj ⊂ Ωbe an arbitrary Fj-measurable set of outcomes. Completeness implies thatthere exists at least one portfolio of assets that pays one currency unit at j

when Aj occurs, and nothing otherwise. (Take the adapted process yini=1

to be yi(ω) = 1 for i = j and ω ∈ Aj , and yi(ω) = 0 otherwise.) If there areno opportunities for arbitrage, then corresponding to whatever particularprice is chosen as numeraire there exists at least one measure such that thenormalized value of this claim is a martingale. Suppose there are two such(equivalent) measures, Pj and Pj , when the price of a j-maturing unit bondis numeraire.6 The current value of the claim is then either

v0(Aj) ≡ B(0, j)E1Aj = B(0, j)∫

Aj

dPj(ω) = B(0, j)Pj(Aj)

or

v0(Aj) ≡ B(0, j)E1Aj = B(0, j)∫

Aj

dPj(ω) = B(0, j)Pj(Aj).

Since there is no arbitrage, v0(Aj) and v0(Aj) must be equal, and sincesuch an equality holds for each j and each measurable Aj it follows thatthe measures P and P are identical. Thus, in a complete, arbitrage-freemarket there corresponds to each numeraire a unique martingale measure.

Completeness in the Bernoulli Framework

Let us now consider what it takes to make the martingale measure uniquewhen the underlying price for our derivatives follows Bernoulli dynam-ics. Since uniqueness holds in complete markets, we can ask under whatcondition a market with a “stock” whose price, s, follows a Bernoulliprocess is complete relative to the information structure Fjn

j=0, whereFj ≡ σ(s0, s1, . . . , sj) is the field of events generated by s0, s1, . . . , sj .

6Under assumption RK, we can remove the “j” and consider P and eP as alternativespot measures.

Page 212: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 195

In order to create a portfolio that reproduces any adapted process yjnj=1,

it suffices to find a portfolio that pays yj for any single j ∈ 1, 2, . . . , n andzero at other times, since a collection of n such portfolios generates theentire process. Now since yj is Fj-measurable, there exists f such thatyj = f(s0, s1, . . . , sj). Notice that yj may depend on the entire history of thestock’s price through stage j and thus on the path by which sj is reached.

To replicate yj let us try to work backwards in the usual fash-ion, beginning at stage j − 1 with s0, s1, . . . , sj−1 known. Given theBernoulli dynamics, there are two possible realizations of yj ; namely,yu

j ≡ f(s0, . . . , sj−1, sj−1u) and ydj ≡ f(s0, . . . , sj−1, sj−1d). We require

a portfolio now worth some amount Pj−1 and worth Puj = yu

j in the upstate and P d

j = ydj in the down state next period. Since yu

j and ydj are

arbitrary, we clearly need a portfolio such that (P dj , Pu

j ) can take on anyvalue in 2. To form such a portfolio requires, besides the stock itself,an asset whose price, vj , is adapted to our market information structure,Fj = σ(s0, s1, . . . , sj)n

j=0, and which, together with the stock, spans 2.In other words, we require the payoff matrix of the two assets,(

sj−1d vdj

sj−1u vuj

)to have rank two or, equivalently, that the vectors (d, u) and (vd

j , vuj ) be

linearly independent. This in turn requires that u/d = vuj /vd

j . Figure 5.4

Fig. 5.4. Spanning the Bernoulli state space.

Page 213: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

196 Pricing Derivative Securities (2nd Ed.)

depicts state prices (vd, vu) that meet this condition and prices (wd, wu)that do not. The figure also shows how an appropriate portfolio (p, q) ofthe two assets with linearly independent prices can be used to replicate anarbitrary payoff (yd, yu) = (P d, Pu).

Suppose, besides a stock that follows Bernoulli dynamics, the marketcontains also the riskless money fund, a unit of which will be worth mu

j =md

j = mj−1(1 + rj−1) in both the up and down states next period. Thisdegenerate random variable mj is linearly independent of the stock’s price,since mu

j /mdj = 1 < u/d. Therefore, under Bernoulli dynamics an arbitrage-

free market consisting of the stock and the money fund is indeed completewith respect to the filtration generated by sj, and so a unique martingalemeasure does correspond to whatever numeraire is chosen.

5.4 Specific Applications

This section treats binomial pricing of specific derivatives. To develop theintuition and some parallels with the continuous-time models in later chap-ters, we start with the simple case of European options on assets with zeroexplicit cost of carry, then move on to futures prices, derivatives on futures,American-style derivatives, and derivatives on assets that pay dividends.To focus on the main ideas, we continue to take as given the move sizesand one-period interest rates that define the Bernoulli dynamics. Ways tochoose these in real applications are described in the next section.

5.4.1 European Stock Options

Pricing a derivative recursively by working backward through a binomialtree gives estimates of its value at all the nodes. Taken all together, thenodal values at any single time step represent the derivative’s value functionon a discrete domain of points; that is, the functional relation between thederivative’s price and that of the underlying. Thus, after j steps backwardin time the n − j + 1 nodal values of an n-step tree constitute the valuefunction for a derivative with j steps to go before expiration. It is instructiveto see how the value function evolves as one steps backward in the tree, andfigure 5.5 illustrates the process for a European call. The bold kinked linein the top panel is the option’s piecewise-linear terminal payoff function,drawn to connect the dots corresponding to the n + 1 nodal stock prices atstep n. The dots in the next panel of the figure represent the value functionat stage n − 1 on the lattice of n values of sn−1. Taking the short rateas constant and setting B = (1 + r)−1 as the one-period discount factor,

Page 214: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 197

Fig. 5.5. Recursive evolution of binomial call function.

the call’s value at each point is CE(sn−1, 1) = B[πCE(sn−1u, 0) + (1 −π)CE(sn−1d, 0)]. At high values of sn−1 from which the call is sure to endup in the money this is

CE(sn−1, 1) = B[π(sn−1u − X) + (1 − π)(sn−1d − X)]

= sn−1 − BX,

where X is the strike price. These points lie along a line parallel to but abovethe terminal payoff, s−X . Skipping to the low values of sn−1, dots along theS axis correspond to points from which the call must finish out of the money.The interest centers on the one value of sn−1 for which sn−1u−X > 0 andsn−1d − X ≤ 0, where the call is worth CE(sn−1, 1) = Bπ(sn−1u − X).This is positive and so lies above the axis, and it is also above the extensionof the (dashed) line sn−1 −BX. This point thus puts an additional kink inthe trace of call values at step n − 1. Another kink appears at step n − 2,and so on. Although the value function is observed at progressively fewerpoints as the tree reduces, one still perceives that the function becomessmoother as the process continues. We will soon be in a position to do themath and show that with the right specification of move sizes the value

Page 215: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

198 Pricing Derivative Securities (2nd Ed.)

function approaches a smooth, convex curve as the number of remainingstages approaches infinity.

To get additional insight into the binomial valuation of a Europeanoption, apply the explicit pricing formula (5.20) to a European call. Tomake things easier, regard interest rates as constant over time, so that thesimpler (5.11) applies to give for the call’s initial value

CE(s0, n) = B(0, n)n∑

k=0

(n

k

)πn−k(1 − π)k(s0u

n−kdk − X)+. (5.24)

Reducing this further, let kX,n be the largest number of downticks such thatthe call finishes in the money. Then (s0u

n−kdk −X)+ equals s0un−kdk −X

when k ≤ kX,n and equals zero otherwise, so that the right side of (5.24)becomes

B(0, n)s0

kX,n∑k=0

(n

k

)(πu)n−k[(1 − π)d]k − B(0, n)X

kX,n∑k=0

(n

k

)πn−k(1 − π)k.

The sum in the second term is P(sn > X). Expressing the bond’s price as

B(0, n) = (1 + r)−n = (1 + r)−n+k(1 + r)−k,

the first term equals

s0

kX,n∑k=0

(n

k

)[πu

1 + r

]n−k [(1 − π)d

1 + r

]k

.

But since πu + (1 − π)d = 1 + r, the two bracketed factors sum to unity.Therefore, letting P be a new measure equivalent to P and P but with uptickprobability π = πu/(1 + r), the call’s value can be expressed finally as

CE(s0, n) = s0P(sn > X) − B(0, n)XP(sn > X). (5.25)

The European put can be developed in the same way as

PE(s0, n) = B(0, n)XP(sn < X) − s0P(sn < X). (5.26)

The formulas relate the prices of European calls and puts to the proba-bilities (in two different pseudo-measures) that the options expire in themoney. It is clear immediately that the formulas are consistent with Euro-pean put-call parity for options on an asset with zero explicit cost of carry:CE(s0, n) − PE(s0, n) = s0 − B(0, n)X. Expressions (5.25) and (5.26) arevery similar to the Black-Scholes formulas from the continuous-time theory,which are given in section 5.5 as (5.45) and (5.46). Indeed, with appropriate

Page 216: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 199

choices of u and d we will see that they actually converge to those expres-sions as n → ∞. Thus, while (5.25) and (5.26) dominate the slower recursiveprocedure in pricing European options, their value is strictly pedagogical.On the other hand, we will see that the recursive method comes into itsown in pricing more complicated derivatives, such as American options, forwhich exact computational formulas are not available.

5.4.2 Futures and Futures Options

The static replication arguments in chapter 4 established that the arbitrage-free forward price for time-T delivery of a commodity with zero explicit costof carry is

f(0, T ) = B(0, T )−1S0. (5.27)

We also saw that under condition RK the initial futures price F(0, T ) mustbe the same as the forward price. Reassuringly, one can confirm theseanswers in the Bernoulli framework by applying martingale pricing for-mula (5.19); but to do so one must be clear about what the derivative assetactually is in connection with futures or forward contracts. The futuresprice itself is not the price of a derivative asset. It merely sets the terms fora later transaction, and no asset actually trades at the futures price untilthe contract expires, at which time the futures price is just the spot price.The real derivative asset whose discounted value is a martingale under therisk-neutral measure is the futures position, which is to say the value of thefutures account. Using the time-step notation from the Bernoulli setup andassuming that the futures position is marked to market at each time step,the value at step j +1 of a long position initiated at step j is Fj+1 −Fj percommodity unit. Applying (5.19) gives

0 =Fj − Fj

mj= Ej

(Fj+1 − Fj

mj+1

). (5.28)

which (since mj+1 ∈ F0 ⊂ Fj by assumption RK) implies that Fj =EjFj+1, for j ∈ 0, 1, . . . , n − 1. The fact that the normalized value of thefutures position is a martingale thus does imply that the futures price itselfis a martingale under P, provided we abstract from uncertainty about futureinterest rates.

With this result we can now see that the forward and futures prices docoincide and that they satisfy cost-of-carry relation (5.27). Since Fn = sn ≡ST and since sj/mjn

j=0 is a martingale under P, repeated application

Page 217: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

200 Pricing Derivative Securities (2nd Ed.)

of (5.28) gives F0 = Esn = s0(mn/m0) = B(0, n)−1s0 = f0 or, in thecontinuous-time notation, F(0, T ) = B(0, T )−1S0 = f(0, T ).

Having developed some background, let us now consider derivativeswhose payoffs are linked to a futures price. An example is futures options,which on exercise yield cash payments plus positions in the futures.Specifically, one who exercises a European futures call option struck atX receives F(T, T ′) − X in cash when the option expires at T , plus along position in a futures contract that expires at some T ′ > T . Sincethe futures position has no value when it is first assumed, the optionwould be exercised only if F(T, T ′) > X , and the call’s terminal valueis therefore [F(T, T ′) − X ]+. Similarly, the value of an expiring put onthe futures is [X − F(T, T ′)]+. If exercised at T , it would convey a cashpayment of X − F(T, T ′) plus a short position in the futures. In a friction-less market these long and short positions in the underlying futures canbe closed out immediately at no cost by taking offsetting short or longpositions.

Modeling the futures price itself as a Bernoulli process, with Fuj+1 = Fjuj

and Fdj+1 = Fjdj , we will try to form a self-financing, dynamic replicating

portfolio for a European-style futures derivative using futures contractsand investments in the money fund. (We will see presently why it mightbe appropriate to allow time-dependent tick sizes, as the notation sug-gests.) Things will work a little differently than for derivatives on stocksand other primary assets. Because a futures contract has zero value at thetime it is acquired, the step-j value of a replicating portfolio comprisingpj contracts and qj units of the money fund is just Dj = qjmj . Since thecontracts acquire positive or negative value at the next time step as thefutures price moves, the replicating shares for futures and money fund atstep j must solve

pj(Fuj+1 − Fj) + qjmj+1 = Du

j+1

pj(Fdj+1 − Fj) + qjmj+1 = Dd

j+1,

where Duj+1 and Dd

j+1 are uptick and downtick values of the derivative. Thesolutions are

pj =Du

j+1 − Ddj+1

Fuj+1 − Fd

j+1

qj =Dj+1 − pj(Fj+1 − Fj)

mj+1.

Page 218: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 201

A little algebra then shows that the value of the replicating portfolio atstep j equals

qjmj =mj

mj+1

(Fj − Fdj+1)D

uj+1 + (Fu

j+1 − Fj)Ddj+1

Fuj+1 − Fd

j+1

=mj

mj+1

[πjD

uj+1 + (1 − πj)Dd

j+1

],

with

πj =1 − dj

uj − dj. (5.29)

The pseudo-probabilities πj , 1 − πj are the basis for a new measure P

that depends on the move sizes for the futures price. At step j + 1 thefutures position from step j is closed out, the profits (losses) are reinvestedin (made up from) the money fund, and new futures commitments are madeto replicate at the next step. In this way the replicating portfolio is forced tobe self-financing. As usual, the normalized price of the replicating portfolio,and therefore of the derivative itself, is a martingale in the correspondingmeasure P; that is,

Dj

mj= πj

Duj+1

mj+1+ (1 − πj)

Ddj+1

mj+1

= Ej

(Dj+1

mj+1

)and, as in (5.19),

D0 =m0

mnEDn = B(0, n)EDn. (5.30)

This parallels the development for derivatives on primary assets, suchas the one that underlies the futures, yet it seems that we have wound upwith a different risk-neutral probability—πj instead of the πj that drivesthe underlying price itself—and a different martingale measure. In fact, πj

and πj turn out to be the same if move sizes for the futures are madeconsistent with those for the underlying primary asset. This would have tobe true since the dynamics of the primary asset are driving everything here,but we can see it formally as follows. Impose the usual Bernoulli dynamicswith move sizes u and d for the primary asset, and suppose there are N

time steps until the futures contract expires and n < N steps in the life ofthe futures derivative. Since the futures price itself is a martingale under

Page 219: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

202 Pricing Derivative Securities (2nd Ed.)

P, we have Fj = EjsN = (mN/mj)sj , so that

Fuj+1 =

mN

mj+1sju =

(mN

mjsj

)mj

mj+1u = Fj

mj

mj+1u,

and likewise

Fdj+1 = Fj

mj

mj+1d.

Thus, if the dynamics for the futures are to be consistent with those forthe underlying, the move sizes for the two prices must be related. For atraded asset with zero cost of carry the relation is that uj/u = dj/d =mj/mj+1 = (1 + rj)−1. Recognizing this connection, we see that πj and πj

are indeed the same. Of course, if we work in terms of futures prices ratherthan prices of the underlying commodity, the measures P and P are indeeddifferent, because the state spaces differ. In applications it is common toignore the connection between (u, d) and (u, d) and to estimate the movesizes for futures directly from observations on futures prices rather thanprices of the underlying.

Example 61 Consider replicating and pricing a 3-period European futurescall with strike X = 10.0 when the initial futures price is F0 = 10.0. Take r

to be constant at 0.01 and u = 1.05/1.01 and d = 0.97/1.01 in each period,which would be consistent with move sizes of u = 1.05 and d = 0.97 forthe price of the underlying commodity. Then (5.29) gives π = 1

2 . Figure 5.6illustrates recursive valuation and replication of the call. Applying (5.30),the initial value can also be obtained directly as

CE(10.0, 3) =1

1.013

3∑k=0

(3k

)23

[10.0

(1.051.01

)3−k (0.971.01

)k

− 10.0

]+

.

5.4.3 American-Style Derivatives

Ordinary or “vanilla” American-style derivatives can be exercised at anytime before expiration and, upon exercise, deliver payoffs that depend juston the current value of the underlying. We can consider just as easily slightlymore general forms whose payoffs can depend on time as well. For this pur-pose DX

j ≡ DXj (sj) represents the derivative’s value at the (sj , j) node

if it is not held beyond step j. Since the derivative could fail to be held(by someone) only if it were exercised or discarded, and since disposal iscostless, it follows that DX

j ≥ 0. We refer to this (somewhat imprecisely)

Page 220: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 203

Fig. 5.6. Pricing a European call on a futures.

as the derivative’s “exercise value”. As usual, Dj without the “X” repre-sents the derivative’s arbitrage-free price at j. Obviously Dj ≥ DX

j , sincethe holder has the right to exercise (or discard) at any time, and exerciseoccurs if and only if Dj = DX

j > 0. The tasks are to discover at eachstep and each underlying price whether exercise is optimal and, thereby,ultimately to determine the derivative’s current price, D0. We first treatderivatives on primary assets (“stocks”) and then turn to American-stylefutures derivatives. Throughout this section we continue to assume thatthe underlying has no explicit cost of carry, meaning that one who holdsthe underlying receives no dividends or other emoluments and incurs noexplicit costs. These complications are considered in section 5.4.4.

Derivatives on Primary Assets

Begin by valuing an American derivative at an arbitrary node (sj , j) ofan n-step binomial tree, where n corresponds to the expiration time. Asusual, there is nothing to do at the terminal nodes, since values there areknown functions of sn. When j < n the holder of the derivative has two

Page 221: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

204 Pricing Derivative Securities (2nd Ed.)

relevant options: (i) hold until at least j + 1 or (ii) exercise (or discard)immediately. As for European-style derivatives, the self-financing portfoliothat replicates under the “hold” option contains

pj =Du

j+1 − Ddj+1

sju − sjd

shares of the stock and

qj =Dj+1

mj+1− pj

sj+1

mj+1

units of the money fund. The current value of the replicating portfolio,which equals the derivative’s arbitrage-free value under the hold option, isthen

pjsj + qjmj =mj

mj+1

[πjD

uj+1 + (1 − π)Dd

j+1

](5.31)

=mj

mj+1EjDj+1.

Since the derivative is worth DXj if not held, and since holding is discre-

tionary, the solution for Dj at the (sj , j) node is

Dj =mj

mj+1EjDj+1 ∨ DX

j , (5.32)

where “∨” signifies “greater of” and Dj+1 represents the value at j + 1 ifthe derivative is not exercised at step j.

The procedure for pricing an American-style derivative is then as fol-lows. One begins at step n, where the derivative’s terminal value is known,and works backward through the tree. At each node of each step j < n,(5.31) is used to calculate the value under the hold option from the twostate-dependent values at the next step. This value is compared with thevalue under the exercise option, DX

j , and then Dj is set equal to the greaterof these. When prices at all nodes of step j have been found in this way,one moves back to step j − 1, repeats the process, and then continues inthis manner step by step back to stage 0.

The next section shows how to modify this procedure when the underly-ing pays a dividend or has some other positive or negative explicit carryingcost. When there are no such costs, we have seen that there are explicit solu-tions for prices of European-style derivatives and that it is not necessaryto work through the entire tree to price them. But, with one exception, therecursive method is essential for American derivatives, since at each stepthe values at the next future time step must be known in order to decide

Page 222: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 205

whether or not to hold. The exceptional case is that of an American call ona no-dividend stock, for which early exercise would never be optimal. Onthe other hand, an American put may well be exercised before expiration,regardless of whether the underlying pays dividends. As a result, Americanputs are typically worth more than their European counterparts, and theymust always be priced recursively.

Example 62 Figure 5.7 illustrates binomial pricing of a 4-period vanillaAmerican put struck at X = 10.0 on an underlying stock with initial prices0 = 10.0. Taking u = 1.05, d = 0.95, and r = 0.01 gives π = (1.01 −0.95)÷(1.05−0.95) = 0.6. The put’s value if not held beyond j is P X

j (sj) =(X−sj)+, independent of time. The values p and q at each node in the figureare the number of shares of stock and money fund that replicate the put ifit is held to the next period. A shaded block at a node indicates that the putwould be exercised there. As a sample of the calculations, look at the bottomnode at step 2, which is marked with a check () in the figure. The put’s

Fig. 5.7. Binomial pricing of American put.

Page 223: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

206 Pricing Derivative Securities (2nd Ed.)

value if held is calculated via (5.31) as [.6(0.524)+ .4(1.426)]÷1.01 .= 0.876,

and the exercise value is (10.000 − 9.025) = 0.975; hence, the value of theAmerican put at this node is P A

2 = 0.876∨ 0.975 .= 0.98.

Futures Derivatives

Consider now American options on futures. Once the move sizes are deter-mined appropriately to model the futures price, the binomial method worksin the same way as for American options on primary assets. However, inone instance there is an important qualitative difference in the results: eventhough a dividend is never associated with a futures price, an American callon the futures might still be exercised early. Accordingly, American callson futures will ordinarily be worth more than otherwise identical Europeancalls and must be priced recursively.

The possibility of rational early exercise for the American call on futurescan be demonstrated easily in a one-step binomial tree. When the callexpires at step 1, it will be worth (F1 − X)+, where F1 = F0u in the upstate and F1 = F0d in the down state. Suppose F1 − X > 0 in both states.Since the call is certain to wind up in the money from the initial F0, it mustbe in the money at step 0 also. Then the current value of the American call is

CA(F0, 1) = (1 + r)−1E(F1 − X)+ ∨ CX0

= (1 + r)−1E(F1 − X) ∨ (F0 − X)

= (1 + r)−1(F0 − X) ∨ (F0 − X)

= (F0 − X).

Here, the third equality follows from the martingale property of the undis-counted futures price in the measure P, as in (5.28) with n = 1, and thelast equality holds since r ≥ 0 (assumption RB). This shows that it wouldbe optimal to exercise the call at step 0. The same argument would applyat any step j of an n-step tree and any initial Fj from which the call wouldbe sure to wind up in the money.7 Notice that it is the fact that the futuresprice is a P martingale that makes American calls on futures fundamentallydifferent from calls on no-dividend stocks. Following through the steps of the

7The certainty of finishing in the money is merely sufficient for early exercise, how-ever. It is not a necessary condition.

Page 224: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 207

example with a one-period equity call instead of a call on the futures gives

CA(s0, 1) = (1 + r)−1E(S1 − X)+ ∨ (S0 − X)

= (1 + r)−1E(S1 − X) ∨ (S0 − X)

= (1 + r)−1[S0(1 + r) − X ] ∨ (S0 − X)

= S0 − (1 + r)−1X.

Consistent with the arguments in chapter 4, American calls on no-dividendstocks would be kept to expiration, even if they were certain to end up inthe money. In the next section we shall see that this is no longer alwaystrue when the stock pays dividends.

Another View of American Derivatives

While the mechanics of valuation in the Bernoulli framework are clear, themethod is not particularly insightful. There is another perspective fromwhich to view the arbitrage-free price of an American-style derivative. Thefrontier of nodes at which early exercise is optimal represents the “exerciseboundary”. There are always such critical exercise boundaries for Americanoptions, whether they be vanilla puts or calls or more exotic varieties. Forthe put in example 62 this is the boundary between the unshaded andshaded nodes in figure 5.7. In general, for vanilla puts the exercise boundaryis a function of the time step, B : 0, 1, . . . , n → +, such that exerciseoccurs at step j whenever sj < Bj . A given path followed by the stock’sprice from an initial point above the boundary produces an outcome ofpositive value for the put when and only when it crosses this boundary forthe first time. The put then generates a cash receipt, and (since exercise hasoccurred) the remainder of the path becomes irrelevant. Paths that nevercross the exercise boundary contribute nothing to the option’s value. Thosethat do cross contribute to the value of the derivative an amount equal tothe product of the receipt itself times the risk-neutral probability of thepath times the discount factor that converts the future receipt to presentvalue. The value of an American derivative can be regarded as the sum ofthese contributions. For example, in figure 5.7 there are six such paths toa first crossing. Each path is shown in table 5.1 as a string of d’s and u’s,along with its contribution to value. These sum to about 0.26, as in thefigure.

This path-by-path view suggests formal ways to characterize values ofAmerican-style derivatives. Letting “ω” designate a sample path of theunderlying price and assuming that it is not initially optimal to exercise,

Page 225: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

208 Pricing Derivative Securities (2nd Ed.)

Table 5.1. Sample paths of underlying contributing to value of American put.

Path Probability Receipt Time Discount Value

dd 0.42 × 0.975 ÷ 1.012 = 0.153dud 0.42(0.6) × 0.524 ÷ 1.013 = 0.049udd 0.42(0.6) × 0.524 ÷ 1.013 = 0.049uudd 0.42(0.62) × 0.050 ÷ 1.014 = 0.003udud 0.42(0.62) × 0.050 ÷ 1.014 = 0.003duud 0.42(0.62) × 0.050 ÷ 1.014 = 0.003

Total value: 0.260

define

J(ω) = min1 ≤ j ≤ n : DXj ≥ Dj.

This represents the number of the first step on path ω at which the value ofthe derivative is not greater than its exercise value. If ω crosses the exerciseboundary before n, J(ω) will be the time step of the first crossing. If there isno crossing before n, then J(ω) = n, since in that case the value and exercisevalue will coincide for the first time at expiration. (If the derivative expiresout of the money, both Dn and DX

n are zero.) With this convention theinitial value of the American derivative is

E

[m0

mJDX

J (sJ)]

= E[B(0, J)DXJ (sJ )].

This amounts to doing just what was done in the table, except that itaverages in the paths that contribute nothing to value along with thosethat do.

The value of a particular derivative can be also stated directly in termsof its positive payoffs at the exercise boundary and the times of first passage.The first-passage time is defined as

J∗(ω) = min1 ≤ j ≤ n : sj < Bjif this set is not empty (that is, if the crossing does occur by step n) andJ∗(ω) = n + 1 otherwise. Note that events J(ω) = j and J∗(ω) = j aremeasurable with respect to Fj , since one always knows at stage j whetherthe condition for exercise is met. Thus, J and J∗ are stopping times withrespect to the filtered space (Ω,F , Fj, P). For the American put, whichhas exercise value PX

j (sj) = X − Bj at the boundary, the initial value is

PA(s0, n) =n∑

j=1

B(0, j)(X − Bj)P(J∗ = j),

Page 226: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 209

provided the stock’s initial price s0 is above B0. If it is not, then PA(s0, n) =X − s0.

5.4.4 Derivatives on Assets That Pay Dividends

The prospect of a dividend payment may either trigger or delay the exerciseof an American option and typically affects the values of derivatives of alltypes. In this context the term dividends applies to any receipts of posi-tive monetary value that (i) accrue to the holder of an asset, (ii) alter thedynamics of its price, and (iii) are not automatically adjusted for in the con-tractual terms of a derivative asset. Dividends are the negative componentsof an underlying asset’s explicit cost of carry, and for financial assets theyare typically the only such components. Examples of dividends in the broadsense (as the term is used here) include the interest accrued on a long posi-tion in a money-market fund denominated in a foreign currency, couponinterest received from a portfolio of bonds, and cash dividends on com-mon or preferred stock. The definition does exclude stock dividends, whichare pro-rata disbursements of shares of the underlying stock itself, whenconsidering derivatives (such as exchange-traded options) whose terms areadjusted for the price effects of such dilutions of value.8

As described in chapter 4, dividends can sometimes be modeled accu-rately as proportional to the value of the underlying asset at the timepayments are made. This would be true, for example, of interest paymentsat a known rate on a foreign money fund or discount bond, whose valuein domestic currency units would depend on the exchange rate. In othercases it is reasonable to model dividends as lump-sum cash amounts thatare completely unrelated to the value of the underlying. Interest paymentson default-free coupon bonds are examples, and cash dividends paid bycommon stocks are often assumed to be lump-sum as well. In line with theassumption that cost of carry is known (condition CCK), if the dividends

8For exchange-traded options both stock dividends and stock splits trigger adjust-ments to the strike price and either the contract size or the number of contracts. Lettingρ be the split factor (the number of shares outstanding per initial share after the split ordividend), strike prices are adjusted as X/ρ. If ρ is an integer (as in a two-for-one splitor 100% stock dividend), then the number of option contracts is multiplied by ρ. If ρ isnot an integer, then the number shares per option contract (normally 100) is increasedby that factor.

However, such contractual adjustments do not insulate derivatives’ prices from thevolatility changes that usually accompany splits and stock dividends. In pricing thederivatives such effects have to be captured by the appropriate modeling of volatility.

Page 227: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

210 Pricing Derivative Securities (2nd Ed.)

paid by the underlying are proportional to price, we assume that the rates(proportions) over the life of the derivative are known; if they are lump-sum,we assume that cash amounts and dates of receipt are known.

The frequency with which common stocks pay dividends varies fromcountry to country. For example, in the U.S. they are typically paid quar-terly, while annual payments are the standard in Germany. Because it takestime to disburse the payments, a stock begins to trade “ex dividend” a shorttime before the payments are made, and those who buy the stock after theex date do not receive the dividend. On average, the price of a share dropsby about the amount of the dividend on the ex date. The short delay beforepayment is received and the tax disadvantage of dividends vs. capital gainsin the U.S. both slightly diminish the price effect for U.S. stocks. For pur-poses of pricing derivatives on stocks it is sensible just to define the dividendas the price effect that results from the disbursement.

The proper treatment of dividends in the binomial scheme depends onthe frequency of payments and whether they are proportional to value of theunderlying or of fixed amount. We will look at three cases, in increasingorder of difficulty: (i) proportional dividends paid continuously, (ii) pro-portional dividends paid at discrete intervals, and (iii) fixed (lump-sum)cash amounts paid at discrete intervals. In each case we will work out thereplicating portfolio and martingale measure for pricing an American-stylederivative on an underlying asset (which we continue to call a “stock”), andwill indicate briefly how the procedure is to be modified for European-stylederivatives.

Continuous, Proportional Dividends

In the Bernoulli framework continuous means occurring at each time step.Working with the usual n-step binomial tree, the holder of the underlyingstock receives a dividend δsj , proportional to price, at each node (sj , j)n

j=1.(Allowing for time dependence of δ would just require adding a subscript.)A key observation here is that it is not possible without portfolio adjust-ments to maintain a desired position in the underlying stock alone, becauseit generates cash payments at each step. One thing that can be done isto reinvest the dividend in the stock each time it is received, using thepayment δsj to purchase δ additional shares. Alternatively, some or all ofthe cash could be invested the money market fund. We show below thatthe result would be the same, but for now we consider just reinvestmentin the stock. Let sc

j = sj(1 + δ)j be the value at step j of one unit of this

Page 228: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 211

cum-dividend position that begins at step 0 with one actual share of stockand collects additional units as dividends are received and reinvested. Thisposition is the asset that must be used for replication—and the asset whosediscounted value should be a martingale under the risk-neutral measure.

The usual Bernoulli dynamics for the price of the stock itself, as sj+1 =sjRj+1 for Rj+1 ∈ d, u, give sc

j+1 = scj(1 + δ)Rj+1 for the value of the

cum-dividend position. Let Dj+1(sj+1) ≡ D(sj+1, n− j−1) be the value ofthe derivative asset at j +1, when it has n− j−1 periods to go. Notice thatDj+1(sj+1) depends on the actual market price of the stock rather than onsc

j+1 since it is the former that determines exercise value. The replicatingportfolio (pj , qj) that sets

pjscj(1 + δ)u + qjmj+1 = Dj+1(sju) (5.33a)

pjscj(1 + δ)d + qjmj+1 = Dj+1(sjd) (5.33b)

is

pj =Dj+1(sju) − Dj+1(sjd)

scj(1 + δ)(u − d)

,

qj =Dj+1(sjd) · u − Dj+1(sju) · d

mj+1(u − d).

Setting mj+1 = mj(1 + r), some algebra gives the derivative’s value ifheld as

pjscj + qjmj =

11 + r

[πDj+1(sju) + (1 − π)Dj+1(sjd)], (5.34)

where

π =(1 + r)/(1 + δ) − d

u − d. (5.35)

Therefore, if the current exercise value is DXj (sj) the derivative’s current

value at step j is

Dj(sj) =1

1 + rEjDj+1(sj+1) ∨ DX

j (sj).

Notice that, for actual calculations, accounting for continuous, proportionaldividends is a simple matter of dividing 1+r by 1+δ in the expression for π

and proceeding as if there were no dividends. That is why the continuous-dividend case is the simplest to handle.

It is worth pausing a moment to see a more direct way to get the sameresult and to appreciate its intuitive content. The pseudo-probability π in

Page 229: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

212 Pricing Derivative Securities (2nd Ed.)

(5.35) could have been found right away by determining the probabilitythat makes the process sc

j/mj a martingale; that is, by solving

scj

mj= Ej

scj+1

mj+1

=sc

j

mj

(1 + δ

1 + r

)[πu + (1 − π)d] . (5.36)

Once this was done we could have seen that it was possible to construct aself-financing, replicating portfolio by just writing down (5.33a) and (5.33b)without having to solve for pj and qj . With the existence of a martingalemeasure and a replicating portfolio thus verified, the right side of (5.34)could have been obtained immediately to value the derivative. As to theintuitive content, notice that the expected one-period return of the stockfrom capital gain alone (in the martingale measure) is

Ej(sj+1/sj) =1 + r

1 + δ.

This corresponds to the fact that in a risk-neutral market the expectedvalue of the cum-dividend, one-period return, sj+1(1 + δ)/sj, would be thesame as the sure return on the money fund, 1 + r.

Had all dividends been invested in the money fund rather than in thestock, it is the normalized value of the stock-fund position that would bea martingale. Letting vj be the position’s value at step j, it is easy to seethat vj+1 = sjRj+1(1 + δ) + (vj − sj)mj+1/mj and hence that

vj+1

mj+1=

vj

mj+

[sj+1

mj+1(1 + δ) − sj

mj

].

Accordingly, the same value of π equates Ejvj+1/mj+1 to vj/m, so that themartingale measure is independent of the reinvestment choice for dividends.

To summarize, in pricing a European-style derivative on an asset withcontinuous proportional dividends at rate δ, one merely divides (1 + r) or(1+rj) by (1+δ) in calculating the π or πj that enters expression (5.11) or(5.15). Moreover, if r and δ are small, (1+r)/(1+δ) can be approximated as1+r−δ, an expression for which we will see parallels in the continuous-timetreatment of chapter 6.

A Symmetry Relation for Puts and Calls

We shall see later that uptick and downtick values are inversely related ina standard implementation of Bernoulli dynamics; i.e., d = u−1. When this

Page 230: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 213

holds, there is a useful symmetry between the values of puts and calls—European and American alike. Consider first European puts and calls on anunderlying asset that pays proportional dividends continuously at rate δ.

Representing the values of the options in terms of underlying price, strikeprice, dividend rate, and the short rate, the call’s value at step n − 1 is

CEn−1(sn−1; X, δ, r) = (1 + r)−1En−1(sn − X)+

=

(1

1+δ − d1+r

u − d

)(sn−1u − X)+

+

(u

1+r − 11+δ

u − d

)(sn−1d − X)+,

and the put’s value is

PEn−1(sn−1; X, δ, r) = (1 + r)−1En−1(X − sn)+

=

(1

1+δ − d1+r

u − d

)(X − sn−1u)+

+

(u

1+r − 11+δ

u − d

)(X − sn−1d)+.

In the last expression for the put interchange sn−1 and X and also δ and r

to obtain(1

1+r − d1+δ

u − d

)(sn−1 − Xu)+ +

(u

1+δ − 11+r

u − d

)(sn−1 − Xd)+.

Then if u = d−1 this becomes(1

1+r − d1+δ

u − d

)(sn−1d − X)+

d+

(u

1+δ − 11+r

u − d

)(sn−1u − X)+

u

=

(1/d1+r − 1

1+δ

u − d

)(sn−1d − X)+ +

(1

1+δ − 1/u1+r

u − d

)(sn−1u − X)+

=

(u

1+r − 11+δ

u − d

)(sn−1d − X)+ +

(1

1+δ − d1+r

u − d

)(sn−1u − X)+

= CEn−1(sn−1; X, δ, r).

Thus, permuting the arguments for the European put with one step to goproduces the corresponding expression for the call, and vice versa. More-over, the symmetry persists as we back up through the tree. At step j the

Page 231: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

214 Pricing Derivative Securities (2nd Ed.)

expression for the European put with permuted arguments is

PEj (X ; sj, r, δ) =

(u

1+r − 11+δ

u − d

)PE

j+1(Xu; sj, r, δ)

+

(1

1+δ − d1+r

u − d

)PE

j+1(Xd; sj , r, δ)

= CEj (sj ; X, δ, r).

Since the corresponding American options at each stage j are valued as

CAj (sj ; X, δ, r) =

11 + r

EjCAj+1(sj+1; X, δ, r) ∨ (sj − X)+

PAj (sj ; X, δ, r) =

11 + r

EjPAj+1(sj+1; X, δ, r) ∨ (X − sj)+,

permuting the arguments has the same effect for American puts and callsas well. Thus, under the usual specification of Bernoulli dynamics reversingthe roles of S and X and of δ and r turns puts into calls and calls into puts.

Proportional Dividends at Discrete Intervals

Continuing to explore the treatment of dividends, consider now a derivativeon a stock that pays at a single time step, k > 0, a dividend equal toa proportion δ of its price. The stock’s observed price up to k is a cum-dividend price, sc

j , since (ignoring the lag between ex -dividend and paymentdates) whoever owns the stock prior to k will get the dividend. The dividendpayment itself is sc

kδ. From k onwards what we observe is an ex -dividendprice, sj ; but the relation sk = sc

k(1 − δ) makes it possible to constructa continuation of the cum-dividend sequence for j ∈ k, k + 1, . . . , n, assc

j = sj/(1−δ). Again, it is the (discounted) cum-dividend asset that shouldbe a martingale under the risk-neutral measure. The martingale probabilitythat solves

scj =

mj

mj+1Ejs

cj+1 (5.37)

=1

1 + r

[πsc

ju + (1 − π)scjd],

is the standard one:

π =1 + r − d

u − d. (5.38)

Page 232: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 215

Up through step k − 2 both the stock’s current and next-period prices arecum-dividend, so the derivative is priced off the observed cum-dividendprice as

Dj(scj) =

11 + r

EjDj+1(scj+1) ∨ DX

j (scj). (5.39)

From k onward the derivative is priced off the observed ex -dividend price as

Dj(sj) =1

1 + rEjDj+1(sj+1) ∨ DX

j (sj) (5.40)

=1

1 + rEjDj+1

[sc

j+1(1 − δ)]∨ DX

j (sj).

At k−1, the critical stage just before the ex date, one compares the deriva-tive’s current exercise value, which is based on the cum-dividend price, withits value if held until the stock’s price reacts to the dividend:

Dk−1(sk−1) =1

1 + rEk−1Dk[sc

k(1 − δ)] ∨ DXk−1(s

ck−1).

An example will show what this means in applications.

Example 63 Figure 5.8 illustrates pricing a 3-period American call struckat X = 10.0 when the initial cum-dividend price is sc

0 = 10.0 and a 5%dividend is paid at step 2. The dynamics are u = 1.05, d = 0.95, andr = 1.01, giving π = 0.6. At step 2 the cum-dividend prices in the threestates are sc

0uu = 11.025, sc0ud = 9.975, and sc

0dd = 9.025. Each declines by5% to sc

0uu(1− δ) .= 10.474, sc0ud(1− δ) .= 9.476, and sc

0dd(1 − δ) .= 8.574,

and these then produce the four values shown in the final period. Followingthe calculations along the top-most path of the tree, the call ends up in themoney at the top terminal node with value sc

0uu(1−δ)u−X = 0.997. At step2, with the stock trading ex dividend for the first time, the call’s value is

CA(10.474, 1) =.6(0.997) + .4(0.0)

1.01∨ (10.474− 10.0) = 0.592.

At step 1, just before the ex date, the value is

CA(10.500, 2) =.6(0.592) + .4(0.0)

1.01∨ (10.500− 10.0) = 0.500,

which is attained by early exercise. This illustrates the general result thatan American call on a primary asset will be exercised, if ever, just before itbegins to trade ex dividend. Finally, the initial value is

CA(10.0, 3) =.6(0.500) + .4(0.0)

1.01∨ (10.0 − 10.0) = 0.297.

Page 233: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

216 Pricing Derivative Securities (2nd Ed.)

Fig. 5.8. Binomial pricing of American call on stock paying proportional dividend.

Pricing a European derivative on an asset that makes a single divi-dend payment proportional to price just involves adjusting the initial, cum-dividend price by the dividend factor, since

D0[sc0(1 − δ)] = B(0, n)EDn[sc

n(1 − δ)] = B(0, n)EDn(sn).

Multiple proportionate payments at rates δkKk=1 during the life of the

derivative would simply require correcting sc0 by the factor

∏Kk=1(1 − δk).

Fixed Dividends at Discrete Intervals

This case is the one that best fits short-term derivatives on equities, sincecash amounts paid as dividends during periods of a few months are highlypredictable and (therefore) not very sensitive to changes in the underlyingprice. It is also the hardest case to implement in the binomial setup. Let usagain suppose that a single dividend is paid at step k > 0, but now fix theamount at ∆ units of currency rather than at δ units of stock. As before,it is the cum-dividend price that is observed prior to k. Two methods ofpricing a derivative on this asset can be considered.

Page 234: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 217

First suppose that the cum-dividend price of the stock follows the usualBernoulli dynamics during the entire life of the derivative, so that sc

j+1 =sc

jRj+1 with Rj+1 ∈ d, u. For this to make sense the dividend must havebeen reinvested in the stock at stage k to purchase an additional ∆/sk

shares. Thus, scj = sj(1 + ∆/sk) for j ∈ k, k + 1, . . . , n − 1. This means

that the ex -dividend price, sj = scj/(1 + ∆/sk), follows the usual Bernoulli

dynamics also from stage k onward. With this construction of scj (5.37)

is solved as before to give the same π as in (5.38). Up to step k − 2 thederivative is priced off the cum-dividend price, sc

j, as in (5.39), and from k

onward it is based on the ex -dividend price, sj , as in (5.40). At step k − 1,

just before the dividend, it depends on both prices, as

Dk−1(sk−1) =1

1 + rEk−1Dk(sc

k − ∆) ∨ DXk−1(s

ck−1).

What is hard about this? The difficulty here is not conceptual but compu-tational. The problem can be seen at once by looking at the tree diagram infigure 5.9, which illustrates the pricing of an American call with the same

Fig. 5.9. Binomial pricing of American call on stock paying lump-sum dividend.

Page 235: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

218 Pricing Derivative Securities (2nd Ed.)

data as figure 5.8 but with a fixed dividend of ∆ = 0.50 currency units atstep 2. Following the fixed dividend the price paths no longer recombineas they did with proportional dividends. From the top node at step 2 theex -dividend price of s2 = sc

0uu − ∆ becomes in the down state at step 3(sc

0uu − ∆)d, whereas from the ex -dividend price of s2 = sc0ud − ∆ at the

middle node at step 2 there is a move to (sc0ud − ∆)u in the up state at

step 3. The difference is ∆(u−d). The effect is that an entirely new tree hasto be constructed at each node on the ex -dividend date. There will thus bek+1 new (n−k)-step trees and (k+1)(n−k+1) terminal nodes. This is notdifficult to manage if the binomial scheme is programmed in modular formso that it can be executed repeatedly with different input data. In that caseone merely starts off the routine and values the derivative in an (n − k)-step tree at each of the k + 1 ex -dividend prices, then uses those pricesas terminal values in a second-stage k-step tree that starts from the initialcum-dividend price. However, the procedure is much slower, particularly soif the dividend is paid near the midpoint in the derivative’s life. The num-ber of nodes at which the derivative must be valued reaches a maximum of(n/2 + 1)2 at k = n/2, versus the usual n + 1 in a recombining tree.

There is a way to treat fixed dividends that avoids the nonrecombinanttree, but it has the disadvantage of introducing some inconsistency in themodeling of the stock dynamics. We refer to this as the “set-aside” method.The assumption is that in anticipation of paying the dividend the firm setsaside a money-fund deposit equal to the present value of the payment obli-gation. Equivalently (under assumption RK), the firm purchases discountbonds of face value equal to the promised dividend and maturing on theex date. This portion of the firm’s assets contributes to the share price butdoes not follow the usual Bernoulli dynamics generated by the firm’s other(“real”) assets. Paying the dividend removes this component of value andleaves just the residual real component, which continues to follow the samedynamics as before.

Here, specifically, is how the idea works. At all time steps before theex date, step k, regard the cum-dividend share price that is then observedas the sum of the ex -dividend price (the “real” component) and the presentvalue of the anticipated dividend; that is, as

scj = sj + ∆(1 + r)j−k (5.41)

for j ∈ 0, 1, . . . , k−1. The ex -dividend price sj follows the usual dynamics,sj+1 = sjRj+1, Rj+1 ∈ d, u. From step k onward this is the price weobserve, but a comparable cum-dividend asset can be constructed simply

Page 236: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 219

by reinvesting the dividend in the money-market account. This done, (5.41)applies for each j ∈ 0, 1, . . . , n. Again, the martingale probability π solvessc

j = mj

mj+1Ejs

cj+1, or

sj +∆(1+r)j−k =π

1 + r[sju+∆(1+r)j−k+1]+

1 − π

1 + r[sjd+∆(1+r)j−k+1],

and this has the usual form (5.38). As before, the derivative is priced offthe cum-dividend price prior to step k − 1, as

Dj(scj) =

11 + r

EjDj+1(scj+1) ∨ DX

j (scj),

and off the ex -dividend price from k onward, as

Dj(sj) =1

1 + rEjDj+1(sj+1) ∨ DX

j (sj);

while at k − 1

Dk−1(sk−1) =1

1 + rEjDk(sc

k − ∆) ∨ DXk−1(s

ck−1).

For a European-style derivative one can simply evaluate (5.11) or (5.15) atthe current ex -dividend price, which is the observed price minus the presentvalue of the dividend, s0 = sc

0 − B(0, k)∆.Figure 5.10 shows how the set-aside assumption avoids nonrecombinant

trees, illustrating once again the pricing of an American call with n = 3time steps, dividend payment at step k = 2, sc

0 = X = 10.0, u = 1.05,d = 0.95, r = 1.01 and ∆ = 0.50. The observed price of sc

0 = 10.0 at step 0equals the money-market deposit of ∆(1+r)−2 .= 0.490 (which finances thedividend) plus the residual real component, s0

.= 9.510. At step 1 in the upstate the real component grows to s0u

.= 9.985 and the money fund depositgrows to ∆(1 + r)−1 .= 0.495, giving an observed price of sc

1.= 10.480. At

step 2 in the upper-most state the ex -dividend price is s2 = s0uu.= 10.485,

which drops to s0uud.= 9.961 in the down state at the second node from

the top in step 3. The same point is reached after an uptick from the middlenode at step 2, where the ex-dividend price is s2 = s0ud

.= 9.486Unfortunately, while this arrangement does make the tree recombine

and cuts computational time, the dynamic structure does not quite makesense. The problem is that we start out conveniently at step 0 with theset-aside for the step-k dividend already in place, but no similar provisionis made at step k for the next dividend to be paid after the derivativeexpires. Doing so would produce the same nonrecombinant tree that was tobe avoided. Thus, this method should be thought of just as a shortcut rather

Page 237: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

220 Pricing Derivative Securities (2nd Ed.)

Fig. 5.10. Binomial pricing of American call on stock with lump-sum, set-asidedividend.

than as a completely satisfactory solution. Notice that the final answers forthe call price in figures 5.9 and 5.10 do differ slightly. The set-aside makesthe call worth a little less because there is less volatility and less potentialfor growth in share prices.

5.5 Implementing the Binomial Method

This section covers the details of how to put the binomial method intopractice. The main issues are (i) how to choose move sizes u and d to afforda satisfactory approximation to the dynamics of the underlying asset, and(ii) how to do the recursive calculations so as to use computational resourcesto best advantage.

5.5.1 Modeling the Dynamics

The Bernoulli model for the price of an asset clearly becomes more plau-sible the shorter is the interval of calendar time that corresponds to eachstep—that is, the larger is n. Moreover, we would have little confidence in

Page 238: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 221

the binomial method unless the sequence of estimates converged to somedefinite, finite value as n increased. Since the move sizes must themselvesdepend on n for this to happen, we write them now as un, dn. The mostbasic approach for pricing derivatives on stocks, stock indexes, and financialfutures is to choose un and dn so that the n-step distribution of terminalprice converges to the lognormal. We will now see how to do this. Sincethe limiting lognormal distribution for terminal price is the same as thatimplied by the continuous-time model for St that underlies the Black-Scholes theory (that is, geometric Brownian motion), it will not be surpris-ing to see that the n-step binomial prices for European options converge tothose given by the Black-Scholes formulas. Our first look at those formulaswill come in this section.

Specifying Move Sizes to Achieve Convergence

We work first with dynamics for prices of primary assets (“stocks”), treat-ing futures prices below. For now we assume that there are no dividendsor other explicit carrying costs. Begin as usual by dividing the T -year lifeof the derivative into n equal intervals [tj , tj+1)n−1

j=0 with tj = jT/n.On [tj , tj+1) the price of a share in the money fund grows by the pro-portion mj+1/mj = 1 + rj . Under assumption RK this equals the surereturn B(tj , tj+1)−1 on a one-period discount bond, and the common valueof these for each j is known as of t = 0. Since the goal is to approxi-mate continuous-time dynamics, we will express B(tj , tj+1)−1 in terms ofthe average continuously compounded spot rate, as B(tj , tj+1)−1 ≡ erjT/n,

where we set rj ≡ r(tj , tj+1) for brevity. Of course, under RK rj also equalsthe average forward rate as of t = 0, r0(tj , tj+1). Since erjT/n correspondsto (1 + rj) in the discrete-time notation, the martingale probability on thejth interval can now be written as

πjn =erjT/n − dn

un − dn. (5.42)

We now derive the limiting distribution as n → ∞ of the stock’s terminalprice after n steps, given the following choices of un and dn:

un = eσ√

T/n, (5.43)

dn = u−1n .

Here σ is a positive parameter not depending on n, whose interpretationwill be given later.

Page 239: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

222 Pricing Derivative Securities (2nd Ed.)

The limiting argument requires making large-n approximations to vari-ous quantities. The first thing we need is an approximation for πjn. This is

πjn =erjT/n − e−σ

√T/n

eσ√

T/n − e−σ√

T/n

=erjT/n+σ

√T/n − 1

e2σ√

T/n − 1

=σ√

T/n + (rj + σ2/2)T/n + o(n−1)2σ

√T/n + 2σ2T/n + o(n−1)

=12 + rj+σ2/2

√T/n + o(n−1/2)

1 + σ√

T/n + o(n−1/2)

=12

+rj − σ2/2

√T/n + o(n−1/2).

Now let Un be the number of upticks in the n time steps, and set

Rn ≡ ln sn − ln s0 = lnST − lnS0 ≡ R(0, T ),

which is the stock’s continuously compounded return over [0, T ]. (Recallthat we assume for now that there are no dividends.) Since sn =s0u

Unn dn−Un

n , we have

Rn = n ln dn + Un ln(un/dn)

= (2Un − n) lnun

= (2Un − n)σ√

T/n.

Being the number of upticks, Un is the sum of independent (but not identi-cally distributed) Bernoulli variates with parameters π0n, π1n, . . . , πn−1,n.

The jth of these has characteristic function

Ψj(ζ) = 1 + πjn(eiζ − 1)

for ζ ∈ and i =√−1, so that the sum has c.f.

ΨUn(ζ) =n−1∏j=0

[1 + πjn(eiζ − 1)].

Accordingly, the c.f. of Rn is

ΨRn(ζ) = e−iζσ√

Tnn−1∏j=0

[1 + πjn(e2iζσ√

T/n − 1)].

Page 240: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 223

Taking the logarithm of ΨRn and then expanding e2iζσ√

T/n in Taylor seriesin powers of n−1/2 give for ln ΨRn(ζ)

−iζσ√

Tn +n−1∑j=0

ln[1 + πjn(e2iζσ√

T/n − 1)]

= −iζσ√

Tn +n−1∑j=0

ln1 + πjn[2iζσ√

T/n − 2ζ2σ2T/n + o(n−1)].

Now use the O(n−1/2) approximation for πjn, simplify, and expand thelogarithm in Taylor series to write ln ΨRn(ζ) as

−iζσ√

Tn +n−1∑j=0

ln

[1 + iζσ

√T

n+ iζ(rj − σ2/2)

T

n− ζ2σ2 T

n+ o(n−1)

]

= −iζσ√

Tn +n−1∑j=0

[iζσ

√T/n + iζ(rj − σ2/2)T/n

− ζ2σ2

2T/n + o

(n−1

)]

= iζ

n−1n−1∑j=0

rj − σ2/2

T − ζ2σ2T/2 + o(1)

= iζ[r(0, T ) − σ2/2]T − ζ2σ2T/2 + o(1).

The last step follows from the fact that the log of the total return on aT -period bond at t = 0 is

n−1∑j=0

rjT/n = lnn−1∏j=0

B(tj , tj+1)−1 = lnB(0, T )−1 = r(0, T )T

under assumption RK. Letting n → ∞ gives

ΨRn(ζ) → ΨR(0,T )(ζ)

= expiζ[r(0, T ) − σ2/2]T − ζ2σ2T/2.

Now apply the continuity theorem for characteristic functions (theorem 4on page 74) to conclude that the limiting distribution of Rn is normal withmean r(0, T )T − σ2T/2 and variance σ2T . This implies that terminal priceST = S0e

R(0,T ) is distributed under the martingale measure as

S0 exp[r(0, T )T − σ2T/2 + Zσ

√T], (5.44)

Page 241: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

224 Pricing Derivative Securities (2nd Ed.)

where Z ∼ N(0, 1). The terminal price is therefore lognormally distributed.This shows also that σ2 governs the rate at which the variance of the logof the asset’s total return grows with time, and is therefore the varianceof log return over a unit interval. For this reason σ is called the “volatilityparameter”.

Other choices of uptick and downtick parameters than those in (5.43)can lead to the same lognormal limiting distribution; for example,

ujn = exp[(rj − σ2/2)T/n + σ

√T/n

]djn = exp

[(rj − σ2/2)T/n− σ

√T/n

].

Here, with time-varying spot rates the move sizes themselves are time-varying, but the risk-neutral probabilities,

πn =erjT/n − djn

ujn − djn=

eσ2T/(2n) − e−σ√

T/n

eσ√

T/n − e−σ√

T/n,

are constant. Although they are a bit more complicated, these expressionsdo have the advantage of revealing a more direct link to the comparablecontinuous-time model, geometric Brownian motion.

Since the distribution of R(0, T ) was derived from risk-neutral Bernoullidynamics, one expects it to be consistent with risk neutrality over the period[0, T ]. It is so. From the expression for the m.g.f. of a normal variate we haveE exp(Zσ

√T ) = exp(σ2T/2), so that E0ST = S0e

r(0,T ) = S0B(0, T )−1 andS0 = B(0, T )E0ST .

In pricing European options we have seen that the recursive proce-dure can be sidestepped and the arbitrage-free price written as a simpleformula; for example, as (5.24) or (5.25) for call options. These formulasresult from evaluating ECE(sn, 0) in the martingale measure that appliesto the Bernoulli dynamics, using the fact that ln sn is a linear functionof a binomial variate. We have just seen that the limiting distributionof sn as n → ∞ is lognormal, so it is not surprising that evaluatingECE(ST , 0) = E(ST −X)+ with ST as in (5.44) leads to a related formula,

CE(S0, T )=S0Φ

[ln( S0

BX ) + σ2T/2

σ√

T

]− BXΦ

[ln( S0

BX ) − σ2T/2

σ√

T

], (5.45)

where Φ(·) is the standard normal c.d.f. and B ≡ B(0, T ). Likewise, evalu-ating EP E(ST , 0) = E(X − ST )+ gives for the European put

PE(S0, T )=BXΦ

[ln(BX

S0) + σ2T/2

σ√

T

]− S0Φ

[ln(BX

S0) − σ2T/2

σ√

T

]. (5.46)

Page 242: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 225

These are in fact the famous Black-Scholes (1973) formulas, of which wegive two derivations in chapter 6.

Modifications for Futures Prices

In section 5.4.2 move sizes of the futures price, ujn and djn, were shownto be related to the move sizes of the price, sj , of the primary asset thatunderlies the futures. Multiplying ujn and djn by one-period discount factormj/mj+1 = (1 + rj)−1 = e−rjT/n = B(tj−1, tj) eliminates the averagepositive trend in sj under the risk-neutral measure and makes the futuresprice itself a P martingale. In moving from stage j to stage j + 1 of then-stage binomial tree on [0, T ] we thus take Fu

j+1 = FjB(tj , tj+1)un andFd

j+1 = FjB(tj , tj+1)dn in the up and down states, respectively, where Fj ≡F(tj , T ′) is the futures price at tj on a contract expiring at T ′ ≥ T . Withthis convention the equivalent-martingale probabilities remain as in (5.42).

How does this alteration of move sizes affect the limiting distributionof Fn as n → ∞? Recall that the terminal value of the underlying can berepresented as sn = s0u

Unn dn−Un

n , where Un is the number of upticks inthe n stages. Since up and down moves in the futures at stage j are bothproportional to B(tj , tj+1), the terminal futures price can be expressed as

Fn =n∏

j=0

B(tj , tj+1)F0uUnn dn−Un

n = B(0, T )F0uUnn dn−Un

n .

Since the continuously compounded n-period rate of return on the under-lying is

R(0, T ) = Rn ≡ ln(sn/s0) = ln(uUnn dn−Un

n ),

the corresponding n-period log return on the futures price is

Rn ≡ ln(Fn/F0) = lnB(0, T ) + Rn

= −r(0, T )T + Rn,

where r(0, T ) is the average spot rate for T -year loans initiated at t = 0.

Since the limiting distribution of Rn is normal with mean [r(0, T )−σ2/2]Tand variance σ2T , it follows that Rn is asymptotically normal with the samevariance and mean −σ2T/2. The limiting distribution of Fn/F0 is thereforelognormal with these two parameters. Consistent with the fact that thefutures price is a martingale under the risk-neutral Bernoulli dynamics, thelimiting result implies that the P-expected gain from the futures position

Page 243: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

226 Pricing Derivative Securities (2nd Ed.)

over [0, T ] is

E0[F(T, T ′) − F(0, T )] = F(0, T )e−σ2T/2+σ2T/2 − F(0, T ) = 0.

An alternative to modeling the dynamics of the underlying commod-ity or asset is to model the futures price directly. This would clearly beappropriate for futures on a commodity used in consumption and not heldprimarily as an investment asset, which could not in practice be used to con-struct a replicating portfolio. Taking the move sizes for the futures price as

un = eσ√

T/n, dn = e−σ√

T/n

(the same in each subperiod) and setting

πn =1 − dn

un − dn

again makes Rn ≡ ln(Fn/F0) converge in distribution to normal with mean−σ2T/2 and variance σ2T . The interpretation of σ2 as the variance of thelog return over a unit interval remains the same.

Modifications for Assets Paying Continuous Dividends

Moving from derivatives on futures back to derivatives on stocks, curren-cies, or other primary assets, let us see how the payment of continuous,proportional dividends can be specified so as to provide the correct limit-ing distribution of R(0, T ). Suppose the underlying asset pays a continuousproportional dividend at known constant rate δ over [0, T ]. With reinvest-ment of the dividends one share bought for S0 at t = 0 would be worthS0e

δT at T , apart from capital gains. For the asset to be priced like a risk-less bond, as it would be in the risk-neutral measure, the expected totalreturn from capital gains must be reduced by the factor e−δT . This requiressetting the mean parameter in the normal distribution of ln(ST /S0) to[r(0, T )−δ−σ2/2]T. Simply subtracting δ from the average forward rate ineach subinterval makes the binomial setup produce this result in the limit.We therefore take

πjn =e(rj−δ)T/n − dn

un − dn.

To allow for time-varying (but still deterministic) dividend rates, justreplace δ by δj . For derivatives on a foreign currency the foreign inter-est rate is the continuous dividend, since it governs the rate at which aninitial investment in foreign-denominated bonds or money deposits growsover time. To apply binomial pricing in this case, just deduct from rj theaverage period-j forward rate in the foreign country.

Page 244: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 227

By contrast, dividends or storage costs on an asset or commodity under-lying a futures contract do not affect the expected gain on a futures posi-tion under P, which always equals zero. These receipts or costs require noadjustment of πn or the move sizes in pricing derivatives on futures.

Choosing Binomial Parameters in Applications

To apply the binomial method one must specify the number of time steps,n, and determine the move sizes, un = exp(σ

√T/n) and dn = u−1

n , andthe martingale probability πjn. The probability for step j depends on themove sizes, the period-j discount factor, B(tj , tj+1), and (if the underlyingpays continuous dividends) the dividend rate, δ. The volatility, σ, is the keyparameter because it is the only ingredient that is not observable or cannotreadily be approximated in terms of observables. We consider this first.

In principle, estimating σ should be a simple matter. The Bernoullidynamics and the limiting lognormal model imply that log returns of theunderlying asset in nonoverlapping periods of equal length are independentand identically distributed. Since σ is just the standard deviation of the logreturn over a unit time interval, the obvious thing to do is to calculate thesample standard deviation of returns in an historical sample. For stockswe might calculate the log returns from dividend-corrected daily closingprices, as Rτ = ln(Sτ+1 + ∆τ+1)− lnSτM

τ=1. Here τ counts trading days,and ∆τ+1 is the dividend if the stock goes ex dividend on day τ + 1, andzero otherwise. For futures the calculations are usually made with dailysettlement prices—the prices officially determined by the exchange in orderto mark traders’ positions to market. The unbiased sample variance froma sample of size M is

S2M = (M − 1)−1

M∑τ=1

(Rτ − R)2,

where R is the sample mean. Once S2M has been calculated, one converts it

to the same time units used for interest rates and time to expiration. Forexample, if the basic time unit is years, S2

M from daily returns is multipliedby the number of days per year.9 Under the limiting lognormal form of

9Alternatively, the variance might be scaled by the number of trading days per year

or by a weighted average of calendar and trading days. Averaging can be justified by thefact that volatility over non-trading intervals is greater than zero but less than that overcomparable trading spells.

Page 245: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

228 Pricing Derivative Securities (2nd Ed.)

the Bernoulli model S2M is a sufficient statistic for σ2 and therefore has

minimum sampling variance in the class of unbiased estimators. Since it isalso strongly consistent, it seems that we could get as good an estimate ofσ as desired by just taking M to be large enough. What could possibly bewrong with this simple plan?

The problem is that for most securities the i.i.d.-normal model for logreturns is a poor description of the actual data. The empirical marginaldistributions typically exhibit much thicker tails than the normal, espe-cially for short holding periods, such as one day. This indicates that largeabsolute deviations from the mean occur much more often than they wouldfor a normal variate with the same variance. Moreover, returns are notindependent: the conditional distribution of tomorrow’s return manifestlydepends on past experience. Although the past record does not help much,on average, in forecasting tomorrow’s actual return, it does provide goodforecasts of the square of the return or the squared difference from themean. Shocks that raise volatility in one period seem to persist and keepit high for a time.10 To apply binomial pricing we need to know, at thevery least, the average volatility over the life of the derivative, and thesample standard deviation gives a poor estimate of this because it treats allhistorical observations as conveying the same information about the future.

Of course, if σ fluctuates through time, there is clearly a flaw inbinomial-pricing models based on the Bernoulli dynamics thus far consid-ered, and the same flaw is present in continuous-time models like Black-Scholes that depend on the limiting lognormal case. In later chapters weintroduce more elaborate and more realistic continuous-time models, andin the final section of the present chapter we discuss a way of incorporatingtime- and price-dependent volatility into the binomial framework. How-ever, despite their inconsistency with empirical evidence about the under-lying dynamics, the standard binomial and Black-Scholes models remainthe benchmarks against which others are judged because of their simplicityand tractability. The trick in using them is to obtain a better estimate ofthe average volatility over the derivative’s lifetime than is provided by thehistorical sample standard deviation. There are various ways to do this,

10Not only is volatility persistent, for stock returns it has regular weekly and annualcycles. For the evidence see Campbell, Lo, and MacKinlay (1997). Temporal variationand dependence in volatility can explain some of the observed thickness of the tails in the

empirical marginal distributions of daily returns, but the usual models of time-varyingvolatility do not fully account for it. Even conditionally on the past, the normal modelgives a poor description of daily log returns.

Page 246: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 229

some sophisticated and some ad hoc. A common ad hoc approach thatoften works very well is to use a value of σ that puts estimates for com-parable traded derivatives close to their observed prices. This “implicit”volatility that can be deduced from prices of traded derivatives will be dis-cussed in section 5.6 below and again in connection with the Black-Scholesmodel in chapter 6. Also in common use are more sophisticated statisti-cal techniques for estimating σ from historical data. Nonlinear time-seriesmodels such as ARCH (for autoregressive, conditional heteroskedasticity)and its many generalizations (GARCH, EGARCH, etc.) characterize thepersistence in volatility parametrically. Having estimated the parametersby maximum likelihood, one can use the models to generate conditionalforecasts of average volatility over future periods.11

Once σ has been estimated and un and dn have been calculated, we stillneed the one-period interest rate or price of a one-period bond at each stepj in order to find the martingale probabilities:

πjn =exp[r(tj , tj+1)T/n] − dn

un − dn=

B(tj , tj+1)−1 − dn

un − dn.

Since in actual practice these future one-period rates and prices are notknown in advance, they must be replaced by the corresponding forwardrates and prices. This, too, is not entirely straightforward. If spot pricesof discount bonds could be observed at a continuum of maturities, the for-ward prices could just be calculated as B0(tj , tj+1) = B(0, tj+1)/B(0, tj),but we must make do with the limited number of maturities that areavailable. A direct way to estimate B(0, t) for arbitrary t is to interpolatebetween prices of nearby bonds. A better way is to interpolate along theyield curve between the annualized spot rates for the available maturities,r(0, t) = − lnB(0, t)1/t, then back out the corresponding tj-maturing spotbond price. In the U.S. there is an active secondary market for Treasurybills and Treasury strips of a wide range of maturities, from which the spotbond prices and spot rates can be calculated in this way.

Prices of most equity and index derivatives are not highly sensitiveto variation in interest rates. If the yield curve is relatively flat over thederivative’s life, it is reasonable to treat one-period interest rates as constantand equal to the average spot rate on a riskless bond with about the samematurity, r(0, T ). In this case the risk-neutral probabilities are the same for

11Campbell et al. (1997, chapter 12) give a full account of the procedure.

Page 247: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

230 Pricing Derivative Securities (2nd Ed.)

each time step:

πn =exp[r(0, T )T/n] − dn

un − dn=

B(0, T )−1/n − dn

un − dn.

If the underlying pays a continuous dividend at rate δ, this is subtractedfrom r0(tj , tj+1) or r(0, T ) in the above formulas. (Dividends paid at dis-crete intervals are handled as explained in section 5.4.4 and require noadjustment of π.)

Example 64 As an illustration of the whole process, let us construct themove sizes and risk-neutral uptick probability for pricing a 3-month (.25-year) option on a no-dividend common stock using a binomial tree withn = 100 time steps. Given the estimate σ = .30 for annualized volatilityand an annualized yield of r0(0, .25) = .05 on 3-month Treasury bills, wehave

un = e.30√

.25/100 .= 1.015

dn = u−1n

.= 0.985

πn =e.05(.25/100) − 0.985

1.015 − 0.985.= 0.504.

5.5.2 Efficient Calculation

With specifications for un, dn, and πjn that cause sn to converge in dis-tribution to lognormal, one expects binomial price estimates to improve inaccuracy as n increases. How large n should be in a particular applicationdepends on the degree of accuracy required, the nature of the derivativebeing priced, the costliness of computational time, the limitations of com-puter memory, and the computational design. This section presents somedesign suggestions for making efficient use of computer resources. We beginwith some techniques of general applicability that economize on memoryand computational time, then describe some smoothing techniques thatgreatly accelerate convergence for certain derivative assets.

Economizing on Memory

Memory requirements for binomial computations quickly get out of handeven for moderate values of n unless storage is used efficiently. The recom-binant trees we have used to illustrate the technique have been two-dimensional arrays over time and price, involving n+1 terminal price states

Page 248: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 231

after n time steps. One’s first thought in designing a program is to constructone or perhaps two n × (n + 1) arrays—one for the underlying price andone for the derivative. However, all that is really needed is just a singlevector array of dimension n + 1. Indeed, it is possible to get by with evenless than that.

First, here is how to do it with an (n + 1)-dimensional array. For con-creteness, consider an American-style derivative on a no-dividend stockthat is worth DX

j ≡ DXj (sj) upon exercise at some node (sj , j). After

n time steps the binomial tree will have n + 1 terminal nodes, corre-sponding to underlying prices ranging from s0d

n to s0un. In the pro-

gram, as in the numerical examples, one starts at step n when the optionexpires and works backward successively to steps n − 1, n − 2, . . . andultimately to step 0, where the desired initial value is finally obtained.We work with an (n + 1)-dimensional vector D, which can be envisionedas a columnar array. At step n fill D with the derivative’s terminal values,D(sn, 0) = D(s0u

n−kdk, 0)nk=0, storing D(s0u

n, 0) at the top as Dn andD(s0d

n, 0) at the bottom as D0. For a put struck at X these would becalculated as D(s0u

n−kdk, 0) ≡ (X − s0un−kdk)+. Working down from the

top with underlying price s0un, each successive value of sn can be obtained

recursively from its predecessor by multiplying by d/u. Now moving backto the n − 1 step, calculate new derivative values, D(s0u

n−1−kdk, 1)n−1k=0

and store them in the same D array, as follows. Starting at the top, withBn−1 ≡ B(tn−1, tn ≡ T ) as the one-step discount factor for step n − 1,calculate D(s0u

n−1, 1) as

Bn−1[πn−1D(s0un, 0) + (1 − πn−1)D(s0u

n−1d, 0)] ∨ DXn−1(s0u

n−1)

and store as Dn, writing over the old value D(s0un, 0). Continue down the

array, evaluating the derivative at position k, D(s0un−1−kdk, 1), as

Bn−1[πn−1D(s0un−kdk, 0) + (1 − πn−1)D(s0u

n−k−1dk+1, 0)]

∨DXn−1(s0u

n−1−kdk) (5.47)

and storing as Dn−k, for k = 1, 2, . . . , n−1. These n operations replace justthe top n elements of D, leaving D0 unchanged as D(s0d

n, 0). This is notneeded after the n step. At the n−2 step the top n−1 elements of D will bemodified while leaving D1 and D0 unchanged, and so on. In general, at eachstep j just the top j + 1 elements, Dn, Dn−1, . . . , Dn−j , will be calculated.In particular, at step 0 only Dn is changed, and this represents initial valueD(s0, n) ≡ D(S0, T ).

Page 249: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

232 Pricing Derivative Securities (2nd Ed.)

Now here is how to manage with even fewer than n+1 storage positions.This is done by storing only nonzero values of the derivative in the D

array, these being the only ones that will affect its initial value. To makethe notation more compact, we illustrate specifically for a European calloption. Let nX be the maximum number of downticks for which the callis in the money at the n step; that is, nX is such that s0u

n−kdk − X > 0for k = 0, 1, . . . , nX and (s0u

n−kdk − X)+ = 0 for k > nX . We will needa D array with only nX + 1 elements. At step n, fill D with the nonzerooption values, beginning with DnX = CE(s0u

n, 0) = s0un − X at the top

and continuing through D0 = CE(s0un−nX dnX , 0) = s0u

n−nX dnX − X

at the bottom. Moving to step n − 1, recalculate the top nX elements,DnX , DnX−1, . . . , D1, as

DnX−k = CE(s0un−1−kdk, 1)

= Bn−1[πn−1CE(s0u

n−kdk, 0) + (1 − πn−1)CE(s0un−k−1dk+1, 0)]

for k = 0, 1, . . . , nX − 1. The bottom element of the array, D0, will contain

CE(s0un−1−nX dnX , 1) = Bn−1πn−1C

E(s0un−nX dnX , 0).

This depends just on the uptick value at step n, since the downtick valueis zero. The process is repeated at steps j = n− 2, n− 3, . . . , nX . From thebottom j +1−nX nodes of the complete tree at each such step it would beimpossible to attain a terminal node corresponding to a positive value of thecall. Thus, at each such j the element D0 will depend on just the bottomelement of the previous step. Once step nX − 1 is reached, however, thecomplete tree will have only nX nodes, and from even the lowest of these,where snX−1 = s0d

nX−1, it is possible to attain positive terminal valueson downtick as well as uptick. For each of steps j = nX − 1, nX − 2, . . . , 0,therefore, each of the j +1 elements DnX−k = CE(s0u

j−kdk, n− j)jk=0 is

computed as

Bj [πjCE(s0u

j−k+1dk, n − j − 1) + (1 − πj)CE(s0uj−kdk+1, n − j − 1)],

just as if the full (n+1)-dimensional array had been used. We wind up withthe desired initial value at the top, as before.

The procedure for n = 4 and nX = 3 is illustrated in table 5.2, whereD

jk stands for the element in the kth position at the jth step and β and β

represent Bπ and B(1 − π), respectively. We start at step 4 of the treewith the array D4 of the derivative’s nonzero terminal values and workbackward to step 0. The entries show how elements of D are calculated at

Page 250: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 233

Table 5.2. Value arrays for memory-efficient binomial calculations.

D0 D1 D2 D3 D4

βD13 + βD

12 βD

23 + βD

22 βD

33 + βD

32 βD

43 + βD

42 D

43

[βD22 + βD2

1] βD22 + βD2

1 βD32 + βD3

1 βD42 + βD4

1 D42

[βD31 + βD3

0] [βD31 + βD3

0] βD31 + βD3

0 βD41 + βD4

0 D41

[βD40] [βD

40] [βD

40] βD

40 D4

0

each successive step. The bracketed elements at the bottom are those thatget carried over from the previous step and are not used again.

In pricing path-independent European derivatives it is unnecessary toknow the values of the asset except at the terminal nodes, but they areneeded for American derivatives in order to judge whether to exercise early.Rather than store the asset values, it is efficient to compute them recursivelyas one moves down the D array at each step, just as we described doing atthe n step. That is, at the j step start with s0u

j (the highest price) andmultiply each element by d/u to obtain the next lower price in the array.

Program BINOMIAL on the accompanying CD carries out binomial pric-ing of European, American, and extendable puts and calls on various under-lying assets, allowing for various types of dividends. (Extendable optionsare discussed below.)

Economizing on Time

Two time-saving numerical methods that work in a number of contexts,including binomial pricing, are (i) the use of control variates and (ii)Richardson extrapolation.

A control variate is an artificial variable with known properties that isconstructed in order to reduce the estimation error in a quantity whoseproperties are not well understood. Suppose, for example, that we haveestimated the value of an American put option using an n-step binomialtree that is set up to converge to geometric Brownian motion as n → ∞. Theestimate, PA

n , is the sum of the true value, PA, plus an unknown estimationerror, an. The goal is to obtain a new estimate, PA

n , with smaller error. Forthis purpose we would construct as a control a binomial estimate, PE

n , of anotherwise identical European put from the same tree. The estimation errorin the control, en = PE

n − PE , can be determined with high accuracy bycomparing the binomial estimate with that obtained from the Black-Scholesformula. Presuming that the errors an and en are closely related, we would

Page 251: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

234 Pricing Derivative Securities (2nd Ed.)

expect to improve the estimate for the American put by subtracting en fromit. The new estimate would then be PA

n = PAn − en = PA + (an − en). If an

and en were random variables having about the same mean and variance,then we could say that the mean squared error of PA would be smaller thanthat of PA

n provided the correlation between an and en exceeds 0.5.12

Example 65 Consider 3-month (.25-year) puts struck at X = 10.0 onan underlying no-dividend stock priced at S0 = 10.0 and with volatilityσ = 0.4. With 3-month unit discount bonds selling for B(0, .25) = 0.9850,

program BINOMIAL yields for an n = 25-stage tree PA25

.= 0.7384 andPE

25.= 0.7257, while the Black-Scholes estimate from (5.46) is PE .= 0.7178.

The error, e25.= 0.7257 − 0.7178 = 0.0079, yields control-variate estimate

PA25

.= 0.7384 − 0.0079 = 0.7305. For comparison, a 1000-stage binomialtree delivers PA

1000.= 0.7300.

Turn now to the second error-reduction technique, Richardson extrapo-lation. Broadly, the idea is to estimate the price of a derivative for eachof a short sequence of relatively small but increasing values of n—forn1, n2, . . . , nm, say—and, in effect, to extrapolate from these to n = ∞. Forexample, we could estimate for n = 10, 20, 30 and extrapolate from there.We will see just how to do this directly, but first consider the logic of it.The number of calculations in the standard binomial solution of n steps isthe same as the number of nodes in the tree, or (n+1)(n+2)/2 (somewhatless if the compact solution is used). Getting estimates for n = 10, 20, 30requires a total of 793 calculations. If extrapolating from these gives accu-racy comparable to one estimate with n = 50, we will have saved 533calculations; if it is comparable to one with n = 100, the saving is 4,358calculations!

Being thus motivated, let us see how it works. Instead of extrapolatinga sequence of functions of n out to n = ∞, we actually extrapolate downto zero a sequence of functions of 1/n. These functions are the binomialestimates indexed by the time interval between steps, T/n. For a derivativeexpiring after T units of time, let hk = T/nk be the interval correspondingto an nk-step tree and let D(hk) be the corresponding binomial estimate.

12A refinement of the control-variate estimator is P An (α) = P A

n − αen, where α ischosen to minimize the variance of P A

n (α). The optimal value, α∗ = cov(P An , en)/V en,

can be determined by evaluating the covariance and variance in the terminal (binomial)distribution implied by the tree. Details and applications of this idea in the Monte Carlocontext are given in chapter 11.

Page 252: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 235

The hope is that D(h) is smooth enough that it can be approximated bya few terms of a Taylor expansion about h = 0, as D(h) = a0 + a1h +a2h

2 + · · · + am−1hm−1. If this is so, then obtaining estimates

D(h1), . . . , D(hm) for m values of h gives m equations in m unknowns, andone of these is the value a0 = D(0) that corresponds to n = ∞. For exam-ple, taking just a linear approximation with m = 2 gives D(h1) = a0+a1h1,D(h2) = a0 + a1h2, and the solution

D(0) =h1D(h2) − h2D(h1)

h1 − h2. (5.48)

With m = 3 the estimate is

D(0) =h1h2(h1 − h2)D(h3) − h1h3(h1 − h3)D(h2) + h2h3(h2 − h3)D(h1)

h21h2 + h1h2

3 + h22h3 − h2h2

3 − h1h22 − h2

1h3.

Even the linear approximation can be very effective.

Example 66 Consider pricing a one-year European put struck at X = 10.0on a stock with S0 = 10.0 and volatility σ = 0.30, given an annual discountfactor of B(0, 1) = 0.95. With n1 = 10 (h1 = 0.10) and n2 = 20 (h2 = 0.05)program BINOMIAL yields D(0.10) .= 0.900 and D(0.05) .= 0.915. Plugginginto (5.48) gives D(0) .= 0.930. This is closer to the Black-Scholes solutionof 0.929 than the 0.926 obtained with a 100-step tree, which took 4, 853 morecalculations (counting the extrapolation itself as one).

Special Smoothing Techniques

While Richardson extrapolation can indeed produce impressive gains inefficiency, there is cause for concern that sequences of binomial estimatesmay lack sufficient smoothness for the method to work reliably in someapplications. We now look at the cause of this problem and then considersome ways of smoothing the estimates. We have seen that Bernoulli dynam-ics leads to an explicit solution for arbitrage-free prices of European-stylederivatives. For example, when the short rate is considered to be constantout to T the initial price of a European call is

CE(S0, T ) = B(0, T )n∑

j=0

(n

j

)πj

n(1 − πn)n−j(S0ujndn−j

n − X)+.

The top panel of figure 5.11 depicts what is involved in calculating thesummation; namely, adding the n + 1 products of binomial weights (rep-resented by the vertical bars) times the corresponding terminal payoffs.

Page 253: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

236 Pricing Derivative Securities (2nd Ed.)

Fig. 5.11. Binomial pricing of European call as a discrete expectation.

The bottom panel shows what happens as the number of time steps, n, isincreased. This increases the number of terminal nodes at which there areatoms of probability mass, both extending the range of the support of ST

and, since un and dn shrink toward unity as n grows, also decreasing thespacing between the atoms. In the process there is a net transfer of prob-ability mass from one side to the other of the kink in the terminal payofffunction. Since the payoff is zero to the left of the kink and positive to theright, this shift of mass can appreciably affect the value of the summation.If one plots a sequence of binomial estimates based on n1, n2, . . . time steps,the result is a slightly ragged curve, the estimates oscillating back and forthas they tend toward the limiting Black-Scholes value. The kink typically hasmore effect for small n since the individual probability weights around thestrike price are usually larger when n is small. Although the mechanismis more complicated for American options, the piecewise linearity of thepayoff function still has a similar effect.

The problem arising from lack of smoothness in the terminal payoff func-tion of an American option can be reduced by the following simple trick.Backing up to stage n−1 of the tree, the retention value of the derivative isjust that of a European option with remaining life T/n. This can be pricedby Black-Scholes. As we shall see in chapter 6, the Black-Scholes price

Page 254: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 237

(prior to expiration) is a smooth, convex function of the price of the under-lying, so starting off an (n−1)-step tree with Black-Scholes prices eliminatesthe kink in the value function. The smaller n is and the larger is T/n, thegreater is the gain in smoothness and the greater is the improvement in thebinomial estimates. Reviewing a number of approaches for pricing Amer-ican options, Broadie and Detemple (1996) conclude that this enhancedversion of the binomial method is quite competitive in terms of accuracyand speed, particularly when coupled with Richardson extrapolation.

Lack of smoothness in the payoff function is an even greater problemwhen pricing what are called “extendable” options. If not in the moneyat each of a sequence of specific dates prior to expiration, the life of anextendable option is extended to the next date in the sequence. If it is inthe money at any of these dates, it must be exercised. For example, a putoption might expire after six months if then in the money, but otherwise beextended for one additional six-month period. Using program BINOMIAL toprice these options, one can see that convergence as n increases is extremelyslow. This is illustrated in figure 5.12. The three bold, jagged curves areplots of binomial estimates of prices of three European-style extendable

0 10 20 30 40 50 60 70 80 90 100

Subperiods between ex dates (in 100s)

4.4

4.6

4.8

5.0

5.2

5.4

5.6

5.8

Est

imate

dprice

(x10

4) PE(0.,.5,1.)

PE(0.,.33,.67,1.)

PE(0.,.25,.5,.75,1.)Standard binomial estimateWith smoothness correction

Fig. 5.12. Binomial estimates of extendable puts vs. number of time steps betweenexercise dates.

Page 255: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

238 Pricing Derivative Securities (2nd Ed.)

puts vs. the number of time steps between dates. The top-most curve is fora once-extendable 6-month (half-year) put, which is denoted PE(0, .5, 1).The other curves are for 4-month and 3-month puts with up to two andthree extensions, respectively. For all three the oscillations continue to bewell above the graphic threshold, even out to 10,000 steps between exercisedates.

To see why convergence is so slow, consider valuing at t = 0 a Europeanput that can be extended once at t = t∗ and then expires at T > t∗. At t∗

the put is worth X − St∗ if in the money; otherwise its value is that ofan ordinary, nonextendable European put with remaining life T − t∗. Theinitial value can be represented as the discounted expectation of this time-t∗

payoff function in the martingale measure; that is, as

PE(S0, t∗, T ) = B(0, t∗)E[(X − St∗)+ + PE(St∗ , T − t∗)1[X,∞)(St∗)].

The sawtooth curve in figure 5.13 depicts the value at t∗ as a function ofSt∗ , with a discrete probability mass function overlaid. The value functionis discontinuous at X because the payoff approaches zero as St∗ approachesthe strike price from below, whereas its value at X is that of an at-the-money European put with T − t∗ to go. What happens as n increases issimilar to what was described for an ordinary nonextendable option, but thediscontinuity in the t∗ value function makes the effect much more dramatic.Probability mass shifts back and forth across the discontinuity, causing thebinomial estimates to oscillate wildly and persistently.

Fig. 5.13. Binomial price of extendable put as discrete expectation.

Page 256: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 239

Dealing with this problem is a little more complicated. Suppose thepayoff function were shifted upward in the interval [0, X) so as to eliminatethe discontinuity, as represented by the dashed diagonal line in figure 5.13.The required shift would equal the value of an at-the-money vanilla Euro-pean put with T − t∗ to go, or PE(X, T − t∗), and this can be computedfrom the Black-Scholes formula. The value of a derivative with this quitesmooth, continuized payoff function can be estimated relatively accuratelyby the binomial method. Its terminal (t∗) value would be defined for then step of a binomial recursion as X − sn + PE(X, T − t∗) for sn ∈ [0, X)and as PE(sn, T − t∗) for sn ≥ X . Of course, the value of this displacedtime-t∗ payoff would overstate that of the extendable put by the discountedexpected value of PE(X, T − t∗) times an indicator function of the interval[0, X); namely,

B(0, t∗)E[PE(X, T − t∗)1[0,X)(St∗)].

This is the value of a “digital” option that pays PE(X, T − t∗) wheneverSt∗ < X . Since PE(X, T − t∗) is a known constant, however, this overstate-ment is easily calculated as

B(0, t∗)PE(X, T − t∗)P(St∗ < X),

and it can simply be subtracted from the binomial estimate of the dis-placed payoff function. The probability can be calculated from the limitinglognormal distribution for St∗ given S0.13 The smooth, dotted curves infigure 5.12 are plots of these continuized estimates.

5.6 Inferring Trees from Prices of Traded Options

In the standard binomial setup considered thus far the move sizes, un anddn, are the same for all nodes. While the risk-neutral probabilities do varywith the time step if the short rate of interest is time varying, the valuesof πj at step j are nevertheless the same at all j +1 nodes. A more generalmodel would allow move sizes and probabilities alike to depend on both time

13This corresponds to the factor Φ[·] in the first term of the Black-Scholes put

formula, (5.46). With T = t∗ as the time of expiration this is

bP(St∗ < X) = Φ

»ln(BX/S0) + σ2t∗/2

σ√

t∗

–.

Details of the smoothing method and extensions to multiple exercise dates are given inEpps et al. (1996).

Page 257: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

240 Pricing Derivative Securities (2nd Ed.)

and price. In effect, since move probabilities are calculated from interestrates and move sizes, and since move sizes are calculated from the volatilityparameter (as un = eσ

√T/n in the usual setup), such an arrangement would

amount to having price- and time-dependent volatilities. Whereas the logof price approaches normality as n → ∞ in the usual constant-volatilityBernoulli setup, an appropriate price-dependent specification would lead tomarginal distributions with thicker tails than the normal and could accountfor some degree of intertemporal dependence of volatilities. As discussed atthe end of section 5.5.1, prices of financial assets usually exhibit both ofthese features. We will see in chapter 8 that continuous-time models withprice-dependent volatilities can also generate option prices whose implicitvolatilities vary systematically with the strike price. While this is contraryto both the Black-Scholes and standard binomial models, it, too, is consis-tent with empirical observation.

Just as the constant σ in the standard model can be inferred from a sin-gle option price, it is possible to infer σ(t, St) as a function of time and pricefrom prices of a suitable collection of derivatives. Methods for doing thiswithin the Bernoulli framework have been proposed by Derman and Kani(1994) and Rubinstein (1994).14 Here we describe the simpler Rubinsteinprocedure, as modified by Jackwerth and Rubinstein (1996), which uses acollection of options all of the same maturity to determine node-dependentmove sizes and probabilities. The method requires knowing current pricesof a collection of European-style options at different strikes but with thesame expiration date, T . Because of European put-call parity either a putor call at each strike suffices, and nothing is gained from having both. Themethod is therefore limited to underlying assets on which European optionswith a large range of strikes do trade, such as the S&P 500 index.15

5.6.1 Assessing the Implicit Risk-Neutral Distribution of ST

Letting (Ω,F , Fj, P) be the actual filtered probability space on which theBernoulli price process is defined, our task will be to find an equivalentmartingale measure P that correctly prices a collection of T -expiring,

14Dupire (1994) presented a related idea in the context of diffusion models andtrinomial dynamics. Working also in the continuous-time framework, Dumas et al. (1998)use option prices to fit a flexible quadratic model for σ. We describe their findings inchapter 8.

15In principle, it would suffice to have a large number of T -expiring American callson an underlying stock that paid no dividends before T .

Page 258: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 241

traded, European-style derivatives. The first step is to build a standardn-step tree with time steps T/n, using a guess as to what an appropri-ate constant σ would be. This might be an average of implicit volatilitiesdeduced from options close to the money or an estimate from historicalreturns. This trial value of σ is then used to calculate values of move sizes, u

and d, in the standard way. From the move sizes one then determines pricesof the underlying at the terminal nodes, as sn,kn

k=0 = s0un−kdkn

k=0, anda corresponding vector of risk-neutral probabilities, πkn−1

k=0 . Together, themove sizes and probabilities determine a benchmark measure, P, of pre-cisely the form we have worked with heretofore. Assuming a constant inter-est rate, the move probabilities will be the same at each step—that is, π

and 1 − π—and the nodal probabilities under P will just be binomial, as

Πn,k ≡ P(ST = s0un−kdk) =

(n

k

)πn−k(1 − π)k, k = 0, 1, . . . , n.

If the asset pays a continuous proportional dividend at rate δ, then ST isinterpreted as a cum-dividend price and π is calculated as in (5.35).

The goal is to replace these benchmark probabilities with new ones,Πn,k ≡ P(ST = s0u

n−kdk), that do a better job of explaining the observedprices of derivatives. Ultimately, all of the interior nodes of the tree will bemoved around as well in the process of developing distinct move sizes foreach node, but the next task is just to find the new probabilities for theterminal nodes. Finding the new vector of probabilities, Πn ≡ Πn,kn

k=0,

can be posed as a constrained optimization problem. The idea is to minimizea certain criterion function, C(Πn), subject to the following constraints onthe elements of Πn:16

• That they be nonnegative and sum to unity;• That they preserve the martingale property for the (cum-dividend) nor-

malized process sj/mj = (1 + r)−jsj/m0; and• That they correctly price all the options in the working sample.

Here, “correctly price” means that the implied prices fall within the cur-rent bid and asked quotes at the time the option sample is taken. Thus, ifD(s0; Xm)q, m ∈ 1, 2, . . . , M, q ∈ b, a are quotes on the collection of

16Rubinstein (1994) does not impose the second constraint, which requires him toredefine the interest rate (an observable!) in order to restore the martingale property.

Page 259: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

242 Pricing Derivative Securities (2nd Ed.)

M puts and calls expiring at T , the constraints would ben∑

k=0

Πn,k = 1, Πn,k ≥ 0, k ∈ 0, 1, . . . , n (5.49)

(1 + δ

1 + r

)n n∑k=0

Πn,ksn,k = s0 (5.50)

and

(1 + r)−nn∑

k=0

Πn,kD(snk; Xm) ∈[D(s0; Xm)b, D(s0, Xm)a

],

m ∈ 1, 2, . . . , M. (5.51)

In these expressions the initial price of the underlying, s0, is defined as theaverage of the current bid and asked, (sb

0 + sa0)/2. Rubinstein (1994) takes

the vector of standard binomial probabilities, Πn, as the benchmark anduses the squared Euclidean distance of Πn from Πn as criterion function:

C(Πn; Πn

)=

n∑k=0

(Πn,k − Πn,k

)2

.

Jackwerth and Rubinstein (1996) consider several other criteria, includingthe Kullback-Leibler information,17

C(Πn; Πn) =n∑

k=0

Πn,k ln(Πn,k/Πn,k

), (5.52)

but they wind up favoring a simple smoothness criterion that does notinvolve the original binomial probabilities at all; namely,

C(Πn

)=

n∑k=0

[(Πn,k+1 − Πn,k

)−

(Πn,k − Πn,k−1

)]2

.

Notice that the tree must have more than m+2 time steps if the optimiza-tion is to be feasible, since (5.49)-(5.51) impose a total of m+2 constraintson the n + 1 elements of Πn.

17The K-L information or entropy function is a measure of distance between twoprobability measures that takes the form (5.52) in the discrete case. It is nonnegativeand equals zero if and only if the two measures are the same. For an overview of properties

see the article by Kullback (1983). Stutzer (1996) uses the entropy function in a clevernonparametric scheme to estimate the risk-neutral distribution over a holding period oflength T .

Page 260: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 243

5.6.2 Building the Tree

As usual, the tree branches out from initial price s0 through a sequenceof up and down moves. What is different now is that the move sizes u

and d and the uptick and downtick probabilities will vary from node tonode. To handle this notationally, we distinguish the probabilities in thesame way as the prices; thus, π0, u0, d0 are the initial uptick probabilityand move sizes from s0, and πj,k, uj,k, dj,k are the probability and movesfrom price sj,k, where j counts time steps and k ∈ 0, 1, . . . , j indexesthe nodes. Since k corresponds to the number of downticks from the initialprice, the prices sj,kj

k=0 are ranked as sj,0 > sj,1 > · · · > sj,j.The following assumptions provide the additional structure needed to

guarantee a nice solution to the problem of finding equivalent measure P.

1. The tree is recombining. This means that the product of a sequence ofj − k up moves and k down moves leading to a step-j price is invariantunder a permutation of the moves. For example, there are 22 = 4 pathsleading to the prices at step two, but the uptick-downtick and downtick-uptick paths both lead to the same place: s0u0d1,0 = s0d0u1,1. An n-steptree will therefore have just n + 1 terminal prices, as usual.

2. All paths leading to any given node are equally likely. Of the 2n pathsin an n-step tree,

(nk

)lead to the kth terminal node, where the nodal

probability is Πn,k. The assumption implies that each of these paths hasthe same probability, Pn,k ≡ Πn,k/

(nk

). Combined with assumption 1,

this tells us that all paths with given numbers of up and down movesare equally likely. For example, the path probabilities at the three nodesat step two are P2,0 = Π2,0, P2,1 = Π2,1/2, P2,2 = Π2,2.

Within this framework let us see how to solve for the move probabilitiesand move sizes at any node. The n+1 terminal prices at step n have alreadybeen determined, along with corresponding nodal probabilities Πn,kn

k=0.Let us take a position at any node k ∈ 0, 1, . . . , n − 1 on the n − 1 step,at which the price (to be determined) will be sn−1,k and from which theprices sn,k and sn,k+1 can be reached at step n. The probabilities of thepaths leading to sn,k and sn,k+1 through sn−1,k are Pn,k = Πn,k/

(nk

)and

Pn,k+1 = Πn,k+1/(

nk+1

), respectively. Since all paths through sn−1,k must

go to one of these nodes, the probability for any single path through sn−1,k

is Pn−1,k = Pn,k + Pn,k+1, and by assumption 2 all such paths are equallylikely. Given the path probabilities, we can now find the uptick probabilityat this node, πn−1,k. Since this is the conditional probability of getting

Page 261: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

244 Pricing Derivative Securities (2nd Ed.)

to sn,k via sn−1,k given that we have arrived at sn−1,k, it is clear thatπn−1,k = Pn,k/Pn−1,k. Now it remains just to find sn−1,k itself. This isdone by imposing the martingale property of the P measure, as

sn−1,k =1 + δ

1 + r[πn−1,ksn,k + (1 − πn−1,k)sn,k−1],

where δ is the proportionate dividend received each period and all theprices are interpreted as cum-dividend. Of course, sn−1,k, sn,k, and sn,k−1

implicitly determine the move sizes, as un−1,k = sn,k/sn−1,k and dn−1,k =sn,k−1/sn−1,k, but these are now of no direct relevance.

Once sn−1,k is found for each k ∈ 0, 1, . . . , n−1, the process is repeatedat step n − 2. In general, at any step j ∈ 0, 1, . . . , n − 1 and node k ∈0, 1, . . . , j we

1. Find the probability of each path passing through the sj,k node, asPj,k = Pj+1,k + Pj+1,k+1; then

2. Calculate the uptick probability as πj,k = Pj+1,k/Pj,k; and, finally,3. Determine the price level as

sj,k =1 + δ

1 + r[πj,ksj+1,k + (1 − πj,k) sj+1,k−1] .

Example 67 Figure 5.14 illustrates the procedure for a tree with n = 3steps whose terminal prices match those in figure 5.2. The data used forthe original construction (s0 = 10.0, r = 0.01, δ = 0.0, u = 1.05, d = 0.97,

π = 1/2) give as terminal prices and benchmark nodal probabilities

s3,k3k=0 = 11.576, 10.694, 9.875, 9.127

Π3,k3k=0 = 0.125, 0.375, 0.375, 0.125.

Suppose that an optimization subject to constraints (5.49)–(5.51) has pro-duced the new risk-neutral probabilities

Π3,k3k=0 = 0.136, 0.393, 0.303, 0.168.

At each node of the tree are shown the path probabilities, uptick probabilities,and prices of the underlying asset and of a European call struck at X = 10.0.At the top of step 2, for example, the path probability is calculated as

P2,0 = P3,0 + P3,1 = 0.136 + 0.131 = 0.267,

and the move probability is

π2,0 = P3,0/P2,0 = 0.136/0.267 .= 0.509.

Page 262: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

Pricing under Bernoulli Dynamics 245

Fig. 5.14. Pricing a call from an inferred tree.

The implied prices for the underlying and the call are then

s2,0 = (1 + r)−1[(π2,0)s3,0 + (1 − π2,0)s3,1]

= 1.01−1[(.509)11.576 + (.491)10.694].= 11.033

and

CE2,0 = (1 + r)−1[(π2,0)CE

3,0 + (1 − π2,0)CE3,1]

= 1.01−1[(.509)1.576 + (.491)0.694].= 1.132.

The path probability at stage 0 works out to be unity, and the underlyingprice works out to be s0 = 10.0, consistent with the original construction.Notice that the price of the European call is a little higher than in figure 5.2,

as is consistent with the fact that P puts more probability mass in the tailsthan does P.

Page 263: Pricing derivative securities

April 8, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch05 FA

246 Pricing Derivative Securities (2nd Ed.)

5.6.3 Appraisal

Although we chose to price a European call in order to keep the examplesimple, it is unnecessary to construct the entire tree if the purpose is toprice just European-style derivatives, since terminal probabilities Πn,kn

k=0

alone suffice for this. Thus, in the example the call’s price can be founddirectly as

CE(s0, 3) = (1 + r)−33∑

k=0

Π3,k(s3,k − X)+

= 1.01−3[(.136)1.576 + (.393).694]

= 0.473.

The real reason for building the tree is to price American-style optionsor exotic derivatives whose payoffs depend on the entire price path. Thepremise is that these prices will be more accurate if derived in a frameworkthat is consistent with the revealed prices of traded European derivatives.While this seems reasonable, one does question whether accurate inferencesabout the intertemporal behavior of price under the martingale measure canreally be drawn from a single cohort of European-style options.18 A rele-vant consideration in this regard is how closely implicit volatility structuresσ(t, St) correspond when constructed from cohorts of options having dif-ferent terms. Some measure of such correspondence might turn out to abetter criterion for optimizing the Πn,k than just smoothness or confor-mity with log-binomial priors.

18The Derman-Kani (1994) method does simultaneously fit options having differentterms as well as different strikes, but it requires interpolating prices on what is ordinarilya coarse grid of times and strikes.

Page 264: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

6Black-Scholes Dynamics

We now take up the valuation of derivative assets when prices of underlyingprimary assets are considered to evolve in continuous time. This chapterand the next treat the classic geometric Brownian motion dynamics thatwere the basis for the pioneering option-pricing theory of Black and Scholes(1973) and Merton (1973). The first section describes the slight general-ization of geometric Brownian motion that we will use and lays out theassumptions about the behavior of the money-market fund in continuoustime. The price of the money fund still serves as a convenient numeraire,as it did in the discrete-time Bernoulli framework of chapter 5. Section 6.2introduces the two pricing strategies that are available in this setting—solving the partial differential equation that gives the law of motion of afinancial derivative, and evaluating expectations of normalized payoffs inthe risk-neutral measure. Section 6.3 first shows how the two approachesare applied in valuing futures and European puts and calls on an under-lying primary asset that can pay continuous dividends, then extends tooptions on stocks paying lump-sum dividends and options on stock indexesand futures. The last section works out the properties of arbitrage-freeprices of European options within the Black-Scholes framework. There we(i) see how Black-Scholes prices respond in a comparative-static sense tochanges in the parameters that define the options and the underlying assets,(ii) determine the implied instantaneous risk-reward trade-offs from posi-tions in European options, and (iii) express the expected return (in naturalmeasure P) from holding European options over arbitrary periods. Pric-ing American options and other derivatives with more complicated payoffstructures under Black-Scholes dynamics is the subject of chapter 7.

247

Page 265: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

248 Pricing Derivative Securities (2nd Ed.)

6.1 The Structure of Black-Scholes Dynamics

Consider a primary asset (a “stock”) whose price starts off at a known valueS0 and evolves over time in accord with the stochastic differential equation

dSt = µtSt · dt + σtSt · dWt. (6.1)

Here Wtt≥0 is a standard Brownian motion on (Ω,F , Ft, P), andµtt≥0 and σtt≥0 are deterministic (F0-measurable) processes with∫ t

0 |µs| · ds < ∞,∫ t

0 σ2s · ds < ∞, and σt > 0 for each t. The drift process,

µtt≥0, will be essentially irrelevant until section 6.4.5, but the volatilityprocess, σtt≥0, is of key importance throughout. When these are constant,as assumed in the classic Black-Scholes-Merton theory of option pricing, themodel is geometric Brownian motion; but extending to allow for determin-istic variation in volatility with time, as we do here, is almost costless. Withthis addition we refer to the model as “Black-Scholes dynamics”.

Applying Ito’s formula to lnSt gives

lnSt − lnSs =∫ t

s

(µu − σ2u/2) · du +

∫ t

s

σu · dWu.

From the definition of the stochastic integral the change in the logarithm ofthe stock’s price over [s, t] is therefore normal with mean

∫ t

s(µu −σ2

u/2) ·du

and variance∫ t

sσ2

u · du. The stock-price process itself is Markov in viewof the independence of the increments of Brownian motion, and Ft can beregarded equivalently as the σ-field generated by the history of either W

or S through time t. Accordingly, for 0 ≤ s ≤ t we can write

St ∼ Ss exp

∫ t

s

(µu − σ2u/2) · du + Z

√∫ t

s

σ2u · du

(6.2)

where Z ∼ N(0, 1), concluding that St is lognormally distributed condi-tional on σ(Ss) (the σ-field generated by Ss). Since E(ea+bZ) = ea+b2/2,the first moment is EsSt ≡ E(St | Fs) = E(St | Ss) = Ss exp(

∫ t

sµu · du).

These facts are all we need to value derivatives on this underlying primaryasset.

We shall also consider futures and derivatives on futures under theassumption that the futures price follows the same Black-Scholes dynamicsas (6.1); that is,

dFt = µtFt · dt + σtFt · dWt.

Next, we need to recall the definitions and assumptions about the behav-ior of interest rates and bond prices, which in the present setting evolve in

Page 266: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

Black-Scholes Dynamics 249

continuous time. As usual, B(t, T ) represents the price at t of a risklesszero-coupon bond paying one currency unit at T, and

rt =∂ ln B(t, T )

∂t

∣∣∣∣T=t

= − ∂ lnB(t, T )∂T

∣∣∣∣T=t

is the short or instantaneous spot rate of interest. As in chapter 5 we con-tinue to invoke assumption RK, which is that short rates are known overthe life of each derivative that we price. The numeraire is again the priceof the money fund that rolls over instantaneous riskless spot loans. Its per-unit value therefore grows continuously through time at deterministic ratesrt, as

dMt/dt = rtMt. (6.3)

Taking the initial value to be M0, this implies that the processMt = M0 exp

(∫ t

0

rs · ds

)t≥0

is itself F0-measurable.If the stock that underlies our derivative asset pays proportional div-

idends continuously at F0-measurable instantaneous rates δt, it will benecessary also to model the market value of the position that comprises thestock and its dividend stream. There are two possibilities, according as thedividends are used to purchase more stock or are invested in the moneyfund. In the former case, we have a cum- dividend process,

Sct = St exp

(∫ t

0

δu · du

), (6.4)

which represents the value of a position that begins with one share at t = 0and reinvests the dividends continuously. Applying Ito’s formula to (6.4)shows that the dynamics are given by

dSct = (µt + δt)Sc

t · dt + σtSct · dWt. (6.5)

Alternatively, if the dividends are invested in the money fund, the value att of a position that starts with one share of stock at t = 0 is

Vt = St + Mt

∫ t

0

δsSs/Ms · ds, (6.6)

with

dVt = rtVt · dt + [µt − (rt − δt)]St · dt + σtSt · dWt. (6.7)

Page 267: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

250 Pricing Derivative Securities (2nd Ed.)

t

0.5

1.0

1.5

St

f(s1)1.0 : 0

e

(2πσ2)1/2s1

_________1f(s1) =

−ln(s1)−µ−σ2/22/(2σ2)

Fig. 6.1. Simulated sample paths of geometric Brownian motion and p.d.f. of S1

(S0 = 1, µ = .07, σ = .21 ).

By allowing δt < 0, these expressions apply also to positions in invest-ment commodities for which the explicit cost of carry is positive anddeterministic.

Figure 6.1 shows five simulated realizations of sample paths of geometricBrownian motion for t ∈ [0, 1], all with initial value S0 = 1. The lognormalp.d.f. of S1 is juxtaposed.

6.2 Approaches to Arbitrage-Free Pricing

The market value of a derivative security in an arbitrage-free market can bedetermined uniquely if there exists a self-financing portfolio of traded assetsthat replicates the derivative’s payoffs. One way to value a derivative isjust to set out to construct such a portfolio. Considering path-independentderivatives whose values at each moment depend just on the contempo-raneous price of the underlying asset, it is usually possible to find such areplicating portfolio if the derivative’s value is a smooth function of time

Page 268: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

Black-Scholes Dynamics 251

and the underlying price. In that case the conditions for replication will beseen to imply a law of motion for the value of the derivative, which takesthe form of a partial differential equation. Solving this p.d.e. subject to ini-tial conditions (and, typically, boundary conditions) gives the derivative’svalue in terms of the observables, t and St. This is the differential-equationapproach to valuation.

The standard alternative to this is the equivalent-martingale or risk-neutral approach. It works like this. One first finds a measure equivalentto P under which the value of a position in the underlying, normalized bysome numeraire, is a martingale. One then just verifies that a self-financing,replicating portfolio for the derivative exists—without having to constructit. The martingale representation theorem (section 3.4.2) will make thispossible. This done, one can conclude that the portfolio’s normalized valueis itself a martingale under the new measure. Exploiting the fair-game prop-erty of martingales then leads to an expression for the derivative’s currentvalue in terms of the conditional expected value of its future payoffs.

This section explains both approaches. We consider derivatives on bothprimary assets, such as stocks and foreign currencies, and on futures prices.We limit attention to European-style derivatives, each of which makes asingle state/time-contingent payoff D at some future date T. Of course,derivatives with payoffs at more than one date can be valued as the sum ofthe parts.

6.2.1 The Differential-Equation Approach

Consider first a T -expiring European-style derivative on an underlying pri-mary asset (a “stock”) that can pay dividends continuously at determinis-tic rates δt per unit time. Representing the value function at t ∈ [0, T ]as D(St, T − t), let us try to construct a self-financing replicating portfo-lio of the stock and money fund. If the plan is to hold pt shares of thecum-dividend stock position and qt units of the money fund, we thus want

D(St, T − t) = ptSct + qtMt

at each (t, St) (in order to replicate) and

dD(St, T − t) = pt · dSct + qt · dMt

= pt · dSct + [D(St, T − t) − ptS

ct ] · dMt/Mt (6.8)

(to make the position self-financing). Notice that the value of the derivativeitself depends on the ex -dividend value of the underlying, whereas one works

Page 269: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

252 Pricing Derivative Securities (2nd Ed.)

with the cum-dividend position in order to replicate. Using (6.5) and (6.3),the right side of ( 6.8) is

[D(St, T − t)rt + pt(µt − rt + δt)Sct ] · dt + ptσtS

ct · dWt.

Assuming that the value function is smooth enough to apply Ito’s formula—that is, continuously differentiable once with t and twice with St, the leftside of (6.8) is

(−DT−t + DSµtSt + DSSσ2t S2

t /2) · dt + DSσtSt · dWt,

where subscripts on D denote partial derivatives. (Note that DT−t and−DT−t are the partial derivatives of D with respect to T − t and t, respec-tively.) Equating the two sides shows that the portfolio shares must satisfy

0 =∫ T

t

[−DT−u + DSµuSu + DSSσ2uS2

u/2

− Dru − pu(µu − ru + δu)Scu] · du (6.9)

and

0 =∫ T

t

(DSSu − puScu)σu · dWu

for each t ∈ [0, T ]. The second integral is zero for sure if and only if

0 = E

[∫ T

t

(DSSu − puScu)σu · dWu

]2

=∫ T

t

E[(DSSu − puScu)2σ2

u] · du,

which requires that pu = DSSu/Scu for almost all u.1 Making this substitu-

tion in (6.9) shows that D(·, ·) must satisfy the partial differential equation

0 = −DT−t + DS(rt − δt)St + DSSσ2t S2

t /2 − Drt. (6.10)

Equation (6.10) is the “Black-Scholes” or “fundamental” p.d.e. for valuingcontingent claims. Solving this subject to the terminal condition D(ST , 0) =D(ST ) and applicable boundary conditions gives the value of the derivativeat each (t, St). For example, boundary conditions for a European call struckat X on a no-dividend stock are St − B(t, T )X ≤ CE(St, T − t) ≤ St. (Seetable 4.7 on page 166.) The “terminal” condition at t = T for calendartime corresponds to an initial condition at T − t = 0 for time to expiration,and is sometimes referred to as such. The solution of (6.10) also implicitly

1Recall that St0≤t≤T and Sct 0≤t≤T are a.s. positive under geometric Brownian

motion, assuming that δt0≤t≤T is nonnegative.

Page 270: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

Black-Scholes Dynamics 253

determines the composition of the replicating portfolio, as

pt = DS(St, T − t)St/Sct

qt = [D(St, T − t) − ptSct ]/Mt.

If dividends are invested in the money fund rather than used to buymore stock, one can replicate using stock-savings combination Vt in (6.6)instead of Sc

t . In this case it is easy to see that the replicating portfoliocomprises pt = DS units of the stock-fund position and qt = (D − Vt)/Mt

additional units of the money fund. Of course, (6.10) still applies.Now consider a T -expiring derivative on a futures price, Ft. Since

futures contracts have no initial value, replicating a derivative worthD(Ft, T−t) with qt units of the money fund and pt futures contracts requiresthat qtMt = D(Ft, T − t). To be self-financing, the portfolio must satisfy

dD(Ft, T − t) = pt · dFt + qt · dMt

= pt · dFt + D(Ft, T − t) · dMt/Mt

= pt(µtFt · dt + σtFt · dWt) + D(Ft, T − t)rt · dt.

Expressing dD(Ft, T − t) with Ito’s formula as before, the last equality isseen to require that pt = DF and

0 = −DT−t + DFFσ2t F2

t /2 − Drt. (6.11)

This is solved subject to terminal condition D(FT , 0) = D(FT ) (and applica-ble boundary conditions) to obtain the value function at arbitrary (Ft, T−t).Note that solutions to (6.10) with the same side conditions also work for(6.11) upon setting δt = rt.

Equations (6.10) and (6.11) are linear second-order p.d.e.s of theparabolic type. With appropriate changes of variables each can be reducedto the simplest case of the classic heat equation from physics, ut = κuxx,for which solutions are well known.2 In the physical application the solutionu(x, t) relates the temperature in an idealized, insulated, one-dimensionalrod to the time (t) and distance (x) from an origin, given an initial dis-tribution of temperature, u(·, 0) = h(·). The constant κ > 0 characterizesthe rod’s physical conductivity properties. In section 6.3 we use Fouriermethods to work out a solution to (6.10) subject to D(ST , 0) = (X −ST )+,thereby deriving the Black-Scholes formula for the value of a European put.

2Cannon (1984, section 1.2) gives transformations that reduce various linear, second-order p.d.e.s to the heat equation. Wilmott et al. (1995, section 5.4) provide such areduction for (6.10) specifically.

Page 271: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

254 Pricing Derivative Securities (2nd Ed.)

6.2.2 The Equivalent-Martingale Approach

In an arbitrage-free market there exists a probability measure with respectto which normalized values of traded assets and of self-financing portfoliosof assets are martingales. Within the Bernoulli framework it was possibleto determine this measure explicitly by finding the pseudo-probabilities ofthe up and down states at each time step, πj and 1− πj . Things obviouslywill not be so simple in the continuous-time Black-Scholes environment,where prices of underlying assets are Ito processes built up from Brownianmotions. In order to be a martingale an Ito process must have zero meandrift, yet prices of risky assets typically exhibit systematic trends throughtime—even after they are normalized by values of bonds or money-marketfund. Girsanov’s theorem is the tool that will enable us to find an alternativemeasure in which normalized price processes are, in fact, trendless. Weconsider first how things work for underlying primary assets that can paydividends continuously at deterministic rates. We shall see how to find themeasure in which normalized asset prices are martingales, how to verify theexistence of self-financing replicating portfolios for suitable derivatives, andhow to use the fair-game property of martingales to find the derivatives’arbitrage-free prices. Once all this is done, it will be easy to see how tomodify the procedure for derivatives on futures.

Finding Martingale Measures in the Black-Scholes Setting

Consider an underlying primary asset (a “stock”) with deterministic, pro-portional dividend-rate process δt0≤t≤T , where T is some fixed futuredate. Let us suppose first that the dividends are reinvested in the stockas they are received, so that the cum-dividend process evolves as dSc

t =(µt + δt)Sc

t · dt + σtSct · dWt with σt > 0 for t ∈ [0, T ]. With Mt0≤t≤T

as numeraire, Ito’s formula shows that the normalized process Sc∗t ≡

Sct /Mt0≤t≤T evolves as

dSc∗t = (µt − rt + δt)Sc∗

t · dt + σtSc∗t · dWt.

Notice that since Mt is of finite variation, normalization alters the driftbut not the volatility, as was explained in section 3.4.3. This normalizedprocess can be a martingale under P only if it happens that µt − rt + δt = 0for all t. However, suppose there exists a measure P, equivalent to P, withrespect to which the process

Wt = Wt +∫ t

0

σ−1u (µu − ru + δu) · du (6.12)

Page 272: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

Black-Scholes Dynamics 255

is a Brownian motion on [0, T ].3 In this new measure the drift term for Sc∗t

vanishes, since

dSc∗t = (µt − rt + δt)Sc∗

t · dt + σtSc∗t [dWt − σ−1

t (µt − rt + δt) · dt]

= σtSc∗t · dWt. (6.13)

Removing the trend in this way makes Sc∗t 0≤t≤T a local martingale under

P. That it is also a martingale proper (requiring that E|Sc∗t | < ∞ for all

t ≥ 0) is confirmed by Novikov’s condition, since the F0 -measurabilityand integrability of σ2

t clearly guarantee that E exp(12

∫ T

0 σ2t · dt) < ∞.4

Girsanov’s theorem (section 3.4.1) verifies that such a P does exist andshows that it can be constructed as

P(A) = E1AQT =∫

A

QT (ω) · dP(ω), A ∈ F ,

where

QT = exp

[−

∫ T

0

σ−1t (µt − rt + δt) · dWt −

12

∫ T

0

σ−2t (µt − rt + δt)2 · dt

].

All this is strictly mechanical, but there is a very intuitive way to inter-pret the change of measure. Referring again to figure 6.1, think of thisas depicting sample paths of a normalized process Sc∗

t 0≤t≤1 with ini-tial value Sc∗

0 = 1 and a drift process µt − (rt − δt) that is positive forall t. Then, as in the simulation that actually generated the data, thesepaths have an upward trend, on average under natural measure P, so that∫

Sc∗1 (ω)·dP(ω) > 1. Now when we change the measure, the potential paths

themselves do not change. We merely change the weights attached to them,giving less weight to those for which Sc∗

1 (ω) > 1 and more to those withSc∗

1 (ω) < 1 so as to make∫Sc∗

1 (ω)Q1(ω) · dP(ω) =∫

Sc∗1 (ω) · dP(ω) = 1.

Now let us consider what happens if the dividends are invested in themoney fund instead of being used to buy more stock. In this case the value

3The reason for assuming that σt > 0 becomes is now evident.4F0-measurability and integrability under P imply measurability and integrability

under P by equivalence. The martingale property of Sc∗t under P can also be verified

directly from the solution to s.d.e. (6.13),

Sc∗t = Sc∗

0 exp

»Z t

0σs · dWs −

Z t

0σ2

s/2 · ds

–.

Page 273: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

256 Pricing Derivative Securities (2nd Ed.)

at t of the position in stock and fund is given by (6.6), and the normalizedprocess V ∗

t = Vt/Mt0≤t≤T evolves as

dV ∗t = S∗

t [µt − (rt − δt)] · dt + σtS∗t · dWt

= σtS∗t · dWt.

Thus, V ∗t is also a martingale under the same measure P.5

Notice that the unnormalized, ex -dividend price process St evolvesunder P as

dSt = µtSt · dt + σtSt[dWt − σ−1t (µt − rt + δt) · dt]

= (rt − δt)St · dt + σtSt · dWt. (6.14)

This implies that

EtST = St exp

[∫ T

t

(ru − δu) · du

](6.15)

and therefore EtScT = Sc

t exp(∫ T

tru · du), so that the value of the cum-

dividend stock position has the same expected growth rate as does thevalue of the money fund. Likewise, if dividends are invested in the moneyfund, the unnormalized value of the stock-fund position evolves under P as

dVt = rtVt · dt + σtSt · dWt,

and EtVT = Vt exp(∫ T

t ru ·du). Since this is precisely how the values of thesepositions would behave in a risk-neutral market, P is also the risk-neutralmeasure.

So long as we abstract from uncertainty about future interest rates (aswe do until chapter 10) nothing is changed if we use B(t, T ) as numeraireinstead of Mt, since under assumption RK

Sct /B(t, T ) ≡ Sc

t B(0, T )/Mt0≤t≤T

is also a martingale on (Ω,F , Ft, P).6 In general, however, different mar-tingale measures do correspond to different numeraires. For example, usingSc

t as numeraire and letting M∗∗t = Mt/Sc

t , we have

dM∗∗t = (rt − δt + σ2

t − µt)M∗∗t · dt − σtM

∗∗t · dWt.

5Write dV ∗t /V ∗

t = σtSt/Vt ·dWt. AssumingR t0 δs ·ds ≥ 0 for each t ∈ [0, T ], we have

0 ≤ St/Vt ≤ 1, P a.s. Hence, ER T0

exp( 12σ2

t S2t /V 2

t ) ·dt < ∞ and V ∗t is indeed a proper

martingale under P.6When we introduce stochastic interest-rate processes in chapter 10 we will see

that “spot” and T -forward measures P and PT associated with numeraires Mt and

B(t, T ) do in fact differ.

Page 274: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

Black-Scholes Dynamics 257

With

Q′T = exp

[∫ T

0

(rt − δt + σ2

t − µt

σt

)· dWt

− 12

∫ T

0

(rt − δt + σ2

t − µt

σ2t

)2

· dt

]and

P(A) = E1AQ′T =

∫A

Q′T (ω) · dP(ω), A ∈ F , (6.16)

we have a measure under which Wt = Wt−∫ t

0 σ−1u (ru−δu+σ2

u−µu)·du is aBrownian motion and M∗∗

t becomes trendless, with dM∗∗t = −σtM

∗∗t ·dWt.

Verifying a Self-financing Replicating Strategy for D

The martingale representation theorem (theorem 9 on page 122) makesit possible to confirm the existence of a self-financing replicating strat-egy within the Black-Scholes framework without actually constructing it.Applying the theorem requires restricting the class of derivative assets tothose whose payoffs have finite expected value under P, but this limitationis of no practical importance under Black-Scholes dynamics.

Here is how the process works. When E|D(ST )| < ∞, the derivative’snormalized terminal value gives rise to a conditional-expectation process,Et[D(ST )/MT ] ≡ D∗(St, T − t)0≤t≤T say, that is a P martingale adaptedto the same filtration as price process St0≤t≤T .7 The representation the-orem tells us that any such P martingale can be represented as an Itoprocess and therefore as a stochastic integral with respect to a Brownianmotion under P. This means that there exists a process Dt0≤t≤T with∫ T

0 D2s · ds < ∞ such that

D∗(St, T − t) = D∗(S0, T ) +∫ t

0

Ds · dWs.

Moreover, the theorem implies that D∗(St, T −t) can also be represented asa stochastic integral with respect to the normalized price of the underlying.That is, since dSc∗

s = σsSc∗s · dWs and both σs and Sc∗

s are positive, taking

7See example 27 on page 88.

Page 275: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

258 Pricing Derivative Securities (2nd Ed.)

ps ≡ (σsSc∗s )−1Ds for s ≥ 0 gives

D∗(St, T − t) = D∗(S0, T ) +∫ t

0

ps · dSc∗s

or dD∗(St, T − t) = pt · dSc∗t . Now define D(St, T − t) ≡ MtD

∗(St, T − t)and set qt = D∗(St, T − t) − ptS

c∗t , so that D(St, T − t) = ptS

ct + qtMt.

This will be the current value of the replicating portfolio. It does replicatethe payoff, since pT Sc

T + qT MT = MT D∗(ST , 0) = MT ET [D(ST )/MT ] =D(ST ). Moreover, it is self-financing, since

dD(St, T − t) = Mt · dD∗(St, T − t) + D∗(St, T − t) · dMt

= Mtpt · dSc∗t + (ptS

c∗t + qt) · dMt

= pt · d(MtSc∗t ) + qt · dMt

= pt · dSct + qt · dMt.

With replication thus verified, we now know that the measure P that pricesderivatives on the underlying asset is unique.

Valuing the Derivative

While a replicating portfolio is thus shown to exist, we do not knowexplicitly the numbers of shares, pt and qt, from which to calculate thederivative’s value at any t < T . This value can be found directly, how-ever, by exploiting the martingale property of D∗(St, T − t) under P. SinceD∗(St, T − t) = EtD

∗(ST , 0) = Et[D(ST )/MT ], we have under assumptionRK

D(St, T − t) = MtEt[D(ST )/MT ]

=Mt

MTEtD(ST )

= exp

(−

∫ T

t

ru · du

)EtD(ST )

or

D(St, T − t) = B(t, T )EtD(ST ). (6.17)

The derivative’s initial value is then

D(S0, T ) = exp

(−

∫ T

0

ru · du

)ED(ST )

= B(0, T )ED(ST ).

Page 276: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

Black-Scholes Dynamics 259

Under P the derivative itself is therefore valued as it would be in anequilibrium within a risk-neutral market, just as was true of the underlyingasset itself. Under the new measure the price of the underlying behaves pre-cisely as it did under P except that the drift term, µt, in (6.1) is replaced bythe short rate minus the dividend rate, rt−δt, as in (6.14). This means thatunder P the conditional distribution of ST given St still has the lognormalform

ST ∼ St exp

∫ T

t

(ru − δu − σ2u/2) · du + Z

√∫ T

t

σ2u · du

, (6.18)

where Z ∼ N(0, 1). With this result pricing a European-style derivativeunder Black-Scholes dynamics reduces to the purely mechanical process ofevaluating the expectation of a function of a standard normal variate.

Futures Prices under the Risk-Neutral Measure

Modeling the evolution of a futures price over [0, T ] under the natural mea-sure P as

dFt = µtFt · dt + σtFt · dWt, (6.19)

let us consider the implications for the stochastic behavior of Ft, thevalue of a futures position that was initiated at t = 0. Supposing this tohave been marked to market at times 0 < t1 < · · · < tn ≤ t, with gains(losses) being invested in (financed by) shares of the money fund, we have

Ft = (Ft1 −F0)Mt

Mt1

+(Ft2 −Ft1)Mt

Mt2

+ · · ·+(Ftn −Ftn−1)Mt

Mtn

+(Ft −Ftn).

Now as an approximation imagine that the marking to market has takenplace continuously. Taking limits in the above as maxtj − tj−1n

j=1 → 0gives Ft

.= Mt

∫ t

0M−1

s · dFs and, upon dividing by Mt to normalize,

dF∗t ≡ d(Ft/Mt) = M−1

t · dFt = µtF∗t · dt + σtF

∗t · dWt.

Since it is the futures position rather than the futures price that has thestatus of an asset, it is F∗

t rather than F∗t that should be a martingale under

the appropriate change of measure. Applying Girsanov’s theorem with

QT = exp

(−

∫ T

0

σ−1t µt · dWt −

12

∫ T

0

σ−2t µ2

t · dt

)

Page 277: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

260 Pricing Derivative Securities (2nd Ed.)

produces the measure P that makes Wt = Wt +∫ t

0 σ−1s µs ·ds a Brownian

motion and thereby removes the trend in F∗t , as dF∗

t = σtF∗t ·dWt. Referring

to (6.19), we see that under this measure dFt = σtFt · dWt, so that theunnormalized futures price is itself a trendless Ito process motion, andhence a martingale. Accordingly, the time-T futures price has the followingdistribution conditional on what is known at t ≤ T :

FT ∼ Ft exp

−12

∫ T

t

σ2u · du + Z

√∫ T

t

σ2u · du

, Z ∼ N(0, 1).

With this result valuing European options or other European-style deriva-tives on the futures is again reduced to finding the discounted expectationof a function of a normally distributed random variable.

Overview of Martingale Pricing

To summarize, here is how the equivalent-martingale/risk-neutral method isused within the Black-Scholes framework to value a European-style deriva-tive whose payoff depends just on time and the price of an underlying assetat some future date T .

• First, define the underlying asset in such a way that it can be used toreplicate the derivative. For a primary asset with zero explicit cost ofcarry, such as a common stock that pays no dividends during the life ofthe derivative, ordinary shares can be held and used along with the moneyfund in replication. For derivatives on dividend-paying assets, however,one must take positions in either cum-dividend shares or portfolios ofshares and money fund; and for derivatives on futures one has to workwith futures positions rather than futures prices. In each case, so longas we are within the Black-Scholes framework the instantaneous changein the value of this underlying asset will be given by an s.d.e. that cor-responds to an Ito process with either constant or time-dependent (butdeterministic) volatility. Either way, the distribution of the asset’s futurevalue will be lognormal, conditional on the current value.

• Next, set the drift term in the s.d.e. for the underlying to what it would beif the asset were priced under conditions of risk neutrality. For a primaryasset paying dividends continuously at rates δt0≤t≤T this amounts tosetting the mean proportional drift at t equal to rt−δt. For a futures pricejust set the mean proportional drift to zero, or else treat the underlyingas a primary asset with dividend rate equal to rt.

Page 278: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

Black-Scholes Dynamics 261

• Finally, to find the derivative’s arbitrage-free value at t ∈ [0, T ), find theexpected value of the price-dependent payoff in the risk-neutral measure,as EtD(ST ), EtD(FT ), etc., and weight it by the riskless discount factor,B(t, T ) = MtM

−1T = exp(−

∫ T

t ru · du). If price-dependent payoffs canoccur at more than one time, weight each of their expected values by theappropriate discount factor and sum them.

6.3 Applications

Let us now begin to price some specific derivatives. Simple forward contractsare a good place to start since their arbitrage-free values are already knownfrom the static replication arguments of chapter 4 and since risk-neutralpricing is trivially easy. For options we will work out the prices both bysolving the fundamental p.d.e. and by following the steps for martingalepricing. Then we will stand back and reflect a bit on why these two verydifferent procedures lead to the same result.

6.3.1 Forward Contracts

Our derivative is a forward contract expiring at T with pre-establishedforward price f. At expiration it is worth F(ST , 0; f) = ST−f to the long side,and the goal is to find the contract’s value at any t ∈ [0, T ). The underlyingasset either pays a continuous dividend at rate δt ≥ 0 or else incurs anexplicit cost of carry at rate −δt = κt ≥ 0. We already know that thecum-dividend/ex -cost position can be used for replication by continuouslyreinvesting the receipts or selling down to finance the outlays. Take theunderlying price to evolve as dSt = µtSt · dt + σtSt · dWt under measureP, where µt, σt0≤t≤T ∈ F0. Changing to P, we set µt = rt − δt, writedSt = (rt − δt)St · dt + σtSt · dWt, and solve the s.d.e. to get

ST = St exp∫ T

t

(ru − δu − σ2u/2) · du +

∫ T

t

σu · dWu. (6.20)

The risk-neutral value of the contract is then found as

F(St, T − t; f) = B(t, T )Et(ST − f)

= B(t, T )St exp

[∫ T

t

(ru − δu) · du

]− B(t, T )f

= Ste−

RTt

δu·du − B(t, T )f.

Page 279: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

262 Pricing Derivative Securities (2nd Ed.)

This clearly satisfies terminal condition F(ST , 0; f) = ST − f, and withδu = −κu it corresponds to our dynamics-free solution (4.9) on page 153.Checking that it also solves the fundamental p.d.e., we have

Ft = ∂F/∂t = δtSte−

R Tt

δu·du − rtB(t, T )f

FS = ∂F/∂St = e−R

Tt

δu·du

FSS = ∂2F/∂S2t = 0

and

Ft + FS(rt − δt)St + FSSσ2t S2

t /2 − Frt = 0.

6.3.2 European Options on Primary Assets

Consider now European options on primary assets such as stocks and cur-rency positions. To illustrate both techniques, we price a European put bysolving the Black-Scholes p.d.e., then use martingale methods to value aEuropean call.

Solving the Black-Scholes P.D.E.

Letting PE(St, T − t) represent the price at t of a T -expiring Europeanput and setting D(St, T − t) = PE(St, T − t) in (6.10), the task is to solvethe p.d.e. subject to terminal condition PE(ST , 0) = P (ST ) ≡ (X − ST )+.This is most easily done using Fourier transforms, which convert the p.d.e.to an ordinary differential equation in the time variable alone. To simplifynotation let us regard the volatility parameter (σ), the continuous dividendrate (δ), and the short rate (r) as time invariant. Later we modify the finalresult to let them vary deterministically.

Recall from chapter 2 that if g : → is integrable, then its Fourierintegral transform is the complex-valued function

Fg(·) =∫ ∞

−∞eiζxg(x) · dx ≡ f(ζ),

where ζ ∈ and i =√−1. For present purposes we need to generalize

slightly to functions g : 2 → , allowing g to depend on a parameter t

that is independent of x. Then

Fg(·, t) =∫ ∞

−∞eiζxg(x, t) · dx ≡ f(ζ, t),

Page 280: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

Black-Scholes Dynamics 263

and the inversion and convolution formulas, (2.31) and (2.32), become

g(x, t) = F−1f(ζ, t) ≡ 12π

∫ ∞

−∞e−iζxf(ζ, t) · dζ

F [g(·, τ) ∗ h(·, τ)] = Fg(·, τ) · Fh(·, τ).

The first three relations below correspond to properties 1–3 on page 46,while the last expresses the transform of the derivative of g with respect to t:

(i) F [bg(·, t) + ch(·, t)] = bFg(·, t) + cFh(·, t) , where b, c are constants.(ii) Fgx(·, t) = −iζ · f(ζ, t) if lim|x|→∞ g(x, t) = 0(iii) Fgxx(·, t) = −ζ2 · f(ζ, t) if lim|x|→∞ gx(x, t) = 0(iv) Fgt(·, t) = ft(ζ, t).

The first step in solving (6.10) is to simplify with a normalization anda change of variables. Recalling the homogeneity of the put’s value withrespect to S and X, we can normalize by the strike price, putting s ≡ln(St/X), τ ≡ T −t, and p(s, τ) ≡ PE(Xes, τ)/X . With these substitutions(6.10) becomes

0 = −pτ + ps(r − δ − σ2/2) + pssσ2/2 − pr, (6.21)

and the terminal condition becomes an initial condition, p(s, 0) = (1−es)+.From the static replication arguments of chapter 4 (table 4.7 on page 166)we also recognize the following boundary conditions:

(e−rτ − es−δτ )+ ≤ p(s, τ) ≤ e−rτ .

However, it will turn out that these are automatically satisfied by the solu-tion from the initial condition only.

Now we would like to Fourier transform p(s, τ) with respect to s,

thus eliminating that variable, and use relations (ii)–(iv) above to expressthe derivatives in (6.21). p(s, τ) itself is bounded and therefore inte-grable, so its transform does exist, but the condition for (ii) fails becauselims→−∞ p(s, τ) = e−rτ > 0. We circumvent this problem by working notwith p(s, τ) but with the still-integrable function q(s, τ) ≡ esp(s, τ), forwhich the required conditions in (ii) and (iii) do hold. Expressing p and itsderivatives in terms of q and simplifying give the new p.d.e.

0 = −qτ + qs(ν − σ2) + qssσ2/2 − q(r + ν − σ2/2),

where ν ≡ r− δ− σ2/2. Now we can work the transform, putting f(ζ, τ) =Fq(·, τ) =

∫∞−∞ eiζsq(s, τ) · ds and using (ii)–(iv) to get what is now an

Page 281: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

264 Pricing Derivative Securities (2nd Ed.)

ordinary differential equation:

0 = −fτ − iζ(ν − σ2)f − ζ2σ2f/2 − (r + ν − σ2/2)f.

Writing asfτ

f= −iζ(ν − σ2) − ζ2σ2/2 − (r + ν − σ2/2)

and solving give

f(ζ, τ) = e−(r+ν−σ2/2)τf(ζ, 0) exp[−iζ(ν − σ2)τ − ζ2σ2τ/2].

The task now is to invert the transform. Assuming that f(ζ, 0) =Fq(·, 0), and noting that the second exponential factor in the solution forf(ζ, τ) is the c.f. of a normal random variable with mean −(ν − σ2)τ andvariance σ2τ , we can write f(ζ, τ) ≡ Fq(·, τ) as

f(ζ, τ) = e−(r+ν−σ2/2)τFq(·, 0)F

1√2πσ2τ

e−[(·)+(ν−σ2)τ ]2/(2σ2τ)

.

In words, the Fourier transform of the function we seek is proportional tothe product of transforms of these two other functions. It follows from theconvolution formula that q(s, t) is proportional to the convolution of thosefunctions and so equal to

e−(r+ν−σ2/2)τ

∫ ∞

−∞q(w, 0)

1√2πσ2τ

exp− [w − s − (v − σ2)τ ]2

2σ2τ

· dw.

(6.22)Writing q(w, 0) = ew(1 − ew)+ in recognition of the initial condition andnoting that this is zero for w > 0, we are led to evaluate

e−(r+ν−σ2/2)τ

∫ 0

−∞(ew − e2w)

1√2πσ2τ

exp− [w − s − (v − σ2)τ ]2

2σ2τ

· dw.

Integrals ∫ 0

−∞ekw 1√

2πσ2τexp

− [w − s − (v − σ2)τ ]2

2σ2τ

· dw

can be reduced to

esk+νkτ+σ2(k2/2−k)τ Φ

(−s + [ν +

(k − 1)σ2

σ√

τ

)(Φ(·) being the standard normal c.d.f.) by completing the square in theexponent and changing variables as

z =w − s − [ν + (k − 1)σ2]τ

σ√

τ.

Page 282: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

Black-Scholes Dynamics 265

Evaluating at k ∈ 1, 2 and using ν ≡ r− δ− σ2/2, one obtains for q(s, τ)

es−rτΦ(−s + (r − δ − σ2/2)τ

σ√

τ

)− e2s−δτΦ

(−s + (r − δ + σ2/2)τ

σ√

τ

).

Finally, on recovering p(s, τ) as e−sq(s, τ), restating in terms of the orig-inal variables, and putting B ≡ B(t, T ) = e−r(T−t), we can expressPE(St, T − t) as

BXΦ

ln(

BXSte−δ(T−t)

)+ σ2(T−t)

2

σ√

T − t

−Ste

−δ(T−t)Φ

ln(

BXSte−δ(T−t)

)− σ2(T−t)

2

σ√

T − t

or

PE(St, T − t) = BXΦ[q+

(BX

Ste−δ(T−t)

)]−Ste

−δ(T−t)Φ[q−

(BX

Ste−δ(T−t)

)]. (6.23)

With δ = 0 this is the usual Black-Scholes formula for a European put onan underlying asset that pays no dividends and has no other explicit costof carry. We show in section 6.4.2 that (6.23) is consistent with terminalcondition PE(ST , 0) = (X − ST )+ and that it satisfies the differentiabilityconditions that were assumed in deriving the fundamental p.d.e.

The shorthand notation

q±(x) ≡ lnx ± σ2(T − t)/2σ√

T − t(6.24)

will often appear in pricing formulas derived within the Black-Scholesframework. A second argument will sometimes be included, as q±(x; T − t)or q±(x; σ), if more than one such feature is involved in a particular prob-lem. One can convert between the + and − versions as

q−(x) = q+(x) − σ√

T − t = −q+(x−1). (6.25)

While this shows that a single symbol “q” would really do, it will be con-venient to use both.

Since the T -forward price of an underlying asset that pays continuousproportional dividends at rate δ is

f ≡ f(t, T ) = B(t, T )−1Ste−δ(T−t), (6.26)

Page 283: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

266 Pricing Derivative Securities (2nd Ed.)

formula (6.23) can be written also as

PE(St, T − t) = BXΦ[q+(X/f)] − BfΦ[q−(X/f)]. (6.27)

A similar exercise for the European call, solving (6.10) subject to terminalcondition CE(ST , 0) = (ST − X)+, leads to

CE(St, T − t) = BfΦ[q+(f/X)] − BXΦ[q−(f/X)]. (6.28)

Below, we derive this expression via the martingale approach.Deterministic variation in σ, δ, and r is accommodated in these formulas

just by setting

r = (T − t)−1

∫ T

t

ru · du

σ2 = (T − t)−1

∫ T

t

σ2u · du

δ = (T − t)−1

∫ T

t

δu · du.

This amounts to regarding r, σ2, and δ as averages of the time-varyingvalues over the interval [t, T ]. The fact that the average values are all thatmatter for valuing European options in this setting will be of importance inchapter 8, where we look at dynamics in which volatility of the underlyingvaries stochastically.

Risk-Neutral Valuation of European Options

Let us now derive the European call formula by martingale methods. Takingthe solution for ST given St to be as in (6.20) and recognizing CE(ST , 0) =C(ST ) ≡ (ST − X)+ as the call’s terminal value, we will apply (6.17) tofind the current value as

CE(St, T − t) = B(t, T )Et(ST − X)+. (6.29)

For this purpose it is helpful to have a simple representation forthe conditional c.d.f. of ST in the risk-neutral measure—namely,Ft(s) ≡ P(ST ≤ s | Ft). Obtaining this will be our first step. Continuingto take r, σ2, and δ as the time averages of these deterministic processes,(6.20) implies that ST is distributed under P as St exp[(r − δ − σ2/2)(T − t) + Zσ

√T − t], where Z ∼ N(0, 1). It follows that Ft(s) = 0 for

Page 284: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

Black-Scholes Dynamics 267

s ≤ 0, while for positive s

Ft(s) = P[ln(ST /St) ≤ ln(s/St) | Ft]

= P[(r − δ − σ2/2)(T − t) + Zσ√

T − t ≤ ln(s/St)]

= Φ[ln(s/St) − (r − δ − σ2/2)(T − t)

σ√

T − t

].

Finally, using again the shorthand

q±(x) ≡ lnx ± σ2(T − t)/2σ√

T − t

and taking the forward price f = f(t, T ) from (6.26), Ft(·) can be expressedcompactly as

Ft(s) = Φ[q+(s/f)]. (6.30)

Having found Ft, we are not far from a solution for the arbitrage-freeprice of the call. Applying (6.29) with B ≡ B(t, T ), we have

CE(St, T − t) = B

∫ ∞

0

(s − X)+ · dFt(s)

= B

∫ ∞

X

s · dFt(s) − BX

∫ ∞

X

dFt(s)

= B

∫ ∞

X

s · dΦ[q+(s/f)] − BX

∫ ∞

X

dΦ[q+(s/f)].

Changing variables in the first integral as z = q+(s/f)−σ√

T − t = q−(s/f)gives

Bf

∫ ∞

q−(X/f)

dΦ(z) = BfΦ[q+(f/X)],

and setting z = q+(s/f) in the second integral gives

−BX

∫ ∞

q+(X/f)

dΦ(z) = −BXΦ[q−(f/X)].

Assembling the results gives (6.28).We could have got the same solution by working not with risk-

neutral measure P but with the martingale measure P that correspondsto numeraire St, as in (6.16 ). In the present setting, with σ, r, and δ stillas constants or averages over [t, T ], we find that ST is distributed under P

(conditional on Ft) as ST = St exp[(r − δ + σ2/2)(T − t) + Zσ√

T − t], andfrom this we deduce that Ft(s) = Φ[q−(s/f)]. Under this measure it is now

Page 285: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

268 Pricing Derivative Securities (2nd Ed.)

CE(St, T − t)/St0≤t≤T rather than CE(St, T − t)/Mt0≤t≤T that is amartingale. Thus,

CE(St, T − t)/St = Et[(ST − X)+/ST ]

and

CE(St, T − t) = St

∫ ∞

0

s−1(s − X)+ · dFt(s) = St

∫ ∞

X

(1 − X/s) · dFt(s).

Working out the integral leads again to (6.28).Notice that we can write (6.28) using relation (6.25) as

CE(St, T − t) = BfΦ[−q−(f/X)] − BXΦ[−q+(f/X)]

= Bf[1 − Ft(X)] − BX [1 − Ft(X)]

= BfPt(ST > X) − BXPt(ST > X).

Likewise, the solution for PE(St, T − t) can be expressed as

PE(St, T − t) = BXPt(ST < X) − BfPt(ST < X).

One will recall the corresponding expressions (5.25) and (5.26) fromBernoulli dynamics (page 198). Thus, the values of the call and put dependon the probabilities of finishing in the money under the distinct martingalemeasures corresponding to numeraires Mt and St.

The Feynman-Kac Solution

We have presented two very different derivations of the Black-Scholes formu-las. The differential-equation method required solving the p.d.e. that repre-sents the derivative’s law of motion, while the martingale method involvedcalculating the mathematical expectation of the option’s terminal payoff inthe risk-neutral distribution or other martingale measure. It seems aston-ishing at first that calculating such an expectation turns out to deliver asolution to the financial p.d.e. One wonders how the two methods are con-nected and whether there is something special about derivative securitiesthat makes it possible to solve their equations of motion by means of prob-abilistic arguments. In fact, probabilistic methods are applicable to p.d.e.sthat describe a wide variety of phenomena. Just as was demonstrated forthe call option, the probabilistic approach delivers solutions in the form ofexpectations of functions of Brownian motion. Attribution is often madeto Feynman (1948) and Kac (1949), although their work focused just on a

Page 286: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

Black-Scholes Dynamics 269

specific version of the heat equation. Durrett (1996) describes many otherapplications.

To get the flavor of the probabilistic approach, consider its applicationto the classic heat equation, which was described on page 253:

ut = κuxx (6.31)

u(x, 0) = h(x).

Here t and x represent time and position, κ is a positive constant, and h isa given function that describes the initial distribution of temperature. TheFeynman-Kac method hinges on the following observation: If Ws0≤s≤t isa Brownian motion on (Ω,F , Fs, P) and if Us = u(Ws

√2κ, t− s)0≤s≤t

is uniformly integrable, then Us is a martingale adapted to Fs when(6.31) is satisfied. This follows because Ito’s formula together with (6.31)imply that Us0≤s≤t is trendless, since

du(Ws

√2κ, t − s) = (−ut + κuxx) · ds + ux

√2κ · dWs

= ux

√2κ · dWs,

and because uniform integrability assures that sup0≤s≤t E|Us| < ∞. Mar-tingale convergence (theorem 7) thus applies, from which we conclude thatlims→t Us = h(Wt

√2κ), and so Us = Esh(Wt

√2κ). Therefore, taking

W0 = x/√

2κ to meet the initial condition gives as the solution to thep.d.e. u(x, t) = Eh(Wt

√2κ).

Applied to the problem of solving

0 = −DT−t + DS(r − δ)St + DSSσ2S2t /2 − Dr (6.32)

subject to D(ST , 0) = D(ST ), the method requires exactly the same stepsas in the equivalent-martingale approach. Under risk-neutral measure P wehave

dSt = (r − δ)St · dt + σSt · dWt

with Wt a Brownian motion, so applying Ito’s formula to D∗(St, T − t) ≡M−1

t D(St, T − t) and invoking (6.32) give

dD∗ =−DT−t + DS(r − δ)St + DSSσ2S2

t /2 − Dr

Mt· dt +

DSStσ

Mt· dWt

= M−1t DSStσ · dWt.

Thus, assuming uniform integrability (which is guaranteed for vanillaoptions by the dynamics-free bounds in chapter 4), we conclude that

Page 287: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

270 Pricing Derivative Securities (2nd Ed.)

D∗(St, T − t) is a martingale under P, with D∗(St, T − t) = EtD∗(ST , 0)

and

D(St, T − t) = MtM−1T EtD(ST ) = e−r(T−t)EtD(ST ).

6.3.3 Extensions of the Black-Scholes Theory

By making some approximations the Black-Scholes formulas can beextended to European options on stocks that pay deterministic lump-sumdividends and to options on stock indexes. There is also a straightforwardextension to futures, due to Black (1976b).

Handling Lump-Sum Dividends

The formulas derived thus far allow for the underlying asset to pay div-idends continuously at a deterministic rate. This is appropriate for mod-eling options on currencies, which could be held in the form of foreigndiscount bonds, and is a common approximation for options on diversi-fied stock indexes like the S&P 500. For ordinary stock options there isalso a way to adapt the Black-Scholes formulas to allow for predictablelump-sum dividends. This corresponds to the “set-aside” method that wasintroduced in chapter 5 to avoid a nonrecombining binomial tree. Like thatmethod, it is merely an approximation rather than a thoroughly satisfactorysolution.

To see how it works, suppose first that a single lump-sum, per-sharepayment ∆t′ is to be made at t′ ∈ (t, T ]. The firm can finance thispayment by holding ∆t′ unit, t′-maturing bonds per share of stock.If this is done, the value of the stock up to t′ has two components:(i) the bonds that provide for the dividend payment, which are worth∆(t, t′) ≡ B(t, t′)∆t′ , and (ii) the ownership claim on the firm’s real assets,which is worth St − ∆(t, t′). Assuming that this latter quantity followsBlack-Scholes dynamics, the continuous-dividend formula still holds withSt − ∆(t, t′) in place of e−δ(T−t)St. To extend to multiple dividend pay-ments beginning at t′, just take ∆(t, t′) to be their aggregate presentvalue at t. Notice that the forward price for delivery at T of a stockthat pays lump-sum dividends is f(t, T ) = B(t, T )−1[St − ∆(t, t′)], cor-responding to expression (4.6) with K(t, T ) = −∆(t, t′) as the explicit costof carry. Therefore, formulas (6.27) and (6.28) already accommodate thelump-sum case.

Page 288: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

Black-Scholes Dynamics 271

As explained in chapter 5, the conceptual problem with this approachis that it makes no allowance for the financing of dividends to be paid afterthe option expires.

Options on Stock Indexes

The value at t of a value-weighted index like the S&P 500 is just a weightedsum of the prices of n individual stocks:

It = v′St =n∑

j=1

vjSjt.

Weights for the S&P in particular are computed as 10 times the ratio ofthe number of shares outstanding to the average aggregate value of thefirms during a reference period. This scheme automatically adjusts for splitsand stock dividends. Let us consider how the index itself behaves if eachindividual price follows geometric Brownian motion. For this it is easiest toregard the weights as constants and figure the prices themselves to be splitadjusted—that is, multiplied by the cumulative split factor. Suppose eachSjt is modeled as

dSjt

Sjt= µjt · dt + σ′

j · dWt, j = 1, 2, . . . , n,

where (i) Wt≡(W1t, . . . , WNt)′ is N -dimensional standard Brownianmotion, with N ≥ n; (ii) σj ≡ (σj1, . . . , σjN )′; and σ′

jWt =∑N

k=1 σjkWkt

is the inner product. Introducing the multidimensional process allows thereturns of different stocks to be imperfectly correlated, but returns can alsobe expressed in one-dimensional form as

dSjt = µjtSjt · dt + θjSjt · dW σjt,

where θj ≡√

σ′jσj and W σ

jt ≡ θ−1j σ′

jWt. The index itself then satisfiesthe stochastic differential equations

dIt =n∑

j=1

vjSjtµjt · dt +n∑

j=1

vjSjtσ′j · dWt

=n∑

j=1

vjSjtµjt · dt +n∑

j=1

vjSjtθj · dW σjt.

Page 289: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

272 Pricing Derivative Securities (2nd Ed.)

Momentarily ignoring dividends from the individual stocks, each µjt isreplaced by short rate rt when we move to risk-neutral measure P, so that

dIt

It= rt · dt + /Σ′

t · dWt,

where

/Σt ≡∑n

j=1 vjSjtσj∑nj=1 vjSjt

= I−1t

n∑j=1

vjSjtσj .

Here subscript t signifies the time/state dependence that enters throughprices Sjt. Now admitting dividends on the individual stocks but approxi-mating their joint effect as a continuous proportional dividend on the index,we can write

dIt/It = (rt − δt) · dt + /Σ′t · dWt

= (rt − δt) · dt + θIt · dW It ,

where θIt ≡√

/Σ′t /Σt and W I

t ≡ θ−1It

/Σ′tWt. It is now clear that the index

itself does not follow Black-Scholes dynamics under P, since /Σt and θIt arenot generally deterministic. They would be deterministic—and constant—ifand only if volatility vectors σjn

j=1 were all the same, but this would meanthat returns on all stocks were perfectly correlated. Still, Black-Scholesdynamics may serve as a reasonable empirical approximation for the indexprocess. In discrete time this is equivalent to approximating the distributionof a weighted sum of correlated lognormals as lognormal, and this canprovide a very satisfactory fit if the weights are not too disparate. For thisreason, formulas (6.28) and (6.27) have often been used in practice to priceEuropean options on diversified stock indexes.8

Options on Futures

We have seen that (subject to an integrability condition) futures prices arethemselves martingales in the risk-neutral measure, without the need fornormalization by the value of the money fund. If a futures price Ft followsBlack-Scholes dynamics under P, it thus solves the s.d.e. dFt = σtFt · dWt

8Chapter 11 shows how to avoid the lognormal approximation by using simulationto price “basket” options on portfolios of assets.

Page 290: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

Black-Scholes Dynamics 273

under P, so that

FT = Ft exp

[−1

2

∫ T

t

σ2s · ds +

∫ T

t

σs · dWs

]∼ Ft exp[−σ2(T − t)/2 + Zσ

√T − t],

where σ2 ≡ (T − t)−1∫ T

tσ2

s · ds and Z ∼ N(0, 1). Given the implied condi-tional lognormality of the futures price, it is easy to adapt the Black-Scholesformulas to price European options on futures, as done originally by Black(1976b). For this, consider T -expiring options on futures prices for deliveryat T ′ > T . Since the futures price evolves under P just like a primary assetthat has no net trend, one simply sets rt − δt = 0 and St = Ft in thederivations of the previous section. This has the effect of replacing forwardprice f(t, T ) in (6.27) and (6.28) with futures price Ft. Black’s formulas forarbitrage-free prices of European options on the futures are thus

PE(Ft, T − t) = BXΦ[q+(X/Ft)] − BFtΦ[q−(X/Ft)] (6.33a)

CE(Ft, T − t) = BFtΦ[q+(Ft/X)] − BXΦ[q−(Ft/X)], (6.33b)

where q±(·) is given by (6.24) and B ≡ B(t, T ).For reference in chapter 7 when discussing American-style options,

notice that when t < T and r > 0 the ratio CE(Ft, T −t)/(Ft−X) → B < 1as Ft → ∞. Since the value of the live option is thus ultimately belowintrinsic value, early exercise of a call on the futures will at some point bedesirable. Therefore, as in the Bernoulli model so also in the Black-Scholesframework: American calls on futures may well be exercised early. This iseasy to remember if one recalls that the futures price behaves just like theprice of a primary asset with dividend rate equal to the short rate, sincewe already know that calls on dividend-paying assets are subject to earlyexercise.

Routine B-S on the CD calculates Black-Scholes prices for Europeanputs and calls on stocks paying lump-sum dividends, on indexes payingcontinuous dividends, on currency deposits appreciating at the foreigninterest rate, and on futures. Program doB S handles the input and out-put conveniently for single calculations. This is also implemented in theContinuousTime.xls spreadsheet.

6.4 Properties of Black-Scholes Formulas

This section documents a number of important features of the Black-Scholesformulas. We show that Black-Scholes prices satisfy put-call parity, that

Page 291: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

274 Pricing Derivative Securities (2nd Ed.)

they have the right terminal values, and that they fall within the no-arbitrage bounds worked out in chapter 4. We also see precisely how theyvary with the current underlying price, the strike price, and other param-eters. The last four parts of the section treat “implicit” volatility, hedg-ing and the construction of “synthetic” derivatives, and instantaneous andholding-period returns to positions in European options.

6.4.1 Symmetry and Put-Call Parity

Not surprisingly, there are symmetries in the Black-Scholes formulas forcalls and puts like those found in the binomial approximations. If the priceof the call in (6.28) is written symbolically as a function of Bf and BX, as

CE(St, T − t) = f(Bf, BX),

then the corresponding expression for the put in (6.27) is obtained just bypermuting the arguments; that is,

PE(St, T − t) = f(BX, Bf). (6.34)

Of course, something like this must be true since the terminal payoffs ofputs and calls are symmetric in the same way. Less obvious is the connec-tion between δ and r. If the underlying asset pays continuous proportionaldividends at rate δ, then f(t, T ) = Ste

(r−δ)(T−t), and the option formulascan be written as

CE(St, T − t) = f(Ste−δ(T−t), Xe−r(T−t)) (6.35)

PE(St, T − t) = f(Xe−r(T−t), Ste−δ(T−t)).

This shows that the price of the European put can be expressed in terms ofthe European call, and vice versa, by interchanging St and X , δ and r. Wesaw in chapter 5 that this was true for both European and American optionsunder Bernoulli dynamics, and we will see in chapter 7 that it applies alsofor American options in the Black-Scholes framework.9

9While put-call symmetry holds more generally than under Black-Scholes dynam-ics, it is not model-independent. Williams (2001) shows that if the distribution of

Y ≡ ln[ST /f(t, T )] is independent of St, r, and δ, then it is necessary and sufficientfor symmetry that dF (y) = e−ydF (−y). The condition does hold in the Black-Scholesmodel, where the conditional density of Y is proportional to exp−[(y + σ2(T − t)]2/2.

Page 292: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

Black-Scholes Dynamics 275

To verify that the Black-Scholes formulas are consistent with Europeanput-call parity, notice that

Φ[q+(f/X)] ≡ Φ[ln(f/X) + σ2(T − t)/2

σ√

T − t

]= 1 − Φ

[ln(X/f) − σ2(T − t)/2

σ√

T − t

]≡ 1 − Φ[q−(X/f)],

which from (6.27) and (6.28) implies that

CE(St, T − t) − PE(St, T − t) = B(t, T )[f(t, T )− X ]. (6.36)

This is the put-call parity relation stated in terms of the forward price foran underlying asset with arbitrary (but deterministic) explicit cost of carry.If there are neither dividends nor positive carrying costs, then f(t, T ) =St/B(t, T ) and

CE(St, T − t) − PE(St, T − t) = St − B(t, T )X.

6.4.2 Extreme Values and Comparative Statics

Let us now verify that the Black-Scholes formulas satisfy the terminal valueconditions and the boundary conditions worked out in chapter 4, then seehow the prices depend on stock and option parameters.

Calls

We present call formula (6.28) again here for convenience as

CE(St, T − t) = BfΦ[q+(f/X)] − BXΦ[q−(f/X)], (6.37)

where once again

q±(x) ≡ lnx ± σ2(T − t)/2σ√

T − t,

B ≡ B(t, T ) is the price of a discount T -maturing bond, and f ≡ f(t, T ) isthe T -forward price as of time t.

To check that this satisfies terminal value condition CE(ST , 0) = (ST −X)+, begin with the obvious facts that B → B(T, T ) = 1 and f → f(T, T ) =ST as t → T . If ST > X then q±(f/X) → +∞ as t → T and CE(St, T−t) →ST − X, whereas if ST ≤ X then q±(f/X) → −∞ and CE(St, T − t) → 0.

Page 293: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

276 Pricing Derivative Securities (2nd Ed.)

Thus, the call formula does deliver at expiration the correct piecewise-linearpayoff function, (ST − X)+.

Now fixing t < T , consider how the value function behaves as St

approaches its extreme values. As St (and therefore f) approaches zero,it is apparent that q±(f/X) → −∞ and CE(St, T − t) → 0. Also,

CE(St, T − t) − B(f − X) = BfΦ[q+(f/X)] − 1 − BXΦ[q−(f/X)] − 1

approaches zero as St → ∞. Thus, the value function starts at the originand approaches asymptotically the line B(f − X) = Ste

−δ(T−t) − BX asSt → ∞. When there is no dividend, this is the line St−BX, as in figure 6.2.Notice that the fact that CE(St, T − t) > St − BX ≥ St − X when t < T

is consistent with the general conclusion in chapter 4 that there would beno incentive to exercise a call on a no-dividend stock before expiration.

It is obvious from (6.37) that CE(St, T − t) ≤ Ste−δ(T−t), which is the

upper bound in table 4.7 that we derived from static replicating arguments.To see that it satisfies lower bound CE(St, T − t) ≥ (Ste

−δ(T−t) − BX)+,set X = 1 as a normalization and put

c(f) ≡ CE(St, T − t)/B = fΦ[q+(f)] − Φ[q−(f)].

Fig. 6.2. Shape of Black-Scholes call function.

Page 294: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

Black-Scholes Dynamics 277

With these conventions the bound becomes c(f) ≥ (f − 1)+. Differentiatingc(f) gives

c′(f) = Φ[q+(f)] + fφ[q+(f)] · dq+/df − φ[q−(f)] · dq−/df,

where φ is the standard normal p.d.f. This reduces to c′(f) = Φ[q+(f)]on expressing the last two terms. Thus, c′(f) > 0 when f > 0, and sincec(0) = 0 it follows that c(f) ≥ 0 for all f ≥ 0. On the other hand, we havethe stronger bound c(f) ≥ (f − 1) when f ≥ 1, since

c(f) − (f − 1)f

= Φ[q+(f)] − 1 − f−1Φ[q−(f)] − 1

= f−1Φ[q+(f−1)] − Φ[q−(f−1)]

= c(f−1),

which is nonnegative by the previous result.Figure 6.2 depicts the call function as monotonic and convex in St. To

verify this, we can sign the first two derivatives. In the standard notationthe derivative with respect to the current price of the underlying, which iscalled the “delta” of the call, is

CES = e−δ(T−t)Φ[q+(f)] = e−δ(T−t)Φ

[q+

(Ste

−δ(T−t)

BX

)]> 0.

Interestingly, this is the same expression one would get on differentiating(6.37) with respect to St but overlooking its presence in the arguments ofthe c.d.f.s. The second derivative with respect to St, the “gamma”, is alsopositive. Specifically,

CESS ∝ φ[q+(f)]

fσ√

T − t> 0,

confirming that the Black-Scholes call function is indeed increasing andconvex in St.

Differentiating with respect to the strike price gives

CEX = −BΦ[q−(f)] < 0.

This and the expression for CES show that a call on a no-dividend stock can

be represented as

CE(St, T − t) = CES St + CE

XX, (6.38)

a decomposition that will be found useful as we proceed.Here are other comparative statics for European calls in the no-dividend

case (δ = 0), where subscripts denote partial derivatives and the “Greeks”

Page 295: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

278 Pricing Derivative Securities (2nd Ed.)

in parentheses are the nicknames applied by practitioners:

1. CEt = −CE

T−t < 0 (“theta”). Value declines as remaining life diminishes,since there is less potential to get deeply into the money, and since thepresent value of the liability to pay X upon exercise increases as t → T .

2. CEσ > 0 (“vega”). Greater volatility adds potential to be in the money

at expiration.3. CE

r > 0 (“rho”). Higher interest rates reduce the present value of thecontingent liability to pay X in order to exercise the call.

4. CEXX > 0. Convexity of CE with respect to X follows from convexity of

PE with respect to St (see below) and identity (6.34).

Puts

The corresponding results for European puts can be deduced most easilyusing European put-call parity, expression (6.36). For the terminal condi-tion, PE(St, T − t) → (ST − X)+ − (ST − X) = (X − ST )+ as t → T.

At the extremes of St, we have PE(St, T − t) → B(t, T )X as St → 0 andPE(St, T − t) → 0 as St → ∞. The bounds are

[B(t, T )X − Ste−δ(T−t)]+ = B(t, T )(X − f)+ ≤ PE(St, T − t) ≤ B(t, T )X,

in line with the findings in chapter 4. For the comparative statics, the deltafor a European put is

PES = −e−δ(T−t)Φ

[q−

(BX

Ste−δ(T−t)

)]< 0 (6.39)

and the gamma, PESS , is positive, so that Black-Scholes values of European

puts are also convex functions of the underlying price, decreasing fromB(t, T )X to 0 as St ranges from 0 to ∞. These features are depicted infigure 6.3. Notice from the figure that PE(St, T − t) < X − St when St

is sufficiently small, indicating that the holder of a put would sometimesprefer to exercise before expiration.

The derivative of PE with respect to X is

PEX = BΦ

[q+

(BX

Ste−δ(T−t)

)]> 0.

This and the expression for PES show that the put’s value can be decom-

posed into terms involving PEX and its delta. When δ = 0 the result is

PE(St, T − t) = PES St + PE

X X. (6.40)

Page 296: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

Black-Scholes Dynamics 279

Fig. 6.3. Shape of Black-Scholes put function.

Other comparative statics when δ = 0 are

1. PEt = −PE

T−t (theta) has indeterminate sign. Value rises as time tomaturity falls if the St is low enough that early exercise would be desir-able; otherwise, longer life adds value by increasing the chance of gettinginto the money.

2. PEσ > 0 (“vega”). Greater volatility adds potential value.

3. PEr < 0 (rho). Higher interest rates reduce the present value of whatever

is received at exercise.4. PE

XX > 0. Convexity of the call in St implies convexity of the put in X .

6.4.3 Implicit Volatility

Since the “vegas” of vanilla European puts and calls are strictly positive, theoptions’ values are strictly increasing in σ. Also, it is easy to see from the for-mulas that Black-Scholes option prices range from their lower to their upperarbitrage-free bounds as σ ranges from zero to infinity; specifically, theprices of calls and puts increase from B(f−X)+ to Bf and from B(X − f)+

to BX , respectively. It follows that one and only one nonnegative value ofσ corresponds to each feasible call value and likewise to each feasible putvalue. Given the market price of a European put or call (along with theprices of the underlying asset and the riskless bond), it is therefore possible

Page 297: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

280 Pricing Derivative Securities (2nd Ed.)

to find a unique value of volatility that corresponds. This is the volatilitythat is said to be “implied by”, or “implicit in”, the Black-Scholes formulas.

Under the strict assumptions of Black and Scholes, including thatvolatility is constant through time, the volatilities implicit in any collectionof European options on the same underlying asset should be precisely thesame, regardless of expiration date, strike price, and even the current cal-endar time. Under the less restrictive assumption that volatility may varypredictably through time, puts and calls with different strike prices butthe same expiration date should still have the same implicit volatilities. Ofcourse, one expects some variation empirically because of transaction costs,the discreteness of observed prices, and the fact that price quotes on theoption and the underlying may not be simultaneous. Even so, before thepronounced market reversal of October 1987 differences in implicit volatil-ities across strike prices were relatively minor, albeit somewhat higher foroptions far away from the money. After 1987 things changed radically inmany markets. Implicit volatilities at low strikes have been substantiallyhigher than those at strikes close to and above the current underlyingprice. This has been particularly pronounced for options on stock indexes.Figure 6.4 illustrates a typical relation between implicit volatility and the

Fig. 6.4. An implicit-volatility “smile” curve.

Page 298: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

Black-Scholes Dynamics 281

“moneyness” of options, X/St. The shape of the curve suggests the commonterms for the phenomenon: volatility “smile” and volatility “smirk”.

To give a specific example of what the curve implies, take as bench-mark the implicit volatility of at-the-money options, σ(1) say, and usethis to calculate Black-Scholes prices for far-out-of-money puts, for whichX/St 1. Because of the monotonic relation between price and volatil-ity, the high implicit volatility of these out-of-money puts indicates thattheir actual market prices are higher than those calculated from the Black-Scholes formula with σ = σ(1). In other words, the market places a higherrelative value on out-of-money puts than on those at or in the money.In-the-money calls are also valued higher, because the put-call parity rela-tion fixes the difference between prices of puts and calls with the same strikeand expiration. Explaining such systematic anomalies has been—and is—aprimary research objective. Chapters 8 and 9 treat alternative models basedon richer dynamics than geometric Brownian motion that yield predictionsmore in accord with current experience.

6.4.4 Delta Hedging and Synthetic Options

The derivation of the Black-Scholes formulas relies on the ability to repli-cate payoffs of European options with dynamic, self-financing portfoliosof the underlying asset and money fund or riskless bonds. The formulasthemselves actually provide the exact recipes for such replicating portfo-lios. For example, (6.40) shows that replicating a put on a no-dividendstock requires holding PE

X X = XΦ[q+ (BX/St)] worth of the money fundand PE

S = −Φ[q−(BX/St)] units of the stock at time t. That is, one holdsa number of shares of stock equal to the option’s delta, together withPE

X X/Mt units of the fund. Since PES < 0 and PE

X > 0, this requires ashort position in the stock and a long position in the fund.

As t, St, and PES (St, T − t) change, the portfolio shares have to be

adjusted, but it must be possible to make the adjustments without cashinfusions or withdrawals if the replicating portfolio is to be self-financing.To see that this is so, we must show that

dP E(St, T − t) = PES · dSt +

PEX X

Mt· dMt.

Applying Ito’s formula to PE(St, T − t) gives

dP E(St, T − t) = PES · dSt + (PE

SSσ2S2t /2 − PE

T−t) · dt;

Page 299: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

282 Pricing Derivative Securities (2nd Ed.)

and fundamental p.d.e. (6.10) implies (with δ = 0)

PESSσ2S2

t /2 − PET−t = rt[PE(St, T − t) − PE

S St]

= rtPEX X

=PE

X X

Mt· dMt,

which gives the desired result.Actually replicating puts or other derivatives by following this recipe is

an example of “delta hedging”, which derives its name from the fact thatthe derivative’s delta is the required position in the underlying asset. Aninstitution that sold a put to a client would typically hedge the exposure byreplicating a long position in the option.10 In hedging the put, consider whathappens as the price of the underlying declines. Because PE is convex in St,PE

S decreases (increases in absolute value) as St falls. This means that morestock must be sold short and that proceeds of the sale have to be added tothe money-fund balance. As the share price rises, the money fund balanceis drawn down to purchase stock and reduce the short position. By the timeof expiration both positions will have been closed out entirely if the put isout of the money. If it is in the money, the account ends up short one shareof the underlying asset and with X units of cash. Figure 6.5 gives a visualillustration of the replication process. The slope of the tangent line at St

(the put’s delta) indicates the required (short) position in the underlying,and the vertical intercept of the tangent line (XPX) indicates the value ofthe replicating position in the money fund. The put’s value shows up onthe vertical axis as the sum of XPX and the negative quantity StPS . Bymoving the tangent line appropriately, one can track how the positions varywith St. Since PE

S ↓ −1 and PEX ↑ B(t, T ) as St ↓ 0, the portfolio would

be long BX/Mt units of the money fund and short one share of the stockshould the stock become worthless, and both positions would ultimately beclosed out as St → ∞.

Other than to hedge a short position arising from selling a put, whywould anyone want to replicate one? Owning a put on an asset provides afloor on the loss that could be sustained if the price falls. For example, themaximum potential loss as of time T from buying a share of (no-dividend)stock and a T -expiring put at t = 0 is S0 +P (S0, T )−X < S0, as compared

10Of course, institutions actually worry just about hedging their net exposures ratherthan the individual positions.

Page 300: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

Black-Scholes Dynamics 283

Fig. 6.5. Replicating portfolio for European put.

with maximum loss S0 from an unhedged position in the stock alone. Theput thus serves as a form of insurance, with initial cost P (S0, T ) as theinsurance premium. Indeed, one presumes that the high implicit volatilitiesfor out-of-money puts since 1987 are attributable to the high valuations themarket now places on such insurance. However, one cannot always find anexchange-traded put with the precise X, T, underlying asset, and contractsize that one desires. Moreover, if the goal is to insure the value of a partic-ular portfolio, it would be wasteful to insure each separate position, even ifthat were possible. This is where delta hedging comes in. Delta hedging canbe used to create “synthetic” puts of arbitrary expiration date and strikeprice and on individual assets or portfolios of assets for which options donot actively trade. Creating synthetic puts on diverse portfolios is in fact acommon means of creating “portfolio insurance”.11 Thus, to start at time t

11. . . but a far less common means than before the 1987 crash, to which its widespreaduse during the 1980s was a prime contributor. The concept works well on a scalesmall enough to prevent substantial feedback to the underlying price, but on a large

scale the program of dumping shares into a declining market and scrambling to getthem back on upticks is clearly destabilizing. It is interesting and ironical that theBlack-Scholes-Merton model is less useful today because of its triumphant early success.

Page 301: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

284 Pricing Derivative Securities (2nd Ed.)

to insure out to T a portfolio worth St (and assumed to follow Black-Scholesdynamics), one would sell off an amount worth |PS(St, T − t)|St and buyPX(St, T −t)X worth of riskless bonds—the net outlay (an increasing func-tion of the strike) being the cost of insuring.

6.4.5 Instantaneous Risks and Expected Returns

of European Options

Under risk-neutral measure P all primary assets and derivatives have instan-taneous expected rate of return (proportional mean drift) equal to the shortrate, rt, just as do riskless savings accounts. Of course, in the real world(that is, in measure P) one expects returns of risky assets to compensate forrisk on average over time. Looking in turn at European calls and Europeanputs, we will see what the Black-Scholes formulas imply about their actualinstantaneous mean returns and risks. To simplify the expressions, we con-tinue to work with options on no-dividend assets and to treat underlyingvolatilities as constants.

Calls

Modeling the evolution of the underlying price as dSt/St = µSt ·dt+σS ·dWt,

where µSt is a deterministic process with∫ T

0 |µSt | · dt < ∞, Ito’s lemmaapplied to CE(St, T − t) gives

dCE

CE=

−CET−t + µStC

ES St + CE

SSS2t σ2

S/2CE

· dt +σSCE

S St

CE· dWt

≡ µCt · dt + σCt · dWt. (6.41)

Notice that while the option’s price is an Ito process, the volatility, σCt ,is not deterministic since it depends on St—directly and through CE andCE

S . Letting ηCt ≡ CES St/CE = σCt/σS be the call’s price elasticity with

respect to St and using (6.38) to express CE(St,T − t), one sees that

ηCt =CE

S St

CES St + CE

XX.

Since CEX < 0, we have ηCt > 1 and thus σCt > σS . Thus, the call is always

more volatile than the stock. Moreover, ηCt → 1 as St → ∞ and ηCt → +∞

Page 302: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

Black-Scholes Dynamics 285

as St → 0. 12 From (6.41 ) the call’s mean proportional drift is

µCt =−CE

T−t + rtCES St + CE

SSS2t σ2

S/2CE

+ ηCt(µSt − rt).

and applying the fundamental p.d.e. reduces this to

µCt − rt = ηCt(µSt − rt).

Thus, since ηCt > 1, if the stock’s instantaneous excess rate of return ispositive, then the call’s excess return is even larger, in compensation for itsextra volatility.

Now let us suppose that the stock’s instantaneous rate of return isrelated to the excess return of the “market” portfolio in the manner pre-dicted by Capital Asset Pricing Model (CAPM).13 To set up the CAPM incontinuous time, model the price of asset j ∈ 1, 2, . . . , n as

dSjt/Sjt = µjt · dt + σ′j · dWt,

where Wt is N -dimensional standard Brownian motion with N > n. Wealso need the equivalent one-dimensional form,

dSjt/Sjt = µjt · dt + θj · dW σjt,

where θj ≡√

σ′jσj and W σ

jt ≡ θ−1j σ′

jWt. Then if St is the vector of asset

prices and v ≡ (ν1, . . . , νn)′ is the vector of numbers of shares (regarded as

12This is true because CXXCSSt

→ −1 as St → 0. To show this, apply l’Hospital’s ruleto get

limSt→0

CXX

CSSt= lim

St→0

XCXS

CS + StCSS.

Expressing CXS and CSS in terms of normal p.d.f. φ and CS in terms of normal c.d.f. Φ,setting σS

√T − t = 1 without loss of generality, and letting y ≡ S/(BX) shows that

limSt→0

CXX

CSSt= − lim

y→0

1y· φ(ln y − 1

2)

Φ(ln y + 12) + φ(ln y + 1

2).

The numerator simplifies to φ(ln y + 12). Dividing numerator and denominator by this

and letting z ≡ ln y + 12

give

limSt→0

CXX

CSSt= −[1 + lim

z→−∞Φ(z)/φ(z)]−1.

Finally, applying l’Hospital again shows that limz→−∞ Φ(z)/φ(z)=limz→−∞ −z−1 =0.13Under the Sharpe (1964)-Lintner (1965)-Mossin (1966) version of the CAPM an

asset’s equilibrium expected excess rate of return is proportional to that of the marketportfolio, as ERt − rt = β(ERM − rt), where β = cov(R, RM)/V RM.

Page 303: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

286 Pricing Derivative Securities (2nd Ed.)

constants), the value of the market portfolio is Mt = v′St and

dMt/Mt = µMt · dt + /Σ′Mt

· dWt ≡ µMt · dt + θMt · dWMt ,

where

/ΣMt≡

∑nj=1 vjSjtσj∑n

j=1 vjSjt,

θ2Mt

≡ /Σ′Mt

/ΣMt,

and WMt ≡ /Σ′Mt

Wt. The instantaneous correlation between the returnsof stock j and the market is

ρjMt ≡ d〈M, Sj〉t√d〈M〉td〈Sj〉t

=σ′

j /ΣMt

θjθMt

,

and the instantaneous “beta” coefficient is

βSjt = ρjMtθj/θMt .

In this model beta represents the asset’s systematic risk—its exposure to theforces that cause aggregate wealth to vary unpredictably. Because individ-uals seek to avoid this risk the stock’s excess return varies in proportion toits beta, as µjt − rt = βSjt(µMt − rt). (We take for granted that µMt > rt.)

Focusing on some particular asset j ∈ 1, 2, . . . , n now but droppingthe subscript, the volatility of the call on the asset can be expressed asσCt = ηCtθ, so that the call’s beta is βCt = ηCtβSt . If the instantaneousCAPM does hold for the stock, we find that it holds for the call as well, since

µCt − rt = ηCt(µSt − rt)

= ηCtβSt(µMt − rt)

≡ βCt(µMt − rt).

Moreover, since ηCt > 1 the call has higher systematic risk than the stockwhenever the stock’s own systematic risk is positive.

Puts

The situation for puts is a bit more complicated. Corresponding expressionsdo hold; that is,

µPt − rt = ηPt(µSt − rt)

= ηPtβSt(µMt − rt)

= βPt(µMt − rt).

Page 304: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

Black-Scholes Dynamics 287

However, since ηPt ≡ PES St/PE < 0, the put offers negative excess return

whenever the underlying has positive systematic risk. That is, within theCAPM framework µPt − rt < 0 when βSt > 0. The intuition is that beinglong a put reduces the overall systematic risk of one’s portfolio (recall itsinsurance value for portfolios of primary assets), and this makes it worth-while to hold the put even though it returns less on average than do risklessbonds.

Unlike calls, puts do not necessarily have higher volatility than theunderlying asset in the Black-Scholes framework. Writing

|ηPt | =∣∣∣∣ PE

S St

PES St + PE

X X

∣∣∣∣ ,shows that the absolute price elasticity is greater than, equal to, or lessthan unity as −2PE

S St PEX X, or as

2Φ[ln(BX/St) − σ2

S(T − t)/2σS

√T − t

]St Φ

[ln(BX/St) + σ2

S(T − t)/2σS

√T − t

]BX.

Letting f ≡ St/BX , this is equivalent to

2Φ[ln f−1 − σ2

S(T − t)/2σS

√T − t

] f−1Φ

[ln f−1 + σ2

S(T − t)/2σS

√T − t

].

The right side will certainly be greater than the left when the option is farenough in the money that f < 1/2. Thus, |ηPt | < 1 for puts that are deep inthe money, whereas |ηPt | > 1 for puts that are far enough out. The locationof the threshold of unit elasticity depends on the volatility and time toexpiration. When f = 1 some algebra shows that |ηPt | 1 according asΦ(σS

√T − t/2) 2

3 . As expiration nears, Φ(σS

√T − t/2) → 1

2 , so thatabsolute elasticity is certainly greater than unity for at-the-money putsthat are about to expire; however, |ηPt | < 1 when volatility and time toexpiration are sufficiently large.

Summing up, the Black-Scholes formulas imply that European calls arealways more volatile than the underlying stocks, but that puts are morevolatile only when far out of the money or far from expiration.

6.4.6 Holding-Period Returns for European Options

Having just seen what the Black-Scholes formulas imply for options’ instan-taneous expected returns, let us now look at expected returns over arbitraryholding periods. It is clear that in risk-neutral measure P expected holding-period returns of all primary assets equal the sure return from holding a

Page 305: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

288 Pricing Derivative Securities (2nd Ed.)

default-free bond to maturity. Under assumption RK this is the same asthe return of the money fund and the same as the return from holding abond to a time before maturity. Thus, for both an underlying asset withinitial price S0 and a T -expiring derivative worth D(S0, T ) the P-expectedtotal return during [0, t] is

ESt

S0=

ED(St, T − t)D(S0, T )

=1

B(0, t)=

Mt

M0=

B(t, T )B(0, T )

= eR

t0 rs·ds (6.42)

for t ≤ T . However, under actual measure P, wherein

dSt/St = µt · dt + σ · dWt

with deterministic µt, we have E(St/S0) = exp(∫ t

0 µs · ds). Can we alsocalculate the expected holding-period returns of European puts and callsunder P for arbitrary paths of µs? In fact, we can thanks to the followingrelation (proved in section 2.2.7):∫ ∞

−∞Φ(α ± βx) · dΦ(x) = Φ(α/

√1 + β2). (6.43)

Following Rubinstein (1984) we shall use (6.43) to work out the expectedholding-period return from 0 to t ≤ T of a T -expiring European call on ano-dividend stock. The c.d.f. of St under P, given what is known at t = 0, is

F (s) ≡ P(St ≤ s)

= P(S0eR t0 µs·ds−σ2t/2+σWt ≤ s)

= P(ESt · e−σ2t/2+σWt ≤ s)

= Φ[q+(s/ESt; t)],

where

q±(x; t) ≡ lnx ± σ2t/2σ√

t.

The time-zero expectation of the European call’s value at t is then

ECE(St, T − t) =∫ ∞

0

sΦ[q+

( s

BX; T − t

)]· dΦ

[q+

(s

ESt; t)]

−BX

∫ ∞

0

Φ[q−

( s

BX; T − t

)]· dΦ

[q+

(s

ESt; t)]

,

where B ≡ B(t, T ). Changing variables as

z ← ln(s/ESt) − σ2t/2σ√

t= q+

(s

ESt; t)− σ

√t/2 = q−

(s

ESt; t)

Page 306: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

Black-Scholes Dynamics 289

in the first integral and as

z ← ln(s/ESt) + σ2t/2σ√

t= q+

(s

ESt; t)

in the second gives, after simplifying,

ECE(St, T − t) = ESt

∫ ∞

−∞Φ

[ln

(ESt

BX

)+ σ2T/2

σ√

T − t+ z

√t

T − t

]· dΦ(z)

−BX

∫ ∞

−∞Φ

[ln

(ESt

BX

)− σ2T/2

σ√

T − t+ z

√t

T − t

]· dΦ(z).

Since B ≡ B(t, T ) = B(0, T )/B(0, t) under assumption RK, we can write

ESt

B(t, T )X=

B(0, t)ESt

B(0, T )X

in the above. Formula (6.43) then gives for ECE(St, T − t) the expression

EStΦ[q+

(B(0, t)ESt

B(0, T )X; T

)]− B(t, T )XΦ

[q−

(B(0, t)ESt

B(0, T )X; T

)].

Finally, multiplying by B(0, t) shows that

B(0, t)ECE(St, T − t) = CE [B(0, t)ESt, T ]

and gives the following remarkable formula for the call’s expected holding-period return under P:

ECE(St, T − t)CE(S0, T )

=CE [B(0, t)ESt, T ]B(0, t)CE(S0, T )

. (6.44)

Notice that when P = P the right side reverts to

CE [B(0, t)ESt, T ]B(0, t)CE(S0, T )

=CE(S0, T )

B(0, t)CE(S0, T )= B(0, t)−1,

in line with (6.42).While the numerator on the right side of (6.44) increases with the

P-expected future price of the stock, the current value of the call in thedenominator does not change, since it based on measure P. An increase inESt therefore raises the call’s mean holding-period return, as one wouldexpect. Things work in the opposite direction for puts. Figures 6.6 and 6.7show, for several values of volatility, how mean holding-period returns ofEuropean options vary with the stock’s average proportional drift over theholding period, µ(0, t) ≡ t−1

∫ t

0 µs · ds. The figures plot the expected annu-alized, continuously compounded rates of return of calls and puts vs. µ(0, t)

Page 307: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

290 Pricing Derivative Securities (2nd Ed.)

0.00 0.02 0.04 0.06 0.08 0.10 0.12 0.14 0.16 0.18 0.20

Mean Drift of Underlying Price

−0.5

−0.3

−0.1

0.1

0.3

0.5

0.7

0.9

1.1

1.3

1.5

Mean A

nnualiz

ed R

ate

of R

etu

rn

σ=.10

σ=.20

σ=.30σ=.40σ=.50

Fig. 6.6. Expected holding-period returns of call vs. mean drift of stock.

0.00 0.02 0.04 0.06 0.08 0.10 0.12 0.14 0.16 0.18 0.20

Mean Drift of Underlying Price

−3.0

−2.5

−2.0

−1.5

−1.0

−0.5

0.0

0.5

1.0

Mean A

nnualiz

ed R

ate

of R

etu

rn

σ=.10

σ=.20

σ=.30

σ=.40

σ=.50

Fig. 6.7. Expected holding-period return of put vs. mean drift of stock.

Page 308: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch06 FA

Black-Scholes Dynamics 291

for each of five values of σ. (Annualized, continuously compounded rates ofreturn are natural logs of holding-period returns divided by the length ofthe holding period in years.) In each case t = 0.5, T = 1.0, S0 = X = 10.0,

and the average short rate is r = 0.05. The curves for all values of σ inter-sect at µ = r = 0.05, since here the actual and risk-neutral values coincide.Notice that the options’ mean returns are especially sensitive to variationsin the stock’s mean drift when volatility is low, because in that case theyhave little chance of being in the money at expiration unless the trend isfavorable.

Page 309: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

7American Options and “Exotics”

This chapter takes up the pricing of derivatives that have more complicatedpayoff structures than European-style options: American options with bothfinite and indefinite lives; “compound” options, for which the underlyingassets themselves are (or depend on) options; European-style options thatcan be extended finitely or indefinitely many times; options with terminalpayoff functions besides just (ST −X)+ or (X −ST )+; and options that letthe holder choose among alternative underlying assets or that let the buyerdecide after the purchase whether the option is to be a put or a call. Wealso consider various path-dependent derivatives such as barrier options,Asian options, and lookbacks, all of whose payoffs depend on the price ofthe underlying at more than one point in time. While finite-lived Americanoptions are traded on the organized exchanges—and, indeed, are the mostcommon types of exchange-traded options—the various exotics are tradedprimarily over the counter, being created by financial firms to meet specifichedging demands of their clients.

Throughout this chapter we adhere to the standard Black-Scholes frame-work, taking the dynamics for the underlying as geometric Brownian motionwith volatility σ and regarding the short rate of interest as a constant, r.

7.1 American Options

American calls on no-dividend stocks are valued just as European calls,because their always-positive time value prior to expiration rules out earlyexercise. This was established in chapter 4 by simple static replicating argu-ments. On the other hand, unless they are dividend-protected by a contrac-tual adjustment of the strike price, calls on stocks that do pay dividends maywell be exercised early. Likewise, early exercise may well occur for American

292

Page 310: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 293

calls on foreign currencies and, as explained on page 273, for American callson futures. Moreover, American puts on any of these underlying assets aresubject to early exercise. Obviously, one would want to exercise a put ifthe price of the underlying were so low that foregone interest on the strikeexceeded the possible gain from further decline.

Valuing finite-lived American puts and finite-lived puts and calls onfutures or on assets paying continuous dividends is complicated by thepath dependence of the date of exercise. While this early-exercise option iseasily handled in the backwards-recursive binomial setup, no exact compu-tational formula like that for European options has ever been found withinthe Black-Scholes framework. This is one reason that the binomial methodremains a useful tool in derivatives pricing. On the other hand, in thecontinuous-time environment it is relatively easy to price finitely lived callson stocks that pay known dividends at discrete times and, oddly enough,American options that never expire. We begin with the first of these cases,then treat the harder case of puts and calls on assets that may pay contin-uous dividends, then conclude by developing simple valuation formulas foroptions with no definite termination dates.

7.1.1 Calls on Stocks Paying Lump-Sum Dividends

Consider an American call expiring at T on a stock that pays a singleknown, cash dividend to holders of record up to some date t∗ ∈ (t, T ), theex -dividend date. On this date the price will drop below its cum-dividendvalue by some amount ∆, which we define to be the value of the cashdividend.1 The same argument that rules out early exercise of a call ona no-dividend stock shows that exercise in this case would occur, if atall, just prior to t∗. Let t∗− denote this last instant before t∗. In orderto value the option at t, we must determine its value at t∗−. The holderof the option would exercise it at that time if intrinsic value St∗− − X

exceeded its live value on the ex -dividend asset. Since by assumption onlyone dividend is paid before T , the option’s ex -dividend value is the valueof a European call with remaining life T − t∗ on a stock worth St∗− − ∆.Thus, the value of the American call at t∗− is the greater of St∗− − X andCE(St∗− − ∆, T − t∗) = CE(St∗ , T − t∗).

Figure 7.1 illustrates. We know that the value of a call—either Europeanor American—on a no-dividend stock would approach asymptotically the

1See the discussion of tax and timing issues for dividends in section 5.4.4.

Page 311: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

294 Pricing Derivative Securities (2nd Ed.)

Fig. 7.1. Early exercise condition for American call on stock paying dividend ∆ at t∗.

line St∗ − B(t∗, T )X as St∗ → ∞. This lies above the St∗ − X line by anamount X [1 − B(t∗, T )]. This difference increases with the average shortrate of interest over the option’s remaining life and, so long as short ratesare sometimes positive, with T − t∗. This no-dividend case is depicted bythe dashed curve emanating from the origin. That the value function liesalways above the intrinsic value line prior to expiration indicates that earlyexercise would never be advantageous in the absence of a dividend. Nowconsider how the payment of a dividend alters the picture. Since the stock’sprice will drop at once to St∗ = St∗− − ∆ on the ex date, a cum-dividendprice of ∆ at t∗− would correspond to an ex -dividend price of zero at t∗.Once the dividend is out of the way, the American call is again valued likea European call with T − t∗ to go. Thus, its value at t∗ as a function ofex -dividend price St∗ is simply the value function of a European call at St∗−

shifted to the right by ∆. This translation is depicted by the bold curvein the figure. If the curve intersects the line (St∗ − X)+, as it does in thefigure at price s∆, then it would be advantageous to exercise the Americancall at t∗− whenever St∗− > s∆. The critical issue, then, is whether suchan intersection occurs, and this depends on the size of the dividend and onB(t∗, T )—that is, on the average short rate and on how long the option hasyet to live.

Page 312: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 295

To see whether such a critical s∆ exists and to determine its value justrequires a numerical search for a solution to

St∗ − X − CE(St∗ − ∆, T − t∗) = 0,

using the Black-Scholes formula to value the call. If there is no solution,the value of the American call at t will be that of a European call on adividend-paying stock. This can be approximated by the set-aside approachdiscussed in section 6.3.3. If there is a solution, then one has to do somework, as follows. Applying the risk-neutral paradigm, express the call’scurrent value, denoted CA(St, T − t; t∗, ∆), as

B(t, t∗)∫ ∞

0

max [st∗ − X, CE(st∗ − ∆, T − t∗)] · dFt(st∗)

or

B(t, t∗)∫ s∆

CE(st∗ −∆, T − t∗) ·dFt(st∗)+B(t, t∗)∫ ∞

s∆

(st∗ −X) ·dFt(st∗),

where Ft(st∗) ≡ Pt(St∗ ≤ st∗) is the conditional c.d.f. at t of the cum-dividend price at t∗ in the risk-neutral measure. Look first at the secondterm. If the strike price of the option were s∆ rather than X , the value ofthis term would be given by the Black-Scholes formula with s∆ in place ofX. As it is, X just takes its usual place as the coefficient of the second termin the formula, with s∆ appearing in place of X everywhere else; that is,setting τ1 ≡ t∗ − t and B1 ≡ B(t, t∗) for brevity,

B1

∫ ∞

s∆

(st∗−X)·dFt(st∗) = StΦ[q+

(St

B1s∆; τ1

)]−B1XΦ

[q−

(St

B1s∆; τ1

)].

To work on the first term in the expression for CA(St, T − t; t∗, ∆), letτ2 ≡ T − t∗ and restate Ft in terms of Φ to get

B1

∫ s∆

CE(st∗ − ∆, τ2) · dΦ[q+

(B1st∗

St; τ1

)].

Now change variables as

z ← q+

(B1st∗

St; τ1

)st∗ → St

B1exp

(zσ

√τ1 − σ2τ1/2

),

and calculate this as

B1

∫ zs∆

z∆

CE

[St

B1exp

(zσ

√τ1 − σ2τ1/2

)− ∆, τ2

]· dΦ(z),

Page 313: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

296 Pricing Derivative Securities (2nd Ed.)

where the limits of integration are

z∆ ≡ q+

(B1∆St

; τ1

)zs∆ ≡ q+

(B1s∆

St; τ1

).

Alternatively, expressing CE(·, τ2) gives∫ zs∆

z∆

(Ste

zσ√

τ1−σ2τ1/2 − B1∆)

Φ

[q+

(Ste

zσ√

τ1−σ2τ1/2 − B1∆BX

; τ2

)]

−BXΦ

[q−

(Ste

zσ√

τ1−σ2τ1/2 − B1∆BX

; τ2

)]· dΦ(z),

where B ≡ B1B(t∗, T ) ≡ B(t, T ). This integral can be computed numeri-cally using, for example, routine GAUSSINT on the accompanying CD. Themethod can be adapted for a dividend that is a fixed proportion of St∗

or, for that matter, a dividend that is any well-behaved function of theunderlying price.

7.1.2 Options on Assets Paying Continuous Dividends

This section treats American options on an underlying asset that followsgeometric Brownian motion and pays continuous proportional dividends atconstant, deterministic rate δ. The dynamics under the risk-neutral measureare thus

dSt = (r − δ)St · dt + σSt · dWt,

where Wt≥0 is a Brownian motion under P. We refer to the underlyingasset as a “stock”, but the analysis applies also to options on currencies, ondiversified stock portfolios (indexes), and (taking δ = r) even on futures,so long as it is appropriate to model their price processes as geometricBrownian motion. The plan is to deal first with the American put and thenextend to calls by exploiting the symmetry between these two types. Weconsider a T -expiring put with strike X and terminal value (if not previouslyexercised) equal to PA(ST , 0) = (X − ST )+, with PA(St, T − t) denotingthe value at t ∈ [0, T ]. Although there is no simple analytical formula forPA(St, T − t), there are many implicit characterizations, some of whichlead to useful numerical approximations. The binomial model considered inchapter 5 gives one such approximation, but standard binomial solutions are

Page 314: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 297

generally slower than those we now develop.2 We shall describe proceduresbased on three characterizations of PA(St, T−t). Since all of these somehowinvolve the critical exercise boundary, Bt0≤t≤T , we begin by describingthis function.

The Put’s Critical Exercise Boundary

The exercise boundary for the put is a function B : [0, T ] → + such thatexercise is optimal at any t ∈ [0, T ] whenever St < Bt. Given the parametersthat define the put and the dynamics of the underlying (that is, X , T , r, δ,and σ), this boundary has to be found numerically by some process that iscapable of valuing the put itself. We saw examples of this in chapter 5, wherewe discovered the exercise boundary in the process of valuing the option inthe binomial lattice. Nevertheless, there are several features of Bt thatcan be deduced analytically. First, it is clear that Bt ≤ X for t ∈ [0, T ],since it would never be optimal to exercise an out-of-money option. Second,it must be the case that BT = X , since an in-the-money put would be letto expire unexercised if BT < X and BT ≤ ST < X . Third, if σ = 0 andr ≥ 0 then Bt = X for all t ∈ [0, T ], since with nonnegative mean driftand zero volatility there would be no chance that the option could ever getfurther in the money. Fourth, Bt is nondecreasing with time; that is, as wemove closer to expiration the threshold below which exercise is desirablecannot decrease. The intuition for this is that by exercising at t one tradesthe put’s time value—the value associated with the contingency of furtherdeclines in S—for the sure intrinsic value X − Bt. Since the time valuecannot increase as t → T , neither can X − Bt. Indeed, with r > 0 andσ > 0 (as we assume in the remainder of this discussion) the option’s timevalue strictly decreases as t → T , and Bt must strictly increase. Fifth, sincethere is nothing that could introduce discontinuities in the put’s time valueor its derivatives, Bt itself should be a smooth, differentiable function. And,finally, the values of PA and ∂PA/∂S ≡ PA

S at the exercise boundary areknown;3 namely,

PA(Bt, T − t) = X − Bt (7.1)

PAS (Bt, T − t) = −1. (7.2)

2On the other hand, we will see that the smoothing technique of Broadie and Detem-ple (1996) coupled with Richardson extrapolation is competitive with the best of theseapproximations.

3See van Moerbeke (1976) for proofs of these properties.

Page 315: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

298 Pricing Derivative Securities (2nd Ed.)

Although the “high-contact” condition, (7.2), looks a bit strange, it isnot hard to develop some intuition for it. Clearly, the put’s delta would benegative unity at the point of exercise, since a replicating portfolio wouldthen be short precisely one share of the underlying. More formally, lettingPA(St, T−t; bt) represent the put’s value at t when the critical exercise priceis bt, the fact that Bt is the optimal threshold suggests that ∂PA(St, T −t; bt)/∂bt = 0 at bt = Bt. Now differentiating (7.1) gives

−1 = dP A(Bt, T − t; Bt)/dBt

= ∂PA(St, T − t; Bt)/∂St |St=Bt +∂PA(Bt, T − t; bt)/∂bt |bt=Bt

= PAS (Bt, T − t; Bt) + 0.

Characterizations of PA(St, T − t)

We saw in chapter 5 that the value of an American put can be representedas the discounted expected value of the receipt generated when underlyingprice first passes the exercise boundary. Specifically, in the discrete-timeBernoulli setup we expressed the put’s initial value as

PA0 =

n∑j=1

B(0, j)(X − Bj)P(J∗ = j),

where stopping time J∗ is the first step at which sj < Bj and B(0, j) isthe discrete-time notation for the price of a bond expiring at step j. Theapplicable boundary function Bj maximizes the value of this expression.Corresponding expressions in continuous time are

PA(S0, T ) = maxB

∫ T

0

e−ru(X − Bu) · dP(U ≤ u)

and

PA(S0, T ) = maxB

Ee−rU (X − SU ), (7.3)

where stopping time U = inft : St < Bt. The characterization (7.3)will be used below in developing a symmetry relation for American putsand calls and again in chapter 11 to value options by simulation. However,neither of these expressions has been found helpful in devising analyticalapproximations, because distributions of first-passage times of Brownianmotions through arbitrary boundaries are not easily described, and becausethe boundary itself has to be found along with the value of the option.

Page 316: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 299

More fruitful for analytical results is the representation of PA(St, T −t) as solution to the Black-Scholes p.d.e. Merton (1973) showed that theAmerican put satisfies

−PAT−t + PA

S (r − δ)St + PASSσ2S2

t /2 − PAr = 0 (7.4)

subject to the terminal condition PA(ST , 0) = (X − ST )+ and boundaryconditions

PA(St, T − t) ∈ [(X − St)+, X ] (7.5)

limSt→∞ PA(St, T − t) = 0. (7.6)

There are also conditions (7.1) and (7.2). Brennan and Schwartz (1977)presented an algorithm that yields numerical solutions for this problem,using discretized state spaces for both time and the underlying price. Thesefinite-difference methods for solving p.d.e.s are described in chapter 12,which gives specific examples of their application to American options.4

Later in this section we work out an analytical solution to the p.d.e. in thecase T = ∞, which turns out to be a much easier problem.

A third characterization of the American put involves discretizing timealone. Letting Pn(S0, T ) denote the value of a put that can be exercisedjust at times tj ≡ jT/n for j ∈ 0, 1, . . . , n, the value of the standardAmerican put is approached in the limit as n → ∞. We develop below thefollowing recursion, which, in principle, permits valuation for arbitrary n:

Pn(S0, T ) = B(0, t1)E(X − St1)1[0,Bt1 ](St1)

+ B(0, t1)EPn−1(St1 , T − t1)1(Bt1 ,∞)(St1). (7.7)

Here, 1A is the usual indicator notation and Bt1 is the critical exercisevalue at t1. Implementing this procedure gives an approximation proposedby Geske and Johnson (1984).

Finally, the value of the American put can be expressed as the sum of thevalue of a corresponding European option and an early-exercise premium:

PA(St, T − t) = PE(St, T − t) + Π(St, T − t). (7.8)

There are a number of ways to represent the premium and to exploit thisrelationship in order to approximate the value of the put. We describemethods proposed by MacMillan (1986) and Ju (1998).

4Wilmott et al. (1993, 1995) give extensive treatment of this subject.

Page 317: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

300 Pricing Derivative Securities (2nd Ed.)

The Geske-Johnson Approximation

Abstracting from the fact that markets do not operate around the clock, anAmerican put can in principle be exercised at any time before expiration.However, one can imagine discrete versions that could be exercised only atspecific times up to and including T . Indeed, there is an active over-the-counter market in “Bermudan” options that have precisely this feature. Nowconsider a sequence of Bermudan puts indexed by the number of equallyspaced exercise dates, with prices Pn(S0, T )∞n=1. Given the continuity ofunderlying process Stt≥0, one expects such a sequence of prices to con-verge to the arbitrage-free price of a vanilla American put. The idea behindthe Geske-Johnson (1984) approach is to value each of a short sequence ofthese puts, then apply Richardson extrapolation to extend to an infinitenumber of dates. The process starts with P1(S0, T ), which is nothing butthe Black-Scholes value of a European put. Next comes P2(S0, T ), the valueof an option that can be exercised at t1 = T/2 and at t2 = T . Geske andJohnson represent this in terms of the discounted risk-neutral expectationsof the contingent receipts at the two times:

P2(S0, T ) = B(0, t1)E(X − St1)1[0,Bt1 )(St1)

+ B(0, T )E(X − ST )+1[Bt1 ,∞)(St1).

With some manipulation the expectation in the second term, which involvesthe correlated lognormals St1 and ST , can be expressed as the expectationof a function of bivariate normals and evaluated numerically. Alternatively,conditioning on what is known at t1 and writing B(0, T )E(X−ST )+ in thesecond term as

B(0, t1)E[B(t1, T )Et1(X − ST )+] = B(0, t1)EP1(St1 , T − t1)

gives a recursive expression that makes the computations easier to programand leads naturally to the next stage in the sequence:

P2(S0, T ) = B(0, t1)E[(X − St1)1[0,Bt1)(St1)

+ P1(St1 , T − t1)1[Bt1 ,∞)(St1)]

= B(0, t1)E maxX − St1 , P1(St1 , T − t1). (7.9)

This shows that one who owns the put at t1 chooses either to receive X−St1

in cash or to hold on to what becomes a European put that dies at T .Critical exercise price Bt1 equates the values of these two claims at t1 andcan therefore be found by solving X−Bt1 = P1(Bt1 , T−t1) numerically. (Of

Page 318: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 301

Fig. 7.2. Value function at t1 of Bermudan put exercisable at t1 and T .

course, this is the critical price for the Bermudan approximation rather thanfor a genuine American put.) The solid, kinked curve in figure 7.2 depictsthe value of the two-step Bermudan option at t1 as a function of St1 .

Proceeding to the third stage, where exercise can occur at t1 = T/3,t2 = 2T/3, or t3 = T , the recursion progresses as

P3(S0, T ) = B(0, t1)E[(X − St1)1[0,Bt1 ](St1)

+ P2(St1 , T − t1)1[Bt1 ,∞)(St1)]

= B(0, t1)E max X − St1 , P2(St1 , T − t1) , (7.10)

where Bt1 now solves X − Bt1 = P2(Bt1 , T − t1)—and so on to (7.7) atstage n.

Evaluating the discounted expectation of the first terms in (7.9)and (7.10) is straightforward since

E(X − St1)1[0,Bt1 ](St1) = XΦ[q+(Bt1/f)] − fΦ[q−(Bt1/f)],

where Φ is the standard normal c.d.f., f = S0e−δt1/B(0, t1), and q±(x) =

(lnx)/(σ√

t1) ± σ√

t1/2. However, the computations get more difficultas one proceeds, since at each new stage beyond the first calculating

Page 319: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

302 Pricing Derivative Securities (2nd Ed.)

EPn−1(St1 , T − t1)1[Bt1 ,∞)(St1) requires an additional iteration of numeri-cal integration. Geske and Johnson suggest using Richardson extrapolationto extend to P∞ from a short sequence of estimates. Letting hj = T/j forj ∈ 1, 2, . . . , n and representing Pj(S0, T ) temporarily as P (hj), we recallfrom section 5.5.2 that Richardson extrapolation proceeds by approximat-ing P (h) as a polynomial,

P (h) = a0 + a1h + a2h2 + · · · + an−1h

n−1,

in which a0 corresponds to P (0) or P∞. This is found by solving the n

equations corresponding to P (h1) through P (hn). The linear approximation(n = 2) works out to be

P∞(S0, T ) .= 2P (h2) − P (h1) ≡ 2P2(S0, T )− P1(S0, T ),

and the quadratic approximation is

P∞(S0, T ) .= 9P3(S0, T )/2− 4P2(S0, T ) + P1(S0, T )/2.

Computations beyond n = 3 are very slow because of the need to computeP4, P5, . . . , but (depending on σ and T ) the results can be quite accurateeven for n = 2. While the Geske-Johnson method is usually faster than astandard binomial tree of the same accuracy, Broadie and Detemple (1996)and Ju (1998) find that a still better speed/accuracy frontier is attained byapplying Richardson extrapolation to binomial estimates that start withBlack-Scholes values at the terminal nodes, as described in section 5.5.2.

The MacMillan Approximation

In a 1986 paper L.W. MacMillan proposed a very fast procedure thatexploits the decomposition (7.8) to produce a p.d.e. whose solution approx-imates the value of the early-exercise premium. Barone-Adesi and Whaley(1987) extend the method to allow for continuous dividends and (in their1988 paper) for dividends at discrete times. We consider just the case ofcontinuous dividends.

The idea is based on the observation that, since prices of both theEuropean and American puts satisfy (7.4), so must early-exercise premiumΠ(St, T − t) satisfy

−ΠT−t + ΠS(r − δ)St + ΠSSσ2S2t /2 − Πr = 0 (7.11)

subject to certain boundary conditions. However, since PA and PE sharethe same terminal condition, P (ST ) = (X − ST )+, the corresponding con-dition for the premium is simply Π(ST , 0) = 0. The plan is to simplify

Page 320: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 303

still more by a change of variable and an approximation that reduces theproblem to solving an ordinary differential equation in St alone.

Begin with the change of variable. Introducing the function K(T − t) ≡1 − e−r(T−t), write

Π(St, T − t) = f(St, K) · K,

thus defining a new function f . With this substitution, and using

K ′ ≡ dK(T − t)/d(T − t) = r(1 − K),

(7.11) becomes

−r(1 − K)KfK − fr + KfS(r − δ)St + KfSSσ2S2t /2 = 0. (7.12)

The time-dependence of f is expressed here explicitly only through the firstterm, and this term is small under two circumstances: (i) when T −t

.= 0, inwhich case 1−K

.= 0, or (ii) when T −t is large, in which case f is relativelyinsensitive to time dimension and fK

.= 0. MacMillan’s approximation issimply to drop the first term. This leaves just the ordinary differentialequation

−fr + KfS(r − δ)St + KfSSσ2S2t /2 = 0, (7.13)

which is to be solved for f(·, K). The relevant boundary conditions are(i) f ≥ 0, since an American put can be worth no less than a European;(ii) limSt→∞ f = 0, since both American and European options are worth-less in the limit; and (iii) that f is bounded on [Bt,∞). Terminal condi-tion Π(ST , 0) = 0 is automatically satisfied by the specification of K andimposes no condition on f . When K = 0 (7.13) is a second-order ordinarydifferential equation with variable coefficients, of which any function of theform f = cS−u

t is a solution.5 Substituting, one finds that there are tworoots,

u±(K) =12

−1 + 2(r − δ)/σ2 ±

√[1 − 2(r − δ)/σ2]2 + 8r/(Kσ2)

,

so that the general solution is

f(S, K) = c+(K)S−u+(K)t + c−(K)S−u−(K)

t .

5K = 0 requires either T − t = 0 or r = 0. In the first case the option’s value is givenby the terminal condition. The other case, which is for practical purposes irrelevant, istreated by MacMillan (1986) in an appendix.

Page 321: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

304 Pricing Derivative Securities (2nd Ed.)

Boundary condition (ii) rules out negative root u−, whereas condition (iii)still allows positive root u+. Dropping the “+” and letting u(K), c(K) cor-respond to the positive root, one sees that all three conditions are met byf(St, K) = c(K)S−u(K)

t , so that

PA(St, T − t) =

X − St, St < Bt

PE(St, T − t) + Kc(K)S−u(K)t , St ≥ Bt

. (7.14)

Coefficient c(K) and critical exercise price Bt are found from conditionsimposed by (7.1) and (7.2):

PE(Bt, T − t) + Kc(K)B−u(K)t = X − Bt (7.15)

PES (Bt, T − t) − Kc(K)u(K)B−u(K)−1

t = −1. (7.16)

Eliminating c(K) gives a nonlinear equation for Bt,

Bt =u(K)

[X − PE(Bt, T − t)

]1 + u(K) + PE

S (Bt, T − t),

which must be solved numerically. For this, PE(Bt, T −t) is calculated fromthe Black-Scholes formula,

PE(Bt, T − t) = BXΦ[q+

(BX

Bte−δ(T−t)

)]−Bte

−δ(T−t)Φ[q−

(BX

Bte−δ(T−t)

)],

where q+ and q− are given in (6.24), and

PES (Bt, T − t) = −Φ

[q−

(BX

Bte−δ(T−t)

)].

With Bt determined equations (7.15) and (7.16) give an explicit solutionfor c(K):

c(K) = B1+u(K)t

1 + PES (Bt, T − t)Ku(K)

.

These values can be calculated very quickly, and findings reported byMacMillan (1986) and Barone-Adesi and Whaley (1987) indicate that theiraccuracy compares favorably over an extensive range of parameter valueswith 3-stage Geske-Johnson (1984) estimates and numerical solutions afterBrennan and Schwartz (1977).

Page 322: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 305

It is interesting that the approximation (7.14) becomes exact as T → ∞and K → 1, since (7.12) and (7.13) then coincide. Since limT→∞ PE(Bt, T−t) = limT→∞ PE

S (Bt, T − t) = 0, (7.15) and (7.16) yield explicit solutions:

Bt =u(1)X

1 + u(1)

c(K) =X

[1 + u(1)]B

u(1)t

and

PA(St, T − t) =X

1 + u(1)

(St

Bt

)−u(1)

.

Since u(1) > 0, the value function is hyperbolic above the critical exerciseprice. We will get the same result later by solving directly the infinite-horizon version of (7.4).

The Multipiece Exponential Approximation

It appears that the current state of the art in pricing American puts underBlack-Scholes dynamics is an approach proposed by Nengjiu Ju (1998). Thisis based on the following characterization of the early-exercise premium,developed by Kim (1990), Carr et al. (1992), and others:

Π(St, T − t) = Et

[∫ T

t

e−r(u−t) (rX − δSu)1[0,Bu)(Su) · du

]. (7.17)

This amounts to the difference between the value of the earnings frominvesting the strike price after early exercise and the value of the dividendsforgone by putting the stock. Carr et al. (1992) give an intuitive explana-tion, involving a strategy that effectively converts the American put into aEuropean by stripping away the potential gain from early exercise.

The idea is easiest to grasp if one begins with a degenerate case in whichboth interest rate and dividend rate are zero. Start at t by buying a “live”American option at price PA(St, T − t)—that is, one for which St is abovethe exercise boundary. So long as S stays above the boundary, continue tohold the option. If it crosses the boundary, exercise the option, borrowingthe share of stock that is put to the writer and receiving the strike priceX in cash. This involves no net outlay. If the stock’s price recrosses theboundary, use the cash/stock account to repurchase the option, again forno net outlay. In this way one holds the option when S is above the exerciseboundary, and one holds an account worth X − S when it is below the

Page 323: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

306 Pricing Derivative Securities (2nd Ed.)

boundary. At t = T the position has the same value as a European option,(X−ST )+. Since the strategy replicates the European put, is self-financing,and costs PA(St, T − t) initially, we conclude that American and Europeanputs have the same value if r = δ = 0.

Now try to follow the same strategy in the relevant case that r > 0 and δ

is arbitrary. Again, start by buying a live American option for PA(St, T−t),and again exercise if S crosses the boundary, shorting a share of stock andinvesting the strike in riskless bonds. The difference now is that the valueof this account is no longer X − S because interest accrues on the bondsand dividends have to be constantly repaid to the lender of the borrowedstock. During an interval [u, u+du) in which S is below the boundary theseeffects change the value of the account by (rX−δSu)·du. The present valueof this prospective increment at t is e−r(u−t)(rX − δSu) · du. The strategystill yields the European put’s terminal payoff, (X − ST )+, on any path,but a given path Su(ω)t≤u≤T generates in addition a cash stream worth∫ T

t

e−r(u−t)[rX − δSu(ω)]1[0,Bu)[Su(ω)] · du.

Averaging over paths (in the risk-neutral measure) gives (7.17) as the addi-tional value of the American put over the European. Working out the inte-gral yields a relatively simple expression for the exercise premium:

Π(St, T − t) =∫ T

t

e−r(u−t)rXΦ[q+(e−(r−δ)(u−t)Bu/St)] · du

−∫ T

t

e−δ(u−t)δStΦ[q−(e−(r−δ)(u−t)Bu/St)] · du,

where

q±(x) ≡ lnx ± σ2(u − t)/2σ√

u − t.

This is easily calculated once the boundary function Bu is known. WhileBu can be solved for in principle from the equation

PA(Bt, T − t) = Bt − X = PE(Bt, T − t) + Π(Bt, T − t),

this is very difficult to do in practice.Ju (1998) proposes to break the time to expiration into subintervals

[tj , tj+1)n−1j=0 with t0 = t and tn ≡ T and to approximate Bu within

[tj , tj+1) by an exponential, Bu = αjeβju. When Bu is of this form, the

integrals involved in Π(St, T − t) can be expressed explicitly in terms ofunivariate-normal c.d.f.s, which can be calculated very quickly. The con-stants αj and βj are found by applying PA(Bt, T − t) = Bt − X and

Page 324: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 307

high-contact condition (7.2) at tj . One may consult the original article forthe lengthy formulas involved in the approach. Ju gives extensive compar-isons with other methods, including Geske-Johnson, the standard binomial,and Broadie and Detemple’s (1996) smoothed binomial that starts withBlack-Scholes prices at the terminal nodes. Using a three-piece exponentialapproximation to Bt, Ju’s method seems to dominate in terms of speedand accuracy.6

Extension to American Calls

A neat symmetry relation immediately extends to American calls all themethods that apply to American puts on assets that may pay continuous,proportional dividends. In chapter 6 we demonstrated that under Black-Scholes dynamics the price of either the European put or the call couldbe obtained from the other by permuting arguments St and X , r and δ.Specifically, if we write CE(St, T − t; X, r, δ) and PE(St, T − t; X, r, δ) forT -expiring European options struck at X on an asset paying dividends atrate δ, then

CE(St, T − t; X, r, δ) = PE(X, T − t; St, δ, r).

We now show formally that the same relation holds for American optionsCA and PA. Also, representing the critical exercise boundaries for calls andputs as BC

t = βCt X and BP

t = βPt X , it will turn out that βC

t = 1/βPt .

Start with the characterization of American puts and calls as

PA(St, T − t; X, r, δ) = maxβP

E[e−rUP

(X − SUP )+ | Ft],

UP ≡ infu ∈ [t, T ] : Su < βPu X,

6The ongoing search for better approximations to values of American options underBlack-Scholes dynamics is, without question, a worthy intellectual challenge; and fast andaccurate methods are clearly of value to institutions that must price and hedge many ofthese instruments in the course of a business day. However, one wonders if intellectualcapital might not now be better spent elsewhere, given the limited descriptive accuracy ofBlack-Scholes dynamics for financial assets. Within this framework, the simple smoothed-binomial scheme of Broadie and Detemple (1996) appears perfectly adequate for mostpurposes, as Ju himself suggests in the following statement (p. 641):

“When the requirement of accuracy is not too stringent, [the Broadie-Detemple

scheme] could be the choice of method in many applications because it is simple,fast, and easy to program.”

Page 325: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

308 Pricing Derivative Securities (2nd Ed.)

and

CA(St, T − t; X, r, δ) = maxβC

E[e−rUC

(SUC − X)+ | Ft],

UC ≡ infu ∈ [t, T ] : Su > βCu X,

where UP = T , UC = T if the respective boundaries are not crossed by T .These represent the prices of American options as solutions to the prob-lem of finding the optimal exercise boundary. Let Yu ≡ Sue(δ−r)ut≤u≤T ,noting that this is a martingale under P since E|Yu| < ∞ and EtYu =Ste

(δ−r)t = Yt for u ≥ t. Then

PA(Yt, T − t; X, r, δ) = maxβP

Et(e−rUP

X − e−δUP

YUP )+,

UP = infu ∈ [t, T ] : Yu < βPu Xe(δ−r)u.

Permuting the arguments,

PA(X, T − t; Yt, δ, r) = maxβP

Et(e−δUP

YUP − e−rUP

X)+,

UP = infu ∈ [t, T ] : X < βPu Yue(r−δ)u

= infu ∈ [t, T ] : Yue(r−δ)u ≥ X/βPu ,

or

PA(X, T − t; Yt, δ, r) = maxβC

Ee−rUC

(SUC − X)+,

UC = infu ∈ [t, T ] : Yue(r−δ)u ≡ Su ≥ βCu X,

with βC = 1/βP . Thus, as promised,

CA(St, T − t; X, r, δ) = PA(X, T − t; St, δ, r). (7.18)

7.1.3 Indefinitely Lived American Options

Paradoxically, the problem of valuing American-style derivatives is ofteneasier when there is no fixed termination date. When the underlying followsgeometric Brownian motion with time-invariant volatility, and when theinstantaneous spot rate and dividend rates are also constant through time,the value of the derivative itself becomes time-invariant. The fundamentalp.d.e. that describes its dynamics then becomes an ordinary differentialequation. If the boundary conditions for the particular derivative are nottoo difficult, there may be an elementary solution. We demonstrate thetechniques through several examples, which are of interest in their own

Page 326: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 309

right. In each case the short rate and dividend rate are constants, r and δ,and the dynamics of the underlying under the risk-neutral measure are

dSt = (r − δ)St · dt + σSt · dWt.

1. Let PA(St,∞) be the value of an indefinitely lived American put struckat X. Since this does not depend on time independently of St, the Black-Scholes p.d.e. becomes

PAS (r − δ)St + PA

SSσ2S2t /2 − PAr = 0, (7.19)

which is a second-order ordinary differential equation with variable coef-ficients. The American put will be exercised when the stock’s price fallsto a critical level B that is now time-invariant. This imposes a lowerboundary condition, PA(B,∞) = X − B, and there is the usual upperbound, limSt→∞ PA(St,∞) = 0. A solution to (7.19) will be a func-tion of the form PA(St,∞) = cS−u

t for appropriate constants c and u.Substituting and simplifying shows that there are two roots,

u, u′ =r − δ − σ2/2 ±

√(r − δ − σ2/2)2 + 2rσ2

σ2,

with u′ < 0 and u positive or zero according as r is positive or zero. Thegeneral solution is therefore

PA (St,∞) = cS−ut + c′S−u′

t ,

but since the upper boundary condition is inconsistent with the negativeroot, we conclude that c′ = 0 and PA(St,∞) = cS−u

t . Applying thelower boundary condition gives c = (X − B)Bu, whence

PA(St,∞) = (X − B)(St/B)−u.

The maximum value of PA is attained at

B =Xu

1 + u.

Notice that this is consistentwith high-contact conditionPAS (B,∞) =−1.

Assuming that the put is live to begin with (i.e., that St ≥ B) thesolution for the put’s current value is

PA(St,∞) =X

1 + u(St/B)−u. (7.20)

Recall that this is the same result obtained in the limit as T → ∞ fromMacMillan ’s (1986) approximation for the finite-lived put. Notice, too,that X − St ≤ PA(St,∞) ≤ X , consistent with the bounds derived insection 4.3.3.

Page 327: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

310 Pricing Derivative Securities (2nd Ed.)

2. Let CA(St,∞) be the value of a perpetual American call with strike X.

Lower and upper boundary conditions are CA(0,∞) = 0 and

CA(B,∞) = B − X,

where B is the critical exercise price. This satisfies an ordinary differ-ential equation of the same form as (7.19),

CAS (r − δ)St + CA

SSσ2S2t /2 − CAr = 0, (7.21)

of which the general solution is

CA(St,∞) = cSut + c′Su′

t ,

with

u, u′ =−r + δ + σ2/2 ±

√(r − δ − σ2/2)2 + 2rσ2

σ2.

The lower and upper boundary conditions require c′ = 0 and c = (B −X)B−u, respectively, so that, assuming St ≤ B,

CA(St,∞) = (B − X)(St/B)u. (7.22)

Algebra shows that u > 1 when δ > 0 and that u = 1 when δ = 0. Whenδ > 0 and X > 0, CA(St,∞) is maximized by taking

B =u

u − 1X,

which is the same value inferred from high-contact conditionCA

S (B,∞) = 1. When δ = 0 there is no unique maximum; instead,CA(St,∞) ↑ St as B → ∞. This tells us that a perpetual American callon a no-dividend stock will never be exercised and that its value equalsthat of the stock regardless of the strike price.

A little algebra shows that switching X for St and δ for r in (7.22)gives (7.20), so that

CA(X,∞; St, δ, r) = PA(St,∞; X, r, δ),

corresponding to the symmetry relation for finite-lived American optionsin (7.18).

3. Let C(St,∞; K) be an indefinitely lived “capped” call option with strikeX and cap K > X . The capped call pays K−X and terminates as soonas S hits the cap, but it cannot be exercised voluntarily. The problem isthe same as for the infinite American call except that the upper bound-ary condition is C(K,∞; K) = K −X. Thus, assuming St < K initially,

C(St,∞; K) = (K − X)(St/K)u,

where u is the same as in the previous example.

Page 328: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 311

4. Let CA(St,∞; L) be an indefinitely lived American “down-and-out” callwith strike X and knockout price L ∈ (0, X). This option can be exer-cised at any time, but it terminates and becomes worthless when St

first hits L. The boundary conditions are now CA(L,∞; L) = 0 andCA(B,∞; L) = B − X for some critical exercise price B. Again, thegeneral solution to (7.21) is

CA(St,∞; L) = cSut + c′Su′

t ,

where u and u′ are as in the second example. Here the boundary con-ditions do not rule out either term, but they determine c and c′ assolutions to

cLu + c′Lu′= 0

cBu + cBu′= B − X.

When St > L > 0 the particular solution is

CA(St,∞; L) =(

B − X

BuLu′ − Bu′Lu

)(Lu′

Sut − LuSu′

t ).

To complete the problem, the first factor is maximized numerically withrespect to the optimal exercise price, B.

7.2 Compound and Extendable Options

This section treats European-style derivatives with special features thatmake their valuation more difficult: (i) options written on underlying finan-cial assets that are either options themselves or portfolios of options andreal assets; and (ii) options whose lives are extended one or more timesif not in the money at an expiry date. In the simple framework of Black-Scholes dynamics it is often possible to develop tractable computationalformulas even in such nonstandard cases.

7.2.1 Options on Options

Black and Scholes (1973) pointed out that a common stock itself has option-like characteristics. Take for example a stylized, one-period firm that hasdebt (i.e., liabilities) in the form of zero-coupon bonds paying face value L

at T . Let AT represent the value of the firm’s assets at T . The aggregatevalue of the common is then ST = (AT − L)+. That is, having limitedliability, shareholders simply turn assets over to the bondholders if the firm

Page 329: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

312 Pricing Derivative Securities (2nd Ed.)

is insolvent; otherwise, they pay off the debt and claim the residual. Thismakes the firm’s stock a call option on the assets, and it makes an option onthe stock an option on an option, or a “compound option”. We develop ananalytical formula for the value of a compound call like that first presentedby Geske (1979) and then give a corresponding expression for a put on a call.

A related pricing problem has attracted interest because of thetechnology-driven boom in U.S. stocks during the 1990s. During that periodMicrosoft, Intel, and several other successful firms began to take large posi-tions in options on their own stocks. The primary strategy involved sellingputs, the goal being to enhance earnings by collecting premiums on optionsthat would expire worthless if the stocks’ prices continued to rise. Of course,the firms were obligated to buy their own stock at above-market prices ifand when the puts were exercised. Just as happens when a firm incurs debt,issuing these “inside” puts creates opportunities and liabilities that changethe dynamics of the price of its stock. Similar issues arise when firms issuewarrants (long-term calls on their own stock), when they compensate execu-tives with inside call options, and when they use derivatives either to hedgeexposure to various business risks or to take affirmative positions. In thesecond part of this section we will see how the writing of inside Europeanputs affects the price of the firm’s stock and the prices of other derivatives.

Compound Options

Following Geske (1979) we consider first a European-style call with strikeX and expiration date t∗ on the common stock of a firm having debt L thatcomes due at some T ≥ t∗. We view this as a compound call, whose value,CC(·, ·), is to be determined. Assume now that the process representing thevalue of the firm’s assets, At, follows geometric Brownian motion and thatthe firm pays no dividend on the stock. Both L and A are expressed pershare of common. To simplify notation, take the current time to be t = 0.Then under the risk-neutral measure the distribution of At∗ is lognormalwith c.d.f.

F (at∗) = Φ[q+(at∗e−rt∗/A0; t∗)], (7.23)

where for arbitrary positive x and τ

q±(x; τ) ≡ lnx ± σ2τ/2σ√

τ.

Page 330: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 313

Regarding each share of stock as a European call with strike L on the firm’sassets, its value at t∗ is given by Black-Scholes as

St∗ = CE(At∗ , T − t∗)

= At∗Φ[q+(At∗er(T−t∗)/L; T − t∗)]

− e−r(T−t∗)LΦ[q−(At∗er(T−t∗)/L; T − t∗)].

Let aX be the value of assets that puts the compound call just at themoney at t∗. That is, aX satisfies CE(aX , T − t∗) = X . Assuming thatX > 0, a solution necessarily exists and is unique because CE increasesmonotonically from zero with the price of the underlying. Martingale pricingthen sets the value of the compound call at

CC(A0, t∗) = e−rt∗

∫ ∞

aX

[CE(at∗ , T − t∗) − X ] · dF (at∗). (7.24)

Notice that if t∗ = T, then CE(aX , T −t∗) = (aX−L)+ so that aX = L+X.

In this case (7.24) becomes

e−rT

∫ ∞

L+X

[aT − (L + X)] · dF (aT ) = e−rT E[AT − (L + X)]+,

which is given by the Black-Scholes formula with L + X as the strike price.We concentrate on the case t∗ < T to avoid this triviality.

Breaking (7.24) into two terms and expressing e−rt∗X [1− F(aX)] using(7.23) give

e−rt∗∫ ∞

aX

CE(at∗ , T−t∗)·dF (at∗)−e−rt∗XΦ[q−(A0ert∗/aX ; t∗)]. (7.25)

For the first term, express CE(at∗ , T − t∗) via the Black-Scholes formulato get

e−rt∗∫ ∞

aX

at∗Φ[q+(at∗e−r(T−t∗)/L; T − t∗)] · dΦ[q+(at∗e

−rt∗/A0; t∗)]

− e−rT L

∫ ∞

aX

Φ[q−(at∗e−r(T−t∗)/L; T − t∗)] · dΦ[q+(at∗e−rt∗/A0; t∗)].

Changing variables in the first integral as

at∗ → A0 exp(rt∗ + σ2t∗/2 − zσ

√t∗)

Page 331: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

314 Pricing Derivative Securities (2nd Ed.)

and in the second as

at∗ → A0 exp(rt∗ − σ2t∗/2 − zσ

√t∗)

reduces the above to

A0

∫ q+„

A0ert∗aX

;t∗«

−∞Φ

ln(

A0erT

L

)+ σ2T

2

σ√

T − t∗− z

√t∗

T − t∗

· dΦ(z)

− L

erT

∫ q−(

A0ert∗aX

;t∗)

−∞Φ

ln(

A0erT

L

)− σ2T

2

σ√

T − t∗− z

√t∗

T − t∗

· dΦ(z) (7.26)

This form is computationally easy using a routine like GAUSSINT thatfinds expectations of functions of standard normals, but there is a betterrepresentation in terms of a bivariate normal c.d.f. For this we need theresult from section 2.2.7 that for arbitrary constants α, β, γ∫ γ

−∞Φ(α − βy) · dΦ(y) = Φ

(γ, α/

√1 + β2; β/

√1 + β2

), (7.27)

where Φ(x, y; ρ) represents the bivariate normal c.d.f. Using (7.27) toexpress the integrals in the expression for the first term of (7.25) givesfor the value of the compound call

CC(A0, t∗) = A0Φ

[q+(A0e

rt∗/aX ; t∗), q+(A0erT /L; T );

√t∗/T

]−e−rT LΦ

[q−(A0e

rt∗/aX ; t∗), q−(A0erT /L; T );

√t∗/T

]−e−rt∗XΦ[q−(A0e

rt∗/aX ; t∗)].

The arbitrage-free price of a put option on an underlying call comes fromput-call parity. One who is long the compound call at t∗ and short thecorresponding compound put receives for sure the difference between thevanilla European call on the assets and the strike; therefore,

e−rt∗E[CC(At∗ , 0) − PC(At∗ , 0)] = e−rt∗E[CE(At∗ , T − t∗) − X ].

Since normalized values of all these instruments are martingales under P,it follows that

CC(A0, t∗) − PC(A0, t

∗) = CE(A0, T ) − e−rt∗X,

Page 332: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 315

so that an exact parity relation holds for these compound options, just asfor European options. Written out, the solution for PC(A0, t

∗) is

e−rT L

Φ[q−

(A0e

rT

L; T

)]−Φ

[q−

(A0e

rt∗

aX; t∗

), q−

(A0e

rT

L; T

);

√t∗

T

]

−A0

Φ[q+

(A0e

rT

L; T

)]−Φ

[q+

(A0e

rt∗

aX; t∗

), q+

(A0e

rT

L; T

);

√t∗

T

]

+ e−rt∗XΦ[q+

(aXe−rt∗

A0; t∗

)].

Good approximations are available that make it easy to evaluate thebivariate normal c.d.f. that appears in these formulas. Mee and Owen (1983)show that the following delivers mean absolute error over 2 of less than1.0% when |ρ| is less than about 0.9:

Φ(x, y; ρ) .= Φ(x)Φ(

y − α

β

), (7.28)

where

α = −ρφ(x)/Φ(x),

β = 1 + ρxα − α2.

While this should be adequate for most financial applications, the error maybe unacceptable when |ρ| is close to unity, as it would be in the present casewhen t∗

.= T . For such cases Moskowitz and Tsai (1989) have developeda procedure based on the decomposition of the bivariate normal as theproduct of conditional and marginal distributions, using second- or third-degree polynomials to approximate the conditional c.d.f. Both quadraticand cubic versions deliver lower mean absolute error than (7.28) for |ρ| > 0.6and have particularly high accuracy for |ρ| near unity.

“Inside” Options

It is clear that writing puts on a firm’s own stock is a strategy that enhancesvolatility. When the firm does well and the price of its stock rises, extrarevenue will have been gained from the sale of out-of-money puts that

Page 333: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

316 Pricing Derivative Securities (2nd Ed.)

ultimately expire worthless. On the other hand, a firm with declining rev-enues and stock prices might have to buy back its own shares at a premiumover market. The practice of writing inside puts (also called put “warrants”)also gives managers some perverse incentives, such as delaying the releaseof negative information or altering dividend policy. Abstracting from suchpotential manipulative effects, let us see how the writing of European putsaffects the dynamics of the firm’s equity and the prices of derivatives.

To simplify, consider a firm having no debt and no initial position inoptions, whose equity consists entirely of common stock, on which it paysno dividends. Under these conditions the price of a share of stock equals themarket value of assets per share; that is, St = At. At time t∗ the firm sells anumber of European puts amounting to p < 1 options per share of common.(Selling more would be infeasible, since exercise would require the firm torepurchase all its outstanding shares.) The new inside puts are struck atX , expire at T , and have initial value P I(At∗ , T − t∗; X). The transactionhas no immediate effect on shareholders’ equity, since the acquired shortposition in puts is exactly compensated by the premiums received. Fromthen on, at any t ∈ [t∗, T ], the value of a share of stock equals the value ofassets per share plus the proceeds from selling the puts minus their currentvalue:

St = At + p[P I(St∗ , T − t∗; X) − P I(St, T − t; X)]

= A+t − pP I(St, T − t; X), (7.29)

where A+t ≡ At + pP I(St∗ , T − t∗; X) is the augmented value of assets per

share.Table 7.1 shows the values of the put and the stock at T for various

values of A+T . To explain the entries, if A+

T < pX the firm will expendall its assets toward meeting its obligation to pay the strike price, leavingnothing for shareholders. When A+

T ∈ [pX, X ] the assets will fully satisfy

Table 7.1. Values of inside put andstock vs. values of augmented assets.

Range of A+T P I(ST , 0; X) ST

[0, pX)A+

Tp

0

[pX, X]X−A+

T1−p

A+T−pX

1−p

(X,∞) 0 A+T

Page 334: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 317

the obligation, in which case

P I(ST , 0; X) = X − [A+T − pP I(ST , 0; X)]

=X − A+

T

1 − p

and

ST = A+T − pP I(ST , 0; X)

=A+

T − pX

1 − p.

Notice that exercise of the inside puts has opposing effects on the price ofthe stock: the expenditure of assets reduces the aggregate value of stockby pX times the original number of shares outstanding, but the retiringof shares moderates the decline in price per share. Finally, when A+

T > X

the puts expire worthless, giving shareholders sole claim to the augmentedassets. Since the sign of A+

T −X is the same as that of ST −X , the conditionfor exercise can be stated equivalently in terms of A+

T or ST .Assume now that the proceeds from selling the puts are invested in

the core business and that this additional investment does not alter thestochastic process for the value of assets, which we take to be geometricBrownian motion; that is,

dA+t /A+

t = dAt/At = µt · dt + σ · dWt.

The fact that the firm’s augmented assets can be still replicated by a port-folio of the common stock and riskless bonds justifies changing to the equiv-alent martingale measure, replacing µt with rt and Wt with Wt. The con-ditional c.d.f. of A+

T under P is therefore

Ft(a) = Φ[q+(Ba/A+t )] ≡ Φ

[ln(Ba/A+

t ) + σ2(T − t)/2σ√

T − t

],

where B ≡ B(t, T ).The put’s value at t ∈ [t∗, T ] is then

P I(St, T − t; X) = B · EtPI(ST , 0; X)

=B

p

∫ pX

0

a · dFt(a) +B

1 − p

∫ X

pX

(X − a) · dFt(a).

Page 335: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

318 Pricing Derivative Securities (2nd Ed.)

A bit of algebra shows that this can be written in terms of ordinary outsideEuropean puts on the augmented assets:

P I(St, T − t; X) =1

1 − p

[PE(A+

t , T − t; X) − 1pPE(A+

t , T − t; pX)]

.

(7.30)Note that the convexity of the Black-Scholes put price with respect to X

(recall that PEXX > 0) guarantees that P I(St, T − t; X) ≥ 0, with equality

if and only if X = 0.7

The intuitive content of (7.30) is as follows. The put on the stock itselfdiffers from a put on the assets in two respects. First, if the puts are exer-cised the value of the stock declines, since shareholders have residual claimafter put holders. This feature adds value to the inside put on the stock rel-ative to a put on the assets themselves, accounting for the factor (1− p)−1

in (7.30). Second, the stockholders have a trick up their sleeves that reducesthe value of the inside put. The trick is that their liability to the put hold-ers is limited to the value of the firms’ assets should that value fall belowpX . In this sense, shareholders themselves have an option to discharge theirobligation by putting the assets to holders of the inside put at strike pricepX . Holders of the inside put are, in effect, short this secondary optionof shareholders, since these two parties are the only claimants to assets.The value of this secondary option (per unit of inside puts) is precisely thesecond term inside the brackets.

Applying risk-neutral pricing to the stock itself,

St = BEt[A+T − pPE(ST , 0; X)]

= B

∫ X

pX

a − pX

1 − pdFt(a) + B

∫ ∞

X

a · dFt(a)

= B

∫ ∞

0

a · dFt(a)

− B

1 − p

[p

∫ X

0

(X − a) · dFt(a) −∫ pX

0

(pX − a) · dFt(a)

]= A+

t − pPE(St, T − t; X),

which agrees with (7.29).

7By Jensen’s inequality (suppressing arguments of P E(·, ·; ·) other than the strike),P E(pX) = P E [(1 − p) · 0 + p · X] ≤ (1 − p)P E(0) + pP E(X) = pP E(X), so thatp−1P E(pX) ≤ P E(X).

Page 336: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 319

0.1 0.3 0.5 0.7 0.9 1.1 1.3 1.5

Terminal / Initial Value of Assets

0.8

1.0

1.2

1.4

1.6

1.8

2.0

Rela

tive P

rice

of P

uts

p=.10

p=.30

p=.50

Fig. 7.3. Relative values, inside/outside puts, vs. terminal/initial value of assets.

Figure 7.3 illustrates the relation between values of inside and outsideputs having the same strike price and time to expiration. It was constructedin the following way. At time t∗, having assets worth At∗ = 1.1 currencyunits per share, the firm was assumed to have sold p puts per share of stock,each struck at X = 1.0 and having T − t∗ = 0.5 years to maturity. Theproceeds of these sales for each of p = 0.1, 0.3, and 0.5 were calculated from(7.30) (with St∗ = 1.1, T − t∗ = 0.5, X = 1.0, r = 0.05, σ = 0.30) to be0.0043 , 0.0167, and 0.0390 per share, respectively. Adding each of thesein turn to the initial value of assets to create A+

t∗ , time was advanced tot = t∗ + .25. For values of A+

t ranging from 0.05A+t∗ to 1.5A+

t∗ the valuesof the inside puts, P I(St, 0.25; 1.0), were calculated from (7.30). Likewisefor values of At ranging from 0.25At∗ to 1.5At∗ values of outside Europeanputs on the per-share value of assets, PE(At, 0.25; 1.0), were calculated fromthe Black-Scholes formula (again with r = 0.05 and σ = 0.30). The plotfor each value of p shows how the ratio P I(St,0.25; 1.0)÷ PE(At, 0.25; 1.0)varies with the relative change in asset values over the 0.25-year intervalfrom t∗ to t.

Page 337: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

320 Pricing Derivative Securities (2nd Ed.)

Figure 7.3 shows that an inside put is ordinarily worth more than onewritten by an outsider in the absence of such activity by the firm. This canbe attributed to the increase in volatility of the stock when the firm takessuch positions, as documented below. There are, however, two exceptionalcircumstances under which the inside puts are worth less. The first is whenthe firm’s assets have fallen near enough to pX that put holders may notreceive full intrinsic value at expiration. At the other extreme, when assetsgrow very rapidly the additional funds raised by selling the puts multiply tothe point that the options have much less chance of expiring in the money.

To see how the sale of puts affects the stock’s volatility, apply Ito’sformula to St to obtain

dSt/St = r · dt + σS(A+t , t) · dWt,

where

σS(A+t , t)/σ =

A+t − p

1−pA+t PE

A (A+t , T − t; X) + 1

1−pA+t PE

A (A+t , T − t; pX)

A+t − p

1−pPE(A+t , T − t; X) + 1

1−pPE(A+t , T − t; pX)

and subscripts on the P ’s indicate partial derivatives. Next, express thevalues of PE in the denominator using the identity PE(A, τ ; X) = PE

A A +PE

X X . Finally, noting that

∂PE(A+t , T − t; X)/∂X = Φ[q+(BX/A+

t )]

> Φ[q+(BpX/A+t )]

= ∂PE(A+t , T − t; pX)/∂(pX),

one sees that σS(A+t , t) > σ for each A+

t > 0 and each t ∈ (t∗, T ). Therefore,the practice of writing inside puts does indeed enhance volatility. Bergmanet al. (1996) derive inequalities for prices of European options on assetswhose prices follow diffusion processes with volatilities depending on theunderlying price and time. They show that an increase in the volatilityfunction necessarily raises prices of all European options. Therefore, underthe conditions maintained here we conclude that a firm’s strategy of sellingputs on its own stock increases the prices of outside derivatives.

The price, D(St, TO − t), of some such generic outside European-stylederivative expiring at TO ≤ T can be evaluated as

D(St, TO − t) = B(t, TO)EtD(STO , 0)

= B(t, TO)EtD[A+TO

− pP I(STO , T − t; X), 0].

Page 338: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 321

For example a European call struck at XO would be valued as

B(t, TO)Et

[A+

TO−

pPE(A+TO

, T − t; X)1 − p

+PE(A+

TO, T − t; pX)

1 − p− XO

]+

.

Explicit expressions can be found for values of outside derivatives thatexpire at the same time as the inside puts. Following Qiu (2004), the valueof an outside European-style put written on the stock of a firm with insideputs having the same strike is

PE(St, T − t; X) = B(t, T )

[∫ pX

0

X · dFt(a) +∫ X

pX

X − a

1 − p· dFt(a)

],

which reduces to

PE(St, T−t; X) = (1−p)−1[PE(A+t , T−t; X)−PE(A+

t , T−t; pX)]. (7.31)

Comparing with (7.30) shows that the value of the outside put exceeds thatof the inside option by

PE(St, T − t; X) − P I(St, T − t; X) = p−1PE(A+t , T − t; pX).

Generalizing (7.31) to allow different strike prices, XI and XO,

PE(St, T − t; XO) = PE(A+t , T − t; XO) +

p

1 − pPE(A+

t , T − t; XI ∧ XO)

− 11 − p

PE(A+t , T − t; pXI) +

p

1 − p(XI − XO)+∆(A+

t , T − t; XO),

where ∆(A+t , T − t; XO) is the value of a digital option that pays one cur-

rency unit at T if A+T < XO.

7.2.2 Options with Extra Lives

The discussion of compound options involved a stylized firm with debt L inthe form of discount bonds all maturing at time T . Stockholders in such afirm will exercise their collective European call option on the assets if AT −L > 0, paying off the bondholders and retaining the residual; otherwise,they will default on the loan and transfer the assets to creditors in partialpayment of what is owed. Let us consider how to develop a security thateffectively insures the bondholders against such default. Such a derivativewould pay L − AT per share if the firm were insolvent at T , otherwisenothing. Of course, this payment of (L−AT )+ is the same as that of a putoption struck at L. Therefore, an insurer of the firm’s creditors would, ineffect, be writing a European put.

Page 339: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

322 Pricing Derivative Securities (2nd Ed.)

This is essentially the position of the Federal Deposit Insurance Cor-poration (FDIC) when it insures deposits at U.S. commercial banks. Asthe bank’s creditors, depositors are endowed by the FDIC with a putoption struck at the nominal value of their deposits. For this continu-ing protection of its depositors the bank pays a recurring fee. Typicallythe FDIC (or other authority) assesses a bank’s condition twice per year,closing any bank found to be insolvent and paying off depositors, thusforcing the exercise of the put. Options for banks that are solvent are sim-ply extended until the next examination. Valuing deposit insurance forsome fixed term over which there are one or more intermediate exami-nations is like valuing a finitely extendable, European-style put option.We have already seen how to value finitely extendable puts within theBernoulli framework. Our first task in this section, following Longstaff(1990), will be to develop a computational formula for the value of a once-extendable put under Black-Scholes dynamics. In principle, the methodscould be applied to options with any finite number of extensions, althoughin practice the computations quickly get very hard. Moreover, thinkingagain of the application to deposit insurance, there is a related problemthat these methods cannot handle at all. Taking the longer view of theFDIC’s situation, it actually has an indefinite commitment to insure amember bank’s deposits, in that it must continue to provide the insur-ance so long as the bank is found solvent at the periodic examinations.Having this commitment is like being short an infinitely extendable putoption on the bank’s assets—one that will be exercised if and only if it isin the money at any future periodic examination. The second part of thesection describes a computational procedure for valuing such an indefinitelyextendable option.

Pricing Finitely Extendable Puts

Consider a European-style put that pays X − St∗ if in the money at t∗

and is otherwise extended once to expire unconditionally at T > t∗. Takingt = 0 as the current time, let PX(S0, t

∗) denote the current value of thisonce-extendable option. We will price this by finding the discounted expec-tation under P of the option’s value at t∗. The value at that time is easilydetermined. It is X −St∗ when St∗ is less than X ; otherwise, it is the valueof an ordinary European put with remaining life T − t∗. That is,

PX(St∗ , 0) = (X − St∗)+ + PE(St∗ , T − t∗)1[X,∞)(St∗).

Page 340: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 323

Relying on the martingale property under P, we find that

PX(S0, t∗) = e−rt∗E(X − St∗)+ + e−rt∗EPE(St∗ , T − t∗)1[X,∞)(St∗).

(7.32)The first term is just the value of a t∗-expiring European put, PE(S0, t

∗).The second term, which is the extra value of the extendable feature, is

e−rt∗∫ ∞

X

PE(st∗ , T − t∗) · dF (st∗), (7.33)

where

F (st∗) = Φ[q+(e−rt∗st∗/S0; t∗)]

and

q±(x; τ) =lnx ± σ2τ/2

σ√

τ.

Evaluating (7.33) is the part that takes some work, although it proceedsmuch as for compound options. First, write out PE(st∗ , T − t∗) from theBlack-Scholes formula and express F in terms of Φ(q+):

e−rT X

∫ ∞

X

Φ[q+(Xe−r(T−t∗)/st∗ ; T − t∗)] · dΦ[q+(st∗e−rt∗/S0; t∗)]

−e−rt∗∫ ∞

X

st∗Φ[q−(Xe−r(T−t∗)/st∗ ; T − t∗)] · dΦ[q+(st∗e−rt∗/S0; t∗)].

Next, change variables in the first integral as

st∗ → S0 exp(rt∗ − σ2t∗/2 − zσ√

t∗)

and in the second integral as

st∗ → S0 exp(rt∗ + σ2t∗/2 − zσ√

t∗).

This gives

e−rT X

∫ q−(S0ert∗/X;t∗)

−∞Φ

(ln(Xe−rT /S0) + σ2T/2

σ√

T − t∗+ z

√t∗

T − t∗

)· dΦ(z)

−S0

∫ q+(S0ert∗/X;t∗)

−∞Φ

(ln(Xe−rT /S0) − σ2T/2

σ√

T − t∗+ z

√t∗

T − t∗

)· dΦ(z).

As with compound options, these are reduced to expressions involving thebivariate c.d.f. using∫ γ

−∞Φ(α − βy) · dΦ(y) = Φ

(γ, α/

√1 + β2; β/

√1 + β2

).

Page 341: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

324 Pricing Derivative Securities (2nd Ed.)

Substituting for γ, α, β using symmetry relation Φ(−x, y;−ρ) = Φ(x, y; ρ)and assembling all the pieces from (7.32 ) give as the arbitrage-free valueof the once-extendable European put

PX(S0, t∗) = e−rt∗XΦ[q+(Xe−rt∗/S0; t∗)] − S0Φ[q−(Xe−rt∗/S0; t∗)]

+ e−rT XΦ[q+(Xe−rt∗/S0; t∗), q+(Xe−rT /S0; T );

√t∗/T

]−S0Φ

[q−(Xe−rt∗/S0; t∗), q−(Xe−rT /S0; T );

√t∗/T

].

Pricing the Indefinitely Extendable Put

Figure 7.4 illustrates the mechanics of an indefinitely extendable put optionand gives the first glimmer of an idea of how to price it. Depicted there arefour possible price paths for the underlying asset, starting off at t = 0from an out-of-money initial value S0. The height of each vertical bar is thestrike price, X . The bars are spaced at intervals equal to the period betweenextensions, which we take as the unit of time. The bars thus appear ashurdles that the price path must clear at unit intervals if the option is notto be terminated. St can dip arbitrarily far below X between exercise datesso long as it does not touch the absorbing barrier at the origin, but theoption remains alive at any t such that Sj ≥ X for each j ∈ 0, 1, . . . , [t].(The brackets denote “greatest integer”.) When Sj < X at some j, however,then the option pays X − Sj and dies.

An obvious way to price the put is by simulating price paths correspond-ing to the risk-neutral version of St, following each one either until it hitsa hurdle and generates a receipt or to a time sufficiently remote that a hitcould produce a payoff of only negligible discounted value. Averaging thediscounted values of receipts generated by a large sample of such paths givesan estimate of the option’s initial value as the discounted P-expected pay-off. We discuss simulation techniques at length in chapter 11 and encounterother applications in chapters 8–10, where we take up more complicateddynamics. Simulation is a feasible technique for valuation in many appli-cations, but, being slow, it is almost always a last resort. In this case con-templating those evenly spaced bars in figure 7.4 suggests a better way.

Represent the value of the put at an arbitrary t ≥ 0 as P∞(St, [t+1]−t),the second argument showing the time that remains until the next exercisedate. For now we concentrate on pricing the put at some integral valueof t—that is, at one of the exercise dates. The put’s current value is thenP∞(St, 1), and at the next exercise date it will be worth P∞(St+1, 1).

Page 342: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 325

Fig. 7.4. Price paths vs. strike hurdles for indefinitely extendable puts.

Now here is the key point. Assuming constant volatility, interest rates,dividend rate, and strike price, the Markov property of geometric Brownianmotion guarantees that the value at t will be the same function of the stock’sprice as at t + 1. This observation and the martingale paradigm then leadto the following recursive expression for the put’s value at any exercisedate—that is, at any t ∈ ℵ0:

P∞(St, 1) = (X − St)+ + BEP∞(St+1, 1) · 1[X,∞)(St),

where B ≡ B(t, t + 1) = e−r is the discount factor. In words, the putis worth its present intrinsic value if in the money; else, its value is thediscounted expectation under P of its value one period hence.

To save notation, we will value the put as of t = 0. If the underlyingpays dividends continuously at rate δ, the value then can be expressed as

P∞(S0, 1) = (X − S0)+ + BEP∞(S0er−δ−σ2/2+σZ , 1) · 1[X,∞)(S0),

where Z is distributed as standard normal. Since the put’s value necessarilylies between 0 and X , this is a contraction mapping from [0, X ] into itself.This guarantees that value function P∞(·, 1) at any exercise date can bediscovered by recursive programming.

The method is as follows. First, construct a trial function P0(·) on adiscrete domain of points s ∈ S, arrayed as a vector. The trial value should

Page 343: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

326 Pricing Derivative Securities (2nd Ed.)

be X − s for any s < X , but values at other points can be chosen freely.For example, one can use Black-Scholes values of one-period vanilla Euro-pean puts. Then, starting at the first out-of-money value in S (the firsts ≥ X), calculate BEP0 (ser−δ−σ2/2+σZ) by Gaussian quadrature or someother method of numerical integration. Gaussian quadrature (Press et al.,1992, chapter 4.5) is an efficient technique that works well for smooth func-tions.8 Any such method involves evaluating Z on a discrete set of pointsZ ≡ z1, . . . , zn, say. Since the corresponding values of ser−δ−σ2/2+σz forz ∈ Z will rarely be among the set S on which P0 was initially defined,values of P0 at those points must be found by interpolation. Having foundthe expectation at s, one then has the value at that point of an updatedfunction, P1. Continuing, P1 is evaluated at each of the remaining out-of-money points of S. Then, as for P0, one sets P1(s) = X − s for s < X . Thisprocess is repeated to produce P2, P3, and so on until at some stage m themaximum difference between Pm and Pm−1 reaches some desired tolerance.The boundedness of the put function assures eventual convergence.

Figure 7.5 illustrates the results of such calculations using r = δ = 0.0,X = 1.0, extensions at intervals of 0.5 years, and five values of σ (expressed

0.9 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.0

Underlying value

0.00

0.05

0.10

0.15

0.20

0.25

Put va

lue

σ = .50

σ = .40

σ = .30

σ = .20

σ = .10

Fig. 7.5. Indefinitely extendable puts valued recursively.

8We will see shortly that some smoothing technique must be applied for this to workwell here.

Page 344: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 327

in annual time units). Each curve depicts the put’s value as a functionof the initial value of the underlying asset, S0. Notice that these func-tions are not smooth. Indeed, they are discontinuous at S0 = X , becauselims↑X P∞(s, 1) = 0 whereas lims↓X P∞(s, 1) is the positive value of astill-viable instrument that could wind up in the money at a future exercisedate. Given that any method of numerical integration involves averagingthe function over a discrete domain, it is important to use a continuizationmethod like that applied in section 5.5.2 to value finitely extendable optionsin the binomial framework. The figure was produced using that method inconjunction with Gaussian quadrature.

Having found the value function at an extension date, we can value itat a nonintegral value of t as

P∞(St, [t + 1] − t) = e−r([t+1]−t)EtP∞(S[t+1], 1),

where the integration would again be done numerically after continuizingP∞(S[t+1], 1).

7.3 Other Path-Independent Claims

This section looks at certain other “exotic” derivative products whose pay-offs depend on prices of one or more underlying assets at a finite numberof points in time. These are distinguished from path-dependent derivatives(treated in the next section), whose payoffs cannot be determined withoutknowing the entire price path of the underlying. We continue to assume thatthe underlying follows geometric Brownian motion with constant volatilityσ and pays dividends (if at all) continuously at constant rate δ. The instan-taneous spot rate of interest is also a constant, r. However, the results inthis section are easily generalized to allow σ, δ, and r to be time-varying butdeterministic. Prices for options on futures are obtained by setting δ = r,since the futures price has zero mean drift under the risk-neutral measure.Since the goal is to exploit the relative simplicity of the Black-Scholes set-ting to derive explicit pricing formulas, we consider only European-styleversions of these exotic options. It is straightforward to get approximateprices of American versions via the binomial method.

7.3.1 Digital Options

Standard digital calls and puts pay a fixed number of currency units condi-tional on the event that the price of the underlying is in some stated range

Page 345: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

328 Pricing Derivative Securities (2nd Ed.)

on the expiration date. For example, payoffs of standard digital calls andputs struck at X have the form

CD(ST , 0; K) = K · 1[X,∞)(ST )

PD(ST , 0; K) = K · 1[0,X](ST ),

where K is the number of currency units contingently paid. A variety ofcomplex terminal payoffs can be constructed by combining digital optionsthat have different strike prices and contingent payoffs.

Martingale pricing of European digitals is very simple in the Black-Scholes framework. We have

CD(St, T − t; X, K) = e−r(T−t)K · Pt(ST ≥ X)

= B(t, T )KΦ[ln(f/X) − σ2(T − t)/2

σ√

T − t

]= B(t, T )KΦ[q−(f/X)]

PD(St, T − t; X, K) = B(t, T )KΦ[q+(X/f)],

where f = f(t, T ) is the time-t price for forward delivery of the underlyingat T . The simple put-call parity relation for European digitals is

CD(St, T − t; X, K) − PD(St, T − t; X, K) = B(t, T )K.

7.3.2 Threshold Options

Threshold puts and calls struck at X deliver the usual payoffs (ST − X)+

and (X − ST )+, but only if they are sufficiently far in the money. Forexample, a standard arrangement is

CT (ST , 0; X, K) = (ST − X) · 1[K,∞)(ST ), K > X

PT (ST , 0; X, K) = (X − ST ) · 1[0,K)(ST ), K < X,

which requires the options to be in the money by at least |K − X |. Theappeal of these is that they allow hedging of forward positions in the under-lying at lower premiums than vanilla options struck at X , but give greaterprotection (at higher cost) than vanilla options struck at K. Pricing thesethreshold options is easy because they are replicable with a static portfolioof European vanilla options and digitals. Writing

CT (ST , 0; X, K) = [(ST − K) + (K − X)] · 1[K,∞)(ST )

= (ST − K)+ + (K − X) · 1[K,∞)(ST ), K > X

PT (ST , 0; X, K) = (K − ST )+ + (X − K) · 1[0,K)(ST ), K < X

Page 346: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 329

shows that

CT (St, T − t; X, K) = CE(St, T − t; K) + CD(St, T − t; K, K − X)

PT (St, T − t; X, K) = PE(St, T − t; K) + PD(St, T − t; K, X − K),

where PE and CE are European puts and calls and PD and CD are thedigitals.

7.3.3 “As-You-Like-It” or “Chooser” Options

The holder of a “chooser” option at some time t gets to decide at a futuredate t∗ whether the option will be a T -expiring put or a T -expiring call.When t∗ arrives and the decision has been made, it becomes a vanilla optionof the specified type; therefore, the interest is in pricing it at t < t∗ while theoption to choose the type is still alive. Such an instrument might appeal toone who anticipates the occurrence of an upcoming event that could affectthe price of the underlying either positively or negatively—for example,an earnings announcement that could be either above or below analysts’expectations. An alternative play on such information would be a T -expiringstraddle, which is a simultaneous long position in a put and a call that havethe same strike. We will see that the chooser is cheaper than the straddleby a margin that diminishes with T − t∗.

Pricing the European chooser is made simple by an application of put-call parity. Taking t = 0 as the present time, the current price, denoted9

H(S0, t∗, T ; X), is

e−rt∗E max[CE(St∗ , T − t∗; X), PE(St∗ , T − t∗; X)]

= e−rt∗EPE(St∗ , T − t∗; X) + [St∗e−δ(T−t∗) − e−r(T−t∗)X ]+

= PE(S0, T ; X) + e−rt∗E[St∗e−δ(T−t∗) − e−r(T−t∗)X ]+.

Here, the second equality follows upon using put-call parity to express CE ,and the last equality holds because the normalized value of PE is itselfa martingale under P. Thus,

H(S0, t∗, T ; X) = PE(S0, T ; X) + CE(S0e

−δ(T−t∗), t∗; e−r(T−t∗)X).

Equivalently, using put-call parity to express PE instead of CE gives

H(S0, t∗, T ; X) = CE(S0, T ; X) + PE(S0e

−δ(T−t∗), t∗; e−r(T−t∗)X).

9“H ermaphroditic” seems an apt descriptor for this derivative!

Page 347: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

330 Pricing Derivative Securities (2nd Ed.)

The price difference between a T -expiring straddle and a comparablechooser—∆SH(t∗, T ), say—is

∆SH(t∗, T ) ≡ CE(S0, T ; X) + PE(S0, T ; X)− H(S0, t∗, T ; X)

= CE(S0, T ; X)− CE(S0e−δ(T−t∗), t∗; e−r(T−t∗)X)

= PE(S0, T ; X)− PE(S0e−δ(T−t∗), t∗; e−r(T−t∗)X).

Expressing either

CE(S0e−δ(T−t∗), t∗; e−r(T−t∗)X)

or

PE(S0e−δ(T−t∗), t∗; e−r(T−t∗)X)

via the Black-Scholes formula shows that it is equal to the price ofa T -expiring option of the same type with volatility σ

√t∗/T . Since

Black-Scholes prices increase monotonically in volatility, it follows that∆SH(t∗, T ) > 0 for t∗ < T and that ∆SH(t∗, T ) ↓ 0 as t∗ → T. Thismakes sense because owning the straddle lets one decide right at the lastmoment whether the instrument to be exercised shall be a put or a call,whereas with the chooser one must commit in advance.

7.3.4 Forward-Start Options

A European forward-start put or call begins its life at some future time t∗

as an at-the-money vanilla put or call that expires at a still later date T .That is, at t∗ it becomes a European option struck at St∗ and having T − t∗

to go. Since from t∗ forward it is an ordinary European option, the interestis in pricing it at t < t∗. Once more, this is easy to do. Again taking t = 0as the current date, the price of a forward-start call is

CF (S0, t∗, T ) = e−rt∗ECE(St∗ , T − t∗; St∗)

= e−rt∗E[St∗CE(1, T − t∗; 1)]

= e−δt∗S0CE(1, T − t∗; 1)

= CE(e−δt∗S0, T − t∗; e−δt∗S0).

The second and last equalities follow from the homogeneity of the Black-Scholes call formula in the price of the underlying and the strike, and thethird equality follows from the martingale property of the normalized, cum-dividend value of S under P. Similarly, the arbitrage-free price of a forward-start put is

PF (S0, t∗, T ) = PE(e−δt∗S0, T − t∗; e−δt∗S0).

Page 348: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 331

Notice that e−δt∗S0 is the current value of an ex -dividend stock positionheld to t∗. Thus, options that commence at the money at t∗ and expire atT are equivalent to (T − t∗)-expiring vanilla options that are at the moneyat current ex -dividend values of the underlying.

7.3.5 Options on the Max or Min

Given two assets with price processes S1tt≥0 and S2tt≥0 there are sev-eral ways one can construct derivatives whose payoffs depend on the greateror lesser of these prices at a future date. Options on the max or min areone approach. For a given strike X and expiration date T we could have(i) a call on the maximum, with terminal payout (S1T ∨ S2T − X)+; (ii) acall on the minimum, with payout (S1T ∧ S2T − X)+; (iii) a put on themaximum, paying (X −S1T ∨S2T )+; or (iv) a put on the minimum, worth(X − S1T ∧ S2T )+ at T . Stultz (1982) gave the first systematic accountof options of this sort under Black-Scholes dynamics. Another scheme pro-duces what are called “best-of” options. Consider two primary assets whoseinitial prices are S10 and S20. An investment of one currency unit in thefirst will be worth S1T /S10 at T , while a like investment in the secondwill be worth S2T /S20. For each unit of principal a “best-of-assets 1, 2”instrument gives the holder the better of these returns or unity, whicheveris greater. Thus, the payoff at T per unit of principal is

max(S1T /S10, S2T /S20, 1) = 1 + (S1T /S10 − 1)+ ∨ (S2T /S20 − 1)+

= 1 + (S1T /S10 ∨ S2T /S20 − 1)+.

Clearly, if we can value a call on the maximum at unit strike price and forunit initial values of the assets, then adding unity to it will give the value ofa unit position in the best-of 1, 2. We will work out the pricing formulafor a call on the maximum, present corresponding results for calls on theminimum, and then show how what amounts to a put-call parity relationcan be used to generate the put prices from these.

Pricing a Call on the Max

Martingale pricing of any of these instruments requires developing anexpression for the conditional distribution of an extremum of the two prices.Working still within the Black-Scholes framework, we assume that prices

Page 349: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

332 Pricing Derivative Securities (2nd Ed.)

evolve under P as

dS1t = (r − δ1)S1t · dt + σ1S1t · dW1t

dS2t = (r − δ2)S2t · dt + σ2S2t · dW2t,

where W1t and W2t are Brownian motions with EW1tW2t = ρt. Theterminal value of a call on the maximum thus depends on the maximum oftwo correlated lognormal variates. The call’s price at t can be expressed as

Cmax(S1t, S2t; τ) = e−rτ Et[(S1T ∨ S2T − X)1(X,∞)(S1T ∨ S2T )], (7.34)

where τ = T − t. A nice computational expression for this can be found byworking out the conditional density of ln(S1T ∨ S2T ).

Letting Pt(·) denote P(· | Ft) as usual, begin by observing that for h > 0and η = lnh

Pt(S1T ∨ S2T ≤ h) = Pt(lnS1T ≤ η, lnS2T ≤ η)

= P[µ1 + σ1W1τ ≤ η, µ2 + σ2W2τ ≤ η]

= Φ(

η − µ1

σ1√

τ,η − µ2

σ2√

τ; ρ

), (7.35)

where

µj ≡ lnSjt + (r − δj − σ2j /2)τ (7.36)

for j ∈ 1, 2 and Φ(·, ·; ρ) is the c.d.f. of bivariate standard normals withcorrelation ρ. Likewise, we use φ(·, ·; ρ) to represent the p.d.f. The right sideof expression (7.34) can now be written as

e−rτ

∫ ∞

lnX

eη ft(η) · dη − e−rτX1− Pt[ln(S1T ∨ S2T ) ≤ lnX ], (7.37)

where ft is the conditional p.d.f. of lnS1T ∨ S2T under P. Since the proba-bility in the second term comes directly from (7.35) evaluated at η = lnX ,only the first term requires attention. Letting Zj(η) ≡ (η − µj)/(σj

√τ ) for

j = 1, 2, we find the density as

ft(η) =d

dηΦ [Z1(η), Z2(η); ρ]

=∂Φ∂Z1

Z ′1(η) +

∂Φ∂Z2

Z ′2(η)

=1

σ1√

τ

∫ Z2(η)

−∞φ[Z1(η), z2; ρ]dz2 +

1σ2

√τ

∫ Z1(η)

−∞φ[Z2(η), z1; ρ]dz1

≡ f1t (η) + f2

t (η).

Page 350: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 333

Notice that the two terms are identical, modulo an interchange ofsubscripts.

Decomposing the bivariate p.d.f. into the product of conditional andmarginal, as

φ(z1, z2; ρ) =1√

1 − ρ2φ

(z2 − ρz1√

1 − ρ2

)φ(z1),

gives

f1t (η) =

1σ1

√τ

Φ

[Z2(η) − ρZ1(η)√

1 − ρ2

]φ[Z1(η)],

where φ(·) and Φ(·) are the univariate normal p.d.f. and c.d.f. The samekind of expression holds for f2

t (η), just interchanging subscripts. The firstterm on the right side of (7.37) is then

e−rτ

∫ ∞

lnX

eηft(η) · dη = e−rτ

∫ ∞

ln X

eηf1t (η) · dη + e−rτ

∫ ∞

ln X

eηf2t (η) · dη

≡ e−rτ(I1 + I2). (7.38)

Only the first integral needs to be developed, since the second will havethe same form. Writing out eηφ[Z1(η)] and completing the square in theexponent give

I1 =eµ1+σ2

1τ/2√2πσ2

∫ ∞

ln X

Φ

[Z2(η) − ρZ1(η)√

1 − ρ2

]exp

−[η − (µ1 + σ2

1τ)]2

2σ21τ

· dη.

Now change variables, as

y ← −η − (µ1 + σ21τ)

σ1√

τ

and write out Z1(η), Z2(η), and the µj to get

I1 = f1t

∫ q+(f1t/X;σ1)

−∞Φ(a12 − b12y) · dΦ(y), (7.39)

Page 351: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

334 Pricing Derivative Securities (2nd Ed.)

where

fjt = Sjte(r−δj)τ (the forward price)

q±(x; θ) =lnx ± θ2τ/2

θ√

τ

a12 =ln(f1t/f2t) + σ2τ/2√

1 − ρ2σ2√

τ(7.40)

b12 =σ1 − ρσ2

σ2

√1 − ρ2

(7.41)

σ2 = σ21 − 2ρσ1σ2 + σ2

2 .

This can be tidied up by expressing (7.39) in terms of a bivariate normalc.d.f. For this, apply (2.55) to write∫ γ

−∞Φ(a12 − b12y) · dΦ(y) = Φ

(γ, a12/

√1 + b2

12; b12/√

1 + b212

).

Since a12/√

1 + b212 = q+(f1t/f2t; σ) and b12/

√1 + b2

12 = (σ1 − ρσ2)/σ, thisgives

I1 = f1tΦ[q+(f1t/X ; σ1), q+(f1t/f2t; σ); (σ1 − ρσ2)/σ].

An interchange of subscripts then gives I2, the second integral in (7.38).Finally, assembling the pieces of (7.37) yields this final expression for thecall on the max, Cmax(S1t, S2; τ):

e−rτ2∑

j=1

fjtΦ[q+(fjt/X ; σj), q+(fjt/f3−j,t; σ); σ−1(σj − ρσ3−j)] (7.42)

−Xe−rτ1 − Φ[q+(X/f1t; σ1), q+(X/f2t; σ2); ρ],

where

q±j (x; θ) =lnx ± θ2τ/2

θ√

τ

σ2 = σ21 − 2ρσ1σ2 + σ2

2

fjt = Sjte(r−δj)τ

τ = T − t.

(The 3− j subscript in the first line just handles the interchange: 3− 1 = 2and 3 − 2 = 1.)

Page 352: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 335

Other Options on the Max or Min

Options on the minimum of two prices can be valued in the same way using

P(S1T ∧ S2T ≤ ) = 1 − P(ln S1T > ln , lnS2T > ln )

= 1 − Φ(− ln − µ1

σ1√

τ,− ln − µ2

σ2√

τ; ρ

).

The following expression holds for the call on the min, Cmin(S1t, S2t; τ) [cf.Stultz (1982)]:

e−rτ2∑

j=1

fjtΦ[q+(fjt/X ; σj), q−(f3−j,t/fjt; σ); σ−1(ρσ3−j − σj)]

−Xe−rτΦ[q−(f1t/X ; σ1), q−(f2t/X ; σ2); ρ].

To get prices of the puts, we infer from the identities

Cmax(S1T , S2T ; 0) − Pmax(S1T , S2T ; 0) = S1T ∨ S2T − X

Cmin(S1T , S2T ; 0) − Pmin(S1T , S2T ; 0) = S1T ∧ S2T − X

that

Pmax(S1t, S2t; τ) = Cmax(S1t, S2t; τ) − e−rτ [Et(S1T ∨ S2T ) − X ]

Pmin(S1t, S2t; τ) = Cmin(S1t, S2t; τ) − e−rτ [Et(S1T ∧ S2T ) − X ].

An expression for Et(S1T ∧ S2T ) is developed by analogy with (7.38) and(7.39) as

e−rτ

∫ ∞

−∞eηft(η) · dη = e−rτ

∫ ∞

−∞eηf1

t (η) · dη + e−rτ

∫ ∞

−∞eη f2

t (η) · dη

=2∑

j=1

fjt

∫ ∞

−∞Φ(aj,3−j − bj,3−jy)dΦ(y).

The integrals are evaluated using∫ ∞

−∞Φ(a − by)φ(y) · dy = Φ

(a/

√1 + b2

)to give

Et(S1T ∧ S2T ) =2∑

j=1

fjtΦ[q+(fjt/f3−j,t)].

Page 353: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

336 Pricing Derivative Securities (2nd Ed.)

7.3.6 Quantos

One who buys an asset in a foreign market ordinarily bears two kinds of risk.One is the risk of fluctuations in price as measured in the foreign currency,and the other is the risk of fluctuations in the rate at which foreign currencycan be exchanged for domestic currency. Quanto contracts are designed toremove the currency risk from such transactions by providing for foreignasset positions to pay off at a fixed exchange rate. For example, a U.S.investor might buy a quanto forward contract on a German stock withthe provision that the price in Euros would convert one-for-one into U.S.dollars. In pricing such derivatives one is tempted to ignore the exchange-rate mechanism entirely and treat the problem just as it would be treatedby an investor in the foreign market, whose position in the foreign (non-quanto) version of the same instrument would also be free of currency risk.However, this overlooks an important consideration. The foreign investorcould use the derivative and the underlying to replicate a position in foreign-denominated riskless bonds, whereas currency risk makes it impossible forthe domestic investor to replicate a position in domestic bonds with theunderlying and a quanto, since the underlying is denominated in the foreigncurrency. Accordingly, arbitrage-free prices of quanto derivatives do windup depending on exchange-rate risk.

The notation we use to develop the ideas will seem a bit cumbersome,but it has the advantage of making the meanings of the symbols transparent.

Rd/ft domestic-currency value of a unit of foreign currency

Mft = Mf

0 erf t foreign-currency value of the foreign money fundMd

t = Md0 erdt domestic-currency value of the domestic money fund

Md/ft = Mf

t Rd/ft domestic-currency value of foreign money fund

Sft foreign-currency value of foreign underlying asset

Sd/ft = Sf

t Rd/ft domestic-currency value of foreign underlying asset

Pf martingale measure for foreign marketPd martingale measure for domestic market

Note especially that Md/ft and S

d/ft are the domestic values of the foreign

assets at t. To model the dynamics of these quantities, take the foreign andlocal short rates, rf and rd, to be constants, and assume that the motionsof Sf

t and Rd/ft under the actual measure P are described by

dSft /Sf

t = µS · dt + σS · dW ft (7.43)

dRd/ft /R

d/ft = µR · dt + σR

[ρ · dW f

t +√

1 − ρ2 · dW dt

],

Page 354: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 337

where W f and W d are independent Brownian motions and ρ ∈ (−1, 1) isa constant. This setup allows for systematic comovements in the exchangerate and the price of the foreign asset while fixing the volatility of R

d/ft at

σR. (We assume for now that the foreign asset pays no dividend.) One lastbit of notation is needed. Let M

d/f∗t ≡ M

d/ft /Md

t and Sd/f∗t ≡ S

d/ft /Md

t

denote the normalized domestic values of the foreign assets. Ito’s formulagives the dynamics of these as

dMd/f∗t /M

d/f∗t = (µR + rf − rd) · dt + ρσR · dW f

t

+ σR

√1 − ρ2 · dW d

t (7.44)

dSd/f∗t /S

d/f∗t = (µR + µS − rd + ρσSσR) · dt + (σS + ρσR) · dW f

t

+ σR

√1 − ρ2 · dW d

t . (7.45)

The economics of the problem of pricing a quanto derivative becomesclearer once it is understood how the perspectives of the foreign and domes-tic investor differ. There exists within the foreign arbitrage-free market ameasure Pf under which values of all locally traded assets normalized byMf

t are martingales. Likewise, in the domestic market there exists Pd

under which the prices of domestically traded assets normalized by Mdt

are martingales. Since foreign-denominated assets are not traded in thedomestic market, there is no reason to expect their normalized prices tobe martingales under Pd, and yet they must become so upon convertingtheir prices to domestic currency if there is to be no arbitrage. In otherwords, while neither Sf

t /Mdt nor Mf

t /Mdt is a martingale under Pd if

exchange rates are uncertain, the converted values Sd/f∗t = S

d/ft /Md

t and Md/f∗

t = Md/ft /Md

t will be martingales. The key to pricing quantoderivatives on Sf is to discover how Sf itself behaves under the measurethat makes Sd/f∗

t and Md/f∗t martingales.

To do this, it is necessary to remove the drift terms from (7.44) and(7.45). Girsanov’s theorem confirms the existence of Pd under which bothW f

t = W ft + αf t and W d

t = W dt + αdt are Brownian motions for any

finite constants αf and αd. Substituting W ft − αf t and W d

t − αdt for W ft

and W dt in (7.44) and (7.45) shows that the drift terms are eliminated if αf

and αd solve

ρσRαf + σR

√1 − ρ2αd = µR + rf − rd

(ρσR + σS)αf + σR

√1 − ρ2αd = µR + µS − rd + ρσSσR.

However, only αf is needed for pricing quanto derivatives on Sf , and this is

αf =µS − rf + ρσRσS

σS.

Page 355: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

338 Pricing Derivative Securities (2nd Ed.)

Since σR, σS , and ρ are constants, Novikov condition (3.24) is clearly satis-fied, so that Sd/f∗

t and Md/f∗t are indeed martingales under Pd. Replac-

ing W ft by W f

t − αf t in (7.43), the dynamics of Sft under Pd are then

dSft /Sf

t = µS · dt + σS · d(W ft − αf t)

= (rf − ρσRσS) · dt + σS · dW ft .

Using this result to price quanto derivatives is made easier and more intu-itive by setting δd/f ≡ rd − rf + ρσRσS and writing

dSft /Sf

t = (rd − δd/f ) · dt + σS · dW ft . (7.46)

This makes δd/f look like a continuous dividend yield on a domesticallytraded asset. To interpret it in this context, note that under Pd the meanproportionate drift of the domestic value of a position in the foreign asset—that is, of Sd/f

t ≡ Sft R

d/ft —is simply rd. Therefore, δd/f can be thought of

as the component of the average yield of Sd/f that is attributable to changesin the exchange rate. Of this, rd−rf is the mean drift of Rd/f itself underPd, and ρσRσS reflects the effect of systematic comovements in Rd/f

t andSf

t . Expressing dSft /Sf

t as (7.46) makes it possible to calculate arbitrage-free prices of quanto derivatives from the same formulas that apply toderivatives on domestically traded assets that pay continuous dividends.

Here are three particular cases. Since Edt Sf

T = Sft e(rd−δd/f )(T−t), the

value at t of a T -expiring quanto forward contract at forward price fQ0 ≡fQ(0, T ) is

e−rd(T−t)Edt (Sf

T − fQ0 ) = Sft e−δd/f (T−t) − e−rd(T−t)fQ0 .

The initial forward price that make this value zero is

fQ(0, T ) = Sf0 e(rd−δd/f )T = Sf

0 e(rf−ρσRσS)T .

Similarly, current values of European-style quanto calls and puts are

CQ(Sft , T − t) = e−rd(T−t)fQt Φ[q+(fQt /X)]− XΦ[q−(fQt /X)]

PQ(Sft , T − t) = e−rd(T−t)XΦ[q+(X/fQt )] − fQ

t Φ[q−(X/fQt )],

where

fQt ≡ fQ(t, T ) = Sft e(rf−ρσRσS)(T−t)

is the forward price at t for T delivery. In all three cases pricing the quantoproduct just involves replacing the domestic interest rate by the foreignrate minus the instantaneous covariance between the foreign asset’s return

Page 356: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 339

and the return on a position in the foreign currency. If the foreign assetitself yields a continuous dividend at rate δf (paid in the foreign currency),then δf is simply subtracted from rf .

7.4 Path-Dependent Options

A European-style derivative is said to be “path dependent” if to determineits terminal value requires knowing the entire sample path of the underly-ing during all or part of the derivative’s life. Two examples are lookbackoptions, whose payoffs depend on the extreme values of the underlying—either maxima or minima, and Asian options, whose payoffs depend on theaverage price. The mechanics of each of these is explained below. While thecontractual terms of these instruments usually call for averages and extremato be determined from a finite set of prices at fixed trading intervals, thepricing problem turns out to be simpler in the Black-Scholes framework ifbased on the entire continuous record. The simpler formulas derived in thiscontext then serve as approximations for the prices of real-world instru-ments. Basic to the whole process are some beautiful theoretical resultsabout the distributions of maxima and minima of Brownian paths.

7.4.1 Extrema of Brownian Paths

As usual, let us take St = S0 exp[(r−δ−σ2/2)t+σWt] as the model for theunderlying price under the risk-neutral measure, where δ is the continuousdividend rate, r ≥ 0 is the short rate, and Wtt≥0 is a standard Brownianmotion. Letting Yt ≡ (r−δ−σ2/2)t+σWt and representing the price processas S0e

Ytt≥0 shows that extrema of price paths are functions of extrema ofnonstandard Brownian motions—that is, Brownian motions with arbitrarymean drift and scale. The first step in characterizing the price extrema is towork out distributions of extrema of standard Wiener processes. Althoughthe arguments here will depend crucially on the symmetry of these trendlessprocesses, Girsanov’s theorem will make it possible to extend them to thecase at hand.

Standard Wiener Processes

Considering a standard Brownian motion Wtt≥0 on (Ω,F , Ft, P) withW0 = 0, let W ≡ W 0,T and W ≡ W 0,T denote the maximum and minimumon the finite interval [0, T ]. We shall first apply the reflection principle to

Page 357: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

340 Pricing Derivative Securities (2nd Ed.)

work out the joint distributions of W, WT and W, WT and then deducefrom these the marginal distributions of W and W themselves.10

The first step will be to develop an identity for the joint probabilityP(W ≥ w, WT ≤ w) for arbitrary real w, w. This is where the reflectionprinciple comes in, and figure 7.6 tells the story. Consider a path for whichW ≥ w does occur, as depicted by the solid curve that ends up at WT = w.In view of the almost-sure continuity of Brownian paths, there is necessarilya time t∗ at which Wt∗ = w along this path. Imagine being at this pointin time, observing that Wt∗ = w and thus knowing that the event W ≥ w

has occurred, but not knowing the subsequent behavior of Wt beyond t∗.Regarding Ft∗ as the information then available, let us try to assess thechance that the path will end up at or below an arbitrary value w—thatis, to determine Pt∗(WT ≤ w) ≡ P(WT ≤ w | Ft∗). Since the conditionaldistribution of WT −w is symmetric about zero, it is clear that the path hasas much chance of rising by |w−w| or more as it does of falling by |w−w|or more. That is, corresponding to any path that winds up at or below w

Fig. 7.6. Applying the reflection principle.

10Karlin and Taylor (1975) provide a fuller and wonderfully accessible treatment ofthis and other applications of the reflection principle.

Page 358: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 341

is its reflection that winds up at or above w + (w − w). The reflected pathis depicted by the dashed track in figure 7.6. Therefore,

Pt∗(WT − w ≥ w − w) = Pt∗(WT − w ≤ w − w),

or equivalently

Pt∗(WT ≥ 2w − w) = Pt∗(WT ≤ w).

Since the same relation holds for some such t∗ on any path for which W ≥ w,we conclude that

P(W ≥ w, WT ≤ w) = P(W ≥ w, WT ≥ 2w − w) (7.47)

for all w, w. The same kind of argument for the minimum, W , gives

P(W ≥ w, WT ≤ w) = P(W ≤ w, WT ≥ 2w − w). (7.48)

Putting (7.47) to work produces two useful results. First, by picking avalue w ≤ w we have 2w − w ≥ w, so that WT ≥ 2w − w implies WT ≥ w,

which in turn implies (is a subset of the event) that W ≥ w. Thus,

P(W ≥ w, WT ≤ w) = P(WT ≥ 2w − w) (7.49)

whenever w ≤ w. Next, using that11

P(W ≥ w) = P(W ≥ w, WT ≤ w) + P(W ≥ w, WT ≥ w) (7.50)

= P(W ≥ w, WT ≥ 2w − w) + P(W ≥ w, WT ≥ w)

and taking w = w give

P(W ≥ w) = 2P(W ≥ w, WT ≥ w).

Finally, again using the fact that WT ≥ w implies W ≥ w gives the remark-able result that the distribution of the maximum of Brownian motion canbe expressed just in terms of the distribution of its terminal value:

P(W ≥ w) = 2P(WT ≥ w). (7.51)

Note that this is consistent with P(W ≥ 0) = 1, which is necessarily truegiven that W0 = 0.

Likewise, (7.48) (or simply the symmetry of Brownian motion) impliesthat for w ≥ w

P(W ≤ w, WT ≥ w) = P(WT ≤ 2w − w) (7.52)

P(W ≤ w) = 2P(WT ≤ w). (7.53)

11Observe that event WT = w is counted twice in (7.50) but has zero probability.

Page 359: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

342 Pricing Derivative Securities (2nd Ed.)

Wiener Processes with Drift

Consider now the trending process Yt = µt+Wt, where Wtt≥0 is againa standard Brownian motion and µ is a constant not equal to zero. Althoughthere is nonzero drift, the process still starts at the origin. Adhering to thesame notation, with Y ≡ Y 0,T as the maximum value on [0, T ], we shalldevelop results for Y that correspond to (7.49) and (7.51).

The hard part is the counterpart of (7.49). Girsanov’s theorem impliesthat taking

dP/dP = e−µYT +µ2T/2

yields an equivalent measure P under which Ytt≥0 is a standard Brownianmotion on [0, T ]. Of course, the goal is to find not P(Y ≥ y, YT ≤ y) butP(Y ≥ y, YT ≤ y). However, since

P(Y ≥ y, YT ≤ y) = E1[y,∞)×(−∞,y](Y , YT )

= E[1[y,∞)×(−∞,y](Y , YT ) · dP/dP]

= E[1[y,∞)×(−∞,y](Y , YT ) · eµYT −µ2T/2], (7.54)

the probability we want can be found by evaluating the expectation. Ytt≥0

being a standard Brownian motion under P, (7.49) implies that

P(Y ≥ y, YT ≤ y) = P(YT ≥ 2y − y)

for y ≤ y. The joint density of Y , YT under P is therefore

f(y, y) =∂2

∂y∂yP(Y ≤ y, YT ≤ y)

=∂2

∂y∂y[P(YT ≤ y) − P(Y ≥ y, YT ≤ y)]

= − ∂2

∂y∂yP(Y ≥ y, YT ≤ y)

= − ∂

∂y

[∂

∂y

∫ ∞

2y−y

1√2πT

e−z2/(2T ) · dz

]=

√2π

(2y − y)T−3/2e−(y−2y)2/(2T ), y ≤ y, y > 0.

This makes it possible to evaluate the expectation in (7.54) for y ≤ y as∫ y

−∞

∫ ∞

y

eµu−µ2T/2 f(u, u) · du du

=∫ y

−∞T−1eµu−µ2T/2

[∫ ∞

y

2(2u − u)√2πT

e−(u−2u)2/(2T ) · du

]· du.

Page 360: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 343

Changes of variables and algebra reduce this to

e2µy

∫ y−2y

−∞

1√2πT

e−(z−µT )2/(2T ) · dz,

yielding as the counterpart to (7.49)

P(Y ≥ y, YT ≤ y) = e2µyP(YT ≤ y − 2y) (7.55)

for y ≤ y and y > 0. Since YT = µT +WT and P(WT ≤ w) = P(WT ≥ −w),this can also be written as

P(Y ≥ y, YT ≤ y) = e2µyP(YT ≥ 2y − y + 2µT ).

A similar process yields for the minimum of Yt0≤t≤T

P(Y ≤ y, YT ≥ y) = e2µyP(YT ≥ y − 2y) (7.56)

for y ≥ y and y < 0.The counterparts of (7.51) for Y and of (7.53) for Y now come at once.

Applying (7.55) and the identity corresponding to (7.50) gives

P(Y ≥ y) = e2µyP(YT ≤ −y) + P(YT ≥ y)

or

P(Y ≥ y) = e2µyP(YT ≥ y + 2µT ) + P(YT ≥ y) (7.57)

for y > 0. Likewise, (7.56) and (7.50) imply

P(Y ≤ y) = e2µyP(YT ≥ −y) + P(YT ≤ y)

or equivalently

P(Y ≤ y) = e2µyP(YT ≤ y + 2µT ) + P(YT ≤ y) (7.58)

for y < 0. Note that since Y , Y , and YT are absolutely continuous, all of(7.55), (7.56), ( 7.57), and (7.58) continue to hold when the weak inequali-ties are replaced by strict inequalities.

Supplied with these results, we can now price some path-dependentderivatives under Black-Scholes dynamics.

Page 361: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

344 Pricing Derivative Securities (2nd Ed.)

7.4.2 Lookback Options

“Fixed-strike” and “variable-strike” options are the two common types oflookbacks. Payoffs of fixed-strike T -expiring European lookbacks issued att = 0 depend on differences between an extremum of the underlying on [0, T ]and strike price X . Specifically, fixed-strike lookback calls pay (S0,T −X)+,and fixed-strike lookback puts pay (X−S0,T )+, where overbars and under-bars denote maxima and minima, respectively. With variable-strike look-backs, payoffs depend on the difference between the terminal price andthe maximum or minimum; thus, variable-strike puts pay (S0,T − ST ) andvariable-strike calls pay (ST − S0,T ). Since the terminal value can be nolarger than the maximum and no smaller than the minimum, variable-strike options will be exercised with probability one under Black-Scholesdynamics. Goldman et al. (1979) did the pioneering work on Europeanvariable-strike lookbacks, demonstrating, among other things, that theyare replicable with self-financing portfolios. Conze and Viswanathan (1991)apply Girsanov’s theorem to price both variable-strike and fixed-strike look-backs, and they provide bounds for prices of American lookbacks. We followtheir treatment in working out arbitrage-free prices for European variable-strike puts and fixed-strike calls at an arbitrary t ∈ [0, T ). In both cases themain trick is to figure out the conditional distribution of S0,T given whatis known at t, and the hardest part of that has already been done.

Variable-Strike Options

Letting PV L(St, T − t) be the price of the variable-strike lookback put att, martingale pricing gives

PV L(St, T − t) = e−r(T−t)Et(S0,T − ST )

= e−r(T−t)Et(S0,T ) − Ste−δ(T−t). (7.59)

As usual, δ is the continuous dividend rate, E denotes expectation underthe risk-neutral measure, and Et(·) ≡ E(· | Ft). Obviously, one has onlyto evaluate the expectation in order to price the option. Standing at t,

we already know the maximum price up to that point, S0,t. Therefore,conditional on Ft, the expectation of the maximum price during [0, T ] is

Et(S0,T ) = Et(S0,t ∨ St,T )

=∫ ∞

0

Pt(S0,t ∨ St,T > s) · ds

= S0,t +∫ ∞

S0,t

Pt(St,T > s) · ds. (7.60)

Page 362: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 345

Here, Pt(·) ≡ P(·|Ft); the second equality exploits the identity (2.40) forthe expectation; and the third equality follows from the obvious relation

Pt(S0,t ∨ St,T > s) =

1, s < S0,t

Pt(St,T > s), s ≥ S0,t.

Letting γ ≡ (r − δ − σ2/2)/σ and Yt ≡ γt + Wt, where Wtt≥0 is aBrownian motion under P, we have St = S0e

σYt and St,T = St exp(σY t,T ).Thus,

Pt(St,T > s) = Pt[Y t,T > σ−1 ln(s/St)]

and

Et(S0,T ) = S0,t + σSt

∫ ∞

σ−1 ln(S0,t/St)

eσyPt(Y t,T > y) · dy. (7.61)

Setting τ ≡ T − t and applying (7.57) with Pt in place of P give for theintegral in (7.61)∫ ∞

σ−1 ln(S0,t/St)

eθyPt(YT − Yt > y + 2γτ) · dy (7.62)

+∫ ∞

σ−1 ln(S0,t/St)

eσyPt(YT − Yt > y) · dy,

where θ ≡ 2(r − δ)/σ. For the time being we assume that θ = 0, savingthat case, which is relevant for options on futures, until later.

Evaluating the integrals in (7.62) requires the following relation, whichintegration by parts easily establishes. For arbitrary α = 0 assume thatX is an absolutely continuous random variable such that eαxP(X ≥ x) isintegrable. Then∫ ∞

x0

eαxP(X > x) · dx = α−1E(eαX − eαx0)1[x0,∞)(X). (7.63)

Attacking the first integral in (7.62), set Yt = γt + Wt and changevariables as u = y + γt to get

e−θγτ

∫ ∞

σ−1 ln(S0,t/St)+γτ

eθuPt(WT − Wt > u) · du.

Now applying (7.63) with α = θ and x0 = σ−1 ln(S0,t/St) + γτ gives

e−θγτθ−1Et[eθ(WT −Wt) − eθx0]1[x0,∞)(WT − Wt),

or

er′τ

θ

∫ ∞

x0

eθw 1√2πτ

e−(w−θτ)2/(2τ) · dw − 1θ(S0,t/St)θ/σPt(WT − Wt > x0),

Page 363: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

346 Pricing Derivative Securities (2nd Ed.)

where r′ ≡ r−δ. One more change of variables in the integral leads finally to

I1 ≡ σ

2r′

er′τΦ

[q+

(Ste

r′τ

S0,t

)]−

(S0,t

St

)2r′/σ2

Φ

[q+

(Ste

−r′τ

S0,t

)](7.64)

as the expression for the first integral in (7.62). As usual,

q±(x) =lnx ± σ2τ/2

σ√

τ.

Similar steps involving another application of (7.63) give for the secondintegral in (7.62)

I2 ≡ σ−1

er′τΦ

[q+

(Ste

r′τ

S0,t

)]− S0,t

StΦ

[q−

(Ste

r′τ

S0,t

)]. (7.65)

Finally, assembling the pieces, we have from (7.61) that

Et(S0,T ) = S0,t + σSt(I1 + I2),

and finally, from (7.59), the arbitrage-free value of the variable-strike look-back put:

PV L(St, τ) = −Ste−δτΦ

[q−

(S0,t

Ster′τ

)]+ S0,te

−rτΦ[q+

(S0,t

Ster′τ

)]+

σ2

2r′Ste

−δτΦ

[q+

(Ste

r′τ

S0,t

)]

− σ2

2r′Ste

−rτ

(S0,t

St

)2r′/σ2

Φ

[q+

(Ste

−r′τ

S0,t

)],

where r′ ≡ r − δ and τ ≡ T − t.

The variable-strike lookback call option, which is worth (ST − S0,T ) atexpiration, can be valued in much the same way using (7.58) to deduce thedistribution of the minimum:

CV L(St, τ) = Ste−δτΦ

[q+

(Ste

r′τ

S0,t

)]− S0,te

−rτΦ

[q−

(Ste

r′τ

S0,t

)]

− σ2

2r′Ste

−δτΦ[q−

(S0,t

Ster′τ

)](7.66)

+σ2

2r′Ste

−rτ

(S0,t

St

)2r′/σ2

Φ[q−

(S0,t

Ste−r′τ

)].

Page 364: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 347

Notice that a variable-strike lookback straddle—a position long in boththe call and the put—produces a payoff of S0,T − S0,T , which effectivelyguarantees a purchase at the low and a sale at the high over the period.

The results differ considerably for lookback options on futures. Sincefutures price Ft is a martingale under P, modeling the dynamics requiressetting δ = r. In that case θ = 0 and γ = −σ/2 in (7.62), and (7.63)no longer applies to evaluate the first integral in (7.62). However, a directapplication of integration by parts ultimately reduces it to∫ ∞

σ−1 ln(F0,t/Ft)

Pt(YT − Yt ≥ y − στ) · dy =√

τ[q+Φ(q+) + φ(q+)

],

where φ(·) is the standard normal p.d.f., τ = T − t, and

q± ≡ ln(Ft/F0,t) ± σ2τ/2σ√

τ.

The final result for the variable-strike lookback put on the futures is

PV L(Ft, τ) = Fte−rτ

Φ(q+)

[2 + σ

√τ (q+ + φ(q+))

]− 1

+ F0,te

−rτΦ(−q−

).

Fixed-Strike Options

Most of the work needed to price fixed-strike lookbacks has already beendone. It just remains to show that this is true. To that end we will workout the arbitrage-free price of a fixed-strike European call on an underlyingfor which r − δ = 0. Letting CFL(St, τ ≡ T − t) denote the price at t andrecalling that the terminal value is CFL(ST , 0) = (S0,T − X)+, we have

CFL(St, τ) = e−rτ Et(S0,T − X)+

= e−rτ Et(S0,t ∨ St,T − X)+

= e−rτ

∫ ∞

0

Pt[(S0,t ∨ St,T − X)+ > u] · du.

Now the probability in the last line is

Pt

[(S0,t ∨ St,T − X)+ > u

]=

1, u < (S0,t − X)+

Pt(St,T > X + u), u ≥ (S0,t − X)+.

Page 365: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

348 Pricing Derivative Securities (2nd Ed.)

Therefore,

CFL(St, τ) = e−rτ

[∫ (S0,t−X)+

0

1 · du +∫ ∞

(S0,t−X)+Pt

(St,T > X + u

)· du

]

= e−rτ

[(S0,t − X)+ +

∫ ∞

S0,t∨X

Pt

(St,T > s

)· ds

],

where in the second line we have set s = X + u and used the identity(S0,t −X)+ + X = S0,t ∨X . Now the integral in the last expression is justthe same as that in (7.60) except that the lower limit is S0,t ∨ X insteadof S0,t. Evaluating CFL(St, τ), then, just involves making this substitutionin the integrals (7.64) and (7.65) and collecting the results:

CFL(St, τ) = e−rτ (S0,t − X)+ + Ste−δτΦ

[q+

(Ste

r′τ

S0,t ∨ X

)]

− (S0,t ∨ X)e−rτΦ

[q−

(Ste

r′τ

S0,t ∨ X

)]

+σ2

2r′Ste

−δτΦ

[q+

(Ste

r′τ

S0,t ∨ X

)]

− σ2

2r′Ste

−rτ

(S0,t ∨ X

St

)2r′/σ2

Φ

[q+

(Ste

−r′τ

S0,t ∨ X

)],

where τ = T − t, r′ = r − δ = 0, and

q±(x) =lnx + σ2τ/2

σ√

τ.

Notice that if we hold St < S0,t ∨ X while letting t → T , the value of thecall converges to (S0,T − X)+, as it should.

7.4.3 Barrier Options

Terminal payoffs of variable-strike lookback options depend on both anextremum of price and the terminal value, ST , but the dependency is strictlylinear. For example, the payoff is S0,T − ST for the variable-strike put.Payoffs of European barrier options, however, depend in a more intricateway on terminal price and extremum. These options deliver the same payoffsat T as vanilla call and puts—that is, (ST −X)+ or (X − ST )+—but onlyunder the condition that an extremum of price over [0, T ] either has or

Page 366: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 349

Table 7.2. Terminal values of various barrier options with strike X and barrier K.

Option Call Put

Up & in, K > X (ST − X)+1[K,∞)(S0,T ) (X − ST )+1[K,∞)(S0,T )

Up & out, K > X (ST − X)+1[0,K](S0,T ) (X − ST )+1[0,K](S0,T )

Down & in, K < X (ST − X)+1[0,K](S0,T ) (X − ST )+1[0,K](S0,T )

Down & out, K < X (ST − X)+1[K,∞)(S0,T ) (X − ST )+1[K,∞)(S0,T )

has not surpassed the pertinent “barrier”. Many variations are possible,depending on how the contingency is specified. The payoff formulas forthe four most common types of barrier options are listed in table 7.2. Ineach case X is the strike price and the payoff is contingent on the eventthat price attains some value K during [0, T ]. Thus, the up-and-in call paysST −X if this is positive and if price has reached K > X during the option’slifetime; otherwise, it pays nothing. Similarly, the down-and-out put paysX − ST if this is positive and price has not fallen below K < X. Clearly,risk-neutral pricing of derivatives with these payoffs requires knowing thejoint distributions of terminal price and extrema. Fortunately, these comeeasily from (7.55) and (7.56).

There is one sense in which the barrier options in table 7.2 are easier todeal with than lookbacks. The price at t ∈ (0, T ) does not depend on theprecise value attained by the relevant extremum up to that point—S0,t orS0,t. All that matters is whether the pertinent barrier has been surpassed.If it has, then the option has either been “knocked in”, becoming a vanillaEuropean whose price comes directly from Black-Scholes, or else it hasbeen “knocked out” and is worthless. To price a barrier option that hasnot already been knocked out or in, we can save notation by taking t = 0as the current time. To illustrate the technique, we shall price the up-and-in call and the down-and-out put. Except for the timing convention, theassumptions and notation are the same as for lookback options.

The Up-and-In Call

Letting CUI(S0, T ) denote the current value of the up-and-in call, we have

CUI(S0, T ) = e−rT ECUI(ST , 0)

= e−rT E(ST − X)+1[K,∞)(S0,T ),

where K > X > 0. We also assume that K > S0 to exclude the trivial casethat the option is a vanilla European at the outset. We shall derive the

Page 367: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

350 Pricing Derivative Securities (2nd Ed.)

p.d.f. of the terminal value by deducing the complementary c.d.f. under P.This is given by

P[CUI(S0, T ) > c] =

1, c < 0P(ST > c + X, S0,T > K), 0 ≤ c ≤ K − X

P(ST > c + X), c > K − X

,

where the last expression applies because ST > K implies S0,T > K.

Consider first the case c ∈ [0, K − X ]. Letting Yt ≡ γt + Wt andσγ ≡ r − δ − σ2/2 ≡ r′ − σ2/2, where Wtt≥0 is a Brownian motionunder P, we have ST = S0e

σYT and S0,T = S0 exp(σY 0,T ). Therefore,

P(ST > c+X, S0,T > K) = P

[YT > σ−1 ln

(c + X

S0

), Y 0,T > σ−1 ln

(K

S0

)].

Now (7.55) and (7.57) imply that, for y ≤ y,

P(YT > y, Y 0,T > y) = P(Y 0,T > y) − P(YT ≤ y, Y 0,T > y)

= P(YT > y) + e2γyP(YT > y + 2γT )− e2γyP(YT ≤ y − 2y)

= Φ(−y + γT√

T

)+ e2γy

[Φ(−y + γT√

T

)− Φ

(y − 2y − γT√

T

)].

Substituting σ−1 ln( c+XS0

) for y and σ−1 ln(K/S0) for y and simplifying givefor P[CUI(S0, T ) > c] the expression

(K

S0

)2r′/σ2−1 Φ[q+

(S0

er′T K

)]−Φ

[q+

((c + X)S0

K2er′T

)]+ Φ

[q−

(S0e

r′T

K

)]

when 0 ≤ c ≤ K − X . For c > K − X it is easy to see that

P[CUI(S0, T ) > c] = P(ST > c + X) = Φ

[q−

(S0e

r′T

c + X

)].

Differentiating each expression to get the p.d.f. on the corresponding inter-val and evaluating e−rT

∫∞0 cf(c) · dc piecewise as e−rT

∫ K−X

0 cf(c) · dc +

Page 368: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 351

e−rT∫∞

K−Xcf(c) · dc, we have

CUI(S0, T ) = S0e−δT

(K

S0

)2r′/σ2+1

Φ

[q+

(K2er′T

S0X

)]

−Φ

[q+

(Ker′T

S0X

)]− Xe−rT

(K

S0

)2r′/σ2−1

×

Φ

[q−

(K2er′T

S0X

)]− Φ

[q−

(Ker′T

S0X

)]+ S0e

−δT Φ[q+

(S0e

r′T /K)]

− Xe−rT[q−

(S0e

r′T /K)]

,

where r′ = r − δ and

q±(x) =lnx ± σ2T/2

σ√

T.

Upon setting r′ = 0 and S0 = F0, the formula applies to a call on a futuresprice. Notice that as either K ↓ S0 or K ↓ X the expression for CUI(S0, T )reduces to the Black-Scholes formula for a vanilla European call on an assetpaying continuous dividends.

The Down-and-Out Put

Letting PDO(S0, T ) denote the initial value of the down-and-out, we have

PDO(S0, T ) = e−rT EPDO(ST , 0)

= e−rT E(X − ST )+1[K,∞)(S0,T ),

where 0 ≤ K < X and (again to exclude the trivial case) S0 > K. We shallfirst deduce the c.d.f. of PDO(S0, T ) under the martingale measure, thenderive the p.d.f. from that, then evaluate the expectation as

∫∞0 pf(p) · dp.

The c.d.f. of PDO(S0, T ) under P is

P(P ≤ p) =

0, p < 0P(P = 0) + P(S0,T > K, ST > X − p), 0 ≤ p < X − K

1, p ≥ X − K,

where

P(P = 0) = P[(ST ≥ X) ∪ (S0,T ≤ K)]

= P(ST ≥ X) + P(S0,T ≤ K) − P[(ST ≥ X) ∩ (S0,T ≤ K)].

Page 369: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

352 Pricing Derivative Securities (2nd Ed.)

Again letting Yt ≡ γt + Wt and σγ ≡ r − δ − σ2/2 ≡ r′ − σ2/2, we have,for 0 ≤ p < X,

P(S0,T > K, ST > X − p)

= P

[Y 0,T ≥ σ−1 ln

(K

S0

), YT ≥ σ−1 ln

(X − p

S0

)]= P

[YT ≥ σ−1 ln

(X − p

S0

)]− P

[Y 0,T < σ−1 ln

(K

S0

), YT ≥ σ−1 ln

(X − p

S0

)].

An application of (7.56) reduces this to an expression involving the marginaldistribution of YT ,

P

[YT ≥ σ−1 ln

(X − p

S0

)]− e2γσ−1 ln(K/S0)P

[YT ≥ σ−1 ln

(X − p

S0

)− 2σ−1 ln

(K

S0

)];

and this simplifies to give the value of P(P ≤ p) for 0 < p < X − K as

P(P = 0) + Φ

[q−

(S0e

r′T

X − p

)]−

(K

S0

)2r′/σ2−1

Φ

[q−

(K2er′T /S0

X − p

)].

On (0, X − K) the p.d.f. under the martingale measure is

f(p) =

dP(P ≤ p)/dp, 0 < p < X − K

0, elsewhere.

Differentiating and calculating e−rT∫ X−K

0 pf(p) · dp ultimately give as thevalue of the down-and-out put, PDO(S0, T ),

Xe−rT

Φ

[q−

(S0e

r′T

K

)]− Φ

[q−

(S0e

r′T

X

)]

−S0e−δT

Φ

[q+

(S0e

r′T

K

)]− Φ

[q+

(S0e

r′T

X

)]

−(

K

S0

)2r′/σ2−1

Xe−rT

Φ

[q−

(Ker′T

S0

)]− Φ

[q−

(K2er′T

S0X

)]

+(

K

S0

)2r′/σ2+1

S0e−δT

Φ

[q+

(Ker′T

S0

)]− Φ

[q+

(K2er′T

S0X

)],

Page 370: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 353

where r′ = r − δ and

q±(x) =lnx ± σ2T/2

σ√

T.

Setting r′ = 0 and S0 = F0 produces the corresponding formula for a puton a futures price.

7.4.4 Ladder Options

Payoffs of barrier options depend on whether the price path of the under-lying has at some point crossed the pertinent barrier, but given terminalprice ST there is only one possible terminal payoff. European ladder optionsare a bit more sophisticated. There are several barriers, called “rungs”, andfinal payoffs depend on how many of these have been crossed, as well as onthe terminal price. Consider first a ladder call having strike X and rungsKn > Kn−1 > · · · > K1 > X . For some j ≥ 1 suppose the jth rung, Kj ,is the highest one exceeded by the maximum price during [0, T ). Then thepayoff at T is the greatest of 0, ST −X , and Kj −X . If, however, the pricenever gets above the first rung, K1, then the payoff is just the greater of 0and ST − X . As a notational device, introducing K0 ≡ 0 allows payoffs inboth cases to be expressed as

CL(ST , 0) = (ST − X)+ ∨ (Kj − X)+ = (ST − X)+ ∨ (Kj − X).

Figure 7.7 depicts the payoffs of a 2-rung ladder call on two alternativeprice paths, both of which pass K1 but fail to reach K2. On the lower trackthe terminal price winds up below K1, so the payoff is K1 − X . On theupper track, where the terminal price is above K1, the payoff is ST − X .With a last piece of notation, Kn+1 ≡ +∞, the whole payoff structure canbe captured succinctly as

CL(ST , 0) =n∑

j=0

[(ST − X)+ ∨ (Kj − X)

]· 1[Kj,Kj+1)(S0,T ).

The ladder put’s payoff works symmetrically. Taking 0 ≡ K′n+1 < K

′n <

K′n−1 < · · · < K

′1 < X and K

′0 = +∞, it pays (X − ST )+ ∨ (X − Kj)+ if

price ever drops below rung j, so that12

PL(ST , 0) =n∑

j=0

[(X − ST )+ ∨ (X − K

′j)]· 1(K

′j+1,K

′j](S0,T ).

12For the term j = n note that bP(S0,T = 0) = 0.

Page 371: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

354 Pricing Derivative Securities (2nd Ed.)

Fig. 7.7. Alternative payoffs of a ladder call.

Pricing a Ladder Call

We shall work out the arbitrage-free price of a call with n rungs to illustratethe application of martingale pricing to ladder options. As usual, we beginby developing the c.d.f. under P from the option’s payoff structure. Anextra complication here that was avoided with barrier options is that theconditional distribution of the terminal payoff does change substantively aswe move through time, since knowing that j rungs have been surpassed byt tells us that the payoff cannot be less than (Kj − X)+. Fortunately, thiscauses no great difficulty.

Standing at t ∈ [0, T ), we observe current price St and the highestsurpassed rung. That is, we observe St and a jt ∈ 0, 1, . . . , n such thatS0,t ∈ [Kjt

, Kjt+1), where we continue to take K0 = 0 and Kn+1 = +∞.Then a little reflection shows that the conditional c.d.f. of the terminalpayoff is Pt[CL(ST , 0) ≤ c] = 0 for c < (Kjt

− X)+ and

Pt[CL(ST , 0) ≤ c] = Pt(ST ≤ c + X, St,T < Kj+1) (7.67)

for (Kj − X)+ ≤ c ≤ Kj+1 − X and j ∈ jt, jt + 1, . . . , n. For example,taking j = 2 and picking a c between K2−X and K3−X , the payoff wouldbe larger than c if either ST > c+X or the maximum price from t forward

Page 372: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 355

were at least K3. Notice that the c.d.f. has discontinuities at values of c

corresponding to all the rungs. Specifically, for j ∈ jt, jt + 1, . . . , n andc = (Kj − X)+

Pt[CL(ST , 0) = c] = Pt(ST ≤ c + X, Kj ≤ St,T < Kj+1),

which is the probability that the jth rung is the highest one passed andthat the terminal price ends up below it. (Note that Pt(ST = Kj) = 0.)figure 7.8 illustrates for a 2-rung call.

The arbitrage-free price of the ladder call as of t ∈ [0, T ) is

CL(St, τ) = e−rτ EtCL(ST , 0)

= e−rτ

∫[0,∞)

c · dPt[CL(ST , 0) ≤ c],

where τ = T − t and the integral is Lebesgue-Stieltjes. Separating out theatomic points that have positive probability mass, this is written as

CL(St, τ) = e−rτn∑

j=jt

∫ Kj+1−X

(Kj−X)+cP′

t

[CL(ST , 0) ≤ c

]· dc

+ (Kj − X)+Pt

[CL(ST , 0) = (Kj − X)+

], (7.68)

Fig. 7.8. Payoff c.d.f. for ladder call option.

Page 373: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

356 Pricing Derivative Securities (2nd Ed.)

where P′t(·) ≡ dPt(·)/dc. Now evaluate Pt[CL(ST , 0) ≤ c] at the continuity

points using (7.67) and (7.55), and then differentiate to get

P′t[C

L(ST , 0) ≤ c] = φ(z) · dz

dc−

(Kj+1

St

)2r′/σ2−1

φ(z′) · dz′

dc.

Here

z ≡ q+

(c + X

ft

),

z′ ≡ q+

(c + X

ft

S2t

K2j+1

),

q±(x) ≡ lnx ± σ2τ/2σ√

τ,

and ft ≡ f(t, T ) = Ster′τ is the underlying’s T -delivery forward price at t.

Evaluating the integrals and probabilities in (7.68) leads finally to the fol-lowing expression for CL(St, τ):

e−rτn∑

j=jt

⟨X

(Kj+1

St

) 2r′σ2 +1

Φ

[q+

(ht

j+1S2t

K2j+1

)]− Φ

[q+

(ht

jS2t

K2j+1

)]

− ft

(Kj+1

St

) 2r′σ2 +1

Φ

[q−

(ht

j+1S2t

K2j+1

)]− Φ

[q−

(ht

jS2t

K2j+1

)]+ ftΦ[q−(ht

j+1)] − Φ[q−(htj)] − XΦ[q+(ht

j+1)] − Φ[q+(htj)]

+ (Kj − X)+

(

Kj

St

) 2r′σ2 −1

Φ

[q+

(ht

jS2t

K2j

)]

−(

Kj+1

St

) 2r′σ2 −1

Φ

[q+

(ht

jS2t

K2j+1

)]⟩

,

where r′ ≡ r − δ and htj ≡ (Kj ∨ X)/ft.

7.4.5 Asian Options

Payoffs of “Asian” or “average-price” options depend on the average price ofthe underlying over all or part of the option’s lifetime. There are two generalclasses. Payoffs of fixed-strike options depend on the difference between theaverage price of the underlying and a fixed strike price X ; while variable-strike or floating-strike options depend on the difference between terminal

Page 374: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 357

Table 7.3. Terminal values of fixed- and variable-strikeAsian options.

Class Put Call

Fixed-strike (X − A[t∗,T ])+ (A[t∗,T ] − X)+

Variable-strike (A[t∗,T ] − ST )+ (ST − A[t∗,T ])+

price ST and the average price. These options are almost always European.Table 7.3 states explicitly the terminal payoffs of the two classes of Asianputs and calls. Here A[t∗,T ] represents the average over a specified periodextending from some t∗ ≥ 0 up to T . Averaging is typically done arith-metically. (We discuss geometric-averaging in the last part of the section.)Depending on the application the average can be updated as often as daily,or it can be based on as few as 10–15 points over the course of severalmonths.

Asians are attractive vehicles for hedging risks associated with pay-ments that are distributed more or less evenly over time. For example, anexporter making continual sales abroad might find it convenient to hedgewith an Asian currency option rather than a vanilla option that dependson the exchange rate at one date only. Asians also take some of the riskout of hedging in options on thinly traded commodities, whose price fluc-tuations on any one day may be very large. In this sense, too, the lowervolatility associated with the average price tends to make Asians cheaperthan otherwise identical vanilla options. Since both advantages are morepronounced for the fixed-strike class, they are by far the more common ofthe two types. We focus entirely on fixed-strike options here. Levy (1992)discusses the pricing of variable-strike Asians.

As we have seen, the Black-Scholes framework is well suited for pricingmany path-dependent options, because it lets us exploit known facts abouthitting times and extrema of Brownian motion. Unfortunately, modelingunderlying prices as geometric Brownian motion does not facilitate themartingale pricing of arithmetic Asian options. The reason is that thereis no simple representation for the distribution of a discrete arithmeticaverage of jointly lognormal variates. (As we shall see, things are mucheasier in the case of geometric averages.) Inevitably, simple formulas forarbitrage-free prices within the Black-Scholes framework must be approx-imations of one form or another, and many ways of producing these havebeen proposed in what is now a large literature on the subject. Our surveybegins by introducing notation and deriving some general conditions that

Page 375: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

358 Pricing Derivative Securities (2nd Ed.)

arbitrage-free prices of fixed-strike, arithmetic Asians must satisfy. We thensummarize some of the solutions that have been put forward before turningto a detailed treatment of one method that seems to offer a particularlygood balance among conceptual simplicity, quickness of computation, andaccuracy. The final part develops the analytical solution for prices of Asianswhose payoffs depend on continuous geometric averages of price.

Notation and Useful Identities

Consider a fixed-strike Asian option issued at t = 0, expiring at T , andhaving terminal payoff depending on A[t∗,T ] = (N + 1)−1

∑Nj=0 Stj , which

is the arithmetic average of prices at the N +1 discrete times 0 ≤ t∗ ≡ t0 <

t1 < · · · < tN ≡ T . For t ∈ [0, T ) let Nt = maxj : tj < t − 1. That is, Nt

is one fewer than the number of “fixing points” already passed as of timet. Define the average price up to any t ∈ [0, T ] as

A[t∗,t] =

0, t < t∗

(Nt + 1)−1

Nt∑j=0

Stj , t∗ ≤ t ≤ T.

We can then decompose A[t∗,T ] into a part that has already been observedand a part that has not:

A[t∗,T ] = (N + 1)−1

Nt∑j=0

Stj +N∑

j=Nt+1

Stj

=

Nt + 1N + 1

A[t∗,t] +N − Nt

N + 1A(t,T ].

The price of an Asian fixed-strike call at t < T can, therefore, be written as

CAF (St, A[t∗,t]; T − t) = e−r(T−t)Et(A[t∗,T ] − X)+

= e−r(T−t)Et

(Nt + 1N + 1

A[t∗,t] +N − Nt

N + 1A(t,T ] − X

)+

=N − Nt

N + 1e−r(T−t)Et(A(t,T ] − Xt)+, (7.69)

where

Xt ≡N + 1N − Nt

X − Nt + 1N − Nt

A[t∗,t]. (7.70)

Page 376: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 359

Thus, at any t the problem boils down to pricing a call on the average priceduring the option’s remaining life.

As Levy (1992) points out, the no-arbitrage condition requires that putsbe priced off the call (or vice versa) using a version of put-call parity thatcomes from the following argument. A long position in a put coupled witha short position in a call at the same fixed strike guarantees a cash receiptat expiration equal to

X − A[t∗,T ] =

X − (N + 1)−1Nt∑j=0

Stj

+ (N + 1)−1N∑

j=Nt+1

Stj .

This future receipt can be exactly replicated at t by borrowing

e−r(T−t)

X − (N + 1)−1Nt∑j=0

Stj

currency units and entering forward contracts to deliver (N +1)−1e−r(T−tj)

of the underlying “stock” at each tj ∈ Nt + 1, . . . , N. To see that this isthe right replicating strategy, note that at each such tj one can meet theobligation to deliver stock by borrowing (N + 1)−1e−r(T−tj)Stj currencyunits to finance the purchase, thereby committing to repay (N + 1)−1Stj

at time T . The present value of the receipt X − A[t∗,T ] is then

V (t, T ) = e−r(T−t)

X − (N + 1)−1Nt∑j=0

Stj

+e−r(T−t)

N + 1

N∑j=Nt+1

f(t, tj),

(7.71)where f(t, tj) = Ste

(r−δ)(tj−t) is the forward price at t for tj delivery, assum-ing a continuous yield at rate δ. Therefore, V (t, T ) must equal the differencein the arbitrage-free prices of the put and call at t.

A Survey of Proposed Pricing Solutions

Having seen that the key to valuing the arithmetic-average Asian call iscarrying out the expectation in (7.69), we can now look at some waysthat have been proposed to do this. To simplify notation a bit, supposewe stand at t ∈ [0, T ) and that tj is the time at which the next priceenters the average. Set τj = tj − t, τj+1 = tj+1 − tj , and so on, withτN = T − tN−1. Thus, τj is the time until the next price enters the average,while the remaining τ ’s are times between successive price fixings. Letting

Page 377: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

360 Pricing Derivative Securities (2nd Ed.)

Yk ≡ (r−δ−σ2/2)τk+σWτkfor k ∈ j, j+1, . . . , N, where Wt is a Brow-

nian motion under P, we have Stj = St exp(Yj), Stj+1 = St exp(Yj +Yj+1) =Stj exp(Yj+1), and so on, with ST = St exp(

∑Nk=j Yk) = StN−1 exp(YN ).

Each price can thus be built up from the previous one by multiplying byan exponentiated normal (that is, a lognormal) variate that is independentof all that have gone before.

This recursive setup lends itself ideally to Monte Carlo simulation. Theway it works is very simple. Beginning with initial price St, we take adraw Z from a standard normal distribution, scale by σ

√τj , add mean

(r− δ−σ2/2)τj , and obtain a realization of Stj , the first new price to enterthe average, as

Stj = St exp[(r − δ − σ2/2)τj + σ√

τjZ].

Taking another draw and constructing Yj+1 we generate Stj+1 , and so onto the end. Calculating the average of these prices yields one realization of(A(t,T ]−X)+. Repeating the procedure M −1 more times delivers a sampleof M realizations, which are simply averaged to estimate Et(A(t,T ] − X)+.With large enough M (and enough computer time) any desired degree ofaccuracy can be attained. We discuss in chapter 11 some ways to enhancethe efficiency of the process, including the use of special control variates forvaluing continuously averaged Asian options. Although slower than ana-lytical approximations, simulation provides the standard against which theaccuracy of other methods has been judged.

Another computationally intensive procedure, proposed by Carverhilland Clewlow (1990), provides what is essentially an exact numerical solutionto the problem. To understand the idea it is best to start with a specificcase as an example. Take j = 1, so that t1 is the next time at which a priceenters the average, and suppose there are in all N = 3 more prices to come.We then have

A(t,T ] = (St1 + St2 + St3)/3

= St(eY1 + eY1+Y2 + eY1+Y2+Y3)/3

= SteY1 [1 + eY2(1 + eY3)]/3.

Starting at the end of the chain, the first step is to find the p.d.f. of Q2 ≡ln(1 + eZ3), where (to appreciate the recursive pattern) we take Z3 ≡ Y3.Since Y3 is distributed under P as N [(r − δ − σ2/2)τ3, σ

2τ3], finding fQ2(·)

Page 378: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 361

requires just an application of the change-of-variable formula. Thus,

fQ2(q) = φ[ln(eq − 1); (r − δ − σ2/2)τ3, σ2τ3]

∣∣∣∣ d

dqln(eq − 1)

∣∣∣∣= φ

[ln(eq − 1); (r − δ − σ2/2)τ3, σ

2τ3

]( eq

eq − 1

),

where φ(·; α, β2) is the normal density with mean α and variance β2. Eval-uating fQ2(q) on a discrete set of points q ∈ Q, the characteristic function,ΨQ2(ζ) =

∫∞−∞ eiζqfQ2(q) · dq, is then found on a discrete set ζ ∈ Ξ using

the fast Fourier transform (f.F.t.), an algorithm that provides an efficientway of doing the calculation.13 The c.f. of Y2 can of course be expressedanalytically. Taking the product of this and ΨQ2(ζ) yields ΨZ2(ζ), which isthe c.f. of

Z2 ≡ Y2 + Q2 = ln[eY2(1 + eY3)].

The density of Z2, which is the convolution of the densities of Y2 and Q2, isthen evaluated on the grid Q by inverting ΨZ2(ζ) (again using the f.F.t.), as

fZ2(z) =12π

∫ ∞

−∞e−iζzΨZ2(ζ) · dζ.

Another application of the change-of-variable formula gives the density ofQ1 ≡ ln(1 + eZ2), from which its c.f., ΨQ1(ζ), is then found. Taking theproduct of ΨQ1(ζ) and ΨY1(ζ) gives ΨZ1(ζ), the c.f. of

Z1 ≡ Y1 + Q1 = lneY1 [1 + eY2(1 + eY3)],

which is again inverted to find the density. Finally, one more change ofvariable yields the density of

A(t,T ] = SteZ1/3.

Having thought through the specific case, it is easy to see how thegeneral procedure goes. Let the next time at which a price enters the averagebe tj , and suppose there are in all N − j + 1 more prices to enter. Startingoff by defining ZN ≡ YN , proceed step by step for k = N − 1, N − 2, . . . , j

to find the density of Qk = ln(1+eZk+1) by the change-of-variable formula,and then the density of Zk = Yk +Qk by using the c.f.s to convolve fYk

and

13The f.F.t. is discussed in connection with stochastic volatilty models in chapters 8and 9. Routine INVFFT on the accompanying CD implements the f.F.t. procedure in Presset al. (1992) to invert the c.f. of a centered, scaled random variable.

Page 379: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

362 Pricing Derivative Securities (2nd Ed.)

fQk. Having worked all the way back to Zj , find the p.d.f. of A(t,T ] from the

transformation (SteZj )/(N − j +1) and use it to calculate Et(A(t,T ]−X)+.

Since both this approach and Monte Carlo require extensive computa-tion, there have been many attempts to develop fast approximations. Incertain cases simplifications can be achieved by treating the averaging ascontinuous, in which case

A(t,T ] = (T − t)−1

∫ T

t

Su · du.

In the special situation that the average to date, A[t∗,t], is such as to assurethat the option will finish in the money, Geman and Yor (1992, 1993) andGeman and Eydeland (1995) present an appealing closed-form solution.However, dealing with the usual case that the option’s terminal moneynessis not yet known still requires a lot of calculation. Milevsky and Posner(1998) develop a relatively simple pricing formula using the fact that asT → ∞ the reciprocal of

∫ T

tSu ·du converges in distribution to the gamma

under the condition r − δ − σ2/2 < 0. This condition would obviously holdfor futures, but there are many applications in which it would not.

Method-of-Moments Solutions

Several relatively simple techniques for pricing Asian options are based onlognormal approximations to the distribution of A(t,T ]. Such approxima-tions are often quite good, especially so when the variability of the variatescomposing the average is not too large. Recall that if X is distributed as log-normal, its distribution is characterized by just two parameters, the meanand variance of lnX . Therefore, the task of fitting a lognormal distributionto A(t,T ] comes down to choosing these two parameters. Levy and Turnbull(1992) describe several such methods, of various degrees of complexity. Oneof the simplest, which nevertheless works quite well when the annualizedvolatility of the underlying price is no more than 0.30, is a procedure pro-posed by Levy (1992). We now consider this method in detail.

Levy’s approach works much like the classical method of moments instatistical estimation. Begin with the first two moments of A(t,T ] about theorigin—µ′

1 = EtA(t,T ] and µ′2 = EtA

2(t,T ], say—and then find parameters

α = α(t, T ) and β = β(t, T ) that equate the corresponding lognormalmoments to these by solving the equations µ′

1 = Eeα+βZ = eα+β2/2 andµ′

2 = E(e2α+2βZ) = e2α+2β2. The solutions are

α = 2 lnµ′1 − ln

√µ′

2, β2 = lnµ′2 − 2 lnµ′

1. (7.72)

Page 380: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 363

To find expressions for µ′1 and µ′

2 it is helpful to note first that the price ofthe underlying at tj is Stj = Ste

R(Nt+1,j) for j ∈ Nt + 1, . . . , N, where

R(i, j) ≡j∑

=i

Yτ∼ N [(r − δ − σ2/2)(tj − ti), σ2(tj − ti)].

The first moment is then

µ′1 = (N − Nt)−1St

N∑j=Nt+1

EteR(Nt+1,j)

= (N − Nt)−1St

N∑j=Nt+1

e(r−δ−σ2/2)(tj−t)+σ2(tj−t)/2

= (N − Nt)−1St

N∑j=Nt+1

e(r−δ)(tj−t), (7.73)

and the second moment is

µ′2 =

(St

N − Nt

)2 N∑j=Nt+1

N∑k=Nt+1

EteR(Nt+1,j)+R(Nt+1,k)

=(

St

N − Nt

)2 N∑

j=Nt+1

Ete2R(Nt+1,j)

+ 2N−1∑

j=Nt+1

N∑k=j+1

Ete2R(Nt+1,j)+R(j+1,k)

.

Using the independence of R(Nt + 1, j) and R(j + 1, k), evaluating theexpectations, and simplifying give

µ′2 =

(St

N − Nt

)2 N∑

j=Nt+1

e2(r−δ+σ2/2)(tj−t)

+ 2N−1∑

j=Nt+1

N∑k=j+1

e(r−δ+σ2/2)(tj+tk−2t)

.

Putting this all together using (7.69) gives as the two-moment approxi-mation to the value CAF (St, A[t∗,t]; T − t) of the fixed-strike Asian call theexpression

N − Nt

N + 1e−r(T−t)

[µ1Φ

(α − lnXt

β+ β

)− XtΦ

(α − lnXt

β

)],

Page 381: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

364 Pricing Derivative Securities (2nd Ed.)

where Xt is given in (7.70) and α and β are as in (7.72). The approximatearbitrage-free price of the put would then be determined from the parityrelation as

PAF (St, A[t∗,t]; T − t) = CAF (St, A[t∗,t]; T − t) + V (t, T ),

where V (t, T ) is given in (7.71).

Asians Based on Geometric Averages

The fact that products of jointly lognormal variates are themselves lognor-mal makes the pricing of geometric-average, fixed-strike Asian options muchmore straightforward. With averaging starting at t∗ = 0 and N fixings attimes 0 = t0 < t1 < · · · < tN−1 < tN = T the (time-weighted) geometricmean is

GN [0, T ] =N∏

j=1

S(tj−tj−1)/Ttj

= exp

T−1N∑

j=1

lnStj (tj − tj−1)

.

Of course, if the spacing is equal with tj = jT/N , then each factor (tj −tj−1)/T is just N−1. Taking St = S0 exp(νt+σWt), where ν ≡ r− δ−σ2/2and Wtt≥0 is a Brownian motion under P, this is

GN [0, T ] = S0 exp

T−1N∑

j=1

(νtj + σWtj )(tj − tj−1)

.

The distribution of GN [0, T ] is clearly lognormal, and it is a simplematter to work out the mean and variance of lnGN [0, T ]. We considerinstead the pricing of puts and calls whose payoffs depend on the continuousmean. This is defined as

G[0,T ] ≡ exp(

T−1

∫ T

0

lnSt · dt

)= S0 exp

(νT/2 + σT−1

∫ T

0

Wt · dt

)for T > 0, and G[0,0] ≡ S0.

We first need the distribution of G[0,T ] for T > 0. Example 33 on page 98showed that

∫ T

0Wt · dt is distributed as N(0, T 3/3), so

ln(G[0,T ]/S0) = νT/2 + σT−1

∫ T

0

Wt · dt ∼ N(νT/2, σ2

√T/3

).

Page 382: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

American Options and “Exotics” 365

For t ∈ [0, T ] we can therefore write

G[0,t] ∼ S0 exp(νt/2 + σWt/

√3)

, (7.74)

where Wtt≥0 is a P Brownian motion. However, to price an option at anyt ∈ (0, T ) when St and G[0,t] are known requires knowing the conditionaldistribution of G[0,T ] given Ft. Letting τ ≡ T − t, we can express G[0,T ] interms of G[0,t] and St as

G[0,T ] = Gt/T[0,t]G

τ/T[t,T ]

= Gt/T[0,t]

S0 exp

[τ−1

∫ T

t

(νs + σWs) · ds

]τ/T

= Gt/T[0,t]

S0e

νt+σWt exp

τ

∫ T

t

(s − t) · ds +σ

τ

∫ T

t

(Ws − Wt) · ds

]τ/T

= Gt/T[0,t]

[St exp

τ

∫ τ

0

u · du +σ

τ

∫ τ

0

Wu · du

)]τ/T

= Gt/T[0,t]S

τ/Tt exp

T

(ντ/2 +

σ

τ

∫ T

t

Wu

)· du

],

where Wuu≥0 = Wt+u − Wtu≥0 is a standard P Brownian motion.Using (7.74) gives

G[0,T ] ∼ Gt/T[0,t]S

τ/Tt exp

[ τ

T

(ντ/2 + σWτ/

√3)]

. (7.75)

To put this to work in the simplest way, set Yt ≡ Gt/T[0,t]S

τ/Tt so that

(7.75) corresponds to

YT = Yt exp[(r − δ′ − β2/2)τ + βWτ ], (7.76)

where β = στ/(T√

3) and δ′ = r − ντ/(2T ) − σ2τ2/(6T 2). The corre-spondence between (7.75) and (7.76) lets us express prices of continuousgeometric-average, fixed-strike Asian options in terms of the Black-Scholesformula on an underlying with price Yt, volatility β, and dividend rate δ′.For example, the arbitrage-free value, PAF (St, G[0,t]; τ), of a T -expiring putstruck at X is

BXΦ

q+

BXeδ′τ

Gt/T[0,t]S

τ/Tt

− Gt/T[0,t]S

τ/Tt e−δ′τΦ

q−

BXeδ′τ

Gt/T[0,t]S

τ/Tt

,

Page 383: Pricing derivative securities

April 20, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch07 FA

366 Pricing Derivative Securities (2nd Ed.)

where B ≡ B(t, T ) is the price at t of a T -maturing riskless bond andq±(x) ≡ (lnx ± β2τ/2)/(β

√τ). When t = 0 (in which case τ = T and

G[0,0] = S0), this simplifies to

PAF (S0; T ) = BXΦ

[q+

(BX0e

δ′T

S0

)]− S0e

−δ′T q−(

BX0eδ′T

S0

),

(7.77)with δ′ = r − ν/2 − σ2/6 = (r + δ + σ2/6)/2.

Page 384: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

8Models with Uncertain Volatility

This chapter and the next treat the pricing of derivatives when the dynamicsof underlying prices are modeled in ways that fit the data better than doesBlack-Scholes. We continue to consider just derivatives on stocks, indexes,and currencies, whose prices are relatively insensitive to stochastic variationin interest rates. Thus, we still treat future Rates and bond prices as Knownin accord with assumption RK from chapter 4.

8.1 Empirical Motivation

Two sets of observations motivate the search for richer models than Black-Scholes dynamics:

• Empirical findings that underlying prices themselves violate the salientpredictions of that model; and

• Systematic discrepancies between actual market prices of derivatives andthose predicted within the Black-Scholes framework.

We review the evidence on both counts.

8.1.1 Brownian Motion Does Not Fit Underlying Prices

The key predictions of the geometric Brownian motion model are (i) thatchanges in the logarithm of price over fixed intervals of time are nor-mally distributed, (ii) that increments of log price over nonoverlappingtime periods of equal length are independent and identically distributed,and (iii) that price paths are almost surely continuous. There is compellingevidence that prices of financial assets and commodities violate all three.

367

Page 385: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

368 Pricing Derivative Securities (2nd Ed.)

The nonnormality of marginal distributions of logs of price relatives(that is, of continuously compounded returns) of stocks, indexes, currencies,commodities, and even real estate over fixed intervals is documented inmany studies.1 Empirical distributions of changes in log price are usuallyfound to have thicker tails than the normal, indicating a higher frequencyof outliers than would be consistent with a Gaussian law with the samemean and variance. Although the divergence from normality is slight whenreturns are measured over periods as long as one month, it is very prominentin daily returns. For example, in daily data for common stocks estimates ofkurtosis, α4 = µ4/σ4, are often in the 5.0–10.0 range vs. 3.0 for the normaldistribution.

Consistent with the thick tails in the data, but inconsistent with theindependent-increments feature of geometric Brownian motion, are theobservations that absolute (or squared) returns (e.g., | ln(St/St−1)|) arehighly predictable from their past and that shocks to the returns processtend to be highly persistent. One interpretation of these findings is thatthe volatility parameter that pertains at any point in time depends in someway on past price realizations. Evidence of variable and persistent volatil-ity is given in Chou (1988) and Bollerslev et al. (1992).2 There are alsoindications that the variance of ln(St/St−1) for stocks tends to be inverselyrelated to the initial price level. Often called the “leverage effect” sinceits early description by Black (1976a) and the studies by Schmalensee andTrippi (1978) and Christie (1982), the standard explanation has been thatfirms financed partly by debt become more highly leveraged when theirstocks’ prices fall, making the stocks riskier and their prices more volatile.While leverage per se now seems unlikely to be the entire explanation,3

there is no doubt that volatility does move inversely to the price level formany financial assets.

Temporal variation in dispersion implies that empirical marginal dis-tributions of returns are mixtures—that is, draws from distributions that

1For example, Mandelbrot (1963), Fama (1965), Officer (1972), Praetz (1972), Clark(1973), Blattberg and Gonedes (1974), Boothe and Glassman (1987), Young and Graff(1995), Mandelbrot and Hudson (2004). McCulloch (1996) and McDonald (1996) providedetailed surveys of this literature.

2See Ghysels et al. (1996) and Palm (1996) for extensive references that furtherdocument the conditional heteroskedasticity in returns and describe efforts to model it.

3For example, it appears that volatility of stock prices rises and stays higher afterstock splits that involve no substantive change in financial structure. For the evidencesee Ohlson and Penman (1985) and Dubofsky (1991).

Page 386: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

Models with Uncertain Volatility 369

differ in scale. The finding of excess kurtosis in the marginal distributionsof returns is consistent with this; however, the extreme thickness of tails ofempirical distributions of daily returns seems too large to be explained bystandard models of time-varying volatility with Gaussian errors. Account-ing for the extreme price movements that occur occasionally across entiremarkets (as in U.S. stock markets on 19 October 1987) and much more oftenfor individual assets seems to call for models that allow for discontinuitiesin price paths.

8.1.2 Black-Scholes No Longer Fits Option Prices

Observations of derivatives’ prices also indicate the need for richer modelsof price dynamics. Section 6.4.3 described the anomalous nature of post-1987 Black-Scholes implicit volatilities derived from market prices of stock.The characteristic smile and smirk effects, which are also prominent inprices of options on stock indexes and currencies, are depicted stylistically infigure 6.4. The smile effect is that implicit volatilities of options with strikesfar from the current underlying price are higher than for options near themoney. This means that Black-Scholes prices for options at the extremesof moneyness are too low relative to market prices when calculated withvolatilities backed out from prices of near-the-money options. The effectis qualitatively consistent with the evidence that risk-neutral distributionsof logs of terminal prices are thick-tailed, since out-of-money options aremore apt to end up in the money when the probability of outliers is high.The smirk effect, manifested in the steeply descending left branch of theplot of implicit volatility vs. X/St, indicates that Black-Scholes especiallyunderprices out-of-money puts. The high valuations the market places onthese instruments are consistent with an excess of probability mass in theleft tail of the risk-neutral distribution of terminal price; that is, with aleft-skewed conditional distribution of log price or log return, ln(ST /St).This effect is consistent with a special fear of market reversals like that ofOctober 1987. It could reflect either the perception that such price drops arelikely to reoccur, or else that people put high values on contingent claimsthat pay off in such states of the world.

Shapes of implicit-volatility curves typically vary over time, and thechanges sometimes seem to reflect specific political and economic events.For example, in S&P 500 futures options Kochard (1999) finds evidence ofenhanced left skewness in the risk-neutral distribution of ln(ST /St) aroundthe time of the 1991 Gulf War. He also documents a systematic increase

Page 387: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

370 Pricing Derivative Securities (2nd Ed.)

in left skewness following large negative price moves in the futures. Thereis no obvious way to determine whether these short-term variations comemainly from changes in subjective probabilities or from changes in tastes—subjective valuations.

There is also a term structure associated with implicit volatilities. Smileand smirk effects are most pronounced in prices of options with short matu-rities, the volatility curve flattening out as time to expiration increases.Also, both implicit volatilities at particular levels of moneyness and aver-ages of volatilities across levels may vary systematically with time to expi-ration. Even these relations can be unstable. Taylor and Xu (1994) find incurrency options that the slopes of volatility-to-maturity curves sometimeschange sign over time.4

This chapter and the next treat models that can capture at least some ofthe discrepant features observed in prices of underlying assets and deriva-tives. The present chapter focuses on the modeling of volatility within theframework of continuous Ito processes. We first consider models in whichvolatility coefficients depend on the current price level or on current andpast prices. Since the Wiener process that drives the underlying price isthe only source of risk, these models permit dynamic replication of deriva-tives’ payoffs and are therefore consistent with the existence of a uniquemartingale measure. In this setting, as in the Black-Scholes environment,derivatives prices can be determined from arbitrage considerations alone.However, the equivalent martingale measure is no longer unique when thereare other risks that cannot be hedged with traded assets. Section 8.3 consid-ers models in which the stochastic volatility process cannot be so replicated.We shall see that in this situation there may be many preference-dependentprices, all consistent with the absence of arbitrage. The same situation isencountered in chapter 9, where we first consider models of discontinuousprice processes.

8.2 Price-Dependent Volatility

This section deals primarily with the following class of models for underlyingprices under risk-neutral measure P:

dSt/St = rt · dt + σ(St, t) · dWt, (8.1)

4Bates (1996b) gives an extensive survey of empirical findings on the nature andevolution of volatility curves.

Page 388: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

Models with Uncertain Volatility 371

where short-rate process rtt≥0 is deterministic and Wtt≥0 is a Brownianmotion. The key feature is that σ(St, t) depends on the current value of theunderlying, while still satisfying the regularity condition

∫ T

0σ(St, t)2 · dt <

∞ a.s. for each finite T . Models of this sort were first proposed by Coxand Ross (1976) to capture the leverage effect, whereby volatility variesinversely with the price level. We begin by examining certain qualitativefeatures of option prices under this general setup, discovering that theyhave much in common with those of Black-Scholes. Specific computationalformulas for prices of European options can be worked out under certainspecifications of σ(t, St). We derive two such formulas, then describe hownumerical methods can be used to handle more general cases. Unfortunately,there is evidence that some fairly flexible specifications of σ(t, St) still failto predict option prices out of sample better than does an ad hoc versionof Black-Scholes with strike-specific volatilities. This suggests that a stillricher class of models is needed, such as those in the following section thatallow for additional stochastic influences on volatility. However, there is analternative model that is more in the spirit of (8.1). We shall see that time-varying smiles and smirks might be well accounted for without additionalrisk sources by allowing σ to depend on past as well as present values ofthe underlying price.

8.2.1 Qualitative Features of Derivatives Prices

Most of the qualitative features of derivatives prices in the Black-Scholessetting turn out to extend to general diffusion processes. Recall that diffu-sion processes are Ito processes with the Markov property, and that prop-erty is preserved if volatility depends on time and the current price ofthe underlying, as in (8.1). Comparing the models’ features, option pricesremain weakly monotone and convex in the prices of the underlying, andthe comparative statics with respect to time to expiration and volatilityare also qualitatively the same as for Black-Scholes. Monotonicity and con-vexity hold as well when volatility depends on some other diffusion process(or processes) whose drift and volatility parameters do not depend on theprice of the underlying asset. However, some strange behavior is possiblewhen the latter condition is violated.

These conclusions are due to Bergman et al. (1996). Many of themfollow from a simple and intuitive result that they call the “no-crossinglemma”. Let St0≤t be a diffusion process on a filtered probability space(Ω,F , Ft, P), whose motion under the risk-neutral measure is described

Page 389: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

372 Pricing Derivative Securities (2nd Ed.)

by (8.1). Of course, to each realization Wt(ω)t≥0 of a Brownian paththere corresponds a path St(ω)t≥0 for price.

Lemma 10 Suppose S′t(ω)t≥0 and S′′

t (ω)t≥0 are processes with thesame drift and volatility parameters that are generated by the same Brow-nian path Wt(ω)t≥0 but from different initial values, S′

0 > S′′0 . Then

S′t(ω) ≥ S′′

t (ω) for all t ∈ [0, T ] and almost all ω.

The restriction “almost all ω” just means that the exceptional outcomesfor which the result fails form a set of P measure zero. The proof exploitsthe continuity and Markov behavior of diffusion processes. By continuity,S′′

t > S′t for some t only if the paths crossed at some τ ∈ (0, t). But from

the Markov property, S′τ = S′′

τ implies that S′t = S′′

t for all t > τ , which isa contradiction. The processes could, in fact, merge in this way by hittingan absorbing barrier, such as the origin.

A simple application of this result is to show that the delta of aEuropean-style derivative on an underlying asset whose price follows (8.1)is bounded by the extremes of the slope of the terminal payoff function withrespect to ST . For example, the delta of a European call on a no-dividendstock is bounded by zero and unity. When the terminal payoff function iseverywhere differentiable, the proof is a simple consequence of the mean-value theorem, but Bergman et al. show that the result extends to payofffunctions with jump discontinuities (e.g., those of digital options) and func-tions whose right and left derivatives exist but may be unequal (e.g., thoseof ordinary puts and calls).

To see this, let ∆ST (ω) ≡ S′T (ω) − S′′

T (ω) be the difference in termi-nal values of two price paths corresponding to the same Brownian pathbut with initial values S′

0 > S′′0 , and let D−

S ≡ infDS(ST , 0) : ST ≥ 0and D+

S ≡ supDS(ST , 0) : ST ≥ 0 be the infimum and supremum ofslopes of the terminal payoff function. With P as the risk-neutral measureand a deterministic short-rate process, we have e−

R Tt

rs·dsEtD(ST , 0) =D(St, T − t) and e−

R Tt

rs·dsEt∆ST = ∆St ≥ 0, where the inequality followsfrom the no-crossing lemma. The mean value theorem implies that

D(S′T , 0) − D(S′′

T , 0) = DS(S∗T , 0) · ∆ST ,

for some S∗T ∈ (S′′

T , S′T ), so that

D−S · ∆ST ≤ D(S′

T , 0) − D(S′′T , 0) ≤ D+

S · ∆ST .

Taking expectations conditional on Ft and discounting give

D−S · ∆St ≤ D(S′

t, T − t) − D(S′′t , T − t) ≤ D+

S · ∆St,

Page 390: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

Models with Uncertain Volatility 373

which implies that

D−S ≤ DS(St, T − t) ≤ D+

S

for t ∈ [0, T ].Similar arguments show that convexity or concavity of the terminal

payoff function implies convexity or concavity of D(St, T − t). When theunderlying follows (8.1), Bergman et al. also show (i) that prices of Euro-pean calls increase with an upward shift in the entire yield curve for default-free bonds, but not necessarily with a change in yields that merely reducesthe discounted present value of the strike price; and (ii) that an upwardshift in volatility function σ(St, t) increases current prices of European-stylederivatives. We have already applied the latter result in section 7.2.1 in ref-erence to firms that write options on their own stocks. The monotonicityresult extends also to cases in which volatility depends on another (possiblyvector-valued) diffusion process Yt, if its drift and diffusion parameter donot depend on St. Yet another condition is needed to extend the convex-ity result.

To show what can go wrong, Bergman et al. give the following insightfulexample of a nonMarkov price process for which the value of a call does notincrease monotonically with the current price of the underlying. A firm’smanagers will be rewarded if the price of its stock is above a threshold S

at some time t′′, and they have an opportunity to revise investment policyat t′ < t′′. The firm is financed entirely by equity capital. If the stock’sprice exceeds S exp(−

∫ t′′

t′ rs · ds) at t′, managers will lock in the rewardby choosing riskless projects that cause S to grow at the riskless rate. Inthis case a call option expiring at T ≤ t′′ with strike price greater thanSt′ exp(

∫ T

t′ rs · ds) will be worthless. However, if St′ < S exp(−∫ t′′

t′ rs · ds)managers will choose risky projects that produce high volatility in S so asto gain some chance of reward. Since this also makes it possible for the callto be in the money at T , it is clear that the call is worth more when St′

is less than S exp(−∫ t′′

t′ rs · ds) than when it is greater. The problem hereis precisely that the controlled process St0≤t≤T is not a Markov process.It is not because the volatility at each t > t′ depends on the value of thestock at t′.

8.2.2 Two Specific Models

Turning now to some specific formulations of price-dependent volatility, webegin with a simplified version of Rubinstein’s (1983) “displaced diffusion”

Page 391: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

374 Pricing Derivative Securities (2nd Ed.)

model, in which the underlying asset has two components, one risky and theother riskless. In this model the volatility of the underlying rises and fallsstochastically as the value of the risky component increases and declines.Although this behavior runs counter to the leverage effect observed inthe returns of common stocks, it does characterize portfolios balancedbetween stocks and short-term government debt. Entities such as trusts andendowments that hold such portfolios sometimes purchase over-the-counterderivative products in order to reduce their risk.5 In any case the modelis interesting pedagogically because it allows prices of European optionsto be expressed in terms of the Black-Scholes formulas. We also look atthe constant-elasticity-of-variance (c.e.v.) model, which does capture theinverse relation between volatility and price level that is observed in stocks’returns.

The Displaced-Diffusion Model

Consider derivatives on an underlying asset that is itself a portfolio. Theportfolio comprises (i) a risky component worth At at time t; and (ii) b unitsof T ′-maturing riskless bonds each worth B(t, T ′) at t. Assuming that therisky asset is traded and that its price follows geometric Brownian motion,process Att≥0 evolves as

dAt/At = rt · dt + σA · dWt

under the risk-neutral measure. Letting St = At + bB(t, T ′) be the value ofthe portfolio at t, we have

dSt = dAt + b · dB(t, T ′) = rt[At + bB(t, T ′)] · dt + σAAt · dWt,

and

dSt/St = rt · dt + σ(St, t) · dWt,

where

σ(St, t) = σA[1 − bB(t, T ′)/St].

Even though the future volatility of St is unpredictable, it is never-theless easy to express prices of European-style derivatives. For example,

5Notice that the polar case of a firm with negative holdings of the riskless asset

would correspond to debt financing. However, because of shareholders’ limited liability,the compound-option model is more appropriate in this case. Indeed, that model actuallybelongs to the class (8.1).

Page 392: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

Models with Uncertain Volatility 375

consider a European call option that expires at T ≤ T ′. Its arbitrage-freeprice at t is

CE(St, T − t) = B(t, T )Et(ST − X)+

= B(t, T )EtAT − [X − bB(T, T ′)]+ (8.2)

Since AT > 0 a.s., the option will necessarily expire in the money if X ≤bB(t, T ′). In this case the “+” symbols in (8.2) are superfluous, and thecall’s value is that of the portfolio minus the discounted value of the exerciseprice,

CE(St, T − t) = B(t, T )Et(ST − X)

= St − B(t, T )X

= At − B(t, T )X + B(t, T ′)b.

In the less trivial case that X > bB(t, T ′) one just appeals to the factthat Att≥0 is geometric Brownian motion and expresses (8.2) by meansof the Black-Scholes formula with At in place of St and X − bB(T, T ′) inplace of X .

Option Pricing under the C.E.V. Model

The constant-elasticity-of-variance process is represented by the stochasticdifferential equation

dSt/St = µt · dt + σ0Sγ−1t · dWt, (8.3)

with γ ∈ (0, 1). The name applies because the “elasticity” of the volatilityfunction with respect to the underlying price, ∂ lnσ(St, t)/∂ lnSt, equals theconstant γ−1. Since it appears that s.d.e. (8.3) does not have a unique solu-tion when γ < .5,6 applications are normally confined to the case γ ∈ [.5, 1).In this range volatility decreases with the price level, as is consistent withwhat we observe in returns of common stocks, but St remains almost-surely nonnegative. Among many possible schemes for linking volatility toprice, this one has the advantage of supporting computational formulas forprices of European-style options.

W. Feller (1951) has derived the distribution of ST conditional on St

in the special case γ = 1/2, and Cox (1975) has extended to other values.

6See Duffie (1996, p. 293).

Page 393: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

376 Pricing Derivative Securities (2nd Ed.)

Following Cox and Ross (1976), we develop a formula for prices of Euro-pean options on a no-dividend stock whose price follows the “square-root”process, with

dSt/St = µ · dt + σ0S−1/2t · dWt. (8.4)

Letting Ft(s) ≡ Pt(ST ≤ s), Feller’s results give

dFt(s) =√

θtξte−ξt−θtss−1/2I1(2

√θtξts

1/2) · ds, s > 0 (8.5)

Ft(0) = G(µ; 1)

Ft(s) = 0, s < 0.

Here θt and ξt are given by

θt ≡2µ

σ20 [eµ(T−t) − 1]

ξt ≡ Stθteµ(T−t),

while I1(·) is the modified Bessel function of the first kind of order one, andG is the complementary gamma c.d.f.:

G(x; α) ≡∫ ∞

x

1Γ(α)

yα−1e−y · dy.

I1(·) can be approximated using the series expansion (2.24) with ν = 1.To apply martingale pricing, one changes to the measure P in which

Wt = Wt + σ−10

∫ t

0S

1/2s (µ − rs + δs) · ds is a martingale, where rt and

δt are the F0-measurable instantaneous short rate and dividend rate at t,and replaces µ with r − δ ≡ (T − t)−1

∫ T

t(ru − δu) · du in the expressions

for θt and ξt. The price of the European call is then CE(St, T − t) =B(t, T )Et(ST − X)+, where the expectation is given by∫ ∞

X

[(s − X)

√θtξte

−ξt−θtss−1/2 ·√

θtξts1/2

∞∑k=0

1k!(k + 1)!

(θtξts)k

]· ds.

To develop further, rearrange as

ξt

θt

∞∑k=0

e−ξtξkt

k!

∫ ∞

X

θk+1t sk+1e−θts

(k + 1)!· d(θts)

−X

∞∑k=0

e−ξtξk+1t

(k + 1)!

∫ ∞

X

θkt ske−θts

k!· d(θts),

then change variables as (θts) → y in the integrals. Finally, writing ξt/θt =Ste

(T−t)(r−δ) = Ste−δ(T−t)B(t, T )−1 and g(x; α) for the gamma p.d.f., one

Page 394: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

Models with Uncertain Volatility 377

obtains for the value of the call under the square-root version of the c.e.v.process

CE(St, T − t; 1/2) = Ste−δ(T−t)

∞∑k=1

g(ξt; k)G(θtX ; k + 1) (8.6)

−B(t, T )X∞∑

k=1

g(ξt; k + 1)G(θtX ; k).

Cox’s (1975) more general expression that applies for γ ∈ [.5, 1) is

CE(St, T − t; γ)

= Ste−δ(T−t)

∞∑k=1

g(ξ′t; k)G(

θ′tX2−2γ ; k +

12 − 2γ

)(8.7)

−B(t, T )X∞∑

k=1

g

(ξ′t; k +

12 − 2γ

)G(θ′tX

2−2γ ; k),

where

θ′t ≡r − δ

σ20(1 − γ)[e2(1−γ)(r−δ)(T−t) − 1]

ξ′t ≡ S2−2γt θ′te

2(1−γ)(r−δ)(T−t).

Schroder (1989) shows that the formula can be expressed in terms of c.d.f.sof noncentral chi-squared distributions and provides some computationalshortcuts.

Considering the square-root version of the c.e.v. model, Beckers (1980)finds that (8.6) prices in-the-money calls and out-of-money puts higher thanBlack-Scholes (evaluated at implicit volatilities of at–the-money options),while the c.e.v. model prices out-of-money calls and in-the money puts lowerthan Black-Scholes. This is at least qualitatively consistent with the smirkthat is often observed in implicit volatilities. Figure 8.1 shows for severalvalues of γ how the difference in prices, c.e.v. minus Black-Scholes, varieswith St/X for European calls and puts on no-dividend stocks.7 Note thatthe differences are positive for St/X > 1, where puts are out of money andcalls in, and they increase as γ declines from unity. Figure 8.2 shows thesame information through the Black-Scholes implicit volatilities. Measur-ing moneyness as X/St to correspond to figure 6.4, we can see that the

7By put-call parity differences in call prices equal differences in put prices. Thecalculations use X = 10.0, r = 0.05, δ = 0.0, T − t = 0.5 and σ0 = 0.30, with volatilitydivided by Sγ−1

t for c.e.v. to make initial volatilities comparable.

Page 395: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

378 Pricing Derivative Securities (2nd Ed.)

0.5 0.6 0.7 0.8 0.9 1.0 1.1 1.2 1.3 1.4 1.5

St/X

−0.03

−0.02

−0.01

0.00

0.01

0.02

0.03

Pric

e D

iffere

nce

, C.e

.v.−

B.S

. γ = .50

γ = .65

γ = .80

Fig. 8.1. C.e.v. minus Black-Scholes prices vs. moneyness.

0.6 0.8 1.0 1.2 1.4 1.6 1.8

X/St

0.25

0.26

0.27

0.28

0.29

0.30

0.31

0.32

0.33

0.34

Impl

icit

Vol

atili

ty

γ = .50

γ = .65

γ = .80

Fig. 8.2. Implicit volatilities for c.e.v. model.

Page 396: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

Models with Uncertain Volatility 379

c.e.v. model does introduce the smirk pattern that is commonly observedin prices of equity, index, and currency options. On the other hand, c.e.v.effects by themselves seem incapable of accounting for the other side of thesmile curve, corresponding to Black-Scholes’ underpricing of in-the-moneyputs and out-of-money calls. In addition, the c.e.v. model implies an essen-tially flat term structure of implicit volatilities, in that implicit volatility isinsensitive to the length of time to expiration. This is inconsistent with theobservation that smile curves tend to flatten out with time.

8.2.3 Numerical Methods

Martingale pricing of a derivative requires being able either to express ana-lytically the risk-neutral conditional distribution of the underlying price orto simulate replicates from that distribution. The main attraction of thec.e.v. model is that the distribution of price is known—albeit quite com-plicated. Under other a priori reasonable specifications, such as σ(St, t) =σ0 + σ1S

γ−1t , the distribution is not explicitly known. Simulation, discussed

in chapter 11, is often a feasible approach to such a problem, but it is usu-ally faster to solve the partial differential equation that is implied by thedynamic replicating argument (section 6.2.1). While analytical solutions tothese p.d.e.s are rarely available in interesting cases, numerical ones obtainedby finite-difference methods usually are. These methods also apply whenderivatives are subject to early exercise. Chapter 12 presents an introduc-tory description of these techniques. In section 8.4 we shall see that there aresimple alternatives for pricing European-style options when one has a com-putational formula for the characteristic function of the underlying price.

8.2.4 Limitations of Price-Dependent Volatility

Although models in the class (8.1) are consistent with preference-freepricing and can account for some moneyness-related variation in implicitvolatilities, there are good reasons to look for still more general mod-els. It is unlikely that price-level effects alone could fully account for theobserved shapes of implicit volatility curves, nor do they explain the flat-tening out of the smile as time to expiration increases. Flattening mightbe captured by allowing explicitly for time dependence, as σ(St, T − t) =σ0 + σ1(T − t)−1S1−γ

t , but this kind of arrangement could still not accom-modate the day-to-day variations in the slopes of volatility term structures,nor could it pick up changes over time in the shapes of smile curves.

Page 397: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

380 Pricing Derivative Securities (2nd Ed.)

There are also indications that price/time-dependent volatility modelsdeliver poor price predictions out of sample. This is the conclusion reachedby Dumas et al. (1998) after extensive efforts to find price/time-volatilityforms that do give good predictions. Their approach was, in effect, to usemarket prices of S&P 500 index options at different strikes to fit pricingmodels based on the following flexible specification of volatility:

σ(St, T−t) = a0+a1St+a2S2t +a3(T−t)+a4(T−t)2+a5St(T−t). (8.8)

Fitting involved estimating the coefficients aj by minimizing the sum ofsquared deviations of actual option prices from those implied by the model,as calculated by finite-difference methods. Although weekly estimates overseveral years gave good fits within sample, the coefficients in (8.8) werefound to be highly unstable over time. Moreover, weekly out-of-sample pre-diction errors based on coefficients from the previous week were large anddid not improve on errors from an ad hoc application of Black-Scholeswith strike- and time-specific volatilities. Although one may question theappropriateness and generality of the quadratic specification for σ, thesefindings do suggest the need for richer models than one-factor diffusionswith volatilities that depend just on current prices and time.

8.2.5 Incorporating Dependence on Past Prices

A promising direction in which to extend these models is to allow bothcurrent and past prices of the underlying to drive volatility. For example,volatility might be specified as a deterministic function of the differencebetween the current price and an average of past prices, all detrended bydiscounting at the short rate. The idea, similar to that underlying discrete-time ARCH models, is that large deviations of price from a “normal” (underrisk-neutrality) trend tend to forecast higher future variation. Such a spec-ification could account for temporal changes in smile curves and volatilityterm structures as current price and the norm evolve through time; yetlimiting the determinants of volatility to time and the underlying price stillpermits payoff replication and preference-free pricing.

A class of such models was proposed by Hobson and Rogers (1998). Onesimple version starts with the primitive that the logarithm of the discountedprice of a no-dividend asset, st ≡ ln(e−rtSt), is an Ito process of the form

dst = µ(∆t) · dt + σ(∆t) · dWt.

Page 398: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

Models with Uncertain Volatility 381

Here ∆t is the deviation from norm, defined as

∆t = st − λ

∫ t

−∞e−λ(t−u)su · du,

with extrinsic parameter λ governing the rate at which the influence of pastprices decays. Note that the time origin is also, in effect, a parameter here.Taking ∆0 = 0, this setup makes ∆tt≥0 itself an Ito process adapted toFt, with

d∆t = dst − λ∆t · dt

= [µ(∆t) − λ∆t] · dt + σ(∆t) · dWt.

The quadratic specification for σ,

σ2(∆t) = α0 + α1∆t + α2∆2t ,

is feasible if supplemented (for technical reasons) by an upper bound. Thiscould accommodate different responses of volatility to positive and negativeprice deviations, depending on the sign of α1. Defining

θ(∆t) ≡ µ(∆t)/σ(∆t) + σ(∆t)/2,

a martingale measure P for the discounted price of the underlying can bedetermined by Girsanov’s theorem as

dP

dP= exp

[−∫ t

0

θ(∆u) · dWu − 12

∫ t

0

θ(∆u)2 · du

],

assuming µ(·) and σ(·) are such that the right-hand side is a martingaleunder P. Then Wt = Wt +

∫ t

0θ(∆u) · du becomes a P-Brownian motion

and Stt≥0 and ∆tt≥0 evolve as,

dSt/St = r · dt + σ(∆t) · dWt (8.9)

d∆t = −[λ∆t + σ(∆t)2/2] · dt + σ(∆t) · dWt. (8.10)

Applying the martingale representation theorem (section 3.4.2), aEuropean-style derivative with integrable terminal payoff function, D(ST ),that depends on ST and time alone can be replicated by a self-financingportfolio and valued as

D(St, ∆t, T − t) = e−r(T−t)EtD(ST ).

Notice that the deviation from norm now enters as another state variableon which the derivative’s arbitrage-free price depends.

While Monte Carlo would be an obvious approach to evaluatingEtD(ST ), Hobson and Rogers (1998) suggest solving for D(St, ∆t, T − t)

Page 399: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

382 Pricing Derivative Securities (2nd Ed.)

from the associated p.d.e. Since M−1t D(St, ∆t, T − t) = M−1

0 e−rtD(St,

∆t, T − t) is a martingale under P (where Mt is the value of the money-market fund at t) it has zero mean drift. Applying Ito’s formula, one obtains

ert · d[e−rtD(St, ∆t, T − t)] = −(rD + DT−t) · dt + DS · dSt +DSS

2· d〈S〉t

+ DS∆ · d〈S, ∆〉t+D∆ · d∆t +D∆∆

2· d〈∆〉t.

Using (8.9) and (8.10) to express the stochastic differentials and quadraticvariations, collecting the drift terms, and equating to zero give the p.d.e.

0 = −DT−t + DSrSt − λD∆∆t

+ (DSSS2t + 2DS∆St + D∆∆ − D∆)σ(∆t)2/2 − Dr.

Notice that this reduces to the standard Black-Scholes p.d.e. in the degener-ate cases that σ(∆t) = σ0, a constant, or σ(∆t) = σt, a deterministic func-tion of time alone. Allowing for time-varying, deterministic interest ratesand continuously paid dividends (at rates that could in principle depend ont, St, and ∆t) would be accomplished just by replacing r with rt − δt. Thep.d.e. would be solved by finite-difference methods. Of course, the presenceof an additional state variable increases to three the required dimensionof the grid. Hobson and Rogers (1998) show that the model can indeedgenerate a wide variety of smile patterns and volatility term structures.

8.3 Stochastic-Volatility Models

This section treats models in which volatility is itself an Ito process thatmay be subject to some stochastic influence other than the price of theunderlying asset. Specifically, with the underlying price evolving as

dSt = µtSt · dt + σtSt · dW1t, (8.11)

σt is modeled as

dσt = ξt · dt + γt(ρt · dW1t + ρt · dW2t), (8.12)

where ρt =√

1 − ρ2t . Here W1t, W2t are independent Brownian motions

under P, and ξt, γt ≥ 0, and ρt ∈ [−1, 1] are adapted to the filtrationFt generated by W1t, W2t. If |ρt| = 1 the model reduces to those insection 8.2 where σt = σ(St, t). The other degenerate case is that there areother traded assets whose prices are adapted to Ft and such that σt can bereplicated by a portfolio of these. In both cases preference-free pricing is still

Page 400: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

Models with Uncertain Volatility 383

possible. Otherwise, the market is not complete, and arbitrage argumentsalone do not yield a unique solution for the price of the derivative.

The first task in this section is to demonstrate the truth of the last twostatements. We show that replication is indeed possible when the uncer-tainty is “spanned” by prices of traded assets and that there can otherwisebe no unique arbitrage-free price for a derivative whose payoffs depend non-linearly on the price of the underlying. We then see how to price optionsunder particular specifications for σt, beginning with the case ρt = 0 inwhich shocks to volatility and price are independent.

8.3.1 Nonuniqueness of Arbitrage-Free Prices

While the absence of arbitrage guarantees the existence of a measure underwhich normalized prices of traded assets are martingales, it is the complete-ness of markets—the ability to replicate payoffs with traded assets—thatmakes the martingale measure unique. Without replication there can be nounique arbitrage-free solution for the price of a derivative. If the risk asso-ciated with holding a derivative cannot be hedged away, the derivative’smarket price, like prices of primary assets, will depend on how individu-als respond to risk and how much compensation they require to bear it.We will identify explicit conditions under which preference-free pricing ofderivative assets is and is not possible under (8.11) and (8.12). Of course, weknow from chapter 4 that derivatives like forwards and futures, whose pay-offs depend linearly on the underlying price, can be replicated with staticportfolios of the underlying and riskless bonds, regardless of the underlyingdynamics. Our discussion therefore pertains to options and other derivativeswith nonlinear payoff structures.

Since (8.11) and (8.12) involve two independent sources of risk, it isclear that dynamic replication of a derivative will require at least two riskyassets. Consider first the case that there is a second traded asset whoseprice depends directly on both risk sources, as

dS′t = µ′

tS′t · dt + σ′

tS′t(ρ

′t · dW1t + ρ′t · dW2t).

Here σ′t may be stochastic also, provided that it is adapted to the Ft

with respect to which St and σt are measurable. Let D(St, σt, T − t)be the price of a T -expiring European-style derivative whose terminal valuedepends just on ST . We will try to replicate the payoff D(ST ) with a self-financing portfolio comprising pt units of the underlying, qt units of thesecondary asset, and mt units of the money fund. The procedure is like that

Page 401: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

384 Pricing Derivative Securities (2nd Ed.)

followed in section 6.2.1 except that we simplify by excluding dividends.Also, σt must be recognized now as an additional state variable since it is nota function just of St and t. Setting mt = [D(St, σt, T − t)−ptSt−qtS

′t, ]/Mt

and using dMt/Mt = rt · dt, the self-financing property of the portfoliorequires

dD(St, σt, T − t) = pt · dSt + qt · dS′t + (D − ptSt − qtS

′t)rt · dt. (8.13)

Using Ito’s formula to express the left side (assuming that D has at leasttwo continuous derivatives in S and σ) gives for dD(St, σt, T − t)

(−DT−t + DSµtSt + DSSσ2t S2

t /2 + Dσξt + Dσσγ2t /2 + DSσρtStσtγt) · dt

+ (DSσtSt + Dσρtγt) · dW1t + Dσ ρtγt · dW2t, (8.14)

while the right side of (8.13) is

[Drt + pt(µt − rt)St + qt(µ′t − rt)S′

t] · dt

+ (ptσtSt + qtσ′tS

′tρ

′t) · dW1t + qtσ

′tS

′tρ

′t · dW2t. (8.15)

Equating the terms in dW1t and dW2t on the two sides requires

qt = Dσρtγt

σ′tS

′tρ

′t

pt = DS + Dσγt

σtSt(ρt − ρ′tρt/ρ′t).

Substituting these in (8.15), equating the dt term with that in (8.14), andrearranging give

0 = −DT−t + DSrtSt + DSSσ2t S2

t /2 + Dσξt + Dσσγ2t /2 + DSσρtStσtγt

−Drt −Dσγt

ρ′t

[(µt − rt

σt

)(ρtρ

′t − ρ′tρt) +

(µ′

t − rt

σ′t

)ρt

](8.16)

as the differential equation that D(St, σt, T − t) must satisfy.Notice that this involves two preference-dependent parameters, µt and

µ′t, that reflect the market’s required compensation for the risk of holding

the two primary assets. However, assuming there is no arbitrage, there muststill be a measure P under which S∗

t ≡ St/Mt and S′∗t ≡ S′

t/Mt aremartingales. To develop such a measure, introduce adapted processes θ1tand θ2t satisfying

rt = µt − σtθ1t = µ′t − σ′

tρ′tθ1t − σ′

tρ′tθ2t

Page 402: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

Models with Uncertain Volatility 385

and apply Girsanov’s theorem to construct P Brownian motions W1t ≡W1t +

∫ t

0 θ1s ·ds and W2t ≡ W2t +∫ t

0 θ2s ·ds. Girsanov’s theorem guaranteesthat this construction is possible if θ1t and θ2t are such that

dP

dP= exp

2∑j=1

∫ t

0

(−θjs · dWjs −

12θ2

js · ds

) (8.17)

is a P-martingale. Under this new measure we have

dSt = µtSt · dt + σtSt(dW1t − θ1t · dt)

= rtSt · dt + σtSt · dW1t

dS′t = µ′

tS′t · dt + σ′

tS′t[ρ

′t(dW1t − θ1t · dt) + ρ′t(dW2t − θ2t · dt)]

= rtS′t · dt + σ′

tS′t(ρ

′t · dW1t + ρ′t · dW2t),

as the martingale property for S∗t and S′

t∗ requires. The time-t values ofthe adapted processes involved in this transition are

θ1t =(

µt − rt

σt

)θ2t =

1ρ′t

(µ′

t − rt

σ′t

)− ρ′t

ρ′t

(µt − rt

σt

).

Under this measure the final term of (8.16) vanishes, and a solution tothe p.d.e. subject to the boundary conditions and the terminal conditionD(ST , σT , 0) = D(ST ) gives a preference-free price for the derivative. Alter-natively, since the drift of D(St, σt, T − t)/Mt itself is zero under thismeasure, we could just exploit the martingale property and price the deriva-tive as

D(St, σt, T − t) = Et[D(ST )MT /Mt] = B(t, T )EtD(ST ).

What happens when there is no traded asset that completes the mar-ket in this way? Since replication still requires two risky assets, a logicalidea would be to replicate the derivative of interest using the underlyingand another derivative that expires at the same date or later. Since theprice of the other derivative would have to respond directly to both risksources, it could not be a linear instrument like a forward contract, whosearbitrage-free price depends just on prices of the underlying and risklessbonds. When ρt = 0, so that volatility and price shocks are independent,Bajeux and Rochet (1992) have shown that a European option does com-plete the market, and the result was extended by Romano and Touzi (1997)(under restrictive conditions on ξt and γt) to the case ρt = 0. Supposing,

Page 403: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

386 Pricing Derivative Securities (2nd Ed.)

then, for the sake of example, that we have chosen a T -expiring Europeanput as the second asset, let us try to follow the same program as before.

Forming a replicating portfolio with pt units of the underlying, qt putseach worth PE(St, σt, T − t), and mt units of the money fund, we have

dD(St, σt, T − t) = pt ·dSt +qt ·dP E(St, σt, T − t)+(D−ptSt−qtPE)rt ·dt.

Expressing dD and dP E by Ito’s formula and dSt as (8.11) then collectingthe terms in dW1t and dW2t give qt = Dσ/PE

σ and pt = DS − PES Dσ/PE

σ .

With these substitutions, equating the dt terms produces the p.d.e.

0 =[−DT−t + DSrtSt + DSS

σ2t

2S2

t + Dσσγ2

t

2+ DSσρtStσtγt − Drt

]− Dσ

PEσ

[−PE

T−t + PES rtSt + PE

SS

σ2t

2S2

t + PEσσ

γ2t

2

+ PESσρtStσtγt − PErt

]. (8.18)

Although there are no preference-dependent parameters here, a moment’sreflection shows that there is an essential indeterminacy in the solutionsfor PE and D. Letting LD and LPE denote the two bracketed expres-sions, consider functions D(St, σt, T − t) and PE(St, σt, T − t) that solve,respectively

0 = LD + Dσλt (8.19)

0 = LPE + PEσ λt. (8.20)

Notice that in these expressions λt takes the place of the drift process ξt in(8.16), so we refer to λt as the “adjusted” drift of σt. Since λt is arbitraryand any such D and PE that solve (8.19) and (8.20) also solve (8.18), wesee that the replicating argument fails to deliver a unique solution for theprice of either derivative.8

Another way of seeing the problem [c.f. Renault and Touzi (1996,p. 282)] is that a change of measure of the form (8.17) with θ1t = (µt−rt)/σt

but with θ2t unrestricted reduces St/Mt to a martingale. While expecta-tion in any such measure delivers unique prices for the underlying asset

8Scott (1987, p. 423), considering specifically the use of one call to help replicateanother, sums up the problem in a very intuitive way:

“We cannot determine the price of a call option without knowing the price ofanother call on the same stock, but that is precisely the function that we aretrying to determine. (Emphasis added.)

Page 404: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

Models with Uncertain Volatility 387

and (trivially so) for the money fund, it does not produce a uniquevalue for a (nonlinear) derivative that responds to volatility risk, becauseEt[D(ST )MT /Mt] depends on θ2t.

Being thus convinced that preferences will matter in pricing options andother nonlinear derivatives under stochastic volatility, the next question is“What do we do about it?” The essence of the problem is that solutions willdepend on unobservable, preference-dependent parameters (or processes)like the adjusted drift λt in the discussion of (8.18). For instance, λt canbe considered to incorporate the market’s required compensation for expo-sure to σ risk, with Dσ representing the derivative’s marginal sensitivity tosuch risk, just as beta in the CAPM represents a primary asset’s sensitivityto market risk. One approach is to derive λt from an equilibrium modelthat relies on strong assumptions about preferences and markets (and theprocesses ξt, γt, and ρt that define σt). Alternatively, one can dothe following. First,

• specify a plausible, flexible, yet reasonably (parametrically) parsimoniousmodel for σt that has a good chance of capturing the dynamics associ-ated with the unknown equivalent martingale measure that the marketactually uses; then

• infer the parameters from the structure of prices of traded derivatives.

This is in the same spirit as deducing the volatility parameter in the Black-Scholes model from the price of a single European option.9 If the number ofparameters is the same as the number of observed prices, one can ordinarilysolve for the parameters uniquely, just as is done for implicit volatility. Withmore observations than parameters inference can be based on some robuststatistical procedure, such as nonlinear least squares. For example, takinga panel of observed prices of options with various strikes and maturities,one can fit a parametric model by minimizing a measure of distance (suchas the sum of squared deviations) of actual prices from those implied bythe model.10 One’s confidence in the particular parameterization depends

9Chernov and Ghysels (1998) have considered combining information from options’prices and the time series of the underlying to infer the parameters of the equivalentmartingale distribution. They find, however, that it is the options’ values that contributethe most information. See also Ghysels et al. (1996) for further discussion of modelingissues and inference.

10Recall that this was the procedure used by Dumas et al. (1998) in estimatingflexible price-dependent volatility models.

Page 405: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

388 Pricing Derivative Securities (2nd Ed.)

on the stability of parameter estimates from different time samples and theaccuracy of out-of-sample predictions of future prices.

8.3.2 Specific S.V. Models

We now look at some specific models for European-style derivatives in thestochastic-volatility (s.v.) framework. For definiteness in interpreting “inthe money” and “out of the money” we focus the discussion on Europeanputs, expressing (8.20) as

−PET−t + PE

S rtSt + PESSσ2

t S2t /2 + PE

σ λt + PEσσγ2

t /2

+ PESσρtStσtγt − PErt = 0. (8.21)

Determining the current value of the option, PE(St, σt, T−t), whether doneby solving (8.21) subject to PE(ST ) = (X−ST )+ and boundary conditionsor by exploiting the martingale property of PE(St, σt, T − t)/Mt0≤t≤T

under P, requires specific assumptions about the processes γt, ρt, andthe adjusted drift parameter λt. The analysis is easiest when ρt, thecorrelation between the shocks to price and volatility, is always zero.

Independent Shocks to Price and Volatility

Regarding stochastic volatility as a mixing process that scatters probabilitymass into the tails of the distribution of ST , one’s intuition is that bothout-of-money puts and out-of-money calls would be worth more in suchan environment than under Black-Scholes dynamics. By put-call parity, itfollows that deep in-the-money options would be worth more as well. Theseconsiderations suggest the presence of a smile effect in the implicit volatil-ity curve. While the logic seems compelling, the result has been establishedrigorously only in the case that ρt = 0. Renault and Touzi (1996) provetwo general propositions that describe the shape of the implicit volatil-ity curve under this condition. Letting x ≡ ln(X/ft), where (assuming nodividend on the underlying) ft = St/B(t, T ) is the time-t forward price,they show that implicit volatility as a function of x is symmetric about theorigin, decreasing with X when x < 0 (puts out of the money), reachinga minimum at x = 0, and increasing with X when x > 0 (puts in themoney). This establishes firmly that a volatility smile is in fact presentwhen ρt = 0. Indeed, the indicated symmetry of implicit volatility with

Page 406: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

Models with Uncertain Volatility 389

0.5 0.6 0.7 0.8 0.9 1.0 1.1 1.2 1.3 1.4 1.5 1.6

Moneyness (X/f)

0.0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1.0

Impl

icit

Vol

atili

ty

Fig. 8.3. An asymmetric smile consistent with ρt = 0.

respect to x implies an asymmetry with respect to X/ft = ex that is con-sistent with the smirk effect. For example, figure 8.3, which simply plotsσ(x) = x2 vs. ex, illustrates an asymmetric smile that is consistent withthe theoretical results of Renault and Touzi.

Besides supporting these definite qualitative results, maintainingρt = 0 greatly simplifies the quantitative problem of pricing derivativesunder stochastic volatility. This is because the independence of price andvolatility shocks makes it possible to express prices of European optionsas mathematical expectations of Black-Scholes prices. To see this, recallour development of the Black-Scholes put and call formulas in section 6.3.2for deterministic but time-varying volatility. Taking the current time to bet = 0, the put formula is

PE(S0, T ; σT ) ≡ B

[ln(X/f0) + σ2

T T/2σT

√T

]− f0Φ

[ln(X/f0) − σ2

T T/2σT

√T

],

Page 407: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

390 Pricing Derivative Securities (2nd Ed.)

where B = B(0, T ), f0 is the current forward price for delivery of the under-lying at T , and σ2

T = T−1∫ T

0 σ2t · dt is the average volatility to expiration.

If shocks to volatility and price are independent, then the time-T valueof the underlying is lognormal conditional on average volatility; and sincethe validity of the put formula depends just on the lognormality of ST , itstill applies in this conditional sense. Therefore, the current value of theoption under stochastic volatility is simply the mathematical expectationof PE(S0, T ; σT ), or

PE(S0, σ0, T ) = EP E(S0, T ; σT ).

Notice that the boundedness of the Black-Scholes price function assures theexistence of the expectation.

There remains the issue of inferring the distribution for σT that isconsistent with the particular preference-dependent martingale measurethat actually governs Stt≥0 and σtt≥0. Since in this measure dσt =λt ·dt+ γt ·dW2t when ρt = 0, the distribution of σT will be determined bythe specification of γt and adjusted drift process λt in (8.21). Assumingthat the appropriate risk adjustment is simply an additive constant, so thatλt = ξt − λ∗, Scott (1987) uses Monte Carlo simulation to implement theprocedure for two specific mean-reverting specifications (i) ξt = ξ(σ∞−σt),γt = γ; and (ii) ξt = [ξ ln(σ∞/σt) + γ2/2]σt, γt = γσt, where in eachcase σ∞ is the constant value about which σt fluctuates.11 Because ofthe conditioning argument just related, pricing the option by Monte Carlojust involves simulating price paths of σt0≤t≤T , evaluating σT for eachpath, and averaging PE(S0, T ; σT ) over the replications. Hull and White(1987), modeling σ2

t first as geometric Brownian motion with λt = σtλ

and γt = σtγ, work out an approximate formula for EPE(S0, T ; σT ) bydeveloping the first few moments of σ2

T = T−1∫ T

0σ2

t · dt and expandingPE(S0, T ; σT ) about the mean. They also use simulation to implement amean-reverting specification, as λt = ξ(σ∞ − σt). Stein and Stein (1991)adopting Scott’s (Ornstein-Uhlenbeck) specification (i), exploit the condi-tional lognormality of ST given σT to derive the characteristic function.Inverting by numerical integration to find the risk-neutral c.d.f. of ST , theycan estimate PE(S0, σ0, T ) as B(0, T )

∫ X

0 (X−s) ·dFST (s; ξ, σ∞, γ, λ∗) by asecond numerical integration. Unfortunately, this relatively quick analytical

11Notice that (i) is an Ornstein-Uhlenbeck process (defined in example 38 onpage 106), which has the unfortunate property of allowing negative volatility.

Page 408: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

Models with Uncertain Volatility 391

method depends critically on the assumption that ρt = 0, whereas (as weshall see) simulation does not.

Dependent Price/Volatility Shocks

Just as one expects there to be a smile in implicit volatilities when σt isstochastic, intuition suggests also that correlation between price and volatil-ity shocks should affect the symmetry of the curve. For example, whenρt > 0 positive volatility shocks tend to be associated with positive priceshocks, which therefore tend to be larger in absolute value than negativeprice shocks. It seems that this effect would lengthen the right tail in thedistribution of lnST and that negative correlation, conversely, should skewthe distribution of lnST to the left. Since extra probability mass in theright and left tail enhances the values of calls and puts, respectively—particularly for options that are out of the money—one expects skew-ness in the distribution of lnST to induce skewness in the smile curvealso. While this effect has yet to be proved rigorously, the logic is wellsupported by numerical estimates obtained with variously specified mod-els. Although we have seen from the work of Renault and Touzi (1996)that symmetry in the plot of implicit volatility vs. ln(X/ft) is consistentwith skewness in the plot against X/ft itself, allowing for dependence inprice/volatility shocks extends the range of patterns that the model canaccommodate.

The pricing problem is harder when ρt = 0 because the Black-Scholesformula no longer applies conditionally on σT when processes Stt≥0 andσtt≥0 are dependent. Nevertheless, several attacks have been made onthe problem. Hull and White (1987) simulate the joint process St, σtand average (X − ST )+ over the replicates obtained from the resultingsample paths of St0≤t≤T . We give details of their approach in chapter 11.Although computationally intensive, simulation does afford considerableflexibility in modeling; for example, the specification of the σt process caninclude direct price-level effects on volatility, as in the c.e.v. model. Anotherapproach, followed by Wiggins (1987) with Scott’s (1987) specification (ii),is to solve (8.21) using finite-difference methods. However, since a three-dimensional lattice is now required, this method is much more cumbersomethan when volatility depends on the price level alone. Finally, Heston (1993)has developed an analytical procedure that offers a nice balance of modelingflexibility and computational efficiency. We now give a detailed account ofthat method.

Page 409: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

392 Pricing Derivative Securities (2nd Ed.)

An Analytical Approach

Following Heston (1993), let us model the price process as in (8.11) butreplace (8.12) with

dvt = ξ(v∞ − vt) · dt + γ√

vt(ρ · dW1t + ρ · dW2t),

where ρ and ρ ≡√

1 − ρ2 are constants and vt ≡ σ2t is squared volatility.

We will then assume that the risk-adjusted process under the martingalemeasure has the same functional form:

dvt = (α − βvt) · dt + γ√

vt(ρ · dW1t + ρ · dW2t). (8.22)

This would be consistent with a drift adjustment in vt of the formλt = ξ(v∞ − vt) − λ∗vt. Squared volatility is thus proposed to be amean-reverting square-root process under P. This is the “Feller” process or“CIR” process that was introduced in expression (3.9). Like the Ornstein-Uhlenbeck process, this captures the intuitive notion that volatility shouldbe attracted to some long-run norm (in this case α/β), but it rules outnegative values of vt. Corresponding to (8.21), the price of a European putunder this specification satisfies the p.d.e.

0 = −PET−t + PE

S rtSt + PESSvtS

2t /2 + PE

v (α − βvt) (8.23)

+ PEvvvtγ

2/2 + PESvρStvtγ − PErt

subject to PE(ST ) = P (ST , σT , 0) = (X − ST )+. Not surprisingly, thereare no known simple tricks for obtaining a solution. The feasible approach,which we now undertake, is to develop and solve a similar p.d.e. for theconditional characteristic function of lnST under P, which can then beinverted in order to price the put via martingale methods.

In what follows we write sT ≡ lnST and put τ ≡ T − t. The first stepis to recognize that the conditional expectations process

Ψ(ζ; st, vt, τ) = EteiζsT 0≤t≤T

is a (complex-valued) martingale adapted to Ft. Expressing dΨ via Ito’sformula and equating the drift term to zero, we obtain p.d.e.

0 = −Ψτ + Ψs

(rt −

vt

2

)+ Ψss

vt

2+ Ψv (α − βvt) + Ψvvγ2 vt

2+ Ψsvργvt,

(8.24)which must be solved subject to initial condition

Ψ(ζ; sT , vT , 0) = eiζsT = SiζT .

Page 410: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

Models with Uncertain Volatility 393

The affine structure of (8.24) makes it possible to find a solution of theform

Ψ(ζ; st, vt, τ) = exp[iζst + g(τ ; ζ) + h(τ ; ζ)vt] (8.25)

for certain complex-valued functions g and h. Clearly, the initial conditionrequires

g(0; ζ) = h(0; ζ) = 0, (8.26)

and we must have g(τ ; 0) = h(τ ; 0) = 0 as well if Ψ is to be a c.f. Toverify that the solution is of this form and to identify g and h, express thederivatives of Ψ as

Ψτ = Ψ · (g′ + h′vt)

Ψs = Ψ · iζΨss = −Ψ · ζ2

Ψv = Ψ · hΨvv = Ψ · h2

Ψsv = Ψ · iζh,

where g′ and h′ are derivatives with respect to τ . Inserting these into (8.24)yields

0 = Ψ ·

[−g′ + iζrt + αh] −[h′ +

iζ + ζ2

2− (iζργ − β)h − γ2

2h2

]vt

.

While we cannot rule out that Ψ ≡ Ψ(ζ; st, vt, τ) = 0 for some values of ζ

and the state variables, for the above equality to hold uniformly requiresthat each of the two bracketed terms equal zero. This gives rise to thefollowing ordinary differential equations:

dg(τ ; ζ)dτ

= iζrt + αh(τ ; ζ) (8.27)

dh(τ ; ζ)dτ

= − iζ + ζ2

2+ (iργζ − β)h(τ ; ζ) +

γ2

2h(τ ; ζ)2. (8.28)

Solving first for h, write (8.28) as

dh

A + Bh + γ2h2/2= dτ,

where A = −(iζ + ζ2)/2 and B = iζργ − β. Integrating, we have

τ + K = C−1 ln[γ2h(τ ; ζ) + B − C

γ2h(τ ; ζ) + B + C

],

Page 411: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

394 Pricing Derivative Securities (2nd Ed.)

where we take the principal value of the logarithm. Here, K is a constantof integration and C ≡

√B2 − 2Aγ2 when γζ = 0 and C ≡ −β otherwise.

Putting τ = 0 = h(0; ζ) forces K to satisfy eCK = (B − C)/(B + C) ≡ D

and gives as the final solution when γ = 0

h(τ ; ζ) =B − C

γ2

eCτ − 11 − DeCτ

. (8.29)

Turning to (8.27), we have

g(τ ; ζ) = iζ

∫ t+τ

t

ru · du + α

∫ τ

0

h(u; ζ) · du.

Applying (8.29), the right side is

−iζ lnB(t, T ) +α

γ2(B − C)

[∫ τ

0

du

e−Cu − D−

∫ τ

0

du

1 − DeCu

],

and evaluating the integrals and simplifying lead to

g(τ ; ζ) = −iζ lnB(t, T ) +α

γ2

[(−B + C)τ − 2 ln

(1 − DeCτ

1 − D

)], γ = 0.

Note that the solutions do also satisfy g(τ ; 0) = h(τ ; 0) = 0.

For completeness, the solutions for h and g when γ = 0 are

h(τ ; ζ) =A

2β(1 − e−βτ)

g(τ ; ζ) = −iζ lnB(t, T ) + αAτ − αA

β(1 − e−βτ ).

This is the case of time-varying but deterministic volatility and is not ofparticular interest.

Having found g and h, we can at last evaluate the conditional c.f. ofsT ≡ lnST under P as

Ψ(ζ) ≡ Ψ(ζ; st, vt, τ) = expiζst + g(τ ; ζ) + h(τ ; ζ)vt. (8.30)

As a simple check of the result, setting ζ = −i should give the properexpression for Ψ(−i) = Ee−i2 ln ST = EtST . With this value of ζ we haveA = 0, C = B, and D = 0, so that h(τ ; ζ) = 0 and g(τ ; ζ) = − lnB(t, T ) =∫ T

t rs ·ds. Thus, EtST = exp(∫ T

t rs ·ds+ st) = MT St/Mt, showing that thesolution is consistent with the martingale property of St/Mt under P, asit should be.

Page 412: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

Models with Uncertain Volatility 395

There are several ways to use the c.f. to value the European put atsome t ∈ [0, T ]. Heston’s approach was to work from the risk-neutral pricingformula

PE(St, σt, T − t) = B(t, T )∫

[0,∞)

(X − s)+ · dFt(s)

= B(t, T )X∫

[0,X]

dFt(s) − B(t, T )∫

[0,X]

s · dFt(s)

= B(t, T )XFt(X) − StGt(X),

where Ft(X) = Pt(ST ≤ X) and Gt(X) =∫[0,X]

sEtST

·dFt(s). Since dGt ≥ 0

and∫[0,∞) dGt = 1, the quantity Gt(X) is, like Ft(X), a conditional proba-

bility that the option expires in the money—but now in a measure differentfrom, but equivalent to, P. (Assuming St is strictly positive, it is in factthe martingale measure that applies when St rather than Mt serves asnumeraire.) Since ST ≤ X if and only if sT ≡ lnST ≤ x ≡ lnX , we alsohave

PE(St, σt, T − t) = B(t, T )XFt(x) − StGt(x), (8.31)

where F and G are the c.d.f.s of lnST in the respective measures. LettingΨF (ζ) ≡ Ψ(ζ) and12 ΨG(ζ) = ΨF (ζ− i)/ΨF (−i) represent the correspond-ing c.f.s, we can find Ft(x) and Gt(x) from the inversion formula for c.f.s,expression (2.47), as

Jt(x) =12− lim

c→∞

∫ c

−c

e−iζx

2πiζΨJ(ζ) · dζ, J ∈ F, G. (8.32)

8.4 Computational Issues

As the derivation makes clear, expression (8.31) and the comparable expres-sion for the call,

CE(St, σt, T − t) = St[1 − Gt(x)] − B(t, T )X [1− Ft(x)], (8.33)

12Since for integrable g,R

g(s) · dGt(s) = (EtST )−1R

g(s)es · dFt(s), we haveZ ∞

−∞eiζs · dGt(s) = (EtST )−1

Z ∞

−∞e(iζ+1)s · dFt(s)

= ΨF (−i)−1ΨF (ζ − i).

Page 413: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

396 Pricing Derivative Securities (2nd Ed.)

are not specific to the Heston model but hold under any dynamics of Stt≥0

for which martingale methods apply. Thus, arbitrage-free values of Euro-pean options can be determined if one can evaluate the conditional c.d.f.sof lnST in the two measures. As in the Heston model, it is often not pos-sible to do this directly, but it can be accomplished by Fourier inversionso long as there are computationally feasible expressions for the c.f.s. Suchan approach will be needed for most of the models considered in chap-ter 9. Here we look at some practical ways of carrying out Fourier inversionand some one-step methods that are alternatives to working from (8.31)and (8.33). We shall make us of these one-step methods repeatedly in thesequel.

8.4.1 Inverting C.f.s

Bohman (1975) considers several algorithms for inverting c.f.s numerically,and Waller et al. (1995) describe applications of the most straightforwardof these, which is implemented in routine INVERTCF on the CD:

Jt(x) .=12

+ηx

2π−

N−1∑j=1−N

j =0

e−iηjx

2πijΨJ(ηj), J ∈ F, G. (8.34)

Here η > 0 governs the fineness of the grid of ζ values on which the integralis approximated, and η and integer N jointly govern the range of integra-tion. Values η = .01 and N = 5000 work well in many cases. The distri-bution Jt should be put in standard form or otherwise centered and scaledso that most of the probability mass lies within a few units of the origin.Fortunately, standardizing lnST is easy with the information at hand. SinceΨJ is known explicitly for J ∈ F, G, the conditional mean and varianceof lnST can be calculated as

EJsT = i−1Ψ′J(0)

VJsT = −Ψ′′J(0) − (EJsT )2,

where Ψ′J(0) and Ψ′′

J(0) are derivatives with respect to ζ evaluated at ζ = 0.

Letting ζ∗ ≡ ζ/√

VJsT , the c.f. of the standardized variable is then

Ψ∗J(ζ∗) = exp(−iζ∗EJsT )ΨJ(ζ∗),

and one calculates Jt(x) as in (8.34) with Ψ∗J replacing ΨJ and x∗ ≡

(x − EJsT )/√

VJsT replacing x. Bohman discusses several more accurateinversion algorithms that can be applied when the c.f. is in standard form.

Page 414: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

Models with Uncertain Volatility 397

While the application of (8.34) is straightforward, it is computationallydemanding if options at many strikes are to be priced at one time, asis required in order to estimate parameters of the model implicitly frommarket prices of traded puts and calls. With some care it is possible toimplement the fast Fourier transform (f.F.t.) to estimate c.d.f. Jt(·) on anextensive grid of strike prices all at once. Beginning with a vector of valuesof the complex-valued function ΨJ(·) on a grid ζjN

j=1, the method returnsa real-valued vector Jt(·) on a grid xjN

j=1. Done directly as in (8.34), thiswould require on the order of N2 computations, but the f.F.t. algorithmreduces this to O(N log2 N). Routine INVFFT on the CD applies the f.F.t.routine in Press et al. (1992) to invert a user-coded c.f. corresponding to acentered, scaled distribution.

8.4.2 Two One-Step Approaches

Whereas the Heston (1993) procedure based on (8.31) and (8.33) requirestwo separate inversions for each strike price in order to evaluate Ft(x) andGt(x), Carr and Madan (1998) have developed a way to price an option witha single Fourier inversion. The method requires merely having a computableexpression for the risk-neutral conditional c.f. and is not limited to theHeston model. The idea is to work with a transform of the price of theoption itself, regarded as a function of the log of the strike. They show thatthis can be expressed as a function of the c.f. of Ft alone, ΨF ≡ Ψ.

Here is how it works for European call options. Express the value of thecall in terms of the log of the strike price as CE(x) = B(t, T )

∫(x,∞)

(es −ex) · dFt(s), where Ft(s) ≡ Pt(lnST ≤ s) is the conditional c.d.f. of lnST

under the risk-neutral measure. The Fourier transform of CE(·), namely∫ eiζxCE(x) · dx, exists if CE itself is integrable, but this condition fails

here since

limx→−∞

CE(x) = B(t, T )∫

es · dFt(s) = St > 0.

An alternative is to work with a “damped” function, CEθ (x) ≡ eθxCE(x).

With θ > 0 the transform of CEθ is then

(FCEθ )(ζ) =

eiζxB(t, T )∫

(x,∞)

(es+θx − e(1+θ)x) · dFt(s) dx

= B(t, T )∫

∫(x,∞)

(es+(θ+iζ)x − e(1+θ+iζ)x) · dFt(s) dx

Page 415: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

398 Pricing Derivative Securities (2nd Ed.)

= B(t, T )∫

∫(−∞,s)

(es+(θ+iζ)x − e(1+θ+iζ)x) · dx dFt(s)

= B(t, T )∫

[e(1+θ+iζ)s

θ + iζ− e(1+θ+iζ)s

1 + θ + iζ

]· dFt(s) (8.35)

= B(t, T )Ψ[ζ − i(1 + θ)]

(θ + iζ)(1 + θ + iζ).

CEθ (x) can now be recovered in one operation by inverting (FCE

θ ), and thisyields the solution for the undamped call price as

CE(x) = e−θxF−1(FCEθ ) =

e−θx

∫ ∞

−∞e−iζx(FCE

θ )(ζ) · dζ. (8.36)

Although the method can work very well, some experimentation with θ isrequired for good results. Also, existence of the integrals in (8.35) requiresthat

∫ e(1+θ)s ·dFt(s) = EtS

1+θT < ∞. If not all moments of ST exist under

P, then this imposes an upper bound on θ

There is an alternative one-step approach that is even simpler andrequires no choice of damping parameter. For this we work directly withthe put, exploiting the identity

Et(X − ST )+ =∫ X

0

Ft(s) · ds,

which is easily established via integration by parts. Transforming ST andX to logs and applying (8.32) give for Et(X − ST )+∫ x

−∞Ft(s)es · ds =

∫ x

−∞

[12− lim

c→∞

∫ c

−c

e−iζs

2πiζΨ(ζ) · dζ

]es · ds

=X

2− 1

2

∫ x

−∞lim

c→∞a(c, s)es · ds,

where a(c, s) ≡∫ c

−ce−iζs

πiζ Ψ(ζ) · dζ. Inversion formula (8.32) implies that| limc→∞ a(c, s)| = |1 − 2Ft(s)| ≤ 1, so for arbitrary ε > 0 there exists cε

such that | sups[1−2Ft(s)−a(cε,s)]| < ε, in which case |a(c, s)es| ≤ es(1+ε)for sufficiently large c. Since es is integrable on (−∞, lnX), the dominatedconvergence theorem implies that∫ x

−∞lim

c→∞a(c, s)es · ds = lim

c→∞

∫ x

−∞a(c, s)es · ds

= limc→∞

∫ x

−∞

∫ c

−c

e(1−iζ)s

πiζΨ(ζ) · dζ ds.

Page 416: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

Models with Uncertain Volatility 399

Moreover, since the limit of the double integral equals X−2Et(X−ST )+ ∈[−X, X ], Fubini’s theorem justifies reversing the order of integration.Therefore,

limc→∞

∫ x

−∞

∫ c

−c

e(1−iζ)s

πiζΨ(ζ) · dζ ds = lim

c→∞

∫ c

−c

[∫ x

−∞e(1−iζ)s · ds

]Ψ(ζ)πiζ

· dζ

=X

πlim

c→∞

∫ c

−c

X−iζ

iζ + ζ2Ψ(ζ) · dζ

and thus, for sufficiently large c,

PE(St, T − t) .= B(t, T )X[12− 1

∫ c

−c

X−iζ Ψ(ζ)ζ(i + ζ)

· dζ

]. (8.37)

Of course, the singularity at ζ = 0 must be avoided in the numericalintegration, but even the simplest trapezoidal scheme with a central intervalthat straddles the origin is fast and very accurate.13 The inequality∣∣∣∣e−iζx Ψ(ζ)

ζ(i + ζ)

∣∣∣∣ <1

ζ2 + 1

can help in setting finite integration limits. Despite the singularity it is easyto adapt f.F.t. routines to carry out the inversion.

Tables 8.1 and 8.2 compare Carr and Madan’s damped estimates ofprices of European calls with estimates obtained using (8.37) and put-call

Table 8.1. Damped, ICDF, and Black-Scholes estimates of call values under geometricBrownian motion: σ0 = .1, S0 = 100, T = .5, r = 0.

Strike Carr-Madan

X θ = .1 θ = .3 θ = .5 ICDF B-S

80 15.86 19.99 20.00 20.00 20.0085 10.88 15.02 15.02 15.02 15.0290 6.06 10.19 10.20 10.20 10.2095 1.80 5.93 5.94 5.94 5.94

100 −1.32 2.81 2.82 2.82 2.82105 −3.09 1.04 1.05 1.05 1.05110 −3.84 .30 .30 .30 .30115 −4.07 .06 .07 .07 .07120 −4.13 .00 .01 .01 .01

13The range of integration can be restricted to (0,∞) by doubling the result,but computation is actually slower because smaller step sizes are needed to maintainaccuracy.

Page 417: Pricing derivative securities

April 9, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch08 1st Reading

400 Pricing Derivative Securities (2nd Ed.)

Table 8.2. Damped, ICDF, and two-step estimates of call values under s.v. dynamics:α = .02, β = 2, γ = .10, σ0 = .1, ρ = −.5, S0 = 100, T = .5, r = 0.

Strike Carr-Madan

X θ = .1 θ = .3 θ = .5 ICDF 2-step

80 15.87 20.00 20.01 20.01 20.0185 10.92 15.05 15.06 15.06 15.0690 6.15 10.28 10.29 10.29 10.2995 1.88 6.01 6.02 6.02 6.02

100 −1.36 2.78 2.78 2.78 2.78105 −3.21 .92 .93 .93 .93110 −3.93 .20 .21 .21 .21115 −4.11 .02 .03 .03 .03120 −4.14 .00 .00 .00 .00

parity. The latter estimates are labeled “ICDF” (for “integrated c.d.f.”).Carr-Madan estimates are given for several values of damping parameter θ.Entries in table 8.1 are for the special case of geometric Brownian motion(α = β = γ = 0) to allow comparison with values from Black-Scholes formu-las. Entries in table 8.2 are for Heston’s stochastic volatility model. Thosein the “2-step” column are obtained by his two-step procedure, using themethod described in Waller et al. (1995) to invert the c.f.s. Numerical inte-grations are by simple trapezoidal approximation with ζ ranging between±99.9 in steps of 0.2.

The estimates with θ = .1 show that the choice of θ does matter in theCarr-Madan procedure. Nevertheless, damping procedures applied either tocalls or directly to puts work well in this application for a wide range ofθ values. Indeed, estimates even for θ = .1 can be brought into line if theintegration steps are made sufficiently small. However, valuing options byintegrating the risk-neutral c.d.f. is simpler, and it is just as accurate as themuch slower two-step procedure.

Page 418: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

9Discontinuous Processes

This chapter treats derivatives that depend in some way on processes withdiscontinuous sample paths, such as Poisson processes that count randomlyspaced, discrete events in time. Such events can figure into the modelingof derivatives in two general ways. One way is to affect the timing of aderivative’s potential payoffs; for example, European options with randomtime to expiration or loan-guarantee policies with random audit times.These models are treated in section 9.1. Another way is in the model-ing of price processes for underlying assets, and the remaining sections ofthe chapter cover some of the many ways of letting price processes changeabruptly. Section 9.2 considers mixed jump/diffusion processes whose sam-ple paths are almost everywhere continuous but have a random, Poisson-distributed number of breaks per unit time. Such models allow explicitlyfor the abrupt, large changes in prices of financial assets that sometimesdo occur. Section 9.3 shows that one can also allow for stochastic volatilityin the diffusion part and even for volatility processes that are themselvessubject to jumps. As we shall see in section 9.4, there are also useful modelsbased on purely discontinuous—pure-jump—processes, some of which haveinfinitely many jumps during any finite interval of time. The final sectionof the chapter presents a general class of regime-switching models in whichseveral distinct types of processes drive the price of the underlying security,alternating at random times. Not surprisingly, it will turn out that short-lived and out-of-money options can be worth considerably more when theunderlying price is subject to any of these forms of abrupt change.

The special tools needed to describe and work with discontinuous pro-cesses were surveyed in section 3.5, and most of the processes encounteredin this chapter have already been described. We begin with applications

401

Page 419: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

402 Pricing Derivative Securities (2nd Ed.)

involving the simple Poisson process, using the differential-equation tech-nique to price derivatives whose payoffs are triggered by Poisson events.

9.1 Derivatives with Random Payoff Times

Section 7.1.3 showed that it is relatively straightforward to develop analyt-ical formulas for arbitrage-free prices of indefinitely lived American optionswhen the underlying follows geometric Brownian motion. Assuming con-stant dividend rates and short rates of interest, values of perpetual deriva-tives are not time-dependent. In this case the replicating argument thatleads to the Black-Scholes partial differential equation for finite-lived deriva-tives leads to a more tractable ordinary differential equation in the price ofthe underlying. Usually, this o.d.e. is much easier to solve. Interestingly,because of an elementary property of the interarrival times of Poissonevents, the same simplifying circumstance is present when the timing ofa derivative’s payoffs is governed by a Poisson process.

Given a Poisson process Ntt≥0 with intensity θ, and taking t as thecurrent time, let Tt be the time that will elapse until the next Poisson eventoccurs; that is, Tt = t∗ − t if the next event occurs at t∗ . Now for τ > 0

P(Tt ≤ τ | Ft) = P(Nt+τ − Nt > 0 | Ft) = P(Nt+τ − Nt > 0),

since the Poisson process has independent increments. The incrementNt+τ − Nt is distributed as Poisson with mean θτ , so P(Nt+τ − Nt = 0) =e−θτ . Thus, the c.d.f. of Tt is

FTt(τ) =

0, τ ≤ 01 − e−θτ , τ > 0.

The interarrival times of Poisson events are thus exponentially distributedwith mean θ−1. They are also stationary, in that the distribution of thetime until the next event depends on neither the current time nor the timeof the most recent event. This memoryless feature of exponential waitingtimes causes prices of derivatives whose payoffs are triggered by Poissonevents to be time-invariant whenever the dynamics of the underlying assetsunder P are also stationary. In such cases, the law of motion for a derivativewith a single underlying asset can be described by just an ordinary differ-ential equation, and solving this is usually easier than applying martingalemethods.

For later reference notice also that the exponential model impliesFTt(τ) → 0 for all τ as θ → 0 and that FTt(τ) → 1 for τ > 0 as θ → ∞.

Page 420: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

Discontinuous Processes 403

Thus, almost surely,

limθ→0

Tt = ∞ and limθ→∞

Tt = 0. (9.1)

We now give three examples of such derivative assets, solving ordinarydifferential equations to work out their arbitrage-free prices. In each case itis assumed that the risk associated with the Poisson events is diversifiableand therefore not compensated in market equilibrium. This implies that nopreference parameters are present in the equilibrium solution.1

1. Consider a pension fund that holds a portfolio of traded assets whosevalue, currently S0, follows geometric Brownian motion with volatilityσ. The fund’s managers now have contractual obligations to pensionerstotaling L0, but the obligation will grow over time at a fixed, continuousrate λ. The fund is currently solvent, meaning that L0 < S0. In orderto guarantee that it can meet its future obligations, the fund seeks tobuy insurance against insolvency. This would work in the following way.The insurer would audit the fund on a random schedule governed bya Poisson process with intensity θ > 0. The value of this parameter,which equals the mean number of audits per unit time, would be agreedupon in advance. If the fund was found solvent at an audit date τ thenit would continue to operate; but if Lτ > Sτ the insurer would takeover the fund’s assets and make up the shortfall, Lτ − Sτ , out of itsown resources. The problem is to find the value of such an insurancearrangement as a function of St and Lt.

The problem becomes simpler upon normalizing by Lt and valuinginsurance per unit of liabilities. For this, let st ≡ St/Lt, so that thesolvency condition becomes st ≥ 1. The process st evolves under risk-neutral measure P as

dst = (Lt · dSt − St · dLt)/L2t

= (r − λ)st · dt + σst · dWt,

where r is the short rate of interest. The value of the contract perunit of insured liabilities can be written as i(st; Nt) ≡ I(St, Lt; Nt)/Lt,where Ntt≥0 is the Poisson process that counts the audits from time 0.

1Shimko (1992) gives other examples that involve pricing Poisson-dependent payoff

streams. Merton (1978) and Pennacchi (1987a, b) have applied techniques like those inthe first two examples to analyze the fair pricing of deposit insurance at commercialbanks.

Page 421: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

404 Pricing Derivative Securities (2nd Ed.)

Applying to i(·; ·) the extension of Ito’s formula for functions of Ito andjump processes—that is, expression (3.43) on page 130 with Jt = Nt—gives

di(st; Nt) = i′ · dst + (i′′/2) · d〈s〉t + [i(st, Nt) − i(st, Nt−)].

If the fund is solvent at the audit, then i(st, Nt) − i(st, Nt−) = 0 andthere is no change in value. If the fund is insolvent, it receives 1− st perunit of liabilities and the insurance terminates—along with the fund’scontrol over assets. Thus, when st < 1

i(st; Nt) − i(st; Nt−) = 1 − st − i(st; Nt−).

The P-expected instantaneous change in normalized value during[t, t + dt) is then limh↓0 Et[i(st+h, Nt+h) − i(st, Nt)] or

Etdi(st; Nt)=

[(r − λ)sti

′ +12σ2s2

t i′′]· dt, st ≥ 1[

(r − λ)sti′+

12σ2s2

t i′′ + (1 − st − i)θ

]· dt, 0 ≤ st < 1.

(9.2)Since e−rtI(St, Lt; Nt) is a martingale under P , we also have EdI =rI · dt, so that

Edi(st; Nt) = L−1t (EdI − I · dLt/Lt)

= ri · dt − i · dLt/Lt

= (r − λ)i(st; Nt) · dt. (9.3)

Equating (9.2) and (9.3) gives the ordinary differential equation thatmust be satisfied by the normalized value of insurance:

0 =

(r − λ)sti

′ +12σ2s2

t i′′ − (r − λ)i, st ≥ 1

(r − λ)sti′ +

12σ2s2

t i′′ − (r − λ + θ)i + (1 − st)θ, 0 ≤ st < 1.

Suppressing the Poisson counter now, the general solution to these equa-tions has the form

i(st) = α+sβ+

t + α−sβ−t + γst + δ.

Since the value of insurance should approach zero as the asset/liabilityratio goes to infinity, we have an upper boundary condition i(+∞) = 0.There is also a lower boundary condition, i(0) ≤ 1, since the value ofinsurance can be no greater than the current liability. The specific form

Page 422: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

Discontinuous Processes 405

of solution depends on the value of st. In the case st ≥ 1 we haveγ = δ = 0 and the two roots β+ = 1 and β− = −2(r−λ)/σ2. The upperboundary condition requires α+ = 0, so that β+ becomes irrelevantwhen st ≥ 1. If λ ≥ r the condition also requires α− = 0, indicatingthat there are no nontrivial solutions. Assuming that λ < r, solutionsare restricted to i(st) = α−sβ−

t . In the case st < 1 we have γ = −1,δ = θ/(θ + r − λ), and

β± = 1/2 − (r − λ)/σ2 ±√

[(r − λ)/σ2 + 1/2]2 + 2θ/σ2 (9.4)

with β+ > 1 and β− < 0. The lower boundary condition requires α− = 0(making β− irrelevant now) and restricts solutions to

i(st) = α+sβ+

t − st + θ/(θ + r − λ).

Notice that this gives i(0) = θ/(θ+r−λ) = [1+(r−λ)/θ]−1. Thus, in thisworst possible case the value of insurance is Lt times a discount factor.The factor approaches unity as either the rate of growth of liabilitiesapproaches the short rate or the expected time until the next audit(θ−1) approaches zero. At these extreme values the right compensationfor the insurer equals the full present value of liabilities.

To complete the problem requires finding α+ and α−. Requiring i

and i′ to be continuous at st = 1 imposes the conditions

α− − α+ + 1 − i(0) = 0

α−β− − α+β+ + 1 = 0,

which yield solutions

α+ =1 − [1 − i(0)]β−

β+ − β− (9.5)

α− =1 − [1 − i(0)]β+

β+ − β− . (9.6)

Merton (1990, chapter 13) shows that if a solution with continuous firstderivative does exist then it must be the correct, arbitrage-free valuation.Summarizing, the solution for insurance value is

i(st) =

α−sβ−t , st ≥ 1

α+sβ+

t − st + θ/(θ + r − λ), st < 1,

with α+ and α− given by (9.5) and (9.6).

Page 423: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

406 Pricing Derivative Securities (2nd Ed.)

2. Continuing the previous example, suppose the managers of the pensionfund plan to assess the beneficiaries for the cost of insuring its sol-vency. To do this, they will collect at periodic intervals an accumulatedcash sum that accrues at the continuous rate Ct ≡ cLt so long as theinsurance is maintained—that is, so long as the fund remains solvent.Payments are to be contributed by beneficiaries rather than paid out ofthe fund. The problem is to determine c in advance such that the valueof this indefinitely lived stream equals the value of insurance. Clearly,this fair value of c will depend on the value of the fund’s assets when theinsurance agreement is consummated. We will value the stream in termsof an arbitrary c and then determine the c that equates the stream’svalue to the value of insurance.

Letting V (St, Lt; Nt, c) represent the value of a stream at arbitraryfixed proportional rate cLt, and again normalizing by Lt, we have st ≡St/Lt and V (St, Lt; Nt, c)/Lt ≡ v(st; Nt) (with c understood). Sinceunder P the stream must yield the same expected return as a risklessbond, it follows that

EdV (St, Lt; Nt, c) + cLt · dt = rV (St, Lt; Nt, c) · dt.

Therefore,

Edv(st; Nt) + c · dt ≡ L−1t (EdV − V · dLt/Lt) + c · dt

= L−1t (rV − cLt − V λ) · dt + c · dt

= (r − λ)v(st; Nt) · dt. (9.7)

Expressing Edv using the extended Ito formula gives

Edv(st; Nt) =

(r − λ)stv′ +

12σ2s2

t v′′ + [v(st; Nt) − v(st−; Nt−)]θ

·dt.

When the fund is solvent the audit brings no change in the value of thecash stream, so that the bracketed term is zero when st ≥ 1. However,v(st; Nt) − v(st; Nt−) = −v(st; Nt−) when st < 1, since the streamterminates if the fund is found to be insolvent. Equating Edv + c withthe version of (9.7) that is appropriate in each of these cases gives thefollowing differential equation:

0 =

12σ2s2

t v′′ + (r − λ)stv

′ − (r − λ)v + c, st ≥ 1

12σ2s2

t v′′ + (r − λ)stv

′ − (r − λ + θ)v + c, 0 ≤ st < 1.

Page 424: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

Discontinuous Processes 407

Still assuming that λ < r, the upper boundary condition is v(+∞) <

∞, since the present value of a sure infinite stream proportional to Lt

is finite if Lt grows more slowly than the interest rate. At the lowerboundary we have 0 ≤ v(0) < ∞. Finally, the solutions for the twocases and their derivatives should be continuous at st = 1. The generalsolution in each case is of the form

v(st) = α+sβ+

t + α−sβ−t + δ.

The case st ≥ 1 yields β+ = 1 and β− = −2(r − λ)/σ2 < 0, as for theinsurance value, and δ = c/(r − λ). The case st < 1 yields the sameexpressions as (9.4) for β±, with δ = c/(θ + r−λ). For st ≥ 1 the upperboundary condition requires α+ = 0, and for st < 1 the lower boundaryrequires α− = 0. The solutions are thus

v(st) =

α−sβ−t + c/(r − λ), st ≥ 1

α+sβ+

t + c/(θ + r − λ), 0 ≤ st < 1.

Equating the two expressions and their first derivatives at st = 1 andsolving for α+ and α− give final expressions

v(st) =

c

[− i(0)

r − λ

β+

β+ − β− sβ−t +

1r − λ

], st ≥ 1

c

[− i(0)

r − λ

β−

β+ − β− sβ+

t +1

θ + r − λ

], 0 ≤ st < 1.

Here, i(0) ≡ [1 + (r − λ)/θ]−1 is the worst-case value of insurance.Notice that v(st) is increasing in st in both the solvency and insolvencystates, since β− < 0. Notice, too, that v ↑ c/(r − λ) as st → ∞ andv ↓ c/(r − λ + θ) as st → 0. The upper limit is the value of a perpetualstream of payments beginning at rate c and growing at rate λ. It isindependent of θ because the rate of audit is irrelevant if the fund’ssolvency is not in doubt. To interpret the lower bound, notice that θ−1

is the expected time until the next audit, at which, in this limiting case,the stream of payments would certainly terminate.

The fair proportional rate of payment, c(st) ≡ C(St, Lt)/Lt, thatequates i(st) and v(st) is given by

c(st) =1 − [1 − i(0)]β+sβ−

t

−β− + β+[1 − i(0)sβ−t ]

(r − λ)

Page 425: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

408 Pricing Derivative Securities (2nd Ed.)

when st ≥ 1 and

c(st) =

[1−[1−i(0)]β−

β+−β− sβ+

t − st + i(0)]

[(θ + r − λ)−1 − i(0)

r−λβ−

β+−β− sβ+

t

]when 0 ≤ st < 1. Note that c(st) ↓ 0 as st → ∞ and that c(st) →i(0)(θ + r − λ) = θ as st → 0. In the first case the interpretation issimply that the fair rate of compensation for insurance declines as thefund’s enduring solvency becomes more certain. In the second case, asthe fund becomes hopelessly insolvent, the required rate of compensationfor insurance is greater the sooner an audit is apt to occur.

3. Consider a European-style call option with the unusual feature thatthe time to expiration is uncertain, being distributed as exponentialwith mean θ−1 > 0. Under the risk-neutral measure the price of theunderlying asset evolves as dSt = rSt · dt + σSt · dWt. Because of thedual relation between exponential waiting times and Poisson processes,we recognize that expiration will occur at the next change in a Poissonprocess, Ntt≥0, that has intensity θ. Therefore, expressing the valueof the call at t as CE(St, Nt) and applying (3.43), we find its law ofmotion under P to be

dCE(St, Nt) = CES · dSt +

12CE

SS · d〈s〉t + [CE(St, Nt) − CE(St, Nt−)].

Normalizing CE(·, ·) and St by the strike price, with st ≡ St/X andc(st, Nt) ≡ CE(st, Nt)/X , and noting that c(st, Nt) = (st − 1)+ whenNt − Nt− = 1, the above can be written

dc(st, Nt) = (csstr + csss2t σ

2/2) · dt + csst · dWt

+ [(st − 1)+ − c(st, Nt−)]. (9.8)

The fair-game property of the call’s discounted value under P impliesEdc(st, Nt)/c(st, Nt) = r · dt. Taking expectations in (9.8) then givesthe following two-part differential equation that c(·, ·) must satisfy inthe absence of arbitrage:

0 = csstr + csss2t σ

2/2 − c(θ + r), 0 ≤ st ≤ 1

0 = csstr + csss2t σ

2/2 + θ(st − 1) − c(θ + r), st > 1.(9.9)

The solutions of both equations are of the form α+sβ+

t +α−sβ−t +γst+δ,

where

β± = (−r/σ2 + 1/2) ±√

(r/σ2 + 1/2)2 + 2θ/σ2.

Page 426: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

Discontinuous Processes 409

Note that β+ > 1 and β− < 0. Lower boundary condition c(0, Nt) = 0requires α− = γ = δ = 0 in the first equation of (9.9). Assuming asan upper boundary condition that cs is bounded requires α+ = 0 whenst > 1. In that case we also have γ = 1 and δ = −θ/(θ+r). Thus, lettingt = 0 be the current time and taking N0 = 0,

c(s0, 0) =

α+sβ+

0 , s0 ≤ 1

α−sβ−0 + s0 − θ/(θ + r), s0 > 1.

Finally, imposing continuity of c(·, ·) and of cs(·, ·) at s0 = 1 gives solu-tions for α+ and α−:

α+ = (β+ − β−)−1[1 − β−r/(θ + r)]

α− = (β+ − β−)−1[1 − β+r/(θ + r)].

Recall from (9.1) that the exponential waiting time goes to infinitya.s. as θ → 0 and that it approaches zero a.s. as θ → ∞. Noticethat our solution implies c(s0, 0) → s0 as θ → 0, so that the valueof the call approaches that of the underlying as the expiration datebecomes more and more remote. This is consistent with the sharp,dynamics-free bounds that were established in section 4.3.3. Conversely,c(s0, 0) → (s0 − 1)+ as θ → ∞ and the time to expiration approacheszero. Of course, the Black-Scholes formula has the same limiting valuesas the time to expiration goes to zero or infinity.

9.2 Derivatives on Mixed Jump/Diffusions

This section treats the pricing of derivatives when underlying price processStt≥0 is a geometric Brownian motion with added stochastic, Poisson-directed jumps. This is a mixture of an Ito process and a “J” process likethat in example 51 in section 3.5.2.

9.2.1 Jumps Plus Constant-Volatility Diffusions

For the applications we parameterize the mixed process, as follows. WithdSt = at · dt + bt · dWt + ct−U · dNt, take EU = υ, at = µtSt−, bt = σSt−,and ct = St, giving

dSt = µtSt− · dt + σSt− · dWt + St−U · dNt. (9.10)

Page 427: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

410 Pricing Derivative Securities (2nd Ed.)

To see what the model implies for the conditional distribution of ST , putf(St) = lnSt and apply the extended Ito formula (3.45) as in (3.47) assum-ing that P(1 + U ≤ 0) = 0:

d lnSt = (µt − σ2/2) · dt + σ · dWt + (lnSt − lnSt−)

= (µt − σ2/2) · dt + σ · dWt + ln(1 + U) · dNt.

In integral form this is

lnST − lnSt = (µt − σ2/2)(T − t) + σ(WT − Wt) +NT∑

j=Nt+1

ln(1 + Uj)

where the summation is taken to equal zero if NT −Nt = 0. This representsthe continuously compounded ex -dividend return on the underlying as thesum of a deterministic component, a normally distributed part arising fromthe Brownian motion, and a jump component. The jump part is a sum ofa Poisson-distributed number NT − Nt of i.i.d. random shocks.

To complete the specification requires a model for U . A convenientassumption is that 1 + Uj∞j=1 are i.i.d. as lognormal, independent ofWtt≥0 and Ntt≥0, with parameters ln(1 + υ) − ξ2/2 and ξ2. With thisparameterization E(1 + U) = exp[ln(1 + υ) − ξ2/2 + ξ2/2] = 1 + υ, aspreviously specified. Conditional on NT − Nt = n, the continuously com-pounded return ln(ST /St) is then a sum of n+1 independent normals withconditional mean

mn ≡ (µt − σ2/2)τ + n[ln(1 + υ) − ξ2/2]

and conditional variance

vn ≡ σ2τ + nξ2,

where τ ≡ T−t. Unconditionally, the return is a Poisson mixture of normals.Press (1967) proposed such a mixture to capture the thick tails observed inempirical marginal distributions of logs of stocks’ high-frequency returns.As we saw in example 58 on page 137, the process lnStt≥0 can also beconstructed as a time change of a Brownian motion in the case υ = 0.

The marginal distribution of lnST − lnSt can have many shapes, includ-ing bimodality when

∣∣ln(1 + υ) − ξ2/2∣∣ is sufficiently large. The moments

can be found most easily from the cumulant generating function, Lt(ζ) ≡

Page 428: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

Discontinuous Processes 411

lnMt(ζ) = lnE exp[ζ ln(ST /St)]:2

Lt(ζ) = τζ[µt + (ζ − 1)σ2/2] + θ[(1 + υ)ζe(ζ2−ζ)ξ2/2 − 1].

Evaluating Mt(·) at ζ = 1 gives the conditional expectation of the totalreturn as

Et(ST /St) = e(µt+θν)(T−t). (9.11)

Differentiating the cumulant generating function three times and usingexpression (2.42) on page 61, one finds that coefficient of skewness α3 isinversely proportional to

√T − t and that its sign is that of [ln(1+υ)−ξ2/2].

Thus, the distribution can be right- or left-skewed or symmetric, with theasymmetry of probability mass in the tails increasing as time to expirationdiminishes. Notice that the skewness of log return is negative if the meansize of jumps, υ, is less than or equal to zero. Differentiating again andapplying (2.43) shows that excess kurtosis α4 − 3 is positive and of order(T−t)−1 except in the degenerate cases θ = 0 or υ = 0 and ξ = 0. Thus, likepure Ito processes with stochastic volatility, the distribution of log returndoes have thicker tails than the normal. We will see later that the thicktails produce higher implicit volatilities for options far in and far out of themoney, the effect being most pronounced for options with short lives.

9.2.2 Nonuniqueness of the Martingale Measure

As with stochastic-volatility models, derivatives pricing in this environ-ment is complicated by the fact that random-jump dynamics do not admita unique equivalent martingale measure. This can be seen as follows. Refer-ring to (9.11), it is clear that we need µt + θν = rt − δ (the instanta-neous short rate less the dividend rate) if normalized cum-dividend processSc∗

t = Steδt/Mtt≥0 is to be a martingale. Equivalently, using the extended

Ito formula with f(t, St) = Sc∗t = StM

−10 exp[

∫ t

0 (δ − rs) · ds] gives

dSc∗t = (δ − rt)Sc∗

t · dt + Sc∗t−(µt · dt + σ · dWt + U · dNt)

= (δ − rt)Sc∗t−(1 + U · dNt) · dt + Sc∗

t−(µt · dt + σ · dWt + U · dNt)

= Sc∗t−[(µt + θν + δ − rt) · dt + σ · dWt + (U · dNt − θν · dt)],

2This may be found directly or from expression (3.49) for the log of the c.f. of themixed process in example 51.

Page 429: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

412 Pricing Derivative Securities (2nd Ed.)

apart from terms of o(dt). We eliminate the drift term by changing to anymeasure P for which3

• Ntt≥0 is a Poisson process with rate θ.• Random variables ln(1+ Uj)∞j=1 are i.i.d. as N [ln(1+ ν)− ξ

2/2, ξ

2] and

independent of Nt.• Wt = Wt + σ−1

∫ t

0 (rs − δ − θν − µs) · dst≥0 is a Brownian motionindependent of Ntt≥0 and Uj∞j=1.

• θν = θν.

Thus, the martingale property would be consistent with either a high meanrate (θ) of jumps having low mean (ν) or infrequent jumps with large expec-tation. Only the product θν matters for the martingale property.

As we have seen, the absence of a unique martingale measure is associ-ated with the incompleteness of markets, meaning that payoffs of derivativescannot always be replicated with portfolios of traded assets. The inability toreplicate nonlinear payoffs in the jump-diffusion setting can be seen throughthe following example. Take St to be the price of a primary underlyingasset that (for simplicity) pays no dividends, and let Mt be the price of amoney fund paying interest at a constant short rate r. We will try to finda self-financing portfolio of the underlying and the money fund that repli-cates a derivative whose value depends nonlinearly on St and time alone,D(St, T − t). Setting D(St, T − t) = ptSt + qtMt as was done in chapter 6when St was a continuous Ito process, write

dD = pt · dSt + (D − ptSt−) · dMt/Mt

= [Dr + ptSt−(µt − r)] · dt + ptSt−σ · dWt + ptSt−U · dNt.

Applying the extended Ito formula to D(St, T − t) gives

dD = [−DT−t + DSStµt + DSSS2t σ2/2] · dt + DSSt−σ · dWt

+ D[St−(1 + U), T − t] − D(St−, T − t) · dNt.

3Putting λ = t−1σ−1R t0 (rs − δ − θν − µs) · ds and QT = exp(−λ2T/2 − λWT ),

the change in Brownian measure is accomplished as eP(A) =R

1A(ω)QT (ω) · dP(ω),where A is any event in the product σ-field generated by the independent families ofrandom variables Wt0≤t≤T , Nt 0≤t≤T , Uj∞j=1. Simultaneous changes in the Pois-

son measure are effected with the Radon-Nikodym process Q′t = (λ/λ)Nt e−t(λ−λ),

as eP(A) =R

1A(ω)QT (ω)Q′T (ω) · dP(ω), while changes of the ln(1 + Uj) from

N [ln(1+ν)−ξ2/2, ξ2] to N [ln(1+ ν)− ξ2/2, ξ

2] are done with the usual Radon-Nikodym

derivative.

Page 430: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

Discontinuous Processes 413

If we maintain pt = DS , then the portfolio matches the progress of theBrownian motion, but it does not capture the jump risk unless

DS =D[St−(1 + U), T − t] − D(St−, T − t)

St−(1 + U) − St−(9.12)

for all values of random variable U . Clearly, this is possible if and only if D

is linear in S, as it would be, for example, if D were a forward contract.4

In the absence of a unique martingale measure we are again obliged toconsider how the parameters of the process Stt≥0 reflect attitudes towardrisk. One simple possibility, considered by Merton (1976), is that jump riskis diversifiable. In that case, since the diffusion risk can be hedged away, onlythe µt of the diffusion process reflects attitudes toward risk. The remain-ing parameters of the model are the same under the risk-neutral and naturalmeasures and can be estimated with arbitrarily high precision by sufficientlypatient observation of the process. In this case, Girsanov’s theorem impliesa unique new measure P under which Wt = Wt+(

∫ t

0 (rs−δ−µs)·ds+θνt)/σ

is a Brownian motion. In effect, the change of measure is accomplished bysetting µt = rt−δ+θν and Wt = Wt in (9.10), leaving all other parametersthe same. However, as Jarrow and Rosenfeld (1984) argue, the occasionallarge breaks in market indices are highly improbable if all the jump riskis asset-specific, as it would have to be in order to be diversifiable. Bates(1996a) and Bakshi et al. (1997) appeal to representative-agent equilibriummodels that imply a risk-adjusted version of (9.10) with the same generalform, but with values of υ, ξ, and θ that register risk and time preference.5

These implicit parameters, which we have represented as υ, ξ, and θ, canbe inferred from prices of traded derivatives—for example, by nonlinearleast squares. There is evidence even in pre-1987 data that these can differconsiderably from the corresponding versions under P.6

9.2.3 European Options under Jump Dynamics

A computational formula for European-style derivatives can be devel-oped easily by means of a conditioning argument. The idea is to find the

4If the dynamics were pure-jump, having no diffusion part, and if U were determinis-tic, then the derivative could be replicated by setting pt equal to the right side of (9.12).See Cox and Ross (1976).

5The model proposed by Ahn (1992) does not quite nest within the Merton frame-work unless υ = 0 and there is zero correlation between the jump shocks, U and UW , ofthe security and aggregate wealth.

6Stern (1993), estimating υ, ξ, θ, and σ by maximum likelihood from daily 1986–87stock returns, finds that the jump model, while an improvement over Black-Scholes, stillunderprices short-maturity, out-of-money options.

Page 431: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

414 Pricing Derivative Securities (2nd Ed.)

conditional expectation of the terminal (time-T ) payoff under P given a(random) number of jumps, then apply the tower property. Taking t = 0as the current date and continuing to assume that short rates are F0-measurable, the distribution of ln(ST /S0) under P conditional on NT = n

is normal with variance σ2T + nξ2

and mean∫ T

0

rt · dt − θυT + n ln(1 + υ) − δT − (σ2T + nξ2)/2.

With some new notation values of options conditional on the numberof jumps can now be expressed in terms of Black-Scholes formulas. PuttingBn ≡ B(0, T ) exp(θυT )(1 + υ)−n and σ2

n ≡ σ2 + nξ2/T, the conditional

distribution of ST /S0 is then lognormal with parameters − lnBn − δT −σ2

nT/2 and σ2nT. If Bn were actually the current price of a T -maturing bond,

then conditional on NT = n the discounted expected payoffs of Europeancall and put struck at X would be given by the Black-Scholes formulas withBn, σn replacing B(0, T ), σ:

CE(S0, T ; Bn, σn) ≡ S0e−δT Φ[q+

n (fn/X)] − BnXΦ[q−n (fn/X)]

PE(S0, T ; Bn, σn) ≡ BnXΦ[q+n (X/fn)] − S0e

−δT Φ[q−n (X/fn)],

where fn = S0e−δT /Bn and q±n (s) ≡ [ln s±σ2

nT/2]/(σn

√T ).However, since

the bond’s price is actually B(0, T ) rather than Bn, the discounted expectedpayoffs conditional on NT = n are really B(0, T )/Bn times these modifiedBlack-Scholes prices; that is,

B(0, T )E[(ST − X)+ | NT ] = CE(S0, T ; BNT , σNT )B(0, T )/BNT

B(0, T )E[(X − ST )+ | NT ] = PE(S0, T ; BNT , σNT )B(0, T )/BNT .

The arbitrage-free prices of European calls and puts are then the expecta-tions of these conditional values. Again, some new notation simplifies thefinal expressions. Representing Poisson probabilities as p(n; θ) = θne−θ/n!and letting θυ ≡ θ(1+υ), notice that p(n; θT )B(0, T )/Bn = p(n; θυT ). Withthis notation, the values of the European call and put are, respectively,

B(0, T )E(ST − X)+ =∞∑

n=0

p(n; θυT )CE(S0, T ; Bn, σn) (9.13)

B(0, T )E(X − ST )+ =∞∑

n=0

p(n; θυT )PE(S0, T ; Bn, σn). (9.14)

To make the computations, one simply truncates the summationswhen the tail probability for the modified Poisson distribution becomes

Page 432: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

Discontinuous Processes 415

sufficiently small. Since the put’s conditional values are bounded (by X),it is best to work with (9.14) and price the call by European put-call par-ity. One can then terminate the sum in (9.14) at a value N(ε) such that1−

∑N(ε)n=0 p(n; θυT ) ≤ ε/X, where ε is an error tolerance. Notice that Pois-

son probabilities can be calculated recursively as p(n; θ) = p(n − 1; θ)θ/n

for n ∈ ℵ, with p(0; θ) = e−θ. Program JUMP on the accompanying CD(and implemented also in the ContinuousTime.xls spreadsheet) valuesEuropean puts and calls by these methods. Alternatively, options can bepriced by Fourier inversion using one of the one-step approaches describedin section 8.4.2.

9.2.4 Properties of Jump-Dynamics Option Prices

Figures 9.1–9.4 show smile curves from the jump-diffusion model. Theseare plots of implicit volatility vs. moneyness, X/St, when option prices aregenerated by (9.13) or (9.14). The figures contrast (i) times to expiration,T − t; (ii) Poisson intensities, θ; (iii) volatilities of the jump shock, ξ; and(iv) means of the jump shock, υ. Figure 9.1 shows that implicit volatili-ties at low values of X/St (where puts are out of the money) and at high

0.7 0.8 0.9 1.0 1.1 1.2 1.3 1.4

X/St

0.30

0.31

0.32

0.33

0.34

0.35

0.36

0.37

0.38

0.39

Impl

icit

Vol

atili

ty

T-t = 0.1

T-t = 0.5

θ = 0.1, ν = 0.0, ξ = 0.2, σ = 0.3, r = .05

Fig. 9.1. Jump-diffusion implicit volatilites for short vs. long maturities, T − t.

Page 433: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

416 Pricing Derivative Securities (2nd Ed.)

0.7 0.8 0.9 1.0 1.1 1.2 1.3 1.4

X/St

0.30

0.32

0.34

0.36

0.38

0.40

0.42

0.44

0.46

0.48

0.50

Impl

icit

Vol

atili

ty

θ = 0.1

θ = 1.0

ν = 0.0, T-t = 0.1, ξ = 0.2, σ = 0.3, r = .05

Fig. 9.2. Jump-diffusion implicit volatilities for high vs. low jump intensities, θ.

0.7 0.8 0.9 1.0 1.1 1.2 1.3 1.4

X/St

0.35

0.37

0.39

0.41

0.43

0.45

0.47

0.49

0.51

Impl

icit

Vol

atili

ty

ν = 0.0, T-t = 0.5, θ = 1.0, σ = 0.3, r = .05

ξ = 0.2

ξ = 0.4

Fig. 9.3. Jump-diffusion implicit volatilities for high vs. low jump volatilities, ξ.

Page 434: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

Discontinuous Processes 417

0.7 0.8 0.9 1.0 1.1 1.2 1.3 1.4

X/St

0.30

0.31

0.32

0.33

0.34

0.35

0.36

0.37

0.38

0.39

0.40

0.41

0.42

Impl

icit

Vol

atili

ty ν = 0.0

ν = -0.1

T-t = 0.1, θ = 0.1, ξ = 0.2, σ = 0.3, r = .05

Fig. 9.4. Jump-diffusion implicit volatilities for zero vs. negative mean jump sizes, ν.

values of X/St (where calls are out) increase markedly as the time to expi-ration decreases. This results from the higher probability that short-lived,out-of-money options will hit the money when the underlying price pro-cess may be discontinuous. Figures 9.2 and 9.3 show that implicit volatil-ities increase with either the mean frequency or volatility of jump shocks,and more rapidly so for options away from the money. Figure 9.4 showsthat the greater negative skewness in the distribution of ln(ST /St) whenυ is negative enhances the smirk effect in the smile curve, raising implicitvolatility more for out-of-money puts than for out-of-money calls. It doesmake sense intuitively that the greater potential for negative shocks addsmore value to puts of given degrees of moneyness than it does to calls.

9.2.5 Options Subject to Early Exercise

Obviously, no simple formula like (9.13) applies to American-style optionson assets whose prices follow jump-diffusions. However, Amin (1993) hasdeveloped a very satisfactory method for pricing derivatives subject toearly exercise by discretizing both the time and state spaces, as in the

Page 435: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

418 Pricing Derivative Securities (2nd Ed.)

binomial method. The diffusion part of the price process is approximatedusing Bernoulli dynamics, with price moving either up or down by one tickduring (t, t + ∆t], while jumps are modeled as multitick moves that occurwith probability θ · ∆t. The state space and risk-neutral probabilities canbe defined in such a way that the discrete process converges weakly to thejump-diffusion as the number of time steps increases and ∆t decreases. Ateach time step tj (with tj+1− tj ≡ ∆t) and price state sk an American put,say, is valued recursively as

PA(sk, T − tj) = maxX − sk, e−r∆tEtj [PA(Sj+1, T − tj+1) | Sj = sk].

The method allows flexibility in modeling distributions of the jumpsthemselves. In the degenerate case P(U = −1) = 1, wherein jumps cor-respond to bankruptcy events, Amin (1993) discovers through numeri-cal computation that the early exercise boundary for American calls ondividend-paying stocks is higher than it is under the pure-diffusion model.In other words, even though bankruptcy would wipe out a call’s value, it isoptimal to hazard this event and delay exercise until the call becomes evendeeper in the money than under geometric Brownian motion. The earlyexercise of American puts is also delayed—occurs deeper in the money—asone would expect. Things are more complicated with lognormal jumps, theexercise boundaries moving farther in the money for short-term puts andcalls, but moving nearer the money for longer maturities.

9.3 Jumps Plus Stochastic Volatility

Working with 1984–1991 data for $/DM options and estimating parame-ters from actual option prices via nonlinear least squares, Bates (1996a)found that a stochastic-volatility model without jumps gives better fitsin sample than a mixed jump process with deterministic volatility on theIto part. However, since the estimated volatility of the variance process isimplausibly high given the time-series behavior of implicit volatilities, heconcludes that combining the two models is the best way to account for thevolatility smile. In this way the jump process carries some of the burden forexplaining the high kurtosis in exchange-rate changes. Bakshi et al. (1997)find with 1988–1991 data for S&P 500 index options that such an “s.v.-jump” model that allows for jumps in price as well as stochastic volatilityreduces both in-sample and out-of-sample pricing errors, particularly forshort-lived options away from the money. Certainly, one expects to havegreater flexibility in accounting for the smirk in implicit volatilities if thetwo effects are combined. In this section we look first at methods used by

Page 436: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

Discontinuous Processes 419

Bates (1996a), Bakshi et al. (1997), and Scott (1997) to develop computa-tional formulas for European options in the s.v.-jump framework.7 Thereare, however, indications in both option prices and index returns that therapid changes and sustained levels of volatility following major economicevents require still more elaborate models. Eraker et al. (2003) argue thatreplicating this aspect of the data requires modeling volatility also as adiscontinuous process. An alternative idea, proposed by Fang (2002), is toallow the intensity of jumps in the price process to vary stochastically withtime. This feature can help to capture the abruptness and persistence ofchange in the tenor of the market even when price volatility is assumed tobe constant; however, adding stochastic volatility does further enhance themodel’s flexibility. We will see that such extensions to the basic s.v.-jumpmodel can be accommodated easily and produce computationally feasibleformulas for prices of European options. Our treatment can be relativelybrief since the analytical methods for treating all these models are similarto those used in the basic s.v. model of Heston (1993).

9.3.1 The S.V.-Jump Model

We work with the following model for the risk-neutral process for the priceof the underlying:

dSt = St−(rt − θυ) · dt +√

vtSt− · dW1t + St−U · dNt

dvt = (α − βvt) · dt + γ√

vt · dW2t.

Here (i) W1t and W2t are standard Brownian motions under P with|EW1tW2t| = |ρ|t < t; (ii) Nt is an independent Poisson processwith (constant) intensity θ; (iii) vt is the square of price volatility; and(iv) ln(1 + U) is distributed independently (of Ntt≥0 and the Brownianmotions) as normal with mean ln(1+υ)− ξ2/2 and variance ξ2.8 The nota-tion for the jump process is the same as in the previous section, exceptthat tildes are omitted from the risk-adjusted parameters. Likewise, theadjusted process for vt is the same as (8.22) except that we express dvt

in terms of a single Brownian motion that is correlated with the one that

7Naik (1993) presents a related model in which volatility is a Markov process thatcan switch among two or more discrete levels, with accompanying nonstochastic jumpsto price.

8The model applies to a primary asset that pays no dividend. Otherwise, we makethe usual adjustments: replace rt by rt − δt if the underlying pays a continuous dividendand set the mean proportional drift rate to −θυ if St represents a futures price.

Page 437: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

420 Pricing Derivative Securities (2nd Ed.)

drives St. Applying the extended Ito formula with dst ≡ d lnSt givesdst = (rt − θυ − vt/2) · dt +

√vt · dW1t + u · dNt, where u ≡ ln(1 + U).

To price a derivative on an underlying asset whose price follows sucha process, we follow a program like that for the basic s.v. model withoutjumps (section 8.3.2). That is, we first exploit the martingale property ofconditional expectations processes to determine Ψ(ζ; st, vt, τ), the condi-tional c.f. of ST as of t = T − τ . This time, discovering Ψ will requiresolving a partial differential integral equation (p.d.i.e.), but the method ismuch the same. Once Ψ has been determined, we can use one of the tech-niques described in section 8.4 to value a T -expiring European option viaFourier inversion.

To begin the program, apply the extended Ito formula to Ψ(ζ; st, vt, τ) =Ete

iζsT to obtain

dΨ = −Ψτ · dt + Ψs

(rt − θν − vt

2

)· dt + Ψs

√vt · dW1t + Ψss

vt

2· dt

+ Ψν(α − βvt) · dt + Ψνγ√

vt · dW2t + Ψννγ2 vt

2· dt + Ψsνργvt · dt

+ [Ψ(ζ; st, vt, τ) − Ψ(ζ; st−, vt, τ)] · dNt.

Taking expectations conditional on Ft− and exploiting the martingale prop-erty under P give the p.d.i.e.

0 = −Ψτ + Ψs(rt − θν − vt

2) + Ψss

vt

2+ Ψν(α − βvt) + Ψννγ2 vt

2+ Ψsνργvt + Et−[Ψ(ζ; st, vt, τ) − Ψ(ζ; st−, vt, τ)]θ, (9.15)

which must be solved subject to terminal condition Ψ(ζ; sT , vT , 0) = eiζsT .Again, as in the basic s.v. model, the solution has a simple exponential-

affine form in state variables st, vt; namely,

Ψ(ζ; st, vt, τ) = exp[iζst + g(τ ; ζ) + h(τ ; ζ)vt + q(ζ)θτ ] (9.16)

for certain complex-valued functions g, h, and q satisfying

g(0; ζ) = h(0; ζ) = g(τ ; 0) = h(τ ; 0) = q(0) = 0. (9.17)

The added term involving q in the exponent arises from the expecta-tion term in (9.15). Given the form (9.16), that expectation term equalsΨ(ζ; st, vt, τ) times

Eeiζ ln(1+U) − 1 = (1 + ν)iζe−(iζ+ζ2)ξ2/2 − 1.

Setting q(ζ) ≡ Eeiζ ln(1+U) − 1, inserting this and the derivatives of (9.16)into (9.15), and separating the terms proportional to vt from those that are

Page 438: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

Discontinuous Processes 421

not give the conditions

0 = −g′ + iζ(rt − θν) + αh

0 = −h′ − (iζ + ζ2)/2 + (iργζ − β)h + γ2h2/2.

When γ = 0 solutions of these o.d.e.s for g and h subject to ( 9.17) are

h(τ ; ζ) =B − C

γ2

eCτ − 11 − DeCτ

g(τ ; ζ) = −iζ[lnB(t, T ) + τθν] +α

γ2

[(−B + C)τ − 2 ln

(1 − DeCτ

1 − D

)],

where A ≡ −(iζ + ζ2)/2, B ≡ iζργ − β,

C ≡√

B2 − 2Aγ2, ζ = 0−β, ζ = 0,

and D ≡ (B − C)/(B + C). We thus have

Ψ(ζ; st, vt, τ) = expiζst + g(τ ; ζ) + h(τ ; ζ)vt + [(1 + ν)iζeAξ2 − 1]θτ.(9.18)

With this in hand we can use the integrated c.d.f. method described insection 8.4.2 to approximate the price of a T -expiring European put as

PE(St, vt, T − t) = B(t, T )X[12− 1

∫ c

−c

X−iζ Ψ(ζ; st, vt, τ)ζ(i + ζ)

· dζ

](9.19)

for some large c.Before looking at still richer models, it is instructive to see how to

retrieve the simpler ones that the s.v.-jump model encompasses.

1. If θ = 0 the jump component is eliminated, the last term in the exponentof (9.18) vanishes, and we are back to Heston’s s.v. model. The solutionsfor g(τ ; ζ) and h(τ ; ζ) are the same except that the term −iζτθν disap-pears in the expression for g.

2. If γ = 0, α ≥ 0, β > 0, and θ ≥ 0 we are left with either a purediffusion process (when θ = 0) or a jump-diffusion process (when θ > 0)in which volatility is time-varying but deterministic. Specifically, solvingdvt = (α−βvt) ·dt subject to initial value v0 gives vt = α/β +e−βt(v0−α/β), t ≥ 0. As t → ∞ this either increases or decreases toward α/β

according as v0 < α/β or v0 > α/β . The solutions for g and h now havedifferent forms:

h(τ ; ζ) = Aβ−1(1 − e−βτ )

g(τ ; ζ) = −iζ lnB(t, T ) − iζθντ − αβ−1Aτ − αAβ−2(1 − e−βτ ).

Page 439: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

422 Pricing Derivative Securities (2nd Ed.)

3. If γ = α = β = 0 but θ > 0, then volatility is constant (√

vt =√

v0), andwe are left with the simple jump-diffusion model. In this case h(τ ; ζ) =Aτ and g(τ ; ζ) = −iζ[lnB(t, T ) + τθν].

4. If α = β = γ = θ = 0, we are all the way back to geometric Brownianmotion with volatility σ =

√v0. Setting θ = 0 in the last expression for

g(τ ; ζ), the conditional c.f. of sT is simply

Ψ(ζ; st, vt, τ) = exp[iζst − iζ lnB(t, T ) − iζv0τ/2 − ζ2v0τ/2],

which corresponds to N(st +∫ T

tru · du − v0τ/2, v0τ).9

9.3.2 Further Variations

Duffie et al. (2000), Eraker et al. (2003), and Eraker (2004) propose extend-ing the s.v.-jump model to allow for discontinuous changes in volatility aswell as in price. The changes can be modeled as occurring either simultane-ously or independently. We describe here just the simultaneous version, aspresented by Eraker et al. (2003). In this model the log price and squaredvolatility of an underlying (no-dividend) primary asset evolve under risk-neutral measure P as

dst = (rt − θν − vt/2) · dt +√

vt · dW1t + u1 · dNt

dvt = (α − βvt) · dt + γ√

vt · dW2t + u2 · dNt.

As in the s.v.-jump model W1t and W2t are standard Brownian motionswith correlation ρ and Nt is an independent Poisson process with inten-sity θ; however, the Poisson events now affect both price and volatility.Volatility shocks U2jNt

j=1 during [0, t] are modeled as i.i.d. exponentialvariates, with c.d.f. F (u) = (1 − e−ϕu)1[0,∞)(u), while shocks U1jNt

j=1 tolog price are i.i.d. as N [ln(1+ν)−ξ2/2+κu2, ξ

2] conditional on U2j = u2.Setting κ < 0 allows price shocks to move counter to volatility shocks onaverage, as they appear to do in most markets. With st− and vt− as left-hand limits of log price and of volatility and st = st− + u1 · dNt andvt = vt− + u2 · dNt, the same arguments that led to (9.15) show that theconditional c.f. of sT satisfies the p.d.i.e.

0 = −Ψτ + Ψs

(rt − θν − vt−

2

)+ Ψss

vt−2

+ Ψν(α − βvt−) + Ψννγ2 vt−2

+ Ψsνργvt− + Et−[Ψ(ζ; st, vt, τ) − Ψ(ζ; st−, vt−, τ)]θ

9Applying (9.19) with this c.f. and comparing with the usual Black-Scholes compu-tations give a good check on the program and the choice of settings in the numericalintegration algorithm.

Page 440: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

Discontinuous Processes 423

and terminal condition Ψ(ζ; sT , vT , 0) = eiζsT . The solution again has theform (9.16) with g(τ ; ζ) and h(τ ; ζ) just the same but with

q(ζ) ≡ exp[iζ ln(1 + ν) + Aξ2][1 − (iζκ + h)ϕ]−1 − 1.

Sending ϕ → 0 makes the volatility jumps vanishingly small and returns usto the s.v.-jump model.

Figures 9.5 and 9.6 present series of implicit volatility curves and p.d.f.sof lnST that show the incremental effects of the various features of themodel. That is, we start with Black-Scholes (model BS) and progressivelyadd jumps to price (model JD), allow for stochastic volatility without jumpsin price (model SV), include both stochastic volatility and price jumps(model SVJ), and finally allow jumps in both price and stochastic volatil-ity (model SJJ). All the illustrations except those for BS use the parametervalues for the SJJ model from Eraker’s (2004) Table III. These were esti-mated jointly from returns of the S&P 500 and options on the S&P during

0.8 0.9 1.0 1.1 1.2

X/S0

0.14

0.16

0.18

0.20

0.22

0.24

0.26

0.28

0.30

0.32

Implic

it V

ola

tility

BS

JD

SV

SVJ

SJJ

SJJ

SVJSV

JD

Fig. 9.5. Implicit volatility curves for jump-diffusion (JD), stochastic volatility (SV),s.v.-jump (SVJ), and s.v.-jump with jumps in volatility (SJJ).

Page 441: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

424 Pricing Derivative Securities (2nd Ed.)

−0.3 −0.2 −0.1 0.0 0.1 0.2 0.3

s = ln (ST)

0

1

2

3

4

5

6

7

f(s)

BSJD

−0.3 −0.2 −0.1 0.0 0.1 0.2 0.3

s = ln (ST)

0

1

2

3

4

5

6

7

8

f(s)

BSSV

−0.3 −0.2 −0.1 0.0 0.1 0.2 0.3

s = ln (ST)

0

1

2

3

4

5

6

7

f(s)

BSSVJ

−0.3 −0.2 −0.1 0.0 0.1 0.2 0.3

s = ln (ST)

0

1

2

3

4

5

6

7f(

s)

BSSJJ

Fig. 9.6. P.d.f.s for jump-diffusion (JD), stochastic volatility (SV), s.v.-jump (SVJ),and s.v.-jump with jumps in volatility (SJJ).

1987–1990.10 Note that since the same SJJ parameters were used in all thesubmodels, the volatility curves for SVJ, SV, and JD are not the best-fittingversions.11 The plots simply show the incremental effects of enriching themodel step by step. Generally, the stepwise changes progressively lengthenthe left tail of the distribution of lnST and increase the implicit volatilities

10Eraker’s estimates were rescaled to express returns in proportions rather thanpercentages and to record time in years rather than in days, assuming a 365-day marketyear. The resulting values are shown below in the text notation and in Eraker’s.

Text α β ρ γ ν κ ξ ϕ θ

Eraker θκQ κQ ρ σV eσ2y/2−µQ

y -1 ρJ σy µV λ0

Estimate 0.198 4.02 −0.582 0.595 −0.072 −0.190 0.0365 0.0598 0.730

The computations for the figures use σ0 = 0.1588 for initial volatility, which is theaverage from returns data reported in Table II of Eraker et al. (2003). Other data forthe figures: short rate r = 0.03, dividend rate δ = 0, time to expiration T = 0.135 years(about the average in Eraker’s sample of options).

11In any case Eraker’s estimates from pooled time series of option and underlyingprices do not minimize a loss function based on option prices alone. Thus, they arenot directly comparable with the results in Bakshi et al. (1997) and other studies thatminimize a squared-error loss.

Page 442: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

Discontinuous Processes 425

for out-of-money puts. With these parameter values, the addition of jumpsto volatility also thickens slightly the right tail of the distribution andthereby increases implicit volatility for in-the-money puts. While addingjumps to volatility clearly does add to the flexibility of the model, it isnot clear from Eraker’s (2004) findings that much is gained in reducingout-of-sample pricing errors.

Fang (2002) extended the s.v.-jump model in a different direction, keep-ing the volatility process as a diffusion but allowing stochastic variation inthe intensity of the Poisson process that counts price shocks.12 With inten-sity θtt≥0 as another mean-reverting square-root process, the full modelbecomes

dst = (rt − θtν − vt/2) · dt +√

vt · dW1t + u · dNt

dvt = (α − βvt) · dt + γ√

vt · dW2t

dθt = (µ − λθt) · dt + φ√

θt · dW3t.

In this setup the random excursions in the intensity process can accountfor the clustering of jumps. By restricting W3t to be independent ofW1t, W2t, we get a conditional c.d.f. that is still exponentially affine instate variables s, v, θ:

Ψ(ζ; st−, vt, τ) = exp[iζst− + g(τ ; ζ) + h(τ ; ζ)vt + k(τ ; ζ)θt].

The standard methods yield as solutions for g, h, and k when γ = 0 andφ = 0

g(τ ; ζ) = −iζ lnB(t, T ) +α

γ2

[(−B + C)τ − 2 ln

(1 − DeCτ

1 − D

)]+

µ

φ2

[(λ + C′)τ − 2 ln

(1 − D′eCτ

1 − D′

)]h(τ ; ζ) =

B − C

γ2

eCτ − 11 − DeCτ

k(τ ; ζ) =−λ − C′

φ2

eC′τ − 11 − D′eC′τ .

Here, A, B, C, D and q(ζ) = Eeiζu − 1 are as in the s.v.-jump model andA′ ≡ −iζν + q(ζ), D′ ≡ (−λ − C′)/(−λ + C′), C′ ≡

√λ2 − 2A′φ2 when

ζ = 0 and C′ = −λ otherwise.

12Such doubly stochastic or “Cox” processes are commonly used in modeling survivaltimes and defaults on financial obligations; c.f. Cox and Oakes (1984), Lando (1998). Wedescribe these in more detail in section 10.5, where we look at applications to pricingdefaultable bonds.

Page 443: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

426 Pricing Derivative Securities (2nd Ed.)

In practice, the usefulness of these models and of potential further exten-sions is limited by the need to estimate the large number of parameters theycontain. Counting the unobservable initial volatility, the s.v.-jump modelhas eight parameters under the risk-neutral measure (v0, α, β, γ, ν, ξ, θ, ρ),the volatility-jump setup adds two more (κ and ϕ), and the stochastic-intensity extension adds three (µ, λ, φ, with θ0 replacing θ). However, Fang(2002) has found some success in pricing individual stock options usingthe seven-parameter submodel that restricts price volatility to be constant.This forces the excursions in jump intensity to account for all the fluctua-tions in price volatility that appear to be in the data, which it seems to doreasonably well.

9.4 Pure-Jump Models

This section treats four models in which the underlying price process hasno continuous component at all. The first two are the well-known variance-gamma model of Madan and coauthors (1990, 1991, 1998, 1999) and thehyperbolic model of Eberlein and Keller (1995) and Eberlein et al. (1998).Both models are obtained by subordinating Brownian motion to a directingprocess with continuous sample paths. This construction yields Levy pro-cesses with infinite Levy measure, for which the expected number of jumpdiscontinuities in any finite interval is infinite. To illustrate the versatilityof the subordination scheme, we present a third model in which Brownianmotion is directed by a pure-jump process. This again produces a Levyprocess, but one having only finitely many jumps per unit time. In all threemodels prices move in discrete ticks, the sizes of which are randomly dis-tributed according to the Levy measure that governs the process. The lastmodel of the section characterizes price as a continuous-time branching pro-cess. Here the number of jumps per unit time is again finite, but the sizesof the jumps are all integral multiples of the minimum price tick authorizedfor the market. Despite their differences, all four models account for skew-ness and thick tails in the marginal distributions of asset returns and aretherefore capable of reducing the smile effect in option prices.

9.4.1 The Variance-Gamma Model

To motivate the continuous-time variance-gamma (v.g.) model, considerthe following discrete-time model for an asset’s average continuouslycompounded return over a unit interval of time—i.e., ln(St+1/St). Taking

Page 444: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

Discontinuous Processes 427

Zt∞t=1 as i.i.d. standard normal and Ut∞t=1 as i.i.d. gamma variates,independent of the Zt, with shape parameter ν > 1 and unit scale, let

Rt,t+1 ≡ lnSt+1 − lnSt = µ + σ√

ν − 1Zt/√

Ut.

Algebra reduces the right side to µ + σ√

ν−1ν Tt, where Tt = Zt

√ν/Ut

has Student’s distribution with 2ν degrees of freedom (not necessarily aninteger). Since ETt = 0, V Tt = ν/(ν − 1), and (if ν > 2) ET 4

t /(V Tt)2 =3(1 + 1

ν−2 ), this setup delivers for Rt,t+1 a symmetric distribution withmean µ and variance σ2 but with excess kurtosis related inversely to ν.Since the change in log price is distributed as N [µ, σ2(ν−1)/U ] conditionalon U , the distribution can be viewed as a mixture of normals whose inversevariances are distributed as gamma. The idea of modelling centered returnsof stocks as scaled Student’s t was first proposed by Praetz (1972). Blattbergand Gonedes (1974) found that the model does successfully capture thethick tails observed in the empirical marginal distributions of daily returns,obtaining estimates of ν in the 3.0–6.0 range for most stocks in a sample oflarge US companies.

While the Student model fits the discrete-time empirical data reasonablywell, it has significant shortcomings in the application to derivatives pricing.One is that the Student family is not closed under convolution, meaningthat sums of independent Student variates are not in the family.13 A moreserious problem is that the model cannot account for asymmetries in therisk-neutral distribution of log price that are often implicit in option prices.Madan and Seneta (1990) proposed a related model that overcomes thefirst objection, and modifications by Madan and Milne (1991) and Madan,Carr , and Chang (1998) (henceforth, Madan et al.) subsequently overcamethe second. The basic idea is to model the log price in continuous time asa time change of Brownian motion; that is, as a subordinated process.

Variance-Gamma as a Subordinated Process

Here is how the v.g. model works. Let the directing process or subordi-nator Ttt≥0 that represents operational time be a gamma process with

13On the other hand, the Student distribution is known to be infinitely divisible

and, like the hyperbolic model described below, it therefore generates a Levy process incontinuous time. In this sense, lack of closure under convolution, while a nuisance, is nota crucial limitation.

Page 445: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

428 Pricing Derivative Securities (2nd Ed.)

parameters t/ν and ν and density

fTt(τ) =τ t/ν−1e−τ/ν

Γ(t/ν)νt/ν, t > 0, ν > 0.

Notice that the first two moments are ETt = t and V Tt = νt. TakingWtt≥0 to be an independent Brownian motion, the continuously com-pounded return over [0, t] is modeled as

R0,t ≡ ln(St/S0) = µt + γTt + σWTt . (9.20)

Noting that this is distributed conditionally as N(µt + γTt, σ2Tt), we

see that the model for return is again a mixture of normals but havingstochastic mean as well as variance and with a gamma (instead of inversegamma) variate driving the parametric change. The subordination schemeadapts the mixture model to continuous time in a mathematically elegantway. Moreover, associating operational time with information flow gives themodel a natural economic interpretation.

Canonical Representation

We know from general principles that the return process R0,t is Levyunder model (9.20), since it comes from subordinating one Levy process(Brownian motion) to another (the gamma process). Discovering otherproperties begins with a development of the c.f. Evaluating

ΨR0,t(ζ) = E exp[iζ(µt + γTt + σWTt)]

by conditioning first on Tt gives

ΨR0,t(ζ) = eiµζt[1 − ν(iζγ − ζ2σ2/2)]−t/ν. (9.21)

Since ΨR0,t(ζ)ΨRt,u (ζ) = ΨR0,u(ζ), the model is closed under convolution,and we confirm that R0,tt≥0 is indeed a Levy process since (9.21) has theform etΘ(ζ), with

Θ(ζ) = iζµ − ν−1 ln[1 − ν(iζγ − ζ2σ2/2)].

Notice that this has a limiting value of i(µ+γ)−σ2/2 as ν → 0, indicatingthat the process approaches a Brownian motion as the volatility of opera-tional time diminishes to zero. However, as we are about to see, the processis of finite variation and, indeed, is pure-jump for each ν > 0.

Page 446: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

Discontinuous Processes 429

Let us consider two independent gamma processes, R+0,tt≥0 and

R−0,tt≥0, with c.f.s ΨR±

0,t(ζ) = (1 − iζβ±)−t/ν and parameters

β± =[√

γ2ν2 + 2σ2ν ± γν]/

2.

Algebra shows that

eiζµtE exp[iζ(R+0,t − R−

0,t)] = eiζµtΨR+0,t

(ζ)ΨR−0,t

(−ζ) = ΨR0,t(ζ).

Therefore, apart from the deterministic drift, R0,t is the difference betweenthese two nondecreasing processes. As such, the return process necessarilyhas finite variation on finite intervals. Moreover, from the canonical repre-sentations of the gamma processes as (see example 56 and (3.54))

ln ΨR+0,t

(ζ) = tν−1

∫(0,∞)

(eiζx − 1)x−1e−x/β+ · dx (9.22)

ln ΨR−0,t

(−ζ) = tν−1

∫(−∞,0)

(eiζx − 1)|x|−1e−|x|/β− · dx (9.23)

it follows that ΨR0,t(ζ) has canonical form14

ln ΨR0,t(ζ) = t

[iζµ +

∫\0

(eiζx − 1)Λ(dx)

], (9.24)

where

Λ(dx) =1

ν|x| exp

1σ2

[xγ − |x|

√γ2 + 2σ2/ν

]· dx. (9.25)

14To see this, write Λ(dx) in (9.25) as λ(x) · dx, where for x = 0

λ(x) ≡ 1

ν|x| exp

»−x

2

„1

β+− 1

β−

«− |x|

2

„1

β++

1

β−

«–

=1

ν|x| exp

»−1

2

„x + |x|

β++

x − |x|β−

«–

=

8>><>>:

1

ν|x| exp(−|x|/β+), x > 0

1

ν|x| exp(−|x|/β−), x < 0.

Therefore, from (9.22) and (9.23) we have

lnˆΨ

R+0,t

(ζ)ΨR−

0,t(−ζ)

˜= t

Z\0

(eiζx − 1)λ(x) · dx,

and hence (9.24).

Page 447: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

430 Pricing Derivative Securities (2nd Ed.)

The absence of a Brownian component in (9.24) indicates that the process ispure jump. Moreover, observing from (9.25) that Λ(\0) = +∞, one seesthat the expected number of jumps per unit time is infinite. Since the Levydensity declines monotonically from the origin in both directions, jumps oflarger magnitude tend to occur less frequently than those of smaller size.Notice that positive jumps are more or less likely than negative jumps ofthe same size according as γ > 0 or γ < 0.

Density Function and Moments

A development of the p.d.f. of R0,t starts from the mixture interpretation, as

fR0,t(R) =∫ ∞

0

1√2πσ2τ

exp[− (R − µt − γτ)2

2σ2τ

]τ t/ν−1e−τ/ν

Γ(t/ν)ντ/ν· dτ.

Madan et al. (1998) develop from this the following explicit formula:

fR0,t(R) =2eγ(R−µt)/σ2

νt/ν√

2πσ2Γ(t/ν)

( |R − µt|ρ

) tν − 1

2

K tν − 1

2

(ρ|R − µt|

σ2

),

where

ρ ≡ +√

2σ2/ν + γ2

and Kt/ν−1/2(·) is a modified Bessel function of the second kind of ordert/ν − 1/2.15

The m.g.f.,

MR0,t(ζ) = eζµt[1 − ν(ζγ + ζ2σ2/2)]−t/ν ,

exists for ζσ2 ∈ [−γ−√

γ2 + 2σ2/ν,−γ+√

γ2 + 2σ2/ν], indicating that allmoments of R0,t are finite. We find the mean and variance to be ER0,t =(µ + γ)t and V R0,t = (γ2ν + σ2)t, respectively, while the coefficients ofskewness and excess kurtosis are

α3 =γν(2γ2ν + 3σ2)(γ2ν + σ2)3/2

√t

∝ t−1/2

α4 − 3 =3ν

t

[2 − σ4

(γ2ν + σ2)2

]∝ t−1.

Both α3 and α4 − 3 increase with ν and decrease with t. The distributionis symmetric when γ = 0. Setting St = S0e

R0,t , we can evaluate ESt as

15Modified Bessel functions are described in section 2.1.8.

Page 448: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

Discontinuous Processes 431

S0ΨR0,t(−i). Under risk neutrality and a constant interest rate r this wouldequal S0e

rt, requiring the total mean drift under the martingale measureto be

µ + γ = r + γ + ν−1 ln[1 − ν(γ + σ2/2)], (9.26)

approaching r − σ2/2 as ν → 0.

Option Pricing under Variance-Gamma Dynamics

Regarding ν, γ, and σ as parameters of the risk-neutral distribution, andwith µ satisfying (9.26), the value at t = 0 of a T -expiring European callstruck at X would be

CE(S0, T ) = e−rT E(ST − X)+.

Exploiting the lognormality of ST conditional on TT , Madan and Milne(1991) express the value of the call as

CE(S0, T ) =∫ ∞

0

c(τ)τ t/ν−1e−τ/ν

Γ(t/ν)ντ/ν· dτ,

where

c(τ) ≡ S0(1 − νβ2/2)T/νeβ2τ/2Φ(d/

√τ + β

√t)

−Xe−rT (1 − να2/2)T/νeα2τ/2Φ(d/

√τ + α

√t)

with

d ≡ (β − α)−1

[ln(S0/X) + rT +

T

νln

(1 − νβ2/21 − να2/2

)]α ≡ − γ

σ√

1 + γ2ν2σ2

β ≡ σ2 − γ

σ√

1 + γ2ν2σ2

.

Madan et al. (1998) develop an explicit expression in terms of modifiedBessel functions. However, Carr and Madan (1998) find that applying thefast Fourier transform to a modification of the call function is faster com-putationally than the explicit formula. Alternatively, one can use the inte-grated c.d.f. approach by evaluating formula (8.37) on page 399.

Madan et al. (1998) apply the v.g. model to (American) futures optionson the S&P 500 index, estimating the implicit risk-neutral parameters (γ, ν,

and σ) weekly from the option prices by nonlinear least squares. Finding

Page 449: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

432 Pricing Derivative Securities (2nd Ed.)

that in-sample pricing errors reveal no statistically significant relation tomoneyness, they conclude that the model accounts well for smile effectsin the option prices. However, the substantial variation in their parameterestimates over the 143 weeks of the sample suggests that out-of-samplepredictions would be much less satisfactory.

9.4.2 The Hyperbolic Model

The idea of modeling the log-price process as a subordinated Brownianmotion opens up many possibilities for capturing the thick tails and skew-ness in the risk-neutral distribution of returns. Eberlein and Keller (1995)and Eberlein et al. (1998) have studied another such model that uses assubordinator a process generated by the generalized inverse Gaussian dis-tribution. As does the gamma process that clocks operational time in thev.g. model, the inverse Gaussian subordinator yields a pure-jump Levy pro-cess having no Gaussian component and infinite Levy measure. The implieddistribution of the continuously compounded return over a unit interval isa mixture of normals known as the “hyperbolic law”, which has been stud-ied extensively by O. E. Barndorff-Nielsen and coauthors and applied inmodeling the distribution of sizes of sand particles.

The Hyperbolic Distribution

To develop the model, begin with a random variable Y whose conditionaldistribution is normal with mean and variance proportional to a generalizedinverse Gaussian variate V :

fY |V (y|v) =1√2πv

e−(y−βv)2/(2v), y ∈ (9.27)

fV (v) =

√α2 − β2

2K1

(√α2 − β2

)e−[1/v+(α2−β2)v]/2, v > 0, (9.28)

where α > 0, 0 ≤ |β| < α, and K1 is a modified Bessel function of the secondkind of order one.16 Working out the integral

∫∞0

fY |V (y|v)fV (v) · dv gives

16To see that (9.28) is a density, set z =p

α2 − β2 in the following integral rep-resentation for the Bessel function, which is formula 8.432.6 in Gradshteyn and Ryzhik(1980), then change variables as t = (2v)−1:

K1(z) =

Z ∞

0

z exp[−t − z2/(4t)]

4t2· dt.

Page 450: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

Discontinuous Processes 433

as the marginal p.d.f. of the resulting mean-variance mixture of normalsthe expression

fY (y; α, β) =

√α2 − β2

2αK1

(√α2 − β2

)e−α√

1+y2+βy, y ∈ . (9.29)

Parameter α governs the thickness of the tails, and from the symmetry offY (y; α, 0) we see that β controls skewness, the right tail being elongatedwhen β > 0. To see why the model is called “hyperbolic”, set z equal tothe exponent in (9.29), square z − βy, and collect terms to produce

Az2 + Bzy + Cy2 − α2 = 0,

where A = 1, B = −2β, and C = β2 − α2. Since B2 − 4AC = 4α2 > 0, thelocus (y, z) is in fact an hyperbola.

The form of (9.29) makes it easy to develop the m.g.f. as

MY (ζ) =∫

eζyfY (y; α, β) · dy

=

√α2 − β2√

α2 − (β + ζ)2

K1

(√α2 − (β + ζ)2

)K1

(√α2 − β2

) ∫

fY (y; α, β + ζ) · dy

=

√α2 − β2√

α2 − (β + ζ)2

K1

(√α2 − (β + ζ)2

)K1

(√α2 − β2

) , (9.30)

for |β + ζ| < α. While the existence of the m.g.f. implies the finiteness ofthe moments, these unfortunately have to be expressed in terms of Besselfunctions. For example, evaluating M′

Y (·) at ζ = 0 gives17

EY =β√

α2 − β2

K2

(√α2 − β2

)K1

(√α2 − β2

) .

Modeling Prices and Returns

Taking Y1 and Y2 to be i.i.d. as Y with m.g.f. (9.30), one sees thatMY1+Y2(ζ) = MY (ζ)2 is not of the same functional form as MY (ζ). This

17To differentiate MY (ζ) use the formula (2.27), which for functions of order onegives K ′

1(z) = z−1K1(z) − K2(z).

Page 451: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

434 Pricing Derivative Securities (2nd Ed.)

means that the hyperbolic family, like the Student family, is not closedunder convolution. However, since the distribution is infinitely divisible,18

the family does generate a class of Levy processes Ytt≥0 from the relation

ΨYt(ζ) = ΨY (ζ)t, (9.31)

where ΨY (ζ) = MY (iζ) is the c.f. of Y . Eberlein and Keller (1995) exploitthis fact to produce a plausible model of security prices in continuous time.Extending to allow arbitrary location and scale, the continuously com-pounded return over a unit time interval is defined as R0,1 = µ + σY ,so that19

ΨR0,1(ζ) = eiζµΨY (ζσ)

= eiζµ

√α2 − β2√

α2 − (β + iζσ)2

K1

(√α2 − (β + iζσ)2

)K1

(√α2 − β2

) .

Applying (9.31), the continuously compounded return over [0, t] is then arandom variable having c.f.

ΨR0,t(ζ) = eiζµtΨYt(ζσ) = eiζµtΨY (ζσ)t. (9.32)

With this model for the return the price at t is given by

St = S0eR0,t. (9.33)

Under this setup the log of the price relative is, apart from addedmean trend at rate µ, a time change of Brownian motion with subordi-nator Tt = σYt. Developing the c.f. in canonical form, Eberlein and Keller(1995) show that

t−1 ln ΨR0,t(ζ) = iζµ +∫\0

(eiζx − 1 − iζx)Λ(dx),

where the Levy measure is such that Λ(\0) = +∞. Accordingly, apartfrom the deterministic trend the return process is pure-jump and, as in thev.g. model, there are infinitely many jumps in each open interval of time.With St as in (9.33) an application of (3.46) gives

dSt = St−µ · dt + St−σ · dYt + (St − St− − St−σ∆Yt)

18As proved by Barndorff-Nielsen and Halgreen (1977).19Here the shape parameters α and β correspond to those of Eberlein and Keller

(1995) and Eberlein et al. (1998) multiplied by the scale (our “σ” and their “δ”).

Page 452: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

Discontinuous Processes 435

or

dSt/St− = µ · dt + σ · dYt + eσ∆Yt − 1 − σ∆Yt,

where ∆Yt = Yt − Yt− and Yt is the random variable defined implicitly by(9.31). One determines the distribution of St given S0 from the p.d.f. ofR0,t. For t = 1 this has to be found by inversion of (9.32).

Pricing Options

As was the case for the v.g. model, payoffs of options—and nonlinear deriva-tives generally—cannot be replicated with portfolios of riskless bonds andthe underlying alone when process Stt≥0 is hyperbolic. Accordingly, therisk-neutral measure P under which assets’ prices (normalized by Mt)are martingales in an arbitrage-free market is not unique. Taking r to beaverage short rate during [0, t], the fair-game property under P impliesthat S0/M0 = E(St/Mt) or S0 = e−rtS0EeR0,t. This in turn implies thatr = ln ΨR0,1(−i) and, therefore, that the parameters satisfy

µ = r − ln

√α2 − β2√

α2 − (β + σ)2

K1

(√α2 − (β + σ)2

)K1

(√α2 − β2

). (9.34)

If the underlying pays continuous dividends at rate δ, r is simply replacedby r − δ in the above. For simplicity, we ignore dividends from this point.

Given values of the parameters Eberlein et al. (1998) suggest the follow-ing direct way to price a T -expiring European-style derivative: (i) imposing(9.34), find fR0,T (·), the risk-neutral p.d.f. of R0,T , by Fourier inversion ofthe c.f. in (9.32); (ii) find fST (·), the implied density of ST = S0e

R0,T fromfR0,T (·) via a change of variables; and (iii) price the derivative by numericalintegration. For example, the price of a European put struck at X would be

PE(S0, T ) = e−rT

[X

∫ X

0

fST (s) · ds −∫ X

0

sfST (s) · ds

].

While this is certainly feasible, we now know from section 8.4 that thereare much faster ways to price European options from c.f.s; namely, Carr-Madan’s (1998) Fourier inversion of the damped call or the integrated c.d.f.method using formula (8.37).

Restricting µ as in (9.34) and using nonlinear least squares to infer α,β, and σ from prices of options on several German equities, Eberlein et al.(1998) find that the model produces much smaller mean absolute error than

Page 453: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

436 Pricing Derivative Securities (2nd Ed.)

Black-Scholes. However, neither an in-sample comparison with competingpure-jump, mixed-jump, and s.v. models nor an assessment of predictionsout of sample is yet available.

9.4.3 A Levy Process with Finite Levy Measure

To show the versatility of the subordination scheme, we outline here anotherpure-jump model that differs from the v.g. and hyperbolic in having onlyfinitely many jumps per unit time. The change is accomplished by modelingthe directing process as compound-Poisson rather than as a process withcontinuous sample paths.

Specifically, let us model operational time as Tt =∑Nt

j=0 Uj, where(i) Ntt≥0 is a Poisson process with intensity θ, (ii) U0 ≡ 0, and (iii)Uj∞j=1 are nonnegative, i.i.d. random variables, independent of Nt andwith m.g.f. MU (ζ). Ttt≥0 is thus a nondecreasing Levy process that isitself pure-jump, and Tt has m.g.f. MTt(ζ) = exptθ[MU (ζ) − 1]. Nowintroduce a trended Brownian motion µt + σWtt≥0 that is independentof everything else and create the subordinated process

Xt = µTt + σWTtt≥0. (9.35)

By first conditioning on Tt it is easy to see that Xt has m.g.f.

MXt(ζ) = exptθ[MU (ζµ + ζ2σ2/2)− 1]

and c.f. ΨXt(ζ) = MXt(iζ) respectively. The expected value is EXTt =tθµEU1. Of course, Xt is again a Levy process. In the canonical formΨXt(ζ) = etΘ(ζ) we have

Θ(ζ) = θ[MU (iζµ − ζ2σ2/2) − 1]

= iζθµEU1 +∫\0

(eiζx − 1 − iζx)Λ(dx),

with Levy measure

Λ(dx) = θ · dx

∫ ∞

0

1√2πσ2u

exp[− (x − µu)2

2σ2u

]· dFU (u)

for x ∈ . Thus, θ−1Λ(dx) is the density of a random variable distributed asN(Uµ, Uσ2) conditional on U, where U has c.d.f. FU . Since

∫\0 Λ(dx) =

θ < ∞, the process Xt has only finitely many jumps during any finitespan of time; and since Θ(ζ) contains no term in ζ2 the process has noBrownian component.

Page 454: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

Discontinuous Processes 437

With Xtt≥0 as the corner stone, we can build a model for a priceprocess Stt≥0 as

St = S0 exp(∫ t

0

νs · ds + Xt

),

where νt is a specified drift process. To model Stt≥0 under the risk-neutral measure one would take

νt = rt − δt − θ[MU (µ + σ2/2) − 1],

where rt, δt are the short rate and dividend rate, and regard Wt and Ntas Wiener and Poisson processes under P. With Mt = M0e

R t0 rs·ds as nume-

raire, the normalized cum-dividend price Sc∗t = M−1

t St exp(∫ t

0 δs · ds) thenhas expected value under P

ESc∗t = Sc∗

0 exp−tθ[MU (µ + σ2/2)− 1]EeXt

= Sc∗0 MXt(ζ)−1MXt(ζ)

= Sc∗0 ,

as required for the martingale property. lnSc∗t thus follows a steady trend

during any interval (t, t+∆t] in which Nt+∆t−Nt = 0, while at t such thatNt − Nt− = 1 we have ln Sc∗

t − lnSc∗t− = µU + σ

√UZ, where Z ∼ N(0, 1).

In words, when a Poisson event occurs ln Sc∗t jumps by the net change in

the trended Brownian motion during the random time U .As a specific example of such a process, take Uj∞j=1 as i.i.d. expo-

nential, with c.d.f. FU (u) = [1 − exp(−u/β)]1(0,∞)(u) and m.g.f. MU (ζ) =(1 − βζ)−1 for ζ < β−1. Forcing θ = β−1 gives ETt = t and

MXt(ζ) = exptβ−1[1 − β(ζµ + ζ2σ2/2)]−1 − tβ−1.

By differentiating cumulant generating function LXt(ζ) = lnMXt(ζ) onecan see that ln(St/S0) has coefficients of skewness and excess kurtosisgiven by

α3 =6µβ(µ2β + σ2)√t(2µ2β + σ2)3/2

α4 − 3 =6β(4µ4β2 + 6µ2βσ2 + σ4)

t(2µ2β + σ2)2.

The sign of α3 is the same as that of µ, while excess kurtosis is alwayspositive. As for any Levy process with finite moments, both α3 and α4 − 3approach zero as t → ∞.

Page 455: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

438 Pricing Derivative Securities (2nd Ed.)

0 100 200 300 400 500 600 700 800 900

t (seconds)

0.9980

0.9985

0.9990

0.9995

1.0000

1.0005

1.0010

St/S

0

Fig. 9.7. Detrended sample paths of S&P 500 returns modeled from Brownian motiondirected by a compound-Poisson process.

Figure 9.7 depicts five sample paths of Sct /S0 = St exp(

∫ t

0δs · ds)/S0

with parameters β = 0.001, µ = −0.109, σ = 0.0046 . These values approxi-mately match the cumulants with empirical estimates from daily, dividend-adjusted closing values of the S&P 500 index during 2004–2005. The paths,which correspond to just 900 seconds (15 minutes) of market time, thusgive a close-up view of the behavior of the index in the natural measure.(The data also imply an average drift rate ν

.= 0.109, but the figures areplotted with ν = 0 to emphasize the stochastic component.)

9.4.4 Modeling Prices as Branching Processes

The three pure-jump models we have seen thus far have much in common.Each represents lnStt≥0 as a Levy process, and—while the stochasticcomponent of price is pure-jump—the state space for St is a continuum,+. We now describe a pure-jump model with very different properties.Returns over unit time intervals are no longer i.i.d., and price—although itevolves in continuous time—is an integer-valued process when denominatedin units of the minimum tick size.

Page 456: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

Discontinuous Processes 439

Taking S0 to be a given positive integer, we begin with the construc-tion of an integer-valued process in discrete time, Sn∞n=0, as follows. Foreach n let K0n = 0; take Kjnj∈ℵ to be i.i.d., nonnegative, integer-valuedrandom variables; and define Sn =

∑Sn−1j=0 Kjn. Thinking of Sn−1 as the

size of a population at generation n − 1 and supposing that each mem-ber j ∈ 1, 2, . . . , Sn−1 of the generation dies off and leaves behind Kjn

progeny, Sn is then the size of the population at generation n. Known asthe Bienayme-Galton-Watson (BGW) process, this discrete-time branch-ing process has a long history in applied probability theory, with appli-cations in demography, biology, and particle physics.20 We extend theBGW process to continuous time as follows. Set N0 = 0 and introducea nondecreasing, integer-valued process Ntt≥0, independent of the Kjnand having stationary, independent increments. Taking S(0) ≡ S0 andS(t)t≥0 ≡ SNtt≥0 delivers an integer-valued process that evolves incontinuous time. Dion and Epps (1999) have observed that the process,if observed at discrete intervals of time, belongs to the class of branch-ing processes in random environments (b.p.r.e.s), introduced by Smith andWilkinson (1969). We shall refer to the continuous-time process also as a“b.p.r.e.” Epps (1996) has applied such a model to prices of common stocksmeasured in units of the minimum price tick. In this application Ntt≥0

can be thought of as counting the number of information events up to t.The price process is clearly pure jump, since S(t) does not change betweeninformation shocks; moreover, S(t) − S(t−) is always an integral multi-ple of the minimum tick size whenever branching does occur. The modelalso implies a positive (and easily quantified) probability of “extinction”—the event S(t) = 0—which occurs if Kjn = 0 for n = Nt− + 1 and eachj ∈ 1, 2, . . . , Nt−. Clearly, S(t) = 0 implies that S(u) = 0 for all u > t, sothat the origin is an absorbing barrier for the process. Using this processto model the price of an underlying asset, we now look at the features thatare most relevant for pricing derivatives.

Generating Function and Moments

The model’s main properties are most easily inferred from the probabilitygenerating function (p.g.f.). Letting pk∞k=0 = P(K = k)∞k=0 representthe probability mass function of K and ΠK(ζ) =

∑∞k=0 pkζk the p.g.f., we

have for the generating function of Sn (the price after n shocks) conditional

20The standard BGW process is described at length in the books by Harris (1963)and Athreya and Ney (1972).

Page 457: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

440 Pricing Derivative Securities (2nd Ed.)

on Sn−1

ΠSn(ζ | Sn−1) = E(ζSn | Sn−1) = E(ζPSn−1

j=0 Kjn | Sn−1) = ΠK(ζ)Sn−1 .

Accordingly, the generating function of Sn conditional on Sn−2 is

ΠSn(ζ | Sn−2) = ΠK [ΠK(ζ)]Sn−2 ,

and so on, with

ΠSn(ζ | S0) = Π[n]K (ζ)S0 ,

where “[n]” indicates that Π[n]K (·) is the nth “iterate” of the p.g.f. Taking

S(0) = S0 as given, the generating function of S(t) ≡ SNt is then

ΠS(t)(ζ) = EΠ[Nt]K (ζ)S0 =

∞∑n=0

Π[n]K (ζ)S0P(Nt = n). (9.36)

A model for the distribution of K that is both convenient and satisfac-tory in the application to equity prices is the two-parameter geometric,

pk(α, β) =

1 − α

1 − β, k = 0

αβk−1, k = 1, 2, . . . ,(9.37)

where 0 < β < 1 and 0 < α ≤ 1 − β. This has the nice property that theiterates of the generating function can be expressed explicitly, as21

Π[n]K (ζ) =

γ(µn − 1)µn − γ

+µn

(1−γ

µn−γ

)2

ζ

1 −(

µn−1µn−γ

)ζ, µ = 1

nβ − (nβ + β − 1)ζ1 − β(1 + n − nζ)

, µ = 1, 0 < β < 1,

where µ ≡ EK =∑∞

k=1 kpk = α/(1−β)2 and γ = β−1[1−(1−β)µ]. Besidesbeing convenient, this gives considerable flexibility in fitting the data, sinceit allows µ to be very close to unity even though σ2 ≡ V K = µ(1+β

1−β − µ)is small. Both features are required to obtain good fits to equity prices.Epps (1996) considers four other such flexible two-parameter models, allbeing mixtures of a degenerate variate equal to unity for sure with a one-parameter lattice distribution—Bernoulli, Poisson, logseries, and geomet-ric. For these the iterates must be calculated recursively as the p.g.f. iscomputed.

21See Harris (1963, p. 9).

Page 458: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

Discontinuous Processes 441

Moments of S(t) can be found from the descending factorial moments,µ(k) = ES(t)[S(t)−1]·· · · ·[S(t)−k+1]. These are obtained by evaluatingthe derivatives of ΠS(t)(· | S0) at ζ = 1. Unless the iterates can be expressedexplicitly, differentiating Π[n]

K (ζ) requires the recursive formula Π[n]′K (ζ) =

Π[n−1]′K [ΠK(ζ)]Π′

K(ζ), where primes denote differentiation. Since Π[n]′K (1) =

Π′K(1)n = µn, the mean is

ES(t) = S0E(µNt) = S0ΠNt(µ),

where ΠNt is the generating function of Nt. The variance is

V S(t) = S0σ2ENt

when µ = 1 and

V S(t) = [ΠNt(µ2) − ΠNt(µ)2]S2

0 +σ2

µ(µ − 1)[ΠNt(µ

2) − ΠNt(µ)]S0

when µ = 1, where σ2 ≡ V K. The conditional mean of the total returnover [0, t] is then

E[S(t)/S0] = ΠNt(µ),

while the conditional variance has the form

V [S(t)/S0] = κt + ηtS−10 ,

where κt ≥ 0 and ηt > 0 are functions of µ, σ, and the parameters governingNtt≥0. As does the constant-elasticity-of-variance model, the b.p.r.e. thusimplies that the conditional variance of total return decreases as the initialprice level increases. Unlike what we see in models based on Levy processes,returns over nonoverlapping periods are therefore not i.i.d. The skewness ofthe total return can be positive or negative, increasing in absolute value asholding period t increases. Excess kurtosis is always positive but declinesas the holding period lengthens.

Among candidate models for Ntt≥0 are the Poisson process, with

ΠNt(ζ) = eθt(ζ−1), θ > 0,

and the negative-binomial process, with

ΠNt(ζ) =[

θ

1 − ζ(1 − θ)

]ϕt

, 0 < θ < 1, ϕ > 0.

In either case the number of price moves in any finite time interval is a.s.finite.

Page 459: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

442 Pricing Derivative Securities (2nd Ed.)

An Explicit Expression for Probabilities

While the extinction probability, PS(t) = 0, is readily expressed as

ΠS(t)(0) =∞∑

n=0

Π[n]K (0)S0

for any given distribution of the number of progeny, K, it is not generallyeasy to evaluate the probability mass function of S(t) at other points thanzero. However, Williams (2001) has shown that an explicit expression canindeed be found when the progeny distribution is given by (9.37). His resultsare as follows. Letting

An ≡ µn − 1µn − γ

, Bn ≡ µn

(1 − γ

µn − γ

)2

, and pn ≡ P(Nt = n),

the c.f. of S(t) is

ΨS(t)(ζ) =∞∑

n=0

pn

(γAn +

Bneiζ

1 − Aneiζ

)S0

. (9.38)

When s is a positive integer, the inversion formula for c.f.s gives

P[S(t) = s] =12π

∞∑n=0

pn

∫ π

−π

e−iζs

(γAn +

Bneiζ

1 − Aneiζ

)S0

· dζ.

S0 being itself an integer, a binomial expansion yields

P[S(t) = s] =12π

∞∑n=0

pn

S0∑j=0

(S0

j

)(γAn)S0−j

×Bjn

∫ π

−π

eiζ(j−s)(1 − Aneiζ)−j · dζ.

The integral in this expression vanishes when j = 0, so it is necessary toconsider only the terms with j > 0. Noting that 0 <

∣∣Aneiζ∣∣ = An < 1

when µ > 1, we can apply to these terms the binomial series expansion,(2.5), to obtain for P[S(t) = s] the expression

12π

∞∑n=0

pn

S0∑j=1

∞∑k=0

(S0

j

)(j − k − 1

j − 1

)(γAn)S0−jBj

nAkn

∫ π

−π

eiζ(k+j−s) · dζ.

The integral being 2π when k + j − s = 0, and zero otherwise, we windup with

P[S(t) = s] =∞∑

n=0

pn

S0∧s∑j=1

(S0

j

)(s − 1j − 1

)γS0−jBj

nAS0+s−2jn . (9.39)

Page 460: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

Discontinuous Processes 443

A Limiting Form

The discreteness of the price increments is a salient feature of the model,but this clearly becomes less relevant the larger the initial price, S(0) = S0.Remarkably, when µ = 1 it turns out that a normalization of the priceat t converges in distribution as S0 → ∞ to a time change of Brownianmotion. To see this, let (V Sn)−1/2(Sn−ESn) be the normalized price aftern shocks. When µ = 1 this is proportional to Zn ≡ S

−1/20 (Sn − S0) . Note

that Sn can be represented as Sn =∑S0

j=1 Qjn, where Qjn is the number ofunits at stage n that descended from unit j ∈ 1, 2, . . . , S0 at stage zero.Since for each n the Qjn are i.i.d. with EQjn = 1 and V Qjn = nσ2 whenµ = 1, the central limit theorem implies that Zn converges weakly (for eachfixed n) to N(0, nσ2) as S0 → ∞. In turn, this implies that the c.f. of Zn

(conditional on S0) is

ΨZn(ζ) = e−ζ2σ2n/2 + o(1),

where o(1) → 0 as S0 → ∞. The corresponding normalization of S(t) is[S(t) − ES(t)]/

√V S(t), which is proportional to

Z(t) ≡ S−1/20 (SNt − S0) = ZNt

when µ = 1. The limiting c.f. is therefore

limS0→∞

ΨZ(t) = limS0→∞

[Ee−ζ2σ2Nt/2 + o(1)]

= ΠNt(e−ζ2σ2/2).

Defining a directing process Ttt≥0 as Tt = σNt and a subordinatedprocess W (Tt)t≥0, where W (t)t≥0 is a Brownian motion, one seesimmediately that

ΨW (Tt)(ζ) = ΠNt(e−ζ2σ2/2).

This shows that Z(t) is indeed distributed asymptotically as a time changeof Brownian motion. When Ntt≥0 is Poisson with intensity θ we have

limS0→∞

ΨZ(t) = exp[θt(e−ζ2σ2/2 − 1)],

which is the c.f. of a compound-Poisson process—a Poisson sum of normals.As explained by Epps (1996) and Dion and Epps (1999), the ability toapproximate the distribution of S(t) through the limiting form of Z(t) isimportant in estimating the parameters of the process from price data.

Page 461: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

444 Pricing Derivative Securities (2nd Ed.)

Pricing Derivatives

As for the v.g. and hyperbolic models, the discreteness of S(t) under b.p.r.e.dynamics makes it impossible to replicate nonlinear payoff structures withthe underlying asset and riskless bonds alone. Accordingly, there is nota unique equivalent martingale measure within which derivatives can bepriced as discounted expected values. Stated differently, the condition thatS(t)/Mt be a martingale under P restricts the parameters of the distri-bution of K and Nt to satisfy

e−rtES(t)/S(0) = e−rtΠNt(µ) = 1; (9.40)

yet there are infinitely many models in the class that meet this restriction.In the case that Ntt≥0 is Poisson with intensity θ condition (9.40) requirese−rteθt(µ−1) = 1 or µ = 1 + r/θ, and when Ntt≥0 is negative-binomialwith parameters θ and γt it requires µ = (1− θ)−1(1− θe−r/γ). As for theother incomplete models we have considered, parameter values subject tothe martingale restrictions can be inferred from prices of traded options ofdifferent strikes and maturities.

Williams’ (2001) formula, (9.39), offers a very direct way to price theEuropean put when (9.37) models the progeny distribution; namely, as

PE(S0, T ) = B(0, T )X∑

s=0

(X − s)P[S(t) = s]. (9.41)

More generally, one can apply the integrated c.d.f. method of section 8.4.The first step is to develop the c.f. of S(T ) from the p.g.f., as ΨS(T )(ζ) =ΠS(T )(eiζ). We can then adapt formula (8.37) by expressing PE(S0, T ) as

PE(S0, T ) =∫ X

0

FT (s) · ds

=∫ X

0

[12− lim

c→∞

∫ c

−c

e−iζx

2πiζΨS(T )(ζ) · dζ

]· ds,

where the bracketed expression is from the usual inversion formula forc.d.f.s. Recall from theorem 3 of chapter 2 that the inversion formula appliesonly when argument s is a continuity point of FT . Although FT is discon-tinuous in this model, it has only countably many (indeed, finitely many)jumps on [0, X ], so we can still use it to evaluate the Riemann integral. Theresult is

PE(S0, T ) =X

2+

12π

limc→∞

∫ c

−c

1 − e−iζX

ζ2ΨS(T )(ζ) · dζ, (9.42)

Page 462: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

Discontinuous Processes 445

where the integral would be evaluated numerically for some large c, avoidingthe singularity at ζ = 0. Neither of (9.41) nor (9.42) seems to dominate froma computational standpoint when (9.41) applies. Although (9.41) avoids thenumerical integration required to evaluate FT , the multiple summationsthat it involves can be tedious, particularly when X (in tick units) is large.On the other hand, vector processing speeds this up considerably.

Liu (2003) has applied the b.p.r.e. model to options on a sample of indi-vidual U.S. equities. Taking K to be distributed as in (9.37) and Ntt≥0

as Poisson, and inferring the parameters from transaction prices of tradedoptions, he finds that the model typically eliminates the smile effect insample. Results are mixed when out-of-sample predictions are comparedwith the ad hoc version of Black-Scholes with moneyness-specific implicitvolatilities. He does find evidence that the predictions are better for optionson low-priced stocks, where the discreteness arising from the minimum ticksize would seem to be more relevant.

9.4.5 Assessing the Pure-Jump Models

Since a thorough empirical comparison of the Levy and b.p.r.e. models isnot available, it seems appropriate to offer a brief qualitative appraisal. Allthese models account in some degree for the thick-tailedness of the empiri-cal distributions of assets’ returns, and for this reason all are able to reducethe severe mispricing of short-term, out-of-money options that is now asso-ciated with the Black-Scholes theory. However, no model that represents logprice as a Levy process can account for variations over time in the momentsof returns. In the branching process the variance and higher moments oftotal return do vary with the initial price level, but whether this picks upmuch of the variation that is observed empirically is not yet known. It isnow widely believed that some of the thick-tailedness in the marginal distri-butions of returns can be attributed to temporal variation in volatility. Levymodels obviously cannot account for this, since nonoverlapping returns arei.i.d.22 The b.p.r.e. accounts only for contemporaneous price-level effectsand so does little better on this score. Thus, none of the models explainsthe high autocorrelation that is almost always observed in the absolute val-ues or squares of high-frequency returns. Moreover, no Levy model with

22However, Finlay and Seneta (2006) have shown that it is possible to construct astationary process whose increments over intervals of specific length have v.g.-distributedmarginals but are nevertheless dependent.

Page 463: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

446 Pricing Derivative Securities (2nd Ed.)

finite-variance increments to log price could successfully capture the slowlyfading volatility smiles that characterize the empirical term structure ofimplicit volatility.23 Of course, allowing explicitly for stochastic variationin one or more parameters, as in the s.v. diffusions, is always possible intheory; but doing so would both add to an already-extensive computationalburden and fundamentally change the character of these models.

9.5 A Markov-Switching Model

In previous sections we have seen several ways of modeling discontinuouschanges through the use of Levy processes. This has been done both directlyby invoking the Poisson model and indirectly by using Levy processes todirect stochastic time changes. Here we look at an alternative approach,based on Edwards (2004, 2005), in which changes are triggered by a dif-ferent type of continuous-time Markov process. Like the Poisson process,this is a continuous-time Markov chain, but instead of taking unit jumps itmerely switches back and forth at random times among some finite numberof positions. In the present application the switch determines which of aset of alternative processes controls the stochastic part of underlying priceprocess Stt≥0. Such “regime-switching” models have found wide appli-cation in empirical economics since their introduction by Hamilton (1989).The modeling of economic time series as switching at random times amongregimes of different growth rates and/or volatilities can be motivated in vari-ous ways: through spontaneous changes in consumers’ and investors’ degreesof optimism, through successive breakthroughs and plateaus in technology,changes in legal constraints, shifts in monetary and fiscal policies, natu-ral disasters, wars and threats of war, and so forth. Sufficient motivation

23This is documented by Carr and Wu (2003), who circumvent this limitation ofthe Levy specification by modeling increments to log price as affine functions of anassymetric-stable law with characteristic exponent α ∈ (1, 2) and symmetry parameterβ = −1. Stable laws are represented via the c.f., with ln Ψ(ζ) given by

iζµ − |ζσ|α[1 − iβs(ζ) tan(πα/2)],

where s(·) is the signum function, µ and σ are location and scale parameters, and β ∈[−1, 1] and α ∈ (0, 2] control symmetry and tail thickness. The normal and Cauchy lawsare special cases with α = 2 and α = 1, β = 0 respectively. First moments exist wheneverα > 1, but the variance is finite only in the normal case. While changes in log price thushave infinite variance in the Carr-Wu setup, they do possess means. Also, with β = −1

the right tail of the distribution is thin enough that EST = EelnST < ∞, as neededfor martingale pricing. Unfortunately, the inflexibility of this choice limits the model’sability to adapt to the skewness in actual returns.

Page 464: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

Discontinuous Processes 447

for our purposes is the fact that the switching scheme offers a simple wayof generating price processes that produce volatility smiles. We begin byexamining some properties of the switching process itself, then see how toapply it in modeling the underlying price of a derivative security.24 As weshall see, Levy processes will continue to play a prominent role even thoughthey do not actually trigger the changes in regime.

We work with a stochastic process Itt≥0 having S+1 states, associatedwith the set of integers S =0, 1, . . . ,S. The behavior of such processes isspecified by the conditional probabilities of being in each possible state ateach time t + u given what is known at time t; i.e., ps′(t, u) = P(It+u =s′ | Ft). By restricting to processes that are Markovian, we can conditionjust on the knowledge of the present state, as pss′(t, u) = P(It+u = s′ |It = s). Restricting further to time-homogeneous processes removes theimplied dependence on t, so we can write the transition probabilities forthe process simply as pss′(u) = P(It+u = s′ | It = s). We assume that thesesatisfy for each t ≥ 0, u ≥ 0 and each state s the conditions pss′ ≥ 0 and∑

s′∈Spss′ = 1, together with the Chapman-Kolmogorov equations,

pss′(t + u) =∑s′′∈S

pss′′(t)ps′′s′(u)

s,s′∈S

. (9.43)

Each of these equations expresses the probability of transiting from s to s′

as the probability of the union of a collection of exclusive events; namely,that there is an intermediate transition from s to each s′′ ∈ S and thencefrom s′′ to s′. Of course, such conditions must hold for all positive t and u.

In addition, we set the initial transition probabilities as pss(0) = 1.

The usual procedure in modeling the transition probabilities is to specifytheir asymptotic behavior as u ↓ 0. Specifically, for positive constants ϑss′

we put

pss′(u) =

ϑss′u + o(u), s′ = s

1 − ϑssu + o(u), s′ = s,

where o(u)/u −→ 0 as u ↓ 0. Thus, ϑss′ represents the instantaneous rate atwhich the system transits from state s to state s′. Of course,

∑s′∈S

pss′ = 1implies that ϑss =

∑s′ =s ϑss′ . Putting state 0 last, we can arrange these in

24One may consult Cox and Miller (1965) and Ross (2000) for further details aboutdiscrete-space Markov processes in continuous time.

Page 465: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

448 Pricing Derivative Securities (2nd Ed.)

matrix form as

Θ =

−ϑ11 ϑ12 · · · ϑ1S ϑ10

ϑ21 −ϑ22 · · · ϑ2S ϑ20

......

. . ....

...

ϑS1 ϑS2 · · · −ϑSS ϑS0

ϑ01 ϑ02 · · · ϑ0S −ϑ00

.

Together, the asymptotic structure, the Chapman-Kolmogorov equations,and the initial conditions pss(0) = 1 lead to a system of differential equa-tions that can be solved for the transition probabilities at all future times.Thus, from (9.43) we have

pss′ (t + u) = pss′ (t)ps′s′(u) +∑

s′′ =s′pss′′(t)ps′′s′(u)

= pss′ (t)(1 − ϑs′s′u) +∑

s′′ =s′pss′′(t)ϑs′′s′u + o(u)

and hence

p′ss′(t) = limu↓0

u−1[pss′(t + u) − pss′(t)] = −pss′(t)ϑs′s′ +∑

s′′ =s′pss′′(t)ϑs′′s′ ,

which are Kolmogorov’s “forward” equations. Represented in matrix formas P′(t) = P(t)Θ, the solution subject to P(0) = I is P(t) = eΘt, wherethe exponential matrix is defined as the convergent series

eΘt =∞∑

k=0

Θktk/k!.

For example, if there are just two states, S = 0, 1, the forward equa-tions are

p′01(t) = −p01(t)ϑ11 + p00(t)ϑ01 = −p′00(t)

p′10(t) = −p10(t)ϑ00 + p11(t)ϑ10 = −p′11(t).

Setting ϑ00 = ϑ01 ≡ ϑ0 and ϑ11 = ϑ10 ≡ ϑ1, the solutions have the explicitform

p01(t) =ϑ0

ϑ0 + ϑ1[1 − e−(ϑ0+ϑ1)t] = 1 − p00(t)

p10(t) =ϑ1

ϑ0 + ϑ1[1 − e−(ϑ0+ϑ1)t] = 1 − p11(t). (9.44)

Page 466: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

Discontinuous Processes 449

What is essential for our application is to know the behavior of theoccupation times, which are the amounts of time during [0, t] that processIt spends in each state. These can be represented as Ts(t) =

∫ t

0 1s(Iu) ·

du ≥ 0 for s ∈ 1, 2, . . . ,S and T0(t) = t −S∑

s=1Ts(t) ≥ 0. The joint m.g.f.

of Ts(t)Ss=1 was given by Darroch and Morris (1968, theorem 1) as

MTs(t)(ρ) = π′et(Θ+D)1,

where (i) π is the vector of initial state probabilities,25 πs = P(I0 = s)Ss=0,(ii) ρ ∈ CS (an S -vector of complex numbers), (iii) 1 is an S + 1 vector ofunits, and (iv)

D =

ρ1 0 · · · 0 00 ρ2 · · · 0 0...

.... . .

......

0 0 · · · ρS 00 0 · · · 0 0

.

We will now build on these foundations a switching model for the priceprocess Stt≥0 of some underlying asset under the risk-neutral measure.Let Rs,ts∈S,t≥0 be a collection of S+1 stochastic returns processes havingthe following properties:

• Processes Rs,ts∈S,t≥0 are mutually independent and independent of thestate process Itt≥0.

• Each process Rs,tt≥0 is a nonnegative martingale under P, withERs,t = Rs,0 = 1.

• Each process lnRs,tt≥0 is a Levy process.

Two important features of Levy processes will be recalled from section 2.2.8.First, Levy processes are Markov, since they have independent increments.Second, if Rs,tt≥0 is Levy its c.f. under P satisfies

Ψs,t(ζ) = Eeiζ ln Rs,t = Ψs,1(ζ)t = (Eeiζ lnRs,1)t. (9.45)

These properties will prove critical in what follows.We will have to represent the security’s price at t in terms of the various

return processes that operate during [0, t]. This is not hard to do, but toget it right takes a bit of notation. For this, let Nt ∈ ℵ0 be the numberof times on (0, t) at which transitions from some state to another occur.

25These to be distinguished from the initial transition probabilities.

Page 467: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

450 Pricing Derivative Securities (2nd Ed.)

If there is at least one such transition, let τjNt

j=1 be the set of (random)times themselves—these being the discontinuity points of the state processIu0≤u≤t, and take τ0 = 0 and τNt+1 = t. Next, let sj = Iτj be thestate to which a transit occurs at τj , with s0 = I0 being the initial stateand sNt+1 = It. Now letting rt and δt be F0 -measurable short-rateand instantaneous dividend processes, we model the price of a traded assetunder P as

St =

S0e

R t0 (ru−δu)·du

Nt+1∏j=1

Rsj−1,τj−Rsj−1,τj−1

·RsNt ,t

RsNt, τNt

, Nt = 1, 2, . . .

S0eR

t0 (ru−δu)·du R0,t

R0,0= S0e

Rt0 (ru−δu)·duR0,t, Nt = 0.

(9.46)

For example, if Nt = 1 then

St = S0eR t0 (ru−δu)·duR0,τ1− · Rs1,t/Rs1,τ1 .

Expression (9.46) makes it appear that the distribution of St is extremelycomplicated, because St seems to depend on the random times of statechanges, on which states occurred, and on their order. Fortunately, thingsare simpler than this. Because processes Iu0≤u≤t and Rs,us∈S,0≤u≤t areMarkov, St depends just on the total occupation time for each state, noton how many times each state was visited and when. Thus, with Ts(t)s∈S

representing the occupation times, we have

St = S0eR t0 (ru−δu)·du

S∏s=0

Rs,Ts(t),

where “=” is interpreted here as “equal in distribution”.With this simplification we can now do two useful things. First, dividing

by Mt = M0eR t0 ru·du, we can verify at once that the normalized cum-

dividend price process, Sc∗t = Ste

R t0 δu·du/Mt, is a martingale under P:

ESc∗t = S∗

0 E

S∏s=0

Rs,Ts(t)

= S∗0 E

[ S∏s=0

E(Rs,Ts(t) | Ts(t)Ss=1)

]

= S∗0 E

S∏s=0

Rs,0 = S∗0 ≡ Sc∗

0 .

Page 468: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

Discontinuous Processes 451

This assures us that the model is consistent with an arbitrage-free market.Next, we can find the c.f. of lnSt under P. This will open a clear path towardthe goal of pricing European-style derivatives on this asset, via either of theone-step Fourier approaches described in section 8.4. Finding

Ψ(ζ) = Eeiζ ln St = Siζ0 eiζ

R t0 (ru−δu)·duEeiζ

PSs=0 ln Rs,Ts(t)

is straightforward given what we know about Levy processes and occu-pation times. Obviously, we have merely to work out the expectation inthe expression above, which is itself the c.f. of the log of the normalizedcum-dividend price relative, Sc∗

t /S∗0 . We can write this as

EeiζPS

s=0 ln Rs,Ts(t) = E exp

(iζ lnR0,t−T1(t)−···−TS(t) + iζ

S∑s=1

lnRs,Ts(t)

).

Conditioning on Ts(t)Ss=1 and using the tower property, we have

EeiζPS

s=0 ln Rs,Ts(t) = E

[Ψ0,t−T1(t)−···−TS(t)(ζ)

S∏s=1

Ψs,Ts(t)(ζ)

]

= E

[Ψ0,1(ζ)t−T1(t)−···−TS(t)

S∏s=1

Ψs,1(ζ)Ts(t)

]

= Ψ0,1(ζ)tE

S∏s=1

[Ψs,1(ζ)Ψ0,1(ζ)

]Ts(t)

,

where property (9.45) is used in the second line. Note that division byΨ0,1(ζ) is always permissible, since c.f.s of Levy processes have no zeroes.Now setting ρs(ζ) ≡ ln[Ψs,1(ζ)/Ψ0,1(ζ)] and ρ = (ρ1(ζ), ρ2(ζ), . . . , ρS(ζ))′,the c.f. of lnSt under P can be expressed in terms of the m.g.f. of theoccupation times, as

Ψ(ζ) = Siζ0 eiζ

Rt0 (ru−δu)·duΨ0,1(ζ)tMTs(t)(ρ)

= Siζ0 eiζ

Rt0 (ru−δu)·duΨ0,1(ζ)tπ′et(Θ+D)1. (9.47)

Notice that although the normalized cum-dividend price is a P martingale,the change in log price is not a Levy process, since

Eeiζ ln(Sc∗t /S∗

0 ) = Ψ0,1(ζ)tπ′et(Θ+D)1 = [Ψ0,1(ζ)π′e(Θ+D)1]t.

Thus, the increments of log price are not i.i.d. As Edwards (2005) pointsout, the switching effect is therefore capable of slowing the convergence tonormality of (the standardized) lnSc∗

t as t → ∞. This fact holds promise

Page 469: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

452 Pricing Derivative Securities (2nd Ed.)

for delivering models of derivatives’ prices that better fit the observed termstructures of implicit volatilities.

As a simple example of such a multistate switching process, takelnRs,t = −σ2

st/2 + σsZs,t, where Zs,ts∈S are independent Brown-ian motions. Thus, our returns processes are simply geometric Brownianmotions with state-dependent, F0-measurable volatilities. Then ERs,t = 1,

as is required for the martingale property, and

Ψs,t(ζ) = e−(iζ+ζ2)σ2st/2 = [e−(iζ+ζ2)σ2

s/2]t = Ψs,1(ζ)t.

Thus, the c.f. of ln St is

Ψ(ζ) = Siζ0 eiζ

R t0 (ru−δu)·due−(iζ+ζ2)σ2

0t/2π′et(Θ+D)1, (9.48)

with

D =iζ + ζ2

2

σ2

0 − σ21 0 · · · 0 0

0 σ20 − σ2

2 · · · 0 0...

.... . .

......

0 0 · · · σ20 − σ2

S 00 0 · · · 0 0

.

When there are just two states with instantaneous transition rates ϑ0 ≡ϑ01 = ϑ00 and ϑ1 ≡ ϑ10 = ϑ11, we have

Θ + D =

(−ϑ1 + ( iζ+ζ2

2 )(σ20 − σ2

1) ϑ1

ϑ0 −ϑ0

), (9.49)

or, for general returns processes with ρ(ζ) ≡ ln[Ψ1,1(ζ)/Ψ0,1(ζ)],

Θ + D =(−ϑ1 + ρ(ζ) ϑ1

ϑ0 −ϑ0

). (9.50)

At the cost of some flexibility, one parameter in the two-state model can beeliminated by setting π1 = ϑ0/(ϑ0 + ϑ1), which corresponds to the “long-run” transition probability p11(∞) in expression (9.44).

There are several ways to compute eMt for an arbitrary square matrix

M. For large enough N any of the expressionsN∑

j=1

Mjtj/j!, (I+N−1Mt)N ,

or [(I− N−1Mt)−1]N can yield an acceptable approximation (assuming inthe last case that I − N−1Mt is invertible). However, an exact expres-sion can be obtained with minimal computation by a procedure basedon the Cayley-Hamilton theorem of matrix algebra. The theorem statesthat an n × n matrix M satisfies its own characteristic equation, whichis to say that |Iλ − M| =λn + a1λ

n−1 + · · · + an−1λ + an = 0 implies

Page 470: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

Discontinuous Processes 453

Mn + a1Mn−1 + · · · + an−1M + an = 0.26 The Cayley-Hamilton methodfor evaluating eMt consists of the following steps:

1. Write down the characteristic equation and solve for latent rootsλjn

j=1.

2. Construct the system of n equations

eλ1t = α0 + α1λ1 + · · · + αn−1λn−11

eλ2t = α0 + α1λ2 + · · · + αn−1λn−12

...eλnt = α0 + α1λn + · · · + αn−1λ

n−1n

(9.51)

and solve for the coefficients α0, α1, . . . , αn−1.27

3. Finally, construct eMt as

eMt = α0I + α1M + · · · + αn−1Mn−1.

Thus, only n−2 matrix multiplications are required for an n×n matrix.28

To illustrate, we follow these steps to evaluate e(Θ+D)t in the two-statecase, where Θ + D is as in (9.50). First, the characteristic equation is

|Iλ − (Θ + D)| = λ2 − λ · tr(Θ + D)+|Θ + D|= λ2 + λ(ϑ0 + ϑ1 − ρ) − ϑ0ρ

26Schott (1997) gives a precise statement of the theorem. For a succinct explana-tion of how the Cayley-Hamilton method follows from the Cayley-Hamilton theorem seeRowell (2004).

27If the λj are not all distinct, (9.51) will comprise fewer than n independentequations and cannot be solved for the αj. In this case additional equations can begenerated by differentiating the equation pertaining to each multiple root. For example,if root λ1 has multiplicity k, the first k equations are generated as

d(j)eλ1t

dλj1

=d(j)

dλj1

(α0 + α1λ1 + · · · + αn−1λn−11 )

for j = 0, 1, . . . , k − 1.Nothing in the structure of the switching model rules out the possibility of equal

roots for certain values of ζ, transition rates ϑjk, and returns processes. Fortunately,the conditions that lead to multiplicity are of the knife-edge sort and are not apt tobe encountered in the process of numerically inverting the c.f.; however the numericalstability of the solutions near the edge must be considered.

28There is also a diagnonalization technique for evaluating eMt. It is based on the factthat when M has distinct latent roots the matrix V = (v1, v2, . . . ,vn) of latent vectors isnonsingular and such that V−1MkV = Λk, where Λ is a diagonal matrix with λjn

j=1

on the diagonal. From these facts it is easy to deduce that V−1eMtV=P∞

k=0 Λktk/k!and hence that eMt = VLtV−1, where Lt is a diagonal matrix with eλjtn

j=1 on thediagonal. The method can be modified to handle the case of multiple roots.

Page 471: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

454 Pricing Derivative Securities (2nd Ed.)

with (complex-valued) roots

λ1 =12

[ρ − (ϑ0 + ϑ1) +

√ρ2 + 2ρ(ϑ0 − ϑ1) + (ϑ0 + ϑ1)2

]λ2 =

12

[ρ − (ϑ0 + ϑ1) −

√ρ2 + 2ρ(ϑ0 − ϑ1) + (ϑ0 + ϑ1)2

].

Second, the coefficients α0, α1 are

α0 =λ1e

λ2t − λ2eλ1t

λ1 − λ2

α1 =eλ1t − eλ2t

λ1 − λ2,

and, third, e(Θ+D)t = α0I + α1(Θ + D) simplifies to

e(Θ+D)t =

(λ1+ϑ1−ρ)eλ2t−(λ2+ϑ1−ρ)eλ1t

λ1−λ2

ϑ1(eλ1t−eλ2t)λ1−λ2

ϑ0(eλ1t−eλ2t)λ1−λ2

(λ1+ϑ0)eλ2t−(λ2+ϑ0)eλ1t

λ1−λ2

.

Finally, with π′ = (π1, 1− π1) the expression π′et(Θ+D)1 in (9.47) has thesimple form

π′et(Θ+D)1 =[λ1 − π1ρ(ζ)]eλ2t − [λ2 − π1ρ(ζ)]eλ1t

λ1 − λ2.

Figure 9.8 illustrates the effect of Markov switching in a two-statemodel in which price is driven by geometric Brownian motions with volatil-ities σ0 = 0.1 and σ1 = 0.6. In this case the log of the ratio of c.f.s isρ = (iζ + ζ2)(σ2

0 − σ21)/2, and Θ + D is as given in (9.49). The figure

plots implicit volatilities vs. moneyness for three sets of transition rates,(ϑ0, ϑ1) ∈ (.9, .1), (.5, .5), (1., .9), with π1 set equal to long-run transitionprobability ϑ0/(ϑ0 +ϑ1) in each case. Note that ϑ0 ≡ ϑ01 and ϑ1 ≡ ϑ10 arethe instantaneous rates of transit from the low- and high-volatility states,respectively.29 The top-most curve shows that implicit volatility varies lit-tle with moneyness when there is a relatively high probability of being andremaining in the high-volatility state. On the other hand, the volatility smileis much more pronounced when the low-volatility state is more sustainablebut there is significant potential for volatility to increase. Figure 9.9 showsthat implicit volatility curves for the case (ϑ0, ϑ1) = (1., .9) flatten veryslowly as time to expiration increases.

29We have used highly discrepant state volatilities so as to make the effects of vari-ations in (ϑ0, ϑ1) more apparent. The curves are based on prices of European optionswith 0.5 years to expiration, a short rate of 0.02, and a dividend rate of zero.

Page 472: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

Discontinuous Processes 455

0.8 0.9 1.0 1.1 1.2

X/S0

0.1

0.2

0.3

0.4

0.5

0.6

Implic

it V

ola

tility

ϑ0=.5, ϑ1=.5

ϑ0=.9, ϑ1=.1

ϑ0=.1, ϑ1=.9

Fig. 9.8. Implicit volatility vs. moneyness in geometric B.m. switching model with(σ0, σ1) = (0.1, 0.6) and three sets of transition rates (ϑ0, ϑ1).

80 90 100 110 120

X/S0

0.15

0.16

0.17

0.18

0.19

0.20

0.21

0.22

0.23

0.24

0.25

Implic

it V

ola

tility

T = 0.5T = 1.0T = 2.0T = 4.0

Fig. 9.9. Implicit volatility vs. moneyness for four times to expiration T in geometricB.m. switching model: (ϑ0, ϑ1) = (0.1, 0.9), (σ0, σ1) = (0.1, 0.6).

Page 473: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch09 FA

456 Pricing Derivative Securities (2nd Ed.)

0.7 0.8 0.9 1.0 1.1 1.2 1.3 1.4

X/S0

0.30

0.31

0.32

0.33

0.34

0.35

0.36

0.37

0.38

0.39

0.40

0.41

Implic

it V

ola

tility

T =.1, θ ε .1,1

T =.1, θ = .1

T =.5, θ ε .1,1

T =.5, θ = .1

Fig. 9.10. Implicit volatility vs. moneyness for two times to expiration T, for jump-diffusion with intensity θ = .1 and for switching model with θ ∈ .1, 1.0.

Of course, there are many possible specifications of the switching model,since one is free to choose the number of states and the Levy returns pro-cess that applies in each. Figure 9.10 shows implicit volatilities from amodel that alternates between two constant-volatility jump-diffusion pro-cesses with radically different jump intensities, θ ∈ .1, 1.0, but having allother features in common. The transition rates from low- and high-intensitystates are (ϑ0, ϑ1) = (.1, .9). For comparison the curve for each time toexpiration T is paired with that for a standard jump diffusion with θ = .1(which is the same as in figure 9.1). The switching effect does somewhatimprove the term-structure properties of the model by slightly accentuatingthe curvature of volatility smiles at later expirations.

Page 474: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

10Interest-Rate Dynamics

This chapter provides an introduction to the extensive literature on pricinginterest-sensitive assets and derivatives, such as bonds, options on bonds,interest-rate caps and swaps, and options on caps and swaps. We havethus far worked under the assumption that future interest rates and bondprices are known in advance (condition RK of chapter 4). It is time nowto recognize and begin to model their stochastic behavior. The first sectionprovides a preliminary review of the notation and definitions introduced inchapter 4, introduces the concept of forward measure, and concludes withan overview of the models and methods considered in remaining sections.

10.1 Preliminaries

10.1.1 A Summary of Basic Concepts

Table 10.1 collects the notation, definitions, and fundamental properties ofinterest rates and bond prices that have been used in previous chapters.

Here are a few specific points that are particularly relevant for the sub-sequent discussion:

1. The definitions of the average and instantaneous forward rates implythat

rt(t′, T ) = (T − t′)−1

∫ T

t′rt(u) · du.

Forward prices of bonds can therefore be recovered from instantaneousforward rates as

Bt(t′, T ) = e−R T

t′ rt(u)·du.

457

Page 475: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

458 Pricing Derivative Securities (2nd Ed.)

Table 10.1. Notation for bond prices and interest rates.

Symbol Interpretation Properties/Relations

B(t, T ) Price at t of T -maturing B(t, t) = B(T, T ) = 1,

unit discount bond 0 < B(t, T ) ≤ 1

r(t, T ) Average spot rate at t for r(t, T ) = − ln B(t, T )

T − tborrowing from t until T

rt Instantaneous spot rate, rt =∂ ln B(t, T )

∂t|T=t

or short rate = −∂ ln B(t, T )

∂T|T=t

Mt Value of money fund that Mt = M0eR t0 rs·ds

invests at the short rate

Bt(t′, T ) Forward price at t to buy T - Bt(t′, T ) =B(t, T )

B(t, t′),

maturing bond at t′ ∈ [t, T ] Bt(t, T ) ≡ B(t, T )

rt(t′, T ) Average forward rate at t rt(t′, T ) = − lnBt(t′, T )

T − t′

for loans from t′ until T

rt(t′) Instantaneous forward rate rt(t′) = −∂ ln B(t, T )

∂T|T=t′ ,

at t for borrowing at t′ rt(t) = rt

2. Likewise, since Bt(t, T ) = B(t, T ) bonds’ spot prices can be expressedin terms of forward rates as well:

B(t, T ) = e−R

Tt

rt(u)·du. (10.1)

3. The “yield to maturity” of a T -maturing discount bond as of time t isthe same as r(t, T ), the average spot rate for lending from t to T.

10.1.2 Spot and Forward Measures

Modeling interest-sensitive derivatives typically requires modeling the evo-lution of the entire term structure of bond prices, or, equivalently, theterm structure of rates derived from them. The challenge is to derive amodel that handles all of these instruments together in a way that excludesthe possibility of arbitrage among them. Taking T∞ as the date of somefinite horizon at or beyond the longest term of any bond, we regard theprices of unit discount bonds of the various maturities as a collection of

Page 476: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 459

stochastic processes. Specifically, for each maturity date T ∈ (0, T∞] theevolving price B(t, T )0≤t≤T of a T -maturing unit bond is an adaptedprocess on (Ω,F , Ft0≤t≤T∞ , P) that satisfies 0 < B(t, T ) for t ∈ [0, T ]and B(T, T ) = 1.1 If VT is some FT -measurable contingent claim withE|VT | < ∞, there is, as usual in the absence of arbitrage, a measure P

(not necessarily unique) with respect to which normalized process V ∗t =

M−1t Vt0≤t≤T is a martingale. We can therefore express the claim’s current

arbitrage-free value as Vt = MtEt(M−1T VT ). In particular, with Vt = B(t, T )

and VT = B(T, T ) = 1 we have

B(t, T ) = MtEtM−1T = e

Rt0 rs·dsEte

−R

T0 ru·du = Ete

−R

Tt

ru·du. (10.2)

Were the entire short-rate process ru0<u≤T∞ known in advance, as wasassumed in earlier chapters, arbitrages between bonds and the money fundwould enforce the equality B(t, T ) = MtM

−1T , since both the T -maturing

bond and the (F0- and hence) Ft-measurable number M−1T of units of the

money fund would be worth one currency unit at T . In that case relationVt = MtEt(M−1

T VT ) would be equivalent to

Vt = B(t, T )EtVT = B(t, T )Et[B(T, T )−1VT ],

and it would make no difference whether we chose Mt0≤t≤T orB(t, T )0≤t≤T as normalizing process. That equivalence no longer holdsfor measure P when interest rates are uncertain, but there is an alter-native measure, PT , under which it is true that Vt = B(t, T )ET

t VT . Ofcourse, both P and PT are martingale measures; they merely correspondto different numeraires—Mt and B(t, T ). Both measures will be use-ful in pricing interest-sensitive assets, and to distinguish them we refer toP and PT , respectively, as the “spot (martingale) measure” and the “for-ward (martingale) measure for date T ”. It is important to note that aspecific forward measure corresponds to each maturity date; thus, distinctforward measures PT1 and PT2 correspond to numeraires B(t, T1)0≤t≤T1

and B(t, T2)0≤t≤T2 .

The connection between P and PT for a particular T can be seen asfollows. With T∞ as our finite horizon and T ≤ T∞, the two conditionsV0 = M0E(M−1

T VT ) = M0

∫ VT (ω)MT (ω) · dP(ω) and V0 = B(0, T )ET VT =

B(0, T )∫

VT (ω) · dPT (ω) imply that∫

VT (ω) · dPT (ω) =∫

VT (ω)QT (ω) ·

1It would be desirable to impose B(t, T ) ≤ 1 also, in conformity with assumption RBin chapter 4; but we shall see that this is violated in some otherwise attractive models.

Page 477: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

460 Pricing Derivative Securities (2nd Ed.)

dP(ω), with

QT ≡ dPT

dP=

M0

MT B(0, T )=

B(T, T )/B(0, T )MT /M0

. (10.3)

Random variable (Radon-Nikodym derivative) QT thus effects the changeof measure from P to PT that is appropriate for valuation at t = 0, asET VT = B(0, T )EQT VT . Now for arbitrary t ∈ [0, T ] put

Qt ≡ EtQT

=M0

B(0, T )Et

B(T, T )MT

=M0

B(0, T )Mt(Ete

−R

Tt

ru·du) =B(t, T )/B(0, T )

Mt/M0,

noting that Q0 = 1. Notice that, as a conditional expectations process, Qtis defined so as to be a martingale under P. Formula (2.51) on page 72 forconditional expectations in alternative measures (with PT , P here corre-sponding to P, P there) now gives as the time-t value of an FT -measurableclaim.

Vt = B(t, T )ETt VT = B(t, T )

EtQT VT

EtQT

= B(t, T )Et(VT QT /Qt). (10.4)

Thus,QT /Qt =

MtM−1T

B(t, T )=

MtM−1T

Et(MtM−1T )

=e−

RTt

ru·du

Ete−

R Tt

ru·du

0≤t≤T

(10.5)

is the appropriate Radon-Nikodym process for valuation under the for-ward martingale measure for T . Later, we shall see what specific form thistakes under various models for interest rates and give some examples of itsapplication.

Regarding notation, observe that the superscript on E defines the for-ward measure, while (as heretofore) the subscript indicates conditioning;thus, ET

t (·) = ET (· | Ft) =∫

(·) · dPT (ω | Ft).To see why the designation “forward measure” is apt, note that if

Vt represents the evolving cum-dividend value of a financial claim, thenB(t, T )−1Vt = ET

t VT = Et(VT QT /Qt) is the forward price that would beset at t for delivery at T. Thus, even though interest rates are recognizedto be stochastic, forward prices do equal expected future spot prices in thismeasure. In particular, if Vt = B(t, T ) is the value at t of a discount bond

Page 478: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 461

maturing at T ≥ t, then for t ≤ t′ ≤ T we have

Bt(t′, T ) ≡ B(t, t′)−1Vt = Et′t Vt′ = Et′

t B(t′, T ).

Thus, the bond’s forward price as of time t is the expectation under Pt′ ofits spot price at t′.

10.1.3 A Preview of Things to Come

Comparing (10.2) with (10.1) makes it clear that the evolution of bonds’prices can be described either through the evolution of short rates,rt0≤t≤T∞ , or through the evolution of instantaneous forward rates atall maturities, rt(T )0≤t≤T≤T∞ . Short-rate, or “spot-rate”, models eitherspecify exogenously or deduce from an equilibrium framework a model forrt0≤t≤T∞ , typically as a (Markovian) diffusion process. Examples are themodels of Vasicek (1977),

drt = (a − brt) · dt + σ · dWt,

and Cox, Ingersoll, and Ross (1985),

drt = (a − brt) · dt + σr1/2t · dWt.

While modeling the short rate is easier than modeling all the forward rates,the disadvantage is that spot-rate models need not fit the existing termstructure of bond prices; that is, they do not in general satisfy (10.2) for eachT . Needless to say, it is undesirable that a model that might be used to pricederivatives on bonds cannot price the bonds themselves. Hull and White(1990) show that it is possible to fit the term structure by allowing timedependence in one or more of the parameters a, b, σ. Another approach—the one on which we focus here—is to start from an initial forward-ratecurve, r0(T )0≤T≤T∞ , and model the evolution from there of the entireterm structure of instantaneous forward rates, rt(T )0≤t≤T≤T∞ . This isdone by representing rt(T )0≤t≤T for each T as an Ito process with itsown characteristic trend and volatility:

rt(T ) − r0(T ) =∫ t

0

µs(T ) · ds +∫ t

0

σs(T )′ · dWs.

Obviously, in view of (10.1), this does assure that bonds’ initial prices arecorrectly fit. In this setup W and σ may be vector-valued, allowing forindependent influences on rates and thereby permitting forward rates atdifferent maturities to be imperfectly correlated. Since rt(t) = rt, the model

Page 479: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

462 Pricing Derivative Securities (2nd Ed.)

necessarily specifies the evolution of the spot rate as well. This forward-rateapproach was pioneered by Heath, Jarrow, and Morton (HJM) (1992).

Because spot-rate models are more intuitive and are still important asfoundations for later developments, we work out in the next section theprices of discount, default-free bonds in the models of Vasicek and Coxet al. While this is usually done by solving differential equations, we showthat martingale methods are not hard to apply. Section 10.3 describes thegeneral HJM approach and its implications for bond prices. To illustratehow the model can be used, we adopt a simple one-factor version and showhow to work out analytically the prices of various interest-dependent con-tingent claims: options on the money fund, options on discount and coupon-paying bonds, caps and floors, and options on interest-rate swaps. Usingthe same setup, we will also see how to take account of stochastic rates inmodeling futures prices and valuing options on assets that are not highlyrate-sensitive. Section 10.4 treats the more recent LIBOR-market modelsthat deliver the simple Black-Scholes-like formulas for interest rate capsthat are commonly used by traders. Because these formulas are so simpleand cap markets so active, the LIBOR-model framework has been founduseful in calibrating models for other interest-sensitive derivatives. Finally,the last section of the chapter treats the pricing of bonds that are subjectto default and thus gives an introduction to the literature on credit risk.We shall see there that the venerable spot-rate models to which we nowturn still have a role to play in the pricing of debt instruments.

10.2 Spot-Rate Models

Most continuous-time models of the spot rate can be represented as Itoprocesses, in the form

drt = µt · dt + σt · dWt,

where µt and σt are appropriate scalar- and vector-valued processesand Wt is a vector-valued Brownian motion on (Ω,F , Ft, P). The vec-tor formulation allows for multiple risk factors. So long as they meet thetechnical conditions for Ito processes, µt and σt can be specified freelyto depend on time, on the short rate itself, and on other observable statevariables so as to represent as well as possible the observed behavior of rtunder the natural measure. Of course, the more elaborate the model theharder it is to work out explicit expressions for prices of bonds and interest-sensitive derivative assets. We describe here two single-factor Markovianmodels in which µt and σt depend just on rt.

Page 480: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 463

10.2.1 Bond Prices under Vasicek

Taking µt = a− brt and σt = σ for positive constants a, b, σ yields a mean-reverting model introduced by Vasicek (1977):

drt = (a − brt) · dt + σ · dWt. (10.6)

An explicit expression for rt comes by applying Ito’s formula to rtebt, giving

rtebt − r0 = a

∫ t

0

ebs · ds + σ

∫ t

0

ebs · dWs

=a

b(ebt − 1) + σ

∫ t

0

ebs · dWs

rt = e−btr0 + aβ(t) + σ

∫ t

0

e−b(t−s) · dWs, (10.7)

where we set β(t) ≡ b−1(1 − e−bt) for 0 ≤ t. The model thus presentsrt as a Gaussian process, with Ert = e−btr0 + aβ(t) and Cov(rt, ru) =σ2

2 [β(u + t) − β(u − t)] for 0 < t ≤ u. Note that this has the undesirableimplications that spot rates can become negative and that prices of discountbonds can exceed unity.

An expression for the arbitrage-free price of a discount bond can befound via a hedging argument similar to that used in developing the Black-Scholes formulas for options. Since rt itself is a Markov process adaptedto Ft, the price at t of a T -maturing discount bond must be a func-tion of t, T, and rt alone. Assuming that it has the required smooth-ness, we can represent B(t, T )0≤t≤T as an Ito process with dB(t, T ) =µ(t, T, rt) ·dt+σ(t, T, rt) ·dWt, say. Consider a self-financing portfolio com-prising at time t one T1-maturing unit bond and −pt unit bonds thatmature at T2. The portfolio’s value at t is Pt = B(t, T1) − ptB(t, T2), and,because it is self-financing, dPt = dB(t, T1) − pt · dB(t, T2). Because ofthe one-factor specification the bonds’ prices are (instantaneously) per-fectly correlated, so the stochastic part of dPt disappears upon settingpt = −σ(t, T1, rt)/σ(t, T2, rt). Since the resulting portfolio is riskless, wemust have dPt = rtPt · dt in order to avoid arbitrage against the moneyfund. Expressing dPt and simplifying yield the following condition on thebond processes:

µ(t, T1, rt) − rtB(t, T1)σ(t, T1, rt)

=µ(t, T2, rt) − rtB(t, T2)

σ(t, T2, rt).

A relation like this must hold for all pairs of bonds maturing after t. Thecommon value of the ratio represents the added per-unit compensation for

Page 481: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

464 Pricing Derivative Securities (2nd Ed.)

bearing the instantaneous risk associated with discount bonds—i.e., it is the“price” of risk. Assuming this price to be a constant, γ, say,2 and using Ito’sformula to express µ(t, T, rt) = Bt + Br(a − brt) + σ2

2 Brr and σ(t, T, rt) =σBr, we obtain the p.d.e. 0 = Bt + Br(a− γσ − brt) + σ2

2 Brr − rtB, whereB ≡ B(t, T ) and subscripts on B represent partial derivatives. Of course,there is the terminal condition B(T, T ) = 1. Guessing (correctly) that thesolution has the form B(t, T ) = A(T − t) exp[−α(T − t)rt], one can plug inand solve the resulting ordinary differential equations for A and α subjectto ln A(0) = α(0) = 0.

As an alternative, let us find B(t, T ) by martingale methods. If we applyGirsanov’s theorem and change from natural measure P to a measure P inwhich Wt = Wt − γt is a Brownian motion, our model for rt becomesdrt = (a∗ − brt) · dt + σ · dWt, where a∗ ≡ a + γσ. The p.d.e. obtained bythe hedging argument is then

0 = Bt + Br(a∗ − brt) +σ2

2Brr − rtB. (10.8)

One gets the same p.d.e. by applying Ito’s formula to B(t, T )∗ ≡B(t, T )/Mt, replacing a by a∗, and setting drift term µ(t, T, rt)− rtB equalto zero. This means that B(t, T )∗ is a martingale under P, so we can nowidentify P as the spot martingale measure under which Mt-normalizedprice processes are martingales. It follows that

B(t, T ) = MtEt[M−1T B(T, T )] = Et exp

(−

∫ T

t

ru · du

), (10.9)

so that the bond’s arbitrage-free price can be found by working out theexpectation.3

Proceeding, from (10.7) we have under the new measure∫ T

t

ru ·du = r0[β(T )−β(t)]+a∗∫ T

t

β(u) ·du+σ

∫ T

t

∫ u

0

e−b(u−s) ·dWs du.

(10.10)

2That the price of risk be F0-measurable is enough for what follows, but allowingtime dependence would complicate the formulas to little purpose.

3To see directly that the last expression in (10.9) satisfies the p.d.e., note that

under P the conditional expectations process E(t, rt) ≡ Et exp(− R T0

rs · ds) =

exp(− R t0

rs · ds)B(t, T ) is a martingale adapted to the filtration generated by Wt.By the martingale representation theorem E(t, rt) can be represented as an Ito

process with zero drift. Applying Ito’s formula and equating the drift to zero give

0 = Et + (a∗ − brt)Er + σ2

2Err − rtE, which simplifies to (10.8).

Page 482: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 465

The last integral can be evaluated as4(∫ t

0

∫ T

t

+∫ t

s

∫ T

t

)e−buebs · du dWs

=e−bt − e−bT

b

∫ t

0

ebs · dWs +1b

∫ T

t

(1 − e−b(T−s)) · dWs

=β(T − t)

σ

∫ t

0

e−b(t−s) · dWs

]+

∫ T

t

β(T − s) · dWs

=β(T − t)

σ[rt − e−btr0 − a∗β(t)] +

∫ T

t

β(T − s) · dWs.

Combining the three terms on the right side of (10.10) and using identities

β(T − t)e−bt = β(T ) − β(t) (10.11)∫ T

t

β(T − u) · du =∫ T

t

β(u) · du − β(T − t)β(t) (10.12)

yield∫ T

t

ru · du = a∗∫ T

t

β(T − u) · du + σ

∫ T

t

β(T − u) · dWu + β(T − t)rt.

(10.13)This being normally distributed under P with time-t conditional mean

a∗∫ T

t

β(T − u) · du + β(T − t)rt

and conditional variance σ2∫ T

tβ(T − u)2 · du, the bond’s price is

B(t, T ) = exp

[−a∗

∫ T

t

β(T − u) · du +σ2

2

∫ T

t

β(T − u)2 · du − β(T − t)rt

].

(10.14)Thus, in view of (10.7) the value of the discount bond at time t is distributedex ante as lognormal.

Notice that parameters a, b, σ can be estimated from observations ofthe short rate alone, whereas the effect of risk price γ shows up only in theprices of traded, interest-sensitive assets, such as the discount bonds. Oncea, b, σ are determined, γ can be identified directly from (10.14) and relationa∗ = a + γσ.

An application of Ito’s formula shows the dynamics of B(t, T ) to begiven by dB(t, T )/B(t, T ) = rt · dt − σβ(T − t) · dWt under P and by

4To see the first step, draw a picture.

Page 483: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

466 Pricing Derivative Securities (2nd Ed.)

dB(t, T )/B(t, T ) = [rt + γσβ(T − t)] · dt − σβ(T − t) · dWt under thenatural measure, the extra trend component in the latter case being therequired compensation for risk. Recall from (10.5) that Radon-Nikodymprocess QT /Qt = B(t, T )−1 exp(−

∫ T

t ru · du) effects the change of mea-sure from P to forward measure PT . It is easy to see from (10.13) and(10.14) that QT /Qt takes here the specific form

QT /Qt = exp

[−σ2

2

∫ T

t

β(T − u)2 · du − σ

∫ T

t

β(T − u) · dWu

].

This corresponds to the Doleans exponential for the change of measure thatconverts Wt − σ

∫ t

0 β(T − u) · du ≡ WTt to a Brownian motion.

EB(t, T ), the time-0 expected value of B(t, T ) under spot measure P, is

B0(t, T ) exp

σ2

2b[−2β(τ)β(t) +

12β(2τ)β(2t) +

12bβ(τ)2β(2t)]

,

where τ ≡ T − t. This is less than forward price B0(t, T ) except at t = 0and t = T. Since the expression does not involve a∗, the same result appliesunder the natural measure; however, B0(t, T ) itself does depend on a∗ andtherefore on risk price γ. Figure 10.1 depicts the ratio EB(t, 10)/B0(t, 10),

0 1 2 3 4 5 6 7 8 9 10

t

0.92

0.93

0.94

0.95

0.96

0.97

0.98

0.99

1.00

EB

(t,1

0)

/B

0(t,10)

r0=.01

r0=.04

r0=.07

Fig. 10.1. EB(t, 10)/B0(t, 10) vs. t under Vasicek for three values of initial spot rate,r0. Parameters: a∗ = .025, b = .5, σ = .1.

Page 484: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 467

which is also independent of initial spot rate r0. Of course, under forwardmeasure PT there is the usual result that ET B(t, T ) = B0(t, T ).

Naturally, from the expressions for spot prices of bonds at all maturitiesone can determine all the various prices and rates derived from them: (i)average spot rates or yields to maturity, r(t, T ) = −(T − t)−1 lnB(t, T );(ii) forward bond prices, Bt(t′, T ) = B(t, T )/B(t′, T ); and (iii) average andinstantaneous forward rates, rt(t′, T ) = −(T − t′) lnBt(t′, T ) and rt(t′) =−∂ lnBt(t′, T )/∂T |T=t′. For example, the instantaneous forward rate at t

for loans at T ≥ t is

rt(T ) = a∗β(T − t) − σ2

2b[2β(T − t) − β(2T − 2t)] + rte

−b(T−t),

which of course reduces to rt at T = t. This evolves as

drt(T ) = σ2e−b(T−t)β(T − t) · dt + σe−b(T−t) · dWt

= σe−b(T−t)[σβ(T − t) − γ] · dt + σe−b(T−t) · dWt. (10.15)

Despite the model’s simplicity, the mean-reverting feature of the instanta-neous spot rate process allows for quite varied behavior of the term structureof yields to maturity. Specifically, as figure 10.2 shows, the yield curve (a

0 1 2 3 4 5 6 7 8 9 10

Years to Maturity, T − t

0.01

0.02

0.03

0.04

0.05

0.06

0.07

0.08

0.09

0.10

0.11

Yie

ld, r(

t,T

)

rt = .01

rt = .04rt = .07

Fig. 10.2. Vasicek yields vs. T − t for three values of initial spot rate rt Parameters:a∗ = .025, b = .5, σ = .1.

Page 485: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

468 Pricing Derivative Securities (2nd Ed.)

plot of r(t, T ) versus T ) is strictly increasing when rt is sufficiently smalland hump shaped or decreasing when rt is larger. As T → ∞ the yieldapproaches r(t,∞) = a∗

b − σ2

2b2 , independently of rt.

10.2.2 Bond Prices under Cox, Ingersoll, Ross

Adopting strategically simplifying assumptions about technology and pref-erences, Cox, Ingersoll, and Ross (1985) show that the short-rate processrt represented by drt = (a−brt)·dt+σr

1/2t ·dWt can result from a general

equilibrium in which prices are generated by competitive, optimizing eco-nomic agents. This approach permits an interpretation of the parametersa, b, σ in terms of characteristics of individuals’ tastes and of the technologythat generates income. This deeper story not being relevant for our pur-poses, we consider the model strictly in an arbitrage framework, taking themean-reverting square-root “CIR” process for rt as a primitive.

A hedging argument proceeds like that used for the Vasicek model butwith the assumption that the price of risk is proportional to

√rt rather

than constant. Expressing dB(t, T ) = µ(t, T ) · dt + σ(t, T ) · dWt as beforethus leads to the equality

µ(t, T ) − rtB(t, T )σ(t, T )

= γ√

rt

for any bond maturing at T > t. We also get the following p.d.e. for itsprice: 0 = Bt + Br(a − brt − σγrt) + σ2

2 Brr − rtB. As in the Vasicekmodel, one can guess (and verify) that the solution has the form B(t, T ) =A(T − t)e−α(T−t)rt and find the functions A and α by solving ordinarydifferential equations subject to ln A(0) = α(0) = 0. However, we willagain find the solution by shifting to the spot martingale measure andevaluating B(t, T ) as Et exp(−

∫ T

tru · du). Assuming (as justified below)

that E exp(12

∫ T∞0

rs · ds) < ∞, where T∞ equals or exceeds the longesttime to maturity of any bond, Girsanov’s theorem authorizes a change ofmeasure from P to P, in which Wt = Wt + γ

∫ t

0

√rs · ds is a Brownian

motion on [0, T∞]. The process for rt can now be represented as drt =(a − b∗rt) · dt + σr

1/2t · dWt with b∗ = b + γσ, and the p.d.e. for B(t, T )

becomes 0 = Bt + Br(a − b∗rt) + σ2

2 Brr − rtB. Again, the same p.d.e.is obtained under P by applying Ito’s formula to B(t, T )∗ = B(t, T )/Mt,replacing b by b∗, and equating the drift to zero, thus confirming P as thespot martingale measure.

Page 486: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 469

We have encountered mean-reverting square-root processes already insections 8.3.2 and 9.3.2 in connection with models of stochastic volatilityand jump intensity, and we have seen that such processes are almost surelynonnegative. While this overcomes the salient shortcoming of the Vasicekmodel, the fact that rt is no longer Gaussian does make it harder to workout arbitrage-free prices of bonds and derivatives by martingale methods.Just what sort of process rt is was determined by Feller (1951), whodevelops the p.d.f. of rt by solving the parabolic equation ∂f

∂t = ∂2(rf)∂r2

σ2

2 −∂(a−b∗r)f

∂r subject to an initial condition that specifies rt at t = 0. Thesolution (here in the spot martingale measure) is5

f(r; t, r0) = θt exp[−θt(r + r0e−b∗t)]

(r

r0e−b∗t

) aσ2 − 1

2

× I1−2a/σ2

(2θt

√r r0e−b∗t

)for r > 0, where θt ≡ 2b∗

σ2(1−e−b∗t)and Iq(·) is the modified Bessel function

of the first kind of order q. The p.d.f. satisfies the boundary conditionf(0, t; r0) < ∞ so long as a/σ2 < 1/2.

To gain insight into this expression, change variables as x = 2θtr, λ =2θte

−b∗tr0, ν/4 = a/σ2, set f(x; λ, ν) = f(r; t, r0)| drdx |, and use the fact that

Iq(·) = I−q(·) to get

f(x; λ, ν) =12e−(x+λ)/2

(x

λ

)ν/4−1/2

Iν/2−1

(√xλ

), x > 0.

This is a common representation of the noncentral chi-squared p.d.f. withν degrees of freedom and noncentrality parameter λ. Thus, in the originalnotation, one sees that rt is (2θt)−1 times a noncentral chi-squared variatewith 4a/σ2 d.f. and noncentrality 2θte

−b∗tr0.

Expressing the Bessel function in series form as in (2.24) gives anotherhelpful expression,

f(x; λ, ν) = 2−ν/2e−(x+λ)/2∞∑

j=0

xν/2+j−1λj

Γ(ν/2 + j)22jj!

=∞∑

j=0

(λ/2)j

j!e−λ/2f(x; ν + 2j),

5This corresponds to Feller’s expression (6.2) with σ2/2 replacing a, −b∗ replacingb, and a replacing c. The factor 4b2 that appears in the first line of his expression shouldnot be present.

Page 487: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

470 Pricing Derivative Securities (2nd Ed.)

where f(·; ν + 2j) is the central chi-squared p.d.f. with ν + 2j d.f. Thispresents the noncentral chi-squared as a Poisson-directed mixture of cen-tral chi-squareds. Thus, X ∼ χ2(ν + 2Y ) conditional on Y ∼ P (λ/2). Thishierarchical representation makes it easy to calculate moments and gener-ating functions/Laplace transforms. Specifically,

EX = E[E(X | Y )] = E(ν + 2Y ) = ν + λ

V X = E[E(X2 | Y )] − (ν + λ)2 = 2ν + 4λ

LX(ζ) = E[E(e−ζX | Y )] = (1 + 2ζ)−ν/2 exp(− λζ

1 + 2ζ

), ζ ≥ 0.

With these results we have

Ert = r0e−b∗t + aβ(t)

V rt = σ2β(t)[r0e−b∗t + aβ(t)/2]

Lrt(ζ) =[1 +

σ2

2β(t)ζ

]−2a/σ2

exp

[− r0e

−b∗tζ

1 + σ2

2 β(t)ζ

],

where β(t) ≡ (1 − e−b∗t)/b∗.We can now use the expression for the Laplace transform to develop

a solution for B(t, T ). Setting tj = t + j(T − t)/n ≡ t + j∆t, we have∫ T

t rs · ds = limn→∞∑n

j=1 rtj ∆t and, by dominated convergence,

B(t, T ) = limn→∞

Et exp

−n∑

j=1

rtj∆t

= lim

n→∞Et

exp

−n−2∑j=1

rtj∆t

e−rtn−1∆tEtn−1e−rtn∆t

= lim

n→∞

(1 +

σ2

2β(∆t)∆t

)−2a/σ2

· Et exp

− n−2∑j=1

rtj

∆t − rtn−1∆t

(1 +

e−b∗∆t

1 + σ2

2 β(∆t)∆t

).

If we now define α0,n ≡ α(T − tn−1) to be ∆t and for j ∈ 1, 2, . . . , n set

αj,n ≡ α(T − tn−j−1) = ∆t +e−b∗∆tαj−1,n

1 + σ2

2 β(∆t)αj−1,n

,

Aj,n ≡ A(T − tn−j) =j∏

i=1

(1 +

σ2

2β(∆t)αi−1,n

)−2a/σ2

,

Page 488: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 471

then the last expression for B(t, T ) develops as

B(t, T ) = limn→∞

A1,nEt exp

−n−2∑j=1

rtj

∆t

Etn−2e−rtn−1α1,n

= limn→∞

A2,nEt exp

−n−3∑j=1

rtj

∆t

Etn−3e−rtn−2α2,n

...

= limn→∞

An,ne−rtαn,n .

With A(T − t) ≡ limn→∞ An,n and α(T − t) = exp(−rt limn→∞ αn,n) thebond’s price is then given by6

B(t, T ) = A(T − t)e−rtα(T−t). (10.16)

We solve first for α(T − t). Noting that β(∆t) = ∆t + o(∆t) as ∆t → 0,

the recursion for αj,n yields difference equation

αj,n − αj−1,n = ∆t − αj−1,n(1 − e−b∗∆t) − σ2

2α2

j−1,n∆t + o(∆t)

=[1 − b∗αj−1,n − σ2

2α2

j−1,n

]∆t + o(∆t)

and corresponding differential equation

dα(τ)dτ

= 1 − b∗α(τ) − σ2

2α(τ)2.

The solution subject to α(0) = 0 is

α(τ) =e2cτ − 1

c(e2cτ + 1) + b∗2 (e2cτ − 1)

=sinh(cτ)

c cosh(cτ) + b∗2 sinh(cτ)

, (10.17)

where τ ≡ T − t and c ≡ 12

√b∗2 + 2σ2.

6Noting that noncentral chi-squared variates possess m.g.f.s, the Novikov conditionE exp(

R T∞0 rt · dt) < ∞ that justified the change to the martingale measure can now be

verified by steps similar to those above.

Page 489: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

472 Pricing Derivative Securities (2nd Ed.)

Next, for A(T − t) we have

ln A(T − t) = −2a

σ2lim

n→∞

n∑j=1

ln[1 +

σ2

2aj−1,n∆t + o(∆t)

]

= −a limn→∞

n∑j=1

[aj−1,n∆t + o(∆t)]

= a

∫ T

t

α(T − u) · du,

so that d lnA(T − t)/dt = −d lnA(τ)/dτ = −aα(τ). Solving this subject toA(0) = 1 gives

A(τ) =[cα(τ)eb∗τ/2

sinh(cτ)

]2a/σ2

. (10.18)

An expression for EB(t, T ) can be found by evaluating Laplace trans-form Lrt(·) at ζ = α(τ), as

EB(t, T ) = B0(t, T )A(τ)A(t)

A(T )

[1 +

σ2

2β(t)α(τ)

]−2a/σ2

· exp−r0

[e−b∗tα(τ)

1 + σ2β(t)α(τ)/2− α(T ) + α(t)

].

The expectation under P is less than forward price B0(t, T ) for t ∈ (0, T ).Figure 10.3 plots EB(t, 10)/B0(t, 10) versus t for three values of the initialshort rate. Of course, ET B(t, T ) = B0(t, T ) for t ∈ [0, T ] under forwardmeasure PT .

The average spot rate or yield to maturity on a T = (t + τ)-maturingdiscount bond is

r(t, T ) = − lnA(τ)τ

+α(τ)

τrt.

As τ → ∞ this approaches r(t,∞) ≡ aσ2 [

√b∗2 + 2σ2 − b∗] independently of

rt. As was true of the Vasicek model, the dynamics allow for yield curves ofvarying shapes (figure 10.4). In the same way, too, we can determine riskcoefficient γ from bond prices once the remaining parameters have beenestimated from observations of the short rate.

10.3 A Forward-Rate Model

Rather than base the term structure of prices and yields on a model forthe spot rate alone, Heath, Jarrow, and Morton (HJM) (1992) undertake

Page 490: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 473

0 1 2 3 4 5 6 7 8 9 10

t

0.92

0.93

0.94

0.95

0.96

0.97

0.98

0.99

1.00

EB

(t,1

0)

/B0(t

,10) r0=.01

r0=.04

r0=.07

Fig. 10.3. EB(t, 10)/B0(t, 10) vs. t under CIR model for three values of initial spotrate, r0. Parameters: a = .025, b∗ = .5, σ = .8.

0 1 2 3 4 5 6 7 8 9 10

Years to Maturity, T − t

0.01

0.02

0.03

0.04

0.05

0.06

0.07

0.08

Yie

ld, r(

t,T

)

rt = .01

rt = .04

rt = .07

Fig. 10.4. CIR yields vs. T − t for three values of initial spot rate, rt Parameters:a = .025, b∗ = .5, σ = .8.

Page 491: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

474 Pricing Derivative Securities (2nd Ed.)

to model the evolution of instantaneous forward rates across the entirespectrum of maturities. To get across the idea, we start with a one-factorversion in which all forward rates respond to a single stochastic influence.After extending to multiple risk factors we return to the simple one-factorsetting to illustrate how the model works to price various interest-sensitivefinancial instruments.

10.3.1 The One-Factor HJM Model

Taking as given the initial instantaneous forward rates at all maturi-ties, r0(T )0≤T≤T∞ , we begin by specifying the evolution of the variousforward-rate processes, as

rt(T )−r0(T ) =∫ t

0

µs(T )·ds+∫ t

0

σs(T )·dWs, 0 ≤ t ≤ T ≤ T∞, (10.19)

or, in differential form, as

drt(T ) = µt(T ) · dt + σt(T ) · dWt.

In this one-factor version Wtt≥0 is a (scalar-valued) Brownian motion onthe filtered probability space (Ω,F , Ft, P). For each T the drift and diffu-sion are adapted processes, µt(T )0≤t≤T , σt(T )0≤t≤T , possibly depend-ing on present and past values of interest rates, and such that the respectiveintegrals exist.7 Since the time-t forward rate for lending at t is just theinstantaneous spot rate—i.e., rt(t) = rt, the same framework implicitlydefines the behavior of the short rate also, as drt = µt(t) ·dt+σt(t) ·dWt or

rt = r0(t) +∫ t

0

µs(t) · ds +∫ t

0

σs(t) · dWs. (10.20)

Notice that in this one-factor form instantaneous changes in the short rateand in the instantaneous forward rates at all future lending dates are per-fectly correlated, each depending on the change in the single Brownianmotion that drives the entire structure. Of course, the same is true of theone-factor versions of the spot-rate models of the previous section.

The Evolution of Bond Prices

Unlike short rates and forward rates, discount bonds themselves are tradedassets. We will now see that a special structure must be imposed on the

7That is, such that P[R T0 |µs(T )| · ds < ∞] = P[

R T0 σs(T )2 · ds < ∞] = 1.

Page 492: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 475

forward-rate drift processes if the bond market is to be arbitrage free. Writ-ing lnB(t, T ) in terms of the time-t instantaneous forward rates as

lnB(t, T ) = −∫ T

t

rt(u) · du

= −∫ T

t

[r0(u) +

∫ t

0

µs(u) · ds +∫ t

0

σs(u) · dWs

]· du

and then reversing the order of integration give

lnB(t, T ) = −∫ T

t

r0(u) · du −∫ t

0

µs(t, T ) · ds −∫ t

0

σs(t, T ) · dWs,

where

µs(t, T ) ≡∫ T

t

µs(u) · du

σs(t, T ) ≡∫ T

t

σs(u) · du.

We now want to express the differential dB(t, T ). Note that dµs(t, T ) =−µs(t) · dt and dσs(t, T ) = −σs(t) · dt, so8

d

∫ t

0

µs(t, T ) · ds = µt(t, T ) −∫ t

0

µs(t) · ds dt

d

∫ t

0

σs(t, T ) · dWs = σt(t, T ) · dWt −∫ t

0

σs(t) · dWs dt.

Therefore,

d lnB(t, T ) =[r0(t) +

∫ t

0

µs(t) · ds +∫ t

0

σs(t) · dWs

]· dt

−µt(t, T ) · dt − σt(t, T ) · dWt

= rt · dt − µt(t, T ) · dt − σt(t, T ) · dWt.

The first term on the right supplies the customary trend in price as thebond gets closer to maturity, while the second and third terms pick up theeffects of slope and shifts in the term structure, respectively. Finally, Ito’sformula applied to B(t, T ) = exp[lnB(t, T )] gives

dB(t, T )/B(t, T ) = [rt−µt(t, T )+σt(t, T )2/2]·dt−σt(t, T )·dWt. (10.21)

8The relation in the second line, the stochastic equivalent of Leibnitz’ formula, followsat once from the definition of the stochastic integral.

Page 493: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

476 Pricing Derivative Securities (2nd Ed.)

We know that in an arbitrage-free market there is a measure P, equiv-alent to natural measure P, under which values of traded assets relative tothe per-unit price of the money fund, Mt, are martingales. Let us now seewhat restrictions this places on the processes µt(T ) and σt(T ). Ito’sformula applied to B(t, T )∗ ≡ B(t, T )/Mt gives

dB(t, T )∗ = dB(t, T )/Mt − B(t, T )∗dMt/Mt

= B(t, T )∗[dB(t, T )/B(t, T )− dMt/Mt]

= B(t, T )∗[−µt(t, T ) + σt(t, T )2/2] · dt − σt(t, T ) · dWt

or

dB(t, T )∗/B(t, T )∗ = γt(t, T )σt(t, T ) · dt − σt(t, T ) · dWt,

where γs(t, T ) ≡ −µs(t, T )/σs(t, T ) + σs(t, T )/2. Now imagine, first, thatγs(t, T ) were independent of t, so that we could write γs(t, T ) = γs(T ), say;and, second, that the γs(T ) process were such that

E exp(∫ t

0

γs(T ) · dWs −12

∫ t

0

γs(T )2 · ds

)= 1. (10.22)

Then Girsanov’s theorem would give us a measure PT , say (not to be con-fused with forward measure PT ), such that WT

t = −∫ t

0γs(T ) · ds + Wt is

a Brownian motion. Under these conditions and in this new measure wewould have

dWTt = −γt(T ) · dt + dWt (10.23)

and

dB(t, T )∗/B(t, T )∗ = −σt(t, T ) · dWTt .

B(t, T )∗ would then be a continuous local martingale and, moreover, amartingale proper under Novikov condition E exp[ 12

∫ T

0σt(t, T )2 · dt] < ∞.

However, the existence of process γs(T )0≤s≤T and measure PT doesnot suffice to rule out arbitrage across the entire spectrum of availablebonds. As the notation indicates, both of these are specific to the maturitydate of the bond. For there to exist a single spot measure P that applies tobonds of all maturities it is necessary that processes γs(T )0≤s≤T be thesame for all T ∈ (0, T∞]. This implies that −µs(t, T )/σs(t, T )+σs(t, T )/2 =γs, say, where γs is independent of both t and T . This, in turn, imposes a

Page 494: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 477

restriction on the original drift and diffusion processes in the forward-ratemodel; namely, that∫ T

t

µs(u) · du = −[∫ T

t

σs(u) · du

]γs +

12

[∫ T

t

σs(u) · du

]2

for 0 ≤ s ≤ T ≤ T∞ and some process γs satisfying (10.22). Differentiat-ing with respect to T yields the condition

µs(T ) = −σs(T )γs + σs(T )σs(t, T ), (10.24)

which must hold for all maturities T . Inspecting (10.21), we see that themean drift of the bond price is rt + γtσt(t, T ), giving γt the usual inter-pretation as the compensation under P for bearing the instantaneous riskassociated with holding the discount bond. The assumption γs(t, T ) = γs

thus amounts to requiring that the “price” of risk (although obviously notthe amount of risk) be the same for bonds of all maturities. Of course, thesame assumption was implicit in the development of the spot-rate modelsof the previous section.9 Under this condition we then have

dWt = −γt · dt + dWt (10.25)

for all 0 ≤ t ≤ T ≤ T∞ under the single spot measure P. This guaranteesthat there are no opportunities for arbitrage among the spectrum of bondsand the money fund.

Combining (10.24) and (10.25) with (10.19) gives for the time-t instan-taneous forward rate at T under P

rt(T ) = r0(T ) +∫ t

0

[−σs(T )γs + σs(T )σs(s, T )] · ds

+∫ t

0

σs(T ) · (dWs + γs ds)

= r0(T ) +∫ t

0

σs(T )σs(s, T ) · ds +∫ t

0

σs(T ) · dWs (10.26)

for 0 ≤ t ≤ T ≤ T∞. The representation for the short rate then followsupon setting T = t:

rt = r0(t) +∫ t

0

σs(t)σs(s, t) · ds +∫ t

0

σs(t) · dWs. (10.27)

9It is easy to see that the one-factor model of Vasicek does satisfy (10.24). Referringto (10.15) we have σt(T ) = σe−b(T−t), σt(t, T ) = σβ(T −t), and µt(T ) = σt(T )σt(t, T )−σt(T )γ, so that µt(T )/σt(T ) − σt(t, T ) = −γ.

Page 495: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

478 Pricing Derivative Securities (2nd Ed.)

Valuing Contingent Claims

Having at hand an expression for the short rate, we can express the valueof the money fund as Mt = M0 exp(

∫ t

0rs · ds). Valuation now proceeds in

the same way that it did for the spot-rate models. That is, since Mt-normalized values of traded assets are martingales under P , the value at t

of a claim worth (an integrable quantity) VT at T ≥ t satisfies

M−1t Vt = EtM

−1T VT ,

so that

Vt = Et

(e−

RTt

ru·duVT

). (10.28)

Alternatively, we could work in forward measure PT and find Vt as

Vt = B(t, T )ETt VT = B(t, T )Et(VT QT /Qt),

where QT /Qt

= MtM−1T /B(t, T ). As usual, attitudes toward risk are

involved only indirectly through the change of measure from P to P orto PT . Typically, evaluating the expectations must be done numerically inthe HJM framework, but we will see shortly that analytical solutions areindeed possible when σt(T ) is simply specified.

10.3.2 Allowing Additional Risk Sources

In the one-factor model a single Brownian motion drives prices and yieldsof bonds of all maturities. This implies that yields (that is, average spotrates r(t, T )) on long- and short-maturity bonds always move in the samedirection. In fact, yield curves do sometimes twist, with short rates movingup as long yields decline, or vice versa. Capturing this feature requires atleast two risk sources. While this extension makes the model harder toimplement and requires some additional technical conditions, we shall seethat it poses no conceptual difficulty.

Taking σs(T ) and Ws in (10.19) to be k vectors, the motion of instan-taneous forward rates under P becomes

rt(T ) − r0(T ) =∫ t

0

µs(T ) · ds +∫ t

0

σs(T )′ · dWs

for 0 ≤ t ≤ T ≤ T∞. Written out, this is

rt(T ) − r0(T ) =∫ t

0

µs(T ) · ds +k∑

j=1

∫ t

0

σjs(T ) · dWjs, (10.29)

Page 496: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 479

where the Wjt are independent Brownian motions. The processesσjt(T ) can be specified so as to allow individual risk sources to exertdifferent degrees of influence on rates of different term. For example, HJM(1992) propose as a two-factor model

rt(T ) − r0(T ) =∫ t

0

µs(T ) · ds + σ1W1t + σ2e−θT

∫ t

0

eθs · dW2s,

with θ > 0. Here W1t affects the entire term structure uniformly, whereasthe influence of W2t declines as the term of lending increases.

Paralleling the development of (10.21), the motion of the T -maturingbond is described by

dB(t, T )/B(t, T ) = [rt − µt(t, T )+ σt(t, T )′σt(t, T )/2] · dt−σt(t, T )′ · dWt

or

dB(t, T )/B(t, T ) =

rt − µt(t, T ) +k∑

j=1

σjt(t, T )2

2

·dt−k∑

j=1

σjt(t, T ) · dWjt,

where

µs(t, T ) ≡∫ T

t

µs(u) · du,

σjs(t, T ) ≡∫ T

t

σjs(u) · du

and σt(t, T ) ≡ [σ1t(t, T ), . . . , σkt(t, T )]′. As before, we want to find condi-

tions for the existence of a measure under which B∗(t, T ) = M−1t B(t, T )

is a martingale. Normalizing by Mt removes rt from the drift, but to elim-inate the remaining part there must be a vector of risk prices γt(T ) =[γ1t(T ), . . . , γkt(T )]

′that solves

−µt(t, T ) + σt(t, T )′σt(t, T )/2 = σt(t, T )′γt(T ), (10.30)

and there must exist a measure PT under which Wt = Wt−∫ t

0γs(T ) ·ds is

a k-dimensional Brownian motion. Moreover, in order for the measure andthe vector of risk prices in the economy to be unique, such a relation as(10.30) must hold simultaneously for bonds of all maturities. This requiresthe risk prices for different maturities to be the same, so that γt(T ) = γt

for all T ∈ (t, T∞], and it restricts the drifts as

µt(T ) = −σt(T )′γt + σt(T )′σt(t, T )

Page 497: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

480 Pricing Derivative Securities (2nd Ed.)

for all t ∈ [0, T ]. Further conditions on γt and σt(t, T ) that

E exp(∫ t

0

γ ′s · dWs −

12

∫ t

0

γ ′sγs · ds

)= 1

and

E exp

[12

∫ T

0

σt(t, T )′σt(t, T ) · dt

]< ∞

then assure that the unique P exists under which each B∗(t, T ) is a mar-tingale, with

dB(t, T )∗/B(t, T )∗ = −σt(t, T )′ · dWt. (10.31)

10.3.3 Implementation and Applications

To see how the HJM model can be put to use, let us return to the one-factorversion and choose a specification that makes it relatively easy to developsome analytical results.

Setting σt(T ) = σ, an F0-measurable constant independent of time andmaturity, presents the basic forward-rate process as

rt(T ) − r0(T ) =∫ t

0

µs(T ) · ds + σWt, 0 ≤ t ≤ T ≤ T∞.

This is a continuous-time version of the discrete-time model proposed byHo and Lee (1986). Taking σt(t, T ) ≡

∫ T

tσt(u) ·du = σ(T − t) in expression

(10.24), the no-arbitrage condition on the drift becomes

µt(T ) = −γtσ + σ2(T − t);

and, corresponding to (10.26) and (10.27), the instantaneous forward ratefor time T and the short rate under the spot measure are now

rt(T ) = r0(T ) + σ2t(T − t/2) + σWt (10.32)

rt = r0(t) + σ2t2/2 + σWt. (10.33)

A price paid for this simplicity is that forward rates and short rates arenow normally distributed and therefore unbounded below, as in Vasicek’smodel. The benefit here, as there, is that we can produce explicit expressionsfor arbitrage-free prices of some interest-sensitive instruments. This nice

Page 498: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 481

tractability owes to the fact that bond prices and the value of the moneyfund are now ex ante lognormally distributed. Thus,

B(t, T ) = exp

(−

∫ T

t

rt(u) · du

)

= exp

[−

∫ T

t

r0(u) · du − σ2t

∫ T

t

(u − t/2) · du − σ(T − t)Wt

]= B0(t, T ) exp[−σ2(T − t)tT/2 − σ(T − t)Wt], (10.34)

where B0(t, T ) = exp[−∫ T

tr0(u)·du] is the T -maturing bond’s initial time-t

forward price. The dynamics of the bond’s price under P are thus

dB(t, T )/B(t, T ) = rt · dt − σ(T − t) · dWt.

Recalling (10.25), these can be stated in terms of a Brownian motion underthe natural measure as

B(t, T ) = B0(t, T ) exp[σ(T − t)

∫ t

0

γs · ds − σ2(T − t)tT/2 − σ(T − t)Wt

].

and

dB(t, T )/B(t, T ) = [rt + σ(T − t)γt] · dt − σ(T − t) · dWt.

Notice that in neither measure does the expectation of the spot price at t

equal initial forward price B0(t, T ). In particular,

EB(t, T ) = B0(t, T ) exp[−σ2t2(T − t)/2] < B0(t, T )

for t ∈ (0, T ).The value of the money fund at t is

Mt = M0eR

t0 rs·ds

= M0 exp∫ t

0

[r0(s) + σ2s2/2] · ds + σ

∫ t

0

Ws · ds

= M0B(0, t)−1 exp

(σ2t3/6 + σ

∫ t

0

Ws · ds

), (10.35)

so that (from example 33 in chapter 3)∫ t

0

Ws · ds ∼ N(0, t3/3) (10.36)

and

ln(Mt/M0) ∼ N [− lnB(0, t) + σ2t3/6, σ2t3/3].

Page 499: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

482 Pricing Derivative Securities (2nd Ed.)

The Radon-Nikodym process QT /Qt that changes from the spot mar-tingale measure to the forward martingale measure at T is given by

QT

Qt= B(t, T )−1M−1

T Mt

= B(t, T )−1 exp

∫ T

t

[r0(u) + σ2u2/2 + σWu] · du

= exp

[−σ2

6(T − t)3 − σ

∫ T

t

Wu · du + σ(T − t)Wt

]

= exp

[−σ2

2

∫ T

t

(T − u)2 · du − σ

∫ T

t

(T − u) · dWu

].

This is simply the Doleans exponential that changes Wt−σ∫ t

0 (T −u) ·du =Wt − σt(T − t/2) ≡ WT

t to a Brownian motion. Referring to (10.33), wesee that r0(T ) = ET rt(T ) in measure PT and that r0(t) = Etrt in mea-sure Pt. Thus, if we work in the appropriate forward martingale measure,instantaneous forward rates are unbiased estimators of future spot rates.

Pricing a Sure Cash Receipt

As a warm-up in applying the model, let us verify that valuation formula(10.28) leads to a price for the bond that is consistent with (10.34). TakingVT = B(T, T ) = 1 as the terminal value, we have M−1

t B(t, T ) = EtM−1T

and

B(t, T ) = Ete−

R Tt

ru·du

= Et exp

(−

∫ T

t

r0(u) · du − σ2

∫ T

t

u2/2 · du − σ

∫ T

t

Wu · du

)

= B0(t, T ) exp[−σ2(T 3 − t3)/6] Et exp

(−σ

∫ T

t

Wu · du

).

To evaluate the expectation, write∫ T

t

Wu · du =∫ T

t

Wt · du +∫ T

t

(Wu − Wt) · du

= (T − t)Wt +∫ T−t

0

Ws · ds,

Page 500: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 483

where Wt is a Brownian motion independent of Wt. Since∫ T−t

0 Ws · ds

is distributed as N [0, (T − t)3/3], we have

Et exp

(−σ

∫ T

t

Wu · du

)= exp[σ2(T − t)3/6 − σ(T − t)Wt]

and

B(t, T ) = B0(t, T ) exp[−σ2tT (T − t)/2 − σ(T − t)Wt]

as in (10.34). Of course, working in the forward measure makes the resulttrivial, for (10.4) gives

Vt = B(t, T )ET B(T, T ) = B(t, T )Et(QT /Qt) = B(t, T ).

European Options on the Money Fund

Letting M0 be the arbitrary initial value of one unit of the money fund, thevalue at t = 0 of a European call worth (MT − X)+ at T is

CE(M0, T ) = M0E[M−1T (MT − X)+]

= E(M0 − XM0M−1T )+ (10.37)

= E[M0 − XB(0, T )e−σ2T 3/6−σYT ]+,

where the last follows from (10.35) with∫ T

0Ws · ds ≡ YT . Since YT ∼

Z√

T 3/3 under P, where Z ∼ N(0, 1), the bracketed expression is positivewhen

Z > zX ≡− ln

[M0

B(0,T )X

]− σ2T 3/6

σ√

T 3/3.

Thus,

CE(M0, T ) = M0E1(zX ,∞)(Z)−XB(0, T )Ee−σ2T 3/6−σ√

T 3/3·Z1(zX ,∞)(Z).

Writing out the second expectation as an integral, completing the square,and simplifying yield the following formula for the value of the call:

CE(M0, T ) = M0Φ

ln[

M0B(0,T )X

]+ σ2T 3

6

σ√

T 3/3

−B(0, T )XΦ

ln[

M0B(0,T )X

]− σ2T 3

6

σ√

T 3/3

.

Page 501: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

484 Pricing Derivative Securities (2nd Ed.)

This corresponds to the Black-Scholes formula for a T -expiring call on anunderlying worth M0 and having volatility σT/

√3. In the usual “q” nota-

tion this is

CE(M0, T ) = M0Φ[q+(M0/BX ; σT/√

3)] − BXΦ[q−(M0/BX ; σT/√

3)].

Working instead in the forward measure as in (10.4), we would have

CE(M0, T ) = B(0, T )ET (MT − X)+

= B(0, T )E[(MT − X)+QT ]

= B(0, T )M0E[M−1

T (MT − X)+]B(0, T )

= E(M0 − XM0M−1T )+

as in (10.37), from which the rest follows.

Options on Discount Bonds

Let us now apply (10.28) to determine the price at time 0 of a t-expiringEuropean call struck at X on a discount bond that matures at T ≥ t. Theinitial value of the underlying is B(0, T ), and we can state the call’s initialvalue as

CE [B(0, T ), t] = B(0, t)Et[B(t, T ) − X ]+ = M0EM−1t [B(t, T ) − X ]+.

Using (10.33) and (10.34), this equals

B(0, t)e−σ2t3/6Ee−σYt [B0(t, T )e−σ2tTτ/2−στWt − X ]+, (10.38)

where Yt ≡∫ t

0Ws · ds and τ ≡ T − t. We recognize that Yt ∼ N(0, t3/3)

under P, but in evaluating the expectation we must take account of thedependence between Yt and Wt. The next step is to work out their jointdistribution under P.

This is easily done by developing the joint c.f.,

Ψ(ζ1, ζ2) = E exp(iζ1Yt + iζ2Wt).

The definition of the Riemann integral implies that Yt = t ·limn→∞

1n

∑nj=1 Wtj , where tj ≡ tj/n. Applying dominated convergence,

the c.f. can be expressed as

Ψ(ζ1, ζ2) = limn→∞

E exp

iζ1t · n−1n∑

j=1

Wtj + iζ2Wt

.

Page 502: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 485

The terms tn−1∑n

j=1 Wtj and Wt are clearly bivariate normal with zeromeans. Example 33 on page 98 shows that the first term has variancet3n−2(n + 1)(2n + 1)/6, while the variance of the second is, of course, t.Since the covariance under P is

tn−1n∑

j=1

EWtj Wt = tn−1n∑

j=1

t2j/n = t2n−1(n + 1)/2,

we conclude that

Ψ(ζ1, ζ2) = limn→∞

exp−1

2

[ζ21

t3(n + 1)(2n + 1)6n2

+ 2ζ1ζ2t2(n + 1)

2n+ ζ2

2 t

]= exp

[−1

2(ζ21 t3/3 + 2ζ1ζ2t

2/2 + ζ22 t)]

and therefore that Yt and Wt are jointly normal with

EWtYt ≡ EWt

∫ t

0

Ws · ds = t2/2. (10.39)

The variances being t3/3 and t, the correlation is ρ =√

3/2.

Recall from section 2.2.7 that when random variables X1 and X2 arebivariate normal with zero means, variances σ2

1 and σ22 , and correlation ρ,

then X1 is distributed as ρσ1σ2

X2+σ1

√1 − ρ2Z, where Z is standard normal

and independent of X2. It follows in the present instance that Yt can bedecomposed as

Yt =t

2Wt +

t

2√

3Ut, (10.40)

where Ut is an independent Brownian motion under P. Applying thisdecomposition to (10.38) gives

CE [B(0, T ), t]=B(0, t)e−σ2t3

8 E[B0(t, T )e−

σ2tT τ2 −σ(T− t

2 )Wt − e−σt2 WtX

]+

.

The bracketed expression is positive whenever

Wt√t

< zX ≡ ln[B0(t, T )/X ]στ

√t

− σT√

t

2.

Setting Z ≡ Wt/√

t ∼ N(0, 1) under P gives for the expectation in theabove expression

B0(t, T )E[e−σ2tTτ/2−σ(T−t/2)√

tZ1(−∞,zX)(Z)]−XE[e−σt3/2Z/21(−∞,zX)(Z)].

Page 503: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

486 Pricing Derivative Securities (2nd Ed.)

Writing the expectations in integral form, completing the squares in theexponents, and recalling that the bond’s forward price is B0 ≡ B0(t, T ) =B(0, T )/B(0, t), we obtain for CE [B(0, T ), t] the expression

B(0, t)B0(t, T )Φ[q+(B0/X ; σB)] − XΦ[q−(B0/X ; σB)], (10.41)

where

q±(B0/X ; σB) ≡ ln(B0/X)± σ2Bt/2

σB

√t

and σB ≡ στ ≡ σ(T − t) is the volatility of the bond’s price at t.To value the European put, we can work from put-call parity:

PE [B(0, T ), t] = CE [B(0, T ), t] − M0E

[B(t, T ) − X ]+ − [X − B(t, T )]+

Mt

= CE [B(0, T ), t] − E

[B(t, T ) − X

Mt/M0

]= CE [B(0, T ), t] − E

(e−

Rt0 rs·dsEte

−R

Tt

rs·ds)− Ee−

Rt0 rs·dsX

= CE [B(0, T ), t] − Ee−R

T0 rs·ds − B(0, t)X

= CE [B(0, T ), t] − B(0, t)[B0(t, T ) − X ].

Finally, (10.41) and the relation

−q±(x; σB) = q∓(x−1; σB),

yield the following solution for PE [B(0, T ), t]:

B(0, t)XΦ[q+(X/B0; σB)] − B0(t, T )Φ[q−(X/B0; σB)]. (10.42)

Recall now expressions (6.27) and (6.28), which are the forward-priceforms of the Black-Scholes formulas for European options on an underlyingwhose price St follows geometric Brownian motion. Letting σS denotethe volatility and restating the formulas to represent the time-0 prices ofcalls and puts expiring at t, (6.27) and (6.28) become

PE(S0, t) = B(0, t)XΦ[q+(X/f; σS)] − fΦ[q−(X/f; σS)] (10.43)

CE(S0, t) = B(0, t)fΦ[q+(f/X ; σS)] − XΦ[q−(f/X ; σS)], (10.44)

where f ≡ f(0, t) is the forward price for delivery of the underlying at t.A comparison with (10.41) and (10.42) shows that under the Ho-Lee versionof the HJM model the Black-Scholes formulas still generate the correctprices for options on discount bonds.

Page 504: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 487

Options on Coupon-Paying Bonds

A coupon bond makes regular cash payments to the bearer, typically equalto a fixed proportion c of the principal value, then returns the principalvalue at maturity. Taking the principal value to be unity, c represents theactual periodic cash payment. Clearly, one can view the coupon bond as aportfolio of discount bonds with maturities corresponding to the paymentdates. The value at t ∈ [0, Tn] of a bond that pays cash coupon c at datesT, T2, . . . , Tn and returns the principal value at Tn is

B(t, c; T1, . . . , Tn) = cn∑

j=jt

B(t, Tj) + B(t, Tn), (10.45)

where Tjt is the first payment date after t. Interestingly, in the one-factorHJM model the portfolio characterization applies not only to the couponbond itself but also to European options written on it. In other words, aEuropean option on a coupon bond can be valued as a portfolio of optionson discount bonds that mature at the various payment dates.

The explanation for this hinges on the fact that under model (10.32)prices of discount bonds of all maturities can be expressed as functions ofa single variable, the current short rate. This means that the price of thecoupon bond depends just on the short rate as well. To see that these thingsare true, combine (10.33) and (10.34), to get

B(t, Tj) |rt= B0(t, Tj) exp−σ2

2(Tj − t)2t − (Tj − t)[rt − r0(t)]

.

Now, since B(t, Tj) |rt↑ +∞ as rt → −∞ (negative short rates being possi-ble in this model) and B(t, Tj) |rt↓ 0 as rt → +∞, a unique value of rt willcorrespond to each positive B(t, Tj), and the same is true of the couponbond as well.

Considering, then, a European call struck at X and expiring at t ≤ Tn,

there is some critical rt—call it rXt —such that B(t, c; T1, . . . , Tn) |rX

t= X.

Letting Xj ≡ B(t, Tj) |rXt

be the corresponding value of B(t, Tj), we haveX = c

∑nj=jt

Xj + Xn; and the time-zero value of a t-expiring Europeancall, CE [B(0, c; T1, . . . , Tn), t; X ], can then be found as

M0EM−1T [B(t, c; T1, . . . , Tn) − X ]+

= Ee−R

t0 rs·ds[B(t, c; T1, . . . , Tn) − X ]1(−∞,rX

t )(rt)

= Ee−R t0 rs·ds

c

tn∑j=jt

[B(t, Tj) − Xj ] + [B(t, Tn) − Xn]

1(−∞,rXt )(rt)

Page 505: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

488 Pricing Derivative Securities (2nd Ed.)

= c

Tn∑j=jt

Ee−R

t0 rs·ds[B(t, Tj) − Xj]+ + Ee−

Rt0 rs·ds[B(t, Tn) − Xn]+.

This delivers the promised portfolio characterization as

CE [B(0, c; T1, . . . , Tn), t; X ]=c

Tn∑j=jt

CE [B(0, Tj), t; Xj]+CE [B(0, Tn), t; Xn],

from which the arbitrage-free price of the call on the coupon bond can befound by using (10.41) to value each of the terms.

Futures/Forward Prices under Stochastic Rates

Recall that the practice of marking to market distributes the gains or losseson futures positions over time, whereas with forward contracts they are con-centrated at the delivery date. Apart from default risk (which we ignore),this difference in timing is the only relevant distinction between futures andforwards from the standpoint of valuation. However, we saw in chapter 4that even the difference in timing does not matter if there is no uncertaintyabout the future course of interest rates over the terms of the contracts.Under that condition, futures prices and forward prices would still be thesame in arbitrage-free, frictionless markets. We are now in a position to seehow things can change when interest rates and bond prices over the livesof these contracts are not predictable.

The following simple setup suffices to get across the main idea. LetF(t, T )0≤t≤T be the futures price for a contract initiated at t = 0 andexpiring at T , and assume that the position is to be marked to marketat just one intermediate time, t∗. As usual Stt≥0 represents the priceof the underlying asset or commodity, for which there is assumed to be acontinuous, deterministic cost of carry at constant rate κ. For example, ifSt is the value in domestic currency of a position in a foreign certificate ofdeposit, then −κ would be the guaranteed rate of interest. Upon marking tomarket at t∗, the futures position, which is worth nothing initially, acquirespositive or negative value F(t∗, T )−F(0, T ); and there is another incrementof F(T, T )− F(t∗, T ) = ST − F(t∗, T ) at date T . In the absence of arbitragethe futures price at t = t∗ and at t = 0 must therefore satisfy, respectively,

Et∗e−

RTt∗ rt·dt[ST − F(t∗, T )] = 0 (10.46)

and

Ee−R

t∗0 rt·dt[F(t∗, T )−F(0, T )]+ Ee−

RT0 rt·dt[ST −F(t∗, T )] = 0. (10.47)

Page 506: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 489

Equation (10.47) is the key to the relation between F(0, T ) and f(0, T ),the price that would be set at t = 0 on a T -expiring forward contract. Thatthe second term in (10.47) is zero is seen by writing it as

E

e−R

t∗0 rt·dtEt∗e

−R

Tt∗ rt·dt[ST − F(t∗, T )]

and applying (10.46). The initial futures price therefore is given by

F(0, T ) = B(0, t∗)−1Ee−R t∗0 rt·dtF(t∗, T ).

It is clear that F(t∗, T ) must be the same as the forward price at t∗, sincefrom this point the two contracts have precisely the same payoff streams.Therefore, applying cost-of-carry relation (4.10),

F(t∗, T ) = f(t∗, T ) = B(t∗, T )−1St∗eκ(T−t∗).

To progress from here and relate initial values F(0, T ) and f(0, T )requires placing some structure on the underlying and on the bond price.For the latter we shall stick with the one-factor HJM model, and we shallassume that St evolves as

dSt/St = (rt + κ) · dt + σS · dWt.

Here Wtt≥0 is a Brownian motion under P, distinct from but possiblycorrelated with the Wtt≥0 that drives interest rates. Then

St∗ = S0 exp

(∫ t∗

0

rt · dt + κt∗ − σ2St∗/2 + σSWt∗

)and

F(0, T ) = B(0, t∗)−1E[e−

Rt∗0 rt·dtB(t∗, T )−1St∗e

κ(T−t∗)]

= B(0, t∗)−1S0eκT E

[B(t∗, T )−1e−σ2

St∗/2+σSWt∗].

From (10.34) the price of the unit bond at t∗ is

B(t∗, T ) = B(0, t∗)−1B(0, T )e−σ2(T−t∗)t∗T/2−σ(T−t∗)Wt∗ ,

and so

F(0, T ) = [B(0, T )−1S0eκT ]eσ2(T−t∗)t∗T/2−σ2

St∗/2Eeσ(T−t∗)Wt∗+σSWt∗ .

The factor in brackets is the forward price, f(0, T ). To develop the remainingpart, take ρ to be the instantaneous correlation between Wt and Wtand express Wt∗ as ρWt∗ +

√1 − ρ2Ut∗ ,where Utt≥0 and Wtt≥0 are

independent Brownian motions under P. Expression (10.33) shows that ρ

Page 507: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

490 Pricing Derivative Securities (2nd Ed.)

can be interpreted as the correlation between the short rate and the instan-taneous return on the underlying asset. Evaluating the expectation andsimplifying then give

F(0, T ) = f(0, T ) expσt∗(T − t∗)[ρσS + σ(T − t∗/2)]. (10.48)

It is now possible to see what this particular model implies about therelation between forward and futures prices. Obviously, they must agree ifeither t∗ = 0 or t∗ = T , since the futures is then not marked to marketbetween the present time and expiration, and (10.48) is clearly consistentwith this. Putting σ = 0 shows that the two prices are equal also whenthere is no uncertainty in future bond prices, as we had already seen inchapter 4. The additional structure imposed here does provide an importantnew insight, however: when t∗ ∈ (0, T ) and σ > 0 the futures price isunambiguously greater than the forward price so long as the return on theunderlying is not negatively correlated with the short rate. The intuitionfor this in the case ρ > 0 was already provided in chapter 4. That is,when the underlying return and the short rate tend to move in concert,intermediate gains received on a long position in futures as it is markedto market can usually be reinvested at higher short rates; and, conversely,there is usually lower opportunity cost associated with losses on the futures.Since the terminal value of the futures position is then greater when ρ > 0than otherwise, it does make sense that the futures price would be higherthan the forward price under this condition. That F(0, T ) > f(0, T ) in thismodel even when ρ = 0 is particularly interesting, since it is not so intuitive.Notice that in this case the futures price no longer depends on the volatilityof the underlying, σS .

Equity/Index Options under Stochastic Rates

Let us now see how to price a T -expiring European option on an underlyingstock or index when interest rates are stochastic. Taking S0 as the initialprice, assume that

dSt/St = (rt − δt) · dt + σS · dWt,

under P, where δt is a deterministic dividend rate. The price at date T isthen

ST = S0e−

R T0 δt·dt exp

(∫ T

0

rt · dt − σ2ST/2 + σSWT

).

Page 508: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 491

Since the short rate is stochastic, two risk factors now generally drive theunderlying price. Some work is needed to sort these out in a way that letsus price an option.

Using (10.35), the cumulative short rate is∫ T

0

rt · dt = − lnB(0, T ) + σ2T 3/6 + σ

∫ T

0

Wt · dt,

and therefore

ST = f exp

(σ2T 3/6 − σ2

ST/2 + σ

∫ T

0

Wt · dt + σSWT

), (10.49)

where f ≡ f(0, T ) = B(0, T )−1S0e−

RT0 δt·dt is the forward price for T deliv-

ery of the underlying. As in the analysis of futures and forwards let ρ bethe correlation between Wt and Wt and write∫ T

0

Wt · dt = ρ

∫ T

0

Wt · dt +√

1 − ρ2

∫ T

0

Ut · dt,

where Ut is a Brownian motion independent of Wt. Since EWT

∫ T

0 Wt ·dt = T 2/2 by (10.39), it follows that EWT

∫ T

0 Wt · dt = ρT 2/2. Then,recognizing that

∫ T

0 Wt · dt ∼ N(0, T 3/3) and WT ∼ N(0, T ), we can writeσ∫ T

0 Wt · dt + σSWT as

σ∗√

TZ∗ ≡ σr

√TZr + σS

√TZS ,

say, where (Zr, ZS) are bivariate standard normal with EZrZS ≡ ρrS =ρ√

3/2, σ2r ≡ σ2T 2/3 and σ2

∗ ≡ σ2r + 2ρrSσrσS + σ2

S . Finally, note that

ρrSσrσS = E(σrZr · σSZS)

= E[σrZr(σ∗Z∗ − σrZr)]

= ρr∗σrσ∗ − σ2r ,

where ρr∗ ≡ EZrZ∗. With these conventions ST can be expressed as

ST = f exp[(

σ2r − σ2

S

)T/2 + σ∗

√TZ∗

]= f exp

[(σ2

r + ρrSσrσS − σ2∗/2

)T + σ∗

√TZ∗

]= f exp

[(ρr∗σrσ∗ − σ2

∗/2)T + σ∗√

TZ∗]. (10.50)

Putting this to work to value a T -expiring European call, we have

CE(S0, T ) = Ee−R

T0 rt·dt(ST − X)+

= B(0, T )e−σ2rT/2Ee−σr

√TZr

[fe(ρr∗σrσ∗−σ2

∗/2)T+σ∗√

TZ∗ − X]+

.

Page 509: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

492 Pricing Derivative Securities (2nd Ed.)

Since Zr and Z∗ are bivariate normal with correlation ρr∗, Zr can be writ-ten as

Zr = ρr∗Z∗ +√

1 − ρ2r∗Z∗∗,

where Z∗∗ is standard normal and independent of Z∗. Splitting Zr in thisway and setting

Ee−σr

√T√

1−ρ2r∗Z∗∗ = eσ2

r(1−ρ2r∗)T/2,

the expression for the call becomes

B(0, T )e−ρ2r∗σ2

rT/2Ee−ρr∗σr

√TZ∗

[fe(ρr∗σrσ∗−σ2

∗/2)T+σ∗√

TZ∗ − X]+

.

The bracketed expression is positive when Z∗ > −q−(f/X ; σ∗)− ρr∗σr

√T ,

where

q±(x; σ∗) =lnx ± σ2

∗T/2σ∗

√T

.

The value of the call therefore is

B(0, T )fe−(σ∗−ρr∗σr)2T/2Ee(σ∗−ρr∗σr)√

T Z∗1A(Z∗)

−B(0, T )Xe−ρ2r∗σ2

rT/2Ee−ρr∗σr

√TZ∗1A(Z∗), (10.51)

where A is the interval (−q−(f/X ; σ∗) − ρr∗σr

√T ,∞). Writing out the

expectations,

CE(S0, T ) = B(0, T )f∫A

1√2π

exp−1

2[z − (σ∗ − ρr∗σr)

√T]2 · dz

−B(0, T )X∫A

1√2π

exp[−1

2(z − ρr∗σr

√T)2]· dz.

Finally, changing variables in the integrals and simplifying give as thedesired result

CE(S0, T ) = B(0, T )fΦ[q+(f/X ; σ∗)] − B(0, T )XΦ[q−(f/X ; σ∗)].(10.52)

This again parallels the standard Black-Scholes formula, (10.44), in thatσ2∗T is the variance of ln(ST /S0) when interest rates are stochastic, just as

σ2ST is the variance of ln(ST /S0) when rates are deterministic and St has

volatility σS . In the present instance σ2∗ can be expressed in terms of the

primitive parameters in the models for St and rt as

σ2∗ = σ2T 2/3 + ρσTσS + σ2

S ,

where σ is the standard deviation of the changes in short and forwardrates, σS is the volatility coefficient of the underlying price, and ρ is the

Page 510: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 493

instantaneous correlation between movements in interest rates and in thelog of the underlying price.10

Caps and Floors

An interest rate “cap” is an instrument that limits the interest cost toone who borrows at a floating rate. It does so by paying the holder wheneach interest installment is due the difference between what is owed at thefloating rate and what would be owed at some prearranged, fixed rate—the “cap”. An interest rate “floor” puts a lower bound on what is receivedby one who lends at a floating rate. It does so by paying the holder thedifference between what would be due at the fixed floor rate and what isdue at the floating rate. There is an extensive and very liquid over-the-counter market for interest rate caps and floors. The common arrangementis to base the floating payment on the LIBOR (London Interbank OfferingRate) as of the beginning of each payment period. Caps and floors normallyextend for many payment periods, thereby comprising portfolios of one-period “caplets” and “floorlets” with staggered payment dates. Once oneknows how to value individual caplets and floorlets with arbitrary fixedrates and expirations, the caps and floors themselves can be priced just byadding up.

First, we need to understand the market convention regarding con-tracts based on LIBOR. Let LT0(T0, T1) be the spot LIBOR at time T0

for loans during [T0, T1]. It is a “spot” rate since it is set at the timethe loan commences. By convention, LIBOR is quoted as a simply com-pounded annual rate, meaning that the payment due at T1 is the rateat T0 times the length of the period in years, or LT0(T0, T1)τ , whereτ ≡ T1 − T0.11 LT0(T0, T1) differs from r(T0, T1), the average spot ratebetween T0 and T1, in that (by our convention) r(T0, T1) is continuouslycompounded. Relating these to prices of unit discount bonds, we haveB(T0, T1)[1 + τLT0(T0, T1)] = B(T0, T1)eτr(T0,T1) = 1, since a loan ofB(T0, T1) currency units commencing at T0 would require repayment of

10For generalizations of (10.52) to other volatility functions see Musiela andRutkowski (1998, pp. 362–364).

11We abstract from the various “day-count” conventions that are often used in actual

transactions; for example, equating T1 − T0 to the number of intervening days dividedby 360 and excluding holidays that fall at either end. In what follows we also ignore thetiny default risk that is present in this interbank loan market.

Page 511: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

494 Pricing Derivative Securities (2nd Ed.)

one unit at T1. Thus,

LT0(T0, T1) = τ−1[B(T0, T1)−1 − 1].

Forward LIBORs are related analogously to bonds’ forward prices.For t < T0

Lt(T0, T1) = τ−1[Bt(T0, T1)−1 − 1] = τ−1[B(t, T0)/B(t, T1) − 1]

represents forward LIBOR as of time t for a τ -year loan commencing at T0.The dates T0, T1 that define the contract are known as “reset” and “settle-ment” dates, respectively. For interest rate caps and floors with settlementdates Tjn

j=1 the payment at Tj is based on spot rate LTj−1(Tj−1, Tj),which is thus reset at each date Tj−1.

In section 10.4 we introduce a specific model for forward LIBOR thatyields simple, Black-Scholes-like formulas for prices of caplets and floorlets.To motivate and contrast with this later approach, let us now work outthe prices implied by our simple one-factor HJM model. We begin with acaplet that extends from T0 to T1 with capped rate K. For each unit ofprincipal value the holder receives at T1 the difference between what wouldbe owed on a τ -year loan at time-T0 spot LIBOR and what would be owedat simply compounded rate K, provided this difference is positive. Thus,the caplet’s value at T1 is C(T1, K; T0, T1) = τ [LT0(T0, T1) − K]+, whichamounts to the payoff of τ units of T1-expiring call options on LIBOR “inarrears”—i.e., on spot LIBOR at T0. Since the ultimate goal is to valuecaps that extend for many periods, we need to price caplets that begin atfuture dates. We will therefore work out the caplet’s arbitrage-free value asof time zero, C(0, K; T0, T1).

Working in spot measure P, we have

C(0, K; T0, T1) = M0EM−1T1

τ [LT0(T0, T1) − K]+ (10.53)

= Ee−R T10 rs·ds[B(T0, T1)−1 − (1 + τK)]+.

Expression (10.33) gives

e−R T10 rs·ds = exp

∫ T1

0

[r0(s) +

σ2

2s2 + σWs

]· ds

= B(0, T1) exp

(−σ2T 3

1 /6 − σYT0 − σ

∫ T1

T0

Ws · ds

)= B(0, T1) exp

(− σ2T 3

1 /6 − σYT0 − στWT0 − σYτ

),

Page 512: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 495

where YT0 ≡∫ T0

0 Ws ·ds and Yτ ≡∫ T1

T0(Ws−WT0)·ds. Note that Yτ is equiv-

alent to∫ τ

0Ws ·ds, where Ws0≤s≤τ is a Brownian motion independent of

events in FT0 ; therefore, Yτ is distributed as N(0, τ3/3) conditional on FT0 .Also, recall from (10.40) the decomposition YT0 = T0

2 WT0 + T0

2√

3UT0 , where

Utt≥0 is another Brownian motion independent of Wtt≥0 and thereforeof Yτ . Thus, exp(−

∫ T1

0 rs · ds) equals

B(0, T1) exp[−σ2 T 3

1

6− σ

T0

2√

3UT0 − σ

(τ +

T0

2

)WT0 − σYτ

].

Next, from (10.34) we have

B(T0, T1)−1 = B0(T0, T1)−1eσ22 τT0T1+στWT0 = (1 + τL0)e

σ22 τT0T1+στWT0 ,

where L0 ≡ L0(T0, T1). Assembling the pieces and using Ee−σT0UT0/(2√

3) =eσ2T 3

0 /24 and Ee−σYτ = eσ2τ3/6, we express C(0, K; T0, T1) as B(0, T1) ×e−

σ2T02 (T1τ+

T204 )Ee−σ(

T02 +τ)WT0 [(1 + τL0)e

σ22 τT0T1+στWT0 − (1 + τK)]+.

The bracketed expression is positive if and only if

WT0 > − ln(1 + τL0) − ln(1 + τK)στ

− σ

2T1T0.

Evaluating the expectations and simplifying, one obtains as the initial per-unit-principal value of the caplet

C(0, K; T0, T1) = B(0, T1)[(1 + τL0)Φ(q+) − (1 + τK)Φ(q−)],

where

q± ≡ ln(1 + τL0) − ln(1 + τK)στ

√T0

± σ

2τ√

T0.

The floorlet, with settlement value F (T1, K; T0, T1) = τ [K −LT0(T0, T1)]+, amounts to a portfolio comprising τ European put optionson the LIBOR, so its value at t = 0 comes from put-call parity:

F (0, K; T0, T1) = C(0, K; T0, T1) − Ee−R T10 rs·dsτ [LT0(T0, T1) − K].

Working out the expectation, one obtains

F (0, K; T0, T1) = C(0, K; T0, T1) − B(0, T1)τ(L0 − K)

= B(0, T1)[(1 + τK)Φ(−q−) − (1 + τL0)Φ(−q+)].

Page 513: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

496 Pricing Derivative Securities (2nd Ed.)

The per-unit-principal values of caps and floors that extend from T0 ≥ 0to Tn with settlement dates at T1, T2, . . . , Tn would then be calculated as

C(0, K; T0, T1, . . . , Tn) =n∑

j=1

C(0, K; Tj−1, Tj)

F (0, K; T0, T1, . . . , Tn) =n∑

j=1

F (0, K; Tj−1, Tj).

Swaps and Swaptions

While payoff functions of caps resemble those of portfolios of options,interest-rate “swaps” are more like collections of futures or forward con-tracts. A simple (forward) interest-rate swap is merely an advance agree-ment to exchange payments at a fixed interest rate on some “notional”principal for payments made at a floating rate. The agreement becomeslive at some future date T0, and the exchanges take place at later datesT1, T2, . . . , ending at some Tn. From the standpoint of one who pays thefixed rate, the arrangement is a “payer” swap, while to the counterpartyit is a “receiver” swap. The term notional conveys the fact that the swapagreement involves no exchange of principal; indeed, what changes handsat intervals is merely the difference between the fixed-rate and floating-ratepayments.12 The floating-rate payment at the end of a period is commonlybased on the spot LIBOR at its beginning; thus, at Tj the payment perunit notional is (Tj − Tj−1)LTj−1 (Tj−1, Tj). The fixed payment for a swapinitiated at t = 0 is at some rate K0 = K0(T0, T1, . . . , Tn), which does notchange during the life of the contract. This is determined so as to give theswap zero initial value—and so depends on the initiation, commencement,and payment dates. Of course, K0 having once been set, the swap’s marketvalue will change with time and market conditions. A prospective fixed rateKt that would equate to zero the value of the swap at some t ∈ (0, Tn−1)is called the “forward swap rate”. Clearly, swaps simply allow floating-rateloans to be restructured, in effect, as fixed-rate loans, or vice versa. “Swap-tions” are options to take one position or the other in a swap arrangementat some future time.

12In practice, swap agreements may call for fixed and floating payments to be madeat different times; e.g., annually for fixed and quarterly for floating. We abstract fromthis trivial complication to simplify notation.

Page 514: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 497

Swaps themselves can be valued through static replicating argumentswithout imposing structure on the dynamics of interest rates. Consider anarrangement that calls for exchanging floating-rate payments for fixed-ratepayments at dates T1, T2, . . . , Tn. To simplify, set Tj − Tj−1 = τ , the samefor each j, and take the notional to be unity. Time T0 = T1 − τ representsthe (forward) commencement date of the swap. Now the claim on the fixedreceipts is like a bond that pays fixed coupons τK0 from T1 to Tn but doesnot return the principal value. This is obviously worth τK0

∑nj=1 B(T0, Tj)

at T0. With a little more reflection one can see also that the claim on thefloating payments is worth 1 − B(T0, Tn) at T0. This is because the samestream could be purchased by selling one Tn-maturing discount bond atT0, lending one currency unit at spot LIBOR, and reinvesting the periodicreceipts at spot LIBOR at each time up to Tn−1. Netting fixed againstfloating gives the swap’s value to the party receiving fixed.

While no dynamic model was needed for this, it is a good exer-cise to verify that our martingale pricing apparatus leads to the sameresult for the value of the floating leg. Recalling that B(Tj−1, Tj)−1 =1 + τLTj−1 (Tj−1, Tj), the value of the floating leg at time T0 is

MT0ET0

n∑j=1

M−1Tj

τLTj−1 (Tj−1, Tj)

=n∑

j=1

ET0e−

R TjT0

rt·dt[B(Tj−1, Tj)−1 − 1]

=n∑

j=1

ET0

e−R Tj−1

T0rt·dtETj−1

e−

R TjTj−1

rt·dt

B(Tj−1, Tj)

− e−R Tj

T0rt·dt

.

The inner expectation being unity, this reduces ton∑

j=1

[B(T0, Tj−1) − B(T0, Tj)] = 1 − B(T0, Tn).

The swap’s value at T0 to the party that receives fixed is therefore

S(T0,K0; T0, T1, . . . , Tn) = τK0

n∑j=1

B(T0, Tj) + B(T0, Tn) − 1.

Notice that the first two terms together amount to the value at T0 of a unitTn-maturing coupon bond, so that in the notation of (10.45) the value ofthe swap at T0 is simply B(T0, K0τ ; T1, . . . , Tn) − 1. Since one can arrange

Page 515: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

498 Pricing Derivative Securities (2nd Ed.)

at t ≤ T0 to possess a Tj-maturing bond at T0 just by buying one andholding, the value of the swap at t = 0 is

S(0, K0; T0, T1, . . . , Tn) = τK0

n∑j=1

B(0, Tj) + B(0, Tn) − B(0, T0)

= B(0, K0τ ; T1, . . . , Tn) − B(0, T0).

Swap rate K0 would be set as

K0(T0, T1, . . . , Tn) =B(0, T0) − B(0, Tn)

τ∑n

j=1 B(0, Tj)

to equate the swap’s initial value to zero, while the forward swap rate atsome t ∈ (0, Tn−1) would be

Kt(Tjt , . . . , Tn) =B(t, Tjt−1) − B(t, Tn)

τ∑n

j=jtB(t, Tj)

,

where jt is the first payment date after t. The actual value of the swap att can then be expressed in terms of the forward swap rate as

S(t, K0; Tjt−1, Tjt , . . . , Tn) = τ(Kt − K0)n∑

j=jt

B(t, Tj).

At some t ≤ T0 an option to take the fixed side of the swap at T0 isworth

MtEtM−1T0

[B(T0, K0τ ; T0, T1, . . . , Tn) − 1]+,

which is precisely the value of a call option with unit strike on a bond ofunitary principal value that pays coupon K0τ .

10.4 The LIBOR Market Model

Although the HJM approach offers an elegant and comprehensive wayto model interest-sensitive instruments, it has not been widely adoptedby traders, who must rely on such models to make rapid pricing deci-sions. The fact that instantaneous forward rates are not directly observ-able makes it difficult to formulate good models for the volatility processesσt(T )0≤t≤T≤T∞ that control the stochastic behavior of rates and bondprices. What are readily observable are dealers’ quotes on caps, floors, andswaptions, for which there are very active markets. While we were able toobtain simple formulas for these instruments in the illustrative one-factor,

Page 516: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 499

constant-volatility setup of the previous section, things are not so easy withspecifications of HJM that generate nontrivial volatility term structures andtwists in yield curves. Thus, it is difficult to use these market data to cal-ibrate the model. In fact, to price the individual caplets and floorlets thatmake up caps and floors, traders routinely use simple adaptations of Black’s(1976b) formulas (6.33) for options on futures. As we saw in section 6.3.3,those formulas were derived by modeling the settlement value of the under-lying as lognormally distributed under P and regarding numeraire processMt as deterministic. How can this apply to caplets and floorlets? Treatingthem as options, at least, does make sense, because the settlement valuesof caplets and floorlets over the interval [T0, T1 = T0 +τ ] are the option-likefunctions τ [LT0(T0, T1) − K]+ and τ [K − LT0(T0, T1)]+ of spot LIBOR atT0, LT0(T0, T1) = τ−1[B(T0, T1)−1 − 1]. The adapted Black’s formulas forthe initial values of these are

C(0, K; T0, T1) = B(0, T1)τ [L0Φ(l+) − KΦ(l−)] (10.54a)

F (0, K; T0, T1) = B(0, T1)τ [KΦ(−l−) − L0Φ(−l+)], (10.54b)

where

l± ≡ln(L0/K) ± λ2

T0T0/2

λT0

√T0

,

L0 ≡ L0(T0, T1) is forward LIBOR as of time zero, and λT0 is a positive“volatility” constant.

Despite the aptness of the option analogy, how to reconcile these for-mulas with the HJM formulation is far from apparent. In our illustrativeapplication beginning on page 493 we valued the caplet as

C(0, K; T0, T1) = M0EM−1T1

τ [LT0(T0, T1) − K]+,

using a one-factor HJM model for instantaneous forward rates with con-stant, maturity-invariant volatilities, σt(T ) = σ. Even in that simplest pos-sible version of HJM neither is LT0(T0, T1) distributed as lognormal underP nor—quite obviously—is MT1 deterministic, so it is not surprising that(10.54a) is inconsistent with our solution:

C(0, K; T0, T1) = B(0, T1)[(1 + τL0)Φ(q+) − (1 + τK)Φ(q−)]

q± ≡ ln[(1 + τL0)/(1 + τK)] ± σ2τ2T0/2στ

√T0

.

Whether Black’s formulas could be consistent with any specification ofHJM—and therefore with an arbitrage-free market for bonds and the money

Page 517: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

500 Pricing Derivative Securities (2nd Ed.)

fund—was for a long time in doubt, but a reconciliation was provided byBrace, Gatarek, and Musiela (1997). This section introduces their approach,now commonly referred to as the LIBOR market model.13 We see first howto derive Black’s formula in a way that is consistent with the general mul-tidimensional form of HJM, then discuss some of the general requirementsfor using the LIBOR market model to price derivatives other than caps andfloors.

10.4.1 Deriving Black’s Formulas

Let us begin with a quick review of the general multidimensional versionof HJM. We have a family of bond prices B(t, T )0≤t≤T at maturitiesT ∈ [0, T∞], expressible in terms of instantaneous forward rates as B(t, T ) =exp(−

∫ T

t rt(u) · du). There is also the money market fund, whose per-unitprice at t is Mt = M0 exp(

∫ t

0 rs · ds), where rt ≡ rt(t) is the instantaneousspot rate. Forward rates evolve as

rt(T ) − r0(T ) =∫ t

0

µs(T ) · ds +∫ t

0

σs(T )′ · dWs,

where Wt is an k-valued standard Brownian motion on (Ω,F , Ft, P).Drift processes µt(T ), risk prices γt, and volatilities σt(T ) satisfythe conditions (laid out in section 10.3.2) that assure that bond prices andthe money fund are arbitrage-free. There are thus a spot measure P and anassociated Brownian motion Wt such that normalized processes

B(t, T )∗ = B(t, T )/Mt

are martingales, evolving as

dB(t, T )∗/B(t, T )∗ = −σt(t, T )′ · dWt, 0 ≤ t ≤ T ≤ T∞. (10.55)

In addition, for each T0, T1 ∈ (0, T∞] with T0 ≤ T1 there are a forwardmeasure PT1 and an associated k-valued Brownian motion WT1

t suchthat normalized processes Mt/B(t, T1) and B(t, T0)/B(t, T1)0≤t≤T0≤T1

are martingales.The first key to developing Black’s formulas for T1-settled caplets and

floorlets is to shift from spot measure P to the forward measure PT1 thatis associated with the object’s particular settlement date—that is, to the

13Musiela and Rutkowski (2005) give a comprehensive account that encompassesthe now extensive literature on models of this type. Shreve (2004) offers a brief butinsightful treatment of the theory. Brigo and Mercurio (2001) and Joshi (2003) givedetails on implementing the model.

Page 518: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 501

martingale measure that applies when B(t, T1)0≤t≤T1 is numeraire. Thesecond key is to recognize that with this numeraire the forward LIBORprocess Lt(T0, T1)0≤t≤T0 is in fact the normalized price of a traded asset.That this is so can be seen directly from the relation that defines the simplycompounded rate: B(t, T1)[1 + τLt(T0, T1)] = B(t, T0). Thus,

B(t, T1)Lt(T0, T1) = τ−1[B(t, T0) − B(t, T1)]

corresponds to the value at t of a portfolio long T0-maturing bonds of prin-cipal value τ−1 and short a like number of bonds of the later maturity.In an arbitrage-free market Lt(T0, T1)0≤t≤T0 is therefore a martingaleunder forward measure PT1 . As such, by the martingale representation the-orem there are an k-valued, Ft-adapted process λT0(t)0≤t≤T0 and aBrownian motion WT1

t such that

dLt(T0, T1) = λT0(t)′Lt(T0, T1) · dWT1

t . (10.56)

Under the condition that λT0(·) is bounded and piecewise continuous andthat initial forward LIBOR L0(T0, T1) is positive, Brace et al. (1997) showthat (10.56) has the unique, strictly positive solution

Lt(T0, T1)L0(T0, T1)

= exp[∫ t

0

λT0(s)′ · dWT1

s − 12

∫ t

0

λT0(s)′λT0(s) · ds

].

(10.57)Under the additional condition that λT0(·) is deterministic (F0-measurable), it follows that Lt(T0, T1) is distributed as lognormalunder PT1 .

That λT0(·) is deterministic is in fact the critical assumption, for ittakes us directly to the desired simple formulas for caplets and floorletsthat make it easy to calibrate the model from market prices. Indeed, we arenow essentially done. Putting

λ2T0

≡ T−10

∫ T0

0

λT0(s)′λT0(s) · ds, (10.58)

writing

C(0, K; T0, T1) = B(0, T1)τET1 [LT0(T0, T1) − K]+

= B(0, T1)τE[L0e

λT0

√T0 Z−λ2

T0T0/2 − K

]+for Z ∼ N(0, 1), and working out the expectation yield (10.54a). Formula(10.54b) comes in a similar way from

F (0, K; T0, T1) = B(0, T1)τET1 [K − LT0(T0, T1)]+

or via the parity relation.

Page 519: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

502 Pricing Derivative Securities (2nd Ed.)

Of course, we have not yet reconciled with HJM, for it remains to showthat processes λT0(t)0≤t≤T0 with the required properties correspond tosome set of volatilities σt(T0)0≤t≤T0≤T∞ that characterize instantaneousforward-rate processes rt(T ). To do this requires relating the evolutionof forward LIBOR,

Lt(T0, T1) = τ−1[B(t, T0)/B(t, T1) − 1], 0 ≤ t ≤ T0,

to the evolutions of discount bonds, whose dependence on forward-ratevolatilities is known from the HJM theory. More particularly, we shall relateto the Mt-normalized processes

B(t, T1)∗ = B(t, T1)/MtB(t, T0)∗ = B(t, T0)/Mt,

which we know to be martingales in spot measure P. Recall from (10.55)that these evolve under P as dB(t, T1)∗/B(t, T1)∗ = −σt(t, T1)′ · dWt anddB(t, T0)∗/B(t, T0)∗ = −σt(t, T0)′ · dWt. Starting with

dLt(T0, T1) = τ−1d[B(t, T0)/B(t, T1)] = τ−1d[B(t, T0)∗/B(t, T1)∗]

and applying Ito’s formula to the last expression lead to

d[B(t, T0)∗/B(t, T1)∗]B(t, T0)∗/B(t, T1)∗

=dB(t, T0)∗

B(t, T0)∗− dB(t, T1)∗

B(t, T1)∗

+d〈B(·, T1)∗〉t[B(t, T1)∗]2

− d〈B(·, T0)∗, B(·, T1)∗〉tB(t, T0)∗B(t, T1)∗

= −σt(t, T0)′ · dWt + σt(t, T1)′ · dWt

+ σt(t, T1)′σt(t, T1) · dt − σt(t, T0)′σt(t, T1) · dt

= [σt(t, T1) − σt(t, T0)]′·d[Wt +

∫ t

0

σu(u, T1)·du

].

Although this shows that B(t, T0)∗/B(t, T1)∗ is not a martingale under P,it is in fact a martingale in forward measure PT1 , since B(t, T0)∗/B(t, T1)∗ =B(t, T0)/B(t, T1) is the appropriately normalized price of a traded asset.Thus, we can write the above as

d[B(t, T0)∗/B(t, T1)∗]B(t, T0)∗/B(t, T1)∗

= [σt(t, T1) − σt(t, T0)]′ · dWT1t ,

where WT1t = Wt +

∫ t

0σu(u, T1) · du is an k-valued Brownian motion

under PT1 . Finally, expressing the ratio of bond prices in terms of forwardLIBOR gives

dLt(T0, T1) = λT0(t)′Lt(T0, T1) · dWT1

t ,

Page 520: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 503

with

λT0(t) ≡[1 + τLt(T0, T1)

τLt(T0, T1)

][σt(t, T1) − σt(t, T0)]. (10.59)

Expressing (10.59) for all reset/settlement pairs Tj−1, Tjnj=1 that

make up the market’s “tenor structure” provides a recurrence relation forthe σt(t, Tj)n

j=1. Given some specification of an initial process σt(t, T0),this allows the bond volatilities at each maturity to be expressed interms of prespecified LIBOR volatilities with the appropriate properties—i.e., bounded and deterministic—and of the evolving LIBORs themselves.Choosing specific LIBOR volatility functions requires calibrating the modelto market data, a procedure that we discuss briefly below. While thereis considerable flexibility (i.e., arbitrariness) in this, we clearly want eachprocess λTj−1(t)0≤t≤Tj−1 to be such that

∫ Tj−1

0λTj−1 (s)′λTj−1 (s) · ds =

T0λ2Tj−1

, where λTj−1 is the (scalar) volatility that equates (10.54a) and(10.54b) to market quotes for Tj-settled caplets and floorlets.

To illustrate the construction of the σt(t, Tj), let Tjn−1j=0 denote the

strictly increasing sequence of extant reset dates and set τj ≡ Tj − Tj−1.Then (10.59) gives

σt(t, Tj) = σt(t, Tj−1) + λTj−1 (t)[

τjLt(Tj−1, Tj)1 + τjLt(Tj−1, Tj)

], 0 ≤ t ≤ Tj−1.

To start the recursion one must choose an initial process σt(t, T0)0≤t≤T0 .The only logical (as opposed to empirical) requirement is that σT0(T0, T0) =0, since this represents the volatility vector of a T0-maturing, default-freebond at maturity. For example, Brace et al. (1997) set σt(t, T0) = 0 fort ∈ [0, T0]. Now, having specified an appropriate bounded, deterministic,k-valued function λT0(t)0≤t≤T0 , one finds σt(t, T1) for t ∈ [0, T0] as

σt(t, T1) = σt(t, T0) + λT0(t)[

τ1Lt(T0, T1)1 + τ1Lt(T0, T1)

],

then extends freely for t ∈ (T0, T1] subject to σT1(T1, T1) = 0. Continuing,σt(t, Tm) can be built up for m ∈ 2, 3, . . . , n as

σt(t, Tm) = σt(t, T0) +m∑

j=1

λTj−1 (t)[

τjLt(Tj−1, Tj)1 + τjLt(Tj−1, Tj)

], t ∈ (0, Tm−1],

then extended to t ∈ (Tm−1, Tm] in such a way that σTm(Tm, Tm) = 0.Notice that while the processes λTj−1 (t) are deterministic, the σt(t, Tj)are merely Ft-measurable, since they depend on the evolving lognormalforward LIBORs.

Page 521: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

504 Pricing Derivative Securities (2nd Ed.)

The above construction shows that there do exist bond volatility pro-cesses that reconcile the market LIBOR framework with that of HJM. It alsoshows how, in principle, one could generate realizations of the volatilities—i.e., by simulating lnLt(Tj−1, Tj)n

j=1 as a joint Gaussian process in someparticular forward measure. As we shall see next, simulating ensembles offorward LIBORs is required in most applications of the LIBOR marketmodel, and doing this is far from trivial.

10.4.2 Applying the Model

Let us now consider how the market LIBOR model can be used to price agiven interest-sensitive derivative. Clearly, using it to price caps and floorsis easy, but in considering other applications it helps to think precisely whythis is so. Caps and floors are easy, first, because they are just portfoliosof individual caplets and floorlets that can be valued one at a time and,second, because the caplets and floorlets can be valued using Black’s simpleformulas.14 The second point is obvious, but there is more to the first pointthan may be apparent. Because each caplet or floorlet is a function of asingle LIBOR, LTj−1(Tj−1, Tj) say, pricing it does not require modeling thejoint behavior of rates with different settlement dates. For example, thedependence between LTj−1(Tj−1, Tj) and LTj (Tj , Tj+1) is of no relevancein evaluating C(0, K; Tj−1, Tj) and C(0, K; Tj, Tj+1). Also, it is possiblewithout inconsistency to use a different martingale measure for each capletor floorlet—namely, the one corresponding to its particular settlement date.Thus, by using Black’s formula for each C(0, K; Tj−1, Tj) we are implicitlycalculating ETj [LTj−1(Tj−1, Tj) − K]+ in each LIBOR’s “native” measurePTj . Moreover, since in its native measure the LIBOR for a given reset datehas no trend, there is no need to worry about drift terms in expressions fordLt(·, ·). In general, it is easy to price any instrument whose payoff dependson the value of a single LIBOR at one point in time or, as in the case ofcaps and floors, can be represented as an affine function of such individualcomponents.

In contrast, consider an option on a cap—a “caption”. A T0-expiringEuropean-style call struck at X on a cap commencing at T0 and with

14Indeed, the values of the caplets and floorlets would, in practice, have been deducedfrom market quotes for caps and floors of various durations, which in turn would havebeen used to calibrate the model.

Page 522: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 505

settlement dates Tjnj=1 is worth at expiration

[C(T0, K; T0, T1, . . . , Tn) − X ]+ =

n∑j=1

C(T0, K; Tj−1, Tj) − X

+

.

(10.60)This depends on the values of all the caplets at T0 and therefore on theevolution of all the underlying LIBORs up to that time, and because ofthe [·]+ it is not an affine function of the caplets. There being no singlemeasure in which all of the LIBORs are trendless, it is necessary to modelthe drift processes of (at least) all but one. Also, it is necessary to takeinto account the dependence between rates over the life of the derivative.This could be modeled by fully specifying the k-valued volatility processesλTj−1 (t)n

j=1 that govern the evolutions of the rates, but we shall see laterthat there is another way. Finally, to value the caption one must evaluate theexpectation of its normalized value at expiration. As for most other “exotic”interest-rate derivatives, there are no simple computational formulas forthese expected values, which therefore must be estimated by simulation.

In general, the key steps in applying the LIBOR market model are

• Choosing a numeraire and expressing the derivative’s normalized futurepayoffs in terms of a set of LIBORs.

• Modeling the evolution of the relevant LIBOR processes in the martingalemeasure associated with the chosen numeraire.

• Simulating the LIBORs and the replicates of the derivative’s payoffs inorder to estimate expected values.

We first describe some specific applications that illustrate steps one andthree, then take up the much more involved issue of modeling the variousrates.

Some Examples

1. Consider first a T0-expiring European put option struck at X on adiscount bond maturing at Tn, which is worth PE [B(T0, Tn), 0] =[X − B(T0, Tn)]+ at expiration. We work with LIBOR processes

Lt(Tj−1, Tj = Tj−1 + τj)nj=1

Page 523: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

506 Pricing Derivative Securities (2nd Ed.)

and write

B(T0, Tn) =n∏

j=1

[1 + τjLT0(Tj−1, Tj)]−1.

The price of either a T0-maturing bond or the underlying Tn bond itselfwould be sensible choices as numeraires. In the former case, with PT0 asthe relevant measure, we would value the option at t = 0 as

PE [B(0, Tn), T0] = B(0, T0)ET0

X −n∏

j=1

[1 + τjLT0(Tj−1, Tj)]−1

+

.

In the latter case, we would work in measure PTn and evaluate

PE[B(0, Tn), T0] = B(0, Tn)ETn

X

n∏j=1

[1 + τjLT0(Tj−1, Tj)] − 1

+

.

In either case we would simulate forward LIBORs Lt(Tj−1, Tj)nj=1

out to t = T0 in the pertinent measure, express the put’s terminalvalue in terms of these, and average over replications to estimate theexpectations.

2. The procedure for pricing an option on a coupon-paying, default-freebond would be much the same. The value at T0 of a bond paying c

currency units at dates Tjnj=1 and returning unit principal value at

Tn would be

B(T0, c; T1, . . . , Tn) = cn−1∑j=1

B(T0, Tj) + (1 + c)B(T0, Tn).

With B(t, Tn) as numeraire the initial value of a T0-expiring Europeancall, CE [B(0, c; T1, . . . , Tn), T0; X ], would be

B(0, Tn)ETn

c

n−1∑j=1

B(T0, Tj)B(T0, Tn)

+ 1 + c − X

+

,

where the relative bond prices are expressed in terms of forwardLIBORs as

B(T0, Tj)B(T0, Tn)

=n∏

i=j+1

[1 + τiLT0(Ti−1, Ti)].

Again, the LIBORs would be generated by simulation—one would needan ensemble of n − j of them with reset dates out to Tn−j—and theexpectation estimated by averaging the replications.

Page 524: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 507

3. We saw in section 10.3.3 that an interest-rate receiver swap that receivespayments on a unit notional principal at fixed rate K0 and pays atfloating rate LTj−1(Tj−1, Tj) at equally spaced times Tj = T0 + jτn

j=1

would be worth at T0

S(T0, K0; T0, T1, . . . , Tn) = K0τ

n∑j=1

B(T0, Tj) + B(T0, Tn) − 1.

Note that the first two terms amount to the value at T0 of a bondpaying fixed coupons of K0τ at each future date and returning unitprincipal value at Tn. A European-style, T0-expiring receiver “swap-tion” would give the holder the right to enter such a swap arrange-ment at T0, at which time it would be worth S(T0, K0; T0, T1, . . . , Tn)+.This being the value of a European call struck at X = 1 on a Tn-maturing bond paying coupons K0τ , the swaption’s value at t = 0,CE [S(0, K0; T0, T1, . . . , Tn), T0], would be calculated as

CE [B(0, K0τ ; T1, . . . , Tn), T0; 1],

or

B(0, Tn)K0τETn

n−1∑j=1

n∏i=j+1

[1 + τiLT0(Ti−1, Ti)] + 1

+

.

4. Finally, consider the caption that was described earlier. This was a T0-expiring European-style call struck at X on a cap commencing at T0

and with settlement dates Tjnj=1, the caption’s value at expiration

being given by (10.60). Taking as numeraire the price of the T0-maturingdiscount bond, we would express the caption’s initial value as

C(0, K; T0, T1, . . . , Tn) = B(0, T0)ET0 [C(T0, K; T0, T1, . . . , Tn) − X ]+.

(10.61)How is this to be evaluated? If we knew the values at T0 of forwardLIBORs and bond prices LT0(Tj−1, Tj), B(T0, Tj)n

j=1, then each ofthe individual caplets could be valued using Black’s formula. This wouldgive us the T0 value of the cap itself and thus the exercise value of theoption, as

n∑j=1

B(T0, Tj)τj [LT0(Tj−1, Tj)Φ(l+j ) − KΦ(l−j )] − X

+

,

Page 525: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

508 Pricing Derivative Securities (2nd Ed.)

where l±j ≡ ln[LT0(Tj−1, Tj)/K] ± 12 λ2

Tj−1Tj−1/(λTj−1

√Tj−1).

Expressing the prices of discount bonds at T0 in terms of LIBORs asB(T0, Tj) =

j∏i=1

[1 + τiLT0(Ti−1, Ti)]−1

n

j=1

,

we can obtain one realization of the caption’s terminal value by gener-ating the ensemble of forward LIBOR paths out to T0. Replicating theprocess, averaging the realizations, and multiplying by the observableB(0, T0) give an estimate of (10.61).

Modeling the LIBORs

We have seen that each member of the collection of forward LIBORprocesses

Lt(Tj−1, Tj) = τ−1j [B(t, Tj−1) − B(t, Tj)]/B(t, Tj)0≤t≤Tj−1,j∈1,2,...,n

is itself the evolving value of a portfolio of bonds, normalized by the one withlater maturity. Any one such process Lt(Tm−1, Tm)0≤t≤Tm−1 is thereforea martingale in its native measure PTm , evolving as

dLt(Tm−1, Tm) = λTm−1(t)′Lt(Tm−1, Tm) · dWTm

t .

If we have chosen as numeraire the price of the Tm-maturing bond, B(t, Tm),and are therefore valuing a derivative in forward measure PTm , it followsthat Lt(Tm−1, Tm) can be modeled and simulated as a trendless process.However, we must recognize that the other processes Lt(Tj−1, Tj)j =m onwhich the derivative depends cannot also be martingales in this measure,for they are not representable as marketable assets normalized by B(t, Tm).Instead, there will be drift terms, as in

dLt(Tj−1, Tj) = µTj−1,Tm(t)Lt(Tj−1, Tj) ·dt+λTj−1 (t)′Lt(Tj−1, Tj) ·dWTm

t .

When pricing derivatives such as bond options, swaptions, and captions thatdepend on—but are not affine functions of—more than one LIBOR process,we must somehow model the trends of all that do not evolve in their nativemeasures. Fortunately, we will see that the drifts can be expressed in termsof the realizations and volatility processes of the various rates. Specifyingthe volatilities will be the next step.

Page 526: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 509

Modeling the Drifts

Given the collection

Lt(T0, T1)0≤t≤T0 , Lt(T1, T2)0≤t≤T1 , . . . , Lt(Tn−1, Tn)0≤t≤Tn−1

of forward LIBOR processes, we will now see how to determine the trendof the jth member in each of the measures PTn and PT0 . For brevity, weshall use a simpler notation:

• Bj for B(t, Tj)• Lj for Lt(Tj−1, Tj)• dLj = µjmLj · dt + λ′

jLj · dWmt for the increment under measure PTm ,

m ∈ 0, n.

Also, it will be helpful to set out for reference a result from the Ito calculus:If X1t, X2t, . . . , Xnt are Ito processes and Pkt = X1tX2t · · · · · Xkt isthe product of time-t values of the first k of these, then by Ito’s formula

dPkt

Pkt=

k∑j=1

dXjt

Xjt+

∑i<j

d〈Xi, Xj〉tXitXjt

. (10.62)

Of course, when k = 2 this is just

dP2t

P2t=

dX1t

X1t+

dX2t

X2t+

d〈X1, X2〉tX1tX2t

. (10.63)

Consider first measure PTn , corresponding to numeraire Bn ≡B(t, Tn). We know that LjBj/Bn = τ−1

j (Bj−1 − Bj)/Bn has no trendin this measure, since it represents the normalized value of a portfolio ofbonds. Likewise, Bj/Bn has no trend, and this ratio can be expressed interms of forward LIBORs as Bj/Bn =

∏ni=j+1(1 + τiLi) for j < n. Of

course, Lj itself has no trend when j = n, but we shall need to determinedrift µjn for j ∈ 1, 2, . . . , n − 1. With X1 = Lj and X2 = Bj/Bn in(10.63) we have

d(LjBj/Bn)LjBj/Bn

=dLj

Lj+

d(Bj/Bn)Bj/Bn

+d〈Lj , Bj/Bn〉t

LjBj/Bn, (10.64)

and putting Xjt = (1 + τiLi) in (10.62) gives

d(Bj/Bn)Bj/Bn

=n∑

i=j+1

τi

1 + τiLi· dLi +

∑i<l

τiτl

(1 + τiLi)(1 + τlLl)· d〈Li, Ll〉t.

Page 527: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

510 Pricing Derivative Securities (2nd Ed.)

But since neither LjBj/Bn nor Bj/Bn has trend, that of Lj mustequal the negative of the last term of (10.64):

µjn = −d〈Lj, Bj/Bn〉tLjBj/Bn

= −n∑

i=j+1

τiLiλ′iλj

1 + τiLi.

Next, to get the drifts for measure PT0 we normalize by B0 ≡ B(t, T0).Both LjBj/B0 and Bj/B0 =

∏ji=1(1 + τiLi)−1 being trendless in this

measure, it again follows that µj0 = −d〈Lj, Bj/B0〉/(LjBj/B0). However,since

dBj/B0 = −j∑

i=1

τi

1 + τiLi· dLi +

∑i<l

τiτl

(1 + τiLi)(1 + τlLl)· d〈Li, Ll〉t,

the drifts µj0nj=1 now come with positive signs:

µj0 =j∑

i=1

τiLiλ′iλj

1 + τiLi.

Thus, forward LIBORs have positive drifts in measures in which earlierrates are martingales; and, as is consistent with this, they have negativedrifts in the native measures of later rates.

To recap and express in the original notation, the drifts µTj−1,Tm(t)nj=1

for LIBOR processes Lt(Tj−1, Tj)0≤t≤Tj−1,j∈1,2,...n in measures PT0 andPTn are

µTj−1,T0(t) =j∑

i=1

τiLt(Ti−1, Ti)λTi−1(t)′λTj−1 (t)1 + τiLt(Ti−1, Ti)

, j ∈ 1, 2, . . . , n

µTn−1,Tn(t) = 0

µTj−1,Tn(t) = −n∑

i=j+1

τiLt(Ti−1, Ti)λTi−1(t)′λTj−1(t)1 + τiLt(Ti−1, Ti)

, j ∈ 1, 2, . . . , n − 1.

Notice that, unlike the volatility processes, which are by construction deter-ministic, the drift terms depend on past realizations of the LIBORs. Thishas the unfortunate implication that future LIBORs are no longer lognor-mally distributed in measures other than their native ones—a fact thatcomplicates the simulation process.

Modeling the Volatilities

We have modeled the ensemble of LIBORs as driven by an k-valued Brow-nian motion Wt rather than by a single scalar-valued process in order to

Page 528: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 511

permit rates to be imperfectly correlated. Clearly, rates do tend to movetogether, since yield curves continually shift up and down. On the otherhand, slopes of yield curves also change, indicating that the correlation inrates is not perfect. Exactly how the various rates coevolve in our modelis determined by the specification of the vector-valued volatility processes,λTj−1 (t)n

j=1. However, an alternative to modeling all these k-valued pro-cesses is to express the rates’ normal components,

∫ t

0 λTj−1(s)′ ·dWsnj=1,

in terms of n correlated, scalar-valued Brownian motions and scalar-valuedvolatilities. This is done as follows. Suppose we have chosen the measurePTm associated with numeraire B(t, Tm). Then for t ≤ Tj−1 we have

lnLt(Tj−1, Tj) = lnL0(Tj−1, Tj) +∫ t

0

µTj−1,Tm(s) · ds

−12

∫ t

0

λTj−1 (s)′λTj−1 (s) · ds −

∫ t

0

λTj−1(s)′ · dWTm

s ,

where µTj−1,Tm(t) is the drift process under PTm , as just described. Bysetting

λTj−1 (t)2 ≡ λTj−1(t)

′λTj−1(t) (10.65)

WTm

jt ≡ λTj−1 (t)−1

∫ t

0

λTj−1 (s)′ · dWTm

s ,

we can dispense with the vector notation and write lnLt(Tj−1, Tj) as

lnL0(Tj−1, Tj)+∫ t

0

µTj−1,Tm(s)·ds−12

∫ t

0

λTj−1 (s)2·ds−

∫ t

0

λTj−1 (s)·dWTm

js .

Another rate Lt(Tl−1, Tl) (l = j, t ≤ Tl−1) will be driven by another scalar-valued process WTm

lt that is also a Brownian motion under PTm . Theinstantaneous correlation between the two Brownian motions at time t is

ρjl(t) ≡λTj−1 (t)′λTl−1(t)λTj−1 (t)λTl−1(t)

,

whereas, viewed from time zero, the correlation between lnLt(Tj−1, Tj) andlnLt(Tl−1, Tl) at fixed date t ≤ Tj−1 ∧ Tl−1 is

ρjl[0, t] ≡∫ t

0λTj−1(s)′λTl−1(s) · ds∫ t

0λTj−1 (s) · ds

∫ t

0λTl−1(s) · ds

=

∫ t

0ρjl(s)λTj−1 (s)λTl−1 (s) · ds∫ t

0λTj−1 (s) · ds

∫ t

0λTl−1(s) · ds

.

Page 529: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

512 Pricing Derivative Securities (2nd Ed.)

This will depend on the behavior of all three functions

ρjl(·), λTj−1 (·), λTl−1 (·)over [0, t]. Thus, to make the model accord with whatever market informa-tion or prior beliefs we have about correlations over intervals of time, wemust specify all three functions appropriately.

What hard information do we have to aid in specifying the model? Notnearly enough, as it turns out. There are, of course, the implied volatili-ties from Black’s formula and the market prices of caplets and floorlets,as inferred from prices of caps and floors. Indeed, it was the existenceof liquid markets in these instruments—and traders’ use of Black’s for-mula for pricing them—that motivated the development of the LIBORmarket model. Referring to (10.65) and (10.58), one sees that in ourscalar version of the model the volatility λTj−1 that enters Black’s for-mula for the initial value of a caplet settled at Tj, C(0, K; Tj−1, Tj),

is λTj−1 =√

T−1j−1

∫ Tj−1

0λTj−1 (t)2 · dt. Observing the succession of these

root-mean-squared volatilities at the (approximately) quarterly intervalsat which caplets settle gives some indication of how volatilities vary withtime, but these obviously do not fully specify the processes. There is alsothe possibility of calibrating to market quotes for swaptions, captions, etc.,by adopting plausible (but, inevitably, ad hoc) specifications of functionalforms for volatility functions and instantaneous correlations. For details ofsome of these schemes one may consult Rebonato (1999), Brigo and Mer-curio (2001), and Joshi (2003).

10.5 Modeling Default Risk

In models treated thus far it has been assumed that parties to contractsalways discharge their obligations. Thus, writers of options always supplythe prescribed quantity of the underlying asset at the expiration of in-the-money calls, writers of puts always come up with the necessary funds tobuy the assets, and issuers of bonds pay the coupons and face values in fulland on time. While institutional arrangements do exist to reduce the riskof “nonperformance” in certain markets (e.g., clearing houses and marginrequirements in option trades), and while a sovereign nation can usually beexpected to pay claims denominated in the currency it controls, there areindeed financial instruments whose market prices are substantially affectedby counterparty risk. As an introduction to the large literature on defaultrisk we describe here some of the prominent models for pricing corporate

Page 530: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 513

debt.15 Comprehensive treatments can be found in the books by Duffie andSingleton (2003) and Lando (2004).

There are two general approaches to the modeling of prices of default-able bonds. What we shall call the “endogenous” approach views the creditevent in terms of the relation between the value of the firm’s assets andsome threshold related to its liabilities. The simplest version, referred toas the Black-Scholes-Merton (BSM) model, is one in which a firm is obli-gated to redeem a single issue of zero-coupon bonds at some future date T.

Default occurs if and only if the firm’s assets at T are worth less thanthe face value of the bonds. An extension by Black and Cox (1976) allowsdefault to occur at any time up to T , contingent on the value of assetsfalling below some time-dependent threshold. This is supposed to corre-spond to the requirements of protective covenants written into the bondindenture. To be operative, this setup requires modeling first-passage timesof stochastic processes. As we saw in the description of barrier options inchapter 7, the theory is well developed for processes driven by Brownianmotions and linear thresholds; however, extending to more realistic assetprocesses and barrier functions requires simulation to determine the dis-tribution of default times and bond values. It is easier to get analyticalexpressions in the basic BSM framework. To illustrate, we focus here onjust one fairly rich but still tractable version of that model.

A newer approach to modeling defaultable bonds’ prices is to regarddefault events as essentially exogenous, much like the occurrences of nat-ural disasters. The evolution of a firm’s balance sheet is now unspecified,default simply being presumed to occur at the first arrival time of a Pois-son process. However, to enhance flexibility and realism, the intensity ofthe jump process is itself modeled as varying stochastically with time, asif in response to the evolving information about the firm, its markets, andthe general economic environment. One becomes more comfortable withthis approach upon regarding it as yielding “reduced-form” models. Thatis, whatever state variables on which balance-sheet items depend—default-free interest rates, market prices of outputs and inputs, advances in tech-nology, changes in regulatory constraints and tax laws, uninsured casualtylosses, etc.—are implicitly viewed as evolving processes that drive the Pois-son intensity. The obvious disadvantage of this abstract approach is thatwe lose a feel for what is “really” happening to the firm; nevertheless, its

15Our discussion is limited to defaultable bonds and does not deal with the risk ofnonperformance on options, futures, or other instruments.

Page 531: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

514 Pricing Derivative Securities (2nd Ed.)

mathematical elegance and its complementarity with term structure mod-eling have brought it to prominence. In addition, since capital structure perse is no longer directly considered, models of exogenous risk can be appliedto sovereign debt as well as to corporate liabilities.

10.5.1 Endogenous Risk: The Black-Scholes-Merton Model

The foundation of the BSM scheme was laid in the original papers of Blackand Scholes (1973) and Merton (1973, 1978), who observed that corporateequity and liabilities have option-like features. We have already appliedthese ideas in section 7.2.1 to relate stock options and deposit insuranceto compound calls and extendable puts. As we did there, let us considera firm whose operations are financed by equity and debt, the latter in theform of zero-coupon bonds all maturing at future date T and having facevalue L. Ignoring taxes and potential claims by litigants and bankruptcycourts, ownership of the firm’s assets is apportioned between stockholdersand bondholders. Letting Att≥0 represent the evolving market value ofthe firm’s assets, the values of stockholders’ and bondholders’ claims at T

are, respectively, ST = (AT − L)+ and b(T, T ) = AT − ST . That is, at thetime of maturity stockholders (or managers acting in their interests) makethe required payment of L to bondholders if and only if there is nonnegativeresidual value, AT −L, for holders of equity. If there is not, default occurs.The available assets then pass to bondholders, and stockholders are leftempty handed. Thus, in this simple framework the value of equity at anyt ∈ [0, T ] is simply that of a T -expiring European call option on the firm’sassets, St = CE(At, T − t), while the bonds’ aggregate value is b(t, T ) =At−CE(At, T −t). Alternatively, applying put-call parity, the bondholders’position can be stated as B(t, T )L−PE(At, T − t), where as usual B(t, T )is the value of a default-free unit discount bond. In this view bondholdersare short an option that gives stockholders the right to put the assets tothem in the event AT < L. The value of that option would correspond tothe value of insuring the bondholders against default.

To see where this leads, it is helpful to normalize all market valuesby L, thus expressing them per unit face value of debt. Retaining thesame symbols, we have St as the normalized value of equity and b(t, T )as the price of a defaultable discount bond with unitary principal value.The bond’s yield to maturity or average spot rate at t is then rd(t, T ) =−(T − t)−1 ln b(t, T ), and the yield spread relative to default-free bonds iss(t, T ) = rd(t, T ) − r(t, T ) = (T − t)−1 ln[B(t, T )/b(t, T )]. Our goal is to

Page 532: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 515

model how yield spreads vary with time to maturity, default-free interestrates, and characteristics of the firm’s balance sheet and asset value process.

To show the versatility of the BSM setup, we shall use the Vasicek (1977)model for riskless rates of interest and specify At as a jump-diffusion pro-cess. Thus, recalling the development in section 10.2.1, the short rate evolvesunder spot measure P as drt = (a∗ − brt) · dt + σdWt, where a∗ is a risk-adjusted drift parameter. The initial value of a T -maturing discount bond is

B(0, T ) = exp

[−a∗

∫ T

0

β(T − u) · du +σ2

2

∫ T

0

β(T − u)2 · du − β(T )r0

],

where β(t) ≡ b−1(1−e−bt), and the value of the money-market fund at T is

MT = M0 exp

[a∗

∫ T

0

β(T − u) · du + σ

∫ T

0

β(T − u) · dWu + β(T )r0

]

=M0

B(0, T )eσ

RT0 β(T−u)·dWu+ σ2

2

RT0 β(T−u)2·du. (10.66)

Paralleling the exposition in section 9.2 (but dispensing with special sym-bols for risk-adjusted parameters), we model the normalized value of assetsat T under P as

AT = A0 exp

∫ T

0

ru · du − θνT − σ2AT

2+ σAWT +

NT∑j=0

ln(1 + Uj)

.

Here Nt is a Poisson process with (constant) intensity θ; logarithms ofjump shocks, ln(1 + Uj)∞j=1, are i.i.d. as N [ln(1 + υ) − ξ2/2, ξ2]; andU0 ≡ 0. Brownian motions Wt, Wt that drive rt and At are indepen-dent of Nt, and all three are independent of the jump sizes. However, weallow for correlation between interest rates and the asset process by set-ting WT = ρWT + ρYT , where ρ =

√1 − ρ2 and Yt is another Brownian

motion independent of everything else.The normalized (by L) value of the firm’s equity at t = 0 is S0 =

M0EM−1T (AT − 1)+. As in section 9.2 we will work this out by evaluat-

ing the expectation conditional on NT and then applying the tower prop-erty of conditional expectation. What makes this relatively easy in theVasicek/jump-diffusion setting is that MT is distributed as lognormal underP and AT is conditionally lognormal. Still, as in the option pricing examplewith the HJM model in section 10.3.3, it takes some effort to disentanglethe stochastic interest rates from the asset value process. Here is one way

Page 533: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

516 Pricing Derivative Securities (2nd Ed.)

to do it. Using (10.66) with σ2M ≡ σ2T−1

∫ T

0 β(T − u)2 · du, we can write

M−1T =

B(0, T )M0

e−σ2

M T

2 −[σR T0 β(T−u)·dWu]

AT =A0

B(0, T )e−θνT−σ2

AT

2 +σ2

M T

2 +[σR T0 β(T−u)·dWu+σAWT +

PNTj=1 ln(1+Uj)].

Conditioning on NT , the bracketed term in the exponent in the expressionfor AT is distributed as normal with conditional mean NT [ln(1+υ)−ξ2/2] ≡µN , and using WT = ρWT + ρYT one sees that its conditional variance is

σ2NT ≡ σ2

MT + 2ρσAσ

∫ T

0

β(T − u) · du + σ2AT + NT ξ2. (10.67)

We can therefore represent the term as µN + σN

√TZN , where ZN ∼

N(0, 1) conditional on NT . Likewise, the bracketed term in the exponentin the expression for M−1

T is distributed as N(0, σ2MT ), so we write it as

σM

√TZM . Its covariance with σN

√TZN is σNMT ≡ σ2

MT+ρσAσ∫ T

0 β(T−u) · du, and the correlation between ZN and ZM is ρNM = σNM/(σNσM ).We can now set ZM = ρNMZN + ρNMZ∗, say, where Z∗ is conditionallyindependent of ZN . Then, with B0 ≡ B(0, T ),

S0 = Ee−σ2

M T

2 −ρNMσM

√TZ∗

· Ee−ρNMσM

√TZN

[A0e

−θνT+µN−σ2AT

2 +σ2

M T

2 +σN

√TZN − B0

]+

= e−ρ2

NM σ2M T

2 Ee−ρNMσM

√TZN

[A0e

−θνT+µN−σ2AT

2 +σ2

M T

2 +σN

√TZN − B0

]+

= A0e− ρ2

NM σ2M T

2 −θνT+µN−σ2AT

2 +σ2

M T

2 Ee(σN−ρNM σM )√

TZN 1(qN ,∞)(ZN )

−B0e− ρ2

NM σ2M T

2 Ee−ρNMσM

√TZN 1(qN ,∞)(ZN ),

where

qN ≡ − ln(A0/B0) + θνT − µN + σ2AT/2 − σ2

MT/2σN

√T

.

Working out expectations conditional on NT and setting

BN ≡ B0(1 + ν)−NT eθνT

q±N ≡ ln(A0/B0) ± σ2NT/2

σN

√T

Page 534: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 517

give

S0 = B0EB−1N [A0Φ(q+

N ) − BNΦ(q−N )]

= e−θνT E(1 + ν)NT CE(A0, T ; BN , σN ).

Here CE(A0, T ; BN , σ2N ) represents the Black-Scholes formula for a T -

expiring European call with unit strike, but with BN as the price of ariskless discount bond and σN as the volatility of the underlying. As wesaw in pricing options under jump dynamics in section 9.2, this can beevaluated as

S0 =∞∑

n=0

p(n; θυT )CE(A0, T ; Bn, σn),

where p(n; θυT ) represents P(NT = n) for a Poisson law with intensityθυ ≡ θ(1 + ν). Notice that stochastic short rates affect the result onlythrough the pseudo-volatility parameter σN , as given by (10.67).

As described earlier, we can now get the initial value of the defaultableunit bond as b(0, T ) = A0 − S0, and from this come the yield spreadsas s(0, T ) = T−1 ln[B(0, T )/b(0, T )]. Figure 10.5 presents plots of s(0, T )

0 1 2 3 4 5 6 7 8 9 10

Years to Maturity

0

10

20

30

40

50

60

Yie

ld S

pre

ad (

%)

A0 / L = 0.90A0 / L = 1.20A0 / L = 1.50

Fig. 10.5. The BSM yield spreads vs. time to maturity, for three initial values ofassets/debt.

Page 535: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

518 Pricing Derivative Securities (2nd Ed.)

versus T for three values of A0. The calculations (done with subroutineMERTON in program JUMP on the CD) are with average (default-free) spotrate r(0, T ) = 0.05, interest rate parameters b = 0.5 and σ = 0.1, assetparameters θ = 0.1, ξ = 0.2, σA = 0.3, ν = −0.1, and correlation ρ = 0.5between the two Brownian motions. These parameters generate enormousbut rapidly declining spreads for a firm that is already insolvent, but theyproduce relatively flat term structures for solvent firms. Still, because of thepossibility of negative jumps in the value of assets, there are positive spreadsfor solvent but highly leveraged firms at even the shortest maturities. Allof this seems to correspond reasonably well to what we observe in marketsfor corporate bonds.16

10.5.2 Exogenous Default Risk

The jump-diffusion model just considered is one of many applications ofPoisson processes encountered in this survey of derivatives pricing. Recall-ing the definition from pages 91 and 126, a Poisson process Ntt≥0

with intensity θ is an adapted process on a filtered probability space(Ω,F , Ft, P) with independent increments and with

P(Nt+τ − Nt = n | Ft) = P(Nt+τ − Nt = n) = (θτ)ne−θτ/n!, n ∈ ℵ0.

(10.68)This is often referred to as a “homogeneous” Poisson process to distinguishfrom versions in which θ is time-varying. While the homogeneous modelhas heretofore served as our main workhorse for modeling randomly occur-ring events, it is sometimes advantageous to extend it by allowing inten-sity to vary with time—either deterministically or stochastically. Processeswith stochastic intensity, called “doubly stochastic processes” or “Cox pro-cesses”, have long been applied to model life expectancies, failure timesof industrial processes, etc.,17 and we have already seen an application tofinancial derivatives in section 9.3.2. The use of Cox processes in model-ing credit risk was pioneered and developed by Lando (1994, 1998), Jarrowet al. (1997), Duffie and Singleton (1999), and others. Before seeing how

16Of course, we have considered only discount bonds, whereas most long-term corpo-rate issues carry coupons. It is not so easy to price coupon bonds in the BSM frameworkbecause periodic cash payments affect the balance sheet. Lando (2004) describes recursiveprocedures for pricing under various assumptions about how the payments are financed.

17For an overview of the theory and a survey of applications see Cox and Oakes(1984). Grandell (1976) presents the theory in the deeper context of point processes andrandom measures.

Page 536: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 519

they are applied in this context, it will be helpful to learn a bit more aboutthe theory and to trace them back to the homogeneous version in (10.68).

One will recall from page 61 that the probability generating func-tion (p.g.f.) of a random variable X supported on ℵ0 is the functionΠX(ζ) = EζX , ζ ∈ C, and that the probability mass function of X canbe generated from ΠX by differentiation, as Π(k)

X (ζ)|ζ=0 = P(X = k). Inparticular, Π(0)

X (0) ≡ ΠX(0) = P(X = 0). When X is distributed as Pois-son with parameter θ > 0 we have ΠX(ζ) = e(ζ−1)θ and P(X = 0) = e−θ.

Thus, for a homogenous Poisson process Ntt≥0 with intensity θ we haveΠNt(ζ) = e(ζ−1)θt and P(Nt = 0) = e−θt. We will construct a doublystochastic process from this in stages. First, we obtain an inhomogeneousPoisson process by allowing θt to be a positive-valued, deterministic,right-continuous step function—a mapping from + to a countable set ofpositive numbers. Putting t0 = 0, introduce a strictly increasing sequenceof real numbers tj∞j=0 ↑ ∞, a sequence of positive constants θj∞j=1,and a sequence Xjt∞j=1 of independent homogeneous Poisson processes,with θj as the intensity of Xj . We construct a process Ntt≥0 by puttingNt = X1(t) for t ∈ [0, t1), Nt = X1(t1) + X2(t) − X2(t1) for t ∈ [t1, t2),Nt = X1(t1) + X2(t2) − X2(t1) + X3(t) − X3(t2) for t ∈ [t2, t3), and so on,with Nt =

∑j−1i=1 [Xi(ti)−Xi(ti−1)]+Xj(t)−Xj(tj−1) for t ∈ [tj−1, tj) and

j > 1. Increments to this process over nonoverlapping periods of time areclearly independent, and for any fixed t the p.g.f. of Nt is

ΠNt(ζ) = e(ζ−1)[Pjt

j=1 θj(tj−tj−1)+θjt+1(t−tjt )], (10.69)

where jt ≡ maxj : tj ≤ t. The form of ΠNt(·) shows that Nt is distributedas Poisson, and because the increments are independent, Nt is indeed aPoisson process.

To generalize further, let θ : + → + be an F0-measurable (i.e., deter-ministic), right-continuous, Riemann-integrable function. For some m ∈ ℵchoose the tj such that maxj |tj − tj−1| ≤ 1/m. Then setting θj = θtj

in (10.69) and letting m → ∞ give ΠNt(ζ) → e(ζ−1)R t0 θs·ds and Nt

P (∫ t

0θs · ds) for each t ≥ 0. Finally, letting θt be an adapted process

whose sample paths θt(ω) are a.s. positive, right-continuous, and inte-grable, a doubly stochastic adapted process Nt is obtained by followingthe above construction for each sample path θt(ω). The p.g.f. is thenΠNt(ζ) = E exp[(ζ − 1)

∫ t

0θs · ds], and the probability that there are no

jumps up to t is P(Nt = 0) = Ee−R t0 θs·ds. Note that in the doubly stochastic

Page 537: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

520 Pricing Derivative Securities (2nd Ed.)

setup the information set at t, Ft, includes both histories, θs0≤s≤t andNs0≤s≤t.

Another construction that is useful in simulating Cox processes exploitsthe fact that interarrival times Tn − Tn−1∞n=1 of Poisson processes withdeterministic intensities are distributed as exponential. The recipe is as fol-lows. First, generate the realization of an indefinitely long sample path ofintensities, θs(ω) = θ∗ss≥0, using an appropriate model. Next, generaterealizations Yj(ω)∞j=1 of i.i.d. exponential random variables with expectedvalue unity. These have distribution function (1 − e−y)1[0,∞)(y) and c.f.(1 − iζ)−1, ζ ∈ . Next, setting T0(ω) = 0, determine sequentially forn = 1, 2, . . . the realizations of (stopping) times Tn(ω) = inft :

∫ t

0θ∗s · ds ≥∑n

j=1 Yj(ω). Thus, T1(ω) is the first time (strictly, the greatest lower

bound of times) at which∫ t

0 θ∗s · ds is at least Y1(ω); T2(ω) is the first timeat which

∫ t

T1θ∗s · ds is at least Y2(ω); and so on. Finally, for n = 1, 2, . . . set

Nt(ω) = n−1 for Tn−1(ω) ≤ t < Tn(ω) and NTn(ω) = n. Figure 10.6 illus-trates. The resulting process Nt satisfies, for the given realization θ∗s,

P(Nt < n | θ∗s) = P(Tn > t | θ∗s) = P

n∑j=1

Yj >

∫ t

0

θ∗s · ds

.

Fig. 10.6. Simulating the Cox process.

Page 538: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 521

The c.f. of∑n

j=1 Yj being (1 − iζ)−n, one sees (from table 2.3) that it is

distributed as Γ(n, 1). Thus, setting Θt ≡∫ t

0θ∗s · ds,

P(Nt < n | θ∗s) =∫ ∞

Θt

yn−1e−y

(n − 1)!· dy =

n−1∑j=0

Θjte

−Θt

j!,

where the last equality follows upon repeated integration by parts. Nt

is therefore distributed as Poisson with parameter∫ t

0 θ∗s · ds, conditionalon θs = θ∗s. In particular, P(Nt = 0 | θ∗s) = P(Nt < 1 |θ∗s) = e−

Rt0 θ∗

s ·ds, and averaging over realizations of θss≥0 again deliversP(Nt = 0) = Ee−

Rt0 θs·ds.

Now consider an entity (a firm, a machine, an organism, . . . ) thatendures and functions (in some objective sense) up to the first arrival timeT1 of such a doubly stochastic process, T1 itself marking the precise instantof “failure”. Then P(Nt = 0) = Ee−

R t0 θs·ds = E1(t,∞](T1) represents the

probability that failure will occur (if at all) after t, which is to say that itis the probability of survival to t.18 Moreover, since for T ≥ t event NT = 0implies Nt = 0, we have

e−R

Tt

θs·ds = e−R

T0 θs·ds+

Rt0 θs·ds

=P(NT = 0 | θs)P(Nt = 0 | θs)

=P(NT = 0, Nt = 0 | θs)

P(Nt = 0 | θs)= P(NT = 0 | Nt = 0, θs).

Therefore

Ete−

RTt

θs·ds = P(NT = 0 | Ft) = Et1(T,∞](T1)

is the probability of survival to T given survival to t. Notice that explicitlyconditioning on T1 > t is not required since information set Ft includes theentity’s current status. Letting T = t+∆t, we have, approximately for small∆t, Ete

−R

t+∆tt

θs·ds .= 1 − θt∆t as the probability of survival over the nextsmall interval of time. Therefore, θt can be thought of as an instantaneousdefault rate.19

18Intensities θs may be such that E exp(− R ∞0 θs · ds) > 0, as, for example, if

P(R ∞0 θs · ds ≤ m) = 1 for some finite m. Failure might then never occur, so T1 :

Ω → + ∪∞ must be considered an “extended” random variable; hence, the notation1(t,∞](T1).

19It was to justify this interpretation that we required sample paths of θs to bea.s. right continuous.

Page 539: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

522 Pricing Derivative Securities (2nd Ed.)

Let us now apply these concepts to financial markets. Consider a firm’straded obligation to pay one currency unit at future date T conditionalon the firm’s remaining solvent until that time—and nothing otherwise.In other words, the security represents a defaultable, zero-coupon bondwith zero recovery of principal upon default. With the first jump of somedoubly stochastic process Nt signaling default, the bond’s value at T isb0(T, T ) = 1(T,∞](T1), where subscript 0 signifies zero recovery. The mar-ket price of the security at t = 0, assuming the firm is then solvent, can beexpressed as

b0(0, T ) = M0EM−1T 1(T,∞](T1) = Ee−

RT0 rt·dt1(T,∞](T1), (10.70)

where, since we are valuing a traded asset, the expectation is now in theequivalent spot measure P. Of course, that the firm is currently solvent isincluded in the market’s initial information, F0, on which the expectation isconditioned. In measure P there corresponds to θtt≥0 a risk-adjusted pro-cess θtt≥0, in terms of which we can express the risk-neutral probabilityof survival:

P(NT = 0) = E1(T,∞](T1) = Ee−R

T0 θt·dt.

However, we would like to be able to express b0(0, T ) also in terms of sam-ple paths of θt, since we could then value the bond just by modeling rtand θt. Doing this requires a little ingenuity.

Let Gt be the σ-field generated by the joint history θs, rs0≤s≤t, withGtt≥0 as the corresponding filtration. We insist that Gt ⊆ Ft for eacht. For example, if it were possible to express (θt, rt) as some function g :k → 2 of t-observable state variables Xt, we could take Gt to be theσ-field generated by Xs0≤s≤t. In any case, whether there are such statevariables or not, the paths θt, rt0≤t≤T out to maturity are GT -measurable,so we can write

E

[e−

R T0 rt·dt1(T,∞](T1) | GT

]= e−

R T0 rt·dtE[1(T,∞](T1) | GT ]

= e−R T0 (rt+θt)·dt.

Now, evaluating (10.70) by conditioning first on on GT and then on G0, wehave

b0(0, T ) = E

E

[E

(e−

R T0 rt·dt1(T,∞](T1) | GT

)| G0

]= E

[E

(e−

RT0 (rt+θt)·dt | G0

)]= Ee−

RT0 (rt+θt)·dt,

Page 540: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

Interest-Rate Dynamics 523

where the last equality follows from G0 ⊆ F0. This achieves the desiredresult. More generally, the security’s value at any t ∈ [0, T ] is

b0(t, T ) = E

[e−

R Tt

(rs+θs)·ds | Ft

]≡ Ete

−R T

t(rs+θs)·ds.

Modeling prices of defaultable bonds with zero recovery can thus be reducedto specifying the joint evolution of the instantaneous spot rate and theinstantaneous, risk-adjusted default rate. Some ways of doing this are con-sidered below.

Of course, the potential for partial recovery of principal does have tobe considered in pricing actual corporate liabilities. This is handled auto-matically in the BSM framework, since the evolutions of assets’ values anddebt are formally modeled, but the exogenous risk approach requires apatchwork solution. To see what is involved, let us start with the followingbasic pricing relation for a defaultable discount bond with potential partialrecovery:

b(t, T ) = Et

[e−

RTt

rs·ds1(T,∞](T1) + e−R

T∗t

rs·dsRT∗1(t,T ](T1)]

= b0(t, T ) + Ete−

RT∗t

rs·dsRT∗1(t,T ](T1).

Here, the first term is the bond’s zero-recovery value, and the second termis the contribution from the uncertain amount RT∗ that is to be paid atsome date T ∗ contingent on default. (Of course, we take for granted thatthe firm is solvent at t, so that 1(t,∞][T1(ω)] = 1Ω(ω) = 1.) The problemis that in any specific application there is little to guide us in modelingeither the uncertain amount or the uncertain time of payment. Some of theways proposed to do this are (i) setting T ∗ = T1 and assuming R to bean Ft-measurable fraction of face value (i.e., R is known in advance andpaid upon default); (ii) setting T ∗ = T1 and assuming R to be an FT1-measurable fraction of face value (i.e., R revealed and paid upon default);(iii) setting T ∗ = T1, with R an FT1−-measurable fraction of the bond’svalue just before default, b(T1−, T ).

Given the level of ignorance and the multitude of possibilities, one mayfeel justified in opting for an approach that has the assured virtue of sim-plicity. One such is to set T ∗ = T and model R as an FT -measurablerandom variable supported on [0, 1) and independent of default intensities

Page 541: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch10 FA

524 Pricing Derivative Securities (2nd Ed.)

and interest rates. This scheme yields the following very tractable result:

b(t, T ) = Et[e−R

Tt

rs·ds1(T,∞](T1) + RT e−R

Tt

rs·ds1(t,T ](T1)]

= Ete−

RTt

rs·ds1(T,∞](T1) + EtRT · Ete−

RTt

rs·ds[1 − 1(T,∞](T1)]

= Ete−

R Tt

rs·ds · EtRT + Ete−

R Tt

rs·ds1(T,∞](T1)(1 − EtRT )

= B(t, T )EtRT + b0(t, T )(1 − EtRT ),

where, as usual, B(t, T ) is the value of a default-free discount bond.We can now focus on modeling B(t, T ) and b0(t, T ), which is done by

specifying the joint process rt, θt0≤t≤T . Since θt must be strictly positive,a bivariate version of the Vasicek (1977) model is unacceptable; but extend-ing the mean-reverting, square-root process of Cox et al. (1985) is feasible,and this yields expressions that are computationally manageable. The directapproach is to model rt and θt as independent CIR processes:

d

(rt

θt

)=

(a − b∗rt

aθ − bθ θt

)· dt +

(σ 00 σθ

)(r1/2t

θ1/2t

)· d

(W1t

W2t

),

where W1t and W2t are independent Brownian motions under P, and b∗

is the risk-adjusted drift parameter that we encountered in section 10.2.2.In this case B(t, T ) is as in (10.16), and

b0(t, T ) = B(t, T )Ete−

R Tt

θs·ds

= B(t, T )Aθ(T − t)e−θtαθ(T−t),

where Aθ(·) and αθ(·) are given by (10.18) and (10.17) with (aθ, bθ) in placeof (a, b∗). The indirect approach, which allows for dependence between rtand θt, is to model each as an affine function of the same k -vectorof state variables Xt, whose components would in turn be modeled asindependent CIR processes: dXjt = (aj − b∗jXjt) ·dt+σj

√Xjt ·dWjtk

j=1.

This, too, yields easily computed formulas for B(t, T ) and b0(t, T ). Fordetails and extensions to other affine models for rt, θt one can consultDuffie and Singleton (2003) and Lando (2004).

Page 542: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

PART III

COMPUTATIONAL METHODS

Page 543: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

This page intentionally left blankThis page intentionally left blank

Page 544: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

11Simulation

Pricing derivative assets by martingale methods involves finding mathemat-ical expectations of discounted contingent payoffs. The discounted payoffsare themselves functions (or, in the case of path-dependent derivatives,functionals) of prices of underlying primary assets, whose conditional dis-tributions under the martingale measure are deduced from the dynamicmodel for the underlying. The pricing problem therefore amounts to findingexpected values of various functions of random variables—that is, Lebesgue-Stieltjes integrals of functions with respect to c.d.f.s. The problem can beanalytically intractable when either the integrand function is complicatedor the c.d.f. is difficult to determine. In the former case numerical integra-tion is usually the preferred approach; in the latter case, either simulation,the binomial method, or lattice solutions of p.d.e.s. When there are multiplestate variables, simulation is typically the only feasible method.

Simulation is merely an application of the law of large numbers. Toget the basic idea, suppose we are to evaluate Eg(Y ) =

∫g(y) · dF (y) for

some (possibly vector-valued) random variable Y . We will look at moresophisticated schemes later, but the straightforward approach is simply togenerate a large number of realizations yjN

j=1 by sampling independentlyfrom the distribution F , evaluate g(·) for each realization, and average toobtain the estimate, Eg = N−1

∑Nj=1 g(yj). When realizations yj are

generated independently, the standard error of the estimate can itself beestimated consistently as

σ(Eg) =

√√√√N−1

N∑j=1

y2j − Eg

2.

527

Page 545: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

528 Pricing Derivative Securities (2nd Ed.)

Moreover, the central limit theorem can be applied to obtain an asymptotic100(1 − α)% confidence interval for Eg as

Eg ± Φ−1(1 − α/2)σ(Eg),

where Φ−1(·) is the inverse of the standard normal c.d.f.1 For example, withα = 0.05 the approximate 95% confidence interval is Eg ± 1.96σ(Eg). Inthe application to derivatives, g(Y ) will represent the payoff of a derivativeasset, and Y the (vector of) random influence(s) on which it depends. Forexample, for a vanilla European call option struck at X , random variable Y

would be the price of the underlying at the expiration of the call, F wouldbe the conditional distribution of the terminal price, and g(Y ) would be thediscounted value of (Y −X)+. For a European-style fixed-strike lookback orAsian call option Y would be, respectively, the maximum or average priceover some portion of the option’s life—a functional of the price path.

Simulation is well adapted to situations in which values of discountedpayoffs depend on more than one stochastic state variable; as, for example,when volatility and/or interest rates are stochastic. The binomial method,which evaluates expected payoffs by discretizing the state space, requiresa multidimensional lattice in such cases. Simulation is also well adaptedto path-dependent derivatives, such as lookbacks and Asians, for whichthe entire sample path of the underlying price must be known in orderto determine the payoff. As we will see in detail shortly, one approachesthese problems by discretizing the time to expiration and simulating therandom evolution of price one subinterval at a time. The information thatdetermines the derivative’s ultimate payoff is updated as the path pro-gresses. Binomial pricing is cumbersome in such cases because it is notstraightforward to determine the potential payoffs associated with all pathsthat pass through each node. However, simulation also has its limitations.Simulation proceeds by working forward in time, generating sample pathsprogressively from one step to the next. In pricing an American-style deriva-tive, however, the natural approach is to compare at each step its intrinsicand continuation values—but the latter depends on the possible subsequentpaths from that point. Of course, finding the continuation value is easy inthe backwards-recursive binomial scheme. Nevertheless, despite the inher-ent difficulty, we will see that there are ways to use simulation to priceAmerican-style derivatives.

1That is, Φ−1(1 −α/2) is the 1−α/2 quantile of the standard normal—the value zsuch that Φ(z) = 1 − α/2.

Page 546: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

Simulation 529

At the heart of the simulation method is the task of getting realiza-tions of random variables. While databases of random deviates are available(e.g., Marsaglia (1995)), these are ordinarily generated as part of the sim-ulation routine. Realizations generated by digital computer are necessarily“pseudorandom”, meaning they are produced by some purely deterministic,replicable process that gives the appearance of randomness in the sense ofpassing various statistical tests. In section 11.1 we discuss pseudorandomnumber generators for various distributions, giving particular attention tothe uniform and normal. We then consider some ways of enhancing the effi-ciency of simulation estimators, focusing on methods that have the greatestrelevance in financial applications. While the straightforward approach tosimulation outlined above is always feasible, we will see in section 11.2 thatthere are usually shortcuts that produce estimators Eg with lower varianceand/or shorter execution times. Such variance-reduction schemes can beapplied both in generating random deviates and in averaging the realiza-tions of g(Y ).2 Finally, in section 11.3 we will look at specific applicationsto the pricing of derivatives, including those that can be exercised early.

11.1 Generating Pseudorandom Deviates

Programs that generate pseudorandom variates of various forms are widelyavailable. The International Mathematics and Statistics Library r© (IMSL)contains a wide selection of executable routines in FORTRAN. The bookof FORTRAN 77 routines by Press et al. (1992) and its counterparts inFORTRAN 90, C, Pascal, and Basic also cover many of the standard dis-tributions and give source code with explanations of the algorithms andprogramming steps employed. The present discussion provides a generaldescription and overview of the prominent methods, with emphasis on thosethat are of greatest relevance for financial applications.

11.1.1 Uniform Deviates

Standard techniques for generating random deviates from any distributionbegin with realizations of uniform deviates. There are two standard options

2Both of these sections draw heavily from the book by Fishman (1996), which con-tains a wealth of additional material. Hammersley and Handscomb (1964) is still a good

resource, as well. Press et al. (1992) gives an excellent overview of random number gen-erators. Glasserman (2004) gives a comprehensive overview of simulation methods, withemphasis on financial applications.

Page 547: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

530 Pricing Derivative Securities (2nd Ed.)

for obtaining realizations of random variables distributed as uniform on theinterval (0, 1). One of the safest methods is to tap an existing database thatis known to have passed a battery of powerful statistical tests. Marsaglia(1995) has produced such a collection, available on CDROM. The otherstandard option is to produce the numbers in the course of the simula-tion experiment using a uniform number “generator”—a body of computercode that exists either in machine language on a specific computer sys-tem or in a higher-level language, such as FORTRAN or C. Computationaland programming systems like Gauss r© and Matlab r© usually have built-ingenerators. A desirable feature of such generators is that they be portable—that is, that they can be used with a variety of computers and compilers.Some understanding of generators is necessary if simulations are to producereliable results.

An algorithm on which many number generators are based is the multi-plicative congruential scheme. Beginning with a “seed” value I0 supplied bythe user, these generate new integer-valued numbers as Ij = aIj−1(mod m),where a and m are carefully chosen integers. Ij is thus the remainder afteraIj−1 is divided by m; for example, 2(mod 7) = 9(mod 7) = 16(mod7) = 2.Since the process obviously terminates when a zero is first produced, a andm must be chosen to exclude this possibility. Since m(mod m) = 0, thislimits the range of I to the integers 1, 2, . . . , m − 1. Division by m thenconverts to values u ≡ I/m on (0, 1). As a simple example, the merelyillustrative choices a = 2, m = 7, and seed I0 = 1 generate the sequence1, 2, 4, 1, 2, 4, 1 . . . , repeating after three digits. Clearly, this is a bad choiceof a, since not even all integers to m−1 are obtainable. The multiplier a = 3does better, generating 1, 3, 2, 6, 4, 5, 1. The message is that m − 1 is themaximum period—that is, the maximum length of nonrepeating string—but that poor choices of a can leave significant gaps and produce periodsmuch shorter than m − 1.

Since m limits the best possible outcome, it is obviously desirable tomake it as large as possible. This is where the limits of particular machinearchitectures come to be felt, since m is constrained by the word size onthe particular platform in use. With IBM mainframes and most PCs andworkstations the limit of 31 bits (excluding the sign bit) allows integers of upto 231.3 A standard choice for m is the prime number 231−1 = 2147483647,

3In fact, integers to 232 − 1 are managed in IBM-compatible computers byusing the sign bit to record 231, . . . , 232 − 1 as negative numbers; that is, as−2147483648, −2147483647, . . . ,−1.

Page 548: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

Simulation 531

and handling even this moderate number requires a special programmingtrick to assure that a(m − 1) does not exceed the maximum word size.Different generators come with their own special choices of a (see below).Good choices maximize the length of the nonrepeating string and eliminateobvious patterns in the output. One pattern that can remain when a m

is that small values of Ij are apt to be followed by smaller values thanaverage, since in such cases aIj m. Some routines employ a shufflingscheme to reduce this first-order dependence. These work by first creatinga preliminary table of integers by the usual multiplicative scheme, thengenerating each Ij not directly from Ij−1 but from a value chosen at randomfrom the table, which is then replaced by Ij .

Notice that all sequences of numbers produced by the generator froma given seed I0 will be the same. This is actually an attractive feature,since it enables one to replicate the random number exactly. The last valueof I produced in a call to the generator serves as the seed that continuesthe sequence where it left off. Notice, too, that the uniformly distributednumbers u = I/m produced by multiplicative congruential schemes arelimited to the open interval (0, 1).

IMSL’s RNUN and Numerical Recipes (Press et al., 1992) routines RANn(n = 0, 1, . . . , 3) are examples of portable generators. RAN1 uses m = 231−1with a = 16807 and a shuffling scheme with a 32-element table. The IMSLroutines allow 128-element shuffling as an option and offer a choice of mul-tipliers a = 16807, 397204094, or 95070637. These progressively trade speedfor increasing degrees of apparent randomness. Fishman (1996) presents aversion of FORTRAN subroutine RAND, written by L.R. Moore of the RandCorporation, that uses a = 95070637 without shuffling. FORTRAN andC++ versions of this in function subprogram format, RANDFUN, are on theaccompanying CD. High-level languages such as Gauss and Matlab havebuilt in generators for uniforms and other standard distributions.

Whatever is used, it is essential to run at least cursory checks on theoutput with one’s own computer platform and software. In some cases eventhe supposedly portable routines are really compiler specific.4 A test foruniformity using the Neyman (1937) N2 statistic is included on the CD

4As an example, the version of RAN1 that appeared in Press et al. (1986) givesnonsensical results with 32-bit FORTRAN 77 compilers by Watcom (version 9.0) andAbsoft (version 5.0), yet passes cursory tests (and gives the same output) with Watfor87

and Lahey F77L (version 4.10) compilers. (With Absoft version 5.0 it outputs a singlenumber, .86528, on every call!) However, the (1992) version of RAN1 produces the same(satisfactory) results with all four compilers.

Page 549: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

532 Pricing Derivative Securities (2nd Ed.)

as UTEST.5 The test, while known to have good power against broad alter-natives to uniformity, is sensitive mainly to uncharacteristic behavior ofthe first two moments. It does not necessarily pick up problems like largegaps in the frequency distribution (indicative of a bad choice of multiplier a

in multiplicative congruential generators) that a simple plot would reveal.A good supplement is simply to use a spreadsheet or statistical graphicsprogram to construct and plot a frequency distribution of a sample of theoutput.

Besides the uniformity of the numbers, the lack of egregious forms ofserial dependence will be important in simulating sample paths of assetprices. A simple runs test that compares the number of sequences of num-bers below 0.5 is included on the CD as RUNTEST. Much more thoroughchecks can be made with tests supplied by IMSL (e.g., RUNS, PAIRS, DSQAR,DCUBE). Marsaglia (1995) includes a battery of tests for randomness withhis collection of random deviates (which pass the tests).

11.1.2 Deviates from Other Distributions

The process of generating pseudorandom realizations from other distribu-tions than the uniform usually begins with a uniform deviate on (0, 1). Avery basic and often very attractive general method produces pseudoran-dom deviates with c.d.f. F by running uniform deviates through F−1.

The Inverse Probability Integral Transform

Let F be a c.d.f., and define the functions Y ± : (0, 1) → as Y +(u) =infy : F (y) > u and Y −(u) = infy : F (y) ≥ u.

Example 68 Exponential distribution with unit scale parameter:

F (y) =

0, y ≤ 01 − e−y, y > 0

Y ±(u) = −ln(1 − u).

5For description of the procedure and discussion of its power, see D’Agostino andStephens (1986, pp. 351–354, 357) and references cited there.

Page 550: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

Simulation 533

Example 69 Bernoulli distribution with parameter θ:

F (y) =

0, y < 01 − θ, 0 ≤ y < 11, y ≥ 1

Y +(u) =

0, 0 ≤ u < 1 − θ

1, 1 − θ ≤ u < 1

Y −(u) =

0, 0 < u ≤ 1 − θ

1, 1 − θ < u ≤ 1.

In general, Y + and Y − coincide for continuous distributions, and even inthe discrete case Y + and Y − are indistinguishable when U is distributedas uniform on (0, 1), in the sense that P[Y +(U) = Y −(U)] = 1. Takingan arbitrary stand, define the inverse c.d.f. as F−1(u) ≡ Y +(u) = infy :F (y) > u and set Y (u) ≡ F−1(u). Then, if realizations of U are generatedrandomly from U(0, 1), we have

P(Y ≤ y) = P[F−1(U) ≤ y] = P[U ≤ F (y)] = F (y).

Thus, the random variable Y (U) = F−1(U) is indistinguishable from arandom variable having the distribution F . The transformation that takesY to the random variable F (Y ) ∼ U(0, 1) is often called the “probabilityintegral transform”. We require the inverse of this.

As an example, table 11.1 shows the first ten pseudorandom uni-forms from RANDFUN with seed 1.0 and the corresponding exponential andBernoulli deviates (with θ = 2/3). Notice that since U and 1 − U have thesame distributions, exponentials can be generated simply as −lnU . Fish-man (1996, table 3.1) gives expressions for inverse c.d.f.s of several promi-nent continuous distributions and suggestions for computational shortcuts.Routine POIDEV on the CD uses the inverse transform method to producevectors of Poisson variates.

Aside from simplicity in many applications, the inverse transformmethod has two advantages, both relating to variance-reduction meth-ods discussed in the next section. The first is that Y (U) is a monotone-increasing function of U , which implies that Y (U) and Y (1 −U) are nega-tively correlated but with identical marginal distributions. We will see thatthis feature often makes it possible to produce Monte Carlo estimates ofEg(Y ) more efficiently. The second advantage of the transform method isits ability to generate variates efficiently from modifications of F that haverestricted support. For example, if Y with c.d.f. F is supported on [a, d],

Page 551: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

534 Pricing Derivative Securities (2nd Ed.)

Table 11.1. Simulated uniforms, exponential, and Bernoullivariates.

Uniform Exponential Bernoulliu y = −ln u y = 1[1/3,1)(u)

.44271 0.58466 1

.06008 0.06196 0

.80478 1.63365 1

.17005 0.18639 0

.91588 2.47554 1

.48670 0.66689 1

.29624 0.35132 0

.74641 1.37203 1

.29842 0.35442 0

.20020 0.22339 0

then the conditional distribution given Y ∈ [b, c], where a ≤ b < c ≤ d, is

F (y|Y ∈ [b, c]) =

0, y < b

F (y) − F (b)F (c) − F (b)

, b ≤ y < c

1, y ≥ c.

The inverse of the conditional distribution is then

F−1(u|Y ∈ [b, c]) = F−1u[F (c) − F (b)] + F (b) (11.1)

This property is relevant for generating stratified samples, which areexplained below.

The main disadvantage of the inverse transform method is that it is notso simple to use when there is no convenient expression for the inverse c.d.f.Examples are the normal and gamma distributions, whose c.d.f.s them-selves cannot be expressed in closed form. Satisfactory closed-form approx-imations for inverse c.d.f.s are sometimes available; for example, Fishman(1996, table 3.1) gives an approximation for the normal inverse. Otherwise,one can use a series expansion to represent the c.d.f., then apply an iterativeroot-finding approach to determine F−1(u). However, this takes computa-tional and programming time, and other procedures are usually preferable.

The Rejection Method

Another general approach uses properties of conditional probability andthe Radon-Nikodym derivative to generate pseudorandom variates with

Page 552: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

Simulation 535

c.d.f. F . Let G be a c.d.f. corresponding to a measure equivalent to F ;thus, a support Y of F is a support of G and vice versa. Let R = dF/dG bethe Radon-Nikodym derivative, which is required to be bounded—that is,G must be such that there exists R < ∞ with R(y) ≤ R for y ∈ Y. Then

F (y) =∫Y∩(−∞,y]

R(x) · dG(x).

Suppose now that we generate independent realizations of random variablesX and U from G and U(0, 1), respectively. Then

P[U ≤ R(X)R−1] =∫Y

P[U ≤ R(x)R−1] · dG(x)

=∫Y

R(x)R−1 · dG(x) = R−1;

and the conditional probability that X ≤ y given that U ≤ R(X)R−1 is

P[X ≤ y|U ≤ R(X)R−1] =P[X ≤ y ∩ U ≤ R(X)R−1]

P[U ≤ R(X)R−1]

= R · P[X ≤ y ∩ U ≤ R(X)R−1]

= R

∫Y∩(−∞,y]

R(x)R−1 · dG(x)

=∫Y∩(−∞,y]

R(x) · dG(x)

= F (y).

The equality P[X ≤ y|U ≤ R(X)R−1] = F (y) gives us a recipe forgenerating a random variable with c.d.f. F . First, find another c.d.f., G, withthe same support, such that dF can be factored as the product of an almost–surely bounded function R times dG. When F is absolutely continuous, R

will be a ratio of densities; otherwise, it will be a ratio of probability massfunctions (p.m.f.s). Next, generate independent realizations x and u fromG and U(0, 1). If u ≤ R(x)R−1, where R is the upper bound of R, thenaccept x as a realization from F ; otherwise, reject and draw again. Theprobability of accepting is R−1. The process is obviously most efficientwhen (i) G is a distribution from which it is easy to generate realizations(as by the inverse-transform method) and (ii) R−1 is close to unity.

As an example, consider simulating Poisson variates, having p.m.f.f(y) = θye−θ/y! for y ∈ Y ≡ 0, 1, 2, . . . in the special case that θ ∈ (0, 1).

Page 553: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

536 Pricing Derivative Securities (2nd Ed.)

Take as the comparison distribution the geometric, g(y) = (1 − θ)θy fory ∈ Y. The geometric c.d.f., G(y) = 1 − θ[y+1], has a simple inverse,G−1(u) = [lnu/ ln θ − 1], where in both expressions [ · ] is the greatestinteger function. This makes it easy to generate geometric variates by theinverse transform method. Since R(y) = (1−θ)−1e−θ/y! ≤ (1−θ)−1e−θ, wehave R−1 = (1 − θ)eθ. This declines from unity to zero as θ increases fromzero to unity. The rejection method will therefore work efficiently when θ

is small.

Generating Normals

While both inverse-transform and rejection methods may be used to gener-ate pseudorandom normals, there is a simpler alternative using a transfor-mation devised by Box and Muller (1958). Let U be distributed as uniformon (0, 1) and V be an independent exponential variate with unit scale. Thatis, V = −lnU ′, where U ′ is uniform, independent of U . The joint p.d.f. ofU and V is

fUV (u, v) = e−v, u ∈ (0, 1), v > 0

and fUV (u, v) = 0, elsewhere. Consider the bivariate transformation X =√2V cos(2πU), Y =

√2V sin(2πU). This takes (0, 1) × + to 2 and has

inverse V = (X2 + Y 2)/2 and U = tan−1(X/Y ). The Jacobian of thetransformation is

J ≡∣∣∣∣∂x/∂u ∂y/∂u

∂x/∂v ∂y/∂v

∣∣∣∣=

∣∣∣∣∣∣∣−2π

√2v sin(2πu) 2π

√2v cos(2πu)

1√2v

cos(2πu)1√2v

sin(2πu)

∣∣∣∣∣∣∣= −2π.

Change-of-variable formula (2.17) now shows that X and Y are bivariate(spherical) normal; that is,

fXY (x, y) = fUV (u, v) · |J−1|

=12π

e−(x2+y2)/2.

The method thus transforms independent uniform and exponential variatesinto a pair of independent standard normals.

Page 554: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

Simulation 537

While this is amply straightforward, there are some clever program-ming tricks that obviate the need to evaluate the sine and cosine functions.Routine GASDEV in Press et al. (1992) employs such a method. Another isgiven in algorithm NA in Fishman (1996, pp. 189–191). Function GAUSFUN

on the CD is an implementation of the NA algorithm. IMSL routinesRNNOR and RNNOA use inverse transformation and rejection methods, respec-tively, of which the second is faster. In fact, all four routines are extremelyfast. Of course, speed is just one consideration. Unfortunately, none ofthe methods holds up especially well under powerful statistical tests forunivariate normality. Routines ADTEST and CFTEST on the CD implementAnderson-Darling (1954) and integrated characteristic function (i.c.f.) testsfor normality (Epps and Pulley, 1983; Baringhaus et al., 1989). Applied tosamples of sizes 500, 1000, and 2000 generated by the four routines, thesetests reject the null hypothesis of univariate normality at frequencies muchhigher than nominal significance levels. GAUSFUN does as well as any of theroutines. Since in all cases the test results are highly dependent on theinitial seeds used to begin the sequence of pseudorandom uniforms, one isadvised to experiment with several seeds before putting any of these rou-tines to work. Testing multivariate normality with subsets of the outputwould also be desirable. For tests of this sort see Baringhaus and Henze(1988) and Mardia (1970).

11.2 Variance-Reduction Techniques

As indicated in the introduction, the straightforward application of MonteCarlo methods to estimating Eg(Y ) =

∫g(y) · dF (y) is simply to average

a large number of replicates of the function evaluated at pseudorandomrealizations of the random variable Y . In other words, one computes Eg ≡N−1

∑Nj=1 g(yj). This section explores ways to improve accuracy and/or

reduce the computational time relative to this standard approach. Stratifiedsampling and importance sampling both have to do with how the replicatesof y are produced, while the use of antithetic variates and control variatesinvolve other aspects of the experimental design.

11.2.1 Stratified Sampling

As we know from elementary statistics, “random” samples are not alwaysrepresentative samples. The manifestation of this fact in simulation

Page 555: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

538 Pricing Derivative Securities (2nd Ed.)

experiments is that the empirical (sample) frequency distribution, FN , of afinite number N of replicates from a distribution F may differ considerablyfrom F . Apart from the limitations of pseudorandom number generators,we do know that supy |FN (y) − F (y)| → 0 a.s. as N → ∞ (this is theGlivenko-Cantelli theorem); however, it would be nice to have some way toreduce the sizes of errors for fixed N . This is what stratified sampling isintended to accomplish. The idea is to break the support of F into strataand sample from each with the appropriate frequency.

More specifically, let F be supported on Y, and let YsSs=1 be a parti-

tion. These will be the strata from which we sample. Typically, they are con-tiguous, disjoint intervals. Let ps ≡ P(Y ∈ Ys) and Esg(Y ) ≡ E[g(Y )|Y ∈Ys] be, respectively, the probability measure of stratum s and the condi-tional expectation of g given that Y takes on a value in the stratum. TakingNs ≡ 〈psN〉 draws of Y from stratum s, where the angular brackets indi-cate “nearest integer”, we can estimate Esg(Y ) consistently and withoutbias as

Esg = N−1s

Ns∑j=1

g(yjs).

The stratified estimator of Eg(Y ), also unbiased and consistent, is then

ESg ≡S∑

s=1

psEsg.

Considering that the straight-on estimator, Eg = N−1∑N

j=1 g(yj), isalso unbiased and consistent, one is entitled to ask what stratificationaccomplishes. The answer is that the stratified estimator has lower vari-ance for any fixed total number of replicates, N . In the process of provingthis we shall discover a consideration that is involved in selecting strata.Letting

γ2 ≡ V g(Y ) =∫Y

g(y)2 · dF (y) − (Eg)2, (11.2)

where V (·) is the variance operator, note first that V Eg = γ2/N . We canfind the variance of the stratified estimator as follows:

V ESg =S∑

s=1

p2sV Esg =

S∑s=1

p2sγ

2s/Ns = N−1

S∑s=1

psγ2s , (11.3)

Page 556: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

Simulation 539

where γ2s ≡ V [g(Y )|Y ∈ Ys] is the conditional variance of g given that Y is

from stratum s. This conditional variance can be expressed as

γ2s =

∫Ys

[g(y) − Esg]2 · dF (y|Y ∈ Ys)

=∫Ys

[g(y) − Eg] + (Eg − Esg)2 · dF (y|Y ∈ Ys)

=∫Ys

[g(y) − Eg]2 · dF (y|Y ∈ Ys) − (Eg − Esg)2.

Plugging into (11.3) gives

V ESg =1N

S∑s=1

ps

∫Ys

[g(y) − Eg]2 · dF (y|Y ∈ Ys) −1N

S∑s=1

ps(Eg − Esg)2

=1N

S∑s=1

∫Ys

[g(y) − Eg]2 · dF (y) − 1N

S∑s=1

ps(Eg − Esg)2

=1N

∫Y[g(y) − Eg]2 · dF (y) − 1

N

S∑s=1

p2s(Eg − Esg)

= V Eg − 1N

S∑s=1

p2s(Eg − Esg)2.

The summation in the second term is simply the variance of the conditionalexpectations of g(Y ) across the strata. If there is just one stratum, or ifmultiple strata are chosen so that there is no variation, then the secondterm is zero. Otherwise, stratification does produce an estimator of lowervariance than the unstratified estimator.

Getting the most out of stratification involves choosing strata that max-imize the cross-stratum variation in Esg. This suggests (correctly) that anyrefinement of the partition YsS

s=1—that is, any further stratification—can only lower, and cannot increase, the variance. However, since simulat-ing replicates from the stratified distributions F (·|Y ∈ Ys) takes more timethan simulating from F alone, there are limits to how far one should go.Given a decision as to the number of strata, however, it is clear that theyshould be chosen so as to maximize the variation in the conditional meanof g.

How can one generate replicates from the strata? This just involvessampling from each of the conditional distributions F (·|Y ∈ Ys). Thisrestricted sampling is where the inverse transform method is particularly

Page 557: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

540 Pricing Derivative Securities (2nd Ed.)

handy. Expression (11.1) gives the inverse of the conditional distributionin terms of the inverse of the unconditional in the case that the strata areintervals. Extension to more general strata, such as unions of intervals, isstraightforward.

11.2.2 Importance Sampling

The identity

V ESg = N−1S∑

s=1

psV [g(Y )|Y ∈ Ys]

that is implied by (11.3) shows that the effective application of stratifiedsampling involves putting more weight on regions of the support where thevariation in g is low. That is, when V [g(Y )|Y ∈ Ys] is small for some s,we want ps to be large. This has the effect of increasing the sampling rate,Ns = 〈psN〉, where sampling is more informative about Eg. Importancesampling does the same thing in a more subtle manner.

Here is how it works. In estimating Eg(Y ) =∫Y g(y)·dF (y), let us make

a change of measure from F to a measure G with respect to which F isabsolutely continuous, as

Eg(Y ) =∫Y

g(y)R(y) · dG(y),

where R ≡ dF/dG is the Radon-Nikodym derivative. This reminds us thatnot only can we estimate Eg by drawing replicates yj from F and averag-ing the g(yj), but also by drawing replicates yj from G and averagingg(yj)R(yj). When would such a change of measure be advantageous?

One application is in variance reduction, as the following argumentshows. Letting EgR ≡ N−1

∑Nj=1 g(yj)R(yj) be the modified estimator,

we have

V EgR = N−1

∫Y

g(y)2R(y)2 · dG(y) − (Eg)2

= N−1

∫Y

g(y)2R(y) · dF (y) − (Eg)2,

so that, using (11.2),

V Eg − V EgR = N−1

∫Y

g(y)2 · dF (y) − N−1

∫Y

g(y)2R(y) · dF (y)

= N−1

∫Y

g(y)2[1 − R(y)] · dF (y).

Page 558: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

Simulation 541

This expression could be of either sign. Notice that since∫Y [1 − R(y)] ·

dF (y) = 0 the integral is merely the covariance between 1 − R(Y ) andg(Y )2. The expression is positive, in which case the change of measure doeseffect a reduction in variance, when G is such that 1−R(Y ) and g(Y )2 havepositive covariance or, equivalently, when R and g2 covary negatively. Whenthis is so, more weight is placed on those regions of Y in which samplingis more informative about Eg, much as with stratification. In other words,the fact that G has more probability mass in those areas means that moreof the given sample of replicates will be drawn from there than would bedrawn under F . This explains the name “importance sampling”.

Fishman (1996, section 4.1) gives several examples of the applicationof importance sampling in variance reduction. Another application of themethod is just to swap distributions from which it is relatively hard togenerate replicates for easier ones. For example, suppose Y ∼ Γ(α, β) andthat we want to estimate

Eg(Y ) =∫ ∞

0

g(y)1

Γ(α)βαyα−1e−y/β · dy

=∫ ∞

0

g(βy)1

Γ(α)yα−1e−y · dy.

Taking dF (y) = Γ(α)−1yα−1e−y · dy and dG(y) = e−y · dy (exponential),we have

Eg(Y ) =∫ ∞

0

g(βy)R(y)e−y · dy,

where R(y) = Γ(α)−1yα−1. Generating pseudorandom exponentials nowleads to the estimate Eg = Γ(α)−1N−1

∑Nj=1 g(βyj)yα−1

j . Although accu-racy falls off as α increases, this is much faster to calculate and will givegood results for α near unity.

11.2.3 Antithetic Variates

If we take independent draws Y and Y ′ from F and calculate g(Y ) andg(Y ′), then the average, [g(Y ) + g(Y ′)]/2, has variance V g(Y )/2. If wemake N replications of this bivariate sample and form the estimator

(2N−1)N∑

j=1

[g(Yj) + g(Y ′j )],

Page 559: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

542 Pricing Derivative Securities (2nd Ed.)

then the variance is1

4N[V g(Y ) + V g(Y ′)] =

12N

V g(Y ).

Of course, this is exactly the same thing—and involves the same amountof work—as taking 2N replicates from F and calculating

Eg ≡ (2N)−12N∑j=1

g(Yj),

so nothing special has been accomplished. However, suppose we could some-how generate a dependent pair Y, Y ′ from a joint distribution F (y, y′), bothof whose marginals were equal to F (y). Then the variance of the bivariateestimator, which we now denote EgA, would be

V EgA = N−1V g(Y ) + 2Cov[g(Y ), g(Y ′)] + V g(Y ′)/4

= (2N−1)V g(Y )(1 + ρgg′), (11.4)

where ρgg′ is the correlation between g(Y ) and g(Y ′). Clearly, relative toestimating with 2N independent replicates, this would achieve a reductionin variance if ρgg′ < 0, and the gain would be greater the stronger thecorrelation. If Y ′ could be generated at least as fast as Y itself, then therewould be a clear efficiency gain.

The three results below, established by Hoeffding (1940), provide guid-ance in putting this into practice by showing how to construct bivariaterandom variables having minimal correlation, subject to their having thesame marginal distributions. Here X and X ′ are random variables withjoint c.d.f., FXX′ , and with identical marginals, F , such that V X exists.For proofs of somewhat more general versions of the first two see Fishman(1996); the third is proved below.

1. Cov(X, X ′) =∫2

FXX′(x, x′) · dx · dx′ − (EX)2.2. FXX′(x, x′) ≥ [F (x) + 1 − F (x′)]+ for (x, x′) ∈ 2.

3. Construct random variables W and W ′ as W = F−1(U), W ′ = F−1(1−U), where U ∼ U(0, 1). Then W and W ′ are identically distributed asF, and

FWW ′ (w, w′) = [F (w) + F (w′) − 1]+.

Result 1, which is a product-moment extension of expression (2.39) in sec-tion 2.2.3, shows that, ceteris paribus, covariance increases with the integralof the joint c.d.f. Result 2 establishes a minimum for the joint c.d.f. at each

Page 560: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

Simulation 543

point in 2, in terms of the marginals. Finally, result 3 shows how bivari-ate random variables with given marginals can be constructed whose jointc.d.f. attains the lower bound. With this construction W and W ′ have min-imal correlation subject to being identically distributed (and having finitevariance). It is not hard to see (Fishman, 1996, p. 313) that this minimalcorrelation is in fact negative.

Applying these results to the Monte Carlo estimation of Eg(Y ) whereY has c.d.f. F , suppose first that g is a nondecreasing function and that Y

is generated as F−1(U) where U ∼ U(0, 1). Set W = g F−1(U) ≡ h(U).Since F−1(U) is nondecreasing, so is h, which is the composition of twonondecreasing functions. Indeed, h−1 is the c.d.f. of W , since P(W ≤ w) =P[h(U) ≤ w] = P[U ≤ h−1(w)] = h−1(w). Then, setting W ′ = h(1−U), wehave

FWW ′ (w, w′) = P[h(U) ≤ w ∩ h(1 − U) ≤ w′]= P[U ≤ h−1(w) ∩ 1 − U ≤ h−1(w′)]= P[1 − h−1(w′) ≤ U ≤ h−1(w)]

= [h−1(w) + h−1(w′) − 1]+

= [FW (w) + FW (w′) − 1]+.

Besides establishing Hoeffding’s result 3 (take g ≡ 1 in the above), this inconnection with results 1 and 2 shows that we induce minimal correlationin g(Y ) and g(Y ′), subject to maintaining equality in the marginal distri-butions, by generating Y and Y ′ as F−1(U) and F−1(1 − U), respectively.That the same result applies to nonincreasing functions g as well followsupon generating Y as F−1(1 − U).

The dependent random variables Y and Y ′ that are generated in thisfashion are called “antithetic variates”. Since the construction guaranteesthat ρgg′ < 0 when g is monotone, (11.4) shows that the use of antitheticvariates necessarily reduces the variance of the estimator of Eg. If g is notmonotone on the support of Y then these optimal results cannot be guaran-teed; however, it is clear from (11.4) that the variance of EgA constructedfrom N pairs of replicates Y, Y ′ can in no case be larger than the varianceof Eg from N realizations of Y alone. If generating Y ′ requires little extrawork, then the antithetic procedure will usually extend the speed-accuracyfrontier.

How much work, then, is involved in generating Y ′? The constructionpresented above is limited to variates generated via the inverse-transformapproach. Generating Y ′ in this way as F−1(1 − U) takes twice the time

Page 561: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

544 Pricing Derivative Securities (2nd Ed.)

as generating Y , and this can be costly when F−1 is not easily calculated.If Y could be generated in some faster way than by inverse transform, itseems that the antithetic procedure would not be advantageous, since onecould just generate twice as many ordinary deviates. However, there is animportant class of cases in which the production of Y ′ turns out to be triv-ial. Notice that since Y can be generated as F−1(U) and since U − 1/2 isdistributed symmetrically about the origin, it is always possible to repre-sent g as a function of a symmetrically distributed random variable. Withsuch representation for g let us suppose that the distribution of Y itselfis symmetric about the origin, so that F (y) = 1 − F (−y). Suppose fur-ther that Y can be generated in some efficient way. For example, Y mightbe standard normal, which can be generated by a variant of the Box-Muller algorithm. Since Y can still be represented via the inverse-transformmethod, we see that Y = F−1(U) implies U = F (Y ) = 1−F (−Y ), so that−Y = F−1(1 − U). This shows that Y,−Y constitute an antithetic pairwhen Y is symmetrically distributed. Therefore, expressing g as a functionof a random variable Y such that Y and −Y are identically distributed, theantithetic estimator of Eg is

EgA = (2N)−1N∑

j=1

[g(yj) + g(−yj)].

When g is monotone, this achieves maximal variance reduction; otherwise,it can do no harm. Of course, if g itself is symmetric about the origin, thenthis can accomplish nothing.

11.2.4 Control Variates

Control variates were introduced in section 5.5.2 in connection with thebinomial method for pricing derivatives. In that context the idea was touse the same binomial tree to price both the derivative of interest and ananalogous security whose exact value is known (such as a European option),then to adjust the estimate of interest by subtracting the known pricingerror for the comparison asset. This can reduce the pricing error in thederivative if the correlation between the known and unknown pricing errorsis sufficiently high, but the discrete nature of the state space and the effectsof different degrees of smoothness of the two payoff functions makes a preciseanalysis difficult. In the simulation context the method works by using thesame pseudorandom numbers to calculate payoffs of both the derivativeand the comparison security. But here, since the state space is no longer

Page 562: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

Simulation 545

highly discrete and statistical analysis is more appropriate, we will be ableto adjust the scale of the pricing error of the comparison asset so as to getsomething close to the maximum possible reduction in mean squared error.

To see how, suppose once again that our goal is to estimate Eg(Y ) =∫g(y) · dF (y) via simulation. Introduce a new function h(Y ) whose math-

ematical expectation Eh is actually known—h will be the payoff functionof a derivative that can be priced analytically—and consider the followingmodified estimator of Eg:

EgC(α) ≡ N−1N∑

j=1

g(yj) − α[h(yj) − Eh]

= Eg − α(Eh − Eh),

where Eh = N−1∑N

j=1 h(yj) and where Eg is the standard simulation

estimate. The difference Eh − Eh represents the error correction for thecontrol variate. In choosing the weighting factor α one wants to minimize

V EgC(α) = V Eg − 2αCov(Eg, Eh) + α2V Eh,

and the optimal choice is the least-squares (population) regressioncoefficient,

α∗ = Cov(Eg, Eh)/V Eh = Cov[g(Y ), h(Y )]/V h(Y ).

The same simulation that produces Eg and Eh generates a consistent esti-mate of α∗; namely,

α∗ =

∑Nj=1 g(yj)[h(yj) − Eh]∑N

j=1[h(yj) − Eh]2. (11.5)

Notice that realizations of g and h do not have to be stored, because thesesums can be accumulated as the simulation progresses.6

Since α∗ is also subject to sampling error, it is not necessarily the casethat the estimator EgC(α∗) ≡ Eg − α∗(Eh − Eh) has lower variance thanEg, but it can be shown (Fishman, 1996, section 4.2) that V EgC(α∗) ≤V Eg if and only if α∗ lies between zero and 2α∗. Since the variance of α∗

6An alternative is to estimate α∗ in a separate experiment with different randomdeviates. Ideally (abstracting from the limitations of pseudorandom number generators)

this makes α∗ statistically independent of cEg and cEh and preserves the unbiasedness ofthe control-variate estimator.

Page 563: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

546 Pricing Derivative Securities (2nd Ed.)

can be estimated consistently as

V α∗ =N−1

∑g(yj) − α∗[h(yj) − Eh]2∑N

j=1[h(yj) − Eh]2,

and since (V α∗)−1/2(α∗ − α∗) is asymptotically standard normal, the sim-ulation results themselves yield an estimate of the probability of the eventthat V EgC(α∗) ≤ V Eg. This is P(|α∗ − α∗| ≤ |α∗|) .= 2Φ(α∗/

√V α∗) − 1

for large n. The nice thing here is that one has complete control over thesample size on which the asymptotic approximations depend.

While one usually takes for granted the desirability of using an unbi-ased control variate so as to preserve (apart from sampling error in α∗) theunbiasedness of the final estimate, Fu et al. (1999) describe an applicationin which a biased control variate can improve the accuracy. Comparingsimulation estimates of fixed-strike, continuous arithmetic-average Asianoptions with estimates from the analytical formula by Geman and Yor(1993), they record the performance of control variates based on both dis-crete and continuous geometric-average options. Exact analytical formulasfor both the geometric cases are known—for example, (7.77) on page 366for the continuous case. Since sample paths generated by the simulationactually evolve in discrete time, simulation estimates of discrete geomet-ric options are unbiased, while those for the continuous options contain adiscretization bias. Likewise, there is discretization bias in simulation esti-mates of the continuous arithmetic-average Asians, and the finding by Fuet al. (1999) is that the two biases tend to cancel when pricing errors fromcontinuous geometrics are used as the control variate.

11.2.5 Richardson Extrapolation

We have explained the use of Richardson extrapolation in binomial pric-ing (section 5.5.2) and again in connection with the Geske-Johnson (1984)approximation for pricing American puts (section 7.1.2). In both cases thepurpose was to reduce errors associated with discrete-time approximations.The method finds the same application in Monte Carlo. Discretization isrequired whenever it is necessary to simulate entire sample paths, as inpricing path-dependent options or when volatility is stochastic. ApplyingRichardson extrapolation, one generates a sequence of Monte Carlo esti-mates of the derivative, D(hk)K

k=1, where D(hk) is based on nk subdi-visions of the derivative’s life into intervals each of length hk = T/nk.

Page 564: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

Simulation 547

Approximating D(h) as a (K − 1 )th-order polynomial and solving the K

equations

D(hk) = D(0) + a1hk + a2h2k + · · · + aK−1h

K−1k , k = 1, 2, . . . , K,

for the intercept D(0) give an estimate of the limiting value of D(h) ash → 0. Fu et al. (1999) describe an application to Asian options.

11.3 Applications

This section gives several extended examples of the use of Monte Carlomethods in pricing derivatives, progressing in order of increasing com-plexity. We consider first how to price European options on a portfolioof assets whose log prices are correlated Brownian motions. In this simplecase the price of the portfolio and the option’s payoff on the expiration datecan be simulated in a single step without developing entire sample paths.This offers a simple framework in which to illustrate the use of antitheticand control variates. We then turn to European options on an underlyingwhose price follows a process with stochastic volatility. Here, when volatil-ity shocks and price shocks are correlated, complete price paths do have tobe constructed even though the derivative’s payoffs are path-independent.The added complication of path-dependent payoffs is introduced in thenext example, where simulation is used to price variable-strike lookbackcalls in the stochastic-volatility setting. Finally, we examine in detail sev-eral schemes for pricing American/Bermudan-style options via simulation.

11.3.1 “Basket” Options

The value of a static portfolio of assets whose prices follow geometric Brow-nian motion does not itself follow this simple process. Thus, it is inconsis-tent to assume both that future prices of a collection of assets are jointlylognormal and that the future value of a convex combination of them islognormally distributed. As discussed in section 6.3.3, lognormality can bea good approximation for the distribution of the value of a well-diversifiedportfolio, and this is the usual justification for the common practice of pric-ing index options by Black-Scholes. However, options on a small “basket”of underlying securities are another matter. Since analytical solutions aredifficult, and since the dimensionality of the problem rules out binomialand finite-difference methods, simulation is the natural vehicle for pricingsuch basket options.

Page 565: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

548 Pricing Derivative Securities (2nd Ed.)

First, let us consider the straightforward application of Monte Carlowithout the use of variance reduction. We begin with a model for the pricesof the n assets in the basket. Let R =ρjkn

j,k=1 be the correlation matrixof the assets’ log returns, and let σjn

j=1 be their volatilities. Assume thatthe individual prices evolve as

dSjt/Sjt = µjt · dt + σj · dW ∗jt, j = 1, 2, . . . , n,

where the W ∗jt are correlated standard Brownian motions, with

E0W∗jtW

∗kt = ρjkt. With this setup the individual future prices are condi-

tionally lognormal, and prices of different assets are dependent. Assumingconstant short rate and continuous dividend rates, the model becomes

dSjt/Sjt = (r − δj) · dt + σj · dW ∗jt, j = 1, 2, . . . , n, (11.6)

under risk-neutral measure P. The W ∗jt are standard Brownian motions

under P with the same correlations as under P. The goal is to price aT -expiring European call on a static portfolio of these assets—the bas-ket portfolio. Letting νjn

j=1 be the portfolio weights, the basket’s time-t value is Kt ≡

∑nj=1 νjSjt. The European call is worth (KT − X)+ at

expiration and

CE(K0, T ) = e−rT E0(KT − X)+,

at t = 0, where E is expectation under P.Using Monte Carlo to estimate the expectation requires simulating N

realizations of KT and averaging. How is this done? First, we have to seewhat the model implies about the terminal distribution under P of theprices of the individual assets in the basket. From (11.6) each SjT can beexpressed as

SjT = Sj0 exp[(r − δj − σ2j /2)T + σjW

∗jT ]. (11.7)

Representing W ∗jT as Z∗

j

√T , where Z∗ ∼ N(0,R), one sees that obtaining

realizations of the SjT requires obtaining realizations of n normals withcorrelation matrix R. An efficient way to do this is to construct the Choleskydecomposition of R. This expresses R as the product of an n × n lower-triangular matrix L times its transpose; that is, as R = LL′ or

1 ρ21 · · · ρn1

ρ21 1 · · · ρn2

......

. . ....

ρn1 ρn2 · · · 1

=

1 0 · · · 0

λ21 λ22 · · · 0...

.... . .

...λn1 λn2 · · · λnn

1 λ21 · · · λn1

0 λ22 · · · λn2

......

. . ....

0 0 · · · λnn

.

Page 566: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

Simulation 549

Efficient routines to accomplish this are widely available; e.g., Press et al.(1992). Taking Z = Zjn

j=1, a vector of pseudorandom standard normals,Z∗ can now be constructed as LZ. Thus,

Z∗1

Z∗2

...Z∗

n

=

Z1

λ21Z1 + λ22Z2

...∑nj=1 λnjZj

.

Multiplying each Z∗j by

√T to get W ∗

jT , one then constructs realizationsof the SjT as in (11.7) and, in turn, of (KT − X)+ = (

∑nj=1 νjSjT − X)+.

Letting KT (m) be the mth of N realizations, the simulation estimate ofthe call’s value is

CE(K0, T ) = N−1N∑

m=1

[KT (m) − X ]+.

Now let us apply some variance-reduction tricks, beginning with the useof antithetic variates. Since the SjT are functions of the symmetricallydistributed elements of Z, antithetic versions of payoff [KT (m) − X ]+ ineach replication m can be had simply by constructing vectors of asset pricesfrom both Z∗ and −Z∗. With KT (m)+ and KT (m)− as the two resultingportfolio values, the estimator

CEA (K0, T ) = (2N)−1

N∑m=1

[KT (m)+ − X ]+ + [KT (m)− − X ]+ (11.8)

remains unbiased and can have variance no larger than that of CE(K0, T ).Introducing a control variate takes a bit more work, as we must find

some asset whose payoff is highly correlated with [KT (·) − X ]+ over thevarious realizations but whose initial value is known analytically. Shaw(1998) suggests as a control variate the difference between the payoff andprice of a European call on an asset whose log price is normally distributedwith variance matching that of the basket portfolio. In other words, oneconstructs an asset that mimics the basket portfolio but has a lognormalreturn. Realizations are obtained by using a scaled sum of the same normaldeviates that were used to generate values of the basket.

Getting into the details, the model for the terminal value of the mim-icking asset under P is

K∗T = K∗

0 exp

(r − δK − σ2K/2)T +

n∑j=1

νjσjW∗jT

,

Page 567: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

550 Pricing Derivative Securities (2nd Ed.)

where K∗0 = K0 =

∑nj=1 νjSj0, δK =

∑nj=1 νjδj , and σ2

K =∑n

j,k=1 νj

νkσjσjρjk. Under this model the Black-Scholes formula gives the actualprice of a European call as

CE(K∗0 , T ) = B(0, T )fK∗Φ[q+(fK/X)] − XΦ[q−(fK∗/X)],

where

q±(x) =lnx ± σ2

KT/2σK

√T

,

fK∗ = B(0, T )−1K∗0e−δKT .

Simulating the payoffs, one obtains antithetic versions of the call’s termi-nal value on replication m as [K∗

T (m)+ −X ]+ and [K∗T (m)− −X ]+, where

K∗T (m)± are constructed from the same realizations of the W ∗

jT that wereused to produce KT (m)±. The simulation estimate of the call on the mim-icking asset, CE

A (K∗0 , T ), is then obtained as in (11.8).

Now let CEAC(K0, T ) represent the simulation estimate of the value of

the basket call that applies both antithetic and control-variate techniques.This will be obtained by adjusting the antithetic estimate (11.8) by thepricing error for the call on the mimicking lognormal asset, CE

A (K∗0 , T ) −

CE(K∗0 , T ), as

CEAC(K0, T ) = CE

A (K0, T ) − α[CEA (K∗

0 , T )− CE(K∗0 , T )].

Recalling the discussion in section 11.2.4, one obtains an estimate of theoptimal weighting parameter, α∗, by regressing the average of the two anti-thetic realizations of the payoff of the basket call on the realizations of theerrors in pricing the control asset. Specifically, let

gm ≡ [KT (m)+ − X ]+ + [KT (m)− − X ]+/2

be the payoff of the basket option on the mth realization,

hm ≡ [K∗T (m)+ − X ]+ + [K∗

T (m)− − X ]+/2

be the payoff of the control asset, and Eh = CE(K∗0 , T ) be the Black-

Scholes price of the control. Then, corresponding to (11.5), the estimateof α∗ is

α∗ =

∑Nj=1 gm(hm − Eh)∑N

j=1(hm − Eh)2.

Page 568: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

Simulation 551

11.3.2 European Options under Stochastic Volatility

Let us now see how to use simulation to price path-independent, European-style derivatives on an underlying asset whose price is an Ito processwith stochastic volatility. Specifically, consider the models discussed insection 8.3, wherein under P

dSt = rtSt · dt + σtSt · dW1t

dσt = λt · dt + γt(ρt · dW1t + ρt · dW2t).

Here λt is the risk-adjusted mean drift of volatility, ρt is the correla-tion between the shocks to price and volatility (with ρt =

√1 − ρ2

t ), andW1t, W2t are independent P Brownian motions.

There are two general approaches to pricing derivatives in this setup.The one to use depends on the value of the correlation between shocks. Aswe saw in connection with the models of Scott (1987) and Hull and White(1987), if ρt = 0 for all t the price of a European-style, path-independentderivative can be expressed in terms of the average volatility over the deriva-tive’s lifetime, σ2

T = T−1∫ T

0σ2

t · dt. In this case, assuming that the deriva-tive can be valued analytically when volatility is constant, only the volatilityprocess needs to be simulated. The procedure is as follows. One realizationof the path of volatility over a discrete time lattice, σtjn

j=1, is gener-ated (in the manner described below) and the time average is calculated asσ2

T = n−1∑n

j=1 σ2tj

. Using σT one then evaluates analytically the expecta-tion of D(ST , 0) (the derivative’s terminal value) conditional on σT = σT ,and discounts to the present. This produces a realization of the derivative’sinitial value conditional on σT = σT . Repeating this N times and averagingdelivers the final, unconditional estimate, D(S0, T ). Thus, when shocks toprice and volatility are uncorrelated, pricing a vanilla European-style putor call requires simulation only to estimate σT .

Things work very differently when ρt = 0. In this case price and volatil-ity shocks are not independent, and the conditioning argument that deliversan analytical expression for price given σT no longer applies. Moreover, thedependence between volatility shocks and price shocks would in any caseprevent one from determining σT without modeling the evolution of pricesimultaneously. Dealing with the case ρt = 0 therefore requires simulatingpaths of both σt and St and estimating D(S0, T ) by discounting theaverage of the resulting realizations of terminal value, D(ST , 0). We nowdescribe this more general procedure in detail.

Page 569: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

552 Pricing Derivative Securities (2nd Ed.)

The process begins by defining a lattice of time steps, as tj = jT/n.Starting at t = t0 = 0 with S0 observed and σ0 a known parameter, andproceeding through steps j = 1, 2, . . . , n, generate

σtj = σtj−1 + λtj−1T/n + γtj−1

√T/n(ρtj−1Z1j + ρtj−1

Z2j) (11.9)

Stj = Stj−1 exp[(rtj−1 − σ2tj−1

/2)T/n + σtj−1

√T/nZ1j ], (11.10)

where Z1j and Z2j are independent pseudorandom normal deviates. Theadapted processes λt and γt can be specified in very general waysto depend on current and past values of volatility, or price, or both. Anexample is Scott ’s (1987) mean-reverting specification for σt, in whichλt = ξ(σ∞ − σt)− λ∗. Price-level (“c.e.v.”) effects could be accommodatedby letting λt and/or γt depend on St. Alternatively, as in the Hobson–Rogers (1998) model (section 8.2.5), these could depend on an average ofpast values of S. Straight-on Monte Carlo then just requires averaging anddiscounting the realizations of the derivative’s payoff, as

D(S0, T ) = B(0, T )N−1N∑

m=1

D[ST (m), 0].

Hull and White (1987) suggest the following way to employ antitheticvariates in this setup. The realization of Z1j, Z2jn

j=1 on replication m isused to calculate four terminal prices of the underlying and four correspond-ing terminal payoffs of the derivative. Terminal price ST (m)++ is obtainedusing the realizations of each pair Z1j, Z2j that were actually generated;then prices ST (m)−+, ST (m)+−, and ST (m)−− are calculated by reversing,respectively, the signs of the first, the second, and both members of eachpair. The resulting estimate of the current value of the derivative is then

DA(S0, T ) = (4N)−1N∑

m=1

D[ST (m)++, 0]+ + D[ST (m)+−, 0]+

+ D[ST (m)−+, 0]+ + D[ST (m)−−, 0]+.

Hull and White propose as control variates the differences between payoffsof European options and Black-Scholes prices, evaluating the payoffs atterminal prices ST (m)+ and ST (m)− generated from constant-volatilitydiffusions with σt = σ0 and the realizations Z1j and −Z1j, respectively.An alternative choice would be deviations of payoffs from the computationalsolution of Heston ’s (1993) mean-reverting square-root model for volatility,(8.22). Because they do pick up variation in volatility, one expects these

Page 570: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

Simulation 553

deviations to be more highly correlated with payoffs implied by the specificvolatility model being applied. Of course, one can experiment with differentcontrol variates in search of one that is best.

Whatever variance-reduction scheme is used to reduce the required num-ber of replications, there remains the issue of how many time steps, n, toallow. The issue here is not one of statistical efficiency, but one of biasand inconsistency. Since the choice of n will be problem-specific, the onlyfeasible approach is to experiment by increasing n until estimates seem toapproach a limit.

11.3.3 Lookback Options under Stochastic Volatility

Clewlow and Carverhill (1994) use an interesting adaptation of the control-variate technique to price lookback options under stochastic volatility. Themethod is not limited to lookbacks but is applicable whenever the entiresample path of the underlying must be simulated. The idea is to constructone or more control processes that evolve as martingales along the pathand that track (as well as possible) the evolving conditional expectation ofthe derivative’s ultimate payoff.

The essence of the technique of “martingale variance reduction” is tobuild control variates from innovations in adjustments to a portfolio thatwould approximately replicate the derivative being priced. By “innovations”is meant deviations from conditional expected values. To price a variable-strike lookback call under stochastic volatility in a model like (11.9) and(11.10), Clewlow and Carverhill develop three variates that arise fromdelta, gamma, and sigma hedging a lookback call under geometric Brow-nian motion. In the same way that delta hedging replicates a derivative’sresponses to the underlying price, “gamma” hedging replicates changes inthe derivative’s delta, and “sigma” hedging replicates its response to volatil-ity. Since we have a specific formula—namely, (7.66)—for the value of thelookback under these Black-Scholes dynamics, hedging parameters can bedeveloped analytically; and from these, the martingales that track the valueof the lookback. Specifically, taking d0 = g0 = v0 = 0, define processes dt,gt, and vt at the discrete times tj = jT/nn

j=1 as

dtj = dtj−1 + CV LS (Stj−1 , T − tj−1; σtj−1 , S0,tj−1

)[∆Stj − Etj−1∆Stj ]

gtj = gtj−1 + CV LSS (Stj−1 , T − tj−1; σtj−1 , S0,tj−1

)[(∆Stj )2 − Etj−1 (∆Stj )

2]

vtj = vtj−1 + CV Lσ (Stj−1 , T − tj−1; σtj−1 , S0,tj−1

)[∆σtj − Etj−1∆σtj ].

Page 571: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

554 Pricing Derivative Securities (2nd Ed.)

Here the ∆’s represent one-step changes from tj−1 to tj ; subscripts onC denote partial derivatives; and the argument S0,tj−1

in C representsthe minimum price up to tj−1. The one-step change in variable dt repre-sents the innovation in the adjustment required to delta-hedge the lookbackcall. Likewise, changes in gt and vt are innovations in the adjustments forgamma hedging and sigma hedging. Having zero conditional mean, thesechanges are martingale differences, so that dt, gt, and vt are them-selves discrete martingales. At the end of the sample path generated onreplication m one observes dT (m), gT (m), and vT (m), as well as the call’spayoff, ST (m) − S0,T (m). The control-variate estimator then takes theform

CV L = N−1N∑

m=1

ST (m) − S0,T (m) − [αddT (m) + αggT (m) + αvvT (m)],

where the α’s are weights on the adjustment factors. Estimates of variance-minimizing values of the α’s can be found by regressing the realizations ofST − S0,T on those of dT , gT , and vT .

Clewlow and Carverhill (1994) report that martingale variance reduc-tion produces a 14-fold drop in the standard error of the Monte Carloestimates at the cost of trebling the computational time per replication.Since a 14-fold reduction in standard error would require about a 200-foldincrease in replications, the method reduces by a factor of more than 60the overall computational time to achieve a given accuracy.

A final cautionary note: Monte Carlo pricing of derivatives whose pay-offs depend on extrema of the underlying price is highly sensitive to thenumber of time steps, n. At a minimum, n must correspond to the con-tractually specified sampling frequency used to identify the extremum, butlarger values may be required if volatility or interest rates are modeled asstochastic.

11.3.4 American-Style Derivatives

Simulation was long thought to be unsuited to the task of pricing optionssubject to discretionary early exercise. Nevertheless, the need to valuederivatives with complex payoff structures or depending on multiple statevariables has motivated research that has led to significant progress. Here

Page 572: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

Simulation 555

we summarize three very different approaches to the problem7: (i) flexibleparametric modeling of exercise regions; (ii) flexible parametric modelingof continuation values; and (iii) a nonparametric tree-based technique thatapplies binomial-like backward induction along generated sample paths.While all three methods have shown promise and have found useful appli-cation, we will see that all have limitations that continue to motivateongoing research. In explaining each method we focus on the simplest ofapplications—American puts under Black-Scholes dynamics—then wrap upby comparing the approaches in a more challenging setting.

Modeling Exercise Boundaries/Stopping Times

Recall from our discussion of American options in sections 5.4.3 and 7.1.2that the initial value of a T -expiring, vanilla American put struck at X canbe represented as

PA(S0, T ) = Ee−rU (X − SU ), (11.11)

where U = inft : St ∈ [0, Bt]. Here Bt0≤t≤T is the optimal exer-cise boundary and U is the optimal stopping time. The sense of (11.11)is that the put generates a receipt whose present value is e−rU (X − SU )at the first time U ≤ T at which the underlying price enters a certaintime-dependent region, Bt = [0, Bt] ⊂ . The option’s current value is sim-ply the (risk-neutral) average of these discounted payoffs. More generally,a T -expiring derivative’s value at t may depend on not just one under-lying price but on some q-vector Yt of state variables, as D(Yt; T − t);for example, for an American put option on an underlying whose pricefollows a stochastic-volatility diffusion, we would have Yt = (St, σt)′ andD(Yt; T − t) = PA(St, σt; T − t). In this q-variate case Bt becomes a time-dependent region in q, and the optimal stopping time becomes U = inft :Yt ∈ Bt. Thinking of a put’s value in this way—i.e., as the maximum ofexpected discounted payoffs over possible exercise regions/stopping times—there springs to mind an obvious and natural sequence of steps for pricingby simulation.

7For further detail and other approaches to pricing by simulation consult Garcia(2003), Longstaff and Schwartz (2001), Broadie and Glasserman (1997), and Boyle et al.(1997). Glasserman (2004, chapter 8) gives an excellent overview, with emphasis ontree-based methods.

Page 573: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

556 Pricing Derivative Securities (2nd Ed.)

1. Adopt some flexible specification that expresses the exercise region interms of time, the relevant state variables Yt, and a manageably smallset of parameters, θ, as Bt(Yt; θ). This implicitly defines a parametricstopping time U(Yt; θ) = inft : Yt ∈ Bt(Yt; θ).

2. Given the maintained form of Bt(Yt; θ) and a specific θ, generateN sample paths, Yt(ω)0≤t≤U(ω;θ),ω∈1,2,...,N, under the appropriatemartingale dynamics, where U(ω; θ) ≡ U(Yt(ω); θ) is the first time onpath ω at which Yt is within Bt(Yt; θ). Letting X(Yt) represent theintrinsic (exercise) value of the derivative at t, average over the paths toestimate the derivative’s value (given θ) as

DN (Y0, T ; θ) = N−1∑N

ω=1e−rU(ω;θ)X(YU(ω;θ)(ω)).

3. With Θ as the set of admissible values of θ, find (if possible) a value θN

such that DN (Y0, T ; θN ) = maxθ∈Θ DN(Y0, T ; θ).

If the specification Bt(Yt; θ) is adequate, it seems that following these stepswith sufficiently large N should produce an estimate of arbitrarily highprecision.

While this is almost true, things are not quite so straightforwardas they seem. The first problem—and a relatively minor one—is thatDN (Y0, T ; θN ) is an upward-biased estimate of

D(Y0, T ; θ0) ≡ maxθ∈Θ

D(Y0, T ; θ) = maxθ∈Θ

Ee−rU(Yt;θ)X(YU(Yt;θ))

(assuming that such θ0 exists). This can be understood by recognizing thatthe maximum of an average can be no larger—and can be smaller—thanthe average of maxima. That is,

D(Y0, T ; θ0) = maxθ∈Θ

EDN(Y0, T ; θ)

≤ E maxθ∈Θ

DN(Y0, T ; θ)

≡ EDN (Y0, T ; θN ),

since on the right of the inequality we average the largest values of membersof an ensemble while on the left each member is evaluated at the same point.Alternatively, we can recognize that DN(Y0, T ; θN ) ≥ DN (Y0, T ; θ0) foreach realization of θN , since θN maximizes the mean of the empirical dis-tribution of discounted payoffs rather than the actual expected value. Theusual approach to confronting this problem is to undertake a fourth step,as follows:

Page 574: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

Simulation 557

4. With θN determined from step 3, generate an additional N ′ samplepaths and obtain a second estimate as

DN,N ′(Y0, T ; θN ) = N ′−1N ′∑

ω=1

e−rU(ω;θN )X(YU(ω;θN )(ω)).

Under the condition that Bt(Yt; θ0) represents the true functional formof the boundary, this second estimate is now downward biased, because

Ee−rU(Yt;θN )X(YU(Yt;θN )) | θN ≤ Ee−rU(Yt;θ0)X(YU(Yt;θ0)(ω))

for each path ω ∈ 1, 2, . . . , N ′. However, from the ensembles ofN and N ′ sample paths one can estimate the standard errors ofthe payoffs, sN and sN ′ , say, and construct two asymptotically valid100(1 − α)% confidence intervals for D(Y0, T ; θ0) as DN(Y0, T ; θN ) ±zα/2

sN√N

and DN,N ′(Y0, T ; θN ) ± zα/2sN′√

N ′ , where zα/2 is the upper-α/2quantile of the standard normal. The meta-interval [DN,N ′(Y0, T ; θN ) −zα/2

sN′√N ′ , DN(Y0, T ; θN )+zα/2

sN√N

] then represents a conservative (asymp-totic) 100(1− 2α)% confidence interval for D(Y0, T ; θ0).8 In principle, theinterval can be made as tight as one desires by taking N and N ′ sufficientlylarge.

While this seems a powerful result, there are several reasons not to taketoo much comfort in it. Continuing the list of “problems” alluded to above,one’s model for the optimal exercise region is almost certain to be wrong,in the sense that there is no admissible θ for which Bt(Yt; θ) correspondsto the true optimal region. This specification error contributes a down-ward bias of indeterminate magnitude, which partially or wholly offsetsthe upward bias in DN (Y0, T ; θN ). Indeed, with specification error there isno basis for assuming that either DN(Y0, T ; θN ) or DN,N ′(Y0, T ; θN ) con-verges in probability to D(Y0, T ) as N, N ′ → ∞, and the more complicatedare the dynamics of Yt and the structure of the derivative the less insightone is apt to have about the proper model. Next, the fact that sample pathsare generated in discrete time introduces a discretization error that limitsthe possible times of exercise. This is a second source of downward bias andinconsistency (with respect to N, N ′). Finally, unless N is extremely large,

8This follows from Bonferroni’s inequality: For arbitrary events A, B on (Ω,F , P),

P(A ∩ B) ≥ P(A) + P(B) − 1.

In this case A and B represent the events that the respective intervals containD(Y0, T ;θ0).

Page 575: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

558 Pricing Derivative Securities (2nd Ed.)

in even the simplest of applications one is unlikely to obtain a good imageof the boundary for all values of t and Yt using paths from a given initialstate Y0. The reason is that relatively few sample paths will enter the lessaccessible parts of the exercise region, and so the experiment will providelittle information about these areas. The implication is that new simula-tions will have to be performed whenever there are appreciable changes ininitial conditions, even for derivatives that are far from expiration.

As an example, figure 11.1 shows three estimates of the boundary ofthe optimal exercise region for a vanilla put on a no-dividend stock, basedon sample paths starting at S0 ∈ 0.9, 1.0, 1.1. In each case St0≤t≤T

follows Black-Scholes dynamics with r = .05 and σ = .20, and the put hasX = T = 1. Since St is now the only state variable and the derivative is avanilla option, the exercise region at t is determined just by specifying itsboundary, as Bt = [0, Bt]. In this experiment Bt0≤t≤T was modeled as acubic spline with knots at 21 time points, and DN (S0, T ; θ) = PA

N (S0, T ; θ)was maximized with respect to the values of Bt at these 21 values oft.9 Clearly, during the early life of the option there are large discrepan-cies among boundary functions estimated from different starting values,S0. Moreover, the estimates in table 11.2 (from N = N ′ = 5,000 replica-tions) display a strong downward bias relative to the corresponding bino-mial estimates.

We can do better than this by using some additional information. Let-ting Bt be the value that equates the Black-Scholes value of a Europeanput to the intrinsic value, as PE(Bt, T − t) = X − Bt, we can parameterizethe difference between the American exercise boundary and Bt as a simplepolynomial in time to expiration. This gives

Bt(θ) = Bt−K∑

k=1

θk(T−t)k = X−PE(Bt, T−t)−K∑

k=1

θk(T−t)k (11.12)

for the boundary and

Bt =

St : X(St) > PE(Bt, T − t) +

K∑k=1

θk(T − t)k

(11.13)

9Values Bt : P E(Bt, T − t) = X − Bt served as an initial guess of the boundary,with P E(·, ·) given by the Black-Scholes formula. Knots were placed at t = 0 and atpoints tj20

j=1 that divided range [B0, BT = X] = [.895, 1.00] into 21 equal parts, as

Btj = B0+j(X−B0)/21. The simulations used N = N ′ = 5, 000 sample paths evaluatedat 200 evenly-spaced points on [0, 1]. Optimization was by the simplex algorithm in Presset al. (1992). The table compares averages of simulated “high” and “low” estimatesP A

N,N′ (S0, T ; θN ), P AN (S0, T ;θN ) with binomial estimates from 5,000 time steps.

Page 576: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

Simulation 559

0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

t

0.86

0.88

0.90

0.92

0.94

0.96

0.98

1.00

t

S0 = 0.90S0 = 1.00S0 = 1.10

Fig. 11.1. Exercise boundaries for P A(S0, T = 1; X = 1) by simulation and splineapproximation, from initial values S0 ∈ .90, 1.0, 1.1.

Table 11.2. Binomial vs. simulation estimates of P A(S0, T = 1; X = 1) withspline-approximated exercise boundary.

S0 BinomialP A

N,N′+P AN

2

P AN,N′+P A

N

2± z.025sN′/

√N ′

0.9 .1149 .1113 [.1099, .1126]

1.0 .0609 .0598 [.0582, .0615]

1.1 .0299 .0287 [.0275, .0300]

for the corresponding exercise region, where X(St) ≡ X − St. Much ofthe structure of the optimal boundary is thus imported from that of Bt,making it much simpler to approximate. Figure 11.2 shows the boundariesestimated again from three values of S0 with a cubic approximation toBt(θ) (K = 3). The number of samples and the number of time steps arethe same as before. Not only are the estimated boundaries close together,all are nicely monotonic. Table 11.3 confirms one’s expectation that betterestimates of values accompany the more reasonable estimates of boundaries.Note that the binomial estimates are all safely inside the confidence bins.

Page 577: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

560 Pricing Derivative Securities (2nd Ed.)

0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

t

0.70

0.75

0.80

0.85

0.90

0.95

1.00

t

S0 = 0.90S0 = 1.00S0 = 1.10

Fig. 11.2. Exercise boundaries for P A(S0, T = 1; X = 1) by simulation and polynomialapproximation, from initial values S0 ∈ .90, 1.0, 1.1.

Table 11.3. Binomial vs. simulation estimates of P A(S0, T = 1; X = 1)with polynomial-approximated exercise boundary.

S0 BinomialP A

N,N′+P AN

2

P AN,N′+P A

N

2± z.025sN′/

√N ′

0.9 .1149 .1143 [.1119, .1167]

1.0 .0609 .0607 [.0586, .0628]

1.1 .0299 .0305 [.0290, .0320]

Almost as good accuracy can be achieved by representing the boundaryand corresponding exercise region as

Bt(θ) = X − PE(St, T − t) −K∑

k=1

θk(T − t)k (11.14)

Bt =

St : X(St) > PE(St, T − t) +

K∑k=1

θk(T − t)k

. (11.15)

Page 578: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

Simulation 561

This saves having to solve for Bt, although at the cost of generatingless-smooth estimates of the boundary.

An alternative to modeling the boundary at each t ∈ [0, T ] is to limitthe feasible exercise times to some small set tjn

j=1, with tn = T . In thiscase one is really valuing a Bermudan-style counterpart to an Americanderivative. For a vanilla option with just one state variable the parametervector θ would simply comprise the values Btjn−1

j=0 (taking BT = X). Forcases involving multiple state variables Y the boundary of each region Btj

would have to be specified as some flexible function of Ytj . Unfortunately,the optimization problem quickly gets out of hand as n and the number ofstate variables increase.

Modeling Continuation Value

The discrete-time, discrete-state binomial approach to valuing vanillaAmerican options relied on a different characterization than (11.11); namely(for an American put on a single underlying asset)

PA(Stj , T − tj) = (X−Stj )∨B(tj , tj+1)Etj PA(Stj+1 , T − tj+1). (11.16)

This represents the put’s value at tj as the greater of intrinsic value,X(Stj ) ≡ X − Stj , and continuation value—the discounted value if (opti-mally) held to tj+1. Determining the continuation value was straightforwardin the binomial scheme, and to do it by simulation will require followinga similar program. Specifically, the opportunities for exercise must againbe limited to a finite set of times tjn

j=1, and we will again have to workbackwards in time, since the continuation value is known a priori only at T

(where it is equal to zero). At each earlier time tj the continuation value willhave to be estimated from what is then known to the simulator—namely,the values of all state variables on all paths to that point. While there area number of ways to do it, the LSM (Least Squares Monte Carlo) methodproposed by Longstaff and Schwartz (2001) has proved the simplest andmost popular. The main attraction is that estimation is done by ordinaryleast squares rather than by numerical search, which makes the optimiza-tion part very fast. We start with a brief overview and then give an exampleand further details.

The key insight underlying the LSM method is that a derivative’s contin-uation value at any time is simply the (risk-neutral) conditional expectationof the discounted cash flows received from holding it until expiration. Theplan is to represent the continuation value as some flexible function of the

Page 579: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

562 Pricing Derivative Securities (2nd Ed.)

current values of the state variables Y on which the value of the derivativedepends and then to estimate this function at each time step by regression.To make things easier, we consider first a single state variable, Y = S,and focus on pricing a vanilla American put. In this case the program is asfollows.

We begin by generating N sample paths of S in the appropriate risk-neutral measure and under the assumed dynamics. The simulation thussupplies N realizations at each of the feasible exercise dates,

Stj (ω)j,ω∈1,2,...,n×1,2,...,N.

Next, we specify a flexible functional form for the continuation value ateach tj as

V (Stj , T − tj ; θj) =m∑

k=0

θjkbk(Stj ) = b(Stj )θj ,

where b(·) = (b0(·), b1(·), . . . , bm(·)) is a vector of prescribed “basis” func-tions, and θj = (θj0, θj1, . . . , θjm)′ is a vector of parameters to be esti-mated. For example, we might take bk(Stj ) = Sk

tj, making V a simple

polynomial. Next, we start a backwards recursion from tn = T. Sincethe continuation value there is V (ST , 0) ≡ 0, we have for PA(ST , 0) theoption’s contractually specified terminal value. The realizations of theseterminal values, [X − ST (ω)]+N

ω=1, are the actual cash flows that wouldbe realized at T contingent on no early exercise. We store these in an N -vector C, keeping track of the fact that they were realized at T . Next,stepping back to tn−1 we must estimate V (Stn−1 , T − tn−1; θn−1) , rec-ognizing this is as the approximant of the conditional expectation of thediscounted future cash flows in measure P. To estimate it, we discountthe elements of C back to time tn−1 (multiplying by the one-period bondprice B(tn−1, T )) and store the discounted values as Cn−1. Letting Bn−1

be the N × (m + 1) matrix of realizations of the basis functions andUn a vector of “errors”, we have the identity Cn−1 = Bn−1θn−1 + Un.We obtain an estimate θn−1 = (B′

n−1Bn−1)−1B′n−1Cn−1 by regressing

the present (date-tn−1) value of the cash flows on the basis functions.Then from θn−1 we get an estimate of the continuation value on eachpath as Vn−1(ω) ≡ V (Stn−1(ω), T − tn−1; θn−1) = b(Stn−1(ω))θn−1. IfVn−1(ω) < X − Stn−1(ω), the option is assumed to be exercised at tn−1 onpath ω, thereby generating a contingent cash flow. Since exercise at tn−1

Page 580: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

Simulation 563

precludes exercise at T , we replace the old element ω in C with this newcash flow, again keeping track of the date at which it was realized. Once thishas been done for all ω we move back to tn−2, discount all elements of Cfrom the dates received back to tn−2, store them in Cn−2, and regress Cn−2

on the realizations of the basis functions, Bn−2. This gives an estimate ofthe continuation value on each path ω at tn−2, which is again comparedwith the intrinsic value and used to update C as before in preparation fortime step tn−3. Following these steps through t = t1 yields a record C ofthe cash flows on all paths and the dates at which received. From these weget initial continuation value V (S0, T ) simply by discounting the cash flowsback to t = 0 and averaging over paths. Thus, at this step no parametricapproximant is needed. At last, we obtain the estimate of the initial valueof the option itself as PA(S0, T ) = V (S0, T ) ∨ (X − S0).

One further detail: Longstaff and Schwartz (2001) argue that samplepaths at which the option is out of the money at a given time step are lessinformative about continuation value, and they recommend excluding suchout-of-money realizations from the regressions. Thus, if the option were inthe money on just Nj sample paths at tj , vector Cj and “design” matrixBj would have only Nj rows, and θj would be estimated from just thoseNj observations.10

Here is a simplified numerical example that takes us through the proce-dure step by step. We price a vanilla put on a no-dividend stock followingBlack-Scholes dynamics with σ = .20 and having initial value S0 = 0.9(see table 11.4). The option’s strike is X = 1 and it expires at T = 3.

We take unit time steps at tj = j3j=1 and set the discount factor as

B(j, j + 1) = .99 at each step. N = 10 sample paths are generated as inthe table, and terminal values [X − S3(ω)]+N

ω=1 are recorded in a vectorC (last column of table 11.4).

We take as basis functions of underlying price (the only state variable)the monomials bk(Sj) = Sk

j 2k=0 and so obtain a quadratic approximant

to the continuation value at step j ∈ 1, 2: V (Sj , T − j; θj) ≡ Vj = θj0 +θj1Sj + θj2S

2j . Commencing at step 2, we construct the regression design

matrix B2 and the vector C2 as in table 11.5. Elements of C2 are those of Cin the table above multiplied by the one-step discount factor, .99. However,there are entries only for paths such that X − S2 > 0, since the others

10Of course, to simplify programming one could simply fill the missing rows of Bj

and Cj with zeroes, as this would not affect the least squares estimates.

Page 581: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

564 Pricing Derivative Securities (2nd Ed.)

Table 11.4. Sample paths of St3t=0 and cash flows

C = (X − S3)+ at t = 3.

ω S0 S1 S2 S3 C

1 0.9 0.905 0.737 0.882 .1182 0.9 0.878 1.056 0.865 .1353 0.9 1.119 0.721 0.789 .2114 0.9 0.709 1.079 0.967 .0335 0.9 1.100 1.508 1.636 .0006 0.9 0.722 0.516 0.466 .5347 0.9 1.330 1.584 1.192 .0008 0.9 0.597 0.491 0.640 .3609 0.9 0.887 0.825 0.830 .170

10 0.9 0.895 0.943 0.919 .081

Table 11.5. Estimated continuation values as of t = 2 andrevised cash flows C at t = j.

ω B2 C2 V2 (X − S2)+ C j

1 1.0 .737 .543 .117 .184 .263 .263 22 — — — — — .000 .135 33 1.0 .721 .520 .209 .196 .279 .279 24 — — — — — .000 .033 35 — — — — — .000 .000 —6 1.0 .516 .266 .528 .415 .484 .484 27 — — — — — .000 .000 —8 1.0 .491 .241 .357 .449 .509 .509 29 1.0 .825 .681 .168 .128 .175 .175 210 1.0 .943 .889 .080 .086 .057 .081 3

are not used in the regression. Rows of B2 are realizations of the basisfunctions on the in-the-money paths. For example, since S2(ω) = 0.737 <

1.0 = X on path ω = 1, the first row of B2 is (1.0, .737, .7372 .= .543). Theleast squares estimates, θ2 = (1.466,−2.727, 1.340)′, yield the estimatedcontinuation values V2(ω) = 1.466− 2.727S2(ω) + 1.340S2(ω)2 that appearunder column head V2. These are compared with the exercise values in theadjacent column. When the exercise value is larger, we substitute it for theold corresponding element of cash-flow vector C and record the time stepj at which the new element of C was received (last column).

Now we back up to step 1 and construct new design matrix B1 anddiscounted cash-flow vector C1 as in table 11.6, again using only paths forwhich X − S1 > 0. Entries in C1 are those in corresponding rows of Cin the previous table multiplied by the appropriate discount factors. For

Page 582: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

Simulation 565

Table 11.6. Estimated continuation values as of t = 1 andrevised cash flows C at t = j.

ω B1 C1 V1 (X − S1)+ C j

1 1.0 .905 .818 .261 .172 .095 .263 22 1.0 .878 .770 .132 .160 .122 .135 33 — — — — — .000 .279 24 1.0 .709 .503 .033 .260 .291 .291 15 — — — — — .000 .000 —6 1.0 .722 .521 .479 .243 .278 .278 17 — — — — — .000 .000 —8 1.0 .597 .357 .504 .495 .403 .509 29 1.0 .887 .786 .173 .163 .113 .175 210 1.0 .895 .802 .080 .167 .105 .081 3

example, entries in the first, second, and fourth rows of C1 are (apart fromrounding error) .99(.26314) .= .261, 992(.135) .= .132, and .992(.03335) .=.033.

The least-squares estimates θ1 = (4.003,−9.060, 5.333)′ yield entriesV1(ω) = 4.003 − 9.060S1(ω) + 5.333S1(ω)2 in the V1 column. For paths 4and 6 the exercise values exceed these continuation values, so the cashflows for those paths are updated to construct the new vectors of cash flowsand exercise dates in the last two columns. The entries in the C and j

columns now give a complete record of the cash flows after t = 0. Fromthese we obtain the estimate of continuation value at t = 0 by averagingthe discounted cash flows over the N = 10 paths, as

V (S0, T ) =110

10∑ω=1

B(0, j(ω))C(ω)

=110

[.992(.263) + .993(.135) + · · · + .993(.081)].= 0.197.

Comparing with the initial intrinsic value, X−S0 = 0.1, we value the optionas PA(S0, 3) = 0.197.

Several issues must be addressed before putting this into actual practice.

Selecting basis functions. Longstaff and Schwartz (2001) find that thereis considerable latitude in this choice. However, while monomials (the Skin our numerical example) are sometimes satisfactory, the resulting designmatrices Bj are apt to be ill conditioned and to produce wild estimatesof continuation values. This suggests the use of some orthogonal system,

Page 583: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

566 Pricing Derivative Securities (2nd Ed.)

such as the Laguerre polynomials, given by11

L0(s) = 1

L1(s) = 1 − s

L2(s) = 2 − 4s + s2

...

Lk(s) = es dk

dsk(ske−s).

These are orthogonal on [0,∞) with respect to weight function e−s, in thesense that

∫∞0 Lk(s)Lk′ (s)e−s · ds = 0 for k = k′. Thus, for a nonnegative

state variable S one might take b0(S) = 1 (to allow for a constant term)and bk(S) = e−SLk−1(S) for k ∈ 1, 2, . . . , m. Other candidates wouldbe weighted Hermite polynomials for variables supported on (−∞,∞) andweighted Chebychev or Legendre polynomials for variables with boundedsupport.

Handling multiple state variables. If the value of the derivative dependson more than one state variable (a common case in which simulation isneeded), the basis functions would be products of the one-dimensional basesin the separate variables. Thus, restricting to orders no greater than m,the set of basis functions for state variables Y = (Y1, Y2)′ would compriseM2 ≡ (m+1)(m+2)/2 ≡ (m+1)(M1+1)/2 terms of the form bi(Y1)bk−i(Y2)for i ∈ 0, 1, . . . , k and k ∈ 0, 1, . . . , m. For example, for m = 3 andb0(·) = 1 these would be

1 b1(Y1) b2(Y1) b3(Y1)b1(Y2) b1(Y2)b1(Y1) b1(Y2)b2(Y1)b2(Y2) b2(Y2)b1(Y1)b3(Y2)

.

For Y = (Y1, Y2, Y3)′ there would be M3 = (m + 1)(M2 + 1)/2 terms, andso on, with Mq = (m + 1)(Mq−1 + 1)/2 terms for Y = (Y1, Y2, . . . , Yq)′.Clearly, to keep the problem manageable one must progressively curtail themaximum order as the number of state variables grows.

Performing the regressions. Design matrix Bj may still be ill condi-tioned even with weighted orthogonal polynomials as basis functions, par-

11Consult Abramovitz and Stegun (1970) and Davis (1963) for further description ofthis and other orthogonal systems. Miranda and Fackler (2002) discuss the ill-conditionednature of design matrices built up from monomials.

Page 584: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

Simulation 567

ticularly when some of the state variables can assume large values and theweight functions are exponential, as for Laguerre and Hermite polynomi-als. For derivatives such as vanilla options that are homogenous of degreeone in the underlying price and the strike, it will help to normalize by thestrike and then rescale the final result. However, it is still essential to use aregression routine that can handle ill-conditioned designs.12

Managing memory. The LSM technique is a veritable memory hog. Pro-gramming in the most direct way requires storing in memory the realizationof each state variable at each of the n time steps on each of the N samplepaths. Thus, if there are q state variables, N · n · q words of memory areneeded just for the primitive data, plus another 2N words to record thedates and amounts of cash flows in C, plus enough storage for the largestof design matrices and response vectors Bj,Cj, plus work space for theregression routine. Even with a single state variable, handling just 5,000sample paths and 50 time steps is beyond the capabilities of most p.c.s. Ofcourse, one can write the sample paths to disk and read back as needed, butthat is a slow process. A better solution, at least when working in a Markovframework, is to generate the full sample paths out to T at each time stepj (starting with the same initial seed), but to save only the realizationsYtj (ω)N

ω=1 that are needed for step j. (Realizations at tj+1, . . . , T wouldstill have to be generated—or at least the same number of calls to the gener-ator would have to be made, since otherwise the cash flows in C would nothave been attained on the current paths.) Perhaps the best solution attain-able with straightforward programming is simply to (i) choose a value ofN small enough that the full sample paths can be stored in active memory,(ii) replicate the experiment independently several times starting with newseeds, and (iii) average the final estimates. Table 11.7 compares Longstaffand Schwartz’s (2001) estimates of prices of vanilla American puts fromN = 100,000 paths with averages of 25 sets of estimates from N = 4, 000paths. By generating 2,000 pairs of paths antithetically, one can get eachprice in about nine seconds on a 2.8 g.h. p.c.13

12Longstaff and Schwartz (2001) recommend IMSL routine LSBRR, which uses aniterative refinement algorithm.

13Estimates with N = 100, 000 and the standard errors are from Longstaff andSchwartz (2001, table 1). In each case the strike was X = 40, the interest rate was r = .06,prices followed Black-Scholes dynamics with n = 50 time steps, and continuation values

were represented as affine functions of the first three weighted Laguerre polynomials.Computations for the N = 25 · 4000 estimates were done in FORTRAN using IMSLroutine LBSRR for the regressions.

Page 585: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

568 Pricing Derivative Securities (2nd Ed.)

Table 11.7. LSM estimates of prices of American puts from100,000 paths vs. averages of 25 estimates from 4,000 paths.

S0 σ T N = 100, 000 s.e. N = 25 · 4000

36 .20 1 4.472 (.010) 4.47536 .20 2 4.821 (.012) 4.82236 .40 1 7.091 (.020) 7.11636 .40 2 8.488 (.024) 8.52240 .20 1 2.313 (.009) 2.31440 .20 2 2.879 (.010) 2.87840 .40 1 5.308 (.018) 5.32140 .40 2 6.921 (.022) 6.92444 .20 1 1.118 (.007) 1.11944 .20 2 1.675 (.009) 1.69044 .40 1 3.957 (.017) 3.95844 .40 2 5.622 (.021) 5.645

Random Trees

The random tree method of Broadie and Glasserman (1997) is really just away of adapting binomial pricing to simulation. To see how it works, it helpsfirst to bring to mind the steps for pricing vanilla American-style derivativesin the binomial scheme. How did we do it? First, starting from initial valueS0 we generated a tree of values of underlying price in n discrete time stepsusing the appropriate risk-neutral Bernoulli dynamics. Specific locations inthe tree were identified by specifying time step j and price level, Sj ≡ Stj .From S0 and each of the subsequent time/price “nodes” prior to step n

there were two branches, giving a total of N = 2n sample paths and n + 1distinct values of Sn for an ordinary recombining tree. Next, from each ofthe n nodes Sn−1 at step n − 1 we estimated the derivative’s continuationvalue as the (π, 1 − π)-weighted average of the two discounted terminalvalues accessible from Sn−1. The derivative’s value at each node Sn−1 wasthen set equal to the greater of intrinsic value and continuation value, asin (11.16) for an American put. When values for all n nodes at step n − 1were thus determined, we moved back to step n − 2, valued the derivativeat all the n−1 nodes in the same way, and repeated the whole process backto step 0.

The same general steps are followed in the most straightforward versionof the random tree method. First, starting from initial value S0 we gen-erate a tree of values of underlying price at the n discrete time steps atwhich exercise is to be allowed. This will now be done by simulation, using

Page 586: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

Simulation 569

whatever measure-P dynamics are appropriate for the underlying price (orother state variables) on which the derivative’s intrinsic value depends.From the initial price S0 and from each subsequent time/price node priorto step n there will now be m branches, giving a total of N = mn samplepaths and (in general) the same number of distinct values of Sn. Since theprice increments are now (pseudo-) random, this process will have produceda “random” tree with a continuous state space for prices. Next, from eachof the mn−1 nodes Sn−1 at step n − 1 we estimate the derivative’s con-tinuation value as the sample mean of the m discounted terminal valuesaccessible from Sn−1. The derivative’s value at each node Sn−1 is then setequal to the greater of intrinsic value and continuation value. When valuesfor all mn−1 nodes at step n − 1 are thus determined, we move back tostep n − 2, value the derivative at all the mn−2 nodes in the same way,and repeat the whole process back to step 0. Table 11.8 lays out and com-pares the steps for binomial and random-tree pricing of an American-stylederivative worth D(Sj) = D(Sj , n−j) at node Sj and having intrinsic valueX(Sj). There πu and πd = 1 − πu are the risk-neutral Bernoulli pseudo-probabilities, Sj,ω is the underlying price on branch ω from the price attime step j − 1 ∈ 0, 1, . . . , n− 1, and B represents the price of a one-stepdiscount bond.

One can see at a glance that the straightforward random tree methodis completely unworkable unless n and/or m is very small. However,

Table 11.8. Comparison of binomial and random-tree schemes.

Step Action Binomial Random Tree

From S0 generate tree of n + 1 prices Sn mn prices Sn

n − 1 Set D(Sn−1) = X(Sn−1)∨ BX

ω=u,d

πωX(Sn,ω)B

m

mXω=1

X(Sn,ω)

n − 2 Set D(Sn−2) = X(Sn−2)∨ BX

ω=u,d

πωD(Sn−1,ω)B

m

mXω=1

D(Sn−1,ω)

..

....

..

....

1 Set D(S1) = X(S1)∨ BX

ω=u,d

πωD(Sn−2,ω)B

m

mXω=1

D(S2,ω)

0 Set D(S0) = X(S0)∨ BX

ω=u,d

πωD(Sn−1,ω)B

m

mXω=1

D(S1,ω)

Page 587: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

570 Pricing Derivative Securities (2nd Ed.)

since continuation values are estimated by averaging over the m replicatesSj,ωm

ω=1 accessed from each node Sj−1, it is clear that economy must beachieved mainly by moderating the number of time steps, n. Thus, the ran-dom tree method should really be considered a means of pricing Bermudan-style derivatives rather than those that are continuously exercisable. On theother hand, there is one conspicuous advantage: no parametric approxima-tion of either exercise boundary or continuation value is required. Rather,the continuation value at any node can be estimated with arbitrary preci-sion merely by taking m to be sufficiently large. This means that there arevirtually no limits to the number of state variables or to the complexity oftheir dynamics: if we can simulate the underlying variables, we can pricethe derivatives. In addition, there are modifications to the straightforwardscheme outlined above that significantly reduce the computation time forgiven m and n.

Two obvious ways of increasing efficiency are the standard techniquesof antithetic acceleration and, when applicable, the use of control variates.For acceleration one simply generates m/2 pairs of paths antitheticallyfrom each node and estimates continuation value by averaging over all m

paths. Using a control variate is feasible when a corresponding European-style derivative can be valued analytically or extrinsically in some otherfast, accurate way. Less obvious is a procedure known as “pruning” thatsaves steps at out-of-money nodes. The key insight is that the only relevantcourse of action at such nodes is precisely the same as for European-stylederivatives—i.e., hold on. Since the proper decision is obvious, a preciseestimate of continuation value is not needed. Thus, rather than generate m

new paths to estimate continuation value, we can generate just one succes-sor price from which to calculate D(Sj−1) as (in the notation of table 11.8)B · D(Sj). Doing so amounts to pruning m − 1 branches from the tree atthis point.

How to program all this is not exactly obvious. The following pseudo-code shows one way to implement all three time-saving methods with n = 3equally spaced time steps and an arbitrary even number m of branches ateach in-the-money node. The specific example involves a single underly-ing asset worth S(j) at step j and following geometric Brownian motionwith volatility s and drift r-d under P. Adapting to other dynamics andstate variables requires just changing the “Generate” steps. “D(S(j))” and“E(S(j))” represent values at node S(j) of an American-style deriva-tive and a European-style counterpart that serves as a control variate.“X(S(j))” represents the intrinsic value; for example, X(S(j))=(X − Sj)+

for a vanilla put option. “e(S(0))” at the end is an extrinsic estimate of the

Page 588: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

Simulation 571

value of the European, and “d(S(0))” is the final control-variate-adjustedvalue of the American derivative. Variables “ip” and “jp” are used to han-dle the antithetic acceleration, and “jbranch” and “kbranch ” handle thepruning.

Set initial underlying price S(0), expiration date T, time

step dt=T/3, interest rate r, one-step bond price B,

dividend rate d, volatility s, number branches m (even)

Set emudt=exp[(r-d-s*s/2)*dt]

Initialize D(S(0))=E(S(0))=0

Loop for i=1,2,...,m/2Initialize D(S(1))=E(S(1))=0

Set ip=1

1 Generate S(1,i)=S(0)*emudt*exp(ip*s*Z(i)) for Z(i)~N(0,1)

If X(S(1))>0 set jbranch=m/2; else set jbranch=1

Initialize D1=E1=0

Loop for j=1,jbranch

Initialize D(S(2))=E(S(2))=0

Set jp=1

2 Generate S(2,j)=S(1,i)*emudt*exp(jp*s*Z(j)),Z(j)~N(0,1)

If X(S(2))>0 set kbranch=m/2; else set kbranch=1

Initialize D2=0

Loop for k=1,kbranch

Generate Sp(3,k)=S(2,j)*emudt*exp( s*Z(k))

Generate Sn(3,k)=S(2,j)*emudt*exp(-s*Z(k))Accumulate D2=D2+X(Sp(3,k))+X(Sn(3,k))

End k loop

Accumulate E(S(2))=E(S(2))+.5*B*D2/kbranch

Set D2 = max[X(S(2)), B*D2/kbranch]

Accumulate D(S(2))=D(S(2))+.5*D2

If jp=1 set jp=-1 and go to 2

Accumulate D1=D1+D(S(2)), E1=E1+E(S(2))

End j loop

Accumulate E(S(1))=E(S(1))+.5*B*E1/jbranch

Set D1=max[X(S(1)), B*D1/jbranch]

Accumulate D(S(1))=D(S(1))+ .5*D1

If ip=1 set ip=-1 and go to 1

Accumulate D(S(0))=D(S(0))+D(S(1)), E(S(0))=E(S(0))+E(S(1))

Page 589: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

572 Pricing Derivative Securities (2nd Ed.)

End i loop

Set D(S(0)=max[X(S(0)),B*D(S(0)/m], E(S(0))=B*E(S(0))/m

Set d(S(0))=D(S(0))+[e(S(0))-E(S(0))]

End program

Although this is hardly the appropriate application for the method,using it to price a vanilla American put does at least give a benchmarkof performance. Table 11.9 compares random-tree estimates with m = 500branches and n = 3 time steps with binomial estimates from 5,000 timesteps.14 A European put was estimated as a control, and antithetic acceler-ation and pruning were used as described above. Glasserman (2004) showsthat this procedure yields upward-biased estimates of true prices of Bermu-dan derivatives with the same exercise opportunities. He constructs match-ing downward-biased estimators by using different subsets of branches froma given node to decide whether to continue and to estimate continuationvalue, then uses the two sets of estimates and their standard deviationsto construct conservative confidence intervals. Of course, the bias of thestandard procedure is indeterminate when measured against the values ofcontinuously exercisable derivatives.

A Comparison

Here we apply the three methods to a case where simulation may well bethe best approach to estimation, taking both accuracy and computing timeinto account; namely, an American call on the maximum of two underly-ing stocks. This application has been considered by proponents of all threemethods: Garcia (2003), Longstaff and Schwartz (2001), and Broadie andGlasserman (1997). In applying each method here we try to show it to itsbest advantage by exploiting what is known about this particular problem.

Table 11.9. Binomial vs. random tree valuations of P A(S0, T = 1; X = 1).

S0 Binomial Tree without control Tree with control

0.9 .1149 .1097 .11111.0 .0609 .0608 .05941.1 .0299 .0288 .0289

14The data are the same as for the earlier estimates obtained by parameterizing theexercise boundary: X = T = 1, σ = .20, r = .05, δ = 0. Execution times for the treeestimates are about twice those of the binomial.

Page 590: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

Simulation 573

Specifically, we use the exact solution for the price of a European-stylecall on the max—formula (7.42)—to set up control variates and to modelthe exercise boundary and the continuation value in the two parametricapproaches.

Consider a T -expiring call option on the maximum of two assets whoseprices, St = (S1t, S2t)t≥0, follow geometric Brownian motion with volatil-ities σ1, σ2 and correlation (of log price) ρ. The short rate is a constant r,and the two assets pay continuous dividends at rates δ1, δ2. At expirationthe call is worth Cmax(ST ; 0) = X(ST ) ≡ [X − (S1T ∨ S2T )]+, and theproblem is to estimate initial value Cmax(S0; T ) given that the option canbe exercised at any of the discrete set of times tj = jT/nn

j=0. To conservenotation, define X(Sj) ≡ [X−(S1j ∨S2j)]+ as the intrinsic value at tj . Let-ting Z1j , Z0jn

j=1 be i.i.d. pairs of pseudorandom, independent standardnormals, we set Z2j = ρZ1j +

√1 − ρ2Z0j and generate sample paths as

S1j = S1,j−1 exp[(r − δ1 − σ21/2) · dt + σ1

√dtZ1]

S2j = S2,j−1 exp[(r − δ2 − σ22/2) · dt + σ2

√dtZ2], j ∈ 1, 2, . . . , n,

where dt ≡ T/n and initial values S10, S20 are given. We follow the appli-cations in the literature by treating the symmetric problem with

S10 = S20 ∈ 90, 100, 110σ1 = σ2 = .20

δ1 = δ2 = .10

ρ = .3

r = .05

and option parameters T = 3, X = 100.

To apply the parametric-boundary approach to this problem Garcia(2003) exploits the symmetry of the process S1t, S2t by modeling theexercise region at tj as

Bj(θ) = Sj : X(Sj) > θ1j , |S1j − S2j | > θ2j

for j ∈ 0, 1, . . . , n − 1 and setting Bn = Sn : S1n ∨ S2n > X. Thisrequires estimating the 2n parameters θ1j , θ2jn−1

j=0 . The idea is that fora given value of S1 ∨ S2 the option’s continuation value should dimin-ish as the prices spread out, and the incentive for early exercise shouldtherefore increase. Were the models for processes S1tt≥0 and S2tt≥0

not the same, some modification to Bj(θ) would be needed to allow theprices to enter asymmetrically. As an alternative, one could adapt and

Page 591: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

574 Pricing Derivative Securities (2nd Ed.)

extend the scheme that was used to price a put on a single underlying,as expressed in (11.12) and (11.13). That is, letting Cmax

E (St; T − t) rep-resent the value of a European-style call on the max, given explicitly by(7.42), we could calculate at each tj the locus of points S∗

j in 2 such thatCmax

E (S∗j ; T −tj) = S∗

1j∨S∗2j −X . This would be done for a two-dimensional

grid of values of S1j , S2j , with (say) linear interpolation for points between.Then we could set the exercise region as

Bj(θ) =

Sj : S1j ∨ S2j − X > Cmax

E (S∗j ; T − tj) +

K∑k=1

θk(T − tj)k

.

An alternative, corresponding to (11.15), is to skip the calculation of thelocus S∗

j and set

Bj(θ) =

Sj : S1j ∨ S2j − X > Cmax

E (Sj ; T − tj) +K∑

k=1

θk(T − tj)k

.

The corresponding exercise boundary for S1j ∨ S2j can be calculated veryquickly. We use this method with K = 3 parameters in the comparisonsgiven below.

To obtain their regression-based “LSM” estimates of continuation value,Longstaff and Schwartz (2001) suggest15 using as basis functions the firstfew Hermite polynomials in S1j ∨ S2j , the value and square of S1j ∧ S2j ,

and the product (S1j ∨S2j) · (S1j ∧S2j). They also allow an intercept in theregression functions. Again, we can simplify by making use of the formulafor the European version. Specifically, we take as basis functions the priceof the European call, Cmax

E (St; T − t), and the first two weighted Laguerrepolynomials in the time to expiration, e−(T−t)/2 and e−(T−t)/2[1− (T − t)].We also allow for an intercept, giving a total of four parameters to beestimated at each step in the backward recursion.

For the random-tree approach little change is needed from the stepsoutlined above for puts on a single underlying: merely simulate paths ofthe bivariate process and use the appropriate intrinsic values X(Sj) in thescheme laid out in table 11.8.

With all three methods we use antithetic acceleration to generate theprice processes, and we estimate prices of a European put in the same simu-lation to use as a control. Estimates based on the parametric boundary are

15This is an inference from their recipe for a call on the maximum price of fiveunderlying assets.

Page 592: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

Simulation 575

from 20,000 sample paths; those based on least-squares estimates of con-tinuation value are averages from 25 replications of 10,000 paths each; andthose from random trees are averages of 25 replications using 200 branchesat each step. The multiple replications of the last two methods provideestimates of the standard deviation of the mean of the 25 values; standarderrors for the parametric boundary estimates come from the cash flows real-ized in the 20,000 paths. Table 11.10 gives estimates of value for each of thethree sets of initial underlying prices, estimates of the standard errors ofthe estimators, and execution times in seconds.16 Execution times to esti-mate the parametric boundary are highly variable, depending on the initialguesses used in the optimization. Results for n = 3 time steps (one exerciseopportunity per year) are given for all three methods and compared withsupposedly accurate values (reported in Glasserman, 2004, p. 440), obtainedby two-dimensional lattice methods. The flexibility of the two parametricschemes allows for more exercise opportunities, so we report results fromthese for n = 60 also.

Table 11.10. Parametric/nonparametric simulation estimates of price of American callon the maximum price of two underlying assets, with n + 1 exercise dates, three yearsto expiration, and strike price 100.

Method S10, S20 = 90 S10, S20 = 100 S10, S20 = 110

Par. Boundaryn = 3 7.119 12.51 19.07

(s.e., time) (.097, 10.1) (.12, 16.1) (.14, 18.3)n = 60 7.962 13.39 20.43

(s.e., time) (.087, 416) (.10, 318) (.12, 654)Par. Continuation

n = 3 7.239 12.39 19.02(s.e., time) (.018, 4.1) (.02, 6.0) (.03, 7.6)

n = 60 7.680 13.11 20.06(s.e., time) (.045, 25.3) (.06, 38.4) (.07, 49.9)

Random Treen = 3 7.250 12.50 19.17

(s.e., time) (.024, 20.6) (.04, 38.9) (.06, 57.1)Lattice

n = 3 7.234 12.41 19.06

Standard errors (s.e.) and execution times in seconds are in parentheses.

16Programming was done in FORTRAN using IMSL routines to compute the bivari-ate normal c.d.f.s in (7.42) and for the LSM regressions. Execution times are for a 2.8 g.h.single-processor p.c.

Better results for the parametric continuation method were obtained without a controlvariate. These better results are reported in the table.

Page 593: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch11 FA

576 Pricing Derivative Securities (2nd Ed.)

In this particular application it is clear that the regression-based LSMtechnique (identified as “Par. Continuation” in the table) dominates interms of accuracy (relative to the lattice estimates) and computationalspeed. The parametric boundary scheme also has the flexibility to allowarbitrarily many exercise opportunities, but it seems to offer no advantageto compensate for the long execution times. As the results here indicate, itcan be easy to find good parametric models for continuation values whenthere is a tractable formula for a comparable European option. Still, theadvantage of LSM over the nonparametric tree method depends criticallyon having such good models of continuation value, and the nonparametricscheme provides a natural way to check their adequacy for small n.

Page 594: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch12 FA

12Solving P.D.E.s Numerically

This chapter gives an introduction to the numerical solution of partial dif-ferential equations by finite-difference methods.1

To get the basic idea of finite-difference methods, let us consider firstapplications in which the value of a T -expiring financial derivative dependsjust on St and t—the underlying price and time—as D(St, T − t). The goalis to get an approximate numerical solution to a p.d.e. that involves thederivatives of D with respect to these two arguments. The procedure theninvolves the following steps:

1. Express the p.d.e. in terms of appropriate finite-difference approxima-tions for the partial derivatives;

2. Set up a grid of values of t and St;3. Use initial and boundary conditions to specify the derivative’s values on

the borders of the grid that correspond to T − t = 0 and to the extremevalues of St; and

4. At each remaining point (t, St) in the grid use the p.d.e. to develop oneor more equations from which D(St, T − t) can be found in terms ofknown values of D at other points.

Figure 12.1 illustrates the basic scheme. Solid circles at nodes on the upper,lower, and right-hand borders represent values known from initial andboundary conditions, and the empty circles elsewhere are values that must

1Details and more advanced approaches can be found in the books by Collatz (1966),

Hall and Porsching (1990), Mitchell and Griffiths (1980), and Zwillinger (1989). Wilmottet al. (1993) and Wilmott et al. (1995) give extensive treatment of the financial p.d.e.in particular.

577

Page 595: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch12 FA

578 Pricing Derivative Securities (2nd Ed.)

Fig. 12.1. Time-price grid for numerical solution of p.d.e.

be found. Lattices of higher dimension are required if the financial deriva-tive depends on additional state variables.

We will take up three methods of solution in the order of increasingaccuracy and complexity: an “explicit” method, a “first-order implicit”method, and a “second-order implicit” method due to Crank and Nicol-son (1947). The explicit method involves solving just one equation at eachnode, whereas the implicit methods require a recursive procedure. Since allthe techniques are somewhat involved, we will first explain how to applythem to ordinary finite-lived derivatives, such as vanilla European or Amer-ican options, on assets whose prices follow geometric Brownian motion. Wewill also assume constant short rates and dividend rates. Under these con-ditions the fundamental p.d.e. takes the familiar form

−DT−t + DS(r − δ)St + DSSσ2S2t /2 − Dr = 0. (12.1)

Once mastering the techniques in this setting, we can see how to applythe Crank-Nicholson approach to more complicated p.d.e.s that allow forcontinuous time- and asset-dependent cash receipts by the underlying andthe derivative, time-varying interest rates, and more general Ito processes

Page 596: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch12 FA

Solving P.D.E.s Numerically 579

for St; for example, p.d.e.s such as

κ(St, t) − DT−t + DSµ(St, t) + DSSσ2(St, t)/2 − Drt = 0

for various specifications of κ(St, t), µ(St, t), σ(St, t), and rt.The main advantage of working with the simpler p.d.e. (12.1) is that

it can be transformed to make the coefficients of the partial derivativesinvariant with respect to t and St. This is done by expressing the derivative’svalue in terms of st ≡ ln(St) and calendar time t instead of St and T − t.

Letting d(st, t) ≡ D(est , T − t), we have

ds = StDS ,

dss = S2t DSS + ds,

dt = −DT−t;

and the following new version of (12.1):

dt + ds(r − δ − σ2/2) + dssσ2/2 − dr = 0. (12.2)

12.1 Setting Up for Solution

Before looking for solutions we must carry out the first three steps onpage 577.

12.1.1 Approximating the Derivatives

There are various ways to approximate derivatives dt, ds, and dss by finitedifferences, and the choice determines whether the method of solving thep.d.e. will be explicit or implicit. For example, dt can be approximated aseither a forward or backward difference; that is, as

d+t ≡ d(s, t + ∆t) − d(s, t)

∆t(12.3)

or as

d−t ≡ d(s, t) − d(s, t − ∆t)∆t

. (12.4)

To see the order of accuracy of these, use Taylor’s theorem to approximate d

at t+∆t and at t−∆t near (s, t), assuming that the second time derivative

Page 597: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch12 FA

580 Pricing Derivative Securities (2nd Ed.)

exists and is continuous:2

d+t (s, t) = dt(s, t) +

12dtt(s, t) · ∆t + O(∆t2)

d−t (s, t) = dt(s, t) −12dtt(s, t) · ∆t + O(∆t2).

This shows that d+t and d−t are O(∆t) approximations. The ∆t term van-

ishes, however, upon averaging the forward and backward approximations.The result is a centered approximation that is accurate to O(∆t2):

d0t (s, t) ≡

d+t + d−t

2=

d(s, t + ∆t) − d(s, t − ∆t)2∆t

= dt(s, t) + O(∆t2).

(12.5)Similarly, there are O(∆s) forward and backward approximations to the s

derivative,

d±s (s, t) =±d(s ± ∆s, t) ∓ d(s, t)

∆s,

and an O(∆s2) centered approximation:

d0s(s, t) ≡

d(s + ∆s, t) − d(s − ∆s, t)2∆s

. (12.6)

An O(∆s2) approximation for dss can be obtained from the differencebetween forward and backward first derivatives:

d0ss(s, t) ≡

d+s (s, t) − d−

s (s, t)∆s

=d(s + ∆s, t) + d(s − ∆s, t) − 2d(s, t)

∆s2.

(12.7)The explicit and implicit methods we describe below use various combina-tions of these approximations.

12.1.2 Constructing a Discrete Time/Price Grid

Evaluating d(s, t) by finite-difference methods requires a discrete grid of s

and t values. For definiteness we put t on the horizontal axis and s on thevertical. Let t = 0 and t = T be the current date and the expiration dateof the financial derivative, respectively. To construct the grid, first divide[0, T ] into some nT equal intervals of length ∆t = T/nT . Larger valuesof nT give more accuracy but lengthen the computation, and one has toexperiment with this parameter. This gives a set of nT + 1 time values,

2Throughout, symbols “∆t2” and “∆s2” represent squares of the first differences:(∆t)2 and (∆s)2.

Page 598: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch12 FA

Solving P.D.E.s Numerically 581

T ≡ j∆tnT

j=0. Next, specify a grid of values of s on an interval withinwhich st is almost certain to remain during the lifetime of the option. Thistakes a bit more care. One wants points on an interval [sMin, sMax] suchthat P(sMin < sT < sMax) .= 1, where P is the risk-neutral measure. If thevalue of the derivative asset is known at St = 0, then in specifying boundaryconditions it is helpful to take sMin such that esMin

.= 0. Also, since thederivative’s initial value will eventually be determined just at the values ofs in the grid, one also wants the log of initial price s0 to be one of the set.Finally, it is generally helpful to choose a price step that is proportional tothe square root of the time step, as ∆s = σ

√3∆t. The reason for this will

be seen later.Here is one way to accomplish all these things. Picking ∆s = σ

√3∆t,

find numbers n−S and n+

S such that exp(sMin) ≡ exp(s0 − n−S ∆s) .= 0 and

P(s0 − n−S ∆s < sT < s0 + n+

S ∆s) .= 1, and let nS ≡ n−S + n+

S . Thenwith sMin = s0 − n−

S ∆s and sMax = s0 + n+S ∆s = sMin + (nS + 1)∆s

the construction gives a set of nS + 1 ≡ n−S + n+

S + 1 log-price values thatcontains the initial value:

S ≡ sMin + j∆snS

j=0.

This construction delivers an (nT + 1) by (nS + 1) array of time andlog-price points, T × S. Most of the work is done with columns in thegrid, which will contain values of the financial derivative at specific t’s:d(s, t)s∈S . However, specifying boundary conditions also requires somework with the top and bottom rows: d(sMin, t)t∈T and d(sMax, t)t∈T .

12.1.3 Specifying Boundary Conditions

Now use the initial and boundary conditions for the financial derivative todetermine the value of d(s, t) at (i) each point on the right boundary ofthe grid, corresponding to the time column at nT ∆t = T ; (ii) each pointon the top boundary, corresponding to the sMax row; and (iii) each pointon the bottom boundary, corresponding to the row with s = sMin. Forexample, suppose the derivative is a call option. Letting X be the strike,the values at the right boundary, which correspond to the initial conditionat T − t = 0, are c(s, T ) = (es − X)+s∈S . At the bottom boundary,assuming esMin

.= 0, the values are c(0, t) = 0t∈T . If the call is American,the values at the upper boundary are c(sMax, t) = esMax − Xt∈T ; if it isEuropean, they are c(sMax, t) = esMax −e−r(T−t)Xt∈T . For a put option,values at the right and top boundaries are p(s, T ) = (X − es)+s∈S and

Page 599: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch12 FA

582 Pricing Derivative Securities (2nd Ed.)

p(sMax, t) = 0t∈T . Bottom values are p(0, t) = Xt∈T if the put isAmerican and p(0, t) = e−r(T−t)Xt∈T if it is European.

12.2 Obtaining a Solution

What remains is to fill in the rest of the points on the grid, the maininterest usually being in the first column, d(s, 0)s∈S . As in the binomialmethod one begins by evaluating d for each point in the next-to-last column,corresponding to t = (nT −1)∆t, and then works backward. Of several waysto proceed the next is the most straightforward.

12.2.1 The Explicit Method

This technique gets its name from the fact that the values d(s, t)s∈Sin each column will be represented explicitly in terms of just the valuesin the next column; that is, in terms of d(s, t + ∆t)s∈S . Letting p ≡(r − δ − σ2/2)and q ≡ σ2/2, write the p.d.e. (12.2) as

dt + dsp + dssq − dr = 0.

Since d(s, t) is to be calculated from values of d(s, t + ∆t), the forwardapproximation d+

t (s, t) will be used for dt, and the centered approxima-tions to ds and dss will be evaluated at the next time step, t + ∆t. Theseare d0

s(s, t + ∆t) and d0ss(s, t + ∆t). This involves a second approximation,

because it treats the s derivatives as constant over the interval [t, t + ∆t].With these choices the p.d.e. becomes

d(s, t)r = d+t (s, t) + d0

s(s, t + ∆t)p + d0ss(s, t + ∆t)q

=d(s, t + ∆t) − d(s, t)

∆t+

d(s + ∆s, t + ∆t) − d(s − ∆s, t + ∆t)2∆s

· p

+d(s + ∆s, t + ∆t) + d(s − ∆s, t + ∆t) − 2d(s, t + ∆t)

∆s2· q.

The advantage of using the forward approximation for dt and evaluatingthe centered s derivatives at t + ∆t is now apparent. Since values of d(·, ·)at the next time step are known, this is an equation in a single unknown,d(s, t). The solution is

d(s, t) = a∗d(s−∆s, t+∆t)+b∗d(s, t+∆t)+c∗d(s+∆s, t+∆t), (12.8)

Page 600: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch12 FA

Solving P.D.E.s Numerically 583

where

a∗ ≡ q∆t/∆s2 − (p/2)∆t/∆s

1 + r∆t,

b∗ ≡ 1 − 2q∆t/∆s2

1 + r∆t,

c∗ ≡ q∆t/∆s2 + (p/2)∆t/∆s

1 + r∆t.

We thus have a way of expressing d(s, t) in terms of three values in thet + ∆t column: one below, at s − ∆s; one on the same level, at s; and oneabove, at s+∆s. Since values of d in the last column of the grid are known,we begin the process at the next-to-last time step, tnT −1 = (nT − 1)∆t,where d(s, t + ∆t) = d(s, nT ∆t) ≡ d(s, T ) is known for each s ∈ S. Also,the boundary conditions determine the value of d(·, tnT −1) at the bottomof the column, d(sMin, tnT−1), and at the top, d(sMax, tnT−1). Values ofd(·, tnT −1) at the remaining nS − 1 points in the column are found bysolving (12.8) one row at a time. When this is complete, move back onetime step to tnT −2 = (nT −2)∆t and solve for d(s, tnT−2)s∈S in the samemanner. Repeating the process through column t = 0 ultimately producesthe initial value of the financial derivative for each s ∈ S, including thevalue at the actual current (log) price, d(s0, 0).

If the derivative is subject to early exercise, as for American options,all entries in a column are set equal to the greater of exercise value and thevalue calculated from entries in the next column. This has to be done onecolumn at a time before values at the next earlier time step are calculated.For example, if valuing an American put, each entry s in the tnT −j =(nT − j)∆t column would be set as max[d(s, tnT −j), X − es], whered(s, tnT −j) is calculated from (12.8).

A nice feature of this setup is that the coefficients a∗, b∗, and c∗ inthe calculations depend on p, q, ∆t, ∆s, but not on the values of s and t

themselves. This means that they are the same at each point in the grid.Had the p.d.e. been set up in levels of underlying price rather than in termsof logs, as in (12.1), the coefficients would have varied from one row to thenext. This is why it is easier to work in logs when the underlying price ismodeled as geometric Brownian motion. With more complicated models,as when volatility depends on St and/or t, then p and q will be state- ortime-dependent and a∗, b∗, c∗ will vary as well. We defer this complicationuntil section 12.3.1.

Page 601: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch12 FA

584 Pricing Derivative Securities (2nd Ed.)

As a simplified numerical example of the explicit method, consider pric-ing a one-year American put struck at X = 1.0 on a no-dividend stock whoseinitial price is S0 = 1.0 (initial log price s0 = 0.0) and has volatility σ = 0.5.

The short rate is r = 0.20. We take nT = 3 time steps, so that ∆t = 1/3,and use for the step in log price ∆s = σ

√3∆t = 0.5. Taking smax = 1.0

and smin = −1.0 as the extreme values gives nS ≡ (smax − smin)/σ = 4.

These values, which are used again later in examples of the other methods,are summarized in table 12.1.

From these data we have p ≡ (r − δ − σ2/2) = 0.075, q ≡ σ2/2 = 0.125,

and a∗ = 0.133, b∗ = 0.625, c∗ = 0.180. Figure 12.2 shows along the verticalaxis the nS + 1 = 5 values of the log of the underlying price and, oppositethe terminal nodes for time T, the terminal payoffs of the put. The zeroentries below the upper-most nodes are the values determined by the upperboundary condition: (X − esmax)+ = (1 − e1)+ = 0.0. Entries below thelowest nodes are given by the lower boundary condition: (X − esmin)+ =(1− e−1)+ = 0.632. To the right and below each remaining node is a block

Table 12.1. Data for solving p.d.e.s.

X S0 s0 smin smax nS ∆s T nT ∆t σ r δ

1.0 1.0 0.0 −1.0 1.0 4 0.5 1.0 3 0.33 0.5 0.20 0.0

Fig. 12.2. Explicit, implicit, and Crank-Nicholson valuations for American put.

Page 602: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch12 FA

Solving P.D.E.s Numerically 585

of three figures, the first of which represents the option’s value given bythe explicit method. (The others are described later.) Thus, the entry atthe t = 0.66, s = 0.0 node is a∗(.393) + b∗(.000) + c∗(.000) = 0.052. Thecorresponding entry in the node below, at t = 0.66, s = −0.5, equals theoption’s intrinsic value. The American put would have been exercised atthis point, since the intrinsic value of 0.393 exceeded the value if held,namely a∗(.632) + b∗(.393) + c∗(.000) = 0.330. Finally, the value 0.107 atthe t = 0, s = 0.0 node is the explicit estimate corresponding to the initialvalue of the stock.

The explicit method is easy to apply, but it does not converge to theright solution as ∆s and ∆t approach zero unless ∆s and ∆t are chosenin special ways. An indication of whether the choice is correct is whethera∗, b∗, c∗ are all positive. To see why, note that the three coefficients sumto (1 + r∆t)−1 = B(t, t + ∆t), which is the risk-free discount factor forone time step. Thus, the coefficients can be interpreted as discounted risk-neutral probabilities. For customary values of r and σ the main worry willbe with b∗: specifically, we need 0 < 1 − 2q∆t/∆s2 ≡ 1 − σ2∆t/∆s2, or∆t < ∆s2/σ2. Note that this is satisfied when ∆s = σ

√3∆t. Intuitively,

this restriction arises because of the additional error in approximating thes derivatives at the next time step as d0

s(s, t + ∆t) and d0ss(s, t + ∆t).

12.2.2 A First-Order Implicit Method

If the value of the financial derivative is less responsive to time than toprice, the restriction ∆t < ∆s2/σ2 requires a smaller time step—and morecomputing time—than might be necessary for the desired accuracy. Here weconsider a method that is known to converge regardless of the values of ∆t

and ∆s. It is somewhat harder to program, since it does not yield an explicitrepresentation of d(s, t) in terms of the values at t+∆t. Instead, it yields animplicit representation in the form of a system of simultaneous equationsfor unknown d’s in column t in terms of knowns in column t + ∆t. With alittle ingenuity, however, these equations can be solved without inverting amatrix.

We continue to work with the p.d.e. in log-price form,

dt + dsp + dssq − dr = 0,

and still use forward approximation d+t (s, t) for the time derivative. How-

ever, we now evaluate d0s and d0

ss at time t instead of t+∆t. This eliminatesone source of error and gives approximations for the s derivatives that are

Page 603: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch12 FA

586 Pricing Derivative Securities (2nd Ed.)

correct to O(∆s2), but it still leaves us with the O(∆t) approximation fordt. This is why the method is called “first-order”. With these approxima-tions the p.d.e. becomes

d(s, t)r = d+t (s, t) + d0

s(s, t)p + d0ss(s, t)q (12.9)

=d(s, t + ∆t) − d(s, t)

∆t+

d(s + ∆s, t) − d(s − ∆s, t)2∆s

· p

+d(s + ∆s, t) + d(s − ∆s, t) − 2d(s, t)

∆s2· q.

Working backwards as in the explicit method, this equation will involveone known value, d(s, t+∆t), and three unknowns, d(s−∆s, t), d(s, t), andd(s + ∆s, t)—just the reverse of the explicit method. Rearranging gives

d(s, t + ∆t) = a · d(s − ∆s, t) + b · d(s, t) + c · d(s + ∆s, t), (12.10)

where

a = (p/2)∆t/∆s − q∆t/∆s2

b = 1 + r∆t + 2q∆t/∆s2

c = −(p/2)∆t/∆s − q∆t/∆s2.

Again, values d(sMin, t) at the bottom of the t column and d(sMax, t) at thetop are known. Since each of the nS − 1 intermediate points in S satisfies(12.10), there are nS −1 equations in as many unknowns. Fortunately, theycan be solved without inverting a matrix using Gaussian elimination, sinceeach single equation involves just three unknowns.

Simpler notation will make it easier to see how to do this. For i ∈0, 1, . . . , nS the value of s corresponding to the ith step from the bottomof a column is si ≡ sMin + i∆s. Let ei = d(si, t + ∆t) be the known valueon the left side of (12.10), corresponding to the ith position of the t + ∆t

column, and let di = d(si, t) be the unknown value in the ith position ofthe t column. Expression (12.10) now comes out as

ei = a · di−1 + b · di + c · di+1. (12.11)

Remember that the e’s are known and the d’s are not, apart from the onesat the bottom and top of the column—d0 and dnS . Now look at the equationfor i = 1 and move the known d0 term to the left:

e1 − a · d0 = b · d1 + c · d2. (12.12)

Page 604: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch12 FA

Solving P.D.E.s Numerically 587

We shall see that such an equation holds for each position i ∈ 1, 2, . . . ,

nS − 1, wherein a known function of the e’s and d0 is equated to a linearfunction of the i and i + 1 values of d.

The steps are as follows. Use (12.12) to express d1 in terms of d2 andknown quantities—that is, as d1 = b−1(e1 − a · d0 − c · d2)—and substituteinto the next equation,

e2 − a · d1 = b · d2 + c · d3.

From this, express d2 in terms of d3 and known quantities, and continuein this way up to dnS−1. Each di will then have been expressed as a linearfunction of di+1. At the nS − 1 step, where dnS is known, this yields anexplicit solution for dnS−1. This solution is then used to find dnS−2, whichis used to find dnS−3, and so on back to d1. The trick is to keep track of thecoefficients in all these linear relations that connect di to di+1. Fortunately,they can be generated recursively.

One more round of notation is needed to see the recursive structure.Write (12.12) as

k1 = l1 · d1 + c · d2, (12.13)

where k1 = e1 − a · d0 and l1 = b. Express d1 in terms of d2 as

d1 =k1 − c · d2

l1

and then substitute in the i = 2 equation:

e2 = a · d1 + b · d2 + c · d3

=ak1

l1+

(b − ac

l1

)· d2 + c · d3.

Writing in a form like (12.13) gives

k2 = l2 · d2 + c · d3,

where k2 = e2 − ak1/l1 and l2 = b − ac/l1. The pattern of recursive rela-tions between the k’s and l’s now begins to emerge. Suppose for somei ∈ 2, . . . , nS − 1 we have

ki−1 = li−1 · di−1 + c · di

ei = a · di−1 + b · di + c · di+1.

Solving the first equation for di−1 as

di−1 =ki−1 − c · di

li−1(12.14)

Page 605: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch12 FA

588 Pricing Derivative Securities (2nd Ed.)

and substituting in the second equation give

ki = li · di + c · di+1,

where

li = b − a · c

li−1(12.15)

ki = ei − a · ki−1

li−1.

Starting with l1 = b, the l’s can be generated recursively up to lnS−1.

Since neither b nor c depends on the time step, this is done just once. Havingall the l’s and taking k1 = e1 − a · d0, one can generate k2, . . . , knS−1 forthe current time step as well. This does have to be done at each time stepsince the e’s change from one step to the next. Now, armed with the k’sand l’s, any di−1 can be expressed in terms of di via (12.14). Starting ati = nS − 1, where dnS = d(sMax, t) is known, solve for dnS−1 as

dnS−1 =knS−1 − c · dnS

lnS−1

and keep applying (12.14) until reaching the solution for d1. Having foundthe entire column of d’s for the nT − 1 time step, the e’s are now availablefor step nT − 2. Now recalculate the k’s, find the d’s for step nT − 2, andso continue until the t = 0 column is complete.

Applying the first-order implicit method to the data in table 12.1 givesa = −0.142, b = 1.400, c = −0.192, l1 = b = 1.400, l2 = b − a · c/1.400 =1.381, l3 = b − a · c/1.381 = 1.380. Table 12.2 gives the values of k1, k2, k3

for each time step. For example, the value of k1 at t = 0.66 is e1 − a · d0 =0.393 − a · (0.632), where e1 = 0.393 is the option’s terminal value at thet = 1.0, s = −0.5 node, and d0 = 0.632 is the option’s lower boundaryvalue at the t = 0.66 step. Applying the formulas throughout leads to thesecond figure in each block at the interior nodes of figure 12.2.

Table 12.2. Values of kj in first-order implicitsolution of example in Fig. 12.2.

t k1 k2 k3

0.66 0.483 0.049 0.0050.33 0.483 0.085 0.0120.00 0.483 0.112 0.020

Page 606: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch12 FA

Solving P.D.E.s Numerically 589

12.2.3 Crank-Nicolson’s Second-Order Implicit Method

As mentioned earlier, a limitation of the first-order implicit method is thatthe forward approximation for dt(s, t) is correct only to O(∆t). The Crank-Nicolson scheme is another implicit method that is second-order correct forboth time and log price. One expects it to converge faster and require fewertime steps than either of the two methods previously described.

The method involves a clever “swindle” that has the effect of cuttingthe time step in half and yet requires no more columns in the solutionarray. Here is the idea. If the time step ∆t were really cut in half anddt(s, t) approximated at (s, t + ∆t/2), then the centered approximation todt, correct to the second order, would be

d0t (s, t + ∆t/2) =

d(s, t + ∆t) − d(s, t)2(∆t/2)

(12.16)

=d(s, t + ∆t) − d(s, t)

∆t

= d+t (s, t).

That is, the centered approximation at the intermediate time step is thesame as the forward approximation at (s, t). If we could somehow replacethe two second-order approximations for ds and dss on the right side of(12.9) by approximations at (s, t+∆t/2) as well, then the entire result wouldbe second-order correct. The trick is finding how to do this without actuallyreducing the time step. The solution proposed by Crank and Nicolson (1947)is just to average the second-order approximations to ds and dss at (s, t)and (s, t + ∆t). This gives for d0

s(s, t + ∆t/2) the approximation

d0s(s, t) + d0

s(s, t + ∆t)2

=d(s + ∆s, t) − d(s − ∆s, t)

4∆s+

d(s + ∆s, t + ∆t) − d(s − ∆s, t + ∆t)4∆s

and for d0ss(s, t + ∆t/2)

d0ss(s, t) + d0

ss(s, t + ∆t)2

=d(s + ∆s, t) + d(s − ∆s, t) − 2d(s, t)

2∆s2

+d(s + ∆s, t + ∆t) + d(s − ∆s, t + ∆t) − 2d(s, t + ∆t)

2∆s2.

Page 607: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch12 FA

590 Pricing Derivative Securities (2nd Ed.)

Substituting these into the p.d.e.

dt + dsp + dssq − dr = 0,

moving the knowns evaluated at t + ∆t to the left side, and using thesame simplified notation as for the implicit method give an expression justlike (12.11),

ei = a · di−1 + b · di + c · di+1,

but with different interpretations of ei and the coefficients. Letting si =sMin + i∆s as before, take

ei ≡ a∗d(si−1, t + ∆t) + b∗d(si, t + ∆t) + c∗d(si+1, t + ∆t)

with

a∗ = −(p/4)∆t/∆s + (q/2)∆t/∆s2

b∗ = 1 − q∆t/∆s2

c∗ = (p/4)∆t/∆s + (q/2)∆t/∆s2

and

a = (p/4)∆t/∆s− (q/2)∆t/∆s2

b = 1 + r∆t + q∆t/∆s2

c = −(p/4)∆t/∆s− (q/2)∆t/∆s2.

Note that ei still depends on values of the derivative at the next time step,which are always known as we work backwards in time. With these newdefinitions of the symbols the solution proceeds just as in the first-orderversion of the implicit method.

Applying the method to the data for the American put, we havea = −0.142, b = 2.467, c = −0.192, a∗ = 0.142, b∗ = 1.667, c∗ = 0.192,

l1 = 2.467, l2 = 2.456, l3 = 2.456, and the k values shown in table 12.3.The third figure at each interior node in figure 12.2 is the Crank-Nicolsonvaluation.

Table 12.3. Values of kj in Crank-Nicolson solutionof example in Fig. 12.2.

t k1 k2 k3

0.66 0.835 0.104 0.0060.33 0.843 0.175 0.0200.00 0.849 0.226 0.037

Page 608: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch12 FA

Solving P.D.E.s Numerically 591

Table 12.4. Percentage pricing errors for Europeanput: Finite-difference and binomial methods.

1st-Order Crank-nT Explicit Implicit Nicolson Binomial

25 −0.48 −1.81 −1.14 1.1550 −0.10 −0.75 −0.42 −0.02

100 0.03 −0.27 −0.13 0.09200 0.06 −0.10 −0.02 0.10400 0.04 −0.03 −0.00 0.08800 0.01 0.00 0.00 0.04

12.2.4 Comparison of Methods

Program PDEgeoBm on the accompanying CD allows the user to select any ofthe three finite-difference methods to solve the log-price formulation of thefundamental p.d.e. with constant interest rates, dividend rate, and volatil-ity. The user merely specifies the type of option (put or call, American orEuropean), the specifications for underlying and option, the range of logprice, and the dimensions of the time/log-price grid.

To judge the accuracies of the three finite-difference methods and tocompare with the binomial approach, we apply all four methods to a prob-lem for which Black-Scholes gives an exact answer: pricing a one-yearEuropean put with X = 10.0, S0 = 10.125, σ = 0.3, and r = ln(1.05).Table 12.4 shows percentage deviations from the Black-Scholes price,PE(S0, 1.0) = 0.8949, for time steps ranging from nT = 25 to 800. In thisparticular example the explicit method gives surprisingly good accuracyeven for nT = 25, but the faster convergence of Crank-Nicolson is appar-ent. The computing time is about the same for the three finite-differencemethods, and all are faster than the binomial.

12.3 Extensions

This section shows how finite-difference methods can be adapted to dealwith more sophisticated dynamics than geometric Brownian motion andhow to handle payments by the underlying of lump-sum dividends.

12.3.1 More General P.D.E.s

Consider now trying to solve a financial p.d.e. of the more general form

κ(St, t) − DT−t + DSµ(St, t) + DSSσ2(St, t)/2 − Drt = 0.

Page 609: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch12 FA

592 Pricing Derivative Securities (2nd Ed.)

This accommodates (i) financial derivatives that themselves generatecontinuous time- and/or price-dependent cash receipts κ(St, t); (ii) time-varying (but deterministic) short rates rt; (iii) time/price-dependent con-tinuous dividend payouts by the underlying; and (iv) volatility that is anyfunction of current price and time that is bounded on S ×T . For example,(iii) could be handled by setting µ(St, t) = St[rt − δ(St, t)] for an appropri-ate payout function δ(·, ·), and (iv) would accommodate a generalized c.e.v.specification such as σ(St, t) = σ0St + σ1S

1−γt .3 To allow this generality it

will be necessary to leave the p.d.e. in terms of absolute prices rather thantransforming to logs. However, since it will still be convenient to work inforward time rather than time to expiration, we continue to work withforward-time derivative Dt and write the p.d.e. as

κ(St, t) + Dt + DSµ(St, t) + DSSσ2(St, t)/2 − Drt = 0. (12.17)

In this general setting it may be difficult to choose ∆t and ∆S so as toguarantee convergence of the explicit method. For example, the coefficientb∗ that would apply in this case, 1−σ2(St, t)∆t/∆S, could become negativefor large values of σ. For this reason we now focus solely on the Crank-Nicolson method.

Constructing the grid in price levels requires choosing prices Smax andSmin that bound St0≤t≤T with high probability. What bounds are suitablewill depend on the model for the process. The price step ∆S should bechosen in conjunction with Smin so that (S0 −Smin)/∆S is an integer, thusensuring that the initial price S0 is in the grid S. In constructing the timegrid T the usual recommendation is to choose ∆t to be proportional to∆S2. One approach is to set the number of steps nT equal to the greatestinteger in T/∆S2, then take ∆t = T/nT . This ensures that the derivative’slifetime comprises an integral number of time steps.

Having constructed the grid, we now replace Dt, DS , and DSS in (12.17)by second-order formulas D0

t (S, t+∆t/2), D0S(S, t+∆t/2), and D0

SS(S, t+∆t/2), corresponding to (12.16), (12.6), and (12.7) but with price and time

3Recall from section 3.2.3 that µ(St, t) and σ(St, t) must satisfy certain conditionsif there is to be a unique solution to the s.d.e. dSt = µ(St, t) · dt + σ(St, t) · dWt withgiven initial condition S0. See Karatzas and Shreve (1991, pp. 286–291) and, in relationto the c.e.v. model in particular, Duffie (1996, pp. 291–293).

This scheme does not accommodate formulations in which µ and σ depend nontrivially

on past prices, as in the Hobson and Rogers (1998) model, or on other risk sources, as instochastic-volatility models. Since these involve additional state variables, they requirelattices of dimension greater than two.

Page 610: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch12 FA

Solving P.D.E.s Numerically 593

as arguments. As before, this gives an expression involving values of thefinancial derivative at each of three price steps and at the current and nexttime step, t and t+∆t. Let Si = Smin+i∆S be the price at the ith step, i ∈0, 1, . . . , nS, and let tj = j∆t be the time at step j ∈ 0, 1, . . . , nT . Withthe time step understood to conserve notation, and setting Di ≡ D(Si, tj),the p.d.e. can be expressed as,

ei = aiDi−1 + biDi + ciDi+1, (12.18)

where

ei+1 ≡ κ(Si, tj)∆t + a∗i Di−1 + b∗i Di + c∗i Di+1

a∗i ≡ −[µ(Si, tj)/4]∆t/∆S + [σ(Si, tj)2/4]∆t/∆S2

b∗i ≡ 1 − [σ(Si, tj)2/2]∆t/∆S2

c∗i ≡ [µ(Si, tj)/4]∆t/∆S + [σ(Si, tj)2/4]∆t/∆S2

ai ≡ −a∗i

bi ≡ 1 + rtj ∆t + [σ(Si, tj)2/2]∆t/∆S2

ci ≡ −c∗i .

This is the same as (12.11) except that a, b, and c may change at each priceand time step.

In solving the equations the only changes are that a, b, and c have to becalculated at each grid point and that the vector l must be reconstructed ateach time step. To proceed, begin as before at next-to-last time step tnT −1

and first price step S1 = Smin+∆S. Calculate and store in the first positionof the (nS − 1) -vectors k and l the elements k1 = e1 − a1D0 and l1 = b1.Calculate c1 and store as the first element of vector c. Looping through theprice steps, for i = 2, 3, . . . , nS − 1 fill out the c vector and generate theremaining elements of k and l recursively as

ki = ei − aiki−1

li−1

li = bi −aici−1

li−1.

Beginning then at price step i = nS − 1 where Di+1 = DnS is knownand continuing down to price step i = 1, solve ki = liDi + ciDi+1 forDi in terms of Di+1. After checking each entry to allow for the pos-sibility of early exercise, back up to the previous time step and repeatthe procedure, continuing in this way until D1, D2, . . . , DnS−1 have beenfound at t = 0.

Page 611: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch12 FA

594 Pricing Derivative Securities (2nd Ed.)

The program PDEgeneral on the CD applies the Crank-Nicolson methodto the general form of the financial p.d.e., allowing flexible behavior ofinterest rates, continuous rates of dividend payout, and volatility.

12.3.2 Allowing for Lump-Sum Dividends

The first task in accommodating discrete payments by the underlying is tochoose time steps ∆t such that payment dates coincide approximately withelements of T . At each ex -dividend date the various prices in the set S arereduced by the amount of the dividend. If the log-price formulation is used,dividend payments proportional to the price level can be allowed by addingln(1− δ) to each price, where δ is the proportional payout. If payments arefixed in currency units, one would formulate the p.d.e. in absolute pricesrather than logs, subtracting the fixed dividend from each element of S onthe ex date. More complicated price-dependent payout schemes require thesize of the price step ∆S to vary over S. This introduces another sourceof variation in the coefficients a, b, and c of the implicit methods, but itcauses no special difficulty.

Page 612: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch13 FA

13Programs

This chapter serves as a guide to the contents of the accompanyingCDROM, which contains source code in FORTRAN, C++, and VBA(Visual Basic) for many of the computations described in earlier chap-ters.1 The CD contains a separate folder for each language, and each suchfolder contains subfolders corresponding to program groups. The sectionsthat follow correspond to the program groups. Each section contains a listof the component programs and a brief description of what each programdoes.

Several of the FORTRAN and C++ programs require routines in thecorresponding editions of Press et al. (1992).2 Others require routinesincluded here. All the VBA routines are self-contained, with input andoutput handled through the spreadsheets. Routines INVERTCF and INVFFT

are not implemented in VBA. ContinuousTime.xls executes doB S, CEV,COMPOUND, EXTEND and JUMP using VBA versions of PHI and GAUSSINT.

1C++ programs were developed from the FORTRAN by Michael Nahas. VBA sourcecode, visible as macros from the spreadsheets, was developed by Ben Koulibali and UbboWiersema. The various versions of these programs are intended for instruction only.Although produced to the best of our ability, one cannot be certain that they are freeof errors. It is nevertheless hoped that they will be found useful in guiding those whointend to conduct their own research or develop their own applications.

2Available on the Numerical Recipes (2nd Ed.) Multi-Language Code CD ROM,Cambridge University Press.

595

Page 613: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch13 FA

596 Pricing Derivative Securities (2nd Ed.)

13.1 Generate and Test Random Deviates

13.1.1 Generating Uniform Deviates

function randfun(seed)

Purpose: Generate pseudorandom uniform (0,1) deviates.

Input: Double precision number >1, seed

Output: One uniform deviate.

References: Fishman, G. (1996) Monte Carlo: Concepts,

Algorithms, and Applications, Springer-Verlag:

New York.

Notes: Adapted to function subprogram from subroutine

written by L.R. Moore, in Fishman (1996).Required routines: None

13.1.2 Generating Poisson Deviates

subroutine poidev(theta,n,jv,seed)

Purpose: Generates pseudorandom Poisson variates by

inversion of c.d.f.Input: Poisson parameter, theta

Number of variates generated, n

Double precision seed value >1, seed

Output: n-vector of Poisson deviates, jv

Notes: (1) A uniform u is generated. A table of F(j) is

constructed out to j* = first j: F(j) > u, and

j* is returned as the deviate. For new values of

u less than umax (the max of preceding u’s) the

corresponding j’s are read from the table. When

u > umax is generated, the table is extended.

(2) Uses RANDFUN to generate uniform deviates.

Replace as desired.

Required routines: RANDFUN

13.1.3 Generating Normal Deviates

function gausfun(seed)

Purpose: Generates pseudorandom standard normals by

adaptation of algorithm NA (see Reference) with

Page 614: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch13 FA

Programs 597

exponentials generated via inverse probability

transform.

Input: seed, a double precision number greater than or

equal to unity.

Output: A single pseudorandom normal deviate.Reference: Fishman, G.S.(1996), Monte Carlo: Concepts,

Algorithms, and Applications, Springer: New York

Notes: Uses RANDFUN to generate uniform deviates.

Required routines: RANDFUN

13.1.4 Testing for Randomness

function runtest(u,n,nrun)

Purpose: Perform large-sample runs test for randomness.

Input: n-vector of observations, u

Number of observations, n

Output: Runs test statistic, distributed as chi-square(1)

under null hypothesis of randomness.

Number of runs, nrun.Reference: Wonnacott, T. and Wonnacott, R. (1977)

Introductory Statistics. Wiley: New York.

Notes:

Significance level/ Critical points

.500 .250 .100 .050 .025 .010 .005

0.45 1.32 2.71 3.84 5.02 6.63 7.88

Required routines: None

13.1.5 Testing for Uniformity

function utest(n,u)Purpose: Test that data are i.i.d as uniform on (0,1)

Input: n-vector of numbers to be tested for uniformity,

u Number of observations, n

Output: N2 test statistic for uniformity

Reference: D’Agostino, R. and Stephens, M. (1986)

Goodness of Fit Techniques, Dekker: New York,

pp.351-354, 357.

Notes: Statistic is distributed for large n

approximately as chi-square(2).

Page 615: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch13 FA

598 Pricing Derivative Securities (2nd Ed.)

Significance Level/Critical Points

.500 .250 .100 .050 .025 .010 .005

1.386 2.773 4.605 5.991 7.378 9.210 10.597

Required routines: None

13.1.6 Anderson-Darling Test for Normality

subroutine ADtest(n,y,ybar,sdy,ad)

Purpose: Conducts modified Anderson-Darling test of

normality on vector y of length n.

Input: y, vector of length n

ybar, sdy = sample mean and variance (divisor

n-1) If ybar,sdy = 0 on input, they will be

computed

Output: Anderson-Darling statistic, ad

Reference: ’’Anderson-Darling test for goodness of fit’’,

Encyclopedia of Statist. Sciences, v1, pp.81-85

eq.(3) and a* modification for case 3, as

specified in table 1.

Significance level/critical points (all n)

.100 .050 .025 .010

.631 .752 .873 1.035

Notes: Vector y is returned in increasing order

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Required routines: PHI

From Numerical Recipes, Press et al. (1992): SORT

13.1.7 ICF Test for Normality

subroutine icftest(n,y,cf)

Purpose: Test of normality, mean and variance unspecified

Input: Number of observations, n

Vector of observations, y

Output: Value of CF statistic, cfReference: Epps, T. & Pulley, L.(1983) Biometrika 70(3),

723-6 Baringhaus,L. et al.(1989) Commun. in

Statist., Comp. & Simul., 18(1), 363-79

Notes: Data vector is returned in standard form.

Page 616: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch13 FA

Programs 599

Percentage Points of CF

n\1-p | 0.900 0.950 0.975 0.990

----------------------------------------------

4 | 0.248 0.291 0.316 0.332

5 | 0.243 0.323 0.388 0.4456 | 0.262 0.325 0.403 0.499

8 | 0.269 0.344 0.424 0.523

10 | 0.278 0.355 0.435 0.542

15 | 0.284 0.364 0.446 0.560

20 | 0.288 0.369 0.449 0.562

30 | 0.289 0.370 0.456 0.564

50 | 0.290 0.372 0.459 0.574

Required routines: None

13.2 General Computation

13.2.1 Standard Normal CDF

function phi(x)Purpose: Evaluate Standard Normal CDF

Input: x

Output: Phi(x)= Pr(Z<x), where Z ~N(0,1)

Reference: Moran, P.A.P. (1980), ’’Calculation of the normal

distribution function’’, Biometrika 67 (3),

675-676.

Notes: Accuracy to 7 decimal places when |x|< 5.

Required

routines: None

13.2.2 Expectation of Function of Normal Variate

function GaussInt(f,a,b,dz)

Purpose: Integrate function f(z) between a and b with

respect to standard normal density by rectangularapproximation on grid spaced ’dz’.

Input: Integration limits, a and b

Grid spacing, dz

Output: Integral between a and b of product of f(z) and

standard normal p.d.f.

Page 617: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch13 FA

600 Pricing Derivative Securities (2nd Ed.)

Notes:

(1) ’dz’ must be specified; a suggested value is .001.

(2) For a=-inf and/or b=+inf, set a=-1000. and/or

b=+1000.

(3) ’f’ must appear in EXTERNAL statement in main program.Required routines: User-supplied function f(z)

13.2.3 Standard Inversion of Characteristic Function

subroutine INVERTCF(x,nx,Fx,N,h,phi)

Purpose: Evaluate c.d.f. F(x) at nx points x by Fourier

inversion of c.f.

Input: Vector of points, x

Length of vector, nx

Length of grid at which c.f. is evaluated, N

Spacing of grid for c.f., h

Output: Vector of c.d.f. values, Fx

Reference: Waller, L et al.(1985) Amer. Statist. 49, 346-50.

Notes: (1) Random variable whose c.f. is inverted should be

roughly centerded and with unit scale; e.g., instandard form.

(2) N and h correspond to H and eta in Waller et al.

and control the grid on which c.f. is evaluated.

Reducing h with N fixed makes the grid denser,

sharpening estimates of tail probabilities, but

shrinks the range, [-Nh,Nh]. Increasing N with h

fixed extends range and improves accuracy near

mode.

Required routines:

User-supplied subroutine PHI to calculate c.f. of a

standardized random variable. Example routine

supplied.

13.2.4 Inversion of Characteristic Function by FFT

subroutine invfFt(n,t,CDF)

Purpose: Apply the fast Fourier transform to invert the

c.f. corresponding to an absolutely continuous

c.d.f. F, and thereby to recover values of F(t)

on a grid of points t.

Page 618: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch13 FA

Programs 601

Input: Number of points at which F is evaluated, n

Output: Values of c.d.f. at points t, CDF

Reference: Press, W. et al (1992) Numerical Recipes in

FORTRAN, Cambridge Press.

Required routines:(1) User-supplied complex-valued function PSI that

calculates from the parameters a ’center’ and a

’scale,’ then calculates 1/(i*t) times the c.f. of

centered and scaled r.v. whose distribution is F.

(2) From Numerical Recipes, Press et. al.(1992),

FOUR1

Notes: (1) n must be a power of 2 no larger than 2**14,

depending on available memory. Larger n =>

greater range of values where F is evaluated

& better accuracy in the tails, but slower

execution.

(2) Choice of center and scale is not crucial if

most probability mass is between -10 and

+10 after centering and scaling. Originalcenter and scale is restored at output.

(3) Two example psi functions, PSI1 and PSI2, are

provided. To run one, change the name to PSI.

(4) Set options in ***Set options*** block to

control printing and domain on which F is

calculated.

13.3 Discrete-Time Pricing

13.3.1 Binomial Pricing

program binomial

Purpose: Prices European, American, and extendable options

on stocks, currencies, indexes, or futures under

Bernoulli dynamics

Inputs: Called for on execution

Output: Summary of input data and value of optionRequired routines: None

Page 619: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch13 FA

602 Pricing Derivative Securities (2nd Ed.)

13.3.2 Solving PDEs under Geometric Brownian Motion

PROGRAM PDEgeoBm

Purpose: Solve the financial PDE when price follows

geometric Brownian motion by choice of EXPLICIT,

first-order IMPLICIT, or CRANK-NICHOLSON methods.

Input: Values in parameter statements as instructed.

Output: Option value over range of prices around the

strike.

Reference: Smith, G. (1985) Numerical Solution of Partial

Differential Equations, 3d ed., Clarendon Press:

Oxford

Required routines: None

13.3.3 Crank-Nicolson Solution of General PDE

PROGRAM PDEgeneral

Purpose: Solve financial PDE by Crank-Nicholson method,

allowing for time-varying interest rates andtime- and price- dependent cash flow, mean

drift, and volatility.

Input: Values in parameter statements as instructed in

routine. Specify interest-rate, cash flow, drift,

and volatility processes in function subprograms.

Examples supplied.

Output: Option value over range of prices around strike.

Reference: Smith, G. (1985) Numerical Solution of Partial

Differential Equations, 3d ed., Clarendon Press:

Oxford

Required routines: None

13.4 Continuous-Time Pricing

13.4.1 Shell for Black-Scholes with Input/Output

program doB_S

Purpose: Executes B_S subroutine with input/output

Input: Called for at execution

Output: Black-Scholes prices

Required routines: B_S, PHI

Page 620: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch13 FA

Programs 603

13.4.2 Basic Black-Scholes Routine

subroutine B_S(SS,X,sigma,B,T,underlying,div,option,optval)

Purpose: Calculates Black-Scholes price of European option

Input: Underlying price, SS

Strike price, X

Volatility, sigma

Price of unit bond maturing at expiration, B

Expiration date, T

Type of underlying asset, (’stock’,’currency’,

’index’, or ’futures’), underlying

Dividend (if underlying=’stock’) or dividend rate

(if ’underlying’=’currency’ or ’index’), div

Type of option (put or call), option

Output: Value of option, optval

Required routines: PHI

13.4.3 Pricing under the C.E.V. Model

program cev

Purpose: Values European puts and calls on underlying

asset that follows c.e.v. dynamics.

Inputs: Called for at execution

Output: Values of puts and calls under c.e.v. dynamics,

Black- Scholes values at same initial volatility,

and differences, for range of moneyness values,

S/X

Required routines: GAMPDF (included), B_SFrom Numerical Recipes, Press et. al.(1992): GAMMLN,GAMMQ,

GSER,GCF

13.4.4 Pricing a Compound Option

program compound

Purpose: Value compound call option (call on a call)

Input: Initial value & volatility of underlying

Strikes of compound & primary

Expiration times of compound & primary

Output: Value of compound call

Page 621: Pricing derivative securities

April 21, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in ch13 FA

604 Pricing Derivative Securities (2nd Ed.)

Reference: Geske, R. (1979), "Valuation of compound

options," J.Financial Econ. 7, 63-91.

Required routines: B_S, PHI, Gaussint

From Numerical Recipes, Press et. al. (1992): ZBRAC, RTBIS

13.4.5 Pricing an Extendable Option

program extend

Purpose: Value a put on a no-dividend stock that must beexercised if in the money at t* and is otherwise

extended to T.

Inputs: Called for at execution

Output: Values of European puts expiring at t* and T,

and value of the extendable put

Required routines: B_S, PHI, GAUSSINT

13.4.6 Pricing under Jump Dynamics

program jump

Purpose: Value European put or call on underlying

following mixed jump-diffusion dynamics.

Inputs: Called for at execution.

Output: Price of European option.

Required routines: B_S, PHI

Page 622: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in Bibliography FA

Bibliography

1. Abramowitz, M. and Stegun, I. (1970) Handbook of Mathematical Functions.New York: Dover.

2. Ahn, C.M. (1992) “Option pricing when jump risk is systematic,”Mathematical Finance 2, 299–308.

3. Amin, K.I. (1993) “Jump diffusion option valuation in discrete time,”J. Finance 48, 1833–1863.

4. Anderson, T.W. and Darling, D.A. (1954) “A test of goodness of fit,”J. American Statistical Association 49, 765–769.

5. Arnold, L. (1974) Stochastic Differential Equations: Theory andApplications. New York: John Wiley.

6. Arrow, K.J. (1964) “The role of securities in the optimal allocation of risk-bearing,” Review of Economic Studies 31, 91–96. Translation of “Le role desvaleurs boursiers pour la repartition la meilleure des risques,” Econometrie,41–48, Centre National de la Recherche Scientifique Paris, 1953.

7. Ash, R.B. (1972) Real Analysis and Probability. New York: Academic Press.8. Athreya, K.B. and Ney, P.E. (1972) Branching Processes. New York:

Springer-Verlag.9. Bajeux, I. and Rochet, J.C. (1992) “Dynamic spanning: Are options an

appropriate instrument?” Mathematical Finance 6, 1–16.10. Bakshi, G.; Cao, C.; and Z. Chen (1997) “Empirical performance of alter-

native option pricing models,” J. Finance 52, 2003–2049.11. Baringhaus, L.; Danschke, R.; and N. Henze (1989) “Recent and classical

tests for normality—a comparative study,” Communications in Statistics,Theory and Methods 18, 363–379.

12. Baringhaus, L. and Henze, N. (1988) “A consistent test for multivari-ate normality based on the empirical characteristic function.” Metrika 35,339–348.

13. Barndorff-Nielsen, O.E. and Halgreen, O. (1977) “Infinite divisibility of thehyperbolic and generalized inverse Gaussian distributions,” Zeitschrift furWahrscheinlichkeitstheorie und Verwandte Gebiete 38, 309–312.

14. Barone-Adesi, G. and Whaley, R.E. (1987) “Efficient analytic approxima-tion of American option values,” J. Finance 42, 301–320.

605

Page 623: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in Bibliography FA

606 Pricing Derivative Securities (2nd Ed.)

15. Barone-Adesi, G. and Whaley, R.E. (1988) “On the valuation of Americanput options on dividend-paying stocks,” Advances in Futures and OptionsResearch 3, 1–13.

16. Bartoszynski, R. and Niewiadomska-Bugaj, M. (1996) Probability and Sta-tistical Inference. New York: John Wiley.

17. Bates, D.S. (1996a) “Jumps and stochastic volatility: Exchange rate pro-cesses implicit in Deutschemark options,” Review of Financial Studies 9,69–108.

18. Bates, D.S. (1996b) “Testing option pricing models,” in G.S. Maddala andC.R. Rao, eds., Handbook of Statistics 14, 567–611. Amsterdam: Elsevier.

19. Baxter, M. and Rennie, A. (1996) Financial Calculus. Cambridge (UK):Cambridge Press.

20. Beckers, S. (1980) “The constant elasticity of variance model and its impli-cations for option pricing,” J. Finance 35, 661–73.

21. Bergman, Y.; Grundy, B.; and Z. Wiener (1996) “General properties ofoption prices,” J. Finance 51, 1573–1610.

22. Bertoin, J. (1996) The Levy Processes. New York: Cambridge Press.23. Billingsley, P. (1986) Probability and Measure, 2d. ed. New York: John Wiley.24. Black, F. and Scholes, M. (1973) “The pricing of options and corporate

liabilities,” J. Political Economy 81, 637–54.25. Black, F. (1976a) “Studies of stock price volatility changes,” Proceedings,

American Statistical Assn., 177–181.26. Black, F. (1976b) “The pricing of commodity options,” J. Financial Eco-

nomics 3, 167–179.27. Black, F. and Cox, J. (1976) “Valuing corporate securities: Some effects of

bond indenture provisions. J. Finance 31, 351–367.28. Blattberg, R.C. and Gonedes, N.J. (1974) “A comparison of the stable and

Student distributions as statistical models for stock prices,” J. Business 47,244–280.

29. Bohman, H. (1975) “Numerical inversion of characteristic functions,”Scandanavian Actuarial J., 121–124.

30. Bollerslev, T.; Chou, R.Y.; and K.F. Kroner (1992) “ARCH modeling infinance: A review of the theory and empirical evidence,” J. Econometrics52, 5–59.

31. Boothe, P. and Glassman, D. (1987) “The statistical distribution ofexchange rates,” J. International Economics 22, 297–319.

32. Box, G.E.P. and Muller, M.E. (1958) “A note on the generation of randomnormal deviates,” Annals of Mathematical Statistics 29, 610–611.

33. Boyle, P.; Broadie, M.; and P. Glasserman (1997) “Monte Carlo methodsfor security pricing,” J. Economic Dynamics and Control 21, 1267–1321.

34. Brace, A.; Gatarek, D.; and M. Musiela (1997) “The market model of inter-est rate dynamics,” Mathematical Finance 7, 127–154.

35. Brennan, M.J. and Schwartz, E.S. (1977) “The valuation of American putoptions,” J. Finance 32, 449–462.

36. Brigo, D. and Mercurio, F. (2001) Interest Rate Models: Theory andPractice. Berlin: Springer–Verlag.

Page 624: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in Bibliography FA

Bibliography 607

37. Broadie, M. and Detemple, J. (1996) “American option valuation: Newbounds, approximations, and a comparison of existing methods,” Reviewof Financial Studies 9, 1211–50.

38. Broadie, M. and Glasserman, P. (1997) “Pricing American-style securitiesby simulation,” J. Economic Dynamics and Control 21, 1323–1352.

39. Campbell, J.; Lo, A.; and A. MacKinlay (1997) The Econometrics of Finan-cial Markets. Princeton, NJ: Princeton Press.

40. Cannon, J.R. (1984) The One-Dimensional Heat Equation, v. 23 in Ency-clopedia of Mathematics and Its Applications, G. Rota, ed. Menlo Park, CA:Addison-Wesley.

41. Carr, P.; Jarrow, R.; and R. Myneni (1992) “Alternative characterizationsof American put options,” Mathematical Finance 2, 87–106.

42. Carr, P. and Madan, D. (1998) “Option valuation using the fast Fouriertransform,” J. Computational Finance 2, 61–73.

43. Carr, P. and Wu, L. (2003) “The finite moment log stable process and optionpricing,” J. Finance 58, 753–777.

44. Carverhill, A.P. and Clewlow, L.J. (1990) “Flexible convolution,” Risk 3(4),25–29.

45. Chernov, M. and Ghysels, E. (1998) “A study towards a unified approach tothe joint estimation of objective and risk neutral measures for the purposeof options valuation,” J. Financial Economics 56, 407–458.

46. Chou, R.Y. (1988) “Volatility persistence and stock valuations: Some empir-ical evidence using GARCH,” J. Applied Econometrics 3, 279–294.

47. Christie, A.A. (1982) “The stochastic behavior of common stock vari-ances: Value, leverage and interest rate effects,” J. Financial Economics 10,407–432.

48. Chung, K.L. (1974) A Course in Probability Theory. New York: AcademicPress.

49. Chung, K.L. and Williams, R.J. (1983) Introduction to Stochastic Integra-tion. Boston: Birhauser.

50. Clark, P.K. (1973) “A subordinated stochastic process model with finitevariance for speculative prices,” Econometrica 31, 135–155.

51. Clewlow, L. and Carverhill, A. (1994) “Quicker on the curves,” Risk 7(5),63–64.

52. Collatz, L. (1966) The Numerical Treatment of Differential Equations.Berlin: Springer-Verlag.

53. Conze, A. and Viswanathan (1991) “Path dependent options: The case oflookback options,” J. Finance 46, 1893–1907.

54. Cootner, P. (1964) The Random Character of Stock Market Prices.Cambridge (US): MIT Press.

55. Cox, D. and Miller, H. (1972) The Theory of Stochastic Processes. London:Chapman & Hall.

56. Cox, D. and Oakes, D. (1984) Analysis of Survival Data. London: Chapman& Hall.

57. Cox, J. (1975) “Notes on option pricing I: Constant elasticity of variancediffusions,” working paper, Stanford University.

Page 625: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in Bibliography FA

608 Pricing Derivative Securities (2nd Ed.)

58. Cox, J.; Ingersoll, J.; and S. Ross (1985) “A theory of the term structure ofinterest rates,” Econometrica 53, 385–408.

59. Cox, J. and Rubinstein, M. (1985) Options Markets. Englewood Cliffs, NJ:Prentice-Hall.

60. Cox, J. and Ross, S. (1976) “The valuation of options for alternative stochas-tic processes,” J. Financial Economics 3, 145–66.

61. Cox, J.; Ross, S.; and M. Rubinstein (1979) “Option pricing: A simplifiedapproach,” J. Financial Economics 7, 229–63.

62. Crank, J. and Nicolson, P. (1947) “A practical method for numerical eval-uation of solutions of partial differential equations of the heat conductiontype,” Proceedings of Cambridge Philosophical Society 43, 50–67.

63. D’Agostino, R.B. and Stephens, M.A. (1986) Goodness-of-Fit Techniques.New York: Marcel Dekker.

64. Darroch, J.N. and Morris, K.W. (1968) “Passage-time generating func-tions for continuous-time finite Markov chains,” J. Applied Probability 5,414–426.

65. Davis, H.F. (1963) Fourier Series and Orthogonal Functions. New York:Dover.

66. Debreu, G. (1959) Theory of Value. Cowles Foundation Monograph 17. NewHaven: Yale Univ. Press.

67. Delbaen, F. and Schachermayer, W. (1994) “A general version of the fun-damental theorem of asset pricing,” Mathematische Annalen 300, 463–520.

68. Derman, E. and Kani, I. (1994) “Riding on a smile,” Risk 7(2), 32–39.69. Dion, J.P. and Epps, T.W. (1999) “Stock prices as branching processes in

random environments: Estimation,” Communications in Statistics, Simula-tion and Computation 28, 957–976.

70. Dubofsky, D.A. (1991) “Volatility increases subsequent to NYSE and AMEXstock splits,” J. Finance 46, 421–431.

71. Duffie, D.; Pan, J.; and K. Singleton (2000) “Transform analysis and assetpricing for affine jump-diffusions,” Econometrica 68, 1343–1376.

72. Duffie, D. and Singleton, K. (1999) “ Modeling term structures of defaultablebonds,” Review of Financial Studies 12, 687–720.

73. Duffie, D. and Singleton, K. (2003) Credit Risk: Pricing, Measurement, andManagement. Princeton: Princeton Univ. Press.

74. Dumas, B.; Fleming, J.; and R.E. Whaley (1998) “Implied volatility func-tions: Empirical tests,” J. Finance 53, 2059–2106.

75. Dupire, B. (1994) “Pricing with a smile,” Risk 7(1), 18–20.76. Durrett, R. (1984) Brownian Motion and Martingales in Analysis. Belmont,

CA: Wadsworth.77. Durrett, R. (1996) Stochastic Calculus: A Practical Introduction. Boca

Raton, FL: CRC Press.78. Eberlein, E. and Keller, U. (1995) “Hyperbolic distributions in finance,”

Bernoulli 1, 281–299.79. Eberlein, E.; Keller, U.; and K. Prause (1998) “New insights into smile,

mispricing, and value at risk: The hyperbolic model,” J. Business 71,371–407.

Page 626: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in Bibliography FA

Bibliography 609

80. Edwards, C. (2004) Option Pricing with Continuous-time Markov ChainRegime Switching. Ph.D. dissertation, University of Virginia.

81. Edwards, C. (2005) “Derivative pricing models with regime switching: Ageneral approach,” J. Derivatives 13, 41–47.

82. Einstein, A. (1905) “On the movement of small particles suspended in astationary liquid demanded by the molecular kinetic theory of heat,” Annalsof Physics 17.

83. Elliott, R.J. (1982) Stochastic Calculus and Its Applications. Berlin:Springer-Verlag.

84. Elliott, R.J. and Kopp, P.E. (1999) Mathematics of Financial Markets.New York: Springer-Verlag.

85. Epps, T.W. and Pulley, L.B. (1983) “A test for normality based on theempirical characteristic function,” Biometrika 70, 723–726.

86. Epps, T.W.; Pulley, L.B.; and D.B. Humphrey (1996) “Assessing the FDIC’spremium and examination policies using ‘Soviet’ put options,” J. Bankingand Finance 20, 699–721.

87. Epps, T.W. (1996) “Stock prices as branching processes,” Communicationsin Statistics, Stochastic Models 12, 529–558.

88. Eraker, B.; Johannes, M.; and N. Polson (2003) “The impact of jumps involatility and returns,” J. Finance 58, 1269–1300.

89. Eraker, B. (2004) “Do stock prices and volatility jump? Reconciling evidencefrom spot and option prices,” J. Finance 59, 1367–1403.

90. Fama, E. (1965) “The behavior of stock market prices,” J. Business 38,34–105.

91. Fang, H. (2002) Jumps with a Stochastic Jump Rate. Ph.D. dissertation,University of Virginia.

92. Feller, W. (1951) “Two singular diffusion problems,” Annals of Mathematics54, 173–182.

93. Feller, W. (1971) An Introduction to Probabilty Theory and Its Applicationsv.2, New York: John Wiley.

94. Feynman, R.J. (1948) “Space-time approach to nonrelativistic quantummechanics,” Review of Modern Physics 20, 367–387.

95. Finlay, R. and Seneta, E. (2006) “Stationary-increment Student andvariance-gamma processes,” J. Applied Probability 43, 1–13.

96. Fishman, G. (1996) Monte Carlo: Concepts, Algorithms, and Applications.New York: Springer-Verlag.

97. Fu, M.C.; Madan, D.B.; and T. Wang (1999) “Pricing continuous Asianoptions: A comparison of Monte Carlo and Laplace transform inversionmethods,” J. Computational Finance 2, 49–74.

98. Garcia, D. (2003) “Convergence and biases of Monte Carlo estimates ofAmerican option prices using a parametric exercise rule,” J. EconomicDynamics and Control 27, 1855–1879.

99. Geman, H. and Eydeland, A. (1995) “Domino effect,” Risk 8(4), 65–67.100. Geman, H. and Yor, M. (1992) “Quelques relations entre processus de

Bessel, options asiatiques, et fonctions confluentes hypergeometriques,”Compte Rendu Academie des Sciences Paris, Series I, 471–474.

Page 627: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in Bibliography FA

610 Pricing Derivative Securities (2nd Ed.)

101. Geman, H. and Yor, M. (1993) “Bessel processes, Asian options and per-petutities,” Mathematical Finance 3, 349–375.

102. Geske, R. (1979) “The valuation of compound options,” J. Financial Eco-nomics 7, 63–81.

103. Geske, R. and H. Johnson (1984) “The American put option valued analyt-ically,” J. Finance 39, 1511–24.

104. Ghysels, E.; Harvey, A.C.; and E. Renault (1996) “Stochastic volatility,”in G.S. Maddala and C.R. Rao, eds., Handbook of Statistics 14, 119–191.Amsterdam: Elsevier.

105. Glasserman, P. (2004) Monte Carlo Methods in Financial Engineering. NewYork: Springer.

106. Goldman, B.; Sosin, H.; and M. Gatto (1979) “Path dependent options: Buyat the low, sell at the high,” J. Finance 34, 1111–1128.

107. Gradshteyn, I. and Ryzhik, I. (1980) Table of Integrals, Series, and Products.New York: Academic Press.

108. Grandell, J. (1976) Doubly Stochastic Poisson Processes. Berlin: Springer-Verlag.

109. Hall, C.A. and Porsching, T.A. (1990) Numerical Analysis of Partial Dif-ferential Equations. Englewood Cliffs, NJ: Prentice-Hall.

110. Hamilton, J. (1989) “A new approach to the economic analysis of nonsta-tionary time series and the business cycle,” Econometrica 57, 357–384.

111. Hammersley, J. and Handscomb, D. (1964) Monte Carlo Methods. London:Methuen.

112. Harris, T.E. (1963) The Theory of Branching Processes. Berlin: Springer-Verlag.

113. Harrison, J.M. and Kreps, D.M. (1979) “Martingales and arbitrage in mul-tiperiod securities markets,” J. Economic Theory 20, 381–408.

114. Harrison, J.M. and Pliska, S.R. (1981) “Martingales and stochastic inte-grals in the theory of continuous trading,” Stochastic Processes and TheirApplications 11, 215–260.

115. Heath, D.C.; Jarrow, R.A.; and A. Morton (1992) “Bond pricing and theterm structure of interest rates: A new methodology for contingent claimvaluation,” Econometrica 60, 77–105.

116. Heston, S.L. (1993) “A closed-form solution for options with stochasticvolatility with applications to bond and currency options,” Review of Finan-cial Studies 6, 327–343.

117. Ho, T.S.Y. and Lee, S.B. (1986) “Term structure movements and pricinginterest rate contingent claims,” J. Finance 41, 1011–1029.

118. Hobson, D.G. and Rogers, L.C.G. (1998) “Complete models with stochasticvolatility,” Mathematical Finance 8, 27–48.

119. Hoeffding, W. (1940) “Masstabvinvariante Korrelantionstheorie,” Schriftendes Mathematischen Institut Angewandte Mathematik der Univ. Berlin 5,179–233.

120. Hull, J. and White, A. (1987) “The pricing of options on assets with stochas-tic volatilities,” J. Finance 42, 281–300.

Page 628: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in Bibliography FA

Bibliography 611

121. Hull, J. and White, A. (1990) “Pricing interest-rate derivative securities,”Review of Financial Studies 3, 573–592.

122. Jackwerth, J.C. and Rubinstein, M. (1996) “Recovering probability distri-butions from option prices,” J. Finance 51, 1611–1631.

123. Jarrow, R.; Lando, D.; and S. Turnbull (1997) “A Markov model for theterm structure of credit risk spreads,” Review of Financial Studies 10,481–523.

124. Jarrow, R. and Rosenfeld, E. (1984) “Jump risks and the intertemporalcapital asset pricing model,” J. Business 57, 337–351.

125. Joshi, M. (2003) The Concepts and Practice of Mathematical Finance.Cambridge (UK): Cambridge Press.

126. Ju, N. (1998) “Pricing an American option by approximating its early exer-cise boundary as a multipiece exponential function,” Review of FinancialStudies 11, 627–646.

127. Kac, M. (1949) “On distributions of certain Wiener functionals,” Transac-tions of American Mathematical Society 65, 1–13.

128. Karatzas, I. and Shreve, S. (1991) Brownian Motion and Stochastic Calcu-lus, 2d. ed. Berlin: Springer.

129. Karlin, S. and Taylor, H.M. (1975) A First Course in Stochastic Processes.New York: Academic Press.

130. Kim, I.J. (1990) “The analytic valuation of American options,” Review ofFinancial Studies 3, 547–572.

131. Kochard, L.E. (1999) Option Pricing and Higher Order Moments of theRisk-Neutral Probability Density Function, Ph.D. dissertation, University ofVirginia.

132. Kolb, R.W. and Overdahl, J.A. (2006) Understanding Futures Markets.Malden, MA: Blackwell.

133. Kolmogorov, A.N. (1950) Foundations of Probability Theory. New York:Chelsea Pub. Co.

134. Kotz, S., Johnson, N.L., and C.B. Read, eds. (1982) Encyclopedia of Statis-tical Sciences v.1. New York: John Wiley.

135. Kullback, S. (1983) “Kullback information,” in S. Kotz et al. (1983) Ency-clopedia of Statistical Sciences v.4.

136. Kushner, H.J. and Dupuis, P.G. (1992) Numerical Methods for StochasticControl Problems in Continuous Time. New York: Springer-Verlag.

137. Laha, R.G. and Rohatgi, V.K. (1979) Probability Theory . New York: JohnWiley.

138. Lando, D. (1994) Three Essays on Contingent Claims Pricing. Ph.D. dis-sertation, Cornell University.

139. Lando, D. (1998) “On Cox processes and credit risky securities,” Review ofDerivatives Research 2, 99–120.

140. Lando, D. (2004) Credit Risk Modeling. Princeton: Princeton Univ. Press.141. Levy, E. (1992) “Pricing European average rate currency options,” J. Inter-

national Money and Finance 11, 474–491.142. Levy, E. and Turnbull, S. (1992) “Average intelligence,” Risk

5(2), 53–59.

Page 629: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in Bibliography FA

612 Pricing Derivative Securities (2nd Ed.)

143. Lintner, J. (1965) “The valuation of risky assets and the selection of riskyinvestments in stock portfolios and capital budgets,” Review of Economicsand Statistics 47, 13–37.

144. Liu, W. (2003) Option pricing with pure jump models, Ph.D. dissertation,University of Virginia.

145. Longstaff, F. (1990) “Pricing options with extendible maturities: Analysisand applications,” J. Finance 45, 935–57.

146. Longstaff, F. and Schwartz, E. (2001) “Valuing American options by sim-ulation: A simple least-squares approach,” Review of Financial Studies 14,113–147.

147. Lukacs, E. (1970) Characteristic Functions. London: Griffin.148. MacMillan, L.W. (1986) “Analytic approximation for the American put

option,” Advances in Futures and Options Research 1, 119–139.149. Madan, D.B. (1999) “Purely discontinuous asset price processes,” working

paper, University of Maryland.150. Madan, D.B.; Carr, P.; and E.C. Chang (1998) “The variance gamma model

and option pricing,” European Finance Review 2, 79–105.151. Madan, D.B. and Milne, F. (1991) “ Option pricing with V.G. martingale

components,” Mathematical Finance 1, 39–55.152. Madan, D.B. and Seneta, E. (1990) “The variance gamma (V.G.) model for

share market returns,” J. Business 63, 511–524.153. Mandelbrot, B. (1963) “The variation of certain speculative prices,” J. Busi-

ness 36, 394–419.154. Mandelbrot, B. and Hudson, R. (2004) The (Mis)behavior of Markets,

New York: Basic Books.155. Mardia, K.V. (1970) “Measures of multivariate skewness and kurtosis with

applications,” Biometrika 57, 519–530.156. Marsaglia, G. (1995) The Marsaglia Random Number CDROM including the

Diehard Battery of Tests of Randomness, Department of Statistics, FloridaState University.

157. McCulloch, J.H. (1996) “Financial applications of stable distributions,”in G.S. Maddala and C.R. Rao, eds., Handbook of Statistics 14, 393–425.Amsterdam: Elsevier.

158. McDonald, J.B. (1996) “Probability distributions for financial models,”in G.S. Maddala and C.R. Rao, eds., Handbook of Statistics 14, 427–461.Amsterdam: Elsevier.

159. McLachlan, N.W. (1955) Bessel Functions for Engineers. Oxford:Clarendon.

160. Mee, R.W. and Owen, D.B. (1983) “A simple approximation for bivariatenormal probabilities,” J. Quality Technology 15, 72–75.

161. Merton, R.C. (1973) “The theory of rational option pricing,” Bell J. Eco-nomics and Management Science 4, 141–183.

162. Merton, R.C. (1976) “Option pricing when underlying stock returns arediscontinuous,” J. Financial Economics 3, 125–144.

163. Merton, R.C. (1978) “On the cost of deposit insurance when there aresurveillance costs,” J. Business 51, 439–452.

Page 630: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in Bibliography FA

Bibliography 613

164. Merton, R.C. (1990) Continuous Time Finance. Cambridge (US): Blackwell.165. Milevsky, M.A. and Posner, S.E. (1998) “Asian options, the sum of lognor-

mals, and the reciprocal gamma distribution,” J. Financial and QuantitativeAnalysis 33, 409–422.

166. Miller, M.H. and Modigliani, F. (1961) “Dividend policy, growth and thevaluation of shares,” J. Business 34, 411–433.

167. Miranda, M.J. and Fackler, P.L. (2002) Applied Computational Economicsand Finance. Cambridge (US): MIT Press.

168. Mitchell, A.R. and Griffiths, D.F. (1980) The Finite Difference Method inPartial Differential Equations. Chichester: John Wiley.

169. Modigliani, F. and Miller, M.H. (1958) “The cost of capital, corporationfinance and the theory of investment,” American Economic Review 48,261–297.

170. van Moerbeke, P. (1976) “On optimal stopping and the free boundary prob-lem,” Archives of Rational Mechanical Analysis 60, 101–148.

171. Moskowitz, H. and Tsai, H. (1989) “An error-bounded polynomial approx-imation for bivariate normal probabilities,” Communications in Statistics,Simulation and Computation 18, 1421–1437.

172. Mossin, J. (1966) “Equilibrium in a capital asset market,” Econometrica 35,768–783.

173. Musiela, M. and Rutkowski, M. (1998, 2005) Martingale Methods in Finan-cial Modelling, 1st & 2d eds. Heidelberg: Springer-Verlag.

174. Naik, V. (1993) “Option valuation and hedging strategies with jumps in thevolatility of asset returns,” J. Finance 48, 1969–1984.

175. Naik, V. and Lee, M. (1990) “General equilibrium pricing of options on themarket portfolio with discontinuous returns,” Review of Financial Studies3, 493–521.

176. Neyman, J. (1937) “ ‘Smooth’ tests for goodness-of-fit,” Skandinavisk Aktu-arietidskrift 20, 149–199.

177. Officer, R.R. (1972) “The distribution of stock returns,” J. American Sta-tistical Association 67, 807–812.

178. Ohlson, J.A. and Penman, S.H. (1985) “Volatility increases subsequent tostock splits: An empirical aberration,” J. Financial Economics 14, 251–266.

179. Palm, F.C. (1996) “GARCH models of volatility,” in G.S. Maddala and C.R.Rao, eds., Handbook of Statistics 14, 209–240. Amsterdam: Elsevier.

180. Pennacchi, G.B. (1987a) “A reexamination of the over- (or under-) pricingof deposit insurance,” J. Money, Credit, and Banking 19, 340–360.

181. Pennacchi, G.B. (1987b) “Alternative forms of deposit insurance,” J. Bank-ing and Finance 11, 291–312.

182. Praetz, P.D. (1972) “The distribution of share price changes,” J. Business45, 49–55.

183. Press, J. (1967) “A compound events model for security prices,” J. Business40, 317–35.

184. Press, W.H.; Flannery, B.P.; Teukolsky, S.A.; and W.T. Vettering (1986)Numerical Recipes: The Art of Scientific Computing (FORTRAN Version).Cambridge (UK): Cambridge Press.

Page 631: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in Bibliography FA

614 Pricing Derivative Securities (2nd Ed.)

185. Press, W.H.; Flannery, B.P.; Teukolsky, S.A.; and W.T. Vettering (1992a)Numerical Recipes in FORTRAN: The Art of Scientific Computing, 2nd.edn. Cambridge (UK): Cambridge Press.

186. Press, W.H.; Flannery, B.P.; Teukolsky, S.A.; and W.T. Vettering (1992b)Numerical Recipes in C: The Art of Scientific Computing, 2d ed. Cambridge(UK): Cambridge Press.

187. Protter, P. (1990) Stochastic Integration and Differential Equations: A NewApproach. Berlin: Springer-Verlag.

188. Qiu, R. (2004) Put Warrants: How They Are Valued and How They AffectManagerial Incentives. Ph.D. dissertation, University of Virginia.

189. Rebonato, R. (1999) “Calibrating the BGM model,” Risk 12, 74–79.190. Relton, F.E. (1946) Applied Bessel Functions. London: Blackie & Son.191. Renault, E. and Touzi, N. (1996) “Option hedging and implied volatilities

in a stochastic volatility model,” Mathematical Finance 6, 279–302.192. Revuz, D. and Yor, M. (1991) Continuous Martingales and Brownian

Motion. Berlin: Springer–Verlag.193. Romano, M. and Touzi, N. (1997) “Contingent claims and market complete-

ness in a stochastic-volatility model,” Mathematical Finance 7, 399–412.194. Ross, S.M. (2000) Introduction to Probability Models. San Diego, CA:

Academic Press.195. Rowell, D. (2004) “Computing the matrix exponential: The Cayley-

Hamilton method,” class notes for “Advanced System Dynamics andControl,” Dept. of Mechanical Engineering, Massachusetts Institute of Tech-nology: web.mit.edu/2.151/www/Handouts/CayleyHamilton.pdf.

196. Royden, H.L. (1968) Real Analysis. New York: Macmillan.197. Rubinstein, M. (1983) “Displaced diffusion option pricing,” J. Finance 38,

213–217.198. Rubinstein, M. (1984) “A simple formula for the expected rate of return of

an option over a finite holding period,” J. Finance 39, 1503–1509.199. Rubinstein, M. (1994) “Implied binomial trees,” J. Finance 49, 771–818.200. Rudin, W. (1976) Principles of Mathematical Analysis. New York:

McGraw-Hill.201. Schmalensee, R. and Trippi, R.R. (1978) “Common stock volatility expec-

tations implied by option premia,” J. Finance 33, 129–147.202. Schott, J.R. (1997) Matrix Analysis for Statistics. New York: John Wiley.203. Schroder, M. (1989) “Computing the constant elasticity of variance option

pricing formula,” J. Finance 44, 211–219.204. Scott, L. (1987) “Option pricing when the variance changes randomly: The-

ory, estimation, and an application,” J. Financial and Quantitative Analysis22, 419–438.

205. Scott, L. (1997) “Pricing stock options in a jump-diffusion model withstochastic volatility and interest rates: Applictions of Fourier inversionmethods,” Mathematical Finance 7, 413–426.

206. Sharpe, W.F. (1964) “Capital asset prices: A theory of market equilibriumunder conditions of risk,” J. Finance 19, 425–442.

Page 632: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in Bibliography FA

Bibliography 615

207. Shaw, W.T. (1998) Modelling Financial Derivatives with Mathematica.Cambridge (UK): Cambridge Press.

208. Shimko, D. (1992) Finance in Continuous Time. Miami: Kolb Publishing.209. Shreve, S.E. (2004) Stochastic Calculus for Finance II. New York: Springer

Science+Business Media.210. Smith, W.L. and Wilkinson, W. (1969) “On branching processes in random

environments”. Annals of Mathematical Statistics 40, 814–827.211. Sprott, J.C. (1991) Numerical Recipes Routines and Examples in Basic.

Cambridge (UK): Cambridge Press.212. Stein, E.M. and Stein, J.C. (1991) “Stock price distributions with stochastic

volatility: An analytic approach,” Review of Financial Studies 4, 727–752.213. Stern, M. (1993) Estimating and Testing the Poisson Jump-Diffusion Model

for Options on Stocks. Ph.D. dissertation, University of Virginia.214. Stoll, H. (1969) “The relationship between put and call option prices,”

J. Finance 24, 801–824.215. Stultz, R. (1982) “Options on the minimum or maximum of two risky assets:

Analysis and applications,” J. Financial Economics 9, 383–407.216. Stutzer, M. (1996) “A simple nonparametric approach to derivative security

valuation,” J. Finance 51, 1633–1652.217. Taylor, S.J. (1966) Introduction to Measure and Integration. Cambridge

(UK): Cambridge Press.218. Taylor, S. and Xu, X. (1994) “The term structure of volatility implied

by foreign exchange options,” J. Financial and Quantitative Analysis 29,57–74.

219. Vasicek, O. (1977) “An equilibrium characterization of the term structure,”J. Financial Economics 5, 177–188.

220. Waller, L.; Turnbull, B.; and J.M. Hardin (1995) “Obtaining distributionfunctions by numerical inversion of characteristic functions with applica-tions,” The American Statistician 49, 346–350.

221. Watson, G.N. (1944) Theory of Bessel Functions. Cambridge (UK):Cambridge Press.

222. Wiggins, J.B. (1987) “Option values under stochastic volatility,” J. Finan-cial Economics 19, 351–372.

223. Williams, D. (1991) Probability with Martingales. Cambridge (UK):Cambridge Press.

224. Williams, J.B. (1938) The Theory of Investment Value. Cambridge (US):Harvard U. Press.

225. Williams, T. (2001) Option Pricing and Branching Processes. Ph.D. disser-tation, University of Virginia.

226. Wilmott, P.; Dewynne, J.; and S. Howison (1993) Option Pricing:Mathematical Models and Computation. Oxford: Oxford Press.

227. Wilmott, P.; Howison, S.; and J. Dewynne(1995) The Mathematicsof Financial Derivatives: A Student Introduction. Cambridge (UK):Cambridge Press.

Page 633: Pricing derivative securities

April 11, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in Bibliography FA

616 Pricing Derivative Securities (2nd Ed.)

228. Young, M.S. and Graff, R.A. (1995) “Real estate is not normal: A fresh lookat real estate return distributions,” J. Real Estate Finance and Economics10, 225–259.

229. Zwillinger, D. (1989) Handbook of Differential Equations. San Diego:Academic Press.

Page 634: Pricing derivative securities

April 23, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in index FA

SUBJECT INDEX

Absolute continuity

of functions, 29

of measures, 41

Almost everywhere, 27

American calls, see also Options,American

pricing by put-call symmetry,307

American puts, see also Options,American

Brennan-Schwartz algorithm,299

Broadie-Detempleapproximation, 237

Geske-Johnson approximation,300–302

MacMillan approximation,302–305

multipiece exponentialapproximation, 305–307

American-style derivatives, seeOptions, American

Antithetic variates, see underSimulation

Arbitrage opportunity, 14

Asian options, 356

Assumption CCK, 142

Assumption RB, 147

Assumption RK, 149

Assumptions PM1-PM4, 141

B.p.r.e. model, 438–445Barrier options, 348Basket options, 547

Bermudan options, 300, 561, 570Bernoulli distribution, 176Bernoulli dynamics

binomial pricing, 179–246limiting distribution of, 221–227martingale pricing under,

188–191martingale probabilities, 186,

211

for futures prices, 201–202replication under, 179–185structure of, 176–178

Bessel function, 44–45, 376, 430–433,469

Beta coefficient, 286

BGW process, 439Binomial coefficients, 24Binomial pricing

as risk-neutral pricing, 186as solving a p.d.e., 186efficient computation, 230–239

smoothing techniques, 235–239Binomial series, 24Binomial tree

nonrecombining, 217recombining, 176

Bivariate normal distribution, seeunder Normal distribution

617

Page 635: Pricing derivative securities

April 23, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in index FA

618 Pricing Derivative Securities (2nd Ed.)

Black’s caplet/floorlet formulas,499–504

reconciliation with HJM, 500

Black-Scholes dynamics

allowing for continuousdividends, 249

allowing for lump-sumdividends, 270

as approximate model for stockindexes, 271–272

futures prices under, 248,259–260

lognormality of price under, 248

martingale pricing under,254–261, 266–268

p.d.e. approach under, 251–253,262–266

Feynman-Kac solution,268–270

pricing futures options under,272–273

structure of, 248–250

Black-Scholes formulas, 224, 266

as convex functions of strike,278, 279

as recipes for replicatingportfolios, 281

extreme values and comparativestatics, 275–279

put-call parity, 274

put-call symmetry, 274

Bonds

coupon vs. discount, 143

forward prices, 147–148, 458

options on

under HJM model, 484–488

under LIBOR market model,505

spot prices, 144–146

bounds on, 147

term structure, 144

“unit”, 144

yields, 145, 458

Bonferroni’s inequality, 557

Borel sets, 25

Borel-Cantelli lemma, 48, 76

Bounded-rate assumption, 147

Box-Muller transformation, 536, 544

Branching processes

defined, 91

in financial modeling, 438–445

in random environments, 439

Brownian motion, 93–97

as Levy process, 132

defined, 92

extrema of Brownian paths,339–343

geometric

convergence of Bernoulliprocess to, 221–227

deficiencies of the model,367–370

defined, 105

differential form, 109

in financial modeling, 248–250

integral of, 98

integration with respect to,99–104

martingale property, 95

no density for, 96

nondifferentiability of samplepaths, 95

quadratic variation of, 107

unboundedness of variation of,96

Call options, see under Options

Capital asset pricing model (CAPM),285

Caplets, 499

Capped call options, 310

Caps, see Interest rate caps

Captions

defined, 504

pricing under LIBOR marketmodel, 507

Cayley-Hamilton theorem, 452

Central limit theorem, 75

Change of numeraire, 123–125

Page 636: Pricing derivative securities

April 23, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in index FA

Subject Index 619

Change-of-variable formula, 39–41,536

Chapman-Kolmogorov equations, 447Characteristic function, 61–66

continuity theorem, 74convolution property, 63

inversion theorem, 65

numerical methods for inversion,395–397

uniqueness theorem, 63

Cholesky decomposition of matrix,548

Chooser options, 329

CIR process, 468, 524Complete markets

and replication, 19–20and uniqueness of martingale

measure, 193–194, 383defined, 19

in Bernoulli framework, 194–196in Black-Scholes framework, 258

Compound options, 312Compound-Poisson process

as J process, 126

as Levy process, 132Conditional expectation, 70

in new measures, 72process, 88

tower property, 71, 87Conditional probability, 68

Constant-elasticity-of-variance(c.e.v.) process

defined, 106option prices under, 375

Contingent claims, 4

Control variatesdefined, 233

in binomial pricing, 233–234in simulation, 544–546, 549, 552,

553, 570, 574Convenience yield, 157

Cost of carry, 142Cox processes

in financial modeling, 521

simulation, 520

theory, 519

Cox-Ingersoll-Ross (CIR) model, 461,468–472

Credit risk

Black-Scholes-Merton (BSM)model, 514–518

exogenous risk models, 518–524

Cumulant generating function, 60,410

Cumulants, 61

Cumulative distribution function, 51

Default risk, see Credit risk

Delta hedging, 281–282, 553

Delta of option, see “Greeks” ofoptions

Deposit insurance, 321

Derivatives

defined, 3

exotic varieties, 311–366

interest-sensitive, 13

overview of types, 4–13

overview of valuation methods,16–19

pricing in incomplete markets,20, 411–413

vs. primary assets, 15

Diffusion processes, 92, 105

Digital options, 327

Directing process, 136

Discontinuous processes

applications, 401–446

theory, 125

Displaced diffusion model, 374–375

Diversifiable risk, 403, 413

Dividend-protected options, 209, 292

Dividends

continuously paid, 152, 162as approximation for

dividends on stock indexes,272

under Bernoulli dynamics,210–212, 226

under Black-Scholesdynamics, 249

Page 637: Pricing derivative securities

April 23, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in index FA

620 Pricing Derivative Securities (2nd Ed.)

cum-dividend price of asset, 249

effect on American calls, 167,209

effect on American put-callparity, 163

effect on American puts, 168,209

effect on European put-callparity, 162

effect on option-pricing bounds,165

lump-sum, 162

set-aside convention for, 218,270

under Bernoulli dynamics, 216

under Black-Scholesdynamics, 270

modeling of, 209

on common stocks, 210

proportional at discreteintervals, 214–216

Doleans exponential, 115, 119, 466,482

Dominated convergence theorem, 36

Down-and-out options, see Options,barrier

Equivalent measures, 17

defined, 41

in Bernoulli framework, 187

in Black-Scholes framework, 255

Error function, 43

Euler’s

constant, 45

formula, 24

Exercise boundary, see Options,American

Exotic derivatives, 311–366

Extendable options, 321

Extended Ito formula, 130, 131

Fast Fourier transform, 361, 397, 431

Feyman-Kac solution, 268–270

Filtered probability space, 86

Filtration, 84–87

Finite sums, 25

Finite-difference methods, 299, 379,380, 382, 391

Fixed-strike options, see Options,lookbacks

Floorlets, 499

Floors, see Interest rate floorsForward contracts

compared to futures, 6, 153,488–490

dynamics-free pricing, 151–153

general description, 4–5

static replication, 151under Black-Scholes dynamics,

261

Forward measure, 191, 256, 483, 484defined, 459

in derivation of Black’s formulas,500

vs. spot measure, 458

Forward prices, 5, 151–153

of bonds, see under Bondsrelation to futures prices, 153,

199, 488–490

Forward rates, see under Interestrates

Forward-start options, 330Fourier transform

applications, 361, 397, 431, 435

convolution property, 46, 64, 263

defined, 46generalization, 262

inversion formula, 46

solving p.d.e.s with, 263–265

Fourier-Stieltjes transform, 46, 61

Fubini’s theorem, 38Fundamental theorem of asset

pricing, 191Futures contracts

compared to forwards, 6, 153,488–490

general description, 6–8

initial margin, 7

markets and trading, 6

marking to market, 7–8, 153, 488

Page 638: Pricing derivative securities

April 23, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in index FA

Subject Index 621

under Bernoulli dynamics,199–200

Futures options

American style

early exercise, 206, 273

pricing under Bernoullidynamics, 206

pricing under Black-Scholesdynamics, 273, 292, 296

European style

pricing under Bernoullidynamics, 200–202

pricing under Black-Scholesdynamics, 272–273

mechanics, 9

Futures prices

as martingales under risk-neutralmeasure, 199, 206, 260

defined, 6

dynamics-free pricing, 153–155

not prices of traded assets, 8, 199

relation to forward prices, 153,199, 488–490

settlement prices, 156, 227

under Bernoulli dynamics,199–200, 225

under stochastic interest rates,488

variation with term, 156–158

Gamma

distribution, 82, 362, 376, 427,428, 534

function, 42–43

incomplete, 43

process

as Levy process, 134

as subordinator, 136

defined, 134

in financial modeling, 427

Gamma hedging, 553

Gamma of option, see “Greeks” ofoptions

Gaussian quadrature, 326

Generated σ-field, 25, 51, 85

Geometric Brownian motion, seeunder Brownian motion

Girsanov’s theorem, 118–121, 255

formal statement, 120proof, 120–121

Glivenko-Cantelli theorem, 538

“Greeks” of optionsdelta, 277, 278

gamma, 277, 278

rho, 278, 279theta, 278, 279

“vega”, 278, 279

“Greeks” of options

gamma, 553

Heath-Jarrow-Morton (HJM) model,472–498

applications, 480theory, 474–480

Hierarchical models, 81, 470

High-contact condition, 298Ho-Lee model, 480, 486

Hull-White model, 461

Hyperbolic model, 432–436

Implicit volatility, 229, 279, 369, 388,391, 415

term structure of, 370, 379variation in, 369

Implied binomial trees, 239–246

Incomplete gamma function, 43

Incomplete markets, 20, 383–388,412–413

Infinitely divisible distributions,82–83

“Inside” options, 315Integral

Ito, 97–109

Lebesgue, 36–37Lebesgue-Stieltjes, 36

on abstract measure space,33–36

Riemann, 30–31

Riemann-Stieltjes, 31–33

Interest rate caps

Page 639: Pricing derivative securities

April 23, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in index FA

622 Pricing Derivative Securities (2nd Ed.)

defined, 493

pricing under HJM model,494–496

Interest rate floorsdefined, 493

pricing under HJM model,494–496

Interest rates

bounds on, 147Cox-Ingersoll-Ross (CIR) model,

461, 468–472forward rates, 148, 457

Heath-Jarrow-Morton (HJM)model, 472–498

Ho-Lee model, 480

Hull-White model, 461short rate, 146, 148, 179

spot rate models, 462–472

spot rates, 145–146

Vasicek model, 461, 463–468Interest-rate swaps

defined, 496

pricing, 496–498Ito integral

as local martingale, 103

definition and construction, 99moments of, 104

multidimensional, 104

properties, 102

Ito processas semimartingale, 105

combined with J process, 131

defined, 104integration with respect to, 108

quadratic variation of, 107

stochastic differential of, 109Ito’s formula, 110

for functions of an Ito process,110

for functions of Ito & Jprocesses, 130

for functions of time and an Itoprocess, 113

for mixed Ito and J processes,131

for vector-valued processes, 115

proof, 111Ito’s lemma, see Ito’s formula

J processcombined with Ito process,

131–132

compound Poisson process as,126

defined, 125

differential of function of,129–130

integration with respect to,127–129

Poisson process as, 126

Jensen’s inequality, 59

Jump-diffusion processapplications, 409–426

theory, 131–132

Known-cost-of-carry assumption, 142

Kolmogorov’s forward equations, 448

Kurtosis

coefficient of, 59, 61, 411, 430,441

in assets’ returns, 368, 418

of normal distribution, 77

Ladder options, 353

Laplace transform, 45

Laplace-Stieltjes transform, 45, 60

Lattice distribution, 52, 58, 62

Law of one price, 14Laws of large numbers, 77

Least Squares Monte Carlo (LSM)method, 561

Leptokurtic distributions, 59

Leverage effect, 368

Levy measure, 134

Levy process

Brownian motion as, 133

closure under subordination, 136compound-Poisson process as,

133

decomposition of, 134

Page 640: Pricing derivative securities

April 23, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in index FA

Subject Index 623

gamma process as, 134

in financial modeling, 428, 432Poisson process as, 133

theory, 132

Levy-Khintchine theorem, 83LIBOR, 493

modeling, 508–512

rate conventions, 493LIBOR market model, 498–512

applications, 504–512

derivation of Black’s formulas,500–504

Local martingale, see underMartingale

Lognormal distribution, 81

Marking to market, see underFutures contracts

Markov

chain, 447inequality, 60

process, 91

time, see Stopping timeMartingale, 87–90

convergence theorem, 90

defined, 87fair-game property, 88

Ito integral as, 103

local, 89

representation theorem, 122role in martingale pricing, 257

Martingale measure, 18, 158

and Girsanov’s theorem, 118–121existence of in arbitrage-free

markets, 191–193in Bernoulli framework, 186–188

nonuniqueness of, 20, 386,411–413

uniqueness of, 19, 255, 383

in Bernoulli framework,193–196

in Black-Scholes framework,258

Martingale pricing

forwards and futures, 158–159

general description, 17–19in Bernoulli framework, 186–196

in Black-Scholes framework, 254

Mathematical expectationconditional on σ-fields, 70

defined, 55

mean drift coefficient, 105Mean-reverting process, see

Ornstein-Uhlenbeck processMeasurable function, 27

Measurable sets, 25Measure

change of, 41–42, 66–68

counting, 27, 84defined, 25

finite, 27

induced, 49Lebesgue, 26

Levy, 134

martingale, see Martingalemeasure

monotone property of, 27probability, 47

risk-neutral, 159, 256σ-finite, 27

space, 26

Mixtures of distributions, 81, 368,410, 427, 428, 470

Moment generating function, 60Moments, see under Random

variablesMoney fund

as numeraire in martingalepricing, 158, 187, 192, 254

defined, 147options on, 483

under HJM model, 481

Moneyness of options, 280Monotone sequence of sets, 26

Monte Carlo methods, see Simulation

Negative-binomial process, 441

No-crossing lemma, 371Noncentral chi-squared distribution,

377, 469

Page 641: Pricing derivative securities

April 23, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in index FA

624 Pricing Derivative Securities (2nd Ed.)

Normal deviates

simulating, 536–537

Normal distribution, 77–81

bivariate, 79–81, 315multivariate, 78–79

tests of fit for, 537

Notation

key to, 21

Notional principal, 496

Novikov condition, 115, 255

Numeraire, 19

defined, 123

Numerical solution of p.d.e.s, 577

applications, 299, 379, 380, 382,391

approximating derivatives,579–580

constructing a grid, 580–581

Crank-Nicolson method, 589–590

explicit method, 582–585

general applications, 591

implicit method, 585–588

setting boundary conditions,581–582

Occupation times, 449

Options

American

calls on no-dividend stocks,167, 205, 292

calls on stocks payinglump-sum dividends,293–296

early exercise, 167

early-exercise premium, 299

exercise boundary, 207,297–298, 555

indefinitely-lived, 308–311

on assets paying continuousdividends, 296–308

pricing by simulation, 554–576

pricing under Bernoullidynamics, 202–209

pricing under Black-Scholesdynamics, 292–311

put-call parity, 163–165

values in terms of pathpayoffs, 298

American vs. European, 9Asian, 356–366

barrier, 311, 348–353

basket, 547–550Bermudan, 561, 570

call options

general description, 9notation, 11

choosers, 329

compound, 312–315

digital, 327dynamics-free bounds, 165–169

dynamics-free comparativestatics, 169

European

pricing under Bernoullidynamics, 196–199

pricing under Black-Scholesdynamics, 262–270

put-call parity, 162–163, 198,274

exchange-traded, 10exercise price, 9

exotic, 311–366, 547–550

overview, 12expiration, 9

extendable, 237–239, 321–327

flex, 9

forward-start, 330general description, 8

“Greeks”

under Black-Scholesdynamics, 277

holding-period returns, 287–291homogeneity of prices in

underlying and strike, 172in vs. out of money, 11

“inside”, 315

instantaneous returns on,284–287

ladder, 353–356

LEAPs, 10

Page 642: Pricing derivative securities

April 23, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in index FA

Subject Index 625

lookbacks, 344–348, 553–554

markets and trading, 9moneyness of, 281, 369

on coupon bonds

under HJM model, 487–488

under LIBOR market model,506

on discount bonds

under HJM model, 484–486

under LIBOR market model,505

on futures, see Futures options

on interest rate caps, 504

on interest-rate swaps, 498

on max or min, 331–335, 573on money-market fund, 483

path-dependent, 339–366

payoff distributions, 160

payoff functions, 11

put options

general description, 8notation, 11

short positions, 10

strike price, 8

symmetry relation for puts andcalls, 212–214, 274–275

threshold, 328

under b.p.r.e. dynamics, 444–445

under c.e.v. dynamics, 375–379

under hyperbolic dynamics,435–436

under jump-diffusion dynamics,413–415

under s.v.-jump dynamics,420–422

under stochastic interest rates,490–493

under stochastic-volatilitydynamics, 551, 388–554

under variance-gammadynamics, 431–432

volatility of, 284, 287with random time to expiration,

408–409

writers of options, 9

Order notation, 22–23Ornstein-Uhlenbeck process, 106

Orthogonal polynomials, 565, 574

Partition, 28Pension fund guarantees, 403–408Perfect-market assumptions, 141

Poisson processas Levy process, 132

as subordinator, 136compound, 126

integrals with respect to, 129defined, 91

doubly stochastic, 519in financial modeling, 403–410,

419, 441, 515

interarrival times of, 402properties, 126

Portfolio insurance, 283Primary assets, 3

Probabilityaxioms, 48

conditional, 68density function, 52

generating function, 61integral transform, 532–534

mass function, 52measure, 47monotone property, 48

space, 47filtered, 86

Put options, see under OptionsPut warrants, 315

Put-call parityAmerican, 163

European, 161, 198

Quadratic-covariation process, 115

Quadratic-variation process, 107–108Quantos, 336–339

Radon-Nikodym derivative

applications, 534, 540defined, 42

for probability measures, 66–68

Page 643: Pricing derivative securities

April 23, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in index FA

626 Pricing Derivative Securities (2nd Ed.)

Radon-Nikodym theorem, 41–42, 119,187

Random variables

absolutely continuous, 52cumulants, 60

defined, 49

discrete, 52

distributions of, 51–54

independent, 55, 63measurability, 49

moments, 58–59

of mixed type, 53

supports of, 51uniformly integrable, 57

vector-valued, 54

Reflection principle, 339

Regime-switching models, 446Rejection method, see under

SimulationReplication

and complete markets, 19–20

in Bernoulli framework, 179–185

in Black-Scholes framework,251–253

role in arbitrage-free pricing, 16

role in martingale pricing, 188,190

static vs. dynamic, 15, 174under Black-Scholes dynamics,

257

under discontinuous dynamics,411–413

under stochastic-volatilitydynamics, 383–388

Return

continuously-compounded rateof, 145

total, 145Richardson extrapolation

defined, 234

in binomial pricing, 234–235

in pricing American puts, 237,300

in simulation, 546

Risk-neutral measure, 18, 159, 256

Risk-neutral pricing, see Martingalepricing

S.v.-jump models, 419

Schwarz inequality, 60

Self-financing portfolios

defined, 16

in arbitrage-free pricing, 16

in Bernoulli framework, 180

in Black-Scholes framework, 257

in continuous time, 117–118

in discrete time, 178

in martingale pricing, 190, 251

Semimartingale, 90Short sale, 10, 141

Sigma hedging, 553

σ-field, 25

trivial, 70

σ-finite measure, 27

Simple function, 33

Simulation

antithetic variates, 541–544, 549,552, 570, 574

applications, 324, 360, 379, 390,391, 504, 547–576

control variates, 544–546, 549,552, 553, 570, 574

for American-style derivatives,554–576

generating random deviates,529–537

importance sampling, 540–541

Least squares Monte Carlo(LSM) method, 561

martingale variance reduction,553

methods, 527–547

multiplicative congruentialgenerators, 530

probability integral transform,532–534

random tree method, 568

rejection method, 534–536

Richardson extrapolation, 546

stratified sampling, 537–540

Page 644: Pricing derivative securities

April 23, 2007 B497 Pricing Derivative Securities (2nd Ed.) 9in x 6in index FA

Subject Index 627

variance-reduction techniques,537–547

Skewness

coefficient of, 59, 61, 411, 430,433, 441

in assets’ returns, 369

of implicit-volatility curve, 369,377, 391, 417

Spot measure, 191, 256

vs. forward measure, 458

Spot price, 3

of asset or commodity, 150

of bond, see under Bond prices

Spot rate models, 462–472

Square-root process, 106, 376

Stable distributions, 446

Stochastic convergence

in distribution, 73–76

to a constant, 76–77

Stochastic differential, 109

Stochastic differential equation, 109

stochastic exponential, see Doleansexponential

Stochastic processes, 83

Stochastic-volatility models, 382

Stock dividends, 209

Stopping time, 89, 208

Straddle, 329

Submartingale, 89

Subordinated processes

in financial modeling, 427, 432,443

theory, 136

Subordinator, see Directing process

Supermartingale, 89

Swaps, see Interest rate swaps

Swaptions, 496, 498, 507

Synthetic options, 282

Taylor series, 23

Tenor structure, 503

Threshold options, 328

Treasury securities, 143, 144, 229

Treasury strips, 144

Uniform integrability, 38, 57Up-and-in options, see Options,

barrier

“Vanilla” options, 8Variable-strike options, see Options,

lookbacksVariance-gamma (v.g.) model,

426–432Variation

of Brownian motion, 96of functions, 28

Vasicek model, 461, 463–468Volatility

dependence on price level, 368persistence of, 228, 368

Volatility parameter, 105dependence on past prices,

380–382estimation, 227–229implicit, 229, 279–281in Bernoulli framework, 224price/time dependent, 240,

370–382Volatility smile, 281, 369, 415

Wiener process, see Brownian motion

Yield curvedefined, 145effect of shifts on option prices,

373interpolating discount factors

from, 229twists of, 478under Cox, Ingersoll, Ross

model, 472under Vasicek model, 467