Lecture 2: Numerical Solution of HJB Equations
Diffusion Processes, Kolmogorov Forward Equation
Benjamin Moll, Princeton
EABCN Training School, June 4-6, 2018
Outline
1. Numerical solution of HJB equations
2. Diffusion processes
3. Stochastic HJB equations
4. Kolmogorov forward equations
Numerical Solution of HJB Equations
Codes: http://www.princeton.edu/~moll/HACTproject.htm
One-Slide Summary of Numerical Method
• Consider general HJB equation:

ρv(x) = max_α { r(x, α) + v′(x) · f(x, α) }

• Will discretize and solve using finite difference method
• Discretization ⇒ system of non-linear equations

ρv = r(v) + A(v)v

where A is a sparse (tri-diagonal) transition matrix
[Figure: spy(A) in Matlab showing the sparse tri-diagonal structure of A, nz = 136]
Barles-Souganidis

• There is a well-developed theory for numerical solution of HJB equations using finite difference methods
• Key paper: Barles and Souganidis (1991), “Convergence of approximation schemes for fully nonlinear second order equations”
https://www.dropbox.com/s/vhw5qqrczw3dvw3/barles-souganidis.pdf?dl=0
• Result: finite difference scheme “converges” to unique viscosity solution under three conditions
1. monotonicity
2. consistency
3. stability
• Good reference: Tourin (2013), “An Introduction to Finite Difference Methods for PDEs in Finance.”
Problem we will work with: neoclassical growth model

• Explain using neoclassical growth model, easily generalized to other applications

ρv(k) = max_c { u(c) + v′(k)(F(k) − δk − c) }

• Functional forms

u(c) = c^{1−σ}/(1 − σ),   F(k) = k^α

• Use finite difference method
• Two MATLAB codes
http://www.princeton.edu/~moll/HACTproject/HJB_NGM.m
http://www.princeton.edu/~moll/HACTproject/HJB_NGM_implicit.m
Finite Difference Approximations to v′(k_i)

• Approximate v(k) at I discrete points in the state space, k_i, i = 1, ..., I. Denote distance between grid points by ∆k.
• Shorthand notation: v_i = v(k_i)
• Need to approximate v′(k_i).
• Three different possibilities (sketched in Matlab below):

v′(k_i) ≈ (v_i − v_{i−1})/∆k = v′_{i,B}   backward difference
v′(k_i) ≈ (v_{i+1} − v_i)/∆k = v′_{i,F}   forward difference
v′(k_i) ≈ (v_{i+1} − v_{i−1})/(2∆k) = v′_{i,C}   central difference
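As a concrete illustration (not part of the lecture codes; the grid and the placeholder v(k) = log k are made up for this sketch), the three approximations in Matlab:

I = 100; k = linspace(0.1, 10, I)'; dk = k(2) - k(1);   % uniform grid k_i
v = log(k);                                             % placeholder value function
dvF = [(v(2:I) - v(1:I-1))/dk; NaN];            % forward difference v'_{i,F} (undefined at i = I)
dvB = [NaN; (v(2:I) - v(1:I-1))/dk];            % backward difference v'_{i,B} (undefined at i = 1)
dvC = [NaN; (v(3:I) - v(1:I-2))/(2*dk); NaN];   % central difference v'_{i,C}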
Finite Difference Approximations to v′(k_i)

[Figure: backward, forward, and central difference approximations to v′(k_i), illustrated on a concave v(k)]
Finite Difference Approximation

FD approximation to HJB is

ρv_i = u(c_i) + v′_i × (F(k_i) − δk_i − c_i)   (∗)

where c_i = (u′)^{−1}(v′_i), and v′_i is one of the backward, forward, central FD approximations.

Two complications:
1. which FD approximation to use? “Upwind scheme”
2. (∗) is extremely non-linear, need to solve iteratively: “explicit” vs. “implicit” method

My strategy for next few slides:
• what works
• supplementary slides: why it works (Barles-Souganidis)
Which FD Approximation?
• Which of these you use is extremely important
• Best solution: use so-called “upwind scheme.” Rough idea:
  • forward difference whenever drift of state variable positive
  • backward difference whenever drift of state variable negative
• In our example: define

s_{i,F} = F(k_i) − δk_i − (u′)^{−1}(v′_{i,F}),   s_{i,B} = F(k_i) − δk_i − (u′)^{−1}(v′_{i,B})

• Approximate derivative as follows (see the Matlab sketch after this slide)

v′_i = v′_{i,F} 1{s_{i,F} > 0} + v′_{i,B} 1{s_{i,B} < 0} + v̄′_i 1{s_{i,F} < 0 < s_{i,B}}

where 1{·} is the indicator function, and v̄′_i = u′(F(k_i) − δk_i).
• Where does the v̄′_i term come from? Answer:
  • since v is concave, v′_{i,F} < v′_{i,B} (see figure) ⇒ s_{i,F} < s_{i,B}
  • if s_{i,F} < 0 < s_{i,B}, set s_i = 0 ⇒ v′(k_i) = u′(F(k_i) − δk_i), i.e. we're at a steady state.
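A minimal Matlab sketch of this upwind choice, assuming CRRA utility so that (u′)^{−1}(x) = x^{−1/σ}; parameter values and the initial guess for v are illustrative, and the boundary treatment follows the state-constraint conditions used in the HACT codes (an assumption of this sketch):

sigma = 2; alpha = 0.3; delta = 0.05; rho = 0.05;       % illustrative parameters
I = 100; k = linspace(0.1, 10, I)'; dk = k(2) - k(1);   % grid as above
yk = k.^alpha - delta*k;                                % F(k_i) - delta*k_i
v = (k.^alpha).^(1-sigma)/((1-sigma)*rho);              % initial guess v0 = u(F(k))/rho
dvF = [(v(2:I) - v(1:I-1))/dk; yk(I)^(-sigma)];         % forward diff; state-constraint BC at k_I
dvB = [yk(1)^(-sigma); (v(2:I) - v(1:I-1))/dk];         % backward diff; state-constraint BC at k_1
sF = yk - dvF.^(-1/sigma);                  % drift s_{i,F} if forward difference used
sB = yk - dvB.^(-1/sigma);                  % drift s_{i,B} if backward difference used
I0 = ~(sF > 0) & ~(sB < 0);                 % neither: steady-state case, s_i = 0
dv = dvF.*(sF > 0) + dvB.*(sB < 0) + (yk.^(-sigma)).*I0;   % upwind v'_i
c = dv.^(-1/sigma);                         % consumption from FOC u'(c) = v'(k)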
Sparsity

• Recall discretized HJB equation

ρv_i = u(c_i) + v′_i × (F(k_i) − δk_i − c_i),   i = 1, ..., I

• This can be written as

ρv_i = u(c_i) + (v_{i+1} − v_i)/∆k · s^+_{i,F} + (v_i − v_{i−1})/∆k · s^−_{i,B},   i = 1, ..., I

Notation: for any x, x^+ = max{x, 0} and x^− = min{x, 0}

• Can write this in matrix notation

ρv_i = u(c_i) + [ −s^−_{i,B}/∆k,   s^−_{i,B}/∆k − s^+_{i,F}/∆k,   s^+_{i,F}/∆k ] (v_{i−1}, v_i, v_{i+1})′

and hence

ρv = u + Av

where A is I × I (I = number of grid points) and looks like...
Visualization of A (output of spy(A) in Matlab)

[Figure: spy(A) showing the tri-diagonal sparsity pattern of A, nz = 136]
The matrix A

• FD method approximates process for k with discrete Poisson process, A summarizes Poisson intensities
• entries in row i (multiplying v_{i−1}, v_i, v_{i+1} respectively):

−s^−_{i,B}/∆k   (inflow_{i−1} ≥ 0),   s^−_{i,B}/∆k − s^+_{i,F}/∆k   (outflow_i ≤ 0),   s^+_{i,F}/∆k   (inflow_{i+1} ≥ 0)

• negative diagonals, positive off-diagonals, rows sum to zero
• tridiagonal matrix, very sparse
• A (and u) depend on v (nonlinear problem)

ρv = u(v) + A(v)v

• Next: iterative method...
Iterative Method

• Idea: Solve FOC for given v^n, update v^{n+1} according to

(v^{n+1}_i − v^n_i)/∆ + ρv^n_i = u(c^n_i) + (v^n)′(k_i)(F(k_i) − δk_i − c^n_i)   (∗)

• Algorithm: Guess v^0_i, i = 1, ..., I and for n = 0, 1, 2, ... follow
1. Compute (v^n)′(k_i) using FD approx. on previous slide.
2. Compute c^n from c^n_i = (u′)^{−1}[(v^n)′(k_i)]
3. Find v^{n+1} from (∗).
4. If v^{n+1} is close enough to v^n: stop. Otherwise, go to step 1.
• See http://www.princeton.edu/~moll/HACTproject/HJB_NGM.m (a sketch of one explicit update step follows below)
• Important parameter: ∆ = step size, cannot be too large (“CFL condition”)
• Pretty inefficient: I need 5,990 iterations (though quite fast)
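Continuing the upwind sketch from above (same variable names, all illustrative), one explicit update step (∗) looks like:

Delta = 0.1*dk;                         % step size; explicit method needs Delta small (CFL)
u = (c.^(1-sigma))/(1-sigma);           % u(c^n_i) for CRRA utility
s = yk - c;                             % upwind drift: equals sF, sB, or 0 by construction
vnew = v + Delta*(u + dv.*s - rho*v);   % v^{n+1} = v^n + Delta*[u + (v^n)' s - rho v^n]

Iterating vnew → v until convergence reproduces the algorithm above; the small admissible ∆ is exactly why the explicit method needs thousands of iterations.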
Efficiency: Implicit Method

• Efficiency can be improved by using an “implicit method”

(v^{n+1}_i − v^n_i)/∆ + ρv^{n+1}_i = u(c^n_i) + (v^{n+1})′(k_i)[F(k_i) − δk_i − c^n_i]

• Each step n involves solving a linear system of the form

(1/∆)(v^{n+1} − v^n) + ρv^{n+1} = u + A^n v^{n+1}
((ρ + 1/∆)I − A^n) v^{n+1} = u + (1/∆)v^n

• but A^n is super sparse ⇒ super fast (see the sketch below)
• See http://www.princeton.edu/~moll/HACTproject/HJB_NGM_implicit.m
• In general: implicit method preferable over explicit method
1. stable regardless of step size ∆
2. needs far fewer iterations
3. can handle many more grid points
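A sketch of one implicit step, again with the variable names from the upwind sketch; A is assembled exactly from the row entries on the “matrix A” slide (the padding of the off-diagonals deals with Matlab's spdiags conventions; compare the version from the actual codes on the next slide):

Delta = 1000;                                % implicit method: large steps are fine
X = -min(sB,0)/dk;                           % sub-diagonal: inflow into i-1
Z =  max(sF,0)/dk;                           % super-diagonal: inflow into i+1
Y = -X - Z;                                  % diagonal: rows of A sum to zero
A = spdiags([[X(2:I); 0], Y, [0; Z(1:I-1)]], -1:1, I, I);
B = (rho + 1/Delta)*speye(I) - A;            % ((rho + 1/Delta) I - A^n)
vnew = B \ (u + v/Delta);                    % sparse solve for v^{n+1}: very fast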
Implicit Method: Practical Consideration

• In Matlab, need to explicitly construct A as sparse to take advantage of speed gains
• Code has part that looks as follows

X = -min(mub,0)/dk;
Y = -max(muf,0)/dk + min(mub,0)/dk;
Z = max(muf,0)/dk;

• Constructing full matrix – slow

for i=2:I-1
    A(i,i-1) = X(i); A(i,i) = Y(i); A(i,i+1) = Z(i);
end
A(1,1)=Y(1); A(1,2) = Z(1);
A(I,I)=Y(I); A(I,I-1) = X(I);

• Constructing sparse matrix – fast

A = spdiags(Y,0,I,I) + spdiags(X(2:I),-1,I,I) + spdiags([0;Z(1:I-1)],1,I,I);
Just so you remember: one-slide summary again
• Consider general HJB equation:

ρv(x) = max_α { r(x, α) + v′(x) · f(x, α) }

• Discretization ⇒ system of non-linear equations

ρv = r(v) + A(v)v

where A is a sparse (tri-diagonal) transition matrix

[Figure: spy(A) showing the tri-diagonal sparsity pattern of A, nz = 136]
Diffusion Processes
Diffusion Processes
• A diffusion is simply a continuous-time Markov process (with continuous sample paths, i.e. no jumps)
• for jumps, use Poisson process: very intuitive, briefly later
• Simplest possible diffusion: standard Brownian motion (sometimes also called “Wiener process”)
• Definition: a standard Brownian motion is a stochastic process W which satisfies

W(t + ∆t) − W(t) = ε_t √∆t,   ε_t ∼ N(0, 1),   W(0) = 0

• Not hard to see: W(t) ∼ N(0, t)
• Continuous time analogue of a discrete time random walk:

W_{t+1} = W_t + ε_t,   ε_t ∼ N(0, 1)
Standard Brownian Motion

• Note: mean zero, E(W(t)) = 0...
• ... but the variance blows up: Var(W(t)) = t
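A quick Matlab check of these two properties (the path count and horizon are arbitrary choices for this sketch):

T = 1; N = 1000; dt = T/N; M = 10000;              % M independent paths on [0, T]
W = [zeros(1,M); cumsum(sqrt(dt)*randn(N,M))];     % W(t+dt) - W(t) = sqrt(dt)*eps_t
[mean(W(end,:)), var(W(end,:))]                    % approximately (0, T)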
Brownian Motion
• Can be generalized

x(t) = x(0) + µt + σW(t)

• Since E(W(t)) = 0 and Var(W(t)) = t:

E[x(t) − x(0)] = µt,   Var[x(t) − x(0)] = σ²t

• This is called a Brownian motion with drift µ and variance σ²
• Often useful to write this in differential form
• recall ∆W(t) := W(t + ∆t) − W(t) = ε_t √∆t, ε_t ∼ N(0, 1)
• use notation dW(t) := ε_t √dt, with ε_t ∼ N(0, 1) and write

dx(t) = µdt + σdW(t)

• This is called a stochastic differential equation
• Analogue of stochastic difference equation:

x_{t+1} = µ + x_t + σε_t,   ε_t ∼ N(0, 1)
Further Generalizations: Diffusion Processes

• Can be generalized further (suppressing dependence of x, W on t)

dx = µ(x)dt + σ(x)dW

where µ and σ are essentially arbitrary (possibly non-linear) functions.
• This is called a “diffusion process”
• µ(·) is called the drift and σ(·) the diffusion.
• all results can be extended to the case where they depend on t, µ(x, t), σ(x, t), but abstract from this for now.
• The amazing thing about diffusion processes: by choosing functions µ and σ, you can get pretty much any stochastic process you want (except jumps); a generic simulation sketch follows below
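This generality is easy to exploit numerically. A hypothetical helper (not part of the lecture codes; saved as simulate_diffusion.m) that simulates any such diffusion by the Euler-Maruyama discretization x(t + dt) ≈ x(t) + µ(x(t))dt + σ(x(t))√dt ε_t; the two examples that follow reuse it:

function x = simulate_diffusion(mu, sig, x0, T, N)
% Simulate dx = mu(x)dt + sig(x)dW on [0, T] with N Euler-Maruyama steps
    dt = T/N; x = zeros(N+1, 1); x(1) = x0;
    for n = 1:N
        x(n+1) = x(n) + mu(x(n))*dt + sig(x(n))*sqrt(dt)*randn;
    end
end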
Example 1: Ornstein-Uhlenbeck Process

• Brownian motion dx = µdt + σdW is not stationary (random walk). But the following process is

dx = θ(x̄ − x)dt + σdW

• Analogue of AR(1) process with autocorrelation e^{−θ} ≈ 1 − θ:

x_{t+1} = θx̄ + (1 − θ)x_t + σε_t

• That is, we just choose

µ(x) = θ(x̄ − x)

and we get a nice stationary process!
• This is called an “Ornstein-Uhlenbeck process”
Ornstein-Uhlenbeck Process

• Can show: stationary distribution is N(x̄, σ²/(2θ))
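Using the simulate_diffusion sketch from the previous slide, the time average of a long sample path should match these stationary moments (parameter values are illustrative):

theta = 1; xbar = 0; sigma = 0.5;
x = simulate_diffusion(@(x) theta*(xbar - x), @(x) sigma, xbar, 1000, 100000);
[mean(x), var(x), sigma^2/(2*theta)]   % sample moments vs. theoretical variance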
Example 2: “Moll Process”

• Design a process that stays in the interval [0, 1] and mean-reverts around 1/2

µ(x) = θ(1/2 − x),   σ(x) = σx(1 − x)

• That is

dx = θ(1/2 − x)dt + σx(1 − x)dW

• Note: diffusion goes to zero at boundaries, σ(0) = σ(1) = 0, & mean-reverts ⇒ always stays in [0, 1]
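Again with the simulate_diffusion sketch (θ and σ illustrative); with a small enough time step the simulated path indeed stays inside [0, 1], up to Euler discretization error:

theta = 1; sigma = 1;
x = simulate_diffusion(@(x) theta*(0.5 - x), @(x) sigma*x*(1 - x), 0.5, 100, 1000000);
[min(x), max(x)]   % should remain (approximately) within [0, 1]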
Other Examples

• Geometric Brownian motion:

dx = µxdt + σxdW

x ∈ [0,∞), no stationary distribution: log x(t) ∼ N((µ − σ²/2)t, σ²t).
• Feller square root process (finance: “Cox-Ingersoll-Ross”)

dx = θ(x̄ − x)dt + σ√x dW

x ∈ [0,∞), stationary distribution is Gamma(γ, 1/β), i.e.

g_∞(x) ∝ e^{−βx} x^{γ−1},   β = 2θ/σ²,   γ = 2θx̄/σ²

• Other processes in Wong (1964), “The Construction of a Class of Stationary Markoff Processes”
Stochastic HJB Equations
Stochastic Optimal Control

• Generic problem:

v(x_0) = max_{α(t), t≥0} E_0 ∫_0^∞ e^{−ρt} r(x(t), α(t)) dt

subject to the law of motion for the state

dx(t) = f(x(t), α(t)) dt + σ(x(t))dW(t)

and α(t) ∈ A, for t ≥ 0, x(0) = x_0 given
• σ could depend on α as well – easy extension
• Deterministic problem: special case σ(x) ≡ 0
• In general x ∈ R^N, α ∈ R^M. For now do scalar case.
Stochastic HJB Equation: Scalar Case

• Claim: the HJB equation is

ρv(x) = max_{α∈A} { r(x, α) + v′(x)f(x, α) + (1/2)v″(x)σ²(x) }

• Here: on purpose no derivation (“cookbook”)
• In case you care, see any textbook, e.g. chapter 2 in Stokey (2008)
Just for Completeness: Multivariate Case

• Let x ∈ R^N, α ∈ R^M
• For fixed x, define the N × N covariance matrix

σ²(x) = σ(x)σ(x)′

(this is a function σ²: R^N → R^N × R^N)
• The HJB equation is

ρv(x) = max_{α∈A} { r(x, α) + Σ_{i=1}^N (∂v(x)/∂x_i) f_i(x, α) + (1/2) Σ_{i=1}^N Σ_{j=1}^N (∂²v(x)/∂x_i∂x_j) σ²_{ij}(x) }

• In vector notation

ρv(x) = max_{α∈A} { r(x, α) + ∇_x v(x) · f(x, α) + (1/2) tr(∆_x v(x) σ²(x)) }

• ∇_x v(x): gradient of v (dimension N × 1)
• ∆_x v(x): Hessian of v (dimension N × N)
HJB Equation: Endogenous and Exogenous State

• Lots of problems have the form x = (x_1, x_2)
  • x_1: endogenous state
  • x_2: exogenous state

dx_1 = f̃(x_1, x_2, α)dt
dx_2 = µ̃(x_2)dt + σ̃(x_2)dW

• Special case with

f(x, α) = (f̃(x_1, x_2, α), µ̃(x_2))′,   σ(x) = (0, σ̃(x_2))′

• Claim: the HJB equation is

ρv(x_1, x_2) = max_{α∈A} { r(x_1, x_2, α) + v_1(x_1, x_2)f̃(x_1, x_2, α) + v_2(x_1, x_2)µ̃(x_2) + (1/2)v_22(x_1, x_2)σ̃²(x_2) }
Example: Real Business Cycle Model

v(k_0, z_0) = max_{c(t), t≥0} E_0 ∫_0^∞ e^{−ρt} u(c(t)) dt

subject to

dk = (zF(k) − δk − c)dt
dz = µ̃(z)dt + σ̃(z)dW

for t ≥ 0, k(0) = k_0, z(0) = z_0 given

Here:
• x_1 = k, x_2 = z, α = c
• r(x, α) = u(α)
• f(x, α) = (x_2 F(x_1) − δx_1 − α, µ̃(x_2))′,   σ(x) = (0, σ̃(x_2))′
Example: Real Business Cycle Model

• HJB equation is

ρv(k, z) = max_c { u(c) + v_k(k, z)[zF(k) − δk − c] + v_z(k, z)µ(z) + (1/2)v_zz(k, z)σ²(z) }
Example: Real Business Cycle Model

• Special Case 1: z is a geometric Brownian motion dz = µzdt + σzdW

ρv(k, z) = max_c { u(c) + v_k(k, z)[zF(k) − δk − c] + v_z(k, z)µz + (1/2)v_zz(k, z)σ²z² }

See Merton (1975) for an analysis of this case
• Special Case 2: z is a Feller square root process dz = θ(z̄ − z)dt + σ√z dW

ρv(k, z) = max_c { u(c) + v_k(k, z)[zF(k) − δk − c] + v_z(k, z)θ(z̄ − z) + (1/2)v_zz(k, z)σ²z }
Aside: Poisson Uncertainty

• Simplest way of modeling uncertainty in continuous time: two-state Poisson process
• z_t ∈ {z_1, z_2} Poisson with intensities λ_1, λ_2
• Result: HJB equation is

ρv_i(k) = max_c { u(c) + v′_i(k)(z_i F(k) − δk − c) + λ_i(v_j(k) − v_i(k)) }

for i = 1, 2, j ≠ i
Special Case: Stochastic AK Model with log Utility

• Preferences: u(c) = log c
• Technology: zF(k) = zk (so maybe “zk model”?)
• Productivity z follows any diffusion

ρv(k, z) = max_c { log c + v_k(k, z)(zk − δk − c) + v_z(k, z)µ(z) + (1/2)v_zz(k, z)σ²(z) }

• Claim: Optimal consumption is c = ρk and hence capital follows

dk = (z − ρ − δ)kdt
dz = µ(z)dt + σ(z)dW

• Solution properties? Simply simulate the two SDEs forward in time (see the sketch below)
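A sketch of that forward simulation, with the optimal rule c = ρk imposed and, as an illustrative choice (the slide allows any diffusion), an Ornstein-Uhlenbeck process for z:

rho = 0.05; delta = 0.05; theta = 0.5; zbar = 0.15; sig = 0.02;    % illustrative
T = 200; N = 20000; dt = T/N;
k = ones(N+1, 1); z = zbar*ones(N+1, 1);
for n = 1:N
    z(n+1) = z(n) + theta*(zbar - z(n))*dt + sig*sqrt(dt)*randn;   % dz = mu(z)dt + sigma(z)dW
    k(n+1) = k(n) + (z(n) - rho - delta)*k(n)*dt;                  % dk = (z - rho - delta)k dt
end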
Special Case: Stochastic AK Model with log Utility

• Proof: Guess and verify

v(k, z) = ν(z) + κ log k

• FOC:

u′(c) = v_k(k, z) ⇔ 1/c = κ/k ⇔ c = k/κ

• Substitute into HJB equation

ρ[ν(z) + κ log k] = log k − log κ + (κ/k)[zk − δk − k/κ] + ν′(z)µ(z) + (1/2)ν″(z)σ²(z)

• Collect terms involving log k ⇒ κ = 1/ρ ⇒ c = ρk □
• Remark: log-utility ⇒ offsetting income and substitution effects of future z ⇒ constant savings rate ρ
General Case: Numerical Solution with FD Method

• Want to solve:

ρv(k, z) = max_c { u(c) + v_k(k, z)[zF(k) − δk − c] + v_z(k, z)µ(z) + (1/2)v_zz(k, z)σ²(z) }
General Case: Numerical Solution with FD Method

It doesn't matter whether you solve ODEs or PDEs ⇒ everything generalizes
http://www.princeton.edu/~moll/HACTproject/HJB_diffusion_implicit_RBC.m
General Case: Numerical Solution with FD Method

• Solve on bounded grids k_i, i = 1, ..., I and z_j, j = 1, ..., J
• Use short-hand notation v_{i,j} = v(k_i, z_j). Approximate (see the sketch below for the z-direction terms)

v_k(k_i, z_j) ≈ (v_{i+1,j} − v_{i,j})/∆k   or   (v_{i,j} − v_{i−1,j})/∆k
v_z(k_i, z_j) ≈ (v_{i,j+1} − v_{i,j})/∆z   or   (v_{i,j} − v_{i,j−1})/∆z
v_zz(k_i, z_j) ≈ (v_{i,j+1} − 2v_{i,j} + v_{i,j−1})/(∆z)²

• Discretized HJB

ρv_{i,j} = u(c_{i,j}) + v_k(k_i, z_j)(z_j F(k_i) − δk_i − c_{i,j}) + v_z(k_i, z_j)µ(z_j) + (1/2)v_zz(k_i, z_j)σ²(z_j)
c_{i,j} = (u′)^{−1}[v_k(k_i, z_j)]
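For the z-dimension, the drift µ(z) gets the same upwind treatment as before, while σ²(z) uses the central second-difference stencil above. A sketch of the resulting tri-diagonal z-block of A for one k-slice (the grid, µ(z), and σ(z) are illustrative; boundary rows still need the reflecting-barrier adjustment from the next slide):

J = 40; z = linspace(0.5, 1.5, J)'; dz = z(2) - z(1);
muz = 0.3*(1 - z); s2z = (0.1*z).^2;          % illustrative mu(z_j), sigma^2(z_j)
lo = -min(muz,0)/dz + s2z/(2*dz^2);           % coefficient on v_{i,j-1}
up =  max(muz,0)/dz + s2z/(2*dz^2);           % coefficient on v_{i,j+1}
mid = -lo - up;                               % diagonal: rows sum to zero
Az = spdiags([[lo(2:J); 0], mid, [0; up(1:J-1)]], -1:1, J, J);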
Numerical Solution: Boundary Conditions?

• Upwind method in k-dimension ⇒ no boundary conditions needed
• Do need boundary conditions in z-dimension

v_z(k, z_1) = 0 all k ⇒ v_{i,0} = v_{i,1}
v_z(k, z_J) = 0 all k ⇒ v_{i,J+1} = v_{i,J}

• These correspond to “reflecting barriers” at lower and upper bounds for productivity, z_1 and z_J (Dixit, 1993)
General Case: Numerical Solution with FD Method

• Stack value function v_{i,j} into vector v of length I × J
• I usually stack it as “endogenous state variable first”

v = (v_{1,1}, v_{2,1}, ..., v_{I,1}, v_{1,2}, ..., v_{I,2}, v_{1,3}, ..., v_{I,J})′

• here: doesn't really matter
• End up with system of I × J non-linear equations

ρv = u(v) + A(v)v

• Solve exactly as before
  • upwind scheme
  • implicit method preferred to explicit method
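With this stacking, entry (i, j) of the value function sits at a simple linear index, e.g.:

idx = @(i, j) i + (j - 1)*I;   % v_stacked(idx(i,j)) = v_{i,j}, "endogenous state first"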
Visualization of A (output of spy(A) in Matlab)

[Figure: spy(A) for the two-state-variable problem, showing a block tri-diagonal pattern with nz = 15749 on a stacked grid of 4000 points]
Saving Policy Function

[Figure: saving policy function s(k, z) plotted against k, negative for high k and positive for low k]
Kolmogorov Forward Equations
Kolmogorov Forward Equations

• Let x be a scalar diffusion

dx = µ(x)dt + σ(x)dW,   x(0) = x_0

• Suppose we're interested in the evolution of the distribution of x, g(x, t), and in particular in the stationary distribution g(x)
• Natural thing to care about, especially in heterogeneous agent models
• Example: x = wealth
  • µ(x) determined by savings behavior and return to investments
  • σ(x) by return risk
  • microfound later
Kolmogorov Forward Equations

• Fact: Given an initial distribution g(x, 0) = g_0(x), g(x, t) satisfies the PDE

∂g(x, t)/∂t = −∂/∂x[µ(x)g(x, t)] + (1/2)∂²/∂x²[σ²(x)g(x, t)]

• This PDE is called the “Kolmogorov Forward Equation”
• Note: in math this is often called the “Fokker-Planck Equation”
• Corollary: if a stationary distribution g(x) exists, it satisfies the ODE (checked below for the Ornstein-Uhlenbeck process)

0 = −d/dx[µ(x)g(x)] + (1/2)d²/dx²[σ²(x)g(x)]

• Remark: as usual, stationary distribution defined as “if you start there, you stay there”
  • g(x) s.t. if g(x, t) = g(x), then g(x, τ) = g(x) for all τ ≥ t
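As a quick check (not in the original slides): for the Ornstein-Uhlenbeck process from earlier, µ(x) = θ(x̄ − x) with constant σ, integrating the stationary ODE once (the constant of integration must be zero for g to be integrable) gives

θ(x̄ − x)g(x) = (σ²/2)g′(x)   ⇒   g′(x)/g(x) = (2θ/σ²)(x̄ − x)   ⇒   g(x) ∝ exp(−(θ/σ²)(x − x̄)²)

which is exactly the density of N(x̄, σ²/(2θ)) claimed on the Ornstein-Uhlenbeck slide.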
Just for Completeness: Multivariate Case

• Let x ∈ R^N
• As before, define the N × N covariance matrix

σ²(x) = σ(x)σ(x)′

• The Kolmogorov Forward Equation is

∂g(x, t)/∂t = −Σ_{i=1}^N ∂/∂x_i[µ_i(x)g(x, t)] + (1/2)Σ_{i=1}^N Σ_{j=1}^N ∂²/∂x_i∂x_j[σ²_{ij}(x)g(x, t)]
Application: Stationary Distribution of RBC Model

• Recall RBC Model

ρv(k, z) = max_c { u(c) + v_k(k, z)[zF(k) − δk − c] + v_z(k, z)µ(z) + (1/2)v_zz(k, z)σ²(z) }

• Denote the optimal policy function by

s(k, z) = zF(k) − δk − c(k, z)

• Then the distribution g(k, z, t) solves

∂g(k, z, t)/∂t = −∂/∂k[s(k, z)g(k, z, t)] − ∂/∂z[µ(z)g(k, z, t)] + (1/2)∂²/∂z²[σ²(z)g(k, z, t)]

• Numerical solution with FD method: later
Appendix: Ito’s Lemma
Ito's Lemma

• Let x be a scalar diffusion

dx = µ(x)dt + σ(x)dW

• We are interested in the evolution of y(t) = f(x(t)) where f is any twice differentiable function
• Lemma: y(t) = f(x(t)) follows

df(x) = (µ(x)f′(x) + (1/2)σ²(x)f″(x))dt + σ(x)f′(x)dW

• Extremely powerful because it says that any (twice differentiable) function of a diffusion is also a diffusion
• Can also be extended to vectors
• FYI: this is also where the v′(x)µ(x) + (1/2)v″(x)σ²(x) term in the HJB equation comes from (it's E[dv(x)]/dt)
Application: Brownian vs. Geometric Brownian Motion

• Let x be a geometric Brownian motion

dx = µxdt + σxdW

• Claim: y = log x is a Brownian motion with drift µ − σ²/2 and variance σ² (numerical check below)
• Derivation: f(x) = log x, f′(x) = 1/x, f″(x) = −1/x². By Ito's Lemma

dy = df(x) = (µx(1/x) + (1/2)σ²x²(−1/x²))dt + σx(1/x)dW = (µ − σ²/2)dt + σdW

• Note: naive derivation would have used dy = dx/x and hence

dy = µdt + σdW,   wrong unless σ = 0!
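A numerical sanity check of this claim (path count, parameters, and the Euler discretization are all illustrative choices for this sketch):

mu = 0.1; sigma = 0.3; T = 1; N = 10000; dt = T/N; M = 10000;
x = ones(M, 1);
for n = 1:N
    x = x.*(1 + mu*dt + sigma*sqrt(dt)*randn(M, 1));   % Euler step for dx = mu x dt + sigma x dW
end
[mean(log(x)), (mu - sigma^2/2)*T; var(log(x)), sigma^2*T]   % sample moments vs. theory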
Just for Completeness: Multivariate Case

• Let x ∈ R^N. For fixed x, define the N × N covariance matrix

σ²(x) = σ(x)σ(x)′

• Ito's Lemma:

df(x) = ( Σ_{i=1}^N µ_i(x)∂f(x)/∂x_i + (1/2)Σ_{i=1}^N Σ_{j=1}^N σ²_{ij}(x)∂²f(x)/∂x_i∂x_j ) dt + Σ_{i=1}^N σ_i(x)(∂f(x)/∂x_i)dW_i

• In vector notation

df(x) = (∇_x f(x) · µ(x) + (1/2)tr(∆_x f(x)σ²(x)))dt + ∇_x f(x) · σ(x)dW

• ∇_x f(x): gradient of f (dimension N × 1)
• ∆_x f(x): Hessian of f (dimension N × N)