leader-follower stochastic differential game with asymmetric information and applications · 2015....

58
UNIVERSIDADE DE MACAU Leader-Follower Stochastic Differential Game with Asymmetric Information and Applications Jie Xiong Department of Mathematics University of Macau Macau Based on joint works with Shi and Wang 2015 Barrett Lecture, Knoxville, 14/05/2015]

Upload: others

Post on 08-Feb-2021

1 views

Category:

Documents


0 download

TRANSCRIPT

  • UNIVERSIDADE DE MACAU

    Leader-Follower Stochastic Differential Gamewith Asymmetric Information and Applications

    Jie Xiong

    Department of MathematicsUniversity of Macau

    Macau

    Based on joint works with Shi and Wang2015 Barrett Lecture, Knoxville, 14/05/2015]

  • UNIVERSIDADE DE MACAU

    Outline

    1 Motivation

    2 Problem formulation

    3 LQ Leader-Follower Stochastic Differential Game withAsymmetric Information

    4 Follower’s optimization problem

    5 Complete Information LQ Stochastic Optimal ControlProblem of the Leader

  • UNIVERSIDADE DE MACAU

    1. A motivation

    Example: Continuous Time Newsvendor Problems

    D(·) is the demand rate{dD(t) = a(µ− kR(t)−D(t))dt+ σdW (t) + σ̃dW̃ (t),D(0) = d0 ∈ R,

    retail price R(t).

  • UNIVERSIDADE DE MACAU

    The retailer will obtain an expected profit

    J1(q(·), R(·), w(·))

    = E∫ T0

    [(R(t)− S) min[D(t), q(t)]

    − (w(t)− S)q(t)]dt.

    order rate q(t); wholesale price w(t); salvaged at unit priceS ≥ 0.

  • UNIVERSIDADE DE MACAU

    the manufacturer has a fixed production cost per unit M ≥ 0,an expected profit

    J2(q(·), R(·), w(·)) = E∫ T0

    (w(t)−M)q(t)dt.

  • UNIVERSIDADE DE MACAU

    The information G1t ,G2t available to the retailer and themanufacturer at time t, respectively. G1t 6= G2t .For any w(·), retailer selects a G1t -adapted processes pair(q∗(·), R∗(·)) for the retailer such that

    J1(q∗(·), R∗(·), w(·))

    ≡ J1(q∗(·;w(·)), R∗(·;w(·)), w(·)

    )= max

    q(·),R(·)J1(q(·), R(·), w(·)),

  • UNIVERSIDADE DE MACAU

    Manufacturer selects a G2t -adapted process w∗(·) for themanufacturer such that

    J2(q∗(·), R∗(·), w∗(·))

    ≡ J2(q∗(·;w∗(·)), R∗(·;w∗(·)), w∗(·)

    )= max

    w(·)J2(q∗(·;w(·)), R∗(·;w(·)), w(·)

    ),

  • UNIVERSIDADE DE MACAU

    Example: Cooperative Advertising and Pricing Problems

    dx(t) =[ρu(t)

    √1− x(t)− δx(t)

    ]dt+ σ(x(t))dW (t)

    + σ̃(x(t))dW̃ (t),

    x(0) = x0 ∈ [0, 1],

    u(t) advertisement effort rate, x(t) awareness share, ρ responseconstant, and δ the rate at which potential consumers are lost.

    Manufacturer decide:wholesale price w(t), cooperative participation rate θ(t).Retailer decides: effort u(t), retail price p(t).

  • UNIVERSIDADE DE MACAU

    Given w(t) and θ(t), retailer solves an optimization problem tomaximize his expected profit

    J1(w(·), θ(·), u(·), p(·))

    = E∫ T0e−rt

    [(p(t)− w(t))D(p(t))x(t)

    − (1− θ(t))u2(t)]dt,

    D is demand function.

  • UNIVERSIDADE DE MACAU

    The manufacturer’s optimization problem is to maximize hisexpected profit

    J2(w(·), θ(·), u(·), p(·))

    = E∫ T0e−rt

    [(w(t)− c)D(p(t))x(t)− θ(t)u2(t)

    ]dt,

    where c ≥ 0 is the constant unit production cost.

  • UNIVERSIDADE DE MACAU

    The information G1t ,G2t available to the retailer and themanufacturer at time t, respectively. G1t 6= G2t .Given w and θ, retailer chooses G1,t-adapted pair (u∗(·), p∗(·)),

    J1(w(·), θ(·), u∗(·), p∗(·))≡ J1

    (w(·), θ(·), u∗(·; (w(·), θ(·)), p∗(·; (w(·), θ(·))

    )= max

    u(·),p(·)≥0J1(w(·), θ(·), u(·), p(·)).

  • UNIVERSIDADE DE MACAU

    Select a G2,t-adapted pair (w∗(·), θ∗(·)) for the manufacturer

    J2(w∗(·), θ∗(·), u∗(·), p∗(·))

    ≡ J2(w∗(·), θ∗(·), u∗(·; (w∗(·), θ∗(·)),p∗(·; (w∗(·), θ∗(·))

    )= max

    w(·),0≤θ(·)≤1J2(w(·), θ(·), u∗(·; (w(·), θ(·)),

    p∗(·; (w(·), θ(·))),

  • UNIVERSIDADE DE MACAU

    2. Problem formulation

    State equation:

    dxu1,u2(t)

    = b(t, xu1,u2(t), u1(t), u2(t)

    )dt

    + σ(t, xu1,u2(t), u1(t), u2(t)

    )dW (t)

    + σ̃(t, xu1,u2(t), u1(t), u2(t)

    )dW̃ (t),

    xu1,u2(0) = x0,

    (2.1)

  • UNIVERSIDADE DE MACAU

    G1t 6= G2t ⊆ Ft.The admissible control

    U1 :={u1∣∣u1 : Ω× [0, T ]→ U1 is G1t -adapted

    and sup0≤t≤T

    E|u1(t)|i

  • UNIVERSIDADE DE MACAU

    Problem of the follower. For any chosen u2(·) ∈ U2 by theleader, choose a G1t -adapted control u∗1(·) = u∗1(·;u2(·)) ∈ U1,such that

    J1(u∗1(·), u2(·)) ≡ J1

    (u∗1(·;u2(·)), u2(·)

    )= inf

    u1∈U1J1(u1(·), u2(·)), (2.4)

    u∗1(·) = u∗1(·;u2(·)), (partial information) optimal control.

  • UNIVERSIDADE DE MACAU

    The leader would like to choose a G2t -adapted control u∗2(·) tominimize

    J2(u∗1(·), u2(·))

    = E[ ∫ T

    0g2(t, xu

    ∗1,u2(t), u∗1(t;u2(t)), u2(t)

    )dt

    +G2(xu∗1,u2(T ))

    ].

    (2.5)

  • UNIVERSIDADE DE MACAU

    Problem of the leader. Find a G2t -adapted control u∗2(·) ∈ U2,such that

    J2(u∗1(·), u∗2(·)) = J2

    (u∗1(·;u∗2(·)), u∗2(·)

    )= inf

    u2∈U2J2(u∗1(·;u2(·)), u2(·)

    ),

    (2.6)

  • UNIVERSIDADE DE MACAU

    Issacs (1954-5) differential gamesBasar and Olsder (1982) Monograph to summarizeStackelberg (1934) Leader-follower game introducedYong (2002) LQ, complete info.Bensoussan et al (2012) Stoch. Max. Principle

  • UNIVERSIDADE DE MACAU

    3. LQ Leader-Follower Stochastic Differential Gamewith Asymmetric Information; Stochastic maximumprinciple under partial information.

    The state equation

    dxu1,u2(t)

    =[A(t)xu1,u2(t) +B1(t)u1(t) +B2(t)u2(t)

    ]dt

    +[C(t)xu1,u2(t) +D1(t)u1(t) +D2(t)u2(t)

    ]dW (t)

    +[C̃(t)xu1,u2(t) + D̃1(t)u1(t) + D̃2(t)u2(t)

    ]dW̃ (t),

    xu1,u2(0) = x0,

    (3.7)

  • UNIVERSIDADE DE MACAU

    Given u2 ∈ U2, follower minimizes his cost functional

    J1(u1(·), u2(·))

    =1

    2E[ ∫ T

    0

    (〈Q1(t)x

    u1,u2(t), xu1,u2(t)〉

    +〈N1(t)u1(t), u1(t)

    〉)dt

    +〈G1x

    u1,u2(T ), xu1,u2(T )〉],

    (3.8)

    Info: G1t = σ{W̃ (s); 0 ≤ s ≤ t}

  • UNIVERSIDADE DE MACAU

    Stochastic maximum principle under partialinformation.Probability space (Ω,F , P̂).State equation: FBSDE

    dx = b(t, x, v)dt+ σ(t, x, v)dW + σ̃(t, x, v)dW̃ ,−dy = g(t, x, y, z, z̃, v)dt− zdW − z̃dY,x(0) = x0, y(1) = f(x(1)).

    (3.9)

    Observation {dY = h(t, x)dt+ dW̃ ,Y (0) = 0.

    (3.10)

  • UNIVERSIDADE DE MACAU

    Admissible controls v ∈ Av is Gt-adapted, where Gt = FYt .

    Control problem: Find u ∈ A s.t.

    J(u) = minv∈A

    J(v),

    where

    J(v) = Ê(∫ 1

    0`(t, xv, yv, zv, z̃v, v)dt+ φ(xv(1)) + γ(yv(0))

    ).

  • UNIVERSIDADE DE MACAU

    Hypothesis (H1):

    Coeff. conti. diff. w/ bounded 1st order partial derivatives.

    σ̃, h bounded.

    Hypothesis (H2):

    `, φ, γ conti. diff.

    Ê(∫ 1

    0|`(t, xv, yv, zv, z̃v, v)|dt+ |φ(xv(1))|+ |γ(yv(0))|

    )

  • UNIVERSIDADE DE MACAU

    when b = σ = σ̃ = 0, studied by Huang-Wang-Xiong (SICON2009)General form by Wang-Wu-Xiong (SICON 2013).Detail on LQ case by Wang-Wu-Xiong (IEEE TAC 2015).

  • UNIVERSIDADE DE MACAU

    Replace b, W̃ in state equation by b̃, Y , resp, where

    b̃ = b− σ̃h.

    State equation: dx = b̃(t, x, v)dt+ σ(t, x, v)dW + σ̃(t, x, v)dY,−dy = g(t, x, y, z, z̃, v)dt− zdW − z̃dY,x(0) = x0, y(1) = f(x(1)).

    (3.11)

  • UNIVERSIDADE DE MACAU

    Let

    Zv(t) = exp

    {∫ t0h(s, xv)dY − 1

    2

    ∫ t0h2(s, xv)ds

    }.

    SodZv = Zvh(t, xv)dY. (3.12)

    Let P ∼ P̂ be s.t.dP̂ = Zv(1)dP.

    Then, under P, (W,Y ) is a B.M.

    J(v) = E(∫ 1

    0Zv`(t, xv, yv, zv, z̃v, v)dt+ Zv(1)φ(xv(1)) + γ(yv(0))

    ).

  • UNIVERSIDADE DE MACAU

    Hypothesis (H3):

    For ξ bdd, Gt-measurable,

    v(s) ≡ ξ1[t,t+τ)(s),

    then v ∈ A.For any u ∈ A and v being Gs-adapted and bdd, ∃ δ > 0, if|�| < δ, then u+ �v ∈ A.

  • UNIVERSIDADE DE MACAU

    Adjoint equations:

    dp = (gy(t, x, y, z, z̃, u)p− `y(t, x, y, z, z̃, u)) dt+ (gz(t, x, y, z, z̃, u)p− `z(t, x, y, z, z̃, u)) dW+ (gz̃(t, x, y, z, z̃, u)p− h(t, x)) dW̃ ,

    −dq ={

    (bx(t, x, u)− σ̃(t, x, u)hx(t, x))q + σx(t, x, u)k

    +σ̃x(t, x, u)k̃ − gx(t, x, y, z, z̃, u)p

    −`x(t, x, y, z, z̃, u)}dt− kdW − k̃dW̃ ,

    p(0) = −γy(y(0))q(1) = −fx(x(1))p(1) + φx(x(1)),

    (3.13)

  • UNIVERSIDADE DE MACAU

    and {−dP = `(t, x, y, z, z̃, u)dt−QdW − Q̃dW̃P (1) = φ(x(1)).

    (3.14)

    Hamitonian

    H(t, x, y, z, z̃, v; p, q, k, k̃, Q̃)

    = bq + σk + σ̃k̃ + hQ̃+ `− (g − hz̃)p.

  • UNIVERSIDADE DE MACAU

    Theorem

    If u is a local minimum for J(·), then

    E(Hu(t, x, y, z, z̃, v; p, q, k, k̃, Q̃)

    ∣∣∣Gt) = 0.Idea of proof Set

    d

    d�J(u+ �v)

    ∣∣∣�=0

    = 0.

  • UNIVERSIDADE DE MACAU

    Remark

    (3.14) is for Zv(t). If h = 0, this eq. is not needed.

    Suppose h = b = σ = σ̃, it is proved in Huang-Wang-Xiong(2008, SICON)

  • UNIVERSIDADE DE MACAU

    4. Follower’s optimization problemStep 1. (Optimal control)Hamiltonian function:

    H1(t, x, u1, u2; q, k, k̃

    )=〈q, A(t)x+B1(t)u1 +B2(t)u2

    〉+〈k,C(t)x+D1(t)u1 +D2(t)u2

    〉+〈k̃, C̃(t)x+ D̃1(t)u1 + D̃2(t)u2

    〉− 1

    2

    〈Q1(t)x, x

    〉− 1

    2

    〈N1(t)u1, u1

    〉.

    (4.15)

  • UNIVERSIDADE DE MACAU

    Then,

    0 = N1(t)u∗1(t)−B>1 (t)E

    [q(t)

    ∣∣G1t ]−D>1 (t)E

    [k(t)

    ∣∣G1t ]− D̃>1 (t)E

    [k̃(t)

    ∣∣G1t ], a.e.t ∈ [0, T ],(4.16)

    where Ft-adapted (q(·), k(·), k̃(·)) satisfies the BSDE

    −dq(t) =[A>(t)q(t) + C>(t)k(t) + C̃>(t)k̃(t)

    −Q1(t)xu∗1,u2(t)

    ]dt

    − k(t)dW (t)− k̃(t)dW̃ (t),q(T ) = −G1xu

    ∗1,u2(T ).

    (4.17)

  • UNIVERSIDADE DE MACAU

    Step 2. (Optimal filtering)Let

    f̂(t) := E[f(t)

    ∣∣G1t ], f = q, k, k̃.How to derive filtering equation? Direct from (4.17) isimpossible.Solve (4.17) first!Guess:

    q(t) = −P1(t)xu∗1,u2(t)− ϕ(t), t ∈ [0, T ], (4.18)

    Set {dϕ(t) = α(t)dt+ β(t)dW̃ (t),

    ϕ(T ) = 0.(4.19)

  • UNIVERSIDADE DE MACAU

    Applying dq(t) and comparing, we get

    k(t) = −P1(t)[C(t)xu

    ∗1,u2(t) +D1(t)u

    ∗1(t) +D2(t)u2(t)

    ], (4.20)

    k̃(t) = − P1(t)[C̃(t)xu

    ∗1,u2(t) + D̃1(t)u

    ∗1(t)

    + D̃2(t)u2(t)]− β(t),

    (4.21)

    α(t) =[− Ṗ1(t)− P1(t)A(t)−A>(t)P1(t)

    −Q1(t)]xu

    ∗1,u2(t)− P1(t)B(t)u∗1(t)

    − P1(t)B2(t)u2(t)−A>(t)ϕ(t)

    + C>(t)k(t) + C̃>(t)k̃(t).

    (4.22)

  • UNIVERSIDADE DE MACAU

    Hence,q̂(t) = −P1(t)x̂u

    ∗1,û2(t)− ϕ(t), (4.23)

    k̂(t) = −P1(t)[C(t)x̂u

    ∗1,û2(t) +D1(t)u

    ∗1(t) +D2(t)û2(t)

    ], (4.24)

    ˆ̃k(t) = − P1(t)

    [C̃(t)x̂u

    ∗1,û2(t) + D̃1(t)u

    ∗1(t)

    + D̃2(t)û2(t)]− β(t)

    (4.25)

    with û2(t) := E[u2(t)

    ∣∣G1,t].

  • UNIVERSIDADE DE MACAU

    Using filtering technique (Lemma 5.4 in Xiong (2008)), we getdx̂u

    ∗1,û2(t)

    =[A(t)x̂u

    ∗1,û2(t) +B1(t)u

    ∗1(t) +B2(t)û2(t)

    ]dt

    +[C̃(t)x̂u

    ∗1,û2(t) + D̃1(t)u

    ∗1(t) + D̃2(t)û2(t)

    ]dW̃ (t),

    x̂u∗1,û2(0) = x0,

    (4.26)

    and −dq̂(t) =

    {A>(t)q̂(t) + C>(t)k̂(t) + C̃>(t)

    ˆ̃k(t)

    −Q1(t)x̂u∗1,û2(t)

    }dt− ˆ̃k(t)dW̃ (t),

    q̂(T ) = −G1x̂u∗1,û2(T ),

    (4.27)

  • UNIVERSIDADE DE MACAU

    Step 3. (Optimal state feedback control)By (4.16) we arrive at

    u∗1(t) =− Ñ1(t)−1[(B>1 (t)P1(t) +D

    >1 (t)P1(t)C(t)

    + D̃>1 (t)P1(t)C̃(t))x̂u

    ∗1,û2(t)

    +(D>1 (t)P1(t)D2(t) + D̃

    >1 (t)P1(t)D̃2(t)

    )û2(t)

    +B>1 (t)ϕ(t) + D̃>1 (t)β(t)

    ],

    (4.28)

    where

    Ñ1(t) := N1(t) +D>1 (t)P1(t)D1(t) + D̃

    >1 (t)P1(t)D̃1(t) (4.29)

  • UNIVERSIDADE DE MACAU

    Substituting (4.28) into (4.22), we obtain Riccati equation

    Ṗ1(t) + P1(t)A(t) +A>(t)P1(t) + C

    >(t)P1(t)C(t)

    + C̃>(t)P1(t)C̃(t) +Q1(t)

    −(P1(t)B1(t) + C

    >(t)P1(t)D1(t)

    + C̃>(t)P1(t)D̃1(t))Ñ−11 (t)

    (B>1 (t)P1(t)

    +D>1 (t)P1(t)C(t) + D̃>1 (t)P1(t)C̃(t)

    )= 0,

    P1(T ) = G1.

    (4.30)

    Suppose it admits a unique differentiable solution P1(·)

  • UNIVERSIDADE DE MACAU

    α(t) =[A>(t)− S̃1(t)Ñ1(t)−1B>1 (t)

    ]ϕ(t)

    +[C̃>(t)− S̃1(t)Ñ1(t)−1D̃>1 (t)

    ]β(t)

    +[− S̃1(t)Ñ1(t)−1S̃(t) + S̃2(t)

    ]û2(t),

    (4.31)

    which is G1t -adapted, where

    S̃(t) := D>1 (t)P1(t)D2(t) + D̃>1 (t)P1(t)D̃2(t),

    S̃1(t) := P1(t)B1(t) + C>(t)P1(t)D1(t)

    + C̃>(t)P1(t)D̃1(t),

    S̃2(t) := P1(t)B2(t) + C>(t)P1(t)D2(t)

    + C̃>(t)P1(t)D̃2(t), t ∈ [0, T ].

  • UNIVERSIDADE DE MACAU

    Then,

    −dϕ(t) ={[A>(t)− S̃1(t)Ñ1(t)−1B>1 (t)

    ]ϕ(t)

    +[C̃>(t)− S̃1(t)Ñ1(t)−1D̃>1 (t)

    ]β(t)

    +[− S̃1(t)Ñ1(t)−1S̃(t) + S̃2(t)

    ]û2(t)

    }dt

    − β(t)dW̃ (t),ϕ(T ) = 0,

    (4.32)

    which admits a unique G1t -adapted solution (ϕ(·), β(·)).

  • UNIVERSIDADE DE MACAU

    Finally,

    dx̂u∗1,û2(t) =

    [Ã(t)x̂u

    ∗1,û2(t) + F̃1(t)ϕ(t)

    + B̃1(t)β(t) + B̃2(t)û2(t)]dt

    +[ ˜̃C(t)x̂u

    ∗1,û2(t) + B̃>1 (t)ϕ(t)

    + F̃3(t)β(t) +˜̃D2(t)û2(t)

    ]dW̃ (t),

    x̂u∗1,û2(0) = x0,

    (4.33)

    solves x̂u∗1,û2 , where

  • UNIVERSIDADE DE MACAU

    Ã(t) := A(t)−B1(t)Ñ1(t)−1S̃>1 (t),

    B̃2(t) := B2(t)−B1(t)Ñ1(t)−1S̃(t),

    F̃1(t) := −B1(t)Ñ1(t)−1B>1 (t),

    B̃1(t) := −B1(t)Ñ1(t)−1D̃>1 (t),˜̃C(t) := C̃(t)− D̃1(t)Ñ1(t)−1S̃>1 (t),

    F̃3(t) := −D̃1(t)Ñ1(t)−1D̃>1 (t),˜̃D2(t) := D̃2(t)− D̃1(t)Ñ1(t)−1S̃(t)

    F̃4(t) := −S̃1(t)Ñ1(t)−1S̃(t) + S̃2(t).

  • UNIVERSIDADE DE MACAU

    Theorem 3.1 Let Riccati equation admit a differentiablesolution P1(·) and let matrix be convertible. For chosen u2(·) ofthe leader, Problem of the follower admits an optimalcontrol u∗1(·) of the state feedback, where processes triple(x̂u

    ∗1,û2(·), ϕ(·), β(·)) is the unique G1t -adapted solution to the

    FBSDE (4.32)-(4.33).

  • UNIVERSIDADE DE MACAU

    5. Complete Information LQ Stochastic OptimalControl Problem of the LeaderThe leader knows the complete information Ft at any time t.The state equations faced by the leader consist of BSDE (4.32)and the SDE (3.7). By (4.28), it can be written as

  • UNIVERSIDADE DE MACAU

    dxu2(t) =[A(t)xu2(t) +

    (Ã(t)−A(t)

    )x̂û2(t) + F̃1(t)ϕ(t)

    + B̃1(t)β(t) +B2(t)u2(t) +(B̃2(t)−B2(t)

    )û2(t)

    ]dt

    +[C(t)xu2(t) + F̃5(t)x̂

    û2(t) +˜̃B1(t)

    >ϕ(t) +˜̃D1(t)β(t)

    +D2(t)u2(t) + F̃2(t)û2(t)]dW (t)

    +[C̃(t)xu2(t) +

    ( ˜̃C(t)− C̃(t)

    )x̂û2(t) + B̃1(t)

    >ϕ(t)

    + F̃3(t)β(t) + D̃2(t)u2(t) +( ˜̃D(t)− D̃(t)

    )û2(t)

    ]dW̃ (t),

    − dϕ(t) =[Ã>(t)ϕ(t) +

    ˜̃C>

    (t)β(t)

    + F̃4(t)û2(t)]dt− β(t)dW̃ (t),

    xu2(0) = x0, ϕ(T ) = 0.(5.34)

  • UNIVERSIDADE DE MACAU

    The leader’s cost functional

    J̃2(u2(·)) := J2(u∗1(·), u2(·))

    =1

    2E[ ∫ T

    0

    (〈Q2(t)x

    u∗1,u2(t), xu∗1,u2(t)

    〉+〈N2(t)u2(t), u2(t)

    〉)dt

    +〈G2x

    u∗1,u2(T ), xu∗1,u2(T )

    〉](5.35)

    This complete info conditional mean-field LQ problem withstate (5.34) and cost (5.34) can be solved explicitly using ageneral result.

  • UNIVERSIDADE DE MACAU

    General conditional mean-field SMP. State

    dxu2(t) = b(t)dt+ σ(t)dW (t) + σ̃(t)dW̃ (t),

    −dq(t) ={bx(t)q(t) +

    d1∑j=1

    σjx(t)kj(t)

    +

    d2∑j=1

    σ̃jx(t)k̃j(t)− g1x(t)

    }dt

    − k(t)dW (t)− k̃(t)dW̃ (t),xu2(0) = x0, q(T ) = −G1x(xu2(T )),

    (5.36)

    where b is a function of (t, x, x̂, u2, û2, q, k, k̃, k̂,ˆ̃k).

  • UNIVERSIDADE DE MACAU

    Cost:

    J2(u2(·)) = E[ ∫ T

    0g2(t)dt+G2(x

    u2(T ))]

    Problem of the leader. Find a G2t -adapted control u∗2(·) ∈ U2,such that

    J2(u∗2(·)) = inf

    u2∈U2J2(u2(·)), (5.37)

    subject to (5.36).

  • UNIVERSIDADE DE MACAU

    Define Hamiltonian

    H2(t, x, u2, x̂, û2, q, k, k̃; y, z, z̃, p

    )=〈y, b()

    〉+ tr

    {z>σ(t)

    }+ tr

    {z̃>σ̃(t)

    }+ g2(t)

    −〈p, bx(t)q +

    d1∑j=1

    σjx(t)kj +

    d2∑j=1

    σ̃jx(t)k̃j − g1x(t)

    〉.

    (5.38)

  • UNIVERSIDADE DE MACAU

    Let (y(·), z(·), z̃(·), p(·)) be the unique Ft-adapted solution tothe adjoint conditional mean-field FBSDE of the leader

    dp(t) ={b∗x(t)p(t) + E

    [b∗x̂(t)p(t)

    ∣∣G1t ]}dt+

    d1∑j=1

    {σ∗jx (t)p(t) + E

    [σ∗jx̂ (t)p(t)

    ∣∣G1t ]}dW j(t)+

    d2∑j=1

    {σ̃∗jx (t)p(t) + E

    [σ̃∗jx̂ (t)p(t)

    ∣∣G1t ]}dW̃ j(t),(5.39)

  • UNIVERSIDADE DE MACAU

    −dy ={bxy + E

    [bx̂y∣∣G1t ]+ d1∑

    j=1

    [σjxz

    j + E[σjx̂z

    j∣∣G1t ]]

    +

    d2∑j=1

    [σ̃jx(t)z

    j(t) + E[σ̃jx̂(t)z

    j(t)∣∣G1t ]]

    −n∑i=1

    {∂bx∂xi

    (t)q(t)pi(t) + E[∂bx∂x̂i

    (t)q(t)pi(t)∣∣∣G1t ]}

    −n∑i=1

    { ∂∂xi

    ( d1∑j=1

    σjxkj

    )pi + E

    [ ∂∂x̂i

    ( d1∑j=1

    σjxkj

    )pi

    ∣∣∣G1t ]}

    −n∑i=1

    { ∂∂xi

    ( d1∑j=1

    σ̃jxk̃j

    )pi + E

    [ ∂∂x̂i

    ( d1∑j=1

    σ̃jxk̃j

    )pi

    ∣∣∣G1t ]}+ g1xxp+ E

    [g1xx̂p

    ∣∣G1t ]+ g2x + E[g2x̂∣∣G1t ]}dt− zdW − z̃dW̃ ,(5.40)

  • UNIVERSIDADE DE MACAU

    with boundary

    p(0) = 0,

    y(T ) = G1xx(x∗(T ))p(T ) +G2x(x

    ∗(T )),(5.41)

    where we have used φ ≡ φ(t, x∗(t), x̂∗(t), u∗2(t), û

    ∗2(t))

    forφ = b, σ, σ̃, g1, g2 and all their derivatives.Related work: Anderson and Djehiche (2011), Li (2012), Yong(2.13).

  • UNIVERSIDADE DE MACAU

    Proposition (Maximum principle of conditional mean-fieldFBSDE with partial information)

    Let u∗2(·) ∈ U2 be the optimal control for Problem of theleader and (x∗(·), q∗(·), k∗(·), k̃∗(·)) be the correspondingoptimal state which is the unique Ft-adapted solution to (5.36).Let (y(·), z(·), z̃(·), p(·)) be the unique Ft-adapted solution to theadjoint equation. Then

    E

    {∂H2∂u2

    + E[∂H2∂û2

    ∣∣∣∣G1t ]∣∣∣∣u2=u∗2(t)

    ∣∣∣∣∣G2t}

    = 0, a.e.t ∈ [0, T ], (5.42)

  • UNIVERSIDADE DE MACAU

    Now, we come back to LQ problem. Define Hamiltonian

    H2(t, xu2 , u2, ϕ, β; y, z, z̃, p

    )=〈y,A(t)xu2 +

    (Ã(t)−A(t)

    )x̂û2 + F̃1(t)ϕ

    + B̃1(t)β +B2(t)u2 +(B̃2(t)−B2(t)

    )û2〉

    +〈z, C(t)xu2 + F̃5(t)x̂

    û2 +˜̃B>1 (t)ϕ

    +˜̃D1(t)β +D2(t)u2 + F̃2(t)u2

    〉+〈z̃, C̃(t)xu2 +

    ( ˜̃C(t)− C̃(t)

    )x̂û2 + B̃>1 (t)ϕ

    + F̃3(t)β + D̃2(t)u2 +(D̃2(t)−D2(t)

    )û2〉

    +〈p, Ã>ϕ+

    ˜̃C>β + F̃4û2

    〉+

    1

    2

    [〈Q2x

    u2 , xu2〉

    +〈N2u2, u2

    〉].

    (5.43)

  • UNIVERSIDADE DE MACAU

    Applying the general thm, if there exists an Ft-adapted optimalcontrol u∗2(·) ∈ U2 for the leader, then

    0 = N2(t)u∗2(t) + F̃

    >4 (t)p̂(t) +B

    >2 (t)y(t)

    +(B̃2(t)−B2(t)

    )>ŷ(t) +D>2 (t)z(t)

    + F̃>2 (t)ẑ(t) + D̃>2 (t)z̃(t)

    +( ˜̃D2(t)− D̃2(t)

    )>ˆ̃z(t), a.e. t ∈ [0, T ],(5.44)

    where Ft-adapted processes quadruple (p(·), y(·), z(·), z̃(·))satisfies the conditional mean-field FBSDE

  • UNIVERSIDADE DE MACAU

    dp(t) =[Ã(t)p(t) + F̃>1 (t)y(t) +

    ˜̃B1(t)z(t) + B̃1(t)z̃(t)

    ]dt

    +[ ˜̃C(t)p(t) + B̃>1 (t)y(t)

    +˜̃D>1 (t)z(t) + F̃

    >3 (t)z̃(t)

    ]dW̃ (t),

    −dy(t) =[A>(t)y(t) +

    (Ã(t)−A(t)

    )>ŷ(t)

    + C>(t)z(t) + F̃>5 (t)ẑ(t) + C̃>(t)z̃(t)

    +( ˜̃C(t)− C̃(t)

    )>ˆ̃z(t) +Q2(t)x∗(t)]dt− z(t)dW (t)− z̃(t)dW̃ (t),

    p(0) = 0, y(T ) = G2x∗(T ).

    (5.45)

  • UNIVERSIDADE DE MACAU

    ——————

    Thanks!

    ——————

    A motivationProblem formulationStochastic maximum principle under partial information.Follower's optimization problemComplete Information LQ Stochastic Optimal Control Problem of the Leader