Zubov’s Method for Differential Games

Lars Grüne

Mathematisches Institut, Universität Bayreuth

Joint work with Oana Silvia Serea, École Polytechnique, Palaiseau, France

International Workshop “The Dynamics of Control”, Irsee, 1st–3rd October, 2010

Happy Birthday Fritz!

Introduction: The Uncontrolled Case

Consider the autonomous ODE

$$\dot x(t) = f(x(t)), \quad x \in \mathbb{R}^d$$

with solutions $\Phi(t, x_0)$ and a locally exponentially stable equilibrium $x^* \in \mathbb{R}^d$ (without loss of generality $x^* = 0$), i.e., there exist a neighborhood $N$ of $x^* = 0$ and constants $c, \sigma > 0$ such that for all $x_0 \in N$ and all $t \ge 0$:

$$\|\Phi(t, x_0)\| \le c e^{-\sigma t} \|x_0\|$$

Problem: What is the domain of attraction

$$D := \{x \in \mathbb{R}^d \mid \Phi(t, x) \to x^* = 0\}\,?$$

Example for a Domain of Attraction

Fluid dynamics: explanation of the difference between linear stability and experimental instability for large Reynolds numbers [Trefethen et al., Science, 1993]

Zubov’s Equation [1964]

For a continuous function $h : \mathbb{R}^d \to \mathbb{R}_{\ge 0}$ with $h(x) = 0 \Leftrightarrow x = x^*$ consider the PDE (“Zubov’s equation”)

$$Dw(x) \cdot f(x) = -h(x)\,(1 - w(x))$$

with $w : \mathbb{R}^d \to \mathbb{R}$ and boundary condition $w(x^*) = 0$.

Then: under suitable conditions on $h$ this equation has a unique solution $w : \mathbb{R}^d \to [0, 1]$ with

$$w(x) = 0 \;\Leftrightarrow\; x = x^*$$

and $D$ satisfies the level set characterization

$$D = w^{-1}([0, 1)) := \{x \in \mathbb{R}^d \mid w(x) \in [0, 1)\}$$

Example

$$\dot x_1(t) = -x_1(t) + x_1(t)^3, \qquad \dot x_2(t) = -x_2(t) + x_2(t)^3$$

$$D = (-1, 1)^2, \qquad h(x) = 5\|x\|^2$$

(The domain of attraction is the open square: the points with $|x_i| = 1$ lie on equilibria or invariant lines and are not attracted to 0.)
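Because the two components of this example decouple, a closed-form candidate for w can be worked out by hand. The calculation is not on the slide, so treat the formula as a worked assumption: w(x) = 1 − ((1 − x_1²)(1 − x_2²))^{5/2} inside the open square and w(x) = 1 outside. The sketch below checks numerically that this candidate satisfies Zubov’s equation at a few sample points, using central differences for Dw. Function names, step size, and sample points are illustrative choices.

```python
def f(x1, x2):
    """Right-hand side of the decoupled example system."""
    return (-x1 + x1**3, -x2 + x2**3)

def h(x1, x2):
    """h(x) = 5 ||x||^2 as on the slide."""
    return 5.0 * (x1 * x1 + x2 * x2)

def w(x1, x2):
    """Hand-computed candidate: closed form inside (-1, 1)^2, equal to 1 outside."""
    if abs(x1) >= 1.0 or abs(x2) >= 1.0:
        return 1.0
    return 1.0 - ((1.0 - x1 * x1) * (1.0 - x2 * x2)) ** 2.5

def zubov_residual(x1, x2, eps=1e-5):
    """Dw(x) . f(x) + h(x) (1 - w(x)), with Dw from central differences."""
    dw1 = (w(x1 + eps, x2) - w(x1 - eps, x2)) / (2 * eps)
    dw2 = (w(x1, x2 + eps) - w(x1, x2 - eps)) / (2 * eps)
    f1, f2 = f(x1, x2)
    return dw1 * f1 + dw2 * f2 + h(x1, x2) * (1.0 - w(x1, x2))

points = [(0.0, 0.0), (0.3, -0.5), (-0.8, 0.2), (0.9, 0.9)]
max_residual = max(abs(zubov_residual(a, b)) for a, b in points)
print(max_residual)  # should be close to zero, up to finite-difference error
```

The residual vanishing on the sample points is only a sanity check, not a proof; the level set w < 1 of this candidate is the open square (−1, 1)².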

Integral Equation

$$Dw(x) \cdot f(x) = -h(x)\,(1 - w(x))$$

$$D = w^{-1}([0, 1)) := \{x \in \mathbb{R}^d \mid w(x) \in [0, 1)\}$$

Why does this characterization hold?

Integration of Zubov’s equation and subsequent integration by parts yields the integral equation

$$w(x) = 1 - \exp\left(-\int_0^\infty h(\Phi(t, x))\,dt\right)$$

$$\Phi(t, x) \to x^* \;\Leftrightarrow\; \int_0^\infty h(\Phi(t, x))\,dt < \infty \;\Leftrightarrow\; w(x) < 1$$
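The integral formula can be evaluated numerically along a trajectory. The sketch below does this for the scalar system ẋ = −x + x³ with h(x) = x². Both are assumptions chosen for illustration, because in this case the integral can be computed by hand: w(x) = 1 − √(1 − x²) for |x| < 1. The integrator (classical RK4), the horizon `T`, and the `escape` threshold are likewise illustrative choices.

```python
import math

def f(x):
    return -x + x**3          # scalar test system (assumption for this sketch)

def h(x):
    return x * x              # running cost (assumption for this sketch)

def zubov_w(x0, T=40.0, dt=1e-3, escape=10.0):
    """Approximate w(x0) = 1 - exp(-int_0^T h(Phi(t, x0)) dt) with RK4.

    The state x and the running integral are advanced together.
    If |x| exceeds `escape`, the trajectory is treated as divergent
    and 1.0 is returned.
    """
    x, integral = x0, 0.0
    for _ in range(int(T / dt)):
        x2 = x + 0.5 * dt * f(x)
        x3 = x + 0.5 * dt * f(x2)
        x4 = x + dt * f(x3)
        integral += dt * (h(x) + 2 * h(x2) + 2 * h(x3) + h(x4)) / 6.0
        x += dt * (f(x) + 2 * f(x2) + 2 * f(x3) + f(x4)) / 6.0
        if abs(x) > escape:
            return 1.0
    return 1.0 - math.exp(-integral)

for x0 in (0.0, 0.5, 0.9, 1.5):
    print(x0, zubov_w(x0))
```

Initial values inside (−1, 1) give w < 1, while a starting point such as 1.5 escapes and is assigned w = 1, in line with the level set characterization.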

Zubov’s Equation: Discussion

Zubov’s equation yields

a characterization of the domain of attraction D

an existence result for a Lyapunov function v on D, i.e., on the largest possible domain

an (in principle) constructive method for the computation of v and D, analytically or numerically

additional insight through the PDE formulation

Generalizations exist, e.g., for

periodic orbits [Aulbach ’83]

perturbed systems (deterministic: [Camilli, Gr., Wirth ’01], stochastic: [Camilli, Loreti ’06; Camilli, Gr. ’03])

control systems (deterministic: [Sontag ’83, Camilli, Gr., Wirth ’08], stochastic: [Camilli, Cesaroni, Gr., Wirth ’06])

Control and Perturbation

In this talk we consider generalizations of this method for controlled and deterministically perturbed systems

$$\dot x(t) = f(x(t), u(t), v(t))$$

with $x(t) \in \mathbb{R}^d$,

$$u \in \mathcal{U} = \{u : [0, \infty) \to U \text{ measurable}\}, \qquad v \in \mathcal{V} = \{v : [0, \infty) \to V \text{ measurable}\}$$

and $U \subset \mathbb{R}^m$, $V \subset \mathbb{R}^l$ compact.

Problem: stabilization under uncertainty

u = control, trying to achieve $\Phi(t, x_0, u, v) \to x^*$

v = perturbation, trying to keep $\Phi(t, x_0, u, v)$ away from $x^*$

(convergence to $x^* = 0$ can be generalized to arbitrary compact sets)

Extension of Integral Equation

Recall: Zubov’s method relies on the integral equation

$$w(x) = 1 - \exp\left(-\int_0^\infty h(\Phi(t, x))\,dt\right)$$

For the solutions $\Phi(t, x, u, v)$ of $\dot x(t) = f(x(t), u(t), v(t))$ define

$$J(x, u, v) = 1 - \exp\left(-\int_0^\infty h(\Phi(t, x, u, v), u(t), v(t))\,dt\right)$$

Local exponential controllability and an appropriate choice of $h : \mathbb{R}^d \times U \times V \to \mathbb{R}_0^+$ ensure

$$J(x, u, v) < 1 \;\Leftrightarrow\; \Phi(t, x, u, v) \to 0$$

$$J(x, u, v) = 1 \;\Leftrightarrow\; \Phi(t, x, u, v) \not\to 0$$

Thus, u should minimize J while v should maximize J: a zero sum differential game (min-max problem)

Information Exchange between u and v

What do u and v know about each other?

Possible settings:

when the perturbation chooses $v : [0, \infty) \to V$, it knows $u : [0, \infty) \to U$ (overly conservative, non causal)

when the control chooses $u : [0, \infty) \to U$, it knows $v : [0, \infty) \to V$ (unrealistic, non causal)

at time t, the perturbation knows $u|_{[0,t]}$ and the control knows $v|_{[0,t)}$ (causal)

at time t, the perturbation knows $u|_{[0,t)}$ and the control knows $v|_{[0,t]}$ (causal)

The last two settings are also realistic, since by Hamilton-Jacobi theory, for choosing the optimal u and v, knowing $u|_{[0,t)}$ and $v|_{[0,t)}$ is equivalent to knowing the solution $\Phi|_{[0,t)}$

General question for differential games: does the “infinitesimal” advantage make a difference?

Formalization of the Information Structure

We formalize the last two cases by defining the set $\Delta$ of nonanticipative strategies for the perturbation as the set of maps $\beta : \mathcal{U} \to \mathcal{V}$ with the following property for all $u_1, u_2 \in \mathcal{U}$ and all $s > 0$:

$$u_1(\tau) = u_2(\tau) \text{ for almost all } \tau \in [0, s] \;\Rightarrow\; \beta(u_1)(\tau) = \beta(u_2)(\tau) \text{ for almost all } \tau \in [0, s]$$

Similarly, we define the set $\Gamma$ of nonanticipative strategies $\alpha : \mathcal{V} \to \mathcal{U}$ for the control.

upper value: $w^+(x) := \sup_{\beta \in \Delta} \inf_{u \in \mathcal{U}} J(x, u, \beta(u))$

lower value: $w^-(x) := \inf_{\alpha \in \Gamma} \sup_{v \in \mathcal{V}} J(x, \alpha(v), v)$

Keep in mind: the strategy player has an infinitesimal advantage
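The nonanticipativity property has a simple discrete-time analogue that can be checked by brute force. In the sketch below, signals are finite tuples of samples and a strategy is a function from tuples to tuples; this sampled setting, the helper `is_nonanticipative`, and the two example strategies are illustrative assumptions, not part of the talk.

```python
from itertools import product

def is_nonanticipative(strategy, values=(-1, 1), horizon=4):
    """Discrete analogue of the defining property: whenever two input
    signals agree on the first k samples, the outputs must agree on
    the first k samples as well."""
    signals = list(product(values, repeat=horizon))
    for u1, u2 in product(signals, repeat=2):
        out1, out2 = strategy(u1), strategy(u2)
        for k in range(1, horizon + 1):
            if u1[:k] == u2[:k] and out1[:k] != out2[:k]:
                return False
    return True

def copy_strategy(u):
    """beta(u)(t) = u(t): reacts only to the current input sample."""
    return tuple(u)

def peek_strategy(u):
    """Uses the next input sample (anticipative); the last sample is repeated."""
    return tuple(u[1:]) + (u[-1],)

print(is_nonanticipative(copy_strategy))  # True
print(is_nonanticipative(peek_strategy))  # False
```

The copying strategy is allowed to react to the opponent’s current value, which is exactly the “infinitesimal advantage” of the strategy player; looking even one sample ahead violates the property.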

Domains of Controllability

We need two different domains of controllability

$$D^+ = (w^+)^{-1}([0, 1)) \quad \text{and} \quad D^- = (w^-)^{-1}([0, 1))$$

upper domain of uniform asymptotic controllability

$$D^+ = \{ x \in \mathbb{R}^d \mid \text{there exists } \theta(t) \to 0 \text{ such that for each } \beta \in \Delta \text{ there exists } u \in \mathcal{U} \text{ with } \|\Phi(t, x, u, \beta(u))\| \le \theta(t) \}$$

lower domain of uniform asymptotic controllability

$$D^- = \{ x \in \mathbb{R}^d \mid \text{there exist } \theta(t) \to 0 \text{ and } \alpha \in \Gamma \text{ such that for each } v \in \mathcal{V} \text{ the inequality } \|\Phi(t, x, \alpha(v), v)\| \le \theta(t) \text{ holds} \}$$

Local exponential controllability is defined analogously with $\theta(t) = c e^{-\sigma t} \|x\|$ (can be generalized to uniform convergence)

Example

Can the upper and lower domain be different?

$$\dot x(t) = -x(t) + u(t) v(t) x(t)^3$$

with $x(t) \in \mathbb{R}$, $u(t) \in U = \{-1, 1\}$, $v(t) \in V = \{-1, 1\}$

upper domain: perturbation chooses strategy $\beta$

set $\beta(u)(t) := u(t)$

$\Rightarrow \dot x(t) = -x(t) + x(t)^3$ for all $u \in \mathcal{U}$ $\Rightarrow D^+ = (-1, 1)$

lower domain: control chooses strategy $\alpha$

set $\alpha(v)(t) := -v(t)$

$\Rightarrow \dot x(t) = -x(t) - x(t)^3$ for all $v \in \mathcal{V}$ $\Rightarrow D^- = (-\infty, \infty)$
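The effect of the two strategies can be seen in a short simulation. The sketch below uses explicit Euler with a fixed step, random switching signals for the non-strategy player, and a blow-up threshold; all of these, as well as the helper name `final_state`, are assumptions made for illustration. It only shows the closed-loop behaviour under the two strategies named on the slide and does not by itself prove the stated domains.

```python
import random

def final_state(x0, uv_signal, steps, dt=1e-3, blowup=1e3):
    """Explicit Euler for x' = -x + (u v) x^3, where uv_signal(k) returns
    the product u(t_k) v(t_k) at step k. Returns None once |x| exceeds
    `blowup`, otherwise the state after `steps` steps."""
    x = x0
    for k in range(steps):
        x += dt * (-x + uv_signal(k) * x**3)
        if abs(x) > blowup:
            return None
    return x

steps = 10_000                                     # time horizon 10 with dt = 1e-3
rng = random.Random(0)
u = [rng.choice((-1, 1)) for _ in range(steps)]    # arbitrary control signal
v = [rng.choice((-1, 1)) for _ in range(steps)]    # arbitrary perturbation signal

# Perturbation plays beta(u)(t) = u(t): the product u * beta(u) is always +1.
upper_inside = final_state(0.9, lambda k: u[k] * u[k], steps)
upper_outside = final_state(1.1, lambda k: u[k] * u[k], steps)

# Control plays alpha(v)(t) = -v(t): the product alpha(v) * v is always -1.
lower_far = final_state(5.0, lambda k: -v[k] * v[k], steps)

print(upper_inside, upper_outside, lower_far)
```

Against β the trajectory from 0.9 decays while the one from 1.1 escapes, whatever u does; with α the trajectory from 5.0 decays, whatever v does.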

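Formal Derivation of Zubov’s Equation

$$w^+(x) := \sup_{\beta \in \Delta} \inf_{u \in \mathcal{U}} \left\{ 1 - \exp\left(-\int_0^\infty h(\Phi(t, x, u, v), u(t), v(t))\,dt\right) \right\}$$

(where $v = \beta(u)$) satisfies for all $T > 0$ the optimality principle

$$w^+(x) = \sup_{\beta \in \Delta} \inf_{u \in \mathcal{U}} \left\{ 1 - \exp\left(-\int_0^T h(\Phi(t, x, u, v), u(t), v(t))\,dt\right) \left[1 - w^+(\Phi(T, x, u, v))\right] \right\}$$

Division by $-T$ and passing to the limit for $T \to 0$ yields the Hamilton–Jacobi–Isaacs equation

$$H^+(x, w^+(x), Dw^+(x)) = 0$$

with Hamiltonian

$$H^+(x, w, p) = \sup_{u \in U} \inf_{v \in V} \{-p \cdot f(x, u, v) - h(x, u, v)(1 - w)\}$$

“generalized Zubov equation”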
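To see where the Hamiltonian comes from, the limit can be carried out explicitly in the uncontrolled case. The sketch below assumes, for illustration only, that w is differentiable at x and that h is continuous; it is a formal first-order expansion, not part of the rigorous argument.

```latex
% Optimality principle without u and v:
%   w(x) = 1 - G(T) [1 - w(\Phi(T,x))],
%   G(T) := \exp\bigl(-\int_0^T h(\Phi(t,x))\,dt\bigr)
\begin{align*}
G(T) &= 1 - T\,h(x) + o(T), \\
w(\Phi(T,x)) &= w(x) + T\,Dw(x)\cdot f(x) + o(T), \\
w(x) &= 1 - \bigl(1 - T\,h(x)\bigr)\bigl(1 - w(x) - T\,Dw(x)\cdot f(x)\bigr) + o(T) \\
     &= w(x) + T\,\bigl[Dw(x)\cdot f(x) + h(x)\,(1 - w(x))\bigr] + o(T).
\end{align*}
% Subtract w(x), divide by -T and let T -> 0:
\[
  -Dw(x)\cdot f(x) - h(x)\,(1 - w(x)) = 0 .
\]
```

This is Zubov’s equation written in the sign convention of the Hamiltonian; in the game setting the sup and inf over strategies and controls turn into the pointwise sup over U and inf over V.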

Formal Derivation of Zubov’s Equation

$w^+$ formally satisfies the generalized Zubov equation

$$H^+(x, w^+(x), Dw^+(x)) = 0$$

with Hamiltonian

$$H^+(x, w, p) = \sup_{u \in U} \inf_{v \in V} \{-p \cdot f(x, u, v) - h(x, u, v)(1 - w)\}$$

Likewise, $w^-$ formally satisfies the generalized Zubov equation

$$H^-(x, w^-(x), Dw^-(x)) = 0$$

with Hamiltonian

$$H^-(x, w, p) = \inf_{v \in V} \sup_{u \in U} \{-p \cdot f(x, u, v) - h(x, u, v)(1 - w)\}$$

Nonsmooth Solution

Problem: $w^+$ and $w^-$ from the integral equations are typically nonsmooth

Example: $w^+$ for $\dot x(t) = -x(t) + u(t) v(t) x(t)^3$, with $U = V = \{-1, 1\}$, $h(x, u, v) = x^2$

[Figure: plot over $x \in [-5, 5]$, curve labelled “viscosity solution”]

Viscosity Solution

Super- and subdifferential:

[Figure: superdifferential $D^+ v(x)$ and subdifferential $D^- v(x)$]

w viscosity supersolution: $H(x, w(x), p) \ge 0$ for all $p \in D^- w(x)$

w viscosity subsolution: $H(x, w(x), p) \le 0$ for all $p \in D^+ w(x)$

w viscosity solution, if both hold [Crandall, Lions ’82]
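The slide shows the super- and subdifferential only graphically. For reference, the standard definitions from the general viscosity solution literature (stated here as background, not taken from the slides) can be written as follows, together with a one-line example.

```latex
\begin{align*}
D^+ v(x) &= \Bigl\{ p \in \mathbb{R}^d \;:\;
  \limsup_{y \to x} \frac{v(y) - v(x) - p \cdot (y - x)}{\|y - x\|} \le 0 \Bigr\}, \\
D^- v(x) &= \Bigl\{ p \in \mathbb{R}^d \;:\;
  \liminf_{y \to x} \frac{v(y) - v(x) - p \cdot (y - x)}{\|y - x\|} \ge 0 \Bigr\}.
\end{align*}
% Example: v(x) = |x| on R has D^- v(0) = [-1, 1] and D^+ v(0) = \emptyset.
% Where v is differentiable, D^+ v(x) = D^- v(x) = \{ Dv(x) \}.
```

At points of differentiability both conditions reduce to the classical equation H(x, w(x), Dw(x)) = 0, so the viscosity notion only adds requirements at kinks.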

Existence and Uniqueness

With this solution concept and with the help of “sub- and superoptimality principles” for viscosity super- and subsolutions [Soravia ’95] we arrive at the following

Theorem: $w^+$ is the unique continuous viscosity solution of the equation

$$\sup_{u \in U} \inf_{v \in V} \{-p \cdot f(x, u, v) - h(x, u, v)(1 - w)\} = 0$$

with $w^+(0) = 0$

$w^-$ is the unique continuous viscosity solution of the equation

$$\inf_{v \in V} \sup_{u \in U} \{-p \cdot f(x, u, v) - h(x, u, v)(1 - w)\} = 0$$

with $w^-(0) = 0$

Furthermore, the characterizations $D^+ = (w^+)^{-1}([0, 1))$ and $D^- = (w^-)^{-1}([0, 1))$ hold

Example

Consider the example from before

$$\dot x(t) = -x(t) + u(t) v(t) x(t)^3$$

with $x(t) \in \mathbb{R}$, $u(t) \in U = \{-1, 1\}$, $v(t) \in V = \{-1, 1\}$

For $h(x, u, v) = x^2$ we can compute explicitly

$$w^+(x) = \begin{cases} 1 - \sqrt{1 - x^2}, & |x| < 1 \\ 1, & |x| \ge 1 \end{cases} \qquad\qquad w^-(x) = \frac{\sqrt{1 + x^2} - 1}{\sqrt{1 + x^2}}$$

[Figure: plot of $w^-(x)$ over $x \in [-5, 5]$]

This confirms $D^+ = (-1, 1)$ and $D^- = (-\infty, \infty)$
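Since U and V are finite here, both Hamiltonians can be evaluated by plain enumeration, which gives a quick consistency check of the explicit formulas. The sketch below plugs w⁺ and w⁻ and their hand-computed derivatives into H⁺ and H⁻ at a few points where they are differentiable; the derivative formulas, function names, and sample points are assumptions added for this check.

```python
import math

U = V = (-1, 1)

def f(x, u, v):
    return -x + u * v * x**3

def h(x, u, v):
    return x * x

def H_plus(x, w, p):
    """sup over u of inf over v of -p f - h (1 - w), by enumeration."""
    return max(min(-p * f(x, u, v) - h(x, u, v) * (1 - w) for v in V) for u in U)

def H_minus(x, w, p):
    """inf over v of sup over u of the same expression."""
    return min(max(-p * f(x, u, v) - h(x, u, v) * (1 - w) for u in U) for v in V)

def w_plus(x):
    return 1 - math.sqrt(1 - x * x) if abs(x) < 1 else 1.0

def w_minus(x):
    return (math.sqrt(1 + x * x) - 1) / math.sqrt(1 + x * x)

def dw_plus(x):
    """Derivative of w_plus, valid only for |x| < 1."""
    return x / math.sqrt(1 - x * x)

def dw_minus(x):
    """Derivative of w_minus."""
    return x / (1 + x * x) ** 1.5

res_plus = [H_plus(x, w_plus(x), dw_plus(x)) for x in (-0.9, -0.3, 0.5, 0.8)]
res_minus = [H_minus(x, w_minus(x), dw_minus(x)) for x in (-4.0, -0.3, 0.5, 3.0)]
print(max(abs(r) for r in res_plus), max(abs(r) for r in res_minus))
```

Both maxima should be at the level of floating point round-off. The check covers only points of differentiability; at |x| = 1 the function w⁺ is nonsmooth and the viscosity conditions are what matter.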

When does $D^+ = D^-$ hold?

When does playing strategies yield no advantage?

The level set characterization implies that $D^+ = D^-$ holds if there exists $h$ such that $w^+ = w^-$ holds.

The existence and uniqueness theorem implies that $w^+ = w^-$ holds if $H^+ = H^-$ holds. Thus: $H^+ = H^- \Rightarrow D^+ = D^-$

Recall:

$$H^+(x, w, p) = \sup_{u \in U} \inf_{v \in V} \{-p \cdot f(x, u, v) - h(x, u, v)(1 - w)\}$$

$$H^-(x, w, p) = \inf_{v \in V} \sup_{u \in U} \{-p \cdot f(x, u, v) - h(x, u, v)(1 - w)\}$$

For the special case of $h(x, u, v) = h(x)$ we get

$$H^+(x, w, p) = H^-(x, w, p) \;\Leftrightarrow\; \sup_{u \in U} \inf_{v \in V} \{-p \cdot f(x, u, v)\} = \inf_{v \in V} \sup_{u \in U} \{-p \cdot f(x, u, v)\}$$

This condition (for all $p \in \mathbb{R}^d$) is known as Isaacs’ condition

Theorem: Isaacs’ condition implies $D^+ = D^-$

This theorem extends a well known result from capture basins in finite time pursuit evasion games to domains of controllability of asymptotically controllable sets

Example

In our example

$$\dot x(t) = -x(t) + u(t) v(t) x(t)^3$$

with $x(t) \in \mathbb{R}$, $u(t) \in U = \{-1, 1\}$, $v(t) \in V = \{-1, 1\}$ we have $D^+ = (-1, 1) \ne (-\infty, \infty) = D^-$

$\Rightarrow$ Isaacs’ condition must be violated

Indeed, for $p = 1$ and $x = 1$ we have

$$p \cdot f(x, u, v) = -1 + uv$$

and thus

$$\sup_{u \in U} \inf_{v \in V} \{-p \cdot f(x, u, v)\} = \sup_{u \in U} \inf_{v \in V} \{1 - uv\} = 0$$

$$\inf_{v \in V} \sup_{u \in U} \{-p \cdot f(x, u, v)\} = \inf_{v \in V} \sup_{u \in U} \{1 - uv\} = 2$$
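With two-element sets U and V the two sides of Isaacs’ condition can be enumerated directly. The following minimal sketch reproduces the values 0 and 2 for p = 1, x = 1; the function names `sup_inf`, `inf_sup`, and `isaacs_gap` are illustrative.

```python
U = V = (-1, 1)

def f(x, u, v):
    return -x + u * v * x**3

def sup_inf(x, p):
    """sup over u of inf over v of -p f(x, u, v)."""
    return max(min(-p * f(x, u, v) for v in V) for u in U)

def inf_sup(x, p):
    """inf over v of sup over u of -p f(x, u, v)."""
    return min(max(-p * f(x, u, v) for u in U) for v in V)

def isaacs_gap(x, p):
    """Zero exactly when Isaacs' condition holds at (x, p)."""
    return inf_sup(x, p) - sup_inf(x, p)

print(sup_inf(1, 1), inf_sup(1, 1))  # 0 2
print(isaacs_gap(0, 1))              # 0
```

The gap vanishes only where p·x³ = 0, so the condition fails at every other pair (x, p), consistent with the two domains being different.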

Conclusions

Zubov’s method gives a characterization of the domain of asymptotic controllability via the level set of a solution of a first order PDE

Using viscosity solutions, the method can be extended to a differential game setting

Upper and lower values w+ and w− and the respective domains of controllability D+ and D− are defined and analyzed separately

Under the well known Isaacs condition the upper and lower domains of controllability D+ and D− coincide
