Systems Optimization
Winter Semester 2015

Preliminaries
1.3. Gradients, Hessian, Convex Sets, Convex Functions

Dr. Abebe Geletu

Technische Universität Ilmenau
Institute of Automation and Systems Engineering
Department of Simulation and Optimal Processes (SOP)

Systems Optimization Winter Semester 2015 Preliminaries 1.3. Gradients, Hessian, Convex Sets, Convex Functions

TU Ilmenau

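1.3.1. Derivatives

Scalar-valued functions of a single variable:
• Let $f : \mathbb{R} \to \mathbb{R}$ be a scalar-valued function of a single variable.
• The derivative of $f$ is

$$\frac{df(x)}{dx} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}.$$

An equivalent notation is $f'(x)$.

• Second derivative: $\frac{d^2 f(x)}{dx^2} = \frac{d}{dx}\left(\frac{df(x)}{dx}\right) = f''(x)$.

Convention for derivatives of time-dependent functions: when $x(t)$ is a time-dependent variable, the dot notation is used,

$$\dot{x}(t) = \lim_{\Delta t \to 0} \frac{x(t + \Delta t) - x(t)}{\Delta t} = \frac{dx(t)}{dt}.$$

Similarly, $\ddot{x}(t) = \frac{d^2 x(t)}{dt^2}$.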
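The limit definition suggests a simple numerical approximation: evaluate the difference quotient for a small but finite step. The sketch below is only an illustration; the test function $x^3$, the step sizes, and the helper names are assumptions and do not come from the slides.

```python
def forward_difference(f, x, h=1e-6):
    """Approximate f'(x) by the difference quotient (f(x + h) - f(x)) / h."""
    return (f(x + h) - f(x)) / h


def second_difference(f, x, h=1e-4):
    """Approximate f''(x) by the central second difference."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def cube(x):
    # Illustrative test function (an assumption): f(x) = x^3,
    # so f'(x) = 3x^2 and f''(x) = 6x.
    return x ** 3


d1 = forward_difference(cube, 2.0)  # close to 3 * 2^2 = 12
d2 = second_difference(cube, 2.0)   # close to 6 * 2 = 12
print(d1, d2)
```

The step must not be made arbitrarily small: below a certain size, rounding errors in $f(x + h) - f(x)$ dominate the result.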

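1.3.2. Partial Derivatives and Gradient

Scalar-valued functions of several variables: let $f : \mathbb{R}^n \to \mathbb{R}$ be a function of several variables.

The partial derivative of $f$ with respect to the $i$-th variable is

$$\frac{\partial f(x_1, \ldots, x_n)}{\partial x_i} = \lim_{h \to 0} \frac{f(x_1, \ldots, x_i + h, \ldots, x_n) - f(x_1, \ldots, x_i, \ldots, x_n)}{h},$$

written more compactly as $\frac{\partial f}{\partial x_i}(x)$.

• The gradient of $f$ at the point $x$ is the vector defined as

$$\nabla f(x) = \begin{pmatrix} \frac{\partial f}{\partial x_1}(x) \\ \frac{\partial f}{\partial x_2}(x) \\ \vdots \\ \frac{\partial f}{\partial x_n}(x) \end{pmatrix}.$$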

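1.3.2. Partial Derivatives and Gradient (continued)

Examples:

(i) For $f(R) = \frac{400 R_1}{(R_1 + R_2)^2}$, find the gradient vector at the point $R = (0, 1)$.

$$\nabla f(R) = \begin{pmatrix} \frac{\partial f(R)}{\partial R_1} \\ \frac{\partial f(R)}{\partial R_2} \end{pmatrix} = \begin{pmatrix} \frac{400 (R_2 - R_1)}{(R_1 + R_2)^3} \\ \frac{-800 R_1}{(R_1 + R_2)^3} \end{pmatrix} \;\Rightarrow\; \nabla f((0, 1)) = \begin{pmatrix} 400 \\ 0 \end{pmatrix}.$$

(ii) For $f(x) = x_1 x_2 + 2 x_2 x_3 + 2 x_1 x_3$, find the gradient vector.

$$\nabla f(x) = \begin{pmatrix} \frac{\partial f(x)}{\partial x_1} \\ \frac{\partial f(x)}{\partial x_2} \\ \frac{\partial f(x)}{\partial x_3} \end{pmatrix} = \begin{pmatrix} x_2 + 2 x_3 \\ x_1 + 2 x_3 \\ 2 x_1 + 2 x_2 \end{pmatrix}.$$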
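Both gradients can be cross-checked numerically by replacing each partial derivative with a central difference quotient. This is a minimal sketch under stated assumptions: the helper name `numerical_gradient`, the step size, and the test point $(1, 2, 3)$ are chosen here for illustration only.

```python
def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x (a list)."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2.0 * h))
    return grad


def f_i(R):
    # Example (i): f(R) = 400 R1 / (R1 + R2)^2
    return 400.0 * R[0] / (R[0] + R[1]) ** 2


def f_ii(x):
    # Example (ii): f(x) = x1 x2 + 2 x2 x3 + 2 x1 x3
    return x[0] * x[1] + 2.0 * x[1] * x[2] + 2.0 * x[0] * x[2]


grad_i = numerical_gradient(f_i, [0.0, 1.0])

point = [1.0, 2.0, 3.0]  # arbitrary test point (an assumption)
grad_ii = numerical_gradient(f_ii, point)
# Analytic gradient of example (ii) evaluated at the same point.
expected_ii = [point[1] + 2.0 * point[2],
               point[0] + 2.0 * point[2],
               2.0 * point[0] + 2.0 * point[1]]
print(grad_i, grad_ii, expected_ii)
```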

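1.3.2. Partial Derivatives and Gradient (continued)

In fact, the function $f(x) = x_1 x_2 + 2 x_2 x_3 + 2 x_1 x_3$ can be written as

$$f(x) = \frac{1}{2} (x_1, x_2, x_3) \underbrace{\begin{pmatrix} 0 & 1 & 2 \\ 1 & 0 & 2 \\ 2 & 2 & 0 \end{pmatrix}}_{=:Q} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \frac{1}{2} x^\top Q x,$$

which is a quadratic function.

• In general, for a quadratic function

$$f(x) = \frac{1}{2} x^\top Q x + q^\top x$$

with a symmetric matrix $Q$, the gradient is the vector $\nabla f(x) = Qx + q$.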
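The formula for the gradient of a quadratic function can be verified componentwise. The following short derivation is a supplementary sketch; the entry notation $Q_{jk}$, $q_j$ is introduced here for illustration, and $Q$ is assumed to be symmetric.

```latex
\[
f(x) = \frac{1}{2} \sum_{j=1}^{n} \sum_{k=1}^{n} Q_{jk} x_j x_k + \sum_{j=1}^{n} q_j x_j
\]
\[
\frac{\partial f}{\partial x_i}(x)
= \frac{1}{2} \sum_{k=1}^{n} Q_{ik} x_k + \frac{1}{2} \sum_{j=1}^{n} Q_{ji} x_j + q_i
= \sum_{k=1}^{n} Q_{ik} x_k + q_i
= (Qx + q)_i ,
\]
```

where the second equality uses $Q_{ji} = Q_{ij}$. For the matrix $Q$ of the example, $Qx = (x_2 + 2x_3,\; x_1 + 2x_3,\; 2x_1 + 2x_2)^\top$, which agrees with the gradient computed directly.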

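1.3.3. The Hessian Matrix

The Hessian matrix of a function of several variables $f : \mathbb{R}^n \to \mathbb{R}$ is

$$H(x) = \begin{pmatrix}
\frac{\partial^2 f(x)}{\partial x_1^2} & \frac{\partial^2 f(x)}{\partial x_2 \partial x_1} & \cdots & \frac{\partial^2 f(x)}{\partial x_n \partial x_1} \\
\frac{\partial^2 f(x)}{\partial x_1 \partial x_2} & \frac{\partial^2 f(x)}{\partial x_2^2} & \cdots & \frac{\partial^2 f(x)}{\partial x_n \partial x_2} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial^2 f(x)}{\partial x_1 \partial x_n} & \frac{\partial^2 f(x)}{\partial x_2 \partial x_n} & \cdots & \frac{\partial^2 f(x)}{\partial x_n^2}
\end{pmatrix}.$$

• The Hessian $H(x)$ is a symmetric matrix whenever the second partial derivatives of $f$ are continuous, which is the case for most functions met in practice.

Example:
(a) For the function $f(R) = \frac{400 R_1}{(R_1 + R_2)^2}$, the Hessian matrix is

$$H(R) = \frac{800}{(R_1 + R_2)^4} \begin{bmatrix} R_1 - 2R_2 & 2R_1 - R_2 \\ 2R_1 - R_2 & 3R_1 \end{bmatrix}.$$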
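Second partial derivatives are easy to get wrong by a sign, so a numerical cross-check is worthwhile. The sketch below approximates the Hessian of $f(R) = 400R_1/(R_1+R_2)^2$ by finite differences and compares it with second partials obtained by differentiating the gradient components by hand; in particular $\partial^2 f/\partial R_2^2 = 2400 R_1/(R_1+R_2)^4$. The helper names, the step size, and the test point $(1, 1)$ are assumptions made for illustration.

```python
def f(R):
    return 400.0 * R[0] / (R[0] + R[1]) ** 2


def numerical_hessian(func, x, h=1e-4):
    """Finite-difference approximation of the Hessian of func at x."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                # Evaluate func with x_i shifted by si*h and x_j by sj*h.
                y = list(x)
                y[i] += si * h
                y[j] += sj * h
                return func(y)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4.0 * h * h)
    return H


def analytic_hessian(R):
    """Second partial derivatives of f, derived by hand from the gradient."""
    s4 = (R[0] + R[1]) ** 4
    h11 = 800.0 * (R[0] - 2.0 * R[1]) / s4
    h12 = 800.0 * (2.0 * R[0] - R[1]) / s4
    h22 = 2400.0 * R[0] / s4
    return [[h11, h12], [h12, h22]]


R0 = [1.0, 1.0]  # arbitrary test point (an assumption)
H_num = numerical_hessian(f, R0)
H_exact = analytic_hessian(R0)
print(H_num)
print(H_exact)
```

At the test point the analytic values are $-50$, $50$ and $150$, and the finite-difference matrix reproduces them up to discretization error.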

1.3.3. The Hessian Matrix (continued)

Example:
(b) The Hessian matrix of the function $f(x) = x_1 x_2 + 2 x_2 x_3 + 2 x_1 x_3$ is

$$H(x) = \begin{pmatrix} 0 & 1 & 2 \\ 1 & 0 & 2 \\ 2 & 2 & 0 \end{pmatrix}.$$

• In general, for a quadratic function $f(x) = \frac{1}{2} x^\top Q x + q^\top x$ with a symmetric matrix $Q$, the Hessian matrix is

$$H(x) = Q,$$

i.e., for a quadratic function the Hessian is a constant matrix.

1.3.4. Vector-Valued Functions

Vector-valued functions of a single variable: several time-dependent variables can be written compactly in vector form as $x(t)$, where

$$x(t) = \begin{pmatrix} x_1(t) \\ x_2(t) \\ \vdots \\ x_n(t) \end{pmatrix}.$$

Example (recall the water reservoirs control problem): the unknowns can be written in vector form as

State variables: $V(t) = \begin{pmatrix} V_1(t) \\ V_2(t) \end{pmatrix}$, control variables: $w(t) = \begin{pmatrix} q(t) \\ u(t) \end{pmatrix}$.

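1.3.4. Vector-Valued Functions (continued): functions of a single variable

The derivative $\dot{x}(t)$ is the vector

$$\dot{x}(t) = \begin{pmatrix} \dot{x}_1(t) \\ \dot{x}_2(t) \\ \vdots \\ \dot{x}_n(t) \end{pmatrix}.$$

Similarly, the second time derivative $\ddot{x}(t)$ is given by

$$\ddot{x}(t) = \begin{pmatrix} \ddot{x}_1(t) \\ \ddot{x}_2(t) \\ \vdots \\ \ddot{x}_n(t) \end{pmatrix}.$$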

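1.3.5. The Jacobian Matrix

Vector-valued functions of several variables: consider a function $F : \mathbb{R}^n \to \mathbb{R}^m$, where

$$F(x) = \begin{pmatrix} F_1(x) \\ F_2(x) \\ \vdots \\ F_m(x) \end{pmatrix}, \quad \text{where } x = (x_1, x_2, \ldots, x_n).$$

The Jacobian matrix of $F$ at a point $x$ is the $m \times n$ matrix defined as

$$J(x) = \begin{pmatrix} (\nabla F_1(x))^\top \\ (\nabla F_2(x))^\top \\ \vdots \\ (\nabla F_m(x))^\top \end{pmatrix}.$$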

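1.3.5. The Jacobian Matrix (continued): Example

Example: Find the Jacobian of

$$F(x) = \begin{pmatrix} x_2 + 2 x_3 \\ x_1 + 2 x_3 \\ 2 x_1 + 2 x_2 \end{pmatrix}.$$

Here $F_1(x) = x_2 + 2 x_3$, $F_2(x) = x_1 + 2 x_3$, $F_3(x) = 2 x_1 + 2 x_2$, so

$$\nabla F_1(x) = \begin{pmatrix} 0 \\ 1 \\ 2 \end{pmatrix}, \quad \nabla F_2(x) = \begin{pmatrix} 1 \\ 0 \\ 2 \end{pmatrix}, \quad \nabla F_3(x) = \begin{pmatrix} 2 \\ 2 \\ 0 \end{pmatrix}.$$

Hence,

$$J(x) = \begin{pmatrix} (\nabla F_1(x))^\top \\ (\nabla F_2(x))^\top \\ (\nabla F_3(x))^\top \end{pmatrix} = \begin{pmatrix} 0 & 1 & 2 \\ 1 & 0 & 2 \\ 2 & 2 & 0 \end{pmatrix}.$$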
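The row-by-row construction of the Jacobian can be mirrored numerically: column $j$ of $J(x)$ collects the derivatives of all components with respect to $x_j$. The following is a minimal sketch; the function name `numerical_jacobian`, the step size, and the evaluation point are assumptions for illustration.

```python
def F(x):
    # The vector-valued function of the example.
    return [x[1] + 2.0 * x[2],
            x[0] + 2.0 * x[2],
            2.0 * x[0] + 2.0 * x[1]]


def numerical_jacobian(func, x, h=1e-6):
    """Central-difference approximation of the m x n Jacobian of func at x."""
    m = len(func(x))
    n = len(x)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        forward = list(x)
        backward = list(x)
        forward[j] += h
        backward[j] -= h
        f_forward = func(forward)
        f_backward = func(backward)
        for i in range(m):
            J[i][j] = (f_forward[i] - f_backward[i]) / (2.0 * h)
    return J


# F is linear, so the Jacobian is the same at every point;
# the point below is arbitrary (an assumption).
J = numerical_jacobian(F, [0.5, -1.0, 2.0])
for row in J:
    print([round(value, 6) for value in row])
```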

1.3.5. The Jacobian Matrix (continued): Exercise

• Note that, for a scalar-valued function $f : \mathbb{R}^n \to \mathbb{R}$, the Jacobian of the gradient $F(x) = \nabla f(x)$ is equal to the Hessian $H(x)$ of $f$.

Exercise:

(i) Determine the gradient and the Hessian of the function

$$f(x) = 100 (x_2 - x_1^2)^2 + (1 - x_1)^2.$$

(ii) Determine the Jacobian of

$$F(R) = \begin{pmatrix} \frac{400 (R_2 - R_1)}{(R_1 + R_2)^3} \\ \frac{-800 R_1}{(R_1 + R_2)^3} \end{pmatrix}.$$

1.3.6. Taylor Series Approximation

A smooth function $f : \mathbb{R}^n \to \mathbb{R}$ can be represented near a given point $x$ by an infinite sum:

$$f(x + \alpha d) = f(x) + \alpha d^\top \nabla f(x) + \frac{1}{2} \alpha^2 d^\top H(x) d + \ldots$$

for a very small number $\alpha > 0$ and a direction vector $d$.

First-order Taylor approximation near $x$:

$$f(x + \alpha d) \approx f(x) + \alpha d^\top \nabla f(x)$$

is a linear approximation of $f$ near $x$.

Second-order Taylor approximation near $x$:

$$f(x + \alpha d) \approx f(x) + \alpha d^\top \nabla f(x) + \frac{1}{2} \alpha^2 d^\top H(x) d$$

provides a quadratic approximation of $f$ near $x$.

• Most optimization algorithms are based on either first- or second-order Taylor approximations.
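The quality of the two approximations can be compared on the quadratic example $f(x) = x_1x_2 + 2x_2x_3 + 2x_1x_3$, whose gradient is $Qx$ and whose Hessian is $Q$. For a quadratic function the second-order approximation is exact, while the first-order error shrinks like $\alpha^2$. The point $x$, the direction $d$, and the helper names below are assumptions chosen for illustration.

```python
Q = [[0.0, 1.0, 2.0],
     [1.0, 0.0, 2.0],
     [2.0, 2.0, 0.0]]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mat_vec(A, v):
    return [dot(row, v) for row in A]


def f(x):
    return x[0] * x[1] + 2.0 * x[1] * x[2] + 2.0 * x[0] * x[2]


def gradient(x):
    # For f(x) = 0.5 * x^T Q x the gradient is Q x.
    return mat_vec(Q, x)


def taylor_first_order(x, d, alpha):
    return f(x) + alpha * dot(d, gradient(x))


def taylor_second_order(x, d, alpha):
    return taylor_first_order(x, d, alpha) + 0.5 * alpha ** 2 * dot(d, mat_vec(Q, d))


x = [1.0, 2.0, 3.0]    # expansion point (an assumption)
d = [1.0, -1.0, 0.5]   # direction (an assumption)
errors = []
for alpha in (0.1, 0.01):
    exact = f([xi + alpha * di for xi, di in zip(x, d)])
    err1 = abs(exact - taylor_first_order(x, d, alpha))
    err2 = abs(exact - taylor_second_order(x, d, alpha))
    errors.append((alpha, err1, err2))
    print(alpha, err1, err2)
```

Reducing $\alpha$ by a factor of 10 reduces the first-order error by a factor of about 100, which is the expected $O(\alpha^2)$ behaviour.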

1.3.7. Some Properties of the Gradient Vector

P1: A function $f : \mathbb{R}^n \to \mathbb{R}$ increases fastest in the direction of the gradient vector $\nabla f(x)$; i.e.,
⇒ $\nabla f(x)$ is the steepest ascent direction of $f$ from the point $x$;
⇒ $-\nabla f(x)$ is the steepest descent direction of $f$ from the point $x$.

Note that: for a given point $x$, the first-order Taylor approximation gives

$$f(x + \alpha d) \approx f(x) + \alpha d^\top \nabla f(x), \quad \alpha > 0.$$

If we choose the direction vector $d = \nabla f(x)$, then

$$f(x + \alpha d) \approx f(x) + \alpha \nabla f(x)^\top \nabla f(x) = f(x) + \alpha \underbrace{\|\nabla f(x)\|^2}_{\geq 0} \geq f(x).$$

This implies that, for small $\alpha$, $f$ increases in the direction of $d = \nabla f(x)$.
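The computation shows that $f$ increases along $\nabla f(x)$, but not yet that this is the direction of fastest increase. A short supplementary argument (not on the original slide), assuming $\nabla f(x) \neq 0$ and comparing only directions of unit length, uses the Cauchy-Schwarz inequality:

```latex
\[
\text{for every } d \text{ with } \|d\| = 1: \qquad
d^\top \nabla f(x) \;\le\; \|d\| \, \|\nabla f(x)\| \;=\; \|\nabla f(x)\|,
\]
\[
\text{with equality for } d = \frac{\nabla f(x)}{\|\nabla f(x)\|}.
\]
```

So the first-order rate of change $d^\top \nabla f(x)$ is largest for the normalized gradient, and by the same argument smallest for $d = -\nabla f(x)/\|\nabla f(x)\|$.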

1.3.7. Some Properties of the Gradient Vector (continued)

P2: The gradient vector $\nabla f(x)$ is orthogonal to the contour line of $f$ through $x$.
⇒ For a given fixed number $k$, $\nabla f(x)$ is orthogonal to the curve (surface) $\{x \mid f(x) = k\}$ at each of its points $x$.
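One way to see P2 is to follow a curve that stays inside the contour set. The sketch below assumes a differentiable curve $x(t)$ with $f(x(t)) = k$ for all $t$; the curve is hypothetical notation introduced here and does not appear on the slides.

```latex
\[
f(x(t)) = k \ \text{ for all } t
\quad \Longrightarrow \quad
0 = \frac{d}{dt} f(x(t)) = \nabla f(x(t))^\top \dot{x}(t).
\]
```

Since $\dot{x}(t)$ is tangent to the contour set, the gradient is orthogonal to every such tangent direction.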

1.4.1. Convex Sets

A. Convex Set
A set $S \subset \mathbb{R}^n$ is said to be a convex set if, for any $x_1, x_2 \in S$ and any $\lambda \in [0, 1]$, we have

$$\lambda x_1 + (1 - \lambda) x_2 \in S.$$

• For a set $S$ to be convex, the line segment joining any two points $x_1, x_2$ in $S$ must be completely contained in $S$.

Figure: Convex and non-convex sets

• The set $S = \{x \in \mathbb{R}^n \mid Ax \le a, \; Bx = b\}$ is a convex set, where $A, B$ are matrices and $a, b$ are vectors.
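The convexity of the set defined by linear inequalities and equations follows directly from the definition. As a brief supplementary check, take $x_1, x_2 \in S$ and $\lambda \in [0, 1]$; because $\lambda \ge 0$ and $1 - \lambda \ge 0$, the inequalities may be scaled and added:

```latex
\[
A\bigl(\lambda x_1 + (1 - \lambda) x_2\bigr)
= \lambda A x_1 + (1 - \lambda) A x_2
\le \lambda a + (1 - \lambda) a = a,
\]
\[
B\bigl(\lambda x_1 + (1 - \lambda) x_2\bigr)
= \lambda B x_1 + (1 - \lambda) B x_2
= \lambda b + (1 - \lambda) b = b.
\]
```

Hence $\lambda x_1 + (1 - \lambda) x_2$ satisfies both conditions and belongs to $S$.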

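1.4.1. Convex Sets (continued)

Figure (not reproduced): the sets $S_1$ and $S_2$, where

$$S_1 = \{x \mid g_i(x) \le 0, \; i = 1, 2, 3\}$$

and

$$S_2 = \{x \mid a \le \|x\|_2 \le b\}, \quad \text{where } 0 < a < b.$$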

1.4.2. Convex Functions

Definition (Convex Functions)
A function $f : \mathbb{R}^n \to \mathbb{R}$ is said to be a convex function if, for any $x_1, x_2 \in \mathbb{R}^n$ and any $\lambda \in [0, 1]$, we have

$$f(\lambda x_1 + (1 - \lambda) x_2) \le \lambda f(x_1) + (1 - \lambda) f(x_2).$$

• A segment connecting any two points on the graph of $f$ lies on or above the graph of $f$.

• A function $f$ is strictly convex if, for any $x_1, x_2 \in \mathbb{R}^n$ with $x_1 \neq x_2$ and any $\lambda \in (0, 1)$, we have

$$f(\lambda x_1 + (1 - \lambda) x_2) < \lambda f(x_1) + (1 - \lambda) f(x_2).$$
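The defining inequality can be tested numerically along a single segment. Such a test can only disprove convexity (one violated inequality suffices); passing it on one segment proves nothing. The sketch below is an illustration with assumed helper names, sample points, and tolerance; the two test functions are $x_1^2 + x_2^2$ and $x_1x_2 + 2x_2x_3 + 2x_1x_3$.

```python
def satisfies_convexity_inequality(f, x1, x2, steps=11, tol=1e-12):
    """Check f(l*x1 + (1-l)*x2) <= l*f(x1) + (1-l)*f(x2) on a grid of l in [0, 1]."""
    for k in range(steps):
        lam = k / (steps - 1)
        point = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
        if f(point) > lam * f(x1) + (1.0 - lam) * f(x2) + tol:
            return False
    return True


def sum_of_squares(x):
    return x[0] ** 2 + x[1] ** 2


def mixed_products(x):
    return x[0] * x[1] + 2.0 * x[1] * x[2] + 2.0 * x[0] * x[2]


# Sample end points (assumptions chosen for illustration).
ok_squares = satisfies_convexity_inequality(sum_of_squares, [1.0, -2.0], [-3.0, 0.5])
ok_mixed = satisfies_convexity_inequality(mixed_products, [1.0, -1.0, 0.0], [-1.0, 1.0, 0.0])
print(ok_squares, ok_mixed)
```

For the second function both end points have the value $-1$, while the midpoint (the origin) has the value $0$, so the inequality fails and the function cannot be convex.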

1.4.2. Convex Functions (continued)

Examples:

(i) The following are convex functions: $f_1(x) = x^2$, $f_2(x) = x^4$, $f_3(x) = e^x$, $f_4(x_1, x_2) = x_1^2 + x_2^2$.

(ii) The functions $f_1(x) = x^2$, $f_3(x) = e^x$, $f_4(x_1, x_2) = x_1^2 + x_2^2$ are strictly convex.

Some properties

• If $f$ is a convex function and $\alpha \ge 0$, then $\alpha f(x)$ is a convex function.

• If $f$ and $g$ are convex functions, then their sum $f(x) + g(x)$ is also a convex function.

• If $f$ is a convex function and $h$ is a convex and non-decreasing function of one variable, then the composition $h(f(x))$ is also a convex function.

(iii) The function $f(x) = \sqrt{e^{x_1^2 + x_2^2}} + x_1^4$ is a convex function. This follows from the properties above, since $\sqrt{e^{u}} = e^{u/2}$ is convex and non-decreasing in $u = x_1^2 + x_2^2$.

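1.4.2. Convex Functions (continued)

Question: Is there a simple method to verify whether a function is convex or not?

(iv) Consider the function $f(x) = \frac{1}{2} (x_1 - 2)^2 + x_2^2 - 5$, which can also be written as

$$f(x) = \frac{1}{2} (x_1, x_2) \underbrace{\begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}}_{=Q} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \underbrace{(-2, 0)}_{=q^\top} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \underbrace{(-3)}_{=b} = \frac{1}{2} x^\top Q x + q^\top x + b.$$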
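The quadratic, linear, and constant parts can be read off by expanding the square. The following worked step is a supplementary check, using $Q = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}$ so that $\frac{1}{2} x^\top Q x = \frac{1}{2} x_1^2 + x_2^2$:

```latex
\[
\frac{1}{2}(x_1 - 2)^2 + x_2^2 - 5
= \frac{1}{2} x_1^2 - 2 x_1 + 2 + x_2^2 - 5
= \underbrace{\frac{1}{2} x_1^2 + x_2^2}_{\frac{1}{2} x^\top Q x}
\; \underbrace{- \, 2 x_1}_{q^\top x}
\; \underbrace{- \, 3}_{b}.
\]
```

The linear term is therefore $q^\top x = -2x_1$, i.e. $q^\top = (-2, 0)$, and the constant is $b = -3$.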

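1.4.2. Convex Functions (continued)

function my2DPlot2
% Mesh plot with contour lines of f(x1,x2) = 0.5*(x1-2)^2 + x2^2 - 5.
x = -10:0.1:10; y = -10:0.1:10;   % grid points along each axis
[X,Y] = meshgrid(x,y);            % 2-D grid of (x1,x2) pairs
Z = 0.5*(X - 2).^2 + Y.^2 - 5;    % function values on the grid
meshc(X,Y,Z)                      % mesh surface with contours underneath
xlabel('x - axis');
ylabel('y - axis');
hold on
% Optional marker (disabled in the original script):
%plot3(2,0,0,'sk','markerfacecolor',[0,0,0]);
title('Plot for the function f(x_{1},x_{2})=0.5(x_{1}-2)^{2}+ x_{2}^{2} - 5')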

1.4.2. Convex Functions (continued)

Checking convexity

Let $f$ be a twice differentiable function. Then

• $f$ is a convex function if and only if the Hessian $H(x)$ is a positive semi-definite matrix for every $x$.

• $f$ is a strictly convex function if the Hessian $H(x)$ is a positive definite matrix for every $x$. The converse does not hold in general: $f(x) = x^4$ is strictly convex although $f''(0) = 0$.

Example: The function $f(x) = \frac{1}{2} (x_1 - 2)^2 + x_2^2 - 5$ is strictly convex, since

$$H(x) = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}$$

is positive definite.

• A quadratic function $f(x) = \frac{1}{2} x^\top Q x + q^\top x + b$ with a symmetric matrix $Q$ is convex if and only if $Q$ is positive semi-definite.

Recall that a symmetric matrix $Q$ is positive semi-definite if and only if all its eigenvalues are non-negative.
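For a symmetric $2 \times 2$ matrix the eigenvalues have a closed form, so the definiteness test can be carried out by hand or with a few lines of code. The sketch below is an illustration only; the function names and the tolerance are assumptions, and the second matrix is an additional indefinite example.

```python
import math


def eigenvalues_symmetric_2x2(a, b, c):
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]], smallest first."""
    mean = 0.5 * (a + c)
    radius = math.sqrt((0.5 * (a - c)) ** 2 + b * b)
    return mean - radius, mean + radius


def classify(a, b, c, tol=1e-12):
    """Classify [[a, b], [b, c]] by the sign of its smallest eigenvalue."""
    smallest, _ = eigenvalues_symmetric_2x2(a, b, c)
    if smallest > tol:
        return "positive definite"
    if smallest >= -tol:
        return "positive semi-definite"
    return "not positive semi-definite"


# Hessian of f(x) = 0.5*(x1 - 2)^2 + x2^2 - 5.
print(eigenvalues_symmetric_2x2(1.0, 0.0, 2.0), classify(1.0, 0.0, 2.0))
# An indefinite matrix [[0, 1], [1, 0]] for comparison.
print(eigenvalues_symmetric_2x2(0.0, 1.0, 0.0), classify(0.0, 1.0, 0.0))
```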

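1.4.2. Convex Functions (continued)

Methods to check convexity

• A twice differentiable function $f(x)$ is convex if all the eigenvalues of the Hessian matrix $H(x)$ are non-negative for every $x$.

• (Sylvester's criterion): A symmetric matrix is positive semi-definite if and only if all its principal minors are non-negative.

For an $n \times n$ matrix, a principal minor is the determinant of a $k \times k$ principal sub-matrix (obtained by deleting the same set of rows and columns), where $k = 1, \ldots, n$.

Example: The function

$$f(x) = x_1 x_2 + 2 x_2 x_3 + 2 x_1 x_3$$

is not convex. Note that $f$ has the Hessian matrix

$$H(x) = \begin{pmatrix} 0 & 1 & 2 \\ 1 & 0 & 2 \\ 2 & 2 & 0 \end{pmatrix}.$$

Then

$$\det(0) = 0, \quad \det \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} = -1, \quad \det \begin{pmatrix} 0 & 1 & 2 \\ 1 & 0 & 2 \\ 2 & 2 & 0 \end{pmatrix} = 8.$$

Since one of the principal minors is negative, $H(x)$ is not positive semi-definite.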
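The criterion requires all principal minors, not only the leading ones, so for larger matrices it helps to enumerate them systematically. The sketch below does this for the $3 \times 3$ Hessian of the example; the function names are assumptions, and the cofactor expansion is only sensible for small matrices.

```python
from itertools import combinations


def determinant(M):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * determinant(minor)
    return total


def principal_minors(M):
    """Map each index subset to the determinant of that principal sub-matrix."""
    n = len(M)
    result = {}
    for k in range(1, n + 1):
        for idx in combinations(range(n), k):
            sub = [[M[i][j] for j in idx] for i in idx]
            result[idx] = determinant(sub)
    return result


H = [[0, 1, 2],
     [1, 0, 2],
     [2, 2, 0]]

minors = principal_minors(H)
is_positive_semidefinite = all(value >= 0 for value in minors.values())
for idx, value in minors.items():
    print(idx, value)
print("positive semi-definite:", is_positive_semidefinite)
```

The keys are zero-based index tuples, so `(0, 1)` refers to the sub-matrix built from the first two rows and columns.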

1.4.2. Convex Functions (continued)

Exercise (a nonlinear spring system):

(i) The applied forces are $F_1 = 0$, $F_2 = 2\,\mathrm{N}$ and the spring constants are $k_1 = k_2 = 1\,\mathrm{N/m}$. Verify the convexity of the potential energy function, which is given by

$$P(x_1, x_2) = \frac{1}{2} k_1 (\Delta L_1)^2 + \frac{1}{2} k_2 (\Delta L_2)^2 - F_1 x_1 - F_2 x_2,$$

where $\Delta L_1 = \sqrt{(x_1 + 10)^2 + (x_2 - 10)^2} - 10\sqrt{2}$ and $\Delta L_2 = \sqrt{(x_1 - 10)^2 + (x_2 - 10)^2} - 10\sqrt{2}$ are the changes in the length of the springs, and $x_1$ and $x_2$ are the x and y shifts, respectively.

1.4.2. Convex Functions (continued): Exercises

(ii) Verify that, if $g_1(x), \ldots, g_m(x)$ are convex functions, then the set

$$S = \{x \mid g_i(x) \le 0, \; i = 1, \ldots, m\}$$

is a convex set.

(iii) (Rosenbrock's function) Verify whether the function

$$f(x) = 100 (x_2 - x_1^2)^2 + (1 - x_1)^2$$

is convex or not.

(iv) The following function represents the weight of a truss structure:

$$f(M_b, M_c) = 8 M_b^{2.5} + 6 M_c^{2.5}.$$

Verify whether this function is convex.
