Systems Optimization, Winter Semester 2015
Preliminaries 1.3. Gradients, Hessian, Convex Sets, Convex Functions
Dr. Abebe Geletu
Technische Universität Ilmenau, Institute of Automation and Systems Engineering
Department of Simulation and Optimal Processes (SOP)
1.3.1 Derivatives
Scalar valued function of a single variable:
• Let f : R → R be a scalar valued function of a single variable.
• The derivative of f is

  df(x)/dx = lim_{Δx→0} [f(x + Δx) − f(x)] / Δx.

  An equivalent notation is f′(x).
• Second derivative: d²f(x)/dx² = d/dx (df(x)/dx) = f′′(x).
Conventions for derivatives of time-dependent functions
When x(t) is a time-dependent variable, then

  ẋ(t) = lim_{Δt→0} [x(t + Δt) − x(t)] / Δt = dx(t)/dt.

Hence, ẍ(t) = d²x(t)/dt².
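The limit definition above can be checked numerically by taking a small but finite Δx; a minimal sketch in Python (the function f and the evaluation point are illustrative choices, not from the slides):

```python
def derivative(f, x, dx=1e-6):
    """Forward-difference approximation of df/dx, mirroring the limit definition."""
    return (f(x + dx) - f(x)) / dx

f = lambda x: x**2          # f'(x) = 2x
print(derivative(f, 3.0))   # ≈ 6.0
```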
1.3.2. Partial Derivatives and Gradient
Scalar valued function of several variables:
Let f : Rⁿ → R be a function of several variables. The partial derivative of f w.r.t. the i-th variable is

  ∂f(x₁, …, xₙ)/∂xᵢ = lim_{h→0} [f(x₁, …, xᵢ + h, …, xₙ) − f(x₁, …, xᵢ, …, xₙ)] / h,

written more compactly as ∂f/∂xᵢ (x).
▶ The gradient of f at the point x is the vector defined as

  ∇f(x) = ( ∂f/∂x₁ (x), ∂f/∂x₂ (x), …, ∂f/∂xₙ (x) )ᵀ
1.3.2. Partial Derivatives and Gradient ...
Examples:
(i) For f(R) = 400R₁/(R₁ + R₂)², find the gradient vector at the point R = (0, 1).

  ∇f(R) = ( ∂f(R)/∂R₁, ∂f(R)/∂R₂ )ᵀ = ( 400(R₂ − R₁)/(R₁ + R₂)³, −800R₁/(R₁ + R₂)³ )ᵀ
  ⇒ ∇f((0, 1)) = (400, 0)ᵀ.
(ii) For f(x) = x₁x₂ + 2x₂x₃ + 2x₁x₃, find the gradient vector.

  ∇f(x) = ( ∂f(x)/∂x₁, ∂f(x)/∂x₂, ∂f(x)/∂x₃ )ᵀ = ( x₂ + 2x₃, x₁ + 2x₃, 2x₁ + 2x₂ )ᵀ
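The gradient in example (ii) can be verified numerically with finite differences; a small Python/NumPy sketch (the test point is an arbitrary choice):

```python
import numpy as np

def grad_fd(f, x, h=1e-6):
    """Forward-difference approximation of the gradient."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x); e[i] = h
        g[i] = (f(x + e) - f(x)) / h
    return g

f = lambda x: x[0]*x[1] + 2*x[1]*x[2] + 2*x[0]*x[2]
x = np.array([1.0, 2.0, 3.0])
print(grad_fd(f, x))                                    # ≈ [8, 7, 6]
print([x[1] + 2*x[2], x[0] + 2*x[2], 2*x[0] + 2*x[1]])  # the analytic gradient
```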
1.3.2. Partial Derivatives and Gradient...
In fact, the function f(x) = x₁x₂ + 2x₂x₃ + 2x₁x₃ can be written as

  f(x) = ½ (x₁, x₂, x₃) [ 0 1 2 ; 1 0 2 ; 2 2 0 ] (x₁, x₂, x₃)ᵀ = ½ xᵀQx,   with Q := [ 0 1 2 ; 1 0 2 ; 2 2 0 ],

which is a quadratic function.
▶ In general, for a quadratic function

  f(x) = ½ xᵀQx + qᵀx

with a symmetric matrix Q, the gradient is equal to the vector ∇f(x) = Qx + q.
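A quick numerical check of the formula ∇f(x) = Qx + q, using the matrix Q from above; the vector q and the test point x are illustrative values assumed for the demo:

```python
import numpy as np

Q = np.array([[0., 1., 2.], [1., 0., 2.], [2., 2., 0.]])  # symmetric Q from the slide
q = np.array([1., -1., 0.])                               # illustrative q

f = lambda x: 0.5 * x @ Q @ x + q @ x

def grad_fd(f, x, h=1e-6):
    """Central-difference gradient approximation."""
    I = np.eye(len(x))
    return np.array([(f(x + h*I[i]) - f(x - h*I[i])) / (2*h) for i in range(len(x))])

x = np.array([1., 2., 3.])
print(grad_fd(f, x))   # matches the analytic gradient below
print(Q @ x + q)
```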
1.3.3. The Hessian Matrix
The Hessian matrix of a function of several variables f : Rⁿ → R is

  H(x) = [ ∂²f(x)/∂x₁²       ∂²f(x)/∂x₂∂x₁   …   ∂²f(x)/∂xₙ∂x₁
           ∂²f(x)/∂x₁∂x₂    ∂²f(x)/∂x₂²      …   ∂²f(x)/∂xₙ∂x₂
           ⋮                 ⋮                ⋱   ⋮
           ∂²f(x)/∂x₁∂xₙ    ∂²f(x)/∂x₂∂xₙ   …   ∂²f(x)/∂xₙ² ]
▶ If f is twice continuously differentiable, the Hessian H(x) is a symmetric matrix.
Example:
(a) For the function f(R) = 400R₁/(R₁ + R₂)², the Hessian matrix is

  H(R) = 800/(R₁ + R₂)⁴ · [ R₁ − 2R₂   2R₁ − R₂ ;  2R₁ − R₂   −3R₁ ]
1.3.3. The Hessian Matrix ... Examples
Example:
(b) The Hessian matrix of the function f(x) = x₁x₂ + 2x₂x₃ + 2x₁x₃ is

  H(x) = [ 0 1 2 ; 1 0 2 ; 2 2 0 ]

▶ In general, for a quadratic function f(x) = ½ xᵀQx + qᵀx with symmetric Q, the Hessian matrix is

  H(x) = Q,

i.e., for a quadratic function the Hessian is a constant matrix.
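This can be probed with a finite-difference Hessian; a Python/NumPy sketch for example (b), where the test point is an arbitrary choice (for a quadratic function the result is the same everywhere):

```python
import numpy as np

def hessian_fd(f, x, h=1e-4):
    """Central-difference approximation of the Hessian matrix."""
    n = len(x); I = np.eye(n); H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            H[i, j] = (f(x + h*I[i] + h*I[j]) - f(x + h*I[i] - h*I[j])
                       - f(x - h*I[i] + h*I[j]) + f(x - h*I[i] - h*I[j])) / (4*h*h)
    return H

f = lambda x: x[0]*x[1] + 2*x[1]*x[2] + 2*x[0]*x[2]
H = hessian_fd(f, np.array([0.5, -1.0, 2.0]))
print(np.round(H, 3))   # the same constant matrix Q at any point x
```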
1.3.4 Vector Valued Functions
Vector valued functions of a single variable:
There are variables that are compactly written in vector form as x(t), where

  x(t) = ( x₁(t), x₂(t), …, xₙ(t) )ᵀ

Example (recall the water reservoirs control problem): the unknowns can be written in vector form as

  state variables: V(t) = ( V₁(t), V₂(t) )ᵀ,   control variables: w(t) = ( q(t), u(t) )ᵀ
1.3.4 Vector Valued Functions ... of a single variable
The derivative ẋ(t) is a vector represented by

  ẋ(t) = ( ẋ₁(t), ẋ₂(t), …, ẋₙ(t) )ᵀ

Similarly, the second time-derivative ẍ(t) is given by

  ẍ(t) = ( ẍ₁(t), ẍ₂(t), …, ẍₙ(t) )ᵀ.
1.3.5. The Jacobian Matrix
Vector valued functions of several variables:
Consider a function F : Rⁿ → Rᵐ, where

  F(x) = ( F₁(x), F₂(x), …, F_m(x) )ᵀ,   where x = (x₁, x₂, …, xₙ).

The Jacobian matrix of F at a point x is defined as

  J(x) = [ (∇F₁(x))ᵀ ; (∇F₂(x))ᵀ ; … ; (∇F_m(x))ᵀ ]
1.3.5. The Jacobian Matrix ... Example
Example: Find the Jacobian of

  F(x) = ( x₂ + 2x₃, x₁ + 2x₃, 2x₁ + 2x₂ )ᵀ.

Here F₁(x) = x₂ + 2x₃, F₂(x) = x₁ + 2x₃, F₃(x) = 2x₁ + 2x₂, so

  ∇F₁(x) = (0, 1, 2)ᵀ,  ∇F₂(x) = (1, 0, 2)ᵀ,  ∇F₃(x) = (2, 2, 0)ᵀ.

Hence,

  J(x) = [ (∇F₁(x))ᵀ ; (∇F₂(x))ᵀ ; (∇F₃(x))ᵀ ] = [ 0 1 2 ; 1 0 2 ; 2 2 0 ]
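The same Jacobian can be recovered numerically, column by column; a Python/NumPy sketch (the evaluation point is an arbitrary choice; for this linear F the Jacobian is constant):

```python
import numpy as np

def jacobian_fd(F, x, h=1e-6):
    """Forward-difference Jacobian: column j holds dF/dx_j."""
    Fx = F(x); n = len(x)
    J = np.zeros((len(Fx), n))
    for j in range(n):
        e = np.zeros(n); e[j] = h
        J[:, j] = (F(x + e) - Fx) / h
    return J

F = lambda x: np.array([x[1] + 2*x[2], x[0] + 2*x[2], 2*x[0] + 2*x[1]])
print(np.round(jacobian_fd(F, np.array([1., 2., 3.])), 3))
# matches the constant Jacobian [[0,1,2],[1,0,2],[2,2,0]]
```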
1.3.5. The Jacobian Matrix ... Exercise
▶ Note that, for a scalar valued function f : Rⁿ → R, the Jacobian of the gradient F(x) = ∇f(x) is equal to the Hessian H(x) of f.
Exercise:
(i) Determine the gradient and the Hessian of the function

  f(x) = 100(x₂ − x₁²)² + (1 − x₁)².

(ii) Determine the Jacobian of

  F(R) = ( 400(R₂ − R₁)/(R₁ + R₂)³, −800R₁/(R₁ + R₂)³ )ᵀ
1.3.6. Taylor Series Approximation
A smooth function f : Rⁿ → R can be represented by an infinite sum near a given point x:

  f(x + αd) = f(x) + α dᵀ∇f(x) + ½ α² dᵀH(x)d + …

for a very small number α > 0.
First-order Taylor approximation near x:

  f(x + αd) ≈ f(x) + α dᵀ∇f(x)

is a linear approximation of f(x) near x.
Second-order Taylor approximation near x:

  f(x + αd) ≈ f(x) + α dᵀ∇f(x) + ½ α² dᵀH(x)d

provides a quadratic approximation of f(x) near x.
▶ Most optimization algorithms are developed based on either first- or second-order Taylor approximations.
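The two approximation orders can be compared numerically; a Python/NumPy sketch (the function f, the point x, the direction d and the step sizes are illustrative choices):

```python
import numpy as np

f = lambda x: np.exp(x[0]) + x[1]**2
grad = lambda x: np.array([np.exp(x[0]), 2*x[1]])
hess = lambda x: np.array([[np.exp(x[0]), 0.], [0., 2.]])

x = np.array([0.0, 1.0]); d = np.array([1.0, -1.0])
for a in [0.1, 0.01]:
    exact = f(x + a*d)
    t1 = f(x) + a * (d @ grad(x))               # first-order model
    t2 = t1 + 0.5 * a**2 * (d @ hess(x) @ d)    # second-order model
    print(a, abs(exact - t1), abs(exact - t2))
# the second-order error shrinks much faster as the step a decreases
```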
1.3.7. Some Properties of the Gradient Vector
P1: A function f : Rⁿ → R increases fastest in the direction of the gradient vector ∇f(x); i.e.,
⇒ ∇f(x) is the steepest ascent direction of f from the point x;
⇒ −∇f(x) is the steepest descent direction of f from the point x.
Note that, for a given point x, using the first-order Taylor approximation

  f(x + αd) ≈ f(x) + α dᵀ∇f(x),   α > 0.

If we choose the direction vector d = ∇f(x), then

  f(x + αd) ≈ f(x) + α ∇f(x)ᵀ∇f(x) = f(x) + α ‖∇f(x)‖² ≥ f(x),

since ‖∇f(x)‖² ≥ 0. This implies that f increases in the direction d = ∇f(x).
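Property P1 can be illustrated numerically: a small step from x along d = ∇f(x) increases f. A Python/NumPy sketch (the function, start point and step size α are illustrative choices):

```python
import numpy as np

f = lambda x: -(x[0] - 1)**2 - 2*(x[1] + 0.5)**2              # illustrative smooth function
grad = lambda x: np.array([-2*(x[0] - 1), -4*(x[1] + 0.5)])   # its gradient

x = np.array([0.0, 0.0])
alpha = 0.1
x_up = x + alpha * grad(x)   # small step along the steepest ascent direction
print(f(x), f(x_up))         # f(x_up) > f(x)
```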
1.3.7. Some Properties of the Gradient Vector ...
P2: The gradient vector ∇f(x) is orthogonal to the contour line of f.
⇒ For a given fixed number k, ∇f(x) is orthogonal to the curve (surface) {x | f(x) = k}.
1.4.1. Convex Sets
A. Convex Set
A set S ⊂ Rⁿ is said to be a convex set if for any x₁, x₂ ∈ S and any λ ∈ [0, 1] we have

  λx₁ + (1 − λ)x₂ ∈ S.

• For a set S to be convex, the line segment joining any two points x₁, x₂ ∈ S must be completely contained in S.

Figure: A convex and a non-convex set

• The set S = {x ∈ Rⁿ | Ax ≤ a, Bx = b} is a convex set, where A, B are matrices and a, b are vectors.
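The definition can be checked numerically for a polyhedron of the form {x | Ax ≤ a}; a Python/NumPy sketch, where the matrix A, the vector a and the two points are illustrative assumptions:

```python
import numpy as np

# Illustrative polyhedron S = {x | Ax <= a}: the triangle with corners (0,0), (1,0), (0,1)
A = np.array([[1., 1.], [-1., 0.], [0., -1.]])
a = np.array([1., 0., 0.])

x1, x2 = np.array([0.2, 0.3]), np.array([0.5, 0.1])   # two points in S
rng = np.random.default_rng(1)
for lam in rng.uniform(size=100):
    z = lam*x1 + (1 - lam)*x2
    assert (A @ z <= a + 1e-12).all()                 # the whole segment stays in S
print("all convex combinations of x1 and x2 lie in S")
```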
1.4.1. Convex Sets ...
whereS1 = {x | gi (x) ≤ 0, i = 1, 2, 3}
andS2 = {x | a ≤ ‖x‖2 ≤ b}, where 0 < a < b.
1.4.2. Convex Functions
Definition (Convex Functions)
A function f : Rⁿ → R is said to be a convex function if for any x₁, x₂ ∈ Rⁿ and any λ ∈ [0, 1] we have

  f(λx₁ + (1 − λ)x₂) ≤ λf(x₁) + (1 − λ)f(x₂).

• A segment connecting any two points on the graph of f lies on or above the graph of f.
▶ A function f is strictly convex if for any x₁ ≠ x₂ in Rⁿ and any λ ∈ (0, 1) we have

  f(λx₁ + (1 − λ)x₂) < λf(x₁) + (1 − λ)f(x₂).
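The defining inequality can be sampled at random points; a Python/NumPy sketch for a known convex function (the choice of f and the number of samples are illustrative):

```python
import numpy as np

f = lambda x: x[0]**2 + x[1]**2   # a known convex function

rng = np.random.default_rng(0)
for _ in range(1000):
    x1, x2 = rng.normal(size=2), rng.normal(size=2)
    lam = rng.uniform()
    lhs = f(lam*x1 + (1 - lam)*x2)
    rhs = lam*f(x1) + (1 - lam)*f(x2)
    assert lhs <= rhs + 1e-12     # the defining inequality holds
print("convexity inequality verified on 1000 random samples")
```

Note that sampling can only refute convexity (by finding a violated inequality), never prove it.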
1.4.2. Convex Functions ...
Examples:
(i) The following are convex functions: f₁(x) = x², f₂(x) = x⁴, f₃(x) = eˣ, f₄(x₁, x₂) = x₁² + x₂².
(ii) The functions f₁(x) = x², f₃(x) = eˣ, f₄(x₁, x₂) = x₁² + x₂² are strictly convex.
Some properties
▶ If f is a convex function and α ≥ 0, then αf(x) is a convex function.
▶ If f and g are convex functions, then their sum f(x) + g(x) is also a convex function.
▶ If f is a convex function and h is a convex and non-decreasing function of one variable, then the composition h(f(x)) is also a convex function.
(iii) The function f(x) = √(e^(x₁² + x₂²) + x₁⁴) is a convex function.
1.4.2. Convex Functions ...
Question: Is there a simple method to verify whether a function is convex or not?
(iv) Consider the function f(x) = ½(x₁ − 2)² + x₂² − 5, which can also be written as

  f(x) = ½ (x₁, x₂) [ 1 0 ; 0 2 ] (x₁, x₂)ᵀ + (−2, 0) (x₁, x₂)ᵀ + (−3) = ½ xᵀQx + qᵀx + b,

with Q := [ 1 0 ; 0 2 ], qᵀ := (−2, 0) and b := −3.
1.4.2. Convex Functions...
function my2DPlot2
% Surface/contour plot of f(x1,x2) = 0.5*(x1-2)^2 + x2^2 - 5
x = -10:0.1:10; y = -10:0.1:10;
[X,Y] = meshgrid(x,y);
Z = 0.5*(X-2).^2 + Y.^2 - 5;
meshc(X,Y,Z)
xlabel('x-axis');
ylabel('y-axis');
hold on
%plot3(2,0,0,'sk','markerfacecolor',[0,0,0]);
title('Plot for the function f(x_{1},x_{2})=0.5(x_{1}-2)^{2}+ x_{2}^{2} - 5')
1.4.2. Convex Functions ...
[Figure: meshc surface/contour plot of f(x₁, x₂) = 0.5(x₁ − 2)² + x₂² − 5]
1.4.2. Convex Functions ...
Checking convexity
Let f be a twice differentiable function. Then:
• f is a convex function if and only if the Hessian H(x) is a positive semi-definite matrix for all x.
• f is a strictly convex function if the Hessian H(x) is a positive definite matrix for all x. (The converse does not hold: f(x) = x⁴ is strictly convex, yet f′′(0) = 0.)
Examples: The function f(x) = ½(x₁ − 2)² + x₂² − 5 is (strictly) convex, since

  H(x) = [ 1 0 ; 0 2 ]

is positive (semi-)definite.
▶ A quadratic function f(x) = ½ xᵀQx + qᵀx + b with symmetric Q is convex if and only if the matrix Q is positive semi-definite.
Recall that a symmetric matrix Q is positive semi-definite if all its eigenvalues are non-negative.
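The eigenvalue test for the matrix Q above takes one line in Python/NumPy:

```python
import numpy as np

Q = np.array([[1., 0.], [0., 2.]])   # Hessian of f(x) = 0.5*(x1-2)^2 + x2^2 - 5
eigs = np.linalg.eigvalsh(Q)         # eigenvalues of a symmetric matrix
print(eigs)                          # [1. 2.]: all positive, so Q is positive definite
```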
1.4.2. Convex Functions ...
Methods to check convexity
▶ A function f(x) is convex if all the eigenvalues of the Hessian matrix H(x) are non-negative, for every x.
▶ (Sylvester's criterion): A symmetric matrix is positive semi-definite if and only if all its principal minors are non-negative.
(For an n × n matrix, a principal minor is the determinant of a k × k principal sub-matrix, where k = 1, …, n.)
Example: The function

  f(x) = x₁x₂ + 2x₂x₃ + 2x₁x₃

is not convex. Note that f has the Hessian matrix H(x) = [ 0 1 2 ; 1 0 2 ; 2 2 0 ]. Then

  det(0) = 0,   det[ 0 1 ; 1 0 ] = −1,   det[ 0 1 2 ; 1 0 2 ; 2 2 0 ] = 8.

Since one principal minor is negative, H(x) is not positive semi-definite, and hence f is not convex.
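The minor and eigenvalue computations for this Hessian can be reproduced in Python/NumPy:

```python
import numpy as np

H = np.array([[0., 1., 2.], [1., 0., 2.], [2., 2., 0.]])
print(np.linalg.det(H[:2, :2]))   # ≈ -1: a 2x2 principal minor is negative
print(np.linalg.eigvalsh(H))      # H also has negative eigenvalues, so it is not PSD
```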
1.4.2. Convex Functions ...
Exercise (a nonlinear spring system):
(i) The applied forces are F₁ = 0, F₂ = 2 N and the spring constants are k₁ = k₂ = 1 N/m. Verify the convexity of the potential energy function, which is given by

  P(x₁, x₂) = ½ k₁ (ΔL₁)² + ½ k₂ (ΔL₂)² − F₁x₁ − F₂x₂,

where

  ΔL₁ = √((x₁ + 10)² + (x₂ − 10)²) − 10√2   and   ΔL₂ = √((x₁ − 10)² + (x₂ − 10)²) − 10√2

are the changes in the lengths of the springs, and x₁ and x₂ are the x and y shifts, respectively.
1.4.2. Convex Functions ... Exercises
(ii) Verify that if g₁(x), …, g_m(x) are convex functions, then the set

  S = {x | gᵢ(x) ≤ 0, i = 1, …, m}

is a convex set.
(iii) (Rosenbrock's function) Verify whether the function

  f(x) = 100(x₂ − x₁²)² + (1 − x₁)²

is convex or not.
(iv) The following function represents the weight of a truss structure:

  f(M_b, M_c) = 8·M_b^2.5 + 6·M_c^2.5.

Verify whether this function is convex.
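Exercise (iii) can be probed numerically (a probe, not a proof): evaluate a finite-difference Hessian at a sample point and inspect its eigenvalues. A Python/NumPy sketch, where the sample point is an arbitrary choice:

```python
import numpy as np

def hessian_fd(f, x, h=1e-4):
    """Central-difference approximation of the Hessian matrix."""
    n = len(x); I = np.eye(n); H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            H[i, j] = (f(x + h*I[i] + h*I[j]) - f(x + h*I[i] - h*I[j])
                       - f(x - h*I[i] + h*I[j]) + f(x - h*I[i] - h*I[j])) / (4*h*h)
    return H

rosen = lambda x: 100*(x[1] - x[0]**2)**2 + (1 - x[0])**2
H = hessian_fd(rosen, np.array([0.0, 1.0]))
print(np.linalg.eigvalsh(H))   # a negative eigenvalue at a single point already rules out convexity
```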