Systems Optimization, Winter Semester 2015
Preliminaries 1.3. Gradients, Hessian, Convex Sets, Convex Functions
Dr. Abebe Geletu
Technische Universität Ilmenau, Institute of Automation and Systems Engineering
Department of Simulation and Optimal Processes (SOP)
1.3.1 Derivatives
Scalar valued function of a single variable:
• Let f : R → R be a scalar valued function of a single variable.
• The derivative of f is

  df(x)/dx = lim_{Δx→0} [f(x + Δx) − f(x)] / Δx.

  An equivalent notation is f′(x).
• Second derivative: d²f(x)/dx² = d/dx (df(x)/dx) = f′′(x).
Conventions for derivatives of time-dependent functions
When x(t) is a time-dependent variable, then

  ẋ(t) = lim_{Δt→0} [x(t + Δt) − x(t)] / Δt = dx(t)/dt.

Hence, ẍ(t) = d²x(t)/dt².
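The limit definition above can be checked numerically by taking a small but finite Δx; a minimal sketch in Python (the function f and the evaluation point are illustrative choices, not from the slides):

```python
def derivative(f, x, dx=1e-6):
    """Forward-difference approximation of df/dx, mirroring the limit definition."""
    return (f(x + dx) - f(x)) / dx

f = lambda x: x**2          # f'(x) = 2x
print(derivative(f, 3.0))   # ≈ 6.0
```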
1.3.2. Partial Derivatives and Gradient
Scalar valued function of several variables:
Let f : Rⁿ → R be a function of several variables. The partial derivative of f w.r.t. the i-th variable is

  ∂f(x₁, …, xₙ)/∂xᵢ = lim_{h→0} [f(x₁, …, xᵢ + h, …, xₙ) − f(x₁, …, xᵢ, …, xₙ)] / h,

written more compactly as ∂f/∂xᵢ (x).
▶ The gradient of f at the point x is the vector defined as

  ∇f(x) = ( ∂f/∂x₁ (x), ∂f/∂x₂ (x), …, ∂f/∂xₙ (x) )ᵀ
1.3.2. Partial Derivatives and Gradient ...
Examples:
(i) For f(R) = 400R₁/(R₁ + R₂)², find the gradient vector at the point R = (0, 1).

  ∇f(R) = ( ∂f(R)/∂R₁, ∂f(R)/∂R₂ )ᵀ = ( 400(R₂ − R₁)/(R₁ + R₂)³, −800R₁/(R₁ + R₂)³ )ᵀ
  ⇒ ∇f((0, 1)) = (400, 0)ᵀ.
(ii) For f(x) = x₁x₂ + 2x₂x₃ + 2x₁x₃, find the gradient vector.

  ∇f(x) = ( ∂f(x)/∂x₁, ∂f(x)/∂x₂, ∂f(x)/∂x₃ )ᵀ = ( x₂ + 2x₃, x₁ + 2x₃, 2x₁ + 2x₂ )ᵀ
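The gradient in example (ii) can be verified numerically with finite differences; a small Python/NumPy sketch (the test point is an arbitrary choice):

```python
import numpy as np

def grad_fd(f, x, h=1e-6):
    """Forward-difference approximation of the gradient."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x); e[i] = h
        g[i] = (f(x + e) - f(x)) / h
    return g

f = lambda x: x[0]*x[1] + 2*x[1]*x[2] + 2*x[0]*x[2]
x = np.array([1.0, 2.0, 3.0])
print(grad_fd(f, x))                                    # ≈ [8, 7, 6]
print([x[1] + 2*x[2], x[0] + 2*x[2], 2*x[0] + 2*x[1]])  # the analytic gradient
```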
1.3.2. Partial Derivatives and Gradient...
In fact, the function f(x) = x₁x₂ + 2x₂x₃ + 2x₁x₃ can be written as

  f(x) = ½ (x₁, x₂, x₃) [ 0 1 2 ; 1 0 2 ; 2 2 0 ] (x₁, x₂, x₃)ᵀ = ½ xᵀQx,   with Q := [ 0 1 2 ; 1 0 2 ; 2 2 0 ],

which is a quadratic function.
▶ In general, for a quadratic function

  f(x) = ½ xᵀQx + qᵀx

with a symmetric matrix Q, the gradient is equal to the vector ∇f(x) = Qx + q.
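A quick numerical check of the formula ∇f(x) = Qx + q, using the matrix Q from above; the vector q and the test point x are illustrative values assumed for the demo:

```python
import numpy as np

Q = np.array([[0., 1., 2.], [1., 0., 2.], [2., 2., 0.]])  # symmetric Q from the slide
q = np.array([1., -1., 0.])                               # illustrative q

f = lambda x: 0.5 * x @ Q @ x + q @ x

def grad_fd(f, x, h=1e-6):
    """Central-difference gradient approximation."""
    I = np.eye(len(x))
    return np.array([(f(x + h*I[i]) - f(x - h*I[i])) / (2*h) for i in range(len(x))])

x = np.array([1., 2., 3.])
print(grad_fd(f, x))   # matches the analytic gradient below
print(Q @ x + q)
```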
1.3.3. The Hessian Matrix
The Hessian matrix of a function of several variables f : Rⁿ → R is

  H(x) = [ ∂²f(x)/∂x₁²       ∂²f(x)/∂x₂∂x₁   …   ∂²f(x)/∂xₙ∂x₁
           ∂²f(x)/∂x₁∂x₂    ∂²f(x)/∂x₂²      …   ∂²f(x)/∂xₙ∂x₂
           ⋮                 ⋮                ⋱   ⋮
           ∂²f(x)/∂x₁∂xₙ    ∂²f(x)/∂x₂∂xₙ   …   ∂²f(x)/∂xₙ² ]
▶ If f is twice continuously differentiable, the Hessian H(x) is a symmetric matrix.
Example:
(a) For the function f(R) = 400R₁/(R₁ + R₂)², the Hessian matrix is

  H(R) = 800/(R₁ + R₂)⁴ · [ R₁ − 2R₂   2R₁ − R₂ ;  2R₁ − R₂   −3R₁ ]
1.3.3. The Hessian Matrix ... Examples
Example:
(b) The Hessian matrix of the function f(x) = x₁x₂ + 2x₂x₃ + 2x₁x₃ is

  H(x) = [ 0 1 2 ; 1 0 2 ; 2 2 0 ]

▶ In general, for a quadratic function f(x) = ½ xᵀQx + qᵀx with symmetric Q, the Hessian matrix is

  H(x) = Q,

i.e., for a quadratic function the Hessian is a constant matrix.
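This can be probed with a finite-difference Hessian; a Python/NumPy sketch for example (b), where the test point is an arbitrary choice (for a quadratic function the result is the same everywhere):

```python
import numpy as np

def hessian_fd(f, x, h=1e-4):
    """Central-difference approximation of the Hessian matrix."""
    n = len(x); I = np.eye(n); H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            H[i, j] = (f(x + h*I[i] + h*I[j]) - f(x + h*I[i] - h*I[j])
                       - f(x - h*I[i] + h*I[j]) + f(x - h*I[i] - h*I[j])) / (4*h*h)
    return H

f = lambda x: x[0]*x[1] + 2*x[1]*x[2] + 2*x[0]*x[2]
H = hessian_fd(f, np.array([0.5, -1.0, 2.0]))
print(np.round(H, 3))   # the same constant matrix Q at any point x
```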
1.3.4 Vector Valued Functions
Vector valued functions of a single variable:
There are variables that are compactly written in vector form as x(t), where

  x(t) = ( x₁(t), x₂(t), …, xₙ(t) )ᵀ

Example (recall the water reservoirs control problem): the unknowns can be written in vector form as

  state variables: V(t) = ( V₁(t), V₂(t) )ᵀ,   control variables: w(t) = ( q(t), u(t) )ᵀ
1.3.4 Vector Valued Functions ... of a single variable
The derivative ẋ(t) is a vector represented by

  ẋ(t) = ( ẋ₁(t), ẋ₂(t), …, ẋₙ(t) )ᵀ

Similarly, the second time-derivative ẍ(t) is given by

  ẍ(t) = ( ẍ₁(t), ẍ₂(t), …, ẍₙ(t) )ᵀ.
1.3.5. The Jacobian Matrix
Vector valued functions of several variables:
Consider a function F : Rⁿ → Rᵐ, where

  F(x) = ( F₁(x), F₂(x), …, F_m(x) )ᵀ,   where x = (x₁, x₂, …, xₙ).

The Jacobian matrix of F at a point x is defined as

  J(x) = [ (∇F₁(x))ᵀ ; (∇F₂(x))ᵀ ; … ; (∇F_m(x))ᵀ ]
1.3.5. The Jacobian Matrix ... Example
Example: Find the Jacobian of

  F(x) = ( x₂ + 2x₃, x₁ + 2x₃, 2x₁ + 2x₂ )ᵀ.

Here F₁(x) = x₂ + 2x₃, F₂(x) = x₁ + 2x₃, F₃(x) = 2x₁ + 2x₂, so

  ∇F₁(x) = (0, 1, 2)ᵀ,  ∇F₂(x) = (1, 0, 2)ᵀ,  ∇F₃(x) = (2, 2, 0)ᵀ.

Hence,

  J(x) = [ (∇F₁(x))ᵀ ; (∇F₂(x))ᵀ ; (∇F₃(x))ᵀ ] = [ 0 1 2 ; 1 0 2 ; 2 2 0 ]
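The same Jacobian can be recovered numerically, column by column; a Python/NumPy sketch (the evaluation point is an arbitrary choice; for this linear F the Jacobian is constant):

```python
import numpy as np

def jacobian_fd(F, x, h=1e-6):
    """Forward-difference Jacobian: column j holds dF/dx_j."""
    Fx = F(x); n = len(x)
    J = np.zeros((len(Fx), n))
    for j in range(n):
        e = np.zeros(n); e[j] = h
        J[:, j] = (F(x + e) - Fx) / h
    return J

F = lambda x: np.array([x[1] + 2*x[2], x[0] + 2*x[2], 2*x[0] + 2*x[1]])
print(np.round(jacobian_fd(F, np.array([1., 2., 3.])), 3))
# matches the constant Jacobian [[0,1,2],[1,0,2],[2,2,0]]
```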
1.3.5. The Jacobian Matrix ... Exercise
▶ Note that, for a scalar valued function f : Rⁿ → R, the Jacobian of the gradient F(x) = ∇f(x) is equal to the Hessian H(x) of f.
Exercise:
(i) Determine the gradient and the Hessian of the function

  f(x) = 100(x₂ − x₁²)² + (1 − x₁)².

(ii) Determine the Jacobian of

  F(R) = ( 400(R₂ − R₁)/(R₁ + R₂)³, −800R₁/(R₁ + R₂)³ )ᵀ
1.3.6. Taylor Series Approximation
A smooth function f : Rⁿ → R can be represented by an infinite sum near a given point x:

  f(x + αd) = f(x) + α dᵀ∇f(x) + ½ α² dᵀH(x)d + …

for a very small number α > 0.
First-order Taylor approximation near x:

  f(x + αd) ≈ f(x) + α dᵀ∇f(x)

is a linear approximation of f(x) near x.
Second-order Taylor approximation near x:

  f(x + αd) ≈ f(x) + α dᵀ∇f(x) + ½ α² dᵀH(x)d

provides a quadratic approximation of f(x) near x.
▶ Most optimization algorithms are developed based on either first- or second-order Taylor approximations.
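The two approximation orders can be compared numerically; a Python/NumPy sketch (the function f, the point x, the direction d and the step sizes are illustrative choices):

```python
import numpy as np

f = lambda x: np.exp(x[0]) + x[1]**2
grad = lambda x: np.array([np.exp(x[0]), 2*x[1]])
hess = lambda x: np.array([[np.exp(x[0]), 0.], [0., 2.]])

x = np.array([0.0, 1.0]); d = np.array([1.0, -1.0])
for a in [0.1, 0.01]:
    exact = f(x + a*d)
    t1 = f(x) + a * (d @ grad(x))               # first-order model
    t2 = t1 + 0.5 * a**2 * (d @ hess(x) @ d)    # second-order model
    print(a, abs(exact - t1), abs(exact - t2))
# the second-order error shrinks much faster as the step a decreases
```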
1.3.7. Some Properties of the Gradient Vector
P1: A function f : Rⁿ → R increases fastest in the direction of the gradient vector ∇f(x); i.e.,
⇒ ∇f(x) is the steepest ascent direction of f from the point x;
⇒ −∇f(x) is the steepest descent direction of f from the point x.
Note that, for a given point x, using the first-order Taylor approximation

  f(x + αd) ≈ f(x) + α dᵀ∇f(x),   α > 0.

If we choose the direction vector d = ∇f(x), then

  f(x + αd) ≈ f(x) + α ∇f(x)ᵀ∇f(x) = f(x) + α ‖∇f(x)‖² ≥ f(x),

since ‖∇f(x)‖² ≥ 0. This implies that f increases in the direction d = ∇f(x).
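Property P1 can be illustrated numerically: a small step from x along d = ∇f(x) increases f. A Python/NumPy sketch (the function, start point and step size α are illustrative choices):

```python
import numpy as np

f = lambda x: -(x[0] - 1)**2 - 2*(x[1] + 0.5)**2              # illustrative smooth function
grad = lambda x: np.array([-2*(x[0] - 1), -4*(x[1] + 0.5)])   # its gradient

x = np.array([0.0, 0.0])
alpha = 0.1
x_up = x + alpha * grad(x)   # small step along the steepest ascent direction
print(f(x), f(x_up))         # f(x_up) > f(x)
```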
1.3.7. Some Properties of the Gradient Vector ...
P2: The gradient vector ∇f(x) is orthogonal to the contour line of f.
⇒ For a given fixed number k, ∇f(x) is orthogonal to the curve (surface) {x | f(x) = k}.
1.4.1. Convex Sets
A. Convex Set
A set S ⊂ Rⁿ is said to be a convex set if for any x₁, x₂ ∈ S and any λ ∈ [0, 1] we have

  λx₁ + (1 − λ)x₂ ∈ S.

• For a set S to be convex, the line segment joining any two points x₁, x₂ ∈ S must be completely contained in S.

Figure: A convex and a non-convex set

• The set S = {x ∈ Rⁿ | Ax ≤ a, Bx = b} is a convex set, where A, B are matrices and a, b are vectors.
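The definition can be checked numerically for a polyhedron of the form {x | Ax ≤ a}; a Python/NumPy sketch, where the matrix A, the vector a and the two points are illustrative assumptions:

```python
import numpy as np

# Illustrative polyhedron S = {x | Ax <= a}: the triangle with corners (0,0), (1,0), (0,1)
A = np.array([[1., 1.], [-1., 0.], [0., -1.]])
a = np.array([1., 0., 0.])

x1, x2 = np.array([0.2, 0.3]), np.array([0.5, 0.1])   # two points in S
rng = np.random.default_rng(1)
for lam in rng.uniform(size=100):
    z = lam*x1 + (1 - lam)*x2
    assert (A @ z <= a + 1e-12).all()                 # the whole segment stays in S
print("all convex combinations of x1 and x2 lie in S")
```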
1.4.1. Convex Sets ...
whereS1 = {x | gi (x) ≤ 0, i = 1, 2, 3}
andS2 = {x | a ≤ ‖x‖2 ≤ b}, where 0 < a < b.
1.4.2. Convex Functions
Definition (Convex Functions)
A function f : Rⁿ → R is said to be a convex function if for any x₁, x₂ ∈ Rⁿ and any λ ∈ [0, 1] we have

  f(λx₁ + (1 − λ)x₂) ≤ λf(x₁) + (1 − λ)f(x₂).

• A segment connecting any two points on the graph of f lies on or above the graph of f.
▶ A function f is strictly convex if for any x₁ ≠ x₂ in Rⁿ and any λ ∈ (0, 1) we have

  f(λx₁ + (1 − λ)x₂) < λf(x₁) + (1 − λ)f(x₂).
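The defining inequality can be sampled at random points; a Python/NumPy sketch for a known convex function (the choice of f and the number of samples are illustrative):

```python
import numpy as np

f = lambda x: x[0]**2 + x[1]**2   # a known convex function

rng = np.random.default_rng(0)
for _ in range(1000):
    x1, x2 = rng.normal(size=2), rng.normal(size=2)
    lam = rng.uniform()
    lhs = f(lam*x1 + (1 - lam)*x2)
    rhs = lam*f(x1) + (1 - lam)*f(x2)
    assert lhs <= rhs + 1e-12     # the defining inequality holds
print("convexity inequality verified on 1000 random samples")
```

Note that sampling can only refute convexity (by finding a violated inequality), never prove it.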
1.4.2. Convex Functions ...
Examples:
(i) The following are convex functions: f₁(x) = x², f₂(x) = x⁴, f₃(x) = eˣ, f₄(x₁, x₂) = x₁² + x₂².
(ii) The functions f₁(x) = x², f₃(x) = eˣ, f₄(x₁, x₂) = x₁² + x₂² are strictly convex.
Some properties
▶ If f is a convex function and α ≥ 0, then αf(x) is a convex function.
▶ If f and g are convex functions, then their sum f(x) + g(x) is also a convex function.
▶ If f is a convex function and h is a convex and non-decreasing function of one variable, then the composition h(f(x)) is also a convex function.
(iii) The function f(x) = √(e^(x₁² + x₂²) + x₁⁴) is a convex function.
1.4.2. Convex Functions ...
Question: Is there a simple method to verify whether a function is convex or not?
(iv) Consider the function f(x) = ½(x₁ − 2)² + x₂² − 5, which can also be written as

  f(x) = ½ (x₁, x₂) [ 1 0 ; 0 2 ] (x₁, x₂)ᵀ + (−2, 0) (x₁, x₂)ᵀ + (−3) = ½ xᵀQx + qᵀx + b,

with Q := [ 1 0 ; 0 2 ], qᵀ := (−2, 0) and b := −3.
1.4.2. Convex Functions...
function my2DPlot2
% Surface/contour plot of f(x1,x2) = 0.5*(x1-2)^2 + x2^2 - 5
x = -10:0.1:10; y = -10:0.1:10;
[X,Y] = meshgrid(x,y);
Z = 0.5*(X-2).^2 + Y.^2 - 5;
meshc(X,Y,Z)
xlabel('x-axis');
ylabel('y-axis');
hold on
%plot3(2,0,0,'sk','markerfacecolor',[0,0,0]);
title('Plot for the function f(x_{1},x_{2})=0.5(x_{1}-2)^{2}+ x_{2}^{2} - 5')
1.4.2. Convex Functions ...
[Figure: meshc surface/contour plot of f(x₁, x₂) = 0.5(x₁ − 2)² + x₂² − 5]
1.4.2. Convex Functions ...
Checking convexity
Let f be a twice differentiable function. Then:
• f is a convex function if and only if the Hessian H(x) is a positive semi-definite matrix for all x.
• f is a strictly convex function if the Hessian H(x) is a positive definite matrix for all x. (The converse does not hold: f(x) = x⁴ is strictly convex, yet f′′(0) = 0.)
Examples: The function f(x) = ½(x₁ − 2)² + x₂² − 5 is (strictly) convex, since

  H(x) = [ 1 0 ; 0 2 ]

is positive (semi-)definite.
▶ A quadratic function f(x) = ½ xᵀQx + qᵀx + b with symmetric Q is convex if and only if the matrix Q is positive semi-definite.
Recall that a symmetric matrix Q is positive semi-definite if all its eigenvalues are non-negative.
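The eigenvalue test for the matrix Q above takes one line in Python/NumPy:

```python
import numpy as np

Q = np.array([[1., 0.], [0., 2.]])   # Hessian of f(x) = 0.5*(x1-2)^2 + x2^2 - 5
eigs = np.linalg.eigvalsh(Q)         # eigenvalues of a symmetric matrix
print(eigs)                          # [1. 2.]: all positive, so Q is positive definite
```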
1.4.2. Convex Functions ...
Methods to check convexity
▶ A function f(x) is convex if all the eigenvalues of the Hessian matrix H(x) are non-negative, for every x.
▶ (Sylvester's criterion): A symmetric matrix is positive semi-definite if and only if all its principal minors are non-negative.
(For an n × n matrix, a principal minor is the determinant of a k × k principal sub-matrix, where k = 1, …, n.)
Example: The function

  f(x) = x₁x₂ + 2x₂x₃ + 2x₁x₃

is not convex. Note that f has the Hessian matrix H(x) = [ 0 1 2 ; 1 0 2 ; 2 2 0 ]. Then

  det(0) = 0,   det[ 0 1 ; 1 0 ] = −1,   det[ 0 1 2 ; 1 0 2 ; 2 2 0 ] = 8.

Since one principal minor is negative, H(x) is not positive semi-definite, and hence f is not convex.
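The minor and eigenvalue computations for this Hessian can be reproduced in Python/NumPy:

```python
import numpy as np

H = np.array([[0., 1., 2.], [1., 0., 2.], [2., 2., 0.]])
print(np.linalg.det(H[:2, :2]))   # ≈ -1: a 2x2 principal minor is negative
print(np.linalg.eigvalsh(H))      # H also has negative eigenvalues, so it is not PSD
```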
1.4.2. Convex Functions ...
Exercise (a nonlinear spring system):
(i) The applied forces are F₁ = 0, F₂ = 2 N and the spring constants are k₁ = k₂ = 1 N/m. Verify the convexity of the potential energy function, which is given by

  P(x₁, x₂) = ½ k₁ (ΔL₁)² + ½ k₂ (ΔL₂)² − F₁x₁ − F₂x₂,

where

  ΔL₁ = √((x₁ + 10)² + (x₂ − 10)²) − 10√2   and   ΔL₂ = √((x₁ − 10)² + (x₂ − 10)²) − 10√2

are the changes in the lengths of the springs, and x₁ and x₂ are the x and y shifts, respectively.
1.4.2. Convex Functions ... Exercises
(ii) Verify that if g₁(x), …, g_m(x) are convex functions, then the set

  S = {x | gᵢ(x) ≤ 0, i = 1, …, m}

is a convex set.
(iii) (Rosenbrock's function) Verify whether the function

  f(x) = 100(x₂ − x₁²)² + (1 − x₁)²

is convex or not.
(iv) The following function represents the weight of a truss structure:

  f(M_b, M_c) = 8·M_b^2.5 + 6·M_c^2.5.

Verify whether this function is convex.
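Exercise (iii) can be probed numerically (a probe, not a proof): evaluate a finite-difference Hessian at a sample point and inspect its eigenvalues. A Python/NumPy sketch, where the sample point is an arbitrary choice:

```python
import numpy as np

def hessian_fd(f, x, h=1e-4):
    """Central-difference approximation of the Hessian matrix."""
    n = len(x); I = np.eye(n); H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            H[i, j] = (f(x + h*I[i] + h*I[j]) - f(x + h*I[i] - h*I[j])
                       - f(x - h*I[i] + h*I[j]) + f(x - h*I[i] - h*I[j])) / (4*h*h)
    return H

rosen = lambda x: 100*(x[1] - x[0]**2)**2 + (1 - x[0])**2
H = hessian_fd(rosen, np.array([0.0, 1.0]))
print(np.linalg.eigvalsh(H))   # a negative eigenvalue at a single point already rules out convexity
```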