TRANSCRIPT
-
Tutorial on Convex Optimization: Part II
Dr. Khaled Ardah
Communications Research Laboratory, TU Ilmenau
Dec. 18, 2018
-
Outline
Convex Optimization Review
Lagrangian Duality
Applications
I Optimal Power Allocation for Rate Maximization
I Downlink Beamforming as SDP and SOCP
I Uplink-Downlink Duality via Lagrangian Duality
Disciplined convex programming and CVX
-
Convex Optimization Review
Mathematical optimization problem (P)
min_x f0(x)
s.t. fi(x) ≤ 0, i = 1, . . . , m
     hj(x) = 0, j = 1, . . . , p
Variable x ∈ R^n, domain (set) X = {x | fi(x) ≤ 0 ∀i, hj(x) = 0 ∀j}
P is convex if: the objective f0 and the domain X are convex, and the hj(x) are affine
P is still convex if: min is replaced by max with f0 concave, and the constraints are fi(x) ≥ 0 with all fi concave
I feasible solution x: p = f0(x), x ∈ X
I local optimal solution x̄: p = f0(x̄) ≤ f0(x) for all x ∈ X with ‖x − x̄‖ ≤ ε
I global optimal solution x⋆: p⋆ = f0(x⋆) ≤ f0(x) for all x ∈ X
I if P is convex, then any local optimal solution x̄ is also globally optimal x⋆.
-
Lagrangian Duality
Mathematical optimization problem (P) (not necessarily convex)
min_x f0(x)
s.t. fi(x) ≤ 0, i = 1, . . . , m
     hj(x) = 0, j = 1, . . . , p
Variable x ∈ R^n, domain X, optimal value p⋆ = f0(x⋆)
Lagrangian function L (named after Joseph-Louis Lagrange, 1811)
L(x, λ, ν) = f0(x) + ∑_{i=1}^m λi fi(x) + ∑_{j=1}^p νj hj(x)
I L is a weighted sum of the objective and constraint functions
I λi is the Lagrange multiplier associated with fi(x) ≤ 0
I νj is the Lagrange multiplier associated with hj(x) = 0
• other names: weights, penalties, prices, ...
-
Lagrange dual function (problem)
Lagrange dual function
g(λ, ν) = min_x L(x, λ, ν) = min_x ( f0(x) + ∑_{i=1}^m λi fi(x) + ∑_{j=1}^p νj hj(x) )
For each fixed x, L(x, λ, ν) is affine in (λ, ν); g(λ, ν) is the pointwise minimum of this family of affine functions, and is therefore always a concave function
We say that (λ, ν) is dual feasible if λ ≥ 0 and g(λ, ν) is finite
I g(λ, ν) ≤ f0(x) for any feasible x (proof)
This means: for any dual feasible vector (λ, ν), the dual function always serves as a lower bound on the primal optimal value p⋆.
-
Lagrange dual function (problem)
Dual problem (D)
max_{λ,ν} g(λ, ν)
s.t. λ ≥ 0
Variable: vector (λ, ν), optimal value d⋆ = g(λ⋆, ν⋆)
The dual problem D is always convex regardless of the convexity of the original (primal) problem P
Duality gap: e = p⋆ − d⋆
I In general, e ≥ 0, i.e., there may be a gap between the primal and dual optimal values
I If the primal problem P is convex (and a constraint qualification such as Slater's condition holds), strong duality holds and thus e = 0
-
Optimality Conditions
The necessary conditions for x⋆ to be a (local) optimal solution to the primal problem P are that there exist some (λ, ν) such that
Primal feasibility conditions
I fi(x⋆) ≤ 0, ∀i = 1, . . . , m
I hj(x⋆) = 0, ∀j = 1, . . . , p
Dual feasibility condition
I λ⋆ ≥ 0
Complementary slackness condition
I λ⋆i fi(x⋆) = 0, ∀i = 1, . . . , m
First-order optimality condition
I ∇xL(x⋆, λ⋆, ν⋆) = ∇xf0(x⋆) + ∑_{i=1}^m λ⋆i ∇xfi(x⋆) + ∑_{j=1}^p ν⋆j ∇xhj(x⋆) = 0
-
Optimality Conditions
The optimality conditions provided above are called the Karush-Kuhn-Tucker (KKT) conditions
In general, the KKT conditions are necessary, but not sufficient
If the problem is convex, the KKT conditions are also sufficient
Remark
I For an unconstrained optimization problem, the KKT conditions reduce to the first-order optimality condition ∇xf0(x⋆) = 0 alone.
• The local optimum must be attained at a stationary point
I For constrained optimization problems, the (local) optimum is no longer attained at a stationary point; instead, it is attained at a KKT point.
-
Example
Solve the following problem
min x^2 + y^2 + 2z^2
s.t. 2x + 2y − 4z ≥ 8
-
Example
Solve the following problem
min x^2 + y^2 + 2z^2
s.t. 2x + 2y − 4z ≥ 8
The Lagrangian: L(x, y, z, λ) = x^2 + y^2 + 2z^2 + λ(8 − 2x − 2y + 4z)
Dual function: g(λ) = min_{x,y,z} L(x, y, z, λ)
I ∂L/∂x = 2x − 2λ = 0
I ∂L/∂y = 2y − 2λ = 0
I ∂L/∂z = 4z + 4λ = 0
I ⇒ x = y = λ, z = −λ
Substituting x = y = λ, z = −λ into 2x + 2y − 4z = 8, we get λ = 1
What is the dual problem?
Are the optimality conditions satisfied?
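The optimality conditions can be checked numerically. A small sanity check (illustrative, not part of the original slides) that the candidate point x = y = 1, z = −1 with multiplier λ = 1 satisfies all four KKT conditions of the example above:

```python
# Check the KKT conditions of: min x^2 + y^2 + 2z^2  s.t.  2x + 2y - 4z >= 8
# at the candidate point x = y = 1, z = -1 with multiplier lam = 1.

x, y, z, lam = 1.0, 1.0, -1.0, 1.0

# Primal feasibility (constraint written in "<= 0" form)
g = 8 - 2*x - 2*y + 4*z
assert g <= 1e-12

# Dual feasibility
assert lam >= 0

# Complementary slackness
assert abs(lam * g) < 1e-12

# Stationarity of L = x^2 + y^2 + 2z^2 + lam*(8 - 2x - 2y + 4z)
dLdx = 2*x - 2*lam
dLdy = 2*y - 2*lam
dLdz = 4*z + 4*lam
assert max(abs(dLdx), abs(dLdy), abs(dLdz)) < 1e-12

print("KKT satisfied, optimal value =", x**2 + y**2 + 2*z**2)  # 4.0
```

Since the problem is convex, the KKT point is the global optimum, with optimal value 4.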
-
Example
Least-norm solution of linear equations
min_x xᵀx
s.t. Ax = b
Recall that the optimal solution is x⋆ = A†b = Aᵀ(AAᵀ)⁻¹b
I If A is very large, this solution cannot be used directly.
Lagrangian function L(x, λ) = xᵀx + λᵀ(Ax − b)
Dual function g(λ) = min_x L(x, λ)
I setting ∇xL(x, λ) = 2x + Aᵀλ = 0 ⇒ x = −(1/2)Aᵀλ
I this x minimizes L(x, λ)
g(λ) = L(−(1/2)Aᵀλ, λ) = −(1/4)λᵀAAᵀλ − bᵀλ, a concave function of λ
I lower bound property: p⋆ ≥ −(1/4)λᵀAAᵀλ − bᵀλ, ∀λ
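The derivation above can be traced on a tiny made-up instance (a single equation aᵀx = b, so AAᵀ reduces to a scalar and no linear-algebra library is needed):

```python
# Least-norm example: min x^T x  s.t.  a^T x = b,
# with made-up data a = [1, 2], b = 5 (one equation, two unknowns).

a = [1.0, 2.0]
b = 5.0
aat = sum(ai * ai for ai in a)        # A A^T reduces to the scalar a^T a

# Maximize g(lam) = -(1/4) lam^2 aat - b lam  =>  lam = -2 b / (a^T a)
lam = -2.0 * b / aat

# Recover the primal minimizer x = -(1/2) A^T lam
x = [-0.5 * ai * lam for ai in a]

# Feasibility and zero duality gap (the problem is convex)
assert abs(sum(ai * xi for ai, xi in zip(a, x)) - b) < 1e-12
primal = sum(xi * xi for xi in x)
dual = -0.25 * lam * lam * aat - b * lam
assert abs(primal - dual) < 1e-12
print(x, primal)                      # [1.0, 2.0] 5.0
```

Here the dual maximizer λ = −2 yields x = −(1/2)Aᵀλ = [1, 2], which is exactly Aᵀ(AAᵀ)⁻¹b, and the primal and dual values coincide.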
-
Example
Standard form LP
min_x cᵀx
s.t. Ax = b
Lagrangian function
L(x, λ) = cᵀx + λᵀ(Ax − b) = −bᵀλ + (c + Aᵀλ)ᵀx
which is affine in x
Dual function
g(λ) = min_x L(x, λ) = { −bᵀλ if c + Aᵀλ = 0; −∞ otherwise }
which is linear on the affine domain {λ | c + Aᵀλ = 0}
lower bound property: p⋆ ≥ −bᵀλ if c + Aᵀλ = 0
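A tiny numeric illustration of this lower-bound property (the data below are made up, not from the slides; c is chosen parallel to a so the optimal value is easy to read off):

```python
# Equality-constrained LP: min c^T x  s.t.  a^T x = b,
# with c = [1, 1], a = [1, 1], b = 2. Since c is parallel to a,
# every feasible x has the same objective value c^T x = 2.

c = [1.0, 1.0]
a = [1.0, 1.0]
b = 2.0

# Dual feasibility c + a*lam = 0 forces lam = -1 (check componentwise)
lam = -1.0
assert all(abs(ci + ai * lam) < 1e-12 for ci, ai in zip(c, a))

# Dual value -b*lam is a lower bound on the primal optimum ...
dual_value = -b * lam

# ... and any feasible point attains it here (zero duality gap)
x = [2.0, 0.0]                               # a^T x = 2, feasible
assert abs(a[0]*x[0] + a[1]*x[1] - b) < 1e-12
primal_value = sum(ci * xi for ci, xi in zip(c, x))
assert primal_value >= dual_value - 1e-12
print(primal_value, dual_value)              # 2.0 2.0
```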
-
Optimal Power Allocation for Rate Maximization
-
Optimal Power Allocation for Rate Maximization
Assume that we have n channels, where the i-th channel gain is αi
For each channel i, the transmit power is given by pi
The SNR of channel i is given as (where σ is the noise power)
Γi = αi pi / σ
The rate of channel i is then given as
ri = log(1 + Γi)
Our problem: find the power allocation vector p = [p1, . . . , pn]ᵀ that maximizes the sum rate subject to a maximum power constraint, i.e.,
max_p ∑_{i=1}^n ri = ∑_{i=1}^n log(1 + αi pi / σ)
s.t. ∑_{i=1}^n pi = pmax
     pi ≥ 0, ∀i
-
Optimal Power Allocation for Rate Maximization
L(p, λ, µ) = −∑_{i=1}^n log(1 + αi pi / σ) + µ(∑_{i=1}^n pi − pmax) − ∑_{i=1}^n λi pi
Taking the gradient w.r.t. pi, we have
∇pi L(p, λ, µ) = −(αi/σ) / (1 + αi pi/σ) + µ − λi = 0.
Thus, we have µ = αi / (σ + αi pi) + λi
From the complementary slackness condition, we have λi pi = 0
I Case 1: λi = 0 and pi > 0, thus
µ = αi / (σ + αi pi) ⇒ pi = 1/µ − σ/αi, where 1/µ ≥ σ/αi
I Case 2: pi = 0 and λi > 0, thus
µ = αi/σ + λi ⇒ λi = µ − αi/σ > 0 ⇒ µ > αi/σ ⇒ 1/µ < σ/αi
-
Optimal Power Allocation for Rate Maximization
From the above, we have the optimal power allocation
pi = max{1/µ − σ/αi, 0} = [1/µ − σ/αi]+
We find µ such that
∑_{i=1}^n [1/µ − σ/αi]+ = pmax
Remark: if αi increases ⇒ σ/αi decreases ⇒ pi increases.
Can we draw a diagram illustrating the above relation?
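The search for µ above is the classical water-filling solution. A minimal pure-Python sketch (channel gains and powers are made-up example values) that finds the water level 1/µ by bisection:

```python
# Water-filling power allocation: p_i = max(level - sigma/alpha_i, 0),
# where the water level (= 1/mu) is chosen so that sum(p) = pmax.

def waterfill(alpha, sigma, pmax, iters=100):
    """Bisection on the water level; total power is monotone in the level."""
    lo, hi = 0.0, pmax + max(sigma / a for a in alpha)  # bracket the level
    for _ in range(iters):
        level = 0.5 * (lo + hi)
        total = sum(max(level - sigma / a, 0.0) for a in alpha)
        if total > pmax:
            hi = level
        else:
            lo = level
    return [max(level - sigma / a, 0.0) for a in alpha]

alpha = [4.0, 2.0, 1.0]   # made-up channel gains
sigma = 1.0               # noise power
p = waterfill(alpha, sigma, pmax=3.0)
assert abs(sum(p) - 3.0) < 1e-6   # total power budget met
assert p[0] > p[1] > p[2]         # better channels receive more power
print([round(pi, 3) for pi in p])
```

This matches the remark: the larger αi is, the smaller σ/αi is, and the more power channel i receives.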
-
Downlink Beamforming as SDP and SOCP
-
Downlink Beamforming as SDP and SOCP
A wireless network consisting of one Tx and K Rxs
The Tx has N antennas, while each Rx has one antenna
Problem: minimize the transmit power subject to SINR targets.
First, the received signal at the k-th Rx is
yk = ∑_{j=1}^K hkᴴ wj sj + nk = hkᴴ wk sk (desired signal) + ∑_{j≠k} hkᴴ wj sj (interference) + nk (noise)
I yk ∈ C is the received signal
I hj ∈ C^N is the channel between the Tx and the j-th Rx
I wj ∈ C^N is the transmit beamforming vector for the j-th Rx
I E[sk sk*] = 1, while E[sk sj*] = 0 for j ≠ k
Thus, the SINR at Rx k is given as
Γk = |hkᴴ wk|² / ( ∑_{j≠k} |hkᴴ wj|² + σ )
-
Downlink Beamforming as SDP and SOCP
The SINR at Rx k is given as
Γk = |hkᴴ wk|² / ( ∑_{j≠k} |hkᴴ wj|² + σ )
The QoS constraints require that Γk ≥ γk, ∀k
Mathematical optimization problem (nonconvex)
min_{wk,∀k} ∑k ‖wk‖²
s.t. |hkᴴ wk|² / ( ∑_{j≠k} |hkᴴ wj|² + σ ) ≥ γk, ∀k
Note that the transmit power is represented by ‖wk‖² = pk
In other problems, you may want to design the beamforming direction and the beamforming power independently
wk = √pk w̄k, where ‖w̄k‖² = 1
-
Downlink Beamforming as SDP and SOCP
Solve the above problem using the relaxed SDP
I ‖wk‖² = wkᴴ wk = Tr(wk wkᴴ) = Tr(Wk), where Wk = wk wkᴴ ∈ C^{N×N}
I |hkᴴ wk|² = (hkᴴ wk)ᴴ(hkᴴ wk) = wkᴴ hk hkᴴ wk = Tr(wk wkᴴ hk hkᴴ) = Tr(Wk Hk), where Hk = hk hkᴴ ∈ C^{N×N}
I Wk and Hk are both rank-one matrices
Rearrange the SINR constraints as
|hkᴴ wk|² / ( ∑_{j≠k} |hkᴴ wj|² + σ ) ≥ γk ⇒ |hkᴴ wk|² ≥ γk ( ∑_{j≠k} |hkᴴ wj|² + σ )
Modify the SINR constraints using the above results
Tr(Wk Hk) ≥ γk ( ∑_{j≠k} Tr(Hk Wj) + σ )
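The trace identity |hᴴw|² = Tr(WH) used above can be verified numerically with plain Python complex arithmetic (the vectors below are made-up examples):

```python
# Verify |h^H w|^2 = Tr(W H) with W = w w^H and H = h h^H (both rank one).

def outer(u, v):
    """u v^H as a list of rows."""
    return [[ui * vj.conjugate() for vj in v] for ui in u]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

h = [1 + 2j, 0.5 - 1j]          # made-up channel vector
w = [0.3 + 0.1j, -0.2 + 0.4j]   # made-up beamformer

inner = sum(hi.conjugate() * wi for hi, wi in zip(h, w))   # h^H w
lhs = abs(inner) ** 2                                      # |h^H w|^2

W = outer(w, w)                                            # w w^H
H = outer(h, h)                                            # h h^H
rhs = trace(matmul(W, H)).real                             # Tr(W H)

assert abs(lhs - rhs) < 1e-12
print(lhs)
```

The identity follows from the cyclic property of the trace: Tr(w wᴴ h hᴴ) = (hᴴw)(wᴴh) = |hᴴw|².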
-
Downlink Beamforming as SDP and SOCP
The original problem can be written as (nonconvex)
min_{Wk,∀k} ∑k Tr(Wk)
s.t. Tr(Wk Hk) ≥ γk ( ∑_{j≠k} Tr(Hk Wj) + σ ), ∀k
     Wk ⪰ 0
     rank(Wk) = 1, ∀k
The above problem is still nonconvex, due to the rank-one constraints
Ignoring the rank-one constraints, the problem becomes a relaxed SDP, which is convex
-
Downlink Beamforming as SDP and SOCP
This is based on the observation that an arbitrary phase rotation can be added to the beamforming vectors without affecting the SINR functions
Thus, hkᴴ wk can be chosen to be real without loss of generality.
Let W = [w1, . . . , wK] ∈ C^{N×K}
The SINR constraints become
(1 + 1/γk) |hkᴴ wk|² ≥ ‖ [hkᴴ W, √σ] ‖²
Because hkᴴ wk can be assumed real, we can take the square root as
√(1 + 1/γk) hkᴴ wk ≥ ‖ [hkᴴ W, √σ] ‖
which is a second-order cone constraint
The original problem can be written as SOCP (convex)
min_{wk,∀k} ∑k ‖wk‖²
s.t. √(1 + 1/γk) hkᴴ wk ≥ ‖ [hkᴴ W, √σ] ‖, ∀k
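The phase-rotation claim underlying the SOCP reformulation can be checked numerically (illustrative sketch with made-up vectors): rotating wk by e^{jθ} changes neither |hᴴwk|² nor ‖wk‖², and θ can always be chosen so that hᴴwk is real and nonnegative.

```python
import cmath

h = [1 + 1j, 2 - 0.5j]        # made-up channel
w = [0.4 - 0.3j, 0.1 + 0.6j]  # made-up beamformer

inner = sum(hi.conjugate() * wi for hi, wi in zip(h, w))  # h^H w

# Rotate w by the negative phase of h^H w
theta = -cmath.phase(inner)
w_rot = [cmath.exp(1j * theta) * wi for wi in w]
inner_rot = sum(hi.conjugate() * wi for hi, wi in zip(h, w_rot))

# |h^H w|^2 (and hence every SINR term) is unchanged
assert abs(abs(inner_rot) ** 2 - abs(inner) ** 2) < 1e-12
# h^H w is now real and nonnegative
assert abs(inner_rot.imag) < 1e-12 and inner_rot.real >= 0
# transmit power ||w||^2 is unchanged
assert abs(sum(abs(x) ** 2 for x in w_rot)
           - sum(abs(x) ** 2 for x in w)) < 1e-12
print(inner_rot.real)
```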
-
Uplink-Downlink Duality via Lagrangian Duality
-
Uplink-Downlink Duality via Lagrangian Duality
In engineering design, we are not only interested in the numerical solution to the problem, but also in the structure of the optimal solution.
The Lagrangian duality of the original problem often reveals such structure.
Uplink-Downlink Duality: refers to the fact that the total transmit power required to satisfy certain SINR constraints in the downlink is equal to the total transmit power required to satisfy certain SINR constraints in the uplink
∑_{k=1}^K pk = ∑_{k=1}^K qk
where pk is the downlink power and qk is the uplink power
Note that pk does not have to be equal to qk.
It is the sum of powers that is equal!
-
Uplink-Downlink Duality via Lagrangian Duality
The original optimization problem is
min_{wk,∀k} ∑k ‖wk‖² = ∑k wkᴴ wk
s.t. |hkᴴ wk|² / ( ∑_{j≠k} |hkᴴ wj|² + σ ) ≥ γk, ∀k
Lagrangian function
L(wk, λk) = ∑k wkᴴ wk − ∑k λk ( (1/γk) |hkᴴ wk|² − ∑_{j≠k} |hkᴴ wj|² − σ )
= ∑k λk σ + ∑k wkᴴ [ IN + ∑_{j≠k} λj hj hjᴴ − (λk/γk) hk hkᴴ ] wk
Note that the matrix [ IN + ∑_{j≠k} λj hj hjᴴ − (λk/γk) hk hkᴴ ] must be positive semidefinite for the dual function to be bounded below.
-
Uplink-Downlink Duality via Lagrangian Duality
The dual optimization problem
max_{λk,∀k} ∑k λk σ
s.t. ∑j λj hj hjᴴ + IN ⪰ (1 + 1/γk) λk hk hkᴴ, ∀k
-
Uplink-Downlink Duality via Lagrangian Duality
Let us now consider the uplink problem
The received uplink signal at the Rx with regard to Tx k, after combining with w̄k, is
yk = ∑_{j=1}^K w̄kᴴ hj √qj sj + w̄kᴴ nk = w̄kᴴ hk √qk sk (desired signal) + ∑_{j≠k} w̄kᴴ hj √qj sj (interference) + w̄kᴴ nk (noise)
The SINR of Tx k in the uplink is given as
Γ̃k = qk |hkᴴ w̄k|² / ( ∑_{j≠k} qj |hjᴴ w̄k|² + w̄kᴴ w̄k σ )
The mathematical optimization problem
min_{qk,∀k} ∑k qk
s.t. qk |hkᴴ w̄k|² / ( ∑_{j≠k} qj |hjᴴ w̄k|² + w̄kᴴ w̄k σ ) ≥ γk, ∀k
-
Uplink-Downlink Duality via Lagrangian Duality
The optimal beamforming direction w̄k is given by the MMSE receiver as
w̄k = ρ ( ∑j qj hj hjᴴ + σI )⁻¹ hk
where ρ is a normalization parameter so that ‖w̄k‖ = 1
After substituting w̄k into the SINR constraints and rearranging, we have
min_{qk,∀k} ∑k qk
s.t. ∑j qj hj hjᴴ + σIN ⪯ (1 + 1/γk) qk hk hkᴴ, ∀k
-
Uplink-Downlink Duality via Lagrangian Duality
The dual optimization problem of the downlink problem
max_{λk,∀k} ∑k λk σ
s.t. ∑j λj hj hjᴴ + IN ⪰ (1 + 1/γk) λk hk hkᴴ, ∀k
The uplink optimization problem
min_{qk,∀k} ∑k qk
s.t. ∑j qj hj hjᴴ + σIN ⪯ (1 + 1/γk) qk hk hkᴴ, ∀k
Note that qk = λk σ.
Thus, under this identification, both problems are identical, except that the maximization and the minimization, and the direction of the SINR constraint, are reversed
-
Disciplined convex programming and CVX
-
Disciplined convex programming and CVX
LP solvers
I lots available (GLPK, Excel, Matlab's linprog, . . . )
Cone solvers
I typically handle (combinations of) LP, SOCP, SDP cones
I several available (SDPT3, SeDuMi, CSDP, . . . )
general convex solvers
I some available (CVXOPT, MOSEK, . . . )
could write your own
I use some tricks to transform the problem into an equivalent one that has a standard form (e.g., LP, SDP)
I modeling systems can partly automate this step
-
CVX
runs in Matlab, between the cvx_begin and cvx_end commands
relies on SDPT3 or SeDuMi (LP/SOCP/SDP) solvers
refer to user guide, online help for more info
the CVX example library has more than a hundred examples
-
Example: Constrained norm minimization
between cvx_begin and cvx_end, x is a CVX variable
statement subject to does nothing, but can be added for readability
inequalities are treated elementwise
-
What CVX does
after cvx_end, CVX will
I transform the problem into an LP
I call the solver SDPT3
I overwrite (object) x with the (numeric) optimal value
I assign the problem's optimal value to cvx_optval
I assign the problem status (which here is Solved) to cvx_status
-
Some useful functions