Tutorial on Convex Optimization: Part II Dr. Khaled Ardah Communications Research Laboratory TU Ilmenau Dec. 18, 2018


  • Tutorial on Convex Optimization: Part II

    Dr. Khaled Ardah

    Communications Research Laboratory, TU Ilmenau

    Dec. 18, 2018

  • Outline

    Convex Optimization Review

    Lagrangian Duality

    Applications
    - Optimal Power Allocation for Rate Maximization
    - Downlink Beamforming as SDP and SOCP
    - Uplink-Downlink Duality via Lagrangian Duality

    Disciplined convex programming and CVX

  • Convex Optimization Review

    Mathematical optimization problem (P)

        min_x   f_0(x)
        s.t.    f_i(x) ≤ 0,  i = 1, ..., m
                h_j(x) = 0,  j = 1, ..., p

    Variable x ∈ R^n, domain (set) X = {x | f_i(x) ≤ 0 ∀i, h_j(x) = 0 ∀j}

    P is convex if: the objective and the domain X are convex, and the h_j(x) are affine

    P is still convex if: min is replaced by max, the constraints are f_i(x) ≥ 0, and the objective and f_i(x) are all concave

    - feasible solution x: any x ∈ X, with objective value p = f_0(x)
    - local optimal solution x̄: p = f_0(x̄) ≤ f_0(x) for all x ∈ X with ‖x − x̄‖ ≤ ε
    - global optimal solution x*: p* = f_0(x*) ≤ f_0(x) for all x ∈ X
    - if P is convex, then any local optimal solution x̄ is also global optimal x*

  • Lagrangian Duality

    Mathematical optimization problem (P) (not necessarily convex)

        min_x   f_0(x)
        s.t.    f_i(x) ≤ 0,  i = 1, ..., m
                h_j(x) = 0,  j = 1, ..., p

    Variable x ∈ R^n, domain X, optimal value p* = f_0(x*)

    Lagrangian function L (named after Joseph-Louis Lagrange, 1811)

        L(x, λ, ν) = f_0(x) + Σ_{i=1}^m λ_i f_i(x) + Σ_{j=1}^p ν_j h_j(x)

    - L is a weighted sum of the objective and constraint functions
    - λ_i is the Lagrange multiplier associated with f_i(x) ≤ 0
    - ν_j is the Lagrange multiplier associated with h_j(x) = 0

    • other names: weights, penalties, prices, ...

  • Lagrange dual function (problem)

    Lagrange dual function

        g(λ, ν) = min_x L(x, λ, ν) = min_x ( f_0(x) + Σ_{i=1}^m λ_i f_i(x) + Σ_{j=1}^p ν_j h_j(x) )

    For each fixed x, L(x, λ, ν) is an affine function of (λ, ν); g(λ, ν) is therefore a pointwise minimum of a family of affine functions, and thus g(λ, ν) is always concave

    We say that (λ, ν) is dual feasible if λ ≥ 0 and g(λ, ν) is finite
    - g(λ, ν) ≤ f_0(x) for every feasible x (proof)

    This means: for any dual feasible vector (λ, ν), the dual function always serves as a lower bound on the primal optimal value.

  • Lagrange dual function (problem)

    Dual problem (D)

        max_{(λ,ν)}  g(λ, ν)
        s.t.         λ ≥ 0

    Variable: vector (λ, ν), optimal value d* = g(λ*, ν*)

    The dual problem D is always convex, regardless of the convexity of the original (primal) problem P

    Duality gap: e = p* − d*
    - In general, e ≥ 0 (weak duality), i.e., there may be a gap between the primal and dual values
    - If the primal problem P is convex and a constraint qualification holds (e.g., Slater's condition), strong duality holds and thus e = 0

  • Optimality Conditions

    The necessary conditions for x* to be a (local) optimal solution to the primal problem P are that there exist some (λ*, ν*) such that

    Primal feasibility conditions
    - f_i(x*) ≤ 0, ∀i = 1, ..., m
    - h_j(x*) = 0, ∀j = 1, ..., p

    Dual feasibility condition
    - λ* ≥ 0

    Complementary slackness condition
    - λ_i* f_i(x*) = 0, ∀i = 1, ..., m

    First-order optimality (stationarity)
    - ∇_x L(x*, λ*, ν*) = ∇_x f_0(x*) + Σ_{i=1}^m λ_i* ∇_x f_i(x*) + Σ_{j=1}^p ν_j* ∇_x h_j(x*) = 0

  • Optimality Conditions

    The optimality conditions provided above are called the Karush-Kuhn-Tucker (KKT) conditions

    In general, the KKT conditions are necessary, but not sufficient

    If the problem is convex, the KKT conditions are also sufficient

    Remark
    - For an unconstrained optimization problem, the KKT conditions reduce to only the first-order optimality condition ∇_x f_0(x*) = 0.
      • The local optimum must be attained at a stationary point
    - For constrained optimization problems, the (local) optimum is no longer attained at a stationary point; instead, it is attained at a KKT point.

  • Example

    Solve the following problem

        min  x² + y² + 2z²
        s.t. 2x + 2y − 4z ≥ 8

  • Example

    Solve the following problem

        min  x² + y² + 2z²
        s.t. 2x + 2y − 4z ≥ 8

    The Lagrangian: L(x, y, z, λ) = x² + y² + 2z² + λ(8 − 2x − 2y + 4z)

    Dual function: g(λ) = min_{x,y,z} L(x, y, z, λ)
    - ∂L/∂x = 2x − 2λ = 0
    - ∂L/∂y = 2y − 2λ = 0
    - ∂L/∂z = 4z + 4λ = 0
    - ⇒ x = y = λ, z = −λ

    Substituting x = y = λ, z = −λ into 2x + 2y − 4z = 8, we get 8λ = 8, i.e., λ = 1

    What is the dual problem?

    Are the optimality conditions satisfied?
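As a sanity check (not part of the slides), the candidate point (x, y, z) = (1, 1, −1) with λ = 1 can be verified against all four KKT conditions, and the dual value g(1) matches the primal value p* = 4:

```python
# Check that (x, y, z) = (1, 1, -1), lambda = 1 is a KKT point of:
#   minimize x^2 + y^2 + 2 z^2   s.t.   2x + 2y - 4z >= 8
x, y, z, lam = 1.0, 1.0, -1.0, 1.0

# Primal feasibility (the constraint holds with equality, so it is active).
assert 2*x + 2*y - 4*z >= 8 - 1e-12

# Dual feasibility.
assert lam >= 0

# Complementary slackness: lambda * (8 - 2x - 2y + 4z) = 0.
assert abs(lam * (8 - 2*x - 2*y + 4*z)) < 1e-12

# Stationarity of L = x^2 + y^2 + 2 z^2 + lambda (8 - 2x - 2y + 4z).
assert abs(2*x - 2*lam) < 1e-12
assert abs(2*y - 2*lam) < 1e-12
assert abs(4*z + 4*lam) < 1e-12

# Zero duality gap: g(lambda) = 8 lambda - 4 lambda^2 equals p* = 4 at lambda = 1.
p_star = x**2 + y**2 + 2*z**2          # = 4
g = lambda l: 8*l - 4*l**2
assert abs(g(lam) - p_star) < 1e-12
```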

  • Example

    Least-norm solution of linear equations

        min_x  x^T x
        s.t.   Ax = b

    Recall the optimal solution is x* = A⁺b = A^T (A A^T)⁻¹ b
    - If A is very large, this closed-form solution cannot be used.

    Lagrangian function: L(x, λ) = x^T x + λ^T (Ax − b)

    Dual function: g(λ) = min_x L(x, λ)
    - ∇_x L(x, λ) = 2x + A^T λ = 0  ⇒  x = −(1/2) A^T λ
    - this x minimizes L(x, λ)

    g(λ) = L(−(1/2) A^T λ, λ) = −(1/4) λ^T A A^T λ − b^T λ, a concave function of λ
    - lower bound property: p* ≥ −(1/4) λ^T A A^T λ − b^T λ, ∀λ
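A minimal numeric instance (hypothetical data, not from the slides: A = [1 1], b = 2) confirms the closed form and the lower-bound property in a few lines of plain Python:

```python
# Tiny instance: A = [[1, 1]], b = [2].
# Least-norm solution x* = A^T (A A^T)^{-1} b = [1, 1], so p* = x*^T x* = 2.
AAt = 1*1 + 1*1                        # A A^T (a scalar here) = 2
b = 2.0
x_star = [1*b/AAt, 1*b/AAt]            # A^T (A A^T)^{-1} b = [1, 1]
p_star = sum(v*v for v in x_star)      # = 2

# Dual function g(lambda) = -(1/4) lambda^T A A^T lambda - b^T lambda.
def g(lam):
    return -0.25 * lam * AAt * lam - b * lam

# Lower-bound property: g(lambda) <= p* for every lambda.
assert all(g(l) <= p_star + 1e-12 for l in [-5.0, -2.0, 0.0, 1.0, 3.0])

# Strong duality: at lambda* = -2 (A A^T)^{-1} b = -2, the bound is tight.
assert abs(g(-2.0) - p_star) < 1e-12
```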

  • Example

    Standard form LP

        min_x  c^T x
        s.t.   Ax = b,  x ≥ 0

    Lagrangian function (keeping x ≥ 0 implicit in the domain)

        L(x, λ) = c^T x + λ^T (Ax − b) = −b^T λ + (c + A^T λ)^T x

    which is affine in x

    Dual function

        g(λ) = min_{x ≥ 0} L(x, λ) = −b^T λ   if c + A^T λ ≥ 0,
                                      −∞      otherwise

    which is linear on the domain {λ | c + A^T λ ≥ 0}

    Lower bound property: p* ≥ −b^T λ if c + A^T λ ≥ 0
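A tiny hypothetical instance (c = [1, 2], A = [1 1], b = 1, not from the slides) can be brute-forced to confirm the zero duality gap:

```python
# Tiny standard-form LP:
#   minimize  x1 + 2*x2   s.t.  x1 + x2 = 1,  x >= 0
# The optimum is x* = (1, 0), so p* = 1.

# Brute-force the primal over the feasible segment x = (t, 1 - t), t in [0, 1].
p_star = min(t + 2*(1 - t) for t in [i/1000 for i in range(1001)])
assert abs(p_star - 1.0) < 1e-9

# Dual: maximize -b^T lambda subject to c + A^T lambda >= 0,
# i.e. 1 + lambda >= 0 and 2 + lambda >= 0, so lambda >= -1.
# The maximum of -lambda over lambda >= -1 is attained at lambda* = -1.
d_star = max(-lam for lam in [-1 + i/1000 for i in range(2001)])
assert abs(d_star - 1.0) < 1e-9        # zero duality gap: d* = p*

# Lower-bound property: -b^T lambda <= p* for every dual-feasible lambda.
assert all(-lam <= p_star + 1e-9 for lam in [-1.0, -0.5, 0.0, 2.0])
```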

  • Optimal Power Allocation for Rate Maximization

  • Optimal Power Allocation for Rate Maximization

    Assume that we have n channels, where the i-th channel gain is α_i

    For each channel i, the transmit power is given by p_i

    The SNR of channel i is given as (where σ is the noise power)

        Γ_i = α_i p_i / σ

    The rate of channel i is then given as

        r_i = log(1 + Γ_i)

    Our problem: find the power allocation vector p = [p_1, ..., p_n]^T that maximizes the sum rate subject to a maximum power constraint, i.e.,

        max_p  Σ_{i=1}^n r_i = Σ_{i=1}^n log(1 + α_i p_i / σ)
        s.t.   Σ_{i=1}^n p_i = p_max
               p_i ≥ 0, ∀i

  • Optimal Power Allocation for Rate Maximization

    L(p, µ, λ) = −Σ_{i=1}^n log(1 + α_i p_i / σ) + µ(Σ_{i=1}^n p_i − p_max) − Σ_{i=1}^n λ_i p_i

    Taking the gradient w.r.t. p_i, we have

        ∇_{p_i} L(p, µ, λ) = −(α_i/σ) / (1 + α_i p_i / σ) + µ − λ_i = 0.

    Thus, we have µ = α_i / (σ + α_i p_i) + λ_i

    From the complementary slackness condition, we have λ_i p_i = 0

    - Case 1: λ_i = 0 and p_i > 0, thus

        µ = α_i / (σ + α_i p_i)  ⇒  p_i = 1/µ − σ/α_i,  which requires 1/µ ≥ σ/α_i

    - Case 2: p_i = 0 and λ_i > 0, thus

        µ = α_i/σ + λ_i  ⇒  λ_i = µ − α_i/σ > 0  ⇒  µ > α_i/σ  ⇒  1/µ < σ/α_i

  • Optimal Power Allocation for Rate Maximization

    From the above, we have the optimal power allocation as

        p_i = max{1/µ − σ/α_i, 0} = [1/µ − σ/α_i]⁺

    We find µ such that

        Σ_{i=1}^n [1/µ − σ/α_i]⁺ = p_max

    Remark: if α_i increases ⇒ σ/α_i decreases ⇒ p_i increases.

    Can we draw a diagram illustrating the above relation?
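The water level 1/µ can be found by bisection, since the total allocated power is monotone in the level. A plain-Python sketch (illustrative channel gains and budget, not from the slides):

```python
# Water-filling: bisect on the water level w = 1/mu until the total
# allocated power sum_i [w - sigma/alpha_i]^+ matches pmax.
def waterfill(alpha, sigma, pmax, tol=1e-12):
    floors = [sigma / a for a in alpha]          # per-channel floors sigma/alpha_i
    def total(w):                                # power used at water level w
        return sum(max(w - f, 0.0) for f in floors)
    lo, hi = 0.0, max(floors) + pmax             # total(lo) = 0, total(hi) >= pmax
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if total(mid) < pmax:
            lo = mid
        else:
            hi = mid
    w = 0.5 * (lo + hi)
    return [max(w - f, 0.0) for f in floors]

p = waterfill(alpha=[1.0, 2.0, 4.0], sigma=1.0, pmax=3.0)
assert abs(sum(p) - 3.0) < 1e-9       # power budget is met
assert p[0] < p[1] < p[2]             # larger alpha_i -> larger p_i (the remark)
```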

  • Downlink Beamforming as SDP and SOCP

  • Downlink Beamforming as SDP and SOCP

    A wireless network consisting of one Tx and K Rxs

    The Tx has N antennas, while each Rx has one antenna

    Problem: minimize the transmit power subject to SINR targets.

    First, the received signal at the k-th Rx is

        y_k = Σ_{j=1}^K h_k^H w_j s_j + n_k
            = h_k^H w_k s_k (desired signal) + Σ_{j≠k} h_k^H w_j s_j (interference) + n_k (noise)

    - y_k ∈ C is the received signal
    - h_j ∈ C^N is the channel between the Tx and the j-th Rx
    - w_j ∈ C^N is the transmit beamforming vector for the j-th Rx
    - the symbols are uncorrelated with unit power: E[s_k s_k*] = 1, E[s_k s_j*] = 0 for j ≠ k

    Thus, the SINR at Rx k is given as

        Γ_k = |h_k^H w_k|² / ( Σ_{j≠k} |h_k^H w_j|² + σ )

  • Downlink Beamforming as SDP and SOCP

    The SINR at Rx k is given as

        Γ_k = |h_k^H w_k|² / ( Σ_{j≠k} |h_k^H w_j|² + σ )

    The QoS constraints require that Γ_k ≥ γ_k, ∀k

    Mathematical optimization problem (nonconvex)

        min_{w_k, ∀k}  Σ_k ‖w_k‖²
        s.t.           |h_k^H w_k|² / ( Σ_{j≠k} |h_k^H w_j|² + σ ) ≥ γ_k, ∀k

    Note that the transmit power is represented by ‖w_k‖² = p_k

    In other problems, you may want to design the beamforming direction and the beamforming power independently:

        w_k = √(p_k) w̄_k,  where ‖w̄_k‖² = 1

  • Downlink Beamforming as SDP and SOCP

    Solve the above problem using the relaxed SDP
    - ‖w_k‖² = w_k^H w_k = Tr(w_k w_k^H) = Tr(W_k), where W_k = w_k w_k^H ∈ C^{N×N}
    - |h_k^H w_k|² = (h_k^H w_k)^H (h_k^H w_k) = w_k^H h_k h_k^H w_k = Tr(w_k w_k^H h_k h_k^H) = Tr(W_k H_k), where H_k = h_k h_k^H ∈ C^{N×N}
    - W_k and H_k are both rank-one matrices

    Rearrange the SINR constraints as

        |h_k^H w_k|² / ( Σ_{j≠k} |h_k^H w_j|² + σ ) ≥ γ_k  ⇒  |h_k^H w_k|² ≥ γ_k ( Σ_{j≠k} |h_k^H w_j|² + σ )

    Modify the SINR constraints using the above results:

        Tr(W_k H_k) ≥ γ_k ( Σ_{j≠k} Tr(H_k W_j) + σ )
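Both trace identities can be checked numerically in a few lines of plain Python (hypothetical 2-dimensional complex vectors, not from the slides):

```python
# Numeric check of the identities ||w||^2 = Tr(W) and |h^H w|^2 = Tr(W H),
# with W = w w^H and H = h h^H, for N = 2.
w = [1 + 2j, -0.5 + 1j]
h = [0.3 - 1j, 2 + 0.5j]

def outer(a):                      # a a^H as a nested list of lists
    return [[x * y.conjugate() for y in a] for x in a]

def trace_prod(A, B):              # Tr(A B) for square matrices
    n = len(A)
    return sum(A[i][k] * B[k][i] for i in range(n) for k in range(n))

W, H = outer(w), outer(h)
norm_sq = sum(abs(x)**2 for x in w)
inner = sum(hc.conjugate() * wc for hc, wc in zip(h, w))   # h^H w

assert abs(sum(W[i][i] for i in range(2)).real - norm_sq) < 1e-12
assert abs(trace_prod(W, H) - abs(inner)**2) < 1e-12
```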

  • Downlink Beamforming as SDP and SOCP

    The original problem can be written as (nonconvex)

        min_{W_k, ∀k}  Σ_k Tr(W_k)
        s.t.           Tr(W_k H_k) ≥ γ_k ( Σ_{j≠k} Tr(H_k W_j) + σ ), ∀k
                       W_k ⪰ 0, ∀k
                       rank(W_k) = 1, ∀k

    The above problem is still nonconvex, due to the rank-one constraints

    Ignoring the rank-one constraints, the problem becomes a relaxed SDP, which is convex

  • Downlink Beamforming as SDP and SOCP

    Based on the observation that an arbitrary phase rotation can be added to the beamforming vectors without affecting the SINR functions,

    h_k^H w_k can be chosen to be real without loss of generality.

    Let W = [w_1, ..., w_K] ∈ C^{N×K}. The SINR constraints become

        (1 + 1/γ_k) |h_k^H w_k|² ≥ ‖ [h_k^H W, √σ] ‖²

    Because h_k^H w_k can be assumed real, we can take the square root:

        √(1 + 1/γ_k) h_k^H w_k ≥ ‖ [h_k^H W, √σ] ‖

    which is a second-order cone constraint

    The original problem can be written as an SOCP (convex)

        min_{w_k, ∀k}  Σ_k ‖w_k‖²
        s.t.           √(1 + 1/γ_k) h_k^H w_k ≥ ‖ [h_k^H W, √σ] ‖, ∀k
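The constant (1 + 1/γ_k) can be sanity-checked numerically: dividing the SINR constraint by γ_k and adding |h_k^H w_k|² to both sides gives the SOC form, so the two left-hand sides differ by exactly the factor 1/γ_k. A plain-Python sketch with made-up data (N = K = 2, not from the slides):

```python
# Check that the SINR constraint and its SOC rewriting agree in sign:
#   |h^H w_k|^2 >= gamma * (sum_{j!=k} |h^H w_j|^2 + sigma)
#     <=>  (1 + 1/gamma) |h^H w_k|^2 >= sum_j |h^H w_j|^2 + sigma
h = [1 + 0.5j, -0.3 + 1j]
w_cols = [[0.8 + 0.1j, 0.1 + 0.9j],   # w_1
          [0.2 - 0.4j, 1.0 + 0.0j]]   # w_2
sigma, gamma, k = 0.7, 1.3, 0

def hHw(w):                            # h^H w
    return sum(hi.conjugate() * wi for hi, wi in zip(h, w))

gains = [abs(hHw(w))**2 for w in w_cols]
lhs_sinr = gains[k] - gamma * (sum(gains) - gains[k] + sigma)
lhs_soc = (1 + 1/gamma) * gains[k] - (sum(gains) + sigma)

# Divide the SINR form by gamma and add |h^H w_k|^2 to both sides:
# the SOC left-hand side is exactly (1/gamma) times the SINR one.
assert abs(lhs_soc - (1/gamma) * lhs_sinr) < 1e-9
assert (lhs_sinr >= 0) == (lhs_soc >= 0)
```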

  • Uplink-Downlink Duality via Lagrangian Duality

  • Uplink-Downlink Duality via Lagrangian Duality

    In engineering design, we are interested not only in the numerical solution to the problem, but also in the structure of the optimal solution.

    The Lagrangian dual of the original problem often reveals such structure.

    Uplink-downlink duality refers to the fact that the total transmit power required to satisfy a certain set of SINR constraints in the downlink is equal to the total transmit power required to satisfy the same SINR constraints in the uplink:

        Σ_{k=1}^K p_k = Σ_{k=1}^K q_k

    where p_k is the downlink power and q_k is the uplink power

    Note that p_k does not have to be equal to q_k.

    It is the sum of powers that is equal!

  • Uplink-Downlink Duality via Lagrangian Duality

    The original optimization problem is

        min_{w_k, ∀k}  Σ_k ‖w_k‖² = Σ_k w_k^H w_k
        s.t.           |h_k^H w_k|² / ( Σ_{j≠k} |h_k^H w_j|² + σ ) ≥ γ_k, ∀k

    Lagrangian function

        L(w_k, λ_k) = Σ_k w_k^H w_k − Σ_k λ_k ( (1/γ_k)|h_k^H w_k|² − Σ_{j≠k} |h_k^H w_j|² − σ )

                    = Σ_k λ_k σ + Σ_k w_k^H [ I_N + Σ_{j≠k} λ_j h_j h_j^H − (λ_k/γ_k) h_k h_k^H ] w_k

    Note that [ I_N + Σ_{j≠k} λ_j h_j h_j^H − (λ_k/γ_k) h_k h_k^H ] must be positive semidefinite; otherwise the minimization over w_k is unbounded below and the dual function is −∞

  • Uplink-Downlink Duality via Lagrangian Duality

    The dual optimization problem

        max_{λ_k, ∀k}  Σ_k λ_k σ
        s.t.           Σ_j λ_j h_j h_j^H + I_N ⪰ (1 + 1/γ_k) λ_k h_k h_k^H, ∀k
                       λ_k ≥ 0, ∀k

  • Uplink-Downlink Duality via Lagrangian Duality

    Let us now consider the uplink problem

    The received uplink signal at the Rx with respect to Tx k (after receive combining with w̄_k) is

        y_k = Σ_{j=1}^K √(q_j) w̄_k^H h_j s_j + w̄_k^H n
            = √(q_k) w̄_k^H h_k s_k (desired signal) + Σ_{j≠k} √(q_j) w̄_k^H h_j s_j (interference) + w̄_k^H n (noise)

    The SINR of Tx k in the uplink is given as

        Γ̃_k = q_k |h_k^H w̄_k|² / ( Σ_{j≠k} q_j |h_j^H w̄_k|² + w̄_k^H w̄_k σ )

    The mathematical optimization problem

        min_{q_k, ∀k}  Σ_k q_k
        s.t.           q_k |h_k^H w̄_k|² / ( Σ_{j≠k} q_j |h_j^H w̄_k|² + w̄_k^H w̄_k σ ) ≥ γ_k, ∀k

  • Uplink-Downlink Duality via Lagrangian Duality

    The optimal receive beamforming direction w̄_k is given by the MMSE receiver as

        w̄_k = ρ ( Σ_j q_j h_j h_j^H + σI )⁻¹ h_k

    where ρ is a normalization parameter so that ‖w̄_k‖ = 1

    After substituting w̄_k into the SINR constraints and rearranging, the constraint Γ̃_k ≥ γ_k is equivalent to

        (1 + 1/γ_k) q_k h_k^H ( Σ_j q_j h_j h_j^H + σI_N )⁻¹ h_k ≥ 1, ∀k

    so the uplink problem becomes

        min_{q_k, ∀k}  Σ_k q_k
        s.t.           (1 + 1/γ_k) q_k h_k^H ( Σ_j q_j h_j h_j^H + σI_N )⁻¹ h_k ≥ 1, ∀k

  • Uplink-Downlink Duality via Lagrangian Duality

    The dual optimization problem of the downlink problem

        max_{λ_k, ∀k}  Σ_k λ_k σ
        s.t.           Σ_j λ_j h_j h_j^H + I_N ⪰ (1 + 1/γ_k) λ_k h_k h_k^H, ∀k

    or equivalently (using A ⪰ h h^H ⇔ h^H A⁻¹ h ≤ 1 for A ≻ 0):

        max_{λ_k, ∀k}  Σ_k λ_k σ
        s.t.           (1 + 1/γ_k) λ_k h_k^H ( Σ_j λ_j h_j h_j^H + I_N )⁻¹ h_k ≤ 1, ∀k

    The uplink optimization problem

        min_{q_k, ∀k}  Σ_k q_k
        s.t.           (1 + 1/γ_k) q_k h_k^H ( Σ_j q_j h_j h_j^H + σ I_N )⁻¹ h_k ≥ 1, ∀k

    Note that q_k = λ_k σ maps one constraint set onto the other.

    Thus, both problems are identical, except that the maximization and the minimization are interchanged and the direction of the SINR constraint is reversed
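The sum-power equality can be checked directly in the scalar case N = 1, K = 2, where both problems reduce to 2×2 linear systems when the SINR constraints are active (illustrative gains and targets, not from the slides):

```python
# Scalar check (N = 1, K = 2) that the *sum* powers match.
# With channel gains a = |h1|^2, b = |h2|^2, active SINR constraints give
#   downlink: a p1 = g1 (a p2 + sigma),  b p2 = g2 (b p1 + sigma)
#   uplink:   a q1 = g1 (b q2 + sigma),  b q2 = g2 (a q1 + sigma)
a, b, g1, g2, sigma = 1.0, 2.0, 0.5, 0.5, 1.0   # g1*g2 < 1 ensures feasibility

# Solve each 2x2 system by substitution.
# Downlink: p1 = g1 p2 + g1 sigma/a,  p2 = g2 p1 + g2 sigma/b.
p1 = (g1 * g2 * sigma / b + g1 * sigma / a) / (1 - g1 * g2)
p2 = g2 * p1 + g2 * sigma / b
# Uplink: q1 = g1 (b/a) q2 + g1 sigma/a,  q2 = g2 (a/b) q1 + g2 sigma/b.
q1 = (g1 * (b / a) * g2 * sigma / b + g1 * sigma / a) / (1 - g1 * g2)
q2 = g2 * (a / b) * q1 + g2 * sigma / b

# Individual powers differ, but the sums coincide.
assert abs((p1 + p2) - (q1 + q2)) < 1e-9
assert abs(p1 - q1) > 1e-6            # p_k need not equal q_k
```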

  • Disciplined convex programming and CVX

  • Disciplined convex programming and CVX

    LP solvers
    - lots available (GLPK, Excel, Matlab's linprog, ...)

    Cone solvers
    - typically handle (combinations of) LP, SOCP, and SDP cones
    - several available (SDPT3, SeDuMi, CSDP, ...)

    General convex solvers
    - some available (CVXOPT, MOSEK, ...)

    You could also write your own
    - use some tricks to transform the problem into an equivalent one that has a standard form (e.g., LP, SDP)
    - modeling systems can partly automate this step

  • CVX

    CVX runs in Matlab; models are specified between the cvx_begin and cvx_end commands

    It relies on the SDPT3 or SeDuMi (LP/SOCP/SDP) solvers

    Refer to the user guide and online help for more info

    The CVX example library has more than a hundred examples

  • Example: Constrained norm minimization

    Between cvx_begin and cvx_end, x is a CVX variable

    The statement "subject to" does nothing, but can be added for readability

    Inequalities are treated elementwise

  • What CVX does

    After cvx_end, CVX will

    - transform the problem into an LP
    - call the solver SDPT3
    - overwrite the (object) x with the (numeric) optimal value
    - assign the problem's optimal value to cvx_optval
    - assign the problem's status (which here is Solved) to cvx_status

  • Some useful functions