Analysis and optimization of energy-efficient

massive MIMO wireless relaying systems

A Thesis Submitted

in Partial Fulfillment of the Requirements

for the Degree of

Doctor of Philosophy

by

Ekant Sharma

to the

DEPARTMENT OF ELECTRICAL ENGINEERING

INDIAN INSTITUTE OF TECHNOLOGY, KANPUR

May 2019

Abstract

Billions of subscribers around the world depend on their mobile phones for their professional and private lives. This, however, comes at the cost of increasing energy consumption. Motivated by the need to save energy, we design energy-efficient algorithms for full-duplex (FD) cooperative networks. Relay-based cooperative communication has been extensively investigated to improve the diversity, expand the coverage, increase the data rate, and reduce the power consumption of wireless communication systems. The current generation of relays is predominantly half-duplex (HD), owing to their simplicity of implementation. An HD relay requires two channel uses to send a data packet from the transmitter to the receiver, since the relay cannot transmit and receive within the same time slot. FD technology is becoming popular because of its increased throughput. An FD one-way relay transmits and receives at the same time, and hence theoretically doubles the spectral efficiency (SE) of an HD one-way relay. FD two-way relaying has recently been extended to the multi-pair model, wherein multiple user pairs exchange their data via a shared relay in a single channel use. We consider two-way amplify-and-forward (AF) relaying, where multiple FD user-pairs exchange information via a shared FD massive multiple-input multiple-output (mMIMO) relay. We derive closed-form lower bounds for the achievable rate considering both maximal-ratio-combining/maximal-ratio-transmission and zero-forcing reception/zero-forcing transmission processing at the relay, whilst relying on minimum mean square error relay channel estimation. These closed-form achievable-rate expressions are valid for an arbitrary number of relay antennas N. This is in contrast to asymptotic rate expressions, which are valid only for N → ∞. We numerically demonstrate the accuracy of the derived closed-form expressions and the spectral gains achieved by an FD system over an HD one for various antenna regimes.

Most of the previous mMIMO relaying works maximize the SE. By contrast, we maximize the global energy efficiency (GEE) metric, which captures the Pareto-optimal trade-off between the throughput and the energy consumed. We propose an algorithm to solve the GEE maximization problem, which has a non-convex objective: we first approximate the objective as a pseudo-concave function, and then apply Dinkelbach's approach to optimize it. We compare the GEE of the proposed design with existing FD systems and quantify the improvement achieved by the proposed algorithm. The proposed FD GEE optimization framework can also be used for evaluating the GEE of mMIMO HD AF systems, which has not been investigated in the open literature. We also compare the GEE of the proposed FD system to that of its HD counterparts, and characterize the self-loop and inter-user interference regimes for which the proposed FD system outperforms the HD ones.

We next optimize the weighted sum energy efficiency (WSEE) metric for an FD mMIMO relay system. The WSEE metric is more general than the GEE metric usually optimized in the mMIMO literature, as it can satisfy the heterogeneous energy efficiency (EE) requirements of diverse users, depending for example on their cost. The WSEE optimization, which achieves this objective by allocating different weights to the users, is a hard-to-optimize non-convex problem due to its sum-of-ratios form. The non-convex WSEE metric is optimized by first approximating it as a concave-convex function, and then applying the quadratic transform. We note that the WSEE cannot be optimized using Dinkelbach's algorithm due to its sum-of-ratios form. We also maximize the WSEE by formulating its Karush-Kuhn-Tucker (KKT) conditions, and then exploiting the sum-based nature of the WSEE to simplify them. With these simplifications, which involve challenging differential computations, we propose an iterative algorithm based on a modified waterfilling approach, which yields a KKT point of the non-convex WSEE problem. We note that the waterfilling technique is commonly presented as a closed-form solution for optimal SE, and not for WSEE. To the best of our knowledge, this is the first work to report closed-form waterfilling solutions for finding a KKT point of the non-convex WSEE of two-way FD mMIMO relaying. We numerically show that i) the proposed solutions achieve significant WSEE gains; ii) a suitable choice of weights can help prioritize the EE requirements of different users; and iii) the closed-form WSEE optimization requires three orders of magnitude less computation time than the proposed quadratic-transform-based WSEE optimization.

The decentralized processing architecture, wherein a set of computing units cooperate with each other in a distributed manner to complete a complex task, has recently attracted a lot of attention in mMIMO system design. We optimize the WSEE of an HD mMIMO relay system in a decentralized manner by developing a two-layer decentralized successive convex approximation optimization framework. In the first layer, the non-convex WSEE problem is approximated either as a generic convex program (GCP) or as a second-order cone program (SOCP). In the second layer, the approximated problem is solved in a decentralized manner using the alternating direction method of multipliers. We show that the proposed iterative algorithm yields a KKT point of the original WSEE problem. We numerically show that for low signal-to-noise ratio (SNR), our algorithm simultaneously achieves the optimal WSEE and the optimal sum-rate, while in the medium-to-high SNR region, the sum-rate is degraded to optimize the WSEE. We numerically analyze the effect of weights on the EE of individual users, and show that the proposed framework enables us to meet the heterogeneous EE requirements. We also show, analytically and numerically, that the decentralized SOCP algorithm has lower complexity than its GCP counterpart but yields the same WSEE. Finally, we numerically show that the decentralized approach yields the same WSEE as the centralized one.

Dedicated to

my family members

Acknowledgement

In the first place, I am grateful to God for the good health and well-being that were necessary to complete this thesis.

I would like to express my gratitude to my supervisor Prof. K. Vasudevan, who gave me complete freedom to select a research topic of my interest, and whose invaluable insight into wireless communications has guided my research.

If you are faithful to God, you are probably waiting for that person whom heaven has sent for you. I am extremely indebted to my supervisor Prof. Rohit Budhiraja; without his support and dedicated involvement in every step, this thesis would never have been accomplished. He always motivated me by saying something like ‘we have to complete it by yesterday’ to encourage me to stay in the lab. Whenever I ran into trouble or had a question about my research or personal life, the door to Prof. Rohit Budhiraja's office was always open. Thanks to him, I had the opportunity to work on cutting-edge topics during my PhD journey.

I would like to thank Prof. Adrish Banerjee for his constant motivation during the course of this thesis. I also thank Prof. Aditya K. Jagannatham for the courses he taught, which helped me in understanding the fundamentals of wireless communications. I thank Prof. N Naik, with whom I had many discussions, which were not restricted to technical matters. He always brought sweets or chocolates whenever he visited his hometown or went abroad. I would like to thank my colleagues at WiSDom lab, with whom I always had productive and amusing discussions, which always taught me something. Thanks to Dheeraj, Anupama, Sauradeep, Vishal, Venkatesh, Vikalp, Soumyadeep, Neha, Arpita, Aditya, Tripti, Rakesh, Swadha, Sanket, Avinash and Shreya.

I spent most of my time in the Telematics lab with my colleagues cum friends: Prem, Om Prakash, Shivani, Himanshu, and Dada. It was like my second home; the discussions with them always made the atmosphere in the lab very pleasant and fun. I would also like to mention the people with whom I spent quality time on campus: Prabodh, Sanket, Manish, Ashutosh, Mohan, Saurabh and their families.

On a personal level, I thank my family. Their support has been unconditional all these years; they have cherished with me every great moment and supported me whenever I needed them. My wife Suchita may not help me actively in my research, but the support I get from her makes tough times a lot easier. She helped me in more ways than she can imagine. My son Aaddvik, my good luck charm, fills me with positive energy. My sister Ekta, who did her master's in the same lab at IIT Kanpur, although younger than me, is a constant source of motivation and has always believed in me. Without my parents' constant support, understanding and encouragement, it would not have been possible to achieve my educational goals. My mother, who motivated me towards my studies from my childhood days, is the sole reason for my joining the PhD program. I also thank my in-laws, who were always present whenever I was in need. I am also thankful to my friends cum family, Devvrat and Praween, who are just a phone call away.

(Ekant Sharma)

Contents

Abstract

Acknowledgements

Contents

List of Figures

List of Tables

Abbreviations

Notation

1 Introduction

1.1 Solutions

1.1.1 Use of massive MIMO

1.1.2 Use of relays

1.1.3 Use of full-duplex

1.2 Motivation

1.2.1 Single user-pair relay model

1.2.2 Multi-pair two-way relay model

1.2.3 Performance metrics and their importance

1.2.4 Spectral efficiency

1.2.5 Energy efficiency

1.3 Contribution of the thesis

1.4 Organization of the thesis

2 Multi-pair Two-Way FD mMIMO Relay

2.1 Introduction

2.2 Related work

2.3 Contribution of the present work

2.4 System model

2.4.1 SI suppression

2.5 Channel estimation

2.6 Relay precoder design

2.6.1 MRC/MRT precoder

2.6.2 ZFR/ZFT precoder

2.7 Achievable sum-rate of MRC/MRT and ZFR/ZFT precoders

2.8 Simulation results

2.9 Summary

3 Global Energy Efficiency Optimization

3.1 Introduction

3.2 Related work

3.4 GEE problem formulation

3.5 GEE maximization

3.6 Computational complexity

3.7 Closed-form optimal solution

3.8.1 Global energy-efficiency maximization

3.8.2 FD versus HD GEE comparison

3.8.3 SE comparison with existing FD designs

3.8.4 FD versus HD SE comparison

3.9 Summary

4 Weighted Sum Energy Efficiency Optimization

4.1 Introduction

4.2 Related work

4.4 Problem statement and motivation

4.5 WSEE optimization using quadratic transform

4.6 Closed-form WSEE solution

4.6.1 MRC/MRT processing

4.6.2 ZFR/ZFT processing

4.7 Computational complexity of the algorithms

4.8.1 WSEE maximization

4.8.2 Influence of weights

4.8.3 FD versus HD comparison

4.8.4 Effect of channel estimation error

4.8.5 Algorithm 3 and Algorithm 2 comparison

4.9 Summary

5 Decentralized WSEE Optimization for mMIMO Two-Way HD AF Relaying

5.1 Introduction

5.2 Related work

5.4 System model

5.4.1 Channel estimation

5.4.2 Relay precoder design

5.4.3 Closed-form SE expression

5.5 Weighted sum energy efficiency maximization

5.6 Two-layer decentralized WSEE optimization for MRC/MRT using GCP

5.6.1 SCA framework

5.6.2 Decentralized ADMM approach

5.7 Decentralized WSEE optimization for MRC/MRT processing using SOCP

5.7.1 SOCP SCA approach

5.7.2 Decentralized ADMM approach

5.7.3 Complexity of GCP and SOCP decentralized approaches

5.8 Decentralized WSEE optimization for ZFR/ZFT processing using SOCP

5.9.1 Achievable WSEE

5.9.2 Numerical differences between WSEE and GEE optimization

5.9.3 Convergence of the decentralized algorithms

5.9.4 Run time comparison of GCP and SOCP decentralized algorithms

5.9.5 Effect of channel estimation error

5.9.6 Comparison of decentralized and centralized approaches

5.9.7 Decision on the number of relay antennas N required to maximize WSEE

5.10 Summary

6 Summary and Future Work

6.1 Thesis summary

6.2 Future work

Appendices

A Proof of Proposition 1

B Proof of Proposition 2

C Proof of Theorem 7

D Proof of Theorem 8

E Geometric and fractional programming

F Proof of Lemma 2

G Proof of Closed-form SE expression

References

Publications

List of Figures

1.1 Single antenna versus mMIMO.

1.2 Relays being installed to serve users in the capacity-limited areas.

1.3 Full-duplex: Same channel can be used both ways.

1.4 Half-duplex relay.

1.5 Full-duplex relay.

1.6 FD multi-pair two-way relay model.

1.7 Interplay between Chapters 2–5.

2.9 System model: Relay node consists of massive antenna arrays and each source node has two antennas, one for transmission and another for reception. All users and the relay simultaneously transmit and receive, which leads to SI at the relay (shown by dotted ellipse), at the user (shown by dotted circle) and IUI (shown by solid red line). The users on either side of the relay (marked with bold dotted line at the center) are isolated.

2.10 SE of MRC/MRT and ZFR/ZFT versus $P_R$, where $K = 10$, $P_\rho = 10$ dBm.

2.11 SE versus the number of relay antennas for MRC/MRT and ZFR/ZFT processing, where $\mathrm{SNR}_\rho = 10$ dB.

2.12 SE versus number of user pairs for MRC/MRT and ZFR/ZFT processing, where $N = 128$, $\mathrm{SNR}_\rho = 10$ dB.

2.13 SE versus number of relay antennas for MRC/MRT processing, comparing HD and FD systems, where $\mathrm{SNR} = 10$ dB, $\mathrm{SNR}_\rho = 10$ dB. Here the values of $\sigma_{LIR}^2$ and $\sigma_{UI}^2$ are with respect to $\sigma^2$.

2.14 SE versus number of relay antennas for ZFR/ZFT processing, comparing HD and FD systems, where $\mathrm{SNR} = 10$ dB, $\mathrm{SNR}_\rho = 10$ dB. Here the values of $\sigma_{LIR}^2$ and $\sigma_{UI}^2$ are with respect to $\sigma^2$.

2.15 Chapter 2 – summary.

3.17 GEE comparison of MRC/MRT and ZFR/ZFT processing versus $P_u^{\max}/\sigma^2$ for $P_\rho = 20$ dBm, $K = 5$ and $N = 64$.

3.18 GEE comparison of MRC/MRT and ZFR/ZFT processing versus $N$ for $P_\rho = 20$ dBm and $K = 5$.

3.19 GEE versus $\eta$ comparing FD and HD systems for MRC/MRT with $P_\rho = 20$ dBm.

3.20 SE of MRC/MRT and ZFR/ZFT versus $P_\rho$, comparing the performance of MMSE and LS estimators, where $K = 5$, $N = 256$ and $P_R = 10$ dBm.

3.21 SE of MRC/MRT and ZFR/ZFT versus $\eta$, where $K = 5$, $P_\rho = 20$ dBm, $N = 500$.

3.22 SE versus $\eta = P_u^{\max}/\sigma^2$ comparison of HD system for MRC/MRT, where $K = 10$, $P_\rho = 20$ dB.

3.23 Performance comparison of FD and HD systems: SE versus $\eta = P_u^{\max}/\sigma^2$ for MRC/MRT for $K = 5$, $N = 500$ and $P_\rho = 20$ dB.

3.24 Performance comparison of FD and HD systems: SE versus $\eta = P_u^{\max}/\sigma^2$ for ZFR/ZFT for $K = 5$, $N = 500$ and $P_\rho = 20$ dB.

3.25 Chapter 3 – summary.

4.27 WSEE comparison of MRC/MRT and ZFR/ZFT versus $P_u^{\max}$ for $N = 64$, $K = 3$ and $P_\rho = 20$ dBm.

4.28 WSEE comparison of MRC/MRT and ZFR/ZFT versus $N$ for $P_u^{\max} = 20$ dBm, $K = 3$ and $P_\rho = 20$ dBm.

4.29 EE of each user versus $N$ for different weights: $\Omega_1$: $w_1 = 0.25$, $w_2 = 0.25$, $w_3 = 0.25$, $w_4 = 0.25$ for $D_u$: 0.749, 0.445, 0.346, 0.311, $D$: 0.649, 0.545, 0.231, 0.145, $K = 2$ and $P_u^{\max} = 20$ dBm.

4.30 EE of each user versus $N$ for different weights: $\Omega_2$: $w_1 = 0.1$, $w_2 = 0.1$, $w_3 = 0.4$, $w_4 = 0.4$ for $D_u$: 0.749, 0.445, 0.346, 0.311, $D$: 0.649, 0.545, 0.231, 0.145, $K = 2$ and $P_u^{\max} = 20$ dBm.

4.31 EE of each user versus $N$ for different weights: $\Omega_3$: $w_1 = 0.4$, $w_2 = 0.4$, $w_3 = 0.1$, $w_4 = 0.1$ for $D_u$: 0.749, 0.445, 0.346, 0.311, $D$: 0.649, 0.545, 0.231, 0.145, $K = 2$ and $P_u^{\max} = 20$ dBm.

4.32 WSEE versus $N$ comparing FD and HD systems for MRC/MRT for $K = 3$, $P_u^{\max} = 20$ dBm and $P_\rho = 20$ dBm.

4.33 WSEE versus $N$ comparing FD and HD systems for ZFR/ZFT for $K = 3$, $P_u^{\max} = 20$ dBm and $P_\rho = 20$ dBm.

4.34 Effect of channel estimation errors for $N = 64$, $K = 3$ and $P_\rho = 20$ dBm.

4.35 Comparison of QT and closed-form KKT algorithms: WSEE versus $N$ for MRC/MRT with $P_u^{\max} = 30$ dBm, $K = 3$.

4.36 Comparison of QT and closed-form KKT algorithms: WSEE versus $N$ for ZFR/ZFT with $P_u^{\max} = 30$ dBm, $K = 3$.

4.37 Runtime versus $N$ with MRC/MRT.

4.38 Chapter 4 – summary.

5.40 System model for multi-pair two-way HD AF relaying.

5.41 WSEE comparison of the proposed decentralized GCP and SOCP algorithms versus $\eta = P_t^{\max}/\sigma^2$ for MRC/MRT for $K = 5$ and $N = 64$.

a) $\eta = P_t^{\max}/\sigma^2$ for ZFR/ZFT for $K = 5$ and $N = 64$.

$K$ comparing both MRC/MRT and ZFR/ZFT for $N = 64$ and $\eta = 20$ dB.

5.44 EE of each user versus $N$, when GEE is maximized for $D$: 0.749, 0.445, 0.246, 0.111, weights $w_1 = 0.05$, $w_2 = 0.15$, $w_3 = 0.4$, $w_4 = 0.4$, $K = 5$ and $\eta = 20$ dB and MRC/MRT processing.

5.45 EE of each user versus $N$, when WSEE is maximized for $D$: 0.749, 0.445, 0.246, 0.111, … processing.

5.46 Comparison of GEE obtained on maximizing WSEE and GEE for $D$: 0.749, 0.445, 0.246, 0.111, … processing.

5.47 WSEE versus total number of iterations, $L$, for proposed decentralized GCP algorithm with MRC/MRT processing.

5.48 WSEE versus total number of iterations, $L$, for proposed decentralized GCP algorithm with ZFR/ZFT processing.

5.49 WSEE versus total number of iterations, $L$, for proposed decentralized SOCP algorithm with MRC/MRT processing.

5.50 Average runtime versus user-pairs, $K$, comparing GCP and SOCP formulations for decentralized algorithm with MRC/MRT processing for $N = 64$ and $\eta = 20$ dB.

5.51 Average runtime versus user-pairs, $K$, comparing GCP and SOCP formulations for decentralized algorithm with ZFR/ZFT processing for $N = 64$ and $\eta = 20$ dB.

5.52 WSEE versus $N$ for different values of channel estimation errors $\sigma_{\xi,k}^2$ for MRC/MRT processing with $K = 5$ and $\eta = 10$ dB.

5.53 WSEE versus $\eta$ comparison of proposed centralized SOCP and decentralized SOCP algorithms for MRC/MRT processing.

5.54 WSEE versus $\eta$ comparison of proposed centralized SOCP and decentralized SOCP algorithms for ZFR/ZFT processing.

$N$ for MRC/MRT with $\eta = 20$ dB.

5.56 Chapter 5 – summary.

List of Tables

3.2 Summary of mMIMO literature focusing on SE and GEE.

3.3 System parameters.

5.1 Summary of mMIMO literature focusing on multi-pair HD AF relay.

5.2 System parameters.

Abbreviations

3G third generation

4G fourth generation

5G fifth generation

ADMM alternating direction method of multipliers

AF amplify-and-forward

AWGN additive white Gaussian noise

BC broadcast

CCFP concave-convex fractional program

CDMA code division multiple access

CSI channel state information

D2D device-to-device

DC difference of convex

EE energy efficiency

EPA equal power allocation

FD full-duplex

FP fractional programming

GCP generic convex program

GEE global energy efficiency

GP geometric program

HD half-duplex

HSDPA high speed downlink packet access

i.i.d. independent and identically distributed

IoT internet-of-things

IP inter-pair interference

IUI inter-user-interference

KKT Karush-Kuhn-Tucker

LIR self-loop interference at relay

LO local oscillator

LoS line-of-sight

LS least square

MAC multiple access

MIMO multiple-input multiple-output

MMFP max-min fractional program

mMIMO massive MIMO

MMSE minimum mean square error

MRC maximal-ratio-combining

MRT maximal-ratio-transmission

NR noise from the relay

NU noise at the user

OFDM orthogonal frequency division multiplexing

PC pseudo concave

pdf probability density function

QC quasi-concave

QT quadratic transform

RF radio frequency

RHS right hand side

RPA random power allocation

SCA successive convex approximation

SE spectral efficiency

SI self-loop interference

SIC self-interference-cancellation

SISO single-input single-output

SINR signal-to-interference-plus-noise ratio

SNR signal-to-noise ratio

SOCP second order cone program

SRM sum-rate maximization

WSEE weighted sum energy efficiency

ZFT zero-forcing-transmission

ZFR zero-forcing reception

Notation

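$\triangleq$ defined as equal to

$\approx$ approximately equal to

$\ll$ much less than

$\gg$ much greater than

$\mathbf{a}$ vector

$\mathbf{A}$ matrix

$\mathbf{A}^{*}$ complex conjugate of a matrix $\mathbf{A}$

$\mathbf{A}^{T}$ transpose of a matrix $\mathbf{A}$

$\mathbf{A}^{H}$ conjugate transpose of a matrix $\mathbf{A}$

$[\mathbf{A}]_{i,j}$ $(i, j)$th element of a matrix $\mathbf{A}$

$[\mathbf{A}]_{k,k}$ $k$th diagonal element of the matrix $\mathbf{A}$

$\mathbb{E}[\cdot]$ expectation operator

$\mathrm{Var}[\cdot]$ variance operator

$\mathrm{Tr}(\cdot)$ trace operator

$\|\mathbf{a}\|$ $\ell_2$ norm of a vector $\mathbf{a}$

$a_n$ $n$th element of a vector $\mathbf{a}$

$\mathrm{diag}(\mathbf{a})$ a (square) diagonal matrix with the elements of $\mathbf{a}$ on its main diagonal

$\mathrm{blkdiag}\{\mathbf{A}_1, \cdots, \mathbf{A}_N\}$ a block diagonal matrix with $\mathbf{A}_1$ to $\mathbf{A}_N$ as its diagonal blocks

$|c|$ magnitude of a complex scalar $c$

$\mathbf{I}_N$ an $N \times N$ identity matrix

$\mathbf{0}_N$ all-zero $N \times N$ matrix

$\mathbf{1}_k$ a $2K \times 1$ vector with value one in the $k$th row and zero otherwise

$\mathcal{CN}(\mathbf{0}, \boldsymbol{\Sigma})$ circularly-symmetric complex Gaussian distribution with covariance matrix $\boldsymbol{\Sigma}$

$\mathbb{Z}^{r \times s}$ an integer matrix of dimension $r \times s$

$\mathbb{R}^{r \times s}$ a real matrix of dimension $r \times s$

$\mathbb{C}^{r \times s}$ a complex matrix of dimension $r \times s$

$\hat{h}$ an estimate of the true parameter value $h$

$K$ number of user-pairs

$N$ number of relay antennas

$S_m$ $m$th user

$T$ coherence interval

$\tau$ number of pilots

$p_k$ transmit power of the $k$th user

$P_R$ transmit power at the relay

$\mathbf{G}$ and $\mathbf{F}$ channel matrices

$\Omega_{k,k}$ self-loop interference at user

$\Omega_{k,j}, \forall\, k \neq j$ inter-user interference

$\mathbf{G}_{RR}$ self-loop interference at relay

$z_k$ noise at the $k$th user

$\mathbf{z}_R$ noise at the relay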

Chapter 1

Introduction

Roughly two decades ago, there was only voice communication. Then we moved to 3G, which supported picture sharing and Bluetooth along with internet access (144 kbit/s (CDMA) and 2.1 Mbit/s (HSDPA)). The rollout of 4G completely transformed the app and content ecosystem. Be it social media, video streaming, online payments or ordering a taxi, everything has been made easy and fast. Over time, the demands of the internet-using public have also increased: people are no longer only browsing the internet, but are also uploading and downloading high-definition photos, videos, TV shows, movies and much more. The next-generation wireless technology, 5G, will not only bring higher speeds to mobile phones (up to 10 Gbit/s), but will also be the network for billions of devices, enabling fast (≈ ms latency) and secure connectivity. 5G will be used in smart homes, smart farming, self-driving cars, drone operation, healthcare, security and surveillance, machine-to-machine communication, industrial internet-of-things (IoT)/industrial automation and robotics, logistics and shipping, smart grid control, etc.

A natural concern is that if 5G offers higher rates, what happens to energy consumption? Will a 100-fold increase in data rate in 5G lead to a 100-fold increase in energy consumption? The energy efficiency (EE) metric, which captures the Pareto-optimal trade-off between throughput and energy consumption, has recently drawn attention as a useful performance measure. With 4G, a cellular site delivering 28 Mbit/s consumes 1.35 kW of power, leading to an EE of about 20 kbit/Joule [1]. 5G devices aim to improve spectral efficiency (SE) and EE simultaneously. EE is, therefore, an integral part of all proposed 5G solutions and technologies. The objective is to slash the energy requirement to 10 percent of that of current 4G networks.
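As a quick check of the figures quoted above, the EE is the delivered throughput divided by the consumed power. The short Python sketch below reproduces the 4G number and, purely as an illustrative assumption rather than a result from [1], shows what the 10 percent energy target would imply if the throughput were held fixed. The function and variable names are hypothetical.

```python
def energy_efficiency(throughput_bps, power_w):
    """Bits delivered per joule consumed (1 W = 1 J/s)."""
    return throughput_bps / power_w

# Figures quoted in the text for a 4G cellular site.
throughput_bps = 28e6   # 28 Mbit/s
power_w = 1.35e3        # 1.35 kW

ee_4g = energy_efficiency(throughput_bps, power_w)

# Assumption for illustration only: same throughput at 10 percent of the power.
ee_target = energy_efficiency(throughput_bps, 0.10 * power_w)

print(f"4G site EE: {ee_4g / 1e3:.1f} kbit/J")             # 20.7 kbit/J
print(f"EE at 10% power: {ee_target / 1e3:.1f} kbit/J")    # 207.4 kbit/J
```

At a fixed throughput, cutting the power to one tenth raises the EE tenfold; if the data rate also grows, the EE gain is correspondingly larger.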

1.1 Solutions

In this thesis, we focus on obtaining higher SE and EE simultaneously when direct communication between a transmitter-receiver pair is not feasible due to a weak direct link. To this end, we next discuss some recent breakthrough technologies that are being considered for future 5G cellular systems.

1.1.1 Use of massive MIMO

Current-generation wireless networks use multiple-input multiple-output (MIMO) with two transmit and two receive antenna elements, which can increase the capacity of a single-input single-output (SISO) link by a factor of two. Massive MIMO (mMIMO) systems, with a large number of antennas as shown in Fig. 1.1, have become popular since they serve multiple users on the same time-frequency resource, resulting in huge gains in capacity, reliability and network density. The extra antennas focus the transmission and reception of signal energy into smaller regions of space, bringing huge improvements in throughput and radiated EE.

An mMIMO system can cancel co-channel interference using simple linear transmit processing schemes, such as zero-forcing transmission (ZFT) and maximal-ratio transmission (MRT) [2–7], which significantly improves the SE with reduced complexity. mMIMO also improves the EE, since the total energy consumption is now divided between multiple users. Due to these benefits, mMIMO has been adopted as a wireless technology for 5G. Finally, a word of caution: ‘massive’ does not refer to the physical size; instead, it signifies the large number of antennas.


Fig. 1.1: Single antenna versus mMIMO.

1.1.2 Use of relays

Relay-based communication has been extensively investigated to expand the coverage, improve the diversity, increase the data rate, and reduce the energy consumption of wireless communication systems [8–12]. The basic idea is simple: a source node transmits a signal to a relay, and the relay processes the received signal and then forwards it to the destination node. The relay deployment, as shown in Fig. 1.2, helps in improving coverage and capacity when the capacity of the link between the base station and the user is i) low due to increased path loss (e.g., at the cell edge, behind large obstacles, or underground); or ii) zero because the user node lies outside the coverage of the base station.

The base stations are connected to the core network via a backhaul wired link in a

traditional cellular network. Relays can be used to create pure wireless networks without

any need for base stations. Vehicles can communicate via relay and exchange information

about traffic jams, road conditions, etc. The relay, therefore, in addition to improving coverage and capacity, also has cost benefits over a base station installation due to a reduction in backhaul, deployment and equipment costs.


Fig. 1.2: Relays being installed to serve users in the capacity-limited areas.

1.1.3 Use of full-duplex

Current wireless technology employs a time-division or frequency-division approach to exchange information, in which either time or frequency is divided into orthogonal slots, resulting in half-duplex (HD) communication. Most current communication systems contain nodes (e.g., base stations, mobiles or relays) which can work both as transmitters and receivers. Full-duplex (FD) communication (illustrated in Fig. 1.3), where information can be exchanged in both directions on the same frequency at the same time, has the potential to double the SE [13–19].

Fig. 1.3: Full-duplex: Same channel can be used both ways.

The main hindrance in implementing FD systems is the large self-loop interference (SI) that a node's own transmission causes at its receiver, which is much stronger than the signal received from its partner node. However, advances in SI cancellation schemes now facilitate the use of FD communications, resulting in higher rates than a HD system [13–19]. SI cancellation schemes are mainly categorized into three types: i) passive isolation; ii) time-domain cancellation; and iii) spatial suppression.

1.2 Motivation

A relay is characterized by the type of signal processing it uses. The most common relaying strategies are: i) decode-and-forward (DF), where the relay decodes the source signal and transmits the re-encoded signal; and ii) amplify-and-forward (AF), where the relay amplifies the source signal and transmits it to the destination node. It is known that a DF relay can give better performance than an AF relay, as it cleans up the noise. An AF relay, however, does not require complicated hardware for error control, and is therefore computationally simple and low cost. We, therefore, focus on the AF relay protocol in this thesis. AF relays are not yet commonplace, but are expected to be in the next release of 5G systems, where integrated backhaul and access networks are being investigated.


1.2.1 Single user-pair relay model

Relays can operate either in HD mode or in FD mode. A HD relay can either transmit or receive, but not both, within a given time slot. The one-way HD relay (shown in Fig. 1.4a) requires four channel uses for the exchange of information between the user-pair, whereas the two-way HD relay (shown in Fig. 1.4b) requires two channel uses and therefore has a higher SE than a HD one-way relay [11].

Fig. 1.4: Half-duplex relay: (a) one-way; (b) two-way.

A FD one-way relay (shown in Fig. 1.5a) can simultaneously transmit and receive on

the same channel, and can theoretically double the SE, when compared with a HD one-way

relay, by transmitting one data unit per channel use [13]. Two-way FD relaying (shown in

Fig. 1.5b), wherein a user-pair exchanges two data units in one channel use via a relay [20],

further improves the system SE.

Fig. 1.5: Full-duplex relay: (a) one-way; (b) two-way.

1.2.2 Multi-pair two-way relay model

FD two-way relaying has recently been extended to the multi-pair model [21–23], wherein multiple user pairs exchange their data via a shared relay in a single channel use. Fig. 1.6 depicts a multi-pair AF two-way FD relaying system with a large number of relay antennas, in which K FD user-pairs communicate via a single FD relay on the same time-frequency resource. Specifically, the user on one side of the relay wants to send data to, and receive data from, the user on the other side of the relay; the two sides of the relay in the figure are distinguished using the bold dotted line. We will discuss this architecture in more detail in Chapter 2.

Fig. 1.6: FD multi-pair two-way relay model.

A multi-pair two-way FD relay system has the following interference sources: i) co-channel (inter-pair) interference, due to multiple users simultaneously accessing the channel; ii) SI at the relay and at the users, due to the FD operation; and iii) inter-user interference (IUI), which arises because, due to the FD operation, the transmissions of users interfere with the reception of other users on the same side of the relay [22].

The multi-pair relay model can be applied to scenarios where multiple users in a location want to exchange data without loading the traditional cellular network, e.g., outdoor concerts at stadiums or indoor exhibition halls, which are usually single-cell. These users could also be located in an area where traditional cellular networks are disrupted, e.g., due to a natural disaster.

1.2.3 Performance metrics and their importance

1.2.4 Spectral efficiency

The SE, which measures how well the spectrum resources are used, is one of the important criteria used to define wireless communication performance. We will first define the SE of the $k \to k'$ link. Let us assume that i) there are $2K$ users in the network; ii) $p_k$ denotes the transmit power of the $k$th user; and iii) $\mathrm{SINR}_k$ represents the signal-to-interference-noise ratio experienced by the $k$th user. The sum SE, measured in bits/sec/Hz, is defined as

$$\mathrm{SE} = \sum_{k=1}^{2K} \log_2\left(1 + \mathrm{SINR}_k\right). \tag{1.1}$$
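The sum in (1.1) can be evaluated directly once the per-user SINRs are known. The following Python sketch does so for hypothetical linear-scale SINR values, chosen only for illustration and not taken from the thesis.

```python
import math

def sum_spectral_efficiency(sinr):
    """Sum SE in bits/sec/Hz as in (1.1): sum over users of log2(1 + SINR_k)."""
    return sum(math.log2(1.0 + s) for s in sinr)

# Hypothetical linear-scale SINRs for 2K = 4 users (K = 2 user-pairs).
sinr_linear = [1.0, 3.0, 7.0, 15.0]
se = sum_spectral_efficiency(sinr_linear)
print(f"sum SE = {se:.2f} bits/sec/Hz")
```

With these values, the four links contribute 1, 2, 3 and 4 bits/sec/Hz, respectively.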

1.2.5 Energy efficiency

The next-generation wireless technology, 5G, will not only bring higher speeds to mobile phones (up to 10 Gbps), but will also be the network for billions of devices, enabling fast (≈ ms) and secure connectivity. A natural concern is that, if 5G offers higher rates, what happens to the energy consumption? Will a 100-fold increase in data rate in 5G lead to a 100-fold increase in energy consumption? Due to environmental and economic concerns, EE, which measures the number of bits transmitted per Joule of energy expended, has emerged as a prominent performance metric. 5G aims at achieving 100 times higher EE than 4G. Traditional resource allocation techniques aimed to maximize the amount of information that is reliably transmitted; the focus has now shifted towards maximizing the amount of information that is reliably transmitted per Joule of energy expended.

A natural question is how best to define the EE of a network which consists of multiple users. In the literature, there exist two commonly used EE metrics. We will now introduce them, describing their advantages and disadvantages. We first define the EE of the $k \to k'$ user link, in bits/Joule, as follows

$$\mathrm{EE}_k = \frac{\log_2\left(1 + \mathrm{SINR}_k\right)}{\mu p_k + P_{c,k}}, \tag{1.2}$$

where $\mu$ denotes the inverse of the transmit power amplifier efficiency and $P_{c,k}$ denotes the circuit power consumption in the $k \to k'$ link.


1.2.5.1 Global energy efficiency

The global energy efficiency (GEE) metric strikes a Pareto-optimal balance between the throughput and the energy consumed, and has recently drawn attention as a useful performance measure. The GEE is defined as [24]

$$\mathrm{GEE} = \frac{\sum_{k=1}^{2K} \log_2\left(1 + \mathrm{SINR}_k\right)}{\sum_{k=1}^{2K} \left(\mu p_k + P_{c,k}\right)}. \tag{1.3}$$

The GEE metric, therefore, defines how efficiently the network resources are utilized to generate the desired number of bits. We note that the GEE is a network-centric metric and cannot prioritize the EEs of different user links [25, 26]. Such prioritization is commonly required in a heterogeneous network where an energy-strapped user places a high premium on its EE.

1.2.5.2 Weighted sum energy efficiency

The weighted sum energy efficiency (WSEE) metric [25, 26], which is the weighted sum of the EEs of different users, can prioritize the EE requirements of various users by assigning them predefined weights. The WSEE metric is, therefore, defined as follows [24]

$$\mathrm{WSEE} \triangleq \sum_{k=1}^{2K} w_k \mathrm{EE}_k = \sum_{k=1}^{2K} w_k \frac{\log_2\left(1 + \mathrm{SINR}_k\right)}{\mu p_k + P_{c,k}}, \tag{1.4}$$

where $w_k$ for $k = 1, \cdots, 2K$ are the predefined weights which are used to prioritize the EE of different user links. The WSEE, however, due to its sum-of-ratios form, is a hard-to-tackle non-convex objective.
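To make the difference between the two metrics concrete, the following Python sketch evaluates (1.2), (1.3) and (1.4) for a small set of hypothetical numbers. The SINRs are treated as fixed inputs here, whereas in the actual system they depend on the transmit powers; all numerical values and function names are illustrative assumptions.

```python
import math

def link_ee(sinr_k, p_k, pc_k, mu):
    """Per-link EE as in (1.2): log2(1 + SINR_k) / (mu * p_k + P_{c,k})."""
    return math.log2(1.0 + sinr_k) / (mu * p_k + pc_k)

def gee(sinr, p, pc, mu):
    """GEE as in (1.3): ratio of the sum rate to the total consumed power."""
    total_rate = sum(math.log2(1.0 + s) for s in sinr)
    total_power = sum(mu * pk + pck for pk, pck in zip(p, pc))
    return total_rate / total_power

def wsee(sinr, p, pc, mu, w):
    """WSEE as in (1.4): weighted sum of the per-link EEs."""
    return sum(wk * link_ee(s, pk, pck, mu)
               for wk, s, pk, pck in zip(w, sinr, p, pc))

# Hypothetical values for 2K = 4 users (illustration only).
sinr = [1.0, 3.0, 7.0, 15.0]   # linear-scale SINRs
p = [0.5, 0.5, 1.0, 1.0]       # transmit powers
pc = [1.0, 1.0, 2.0, 2.0]      # circuit powers
mu = 2.0                       # inverse power-amplifier efficiency

ee_links = [link_ee(s, pk, pck, mu) for s, pk, pck in zip(sinr, p, pc)]
gee_value = gee(sinr, p, pc, mu)
wsee_equal = wsee(sinr, p, pc, mu, [1.0] * 4)
wsee_prio = wsee(sinr, p, pc, mu, [2.0, 1.0, 1.0, 1.0])  # prioritize user 1
print(ee_links, gee_value, wsee_equal, wsee_prio)
```

The GEE collapses the network into a single ratio, while the WSEE keeps one ratio per link, so that changing the weight of one user changes how much its EE counts in the objective.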


1.3 Contribution of the thesis

The main contributions of this thesis are summarized as follows.

• We derive closed-form lower-bounds for the achievable rate of the multi-pair two-way

AF FD mMIMO relay for an arbitrary number of relay antennas. We consider both

maximal-ratio-combiner (MRC)/MRT and zero-forcing reception (ZFR)/ZFT process-

ing at the relay, based on the minimum mean squared error (MMSE) relay channel

estimation. These closed-form achievable-rate expressions, which are valid for an arbitrary number of relay antennas, are more general than the ones in the existing mMIMO relaying literature.

• We optimally allocate power to maximize GEE by using the closed-form achievable rate

expressions derived. We propose an algorithm to solve GEE maximization, which has a

non-convex objective, wherein we first approximate the objective as a pseudo-concave

(PC) function, and later choose Dinkelbach’s approach to optimize it. The achievable

rate expressions, and consequently GEE optimization, developed herein are applicable

to any antenna configuration. Furthermore, the expressions and the analysis developed

herein are significantly more complex due to the coupling of channels in AF relaying,

and both the SI and IUI imposed by the FD nodes. We numerically compare the GEE

obtained using the proposed algorithm to the equal-power approach of [22].

• The proposed FD GEE optimization framework can also be used for evaluating the GEE of mMIMO HD AF systems, which has not been investigated in the open literature.

We compare GEE of both FD and HD systems and numerically quantify the SI and

IUI values for which a FD system has a better GEE than a HD system. We also show

the significantly improved GEE of optimal power allocation over the equal-power GEE

analyzed in [27].

• We propose a novel Quadratic Transform (QT)-based approach to optimize WSEE for

our system model which is a coupled function of the i) rate and total power consumption


in the WSEE numerator and denominator, respectively; and ii) optimization variables, i.e., the user powers $p_k, \forall k$, and the relay power $P_R$. We recursively apply the QT twice, firstly to

decouple the rate and total power consumption of each user, by expressing WSEE as

a difference of rate and power consumption, and secondly to decouple the numerator

and denominator of the signal-to-interference-noise ratio (SINR) terms in the rate ex-

pression. Now for the decoupled problem, to converge to a stationary point of the

original problem, we require the objective to be concave. However, the recast objective

is non-concave. We then propose a novel transformation which enables us to express

the non-concave term as a difference of convex (DC) functions. We then linearize the

concave part using the successive convex approximation (SCA) technique, and keep the convex part unchanged. This helps in exploiting the convexity of the non-convex coupled variable. The modified problem is now concave, and the algorithm therefore converges to a stationary point.

Novel QT-based approach: This work therefore provides a framework to optimize the

NP-hard sum-of-ratio non-convex WSEE problem, which has coupled optimization vari-

ables, and can be summarized as follows: i) recursive QT application to decouple the

sum-of-ratio WSEE problem by expressing it equivalently as a difference of the rate and

the power consumed; ii) novel transformation which expresses the non-concave terms

as DC functions; and iii) application of SCA technique to linearize only the concave

term to make the WSEE optimization concave.

• We also maximize the WSEE by formulating its Karush–Kuhn–Tucker (KKT) condi-

tions, and then by exploiting the sum-based nature of the WSEE to simplify them.

With these simplifications, which involve challenging differential computations, we pro-

pose an iterative algorithm based on modified waterfilling approach [28], which yields a

KKT point of the non-convex WSEE problem. We note that the waterfilling technique

is commonly shown as a closed-form solution for optimal SE, and not WSEE. To the

best of our knowledge, this is the first work which has reported closed-form waterfilling


solutions to find a KKT point of non-convex WSEE of two-way FD mMIMO relaying.

The closed-form waterfilling solution derived herein performs only a bisection search on the Lagrange multiplier, and does not require convex optimization software. We numerically show that the closed-form WSEE optimization requires three orders of magnitude less computational time than the proposed quadratic-transformation-based WSEE optimization.

Novel approach to derive KKT point: We show how to simplify KKT conditions for

sum-based objective functions and employ a modified waterfilling approach to derive a

KKT point.

• The proposed FD WSEE framework can also be used to evaluate the WSEE of mMIMO HD systems, which, to the best of our knowledge, has not been investigated. We compare the WSEE of the FD and HD systems and numerically determine the SI and IUI values for which a FD system has a better WSEE than a HD system. We numerically

show that the WSEE optimization framework developed herein enables us to set suitable

weights to satisfy the heterogeneous EE requirements of various users in the network.

• We optimize the non-convex WSEE of a multi-pair two-way AF HD mMIMO relay system by developing a two-layer decentralized SCA optimization framework. The first layer approximates the non-convex WSEE either as a generic convex program (GCP) or as a second-order cone program (SOCP). The second layer solves the approximated problem in a decentralized manner using the alternating direction method of multipliers (ADMM).

We show that the proposed iterative algorithm yields a KKT point of the original

WSEE problem.

• We numerically show that the proposed decentralized algorithms yield higher WSEE

than the i) equal-power allocation (EPA) scheme in [29]; ii) sum-rate maximization

(SRM) scheme in [30]; and iii) random power allocation (RPA) scheme. We also show

that the proposed decentralized approach i) can set the weights to prioritize the EE requirements of one user over another; and ii) achieves the same WSEE as that obtained using the centralized optimization approach.
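Dinkelbach's approach, mentioned in the contributions above, replaces the maximization of a ratio by a sequence of parametric problems. The following Python sketch illustrates the idea on a toy single-link ratio $\log_2(1 + g p)/(\mu p + P_c)$, where the inner problem has a closed-form solution. It is not the multi-user GEE problem of this thesis, which first requires the pseudo-concave approximation; the function name, the channel gain `g` and all numerical values are hypothetical.

```python
import math

def dinkelbach_single_link(g, mu, pc, p_max, tol=1e-9, max_iter=100):
    """Maximize log2(1 + g*p) / (mu*p + pc) over 0 <= p <= p_max (toy example)."""
    def rate(p):
        return math.log2(1.0 + g * p)

    def power(p):
        return mu * p + pc

    lam = 0.0  # current guess of the optimal ratio
    p = p_max
    for _ in range(max_iter):
        # Inner concave problem: maximize rate(p) - lam * power(p) over p.
        if lam > 0.0:
            # Stationary point of the inner objective, clipped to the feasible set.
            p = 1.0 / (lam * mu * math.log(2.0)) - 1.0 / g
            p = min(max(p, 0.0), p_max)
        else:
            p = p_max  # with lam = 0 the rate alone is maximized at full power
        # Stop once the parametric objective is (numerically) zero.
        if abs(rate(p) - lam * power(p)) < tol:
            break
        lam = rate(p) / power(p)  # Dinkelbach update of the ratio
    return p, rate(p) / power(p)

p_star, ee_star = dinkelbach_single_link(g=10.0, mu=2.0, pc=1.0, p_max=5.0)
print(f"p* = {p_star:.4f}, EE* = {ee_star:.4f}")
```

In this toy setting the EE-optimal power lies strictly inside the feasible interval, which reflects the general observation that transmitting at full power maximizes the rate but not necessarily the EE.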

1.4 Organization of the thesis

The rest of the thesis is organized as follows.

Chapter-2: We first explain in detail the system model for the multi-pair two-way FD relay-based mMIMO system. We consider both MRC/MRT and ZFR/ZFT processing at the relay and derive novel analytical expressions for the SE of the considered system. We compare, both analytically and numerically, the SE of the FD system derived in this work with the SE of a HD system.

Chapter-3: This Chapter focuses on the network-centric GEE metric. We first discuss the

power consumption model for the considered system. Using the closed-form achievable rate

expressions derived in Chapter 2, we optimally allocate power to maximize GEE. We propose

an algorithm to solve GEE maximization, which has a non-convex objective, wherein we first

approximate the objective as a PC function, and later choose Dinkelbach’s approach to opti-

mize it. We also provide a closed-form power allocation solution for the GEE maximization

problem by formulating its KKT conditions and finally numerically compare the performance

of the proposed GEE algorithm to the equal-power approach.

Chapter-4: This Chapter focuses on the user-centric WSEE metric. We design an optimal power allocation algorithm to maximize the WSEE of the multi-pair full-duplex two-way relay system. We also maximize the WSEE by formulating its KKT conditions, and then by

exploiting the sum-based nature of the WSEE to simplify them. We numerically show that

the WSEE optimization framework developed herein allows better control of the individual

user EE by allocating a different weight to it.

Chapter-5: This Chapter focuses on decentralized WSEE optimization. We develop a two-

layer SCA framework to decentrally maximize WSEE of multi-pair mMIMO two-way AF

HD relaying for both MRC/MRT and ZFR/ZFT relay processing. In the first layer, we cast

WSEE optimization as a GCP/SOCP. In the second layer, we use the ADMM algorithm to


solve the approximated GCP/SOCP obtained at the nth iteration of the SCA approach in a

decentralized manner for both ZFR/ZFT and MRC/MRT processing. We also numerically

compare the WSEE of the decentralized algorithm with that of the equal-power allocation, sum-rate maximization and random power allocation schemes.

Chapter-6: We conclude the thesis with a summary of major findings and provide insights

on directions for future research.

We highlight the interplay between Chapters 2–5 in Fig. 1.7.

Fig. 1.7: Interplay between Chapters 2–5.

Chapter 2

Multi-pair Two-Way FD mMIMO Relay

2.1 Introduction

In this chapter, we present mathematical models for a multi-pair two-way FD mMIMO relay, wherein multiple user pairs exchange their data via a shared relay in a single channel use. Multi-pair two-way FD relaying, however, suffers from SI, from the transmit antennas to the receive antennas of the relay and of each user, and from IUI between users on the same side of the relay, in addition to the co-channel (inter-pair) interference due to multiple users simultaneously accessing the channel. The proposed system model can be applied to scenarios where multiple users in a location want to exchange data without loading the traditional cellular network, e.g., outdoor concerts at stadiums or indoor exhibition halls, which are usually single-cell. These users could also be located in an area where traditional cellular networks are disrupted, e.g., due to a natural disaster. A massive FD MIMO relay could then be set up to serve these users. We next survey the state of the art for FD mMIMO relay systems.


2.2 Related work

FD systems have received significant attention, due to their ability to recover the bandwidth

loss induced by HD relaying [31–35]. Recent studies on FD show that the SI caused by transmission and reception on the same channel can be significantly suppressed [2], [3].

Before applying any SI cancellation technique, the SI channel due to the proximity of transmit

and receive antennas contains a strong line-of-sight (LoS) component [15]. Consequently

the magnitude of the SI channel can be modeled as a Rician distribution with a large K-

factor [15]. However, after employing active and passive SI suppression techniques [16, 19]

the strong LoS component gets significantly suppressed. The magnitude of the residual SI

channel can, consequently, be modeled as a Rician distribution with a small K-factor [15].

Further, the inter-antenna distance is much smaller than the distance between the receive and transmit antennas. This results in i.i.d. Rayleigh-faded SI channels between the transmit and receive antennas. For these reasons, the existing FD literature, e.g., [19, 21, 22], models the SI channel as Rayleigh-faded.
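The role of the K-factor in the argument above can be illustrated numerically. The following Python sketch draws samples of a unit-power Rician channel tap for a large, a moderate and a zero K-factor, and reports the mean and spread of its magnitude. The unit-power normalization, the sample size and the function names are assumptions made for this illustration; they are not taken from [15].

```python
import math
import random

def rician_sample(rng, k_factor):
    """One unit-power Rician tap: fixed LoS part plus CN(0, 1/(K+1)) scatter."""
    los = math.sqrt(k_factor / (k_factor + 1.0))
    s = math.sqrt(0.5 / (k_factor + 1.0))  # per-dimension std of the scattered part
    return complex(los + rng.gauss(0.0, s), rng.gauss(0.0, s))

def magnitude_stats(rng, k_factor, n_samples=20000):
    """Sample mean and standard deviation of |h| for the given K-factor."""
    mags = [abs(rician_sample(rng, k_factor)) for _ in range(n_samples)]
    mean = sum(mags) / n_samples
    var = sum((m - mean) ** 2 for m in mags) / n_samples
    return mean, math.sqrt(var)

rng = random.Random(1)
for k_factor in (100.0, 1.0, 0.0):
    mean, std = magnitude_stats(rng, k_factor)
    print(f"K = {k_factor:6.1f}: mean |h| = {mean:.3f}, std |h| = {std:.3f}")
```

For a large K-factor the magnitude is almost deterministic (dominant LoS component), whereas for K = 0 the tap reduces to a Rayleigh-faded one, which is the limiting case adopted for the residual SI channel.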

The mMIMO technology is being incorporated into multi-pair FD relays for mitigating

the SI at the relay, and the inter-pair co-channel interference [21–23, 27, 30]. Ngo et al. [21]

derived the achievable rate and a power allocation scheme for maximizing the ergodic sum-

rate for one-way decode-and-forward FD mMIMO-aided relaying. Zhang et al. [22] proposed

four power scaling schemes for two-way FD mMIMO relaying to improve both its SE and its

EE. Zhang et al. [23] developed a power allocation scheme for maximizing the sum-rate of

multi-pair two-way FD mMIMO AF relaying by using MRC/MRT processing at the relay,

and by using least squares (LS) channel estimation. Dai et al. [30] considered a HD multi-

pair two-way mMIMO AF relay and derived closed-form achievable rate expressions and a

power allocation scheme for maximizing the sum-rate under realistic imperfect channel state

information (CSI). Cui et al. [27] developed power scaling schemes for HD mMIMO-aided

two-way relay systems.


2.3 Contribution of the present work

The main contributions of this chapter are as follows:

• We derive a closed-form lower bound for the SE of the multi-pair two-way AF FD mMIMO

relay. We consider both MRC/MRT and ZFR/ZFT processing at the relay and MMSE

relay channel estimation.

• These closed-form achievable rate expressions are applicable to any antenna configuration. This is in contrast to asymptotic rate expressions, which are valid only for $N \to \infty$. The Federal Communications Commission recently approved mMIMO products which consist of 64 antennas connected to 64 fully digital transceiver chains [36].

Furthermore, the expressions and the analysis developed herein are significantly more

complex due to the coupling of channels in AF relaying, and both the SI as well as the

IUI imposed by the FD nodes.

• We compare the SE of the FD system derived in this work with the SE of the HD system derived in [30]. We numerically determine the SI and IUI values for which the MMSE-channel-estimation-based FD system has a better SE than the HD system in [30].

The rest of the chapter is organized as follows. We present our system model in Section 2.4,

and discuss the MMSE channel estimation in Section 2.5. The relay processing is discussed

in Section 2.6, while the achievable rates are analyzed in Section 2.7. The simulation results

are presented in Section 2.8. The chapter is summarized in Section 2.9.

2.4 System model

We consider, as illustrated in Fig. 2.9, a multi-pair AF two-way FD relaying system, in which $K$ FD user-pairs communicate via a single FD relay on the same time-frequency resource. Specifically, the user $S_{2m-1}$, $m = 1, \cdots, K$, on one side of the relay wants to send data to, and receive data from, the user $S_{2m}$ on the other side of the relay; the two sides of the relay in the figure are distinguished using the bold dotted line. We assume that each user has one transmit and one receive antenna, while the relay has $N$ transmit and $N$ receive antennas.¹ We also assume that the direct links between the user-pairs $(S_{2m-1}, S_{2m})$ are absent due to large path loss and heavy shadowing [21, 22]. As seen from Fig. 2.9, due to the FD architecture, the users on each side of the relay interfere with one another; the interference so caused is termed IUI [22].

Fig. 2.9: System model: Relay node consists of massive antenna arrays and each source node has two antennas, one for transmission and another for reception. All users and the relay simultaneously transmit and receive which leads to SI at the relay (shown by dotted ellipse), at the user (shown by dotted circle) and IUI (shown by solid red line). The users on either side of the relay (marked with bold dotted line at the center) are isolated.

In a $T$-symbol channel coherence interval, $\tau \le T$ symbols are used for channel estimation by the relay, and the remaining $(T - \tau)$ symbols are used for data transmission. We first discuss the communication protocol and postpone the channel estimation discussion to Section 2.5.

¹ We can readily extend the system to a scenario wherein the relay has a different number of transmit and receive antennas.


1. Data transmission at time instant $n = 1$: In the first symbol ($n = 1$) instant in the data transmission interval, the relay does not transmit any signal, and only receives from the users. The user $S_k$, for $k = 1$ to $2K$, transmits the signal $\sqrt{p_k}\, x_k(1)$ to the relay, where $p_k$ is the transmit power of the $k$th user.² The signals received at the relay and the user $S_k$ are given respectively as

$$\mathbf{y}_R(1) = \sum_{k=1}^{2K} \sqrt{p_k}\, \mathbf{g}_k x_k(1) + \mathbf{z}_R(1) = \tilde{\mathbf{G}} \mathbf{x}(1) + \mathbf{z}_R(1) \quad \text{and}$$

$$y_k(1) = \sum_{i,k \in U_k} \Omega_{k,i} \sqrt{p_i}\, x_i(1) + z_k(1). \tag{2.1}$$

Here $\mathbf{x}(n) = [x_1(n), \cdots, x_{2K}(n)]^T \in \mathbb{C}^{2K \times 1}$ with pdf $\mathcal{CN}(\mathbf{0}_{2K \times 1}, \mathbf{I}_{2K})$. The vector $\mathbf{g}_k \in \mathbb{C}^{N \times 1}$ denotes the channel from the transmit antenna of the $k$th user to the relay receive antenna array. Further, we denote the matrix $\mathbf{G} = [\mathbf{g}_1, \cdots, \mathbf{g}_{2K}] \in \mathbb{C}^{N \times 2K}$ and the term $\tilde{\mathbf{G}} = \mathbf{G}\mathbf{P}$, where $\mathbf{P} = \mathrm{diag}\{\sqrt{p_k}\}_{k=1}^{2K}$. The receive signal at the user $S_k$ is interfered by its own transmit signal, which is termed as SI, and we denote its channel at the user $S_k$ in (2.1) as $\Omega_{k,k}$. The IUI channels are denoted by $\Omega_{k,i}$ ($k, i \in U_k$, $i \neq k$), where the set $U_k = [1, 3, \cdots, 2K-1]$ for odd $k$ and $U_k = [2, 4, \cdots, 2K]$ for even $k$. The scalar $z_k(n) \sim \mathcal{CN}(0, \sigma_n^2)$ and the vector $\mathbf{z}_R(n) \sim \mathcal{CN}(\mathbf{0}, \sigma_{nr}^2 \mathbf{I}_N)$ are additive white Gaussian noise (AWGN) at the user $S_k$ and relay, respectively.

² To avoid repetition, we assume that $k = 1$ to $2K$ throughout this thesis.

2. Data transmission at time instant $n > 1$: At the $n$th symbol in the data transmission interval:

(a) The user $S_k$ transmits the signal $\sqrt{p_k}\, x_k(n)$ to the relay, and the relay simultaneously broadcasts a vector $\mathbf{x}_R(n) \in \mathbb{C}^{N \times 1}$ to all the users. The signals received at the relay and the user $S_k$ are given respectively as

$$\mathbf{y}_R(n) = \tilde{\mathbf{G}} \mathbf{x}(n) + \mathbf{G}_{RR} \mathbf{x}_R(n) + \mathbf{z}_R(n) \quad \text{and} \tag{2.2}$$

$$y_k(n) = \mathbf{f}_k^T \mathbf{x}_R(n) + \sum_{i,k \in U_k} \Omega_{k,i} \sqrt{p_i}\, x_i(n) + z_k(n). \tag{2.3}$$

Here the vector $\mathbf{f}_k^T \in \mathbb{C}^{1 \times N}$ denotes the channel from the relay transmit antenna array to the receive antenna of the $k$th user. We denote the matrix (to be used later in the sequel) $\mathbf{F} = [\mathbf{f}_1, \cdots, \mathbf{f}_{2K}] \in \mathbb{C}^{N \times 2K}$. The receive signals at the relay are now interfered by its own transmit signal, which is termed as SI, and we denote its channel at the relay in (2.2) as $\mathbf{G}_{RR}$.

(b) Relay processing and its transmit signal: The AF relay generates its transmit signal $\mathbf{x}_R(n)$ by linearly precoding its received signal $\mathbf{y}_R(n-1)$ using a matrix $\mathbf{W}$ as follows

$$\mathbf{x}_R(n) = \mathbf{W} \mathbf{y}_R(n-1). \tag{2.4}$$

3. Relay transmit signal before and after SI suppression: Before employing any SI suppression technique, the relay transmit signal $\mathbf{x}_R(n)$ can be re-expressed by iteratively substituting (2.2) into (2.4), as shown in [22], as

$$\mathbf{x}_R(n) = s\left[\mathbf{x}(n-\nu) + \mathbf{x}(n-2\nu) + \cdots + \mathbf{z}_R(n-\nu) + \mathbf{z}_R(n-2\nu) + \cdots\right], \tag{2.5}$$

where $s[\cdot]$ is a function involving both vector and matrix operations, while $\nu$ denotes the relay's processing delay (we set $\nu$ to be unity). After employing the passive SI suppression techniques from [37], discussed in detail in the next subsection, the SI gets significantly suppressed. However, due to several inherent imperfections in the implementation of the SI cancellation stages, e.g., phase mismatch in the cancellation signals and the non-linearity of the amplifiers, there will always be some residual SI after applying the aforementioned SI suppression techniques. Further, these imperfection sources in the SI cancellation process are independent, which makes the residual SI the sum of multiple suppressed channel and signal terms [23], [38, Prop. 1]. Consequently, by applying the central limit theorem, the SI can be treated as the worst-case Gaussian noise [23], [38, Prop. 1], [19, 39, 40]. Let the term $\mathbf{G}_{RR}\mathbf{x}_R$ in (2.2) represent the SI at the relay, and let $\tilde{\mathbf{G}}_{RR}\tilde{\mathbf{x}}_R$ be the corresponding residual SI, which is now modeled as additional Gaussian noise with pdf $\mathcal{CN}(\mathbf{0}, P_R \sigma_{LIR}^2 \mathbf{I}_N)$, and is independent of the source signal $\mathbf{x}$ [23], [38, Prop. 1]. We note that only the residual SI signal is spatially uncorrelated and, similar to [23, 38], has the same power on all the transmit antennas. Accordingly, we have

$$\tilde{\mathbf{y}}_R(n) = \tilde{\mathbf{G}} \mathbf{x}(n) + \tilde{\mathbf{G}}_{RR} \tilde{\mathbf{x}}_R(n) + \mathbf{z}_R(n). \tag{2.6}$$

After SI suppression, the signal in (2.4) can thus be re-expressed using (2.6) as

$$\tilde{\mathbf{x}}_R(n) = \mathbf{W} \tilde{\mathbf{y}}_R(n-1). \tag{2.7}$$

The vector $\mathbf{x}_R(n)$ in (2.5) has infinite memory of $\mathbf{x}(n)$, whereas we observe from (2.7) that $\tilde{\mathbf{x}}_R(n)$ is now only a function of $\mathbf{x}(n-1)$. We now drop the time labels from (2.7), and express it using (2.6) as

$$\tilde{\mathbf{x}}_R = \mathbf{W} \tilde{\mathbf{y}}_R = \mathbf{W}\tilde{\mathbf{G}}\mathbf{x} + \mathbf{W}\tilde{\mathbf{G}}_{RR}\tilde{\mathbf{x}}_R + \mathbf{W}\mathbf{z}_R. \tag{2.8}$$

The relay precoder $\mathbf{W}$ is designed such that it satisfies the relay transmit power constraint

$$P_R = \mathrm{Tr}\left(\mathbb{E}\left[\tilde{\mathbf{x}}_R \tilde{\mathbf{x}}_R^H\right]\right) = \mathbb{E}\left[\|\mathbf{W}\tilde{\mathbf{G}}\mathbf{x}\|^2\right] + \mathbb{E}\left[\|\mathbf{W}\tilde{\mathbf{G}}_{RR}\tilde{\mathbf{x}}_R\|^2\right] + \mathbb{E}\left[\|\mathbf{W}\mathbf{z}_R\|^2\right]. \tag{2.9}$$

Equation (2.9) can easily be derived by using the fact that the residual SI signal $\tilde{\mathbf{G}}_{RR}\tilde{\mathbf{x}}_R$ is independent of the source signal $\mathbf{x}$.

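4. Received signal at the $k$th user: Using (2.8), the received signal at the user $S_k$ in (2.3) is

$$\begin{aligned}
y_k &= \mathbf{f}_k^T \tilde{\mathbf{x}}_R + \sum_{i,k \in U_k} \Omega_{k,i} \sqrt{p_i}\, x_i + z_k \\
&= \mathbf{f}_k^T \mathbf{W}\tilde{\mathbf{G}}\mathbf{x} + \mathbf{f}_k^T \mathbf{W}\tilde{\mathbf{G}}_{RR}\tilde{\mathbf{x}}_R + \mathbf{f}_k^T \mathbf{W}\mathbf{z}_R + \sum_{i,k \in U_k} \Omega_{k,i} \sqrt{p_i}\, x_i + z_k \\
&= \underbrace{\mathbf{f}_k^T \mathbf{W} \sqrt{p_{k'}}\, \mathbf{g}_{k'} x_{k'}}_{\text{desired signal}} + \underbrace{\mathbf{f}_k^T \mathbf{W} \sqrt{p_k}\, \mathbf{g}_k x_k}_{\text{self-interference}} + \underbrace{\mathbf{f}_k^T \mathbf{W} \sum_{i \neq k,k'}^{2K} \sqrt{p_i}\, \mathbf{g}_i x_i}_{\text{inter-pair interference}} \\
&\quad + \underbrace{\mathbf{f}_k^T \mathbf{W}\tilde{\mathbf{G}}_{RR}\tilde{\mathbf{x}}_R}_{\text{amplified SI from relay}} + \underbrace{\mathbf{f}_k^T \mathbf{W}\mathbf{z}_R}_{\text{amplified noise from relay}} + \underbrace{\sum_{i,k \in U_k} \Omega_{k,i} \sqrt{p_i}\, x_i}_{\text{SI at user and IUI}} + \underbrace{z_k}_{\text{AWGN at } S_k}.
\end{aligned} \tag{2.10}$$

Here $(k, k') = (2m-1, 2m)$ or $(2m, 2m-1)$, for $m = 1, \cdots, K$, denotes the user-pair exchanging information. We assume that the relay estimates the channels $\mathbf{G}$ and $\mathbf{F}$ using the MMSE technique, and the precoder $\mathbf{W}$ is designed using these channel estimates. For each user, similar to [22], the relay then transmits the self-interference cancellation (SIC) coefficient $\hat{\mathbf{f}}_k^T \mathbf{W} \hat{\mathbf{g}}_k$, where $\hat{\mathbf{f}}_k$ and $\hat{\mathbf{g}}_k$ are the estimated channel coefficients. After SIC, the receive signal at user $S_k$ is

$$\begin{aligned}
y_k &= \mathbf{f}_k^T \mathbf{W} \sqrt{p_{k'}}\, \mathbf{g}_{k'} x_{k'} + \sqrt{p_k}\, \lambda_k x_k + \mathbf{f}_k^T \mathbf{W} \sum_{i \neq k,k'}^{2K} \sqrt{p_i}\, \mathbf{g}_i x_i \\
&\quad + \mathbf{f}_k^T \mathbf{W}\tilde{\mathbf{G}}_{RR}\tilde{\mathbf{x}}_R + \mathbf{f}_k^T \mathbf{W}\mathbf{z}_R + \sum_{i,k \in U_k} \Omega_{k,i} \sqrt{p_i}\, x_i + z_k.
\end{aligned} \tag{2.11}$$

The residual self-interference term $\lambda_k = \mathbf{f}_k^T \mathbf{W}\mathbf{g}_k - \hat{\mathbf{f}}_k^T \mathbf{W}\hat{\mathbf{g}}_k$ is due to channel estimation error.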

Remark 1. We express the channel matrices as $\mathbf{G} = \mathbf{H}_u \mathbf{D}_u^{1/2}$ and $\mathbf{F} = \mathbf{H}_d \mathbf{D}_d^{1/2}$ to account for both small-scale and large-scale fading. The $k$th diagonal elements of the large-scale fading matrices $\mathbf{D}_u$ and $\mathbf{D}_d$ are denoted as $\sigma_{g,k}^2$ and $\sigma_{f,k}^2$, respectively, while the small-scale fading matrices $\mathbf{H}_u$ and $\mathbf{H}_d$ have i.i.d. $\mathcal{CN}(0, 1)$ elements. The entries of the channel matrix $\mathbf{G}_{RR}$ and the scalar channel $\Omega_{k,k}$, similar to [21, 22], are assumed to be independent and identically distributed (i.i.d.) as $\mathcal{CN}(0, \sigma_{LIR}^2)$ and $\mathcal{CN}(0, \sigma_{k,k}^2)$, respectively. The IUI channels $\Omega_{k,i}$ ($k, i \in U_k$, $i \neq k$) are independent and are distributed as $\mathcal{CN}(0, \sigma_{k,i}^2)$ [22].
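The channel model of Remark 1 and the first-symbol relay observation in (2.1) can be simulated with a few lines of code. The following Python sketch, written with the standard library only, generates the uplink channel as i.i.d. $\mathcal{CN}(0,1)$ entries scaled column-wise by the large-scale fading, and forms the relay receive vector at $n = 1$. The function names, the number of antennas and all numerical values are hypothetical and serve only as an illustration.

```python
import math
import random

def crandn(rng, var=1.0):
    """One circularly symmetric complex Gaussian sample, CN(0, var)."""
    s = math.sqrt(var / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

def uplink_channel(rng, n_ant, sigma2_g):
    """G = H_u D_u^{1/2}: column k is sqrt(sigma2_g[k]) times i.i.d. CN(0, 1) entries."""
    return [[math.sqrt(s2) * crandn(rng) for s2 in sigma2_g]
            for _ in range(n_ant)]

def relay_rx_first_symbol(rng, G, p, x, sigma2_nr):
    """y_R(1) = sum_k sqrt(p_k) g_k x_k(1) + z_R(1), first line of (2.1)."""
    return [sum(math.sqrt(pk) * g_nk * xk for pk, g_nk, xk in zip(p, row, x))
            + crandn(rng, sigma2_nr)
            for row in G]

# Hypothetical setup: N = 64 relay receive antennas, K = 2 user-pairs (2K = 4 users).
rng = random.Random(0)
N, K = 64, 2
sigma2_g = [1.0, 0.5, 2.0, 1.0]            # large-scale fading coefficients
p = [1.0] * (2 * K)                        # equal transmit powers
x = [crandn(rng) for _ in range(2 * K)]    # CN(0, 1) data symbols
G = uplink_channel(rng, N, sigma2_g)
y_R = relay_rx_first_symbol(rng, G, p, x, sigma2_nr=0.1)
print(len(y_R), abs(y_R[0]))
```

The downlink matrix $\mathbf{F}$ can be generated in the same way from the coefficients $\sigma_{f,k}^2$.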

2.4.1 SI suppression

We discuss some SI cancellation techniques from the existing literature which make SI suppression easier by not requiring instantaneous knowledge of the SI channel.

Passive cancellation: References [37, 41] suggest that a careful utilization of the passive self-interference suppression mechanisms (directional isolation, absorptive shielding, and cross-polarization) can significantly suppress the SI. Reference [41] also showed that by additionally assuming statistical channel knowledge and by using antenna arrays at the sources/destinations, the passive cancellation techniques can further suppress the SI.

Large antenna array at relay: Reference [21] argued that with large $N$, the channels of the SI and the desired signal become nearly orthogonal, and hence the transmit-receive beamforming techniques (ZFR/ZFT or MRC/MRT) inherently project the desired signal onto the orthogonal complement space of the SI, which significantly reduces the SI.

Lower transmit power at relay : Reference [21] also demonstrated that, since the SI is pro-

portional to the relay power, PR, we can reduce the SI by using low transmit power. A

mMIMO relay system, due to large number of transmit antennas, uses radically less transmit

power/antenna than a conventional MIMO systems, which significantly reduces the SI.

We therefore, similar to existing mMIMO FD literature [20,21,41], assume that the SI can

be significantly mitigated by utilizing the above mentioned SI cancellation techniques, which

do not require SI channel knowledge. However, if required, the residual SI can be further

reduced by employing active (time-domain and spatial suppression) techniques developed

in [42], which require SI channel knowledge. The authors in [42] present an algorithm to

estimate SI channel for a mMIMO relay. We also note that since the relay is stationary, its SI

channel changes much more slowly than the channel from users to the relay. Therefore it is reasonable to assume that i) the SI channel remains constant for multiple consecutive blocks; and ii) the relay pilot overhead is affordable because of the sufficiently longer coherence time of the residual SI channels [42]. Similar to [42], one can estimate the SI channel by exploiting its slowly-varying nature, using a cost-efficient, low-complexity expectation-maximization algorithm.

Remark 2. We note that the channel estimation variance for the residual SI channel increases with the relay transmit power. However, as mentioned before, a mMIMO relay has low per-antenna transmit power [43]. We therefore, similar to [21], assume in this work that with a large number of relay antennas and low per-antenna relay transmit power, the SI channel estimation variance does not vary much and can be treated as constant. A future direction of work is to investigate the effect of transmit power on the variance of SI channel estimation.

2.5 Channel estimation

For channel estimation, all users simultaneously transmit orthogonal pilot sequences of length $\tau \leq T$ symbols to the relay, where $T$ is the channel coherence interval. We denote the pilot symbols transmitted from the $2K$ users as $\sqrt{\tau P_\rho}\,\boldsymbol{\varphi} \in \mathbb{C}^{2K \times \tau}$, where $P_\rho$ is the pilot power. For the pilots to be orthogonal, i.e., $\boldsymbol{\varphi}\boldsymbol{\varphi}^H = \mathbf{I}_{2K}$, we require $\tau \geq 2K$ [44].

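Let $\mathbf{E}_g$ and $\mathbf{E}_f$ denote the estimation error matrices for $\mathbf{G}$ and $\mathbf{F}$, respectively. Then the MMSE channel estimates $\hat{\mathbf{G}}$ and $\hat{\mathbf{F}}$ are given as [21, 43]

$$\mathbf{G} = \hat{\mathbf{G}} + \mathbf{E}_g, \quad \text{and} \quad \mathbf{F} = \hat{\mathbf{F}} + \mathbf{E}_f. \tag{2.12}$$

The estimation error matrices $\mathbf{E}_g$ and $\mathbf{E}_f$ are independent of the estimated channel matrices $\hat{\mathbf{G}}$ and $\hat{\mathbf{F}}$, respectively [43]. The rows of the matrices $\hat{\mathbf{G}}$ and $\hat{\mathbf{F}}$ are distributed as $\mathcal{CN}(\mathbf{0}, \hat{\mathbf{D}}_u)$ and $\mathcal{CN}(\mathbf{0}, \hat{\mathbf{D}}_d)$ respectively, where [43]

$$\hat{\mathbf{D}}_u = \mathrm{diag}\{\hat{\sigma}_{g,1}^2, \cdots, \hat{\sigma}_{g,2K}^2\} \ \text{with} \ \hat{\sigma}_{g,k}^2 = \frac{\tau P_\rho \sigma_{g,k}^4}{\tau P_\rho \sigma_{g,k}^2 + \sigma_{nr}^2} \quad \text{and} \quad \hat{\mathbf{D}}_d = \mathrm{diag}\{\hat{\sigma}_{f,1}^2, \cdots, \hat{\sigma}_{f,2K}^2\} \ \text{with} \ \hat{\sigma}_{f,k}^2 = \frac{\tau P_\rho \sigma_{f,k}^4}{\tau P_\rho \sigma_{f,k}^2 + \sigma_{nr}^2}. \tag{2.13}$$

Therefore, $\mathbf{E}_g \sim \mathcal{CN}(\mathbf{0}, \mathbf{D}_u - \hat{\mathbf{D}}_u)$ and $\mathbf{E}_f \sim \mathcal{CN}(\mathbf{0}, \mathbf{D}_d - \hat{\mathbf{D}}_d)$, where

$$\mathbf{D}_u - \hat{\mathbf{D}}_u = \mathrm{diag}\{\sigma_{\xi,g,1}^2, \cdots, \sigma_{\xi,g,2K}^2\} \ \text{with} \ \sigma_{\xi,g,k}^2 = \frac{\sigma_{g,k}^2 \sigma_{nr}^2}{\tau P_\rho \sigma_{g,k}^2 + \sigma_{nr}^2} \quad \text{and} \quad \mathbf{D}_d - \hat{\mathbf{D}}_d = \mathrm{diag}\{\sigma_{\xi,f,1}^2, \cdots, \sigma_{\xi,f,2K}^2\} \ \text{with} \ \sigma_{\xi,f,k}^2 = \frac{\sigma_{f,k}^2 \sigma_{nr}^2}{\tau P_\rho \sigma_{f,k}^2 + \sigma_{nr}^2}. \tag{2.14}$$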
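The variance split in (2.13) and (2.14) can be checked numerically. The following is a minimal sketch, assuming scalar inputs in linear scale; the function name `mmse_variances` and its argument names are illustrative and do not come from the thesis.

```python
def mmse_variances(sigma2, tau, p_rho, sigma2_nr):
    """Return (estimate variance, error variance) for one large-scale
    fading coefficient sigma2, following the form of (2.13) and (2.14).

    tau       : pilot length in symbols
    p_rho     : pilot power (linear scale)
    sigma2_nr : relay noise variance (linear scale)
    """
    denom = tau * p_rho * sigma2 + sigma2_nr
    est = tau * p_rho * sigma2 ** 2 / denom   # variance of the MMSE estimate
    err = sigma2 * sigma2_nr / denom          # variance of the estimation error
    return est, err
```

By construction the two variances add up to the true large-scale coefficient, and the error variance shrinks as the pilot energy `tau * p_rho` grows, which is the behaviour the high-SNR simplification later relies on.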

2.6 Relay precoder design

The MRC/MRT precoder is a low-complexity and simple-to-implement design, with rea-

sonably good performance at low signal to noise ratio (SNR). The ZFR/ZFT, in contrast,

performs better than MRC/MRT at high SNR, albeit with higher complexity. An advantage

of the MRC/MRT and ZFR/ZFT designs is that for mMIMO systems, they have similar

performance as that of non-linear successive interference cancellation schemes [21]. We now

motivate their use in more detail.

2.6.1 MRC/MRT precoder

It is a low-complexity transceiver which maximizes the received SNR. As shown in [21, 22], it yields reasonably good SE and EE at low SNR in mMIMO systems, and is therefore widely used.

The MRC/MRT precoder using the estimated CSI is formulated as

$$\mathbf{W} = \alpha_{mr} \hat{\mathbf{F}}^{*} \mathbf{T} \hat{\mathbf{G}}^{H} \triangleq \alpha_{mr} \tilde{\mathbf{W}}, \tag{2.15}$$

where $\mathbf{T} = \mathrm{blkdiag}\{\mathbf{T}_1, \cdots, \mathbf{T}_K\}$. The permutation matrix

$$\mathbf{T}_m = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad \text{for } 1 \leq m \leq K \tag{2.16}$$

permutes the transmit data of each user pair to ensure that it reaches the intended receiver. We note that the MRC/MRT precoder, despite the fact that it cannot completely suppress the SI, is widely used because of its low complexity, which is $\mathcal{O}(N^2K)$.
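The role of the block-diagonal permutation in (2.15) and (2.16) can be illustrated with a small sketch. It assumes users are paired as (2m−1, 2m), the pairing used elsewhere in the thesis; the helper names `pair_partner`, `build_T` and `matvec` are hypothetical.

```python
def pair_partner(k):
    """1-based index k' of the partner of user k, with pairs (2m-1, 2m)."""
    return k + 1 if k % 2 == 1 else k - 1


def build_T(K):
    """2K x 2K matrix blkdiag{T_1, ..., T_K} with T_m = [[0, 1], [1, 0]]."""
    n = 2 * K
    T = [[0] * n for _ in range(n)]
    for k in range(1, n + 1):
        T[k - 1][pair_partner(k) - 1] = 1
    return T


def matvec(T, x):
    """Plain matrix-vector product for lists of lists."""
    return [sum(t * v for t, v in zip(row, x)) for row in T]
```

Applying `build_T(2)` to the symbol vector `[1, 2, 3, 4]` swaps the entries inside each pair, so each user's data is routed towards its partner; applying it twice restores the original order.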

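Using (2.9) and (2.15), the amplification factor $\alpha_{mr}$ can be written as

$$\alpha_{mr} = \sqrt{\frac{P_R}{\mathbb{E}\big[\|\tilde{\mathbf{W}}\mathbf{G}\mathbf{x}\|^2\big] + \mathbb{E}\big[\|\tilde{\mathbf{W}}\mathbf{G}_{RR}\mathbf{x}_R\|^2\big] + \mathbb{E}\big[\|\tilde{\mathbf{W}}\mathbf{z}_R\|^2\big]}}. \tag{2.17}$$

We next simplify $\alpha_{mr}$ from (2.17) in the following proposition.

Proposition 1. For the MRC/MRT precoder, we have

$$\alpha_{mr} = \sqrt{\frac{P_R}{N^2\big(\Psi + \sigma_{nr}^2 + P_R\sigma_{LIR}^2\big)\Phi + N^3\Upsilon}}, \tag{2.18}$$

where

$$\Psi = \sum_{i=1}^{K}\big(p_{2i-1}\sigma_{g,2i-1}^2 + p_{2i}\sigma_{g,2i}^2\big), \quad \Phi = \sum_{i=1}^{K}\big(\hat{\sigma}_{g,2i-1}^2\hat{\sigma}_{f,2i}^2 + \hat{\sigma}_{g,2i}^2\hat{\sigma}_{f,2i-1}^2\big), \quad \text{and} \quad \Upsilon = \sum_{i=1}^{K}\big(p_{2i-1}\hat{\sigma}_{g,2i-1}^4\hat{\sigma}_{f,2i}^2 + p_{2i}\hat{\sigma}_{g,2i}^4\hat{\sigma}_{f,2i-1}^2\big).$$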

Proof. Please see Appendix A.
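As a numerical companion to Proposition 1, the sketch below evaluates the closed form (2.18). It assumes that $\Psi$ uses the true large-scale coefficients while $\Phi$ and $\Upsilon$ use the estimate variances of (2.13); this assignment is a reading of the expression, not something the code verifies. Users are stored 0-based, so pair partners are (0, 1), (2, 3), and so on. All names are illustrative.

```python
import math


def partner(i):
    """0-based partner index: (0, 1), (2, 3), ... form the user pairs."""
    return i + 1 if i % 2 == 0 else i - 1


def alpha_mr(N, P_R, p, sg2, sg2_hat, sf2_hat, s2_nr, s2_lir):
    """Evaluate the MRC/MRT amplification factor in the form of (2.18).

    p, sg2, sg2_hat, sf2_hat are lists of length 2K (linear scale):
    user powers, true uplink variances, and estimate variances of G and F.
    """
    n = len(p)
    psi = sum(p[i] * sg2[i] for i in range(n))
    phi = sum(sg2_hat[i] * sf2_hat[partner(i)] for i in range(n))
    ups = sum(p[i] * sg2_hat[i] ** 2 * sf2_hat[partner(i)] for i in range(n))
    denom = N ** 2 * (psi + s2_nr + P_R * s2_lir) * phi + N ** 3 * ups
    return math.sqrt(P_R / denom)
```

The sums over the $K$ pairs in $\Phi$ and $\Upsilon$ are written here as sums over all $2K$ users with the partner index, which is the same quantity.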

2.6.2 ZFR/ZFT precoder

It has higher complexity than the MRC/MRT design and nulls the inter-pair interference by projecting the desired data onto the orthogonal complement of the inter-pair interference space [21, 22]. It, therefore, works well at high SNR, where inter-pair interference dominates the noise.

The ZFR/ZFT precoder using the estimated CSI is formulated as

$$\mathbf{W} = \alpha_{zf} \bar{\mathbf{F}}^{*} \mathbf{T} \bar{\mathbf{G}}^{H} \triangleq \alpha_{zf} \tilde{\mathbf{W}}, \tag{2.19}$$

where $\bar{\mathbf{F}} = \hat{\mathbf{F}}\big(\hat{\mathbf{F}}^{H}\hat{\mathbf{F}}\big)^{-1}$ and $\bar{\mathbf{G}} = \hat{\mathbf{G}}\big(\hat{\mathbf{G}}^{H}\hat{\mathbf{G}}\big)^{-1}$. The computational complexity of the ZFR/ZFT precoder is $\mathcal{O}(N^2K + NK^2 + K^3)$, which is higher than that of MRC/MRT precoding. In the next proposition, we simplify the amplification factor $\alpha_{zf}$ in (2.17) for ZFR/ZFT processing.

Proposition 2. For the ZFR/ZFT precoder, we have

$$\alpha_{zf} = \sqrt{\frac{P_R}{\lambda + \eta\Big(\sum_{i=1}^{2K} p_i\sigma_{\xi,g,i}^2 + \sigma_{nr}^2 + P_R\sigma_{LIR}^2\Big)}}, \tag{2.20}$$

where $\lambda = \sum_{i=1}^{2K} \dfrac{p_{i'}}{(N-2K-1)\hat{\sigma}_{f,i}^2}$ and $\eta = \sum_{j=1}^{2K} \dfrac{1}{(N-2K-1)^2\,\hat{\sigma}_{f,j}^2\,\hat{\sigma}_{g,j'}^2}$.

Proof. Please see Appendix B.
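Proposition 2 can be evaluated in the same way. This is a sketch under the assumption that the variances appearing in $\lambda$ and $\eta$ are the estimate variances of (2.13) and that users are paired as (2m−1, 2m); the function and argument names are illustrative.

```python
import math


def partner(i):
    """0-based partner index: (0, 1), (2, 3), ... form the user pairs."""
    return i + 1 if i % 2 == 0 else i - 1


def alpha_zf(N, K, P_R, p, xi_g2, sg2_hat, sf2_hat, s2_nr, s2_lir):
    """Evaluate the ZFR/ZFT amplification factor in the form of (2.20).

    xi_g2 holds the uplink estimation-error variances of (2.14).
    """
    c = N - 2 * K - 1
    if c <= 0:
        raise ValueError("the closed form needs N > 2K + 1")
    n = 2 * K
    lam = sum(p[partner(i)] / (c * sf2_hat[i]) for i in range(n))
    eta = sum(1.0 / (c ** 2 * sf2_hat[j] * sg2_hat[partner(j)]) for j in range(n))
    noise = sum(p[i] * xi_g2[i] for i in range(n)) + s2_nr + P_R * s2_lir
    return math.sqrt(P_R / (lam + eta * noise))
```

The guard on `N - 2K - 1` reflects that the expression is only meaningful when the relay has more than 2K + 1 antennas.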

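2.7 Achievable sum-rate of MRC/MRT and ZFR/ZFT precoders

In this section, we calculate lower-bounds on the instantaneous sum-rate for both MRC/MRT and ZFR/ZFT precoders. The instantaneous signal-to-interference-plus-noise ratio (SINR) at the user $S_k$, denoted as $\mathrm{SINR}_k$, can be expressed using (2.11) as

$$\mathrm{SINR}_k = \frac{p_{k'}\,|\mathbf{f}_k^T\tilde{\mathbf{W}}\mathbf{g}_{k'}|^2}{p_k|\lambda_k|^2 + \sum_{i \neq k,k'}^{2K} p_i|\mathbf{f}_k^T\tilde{\mathbf{W}}\mathbf{g}_i|^2 + \|\mathbf{f}_k^T\tilde{\mathbf{W}}\mathbf{G}_{RR}\|^2\frac{P_R}{N} + \|\mathbf{f}_k^T\tilde{\mathbf{W}}\|^2\sigma_{nr}^2 + \frac{1}{\alpha^2}\sum_{i,k\in U_k}\sigma_{k,i}^2 p_i + \frac{1}{\alpha^2}\sigma_n^2}. \tag{2.21}$$

Recall that $\alpha \in \{\alpha_{mr}, \alpha_{zf}\}$. The sum-rate of the system is

$$R = \mathbb{E}\left\{\sum_{k=1}^{2K}\log_2\left(1 + \mathrm{SINR}_k\right)\right\}. \tag{2.22}$$

Next we derive a lower-bound on the achievable rate using the method of [45]. For the $(k, k')$ pair, the signal received by the $k$th user can be written as (see (2.11))

$$y_k = \underbrace{\alpha\sqrt{p_{k'}}\,\mathbb{E}\big[\mathbf{f}_k^T\tilde{\mathbf{W}}\mathbf{g}_{k'}\big]x_{k'}}_{\text{desired signal}} + \underbrace{n_k}_{\text{effective noise}}, \tag{2.23}$$

where

$$n_k = \alpha\sqrt{p_{k'}}\Big(\mathbf{f}_k^T\tilde{\mathbf{W}}\mathbf{g}_{k'} - \mathbb{E}\big[\mathbf{f}_k^T\tilde{\mathbf{W}}\mathbf{g}_{k'}\big]\Big)x_{k'} + \alpha\sqrt{p_k}\,\lambda_k x_k + \alpha\,\mathbf{f}_k^T\tilde{\mathbf{W}}\sum_{i \neq k,k'}^{2K}\sqrt{p_i}\,\mathbf{g}_i x_i + \alpha\,\mathbf{f}_k^T\tilde{\mathbf{W}}\mathbf{G}_{RR}\mathbf{x}_R + \alpha\,\mathbf{f}_k^T\tilde{\mathbf{W}}\mathbf{z}_R + \sum_{i,k\in U_k}\Omega_{k,i}\sqrt{p_i}\,x_i + z_k. \tag{2.24}$$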


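The relay calculates the value of $\mathbb{E}\big[\mathbf{f}_k^T\tilde{\mathbf{W}}\mathbf{g}_{k'}\big]$ using the channel distributions of $\mathbf{f}_k$ and $\mathbf{g}_{k'}$, and feeds it back to the users via low-rate auxiliary feedback channels. The distributions of $\mathbf{G} = \mathbf{H}_u\mathbf{D}_u^{1/2}$ and $\mathbf{F} = \mathbf{H}_d\mathbf{D}_d^{1/2}$ are determined by estimating $\mathbf{D}_u$ and $\mathbf{D}_d$ respectively, which remain constant over multiple coherence intervals, by using independent realizations of $\mathbf{G}$ and $\mathbf{F}$ [43, pg. 260].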

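We note that the effective noise and desired signal in (2.23) are uncorrelated. Similar to [44], we only exploit the knowledge of $\mathbb{E}\big[\mathbf{f}_k^T\tilde{\mathbf{W}}\mathbf{g}_{k'}\big]$ in the detection. Using the central limit theorem, the uncorrelated additive noise $n_k$ can be treated as the worst-case Gaussian noise [21, 30, 46]. The lower-bound on the achievable sum-rate, consequently, becomes

$$R_{lower} = \sum_{k=1}^{2K}\log_2\left(1 + \underline{\mathrm{SINR}}_k\right), \tag{2.25}$$

where

$$\underline{\mathrm{SINR}}_k = \frac{\alpha^2 p_{k'}\big|\mathbb{E}\big[\mathbf{f}_k^T\tilde{\mathbf{W}}\mathbf{g}_{k'}\big]\big|^2}{\alpha^2 p_{k'}\,\mathrm{var}\big[\mathbf{f}_k^T\tilde{\mathbf{W}}\mathbf{g}_{k'}\big] + \alpha^2 p_k\,\mathrm{SI}_k + \alpha^2\,\mathrm{IP}_k + \alpha^2\,\mathrm{NR}_k + \alpha^2\,\mathrm{LIR}_k + \mathrm{UI}_k + \mathrm{NU}_k} \tag{2.26}$$

is a lower-bound on $\mathrm{SINR}_k$. In (2.26), the residual SI after SIC (SI), the inter-pair interference (IP), the amplified noise from the relay (NR), the amplified SI at the relay (LIR), the SI and IUI at the user (UI), and the noise at the user (NU) are given as follows:

$$\begin{aligned}
\mathrm{SI}_k &= \mathbb{E}\big[|\mathbf{f}_k^T\tilde{\mathbf{W}}\mathbf{g}_k - \hat{\mathbf{f}}_k^T\tilde{\mathbf{W}}\hat{\mathbf{g}}_k|^2\big], & \mathrm{IP}_k &= \sum_{i \neq k,k'}^{2K} p_i\,\mathbb{E}\big[|\mathbf{f}_k^T\tilde{\mathbf{W}}\mathbf{g}_i|^2\big],\\
\mathrm{NR}_k &= \mathbb{E}\big[|\mathbf{f}_k^T\tilde{\mathbf{W}}\mathbf{z}_R|^2\big], & \mathrm{LIR}_k &= \mathbb{E}\big[|\mathbf{f}_k^T\tilde{\mathbf{W}}\mathbf{G}_{RR}\mathbf{x}_R|^2\big],\\
\mathrm{UI}_k &= \sum_{i,k\in U_k} p_i\,\mathbb{E}\big[|\Omega_{k,i}x_i|^2\big], & \mathrm{NU}_k &= \mathbb{E}\big[|z_k|^2\big].
\end{aligned} \tag{2.27}$$

We further simplify the $\underline{\mathrm{SINR}}_k$ expressions for both the MRC/MRT and ZFR/ZFT precoders.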


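Theorem 1. The achievable rate of user $S_k$ relying on MMSE channel estimate based MRC/MRT processing is lower-bounded as $\log_2\{1 + \mathrm{SINR}_k^{mrc}(p_k, P_R)\}$, where

$$\mathrm{SINR}_k^{mrc}(p_k, P_R) = \frac{a_k\,p_{k'}}{\sum_{i=1}^{2K}\Big(b_{k,i}^{(1)} + b_{k,i}^{(2)}P_R^{-1} + \sum_{i,k\in U_k} p_i P_R^{-1} b_{k,i}^{(3)}\Big)p_i + c_k p_k + \Big(d_k^{(1)} + d_k^{(2)}P_R + d_k^{(3)}P_R^{-1}\Big) + \sum_{i,k\in U_k} p_i f_{k,i}(P_R)}. \tag{2.28}$$

Here

$$\begin{aligned}
a_k &= N^2\hat{\sigma}_{f,k}^4\hat{\sigma}_{g,k'}^4, \\
b_{k,i}^{(1)} &= \eta_{k,i} = \Phi\sigma_{f,k}^2\sigma_{g,i}^2 + N\big(\sigma_{f,k}^2\hat{\sigma}_{g,i}^4\hat{\sigma}_{f,i'}^2 + \sigma_{g,i}^2\hat{\sigma}_{f,k}^4\hat{\sigma}_{g,k'}^2\big), \\
b_{k,i}^{(2)} &= \sigma_n^2\big(\Phi\sigma_{g,i}^2 + N\hat{\sigma}_{g,i}^4\hat{\sigma}_{f,i'}^2\big), \qquad b_{k,i}^{(3)} = \sigma_{k,i}^2\big(\Phi\sigma_{g,i}^2 + N\hat{\sigma}_{g,i}^4\hat{\sigma}_{f,i'}^2\big), \\
c_k &= -\Big(\Phi\sigma_{f,k}^2\sigma_{g,k}^2 + N\big(\hat{\sigma}_{f,k}^4\sigma_{g,k}^2\hat{\sigma}_{g,k'}^2 + \sigma_{f,k}^2\hat{\sigma}_{g,k}^4\hat{\sigma}_{f,k'}^2\big)\Big), \\
d_k^{(1)} &= \big(\sigma_{LIR}^2\sigma_n^2 + \sigma_{nr}^2\sigma_{f,k}^2\big)\Phi + N\sigma_{nr}^2\hat{\sigma}_{f,k}^4\hat{\sigma}_{g,k'}^2, \qquad d_k^{(2)} = \sigma_{LIR}^2\big(\sigma_{f,k}^2\Phi + N\hat{\sigma}_{f,k}^4\hat{\sigma}_{g,k'}^2\big), \\
d_k^{(3)} &= \sigma_{nr}^2\sigma_n^2\Phi, \qquad f_{k,i}(P_R) = P_R^{-1}e_{k,i}^{(1)} + e_{k,i}^{(2)}, \qquad e_{k,i}^{(1)} = \sigma_{nr}^2\sigma_{k,i}^2\Phi, \qquad e_{k,i}^{(2)} = \sigma_{LIR}^2\sigma_{k,i}^2\Phi.
\end{aligned} \tag{2.29}$$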

Proof. Refer to Appendix C.

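Theorem 2. The achievable rate of user $S_k$ relying on MMSE channel estimate based ZFR/ZFT processing is lower-bounded as $\log_2\{1 + \mathrm{SINR}_k^{zf}(p_k, P_R)\}$, where

$$\mathrm{SINR}_k^{zf}(p_k, P_R) = \frac{u_k\,p_{k'}}{\sum_{i=1}^{2K}\Big(d_{k,i}^{(1)} + d_{k,i}^{(2)}P_R^{-1} + \sum_{i,k\in U_k} p_i P_R^{-1} d_{k,i}^{(3)}\Big)p_i + \Big(v_k^{(1)} + v_k^{(2)}P_R + v_k^{(3)}P_R^{-1}\Big) + \sum_{i,k\in U_k} p_i u_{k,i}(P_R)}. \tag{2.30}$$

Here

$$\begin{aligned}
u_k &= 1, \qquad d_{k,i}^{(1)} = \frac{1}{N-2K-1}\left(\frac{\sigma_{\xi,f,k}^2}{\hat{\sigma}_{f,i'}^2} + \frac{\sigma_{\xi,g,i}^2}{\hat{\sigma}_{g,k'}^2}\right) + \sigma_{\xi,f,k}^2\sigma_{\xi,g,i}^2\,\eta, \\
d_{k,i}^{(2)} &= \sigma_n^2\left(\frac{1}{(N-2K-1)\hat{\sigma}_{f,i'}^2} + \eta\,\sigma_{\xi,g,i}^2\right), \qquad d_{k,i}^{(3)} = \sigma_{k,i}^2\left(\frac{1}{(N-2K-1)\hat{\sigma}_{f,i'}^2} + \eta\,\sigma_{\xi,g,i}^2\right), \\
v_k^{(1)} &= \sigma_{nr}^2\left(\frac{1}{(N-2K-1)\hat{\sigma}_{g,k'}^2} + \sigma_{\xi,f,k}^2\,\eta\right) + \eta\,\sigma_{LIR}^2\sigma_n^2, \\
v_k^{(2)} &= \sigma_{LIR}^2\left(\frac{1}{(N-2K-1)\hat{\sigma}_{g,k'}^2} + \sigma_{\xi,f,k}^2\,\eta\right), \qquad v_k^{(3)} = \eta\,\sigma_{nr}^2\sigma_n^2, \\
u_{k,i}(P_R) &= w_{k,i}^{(1)} + P_R^{-1}w_{k,i}^{(2)}, \qquad w_{k,i}^{(1)} = \eta\,\sigma_{k,i}^2\sigma_{LIR}^2, \qquad w_{k,i}^{(2)} = \eta\,\sigma_{k,i}^2\sigma_{nr}^2.
\end{aligned} \tag{2.31}$$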


Proof. Refer to Appendix D.

We now simplify the lower-bounds derived in (2.28) and (2.30) for a HD system at high SNR. The objectives of this simplification are i) to perform a sanity check on the correctness of the derived expressions; and ii) to provide intuitive insights using these simplified expressions.

We first consider the lower-bound for MRC/MRT. Due to the HD assumption we have $\sigma_{LIR}^2 = \sigma_{k,i}^2 = 0$, which consequently leads to $b_{k,i}^{(3)} = 0$, $d_k^{(2)} = 0$ and $f_{k,i}(P_R) = 0$ in (2.28). We have

$$\mathrm{SINR}_k^{mrc}(p_k, P_R) = \frac{a_k\,p_{k'}}{\sum_{i=1}^{2K}\Big(b_{k,i}^{(1)} + b_{k,i}^{(2)}P_R^{-1}\Big)p_i + c_k p_k + \Big(d_k^{(1)} + d_k^{(3)}P_R^{-1}\Big)}. \tag{2.32}$$

Due to our high-SNR assumption we have $\sigma_n^2/p_k \ll 1$, $\sigma_{nr}^2/p_k \ll 1$, $\sigma_n^2/P_R \ll 1$. The high-SNR assumption also implies that $P_\rho \gg \sigma_{nr}^2$, which consequently suggests, based on (2.13), that the channel estimates are perfect, i.e., $\hat{\sigma}_{g,k}^2 \approx \sigma_{g,k}^2$ and $\hat{\sigma}_{f,k}^2 \approx \sigma_{f,k}^2$ for $k \in \{1, \cdots, 2K\}$. With the above approximations, the denominator of (2.32) can be expressed as

$$D_k = N\left\{\sum_{i=1}^{2K} p_i\left[\frac{\Phi\sigma_{f,k}^2\sigma_{g,i}^2}{N} + \sigma_{f,k}^2\sigma_{g,i}^4\sigma_{f,i'}^2 + \sigma_{g,i}^2\sigma_{f,k}^4\sigma_{g,k'}^2\right] - \left[\frac{\Phi\sigma_{f,k}^2\sigma_{g,k}^2}{N} + \sigma_{f,k}^4\sigma_{g,k}^2\sigma_{g,k'}^2 + \sigma_{f,k}^2\sigma_{g,k}^4\sigma_{f,k'}^2\right]p_k + \frac{\sigma_{nr}^2\sigma_{f,k}^2}{N}\Phi\right\}.$$

Neglecting the terms that are multiples of $1/N$ (since for a mMIMO system, the number of relay antennas $N \gg 2K$), the simplified high-SNR expression for MRC/MRT processing can be written as

$$\mathrm{SINR}_k^{mrc} = \frac{N\sigma_{f,k}^2\sigma_{g,k'}^4\,p_{k'}}{\sum_{i \neq k}^{2K}\sigma_{g,i}^2\big(\sigma_{g,i}^2\sigma_{f,i'}^2 + \sigma_{f,k}^2\sigma_{g,k'}^2\big)p_i}. \tag{2.33}$$

The lower-bound for ZFR/ZFT processing can be simplified along similar lines and is given as

$$\mathrm{SINR}_k^{zf} = \frac{(N-2K-1)\,\sigma_{g,k'}^2\,p_{k'}}{\sigma_{nr}^2}. \tag{2.34}$$

We see that the above expressions are similar to the expressions derived in [30, Eq. (27) and Eq. (31)], which considers a HD mMIMO relaying system.
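To make the contrast between the two high-SNR HD expressions concrete, the sketch below evaluates (2.33) and (2.34) for given large-scale coefficients. It assumes perfect channel estimates (as in the simplification above) and the pairing (2m−1, 2m), stored 0-based; the function names are illustrative.

```python
def partner(i):
    """0-based partner index: (0, 1), (2, 3), ... form the user pairs."""
    return i + 1 if i % 2 == 0 else i - 1


def sinr_mrc_hd(k, N, p, sg2, sf2):
    """High-SNR HD SINR of user k for MRC/MRT, in the form of (2.33)."""
    kp = partner(k)
    num = N * sf2[k] * sg2[kp] ** 2 * p[kp]
    den = sum(
        sg2[i] * (sg2[i] * sf2[partner(i)] + sf2[k] * sg2[kp]) * p[i]
        for i in range(len(p))
        if i != k
    )
    return num / den


def sinr_zf_hd(k, N, K, p, sg2, s2_nr):
    """High-SNR HD SINR of user k for ZFR/ZFT, in the form of (2.34)."""
    kp = partner(k)
    return (N - 2 * K - 1) * sg2[kp] * p[kp] / s2_nr
```

In this form the MRC/MRT value does not change when all user powers are scaled together (it is interference-limited), whereas the ZFR/ZFT value grows linearly with the partner's power relative to the relay noise.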


We know that MRC/MRT processing maximizes the received SNR by neglecting the inter-pair interference. We observe this in the high-SNR expression of (2.33), where the noise vanishes and the expression is dominated by the inter-pair interference. It is widely recognized that ZFR/ZFT processing focuses on nulling out the inter-pair interference and ignores the noise. This is evident from its high-SNR expression in (2.34), where the inter-pair interference is completely cancelled and only the relay noise is present.

Before discussing the simulation results in the next section, a few remarks are in order.

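Remark 3. The closed-form lower-bounds of the achievable sum-rate can be derived for FD two-way mMIMO relaying systems in the presence of realistic correlated channels. Because of the correlation among the relay antenna signals, while simplifying the expressions of $\alpha$ in (2.17) and the SINR expressions in (2.28) and (2.30), we have to use the following results

$$\begin{aligned}
&\mathbb{E}\big[\mathbf{G}_{RR}\mathbf{G}_{RR}^H\big] = \sigma_{LIR}^2\,\mathrm{tr}(\mathbf{T}_R)\,\mathbf{R}_R, \quad \mathbb{E}\big[\mathbf{g}_i^H\mathbf{g}_i\big] = \sigma_{g,i}^2\,\mathrm{tr}(\mathbf{R}_R), \quad \mathbb{E}\big[\mathbf{g}_i^H\mathbf{g}_j\big] = 0, \\
&\mathbb{E}\big[|\mathbf{g}_i^H\mathbf{g}_j|^2\big] = \sigma_{g,i}^2\sigma_{g,j}^2\,\mathrm{tr}\big(\mathbf{R}_R^2\big), \quad \text{and} \quad \mathbb{E}\big[\mathbf{g}_i\mathbf{g}_i^H\mathbf{g}_i\mathbf{g}_i^H\big] = \sigma_{g,i}^4\big(N\mathbf{I}_N + \mathbf{R}_R\big)\mathbf{R}_R,
\end{aligned} \tag{2.35}$$

where $\mathbf{R}_R$ and $\mathbf{T}_R$ are the deterministic non-negative definite matrices that characterize the receive antenna correlation and the transmit antenna correlation at the relay, respectively. By using steps similar to those in Appendices A-D, the current computations can be extended to more realistic correlated models.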

Remark 4. The current system model considers a mMIMO relay where each antenna is con-

nected to a dedicated radio frequency (RF) chain. This allows us to perform relay precoding

digitally. In practice, the number of RF chains, and consequently hardware complexity, can

be significantly reduced by connecting each RF chain to multiple antennas [47]. In such a

hybrid architecture, the transmit signals are first precoded by low-dimensional digital pre-

coding followed by analog (phase-only) precoding using cost-effective analog phase shifters.

The hardware complexity of the proposed design can also be reduced using such a hybrid

architecture.

Remark 5. All the users should be synchronized to the mMIMO relay. Reference [48] designs

a frequency synchronization scheme for the multiuser OFDM uplink relying on a massive

uniform linear array at the BS by exploiting the angle of arrival information of users. A

similar approach could also be employed to synchronize users with the relay in our model.

The frequency synchronization problem can also be alleviated by selecting the best antenna

or a subset of antennas. The antenna-selection technique will reduce the complexity of both

the synchronization algorithm as well as the overall hardware complexity.

[Figure: spectral efficiency (bps/Hz) versus SNR [dB]. Curves: Monte-Carlo (MRC/MRT), lower bound (MRC/MRT), Monte-Carlo (ZFR/ZFT), lower bound (ZFR/ZFT), each for N = 64 and N = 256.]

Fig. 2.10: SE of MRC/MRT and ZFR/ZFT versus PR, where K = 10, Pρ = 10 dBm.

2.8 Simulation results

We investigate the performance of the multi-pair two-way FD AF relay system by using Monte-Carlo simulations. We derived in Theorem 1 and Theorem 2 the analytical lower-bounds of the achievable rate for both MRC/MRT and ZFR/ZFT with MMSE channel estimation. To demonstrate the validity of these SE (which is equivalent to scaled achievable rate) lower-bounds, we compare them with their exact expressions in (2.22) for $N = 64$ and $N = 256$ relay antennas and $K = 10$ user pairs. For this study, we choose the noise variances as $\sigma_n^2 = \sigma_{nr}^2 = \sigma^2 = 0$ dB, and the system SNR is defined as $\mathrm{SNR} = P_R/\sigma^2$. We define the pilot signal-to-noise ratio as $\mathrm{SNR}_\rho = P_\rho/\sigma^2$ and allocate equal power to all users, i.e., we have $p_i = P_R/2K$ for $i = 1, \cdots, 2K$. The large-scale fading coefficients $\sigma_{g,i}^2 = \sigma_{f,i}^2$ for $i = 1, \cdots, 2K$, the SI at the relay $\sigma_{LIR}^2$, and the SI and IUI at the user $\sigma_{UI}^2 = \sigma_{k,i}^2$ for $k, i = 1, \cdots, 2K$ are set to 0 dB with respect to $\sigma^2$. We see from Fig. 2.10 that the derived closed-form and the exact expression overlap for ZFR/ZFT for $N = 256$ relay antennas. For MRC/MRT, the lower-bound differs marginally from the exact expression. We also observe that the SE saturates for both MRC/MRT and ZFR/ZFT at high SNR values. This is because the relay SI also increases proportionally with the system SNR.

[Figure: spectral efficiency (bps/Hz) versus number of relay antennas (N), with curves for SNR = −10 dB and SNR = 10 dB.]

Fig. 2.11: SE versus the number of relay antennas for MRC/MRT and ZFR/ZFT processing, where SNRρ = 10 dB.

We compare in Fig. 2.11 the SE versus $N$ for MRC/MRT and ZFR/ZFT processing with SNR = 10 dB and SNR = −10 dB. The performance of MRC/MRT and ZFR/ZFT processing is almost the same for SNR = −10 dB. The SE versus $K$ for different values of SNR is shown in Fig. 2.12. As the number of user pairs increases, the SNR of each user decreases and hence noise dominates. ZFR/ZFT neglects the effect of noise, which degrades the SE as $K$ increases. In contrast, MRC/MRT works better at low SNR as it maximizes the received SNR while neglecting the inter-pair interference.


[Figure: spectral efficiency (bps/Hz) versus number of user pairs (K), with curves for SNR = 10 dB, SNR = 0 dB and SNR = −10 dB.]

Fig. 2.12: SE versus number of user pairs for MRC/MRT and ZFR/ZFT processing, where N = 128, SNRρ = 10 dB.

We compare in Fig. 2.13 and Fig. 2.14 the SE versus the number of relay antennas for HD and FD systems with MRC/MRT and ZFR/ZFT, respectively. In our framework, the SE of HD mMIMO relaying can be evaluated by setting the SI and IUI to zero. We observe that for FD systems, as we increase the SI and IUI, i.e., $\sigma_{LIR}^2$ and $\sigma_{UI}^2$, the SE of the FD system decreases. We observe from Fig. 2.13 that for $\sigma_{LIR}^2 = \sigma_{UI}^2 = 0$ dB the FD system has a higher SE than the HD system. However, the HD system performs better for $\sigma_{LIR}^2 = \sigma_{UI}^2 = 10$ dB. We also note from Fig. 2.14 that for $N < 650$, the HD relay performs better than the FD relay with $\sigma_{LIR}^2 = 0$ dB, $\sigma_{UI}^2 = 5$ dB. We also observe that as the number of relay antennas increases, the SE of the FD relay increases at a higher rate than that of the HD relay.


[Figure: spectral efficiency (bps/Hz) versus number of relay antennas, half-duplex and full-duplex curves. FD curves are labelled (σ²LIR, σ²UI) = (−10 dB, −10 dB), (0 dB, 0 dB), (5 dB, −10 dB), (10 dB, 5 dB), (10 dB, 10 dB) and (5 dB, 5 dB).]

Fig. 2.13: SE versus number of relay antennas for MRC/MRT processing, comparing HD and FD systems, where SNR = 10 dB, SNRρ = 10 dB. Here the values of σ²LIR and σ²UI are with respect to σ².

[Figure: spectral efficiency (bps/Hz) versus number of relay antennas, half-duplex and full-duplex curves. FD curves are labelled (σ²LIR, σ²UI) = (−10 dB, −10 dB), (0 dB, 0 dB), (0 dB, 5 dB) and (5 dB, 0 dB).]

Fig. 2.14: SE versus number of relay antennas for ZFR/ZFT processing, comparing HD and FD systems, where SNR = 10 dB, SNRρ = 10 dB. Here the values of σ²LIR and σ²UI are with respect to σ².

2.9 Summary

We considered a multi-pair AF FD mMIMO two-way relay serving FD users, each with a single transmit and a single receive antenna. We derived closed-form SE lower bounds for MRC/MRT and ZFR/ZFT relay processing with MMSE channel estimation and for an arbitrary number of relay antennas, which had not yet been derived in the literature. The accuracy of these lower bounds was shown for different numbers of relay antennas, user pairs and relay transmit powers. Finally, we also numerically investigated the loop and inter-user interference values for which the FD relay outperforms a HD relay.

Fig. 2.15: Chapter 2 – summary.

Chapter 3

Global Energy Efficiency

Optimization

3.1 Introduction

Design of energy-efficient wireless systems has recently attracted attention to reduce their car-

bon footprint. In this chapter, we maximize the GEE metric for two-way AF relaying, where

multiple FD user pairs exchange information via a shared FD mMIMO relay (see details in

Chapter 2). The GEE metric, which obtains a Pareto-optimality between the throughput

and the energy consumed, has recently drawn attention as a performance measure [49]. For

solving the optimization problem, we use the closed-form SE lower bounds derived, in Chap-

ter 2, for both MRC/MRT and ZFR/ZFT processing at the relay, by using MMSE channel

estimation. We next survey state-of-the-art papers considering GEE maximization.

3.2 Related work

The GEE for both conventional and mMIMO systems has recently been investigated in [50–57]. Zappone et al. in [51] developed two novel optimization frameworks for energy-efficient performance of interference-limited MIMO networks. Reference [52] proposed a competitive power control algorithm to maximize the EE of relay-assisted interference MIMO networks. The authors in [53] proposed an interference neutralization scheme to maximize the EE of each communication link for a MIMO one-way AF relaying network. Nguyen et al. [55] optimized the downlink GEE of a cell-free single-hop mMIMO system using ZF precoding. Li et al. [56] optimized both the achievable rate and GEE of a small-cell based mMIMO single-hop FD system. It is anticipated that a paradigm shift towards multi-component Pareto-optimization is about to take place, leading to an entire optimal Pareto-front of solutions [58]. Wei et al. in [54] analyzed both SE and EE of a multi-pair two-way mMIMO relay with limited radio-frequency chains. The GEE of a multi-pair two-way AF mMIMO FD relay is optimized in [57]. The asymptotic GEE for a multi-pair one-way decode-and-forward mMIMO HD relay is optimized in [50].

Table 3.1: Summary of mMIMO literature focusing on SE and GEE.

| Ref. | Relaying scheme | Operating mode | Metric derived/optimized | Main contribution |
|------|-----------------|----------------|--------------------------|-------------------|
| [27] | Two-way, AF | HD | Both SE and GEE | Derives SE and GEE asymptotic expressions |
| [30] | Two-way, AF | HD | Only SE | Allocates optimal power to maximize SE |
| [22] | Two-way, AF | FD | Both SE and GEE | Derives asymptotic SE and GEE expressions |
| [50] | One-way, decode-and-forward | HD | Only GEE | Allocates optimal power to maximize GEE |

The existing literature on mMIMO relays, on the other hand, has either optimized the achievable rate [21, 23, 30] or analyzed GEE [22, 27]. The GEE optimization for mMIMO relaying has not been investigated, except for a recent study in [50] which optimized the asymptotic GEE for a multi-pair one-way decode-and-forward mMIMO HD relay. A summary of the above-mentioned mMIMO relaying systems focusing on SE and GEE is provided in Table 3.1. To the best of our knowledge, the GEE of a multi-pair two-way AF mMIMO FD relay system has not been considered in the literature for small antenna systems. We consider this problem in this thesis.



The main contributions of this chapter are as follows:

• We optimally allocate power to maximize GEE by using the closed-form achievable

rate expressions derived in Chapter 2. The GEE maximization, which has a non-

convex objective, is solved by proposing an algorithm wherein we first approximate the

objective as a PC function, and later choose Dinkelbach’s approach to optimize it.

• We discuss complexity issues of the proposed algorithm and suggest ways to reduce

it. We numerically compare the GEE obtained using the proposed algorithm to the

equal-power approach of [22].

• We also provide a closed-form power allocation solution for the GEE maximization

problem by formulating its KKT conditions.

• The proposed FD GEE optimization framework can also be used for evaluating GEE

of mMIMO HD AF systems which has not been investigated in the open literature.

We compare GEE of both FD and HD systems and numerically quantify the SI and

IUI values for which a FD system has a better GEE than a HD system. We also show

the significantly improved GEE of optimal power allocation over the equal-power GEE

analyzed in [27].

The rest of the chapter is organized as follows. We formulate GEE for our system model in Section 3.4, and discuss the solution for the GEE maximization problem in Section 3.5. The computational complexity is analyzed in Section 3.6. The closed-form power allocation solution for the GEE optimization problem is given in Section 3.7. The simulation results are presented in Section 3.8. The chapter is summarized in Section 3.9.

3.4 GEE problem formulation

We begin by defining the EE of the $k$th user, which is given as the ratio of its SE and the power used by it [51]:

$$\mathrm{EE}_k = \frac{\left(1-\frac{2K}{T}\right)\log_2\left(1+\mathrm{SINR}_{k'}^{\zeta}\right)}{\left(1-\frac{2K}{T}\right)\left(\eta_U^{-1}p_k + \eta_R^{-1}\bar{P}_R\right) + \frac{2K}{T}\eta_U^{-1}P_\rho + P_{c,k}} \triangleq \frac{A\log_2\left(1+\mathrm{SINR}_{k'}^{\zeta}\right)}{P_{T,k}(p_k, P_R) + P_{c,k}}. \tag{3.1}$$

Here $(k, k') = (2m-1, 2m)$ or $(2m, 2m-1)$, for $m = 1, \cdots, K$, denotes the user pair exchanging information. The terms $p_k$, $\bar{P}_R = P_R/(2K)$ and $P_{c,k}$ denote the user transmit power, the relay transmit power and the fixed circuit power of the user $S_k$, respectively.

The constants $\eta_U \in (0, 1)$ and $\eta_R \in (0, 1)$ denote the power amplifier efficiencies at the user and at the relay, respectively. Also, $P_{c,k} = P_C/2K$, with $P_C$ being the total circuit power consumed [43, 50]:

$$P_C = P_{FIX} + P_{TC} + P_{CE} + P_{SP}. \tag{3.2}$$

Here $P_{FIX}$ is the fixed circuit power required for site-cooling, controlling, and baseband signal processing. Further, $P_{TC}$, $P_{CE}$ and $P_{SP}$ are the powers consumed by the transceivers, channel estimation, and signal processing with optimization, given respectively as [43, 50]

$$P_{TC} = P_{LO} + NP_{REL} + 2KP_{UE}, \qquad P_{CE} = \frac{B}{T}\,\frac{8NK^2}{\eta_{REL}}, \qquad \text{and}$$

$$P_{SP} = \begin{cases}
\left(1-\dfrac{2K}{T}\right)\dfrac{4NKB}{\eta_{REL}} + \dfrac{B}{T}\,\dfrac{6NK}{\eta_{REL}} + P_{COM} & \text{for MRC/MRT},\\[2ex]
\left(1-\dfrac{2K}{T}\right)\dfrac{4NKB}{\eta_{REL}} + \dfrac{B}{T}\left(\dfrac{8K^3}{3\eta_{REL}} + \dfrac{12NK^2 + 2NK}{\eta_{REL}}\right) + P_{COM} & \text{for ZFR/ZFT}.
\end{cases} \tag{3.3}$$

Here $P_{LO}$ is the power consumed by the local oscillator, and $P_{REL}$ and $P_{UE}$ are the powers required to run the circuit components attached to each antenna at the relay and at the UEs, respectively. The scalar $P_{COM}$ denotes the computation power. Further, the term $\eta_{REL}$ denotes the computational efficiency at the relay in terms of arithmetic complex-valued operations per Joule (flops/Watt).
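The circuit power model of (3.2) and (3.3) is straightforward to tabulate. The sketch below is one possible reading of those expressions; the function name, the argument names, and the `precoder` switch are illustrative, and `B` is simply the quantity denoted $B$ in (3.3).

```python
def circuit_power(N, K, T, B, eta_rel, P_fix, P_lo, P_rel, P_ue, P_com, precoder):
    """Total circuit power P_C = P_FIX + P_TC + P_CE + P_SP, following (3.2)-(3.3).

    precoder: "mrc" for MRC/MRT or "zf" for ZFR/ZFT (labels chosen for this sketch).
    """
    P_tc = P_lo + N * P_rel + 2 * K * P_ue              # transceiver chains
    P_ce = (B / T) * 8 * N * K ** 2 / eta_rel           # channel estimation
    data = (1 - 2 * K / T) * 4 * N * K * B / eta_rel    # per-symbol processing
    if precoder == "mrc":
        P_sp = data + (B / T) * 6 * N * K / eta_rel + P_com
    elif precoder == "zf":
        per_block = 8 * K ** 3 / (3 * eta_rel) + (12 * N * K ** 2 + 2 * N * K) / eta_rel
        P_sp = data + (B / T) * per_block + P_com
    else:
        raise ValueError("precoder must be 'mrc' or 'zf'")
    return P_fix + P_tc + P_ce + P_sp
```

For identical parameters the ZFR/ZFT branch always returns a larger value than the MRC/MRT branch, since its per-block term adds the matrix-inversion cost.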

We now optimally allocate the power for maximizing GEE (in bits/Joule/Hz), which is defined as [22, 59, 60]

$$\mathrm{GEE} = \frac{R(p_k, P_R)}{P_T(p_k, P_R)}. \tag{3.4}$$

The numerator in GEE is the SE, which also includes the channel estimation overhead, defined as

$$R(p_k, P_R) = \left(1 - \frac{\tau}{T}\right)\sum_{k=1}^{2K}\log_2\left(1 + \mathrm{SINR}_k^{\zeta}(p_k, P_R)\right). \tag{3.5}$$

The expressions of $\mathrm{SINR}_k^{\zeta}$, where $\zeta \in \{mrc, zf\}$, are given in (2.28) and (2.30), respectively. The denominator in GEE denotes the overall power consumed by the system [56]

$$P_T(p_k, P_R) = \sum_{k=1}^{2K} P_{T,k}(p_k, P_R) + P_C. \tag{3.6}$$

Before formulating the related optimization problem, we briefly discuss the terminologies

used in geometric and fractional programming from [61] and [62] respectively in Appendix

E, which will be used in the sequel. With this information, we next optimize GEE.
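Before turning to the optimization, a short sketch shows how the GEE in (3.4)-(3.6) is assembled from per-user SINRs and powers. It assumes the per-user transmit power term $P_{T,k}$ is the denominator of (3.1) without $P_{c,k}$, and it takes the SINR values as given inputs rather than computing them from (2.28) or (2.30). All names are illustrative.

```python
import math


def gee(sinr, p, P_R, P_C, K, T, tau, eta_U, eta_R, P_rho):
    """Global energy efficiency: SE of (3.5) divided by total power of (3.6).

    sinr, p: lists of length 2K with per-user SINRs and transmit powers.
    """
    se = (1 - tau / T) * sum(math.log2(1 + s) for s in sinr)
    P_R_bar = P_R / (2 * K)           # relay power share per user, as in (3.1)
    data_frac = 1 - 2 * K / T
    P_T = P_C
    for p_k in p:
        P_T += data_frac * (p_k / eta_U + P_R_bar / eta_R)
        P_T += (2 * K / T) * P_rho / eta_U
    return se / P_T
```

Raising the fixed circuit power `P_C` while holding everything else constant lowers the returned value, which is why the optimal operating point generally differs from the rate-maximizing one.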

3.5 GEE maximization

The GEE maximization problem is formulated as

$$\begin{aligned}
\mathbf{P1}: \ \max_{p_k, P_R} \quad & \frac{\left(1 - \frac{\tau}{T}\right)\sum_{k=1}^{2K}\log_2\left(1 + \mathrm{SINR}_k^{\zeta}(p_k, P_R)\right)}{\sum_{k=1}^{2K} P_{T,k}(p_k, P_R) + P_C} && (3.7\text{a})\\
\text{s.t.} \quad & 0 \leq p_k \leq P^{max}, \quad 0 \leq P_R \leq P_R^{max} && (3.7\text{b})\\
& \sum_{k=1}^{2K} p_k + P_R \leq P_u^{max}. && (3.7\text{c})
\end{aligned}$$

The first two constraints specify the peak transmit power of the user and the peak transmit power of the relay, i.e., $P^{max}$ and $P_R^{max}$, respectively. The last constraint specifies the total system transmit power $P_u^{max}$. We observe from (2.28) and (2.30) that $\mathrm{SINR}_k^{\zeta}$ is a ratio of a monomial and a posynomial. The numerator of the objective in P1 therefore becomes a ratio of two posynomials, which is not a posynomial and hence non-convex [63, pg. 160-163]. We also note that the constraints in P1 are upper-bounded posynomials and are therefore convex. The problem P1 can be re-cast as

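$$\begin{aligned}
\mathbf{P2}: \ \max_{p_k, P_R, \Gamma_k} \quad & \frac{\log_2\prod_{k=1}^{2K}(1 + \Gamma_k)}{\sum_{k=1}^{2K} P_{T,k}(p_k, P_R) + P_C} && (3.8\text{a})\\
\text{s.t.} \quad & \mathrm{ISNR}_k^{\zeta}(p_k, P_R) \leq \Gamma_k^{-1} && (3.8\text{b})\\
& (3.7\text{b}),\ (3.7\text{c}). && (3.8\text{c})
\end{aligned}$$

The symbol $\Gamma_k = \mathrm{SINR}_k^{\zeta}$ denotes an auxiliary variable and the term $\mathrm{ISNR}_k^{\zeta}$ denotes the inverse of $\mathrm{SINR}_k^{\zeta}$, i.e., $\mathrm{ISNR}_k^{\zeta} = 1/\mathrm{SINR}_k^{\zeta}$. Here we have dropped the constant $\left(1 - \frac{\tau}{T}\right)$ from the objective. We use the following result, proved in [62, 64], to solve the above problem.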

Proposition 3. Consider the concave-convex fractional program (CCFP) $g(x) = u(x)/v(x)$, with $u$ being non-negative, differentiable and concave, and $v$ being positive, differentiable and convex [62]. Then the function $g(x)$ is PC, and a stationary point $x^*$ of $g(x)$ is its global maximizer. The problem of maximizing $g(x)$ is equivalent to finding the positive zero of $D(\lambda)$, which is defined as

$$D(\lambda) \triangleq \max_x \{u(x) - \lambda v(x)\}. \tag{3.9}$$

The function $D(\lambda)$ is convex, continuous and strictly monotonically decreasing, and its zero is found using Dinkelbach's algorithm [64].
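The iteration behind Proposition 3 can be sketched on a toy single-variable fractional program. The example below is not the GEE problem: it assumes $u(x) = \log_2(1+x)$ and $v(x) = x + P_c$ on $[0, p_{max}]$, chosen only because the inner maximization then has a closed form. The function name and parameters are illustrative.

```python
import math


def dinkelbach_toy(P_c, p_max, tol=1e-9, max_iter=100):
    """Maximize log2(1 + x) / (x + P_c) over 0 <= x <= p_max.

    Each iteration solves max_x u(x) - lam * v(x) in closed form and then
    updates lam to the ratio u(x) / v(x), until D(lam) drops below tol.
    """
    lam = 0.0
    x = p_max
    for _ in range(max_iter):
        if lam <= 0.0:
            x = p_max                       # objective is increasing when lam = 0
        else:
            # stationary point of log2(1 + x) - lam * (x + P_c), clipped to the box
            x = min(max(1.0 / (lam * math.log(2)) - 1.0, 0.0), p_max)
        D = math.log2(1.0 + x) - lam * (x + P_c)
        if D <= tol:
            break
        lam = math.log2(1.0 + x) / (x + P_c)
    return x, lam
```

With `P_c = 1` the ratio becomes $\log_2(y)/y$ in $y = 1 + x$, whose maximizer is $y = e$, so the routine should return roughly $x = e - 1$ and $\lambda = 1/(e \ln 2)$. In Algorithm 1 the inner step is a GP instead of a closed-form expression, but the outer update of $\lambda$ has the same structure.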

We first state the objective of the eventual GEE optimization problem, as required by Dinkelbach's algorithm:

$$\max_{p_k, P_R, \Gamma_k} \ \log_2\left(\prod_{k=1}^{2K}(1 + \Gamma_k)\right) - \lambda^{[m]}\left(\sum_{k=1}^{2K} P_{T,k}(p_k, P_R) + P_C\right) \quad \text{or equivalently} \tag{3.10a}$$

$$\min_{p_k, P_R, \Gamma_k} \ \log_2\left(\prod_{k=1}^{2K}(1 + \Gamma_k)\right)^{-1} + \lambda^{[m]}\left(\sum_{k=1}^{2K} P_{T,k}(p_k, P_R) + P_C\right). \tag{3.10b}$$

We recall the following from [63]: i) the objective in a geometric program (GP) is a posynomial; ii) the inverse of a posynomial is not a posynomial; and iii) the inverse of a monomial is a monomial. Now the first term in the objective function in (3.10a), i.e., $\prod_{k=1}^{2K}(1 + \Gamma_k)$, is a product of posynomial functions, and therefore a posynomial. Its inverse in the equivalent objective in (3.10b) is, however, not a posynomial. Now if the first term in (3.10a) is approximated as a monomial, its inverse in the equivalent objective of (3.10b) is also a monomial. The objective therefore becomes a sum of a monomial and a posynomial (the second term), which is a posynomial. The approximated problem can thus be solved as a GP, since its objective and constraints are posynomials. To approximate the posynomial, we use the following lemma from [65].

Lemma 1. Consider a monomial function $q(\nu_k) = \delta_k\nu_k^{\Delta_k}$ ($\nu_k > 0$), which approximates $s(\nu_k) = 1 + \nu_k$ near an arbitrary point $\hat{\nu}_k > 0$. For this approximation, the following two conditions hold.

1. For the best monomial local approximation, the parameters $\Delta_k$ and $\delta_k$ are given by

$$\Delta_k = \hat{\nu}_k(1 + \hat{\nu}_k)^{-1}, \qquad \delta_k = \hat{\nu}_k^{-\Delta_k}(1 + \hat{\nu}_k). \tag{3.11}$$

2. For all $\nu_k > 0$, $s(\nu_k) \geq q(\nu_k)$.
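Both properties of Lemma 1 are easy to check numerically. The following sketch computes the monomial parameters of (3.11) around an expansion point and evaluates the approximation; the names `monomial_fit` and `monomial_value` are illustrative.

```python
def monomial_fit(nu_hat):
    """Parameters (delta, Delta) of the monomial delta * nu**Delta that
    best approximates 1 + nu near nu_hat > 0, following (3.11)."""
    Delta = nu_hat / (1.0 + nu_hat)
    delta = nu_hat ** (-Delta) * (1.0 + nu_hat)
    return delta, Delta


def monomial_value(nu, delta, Delta):
    """Evaluate q(nu) = delta * nu**Delta."""
    return delta * nu ** Delta
```

At the expansion point the monomial matches $1 + \nu$ exactly, and away from it the monomial stays below $1 + \nu$, which is the property that makes each GP iterate a valid lower-bound step.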

Using Lemma 1, the numerator of the objective function in problem P2 can be approximated as $\log_2\prod_{k=1}^{2K}\left[\delta_k(p_k, P_R)\,\Gamma_k^{\Delta_k(p_k, P_R)}\right]$, with $\Delta_k(p_k, P_R) = \hat{\Gamma}_k(1 + \hat{\Gamma}_k)^{-1}$ and $\delta_k(p_k, P_R) = (\hat{\Gamma}_k)^{-\Delta_k(p_k, P_R)}(1 + \hat{\Gamma}_k)$, where $\hat{\Gamma}_k$ is an initial value approximation for $\Gamma_k$. Given the approximated objective, the optimization P2 can be formulated as follows:

$$\mathbf{P3}: \ \max_{p_k, P_R, \Gamma_k} \ \frac{\log_2\prod_{k=1}^{2K}\left[\delta_k(p_k, P_R)\,\Gamma_k^{\Delta_k(p_k, P_R)}\right]}{\sum_{k=1}^{2K} P_{T,k}(p_k, P_R) + P_C} \quad \text{s.t.} \ (3.7\text{b}),\ (3.7\text{c}),\ (3.8\text{b}). \tag{3.12}$$

The optimization P3 is now a CCFP. Let us now exploit the monomial approximation and Dinkelbach's algorithm to solve the GEE problem P3, as illustrated in Algorithm 1 (see footnote 1).

Remark 6. The first inequality constraint in Algorithm 1 is added to confine the domain of

variables Γ to a region around the current guess Γ[m] [61]. In most practical cases, β = 1.1

provides a good accuracy/speed trade-off for the monomial approximations [61,65].

3.6 Computational complexity

• Complexity of Algorithm 1: i) one GP in step-3; ii) calculation of the scalar $D(\lambda^{[m]})$ in step-4; iii) iterative calculation of $\lambda^{[m+1]}$ in step-5.

We see that the overall complexity of the proposed algorithm is dominated by the complexity of solving the GP; the other two steps have trivial (and nearly the same) computational complexity. Two approaches are used to solve a GP: the interior point method [63] and the infeasible interior point algorithm [67].

The standard barrier-based interior-point method of solving a GP has a worst-case poly-

nomial time complexity, that scales gracefully with the problem size [63]. The complexity

1The Algorithm 1 uses the monomial, obtained using Lemma 1, to cast the problem P3 as a GP. Notethat this algorithm is not optimal as it approximates the posynomial with a monomial. It is a heuristicthat often computes the globally optimal power allocation [66]. Nevertheless, unlike many other methods, italso enjoys strong optimality claims, ensuring the monotonic objective improvement after each iteration, andfirst-order optimality upon convergence. The monotonic optimization, which obtains global optimal solutionin finite time [66], can also be used to solve problem P2 without any approximation. But the worst-casecomplexity of monotonic optimization increases at least exponentially in 2K [66]. The proposed sequentialfractional solution can solve the problem in polynomial time with affordable complexity. Further the proposedalgorithm – as concluded by exhaustive numerical studies in [66] – is an excellent heuristic that achieves aglobal optimum solution.


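Algorithm 1: GEE maximization algorithm

Input: Given the maximum number of iterations $L$ and a tolerance $\epsilon > 0$, set $m = 1$ and $\lambda^{[1]} = 0$. Calculate the initial values $p_k$, $P_R$ and $\Gamma_k^{[1]}$ by allocating equal power to all users, i.e., $p_k = P_R/2K$ and $P_R = P/2$.

Output: $p_k^*$ and $P_R^*$ as the solutions.

1 for $m \leftarrow 1$ to $L$ do

2 Given a feasible $p_k$, $P_R$, compute $\Delta_k^{[m]}(p_k, P_R)$ and $\delta_k^{[m]}(p_k, P_R)$.

3 Solve the GP to calculate $p_k$, $P_R$ and $\Gamma_k$:

$$\min_{p_k, P_R, \Gamma_k}\ \log_2\left(\prod_{k=1}^{2K} \delta_k^{[m]}(p_k, P_R)\, \Gamma_k^{\Delta_k^{[m]}(p_k, P_R)}\right)^{-1} + \lambda^{[m]}\left(\sum_{k=1}^{2K} P_{T,k}(p_k, P_R) + P_C\right)$$

$$\text{s.t.}\quad \beta^{-1}\Gamma_k^{[m]} \le \Gamma_k \le \beta\,\Gamma_k^{[m]},\quad \text{(3.7b), (3.7c), (3.8b)}.$$

4 Based on the solutions $P_R$ and $p_k$ of the GP, compute $\Gamma_k$ and

$$D(\lambda^{[m]}) = \log_2 \prod_{k=1}^{2K} \delta_k^{[m]}(p_k, P_R)\, \Gamma_k^{\Delta_k^{[m]}(p_k, P_R)} - \lambda^{[m]}\left(\sum_{k=1}^{2K} P_{T,k}(p_k, P_R) + P_C\right).$$

5 Do until convergence: if $D(\lambda^{[m]}) > \epsilon$ then $\Gamma_k^{[m+1]} = \Gamma_k$ and

$$\lambda^{[m+1]} = \frac{\log_2 \prod_{k=1}^{2K} \delta_k^{[m]}(p_k, P_R)\, \Gamma_k^{\Delta_k^{[m]}(p_k, P_R)}}{\sum_{k=1}^{2K} P_{T,k}(p_k, P_R) + P_C}.$$

6 else break.

7 return $p_k^*$ and $P_R^*$.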


analysis of the barrier-based interior point method and of the parameter choices is given in

detail in [63].

The infeasible algorithm proposed in [67] simultaneously solves both the primal and dual GP, and has a competitive numerical efficiency for a wide range of GPs. This algorithm is tested on the most challenging GP problems and it is shown to be faster than the earlier methods.

3.7 Closed-form optimal solution

We now provide a closed-form power allocation solution for the GEE optimization problem.

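We commence by simplifying the lower bound for ZFR/ZFT processing by assuming that no interference arrives from the users, i.e., $\sigma_{k,i}^2 = 0$. The denominator of the sum-rate in (2.30), under the high-SNR assumptions $\sigma_n^2/P_R \ll 1$, $\sigma_n^2/p_k \ll 1$ and $\sigma_{nr}^2/p_k \ll 1$, can be expressed as

$$D_k = \frac{\sigma_{nr}^2 + P_R\,\sigma_{LIR}^2}{(N - 2K - 1)\,\sigma_{g,k'}^2} + \eta\,\sigma_n^2\,\sigma_{LIR}^2. \tag{3.13}$$

Therefore the simplified high-SNR expression of ZFR/ZFT processing is as follows:

$$\mathrm{SINR}_k^{\mathrm{zf}} = \frac{p_{k'}}{\dfrac{\sigma_{nr}^2 + P_R\,\sigma_{LIR}^2}{(N - 2K - 1)\,\sigma_{g,k'}^2} + \eta\,\sigma_n^2\,\sigma_{LIR}^2}. \tag{3.14}$$

The GEE can be expressed using (3.4)-(3.6), (3.14) as

$$\mathrm{GEE} = \frac{\sum_{k=1}^{2K} \log_2\left(1 + \mathrm{SINR}_k^{\mathrm{zf}}\right)}{\sum_{k=1}^{2K} p_k + P_R + P_c}. \tag{3.15}$$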

Remark 7. We assume the power-amplifier efficiency to be unity both at the user and at the relay, as in [52,56,68], for mathematical simplicity. Also, Pc takes into account the power required by the different components, namely the transceiver’s radio-frequency chain, the oscillator and the power consumption of channel estimation [52, 56, 68]. This assumption is commonly exploited in the mMIMO literature for mathematical simplicity [56], and it does not affect the overall behavior of the system considered.

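The GEE maximization problem can be rewritten using the following assumptions: $p_k = P_S$ for $1 \le k \le 2K$, $\sigma_{g,k}^2 = \sigma_g^2$ for $1 \le k \le 2K$, and $P_R = 2KP_S$, as follows

$$\mathbf{P}^*:\ \max_{P_S}\ \frac{2K \log_2\left(1 + \dfrac{P_S}{\dfrac{\sigma_{nr}^2 + 2KP_S\,\sigma_{LIR}^2}{(N - 2K - 1)\,\sigma_g^2} + \eta\,\sigma_n^2\,\sigma_{LIR}^2}\right)}{4KP_S + P_c} \tag{3.16a}$$

$$\text{s.t.}\quad P_S > 0. \tag{3.16b}$$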

The GEE in (3.16a) can be shown to be a quasi-concave function with respect to $P_S$. Hence, by applying the KKT conditions of the GEE maximization problem $\mathbf{P}^*$ we can obtain a globally optimal solution $P_S^*$. Using the classic KKT conditions, we have

$$\frac{(4KP_S + P_c)\,C_2}{\left((C_1 + 1)P_S + C_2\right)\left(C_1P_S + C_2\right)\ln 2} - 4K \log_2\left(1 + \frac{P_S}{C_1P_S + C_2}\right) = 0, \tag{3.17}$$

where $C_1 = \dfrac{2K\sigma_{LIR}^2}{(N - 2K - 1)\,\sigma_g^2}$ and $C_2 = \dfrac{\sigma_{nr}^2}{(N - 2K - 1)\,\sigma_g^2} + \eta\,\sigma_n^2\,\sigma_{LIR}^2$.

Therefore, by using (3.16a) and (3.17), we obtain the optimal GEE as follows

$$\mathrm{GEE} = \frac{0.5\,C_2}{\left((C_1 + 1)P_S + C_2\right)\left(C_1P_S + C_2\right)\ln 2}. \tag{3.18}$$

We can similarly obtain a closed-form solution for maximizing GEE with MRC/MRT processing.
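As a numerical illustration of (3.16a)-(3.18), the sketch below finds the root of the KKT condition (3.17) by bisection and compares the GEE in (3.16a) with the closed form (3.18) at that root. The function names and the values of K, C1, C2 and Pc used in the test are hypothetical placeholders chosen only for illustration, not values from the thesis; bisection is an assumed root-finder, since the text does not specify one.

```python
import math


def kkt_residual(ps, K, c1, c2, pc):
    """Left-hand side of the KKT condition (3.17)."""
    den = ((c1 + 1.0) * ps + c2) * (c1 * ps + c2) * math.log(2.0)
    return (4 * K * ps + pc) * c2 / den - 4 * K * math.log2(1.0 + ps / (c1 * ps + c2))


def gee(ps, K, c1, c2, pc):
    """GEE objective (3.16a) written in terms of C1 and C2."""
    return 2 * K * math.log2(1.0 + ps / (c1 * ps + c2)) / (4 * K * ps + pc)


def gee_closed_form(ps, c1, c2):
    """Closed-form GEE (3.18), valid at the root of (3.17)."""
    return 0.5 * c2 / (((c1 + 1.0) * ps + c2) * (c1 * ps + c2) * math.log(2.0))


def solve_ps(K, c1, c2, pc, lo=1e-9, hi=1e6, iters=200):
    """Bisection on (3.17); the residual is positive at lo and negative at hi."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if kkt_residual(mid, K, c1, c2, pc) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Calling `solve_ps(5, 0.05, 0.5, 2.0)` returns a candidate $P_S^*$ at which `gee` and `gee_closed_form` should agree, which is a convenient check on the algebra leading to (3.18).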


We now numerically investigate the GEE achieved by the proposed algorithm using Monte-

Carlo simulations by modeling the mMIMO FD relay system under realistic conditions [43].

We assume that the relay coverage area is a square with relay located at the center of the

square, and all the users randomly located at distances greater than the minimum distance


Table 3.2: System parameters.

Parameter                                   Value
Area of coverage                            0.25 × 0.25 km²
Median channel gain, Υ                      -148.1 dB
Minimum distance, dmin                      35 m
Path-loss exponent, α                       3.76
Shadow fading (standard deviation), σsf     1 dB
Bandwidth, B                                20 MHz
Receive noise power, σ²                     -94 dBm
Training interval, τ                        2K
Coherence interval, T (in symbols)          200
Fixed power, PFIX                           10 W
Power for relay LO, PLO                     0.1 W
Power per relay antenna, PREL               0.2 W
Power per user antenna, PUser               0.1 W
Relay computation efficiency, LRel          750 Gflops/W

dmin. The large-scale fading (LSF) coefficients corresponding to the channels relay→$S_k$ and $S_k$→relay, i.e., $\sigma_{f,k}^2$ and $\sigma_{g,k}^2$, respectively, follow the path-loss model [43]:

$$\sigma_{\cdot,k}^2\ [\mathrm{dB}] = \Upsilon - 10\alpha \log_{10}\left(\frac{d_k}{1\ \mathrm{km}}\right) + F_{\cdot,k}, \quad \text{where } \sigma_{\cdot,k}^2 = \begin{cases} \sigma_{g,k}^2 & \text{for } S_k \to \text{relay} \\ \sigma_{f,k}^2 & \text{for relay} \to S_k. \end{cases} \tag{3.19}$$

Here $\Upsilon$ is the median channel gain at the reference distance of 1 km, $d_k$ is the separation distance between the user $S_k$ and the relay, $\alpha$ is the path-loss exponent which determines the rate at which the signal decays with the distance, and $F_{\cdot,k} \sim \mathcal{CN}(0, \sigma_{sf}^2)$ is the shadow fading term that creates log-normal random variations around the transmit and receive antennas of each user $S_k$.
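The user drop and the path-loss model (3.19) can be sketched as follows, using the values of Table 3.2 as defaults. This is an illustrative sketch only: the function names are hypothetical, the shadow-fading term is drawn here as a real Gaussian in dB (an assumption made so that the result is log-normal in linear scale), and rejection sampling is assumed for enforcing the minimum distance.

```python
import math
import random


def lsf_db(d_m, upsilon_db=-148.1, alpha=3.76, shadow_db=0.0):
    """Large-scale fading in dB from (3.19); d_m is the distance in metres."""
    return upsilon_db - 10.0 * alpha * math.log10(d_m / 1000.0) + shadow_db


def drop_users(n, side_m=250.0, d_min_m=35.0, rng=None):
    """Distances of n users dropped uniformly in a square centred on the relay."""
    if rng is None:
        rng = random.Random(0)
    distances = []
    while len(distances) < n:
        x = rng.uniform(-side_m / 2.0, side_m / 2.0)
        y = rng.uniform(-side_m / 2.0, side_m / 2.0)
        d = math.hypot(x, y)
        if d >= d_min_m:  # reject users closer than the minimum distance
            distances.append(d)
    return distances


def sample_lsf_db(distances_m, sigma_sf_db=1.0, rng=None):
    """One LSF realization per user, with Gaussian shadowing in dB (assumption)."""
    if rng is None:
        rng = random.Random(1)
    return [lsf_db(d, shadow_db=rng.gauss(0.0, sigma_sf_db)) for d in distances_m]
```

Without shadowing, a user at the 1 km reference distance sees exactly the median gain Υ, and every tenfold reduction in distance adds 10α dB.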

We set the noise variances at the relay and at the user to be equal, i.e., $\sigma_{nr}^2 = \sigma_n^2 = \sigma^2$; the SI $\sigma_{LIR}^2$ at the relay and the SI and IUI at the user, $\sigma_{k,i}^2$ and $\sigma_{UI}^2$ for $1 \le k, i \le 2K$, are normalized to 0 dB with respect to $\sigma^2$. The LSF coefficients are normalized with respect to the total receiver noise power $\sigma^2$ (which includes thermal noise and a noise figure of 7 dB in the receiver hardware). We set the length of the coherence interval to T = 200 symbols and the training length to τ = 2K. We set the maximum transmit power of each user as $P^{\max} = 30$ dBm and the maximum relay transmit power as $P_R^{\max} = 40$ dBm; the circuit powers $P_{c,k}$ for $0 \le k \le 2K$ are modeled according to the practical power consumption model given in (3.2). The main system parameters, which remain fixed unless otherwise specified, are tabulated in Table 3.2.

3.8.1 Global energy-efficiency maximization

[Figure 3.17: plot of GEE (Mbits/Joule) versus $P_u^{\max}$ [dBm]; curves: GEEmax and EPA, each for ZFR/ZFT and MRC/MRT.]

Fig. 3.17: GEE comparison of MRC/MRT and ZFR/ZFT processing versus $P_u^{\max} = P_u^{\max}/\sigma^2$ for $P_\rho = 20$ dBm, K = 5 and N = 64.

We first investigate the performance of the GEE maximization algorithm. The values of the SI $\sigma_{LIR}^2$ and IUI $\sigma_{UI}^2$, $\sigma_{k,j}^2$ for $k, j = 1, \cdots, 2K$ are assumed to be 0 dB with respect to $\sigma^2$. We plot in Fig. 3.17 the GEE versus $P_u^{\max}$ for both MRC/MRT and ZFR/ZFT processing. We see that the proposed algorithm (labelled as GEEmax) yields a higher GEE than the equal power allocation scheme (labelled as EPA), which allocates equal power to all 2K users. We also see that the achieved GEE remains constant for $P_u^{\max} \ge 20$ dBm for the MRC/MRT scheme,


[Figure 3.18: plot of GEE (Mbits/Joule) versus the number of relay antennas, N; curves: ZFR/ZFT and MRC/MRT for $P_u^{\max} = 0$ dBm and $P_u^{\max} = 20$ dBm.]

Fig. 3.18: GEE comparison of MRC/MRT and ZFR/ZFT processing versus N for $P_\rho = 20$ dBm and K = 5.

since with Pmaxu = 20 dBm, the system achieves the maximum GEE. After attaining the

maximum GEE, the system does not use any additional power, since any additional power

usage would reduce GEE. We observe the opposite, undesired behavior in EPA, where the system keeps using the available power beyond $P_u^{\max} = 20$ dBm and hence the GEE reduces. We also

see that for Pmaxu ≤ 0 dBm the MRC/MRT has better GEE than the ZFR/ZFT design.

In Fig. 3.18 we investigate the GEE by varying the number of antennas N for two

different values of Pmaxu . For this study, we fix the parameters to K = 5 user-pairs and

Pρ = 20 dBm. We observe that as N increases, the achievable GEE initially increases for both MRC/MRT and ZFR/ZFT processing. This is a benefit of the increased array gain, which results in an increase in SE. However, as we further increase N, the GEE starts decreasing after attaining its maximum value around N = 80 for $P_u^{\max} = 20$ dBm. This is because the circuit power consumption also increases with N; recall that the number of RF chains is equal to the number of relay antennas. We also see that for $P_u^{\max} = 0$ dBm and N < 80, MRC/MRT performs better than ZFR/ZFT, whereas for larger values of N, ZFR/ZFT starts performing better than MRC/MRT. The reason, as also mentioned in [8] while analyzing the SE, is that the beam formed by the relay relying on ZFR/ZFT with small N values does not point to the target user (since the ZFR/ZFT processing focuses on mitigating the interference). Hence, the effective signal power received by the user is lower than for MRC/MRT processing. However, for large N values, the channels of different users become (nearly) orthogonal, and the beams created by the ZFR/ZFT processing are directed towards the target users [22].

Remark 8. The MRC/MRT design is the simplest transceiver technique, and it works well at low SNR, as shown in [21, 22] in the context of the SE, and in the current work in the context of the GEE. We see from Fig. 3.18 that MRC/MRT outperforms ZFR/ZFT at low SNR values. Due to its reasonably good low-SNR performance, the MRC/MRT design is widely used in mMIMO systems [21,22].

3.8.2 FD versus HD GEE comparison

We now investigate the GEE of FD and HD relaying for different values of $\sigma_{LIR}^2$ and $\sigma_{UI}^2$ in conjunction with MRC/MRT processing. We note that the GEE of AF HD mMIMO relaying has not been investigated in the literature. In the proposed framework, the GEE of HD mMIMO relaying can be evaluated by setting the SI and IUI to zero and multiplying the SE by 0.5, due to HD operation. We observe from Fig. 3.19 that for $\sigma_{LIR}^2 = \sigma_{UI}^2 = 0$ dB the FD system has a higher GEE than a HD system. However, the HD system performs better for $\sigma_{LIR}^2 = \sigma_{UI}^2 = 10$ dB. We also see that for N < 128 and $\sigma_{LIR}^2 = 5$ dB, $\sigma_{UI}^2 = 0$ dB, HD performs better than FD, whereas for N > 128 FD achieves a higher GEE than HD. We therefore conclude that the effect of SI reduces as the number of antennas increases.

3.8.3 SE comparison with existing FD designs

We note that GEE maximization framework considered in this work can also maximize the

SE by maximizing the numerator of the Problem P3 – the modified problem is a GP. The

objective of this study is to show the spectral gains accrued by MMSE channel estimation and


[Figure 3.19: plot of GEE (Mbits/Joule); curves: $\sigma_{LIR}^2 = \sigma_{UI}^2 = -5$ dB; $\sigma_{LIR}^2 = \sigma_{UI}^2 = 0$ dB; $\sigma_{LIR}^2 = 5$ dB, $\sigma_{UI}^2 = 0$ dB; $\sigma_{LIR}^2 = \sigma_{UI}^2 = 10$ dB; HD.]

Fig. 3.19: GEE versus η comparing FD and HD systems for MRC/MRT with $P_\rho = 20$ dBm.


proposed power allocation by comparing the SE of the proposed system to that of the existing

FD relaying system in [22, 23]. Reference [23] derives the lower-bound and maximizes SE

for the MRC/MRT processing alone, whilst relying on LS channel estimation; reference [22]

considers equal-power MRC/MRT and ZFR/ZFT FD relaying designs.

We now evaluate the SE of the proposed algorithm using simulations. For this analysis,

we consider a mMIMO relay system supporting K = 5 user pairs and N = 500 relay antennas.

We define $\eta = P_u^{\max}/\sigma^2$, where $P_u^{\max}$ is the maximum total transmit power of the system. For the plots in the sequel, we vary η by fixing $\sigma^2$ to unity and then varying $P_u^{\max}$. The large-scale fading matrices $D_u$ and $D_d$, similar to [21,30], are assumed to be:

$D_u$ = diag [0.749, 0.045, 0.246, 0.121, 0.125, 0.142, 0.635, 0.256, 0.021, 0.123] and

$D_d$ = diag [0.257, 0.856, 1.000, 0.899, 0.014, 0.759, 0.315, 0.432, 0.195, 0.562].

The circuit power is Pc = 15 dBm [26,30,66]. The noise variances are set to $\sigma_n^2 = \sigma_{nr}^2 = \sigma^2$.

[Figure 3.20: plot of SE (bps/Hz) versus $P_\rho$ (dBm); curves: MRC/MRT, proposed; ZFR/ZFT, proposed; MRC/MRT, [5]; for $\sigma_{nr}^2 = 0$ dB and $\sigma_{nr}^2 = 10$ dB.]

Fig. 3.20: SE of MRC/MRT and ZFR/ZFT versus: a) $P_\rho$, comparing the performance of MMSE and LS estimators, where K = 5, N = 256 and $P_R = 10$ dBm.


[Figure 3.21: plot of SE (bps/Hz) versus η [dB]; curves: MRC/MRT, proposed; MRC/MRT, [4]; MRC/MRT, [5]; ZFR/ZFT, proposed; ZFR/ZFT, [4].]

Fig. 3.21: SE of MRC/MRT and ZFR/ZFT versus η, where K = 5, $P_\rho = 20$ dBm, N = 500.

We commence by comparing the SE to that of the design in [23]. We plot in Fig. 3.20, the

SE versus Pρ, where we observe that for MRC/MRT processing, the MMSE estimator outperforms the LS estimator [23] both for a lower pilot power Pρ, and a higher noise power $\sigma_{nr}^2$. We

also see that the ZFR/ZFT associated with Pρ > 0 dB outperforms MRC/MRT processing.

We also note that Pρ = 20 dB maximizes SE for both MRC/MRT and ZFR/ZFT processing.

We use this pilot power for our next comparison.

We now evaluate the SE by varying η for the proposed algorithm considering both

MRC/MRT and ZFR/ZFT processing. We see from Fig. 3.21 that the algorithm proposed

for both MRC/MRT and ZFR/ZFT processing considerably improves SE, when compared

with their counterparts from [22], which allocates equal power to all users. We also see

that with the proposed algorithm the MRC/MRT processing outperforms even ZFR/ZFT

for η < 5 dB. Furthermore, the performance of the MRC/MRT processing associated with

the proposed algorithm overlaps with that of [23]. This fact can also be justified by observing

Fig. 3.20, where for Pρ = 20 dBm, the performance of the proposed system and [23] is the

same for MRC/MRT processing.


[Figure 3.22, panel (a): plot of SE (bps/Hz) versus η [dB]; curves: Proposed, HD lower bound; HD lower bound [7]; [7, Eq.(27)]; for N = 128 and N = 256.]

Fig. 3.22: SE versus $\eta = P_u^{\max}/\sigma^2$ comparison of HD system for MRC/MRT, where K = 10, $P_\rho = 20$ dB.


[Figure 3.23: plot of SE (bps/Hz) versus η [dB]; curves: $\sigma_{LIR}^2 = \sigma_{UI}^2 = 0$, proposed; $\sigma_{LIR}^2 = \sigma_{UI}^2 = 0$ dB, proposed; $\sigma_{LIR}^2 = \sigma_{UI}^2 = 5$ dB, proposed; Half-duplex [7]; Half-duplex [8]; FD, using [7].]

Fig. 3.23: Performance comparison of FD and HD systems: SE versus $\eta = P_u^{\max}/\sigma^2$ for MRC/MRT for K = 5, N = 500 and $P_\rho = 20$ dB.

In (2.33), we simplified the MRC/MRT lower bounds for the HD case and showed that they match the existing results in [30, Eq.(27)]. In Fig. 3.22a we check the sanity of these results by demonstrating that the two bounds also match numerically. We also observe from the figure that the bounds match their asymptotic limits.

3.8.4 FD versus HD SE comparison

We now compare the SE of the proposed FD system with the existing HD systems in [27, 30] for different values of SI $\sigma_{LIR}^2$ and IUI $\sigma_{UI}^2$. Reference [27] derives the asymptotic SE and

GEE of a HD system by allocating equal power to all users; reference [30] allocates power

to maximize SE of a HD system. We first blindly apply the results available in [30] and

show that they yield rather optimistic SE results owing to neglecting both the SI and IUI. To

derive FD SE results from the HD SE results available in [30], we multiply them by a factor of

two. We observe from Fig. 3.23 that, when both SI and IUI are set to zero, the SE obtained

using the results in [30] and the proposed system matches. In the presence of realistic SI and


[Figure 3.24: plot of SE (bps/Hz) versus η [dB]; curves: $\sigma_{LIR}^2 = \sigma_{UI}^2 = -10$ dB, proposed; $\sigma_{LIR}^2 = \sigma_{UI}^2 = -5$ dB, proposed; $\sigma_{LIR}^2 = \sigma_{UI}^2 = 0$ dB, proposed; Half-duplex [7]; Half-duplex [8].]

Fig. 3.24: Performance comparison of FD and HD systems: SE versus $\eta = P_u^{\max}/\sigma^2$ for ZFR/ZFT for K = 5, N = 500 and $P_\rho = 20$ dB.

IUI, the SE results of [30] are rather different from those of the current work. Quantitatively, we see from Fig. 3.23 that even at a low SI and IUI value of 0 dB, the results of [30] over-estimate the SE by 7.5 bps/Hz at a SNR of 10 dB. Furthermore, in Fig. 3.23 and Fig. 3.24, we investigate the value of $\sigma_{LIR}^2$ (along with $\sigma_{UI}^2$) for which a FD system outperforms its HD counterpart. We observe from Fig. 3.23 that for a FD system using MRC/MRT processing, $\sigma_{LIR}^2$ and $\sigma_{UI}^2$ should be around 0 dB for it to achieve a better SE than a HD system with optimal power allocation [30]. We see from Fig. 3.24 that with $\sigma_{LIR}^2 = \sigma_{UI}^2 = 0$ dB, the ZFR/ZFT FD processing has a lower SE than its HD counterpart. This is unlike the MRC/MRT FD processing in Fig. 3.23, which significantly outperforms its HD counterpart for $\sigma_{LIR}^2 = \sigma_{UI}^2 = 0$ dB. A ZFR/ZFT FD system requires higher suppression of $\sigma_{LIR}^2$ and $\sigma_{UI}^2$ for it to perform better than a HD system.


3.9 Summary

We considered a multi-pair AF FD mMIMO two-way relay and designed an algorithm to maximize the GEE. We demonstrated the EE improvement achieved by the proposed algorithm for various scenarios. We also exhaustively compared the SE and GEE of the proposed FD relaying system to the existing state-of-the-art HD relaying systems, and characterized the values of SI and IUI for which the proposed system outperforms the existing ones.

The proposed algorithms can be used to optimize GEE of not only emerging 5G systems

but also existing sensor networks, where relays/fusion-nodes enable communication between

multiple devices. GEE is the ratio of the network’s sum-rate and the network’s energy

consumption. The network-centric GEE metric, however, cannot prioritize the EE of individual users. The user-centric weighted sum energy efficiency (WSEE) metric [26], which is a

weighted sum of the EEs of the individual users, allows better control of the individual user

EE by allocating a different weight to it.



Chapter 4

Weighted Sum Energy Efficiency

Optimization

4.1 Introduction

In this Chapter, we consider the user-centric WSEE metric [26], which is a weighted sum of

EEs of the individual users, and allows better control of the individual user EE by allocating

a different weight to it. The network-centric GEE metric, discussed in Chapter 3, which

is the ratio of the network’s sum-rate and the network’s energy consumption, however, is

not suited for the scenario when different users have different EE priorities [25, 26]. The

proposed WSEE algorithm in this chapter can be applied to a scenario where multiple users are located in an area where traditional cellular networks are disrupted, e.g., due to a natural disaster. The users in such locations might not have a reliable energy source. A low-cost mMIMO relay could then be set up to serve these users, and it is useful to prioritize the individual energy efficiency of energy-strapped users. We next survey state-of-the-art

papers considering WSEE maximization.


4.2 Related work

The WSEE optimization, due to its sum-of-ratio form, is a hard-to-tackle non-convex program. Very few works consider the WSEE metric [25,26,69–71]. Reference [25] designed a precoder to optimize the WSEE of coordinated single-hop multi-cell multi-user systems. Reference [26] optimized the WSEE of the downlink of a single-hop cellular orthogonal frequency division multiple access system. Reference [69] proposed a distributed pricing algorithm to solve the WSEE maximization problem in ad hoc networks. Reference [70] developed an iterative algorithm to maximize the WSEE of wireless networks, which is shown to converge theoretically and to achieve a KKT solution. Reference [71] considered a WSEE maximization problem for device-to-device communications. In the mMIMO relaying literature, either the achievable rate, e.g., [21,23,30], or the GEE [50,57] is optimized. The WSEE metric for FD relaying has not yet been optimized. This chapter bridges this gap.


We next list the main contributions as follows:

• We design an optimal power allocation algorithm to maximize the WSEE of multi-pair FD mMIMO two-way AF relaying with MRC/MRT and ZFR/ZFT relay processing. The non-convex WSEE metric is optimized by first approximating it as a concave-convex function, and later by applying the quadratic transformation [72]. This contribution is significantly different from [57], which maximizes GEE by using Dinkelbach’s algorithm. We note that the WSEE cannot be optimized using Dinkelbach’s algorithm due to its sum-of-ratio form.

• We also maximize the WSEE by formulating its KKT conditions, and then by exploiting the sum-based nature of the WSEE to simplify them. With these simplifications, which involve challenging differential computations, we propose an iterative algorithm based on a modified waterfilling approach [28], which yields a KKT point of the non-convex WSEE problem. We note that the waterfilling technique is commonly shown as a closed-form solution for optimal SE, and not WSEE. To the best of our knowledge, this is the first work which has reported closed-form waterfilling solutions to find a KKT point of the non-convex WSEE of two-way FD mMIMO relaying. We note that the Dinkelbach-based algorithms proposed in references [50, 57] to optimize GEE are iterative numerical solutions, with no closed form, and require convex optimization software, e.g., CVX [63]. The closed-form waterfilling solution derived herein, in contrast, performs only a bisection search on the Lagrange multiplier, and does not require convex optimization software. We numerically show that the closed-form WSEE optimization requires three orders of magnitude less computational time than the proposed quadratic-transformation-based WSEE optimization.

• We numerically show that the WSEE optimization framework developed herein enables us to set suitable weights to satisfy the heterogeneous EE requirements of various users in the network.

• The proposed WSEE optimization framework for a FD mMIMO relaying system can also maximize the WSEE of its HD counterpart. We numerically compare the WSEE of both HD and FD systems to characterize both SI and IUI values for which a FD system has better WSEE than a HD one. We show that for a FD system to have a better WSEE than a HD system, the MRC/MRT relay processing requires lower SI and IUI suppression than the ZFR/ZFT processing.

The rest of the chapter is organized as follows. The WSEE problem formulation and its difference from the GEE are presented in Section 4.4, and the WSEE optimization problem is solved in Section 4.5. The closed-form solution for the WSEE is derived in Section 4.6. The computational complexity is analyzed in Section 4.7. The simulation results are presented in Section 4.8, and this chapter is summarized in Section 4.9.


4.4 Problem statement and motivation

Our aim now is to develop an algorithm to optimize the WSEE. We begin by rewriting the EE of the kth user, defined in (3.1), as follows:

$$\mathrm{EE}_k = \frac{A \log_2\left(1 + \mathrm{SINR}_{k'}^{\zeta}\right)}{\left(1 - \frac{2K}{T}\right)\left(\eta_U^{-1}p_k + \eta_R^{-1}P_R\right) + \frac{2K}{T}\eta_U^{-1}P_\rho + P_{c,k}} \triangleq \frac{A \log_2\left(1 + \mathrm{SINR}_{k'}^{\zeta}\right)}{P_{T,k}(p_k, P_R) + P_{c,k}}. \tag{4.1}$$

Recall that $(k, k') = (2m - 1, 2m)$ or $(2m, 2m - 1)$, for $m = 1, \cdots, K$, denotes the user-pair exchanging information and the constant $A = 1 - 2K/T$ denotes the rate loss due to the channel estimation phase.

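The expression for $\mathrm{SINR}_k^{\zeta}$, where $\zeta \in \{\mathrm{mrc}, \mathrm{zf}\}$, is given by (please refer to (2.28) and (2.30))

$$\mathrm{SINR}_k^{\zeta} \triangleq \frac{\Delta_{k'}^{\zeta}}{\Lambda_{k'}^{\zeta}} = \frac{a_k p_{k'}}{\sum_{i=1}^{2K}\left(b_{k,i}^{(1)} + b_{k,i}^{(2)}P_R^{-1} + \sum_{i,k\in U_k} p_i P_R^{-1} b_{k,i}^{(3)}\right)p_i + c_k p_k + \left(d_k^{(1)} + d_k^{(2)}P_R + d_k^{(3)}P_R^{-1}\right) + \sum_{i,k\in U_k} p_i\left(P_R^{-1}e_{k,i}^{(1)} + e_{k,i}^{(2)}\right)}, \tag{4.2}$$

where for MRC/MRT processing ($\zeta = \mathrm{mrc}$):

$a_k = N^2\sigma_{f,k}^4\sigma_{g,k'}^4$,

$b_{k,i}^{(1)} = \mu_{k,i} = \Phi\sigma_{f,k}^2\sigma_{g,i}^2 + N\left(\sigma_{f,k}^2\sigma_{g,i}^4\sigma_{f,i'}^2 + \sigma_{g,i}^2\sigma_{f,k}^4\sigma_{g,k'}^2\right)$,

$b_{k,i}^{(2)} = \sigma_n^2\left(\Phi\sigma_{g,i}^2 + N\sigma_{g,i}^4\sigma_{f,i'}^2\right)$, $b_{k,i}^{(3)} = \sigma_{k,i}^2\left(\Phi\sigma_{g,i}^2 + N\sigma_{g,i}^4\sigma_{f,i'}^2\right)$,

$c_k = -\left(\Phi\sigma_{f,k}^2\sigma_{g,k}^2 + N\left(\sigma_{f,k}^4\sigma_{g,k}^2\sigma_{g,k'}^2 + \sigma_{f,k}^2\sigma_{g,k}^4\sigma_{f,k'}^2\right)\right)$,

$d_k^{(1)} = \left(\sigma_{LIR}^2\sigma_n^2 + \sigma_{nr}^2\sigma_{f,k}^2\right)\Phi + N\sigma_{nr}^2\sigma_{f,k}^4\sigma_{g,k'}^2$, $d_k^{(2)} = \sigma_{LIR}^2\left(\sigma_{f,k}^2\Phi + N\sigma_{f,k}^4\sigma_{g,k'}^2\right)$, $d_k^{(3)} = \sigma_{nr}^2\sigma_n^2\Phi$,

$e_{k,i}^{(1)} = \sigma_{nr}^2\sigma_{k,i}^2\Phi$, $e_{k,i}^{(2)} = \sigma_{LIR}^2\sigma_{k,i}^2\Phi$, and $\Phi = \sum_{i=1}^{K}\left(\sigma_{g,2i-1}^2\sigma_{f,2i}^2 + \sigma_{g,2i}^2\sigma_{f,2i-1}^2\right)$;

and for ZFR/ZFT processing ($\zeta = \mathrm{zf}$):

$a_k = 1$, $b_{k,i}^{(1)} = \frac{1}{M}\left(\frac{\sigma_{\xi,f,k}^2}{\sigma_{f,i'}^2} + \frac{\sigma_{\xi,g,i}^2}{\sigma_{g,k'}^2}\right) + \sigma_{\xi,f,k}^2\sigma_{\xi,g,i}^2\eta$,

$b_{k,i}^{(2)} = \sigma_n^2\left(\frac{1}{M\sigma_{f,i'}^2} + \eta\sigma_{\xi,g,i}^2\right)$, $b_{k,i}^{(3)} = \sigma_{k,i}^2\left(\frac{1}{M\sigma_{f,i'}^2} + \eta\sigma_{\xi,g,i}^2\right)$, $c_k = 0$,

$d_k^{(1)} = \sigma_{nr}^2\left(\frac{1}{M\sigma_{g,k'}^2} + \sigma_{\xi,f,k}^2\eta\right) + \eta\sigma_{LIR}^2\sigma_n^2$, $d_k^{(2)} = \sigma_{LIR}^2\left(\frac{1}{M\sigma_{g,k'}^2} + \sigma_{\xi,f,k}^2\eta\right)$,

$d_k^{(3)} = \eta\sigma_{nr}^2\sigma_n^2$, $e_{k,i}^{(1)} = \eta\sigma_{k,i}^2\sigma_{nr}^2$, $e_{k,i}^{(2)} = \eta\sigma_{k,i}^2\sigma_{LIR}^2$, $M = N - 2K - 1$ and $\eta = \sum_{j=1}^{2K}\frac{1}{M^2\sigma_{f,j}^2\sigma_{g,j'}^2}$.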


The WSEE is now defined as the weighted sum of the EEs of multiple users [25,26,51,69]:

$$\mathrm{WSEE}(p_k, P_R) = \sum_{k=1}^{2K} w_k\,\mathrm{EE}_k. \tag{4.3}$$

Here $w_k$ for $k = 1, \cdots, 2K$ are the predefined weights, which are used to prioritize the EE of different user links. The user links are commonly prioritized in a heterogeneous network where an energy-strapped user places a high premium on its EE, and the weights help in achieving that objective. The denominator of the $\mathrm{EE}_k$ metric contains the real energy consumed for each user link [43,50]. The WSEE problem, by ignoring the factor A, can be cast as

$$\mathbf{P1}:\ \underset{p_k, P_R}{\text{maximize}}\ \sum_{k=1}^{2K} w_k \frac{\log_2\left(1 + \mathrm{SINR}_{k'}^{\zeta}\right)}{P_{T,k}(p_k, P_R) + P_{c,k}} \tag{4.4a}$$

$$\text{subject to}\quad 0 \le p_k \le P^{\max},\quad 0 \le P_R \le P_R^{\max}, \tag{4.4b}$$

$$\sum_{k=1}^{2K} p_k + P_R \le P_u^{\max}. \tag{4.4c}$$

The first and the second constraints are on the peak transmit power of each user and the relay, respectively [30]. The last constraint is on the total transmit power [30].

Before optimizing the WSEE, we motivate it by differentiating it from the GEE metric,

usually considered in the EE literature [50, 57]. We also discuss the challenges involved in

optimizing the WSEE when compared with the GEE optimization. We start by defining GEE:

$$\mathrm{GEE}(p_k, P_R) = \frac{A\sum_{k=1}^{2K}\log_2\left(1 + \mathrm{SINR}_k^{\zeta}\right)}{\sum_{k=1}^{2K}\left(P_{T,k}(p_k, P_R) + P_{c,k}\right)}. \tag{4.5}$$

We observe the following from (4.3) and (4.5):

• The network-centric metric GEE, which is the ratio of the total sum rate and the total transmit power of the system, captures the EE of the entire network, and cannot prioritize the EE of different users. This is unlike WSEE, which by setting the weights $w_k$ for $k = 1, \cdots, 2K$, can do this.


• The GEE optimization comes under the ambit of a single-ratio fractional programming (FP) framework where the objective is a fraction of concave and convex functions [24]. The sum rate in the GEE numerator, which is not a concave function of $(p_k, P_R)$, can be linearly approximated as a concave function [50,57]. The fractional GEE objective, consequently, becomes a PC function, which can be globally maximized using Dinkelbach’s algorithm [64].

• The WSEE, which is the weighted sum of the $\mathrm{EE}_k$ terms, falls under the category of multiple-ratio FP problems, with each $\mathrm{EE}_k$ being in fractional form. The SE of the user in the numerator of each $\mathrm{EE}_k$ term, similar to GEE, is not a concave function in $(p_k, P_R)$, but can be linearly approximated as a concave function. Each $\mathrm{EE}_k$ therefore becomes a PC function, and the WSEE, consequently, becomes a sum of PC functions, which is not guaranteed to be PC [24]. The WSEE optimization is therefore a challenging non-convex program [25,26,69], and Dinkelbach’s algorithm cannot be used to optimize it. We next solve it using the QT approach [72].
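The difference between the two metrics can be seen in a small numerical sketch. The helper names and the two-user rates, powers and weights below are hypothetical illustration values, and the common factor A is ignored, as in P1.

```python
def gee(rates, powers):
    """Network-centric GEE: total rate divided by total consumed power."""
    return sum(rates) / sum(powers)


def wsee(rates, powers, weights):
    """User-centric WSEE: weighted sum of the per-user ratios rate/power."""
    return sum(w * r / p for w, r, p in zip(weights, rates, powers))


# Two users: user 1 is individually energy-efficient, user 2 is not.
rates = [2.0, 4.0]
powers = [1.0, 4.0]
```

With these values the GEE is a single ratio (6/5), whereas the WSEE changes from 3.0 for equal weights to 1.8 for weights (0.8, 0.2); only the latter metric responds to a change in user priority.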

4.5 WSEE optimization using quadratic transform

A novel QT was proposed in [72] to tackle the multiple-ratio concave-convex fractional problem (CCFP); it decouples the numerator and denominator of each ratio term. This converts a multiple-ratio CCFP into a sequence of convex optimization problems.

We first state the QT [72, Corollary 2] in the following proposition.

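Proposition 4. Consider a sum-of-functions-of-ratio problem

$$\underset{\mathbf{x}}{\text{maximize}}\ \sum_{k=1}^{2K} f_k\left(\frac{u_k(\mathbf{x})}{v_k(\mathbf{x})}\right)\quad \text{subject to } \mathbf{x} \in \mathcal{X}, \tag{4.6}$$

where $\mathcal{X}$ is a convex set, and $f_k(\cdot)$ is a non-negative function. The sequence of functions in the numerator and the denominator are defined such that $u_k(\mathbf{x}) : \mathbb{R}^n \to \mathbb{R}_{+}$ and $v_k(\mathbf{x}) : \mathbb{R}^n \to \mathbb{R}_{++}$ for $1 \le k \le 2K$. Using QT [72], the above problem can be equivalently cast as

$$\underset{\mathbf{x}, \mathbf{y}}{\text{maximize}}\ \sum_{k=1}^{2K} f_k\left(2y_k\sqrt{u_k(\mathbf{x})} - y_k^2 v_k(\mathbf{x})\right)\quad \text{subject to } \mathbf{x} \in \mathcal{X},\ \mathbf{y} \in \mathbb{R}^{2K}, \tag{4.7}$$

where $\mathbf{y}$ is the collection of auxiliary variables $y_1, y_2, \cdots, y_{2K}$.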

Proposition 4 assumes that the functions uk(x) and vk(x) for 1 ≤ k ≤ 2K are non-negative

and positive, respectively. We next state another result from [72] which assumes a specific

structure on the functions uk(x) and vk(x).

Proposition 5. For the sum-of-functions-of-ratio problem in (4.6), if each ratio $u_k(\mathbf{x})/v_k(\mathbf{x})$ is in the concave-convex form, i.e., each $u_k(\mathbf{x})$ is concave and each $v_k(\mathbf{x})$ is convex, and further assuming that the function $f_k$ for all k is non-decreasing and concave, then for a given $\mathbf{y}$, the problem (4.7) is a concave problem in $\mathbf{x}$, and for a given $\mathbf{x}$, the optimal value of $y_k$ can be obtained in closed form as $y_k^\star = \sqrt{u_k(\mathbf{x})}/v_k(\mathbf{x})$. By iteratively optimizing $\mathbf{x}$ and $\mathbf{y}$, the problem (4.7) converges to a stationary point of (4.6) with a nondecreasing objective value after every iteration.
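The alternating updates described in Proposition 5 can be sketched on a toy scalar sum-of-ratios problem. Everything specific below is an assumption made for illustration: the ratios log2(1 + a_k x)/(x + c_k), the linear choice f_k(r) = w_k r, the box constraint on x, the function name `qt_maximize`, and the use of a ternary search in place of a convex solver for the x-update.

```python
import math


def qt_maximize(a, c, w, x0=1.0, x_max=10.0, iters=100):
    """Quadratic-transform iterations for max_x sum_k w_k u_k(x) / v_k(x)."""
    n = len(a)

    def u(k, x):  # concave, non-negative numerator
        return math.log2(1.0 + a[k] * x)

    def v(k, x):  # affine, positive denominator
        return x + c[k]

    def true_obj(x):
        return sum(w[k] * u(k, x) / v(k, x) for k in range(n))

    def argmax_concave(g, lo, hi, steps=200):
        # Ternary search; valid because the surrogate is concave in x.
        for _ in range(steps):
            m1 = lo + (hi - lo) / 3.0
            m2 = hi - (hi - lo) / 3.0
            if g(m1) < g(m2):
                lo = m1
            else:
                hi = m2
        return 0.5 * (lo + hi)

    x = x0
    history = [true_obj(x)]
    for _ in range(iters):
        # Closed-form update of the auxiliary variables (Proposition 5).
        y = [math.sqrt(u(k, x)) / v(k, x) for k in range(n)]

        def surrogate(t, y=y):
            # Objective of (4.7) for fixed y: concave in t.
            return sum(
                w[k] * (2.0 * y[k] * math.sqrt(u(k, t)) - y[k] ** 2 * v(k, t))
                for k in range(n)
            )

        x = argmax_concave(surrogate, 0.0, x_max)
        history.append(true_obj(x))
    return x, history


x_qt, obj_history = qt_maximize([1.0, 4.0], [1.0, 2.0], [1.0, 1.0])
```

The list `obj_history` records the original sum-of-ratios objective after every iteration, so the nondecreasing behavior stated in Proposition 5 can be checked directly.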

We now solve problem P1 using the QT in Proposition 4. We observe from the objective

of P1 that for each EEk term the i) numerator is the SE of the Sk → Sk′ link, which is

a non-negative function and; ii) denominator is the transmit power of user Sk which is a

positive function. The numerator and the denominator in each of the fractional terms (EEk),

using Proposition 4, can therefore be decoupled. The problem P1 can be recast as

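$$\mathbf{P2}:\ \underset{p_k, P_R, z_k}{\text{maximize}}\ \sum_{k=1}^{2K} w_k\left[2z_k\sqrt{\log_2\left(1 + \mathrm{SINR}_{k'}^{\zeta}\right)} - z_k^2\left(P_{T,k}(p_k, P_R) + P_{c,k}\right)\right] \tag{4.8a}$$

$$\text{subject to}\quad 0 \le p_k \le P^{\max},\quad 0 \le P_R \le P_R^{\max}, \tag{4.8b}$$

$$\sum_{k=1}^{2K} p_k + P_R \le P_u^{\max}\ \text{and}\ \mathbf{z} \in \mathbb{R}^{2K}. \tag{4.8c}$$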

Here $\mathbf{z} = [z_1, \cdots, z_{2K}]$ is the collection of auxiliary variables, where each $z_k$ decouples the numerator and the denominator in each $\mathrm{EE}_k$ term. We observe from the objective of P2 that the term $\mathrm{SINR}_{k'}^{\zeta}$ is in fractional form (see (4.2)). By again applying Proposition 4 to each of the $\mathrm{SINR}_{k'}^{\zeta}$ terms, the problem P2 can be recast as

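$$\mathbf{P3}:\ \underset{p_k, P_R, z_k, y_{k'}}{\text{maximize}}\ \sum_{k=1}^{2K} w_k\left[2z_k\sqrt{\log_2\left(1 + 2y_{k'}\sqrt{\Delta_{k'}^{\zeta}} - y_{k'}^2\Lambda_{k'}^{\zeta}\right)} - z_k^2\left(P_{T,k}(p_k, P_R) + P_{c,k}\right)\right] \tag{4.9a}$$

$$\text{subject to}\quad 0 \le p_k \le P^{\max},\quad 0 \le P_R \le P_R^{\max}, \tag{4.9b}$$

$$\sum_{k=1}^{2K} p_k + P_R \le P_u^{\max},\quad \mathbf{z} \in \mathbb{R}^{2K}\ \text{and}\ \mathbf{y} \in \mathbb{R}^{2K}. \tag{4.9c}$$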

Here $y_{k'}$ is the auxiliary variable that decouples $\Delta_{k'}^{\zeta}$ and $\Lambda_{k'}^{\zeta}$. To summarize, we have applied the QT in Proposition 4 twice to recast P1 as P3. We next use Proposition 5 to solve (4.4), which is transformed as (4.9), and correspondingly P1, which is transformed as P3.

From Proposition 5, we see that the problem P3 is a concave maximization problem in $(p_k, P_R)$ if $\Delta_{k'}^{\zeta}$ and $\Lambda_{k'}^{\zeta}$ are concave and convex functions, respectively. From (4.2), we have $\Delta_{k'}^{\zeta} = a_k p_{k'}$, which is an affine (concave) function in $p_k$. We now consider the term $\Lambda_{k'}^{\zeta}$ from (4.2) and after rearranging the terms, we get

$$\Lambda_{k'}^{\zeta} = \sum_{i=1}^{2K}\left(b_{k,i}^{(1)}p_i + b_{k,i}^{(2)}p_iP_R^{-1}\right) + \sum_{i,k\in U_k} p_i^2P_R^{-1}b_{k,i}^{(3)} + \sum_{i\neq j}^{2K}\sum_{j,k\in U_k} p_ip_jP_R^{-1}b_{k,i}^{(3)} + c_kp_k + \left(d_k^{(1)} + d_k^{(2)}P_R + d_k^{(3)}P_R^{-1}\right) + \sum_{i,k\in U_k}\left(p_iP_R^{-1}e_{k,i}^{(1)} + p_ie_{k,i}^{(2)}\right). \tag{4.10}$$

From (4.10), we see that $\Lambda_{k'}^{\zeta}$ is non-convex because of the fractional non-convex terms $p_iP_R^{-1}$, $p_ip_jP_R^{-1}$ and $p_i^2P_R^{-1}$. To apply Proposition 5, we have to approximate it as a convex function. We achieve this aim in two steps as follows:

• Handling of $p_iP_R^{-1}$: We first consider the non-convex term $p_iP_R^{-1}$ and replace $P_R^{-1}$ with $P_{Rinv}$. We note that the modified term $p_iP_{Rinv}$ is a bilinear function in $p_i$ and $P_{Rinv}$ but still non-convex, which we convexify as follows. We first express it as a difference of two convex functions in $p_i$ and $P_{Rinv}$ as follows

$$p_iP_{Rinv} = \frac{1}{2}\left(p_i + P_{Rinv}\right)^2 - \frac{1}{2}\left(p_i^2 + P_{Rinv}^2\right) \tag{4.11}$$

$$\overset{(a)}{\approx} \frac{1}{2}\left(p_i + P_{Rinv}\right)^2 - \frac{1}{2}\left(\bar{p}_i^2 + \bar{P}_{Rinv}^2\right) - \bar{p}_i\left(p_i - \bar{p}_i\right) - \bar{P}_{Rinv}\left(P_{Rinv} - \bar{P}_{Rinv}\right) \triangleq \left(p_iP_{Rinv}\right)^{\star}. \tag{4.12}$$

The approximation in (a) is obtained by linearizing the second term in (4.11), i.e., $\frac{1}{2}\left(p_i^2 + P_{Rinv}^2\right)$. Here $\bar{p}_i$ and $\bar{P}_{Rinv}$ are the initial values of $p_i$ and $P_{Rinv}$, respectively. It is easy to see that the approximate expression of $p_iP_{Rinv}$, denoted as $\left(p_iP_{Rinv}\right)^{\star}$ in (4.12), is convex in $(p_i, P_{Rinv})$.
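As a short worked check of this step, subtracting the exact bilinear term from the approximation gives a sum of squares. The overbars denote the expansion point (the initial values); this identity is derived here for illustration and is not stated in the source.

```latex
\begin{align*}
\left(p_i P_{Rinv}\right)^{\star} - p_i P_{Rinv}
 &= \tfrac{1}{2}\left(p_i^2 + P_{Rinv}^2\right)
    - \tfrac{1}{2}\left(\bar{p}_i^2 + \bar{P}_{Rinv}^2\right)
    - \bar{p}_i\left(p_i - \bar{p}_i\right)
    - \bar{P}_{Rinv}\left(P_{Rinv} - \bar{P}_{Rinv}\right) \\
 &= \tfrac{1}{2}\left(p_i - \bar{p}_i\right)^2
    + \tfrac{1}{2}\left(P_{Rinv} - \bar{P}_{Rinv}\right)^2 \;\ge\; 0 .
\end{align*}
```

Under this reading, the surrogate is a convex upper bound on the bilinear term that is tight at the expansion point, which is consistent with its use inside the denominator term of the SINR.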

• Handling of $p_i p_j P_R^{-1}$ and $p_i^2 P_R^{-1}$: We next linearize the non-convex terms $p_i p_j P_R^{-1}$ and $p_i^2 P_R^{-1}$ in (4.10) as follows. We first express them as $p_i p_j P_{Rinv}$ and $p_i^2 P_{Rinv}$, respectively. We then use the first-order Taylor series approximation in the following lemma.

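Lemma 2. The non-convex terms $p_i p_j P_{Rinv}$ and $p_i^2 P_{Rinv}$ can be linearly approximated as

$$
(p_i p_j P_{Rinv})^{\star} = \bar{p}_i \bar{p}_j \bar{P}_{Rinv}\left[\frac{p_i}{\bar{p}_i} + \frac{p_j}{\bar{p}_j} + \frac{P_{Rinv}}{\bar{P}_{Rinv}} - 2\right], \qquad
(p_i^2 P_{Rinv})^{\star} = 2\bar{p}_i\left[\bar{P}_{Rinv}\, p_i + \frac{\bar{p}_i P_{Rinv}}{2} - \bar{p}_i \bar{P}_{Rinv}\right], \tag{4.13}
$$

where $\bar{p}_i$, $\bar{p}_j$ and $\bar{P}_{Rinv}$ are the initial values of $p_i$, $p_j$ and $P_{Rinv}$, respectively.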

Proof. Refer to Appendix F.
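The two convexification steps above can be checked numerically. The following is a minimal Python sketch, not part of the original derivation: the function names and the sample expansion point are illustrative assumptions. It evaluates the surrogate of the bilinear term in (4.12) and the two first-order Taylor approximations of Lemma 2, each of which should coincide with the true term at the expansion point. Since the surrogate in (4.12) only linearizes the subtracted convex part, it should also never fall below the true product.

```python
def bilinear_surrogate(p, q, p0, q0):
    """Convex surrogate of p*q as in (4.12), built around the point (p0, q0)."""
    convex_part = 0.5 * (p + q) ** 2
    # First-order expansion of 0.5*(p**2 + q**2) around (p0, q0).
    linearized = 0.5 * (p0 ** 2 + q0 ** 2) + p0 * (p - p0) + q0 * (q - q0)
    return convex_part - linearized


def product_surrogate(pi, pj, q, pi0, pj0, q0):
    """First-order Taylor approximation of pi*pj*q (first term of Lemma 2)."""
    return pi0 * pj0 * q0 * (pi / pi0 + pj / pj0 + q / q0 - 2.0)


def square_surrogate(p, q, p0, q0):
    """First-order Taylor approximation of p**2 * q (second term of Lemma 2)."""
    return 2.0 * p0 * (q0 * p + 0.5 * p0 * q - p0 * q0)


if __name__ == "__main__":
    # Hypothetical expansion point: two user powers and the inverse relay power.
    p0, pj0, q0 = 0.8, 0.5, 2.0
    for step in (0.0, 0.01, 0.1):
        p, pj, q = p0 + step, pj0 + step, q0 - step
        # Approximation errors; all three vanish at step = 0.
        print(step,
              bilinear_surrogate(p, q, p0, q0) - p * q,
              product_surrogate(p, pj, q, p0, pj0, q0) - p * pj * q,
              square_surrogate(p, q, p0, q0) - p * p * q)
```

The gap of the bilinear surrogate equals $\tfrac{1}{2}(p_i-\bar{p}_i)^2 + \tfrac{1}{2}(P_{Rinv}-\bar{P}_{Rinv})^2$, so it shrinks quadratically as the iterates approach the expansion point.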

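We now change $P_R^{-1}$ to $P_{Rinv}$ and then replace the non-convex terms $p_i P_{Rinv}$, $p_i p_j P_{Rinv}$ and $p_i^2 P_{Rinv}$ in (4.10) with the respective approximate terms from (4.12) and Lemma 2 to obtain the following convex expression in $(p_k, P_{Rinv})$.

$$
\begin{aligned}
\tilde{\Lambda}^{\zeta}_{k'} ={}& \sum_{i=1}^{2K}\Big(b^{(1)}_{k,i}\,p_i + b^{(2)}_{k,i}\,(p_i P_{Rinv})^{\star}\Big) + \sum_{i,k\in U_k} (p_i^2 P_{Rinv})^{\star}\, b^{(3)}_{k,i} + \sum_{i\neq j}^{2K}\sum_{j,k\in U_k} (p_i p_j P_{Rinv})^{\star}\, b^{(3)}_{k,i} + c_k p_k \\
&+ \Big(d^{(1)}_k + d^{(2)}_k P_{Rinv}^{-1} + d^{(3)}_k P_{Rinv}\Big) + \sum_{i,k\in U_k}\Big((p_i P_{Rinv})^{\star}\, e^{(1)}_{k,i} + p_i\, e^{(2)}_{k,i}\Big).
\end{aligned}
\tag{4.14}
$$

We replace $\Lambda^{\zeta}_{k'}$ with $\tilde{\Lambda}^{\zeta}_{k'}$ and change the optimization variable $P_R$ to $P_{Rinv}$ in problem P3, to obtain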

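$$
\text{P4}:\ \underset{p_k,\,P_{Rinv},\,z_k,\,y_k}{\text{maximize}}\ \sum_{k=1}^{2K} w_k\left[2 z_k\sqrt{\log_2\!\Big(1 + 2 y_{k'}\sqrt{\Delta^{\zeta}_{k'}} - y_{k'}^2\,\tilde{\Lambda}^{\zeta}_{k'}\Big)} - z_k^2\big(P_{T,k}(p_k, P_{Rinv}) + P_{c,k}\big)\right] \tag{4.15a}
$$

$$
\text{subject to}\quad 0 \le p_k \le P^{\max},\quad (P_R^{\max})^{-1} \le P_{Rinv} \le P_{Rinv}^{\max}, \tag{4.15b}
$$

$$
\sum_{k=1}^{2K} p_k + P_{Rinv}^{-1} \le P_u^{\max},\quad \mathbf{z}\in\mathbb{R}^{2K}\ \text{and}\ \mathbf{y}\in\mathbb{R}^{2K}. \tag{4.15c}
$$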


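For a given $y_k$ and $z_k$, the problem P4 is now concave in $(p_k, P_{Rinv})$ and can be solved using solvers, e.g., CVX [63]. After calculating $(p_k, P_{Rinv})$, we calculate the optimal values of the auxiliary variables $z_k$ and $y_{k'}$ by using Proposition 5 as

$$
z_k^{*} = \frac{\sqrt{\log_2\!\big(1 + \Delta^{\zeta}_{k'}/\tilde{\Lambda}^{\zeta}_{k'}\big)}}{P_{T,k}(p_k, P_{Rinv}) + P_{c,k}} \quad\text{and}\quad y_{k'}^{*} = \frac{\sqrt{\Delta^{\zeta}_{k'}}}{\tilde{\Lambda}^{\zeta}_{k'}}. \tag{4.16}
$$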

We therefore iteratively calculate $(p_k, P_{Rinv})$ by first solving P4 for a given $(\mathbf{y}, \mathbf{z})$ and then calculating $(\mathbf{y}, \mathbf{z})$ from (4.16). The iterative process is summarized in Algorithm 2. As stated in Proposition 5, the solution obtained using Algorithm 2 converges to a stationary point of P1.
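The role of the auxiliary variables in (4.16) can be illustrated with a small numerical sketch. The Python snippet below is an illustration under assumed scalar inputs (the function names and sample values are hypothetical, not taken from the thesis). For fixed numerator and denominator, plugging the closed-form $y^{*}$ and $z^{*}$ back into the quadratic-transform surrogates should recover the original ratios, i.e., the SINR $\Delta/\Lambda$ and the energy efficiency (rate over power).

```python
import math


def optimal_y(delta, lam):
    """Closed-form maximizer of 2*y*sqrt(delta) - y**2 * lam, as in (4.16)."""
    return math.sqrt(delta) / lam


def sinr_surrogate(y, delta, lam):
    """Quadratic-transform surrogate of the ratio delta / lam."""
    return 2.0 * y * math.sqrt(delta) - y ** 2 * lam


def optimal_z(rate, power):
    """Closed-form maximizer of 2*z*sqrt(rate) - z**2 * power, as in (4.16)."""
    return math.sqrt(rate) / power


def ee_surrogate(z, rate, power):
    """Quadratic-transform surrogate of the ratio rate / power."""
    return 2.0 * z * math.sqrt(rate) - z ** 2 * power


if __name__ == "__main__":
    delta, lam, power = 3.0, 2.0, 4.0        # hypothetical scalar values
    y_star = optimal_y(delta, lam)
    sinr = sinr_surrogate(y_star, delta, lam)  # equals delta / lam at y*
    rate = math.log2(1.0 + sinr)
    z_star = optimal_z(rate, power)
    print(sinr, delta / lam)
    print(ee_surrogate(z_star, rate, power), rate / power)
```

Any other value of $y$ or $z$ gives a smaller surrogate value, which is why alternating between (4.16) and P4 does not decrease the objective.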

Algorithm 2: WSEE maximization algorithm using quadratic transform.

Input: Given the maximum number of iterations $L$ and a tolerance $\epsilon > 0$. Calculate the initial $p_k$ and $P_{Rinv}$ by allocating equal power to all users and the relay, i.e., $p_k^{[0]} = P^{\max}/4K$ and $P_{Rinv} = (P^{\max}/2)^{-1}$.

Output: $\{p_k^{*}\}_{k=1}^{2K}$ and $P_{Rinv}^{\star}$ as the solutions.

1 for $m \leftarrow 1$ to $L$ do

2 Given a feasible $p_k$ and $P_{Rinv}$, compute $z_k$ and $y_k$, using the expressions in (4.16).

3 Compute $\{p_k^{[m]}\}_{k=1}^{2K}$ and $P_{Rinv}^{[m]}$, by solving P4.

4 Do until convergence: if $\|\mathbf{p}^{[m]} - \mathbf{p}^{[m-1]}\| < \epsilon$ then break.

5 return $p_k^{*}$ and $P_{Rinv}^{\star}$.

The above WSEE optimization using the QT-based Algorithm 2 requires convex optimization problem solvers, e.g., CVX [63]. We now aim to develop a low-complexity algorithm that optimizes the WSEE by exploiting the KKT conditions, which reduces the computational time when compared with Algorithm 2. This becomes crucial when we deploy the optimized power allocation in real time and need to calculate the optimization variables within the channel coherence time. We later numerically show that the closed-form expression derived next significantly reduces the computation time when compared with Algorithm 2. We note that the existing GEE works in [50, 57] do not derive a closed-form power allocation solution.

4.6 Closed-form WSEE solution

To derive the closed-form solution, we further simplify the $\text{SINR}^{\zeta}_{k}$ expressions in (4.2) for the MRC/MRT and ZFR/ZFT processing by assuming that $\sigma_n^2/P_R \ll 1$, $\sigma_n^2/p_k \ll 1$ and $\sigma_{nr}^2/p_k \ll 1$. This assumption is made in the mMIMO literature to simplify sum-rate expressions and derive asymptotically optimal power allocation schemes [30]. These assumptions also imply that $P_\rho \gg \sigma_{nr}^2$, which implies $\hat{\sigma}_{g,k}^2 \approx \sigma_{g,k}^2$ and $\hat{\sigma}_{f,k}^2 \approx \sigma_{f,k}^2$ for $k = 1, \cdots, 2K$. With $\sigma_n^2/P_R \ll 1$, $\sigma_n^2/p_k \ll 1$ and $\sigma_{nr}^2/p_k \ll 1$ for $k = 1, \cdots, 2K$, the denominator of the lower bound in (4.2) for MRC/MRT processing can be written as

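$$
\begin{aligned}
D_k = \alpha^2 N^3 \Bigg[ &\sum_{i=1}^{2K} p_i \Bigg( \frac{\Phi\,\sigma_{f,k}^2\sigma_{g,i}^2}{N} + \sigma_{f,k}^2\sigma_{g,i}^4\sigma_{f,i'}^2 + \sigma_{g,i}^2\sigma_{f,k}^4\sigma_{g,k'}^2 + \Big(\frac{\sigma_{g,i}^2\Phi}{N} + \sigma_{g,i}^4\sigma_{f,i'}^2\Big)\sum_{i,k\in U_k} P_R^{-1} p_i \sigma_{k,i}^2 \Bigg) \\
&+ \sum_{i=1}^{2K} p_i P_R^{-1}\sigma_n^2\Big(\frac{\Phi\,\sigma_{g,i}^2}{N} + \sigma_{g,i}^4\sigma_{f,i'}^2\Big) - \Big(\frac{\Phi\,\sigma_{f,k}^2\sigma_{g,k}^2}{N} + \sigma_{f,k}^4\sigma_{g,k}^2\sigma_{g,k'}^2 + \sigma_{f,k}^2\sigma_{g,k}^4\sigma_{f,k'}^2\Big)p_k \\
&+ P_R\,\sigma_{LIR}^2\Big(\frac{\sigma_{f,k}^2\Phi}{N} + \sigma_{f,k}^4\sigma_{g,k'}^2\Big) + \sum_{i,k\in U_k} p_i\sigma_{k,i}^2\sigma_{LIR}^2\frac{\Phi}{N} + \Big(\frac{\sigma_{LIR}^2\sigma_n^2}{N} + \frac{\sigma_{nr}^2\sigma_{f,k}^2}{N}\Big)\Phi + \sigma_{nr}^2\sigma_{f,k}^4\sigma_{g,k'}^2 \Bigg].
\end{aligned}
$$

Neglecting the terms that are multiples of $1/N$ (since for a mMIMO system, the number of relay antennas $N \gg 2K$), we have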

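$$
\begin{aligned}
D_k = \alpha^2 N^3 \Bigg[ &\sum_{i\neq k}^{2K} p_i\Big(\sigma_{f,k}^2\sigma_{g,i}^4\sigma_{f,i'}^2 + \sigma_{g,i}^2\sigma_{f,k}^4\sigma_{g,k'}^2\Big) + \sum_{i=1}^{2K} p_i P_R^{-1}\sigma_n^2\,\sigma_{g,i}^4\sigma_{f,i'}^2 \\
&+ \sum_{i=1}^{2K} p_i\,\sigma_{g,i}^4\sigma_{f,i'}^2 \sum_{i,k\in U_k} P_R^{-1} p_i \sigma_{k,i}^2 + P_R\,\sigma_{LIR}^2\sigma_{f,k}^4\sigma_{g,k'}^2 + \sigma_{nr}^2\sigma_{f,k}^4\sigma_{g,k'}^2 \Bigg].
\end{aligned}
\tag{4.17}
$$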

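The sum-rate for the MRC/MRT processing can be approximated as $\log_2\!\big(1 + \text{SINR}^{mrc}_{k}\big)$, where

$$
\text{SINR}^{mrc}_{k} = \frac{N\,\sigma_{f,k}^2\sigma_{g,k'}^4\,p_{k'}}{\displaystyle\sum_{i\neq k}^{2K}\sigma_{g,i}^2\big(\sigma_{g,i}^2\sigma_{f,i'}^2 + \sigma_{f,k}^2\sigma_{g,k'}^2\big)p_i + \sum_{i=1}^{2K} p_i P_R^{-1}\frac{\sigma_{g,i}^4\sigma_{f,i'}^2}{\sigma_{f,k}^2}\Big(\sigma_n^2 + \sum_{i,k\in U_k} p_i\sigma_{k,i}^2\Big) + P_R\,\sigma_{LIR}^2\sigma_{f,k}^2\sigma_{g,k'}^2 + \sigma_{nr}^2\sigma_{f,k}^2\sigma_{g,k'}^2}. \tag{4.18}
$$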


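We now consider the denominator of the sum-rate expression in (4.2) for the ZFR/ZFT processing. With $\sigma_n^2/P_R \ll 1$, $\sigma_n^2/p_k \ll 1$ and $\sigma_{nr}^2/p_k \ll 1$, we have

$$
D_k = \alpha^2\left[\frac{\sigma_{nr}^2 + P_R\,\sigma_{LIR}^2}{M\sigma_{g,k'}^2} + \eta\,\sigma_{LIR}^2\sigma_n^2 + \sum_{i=1}^{2K}\frac{p_i P_R^{-1}}{M\sigma_{f,i'}^2}\Big(\sigma_n^2 + \sum_{i,k\in U_k} p_i\sigma_{k,i}^2\Big) + \sum_{i,k\in U_k} p_i\sigma_{k,i}^2\,\eta\,\sigma_{LIR}^2\right]. \tag{4.19}
$$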

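We note that, unlike MRC/MRT processing, here we do not require $N \gg 2K$ and this expression is consequently valid for an arbitrary number of relay antennas. The sum-rate expression for ZFR/ZFT can be approximated as $\log_2\!\big(1 + \text{SINR}^{zf}_{k}\big)$, where $\text{SINR}^{zf}_{k}$ is given as

$$
\text{SINR}^{zf}_{k} = \frac{p_{k'}}{\displaystyle\frac{\sigma_{nr}^2 + P_R\,\sigma_{LIR}^2}{M\sigma_{g,k'}^2} + \sum_{i=1}^{2K}\frac{p_i P_R^{-1}}{M\sigma_{f,i'}^2}\Big(\sigma_n^2 + \sum_{i,k\in U_k} p_i\sigma_{k,i}^2\Big) + \eta\,\sigma_{LIR}^2\sigma_n^2 + \sum_{i,k\in U_k} p_i\sigma_{k,i}^2\,\eta\,\sigma_{LIR}^2}. \tag{4.20}
$$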

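We now state the KKT conditions of Problem P1.

$$
\frac{d}{dp_k}\left[\text{WSEE}(p_k) - \lambda\Big(\sum_{k=1}^{2K} p_k - P_u^{\max}\Big) + \sum_{k=1}^{2K} v_{k,1}\,(p_k - P^{\max}) - \sum_{k=1}^{2K} v_{k,2}\,p_k\right] = 0 \tag{4.21a}
$$

$$
0 \le p_k \le P^{\max},\quad k = 1, \cdots, 2K \tag{4.21b}
$$

$$
\sum_{k=1}^{2K} p_k \le P_u^{\max} \tag{4.21c}
$$

$$
\lambda \ge 0,\quad v_{k,1} \ge 0,\quad v_{k,2} \ge 0,\quad k = 1, \cdots, 2K \tag{4.21d}
$$

$$
v_{k,1}\,(p_k - P^{\max}) = 0,\quad v_{k,2}\,p_k = 0,\quad k = 1, \cdots, 2K \tag{4.21e}
$$

$$
\lambda\Big(\sum_{k=1}^{2K} p_k - P_u^{\max}\Big) = 0, \tag{4.21f}
$$

where $\lambda$, $\{v_{k,1}\}_{k=1}^{2K}$ and $\{v_{k,2}\}_{k=1}^{2K}$ are the Lagrange multipliers associated with the maximum transmit power constraint, the maximum user power constraint and the non-negativity of the users' power constraints, respectively.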


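To solve these KKT conditions, we first simplify (4.21a) to show that $p_k$ is the solution of a fixed-point equation. We now expand the first differential term in (4.21a) as follows

$$
\begin{aligned}
\frac{d}{dp_k}\text{WSEE}(p_k) &= \frac{d}{dp_k}\left[\sum_{j=1}^{2K} w_j\,\frac{\log_2\!\big(1 + \text{SINR}^{\zeta}_{j'}\big)}{P_{T,j}(p_j, P_R) + P_{c,j}}\right] \\
&= \frac{d}{dp_k}\left[w_k\,\frac{\log_2\!\big(1 + \text{SINR}^{\zeta}_{k'}\big)}{P_{T,k}(p_k, P_R) + P_{c,k}} + w_{k'}\,\frac{\log_2\!\big(1 + \text{SINR}^{\zeta}_{k}\big)}{P_{T,k'}(p_{k'}, P_R) + P_{c,k'}} + \sum_{j\neq k,k'} w_j\,\frac{\log_2\!\big(1 + \text{SINR}^{\zeta}_{j'}\big)}{P_{T,j}(p_j, P_R) + P_{c,j}}\right] \\
&= \underbrace{\frac{d}{dp_k}\left[w_k\,\frac{\log_2\!\big(1 + \text{SINR}^{\zeta}_{k'}\big)}{P_{T,k}(p_k, P_R) + P_{c,k}}\right]}_{\text{I}} + \underbrace{\frac{d}{dp_k}\left[w_{k'}\,\frac{\log_2\!\big(1 + \text{SINR}^{\zeta}_{k}\big)}{P_{T,k'}(p_{k'}, P_R) + P_{c,k'}}\right]}_{\text{II}} \\
&\quad + \underbrace{\frac{d}{dp_k}\left[\sum_{j\in U_k\setminus k} w_j\,\frac{\log_2\!\big(1 + \text{SINR}^{\zeta}_{j'}\big)}{P_{T,j}(p_j, P_R) + P_{c,j}}\right]}_{\text{III}} + \underbrace{\frac{d}{dp_k}\left[\sum_{j\in U_{k'}\setminus k'} w_j\,\frac{\log_2\!\big(1 + \text{SINR}^{\zeta}_{j'}\big)}{P_{T,j}(p_j, P_R) + P_{c,j}}\right]}_{\text{IV}},
\end{aligned}
\tag{4.22}
$$

where $\zeta \in \{mrc, zf\}$ denotes the type of processing used at the relay. We will simplify each of the terms in (4.22), considering the SINR expressions in (4.18) and (4.20) for the MRC/MRT and ZFR/ZFT processing, respectively.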

4.6.1 MRC/MRT processing

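For the MRC/MRT processing, the term-I in (4.22) is simplified using the SINR expression in (4.18), as

$$
\frac{d}{dp_k}\left[w_k\,\frac{\log_2\!\big(1 + \text{SINR}^{mrc}_{k'}\big)}{P_{T,k}(p_k, P_R) + P_{c,k}}\right] = -M_k + \frac{J_k}{1 + \text{SINR}^{mrc}_{k'}}\left[\frac{d}{dp_k}\text{SINR}^{mrc}_{k'}\right], \tag{4.23}
$$

where $M_k = \dfrac{w_k\big(1 - \frac{2K}{T}\big)\eta_U^{-1}\log_2\!\big(1 + \text{SINR}^{mrc}_{k'}\big)}{\big(P_{T,k}(p_k, P_R) + P_{c,k}\big)^2}$ and $J_k = \dfrac{w_k}{\ln 2\,\big(P_{T,k}(p_k, P_R) + P_{c,k}\big)}$.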

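To derive the exact expressions for the above terms, we first rewrite the denominator of $\text{SINR}^{mrc}_{k'}$ using (4.18) as

$$
D_{k'} = A_{k'} + B_{k'} + C_{k'} + E_{k'}, \tag{4.24}
$$

where $A_{k'} = \sum_{i\neq k'}^{2K}\sigma_{g,i}^2\big(\sigma_{g,i}^2\sigma_{f,i'}^2 + \sigma_{f,k'}^2\sigma_{g,k}^2\big)p_i$, $B_{k'} = P_R\,\sigma_{LIR}^2\sigma_{f,k'}^2\sigma_{g,k}^2 + \sigma_{nr}^2\sigma_{f,k'}^2\sigma_{g,k}^2$, $C_{k'} = \sum_{i=1}^{2K} p_i P_R^{-1}\sigma_n^2\,\dfrac{\sigma_{g,i}^4\sigma_{f,i'}^2}{\sigma_{f,k'}^2}$ and $E_{k'} = \dfrac{1}{P_R\,\sigma_{f,k'}^2}\sum_{i=1}^{2K} p_i\sigma_{f,i'}^2\sigma_{g,i}^4\sum_{i,k'\in U_{k'}} p_i\sigma_{k',i}^2$.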

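We further simplify the expressions of $A_{k'}$, $C_{k'}$ and $E_{k'}$ to express the differential in a tractable form.

$$
\begin{aligned}
A_{k'} &= \sum_{i\neq k'}^{2K}\sigma_{g,i}^2\big(\sigma_{g,i}^2\sigma_{f,i'}^2 + \sigma_{f,k'}^2\sigma_{g,k}^2\big)p_i = \sum_{i\neq k,k'}^{2K}\sigma_{g,i}^2\big(\sigma_{g,i}^2\sigma_{f,i'}^2 + \sigma_{f,k'}^2\sigma_{g,k}^2\big)p_i + 2\sigma_{g,k}^4\sigma_{f,k'}^2\,p_k \triangleq \tilde{A}_{k'} + \bar{A}_{k'}\,p_k, \\
C_{k'} &= \sum_{i\neq k}^{2K} p_i P_R^{-1}\sigma_n^2\,\frac{\sigma_{g,i}^4\sigma_{f,i'}^2}{\sigma_{f,k'}^2} + p_k P_R^{-1}\sigma_n^2\sigma_{g,k}^4 \triangleq \tilde{C}_{k'} + p_k\,\bar{C}_{k'}, \\
E_{k'} &= \frac{1}{P_R\,\sigma_{f,k'}^2}\left[\sum_{i\neq k}^{2K} p_i\sigma_{f,i'}^2\sigma_{g,i}^4 + p_k\,\sigma_{f,k'}^2\sigma_{g,k}^4\right]\left[\sum_{i,k'\in U_{k'}} p_i\sigma_{k',i}^2\right] \\
&= \underbrace{\frac{1}{P_R\,\sigma_{f,k'}^2}\left[\sigma_{f,k'}^2\sigma_{g,k}^4\sum_{i,k'\in U_{k'}} p_i\sigma_{k',i}^2\right]p_k}_{\delta_{k'}p_k} + \underbrace{\frac{1}{P_R\,\sigma_{f,k'}^2}\left[\sum_{i\neq k}^{2K} p_i\sigma_{f,i'}^2\sigma_{g,i}^4\right]\left[\sum_{i,k'\in U_{k'}} p_i\sigma_{k',i}^2\right]}_{\epsilon_{k'}}.
\end{aligned}
\tag{4.25}
$$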

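Using (4.18), (4.24) and (4.25), we have the following expression for (4.23).

$$
\begin{aligned}
\frac{d}{dp_k}\left[w_k\,\frac{\log_2\!\big(1 + \text{SINR}^{mrc}_{k'}\big)}{P_{T,k}(p_k, P_R) + P_{c,k}}\right] &= -M_k + \frac{J_k}{1 + \text{SINR}^{mrc}_{k'}}\,\frac{d}{dp_k}\left[\frac{I_{k'}\,p_k}{D_{k'}}\right] \\
&= -M_k + \frac{J_k}{1 + \text{SINR}^{mrc}_{k'}}\left[\frac{I_{k'}}{D_{k'}} - \frac{I_{k'}\,p_k\,(\bar{A}_{k'} + \bar{C}_{k'} + \delta_{k'})}{D_{k'}^2}\right] \\
&= -M_k + \frac{J_k I_{k'}\,(\tilde{A}_{k'} + B_{k'} + \tilde{C}_{k'} + \epsilon_{k'})}{D_{k'}\,(D_{k'} + I_{k'}\,p_k)},
\end{aligned}
\tag{4.26}
$$

where $I_{k'} = N\sigma_{f,k'}^2\sigma_{g,k}^4$.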
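The last step of (4.26) relies on the fact that, when the denominator is affine in $p_k$, the derivative of the rate collapses to a single fraction. The following Python sketch checks this identity against a finite difference. It is an illustration under simplifying assumptions: the weight and the power-consumption normalization are omitted, and `gain`, `const` and `slope` are hypothetical stand-ins for $I_{k'}$, $\tilde{A}_{k'} + B_{k'} + \tilde{C}_{k'} + \epsilon_{k'}$ and $\bar{A}_{k'} + \bar{C}_{k'} + \delta_{k'}$, respectively.

```python
import math


def sinr(p, gain, const, slope):
    """SINR = gain*p / D(p) with an affine denominator D(p) = const + slope*p."""
    return gain * p / (const + slope * p)


def rate_derivative_closed_form(p, gain, const, slope):
    """d/dp log2(1 + SINR) written in the single-fraction form of (4.26)."""
    d = const + slope * p
    return gain * const / (math.log(2.0) * d * (d + gain * p))


def rate_derivative_numeric(p, gain, const, slope, h=1e-6):
    """Central finite-difference approximation of the same derivative."""
    def rate(x):
        return math.log2(1.0 + sinr(x, gain, const, slope))
    return (rate(p + h) - rate(p - h)) / (2.0 * h)


if __name__ == "__main__":
    # Hypothetical values; any positive numbers should show the same agreement.
    args = (0.5, 64.0, 3.0, 1.2)
    print(rate_derivative_closed_form(*args), rate_derivative_numeric(*args))
```

The same identity underlies terms II to IV, with the only difference that the numerator there does not depend on $p_k$, so only the negative part of the quotient rule survives.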

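We now consider the term-II of (4.22). Since we need to differentiate $\text{SINR}^{mrc}_{k}$, we first simplify its denominator using (4.18) and (4.24) as follows

$$
D_k = A_k + B_k + C_k + \alpha_k p_k^2 + \beta_k p_k + \gamma_k, \quad\text{where} \tag{4.27}
$$

$A_k = \sum_{i\neq k}^{2K}\sigma_{g,i}^2\big(\sigma_{g,i}^2\sigma_{f,i'}^2 + \sigma_{f,k}^2\sigma_{g,k'}^2\big)p_i$,

$C_k = \sum_{i\neq k}^{2K} p_i P_R^{-1}\sigma_n^2\,\dfrac{\sigma_{g,i}^4\sigma_{f,i'}^2}{\sigma_{f,k}^2} + p_k P_R^{-1}\sigma_n^2\,\dfrac{\sigma_{g,k}^4\sigma_{f,k'}^2}{\sigma_{f,k}^2} \triangleq \tilde{C}_k + p_k\,\bar{C}_k$,

$\alpha_k = \dfrac{1}{P_R\,\sigma_{f,k}^2}\Big[\sigma_{f,k'}^2\sigma_{g,k}^4\sigma_{k,k}^2\Big]$,

$\beta_k = \dfrac{1}{P_R\,\sigma_{f,k}^2}\Big[\sigma_{k,k}^2\sum_{i\neq k}^{2K} p_i\sigma_{f,i'}^2\sigma_{g,i}^4 + \sigma_{f,k'}^2\sigma_{g,k}^4\sum_{i,k\in U_k,\, i\neq k} p_i\sigma_{k,i}^2\Big]$

and $\gamma_k = \dfrac{1}{P_R\,\sigma_{f,k}^2}\sum_{i\neq k}^{2K} p_i\sigma_{f,i'}^2\sigma_{g,i}^4\Big[\sum_{i,k\in U_k,\, i\neq k} p_i\sigma_{k,i}^2\Big]$.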

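We now calculate the differential in the term-II in (4.22) using (4.18) and (4.27).

$$
\frac{d}{dp_k}\left[w_{k'}\,\frac{\log_2\!\big(1 + \text{SINR}^{mrc}_{k}\big)}{P_{T,k'}(p_{k'}, P_R) + P_{c,k'}}\right] = -\frac{J_{k'} I_k\,p_{k'}\,(\bar{C}_k + 2\alpha_k p_k + \beta_k)}{D_k\,(D_k + I_k\,p_{k'})} \triangleq -Q_k. \tag{4.28}
$$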

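Similarly, to calculate the differential in the term-III of (4.22), we use (4.18) and (4.24) to express the denominator of $\text{SINR}^{mrc}_{j'}$ for $j \in U_k \setminus k$ as follows.

$$
D_{j'} = \tilde{A}_{j'} + \bar{A}_{j'}\,p_k + B_{j'} + \tilde{C}_{j'} + p_k\,\bar{C}_{j'} + \delta_{j'}\,p_k + \epsilon_{j'}, \tag{4.29}
$$

where $\tilde{A}_{j'} = \sum_{i\neq k,k'}^{2K}\sigma_{g,i}^2\big(\sigma_{g,i}^2\sigma_{f,i'}^2 + \sigma_{f,j'}^2\sigma_{g,j}^2\big)p_i$, $\bar{A}_{j'} = \sigma_{g,k}^2\big(\sigma_{g,k}^2\sigma_{f,k'}^2 + \sigma_{f,j'}^2\sigma_{g,j}^2\big)$, $\bar{C}_{j'} = P_R^{-1}\sigma_n^2\,\dfrac{\sigma_{g,k}^4\sigma_{f,k'}^2}{\sigma_{f,j'}^2}$, $\tilde{C}_{j'} = \sum_{i\neq k}^{2K} p_i P_R^{-1}\sigma_n^2\,\dfrac{\sigma_{g,i}^4\sigma_{f,i'}^2}{\sigma_{f,j'}^2}$, $\delta_{j'} = \dfrac{1}{P_R\,\sigma_{f,j'}^2}\,\sigma_{f,k'}^2\sigma_{g,k}^4\sum_{i,j'\in U_{j'}} p_i\sigma_{j',i}^2$ and $\epsilon_{j'} = \dfrac{1}{P_R\,\sigma_{f,j'}^2}\sum_{i\neq k}^{2K} p_i\sigma_{f,i'}^2\sigma_{g,i}^4\sum_{i,j'\in U_{j'}} p_i\sigma_{j',i}^2$.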

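The differential in the term-III in (4.22) is next calculated using (4.18) and (4.29).

$$
\begin{aligned}
\frac{d}{dp_k}\left[\sum_{j\in U_k\setminus k} w_j\,\frac{\log_2\!\big(1 + \text{SINR}^{mrc}_{j'}\big)}{P_{T,j}(p_j, P_R) + P_{c,j}}\right] &= \sum_{j\in U_k\setminus k}\frac{J_j}{1 + \text{SINR}^{mrc}_{j'}}\,\frac{d}{dp_k}\left[\frac{I_{j'}\,p_j}{D_{j'}}\right] \\
&= \sum_{j\in U_k\setminus k}\frac{J_j}{1 + \text{SINR}^{mrc}_{j'}}\left[\frac{-I_{j'}\,p_j\,(\bar{A}_{j'} + \bar{C}_{j'} + \delta_{j'})}{D_{j'}^2}\right] \\
&= -\sum_{j\in U_k\setminus k}\frac{J_j I_{j'}\,p_j\,(\bar{A}_{j'} + \bar{C}_{j'} + \delta_{j'})}{D_{j'}\,(D_{j'} + I_{j'}\,p_j)} \triangleq -S_k.
\end{aligned}
\tag{4.30}
$$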

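We now consider the term-IV of (4.22). Using (4.18) and (4.24), the denominator of $\text{SINR}^{mrc}_{j'}$ for $j \in U_{k'} \setminus k'$ is given as

$$
D_{j'} = \tilde{A}_{j'} + \bar{A}_{j'}\,p_k + B_{j'} + \tilde{C}_{j'} + p_k\,\bar{C}_{j'} + \alpha_{j'}\,p_k^2 + \beta_{j'}\,p_k + \gamma_{j'}, \tag{4.31}
$$

where $\alpha_{j'} = \dfrac{1}{P_R\,\sigma_{f,j'}^2}\,\sigma_{f,k'}^2\sigma_{g,k}^4\sigma_{j',k}^2$, $\beta_{j'} = \dfrac{1}{P_R\,\sigma_{f,j'}^2}\Big[\sigma_{j',k}^2\sum_{i\neq k}^{2K} p_i\sigma_{f,i'}^2\sigma_{g,i}^4 + \sigma_{f,k'}^2\sigma_{g,k}^4\sum_{i,j'\in U_{j'},\, i\neq k} p_i\sigma_{j',i}^2\Big]$ and $\gamma_{j'} = \dfrac{1}{P_R\,\sigma_{f,j'}^2}\Big[\sum_{i\neq k}^{2K} p_i\sigma_{f,i'}^2\sigma_{g,i}^4\Big]\Big[\sum_{i,j'\in U_{j'},\, i\neq k} p_i\sigma_{j',i}^2\Big]$.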

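The differential in the term-IV of (4.22) is now calculated using (4.18) and (4.31) as follows.

$$
\frac{d}{dp_k}\left[\sum_{j\in U_{k'}\setminus k'} w_j\,\frac{\log_2\!\big(1 + \text{SINR}^{mrc}_{j'}\big)}{P_{T,j}(p_j, P_R) + P_{c,j}}\right] = -\sum_{j\in U_{k'}\setminus k'}\frac{J_j I_{j'}\,p_j\,(\bar{A}_{j'} + \bar{C}_{j'} + 2\alpha_{j'}p_k + \beta_{j'})}{D_{j'}\,(D_{j'} + I_{j'}\,p_j)} \triangleq -L_k. \tag{4.32}
$$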

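By substituting (4.26), (4.28), (4.30) and (4.32) in (4.22), we have

$$
\begin{aligned}
&-M_k + \frac{J_k I_{k'}\,(\tilde{A}_{k'} + B_{k'} + \tilde{C}_{k'} + \epsilon_{k'})}{D_{k'}\,(D_{k'} + I_{k'}\,p_k)} - Q_k - L_k - S_k - \lambda = 0 \\
&\implies p_k = \left[\frac{J_k\,(\tilde{A}_{k'} + B_{k'} + \tilde{C}_{k'} + \epsilon_{k'})}{D_{k'}\,(\lambda + M_k + Q_k + L_k + S_k)} - \frac{D_{k'}}{I_{k'}}\right].
\end{aligned}
\tag{4.33}
$$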

Based on the above derived expressions of $\tilde{A}_{k'}, B_{k'}, \tilde{C}_{k'}, D_{k'}, \epsilon_{k'}, I_{k'}, J_k, M_k, Q_k, L_k, S_k$, we now iteratively solve the KKT conditions in (4.21) based on the modified waterfilling algorithm proposed in [28], [24, pp. 270-273]. To this end, we assume that some initial feasible $p_k$ values are known. We then compute $\tilde{A}_{k'}, B_{k'}, \tilde{C}_{k'}, D_{k'}, \epsilon_{k'}, I_{k'}, J_k, M_k, Q_k, L_k, S_k$ for $k = 1, \cdots, 2K$, where $(k, k')$ denotes the user-pair exchanging information. We now have to update the transmit power $p_k$, $\forall k$. For this, the Lagrange multipliers must be chosen such that they satisfy the power constraints in (4.21d) and the corresponding complementary slackness conditions in (4.21e) and (4.21f), which results in the following waterfilling problem [24, 28]:

$$
p_k = \left[\frac{J_k\,(\tilde{A}_{k'} + B_{k'} + \tilde{C}_{k'} + \epsilon_{k'})}{D_{k'}\,(\lambda + M_k + Q_k + L_k + S_k)} - \frac{D_{k'}}{I_{k'}}\right]^{+} \quad\text{subject to}\quad \sum_{k=1}^{2K} p_k \le P_u^{\max}. \tag{4.34}
$$

Here $x^{+} = \max[x, 0]$. We then iteratively update the transmit powers $p_k$ by solving the waterfilling problem using a bisection search on the Lagrange multiplier $\lambda$ [24, 28], as in Algorithm 3.

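Algorithm 3: WSEE maximization algorithm for MRC/MRT processing.

Input: Given the maximum number of iterations $L$ and a tolerance $\epsilon > 0$. Calculate the initial $p_k$ by allocating equal power to all users, i.e., $p_k^{[0]} = P_t^{\max}/4K$ and $P_R = P_t^{\max}/2$.

Output: $\{p_k^{*}\}_{k=1}^{2K}$ as the solutions.

1 for $m \leftarrow 1$ to $L$ do

2 Given a feasible $p_k$, compute $\{\tilde{A}_{k'}, B_{k'}, \tilde{C}_{k'}, D_{k'}, \epsilon_{k'}, I_{k'}, J_k, M_k, Q_k, L_k, S_k\}_{k=1}^{2K}$, using the expressions derived in Section 4.6.1.

3 Compute $\{p_k^{[m]}\}_{k=1}^{2K}$, using (4.34).

4 Do until convergence: if $\|\mathbf{p}^{[m]} - \mathbf{p}^{[m-1]}\| < \epsilon$ then break.

5 return $p_k^{*}$.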
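Step 3 of the algorithm above amounts to a one-dimensional search for the multiplier $\lambda$. The following Python sketch illustrates such a bisection search. It is not the exact routine of [24, 28]: the function name `waterfill` and the coefficients `a`, `b`, `c` are illustrative, standing in for $J_k(\tilde{A}_{k'} + B_{k'} + \tilde{C}_{k'} + \epsilon_{k'})/D_{k'}$, $M_k + Q_k + L_k + S_k$ and $D_{k'}/I_{k'}$ in (4.34). These are treated as fixed positive constants within one outer iteration, which is an assumption of this sketch, and the per-user cap $P^{\max}$ is enforced by simple clipping.

```python
def waterfill(a, b, c, p_max, p_total, tol=1e-9, max_iter=200):
    """Bisection on lam >= 0 for p_k = clip(a_k/(lam + b_k) - c_k, 0, p_max).

    Returns (lam, powers) with sum(powers) <= p_total. Assumes a_k, b_k > 0,
    c_k >= 0 and p_total > 0, so that the total power decreases to zero in lam.
    """
    def powers(lam):
        return [min(max(ak / (lam + bk) - ck, 0.0), p_max)
                for ak, bk, ck in zip(a, b, c)]

    p = powers(0.0)
    if sum(p) <= p_total:
        # Sum-power constraint inactive: complementary slackness gives lam = 0.
        return 0.0, p

    lo, hi = 0.0, 1.0
    while sum(powers(hi)) > p_total:
        hi *= 2.0  # grow the bracket until the upper end is feasible
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if sum(powers(mid)) > p_total:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return hi, powers(hi)  # hi is always a feasible multiplier


if __name__ == "__main__":
    # Hypothetical coefficients for three users.
    lam, p = waterfill([4.0, 2.0, 1.0], [1.0, 1.0, 1.0], [0.5, 0.5, 0.5],
                       p_max=10.0, p_total=1.0)
    print(lam, p)
```

In the full algorithm the coefficients depend on the current powers, so this search is repeated in every outer iteration after recomputing them.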

4.6.2 ZFR/ZFT processing

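Similar to the MRC/MRT processing, we first consider the term-I in (4.22).

$$
\frac{d}{dp_k}\left[w_k\,\frac{\log_2\!\big(1 + \text{SINR}^{zf}_{k'}\big)}{P_{T,k}(p_k, P_R) + P_{c,k}}\right] = -M_k + \frac{J_k}{1 + \text{SINR}^{zf}_{k'}}\left[\frac{d}{dp_k}\text{SINR}^{zf}_{k'}\right], \tag{4.35}
$$

where $M_k$ and $J_k$ are defined as in (4.23). To further simplify (4.35), we rewrite the denominator of $\text{SINR}^{zf}_{k'}$ using (4.20) as

$$
D_{k'} = U_{k'} + Z_{k'} + W_{k'} + \rho_{k'}\,p_k + \theta_{k'}, \tag{4.36}
$$

where $U_{k'} = \dfrac{\sigma_{nr}^2 + P_R\,\sigma_{LIR}^2}{M\sigma_{g,k}^2} + \eta\,\sigma_{LIR}^2\sigma_n^2$, $\rho_{k'} = \dfrac{P_R^{-1}}{M\sigma_{f,k'}^2}\sum_{i,k'\in U_{k'}} p_i\sigma_{k',i}^2$, $\theta_{k'} = \dfrac{P_R^{-1}}{M}\sum_{i\neq k}\dfrac{p_i}{\sigma_{f,i'}^2}\sum_{i,k'\in U_{k'}} p_i\sigma_{k',i}^2$, $Z_{k'} = \sum_{i\neq k}^{2K}\dfrac{p_i P_R^{-1}\sigma_n^2}{M\sigma_{f,i'}^2} + \dfrac{p_k P_R^{-1}\sigma_n^2}{M\sigma_{f,k'}^2} \triangleq \tilde{Z}_{k'} + p_k\,\bar{Z}_{k'}$ and $W_{k'} = \sum_{i,k'\in U_{k'}}\eta\,\sigma_{k',i}^2\sigma_{LIR}^2\,p_i$.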


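We continue simplifying (4.35) using (4.20) and (4.36):

$$
\frac{d}{dp_k}\left[w_k\,\frac{\log_2\!\big(1 + \text{SINR}^{zf}_{k'}\big)}{P_{T,k}(p_k, P_R) + P_{c,k}}\right] = -M_k + \frac{J_k\,(U_{k'} + \tilde{Z}_{k'} + W_{k'} + \theta_{k'})}{D_{k'}\,(D_{k'} + p_k)}. \tag{4.37}
$$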

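We now consider the term-II of (4.22). To simplify the differential, we rewrite the denominator of $\text{SINR}^{zf}_{k}$ from (4.20) as

$$
D_k = U_k + Z_k + V_k + W_k. \tag{4.38}
$$

Here $U_k = \dfrac{\sigma_{nr}^2 + P_R\,\sigma_{LIR}^2}{M\sigma_{g,k'}^2} + \eta\,\sigma_{LIR}^2\sigma_n^2$, $V_k = \dfrac{P_R^{-1}}{M}\sum_{i=1}^{2K}\dfrac{p_i}{\sigma_{f,i'}^2}\sum_{i,k\in U_k} p_i\sigma_{k,i}^2$ and $W_k = \sum_{i,k\in U_k}\eta\,\sigma_{LIR}^2\sigma_{k,i}^2\,p_i$. The terms $W_k$, $Z_k$ and $V_k$ are next split as follows.

$$
\begin{aligned}
W_k &= \sum_{i,k\in U_k,\, i\neq k}\eta\,\sigma_{LIR}^2\sigma_{k,i}^2\,p_i + \sigma_{LIR}^2\sigma_{k,k}^2\,\eta\,p_k \triangleq \epsilon_k + \delta_k\,p_k, \\
Z_k &= \sum_{i\neq k}^{2K}\frac{p_i P_R^{-1}\sigma_n^2}{M\sigma_{f,i'}^2} + \frac{p_k P_R^{-1}\sigma_n^2}{M\sigma_{f,k'}^2} \triangleq \tilde{Z}_k + p_k\,\bar{Z}_k, \\
V_k &= \underbrace{\frac{p_k}{P_R M}\left[\frac{1}{\sigma_{f,k'}^2}\sum_{i,k\in U_k,\, i\neq k} p_i\sigma_{k,i}^2 + \sigma_{k,k}^2\sum_{i\neq k}\frac{p_i}{\sigma_{f,i'}^2}\right]}_{\beta_k p_k} + \underbrace{\frac{\sigma_{k,k}^2}{P_R M\sigma_{f,k'}^2}\,p_k^2}_{\alpha_k p_k^2} + \underbrace{\frac{1}{P_R M}\left[\sum_{i\neq k}^{2K}\frac{p_i}{\sigma_{f,i'}^2}\right]\left[\sum_{i\in U_k,\, i\neq k} p_i\sigma_{k,i}^2\right]}_{\gamma_k}.
\end{aligned}
$$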

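Using (4.20) and (4.38), the differential in the term-II of (4.22) is given as follows.

$$
\frac{d}{dp_k}\left[w_{k'}\,\frac{\log_2\!\big(1 + \text{SINR}^{zf}_{k}\big)}{P_{T,k'}(p_{k'}, P_R) + P_{c,k'}}\right] = -\frac{J_{k'}\,p_{k'}\,(\bar{Z}_k + 2\alpha_k p_k + \beta_k + \delta_k)}{D_k\,(D_k + p_{k'})} \triangleq -Q_k. \tag{4.39}
$$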

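Similarly, to simplify the differential in term-III, the denominator of $\text{SINR}^{zf}_{j'}$ in term-III of (4.22) for $j \in U_k \setminus k$ is expressed using (4.36) as

$$
D_{j'} = U_{j'} + \tilde{Z}_{j'} + p_k\,\bar{Z}_{j'} + W_{j'} + \rho_{j'}\,p_k + \theta_{j'}, \tag{4.40}
$$

where $\rho_{j'} = \dfrac{P_R^{-1}}{M\sigma_{f,k'}^2}\sum_{i,j'\in U_{j'}} p_i\sigma_{j',i}^2$, $\theta_{j'} = \dfrac{P_R^{-1}}{M}\sum_{i\neq k}\dfrac{p_i}{\sigma_{f,i'}^2}\sum_{i,j'\in U_{j'}} p_i\sigma_{j',i}^2$ and $W_{j'} = \sum_{i,j'\in U_{j'}}\eta\,\sigma_{LIR}^2\sigma_{j',i}^2\,p_i$. We next calculate the differential in the term-III of (4.22) using (4.20) and (4.40).

$$
\frac{d}{dp_k}\sum_{j\in U_k\setminus k} w_j\,\frac{\log_2\!\big(1 + \text{SINR}^{zf}_{j'}\big)}{P_{T,j}(p_j, P_R) + P_{c,j}} = -\sum_{j\in U_k\setminus k}\frac{J_j\,p_j\,(\rho_{j'} + \bar{Z}_{j'})}{D_{j'}\,(D_{j'} + p_j)} \triangleq -S_k. \tag{4.41}
$$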

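We now evaluate the differential in the term-IV of (4.22). The denominator of $\text{SINR}^{zf}_{j'}$ for $j \in U_{k'} \setminus k'$, as defined in (4.36), is written as

$$
D_{j'} = U_{j'} + \tilde{Z}_{j'} + p_k\,\bar{Z}_{j'} + \alpha_{j'}\,p_k^2 + \beta_{j'}\,p_k + \gamma_{j'} + \delta_{j'}\,p_k + \epsilon_{j'}, \tag{4.42}
$$

where $\alpha_{j'} = \dfrac{P_R^{-1}}{M}\dfrac{\sigma_{j',k}^2}{\sigma_{f,k'}^2}$, $\beta_{j'} = \dfrac{P_R^{-1}}{M}\Big[\sum_{i,j'\in U_{j'},\, i\neq k}\dfrac{p_i\sigma_{j',i}^2}{\sigma_{f,k'}^2} + \sigma_{j',k}^2\sum_{i\neq k}\dfrac{p_i}{\sigma_{f,i'}^2}\Big]$, $\gamma_{j'} = \dfrac{P_R^{-1}}{M}\sum_{i\neq k}\dfrac{p_i}{\sigma_{f,i'}^2}\sum_{i,j'\in U_{j'},\, i\neq k} p_i\sigma_{j',i}^2$, $\delta_{j'} = \eta\,\sigma_{LIR}^2\sigma_{j',k}^2$ and $\epsilon_{j'} = \eta\,\sigma_{LIR}^2\sum_{i,j'\in U_{j'},\, i\neq k}\sigma_{j',i}^2\,p_i$. Using (4.20) and (4.42), the differential in the term-IV of (4.22) is simplified as

$$
\frac{d}{dp_k}\sum_{j\in U_{k'}\setminus k'} w_j\,\frac{\log_2\!\big(1 + \text{SINR}^{zf}_{j'}\big)}{P_{T,j}(p_j, P_R) + P_{c,j}} = -\sum_{j\in U_{k'}\setminus k'}\frac{J_j\,p_j\,(\bar{Z}_{j'} + 2\alpha_{j'}p_k + \beta_{j'} + \delta_{j'})}{D_{j'}\,(D_{j'} + p_j)} \triangleq -L_k. \tag{4.43}
$$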

By substituting all the terms from (4.37), (4.39), (4.41) and (4.43) in (4.22), we have

$$
p_k = \left[\frac{J_k\,(U_{k'} + \tilde{Z}_{k'} + W_{k'} + \theta_{k'})}{D_{k'}\,(\lambda + M_k + Q_k + L_k + S_k)} - D_{k'}\right]. \tag{4.44}
$$

Similar to the MRC/MRT processing, we provide an iterative procedure to solve the KKT conditions in (4.21). We assume that we have initial $p_k$ values and then calculate $D_{k'}, J_k, M_k, Q_k, L_k, S_k, U_{k'}, \tilde{Z}_{k'}, W_{k'}, \theta_{k'}$ for $k = 1, \cdots, 2K$. We then solve the following waterfilling problem to iteratively update the $p_k$ values $\forall k$.

$$
p_k = \left[\frac{J_k\,(U_{k'} + \tilde{Z}_{k'} + W_{k'} + \theta_{k'})}{D_{k'}\,(\lambda + M_k + Q_k + L_k + S_k)} - D_{k'}\right]^{+} \quad\text{subject to}\quad \sum_{k=1}^{2K} p_k \le P_u^{\max}. \tag{4.45}
$$

Here $x^{+} = \max[x, 0]$. We then iteratively update the transmit powers $p_k$ by solving the waterfilling problem using a bisection search on the Lagrange multiplier $\lambda$ [24, 28], as in Algorithm 4.

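Algorithm 4: WSEE maximization algorithm for ZFR/ZFT processing.

Input: Given the maximum number of iterations $L$ and a tolerance $\epsilon > 0$. Calculate the initial $p_k$ by allocating equal power to all users, i.e., $p_k^{[0]} = P_t^{\max}/4K$ and $P_R = P_t^{\max}/2$.

Output: $\{p_k^{*}\}_{k=1}^{2K}$ as the solutions.

1 for $m \leftarrow 1$ to $L$ do

2 Given a feasible $p_k$, compute $\{D_{k'}, J_k, M_k, Q_k, L_k, S_k, U_{k'}, \tilde{Z}_{k'}, W_{k'}, \theta_{k'}\}_{k=1}^{2K}$, using the expressions derived in Section 4.6.2.

3 Compute $\{p_k^{[m]}\}_{k=1}^{2K}$, using (4.45).

4 Do until convergence: if $\|\mathbf{p}^{[m]} - \mathbf{p}^{[m-1]}\| < \epsilon$ then break.

5 return $p_k^{*}$.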

Remark 9. We note that for a non-convex problem with non-affine constraints, the con-

straint qualification e.g., linear independence constraint qualification (LICQ) or Mangasarian-

Fromovitz constraint qualification (MFCQ) should hold to ensure that KKT conditions are

necessary [73, pg. 244, ch. 5], [74, pg. 328, ch. 12]. For our problem with a non-convex ob-

jective and only affine constraints, we note that no constraint qualification is required, and

the KKT conditions are always necessary [73, pg. 240, ch. 5], [74, pg. 338, ch. 12].

4.7 Computational complexity of the algorithms

• Algorithm 2: i) computing the scalars $\{y_k, z_k\}_{k=1}^{2K}$ in step-2; ii) solving problem P4 in step-3.

• Algorithm 3: i) computing the scalars $\{\tilde{A}_{k'}, B_{k'}, \tilde{C}_{k'}, D_{k'}, \epsilon_{k'}, I_{k'}, J_k, M_k, Q_k, L_k, S_k\}_{k=1}^{2K}$ in step-2; ii) computing $\{p_k\}_{k=1}^{2K}$ using the waterfilling problem in step-3.

We see that the overall complexity of Algorithm 2 and Algorithm 3 is dominated by the complexity of solving Problem P4 and the waterfilling problem, respectively; the other steps have trivial computational complexity. The problem P4 at each iteration includes $(6K + 1)$ real variables and $(2K + 2)$ linear constraints. Its per-iteration worst-case computational complexity using an interior point method is therefore $O\big((8K+3)^{3/2}(6K+1)^2\big)$ [75, Chapter 6]. The complexity of solving the waterfilling problem with $2K$ variables is $O(4K^2 + 12K)$ [76].
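To get a feel for the gap between the two order expressions, they can be evaluated for a few values of $K$. The short Python sketch below is only an order-of-magnitude illustration: it drops the constants hidden in the $O(\cdot)$ notation, so the numbers are not runtimes, and the function names are illustrative.

```python
def qt_cost(num_pairs):
    """Order expression (8K+3)^{3/2} (6K+1)^2 for one interior-point solve of P4."""
    return (8 * num_pairs + 3) ** 1.5 * (6 * num_pairs + 1) ** 2


def kkt_cost(num_pairs):
    """Order expression 4K^2 + 12K for one waterfilling update."""
    return 4 * num_pairs ** 2 + 12 * num_pairs


if __name__ == "__main__":
    for num_pairs in (3, 6, 12):
        ratio = qt_cost(num_pairs) / kkt_cost(num_pairs)
        print(num_pairs, qt_cost(num_pairs), kkt_cost(num_pairs), ratio)
```

The ratio grows with $K$, which is in line with the motivation for the closed-form KKT approach.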


We now numerically investigate the WSEE achieved by the proposed algorithms using Monte-Carlo simulations. The weights $w_k$, for $k = 1, \cdots, 2K$, which satisfy $\sum_{k=1}^{2K} w_k = 1$, are randomly chosen as [0.13, 0.20, 0.24, 0.16, 0.17, 0.1]. We set the maximum transmit power of each user as $P^{\max} = 47$ dBm and the maximum relay transmit power as $P_R^{\max} = 60$ dBm, and the circuit powers $P_{c,k}$ for $k = 1, \cdots, 2K$ are modeled according to the practical power consumption model given in Section 3.4. The system configuration remains the same as in Section 3.8, unless otherwise specified.

4.8.1 WSEE maximization

We first investigate the WSEE of both MRC/MRT and ZFR/ZFT processing using the QT-based Algorithm 2. We will later show that the closed-form KKT-based Algorithm 3 yields the same WSEE as Algorithm 2. We compare the performance of Algorithm 2 with i) the baseline equal-power allocation (EPA) scheme, which divides the total power equally amongst the 2K users and the relay, i.e., $p_k = P_u^{\max}/4K$ and $P_R = P_u^{\max}/2$; and ii) the SEmax design in [23], which optimizes the SE for the current system. For this study, we use the power optimization variables obtained by maximizing the SE to plot the WSEE.

Fig. 4.27: WSEE comparison of MRC/MRT and ZFR/ZFT versus $P_u^{\max}$ for N = 64, K = 3 and $P_\rho$ = 20 dBm. [Plot: WSEE [Mbits/J] versus $P_u^{\max}$ [dBm]; curves WSEEmax, SEmax [5] and EPA for MRC/MRT and ZFR/ZFT.]

We plot in Fig. 4.27 the WSEE versus the total system transmit power $P_u^{\max}$. We consider K = 3 user-pairs, N = 64 relay antennas and a pilot power of $P_\rho$ = 20 dBm. We see that Algorithm 2 has a higher WSEE than the EPA and the SEmax design in [23]. We observe that with Algorithm 2, the WSEE increases till $P_u^{\max}$ = 20 dBm, and remains constant after that. This is because with $P_u^{\max}$ = 20 dBm, the system has already attained the maximum WSEE, and using any further power would only decrease the achieved WSEE. This is clearly visible for the EPA and the SEmax designs, wherein the system keeps using the available power for $P_u^{\max}$ > 20 dBm, and the WSEE degrades. We observe that MRC/MRT has a marginally higher WSEE than ZFR/ZFT for $P_u^{\max}$ < 0 dBm. This is because the ZFR/ZFT processing focuses on canceling the co-channel interference without considering noise, which degrades its performance at low $P_u^{\max}$ values.


Fig. 4.28: WSEE comparison of MRC/MRT and ZFR/ZFT versus N for $P_u^{\max}$ = 20 dBm, K = 3 and $P_\rho$ = 20 dBm. [Plot: WSEE [Mbits/J] versus N; curves WSEEmax and EPA for MRC/MRT and ZFR/ZFT.]

We now compare in Fig. 4.28 the WSEE versus N for both MRC/MRT and ZFR/ZFT processing. For this study, we set K = 3 user-pairs, $P_\rho$ = 20 dBm and $P_u^{\max}$ = 20 dBm. We observe that for the MRC/MRT (ZFR/ZFT) processing, the WSEE increases till N = 70 (N = 60) and reduces after that. This is because for this antenna range, the increase in the SE of each user, in the numerator of the WSEE expression, dominates the consumed power, in the denominator of the WSEE expression. For N > 70 for MRC/MRT (N > 60 for ZFR/ZFT) processing, the WSEE reduces. This is because, as seen from (3.2), the increase in the number of transceiver chains increases the total power consumed. The increase in SE is, therefore, not proportional to the power expended to achieve that increase, which reduces the WSEE. This study tells us that a massive antenna array with these N values maximizes the WSEE for our system.


Fig. 4.29: EE of each user versus N for the weights Ω1: w1 = 0.25, w2 = 0.25, w3 = 0.25, w4 = 0.25, with Du: 0.749, 0.445, 0.346, 0.311, D: 0.649, 0.545, 0.231, 0.145, K = 2 and $P_u^{\max}$ = 20 dBm. [Plot: EE_k [Mbits/J] versus N; curves User-1 to User-4.]

4.8.2 Influence of weights

We now numerically illustrate the fact that the WSEE can prioritize the EE of a particular user. For this study, we consider a system with K = 2 user-pairs and MRC/MRT processing, with large-scale fading matrices $D_u$ = diag{0.75, 0.445, 0.346, 0.31} and $D_d$ = diag{0.65, 0.545, 0.23, 0.145}, and fix $P_u^{\max}$ = 20 dBm. We choose the values of $D_u$ and $D_d$ such that one user-pair has a better channel than the other one. We plot in Fig. 4.29 to Fig. 4.31 the EE of individual users versus N.

We first choose equal weights Ω1: w1 = 0.25, w2 = 0.25, w3 = 0.25, w4 = 0.25 and observe from Fig. 4.29 that by assigning equal weights to all the users, we achieve EE in the order of the users' channel strength, i.e., EE_user-1 > EE_user-2 > EE_user-3 > EE_user-4. We now choose weights that prioritize the weaker users, i.e., User-3 and User-4, and assign the following weights Ω2: w1 = 0.10, w2 = 0.10, w3 = 0.4, w4 = 0.4. We observe from Fig. 4.30 that by assigning higher weights to the weaker users, we achieve EE_user-4 > EE_user-3 > EE_user-1 > EE_user-2. We next choose weights that prioritize the stronger users, i.e., User-1 and User-2, and assign the following weights Ω3: w1 = 0.4, w2 = 0.4, w3 = 0.1, w4 = 0.1. We see from Fig. 4.31 that the gap between the EE of the stronger and weaker users widens, i.e., EE_user-1 > EE_user-2 ≫ EE_user-3 > EE_user-4. The EE priorities of different users can thus be achieved by selecting appropriate weights.

Fig. 4.30: EE of each user versus N for the weights Ω2: w1 = 0.1, w2 = 0.1, w3 = 0.4, w4 = 0.4, with Du: 0.749, 0.445, 0.346, 0.311, D: 0.649, 0.545, 0.231, 0.145, K = 2 and $P_u^{\max}$ = 20 dBm. [Plot: EE_k [Mbits/J] versus N; curves User-1 to User-4.]

Fig. 4.31: EE of each user versus N for the weights Ω3: w1 = 0.4, w2 = 0.4, w3 = 0.1, w4 = 0.1, with Du: 0.749, 0.445, 0.346, 0.311, D: 0.649, 0.545, 0.231, 0.145, K = 2 and $P_u^{\max}$ = 20 dBm. [Plot: EE_k [Mbits/J] versus N; curves User-1 to User-4.]

Fig. 4.32: WSEE versus N comparing FD and HD systems for MRC/MRT for K = 3, $P_u^{\max}$ = 20 dBm and $P_\rho$ = 20 dBm. [Plot: WSEE [Mbits/J] versus N; curves HD and FD with (−5 dB, −5 dB), (0 dB, −5 dB), (0 dB, 5 dB), (5 dB, 5 dB) and (10 dB, 5 dB).]

4.8.3 FD versus HD comparison

We now compare the WSEE of the FD and the HD relaying with MRC/MRT and ZFR/ZFT processing by varying N in Fig. 4.32 and Fig. 4.33, respectively. In the HD system considered herein, a node can either transmit or receive; the communication requires two phases to complete. In the first phase, all the users simultaneously transmit their respective signals to the relay. In the second phase, the relay processes its received signal and broadcasts it to all users. The residual SI variance $\sigma_{LIR}^2$ and the IUI channel variance $\sigma_{UI}^2$ in Proposition 1 are, therefore, set to zero for the HD system. Also, we multiply the SE expression by 1/2 to account for the two phases. For the sake of fairness, we assume the number of transmit and receive RF chains to be the same for both HD and FD systems. We plot the WSEE of FD relaying for different values of the residual SI variance $\sigma_{LIR}^2$ and the IUI channel variance $\sigma_{UI}^2$.

We observe from both these figures that the FD system outperforms the HD one at low $\sigma_{LIR}^2$ and $\sigma_{UI}^2$ values, which is not surprising. This is because an FD system uses spectral resources more efficiently than an HD system. We observe that the FD system performs better than the HD system for $(\sigma_{LIR}^2, \sigma_{UI}^2)$ = (0, 5) dB when N > 100 for MRC/MRT (N > 250 for ZFR/ZFT). The HD system, however, outperforms FD when N < 100 for MRC/MRT (N < 250 for ZFR/ZFT). We also note that as we increase N, the FD system is able to handle higher SI and IUI, and therefore outperforms HD even with $(\sigma_{LIR}^2, \sigma_{UI}^2)$ = (5, 5) dB for N > 500. This is because, as discussed in [21], for large N, the channel vectors of the SI and the desired signal become orthogonal, and hence the ZFR/ZFT or MRC/MRT processing projects the desired signal onto the orthogonal complement space of the SI, which significantly reduces the SI [21].

Fig. 4.33: WSEE versus N comparing FD and HD systems for ZFR/ZFT for K = 3, $P_u^{\max}$ = 20 dBm and $P_\rho$ = 20 dBm. [Plot: WSEE [Mbits/J] versus N; curves HD and FD with (−5 dB, −5 dB), (0 dB, −5 dB), (0 dB, 5 dB), (5 dB, 5 dB) and (10 dB, 5 dB).]


Fig. 4.34: Effect of channel estimation errors for N = 64, K = 3 and $P_\rho$ = 20 dBm. [Plot: WSEE [Mbits/J] versus $P_u^{\max}$ [dBm]; curves for $\sigma_{\xi,g,k}^2 = \sigma_{\xi,f,k}^2$ = 0, 0.15 and 0.35.]


4.8.4 Effect of channel estimation error

We plot in Fig. 4.34 the WSEE versus $P_u^{\max}$ for different values of the channel estimation errors $\sigma_{\xi,g,k}^2$ and $\sigma_{\xi,f,k}^2$ for the MRC/MRT processing. For this analysis, we set K = 3 user-pairs, N = 64 relay antennas and the large-scale fading coefficients $\sigma_{g,k}^2$ and $\sigma_{f,k}^2$ to be unity. We observe that the WSEE decreases with an increase in the channel estimation error, which is not surprising. We also note that the WSEE degradation when $\sigma_{\xi,g,k}^2 = \sigma_{\xi,f,k}^2$ is increased from 0 (perfect channel estimate) to i) 0.15, is marginal; and ii) 0.35, is around 20%.

4.8.5 Algorithm 3 and Algorithm 2 comparison

We now show that the KKT-based Algorithm 3 has the same WSEE as that of the QT-based Algorithm 2. We plot in Fig. 4.35 and Fig. 4.36 the WSEE versus N for MRC/MRT and ZFR/ZFT processing, respectively. We fix $P_u^{\max}$ = 30 dBm, K = 3, $P_\rho$ = 20 dBm, $P_R = P_u^{\max}/2$ and consider $\sigma_{LIR}^2 = \sigma_{UI}^2$, i.e., −10 dB, 0 dB and 10 dB. We observe that both algorithms have the same WSEE for different N, $\sigma_{LIR}^2$ and $\sigma_{UI}^2$ values.

We mentioned in Section 4.6 that the QT-based Algorithm 2 has a higher computational complexity when compared to the KKT approach. We now numerically illustrate this fact by plotting in Fig. 4.37 the runtime per iteration versus N with MRC/MRT for both the QT-based Algorithm 2 and the KKT-based Algorithm 3. We observe that the runtime per iteration of the KKT approach is approximately three orders of magnitude smaller than that of the QT-based approach.


Fig. 4.35: Comparison of QT and closed-form KKT algorithms: WSEE versus N for MRC/MRT with $P_u^{\max}$ = 30 dBm, K = 3. [Plot: WSEE [Mbits/J] versus N; QT and KKT curves for $\sigma_{LIR}^2 = \sigma_{UI}^2$ = −10 dB, 0 dB and 10 dB.]

Fig. 4.36: Comparison of QT and closed-form KKT algorithms: WSEE versus N for ZFR/ZFT with $P_u^{\max}$ = 30 dBm, K = 3. [Plot: WSEE [Mbits/J] versus N; QT and KKT curves for $\sigma_{LIR}^2 = \sigma_{UI}^2$ = −10 dB, 0 dB and 10 dB.]

Fig. 4.37: Runtime versus N with MRC/MRT. [Plot: run time per iteration [sec] versus N; curves QT and KKT for K = 3 and K = 6.]

4.9 Summary

We optimized the non-convex WSEE of a multi-pair AF FD mMIMO two-way relaying system by first approximating the objective as a concave-convex function, and then by using the quadratic transform. We also derived a closed-form waterfilling solution which yields its KKT point. The proposed algorithms yield a higher WSEE than the design which maximizes the spectral efficiency. We numerically showed that approximately N = 64 relay antennas are optimal for the WSEE, an N value which is being used for 5G system design. Any N value above it deteriorates the WSEE because of the increase in transceiver power, which offsets the spectral gain achieved with the same power increase. We showed that with N > 100 (250), a residual SI variance of 0 dB can be tolerated for the MRC/MRT (ZFR/ZFT) processing FD relay to have a higher WSEE than its HD counterpart. We also showed that the WSEE metric allows the weights to be judiciously assigned to weaker users to satisfy their EE requirements.


Chapter 5

Decentralized WSEE Optimization for mMIMO Two-Way HD AF Relaying

5.1 Introduction

The decentralized processing architecture, wherein a set of computing units cooperate with each other in a distributed manner to complete a complex task, has recently attracted a lot of attention in mMIMO system design [77–79]. This is because mMIMO systems generate a large amount of baseband data, which makes centralized processing extremely challenging [79]. In this Chapter, we consider an HD relay, contrary to the previous Chapters where we focused on an FD relay. The main reason is to reduce the complexity due to the FD architecture and focus more on the decentralized algorithm. References [77–79] showed that the decentralized

processing facilitates parallel computing, and offers a balanced trade-off between data transfer

requirements and system performance. Reference [79] argued that the centralized baseband

processing is extremely challenging with current interconnect technology, and proposed a

decentralized architecture, where a middle-level processing node is introduced which limits

the amount of data to be aggregated (or distributed) to (or from) the central processor,

relaxing the requirements on the data transfer network.

5.2 Related work

Reference [80] derived the ergodic rates, with MRC/MRT relay processing, and showed that

the system SE is enhanced if users’ channels have the same statistics. Reference [29] de-

rived the asymptotic SE for mMIMO amplify-and forward relay in infinite relay antennas

regime, and concluded that the large antenna arrays can reduce the transmit power while

maintaining a given asymptotic SE. Reference [30] first derived the closed-form achievable

rate expressions for arbitrary number of antennas for a mMIMO two-way relay, and then

designed an optimal power allocation scheme to maximize the SE. Motivated by the need to

save energy, the GEE metric is being used to design both conventional and mMIMO single-hop

and relaying systems [50, 55, 81–84]. Tan et al. in [50] optimized the asymptotic GEE of a

multi-pair one-way DF mMIMO HD relay, by approximating the non-convex problem as a

convex fractional programming problem. Reference [84] considered a multi-way AF mMIMO

HD relay with channel aging, and optimally allocated power to maximize GEE for ZFR/ZFT

relay processing.

The design of decentralized optimization techniques, which can effectively use the afore-

mentioned architecture, thus becomes crucial [85–87]. Reference [85] proposed a distributed

energy-efficient power optimization for SISO interference channels. Reference [86] constructed

decentralized robust beamformers for a multi-cell system with imperfect channel information

using the ADMM method, which decomposes a global problem into smaller local subproblems,

and then coordinates solutions of these subproblems to solve the global problem [87].

All the above works that optimize SE/GEE/WSEE for multi-pair HD AF relays are

summarized in Table 5.1. We observe from the table that the existing mMIMO relaying literature

has not optimized the SE/GEE/WSEE in a decentralized manner for either MRC/MRT or ZFR/ZFT

processing.


Table 5.1: Summary of mMIMO literature focusing on multi-pair HD AF relay.

Ref.       Processing           Optimization (centralized/decentralized)
[80]       MRC/MRT              No
[29]       MRC/MRT, ZFR/ZFT     No
[30]       MRC/MRT, ZFR/ZFT     SE, centralized
[84]       ZFR/ZFT              GEE, centralized
Proposed   MRC/MRT, ZFR/ZFT     WSEE, both centralized and decentralized


We next list the main contributions as follows:

• We decentrally optimize WSEE by proposing a two-layer optimization framework that combines the SCA approach with ADMM. To the best of our knowledge, such a framework has not been proposed for decentralized WSEE optimization. We assume that the relay has a decentralized architecture with multiple distributed servers (D-servers), each of which is connected to a central server (C-server) [78, 79, 88]. Each D-server solves the WSEE problem for a user in a decentralized manner, and the C-server coordinates with the D-servers to calculate the global WSEE.

• The proposed framework is designed as follows.

First layer: We first equivalently transform the WSEE problem in its epigraph form

and then linearly approximate the non-convex constraints so obtained to cast WSEE

optimization as a GCP. The first-order linear approximation is then iteratively improved

using the SCA approach and the converged solution is shown to be a KKT point of the

original problem.

Second layer: We use the ADMM algorithm to solve the approximated GCP obtained

at the nth iteration of the SCA approach in a decentralized manner for both ZFR/ZFT

and MRC/MRT processing. The SCA problem is not in the standard form and does

not permit the direct application of the ADMM algorithm. To solve this problem, we

introduce the local and global versions of the relevant variables which allow us to apply


ADMM to the approximated GCP. The WSEE problem is then decomposed into sub-

problems which are solved at each distributed node independently with an aim to update

the local variables.

• The approximated GCP in the SCA step has high computational complexity. We reduce

it by approximating the WSEE as a SOCP instead of a GCP. This is achieved by first

casting a constraint equivalently in the hyperbolic form, and then in the second order

cone form. We analytically and numerically show that the SOCP approach reduces the

complexity of the decentralized algorithm when compared with the GCP approach.

• We numerically show that the proposed decentralized algorithms yield higher WSEE

than the i) equal-power allocation (EPA) scheme in [29]; ii) sum-rate maximization

(SRM) scheme in [30]; and iii) random power allocation (RPA) scheme. We also show

that the proposed decentralized approach i) can set the weights to prioritize EE re-

quirements of one user over another; and ii) achieves the same WSEE as obtained using

the centralized optimization approach.

The rest of the chapter is organized as follows. Section 5.4 describes the system model and the

WSEE optimization is formulated in Section 5.5. The decentralized WSEE optimization for

MRC/MRT using GCP and SOCP is presented in Section 5.6 and Section 5.7, respectively.

The decentralized optimization for ZFR/ZFT using SOCP is described in Section 5.8. The

simulation results are presented in Section 5.9, and the Chapter is concluded in Section 5.10.

5.4 System model

We consider, as shown in Fig. 5.40, a multi-pair mMIMO two-way AF relay system, where $K$ pairs of single-antenna users communicate via a HD $N$-antenna relay on the same time-frequency resource. The user $U_{2m-1}$, in the user-pair $(U_{2m-1}, U_{2m})$ for $m = 1, \cdots, K$, on one side of the mMIMO relay, exchanges information with the user $U_{2m}$ on its other side. We assume that the direct links between the users of each pair $(U_{2m-1}, U_{2m})$ are absent due to large path loss and shadowing [30]. A HD two-way relay communication occurs in two phases: multiple access (MAC) and broadcast (BC). In the MAC phase, all $2K$ users send their respective signals to the relay. In the BC phase, the relay first amplifies its received signal, and then broadcasts it back to the users. We assume that all the nodes operate in the time division duplex mode.

Fig. 5.40: System model for multi-pair two-way HD AF relaying.

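All the $2K$ users simultaneously transmit $\sqrt{p_k}\,s_k$, for $k = 1, \cdots, 2K$, to the relay. Here $p_k$ is the transmit power of user $S_k$, with $\mathbb{E}[|s_k|^2] = 1$. The signal received at the relay, $\mathbf{y}_R \in \mathbb{C}^{N\times 1}$, is

$$\mathbf{y}_R = \sum_{k=1}^{2K} \sqrt{p_k}\,\mathbf{g}_k s_k + \mathbf{z}_R = \tilde{\mathbf{G}}\mathbf{s} + \mathbf{z}_R, \tag{5.1}$$

where $\tilde{\mathbf{G}} = \mathbf{G}\mathbf{P}$, with $\mathbf{P} = \mathrm{diag}\{\sqrt{p_1}, \cdots, \sqrt{p_{2K}}\}$ and $\mathbf{s} = [s_1, \cdots, s_{2K}]^T$, and $\mathbf{g}_k \in \mathbb{C}^{N\times 1}$ denotes the channel between the $k$th user and the relay. The elements of $\mathbf{g}_k$ are independent and identically distributed (i.i.d.) with probability distribution function (pdf) $\mathcal{CN}(0, \sigma_k^2)$. Here $\sigma_k^2$ represents the large-scale fading coefficient between the $k$th user and the relay. We denote the channel matrix $\mathbf{G} = \mathbf{H}\mathbf{D}^{1/2} = [\mathbf{g}_1, \cdots, \mathbf{g}_{2K}] \in \mathbb{C}^{N\times 2K}$, where the elements of $\mathbf{H} \in \mathbb{C}^{N\times 2K}$ are the small-scale fading coefficients with pdf $\mathcal{CN}(0, 1)$ and $\mathbf{D}$ is the diagonal matrix with $[\mathbf{D}]_{kk} = \sigma_k^2$. The vector $\mathbf{z}_R$ denotes the noise at the relay, which contains i.i.d. entries with pdf $\mathcal{CN}(0, \sigma_{nr}^2)$.

In the BC phase, the relay amplifies the received signal as $\mathbf{x}_R = \mathbf{W}\mathbf{y}_R \in \mathbb{C}^{N\times 1}$, where $\mathbf{W} \in \mathbb{C}^{N\times N}$ is the relay processing matrix. We denote the total power constraint at the relay as $P_R$, which the relay transmit signal $\mathbf{x}_R$ satisfies by design, i.e., $P_R = \mathrm{Tr}(\mathbb{E}[\mathbf{x}_R\mathbf{x}_R^H]) = \mathrm{Tr}\big(\mathbf{W}(\tilde{\mathbf{G}}\tilde{\mathbf{G}}^H + \sigma_{nr}^2\mathbf{I}_N)\mathbf{W}^H\big)$. The relay next broadcasts $\mathbf{x}_R$ to all $2K$ users. The received signal at the $k'$th user, which also gives the input-output relation for the $k \to k'$ link, is

$$\begin{aligned}
y_{k'} &= \mathbf{g}_{k'}^T\mathbf{x}_R + z_{k'} \\
&= \mathbf{g}_{k'}^T\mathbf{W}\sqrt{p_k}\,\mathbf{g}_k s_k + \mathbf{g}_{k'}^T\mathbf{W}\sqrt{p_{k'}}\,\mathbf{g}_{k'} s_{k'} + \mathbf{g}_{k'}^T\mathbf{W}\sum_{i\neq k,k'}^{2K}\sqrt{p_i}\,\mathbf{g}_i s_i + \mathbf{g}_{k'}^T\mathbf{W}\mathbf{z}_R + z_{k'}.
\end{aligned} \tag{5.2}$$

Here $(k, k') = (2m-1, 2m)$ or $(2m, 2m-1)$ for $m = 1, \cdots, K$ denotes a user-pair. The first term is the desired signal at the $k'$th user, the second term is the self-interference term and the third term is the inter-pair interference. The fourth term represents the amplified noise from the relay, and the fifth term $z_{k'} \sim \mathcal{CN}(0, \sigma_n^2)$ is the AWGN at the $k'$th user.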

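The relay calculates the self-interference cancellation (SIC) coefficients $\hat{\mathbf{g}}_{k'}^T\mathbf{W}\hat{\mathbf{g}}_{k'}$ using the estimated CSI and sends them back to the users. The received signal $y_{k'}$, after SIC, is given as

$$y_{k'} = \mathbf{g}_{k'}^T\mathbf{W}\sqrt{p_k}\,\mathbf{g}_k s_k + \sqrt{p_{k'}}\,\lambda_{k'} s_{k'} + \mathbf{g}_{k'}^T\mathbf{W}\sum_{i\neq k,k'}^{2K}\sqrt{p_i}\,\mathbf{g}_i s_i + \mathbf{g}_{k'}^T\mathbf{W}\mathbf{z}_R + z_{k'}, \tag{5.3}$$

where $\lambda_{k'} = \mathbf{g}_{k'}^T\mathbf{W}\mathbf{g}_{k'} - \hat{\mathbf{g}}_{k'}^T\mathbf{W}\hat{\mathbf{g}}_{k'}$ represents the residual interference, with $\hat{\mathbf{g}}_{k'}$ denoting the estimate of $\mathbf{g}_{k'}$.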

Remark 10. To cancel its own transmitted signal, a user requires knowledge of the scalar SIC coefficient $\hat{\mathbf{g}}_{k'}^T\mathbf{W}\hat{\mathbf{g}}_{k'}$. To calculate this coefficient, the user needs, in addition to its channel $\mathbf{g}_{k'}$, knowledge of the precoder $\mathbf{W}$, which is difficult for it to obtain. The relay, therefore, calculates the scalar coefficients and feeds them back to the users. To transmit the precoded scalar value, the relay needs a single additional pilot per user.

Remark 11. The scalar coefficients could be quantized and sent to the users. The delay and quantization will degrade the spectral efficiency, but we expect the degradation to be marginal. This is due to channel hardening in massive MIMO systems, which makes the channel nearly deterministic.

5.4.1 Channel estimation

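We consider a coherence interval of length $T$ symbols, which is split into a channel estimation phase of $\tau$ symbols and a data transmission phase of $T - \tau$ symbols. During channel estimation, all $2K$ users simultaneously transmit $\tau$-length pilot sequences to the relay. With $P_\rho$ as the pilot transmit power, the users transmit the pilot matrix $\sqrt{\tau P_\rho}\,\boldsymbol{\psi} \in \mathbb{C}^{2K\times\tau}$. We assume that the pilots are orthogonal, i.e., $\boldsymbol{\psi}\boldsymbol{\psi}^H = \mathbf{I}_{2K}$, which requires $\tau \ge 2K$ [44]. The MMSE channel estimate $\hat{\mathbf{G}}$, with $\mathbf{E}_g$ as the estimation error matrix, is given as $\hat{\mathbf{G}} = \mathbf{G} + \mathbf{E}_g$ [30, 89]. The matrix $\mathbf{E}_g$ is independent of the estimated channel matrix $\hat{\mathbf{G}}$ [89]. The rows of the matrix $\hat{\mathbf{G}}$ are distributed as $\mathcal{CN}(\mathbf{0}, \hat{\mathbf{D}})$, where $\hat{\mathbf{D}} = \mathrm{diag}\{\hat{\sigma}_1^2, \cdots, \hat{\sigma}_{2K}^2\}$ with [89]

$$\hat{\sigma}_k^2 = \frac{\tau P_\rho \sigma_k^4}{\tau P_\rho \sigma_k^2 + \sigma_{nr}^2}.$$

Therefore, $\mathbf{E}_g \sim \mathcal{CN}(\mathbf{0}, \mathbf{D} - \hat{\mathbf{D}})$ with $\mathbf{D} - \hat{\mathbf{D}} = \mathrm{diag}\{\sigma_{\xi,1}^2, \cdots, \sigma_{\xi,2K}^2\}$ and

$$\sigma_{\xi,k}^2 = \frac{\sigma_k^2 \sigma_{nr}^2}{\tau P_\rho \sigma_k^2 + \sigma_{nr}^2}.$$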
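As a quick numerical check of the variance expressions above, the following sketch evaluates the estimate and error variances for a few large-scale fading values and confirms that they add up to the channel variance. It assumes the standard MMSE form with $\sigma_k^4$ in the numerator of the estimate variance; the function name `mmse_variances` and all numerical values are illustrative and not taken from the thesis.

```python
import numpy as np

def mmse_variances(sigma2, tau, p_rho, sigma2_nr):
    """Return (estimate variance, error variance) per user.

    sigma2    : large-scale fading coefficients sigma_k^2 (array-like)
    tau       : pilot length in symbols
    p_rho     : pilot transmit power
    sigma2_nr : noise variance at the relay
    """
    sigma2 = np.asarray(sigma2, dtype=float)
    denom = tau * p_rho * sigma2 + sigma2_nr
    est_var = tau * p_rho * sigma2**2 / denom   # assumed MMSE estimate variance
    err_var = sigma2 * sigma2_nr / denom        # estimation error variance
    return est_var, err_var

# Illustrative values only (hypothetical, not from the thesis).
sigma2 = np.array([1.0, 0.5, 0.1, 2.0])
est_var, err_var = mmse_variances(sigma2, tau=4, p_rho=1.0, sigma2_nr=1.0)
```

Increasing `tau` or `p_rho` in this sketch drives the error variance towards zero, which matches the intuition that longer or stronger pilots improve the estimate.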

5.4.2 Relay precoder design

We next present the relay precoders based on the MRC/MRT and the ZFR/ZFT processing.

5.4.2.1 MRC/MRT processing

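The relay precoder is $\mathbf{W} = \alpha\hat{\mathbf{G}}^*\mathbf{T}\hat{\mathbf{G}}^H$ [29], where $\hat{\mathbf{G}}$ is the estimated channel matrix. Here the block-diagonal permutation matrix $\mathbf{T} = \mathrm{blkdiag}\{\mathbf{T}_1, \cdots, \mathbf{T}_K\}$, with $\mathbf{T}_m = [0\ 1;\ 1\ 0]$ for $1 \le m \le K$, permutes the transmit data of each user-pair to ensure that the data reaches its paired receiver. The amplification factor $\alpha$ is obtained along similar lines as in Section 2.6.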
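The structure of the MRC/MRT precoder can be illustrated with a short sketch. It builds the pair-swapping permutation matrix and scales the precoder so that the relay power constraint is met. As an assumption made only for this example, the amplification factor is computed from the instantaneous power constraint with the channel replaced by its estimate, rather than from the expression of Section 2.6; the function names and numerical values are hypothetical.

```python
import numpy as np

def pair_permutation(K):
    """Block-diagonal T = blkdiag{T_1, ..., T_K} with T_m = [[0, 1], [1, 0]]."""
    return np.kron(np.eye(K), np.array([[0.0, 1.0], [1.0, 0.0]]))

def mrc_mrt_precoder(G_hat, p, P_R, sigma2_nr):
    """W = alpha * conj(G_hat) @ T @ G_hat^H, scaled to meet the relay power P_R.

    Assumption: alpha is set from the instantaneous constraint
    P_R = Tr(W (G P^2 G^H + sigma2_nr I) W^H), with G replaced by G_hat.
    """
    N, two_K = G_hat.shape
    T = pair_permutation(two_K // 2)
    W0 = G_hat.conj() @ T @ G_hat.conj().T           # unnormalized precoder
    G_tilde = G_hat * np.sqrt(p)                     # same as G_hat @ diag(sqrt(p))
    cov = G_tilde @ G_tilde.conj().T + sigma2_nr * np.eye(N)
    power = np.trace(W0 @ cov @ W0.conj().T).real
    return np.sqrt(P_R / power) * W0, T

# Illustrative setup (hypothetical values).
rng = np.random.default_rng(0)
N, K = 16, 2
G_hat = (rng.standard_normal((N, 2 * K))
         + 1j * rng.standard_normal((N, 2 * K))) / np.sqrt(2)
p = np.ones(2 * K)
W, T = mrc_mrt_precoder(G_hat, p, P_R=2.0, sigma2_nr=1.0)
```

Applying `T` to a vector of user symbols swaps the two entries of every pair, which is how the data of one user is steered towards its partner.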

5.4.2.2 ZFR/ZFT processing

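The relay precoder is $\mathbf{W} = \alpha\bar{\mathbf{G}}^*\mathbf{T}\bar{\mathbf{G}}^H$ [29], where $\bar{\mathbf{G}} = \hat{\mathbf{G}}(\hat{\mathbf{G}}^H\hat{\mathbf{G}})^{-1}$ and $\hat{\mathbf{G}}$ is the estimated channel matrix. We observe that for ZFR/ZFT processing, $\hat{\mathbf{g}}_{k'}^T\mathbf{W}\hat{\mathbf{g}}_i = \alpha\,\delta_{k,i}$, where $\delta_{k,i} = 1$ when $i = k$ and zero otherwise. Therefore, the ZFR/ZFT precoder cancels both self- and inter-pair interference terms in (5.3).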

5.4.3 Closed-form SE expression

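We next provide the closed-form SE expressions for the $k \to k'$ link, for both MRC/MRT and ZFR/ZFT precoders, that will later be used to allocate power to maximize WSEE. Here $(k \to k') = (2m, 2m-1)$ or $(2m-1, 2m)$ for $m = 1, \cdots, K$ represents the user-pair exchanging data, where $S_k$ is the transmitting user and $S_{k'}$ is the recipient. The SE expression for $\zeta \in \{\mathrm{mrc}, \mathrm{zf}\}$ is

$$R_k = A \log_2\big(1 + \mathrm{SINR}_k^{\zeta}(\mathbf{p})\big), \tag{5.4}$$

where

$$\mathrm{SINR}_k^{\mathrm{mrc}}(\mathbf{p}) = \frac{N\sigma_k^4\sigma_{k'}^2 p_k}{\sum_{i=1, i\neq k'}^{2K}\big(\sigma_i^2\sigma_{k'}^2\sigma_k^2 + \sigma_i^4\sigma_{i'}^2\big)p_i + \sigma_{nr}^2\sigma_{k'}^2\sigma_k^2} \quad \text{and} \quad \mathrm{SINR}_k^{\mathrm{zf}}(\mathbf{p}) = \frac{p_k (N - 2K - 1)\sigma_k^2}{\sigma_{nr}^2}. \tag{5.5}$$

Here $A = \frac{1}{2}\left(1 - \frac{2K}{T}\right)$ and $\mathbf{p} = [p_1, \cdots, p_{2K}]^T$. These expressions are valid at high SNR, i.e., $\sigma_n^2/p_k \ll 1$, $\sigma_{nr}^2/p_k \ll 1$, and $\sigma_n^2/P_R \ll 1$, and are derived in Appendix G. Before we optimize the WSEE, a few remarks are in order.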

Remark 12. We see from (5.5) that $\mathrm{SINR}_k^{\mathrm{zf}}(\mathbf{p})$ is a function of $p_k$ alone, while $\mathrm{SINR}_k^{\mathrm{mrc}}(\mathbf{p})$ is a function of all the power optimization variables. This makes it easier to centrally optimize WSEE for ZFR/ZFT processing, which we did in [90]. We note that [90] was a first attempt to optimize the WSEE for a mMIMO relay, and considered ZFR/ZFT processing with a simple SINR expression. The current work not only extends it to MRC/MRT processing with coupled power variables but also proposes algorithms to optimize them in a decentralized manner.

Remark 13. MRC/MRT SINR expression insights: We know that the MRC/MRT design maximizes the received SNR by neglecting the inter-pair interference [91]. This can also be observed from the $\mathrm{SINR}_k^{\mathrm{mrc}}(\mathbf{p})$ expression in (5.5), where the denominator is dominated by the inter-pair interference terms. We also notice that the numerator of $\mathrm{SINR}_k^{\mathrm{mrc}}(\mathbf{p})$ contains the power of user $S_k$, i.e., $p_k$, which is expected because the SINR expression is for the $k \to k'$ link. The numerator also contains the path loss from user $S_k$ to the relay (i.e., $\sigma_k^2$) and from the relay to user $S_{k'}$ (i.e., $\sigma_{k'}^2$).

Remark 14. ZFR/ZFT SINR expression insights: We note that the numerator of the $\mathrm{SINR}_k^{\mathrm{zf}}(\mathbf{p})$ expression in (5.5), similar to the MRC/MRT SINR expression, contains $p_k$, the power of the desired user. We know that the ZFR/ZFT design eliminates the inter-pair interference by neglecting the noise [91]. We see this from the $\mathrm{SINR}_k^{\mathrm{zf}}(\mathbf{p})$ expression in (5.5), where the denominator only consists of the noise variance, and the inter-pair interference is completely cancelled.
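The closed-form expressions in (5.4) and (5.5) can be evaluated directly, as in the following sketch. It uses 0-based user indices, so the partner $k'$ of user $k$ is obtained by flipping the last bit of the index. The function names and the numerical values are hypothetical and only serve to illustrate the coupling discussed in Remarks 12 to 14.

```python
import numpy as np

def partner(k):
    """0-based index of the paired user k': 0<->1, 2<->3, ..."""
    return k ^ 1

def sinr_mrc(p, sigma2, N, sigma2_nr):
    """MRC/MRT SINR of every k -> k' link, following (5.5)."""
    p = np.asarray(p, dtype=float)
    sigma2 = np.asarray(sigma2, dtype=float)
    n_users = p.size
    out = np.empty(n_users)
    for k in range(n_users):
        kp = partner(k)
        denom = sigma2_nr * sigma2[kp] * sigma2[k]
        for i in range(n_users):
            if i == kp:            # the sum in (5.5) excludes i = k'
                continue
            ip = partner(i)
            denom += (sigma2[i] * sigma2[kp] * sigma2[k]
                      + sigma2[i] ** 2 * sigma2[ip]) * p[i]
        out[k] = N * sigma2[k] ** 2 * sigma2[kp] * p[k] / denom
    return out

def sinr_zf(p, sigma2, N, sigma2_nr):
    """ZFR/ZFT SINR of every k -> k' link, following (5.5)."""
    p = np.asarray(p, dtype=float)
    sigma2 = np.asarray(sigma2, dtype=float)
    return p * (N - p.size - 1) * sigma2 / sigma2_nr

# Illustrative example with K = 2 pairs (hypothetical values).
p = np.array([0.3, 0.6, 0.2, 0.9])
sigma2 = np.array([1.0, 0.5, 2.0, 0.8])
A = 0.5 * (1 - 4 / 200)                       # A = (1/2)(1 - 2K/T) with T = 200
rate_mrc = A * np.log2(1 + sinr_mrc(p, sigma2, N=64, sigma2_nr=1.0))
rate_zf = A * np.log2(1 + sinr_zf(p, sigma2, N=64, sigma2_nr=1.0))
```

Changing a single entry of `p` alters every entry of `sinr_mrc` but only one entry of `sinr_zf`, which is the coupling that makes the MRC/MRT problem harder to decompose.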

5.5 Weighted sum energy efficiency maximization

We now aim to maximize the WSEE to satisfy the heterogeneous EE requirements of different users by assigning them different weights. To define WSEE, we first define the EE (in bits/Joule) of a user link. The EE, as in [26, 70], for the $k \to k'$ user link is defined as the ratio of the SE and the power used by it, and is given by

$$\mathrm{EE}_k = \frac{B \times R_k}{\frac{1}{\eta_U}\left(1 - \frac{2K}{T}\right)p_k + \frac{\eta_U^{-1}\frac{2K}{T}P_\rho + P_C}{2K}}, \tag{5.6}$$

where $p_k$ represents the transmit power of the $k \to k'$ link, the constant $\eta_U \in (0, 1)$ denotes the power amplifier efficiency at the user, and $P_C$ denotes the total circuit power consumed, which can be obtained along lines similar to (3.2).

We next define WSEE, which is the weighted sum of the EEs of all the users [26, 70], as follows

$$\mathrm{WSEE} = \sum_{k=1}^{2K} w_k \mathrm{EE}_k. \tag{5.7}$$

Here $w_k$ for $k = 1, \cdots, 2K$ represent predefined weights, which are set according to the EE requirements of different users [25]. Using (5.4), (5.6) and (5.7), we now maximize the WSEE by allocating power $p_k$ to the $k$th user:

$$\begin{aligned}
\mathbf{P1}:\ \underset{\mathbf{p}}{\text{Maximize}}\quad & \sum_{k=1}^{2K} w_k \frac{\log_2\big(1 + \mathrm{SINR}_k^{\zeta}(\mathbf{p})\big)}{\vartheta p_k + \bar{P}_C} && \text{(5.8a)}\\
\text{subject to}\quad & 0 \le p_k \le P_{\max},\ \forall k \in \mathcal{K} && \text{(5.8b)}\\
& \sum_{k=1}^{2K} p_k + P_R \le P_t^{\max}, && \text{(5.8c)}\\
& R_k \ge \bar{R}_k,\ \forall k \in \mathcal{K}. && \text{(5.8d)}
\end{aligned}$$

Here $\mathcal{K} \triangleq \{1, \cdots, 2K\}$, $\vartheta = \frac{1}{\eta_U}\left(1 - \frac{2K}{T}\right)$ and $\bar{P}_C = \frac{\eta_U^{-1}\frac{2K}{T}P_\rho + P_C}{2K}$. The constraints in (5.8b) are on the peak transmit power of each user ($P_{\max}$), the constraint in (5.8c) is on the total system transmit power ($P_t^{\max}$), and the constraints in (5.8d) specify the quality-of-service (QoS) required, in terms of the rate $\bar{R}_k$, by the link $k \to k'$.
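To make the objective and constraints of P1 concrete, the sketch below evaluates the weighted sum in (5.8a) and checks the constraints (5.8b) to (5.8d) for a given power vector. It uses the ZFR/ZFT SINR of (5.5) because of its simple form. The function names and all numbers are hypothetical; this is an evaluation helper under those assumptions, not the optimization algorithm itself.

```python
import numpy as np

def wsee(p, weights, sinr, theta, pc_bar):
    """Weighted sum of per-link EE terms, as in the objective (5.8a)."""
    p = np.asarray(p, dtype=float)
    return float(np.sum(weights * np.log2(1.0 + sinr) / (theta * p + pc_bar)))

def is_feasible(p, sinr, p_max, p_relay, p_total_max, rate_min, A):
    """Check the power constraints (5.8b)-(5.8c) and the QoS constraint (5.8d)."""
    p = np.asarray(p, dtype=float)
    rate = A * np.log2(1.0 + sinr)
    return bool(
        np.all(p >= 0.0) and np.all(p <= p_max)
        and p.sum() + p_relay <= p_total_max
        and np.all(rate >= rate_min)
    )

# Illustrative two-user example; all numbers are hypothetical.
N, two_K = 10, 2
sigma2 = np.ones(two_K)
p = np.ones(two_K)
sinr_zf = p * (N - two_K - 1) * sigma2 / 1.0      # sigma2_nr = 1
value = wsee(p, np.ones(two_K), sinr_zf, theta=2.0, pc_bar=1.0)
ok = is_feasible(p, sinr_zf, p_max=2.0, p_relay=1.0, p_total_max=4.0,
                 rate_min=1.0, A=0.45)
```

Raising the weight of one user in `weights` increases the contribution of that user's EE term to the objective, which is how the WSEE metric expresses heterogeneous EE priorities.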

Remark 15. The objective of the optimization problem P1 is a function of $\mathrm{SINR}_k^{\zeta}$ and $\bar{P}_C$. We note that $\mathrm{SINR}_k^{\zeta}$, for both MRC/MRT and ZFR/ZFT processing in (5.5), is a function of the number of relay antennas $N$. The total circuit power consumption $P_C$ in the denominator of the objective, as shown in (3.2), is also a function of $N$. These expressions, therefore, reflect the effect of the large number of relay antennas $N$ in the optimization process.

Before proposing the decentralized algorithms to solve P1, we discuss why its central optimization is also challenging by comparing it with the central GEE optimization in [50, 92]. The GEE is defined as

$$\mathrm{GEE} = \frac{\sum_{k=1}^{2K} A \log_2\big(1 + \mathrm{SINR}(p_k)\big)}{\sum_{k=1}^{2K}\big(\vartheta p_k + \bar{P}_C\big)}. \tag{5.9}$$

We first see that the GEE, which is the ratio of the network sum-rate and the network power consumption, is a network-centric metric and cannot prioritize different user links. The GEE can be shown to be a PC function [92], and can be maximized using Dinkelbach’s algorithm [64]. The WSEE in P1, in contrast, is a sum of PC functions, and is not guaranteed to be a PC function. Dinkelbach’s algorithm thus cannot be used to optimize it. We next use the SCA approach to develop decentralized algorithms to optimize it for both MRC/MRT and ZFR/ZFT processing. We begin with MRC/MRT processing in the next section.

5.6 Two-layer decentralized WSEE optimization for MRC/MRT using GCP

We decentrally optimize WSEE using the ADMM algorithm, which is based on the decomposition-coordination

procedure, wherein the global WSEE optimization problem is decomposed into

small sub-problems which are solved independently by the D-servers, and the solutions of

these sub-problems are coordinated by the C-server to calculate the global solution [87].

Before applying ADMM to P1, we first approximate, using the SCA framework in the first

layer, its non-convex objective as a convex function and cast it as a GCP. The approximate

GCP so obtained can be solved either centrally or decentrally in the second layer. At the

nth iteration of the SCA, we use the ADMM algorithm to optimize it in a decentralized

manner. The approximated GCP is not in the standard form and does not permit the direct

application of the ADMM algorithm. We, therefore, introduce the local and global versions

of the relevant variables which enable us to apply ADMM. The problem is then decomposed

into sub-problems which each D-server solves independently to update the local variables.

The C-server then collects the local variables from each D-server, and updates the global

variables. The variables are updated and the information is exchanged until the ADMM

converges in the nth SCA iteration. All the D-servers then update the relevant parameters

to formulate the SCA problem for the (n + 1)th iteration, and another ADMM iteration is

then carried out. We next describe the SCA framework.

5.6.1 SCA framework

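To approximate the non-convex WSEE objective in problem P1, we first cast it in an equivalent epigraph form [63], and then approximate the non-convex constraints to cast it as a GCP. The epigraph form of the problem P1 is as follows [63]

$$\begin{aligned}
\mathbf{P2}:\ \underset{\mathbf{p},\mathbf{g}}{\text{Maximize}}\quad & \sum_{k=1}^{2K} w_k g_k && \text{(5.10a)}\\
\text{subject to}\quad & g_k \le \frac{\log_2\big(1 + \mathrm{SINR}_k^{\mathrm{mrc}}(\mathbf{p})\big)}{\vartheta p_k + \bar{P}_C},\ \forall k \in \mathcal{K} && \text{(5.10b)}\\
& \text{(5.8b), (5.8c), (5.8d)}. && \text{(5.10c)}
\end{aligned}$$

Here $\mathrm{SINR}_k^{\mathrm{mrc}}(\mathbf{p})$, given in (5.5), denotes the SINR expression for the MRC/MRT processing and $\mathbf{g} = [g_1, \cdots, g_{2K}]^T$ denotes the slack variable, where $g_k$ represents the EE of the $k \to k'$ link. We next introduce additional slack variables $\boldsymbol{\phi} = [\phi_1, \cdots, \phi_{2K}]^T$ and $\boldsymbol{\gamma} = [\gamma_1, \cdots, \gamma_{2K}]^T$ and cast P2 as

$$\begin{aligned}
\mathbf{P3}:\ \underset{\mathbf{p},\mathbf{g},\boldsymbol{\phi},\boldsymbol{\gamma}}{\text{Maximize}}\quad & \sum_{k=1}^{2K} w_k g_k && \text{(5.11a)}\\
\text{subject to}\quad & \vartheta p_k + \bar{P}_C \le \phi_k^2/g_k,\ \forall k \in \mathcal{K} && \text{(5.11b)}\\
& \phi_k^2 \le \log_2(1 + \gamma_k),\ \forall k \in \mathcal{K} && \text{(5.11c)}\\
& \gamma_k \le \frac{N\sigma_k^4\sigma_{k'}^2 p_k}{\sum_{i=1, i\neq k'}^{2K}\big(\sigma_i^2\sigma_{k'}^2\sigma_k^2 + \sigma_i^4\sigma_{i'}^2\big)p_i + \sigma_{nr}^2\sigma_{k'}^2\sigma_k^2},\ \forall k \in \mathcal{K} && \text{(5.11d)}\\
& B\log_2(1 + \gamma_k) \ge \bar{R}_k,\ \forall k \in \mathcal{K} && \text{(5.11e)}\\
& \text{(5.8b), (5.8c)}, && \text{(5.11f)}
\end{aligned}$$

where in the RHS of (5.11d), we substitute the value of $\mathrm{SINR}_k^{\mathrm{mrc}}(\mathbf{p})$ from (5.5). The slack variables $\phi_k^2$ and $\gamma_k$ represent the SE and the SINR of the $k \to k'$ link, respectively. Note that the objective in P3 is linear, the constraints (5.11c) and (5.11e) are convex, and the constraints (5.11b) and (5.11d) are non-convex. We further simplify constraint (5.11d) by substituting $I_k = N\sigma_k^4\sigma_{k'}^2$, and introducing slack variables $\mathbf{d} = [d_1, \cdots, d_{2K}]^T$ and $\boldsymbol{\alpha}^2 = [\alpha_1^2, \cdots, \alpha_{2K}^2]^T$, as follows

$$\begin{aligned}
& \gamma_k \le I_k\frac{\alpha_k^2}{d_k},\ \forall k \in \mathcal{K} && \text{(5.12a)}\\
& d_k \ge \sum_{i=1, i\neq k'}^{2K}\big(\sigma_i^2\sigma_{k'}^2\sigma_k^2 + \sigma_i^4\sigma_{i'}^2\big)p_i + \sigma_{nr}^2\sigma_{k'}^2\sigma_k^2,\ \forall k \in \mathcal{K} && \text{(5.12b)}\\
& \alpha_k^2 \le p_k,\ \forall k \in \mathcal{K}. && \text{(5.12c)}
\end{aligned}$$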

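The terms $d_k$ and $\alpha_k^2$ represent the denominator of the $\mathrm{SINR}_k^{\mathrm{mrc}}(\mathbf{p})$ given in (5.5), and the transmit power of the $k$th user, respectively. Therefore, by using (5.12), the problem P3 can be re-written as

$$\begin{aligned}
\mathbf{P4}:\ \underset{\mathbf{p},\mathbf{g},\boldsymbol{\phi},\boldsymbol{\gamma},\mathbf{d},\boldsymbol{\alpha}}{\text{Maximize}}\quad & \sum_{k=1}^{2K} w_k g_k && \text{(5.13a)}\\
\text{subject to}\quad & \text{(5.11b), (5.11c), (5.11e), (5.12a), (5.12b), (5.12c), (5.8b), (5.8c)}. && \text{(5.13b)}
\end{aligned}$$

We note that problem P4 has all convex constraints except (5.11b) and (5.12a), whose RHS are of the quadratic-over-linear form. The functions $\phi_k^2/g_k$ and $\alpha_k^2/d_k$ are jointly convex in $(\phi_k, g_k)$ and $(\alpha_k, d_k)$ respectively, which can be verified by calculating their second-order derivatives. We approximate the RHS of (5.11b) and (5.12a) with their respective first-order Taylor approximations. For a convex function $f(\mathbf{v})$, the first-order Taylor approximation is always its global under-estimator [63]

$$f(\mathbf{v}) \ge f(\mathbf{u}) + \nabla f(\mathbf{u})^T(\mathbf{v} - \mathbf{u}), \tag{5.14}$$

where $\nabla f(\cdot)$ is the gradient of $f(\cdot)$. At the $n$th iteration of the SCA, using (5.14), the linear approximations of the functions $\phi_k^2/g_k$ and $\alpha_k^2/d_k$ at the points $(\phi_k^{(n)}, g_k^{(n)})$ and $(\alpha_k^{(n)}, d_k^{(n)})$, respectively, are

$$\frac{\phi_k^2}{g_k} \ge \frac{2\phi_k^{(n)}}{g_k^{(n)}}\phi_k - \left(\frac{\phi_k^{(n)}}{g_k^{(n)}}\right)^2 g_k \quad \text{and} \quad \frac{\alpha_k^2}{d_k} \ge \frac{2\alpha_k^{(n)}}{d_k^{(n)}}\alpha_k - \left(\frac{\alpha_k^{(n)}}{d_k^{(n)}}\right)^2 d_k. \tag{5.15}$$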
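The under-estimator property in (5.15) can be checked numerically. The sketch below compares the quadratic-over-linear function $x^2/y$ with its first-order approximation around a point $(x_0, y_0)$; the gap equals $(x - (x_0/y_0)y)^2/y$, which is non-negative for $y > 0$ and vanishes at the expansion point. The function names and the sampled ranges are hypothetical choices for illustration.

```python
import numpy as np

def quad_over_lin(x, y):
    """Quadratic-over-linear function x^2 / y (jointly convex for y > 0)."""
    return x ** 2 / y

def quad_over_lin_linearized(x, y, x0, y0):
    """First-order Taylor approximation of x^2 / y around (x0, y0), as in (5.15)."""
    return 2.0 * x0 / y0 * x - (x0 / y0) ** 2 * y

# Expansion point and random test points (illustrative values).
x0, y0 = 1.5, 2.0
rng = np.random.default_rng(0)
xs = rng.uniform(0.1, 5.0, 1000)
ys = rng.uniform(0.1, 5.0, 1000)

# The gap is non-negative everywhere and zero at the expansion point.
gap = quad_over_lin(xs, ys) - quad_over_lin_linearized(xs, ys, x0, y0)
gap_at_point = quad_over_lin(x0, y0) - quad_over_lin_linearized(x0, y0, x0, y0)
```

Because the approximation never exceeds the true function, replacing the RHS of (5.11b) and (5.12a) by it shrinks the feasible set, so any point feasible for the approximated problem is also feasible for P4.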


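We now replace the RHS of (5.11b) and (5.12a) with their respective linear approximations in (5.15). The approximated problem at the $n$th iteration is thus

$$\begin{aligned}
\mathbf{P5}:\ \underset{\mathbf{p},\mathbf{g},\boldsymbol{\phi},\boldsymbol{\gamma},\mathbf{d},\boldsymbol{\alpha}}{\text{Maximize}}\quad & \sum_{k=1}^{2K} w_k g_k && \text{(5.16a)}\\
\text{subject to}\quad & \vartheta p_k + \bar{P}_C \le \frac{2\phi_k^{(n)}}{g_k^{(n)}}\phi_k - \left(\frac{\phi_k^{(n)}}{g_k^{(n)}}\right)^2 g_k,\ \forall k \in \mathcal{K} && \text{(5.16b)}\\
& \gamma_k \le I_k\left(\frac{2\alpha_k^{(n)}}{d_k^{(n)}}\alpha_k - \left(\frac{\alpha_k^{(n)}}{d_k^{(n)}}\right)^2 d_k\right),\ \forall k \in \mathcal{K} && \text{(5.16c)}\\
& \text{(5.8b), (5.8c), (5.11c), (5.12b), (5.12c), (5.11e)}. && \text{(5.16d)}
\end{aligned}$$

We note that the non-convex constraints (5.11b) and (5.12a) in problem P4 are now approximated in problem P5 as linear constraints (5.16b) and (5.16c), respectively. The problem P5 is a GCP with all convex constraints. Even though our objective is to decentrally solve P5, it can also be centrally solved as briefed in the remark below.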

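Remark 16. Central iterative procedure for SCA: Given $(\phi_k^{(1)}, g_k^{(1)}, \alpha_k^{(1)}, d_k^{(1)})$, whose calculation is discussed next in Remark 17, set the iteration number $n = 1$ and do the following:

• Solve P5 to calculate $\mathbf{p}^*$.

• Compute, $\forall k \in \mathcal{K}$:

$$\phi_k^{(n+1)} = \big(\log_2(1 + \mathrm{SINR}_k^{\mathrm{mrc}}(\mathbf{p}^*))\big)^{1/2}, \quad g_k^{(n+1)} = \frac{\log_2\big(1 + \mathrm{SINR}_k^{\mathrm{mrc}}(\mathbf{p}^*)\big)}{\vartheta p_k^* + \bar{P}_C},$$

$$\alpha_k^{(n+1)} = (p_k^*)^{1/2} \quad \text{and} \quad d_k^{(n+1)} = \sum_{i=1, i\neq k'}^{2K}\big(\sigma_i^2\sigma_{k'}^2\sigma_k^2 + \sigma_i^4\sigma_{i'}^2\big)p_i^* + \sigma_{nr}^2\sigma_{k'}^2\sigma_k^2.$$

• $n \leftarrow n + 1$.

Repeat the above steps until convergence.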
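The parameter update in the second step of Remark 16 can be sketched as follows. Given a power vector, the code computes the four linearization points for the next SCA iteration; solving P5 itself would require a convex solver and is not shown. The function names, the 0-based pairing convention (partner index obtained by flipping the last bit) and the numerical values are assumptions made for this illustration.

```python
import numpy as np

def mrc_denominator(p, sigma2, sigma2_nr):
    """Denominator of the MRC/MRT SINR in (5.5), i.e. the value of d_k, for all k."""
    partner = np.arange(p.size) ^ 1          # k' for 0-based k: 0<->1, 2<->3, ...
    s, sp = sigma2, sigma2[partner]
    c = s * sp                               # sigma_k^2 * sigma_k'^2
    tot1 = np.sum(s * p)                     # sum over all i of sigma_i^2 p_i
    tot2 = np.sum(s ** 2 * sp * p)           # sum over all i of sigma_i^4 sigma_i'^2 p_i
    # Remove the excluded index i = k' from both sums.
    excl1 = sp * p[partner]
    excl2 = sp ** 2 * s * p[partner]
    return c * (tot1 - excl1) + (tot2 - excl2) + sigma2_nr * c

def sca_update(p_star, sigma2, N, sigma2_nr, theta, pc_bar):
    """Linearization points (phi, g, alpha, d) for the next SCA iteration."""
    p_star = np.asarray(p_star, dtype=float)
    sigma2 = np.asarray(sigma2, dtype=float)
    partner = np.arange(p_star.size) ^ 1
    d = mrc_denominator(p_star, sigma2, sigma2_nr)
    I = N * sigma2 ** 2 * sigma2[partner]    # I_k = N sigma_k^4 sigma_k'^2
    se = np.log2(1.0 + I * p_star / d)       # log2(1 + SINR_k^mrc(p*))
    return {"phi": np.sqrt(se), "g": se / (theta * p_star + pc_bar),
            "alpha": np.sqrt(p_star), "d": d}

# Illustrative values (hypothetical).
p_star = np.array([0.3, 0.6, 0.2, 0.9])
sigma2 = np.array([1.0, 0.5, 2.0, 0.8])
sigma2_nr, theta, pc_bar = 1.0, 2.5, 0.4
pts = sca_update(p_star, sigma2, N=64, sigma2_nr=sigma2_nr,
                 theta=theta, pc_bar=pc_bar)
```

With these points, the linearized constraints (5.16b) and (5.16c) hold with equality at the previous solution, which is what keeps that solution feasible for the next approximated problem.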

The complexity of GCP P5 can be reduced by formulating it as a SOCP [93], which we

do later in Section 5.7. We next describe how it can be decentrally solved using ADMM.

Before doing that, two remarks are in order.

Remark 17. Calculation of initial feasible points $\phi_k^{(1)}$, $g_k^{(1)}$, $\alpha_k^{(1)}$ and $d_k^{(1)}$: We first equally divide the power between all $2K$ users and the relay, i.e., $p_k = P_t^{\max}/4K$, $\forall k \in \mathcal{K}$, and $P_R = P_t^{\max}/2$. We next compute $\phi_k^{(1)}$, $g_k^{(1)}$, $\alpha_k^{(1)}$ and $d_k^{(1)}$ by replacing the inequalities in constraints (5.11b), (5.11c), (5.12b) and (5.12c) by equalities. If the initial points $\phi_k^{(1)}$, $g_k^{(1)}$, $\alpha_k^{(1)}$ and $d_k^{(1)}$ are feasible for P5, then the problems formulated in subsequent iterations are also feasible and solvable, as discussed in the convergence of the SCA approach below.

Remark 18. SCA convergence: We start by denoting $\theta^{(n)}$ as the value of the objective function and $\psi^{(n)}$ as the corresponding feasible point of the optimization variables, at iteration $n$. Then, $\psi^{(n)}$ is also feasible for the problem formulated at iteration $n + 1$ [94]. The proposed iterative approach, therefore, results in a non-decreasing sequence of the objective function, i.e., $\theta^{(n+1)} \ge \theta^{(n)}$. Due to the power constraints (5.8b) and (5.8c), the objective $\theta$ is bounded from above, which ensures that the iterative procedure always converges, and that it converges to a KKT point of P1 due to the following proposition (see proof in [94, pg. 33]).

Proposition 6. If the sequence $\{x^{(n)}\}$ converges to a point $x^*$, then $x^*$ is a KKT point of the original problem.

5.6.2 Decentralized ADMM approach

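Our aim now is to decentrally solve P5 in the second layer using ADMM. We observe that the constraints (5.8c) and (5.12b), which couple the transmit powers of different users, prevent it from being decomposable. We therefore first define a local feasible set at the $n$th iteration of the SCA procedure, denoted as $\mathcal{S}_k^{(n)}$, for the D-server which solves the problem for user $U_k$ as

$$\begin{aligned}
\mathcal{S}_k^{(n)} = \Bigg\{ \tilde{\mathbf{p}}_k, g_k, \phi_k, \gamma_k, d_k, \alpha_k \ \Bigg|\ & \vartheta p_{k,k} + \bar{P}_C \le \frac{2\phi_k^{(n)}}{g_k^{(n)}}\phi_k - \left(\frac{\phi_k^{(n)}}{g_k^{(n)}}\right)^2 g_k,\quad \phi_k^2 \le \log_2(1 + \gamma_k),\\
& \gamma_k \le I_k\left(\frac{2\alpha_k^{(n)}}{d_k^{(n)}}\alpha_k - \left(\frac{\alpha_k^{(n)}}{d_k^{(n)}}\right)^2 d_k\right),\quad d_k \ge \sum_{i\neq k'}^{2K}\big(\sigma_i^2\sigma_{k'}^2\sigma_k^2 + \sigma_i^4\sigma_{i'}^2\big)p_{i,k} + \sigma_{nr}^2\sigma_{k'}^2\sigma_k^2,\\
& \alpha_k^2 \le p_{k,k},\quad B\log_2(1 + \gamma_k) \ge \bar{R}_k,\quad 0 \le p_{k,k} \le P_{\max},\quad \sum_{i=1}^{2K} p_{i,k} + P_R \le P_t^{\max} \Bigg\},
\end{aligned} \tag{5.17}$$

where $\tilde{\mathbf{p}}_k = [p_{1,k}, \cdots, p_{2K,k}]^T \in \mathbb{R}^{2K\times 1}$ are local variables at user $k$. The term $p_{i,k}$ can be viewed as a local copy of the power of the $i$th user available at user $k$. Each D-server, therefore, has a total of $2K$ local variables representing the transmit power of all the users.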


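These new variables are introduced to decouple the constraints (5.8c) and (5.12b). We see that none of the variables are coupled across users in any of the constraints in the local sets. We now re-cast P5 using the convex set $\mathcal{S}_k^{(n)}$ as

$$\begin{aligned}
\mathbf{P6}:\ \underset{\tilde{\mathbf{p}},\mathbf{p},\mathbf{g},\boldsymbol{\phi},\boldsymbol{\gamma},\mathbf{d},\boldsymbol{\alpha}}{\text{Maximize}}\quad & \sum_{k=1}^{2K} w_k g_k && \text{(5.18a)}\\
\text{subject to}\quad & (\tilde{\mathbf{p}}_k, g_k, \phi_k, \gamma_k, d_k, \alpha_k) \in \mathcal{S}_k^{(n)},\ \forall k \in \mathcal{K} && \text{(5.18b)}\\
& \tilde{\mathbf{p}}_k = \mathbf{p},\ \forall k \in \mathcal{K}, && \text{(5.18c)}
\end{aligned}$$

where $\tilde{\mathbf{p}} \triangleq [\tilde{\mathbf{p}}_1^T, \cdots, \tilde{\mathbf{p}}_{2K}^T]^T \in \mathbb{R}^{4K^2\times 1}$ and recall that $\mathcal{K} \triangleq \{1, \cdots, 2K\}$. The constraints in (5.18c) force the local copies to be identical and equal to the global variables, and therefore our problem P6 is in the general global consensus form [87]. The proposed decentralized algorithm solves the general global consensus problem P6 using ADMM. Let $\boldsymbol{\Omega}_k \triangleq [\tilde{\mathbf{p}}_k^T, g_k, \phi_k, \gamma_k, d_k, \alpha_k]^T \in \mathbb{R}^{(2K+5)\times 1}$, which stacks all the local variables associated with the user $k$. We next define

$$q_k(\boldsymbol{\Omega}_k) \triangleq \begin{cases} w_k g_k, & \boldsymbol{\Omega}_k \in \mathcal{S}_k^{(n)} \\ -\infty, & \text{otherwise,} \end{cases} \quad \text{where } \boldsymbol{\Omega} \triangleq [\boldsymbol{\Omega}_1^T, \cdots, \boldsymbol{\Omega}_{2K}^T]^T. \tag{5.19}$$

Based on P6 and (5.19), we write the augmented Lagrangian function of (5.18) as

$$\mathcal{L}^{(n)}\big(\{\boldsymbol{\Omega}_k\}, \mathbf{p}, \{\boldsymbol{\chi}_k\}_{\forall k \in \mathcal{K}}\big) = \sum_{k=1}^{2K}\Big(q_k(\boldsymbol{\Omega}_k) - \boldsymbol{\chi}_k^T(\tilde{\mathbf{p}}_k - \mathbf{p}) - \frac{\rho}{2}\|\tilde{\mathbf{p}}_k - \mathbf{p}\|^2\Big). \tag{5.20}$$

Here $\rho > 0$ is the penalty parameter and $\boldsymbol{\chi}_k \triangleq [\chi_{1,k}, \chi_{2,k}, \cdots, \chi_{2K,k}]^T \in \mathbb{R}^{2K\times 1}$ are the Lagrangian parameters associated with constraints (5.18c). The quadratic penalty term $\frac{\rho}{2}\|\tilde{\mathbf{p}}_k - \mathbf{p}\|^2$ penalizes the violation of the equality constraints (5.18c). The penalty term relaxes the strict convexity and finiteness conditions on the original objective function that would otherwise be required for the ADMM to converge [87]. In every iteration, the ADMM uses the Gauss-Seidel method to alternately update the global variables $\mathbf{p}$ and the local variables $\boldsymbol{\Omega}_k$ and $\boldsymbol{\chi}_k$ [87]. To calculate these updates we assume, as commonly done in the ADMM literature [87], perfect synchronization between D-servers: each D-server can update its local variables before the computation of the global variables. We next explain the proposed method for the $(l + 1)$th ADMM iteration.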

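• Local computation: The D-server corresponding to user $S_k$, by keeping the global variables and Lagrange multipliers fixed, computes the local variables $\boldsymbol{\Omega}_k$ by solving the optimization problem in (5.20), i.e.,

$$\boldsymbol{\Omega}^{(l+1)} = \arg\max_{\boldsymbol{\Omega}_k \in \mathcal{S}_k^{(n)},\, \forall k \in \mathcal{K}} \mathcal{L}^{(n)}\big(\{\boldsymbol{\Omega}_k\}, \mathbf{p}^{(l)}, \{\boldsymbol{\chi}_k^{(l)}\}\big). \tag{5.21}$$

Here $l$ denotes the ADMM iteration, and $\mathbf{p}^{(l)}$ and $\boldsymbol{\chi}_k^{(l)}$ denote the values obtained after the $l$th ADMM iteration. The augmented Lagrangian function in (5.20) is decomposable in $\boldsymbol{\Omega}_k$, and hence each user $U_k$ solves the optimization problem (5.21) in parallel as follows:

$$\boldsymbol{\Omega}_k^{(l+1)} = \arg\max_{\boldsymbol{\Omega}_k \in \mathcal{S}_k^{(n)}}\ q_k(\boldsymbol{\Omega}_k) - (\boldsymbol{\chi}_k^{(l)})^T(\tilde{\mathbf{p}}_k - \mathbf{p}^{(l)}) - \frac{\rho}{2}\|\tilde{\mathbf{p}}_k - \mathbf{p}^{(l)}\|^2. \tag{5.22}$$

• Global aggregation and computation: The C-server collects the values of the local variables from all the users, and then computes the global variable $\mathbf{p}$ by solving (5.20) with respect to $\mathbf{p}$, keeping the local variables $\boldsymbol{\Omega}_k$ and $\boldsymbol{\chi}_k$ constant and equal to the values obtained from the $(l + 1)$th iteration and $l$th iteration, respectively, as follows

$$\mathbf{p}^{(l+1)} = \arg\max_{\mathbf{p}} \mathcal{L}^{(n)}\big(\{\boldsymbol{\Omega}_k^{(l+1)}\}, \mathbf{p}, \{\boldsymbol{\chi}_k^{(l)}\}\big) = \arg\max_{\mathbf{p}} \sum_{k=1}^{2K}\Big((\boldsymbol{\chi}_k^{(l)})^T\mathbf{p} - \frac{\rho}{2}\|\tilde{\mathbf{p}}_k^{(l+1)} - \mathbf{p}\|^2\Big). \tag{5.23}$$

We observe that the problem (5.23) is quadratic and has the closed-form solution

$$\mathbf{p}^{(l+1)} = \frac{1}{2K}\sum_{k=1}^{2K}\Big[\tilde{\mathbf{p}}_k^{(l+1)} + \frac{1}{\rho}\boldsymbol{\chi}_k^{(l)}\Big]. \tag{5.24}$$

After computing $\mathbf{p}^{(l+1)}$, the C-server broadcasts it to all the D-servers for the Lagrangian multipliers update in the next step.

• Lagrangian multipliers update: The D-server corresponding to the user $S_k$ updates its Lagrangian multipliers as follows [87]

$$\boldsymbol{\chi}_k^{(l+1)} = \boldsymbol{\chi}_k^{(l)} + \rho\big(\tilde{\mathbf{p}}_k^{(l+1)} - \mathbf{p}^{(l+1)}\big),\ \forall k \in \mathcal{K}. \tag{5.25}$$

Note that $\chi_{i,k}$ is the Lagrange multiplier associated with the variable $p_{i,k}$. We see that (5.25) is decentralized, and each D-server solves it locally, and in parallel with other D-servers.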
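The three ADMM steps above can be illustrated on a toy global-consensus problem. In the sketch below, the local WSEE sub-problem (5.22), which would need a convex solver over the set in (5.17), is replaced by a hypothetical concave quadratic local objective so that the local step has a closed form. The global step and the multiplier step follow (5.24) and (5.25). All names and values are assumptions for illustration, and this is not the thesis implementation.

```python
import numpy as np

def consensus_admm(a, c, rho=1.0, num_iters=500):
    """Toy consensus ADMM: maximize sum_k -0.5*a_k*||x_k - c_k||^2 s.t. x_k = z.

    a : (M,) positive curvatures of the hypothetical local objectives
    c : (M, n) local targets, one row per distributed node
    Returns the global variable z and the primal residual norm.
    """
    M, n = c.shape
    z = np.zeros(n)                  # global variable (plays the role of p)
    chi = np.zeros((M, n))           # Lagrange multipliers chi_k
    for _ in range(num_iters):
        # Local step, cf. (5.22): closed form for the quadratic toy objective.
        x = (a[:, None] * c + rho * z - chi) / (a[:, None] + rho)
        # Global step, cf. (5.24): average of local copies plus scaled multipliers.
        z = np.mean(x + chi / rho, axis=0)
        # Multiplier step, cf. (5.25).
        chi = chi + rho * (x - z)
    residual = np.sqrt(np.sum((x - z) ** 2))   # primal residual norm
    return z, residual

# Four nodes, two shared variables (illustrative values).
a = np.array([0.5, 1.0, 2.0, 1.5])
c = np.array([[1.0, 0.0], [2.0, 1.0], [0.0, 3.0], [4.0, -1.0]])
z, residual = consensus_admm(a, c)
z_opt = (a[:, None] * c).sum(axis=0) / a.sum()   # centralized maximizer
```

In this toy setting the consensus value `z` approaches the centralized maximizer `z_opt`, which mirrors the observation that the decentralized approach attains the same WSEE as the centralized one.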

The steps in (5.22), (5.24), and (5.25) are iterated until the ADMM converges, after which we update the SCA parameters, i.e., $(\phi_k^{(n+1)}, g_k^{(n+1)}, \alpha_k^{(n+1)}, d_k^{(n+1)})$, and then again carry out the ADMM steps to solve P6 at the $(n+1)$th iteration. This process is iterated until the outer SCA loop converges. We summarize the proposed decentralized method in Algorithm 5.¹ The convergence criterion of the ADMM loop in Algorithm 5 is $\|\mathbf{r}^{(l)}\| \le \varepsilon_{ADMM}$, where $\|\mathbf{r}^{(l)}\|$ is the primal residual after the $l$th update, which is defined as $\|\mathbf{r}^{(l)}\| = \sqrt{\sum_{k=1}^{2K}\|\tilde{\mathbf{p}}_k^{(l)} - \mathbf{p}^{(l)}\|^2}$.

Remark 19. Convergence of the proposed algorithm: The decentralized Algorithm 5 uses the ADMM and SCA approaches, and their convergence determines its convergence. The convergence of the SCA approach is discussed in Remark 18. For a fixed SCA iteration, the ADMM is guaranteed to converge, and its convergence is well investigated in [87].

Remark 20. The decentralized Algorithm 5 requires exchange of information between the C-server and the $2K$ D-servers. For global computation, the C-server collects the values of $2K$ local variables and $2K$ Lagrangian multipliers from each of the $2K$ D-servers. The C-server thus collects $8K^2$ values in total. After computing the global variables, the C-server broadcasts the $2K$ global variables to all the D-servers.

Remark 21. We note that the augmented Lagrangian function in general is not decomposable [63]. We derive a solution using (5.19) which allows us to decompose the augmented Lagrangian function in (5.20) and solve each sub-problem (5.22) at the distributed nodes independently to update the local variables. The proposed approach also results in a closed-form solution for the optimization problem in (5.23).

¹ The proposed algorithm can also be applied to allocate power to the relay. The extension will, however, be non-trivial and radically change the solution, and is beyond the scope of this work. Future research can consider modifying the proposed framework to optimize the relay power along with the user powers.

Algorithm 5: Decentralized WSEE optimization algorithm for MRC/MRT processing

Input: $(\phi_k^{(1)}, g_k^{(1)}, \alpha_k^{(1)}, d_k^{(1)})$ (calculation discussed in Remark 17), $\varepsilon_{SCA}$ and $\varepsilon_{ADMM}$; choose the initial values of $(\mathbf{p}^{(1)}, \boldsymbol{\chi}^{(1)})$. Set $n = 1$ and $l = 1$.

Output: $\mathbf{p}^*$ as solution.

1 while $\|\mathbf{p}^{(n)} - \mathbf{p}^{(n-1)}\| > \varepsilon_{SCA}$ do

2   while $\|\mathbf{r}^{(l)}\| > \varepsilon_{ADMM}$ do

3     For the D-server corresponding to $S_k$, obtain $\mathcal{S}_k^{(n)}$ by substituting the values of $(\phi_k^{(n)}, g_k^{(n)}, \alpha_k^{(n)}, d_k^{(n)})$ in (5.17), and then solve (5.22) to calculate the local variables $\boldsymbol{\Omega}_k^{(l+1)}$ using $\mathcal{S}_k^{(n)}$ and $(\mathbf{p}^{(l)}, \boldsymbol{\chi}_k^{(l)})$.

4     For the C-server, collect the local variables from all the D-servers, update the global variable vector $\mathbf{p}^{(l+1)}$ using (5.24), and then broadcast it to all the $2K$ D-servers.

5     For the D-server corresponding to $S_k$, update the Lagrange multipliers $\boldsymbol{\chi}_k^{(l+1)}$ using (5.25), and update $l \leftarrow l + 1$.

6   Based on the solution $\mathbf{p}^*$ obtained above, compute for all $k \in \mathcal{K}$: $\phi_k^{(n+1)} = \big(\log_2(1 + \mathrm{SINR}_k^{\mathrm{mrc}}(\mathbf{p}^*))\big)^{1/2}$, $g_k^{(n+1)} = \dfrac{\log_2(1 + \mathrm{SINR}_k^{\mathrm{mrc}}(\mathbf{p}^*))}{\vartheta p_k^* + \bar{P}_C}$, $\alpha_k^{(n+1)} = (p_k^*)^{1/2}$, $d_k^{(n+1)} = \sum_{i=1, i\neq k'}^{2K}\big(\sigma_i^2\sigma_{k'}^2\sigma_k^2 + \sigma_i^4\sigma_{i'}^2\big)p_i^* + \sigma_{nr}^2\sigma_{k'}^2\sigma_k^2$, and $p_k^{(n+1)} = p_k^*$.

7   Update $n \leftarrow n + 1$.

8 return $\mathbf{p}^*$.

5.7 Decentralized WSEE optimization for MRC/MRT processing using SOCP

We know that the problem P5 is a GCP; its complexity can be reduced by casting it as a

SOCP [93]. This will, consequently, reduce the computational complexity of the decentral-

ized Algorithm 5. We show this later by comparing the computational complexity and the

computation time of both approaches. We begin by first showing how P5 can be cast as a

SOCP, and then later apply ADMM to decentrally obtain a locally optimal solution.

5.7.1 SOCP SCA approach

To cast P5 as a SOCP, we observe that all its constraints except (5.11c), (5.11e) and (5.12c) are linear. We therefore begin by approximating (5.11c) as a second order cone (SOC) by introducing the slack variable $\boldsymbol{\kappa} = [\kappa_1, \kappa_2, \cdots, \kappa_{2K}]^T$ as

$$\begin{aligned}
& \phi_k^2 \le \kappa_k,\ \forall k \in \mathcal{K} && \text{(5.26a)}\\
& \gamma_k\kappa_k \le \gamma_k\log_2(1 + \gamma_k),\ \forall k \in \mathcal{K}. && \text{(5.26b)}
\end{aligned}$$

We observe that the constraint (5.26a) is hyperbolic, which can be cast as a SOC by using the following fact [63]: for three scalars $w$, $x$ and $y$, the relations on the LHS and RHS of the arrow below are equivalent

$$w^2 \le xy,\ x \ge 0,\ y \ge 0 \iff \big\|[2w\ \ x - y]^T\big\| \le x + y,\ x \ge 0,\ y \ge 0. \tag{5.27}$$

Using (5.27), we equivalently cast (5.26a) and (5.12c), respectively, as

$$\big\|[2\phi_k\ \ \kappa_k - 1]^T\big\| \le \kappa_k + 1,\ \forall k \in \mathcal{K} \quad \text{and} \tag{5.28}$$

$$\big\|[2\alpha_k\ \ p_k - 1]^T\big\| \le p_k + 1,\ \forall k \in \mathcal{K}. \tag{5.29}$$

We note that the constraint (5.26b) is non-convex. The RHS of the constraint in (5.26b) i.e.,

γk log2 (1 + γk) is a convex function w.r.t γk, which can be verified by calculating its second

order derivative. We now use (5.14) to linearly approximate γk log2 (1 + γk) at a point γ(n)k

as follows

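$\gamma_k \log_2(1+\gamma_k) \ge \gamma_k^{(n)} \log_2(1+\gamma_k^{(n)}) + (\gamma_k - \gamma_k^{(n)}) \left( \frac{1}{\ln 2}\, \gamma_k^{(n)} (1+\gamma_k^{(n)})^{-1} + \log_2(1+\gamma_k^{(n)}) \right) = \gamma_k \omega_k^{(n)} - \varepsilon_k^{(n)}, \ \forall k \in \mathcal{K},$ (5.30)

where $\omega_k^{(n)} = \frac{1}{\ln 2} \frac{\gamma_k^{(n)}}{1+\gamma_k^{(n)}} + \log_2(1+\gamma_k^{(n)})$ and $\varepsilon_k^{(n)} = \frac{1}{\ln 2} \frac{(\gamma_k^{(n)})^2}{1+\gamma_k^{(n)}}$.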
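The first-order bound in (5.30) can be sanity-checked with a few lines of code. This is a hedged sketch; the function names are illustrative and not from the thesis.

```python
import math


def f(gamma):
    """The convex function gamma * log2(1 + gamma) on the RHS of (5.26b)."""
    return gamma * math.log2(1.0 + gamma)


def omega(gamma_n):
    """Slope omega^(n) of the tangent of f at the expansion point gamma_n."""
    return gamma_n / ((1.0 + gamma_n) * math.log(2)) + math.log2(1.0 + gamma_n)


def epsilon(gamma_n):
    """Offset epsilon^(n) of the tangent of f at the expansion point gamma_n."""
    return gamma_n ** 2 / ((1.0 + gamma_n) * math.log(2))


def lower_bound(gamma, gamma_n):
    """Linear lower bound gamma * omega^(n) - epsilon^(n) from (5.30)."""
    return gamma * omega(gamma_n) - epsilon(gamma_n)
```

Because f is convex for non-negative arguments, the tangent never exceeds f and touches it at the expansion point, which is the property SCA relies on.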

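Using the linear approximation obtained in (5.30), we replace the RHS of (5.26b) by its lower bound and obtain the constraint

$\gamma_k \kappa_k \le \gamma_k \omega_k^{(n)} - \varepsilon_k^{(n)} \ \Rightarrow \ \varepsilon_k^{(n)} \le \gamma_k (\omega_k^{(n)} - \kappa_k), \ \forall k \in \mathcal{K}.$ (5.31)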

We observe that the inequality obtained in (5.31) is hyperbolic. We, therefore, re-cast (5.31)

as a SOC constraint using (5.27) as follows

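$\left\| \left[ 2\sqrt{\varepsilon_k^{(n)}},\ \gamma_k - \omega_k^{(n)} + \kappa_k \right] \right\| \le \gamma_k + \omega_k^{(n)} - \kappa_k, \ \forall k \in \mathcal{K}.$ (5.32)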

The rate constraint in (5.11e) can be equivalently expressed as a linear constraint as

$1 + \gamma_k \ge 2^{R_k/B}, \ \forall k \in \mathcal{K}.$ (5.33)


The problem P5 can be cast as a SOCP as follows

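P7 : $\underset{\mathbf{p}, \mathbf{g}, \boldsymbol{\varphi}, \boldsymbol{\gamma}, \mathbf{d}, \boldsymbol{\alpha}, \boldsymbol{\kappa}}{\text{Maximize}} \ \sum_{k=1}^{2K} w_k g_k$ (5.34a)

subject to (5.8b), (5.8c), (5.16b), (5.16c), (5.12b), (5.28), (5.29), (5.32), (5.33). (5.34b)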

Even though our objective is to solve P7 decentrally, it can be centrally solved in an iterative

manner as discussed in the remark below.

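Remark 22. Central iterative procedure for SCA: Given $p_k^{(1)} = P_t^{\max}/4K$ and $(\varphi_k^{(1)}, g_k^{(1)}, \alpha_k^{(1)}, d_k^{(1)})$, whose calculation is discussed in Remark 17, set SCA iteration number n = 1 and do the following

• Solve P7 to calculate p∗.

• Compute for all k ∈ K,

$\varphi_k^{(n+1)} = \left(\log_2(1 + \mathrm{SINR}_k^{\mathrm{mrc}}(\mathbf{p}^*))\right)^{1/2}$, $g_k^{(n+1)} = \frac{\log_2(1 + \mathrm{SINR}_k^{\mathrm{mrc}}(\mathbf{p}^*))}{\vartheta p_k^* + P_C}$, $\alpha_k^{(n+1)} = (p_k^*)^{1/2}$,

$d_k^{(n+1)} = \sum_{i=1, i \neq k'}^{2K} \left(\sigma_i^2 \sigma_{k'}^2 \sigma_k^2 + \sigma_i^4 \sigma_{i'}^2\right) p_i^* + \sigma_{nr}^2 \sigma_{k'}^2 \sigma_k^2$, and $p_k^{(n+1)} = p_k^*$.

• n ← n + 1

Repeat the above process until convergence.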
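The parameter update in Remark 22 can be sketched as follows. This is an illustrative fragment under stated assumptions: the SINR values are taken as given inputs (the MRC/MRT SINR expression is not reproduced here), the update of $d_k$ is omitted because it depends on the user-pairing indices, and the function and argument names are hypothetical.

```python
import math


def sca_update(p_star, sinr, theta, p_c):
    """Return (phi, g, alpha) for the next SCA iteration.

    p_star : list of optimized user powers p_k^* from the current P7 solve
    sinr   : list of SINR_k(p^*) values, assumed to be computed elsewhere
    theta  : power-amplifier coefficient (the symbol vartheta in the text)
    p_c    : per-user circuit power P_C
    """
    phi, g, alpha = [], [], []
    for p_k, sinr_k in zip(p_star, sinr):
        se_k = math.log2(1.0 + sinr_k)        # spectral efficiency of user k
        phi.append(math.sqrt(se_k))           # phi_k = sqrt(SE_k)
        g.append(se_k / (theta * p_k + p_c))  # g_k = EE of user k
        alpha.append(math.sqrt(p_k))          # alpha_k = sqrt(p_k^*)
    return phi, g, alpha
```

With these values the updated point satisfies $\varphi_k^2 / g_k = \vartheta p_k^* + P_C$, so the next SCA iteration starts from a point at which the linearized EE constraint is tight.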

5.7.2 Decentralized ADMM approach

We now apply ADMM to the SOCP problem P7 to reduce the computational complexity of the decentralized Algorithm 5. To apply ADMM to P7, we define a local feasible set at the nth iteration of SCA for user S_k as

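$\mathcal{S}_k^{(n)} = \Big\{ \mathbf{p}_k, g_k, \varphi_k, \gamma_k, d_k, \alpha_k, \kappa_k \ \Big| \ \vartheta p_{k,k} + P_C \le \frac{2\varphi_k^{(n)}}{g_k^{(n)}} \varphi_k - \Big(\frac{\varphi_k^{(n)}}{g_k^{(n)}}\Big)^2 g_k,$

$\|[2\varphi_k,\ \kappa_k - 1]\| \le \kappa_k + 1, \quad \Big\|\Big[\gamma_k - \omega_k^{(n)} + \kappa_k,\ 2\sqrt{\varepsilon_k^{(n)}}\Big]\Big\| \le \gamma_k + \omega_k^{(n)} - \kappa_k,$

$\gamma_k \le I_k \Big( \frac{2\alpha_k^{(n)}}{d_k^{(n)}} \alpha_k - \Big(\frac{\alpha_k^{(n)}}{d_k^{(n)}}\Big)^2 d_k \Big), \quad d_k \ge \sum_{i \neq k'}^{2K} \left(\sigma_i^2 \sigma_{k'}^2 \sigma_k^2 + \sigma_i^4 \sigma_{i'}^2\right) p_{i,k} + \sigma_{nr}^2 \sigma_{k'}^2 \sigma_k^2,$

$\|[2\alpha_k,\ p_{k,k} - 1]\| \le p_{k,k} + 1, \quad 1 + \gamma_k \ge 2^{R_k/B}, \quad 0 \le p_{k,k} \le P^{\max}, \quad \sum_{i=1}^{2K} p_{i,k} + P_R \le P_t^{\max} \Big\}.$ (5.35)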

The user-k variables $\mathbf{p}_k = [p_{1,k}, \cdots, p_{2K,k}]^T \in \mathbb{R}^{2K \times 1}$ are the local copies of the global variable $\mathbf{p} = [p_1, \cdots, p_{2K}]^T \in \mathbb{R}^{2K \times 1}$. We observe that the set $\mathcal{S}_k^{(n)}$ consists only of SOC and linear constraints. We cast P7 as a global consensus problem using the set $\mathcal{S}_k^{(n)}$ as

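P8 : $\underset{\{\mathbf{p}_k\}, \mathbf{p}, \boldsymbol{\varphi}, \boldsymbol{\gamma}, \mathbf{d}, \boldsymbol{\alpha}, \boldsymbol{\kappa}}{\text{Maximize}} \ \sum_{k=1}^{2K} w_k g_k$ (5.36a)

subject to $(\mathbf{p}_k, g_k, \varphi_k, \gamma_k, d_k, \alpha_k, \kappa_k) \in \mathcal{S}_k^{(n)}, \ \forall k \in \mathcal{K}$ (5.36b)

$\mathbf{p}_k = \mathbf{p}, \ \forall k \in \mathcal{K}.$ (5.36c)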

The vector $\Omega_k \triangleq [\mathbf{p}_k^T, g_k, \varphi_k, \gamma_k, d_k, \alpha_k, \kappa_k]^T \in \mathbb{R}^{(2K+6) \times 1}$ collects all the local variables associated with the user S_k. We now briefly summarize the distributed ADMM for the SOCP at the (l + 1)th iteration as follows.

• Construct the augmented Lagrangian function $\mathcal{L}^{(n)}(\{\Omega_k\}, \mathbf{p}, \{\chi_k\}_{\forall k \in \mathcal{K}})$ of P8 in (5.36).

• D-server for user S_k calculates $\Omega_k^{(l+1)}$ by solving the convex subproblem in (5.22).

• C-server computes global variables $\mathbf{p}^{(l+1)}$ as in (5.24).

• D-server for user S_k updates the Lagrange multipliers $\chi_k^{(l+1)}$ according to (5.25).

The decentralized algorithm for the SOCP formulation is similar to Algorithm 5 with the following difference: in step-6, along with $\varphi_k^{(n+1)}$, $g_k^{(n+1)}$, $\alpha_k^{(n+1)}$ and $d_k^{(n+1)}$, update $p_k^{(n+1)} = p_k^*$, where $p_k^*$ is the locally optimal solution obtained. For the first step, $p_k^{(1)}$ is obtained by equally dividing the power between the relay and all the users, i.e., $p_k^{(1)} = P_t^{\max}/4K$.

5.7.3 Complexity of GCP and SOCP decentralized approaches

We see that the overall complexity of Algorithms 5 and 6 is dominated by the complexity of the local computation in step-3, where we solve the optimization sub-problems (5.22) and (5.48) at each user for MRC/MRT and ZFR/ZFT, respectively. The complexity of solving step-3 is the same for both MRC/MRT and ZFR/ZFT processing. The other algorithm steps, i.e., the global variable calculation in step-4 and the Lagrange multiplier calculation in step-5, can be computed in closed form and have trivial computational complexity. With the GCP approach, step-3 has a per-iteration per-user computational complexity of $O((2K + 5)^4)$ [95]. With the SOCP approach, step-3 has a per-iteration per-user computational complexity of $O((2K + 6)^3)$ [95]. Thus, the per-iteration per-user computational complexity of the proposed algorithm using SOCP is lower than that of the GCP approach. We also conclude that the per-iteration per-user computational complexity of MRC/MRT and ZFR/ZFT processing in O(·) remains the same.

Remark 23. We observe that the computational complexity is a function of the number of users being served, i.e., 2K, and the number of slack variables, which is fixed, and is equal to five and six for the GCP and SOCP approaches, respectively. For an increased number of users, the complexity of both GCP and SOCP approaches is, therefore, primarily dominated by the number of users.
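To get a feel for the two complexity orders, the polynomial terms can be evaluated for a few values of K. The snippet below is only an illustration: O(·) hides solver-dependent constants, so the ratio indicates the growth trend rather than an actual runtime ratio, and the function names are hypothetical.

```python
def gcp_cost(num_pairs):
    """Polynomial term (2K + 5)^4 of the per-iteration per-user GCP complexity."""
    return (2 * num_pairs + 5) ** 4


def socp_cost(num_pairs):
    """Polynomial term (2K + 6)^3 of the per-iteration per-user SOCP complexity."""
    return (2 * num_pairs + 6) ** 3


if __name__ == "__main__":
    for K in (2, 5, 10, 20):
        ratio = gcp_cost(K) / socp_cost(K)
        print(f"K={K:2d}  GCP term={gcp_cost(K):8d}  SOCP term={socp_cost(K):7d}  ratio={ratio:.1f}")
```

For K = 5, for example, the GCP term is 15^4 = 50625 while the SOCP term is 16^3 = 4096.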

5.8 Decentralized WSEE optimization for ZFR/ZFT processing using SOCP

We now design a decentralized algorithm to optimize WSEE for the ZFR/ZFT processing. We note that the WSEE problem P1 in (5.8) for the ZFR/ZFT relay processing is centrally optimized in [90]. We also note that the central solution of P1 for the ZFR/ZFT relay processing is much easier to derive than that for the MRC/MRT processing considered earlier because of its simple SINR expression (see (5.5)). We first briefly summarize the central solution from [90] here and later show how to decentralize it. We reproduce problem P1 for the sake of convenience.

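P1 : $\underset{\mathbf{p}}{\text{Maximize}} \ \sum_{k=1}^{2K} w_k \frac{\log_2(1 + \mathrm{SINR}_k^{\zeta}(\mathbf{p}))}{\vartheta p_k + P_C}$ (5.37a)

subject to $0 \le p_k \le P^{\max}, \ \forall k \in \mathcal{K}$ (5.37b)

$\sum_{k=1}^{2K} p_k + P_R \le P_t^{\max}, \ \forall k \in \mathcal{K}$ (5.37c)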

Rk ≥ Rk, ∀k ∈ K. (5.37d)

We first introduce the slack variables $\mathbf{t} = [t_1, \cdots, t_{2K}]^T$ and $\boldsymbol{\lambda} = [\lambda_1, \cdots, \lambda_{2K}]^T$ to cast the problem P1 in the epigraph form, and then approximate the non-convex constraint $\vartheta p_k + P_C \le \lambda_k^2/t_k$, so obtained, by replacing its RHS by its first-order Taylor approximation. Here the slack variables $t_k$ and $\lambda_k^2$ denote the EE and SE of the $k \to k'$ link, respectively. The approximation is then iteratively improved using the SCA approach. The approximated convex program at the nth iteration is

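P9 : $\underset{\mathbf{p}, \mathbf{t}, \boldsymbol{\lambda}}{\text{Maximize}} \ \sum_{k=1}^{2K} w_k t_k$ (5.38a)

subject to $\vartheta p_k + P_C \le \frac{2\lambda_k^{(n)}}{t_k^{(n)}} \lambda_k - \Big(\frac{\lambda_k^{(n)}}{t_k^{(n)}}\Big)^2 t_k, \ \forall k \in \mathcal{K}$ (5.38b)

$\lambda_k^2 \le \log_2(1 + \mathrm{SINR}^{\mathrm{zf}}(p_k)), \ \forall k \in \mathcal{K}$ (5.38c)

(5.37b), (5.37c), (5.37d). (5.38d)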
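The RHS of (5.38b) is the first-order Taylor expansion of the quadratic-over-linear function $\lambda_k^2/t_k$ around $(\lambda_k^{(n)}, t_k^{(n)})$. The sketch below, with illustrative function names, checks numerically that it is a global lower bound that is tight at the expansion point.

```python
def ratio(lam, t):
    """Quadratic-over-linear function lam^2 / t, convex for t > 0."""
    return lam * lam / t


def surrogate(lam, t, lam_n, t_n):
    """First-order Taylor expansion of lam^2 / t at (lam_n, t_n), as in (5.38b)."""
    r = lam_n / t_n
    return 2.0 * r * lam - r * r * t


def gap(lam, t, lam_n, t_n):
    """Gap between the function and its surrogate; equals (lam - r*t)^2 / t >= 0."""
    return ratio(lam, t) - surrogate(lam, t, lam_n, t_n)
```

Since the gap is non-negative, any point that satisfies (5.38b) also satisfies the original non-convex constraint, so each SCA iterate remains feasible for the original problem.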

Problem P9 is convex as the objective is linear and all the constraints are either linear or convex. It can thus be iteratively solved, similar to P5, centrally at the C-server, as discussed in Remark 16. We now approximate P9 as a SOCP to reduce its computational complexity. We observe that all the constraints in P9 except (5.37d) and (5.38c) are linear. The constraint (5.38c) can be cast as a SOC. We begin this process by introducing new slack variables $\boldsymbol{\xi} = [\xi_1, \cdots, \xi_{2K}]^T$ and by using (5.27):

‖[2λk ξk − 1]‖ ≤ ξk + 1, ∀k ∈ K (5.39a)

pkξk ≤ pk log2 (1 + akpk) , ∀k ∈ K, (5.39b)

where $a_k = (N - 2K - 1)\sigma_k^2/\sigma_{nr}^2$. Now the constraint (5.39b) is a non-convex constraint, but the RHS of the constraint is convex in $p_k$. We use (5.14) to linearly approximate $p_k \log_2(1 + a_k p_k)$ at the point $p_k^{(n)}$ and use (5.27) to cast it as a SOC

$\Big\|\Big[p_k - \nu_k^{(n)} + \xi_k,\ 2\sqrt{\mu_k^{(n)}}\Big]\Big\| \le p_k + \nu_k^{(n)} - \xi_k, \ \forall k \in \mathcal{K},$ (5.40)

where $\nu_k^{(n)} = \frac{1}{\ln 2} \frac{a_k p_k^{(n)}}{1 + a_k p_k^{(n)}} + \log_2(1 + a_k p_k^{(n)})$ and $\mu_k^{(n)} = \frac{1}{\ln 2} \frac{a_k (p_k^{(n)})^2}{1 + a_k p_k^{(n)}}$. Also, the constraint in (5.37d) is convex and can be written as a linear constraint

$1 + a_k p_k \ge 2^{R_k/B}, \ \forall k \in \mathcal{K}.$ (5.41)

The original problem P1 can therefore be cast as a SOCP

P10 : $\underset{\mathbf{p}, \mathbf{t}, \boldsymbol{\lambda}, \boldsymbol{\xi}}{\text{Maximize}} \ \sum_{k=1}^{2K} w_k t_k$ (5.42a)

subject to (5.37b), (5.37c), (5.38b), (5.39a), (5.40), (5.41). (5.42b)

We can, similar to P7, centrally solve P10 using the SCA method, as discussed in Remark 22.

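We next use ADMM to solve the SOCP in P10. We define a feasible convex set for user S_k as

$\mathcal{X}_k^{(n)} = \Big\{ \mathbf{p}_k, t_k, \lambda_k, \xi_k \ \Big| \ \vartheta p_{k,k} + P_C \le \frac{2\lambda_k^{(n)}}{t_k^{(n)}} \lambda_k - \Big(\frac{\lambda_k^{(n)}}{t_k^{(n)}}\Big)^2 t_k, \quad \|[2\lambda_k,\ \xi_k - 1]\| \le \xi_k + 1,$

$1 + a_k p_{k,k} \ge 2^{R_k/B}, \quad \Big\|\Big[p_{k,k} - \nu_k^{(n)} + \xi_k,\ 2\sqrt{\mu_k^{(n)}}\Big]\Big\| \le p_{k,k} + \nu_k^{(n)} - \xi_k,$

$0 \le p_{k,k} \le P^{\max}, \quad \sum_{i=1}^{2K} p_{i,k} + P_R \le P_t^{\max} \Big\}.$ (5.43)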


The variables $\mathbf{p}_k = [p_{1,k}, \cdots, p_{2K,k}]^T \in \mathbb{R}^{2K \times 1}$ are local copies at user k of the corresponding global variables $\mathbf{p} = [p_1, \cdots, p_{2K}]^T \in \mathbb{R}^{2K \times 1}$. We observe that all the above constraints in the set $\mathcal{X}_k^{(n)}$ are either SOCs or linear. We now reformulate P10 as a global consensus problem as

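P11 : $\underset{\{\mathbf{p}_k\}, \mathbf{p}, \mathbf{t}, \boldsymbol{\lambda}, \boldsymbol{\xi}}{\text{Maximize}} \ \sum_{k=1}^{2K} w_k t_k$ (5.44a)

subject to $(\mathbf{p}_k, t_k, \lambda_k, \xi_k) \in \mathcal{X}_k^{(n)}, \ \forall k \in \mathcal{K}$ (5.44b)

$\mathbf{p}_k = \mathbf{p}, \ \forall k \in \mathcal{K}.$ (5.44c)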

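Here the stacked vector $[\mathbf{p}_1^T, \cdots, \mathbf{p}_{2K}^T]^T \in \mathbb{R}^{4K^2 \times 1}$ collects the local copies of all users, $\mathbf{t} = [t_1, \cdots, t_{2K}]$ and $\boldsymbol{\lambda} = [\lambda_1, \cdots, \lambda_{2K}]$. Now, we define $\Theta_k \triangleq [\mathbf{p}_k^T, t_k, \lambda_k, \xi_k]^T \in \mathbb{R}^{(2K+3) \times 1}$, where $\Theta_k$ collects all local variables associated with the user S_k. Also, we define a new function as

$q_k(\Theta_k) = \begin{cases} w_k t_k, & \Theta_k \in \mathcal{X}_k^{(n)} \\ -\infty, & \text{otherwise}, \end{cases}$ (5.45)

which takes the value $-\infty$ outside the feasible set because P11 is a maximization problem. With $\Theta \triangleq [\Theta_1^T, \cdots, \Theta_{2K}^T]^T$, the augmented Lagrangian function of (5.44) is written as

$\mathcal{L}^{(n)}(\{\Theta_k\}, \mathbf{p}, \{\varsigma_k\}_{\forall k \in \mathcal{K}}) = \sum_{k=1}^{2K} \Big( q_k(\Theta_k) - (\varsigma_k)^T (\mathbf{p}_k - \mathbf{p}) - \frac{\rho}{2} \|\mathbf{p}_k - \mathbf{p}\|^2 \Big).$ (5.46)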

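Here ρ > 0 is the penalty parameter, and $\varsigma_k \triangleq [\varsigma_{1,k}, \cdots, \varsigma_{2K,k}]^T \in \mathbb{R}^{2K \times 1}$ are the Lagrange multipliers associated with constraints (5.44c). The steps for computing local variables, global variables and Lagrange multipliers at iteration (l + 1) are as follows

$\Theta^{(l+1)} = \arg\max_{\Theta_k \in \mathcal{X}_k^{(n)}, \forall k \in \mathcal{K}} \ \mathcal{L}^{(n)}(\Theta_k, \mathbf{p}^{(l)}, \varsigma_k^{(l)}),$ (5.47a)

$\mathbf{p}^{(l+1)} = \arg\max_{\mathbf{p}} \ \mathcal{L}^{(n)}(\Theta_k^{(l+1)}, \mathbf{p}, \varsigma_k^{(l)}),$ (5.47b)

$\varsigma_k^{(l+1)} = \varsigma_k^{(l)} + \rho (\mathbf{p}_k^{(l+1)} - \mathbf{p}^{(l+1)}), \ \forall k \in \mathcal{K}.$ (5.47c)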


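The step in (5.47a) can be written as

$\Theta_k^{(l+1)} = \arg\max_{\Theta_k \in \mathcal{X}_k^{(n)}} \ q_k(\Theta_k) - (\varsigma_k^{(l)})^T (\mathbf{p}_k - \mathbf{p}^{(l)}) - \frac{\rho}{2} \|\mathbf{p}_k - \mathbf{p}^{(l)}\|^2.$ (5.48)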

We observe that each D-server can solve the problem (5.48) and also update the Lagrangian

multipliers in parallel. Also, the C-server computes global variables p(l+1) in step (5.47b).

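We can write the equation in (5.47b) as

$\mathbf{p}^{(l+1)} = \arg\max_{\mathbf{p}} \sum_{k=1}^{2K} \Big( (\varsigma_k^{(l)})^T \mathbf{p} - \frac{\rho}{2} \|\mathbf{p}_k^{(l+1)} - \mathbf{p}\|^2 \Big) = \frac{1}{2K} \sum_{k=1}^{2K} \Big[ \mathbf{p}_k^{(l+1)} + \frac{1}{\rho} \varsigma_k^{(l)} \Big].$ (5.49)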

The steps in (5.48), (5.49) and (5.47c) are repeated until the procedure converges, after which we update the relevant SCA parameters, i.e., $\lambda_k^{(n+1)}$, $t_k^{(n+1)}$, $p_k^{(n+1)}$, and then repeat the ADMM steps to solve P11 at the (n + 1)th iteration. The procedure is repeated until the outer loop converges. The decentralized approach for the ZFR/ZFT processing is summarized in Algorithm 6, whose convergence criterion is the same as that of Algorithm 5, i.e., $\|\mathbf{r}^{(l)}\| \le \epsilon_{ADMM}$, where $\|\mathbf{r}^{(l)}\|$ is the primal residual after the lth update, defined as $\|\mathbf{r}^{(l)}\| = \sqrt{\sum_{k=1}^{2K} \|\mathbf{p}_k^{(l)} - \mathbf{p}^{(l)}\|^2}$.
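The structure of the updates (5.48), (5.49) and (5.47c) can be illustrated on a toy consensus problem. The sketch below is not the thesis algorithm: it replaces the SOCP-constrained local objective $q_k$ by an unconstrained concave quadratic, $-\frac{1}{2}\|\mathbf{p}_k - \mathbf{c}_k\|^2$ with hypothetical targets $\mathbf{c}_k$, so that the local step has a closed form and no convex solver is needed. The global update, the multiplier update and the primal-residual stopping rule follow the equations above.

```python
import math


def admm_consensus(targets, rho=1.0, eps_admm=1e-6, max_iter=1000):
    """Toy consensus ADMM: maximize sum_k -0.5*||p_k - c_k||^2 s.t. p_k = p."""
    num_users, dim = len(targets), len(targets[0])
    p_global = [0.0] * dim
    duals = [[0.0] * dim for _ in range(num_users)]
    residual, iteration = float("inf"), 0
    for iteration in range(1, max_iter + 1):
        # Local step (analogue of (5.48)), solved in closed form at each D-server.
        local = [
            [(c[j] - s[j] + rho * p_global[j]) / (1.0 + rho) for j in range(dim)]
            for c, s in zip(targets, duals)
        ]
        # Global step (5.49): average of local copies shifted by scaled multipliers.
        p_global = [
            sum(local[k][j] + duals[k][j] / rho for k in range(num_users)) / num_users
            for j in range(dim)
        ]
        # Multiplier step (5.47c).
        duals = [
            [s[j] + rho * (pk[j] - p_global[j]) for j in range(dim)]
            for pk, s in zip(local, duals)
        ]
        # Primal residual ||r^(l)|| used as the stopping criterion.
        residual = math.sqrt(
            sum((pk[j] - p_global[j]) ** 2 for pk in local for j in range(dim))
        )
        if residual <= eps_admm:
            break
    return p_global, residual, iteration
```

For this toy objective the consensus optimum is the average of the targets, which gives a simple way to check the three update steps before swapping in a solver-based local step.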

Remark 24. Calculation of initial feasible points $\lambda_k^{(1)}$ and $t_k^{(1)}$: We first divide the peak transmit power equally between all 2K users and the relay, i.e., $p_k = P_t^{\max}/4K, \ \forall k \in \mathcal{K}$ and $P_R = P_t^{\max}/2$. The initial value of $p_k^{(n)}$ is calculated as $p_k^{(1)} = P_t^{\max}/4K$. The values of $\lambda_k^{(1)}$ and $t_k^{(1)}$ are computed, similar to Remark 17, by replacing the inequalities in the constraints $\vartheta p_k + P_C \le \lambda_k^2/t_k$ and (5.38c) by equalities.
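The initialization in Remark 24 can be sketched as follows. The sketch assumes that the ZFR/ZFT SINR of user k equals $a_k p_k$, which is how it appears in (5.39b) and (5.41); the function and argument names are illustrative.

```python
import math


def initial_point(p_t_max, num_pairs, a, theta, p_c):
    """Compute (p^(1), lambda^(1), t^(1)) by making the constraints tight.

    p_t_max   : maximum total transmit power P_t^max
    num_pairs : number of user pairs K (there are 2K users)
    a         : list of the 2K coefficients a_k, assuming SINR_k = a_k * p_k
    theta     : power-amplifier coefficient (the symbol vartheta in the text)
    p_c       : per-user circuit power P_C
    """
    p1 = p_t_max / (4 * num_pairs)  # equal split of P_t^max / 2 among 2K users
    lam1, t1 = [], []
    for a_k in a:
        se_k = math.log2(1.0 + a_k * p1)       # (5.38c) with equality
        lam1.append(math.sqrt(se_k))           # lambda_k^2 = SE_k
        t1.append(se_k / (theta * p1 + p_c))   # theta*p + P_C = lambda^2 / t
    return p1, lam1, t1
```

By construction the returned point satisfies both constraints with equality, so it lies in the feasible set of the first SCA iteration whenever the power and QoS constraints hold at the equal-power allocation.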


We now numerically investigate the performance of the proposed decentralized WSEE optimization algorithms. For this study, the noise variances are set as $\sigma_{nr}^2 = \sigma_n^2 = \sigma^2$, and we define $\eta = P_t^{\max}/\sigma^2$, where $P_t^{\max}$ is the maximum total transmit power of the system. We also fix the relay transmit power as $P_R = P_t^{\max}/2$ and QoS $R_k$ = 2 Mbits/sec. We model the


Algorithm 6: Decentralized WSEE optimization algorithm for ZFR/ZFT processing

Input: Given $(\lambda_k^{(1)}, t_k^{(1)}, p_k^{(1)})$ (calculation discussed in Remark 24), $\epsilon_{SCA}$ and $\epsilon_{ADMM}$, choose the initial values for $(\mathbf{p}^{(1)}, \varsigma^{(1)})$. Set n = 1 and l = 1.

Output: p∗ as solution.

1 while $\|\mathbf{p}^{(n)} - \mathbf{p}^{(n-1)}\| > \epsilon_{SCA}$ do

2     while $\|\mathbf{r}^{(l)}\| > \epsilon_{ADMM}$ do

3         For D-server corresponding to S_k, first obtain $\mathcal{X}_k^{(n)}$ by substituting the values of $(\lambda_k^{(n)}, t_k^{(n)}, p_k^{(n)})$ in (5.43) and then solve (5.48) to obtain local variables $\Theta_k^{(l+1)}$ using $\mathcal{X}_k^{(n)}$ and $(\mathbf{p}^{(l)}, \varsigma_k^{(l)})$.

4         For the C-server, collect the local variables from all the D-servers, update the global variable vector $\mathbf{p}^{(l+1)}$ using (5.49), and then broadcast it to all 2K users.

5         For D-server corresponding to S_k, update Lagrange multipliers $\varsigma_k^{(l+1)}$ using (5.47c).

6     Based on the solution p∗ obtained above, compute for all k ∈ K, $\lambda_k^{(n+1)} = \left(\log_2(1 + \mathrm{SINR}_k(\mathbf{p}^*))\right)^{1/2}$, $t_k^{(n+1)} = \frac{\log_2(1 + \mathrm{SINR}_k(\mathbf{p}^*))}{\vartheta p_k^* + P_C}$, $p_k^{(n+1)} = p_k^*$.

7     Update l ← l + 1 and n ← n + 1.

8 return p∗.


Table 5.2: System parameters.

Parameter                                   Value
Coverage area                               0.25 × 0.25 km²
Median channel gain, Υ                      −148.1 dB
Minimum distance, d_min                     35 m
Path-loss exponent, α                       3.76
Shadow fading (standard deviation), σ_sf    10 dB
Fixed power, P_FIX                          18 W
Power for relay LO, P_LO                    2 W
Power per relay antenna, P_REL              1 W
Power per user antenna, P_UE                0.1 W
Relay computation efficiency, η_REL         12.8 Gflops/W

mMIMO HD relay system under realistic conditions [96], where we assume that the relay is located at the center of a square, and the users are located randomly at a distance no less than $d_{\min}$. The large scale fading (LSF) coefficients follow the path-loss model [96]: $\sigma_{g,k}^2 \,[\mathrm{dB}] = \Upsilon - 10\alpha \log_{10}\left(\frac{d_k}{1\ \mathrm{km}}\right) + F_{g,k}$, where Υ denotes the median channel gain at a reference distance of 1 km, α is the path-loss exponent, $d_k$ represents the separation distance between the relay and the user S_k, and $F_{g,k} \sim \mathcal{CN}(0, \sigma_{sf}^2)$ is the shadow fading term which models the log-normal random variations around the transmit and receive antennas of each user S_k. The LSF coefficients are normalized with respect to the total receiver noise power of $\sigma^2$ = −94 dB (which includes thermal noise and a noise figure of 7 dB in the receiver hardware). We fix the coherence interval to be T = 200 symbols, training length τ = 2K, pilot power $P_\rho$ = 20 dB, transmit power of users as $P^{\max}$ = 30 dB and system bandwidth B = 20 MHz. The large-scale coefficients (normalized with receiver noise) obtained by taking one snapshot of the practical setup are

D = diag [0.7011 0.6710 0.6622 1.5183 1.0932 1.8997 0.5793 0.8642 1.2322 1.4835] .
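The path-loss model and the user drop can be sketched as below, using the values of Table 5.2. The sketch makes two assumptions that are not stated in the text: the shadow-fading term is drawn as a real Gaussian variable in dB, and users are dropped uniformly in the square with the relay at its center. The function names are illustrative, and the sketch is not meant to reproduce the snapshot matrix D above.

```python
import math
import random

UPSILON_DB = -148.1    # median channel gain at 1 km (Table 5.2)
PATHLOSS_EXP = 3.76    # path-loss exponent alpha
SHADOW_STD_DB = 10.0   # shadow-fading standard deviation sigma_sf
NOISE_DB = -94.0       # receiver noise power used for normalization
D_MIN_KM = 0.035       # minimum relay-user distance (35 m)
HALF_SIDE_KM = 0.125   # half the side of the 0.25 km x 0.25 km square


def lsf_db(distance_km, shadow_db=0.0):
    """Large-scale fading coefficient in dB for a user at distance_km."""
    return UPSILON_DB - 10.0 * PATHLOSS_EXP * math.log10(distance_km) + shadow_db


def normalized_lsf(distance_km, shadow_db=0.0):
    """Linear-scale LSF coefficient normalized by the receiver noise power."""
    return 10.0 ** ((lsf_db(distance_km, shadow_db) - NOISE_DB) / 10.0)


def draw_user_distance(rng):
    """Drop a user uniformly in the square, rejecting points closer than d_min."""
    while True:
        x = rng.uniform(-HALF_SIDE_KM, HALF_SIDE_KM)
        y = rng.uniform(-HALF_SIDE_KM, HALF_SIDE_KM)
        d = math.hypot(x, y)
        if d >= D_MIN_KM:
            return d


def draw_lsf(num_users, seed=0):
    """Draw one snapshot of normalized LSF coefficients for num_users users."""
    rng = random.Random(seed)
    return [
        normalized_lsf(draw_user_distance(rng), rng.gauss(0.0, SHADOW_STD_DB))
        for _ in range(num_users)
    ]
```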


The weights are chosen as $\{w_k\}_{k=1}^{2K} = \{0.17, 0.03, 0.08, 0.02, 0.12, 0.15, 0.05, 0.14, 0.07, 0.17\}$, which satisfies $\sum_{k=1}^{2K} w_k = 1$. The main system parameters are tabulated in Table 5.2.

5.9.1 Achievable WSEE

We investigate the WSEE achieved using the decentralized GCP and SOCP algorithms, and compare it with the equal power allocation (EPA) scheme, which allocates equal power to all the 2K users out of the total power budget of $P_t^{\max}/2$, i.e., $p_k = P_t^{\max}/4K, \ \forall k \in \mathcal{K}$. For the proposed Algorithm 5, we set i) the initial values of the global variables $\mathbf{p}^{(1)}$ and the Lagrange multipliers $\chi_k^{(1)}$ to zero; ii) $\epsilon_{SCA} = \epsilon_{ADMM} = 0.01$; and iii) the penalty parameter ρ = 0.25 (0.1) for GCP (SOCP) with MRC/MRT processing and ρ = 0.005 (0.01) for GCP (SOCP) with ZFR/ZFT processing. These ρ values are chosen after extensive simulations, which are discussed later in Section 5.9.3.

[Figure: WSEE (Mbits/Joule) versus η (in dB); curves: ADMM-GCP, ADMM-SOCP, SRM, EPA, RPA.]

Fig. 5.41: WSEE comparison of the proposed decentralized GCP and SOCP algorithms versus $\eta = P_t^{\max}/\sigma^2$ for MRC/MRT for K = 5 and N = 64.

We compare in Fig. 5.41 to Fig. 5.43 the WSEE achieved by the proposed algorithms with i) the equal-power allocation (EPA) scheme in [29], which allocates equal power to all the 2K users;


[Figure: WSEE (Mbits/Joule) versus η (in dB); curves: ADMM-GCP, ADMM-SOCP, SRM, EPA, RPA.]

Fig. 5.42: WSEE comparison of the proposed decentralized GCP and SOCP algorithms versus $\eta = P_t^{\max}/\sigma^2$ for ZFR/ZFT for K = 5 and N = 64.

[Figure: WSEE (Mbits/Joule) versus number of user-pairs, K; curves: ADMM-GCP, ADMM-SOCP, SRM, EPA, RPA, for MRC/MRT and ZFR/ZFT.]

Fig. 5.43: WSEE comparison of the proposed decentralized GCP and SOCP algorithms versus K, comparing both MRC/MRT and ZFR/ZFT for N = 64 and η = 20 dB.


ii) the sum-rate maximization (SRM) scheme in [30], which optimally allocates power to maximize the sum rate; and iii) the random power allocation (RPA) scheme, where power is randomly allocated to satisfy the constraint (5.8c). We plot, in Fig. 5.41 and Fig. 5.42, the WSEE versus SNR η for the MRC/MRT and the ZFR/ZFT processing, respectively, with K = 5 user-pairs and N = 64 relay antennas. We observe that both GCP and SOCP decentralized ADMM algorithms yield the same WSEE. This applies to both MRC/MRT and ZFR/ZFT processing.

We observe that the proposed algorithms, for SNR η ≥ −5 dB, perform better than the SRM, EPA and RPA schemes. We specifically see in Fig. 5.41 that for a low SNR value of η = 0 dB, the proposed algorithms achieve an 11% WSEE improvement over the SRM scheme. We note that for SNR η ≤ 8 dB for MRC/MRT and η ≤ 12 dB for ZFR/ZFT, the WSEE optimization, in fact, optimizes the sum-rate [26], which increases with η. The improvement over SRM is mainly because of the priority weights. For η > 8 dB for MRC/MRT and η > 12 dB for ZFR/ZFT, the situation changes and we observe that the

• WSEE achieved by the SRM algorithm degrades. This is because the increase in sum-rate is not proportional to the power expended to achieve that increase.

• WSEE achieved by the proposed algorithms remains constant. This is because an SNR of η = 8 dB for MRC/MRT (η = 12 dB for ZFR/ZFT) attains the maximum WSEE. Any further increase in the transmit power would only decrease the achieved WSEE. The proposed algorithms therefore do not expend any extra energy.

We also observe that both EPA and RPA perform worse than the proposed algorithms. This is because both these schemes, similar to the SRM algorithm, keep using the available power beyond η = 8 dB for MRC/MRT (η = 12 dB for ZFR/ZFT), and hence their WSEE reduces.

We next plot in Fig. 5.43 the WSEE for both MRC/MRT and ZFR/ZFT designs by varying the number of user-pairs K and by fixing N = 64 and η = 20 dB. We see that the proposed algorithms yield a higher WSEE than the SRM [30], EPA [29] and RPA schemes for both MRC/MRT and ZFR/ZFT processing. We also note that the SRM scheme achieves better WSEE than the EPA and RPA schemes for the MRC/MRT processing, and that the WSEE decreases for K > 5 and K > 8 for the MRC/MRT and the ZFR/ZFT processing, respectively. This is because with an increase in K, the per-user transmit power, and consequently the SNR of each user, reduces. Also, we observe from (3.2) that the total power consumption increases with K, which further reduces the WSEE. We also observe that the WSEE for the MRC/MRT processing decreases faster than its ZFR/ZFT counterpart. This is because the inter-pair interference increases with K, and the MRC/MRT processing completely ignores this interference in its design.

5.9.2 Numerical differences between WSEE and GEE optimization

We now numerically evaluate the difference between the GEE and the WSEE metrics to demonstrate the flexibility offered by the latter. We know that the weights in the WSEE definition in (9) can be chosen to prioritize the EE of different users. We now numerically demonstrate this by considering a system with K = 2 user-pairs and MRC/MRT processing, with large-scale fading matrix D : 0.749, 0.545, 0.146, 0.111 and by fixing η = 20 dB. The values of D are chosen such that the first user-pair has a much better channel. We plot in Fig. 5.44-5.46 the EE corresponding to individual users and the GEE achieved by the following resource allocation policies: i) $p_k$ obtained from WSEE maximization using Algorithm 5; ii) $p_k$ obtained from GEE maximization using [92]. We first plot in Fig. 5.44 and Fig. 5.45 the EE of individual users versus the number of relay antennas N by optimizing GEE and WSEE, respectively. We observe from Fig. 5.44 that $EE_{user-1} > EE_{user-2} > EE_{user-3} > EE_{user-4}$. This is because GEE optimization assigns power proportional to the user channel strength, and therefore User-1, with the maximum large-scale gain, has the highest EE, while User-4 has the lowest EE.

We show that with the WSEE optimization, we can prioritize the EE of the weaker users, User-3 and User-4. To achieve this aim, we optimize WSEE and allocate them higher weights as follows: $w_1 = 0.05$, $w_2 = 0.15$, $w_3 = 0.4$, $w_4 = 0.4$. We now observe from Fig. 5.45 that


[Figure: EE (Mbits/Joule) of each user; curves: User-1, User-2, User-3, User-4.]

Fig. 5.44: EE of each user versus N, when GEE is maximized for D : 0.749, 0.445, 0.246, 0.111, weights $w_1 = 0.05$, $w_2 = 0.15$, $w_3 = 0.4$, $w_4 = 0.4$, K = 5 and η = 20 dB and MRC/MRT processing.

[Figure: EE (Mbits/Joule) of each user; curves: User-1, User-2, User-3, User-4.]

Fig. 5.45: EE of each user versus N, when WSEE is maximized for D : 0.749, 0.445, 0.246, 0.111, weights $w_1 = 0.05$, $w_2 = 0.15$, $w_3 = 0.4$, $w_4 = 0.4$, K = 5 and η = 20 dB and MRC/MRT processing.

[Figure: GEE (Mbits/Joule); curves: GEE maximization, WSEE maximization.]

Fig. 5.46: Comparison of GEE obtained on maximizing WSEE and GEE for D : 0.749, 0.445, 0.246, 0.111, weights $w_1 = 0.05$, $w_2 = 0.15$, $w_3 = 0.4$, $w_4 = 0.4$, K = 5 and η = 20 dB and MRC/MRT processing.

$EE_{user-3} > EE_{user-4} > EE_{user-2} > EE_{user-1}$. We see that as we assign a higher weight to a particular user, its EE also changes, and therefore the users with smaller large-scale gains can be assigned higher weights in order to increase their EE. We, therefore, conclude that the EE priorities of different users can be achieved by selecting appropriate weights.

We next analyze in Fig. 5.46 how optimizing WSEE with the above weights impacts the GEE. We see that the GEE obtained by optimizing WSEE is lower. This is because the WSEE optimization prioritizes the EE of weaker users. The WSEE optimization can also yield a GEE similar to the case when GEE is maximized, by assigning more weight to the users with stronger channels.
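The difference between the two metrics can be made concrete with a small calculation. The sketch below uses the per-user EE from (5.37a) for the WSEE and, as a simplifying assumption, takes the GEE as the ratio of the sum spectral efficiency to the sum of the per-user power terms; the GEE definition used in the thesis may include further power terms (for example, the relay power) that are not modeled here. All names and numbers are illustrative.

```python
import math


def user_ee(sinr_k, p_k, theta, p_c):
    """Per-user EE: log2(1 + SINR_k) / (theta * p_k + P_C), as in (5.37a)."""
    return math.log2(1.0 + sinr_k) / (theta * p_k + p_c)


def wsee(sinr, p, weights, theta, p_c):
    """Weighted sum of the individual EEs."""
    return sum(w * user_ee(s, pk, theta, p_c) for w, s, pk in zip(weights, sinr, p))


def gee(sinr, p, theta, p_c):
    """Simplified GEE: total spectral efficiency over total consumed power."""
    total_se = sum(math.log2(1.0 + s) for s in sinr)
    total_power = sum(theta * pk + p_c for pk in p)
    return total_se / total_power
```

In this simplified form the GEE is a single ratio and is insensitive to how the EE is distributed among users, whereas the WSEE changes with the weights even when the powers and SINRs are fixed.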

5.9.3 Convergence of the decentralized algorithms

We investigate the convergence of the proposed decentralized GCP and SOCP algorithms.

Reference [97] showed that for the ADMM algorithms considered herein i) a large value

of ρ is required to ensure their convergence; and ii) a small ρ value is required for faster


[Figure: WSEE (Mbits/Joule) versus total number of iterations, L; curves: ρ = 0.25, ρ = 1.]

Fig. 5.47: WSEE versus total number of iterations, L, for the proposed decentralized GCP algorithm with MRC/MRT processing.

convergence. We plot in Fig. 5.47 and Fig. 5.48 the WSEE versus the total number of iterations L of the proposed decentralized GCP algorithm for the MRC/MRT and the ZFR/ZFT processing, respectively. For this study, we fix N = 64, K = 5 and η = 20 dB.

We observe from Fig. 5.47 that the MRC/MRT design requires ρ = 0.25 for convergence, and that increasing its value reduces the convergence rate: ρ = 0.25 requires 50 iterations whereas ρ = 1 requires 90 iterations. Though not shown here, the algorithm does not converge for ρ < 0.25. We, therefore, note that the choice of the penalty parameter ρ affects the convergence. We similarly observe from Fig. 5.48 that for the ZFR/ZFT design, a minimum value of ρ = 0.005 is required for convergence, and that increasing its value reduces the convergence rate. We see that ρ = 0.005 requires 30 iterations whereas ρ = 0.1 requires 80 iterations.

We next investigate in Fig. 5.49 the convergence of the proposed decentralized SOCP algorithm for the MRC/MRT processing. We fix N = 64, K = 5 and η = 20 dB. We see that for ρ = 0.1 we require 60 iterations, whereas for ρ = 1 we require more than


[Figure: WSEE (Mbits/Joule) versus total number of iterations, L; curves: ρ = 0.005, ρ = 0.05, ρ = 0.1.]

Fig. 5.48: WSEE versus total number of iterations, L, for the proposed decentralized GCP algorithm with ZFR/ZFT processing.

[Figure: WSEE versus total number of iterations, L; curves: ρ = 0.1, ρ = 1.]

Fig. 5.49: WSEE versus total number of iterations, L, for the proposed decentralized SOCP algorithm with MRC/MRT processing.


100 iterations. We also observe that, similar to the GCP formulation, an increase in the ρ value delays the convergence. Though not plotted here (due to lack of space), ρ = 0.01 is required for the convergence of ZFR/ZFT processing. The above analysis helps us conclude that the distributed algorithms converge faster as the penalty parameter decreases, provided that it stays above the minimum value required for convergence.

5.9.4 Run time comparison of GCP and SOCP decentralized algorithms

We compare in Fig. 5.50 and Fig. 5.51 the average runtime of the decentralized GCP and SOCP algorithms for the MRC/MRT (discussed in Algorithm 5) and ZFR/ZFT (discussed in Algorithm 6) designs, respectively. For this study, we fix $\eta = P_t^{\max}/\sigma^2$ = 20 dB and vary the number of user-pairs K, and compare the GCP and SOCP approaches for N = 64 and N = 256 relay antennas. The rest of the system parameters, e.g., $P^{\max}$, circuit power $P_C$, system bandwidth B, the receive noise power and QoS $R_k$, remain the same as specified in the first paragraph of the simulation section. We observe that the decentralized SOCP algorithm requires a lower average runtime than its GCP counterpart for both MRC/MRT and ZFR/ZFT processing. The reason, as analytically shown earlier in Section 5.7.3, is the lower computational complexity of solving a SOCP.

5.9.5 Effect of channel estimation error

We plot in Fig. 5.52 the WSEE versus N for different values of the channel estimation error $\sigma_{\xi,k}^2$ considering MRC/MRT processing. For this study, we set K = 5, η = 10 dB and the large scale fading coefficients $\sigma_{\xi,k}^2$ to be unity. We observe that the WSEE decreases with an increase in the channel estimation error, which is not surprising. We also note that the WSEE degradation when $\sigma_{\xi,k}^2$ is increased from 0 (perfect channel estimate) to i) 0.15 is marginal; and ii) 0.35 is around 20%.

5.9.6 Comparison of decentralized and centralized approaches

The focus of this chapter is on the design of decentralized GCP and SOCP algorithms to optimize the WSEE. But recall that the WSEE optimization problems P5, P7 and P10


[Figure: Average runtime (seconds) versus user-pairs, K; curves: SOCP and GCP for N = 64 and N = 256.]

Fig. 5.50: Average runtime versus user-pairs, K, comparing GCP and SOCP formulations for the decentralized algorithm with MRC/MRT processing for N = 64 and η = 20 dB.

[Figure: Average runtime (seconds) versus user-pairs, K; curves: SOCP and GCP for N = 64 and N = 256.]

Fig. 5.51: Average runtime versus user-pairs, K, comparing GCP and SOCP formulations for the decentralized algorithm with ZFR/ZFT processing for N = 64 and η = 20 dB.

[Figure: WSEE (Mbits/Joule) versus N; curves: channel estimation error $\beta_{\xi,k}$ = 0, 0.15, 0.35.]

Fig. 5.52: WSEE versus N for different values of channel estimation errors $\sigma_{\xi,k}^2$ for MRC/MRT processing with K = 5 and η = 10 dB.

can also be centrally solved at the relay. We now compare the WSEE of the decentralized

and the centralized algorithms to show that the former does not degrade the WSEE. For

this study, we set K = 5 user-pairs and consider N = 64, N = 128 and N = 256 relay

antennas. We first plot in Fig. 5.53 the WSEE achieved by the centralized SOCP (problem

P7) and the decentralized SOCP algorithm for the MRC/MRT processing. We observe that

both centralized and the decentralized approaches yield the same WSEE. We see a similar

behavior in Fig. 5.54 where we repeat this study for the centralized SOCP (problem P10)

and the decentralized SOCP algorithm for the ZFR/ZFT processing.

5.9.7 Decision on the number of relay antennas N required to maximize WSEE

To investigate this aspect, we plot in Fig. 5.55 the WSEE achieved by varying the number of relay antennas N. For this study we fix K = 5 user-pairs, η = 20 dB and consider MRC/MRT processing. We observe, as before, that both GCP and SOCP decentralized algorithms yield the same WSEE. We also observe that the WSEE initially increases until N = 64, but it starts


[Figure: WSEE (Mbits/Joule) versus η (in dB); curves: Decentralized and Centralized for N = 64, N = 128 and N = 256.]

Fig. 5.53: WSEE versus η comparison of the proposed centralized SOCP and decentralized SOCP algorithms for MRC/MRT processing.

[Figure: WSEE (Mbits/Joule) versus η (in dB); curves: Decentralized and Centralized for N = 64, N = 128 and N = 256.]

Fig. 5.54: WSEE versus η comparison of the proposed centralized SOCP and decentralized SOCP algorithms for ZFR/ZFT processing.

[Figure: WSEE (Mbits/Joule) versus N; curves: ADMM-GCP, ADMM-SOCP, SRM, EPA, RPA.]

Fig. 5.55: WSEE comparison of the proposed decentralized GCP and SOCP algorithms versus N for MRC/MRT with η = 20 dB.

decreasing after that. This is because for N < 64, the increased array gain obtained by increasing N dominates, which reduces the energy required to achieve a given WSEE. For N > 64, the increase in array gain is outweighed by the increase in total circuit power due to the increase in N (see (3.2)). This, consequently, reduces the WSEE. The study thus demonstrates that N = 64 relay antennas maximize the WSEE for this system configuration.

5.10 Summary

We designed decentralized algorithms to maximize the non-convex WSEE metric, with coupled optimization variables, for multi-pair AF HD mMIMO two-way relaying with MRC/MRT and ZFR/ZFT processing. The proposed algorithms first approximate the WSEE either as a GCP or as a SOCP, and then optimize it in a decentralized manner using ADMM. The approximation is then sequentially improved using the successive convex approximation framework. We showed that the proposed algorithms not only outperform existing schemes, but also allow weight adjustment to prioritize the users' EE requirements. The SOCP decentralized algorithm is shown to have a lower run time than its GCP counterpart. We also numerically demonstrated the convergence of the proposed decentralized algorithms.


Chapter 6

Summary and Future Work

6.1 Thesis summary

The underlying theme of this thesis is to improve the spectral efficiency (SE) and energy efficiency (EE) of wireless systems when direct communication between a transmitter-receiver pair is not feasible due to the weak direct link. To achieve this aim, we consider multi-pair amplify-and-forward relaying where multiple full-duplex user-pairs exchange information via a shared full-duplex massive multiple-input multiple-output (mMIMO) relay. We first derived closed-form spectral efficiency expressions for maximal-ratio-combining (MRC)/maximal-ratio-transmission (MRT) and zero-forcing-reception (ZFR)/zero-forcing-transmission (ZFT) relay processing with minimum-mean-square-error (MMSE) channel estimation, and for an arbitrary number of relay antennas, which had not been derived in the literature. We showed the accuracy of these lower bounds for different numbers of relay antennas, user pairs and relay transmit powers. We also numerically investigated the loop and inter-user interference values for which the full-duplex relay outperforms a half-duplex relay.

Energy-efficient optimization of wireless systems has recently attracted considerable attention. Motivated by the need to save energy, we designed an energy-efficient algorithm that maximizes the global energy efficiency (GEE). We jointly optimized the relay and user powers to maximize the GEE by using the derived closed-form achievable rate expressions. The GEE maximization for the system considered herein has a non-convex objective. We solved this problem by first approximating the objective through a sequence of convex auxiliary problems, and then optimizing the approximated objective using Dinkelbach's approach.
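As a rough illustration of the Dinkelbach step, the following sketch maximizes a toy single-link energy efficiency, log2(1 + g p)/(p + p_c), over one transmit power p. The channel gain, the circuit power, the power budget and the helper `argmax_concave` are assumptions made for this example only; the thesis applies the same idea to the approximated GEE of the multi-pair relay system, not to this scalar problem.

```python
import math


def argmax_concave(f, lo, hi, tol=1e-10):
    """Golden-section search for the maximizer of a concave f on [lo, hi]."""
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - phi * (b - a), a + phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            a, c, fc = c, d, fd
            d = a + phi * (b - a)
            fd = f(d)
        else:
            b, d, fd = d, c, fc
            c = b - phi * (b - a)
            fc = f(c)
    return 0.5 * (a + b)


def dinkelbach(u, v, lo, hi, tol=1e-9, max_iter=50):
    """Maximize u(x)/v(x) on [lo, hi] for concave u >= 0 and convex v > 0.

    Each iteration solves the concave problem max_x u(x) - lam * v(x)
    and then updates lam to the ratio attained at the maximizer.
    """
    lam = 0.0
    x = lo
    for _ in range(max_iter):
        x = argmax_concave(lambda t: u(t) - lam * v(t), lo, hi)
        gap = u(x) - lam * v(x)   # F(lam); reaches zero at the optimal ratio
        lam = u(x) / v(x)
        if gap < tol:
            break
    return x, lam


if __name__ == "__main__":
    g, p_c, p_max = 10.0, 1.0, 10.0          # illustrative values (assumed)
    rate = lambda p: math.log2(1.0 + g * p)  # concave numerator
    power = lambda p: p + p_c                # affine denominator
    p_opt, ee_opt = dinkelbach(rate, power, 0.0, p_max)
    print(f"power = {p_opt:.4f}, energy efficiency = {ee_opt:.4f}")
```

Because the numerator is concave and the denominator is affine, every inner problem is concave, which is what allows a simple one-dimensional search here; a multi-variable problem would need a convex solver in its place.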

We demonstrated the EE improvement achieved by the proposed algorithm for various

scenarios. We also exhaustively compared the SE and GEE of the proposed full-duplex re-

laying system to the existing state-of-the-art half-duplex relaying systems, and characterized

the values of self-loop and inter-user interference, for which the proposed system outper-

forms the existing ones. The proposed algorithms can be used to optimize GEE of not only

emerging 5G systems but also existing sensor networks, where relays/fusion-nodes enable

communication between multiple devices.

The network-centric GEE metric cannot prioritize the EEs of different user links, which is commonly required in a heterogeneous network where an energy-strapped user places a high premium on its EE. We therefore considered the user-centric weighted sum energy efficiency (WSEE) metric. The WSEE metric, unlike the GEE, depends on the individual EEs of the users, and can prioritize the EE requirements of various users by assigning them predefined weights.

We optimized the non-convex WSEE, which, due to its sum-of-ratios form, is the hardest to maximize among the energy-efficiency metrics. We jointly optimized the user and relay powers to maximize the WSEE by first approximating the objective as a concave-convex function, and then by using the quadratic transform. We also derived a closed-form waterfilling solution, which yields its Karush-Kuhn-Tucker (KKT) point. The proposed algorithms yield a higher WSEE than the design that maximizes the spectral efficiency. We numerically showed that approximately N = 64 relay antennas are optimal for the WSEE, an N value that is being used for 5G system design. Any N value above it deteriorates the WSEE, because the increase in transceiver power outweighs the spectral gain achieved with the same power increase. We showed that with N > 100 (250), a residual SI variance of 0 dB can be tolerated for the MRC/MRT (ZFR/ZFT) processing FD relay to have a higher WSEE than its half-duplex counterpart. We also showed that the WSEE metric allows the weights to be judiciously assigned to weaker users to satisfy their EE requirements.
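The quadratic transform can be sketched on a toy sum-of-ratios problem: maximize the weighted sum of w_k log2(1 + g_k p_k)/(p_k + p_c) over per-user powers. The gains, weights, circuit power and the helper `golden_max` are assumptions for this example. The toy objective is separable across users, unlike the coupled WSEE treated in the thesis, so it only illustrates the alternating update between the auxiliary variables y_k and the powers.

```python
import math


def golden_max(f, lo, hi, tol=1e-10):
    """Golden-section search for the maximizer of a concave f on [lo, hi]."""
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - phi * (b - a), a + phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            a, c, fc = c, d, fd
            d = a + phi * (b - a)
            fd = f(d)
        else:
            b, d, fd = d, c, fc
            c = b - phi * (b - a)
            fc = f(c)
    return 0.5 * (a + b)


def rate(g, p):
    return math.log2(1.0 + g * p)


def wsee(p, w, g, p_c):
    """Toy weighted sum of per-user energy efficiencies."""
    return sum(wk * rate(gk, pk) / (pk + p_c) for wk, gk, pk in zip(w, g, p))


def quadratic_transform(w, g, p_c, p_max, iters=50):
    """Alternate between y_k = sqrt(A_k)/B_k and a concave update of p_k.

    A_k = rate and B_k = p_k + p_c. For fixed y_k the surrogate
    2*y_k*sqrt(A_k(p)) - y_k**2 * B_k(p) is concave in p. The positive
    weight w_k only scales this separable term, so it is omitted from
    the per-user search.
    """
    p = [p_max] * len(w)  # start away from p = 0, where the rate vanishes
    history = [wsee(p, w, g, p_c)]
    for _ in range(iters):
        y = [math.sqrt(rate(gk, pk)) / (pk + p_c) for gk, pk in zip(g, p)]
        p = [
            golden_max(
                lambda t, gk=gk, yk=yk: 2.0 * yk * math.sqrt(rate(gk, t))
                - yk * yk * (t + p_c),
                0.0, p_max)
            for gk, yk in zip(g, y)
        ]
        history.append(wsee(p, w, g, p_c))
    return p, history


if __name__ == "__main__":
    powers, trace = quadratic_transform([1.0, 2.0, 0.5], [10.0, 1.0, 0.2], 1.0, 10.0)
    print([round(pk, 4) for pk in powers], round(trace[-1], 6))
```

The recorded objective values are non-decreasing from one iteration to the next, which is the property that makes the transform attractive for sum-of-ratios objectives.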

The decentralized processing architecture, wherein a set of computing units cooperate with each other in a distributed manner to complete a complex task, has recently attracted a lot of attention in mMIMO system design. We designed decentralized algorithms to maximize the non-convex WSEE metric, with coupled optimization variables, for multi-pair amplify-and-forward half-duplex mMIMO two-way relaying with MRC/MRT and ZFR/ZFT processing. The proposed algorithms first approximate the WSEE either as a generic convex program (GCP) or as a second-order cone program (SOCP), and then optimize it decentrally using the alternating direction method of multipliers (ADMM). We showed that the proposed algorithms not only outperform existing schemes, but also allow weight adjustment to prioritize users' energy efficiency requirements. The SOCP decentralized algorithm is shown to have a shorter run time than its GCP counterpart. We also numerically demonstrated the convergence of the proposed decentralized algorithm. The current work is crucial for 5G mMIMO systems, which use a decentralized architecture to handle a large number of users.
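The flavour of a decentralized ADMM iteration can be seen in a minimal consensus example. Each node i holds a private value a_i and a local cost 0.5 (x_i - a_i)^2, and the nodes must agree on a common z. The local costs, the penalty `rho` and the function name are assumptions for this sketch; it is not the WSEE algorithm of the thesis, only the generic local-update, averaging and dual-update pattern that ADMM follows.

```python
def consensus_admm(a, rho=1.0, iters=200):
    """Scaled-form consensus ADMM for min sum_i 0.5*(x_i - a_i)**2 s.t. x_i = z."""
    n = len(a)
    x = [0.0] * n   # local copies, one per computing node
    u = [0.0] * n   # scaled dual variables
    z = 0.0         # global consensus variable
    for _ in range(iters):
        # Local step: each node minimizes its cost plus (rho/2)*(x - z + u_i)**2.
        x = [(ai + rho * (z - ui)) / (1.0 + rho) for ai, ui in zip(a, u)]
        # Coordination step: average of the local estimates plus duals.
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual step: accumulate the remaining disagreement.
        u = [ui + xi - z for ui, xi in zip(u, x)]
    return z, x


if __name__ == "__main__":
    z, x = consensus_admm([1.0, 4.0, -2.0, 7.0])
    print(z, x)
```

Only the averaging step needs communication between nodes; the local and dual steps use each node's own data, which is why this structure suits a decentralized baseband architecture.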

The system model considered in this thesis can be applied to a scenario where multiple users are located in an area where traditional cellular networks are disrupted, e.g., due to a natural disaster. The users in such locations might not have a reliable energy source (e.g., they may be battery operated). A low-cost mMIMO relay could then be set up to serve these users, and it is useful both to maximize the overall energy efficiency and to prioritize the individual energy efficiency of energy-strapped users.

6.2 Future work

The current system model considers a mMIMO relay where each antenna is connected to a dedicated radio frequency (RF) chain. This allows us to perform relay precoding digitally. In practice, the number of RF chains, and consequently the hardware complexity, can be significantly reduced by connecting each RF chain to multiple antennas [47]. In such a hybrid architecture, the transmit signals are first precoded by low-dimensional digital precoding, followed by analog (phase-only) precoding using cost-effective analog phase shifters. The hardware complexity of the proposed design can also be reduced using such a hybrid architecture. The system model and the solutions in this thesis are derived for a block-fading channel model, where the small-scale channels are assumed to be constant within a coherence block and to vary independently across coherence blocks. The model can be extended to adaptive estimation and tracking of the channel parameters by assuming that the channel varies slowly across coherence blocks; the time correlation can then be exploited for adaptive channel estimation and tracking.

Closed-form lower bounds on the achievable sum-rate can also be derived for full-duplex two-way mMIMO relaying systems in the presence of realistic correlated channels. By following steps similar to those in Appendices A-D, the current computations can be extended to such correlated channel models.

All the users should be synchronized to the mMIMO relay. Reference [48] designs a frequency synchronization scheme for the multiuser orthogonal frequency division multiplexing uplink, relying on a massive uniform linear array at the base station and exploiting the angle-of-arrival information of the users. A similar approach could also be employed to synchronize the users with the relay in our model. The frequency synchronization problem can also be alleviated by selecting the best antenna or a subset of antennas. This antenna-selection technique will reduce both the complexity of the synchronization algorithm and the overall hardware complexity.

This thesis focuses only on the amplify-and-forward relaying protocol. Future work can consider extending the present framework to other relaying techniques, such as decode-and-forward, compute-and-forward, estimate-and-forward, compress-and-forward and noisy network coding.

The focus of this work is on mMIMO use-cases such as emergency/disaster communications, sensor networks, outdoor concerts at stadiums and indoor exhibition halls, which are usually single-cell, and consequently do not experience mutual interference and pilot contamination. However, this work can be extended to multi-cell mMIMO systems, a use-case for commercial cellular systems, wherein mutual interference and pilot contamination are crucial design aspects.

We can also extend this model to a scenario where a massive-antenna base station is connected to a relay, which then serves a Wi-Fi area consisting of multiple single-antenna users. This model specifically suits a cell-edge scenario, where the users on the cell edge, due to high path loss, have an extremely weak direct link. Such users will, consequently, have extremely poor SE. The relay-based system is expected to substantially outperform the corresponding system without a relay.

Appendices

Appendix A

Proof of Proposition 1

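We first simplify the term $\mathbb{E}[\|W G_{RR} x_R\|^2]$ in the denominator of (2.17).

$$
\begin{aligned}
\mathbb{E}[\|W G_{RR} x_R\|^2]
&= \frac{P_R}{N}\,\mathrm{Tr}\,\mathbb{E}\big[\hat{F}^{*} T \hat{G}^{H} G_{RR} G_{RR}^{H} \hat{G} T^{H} \hat{F}^{T}\big] \\
&= \frac{P_R}{N}\,\mathrm{Tr}\,\mathbb{E}\big[G_{RR} G_{RR}^{H} \hat{G} T^{H} \hat{F}^{T} \hat{F}^{*} T \hat{G}^{H}\big] \\
&= \frac{P_R}{N}\,\mathrm{Tr}\Big(\mathbb{E}\big[G_{RR} G_{RR}^{H}\big]\,\mathbb{E}\big[\hat{G} T^{H} \hat{F}^{T} \hat{F}^{*} T \hat{G}^{H}\big]\Big) \\
&\overset{(a)}{=} \frac{P_R}{N}\,\mathrm{Tr}\Big(N \sigma_{LIR}^{2} I_N\,\mathbb{E}\big[\hat{G} T^{H} \hat{F}^{T} \hat{F}^{*} T \hat{G}^{H}\big]\Big) \\
&= P_R \sigma_{LIR}^{2}\,\mathrm{Tr}\,\mathbb{E}\Big[\sum_{i=1}^{K} \big(\hat{g}_{2i-1}\hat{f}_{2i}^{T} + \hat{g}_{2i}\hat{f}_{2i-1}^{T}\big)\big(\hat{f}_{2i-1}^{*}\hat{g}_{2i}^{H} + \hat{f}_{2i}^{*}\hat{g}_{2i-1}^{H}\big)\Big] \\
&= P_R \sigma_{LIR}^{2} \sum_{i=1}^{K} \Big[\mathrm{Tr}\,\mathbb{E}\big[\hat{g}_{2i-1}\hat{f}_{2i}^{T}\hat{f}_{2i-1}^{*}\hat{g}_{2i}^{H}\big] + \mathrm{Tr}\,\mathbb{E}\big[\hat{g}_{2i-1}\hat{f}_{2i}^{T}\hat{f}_{2i}^{*}\hat{g}_{2i-1}^{H}\big] \\
&\qquad\qquad + \mathrm{Tr}\,\mathbb{E}\big[\hat{g}_{2i}\hat{f}_{2i-1}^{T}\hat{f}_{2i-1}^{*}\hat{g}_{2i}^{H}\big] + \mathrm{Tr}\,\mathbb{E}\big[\hat{g}_{2i}\hat{f}_{2i-1}^{T}\hat{f}_{2i}^{*}\hat{g}_{2i-1}^{H}\big]\Big] \\
&\overset{(b)}{=} P_R \sigma_{LIR}^{2} \sum_{i=1}^{K} \big(N^{2}\sigma_{g,2i-1}^{2}\sigma_{f,2i}^{2} + N^{2}\sigma_{g,2i}^{2}\sigma_{f,2i-1}^{2}\big)
\overset{(c)}{=} N^{2} P_R \sigma_{LIR}^{2} \Phi .
\end{aligned}
\tag{A.1}
$$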

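Equality in (a) holds because i) $\mathrm{Tr}(AB) = \mathrm{Tr}(BA)$; ii) $G_{RR}$, $\hat{G}$ and $\hat{F}$ are independent; and iii) $\mathbb{E}[G_{RR} G_{RR}^{H}] = N\sigma_{LIR}^{2} I_N$. Equality in (b) is obtained by using the following results:

$$
\begin{aligned}
\mathrm{Tr}\,\mathbb{E}\big[\hat{g}_{2i-1}\hat{f}_{2i}^{T}\hat{f}_{2i-1}^{*}\hat{g}_{2i}^{H}\big] &= 0, \qquad
\mathrm{Tr}\,\mathbb{E}\big[\hat{g}_{2i}\hat{f}_{2i-1}^{T}\hat{f}_{2i}^{*}\hat{g}_{2i-1}^{H}\big] = 0, \\
\mathrm{Tr}\,\mathbb{E}\big[\hat{g}_{2i-1}\hat{f}_{2i}^{T}\hat{f}_{2i}^{*}\hat{g}_{2i-1}^{H}\big]
&= \mathbb{E}\big[\hat{g}_{2i-1}^{H}\hat{g}_{2i-1}\hat{f}_{2i}^{H}\hat{f}_{2i}\big]
= \mathbb{E}\big[\hat{g}_{2i-1}^{H}\hat{g}_{2i-1}\big]\,\mathbb{E}\big[\hat{f}_{2i}^{H}\hat{f}_{2i}\big]
= N^{2}\sigma_{g,2i-1}^{2}\sigma_{f,2i}^{2}, \\
\mathrm{Tr}\,\mathbb{E}\big[\hat{g}_{2i}\hat{f}_{2i-1}^{T}\hat{f}_{2i-1}^{*}\hat{g}_{2i}^{H}\big]
&= \mathbb{E}\big[\hat{g}_{2i}^{H}\hat{g}_{2i}\hat{f}_{2i-1}^{H}\hat{f}_{2i-1}\big]
= \mathbb{E}\big[\hat{g}_{2i}^{H}\hat{g}_{2i}\big]\,\mathbb{E}\big[\hat{f}_{2i-1}^{H}\hat{f}_{2i-1}\big]
= N^{2}\sigma_{g,2i}^{2}\sigma_{f,2i-1}^{2}.
\end{aligned}
$$

In deriving the results above, we exploit the facts that $\mathbb{E}[\hat{g}_i^{H}\hat{g}_i] = N\sigma_{g,i}^{2}$, $\mathbb{E}[\hat{g}_i^{H}\hat{g}_j] = 0$ and $\mathbb{E}[|\hat{g}_i^{H}\hat{g}_j|^{2}] = N\sigma_{g,i}^{2}\sigma_{g,j}^{2}$ for $i \neq j$, and $\mathbb{E}[\hat{g}_i\hat{g}_i^{H}\hat{g}_i\hat{g}_i^{H}] = (N+1)\sigma_{g,i}^{4} I_N$ [98]. Equality in (c) is obtained by defining $\Phi = \sum_{i=1}^{K}\Phi_i$, where $\Phi_i = \big(\sigma_{g,2i-1}^{2}\sigma_{f,2i}^{2} + \sigma_{g,2i}^{2}\sigma_{f,2i-1}^{2}\big)$. We see that the simplification of $\mathbb{E}[\|W G_{RR} x_R\|^2]$, present only in FD systems, is non-trivial. We now simplify the term $\mathbb{E}[\|W G x\|^2]$ in the denominator of (2.17), which is expressed using (2.15) as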

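$$
\begin{aligned}
\mathbb{E}[\|W G x\|^2]
&= \mathrm{Tr}\,\mathbb{E}\big[\hat{F}^{*} T \hat{G}^{H} G P P^{H} G^{H} \hat{G} T^{H} \hat{F}^{T}\big] \\
&\overset{(a)}{=} \mathrm{Tr}\,\mathbb{E}\big[\hat{G} P P^{H} \hat{G}^{H} \hat{G} T^{H} \hat{F}^{T} \hat{F}^{*} T \hat{G}^{H}\big]
+ \mathrm{Tr}\,\mathbb{E}\big[E_g P P^{H} E_g^{H} \hat{G} T^{H} \hat{F}^{T} \hat{F}^{*} T \hat{G}^{H}\big].
\end{aligned}
\tag{A.2}
$$

Equality in (a) holds because $\hat{G}$ and $E_g$ are independent and $\mathrm{Tr}(AB) = \mathrm{Tr}(BA)$. We now simplify the first and second terms of (A.2). We first expand the first term of (A.2) as

$$
\mathbb{E}\big[\hat{G} P P^{H} \hat{G}^{H} \hat{G} T^{H} \hat{F}^{T} \hat{F}^{*} T \hat{G}^{H}\big]
= \sum_{k=1}^{2K} p_k\,\mathbb{E}\Big[\hat{g}_k\hat{g}_k^{H} \sum_{i=1}^{K} \big(\hat{g}_{2i-1}\hat{f}_{2i}^{T} + \hat{g}_{2i}\hat{f}_{2i-1}^{T}\big)\big(\hat{f}_{2i-1}^{*}\hat{g}_{2i}^{H} + \hat{f}_{2i}^{*}\hat{g}_{2i-1}^{H}\big)\Big].
\tag{A.3}
$$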

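To simplify (A.3), we decompose the summation therein into the cases $k \notin \{2i-1, 2i\}$, $k = 2i-1$ and $k = 2i$. For these $k$ values, Eq. (A.3) can respectively be simplified as

$$
\mathrm{Tr}\,\mathbb{E}\Big[\hat{g}_k\hat{g}_k^{H}\big(\hat{g}_{2i-1}\hat{f}_{2i}^{T} + \hat{g}_{2i}\hat{f}_{2i-1}^{T}\big)\big(\hat{f}_{2i-1}^{*}\hat{g}_{2i}^{H} + \hat{f}_{2i}^{*}\hat{g}_{2i-1}^{H}\big)\Big]
=
\begin{cases}
N^{2}\sigma_{g,k}^{2}\big(\sigma_{g,2i}^{2}\sigma_{f,2i-1}^{2} + \sigma_{g,2i-1}^{2}\sigma_{f,2i}^{2}\big), & k \notin \{2i-1, 2i\}, \\
N^{2}\sigma_{g,2i-1}^{2}\big(\sigma_{g,2i}^{2}\sigma_{f,2i-1}^{2} + (N+1)\sigma_{g,2i-1}^{2}\sigma_{f,2i}^{2}\big), & k = 2i-1, \\
N^{2}\sigma_{g,2i}^{2}\big(\sigma_{g,2i-1}^{2}\sigma_{f,2i}^{2} + (N+1)\sigma_{g,2i}^{2}\sigma_{f,2i-1}^{2}\big), & k = 2i.
\end{cases}
\tag{A.4}
$$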

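We next simplify (A.3) as

$$
\mathrm{Tr}\,\mathbb{E}\big[\hat{G} P P^{H} \hat{G}^{H} \hat{G} T^{H} \hat{F}^{T} \hat{F}^{*} T \hat{G}^{H}\big]
= \mathrm{Tr} \sum_{k=1}^{2K} p_k\,\mathbb{E}\Big[\hat{g}_k\hat{g}_k^{H} \sum_{i=1}^{K} \big(\hat{g}_{2i-1}\hat{f}_{2i}^{T} + \hat{g}_{2i}\hat{f}_{2i-1}^{T}\big)\big(\hat{f}_{2i-1}^{*}\hat{g}_{2i}^{H} + \hat{f}_{2i}^{*}\hat{g}_{2i-1}^{H}\big)\Big]
\tag{A.5}
$$

$$
\begin{aligned}
&\overset{(a)}{=} N^{2} \sum_{i=1}^{K} \sum_{\substack{k=1 \\ k \neq 2i,\,2i-1}}^{2K} p_k \sigma_{g,k}^{2}\big(\sigma_{g,2i}^{2}\sigma_{f,2i-1}^{2} + \sigma_{g,2i-1}^{2}\sigma_{f,2i}^{2}\big)
+ N^{2} \sum_{i=1}^{K} p_{2i}\sigma_{g,2i}^{2}\big(\sigma_{g,2i-1}^{2}\sigma_{f,2i}^{2} + (N+1)\sigma_{g,2i}^{2}\sigma_{f,2i-1}^{2}\big) \\
&\qquad + N^{2} \sum_{i=1}^{K} p_{2i-1}\sigma_{g,2i-1}^{2}\big(\sigma_{g,2i}^{2}\sigma_{f,2i-1}^{2} + (N+1)\sigma_{g,2i-1}^{2}\sigma_{f,2i}^{2}\big) \\
&\overset{(b)}{=} N^{2} \sum_{i=1}^{K} \sum_{\substack{k=1 \\ k \neq 2i,\,2i-1}}^{2K} p_k \sigma_{g,k}^{2}\Phi_i
+ N^{2} \sum_{i=1}^{K} \big(p_{2i-1}\sigma_{g,2i-1}^{2} + p_{2i}\sigma_{g,2i}^{2}\big)\Phi_i
+ N^{3} \sum_{i=1}^{K} \big(p_{2i-1}\sigma_{g,2i-1}^{4}\sigma_{f,2i}^{2} + p_{2i}\sigma_{g,2i}^{4}\sigma_{f,2i-1}^{2}\big) \\
&\overset{(c)}{=} N^{2} \sum_{i=1}^{K} \sum_{\substack{j=1 \\ j \neq i}}^{K} \big(p_{2j-1}\sigma_{g,2j-1}^{2} + p_{2j}\sigma_{g,2j}^{2}\big)\Phi_i
+ N^{2} \sum_{i=1}^{K} \big(p_{2i-1}\sigma_{g,2i-1}^{2} + p_{2i}\sigma_{g,2i}^{2}\big)\Phi_i + N^{3}\Upsilon \\
&= N^{2} \sum_{j=1}^{K} \sum_{i=1}^{K} \big(p_{2j-1}\sigma_{g,2j-1}^{2} + p_{2j}\sigma_{g,2j}^{2}\big)\Phi_i + N^{3}\Upsilon \\
&\overset{(d)}{=} N^{2} \sum_{j=1}^{K} \sum_{i=1}^{K} \Psi_j \Phi_i + N^{3}\Upsilon = N^{2}\Psi\Phi + N^{3}\Upsilon .
\end{aligned}
\tag{A.6}
$$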

We now discuss the steps used for simplifying (A.3). Equality in (a) is obtained using the results in (A.4). Equality in (b) is obtained by using the definition of $\Phi_i$ from (A.1) and by rearranging the terms. Equality in (c) is obtained by using i) the definition of $\Phi_i$ from (A.1); and ii) the fact that $\sum_{k=1}^{2K} x_k = \sum_{j=1}^{K}(x_{2j-1} + x_{2j})$, and then by defining $\Upsilon = \sum_{i=1}^{K}\big(p_{2i-1}\sigma_{g,2i-1}^{4}\sigma_{f,2i}^{2} + p_{2i}\sigma_{g,2i}^{4}\sigma_{f,2i-1}^{2}\big)$. Equality in (d) is obtained by defining $\Psi = \sum_{i=1}^{K}\Psi_i = \sum_{i=1}^{K}\big(p_{2i-1}\sigma_{g,2i-1}^{2} + p_{2i}\sigma_{g,2i}^{2}\big)$.

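Similarly, we next simplify the second term of (A.2) as

$$
\begin{aligned}
\mathrm{Tr}\,\mathbb{E}\big[E_g P P^{H} E_g^{H} \hat{G} T^{H} \hat{F}^{T} \hat{F}^{*} T \hat{G}^{H}\big]
&= \mathrm{Tr}\Big(\mathbb{E}\big[E_g P P^{H} E_g^{H}\big]\,\mathbb{E}\big[\hat{G} T^{H} \hat{F}^{T} \hat{F}^{*} T \hat{G}^{H}\big]\Big) \\
&= \mathrm{Tr} \sum_{k=1}^{2K} p_k \sigma_{\xi,g,k}^{2}\,\mathbb{E}\Big[\sum_{i=1}^{K} \big(\hat{g}_{2i-1}\hat{f}_{2i}^{T} + \hat{g}_{2i}\hat{f}_{2i-1}^{T}\big)\big(\hat{f}_{2i-1}^{*}\hat{g}_{2i}^{H} + \hat{f}_{2i}^{*}\hat{g}_{2i-1}^{H}\big)\Big] \\
&= \sum_{k=1}^{2K} p_k \sigma_{\xi,g,k}^{2} \sum_{i=1}^{K} N^{2}\big(\sigma_{g,2i-1}^{2}\sigma_{f,2i}^{2} + \sigma_{g,2i}^{2}\sigma_{f,2i-1}^{2}\big)
= N^{2}\Phi \sum_{k=1}^{2K} p_k \sigma_{\xi,g,k}^{2}.
\end{aligned}
\tag{A.7}
$$

The last term $\mathbb{E}[\|W z_R\|^2]$ in the denominator of (2.17) is

$$
\mathbb{E}[\|W z_R\|^2]
= \mathrm{Tr}\,\sigma_{nr}^{2}\,\mathbb{E}\big[\hat{F}^{*} T \hat{G}^{H} \hat{G} T^{H} \hat{F}^{T}\big]
= \sigma_{nr}^{2} \sum_{i=1}^{K} N^{2}\big(\sigma_{g,2i-1}^{2}\sigma_{f,2i}^{2} + \sigma_{g,2i}^{2}\sigma_{f,2i-1}^{2}\big)
= N^{2}\sigma_{nr}^{2}\Phi .
\tag{A.8}
$$

By substituting the expression of $\mathbb{E}[\|W G x\|^2]$ from (A.6) as well as (A.7), and the expressions of $\mathbb{E}[\|W G_{RR} x_R\|^2]$ and $\mathbb{E}[\|W z_R\|^2]$ from (A.1) and (A.8) respectively in (2.17), we get (2.18).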
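The moment identities from [98] that drive this appendix can be sanity-checked by simulation. The sketch below draws independent circularly-symmetric complex Gaussian vectors and estimates E[g_i^H g_i], E[|g_i^H g_j|^2] and E[||g_i||^4]; the last quantity equals the trace of E[g_i g_i^H g_i g_i^H], so it should approach N(N+1) times the squared variance. The dimension, variances, trial count and function names are assumptions for this example.

```python
import random


def cn_vector(n, var, rng):
    """Draw an n-vector with i.i.d. CN(0, var) entries."""
    s = (var / 2.0) ** 0.5   # standard deviation of the real and imaginary parts
    return [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(n)]


def inner(a, b):
    """Return a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def estimate_moments(n=8, var_i=2.0, var_j=0.5, trials=40000, seed=1):
    rng = random.Random(seed)
    m_norm = m_cross = m_fourth = 0.0
    for _ in range(trials):
        gi = cn_vector(n, var_i, rng)
        gj = cn_vector(n, var_j, rng)
        sq_norm = inner(gi, gi).real
        m_norm += sq_norm                    # E[g_i^H g_i]    -> N * var_i
        m_cross += abs(inner(gi, gj)) ** 2   # E|g_i^H g_j|^2  -> N * var_i * var_j
        m_fourth += sq_norm * sq_norm        # E||g_i||^4      -> N * (N+1) * var_i^2
    return m_norm / trials, m_cross / trials, m_fourth / trials


if __name__ == "__main__":
    n, var_i, var_j = 8, 2.0, 0.5
    est = estimate_moments(n, var_i, var_j)
    theory = (n * var_i, n * var_i * var_j, n * (n + 1) * var_i ** 2)
    for e, t in zip(est, theory):
        print(f"estimate {e:.3f}   closed form {t:.3f}")
```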

Appendix B

Proof of Proposition 2

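We first simplify the term $\mathbb{E}[\|W G_{RR} x_R\|^2]$ in the denominator of (2.17) as follows:

$$
\begin{aligned}
\mathbb{E}[\|W G_{RR} x_R\|^2]
&= \frac{P_R}{N}\,\mathrm{Tr}\,\mathbb{E}\big[\bar{F}^{*} T \bar{G}^{H} G_{RR} G_{RR}^{H} \bar{G} T^{H} \bar{F}^{T}\big]
= P_R \sigma_{LIR}^{2}\,\mathrm{Tr}\,\mathbb{E}\big[\bar{F}^{T} \bar{F}^{*} T \bar{G}^{H} \bar{G} T^{H}\big] \\
&\overset{(a)}{=} P_R \sigma_{LIR}^{2}\,\mathrm{Tr}\,\mathbb{E}\big[\Lambda_F^{*} T \Lambda_G T\big]
\overset{(b)}{=} P_R \sigma_{LIR}^{2} \sum_{j=1}^{2K} \Big(\mathbb{E}\big[w_{f,j,j'}^{*}\big]\mathbb{E}\big[w_{g,j,j'}\big] + \mathbb{E}\big[w_{f,j,j}^{*}\big]\mathbb{E}\big[w_{g,j',j'}\big]\Big) \\
&\overset{(c)}{=} P_R \sigma_{LIR}^{2} \sum_{j=1}^{2K} \frac{1}{(N-2K-1)^{2}\,\sigma_{f,j}^{2}\sigma_{g,j'}^{2}}
\triangleq P_R \sigma_{LIR}^{2}\,\eta .
\end{aligned}
\tag{B.1}
$$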

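In the equality (a), we define $\Lambda_F \triangleq \bar{F}^{H}\bar{F} = (\hat{F}^{H}\hat{F})^{-1}$ as well as $\Lambda_G \triangleq \bar{G}^{H}\bar{G} = (\hat{G}^{H}\hat{G})^{-1}$. To derive the equality in (b), we first note that the random matrices $\Lambda_F$ and $\Lambda_G$ have an inverse Wishart distribution, i.e. we have $\Lambda_F \sim \mathcal{W}^{-1}(D_d^{-1}, 2K)$ and $\Lambda_G \sim \mathcal{W}^{-1}(D_u^{-1}, 2K)$ [99], where $D_d$ and $D_u$ are the covariance matrices of the rows of the estimated channels $\hat{G}$ and $\hat{F}$. We denote $w_{f,i,j} = (\Lambda_F)_{i,j}$ and $w_{g,i,j} = (\Lambda_G)_{i,j}$ for $i, j = 1, \ldots, 2K$, and $\mathbb{E}[\Lambda_F] = \frac{D_d^{-1}}{N-2K-1}$, $\mathbb{E}[\Lambda_G] = \frac{D_u^{-1}}{N-2K-1}$ [99]. We also have

$$
\mathbb{E}\big[\bar{f}_i^{H}\bar{f}_j\big] = \mathbb{E}[w_{f,i,j}] =
\begin{cases}
\dfrac{1}{(N-2K-1)\,\sigma_{f,i}^{2}}, & i = j, \\
0, & i \neq j,
\end{cases}
\qquad\text{and}\qquad
\mathbb{E}\big[\bar{g}_i^{H}\bar{g}_j\big] = \mathbb{E}[w_{g,i,j}] =
\begin{cases}
\dfrac{1}{(N-2K-1)\,\sigma_{g,i}^{2}}, & i = j, \\
0, & i \neq j.
\end{cases}
$$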

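We now simplify the term $\mathbb{E}[\|W G x\|^2]$ in the denominator of (2.17) using (2.19).

$$
\begin{aligned}
\mathbb{E}[\|W G x\|^2]
&= \mathbb{E}\big[\|\bar{F}^{*} T \bar{G}^{H} G x\|^{2}\big]
= \mathrm{Tr}\,\mathbb{E}\big[\bar{F}^{*} T \bar{G}^{H} G P P^{H} G^{H} \bar{G} T^{H} \bar{F}^{T}\big] \\
&= \mathrm{Tr}\,\mathbb{E}\big[\bar{F}^{*} T \bar{G}^{H} (\hat{G} + E_g) P P^{H} (\hat{G}^{H} + E_g^{H}) \bar{G} T^{H} \bar{F}^{T}\big] \\
&= \mathrm{Tr}\,\mathbb{E}\big[\bar{F}^{*} T \bar{G}^{H} \hat{G} P P^{H} \hat{G}^{H} \bar{G} T^{H} \bar{F}^{T}\big]
+ \mathrm{Tr}\,\mathbb{E}\big[E_g P P^{H} E_g^{H} \bar{G} T^{H} \bar{F}^{T} \bar{F}^{*} T \bar{G}^{H}\big] \\
&\overset{(a)}{=} \mathrm{Tr}\,\mathbb{E}\big[\bar{F}^{*} T P P^{H} T \bar{F}^{T}\big]
+ \sum_{i=1}^{2K} p_i \sigma_{\xi,g,i}^{2}\,\mathrm{Tr}\,\mathbb{E}\big[\bar{G} T^{H} \bar{F}^{T} \bar{F}^{*} T \bar{G}^{H}\big] \\
&\overset{(b)}{=} \sum_{i=1}^{2K} p_{i'}\,\mathbb{E}\big[\bar{f}_i^{H}\bar{f}_i\big]
+ \sum_{i=1}^{2K} p_i \sigma_{\xi,g,i}^{2}\,\mathrm{Tr}\,\mathbb{E}\big[\Lambda_F^{*} T \Lambda_G T\big]
\overset{(c)}{=} \lambda + \sum_{i=1}^{2K} p_i \sigma_{\xi,g,i}^{2}\,\eta .
\end{aligned}
\tag{B.2}
$$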

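Equality in (a) is obtained by exploiting the facts that $\bar{G}^{H}\hat{G} = \hat{G}^{H}\bar{G} = I_{2K}$ and $\mathbb{E}[E_g P P^{H} E_g^{H}] = \sum_{i=1}^{2K} p_i \sigma_{\xi,g,i}^{2} I_N$. Using the fact that $T P P^{H} T = \mathrm{diag}\{p_2, p_1, \cdots, p_{2K}, p_{2K-1}\}$, we obtain (b). Equality (c) is obtained by substituting $\lambda \triangleq \sum_{i=1}^{2K} \frac{p_{i'}}{(N-2K-1)\sigma_{f,i}^{2}}$.

The last term in the denominator of (2.17) can be simplified to

$$
\mathbb{E}\big[\|\bar{F}^{*} T \bar{G}^{H} z_R\|^{2}\big]
= \sigma_{nr}^{2}\,\mathrm{Tr}\,\mathbb{E}\big[\bar{F}^{*} T \bar{G}^{H} \bar{G} T^{H} \bar{F}^{T}\big]
= \sigma_{nr}^{2}\,\mathrm{Tr}\,\mathbb{E}\big[\Lambda_F^{*} T \Lambda_G T\big]
= \sigma_{nr}^{2}\,\eta .
\tag{B.3}
$$

By using (B.1), (B.2) and (B.3), we get the simplified expression of $\alpha_{zf}$ in (2.20).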

Appendix C

Proof of Theorem 7

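We start by simplifying the numerator of (2.26), with $f_k = \hat{f}_k + e_{f,k}$ and $g_k = \hat{g}_k + e_{g,k}$, and using the fact that $\hat{f}_k$, $\hat{g}_k$, $e_{f,k}$ and $e_{g,k}$ are independent, as follows:

$$
\begin{aligned}
\mathbb{E}\big[f_k^{T} W g_{k'}\big]
&= \mathbb{E}\big[(\hat{f}_k + e_{f,k})^{T} W (\hat{g}_{k'} + e_{g,k'})\big]
= \mathbb{E}\big[\hat{f}_k^{T} \hat{F}^{*} T \hat{G}^{H} \hat{g}_{k'}\big]
= \mathbb{E}\Big[\hat{f}_k^{T} \sum_{i=1}^{K} \big(\hat{f}_{2i-1}^{*}\hat{g}_{2i}^{H} + \hat{f}_{2i}^{*}\hat{g}_{2i-1}^{H}\big)\hat{g}_{k'}\Big] \\
&= \mathbb{E}\big[\hat{f}_k^{T}\hat{f}_k^{*}\hat{g}_{k'}^{H}\hat{g}_{k'} + \hat{f}_k^{T}\hat{f}_{k'}^{*}\hat{g}_k^{H}\hat{g}_{k'}\big]
= \mathbb{E}\big[\hat{f}_k^{H}\hat{f}_k\big]\mathbb{E}\big[\hat{g}_{k'}^{H}\hat{g}_{k'}\big] + \mathbb{E}\big[\hat{f}_{k'}^{H}\hat{f}_k\big]\mathbb{E}\big[\hat{g}_k^{H}\hat{g}_{k'}\big]
\overset{(a)}{=} N^{2}\sigma_{f,k}^{2}\sigma_{g,k'}^{2}.
\end{aligned}
\tag{C.1}
$$

Equality in (a) holds because $\mathbb{E}[\hat{g}_i^{H}\hat{g}_i] = N\sigma_{g,i}^{2}$ and $\mathbb{E}[\hat{g}_i^{H}\hat{g}_j] = 0$ for $i \neq j$ [98].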

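We next simplify $\mathrm{LIR}_k$:

$$
\begin{aligned}
\mathrm{LIR}_k = \mathbb{E}\big[|f_k^{T} W G_{RR} x|^{2}\big]
&= \mathbb{E}\big[\big|(\hat{f}_k^{T} + e_{f,k}^{T}) W G_{RR} x\big|^{2}\big]
= \mathbb{E}\big[\big|\hat{f}_k^{T} W G_{RR} x\big|^{2}\big] + \mathbb{E}\big[\big|e_{f,k}^{T} W G_{RR} x\big|^{2}\big] \\
&= \frac{P_R}{N}\,\mathbb{E}\big[\hat{f}_k^{T} W G_{RR} G_{RR}^{H} W^{H} \hat{f}_k^{*}\big]
+ \frac{P_R}{N}\,\mathbb{E}\big[e_{f,k}^{T} W G_{RR} G_{RR}^{H} W^{H} e_{f,k}^{*}\big].
\end{aligned}
\tag{C.2}
$$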

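The first term in (C.2) is simplified as follows:

$$
\begin{aligned}
\mathbb{E}\big[\hat{f}_k^{T} W G_{RR} G_{RR}^{H} W^{H} \hat{f}_k^{*}\big]
&\overset{(a)}{=} \mathbb{E}\Big[\hat{f}_k^{T} \sum_{i=1}^{K} \big(\hat{f}_{2i-1}^{*}\hat{g}_{2i}^{H} + \hat{f}_{2i}^{*}\hat{g}_{2i-1}^{H}\big) G_{RR} G_{RR}^{H} \big(\hat{g}_{2i}\hat{f}_{2i-1}^{T} + \hat{g}_{2i-1}\hat{f}_{2i}^{T}\big)\hat{f}_k^{*}\Big] \\
&\overset{(b)}{=} N^{3}\sigma_{LIR}^{2}\sigma_{f,k}^{2} \sum_{i=1}^{K} \big(\sigma_{f,2i-1}^{2}\sigma_{g,2i}^{2} + \sigma_{f,2i}^{2}\sigma_{g,2i-1}^{2}\big) + N^{4}\sigma_{f,k}^{4}\sigma_{LIR}^{2}\sigma_{g,k'}^{2} \\
&\overset{(c)}{=} N^{3}\sigma_{LIR}^{2}\sigma_{f,k}^{2}\big(N\sigma_{f,k}^{2}\sigma_{g,k'}^{2} + \Phi\big).
\end{aligned}
\tag{C.3}
$$

Equality in (b) is obtained by expanding (a) for two different cases: i) $k, k' = 2i, 2i-1$; ii) $k, k' \neq 2i, 2i-1$, and by using the facts that $\mathbb{E}[|\hat{g}_i^{H}\hat{g}_j|^{2}] = N\sigma_{g,i}^{2}\sigma_{g,j}^{2}$ and $\mathbb{E}[\hat{g}_i\hat{g}_i^{H}\hat{g}_i\hat{g}_i^{H}] = (N+1)\sigma_{g,i}^{4} I_N$ [98]. Equality in (c) is obtained by defining $\Phi = \sum_{i=1}^{K}\big(\sigma_{g,2i-1}^{2}\sigma_{f,2i}^{2} + \sigma_{g,2i}^{2}\sigma_{f,2i-1}^{2}\big)$.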

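The second term in (C.2) is now simplified using similar ideas as follows:

$$
\mathbb{E}\big[e_{f,k}^{T} W G_{RR} G_{RR}^{H} W^{H} e_{f,k}^{*}\big]
= N^{3} \sum_{i=1}^{K} \sigma_{\xi,f,k}^{2}\sigma_{LIR}^{2}\big(\sigma_{f,2i-1}^{2}\sigma_{g,2i}^{2} + \sigma_{f,2i}^{2}\sigma_{g,2i-1}^{2}\big)
= N^{3}\sigma_{\xi,f,k}^{2}\sigma_{LIR}^{2}\Phi .
\tag{C.4}
$$

Further using (C.3) and (C.4), we obtain the expression for (C.2) as:

$$
\begin{aligned}
\mathrm{LIR}_k
&= P_R N^{2}\sigma_{f,k}^{2}\sigma_{LIR}^{2}\big(N\sigma_{f,k}^{2}\sigma_{g,k'}^{2} + \Phi\big) + P_R N^{2}\sigma_{\xi,f,k}^{2}\sigma_{LIR}^{2}\Phi \\
&= N^{2} P_R \sigma_{LIR}^{2}\big(N\sigma_{f,k}^{4}\sigma_{g,k'}^{2} + \sigma_{f,k}^{2}\Phi\big).
\end{aligned}
\tag{C.5}
$$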

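Similar to the above calculations, we can simplify the other terms in the denominator of (2.26) as

$$
\begin{aligned}
\mathrm{var}\big[f_k^{T} W g_{k'}\big] &= N^{2}\Big(\Phi\sigma_{f,k}^{2}\sigma_{g,k'}^{2} + N\sigma_{f,k}^{2}\sigma_{g,k'}^{2}\big(\sigma_{f,k}^{2}\sigma_{g,k'}^{2} + \sigma_{f,k}^{2}\sigma_{g,k'}^{2}\big)\Big), \\
\mathrm{SI}_k &= N^{2}\sigma_{\xi,f,k}^{2}\sigma_{\xi,g,k}^{2}\Phi + N^{2}\sigma_{f,k}^{2}\sigma_{\xi,g,k}^{2}\big(N\sigma_{f,k}^{2}\sigma_{g,k'}^{2} + \Phi\big) + N^{2}\sigma_{\xi,f,k}^{2}\sigma_{g,k}^{2}\big(N\sigma_{g,k}^{2}\sigma_{f,k'}^{2} + \Phi\big),
\end{aligned}
\tag{C.6}
$$

$$
\mathrm{IP}_k = N^{2} \sum_{j \neq k,k'}^{2K} p_j \Big[\sigma_{f,k}^{2}\sigma_{g,j}^{2}\Phi + N\big(\sigma_{f,k}^{2}\sigma_{g,j}^{4}\sigma_{f,j'}^{2} + \sigma_{g,j}^{2}\sigma_{f,k}^{4}\sigma_{g,k'}^{2}\big)\Big],
\tag{C.7}
$$

$$
\mathrm{NU}_k = \mathbb{E}\big[|z_k|^{2}\big] = \sigma_n^{2},
\tag{C.8}
$$

$$
\mathrm{NR}_k = N^{2}\sigma_{nr}^{2}\big(N\sigma_{f,k}^{4}\sigma_{g,k'}^{2} + \sigma_{f,k}^{2}\Phi\big)
\quad\text{and}\quad
\mathrm{UI}_k = \sum_{i,k \in U_k} p_i\,\mathbb{E}\big[|\Omega_{k,i} x_i|^{2}\big] = \sum_{i,k \in U_k} p_i \sigma_{k,i}^{2}.
\tag{C.9}
$$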

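We can further simplify the denominator of (2.26) using (C.5)-(C.9), as

$$
\begin{aligned}
& p_{k'}\,\mathrm{var}\big[f_k^{T} W g_{k'}\big] + p_k \mathrm{SI}_k + \mathrm{IP}_k + \mathrm{NR}_k + \mathrm{LIR}_k + \frac{1}{\alpha_{mr}^{2}}\mathrm{UI}_k + \frac{1}{\alpha_{mr}^{2}}\mathrm{NU}_k \\
&\quad = N^{2}\Bigg\{\sum_{i=1}^{2K} p_i\Bigg[\eta_{k,i} + \big(\Phi\sigma_{g,i}^{2} + N\sigma_{g,i}^{4}\sigma_{f,i'}^{2}\big)P_R^{-1}\Big(\sigma_n^{2} + \sum_{i,k \in U_k} p_i \sigma_{k,i}^{2}\Big)\Bigg] + c_k p_k \\
&\qquad\qquad + \big(\sigma_{LIR}^{2}\sigma_n^{2} + \sigma_{nr}^{2}\sigma_{f,k}^{2}\big)\Phi + N\sigma_{nr}^{2}\sigma_{f,k}^{4}\sigma_{g,k'}^{2}\Bigg\},
\end{aligned}
\tag{C.10}
$$

where $\eta_{k,i} = \Phi\sigma_{f,k}^{2}\sigma_{g,i}^{2} + N\big(\sigma_{f,k}^{2}\sigma_{g,i}^{4}\sigma_{f,i'}^{2} + \sigma_{g,i}^{2}\sigma_{f,k}^{4}\sigma_{g,k'}^{2}\big)$ and $c_k = -\big(\Phi\sigma_{f,k}^{2}\sigma_{g,j'}^{2} + N(\sigma_{f,k}^{4}\sigma_{g,j}^{2}\sigma_{g,k'}^{2} + \sigma_{f,k}^{2}\sigma_{g,j}^{4}\sigma_{f,k'}^{2})\big)$. The proof of Theorem 7 is complete.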

Appendix D

Proof of Theorem 8

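Starting with the numerator of (2.26) considering ZFR/ZFT processing, we have

$$
\mathbb{E}\big[f_k^{T} W g_{k'}\big]
= \mathbb{E}\big[(\hat{f}_k + e_{f,k})^{T} \bar{F}^{*} T \bar{G}^{H} (\hat{g}_{k'} + e_{g,k'})\big]
\overset{(a)}{=} \mathbb{E}\big[1_{f,k}^{T} T\, 1_{g,k'}\big]
\overset{(b)}{=} 1 .
\tag{D.1}
$$

Equality in (a) is obtained by using the following results: $\hat{g}_{k'}^{H}\bar{G} = 1_{g,k'}^{T}$, $\bar{G}^{H}\hat{g}_{k'} = 1_{g,k'}$, $\hat{f}_k^{T}\bar{F}^{*} = 1_{f,k}^{T}$ and $\bar{F}^{T}\hat{f}_k^{*} = 1_{f,k}$. Equality in (b) holds because $1_{f,k}^{T} T\, 1_{g,k'} = 1$. We next simplify $\mathrm{LIR}_k$.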

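$$
\begin{aligned}
\mathrm{LIR}_k = \mathbb{E}\big[|f_k^{T} W G_{RR} x|^{2}\big]
&= P_R\sigma_{LIR}^{2}\,\mathbb{E}\big[\hat{f}_k^{T} W W^{H} \hat{f}_k^{*}\big] + P_R\sigma_{LIR}^{2}\sigma_{\xi,f,k}^{2}\,\mathrm{Tr}\,\mathbb{E}\big[W W^{H}\big] \\
&\overset{(a)}{=} P_R\sigma_{LIR}^{2}\,\mathbb{E}\big[\hat{f}_k^{T}\bar{F}^{*} T \bar{G}^{H}\bar{G} T \bar{F}^{T}\hat{f}_k^{*}\big] + P_R\sigma_{LIR}^{2}\sigma_{\xi,f,k}^{2}\,\mathrm{Tr}\,\mathbb{E}\big[\bar{F}^{*} T \bar{G}^{H}\bar{G} T \bar{F}^{T}\big] \\
&\overset{(b)}{=} P_R\sigma_{LIR}^{2}\,\mathbb{E}\big[1_{f,k}^{T} T \Lambda_G T\, 1_{f,k}\big] + P_R\sigma_{LIR}^{2}\sigma_{\xi,f,k}^{2}\,\mathrm{Tr}\,\mathbb{E}\big[\Lambda_F^{*} T \Lambda_G T\big] \\
&\overset{(c)}{=} P_R\sigma_{LIR}^{2}\,\mathbb{E}\big[w_{g,k',k'}\big] + P_R\sigma_{LIR}^{2}\sigma_{\xi,f,k}^{2} \sum_{j=1}^{2K} \Big(\mathbb{E}\big[w_{f,j,j'}^{*}\big]\mathbb{E}\big[w_{g,j,j'}\big] + \mathbb{E}\big[w_{f,j,j}^{*}\big]\mathbb{E}\big[w_{g,j',j'}\big]\Big) \\
&\overset{(d)}{=} \frac{P_R\sigma_{LIR}^{2}}{(N-2K-1)\,\sigma_{g,k'}^{2}} + P_R\sigma_{LIR}^{2}\sigma_{\xi,f,k}^{2}\,\eta .
\end{aligned}
\tag{D.2}
$$

Equality in (a) is obtained by substituting the expression of $W$ from (2.19). Equality in (b) is obtained using the properties of unitary matrices, and equality in (c) is obtained by mathematical manipulations. Equality in (d) holds because $\mathbb{E}[w_{f,k,k}] = \frac{1}{(N-2K-1)\sigma_{f,k}^{2}}$, $\mathbb{E}[w_{g,k',k'}] = \frac{1}{(N-2K-1)\sigma_{g,k'}^{2}}$, and $\eta \triangleq \sum_{j=1}^{2K} \frac{1}{(N-2K-1)^{2}\sigma_{f,j}^{2}\sigma_{g,j'}^{2}}$.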

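Similarly, the other terms in the denominator of (2.26) can be written as follows:

$$
\mathrm{SI}_k = \frac{1}{(N-2K-1)}\Bigg(\frac{\sigma_{\xi,f,k}^{2}}{\sigma_{f,k'}^{2}} + \frac{\sigma_{\xi,g,k}^{2}}{\sigma_{g,k'}^{2}}\Bigg) + \sigma_{\xi,f,k}^{2}\sigma_{\xi,g,k}^{2}\,\eta ,
\tag{D.3}
$$

$$
\mathrm{IP}_k = \sum_{i \neq k,k'}^{2K} p_i \Bigg[\frac{1}{(N-2K-1)}\Bigg(\frac{\sigma_{\xi,f,k}^{2}}{\sigma_{f,i'}^{2}} + \frac{\sigma_{\xi,g,i}^{2}}{\sigma_{g,k'}^{2}}\Bigg) + \sigma_{\xi,f,k}^{2}\sigma_{\xi,g,i}^{2}\,\eta\Bigg],
\tag{D.4}
$$

$$
\mathrm{NU}_k = \mathbb{E}\big[|z_k|^{2}\big] = \sigma_n^{2},
\tag{D.5}
$$

$$
\mathrm{NR}_k = \sigma_{nr}^{2}\Bigg(\frac{1}{(N-2K-1)\,\sigma_{g,k'}^{2}} + \sigma_{\xi,f,k}^{2}\,\eta\Bigg)
\quad\text{and}\quad
\mathrm{UI}_k = \sum_{i,k \in U_k} p_i\,\mathbb{E}\big[|\Omega_{k,i} x_i|^{2}\big] = \sum_{i,k \in U_k} p_i \sigma_{k,i}^{2}.
\tag{D.6}
$$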

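Substituting the values obtained from (D.2)-(D.6) in the denominator of (2.26), we obtain

$$
\begin{aligned}
& p_{k'}\,\mathrm{var}\big[f_k^{T} W g_{k'}\big] + p_k \mathrm{SI}_k + \mathrm{IP}_k + \mathrm{NR}_k + \mathrm{LIR}_k + \frac{1}{\alpha_{zf}^{2}}\mathrm{UI}_k + \frac{1}{\alpha_{zf}^{2}}\mathrm{NU}_k \\
&\quad = \sum_{i=1}^{2K} \theta_{k,i}\, p_i
+ \sigma_{nr}^{2}\Bigg(\frac{1}{(N-2K-1)\,\sigma_{g,k'}^{2}} + \sigma_{\xi,f,k}^{2}\,\eta\Bigg)
+ P_R\sigma_{LIR}^{2}\Bigg(\frac{1}{(N-2K-1)\,\sigma_{g,k'}^{2}} + \sigma_{\xi,f,k}^{2}\,\eta\Bigg) \\
&\qquad + \sum_{i=1}^{2K} p_i P_R^{-1}\sigma_n^{2}\Bigg(\frac{1}{(N-2K-1)\,\sigma_{f,i'}^{2}} + \eta\,\sigma_{\xi,g,i}^{2}\Bigg)
+ \sum_{i,k \in U_k} p_i P_R^{-1}\sigma_{k,i}^{2}\Bigg(\frac{1}{(N-2K-1)\,\sigma_{f,i'}^{2}} + \eta\,\sigma_{\xi,g,i}^{2}\Bigg) \\
&\qquad + P_R^{-1}\eta\,\sigma_{nr}^{2}\sigma_n^{2}
+ \sum_{i,k \in U_k} p_i P_R^{-1}\sigma_{k,i}^{2}\,\eta\,\sigma_{nr}^{2}
+ \sum_{i,k \in U_k} p_i \sigma_{k,i}^{2}\,\eta\,\sigma_{LIR}^{2}
+ \eta\,\sigma_{LIR}^{2}\sigma_n^{2},
\end{aligned}
\tag{D.7}
$$

where $\theta_{k,i} = \Big(\frac{\sigma_{\xi,f,k}^{2}}{(N-2K-1)\sigma_{f,i'}^{2}} + \frac{\sigma_{\xi,g,i}^{2}}{(N-2K-1)\sigma_{g,k'}^{2}} + \sigma_{\xi,f,k}^{2}\sigma_{\xi,g,i}^{2}\,\eta\Big)$.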

Appendix E

Geometric and fractional programming

A real-valued function $f(x)$ of the form

$$
f(x) = c\, x_1^{a_1} x_2^{a_2} \cdots x_n^{a_n},
\tag{E.1}
$$

where $c > 0$, $a_i \in \mathbb{R}$ and $x \in \mathbb{R}^{n}_{++}$, is referred to as a monomial function. The sum of one or more monomials, i.e.

$$
f(x) = \sum_{m=1}^{K} c_m\, x_1^{a_{1m}} x_2^{a_{2m}} \cdots x_n^{a_{nm}},
\tag{E.2}
$$

where $c_m > 0$, is termed a posynomial. Monomials are closed under both multiplication and division, whereas posynomials are closed under addition and multiplication, but not under division. The ratio of a posynomial and a monomial is a posynomial. A geometric program (GP) has a posynomial objective and upper-bounded posynomial inequality constraints.
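A posynomial is not convex in x in general, but under the change of variables x_i = exp(y_i) its logarithm becomes a log-sum-exp of affine functions of y, which is convex; this is the standard reason a GP can be solved as a convex program. The sketch below evaluates a small posynomial both ways and spot-checks midpoint convexity in the log domain. The coefficients, exponents and function names are assumptions for this example.

```python
import math
import random


def posynomial(x, coeffs, exps):
    """Evaluate sum_m c_m * prod_i x_i ** a_{im} for positive x."""
    return sum(
        c * math.prod(xi ** a for xi, a in zip(x, row))
        for c, row in zip(coeffs, exps)
    )


def log_posynomial(y, coeffs, exps):
    """Evaluate log f(exp(y)) as a numerically stable log-sum-exp."""
    terms = [
        math.log(c) + sum(a * yi for a, yi in zip(row, y))
        for c, row in zip(coeffs, exps)
    ]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))


if __name__ == "__main__":
    # f(x) = 2 * x1^0.5 * x2^-1 + 0.3 * x1^-2 * x2^1.5 (illustrative, assumed)
    coeffs = [2.0, 0.3]
    exps = [[0.5, -1.0], [-2.0, 1.5]]
    rng = random.Random(0)
    y1 = [rng.uniform(-2.0, 2.0) for _ in range(2)]
    y2 = [rng.uniform(-2.0, 2.0) for _ in range(2)]
    mid = [(a + b) / 2.0 for a, b in zip(y1, y2)]
    lhs = log_posynomial(mid, coeffs, exps)
    rhs = 0.5 * (log_posynomial(y1, coeffs, exps) + log_posynomial(y2, coeffs, exps))
    print("midpoint convexity holds:", lhs <= rhs + 1e-12)
```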

A fractional program has an objective of the form $u(x)/v(x)$, with the optimization variable obeying $x \in \mathcal{X}$, where $u : \mathcal{C} \subset \mathbb{R}^{n} \to \mathbb{R}$, $v : \mathcal{C} \subset \mathbb{R}^{n} \to \mathbb{R}_{++}$ and $\mathcal{X} \subset \mathcal{C} \subset \mathbb{R}^{n}$. Since the objective is a fraction, the problem is not guaranteed to be convex, even if both $u$ and $v$ are affine functions. For maximizing fractions, two classes of generalized concave functions are mostly used, namely quasi-concave (QC) functions and pseudo-concave (PC) functions. If $\mathcal{C} \subset \mathbb{R}^{n}$ is a convex set, then $r : \mathcal{C} \to \mathbb{R}$ is quasi-concave if

$$
r(\lambda x_1 + (1-\lambda)x_2) \geq \min\{r(x_1), r(x_2)\},
\tag{E.3}
$$

for all $x_1, x_2 \in \mathcal{C}$ and $\lambda \in [0, 1]$. Similarly, $r : \mathcal{C} \to \mathbb{R}$ is PC if it is differentiable and $r(x_2) < r(x_1) \Rightarrow \nabla r(x_2)^{T}(x_1 - x_2) > 0$ for all $x_1, x_2 \in \mathcal{C}$. A local maximum of an optimization problem with a PC objective is also a global maximum, whereas under a QC objective it is not necessarily a global maximum. For example, when the objective has a concave numerator and a convex denominator, the fractional program is PC and its stationary point is its global maximizer. Such a problem belongs to the class of concave-convex fractional programs [62].

Appendix F

Proof of Lemma 2

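An arbitrary function $f(x, y)$ in two variables $x$ and $y$ can be approximated around a point $(a, b)$, using the first-order Taylor series approximation [63], as

$$
f(x, y) \approx f(a, b) + \left[\frac{\partial f(x, y)}{\partial x}\right]_{(a,b)} (x - a) + \left[\frac{\partial f(x, y)}{\partial y}\right]_{(a,b)} (y - b).
\tag{F.1}
$$

We now consider the terms $p_i^{2} P_{R\mathrm{inv}}$ and $p_i p_j P_{R\mathrm{inv}}$. Applying the approximation in (F.1) around the points $\bar{p}_i$, $\bar{p}_j$ and $\bar{P}_{R\mathrm{inv}}$, we get

$$
p_i p_j P_{R\mathrm{inv}} \approx \bar{p}_i \bar{p}_j \bar{P}_{R\mathrm{inv}} \left[\frac{p_i}{\bar{p}_i} + \frac{p_j}{\bar{p}_j} + \frac{P_{R\mathrm{inv}}}{\bar{P}_{R\mathrm{inv}}} - 2\right]
\quad\text{and}\quad
p_i^{2} P_{R\mathrm{inv}} \approx 2\bar{p}_i \left[\bar{P}_{R\mathrm{inv}}\, p_i + \frac{\bar{p}_i P_{R\mathrm{inv}}}{2} - \bar{p}_i \bar{P}_{R\mathrm{inv}}\right].
$$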
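The two linearizations can be checked numerically: each should match the exact product at the expansion point and deviate only by second-order terms nearby. In the sketch below `q` stands for the inverse-power variable P_Rinv and a trailing `_bar` marks an expansion point; the function names and the numerical values are assumptions for this example.

```python
def lin_product(p_i, p_j, q, p_i_bar, p_j_bar, q_bar):
    """First-order Taylor approximation of p_i * p_j * q around the bar point."""
    return p_i_bar * p_j_bar * q_bar * (
        p_i / p_i_bar + p_j / p_j_bar + q / q_bar - 2.0
    )


def lin_square(p_i, q, p_i_bar, q_bar):
    """First-order Taylor approximation of p_i**2 * q around the bar point."""
    return 2.0 * p_i_bar * (q_bar * p_i + p_i_bar * q / 2.0 - p_i_bar * q_bar)


if __name__ == "__main__":
    p_i_bar, p_j_bar, q_bar = 1.0, 2.0, 0.5   # expansion point (assumed values)
    for d in (0.0, 1e-3, 1e-2, 1e-1):
        p_i, p_j, q = p_i_bar + d, p_j_bar + d, q_bar + d
        err_prod = abs(lin_product(p_i, p_j, q, p_i_bar, p_j_bar, q_bar) - p_i * p_j * q)
        err_sq = abs(lin_square(p_i, q, p_i_bar, q_bar) - p_i ** 2 * q)
        print(f"offset {d:g}: product error {err_prod:.2e}, square error {err_sq:.2e}")
```

Both approximations are affine in the optimization variables, which is what makes them usable inside a successive convex approximation loop.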

Appendix G

Proof of Closed-form SE expression

The closed-form SE of the multi-pair two-way AF HD mMIMO relay for both MRC/MRT and ZFR/ZFT processing is derived in [92]. We simplify the closed-form expressions derived in [92] for the HD scenario considered in this thesis. We start by restating the following propositions.

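Proposition 7. For the HD system, the achievable rate of the user $S_k$ with MRC/MRT processing is lower bounded as $0.5\log_2\big(1 + \mathrm{SINR}_k^{mrc}(p)\big)$, where $\mathrm{SINR}_k^{mrc}(p)$ is given by

$$
\mathrm{SINR}_k^{mrc}(p) = \frac{a_k p_k}{\sum_{i=1}^{2K}\big(b_{k,i}^{(1)} + b_{k,i}^{(2)} P_R^{-1}\big)p_i + c_k p_{k'} + \big(d_k^{(1)} + d_k^{(3)} P_R^{-1}\big)},
\tag{G.1}
$$

where $a_k = N^{2}\beta_{k'}^{2}\beta_k^{2}$, $b_{k,i}^{(1)} = \mu_{k,i} = \Phi\beta_{k'}\beta_i + N\big(\beta_{k'}\beta_i^{2}\beta_{i'} + \beta_i\beta_{k'}^{2}\beta_k\big)$, $b_{k,i}^{(2)} = \sigma_n^{2}\big(\Phi\beta_i + N\beta_i^{2}\beta_{i'}\big)$, $c_k = -\big(\Phi\beta_{k'}^{2} + 2N\beta_{k'}^{3}\beta_k\big)$, $d_k^{(1)} = \sigma_{nr}^{2}\beta_{k'}\Phi + N\sigma_{nr}^{2}\beta_{k'}^{2}\beta_k$, $d_k^{(3)} = \sigma_{nr}^{2}\sigma_n^{2}\Phi$, and $\Phi = \sum_{k=1}^{2K}\beta_{k'}\beta_k$.

Proof. The proposition can be proved along lines similar to [92]; the proof is omitted for brevity.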

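Proposition 8. For the HD system, the achievable rate of the user $S_k$ with ZFR/ZFT processing is lower bounded as $0.5\log_2\big(1 + \mathrm{SINR}_k^{zf}(p)\big)$, where $\mathrm{SINR}_k^{zf}(p)$ is given by

$$
\mathrm{SINR}_k^{zf}(p) = \frac{c_k p_k}{\sum_{i=1}^{2K}\big(u_{k,i}^{(1)} + u_{k,i}^{(2)} P_R^{-1}\big)p_i + \big(v_k^{(1)} + v_k^{(3)} P_R^{-1}\big)},
\tag{G.2}
$$

where $c_k = 1$, $u_{k,i}^{(1)} = \frac{1}{(N-2K-1)}\Big(\frac{\beta_{\xi,k}}{\beta_{i'}} + \frac{\beta_{\xi,i}}{\beta_{k'}}\Big) + \eta\beta_{\xi,k}\beta_{\xi,i}$, $u_{k,i}^{(2)} = \eta\sigma_n^{2}\beta_{\xi,i} + \frac{\sigma_n^{2}}{(N-2K-1)\beta_{i'}}$, $v_k^{(1)} = \frac{\sigma_{nr}^{2}}{(N-2K-1)\beta_k} + \eta\sigma_{nr}^{2}\beta_{\xi,k} + \eta\sigma_{LIR}^{2}\sigma_n^{2}$, $v_k^{(3)} = \eta\sigma_{nr}^{2}\sigma_n^{2}$, and $\eta = \sum_{j=1}^{2K}\frac{1}{(N-2K-1)^{2}\beta_j\beta_{j'}}$.

Proof. The proposition can be proved along lines similar to [92]; the proof is omitted for brevity.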

We now use the above lower bounds to derive a more tractable high-SNR closed-form sum-rate expression by assuming that $\sigma_n^{2}/P_R \ll 1$, $\sigma_n^{2}/p_k \ll 1$ and $\sigma_{nr}^{2}/p_k \ll 1$. This assumption is made in the mMIMO literature to simplify the sum-rate expressions and derive asymptotically optimal power allocation schemes [30]. These closed-form expressions are then used to optimize the WSEE. We start by simplifying the expressions for the MRC/MRT processing. With $\sigma_n^{2}/P_R \ll 1$, $\sigma_n^{2}/p_k \ll 1$ and $\sigma_{nr}^{2}/p_k \ll 1$ for $k \in \{1, \cdots, 2K\}$, the denominator of the lower bound in (G.1) can be expressed as

$$
D = N\Bigg\{\sum_{i=1}^{2K} p_i\Bigg[\frac{\Phi\beta_{k'}\beta_i}{N} + \beta_{k'}\beta_i^{2}\beta_{i'} + \beta_i\beta_{k'}^{2}\beta_k\Bigg] - \Bigg[\frac{\Phi\beta_{k'}^{2}}{N} + 2\beta_{k'}^{3}\beta_k\Bigg]p_{k'} + \frac{\sigma_{nr}^{2}\beta_{k'}\Phi}{N} + \sigma_{nr}^{2}\beta_{k'}^{2}\beta_k\Bigg\}.
\tag{G.3}
$$

Neglecting the terms that are multiples of $1/N$ (since for a mMIMO system, the number of relay antennas $N \gg 2K$), the SINR expression for the MRC/MRT processing can now be approximated using (G.3) as

$$
\mathrm{SINR}_k^{mrc}(p) \approx \frac{N\beta_k^{2}\beta_{k'}\,p_k}{\sum_{i \neq k'}^{2K}\big(\beta_i\beta_{k'}\beta_k + \beta_i^{2}\beta_{i'}\big)p_i + \sigma_{nr}^{2}\beta_{k'}\beta_k}.
$$

Along similar lines, we can obtain the SINR expression for the ZFR/ZFT processing as

$$
\mathrm{SINR}_k^{zf}(p) \approx \frac{p_k\,(N-2K-1)\,\beta_k}{\sigma_{nr}^{2}}.
$$

References

[1] G. Auer, V. Giannini, C. Desset, I. Godor, P. Skillermark, M. Olsson, M. A. Imran,

D. Sabella, M. J. Gonzalez, O. Blume, and A. J. Fehske, “How much energy is needed

to run a wireless network?” IEEE Wireless Commun., vol. 18, no. 5, pp. 40–49, 2011.

[2] T. L. Marzetta, “Noncooperative cellular wireless with unlimited numbers of base station

antennas,” IEEE Transactions on Wireless Communications, vol. 9, no. 11, pp. 3590–

3600, 2010.

[3] E. G. Larsson, O. Edfors, F. Tufvesson, and T. L. Marzetta, “Massive MIMO for next

generation wireless systems,” IEEE Communications Magazine, vol. 52, no. 2, pp. 186–

195, 2014.

[4] F. Rusek, D. Persson, B. K. Lau, E. G. Larsson, T. L. Marzetta, O. Edfors, and F. Tufvesson, “Scaling up MIMO: opportunities and challenges with very large arrays,” IEEE Signal Processing Magazine, vol. 30, no. 1, pp. 40–60, Jan. 2013.

[5] L. Lu, G. Y. Li, A. L. Swindlehurst, A. E. Ashikhmin, and R. Zhang, “An overview

of massive MIMO: benefits and challenges,” IEEE Journal of Selected Topics in Signal

Processing, vol. 8, no. 5, pp. 742–758, 2014.

[6] J. Hoydis, S. ten Brink, and M. Debbah, “Massive MIMO in the UL/DL of cellular networks: How many antennas do we need?” IEEE Journal on Selected Areas in Communications, vol. 31, no. 2, pp. 160–171, 2013.


[7] E. Bjornson, E. G. Larsson, and T. L. Marzetta, “Massive MIMO: ten myths and one

critical question,” IEEE Communications Magazine, vol. 54, no. 2, pp. 114–123, 2016.

[8] B. Wang, J. Zhang, and A. Høst-Madsen, “On the capacity of MIMO relay channels,”

IEEE Transactions on Information Theory, vol. 51, no. 1, pp. 29–43, 2005.

[9] M. Yuksel and E. Erkip, “Multiple-antenna cooperative wireless systems: A diversity-

multiplexing tradeoff perspective,” IEEE Transactions on Information Theory, vol. 53,

no. 10, pp. 3371–3393, 2007.

[10] A. Adinoyi and H. Yanikomeroglu, “Cooperative relaying in multi-antenna fixed relay

networks,” IEEE Transactions on Wireless Communications, vol. 6, no. 2, pp. 533–544,

2007.

[11] L. Song, “Relay selection for two-way relaying with amplify-and-forward protocols,”

IEEE Transactions on Vehicular Technology, vol. 60, no. 4, pp. 1954–1959, 2011.

[12] Z. Zhang, X. Chai, K. Long, A. V. Vasilakos, and L. Hanzo, “Full duplex techniques for

5G networks: self-interference cancellation, protocol design, and relay selection,” IEEE

Communications Magazine, vol. 53, no. 5, pp. 128–137, May 2015.

[13] K. Lee and L. Hanzo, “Resource-efficient wireless relaying protocols,” IEEE Wireless

Communications, vol. 17, no. 2, pp. 66–72, Apr. 2010.

[14] A. Sabharwal, P. Schniter, D. Guo, D. W. Bliss, S. Rangarajan, and R. Wichman,

“In-band full-duplex wireless: Challenges and opportunities,” IEEE Journal on Selected

Areas in Communications, vol. 32, no. 9, pp. 1637–1652, 2014.

[15] M. Duarte, C. Dick, and A. Sabharwal, “Experiment-driven characterization of full-

duplex wireless systems,” IEEE Transactions on Wireless Communications, vol. 11,

no. 12, pp. 4296–4307, Dec. 2012.


[16] A. Nadh, J. Samuel, A. Sharma, S. Aniruddhan, and R. K. Ganti, “A taylor series

approximation of self-interference channel in full-duplex radios,” IEEE Transactions on

Wireless Communications, vol. 16, no. 7, pp. 4304–4316, 2017.

[17] T. Riihonen, S. Werner, and R. Wichman, “Spatial loop interference suppression in full-duplex MIMO relays,” in Conference Record of the Forty-Third Asilomar Conference on Signals, Systems and Computers. IEEE, 2009, pp. 1508–1512.

[18] Y. Y. Kang, B.-J. Kwak, and J. H. Cho, “An optimal full-duplex AF relay for joint analog and digital domain self-interference cancellation,” IEEE Transactions on Communications, vol. 62, no. 8, pp. 2758–2772, Aug. 2014.

[19] T. Riihonen, S. Werner, and R. Wichman, “Mitigation of loopback self-interference in

full-duplex MIMO relays,” IEEE Transactions on Signal Processing, vol. 59, no. 12, pp.

5983–5993, Dec. 2011.

[20] Z. Zhang, Z. Ma, Z. Ding, M. Xiao, and G. K. Karagiannidis, “Full-duplex two-way and

one-way relaying: Average rate, outage probability, and tradeoffs,” IEEE Transactions

on Wireless Communications, vol. 15, no. 6, pp. 3920–3933, Jun. 2016.

[21] H. Q. Ngo, H. A. Suraweera, M. Matthaiou, and E. G. Larsson, “Multipair full-duplex

relaying with massive arrays and linear processing,” IEEE Journal on Selected Areas in

Communications, vol. 32, no. 9, pp. 1721–1737, Sep. 2014.

[22] Z. Zhang, Z. Chen, M. Shen, and B. Xia, “Spectral and energy efficiency of multipair

two-way full-duplex relay systems with massive MIMO,” IEEE Journal on Selected Areas

in Communications, vol. 34, no. 4, pp. 848–863, Apr. 2016.

[23] Z. Zhang, Z. Chen, M. Shen, B. Xia, W. Xie, and Y. Zhao, “Performance analysis for

training-based multipair two-way full-duplex relaying with massive antennas,” IEEE

Transactions on Vehicular Technology, vol. 66, no. 7, pp. 6130–6145, Jul. 2017.


[24] A. Zappone and E. Jorswieck, “Energy efficiency in wireless networks via fractional

programming theory,” Foundations and Trends of Communications and Information

Theory, vol. 11, no. 3-4, pp. 185–399, 2014.

[25] S. He, Y. Huang, L. Yang, and B. E. Ottersten, “Coordinated multicell multiuser precoding for maximizing weighted sum energy efficiency,” IEEE Transactions on Signal Processing, vol. 62, no. 3, pp. 741–751, Feb. 2014.

[26] L. Venturino, A. Zappone, C. Risi, and S. Buzzi, “Energy-efficient scheduling and power allocation in downlink OFDMA networks with base station coordination,” IEEE Transactions on Wireless Communications, vol. 14, no. 1, pp. 1–14, Jan. 2015.

[27] H. Cui, L. Song, and B. Jiao, “Multi-pair two-way amplify-and-forward relaying with

very large number of relay antennas,” IEEE Transactions on Wireless Communications,

vol. 13, no. 5, pp. 2636–2645, May 2014.

[28] W. Yu, “Multiuser water-filling in the presence of crosstalk,” in 2007 Information Theory

and Applications Workshop, Jan. 2007, pp. 414–420.

[29] H. Cui, L. Song, and B. Jiao, “Multi-pair two-way amplify-and-forward relaying with

very large number of relay antennas,” IEEE Transactions on Wireless Communications,

vol. 13, no. 5, pp. 2636–2645, 2014.

[30] Y. Dai and X. Dong, “Power allocation for multi-pair massive MIMO two-way AF relay-

ing with linear processing,” IEEE Transactions on Wireless Communications, vol. 15,

no. 9, pp. 5932–5946, Sep. 2016.

[31] B. P. Day, A. R. Margetts, D. W. Bliss, and P. Schniter, “Full-duplex MIMO relaying:

Achievable rates under limited dynamic range,” IEEE Journal on Selected Areas in

Communications, vol. 30, no. 8, pp. 1541–1553, 2012.


[32] H. Cui, M. Ma, L. Song, and B. Jiao, “Relay selection for two-way full duplex relay networks with amplify-and-forward protocol,” IEEE Transactions on Wireless Communications, vol. 13, no. 7, pp. 3768–3777, 2014.

[33] I. Krikidis, H. A. Suraweera, P. J. Smith, and C. Yuen, “Full-duplex relay selection for amplify-and-forward cooperative networks,” IEEE Transactions on Wireless Communications, vol. 11, no. 12, pp. 4381–4393, 2012.

[34] T. Riihonen, S. Werner, and R. Wichman, “Hybrid full-duplex/half-duplex relaying with

transmit power adaptation,” IEEE Transactions on Wireless Communications, vol. 10,

no. 9, pp. 3074–3085, 2011.

[35] J. Lee, “Full-duplex relay for enhancing physical layer security in multi-hop relaying

systems,” IEEE Communications Letters, vol. 19, no. 4, pp. 525–528, 2015.

[36] E. Bjornson, L. Sanguinetti, H. Wymeersch, J. Hoydis, and T. L. Marzetta, “Massive MIMO is a reality–what is next?: Five promising research directions for antenna arrays,” Digital Signal Processing, vol. 94, pp. 3–20, 2019.

[37] E. Everett, A. Sahai, and A. Sabharwal, “Passive self-interference suppression for full-

duplex infrastructure nodes,” IEEE Transactions on Wireless Communications, vol. 13,

no. 2, pp. 680–694, 2014.

[38] P. Xing, J. Liu, C. Zhai, X. Wang, and X. Zhang, “Multipair two-way full-duplex relaying

with massive array and power allocation,” IEEE Transactions on Vehicular Technology,

vol. 66, no. 10, pp. 8926–8939, 2017.

[39] L. J. Rodríguez, N. H. Tran, and T. Le-Ngoc, “Performance of full-duplex AF relaying in the presence of residual self-interference,” IEEE Journal on Selected Areas in


[40] I. Shomorony and A. S. Avestimehr, “Is Gaussian noise the worst-case additive noise in wireless networks?” in Proceedings of the IEEE International Symposium on Information Theory, ISIT, Cambridge, MA, USA, July 1-6, 2012, pp. 214–218.

[41] X. Xia, D. Zhang, K. Xu, W. Ma, and Y. Xu, “Hardware impairments aware transceiver

for full-duplex massive MIMO relaying,” IEEE Transactions on Signal Processing,

vol. 63, no. 24, pp. 6565–6580, 2015.

[42] X. Xiong, X. Wang, T. Riihonen, and X. You, “Channel estimation for full-duplex relay

systems with large-scale antenna arrays,” IEEE Transactions on Wireless Communica-

tions, vol. 15, no. 10, pp. 6925–6938, 2016.

[43] E. Björnson, J. Hoydis, and L. Sanguinetti, “Massive MIMO networks: Spectral, energy, and hardware efficiency,” Foundations and Trends® in Signal Processing, vol. 11, no. 3-4, pp. 154–655, Nov. 2017.

[44] B. Hassibi and B. M. Hochwald, “How much training is needed in multiple-antenna

wireless links?” IEEE Transactions on Information Theory, vol. 49, no. 4, pp. 951–963,

Apr. 2003.

[45] J. Jose, A. E. Ashikhmin, T. L. Marzetta, and S. Vishwanath, “Pilot contamination and

precoding in multi-cell TDD systems,” IEEE Transactions on Wireless Communications,

vol. 10, no. 8, pp. 2640–2651, Aug. 2011.

[46] M. Médard, “The effect upon channel capacity in wireless communications of perfect

and imperfect knowledge of the channel,” IEEE Transactions on Information Theory,

vol. 46, no. 3, pp. 933–946, May 2000.

[47] W. Xu, J. Liu, S. Jin, and X. Dong, “Spectral and energy efficiency of multi-pair massive

MIMO relay network with hybrid processing,” IEEE Transactions on Communications,

vol. 65, no. 9, pp. 3794–3809, Sep. 2017.

[48] W. Zhang, F. Gao, S. Jin, and H. Lin, “Frequency synchronization for uplink massive

MIMO systems,” IEEE Transactions on Wireless Communications, vol. 17, no. 1, pp.

235–249, Jan. 2018.

[49] S. Buzzi, C. I, T. E. Klein, H. V. Poor, C. Yang, and A. Zappone, “A survey of energy-

efficient techniques for 5G networks and challenges ahead,” IEEE Journal on Selected

Areas in Communications, vol. 34, no. 4, pp. 697–709, Apr. 2016.

[50] F. Tan, T. Lv, and S. Yang, “Power allocation optimization for energy-efficient mas-

sive MIMO aided multi-pair decode-and-forward relay systems,” IEEE Transactions on

Communications, vol. 65, no. 6, pp. 2368–2381, Jun. 2017.

[51] A. Zappone, E. Björnson, L. Sanguinetti, and E. A. Jorswieck, “Globally optimal energy-efficient power control and receiver design in wireless networks,” IEEE Transactions on Signal Processing, vol. 65, no. 11, pp. 2844–2859, Jun. 2017.

[52] A. Zappone, Z. Chong, E. A. Jorswieck, and S. Buzzi, “Energy-aware competitive power

control in relay-assisted interference wireless networks,” IEEE Transactions on Wireless


[53] A. Zappone, E. A. Jorswieck, and S. Buzzi, “Energy efficiency and interference neutral-

ization in two-hop MIMO interference channels,” IEEE Transactions on Signal Process-

ing, vol. 62, no. 24, pp. 6481–6495, Dec. 2014.

[54] W. Xu, J. Liu, S. Jin, and X. Dong, “Spectral and energy efficiency of multi-pair massive

MIMO relay network with hybrid processing,” IEEE Transactions on Communications,

vol. 65, no. 9, pp. 3794–3809, Sep. 2017.

[55] L. D. Nguyen, T. Q. Duong, H. Q. Ngo, and K. Tourki, “Energy efficiency in cell-

free massive MIMO with zero-forcing precoding design,” IEEE Communications Letters,

vol. 21, no. 8, pp. 1871–1874, Aug. 2017.

[56] Y. Li, P. Fan, A. Leukhin, and L. Liu, “On the spectral and energy efficiency of full-

duplex small-cell wireless systems with massive MIMO,” IEEE Transactions on Vehic-

ular Technology, vol. 66, no. 3, pp. 2339–2353, Mar. 2017.

[57] E. Sharma, R. Budhiraja, K. Vasudevan, and L. Hanzo, “Full-duplex massive MIMO

multi-pair two-way AF relaying: Energy efficiency optimization,” IEEE Transactions on

Communications, vol. 66, no. 8, pp. 3322–3340, Aug. 2018.

[58] Z. Fei, B. Li, S. Yang, C. Xing, H. Chen, and L. Hanzo, “A survey of multi-objective

optimization in wireless sensor networks: Metrics, algorithms, and open problems,”

IEEE Communications Surveys and Tutorials, vol. 19, no. 1, pp. 550–586, Firstquarter

2017.

[59] S. Verdú, “On channel capacity per unit cost,” IEEE Transactions on Information Theory, vol. 36, no. 5, pp. 1019–1030, Sep. 1990.

[60] A. Zappone, P. Cao, and E. Jorswieck, “Energy efficiency optimization in relay-assisted

MIMO systems with perfect and statistical CSI,” IEEE Transactions on Signal Process-

ing, vol. 62, no. 2, pp. 443–457, Jan. 2014.

[61] S. Boyd, S.-J. Kim, L. Vandenberghe, and A. Hassibi, “A tutorial on geometric pro-

gramming,” Optimization and Engineering, vol. 8, no. 1, p. 67, Mar. 2007.

[62] A. Zappone and E. A. Jorswieck, “Energy efficiency in wireless networks via fractional

programming theory,” Foundations and Trends in Communications and Information

Theory, vol. 11, no. 3-4, pp. 185–396, Jun. 2015.

[63] S. Boyd and L. Vandenberghe, Convex Optimization. USA: Cambridge University Press,

2004.

[64] W. Dinkelbach, “On nonlinear fractional programming,” Management Science, vol. 13,

no. 7, pp. 492–498, Mar. 1967.

[65] P. C. Weeraddana, M. Codreanu, M. Latva-aho, and A. Ephremides, “Resource alloca-

tion for cross-layer utility maximization in wireless networks,” IEEE Transactions on

Vehicular Technology, vol. 60, no. 6, pp. 2790–2809, Jul. 2011.

[66] A. Zappone, E. Björnson, L. Sanguinetti, and E. A. Jorswieck, “Globally optimal energy-efficient power control and receiver design in wireless networks,” IEEE Transactions on Signal Processing, vol. 65, no. 11, pp. 2844–2859, Jun. 2017.

[67] K. O. Kortanek, X. Xu, and Y. Ye, “An infeasible interior-point algorithm for solving

primal and dual geometric programs,” Math. Program., vol. 76, pp. 155–181, Jan. 1996.



ing, vol. 62, no. 24, pp. 6481–6495, 2014.

[69] C. Pan, W. Xu, W. Zhang, J. Wang, H. Ren, and M. Chen, “Weighted sum energy

efficiency maximization in ad hoc networks,” IEEE Wireless Communications Letters,

vol. 4, no. 3, pp. 233–236, 2015.

[70] C. N. Efrem and A. D. Panagopoulos, “A framework for weighted-sum energy efficiency

maximization in wireless networks,” IEEE Wireless Communications Letters, vol. 8,

no. 1, pp. 153–156, 2019.

[71] Q. Wu, G. Y. Li, W. Chen, and D. W. K. Ng, “Energy-efficient D2D overlaying communi-

cations with spectrum-power trading,” IEEE Transactions on Wireless Communications,

vol. 16, no. 7, pp. 4404–4419, 2017.

[72] K. Shen and W. Yu, “Fractional programming for communication systems - part I: power

control and beamforming,” IEEE Transactions on Signal Processing, vol. 66, no. 10, pp.

2616–2630, May 2018.

[73] M. S. Bazaraa, H. D. Sherali, and C. M. Shetty, Nonlinear Programming - Theory and Algorithms, Third Edition. Wiley, 2005. [Online]. Available: https://doi.org/10.1002/0471787779

[74] J. Nocedal and S. J. Wright, Numerical Optimization. Springer, 1999. [Online].

Available: https://doi.org/10.1007/b98874

[75] A. Ben-Tal and A. S. Nemirovski, Lectures on Modern Convex Optimization: Analysis,

Algorithms, and Engineering Applications. USA: Society for Industrial and Applied

Mathematics, 2001.

[76] G. H. Golub and C. F. Van Loan, Matrix Computations (3rd Ed.). USA: Johns Hopkins

University Press, 1996.

[77] C. Shepard, H. Yu, N. Anand, E. L. Li, T. L. Marzetta, Y. R. Yang, and L. Zhong, “Argos: practical many-antenna base stations,” in The 18th Annual International Conference on Mobile Computing and Networking, Mobicom’12, Istanbul, Turkey, August 22-26, 2012, pp. 53–64.

[78] L. Van der Perre, L. Liu, and E. G. Larsson, “Efficient DSP and circuit architectures for

massive MIMO: state of the art and future directions,” IEEE Transactions on Signal


[79] K. Li, R. Sharan, Y. Chen, T. Goldstein, J. R. Cavallaro, and C. Studer, “Decentralized

baseband processing for massive MU-MIMO systems,” IEEE Transactions on Emerging

and Selected Topics in Circuits and Systems, vol. 7, no. 4, pp. 491–507, 2017.

[80] S. Jin, X. Liang, K. Wong, X. Gao, and Q. Zhu, “Ergodic rate analysis for multipair mas-

sive MIMO two-way relay networks,” IEEE Transactions on Wireless Communications,

vol. 14, no. 3, pp. 1480–1491, Mar. 2015.



ing, vol. 62, no. 24, pp. 6481–6495, 2014.

[82] A. Zappone, P. Cao, and E. A. Jorswieck, “Energy efficiency optimization in relay-

assisted MIMO systems with perfect and statistical CSI,” IEEE Transactions on Signal


[83] H. Li, J. Cheng, Z. Wang, and H. Wang, “Joint antenna selection and power allocation

for an energy-efficient massive MIMO system,” IEEE Wireless Communications Letters,

vol. 8, no. 1, pp. 257–260, 2019.

[84] F. Tan, H. Chen, F. Zhao, and X. Li, “Energy-efficient power allocation for massive

MIMO-enabled multi-way AF relay networks with channel aging,” EURASIP J. Wireless

Comm. and Networking, vol. 2018, p. 206, 2018.

[85] B. Du, C. Pan, W. Zhang, and M. Chen, “Distributed energy-efficient power optimization

for CoMP systems with max-min fairness,” IEEE Communications Letters, vol. 18, no. 6,

pp. 999–1002, 2014.

[86] C. Shen, T.-H. Chang, K.-Y. Wang, Z. Qiu, and C.-Y. Chi, “Distributed robust multicell

coordinated beamforming with imperfect CSI: An ADMM approach,” IEEE Transac-

tions on Signal Processing, vol. 60, no. 6, pp. 2988–3003, 2012.

[87] S. Boyd, N. Parikh, E. Chu, B. Peleato, J. Eckstein et al., “Distributed optimization

and statistical learning via the alternating direction method of multipliers,” Foundations

and Trends® in Machine Learning, vol. 3, no. 1, pp. 1–122, 2011.

[88] M. Masood, L. H. Afify, and T. Y. Al-Naffouri, “Efficient coordinated recovery of sparse

channels in massive MIMO,” IEEE Transactions on Signal Processing, vol. 63, no. 1,

pp. 104–118, 2015.

[89] S. M. Kay, Fundamentals of Statistical Signal Processing: Estimation Theory. USA:

Prentice-Hall, Inc., 1993.

[90] E. Sharma, S. S. Chauhan, and R. Budhiraja, “Weighted sum energy efficiency opti-

mization for massive MIMO two-way half-duplex AF relaying,” IEEE Wireless Commu-

nications Letters, vol. 8, no. 1, pp. 25–28, 2019.

[91] Z. Zhang, Z. Chen, M. Shen, and B. Xia, “Spectral and energy efficiency of multipair

two-way full-duplex relay systems with massive MIMO,” IEEE Journal on Selected Areas

in Communications, vol. 34, no. 4, pp. 848–863, 2016.

[92] E. Sharma, R. Budhiraja, K. Vasudevan, and L. Hanzo, “Full-duplex massive MIMO

multi-pair two-way AF relaying: Energy efficiency optimization,” IEEE Transactions on


[93] M. S. Lobo, L. Vandenberghe, S. Boyd, and H. Lebret, “Applications of second-order

cone programming,” Linear Algebra and its Applications, vol. 284, no. 1, pp. 193–228, 1998.

[94] A. Beck, A. Ben-Tal, and L. Tetruashvili, “A sequential parametric convex approxima-

tion method with applications to nonconvex truss topology design problems,” J. Global

Optimization, vol. 47, no. 1, pp. 29–51, 2010.

[95] A. Ben-Tal and A. Nemirovski, Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications. SIAM, 2001, vol. 2.

[96] E. Björnson, J. Hoydis, and L. Sanguinetti, “Massive MIMO networks: Spectral, energy, and hardware efficiency,” Foundations and Trends® in Signal Processing, vol. 11, no. 3-4, pp. 154–655, 2017.

[97] R. T. Rockafellar, “Monotone operators and the proximal point algorithm,” SIAM Journal on Control and Optimization, vol. 14, no. 5, pp. 877–898, 1976.

[98] A. M. Tulino and S. Verdú, “Random matrix theory and wireless communications,” Foundations and Trends® in Communications and Information Theory, vol. 1, no. 1, pp. 1–182, Jun. 2004.

[99] N. Timm, Applied Multivariate Analysis, ser. Springer Texts in Statistics. Springer

New York.

List of Publications

Journal papers: (Refereed)

1. Ekant Sharma, Dheeraj Amudala and Rohit Budhiraja, “Energy Efficiency Optimization of Massive MIMO FD Relay With Quadratic Programming,” in IEEE Transactions on Wireless Communications, vol. 19, no. 2, pp. 1429-1448, Feb. 2020

2. Ekant Sharma, Swadha Siddhi Chauhan and Rohit Budhiraja, “Decentralized WSEE

Optimization for Massive MIMO Two-Way Half-Duplex AF Relaying,” in IEEE Trans-

actions on Wireless Communications, vol. 19, no. 2, pp. 1397-1414, Feb. 2020

3. Dheeraj Amudala, Ekant Sharma and Rohit Budhiraja, “Spectral and Energy Efficiency of Multipair Two-way Full-Duplex Spatially Correlated Massive MIMO MRC/MRT Relaying,” in IEEE Transactions on Communications, vol. 67, no. 12, pp. 8346-8364, Dec. 2019

4. Ekant Sharma, Arpita Singh Chauhan and Rohit Budhiraja, “Transceiver Design for

Massive MIMO Two-Way Half-Duplex AF Hybrid Relay With MIMO Users,” in IEEE

Transactions on Vehicular Technology, vol. 68, no. 9, pp. 8759-8774, Sept. 2019

5. Ekant Sharma, Rohit Budhiraja, K Vasudevan and Lajos Hanzo, “Full-Duplex Mas-

sive MIMO Multi-Pair Two-Way AF Relaying: Energy Efficiency Optimization,” in

IEEE Transactions on Communications, vol. 66, no. 8, pp. 3322-3340, Aug. 2018

6. Vikalp Mandawaria, Ekant Sharma and Rohit Budhiraja, “Energy-Efficient Massive

MIMO Multi-Relay NOMA Systems With CSI errors,” to appear in IEEE Transactions

on Communications

7. Ekant Sharma, Neha Gupta, Sauradeep Dey and Rohit Budhiraja, “Hybrid Massive

MIMO Two-Way Relaying With Users And Relay Hardware Impairments”, in IEEE

Signal Processing Letters, vol. 27, pp. 486-490, 2020

8. Ekant Sharma, Swadha Siddhi Chauhan, and Rohit Budhiraja, “Weighted Sum Energy Efficiency Optimization for Massive MIMO Two-Way Half-Duplex AF Relaying,” IEEE Wireless Communications Letters, vol. 8, no. 1, Feb. 2019

9. Venkatesh Tentu, Ekant Sharma, Rohit Budhiraja, “WSEE Optimization Using Asyn-

chronous ADMM For Massive MIMO Two-Way Relaying”, to appear in IEEE Com-

munications Letters, 2020

10. Vikalp Mandawaria, Ekant Sharma, Rohit Budhiraja, “WSEE Optimization of mmWave

NOMA Systems,” in IEEE Communications Letters, vol. 23, no. 8, pp. 1413-1417,

Aug. 2019

11. Sauradeep Dey, Ekant Sharma, and Rohit Budhiraja, “Scaling Analysis of Hardware-Impaired Two-Way Full-Duplex Massive MIMO Relay,” IEEE Communications Letters, vol. 23, no. 7, July 2019

12. Ekant Sharma, Ashish Shukla, and Rohit Budhiraja, “Spectral- and Energy-Efficiency

of Massive MIMO Two-Way Half-Duplex Hybrid Processing AF Relay,” IEEE Wireless

Communications Letters, vol. 7, no. 5, Oct. 2018

13. Prem Singh, Ekant Sharma, K Vasudevan and Rohit Budhiraja, “CFO and Channel

Estimation for Frequency Selective MIMO-FBMC/OQAM Systems,” IEEE Wireless

Communications Letters, vol. 7, no. 5, Oct. 2018

14. DN Amudala, A Rajoriya, Ekant Sharma, S Dey, Rohit Budhiraja, “Massive MIMO

multi-pair two-way half-duplex AF FDD relaying: channel estimation”, CSI Transac-

tions on ICT, Springer, 2019

15. Ekant Sharma, Himanshu B Mishra, K Vasudevan and Rohit Budhiraja, “PAPR

Analysis of Superimposed Training Based SISO/MIMO-OFDM Systems With Orthog-

onal Affine Precoder,” Elsevier Physical Communications, Volume 25, Part 1, December

2017, Pages 239-248

16. Ekant Sharma, S. Rane, and K Vasudevan, “BER Efficient Interleaved OFDM Sys-

tem,” Wireless Personal Communications, Springer, 98, no. 1 (2018): 1531-1546

Conference papers: (Refereed)

1. Vikalp Mandawaria, Ekant Sharma and Rohit Budhiraja, “Spectral Efficiency for

Massive MIMO Multi-Relay NOMA Systems with CSI errors,” to appear in IEEE 28th

European Signal Processing Conference (EUSIPCO 2020), Amsterdam, Netherlands,

Jan, 2021.

2. Sauradeep Dey, Ekant Sharma and Rohit Budhiraja, “Dynamic Resolution ADC/DAC

massive MIMO FD Relaying System Over Correlated Rician Channel,” to appear in

IEEE 28th European Signal Processing Conference (EUSIPCO 2020), Amsterdam,

Netherlands, Jan, 2021.

3. Dheeraj Naidu Amudala, Ekant Sharma and Rohit Budhiraja, “Spatially-Correlated

Hardware-Impaired Massive MIMO FD Relaying With MIMO Users,” to appear in

IEEE ICC 2020 Workshop on Full-Duplex Communications for Future Wireless Net-

works, Dublin, Ireland, Jun, 2020

4. Sauradeep Dey, Ekant Sharma and Rohit Budhiraja, “Impact of User and Relay

Hardware Impairments on Spectral Efficiency of HD Massive MIMO Relay,” to appear

in IEEE SPCOM 2020, Bangalore, India, July, 2020

5. Soumyadeep Dutta, Ekant Sharma and Rohit Budhiraja, “Power Scaling for Massive

MIMO UAV Communication System,” IEEE 12th International Conference on com-

munication systems and networks (COMSNETS), Bengaluru, India, 2020, pp. 507-510

6. Venkatesh Tentu, Dheeraj Amudala, Anupama Rajoriya, Ekant Sharma and Rohit

Budhiraja, “Energy Efficient Multi-Pair Massive MIMO Two-Way AF Relaying: A

Deep Learning Approach,” IEEE 12th International Conference on communication sys-

tems and networks (COMSNETS), Bengaluru, India, 2020, pp. 440-445

7. Sauradeep Dey, Ekant Sharma, and Rohit Budhiraja, “Multi-Pair Two-way Full-

Duplex Massive MIMO Relaying with Non-Ideal Hardware,” IEEE Global Communi-

cations Conference (GLOBECOM), Waikoloa, HI, USA, 2019, pp. 1-6

8. Ekant Sharma, Dheeraj Amudala and Rohit Budhiraja, “Energy Efficiency Optimization of Massive MIMO FD Relay Using Quadratic Programming,” 2019 IEEE 20th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC)

9. Neha Gupta, Ekant Sharma, Sauradeep Dey and Rohit Budhiraja, “Spectral Effi-

ciency of Multi-pair Two-Way Massive MIMO Relay With Correlated Hardware Dis-

tortion,” 2019 IEEE 20th International Workshop on Signal Processing Advances in

Wireless Communications (SPAWC)

10. Ekant Sharma and Rohit Budhiraja, “QoS-Constrained Energy-Efficient AF Two-

Way Full-Duplex Relaying with Massive Antennas,” Proceedings of IEEE SPCOM

2018 (Invited paper), IISc Bangalore, India, pp. 1-6, Jul. 2018

11. Arpita Chauhan, Ekant Sharma, and Rohit Budhiraja, “Hybrid Block Diagonaliza-

tion for Massive MIMO Two-Way Half-Duplex AF Hybrid Relay,” Proceedings of IEEE

SPCOM 2018, IISc Bangalore, India, pp. 1-6, Jul. 2018

12. Ekant Sharma, Ashish Kant Shukla, and Rohit Budhiraja, “Spectral- and Energy-

Efficiency for Massive MIMO Two-Way Full-Duplex Hybrid Processing AF Relay,”

Proceedings of IEEE SPCOM 2018, IISc Bangalore, India, pp. 1-6, Jul. 2018

13. Ekant Sharma, Rohit Budhiraja and K Vasudevan, “Multi-Pair Two Way AF Full-Duplex Massive MIMO Relaying with ZFR/ZFT Processing,” Proceedings of IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), Montreal, QC, Canada, October 8-13, 2017

14. Ekant Sharma, H. B. Mishra, and K Vasudevan, “PAPR Analysis of Superimposed

Training Based MIMO-OFDM Systems using an Orthogonal Affine Precoder,” IEEE

13th International Conference INDICON, Bengaluru, India, Dec 2016

15. Ekant Sharma, H. B. Mishra, and K. Vasudevan, “Training Sequence Optimization

for Estimating the Channel in the Presence of Colored Interference for MIMO-OFDM

Systems,” IEEE Region 10 Conference (TENCON), Singapore, Nov 2016

16. Ekant Sharma and K Vasudevan, “PAPR and BER Minimized OFDM Systems with

Low Complexity Channel Independent Precoders,” 22nd IEEE Symposium on Commu-

nications and Vehicular Technology in the Benelux (SCVT), Luxembourg, Nov 2015
