IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 8, AUGUST 2006 3789
A Converse Result for the Discrete Memoryless Relay Channel With Relay–Transmitter Feedback
Shraga I. Bross, Member, IEEE
Abstract—A converse result is proved for the discrete memoryless relay channel in the presence of relay–transmitter causal feedback. The result subsumes the Cover–El Gamal max-flow min-cut upper bound on the capacity of the one-way relay channel, and it is used to establish the partial feedback capacity of the semideterministic relay channel.
Index Terms—Discrete memoryless relay channel, partial feedback.
I. INTRODUCTION
The relay channel model, first introduced by Van der Meulen in [1], describes a single-user communication channel where a relay helps a sender–receiver pair in their communication. In [2], Cover and El Gamal suggested two coding approaches for the discrete memoryless relay channel and established three achievability results; the first two are based on each of their coding strategies while the third builds on a combination of both strategies. Additionally, the authors proved in [2] a converse result for the one-way relay channel; the so-called max-flow min-cut upper bound.
In [4], the authors considered the discrete memoryless relay channel with partial feedback; namely, either receiver–transmitter or relay–transmitter feedback. The coding techniques presented in [4] yield an achievability result which is analogous to the third result in [2], albeit for the relay–transmitter partial feedback scenario.
The main contribution of this work is a derivation of an infeasibility result for the relay channel with relay–transmitter feedback. This result subsumes (as expected) the max-flow min-cut upper bound and, furthermore, it demonstrates the role that feedback plays in helping both the transmitter and the relay encoder to build correlation. We show that for the semideterministic relay channel, relay–transmitter feedback does not improve on the one-way capacity that has been reported in [3].
II. DEFINITIONS AND MAIN RESULT
The discrete memoryless relay channel is a triple $(\mathcal{X}_1 \times \mathcal{X}_2,\ p(y, y_1 \mid x_1, x_2),\ \mathcal{Y} \times \mathcal{Y}_1)$ where $\mathcal{X}_1$ and $\mathcal{X}_2$ are finite sets corresponding to the input alphabets of the sender and the relay, respectively; the finite sets $\mathcal{Y}_1$ and $\mathcal{Y}$ are the output alphabets of the relay and the receiver, respectively; while $p(\cdot, \cdot \mid x_1, x_2)$ is a collection of probability laws on $\mathcal{Y} \times \mathcal{Y}_1$ indexed by the input symbols $x_1 \in \mathcal{X}_1$ and $x_2 \in \mathcal{X}_2$. The channel's law extends to $n$-tuples according to the memoryless law
$$\Pr(y_k, y_{1,k} \mid x_1^k, x_2^k, y^{k-1}, y_1^{k-1}) = p(y_k, y_{1,k} \mid x_{1,k}, x_{2,k})$$
where $x_{1,k}, x_{2,k}, y_k$ and $y_{1,k}$ denote the inputs and outputs of the channel at time $k$, respectively, and $x_1^k \triangleq (x_{1,1}, \ldots, x_{1,k})$. The investigated communication model is shown in Fig. 1.
Every $n$ channel uses, the sender produces a random integer $W$ that is uniformly distributed over the set $\{1, \ldots, M\}$. In the presence of
Manuscript received December 19, 2005; revised April 2, 2006.
The author is with the Department of Electrical Engineering, Technion–Israel Institute of Technology, Technion City, Haifa 32000, Israel (e-mail: [email protected]).
Communicated by R. W. Yeung, Associate Editor for Shannon Theory.
Digital Object Identifier 10.1109/TIT.2006.878222
relay–transmitter feedback, both encoders are completely described by corresponding sets of $n$ encoding functions. These functions map the message and the sequence of previous relay outputs into the next channel input. Specifically,
$$x_{1,k} = f_{1,k}(W, Y_{1,1}, Y_{1,2}, \ldots, Y_{1,k-1}) \quad (1)$$
$$x_{2,k} = f_{2,k}(Y_{1,1}, Y_{1,2}, \ldots, Y_{1,k-1}) \quad (2)$$
where $x_{2,1}$ is allowed to be a stochastic function.
The decoder observes the sequence of $n$ channel outputs and estimates $W$ based on that. Formally, the decoder is described by a mapping $\phi : \mathcal{Y}^n \to \{1, \ldots, M\}$ such that
$$\hat{w} = \phi(Y_1, \ldots, Y_n). \quad (3)$$
An $(M, n, \epsilon)$-code for the relay with partial feedback consists of two sets of $n$ encoding functions and a decoder mapping $\phi$ such that
$$P_e = \Pr\{\hat{W} \neq W\} \leq \epsilon.$$
A rate $R$ is said to be achievable for the relay with partial feedback if for any $\epsilon > 0$ there exists, for all $n$ sufficiently large, an $(M, n, \epsilon)$-code such that $(1/n) \ln M \geq R$.
Our main result can now be stated as follows.
Theorem 1: Let $(\mathcal{X}_1 \times \mathcal{X}_2,\ p(y, y_1 \mid x_1, x_2),\ \mathcal{Y} \times \mathcal{Y}_1)$ be a discrete memoryless relay channel with causal noiseless feedback from relay to sender. If
$$R > \sup_{p} \min \left\{ I(X_1 X_2; Y),\ I(X_1; Y Y_1 \mid X_2 V) + I(V; Y_1 \mid X_2) \right\} \quad (4)$$
then there exists $\epsilon > 0$ such that $P_e > \epsilon$ for all $n$. Here the supremum in (4) is taken over all laws on $\mathcal{V} \times \mathcal{X}_1 \times \mathcal{X}_2 \times \mathcal{Y} \times \mathcal{Y}_1$ of the form
$$p_{V X_1 X_2 Y Y_1}(v, x_1, x_2, y, y_1) = p_{V X_1 X_2}(v, x_1, x_2)\, p(y, y_1 \mid x_1, x_2) \quad (5)$$
and the cardinality of $\mathcal{V}$ is bounded by $\|\mathcal{V}\| \leq \|\mathcal{X}_1\| \|\mathcal{X}_2\| \|\mathcal{Y}_1\| + 1$.
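The two terms of the bound (4) can be evaluated numerically once a joint law of the form (5) is fixed. The following is a minimal sketch, not part of the original paper: the binary alphabets, the strictly positive law $p(v, x_1, x_2)$, and the noisy channel are all hypothetical choices made purely for illustration.

```python
import itertools
from math import log

def marg(p, idx):
    """Marginal of a dict-based pmf p onto the coordinate positions idx."""
    out = {}
    for k, q in p.items():
        key = tuple(k[i] for i in idx)
        out[key] = out.get(key, 0.0) + q
    return out

def cond_mi(p, A, B, C):
    """I(A;B|C) in nats; A, B, C are tuples of coordinate positions."""
    pABC, pAC, pBC, pC = marg(p, A+B+C), marg(p, A+C), marg(p, B+C), marg(p, C)
    s = 0.0
    for k, q in pABC.items():
        if q > 0.0:
            a, b, c = k[:len(A)], k[len(A):len(A)+len(B)], k[len(A)+len(B):]
            s += q * log(q * pC[c] / (pAC[a + c] * pBC[b + c]))
    return s

# Hypothetical ingredients: an arbitrary strictly positive law p(v, x1, x2)
# over binary alphabets, and a channel p(y, y1 | x1, x2) whose two outputs
# are conditionally independent given the inputs.
w = {(v, x1, x2): 1.0 + v + 2*x1 + 3*x2 + v*x1
     for v, x1, x2 in itertools.product((0, 1), repeat=3)}
tot = sum(w.values())
p_vx = {k: q / tot for k, q in w.items()}

def chan(y, y1, x1, x2):
    py = 0.9 if y == (x1 ^ x2) else 0.1     # receiver sees a noisy XOR
    py1 = 0.95 if y1 == x1 else 0.05        # relay sees a noisy copy of x1
    return py * py1

# Joint law of the form (5): coordinates (v, x1, x2, y, y1) = positions 0..4.
joint = {(v, x1, x2, y, y1): q * chan(y, y1, x1, x2)
         for (v, x1, x2), q in p_vx.items()
         for y, y1 in itertools.product((0, 1), repeat=2)}

term1 = cond_mi(joint, (1, 2), (3,), ())            # I(X1 X2; Y)
term2 = (cond_mi(joint, (1,), (3, 4), (2, 0))       # I(X1; Y Y1 | X2 V)
         + cond_mi(joint, (0,), (4,), (2,)))        # I(V; Y1 | X2)
print(term1, term2, min(term1, term2))              # all in nats
```

The supremum in (4) would then be approximated by searching over laws $p(v, x_1, x_2)$; the sketch only evaluates the objective for one fixed law.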
Our converse result for the relay in the presence of relay–sender feedback subsumes the converse result of Cover and El Gamal for the one-way general relay channel. They proved the following.
Theorem 2 [2, Theorem 4]: Consider the discrete memoryless relay channel $(\mathcal{X}_1 \times \mathcal{X}_2,\ p(y, y_1 \mid x_1, x_2),\ \mathcal{Y} \times \mathcal{Y}_1)$. If
$$R > \sup_{p} \min \left\{ I(X_1 X_2; Y),\ I(X_1; Y Y_1 \mid X_2) \right\} \quad (6)$$
then there exists $\epsilon > 0$ such that $P_e > \epsilon$ for all $n$. Here the supremum in (6) is taken over all laws on $\mathcal{X}_1 \times \mathcal{X}_2 \times \mathcal{Y} \times \mathcal{Y}_1$ of the form
$$p_{X_1 X_2 Y Y_1}(x_1, x_2, y, y_1) = p_{X_1 X_2}(x_1, x_2)\, p(y, y_1 \mid x_1, x_2). \quad (7)$$
Evidently, the substitution of $V = \emptyset$ in (4) and (5) yields (6) and (7). The converse result in the presence of partial feedback reveals the existence of an auxiliary random variable $V$ which plays a role in building correlation between the encoders. In [4], the following achievability result has been reported for the relay in the presence of relay–sender feedback.
0018-9448/$20.00 © 2006 IEEE
Fig. 1. The relay channel with relay–transmitter feedback.
Theorem 3 [4, Theorem 3]: Consider the discrete memoryless relay channel $(\mathcal{X}_1 \times \mathcal{X}_2,\ p(y, y_1 \mid x_1, x_2),\ \mathcal{Y} \times \mathcal{Y}_1)$ with causal noiseless feedback from relay to sender. Then, the rate $\bar{R}_1$ defined by
$$\bar{R}_1 = \sup_{p} \min \left\{ I(X_1 X_2; Y) - I(\hat{Y}_1; Y_1 \mid X_1 X_2 \tilde{V} Y),\ I(X_1; Y \hat{Y}_1 \mid X_2 \tilde{V}) + I(\tilde{V}; Y_1 \mid X_2) \right\} \quad (8)$$
is achievable. Here, the supremum in (8) is taken over all laws on $\tilde{\mathcal{V}} \times \mathcal{X}_1 \times \mathcal{X}_2 \times \mathcal{Y} \times \mathcal{Y}_1 \times \hat{\mathcal{Y}}_1$ of the form
$$p_{\tilde{V} X_1 X_2 Y Y_1 \hat{Y}_1}(\tilde{v}, x_1, x_2, y, y_1, \hat{y}_1) = p_{\tilde{V} X_1 X_2}(\tilde{v}, x_1, x_2)\, p(y, y_1 \mid x_1, x_2)\, p_{\hat{Y}_1 \mid \tilde{V} X_2 Y_1}(\hat{y}_1 \mid \tilde{v}, x_2, y_1). \quad (9)$$
Comparing (4) and (5) with (8) and (9), one can see that the achievability result indeed builds upon such an auxiliary random variable $\tilde{V}$. Yet, the gap between the right-hand side (RHS) of (4) and the RHS of (8) is due to the fact that $\hat{Y}_1 \neq Y_1$. In particular, for the semideterministic relay channel, the capacity of which has been reported in [3], relay–sender partial feedback (as expected) does not increase the capacity, as stated below.
Corollary: If $y_1$ is a deterministic function of $x_1$ and $x_2$, then
$$C = \sup_{p} \min \left\{ I(X_1 X_2; Y),\ I(X_1; Y \mid X_2 Y_1) + H(Y_1 \mid X_2) \right\} \quad (10)$$
where the supremum is taken over all joint laws $p_{X_1 X_2}(x_1, x_2)$.
Proof of Corollary: The substitution of $\hat{Y}_1 = \emptyset$ and $\tilde{V} = Y_1$ in (8) and (9) yields the achievability of (10). The converse part of the corollary follows by the fact that, for the semideterministic relay channel,
$$\begin{aligned}
I(X_1; Y Y_1 \mid X_2 V) + I(V; Y_1 \mid X_2)
&= I(X_1; Y_1 \mid X_2 V) + I(X_1; Y \mid Y_1 X_2 V) + I(V; Y_1 \mid X_2) \\
&\overset{(i)}{=} I(X_1; Y \mid Y_1 X_2 V) + H(Y_1 \mid X_2) \\
&= H(Y \mid Y_1 X_2 V) - H(Y \mid Y_1 X_1 X_2 V) + H(Y_1 \mid X_2) \\
&\overset{(ii)}{=} H(Y \mid Y_1 X_2 V) - H(Y \mid Y_1 X_1 X_2) + H(Y_1 \mid X_2) \\
&\overset{(iii)}{\leq} H(Y \mid Y_1 X_2) - H(Y \mid Y_1 X_1 X_2) + H(Y_1 \mid X_2) \\
&= I(X_1; Y \mid Y_1 X_2) + H(Y_1 \mid X_2).
\end{aligned}$$
Here
(i) follows by the deterministic relation between $y_1$ and $(x_1, x_2)$, which implies $H(Y_1 \mid X_1 X_2 V) = 0$;
(ii) follows by the Markov chain $V - (X_1 X_2) - (Y Y_1)$; and
(iii) holds since conditioning reduces entropy.
The converse part of the corollary reveals that, for the semideterministic relay channel, the choice $V = Y_1$ provides simultaneously the un-conditioning in step (iii) as well as the supremum over all laws $p_{V X_1 X_2}$; that is,
$$\sup_{p} \left\{ I(X_1; Y Y_1 \mid X_2 V) + I(V; Y_1 \mid X_2) \right\} = \sup_{p} \left\{ I(X_1; Y \mid Y_1 X_2) + H(Y_1 \mid X_2) \right\}.$$
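The equality obtained under the choice $V = Y_1$ can be checked numerically on a toy semideterministic channel. The following sketch is purely illustrative: the channel with $y_1 = x_1 \oplus x_2$ and the input law $p(x_1, x_2)$ are hypothetical choices, not taken from the paper.

```python
from math import log

def marg(p, idx):
    """Marginal of a dict-based pmf p onto the coordinate positions idx."""
    out = {}
    for k, q in p.items():
        key = tuple(k[i] for i in idx)
        out[key] = out.get(key, 0.0) + q
    return out

def cond_mi(p, A, B, C):
    """I(A;B|C) in nats; A, B, C are tuples of coordinate positions."""
    pABC, pAC, pBC, pC = marg(p, A+B+C), marg(p, A+C), marg(p, B+C), marg(p, C)
    s = 0.0
    for k, q in pABC.items():
        if q > 0.0:
            a, b, c = k[:len(A)], k[len(A):len(A)+len(B)], k[len(A)+len(B):]
            s += q * log(q * pC[c] / (pAC[a + c] * pBC[b + c]))
    return s

def cond_ent(p, A, C):
    """H(A|C) in nats."""
    pAC, pC = marg(p, A + C), marg(p, C)
    return -sum(q * log(q / pC[k[len(A):]]) for k, q in pAC.items() if q > 0.0)

# Hypothetical semideterministic channel: y1 = x1 XOR x2 at the relay, while
# the receiver output y is a noisy copy of x1 whose noise level depends on x2.
p_x = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}  # arbitrary p(x1,x2)

def p_y(y, x1, x2):
    good = 0.8 if x2 == 0 else 0.6
    return good if y == x1 else 1.0 - good

# Coordinates (v, x1, x2, y, y1), positions 0..4, with the choice V = Y1.
joint = {}
for (x1, x2), q in p_x.items():
    y1 = x1 ^ x2
    for y in (0, 1):
        joint[(y1, x1, x2, y, y1)] = q * p_y(y, x1, x2)

lhs = (cond_mi(joint, (1,), (3, 4), (2, 0))   # I(X1; Y Y1 | X2 V)
       + cond_mi(joint, (0,), (4,), (2,)))    # I(V; Y1 | X2)
rhs = (cond_mi(joint, (1,), (3,), (2, 4))     # I(X1; Y | X2 Y1)
       + cond_ent(joint, (4,), (2,)))         # H(Y1 | X2)
print(lhs, rhs)   # the two sides coincide for V = Y1
```

The agreement of `lhs` and `rhs` mirrors steps (i)–(iii) of the proof: with $V \equiv Y_1$ the inequality (iii) holds with equality.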
We now turn to the proof of our main result.
Proof of Theorem 1: Suppose there exists an $(M, n, \epsilon)$-code for the relay in the presence of causal relay–sender feedback. The probability mass function on the joint ensemble $(W, \boldsymbol{X}_1, \boldsymbol{X}_2, \boldsymbol{Y}, \boldsymbol{Y}_1)$ is given by
$$p(w, \boldsymbol{x}_1, \boldsymbol{x}_2, \boldsymbol{y}, \boldsymbol{y}_1) = \frac{1}{M} \prod_{k=1}^{n} p(x_{1,k} \mid w, y_1^{k-1})\, p(x_{2,k} \mid y_1^{k-1})\, p(y_k, y_{1,k} \mid x_{1,k}, x_{2,k})$$
where we use the notation $\boldsymbol{Y} \triangleq (Y_1, \ldots, Y_n)$.
Now, the Fano inequality yields
$$H(W \mid \hat{W}) \leq \epsilon \ln M + h(\epsilon) \triangleq n \epsilon_n(\epsilon) \quad (11)$$
where $h(\cdot)$ denotes the binary entropy function and $\epsilon_n(\epsilon) \to 0$ as $\epsilon \to 0$. From (3) and (11) it follows that
$$H(W \mid Y^n) \leq H(W \mid \hat{W}) \leq n \epsilon_n(\epsilon) \quad (12)$$
where we use the notation $Y^n \triangleq (Y_1, \ldots, Y_n)$.
Consider the identity
$$nR = H(W) = I(W; Y^n) + H(W \mid Y^n).$$
Combining this with (12) we obtain
$$nR \leq I(W; Y^n) + n \epsilon_n(\epsilon).$$
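The Fano bound (11) can be illustrated with a quick numerical check. In the sketch below, the values of $M$ and $\epsilon$ are hypothetical, and we use the standard worst-case conditional law: given a decoding error, $W$ is uniform over the $M-1$ wrong messages, which maximizes $H(W \mid \hat{W})$ for a given error probability.

```python
from math import log

M, eps = 1024, 0.01           # hypothetical message-set size and error probability
# Binary entropy h(eps), in nats.
h = -eps * log(eps) - (1.0 - eps) * log(1.0 - eps)
# For the worst-case conditional law, H(W | W_hat) = h(eps) + eps * log(M - 1).
H_w_given_hat = h + eps * log(M - 1)
fano = eps * log(M) + h       # the right-hand side of (11)
print(H_w_given_hat, fano)    # the first value never exceeds the second
```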
We now proceed with a chain of inequalities for $I(W; Y^n)$, where the explanations follow:
$$\begin{aligned}
I(W; \boldsymbol{Y}) &\leq I(W; \boldsymbol{Y}, \boldsymbol{Y}_1) \\
&\overset{(a)}{=} \sum_{k=1}^{n} I(W; Y_k Y_{1,k} \mid Y^{k-1} Y_1^{k-1} X_{2,k}) \\
&= \sum_{k=1}^{n} \left[ H(Y_k Y_{1,k} \mid Y^{k-1} Y_1^{k-1} X_{2,k}) - H(Y_k Y_{1,k} \mid W Y^{k-1} Y_1^{k-1} X_{2,k}) \right] \\
&\overset{(b)}{=} \sum_{k=1}^{n} \left[ H(Y_k Y_{1,k} \mid Y^{k-1} Y_1^{k-1} X_{2,k}) - H(Y_k Y_{1,k} \mid X_{1,k} X_{2,k}) \right] \\
&\overset{(c)}{=} \sum_{k=1}^{n} \left[ H(Y_k \mid Y^{k-1} Y_1^{k-1} X_{2,k}) - H(Y_k \mid X_{1,k} Y^{k-1} Y_1^{k-1} X_{2,k}) \right. \\
&\qquad\qquad \left. + H(Y_{1,k} \mid Y^{k-1} Y_1^{k-1} X_{2,k} Y_k) - H(Y_{1,k} \mid X_{1,k} X_{2,k} Y_k) \right] \\
&= \sum_{k=1}^{n} \left[ I(Y_k; X_{1,k} \mid Y^{k-1} Y_1^{k-1} X_{2,k}) + H(Y_{1,k} \mid Y^{k-1} Y_1^{k-1} X_{2,k} Y_k) - H(Y_{1,k} \mid X_{1,k} X_{2,k} Y_k) \right. \\
&\qquad\qquad \left. + I(Y^{k-1} Y_1^{k-1}; Y_{1,k} \mid X_{2,k}) - H(Y_{1,k} \mid X_{2,k}) + H(Y_{1,k} \mid X_{2,k} Y^{k-1} Y_1^{k-1}) \right] \\
&\overset{(d)}{\leq} \sum_{k=1}^{n} \left[ I(Y_k; X_{1,k} \mid Y^{k-1} Y_1^{k-1} X_{2,k}) + I(Y^{k-1} Y_1^{k-1}; Y_{1,k} \mid X_{2,k}) \right. \\
&\qquad\qquad \left. + H(Y_{1,k} \mid Y^{k-1} Y_1^{k-1} X_{2,k} Y_k) - H(Y_{1,k} \mid X_{1,k} X_{2,k} Y_k Y^{k-1} Y_1^{k-1}) \right] \\
&= \sum_{k=1}^{n} \left[ I(Y_k; X_{1,k} \mid Y^{k-1} Y_1^{k-1} X_{2,k}) + I(Y^{k-1} Y_1^{k-1}; Y_{1,k} \mid X_{2,k}) + I(Y_{1,k}; X_{1,k} \mid X_{2,k} Y_k Y^{k-1} Y_1^{k-1}) \right] \\
&= \sum_{k=1}^{n} \left[ I(X_{1,k}; Y_k Y_{1,k} \mid Y^{k-1} Y_1^{k-1} X_{2,k}) + I(Y^{k-1} Y_1^{k-1}; Y_{1,k} \mid X_{2,k}) \right].
\end{aligned}$$
Here
(a) follows by the functional relationship (2);
(b) from the functional relationship (1) and from the Markovity of $(W, Y^{k-1}, Y_1^{k-1}) - (X_{1,k} X_{2,k}) - (Y_k Y_{1,k})$;
(c) from the fact that $Y_k$ is conditionally independent of $(Y^{k-1}, Y_1^{k-1})$ given $(X_{1,k}, X_{2,k})$; and
(d) since conditioning reduces entropy.
Define
$$V_k \triangleq (Y^{k-1}, Y_1^{k-1});$$
then we have shown that
$$I(W; \boldsymbol{Y}) \leq \sum_{k=1}^{n} \left[ I(X_{1,k}; Y_k Y_{1,k} \mid X_{2,k} V_k) + I(V_k; Y_{1,k} \mid X_{2,k}) \right].$$
Now, let $Z$ be a random variable independent of $(\boldsymbol{V}, \boldsymbol{X}_1, \boldsymbol{X}_2, \boldsymbol{Y}, \boldsymbol{Y}_1)$, uniformly distributed over the set $\{1, \ldots, n\}$, and set
$$X_1 \triangleq X_{1,Z}, \quad X_2 \triangleq X_{2,Z}, \quad Y \triangleq Y_Z, \quad Y_1 \triangleq Y_{1,Z}, \quad V \triangleq V_Z.$$
Then
$$\frac{1}{n} \sum_{k=1}^{n} \left[ I(X_{1,k}; Y_k Y_{1,k} \mid X_{2,k} V_k) + I(V_k; Y_{1,k} \mid X_{2,k}) \right] = I(X_1; Y Y_1 \mid X_2 V Z) + I(V; Y_1 \mid X_2 Z). \quad (13)$$
Next,
$$\begin{aligned}
I(X_1; Y Y_1 \mid X_2 V Z) &= H(Y Y_1 \mid X_2 V Z) - H(Y Y_1 \mid X_1 X_2 V Z) \\
&\overset{(e)}{=} H(Y Y_1 \mid X_2 V Z) - H(Y Y_1 \mid X_1 X_2) \\
&\overset{(f)}{\leq} H(Y Y_1 \mid X_2 V) - H(Y Y_1 \mid X_1 X_2) \\
&\overset{(e)}{=} H(Y Y_1 \mid X_2 V) - H(Y Y_1 \mid X_1 X_2 V) \\
&= I(X_1; Y Y_1 \mid X_2 V)
\end{aligned}$$
$$\begin{aligned}
I(V; Y_1 \mid X_2 Z) &= H(Y_1 \mid X_2 Z) - H(Y_1 \mid V X_2 Z) \\
&\overset{(e)}{=} H(Y_1 \mid X_2 Z) - H(Y_1 \mid V X_2) \\
&\overset{(f)}{\leq} H(Y_1 \mid X_2) - H(Y_1 \mid V X_2) \\
&= I(V; Y_1 \mid X_2). \quad (14)
\end{aligned}$$
Here
(e) follows by the Markov chain $Z - V - (X_1 X_2) - (Y Y_1)$; and
(f) holds since conditioning reduces entropy.
The combination of (13) and (14) yields
$$\frac{1}{n} \sum_{k=1}^{n} \left[ I(X_{1,k}; Y_k Y_{1,k} \mid X_{2,k} V_k) + I(V_k; Y_{1,k} \mid X_{2,k}) \right] \leq I(X_1; Y Y_1 \mid X_2 V) + I(V; Y_1 \mid X_2). \quad (15)$$
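The two conditioning steps behind (14) can be checked numerically on any law with the Markov structure $Z - V - (X_1 X_2) - (Y Y_1)$. The sketch below is illustrative only: the binary alphabets and the particular factorized law are hypothetical choices that merely enforce the required Markov chain.

```python
import itertools
from math import log

def marg(p, idx):
    """Marginal of a dict-based pmf p onto the coordinate positions idx."""
    out = {}
    for k, q in p.items():
        key = tuple(k[i] for i in idx)
        out[key] = out.get(key, 0.0) + q
    return out

def cond_mi(p, A, B, C):
    """I(A;B|C) in nats; A, B, C are tuples of coordinate positions."""
    pABC, pAC, pBC, pC = marg(p, A+B+C), marg(p, A+C), marg(p, B+C), marg(p, C)
    s = 0.0
    for k, q in pABC.items():
        if q > 0.0:
            a, b, c = k[:len(A)], k[len(A):len(A)+len(B)], k[len(A)+len(B):]
            s += q * log(q * pC[c] / (pAC[a + c] * pBC[b + c]))
    return s

# Hypothetical factorization p(z) p(v|z) p(x1,x2|v) p(y,y1|x1,x2), which
# enforces the Markov chain Z - V - (X1 X2) - (Y Y1).
def p_v(v, z):
    return 0.7 if v == z else 0.3

def p_x(x1, x2, v):
    t = {0: [[0.4, 0.1], [0.2, 0.3]], 1: [[0.1, 0.3], [0.4, 0.2]]}
    return t[v][x1][x2]

def chan(y, y1, x1, x2):
    py = 0.9 if y == (x1 ^ x2) else 0.1
    py1 = 0.95 if y1 == x1 else 0.05
    return py * py1

# Coordinates (z, v, x1, x2, y, y1) = positions 0..5; Z is uniform on {0, 1}.
joint = {(z, v, x1, x2, y, y1):
         0.5 * p_v(v, z) * p_x(x1, x2, v) * chan(y, y1, x1, x2)
         for z, v, x1, x2, y, y1 in itertools.product((0, 1), repeat=6)}

lhs1 = cond_mi(joint, (2,), (4, 5), (3, 1, 0))   # I(X1; Y Y1 | X2 V Z)
rhs1 = cond_mi(joint, (2,), (4, 5), (3, 1))      # I(X1; Y Y1 | X2 V)
lhs2 = cond_mi(joint, (1,), (5,), (3, 0))        # I(V; Y1 | X2 Z)
rhs2 = cond_mi(joint, (1,), (5,), (3,))          # I(V; Y1 | X2)
print(lhs1, rhs1, lhs2, rhs2)
```

On any such law, `lhs1 <= rhs1` and `lhs2 <= rhs2`, matching the two chains in (14).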
The inequality
$$I(W; \boldsymbol{Y}) \leq \sum_{k=1}^{n} I(X_{1,k} X_{2,k}; Y_k) \quad (16)$$
is proved in [2, Lemma 4]. Combining (15) and (16) we conclude that
$$R \leq \sup_{p} \min \left\{ I(X_1 X_2; Y),\ I(X_1; Y Y_1 \mid X_2 V) + I(V; Y_1 \mid X_2) \right\} + \epsilon_n(\epsilon)$$
where the supremum is taken over all joint laws of the form
$$p_{V X_1 X_2 Y Y_1}(v, x_1, x_2, y, y_1) = p_{V X_1 X_2}(v, x_1, x_2)\, p_{Y Y_1 \mid X_1 X_2}(y, y_1 \mid x_1, x_2).$$
Next, notice that
$$I(X_1; Y Y_1 \mid X_2 V) + I(V; Y_1 \mid X_2) = H(Y \mid Y_1 X_2 V) - H(Y Y_1 \mid X_1 X_2) + H(Y_1 \mid X_2)$$
and now a bound on the cardinality of $\mathcal{V}$ can be obtained via the technique presented in [5, Appendix A1].
This completes the proof of Theorem 1.
ACKNOWLEDGMENT
The author would like to thank the reviewer as well as the Associate Editor for their careful reading of the manuscript and their constructive comments.
REFERENCES
[1] E. C. Van der Meulen, "Three terminal communication channels," Adv. Appl. Probab., vol. 3, pp. 120–154, 1971.
[2] T. M. Cover and A. El Gamal, "Capacity theorems for the relay channel," IEEE Trans. Inf. Theory, vol. IT-25, no. 5, pp. 572–584, Sep. 1979.
[3] A. El Gamal and M. Aref, "The capacity of the semi-deterministic relay channel," IEEE Trans. Inf. Theory, vol. IT-28, no. 3, p. 536, May 1982.
[4] Y. Gabbai and S. I. Bross, "Achievable rates for the discrete memoryless relay channel with partial feedback configurations," IEEE Trans. Inf. Theory, submitted for publication.
[5] A. D. Wyner and J. Ziv, "The rate-distortion function for source coding with side information at the decoder," IEEE Trans. Inf. Theory, vol. IT-22, no. 1, pp. 1–11, Jan. 1976.
A Counterexample to Cover's Conjecture on Gaussian Feedback Capacity
Young-Han Kim, Student Member, IEEE
Abstract—We provide a counterexample to Cover's conjecture that the feedback capacity $C_{\mathrm{FB}}$ of an additive Gaussian noise channel under power constraint $P$ be no greater than the nonfeedback capacity $C$ of the same channel under power constraint $2P$, i.e., $C_{\mathrm{FB}}(P) \leq C(2P)$.
Index Terms—Additive Gaussian noise channels, channel capacity, con-jecture, counterexample, feedback.
I. BACKGROUND
Consider the additive Gaussian noise channel
$$Y_i = X_i + Z_i, \quad i = 1, 2, \ldots$$
where the additive Gaussian noise process $\{Z_i\}_{i=1}^{\infty}$ is stationary. It is well known that feedback does not increase the capacity by much. For example, the following relationships hold between the nonfeedback capacity $C(P)$ and the feedback capacity $C_{\mathrm{FB}}(P)$ under the average power constraint $P$:
$$C(P) \leq C_{\mathrm{FB}}(P) \leq 2C(P)$$
$$C(P) \leq C_{\mathrm{FB}}(P) \leq C(P) + \frac{1}{2}.$$
(See Cover and Pombra [5] for rigorous definitions of feedback and nonfeedback capacities and proofs of the above upper bounds. Throughout this paper, the capacity is in bits and the logarithm is to base 2.)
These upper bounds on feedback capacity were later refined by Chen and Yanagi [3] as
$$C_{\mathrm{FB}}(P) \leq \left(1 + \frac{1}{\alpha}\right) C(\alpha P)$$
$$C_{\mathrm{FB}}(P) \leq C(\alpha P) + \frac{1}{2} \log \left(1 + \frac{1}{\alpha}\right)$$
Manuscript received November 4, 2005; revised March 29, 2006. This work was supported in part by the NSF under Grant CCR-0311633.
The author is with the Information Systems Laboratory, Stanford University, Stanford, CA 94305 USA (e-mail: [email protected]).
Communicated by Y. Steinberg, Associate Editor for Shannon Theory.
Digital Object Identifier 10.1109/TIT.2006.878167
Fig. 1. Plot of $x^2$ vs. $(1 + x)(1 - x)^3$.
for any $\alpha > 0$. In particular, taking $\alpha = 2$, we get
$$C_{\mathrm{FB}}(P) \leq \min \left\{ \frac{3}{2} C(2P),\ C(2P) + \frac{1}{2} \log \frac{3}{2} \right\}.$$
In fact, Cover [4] conjectured that
$$C_{\mathrm{FB}}(P) \leq C(2P)$$
and it has long been believed that this conjecture is true. (See Chen and Yanagi [8], [2] for a partial confirmation of Cover's conjecture.)
II. A COUNTEREXAMPLE
Consider the stationary Gaussian noise process $\{Z_i\}_{i=1}^{\infty}$ with power spectral density
$$S_Z(e^{i\theta}) = |1 + e^{i\theta}|^2 = 2(1 + \cos \theta).$$
Now under the power constraint $P = 2$, it can be easily shown [1, Sec. 7.4] that the nonfeedback capacity is achieved by the water-filling input spectrum $S_X(e^{i\theta}) = 2(1 - \cos \theta)$, which yields the output spectrum $S_Y(e^{i\theta}) \equiv 4$ and the capacity
$$C(2) = \int_{-\pi}^{\pi} \frac{1}{2} \log \frac{S_Y(e^{i\theta})}{S_Z(e^{i\theta})}\, \frac{d\theta}{2\pi} = 1 \text{ bit}.$$
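The closed-form value $C(2) = 1$ bit can be reproduced by numerically integrating the capacity expression above. A small sketch (a midpoint Riemann sum; the midpoints avoid the integrable log singularity of $S_Z$ at $\theta = \pm\pi$):

```python
from math import cos, log, pi

def capacity_bits(n=100000):
    """Midpoint Riemann sum of (1/2) log2(S_Y/S_Z) over theta in [-pi, pi],
    normalized by 2*pi, with S_Z = 2(1 + cos theta) and S_Y = 4."""
    total = 0.0
    step = 2.0 * pi / n
    for k in range(n):
        t = -pi + (k + 0.5) * step
        total += 0.5 * log(4.0 / (2.0 * (1.0 + cos(t))), 2)
    return total / n

print(capacity_bits())   # ≈ 1.0 bit
```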
On the other hand, it can be shown [9] that the celebrated Schalkwijk–Kailath feedback coding scheme [6], [7] achieves the data rate
$$R_{\mathrm{SK}}(P) = -\log x_0$$
under power constraint $P$, where $x_0$ is the unique positive root of the equation
$$P x^2 = (1 + x)(1 - x)^3. \quad (1)$$
Now for $P = 1$, we can readily check that the unique positive root $x_0$ of (1) is less than $1/2$, since $f(x) := x^2 - (1 + x)(1 - x)^3$ is strictly increasing and continuous on $[0, 1]$ with $f(0) = -1$ and $f(1/2) = 1/16$; see Fig. 1. Therefore,
$$C_{\mathrm{FB}}(1) \geq R_{\mathrm{SK}}(1) = -\log x_0 > 1 = C(2).$$
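The root $x_0$ of (1) and the resulting rate can be found by bisection, using the monotonicity of $f$ noted above; a quick numerical check of the final inequality:

```python
from math import log

def f(x, P=1.0):
    """f(x) = P*x^2 - (1+x)*(1-x)^3, strictly increasing on [0, 1],
    with the unique positive root x0 of equation (1)."""
    return P * x * x - (1.0 + x) * (1.0 - x) ** 3

lo, hi = 0.0, 1.0            # f(0) = -1 < 0 < 1 = f(1) brackets the root
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if f(mid) < 0.0:         # f increasing: root lies above mid
        lo = mid
    else:
        hi = mid
x0 = 0.5 * (lo + hi)
rate_sk = -log(x0, 2)        # R_SK(1) in bits
print(x0, rate_sk)           # x0 < 1/2, hence R_SK(1) > 1 = C(2)
```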