A Generalization of Arimoto-Blahut Algorithm Mohammad Rezaeian and Alex Grant Institute for Telecommunications Research University of South Australia IEEE International Symposium on Information Theory June 29, 2004


A Generalization of Arimoto-Blahut Algorithm

Mohammad Rezaeian and Alex Grant

Institute for Telecommunications Research, University of South Australia

IEEE International Symposium on Information Theory

June 29, 2004

Outline

1. Introduction

2. Total capacity of multiple access channel.

3. An information function

4. An optimality criterion for total capacity.

5. The generalization of Arimoto-Blahut algorithm for total capacity.

6. Conclusion.


Computation of channel capacity

The (discrete) channel capacity problem

$$C = \max_{p(x)} I(X;Y) \qquad (1)$$

does not have an analytical solution in general, and requires an iterative method.

General iterative methods used

• Arimoto-Blahut algorithm

• Geometric programming (new, ISIT 03)

• Cutting plane algorithm (new, CISS 03)

The convergence analysis of all these methods relies on the convexity of (1).


Optimality criterion for convex problems

A convex maximization problem is a problem of the type

$$\max_{x \in S} f(x)$$

where $S \subset \mathbb{R}^n$ is a convex set and $f(x)$ is a concave function.

• The Kuhn-Tucker condition is a necessary and sufficient condition for optimality.

• Dual methods can be used in a convex problem. The duality gap is zero.

If convexity is impaired, the problem can have local maxima or saddle points satisfying the Kuhn-Tucker condition.

$I(X;Y)$ is a concave function of $p(x)$ for fixed $p(y|x)$, and the simplex $\{p(x)\}$ is a convex set in the space $\mathbb{R}^{|\mathcal{X}|}$, so

$$C = \max_{p(x)} I(X;Y)$$

is a convex problem.
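The concavity of the mutual information in the input distribution can be checked numerically on a small case. The sketch below is only an illustration: the binary symmetric channel, the two input distributions and the function name `mutual_information` are assumptions chosen for the example, not taken from the talk.

```python
import math

def mutual_information(p_x, channel):
    """I(X;Y) in nats for input distribution p_x and channel[x][y] = p(y|x)."""
    n_out = len(channel[0])
    p_y = [sum(p_x[x] * channel[x][y] for x in range(len(p_x)))
           for y in range(n_out)]
    total = 0.0
    for x, px in enumerate(p_x):
        for y in range(n_out):
            w = channel[x][y]
            if px > 0 and w > 0:
                total += px * w * math.log(w / p_y[y])
    return total

# Hypothetical binary symmetric channel with crossover probability 0.1.
bsc = [[0.9, 0.1],
       [0.1, 0.9]]
p, q = [0.9, 0.1], [0.2, 0.8]
lam = 0.5
mix = [lam * a + (1 - lam) * b for a, b in zip(p, q)]

# Concavity: I at the mixture is at least the mixture of the I values.
i_mix = mutual_information(mix, bsc)
i_avg = lam * mutual_information(p, bsc) + (1 - lam) * mutual_information(q, bsc)
print(i_mix, i_avg, i_mix >= i_avg)
```

A single pair of points does not prove concavity, but a violation of the inequality would disprove it.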


Discrete Memoryless Multiple-Access Channel

An $M$-user discrete memoryless multiple-access channel is defined by $M$ input alphabets $\mathcal{X}_m$, $m = 1, \ldots, M$, an output alphabet $\mathcal{Y}$ and a conditional probability distribution $p(y \mid x_1, x_2, \ldots, x_M)$.

• Objective: compute the “Total Capacity”

$$C_{\text{total}} = \max_{p_1(x_1) p_2(x_2) \cdots p_M(x_M)} I(X_1, X_2, \ldots, X_M; Y)$$

• Problem:

– The set of product distributions $\{p_1(x_1) \cdots p_M(x_M)\}$ is a non-convex subset of the space of all distributions

– $I$ is not concave over the convex space $\mathcal{P}_1 \times \mathcal{P}_2 \times \cdots \times \mathcal{P}_M$

So the problem is not a convex problem for $M > 1$.

• For $M = 1$, we have the convex problem of capacity for a point-to-point channel. In general,

– $I(X_1, \ldots, X_M; Y)$ is concave with respect to any one user.

– $I(X_1, \ldots, X_M; Y)$ is not concave over variation across users.
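The loss of concavity across users can be seen on a very small example. The sketch below assumes a hypothetical noiseless binary XOR channel, $Y = X_1 \oplus X_2$ (not a channel from the talk), and compares the total mutual information at two points of the product space with its value at their midpoint. All names are illustrative.

```python
import math

def total_information(p1, p2, channel):
    """I(X1,X2;Y) in bits for independent inputs; channel[x1][x2][y] = p(y|x1,x2)."""
    n_out = len(channel[0][0])
    p_y = [0.0] * n_out
    for a, pa in enumerate(p1):
        for b, pb in enumerate(p2):
            for y in range(n_out):
                p_y[y] += pa * pb * channel[a][b][y]
    total = 0.0
    for a, pa in enumerate(p1):
        for b, pb in enumerate(p2):
            for y in range(n_out):
                w = channel[a][b][y]
                if pa * pb > 0 and w > 0:
                    total += pa * pb * w * math.log2(w / p_y[y])
    return total

# Hypothetical noiseless binary XOR MAC: Y = X1 xor X2.
xor_mac = [[[1.0, 0.0], [0.0, 1.0]],
           [[0.0, 1.0], [1.0, 0.0]]]

point_a = ([0.5, 0.5], [1.0, 0.0])      # user 1 uniform, user 2 always sends 0
point_b = ([1.0, 0.0], [0.5, 0.5])      # roles swapped
midpoint = ([0.75, 0.25], [0.75, 0.25])  # average of the two points, user by user

i_a = total_information(*point_a, xor_mac)
i_b = total_information(*point_b, xor_mac)
i_mid = total_information(*midpoint, xor_mac)
print(i_a, i_b, i_mid)  # 1 bit, 1 bit, and about 0.954 bits
```

The midpoint value falls below the average of the endpoint values, which a concave function would not allow. Moving a single user with the other held fixed does not produce such a violation.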

A recent result on the total capacity

Regular channel: a multiple access channel with $|\mathcal{X}_m| \le |\mathcal{Y}|$ for all $m$.

Non-regular channels can be decomposed into a number of regular subMACs by eliminating input letters.

[Watanabe-Kamoi, ISIT02]

• The Kuhn-Tucker condition is sufficient for optimality for regular channels.

• There exists a subMAC with capacity equal to that of the original channel.

The capacity of a channel is the maximum of the capacities of all its subMACs.

• Objective: an algorithm that converges to a Kuhn-Tucker distribution for regular channels.


Characterization of distributions that satisfy Kuhn-Tucker condition

For the single-user channel

$$C = \max_{p(x)} I(X;Y) \qquad (1)$$

[Gallager] A Kuhn-Tucker distribution for problem (1) is a $p(x)$ such that

$$I(x;Y) = C \quad \text{for } p(x) > 0,$$

$$I(x;Y) \le C \quad \text{for } p(x) = 0,$$

for some constant $C$.

We extend this characterization to the multiple access channel problem

$$C_{\text{total}} = \max_{p_1(x_1) \cdots p_M(x_M)} I(X_1, \ldots, X_M; Y) \qquad (2)$$

A Kuhn-Tucker distribution for problem (2) is a product distribution $\prod_m p_m(x_m)$ such that

$$I(x_m;Y) = C \quad \text{for } p_m(x_m) > 0,$$

$$I(x_m;Y) \le C \quad \text{for } p_m(x_m) = 0,$$

$$m = 1, \ldots, M,$$

for some constant $C$.


The Information Function

$$I(x;Y) = D\big(p(y|x) \,\|\, p(y)\big)$$

• $D(\cdot \| \cdot)$ is the Kullback-Leibler distance.

• $I(x;Y)$ is a measure of the variation in distributions due to the revelation $X = x$: the information contained in the event.

• Gallager defines $I(x;y) = \log \frac{p(y|x)}{p(y)}$ as the mutual information, and $I(x;Y)$ as an average mutual information (with respect to $p(y|x)$).

$$I(X;Y) = \sum_x p(x)\, I(x;Y), \qquad I(x;Y) = \sum_y p(y|x)\, I(x;y)$$
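The information function and Gallager's Kuhn-Tucker characterization can be illustrated on a small single-user channel. The sketch below assumes a hypothetical Z-channel (input 0 is received noiselessly, input 1 is flipped with probability 0.5) and the input distribution (0.6, 0.4); both are choices made for this example, not values from the talk.

```python
import math

def information_function(x, p_x, channel):
    """I(x;Y) = D(p(y|x) || p(y)) in nats, with channel[x][y] = p(y|x).

    Assumes every output reachable from x has p(y) > 0 (true when p_x[x] > 0).
    """
    p_y = [sum(p_x[k] * row[y] for k, row in enumerate(channel))
           for y in range(len(channel[0]))]
    return sum(w * math.log(w / p_y[y])
               for y, w in enumerate(channel[x]) if w > 0)

# Hypothetical Z-channel: rows are inputs, columns are outputs.
z_channel = [[1.0, 0.0],
             [0.5, 0.5]]
p_x = [0.6, 0.4]

per_letter = [information_function(x, p_x, z_channel) for x in range(2)]
# Averaging I(x;Y) over p(x) gives the mutual information I(X;Y).
mutual_information = sum(p * i for p, i in zip(p_x, per_letter))
print(per_letter, mutual_information)
```

For this input distribution both letters give the same value, ln(1.25) nats, so the distribution satisfies the Kuhn-Tucker condition and the common value is the capacity of this channel.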


Extension to a set of variables

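$$I(x_1, \ldots, x_M; Y) = D\big(p(y \mid x_1, \ldots, x_M) \,\|\, p(y)\big)$$

Marginalization to a subset $x_S = \{x_i,\ i \in S\}$, $S \subset \{1, \ldots, M\}$:

$$I(x_S; Y) = \sum_{x_{S^c}} p(x_{S^c} \mid x_S)\, I(x_1, \ldots, x_M; Y)$$

$$I(X_1, \ldots, X_M; Y) = \sum_{x_S} p(x_S)\, I(x_S; Y)$$

With independence, $p(x_{S^c} \mid x_S) = \prod_{j \notin S} p_j(x_j)$, so

$$I(x_m; Y) = \sum_{x_j,\, j \ne m} \Big( \prod_{j \ne m} p_j(x_j) \Big)\, I(x_1, \ldots, x_M; Y)$$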
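For two independent users, the marginal information function of one user is the average of the joint information function over the other user's letters. The sketch below assumes a hypothetical noiseless binary adder channel, $Y = X_1 + X_2$, with uniform inputs; the channel and all function names are illustrative, not taken from the talk.

```python
import math

def output_distribution(p1, p2, channel):
    """p(y) for independent inputs; channel[x1][x2][y] = p(y|x1,x2)."""
    n_out = len(channel[0][0])
    return [sum(p1[a] * p2[b] * channel[a][b][y]
                for a in range(len(p1)) for b in range(len(p2)))
            for y in range(n_out)]

def joint_information(a, b, p_y, channel):
    """I(x1,x2;Y) = D(p(y|x1,x2) || p(y)) in nats."""
    return sum(w * math.log(w / p_y[y])
               for y, w in enumerate(channel[a][b]) if w > 0)

def marginal_information_user1(a, p2, p_y, channel):
    """I(x1;Y): average of I(x1,x2;Y) over the other user's distribution."""
    return sum(p2[b] * joint_information(a, b, p_y, channel)
               for b in range(len(p2)) if p2[b] > 0)

# Hypothetical noiseless binary adder MAC: Y = X1 + X2, with Y in {0, 1, 2}.
adder_mac = [[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
             [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]]
p1, p2 = [0.5, 0.5], [0.5, 0.5]

p_y = output_distribution(p1, p2, adder_mac)
info_user1 = [marginal_information_user1(a, p2, p_y, adder_mac) for a in range(2)]
# Averaging I(x1;Y) over p1 recovers the total mutual information I(X1,X2;Y).
total = sum(p1[a] * info_user1[a] for a in range(2))
print(info_user1, total)  # each value is 1.5 * ln 2 nats, i.e. 1.5 bits
```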


The optimality criterion

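Apply Kuhn-Tucker to the maximization of $I(X_1, \ldots, X_M; Y)$ subject to $\sum_{x_m} p_m(x_m) = 1$ and $p_m(x_m) \ge 0$, $m = 1, \ldots, M$. If $p^*(x) = \prod_m p_m^*(x_m)$ is optimal then there exist unique Lagrange multipliers $\lambda_1, \lambda_2, \ldots, \lambda_M$ and $\mu_m(x_m) \ge 0$ such that

$$\frac{\partial I}{\partial p_m(x_m)} = \lambda_m - \mu_m(x_m),$$

$$\mu_m(x_m) = 0 \quad \text{for } p_m^*(x_m) > 0,$$

$$\mu_m(x_m) \ge 0 \quad \text{for } p_m^*(x_m) = 0.$$

Hence, since $\partial I / \partial p_m(x_m) = I(x_m;Y) - 1$ with natural logarithms,

$$I(x_m;Y) = \lambda_m + 1 \quad \text{for } p_m^*(x_m) > 0,$$

$$I(x_m;Y) \le \lambda_m + 1 \quad \text{for } p_m^*(x_m) = 0.$$

Averaging the first relation over $p_m^*$ gives $\lambda_m + 1 = I(X_1, \ldots, X_M; Y)$ for every $m$, so the constant is the same for all users.

An optimal distribution for total capacity is a product distribution $\prod_m p_m(x_m)$ such that

$$I(x_m;Y) = C \quad \text{for } p_m(x_m) > 0,$$

$$I(x_m;Y) \le C \quad \text{for } p_m(x_m) = 0,$$

$$m = 1, \ldots, M.$$

$C$ is the total capacity.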
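The step from the Kuhn-Tucker conditions to a condition on the information function rests on the partial derivative of the total mutual information with respect to one user's probabilities. A short derivation is sketched below, assuming natural logarithms and writing $x = (x_1, \ldots, x_M)$; the constant $c_m$ is a name introduced here for illustration.

```latex
\begin{align*}
I(X_1,\ldots,X_M;Y) &= \sum_{x} p(x) \sum_{y} p(y|x) \ln\frac{p(y|x)}{p(y)},
  \qquad p(x)=\prod_{j} p_j(x_j), \quad p(y)=\sum_{x} p(x)\,p(y|x) \\
\frac{\partial I}{\partial p_m(x_m)}
  &= \underbrace{\sum_{x_j,\, j\neq m} \prod_{j\neq m} p_j(x_j)
       \sum_{y} p(y|x) \ln\frac{p(y|x)}{p(y)}}_{I(x_m;Y)}
     \;-\; \sum_{y} \frac{\partial p(y)}{\partial p_m(x_m)} \\
  &= I(x_m;Y) - 1 .
\end{align*}
% The last step uses \sum_y \partial p(y) / \partial p_m(x_m)
%   = \sum_{x_j, j \neq m} \prod_{j \neq m} p_j(x_j) \sum_y p(y|x) = 1.
% If I(x_m;Y) = c_m on the support of p_m, then averaging over p_m gives
%   c_m = \sum_{x_m} p_m(x_m) I(x_m;Y) = I(X_1,\ldots,X_M;Y),
% so the constant is the same for every user m.
```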


A General Iterative Algorithm

Define the sequence $p_m^{(n)}(x_m)$, $n = 0, 1, 2, \ldots$, for all $m = 1, \ldots, M$ and $x_m \in \mathcal{X}_m$:

$$p_m^{(n+1)}(x_m) = \frac{1}{K_m}\, p_m^{(n)}(x_m)\, f\big(I_m^{(n)}(x_m; Y)\big)$$

• $f$ is any continuous, monotonically increasing, positive function over $\mathbb{R}^+$.

• $p_m^{(0)}(x_m) > 0$ for all $m$ and $x_m$.

• $I_m^{(n)}(\cdot)$ is based on the updated probabilities of users $1$ to $m-1$.

• $K_m$ is the probability normalization factor for user $m$.

Lemma 1 The limit point of the sequence $p^{(n)}$, if it exists, satisfies the Kuhn-Tucker condition,

i.e., for some constant $C$,

$$\lim_{n \to \infty} p_m^{(n)}(x_m) > 0 \;\Rightarrow\; I(x_m;Y) = C,$$

$$\lim_{n \to \infty} p_m^{(n)}(x_m) = 0 \;\Rightarrow\; I(x_m;Y) \le C,$$

$$m = 1, \ldots, M.$$
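The iteration can be sketched for two users as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each user's probabilities are multiplied by $f$ of that user's marginal information function and renormalized, users are updated one after the other, and the noiseless binary adder channel and starting point are hypothetical. All function names are illustrative.

```python
import math

def marginal_information(m, p, channel):
    """I(x_m;Y) in nats for each letter of user m (m is 0 or 1).

    p = [p1, p2] are the current input distributions and
    channel[x1][x2][y] = p(y|x1,x2).
    """
    p1, p2 = p
    n_out = len(channel[0][0])
    p_y = [sum(p1[a] * p2[b] * channel[a][b][y]
               for a in range(len(p1)) for b in range(len(p2)))
           for y in range(n_out)]
    info = []
    for letter in range(len(p[m])):
        value = 0.0
        for other, p_other in enumerate(p[1 - m]):
            a, b = (letter, other) if m == 0 else (other, letter)
            for y in range(n_out):
                w = channel[a][b][y]
                if p_other > 0 and w > 0:
                    value += p_other * w * math.log(w / p_y[y])
        info.append(value)
    return info

def iterate(p, channel, f=math.exp, steps=200):
    """Apply p_m <- p_m * f(I(x_m;Y)) / K_m, user by user, `steps` times."""
    p = [list(p[0]), list(p[1])]
    for _ in range(steps):
        for m in range(2):  # user 2 sees the already updated user 1
            info = marginal_information(m, p, channel)
            weights = [q * f(i) for q, i in zip(p[m], info)]
            k_m = sum(weights)  # normalization factor K_m
            p[m] = [w / k_m for w in weights]
    return p

# Hypothetical noiseless binary adder MAC: Y = X1 + X2.
adder_mac = [[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
             [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]]
start = [[0.3, 0.7], [0.6, 0.4]]  # strictly positive initial probabilities
result = iterate(start, adder_mac)
print(result)
print(marginal_information(0, result, adder_mac),
      marginal_information(1, result, adder_mac))
```

With `f = math.exp` on this channel the iterates approach the uniform distributions, where every letter of both users has the same information value, as the Kuhn-Tucker condition requires. Passing another positive increasing function as `f` changes only the update weights.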

Illustration of Lemma 1

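[Figure: plot with horizontal axis from 0 to 100 and vertical axis from 0.1 to 0.7, labelled f(x)=exp(x).]

The convergence point of the sequence

$$p_m^{(n+1)}(x_m) = \frac{1}{K_m}\, p_m^{(n)}(x_m)\, f\big(I_m^{(n)}(x_m; Y)\big)$$

satisfies

$$I(x_m;Y) = C \quad \text{for } p_m(x_m) > 0,$$

$$I(x_m;Y) \le C \quad \text{for } p_m(x_m) = 0,$$

$$m = 1, \ldots, M,$$

for some constant $C$.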


Convergence for $f = \exp$

Lemma 2 The sequence $p^{(n)}$ for $f = \exp$ is increasing in mutual information:

$$I(p^{(n+1)}) \ge I(p^{(n)})$$

Define, for $x = (x_1, \ldots, x_M)$ and $p(x) = \prod_m p_m(x_m)$,

$$J(p, Q) = \sum_x p(x) \sum_y p(y|x) \log \frac{Q(x|y)}{p(x)}, \qquad I(p) = \max_Q J(p, Q).$$

For a fixed product distribution $p(x)$, maximizing $J(p, Q)$ over $Q(x|y)$ gives:

$$Q^*(x|y) = \frac{p(x)\, p(y|x)}{\sum_{x'} p(x')\, p(y|x')}$$

If we put $Q^*(x|y)$ back into the update for $p$, the iterative sequence increases $J(p, Q)$, and thus $I(p)$:

$$J(p^{(n+1)}, Q^{(n+1)}) \ge J(p^{(n+1)}, Q^{(n)}) \ge J(p^{(n)}, Q^{(n)})$$
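The reason the exponential update increases the mutual information can be made explicit with the usual double-maximization argument. The derivation below is a sketch under assumed notation: $J$ and $Q$ are names introduced here, $x = (x_1, \ldots, x_M)$, $p(x) = \prod_j p_j(x_j)$, and logarithms are natural.

```latex
\begin{align*}
J(p,Q) &= \sum_{x} p(x) \sum_{y} p(y|x) \ln\frac{Q(x|y)}{p(x)},
  \qquad \max_{Q} J(p,Q) = I(p) \ \text{ at } \ Q(x|y) = \frac{p(x)\,p(y|x)}{p(y)} . \\
\intertext{Fix $Q$ and all users except $m$, and replace $p_m$ by a free distribution $p_m'$:}
J &= \sum_{x_m} p_m'(x_m) \bigl[ A(x_m) - \ln p_m'(x_m) \bigr],
  \qquad A(x_m) = \sum_{x_j,\, j\neq m} \prod_{j\neq m} p_j(x_j)
     \sum_{y} p(y|x) \ln\frac{Q(x|y)}{\prod_{j\neq m} p_j(x_j)} . \\
\intertext{This is maximized by $p_m'(x_m) \propto \exp A(x_m)$. With $Q$ the posterior induced by $p$,}
A(x_m) &= \ln p_m(x_m) + I(x_m;Y)
  \quad\Longrightarrow\quad
  p_m'(x_m) = \frac{p_m(x_m)\, e^{I(x_m;Y)}}{\sum_{x_m'} p_m(x_m')\, e^{I(x_m';Y)}} .
\end{align*}
% Each single-user update therefore maximizes J over that user's distribution,
% and recomputing Q afterwards can only increase J further, so I never decreases.
```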


Generalized Arimoto-Blahut Algorithm

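Theorem For regular channels the sequence of product distributions $p^{(n)}$,

$$p_m^{(n+1)}(x_m) = p_m^{(n)}(x_m)\, \frac{\exp\big(I_m^{(n)}(x_m; Y)\big)}{\sum_{x_m'} p_m^{(n)}(x_m') \exp\big(I_m^{(n)}(x_m'; Y)\big)}$$

converges to a total capacity achieving distribution.

• $p_m^{(0)}(x_m) > 0$ for all $m$ and $x_m$.

• $I_m^{(n)}(\cdot)$ is based on the updated probabilities of users $1$ to $m-1$.

For $M = 1$ the sequence reduces to the Arimoto-Blahut algorithm:

$$p^{(n+1)}(x) = p^{(n)}(x)\, \frac{\exp D\big(p(y|x) \,\|\, p^{(n)}(y)\big)}{\sum_{x'} p^{(n)}(x') \exp D\big(p(y|x') \,\|\, p^{(n)}(y)\big)}$$

where $p^{(n)}(y) = \sum_x p^{(n)}(x)\, p(y|x)$.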
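The single-user case is the classical Arimoto-Blahut iteration, which can be sketched in a few lines. The code below is an illustration under assumptions: the Z-channel (input 0 noiseless, input 1 flipped with probability 0.5), the uniform starting point, the fixed iteration count and the function names are all choices made for this example.

```python
import math

def divergences(p, channel):
    """D(p(y|x) || p(y)) in nats for every input letter x."""
    n_out = len(channel[0])
    p_y = [sum(p[x] * channel[x][y] for x in range(len(p)))
           for y in range(n_out)]
    return [sum(w * math.log(w / p_y[y]) for y, w in enumerate(row) if w > 0)
            for row in channel]

def arimoto_blahut(channel, steps=100):
    """Return (input distribution, mutual information in nats) after `steps` updates."""
    n_in = len(channel)
    p = [1.0 / n_in] * n_in  # strictly positive starting point
    for _ in range(steps):
        d = divergences(p, channel)
        weights = [q * math.exp(v) for q, v in zip(p, d)]
        norm = sum(weights)
        p = [w / norm for w in weights]
    capacity = sum(q * v for q, v in zip(p, divergences(p, channel)))
    return p, capacity

# Hypothetical Z-channel: rows are inputs, columns are outputs.
z_channel = [[1.0, 0.0],
             [0.5, 0.5]]
p_opt, capacity = arimoto_blahut(z_channel)
print(p_opt, capacity)  # close to [0.6, 0.4] and ln(1.25) nats
```

For this channel the capacity-achieving input distribution is (0.6, 0.4), with capacity ln(1.25) nats, so the output of the iteration can be checked against a known answer.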


Two-user Example

[Alphabets, 4 × 3 transition matrix $p(y \mid x_1, x_2)$, and initial distributions $p_1^{(0)}$, $p_2^{(0)}$: numerical values not recoverable.]

[Figure: convergence curves, one per function $f$; horizontal axis: iteration, from 0 to 100; vertical axis from 0.08 to 0.16.]

Typical algorithm convergence for different functions $f(\cdot)$.


Conclusion

• The information function $I(x_m; Y)$ is useful for expressing an optimality criterion for the total capacity problem.

• The Arimoto-Blahut algorithm was generalized to the computation of total capacity.

– It would be interesting to investigate alternative functions $f$ for faster convergence.

Not covered in this talk

Symmetric multiple access channel

Work in Progress

Generalization of the cutting plane algorithm to the multiple access channel.

A new approach using an "exact penalty function" to impose the product distribution constraint.
