Efficient Architectures for Error Control
using Low-Density Parity-Check Codes
David Haley
Thesis submitted for the degree of
Doctor of Philosophy
Institute for Telecommunications Research
July 2004
To my wife Jane, my mother Verna, and my father Peter.
Without my family anything that I achieve is meaningless.
Contents
List of Figures v
List of Tables vii
List of Algorithms viii
Glossary ix
Notation and Symbols xi
Summary xv
Publications xvi
Declaration xvii
Acknowledgments xviii
1 Introduction 1
1.1 The Digital Communication System . . . . . . . . . . . . . . . . . . . . . 2
1.1.1 Source Coding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1.2 Encryption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1.3 Channel Coding . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1.4 Modulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.1.5 The Channel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2 Codec Design Criteria . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.3 Overview of Thesis Structure . . . . . . . . . . . . . . . . . . . . . . . . 8
1.4 Contributions of This Thesis . . . . . . . . . . . . . . . . . . . . . . . . . 9
2 Low-Density Parity-Check Codes 11
2.1 Historical Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2 Linear Block Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3 Low-Density Parity-Check Codes . . . . . . . . . . . . . . . . . . . . . . 14
2.4 Factor Graph Representation of LDPC Codes . . . . . . . . . . . . . . . 16
2.5 Iterative Decoding of LDPC Codes . . . . . . . . . . . . . . . . . . . . . 18
2.5.1 The Sum-Product Algorithm . . . . . . . . . . . . . . . . . . . . . 19
2.5.2 Hard Decision Decoding . . . . . . . . . . . . . . . . . . . . . . . 22
2.6 Alternative Representations and Algorithms . . . . . . . . . . . . . . . . 25
2.7 Encoding LDPC Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.7.1 Structured Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.7.2 Code Transformation . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.8 Analysis of Codes and Decoding on Graphs . . . . . . . . . . . . . . . . . 33
2.8.1 Short Cycles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.8.2 Stopping Sets and Extrinsic Message Degree . . . . . . . . . . . . 35
2.8.3 Graph Expansion . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.8.4 Near-Codewords . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.8.5 The Computation Tree . . . . . . . . . . . . . . . . . . . . . . . . 37
2.8.6 Finite Graph-Covers . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.9 Developments in LDPC Code Design . . . . . . . . . . . . . . . . . . . . 43
2.9.1 Irregular Constructions . . . . . . . . . . . . . . . . . . . . . . . . 43
2.9.2 Non-Binary Construction . . . . . . . . . . . . . . . . . . . . . . . 43
2.9.3 Algebraic LDPC Constructions . . . . . . . . . . . . . . . . . . . 44
2.10 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3 Iterative Encoding of LDPC Codes 46
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.2 The Sum-Product Encoder . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.3 Encoding via Iterative Matrix Inversion . . . . . . . . . . . . . . . . . . . 49
3.4 Reversible LDPC Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.4.1 Building Iteratively Encodable Circulants . . . . . . . . . . . . . . 52
3.4.2 Enforcing the Overlap Constraint . . . . . . . . . . . . . . . . . . 55
3.4.3 Building Reversible LDPC Codes from Circulants . . . . . . . . . 57
3.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4 Performance Analysis 61
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.2 Performance on the AWGN Channel . . . . . . . . . . . . . . . . . . . . 62
4.3 Performance on the Binary Erasure Channel . . . . . . . . . . . . . . . . 68
4.4 Analysis of Codes and Decoder Behaviour . . . . . . . . . . . . . . . . . 69
4.4.1 Minimum Distance . . . . . . . . . . . . . . . . . . . . . . . . . . 74
4.4.2 Stopping Sets and Extrinsic Message Degree . . . . . . . . . . . . 76
4.4.3 Cycles and Near-Codewords . . . . . . . . . . . . . . . . . . . . . 78
4.4.4 Finite Graph-Covers . . . . . . . . . . . . . . . . . . . . . . . . . 81
4.4.5 Graph Expansion . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
4.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
5 Improved Reversible LDPC Codes 88
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
5.2 Simple Metrics for Implementation Complexity . . . . . . . . . . . . . . 89
5.3 High Rate Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
5.4 Codes Built From Improved Expanders . . . . . . . . . . . . . . . . . . . 92
5.5 Recursive Construction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
5.6 Simulation Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
5.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
6 Analog Decoding 99
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.2 Potential Advantages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
6.3 Existing Work and Remaining Challenges . . . . . . . . . . . . . . . . . . 101
6.4 The Subthreshold CMOS Analog Approach . . . . . . . . . . . . . . . . . 103
6.4.1 Subthreshold Operation . . . . . . . . . . . . . . . . . . . . . . . 103
6.4.2 The Gilbert Multiplier . . . . . . . . . . . . . . . . . . . . . . . . 105
6.4.3 Probability Normalisation . . . . . . . . . . . . . . . . . . . . . . 105
6.5 Analog Computation using Soft-Logic Gates . . . . . . . . . . . . . . . . 106
6.5.1 The Analog Soft-XOR Gate . . . . . . . . . . . . . . . . . . . . . 107
6.5.2 The Analog Soft-Equal Gate . . . . . . . . . . . . . . . . . . . . . 107
6.5.3 Building Variable and Check Nodes . . . . . . . . . . . . . . . . . 108
6.6 The Analog Sum-Product Decoder . . . . . . . . . . . . . . . . . . . . . 109
6.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
7 The Reversible LDPC Codec 113
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
7.2 Mode-Switching Gates . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
7.3 Codec Core Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
7.3.1 Decode Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
7.3.2 Encode Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
7.3.3 Estimate of Encoder Implementation Overhead . . . . . . . . . . 120
7.4 Circuit Simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
7.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
8 Conclusion 125
8.1 Summary of Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
8.2 Suggestions for Further Work . . . . . . . . . . . . . . . . . . . . . . . . 128
Bibliography 130
List of Figures
1.1 Digital communication system model. . . . . . . . . . . . . . . . . . . . . 3
2.1 (3,6)-regular LDPC construction using random permutation matrices. . . 16
2.2 Factor graph representation of the global function g(x1, x2, x3). . . . . . . 17
2.3 Factor graph representation of a (2,4)-regular code. . . . . . . . . . . . . 17
2.4 Soft gates. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.5 An upper-triangular parity-check matrix. . . . . . . . . . . . . . . . . . . 28
2.6 A staircase structure for Hp. . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.7 A quasi-cyclic parity-check matrix. . . . . . . . . . . . . . . . . . . . . . 29
2.8 Systematic quasi-cyclic form of H. . . . . . . . . . . . . . . . . . . . . . . 30
2.9 Shift register based encoder for a quasi-cyclic code. . . . . . . . . . . . . 30
2.10 The parity-check matrix rearranged into approximate lower-triangular form. 32
2.11 A cycle of length four. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.12 A size five stopping set. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.13 Computation tree example. . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.14 A simple graph G and a double cover. . . . . . . . . . . . . . . . . . . . . 39
2.15 An m-fold cover of G. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.1 Jacobi algorithm as message-passing. . . . . . . . . . . . . . . . . . . . . 52
4.1 Comparing the performance of reversible and random (3,6)-regular LDPC codes, with length n = 64, on the AWGN channel. . . . . . . . . . . . 64
4.2 Comparing the performance of reversible and random (3,6)-regular LDPC codes, with length n ≈ 500, on the AWGN channel. . . . . . . . . . . . 65
4.3 Comparing the performance of reversible and random (3,6)-regular LDPC codes, with length n ≈ 1000, on the AWGN channel. . . . . . . . . . . 66
4.4 Comparing the performance of reversible and random (3,6)-regular LDPC codes, with length n ≈ 4000, on the AWGN channel. . . . . . . . . . . 67
4.5 Shifting the error floor for Rev4096, at Eb/N0 = 2dB, by varying η. . . . 68
4.6 Comparing the performance of reversible and random (3,6)-regular LDPC codes, with length n = 64, on the binary erasure channel. . . . . . . . . 70
4.7 Comparing the performance of reversible and random (3,6)-regular LDPC codes, with length n ≈ 500, on the binary erasure channel. . . . . . . . 71
4.8 Comparing the performance of reversible and random (3,6)-regular LDPC codes, with length n ≈ 1000, on the binary erasure channel. . . . . . . 72
4.9 Comparing the performance of reversible and random (3,6)-regular LDPC codes, with length n ≈ 4000, on the binary erasure channel. . . . . . . 73
4.10 Occurrence of maximal stopping sets for Rev4096, when ε = 0.375. . . . . 76
4.11 Parity bits in the |S| = 11 stopping set. . . . . . . . . . . . . . . . . . . . 78
4.12 Graph of Hp viewed as a lattice. . . . . . . . . . . . . . . . . . . . . . . . 79
4.13 Error vector weight histograms for Rev4096 at Eb/N0 = 2dB. . . . . . . . 80
4.14 Cycles in a variable row of the lattice for Hp. . . . . . . . . . . . . . . . . 80
4.15 Building a pseudo-codeword on the lattice for Hp. . . . . . . . . . . . . . 82
4.16 Comparison of expansion for random and reversible structures. . . . . . . 83
4.17 Loci of the spectra for Hp of the reversible codes. . . . . . . . . . . . . . 85
5.1 Comparing the performance of some r ≈ 0.763 LDPC codes. . . . . . . . 91
5.2 Expansion of type-II reversible structures. . . . . . . . . . . . . . . . . . 94
5.3 Performance of type-II reversible (3,6)-regular LDPC codes. . . . . . . . 96
5.4 Performance of rate r = 0.6 type-II reversible LDPC codes. . . . . . . . . 97
6.1 An n-type MOSFET. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
6.2 An n-type differential pair. . . . . . . . . . . . . . . . . . . . . . . . . . . 105
6.3 Vector multiplication core. . . . . . . . . . . . . . . . . . . . . . . . . . . 106
6.4 Vector normalisation using a biased current mirror. . . . . . . . . . . . . 107
6.5 Subthreshold CMOS soft-XOR gate. . . . . . . . . . . . . . . . . . . . . 108
6.6 Subthreshold CMOS soft-equal gate. . . . . . . . . . . . . . . . . . . . . 109
6.7 Voltage-mode soft-XOR gate. . . . . . . . . . . . . . . . . . . . . . . . . 110
6.8 Factor graph representation of bi-directional 3-port soft-logic gates. . . . 111
6.9 Check and variable nodes. . . . . . . . . . . . . . . . . . . . . . . . . . . 111
6.10 Analog decoder circuit for H. . . . . . . . . . . . . . . . . . . . . . . . . 112
7.1 Mode-switching XOR gate. . . . . . . . . . . . . . . . . . . . . . . . . . . 115
7.2 Codec circuit for H. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
7.3 Information variable node structure for x9 . . . . . . . . . . . . . . . . . 117
7.4 Parity variable node structure for x1 . . . . . . . . . . . . . . . . . . . . 118
7.5 Shift register clock phases. . . . . . . . . . . . . . . . . . . . . . . . . . . 120
7.6 Decoder output with bits x7 and x11 corrected. . . . . . . . . . . . . . . . 122
7.7 Encoder parity bit outputs. . . . . . . . . . . . . . . . . . . . . . . . . . 123
7.8 Encoder output for parity bit x1. . . . . . . . . . . . . . . . . . . . . . . 123
List of Tables
4.1 Reversible LDPC codes. . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.2 Randomly constructed benchmark LDPC codes. . . . . . . . . . . . . . . 62
4.3 Partial weight distribution of Rev64, for w (xu) ≤ 5. . . . . . . . . . . . . 74
4.4 Partial weight distribution of Rev512, for w (xu) ≤ 3 (first 14 entries only). 75
4.5 Bounds on dmin for the reversible codes. . . . . . . . . . . . . . . . . . . . 75
4.6 Stopping set configurations. . . . . . . . . . . . . . . . . . . . . . . . . . 77
5.1 Comparison of high rate code performance and complexity. . . . . . . . . 92
7.1 Mode-switching gate settings. . . . . . . . . . . . . . . . . . . . . . . . . 114
7.2 Transistor overhead for encoding. . . . . . . . . . . . . . . . . . . . . . . 120
7.3 Codec core simulation parameters. . . . . . . . . . . . . . . . . . . . . . 121
7.4 Example iterative solution. . . . . . . . . . . . . . . . . . . . . . . . . . . 121
7.5 Codec core specification. . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
List of Algorithms
2.1 Sum-Product Decoder (Probability Domain) . . . . . . . . . . . . . . . . . 23
2.2 Sum-Product Decoder (Log-Likelihood Domain) . . . . . . . . . . . . . . . 24
2.3 Erasure Decoder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.1 Message-Passing Jacobi Method Over F2 . . . . . . . . . . . . . . . . . . . 51
3.2 Construction of Hp for a Type-I Reversible Code . . . . . . . . . . . . . . 57
5.1 Construction of Hp for a Type-II Reversible Code . . . . . . . . . . . . . . 93
Glossary
The following acronyms are used in this thesis.
A/D analog to digital page 101
ARQ automatic-repeat-request page 130
AWGN additive white Gaussian noise page 6
BCJR Bahl-Cocke-Jelinek-Raviv page 101
BEC binary erasure channel page 7
BER bit error rate page 4
BiCMOS bipolar complementary metal-oxide-semiconductor page 105
BIST built-in self-test page 103
BPSK binary phase shift keyed page 5
BSG bi-directional soft-logic gate page 20
BSGC bi-directional soft-logic gate count page 89
CD compact disc page 2
CMOS complementary metal-oxide-semiconductor page 102
CRC cyclic redundancy check page 3
DVD digital versatile disc page 2
EG-LDPC Euclidean-geometry low-density parity-check page 90
EMD extrinsic message degree page 35
EXIT extrinsic information transfer page 43
FEC forward error correction page 4
FET field-effect transistor page 103
FFT fast Fourier transform page 18
LDPC low-density parity-check page 11
LLR log-likelihood ratio page 5
MAP maximum a posteriori page 4
MIFG multiple-input floating-gate page 102
ML maximum-likelihood page 4
MOSFET metal-oxide-semiconductor field-effect transistor page 103
MSG mode-switching gate page 115
RS Reed-Solomon page 46
RTL resistor-transistor-logic page 115
SG soft-logic gate page 19
SiGe silicon-germanium page 104
SNR signal to noise ratio page 4
SOC system-on-a-chip page 101
SOR successive over-relaxation page 102
TSMC Taiwan Semiconductor Manufacturing Company page 122
T-SPICE simulation program with integrated circuit emphasis page 122
VLSI very large scale integration page 99
WER word error rate page 62
XOR exclusive-or page 19
Notation and Symbols
This thesis employs the following notation and symbol set.
Notation relating to coding theory
X Matrix notation page 13
[X | Y] Matrix concatenation page 13
|X| Determinant page 83
X−1 Matrix inverse page 29
X ⊗ Y Matrix Kronecker product page 93
X⊤ Matrix transpose page 13
X Set notation page 16
X\Y Set subtraction page 20
|X | Order of set page 36
x Vector notation page 3
[x | y] Vector concatenation page 13
x⊤ Vector transpose page 13
|·| Absolute value page 36
A Adjacency matrix page 34
b Intermediate encoding vector page 27
c Check node in factor graph page 14
C Code page 12
x∗ Complex conjugate of x page 84
dmin Minimum distance page 13
Eb Bit energy page 6
Eb/N0 Signal to noise ratio page 6
ε Channel erasure probability page 7
Es Symbol energy page 5
η Decoder clipping parameter page 24
⌊·⌋ Floor function page 13
F2 The binary Galois field page 3
G Factor graph page 39
Γ (s) Set of nodes adjacent to node s on the factor graph page 20
G Factor graph cover page 39
G Generator matrix page 13
Gsys Systematic form of generator matrix page 13
H Parity-check matrix page 13
Hsys Systematic form of parity-check matrix page 13
Hp Section of parity-check matrix relating to parity bits page 27
Hu Section of parity-check matrix relating to information bits page 27
h(x) First row polynomial of Hp for type-I reversible codes page 51
i Row weight of parity-check matrix page 15
[S] Iverson’s convention indicating truth of statement S page 17
j Column weight of parity-check matrix page 15
k Code dimension page 12
κ Jacobi encoder iterations page 49
λ Log-likelihood ratio page 5
m Number of rows in parity-check matrix page 13
M(·) Map operator page 5
m Degree of graph cover page 39
µδ Normalised spectral gap of graph page 36
µ Eigenvalue page 36
µs→r Message passed from node s to node r page 19
n Code length page 12
N0 Noise spectral density page 6
ω Pseudo-codeword page 40
pX(X = x) Probability distribution/mass for random variable X at value x page 5
pX(x) Abbreviated form of pX(X = x) page 5
r Rate of information transfer (code rate) page 13
R The real numbers page 34
sgn (·) Sign operator page 24
σ2 Noise variance page 6
u Information vector page 3
û Decoded estimate of information vector page 3
v Variable node in factor graph page 14
w (x) Weight of vector x page 13
wp (ω) Pseudo-weight of pseudo-codeword ω page 41
wp^min Minimum pseudo-weight page 41
x Codeword vector page 3
xp Parity section of codeword vector page 12
xu Information section of codeword vector page 12
x̂ Decoded estimate of codeword page 4
y Received vector page 3
Z The integers page 43
ζ Graph expansion factor page 36
z(x) Syndrome of vector x page 36
Notation relating to electronics
Gnd Ground page 103
ID Transistor drain current page 104
Iu Unit current page 103
I0 Specific current page 104
MUX Multiplexer page 117
NDIFF Differential pair (n-type) page 119
NORM Normalisation circuit page 119
Vdd Supply voltage page 103
Vdiff Differential reference voltage page 104
Vds Transistor drain-to-source voltage page 104
Vgs Transistor gate-to-source voltage page 104
VrefN Reference voltage (n-type) page 105
VrefP Reference voltage (p-type) page 105
VT Thermal voltage page 104
Vth Transistor threshold voltage page 104
Summary
Recent designs for low-density parity-check (LDPC) codes have exhibited capacity ap-
proaching performance for large block length, overtaking the performance of turbo codes.
While theoretically impressive, LDPC codes present some challenges for practical imple-
mentation. In general, LDPC codes have higher encoding complexity than turbo codes
both in terms of computational latency and architecture size. Decoder circuits for LDPC
codes have a high routing complexity and thus demand large amounts of circuit area.
There has been recent interest in developing analog circuit architectures suitable
for decoding. These circuits offer a fast, low-power alternative to the digital approach.
Analog decoders also have the potential to be significantly smaller than digital decoders.
In this thesis we present a novel and efficient approach to LDPC encoder/decoder
(codec) design. We propose a new algorithm which allows the parallel decoder architec-
ture to be reused for iterative encoding. We present a new class of LDPC codes which are
iteratively encodable, exhibit good empirical performance, and provide a flexible choice
of code length and rate.
Combining the analog decoding approach with this new encoding technique, we
design a novel time-multiplexed LDPC codec, which switches between analog decode
and digital encode modes. In order to achieve this behaviour from a single circuit we
have developed mode-switching gates. These logic gates are able to switch between
analog (soft) and digital (hard) computation, and represent a fundamental circuit design
contribution. Mode-switching gates may also be applied to built-in self-test circuits for
analog decoders. Only a small overhead in circuit area is required to transform the
analog decoder into a full codec. The encode operation can be performed two orders of
magnitude faster than the decode operation, making the circuit suitable for full-duplex
applications. Throughput of the codec scales linearly with block size, for both encode
and decode operations. The low power and small area requirements of the circuit make
it an attractive option for small portable devices.
Publications
• D. Haley, A. Grant, and J. Buetefuer, “Iterative encoding of low-density parity-check codes,” in Proc. 3rd Australian Commun. Theory Workshop, Canberra, Australia, Feb. 2002, pp. 15–17.
• D. Haley, A. Grant, and J. Buetefuer, “Iterative encoding of low-density parity-check codes,” in Proc. GLOBECOM 2002, Taipei, Taiwan, Nov. 2002, vol. 2, pp. 1289–1293.
• D. Haley, C. Winstead, A. Grant, and C. Schlegel, “Architectures for error control in analog subthreshold CMOS,” in Proc. 4th Australian Commun. Theory Workshop, Melbourne, Australia, Feb. 2003, pp. 75–80.
• D. Haley, C. Winstead, A. Grant, and C. Schlegel, “An analog LDPC codec core,” in Proc. Int. Symp. on Turbo Codes, Brest, France, Sept. 2003, pp. 391–394.
• D. Haley, C. Winstead, A. Grant, V. Gaudet, and C. Schlegel, “Reusing analog decoders for encoding,” in 2nd Analog Decoder Workshop, Zurich, Switzerland, Sept. 2003, available at http://www.isi.ee.ethz.ch/adw/slides/haley.pdf.
• D. Haley and A. Grant, “High rate reversible LDPC codes,” in Proc. 5th Australian Commun. Theory Workshop, Newcastle, Australia, Feb. 2004, pp. 114–117.
• D. Haley, C. Winstead, A. Grant, V. Gaudet, and C. Schlegel, “Robust analog decoder design with mode-switching cells,” in 3rd Analog Decoder Workshop, Banff, Canada, June 2004, available at http://www.analogdecoding.org/docs/Haley ADW04.pdf.
Declaration
I declare that this thesis does not incorporate without acknowledgment any material
previously submitted for a degree or diploma in any university, and that to the best
of my knowledge it does not contain any materials previously published or written by
another person except where due reference is made in the text.
David Haley
Acknowledgments
I first thank my principal supervisor Prof. Alex Grant. Thank you, Alex, for giving me the
chance to learn the value of research, analysis and the language of mathematics. Thanks
for being one of the most hard-working fundraisers in the field, and for giving me the
chance to visit three other continents on multiple occasions. Finally, thanks for all of the
good times and for the bass riffs that you’ve hammered out in our band. Rock on, Alex.
I also acknowledge the support of my associate supervisor, Dr. Paul Alexander, and
the financial assistance of Southern Poro Communications.
I thank Prof. Bill Cowley for approaching me to start a doctorate, pointing me in
the direction of LDPC codes, and for introducing me to Alex and Paul. My research has
been primarily supported by the Australian Government under ARC SPIRT C00002232.
I also thank the Institute for Telecommunications Research for their financial support.
I thank the South Australian Section of the IEEE for providing the financial assis-
tance that allowed me to attend Globecom 2002 in Taiwan.
It has been a pleasure to collaborate with Dr. John Buetefuer during my candidature.
Thank you, John, for some very useful discussions about LDPC codes, and for some
motivational advice during my time writing up.
The next time you enjoy a concert, after applauding the band, go and thank the
person behind the sound mixing desk. We usually take this person for granted. The
same can often be said about good network administrators. I wish to sincerely thank
Bill Cooper for maintaining a reliable network at ITR. In particular I would like to
acknowledge the help that he has provided with my perpetually problematic laptop.
Cheers Governor.
I thank all of the other friendly ITR staff with whom I have had the chance to work,
in either a research or development role.
I cannot thank Prof. Christian Schlegel enough for hosting me at the University
of Utah in 2002, and again last year at the University of Alberta. Thank you for your
generous support and supervision. I also thank Professors Chris Myers and Reid Harrison
for their supervision during my stay in Utah.
I thank Prof. Vincent Gaudet and the rest of the HCDC lab in Alberta for their
support and for being such a great group of people. Thanks Charmaine for being so
good at organising everything. Thank you, Vince, Tony, Anne, Sheryl and Walt for the
great times in Edmonton and France last year. I look forward to collaborating with you
again in the future. I also thank Canadian Microelectronics Corporation for providing
the device models that were used to simulate the circuits described in this thesis.
The majority of knowledge that I have about analog decoding, I have learned from
Chris Winstead. His thorough understanding of this field (and many others) comes
without any hint of arrogance. I consider myself privileged to have had the opportunity
to collaborate with him, and look forward to future collaborations. Thank you, Chris, for
the fun times in North America, Europe and down here in Australia. Thanks also for
proofreading sections of this thesis. As we both now come to the conclusion of our
postgraduate roles “The Aussie” wishes you and Erin all of the success and happiness
that you so richly deserve.
I thank Dr. Pascal Vontobel for his insightful and valuable suggestions regarding the
finite graph-cover analysis presented in this thesis. Thank you, Pascal, for being so friendly
and helpful. Thank you also for proofreading sections of this thesis.
I thank my friends for putting up with my sleep deprived, preoccupied, and rare
company over the last six months.
I wish to thank my parents for never getting mad at me when I pulled things apart
to find out what made them work, even in the cases when they didn’t quite make it back
together again. Your emphasis on my efforts, regardless of the results, founded the drive
that I have today. From the first time I connected a light bulb to a battery, through my
teenage interests in chemistry¹, and now to the proofreading of this thesis. Your support
and love have always been there and will always be returned.
I come now to the hardest part of all. The part where I must try to sum up in
mere words, all of the appreciation that I have for my wife, her ever constant support,
encouragement and love. I simply haven’t worked out how to do it. It is a much harder
task than any problem addressed by this thesis. Jane, like everything that I have achieved
since we met, and may achieve in the future, this book is as much yours as it is mine. It
simply wouldn’t exist without you, and I don’t think that I could either. I love you.
¹ Sorry about blowing up the garden.
Chapter 1
Introduction
In recent times mobile (cell) phones have become smaller, more reliable, and more afford-
able. Modern communication systems are now able to offer a wide range of features, and
this has changed the way many of us live our daily lives. For example, it is now common
for friends to communicate by sending simple text messages, or even high resolution pho-
tographs. As a result, many school teachers have been forced to ban the use of mobile
phones in the classroom. With this in mind, we now consider the following example in
which two students wish to communicate, without the use of their mobile phone.
Graeme wishes to send a message to his friend Tim, who is on the other side of the
room. Their teacher does not appreciate such interruptions, so Graeme carefully chooses
a short message. They do not want the other classmates to know what they are talking
about. Therefore, they use a predetermined method to jumble the ordering of words in
the message, so that it no longer makes sense. Graeme tells the person sitting next to
him the message, and then they pass it on, until it reaches Tim. A boy who is known to
cause mischief in such situations, Bill, is sitting in the same row. They expect that Bill
may try to disrupt the communication by switching one of the words in the message for
something completely different. However, the friends have previously agreed on a way
to handle Bill. Once the short, jumbled message has been prepared, Graeme tells it to
the person sitting next to him. In order to combat the presence of Bill, he repeats each
word. As Tim receives the message he checks that each word appears twice. If not then
he knows that Bill has altered the message, and he shakes his head in view of Graeme.
Otherwise, he waves to Graeme to signal that the message was received intact.
The above example demonstrates some simple techniques that assist in the secure
and reliable communication of information. However, weaknesses in Graeme and Tim’s
approach are easily identified. For example, if Bill modifies a word then Tim is able to
detect that an error has been introduced but he cannot correct it. He signals to Graeme,
and the entire message must be resent. Moreover, if Bill modifies two words then the
error may pass undetected. By sending each word three times they could combat this.
However, such an approach is hardly efficient, and they run a higher risk of being caught
by the teacher.
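Graeme and Tim's scheme is a repetition code: sending each word twice lets the receiver detect a single altered word, while sending each word three times would allow correction by majority vote. This can be sketched in a few lines of Python (a minimal illustration; the helper names are mine, not the thesis's):

```python
from collections import Counter

def encode_repeat(message, copies=2):
    """Repeat each word `copies` times (a rate-1/copies repetition code)."""
    return [word for word in message for _ in range(copies)]

def error_detected(received, copies=2):
    """An error is detected whenever a group of `copies` symbols disagrees."""
    groups = [received[i:i + copies] for i in range(0, len(received), copies)]
    return any(len(set(g)) > 1 for g in groups)

def correct_majority(received, copies=3):
    """With three copies, a majority vote corrects one error per group."""
    groups = [received[i:i + copies] for i in range(0, len(received), copies)]
    return [Counter(g).most_common(1)[0][0] for g in groups]
```

Paying triple the transmission cost buys correction rather than mere detection, previewing the trade-off between redundancy and reliability that channel codes manage far more efficiently.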
Outside of the classroom, if Graeme needs to send Tim some information he may do
so using a text message from his mobile phone. The challenges of providing fast, secure
and reliable communication are no longer his concern but rather that of the communi-
cation systems engineer. In this thesis we focus upon the reliable transfer and storage
of data through the use of error control codes. These codes appear in many applications
that we take for granted in our daily lives. In addition to increasing the reliability of
modern communication networks, error control codes are also used in digital data stor-
age systems, e.g. compact disc (CD) and digital versatile disc (DVD) players, and hard
disk drive systems. In this chapter we introduce the general model of a modern commu-
nication system, outline the properties of a good error control scheme, and provide an
overview of this thesis.
1.1 The Digital Communication System
The basic model that we use to describe the components of a digital communication
system is shown in Figure 1.1. A textbook introduction to this model can be found in [1].
This model shows only one direction of communication. It is common for communication
to occur in both directions, on either a full-duplex (simultaneous) or half-duplex (time
multiplexed) basis. Hence we consider each component and its inverse operation as a
functional pair. In this dissertation we focus specifically upon the channel encoder and
decoder, or codec.
The overall aim of the system is to ensure that the data at the sink matches that at
the source. The source and sink may be in different physical locations, e.g. a satellite
and ground station, or the same location at different times, e.g. a hard disk drive. We
now describe the functional components that are used to meet this aim in an efficient,
secure and reliable manner.
[Figure 1.1 is a block diagram of the digital communication system model: a data source feeds a source encoder, encryption, channel encoder, and transmitter/modulator; the signal passes through a noisy channel to a receiver/demodulator, channel decoder, decryption, source decoder, and data sink. The channel encoder and decoder together form the codec, and the modulator and demodulator together form the modem. The vectors u, x, and y label the information vector, codeword, and received vector.]
Figure 1.1: Digital communication system model.
1.1.1 Source Coding
The source encoder is responsible for compressing the source data by removing uncon-
trolled redundancy. The decoder at the sink then applies the inverse algorithm to regen-
erate the original message. Data compression is achieved by making the distribution of
the source information as uniform as possible.
1.1.2 Encryption
In order to provide a secure transmission we may apply an encryption algorithm at the
source, and then decrypt upon reception. The aim of this stage is to ensure that the
transmitted message can be interpreted only by the intended recipient. In this thesis we
do not focus upon the components discussed to this point. Hence we consider the data
source, source encoder and encryption components to be lumped together and label this
larger component the source. We consider this source to present an information vector,
u, to the following stage of the system. Moreover, in this dissertation we focus upon the
case when u takes values from the binary Galois field, F2.
1.1.3 Channel Coding
It is the task of the channel coding scheme to provide detection and correction of errors
introduced during transmission.
Error detection schemes, such as the cyclic redundancy check (CRC) [1], inform the
receiver that an error has occurred during transmission. This is done by generating a
checksum value based on the data to be transmitted. The checksum is appended to the
message. The receiver then uses the received data to generate its own checksum according
to the same algorithm as the sender. If this checksum does not match the one received
from the sender then an error is declared. The CRC adds very little overhead in terms
of both transmission size and processing.
A forward error correction (FEC) scheme allows us not only to detect corrupted data
but to repair it. In contrast to error detection algorithms, which require the corrupted
message to be resent, FEC schemes protect the data prior to transmission. Increased
reliability is provided by controlled redundancy that is appended to the message. We
denote the ratio of the size of the source information to the total size of the transmission,
inclusive of redundancy, as the rate of information transfer.
The encoder maps the information vector, u, onto a codeword vector, x. In this thesis
we consider binary codes, such that the elements of x come from F2. The codeword is
transmitted across the channel, resulting in the received vector y. The alphabet used
for the elements of y is dependent upon the channel and demodulator. Considering the
received vector and the set of all valid codewords, the decoder then produces an estimate
x̂ of the transmitted codeword x, with elements coming from F2. From this estimate, we
obtain the estimate û of the transmitted information u.
The maximum-likelihood (ML) decoder selects the estimate x̂ that maximises the
probability p(y|x̂). The maximum a posteriori (MAP) rule selects x̂ such that p(x̂|y) is
maximised. Recall that it is the goal of the source encoder to make the distribution of
the source information as uniform as possible. In the case of a uniform distribution of
source information, the ML and MAP decision rules are identical.
We distinguish between two possible types of error event. If the decoder arrives at
an estimate x̂ which is not a valid codeword, then we know that an error has occurred,
without having explicit knowledge of the transmitted codeword. Hence we label this
a detected error event. However, we cannot make such a deduction when the decoder
estimate represents a different valid codeword to x. We therefore label this an undetected
error event.
The performance of an error control code is measured in terms of the probability of
error for transmission at a given signal to noise ratio (SNR). We define the probability
of bit error, or bit error rate (BER), as the ratio of the number of erroneous bits at the
sink to the total number of bits transmitted.
Decoder implementation can often be simplified if we map probabilities into the log
domain [2]. Multiplication operations in the probability domain map to addition oper-
ations in the log domain, thus reducing digital implementation complexity. A common
representation is the log-likelihood ratio (LLR), for which we have the following definition.
Definition 1.1 (Log-Likelihood Ratio). Consider the binary random variable X with
probability mass function1 pX(x). The log-likelihood ratio for X is defined as

λ = loge ( pX(0) / pX(1) ).
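As a small numerical sketch (illustrative, not part of the original text), the LLR of Definition 1.1 and the product-to-sum property that motivates the log domain can be checked directly:

```python
import math

def llr(p0):
    """Log-likelihood ratio of a binary random variable with P(X=0) = p0."""
    return math.log(p0 / (1.0 - p0))

# Multiplication in the probability domain maps to addition in the log domain:
# log(a*b) = log(a) + log(b), replacing multipliers with adders in hardware.
a, b = 0.9, 0.6
assert math.isclose(math.log(a * b), math.log(a) + math.log(b))

# A uniformly distributed bit carries no information: its LLR is zero,
# while LLRs above (below) zero favour the decision X = 0 (X = 1).
print(llr(0.5), llr(0.9), llr(0.1))
```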
1.1.4 Modulation
We modulate elements from the codeword, also called codeword symbols, in order to
transmit them across a physical channel. Many modulation schemes are available for use
in a modern communication system, each presenting a tradeoff involving parameters such
as bandwidth requirement and power consumption. In this thesis we consider only the
binary phase shift keyed (BPSK) approach2 [1, 3]. Using BPSK modulation the binary
elements x from x each undergo an antipodal mapping, M(x), onto a physical signal as
follows.
M(0) = s0(t) = √(2Es/Ts) cos(2πfct),        0 ≤ t ≤ Ts
M(1) = s1(t) = √(2Es/Ts) cos(2πfct + π),    0 ≤ t ≤ Ts
Here the signal s1(t), representing the case x = 1, is π radians out of phase with the
signal s0(t), which represents x = 0. The spectrum of the transmitted signal is centred
about the frequency fc. Each symbol is transmitted over a period of time Ts, with symbol
energy Es. One bit, of either information or redundancy, is transmitted during this
period. The received signal is then demodulated, in order to reverse the BPSK mapping.
The output of the demodulator is a real number, or member of a discrete alphabet,
representing the transmitted bit value.
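The antipodal mapping above can be sketched in a few lines of Python (the parameter values Es, Ts and fc below are illustrative assumptions, not values from the text):

```python
import math

ES, TS, FC = 1.0, 1.0, 5.0   # symbol energy, symbol period, carrier frequency (illustrative)

def bpsk_waveform(bit, t):
    """Antipodal BPSK mapping M(x): bit 0 -> s0(t), bit 1 -> s1(t) = s0(t) shifted by pi."""
    phase = math.pi if bit == 1 else 0.0
    return math.sqrt(2 * ES / TS) * math.cos(2 * math.pi * FC * t + phase)

def demodulate(sample):
    """Hard decision from a matched sample: positive -> bit 0, negative -> bit 1."""
    return 0 if sample >= 0 else 1

# s1(t) is pi radians out of phase with s0(t), i.e. its negation at every instant.
t = 0.123
assert math.isclose(bpsk_waveform(1, t), -bpsk_waveform(0, t))
```

Sampling each waveform at t = 0 and applying the hard decision recovers the transmitted bit in the noiseless case.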
1.1.5 The Channel
The channel is the medium over which we transmit. The real world channel may atten-
uate, amplify, distort and/or contaminate the transmission with noise. There are many
1Here the probability mass function pX(x) is an abbreviated form of p(X = x), representing the probability of the random variable X taking the particular value x.
2Note however that many of the codes that are studied in this thesis may be used in conjunction with more general modulation schemes.
causes of noise in the modern communication environment. Hence the topic of channel
modelling alone represents an entire field of research.
In the late 1940s Claude Shannon introduced the concept of channel capacity [4]. He
showed that error control codes exist which allow transmission across the channel with
arbitrarily low probability of error, provided that the rate of information transfer is less
than capacity. This limit, which we now term the Shannon bound, tells us that such codes
exist, however it does not indicate how they are to be constructed. Shannon’s discovery
presents a challenge to communication systems engineers, to develop codes and decoding
algorithms that can meet its predictions.
For simplicity of analysis, in this dissertation we consider only the channel models
that follow. We assume that these channels are memoryless, in the sense that there is no
correlation between the noise applied to individual symbols.
The Additive White Gaussian Noise Channel
We assume a BPSK transmission and consider the discrete time additive white Gaussian
noise (AWGN) channel [1, 3] with binary input and real output. In the discrete time
domain we consider each binary bit x from x to be represented by a value x ∈ {±√Es},
where +√Es and −√Es correspond to x = 0 and x = 1 respectively. The received value
corresponding to the bit is y = x + ρ. Here the random variable ρ represents normally
distributed noise with zero mean and variance σ2. The noise source is assumed to have
a single sided spectral density, N0 = 2σ², which is independent of frequency. From the
Gaussian probability density function, we have
p(y|x) = (1/√(2πσ²)) e^(−(y−x)²/(2σ²)).
For the purposes of empirical simulation we set Es = 1 and adjust the signal to
noise ratio by altering the noise variance. In order to compare two error control systems
that operate at a different rate of information transfer, we adjust the symbol energy
according to the rate r, to get the bit energy Eb = Es/r. The signal to noise ratio
is then calculated as the (dimensionless) ratio of received bit energy to noise spectral
density, Eb/N0 = Es/(2σ²r). It is common to report the SNR in decibels, according to
Eb/N0 (dB) = 10 log10(Eb/N0).
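The simulation recipe just described (Es = 1, noise variance set from Eb/N0 and the rate r) can be sketched as follows; this is an illustrative hard-decision simulation, not code from the thesis:

```python
import math, random

def awgn_channel(bits, ebn0_db, rate):
    """BPSK over the discrete-time AWGN channel with Es = 1.
    The noise standard deviation follows from Eb/N0 = Es/(2*sigma^2*r)."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    sigma = math.sqrt(1.0 / (2.0 * rate * ebn0))
    tx = [1.0 if b == 0 else -1.0 for b in bits]   # x = 0 -> +sqrt(Es), x = 1 -> -sqrt(Es)
    return [x + random.gauss(0.0, sigma) for x in tx], sigma

def hard_ber(bits, received):
    """Bit error rate of a hard decision on the received values."""
    wrong = sum((0 if y >= 0 else 1) != b for b, y in zip(bits, received))
    return wrong / len(bits)

random.seed(1)
bits = [random.randint(0, 1) for _ in range(10000)]
rx, sigma = awgn_channel(bits, ebn0_db=8.0, rate=0.5)
print(hard_ber(bits, rx))   # the BER falls rapidly as Eb/N0 grows
```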
The Binary Erasure Channel
The second type of channel that we consider in this dissertation is the binary erasure
channel (BEC). This channel has input alphabet {−1, +1} and output alphabet {−1, 0, +1}.
We again assume a BPSK transmission, where bit x from x is mapped according to
M(0) = +1, M(1) = −1 and presented at the channel input. The channel either leaves
x unaltered, or marks it as an erasure, to which we assign the value zero. We denote
the probability that x is erased by the channel as ε. This channel erasure probability is
assumed to be symmetric, i.e. independent of the transmitted value of x.
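A minimal sketch of the BEC (illustrative code, with an assumed erasure probability of 0.25):

```python
import random

def bec(symbols, eps, rng):
    """Binary erasure channel: each input symbol (+1 or -1) is erased
    (set to 0) with probability eps, independently per symbol."""
    return [0 if rng.random() < eps else s for s in symbols]

# BPSK mapping M(0) = +1, M(1) = -1, then transmission over the BEC.
tx = [+1 if b == 0 else -1 for b in [0, 1, 1, 0, 1, 0, 0, 1]]
rx = bec(tx, eps=0.25, rng=random.Random(42))

# Non-erased symbols arrive unaltered; erased positions carry no information,
# and the erasure probability does not depend on the transmitted value.
assert all(r == t or r == 0 for r, t in zip(rx, tx))
```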
1.2 Codec Design Criteria
We now list some design criteria to be considered by the communication systems engineer,
when developing an error control codec.
Good Error Correcting Performance We aim to have the code provide a strong
error correcting capability. It is common to simulate operation of the error control
system and obtain empirical BER performance results, prior to implementation.
Flexibility of Code Design The code design approach should allow flexible choice of
parameters such as code length, i.e. the length of the codeword vector, and code
rate, i.e. the ratio of the length of the information vector to the code length.
Simplicity of Design It is desirable for the code to be easy to construct and repre-
sentable in a concise form.
High Speed The computational latency for both the decode and encode operations
should be low.
Low Power Consumption It is desirable that the codec is power efficient, especially
if it is to form part of a battery powered device.
Small Area It is desirable that a circuit implementation of the codec occupy only a
small area of silicon, especially if it is to form part of a portable device.
Ease of Verification The design and implementation should be easy to test.
The importance of each of the above parameters in the overall design will depend
upon the application. For example, small implementation size is more significant for
mobile handsets than for base stations. Throughput is a major motivation for many
applications, such as data storage and broadband communication. For some applications,
such as satellite transmission and portable devices, low power consumption is important
as it allows longer battery life and/or a lighter battery. Biomedical applications, such as
implantable devices, call for very small and low power circuits. The designer may choose
to trade off one parameter for another, and the decisions made will be motivated by the
demands of the application.
1.3 Overview of Thesis Structure
This thesis may be considered to contain two components. In Chapters 2 through to 5 we
discuss coding theory aspects of the work, developing a new approach to codec design and
a new class of low-density parity-check codes. We then investigate practical approaches
to codec circuit implementation in Chapters 6 and 7. The remainder of this dissertation
is organised as follows.
Chapter 2 provides a review of the current state of the literature pertaining to error
control using low-density parity-check codes. We provide an overview of approaches to
code construction, graphical representations, and analytical tools. Existing methods for
encoding and decoding LDPC codes are also presented.
In Chapter 3 we explore new techniques for iterative encoding of LDPC codes which
allow the decoder architecture to be reused for encoding. A novel encoding algorithm is
presented which is based upon the Jacobi method for iterative matrix inversion over F2.
We define a convergence criterion for the algorithm and show how it can be viewed in
message-passing terms. We label any LDPC code that is encodable using the Jacobi
encoder as a reversible LDPC code, and show that any code with a triangular parity-
check matrix is reversible. An algorithm is presented for the algebraic construction of
4-cycle free reversible LDPC codes using circulant matrices. We label these codes type-I
reversible LDPC codes. The construction algorithm provides some flexibility in the choice
of code length and rate.
In Chapter 4 we present the empirical performance and a thorough analysis of some
type-I reversible codes. We investigate the performance of (3,6)-regular codes on both
the AWGN and binary erasure channels. We characterise and explain the empirical
observations, using current analytical tools. A weakness in the graphical structure of
these (3,6)-regular codes is identified, which leads to an error floor effect as we increase
code length.
In Chapter 5 we show that type-I reversible codes may be constructed which offer
good performance for high rate applications. We use results from the analysis in Chapter 4
to provide a design metric for developing a new class of reversible LDPC codes which
offer improved performance. A recursive algorithm is presented for constructing type-II
reversible LDPC codes which are 4-cycle free and are iteratively encodable using eight
iterations of the Jacobi encoder. This algorithm also provides greater flexibility for the
choice of code length and rate than the type-I approach. We demonstrate that (3,6)-
regular type-II reversible codes may be constructed that offer good empirical performance.
Since the proposal that analog circuits be used to build iterative decoders, interest
has begun to develop in this area. Analog Decoder Workshops have been held on an
annual basis for the last three years, with an increased participation and an impressive
progression of results being presented each year. In Chapter 6 we review the current
state of the literature in this area, and highlight some of the potential advantages that
analog decoder implementation can offer in comparison to digital implementation.
In Chapter 7 we extend the analog decoder so that it can also be used to perform
encoding. A novel circuit architecture is presented for the core of a reversible LDPC codec.
We implement a time multiplexed architecture which switches between analog decode and
digital encode modes. The encode operation is two orders of magnitude faster than the
decode operation, and hence the circuit is well suited to use in full-duplex communication
systems. In order to achieve this we introduce a new type of logic gate circuit, namely
the mode-switching gate, which is able to switch between digital and analog operation.
Encoding is performed using the Jacobi encoder presented in Chapter 3, requiring only
a small amount of circuit overhead.
In Chapter 8 we summarise the work presented in this dissertation, and discuss the
proposed reversible LDPC codec architecture in terms of the codec design criteria listed
in this introductory chapter. We also present some suggestions for further work.
1.4 Contributions of This Thesis
The following list summarises the main contributions made by this thesis.
Chapter 3
• Theorem 3.1 shows that we can use the sum-product decoder to iteratively
encode certain types of low-density parity-check codes.
• Theorem 3.2 provides a convergence criterion for the Jacobi iterative matrix
inversion method for matrices which have elements from F2.
• A new method for iteratively encoding LDPC codes is presented in Algo-
rithm 3.1. The algorithm employs the Jacobi method for iterative matrix
inversion over F2 and allows reuse of the decoder architecture for encoding.
• Theorem 3.5 provides an algebraic method for testing the existence of 4-cycles
in the factor graph corresponding to a circulant matrix, given its first row
polynomial and size.
• Algorithm 3.2 provides a method for algebraically constructing reversible LDPC
codes using circulant matrices. These 4-cycle free codes are suitable for high
rate applications, and are encodable using the iterative Jacobi method over F2.
Chapter 5
• A recursive construction method is presented in Chapter 5 for reversible LDPC
codes which are 4-cycle free, and are encodable using eight iterations of the
iterative Jacobi method over F2. The algorithm is capable of, but not limited
to, generating (3,6)-regular codes.
Chapter 7
• In Section 7.2 a fundamental circuit design contribution is made, namely the
mode-switching gate. These logic gates are able to switch between analog (soft)
and digital (hard) computation. This contribution is the result of collabora-
tive work with the High Capacity Digital Communications Laboratory at the
University of Alberta.
• A novel circuit architecture for the core of a reversible LDPC codec is presented
in Chapter 7. This circuit results from collaborations with the Electrical Engi-
neering Department at the University of Utah, and the High Capacity Digital
Communications Laboratory at the University of Alberta.
Chapter 2
Low-Density Parity-Check Codes
2.1 Historical Summary
Just over a decade after Shannon’s founding work was published [4], Robert Gallager
introduced low-density parity-check (LDPC) codes [5, 6]. His approach would eventually
form the basis for a class of codes which perform extremely close to the Shannon bound.
Gallager’s codes, and iterative decoding algorithms, were however initially overlooked by
the coding community. That era was not one in which every researcher had the benefit
of a powerful computer at their fingertips, hence his work was not developed further.
Moreover, with the benefit of hindsight, Gallager’s approach challenged the paradigm of
coding at a time when the community was focussed upon minimum distance. It was not
until some three decades later, in the mid 1990s, that the true potential of low-density
parity-check codes was rediscovered.
A small number of researchers continued to work with Gallager’s codes during the
few decades after publication of his thesis, e.g. see [7, 8]. In the early 1980s Tanner
provided a graphical representation of LDPC and other coding schemes, now commonly
known as the Tanner graph [9]. He proposed that the decoding problem be approached by
factoring the codes into simpler component codes, and introduced what is now known as
the min-sum decoding algorithm. Tanner also foresaw the huge potential of parallelism,
allowing the simpler component codes to be processed simultaneously due to their low
complexity.
In the early 1990s, research in the area of channel coding became very popular due
to the introduction of turbo codes by Berrou et al. [10]. However, the initial develop-
ment of turbo codes was undertaken with very little connection being made to graphical
representations. The iterative algorithms used for decoding turbo codes have since been
linked to the principles of belief propagation described by Pearl [11], as have the algo-
rithms proposed by Gallager [12, 13]. Moreover, the structure of a turbo code is such
that it may be viewed as an LDPC code [14]. The attention drawn to iterative decoding
techniques through turbo coding made it almost inevitable that the work of Gallager
would eventually be revisited. In the mid 1990s LDPC codes were independently redis-
covered [15–17]. Shortly after, a surge of papers appeared in this area (see [18] and the
references therein).
Tanner’s work, and further developments from Wiberg et al. [15, 19], formed the
basis for the now commonly used factor graph representation of codes. These graphs can
be used to represent a wide range of algorithms and structures [20].
Several approaches to improving low-density parity-check codes have been proposed
since their rediscovery. Some of these codes, notably in the case of very long block sizes
(around 10^5 to 10^7 bits), are able to achieve performance within hundredths of a decibel
of the Shannon bound [21–23].
Most recently, there have been some developments in the area of analysing iterative
decoding on the graphs of finite length codes [24–28].
2.2 Linear Block Codes
Low-density parity-check codes fall into the class of linear block codes. We define the
following properties for binary linear block codes.
Source Information We partition a binary source sequence into row vectors of length
k prior to encoding. We label such a vector u.
Encoding The encoding process is undertaken on a block-wise basis, mapping u onto a
row vector codeword x. An (n, k) code has codeword length n.
Systematic Code A code in which the information vector appears as part of the code-
word is systematic. We label the information segment of the codeword xu = u, and
the length n − k appended parity bits xp.
Linearity Property The code C is a binary linear code if and only if C forms a vector
subspace over F2. Hence, the sum of any two codewords must itself be a codeword.
Code Dimension The dimension of the code is the dimension of its corresponding
vector space. A binary (n, k) code has dimension k, and thus has a total of 2^k
codewords, each of length n.
All-Zero Codeword A direct consequence of the linearity property is that the all-zero
codeword, x = 0, is a member of every linear code.
Code Rate The rate of the code is r = k/n. The code rate reflects the proportion of
information transferred per channel use.
Generator Matrix The linearity property implies the existence of a basis for the code.
We may construct a generator matrix for the code, denoted G, by using the k
independent basis codewords as row vectors for G. The generator matrix has
dimension k×n and can be used to encode the information vector. We may consider
the systematic form Gsys = [P|Ik], such that the codeword parity bits prefix the
information bits. Here [P|Ik] denotes the concatenation of the k×k identity matrix
to P. Encoding is then performed according to x = [xp|xu] = uG. The generator
matrix describes the structure of the code, however it is not uniquely defined for
the code.
Parity-Check Matrix Another means of describing the code structure is provided by
the parity-check matrix, denoted H. In general, H has dimension m × n, where
m = n − k. All codewords must satisfy the condition Hx⊤ = 0, i.e. the set of
all codewords spans the nullspace of H. The parity-check matrix is not uniquely
defined for the code, and is related to G by HG⊤ = 0. The systematic form of H
may be obtained from G, and vice versa, using Hsys = [Im|P⊤].
Weight Distribution We define the weight of a vector x, w (x), as the number of
nonzero elements that it contains. The weight distribution of a code is a list of the
number of codewords at each weight, for all codeword weights.
Hamming Distance The Hamming distance between two codewords is the number of
positions in which they differ, i.e. the weight of their binary sum.
Minimum Distance The minimum distance, dmin, is the smallest Hamming distance
between any two codewords in the codeword set. As the all-zero codeword is a
member of every linear code, the minimum distance of a linear code is equivalent
to the weight of the lowest weight codeword. In general, a code with minimum
distance dmin can correct ⌊(dmin − 1)/2⌋ errors1.
1The floor function, ⌊x⌋, returns the largest integer less than or equal to x.
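The systematic encoding and parity-check relations above can be exercised on a toy code; the parity part P below is an illustrative assumption, chosen only so that Gsys = [P|Ik] and Hsys = [Im|P⊤] can be checked over F2:

```python
# A toy systematic binary (6, 3) linear block code (illustrative P, not from the text).
# Gsys = [P | I_k], Hsys = [I_m | P^T]; every codeword satisfies H x^T = 0 over F2.

def mat_vec_f2(M, v):
    """Matrix-vector product over F2 (addition is XOR, i.e. mod-2 sum)."""
    return [sum(a & b for a, b in zip(row, v)) % 2 for row in M]

k, m = 3, 3
P = [[1, 1, 0],        # k x m parity part (row i gives the checks bit u_i joins)
     [0, 1, 1],
     [1, 0, 1]]

def encode(u):
    """x = [x_p | x_u]: parity bits u*P over F2, then the information bits."""
    xp = [sum(u[i] & P[i][j] for i in range(k)) % 2 for j in range(m)]
    return xp + list(u)

# Hsys = [I_m | P^T]
H = [[1 if i == j else 0 for j in range(m)] + [P[i2][i] for i2 in range(k)]
     for i in range(m)]

for u in [(0, 0, 0), (1, 0, 1), (1, 1, 1)]:
    assert mat_vec_f2(H, encode(u)) == [0] * m   # H x^T = 0 for every codeword
```

The linearity property can also be seen directly: the mod-2 sum of any two encoder outputs again satisfies all parity checks.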
As the rows of the generator matrix are codewords, the lowest weight row represents
a simple upper bound on minimum distance. The following theorem relates dmin to the
structure of the code, through its parity-check matrix [29].
Theorem 2.1 (Massey). The minimum distance of a binary linear block code is equal
to the minimum nonzero number of columns in its parity-check matrix which sum to zero.
A general description of linear block codes is provided in [1]. Specific information
relating to LDPC codes is given in [30, 31].
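Theorem 2.1 lends itself to a brute-force check on small codes. The sketch below (illustrative; the (7,4) Hamming parity-check matrix is a standard example, not taken from the text) searches for the smallest set of columns summing to zero over F2:

```python
from itertools import combinations

def dmin_by_massey(H):
    """Minimum distance via Theorem 2.1: the smallest nonzero number of
    columns of H that sum to the zero vector over F2."""
    m, n = len(H), len(H[0])
    cols = [tuple(H[r][c] for r in range(m)) for c in range(n)]
    for d in range(1, n + 1):
        for subset in combinations(range(n), d):
            acc = [0] * m
            for c in subset:
                acc = [(a + b) % 2 for a, b in zip(acc, cols[c])]
            if all(v == 0 for v in acc):
                return d
    return None

# Parity-check matrix of the (7,4) Hamming code: all columns are distinct and
# nonzero, but some three columns sum to zero, so dmin = 3.
H_hamming = [[1, 0, 1, 0, 1, 0, 1],
             [0, 1, 1, 0, 0, 1, 1],
             [0, 0, 0, 1, 1, 1, 1]]
assert dmin_by_massey(H_hamming) == 3
```

With dmin = 3 the code corrects ⌊(3 − 1)/2⌋ = 1 error, matching the bound stated above.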
2.3 Low-Density Parity-Check Codes
In general terms a low-density parity-check (LDPC) code is a linear block code which has
a sparsely populated parity-check matrix. In the remainder of this chapter, we will review
the development of LDPC codes, from the codes first presented by Gallager through to
recent capacity approaching structures.
As an example, we now consider the parity-check matrix H of a simple code (c.f. [20]).
H =
[ 1 0 1 0 1 0 1 0 ]
[ 1 0 0 1 0 1 0 1 ]
[ 0 1 1 0 0 1 1 0 ]
[ 0 1 0 1 1 0 0 1 ]    (2.1)
We assign each codeword bit to a variable, vs : s ∈ {1 . . . n}, where each variable also
corresponds to a column of H. Each row of H then represents a parity-check constraint
of the code, cr : r ∈ {1 . . . m}. The participation of variable vs in check constraint cr
is implied by a nonzero element at position (r, s) in H. Variable and check nodes can
also have values associated with them. We use the symbols vs and cr both to label these
nodes and to represent their value. We denote the number of nonzero elements in a row,
or column, as its weight.
Each row of H represents a parity-check constraint. Hence the full set of constraints,
assuming binary arithmetic, follows. If the values assigned to the set of variables represent
a valid codeword then cr = 0 for all r ∈ {1 . . . m}.
v1 + v3 + v5 + v7 = c1 (2.2)
v1 + v4 + v6 + v8 = c2 (2.3)
v2 + v3 + v6 + v7 = c3 (2.4)
v2 + v4 + v5 + v8 = c4 (2.5)
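The constraint set (2.2)–(2.5) can be evaluated mechanically from H; a small sketch (illustrative code):

```python
# The example parity-check matrix H of (2.1); row r lists which variables
# participate in check c_r.
H = [[1, 0, 1, 0, 1, 0, 1, 0],
     [1, 0, 0, 1, 0, 1, 0, 1],
     [0, 1, 1, 0, 0, 1, 1, 0],
     [0, 1, 0, 1, 1, 0, 0, 1]]

def checks(v):
    """Evaluate c_1..c_4 of (2.2)-(2.5) over F2; all-zero iff v is a valid codeword."""
    return [sum(h & x for h, x in zip(row, v)) % 2 for row in H]

assert checks([0] * 8) == [0, 0, 0, 0]               # the all-zero codeword is valid
assert checks([1] * 8) == [0, 0, 0, 0]               # every check has even degree 4
assert checks([1, 0, 0, 0, 0, 0, 0, 0]) != [0] * 4   # a single flipped bit is detected
```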
Gallager defined a regular structure for low-density parity-check codes as follows.
Definition 2.1 (Regular LDPC Code). A regular LDPC code with parameters (n, j, i)
is a block length n code having a parity-check matrix with column and row weights of
exactly j ones per column and i ones per row.
In contrast to the above, irregular LDPC codes permit variables to participate in
different numbers of checks, and allow check constraints to apply to different numbers
of variables. An irregular LDPC code can be described by the row and column vector
weight distributions of its parity check matrix. Techniques for designing row and column
weight distributions which provide significantly improved performance over regular codes
have been proposed [21–23, 32], and are discussed in Section 2.9.1.
We note that the example code in (2.1) has a regular (8, 2, 4) structure. It is also
common to label regular LDPC codes without specifying the block length, e.g. (2,4)-
regular. As we shall see in Section 2.5, it is the sparsity property of the parity-check
matrix which allows efficient decoding of these codes.
There are many ways to build a (j, i)-regular parity-check matrix. A common ap-
proach is to concatenate or superpose random permutation matrices, i.e. the identity
matrix with randomly permuted columns [5, 14]. In Figure 2.1 we use the notation
of [14], where a circled number represents that number of (non-overlapping) superposed
random permutation matrices. Figure 2.1(a) shows an example (3,6)-regular construc-
tion. Figure 2.1(b) shows how random permutation matrices may be concatenated to
build a (4,8)-regular code. Figure 2.1(c) shows how we may also view the superposed
construction as a random permutation of edge connections. A total of n left nodes of
degree j represent variables, and m right nodes of degree i represent check constraints.
Graphical code representations are discussed in more detail below.
If the rows of H are linearly independent then the rate of the code is r = (i − j)/i,
otherwise the rate is r = (n − l)/n, where l is the dimension of the row space of H over
the binary field which is used here.
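The concatenated construction of Figure 2.1(b) can be sketched directly: a (j, i)-regular H is assembled as a j × i grid of random b × b permutation matrices with b = n/i. This is an illustrative implementation, not code from the thesis:

```python
import random

def regular_ldpc_H(n, j, i, rng):
    """Build a (j, i)-regular parity-check matrix as a j x i grid of random
    b x b permutation matrices, b = n / i (the concatenated construction)."""
    assert n % i == 0
    b = n // i
    H = []
    for _ in range(j):                    # each block-row adds weight 1 to every column
        rows = [[0] * n for _ in range(b)]
        for blk in range(i):              # one random permutation matrix per block
            perm = list(range(b))
            rng.shuffle(perm)
            for r in range(b):
                rows[r][blk * b + perm[r]] = 1
        H.extend(rows)
    return H

H = regular_ldpc_H(n=24, j=3, i=6, rng=random.Random(0))
assert all(sum(row) == 6 for row in H)                                    # row weight i
assert all(sum(H[r][c] for r in range(len(H))) == 3 for c in range(24))   # column weight j
```

The resulting matrix has m = jn/i rows, so if all rows are linearly independent the rate is r = (i − j)/i, as stated above.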
The following two important results exist, for randomly generated regular LDPC
codes having column weight j ≥ 3 [5, 14]. Firstly, the minimum distance of such a code
increases linearly with block length. Secondly, if decoded by an optimal decoder, and
if the (nonzero) code rate is below some maximum rate, such a code will achieve an
arbitrarily low probability of error in the limit as block length tends to infinity.
Figure 2.1: (3,6)-regular LDPC construction using random permutation matrices. (a) Superposed (MacKay 1A); (b) concatenated; (c) edge permutation (Π).
2.4 Factor Graph Representation of LDPC Codes
We use the term compound code to classify any code which can be broken down into
constituent subcodes, e.g. an LDPC code [33]. The resurgence of interest in compound
code structures and iterative decoding algorithms has brought with it new representations
for codes on graphs. These provide a useful tool for code design, and assist in the analysis
of decoding algorithms.
Consider a global function, i.e. a function of all n variables, g(x1, . . . , xn). Assume
that this global function factors into a product of local functions, fr(Xr). Here r ∈ R is
the local function index and Xr represents the subset of variables which participate in fr.
Hence we have
g(x1, . . . , xn) = ∏_{r∈R} fr(Xr).    (2.6)
We may describe this factorisation using the following graphical model [20].
Definition 2.2 (Factor Graph). A factor graph is a bipartite graph which represents
the factorisation described in (2.6). The graph consists of variable and function nodes.
A variable node exists for each variable xs such that s ∈ {1 . . . n}. A function node exists
for each local function fr. An edge exists in the graph between the variable node xs and
function node fr if and only if xs ∈ Xr.
As a simple example, we consider a real valued global function of three variables,
g(x1, x2, x3), that can be written as a product of two local functions f1 and f2, as follows.
g(x1, x2, x3) = f1(x1, x2)f2(x1, x2, x3) (2.7)
The factor graph representation of (2.7) is shown in Figure 2.2. We represent variable
nodes and function nodes using circles and boxes respectively.

Figure 2.2: Factor graph representation of the global function g(x1, x2, x3).

Variable node xs, for s ∈ {1, 2, 3}, is connected via an edge to the function node fr,
for r ∈ {1, 2}, if and only
if xs is an argument of fr.
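The factorisation (2.7) and its bipartite graph can be written out concretely; the local functions f1 and f2 below are illustrative choices (the text leaves them abstract):

```python
# Factorisation g(x1,x2,x3) = f1(x1,x2) * f2(x1,x2,x3) of (2.7),
# with illustrative (assumed) local functions.
f1 = lambda x1, x2: 1.0 + x1 * x2
f2 = lambda x1, x2, x3: 2.0 + x1 + x2 * x3

def g(x1, x2, x3):
    """Global function as the product of the local functions."""
    return f1(x1, x2) * f2(x1, x2, x3)

# The factor graph is bipartite: an edge (f_r, x_s) exists iff x_s is an
# argument of f_r (Definition 2.2).
edges = {('f1', 'x1'), ('f1', 'x2'),
         ('f2', 'x1'), ('f2', 'x2'), ('f2', 'x3')}
assert ('f1', 'x3') not in edges      # x3 does not participate in f1
assert g(1, 1, 1) == f1(1, 1) * f2(1, 1, 1)
```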
The bipartite factor graph structure is very well suited to representing LDPC codes.
Function nodes represent parity-check constraints of the code, and are therefore also
called check nodes. The (2,4)-regular structure described by (2.1) is mapped onto the
factor graph shown in Figure 2.3. The matrix H specifies the (bi-directional) edges of the
graph according to the position of its nonzero elements. Here the n = 8 variable nodes,
each having degree j = 2, represent codeword symbols. The m = 4 check nodes, each
having degree i = 4, represent parity-checks.
Figure 2.3: Factor graph representation of a (2,4)-regular code.
Consider a code C of length n with parity-check matrix H. Recall that a valid
codeword x satisfies Hx⊤ = 0, and that each row of H represents an individual parity-
check constraint. We assign a binary local indicator function to each check constraint.
The local function associated with check cr is satisfied when all variables participating
in the check, i.e. the set Xr, sum to zero over F2, i.e. [ ∑_{xs∈Xr} xs = 0 ]. Here we use
Iverson’s convention [34] to indicate the truth of a statement S. If S is true then [S] = 1,
otherwise [S] = 0. Hence, when the local function is satisfied it assumes the value 1,
otherwise it has value 0. The product of these local indicator functions is then called
the global indicator function. The vector x is a valid codeword, corresponding to a valid
configuration on the factor graph, if and only if it satisfies the global indicator function
of the code. Therefore the global function is also called the code set membership function
and denoted [(x1, . . . , xn) ∈ C] for a code C of length n. Consider the simple example
code given in (2.1) with constraint set listed in (2.5). Using arithmetic over F2, the global
indicator function for this code is given by
[(x1, . . . , x8) ∈ C] = [x1 + x3 + x5 + x7 = 0] · [x1 + x4 + x6 + x8 = 0]
· [x2 + x3 + x6 + x7 = 0] · [x2 + x4 + x5 + x8 = 0]. (2.8)
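The global indicator function is straightforward to evaluate in software: a word lies in C exactly when every row of H checks to zero over F2. A minimal sketch (plain NumPy, with H transcribed from the four checks in (2.8)):

```python
import numpy as np

# Parity-check matrix of the (2,4)-regular example code, transcribed
# from the four parity checks listed in (2.8).
H = np.array([
    [1, 0, 1, 0, 1, 0, 1, 0],   # x1 + x3 + x5 + x7 = 0
    [1, 0, 0, 1, 0, 1, 0, 1],   # x1 + x4 + x6 + x8 = 0
    [0, 1, 1, 0, 0, 1, 1, 0],   # x2 + x3 + x6 + x7 = 0
    [0, 1, 0, 1, 1, 0, 0, 1],   # x2 + x4 + x5 + x8 = 0
])

def in_code(x, H):
    """Global indicator function [x in C]: 1 iff every check sums to 0 over F2."""
    return int(np.all(H.dot(x) % 2 == 0))

print(in_code(np.zeros(8, dtype=int), H))              # -> 1 (all-zero word)
print(in_code(np.array([1, 1, 1, 1, 0, 0, 0, 0]), H))  # -> 1 (satisfies all checks)
print(in_code(np.array([1, 0, 0, 0, 0, 0, 0, 0]), H))  # -> 0 (violates checks 1 and 2)
```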
Tanner first introduced what is now known as the factor graph representation of
Gallager’s codes [9]. He also foresaw the parallel processing potential of breaking the long
codes into smaller constituent components for decoding. Wiberg, Loeliger and Kotter [15,
19] added state variable representation to the Tanner graph, allowing representation of
turbo and trellis codes2. Applications of factor graphs extend well beyond the field of
error control coding, and they may be used to describe many other algorithms, such as
the Kalman filter and certain fast Fourier transform (FFT) algorithms [20].
2.5 Iterative Decoding of LDPC Codes
The transmitted codeword x is subject to noise from the channel, as described in Sec-
tion 1.1.5. The decoder provides an estimate x of the transmitted information, based
upon code constraints, and the received vector y. In this section we review some decoding
approaches which employ iterative message-passing on the code’s factor graph. Message
values are computed at each node based upon the local code constraints. The sparse
nature of H admits low complexity computation.
2 Factor graphs are sometimes referred to in the literature as Tanner graphs and/or TWLK graphs.
2.5.1 The Sum-Product Algorithm
Soft decision decoding algorithms make use of the received soft information. Considering
the channel in use, a vector of prior values is generated from y and input to the decoder.
When the decoding process is complete, a vector of posterior values is returned. Enforcing
a hard decision on this vector then yields x.
The sum-product algorithm can be used to perform soft decision decoding on the
factor graph, by considering each node as a local processor [15, 20]. The bi-directional
edges are used to carry messages between processors, during each iterative step. In
this section we limit our discussion of the sum-product algorithm specifically to the
application of decoding LDPC codes. More general descriptions of the algorithm are
provided in [15, 20].
The messages that are passed during decoding are binary probability mass functions,
or some representation thereof, e.g. a log-likelihood ratio. Each message µns→nr
represents a belief being passed from the source node ns to the receiving node nr, based upon
the constraints of the code. Hence the process is also referred to as belief propagation.
The value of each received symbol, ys, is used to calculate the prior message, ps,
for all s ∈ {1 . . . n}. In the probability domain, the prior message represents a proba-
bility mass function such that ps(0) = p(xs = 0|ys) and ps(1) = p(xs = 1|ys). All other
messages internal to the decoder are initialised to µ(0) = µ(1) = 0.5. We may decrease
computational complexity with only a small loss in performance, by using low precision
messages rather than real numbers [35].
As discussed in Section 2.4, the factor graph represents the factorisation of a global
function which enforces the code set membership constraint. Each check node enforces a
local exclusive-or (XOR) constraint, stating that the overall parity of messages entering
the node should be zero. Each variable enforces a local equality constraint on messages
entering the node, stating that they should all indicate the same value for the variable.
These constraints dictate the computation that occurs at the nodes, and hence the mes-
sages that result at the node outputs. As the messages represent soft probability values,
we use soft-logic gates [3, 20] to enforce the constraints. Moreover, the nodes calculate
an output message for each bi-directional edge. Initially we consider a 3-edge node,
describing the computation which generates an output message for one edge only, and
then extend this to the bi-directional case. We calculate the output message, i.e. binary
probability mass function, pZ(z) = (pZ(0), pZ(1)) using input messages pX(x) and pY (y),
which are assumed to be independent3.
Definition 2.3 (Soft-Logic Gate (SG)). A soft-logic gate is a single output, dual
input device. The inputs, pX(x) and pY (y), and output, pZ(z), represent the probability
mass of a binary random variable. The output is calculated according to some function
pZ(z) = f(pX(x), pY (y)) which returns a value in the range [0,1].
Two types of soft-logic gate are required to build a sum-product decoder, namely
the soft-XOR and soft-equal gate. These gates are used to construct check and variable
nodes respectively.
Definition 2.4 (Soft-XOR Gate). The soft-XOR gate calculates the output probability
mass pZ(z), using the inputs pX(x) and pY (y), as follows.

    pZ(0) = pX(0)pY (0) + pX(1)pY (1)
    pZ(1) = pX(0)pY (1) + pX(1)pY (0)                                    (2.9)

Definition 2.5 (Soft-Equal Gate). The soft-equal gate calculates the output probability
mass pZ(z) according to the following expression, where the normalisation factor γ is
chosen to ensure pZ(0) + pZ(1) = 1.

    pZ(0) = γ pX(0)pY (0)
    pZ(1) = γ pX(1)pY (1)                                                (2.10)
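The two gate equations translate directly into code. A sketch of (2.9) and (2.10), operating on probability pairs (p(0), p(1)):

```python
def soft_xor(pX, pY):
    """Soft-XOR gate (2.9): distribution of the XOR of two independent bits."""
    p0 = pX[0] * pY[0] + pX[1] * pY[1]
    p1 = pX[0] * pY[1] + pX[1] * pY[0]
    return (p0, p1)

def soft_equal(pX, pY):
    """Soft-equal gate (2.10): pointwise product, renormalised by gamma
    so that the output mass sums to one."""
    p0, p1 = pX[0] * pY[0], pX[1] * pY[1]
    gamma = 1.0 / (p0 + p1)
    return (gamma * p0, gamma * p1)

# Two mildly confident "0" inputs: XOR becomes less certain, equality more so.
print(soft_xor((0.9, 0.1), (0.9, 0.1)))    # -> (0.82, 0.18)
print(soft_equal((0.9, 0.1), (0.9, 0.1)))  # p(0) rises above 0.98
```

Note how the gates behave as the text describes: soft-XOR dilutes confidence (two agreeing inputs still leave parity uncertain), while soft-equal reinforces it.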
Figure 2.4 shows how we may use these gates to build a bi-directional gate, for which
we have the following definition.
Definition 2.6 (Bi-directional Soft-Logic Gate (BSG)). The bi-directional soft-
logic gate is a 3-port device for which each port has an output and input. The out-
put from each port is calculated using the two extrinsic edge inputs, according to the
same function, i.e. pX(x)out = f(pY (y)in, pZ(z)in), pY (y)out = f(pX(x)in, pZ(z)in), and
pZ(z)out = f(pX(x)in, pY (y)in). Hence a BSG may be constructed from three soft gates,
by aligning the output of one to each BSG output edge.
As a result of Forney’s normal graph realisations [36], we may break down the structure
of a multiple edged node into constituent components. Variable and check nodes of
arbitrary size may be constructed by cascading bi-directional soft-equal and soft-XOR
gates respectively [3, 20]. This is discussed in more detail in Section 6.5.3.

3 Note that we have temporarily reused x and y for the soft-logic gate descriptions, and they do not
have their usual meaning, i.e. representing codeword and received symbols respectively. Here x, y, and
z represent particular values for the random variables X, Y , and Z respectively.

Figure 2.4: Soft gates. (a) Single output soft gate. (b) Bi-directional 3-port soft gate.
Once the variable and check nodes have been constructed, we connect them using
bi-directional edges, according to the structure of the factor graph. We define Γ (vs)
as the set of all check nodes adjacent to variable node vs, specified by column s of H.
Similarly, the set of all variable nodes adjacent to check node cr, Γ (cr), is specified by row
r of H. Variable-check messages are denoted µvs→cr and check-variable messages µcr→vs.
We let Γ (vs) \c denote all checks Γ (vs) excluding check c, and similarly define Γ (cr) \v.
A single direction input edge, used to carry the prior message, ps, is appended to each
variable node in the factor graph. Similarly, a single direction output edge is appended
to carry the posterior message, qs.
There are many possible schedules for passing messages around the graph. In this
work we assume a flooding schedule [33], such that all messages µvs→cr are passed, followed
by all messages µcr→vs, in a single iterative step.
The posterior value, qs, is generated for each codeword symbol, at the end of each
iteration. A hard decision is then performed on these values, to generate the codeword
estimate x. If this estimate satisfies all code constraints, then the process is halted,
otherwise it is repeated until some maximum number of iterations has expired.
Given the received values y and factor graph for H, we now summarise the soft
decision sum-product algorithm. Both probability and log-likelihood domain descriptions
are provided, assuming an AWGN channel and a BPSK mapped transmission M(0) =
+1, M(1) = −1. We also assume knowledge of the channel noise variance, σ2. The
probability domain check-variable and variable-check update rules follow from the above
soft-XOR and soft-equal gate descriptions respectively. The log-likelihood domain rules
are easily derived [20], using the LLR definition (Def. 1.1).
When implementing soft decision algorithms it is important to consider potential
numerical issues. In Step 3 of the log-likelihood domain algorithm, we must handle the
singularity of tanh−1(a) as |a| → 1. To do so we define the clipping parameter η.
For the floating-point simulations discussed in this work, we avoid numerical overflow
by clipping the input a, such that a = ±η when |a| > η. We generally set the clipping
parameter value close to one, η = 1 − 10−10, in order to offer good dynamic range for
computation while protecting against overflow.
2.5.2 Hard Decision Decoding
We can perform hard decision decoding in the case of the AWGN, binary erasure, and
binary symmetric [1] channels. If soft information is carried with the received vector
then it is discarded. These algorithms do not offer the same level of performance as soft
decision algorithms. However, their simplicity makes them easier to implement, and has
also led to some interesting analytical results [5, 21, 24, 35].
In this discussion we consider y to represent the hard received vector, where each
symbol ys ≥ 0 is mapped to +1 and each ys < 0 to −1. Gallager originally proposed
the following two hard decision algorithms [5].
Gallager Decoding Algorithm A
The message output from variable node vs along edge e is equal to the received value ys
for that node, unless all messages entering vs other than that along edge e disagree with
ys. In this case the opposite of ys is sent, i.e. if ys = −1 then a message value of +1 is
sent, and vice versa.
The message output from check cr to a connected variable along edge e, is the product
of all messages incoming on edges connected to cr, excluding the one from e.
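The two local rules of Algorithm A act on hard ±1 messages and can be sketched as stand-alone helpers (the function names are illustrative; the surrounding flooding schedule is as described above):

```python
def var_message_A(ys, extrinsic):
    """Variable-node rule of Gallager A: send the received value ys, unless
    every other incoming message disagrees with it, in which case send the
    opposite value. ys and all messages take values +1 or -1."""
    if extrinsic and all(m == -ys for m in extrinsic):
        return -ys
    return ys

def check_message_A(extrinsic):
    """Check-node rule: the product of all other incoming +/-1 messages,
    i.e. the value the excluded variable must take for even parity."""
    out = 1
    for m in extrinsic:
        out *= m
    return out

print(var_message_A(+1, [-1, -1]))   # all others disagree -> -1
print(var_message_A(+1, [-1, +1]))   # not unanimous -> +1
print(check_message_A([+1, -1, -1])) # -> +1
```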
Algorithm 2.1: Sum-Product Decoder (Probability Domain)

1. Initialisation:
   Initialise messages µvs→cr(0) = µvs→cr(1) = 0.5, and µcr→vs(0) = µcr→vs(1) = 0.5.

2. Set Priors:
   Using each received symbol ys, calculate ν = 2ys/σ2.
   Set ps(0) = eν/(1 + eν), and ps(1) = e−ν/(1 + e−ν).

3. Check - Variable:
   Let δvs = µvs→cr(0) − µvs→cr(1) and calculate

       δcr→v = ∏vs∈Γ (cr)\v δvs.

   From each check cr to each variable v ∈ Γ (cr) send
   µcr→v(0) = (1 + δcr→v)/2, and µcr→v(1) = (1 − δcr→v)/2.

4. Variable - Check:
   From each variable vs to each c ∈ Γ (vs) send

       µvs→c(0) = γ ps(0) ∏cr∈Γ (vs)\c µcr→vs(0),
       µvs→c(1) = γ ps(1) ∏cr∈Γ (vs)\c µcr→vs(1).

   The normalisation factor γ is chosen so that µvs→c(0) + µvs→c(1) = 1.

5. Calculate Posteriors:
   For each symbol calculate

       qs(0) = γ ps(0) ∏cr∈Γ (vs) µcr→vs(0),
       qs(1) = γ ps(1) ∏cr∈Γ (vs) µcr→vs(1).

   The normalisation factor γ is chosen so that qs(0) + qs(1) = 1.

6. Stop/Continue:
   Calculate x from xs = [qs(1) > 1/2].
   If Hx = 0, then exit declaring success, else if iteration limit reached then exit
   declaring failure, otherwise return to Step 3.
Algorithm 2.2: Sum-Product Decoder (Log-Likelihood Domain)

1. Initialisation:
   Initialise messages µvs→cr = µcr→vs = 0.

2. Set Priors:
   Using each received symbol ys, set ps = 2ys/σ2.

3. Check - Variable:
   From each check cr to each variable v ∈ Γ (cr) send

       µcr→v = 2 tanh−1 ( ∏vs∈Γ (cr)\v tanh(µvs→cr/2) ).

4. Variable - Check:
   From each variable vs to each c ∈ Γ (vs) send

       µvs→c = ps + ∑cr∈Γ (vs)\c µcr→vs.

5. Calculate Posteriors:
   For each symbol calculate

       qs = ps + ∑cr∈Γ (vs) µcr→vs.

6. Stop/Continue:
   If all checks cr satisfy ∏vs∈Γ (cr) qs > 0, then exit declaring success, else if
   iteration limit reached then exit declaring failure, otherwise return to Step 3.
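Algorithm 2.2 can be sketched compactly with dense NumPy arrays (a real implementation would exploit the sparsity of H). Here the variable-check messages are initialised directly to the priors, which is equivalent to initialising them to zero and performing the first variable-node update; the clipping parameter η guards the tanh−1 singularity as discussed above.

```python
import numpy as np

def sp_decode_llr(H, y, sigma2, max_iter=50, eta=1 - 1e-10):
    """Log-likelihood-domain sum-product decoder for BPSK over AWGN.
    A dense-matrix sketch of Algorithm 2.2, not an optimised sparse decoder."""
    m, n = H.shape
    p = 2.0 * np.asarray(y, dtype=float) / sigma2      # prior LLRs (Step 2)
    mu_vc = np.tile(p, (m, 1)) * H                     # first variable-to-check pass
    mu_cv = np.zeros((m, n))
    x_hat = (p < 0).astype(int)
    for _ in range(max_iter):
        # Check -> Variable (Step 3): tanh rule, product over the other edges.
        for r in range(m):
            idx = np.flatnonzero(H[r])
            t = np.tanh(0.5 * mu_vc[r, idx])
            for j, s in enumerate(idx):
                prod = np.prod(np.delete(t, j))
                mu_cv[r, s] = 2.0 * np.arctanh(np.clip(prod, -eta, eta))
        # Variable -> Check (Step 4) and posteriors (Step 5), via leave-one-out.
        q = p + mu_cv.sum(axis=0)
        mu_vc = (q[None, :] - mu_cv) * H
        # Stop/Continue (Step 6): hard decision (negative LLR -> bit 1).
        x_hat = (q < 0).astype(int)
        if not np.any(H.dot(x_hat) % 2):
            break
    return x_hat
```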
Gallager Decoding Algorithm B
This algorithm is a modified version of Algorithm A. In the variable message calculation,
the opposite of the received value will be sent if at least b incoming messages disagree.
The value of b is typically a function of the edge degrees and iteration number.
Erasure Decoding
Consider the binary erasure channel described in Section 1.1.5, with output mapping
M(0) = +1, M(1) = −1, M(erasure) = 0. Define the sign operator sgn (a) = ±1 for a ≷ 0,
and sgn (0) = 0. The message-passing erasure decoder operates, using real arithmetic,
as follows [37]. We may consider this algorithm to be a hard decision version of the
sum-product decoder (Algorithm 2.2).
Algorithm 2.3: Erasure Decoder

1. Initialisation:
   Set the value of each variable vs to the value of the corresponding received
   symbol ys ∈ {0, −1, +1}. Initialise all messages to 0.

2. Variable - Check:
   From each variable vs to each c ∈ Γ (vs) send

       µvs→c = sgn ( vs + ∑cr∈Γ (vs)\c µcr→vs ).

3. Check - Variable:
   From each check cr to each variable v ∈ Γ (cr) send

       µcr→v = ∏vs∈Γ (cr)\v µvs→cr.

   For all variables vs, if at least one µcj→vs ≠ 0, then assign that value to vs.

4. Stop/Continue:
   If vs ≠ 0 for all variables vs, then exit declaring success.
   If this is not the first iteration, and the set of values currently assigned to
   variables vs is identical to that assigned at the last iteration, then exit declaring
   failure. Otherwise return to Step 2.
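A sketch of Algorithm 2.3 follows, returning the recovered word or None on failure. One simplification is made: the decoder declares failure as soon as an iteration changes nothing, which is equivalent to the stated stopping rule because an unchanged state can never change later.

```python
import numpy as np

def erasure_decode(H, y, max_iter=100):
    """Message-passing erasure decoder; y has entries in {0, -1, +1},
    with 0 marking an erasure. Dense-matrix sketch of Algorithm 2.3."""
    m, n = H.shape
    v = np.array(y, dtype=int)
    mu_cv = np.zeros((m, n), dtype=int)
    for _ in range(max_iter):
        prev = v.copy()
        # Step 2, Variable -> Check: sign of value plus extrinsic beliefs.
        totals = v + mu_cv.sum(axis=0)
        mu_vc = np.sign(totals[None, :] - mu_cv) * H
        # Step 3, Check -> Variable: product of the other incoming messages.
        for r in range(m):
            idx = np.flatnonzero(H[r])
            for j, s in enumerate(idx):
                mu_cv[r, s] = np.prod(np.delete(mu_vc[r, idx], j))
        # Assign any still-erased variable that received a nonzero message.
        for s in range(n):
            incoming = mu_cv[H[:, s] == 1, s]
            incoming = incoming[incoming != 0]
            if v[s] == 0 and incoming.size > 0:
                v[s] = incoming[0]
        # Step 4, Stop/Continue.
        if np.all(v != 0):
            return v
        if np.array_equal(v, prev):
            return None        # stalled: the erasures contain a stopping set
    return None
```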
2.6 Alternative Representations and Algorithms
In this dissertation we focus on factor graphs, however other code representations have
also been proposed. Bayesian networks may be used to represent error control codes [13,
33], and have a similar structure to that of the factor graph. However, Bayesian networks
are directed acyclic graphs which are not bipartite, i.e. all nodes are of the same type.
At around the same time as Kschischang et al. were preparing the factor graph
approach [20], Aji and McEliece were developing an alternative view [38]. They presented
a generalised distributive law message-passing algorithm, and related it to several other
well known iterative decoding algorithms. They describe its operation using Bayesian
networks and junction trees.
Forney introduced normal graphs in [36]. These graphs are a modified form of factor
graph, such that all symbol variables have degree one and all internal state variables
have degree two. Symbol variables represent an external interface, while state edges are
used for message-passing. These edge restrictions imply that computation is performed
only at constraint nodes. Hence there is a well defined separation of tasks for each graph
component.
There are several well known decoding algorithms, e.g. the min-sum algorithm, which
are closely related to the sum-product algorithm. They use the same message-passing
approach but perform alternative node calculations [15].
Performance improvements can be obtained by altering the message schedule, so
that it accounts for structural properties of the factor graph which can cause problems
for iterative decoding [39]. Some investigation of the effects of attenuating messages has
also been undertaken [40–42].
Modified versions of the belief propagation decoder have been proposed, which offer a
tradeoff between error correcting performance and implementation complexity. Fossorier
et al. have presented a simplified decoding algorithm [43] which does not depend upon the
channel variance, and hence does not require channel parameter estimation. A two phase
iterative reliability based algorithm has also been proposed [44]. Alternative approaches
to practical decoding have been presented in [45–47].
Feldman has recently introduced a decoding algorithm which is based upon linear
programming [26]. In contrast to standard message-passing, the behaviour of this decoder
is well defined, even in the presence of cycles.
2.7 Encoding LDPC Codes
Low-density parity-check codes are a class of linear block code. They may therefore
be encoded using the generator matrix, as described in Section 2.2. In the absence of
structure built into the code, as the block length is increased this method places large
storage and processing demands on the encoder. Although H is designed to be sparse,
the generator matrix for the code, G, is generally not sparse4. Encoders may therefore
become considerably slower and larger when the block length n is increased, as matrix-
vector multiplication has complexity O (n2). This has motivated recent research into
finding computationally efficient encoders and structured codes which are amenable to
low complexity encoder implementation.
In this section we review several approaches to building a dedicated encoder archi-
tecture for linear time encoding of LDPC codes. We first provide an overview of existing
methods that may be used to build structured codes, which are specifically designed to
allow simple encoder implementation. We then describe a technique that may be used
to transform codes, so that an encoder with approximately linear time complexity may
be built.
In Chapter 3 we present novel approaches to low complexity encoding which reuse
the decoder architecture. The further motivation behind this approach is to reduce the
overall size of the communication system by allowing one circuit to perform both functions
on a time switched basis.
2.7.1 Structured Codes
Sparse Computation and Back Substitution
The design of LDPC codes with parity-check matrices which have an almost triangular
structure was proposed by MacKay et al. [32]. This method allows most of the parity
bits to be calculated in linear-time using sparse operations, i.e. functions which only
involve a small number of variables. Their experiments show that such codes have a
performance which approximates that of regular LDPC codes. A triangular parity-check
matrix is shown in Figure 2.5.
We consider codewords arranged as row vectors x = [xp | xu], where xu are the
information bits and xp are the parity bits. Similarly we have partitioned the parity-
check matrix, H = [Hp | Hu]. We define the intermediate vector b ≜ Huxu⊤, which
may be evaluated in linear time via sparse matrix-vector computation. When Hp has
triangular form we may compute xp in linear time using Hp and b, via back-substitution.
For the above example this may be done in the following three steps.
4 In fact, if G were sparse then we would not expect the code to have a good minimum distance. Recall
that the lowest weight row in G upper bounds dmin.
        xp1 . . . . . . . xp9   xu1 . . . . . . . xu9

H =   [ 1 1 1 0 1 0 0 0 0   1 0 0 0 1 0 0 0 1 ]
      [ 0 1 0 1 0 0 0 1 0   0 1 0 1 0 0 1 0 0 ]
      [ 0 0 1 0 0 1 0 0 1   0 0 1 0 0 1 0 1 0 ]
      [ 0 0 0 1 0 0 0 0 0   1 0 0 1 0 0 0 0 1 ]
      [ 0 0 0 0 1 0 1 0 0   0 1 0 0 1 1 0 0 0 ]
      [ 0 0 0 0 0 1 0 0 0   0 0 0 1 0 0 0 1 0 ]
      [ 0 0 0 0 0 0 1 0 0   0 0 1 0 0 0 1 1 0 ]
      [ 0 0 0 0 0 0 0 1 0   1 0 0 0 1 0 1 0 1 ]
      [ 0 0 0 0 0 0 0 0 1   0 1 0 0 0 1 0 0 0 ]
              Hp                    Hu

Figure 2.5: An upper-triangular parity-check matrix.
1. xp4 = b4, xp6 = b6, xp7 = b7, xp8 = b8, xp9 = b9
2. xp2 = xp4 + xp8 + b2, xp3 = xp6 + xp9 + b3, xp5 = xp7 + b5
3. xp1 = xp2 + xp3 + xp5 + b1
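The general back-substitution procedure, of which the three steps above are an instance, can be sketched as follows (assuming Hp is upper-triangular with ones on its diagonal, as in Figure 2.5; in a real encoder the product Hu xu⊤ would use sparse arithmetic):

```python
import numpy as np

def encode_triangular(Hp, Hu, xu):
    """Encode via b = Hu xu^T followed by back-substitution on the
    upper-triangular Hp, all arithmetic over F2."""
    b = Hu.dot(xu) % 2
    mp = Hp.shape[0]
    xp = np.zeros(mp, dtype=int)
    for i in range(mp - 1, -1, -1):              # solve from the last row up
        xp[i] = (b[i] + Hp[i, i + 1:].dot(xp[i + 1:])) % 2
    return np.concatenate([xp, xu])              # codeword x = [xp | xu]
```

By construction Hp xp⊤ = b = Hu xu⊤, so the two contributions cancel modulo 2 and every parity check is satisfied.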
A special case of triangularity occurs when Hp has the dual diagonal structure shown
in Figure 2.6. The structure was first proposed by Ping et al. [48], and has recently
reappeared in the literature [49, 50]. Here the back-substitution process follows a simple
pattern and may be performed using an accumulator.
Hp =  [ 1 1 0 0 0 0 ]
      [ 0 1 1 0 0 0 ]
      [ 0 0 1 1 0 0 ]
      [ 0 0 0 1 1 0 ]
      [ 0 0 0 0 1 1 ]
      [ 0 0 0 0 0 1 ]

Figure 2.6: A staircase structure for Hp.
In Section 3.2 we show how the decoder architecture can be reused to perform encod-
ing, via back-substitution, when Hp is triangular. We note that if the parity-check matrix
has any columns of weight j > 1, and is strictly triangular, then the code structure is
necessarily irregular.
Cyclic Code Structures
Another approach to providing linear time encodability employs cyclic [51, 52] or quasi-
cyclic structures [53, 54]. A cyclic code has the property that any cyclic shift of a codeword
is itself a codeword. A quasi-cyclic code has the property that, for a fixed shift size, a
cyclic shift of any codeword by that size is itself a codeword. We can build the parity-
check matrix for a quasi-cyclic LDPC code by horizontally concatenating sparse circulant
matrices, each having dimension m × m [55].
Definition 2.7 (Circulant Matrix). A circulant matrix is a square matrix in which
each row is a cyclic shift of the previous row.
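A circulant block is fully determined by its first row, which makes quasi-cyclic parity-check matrices convenient to construct in software. A sketch:

```python
import numpy as np

def circulant(first_row):
    """Build a circulant matrix: each row is the previous row cyclically
    shifted one place to the right (Definition 2.7)."""
    r = np.array(first_row, dtype=int)
    return np.array([np.roll(r, i) for i in range(len(r))])

# A quasi-cyclic H is then a horizontal concatenation of circulant blocks,
# e.g. H = np.hstack([circulant(row_p), circulant(row_u1), circulant(row_u2)]),
# where the three first rows are taken from the component matrices.
```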
An example parity-check matrix for a quasi-cyclic code having n = 18 and k = 12,
and hence r = 2/3, is shown in Figure 2.7. This matrix may be easily rearranged into the
quasi-cyclic form presented in [53] by permuting its columns in the order 1,m + 1, 2m +
1, 2,m + 2, 2m + 2, . . . m, 2m, 3m. Note that permuting the rows or columns of H does
not alter the code structure but merely relabels checks or variables respectively.
H =   [ 1 1 0 1 0 0   1 0 1 0 1 0   0 0 1 0 1 1 ]
      [ 0 1 1 0 1 0   0 1 0 1 0 1   1 0 0 1 0 1 ]
      [ 0 0 1 1 0 1   1 0 1 0 1 0   1 1 0 0 1 0 ]
      [ 1 0 0 1 1 0   0 1 0 1 0 1   0 1 1 0 0 1 ]
      [ 0 1 0 0 1 1   1 0 1 0 1 0   1 0 1 1 0 0 ]
      [ 1 0 1 0 0 1   0 1 0 1 0 1   0 1 0 1 1 0 ]
           Hp             Hu1            Hu2

Figure 2.7: A quasi-cyclic parity-check matrix.
We split H into three m×m circulant components. Here there are a total of n0 = 3
component matrices, with k0 = 2 of these corresponding to codeword information bits.
More specifically, Hp corresponds to the parity bits, while Hu1 and Hu2 correspond to the
information bits. If Hp is non-singular then we may pre-multiply H by Hp−1 to obtain
the systematic form Hsys = [Im | Hu(sys)] = [Im | Hp−1Hu1 | Hp−1Hu2]. We then permute the
columns of Hu(sys), in the order 1, m + 1, 2, m + 2, . . . , m, 2m, and relabel the information
bits accordingly. This places it into systematic quasi-cyclic form as shown in Figure 2.8.
Note that each row of Hu(sys) is a right cyclic shift of the previous row, by k0 places.
This allows quasi-cyclic codes to be encoded using the same method as that used for cyclic
codes [56]. To generate the parity bits for the above code we use the k-stage feedback
shift register architecture shown in Figure 2.9. In this diagram the source vector, u, is
          xp1 . . . xp6   xu1 . . . . . . . . . . . . xu12

Hsys =  [ 1 0 0 0 0 0   1 0 0 1 1 1 0 1 1 1 0 1 ]
        [ 0 1 0 0 0 0   0 1 1 0 0 1 1 1 0 1 1 1 ]
        [ 0 0 1 0 0 0   1 1 0 1 1 0 0 1 1 1 0 1 ]
        [ 0 0 0 1 0 0   0 1 1 1 0 1 1 0 0 1 1 1 ]
        [ 0 0 0 0 1 0   1 1 0 1 1 1 0 1 1 0 0 1 ]
        [ 0 0 0 0 0 1   0 1 1 1 0 1 1 1 0 1 1 0 ]
             Im                  Hu(sys)

Figure 2.8: Systematic quasi-cyclic form of H.
input from the left and shifted to the right, starting with bit u1. Each box represents
a memory cell which holds the previously loaded value until a shift is applied, at which
point it takes the value of the cell preceding it. The switch s is initially connected to
the serial information source, to load the source vector into the register. This operation
takes place in k shifts. The first parity bit is then calculated using an exclusive-or (XOR)
operation on selected information bits, chosen by the shift register taps. The tap positions
are set according to the (reversed) position of nonzero terms in the first row of Hu(sys).
The switch is then closed into the feedback position and the data is shifted cyclically k0
times. The next parity bit is then calculated. This process is repeated until all m parity
bits have been generated.
Figure 2.9: Shift register based encoder for a quasi-cyclic code.
We can extend the parity-check matrix to build higher rate codes by horizontally
concatenating further circulant matrices. For the general case of n0 concatenated circu-
lant matrices the block length is n = mn0. Hence, the choice of code rate is restricted by
the relation r = (n0 − 1)/n0.
Other Code Structures
Sipser and Spielman have proposed a class of linear-time encodable and decodable ex-
pander codes in [17]. Their approach involves using cascaded graphs to recursively build
irregular codes based upon simple subcodes at each stage of the graph. Error correcting
codes are built by recursively combining weaker error-reducing codes. They have proven
that if the error-reducing codes are encodable/decodable in linear time, then this prop-
erty will carry through to the resulting error-correcting code [57]. Luby et al. have built
codes based upon these structures which exhibit very good performance [21].
Echard and Chang have presented an approach to building structured quasi-regular
LDPC codes, called π-rotation codes [58, 59]. The parity-check matrix for these codes is
built from component permutation matrices and their rotations. They may be encoded
in linear time using a flip-flop based encoder circuit.
Zhang and Parhi have presented a codec design for (3, i)-regular LDPC codes [60].
Their encoding algorithm is similar to that presented in the following section. However,
rather than transforming an existing code, they specifically construct the code to optimise
encoder speed.
2.7.2 Code Transformation
Richardson and Urbanke have shown that H can be manipulated for most LDPC codes,
so that the coefficient of the quadratic term in the encoding complexity is very small [61].
This is a general solution to the problem of time efficient encodability, as it relies not
upon a new structure of code but rather restructuring any existing (non-singular and
sparse) parity-check matrix. Motivated by the advantages of triangular computation, H
is first rearranged into the approximate triangular form shown in Figure 2.10. This is
done by exploiting the sparseness of H, using column and row reordering only, and hence
the density of the matrix remains unaltered.
The gap, g, is defined as the difference between the number of rows in the approximate
lower-triangular form and the number of rows in H. Encoding complexity is proportional
to n+g2. The problem of reducing encoding complexity therefore becomes one of reducing
g. For a (3,6)-regular LDPC code, the actual number of operations required is no more
than 0.0172n2 + O (n). Hence the very small constant term admits approximately linear
encoding complexity even in the case of large block lengths. Another key result is that
all provably good LDPC codes [62] have very small gaps, typically in the range from 1
to 3, and therefore have linear encoding complexity.
H =  [ A  B  T ]
     [ C  D  E ]

The blocks A, B, and T have height m − g and widths k, g, and m − g respectively,
while C, D, and E have height g and the same respective widths. The region above the
diagonal of T is zero.

Figure 2.10: The parity-check matrix rearranged into approximate lower-triangular form.
The matrix is split into six components as shown. The matrix T is lower-triangular
and all diagonal elements of T are one. We consider the parity section of the (systematic)
codeword to be split into two segments, i.e. x = [u|xp1|xp2]. The parity bits are generated
as follows.
Pre-computation
Pre-compute the dense g × g matrix φ = −ET−1B + D and calculate its inverse.
Note that H must be full rank for φ to be invertible.
Compute First Parity Vector xp1
1. z1 = Au⊤ (linear time sparse multiplication).
2. z2 = T−1z1. As T is lower triangular and sparse this operation may be per-
formed by back-substitution in linear time.
3. z3 = −Ez2 (linear time sparse multiplication).
4. z4 = z3 + Cu⊤ (linear time sparse multiplication and addition).
5. xp1 = φ−1z4 (multiplication by dense g × g matrix).
Compute Second Parity Vector xp2
1. z5 = Bxp1⊤ (linear time sparse multiplication).
2. z6 = z1 + z5 (linear time sparse addition).
3. xp2 = −T−1z6 (linear time back-substitution).
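The step sequence above maps directly onto code. In this sketch all blocks are dense NumPy arrays and the inverse of φ over F2 is assumed precomputed and passed in; signs vanish modulo 2, so the −E and −T−1 steps become plain products and substitutions.

```python
import numpy as np

def solve_lower_unit(T, v):
    """Solve T z = v over F2 by substitution; T is lower-triangular with a
    unit diagonal, so this runs in time linear in the number of nonzeros."""
    z = np.zeros(len(v), dtype=int)
    for i in range(len(v)):
        z[i] = (v[i] + T[i, :i].dot(z[:i])) % 2
    return z

def ru_encode(A, B, T, C, D, E, phi_inv, u):
    """The Richardson-Urbanke encoding steps of Section 2.7.2, over F2.
    phi_inv is the precomputed inverse of phi = E T^{-1} B + D (mod 2)."""
    z1 = A.dot(u) % 2
    z2 = solve_lower_unit(T, z1)        # z2 = T^{-1} z1, back-substitution
    z4 = (E.dot(z2) + C.dot(u)) % 2     # z3 + C u, since -E z2 = E z2 mod 2
    xp1 = phi_inv.dot(z4) % 2           # dense g x g multiplication
    z6 = (z1 + B.dot(xp1)) % 2
    xp2 = solve_lower_unit(T, z6)       # xp2 = -T^{-1} z6 over F2
    return np.concatenate([u, xp1, xp2])
```

Substituting back into H x⊤ = 0 confirms correctness: the first block row gives A u + B xp1 + T xp2 = 0 by the definition of xp2, and the second gives C u + D xp1 + E xp2 = z4 + φ φ−1 z4 = 0 (mod 2).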
2.8 Analysis of Codes and Decoding on Graphs
In this section we explore some of the analytical tools that have been developed for LDPC
codes, while paying particular attention to finite length codes. Recent work in this area
has followed from the foundations laid by Gallager [5], Tanner [9] and Wiberg [15].
2.8.1 Short Cycles
In the example segment of a code factor graph shown in Figure 2.11, a cycle of length
four is highlighted. A general definition follows.
Figure 2.11: A cycle of length four.
Definition 2.8 (Cycle). A cycle of length d in the factor graph representation of a code,
is a connected set of d/2 variable nodes and d/2 constraint nodes. The set is connected
such that, for each node, a path of d edges exists that connects the node back to itself, in
which all nodes are visited without traversing an edge twice. We use the term d-cycle to
denote a cycle of length d. In a bipartite graph, d is always even.
Definition 2.9 (Girth). The girth of a graph is the length of its shortest cycle.
It is well known that if the factor graph of a code is cycle free then the sum-product
decoder will converge to the maximum-likelihood code sequence [15]. Cycles, especially
those of short length, allow the feedback and reinforcement of incorrect values throughout
the network. This behaviour causes the decoder to stray from the maximum-likelihood
solution. However, we seek only a final hard decision, and are less concerned with the
exact soft solution. Despite the presence of cycles in the factor graph, it is now widely ac-
cepted that iterative message-passing algorithms can make good decoders (e.g. see [14]).
Some detailed investigations regarding the effect that cycles have on the behaviour of the
decoder appear in [33, 41, 63–65].
In contrast to the above, it is well known that cycle-free graphs cannot support good
codes [66, 67]. However, it is still widely accepted that removing short cycles from a
graph will improve performance when using message-passing decoding. Some theoretical
support for the removal, or avoidance, of short cycles has recently been presented [68].
In particular when constructing the parity-check matrix, it is common to constrain the
corresponding factor graph such that it is 4-cycle free. This appears to be especially
important for high rate codes [40]. A mapping of the 4-cycle free graph constraint onto
the parity-check matrix follows.
Definition 2.10 (Overlap Constraint). Constraining the parity-check matrix H, such
that the maximum overlap between any two columns is one, prevents 4-cycles from ap-
pearing in the graph of H.
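The overlap constraint is easy to test in software: the (s, t) entry of H⊤H counts the checks shared by columns s and t, so a 4-cycle exists exactly when some off-diagonal entry exceeds one. A sketch:

```python
import numpy as np

def is_4cycle_free(H):
    """Test Definition 2.10: no two columns of H overlap in more than one row."""
    overlap = H.T.dot(H)          # overlap[s, t] = |rows shared by cols s, t|
    np.fill_diagonal(overlap, 0)  # ignore self-overlap (the column weight)
    return bool(overlap.max() <= 1)
```

For the (2,4)-regular example code of (2.8) this returns False: variables x3 and x7 both participate in checks one and three, forming a 4-cycle.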
By manipulating the adjacency matrix of a code’s factor graph, and in turn altering
H, McGowan and Williamson have presented an algorithm for detecting and removing
short cycles [69].
Definition 2.11 (Adjacency Matrix). The adjacency matrix of a graph, A, is a matrix
with each row and column labelled by the index of a vertex in the graph. If an edge exists
between two vertices vr and vs then the elements in A at positions ars and asr are one,
otherwise they are zero.
In this thesis we interpret the adjacency matrix over the reals. The adjacency matrix
for the factor graph of a code is related to its parity-check matrix as follows, where we
make the obvious mapping from F2 → R.
A =  [ 0    H ]
     [ H⊤   0 ]
The algorithm breaks cycles by rearranging the entries in H, without altering the
row and column weight distributions. The experimental results presented in [69] suggest
that removing short cycles becomes more important when operating at high SNR, and
that there is proportionally more to be gained from raising the girth from four to six,
than from six to eight.
Campello et al. have proposed a bit filling approach to constructing LDPC codes [70,
71]. The algorithm generates random codes, allowing girth to be specified as a design
parameter.
2.8.2 Stopping Sets and Extrinsic Message Degree
Consider the case when we are using LDPC codes over the binary erasure channel. Here,
we are able to analyse the behaviour of the decoder using stopping sets. Such analysis
was first introduced by Richardson et al. [61]. Using stopping sets and combinatorial
arguments, Di et al. have developed algorithms for calculating the exact average bit and
block erasure probabilities for regular ensembles of LDPC codes [24]. A stopping set on
the graph of a code is defined as follows.
Definition 2.12 (Stopping Set). A stopping set S is a set of variable nodes, such
that all checks connected to S are connected to S at least twice.
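The definition translates directly into a membership test. A sketch (our own, numpy assumed):

```python
import numpy as np

def is_stopping_set(H, S):
    """Def. 2.12: S (a set of variable-node indices) is a stopping set
    iff no check has exactly one edge into S."""
    H = np.asarray(H, dtype=int)
    edges_into_S = H[:, sorted(S)].sum(axis=1)   # per-check edge count into S
    return not np.any(edges_into_S == 1)
```

The empty set passes vacuously, consistent with the remark above that it is a stopping set.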
Let the set of erasures made by the channel be denoted E , and those remaining when
the decoder fails be denoted S. When the iterative decoder fails, i.e. arrives at state
S, further iterations will not shift it from this state. The set of erased bits remaining
at this point represents the maximal stopping set from E . We note that the empty set
is a stopping set, and is the maximal stopping set of E when the decoder succeeds. An
example of a stopping set of size five is highlighted in Figure 2.12.
Figure 2.12: A size five stopping set.
In an analysis aimed at lowering the error floor for irregular codes, Tian et al. have
related stopping sets to cycles, and to the linear dependence of columns in H [25]. In a
graph where every variable has degree two or more, a stopping set must contain at least
one cycle. They have introduced the following code design metric.
Definition 2.13 (Extrinsic Message Degree (EMD)). The extrinsic message degree
of a variable node set is the number of checks that have only one edge connection into the
set.
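EMD is equally direct to compute; note that a variable set is a stopping set (Def. 2.12) precisely when its EMD is zero. An illustrative sketch (ours, numpy assumed):

```python
import numpy as np

def extrinsic_message_degree(H, S):
    """Def. 2.13: the number of checks with exactly one edge into the
    variable-node set S."""
    H = np.asarray(H, dtype=int)
    edges_into_S = H[:, sorted(S)].sum(axis=1)   # per-check edge count into S
    return int(np.count_nonzero(edges_into_S == 1))
```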
They also present an efficient code construction algorithm which ensures that, for all
cycles up to a chosen length, the set of variables participating in such a cycle has some
minimum EMD. Their empirical results show that the error floor for irregular codes can
be significantly improved using this technique.
2.8.3 Graph Expansion
The above views imply that strong local connectivity in the factor graph can adversely
affect iterative decoder performance. It is therefore desirable that the graph have a good
expansion property. A set of nodes in a graph are said to expand well if they have a
large number of neighbours. More specifically, the expansion of a graph is characterised
as follows [17].
Definition 2.14 (Graph Expansion). Consider a graph with node set N, and let R
denote the set of nodes neighbouring a set S ⊆ N. Every set of at most p nodes expands
by a factor ζ, if |R| ≥ ζ|S| for any set S ⊆ N such that |S| ≤ p.
We label a graph in which all nodes have degree j, as j-regular. A common metric
for testing the expansion of a j-regular graph follows [72, 73].
Definition 2.15 (Normalised Spectral Gap). Consider a connected j-regular graph
GA, with a total of p nodes, having adjacency matrix A (Def. 2.11). As A is symmetric
it has real eigenvalues, which we order µ1 ≥ µ2 ≥ · · · ≥ µp. Since GA is connected and
regular, µ1 has multiplicity one and |µ1| = j. We define the normalised spectral gap of
GA as µδ ≜ (j − |µ2|)/j.
If the normalised spectral gap is large, then the graph is a good expander. It is
well known from the Ramanujan bound that |µ2| ≥ 2√(j − 1) in the limit of infinite
graph size [73]. Therefore, for large j-regular graphs, we expect that the best normalised
spectral gap attainable will be bounded by µδ ≤ (j − 2√(j − 1))/j. A graph which achieves
this bound is said to have optimal expansion. A randomly generated regular graph is
likely to be a good expander [74]. Explicit construction of codes using graphs with good
expansion has also been investigated [17, 67, 75–78].
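Definition 2.15 can be evaluated numerically. The sketch below (ours; numpy assumed) computes µδ for the complete graph K4, a connected 3-regular graph with eigenvalues {3, −1, −1, −1}:

```python
import numpy as np

def normalised_spectral_gap(A):
    """Def. 2.15: mu_delta = (j - |mu_2|)/j for a connected j-regular
    graph with adjacency matrix A."""
    eig = np.sort(np.linalg.eigvalsh(A))[::-1]   # mu_1 >= mu_2 >= ... >= mu_p
    j = eig[0]                                   # equals the degree for connected regular graphs
    return (j - abs(eig[1])) / j

# K4: every pair of the four vertices is joined by an edge.
K4 = np.ones((4, 4), dtype=int) - np.eye(4, dtype=int)
```

Here µδ = (3 − 1)/3 = 2/3; a small graph like K4 can exceed the asymptotic limit (3 − 2√2)/3 ≈ 0.057, which only binds as the graph size grows.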
2.8.4 Near-Codewords
MacKay and Postol introduced the concept of near-codewords in [79], as follows.
Definition 2.16 (Near-Codeword). A (w, v) near-codeword is a vector e with weight
w(e) = w, such that the syndrome of e, z(e) = He, has weight w(z(e)) = v.
A near-codeword e having low weight w and low syndrome weight v can lead to
convergence problems, and hence detected errors, for iterative decoding via belief propa-
gation on the AWGN channel [79]. We note the distinction between the decoding process
arriving at a near-codeword versus it arriving at a wrong codeword. The latter represents
an undetected error which may be attributed to a poor minimum distance property of
the code. In contrast near-codewords represent error states related to the structure of the
code’s factor graph, such as the connection of short cycles. If the decoder arrives at such
an error state, it is likely to remain there. The structure of a near-codeword is generally
different to that of a stopping set. A (w, v) near-codeword typically has v checks that are
connected into the set of w variables only once, giving the set an EMD of v, compared to
zero for a stopping set. However, this is not always the case, e.g. the stopping set shown
in Figure 2.12 also corresponds to a (5, 1) near-codeword.
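Computing the (w, v) parameters of a candidate vector is straightforward. A sketch (ours, numpy assumed):

```python
import numpy as np

def near_codeword_params(H, e):
    """Def. 2.16: return (w, v), where w is the weight of e and
    v is the weight of the syndrome z(e) = He over F2."""
    H = np.asarray(H, dtype=int)
    e = np.asarray(e, dtype=int)
    z = H @ e % 2                    # syndrome over F2
    return int(e.sum()), int(z.sum())
```

A codeword is the special case v = 0, while a single-one vector yields v equal to the weight of the corresponding column of H.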
2.8.5 The Computation Tree
As discussed in Section 2.5.1, the factor graph representation of a code provides a recipe
for constructing a message-passing decoder. We may consider any variable node as the
root of a tree which represents the unwrapped factor graph. This tree, which we call
the computation tree, was introduced by Wiberg [15] to model iterative message-passing
decoding.
Definition 2.17 (Computation Tree). The computation tree associated with a factor
graph G is a singly connected bipartite graph, rooted at some variable node vs, such that
s ∈ {1, . . . , n} for a code of length n. The computation tree of depth d is formed by
recursively unwrapping G, as follows.
1. Initialise the tree to be the variable root node vs, such that s ∈ {1, . . . , n}.
2. For each leaf node l in tier t of the tree, identify the corresponding node lG in G.
Label the parent node of l in the tree p. Let Γ (lG) \pG represent the set of nodes
adjacent to lG in G, excluding the node pG in G corresponding to the parent node p.
Create nodes at tier t + 1 of the tree, by attaching nodes representing each member
of this set to the leaf.
3. Repeat Step 2 a total of 2(d − 1) times.
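The steps above can be sketched directly. The toy implementation below (ours; the node-labelling convention is an assumption for illustration) records the tree's edges using the labels of the corresponding graph nodes, so repeated labels correspond to repeated instances in the tree:

```python
import numpy as np

def computation_tree_edges(H, root, depth):
    """Unwrap the factor graph of H (Def. 2.17) into a computation tree
    rooted at variable node `root`, to depth d = `depth`. Tree nodes are
    labelled by their graph counterparts ('v', s) or ('c', r)."""
    H = np.asarray(H, dtype=int)
    m, n = H.shape

    def neighbours(node):
        kind, idx = node
        if kind == 'v':
            return [('c', r) for r in range(m) if H[r, idx]]
        return [('v', s) for s in range(n) if H[idx, s]]

    edges = []
    frontier = [(('v', root), None)]          # (node, its tree parent)
    for _ in range(2 * (depth - 1)):          # Step 3: repeat 2(d - 1) times
        new_frontier = []
        for node, parent in frontier:
            for nb in neighbours(node):
                if nb != parent:              # Step 2: exclude the parent node
                    edges.append((node, nb))
                    new_frontier.append((nb, node))
        frontier = new_frontier
    return edges
```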
In general, each node in the code’s factor graph has multiple instances in the tree.
This view allows us to analyse iterative decoding on a graph that is free of cycles. Fig-
ure 2.13 shows an example computation tree corresponding to the factor graph shown in
Figure 2.3. In this case the root node is v1 and the tree has been expanded to depth
d = 3. Variable (bit) nodes are represented on the tree by circles and check (constraint)
nodes by boxes.
Figure 2.13: Computation tree example.
We consider the local parity-check constraints for each check node in the factor graph
to be mapped onto all corresponding check nodes in the computation tree. A configu-
ration, i.e. assignment ∈ {0, 1} to all variable nodes in the computation tree, is called
valid if all local parity-check constraints in the tree are satisfied. Valid configurations
on the computation tree represent pseudo-codewords. Analysis using pseudo-codewords
and pseudo-weight on the computation tree has been developed in [41, 42, 65]. We further
explore the concepts of pseudo-codewords and pseudo-weight in the following section.
2.8.6 Finite Graph-Covers
Explaining the behaviour of iterative decoding algorithms on the AWGN channel is a
challenging task. For the BEC case, stopping set analysis allows us to predict the final
state of the decoding process exactly, given the set of channel erasures and the code struc-
ture. The AWGN channel case is not as well understood. Here, error events can have
different characteristics. It is easy to see that poor minimum distance will lead to unde-
tected errors. However, accounting for the case when the decoder fails to converge, i.e. a
detected error, is more difficult. It has become common to employ stopping set analysis,
in conjunction with the argument that BEC performance is related to performance on
the AWGN channel, however this approach is only qualitative. Empirical analysis using
near-codewords appears more appropriate in the presence of AWGN, as it allows us to
account for problematic states during the decoding process. However, we do not expect
all near-codewords to cause problems. For example, when considering a (j, i)-regular
(n, k) code, a vector of length n containing a single one has a syndrome of weight j.
Thus it falls under the definition of a near-codeword. However, such a near-codeword is
present for every code, and we would not necessarily expect it to be problematic [27].
Kotter and Vontobel have recently introduced a new technique for analysing the
iterative decoding process for finite length codes [27, 28]. We now review their approach,
which is based upon finite graph-covers.
Let G denote the factor graph of a code C. We obtain an m-fold cover of G, denoted
G̃, by replicating each node in the graph m times, and then adding edges so that the
original local adjacency relationships are preserved (for an exact definition see [27]).
Figure 2.14 shows the graph, G, for the trivial code C = {(0, 0), (1, 1)}, and an example
2-fold (double) cover, G̃.
Figure 2.14: A simple graph G and a double cover: (a) the graph G of C; (b) an example double cover of G.
Note that G̃ is not unique. Moreover, the number of valid covers grows quickly
with m. We may consider the general case of an m-fold cover of G, by replicating each
node m times and then permuting edge connections, as shown in Figure 2.15.
Figure 2.15: An m-fold cover of G.

A locally operating iterative decoder cannot distinguish between G and G̃. Moreover,
G̃ is itself the factor graph for a code C̃, having length mn. Every codeword x ∈ C also
has a valid representation on G̃. We may obtain a valid x̃ by assigning the value of each
variable from x to each occurrence of it in the cover. We term this process lifting, and
use the symbol ˜ to distinguish objects relating to the cover from those in the underlying
graph. For the above double cover example C is lifted to {(0, 0, 0, 0), (1, 1, 1, 1)}. However,
there will be configurations on G̃, for which nodes corresponding to the same variable
in G assume different values. In the above example, G̃ also supports the configurations
x̃ = (1, 0, 0, 1) and x̃ = (0, 1, 1, 0). Hence, we require a means of characterising x̃ in
n-dimensional space. For this we have the following definition of a pseudo-codeword, in
the context of finite graph-covers [27].
Definition 2.18 (Pseudo-Codeword). Consider a codeword x̃ ∈ C̃ and let ωs represent
the fraction of times that variable vs ∈ G assumes the value 1 on G̃,

ωs(x̃) ≜ |{l : x̃s,l = 1}| / m,   where 0 ≤ ωs(x̃) ≤ 1.

The vector ω = ω(x̃) = (ω1(x̃), ω2(x̃), . . . , ωn(x̃)) is a pseudo-codeword of C.
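For a configuration on the cover listed copy-by-copy (an ordering we assume for illustration), ω is a simple per-variable average. A sketch (ours, numpy assumed):

```python
import numpy as np

def pseudo_codeword(x_cover, m):
    """Def. 2.18: given a valid configuration on an m-fold cover, listed
    as the m copies of variable 1, then the m copies of variable 2, and
    so on, return omega, whose entry s is the fraction of the copies of
    v_s assigned the value 1."""
    x_cover = np.asarray(x_cover, dtype=int).reshape(-1, m)
    return x_cover.sum(axis=1) / m
```

For the double-cover configuration x̃ = (1, 0, 0, 1) above, ω = (1/2, 1/2), even though (1/2, 1/2) corresponds to no codeword of C.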
Assume that a BPSK mapped all-zero vector, 1, is transmitted and that the vector
y is received across an AWGN channel. We define a point, ωBPSK , representing the
pseudo-codeword in n-dimensional signal space, as follows [15].
Definition 2.19 (Generalised BPSK Mapping). For a BPSK mapping M(0) =
+1, M(1) = −1, on an AWGN channel, the generalised BPSK mapping of the nonzero
pseudo-codeword ω is
ωBPSK ≜ 1 − 2 (∑s ωs / ∑s ωs²) ω
A decision boundary is formed in n-dimensional signal space, between 1 and ωBPSK .
The decision made by the maximum-likelihood decoder will be governed by which side
of this boundary y lies. While such analysis is, in general, not exact for the iterative
sum-product decoder, it can provide a very good prediction. The pseudo-weight of ω
determines the position of the decision boundary [15].
Definition 2.20 (Pseudo-Weight). The pseudo-weight of a nonzero pseudo-codeword
ω on the AWGN channel is
wp(ω) ≜ (∑s ωs)² / ∑s ωs²
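Pseudo-weight is a one-line computation. For a 0/1-valued ω (i.e. an ordinary codeword) it reduces to the Hamming weight, while the double-cover configuration above, with ω = (1/2, 1/2), has pseudo-weight 2. A sketch (ours, numpy assumed):

```python
import numpy as np

def pseudo_weight(omega):
    """Def. 2.20: AWGN pseudo-weight of a nonzero pseudo-codeword."""
    omega = np.asarray(omega, dtype=float)
    return omega.sum() ** 2 / (omega ** 2).sum()
```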
We note the importance of the minimum pseudo-weight, wp^min, which is the lowest
pseudo-weight of any valid pseudo-codeword. This property is analogous to the minimum
distance of the code, dmin. However, where dmin is a function purely of the code, wp^min
considers the factor graph structure and operation of the iterative decoding algorithm.
To analyse the effect of a valid configuration x̃ on the cover G̃, we consider length mn
vectors ỹ and λ̃, which represent lifted versions of the received vector and corresponding
log-likelihood vector. We assume transmission of the all-zero codeword 0, with lifted
representation 0̃. For BPSK mapping M(0) = +1, M(1) = −1 on the AWGN channel,
we have λ = 2y/σ². Hence the LLRs represent the received values, scaled by a positive
constant. For this case, Proposition 2.1 dictates the decision that will be made by the
decoder [27].
Proposition 2.1 (Kotter and Vontobel). The decoder makes a decision between the
all-zero word 0̃ and any other valid configuration x̃ on the graph-cover. Given the received
vector y ⇒ ỹ and corresponding LLR vector λ ⇒ λ̃, the decision made relates to the
pseudo-codeword ω = ω(x̃), as follows.

p(x̃ | λ̃) > p(0̃ | λ̃) ⇐⇒ ⟨ω(x̃), λ⟩ < ⟨ω(0̃), λ⟩ ⇐⇒ ⟨ω(x̃), y⟩ < 0

Here the vector inner product is represented by ⟨a, b⟩ = ∑s as bs.
Recall that xs ∈ {0, 1}, and that G indicates the validity of x. A similar means
of testing pseudo-codeword validity is provided by modifying the definition of the local
indicator functions in G. The new indicator functions allow variable assignments ωs ∈ [0, 1].
The code C contains a discrete set of 2k binary codewords. However, the set of valid
pseudo-codewords is dense in the fundamental polytope, which is the convex hull in Rn
defined by the set of pseudo-codeword indicator functions [28].
Definition 2.21 (Pseudo-Codeword Indicator Function). Consider a check cr, cor-
responding to row r of H. Denote the total set of variable indices, i.e. the columns of
H, as S = {1, . . . , n}. Let Sr denote the indices of all variables connected to cr, i.e.
Sr = {s ∈ S|[H]r,s = 1}, and let Sr\p denote the set excluding p. Given a vector ω,
assign the value ωs to each variable vs, for all s ∈ S. The local indicator function for cr
is,

Ir = 1 if ∑s∈Sr\p ωs ≥ ωp for all p ∈ Sr, and Ir = 0 otherwise.
The global indicator function is the product of all local indicator functions in the
graph. If it has value one for a given assignment ω, then ω is a valid pseudo-codeword.
We may use canonical completion to build a valid pseudo-codeword as follows [27].
Definition 2.22 (Canonical Completion). Consider a breadth-first spanning tree [80]
of the (j, i)-regular graph G, rooted at some arbitrary variable node vr. Construct the
pseudo-codeword ω, such that all variables vc at distance d edges from vr, are assigned
the value
ωc = 1 / (i − 1)^(d/2)
Canonical completion always generates a valid pseudo-codeword, the weight of which
provides an upper bound on wp^min. This simple technique demonstrates an important
fact. In the limit, as we increase the block length of a (j, i)-regular code to infinity,
wp^min vanishes. This stands in sharp contrast to the good minimum distance property
of randomly generated LDPC codes discussed in Section 2.3. By modifying the process
slightly such that it is rooted at a set of variables, we are able to quantitatively assess
the effect of near-codewords. This technique is employed in Section 4.4.4.
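Canonical completion is a breadth-first traversal. The sketch below (ours; it assumes a connected graph and takes the check degree i as a parameter) returns the resulting pseudo-codeword:

```python
from collections import deque

def canonical_completion(H, root, i):
    """Def. 2.22: breadth-first search from variable node `root` on the
    graph of H; a variable at edge-distance d from the root receives
    the value 1/(i - 1)^(d/2), where i is the check-node degree."""
    m, n = len(H), len(H[0])
    dist = {('v', root): 0}
    queue = deque([('v', root)])
    while queue:
        kind, idx = queue.popleft()
        if kind == 'v':
            nbrs = [('c', r) for r in range(m) if H[r][idx]]
        else:
            nbrs = [('v', s) for s in range(n) if H[idx][s]]
        for nb in nbrs:
            if nb not in dist:
                dist[nb] = dist[(kind, idx)] + 1
                queue.append(nb)
    # variable nodes sit at even edge-distance, so d/2 is an integer
    return [1.0 / (i - 1) ** (dist[('v', s)] // 2) for s in range(n)]
```

On a single check joining i = 3 variables, the root gets 1 and the other two variables (distance 2) get 1/2, giving pseudo-weight (1 + 1/2 + 1/2)² / (1 + 1/4 + 1/4) = 8/3.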
2.9 Developments in LDPC Code Design
Since the introduction of regular binary LDPC codes, several modifications have been
proposed which can offer improved performance. We now review these approaches, some
of which have been considered breakthroughs within the coding community.
2.9.1 Irregular Constructions
By relaxing the regularity design constraint and allowing irregular node degree sequences,
we can improve upon the performance of Gallager’s original LDPC constructions.
Luby et al. introduced a class of irregular codes which exhibit considerable improve-
ment in performance over regular constructions [21]. Some example irregular construc-
tions have also been investigated by MacKay et al. [32].
Richardson et al. have shown how to determine convergence thresholds [35], and
optimise irregular degree distributions [22], using density evolution. A simplified Gaus-
sian approximation to density evolution has been provided by Chung et al. [81]. The
performance of long codes constructed according to these degree sequences approaches
capacity. For example, a block length n = 10^7 irregular LDPC code can achieve
performance within 0.04 dB of the Shannon bound [23]. The technique is based upon the local
tree assumption, that the graph girth (Def. 2.9) will be large enough to sustain cycle free
local subgraphs during decoding. Hence the results degrade as block length is decreased.
Irregular codes also exhibit a higher error floor property than regular codes [32], and this
has been addressed by Tian et al. [25].
A method for selecting good degree distributions, using curve fitting on extrinsic
information transfer (EXIT) charts, has been presented by ten Brink and Kramer [82].
They also propose a technique for integrating the LDPC coding scheme with a modulator
and detector.
2.9.2 Non-Binary Construction
Davey and MacKay have generalised the LDPC coding scheme to build codes which use
symbols coming from finite fields of the form Fq, where q = 2^b for some b ∈ Z+ [83]. The
iterative decoder is appropriately modified to deal with messages of higher cardinality.
Codes built over F4, F8 and F16 exhibit significant empirical improvement over the
performance of binary codes.
They have also demonstrated that incorporating irregular construction into these
code designs leads to further performance improvements [84].
2.9.3 Algebraic LDPC Constructions
Algebraic code construction has several potential advantages over the random approach.
From a theoretical perspective, we can often say more about code properties, such as
minimum distance, for algebraic constructions. This stands in contrast to the fact that
most of the results for random codes pertain to ensemble averages. We can incorporate
algebraic conditions, e.g. on girth (Def. 2.9), into the design process. Moreover, we can
provide concise descriptions for algebraic codes. For example, an algebraic code can often
be described by a polynomial which defines H, rather than requiring a verbose description
of H. From an implementation perspective they offer deterministic placement of edges
in the factor graph representation of the code, thus reducing the complexity of circuit
routing.
As a pioneer in the area, Margulis presented explicit code designs, by construct-
ing Ramanujan graphs [8, 75]. Independently, Lubotzky et al. explored a similar ap-
proach [76]. Rosenthal and Vontobel have since extended this approach, to build LDPC
codes which have a good expansion property (Def. 2.14) [67, 78].
Lucas et al. [51] and Kou et al. [52] have built LDPC codes based upon finite
projective geometries. These codes have good minimum distance and have been shown to
outperform Gallager codes. However, they have girth limited to six. Vontobel and Tanner
have proposed alternate designs, based upon generalised quadrangles [85], to which this
constraint does not apply.
The above codes fall under the category of partial geometry constructions, for which
Johnson and Weller have recently presented some new results [86]. They have also in-
vestigated the combination of algebraic constructions with irregular graphs [55] and non-
binary fields [87].
For further information regarding the algebraic construction of sparse graph codes
see [67] and the references therein.
2.10 Summary
Coding theory has undergone a paradigm shift in recent years, with the introduction of
iterative decoding and codes on graphs. The low-density parity-check codes introduced
by Gallager offer good error correcting performance. Since their rediscovery, several mod-
ifications have been proposed which further improve their performance. There are now
many impressive empirical results in the literature, especially in the case of long codes.
However, there is still work to be done in the area of analysing iterative decoding, and
in designing good codes of short and medium block length. The finite graph-cover anal-
ysis discussed in this chapter represents a recent step toward the analytical tools coming
into line with empirical results. This analysis accounts for the structure of a particular
parity-check matrix and considers the operation of the iterative decoding process. In the
case of iterative decoding, it has been shown that the traditional metric of code minimum
distance is less significant than the minimum pseudo-weight associated with the parity-
check matrix. Perhaps the most exciting step is still yet to be made, i.e. the design of
finite length codes with high minimum pseudo-weight.
Several techniques have been suggested to improve the performance of iterative de-
coding when the factor graph of a code has cycles. We may either constrain (or transform)
the parity-check matrix so that it does not have short cycles, or modify the decoding al-
gorithm so that it accounts for the presence of short cycles. New tools for analysing
LDPC codes are likely to assist in the development of both approaches.
Encoding LDPC codes is, in general, not a trivial problem. Several approaches have
been suggested to date, either based upon specific code designs, or for the transformation
of an existing code. These approaches may be separated into two categories. Firstly, codes
may be designed (or transformed) to exploit back substitution through either a triangular
(or approximately triangular) structure of the parity-check matrix. Secondly, cyclic and
quasi-cyclic structures may be encoded using a simple shift register approach, in a similar
manner to that used to encode turbo codes. Both techniques are based upon a serial
encoding operation, with computational latency that scales linearly with code length. In
the following chapter we introduce a novel approach to iterative encoding which reuses
the decoder. Moreover, the proposed approach allows encoding to be performed using a
parallel architecture. For this new technique, the number of iterations required to encode,
and hence latency, is fixed as we scale code length.
There are several challenges that must be faced in order to build an LDPC decoder
circuit which satisfies the design criteria introduced in Section 1.2. In Chapters 6 and 7 of
this thesis we see how such challenges may be met through the recent suggestion of using
analog circuits for iterative decoding. Finally, we propose a novel codec architecture and
outline the advantages offered by this circuit in terms of the codec design criteria.
Chapter 3
Iterative Encoding of LDPC Codes
3.1 Introduction
Inspired by the principle of iterative decoding, in this chapter we investigate the use of
iterative techniques for encoding LDPC codes. A previous study of high rate LDPC codes
with short block length [40] has shown that randomly generated codes (decoded using a
practical soft decision sum-product decoder) outperform comparable Reed-Solomon (RS)
codes (decoded by a hard input decoder). RS codes possess however an advantage over
randomly constructed LDPC codes, in that an RS erasure decoder can also be used for
encoding [1].
Motivated by this idea of decoder re-use, and the potential to encode on the factor
graph [9], we develop a class of reversible LDPC codes. We aim not only to provide an
encoding technique which has time complexity that varies linearly with block length but
more specifically one which reuses the sum-product message-passing decoder architecture
described in Section 2.5.1.
By reusing the decoder architecture for encoding, both operations can be performed
by the same circuit on a time switched basis [88]. The utilisation of area within practical
LDPC decoder implementations to date has been limited by routing congestion [89].
The area required for wire routing in a dedicated LDPC encoder implementation is also
significant. Thus, by eliminating the need for a separate dedicated encoder we aim to
reduce the overall size of the circuit.
Another benefit of reusing the decoder for encoding, is that we reduce the number
of individual components that must be tested. Hence, the overall burden of system
verification is reduced. This is of particular interest for codec circuit implementation.
The output of the encoder is deterministically defined by the input information vector.
Therefore an error in routing, or similar, will be exposed quickly when testing the encoder.
In contrast, it is the nature of the decoder to correct errors, and testing the decoder via
circuit simulation can be an arduous task. Hence an implementation error, such as that
described in [90], can be hard to detect. By using the same circuit to perform both tasks,
the encoder implicitly provides a further level of verification for the decoder.
We present a parallel encoding algorithm, which employs an adaptation of the Jacobi
method for iterative matrix inversion. The standard Jacobi method, which operates over
R, is modified so that it instead operates over F2. We propose an algebraic construction
for reversible LDPC codes built from circulant matrices and show how the overlap con-
straint (Def. 2.10) may be viewed algebraically. Using these results we then develop an
algorithm for building iteratively encodable, 4-cycle free circulant matrices, suitable for
use in (3,6)-regular and high rate reversible codes. The codes have parity-check matrices
which combine circulant and random components, in a similar manner to that presented
by Bond et al. [91, 92]. However, we use different criteria for selecting the circulant
components, such that the codes are iteratively encodable.
As discussed in Section 2.7.1, we consider a binary systematic (n, k) code with code-
words arranged as row vectors x = [xp | xu], where xu are the information bits and xp are
the parity bits. Likewise the parity-check matrix is partitioned such that H = [Hp | Hu].
Thus x is a codeword iff [Hp | Hu][xp | xu]⊤ = 0, or equivalently Hp xp⊤ = Hu xu⊤. We
maintain our previous definition of b ≜ Hu xu⊤. Encoding then becomes equivalent to
solving

Hp xp⊤ = b. (3.1)

Therefore, for an m × m non-singular Hp, the parity bits satisfy xp⊤ = Hp⁻¹ b.
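As a baseline for the iterative methods that follow, (3.1) can always be solved for non-singular Hp by Gaussian elimination over F2. A sketch (ours, numpy assumed):

```python
import numpy as np

def solve_gf2(A, b):
    """Solve A x = b over F2 for square, non-singular A, using
    Gauss-Jordan elimination with XOR row operations."""
    A = np.array(A, dtype=int) % 2
    b = np.array(b, dtype=int) % 2
    m = len(b)
    for col in range(m):
        pivot = next(r for r in range(col, m) if A[r, col])  # exists: A non-singular
        A[[col, pivot]] = A[[pivot, col]]                    # bring pivot into place
        b[[col, pivot]] = b[[pivot, col]]
        for r in range(m):
            if r != col and A[r, col]:
                A[r] ^= A[col]                               # XOR = addition over F2
                b[r] ^= b[col]
    return b
```

Its cost is O(m³), and Hp⁻¹ is generally dense even when Hp is sparse, which is precisely the motivation for the iterative encoders developed in this chapter.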
We investigate iterative solution methods for (3.1) and the corresponding convergence
criteria and constraints imposed on Hp. The idea behind using the code constraints
to perform encoding on the graph was originally suggested by Tanner [9]. The work
presented here forms a link between this concept and classical iterative matrix inversion
techniques, allowing the design of good codes that encode quickly.
This chapter contains original work, which was presented in part at the 2002 IEEE
Globecom conference, in Taipei, Taiwan [93]. The approach and code design techniques
were respectively presented at the 2002 and 2004 Australian Communications Theory
Workshops [94, 95].
3.2 The Sum-Product Encoder
As discussed in Section 2.7.1, if Hp is upper triangular then encoding (i.e. solving (3.1))
may be performed in m steps by simply performing back substitution. This implies a
solution for each of the parity bits in a particular order. Upper triangular matrices are
therefore of interest. For any upper triangular A with elements from F2,
A non-singular ⇐⇒ diag A = I (3.2)
(since the diagonal elements are the eigenvalues, none of which may be 0). Let T be the
set of all binary non-singular m×m matrices that may be made upper triangular, using
only row and column permutations.
The message-passing erasure decoder, introduced as Algorithm 2.3 in Section 2.5.2,
can be used for encoding certain types of LDPC codes, as we shall now show.
Theorem 3.1. Let binary A ∈ T and b be given. Algorithm 2.3 solves Ax = b in at
most m iterations, without regard to the actual order of node updates.
Proof. Without loss of generality assume that A is upper triangular. Obtain bM ∈ {−1, +1}^m from b, using M(0) = +1, M(1) = −1. Recall that Algorithm 2.3 employs
real arithmetic, and returns a vector xM ∈ {−1, +1}^m. The solution, i.e. the binary form
of x, is obtained using the inverse mapping.

Construct the bipartite graph with variable nodes vs connected to checks cr according
to A. Also connected to each check cr is the additional variable node v′r. Initialise the
nodes (Step 1) v′r = M(br) ∈ {−1, +1} with values from bM, and set all vs = 0.

Call vs active if at least one λcr→vs ≠ 0. An active vs is correct if λcr→vs ∈ {sgn(M(xs)), 0} for all adjacent checks cr ∈ Γ(vs). For any correct vs, sgn(λvs→cr) ∈ {sgn(M(xs)), 0}. In the case that every node is either correct or not active, nodes can
only be made correct, left correct, or left inactive at the next Step 3, since each new
λcr→vs ∈ {M(xs), 0}.

After the first Step 3, v1 will be correct (since the only nonzero incoming message
will be M(b1)). Similarly, any other nodes activated will be correct.
Assume there is a set of correct nodes C, such that |C| ≥ 1, and that every node
v ∉ C is inactive. It remains only to show that at least one correct node is created at the
next Step 3. This is true since there will exist an integer r ≥ 1 such that v1, . . . , vr are
correct. At the next Step 3, vr+1 ∉ C will be correct since, by A triangular and (3.2),
cs ∈ Γ(vs) and cr ∉ Γ(vs) for r < s. Likewise, vr ∈ Γ(cr) and vs ∉ Γ(cr) for s > r. The
induction requires at most m steps before every node is correct.
Hence, if Hp ∈ T , we may perform encoding by applying Algorithm 2.3 to (3.1),
initialising the variables representing xu with ±1 and those representing xp with 0. The
idea of Theorem 3.1 is certainly not new, but we have not seen it made explicit.
The number of iterations required for convergence may be greatly reduced below the
upper bound of m for LDPC codes, as they are represented by sparse matrices. It is
possible to design Hp ∈ T using a tiered approach, similar to that described in [32]. In
this construction, the parity bits for one or more tiers will be evaluated at each iteration,
and therefore the total number of iterations may be set by the designer.
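The behaviour described by Theorem 3.1 is essentially erasure peeling: any check with exactly one unresolved neighbour forces that bit. The sketch below (ours) works directly on bit values rather than the ±1 messages of Algorithm 2.3:

```python
import numpy as np

def peel_solve(A, b):
    """Solve A x = b over F2 by repeatedly resolving any check (row)
    that has exactly one unknown variable. Succeeds whenever A is in T
    (cf. Theorem 3.1); returns None if the peeling stalls."""
    A = np.asarray(A, dtype=int) % 2
    b = np.asarray(b, dtype=int) % 2
    m = A.shape[0]
    x = np.full(m, -1)                                   # -1 marks an unknown bit
    while (x < 0).any():
        progress = False
        for r in range(m):
            unknown = [s for s in range(m) if A[r, s] and x[s] < 0]
            if len(unknown) == 1:
                known = sum(int(x[s]) for s in range(m) if A[r, s] and x[s] >= 0)
                x[unknown[0]] = (int(b[r]) + known) % 2  # check forces the bit
                progress = True
        if not progress:
            return None                                  # A not in T: decoder stalls
    return x
```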
The selection of Hu is always arbitrary with respect to the sum-product encodability
of H.
3.3 Encoding via Iterative Matrix Inversion
Having reduced the encoding problem statement to one of matrix inversion, it is natural
to wonder whether classical iterative matrix inversion techniques, such as those described
in [96], can be applied in the case that A ∉ T.
Suppose we wish to solve Ax = b. Split A according to A = S − T. We can then
write Sx = Tx + b, and try the iteration
Sxκ+1 = Txκ + b (3.3)
for some initial guess x0. In order to compute xκ+1 easily, S should be easily invert-
ible. The Gauss-Seidel method chooses S triangular, so for A ∈ T , we see that the
method of the previous section actually implements Gauss-Seidel (in this case simply
back-substitution). The classical Jacobi method for real matrices chooses S = diag A
and converges for any initial guess provided that the spectral radius, i.e. magnitude of
the largest eigenvalue, of the matrix S⁻¹T is less than 1. We will consider the use of this
method for F2 matrices, necessitating different convergence criteria.
In order for S to be invertible, all values along its diagonal must be nonzero. Over
F2 this implies that S = I and diag A = I. Hence (3.3) becomes
xκ+1 = (A + I)xκ + b (3.4)
Theorem 3.2. For arbitrary x0, the iteration (3.4) with all matrices and vectors over
F2, yields xκ′ = A⁻¹b for κ′ ≥ κ iff (A + I)^κ = 0.
Proof. Let the error term at iteration κ be e_κ = (x − x_κ). Subtracting x_{κ+1} = Tx_κ + b from x = Tx + b gives e_{κ+1} = Te_κ. So e_κ = T^κ e_0, where e_0 is the error of the initialisation x_0. Hence the error term vanishes for iterations κ′ ≥ κ if T^κ = 0. Conversely, if T^κ ≠ 0 for all κ, the algorithm will fail to converge universally, since the error will be zero only if e_0 is in the null space of T^κ, which cannot be guaranteed independently of the initial guess.
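Iteration (3.4) is easily checked numerically. The following is a minimal numpy sketch, with a hypothetical 3 × 3 example chosen so that (A + I)^κ = 0 (the matrix A and vector b are illustrative values, not taken from the thesis):

```python
import numpy as np

def jacobi_f2(A, b, max_iter=100):
    """Jacobi iteration (3.4) over F2: x <- (A + I)x + b, all arithmetic
    mod 2.  Requires diag(A) = I so that the splitting S = I is invertible."""
    n = A.shape[0]
    T = (A + np.eye(n, dtype=int)) % 2
    x = np.zeros(n, dtype=int)              # arbitrary initial guess x0
    for _ in range(max_iter):
        x_next = (T @ x + b) % 2
        if np.array_equal(x_next, x):       # fixed point: Ax = b over F2
            break
        x = x_next
    return x

# Upper triangular A with unit diagonal, so (A + I)^3 = 0:
A = np.array([[1, 1, 0],
              [0, 1, 1],
              [0, 0, 1]])
b = np.array([1, 0, 1])
x = jacobi_f2(A, b)
assert np.array_equal((A @ x) % 2, b)       # x solves Ax = b over F2
```

Here convergence is detected as a fixed point, since x = Tx + b over F2 implies (I + T)x = Ax = b.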
Denoting the eigenvalue of S^{−1}T with the largest magnitude as µ_1, the convergence of the Jacobi method over R is governed by the spectral radius |µ_1|. More precisely, the rate of convergence is asymptotic, becoming slower in the limit |µ_1| → 1⁻. In contrast,
we can say less about the rate of convergence for the Jacobi method over F2. The above
theorem implies only that the method either converges after a finite number of steps, or
that it never converges.
Based on Theorem 3.2, we can in principle construct codes that are iteratively en-
codable in κ iterations using (3.4) by selecting Hp such that
(Hp + I)^κ = 0    (3.5)
We label these codes according to the following definition.
Definition 3.1 (Reversible Code). A reversible code is iteratively encodable using the
Jacobi method over F2.
We also say that such codes are Jacobi encodable. It is interesting to note that the
codes with Hp ∈ T mentioned in the last section are also Jacobi encodable.
Theorem 3.3. Any code with upper triangular Hp is Jacobi encodable over F2.
Proof. Let T = Hp + I. Hence diag T = 0. Each successive power of T is therefore upper triangular, with the first nonzero entry of each row occurring at least one place further to the right. Thus T^κ = 0 for some κ.
We may view the Jacobi iteration as message-passing on a bipartite graph formed as
follows. Let variable node vs correspond to xs and let nodes v′r correspond to br. The vs
are connected to checks cr according to A and the v′r are connected to cr. This is the same
connection structure as required for sum-product decoding. The Jacobi message-passing schedule, for the binary mapping M(0) = +1, M(1) = −1, is defined as follows.

Algorithm 3.1: Message-Passing Jacobi Method Over F2
1. Initialisation: Set all v_s = +1 and v′_r = b_r.
2. Variable → Check: Send µ_{v_s→c} = v_s to all c ∈ Γ(v_s) \ c_s.
3. Check → Variable: From check c_r, send µ_{c_r→v_r} = ∏_{v_s ∈ Γ(c_r)\v_r} µ_{v_s→c_r} to v_r only. Let v_s = µ_{c_s→v_s}. Return to Step 2.
An example of how this algorithm operates on the graph is shown in Figure 3.1.
During each iteration variables may be updated in parallel. For clarity Figure 3.1 shows
only those messages used to update v2.
We note that Algorithm 3.1 has a strong resemblance to the sum-product decoder. The update process for µ_{c_r→v_r} in the Jacobi method is identical to that used in the update of λ_{c_r→v_r} in the sum-product case, so the decoder architecture may be reused. It is also worth noting that only one operation per node needs to be performed in each step of the Jacobi method, compared to one per connected edge for each of the nodes in the sum-product case.
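The schedule of Algorithm 3.1 can equivalently be run in the ±1 message domain, where the F2 sum at a check node becomes a product. A minimal sketch, using a hypothetical 3 × 3 example and a dense loop standing in for the sparse graph:

```python
import numpy as np

def jacobi_message_passing(A, b, iters):
    """Algorithm 3.1 sketch: the Jacobi iteration over F2 expressed in the
    +/-1 message domain, M(0) = +1, M(1) = -1.  Assumes diag(A) = I."""
    m = A.shape[0]
    v = np.ones(m, dtype=int)            # variable nodes, all M(0) = +1
    v_prime = 1 - 2 * b                  # M(b_r) for the known vector b
    for _ in range(iters):
        new_v = np.empty_like(v)
        for r in range(m):
            # product of messages into check c_r from all neighbours
            # except v_r itself, including the direct edge v'_r -> c_r
            nbrs = [s for s in range(m) if A[r, s] and s != r]
            new_v[r] = v_prime[r] * np.prod(v[nbrs]) if nbrs else v_prime[r]
        v = new_v
    return ((1 - v) // 2).astype(int)    # map back: +1 -> 0, -1 -> 1

A = np.array([[1, 1, 0],
              [0, 1, 1],
              [0, 0, 1]])
b = np.array([1, 0, 1])
x = jacobi_message_passing(A, b, iters=4)
assert np.array_equal((A @ x) % 2, b)
```

The sign product at each check is exactly the parity (XOR) of the excluded-edge messages, which is why the check-node hardware of a sum-product decoder can serve here unchanged.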
3.4 Reversible LDPC Codes
In the following sections we will demonstrate the use of the F2 Jacobi convergence rule to design 4-cycle free codes which are iteratively encodable in κ iterations of the Jacobi
method. We therefore seek a matrix Hp which satisfies (3.5). Once Hp has been obtained
we may then complete H by randomly generating Hu, whilst blocking the introduction
of 4-cycles.
In this chapter we focus on the algebraic construction of reversible LDPC codes by constructing Hp as an m × m circulant matrix (Def. 2.7). The first row of a circulant matrix is specified by the polynomial c(x), where the coefficient of x^{s−1} gives the entry in column s. The r-th row of the matrix is then specified by the polynomial p(x) = x^{r−1}c(x) mod (x^m + 1).
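The row-by-row construction just described can be sketched as follows; `circulant_f2` is a hypothetical helper name, and the example polynomial 1 + x + x³ with m = 8 is the one that appears later in this chapter:

```python
import numpy as np

def circulant_f2(exponents, m):
    """Build the m x m circulant over F2 whose first row is c(x);
    `exponents` lists the exponents of the nonzero terms of c(x)."""
    H = np.zeros((m, m), dtype=int)
    for r in range(m):
        for e in exponents:
            H[r, (r + e) % m] = 1     # row r is x^r * c(x) mod (x^m + 1)
    return H

# h(x) = 1 + x + x^3 with m = 8
Hp = circulant_f2([0, 1, 3], 8)
assert list(Hp[0]) == [1, 1, 0, 1, 0, 0, 0, 0]
assert list(Hp[5]) == [1, 0, 0, 0, 0, 1, 1, 0]   # rows wrap cyclically
```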
The algebra of circulant matrices over F2 is isomorphic to the algebra of polynomials modulo x^m + 1 having coefficients from F2 [54]. Hence by choosing Hp to be a circulant
matrix, we may define our constraints in terms of the algebraic manipulation of polynomials. To this end we let h(x) denote the first row polynomial of Hp and then map the iterative encodability constraint (3.5) into the polynomial domain as

(h(x) + 1)^κ ≡ 0 mod (x^m + 1)    (3.6)

[Figure 3.1: Jacobi algorithm as message-passing. Variable nodes v1–v4 and v′1–v′4 connect to checks c1–c4; for clarity, only the messages used to update v2 are shown.]
We refer to the number of nonzero terms in h(x) as the weight of the polynomial.
We will use j to denote both the weight of the polynomial h(x) and the column weight
of the corresponding circulant matrix.
3.4.1 Building Iteratively Encodable Circulants
We must first develop a technique for generating polynomials which satisfy (3.6). We will
then refine the approach by including the matrix overlap constraint to block the creation
of 4-cycles.
Consider an m × m circulant matrix Hp and let a_s denote the coefficient of x^s in h(x), where s ∈ {0, . . . , m − 1}. We now introduce some further constraints on the choice
of the first row polynomial h(x) in order to simplify the algebra involved in searching for
candidate polynomials.
Lemma 3.1. If p(x) is a polynomial with binary coefficients of the form

p(x) = a_0 + a_1 x + a_2 x^2 + · · · + a_{m−1} x^{m−1}

and κ = 2^y for y ∈ Z⁺, then

p^κ(x) = a_0 + a_1 x^κ + a_2 x^{2κ} + · · · + a_{m−1} x^{κ(m−1)}

Proof. F2 has characteristic two, hence

p^2(x) = (∑_{s=0}^{m−1} a_s x^s)^2 = ∑_{s=0}^{m−1} a_s^2 x^{2s}

All coefficients are binary, and hence a_s^κ = a_s for all s and κ. By recursively substituting p(x) = p^2(x), the lemma holds for all κ that are powers of 2.
Henceforth, we restrict κ = 2^y. By Lemma 3.1 the encodability constraint (3.6) reduces to

h^κ(x) ≡ 1 mod (x^m + 1)    (3.7)
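Constraint (3.7) can be checked numerically. A small sketch, using h(x) = 1 + x + x³ with m = 8 (a polynomial that reappears in Section 3.4.3); the helper names are hypothetical:

```python
import numpy as np

def poly_mul_f2(p, q, m):
    """Multiply two F2 polynomials (coefficient arrays, index = exponent),
    reducing modulo x^m + 1 via x^(m+t) == x^t."""
    prod = np.convolve(p, q) % 2
    out = np.zeros(m, dtype=int)
    for e, c in enumerate(prod):
        out[e % m] ^= int(c)
    return out

def poly_pow_f2(p, k, m):
    """Compute p(x)^k mod (x^m + 1) over F2 by repeated multiplication."""
    r = np.zeros(m, dtype=int)
    r[0] = 1
    for _ in range(k):
        r = poly_mul_f2(r, p, m)
    return r

m = 8
h = np.zeros(m, dtype=int)
h[[0, 1, 3]] = 1                               # h(x) = 1 + x + x^3
h4 = poly_pow_f2(h, 4, m)
assert list(h4) == [1, 0, 0, 0, 0, 0, 0, 0]    # h^4(x) == 1 mod (x^8 + 1)
```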
The implementation of (3.4) is simplified if we have a direct connection for feedback
between variable vr and check cr on the factor graph representation of Hp. For this
reason, we further assume h_0 = 1 and define g(x) = h(x) − 1, where g(x) represents the remaining terms of h(x) once the constant term has been removed. The encodability constraint on g(x)
becomes
g^κ(x) ≡ 0 mod (x^m + 1)    (3.8)
It is desirable to have κ as small as possible as it represents the number of iterations
that the encoder will take to converge.
As we are dealing with polynomials that have binary coefficients, g(x) must have an even weight in order to satisfy (3.8), implying an odd weight for h(x). To construct g(x) we must therefore select pairs of nonzero coefficients {a_p, a_s}, where p, s ∈ {1, . . . , m − 1} and p ≠ s, such that

a_p x^{κp} + a_s x^{κs} ≡ 0 mod (x^m + 1)    (3.9)
We now seek a means of grouping candidate terms for g(x) so that we may choose pairs which satisfy (3.9). To this end we constrain m to be a multiple of κ, i.e. m = βκ, where β ∈ Z⁺. We will now show how the terms a_s x^s of h(x) may be grouped into cosets, according to their exponents. Hence we now focus on the set S = {0, 1, . . . , m − 1} of valid choices for coefficient indices, and therefore exponents of x, in h(x). Note that S also represents the set of columns (indexed from zero) in the first row of the matrix Hp.
Lemma 3.2. If m = βκ then, under modulo β addition, the set S = {0, 1, . . . , m − 1} will be grouped into the cosets of {0, β, 2β, . . . , m − β}, each of order κ.

Proof. Modulo β addition will group S into β distinct equivalence classes. The equivalence class labelled zero will be {0, β, 2β, . . . , m − β}. The remaining β − 1 classes then form cosets of this, having the form {l, l + β, l + 2β, . . . , l + m − β}, where l ∈ {1, . . . , β − 1}. As m is a multiple of β, each coset must have the same size κ = m/β.
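The grouping of Lemma 3.2 is straightforward to sketch; the values m = 8, κ = 4 are illustrative:

```python
m, kappa = 8, 4
beta = m // kappa                        # beta = 2
cosets = {}
for s in range(m):                       # group S = {0, ..., m-1} mod beta
    cosets.setdefault(s % beta, []).append(s)

# beta cosets, each of order kappa: {0: [0, 2, 4, 6], 1: [1, 3, 5, 7]}
assert all(len(c) == kappa for c in cosets.values())
assert cosets[0] == [0, beta, 2 * beta, 3 * beta]
```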
The following theorem shows how it is possible to construct g(x) by applying Lemma 3.2
and choosing pairs of terms from within the same coset.
Theorem 3.4. Let κ = 2^y and m = βκ for y, β ∈ Z⁺. If p, s ∈ S are in the same coset as specified by Lemma 3.2, then x^{κp} + x^{κs} ≡ 0 mod (x^m + 1).

Proof. If p and s are in the same coset then s = p + δβ for some constant δ ∈ Z. Therefore

x^{κp} + x^{κs} = x^{κp} + x^{κ(p+δβ)} = x^{κp} + x^{κp} x^{κδβ} = x^{κp}(1 + x^{δm}),

where, in the last line, the substitution β = m/κ has been made. By Lemma 3.1 note that

1 + x^{δm} = (1 + x^m)^δ ≡ 0 mod (x^m + 1)

Hence, x^{κp} + x^{κs} ≡ 0 mod (x^m + 1).
By choosing p and s from the same coset and then setting ap = as = 1 we may select
pairs of terms to include in g(x) such that (3.8) is satisfied.
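The cancellation guaranteed by Theorem 3.4 reduces to a simple congruence on exponents, which the following tiny sketch checks for an illustrative pair (m = 8, κ = 4):

```python
# Theorem 3.4 sketch: with m = beta * kappa, two exponents p and s from the
# same coset satisfy kappa*p == kappa*s (mod m), so the terms x^(kappa*p)
# and x^(kappa*s) coincide after reduction mod (x^m + 1) and cancel over F2.
m, kappa = 8, 4
beta = m // kappa
p = 1
s = p + beta                 # same coset: s = p + delta*beta with delta = 1
assert (kappa * p) % m == (kappa * s) % m
```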
3.4.2 Enforcing the Overlap Constraint
So far we have a means of selecting candidate terms for h(x) such that it satisfies (3.6).
We now require a means of enforcing the overlap constraint, i.e. blocking 4-cycle creation
in the graphical representation of Hp, as we build h(x).
We must constrain Hp such that the maximum overlap between any two rows is
one. Given that the remaining rows of Hp are cyclic shifts of the first row, the separation
between nonzero elements in the first row is therefore of interest. We now define a test for
4-cycle presence using h(x). We let S denote possible choices for the position of nonzero
terms in h(x) and H ⊂ S contain the chosen positions for nonzero terms in h(x).
Lemma 3.3.¹ A circulant matrix Hp with nonzero elements specified by H = {p, q, s, w} will contain 4-cycles if and only if

(s − w) ≡ (p − q) mod m    (3.10)

for some p ≠ q, s ≠ w, p ≠ s and q ≠ w.
Proof. Due to the cyclic nature of Hp, if an overlap occurs it will be replicated through
the matrix and hence we need only to test the first row against all other rows. Assign the
index 0 to the first row of Hp. Row r ∈ {1 . . . m−1} is then a cyclic shift of row 0 modulo
m. In order for an overlap to occur between the first row and row r, it is sufficient that
(s + r) ≡ p mod m and (w + r) ≡ q mod m,
which, under modulo m addition, is equivalent to (3.10). Furthermore, in order for an
overlap to exist between two nonzero terms s and w it is necessary that
(s + r) ≡ w mod m and (w + r) ≡ s mod m,
which, under modulo m addition, reduces to the specific case of the initial statement
when p = w and q = s.
It is apparent from Lemma 3.3 that in order to detect the overlap between nonzero
terms p and q we must consider both the separation (p − q) mod m and also (q − p)
mod m. We therefore need a means of classifying the separation between two terms.
Definition 3.2 (Separation Class). Consider two pairs of nonzero terms at positions {p1, q1} and {p2, q2}, where p1, q1, p2, q2 ∈ S and p1 ≠ q1, p2 ≠ q2. These pairs are in
¹ While finalising this thesis we became aware of an independent discovery of Lemma 3.3 [97].
the same separation class, labelled by the unordered pair {t1, t2} ∈ S, if t1 ≡ (p1 − q1) ≡ (p2 − q2) mod m and t2 ≡ (q1 − p1) ≡ (q2 − p2) mod m.

The pair of elements {t1, t2} in a separation class label are additive inverses modulo m. Hence, for even m, let T be the set of distinct separation class labels {{1, m − 1}, {2, m − 2}, . . . , {m/2, m/2}}. Note that T does not contain {0, 0}, as p ≠ q by definition, and hence |T | = m/2. From this point forward, with some abuse of notation, we will use the term separation class to refer to a separation class label {t1, t2}.
Theorem 3.5.² Let H ⊂ S contain the chosen positions for nonzero terms in h(x). Let D be the list of all separation classes that may be formed by selecting pairs of elements from H; note that |D| = (|H| choose 2). Let Hp be the m × m circulant matrix corresponding to h(x), where m = 2y for y ∈ Z⁺ (i.e. m is even). The overlap constraint will be violated for Hp if {m/2, m/2} ∈ D, or if any member of T appears in D more than once.
Proof. If {m/2, m/2} ∈ D then there must be a pair of nonzero terms at positions p, q ∈ H, p ≠ q, such that (p − q) mod m = m/2 = (q − p) mod m. This violates the overlap constraint according to Lemma 3.3.
Assume that a member {t1, t2} appears twice in D, and hence |H| > 2. The elements in H appear without repetition. So, there must exist an assignment of labels p, q, s, w ∈ H to these elements, for p ≠ q, s ≠ w, p ≠ s and q ≠ w, such that

(p − q) ≡ (s − w) ≡ t1 mod m
(q − p) ≡ (w − s) ≡ t2 mod m
These equivalence relationships violate Lemma 3.3.
When building h(x), Theorem 3.5 may therefore be used to check for the introduction of 4-cycles. We maintain D based upon the contents of H, and then let C ⊂ T be the set of all separation classes that may be formed between a candidate nonzero element to be added to H and the current contents of H. Hence, if C ∩ D is not empty, then the introduction of the candidate element will violate the overlap constraint.
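The test of Theorem 3.5 can be sketched directly; `separation_classes` and `violates_overlap` are hypothetical helper names:

```python
def separation_classes(H, m):
    """All separation class labels {t1, t2} (Def. 3.2) formed by pairs
    drawn from the position set H, as a list (duplicates preserved)."""
    D = []
    positions = sorted(H)
    for i, p in enumerate(positions):
        for q in positions[i + 1:]:
            D.append(frozenset(((p - q) % m, (q - p) % m)))
    return D

def violates_overlap(H, m):
    """Theorem 3.5: 4-cycles exist iff {m/2, m/2} appears in D or any
    separation class label repeats."""
    D = separation_classes(H, m)
    return frozenset({m // 2}) in D or len(D) != len(set(D))

m = 8
assert not violates_overlap({0, 1, 3}, m)    # h(x) = 1 + x + x^3: 4-cycle free
assert violates_overlap({0, 1, 4, 5}, m)     # (1 - 0) == (5 - 4): 4-cycle
```

Note that a class with t1 = t2 = m/2 collapses to the single-element frozenset {m/2}, which is exactly the first violation condition.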
² While finalising this thesis we became aware of an independent discovery of Theorem 3.5 [97].
3.4.3 Building Reversible LDPC Codes from Circulants
Based upon the work presented in the previous sections, the following algorithm provides
a method for building circulant matrices suitable for use in reversible codes. We label
this subclass of reversible code as follows.
Definition 3.3 (Type-I Reversible Code). A type-I reversible code has Hp constructed
as a circulant matrix.
Algorithm 3.2: Construction of Hp for a Type-I Reversible Code
1. Initialise:
   Choose input parameters:
   j : column weight for Hp, such that j ≥ 3 and odd.
   κ : number of iterations required to encode, such that κ = 2^y for y ∈ Z⁺.
   m : size of Hp, such that m = βκ, where β ∈ Z⁺.
   Initialise output parameters: Let the output polynomial h(x) be represented by the set H ⊂ S, containing the chosen positions for nonzero terms in h(x), where S = {0, 1, . . . , m − 1}. Initialise H = {0} and h(x) = 1.
   Initialise internal parameter: Let D be the list of all separation classes that may be formed by selecting pairs of elements from H. Initialise D to be the empty set.
2. Generate Cosets: Group S into cosets of {0, β, 2β, . . . , m − β}.
3. Select Candidate Terms: Select a candidate pair of terms from a coset created in Step 2.
4. Test for Overlap: Let C be the set of all separation classes (Def. 3.2) that may be formed between the candidate terms to be added to H and the current contents of H. If C does not contain {m/2, m/2} and C ∩ D is empty, then add the candidate terms to H and h(x), and update D.
5. Stop/Continue: If |H| = j then exit declaring success. Otherwise, if all possible paired combinations from S have been tried then exit declaring failure. Otherwise return to Step 3.
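Algorithm 3.2 can be sketched in Python as follows. `build_hp` is a hypothetical helper name, and the candidate-pair ordering is one arbitrary choice among those the algorithm permits:

```python
import itertools

def build_hp(j, kappa, m):
    """Sketch of Algorithm 3.2: choose nonzero positions H for h(x) so
    that the m x m circulant Hp is 4-cycle free and Jacobi-encodable in
    kappa iterations.  Returns the sorted positions, or None on failure."""
    assert j >= 3 and j % 2 == 1 and m % kappa == 0
    beta = m // kappa

    def sep(p, q):
        # separation class label (Def. 3.2) for the pair {p, q}
        return frozenset(((p - q) % m, (q - p) % m))

    # Step 2: group S = {0, ..., m-1} into cosets under mod-beta addition
    cosets = [[l + i * beta for i in range(kappa)] for l in range(beta)]
    H, D = {0}, set()                        # h(x) initialised to 1
    for coset in cosets:                     # Steps 3-5
        for p, s in itertools.combinations(coset, 2):
            if len(H) == j:
                return sorted(H)
            if p in H or s in H:
                continue
            # Step 4: separation classes the candidate pair would introduce
            C = [sep(p, s)] + [sep(t, u) for t in (p, s) for u in H]
            if frozenset({m // 2}) in C or len(C) != len(set(C)) or set(C) & D:
                continue                     # overlap constraint violated
            H.update((p, s))
            D.update(C)
    return sorted(H) if len(H) == j else None

assert build_hp(j=3, kappa=4, m=8) == [0, 1, 3]    # h(x) = 1 + x + x^3
```

For j = 3, κ = 4, m = 8 this search recovers the polynomial 1 + x + x³ used in the example below.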
Theorem 3.6. If Algorithm 3.2 terminates with success, then the first row polynomial
corresponding to H will generate a 4-cycle free circulant matrix which is iteratively en-
codable in κ iterations.
Proof. In Step 3, candidate terms for H are selected from the cosets of {0, β, 2β, . . . , m − β}. Hence the iterative encodability constraint will always be enforced, according to Theorem 3.4, with κ iterations required for convergence. A candidate pair is rejected
if it would introduce an overlap in Step 4, according to Theorem 3.5. Hence, if the algorithm terminates successfully with |H| = j, then the polynomial represented by H will correspond to an iteratively encodable, 4-cycle free circulant matrix.
It is now natural to ask for the minimum number of encoder iterations κ for which we can build a polynomial of weight j.
Theorem 3.7. A circulant matrix Hp with odd column weight j, constructed using Algorithm 3.2, will require κ > j iterations to encode, where κ = 2^y for y ∈ Z⁺.
Proof. Consider a coset of {0, β, 2β, . . . , m − β} generated in Step 2 of Algorithm 3.2. Let the set of distinct separation classes (Def. 3.2) that can be formed by selecting pairs of elements from the coset, in Step 3, be E = {{β, (κ − 1)β}, {2β, (κ − 2)β}, . . . , {κβ/2, κβ/2}}. Once we have selected a pair of elements which form a separation class {t1, t2} ∈ E, we may not select another pair of elements from any coset which also form {t1, t2}. Doing so would violate the overlap constraint test in Step 4. We also note that choosing a pair of elements which form the separation class {κβ/2, κβ/2} = {m/2, m/2} will also violate this constraint. Hence the total number of separation classes that are available for selection without violating the overlap constraint is |E| − 1 = κ/2 − 1.
In Step 1 we initialise the coefficient of x0 in h(x) to be h0 = 1 and hence the initial
weight of h(x) is one. So, to create a matrix of odd weight j, we will need to add (j−1)/2
pairs of terms. Each pair of terms will form a separation class. Therefore, to create a
weight j matrix without violating the overlap constraint we will need at least (j − 1)/2
separation classes. Hence we require κ/2 − 1 ≥ (j − 1)/2 and thus κ > j.
We may use the above theorem to select an appropriate value for κ in the initialisation
step of Algorithm 3.2, based upon the target weight for h(x).
We now present a method for constructing (3,6)-regular, 4-cycle free, reversible codes.
Using Algorithm 3.2 we may generate a 4-cycle free iteratively encodable matrix Hp with
column weight j = 3. By Theorem 3.7 we therefore seek a circulant which is encodable
in κ = 4 iterations.
Setting m = 8 and κ = 4, Algorithm 3.2 returns the candidate polynomial h(x) = 1 + x + x^3. The circulant matrix Hp which corresponds to h(x) follows.
Hp =
    1 1 0 1 0 0 0 0
    0 1 1 0 1 0 0 0
    0 0 1 1 0 1 0 0
    0 0 0 1 1 0 1 0
    0 0 0 0 1 1 0 1
    1 0 0 0 0 1 1 0
    0 1 0 0 0 0 1 1
    1 0 1 0 0 0 0 1
We note that this polynomial has the form h(x) = 1 + x + x^{m/4+1}, and now show that this general form may be used to generate larger matrices.
Theorem 3.8. If Hp is a binary m × m circulant matrix, where m = 2^y for y ∈ Z⁺, y > 2, built from cyclic rotations of the first row polynomial h(x) = 1 + x + x^{m/4+1}, then Hp is a 4-cycle free, (3,3)-regular matrix which is iteratively invertible using 4 iterations of the Jacobi method over F2.
Proof. Given that the weight of h(x) is 3 and the transpose of a circulant matrix is also
circulant, it follows that Hp is (3,3)-regular.
By Lemma 3.1,

h^4(x) = x^{m+4} + x^4 + 1 ≡ 1 mod (x^m + 1)

Hence h(x) satisfies (3.7) for κ = 4, and Hp is iteratively invertible using 4 iterations of the Jacobi method over F2.
Let H correspond to the positions of nonzero terms in h(x). The set of separation classes that may be formed by selecting pairs of elements from H is

D = {{1, m − 1}, {m/4, 3m/4}, {m/4 + 1, 3m/4 − 1}}

The members of D are necessarily unique, and the set does not include {m/2, m/2}. Hence, by Theorem 3.5, Hp is 4-cycle free.
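Both claims of Theorem 3.8 can be checked numerically for the matrix sizes used later in Chapter 4. A sketch (`check_type_i` is a hypothetical helper name; the row-overlap test stands in for the separation-class argument):

```python
import numpy as np

def check_type_i(m):
    """Numerical check of Theorem 3.8 for h(x) = 1 + x + x^(m/4+1):
    (Hp + I)^4 = 0 over F2, and no two distinct rows of Hp overlap in
    more than one position (4-cycle freeness)."""
    exps = [0, 1, m // 4 + 1]
    Hp = np.zeros((m, m), dtype=int)
    for r in range(m):
        for e in exps:
            Hp[r, (r + e) % m] = 1            # cyclic rotations of h(x)
    T = (Hp + np.eye(m, dtype=int)) % 2
    T4 = np.linalg.matrix_power(T, 4) % 2
    assert not T4.any()                       # Jacobi-encodable in 4 iterations
    overlaps = Hp @ Hp.T                      # row-pair inner products
    np.fill_diagonal(overlaps, 0)
    assert overlaps.max() <= 1                # 4-cycle free

check_type_i(32)      # the Hp size used for the Rev64 code of Chapter 4
check_type_i(256)     # the Hp size used for the Rev512 code
```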
We complete H for a (3,6)-regular reversible code by randomly building a (3,3)-regular Hu, whilst blocking the introduction of 4-cycles.
3.5 Summary
We have investigated iterative approaches to encoding LDPC codes. In particular we
have presented a new encoding algorithm, based upon the Jacobi method for iterative
matrix inversion. The new algorithm allows the decoder architecture to be re-used for
encoding. Our investigations have also drawn a link between iterative encoding/decoding
and classical iterative matrix inversion techniques. Re-use of the decoder architecture has
the potential to reduce the size of the codec. In particular, by encoding on the factor
graph of the code, we are able to reuse the wire routing in place for the decoder. Routing
represents a significant amount of area in an LDPC decoder implementation. Thus being
able to reuse it for encoding is advantageous. Moreover, the proposed algorithm allows
parallel implementation, such that the number of iterations required to encode is fixed
by design. Hence the computational latency of the encoder remains constant as we scale
code length.
As discussed in Chapter 2, we may separate existing algorithms for the practical
encoding of LDPC codes into two categories. The first approach is to design (or trans-
form) codes such that they have a triangular (or approximate triangular) form, allowing
the encoder to employ back substitution. The second approach involves the use of cyclic
or quasi-cyclic code designs, and allows encoding to be performed using a shift register. Both of these approaches necessarily involve serial computation, implying a latency that scales linearly with code length. In contrast, the algorithm proposed in this chapter allows an architecture to be built whose computational latency is fixed as we scale code length.
All codes having parity-check matrices that are upper triangular are Jacobi encod-
able. However, this is a sufficient but not necessary condition for iterative encodability.
Furthermore, although the methods presented in this chapter focus upon the use of cir-
culant matrices, it is not a requirement that Hp be circulant. Hence, the iterative Jacobi
approach offers a structural alternative to the triangular and cyclic based approaches
discussed in Section 2.7.
We have shown how a class of reversible codes may be constructed that are Jacobi
encodable. Furthermore, the techniques developed for constructing these codes guarantee
that they are 4-cycle free. The iterative encodability constraint only applies to a section of
the parity-check matrix, thus providing flexibility for the design of code rate and length.
In the following two chapters we present the empirical performance and a thorough
analysis of some reversible codes. We first investigate (3,6)-regular codes, built using
the methods described above, and then explore alternative reversible code structures.
Following that, we present a novel codec circuit implementation, which extends the sum-
product decoder to include the iterative encoding algorithm.
Chapter 4
Performance Analysis
4.1 Introduction
Here we use the methods described in the previous chapter to build reversible LDPC
codes. We compare the performance of reversible LDPC codes to that of randomly
generated LDPC codes. Simulation results are provided for the additive white Gaussian
noise and binary erasure channels, as described in Section 1.1.5. An explanation of the observed behaviour is then provided using several analytical tools.
We study the reversible LDPC codes listed in Table 4.1. These rate 1/2, (3,6)-regular
codes are all encodable using four iterations of the Jacobi algorithm. The parity-check
matrix of each code has the form [Hp|Hu]. Here Hp is a circulant matrix with the first row
polynomial h(x) coming from Theorem 3.8. In each case Hu is a (3,3)-regular randomly
generated matrix, chosen such that the graph of H is 4-cycle free.
The set of randomly constructed LDPC codes listed in Table 4.2 are used as a bench-
mark for comparison. Each of these rate 1/2, (3,6)-regular codes has a corresponding
entry in Table 4.1 of approximately the same block length. Where possible the ran-
dom codes have been obtained from MacKay’s online archive [98], in order to provide a
standard point of reference. All of these codes are 4-cycle free.
This chapter contains original work, which was presented in part at the 2002 IEEE Globecom conference, in Taipei, Taiwan [93]. Some of the results were also presented at the 2002 Australian Communications Theory Workshop, Canberra, Australia [94].

Code Index   Block Length (n)   Rows in H (m)   h(x)
Rev64        64                 32              1 + x + x^9
Rev512       512                256             1 + x + x^65
Rev1024      1024               512             1 + x + x^129
Rev4096      4096               2048            1 + x + x^513

Table 4.1: Reversible LDPC codes.

Code Index   Block Length (n)   Rows in H (m)   Construction
Rand64       64                 32              MacKay 1A (see Figure 2.1(a))
Rand504      504                252             MacKay 252.252.3.252
Rand1008     1008               504             MacKay 504.504.3.504
Rand4000     4000               2000            MacKay 4000.2000.3.243

Table 4.2: Randomly constructed benchmark LDPC codes.
4.2 Performance on the AWGN Channel
In this section we empirically measure the performance of the codes over the AWGN
channel. The experiments were performed using the soft decision sum-product decoder
(Algorithm 2.2), for a maximum of 50 iterations.
We compare the performance of each reversible code to its corresponding random
benchmark code in the figures that follow. For the random codes, the performance
measured across the information bits matches that measured across the full codeword,
and hence it has not been plotted. However, this is not the case for the reversible codes.
Here we observe that, in general, the BER measured across the full codeword is higher
than that measured across the information bits alone. Hence the error rate curves are
shown for both the full codeword and information section. All simulation points shown
represent a minimum of 50 word errors, measured across the information section of the
codeword.
The reversible and random codes with length n = 64 exhibit similar performance, as
shown in Figure 4.1. The reversible code exhibits bit and word error rate floors at around
10−7 and 10−6 respectively. The BER and WER of the reversible code measured across
the full codeword matches that measured across the information bits alone. Figure 4.2
shows the performance of the Rev512 code matching that of the Rand504 code, until
flooring begins at bit and word error rates of around 10−5 and 10−3 respectively. As the
SNR is increased for the reversible code, the BER and WER measured across the full
codeword becomes higher than that measured across the information bits alone. Similar
behaviour is shown in Figure 4.3 for the Rev1024 code, in comparison to the Rand1008
code. Here the BER and WER of the reversible code begin to floor at around 10−5
and 10−2 respectively. This behaviour is also exhibited by the Rev4096 code, which is
compared to the Rand4000 code in Figure 4.4. Here we see the BER and WER of the
reversible code, measured across the information bits alone, begin to floor at around 10−5
and 10−2 respectively. Moreover, the BER and WER measured across the full codeword
are significantly higher, with the BER beginning to floor at around 10−3 and the WER
beginning to floor almost immediately.
Since collecting these results, we have discovered that the choice of the check output clipping parameter η, described in Section 2.5.1, can have a large effect upon the performance of the reversible codes. All results for the reversible codes were obtained using
η = 1 − 10−4. In Figure 4.5 we see how the BER and WER vary in relation to η. The
values plotted here for the Rev4096 code were taken at an SNR of 2dB, with a minimum
of 100 word errors per point, measured across the information section of the codeword.
Hence we may lower the floor observed in Figure 4.4, by instead selecting η = 1 − 10−2.
In contrast to the above, we have observed that applying a hard limit to the check
outputs can slightly increase the error rate for the random benchmark codes at high SNR.
However, the effect was much smaller than that observed for the reversible codes, and
was not visible for the length 4000 code. Hence, when decoding the random codes we
have chosen a high clipping limit, η = 1−10−10, in order not to distort their performance.
This limit is imposed purely to prevent numerical overflow.
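The precise definition of η is given in Section 2.5.1, which is outside this excerpt; the following is a purely illustrative sketch that assumes η clips the magnitude of the check-node output in the tanh domain, keeping the subsequent atanh finite:

```python
import numpy as np

def check_update_clipped(tanh_msgs, eta):
    """Hypothetical check-node update in the tanh domain with the output
    magnitude clipped to eta.  (Assumption: eta bounds |prod tanh| away
    from 1; the thesis's exact clipping rule may differ.)"""
    prod = float(np.prod(tanh_msgs))
    return float(np.clip(prod, -eta, eta))

eta = 1 - 1e-4
out = check_update_clipped(np.array([0.9999999, -0.9999999]), eta)
assert abs(out) <= eta
llr = 2 * np.arctanh(out)    # finite because |out| <= eta < 1
```

Under this reading, lowering η (e.g. to 1 − 10⁻²) caps the confidence any check node can express, which is consistent with the floor-shifting behaviour reported above.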
We now summarise the above results for the reversible codes. An analytical account
for the observations follows in Section 4.4.
Unequal Bit Error Protection We observe a much higher error rate for the parity bits
of the codeword than for the information bits. This indicates a potential weakness
for message-passing in the code subgraph corresponding to Hp.
Unequal Word Error Protection Although the unequal bit error protection is not
always detrimental to the information bit error rate of the code, it does increase
the word error rate. A word error is declared if the decoder runs for the full allowed
iteration count without converging, i.e. if the stopping criterion has not been met.
The WER measured across the full codeword is therefore higher than that measured
across the information bits alone.
Error Floor The reversible codes display an error floor which appears at a different
level for each of the codes. We note that the flooring becomes more significant as
the code length is increased.
Dependence upon Clipping The bit and word error rates appear to be dependent
upon the choice of the clipping parameter η. By lowering η we may lower the
information error rate floor by a greater margin than that for the full codeword. Again this indicates a weakness specific to the subgraph for Hp.

[Figure 4.1: Comparing the performance of reversible and random (3,6)-regular LDPC codes, with length n = 64, on the AWGN channel. (a) Bit error rate; (b) Word error rate.]

[Figure 4.2: Comparing the performance of reversible and random (3,6)-regular LDPC codes, with length n ≈ 500, on the AWGN channel. (a) Bit error rate; (b) Word error rate.]

[Figure 4.3: Comparing the performance of reversible and random (3,6)-regular LDPC codes, with length n ≈ 1000, on the AWGN channel. (a) Bit error rate; (b) Word error rate.]

[Figure 4.4: Comparing the performance of reversible and random (3,6)-regular LDPC codes, with length n ≈ 4000, on the AWGN channel. (a) Bit error rate; (b) Word error rate.]

[Figure 4.5: Shifting the error floor for Rev4096, at Eb/N0 = 2 dB, by varying η. (a) Bit error rate; (b) Word error rate, each plotted against − log10(1 − η).]
Dependence upon Block Length In general, the shorter reversible codes fare better against their respective benchmarks than the longer codes do.
No Undetected Errors Undetected errors were observed only in the case of the very
short codes, having block length n = 64. We did not observe undetected errors for
any of the other codes.
4.3 Performance on the Binary Erasure Channel
In the figures that follow, we compare the performance of each reversible code to its corre-
sponding random benchmark code, for transmission over the binary erasure channel. The
experiments were performed using the binary erasure decoder described in Section 2.5.2.
All simulation points shown represent a minimum of 50 word erasures, measured across
the information section of the codeword.
The random and reversible length 64 codes exhibit the same performance on the
BEC, as shown in Figure 4.6. The bit and word erasure rates of the reversible code,
measured across the full codeword, match those measured across the information bits
alone. Figure 4.7 shows that the bit erasure performance of the Rev512 code closely
matches that of the Rand504 code, above the channel bit erasure probability of ε ≈ 0.31.
Below this point, the bit erasure rate measured across the full codeword becomes higher
than that measured across the information bits alone. The word erasure rate measured
across the full codeword matches that measured across the information bits alone, and
exhibits a floor at around 10−5. Similar behaviour is shown in Figure 4.8 for the Rev1024
code, in comparison to the Rand1008 code. Here the bit erasure rate of the reversible code
matches that of the random code. For the Rev1024 code, the bit erasure rate measured
across the full codeword becomes higher than that measured across the information bits
alone, below ǫ ≈ 0.35. The word erasure rate of the reversible code begins to floor at
around 10−5. Performance of the Rev4096 code is compared to that of the Rand4000
code in Figure 4.9. The bit erasure rate measured across the information bits alone
begins to floor at around 10−8. The bit erasure rate measured across the full codeword
becomes significantly higher than this at ǫ ≈ 0.39, beginning to floor at around 10−7.
The word erasure rate measured across the full codeword matches that measured across
the information bits alone, and begins to floor at around 10−4.
A summary of results for the reversible codes follows, with an analytical account
provided in Section 4.4.
Unequal Bit Erasure Protection As observed for the case of the AWGN channel, the
information bits appear to be more heavily protected than the parity bits. This
inequality begins to surface as we decrease the channel erasure probability and is
most visible for the longer codes.
Equal Word Erasure Protection In contrast to the AWGN case, the word erasure
rate measured across the information bits matches that measured across the full
codeword. This indicates that on average, when the decoder fails to correct all
erasures, the set of remaining erasures contains both information and parity bits.
Erasure Floor We observe a floor on the decoded erasure rate, which again appears to
be more significant for the longer codes.
Dependence upon Block Length The comparative performance of the reversible codes,
against their respective benchmark codes, appears to deteriorate with increased
block length. This observation is common to both channels.
4.4 Analysis of Codes and Decoder Behaviour
In this section we employ recently developed theoretical tools to analyse the behaviour of
the reversible LDPC codes presented above. We recognise that we have an error control
system that represents the pairing of a code to a sub-optimal decoder. It is therefore
Figure 4.6: Comparing the performance of reversible and random (3,6)-regular LDPC codes, with length n = 64, on the binary erasure channel. (a) Bit erasure rate; (b) word erasure rate, each plotted against the channel bit erasure probability ǫ for Rev64 (information bits and full codeword) and Rand64.
Figure 4.7: Comparing the performance of reversible and random (3,6)-regular LDPC codes, with length n ≈ 500, on the binary erasure channel. (a) Bit erasure rate; (b) word erasure rate, each plotted against the channel bit erasure probability ǫ for Rev512 (information bits and full codeword) and Rand504.
Figure 4.8: Comparing the performance of reversible and random (3,6)-regular LDPC codes, with length n ≈ 1000, on the binary erasure channel. (a) Bit erasure rate; (b) word erasure rate, each plotted against the channel bit erasure probability ǫ for Rev1024 (information bits and full codeword) and Rand1008.
Figure 4.9: Comparing the performance of reversible and random (3,6)-regular LDPC codes, with length n ≈ 4000, on the binary erasure channel. (a) Bit erasure rate; (b) word erasure rate, each plotted against the channel bit erasure probability ǫ for Rev4096 (information bits and full codeword) and Rand4000.
important not only to analyse the code but also to consider how the structure of the code
is suited to the operation of the decoding algorithm. We note that the analytical tools
each have a different purpose. For example, poor minimum distance exposes a weakness
of the code without considering the decoder, stopping set analysis is more appropriately
applied to the BEC than the AWGN channel, and so on. For an introduction to the
analytical methods used in this section the reader is referred to Section 2.8.
4.4.1 Minimum Distance
We first explore bounds on the minimum distance for the above reversible codes. The
problem of evaluating the minimum distance of an LDPC code is intractable, except
for codes with very small block length [99]. Some analytical bounds on dmin are well
known [67, 100]. However, we show here that some very simple bounds on dmin are
sufficient to suggest a weakness for the larger circulant based reversible structures.
Recall that the minimum distance of a linear block code is equal to the weight of its
lowest weight nonzero codeword. For a small code we may search for such a codeword by
basing the search on low weight information vectors. We generate all possible information
vectors with weight w (xu) ≤ ν and then encode this set. We choose ν to be small enough
that the search is computationally feasible. Let de denote the weight of the member in
the encoded set which has the lowest weight. This value provides an upper bound on
dmin. If de ≤ ν, then this gives dmin = de. All codewords in a (3,6)-regular code have even
weight [5]. Hence, if ν is odd and de > ν then dmin ≥ ν + 1. Moreover, if de = ν + 1, and
ν is odd, then we may conclude that dmin = ν + 1.
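The search just described can be sketched as follows. The generator matrix below is an illustrative stand-in (a small systematic code), not one of the reversible codes; with ν = k the enumeration is exhaustive, so the value returned is the exact minimum distance.

```python
from itertools import combinations

def low_weight_search(G, nu):
    """Encode every nonzero information vector of weight <= nu over GF(2)
    and return d_e, the lowest codeword weight found.  d_e upper bounds
    d_min; if d_e <= nu the bound is tight."""
    k, n = len(G), len(G[0])
    d_e = n
    for w in range(1, nu + 1):
        for support in combinations(range(k), w):
            codeword = [0] * n
            for i in support:                # GF(2) encoding: XOR rows of G
                codeword = [a ^ b for a, b in zip(codeword, G[i])]
            d_e = min(d_e, sum(codeword))
    return d_e

# Illustrative systematic G = [I | P] for the (7,4) Hamming code
G = [[1, 0, 0, 0, 1, 1, 1],
     [0, 1, 0, 0, 1, 1, 0],
     [0, 0, 1, 0, 1, 0, 1],
     [0, 0, 0, 1, 0, 1, 1]]
d_e = low_weight_search(G, nu=4)   # nu = k makes the search exhaustive
```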
The Rev64 code is small enough that we are able to find its lowest weight codeword
using such a search. Table 4.3 shows a partial weight distribution, corresponding to a
subset of codewords generated from all nonzero information vectors of weight w (xu) ≤ 5.
We see that the code has dmin = 6. Using the same approach for the Rev512 code, this
time for w (xu) ≤ 3, yields the partial weight distribution given in Table 4.4. From this
partial distribution we can conclude that 4 ≤ dmin ≤ 14.
w (x)     6     8    10    12     14     16     18     20     22     24    26
Count     1   444  1260  5989  13293  32043  60560  68045  48590  11803   796

Table 4.3: Partial weight distribution of Rev64, for w (xu) ≤ 5.
As a result of Theorem 2.1, it is well known that preventing 4-cycles in a (j, i)-
regular LDPC code guarantees dmin ≥ j + 1. The above (3,6)-regular reversible codes
w (x)    14   16   18   20   22   24   26    28    30    32    34    36    38    40
Count     1    3    4    8   43   41   94   332   284   629   988  1757  3279  4923

Table 4.4: Partial weight distribution of Rev512, for w (xu) ≤ 3 (first 14 entries only).
are 4-cycle free by design, and hence have dmin ≥ 4.
Recall that the rows of the generator matrix, G, are themselves codewords. Hence,
an upper bound on minimum distance comes from the weight of the lowest weight row
in G. These bounds are summarised in Table 4.5.
Code Index   Lower   From           Upper   From
Rev64          6     search           6     search
Rev512         4     search          14     search
Rev1024        4     4-cycle free    16     min row weight in G
Rev4096        4     4-cycle free    16     min row weight in G

Table 4.5: Bounds on dmin for the reversible codes.
Finally, we provide an upper bound on the minimum distance of any (3,6)-regular
reversible code with a circulant Hp that is encodable in four iterations.
Theorem 4.1. Consider a (3,6)-regular reversible LDPC code, with parity-check matrix
H constructed such that Hp is an m × m circulant matrix with first row polynomial
h(x) = 1 + x^r + x^s. If the code is encodable using 4 iterations of the Jacobi algorithm then
its minimum distance is upper bounded by dmin ≤ 28, independently of the block length.
Proof. As Hp is encodable in 4 iterations we have Hp^4 = I ⇒ Hp^−1 = Hp^3. Expanding
h^3(x) gives the polynomial c(x) = 1 + x^r + x^s + x^(2r) + x^(2s) + x^(2r+s) + x^(r+2s) + x^(3r) + x^(3s)
mod (x^m + 1). Hp^−1 is therefore a circulant matrix having a maximum possible column
weight of 9, when no terms in c(x) cancel. Consider Hsys = [Im | P⊤] = Hp^−1 H. Given
that Hu is (3,3)-regular, the maximum weight of a column in P⊤ = Hp^−1 Hu is 27. Hence
the maximum row weight of G = [P | Ik], and thus an upper bound on dmin, is 28.
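The algebra in the proof is easy to check numerically by treating GF(2) polynomials as sets of exponents, so that multiplication modulo x^m + 1 reduces to symmetric differences. A sketch, using the Rev4096 circulant size m = 2048 with the Theorem 3.8 polynomial h(x) = 1 + x + x^(m/4+1), i.e. r = 1 and s = m/4 + 1; for these parameters h^4(x) reduces to 1 and c(x) has the full 9 terms.

```python
def poly_mult(a, b, m):
    """Multiply two GF(2) polynomials given as sets of exponents, modulo
    x^m + 1.  Symmetric difference implements coefficient addition mod 2."""
    out = set()
    for ea in a:
        for eb in b:
            out ^= {(ea + eb) % m}
    return out

m = 2048                       # circulant size of Hp for Rev4096
r, s = 1, m // 4 + 1           # h(x) = 1 + x^r + x^s from Theorem 3.8
h = {0, r, s}
h2 = poly_mult(h, h, m)
c = poly_mult(h2, h, m)        # c(x) = h^3(x): the first row of Hp^-1
h4 = poly_mult(h2, h2, m)      # should reduce to 1, confirming Hp^4 = I
d_min_bound = 3 * len(c) + 1   # column weight <= 9 in Hp^-1 gives <= 28
```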
From Theorem 4.1, reversible codes that are encodable in four iterations and built
using a circulant matrix for Hp have a minimum distance that does not scale well with
block length. However, this does not fully explain our experimental observations. We
expect a poor minimum distance to cause the decoder to make undetected errors, yet
none occurred for the reversible codes with n > 64. For the case of n = 64 they were
observed for the random code also, as is expected to be the case for such a short block
length. Instead, the observed performance floor appears to be related to problems with
decoder convergence. Minimum distance is a property of the code which is independent
of the chosen decoding algorithm. In the following analysis we consider the structure of
the reversible codes in relation to the operation of the decoder.
4.4.2 Stopping Sets and Extrinsic Message Degree
In Section 2.8.2 we introduced stopping set analysis, as an appropriate tool for investi-
gating the behaviour of message-passing decoding of LDPC codes over the binary erasure
channel [24]. Let the set of erasures made by the channel be denoted E , and those remain-
ing when the decoder fails be denoted S. Recall that S represents the maximal stopping
set from E . We now consider the simulation results for the Rev4096 code shown in Fig-
ure 4.9. We empirically observe the final state of the decoder for each word erasure event
in the result sample, when the channel bit erasure probability is ǫ = 0.375. Five different
stopping set configurations exist in the sample, each having a different size. Figure 4.10
shows how the occurrence of these sets is distributed over the fifty word erasure events
in the empirical sample space.
Figure 4.10: Occurrence of maximal stopping sets for Rev4096, when ǫ = 0.375 (count plotted against stopping set size |S|).
Table 4.6 lists each stopping set configuration observed for Rev4096 when ǫ = 0.375.
Each set of size |S| consists of |S| − 1 parity bits and only one information bit. This
structure is in agreement with the observed inequality in bit erasure protection. It is also
worth noting that it is the same information bit which appears in all five stopping set
structures.
Figure 4.11 shows a subset of the graph corresponding to Hp for the Rev4096 code.
|S|   Parity                                                                 Info
11    484, 486, 996, 997, 998, 999, 1508, 1509, 2020, 2021                   3245
12    484, 485, 486, 996, 997, 998, 999, 1508, 1509, 2020, 2021              3245
14    483, 485, 486, 995, 996, 997, 998, 999, 1507, 1509, 2019, 2020, 2021   3245
16    482, 484, 485, 486, 994, 995, 996, 997, 998, 999, 1506, 1509, 2018,    3245
      2019, 2021
17    480, 485, 486, 992, 993, 997, 998, 999, 1504, 1506, 1509, 2016, 2017,  3245
      2018, 2019, 2021

Table 4.6: Stopping set configurations.
Here we see that the cyclic nature of Hp leads to a lattice1 structure in the graph. Variable
(bit) nodes are represented on the graph by circles and check (constraint) nodes by boxes.
Each node is labelled by its index, and members of the size |S| = 11 stopping set are
highlighted. Checks 999 and 1509 have only a single connection into the parity bit subset
of this stopping set. Information bit 3245 is connected to checks 999, 1509 and 2019.
Note that this information bit does not appear in Figure 4.11, as it lies on the graph of
Hu. By considering the complete graph of H and connecting this information bit, we
close the stopping set. Similar structures appear on the lattice for all of the configurations
listed in Table 4.6. In each case, the information bit plays a crucial role in closing the
stopping set.
The lattice may be extended to completely represent the graph of Hp, as shown
in Figure 4.12. This general graph represents Hp for all codes listed in Table 4.1, by
assigning node labels such that β = m/4. Each row of the lattice contains four nodes,
and consists entirely of either variables or checks. There are a total of β rows of each
type.
The lattice structure contains small sets of variables which have a low extrinsic
message degree [25]. An example set shown in Figure 4.12 contains nine parity bits.
All checks connected to the set are connected twice, with the exception of the check
labelled 3β + 4. Hence the set has an extrinsic message degree (Def. 2.13) of one. The
repetitive form of the lattice causes this undesirable structure, and others like it, to be
replicated throughout the graph. These structures increase the probability of stopping
set creation and contribute to the observed floor in performance. However, as they have a
nonzero EMD they require the inclusion of one or more information bits for stopping set
closure. This accounts for the observation that the word erasure rate measured for the full
1 We use the term lattice here to describe the appearance of the graph with respect to the repeated pattern of node connectivity. The term is not used in accordance with its mathematical meaning, as this meaning does not make sense in the context of our discussion.
Figure 4.11: Parity bits in the |S| = 11 stopping set.
codeword matches that measured for the information bits alone. At high channel erasure
rates the maximal stopping set for a given set of erased bits consists of smaller component
stopping sets. The overlapping of such sets accounts for the observed symmetry across
the number of information and parity bits in the maximal stopping set. However, at low
channel erasure rates, the maximal stopping set is often also the minimal stopping set
(ignoring the empty set). In these cases, the heavy weighting of parity bits within the
set causes the observed bit erasure rate measured across the information bits to be less
than that measured across the full codeword.
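Both the extrinsic message degree used above and the stopping set property itself are simple neighbourhood counts on the graph. A minimal sketch, taking EMD as the number of check nodes singly connected to the variable set (consistent with the usage above), with an illustrative toy matrix rather than the lattice of Hp:

```python
def emd(H, var_set):
    """Extrinsic message degree of a set of variable nodes: the number of
    check nodes having exactly one edge into the set.  H is a list of
    check rows, each a set of variable indices."""
    return sum(1 for row in H if len(row & var_set) == 1)

def is_stopping_set(H, var_set):
    """A nonempty variable set is a stopping set when no check node is
    singly connected to it, i.e. when the count above is zero."""
    return bool(var_set) and emd(H, var_set) == 0

# Illustrative toy graph: three checks over five variables
H = [{0, 1, 2}, {1, 2, 3}, {0, 3, 4}]
closed = is_stopping_set(H, {1, 2})   # every touched check is hit twice
open_emd = emd(H, {0, 1})             # two checks see this set only once
```

A set with a small but nonzero EMD, like the nine-bit parity structure discussed above, is not itself a stopping set: the singly connected checks still pass messages in, so closure requires adjoining further erased bits.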
4.4.3 Cycles and Near-Codewords
We now turn our attention to the experiments performed in the presence of AWGN. In
this case we are interested in the behaviour of the message-passing sum-product decoder,
in particular at high SNR where the performance of the reversible codes begins to diverge
from that of the random codes.
In contrast to the equal word error protection that we observed on the BEC, here we
observe error events which only affect parity bits. We investigate results for the Rev4096
code at an SNR of 2dB, setting the decoder clipping parameter η = 1 − 10−4. There
Figure 4.12: Graph of Hp viewed as a lattice.
are a total of 931 word errors, although only 100 of these affect information bits. The
histograms in Figure 4.13 correspond to the observed error vector weight, i.e. a count of
the number of positions in which the transmitted and decoded vectors differ. We compare
the histogram for the full codeword to that taken across only the parity section of the
codeword.
The distribution of low weight error vectors is dominated by those which corrupt
only parity bits. In particular we see a large number of error vectors, e, with weight
w (e) = 4. A closer inspection of these vectors shows that they all conform to the pattern
e(x) = x^s + x^(β+s) + x^(2β+s) + x^(3β+s) mod (x^m + 1), (4.1)
where β = m/4 and s ∈ {0, . . . , m − 1}. Here, the algebraic representation e(x) of e has
Figure 4.13: Error vector weight histograms for Rev4096 at Eb/N0 = 2dB. (a) Full codeword; (b) parity section only.
the same form as that used to represent the first row vector of Hp in Section 3.4. This
error pattern corresponds to a row of variables in the lattice representation of Hp, as
shown in Figure 4.12. Although the lattice is 4-cycle free, it contains many 6-cycles2.
Each variable in a row of the lattice is connected to its neighbours on either side via a
6-cycle, while an 8-cycle connects all variables in the row, as shown in Figure 4.14.
Figure 4.14: Cycles in a variable row of the lattice for Hp.
As discussed in Section 2.8.4, MacKay and Postol have proposed that near-codewords
can cause convergence problems for the message-passing decoder [79]. Near-codewords
can arise from the connection of short cycles in the graph of H, and hence it is not
surprising that the error pattern e(x) corresponds to a near-codeword. More precisely, if
e is a vector with ones positioned according to e(x), having zero valued entries elsewhere,
then e is a (4, 4) near-codeword. In this case, the weight of the syndrome for e corresponds
2 In fact, any graph built from circulants having row weight i ≥ 3 has girth at most six [101].
to the extrinsic message degree of a row of variables in the lattice.
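The claim that e is a (4, 4) near-codeword can be checked with the same exponent-set arithmetic used for GF(2) polynomials earlier. The sketch below assumes the syndrome weight can be read from the product h(x)e(x) mod (x^m + 1) (the transpose convention in the circulant multiplication does not change the weight), with h(x) = 1 + x + x^(m/4+1) as in Theorem 3.8 and the Rev4096 parameters:

```python
def poly_mult(a, b, m):
    """GF(2) polynomial product modulo x^m + 1, on sets of exponents."""
    out = set()
    for ea in a:
        for eb in b:
            out ^= {(ea + eb) % m}
    return out

m, s = 2048, 0                   # Rev4096 circulant size; arbitrary shift s
beta = m // 4
e = {(t * beta + s) % m for t in range(4)}   # error pattern e(x) of (4.1)
h = {0, 1, beta + 1}                         # h(x) = 1 + x + x^(m/4+1)

row_shift = poly_mult(e, {beta}, m)  # e(x) * x^beta maps the row to itself
syndrome = poly_mult(h, e, m)        # syndrome polynomial of e
```

The weight-4 pattern therefore has a weight-4 syndrome, i.e. a (4, 4) near-codeword, and the shift-invariance shown on the second-last line is what replicates the pattern along every variable row of the lattice.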
A similar method may be used to account for other low weight error vectors which
dominate the distribution. In particular, the pattern of parity bits with an EMD of one,
shown in Figure 4.12, represents a (9, 1) near-codeword.
In order for these structures to cause a word erasure on the BEC, it is necessary to
include at least one information bit to close the stopping set. However, on the AWGN
channel the existence of these near-codewords alone appears sufficient to inhibit decoder
convergence and thus cause a detected word error. Hence we see separation of the infor-
mation section and full codeword word error rates on the AWGN channel, in contrast to
the BEC observations.
4.4.4 Finite Graph-Covers
Our empirical results suggest that the decoder arrives at near-codewords due to
convergence problems on the AWGN channel. However, we do not yet have a precise
explanation for this. To this end, we now extend our analysis by using the finite graph-
cover techniques of Kotter and Vontobel [27, 28], introduced in Section 2.8.6.
In order to simplify the analysis in this section, our discussion will assume the trans-
mission of the all-zero codeword. As suggested in [27], we may build a pseudo-codeword
ωe using a procedure similar to that of canonical completion (Def. 2.22), rooted at the
(4, 4) near-codeword e defined by (4.1). At this point we ignore Hu and operate only
on the graph corresponding to Hp. This process is illustrated in Figure 4.15, using the
general lattice for the reversible codes. Variable tiers of the lattice are labelled consec-
utively, starting at the root tier, t = 0, which corresponds to the nonzero elements of
e. The lattice is constructed to a depth t = β − 1, such that all parity variables appear
exactly once. Each variable vs on tier t is then assigned the value ωs = 1/2t. These
values are shown to the right of each variable. The assignment ensures that all local
indicator functions (Def. 2.21) are satisfied based upon the parity variables alone. Thus,
appending an all-zero information sequence generates a valid pseudo-codeword.
It is straightforward to show that the pseudo-weight (Def. 2.20) of the pseudo-codeword
ωe constructed using the above technique has the property that wp (ωe) < 12,
independently of the code length. A similar process can be used to find other pseudo-codewords,
e.g. by rooting the canonical completion at the (9, 1) near-codeword discussed
in the previous section. Furthermore, these low weight pseudo-codewords are replicated
through the structure of the lattice.
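The bound wp (ωe) < 12 follows from geometric sums over the tier values. Taking the AWGN pseudo-weight of Def. 2.20 to be wp (ω) = (Σ ωi)^2 / Σ ωi^2, four parity variables per tier at value 1/2^t, with an all-zero information section, give a value that increases towards 12 from below as β grows. A quick numerical check with a few illustrative β:

```python
def pseudo_weight(omega):
    """AWGN pseudo-weight: w_p(omega) = (sum of entries)^2 / (sum of squares)."""
    return sum(omega) ** 2 / sum(w * w for w in omega)

def lattice_pseudo_codeword(beta):
    """Canonical-completion values on the lattice of Hp: four parity
    variables on each tier t = 0, ..., beta - 1, each assigned 1/2^t.
    The all-zero information section contributes nothing to either sum."""
    return [1.0 / 2 ** t for t in range(beta) for _ in range(4)]

weights = [pseudo_weight(lattice_pseudo_codeword(b)) for b in (4, 8, 16)]
```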
Figure 4.15: Building a pseudo-codeword on the lattice for Hp. Variable tiers t = 0, 1, 2, . . . , β − 1 are assigned the values 1, 1/2, 1/4, . . . , 1/2^(β−1) respectively.
To build this example pseudo-codeword we have considered only the graph of Hp.
Hence ωe has an all-zero information section. We have found that the reversible codes
support other pseudo-codewords which have lower pseudo-weight than ωe, by considering
the full graph of H and allowing nonzero elements in the information section3.
The analysis in this section provides a quantitative account for the near-codewords
observed through empirical investigation. We have identified that the structure of Hp
admits pseudo-codewords with low pseudo-weight, thus leading to convergence problems
for iterative sum-product decoding in the case of the AWGN channel.
4.4.5 Graph Expansion
For both channel models, the reversible code performance degrades with respect to that
of the random codes, as the block length is increased. We now consider graph expansion
(Def. 2.14), and show how the normalised spectral gap (Def. 2.15) may be used to account
for this general observation.
In contrast to the good expansion of random graphs, the expansion of the 3-regular
3 Such a pseudo-codeword has been found for the Rev512 code, using a heuristic search performed by Pascal Vontobel.
graphs of Hp for the reversible codes is limited by the lattice structure. In Figure 4.16
we compare the normalised spectral gap for the graph of Hp, for each of the reversible
codes, to that of a randomly generated graph of the same size.
Figure 4.16: Comparison of expansion for random and reversible structures. Normalised spectral gap, µδ, plotted against matrix size, m, with the Ramanujan bound shown.
The random graphs exhibit a good expansion property, appearing to approach
the Ramanujan bound as the graphs grow larger. In contrast, the expansion metric for the
circulant based reversible structures vanishes as they get larger. We now show that µδ
can be obtained directly from a circulant Hp. This will allow us to investigate the spectral
properties of these matrices, without considering the adjacency matrix.
Lemma 4.1. Let C be an m × m circulant matrix, having eigenvalues µp ordered such
that |µ1| ≥ |µ2| ≥ · · · ≥ |µm|, with corresponding adjacency matrix

A = [ 0   C ]
    [ C⊤  0 ]

Denoting the (real) eigenvalues of A as αs, and ordering them α1 ≥ α2 ≥ · · · ≥ α2m, we
have µδ = (j − |α2|)/j = (j − |µ2|)/j.
Proof. The characteristic polynomial of A is |A − αI| = |α^2 I − C⊤C|. Therefore, if χ is
an eigenvalue of C⊤C then ±√χ are eigenvalues of A.
The singular values of C are the square roots of the eigenvalues of C⊤C. Moreover,
as C is circulant, its singular values correspond to the magnitudes of its eigenvalues.
If µ is an eigenvalue of C, then |µ| is a singular value of C, and |µ|^2 is an eigenvalue
of C⊤C. Hence ±|µ| are eigenvalues of A and the result follows.
Let Hp be a circulant m × m matrix with first row polynomial chosen according
to Theorem 3.8, such that h(x) = 1 + x + x^(m/4+1). If φγ = e^(iγ2π/m) is a complex mth
root of unity then µγ = h(φγ) is an eigenvalue of Hp [102]. We may therefore consider
each eigenvalue of Hp to be the sum of three unit vectors in the complex plane, such
that µ = 1∠0 + 1∠θ + 1∠(m/4 + 1)θ, where θ = γ2π/m. Figure 4.17(a) shows the
locus of the spectrum for the size m = 32 circulant Hp used in the Rev64 code. The
vector components of each eigenvalue are also shown. All eigenvalues are located in a
disc of radius 3, centred at the origin. Recall that µ1 = 3 + i0, and thus a small spectral
gap arises when an eigenvalue is positioned close to the edge of this disc. Loci of the
spectra for the other reversible codes are also shown in Figure 4.17, however the vector
components have been omitted to improve clarity.
We now focus on the set of eigenvalues for which the second and third vector com-
ponents have the same argument. This occurs when θ + 2sπ = (m/4 + 1)θ, and thus
θ = 8sπ/m, for s ∈ {1, . . . ,m/4}. Of particular interest is the eigenvalue in this set which
has the smallest nonzero argument. We label this eigenvalue and its complex conjugate
µ2 and µ2∗ respectively, as shown in Figure 4.17(a). There are m/4 eigenvalues in the
set, having uniform angular distribution about the point 1+ i0. Therefore, as we increase
the matrix size, m, the argument of µ2 reduces and |µ2| → 3. Hence, from Lemma 4.1,
the spectral gap of the graph vanishes.
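This argument is easy to verify numerically, since the eigenvalues of a circulant are its first row polynomial evaluated at the mth roots of unity. The sketch below computes µδ via Lemma 4.1 for the four reversible circulant sizes, assuming the Theorem 3.8 polynomial h(x) = 1 + x + x^(m/4+1) and j = 3; the computed gaps decrease monotonically with m, matching the vanishing expansion seen in Figure 4.16.

```python
import cmath

def normalised_spectral_gap(m, j=3):
    """mu_delta = (j - |mu_2|) / j for the circulant Hp with first row
    polynomial h(x) = 1 + x + x^(m/4+1): its eigenvalues are h evaluated
    at the m complex mth roots of unity (Lemma 4.1)."""
    mags = sorted(
        (abs(1 + cmath.exp(1j * t) + cmath.exp(1j * (m // 4 + 1) * t))
         for t in (2 * cmath.pi * g / m for g in range(m))),
        reverse=True)
    return (j - mags[1]) / j      # mags[0] is |mu_1| = j

gaps = [normalised_spectral_gap(m) for m in (32, 256, 512, 2048)]
```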
Finally, we note that the poor expansion of the graph may provide some insight
into the effect of changing the clipping parameter. Altering η appears to affect the
information error rate for the reversible structures much more than for the random codes.
In a poor expander, strong local misconception can quickly overpower the correct belief
in neighbouring nodes. Clipping prevents the magnitude of the check output decisions
from becoming too powerful too quickly. Hence for a bad expander this is likely to
be advantageous, up to a point where limiting valid check outputs begins to degrade
performance. From the results in Figure 4.5, this point appears to be η = 1 − 10−2 for
the Rev4096 code.
4.5 Summary
We have investigated, characterised and analysed the behaviour of some rate 1/2, (3,6)-
regular reversible codes, which employ a circulant matrix structure for Hp. Empirical
results, for both the AWGN and binary erasure channels, show that the performance of
Figure 4.17: Loci of the spectra for Hp of the reversible codes. (a) Rev64, m = 32 (with µ2 and µ2∗ marked); (b) Rev512, m = 256; (c) Rev1024, m = 512; (d) Rev4096, m = 2048.
this class of codes degrades as their block length is increased. A thorough analysis has
been provided, in which an appropriate analytical tool has been employed to account for
each experimental observation.
The results presented in this chapter become less encouraging as we consider longer
codes. However, there were no undetected errors observed, and the comparatively poorer
performance of the longer codes has been linked to instances of decoder convergence
failure. It may therefore be possible to modify the operation of the decoding algorithm
to improve performance. Some evidence of this is provided by the fact that modifying
the clipping parameter can shift the performance floor. Further investigation of how
we may attenuate [42], or reschedule [39], messages may be worthwhile. It may also be
worthwhile investigating the iterative reliability based approach suggested in [44]. On
the AWGN channel, the WER measured across the full codeword is higher than that
measured across the information bits alone. This is due to cases when only the parity
bits of the decoder output are corrupt. In such a case, a cyclic redundancy check applied
across the information bits may be employed. As described in Section 1.1.3, the CRC may
be used to flag the error free information vector and prevent it from being unnecessarily
discarded.
We attribute the decoder convergence problems to the fact that the circulant based
lattice structures of the reversible codes are not good expanders. In the case of the BEC it may be possible to
improve performance by a more careful selection of Hu, such that small stopping sets are
avoided. However, in the AWGN channel case, our finite graph-cover analysis highlights
that the weakness in Hp is sufficient to cause convergence problems, regardless of the
choice of Hu. The poor expansion property of the circulant structure admits low weight
pseudo-codewords. We note however, that it is not a necessary condition that Hp be
circulant in order for it to be iteratively encodable. The good expansion property of
random graphs motivates the search for iteratively encodable matrices which incorporate
some form of randomness. This approach is explored in the next chapter.
The reversible codes exhibit different characteristics depending upon the channel,
and corresponding decoder being employed. They therefore provide an interesting case
study and highlight some general points to be considered when analysing a class of LDPC
codes. We close this chapter by summarising these points, the first two of which reiterate
those presented in [79].
Simulate to a Low WER It is common practice to run simulations only to an infor-
mation BER of around 10−5. However, this may not be low enough to expose bad
properties of a code. When presenting earlier versions of this work [93], the error
floor effect was not discovered for this reason. By simulating down to a word error
rate of around 10−5, or lower, we are much more likely to uncover such a weakness.
Distinguish Between Error Types It is also common not to distinguish between de-
tected and undetected errors. However, such information can be very useful in
determining the potential cause for a performance issue. Undetected errors relate
to minimum distance, whereas detected errors relate to minimum pseudo-weight.
Good Properties Do Not Always Scale While the good properties of random codes
scale well, we cannot assume this to be the case when considering structured codes.
Not All Decoders are Equal The way in which we handle extreme messages in the de-
coder implementation can affect performance. This is evident in the above analysis,
from the choice of the clipping parameter. Furthermore, favourable implementation
settings are likely to be code dependent.
Choose the Most Appropriate Tool In recent times it has become common to apply
stopping set analysis on the BEC, and then suggest that the results carry through to
the AWGN channel. While there is some evidence that the two are related, stopping
sets do not capture the full picture for the AWGN case [27]. This is reflected in
the above analysis of the reversible codes. Stopping set closure shows a dependence
upon the inclusion of an information bit. However, decoder convergence problems
on the AWGN channel can arise which are dependent only upon the parity bits.
A quantitative account for the latter is provided by finite graph-cover analysis.
Furthermore, the absence of undetected errors tends to suggest that the potentially
poor minimum distance properties are much less significant than the poor minimum
pseudo-weight.
Chapter 5
Improved Reversible LDPC Codes
5.1 Introduction
The analysis presented in the previous chapter has exposed a weakness of using weight
j = 3 circulant matrices to construct reversible LDPC codes. The graphs corresponding
to such structures have a poor expansion property. This admits low weight pseudo-
codewords, and ultimately leads to decoder convergence problems. In this chapter we
design some new reversible codes which have a better expansion property. We draw
motivation from the good expansion of random graphs, and the recursive constructions
presented by Tanner [9].
We compare simulation performance of the new codes to that of random codes.
In addition to this benchmark we use the finite geometry codes of Kou et al. [52].
These codes were selected because they have a similar encoding complexity to reversible
codes, both in terms of computation time and architecture size. Furthermore, finite
geometry codes have been shown to outperform regular random codes. We compare the
new reversible codes against their respective benchmarks, in terms of both performance
and implementation complexity.
All results in this section assume transmission on the AWGN channel, with sum-
product decoding, for a maximum of 50 iterations. In all cases the decoder employs a high
clipping limit, η = 1 − 10^−10, imposed only to avoid numerical overflow. All simulation
points represent a minimum of 50 word errors, measured across the information section
of the codeword.
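As an illustrative sketch of how such a clipping limit acts on probability-domain messages (the helper name and the exact placement of the clip are our own assumptions, not the thesis decoder's implementation):

```python
def clip_prob(p, eta=1 - 1e-10):
    """Clip a probability into [1 - eta, eta] so that likelihood ratios
    formed from it remain finite, avoiding numerical overflow.
    With eta this close to one, decoder messages are barely perturbed."""
    return min(max(p, 1.0 - eta), eta)
```

A hard-decision probability of exactly 0 or 1 would otherwise produce an infinite log-likelihood ratio.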
This chapter contains original work, presented in part at the 2004 Australian Com-
munications Theory Workshop, Newcastle, Australia [95].
5.2 Simple Metrics for Implementation Complexity
Each nonzero element in H represents an edge in the factor graph representation of the
code, and thus a connection between nodes in the circuit of the corresponding sum-
product decoder. Hence, the density of H gives us some indication of implementation
complexity, in terms of routing.
A simple count of edges in the factor graph does not account for the complexity
of the nodes themselves. For example, a 6-edge check node has the same number of
edges as two 3-edge checks, yet it has a higher internal complexity. As discussed in
Section 2.5.1, we can construct a multiple edged node by using bi-directional soft-logic
gates (Def. 2.6) as building blocks. We may use a count of these blocks as a simple metric
for implementation complexity.
Definition 5.1 (Bi-directional Soft-Logic Gate Count (BSGC)). The bi-directional soft-logic gate count for a parity-check matrix H represents the total number of bi-directional soft-logic gates required to implement the sum-product decoder core corresponding to H.
By cascading two bi-directional soft-logic gates we can build a 4-port node, and so
on. Hence, the BSGC of a single i-edged check node is i − 2. We assume that each
j-edged variable node requires an extra edge for channel interfacing, and thus assign it a
BSGC of j − 1. The BSGC of a (j, i)-regular LDPC code is therefore n(2j(1 − 1/i) − 1).
As we focus upon analog implementations in this thesis, the complexity of check and
variable BSGs is assumed to be equal. This may not be the case for other implementa-
tions. The BSGC metric may be easily modified to account for such cases, by weighting
the counts of the different node types. Moreover, for this simple metric, we ignore any
possible optimisation of the nodes that may be achieved through partial reuse of inter-
mediate results. It is intended that the BSGC be used only as an approximate metric for
implementation complexity, rather than as a means of determining the transistor count
of an optimised decoder circuit.
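As a sketch, the BSGC can be computed directly from a 0/1 parity-check matrix (the function name is ours):

```python
import numpy as np

def bsgc(H):
    """Bi-directional soft-logic gate count (Def. 5.1) for a 0/1
    parity-check matrix H: an i-edged check node costs i - 2 gates,
    and a j-edged variable node costs j - 1, its extra port feeding
    the channel interface."""
    H = np.asarray(H)
    check_degrees = H.sum(axis=1)   # row weights i
    var_degrees = H.sum(axis=0)     # column weights j
    return int((check_degrees - 2).sum() + (var_degrees - 1).sum())
```

For a (j, i)-regular code this agrees with the closed form n(2j(1 − 1/i) − 1).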
5.3 High Rate Codes
In this section we consider the construction of Hp using a weight j = 5 circulant matrix.
We then use such a matrix to build high rate codes that are encodable in κ = 8 iterations
of the Jacobi method. Our motivation for selecting a higher column weight is twofold.
Firstly, when considering high rate binary LDPC codes, MacKay and Davey have
presented arguments against the use of column weight j = 3 [40]. We are considering
finite length codes, and the potential for weakness increases for high rate codes of short
block length. This is reflected by the performance of a randomly generated code, having
(n, k) = (1082, 826) and j = 3, shown in Figure 5.1. The code performs well at low SNR
but then exhibits an error floor. Furthermore, undetected errors were observed during
simulation at high SNR.
Using Algorithm 3.2, we can construct iteratively encodable circulants with weight
j = 5. Our second point of motivation for using such matrices is that they offer improved expansion. We have built an iteratively encodable circulant, having h(x) = 1 + x^21 + x^53 + x^119 + x^183 and size m = 256. This matrix has a normalised spectral gap µδ = 0.108, which
is just over 50% of the Ramanujan bound when j = 5. Although not optimal, it represents
a large improvement upon the j = 3 case (see Figure 4.16) where µδ is less than 2% of its
corresponding optimal value. We may complete H, to build an (n, k) = (1082, 826) code,
by appending columns of weight j = 5, whilst blocking the creation of 4-cycles. The
performance of this code is shown in Figure 5.1. We note that the BER taken across the
information section closely follows that taken across the full codeword. Also, there is only
a slight deviation between the WER taken across the information section and that taken
across the full codeword. Empirically we observe that the decoder is failing to converge in
the presence of some (5, 5) near-codewords. These near-codewords have nonzero elements
in the parity bits only, thus causing the slight imbalance in word error rate. There were
no undetected errors observed during simulation. The agreement between information
and full codeword error rates, and the absence of an error floor, stands in contrast to the
experimental results for the reversible codes of the previous chapter. We attribute this
to the improved expansion property of the graph corresponding to Hp.
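The circulant construction and 4-cycle check used above can be sketched as follows (the helper names are ours; the criterion that all pairwise exponent differences be distinct modulo m is the standard 4-cycle test for circulants):

```python
import numpy as np

def circulant(m, exponents):
    """m x m binary circulant whose first row polynomial has ones at the
    given exponents, e.g. [0, 21, 53, 119, 183] for
    h(x) = 1 + x^21 + x^53 + x^119 + x^183."""
    first = np.zeros(m, dtype=np.int64)
    first[list(exponents)] = 1
    return np.stack([np.roll(first, r) for r in range(m)])

def four_cycle_free(exponents, m):
    """A circulant is 4-cycle free iff all pairwise differences of its
    first-row exponents are distinct modulo m."""
    diffs = [(a - b) % m for a in exponents for b in exponents if a != b]
    return len(diffs) == len(set(diffs))
```

Applied to h(x) above, the exponent set {0, 21, 53, 119, 183} with m = 256 passes the test, so the circulant is 4-cycle free.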
It is a requirement that the circulant Hp have odd column weight in order to be
iteratively encodable. However, aside from enforcing the 4-cycle free constraint, the
choice of Hu is unconstrained. Hence we may choose Hp as above but select columns of
weight j = 4 for Hu. The performance improvement offered by this (n, k) = (1082, 826)
code with j = {5, 4} is shown in Figure 5.1.
As a benchmark for comparison to these rate r ≈ 0.763 reversible codes, we use
a type-I Euclidean-geometry (EG-LDPC) code of the same rate and approximately the
same length. The performance of this code (c.f. [52]) is also shown in Figure 5.1. A
comparative summary of performance and implementation cost, in terms of both the
[Figure: panels (a) Bit error rate and (b) Word error rate, plotted against Eb/N0 (dB) for Uncoded BPSK; EG-LDPC (1023,781), j=32; Rand (1082,826), j=3; Rev (1082,826), j=5 (Info and Full); and Rev (1082,826), j={5,4} (Info and Full).]
Figure 5.1: Comparing the performance of some r ≈ 0.763 LDPC codes.
Code type     (n, k)        Column weight (j)   Eb/N0 (dB) at BER = 10^−6   Edge count   BSGC
EG-LDPC       (1023, 781)   32                  3.63                        32736        62403
Reversible    (1082, 826)   5                   4.08                        5410         9226
Reversible    (1082, 826)   {5, 4}              3.89                        4584         7574
Table 5.1: Comparison of high rate code performance and complexity.
total number of edges in the code’s factor graph and the BSGC metric, is provided in
Table 5.1.
We see that the EG-LDPC code outperforms the j = {5, 4} reversible code by approximately 0.26dB at a BER of 10^−6. However, this comes with a considerable cost in
decoder complexity. The EG-LDPC code has redundant rows in its square parity-check
matrix. Every variable and check has degree 32, giving a much higher node implemen-
tation complexity than that of the reversible codes. Furthermore, as the reversible codes
have a much smaller edge count they offer a significant reduction in routing complexity.
5.4 Codes Built From Improved Expanders
The results of the last section suggest that we can improve graph expansion of type-I
reversible codes, and thus improve performance, by increasing the column weight of Hp.
Although suitable for building high rate codes, type-I reversible structures with col-
umn weight j > 3 are not as well suited to rate 1/2 applications. This is because
increasing the column weight of rate 1/2 codes, having (j,2j)-regular structures, shifts
their performance away from capacity [5, 35]. As the (3,6)-regular ensemble is the best
regular ensemble, we seek a means of building a 3-regular iteratively encodable Hp with
good expansion.
5.5 Recursive Construction
It is well known that random graphs exhibit a good expansion property. Here we aim to
break the lattice structure of the codes presented in the previous chapter, by incorporating
randomness into the graph. The challenge in doing this is that the constructions must
also satisfy the iterative encodability (3.5) and overlap (Def. 2.10) constraints. The
following algorithm uses a recursive approach [9] to generate Hp, while incorporating two
randomly generated components. We use the terminology in the following definition to
identify codes built using this technique.
Definition 5.2 (Type-II Reversible Code). A type-II reversible code has Hp recur-
sively constructed according to Algorithm 5.1.
Algorithm 5.1: Construction of Hp for a Type-II Reversible Code

Step 1 (Initialise): Choose the target size m for Hp. Select the template matrix size s ≪ m, and set the component matrix size p = m/s.

Step 2 (Create Template): Construct the s × s code template matrix

    T = [ I  N
          I  I ]

Select N to be a 4-cycle free, 2-regular, s/2 × s/2 matrix, having the property N^4 = 0. This matrix can be constructed as a circulant with first row polynomial n(x) = x + x^(s/8+1). It is easy to see that this polynomial generates N, by considering Theorem 3.8. Small circulants offer good expansion, and as s ≪ m, they make suitable template components. We note that T is 4-cycle free, since diag N = 0.

Step 3 (Generate Components): Choose two p × p component matrices as follows.

    P : random permutation matrix
    R : random 2-regular, 4-cycle free matrix

Step 4 (Apply Template): Let A ⊗ B represent the matrix Kronecker product of A and B. Setting K1 = N ⊗ P and K2 = I ⊗ R, insert the components into the template to create Hp, as follows.

    Hp = [ I   K1
           K2  I  ]
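A sketch of Algorithm 5.1 in code. For simplicity this draws R as a random two-term circulant (a restricted instance of the random 2-regular, 4-cycle free matrix the algorithm permits) and assumes m is a multiple of s with p = m/s ≥ 4:

```python
import numpy as np

def circulant(size, exps):
    """Binary circulant with first-row ones at the given exponents."""
    first = np.zeros(size, dtype=np.int64)
    first[list(exps)] = 1
    return np.stack([np.roll(first, r) for r in range(size)])

def type2_Hp(m, s=32, seed=0):
    """Construct Hp for a type-II reversible code (Algorithm 5.1)."""
    rng = np.random.default_rng(seed)
    p, half = m // s, s // 2
    # Step 2: N with n(x) = x + x^(s/8+1), so N^4 = 0 over GF(2)
    N = circulant(half, [1, s // 8 + 1])
    # Step 3: random components (P a permutation; R a two-term circulant
    # whose exponent offset avoids 0 and p/2, keeping it 4-cycle free)
    P = np.eye(p, dtype=np.int64)[rng.permutation(p)]
    a = int(rng.integers(p))
    b = (a + int(rng.integers(1, p // 2))) % p
    R = circulant(p, [a, b])
    # Step 4: apply the template
    K1 = np.kron(N, P)
    K2 = np.kron(np.eye(half, dtype=np.int64), R)
    I = np.eye(m // 2, dtype=np.int64)
    return np.block([[I, K1], [K2, I]])
```

With m = 256 and s = 32 this yields a 3-regular Hp for which (Hp + I)^8 vanishes over GF(2), as Theorem 5.1 requires.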
Theorem 5.1. Algorithm 5.1 generates a 3-regular, 4-cycle free matrix, which is itera-
tively encodable in κ = 8 iterations.
Proof. In Step 2 the template T is generated such that it is 4-cycle free. As Step 3 selects 4-cycle free components, it follows that Hp is also free of 4-cycles. The Kronecker products performed in Step 4 generate 2-regular K1 and K2, and hence Hp is 3-regular. We now show that Hp is iteratively encodable, by considering

    (Hp + I)^8 = [ (K1K2)^4      0
                        0    (K2K1)^4 ]

We note that the identity (A ⊗ B)(C ⊗ D) = AC ⊗ BD holds for appropriately dimensioned matrices. Thus (K1K2)^4 = (N ⊗ PR)^4 and (K2K1)^4 = (N ⊗ RP)^4. Step 2 selects
N^4 = 0, causing the result to vanish for both cases. Hence Hp satisfies (3.5) for κ = 8
iterations.
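The nilpotency established above is what terminates the encoder. A sketch of Jacobi encoding, assuming (3.5) amounts to (Hp + I)^κ = 0 over GF(2) and that Hp has a unit diagonal, as the proof suggests (the function is our own rendering, not the thesis encoder):

```python
import numpy as np

def jacobi_encode(Hp, s, kappa=8):
    """Solve Hp p = s over GF(2) by Jacobi iteration. With unit
    diagonal the update is p <- s + (Hp + I)p (mod 2); the error is
    multiplied by (Hp + I) each step, so nilpotency (Hp + I)^kappa = 0
    makes the iteration exact after kappa steps."""
    n = Hp.shape[0]
    A = (Hp + np.eye(n, dtype=np.int64)) % 2   # off-diagonal part of Hp
    p = np.zeros(n, dtype=np.int64)
    for _ in range(kappa):
        p = (s + A @ p) % 2
    return p
```

At the fixed point, p = s + (Hp + I)p over GF(2) rearranges to Hp p = s, so the returned vector is the parity section satisfying the check equations.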
We complete H by extending Hp with a randomly generated Hu, while blocking
the creation of 4-cycles. Using Algorithm 5.1, with s = 32, we have built example Hp
matrices of size m = 256, 512 and 2048. Figure 5.2 shows that these matrices offer much
better expansion than the circulant structures presented in the previous chapter.
[Figure: normalised spectral gap, µδ, plotted against matrix size m (200 to 2000) for Reversible (Circulant), Reversible (Recursive), and the Ramanujan Bound.]
Figure 5.2: Expansion of type-II reversible structures.
We have broken the lattice structure discussed in the previous chapter, thus reducing
the probability of creating small stopping sets. Hence we expect that reversible codes
built using the above matrices should also provide improved BEC performance. Further-
more, incorporating randomness into the structure should statistically improve minimum
distance. For example, the lowest weight row in the generator matrix for RevII512 has
weight 48, in contrast to the values presented in Table 4.5.
In contrast to the circulant designs presented in the previous chapter, the size of type-II reversible codes is not constrained such that m = 2^y, y ∈ Z+. As the choice of
component matrix size is arbitrary, the size of Hp is constrained only to be some multiple
of the template size. Recall also that the choice of the number of columns in Hu is
arbitrary. Hence type-II reversible codes can be designed for a very wide selection of rate
and block length.
5.6 Simulation Performance
Here we investigate the performance of some type-II reversible codes, built using the
three example Hp matrices of the previous section. In each case we have extended Hp
to build 4-cycle free codes, using a randomly generated Hu. There were no undetected
errors observed during any of the simulations presented in this section.
Figure 5.3 shows the AWGN performance of three (3,6)-regular type-II reversible
codes. Performance of the random codes listed in Table 4.2 is again provided as a bench-
mark.
The performance of the n = 512 and n = 1024 codes matches that of their respective
random benchmarks, with no evidence of a floor. In both cases the BER and WER
measured across the information section matches that measured across the full codeword.
The n = 4096 code initially compares well to its random benchmark but shows signs of
flooring at around WER = 10^−5. For this code we empirically note convergence failure
in the presence of some (14, 2) near-codewords. All codes perform significantly better
than those presented in the previous chapter.
Using the example Hp structure with size m = 512, we have also constructed two
example codes having (n, k) = (1280, 768) and rate r = 0.6. The performance of these
type-II reversible LDPC codes is shown in Figure 5.4. All columns in Hu for the code
labelled A have weight j = 4. The code labelled B has an Hu section with 512 columns
of weight j = 3 and 256 columns of weight j = 4. As a comparative benchmark, we
also show the performance of an extended Euclidean-geometry code (c.f. [52]) having the
same rate and approximately the same length.
All three codes exhibit similar performance. At low SNR the reversible code having
some weight j = 3 columns in Hu performs slightly better than the other reversible code.
However, this gap closes as the SNR increases. The parity-check matrix of an extended
EG-LDPC code has a much lower density than that of a type-I EG-LDPC code. In this
case, the benchmark code has columns of weight j = {3, 4} and row weight i = 8. As a
result, all three codes have approximately the same implementation complexity.
5.7 Summary
In this chapter we have demonstrated that reversible codes exist which do not exhibit con-
vergence problems. We have achieved this by targeting the weakness in graph expansion
[Figure: panels (a) Bit error rate and (b) Word error rate, plotted against Eb/N0 (dB) for Uncoded BPSK; RevII512, RevII1024 and RevII4096 (Info and Full); and Rand504, Rand1008 and Rand4000.]
Figure 5.3: Performance of type-II reversible (3,6)-regular LDPC codes.
[Figure: panels (a) Bit error rate and (b) Word error rate, plotted against Eb/N0 (dB) for Uncoded BPSK; Extended EG-LDPC (1275,765); and RevII (1280,768) codes A and B (Info and Full).]
Figure 5.4: Performance of rate r = 0.6 type-II reversible LDPC codes.
that was exposed in the previous chapter. Although the circulant design approach pre-
sented in Chapter 3 is not well suited to (3,6)-regular codes, it may be used to construct
high rate codes. We have presented a simple recursive algorithm for constructing (3,6)-
regular, type-II reversible codes. This reversible design employs random components,
and provides improved expansion to the graph of Hp.
Simulation results for the example type-II reversible codes compare well to random
and extended EG-LDPC benchmarks. The high rate circulant based reversible codes offer
a performance/complexity tradeoff in comparison to the type-I EG-LDPC structures.
The example length n = 512 and n = 1024 type-II reversible codes do not show
signs of a performance floor. However, the expansion and performance of the n = 4096
code are not as good. This relationship again motivates the use of the expansion of Hp as a metric
for the expected performance of reversible codes. From Figure 5.2, we note that the
Hp component of the RevII512 code has optimal expansion. However, the expansion
metric value decreases as the size of Hp is increased. By exploring alternative recursive
approaches, it may be possible to improve the performance of the longer codes.
Recursive constructions offer a wide range of choices for the size of Hp. We also have
an arbitrary choice for the number of columns in Hu. Hence, type-II reversible codes
may be designed with a high degree of freedom for rate and block length.
Chapter 6
Analog Decoding
6.1 Introduction
Many high performance channel codes, such as LDPC codes, may be represented using
the factor graph structure presented in Section 2.4. Iterative decoding algorithms for
these codes, such as the sum-product algorithm presented in Section 2.5.1, are then
viewed as message-passing on the graph. The highly parallel structure of the factor
graph representation of LDPC codes offers the potential for very high throughput decoder
architectures to be built.
The standard approach to decoder implementation has been to design parallel [89],
or partly parallel [60, 103], digital circuits which map the structure of the code’s factor
graph. Each node on the factor graph can be considered as a small processor for message-
passing decoding. Messages are passed around the decoder circuit in a quantised form.
Analog implementations of the Viterbi algorithm have been used since the late
1970s [104], with recent application to magnetic disk drive channels [105]. Following
on from this has been the suggestion that analog VLSI circuits be used to implement
iterative soft decision decoding on factor graphs [106–108]. Since then a research community has developed around this alternative approach. The principles that underlie analog decoding are the same as those used to build analog circuits for some
artificial intelligence applications [109].
An analog decoder circuit maps the constraints of the code, in a similar manner to a
digital decoder. In contrast to the digital decoder, the analog decoder operates by passing
unquantised messages in continuous time, as it settles to the steady state output1. The
1 An exception to the generalised digital decoder description is the recently proposed stochastic decoder [110, 111]. This algorithm operates by passing randomised digital messages synchronously or asynchronously.
designer may choose to consider the messages being passed between processing blocks,
as either currents or voltages. Hence, the designs are often referred to as being either
current-mode or voltage-mode respectively. Of course, current and voltage are related
and there is no fundamental difference between the two views [112]. We use the following
definition to determine the mode of a processing block [3].
Definition 6.1 (Current/Voltage-Mode). A processing block with low impedance in-
puts is a current-mode block, otherwise it is a voltage-mode block.
6.2 Potential Advantages
In this thesis we focus upon the analog implementation of an LDPC decoder. We now
list some of the potential advantages that an analog decoder presents over a digital implementation. Some of these advantages are particularly relevant to LDPC
decoder implementation.
Preservation of Soft Information A digital circuit must sample the received vector
and pass quantised message values around the circuit. Analog circuits internally
pass messages as real values without quantisation.
Fast Decoding Digital decoders operate by passing messages in discrete time iterations
around the circuit. An analog decoder operates in continuous time, as it settles to
a steady state output2. During the decoding process, voltage variations throughout
the decoder become progressively smaller, thus leading to fast convergence [113].
Low Power The analog decoder circuit is only required to settle to a steady state once it
is switched. This offers a potential power saving when compared to digital circuits,
which must operate for several iterations. Digital power consumption is propor-
tional to the switching (clock) rate, whereas analog decoders are only switched
once per block. Analog decoders can outperform digital decoders by two orders of
magnitude in terms of power consumption [106, 114]. The largest fabricated analog
decoder to date is the product decoder presented in [115]. This analog decoder
consumes approximately 2% of the power consumed by the digital LDPC decoder
presented in [89].
2 As is the case for a digital implementation, it is also possible for the analog decoding process to oscillate [3].
Small Area Analog decoders have the potential to be significantly smaller than digital
decoders [106, 114]. An analog message value may be represented on a single wire,
compared to q wires required for q-bit quantisation in the digital case. This reduces
the number of wires required for routing and thus saves circuit space. This saving is
of particular relevance to LDPC decoders, for which the size of a digital implemen-
tation is determined by routing congestion rather than the transistor count [89].
After accounting for differences in code length and process technology, the analog
decoder presented in [115] requires approximately one quarter of the area of the digital
decoder presented in [89].
Programmability The ability to represent real valued messages on a single wire allows
the design of efficient reprogrammable message-passing networks [116, 117]. This
offers the potential for the code structure to be dynamically loaded in a boot stage
prior to decoding, rather than being fixed at fabrication.
Elimination of Analog to Digital Converter A digital implementation requires a
separate circuit to convert the received analog matched-filter samples into digital
form before decoding. However, the analog decoder is effectively a smart analog to
digital (A/D) converter [118]. During the decoding process it converts analog sam-
ples directly to digital information bits. Eliminating the need for an A/D converter
stage provides a significant reduction in power consumption and circuit area.
Suitability to System-on-a-Chip Implementations Analog decoders are well suited
to sharing a single chip with other system components without causing interference,
as they have no high-frequency switching components. Furthermore, the large par-
allel decoding approach makes the analog decoder an effective low-pass filter, so
that it is robust against interference from other digital circuit components. Hence
analog decoders provide excellent radio frequency compatibility. The low voltage
analog designs presented in [118, 119] have very flexible supply voltages, and do not
require regulated reference voltages. This makes them particularly attractive for
system-on-a-chip (SOC) applications.
6.3 Existing Work and Remaining Challenges
So far the analog approach has been used to design trellis based BCJR-style decoders [114,
120, 121], turbo decoders [117, 122, 123], decoders suitable for LDPC codes [3, 124], min-
sum decoders [125], and a decoder for block product codes [115, 126]. Several of these
designs have been successfully fabricated. Most of these decoders are proof-of-concept
designs having very small block length (<50 bits), with notable exceptions being [115,
123].
Work is ongoing into the analysis of analog iterative decoding circuits. Distortion in
analog computation can arise through mismatch between transistors in the circuit, and
a theoretical model for this effect is provided in [127]. For the small analog decoders
built to date, empirical evidence suggests that mismatch effects are not of significant
concern [124, 128, 129]. A recent density evolution based model for the effects of mismatch
in larger decoders [130] indicates that a mismatch of up to 20-25% may be tolerated.
Device mismatch may be easily set below this point by slightly increasing transistor size
beyond minimum dimensions.
Message-passing in a digital decoder reflects the discrete time schedule exactly. How-
ever, operation of the continuous time analog circuit is not as well understood. A recent
model has been proposed which approximates message-passing in the analog network as
successive over relaxation (SOR) [131]. Simulation results suggest that SOR provides
improved performance over the flooding schedule (Section 2.5.1).
New approaches for building larger analog decoders are required. The desire to op-
erate the circuit in continuous time, in contrast to time multiplexed digital computation,
calls for intelligent routing techniques [116]. A tailbiting ring decoder which operates
using a sliding window [132] provides an efficient architecture for analog decoding of
convolutional codes. However, building large sum-product based LDPC decoders is dif-
ficult, due to fast growth in routing complexity. Manual schematic design and layout for
large decoders is a tedious and error prone process. Using standard cells and automated
routing is attractive, with the potential to automatically generate a chip layout directly
from the parity-check matrix [3]. However, approaches to date lead to impractically large
layouts [127]. Hence, further investigation into design automation is required.
An alternative approach for constructing analog computation cells, using multiple-
input floating-gate (MIFG) transistors has been proposed [133]. These circuits have the
potential to offer smaller chip area.
The requirement that transistors operate in saturation places a limit of around 1.2V
on the minimum supply voltage of typical analog decoder cells. As CMOS processes have
evolved, their allowable supply voltages have correspondingly decreased. An alternative
cell design has therefore been proposed which can operate from a supply voltage as low
as 0.6V [118, 119].
While the recent focus has been placed upon building analog decoders, potential
applications for analog processing are by no means limited to decoding. Essentially any
algorithm that operates on a factor graph is a candidate for analog implementation.
Research is ongoing into using analog computation for other applications, such as equali-
sation [134], timing recovery and synchronisation [135]. By replacing digital components
the potential exists to create a fully analog front end, and thus realise the true potential
of eliminating the A/D converter.
As discussed in Section 3.1, verification of the decoder circuit is a challenging task.
We may reduce this burden through importance sampling, when obtaining empirical
BER results from circuit simulation [136]. An alternate approach is to test the routing
connectivity of the circuit independently from operating the decoder, and provide a built-
in self-test (BIST) function for the fabricated circuit [137, 138].
6.4 The Subthreshold CMOS Analog Approach
We represent the probability mass function pX(x) using the currents on two wires, as the vector (Ix,0, Ix,1) corresponding to (pX(0), pX(1)). We define the sum current IX as the sum Ix,0 + Ix,1. We denote a probability of 1 by the unit current, Iu.
6.4.1 Subthreshold Operation
An n-type MOSFET [139] is shown in Figure 6.1. The device has four terminals, namely
the gate (G), source (S), drain (D), and bulk. The bulk terminal represents a connection
to the substrate. We assume that the bulk terminal is tied to either Gnd or Vdd for n-type
and p-type FETs respectively, and hence it has been omitted from the diagram.
Figure 6.1: An n-type MOSFET.
In simplistic terms we may consider the above FET to act as a switch, which connects
the drain to source when the gate-to-source voltage Vgs is above the transistor threshold
voltage, Vth, or isolates the drain and source when Vgs is below Vth. This relationship
is exploited in digital circuit design. However, in reality the device does not turn off
abruptly as soon as Vgs drops below Vth. When Vgs ≈ Vth, a weak inversion layer exists,
supporting some current flow from drain to source. In this case the device is said to
be operating in the weak inversion, or subthreshold, region [139]. We assume that the
device is saturated when the drain-to-source voltage is Vds ≳ 200 mV. If the device is
saturated and operating in the subthreshold region then the drain current, ID, exhibits
the following exponential relationship to Vgs [139]. This relationship is similar to the
relationship between the collector current and base-emitter voltage of a bipolar device.
ID = I0 e^(Vgs/(ζVT))    (6.1)
Here the specific current, I0, is a process dependent constant. The factor ζ > 1
is used to account for non-ideal device behaviour [139]. The thermal voltage is denoted
VT = kB T/q, where T is the absolute temperature, kB is Boltzmann's constant and q is the
carrier charge. We may consider the term ζVT to be a constant normalisation factor, with
units of voltage, leading to a dimensionless exponential component that varies with Vgs.
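Numerically, (6.1) behaves as follows (a sketch; the values chosen for I0 and ζ are purely illustrative, as both are process dependent):

```python
import math

def drain_current(Vgs, I0=1e-15, zeta=1.3, T=300.0):
    """Evaluate (6.1): ID = I0 * exp(Vgs / (zeta * VT)).
    I0 (the specific current) and zeta are process-dependent
    constants; the defaults here are illustrative only."""
    kB = 1.380649e-23    # Boltzmann's constant, J/K
    q = 1.602176634e-19  # elementary charge, C
    VT = kB * T / q      # thermal voltage, ~25.9 mV at 300 K
    return I0 * math.exp(Vgs / (zeta * VT))
```

With these values, an increase of ζVT ln 10 (about 77 mV) in Vgs multiplies ID by ten, the familiar exponential subthreshold behaviour.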
We exploit the relationship of (6.1) to build the simple analog processing circuits
that are reviewed in this section [3]. To do so, we choose a unit current such that the
devices in these circuits are biased to operate in the subthreshold region.
We now consider the n-type differential pair circuit shown in Figure 6.2. Here the
drain currents of the pair represent the probability mass pX(x) of a binary random vari-
able X, normalised to the unit current Iu. Using (6.1) it is straightforward to show
that the differential voltage VX0 − VX1 represents the corresponding log-likelihood ratio
loge(pX(0)/pX(1)). Hence, if we set a constant differential reference voltage, VX1 = Vdiff,
we can use this circuit to convert a voltage VX0 representing a log-likelihood value, into a
pair of currents (IupX(0),IupX(1)) representing the corresponding probability mass. The
reference voltage Vdiff represents a log-likelihood value of zero.
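A behavioural model of this conversion follows from (6.1): the tail current Iu splits between the pair in the ratio exp((VX0 − VX1)/ζVT). This is a sketch with illustrative parameter values, not a circuit-level model:

```python
import math

def diff_pair_currents(Vx0, Vx1, Iu=1e-9, zeta=1.3, VT=0.0259):
    """Behavioural model of the n-type differential pair (Figure 6.2):
    the tail current Iu splits in the ratio exp((Vx0 - Vx1)/(zeta*VT)),
    so the differential voltage is the log-likelihood ratio
    log(pX(0)/pX(1)) scaled by zeta*VT. Parameter values illustrative."""
    llr = (Vx0 - Vx1) / (zeta * VT)
    p0 = 1.0 / (1.0 + math.exp(-llr))
    return Iu * p0, Iu * (1.0 - p0)
```

Equal gate voltages (a log-likelihood of zero) split the tail current evenly, giving the uniform mass (Iu/2, Iu/2).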
In this thesis we focus upon circuits which operate in subthreshold CMOS. Alterna-
tively, we may use bipolar devices to implement high speed analog decoders. Existing
subthreshold CMOS decoders are able to offer throughput in the order of 1-500Mbit/sec,
and we expect 1Gbit/sec throughput to be achievable with newer CMOS processes. Re-
cent predictions for the throughput of SiGe decoders are in the order of 10Gbit/sec [140].
[Figure: n-type differential pair with tail current Iu, gate voltages VX0 and VX1, and drain currents IupX(0) and IupX(1).]
Figure 6.2: An n-type differential pair.
Bipolar circuits use higher currents, as they do not operate in subthreshold mode. Hence
they present a speed/power tradeoff in relation to subthreshold CMOS decoders. More-
over, fabrication costs for the BiCMOS [121] process, and the more advanced SiGe [141]
process, are significantly higher than for the standard CMOS process.
6.4.2 The Gilbert Multiplier
Consider two probability masses pX(x) = (pX(0), pX(1)) and pY (y) = (pY (0), pY (1))
represented by current vectors (Ix,0,Ix,1) and (Iy,0,Iy,1) respectively, as described above.
Current vectors IX·(pX(0), pX(1)) and IY·(pY(0), pY(1)) are presented as input to the circuit shown in Figure 6.3. When operating in the subthreshold region, this circuit outputs currents which are the pairwise products of pX(x) and pY(y), scaled by the sum current IZ = IX. This form of the standard Gilbert multiplier circuit is used
in the core of the analog decoder processing blocks [106, 142]. Here the n-type reference
potential is labelled VrefN. The multiplier circuit may be easily extended to accommodate
wider input vectors [3].
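Behaviourally, the multiplier core simply forms all pairwise products scaled by the sum current. A minimal sketch (our own illustration, not the circuit netlist):

```python
def gilbert_multiply(p_x, p_y, i_z=1.0):
    """Pairwise products of two binary probability masses, scaled by the
    sum current i_z: output[x][y] = i_z * pX(x) * pY(y)."""
    return [[i_z * p_x[x] * p_y[y] for y in (0, 1)] for x in (0, 1)]

out = gilbert_multiply((0.8, 0.2), (0.5, 0.5))
# Since both masses are normalised, the four outputs sum back to i_z.
```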
6.4.3 Probability Normalisation
We may normalise the current vector resulting from a calculation such as that performed
in Figure 6.3, using the biased current mirror circuit shown in Figure 6.4. Here the p-type
reference potential is labelled VrefP.
Using this circuit the normalised output current Ioutj may be calculated using the
following expression:

Ioutj = Iu Iinj / ∑_{m=1}^{k} Iinm

Figure 6.3: Vector multiplication core (current inputs IXpX(0), IXpX(1), IY pY (0), IY pY (1); outputs IZpX(x)pY (y) for x, y ∈ {0, 1}; n-type reference potential VrefN).
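The normalisation expression maps directly onto a few lines of code; a behavioural sketch (function name ours):

```python
def normalise(i_in, i_u=1.0):
    """Biased current mirror normalisation: Iout_j = Iu * Iin_j / sum(Iin)."""
    total = sum(i_in)
    return [i_u * i / total for i in i_in]

# e.g. the unnormalised pair (0.2, 0.6) becomes the mass (0.25, 0.75):
p = normalise([0.2, 0.6])
```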
6.5 Analog Computation using Soft-Logic Gates
In this section we review the variable and check node circuits which appear in the core
of an analog sum-product decoder [3]. In Section 2.5.1 we showed how these nodes may
be considered to be a concatenation of soft-logic gate components. Here we provide a
detailed description of soft-XOR and soft-equal gate functionality and implementation.
So far in our description we have the means to multiply and normalise currents.
Addition of currents is done simply by connecting wires together. If a current is required
more than once then we may use current mirrors to replicate it. We now use these rules
to build the two principal processing structures used in an analog LDPC decoder.
Figure 6.4: Vector normalisation using a biased current mirror (inputs Iin1 . . . Iink; outputs Iout1 . . . Ioutk; bias current Iu; p-type reference potential VrefP).
6.5.1 The Analog Soft-XOR Gate
The soft-XOR gate (Def. 2.4) may be used to build check nodes. We can implement
this processing block using an analog circuit, as shown in Figure 6.5. Current vectors
representing the probability masses pX(x) and pY (y) are set as input to the core circuit
of Figure 6.3. The outputs for IZpX(0)pY (0) and IZpX(1)pY (1) are then connected,
resulting in the current labelled IZ0, such that IZ0 ∝ pZ(0). The current labelled IZ1 is
formed in a similar manner, such that IZ1 ∝ pZ(1). Finally the resulting k = 2 currents
are used as input to the normalisation circuit of Figure 6.4 to generate IupZ(z).
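Numerically, the gate combines the multiplier products and the normalisation stage as follows; a behavioural sketch under our own naming:

```python
def soft_xor(p_x, p_y):
    """Soft-XOR: pZ(0) ∝ pX(0)pY(0) + pX(1)pY(1) and
    pZ(1) ∝ pX(0)pY(1) + pX(1)pY(0), mirroring the wired sums of the
    multiplier outputs followed by normalisation."""
    z0 = p_x[0] * p_y[0] + p_x[1] * p_y[1]
    z1 = p_x[0] * p_y[1] + p_x[1] * p_y[0]
    total = z0 + z1
    return z0 / total, z1 / total

# A hard 1 soft-XORed with a hard 1 gives a hard 0:
assert soft_xor((0.0, 1.0), (0.0, 1.0)) == (1.0, 0.0)
```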
6.5.2 The Analog Soft-Equal Gate
The soft-equal gate (Def. 2.5) may be used to build variable nodes. The analog circuit
implementation of this processing block shown in Figure 6.6 is similar to that of the
soft-XOR gate. In this case however, the outputs for IZpX(0)pY (1) and IZpX(1)pY (0)
are not used in the computation of the output distribution, and hence these paths are
terminated. The remaining outputs are then connected as input to the normalisation
circuit, to generate pZ(z).
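The corresponding behavioural model keeps only the agreeing product terms; a sketch under the same conventions as the soft-XOR example above:

```python
def soft_equal(p_x, p_y):
    """Soft-equal: pZ(z) ∝ pX(z)pY(z); the cross terms pX(0)pY(1) and
    pX(1)pY(0) are discarded, matching the terminated circuit paths."""
    z0 = p_x[0] * p_y[0]
    z1 = p_x[1] * p_y[1]
    total = z0 + z1
    return z0 / total, z1 / total

# Two mildly confident agreeing opinions reinforce each other:
p = soft_equal((0.8, 0.2), (0.8, 0.2))  # pZ(0) = 0.64/0.68 ≈ 0.941
```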
The soft-logic gates presented so far have a diode connected (gate tied to drain)
FET at each input. Hence the blocks have a low input impedance and are therefore
current-mode circuits (Def. 6.1). Transforming the blocks to voltage-mode operation is
easily achieved by shifting the diode connected FETs from the input to the output side.
Figure 6.5: Subthreshold CMOS soft-XOR gate (current inputs IXpX(x), IY pY (y); internal sums IZ0, IZ1; normalised outputs IupZ(0), IupZ(1)).
Figure 6.7 shows a voltage-mode soft-XOR gate. Here we represent probability mass
pX(x) using the voltage pair (Vx,0, Vx,1). The simplicity of this transformation reflects
the fact that voltage-mode and current-mode views are fundamentally the same [112].
The designer is therefore free to choose whichever approach best fits the overall system [3].
6.5.3 Building Variable and Check Nodes
We may construct bi-directional soft-logic gates (Def. 2.6) using the method introduced
in Section 5.2. The analog soft-logic gates described above are specific implementations
of the single output soft-logic gate shown in Figure 2.4(a). Here the function pZ(z)out =
f(pX(x)in, pY (y)in) is implemented using the circuits shown in Figure 6.5 and Figure 6.6,
for soft-XOR and soft-equal gates respectively. The single output analog gates are then
connected as shown in Figure 2.4(b), to form bi-directional gates. Each edge in Figure 2.4
represents two wires in the circuit, e.g. carrying pX(0) and pX(1). The factor graph
representation of these bi-directional soft-logic gates is shown in Figure 6.8.
Figure 6.6: Subthreshold CMOS soft-equal gate (current inputs IXpX(x), IY pY (y); normalised outputs IupZ(0), IupZ(1)).
We can construct a 4-port check node by cascading two 3-port bi-directional soft-
XOR gates, as shown in Figure 6.9(a). Larger nodes may be built in a similar manner.
Variable nodes are built using a similar approach, as shown in Figure 6.9(b). Here the
variable has 3.5 ports (3 bi-directional check connections plus an input only channel con-
nection) and an output bit slicer [3]. The output bit slicer uses a single output soft-equal
gate to incorporate channel information into the posterior output calculation. Messages
(probability masses) are passed as vectors in a single direction along 2 wires (dotted lines)
and bidirectionally between cells along 4 wires (solid lines).
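The cascade construction can be checked behaviourally: the outgoing message on one port of a check node is the soft-XOR, folded pairwise, of the masses on the remaining ports. A sketch (function names ours):

```python
def soft_xor(p_x, p_y):
    z0 = p_x[0] * p_y[0] + p_x[1] * p_y[1]
    z1 = p_x[0] * p_y[1] + p_x[1] * p_y[0]
    total = z0 + z1
    return z0 / total, z1 / total

def check_node_message(incoming):
    """Outgoing check node message, formed by cascading 2-input
    soft-XOR gates over the incoming masses as in Figure 6.9(a)."""
    out = incoming[0]
    for mass in incoming[1:]:
        out = soft_xor(out, mass)
    return out

# Three hard inputs 1, 1, 0 force the remaining port toward 0 (even parity):
msg = check_node_message([(0.0, 1.0), (0.0, 1.0), (1.0, 0.0)])
```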
6.6 The Analog Sum-Product Decoder
We may build an analog version of the sum-product decoder (Algorithm 2.1) using the
above processing blocks. The basic approach is to map the factor graph of the code
onto a circuit, using analog check and variable nodes. A detailed description has been
completed by Lustenberger [3].

Figure 6.7: Voltage-mode soft-XOR gate (voltage inputs Vx,0, Vx,1, Vy,0, Vy,1; outputs Vz,0, Vz,1).
In this thesis we design a proof-of-concept codec core for a reversible LDPC code
having (n, k) = (16, 8). The code is (3,6)-regular, with Hp structured according to
Theorem 3.8 and Hu being randomly generated. The resulting parity-check matrix,
which contains 4-cycles due to the very small size of the code, is defined as follows.
H =

1 1 0 1 0 0 0 0 1 0 0 1 1 0 0 0
0 1 1 0 1 0 0 0 1 1 0 0 0 0 1 0
0 0 1 1 0 1 0 0 0 1 0 0 0 1 0 1
0 0 0 1 1 0 1 0 0 1 1 0 1 0 0 0
0 0 0 0 1 1 0 1 0 0 0 0 1 1 1 0
1 0 0 0 0 1 1 0 1 0 1 0 0 0 0 1
0 1 0 0 0 0 1 1 0 0 0 1 0 0 1 1
1 0 1 0 0 0 0 1 0 0 1 1 0 1 0 0

(6.2)
The analog sum-product circuit which maps the factor graph of H is shown in Fig-
ure 6.10. Each 6-edge check node is implemented using a 6-port bi-directional soft-XOR
gate, and each variable node using a 3.5-port soft-equal gate with an output bit slicer.
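The structural claims about H can be checked directly. A small sketch, with the matrix of (6.2) transcribed row by row in our own bit-string encoding:

```python
H = [[int(c) for c in row] for row in [
    "1101000010011000",
    "0110100011000010",
    "0011010001000101",
    "0001101001101000",
    "0000110100001110",
    "1000011010100001",
    "0100001100010011",
    "1010000100110100",
]]

# (3,6)-regular: every check touches 6 variables, every variable 3 checks.
assert all(sum(row) == 6 for row in H)
assert all(sum(row[j] for row in H) == 3 for j in range(16))

# Hp (the first 8 columns) is circulant: each row is a cyclic shift of the first.
Hp = [row[:8] for row in H]
assert all(Hp[r] == Hp[0][-r:] + Hp[0][:-r] for r in range(1, 8))
```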
Figure 6.8: Factor graph representation of bi-directional 3-port soft-logic gates, with edges pX(x), pY (y) and pZ(z): (a) soft-equal gate; (b) soft-XOR gate.
Figure 6.9: Check and variable nodes: (a) 4-port check node with edges pW (w), pX(x), pY (y) and pZ(z); (b) 3.5-port variable node (edges pX(x), pY (y), pZ(z); prior in; posterior out) with output bit slicer.
Channel observations are passed into the variable nodes and the decoded soft decisions
are presented at the variable node outputs.
If we consider the soft-equal and soft-XOR gates to be clocked so that they operate
in discrete time iterations, then it is easy to show that the above circuit implements the
sum-product algorithm. However, the analog implementation operates asynchronously
in continuous time, and for this case the connection is not so clear. Research into
more accurate algorithmic descriptions of the analog circuit operation has recently
gained interest [131].
6.7 Summary
The principle of analog decoding has passed the proof-of-concept phase. The analog
decoder community is still growing, and is now setting its sights upon larger chip designs.
The analog approach has also proven suitable for other front-end applications. However,
further work is required before we can replace all digital components, and thus eliminate
the large and power-demanding A/D converter. New theoretical models and circuit
analysis techniques are also being developed.

Figure 6.10: Analog decoder circuit for H. Each variable node (v1 . . . v16) is a 3.5-port soft-equal gate and each check node (c1 . . . c8) is a 6-port soft-XOR gate; received channel data enters as prior input and decoder soft decisions appear as posterior output. Probability vector buses use 2 wires, and bidirectional probability vector buses use 4 wires.
The analog approach appears well suited to LDPC decoder design. It offers fast,
accurate, and power efficient decoding. The promise of reduced routing complexity is
also of particular significance to LDPC decoder implementation.
The iterative encoding techniques developed in Chapter 3 allow a digital sum-product
decoder to be easily re-used for encoding reversible codes. However, re-use of the analog
decoder architecture to implement the encoding algorithm is not as obvious. In the
following chapter we modify the analog sum-product decoder core to also allow encoding,
with very little overhead to circuit area.
Chapter 7
The Reversible LDPC Codec
7.1 Introduction
In this chapter we show how the analog sum-product decoder may be extended to allow
iterative encoding of reversible codes. We focus only upon the codec core. Interface
circuits such as those discussed in [126, 143] may be added to complete the chip design.
A novel architecture is presented, based upon the iterative Jacobi message-passing
encoder. This discrete time hard decision algorithm offers straightforward reuse of a
digital sum-product decoder for encoding reversible codes. However, methods for reusing
the continuous time soft decision analog decoder to implement the algorithm are not as
obvious.
This chapter contains original work, resulting from collaborations with the Electri-
cal Engineering Department at the University of Utah, and the High Capacity Digital
Communications Laboratory at the University of Alberta. The initial concept codec
design was presented at the 2003 Australian Communications Theory Workshop, in Mel-
bourne [144]. This circuit reused the subthreshold computation cells of the decoder to
perform encoding, allowing half-duplex operation of encode and decode modes. A refined
version of the design was then presented at the 3rd International Symposium on Turbo
Codes and Related Topics, in Brest, France [88]. An alternative approach is presented
in this chapter, which allows the codec to switch between analog decoding and digital
encoding, and offers full-duplex operation. This design was presented at the 2nd Analog
Decoder Workshop, in Zurich, Switzerland [145]. An alternative application for the cell
designs, toward circuit verification, was recently presented at the 3rd Analog Decoder
Workshop, in Banff, Canada [137].
Mode     enc   enc̄   Vu    VrefN   VrefP
Decode   Gnd   Vdd   Vu    VrefN   VrefP
Encode   Vdd   Gnd   Gnd   Gnd     Vdd

Table 7.1: Mode-switching gate settings.
7.2 Mode-Switching Gates
From initial investigations into reusing the analog decoder to perform iterative encoding
(Algorithm 3.1) we have identified two issues [88, 144]. Firstly, the output of the soft-
XOR gates decays over time during the iterative process, thus requiring amplification.
The second issue is related to the continuous time operation of the analog circuit. As
the arrival of messages at a node is asynchronous, it is possible for the circuit to stray
from the iterative Jacobi path, before settling to the correct steady state solution. We
therefore latch the feedback, forcing the circuit to update each iteration in discrete time.
These problems both arise from the fact that we are trying to map the discrete time
digital encoding algorithm onto a continuous time analog circuit. The algorithm passes
hard messages ∈ {0, 1} yet the analog circuit passes soft messages ∈ (0, 1). Motivated
by this we present an alternative design, using mode-switching gates (MSG) in the check
nodes. These gates operate as analog soft-XOR gates (Section 6.5.1) during decoding.
They are then switched to operate as digital resistor-transistor-logic (RTL) XOR gates
during the encode operation.
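At the behavioural level, the two modes of an MSG can be modelled as follows. This is a functional sketch only (the naming is ours); the circuit realisation is described in the remainder of this section.

```python
def mode_switching_xor(mode, x, y):
    """Behavioural model of a mode-switching XOR gate: a normalised
    soft-XOR on probability pairs in decode mode, and a hard Boolean
    XOR on bits in encode mode."""
    if mode == "decode":
        z0 = x[0] * y[0] + x[1] * y[1]
        z1 = x[0] * y[1] + x[1] * y[0]
        total = z0 + z1
        return z0 / total, z1 / total
    return x ^ y  # encode mode: hard messages in {0, 1}

assert mode_switching_xor("encode", 1, 1) == 0
soft = mode_switching_xor("decode", (0.9, 0.1), (0.9, 0.1))  # ≈ (0.82, 0.18)
```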
A mode-switching XOR gate is shown in Figure 7.1. This cell is based upon the
voltage-mode soft-XOR gate shown in Figure 6.7, with the addition of four transmission
gates. We may use minimum sized FETs for these transmission gates and hence they
present very little overhead to the original circuit. The cell may be switched between
encode (hard) and decode (soft) modes according to Table 7.1.
We now explain how the circuit generates the output pZ(1). The output pZ(0) is
generated in a similar manner. The transmission gates are used to dynamically rewire
the cell between encode (enc) and decode (enc̄) modes.
When switched to operate in decode mode (enc = Gnd, enc̄ = Vdd), the circuit
is wired as a soft-XOR gate, equivalent to that shown in Figure 6.7. The voltage Vu
represents an external connection to the driving FET of a p-type current mirror, such
that transistor M1 is biased to supply the unit current Iu. Transistors M2 and M3 form
part of the normalisation network, and M4 is diode connected.
When switched to operate in encode mode (enc = Vdd, enc̄ = Gnd), the circuit
Figure 7.1: Mode-switching XOR gate (the voltage-mode soft-XOR gate of Figure 6.7 with transmission gates controlled by enc and enc̄; transistors M1, M2, M3 and M4 as referenced in the text).
is wired as a digital resistor-transistor-logic XOR gate. We assume that hard decision
{Gnd, Vdd} voltages drive the inputs. We set VrefP = Vdd and use transistor M2 as
a resistive load to the multiplication matrix. The gate voltage of M2 represents the
inverted result for pZ(1). We set Vu = Gnd to turn on M1, and VrefN = Gnd so that
transistors M3 and M4 form an inverter. The true form of pZ(1) is then presented at the
output.
Potential applications for mode-switching gates [137] extend beyond the encoding
architecture presented in this work. Research is ongoing into using these gates to incor-
porate built-in self-test functionality into codec designs [138].
7.3 Codec Core Architecture
We now extend the analog sum-product decoder core described in Section 6.6 to allow
encoding. The encode operation is performed digitally, using the mode-switching XOR
gates presented above. This proof-of-concept design is based upon a type-I reversible code
(Def. 3.3), however the codec architecture may also be used for type-II codes (Def. 5.2).
Consider the decoder circuit corresponding to (6.2), shown in Figure 6.10. We sepa-
rate this circuit into information and parity sections by splitting each 6-port XOR gate,
cr, into two 4-port gates, cur and cpr, as shown in Figure 7.2. Moreover, we replace the
soft-XOR gates used in Figure 6.10 with mode-switching XOR gates.
The information variable nodes (x9 . . . x16) are extended to allow encoding as shown
in Figure 7.3 for the case of x9. Similarly, the parity variables (x1 . . . x8) are extended as
shown in Figure 7.4 for the case of x1. Transmission gates and a multiplexer (MUX) are
used to switch each variable node between encode (enc) and decode (enc) modes. The
multiplexer is also built from transmission gates. Here 2-wire probability vector buses
(thick lines) carry messages representing (pX(0),pX(1)). Where necessary these have been
expanded into single wires (thin lines).
7.3.1 Decode Mode
In decode mode (enc = Gnd, enc̄ = Vdd) the circuit operates as a subthreshold CMOS
analog sum-product decoder [3]. All checks (mode-switching XOR gates) are set to
operate as soft-XOR gates.
In this mode, information and parity variables both perform the same function. Each
variable is based upon a soft-equal gate, as shown in the shaded region of Figure 7.3
Figure 7.2: Codec circuit for H. The information section comprises variables v9 . . . v16 (bits x9 . . . x16) and checks cu1 . . . cu8; the parity section comprises variables v1 . . . v8 (bits x1 . . . x8) and checks cp1 . . . cp8, linked by the intermediate bits b1 . . . b8.
Figure 7.3: Information variable node structure for x9 (edges to cu1, cu2 and cu6; NDIFF and NORM input stages with reference Vdiff; transmission-gate multiplexer selecting encode (enc) or decode (enc̄) mode; decode reset FET M1 (drst); transistor M2).
Figure 7.4: Parity variable node structure for x1 (edges to cp1, cp6 and cp8; feedback of p(x1 = 1) from cp1 through a two-phase shift register clocked by φ1 and φ2; reset FETs M1 (drst) and M3 (erst); transistor M2).
and Figure 7.4. The equal gate operates as a voltage mode cell, however the output
result (posterior) is presented as a current-mode pair. Gate edge connections E1Out,
E2Out and E3Out are routed to adjacent check nodes. Edge connections E1In, E2In and
E3In receive messages routed from adjacent check nodes. We assume that the received
channel observation (prior) is represented as a voltage, in log-likelihood form. This
voltage is converted into a current-mode probability vector, using the n-type differential
pair (NDIFF) circuit described in Section 6.4.1. The reference voltage (Vdiff) represents
a log-likelihood value of zero. A vector normalisation circuit (NORM), as described in
Section 6.4.3, then feeds two diode connected FETs. These FETs provide a voltage-mode
representation of the vector, for input (via GateIn) to the equal gate. The decoded soft
output (posterior) is then taken from the output (GateOut port) of the equal gate.
A reset FET, M1, is connected across the input to the normalisation circuit. To reset
the decoder, i.e. clear the previous result, we briefly set drst = Vdd. This causes M1 to
turn on and sets a uniform distribution, p(xs = 0) = p(xs = 1) = 0.5, for all variables.
Once released from reset we allow the decoder circuit some fixed time, tdec, to settle
to its steady state. At this point an interface circuit [143] may be used to sample the
posterior core output, and present this result as a hard decision at the output of the chip.
7.3.2 Encode Mode
In encode mode (enc = Vdd, enc = Gnd) the circuit operates as a digital message-passing
Jacobi encoder (Algorithm 3.1). All checks (mode-switching XOR gates) are set to op-
erate as digital RTL XOR gates.
The first step toward encoding is to generate the vector b = Huxu⊤ from the
information variables, as shown in Figure 7.2. Using a multiplexer we bypass the equal gate
that is used in decode mode (see Figure 7.3). The (p(xs = 1)) information bit value is
presented as a hard decision voltage at the (prior) node input. An inverter is then used to
generate p(xs = 0), and both values are sent to each outgoing edge of the variable node.
Transistor M2 at the input of this inverter connects it to ground during the decode mode
(enc̄ = Vdd). This connection is made to prevent the input of the inverter from floating,
which can create a resistive path between the power rails and waste power.
Each check node includes the three adjacent incoming information bits in the hard
XOR operation, thus producing b. For example, the XOR operation at check cu1 includes
information bits x9, x12 and x13 to produce b1.
To implement the Jacobi iteration (3.4) we require only the check node output passed
along the path µcs→vs for each vs representing the parity bit xs. This value for vs is then
fed back into the checks c ∈ Γ (vs) \ cs, and also forms the final decision for xs. We use a
two phase shift register, consisting of two transmission gates and two inverters, to latch
the feedback (see Figure 7.4). The input to this shift register is taken from the voltage
representing p(xs = 1). We initially ignore the value representing p(xs = 0), and generate
it later at the output stage of the shift register.
Signals φ1 and φ2 are used to clock the shift register, as shown in Figure 7.5. These
clock signals have duty cycle less than 50% and phases constrained such that they are
never both high at the same time. This prevents feedback during the present iteration
from altering results from the previous iteration. During the input phase (φ1 high) the
first transmission gate is closed, i.e. its input and output are connected, and the result
of the present iteration is stored on the gate of the first inverter. Upon completion of this
phase (φ1 low) the first transmission gate is opened. Shortly after this, the output phase
(φ2 high) begins and the second transmission gate closes. Calculation of the next iteration
begins immediately, and continues on after the second transmission gate is opened (φ2
low). The result is then sampled on the next input phase. The process is repeated for the
number of iterations required to encode, e.g. κ = 4 for the reversible code used in this
Figure 7.5: Shift register clock phases (non-overlapping input phase φ1 and output phase φ2).
Function                       Components                       FET Requirement
Dynamic cell reconfiguration   Transmission gates               192
Edge multiplexing              Transmission gates               512
Input multiplexing             Transmission gates               32
Feedback shift registers       Transmission gates / inverters   80

Table 7.2: Transistor overhead for encoding.
design. Upon completion, at time tenc, the result is latched through to the chip interface.
Initialisation is performed using transistor M3, which ties the input of the second
inverter to Vdd. We hold the encoder in the reset state by setting erst = Gnd. During
the decode mode (enc̄ = Vdd) we set erst = Gnd. Transistors M2 and M3 then turn on,
and connect the input of each inverter to a supply rail, in order to prevent these input
voltages from floating.
7.3.3 Estimate of Encoder Implementation Overhead
An approximate count of the number of FETs used to add encoder functionality to this
(n = 16) analog decoder core is summarised in Table 7.2. We note that it is not necessary
to transform all soft-XOR gates into mode-switching gates. Only those gates which route
messages in the path of the iterative encoding algorithm require conversion, thus saving
some transistor overhead.
The total number of FETs required to transform the above decoder core into a codec
is 816. This represents approximately 15% of the total number of transistors used to build
the core. The average overhead is 51 FETs per bit, and this count can be used as a linear
guide to predict the overhead requirement for a larger codec. Most of these transistors
are used to build transmission gates, and hence they can have minimum dimensions. The
routing overhead is negligible, as only a small number of control signals have been added,
Parameter                          Symbol   Value
Unit bias current                  Iu       100 nA
Voltage supply                     Vdd      1.8 V
Reference voltage (n-type)         VrefN    0.4 V
Reference voltage (p-type)         VrefP    1.4 V
Reference voltage (differential)   Vdiff    0.8 V

Table 7.3: Codec core simulation parameters.
Iteration   Parity vector (x1...8)
0           0 0 0 0 0 0 0 0
1           1 1 0 1 0 0 1 1
2           1 1 1 0 1 0 1 0
3           0 1 0 1 0 0 0 1
≥ 4         1 1 1 1 1 0 1 1

Table 7.4: Example iterative solution.
and data paths have been reused from the decoder.
7.4 Circuit Simulations
We have built a T-SPICE description of the complete codec core described above, for
the code with H defined by (6.2), using the TSMC 0.18µm CMOS technology1. Circuit
simulation parameters are provided in Table 7.3. The voltages representing log-likelihood
prior input to the decoder variables have a maximum deflection of ±0.2V about Vdiff.
Figure 7.6 shows the simulation of an example block decode. The circuit output
current Ixs,1, representing p(xs = 1), is shown for each symbol. In this example the
decoder is released from reset at time zero, and the channel has flipped bits x7 and
x11. The decoder successfully corrects these two errors, to arrive at the codeword x =
[0011001000101010]. The settled decoding process may be safely sampled by the interface
at time tdec = 10µs.
To demonstrate the operation of the encoder we apply the information vector x9...16 =
[00110111] and expect the parity vector x1...8 = [11111011]. From (3.4) we obtain each
step of the Jacobi iterative solution shown in Table 7.4.
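The iterative solution of Table 7.4 can be reproduced in software. A sketch of the Jacobi iteration (3.4) over F2 in our own transcription: it relies on Hp having a unit diagonal, so that each update takes the form xp ← b ⊕ (Hp ⊕ I)xp.

```python
H = [[int(c) for c in row] for row in [
    "1101000010011000", "0110100011000010", "0011010001000101",
    "0001101001101000", "0000110100001110", "1000011010100001",
    "0100001100010011", "1010000100110100",
]]  # the parity-check matrix of (6.2)
Hp = [row[:8] for row in H]  # circulant parity section, unit diagonal
Hu = [row[8:] for row in H]  # information section

def matvec_f2(M, v):
    """Matrix-vector product over F2."""
    return [sum(m * x for m, x in zip(row, v)) % 2 for row in M]

def jacobi_encode(x_u, iterations=4):
    """Iterative encoding: form b = Hu xu^T, then apply kappa Jacobi updates."""
    b = matvec_f2(Hu, x_u)
    off = [[Hp[i][j] ^ (i == j) for j in range(8)] for i in range(8)]
    x_p = [0] * 8
    for _ in range(iterations):
        x_p = [bi ^ fi for bi, fi in zip(b, matvec_f2(off, x_p))]
    return x_p

x_u = [0, 0, 1, 1, 0, 1, 1, 1]             # information vector of this example
x_p = jacobi_encode(x_u)                   # -> [1, 1, 1, 1, 1, 0, 1, 1]
assert matvec_f2(H, x_p + x_u) == [0] * 8  # the result is a codeword
```

The intermediate iterates of this sketch match Table 7.4 row by row, and after κ = 4 iterations the parity vector is the fixed point of the update.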
For this example, the voltage Vxs,1 representing p(xs = 1) for each parity output bit
is shown in Figure 7.7. The information vector is applied at time zero and the reset latch
1 The circuit description uses BSIM3v3 simulation device models, obtained through the Canadian Microelectronics Corporation.
Figure 7.6: Decoder output with bits x7 and x11 corrected. The output current Ixs,1 (nA) is plotted against time (µs); the curves Ix7,1 and Ix11,1 are marked.
released 5ns later. The circuit is then clocked for κ = 4 iterations, to arrive at the correct
codeword, i.e. x6 is the only output bit having p(xs = 1) = 0. The iterative solution for
x1, shown in Figure 7.8, matches that predicted in Table 7.4. The encoded result may
be safely latched through to the output interface at time tenc = 50ns.
A summary of specifications for the codec core, measured from circuit simulation,
is provided in Table 7.5. We expect the power requirements and throughput of a larger
core to grow linearly with block length, for both encode and decode modes.
Parameter           Decode                            Encode
Time (per block)    10 µs                             50 ns
Power (per block)   110 µW                            21.6 mW
Energy              138 pJ/decoded information bit    135 pJ/encoded parity bit
Throughput          800 kbit/sec (information bits)   160 Mbit/sec (parity bits)

Table 7.5: Codec core specification.
Although the digital encoder draws significantly more power than the analog decoder,
it operates 200 times faster, and hence the per block energy requirements of the two
modes balance. The average power consumption, assuming an equal number of encode
and decode operations, is 217µW per block. Based upon these results, and considering
a reversible code that is encodable in κ iterations, we expect the time taken to encode a
block to be approximately 0.125κ% of that taken to decode a block.
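The energy balance and average power quoted above follow directly from the Table 7.5 figures (8 information bits and 8 parity bits per block); a quick arithmetic check:

```python
decode_energy = 110e-6 * 10e-6   # 110 uW for 10 us  -> 1.1 nJ per block
encode_energy = 21.6e-3 * 50e-9  # 21.6 mW for 50 ns -> 1.08 nJ per block

per_info_bit = decode_energy / 8    # 137.5 pJ, ~138 pJ as tabulated
per_parity_bit = encode_energy / 8  # 135 pJ as tabulated

# One encode plus one decode, averaged over the combined time:
avg_power = (decode_energy + encode_energy) / (10e-6 + 50e-9)  # ~217 uW
```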
Figure 7.7: Encoder parity bit outputs. The output voltage Vxs,1 (V) is plotted against time (ns); Vx6,1 is marked.
Figure 7.8: Encoder output for parity bit x1 (output voltage Vx1,1 (V) against time (ns)).
7.5 Summary
In this chapter we have designed and simulated the circuit architecture for the core of
a reversible LDPC codec. The analog decoder architecture is well suited to continuous
time soft decision decoding. However, the discrete time hard decision Jacobi encoder is
more appropriately implemented with a digital circuit. In order to achieve these seemingly
contradictory roles for a circuit that is based upon architecture re-use, we have introduced
mode-switching logic gates. Moreover, these gates have other potential applications, e.g.
for circuit verification, that go beyond that studied in this thesis [137, 138].
Analog decoders are robust to design and fabrication errors [90]. This makes their
behaviour difficult to verify, both at design time and after fabrication. However, encoding
is a deterministic process in which errors are exposed quickly. Since we are reusing the de-
coder for encoding, verification of the encode operation also provides implicit verification
for components of the decoder circuit.
The additional area required to convert the analog sum-product decoder into a full
codec circuit is very low. Most of the additional transistors may have minimum dimen-
sions and the routing overhead is negligible. This efficient use of circuit area results from
both the novel encoding algorithm, and the novel circuit architectures that are used in
its implementation.
The example circuit in this work is based upon a small type-I reversible code, i.e.
the code has a circulant Hp. However, by scaling up the design the same architecture
can in principle be used to build a codec for one of the larger type-II reversible codes
presented in Chapter 5. As both encode and decode operations are fully parallel, the
block computation time remains constant as we scale block length.
The time taken for the codec to perform an encode operation is insignificant in
comparison to that taken to decode. For example, we expect the block encode time for
a type-II reversible code to be around 1% of the block decode time. Hence this circuit is
well suited to use in full-duplex communication systems.
Chapter 8
Conclusion
8.1 Summary of Results
In this thesis we propose a new approach to LDPC codec design, from both coding the-
ory and circuit implementation perspectives. The approach is based upon the reuse of
the sum-product decoder to perform encoding. It is motivated by the fact that encoding
LDPC codes is in general not a trivial problem, and by the potential to reduce implemen-
tation area. Given that the routing of wires in an LDPC decoder represents a significant
proportion of the total chip area, making further use of the architecture by reusing it for
encoding is advantageous.
In Chapter 3 we propose a novel encoding algorithm and a corresponding class of
reversible LDPC codes. The encoding algorithm is based upon the Jacobi method for it-
erative matrix inversion, for matrices which have elements from F2. The algorithm allows
reuse of a parallel sum-product decoder implementation, and thus offers an alternative to
the existing methods for encoding LDPC codes presented in Chapter 2, which are based
upon serial processing. Moreover, by using the same circuit for encoding and decoding,
verification of the encoder operation implicitly provides verification for components of
the decoder. This represents a further advantage of the iterative Jacobi approach, that
is not shared by the existing dedicated encoder architectures described in Chapter 2.
We determine the convergence constraint that a matrix must satisfy in order to be
iteratively invertible and thus the total number of iterations required to encode. We
also present an algebraic method for constraining a circulant matrix such that it is 4-
cycle free. Using these two constraints we present an algorithm for constructing 4-cycle
free type-I reversible codes, which are encodable using the Jacobi method over F2. The
algorithm allows some flexibility in the choice of code length and rate. From the analysis
presented in Chapters 4 and 5, we find that the type-I construction allows good codes
to be developed for high rate applications. However, the method is not as well suited to
building (3,6)-regular codes, where we attribute decoder convergence problems to poor
expansion in the factor graph representation of these codes.
Motivated by the good expansion property of random graphs, in Chapter 5 we re-
cursively construct type-II reversible LDPC codes, such that they incorporate random
components. These codes are 4-cycle free and are encodable using eight iterations of
the Jacobi encoder. The algorithm produces codes which have better graph expansion,
and performance, than the type-I reversible codes. It also offers greater flexibility in the
choice of code length and rate. Moreover, it allows the construction of (3,6)-regular codes
with good error correcting performance.
In Chapter 7 we present a novel circuit architecture for the core of a reversible
LDPC codec. The circuit switches between analog decode and digital encode operations.
Encoding is performed via the iterative Jacobi method over F2, by reusing the decoder
architecture. In order to switch the circuit between decode and encode modes, we have
presented a novel circuit design for mode-switching gates. These logic gates are able to
switch between analog (soft) and digital (hard) computation, and may also be applied
to built-in self-test circuits for analog decoders. We now discuss the proposed codec
implementation in the context of the design criteria presented in Section 1.2.
Good Error Correcting Performance We have shown that reversible LDPC codes
can be constructed which are 4-cycle free and offer good performance. Empirical
simulation results compare well to those of random and finite geometry benchmark
codes. The subthreshold CMOS sum-product decoder design used as a basis for
this codec is not new. Results from several fabricated proof-of-concept decoders of
this form indicate that implementation performance agrees with that predicted by
simulation. Hence we expect that combining the reversible codes with an analog
decoder will result in good error correcting performance.
Flexibility of Code Design The reversible design approach offers a large amount of
flexibility in the choice of code rate, as the iterative encodability constraint only ap-
plies to a section of the parity-check matrix. We have provided examples of several
reversible codes with block length n ≈ 1000 that offer good empirical performance.
The type-II reversible code with n = 4096 exhibits signs of an error floor at a word
error rate of approximately 10−5. We expect that performance improvements for
codes of this length may result from further investigation into alternative recursive
code constructions.
Simplicity of Design The algorithms presented for generating type-I and type-II re-
versible codes are easy to implement. Moreover, the underlying ideas behind these
algorithms, and the encoder itself, are straightforward.
High Speed Analog decoders are expected to operate around two orders of magnitude
faster than digital decoders. Recent results for fabricated analog
decoders are in agreement with this expectation. Using the proposed architecture,
in the case of a type-II reversible code, the computational latency of the encode
operation is around 1% of the decode latency. Hence the codec is suitable for
full-duplex applications. Moreover, the parallel implementation implies that the
computational latency of both encode and decode operations is fixed as code length
is scaled.
Low Power Consumption The subthreshold CMOS decoder operates with very low
transistor bias currents and is thus very power efficient. Recent results for fab-
ricated analog subthreshold CMOS decoders indicate that they can operate with
power consumption that is around two orders of magnitude less than that of a dig-
ital decoder. Furthermore, for the proposed codec, the energy requirement of the
encoder is approximately the same as that of the decoder.
Small Area Analog decoder circuits offer a significant saving in routing complexity. As
the size of an LDPC decoder implementation is primarily determined by routing
congestion rather than by transistor count, we expect the analog LDPC decoder
to be smaller than a digital decoder. Using the proposed approach, only a small
additional area overhead is required to transform the decoder into a reversible codec.
Ease of Verification It is the task of a decoder to correct errors; faults in a decoder
circuit therefore tend to be masked, making its behaviour difficult to verify, both at
design time and after fabrication. However, encoding is a deterministic process in which errors are exposed
quickly. The proposed codec reuses the decoder for encoding. Hence, verification
of the encode operation also provides implicit verification for components of the
decoder circuit.
By combining the analog approach and the reversible LDPC coding scheme we are
able to build a small, power efficient codec, which offers good performance with a flexible
selection of code length and rate. In particular, the small size and low power requirements
of this codec design make it a candidate for use in mobile telephony, wireless networking,
implantable devices and other biomedical applications.
8.2 Suggestions for Further Work
• In this work we have developed design approaches for reversible codes by consider-
ing the encoding constraint, and factor graph expansion metric, in terms of matrix
manipulation. An alternative approach may be to consider how the iterative encod-
ability constraint maps onto the factor graph, and design codes from a graphical
viewpoint. In Chapter 5 we have approached the problem of building codes with
improved expansion by incorporating randomness into the structure of the graph,
while maintaining its iterative encodability. By considering the encodability con-
straint in graphical terms, we may be able to devise a way of explicitly designing
good expanders that are iteratively encodable. At most eight iterations are
required to encode the reversible codes presented in this thesis. However, the high
speed at which the encode operation may be performed using the architecture pro-
posed in Chapter 7 implies that it should be practical to allow more than eight
encoder iterations. Hence we may relax this design constraint in the search for
codes which offer high error correcting performance.
• This thesis focusses upon encoding via the Jacobi method for iterative matrix
inversion. It may prove worthwhile to investigate whether other techniques for
solving linear systems of equations can be applied to the encoding problem.
• The recursive method proposed in Algorithm 5.1 demonstrates that good (3,6)-
regular codes with length n ≈ 1000 can be built. We should be able to develop
similar algorithms which allow reversible codes to be constructed with longer block
length and/or higher column weight.
• An investigation of irregular reversible LDPC code construction was undertaken
in [93]. An example irregular reversible code was built, having rate 1/2 and length
n = 1008. The code contained many connected cycles of length six, and exhibited
a BER floor at around 10−4. Hence an alternative approach to building irregular
structures is required. Using a recursive method, e.g. similar to that of Algo-
rithm 5.1, we may be able to develop good irregular reversible codes.
• Results from the finite graph-cover analysis presented in Section 4.4.4 imply that
a weakness in the structure of Hp alone is sufficient to cause convergence problems
in the AWGN channel case. Hence in this work we have focussed on the section of
the factor graph corresponding to Hp. We have randomly generated Hu, imposing
only node degree constraints upon the graph, and blocking 4-cycles. However, there
may be ways to structure Hu which block the creation of small stopping sets, and
hence improve BEC performance.
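The 4-cycle constraint mentioned above has a simple matrix test: a 4-cycle exists exactly when two columns of the parity-check matrix share ones in two or more rows, since those two variable nodes and two check nodes then form a length-4 cycle in the Tanner graph. A sketch of such a check (the function name is ours):

```python
import numpy as np
from itertools import combinations

def has_4_cycle(H):
    """Return True if the Tanner graph of H contains a 4-cycle.

    Two columns sharing a one in two (or more) rows form a length-4
    cycle between two variable nodes and two check nodes, so it
    suffices to test every pair of columns for overlap >= 2.
    """
    H = np.asarray(H)
    for i, j in combinations(range(H.shape[1]), 2):
        if int(np.dot(H[:, i], H[:, j])) >= 2:   # rows where both columns are 1
            return True
    return False
```

This pairwise-overlap test is what a construction algorithm would invoke while adding columns to a randomly generated Hu; a quadratic scan like this is adequate at the block lengths considered here.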
• It is well known that we may test the parity-check constraints during sum-product
decoding of LDPC codes, and terminate the algorithm early if all constraints are
satisfied. Existing designs for analog LDPC decoders do not incorporate early
stopping criteria. The circuit is instead allowed to run for a fixed amount of
time before the result is latched. By testing the state of check nodes, it may be
possible to incorporate an early stopping criterion into the architecture, so that the
decoder stops once a valid codeword is obtained. This may reduce the average
computational latency of the decode operation.
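The stopping test itself is inexpensive: a candidate word is a valid codeword exactly when its syndrome is zero over F2. The sketch below shows how such a criterion might wrap an iterative decoder; the `step` and `decision` callables stand in for whatever message-passing update and hard-decision rule are in use, and are placeholders of ours rather than interfaces from this thesis.

```python
import numpy as np

def is_codeword(H, x):
    """A word x is a valid codeword iff its syndrome H x^T is zero over F2."""
    return not (np.asarray(H).dot(x) % 2).any()

def decode_early_stop(H, state, step, decision, max_iters=50):
    """Run an iterative decoder, latching the result as soon as all parity
    checks are satisfied rather than after a fixed amount of time."""
    for it in range(1, max_iters + 1):
        state = step(state)          # one decoder iteration (placeholder)
        x = decision(state)          # hard decision on the current state
        if is_codeword(H, x):
            return x, it             # early stop: valid codeword obtained
    return decision(state), max_iters
```

The same zero-syndrome test applies unchanged to the encoder, since the Jacobi iteration also terminates at a valid codeword.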
• In the case of undetected error events, the number of satisfied constraints may
provide some indication of the correctness of the codeword estimate. We are able
to extract soft information pertaining to the satisfaction of individual checks from
the soft-XOR gate cells. It may then be possible to combine this information using
other soft-logic gates, and provide an analog estimate of how many codeword bits
are erroneous. This may be useful for error tolerant systems which incorporate an
automatic-repeat-request (ARQ) protocol.
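In the probability domain, the soft-XOR relation combines independent bit probabilities as P(a ⊕ b = 1) = p_a(1 − p_b) + (1 − p_a)p_b; chaining it over the bits of a check gives a soft measure of how likely that check is to be satisfied. The sketch below illustrates the arithmetic; the way the per-check values are summed into an overall estimate is our illustration, not a circuit from this thesis.

```python
def soft_xor(p_a, p_b):
    """P(a XOR b = 1) for independent bits with P(a=1)=p_a, P(b=1)=p_b."""
    return p_a * (1.0 - p_b) + (1.0 - p_a) * p_b

def check_satisfaction(bit_probs):
    """Probability that a single parity check over these bits is satisfied,
    i.e. that an even number of them equal one."""
    p_odd = 0.0                        # XOR of an empty set of bits is 0
    for p in bit_probs:
        p_odd = soft_xor(p_odd, p)
    return 1.0 - p_odd

def expected_unsatisfied(checks):
    """Expected number of unsatisfied checks: a soft indication of how
    trustworthy the current codeword estimate is."""
    return sum(1.0 - check_satisfaction(c) for c in checks)
```

Note that soft_xor(0.5, p) = 0.5 for any p, reflecting that one completely uncertain bit makes the parity of the check completely uncertain.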
• Once the iterative Jacobi method arrives at a valid codeword it does not shift from
that state as further iterations are performed. We may therefore incorporate an
early stopping criterion into the architecture, so that the encoder stops once a valid
codeword is obtained. This may reduce the average computational latency of the
encode operation.
• The mode-switching gates introduced in Chapter 7 have further application to
built-in self-test circuits for analog decoders. Work is ongoing in this area in the
High Capacity Digital Communications Laboratory at the University of Alberta.
Moreover, these gates are not limited to use in an error control codec. They may be
used in any application that could benefit from being able to switch logic between
hard and soft operation.
Bibliography
[1] S. B. Wicker, Error Control Systems for Digital Communication and Storage. Prentice Hall, 1995.
[2] J. Hagenauer, E. Offer, and L. Papke, “Iterative decoding of binary block and convolutional codes,” IEEE Trans. Inform. Theory, vol. 42, no. 2, pp. 429–445, Mar. 1996.
[3] F. Lustenberger, “On the design of analog iterative VLSI decoders,” Ph.D. dissertation, ETH, Zurich, Switzerland, 2000.
[4] C. E. Shannon, “A mathematical theory of communication,” Bell System Technical Journal, vol. 27, pp. 379–423, 623–656, July, Oct. 1948.
[5] R. G. Gallager, Low-density parity-check codes. Cambridge, MA: MIT Press, 1963.
[6] ——, “Low-density parity check codes,” IRE Trans. Inform. Theory, vol. IT-8, pp. 21–28, Jan. 1962.
[7] V. Zyablov and M. S. Pinsker, “Estimation of the error-correcting complexity of Gallager low-density codes,” Problems of Info. Trans., vol. 11, no. 1, pp. 23–26, 1975.
[8] G. A. Margulis, “Explicit constructions of graphs without short cycles and low density codes,” Combinatorica, vol. 2, no. 1, pp. 71–78, 1982.
[9] R. M. Tanner, “A recursive approach to low complexity codes,” IEEE Trans. Inform. Theory, vol. IT-27, pp. 533–547, Sept. 1981.
[10] C. Berrou, A. Glavieux, and P. Thitimajshima, “Near Shannon limit error correcting coding and decoding: Turbo codes,” in Proc. International Conference on Communications (ICC 93), Geneva, Switzerland, 1993, pp. 1064–1070.
[11] J. Pearl, Probabilistic Reasoning in Intelligent Systems. San Mateo, CA: Morgan Kaufmann, 1988.
[12] R. J. McEliece, D. J. C. MacKay, and J. F. Cheng, “Turbo decoding as an instance of Pearl’s “belief propagation” algorithm,” IEEE J. Select. Areas Commun., vol. 16, pp. 140–152, Feb. 1998.
[13] B. J. Frey and F. R. Kschischang, “Probability propagation and iterative decoding,” in Proc. Allerton Conf. on Communication, Control and Computing, Allerton House, Monticello, IL, 1996.
[14] D. J. C. MacKay, “Good error correcting codes based on very sparse matrices,” IEEE Trans. Inform. Theory, vol. 45, no. 2, pp. 399–431, Mar. 1999.
[15] N. Wiberg, “Codes and decoding on general graphs,” Ph.D. dissertation, Univ. Linkoping, Linkoping, Sweden, 1996.
[16] D. J. C. MacKay and R. M. Neal, “Near Shannon limit performance of low density parity check codes,” Electron. Lett., vol. 33, no. 6, pp. 457–458, Mar. 1997.
[17] M. Sipser and D. A. Spielman, “Expander codes,” IEEE Trans. Inform. Theory, vol. 42, no. 6, pp. 1710–1722, Nov. 1996.
[18] G. D. Forney Jr., “Codes on graphs: News and views,” in Proc. Int. Symp. on Turbo Codes and Related Topics, Brest, France, 2000, pp. 9–16.
[19] N. Wiberg, R. Kotter, and H.-A. Loeliger, “Codes and iterative decoding on general graphs,” European Trans. on Telecommun., vol. 6, pp. 513–525, Sept./Oct. 1995.
[20] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Trans. Inform. Theory, vol. 47, no. 2, pp. 498–519, Feb. 2001.
[21] M. G. Luby, M. A. Shokrollahi, M. Mitzenmacher, and D. A. Spielman, “Improved low-density parity-check codes using irregular graphs,” IEEE Trans. Inform. Theory, vol. 47, no. 2, pp. 585–598, Feb. 2001.
[22] T. J. Richardson, M. A. Shokrollahi, and R. L. Urbanke, “Design of capacity-approaching irregular low-density parity-check codes,” IEEE Trans. Inform. Theory, vol. 47, no. 2, pp. 619–637, Feb. 2001.
[23] S.-Y. Chung, G. D. Forney Jr., T. J. Richardson, and R. L. Urbanke, “On the design of low-density parity-check codes within 0.0045 dB of the Shannon limit,” IEEE Commun. Lett., vol. 5, no. 2, pp. 58–60, Feb. 2001.
[24] C. Di, D. Proietti, I. E. Telatar, T. J. Richardson, and R. L. Urbanke, “Finite-length analysis of low-density parity-check codes on the binary erasure channel,” IEEE Trans. Inform. Theory, vol. 48, no. 6, pp. 1570–1579, Jun. 2002.
[25] T. Tian, C. Jones, J. D. Villasenor, and R. D. Wesel, “Construction of irregular LDPC codes with low error floors,” in Proc. ICC 2003, vol. 5, Anchorage, AK, 2003, pp. 3125–3129.
[26] J. Feldman, “Decoding error-correcting codes via linear programming,” Ph.D. dissertation, Massachusetts Institute of Technology, 2003.
[27] R. Kotter and P. O. Vontobel, “Graph-covers and iterative decoding of finite length codes,” in Proc. Int. Symp. on Turbo Codes and Related Topics, Brest, France, 2003, pp. 75–82.
[28] P. Vontobel and R. Kotter, “Lower bounds on the minimum pseudo-weight of linear codes,” in Proc. IEEE Int. Symp. on Inform. Theory, Chicago, IL, 2004, p. 70.
[29] J. L. Massey, Threshold decoding. Cambridge, MA: MIT Press, 1963.
[30] M. C. Davey, “Error-correction using low-density parity-check codes,” Ph.D. dissertation, Univ. Cambridge, Cavendish Laboratory, 1999.
[31] D. J. C. MacKay, Information theory, inference, and learning algorithms. Cambridge University Press, 2003.
[32] D. J. C. MacKay, S. T. Wilson, and M. C. Davey, “Comparison of constructions of irregular Gallager codes,” IEEE Trans. Communications, vol. 47, no. 10, pp. 1449–1454, Oct. 1999.
[33] F. R. Kschischang and B. J. Frey, “Iterative decoding of compound codes by probability propagation in graphical models,” IEEE J. Select. Areas Commun., vol. 16, no. 2, pp. 219–230, Feb. 1998.
[34] R. L. Graham, D. E. Knuth, and O. Patashnik, Concrete Mathematics. Reading, MA: Addison-Wesley, 1989.
[35] T. J. Richardson and R. L. Urbanke, “The capacity of low-density parity-check codes under message-passing decoding,” IEEE Trans. Inform. Theory, vol. 47, no. 2, pp. 599–618, Feb. 2001.
[36] G. D. Forney Jr., “Codes on graphs: Normal realizations,” IEEE Trans. Inform. Theory, vol. 47, no. 2, pp. 520–548, Feb. 2001.
[37] M. G. Luby, M. Mitzenmacher, M. A. Shokrollahi, D. A. Spielman, and V. Stemann, “Practical loss-resilient codes,” in Proc. 29th Symp. Theory Computing, 1997, pp. 150–159.
[38] S. M. Aji and R. J. McEliece, “The generalized distributive law,” IEEE Trans. Inform. Theory, vol. 46, no. 2, pp. 325–343, Mar. 2000.
[39] Y. Mao and A. H. Banihashemi, “Decoding low-density parity-check codes with probabilistic scheduling,” IEEE Commun. Lett., vol. 5, no. 10, pp. 414–416, Oct. 2001.
[40] D. J. C. MacKay and M. C. Davey, “Evaluation of Gallager codes for short block length and high rate applications,” in Codes, Systems and Graphical Models, ser. IMA Volumes in Mathematics and its Applications, B. Marcus and J. Rosenthal, Eds. New York: Springer-Verlag, 2000, vol. 123, pp. 113–130.
[41] B. Frey, R. Kotter, and A. Vardy, “Skewness and pseudocodewords in iterative decoding,” in Proc. IEEE Int. Symp. on Inform. Theory, Cambridge, MA, 1998, p. 148.
[42] B. J. Frey, R. Kotter, and A. Vardy, “Signal-space characterization of iterative decoding,” IEEE Trans. Inform. Theory, vol. 47, no. 2, pp. 766–781, Feb. 2001.
[43] M. P. C. Fossorier, M. Mihaljevic, and H. Imai, “Reduced complexity iterative decoding of low-density parity check codes based on belief propagation,” IEEE Trans. Communications, vol. 47, no. 5, pp. 673–680, May 1999.
[44] M. P. Fossorier, “Iterative reliability-based decoding of low-density parity check codes,” IEEE J. Select. Areas Commun., vol. 19, no. 5, pp. 908–917, May 2001.
[45] F. Guilloud, E. Boutillon, and J.-L. Danger, “λ-min decoding algorithm of regular and irregular LDPC codes,” in Proc. Int. Symp. on Turbo Codes and Related Topics, Brest, France, 2003, pp. 451–454.
[46] P. Zarrinkhat and A. Banihashemi, “Hybrid decoding of LDPC codes,” in Proc. Int. Symp. on Turbo Codes and Related Topics, Brest, France, 2003, pp. 503–506.
[47] S. Howard, “Two new LDPC decoding algorithms: Soft bit decoding and sub-min-sum decoding,” in 3rd Analog Decoder Workshop, Banff, Canada, 2004.
[48] L. Ping, W. K. Leung, and N. Phamdo, “Low density parity check codes with semi-random parity check matrix,” Electron. Lett., vol. 35, no. 1, pp. 38–39, Jan. 1999.
[49] M. Rashidpour and S. H. Jamali, “Low-density parity-check codes with simple irregular semi-random parity-check matrix for finite-length applications,” in Proc. IEEE Int. Symp. on Personal, Indoor and Mobile Communication, PIMRC2003, vol. 1, Beijing, China, 2003, pp. 439–443.
[50] M. Yang, W. E. Ryan, and L. Yan, “Design of efficiently encodable moderate-length high-rate irregular LDPC codes,” IEEE Trans. Communications, vol. 52, no. 4, pp. 564–571, Apr. 2004.
[51] R. Lucas, M. P. C. Fossorier, Y. Kou, and S. Lin, “Iterative decoding of one-step majority logic decodable codes based on belief propagation,” IEEE Trans. Communications, vol. 48, no. 6, pp. 931–937, June 2000.
[52] Y. Kou, S. Lin, and M. P. Fossorier, “Low density parity check codes based on finite geometries: A rediscovery and new results,” IEEE Trans. Inform. Theory, vol. 47, no. 7, pp. 2711–2736, Nov. 2001.
[53] R. L. Townsend and E. J. Weldon Jr., “Self orthogonal quasi-cyclic codes,” IEEE Trans. Inform. Theory, vol. 13, no. 2, pp. 183–195, Apr. 1967.
[54] M. Karlin, “New binary coding results by circulants,” IEEE Trans. Inform. Theory, vol. 15, pp. 81–92, 1969.
[55] S. J. Johnson and S. R. Weller, “A family of irregular LDPC codes with low encoding complexity,” IEEE Commun. Lett., vol. 7, no. 2, pp. 79–81, Feb. 2003.
[56] W. W. Peterson, Error Correcting Codes. Cambridge, MA: MIT Press, 1961.
[57] D. A. Spielman, “Linear-time encodable and decodable error-correcting codes,” IEEE Trans. Inform. Theory, vol. 42, no. 6, pp. 1723–1731, Nov. 1996.
[58] R. Echard and S. C. Chang, “The π-rotation low-density parity check codes,” in Proc. GLOBECOM 2001, vol. 2, San Antonio, TX, 2001, pp. 980–984.
[59] ——, “The extended irregular π-rotation low-density parity check codes,” IEEE Commun. Lett., vol. 7, no. 5, pp. 230–232, May 2003.
[60] T. Zhang and K. K. Parhi, “Joint (3,k)-regular LDPC code and decoder/encoder design,” IEEE Trans. Sig. Proc., vol. 52, no. 4, pp. 1065–1079, Apr. 2004.
[61] T. J. Richardson and R. L. Urbanke, “Efficient encoding of low-density parity-check codes,” IEEE Trans. Inform. Theory, vol. 47, no. 2, pp. 638–656, Feb. 2001.
[62] T. J. Richardson, A. Shokrollahi, and R. Urbanke, “Design of provably good low-density parity-check codes,” in Proc. Int. Symp. Information Theory, Sorrento, Italy, 2000, p. 199.
[63] S. M. Aji, G. B. Horn, and R. J. McEliece, “Iterative decoding on graphs with a single cycle,” in Proc. IEEE Int. Symp. on Inform. Theory, Cambridge, MA, 1998, p. 276.
[64] Y. Weiss, “Correctness of local probability propagation in graphical models with loops,” Neural Computation, vol. 12, pp. 1–41, 2000.
[65] G. D. Forney Jr., R. Koetter, F. R. Kschischang, and A. Reznik, “On the effective weights of pseudocodewords for codes defined on graphs with cycles,” Codes, Systems and Graphical Models, vol. 123, pp. 101–112, 2001.
[66] T. Etzion, A. Trachtenberg, and A. Vardy, “Which codes have cycle-free Tanner graphs?” IEEE Trans. Inform. Theory, vol. 45, no. 6, pp. 2173–2181, Sept. 1999.
[67] P. Vontobel, “Algebraic coding for iterative decoding,” Ph.D. dissertation, ETH, Zurich, Switzerland, 2003.
[68] S. Ikeda, T. Tanaka, and S.-I. Amari, “Information geometry of turbo and low-density parity-check codes,” IEEE Trans. Inform. Theory, vol. 50, no. 6, pp. 1097–1114, June 2004.
[69] J. A. McGowan and R. C. Williamson, “Loop removal from LDPC codes,” in Proc. Inform. Theory Workshop, Paris, France, 2003, pp. 230–233.
[70] J. Campello, D. S. Modha, and S. Rajagopalan, “Designing LDPC codes using bit-filling,” in Proc. ICC 2001, vol. 1, Helsinki, Finland, 2001, pp. 55–59.
[71] J. Campello and D. S. Modha, “Extended bit-filling and LDPC code design,” in Proc. GLOBECOM 2001, vol. 2, San Antonio, TX, 2001, pp. 985–989.
[72] R. M. Tanner, “Explicit concentrators from generalized N-gons,” SIAM J. Alg. Disc. Meth., vol. 5, no. 3, pp. 287–293, Sept. 1984.
[73] N. Alon, “Eigenvalues and expanders,” Combinatorica, vol. 6, no. 2, pp. 83–96, 1986.
[74] J. Friedman, “On the second eigenvalue and random walks in random d-regular graphs,” Combinatorica, vol. 11, no. 4, pp. 331–362, 1991.
[75] G. A. Margulis, “Explicit group-theoretic constructions of combinatorial schemes and their applications in the construction of expanders and concentrators,” Problems Inform. Transmission, vol. 24, no. 1, pp. 39–46, 1988.
[76] A. Lubotzky, R. Phillips, and P. Sarnak, “Ramanujan graphs,” Combinatorica, vol. 8, no. 3, pp. 261–277, 1988.
[77] J. Lafferty and D. Rockmore, “Codes and iterative decoding on algebraic expander graphs,” in Proc. Int. Symp. on Inform. Theory and Its Applications, Honolulu, HI, 2000, p. 276.
[78] J. Rosenthal and P. O. Vontobel, “Constructions of LDPC codes using Ramanujan graphs and ideas from Margulis,” in Proc. Allerton Conf. on Communication, Control and Computing, Allerton House, Monticello, IL, 2000, pp. 248–257.
[79] D. J. C. MacKay and M. S. Postol, “Weaknesses of Margulis and Ramanujan-Margulis low-density parity-check codes,” in Proc. 2nd Irish Conference on the Mathematical Foundations of Comp. Sci. and Info. Tech., MFCSIT2002, ser. Electronic Notes in Theoretical Computer Science, vol. 74, Galway, Ireland, 2003.
[80] E. W. Weisstein, “Spanning tree,” [Online] http://mathworld.wolfram.com/SpanningTree.html.
[81] S.-Y. Chung, T. J. Richardson, and R. L. Urbanke, “Analysis of sum-product decoding of low-density parity-check codes using a Gaussian approximation,” IEEE Trans. Inform. Theory, vol. 47, pp. 657–670, Feb. 2001.
[82] S. ten Brink, G. Kramer, and A. Ashikhmin, “Design of low-density parity-check codes for modulation and detection,” IEEE Trans. Communications, vol. 52, no. 4, pp. 670–678, Apr. 2004.
[83] M. C. Davey and D. MacKay, “Low-density parity check codes over GF(q),” IEEE Commun. Lett., vol. 2, no. 6, pp. 165–167, June 1998.
[84] D. J. C. MacKay and M. C. Davey, “Low density parity check codes over GF(q),” in Proc. Inform. Theory Workshop, Killarney, Ireland, 1998, pp. 70–71.
[85] P. O. Vontobel and R. M. Tanner, “Construction of codes based on finite generalized quadrangles for iterative decoding,” in Proc. IEEE Int. Symp. on Inform. Theory, Washington, DC, 2001, p. 223.
[86] S. J. Johnson and S. R. Weller, “Codes for iterative decoding from partial geometries,” IEEE Trans. Communications, vol. 52, no. 2, pp. 236–243, Feb. 2004.
[87] ——, “Structured low-density parity-check codes over non-binary fields,” in Proc. 5th Australian Communications Theory Workshop, Newcastle, Australia, 2004, pp. 18–22.
[88] D. Haley, C. Winstead, A. Grant, and C. Schlegel, “An analog LDPC codec core,” in Proc. Int. Symp. on Turbo Codes and Related Topics, Brest, France, 2003, pp. 391–394.
[89] A. Blanksby and C. Howland, “A 690-mW 1-Gb/s 1024-b, rate-1/2 low-density parity-check code decoder,” IEEE Journal of Solid-State Circuits, vol. 37, no. 3, pp. 404–412, Mar. 2002.
[90] C. Winstead, J. Dai, W. J. Kim, S. Little, Y. B. Kim, C. Myers, and C. Schlegel, “Analog MAP decoder for (8,4) Hamming code in subthreshold CMOS,” in Proc. ARVLSI (Advanced Research in VLSI), Salt Lake City, Utah, 2001, pp. 132–147.
[91] J. Bond, S. Hui, and H. Schmidt, “Constructing low-density parity-check codes with circulant matrices,” in Proc. Inform. Theory and Networking Workshop, Metsovo, Greece, 1999, p. 52.
[92] ——, “Constructing low-density parity-check codes,” in EUROCOMM 2000, Munich, Germany, 2000, pp. 260–262.
[93] D. Haley, A. Grant, and J. Buetefuer, “Iterative encoding of low-density parity-check codes,” in Proc. GLOBECOM 2002, vol. 2, Taipei, Taiwan, 2002, pp. 1289–1293.
[94] ——, “Iterative encoding of low-density parity-check codes,” in Proc. 3rd Australian Communications Theory Workshop, Canberra, Australia, 2002, pp. 15–17.
[95] D. Haley and A. Grant, “High rate reversible LDPC codes,” in Proc. 5th Australian Communications Theory Workshop, Newcastle, Australia, 2004, pp. 114–117.
[96] G. Strang, Linear Algebra and its Applications, 3rd ed. Saunders College Publishing, 1988.
[97] T. Shibuya and K. Sakaniwa, “Construction of cyclic codes suitable for iterative decoding via generating idempotents,” IEICE Trans. on Fundamentals, vol. E86-A, no. 4, pp. 928–939, 2003.
[98] D. J. C. MacKay, “Encyclopedia of sparse graph codes,” [Online] http://wol.ra.phy.cam.ac.uk/mackay/codes/.
[99] A. Vardy, “The intractability of computing the minimum distance of a code,” IEEE Trans. Inform. Theory, vol. 43, no. 6, pp. 1757–1766, Nov. 1997.
[100] R. M. Tanner, “Minimum-distance bounds by graph analysis,” IEEE Trans. Inform. Theory, vol. 47, no. 2, pp. 808–821, Feb. 2001.
[101] ——, “Spectral graphs for quasi-cyclic LDPC codes,” in Proc. IEEE Int. Symp. on Inform. Theory, Washington, DC, 2001, p. 226.
[102] E. W. Weisstein, “Circulant determinant,” [Online] http://mathworld.wolfram.com/CirculantDeterminant.html.
[103] E. Boutillon, J. Castura, and F. R. Kschischang, “Decoder first code design,” in Proc. Int. Symp. on Turbo Codes and Related Topics, Brest, France, 2000, pp. 459–462.
[104] A. S. Acampora and R. P. Gilmore, “Analog Viterbi decoding for high speed digital satellite channels,” IEEE Trans. Communications, vol. COM-26, pp. 1463–1470, Oct. 1978.
[105] M. S. Shakiba, D. A. Johns, and K. W. Martin, “BiCMOS circuits for analog Viterbi decoders,” IEEE Trans. on Circuits and Systems II: Analog and Digital Signal Processing, vol. 45, no. 12, pp. 1527–1537, Dec. 1998.
[106] H.-A. Loeliger, F. Tarkoy, F. Lustenberger, and M. Helfenstein, “Decoding in analog VLSI,” IEEE Commun. Magazine, vol. 37, no. 4, pp. 99–101, Apr. 1999.
[107] H.-A. Loeliger, F. Lustenberger, M. Helfenstein, and F. Tarkoy, “Probability propagation and decoding in analog VLSI,” IEEE Trans. Inform. Theory, vol. 47, no. 2, pp. 837–843, Feb. 2001.
[108] J. Hagenauer and M. Winklhofer, “The analog decoder,” in Proc. IEEE Int. Symp. on Inform. Theory, Cambridge, MA, 1998, p. 145.
[109] C. A. Mead, Analog VLSI and Neural Systems, ser. Addison Wesley Computation and Neural Systems Series. Reading, MA: Addison Wesley, 1989.
[110] V. Gaudet and A. Rapley, “Iterative decoding using stochastic computation,” Electron. Lett., vol. 39, no. 3, pp. 299–301, Feb. 2003.
[111] A. Rapley, C. Winstead, V. Gaudet, and C. Schlegel, “LDPC decoder design using stochastic computation,” in 3rd Analog Decoder Workshop, Banff, Canada, 2004.
[112] H. P. Schmid, “Single-amplifier biquadratic MOSFET-C filters,” Ph.D. dissertation, Swiss Federal Institute of Technology, Zurich, Switzerland, 2000.
[113] V. C. Gaudet, “Architecture and implementation of analog iterative decoders,” Ph.D. dissertation, Univ. Toronto, Toronto, Canada, 2003.
[114] C. Winstead, J. Dai, S. Yu, C. Myers, R. R. Harrison, and C. Schlegel, “CMOS analog MAP decoder for (8,4) Hamming code,” IEEE Journal of Solid-State Circuits, vol. 39, no. 1, pp. 122–131, Jan. 2004.
[115] C. Winstead et al., “A CMOS analog (16, 11)2 turbo product decoder,” in 3rd Analog Decoder Workshop, Banff, Canada, 2004, http://www.analogdecoding.org/docs/Winstead ADW04.pdf.
[116] V. Gaudet, R. Gaudet, and G. Gulak, “Programmable interleaver design for analog iterative decoders,” IEEE Trans. on Circuits and Systems II, vol. 49, no. 7, pp. 457–464, July 2002.
[117] V. Gaudet and G. Gulak, “A 13.3 Mbps 0.35 µm CMOS analog turbo decoder IC with a configurable interleaver,” IEEE Journal of Solid-State Circuits, vol. 38, no. 11, pp. 2010–2015, Nov. 2003.
[118] C. Winstead, N. Nguyen, C. Schlegel, and V. C. Gaudet, “Low-voltage CMOS circuits for analog decoders,” in Proc. Int. Symp. on Turbo Codes and Related Topics, Brest, France, 2003, pp. 271–274.
[119] D. Nguyen, “Implementation of low voltage analog decoders,” in 3rd Analog Decoder Workshop, Banff, Canada, 2004.
[120] F. Lustenberger, M. Helfenstein, H.-A. Loeliger, F. Tarkoy, and G. S. Moschytz, “All-analog decoder for a binary (18,9,5) tail-biting trellis code,” in Proc. European Solid-State Circuits Conference, Duisburg, 1999, pp. 362–365.
[121] M. Morz, T. Gabara, R. Yan, and J. Hagenauer, “An analog 0.25 µm BiCMOS tailbiting MAP decoder,” in Proc. International Solid-State Circuits Conference, San Francisco, 2000, pp. 356–357.
[122] A. Xotta, D. Vogrig, A. Gerosa, A. Neviani, A. Graell i Amat, G. Montorsi, M. Bruccoleri, and G. Betti, “An all-analog CMOS implementation of a turbo decoder for hard-disk drive read channels,” in Proc. IEEE Int. Symp. on Circuits and Systems, ISCAS 2002, Arizona, 2002, pp. V–69–V–72.
[123] A. Graell i Amat, S. Benedetto, G. Montorsi, D. Vogrig, A. Neviani, and A. Gerosa, “An analog decoder for the UMTS standard,” in Proc. IEEE Int. Symp. on Inform. Theory, Chicago, 2004, p. 296.
[124] P. Merkli, H.-A. Loeliger, and M. Frey, “Measurements and observations on analog decoders for an [8,4,4] extended Hamming code,” in 2nd Analog Decoder Workshop, Zurich, Switzerland, 2003.
[125] S. Hemati and A. Banihashemi, “Full CMOS min-sum analog iterative decoder,” in Proc. IEEE Int. Symp. on Inform. Theory, Yokohama, Japan, 2003, p. 347.
[126] C. Winstead, C. Myers, C. Schlegel, and R. Harrison, “Analog decoding of product codes,” in Proc. Information Theory Workshop, Cairns, Australia, 2001, pp. 131–133.
[127] J. Dai, “Design methodology for analog VLSI implementations of error control decoders,” Ph.D. dissertation, Univ. Utah, Utah, 2001.
[128] F. Lustenberger and H.-A. Loeliger, “On mismatch errors in analog-VLSI error correcting decoders,” in Proc. IEEE Int. Symp. on Circuits and Systems, ISCAS 2001, vol. 4, Sydney, Australia, 2001, pp. 198–201.
[129] M. Frey, H.-A. Loeliger, F. Lustenberger, P. Merkli, and P. Strebel, “Analog-decoder experiments with subthreshold CMOS soft-gates,” in Proc. IEEE Int. Symp. on Circuits and Systems, ISCAS 2003, vol. 1, 2003, pp. 85–88.
[130] C. Winstead and C. Schlegel, “Density evolution analysis of device mismatch in analog decoders,” in Proc. IEEE Int. Symp. on Inform. Theory, Chicago, 2004, p. 293.
[131] S. Hemati and A. Banihashemi, “Comparison between continuous-time asynchronous and discrete-time synchronous iterative decoding,” in Proc. GLOBECOM 2004, Dallas, TX, 2004, to appear.
[132] M. Morz and J. Hagenauer, “Decoding of convolutional codes using an analog ring decoder,” in 2nd Analog Decoder Workshop, Zurich, Switzerland, 2003.
[133] A. F. Mondragon-Torres and E. Sanchez-Sinencio, “Floating gate analog implementation of the additive soft-input soft-output decoding algorithm,” in Proc. IEEE Int. Symp. on Circuits and Systems, ISCAS 2002, Arizona, 2002, pp. 89–92.
[134] J. Hagenauer, E. Offer, C. Measson, and M. Morz, “Decoding and equalization with analog non-linear networks,” European Trans. on Telecomm. (ETT), vol. 10, no. 6, pp. 659–679, Nov.–Dec. 1999.
[135] M. Frey and P. Merkli, “An analog circuit that locks onto a pseudo-noise signal,” in 3rd Analog Decoder Workshop, Banff, Canada, 2004.
[136] C. Winstead and C. Schlegel, “Importance sampling for SPICE-level verification of analog decoders,” in Proc. IEEE Int. Symp. on Inform. Theory, Yokohama, Japan, 2003, p. 103.
[137] D. Haley, C. Winstead, A. Grant, V. Gaudet, and C. Schlegel, “Robust analog decoder design with mode-switching cells,” in 3rd Analog Decoder Workshop, Banff, Canada, 2004.
[138] M. Yiu et al., “A digital built-in self-test approach for analog iterative decoders,” in 3rd Analog Decoder Workshop, Banff, Canada, 2004.
[139] B. Razavi, Design of Analog CMOS Integrated Circuits. New York: McGraw-Hill, 2001.
[140] C. Winstead, V. Gaudet, and C. Schlegel, “Analog iterative decoding of error control codes,” in Canadian Conf. on Electrical and Computer Eng., vol. 3, Montreal, Canada, 2003, pp. 1539–1542.
[141] J. Hagenauer, M. Morz, and A. Schaefer, “Analog decoders and receivers for high-speed applications,” Int. Zurich Seminar on Broadband Commun., Access, Transmission, Networking, pp. 3-1–3-8, Feb. 2002.
[142] M. Morz, J. Hagenauer, and E. Offer, “On the analog implementation of the APP (BCJR) algorithm,” in Proc. IEEE Int. Symp. on Inform. Theory, Sorrento, Italy, 2000, p. 425.
[143] M. Helfenstein, F. Lustenberger, H.-A. Loeliger, F. Tarkoy, and G. S. Moschytz, “High-speed interfaces for analog, iterative decoders,” in Proc. IEEE Int. Symp. on Circuits and Systems, ISCAS ’99, vol. II, Orlando, Florida, 1999, pp. 424–427.
[144] D. Haley, C. Winstead, A. Grant, and C. Schlegel, “Architectures for error control in analog subthreshold CMOS,” in Proc. 4th Australian Communications Theory Workshop, Melbourne, Australia, 2003, pp. 75–80.
[145] D. Haley, C. Winstead, A. Grant, V. Gaudet, and C. Schlegel, “Reusing analog decoders for encoding,” in 2nd Analog Decoder Workshop, Zurich, Switzerland, 2003, http://www.isi.ee.ethz.ch/adw/slides/haley.pdf.