International Journal of Scientific Research and Engineering Studies (IJSRES)
Volume 1 Issue 3, September 2014
ISSN: 2349-8862
www.ijsres.com Page 65
Constant Log BCJR Turbo Decoder With Pipelined Architecture
Aiswarya .S
Student
M. Tech, Embedded Systems
Sree Buddha College of Engineering
Alappuzha
Ragimol
Asst. Professor
Dept. of ECE
Sree Buddha College of Engineering
Alappuzha
Abstract: Error correcting codes are required to recover the correct data from a signal corrupted by noise and interference. Among the many error correcting codes, turbo codes are considered the best because their performance is very close to the Shannon theoretical limit. The MAP algorithm is commonly used in the turbo decoder, and among the different versions of the MAP algorithm, the constant log BCJR algorithm has low complexity and good error performance. The constant log BCJR algorithm can be easily designed using a look-up table, which reduces energy consumption. In this work, the proposed decoder is designed to decode two blocks of data at a time, which increases the throughput at the cost of additional hardware. The complexity is reduced by the use of add compare select (ACS) units and registers. The proposed decoder is coded in Verilog and synthesized for a Spartan-3 device. In comparison to the LUT log BCJR decoder, the proposed design has lower power consumption and memory usage: the constant log BCJR architecture shows a 7.06% gate reduction and a 38.08% power reduction. The proposed design also shows good BER performance.
Keywords: Turbo decoder, MAP, ACS, Constant log
BCJR algorithm
I. INTRODUCTION
Over the years, there has been an increase in the use of digital communication in computer, cellular and satellite systems. In digital communication, the information is binary data which is modulated onto an analog waveform and transmitted through a communication channel, where it may be corrupted by noise. At the receiver, the corrupted signal is demodulated and converted back into binary bits; because of the noise, this data may not be a correct replica of the transmitted signal. A channel coding scheme is therefore used to protect against noise and interference.
Turbo codes are known to be among the best forward error correcting codes: they provide reliable data transmission within half a decibel of the Shannon limit [12]. The iterative nature of the turbo decoding algorithm, however, increases complexity. The two decoding algorithms used for turbo codes are Maximum A Posteriori (MAP) and the Soft Output Viterbi Algorithm (SOVA) [5]. The MAP algorithm is based on the most likely data sequence, while SOVA is based on the most likely connected path through the trellis. The MAP algorithm performs better than SOVA at low SNR, but a decoder based on the MAP algorithm is more difficult to implement.
To reduce its computational complexity, the MAP algorithm is usually implemented in the logarithmic domain; the corresponding algorithm is the log-MAP algorithm [7]. The log-MAP algorithm can be simplified further, giving the max-log-MAP, constant log MAP and linear log MAP algorithms [5]. The log-MAP algorithm can be implemented using a look-up table, and the LUT-log-BCJR decoder is based on this approach. By using the constant log-BCJR (constant log-MAP) algorithm, the number of look-up table elements can be reduced compared to the LUT-log-BCJR algorithm [1]. Since the energy consumption of the decoder cannot be reduced simply by lowering the clock frequency and throughput, energy is instead saved through low complexity hardware. A new architecture is therefore developed which contains only registers and add compare select units for the constant log-BCJR block. The proposed design also works on two blocks of data at a time, which doubles the throughput.
Section II discusses the conventional architecture of the decoder, and Section III deals with the proposed architecture. Simulation and comparison results are explained in Section IV, where it is found that area is reduced by 7.06% and power by 38.08%. Section V concludes the paper.
II. SYSTEM ARCHITECTURE
The turbo code architecture has an encoder part and a decoder part; the overall structure is shown in figure 1. The encoder is the parallel concatenation of recursive systematic convolutional encoders [14-15]: each convolutional encoder has a feedback structure so as to produce high weight codes. If parity 1 is low weight, then because of the interleaver the chance that parity 2 is also low weight is very small, so the encoded codeword is high weight.
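The parallel concatenation described above can be sketched in software. The generator polynomial (1, 5/7 in octal), the block length and the interleaver pattern below are illustrative assumptions, not taken from the paper:

```python
def rsc_encode(bits, state=0):
    """Recursive systematic convolutional encoder, assumed G = (1, 5/7)
    octal with memory 2: feedback 1 + D + D^2, feedforward 1 + D^2."""
    parity = []
    for u in bits:
        s1, s2 = state >> 1, state & 1
        a = u ^ s1 ^ s2        # feedback bit
        parity.append(a ^ s2)  # feedforward output
        state = (a << 1) | s1  # shift register update
    return parity

def turbo_encode(bits, perm):
    """Parallel concatenation: systematic bits plus two parity streams,
    the second computed on the interleaved sequence."""
    systematic = list(bits)
    parity1 = rsc_encode(bits)
    parity2 = rsc_encode([bits[i] for i in perm])  # interleaved input
    return systematic, parity1, parity2
```

Because the encoders are recursive, an input that produces a low-weight parity 1 is very unlikely, after interleaving, to also produce a low-weight parity 2, which is exactly the high-weight argument made above.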
Figure 1: Turbo encoder and decoder block diagram
In the channel, the code is corrupted by noise and interference. In a typical turbo decoding system, two decoders operate iteratively and pass their decisions to each other after each iteration. The turbo decoder generally uses the MAP algorithm in at least one of the component decoders. The decoding process begins by receiving partial information from the channel (the systematic output Yk^s and parity 1, Yk^p1) and passing it to the first decoder. The rest of the information, parity 2 (Yk^p2), goes to the second decoder, which waits for the output of the first decoder. While the second decoder is waiting, the first decoder makes an estimate of the transmitted information, interleaves it to match the format of parity 2, and sends it to the second decoder. The second decoder combines this estimate with the channel information, and its re-estimate is looped back to the first decoder, where the process starts again.
The turbo decoding process can be explained as follows. The encoded information sequence Xk is transmitted over an Additive White Gaussian Noise (AWGN) channel, and a noisy received sequence Yk is obtained. Each decoder calculates the Log-Likelihood Ratio (LLR) for the k-th data bit dk as

L(dk) = log( P(dk = 1 | Y) / P(dk = 0 | Y) )   (1)

L(dk) can be found using different algorithms [5], [7], [11]. The LLR can be decomposed into three independent terms:

L(dk) = L_apri(dk) + L_c(dk) + L_e(dk)   (2)

where L_apri(dk) is the a-priori information of data bit dk, L_c(dk) is the channel measurement, and L_e(dk) is the extrinsic information from the other decoder. The extrinsic information of the second decoder is the a-priori information of the first decoder.
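As a numeric illustration of eqs. (1)-(2), the sketch below (with invented probabilities and LLR values, not from the paper) computes an LLR from a posterior probability and extracts the extrinsic term that one decoder would pass to the other:

```python
import math

def llr(p1):
    """Eq. (1): log-likelihood ratio from the posterior P(dk = 1 | Y)."""
    return math.log(p1 / (1.0 - p1))

def extrinsic(total_llr, l_apriori, l_channel):
    """Eq. (2) rearranged: the extrinsic information is what remains of
    L(dk) after removing the a-priori and channel terms."""
    return total_llr - l_apriori - l_channel
```

Only the extrinsic value is exchanged between the decoders; feeding back the full LLR would re-count the channel and a-priori information on every iteration.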
III. CONSTANT LOG BCJR ARCHITECTURE
The decoder processes two blocks of data at a time. It has two soft input soft output (SISO) decoders which receive data as soft (multibit) inputs. In the first iteration only SISO1 works; it produces a soft output which is interleaved and given to SISO2. SISO2 does not work in the first iteration; after receiving the soft output from SISO1 it begins to calculate its own soft output, which is deinterleaved and given to SISO1, and this process continues for a certain number of iterations. While blocks B1 and B2 are being processed in the decoder, two new blocks B3 and B4 are buffered in.
One level of pipelining across the blocks is used, with separate hardware for SISO1 and SISO2, so that two blocks of data are processed at a time. SISO1 receives systematic 1 and parity 1, and SISO2 receives systematic 2 and parity 2. Both SISOs receive blocks in the forward and reverse order, so each can produce two outputs for the other decoder: a forward output and a backward output. If 10 iterations are required for the operation of the decoder, then 20 iterations are needed for the decoding of two blocks of data.
Both SISOs take turns processing the sequence and exchange soft information until it converges to a range. The hard decision is then made from the final soft information sequence. The block diagram of the decoder is shown in fig. 2.
Figure 2: Decoder block diagram
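The interleaver and deinterleaver of fig. 2 apply a fixed permutation whose coefficients are held in a ROM. A minimal software model, with an assumed 5-element coefficient table, is:

```python
# Assumed interleaver coefficient table (the "ROM"); a real design would
# store the coefficients for the actual block length.
ROM = [3, 0, 4, 1, 2]

def interleave(block):
    """Read the block in the permuted order given by the ROM."""
    return [block[i] for i in ROM]

def deinterleave(block):
    """Invert the permutation, restoring the original order."""
    out = [None] * len(block)
    for pos, i in enumerate(ROM):
        out[i] = block[pos]
    return out
```

Storing only the coefficient table, rather than dedicated permutation logic, is what keeps the interleaver area small.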
The input buffer receives the systematic bits and parity bits. The control unit generates control signals for the operation of the SISOs, input buffer and hard decision units. The interleaver permutes the data; in order to reduce the interleaver area, the interleaver coefficients are stored in a ROM. The deinterleaver has the same circuit as the interleaver. The hard decision is taken such that if the soft output after a particular iteration is negative, the hard bit is '1'. The internal diagram of the SISO is shown in figure 3. The SISO works using the constant log BCJR algorithm. First the branch metric is calculated; it is either initialized to zero or to the a-priori probability [1]. To find L(dk), the forward recursion metric must be found by traversing the trellis in the forward direction, and the backward recursion metric by traversing from the last data of the block.
L(dk) = max*_(Sk-1, Sk: dk=1) ( γ1(Sk-1, Sk) + αk-1(Sk-1) + βk(Sk) )
      - max*_(Sk-1, Sk: dk=0) ( γ0(Sk-1, Sk) + αk-1(Sk-1) + βk(Sk) )   (3)
The forward metric can be found using the following equation for k from 0 to N, where k is the time index and N is the block length:

αk(Sk) = max*_(Sk-1, i) ( γi(Sk-1, Sk) + αk-1(Sk-1) )   (4)
The reverse metric β can be found from N to 0:

βk(Sk) = max*_(Sk+1, i) ( γi(Sk, Sk+1) + βk+1(Sk+1) )   (5)
The branch metric from one state to another can be found using the following equation:
γi(Sk-1, Sk) = K + ln( P(dk = i) ) + (2/N0) ( yk^s xk^s(i) + yk^p xk^p(i) )   (6)

where K is a constant, N0 is the noise spectral density, yk^s and yk^p are the received systematic and parity values, and xk^s(i) and xk^p(i) are the corresponding transmitted symbols for input bit i.
The constant-log-MAP algorithm approximates the Jacobi logarithm, and max* is calculated as

max*(x, y) = max(x, y) + { 0 if |y - x| > T; C if |y - x| <= T }   (7)
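Eq. (7) can be contrasted with the exact Jacobi logarithm, max*(x, y) = max(x, y) + ln(1 + e^-|x-y|). The sketch below uses C = 0.5 and T = 1.5, values commonly quoted for the constant-log-MAP algorithm; treat them as assumptions rather than the paper's exact parameters:

```python
import math

T, C = 1.5, 0.5  # assumed threshold and correction constant

def max_star_exact(x, y):
    """Exact Jacobi logarithm: max(x, y) plus a correction term."""
    return max(x, y) + math.log1p(math.exp(-abs(x - y)))

def max_star_const(x, y):
    """Eq. (7): the correction is the constant C when the inputs are
    close (|y - x| <= T) and zero otherwise."""
    return max(x, y) + (C if abs(y - x) <= T else 0.0)
```

Replacing the table of ln(1 + e^-|x-y|) values with a single constant is what shrinks the look-up table relative to the LUT-log-BCJR decoder.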
Figure 3: SISO architecture
This SISO architecture (fig. 3) produces two outputs, sof and sob. The sof output is the soft output for one half of the data block, computed in the forward order, and sob is the soft output for the other half, computed in the reverse direction. The block receives the soft inputs sif and sib from the decoder.
Mf and Mb units: these calculate the branch metrics.
FCAL/BCAL UNIT:
This unit consists of registers and four parallel metric calculation units; a single metric calculation unit is shown in fig. 4. The ACS (add compare select) unit consists of two adders together with a compare and select stage built from a max* unit. The adder unit calculates the sum of the forward/reverse state metric and the branch metric for the 0-transition and the 1-transition branch.
Figure 4: Single F/BCAL unit
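What the F/BCAL datapaths compute can be modeled in software: add (state metric plus branch metric), then compare-select via max*, swept forward for eq. (4), backward for eq. (5), and combined in eq. (3). The 4-state trellis, the constants C and T, and the LLR-based branch metric below are illustrative assumptions:

```python
T, C = 1.5, 0.5                      # assumed constant-log-MAP parameters
NEG = -1e9                           # "log zero" for unreachable states

def max_star(x, y):
    return max(x, y) + (C if abs(x - y) <= T else 0.0)

def step(state, bit):
    """One transition of an assumed 4-state RSC trellis, G = (1, 5/7):
    returns (next_state, parity_bit)."""
    s1, s2 = state >> 1, state & 1
    a = bit ^ s1 ^ s2
    return (a << 1) | s1, a ^ s2

def siso(sys_llr, par_llr, apriori):
    """Eqs. (3)-(5): forward/backward recursions and LLR output."""
    n, S = len(sys_llr), 4

    def gamma(k, s, b):              # branch metric, eq. (6) up to a constant
        _, p = step(s, b)
        return 0.5 * (b * (sys_llr[k] + apriori[k]) + p * par_llr[k])

    # Forward recursion, eq. (4): ACS over the branches into each state.
    alpha = [[NEG] * S for _ in range(n + 1)]
    alpha[0][0] = 0.0                # trellis assumed to start in state 0
    for k in range(n):
        for s in range(S):
            for b in (0, 1):
                ns, _ = step(s, b)
                alpha[k + 1][ns] = max_star(alpha[k + 1][ns],
                                            alpha[k][s] + gamma(k, s, b))

    # Backward recursion, eq. (5); unterminated trellis: all end states equal.
    beta = [[0.0] * S for _ in range(n + 1)]
    for k in range(n - 1, -1, -1):
        for s in range(S):
            beta[k][s] = NEG
            for b in (0, 1):
                ns, _ = step(s, b)
                beta[k][s] = max_star(beta[k][s],
                                      beta[k + 1][ns] + gamma(k, s, b))

    # LLR, eq. (3): best bit-1 transitions minus best bit-0 transitions.
    out = []
    for k in range(n):
        m = {0: NEG, 1: NEG}
        for s in range(S):
            for b in (0, 1):
                ns, _ = step(s, b)
                m[b] = max_star(m[b], alpha[k][s] + gamma(k, s, b)
                                + beta[k + 1][ns])
        out.append(m[1] - m[0])
    return out
```

The hardware replaces the inner loops with parallel ACS units and registers; the arithmetic per transition (two adds, one max* compare-select) is the same.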
The max* unit for the forward and backward metric calculators is given in fig. 5 [1]. R1 and R2 are loaded with the values from the adders; the unit takes its inputs from registers R1 and R2 and writes the result to R3. The four forward and four backward state metrics are stored in registers for the next metric calculation.
Figure 5: Max* unit
The max* operation is calculated as follows.
OPERATION 1: O = 10110. In this clock cycle the max operation is performed: the inputs are loaded into x~ and y~, and the result C0 determines max(x~, y~).
OPERATION 2: O = 11001. x~ is loaded with the value T and the result of the first operation is loaded into y~. C1 records the outcome of the test |x~ - y~| > T.
OPERATION 3: O = 00000. x~ is provided with max(x~, y~) and y~ is loaded with the constant T. If C1 = 0, max(x~, y~) is added to C; if C1 = 1, max(x~, y~) is added to 0.
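The three operations above can be modeled as a sequence of register transfers. The control-word values of O are taken as given; T = 1.5 and C = 0.5 are assumed for illustration:

```python
T, C = 1.5, 0.5   # assumed threshold and correction constant

def max_star_sequenced(x, y):
    """Three-step model of the Max* unit's register sequence."""
    # Operation 1: load x and y, compute and register the maximum (R3).
    r3 = max(x, y)
    # Operation 2: compare the input difference with the threshold T;
    # the flag C1 records whether |x - y| exceeds T.
    c1 = abs(x - y) > T
    # Operation 3: add the correction C to the registered maximum,
    # or 0 when C1 indicates the difference exceeded T.
    return r3 + (0.0 if c1 else C)
```

Sequencing the three steps through one adder/comparator, rather than instantiating separate hardware for each, is what keeps the unit small.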
ADD unit:
This unit receives three inputs: the branch metric M and the state metrics F and B. The circuit consists of two adders for summing the three inputs, producing M + F + B for the 0-state and 1-state transitions.
COMPARE AND SELECT:
This unit compares the two outputs from the adder; it contains 6 max* units for calculating the temporary soft outputs.
SUM/COMP:
This unit combines the temporary outputs with the soft inputs to form the final soft outputs.
IV. RESULTS AND DISCUSSION
The proposed architecture is simulated in VHDL and synthesized for a Spartan-3 device, giving the following results. The simulation result of the proposed architecture is shown in fig. 6.
Figure 6: Simulation results of the constant log BCJR decoder
The constant log BCJR algorithm and the LUT log BCJR algorithm [1] were designed for the same architecture; a comparison based on that architecture is shown in table 1.
Parameter                                     LUT log BCJR   Constant log BCJR
No. of 4-input LUTs                           23,235         21,708
No. of occupied slices                        12,640         11,547
No. of slices containing only related logic   12,640         11,547
No. of slices containing unrelated logic      0              0
No. of 4-input LUTs used as logic             23,235         21,708
Total equivalent gate count for design        193,329        179,672
Power consumption                             140 mW         88 mW
Table 1: Comparison on the basis of the synthesis report
From the synthesis report it is found that the total gate count is reduced by 7.06%, and from the power analysis of both architectures it is found that power is reduced by 38.02%. The design, simulated in MATLAB, shows an average BER of 3.858 x 10^-4 at 10 iterations.
V. CONCLUSION
The study of turbo codes starts from error correction codes in information theory. Turbo codes are derived from convolutional codes: a turbo code is the parallel concatenation of recursive systematic convolutional codes, and turbo codes are considered to have high BER performance. Turbo encoding is very easy, but decoding is difficult, and there are many decoding algorithms; from these, the constant log BCJR algorithm is used for our decoder architecture. A low complexity turbo decoder with the constant log BCJR algorithm is designed, in which two blocks of data are processed at the same time. Although the hardware requirement is higher, the advantage is that the throughput is greater because two blocks of data are processed at a time. When the proposed architecture is compared with the LUT log architecture on the same design, it is found that area is reduced by 7.06% and power consumption by 38.08%. The proposed architecture also has better BER performance.
In future work, a decoder architecture can be designed which processes several blocks of input in parallel, as well as a decoder which decodes data of various block lengths.
REFERENCES
[1] Liang Li, Robert G. Maunder, Bashir M. Al-Hashimi, and Lajos Hanzo, "A Low-Complexity Turbo Decoder Architecture for Energy-Efficient Wireless Sensor Networks," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 21, no. 1, pp. 14-22, January 2013.
[2] L. Li, R. G. Maunder, B. M. Al-Hashimi, and L. Hanzo,
“An energy-efficient error correction scheme for IEEE
802.15.4 wireless sensor networks,” Trans. Circuits Syst.
II, vol. 57, no. 3, pp. 233–237, 2010.
[3] Stylianos Papaharalabos, P. Takis Mathiopoulos, Guido Masera, and Maurizio Martina, "On Optimal and Near-Optimal Turbo Decoding Using Generalized max* Operator," IEEE Communications Letters, 2009.
[4] Z. Wang, “High-speed recursion architectures for MAP-
Based turbo decoders,” IEEE Trans. Very Large Scale
Integr. (VLSI) Syst., vol. 15, no. 4, pp. 470–474, Apr.
2007
[5] Z. He, P. Fortier, and S. Roy, “Highly-parallel decoding
architectures for convolutional turbo codes,” IEEE Trans.
Very Large Scale Integr. (VLSI) Syst., vol. 14, no. 10, pp.
1063–8210, Oct. 2006.
[6] C. M. Wu, M. D. Shieh, C. H. Wu, Y. T. Hwang, and J. H. Chen, "VLSI architectural design tradeoffs for sliding-window log-MAP decoders," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 13, no. 4, pp. 439-447, Apr. 2005.
[7] Jagadeesh Kaza and Chaitali Chakrabarti, "Design and Implementation of Low-Energy Turbo Decoders," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 12, no. 9, September 2004.
[8] Guido Masera, Marco Mazza, Gianluca Piccinini, Fabrizio Viglione, and Maurizio Zamboni, "Architectural Strategies for Low-Power VLSI Turbo Decoders," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 10, no. 3, June 2002.
[9] C. Schurgers, F. Catthoor, and M. Engels, “Memory
optimization of MAP turbo decoder algorithms,” IEEE
Trans. Very Large Scale Integr. (VLSI) Syst., vol. 9, no.
2, pp. 305–312, Feb. 2001.
[10] M. C. Valenti and J. Sun, “The UMTS turbo code and an
efficient decoder implementation suitable for software-
defined radios,” Int. J. Wirel. Inform. Netw., vol. 8, no. 4,
pp. 203–215, 2001.
[11] P. Robertson, E. Villebrun, and P. Hoeher, “A
comparison of optimal and sub-optimal MAP decoding
algorithms operating in the log domain,” in Proc. IEEE
Int. Conf. Commun., pp. 1009–1013, 1995.
[12] C. Berrou, A. Glavieux, and P. Thitimajshima, “Near
Shannon limit error correcting coding and decoding:
Turbo codes,” in Proc. IEEE Int. Conf. Commun., 1993,
pp. 1064–1070.
[13] L. R. Bahl, J. Cocke, F. Jelinek, and J. Raviv, "Optimal Decoding of Linear Codes for Minimizing Symbol Error Rate," IEEE Trans. Information Theory, vol. 20, no. 2, pp. 284-287, Mar. 1974.