VITURBO: A Reconfigurable Architecture for Future Ubiquitous
Wireless Networks
Mani Vaya
August 7, 2002
Rice University
Overview
• Motivation
• Communication Systems and Standards
• Channel Encoding and Decoding Techniques
• VITURBO
• Design Tools
• Results
• Conclusions & Future Work
Ubiquitous Networks
• Anytime, Anywhere Networks
• Ubiquitous = Outdoor Cellular Networks + High Speed Indoor Wireless LANs
• Seamless transfer between different environments
• One receiver for all environments
Ubiquitous Networks
[Figure: Cellular Systems and High Speed Local Area Networks]
Motivation
• Motivation: Replace multiple architectures for multiple standards with a single receiver architecture that serves all of them
• Means: Exploitation of commonalities between channel decoding algorithms for multiple standards, and their architectural realizations
• End Result: Reconfigurable architectures with reduced area and reduced cost
Related Work
• Lucent: Unified Turbo/Viterbi Decoder
– Uses log-MAP for Turbo decoding
– 4 ACS units in total
– Limited to 3G (2 Mbps Turbo, 384 Kbps Viterbi)
• Rice: Reconfigurable Viterbi Decoder
– Uses hard metrics (practical systems use soft metrics)
– Data Rates up to 26 Mbps
– Unable to turn down units not in use
– Only does Viterbi decoding
Our Contribution
• Design and Implementation of reconfigurable channel decoder for 802.11a and 3G systems
– Achieves data rates for both systems
– SOVA based Turbo decoding, and HDVA based Viterbi decoding
– Extremely flexible: capable of decoding constraint length 3-9 convolutional codes and Turbo codes
– Power saving mechanisms employed in the system
Next
• Ubiquitous Wireless Networks
• Communication Systems and Standards
• Channel Encoding and Decoding Techniques
• VITURBO
• Design Tools
• Results
• Conclusions & Future Work
CDMA System
[Block diagram: CDMA system]
TRANSMITTER: Input, Source Encoder, Channel Encoder, CDMA Spreader
Channel
RECEIVER: Chip Matched Filter, Channel Estimator, Multiuser Detector, Channel Decoder, Source Decoder, Output
OFDM System
[Block diagram: OFDM system]
TRANSMITTER: Channel Encoder, Modulator, S/P, IFFT, P/S, CP
Channel
RECEIVER: C/P, S/P, FFT, P/S, Equalizer, Demodulator, Channel Decoder, Channel Estimator
Next
• Motivation
• Communication Systems and Standards
• Channel Encoding and Decoding Techniques
• VITURBO
• Design Tools
• Results
• Conclusions & Future Work
Channel Encoding in Different Standards
Standard    Maximum Data Rate   Code Type       Code Rate   Constraint Length
WLAN        54 Mbps             Convolutional   1/2         7
CDMA2000    2 Mbps              Convolutional   1/2, 1/3    9
CDMA2000    2 Mbps              Turbo           1/3         4
3GPP        2 Mbps              Convolutional   1/2, 1/3    9
3GPP        2 Mbps              Turbo           1/3         4
Simple Convolutional Coder (SCC)
[Block diagram: input U, two shift registers D, outputs y00 and y01]
⊕ : XOR
D : Shift Register
U : Input Data
Y00, Y01 : Output Data
Rate: Number of Inputs/Number of Outputs = 1/2
Constraint Length: Number of Shift Registers +1= 3
Generator Polynomials: g0=[1 1 1], g1= [1 0 1]
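A minimal sketch of the encoder described on this slide, using the generator polynomials g0 = [1 1 1] and g1 = [1 0 1]. The function name `conv_encode` and the tap ordering (current input first, then the shift registers from newest to oldest) are assumptions made for illustration.

```python
def conv_encode(bits, generators=((1, 1, 1), (1, 0, 1))):
    """Rate-1/n feedforward convolutional encoder.

    Each generator lists taps on [current input, D1, D2, ...].
    """
    K = len(generators[0])          # constraint length = shift registers + 1
    regs = [0] * (K - 1)            # shift registers, newest first
    out = []
    for u in bits:
        window = [u] + regs
        for g in generators:
            # XOR of the tapped positions
            out.append(sum(t & w for t, w in zip(g, window)) % 2)
        regs = [u] + regs[:-1]      # shift the new bit in
    return out


print(conv_encode([1, 0, 1, 1]))    # two output bits per input bit
```

Each input bit produces two output bits, which is the rate 1/2 stated above.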
Viterbi Algorithm
• Decoding complexity increases exponentially with constraint length
• Finds the most likely sequence of state transitions through a finite state trellis
Viterbi Decoder
[Block diagram: Quantized Coded Data to Viterbi BMU, to Viterbi ACS, to SMU, to Decoded Data (+/-1); Viterbi CONTROL block]
BMU: Branch Metric Unit
ACS: Add Compare Select
SMU: Survivor Management Unit
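The three blocks can be illustrated with a small hard-decision Viterbi decoder for the K=3 code shown earlier. This is a software sketch, not the hardware design: the function names, the bit-mask form of the generators, and the choice of Hamming distance as the branch metric are assumptions.

```python
def encode_step(state, u, K, gens):
    """One encoder step. state holds the K-1 past bits, newest in the MSB."""
    window = (u << (K - 1)) | state            # [u, D1, ..., D(K-1)] as bits
    out = tuple(bin(window & g).count("1") % 2 for g in gens)
    return window >> 1, out                    # next state drops the oldest bit


def encode(bits, K=3, gens=(0b111, 0b101)):
    state, out = 0, []
    for u in bits:
        state, o = encode_step(state, u, K, gens)
        out.extend(o)
    return out


def viterbi_decode(received, K=3, gens=(0b111, 0b101)):
    n = len(gens)
    n_states = 1 << (K - 1)
    INF = float("inf")
    pm = [0] + [INF] * (n_states - 1)          # encoder starts in state 0
    history = []                               # per stage: (prev_state, bit) per state
    for t in range(0, len(received), n):
        rx = received[t:t + n]
        new_pm = [INF] * n_states
        choice = [None] * n_states
        for s in range(n_states):
            if pm[s] == INF:
                continue                       # state not reachable yet
            for u in (0, 1):
                nxt, out = encode_step(s, u, K, gens)
                bm = sum(a != b for a, b in zip(out, rx))  # BMU: Hamming distance
                cand = pm[s] + bm                          # ACS: add
                if cand < new_pm[nxt]:                     # ACS: compare and select
                    new_pm[nxt] = cand
                    choice[nxt] = (s, u)
        pm = new_pm
        history.append(choice)
    # SMU: trace back from the best final state
    state = min(range(n_states), key=pm.__getitem__)
    decoded = []
    for choice in reversed(history):
        state, u = choice[state]
        decoded.append(u)
    return decoded[::-1]


msg = [1, 0, 1, 1, 0, 0]
noisy = encode(msg)
noisy[2] ^= 1                                  # one channel error
print(viterbi_decode(noisy) == msg)
```

The inner loop visits every state at every stage, which is where the exponential growth in complexity with constraint length comes from.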
Turbo Encoder
[Block diagram: input U feeds RSC 1 directly and RSC 2 through the interleaver I; outputs Xs, X1p, X2p]
U: Input data
RSC: Constituent Encoders
I : Interleaver
X1p, X2p: Output Data (Parity)
Xs: Output Data (Systematic)
Recursive Systematic Convolutional(RSC) Encoder
[Block diagram: input U, three shift registers D with feedback, outputs Xs and Xp]
Recursive: Intermediate Data is fed back to the encoder
Systematic: Output Xs is same as input U
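A minimal sketch of an RSC encoder with three shift registers (K=4). The slide does not give the polynomials; the ones used here (feedback 1 + D^2 + D^3, parity 1 + D + D^3) are an assumption, chosen to match the constituent code of the 3GPP Turbo code. The function name `rsc_encode` is hypothetical.

```python
def rsc_encode(bits):
    """Rate-1/2 RSC encoder with three delay elements (K = 4).

    Assumed polynomials (not given on the slide):
    feedback 1 + D^2 + D^3, parity 1 + D + D^3.
    """
    s1 = s2 = s3 = 0                 # D registers, s1 is the newest
    systematic, parity = [], []
    for u in bits:
        a = u ^ s2 ^ s3              # recursive: register contents are fed back
        p = a ^ s1 ^ s3              # parity output Xp
        systematic.append(u)         # systematic: Xs is the input itself
        parity.append(p)
        s1, s2, s3 = a, s1, s2       # shift
    return systematic, parity


xs, xp = rsc_encode([1, 0, 0, 0, 0, 0, 0, 0])
print(xs, xp)
```

Because of the feedback, a single 1 at the input keeps producing parity bits long after it has entered the registers, unlike the feedforward encoder above.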
Turbo Decoding
• Reliability of decisions is computed and iterated between two decoders, in order to get a reliable estimate of the data
• Two competing algorithms: Soft Output Viterbi Algorithm (SOVA) and Maximum a Posteriori Probability Algorithm (MAP)
• MAP outperforms SOVA, but SOVA is very similar to the Viterbi Algorithm (VA), hence its choice for VITURBO
SOVA based Turbo Decoder
[Block diagram: Turbo decoder with two SISO (SOVA/MAP) decoders, two Interleavers and a De-Interleaver; inputs Y1p, Ys, Y2p; soft information Le12 and Le21 exchanged between the decoders]
[Block diagram: SOVA decoder with Turbo BMU, Turbo ACS, Soft SMU and Turbo CONTROL; inputs: Quantized Encoded Data (q bit), Soft Information from previous decoder; outputs: Decoded data (+/-1) (1 bit), Soft Information to next decoder]
Y1p,Y2p: Received Parity data
Ys: Received Systematic Data
Le(12/21): Soft Information
Next
• Motivation
• Communication Systems and Standards
• Channel Encoding and Decoding Techniques
• VITURBO
• Design Tools
• Results
• Future Work
VITURBO: Features
• Completely parallel architecture for constraint length 9 (2^(K-2) ACS units), for high speed decoding
• Smaller constraint length decoding uses parts of larger circuit
• Flexible for a wide range of constraint lengths, generator polynomials and rates
• Power saving mechanisms help in shutting down units not in use for a particular decoding type
VITURBO: Complete Architecture
[Block diagram: Inputs, BMU, BMU MUXes, Codeword LUT, ACSs, Path Metric Mux Bank, Thresholder, Path Difference Memory, Path Difference Mux Bank, Decision Bit Memory 1, Decision Bit Memory 2, Decision Mux Bank, Survivor Manager, Soft Decision Memory, Order Inversion Memory 1, Order Inversion Memory 2, Inversion Mux, Interleaver, Interleaver Mux, Deinterleaver Mux, Output]
Reconfigurable BMU
[Block diagram: VITURBO complete architecture, repeated from the previous slide]
BMU & Codeword Look-Up Table
• Possible Codewords for rate k/n code: 2^n
• Corresponding codeword for each butterfly: Defined by generator polynomial
• Contemporary Solutions: Use of inbuilt encoders for each code configuration
• Our Solution: Programmable Codeword Look-Up Table for enhanced flexibility and low power. Programmable for any codeword for K=3-9
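A rough software analogue of a programmable codeword table, assuming one entry per butterfly (and so per ACS unit). The table layout, the function name `build_codeword_lut`, the bit-mask form of the generators, and the choice of the (state 2j, input 0) branch as the stored codeword are all assumptions; the slides do not describe the table contents at this level.

```python
def build_codeword_lut(K, gens):
    """Codeword for the (state 2j, input 0) branch of each butterfly j.

    gens are bit masks over [input, D1, ..., D(K-1)], input in the MSB.
    """
    lut = []
    for j in range(1 << (K - 2)):             # one entry per butterfly / ACS unit
        window = 2 * j                        # input bit 0 in the MSB, state 2j below
        codeword = 0
        for g in gens:
            parity = bin(window & g).count("1") & 1
            codeword = (codeword << 1) | parity
        lut.append(codeword)
    return lut


# Reprogramming the table for a different code is just a rebuild.
print(build_codeword_lut(3, (0b111, 0b101)))
print(len(build_codeword_lut(9, (0b111111111, 0b100000001))))
```

Every entry is one of the 2^n possible codewords, so an n-bit field per entry is enough for any generator polynomial.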
Reconfigurable BMU
[Block diagram: Input Data feeds Viterbi BM Compute and Turbo BM Compute; a MUX selects between them by Decoder Type; BM Mux(0), BM Mux(1), ..., BM Mux(127) pass branch metrics to the corresponding ACS units; the Codeword Look-Up Table maps an index to its corresponding codeword and takes K, R, Decoder Type and New Codeword as inputs; bus labels: 8, 8X8]
Reconfigurable ACS Unit & ACS Routing
[Block diagram: VITURBO complete architecture, repeated]
Reconfigurable Add Compare Select Sub Unit
[Block diagram: two adders (+) form PM0 and PM1 from PM0', BM0' and PM1', BM1'; a comparator (>) drives SELECT to produce the Survivor PM (Viterbi/Turbo) and the Decision Bit (Viterbi/Turbo); a subtractor (-) produces the PM Difference (Turbo)]
PM0’,PM1’: Input Path Metrics
BM0’, BM1’: Input Branch Metrics
PM0, PM1: Competing Path Metrics
Decoder Type(Viterbi/Turbo)
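The sub unit can be sketched in a few lines. The function name `acs` is hypothetical, and the rule that the larger metric survives is an assumption (a design using distance metrics would keep the smaller one instead).

```python
def acs(pm0_in, bm0, pm1_in, bm1, turbo=False):
    """Add-compare-select for one trellis state.

    Returns (survivor_pm, decision_bit, pm_difference).
    pm_difference is None in Viterbi mode, where it is not needed.
    """
    pm0 = pm0_in + bm0                    # add
    pm1 = pm1_in + bm1
    decision = 1 if pm1 > pm0 else 0      # compare (assumption: larger metric wins)
    survivor = pm1 if decision else pm0   # select
    diff = abs(pm0 - pm1) if turbo else None
    return survivor, decision, diff


print(acs(3, 2, 1, 5))                    # Viterbi mode
print(acs(3, 2, 1, 5, turbo=True))        # Turbo mode also returns the difference
```

The only Turbo-specific work is the subtraction, which is why the same unit can serve both decoder types.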
State Transitions and Butterflies
[Figure: trellis state transitions across stages for K=3, SCC (states 00 to 11) and K=4, RSC (states 000 to 111), with branches labelled input/output; and the generic butterfly connecting states 2j and 2j+1 with states j and j+2^(K-2)]
Jth butterfly for Constraint length K
Jth butterfly’s computations are done in Jth ACS unit
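The butterfly indexing can be checked with a short sketch. It assumes that states 2j and 2j+1 sit at the earlier stage and states j and j+2^(K-2) at the later stage, which agrees with the traceback rule P = 2*C + decbit used later in the deck; the function names are hypothetical.

```python
def butterfly(j, K):
    """States touched by butterfly j: (from_states, to_states)."""
    half = 1 << (K - 2)                   # number of butterflies for this K
    return (2 * j, 2 * j + 1), (j, j + half)


def all_butterflies(K):
    return [butterfly(j, K) for j in range(1 << (K - 2))]


for K in (3, 4):
    print(K, all_butterflies(K))
```

For every K the 2^(K-2) butterflies cover each of the 2^(K-1) states exactly once on each side, so one ACS unit per butterfly is enough for a fully parallel decoder.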
ACS Path Metric Routing Problem
[Figure: path metric connections of ACS(0). K=3: from ACS(0) and ACS(1), to ACS(0) and ACS(1). K=4: from ACS(0) and ACS(1), to ACS(0) and ACS(2). The connections differ for every K up to 9.]
ACS Path Metric Routing Solution
Solution for All Ks:
[Block diagram: ACS(0), ACS(1), ..., ACS(127) connected through a Multiplexer Bank (256 Muxes) controlled by Decoder Type and Constraint length]
[Block diagram: ACS(j) with Power Control(j); Mux(2*j) and Mux(2*j + 1) of the Path Metric Multiplexor bank select, by K, among the Path Metrics from other ACS units; ACS(j) also receives Branch Metrics and outputs Survivor Path Metrics (Turbo/Viterbi), to the Path Metric Difference Multiplexor bank and to the Decision Bit Multiplexor bank]
PM0’,PM1’: Input Path Metrics
BM0’, BM1’: Input Branch Metrics
PM0, PM1: Competing Path Metrics
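A sketch of what the multiplexer bank has to select for each constraint length. It assumes ACS(j) reads the path metrics of states 2j and 2j+1 and writes those of states j and j+2^(K-2); the function names are hypothetical.

```python
def pm_sources(j, K):
    """ACS units whose outputs feed the two path-metric inputs of ACS(j)."""
    half = 1 << (K - 2)                   # number of active ACS units
    # ACS(j) reads states 2j and 2j+1; state s was produced by ACS(s mod half).
    return (2 * j) % half, (2 * j + 1) % half


def pm_destinations(j, K):
    """ACS units that read the two path metrics produced by ACS(j)."""
    half = 1 << (K - 2)
    # ACS(j) writes states j and j+half; state s is read by ACS(s // 2).
    return j // 2, (j + half) // 2


def mux_select_table(K_max=9):
    """Select value for Mux(2j) and Mux(2j+1), for every K and every active j."""
    return {(K, j): pm_sources(j, K)
            for K in range(3, K_max + 1)
            for j in range(1 << (K - 2))}


print(pm_destinations(0, 3), pm_destinations(0, 4))   # matches the K=3 and K=4 cases
print(pm_sources(1, 3), pm_sources(1, 4))             # same unit, different wiring per K
```

Two muxes per ACS unit over 128 units gives the 256 muxes of the bank.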
Jth ACS Unit and Mux Bank
[Block diagram: BM Compute and the CODEWORD LOOK-UP TABLE feed BM mux(0), ..., BM mux(3), BM mux(4), ..., BM mux(127); each drives ACS(0), ..., ACS(3), ACS(4), ..., ACS(127), gated by Power Control (0), ..., Power Control (127). Units 0 to 3 are marked Turbo/Viterbi Computation, units 4 to 127 Viterbi Computation. Path Metrics return through the Path metric Mux Bank. Outputs pass through the Thresholder and the Path Metric Difference Multiplexor bank to Path Difference Memory, and through the Decision Bit Multiplexor Bank to Decision Bit Memory]
BMU, ACS and MUX banks
Reconfigurable SMU
[Block diagram: VITURBO complete architecture, repeated]
Flexible Hard Decision Traceback
[Block diagram: Flexible Traceback block with inputs C, Decbit and K, a Decision LUT, and outputs P and Xt/Xv]
Flexible Traceback for Viterbi and Turbo Decoding
• C: Current State, P: Previous State
• decbit: decision bit
• Xv and Xt: decoded data for Viterbi and Turbo decoding respectively
• Decision-LUT is a Look-Up Table for the various possible decisions for Turbo decoding

if C >= 2^(K-2) then
    P = 2*C + decbit - 2^(K-1)
    xv = 1
else
    P = 2*C + decbit
    xv = 0
end if
xt = DecisionLUT(2*C + decbit)
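The traceback rule on this slide translates directly into a small function. The name `traceback_step` and the representation of the Decision-LUT as a list with 2^K entries are assumptions; the slide does not say what the LUT contains, so the example fills it with placeholder values.

```python
def traceback_step(C, decbit, K, decision_lut=None):
    """One traceback step: returns (previous state P, decoded bit)."""
    half = 1 << (K - 2)
    if C >= half:
        P = 2 * C + decbit - (1 << (K - 1))   # wrap around the state count
        xv = 1
    else:
        P = 2 * C + decbit
        xv = 0
    if decision_lut is not None:              # Turbo mode: decoded bit from the LUT
        return P, decision_lut[2 * C + decbit]
    return P, xv


print(traceback_step(3, 1, K=3))              # Viterbi mode
placeholder_lut = [i & 1 for i in range(8)]   # hypothetical contents, K=3
print(traceback_step(2, 1, K=3, decision_lut=placeholder_lut))
```

In Viterbi mode the decoded bit is simply the top bit of the current state; the LUT is only needed for Turbo decoding, where the feedback in the RSC encoder breaks that direct relation.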
SOVA Traceback Architecture
ML Path
Competing Paths(All colors except black)
Ik-U : Decoded data at time k-U
Lk-U : Soft Information at time k-U
[Block diagram: traceback units TB0, TB1, TB2, TB3, ..., TB12 with delay elements Delay1, Delay2, Delay3, Delay4, ..., Delay12; the Relevance Bits and a Minima block produce Lk-U; Ik-U is the decoded output]
U: Reliability depth(3*K)
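The role of the relevance bits and the minima block can be illustrated with the usual SOVA reliability update: wherever the competing path disagrees with the ML path inside the window of depth U, the reliability of that decision is lowered to the path metric difference if the difference is smaller. This is a sketch of the general rule, not of the architecture on the slide; the function name and argument layout are assumptions.

```python
def sova_update(reliab, ml_bits, comp_bits, delta):
    """Update reliabilities over one window of U decisions.

    reliab    : current reliabilities L for the last U ML decisions
    ml_bits   : ML path decisions over the window
    comp_bits : decisions along the competing path over the same window
    delta     : path metric difference between the two paths (>= 0)
    """
    relevance = [m != c for m, c in zip(ml_bits, comp_bits)]   # relevance bits
    return [min(L, delta) if r else L for L, r in zip(reliab, relevance)]


print(sova_update([5, 5, 5], [1, 0, 1], [1, 1, 1], delta=2))
```

Only the position where the two paths disagree is touched; decisions that both paths share keep their reliability.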
Power Saving
• Architecture is completely parallel for constraint length 9 (128 ACS units)
• Smaller constraint length decoders use parts of the complete circuit
• In order to save power, units not being used are shut down
• Quiescent Power is constant for all the different configurations
Power Saving Mechanism for ACS units
[Block diagram: a comparator (>) checks Index(j) against 2^(K-2) and produces acsON(j), which gates the clock into the Power Controlled Clock of ACS(j); ACS(j) has inputs and Outputs]
K: Constraint length
j: Index of the ACS unit
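The enable logic can be sketched as a comparison of the unit index with 2^(K-2). The direction of the comparison (a unit is on when its index is below 2^(K-2)) is an assumption based on the number of ACS units each constraint length needs; the function names are hypothetical.

```python
def acs_on(j, K):
    """True if ACS unit j is needed for constraint length K."""
    return j < (1 << (K - 2))


def active_units(K, total=128):
    """Number of ACS units left running for constraint length K."""
    return sum(acs_on(j, K) for j in range(total))


for K in (3, 4, 7, 9):
    print(K, active_units(K))
```

Under this assumption only 32 of the 128 units stay clocked for K=7, and 4 for the K=4 Turbo code.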
Next
• Motivation
• Communication Systems and Standards
• Channel Encoding and Decoding Techniques
• VITURBO
• Design Tools
• Results
• Conclusions & Future Work
Design Tools
• Xilinx’s Virtex-II 2000K gate FPGA used to implement the design
• Xilinx’s ISE development environment used for design, synthesis, and implementation
– Design described in VHDL
– Modelsim used for simulations
– Synplicity used for synthesis
– XPower used for power estimation
Next
• Motivation
• Communication Systems and Standards
• Channel Encoding and Decoding Techniques
• Viterbi and Turbo Decoding Architectures
• VITURBO
• Design Tools
• Results
• Conclusions & Future Work
Gate Requirements for Different Realizations
[Bar chart: Logic and Memory gate counts (axis 0 to 350,000) for the realizations T, V7, V3-7, V3-7&T, V9, V3-9, V3-9&T]
V(K1-K2): Viterbi (Constraint length)
T: Turbo
Area Savings
• Conventional architecture:
– Separate architectures for K=7, 9, and Turbo
– Total Logic Area requirements = 267,147
• Reconfigurable VITURBO
– Same architecture for K=7, 9, and Turbo
– Total Logic Area requirements = 190,288
Area Savings = 28.7 %
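The arithmetic behind this figure can be reproduced from the per-decoder logic gate counts listed in the backup table (Viterbi(7), Viterbi(9), Turbo). The variable names are illustrative.

```python
# Logic gate counts from the backup table
conventional = 62_987 + 166_348 + 37_812   # Viterbi(7) + Viterbi(9) + Turbo
viturbo = 190_288                          # Viterbi(3-9) + Turbo in one design

savings = 1 - viturbo / conventional
print(conventional, f"{savings:.4f}")      # prints: 267147 0.2877
```

The three separate decoders add up to the 267,147 gates quoted above, and the ratio comes to about 0.2877, which the slide reports as 28.7 %.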
Maximum Clocking Frequency
[Bar chart: Maximum Frequency (MHz), axis 54 to 72, for the realizations T, V7, V3-7, V3-7&T, V9, V3-9, V3-9&T]
V(K1-K2): Viterbi (Constraint length)
T: Turbo
Power Analysis
Decoder            Clock Frequency   Data Rate   Power Consumption (sans Quiescent power)   Energy/Bit (Joules/bit)   Quiescent Power
K=7 (WLAN)         54 MHz            54 Mbps     501.3 mW                                   9.28 nJ                   225 mW
K=9 (3GPP)         2 MHz             2 Mbps      59.54 mW                                   29.77 nJ                  225 mW
K=4 Turbo (3GPP)   34.3 MHz          2 Mbps      104.76 mW                                  52.38 nJ                  225 mW
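The Energy/Bit column follows from dividing the power consumption (without quiescent power) by the data rate. A short check, with a hypothetical helper name:

```python
def energy_per_bit_nj(power_mw, rate_mbps):
    # mW / Mbps = (1e-3 J/s) / (1e6 bit/s) = 1e-9 J/bit = nJ/bit
    return power_mw / rate_mbps


rows = {"K=7 (WLAN)": (501.3, 54), "K=9 (3GPP)": (59.54, 2), "K=4 Turbo (3GPP)": (104.76, 2)}
for name, (power_mw, rate_mbps) in rows.items():
    print(name, round(energy_per_bit_nj(power_mw, rate_mbps), 2), "nJ/bit")
```

The WLAN configuration draws the most power but spends the least energy per decoded bit, because it runs at 27 times the data rate.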
Next
• Motivation
• Communication Systems and Standards
• Channel Encoding and Decoding Techniques
• VITURBO
• Design Tools
• Results
• Conclusions & Future Work
Conclusions
• VITURBO achieves data rates stipulated by 802.11a and 3G systems
• Reconfigurable architectures are a feasible proposition for future communication systems, as they
– Provide flexibility
– Save Area
Possible Future Work
• Use of log-MAP for Turbo Decoding
• Use of Termination Algorithms for Turbo Decoding for lower power consumption
• Use of remaining ACS units for high data rate Turbo decoders
• Architecture designs with smaller number of ACS units
• Exploitation of similarities in other baseband processing units for CDMA and OFDM systems
Backups
Area-Time-Decoder Tradeoffs
Decoder Type                      Gates (Logic)   Gates (Memory), 128 bit frame   Max. Frequency
Viterbi(7)                        62,987          65,536                          71.3 MHz
Viterbi(9)                        166,348         262,146                         68.4 MHz
Turbo                             37,812          65,536                          67.7 MHz
Viterbi(3-5)                      33,487          16,384                          64.6 MHz
Viterbi(3-5)+Turbo                42,385          81,812                          63.3 MHz
Viterbi(3-7)                      67,042          65,536                          62.8 MHz
Viterbi(3-7)+Turbo                76,037          131,072                         62.1 MHz
Viterbi(3-9)                      181,560         262,146                         61.9 MHz
VITURBO: Viterbi(3-9)+Turbo       190,288         327,680                         60.5 MHz
SOVA Algorithm- Branch Metric
$\gamma_k(s', s) = \frac{1}{2} L(u_k)\, u_k + \frac{1}{2} L_c\, y_k^s\, u_k + \frac{1}{2} L_c\, y_k^p\, p_k$
$\gamma_k(s', s)$: Branch Metric for state transition from s' to s
$L(u_k)$: Extrinsic Information from previous decoder
$L_c$: Channel Value $= 4 E_c / N_0$
$x_k = (u_k, p_k)$: Output of RSC1 at time k, for input $u_k$
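A small numerical sketch of the branch metric, assuming the usual three-term form: an a priori term from the extrinsic information, a systematic term, and a parity term, each weighted by 1/2. The function name, the argument names, and the use of +1/-1 for the branch bits are assumptions.

```python
def branch_metric(u, p, y_s, y_p, L_u, L_c):
    """Branch metric for a branch with systematic bit u and parity bit p.

    u, p     : branch bits as -1 or +1
    y_s, y_p : received systematic and parity values
    L_u      : extrinsic information from the previous decoder
    L_c      : channel value
    """
    return 0.5 * u * L_u + 0.5 * L_c * y_s * u + 0.5 * L_c * y_p * p


# Received values that agree with the branch bits give the larger metric.
print(branch_metric(+1, +1, 0.5, 0.5, L_u=0.0, L_c=2.0))
print(branch_metric(-1, +1, 0.5, 0.5, L_u=0.0, L_c=2.0))
```

With no extrinsic information (L_u = 0), as in the first iteration, the metric reduces to a scaled correlation between the received values and the branch bits.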
SOVA Traceback
SOVA Traceback Blocks
[Block diagram: Hard Deciding SMU, Path Comparison Unit, two Delay blocks and a Select Update Unit; signals: states s and s' at time k-D, dec(s,k), relevance bits, Likelihood L at time k-D-U, Symbol]
Two Step SOVA -SM
[Figure: trellis timeline with time indices k-D-U, k-D and k, split into a SOVA section and a VA section; labels: ML path, Competing Path, All Paths converge, Possible Survivors]
D: Decoding Depth
U: Reliability Depth
k: Index of decoded bit