Paper Reading - A New Approach to Pipeline FFT Processor
Presenters: Chia-Hsin Chen, Yen-Chi Lee
Mentor: Chenjo
Instructor: Andy Wu
2006.10.25 Owen, Lee 2
Outline
• What’s FFT
• FFT on Hardware
• Comparison
• C/C++ Sim
• Further Study
• Reference
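What’s DFT
• The Fourier transform of a discrete-time signal x(n) is a continuous function of ω:
$$X(\omega) = \sum_{n=0}^{L-1} x(n)\, e^{-j\omega n}, \qquad 0 \le \omega \le 2\pi$$
• Sampling X(ω) at N equally spaced frequencies ω = 2πk/N gives a discrete function:
$$X(k) \equiv X\!\left(\frac{2\pi k}{N}\right) = \sum_{n=0}^{L-1} x(n)\, e^{-j 2\pi k n / N}$$
• With x(n) zero-padded to N points (L ≤ N), this is called the discrete Fourier transform (DFT) of x(n):
$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N}, \qquad k = 0, 1, 2, \ldots, N-1$$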
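The DFT sum can be evaluated literally, one output bin at a time. The following is a minimal Python sketch of that direct evaluation; the function name `dft` and the two test signals are illustrative choices, not part of the slides.

```python
import cmath

def dft(x):
    """Direct N-point DFT: X(k) = sum over n of x(n) * exp(-j*2*pi*k*n/N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

# A unit impulse has a flat spectrum; a constant signal has only a DC term.
impulse = dft([1, 0, 0, 0])
constant = dft([1, 1, 1, 1])
```

Each of the N outputs costs N complex multiplications, which is the N² cost that the FFT reduces.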
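What’s FFT
• An efficient algorithm that computes the DFT
• Twiddle factor:
$$W_N^{nk} = e^{-j 2\pi nk / N}$$
• DFT (time domain x[n] → frequency domain X[k]):
$$X_k = \sum_{n=0}^{N-1} x_n W_N^{nk}, \qquad 0 \le k < N$$
• IDFT (frequency domain X[k] → time domain x[n]):
$$x_n = \frac{1}{N} \sum_{k=0}^{N-1} X_k W_N^{-nk}, \qquad 0 \le n < N$$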
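A short Python sketch of the DFT/IDFT pair written in terms of the twiddle factor. The helper names `W`, `dft` and `idft` are assumptions made for illustration; the check is simply that the IDFT undoes the DFT.

```python
import cmath

def W(N, e):
    """Twiddle factor W_N^e = exp(-j*2*pi*e/N)."""
    return cmath.exp(-2j * cmath.pi * e / N)

def dft(x):
    """X_k = sum over n of x_n * W_N^(n*k)."""
    N = len(x)
    return [sum(x[n] * W(N, n * k) for n in range(N)) for k in range(N)]

def idft(X):
    """x_n = (1/N) * sum over k of X_k * W_N^(-n*k)."""
    N = len(X)
    return [sum(X[k] * W(N, -n * k) for k in range(N)) / N for n in range(N)]

x = [1, 2, 3, 4]
X = dft(x)          # X[0] is the sum of the samples
x_back = idft(X)    # recovers x up to rounding error
```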
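What’s FFT (cont.)
• Direct computation of
$$X(k) = \sum_{n=0}^{N-1} x(n) W_N^{kn}, \qquad 0 \le k \le N-1$$
  • N² complex multiplications
  • N(N − 1) complex additions
• The FFT exploits two properties of the twiddle factor
  • Symmetry: $W_N^{k+N/2} = -W_N^{k}$
  • Periodicity: $W_N^{k+N} = W_N^{k}$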
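Both twiddle-factor properties are easy to confirm numerically. The sketch below checks them for N = 8; the helper `W` and the chosen N are illustrative assumptions.

```python
import cmath

def W(N, e):
    """Twiddle factor W_N^e = exp(-j*2*pi*e/N)."""
    return cmath.exp(-2j * cmath.pi * e / N)

N = 8
# Symmetry: advancing the exponent by N/2 flips the sign.
symmetry_ok = all(abs(W(N, k + N // 2) + W(N, k)) < 1e-12 for k in range(N))
# Periodicity: advancing the exponent by N changes nothing.
periodicity_ok = all(abs(W(N, k + N) - W(N, k)) < 1e-12 for k in range(N))
```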
Divide-and-Conquer
• Simple divide case:
  • N = LM (for N points)
  • n = l + mL, k = Mp + q
  • Apply the 2-dimensional index map to
$$X(k) = \sum_{n=0}^{N-1} x(n) W_N^{kn}, \qquad 0 \le k \le N-1$$
$$X(p,q) = \sum_{l=0}^{L-1} \sum_{m=0}^{M-1} x(l,m)\, W_N^{(Mp+q)(mL+l)} = \sum_{l=0}^{L-1} \left\{ W_N^{lq} \left[ \sum_{m=0}^{M-1} x(l,m)\, W_M^{mq} \right] \right\} W_L^{lp}$$
where
$$W_N^{(Mp+q)(mL+l)} = W_N^{MLmp}\, W_N^{mLq}\, W_N^{Mpl}\, W_N^{lq} = 1 \cdot W_M^{mq}\, W_L^{pl}\, W_N^{lq}$$
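The three steps of the factored sum (M-point DFTs over m, multiplication by W_N^{lq}, L-point DFTs over l) can be sketched in Python and compared against the direct DFT. The function names `W`, `dft` and `dft_2d_map`, and the choice N = 12 = 3 × 4, are assumptions for illustration only.

```python
import cmath

def W(N, e):
    """Twiddle factor W_N^e = exp(-j*2*pi*e/N)."""
    return cmath.exp(-2j * cmath.pi * e / N)

def dft(x):
    """Direct DFT used as the reference."""
    N = len(x)
    return [sum(x[n] * W(N, k * n) for n in range(N)) for k in range(N)]

def dft_2d_map(x, L, M):
    """N = L*M point DFT using the index map n = l + m*L, k = M*p + q."""
    N = L * M
    assert len(x) == N
    X = [0j] * N
    for q in range(M):
        # Steps 1 and 2: M-point DFT over m for each l, scaled by W_N^(l*q).
        G = [W(N, l * q) * sum(x[l + m * L] * W(M, m * q) for m in range(M))
             for l in range(L)]
        # Step 3: L-point DFT over l.
        for p in range(L):
            X[M * p + q] = sum(G[l] * W(L, l * p) for l in range(L))
    return X

x = [complex(n, (n * n) % 5) for n in range(12)]
max_err = max(abs(a - b) for a, b in zip(dft(x), dft_2d_map(x, L=3, M=4)))
```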
Two Dimensional Sequence
l \ m 0 1 2 … M-1
0 x(0) x(1) x(2) … x(M-1)
1 x(M) x(M+1) x(M+2) … x(2M-1)
2 x(2M) x(2M+1) x(2M+2) … x(3M-1)
: : : : : :
L-1 x((L-1)M) x((L-1)M+1) … … x(LM-1)
Comparison
• The number of computations decreases

Total computations | Complex multiplications | Complex additions
Before division | N² | N(N − 1)
After division | N(M + L + 1) | N(M + L − 2)
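To make the saving concrete, the two rows of the table can be evaluated for a specific size. The sketch below uses N = 1000 split as L = 2, M = 500; the function names and the chosen factorization are illustrative assumptions.

```python
def direct_counts(N):
    """(complex multiplications, complex additions) before division."""
    return N * N, N * (N - 1)

def divided_counts(L, M):
    """(complex multiplications, complex additions) after division, N = L*M."""
    N = L * M
    return N * (M + L + 1), N * (M + L - 2)

before = direct_counts(1000)          # (1000000, 999000)
after = divided_counts(L=2, M=500)    # (503000, 500000)
```

A single split roughly halves the work here; applying the split recursively is what leads to the FFT.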
Radix
• Let $N = r_1 r_2 r_3 \cdots r_v$
• For the special case $N = r^v$
  • r is called the radix
  • r = 2
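Radix-2 Butterfly
• DIT (decimation in time)
• DIF (decimation in frequency)
[Figure: DIT and DIF radix-2 butterflies. Each maps inputs x_{i-1}(k), x_{i-1}(m) to outputs X_i(k), X_i(m) through an add/subtract pair (the −1 branch) and a twiddle factor W^n.]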
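The two butterflies differ only in where the twiddle multiplication sits. The following Python sketch uses the standard textbook forms, which are assumed here because the figure itself is not recoverable; the names `bf2_dit` and `bf2_dif` are illustrative.

```python
def bf2_dit(a, b, w):
    """DIT butterfly: multiply the lower input by w, then add and subtract."""
    t = w * b
    return a + t, a - t

def bf2_dif(a, b, w):
    """DIF butterfly: add and subtract, then multiply the lower output by w."""
    return a + b, (a - b) * w

# With w = 1 both reduce to a 2-point DFT of (1, 2).
dit_out = bf2_dit(1, 2, 1)
dif_out = bf2_dif(1, 2, 1)
# With w = -j the DIF lower output is (1 - 2) * (-j) = j.
dif_rot = bf2_dif(1, 2, -1j)
```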
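Review of FFT approach
• A divide and conquer approach
• Radix-2 Multi-path Delay Commutator (R2MDC)
• Radix-2 Single-path Delay Feedback (R2SDF)
• Radix-4 Single-path Delay Feedback (R4SDF)
[Figure: block diagrams of the three pipelines. R2MDC: C2 commutators and BF2 butterflies, with a trivial j multiplication; R2SDF: BF2 stages with feedback delays 8, 4, 2, 1; R4SDF: BF4 stages with feedback delays 3×64, 3×16, 3×4, 3×1.]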
Review (cont.)
• Radix-4 Multi-path Delay Commutator (R4MDC)
• Radix-4 Single-path Delay Commutator (R4SDC)
[Figure: block diagrams of the two pipelines. R4MDC: C4 commutators and BF4 butterflies with delay elements ranging from 192 down to 1; R4SDC: DC6×64 delay commutators and BF4 butterflies.]
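Radix-2² DIF Algorithm
• Proposed by S. He and M. Torkelson
• Applying a 3-dimensional linear index map to
$$X(k) = \sum_{n=0}^{N-1} x(n) W_N^{nk}, \qquad 0 \le k < N$$
$$n = \left\langle \tfrac{N}{2} n_1 + \tfrac{N}{4} n_2 + n_3 \right\rangle_N, \qquad n_1, n_2 \in \{0, 1\},\quad n_3 = 0 \sim \tfrac{N}{4} - 1$$
$$k = \left\langle k_1 + 2k_2 + 4k_3 \right\rangle_N, \qquad k_1, k_2 \in \{0, 1\},\quad k_3 = 0 \sim \tfrac{N}{4} - 1$$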
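Radix-2² DIF Algorithm (cont.)
$$X(k_1 + 2k_2 + 4k_3) = \sum_{n_3=0}^{N/4-1} \sum_{n_2=0}^{1} \sum_{n_1=0}^{1} x\!\left(\tfrac{N}{2} n_1 + \tfrac{N}{4} n_2 + n_3\right) W_N^{\left(\frac{N}{2} n_1 + \frac{N}{4} n_2 + n_3\right)(k_1 + 2k_2 + 4k_3)}$$
$$= \sum_{n_3=0}^{N/4-1} \sum_{n_2=0}^{1} \left\{ B_{N/2}^{k_1}\!\left(\tfrac{N}{4} n_2 + n_3\right) W_N^{\left(\frac{N}{4} n_2 + n_3\right) k_1} \right\} W_N^{\left(\frac{N}{4} n_2 + n_3\right)(2k_2 + 4k_3)}$$
where
$$B_{N/2}^{k_1}\!\left(\tfrac{N}{4} n_2 + n_3\right) = x\!\left(\tfrac{N}{4} n_2 + n_3\right) + (-1)^{k_1}\, x\!\left(\tfrac{N}{4} n_2 + n_3 + \tfrac{N}{2}\right)$$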
Radix-2² DIF Algorithm (cont.)
$$W_N^{\left(\frac{N}{4} n_2 + n_3\right)(k_1 + 2k_2 + 4k_3)} = W_N^{N n_2 k_3}\, W_N^{\frac{N}{4} n_2 (k_1 + 2k_2)}\, W_N^{n_3 (k_1 + 2k_2)}\, W_N^{4 n_3 k_3}$$
$$= (-j)^{n_2 (k_1 + 2k_2)}\, W_N^{n_3 (k_1 + 2k_2)}\, W_N^{4 n_3 k_3}$$
$$X(k_1 + 2k_2 + 4k_3) = \sum_{n_3=0}^{N/4-1} \left[ H(k_1, k_2, n_3)\, W_N^{n_3 (k_1 + 2k_2)} \right] W_{N/4}^{n_3 k_3}$$
$$H(k_1, k_2, n_3) = B_{N/2}^{k_1}(n_3) + (-j)^{(k_1 + 2k_2)}\, B_{N/2}^{k_1}\!\left(n_3 + \tfrac{N}{4}\right)$$
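The final expression is recursive: each of the four (k1, k2) combinations yields an N/4-point DFT over n3 of the sequence H(k1, k2, n3) · W_N^{n3(k1+2k2)}. The Python sketch below follows that structure directly and compares the result with a direct DFT. The names `W`, `dft` and `fft_r22` are assumptions for illustration, and the input length is assumed to be a power of 4.

```python
import cmath

def W(N, e):
    """Twiddle factor W_N^e = exp(-j*2*pi*e/N)."""
    return cmath.exp(-2j * cmath.pi * e / N)

def dft(x):
    """Direct DFT used as the reference."""
    N = len(x)
    return [sum(x[n] * W(N, k * n) for n in range(N)) for k in range(N)]

def fft_r22(x):
    """Recursive radix-2^2 DIF FFT; len(x) is assumed to be a power of 4."""
    N = len(x)
    if N == 1:
        return list(x)
    Q = N // 4
    X = [0j] * N
    for k1 in (0, 1):
        for k2 in (0, 1):
            sub = []
            for n3 in range(Q):
                # First butterfly: B(n3) and B(n3 + N/4).
                b0 = x[n3] + (-1) ** k1 * x[n3 + 2 * Q]
                b1 = x[n3 + Q] + (-1) ** k1 * x[n3 + 3 * Q]
                # Second butterfly with the trivial factor (-j)^(k1 + 2*k2).
                h = b0 + (-1j) ** (k1 + 2 * k2) * b1
                # Full twiddle factor W_N^(n3 * (k1 + 2*k2)).
                sub.append(h * W(N, n3 * (k1 + 2 * k2)))
            Y = fft_r22(sub)  # N/4-point DFT over n3, indexed by k3
            for k3 in range(Q):
                X[k1 + 2 * k2 + 4 * k3] = Y[k3]
    return X

x = [complex(n % 5, -(n % 3)) for n in range(64)]
max_err = max(abs(a - b) for a, b in zip(dft(x), fft_r22(x)))
```

Only the multiplication by W_N^{n3(k1+2k2)} is a general complex multiplication; the (−1) and (−j) factors are sign changes and real/imaginary swaps.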
Butterfly with Decomposed Twiddle Factors
[Figure: radix-2² DIF signal-flow graph for N = 16. Inputs x(0) … x(15) in natural order; outputs in bit-reversed order X(0), X(8), X(4), X(12), X(2), X(10), X(6), X(14), X(1), X(9), X(5), X(13), X(3), X(11), X(7), X(15). Two butterfly columns with trivial −j multiplications feed four N/4-point DFTs for (k1, k2) = (0, 0), (0, 1), (1, 0), (1, 1). Twiddle factors shown: W0, W2, W4, W6; W0, W1, W2, W3; W0, W3, W6, W9.]
Relation Between Radix-4 & Radix-2²
• Radix-2² combines radix-4 with radix-2
[Figure: one BF4 butterfly shown alongside a BF2i followed by a BF2ii.]
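R2²SDF Pipeline FFT
• Example: N = 256
[Figure: R2²SDF pipeline from x[n] to X[k]. Four BF2i/BF2ii pairs with feedback delays 128, 64, 32, 16, 8, 4, 2, 1; control inputs s and t of each pair are driven by bits 7 down to 0 of a clk counter.]
[Figure: internal structure of BF2i and BF2ii. Inputs Xr[n], Xi[n], Xr[n+N/2], Xi[n+N/2]; outputs Zr[n], Zi[n], Zr[n+N/2], Zi[n+N/2]. BF2i multiplexers are selected by s; BF2ii adds multiplexers selected by t and t&s’ and subtractors on the swapped real/imaginary paths.]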
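The order of operations in the pipeline (BF2i, then BF2ii with its −j handling, then a twiddle multiplier, repeated on blocks a quarter of the size) can be modelled at block level. The Python sketch below is a data-flow model only, not a cycle-accurate model of the delay feedback or the s/t control; all function names are illustrative assumptions, and N is assumed to be a power of 4.

```python
import cmath

def W(N, e):
    """Twiddle factor W_N^e = exp(-j*2*pi*e/N)."""
    return cmath.exp(-2j * cmath.pi * e / N)

def dft(x):
    """Direct DFT used as the reference."""
    N = len(x)
    return [sum(x[n] * W(N, k * n) for n in range(N)) for k in range(N)]

def r22_stages(x):
    """Apply BF2i -> BF2ii -> twiddle per stage pair; output is bit-reversed."""
    N = len(x)
    a = [complex(v) for v in x]
    size = N
    while size >= 4:
        half, quarter = size // 2, size // 4
        for base in range(0, N, size):
            # BF2i: butterflies across a distance of size/2.
            for n in range(half):
                u, v = a[base + n], a[base + n + half]
                a[base + n], a[base + n + half] = u + v, u - v
            # BF2ii: butterflies across size/4; -j applied when k1 = 1.
            for k1 in (0, 1):
                off = base + k1 * half
                for n in range(quarter):
                    u, v = a[off + n], a[off + n + quarter]
                    if k1:
                        v *= -1j
                    a[off + n], a[off + n + quarter] = u + v, u - v
            # Twiddle multiplier: W_size^(n3 * (k1 + 2*k2)); all ones when size = 4.
            for k1 in (0, 1):
                for k2 in (0, 1):
                    off = base + k1 * half + k2 * quarter
                    for n3 in range(quarter):
                        a[off + n3] *= W(size, n3 * (k1 + 2 * k2))
        size = quarter
    return a

def bit_reverse(p, bits):
    """Reverse the lowest `bits` bits of p."""
    return int(format(p, "0{}b".format(bits))[::-1], 2)

x = [complex(n % 7, n % 3) for n in range(16)]
out = r22_stages(x)
bits = len(x).bit_length() - 1
natural = [out[bit_reverse(k, bits)] for k in range(len(x))]
max_err = max(abs(a - b) for a, b in zip(dft(x), natural))
```

In this model a twiddle multiplication is needed after every BF2ii except the last, which agrees with the log₄N − 1 multiplier count listed for R2²SDF.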
Comparison

Architecture | Multiplier # | Adder # | Memory size | Control
R2MDC | 2(log₄N − 1) | 4log₄N | 3N/2 − 2 | Simple
R2SDF | 2(log₄N − 1) | 4log₄N | N − 1 | Simple
R4SDF | log₄N − 1 | 8log₄N | N − 1 | Medium
R4MDC | 3(log₄N − 1) | 8log₄N | 5N/2 − 4 | Simple
R4SDC | log₄N − 1 | 3log₄N | 2N − 2 | Complex
R2²SDF | log₄N − 1 | 4log₄N | N − 1 | Simple
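Plugging a concrete size into the table makes the trade-off easier to read. The sketch below evaluates each row for N = 256 (so log₄N = 4); the function names and dictionary layout are illustrative assumptions.

```python
def log4(N):
    """Integer log base 4; N is assumed to be a power of 4."""
    v = 0
    while 4 ** v < N:
        v += 1
    assert 4 ** v == N, "N must be a power of 4"
    return v

def costs(N):
    """Return {architecture: (multipliers, adders, memory words)} from the table."""
    v = log4(N)
    return {
        "R2MDC": (2 * (v - 1), 4 * v, 3 * N // 2 - 2),
        "R2SDF": (2 * (v - 1), 4 * v, N - 1),
        "R4SDF": (v - 1, 8 * v, N - 1),
        "R4MDC": (3 * (v - 1), 8 * v, 5 * N // 2 - 4),
        "R4SDC": (v - 1, 3 * v, 2 * N - 2),
        "R2^2SDF": (v - 1, 4 * v, N - 1),
    }

table = costs(256)
```

For N = 256, R2²SDF needs 3 multipliers, 16 adders and 255 memory words: the multiplier count of R4SDF with the adder count of R2SDF.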
C/C++ Simulation
• Complex class
• BF2i, BF2ii
• DelayReg
• ComputeW
• DFT
• FFT4->FFT16->FFT64->FFT256->FFTn
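The slides only name the simulation components, so the following Python sketch is a guess at how a delay register and one feedback stage might interact, not the presenters' C/C++ code. `DelayReg` is modelled as a fixed-length FIFO, and `sdf_stage` is a hypothetical plain radix-2 delay-feedback stage without the BF2ii −j logic or a twiddle multiplier.

```python
from collections import deque

class DelayReg:
    """Fixed-length shift register modelling a feedback delay line."""

    def __init__(self, depth):
        self.buf = deque([0] * depth)

    def peek(self):
        """Value that will leave the register on the next shift."""
        return self.buf[0]

    def shift(self, value):
        """Push one value in and return the value that falls out."""
        out = self.buf.popleft()
        self.buf.append(value)
        return out

def sdf_stage(stream, depth):
    """One radix-2 single-path delay-feedback stage, one sample per cycle."""
    reg = DelayReg(depth)
    out = []
    for cycle, x in enumerate(stream):
        s = (cycle // depth) % 2  # control bit: 0 = fill, 1 = butterfly
        if s == 0:
            # Store the input; emit what was stored `depth` cycles ago.
            out.append(reg.shift(x))
        else:
            # Butterfly: emit the sum, feed the difference back into the register.
            delayed = reg.peek()
            out.append(delayed + x)
            reg.shift(delayed - x)
    return out

samples = [1, 2, 3, 4]
# Two trailing zeros flush the differences out of the register.
result = sdf_stage(samples + [0, 0], depth=2)
```

After a latency of `depth` cycles the stage emits the sums x[n] + x[n+2] followed by the differences x[n] − x[n+2], using a single register of `depth` words.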
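C/C++ Sim (cont.)
[Figure: verification flow. The same input feeds both DFT and FFTn, producing output and output2; the two results are subtracted and the difference is dumped to a file.]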
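Further Study
• R2³SDF
  • Proposed by S. He and M. Torkelson
$$n = \left\langle \tfrac{N}{2} n_1 + \tfrac{N}{4} n_2 + \tfrac{N}{8} n_3 + n_4 \right\rangle_N, \qquad n_1, n_2, n_3 \in \{0, 1\},\quad n_4 = 0 \sim \tfrac{N}{8} - 1$$
$$k = \left\langle k_1 + 2k_2 + 4k_3 + 8k_4 \right\rangle_N, \qquad k_1, k_2, k_3 \in \{0, 1\},\quad k_4 = 0 \sim \tfrac{N}{8} - 1$$
[Figure: R2³SDF pipeline from x[n] to X[k]. Eight butterfly stages (in extracted order: BF2i, BF2ii, BF2i, BF2i, BF2ii, BF2i, BF2i, BF2ii) with feedback delays 128, 64, 32, 16, 8, 4, 2, 1, controlled by bits 7 down to 0 of a clk counter.]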
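Further Study (cont.)
• R2⁴SDF
  • Proposed by J. Oh and M. Lim
[Figure: R2⁴SDF pipeline from x[n] to X[k]. Four BF2i/BF2ii pairs with feedback delays 128, 64, 32, 16, 8, 4, 2, 1, controlled by bits 7 down to 0 of a clk counter.]
$$n = \left\langle \tfrac{N}{2} n_1 + \tfrac{N}{4} n_2 + \tfrac{N}{8} n_3 + \tfrac{N}{16} n_4 + n_5 \right\rangle_N, \qquad n_1, n_2, n_3, n_4 \in \{0, 1\},\quad n_5 = 0 \sim \tfrac{N}{16} - 1$$
$$k = \left\langle k_1 + 2k_2 + 4k_3 + 8k_4 + 16k_5 \right\rangle_N, \qquad k_1, k_2, k_3, k_4 \in \{0, 1\},\quad k_5 = 0 \sim \tfrac{N}{16} - 1$$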
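CORDIC
• COordinate Rotation DIgital Computer
  • An iterative arithmetic algorithm introduced by Volder in 1956
  • Can handle many elementary functions, such as trigonometric, exponential, and logarithmic functions, with only shift-and-add arithmetic
• Rotation of (x, y) to (x', y') by an angle θ:
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$
[Figure: vector (x, y) rotated to (x', y'), with labels 1 to 4.]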
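A minimal floating-point sketch of CORDIC in rotation mode, assuming the counterclockwise rotation convention. The function name, the iteration count and the use of `2.0 ** -i` in place of hardware right shifts are illustrative assumptions; a fixed-point hardware version would use shifts, adds and a small table of arctangent constants.

```python
import math

def cordic_rotate(x, y, theta, iterations=32):
    """Rotate (x, y) by theta radians using CORDIC micro-rotations.

    Converges for |theta| up to about 1.74 rad, the sum of all atan(2^-i).
    """
    # Elementary angles atan(2^-i) and the accumulated gain of the micro-rotations.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    z = theta  # residual angle still to be rotated
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0
        # Multiplying by 2^-i stands in for a right shift by i bits.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    # Remove the constant gain so the result is a pure rotation.
    return x / gain, y / gain

xr, yr = cordic_rotate(1.0, 0.0, math.pi / 6)
```

Rotating (1, 0) by θ yields approximately (cos θ, sin θ), which is how CORDIC can stand in for a twiddle-factor multiplier.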
References
• S. He and M. Torkelson. “A new approach to pipeline FFT processor.” IEEE Proceedings of IPPS ’96.
• S. He and M. Torkelson. “Designing Pipeline FFT Processor for OFDM (de)Modulation.” ISSSE, pp. 257-262, Sept. 1998.
• J. Y. Oh and M. S. Lim. “New Radix-2 to the 4th Power Pipeline FFT Processor.” IEICE Trans. Electron., Vol. E88-C, No. 8, Aug. 2005.
• E. E. Swartzlander, W. K. W. Young, and S. J. Joseph. “A radix 4 delay commutator for fast Fourier transform processor implementation.” IEEE J. Solid-State Circuits, SC-19(5):702-709, Oct. 1984.
• C. D. Thompson. “Fourier transform in VLSI.” IEEE Trans. Comput., C-32(11):1047-1057, Nov. 1983.
• Y. Jung, Y. Tak, J. Kim, J. Park, D. Kim, and H. Park. “Efficient FFT Algorithm for OFDM Modulation.” Proceedings of IEEE Region 10 International Conference on Electrical and Electronic Technology, Vol. 2, pp. 676-678, 2001.
• A. M. Despain. “Very Fast Fourier Transform Algorithms Hardware for Implementation.” IEEE Trans. on Computers, Vol. C-28, No. 5, May 1979.
• A.-Y. Wu. “CORDIC.” Slides of Advanced VLSI.
• Y. H. Hu. “CORDIC-based VLSI architectures for digital signal processing.” IEEE Signal Processing Magazine, pp. 16-35, July 1992.
• J. G. Proakis and D. G. Manolakis. “Digital signal processing,” 3rd edition, Prentice Hall.
Thanks for Your Attention
Q & A ?