03/12/2010 1
Analysis of FPGA based Kalman Filter Architectures
Arvind Sudarsanam
Dissertation Defense
12 March 2010
03/12/2010 2
Outline
- Introduction
- Literature review
- PolyFSA architecture
- Architecture analysis
  - Area analysis
  - Error analysis
  - Performance analysis
- Contributions
- Future work
03/12/2010 6
Research overview
An FPGA-based polymorphic systolic array architecture is proposed to accelerate Kalman filters. Portions of this architecture can be reused for other applications at run time.
A comprehensive architecture analysis is presented. Results are presented in terms of area savings for varying performance and precision error.
03/12/2010 8
Hardware design for Kalman filters
- Systolic arrays: Yeh [7], M. Lu [8] and P. Rao [9] proposed systolic array architectures for Kalman filters based on the Faddeev algorithm.
- Cardoso et al. [11] proposed a hardware-software co-processor system. Profiling is used to guide partitioning by the designer, and the C2H tool [12] from Altera is used to generate RTL designs.
- But these architectures are not scalable.
- Some efforts [15-20] target individual linear algebra operations, like matrix inverse.
03/12/2010 9
Error analysis
Initial efforts [28-35] were targeted towards analyzing variable precision fixed-point arithmetic
Constantinides [36-45] proposed multiple ideas towards error analysis for fixed-point arithmetic
Availability of FPGAs has caused a surge in work towards developing variable precision architectures, especially in the floating point domain [46-53]
03/12/2010 10
Performance and area analysis
Existing performance and area estimation approaches target a parameter-specific architecture [72]. Parameters include:
- Overall data path width
- Memory size
- Number of processing elements
Proposed research is also parameter-specific, but looks at latency, precision and input rates of floating point arithmetic units
03/12/2010 11
Outline
- Introduction
- Literature review
- PolyFSA architecture
  - Application analysis
  - Mapping to systolic array
  - Architecture details
- Architecture analysis
- Contributions
- Future work
03/12/2010 13
Faddeev algorithm
The Faddeev algorithm is a method for efficiently computing the Schur complement $D - CA^{-1}B$.
Given matrices A, B, C and D, arrange them in a matrix M as:
$$M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$$
Reduce M to row echelon form, and $D - CA^{-1}B$ results in the lower right corner.
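As a rough illustration of the elimination step, the Python sketch below builds M from A, B, C and D, removes the C block by Gaussian elimination, and reads the Schur complement from the lower right corner. It is a plain software model and not the PolyFSA data flow. The block arrangement M = [[A, B], [C, D]], the function name `faddeev_schur`, the use of NumPy, and the pivoting restricted to the rows of A are all assumptions made for this example.

```python
import numpy as np


def faddeev_schur(A, B, C, D):
    """Return D - C A^{-1} B by row-reducing M = [[A, B], [C, D]]."""
    A, B, C, D = (np.asarray(X, dtype=float) for X in (A, B, C, D))
    n = A.shape[0]
    M = np.block([[A, B], [C, D]])
    for k in range(n):
        # Pivot only among the rows of A so the lower block rows stay in place.
        p = k + int(np.argmax(np.abs(M[k:n, k])))
        if M[p, k] == 0.0:
            raise ValueError("A is singular")
        if p != k:
            M[[k, p]] = M[[p, k]]
        # Subtract multiples of the pivot row from every row below it.
        factors = M[k + 1:, k] / M[k, k]
        M[k + 1:] -= np.outer(factors, M[k])
    # C has been annihilated; the lower right block is the Schur complement.
    return M[n:, n:]


# Small hypothetical inputs, chosen only to exercise the routine.
A = np.array([[4.0, 1.0], [2.0, 3.0]])
B = np.array([[1.0], [2.0]])
C = np.array([[1.0, 1.0]])
D = np.array([[5.0]])
S = faddeev_schur(A, B, C, D)
```

Choosing the blocks appropriately turns the same routine into other matrix operations. For example, B = I, C = -I and D = 0 yields the inverse of A.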
03/12/2010 16
Mapping to systolic array
Simplify data flow
Mapping to 1-D Systolic array
Folding to make systolic array scalable
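One way to picture the folding step: a logical 1-D array with more PEs than the device can hold is time-multiplexed onto a fixed number of physical PEs. The sketch below is only an assumed schedule in which logical PE i runs in pass i // P on physical PE i % P. The slides do not spell out the exact PolyFSA folding scheme, and `fold_schedule`, `num_passes`, `n_logical` and `n_physical` are hypothetical names.

```python
from math import ceil


def fold_schedule(n_logical: int, n_physical: int):
    """Map each logical PE index to a (pass, physical PE) slot."""
    if n_logical < 0 or n_physical <= 0:
        raise ValueError("need n_logical >= 0 and n_physical > 0")
    return [(i // n_physical, i % n_physical) for i in range(n_logical)]


def num_passes(n_logical: int, n_physical: int) -> int:
    """Number of passes over the physical array needed to cover all logical PEs."""
    return ceil(n_logical / n_physical)


# Five logical PEs folded onto two physical PEs (illustrative sizes only).
schedule = fold_schedule(5, 2)
passes = num_passes(5, 2)
```

Under this assumed schedule, fewer physical PEs lead to more passes, which is the area versus execution time trade-off that folding exposes.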
03/12/2010 19
Results
Target FPGA – Xilinx Virtex 4 SX35
Test case is derived from [Ronnback-2000]
Performance is compared against a software implementation on a Virtutech Simics PowerPC 750 simulator (Thanks: Rob Barnes [79])
03/12/2010 20
Performance of proposed PolyFSA
Overall execution time of EKF on PolyFSA based system architecture and PowerPC
Estimated execution of Faddeev algorithm for varying number of PEs and Faddeev Parameters
03/12/2010 22
Architecture analysis
At design time, each PE in the proposed PolyFSA is designed for the best performance and the highest precision.
QUESTION: By allowing for degradation in performance and/or tolerating precision error, can we reconfigure the existing PE with a set of smaller PEs?
03/12/2010 23
Design parameters that can be varied
- Precision of
  - Adder unit (madd)
  - Multiplier unit (mmul)
  - Divider unit (mdiv)
- Latency of
  - Adder unit (LatAdd)
  - Multiplier unit (LatMul)
  - Divider unit (LatDiv)
- Input rate of the divider (c_rate)
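To make these parameters concrete, the sketch below groups them in one configuration object and emulates reduced-precision arithmetic in software. It assumes that madd, mmul and mdiv are mantissa widths in bits, that latencies are counted in clock cycles, and that c_rate is an integer. None of this is stated on the slides, and `PEConfig` and `round_mantissa` are hypothetical names.

```python
import math
from dataclasses import dataclass


def round_mantissa(x: float, bits: int) -> float:
    """Round x to `bits` mantissa bits (assumed precision model)."""
    if x == 0.0 or not math.isfinite(x):
        return x
    mant, exp = math.frexp(x)  # x = mant * 2**exp with 0.5 <= |mant| < 1
    scale = 1 << bits
    return math.ldexp(round(mant * scale) / scale, exp)


@dataclass(frozen=True)
class PEConfig:
    madd: int     # adder precision (assumed: mantissa bits)
    mmul: int     # multiplier precision
    mdiv: int     # divider precision
    lat_add: int  # LatAdd (assumed: clock cycles)
    lat_mul: int  # LatMul
    lat_div: int  # LatDiv
    c_rate: int   # input rate of the divider

    # The latency and rate fields are not used by the arithmetic below;
    # they would feed a separate timing model.
    def add(self, a: float, b: float) -> float:
        return round_mantissa(a + b, self.madd)

    def mul(self, a: float, b: float) -> float:
        return round_mantissa(a * b, self.mmul)

    def div(self, a: float, b: float) -> float:
        return round_mantissa(a / b, self.mdiv)


# Illustrative values only, not taken from the slides.
small = PEConfig(madd=8, mmul=8, mdiv=8, lat_add=1, lat_mul=1, lat_div=1, c_rate=1)
approx = small.div(1.0, 3.0)
rel_error = abs(approx - 1.0 / 3.0) / (1.0 / 3.0)
```

With this rounding model the relative error of a single operation is bounded by 2**-bits, which gives a simple handle for sweeping precision against error.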
03/12/2010 33
Calculation of Tfaddeev
Execution time of Faddeev algorithm on the proposed PolyFSA is computed using a simulation model
We are interested in observing the impact of performance degradation on resource utilization
Results are shown for overall execution of EKF
03/12/2010 38
Summary
An FPGA-based Polymorphic Faddeev Systolic Array (PolyFSA) architecture is proposed to accelerate the compute-intensive kernels of Kalman filters.
A hierarchical analysis of the error introduced in the results of Kalman filter computations by reduced precision is presented.
A simulation model to estimate the overall execution time of the Kalman filter algorithm is proposed.
Results of the architecture analysis are presented in terms of Pareto curves.
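For readers unfamiliar with Pareto curves, the sketch below extracts the non-dominated points from a set of design points where both objectives are minimised. Treating each point as an (area, execution time) pair is an assumption for illustration, `pareto_front` is a hypothetical helper, and the numbers are made up rather than taken from the analysis.

```python
def pareto_front(points):
    """Return the points not dominated by any other (both objectives minimised)."""
    front = []
    # After sorting by the first objective, a point is non-dominated
    # only if it strictly improves the second objective seen so far.
    for p in sorted(set(points)):
        if not front or p[1] < front[-1][1]:
            front.append(p)
    return front


# Made-up (area, execution time) pairs for illustration only.
designs = [(100, 9.0), (80, 12.0), (120, 8.0), (90, 12.5), (110, 9.5)]
front = pareto_front(designs)
```

Each point that remains represents a design for which no other design is at least as good in both objectives and strictly better in one.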
03/12/2010 39
Future work
The proposed methodology (architecture design supported by analysis) can be applied to designs for other applications.
Design goals can be extended to incorporate power consumption.
Design parameters can be extended to include other options, such as implementation type and FPGA family type.