analysis of fpga based kalman filter architectures

03/12/2010 1

Analysis of FPGA based Kalman Filter Architectures

Arvind Sudarsanam

Dissertation Defense

12 March 2010

Outline

- Introduction
- Literature review
- PolyFSA architecture
- Architecture analysis
  - Area analysis
  - Error analysis
  - Performance analysis
- Contributions
- Future work

Kalman filters for Spacecraft navigation

Kalman filters

Research overview

An FPGA-based polymorphic systolic array architecture is proposed to accelerate Kalman filters. Portions of this architecture can be reused for other applications at run time.

A comprehensive architecture analysis is presented. Results are reported as area savings for varying levels of performance and precision error.

Hardware design for Kalman filters

- Systolic arrays: Yeh [7], M. Lu [8] and P. Rao [9] proposed systolic array architectures for Kalman filters based on the Faddeev algorithm.
- Cardoso et al. [11] proposed a hardware-software co-processor system. Profiling guides the partitioning done by the designer, and the C2H tool [12] from Altera is used to generate RTL designs.
- However, these architectures are not scalable.
- Some efforts [15-20] target individual linear algebra operations, such as matrix inversion.

Error analysis

Initial efforts [28-35] were targeted towards analyzing variable precision fixed-point arithmetic

Constantinides [36-45] proposed multiple ideas towards error analysis for fixed-point arithmetic

Availability of FPGAs has caused a surge in work towards developing variable precision architectures, especially in the floating point domain [46-53]

Performance and area analysis

Existing performance and area estimation approaches target a parameter-specific architecture [72]. Parameters include:

- Overall data path width
- Memory size
- Number of processing elements

Proposed research is also parameter-specific, but looks at latency, precision and input rates of floating point arithmetic units

Outline

- Introduction
- Literature review
- PolyFSA architecture
  - Application analysis
  - Mapping to Systolic array
  - Architecture details
- Architecture analysis
- Contributions
- Future work

Extended Kalman Filter
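
For reference, the measurement-update step of a Kalman filter can be sketched as below. This is the textbook form, not something taken from the slides, and the names `kalman_update`, `H` and `R` are illustrative. In an extended Kalman filter, `H` would be the Jacobian of the measurement function evaluated at the predicted state. Note that the covariance update has the shape D - C A^{-1} B, which is why a Schur-complement engine is a natural fit for this computation.

```python
import numpy as np

def kalman_update(x, P, z, H, R):
    """Textbook Kalman measurement update (illustrative sketch).

    x: predicted state (n,), P: predicted covariance (n, n),
    z: measurement (m,), H: measurement matrix or EKF Jacobian (m, n),
    R: measurement noise covariance (m, m).
    """
    S = H @ P @ H.T + R                  # innovation covariance
    # Gain K = P H^T S^{-1}; the transpose trick relies on P and S being symmetric.
    K = np.linalg.solve(S, H @ P).T
    x_new = x + K @ (z - H @ x)          # a linear measurement model is assumed here
    # Covariance update in Schur-complement form: D - C A^{-1} B
    # with A = S, B = H P, C = P H^T, D = P.
    P_new = P - (P @ H.T) @ np.linalg.solve(S, H @ P)
    return x_new, P_new
```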

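Faddeev algorithm

The Faddeev algorithm is a method for efficiently computing the Schur complement $D - CA^{-1}B$.

Given matrices $A$, $B$, $C$, $D$, arrange them in a matrix $M$ as:

$$M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$$

Reduce $M$ to row echelon form, eliminating $C$ with row operations on the rows of $A$; $D - CA^{-1}B$ then appears in the lower right corner.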
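
The elimination described above can be sketched in NumPy as follows. This is a minimal software illustration under stated assumptions, not the hardware algorithm from the dissertation: the function name `faddeev_schur` is hypothetical, and row pivoting is restricted to the rows of the `[A B]` block so that the lower rows are only ever modified by adding multiples of upper rows.

```python
import numpy as np

def faddeev_schur(A, B, C, D):
    """Return D - C A^{-1} B by eliminating the C block of M = [[A, B], [C, D]]."""
    n = A.shape[0]
    M = np.block([[A, B], [C, D]]).astype(float)
    rows = M.shape[0]
    for k in range(n):
        # Choose the pivot among the remaining rows of the [A B] block only.
        p = k + int(np.argmax(np.abs(M[k:n, k])))
        if M[p, k] == 0.0:
            raise ValueError("A is singular")
        M[[k, p]] = M[[p, k]]
        # Zero column k in every row below the pivot, including the [C D] rows.
        for i in range(k + 1, rows):
            factor = M[i, k] / M[k, k]
            M[i, k:] -= factor * M[k, k:]
    return M[n:, n:]

# Usage: compare against the direct formula on a small random problem.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)) + 3.0 * np.eye(3)
B = rng.standard_normal((3, 2))
C = rng.standard_normal((2, 3))
D = rng.standard_normal((2, 2))
schur = faddeev_schur(A, B, C, D)
reference = D - C @ np.linalg.solve(A, B)
```

Choosing `B`, `C` and `D` appropriately turns the same routine into a matrix inverse, a matrix product or a linear solve, which is what makes one array reusable across the steps of a Kalman filter.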

Faddeev algorithm

Faddeev algorithm – Single node

- Boundary node
- Internal node
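
The node diagrams from this slide are not preserved in the text. As a rough sketch of the usual split in Gaussian-elimination-based Faddeev arrays (an assumption, not the author's exact node definition), the boundary node performs the division that yields a row multiplier and each internal node performs one multiply-add. Pivoting is omitted and all function names are hypothetical.

```python
def boundary_node(x_in, pivot):
    """Boundary node: compute the multiplier that cancels the incoming element."""
    return -x_in / pivot

def internal_node(x_in, stored, factor):
    """Internal node: update one element of the incoming row with a multiply-add."""
    return x_in + factor * stored

def eliminate_row(row, pivot_row):
    """Pass one incoming row through a boundary node and a chain of internal nodes.

    The leading element of `row` is cancelled against the stored pivot row;
    the remaining updated elements are returned for the next stage.
    """
    factor = boundary_node(row[0], pivot_row[0])
    return [internal_node(x, s, factor) for x, s in zip(row[1:], pivot_row[1:])]

# Example: cancel the leading 4.0 against the pivot row [2.0, 1.0, 3.0].
print(eliminate_row([4.0, 6.0, 8.0], [2.0, 1.0, 3.0]))  # [4.0, 2.0]
```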

Mapping to systolic array

Simplify data flow

Mapping to 1-D Systolic array

Folding to make systolic array scalable
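
The slides do not spell out the folding scheme. One simple way to picture folding, assumed here purely for illustration, is to time-multiplex a fixed number of physical PEs over a larger number of logical PEs in successive passes. The helper `fold_schedule` below is hypothetical.

```python
import math

def fold_schedule(n_logical, n_physical):
    """Assign logical PEs to physical PEs in successive passes.

    Returns a list of passes; each pass lists the logical PE indices that
    are executed together on the available physical PEs.
    """
    if n_logical < 1 or n_physical < 1:
        raise ValueError("both counts must be positive")
    passes = math.ceil(n_logical / n_physical)
    return [
        list(range(p * n_physical, min((p + 1) * n_physical, n_logical)))
        for p in range(passes)
    ]

# Ten logical PEs folded onto four physical PEs take three passes.
print(fold_schedule(10, 4))  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Under this view the array size is decoupled from the problem size: fewer physical PEs cost more passes rather than limiting the matrix dimensions that can be handled.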

Architecture details for boundary PE

Details for internal PE are similar

Control flow

Results

Target FPGA – Xilinx Virtex 4 SX35

Test case is derived from [Ronnback-2000]

Performance is compared against a software implementation on a Virtutech Simics PowerPC 750 simulator (Thanks: Rob Barnes [79])

Performance of proposed PolyFSA

Overall execution time of EKF on PolyFSA based system architecture and PowerPC

Estimated execution of Faddeev algorithm for varying number of PEs and Faddeev Parameters

Architecture analysis

At design time, each PE in the proposed PolyFSA is derived for best performance and highest precision.

QUESTION: By allowing performance to degrade and/or tolerating precision error, can we reconfigure the existing PE as a set of smaller PEs?

Design parameters that can be varied

- Precision of
  - Adder unit (madd)
  - Multiplier unit (mmul)
  - Divider unit (mdiv)
- Latency of
  - Adder unit (LatAdd)
  - Multiplier unit (LatMul)
  - Divider unit (LatDiv)
- Input rate of the divider (c_rate)

Area analysis – Adder unit

Area analysis – Multiplier unit

Area analysis – Divider unit

Area analysis – Divider unit

Error analysis – Top-level flow

Faddeev algorithm - Error vs Precision
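
The error-versus-precision plot from this slide is not preserved. A software experiment of the same flavour can be sketched as below, assuming that reduced precision is modelled by rounding every intermediate result to a given number of mantissa bits. This models all units with a single precision, whereas the slides vary the adder, multiplier and divider precisions separately. The helpers `quantize`, `schur_reduced_precision` and `relative_error` are hypothetical and pivoting is omitted for brevity.

```python
import numpy as np

def quantize(x, mantissa_bits):
    """Round x to the given number of mantissa bits (exponent range unchanged)."""
    mant, exp = np.frexp(x)              # x = mant * 2**exp with 0.5 <= |mant| < 1
    scale = 2.0 ** mantissa_bits
    return np.ldexp(np.round(mant * scale) / scale, exp)

def schur_reduced_precision(A, B, C, D, bits):
    """Faddeev-style elimination with every intermediate result quantized."""
    n = A.shape[0]
    M = quantize(np.block([[A, B], [C, D]]).astype(float), bits)
    for k in range(n):                   # no pivoting: A must have nonzero pivots
        for i in range(k + 1, M.shape[0]):
            factor = quantize(M[i, k] / M[k, k], bits)
            product = quantize(factor * M[k, k:], bits)
            M[i, k:] = quantize(M[i, k:] - product, bits)
    # Only the lower right block is read; residue left in eliminated columns is ignored.
    return M[n:, n:]

def relative_error(A, B, C, D, bits):
    """Relative Frobenius-norm error against a double-precision reference."""
    exact = D - C @ np.linalg.solve(A, B)
    approx = schur_reduced_precision(A, B, C, D, bits)
    return np.linalg.norm(approx - exact) / np.linalg.norm(exact)

# Usage: sweep the mantissa width on a well-conditioned random problem.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)) + 8.0 * np.eye(4)
B = rng.standard_normal((4, 2))
C = rng.standard_normal((2, 4))
D = rng.standard_normal((2, 2))
errors = {bits: relative_error(A, B, C, D, bits) for bits in (8, 12, 16, 24)}
```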

Error analysis for EKF

EKF – Area Savings vs Error

Performance analysis

Major portion of execution time

Calculation of Tfaddeev

Execution time of Faddeev algorithm on the proposed PolyFSA is computed using a simulation model

We are interested in observing the impact of performance degradation on resource utilization

Results are shown for overall execution of EKF
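
The simulation model itself is not reproduced in the slides. As a much simpler first-order illustration of why unit latency and input rate affect execution time, the sketch below counts cycles for a stream of independent operations through one pipelined arithmetic unit. It assumes that c_rate means the number of cycles between successive divider inputs, which is an interpretation rather than something the slides state, and `pipelined_cycles` is a hypothetical helper.

```python
def pipelined_cycles(n_ops, latency, interval=1):
    """Cycles for n_ops independent operations through one pipelined unit.

    latency:  cycles from accepting an input to producing its result.
    interval: cycles between successive inputs (assumed meaning of c_rate).
    """
    if n_ops <= 0:
        return 0
    # The first result appears after `latency` cycles; each further
    # operation adds one input interval.
    return latency + (n_ops - 1) * interval

# 100 divisions with a 20-cycle divider:
print(pipelined_cycles(100, latency=20, interval=1))  # 119
print(pipelined_cycles(100, latency=20, interval=4))  # 416
```

For long streams the input interval dominates, while latency matters most when dependent operations force the pipeline to drain between rows.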

Performance analysis – Vary latency

Performance analysis – Vary c_rate

Area versus Performance

3-D Pareto curves
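
The curves from this slide are not preserved. Extracting a Pareto front from a set of candidate designs can be sketched as below, assuming each design is summarized as an (area, execution time, error) tuple and that all three are to be minimized. The design points in the example are made up for illustration and are not results from the dissertation.

```python
def pareto_front(points):
    """Return the points not dominated by any other point (all objectives minimized)."""
    def dominates(a, b):
        # a dominates b if it is no worse in every objective and better in at least one.
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (area, execution time, error) design points.
designs = [
    (100, 10, 1e-6),
    (80, 12, 1e-6),
    (80, 12, 1e-4),   # dominated: same area and time as the previous point, larger error
    (120, 10, 1e-6),  # dominated: same time and error as the first point, larger area
    (60, 20, 1e-3),
]
print(pareto_front(designs))
# [(100, 10, 1e-06), (80, 12, 1e-06), (60, 20, 0.001)]
```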

Summary

An FPGA-based Polymorphic Faddeev Systolic Array (PolyFSA) architecture is proposed to accelerate the compute-intensive kernels of Kalman filters.

A hierarchical analysis is presented of the error that reduced precision introduces into the results of Kalman filter computations.

A simulation model is proposed to estimate the overall execution time of the Kalman filter algorithm.

Results of the architecture analysis are presented in terms of Pareto curves.

Future work

The proposed methodology, architecture design supported by analysis, can be applied to designs for other applications.

Design goals can be extended to incorporate power consumption.

Design parameters can be extended to include other options, such as implementation type and FPGA family.
