Using FPGA in Embedded Devices

Posted on 16-Apr-2017

Category: Engineering

1

Using FPGA in Embedded Devices

Andriy Smolskyy, Consultant, Engineering
29.03.2017

2

What is FPGA?

3

History of Programmable Logic

• Transistor-Transistor Logic (TTL)
• Programmable Array Logic (PAL)
• Programmable Logic Device (PLD)
• Complex PLD (CPLD)
• FPGA
• ASIC

4

Digital Design with TTL Logic

1. Truth table
2. Karnaugh map
3. Logic expression
4. Final implementation
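The flow from truth table to logic expression can be checked in a few lines of C. The sketch below assumes a hypothetical 3-input majority function (the slide's own truth table is not reproduced in the transcript). It stores the truth table as an 8-bit mask and compares it with the sum-of-products expression that a Karnaugh map yields for this function, AB + AC + BC.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical example function: 3-input majority.
 * Bit i of the mask is the output for inputs (a b c) = binary i,
 * so rows 3 (011), 5 (101), 6 (110) and 7 (111) are set. */
static const uint8_t MAJORITY_TRUTH_TABLE = 0xE8;

/* Output read straight from the truth table. */
bool majority_from_table(bool a, bool b, bool c)
{
    unsigned row = ((unsigned)a << 2) | ((unsigned)b << 1) | (unsigned)c;
    return (MAJORITY_TRUTH_TABLE >> row) & 1u;
}

/* Minimized sum of products obtained from the Karnaugh map. */
bool majority_minimized(bool a, bool b, bool c)
{
    return (a && b) || (a && c) || (b && c);
}

/* Returns true when both forms agree on all 8 input combinations. */
bool forms_agree(void)
{
    for (unsigned row = 0; row < 8; row++) {
        bool a = (row >> 2) & 1u, b = (row >> 1) & 1u, c = row & 1u;
        if (majority_from_table(a, b, c) != majority_minimized(a, b, c))
            return false;
    }
    return true;
}
```

In a TTL design the final implementation step would map the three AND terms and the OR onto discrete gate packages.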

8

Programmable Array Logic (PAL)

Implementation
• Logic gates and registers are fixed
• Programmable sum-of-products array and output control

Advantages
• Fewer devices required
• Lower cost
• Power savings
• Simpler to test and debug
• Design security (prevents reverse engineering)
• In-system reprogrammability (in some cases)
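A PAL can be modelled in software as a programmable AND plane feeding a fixed OR gate. The sketch below is a simplified model under assumptions of our own (4 inputs, 4 product terms, one output, no registers). The type and function names are hypothetical and do not correspond to any vendor programming format.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAL_INPUTS 4
#define PAL_TERMS  4

/* One product term: which inputs are connected in true form and
 * which in inverted form (bit i refers to input i). */
typedef struct {
    uint8_t use_true;
    uint8_t use_inverted;
    bool    enabled;      /* unused terms are left out of the OR */
} pal_term;

typedef struct {
    pal_term terms[PAL_TERMS];   /* programmable AND plane */
} pal_device;

/* Evaluate the fixed OR of all programmed product terms. */
bool pal_eval(const pal_device *pal, uint8_t inputs)
{
    uint8_t mask = (1u << PAL_INPUTS) - 1u;
    inputs &= mask;
    for (int t = 0; t < PAL_TERMS; t++) {
        const pal_term *term = &pal->terms[t];
        if (!term->enabled)
            continue;
        bool true_ok = (inputs & term->use_true) == term->use_true;
        bool inv_ok  = (inputs & term->use_inverted) == 0;
        if (true_ok && inv_ok)
            return true;
    }
    return false;
}

/* "Program" the device as XOR of inputs 0 and 1:
 * (i0 AND NOT i1) OR (NOT i0 AND i1). */
pal_device pal_make_xor(void)
{
    pal_device pal = {0};
    pal.terms[0] = (pal_term){ .use_true = 0x1, .use_inverted = 0x2, .enabled = true };
    pal.terms[1] = (pal_term){ .use_true = 0x2, .use_inverted = 0x1, .enabled = true };
    return pal;
}
```

Changing the contents of `terms` changes the logic function while `pal_eval` stays the same, which mirrors the split between the fixed gates and the programmable array.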

9

From PAL to Programmable Logic Device (PLD)
• Arrange multiple PAL arrays in a single device

10

From PLD to Complex PLD (CPLD)

Implementation
• Combine multiple PLDs in a single device with programmable interconnect and I/O

Advantages
• Ample amounts of logic and advanced configurable I/Os
• Programmable routing
• Instant on
• Non-volatile configuration
• Reprogrammable

11

Interconnection Problem: Routing Takes Too Much Space

[Figure: global routing compared with row & column routing]

12

FPGA LUT and LAB
• Look-up table (LUT) inputs are mux select lines
• FPGA logic array blocks (LABs) are made up of logic elements (LEs) instead of product terms and macrocells
• Solves the interconnection problem
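The statement that LUT inputs act as mux select lines can be made concrete: a k-input LUT is a 2^k-bit configuration memory read through a multiplexer whose select lines are the logic inputs. A minimal C sketch, assuming a 4-input LUT (the slide does not state the size) and hypothetical names:

```c
#include <stdbool.h>
#include <stdint.h>

/* A 4-input LUT holds 16 configuration bits, one per input combination. */
typedef struct {
    uint16_t config;
} lut4;

/* The four inputs form the mux select value; the output is the selected bit. */
bool lut4_eval(lut4 lut, bool in3, bool in2, bool in1, bool in0)
{
    unsigned select = ((unsigned)in3 << 3) | ((unsigned)in2 << 2) |
                      ((unsigned)in1 << 1) |  (unsigned)in0;
    return (lut.config >> select) & 1u;
}

/* Build the configuration bits for any 4-input function by enumerating
 * all 16 input combinations. */
lut4 lut4_program(bool (*fn)(bool, bool, bool, bool))
{
    lut4 lut = { 0 };
    for (unsigned s = 0; s < 16; s++) {
        if (fn((s >> 3) & 1u, (s >> 2) & 1u, (s >> 1) & 1u, s & 1u))
            lut.config |= (uint16_t)(1u << s);
    }
    return lut;
}

/* Example function: 4-input XOR (odd parity). */
bool xor4(bool a, bool b, bool c, bool d)
{
    return a ^ b ^ c ^ d;
}
```

Calling `lut4_program(xor4)` yields the configuration word 0x6996. Any other 4-input function costs exactly the same hardware, which is why a LUT-based logic element does not depend on product terms.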

13

Field Programmable Gate Array (FPGA)

Implementation
• LABs arranged in an array
• Programmable interconnect
• Interconnect may span all or part of the array

Advantages
• Easier to create complex functions through LE cascading
• Integration of ready-made functions and IP blocks: PLLs, memory, arithmetic
• High density, high performance
• Fast programming

14

FPGA vs ASIC

FPGA
• Pros:
- Fast time to market: easy to develop a new device with specific logic or interfaces
- Easy to upgrade device logic and fix bugs in hardware
- Suited to specialized devices: reconfigurable DSP, digital filters
• Cons:
- Needs to be programmed at every power-on
- It is hard to achieve 100% device utilization

ASIC
• Pros:
- Higher performance: consumes less power and can run at higher clock speeds
- Cheaper in mass production
- No configuration at power-on required
- Smaller chip size
• Cons:
- Additional expenses in design preparation
- Impossible to fix hardware bugs after fabrication

15

Software and hardware development aspects

16

System on Chip (SoC) + FPGA

17

Software and hardware development aspects

• In general, FPGA-generated controllers are similar to a microcontroller's peripheral devices
• The FPGA must be programmed at every start, so its controllers might not be ready at system start
• Take care with DMA, MMU, virtual memory and caching operations
• In some designs the FPGA can control the CPU's peripheral devices
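From the CPU side, an FPGA-generated controller is usually driven through memory-mapped registers, much like an on-chip peripheral. The sketch below illustrates two of the points above: waiting for a ready flag before first use, and using `volatile` accesses so the compiler does not keep register values in CPU registers. The register layout, bit names and function names are hypothetical. On real hardware the pointer would come from the platform's address map, and DMA buffers would additionally need cache maintenance.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register block of an FPGA-implemented controller. */
typedef struct {
    volatile uint32_t status;   /* bit 0: controller is configured and ready */
    volatile uint32_t control;  /* bit 0: start */
    volatile uint32_t data;
} fpga_regs;

#define FPGA_STATUS_READY  0x1u
#define FPGA_CONTROL_START 0x1u

/* Poll the ready flag a bounded number of times instead of assuming
 * the FPGA finished configuration before the software started. */
bool fpga_wait_ready(const fpga_regs *regs, unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls; i++) {
        if (regs->status & FPGA_STATUS_READY)
            return true;
    }
    return false;
}

/* Write one word and start the controller; fails if it is not ready. */
bool fpga_send_word(fpga_regs *regs, uint32_t word, unsigned max_polls)
{
    if (!fpga_wait_ready(regs, max_polls))
        return false;
    regs->data = word;
    regs->control = FPGA_CONTROL_START;
    return true;
}
```

Because the functions take a pointer to the register block, they can be exercised against an ordinary struct in a host-side test before the FPGA image exists.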

18

FPGA design development

• Verilog
• VHDL
• Visual development

19

FPGA design development

• Core IP
- SDRAM controllers
- Ethernet PHY, custom transceiver PHY
- PCIe PHY
- SDI, DisplayPort
• Megafunctions
- PLL
- I/O
- Custom logic blocks

20

High speed data processing: OpenCL in FPGA

21

A simple CPU

22

Load immediate value into register

23

Load memory value into register

24

Store register value into memory

25

Add two registers, store result in register

26

A simple program

Mem[100] += 42 * Mem[101]

CPU instructions:

    R0 ← Load Mem[100]
    R1 ← Load Mem[101]
    R2 ← Load #42
    R2 ← Mul R1, R2
    R0 ← Add R2, R0
    Store R0 → Mem[100]
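The instruction sequence can be traced with a small C model of the CPU. This is a sketch under assumptions: a four-register machine with word-addressed memory, and opcode names (`OP_LOAD_MEM` and so on) invented for the illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { OP_LOAD_IMM, OP_LOAD_MEM, OP_STORE, OP_ADD, OP_MUL } opcode;

/* dst/src1/src2 are register numbers; value is an immediate or an address. */
typedef struct {
    opcode  op;
    int     dst, src1, src2;
    int32_t value;
} instruction;

/* Fetch, decode and execute each instruction in turn. */
void cpu_run(const instruction *program, size_t count, int32_t *mem)
{
    int32_t reg[4] = { 0 };
    for (size_t pc = 0; pc < count; pc++) {
        const instruction *in = &program[pc];
        switch (in->op) {
        case OP_LOAD_IMM: reg[in->dst] = in->value;                     break;
        case OP_LOAD_MEM: reg[in->dst] = mem[in->value];                break;
        case OP_STORE:    mem[in->value] = reg[in->src1];               break;
        case OP_ADD:      reg[in->dst] = reg[in->src1] + reg[in->src2]; break;
        case OP_MUL:      reg[in->dst] = reg[in->src1] * reg[in->src2]; break;
        }
    }
}

/* Mem[100] += 42 * Mem[101], as on the slide. */
static const instruction SIMPLE_PROGRAM[] = {
    { OP_LOAD_MEM, 0, 0, 0, 100 },  /* R0 <- Load Mem[100]  */
    { OP_LOAD_MEM, 1, 0, 0, 101 },  /* R1 <- Load Mem[101]  */
    { OP_LOAD_IMM, 2, 0, 0, 42  },  /* R2 <- Load #42       */
    { OP_MUL,      2, 1, 2, 0   },  /* R2 <- Mul R1, R2     */
    { OP_ADD,      0, 2, 0, 0   },  /* R0 <- Add R2, R0     */
    { OP_STORE,    0, 0, 0, 100 },  /* Store R0 -> Mem[100] */
};

void run_simple_program(int32_t *mem)
{
    cpu_run(SIMPLE_PROGRAM, sizeof SIMPLE_PROGRAM / sizeof SIMPLE_PROGRAM[0], mem);
}
```

Each pass through the `switch` corresponds to one time step of the single CPU: the same hardware is reused for every instruction.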

27

Single CPU activity, step by step

Time

28

Unroll the CPU hardware…

Space

29

… and specialize by position
1. Instructions are fixed. Remove “Fetch”
2. Remove unused ALU operations
3. Remove unused Load / Store
4. Wire up registers properly, and propagate state
5. Remove dead data
6. Reschedule!
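After these steps, what is left of the general-purpose CPU for the program Mem[100] += 42 * Mem[101] is two loads, one constant multiplier, one adder and one store. In C terms the specialized datapath collapses to a single expression. The function name is hypothetical and the 32-bit word size is an assumption.

```c
#include <stdint.h>

/* Specialized datapath for Mem[100] += 42 * Mem[101]:
 * no instruction fetch, no register file, no unused ALU operations.
 * The two loads are independent, so hardware can issue them in parallel. */
void datapath_mul_add(int32_t *mem)
{
    int32_t a = mem[100];          /* load unit 1 */
    int32_t b = mem[101];          /* load unit 2 */
    mem[100] = a + 42 * b;         /* constant multiplier, adder, store unit */
}
```

The local variables stand for wires between the units; they carry a value from one stage to the next and need no addressable register file.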

35

FPGA datapath = Your algorithm, in silicon
• Build exactly what you need:
- Operations
- Data widths
- Memory size, configuration
• Efficiency:
- Throughput
- Latency
- Power

36

OpenCL on FPGA
• Host + accelerator programming model
• Sequential host program on a microprocessor
• Function offload onto a highly parallel accelerator device

Host code (user application):

    main() {
        read_data( … );
        manipulate( … );
        clEnqueueWriteBuffer( … );
        clEnqueueNDRangeKernel( …, sum, … );
        clEnqueueReadBuffer( … );
        display_result( … );
    }

Kernel (algorithm, compiled into the FPGA design):

    __kernel void sum(__global float *a, __global float *b, __global float *y)
    {
        int gid = get_global_id(0);
        y[gid] = a[gid] + b[gid];
    }
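For readers without an OpenCL toolchain, the behaviour of the `sum` kernel can be emulated in plain C: the runtime executes the kernel body once per work-item, and `get_global_id(0)` returns that work-item's index. The helper `run_sum_ndrange` below is hypothetical and is not part of the OpenCL API. It only mirrors the semantics; on an FPGA the work-items would flow through a pipelined datapath and would not run as a sequential loop.

```c
#include <stddef.h>

/* Body of the `sum` kernel for one work-item; gid stands in for
 * the value get_global_id(0) would return. */
static void sum_work_item(size_t gid, const float *a, const float *b, float *y)
{
    y[gid] = a[gid] + b[gid];
}

/* Sequential stand-in for a 1-D NDRange launch of global_size work-items. */
void run_sum_ndrange(size_t global_size, const float *a, const float *b, float *y)
{
    for (size_t gid = 0; gid < global_size; gid++)
        sum_work_item(gid, a, b, y);
}
```

Since no work-item reads another's output, the iterations can run in any order or all at once, which is what makes the kernel suitable for offload.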

37

Loop Pipelining
• Analyze any dependencies between iterations
• Schedule these operations
• Launch the next iteration as soon as possible

    float array[M];

    for (int i = 0; i < n * numSets; i++) {
        for (int j = 0; j < M - 1; j++)
            array[j] = array[j + 1];
        array[M - 1] = a[i];
        /* At this point, we can launch the next iteration */

        for (int j = 0; j < M; j++)
            answer[i] += array[j] * coefs[j];
    }

38

Loop Pipelining Example

[Figure: execution timeline with no loop pipelining compared with loop pipelining]

Looks almost like parallel thread execution
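The benefit can be estimated with a common back-of-the-envelope model. If one iteration takes `latency` cycles and a new iteration can start every `ii` cycles (the initiation interval), then N iterations take N × latency cycles without pipelining and latency + (N − 1) × ii cycles with it. The function names and the example numbers below are illustrative assumptions, not measurements from the slides.

```c
#include <stdint.h>

/* Each iteration starts only after the previous one has finished. */
uint64_t cycles_sequential(uint64_t iterations, uint64_t latency)
{
    return iterations * latency;
}

/* A new iteration starts every `ii` cycles; the last one still needs
 * `latency` cycles to drain through the pipeline. */
uint64_t cycles_pipelined(uint64_t iterations, uint64_t latency, uint64_t ii)
{
    if (iterations == 0)
        return 0;
    return latency + (iterations - 1) * ii;
}
```

With 100 iterations, a latency of 10 cycles and an initiation interval of 1, the model gives 1000 cycles without pipelining and 109 with it.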

39

Digital Filter

[Figure: 8-tap FIR filter. The input x(n) passes through seven z^-1 delay elements; the eight taps are multiplied by the coefficients C0 … C7 and summed to produce y(n).]
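The diagram corresponds to an 8-tap FIR filter, y(n) = C0·x(n) + C1·x(n−1) + … + C7·x(n−7). A minimal C sketch follows; the struct and function names are hypothetical, and pairing C0 with the newest sample is an assumption about the figure.

```c
#include <stddef.h>

#define FIR_TAPS 8

typedef struct {
    float delay[FIR_TAPS];   /* delay[0] = x(n), delay[k] = x(n-k) */
    float coef[FIR_TAPS];    /* C0 .. C7 */
} fir8;

/* Reset the delay line and load the coefficients. */
void fir8_init(fir8 *f, const float coef[FIR_TAPS])
{
    for (size_t k = 0; k < FIR_TAPS; k++) {
        f->delay[k] = 0.0f;
        f->coef[k] = coef[k];
    }
}

/* Push one input sample and return the next output sample. */
float fir8_step(fir8 *f, float x)
{
    /* Shift the delay line: each z^-1 element takes its neighbour's value. */
    for (size_t k = FIR_TAPS - 1; k > 0; k--)
        f->delay[k] = f->delay[k - 1];
    f->delay[0] = x;

    /* Multiply every tap by its coefficient and sum. */
    float y = 0.0f;
    for (size_t k = 0; k < FIR_TAPS; k++)
        y += f->coef[k] * f->delay[k];
    return y;
}
```

Feeding a single 1.0 followed by zeros returns the coefficients in order, which is a quick way to check the wiring. In an FPGA both loops can be fully unrolled, so the shift becomes a chain of registers and the eight multiplications run in parallel.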

40

FPGA in Embedded Devices
• Q&A

41

Thank you

Andriy Smolskyy
Consultant, Engineering
andriy.smolskyy@globallogic.com
+380-67-701-8637
