FPL'2014 - FlexTiles Workshop - 3 - FlexTiles DSP Accelerators


Upload: flextiles-team

Post on 19-Jun-2015


DESCRIPTION

Slides presented at the FlexTiles Workshop at FPL'2014. Presentation #3: FlexTiles DSP Accelerators. FlexTiles is a heterogeneous many-core platform, reconfigurable at run-time, developed within an FP7 project.

TRANSCRIPT

Page 1: FPL'2014 - FlexTiles Workshop - 3 - FlexTiles DSP Accelerators

www.flextiles.eu

FlexTiles

Workshop at FPL’2014 conference: FlexTiles FP7 project

Low-Power DSP Accelerator Embedded in a Heterogeneous Many-Core Architecture

Marc MORGAN

CSEM – Swiss Center for Electronics and Microtechnology

Page 2: FPL'2014 - FlexTiles Workshop - 3 - FlexTiles DSP Accelerators

The information contained in this document and any attachments are the property of FlexTiles consortium. You are hereby notified that any review, dissemination, distribution, copying or otherwise use of this document must be done in accordance with the CA of the project (TRT/DJ/624412785.2011).

Template version 1.0

CSEM overview on a single slide

• private company, founded in the 1980s, not for profit

• approx. 450 employees on 5 sites in Switzerland (HQ in Neuchâtel) and a site in Brazil

• 5 research programs:

1. ultra-low power integrated systems (SoC, Vision, Wireless)

2. systems engineering (med tech, instrumentation, automation)

3. MEMS

4. surface engineering (nano, bio, printable electronics)

5. photovoltaic

• approx. 70 MCHF annual budget

• over 20 start-ups and spin-offs since 1995

Page 3: FPL'2014 - FlexTiles Workshop - 3 - FlexTiles DSP Accelerators


Many-core architecture: GPPs + accelerators

An array of general purpose processors (GPP)

Connected via a Network-on-Chip (NoC)

Complemented with accelerators to optimize speed and power:

DSP processors or specialized logic implemented in embedded-FPGA

Plus memory nodes and I/O

Page 4: FPL'2014 - FlexTiles Workshop - 3 - FlexTiles DSP Accelerators


Many-core architecture: GPPs + accelerators (cont’d)

Several IPs are available for the building blocks, both in the consortium and on the market

Architectural choices attempt to keep the platform generic

CSEM provides an ultra-low power DSP processor for the DSP accelerator

It plugs into a generic accelerator interface (AI)

Page 5: FPL'2014 - FlexTiles Workshop - 3 - FlexTiles DSP Accelerators


Accelerator interface (AI)

Interfaces the NoC’s network interface (NI) to the accelerator by providing services:

programming, control/status, data in, data out, debug

DMA access, word FIFOs, notification

Page 6: FPL'2014 - FlexTiles Workshop - 3 - FlexTiles DSP Accelerators


DSP accelerator architecture

Choices for the DSP accelerator avoid DSP-specific features

the DSP will not run an OS or kernel

the DSP will not use (or at least not require) interrupts

Note: CSEM’s icyflex4 ULP DSP could support both of the above

Implement a FIFO manager to handle input and output tokens from/to the accelerator interface (AI)

Implement debug and tracing facilities

Debug: JTAG 1149.1 TAP

Tracing: programmable tracing unit
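
Without an OS or interrupts, the accelerator program can be organised as a plain polling loop over its token FIFOs. The following is only a sketch of that idea in Python: the names `TokenFifo` and `run_kernel`, the FIFO depths and the doubling kernel are assumptions made for illustration and are not taken from the FlexTiles design.

```python
from collections import deque


class TokenFifo:
    """Bounded FIFO of tokens (hypothetical model, not the FlexTiles AI)."""

    def __init__(self, depth):
        self.depth = depth
        self._items = deque()

    def can_push(self):
        return len(self._items) < self.depth

    def can_pop(self):
        return len(self._items) > 0

    def push(self, token):
        if not self.can_push():
            raise OverflowError("FIFO full")
        self._items.append(token)

    def pop(self):
        if not self.can_pop():
            raise IndexError("FIFO empty")
        return self._items.popleft()


def run_kernel(fifo_in, fifo_out, kernel, max_polls):
    """Poll the FIFOs instead of waiting for an interrupt.

    Each poll moves at most one token; returns the number of tokens processed.
    """
    processed = 0
    for _ in range(max_polls):
        if fifo_in.can_pop() and fifo_out.can_push():
            fifo_out.push(kernel(fifo_in.pop()))
            processed += 1
    return processed


# Three input tokens, but room for only two results (assumed depths).
fifo_in, fifo_out = TokenFifo(depth=4), TokenFifo(depth=2)
for sample in (1, 2, 3):
    fifo_in.push(sample)
count = run_kernel(fifo_in, fifo_out, kernel=lambda x: 2 * x, max_polls=10)
```

In this run the loop processes two tokens and then stalls because the output FIFO is full, which is the back-pressure behaviour one would expect from a FIFO-based token interface.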

Page 7: FPL'2014 - FlexTiles Workshop - 3 - FlexTiles DSP Accelerators


DSP accelerator architecture (cont’d)

Page 8: FPL'2014 - FlexTiles Workshop - 3 - FlexTiles DSP Accelerators


Management of the DSP accelerator

Each accelerator is managed by software running on GPPs

virtualization manager: attribution of the accelerator

resource manager: control of the accelerator

These managers are in charge of:

transfer of the application (ELF) to the accelerator

signaling the accelerator when to start and when to stop

recovering statistics on usage of the accelerator to optimize the execution of the application on the many-core platform

The tracing unit can be managed from the processor or from the JTAG interface
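
The manager duties listed on this slide (transfer the application, start and stop the accelerator, recover usage statistics) can be pictured as a small control object running on a GPP. This is a toy sketch under stated assumptions: `AcceleratorModel`, `ResourceManager` and all method names are hypothetical, and the "ELF" is reduced to its four magic bytes.

```python
from dataclasses import dataclass


@dataclass
class AcceleratorModel:
    """Toy stand-in for one DSP accelerator (hypothetical, for illustration)."""
    program: bytes = b""
    running: bool = False
    busy_cycles: int = 0

    def tick(self, cycles):
        # Only a started accelerator accumulates usage.
        if self.running:
            self.busy_cycles += cycles


class ResourceManager:
    """Runs on a GPP and controls one accelerator (names are assumptions)."""

    def __init__(self, accelerator):
        self.acc = accelerator

    def load(self, elf_image):
        if self.acc.running:
            raise RuntimeError("stop the accelerator before loading")
        self.acc.program = bytes(elf_image)

    def start(self):
        if not self.acc.program:
            raise RuntimeError("no application loaded")
        self.acc.running = True

    def stop(self):
        self.acc.running = False

    def usage(self, elapsed_cycles):
        # Fraction of the elapsed cycles during which the accelerator ran.
        return self.acc.busy_cycles / elapsed_cycles


acc = AcceleratorModel()
mgr = ResourceManager(acc)
mgr.load(b"\x7fELF")   # placeholder image: just the ELF magic bytes
mgr.start()
acc.tick(75)           # runs for 75 cycles
mgr.stop()
acc.tick(25)           # idle for 25 cycles
utilization = mgr.usage(elapsed_cycles=100)
```

A utilization figure of this kind is the sort of statistic a manager could feed back into mapping decisions on the many-core platform.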

Page 9: FPL'2014 - FlexTiles Workshop - 3 - FlexTiles DSP Accelerators


Tool flow and Model of Computation

[Figure: tool-flow diagram. Labels recovered from the slide:]

• Application (C code)

• C to SpearDE representation: conversion (Thales)

• Graphic input (manual) + C kernels

• Data parallelisation, mapping (Thales)

• Streaming optimisation (ACE)

• Compilation & link (ACE)

• Acc compiler or C2VHDL tools (CSEM / UR1 / RUB)

• Binaries

• Architecture representation: master cores (GPP), slave cores (eFPGA, DSP); masters control slaves

• Library of IPs

• Architecture configuration GUI (KIT)

Page 10: FPL'2014 - FlexTiles Workshop - 3 - FlexTiles DSP Accelerators


icyflex software development kit

GNU C compiler (gcc) v 4.6.3

icyflex instruction parallelism supported by latest releases of gcc

libc and libm from RedHat’s NewLib

software implementation of IEEE floating-point standard

GNU assembler / linker (binutils), v 2.20

BFD / ELF32 object file format

Binary, SREC, IHEX memory image file formats

GNU debugger (gdb), v 6.7.1

Mode 1: instruction set simulator of the icyflex core

Mode 2: On-Chip Debug (OCD) through a JTAG interface

icyflex instruction set simulator (ISS), written in C++

Phase-accurate, pipelined

Wrappers to SystemC, VHDL (Modelsim), Matlab/Simulink

Eclipse integrated development environment, v Helios

CDT C/C++ IDE plug-in

icyflex plug-in

[Figure: tool chain. Labels recovered from the slide: .c, gcc, .o, ld, .exe, gdb (shown twice), .log]
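
The SDK lists SREC among its memory image formats. As a small illustration of what such an image line contains, the sketch below recomputes the checksum of one S-record, following the commonly documented Motorola S-record rule (ones' complement of the low byte of the sum of the count, address and data bytes). The function name and the sample record are illustrative and do not come from the icyflex tools.

```python
def srec_checksum(record):
    """Recompute the checksum byte of one S-record line.

    The checksum is the ones' complement of the low byte of the sum of the
    byte-count, address and data bytes; the stored checksum is ignored.
    """
    body = bytes.fromhex(record[2:])   # skip "S" and the record-type digit
    count = body[0]                    # bytes that follow, checksum included
    payload = body[:count]             # count byte + address + data
    return (~sum(payload)) & 0xFF


# S1 record: count 0x13, address 0x7AF0, 16 data bytes, checksum 0x61.
record = "S1137AF0" + "0A0A0D" + "00" * 13 + "61"
stored = int(record[-2:], 16)
computed = srec_checksum(record)
```

Comparing `computed` with `stored` is a quick sanity check before loading an image into a simulator or target.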

Page 11: FPL'2014 - FlexTiles Workshop - 3 - FlexTiles DSP Accelerators


icyflex family of ultra-low power processors

[Figure: icyflex cores positioned by application (Control to DSP) and computing power. Labels recovered from the slide: icyflex2, icyflex1, icyflex4; 1 MUL, 2 MAC, 4 MAC … 36 MAC, 12 MAC; 6 µW/MHz, 25 µW/MHz, 10-150 µW/MHz]

Power indicated for TSMC 65 nm CMOS
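
The µW/MHz figures on this slide turn into a power estimate by multiplying by the clock frequency. The short sketch below does that arithmetic; the 50 MHz clock is a hypothetical operating point chosen for illustration, and the linear scaling assumes the supply voltage stays fixed.

```python
def core_power_uw(uw_per_mhz, clock_mhz):
    """Power in µW for a core rated in µW/MHz, at a clock given in MHz."""
    return uw_per_mhz * clock_mhz


ASSUMED_CLOCK_MHZ = 50  # hypothetical clock, not a figure from the slides

# Lowest figure quoted on the slide: 6 µW/MHz.
power_low_uw = core_power_uw(6, ASSUMED_CLOCK_MHZ)

# Quoted range of 10-150 µW/MHz, evaluated at both ends.
power_range_uw = (
    core_power_uw(10, ASSUMED_CLOCK_MHZ),
    core_power_uw(150, ASSUMED_CLOCK_MHZ),
)
```

At the assumed clock this gives 300 µW for the 6 µW/MHz figure and 500 µW to 7.5 mW for the 10-150 µW/MHz range.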

Page 12: FPL'2014 - FlexTiles Workshop - 3 - FlexTiles DSP Accelerators


icyflex2 vs icyflex4

Feature                                      icyflex2              icyflex4 (VPS=2)

Optimized for                                Control               DSP

P, X, Y memory buses, ISA, HW loops, saturation, …

Instruction word [bits]                      32 (1 or 2 sub)       64 (1, 2 or 3 sub)

Memory access [bits]                         8, 16 or 32           2x (8, 16, 32, 64, 128)

Data processing [bits]                       16 or 32, trunc       2x (16 or 32 or 64), full

Single Instr. Multiple Data (SIMD)           No                    Yes, up to 8 MAC

Instruction set is reconfigurable on the fly No                    Yes

Software Development Kit (SDK)               GNU-based tool suite (gcc, gdb) + cycle-accurate instruction set simulator (ISS)

Hardware Devt Kit (HDK)                      FPGA-based, customizable

VPS = Vector Processing Slices in the Vector Processing Unit of the DSP

Page 13: FPL'2014 - FlexTiles Workshop - 3 - FlexTiles DSP Accelerators


icyflex: reconfigurable instructions and addressing modes

[Figure: instruction set and datapath. Labels recovered from the slide: instruction set with ADD, MUL, SHR, MAC, JUMP and two "configurable" entries (blank instructions configured at run-time); instruction decoding; two groups of datapath units, each with SHIFT, MUX, ALU, ACC, ACC]

cycle N: config MOP

cycle N+1: use MOP
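
The configure-then-use sequence on this slide can be pictured with a toy decoder that has fixed opcodes plus blank slots bound at run time. This is only a sketch: the opcode names ADD, MUL and SHR come from the slide, while the class `ToyCore`, the slot name `CFG0` and the one-cycle cost model are assumptions, and the slide does not expand the abbreviation MOP.

```python
import operator


class ToyCore:
    """Toy decoder with fixed opcodes plus run-time configurable slots.

    Illustrative only: the slot mechanism and the cycle accounting are
    assumptions, not a description of the icyflex hardware.
    """

    FIXED = {"ADD": operator.add, "MUL": operator.mul, "SHR": operator.rshift}

    def __init__(self):
        self.cycle = 0
        self.slots = {}  # slot name -> operation bound at run time

    def configure(self, slot, function):
        # Takes one cycle (cycle N); the slot is usable from cycle N+1.
        self.slots[slot] = function
        self.cycle += 1

    def execute(self, opcode, a, b):
        # Fixed opcodes are always available; blank slots must be configured.
        if opcode in self.FIXED:
            function = self.FIXED[opcode]
        elif opcode in self.slots:
            function = self.slots[opcode]
        else:
            raise KeyError(f"opcode {opcode!r} is not configured")
        self.cycle += 1
        return function(a, b)


core = ToyCore()
core.configure("CFG0", lambda a, b: abs(a - b))  # cycle N: configure the slot
result = core.execute("CFG0", 3, 10)             # cycle N+1: use it
```

Here the blank slot is bound to an absolute-difference operation, so `result` is 7 after two cycles; rebinding `CFG0` later would change the operation without touching the fixed opcodes.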

Page 14: FPL'2014 - FlexTiles Workshop - 3 - FlexTiles DSP Accelerators


DSP in FlexTiles emulators

Emulator 1 (software):

Using Open Virtual Platform (OVP)

Not cycle accurate

The icyflex4 DSP is emulated by a GPP running at a higher frequency

Emulator 2 (hardware):

Using an FPGA board with two Xilinx Virtex6 FPGAs

Uses a DFF version of the DSP accelerator

Page 15: FPL'2014 - FlexTiles Workshop - 3 - FlexTiles DSP Accelerators


Exploitation of FlexTiles results at CSEM

CSEM specializes in low power solutions

A well-balanced multi-processor design can optimize energy consumption by reducing voltage and frequency

For multi-core: we offer CSEM solutions

For many-core: CSEM collaborates with one or more of its partners, including e.g. a follow-up project to produce FlexTiles chips
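
The claim that a multi-processor design can save energy by lowering voltage and frequency follows from the usual first-order model of dynamic CMOS power, P = C · V² · f. The sketch below compares one fast core with two slower cores at reduced voltage. All operating points are hypothetical, and the comparison assumes a perfectly parallel workload and ignores leakage.

```python
def dynamic_power(c_eff, voltage, freq_mhz):
    """First-order dynamic CMOS power, P = C * V^2 * f, in arbitrary units."""
    return c_eff * voltage ** 2 * freq_mhz


C_EFF = 1.0  # hypothetical effective switched capacitance (arbitrary units)

# Hypothetical operating points with equal total throughput:
# one core at 200 MHz versus two cores at 100 MHz each.
single_core = dynamic_power(C_EFF, voltage=1.2, freq_mhz=200)
dual_core = 2 * dynamic_power(C_EFF, voltage=0.9, freq_mhz=100)

ratio = dual_core / single_core
```

With these assumed numbers the two-core configuration draws a little over half the dynamic power of the single fast core, because the voltage reduction enters quadratically.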

Page 16: FPL'2014 - FlexTiles Workshop - 3 - FlexTiles DSP Accelerators


FlexTiles FP7 project

For more information regarding the FlexTiles project, visit:

http://www.flextiles.eu

Please take 5 minutes to fill out the survey on the project web site under the Contact menu

The FlexTiles project is funded in part by FP7, the seventh framework programme of the European Commission.

Page 17: FPL'2014 - FlexTiles Workshop - 3 - FlexTiles DSP Accelerators

www.flextiles.eu

FlexTiles

Thank you for your attention!

For more information: http://www.csem.ch

Questions? mailto:[email protected]