How to cope with the Power Wall. Reiner Hartenstein, TU Kaiserslautern. PATMOS 2015, the 25th International Workshop on Power and Timing Modeling, Optimization and Simulation; Salvador, Bahia, Brazil, Sept 1-5, 2015


Page 1:

How to cope with the Power Wall

Reiner Hartenstein

TU Kaiserslautern

PATMOS 2015, the 25th International Workshop on Power and Timing Modeling, Optimization and Simulation;

Salvador, Bahia, Brazil, Sept 1-5, 2015

Page 2:

© 2015, http://hartenstein.de

TU Kaiserslautern, http://www.uni-kl.de

>> Outline <<

• The Power Wall
• “Dataflow” Computing
• Reconfigurable Computing
• Time to Space Mapping
• The Xputer Paradigm
• Conclusions

Page 3:

The Workshop Series

Oldest conference series on power efficiency

Power efficiency is going to become an industry-wide issue

Some incremental improvements are on track at all abstraction levels, not only in hardware design and software development

However, there is still a lot to be done

http://hartenstein.de/PATMOS/

Spin-off from the PATMOS project

Project leader: Reiner Hartenstein

Partner leader: Antonio Núñez

Partner leader: Francis Jutand

Page 4:

Power Efficiency of Programming Languages (an example)

Page 5:

ICT infrastructures

[Photo: Data Center at Dallas, Columbia river; © New York Times]

Power consumption by the internet: x30 until 2030 if trends continue. That is more than the entire world's total power consumption today!

G. Fettweis, E. Zimmermann: ICT Energy Consumption - Trends and Challenges; WPMC'08, Lapland, Finland, 8-11 Sep 2008

The internet's 2007 carbon footprint was already higher than that of worldwide air traffic

IDC predicts: the total number of data centers around the world will peak at 8.6 million in 2017.

Page 6:

>> Outline <<

Page 7:

Terminology Problems

Searching for a more power-efficient machine paradigm:

Stressing the differences to „Control-Flow“ computers, an area called „Dataflow“ computers was started in the mid-1970s at MIT

People in the Xputer area are forced to sidestep by using terms like „data-driven“ or „data streams“ …

… although the „Dataflow“ scene is I-Structure-centered.

Tagged Token Flow

Page 8:

A Second Opinion

D. D. Gajski, D. A. Padua, D. J. Kuck, R. H. Kuhn: A Second Opinion on Data Flow Machines and Languages; IEEE COMPUTER, February 1982

The subtitle: "... data flow techniques attract a great deal of attention. Other alternatives, however, offer more hope for the future."

(Workshops are still active, e. g.: 5th Workshop on Data-Flow Execution Models for Extreme Scale Computing (DFM 2015), Oct 18-21, 2015, San Francisco, CA, USA)

However, the power efficiency breakthrough did not happen here

Page 9:

Computing Paradigms

term        | program counter      | execution triggered by            | paradigm
CPU         | yes (program counter) | instruction fetch                 | instruction-stream-based (von Neumann)
rDPU / RPU  | no (data counters)   | data arrival**                    | data-driven or data-stream-based (Xputer, Reconfigurable Computing)
DFC*        | no                   | complicated I-structure handling  | “Dataflow” Computer

*) based on tagged token “I-Structure”

**) “transport-triggered”

Page 10:

>> Outline <<

Page 11:

Saving Power

[Figure: Speed-ups by vN Software to FPGA Migrations. Speedup-Factor (logarithmic axis, 10^0 to 10^6) over the years 1985 to 2010; trend: x2/yr. Power saving: ~10 % of speed-up.]

Speed-up factors shown, by application area:

DSP and wireless: FFT 100; Viterbi Decoding 400; MAC 1000; Reed-Solomon Decoding 2400

Bioinformatics: protein identification 40; BLAST 52; molecular dynamics simulation 88; Smith-Waterman pattern matching 288; DNA & protein sequencing 8723 (power saving: 779)

Astrophysics: GRAPE 20

Image processing, pattern matching, multimedia: SPIHT wavelet-based image compression 457; pattern recognition 730; video-rate stereo vision 900; CT imaging 3000; real-time face detection 6000

Crypto: 1000; DES breaking 28514 (power saving: 3439)

Pre-FPGA solutions (Xputer): Design Rule Check accelerator PISA 15000 („fair comparison“); 1984, no FPGA: DPLA on MoM by TU-KL*, fabricated by E.I.S. Multi University Project Chip; 1 DPLA replaces 256 FPGAs; >15 years earlier

*) TU Kaiserslautern

http://www.fpl.uni-kl.de/staff/hartenstein/Hartenstein-Speedup-Factors.pdf

http://www.fpl.uni-kl.de/staff/hartenstein/eishistory_en.html

Page 12:

The Reconfigurable Computing Paradox

High speed-ups are obtained although the effective integration density of FPGAs is 4 orders of magnitude behind the Gordon Moore curve, because of:
• wiring overhead
• reconfigurability overhead
• routing congestion

von Neumann: an extremely power-inefficient paradigm

“von Neumann Syndrome” (C.V. Ramamoorthy)

Reinvent Computing

Page 13:

Obstacles to widespread FPGA adoption go well beyond the required skill set

- Workshop at FPL_2015

http://reconfigurablecomputing4themasses.net/

Page 14:

>> Outline <<

Page 15:

Dual paradigm mind set: an old hat – but was ignored

Time to space mapping: procedural to structural:

1967: W. A. Clark: Macromodular Computer Systems; 1967 SJCC, AFIPS Conf. Proc.

C. G. Bell et al: The Description and Use of Register-Transfer Modules (RTM's); IEEE Trans-C21/5, May 1972

[Diagram labels: PDP-16 RTMs; 1971; loop to pipe mapping; FF (flip-flops), token bit, evoke]

„This is so simple … why did it take 25 years to find out?“

Tunnel Vision!

[Annotation by Reiner:] HDL scene ~1970: „decision box turns into demultiplexer“. “That’s so simple! Why did it take 30 years to find out?” Reductionists’ tunnel view. RTM available as DEC product: 1973

Page 16:

The Systolic Arrays (1)

No instruction streams needed: execution is transport-triggered

[Figure: input data streams enter a DPA (pipe network) and leave it as output data streams; axes: time, port #]

DPA: DataPath Array (array of DPUs)

„data streams“: define which data item at which time at which port

1980: M. J. Foster, H. T. Kung: The Design of Special-Purpose VLSI Chips ... IEEE 7th ISCA, La Baule, France, May 6-8, 1980
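
The data-stream definition on this slide (which data item at which time at which port) can be made concrete with a small simulation. The following is a minimal sketch of a generic linear systolic array for a matrix-vector product, not a reconstruction of the Foster/Kung design: the names `input_streams` and `systolic_matvec`, and the skewed schedule (item `a[i][j]` arrives at port `j` at time `i + j`), are assumptions chosen for illustration. Each cell fires only when a data item arrives at its port, so no instruction stream is involved.

```python
def input_streams(a):
    """Data-stream definition: a[i][j] arrives at port j at time i + j."""
    return {(i + j, j): a[i][j]
            for i in range(len(a)) for j in range(len(a[0]))}


def systolic_matvec(a, x):
    """Compute y = A x on a linear array of len(x) cells.

    Cell j keeps x[j] stationary. A partial sum enters cell 0, moves one
    cell to the right per time step, and leaves the last cell as y[i].
    """
    m, n = len(a), len(x)
    streams = input_streams(a)
    acc = [None] * n                 # partial sum at the output of each cell
    y = []
    for t in range(m + n - 1):       # time steps until the last item has left
        new_acc = [None] * n
        for j in range(n):
            item = streams.get((t, j))
            if item is None:         # nothing arrived at this port: cell idles
                continue
            incoming = 0 if j == 0 else acc[j - 1]
            new_acc[j] = incoming + item * x[j]
        acc = new_acc
        if acc[n - 1] is not None:   # a finished result leaves the array
            y.append(acc[n - 1])
    return y


A = [[1, 2], [3, 4], [5, 6]]
X = [10, 1]
Y = systolic_matvec(A, X)
```

The dictionary returned by `input_streams` plays the role of the stream definition: changing it changes the schedule without touching the cells.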

Page 17:

Systolic Arrays (2)

M. J. Foster and H. T. Kung: “The Design of Special-Purpose VLSI Chips ... “

Only special purpose? (Tunnel Vision)

Karl Steinbuch: It is not sufficient to invent something. You need to recognize that you have invented something.

http://karl-steinbuch.org

Page 18:

What Synthesis Method?

H. T. Kung: „of course algebraic!“ (linear projection)

This supports only very special applications with strictly regular data dependencies: only linear pipes

My student Rainer Kress replaced it by simulated annealing: this also supports any irregular and wild-shaped pipe networks

The super-systolic array*: a generalization of the systolic array: now a general purpose methodology!

*) KressArray [ASP-DAC-1995], http://kressarray.de/

[Annotation:] Mathematicians caught by their own paradigm trap: using only uniform linear pipes

Page 19:

Who generates* the data streams?

*) or receives

H. T. Kung: “It’s not our job”

Another Tunnel Vision symptom: without a sequencer, they missed inventing a new machine paradigm (the Xputer)

Page 20:

>> Outline <<

Page 21:

The Xputer machine Paradigm (TU-KL)

is the TU-KL‘s Symbiosis of Time to Space Mapping and Reconfigurable Computing!

Obtained by adding auto-sequencing memory (ASM), with data counters instead of a program counter

[Diagram: several ASM blocks, each consisting of a data counter (GAG) and RAM]

GAG: Generic Address Generator

Page 22:

Duality of procedural Languages

Software Languages (control-procedural) | Flowware Languages (data-procedural)
read next instruction                   | read next data item
goto (instruction address)              | goto (data address)
jump to (instruction address)           | jump to (data address)
instruction loop                        | data loop
instruction loop nesting                | data loop nesting
instruction loop escape                 | data loop escape
instruction stream branching            | data stream branching
no: no internally parallel loops        | yes: internally parallel loops

But there is an Asymmetry

State register(s): program counter (Software) vs. data counter(s) (Flowware)

Flowware (MoPL) is simpler: no ALU tasks; easy to learn

Page 23:

Compilation: Software vs. Configware and Flowware

Software Engineering: source program (C, etc.; procedural: time domain) -> software compiler -> software code

Configware Engineering: source „program“ -> configware compiler (mapper and scheduler; time to space mapping) -> configware code (space domain) and flowware code (data; time domain)

Page 24:

Heterogeneous: Co-Compilation

Software / Configware Co-Compiler:

C, or other high level language -> automatic SW / CW partitioner

SW part -> software compiler -> software code

CW part -> configware compiler (mapper and scheduler) -> configware code and flowware code (data)

Page 25:

>> Outline <<

Page 26:

Illustrating the Paradigm Trap (1)

The von Neumann single core Approach: The Memory Wall (crippled watering can)

The von Neumann Manycore Approach: many von Neumann bottle-necks (many crippled watering cans)

the watering can model [Hartenstein]

Page 27:

Illustrating the Paradigm Trap (2)

The “Dataflow“ Computer: extremely complicated: no watering can model!

(a power efficiency breakthrough did not happen here)

Page 28:

Xputer: the only massively power-efficient Paradigm

The Xputer Paradigm has no von Neumann bottle-neck

the watering can model [Hartenstein]

Page 29:

We need a Seismic Shift …

… to avoid unaffordable ICT power consumption cost in the future

The Aristotelian model with the von Neumann CPU as the center of the world has become obsolete

To cope with the power wall and manycore, we have to accept a Copernican world of Heterogeneous Systems

For many more years we need a heterogeneous triple-paradigm mind set: Configware, Flowware, and still even Software

This is an extremely massive challenge, since more than half a century of software sits squarely on top

Page 30:

thank you !

Page 31:

END

Page 32:

Backup for discussion:

Page 33:

[Figure: array of LUTs; LUT = look-up table]

First FPGA available 1984 from Xilinx

Page 34:

Transformations since the 70ies

time domain (procedure domain) -> space domain (structure domain)

Time algorithm: program loop, n x k time steps, 1 CPU

Space/time algorithm (time to time/space mapping): pipeline, k time steps, n DPUs

Strip Mining Transformation

Loop Transformations: rich methodology published [survey: Diss. Karin Schmidt, 1994, Shaker Verlag]

[Annotation by Reiner:] The theory of relativity deals with the structure of space and time. The special theory of relativity deals with the relativity of space and time.
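
The step counts on this slide can be checked with a small simulation. This is a minimal sketch under stated assumptions, not the author's tooling: `run_sequential` and `run_pipelined` are hypothetical names, the loop body is modelled as n operations applied to each of k data items, and there must be at least one stage. The slide quotes k time steps for the pipeline; the simulation also counts the n - 1 steps needed to fill the pipe, so it reports n + k - 1.

```python
EMPTY = object()  # marks a pipeline register that holds no data yet


def run_sequential(stages, items):
    """1 CPU: one operation per time step, n x k steps in total."""
    steps, results = 0, []
    for x in items:
        for op in stages:
            x = op(x)
            steps += 1
        results.append(x)
    return results, steps


def run_pipelined(stages, items):
    """n DPUs in a linear pipe: all DPUs work in the same time step."""
    n = len(stages)
    regs = [EMPTY] * n               # output register of each DPU
    feed = iter(items)
    results, steps = [], 0
    while len(results) < len(items):
        # Update the last DPU first so every DPU sees last step's data.
        for i in range(n - 1, 0, -1):
            prev = regs[i - 1]
            regs[i] = EMPTY if prev is EMPTY else stages[i](prev)
        nxt = next(feed, EMPTY)
        regs[0] = EMPTY if nxt is EMPTY else stages[0](nxt)
        steps += 1
        if regs[-1] is not EMPTY:
            results.append(regs[-1])
    return results, steps


stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]  # n = 3
items = [1, 2, 3, 4]                                          # k = 4
sequential = run_sequential(stages, items)
pipelined = run_pipelined(stages, items)
```

Both runs produce the same results; only the number of time steps differs (12 versus 6 in this example).
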
Page 35:

The Reconfigurable Computing Paradox

Enabling software developers to apply their skills over FPGAs has been a long and, as of yet, unreached research objective in reconfigurable computing.

High speed-ups are obtained although the effective integration density of FPGAs is 4 orders of magnitude behind the Gordon Moore curve, because of:
• wiring overhead
• reconfigurability overhead
• routing congestion
• etc.

Reinvent Computing

Page 36:

ASM Data stream generator usage: the Xputer machine paradigm

[Figure: an rDPA (pipe network) surrounded by ASM blocks on all sides, which generate its input data streams and receive its output data streams]

ASM: Auto-Sequencing Memory (data counter GAG plus RAM), implemented by distributed on-chip memory; uses data counters, no program counter

GAG: Generic Address Generators (reconfigurable: to avoid memory-cycle-hungry address computation)

rDPA: general purpose, reconfigurable; also more irregular data routes supported
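
How a data counter can sequence a data stream without a program counter may be sketched as follows. This is an illustrative model only, not the GAG hardware design: the function names and the parameters `base`, `limits` and `strides` are hypothetical. A nested data loop is described by one limit and one stride per loop level; the generator steps through the addresses, and the memory contents at those addresses form the data stream.

```python
def address_stream(base, limits, strides):
    """Yield the addresses of a nested data loop (innermost level last)."""
    if any(limit <= 0 for limit in limits):
        return
    index = [0] * len(limits)
    while True:
        yield base + sum(i * s for i, s in zip(index, strides))
        # Advance the loop indices like an odometer.
        level = len(limits) - 1
        while level >= 0:
            index[level] += 1
            if index[level] < limits[level]:
                break
            index[level] = 0
            level -= 1
        if level < 0:                # outermost loop finished
            return


def data_stream(ram, base, limits, strides):
    """Auto-sequencing memory: RAM read out in data-counter order."""
    return [ram[a] for a in address_stream(base, limits, strides)]


# A 3 x 4 array stored row by row, scanned column by column.
ram = list(range(100, 112))
column_scan = list(address_stream(0, (4, 3), (1, 4)))
column_data = data_stream(ram, 0, (4, 3), (1, 4))
```

Reconfiguring the scan pattern means changing `limits` and `strides` only; no address arithmetic is executed as instructions by a CPU.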

Page 37:

Paradigm Shift Consequences

von Neumann (Software Engineering): 1 programming source needed
- resources: fixed (CPU)
- algorithm: variable: software, sequenced by the Program Counter (PC)

Xputer (Configware Engineering): 2 programming sources needed
- resources: variable: configware, Configuration Code (CC)
- algorithm: variable: flowware, Data Counters (DCs) sequencing code (e. g. see MoPL language)

Xputer and vN (Heterogeneous Engineering): all 3 programming sources needed

Page 38:

Pipelining through DPU Arrays: the TU-KL Xputer principle

[Figure: input data streams flow through a DPA and leave it as output data streams]

DPA = DPU array; DPU = Data Path Unit

DPU operation is transport-triggered: no instruction streams, no message passing, nor thru common memory

No memory wall: massively avoiding memory cycles

Page 39:

MIT Tagged Token Dataflow Architecture*

*) source: Jurij Silc

[Figure: PEs and I-Structure Storage units connected by a Communication Network. Each PE contains: Wait-Match Unit & Waiting Token Store, Instruction Fetch Unit, Program Store & Constant Store, ALU & Form Tag, Form Token Unit, Token Queue; RU and SU lead to/from the Communication Network]

no PC; no updateable global store

„instruction is executed even if some of its operands are not yet available“

I would call it Tagged Token Flow

Page 40:

Problems with “Dataflow” Architectures

• very complex data structures!
• “each update consumes the structure and the value produces a new data structure”
• “awkward or even impossible to implement”

Solution: I-Structure concept* (= Tagged Token Structure)

Element states [source: Jurij Silc]:
• can be read but not written
• at least one read request has been deferred
• read request deferred but write operation is allowed

*) „I-Structure Flow“ instead of „Dataflow“?
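
The three element states listed on this slide can be modelled in a few lines. The following is a minimal sketch of commonly described I-structure semantics (write-once elements with deferred reads), not code from the MIT architecture or from Silc's course; the class name `IStructureCell` and the state names "absent", "waiting" and "present" are assumptions made for illustration.

```python
class IStructureCell:
    """One write-once element with deferred reads."""

    def __init__(self):
        self.state = "absent"    # no value yet: reads defer, write allowed
        self.value = None
        self.deferred = []       # consumers whose read arrived too early

    def read(self, consumer):
        if self.state == "present":
            consumer(self.value)             # can be read but not written
        else:
            self.deferred.append(consumer)   # defer until the write arrives
            self.state = "waiting"

    def write(self, value):
        if self.state == "present":
            raise RuntimeError("I-structure element is write-once")
        self.value = value
        self.state = "present"
        for consumer in self.deferred:       # release all deferred reads
            consumer(value)
        self.deferred.clear()


received = []
cell = IStructureCell()
cell.read(received.append)       # arrives before the value: deferred
state_after_early_read = cell.state
cell.write(7)                    # satisfies the deferred read
cell.read(received.append)       # value present: answered immediately
```

Even this small model needs state tracking and a queue per element, which hints at the handling overhead the slide criticizes.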

Page 41:

I-Structures (I = incremental) - part 1

Jurij Silc: Dataflow Architectures, http://csd.ijs.si/courses/dataflow/

Page 42:

I-Structures - part 2

MIT Tagged-Token Dataflow Architecture (PE)

[Figure labels: I-structure select, I-structure assign]

Jurij Silc: Dataflow Architectures, http://csd.ijs.si/courses/dataflow/

Page 43:

Reconfigurable Computing (RC): the intensive Impact

Speed-ups by von Neumann to RC Migrations: SGI Altix 4700 with RC 100 RASC compared to Beowulf cluster [Tarek El-Ghazawi et al.: IEEE COMPUTER, Febr. 2008]

Application                | Speed-up factor | Savings divisor: Power | Cost | Size
DNA and Protein sequencing | 8723            | 779                    | 22   | 253
DES breaking               | 28514           | 3439                   | 96   | 1116

massively saving energy

much less equipment needed, i. e. instead of a hangar full of racks: just one or half a rack without air conditioning

much less memory and bandwidth needed, i. e. mostly fits in RAM on board of the processor chip: orders of magnitude less bandwidth
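
As a quick arithmetic check on the table above, the power savings divisor can be related to the speed-up factor. This is only a sketch using the four numbers quoted on the slide; the dictionary layout and the name `power_share` are assumptions for illustration.

```python
# (speed-up factor, power savings divisor) as quoted on the slide
migrations = {
    "DNA and Protein sequencing": (8723, 779),
    "DES breaking": (28514, 3439),
}


def power_share(speedup, power_saving):
    """Power savings divisor as a fraction of the speed-up factor."""
    return power_saving / speedup


shares = {name: round(power_share(*values), 3)
          for name, values in migrations.items()}
```

For both applications the power saving comes out at roughly a tenth of the speed-up factor.
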
Page 44:

Taxonomy

Flynn‘s taxonomy: von Neumann only

Diana‘s taxonomy: Reconfigurable Computing

Reiner‘s taxonomy: heterogeneous systems

Reiner‘s 2nd taxonomy: Xputers only

Page 45:

Programming Language Popularity (IEEE)

not yet determined by power efficiency …

[Image source: iStockphoto]

Page 46:

Von Neumann Syndrome

Lambert M. Surhone, Mariam T. Tennoe, Susan F. Hennessow (ed.): Von Neumann Syndrome; ßetascript publishing 2011

Shifting to the dominance of von-Neumann-only caused cumulative damage of at least trillions of dollars, if not quadrillions ….