
CS294-6 Reconfigurable Computing

Day 9

September 22, 1998

Project Startup: Mediabench

With annotations from class discussion

Today

• Project
  – Goals
  – Tuning (get feedback from class)
  – Benchmark set
  – …more architecture/compute model context

Pedagogical Goal

• Give students an appreciation for the tradeoffs in designing “post-fabrication” programmable computing devices
  – focus on spatial architectures
    • benefits
    • design

Ideal

• Design computing array for benchmark set and quantify benefits

– too much for one class?

Pragmatic

• How do we get most of the pedagogical value within the scope of this class?

High-level things we want to see

• Where spatial computing better than processor?
  – Worse?
• How optimize a design for spatial execution?
  – What’s different about?
• How tune/optimize spatial architecture?

Can’t “Do it All”

• Pick focussed pieces

What do you want?

• What are your most burning, unanswered questions in this area?
  – What should we try to answer?
  – What would you like to learn?

Burning Questions

• Real numbers for compute array versus processor
  – larger than Day 5 examples
  – computational density
  – energy/op
• How exploit real-time reconfiguration
  – swapping efficiently
  – spatio/temporal tradeoffs
  – virtualization (decompose, run-time manage)
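One way to frame the "real numbers" question above is a small bookkeeping sketch for computational density and energy/op. It assumes density means useful bit operations per unit area per second and energy/op means power divided by throughput; the `Device` class and every number in it are hypothetical placeholders, not measurements from the class or from Day 5.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical device description; all fields are illustrative assumptions."""
    name: str
    bit_ops_per_cycle: float  # useful bit operations completed per clock cycle
    clock_hz: float           # clock frequency in Hz
    area: float               # die area in any consistent unit (e.g. mm^2)
    power_w: float            # average power in watts

    def throughput(self) -> float:
        """Bit operations per second."""
        return self.bit_ops_per_cycle * self.clock_hz

    def computational_density(self) -> float:
        """Bit operations per second per unit area."""
        return self.throughput() / self.area

    def energy_per_op(self) -> float:
        """Joules per bit operation."""
        return self.power_w / self.throughput()


# Placeholder numbers only: a narrow, fast processor datapath versus
# a wide, slower spatial array occupying the same area.
processor = Device("processor", bit_ops_per_cycle=64, clock_hz=200e6,
                   area=100.0, power_w=10.0)
array = Device("array", bit_ops_per_cycle=1024, clock_hz=50e6,
               area=100.0, power_w=5.0)

for dev in (processor, array):
    print(dev.name, dev.computational_density(), dev.energy_per_op())
```

The point of the sketch is only that both metrics need the same three inputs per design (delivered operations, area, power), so a fair comparison has to count operations the application actually uses rather than peak capacity.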

Burning Questions

• Design memory for embedding with array
  – size
    • distribution
    • memory hierarchy
  – interconnect to
  – interfacing
    • physical
    • control

Burning Questions

• Good automatic compilation possible?
  – How describe
    • ease mapping
    • ease user job
• Costs and overheads to
  – upgradability
  – portability

Burning Questions

• What’s wrong w/ fixed length (processor) word model?
  – Architecturally
  – description
  – work knowledge of data size req. into specification
  – annotations
    • requested by compiler?

Burning Questions

• How should P<=>FPGA talk to each other
  – change ISA?
  – How does presence of RC change workload (requirements) for processor?
• Beneficial use RC as GP platform?
  – How attack?
    • Computational model?

Burning Questions

• Homogeneous/Heterogeneous architecture?

• Generation n+1 mainstream processor?

• Minimum array size to be useful?
  – Embedded cost sensitivity
  – general: benefit vs. array size

Burning Questions

• Applications
  – make new things viable which are not viable today?
    • ? Anything other than putting 10x, 100x computation in affordable package?
  – Building a better mousetrap isn’t enough?
    • Lag to exploit technology?
    • Low on innovation side?

Burning Questions

• System-on-a-chip designs?
  – Design complexity (up)
  – IP
  – …surely do something innovative with (?)
  – place for spatial building blocks

AMD (Benefit)

• Where beneficial?
  – (expand day 5 comparisons)
• Power implications
  – Spatial have benefit?
  – When/where?

AMD (How Use)

• Area-Time tradeoffs
  – What look like?
  – How achieve?
  – Importance?
• Specialization
  – What opportunities exist?
  – Importance of exploiting?
  – How exploit?

AMD (How Use)

• How do we build programs? Including:
  – Convenience?
  – Abstraction?
  – Application longevity?
  – Virtualization

AMD (Architecture)

• How compute (media apps) requirements differ from random logic?
  – Interconnect
    • less? More stylized?
  – Retiming
    • depth, hierarchy?

Original Plan

• mediabench kernels

• start with HSRA architecture and tools

• series of weekly projects
  – (to be tuned based on student feedback and areas of interest)

• final writeup

Original Plan: Exercises

• Analyze sequential implementation

• Build spatial (HSRA) implementation
  – compare yielded density w/ sequential
• Model power
  – compare spatial/sequential

Original Plan: Exercises

• Interconnect
  – c, p for application
  – “right” amount of interconnect
  – non-Rent structure in application interconnect?
  – Quality of hand autoplacement?
  – ??? Hierarchical vs. mesh style ???
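A sketch of how the "c, p for application" exercise might be approached, assuming c and p are the Rent's rule parameters in IO = c · N^p (N = blocks in a partition, IO = signals crossing its boundary). Taking logs gives a straight line, log IO = log c + p · log N, so an ordinary least-squares fit recovers both. The function name `fit_rent` and the sample data are hypothetical; real (N, IO) pairs would come from recursively partitioning the application netlist.

```python
import math


def fit_rent(samples):
    """Least-squares fit of IO = c * N**p on log-log axes.

    samples: iterable of (N, IO) pairs with N > 0 and IO > 0.
    Returns (c, p).
    """
    xs = [math.log(n) for n, _ in samples]
    ys = [math.log(io) for _, io in samples]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    p = sxy / sxx                          # slope of the log-log line
    c = math.exp(mean_y - p * mean_x)      # intercept, mapped back from log space
    return c, p


# Synthetic data generated from assumed values c = 3.0, p = 0.6,
# used only to check that the fit recovers what was put in.
synthetic = [(n, 3.0 * n ** 0.6) for n in (4, 16, 64, 256)]
c_fit, p_fit = fit_rent(synthetic)
print(c_fit, p_fit)
```

Points that fall well off the fitted line at some partition sizes would be one concrete sign of the "non-Rent structure" the slide asks about.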

Original Plan

• Retiming
  – depth distribution
  – hierarchy
    • when use memories, what sizes?
  – ?? Output vs. input ??

Original Plan: Exercises

• Specialization
  – opportunities in your application
  – binding times
  – benefits
• Programming
  – How fit into stream model of full computation?
  – Scheduling/virtualization?

Benchmark Set

• No need to hype multimedia as computational driver?

• See many “band-aids” on conventional architectures to handle
  – MMX, VIS
• Desire for “programmable” solutions
  – multi/evolving standards
  – one device does it all

Benchmarks

• Audio
  – adpcm
  – g.721
  – gsm
• Still Image
  – epic
  – JPEG
• Video
  – MPEG-2
• Encryption
  – pegwit
  – pgp
• Rendering
  – mesa
  – ghostscript(?)
• Speech Recognition
  – rasta

Hypothesis

• Spatial processing a better solution than “tweaking” ISAs
  – broader applicability
  – greater computational density
  – lower power
• Special-Purpose processing units brittle
  – e.g. FIR

FIR and MMX

• …but you saw on SPACE2/CYCLE that not everything works this well (at least easily).
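For reference, a minimal sketch of the FIR computation itself, y[n] = Σ_k h[k] · x[n−k], written as the sequential multiply-accumulate loop a processor would run. The function name `fir` and the 2-tap example are illustrative assumptions, and samples before the start of the input are taken as zero.

```python
def fir(samples, taps):
    """Direct-form FIR filter: one multiply-accumulate per tap per output sample."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:               # inputs before the first sample count as zero
                acc += h * samples[n - k]
        out.append(acc)
    return out


# Hypothetical 2-tap filter that sums each sample with its predecessor.
print(fir([1.0, 2.0, 3.0], [1.0, 1.0]))  # prints [1.0, 3.0, 5.0]
```

The inner loop over taps has no dependence between iterations other than the final sum, which is why it maps readily onto one multiplier per tap feeding an adder chain in a spatial design; kernels without that regular structure are the harder cases the slide alludes to.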

DCT and MMX?

• Claim from processor crowd that FPGA version only 2x better than MMX?
  – AMD skeptical
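To make the kernel under discussion concrete, here is a sketch of a direct 1-D DCT-II, X[k] = Σ_n x[n] · cos(π/N · (n + ½) · k), left unnormalized. This is the textbook O(N²) form, not the fast factorization an MMX or FPGA implementation would likely use, and the name `dct_1d` is an assumption for illustration.

```python
import math


def dct_1d(block):
    """Unnormalized DCT-II of a sequence, computed directly from the definition."""
    n = len(block)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(block))
        for k in range(n)
    ]


# A constant 8-sample block: all energy should land in coefficient 0.
coeffs = dct_1d([1.0] * 8)
print(coeffs[0])
```

JPEG and MPEG-2 from the benchmark list apply this transform to 8x8 blocks as row and column passes, so any comparison of implementations depends heavily on which factorization and word widths each side is allowed to use.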

Project Goals

• Learn about architectural design
  – pedagogy
    • build intuition on key characteristics of arch.
    • how attack architectural design
  – expand what is known (research)
    • What’s good where? Why?
    • Application characteristics?
    • How tune architectures?