paintable computing
A presentation of “Programming A Paintable Computer”, William Butera, PhD Thesis, MIT 2002.
![Page 1: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/1.jpg)
Paintable Computing
A Presentation of: “Programming A Paintable Computer”
William Butera, PhD Thesis, MIT 2002
all images (c) their respective owners
![Page 2: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/2.jpg)
The Goal
● Computing by the Liter
![Page 3: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/3.jpg)
The Big Idea
● The Superlative Multi-Processor
● Inverse of Current Architecture Paradigm
● What are the hard problems?
  – Are they worse than what has already been solved?
![Page 4: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/4.jpg)
Architecture Problems
● Asynchronous devices
  – No easy way to make synchronous
● Highly Unreliable Processors
![Page 5: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/5.jpg)
Architecture Problems
● No Global Communication
● Unknown (and Unknowable) Topology
![Page 6: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/6.jpg)
Architecture Problems
● Code must be compact
  – Nodes cannot support large processes
  – Working sets must be small
● Infinitely many paths to failure
![Page 7: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/7.jpg)
The Solution
● New Architecture => New Solution
  – Out with the old assumptions
● Self Assembly
  – Better paradigm
  – Redefines “success”
![Page 8: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/8.jpg)
Complex Adaptive Systems
● Aggregate Behavior
  – Simple parts => arbitrarily complex systems
● Statistical Output
  – Local Interactions => Global State
![Page 9: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/9.jpg)
Implementing a Solution
● What sort of hardware is a good target?
  – Cannot be too small
    ● Must be able to do useful work
  – Cannot be too large
    ● Must be hard enough
![Page 10: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/10.jpg)
Reference Standard “Paintable” Computer -- Processing --
● Really tiny “traditional” architecture
  – CPU: 10-200 MHz
  – RAM: 50K words
  – Bus: 16+ bit
  – Programmable in traditional languages
    ● C, Java, etc.
![Page 11: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/11.jpg)
Reference Standard “Paintable” Computer -- Power --
● Unspecified interface
  – Does not impinge on the architecture
● Examples
  – Batteries
  – Chemical substrate
  – Photo-cell
  – Structural power routing
  – Fuel Cells
![Page 12: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/12.jpg)
Reference Standard “Paintable” Computer -- Networking --
● Directionless
● Bandwidth: 100 kbps Full Duplex
● Radius: ~8 particles
  – Gaussian Random distribution of connectivity
● Example Technologies:
  – luminescence
  – electrostatic
  – near-field RF
![Page 13: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/13.jpg)
The Pushpin Computer
● A real system
  – An example of a paintable computer
● Model architecture
  – 330 nodes
![Page 14: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/14.jpg)
System Layout
● Separate communication, ground, and power
● Planes separated by flexible silicon insulation
![Page 15: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/15.jpg)
Programming Model
● Program Fragments (PFrags)
  – Computational Elements
● Shared Memory Partitions
  – Inter Process Communication
● Embedded OS
  – Local Resource Control
  – Special PFrag Services
![Page 16: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/16.jpg)
Shared Memory Layout
![Page 17: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/17.jpg)
Shared Memory Layout
● PFrag I/O
  – Bassinet: Pre-Load Store
  – Launch Pad: Post-Unload Store
● Data I/O
  – Home Page: Output to Neighbors
  – Mirrored Home Pages: Input From Neighbors
  – Organized as key-value pairs
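The data I/O layout above can be pictured with a small sketch. This is only an illustration under stated assumptions: the class `IOSpace` and its method names are hypothetical, and the copy step that the OS would perform over the network is replaced by a direct call.

```python
from dataclasses import dataclass, field


@dataclass
class IOSpace:
    """Hypothetical model of one particle's data I/O region (names are assumptions)."""
    home_page: dict = field(default_factory=dict)  # this node's output: key -> value
    mirrors: dict = field(default_factory=dict)    # neighbor id -> copy of its home page

    def post(self, key, value):
        """A PFrag writes a key-value pair to the local home page."""
        self.home_page[key] = value

    def receive_mirror(self, neighbor_id, page):
        """Store a snapshot of a neighbor's home page (input from neighbors)."""
        self.mirrors[neighbor_id] = dict(page)

    def read_neighbors(self, key):
        """Return what each neighbor has posted under `key`, if anything."""
        return {nid: page[key] for nid, page in self.mirrors.items() if key in page}


a, b = IOSpace(), IOSpace()
a.post("temperature", 21)
b.receive_mirror("a", a.home_page)  # stand-in for the OS mirroring a's page onto b
a.post("temperature", 22)           # a later write does not alter b's older snapshot
```

In this sketch a node only ever reads its own mirrors, which matches the slide's split between output to neighbors and input from neighbors; how and when real mirrors are refreshed is not stated in the slides.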
![Page 18: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/18.jpg)
OS Services 1 and 2 of 4
● Housekeeping
  – Defragmenting Memory
  – Resizing I/O Zone
● Network Access
  – Inter Processor Communication
    ● Manages Access to I/O Regions
    ● Manages Joins/Leaves
  – Mediates PFrag Access to I/O Regions
![Page 19: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/19.jpg)
OS Services 3 and 4 of 4
● Running PFrags
  – Installs / Uninstalls the PFrag
  – Runs the PFrag
● PFrag Services
  – Mathematics
  – Random Numbers
  – Access to Memory
  – Transit Request Messages
  – etc.
![Page 20: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/20.jpg)
PFrag Implementation
● Implements Five Functions
  – Install
    ● Moves Self From Bassinet to Main Memory
  – DeInstall
    ● Cleans Up and Erases Self
  – Update
    ● Runs the process
![Page 21: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/21.jpg)
PFrag Transit
● Transfer-Granted
  – Cleans Up and Moves to Launch Pad for Transit
● Transfer-Refused
  – Allows PFrag to Dequeue Transfer Request
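The five PFrag entry points (Install, DeInstall, Update, Transfer-Granted, Transfer-Refused) can be sketched as an interface. Everything below is a hypothetical illustration: the class names `PFrag`, `Node`, and `Wanderer`, the method signatures, and the use of Python lists for the bassinet, main memory, and launch pad are assumptions, not the thesis's actual API.

```python
from abc import ABC, abstractmethod


class PFrag(ABC):
    """Hypothetical base class naming the five entry points from the slides."""

    @abstractmethod
    def install(self, node): ...

    @abstractmethod
    def deinstall(self, node): ...

    @abstractmethod
    def update(self, node): ...

    @abstractmethod
    def transfer_granted(self, node): ...

    @abstractmethod
    def transfer_refused(self, node): ...


class Node:
    """Toy stand-in for the storage areas one particle's OS manages."""

    def __init__(self):
        self.bassinet = []     # pre-load store for arriving PFrags
        self.main_memory = []  # installed, runnable PFrags
        self.launch_pad = []   # post-unload store for departing PFrags


class Wanderer(PFrag):
    """Example PFrag: counts its updates, then asks to move on."""

    def __init__(self):
        self.ticks = 0
        self.pending_transfer = False

    def install(self, node):
        node.bassinet.remove(self)       # move self from bassinet...
        node.main_memory.append(self)    # ...to main memory

    def deinstall(self, node):
        node.main_memory.remove(self)    # clean up and erase self

    def update(self, node):
        self.ticks += 1                  # run one step of the process
        self.pending_transfer = True     # queue a transfer request

    def transfer_granted(self, node):
        node.main_memory.remove(self)    # clean up...
        node.launch_pad.append(self)     # ...and move to the launch pad
        self.pending_transfer = False

    def transfer_refused(self, node):
        self.pending_transfer = False    # dequeue the transfer request


node, frag = Node(), Wanderer()
node.bassinet.append(frag)
frag.install(node)
frag.update(node)
frag.transfer_granted(node)
```

In the real system the embedded OS would invoke these entry points; here they are called by hand to show the path from bassinet to main memory to launch pad.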
![Page 22: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/22.jpg)
Does It Work?
● Need to prove Viability
  – Simple Applications that we can use to:
    ● Test
    ● Validate
  – Butera Implements
    ● BreadCrumbs
    ● Near Sighted Mailman
    ● Knitting Club
![Page 23: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/23.jpg)
What Is It Good For?
● What software will Motivate?
  – Need a “Killer App”
    ● Only works well on a Paintable Architecture
  – Butera Implements
    ● Gradient
    ● MultiGrad
    ● Tessellation Operator
    ● Diffusion
    ● Channel Operator
    ● Coordinate Operator
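The slides name the Gradient operator without describing it. As a loose illustration of the local-interaction style, the sketch below assumes that a gradient is a hop-count field spreading outward from one source particle; that reading, the function name `relax_gradient`, and the adjacency-dict input are all assumptions rather than details from the slides. The loop also sweeps nodes in a fixed order for simplicity, whereas the real particles run asynchronously.

```python
def relax_gradient(neighbors, source):
    """Each node repeatedly sets its value to 1 + the smallest value among its neighbors.

    `neighbors` maps a node id to the list of node ids it can hear.
    Returns a dict of hop counts from `source` (infinity if unreachable).
    """
    inf = float("inf")
    hops = {node: inf for node in neighbors}
    hops[source] = 0
    changed = True
    while changed:                        # stop once no node changes its value
        changed = False
        for node, nbrs in neighbors.items():
            if node == source:
                continue
            best = min((hops[n] for n in nbrs), default=inf) + 1
            if best < hops[node]:
                hops[node] = best
                changed = True
    return hops


# Four particles in a line: a - b - c - d
line = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
hops = relax_gradient(line, "a")
```

No node needs to know the overall topology: each one reads only its neighbors' values, yet the result is a globally consistent distance field.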
![Page 24: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/24.jpg)
Can It Do What I Want It To?
● Need to Prove Utility
  – Simple Applications that do something Useful
    ● Traditional Service We Cannot Live Without
  – Butera Implements:
    ● Streaming Audio
    ● Holistic Data Storage
    ● Surface Bus
    ● Image Segmentation
![Page 25: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/25.jpg)
Where do we go from here?
● AMD and Intel
  – 2-4 on core processors
  – cannot go faster
    ● go wider!
● How far can we expand sideways?
● A job for Architecture!
![Page 26: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/26.jpg)
How Good Is It?
● Can We Compare to Traditional?
  – Apples = Oranges?
● Consider Two Cases
  – Serial Operation
  – Embarrassingly Parallel Operation
![Page 27: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/27.jpg)
The Worst Case: Serial
● Cannot “optimize away” all serial operations
● Interactive Programs
  – Shells will work
    ● low system requirements
    ● low communication overhead
  – Will need a new device to do graphics
    ● Build output into paintable?
    ● Integrate a larger processor for graphics?
● Can still do Multiprocessing
● Can make it massively fault tolerant
![Page 28: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/28.jpg)
The Best Case: Parallel
● Where is our overhead?
  – Getting the problem to the device
  – Getting the problem off the device
  – Sharing intermediates
● The Computation Scales
  – with number of units
  – Cost is critical
![Page 29: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/29.jpg)
How Does Cost Scale?
● Butera's Die Assumptions:
  – Large Die = 100 mm²
  – Medium Die = 25 mm²
  – Small Die = 1 mm²
● Current Processor Dies
  – Pentium M = 84 mm²
  – Pentium 4 = 131 mm²
  – “Smithfield” Dual Core = 206 mm²
  – Opteron Dual Core = 199 mm²
![Page 30: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/30.jpg)
Peering Into the Process
● Butera's Defect Rate Analysis
  – 200, 500, 1000 defects
● Class 1 Cleanroom
  – 1 particle per ft³
  – 30 cm (diameter) wafers = 0.785 ft²
    ● 250 to 1270 ft of linear air motion
  – If process takes 5 days:
    ● 2 to 10.5 ft per hour air motion
● Assumptions are reasonable, even optimistic
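The air-motion figures above follow from simple arithmetic, reproduced here as a sketch. It assumes, as the slide implicitly does, that every particle in the column of air passing over the wafer lands as a defect; the function names are hypothetical.

```python
WAFER_AREA_FT2 = 0.785    # slide's figure for a 30 cm wafer (diameter taken as about 1 ft)
PARTICLES_PER_FT3 = 1.0   # Class 1 cleanroom figure from the slide
PROCESS_HOURS = 5 * 24    # five-day process


def air_column_ft(defects):
    """Feet of air that must pass over the wafer to deliver `defects` particles.

    Assumption: every particle in that column of air ends up as a defect.
    """
    return defects / (PARTICLES_PER_FT3 * WAFER_AREA_FT2)


def air_speed_ft_per_hour(defects):
    """Average air speed needed to move that column within the process time."""
    return air_column_ft(defects) / PROCESS_HOURS


for defects in (200, 500, 1000):
    print(defects, round(air_column_ft(defects)), round(air_speed_ft_per_hour(defects), 1))
```

For 200 and 1000 defects this gives roughly 255 ft and 1274 ft of air, or about 2.1 and 10.6 ft per hour, in line with the rounded figures on the slide.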
![Page 31: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/31.jpg)
How Does Cost Scale?
● Butera's Calculations
![Page 32: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/32.jpg)
Why is this relevant?
● Yield Ratio is ~ 200 : 1
  – As much as 20,000% greater yield
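The slide carrying Butera's calculations survives only as an image, so his method is not reproduced here. As a rough cross-check, the sketch below swaps in the standard Poisson yield model (a die is good if no defect lands on it) and applies it to the 1 mm² and 100 mm² die sizes and the 200, 500, and 1000 defects-per-wafer cases from the earlier slides. The model choice, the function names, and the decision to ignore edge loss and scribe lanes are all assumptions.

```python
import math

WAFER_DIAMETER_MM = 300.0  # 30 cm wafer from the defect-rate slide
WAFER_AREA_MM2 = math.pi * (WAFER_DIAMETER_MM / 2) ** 2


def good_dies_per_wafer(die_area_mm2, defects_per_wafer):
    """Expected good dies under a Poisson yield model.

    Assumptions: defects are spread uniformly, one defect kills a die,
    and edge loss and scribe lanes are ignored.
    """
    density = defects_per_wafer / WAFER_AREA_MM2        # defects per mm^2
    die_yield = math.exp(-die_area_mm2 * density)       # P(no defect on a die)
    return (WAFER_AREA_MM2 / die_area_mm2) * die_yield


def small_to_large_ratio(defects_per_wafer, small_mm2=1.0, large_mm2=100.0):
    """Ratio of good small dies to good large dies from one wafer."""
    return (good_dies_per_wafer(small_mm2, defects_per_wafer)
            / good_dies_per_wafer(large_mm2, defects_per_wafer))


for defects in (200, 500, 1000):
    print(defects, round(small_to_large_ratio(defects)))
```

Under these assumptions the ratio of good dies per wafer comes out near 130, 200, and 400 for the three defect counts, so the middle case is of the same order as the ~200 : 1 figure on the slide. Whether Butera's figure was derived this way is not stated in the slides.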
![Page 33: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/33.jpg)
How Can This Help?
● Consider a motivating problem
  – Embarrassingly Parallel
  – O(n²) computation
  – O(n) input
  – O(n) output
  – No inter-node communication
● Only Need to Consider Problem Input/Output
  – O(n · log₈(n))
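To get a feel for why the input/output term matters less and less as the problem grows, the sketch below compares the O(n²) computation with the O(n · log₈(n)) I/O term. The constant factors (all set to 1), the even split of work across `nodes`, and the function names are assumptions made for illustration; the slides give only the asymptotic forms.

```python
import math


def io_cost(n):
    """Input/output term from the slide: n * log8(n), with constant factor 1 (assumed)."""
    return n * math.log(n, 8)


def serial_cost(n):
    """The O(n^2) computation done on a single processor, constant factor 1 (assumed)."""
    return n ** 2


def paintable_cost(n, nodes):
    """Assumed model: computation split evenly over `nodes`, plus the I/O term."""
    return serial_cost(n) / nodes + io_cost(n)


for n in (64, 512, 4096):
    print(n, serial_cost(n), round(paintable_cost(n, nodes=n)))
```

With one node per input item, the parallel cost in this toy model grows as n + n · log₈(n) while the serial cost grows as n², so the gap widens with problem size. Real constants, such as startup cost, would shift where the curves cross.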
![Page 34: Paintable Computing](https://reader036.vdocuments.site/reader036/viewer/2022070421/56816022550346895dcf259b/html5/thumbnails/34.jpg)
How Does It Scale?
[Figure: “Paintable Benefit Crossover”. Run Time (0 to 60000) versus Problem Size O(n^2) (0 to 8000), with three curves: Embarrassing, With Startup Cost, Traditional.]