Stories, Not Words: Abstract Datatype Instruction Sets. Martha Kim, Columbia University. Workshop on New Directions in Computer Architecture, Sunday, June 5, 2011.


Source slides: arcade.cs.columbia.edu/adp-ndca11-slides.pdf

Stories, Not Words: Abstract Datatype Instruction Sets

Martha Kim, Columbia University

Workshop on New Directions in Computer Architecture

Sunday, June 5, 2011


The Utilization Wall

• Exponential decrease in percentage of transistors that can be operated at full frequency.

• In a 45nm TSMC process, 7% of a 300mm² die can operate at full frequency

• In 32nm, 3.5%

[Figure: Moore’s Law (manufacturable transistors) vs. power budget (operable transistors)]

G. Venkatesh et al., “Conservation cores: reducing the energy of mature computations,” in Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 205–218, Pittsburgh, Pennsylvania, March 2010.


Specialization Is a Promising Approach

R. Hameed et al., “Understanding sources of inefficiency in general-purpose chips,” ISCA '10

G. Venkatesh et al., “Conservation cores: reducing the energy of mature computations,” ASPLOS '10

J. Kelm, D. Johnson, W. Tuohy, S. Lumetta, and S. Patel, “Cohesion: a hybrid memory model for accelerators,” ISCA '10

H. Franke et al., “Introduction to the wire-speed processor and architecture,” IBM Journal of Research and Development, vol. 54, no. 1, pp. 3:1–3:11, 2010.

V. Govindaraju, C. Ho, and K. Sankaralingam, “Dynamically Specialized Datapaths for energy efficient computing,” HPCA ’11

M. Lyons, M. Hempstead, G. Wei, and D. Brooks, “The Accelerator Store framework for high-performance, low-power accelerator-based systems,” Computer Architecture Letters, vol. 9, no. 2, pp. 53–56, 2010.

C. Cascaval, S. Chatterjee, H. Franke, K. Gildea, and P. Pattnaik, “A taxonomy of accelerator architectures and their programming models,” IBM Journal of Research and Development, vol. 54, no. 5, p. 5, 2010.

R. Hou et al., “Efficient data streaming with on-chip accelerators: Opportunities and challenges,” HPCA ’11

N. Goulding et al., “GreenDroid: A Mobile Application Processor for Silicon’s Dark Future,” Hot Chips '10


An Ideal Accelerator System

High Performance

Low Energy

Easy to Program

Software Portability


Accelerator Design Processes

We need a design flow that facilitates usability

[Diagram: three design flows, listed in slide order: (1) Application, Microarch., Arch. (flagged "!"); (2) Application, Microarch., Arch.; (3) Application, Arch., Microarch.]


Extending Software Abstractions to Hardware

Application

Libraries

Machine Code

Micro-ops

Execution core

Caches

Memory

Raise HW/SW interface

Extend interfaces from libraries to hardware

Exploit interfaces with specialized hardware


Abstract Datatype Processing

SW: class HashTable, with put(k,v) and v get(k)

Arch: put $h, $k, $v and get $h, $k, $v

UArch: Hash Table Processor
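The three layers on this slide can be sketched in Python. This is only an illustration under stated assumptions: `HashTableProcessor`, its `new` and `execute` methods, and the tuple-style instruction encoding are hypothetical names, not part of the talk. The instruction's destination operand is modelled as a Python return value.

```python
class HashTableProcessor:
    """Hypothetical stand-in for the UArch layer: it owns all table storage."""

    def __init__(self):
        self._tables = {}      # handle -> table contents
        self._next_handle = 0

    def new(self):
        """Allocate a table and return its handle (the $h operand)."""
        h = self._next_handle
        self._next_handle += 1
        self._tables[h] = {}
        return h

    def execute(self, op, h, k, v=None):
        """Arch layer: one instruction per datatype method."""
        if op == "put":
            self._tables[h][k] = v
            return None
        if op == "get":
            return self._tables[h].get(k)
        raise ValueError(f"unknown instruction: {op}")


class HashTable:
    """SW layer: the interface applications already program against."""

    def __init__(self, hw):
        self._hw = hw
        self._h = hw.new()

    def put(self, k, v):
        self._hw.execute("put", self._h, k, v)

    def get(self, k):
        return self._hw.execute("get", self._h, k)


hw = HashTableProcessor()
table = HashTable(hw)
table.put("story", 42)
print(table.get("story"))
```

Each software method maps to exactly one instruction, so the application code does not change whether the operation is carried out by specialized hardware or by ordinary code.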


Compilation & Execution

[Diagram: a Sequence Labeling application uses the SparseVec and HashTable types; hardware blocks are labeled SV, HT, GP, and Dispatch]


The Software Fallback

[Diagram: two configurations, each labeled SV, GP, and Dispatch]
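One way to picture the fallback is a dispatcher that hands out an accelerator-backed object when one is present and a plain software object otherwise. The sketch below assumes this reading of the diagram; the class names, the `backend` attribute, and the `create` method are hypothetical, and the "accelerator" is itself modelled in software.

```python
class SparseVecSW:
    """Software implementation, run on the general-purpose (GP) core."""
    backend = "GP"

    def __init__(self):
        self._data = {}

    def set(self, i, x):
        self._data[i] = x

    def get(self, i):
        return self._data.get(i, 0.0)


class SparseVecHW(SparseVecSW):
    """Stand-in for an SV accelerator: same interface, different backend."""
    backend = "SV"


class Dispatch:
    """Picks the accelerator for a type if one is registered, else software."""

    def __init__(self, accelerators):
        self._accelerators = accelerators   # type name -> class

    def create(self, type_name, fallback):
        return self._accelerators.get(type_name, fallback)()


with_sv = Dispatch({"SparseVec": SparseVecHW})
without_sv = Dispatch({})
for machine in (with_sv, without_sv):
    v = machine.create("SparseVec", SparseVecSW)
    v.set(3, 1.5)
    print(v.backend, v.get(3))
```

The same program runs on both machines; only the place where the work happens differs, which is the portability property the slide names.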


An Ideal Accelerator System

High Performance

Low Energy

Easy Use - align hardware interfaces with those software is already using

Portability - software fallback plan


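Enforcing Data Encapsulation

Instructions: set $v,$i,$x; get $v,$i,$x; dot $v1,$v2,$p

[Diagram: the CPU issues these instructions to the Sparse Vector Accelerator; operand fields are labeled v, i, x; the remaining block labels (I, A, B, C, D) are not recoverable from the transcript]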
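A rough Python model of the three instructions may help show what encapsulation means here: the CPU side holds only opaque handles and scalar values, never a reference to the vector's storage. The class name, the `alloc` method, and the dictionary representation are assumptions made for illustration; destination operands ($x for get, $p for dot) are modelled as return values.

```python
class SparseVectorAccelerator:
    """Illustrative model: storage is private, callers see only handles."""

    def __init__(self):
        self._store = {}   # handle -> {index: nonzero value}
        self._next = 0

    def alloc(self):
        """Create an empty vector and return its handle (the $v operand)."""
        v = self._next
        self._next += 1
        self._store[v] = {}
        return v

    def set(self, v, i, x):
        """set $v,$i,$x: store x at index i, keeping only nonzeros."""
        if x == 0:
            self._store[v].pop(i, None)
        else:
            self._store[v][i] = x

    def get(self, v, i):
        """get $v,$i,$x: return the scalar at index i (0 if absent)."""
        return self._store[v].get(i, 0)

    def dot(self, v1, v2):
        """dot $v1,$v2,$p: inner product over indices present in both."""
        a, b = self._store[v1], self._store[v2]
        if len(a) > len(b):
            a, b = b, a    # iterate over the shorter vector
        return sum(x * b[i] for i, x in a.items() if i in b)


acc = SparseVectorAccelerator()
u, w = acc.alloc(), acc.alloc()
acc.set(u, 0, 2)
acc.set(u, 5, 3)
acc.set(w, 5, 4)
acc.set(w, 9, 1)
print(acc.dot(u, w))   # only index 5 overlaps: 3 * 4 = 12
```

Because values cross the boundary only as scalars, the accelerator is free to choose any internal layout for the vectors.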


Specialized Caching for Sparse Vectors

[Plot: hit rate (0% to 100%) vs. storage capacity (128 to 2048 B); series: Standard Cache, VecStore]


Key Reuse in Hash Tables

[Plot: percentage of hash operations (0% to 100%) vs. number of keys (log scale, 0.1 to 100000); series: LZW Compress, Parser]

386-entry table: 26% of table, 99% of dynamic accesses

94K-entry table: 0.1% of table, 75% of dynamic accesses
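The kind of measurement behind these annotations can be sketched as follows: given a trace of the keys touched by hash operations, find the smallest fraction of distinct keys that accounts for a target share of all accesses. The function name and the toy trace are made up for illustration; they are not data from the talk.

```python
from collections import Counter


def hot_key_fraction(trace, coverage):
    """Smallest fraction of distinct keys covering `coverage` of accesses."""
    counts = Counter(trace)
    total = len(trace)
    covered = 0
    # Walk keys from most to least frequently accessed.
    for n_keys, (_, count) in enumerate(counts.most_common(), start=1):
        covered += count
        if covered >= coverage * total:
            return n_keys / len(counts)
    return 1.0


# Toy trace: one hot key and three cold ones.
trace = ["a"] * 9 + ["b", "c", "d"]
print(hot_key_fraction(trace, 0.75))   # 0.25: one of four keys covers 75%
```

A small result at high coverage is what makes a small, type-specific store attractive: most operations touch only a handful of entries.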


Exploiting Key Reuse

[Plot: reduction in HTX-M accesses (0% to 100%) vs. cache capacity (log scale, 1 to 1000); series: Compress HTX-M, Parser HTX-M Accesses, Compress HTX-M Entrystore Accesses, Parser HTX-M Entrystore Accesses]

[Diagram: Hash Table Accelerator (HTX), accepting put $h,$k,$v and get $h,$k,$v, with components HTX-C and HTX-M]
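The slide does not define HTX-C and HTX-M. The sketch below assumes HTX-C is a small cache placed in front of a larger backing table, HTX-M, and uses a write-through LRU policy purely for illustration; the actual design may differ. It counts how many operations reach the backing table, which is the quantity on the plot's vertical axis.

```python
from collections import OrderedDict


class CachedTable:
    """Small LRU cache (assumed role of HTX-C) over a full table (HTX-M)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()
        self.backing = {}
        self.backing_accesses = 0

    def put(self, k, v):
        # Write-through: every put reaches the backing table.
        self.backing[k] = v
        self.backing_accesses += 1
        self._fill(k, v)

    def get(self, k):
        if k in self.cache:
            self.cache.move_to_end(k)      # mark as most recently used
            return self.cache[k]
        self.backing_accesses += 1         # miss: go to the backing table
        v = self.backing.get(k)
        if v is not None:
            self._fill(k, v)
        return v

    def _fill(self, k, v):
        if self.capacity == 0:
            return
        self.cache[k] = v
        self.cache.move_to_end(k)
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used


def run(capacity):
    table = CachedTable(capacity)
    table.put("hot", 1)
    for _ in range(3):
        table.get("hot")
    return table.backing_accesses


baseline, cached = run(0), run(1)
print(1 - cached / baseline)   # 0.75: three of four accesses are absorbed
```

With heavy key reuse, even a one-entry cache absorbs most of the traffic in this toy trace, which mirrors the trend the plot reports for real workloads.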


Summary

Extend software’s encapsulated datatypes into hardware accelerators

Natural alignment with standard software engineering

Accelerator utility on all applications that use a particular type

A software fallback that ensures portability

Aggressive optimization of computation and data movement


Research Challenges

What are the appropriate types to target? What is the lower bound in complexity? Is there a max number of types a hardware system can support?

How do I implement polymorphism efficiently? (e.g., priority queue with arbitrary types and user-defined sort function)

How do I optimize enforcement of data encapsulation? (copy-on-read is conservative)

Can the execution model support parallel execution?

What is type-specific coherence like? Simpler? Uglier?

What is the appropriate system-level resource allocation between general and specialized? Between different types?


Thank You
