The Stanford Smart Memories: A 90nm, 55M transistor, 61mm², 8-core chip multiprocessor


Upload: derek-levine

Post on 03-Jan-2016



Page 1: The Stanford Smart Memories: A 90nm, 55M transistor, 61mm²,  8-core chip multiprocessor


VLSI technology scaling is driving changes
• Designs are getting complex and expensive
• Gate and wire delay balance is changing

Current architectures are hard to sustain
• Communication speed is not scaling
• Poor modularity

Variety of programming models emerge
• Streams, Multi-Thread, Transactional Memory
• Per-application optimizations

[Figure: Smart Memories architecture hierarchy.
Tile: CPU 0 and CPU 1 (each with Data / Inst’ ports), Configurable Ld/St Unit, Configurable Xbar, Configurable Memory Mats.
Quad (fabricated unit): Tile 0, Tile 1, Tile 2, Tile 3 and a Configurable Protocol Controller.
System of Quads: multiple Quads with Memory Controllers and TX/RX.]
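The hierarchy in the diagram can be sketched as a small data model. This is a minimal illustration, assuming only the counts shown on the slides (two CPUs per tile, four tiles and one protocol controller per quad). The class and function names are hypothetical, and the configurable memory mats, crossbar, and load/store unit are not modeled.

```python
from dataclasses import dataclass, field
from typing import List

CPUS_PER_TILE = 2   # CPU 0 and CPU 1 in the tile diagram
TILES_PER_QUAD = 4  # Tile 0 .. Tile 3 in the quad diagram


@dataclass
class Tile:
    """One tile: two CPUs sharing configurable memory (not modeled here)."""
    index: int
    cpus: List[str] = field(
        default_factory=lambda: [f"CPU {i}" for i in range(CPUS_PER_TILE)]
    )


@dataclass
class Quad:
    """One quad, the fabricated unit: four tiles plus one protocol controller."""
    index: int
    tiles: List[Tile] = field(
        default_factory=lambda: [Tile(i) for i in range(TILES_PER_QUAD)]
    )
    has_protocol_controller: bool = True

    def core_count(self) -> int:
        # Total CPUs across all tiles in this quad.
        return sum(len(tile.cpus) for tile in self.tiles)


def build_system(num_quads: int) -> List[Quad]:
    """Hypothetical helper: a system is a collection of quads."""
    return [Quad(q) for q in range(num_quads)]


if __name__ == "__main__":
    print(Quad(0).core_count())  # 8 cores in a single quad
```

A single `Quad` yields the 8 cores named in the title; `build_system(4)` corresponds to the "4-Quads" configuration mentioned in the verification hierarchy.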

Page 2

Design Methodologies
• Use of Tensilica™ cores
• Hierarchical verification
• Proc → Tile → Quad → 4-Quads
• Use of Relaxed Scoreboards for efficient system verification
• Design emulation using Bee2
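To make the relaxed scoreboard bullet concrete, here is a minimal sketch of the general idea: instead of predicting the single exact value a load must return, the checker tracks a set of values that would be acceptable and flags a load only if it falls outside that set. This is an assumed, simplified model written for illustration; the class, its method names, and the bookkeeping rules are hypothetical and are not taken from the Smart Memories verification environment.

```python
from collections import defaultdict


class RelaxedScoreboard:
    """Tracks, per address, the set of values a load may legally observe."""

    def __init__(self, initial_value=0):
        self._initial = initial_value
        # addr -> last value known to be globally visible
        self._committed = {}
        # addr -> values of stores issued but not yet completed
        self._in_flight = defaultdict(list)

    def store_issued(self, addr, value):
        # While a store is in flight, a load may see the old or the new value.
        self._in_flight[addr].append(value)

    def store_completed(self, addr, value):
        # Once the store is globally visible, the old value is no longer legal.
        self._in_flight[addr].remove(value)
        self._committed[addr] = value

    def allowed_values(self, addr):
        committed = self._committed.get(addr, self._initial)
        return {committed, *self._in_flight[addr]}

    def check_load(self, addr, observed):
        # A load passes if its value is in the allowed set, not one exact value.
        return observed in self.allowed_values(addr)


if __name__ == "__main__":
    sb = RelaxedScoreboard()
    sb.store_issued(0x40, 7)
    print(sb.check_load(0x40, 0), sb.check_load(0x40, 7))  # both acceptable
    sb.store_completed(0x40, 7)
    print(sb.check_load(0x40, 0))  # stale value is now rejected
```

The design choice illustrated is that the checker does not need a cycle-accurate model of the memory system: it only has to bound the legal outcomes, which keeps it usable across different configurations of the design under test.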

Design & Physical Implementation

Single Quad – 8 Processors in 4 Tiles + 1 Protocol Controller

Physical Statistics & Floor Plan
• 55M transistors, 2.5M instances
• ST CMOS 90nm Multi-Vt
• Nominal Operation: 1.4W @ 181MHz
• 1.0V core, 1.8V IO
• 22-181MHz variable speed IO clock
• Fully fine-grained clock gated

[Figure: die floor plan, 7.8mm × 7.8mm]
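A few derived figures follow directly from the statistics above. The short calculation below is only back-of-the-envelope arithmetic on the numbers quoted on the slide; the variable names are illustrative, and the per-core figure simply divides whole-chip power (which also covers memories and the protocol controller) by the core count.

```python
# Figures quoted on the slide
transistors = 55e6    # 55M transistors
die_side_mm = 7.8     # die is 7.8mm on each side
power_w = 1.4         # nominal power
freq_hz = 181e6       # nominal clock, 181MHz
cores = 8             # 8 processors in one quad

# Derived quantities
die_area_mm2 = die_side_mm ** 2                  # about 61 mm², matching the title
density_per_mm2 = transistors / die_area_mm2     # roughly 0.9M transistors per mm²
energy_per_cycle_nj = power_w / freq_hz * 1e9    # whole-chip energy per clock cycle
chip_power_per_core_w = power_w / cores          # chip power divided by core count

if __name__ == "__main__":
    print(f"die area:          {die_area_mm2:.1f} mm^2")
    print(f"density:           {density_per_mm2 / 1e6:.2f} M transistors/mm^2")
    print(f"energy per cycle:  {energy_per_cycle_nj:.2f} nJ")
    print(f"power per core:    {chip_power_per_core_w * 1e3:.0f} mW")
```

The 7.8mm × 7.8mm die gives about 60.8mm², consistent with the 61mm² in the title.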

Page 3

Testing Platform

[Figure: test platform photo, parts labeled a to d]

Bring-up Test Platform
a) SM test chip
b) Custom ‘daughter’ board
c) Control FPGA on Bee2 board
d) Custom double-ended DIMM cards

State Of The Testing
• System configuration
• Proc’s running programs
So far so good… testing continues

Related Publications
Mai_isca’00, Mai_isscc’04, Labonte_pact’04, Leverich_isca’07, Solomatnikov_dac’07, Shacham_micro’08, Firoozshahian_isca’09

[Figures: Test Chip; Bee2 FPGA Board; First Heartbeat]