Next Generation Stacked Memory Systems
Alok Gupta
NVIDIA, Santa Clara, CA
OUTLINE
1. Motivation
2. Memory Bandwidth Trends
3. Stacked Memory System
4. Conclusion
5. Q&A
MOTIVATION
To keep up with increasing logic horsepower, memory bandwidth must scale every generation; otherwise performance becomes IO-limited
Absolute power is not just a mobile problem
For example, GPUs are maxed out on power budgets at 225-300W
IO bandwidth improvements must be achieved within the same power budget as the last generation, which implies memory system power needs to stay the same
Process scaling brings limited improvement in memory IO power, whereas logic at least benefits from Moore’s law
New technology and ideas are needed to keep memory bandwidth growing within a similar power envelope
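The fixed-power argument on this slide can be illustrated with a back-of-the-envelope relation: IO power is roughly bandwidth multiplied by energy per bit, so doubling bandwidth at constant power requires halving the energy per bit. The sketch below is illustrative only. The function names and the bandwidth and energy-per-bit values are hypothetical placeholders, not figures from the talk.

```python
def io_power_watts(bandwidth_gbytes_per_s: float, energy_pj_per_bit: float) -> float:
    """Approximate memory IO power as (bits per second) x (joules per bit)."""
    bits_per_second = bandwidth_gbytes_per_s * 1e9 * 8
    return bits_per_second * energy_pj_per_bit * 1e-12


def required_energy_pj_per_bit(power_budget_w: float, bandwidth_gbytes_per_s: float) -> float:
    """Energy per bit (in pJ) that keeps IO power within a fixed budget."""
    bits_per_second = bandwidth_gbytes_per_s * 1e9 * 8
    return power_budget_w / bits_per_second * 1e12


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the scaling behaviour.
    current_bw = 250.0   # GB/s, assumed
    current_epb = 20.0   # pJ/bit, assumed
    budget = io_power_watts(current_bw, current_epb)
    target = required_energy_pj_per_bit(budget, 2 * current_bw)
    print(f"IO power today: {budget:.1f} W")
    print(f"pJ/bit needed for 2x bandwidth at the same power: {target:.1f}")
```

Under these assumptions, any bandwidth increase must be matched by a proportional reduction in energy per bit, which is the gap the stacked-memory approaches in the following slides aim to close.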
GPU MEMORY SYSTEM BANDWIDTH
MEMORY BANDWIDTH MAPPED TO DRAM TECHNOLOGY
Stacked Memory
PACKAGING TRENDS
| | Conventional | MCM (Multi-Chip Module) | Organic Interposer | Silicon Interposer | Die Stacking |
|---|---|---|---|---|---|
| Complexity | Well understood | Well-established process | No TSVs needed | TSVs limited to silicon interposer | TSVs needed across all chips in the stack |
| Cost | Low | Medium | TBD | Higher | Higher |
| Form Factor Size | (Reference) | Smaller PCB, Larger Package | Smaller PCB, Larger Package | Smaller PCB, Similar Package | Smaller PCB, Smaller Package |

[Figure: package cross-section diagrams (PKG, Silicon Interposer, GPU) labeled 2D, 2.1D, 2.5D, 3D]
HIGH BANDWIDTH MEMORY (HBM) DRAM
• A single package containing multiple memory die stacked together, using through-silicon vias (TSV). The memory within HBM is organized into channels wherein each channel is functionally and operationally independent
• HBM DRAM uses a wide-interface architecture to achieve high-speed, low-power operation and is best suited for 2.5D Silicon Interposer based system designs
Base Logic Layer in DRAM Process
HIGH BANDWIDTH MEMORY (HBM) DRAM
HBM DRAM array
2-8Gb DRAM die w/ ECC
4/8-high stack – 1GB to 8GB per stack
Up to 256GB/s DRAM internal bandwidth
Base Layer w/ HBM IO + DRAM Test and Repair logic – in DRAM process
DRAM Interface
Signaling – 1.2V LVCMOS
Data rate – 800MHz-1000MHz DDR, wide 1024-bit interface
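The bandwidth and capacity figures on this slide follow from simple arithmetic on the interface width, clock, die density, and stack height. The sketch below reproduces them. The function names are illustrative, and it assumes DDR means two transfers per clock and that 1 GB/s is 1e9 bytes per second.

```python
def peak_bandwidth_gbytes_per_s(clock_mhz: float, bus_width_bits: int,
                                transfers_per_clock: int = 2) -> float:
    """Peak interface bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bits_per_second = clock_mhz * 1e6 * transfers_per_clock * bus_width_bits
    return bits_per_second / 8 / 1e9


def stack_capacity_gbytes(die_density_gbit: int, stack_height: int) -> float:
    """Stack capacity in GB from per-die density (Gb) and number of dies."""
    return die_density_gbit * stack_height / 8


if __name__ == "__main__":
    # 1024-bit interface at the slide's 800 MHz and 1000 MHz DDR data rates.
    for clock in (800, 1000):
        bw = peak_bandwidth_gbytes_per_s(clock, 1024)
        print(f"{clock} MHz DDR x 1024 bits -> {bw:.1f} GB/s")

    # Smallest and largest configurations listed on the slide.
    print(f"2Gb die, 4-high: {stack_capacity_gbytes(2, 4):.0f} GB")
    print(f"8Gb die, 8-high: {stack_capacity_gbytes(8, 8):.0f} GB")
```

At 1000 MHz DDR the 1024-bit interface works out to 256 GB/s, and the 2Gb/4-high and 8Gb/8-high configurations give the 1 GB and 8 GB endpoints quoted per stack.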
MOBILE WIDE-IO2 MEMORY
Density: 8Gb
4/8 independent 64-bit channels, 256/512-bit interface
no cross channel restrictions
Interface Speed: 400-533MHz DDR
Bandwidth: 25.6-68.2GBps
1 through 4 high stacks
Mono stack is micro-bumped w/o TSV, Multi-high stack w/ TSV
Power efficient – leverages LP process, CMOS signaling
Designed for 3D stacking but can be made to work for 2.1/2.5D solutions
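The Wide-IO2 bandwidth range can be checked the same way, from the channel count, the 64-bit channel width, and the DDR clock. This is a minimal sketch with illustrative names. The upper clock of 533 MHz used here is an assumption, chosen because it reproduces the slide's 68.2GBps figure on a 512-bit interface.

```python
CHANNEL_WIDTH_BITS = 64  # each Wide-IO2 channel is 64 bits wide (from the slide)


def wide_io2_bandwidth_gbytes_per_s(channels: int, clock_mhz: float) -> float:
    """Peak bandwidth in GB/s for a DDR interface built from 64-bit channels."""
    interface_bits = channels * CHANNEL_WIDTH_BITS
    data_rate_mbps_per_pin = clock_mhz * 2  # DDR: two transfers per clock
    return interface_bits * data_rate_mbps_per_pin / 8 / 1000


if __name__ == "__main__":
    low = wide_io2_bandwidth_gbytes_per_s(channels=4, clock_mhz=400)
    high = wide_io2_bandwidth_gbytes_per_s(channels=8, clock_mhz=533)  # assumed clock
    print(f"4 channels (256-bit) @ 400 MHz DDR: {low:.1f} GB/s")
    print(f"8 channels (512-bit) @ 533 MHz DDR: {high:.1f} GB/s")
```

The low end comes out at 25.6 GB/s and the high end at roughly 68.2 GB/s, matching the range quoted on the slide.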
2.1D MEMORY SYSTEMS WITH WIDE MEMORY
| | 2.1D Organic Interposer | Fan-Out WLP | Fan-Out WLP PoP | Interposer PoP |
|---|---|---|---|---|
| PKG Height (mm) | 0.84 | 0.53 | 0.90 | 0.90 |
| PKG Technology Maturity Level | Medium | Medium | Low | Medium |
| Thermal | Good | Good | Poor | Poor |

PKG Reliability: Unknown
Cost: Unknown
[Figure: PKG construction diagrams for each option]
2.1D MEMORY SYSTEM CHALLENGES
Number of IOs
Signal density and routing limit the number of IOs
Interface Speed
Channel not as benign as other stacked solutions – performance/power trade-off
Bandwidth and Capacity scaling
Package Reliability
Solution Cost is Work-In-Progress
2.5D MEMORY SYSTEM WITH HBM DRAM
[Figure: Cross-Section View and Top View of a GPU/CPU and HBM stacks mounted on a passive silicon interposer, which sits on the package substrate]
2.5D HIGH DENSITY GPU-MEMORY INTERCONNECT
Silicon interposer enables fine pitch geometries
>50x finer geometry
Performance depends on signal integrity requirements
Loss (width)
Crosstalk (spacing)
GPU - HBM signal routing on Silicon Interposer
SILICON INTERPOSER LOSS & CROSSTALK
Insertion Loss
Resistance in channel
Slew Rate degradation due to channel loss
Very simple channel transfer function (almost RC) compared to off-chip signaling
Crosstalk dominated by adjacent aggressors
Line space and thickness
Eye is nice and open
Silicon Interposer Channel Characteristics
Resistance creates DC channel loss
Slew rate degradation due to loss
Coupled crosstalk is dominated by adjacent signals
Sharp roll-off w/o resonances
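The "almost RC" channel behaviour noted on this slide can be sketched as a first-order low-pass filter, whose gain falls monotonically with frequency and has no resonant peaks. This is a simplified lumped model, not the author's channel model. A real interposer trace is a distributed RC line, and the resistance and capacitance values below are hypothetical placeholders.

```python
import math


def rc_cutoff_hz(resistance_ohm: float, capacitance_f: float) -> float:
    """-3 dB frequency of a first-order RC low-pass: 1 / (2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * resistance_ohm * capacitance_f)


def rc_gain_db(freq_hz: float, resistance_ohm: float, capacitance_f: float) -> float:
    """Magnitude of H(f) = 1 / (1 + j*2*pi*f*R*C), expressed in dB."""
    omega_rc = 2.0 * math.pi * freq_hz * resistance_ohm * capacitance_f
    return -10.0 * math.log10(1.0 + omega_rc ** 2)


if __name__ == "__main__":
    # Hypothetical lumped trace values, for illustration only.
    r_ohm = 50.0
    c_farad = 1e-12

    fc = rc_cutoff_hz(r_ohm, c_farad)
    print(f"Cutoff frequency: {fc / 1e9:.2f} GHz")
    for multiple in (0.1, 0.5, 1.0, 2.0, 10.0):
        gain = rc_gain_db(multiple * fc, r_ohm, c_farad)
        print(f"f = {multiple:>4} x fc -> {gain:6.2f} dB")
```

In this model a wider line lowers the resistance and so reduces loss, while spacing affects coupling to neighbours, which the single-line model here does not capture.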
2.5D MEMORY SYSTEM CHALLENGES
Stacked memory solution still at an early stage – no very large volume products
Eco-system challenges
Multi-sourcing
Active collaboration required between foundry, memory vendor, and OSAT to deliver a successful product
Assembly, Test/Repair, and failure analysis
Solution cost trend is a big unknown
3D MEMORY SYSTEM WITH WIDE-IO2
WIDE-IO2 SOC CO-LAYOUT
Routing blockages
Power delivery
Keep-out regions
Thermal hot-spots
3D MEMORY SYSTEM CHALLENGES
Memory and SoC Co-layout
Bandwidth and Capacity scaling
Thermals
Power delivery
Cost
SUMMARY
Stacked memory is a promising solution for the ever-growing demand for bandwidth
Xilinx is shipping large FPGAs using silicon interposer – solves a unique problem
High Volume Manufacturing for mainstream GPU/CPU devices is still a work in progress
Business challenges
Multiple companies need to work together (foundry + memory vendor + OSAT)
Assembly, Failure Analysis, Test and Repair
DRAM Cost per bit
Stacking of heterogeneous devices not well understood – co-design, thermal and mechanical challenges
QUESTIONS?