understanding ddr4 and today’s dram frontier - … · 2014-10-16 · understanding ddr4 and...
TRANSCRIPT
Understanding DDR4
and Today’s DRAM Frontier
Oct 15th 2014
2/32
Contents
1. Industry Trend
2. Introduction of DDR4
3. New Technology Node
4. 3D Stacking Technology
5. What’s coming Next
3/32
DRAM Market & Application
47% of DRAM goes to server and PC applications
• EDP (Electronic Data Processing): Server + PC, 47% in total
  Server (15%): Mainframe, Supercomputer, Server, Workstation
  PC (32%): Desktop, Notebook, Mini laptop
• Mobile (25%): Tablet, Smart Phone, Cellular Phone
• Consumer & Gfx (21%): TV / LCD / Printer, Set-top box / D.Camera, Navigation / Black Box, Gfx card / Video Game, Home Appliance
• Industrial, Military, Aerospace (7%)
Source : Gartner(‘14.1Q)
4/32
Memory RAS and low TCO are required for servers
Server Application Trend
Trends: Server Virtualization, Moving to Cloud, Big Data
• Low TCO: operating voltage, stand-by power, core & I/O power
• Enhanced RAS: better S/I, reinforced resiliency
• High Performance: high bandwidth, better efficiency
Source : Gartner(‘14.1Q)
5/32
Mobility and small form factor are the key indices
Client Application Trend
Devices: Smart Phone & Tablet, 2-in-1, Ultrabook™, Traditional NB, AIO/NUC
Trends: Mobility, PC-like Experience
• Longer Battery Life: operating voltage, stand-by power, core & I/O power
• Small Form Factor: limited real estate, high capacity DRAM
• Better Experience: high bandwidth, better efficiency
6/32
High Capacity
High Performance/Watt
High Reliability
But Low Cost
7/32
4th Generation of DDR SDRAM
Successor to DDR3 from 2014, supporting all computing systems
PC66-133 → DDR → DDR2 → DDR3 → DDR4
MDDR → MDDR2 → LPDDR3 → LPDDR4
GDDR → GDDR2 → GDDR3 → GDDR4 → GDDR5
Timeline labels: ’02, ’05, ’07, ’14, ….
8/32
DDR4 Fully Addresses Industry Requirements
Key Market Needs & How DDR4 Meets Them
• X2 Bandwidth with Lower Power: up to 3.2Gbps with more coming; 20~40% power savings with power features
• Evolutionary path: single-ended; same clocking (Source Synchronous)
• Better Resources: double banks; higher density (up to 16Gb/mono, 128Gb/3DS)
• Lower Power Consumption: +5 power savings (Voltage/IO/Features)
• Greater Reliability: +6 RAS features
• Low Cost: keep 8 bit prefetch; bank grouping
10/32
DDR4 Feature Summary
DDR4 has advanced features over DDR3
Spec items | DDR3 | DDR4
Density | 512Mb~4Gb | 4Gb~16Gb
Speed | 0.8~1.8Gbps | 1.6~3.2Gbps
Interface
Voltage (VDD/VDDQ/VPP) | 1.5V/1.5V/NA (1.35V/1.35V/NA) | 1.2V/1.2V/2.5V
Data IO | CTT (34ohm) | POD (34ohm)
Vref_DQ | External Vref (VDD/2) | Internal Vref (need training)
CMD/ADDR IO | CTT | CTT
Strobe | Bi-dir / diff | Bi-dir / diff
Core architecture
# of banks | 8Banks | 16Banks (4Bank Group)
Page size (x4/8/16) | 1KB / 1KB / 2KB | 512B / 1KB / 2KB
# of prefetch | 8bits | 8bits
Added functions | RESET/ZQ/Dynamic ODT | + 3DS/CRC/DBI/Multi preamble …
Physical
Package type/balls (X4,8/X16) | 78 / 96 BGA | 78 / 96 BGA
DIMM type | R, LR, U, SoDIMM | R, LR, U, SoDIMM
DIMM Capacity | 512MB to 64GB | 8GB to 256GB
DIMM pins | 240 (R,LR,U) / 204 (So) | 288 (R,LR,U) / 260 (So)
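The per-pin speed ranges in the summary can be turned into peak module bandwidth with simple arithmetic. The sketch below is only illustrative: the function name is hypothetical, and the 64-bit data bus per module is an assumption that the slide does not state.

```python
def peak_bandwidth_gb_per_s(data_rate_gbps: float, bus_width_bits: int = 64) -> float:
    """Peak bandwidth in GB/s: per-pin rate (Gbps) times bus width, divided by 8 bits per byte."""
    return data_rate_gbps * bus_width_bits / 8


# Per-pin speed ranges taken from the DDR3 and DDR4 columns of the feature summary.
SPEED_RANGES_GBPS = {"DDR3": (0.8, 1.8), "DDR4": (1.6, 3.2)}

for generation, (slowest, fastest) in SPEED_RANGES_GBPS.items():
    low = peak_bandwidth_gb_per_s(slowest)
    high = peak_bandwidth_gb_per_s(fastest)
    # The 64-bit width is an assumed value, not a figure from the slides.
    print(f"{generation}: {low:.1f} to {high:.1f} GB/s per module (assumed 64-bit bus)")
```

These are theoretical peaks; sustained bandwidth depends on timing parameters such as tRRD and tFAW.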
11/32
Advanced Features for Performance
Advanced features to increase system performance
• DDR4 Platform Benefits
Item | DDR3 Platform* | DDR4
Max Cores | 12C | 18C
Vector Inst. | AVX(128b) | AVX2(256b)
QPI | 8.0GT/s | 9.6GT/s
* DDR3: ’13 SV Platform, DDR4: ’14 SV Platform
• No interleaving delay w/ Bank group
DDR3: 8 banks; DDR4: 16 banks (4 bank groups)
tRRD @1866: DDR4 4nCK, DDR3 5nCK (1 bubble)
Relative value @1866: DDR3_1866 1.16 vs. DDR4_1866 1 (16%)
• 2x Bandwidth of DDR4 versus DDR3
B/W (Mbps): DDR3 800~1600, DDR4 1600~3200
• tFAW limit-free of DDR4 : DDR4 512B page vs. DDR3 1KB page
[Figure: activate commands issued within one tFAW window, DDR3 vs. DDR4]
* tFAW : Four Activate Window CMD Set
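The tRRD comparison is given in clock cycles (nCK). A minimal sketch of the conversion to nanoseconds at the 1866 data rate follows; the helper names are hypothetical. It covers only the activate-to-activate spacing and does not attempt to reproduce the 16% figure on the slide, whose measurement basis is not stated.

```python
def tck_ns(data_rate_mtps: float) -> float:
    """Clock period in ns. DDR moves two transfers per clock, so the clock is half the data rate."""
    return 2000.0 / data_rate_mtps


def cycles_to_ns(n_ck: int, data_rate_mtps: float) -> float:
    """Convert a timing parameter expressed in clock cycles (nCK) to nanoseconds."""
    return n_ck * tck_ns(data_rate_mtps)


# tRRD values quoted on the slide for the 1866 speed grade.
ddr3_trrd_ns = cycles_to_ns(5, 1866)
ddr4_trrd_ns = cycles_to_ns(4, 1866)

print(f"DDR3 tRRD: {ddr3_trrd_ns:.2f} ns, DDR4 tRRD: {ddr4_trrd_ns:.2f} ns")
print(f"DDR3/DDR4 spacing ratio: {ddr3_trrd_ns / ddr4_trrd_ns:.2f}")
```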
12/32
System Performance Comparison
Platform using DDR4 offers better performance
More benefit for applications requiring high capacity (multi DPC)
Config | DDR3L (relative) | DDR4 (relative) | Gain
1DPC | DDR3L 1600M (1) | DDR4 2133M (1.42) | Max 42% better
2DPC | DDR3L 1333M (1) | DDR4 1867M (1.45) | Max 45% better
3DPC | DDR3L 800M (1) | DDR4 1600M (1.65) | Max 65% better
Source : Samsung SPEC_CPU Benchmark/DDR3L 1.35V vs. DDR4 1.2V/2Rank 16GB
13/32
Power Reduction of Core and I/O
Operating voltage is decreased from 1.5V (1.35V) to 1.2V
POD (Pseudo Open Drain) reduces I/O power
• Continuous decrease of VDD: DDR4 1.2V, P ∝ V²
• POD interface: half of the I/O power
[Figure: power consumption for a 0 1 0 1 0 1 0 data pattern with termination, DDR4 POD interface vs. DDR3 SSTL interface]
*SSTL : Stub Series Terminated Logic
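The P ∝ V² relation gives a quick estimate of what the voltage drop alone contributes. The sketch below assumes that frequency and switched capacitance stay the same and ignores static power; the function name is hypothetical and the results are a first-order estimate, not measured data.

```python
def relative_dynamic_power(v_new: float, v_old: float) -> float:
    """Dynamic power ratio under P proportional to V^2, with all other factors held constant."""
    return (v_new / v_old) ** 2


# Voltages quoted on the slide: DDR3 1.5V, DDR3L 1.35V, DDR4 1.2V.
vs_ddr3 = relative_dynamic_power(1.2, 1.5)
vs_ddr3l = relative_dynamic_power(1.2, 1.35)

print(f"DDR4 vs. DDR3 (1.5V):   {vs_ddr3:.2f}x dynamic power")
print(f"DDR4 vs. DDR3L (1.35V): {vs_ddr3l:.2f}x dynamic power")
```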
14/32
Efficient Power Consumption
DDR4 is about 20% more power efficient than DDR3
[Performance/watt]
Config | DDR3 (relative) | DDR4 (relative) | Gain
1DPC | DDR3 1600M (1) | DDR4 2133M (1.26) | 26% more power efficient
2DPC | DDR3 1333M (1) | DDR4 1867M (1.20) | 20% more power efficient
Source : Samsung Power Benchmark/DDR3L 1.35V vs. DDR4 1.2V
* Measured under controller’s POR condition w/ 2Rank 16GB RDIMM
15/32
Features for High Reliability
DDR4 supports Write CRC and CA Parity for high reliability
• Write CRC helps to recognize multi-bit failures during transmission
  DDR3: data that fails during transmission is written to DRAM as wrong data
  DDR4: data is sent together with a CRC; a failed CRC check triggers a re-request
• DDR4 can prevent malfunction caused by CMD/ADD errors
  Parity for CMD/ADD is checked between the controller (or register) and the DRAM
  * Once a parity error occurs, DDR4 requests the CMD/ADD set again
*CRC : Cyclic Redundancy Check
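The two checks can be illustrated with a small sketch. It assumes the ATM-8 polynomial x^8 + x^2 + x + 1 and a plain byte-wise CRC; the exact polynomial, bit mapping, and frame layout for DDR4 are defined in the JEDEC specification and are not reproduced here. All function names are hypothetical.

```python
CRC8_POLY = 0x07  # x^8 + x^2 + x + 1 (assumed here; see the JEDEC spec for the DDR4 definition)


def crc8(data: bytes, poly: int = CRC8_POLY) -> int:
    """Bitwise CRC-8, MSB first, zero initial value, no final XOR."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def even_parity_bit(bits: int) -> int:
    """Parity bit that makes the total number of 1s (payload plus parity) even."""
    return bin(bits).count("1") & 1


def frame_with_crc(burst: bytes) -> bytes:
    """Controller side: append the CRC byte to the write data."""
    return burst + bytes([crc8(burst)])


def crc_ok(frame: bytes) -> bool:
    """DRAM side: recompute the CRC over the payload and compare with the received byte."""
    return crc8(frame[:-1]) == frame[-1]


burst = bytes(range(8))              # illustrative 8-byte write burst
frame = frame_with_crc(burst)
corrupted = bytearray(frame)
corrupted[0] ^= 0x01                 # flip two bits "in transit"
corrupted[3] ^= 0x20

print("clean frame accepted:", crc_ok(frame))
print("corrupted frame accepted:", crc_ok(bytes(corrupted)))
print("parity bit for CMD/ADD pattern 0b1011:", even_parity_bit(0b1011))
```

A rejected frame corresponds to the "CRC check & re-request" path on the slide; without the CRC, the corrupted burst would simply be written.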
16/32
PPR(Post Package Repair)
Single bit and single row failures are repairable without any system power-off
PPR can repair about half of functional failures
• Repairable by PPR (51%): Single Bit Fail 37%, Single Row 14%
• Others: 49%
PPR flow: Recognizing → PPR → Restoring (a failing normal row is replaced by a redundant row)
Source : Samsung
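The idea behind PPR, replacing a failing row with a redundant row, can be modeled with a toy remap table. This is a conceptual sketch only: the class, its fields, and the fault model are hypothetical and do not describe the actual DDR4 command sequence or Samsung's implementation.

```python
class ToyBank:
    """Toy DRAM bank with normal rows plus a small pool of redundant rows."""

    def __init__(self, rows: int, spare_rows: int) -> None:
        self.cells = {}                                   # physical row -> stored data
        self.remap = {}                                   # logical row -> redundant physical row
        self.free_spares = list(range(rows, rows + spare_rows))
        self.faulty = set()                               # physical rows that lose their data

    def _physical(self, row: int) -> int:
        return self.remap.get(row, row)

    def write(self, row: int, data: str) -> None:
        self.cells[self._physical(row)] = data

    def read(self, row: int):
        physical = self._physical(row)
        if physical in self.faulty:
            return None                                   # models a failing row
        return self.cells.get(physical)

    def post_package_repair(self, row: int) -> bool:
        """Point a failing logical row at a free redundant row, if one is left."""
        if row in self.remap:
            return True
        if not self.free_spares:
            return False
        self.remap[row] = self.free_spares.pop(0)
        return True


bank = ToyBank(rows=8, spare_rows=1)
bank.faulty.add(3)
bank.write(3, "payload")
before_repair = bank.read(3)          # fails: row 3 is faulty
repaired = bank.post_package_repair(3)
bank.write(3, "payload")              # data is written again after the repair
after_repair = bank.read(3)
print(before_repair, repaired, after_repair)
```

The limited spare pool in the sketch mirrors the point that only part of the failures (51% in the slide's breakdown) fall into the repairable classes.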
17/32
DDR4 in Client Platform
Better performance (30% higher) and lower power (down to 70%)
Intel HEDT Launched
Performance (DDR3 → DDR4)
• Sandra2014: x1.0 → x1.3
• Performance Test 8.0: x1.0 → x1.2
Power Consumption, Idle and Active (DDR3 → DDR4)
• x1.0 → x0.7
• x1.0 → x0.8
18/32
Samsung DDR4 Line-up
DDR4 module solution for server application
Application | DIMM Type | Status
IA Server | Registered DIMM | Production
IA Server | Load Reduced DIMM | Production
Micro Server | ECC SODIMM | Production
DDR4 module solution for client application
Application | DIMM Type | Status
Traditional Desktop / HEDT | UDIMM | Production
Ultrabook / AIO / NUC | SODIMM | Sampling
20/32
Samsung’s Process Technology Journey
New DRAM process technology node every year
2znm new process product is in mass production
Process nodes*: 8xnm → 6xnm → 5xnm → 4xnm → 3xnm → 2xnm → 2ynm → 2znm
* Customer sample shipping date for 1st product of each process node
Press coverage: Tom’s Hardware (Mar.’14), Extreme Tech (Mar.’14), Computerworld (Mar.’14)
21/32
2znm DDR3 4Gb Status
2znm DDR3 4Gb is verified with current client platforms
Already in mass production with valued customers
DIMM Type | Density (Org.) | Validation Result | Status
Unbuffered SODIMM | 4GB (1Rx8) | Pass | Production
Unbuffered SODIMM | 8GB (2Rx8) | Pass | Production
Unbuffered DIMM | 4GB (1Rx8) | Pass | Production
Unbuffered DIMM | 8GB (2Rx8) | Pass | Production
22/32
2znm DDR4 8Gb Introduction
32GB RDIMM (8Gb) consumes 26% less power than 32GB LRDIMM (4Gb)
3% performance gain by eliminating the data buffer
[Chart: 32GB Comparison, DDR4 (4Gb) 32GB LRDIMM vs. DDR4 (8Gb) 32GB RDIMM]
• Power: LRDIMM 1.00, RDIMM 0.74 (26% power saving; the LRDIMM draws more power)
• Performance: LRDIMM 1.0, RDIMM 1.03 (3% performance gain; no tPD delay from the data buffer (DB))
24/32
TSV technology for 3DS
Enables DRAM stacking with better electrical characteristics
TSV Solutions (TSV via)
• <4H TSV Package>: master chip with slave chips
• <3DS TSV RDIMM>: Memory Controller to DRAM (Master); integrated buffer, less I/O power
• Only the master chip communicates with the controller, regardless of the number of stacked chips
Conventional Stack Solutions (Wire-Bond, RDL*)
• <QDP Wire-bond Package>
• <QDP LRDIMM>: Memory Controller to Data Buffer to DRAM
• The number of loads limits high speed operation
*RDL : Re-distribution Layer
25/32
Power Efficiency of 3DS Solution
3DS solution shows similar performance to buffered solutions
Significantly less power by removing additional ICs
Configurations: DDP 32GB LRDIMM (LR), 4H 3DS 64GB RDIMM (TSV), 4H 3DS 64GB LRDIMM (TSV_LR)
Performance & Latency: 3DS RDIMM performs the same as buffered solutions
• Bandwidth (Stream): LR 1, TSV 1.00, TSV_LR 0.96
• Latency (LMBench): LR 1, TSV 0.99, TSV_LR 1.03
Power Consumption: 4H 3DS DRAM consumes the same as conventional 2-stack
• DDP LRDIMM 1, TSV RDIMM 0.81, TSV LRDIMM 1.04 (chart annotations: ~24%, ~28%)
*Performance: SPECjbb benchmark, Latency: ATE, Power: Samsung memory stress PGM @ system
26/32
Unveiled 1st TSV product, 64GB RDIMM
64GB RDIMM with TSV is IN PRODUCTION
28/32
Needs for Higher Performance Memory
High performance DRAM solution needed in N/W, GFX and HPC
Network
• 200/400Gbit Ethernet from
  Big Data : 6.6 Zettabyte in ‘16 (CAGR 31%)
  Connected Devices : 1 trillion in ‘16 (IBM)
  Internet speed goes up : LTE, LTE-A
• Higher B/W requirement
  Look up Buffer : RLDRAM → HBM
  Packet Buffer : DDR3/4 → HBM
Ethernet | Solution
40/100Gbit | DDR3 1866Mbps (x16 * 8)
200Gb | DDR4 2.8~3.2Gbps (x16 * 16)
400Gb | HBM 100~200GB/s (1~2ea)
Graphic
• Graphics Revolution
  Improved 3D graphics, 4K Resolution, etc
  Expanded use of GPGPU
  Overcome Uncanny Valley
• Memory B/W keeps increasing
  4.7 Tera Flops in ‘13 : 288GB/s (GDDR5)
  9.7 Tera Flops in ‘15 : 600GB/s ↑
[Chart: GPU performance (Flops), 2004 to 2020; labels: 3 Tera; ’20, 60 Tera]
HPC
• GPGPU application enlargement
  Expand from Super Computer to Server
  Shazam : Cloud Service using GPGPU accelerator
• Core, Memory B/W increase
[Chart labels: GPGPU Acceleration, Memory BW Increase]
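The Ethernet solution rows (per-pin speed, device width, device count) can be converted into aggregate memory bandwidth for comparison with the HBM figure. The sketch below is plain arithmetic on the slide's numbers; the function name is hypothetical, and reading "x16 * 8" as eight x16 devices is an interpretation of the slide's notation.

```python
def aggregate_gb_per_s(rate_gbps_per_pin: float, width_bits: int, devices: int) -> float:
    """Aggregate peak bandwidth in GB/s for a group of identical DRAM devices."""
    return rate_gbps_per_pin * width_bits * devices / 8


# Rows from the Ethernet solution table; "x16 * N" is read as N devices of width 16.
ddr3_40_100g = aggregate_gb_per_s(1.866, 16, 8)
ddr4_200g_low = aggregate_gb_per_s(2.8, 16, 16)
ddr4_200g_high = aggregate_gb_per_s(3.2, 16, 16)

print(f"40/100Gbit, DDR3 1866Mbps x16 * 8:  {ddr3_40_100g:.1f} GB/s")
print(f"200Gb, DDR4 2.8~3.2Gbps x16 * 16:  {ddr4_200g_low:.1f} to {ddr4_200g_high:.1f} GB/s")
print("400Gb, HBM (1~2ea): 100~200 GB/s as quoted on the slide")
```

Under this reading, sixteen DDR4 devices reach roughly the bandwidth that the slide quotes for a single HBM stack.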
29/32
HBM (High Bandwidth Memory) Concept
HBM has 8 channels with 1024 I/O and supports up to 256GB/s
• 2/4/8H HBM stacks can be supported with TSV technology
[System side view using 4H HBM] labels: PCB, DRAM, Buffer, Logic Processor, Si Interposer, Mother Board
[HBM Structure – 4H Case] labels: Buffer, DRAM, Channel 0, Channel 1
HBM is the unique solution to achieve higher B/W with low power
[Chart: Memory Bandwidth (GB/s), 2015 to 2019, bandwidth requirement; series: DDR3/4, WIO2, GDDR5, HBM]
• HBM x4ea (1TB/s, 2Gbps)
• GDDR5 x12ea (384GB/s, 8Gbps)
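The bandwidth figures quoted for HBM and GDDR5 follow from I/O count times per-pin speed. The sketch below reproduces them; the function name is hypothetical, and the 32-bit width per GDDR5 device is an assumption that the slide does not state.

```python
def interface_gb_per_s(io_count: int, rate_gbps_per_pin: float, devices: int = 1) -> float:
    """Peak bandwidth in GB/s: I/O pins times per-pin rate times device count, over 8 bits per byte."""
    return io_count * rate_gbps_per_pin * devices / 8


HBM_IO = 1024          # from the slide: 8 channels with 1024 I/O in total
GDDR5_IO = 32          # assumed x32 device width, not stated on the slide

hbm_one_stack = interface_gb_per_s(HBM_IO, 2.0)
hbm_four_stacks = interface_gb_per_s(HBM_IO, 2.0, devices=4)
gddr5_twelve = interface_gb_per_s(GDDR5_IO, 8.0, devices=12)

print(f"HBM, 1 stack @2Gbps:   {hbm_one_stack:.0f} GB/s")
print(f"HBM x4ea @2Gbps:       {hbm_four_stacks:.0f} GB/s (about 1TB/s)")
print(f"GDDR5 x12ea @8Gbps:    {gddr5_twelve:.0f} GB/s")
print(f"I/O per HBM channel:   {HBM_IO // 8}")
```

The comparison shows the trade the slide is making: HBM runs each pin at a quarter of the GDDR5 speed and gets its bandwidth from a much wider interface.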
30/32
Thermal Management in 2.5D PKG is ready
[Chart: Temperature (°C) of the Buffer Die and the 1st to 4th DRAM Dies]
[Package side view labels: PCB, DRAM, Buffer, Logic Processor, Si Interposer, Mother Board]
31/32
Infrastructure Readiness for HBM
300mm wafer process line is ready for “Mass Production”
• Fab process qualification is completed with “State of the art” facilities
Process stages: FAB, Post-FAB, Assembly
Process steps shown: TSV, Bump, Carrier Bond, Back-side Pad, Debond & Saw, Stacking
32/32
Samsung Memory for All Computing Device
Smart Phone
DDR4 /LPDDR3,4
DDR4
LPDDR3/4
GDDR5