
Flash Forward @ CES 2011

Flash Storage Trends & Ecosystem

Hung Vuong
Qualcomm Inc.


Agenda
• Introduction
• Trends
– Wireless Industry Trends
– Memory & Storage Trends
• Opportunities
• Summary


Wireless Industry Moving Beyond Voice
More Bandwidth, Increased Data Use, Advanced Mobile Devices

[Figure: average throughput (Kbps, log scale from 1 to 10,000) versus year (1980 to 2010) for AMPS, GSM, CDMA, GPRS, CDMA 1x, EDGE, EVDO Rev 0, WCDMA, HSDPA 3.6, EVDO Rev A, HSDPA 7.2, DOrB (EVDO Rev B), HSPA+ and LTE]


The Performance of a Laptop in your Pocket

[Figure: handset CPU performance over time, with timeline markers 2010 and 2012]
• Intel 80186: 10 MIPS (~2.5 MHz)
• ARM MCU: < 20 MIPS
• ARM7TDMI: 23 MIPS* (27 MHz)
• Multimedia Platform, ARM9: up to 160 MIPS* (146 MHz)
• Enhanced Platform, ARM9: up to 250 MIPS* (225 MHz)
• Convergence, Dual-Core ARM9 + ARM11: up to 740 MIPS* (400 MHz)
• Snapdragon Family, Qualcomm Enhanced CPU** (1-1.5 GHz)

* Dhrystone 2.1
** ARM Instruction Set


Mobile services becoming the center of life
User trends shift from non-connected to connected, wired to wireless

[Figure: evolution of mobile services]
• Voice & Text: Voice, SMS/Email
• Streaming: Music/Ringtones, Video, Web Browsing
• User Generated Content: Mobile 2.0, Social Networking, Media Sharing, Collaboration, Mobile Advertising
• Rich Communication: VoIP, PTT/PTM, Video Communication, Multiplayer Gaming
• Seamless Connectivity: Ubiquitous Broadband, Apps store, Consumer Electronics

Progression labels: Simple Communication; Download; Download & Upload; Real-Time Delay Sensitivity; Seamless Fixed Mobile Convergence


New Device Categories
• Feature phone/smartphone: Communication (voice, text, email), Full internet, Entertainment, Navigation
• Traditional laptop: Productivity, Internet/E-mail, Entertainment, Navigation

Screen sizes by device class:
• Handset: < 2”
• Feature Phone: 2.5”
• Smartphone: < 3.5”
• SmartPhone: ~4”
• Tablet/SmartBook: 7” – 12”
• Wireless Notebook: > 12”

Target users: Mobile Professionals, Families


Qualcomm Snapdragon Family

[Roadmap figure with two columns: In Production, 2010 +]

Chipsets and air interfaces:
• QSD8250: HSPA
• QSD8650: DOrB, HSPA
• MSM7230: HSPA+
• MSM7630: DOrB/1xAdv, HSPA+
• MSM8255: HSPA+
• MSM8655: DOrB/1xAdv, HSPA+
• QSD8x50A: DOrB/1xAdv, HSPA
• MSM8260: HSPA+
• MSM8660: DOrB/1xAdv, HSPA+
• MSM8270: DC-HSPA+
• MSM8960: LTE, DOrB/1xAdv, DC-HSPA+
• QSD8x72: DOrB/1xAdv, HSPA+

Feature tiers shown in the figure:
• 800 MHz CPU, 720p Video, Half-XGA Display, HSPA+ 14 Mbps; Rel 7, SVDO
• 1 GHz CPU, 720p Video, Half-XGA Display, HSPA+ 14 Mbps; Rel 7, SVDO
• 1 GHz CPU, 720p Video, WXGA Display
• 1.3 GHz CPU, 720p Video, WXGA Display, PCDDR2
• 1.2 GHz Dual-CPU, 1080p Video, WXGA Display, SVDO
• 1.2 GHz Dual-CPU, 1080p Video, WXGA Display, SVDO, SV-LTE
• 1.5 GHz Dual-CPU, 1080p Video, WSXGA Display, PCI-Express, SATA, PCDDR2/3


NAND FLASH Technology Trend

[Figure: trends plotted against process node: 5xnm, 4xnm, 3xnm, 2xnm, 1xnm]
• Density: ~16Gb, ~32Gb, ~64Gb
• Reliability: ~10K/10yrs, ~3K/5yrs, ~2K/5yrs (?)
• ECC: ~4b/512B, ~8b/512B, ~24b/1KB, ~??b/??KB
• Page Size: 4KB, 8KB


Problem Statement: Applications vs. Storage
• Mobile Applications & their Computing Model
– “All about Multimedia” & more storage!
– Phone: Change from a simple voice-only platform to a smartphone with increased data usage
– OS: Change from a single-threaded to a multi-threaded environment
– Result: Demand for higher random IOPS performance
• Storage Technology
– FLASH: NOR – NAND – MLC NAND – managed-NAND
– Migration from Single-Level Cell (SLC) to Multi-Level Cell (MLC)
– Reliability Degradation
  • Lower endurance (3K cycles or lower) & retention (5 years or lower) for MLC FLASH
  • Higher ECC, 24 bits or higher per 1KB block
  • Larger page size, 8KB & possibly higher
– Result: Latency degradation in raw NAND
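To give a feel for what the rising ECC requirement costs, the sketch below estimates the parity overhead for the ECC strengths quoted on the trend slide. It assumes a binary BCH code, which the slides do not name; the function `bch_parity_bits` is a hypothetical helper, and the figure it returns is the usual upper bound of m × t parity bits for a code over GF(2^m).

```python
def bch_parity_bits(data_bytes: int, t: int) -> int:
    """Upper bound on parity bits for a binary BCH code correcting t bit errors.

    Assumption: the ECC is BCH. The slides only give strengths such as 24b/1KB.
    """
    data_bits = data_bytes * 8
    m = 1
    # A binary BCH code over GF(2^m) has length at most 2^m - 1 and needs
    # at most m * t parity bits, so pick the smallest field that fits.
    while data_bits + m * t > (1 << m) - 1:
        m += 1
    return m * t


if __name__ == "__main__":
    for label, data_bytes, t in [
        ("~4b/512B", 512, 4),
        ("~8b/512B", 512, 8),
        ("~24b/1KB", 1024, 24),
    ]:
        bits = bch_parity_bits(data_bytes, t)
        print(f"{label}: {bits} parity bits ({bits / 8:.1f} bytes) per {data_bytes}-byte block")
```

Under this assumption, moving from 4 bits per 512B to 24 bits per 1KB raises the parity from 52 bits to 336 bits per block, which is one reason decode effort and latency grow in raw NAND.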


Opportunities Within JEDEC
• Optimization for performance & power
– Optimization of device architecture with increased parallelism (Multi-plane & Multi-LUN support)
– e.MMC standardization for today's managed NAND
– JEDEC UFS is the next-generation NVM memory
• Size & pincount reduction on SoC
– Pincount will limit SoC size reduction
– Stacked memory (SIP, POP) continues to mature
– TSV (Through Silicon Via)… is still “emerging”
• Serial Interface
– Minimize pin count. Today's SoC pincount is ~600 & growing
– SoC process shrink continues to migrate to 2xnm
• Proven programming model
– Standard Host Controller Interface (HCI) definition
– UFS uses SCSI architecture & commands (SW re-use)


Serial Technologies
• Current CMOS IO is limited to ~250MHz; ~533MHz for PoP
• To improve CMOS IO performance, pincount must increase, or other aspects must be improved
– Clock architecture & methodology
  • E.g. Multi-phase & DLL
– Complex IO design
  • Drive Strength
  • Lower signaling Voltage
  • Lower bus loading
  • Termination
• There are multiple Serial technologies in the industry
– SATA, PCIe, MDDI, MIPI, SPMT, Rambus, etc.
– But how to do a Serial Interface while optimizing for power…
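The pincount argument can be checked with simple arithmetic. The sketch below assumes an idealized parallel bus where every data pin moves one bit per transfer, with no protocol overhead; the function names and the `transfers_per_clock` parameter are illustrative and not taken from the slides.

```python
import math


def parallel_bandwidth_mb_s(data_pins: int, clock_mhz: float, transfers_per_clock: int = 1) -> float:
    """Peak bandwidth in MB/s of an idealized parallel bus (no overhead assumed)."""
    # pins * MHz gives Mbit/s; divide by 8 for MB/s.
    return data_pins * clock_mhz * transfers_per_clock / 8


def data_pins_needed(target_mb_s: float, clock_mhz: float, transfers_per_clock: int = 1) -> int:
    """Smallest number of data pins that reaches the target bandwidth."""
    return math.ceil(target_mb_s * 8 / (clock_mhz * transfers_per_clock))


if __name__ == "__main__":
    print(parallel_bandwidth_mb_s(8, 250))   # 8 data pins at the ~250MHz limit
    print(data_pins_needed(500, 250))        # pins required for 500 MB/s at 250MHz
```

With the clock capped near 250MHz, an 8-bit bus tops out at 250 MB/s in this model, and doubling the bandwidth means doubling the data pins. That is the trade-off that motivates a faster serial link instead.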


Serial Interface Power Estimate: USB Sideloading

• Assumptions
– Transaction size: 64KBytes
– Transaction Rate: 30MB/s
– Flash Write Speed: 30MB/s

Inputs:
Transaction size (Bytes)            65.5E+3
Transaction Rate (Bytes/s)          30.0E+6
Flash Read/Write Speed (Bytes/s)    30.0E+6

Results:
                      e.MMC4.41   M-PHY(150MB/s)   M-PHY(250MB/s)   M-PHY(500MB/s)
Link Rate (Bytes/s)   100.0E+6    150.0E+6         250.0E+6         500.0E+6
Peak Utilization      30.0%       20.0%            12.0%            6.0%
Idle Utilization      70.0%       80.0%            88.0%            94.0%
Energy (mW/Gbps)      8.640       6.400            3.789            1.779

[Timing diagram: HS-Burst, Idle Time, HS-Burst, Idle Time, HS-Burst]
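The Peak Utilization and Idle Utilization rows follow directly from the stated assumptions: the link is busy for the fraction of time given by transaction rate divided by link rate, and idle for the rest. A minimal sketch of that arithmetic is below; the names are illustrative, and the Energy row is not reproduced because the slides do not give the power model behind it.

```python
TRANSACTION_RATE = 30.0e6  # bytes/s, from the slide's assumptions

# Link rates in bytes/s, as listed in the table.
LINK_RATES = {
    "e.MMC4.41": 100.0e6,
    "M-PHY(150MB/s)": 150.0e6,
    "M-PHY(250MB/s)": 250.0e6,
    "M-PHY(500MB/s)": 500.0e6,
}


def utilization(transaction_rate: float, link_rate: float) -> tuple[float, float]:
    """Return (peak, idle) fractions of time for a link carrying the given rate."""
    peak = transaction_rate / link_rate
    return peak, 1.0 - peak


if __name__ == "__main__":
    for name, link_rate in LINK_RATES.items():
        peak, idle = utilization(TRANSACTION_RATE, link_rate)
        print(f"{name}: peak {peak:.1%}, idle {idle:.1%}")
```

A faster link spends more of its time idle for the same payload, which is why the burst-then-idle pattern matters for the power estimate.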


UFS SCSI Architectural Model
• Today's e.MMC command-response based architecture is limiting
– Designed for a single-threaded computing model
– Increasing bus frequency will not further improve BW
• UFS is based on the SCSI architecture model, a multi-threaded computing operation model
– Command queuing
– Out-of-order execution
• Well-known architecture model for Host & Device, same as SSD, SAS & USB storage
– Flexible enough to support small & simple storage devices such as USB drives
– And still supports high-performance markets such as enterprise computing
• Goal: Re-use standard SCSI commands to enable UFS features
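As an illustration of what re-using standard SCSI commands means for host software, the sketch below builds a 10-byte READ(10) command descriptor block (CDB) as laid out in the SCSI block command set. READ(10) is chosen only because it is a familiar standard command; the slides do not list which commands the simplified set contains, and `build_read10_cdb` is a hypothetical helper.

```python
import struct

READ_10_OPCODE = 0x28  # standard SCSI READ(10) operation code


def build_read10_cdb(lba: int, num_blocks: int) -> bytes:
    """Pack a READ(10) CDB: opcode, flags, 32-bit LBA, group, 16-bit length, control."""
    if not 0 <= lba <= 0xFFFFFFFF:
        raise ValueError("LBA must fit in 32 bits")
    if not 0 <= num_blocks <= 0xFFFF:
        raise ValueError("transfer length must fit in 16 bits")
    # Big-endian fields; flags, group number and control are left at zero here.
    return struct.pack(">BBIBHB", READ_10_OPCODE, 0, lba, 0, num_blocks, 0)


if __name__ == "__main__":
    cdb = build_read10_cdb(lba=0x1000, num_blocks=8)
    print(cdb.hex())
```

Because the CDB format is the same one used by existing SCSI stacks, a host driver that already builds such commands for other storage can carry that code over.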


UFS Architectural Diagram

[Layered diagram, top to bottom]
• Application Layer
• UFS Command Set Layer (UCL): UFS Native Command Set, Simplified SCSI Command Set, Future Extension …
• Device Manager (Query Request) and Task Management/Task Manager
• Service access points: UDM_SAP, UTP_CMD_SAP, UTP_TM_SAP, UIO_SAP
• UFS Transport Protocol Layer (UTP)
• UFS InterConnect Layer (UIL): MIPI UniPro, MIPI M-PHY, Future Extension …


UFS Host Controller Interface (HCI)
• Objectives - Provide standard programming interface for UFS
– Enable the use of common Host/OS Driver
– Common Register set, for OS driver as well as Low-Level driver
• Low-level driver can be customized per HW host controller
– Management of DMA & queues
– Provide bus/link management capabilities
– Provide power management capabilities, including device power management
• Optimized interface for different UFS usage models with regard to embedded mass storage, memory card, and UFS bus topology
• Minimize changes to UFS Host SW as the technology matures


Host Controller Interface (HCI) Overview
• Device Management Entity (DME) Interface
o Support Native UniPro DME calls
• UFS Transport Protocol/Service Interface
o UFS Transport Protocol (UTP)
o UFS Transport Service IF
o UFS Native Command IF
• Host Controller Capability (HC CAP)
o Host version
o Host Controller Capability
o Host Controller Configuration


Host Controller Interface (HCI) Architecture
• Command Queuing
o Up to 32 Commands/Service Calls can be queued
o Command/Service call reordering by SW
• DMA Operation
o For command/data fetching
o For data delivery
• Native UniPro support
o Direct access to UniPro DME Service Access Point Interface at host register
o Local control/configuration
  – UniPro layer (L4-L1.5) control/configuration
  – M-PHY control/configuration
o Remote device control/configuration
  – Device reset
  – Device control/configuration
• Doorbell register to start command execution
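The interaction between the 32-entry queue and the doorbell register can be pictured with a toy software model. The sketch below assumes one doorbell bit per queue slot, set by the host to start a command and cleared on completion; the class and method names are hypothetical, and the real register layout is defined by the HCI specification rather than by this example.

```python
class ToyHostController:
    """Toy model of a 32-slot command queue driven by a doorbell bitmask."""

    NUM_SLOTS = 32  # "Up to 32 Commands/Service Calls can be queued"

    def __init__(self) -> None:
        self.doorbell = 0                      # assumed: one bit per slot
        self.slots = [None] * self.NUM_SLOTS   # command descriptors held in host memory

    def submit(self, command: str) -> int:
        """Place a command in the first free slot and ring its doorbell bit."""
        for slot in range(self.NUM_SLOTS):
            if not (self.doorbell >> slot) & 1:
                self.slots[slot] = command
                self.doorbell |= 1 << slot     # setting the bit starts execution
                return slot
        raise RuntimeError("all 32 slots are in use")

    def complete(self, slot: int) -> str:
        """Model command completion: clear the doorbell bit and free the slot."""
        if not (self.doorbell >> slot) & 1:
            raise ValueError(f"slot {slot} is not active")
        command = self.slots[slot]
        self.slots[slot] = None
        self.doorbell &= ~(1 << slot)
        return command


if __name__ == "__main__":
    hc = ToyHostController()
    for name in ("read A", "write B", "read C"):
        hc.submit(name)
    print(f"doorbell = {hc.doorbell:#034b}")
    hc.complete(1)                             # completions may arrive in any order
    print(f"doorbell = {hc.doorbell:#034b}")
```

Keeping the slot state in a single bitmask lets software see at a glance which commands are outstanding, and a freed slot can be reused immediately.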


Optimization: Command Queuing
• Need to migrate to multi-threaded operations
– Single-threaded operation of e.MMC will limit device peak BW
• Command Queuing & Native Command Queuing exist today with SATA/SCSI; UFS supports up to 32 queued commands
• Goal - improve random read and write performance while minimizing Host SW interaction
• Impact - Device firmware complexity increases, with added management functions
• Out-of-Order Execution. Device can service commands strategically out of order to achieve the greatest performance
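A rough way to see why queuing with out-of-order service helps is to compare total service time with and without it. The sketch below is a simplified model under stated assumptions: each command targets one flash die, commands on different dies can run in parallel, and a device without queuing handles one command at a time. The die mapping, durations, and function names are all hypothetical.

```python
from collections import defaultdict

# Each command is (target_die, duration_in_microseconds); values are illustrative.
Command = tuple[int, int]


def serial_time_us(commands: list[Command]) -> int:
    """One command at a time, in arrival order: durations simply add up."""
    return sum(duration for _die, duration in commands)


def queued_time_us(commands: list[Command]) -> int:
    """Device sees the whole queue and may start any command whose die is idle.

    With per-die parallelism, the finish time is set by the busiest die.
    """
    busy = defaultdict(int)
    for die, duration in commands:
        busy[die] += duration
    return max(busy.values(), default=0)


if __name__ == "__main__":
    queue = [(0, 100), (0, 100), (1, 100), (1, 100)]
    print("serial:", serial_time_us(queue), "us")
    print("queued, out of order:", queued_time_us(queue), "us")
```

In this example the device starts the die 1 commands without waiting for the die 0 commands ahead of them in the queue, halving the total time. The price is the added firmware complexity noted above.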


Other UFS Optimizations
• Background operations
– Define time for device operations to occur
– Define the power mode for these operations
• Power Management
– Power modes
– Notification. Indicate power mode transition
• Secure operations, including write & erase operations
• Dynamic Capacity
• Data reliability
– Reliable write
• Enhanced area
• Partitioning based on performance/reliability requirements


Conclusion
• Changes in the mobile computing model dictate changes to the storage device & its interface
• Serial interface
– Serial technology: Power, performance, and pin count optimization
– JEDEC-MIPI agreement allows the use of M-PHY & UniPro for JEDEC NVM & DRAM solutions
– JEDEC UFS is the 1st specification resulting from this agreement
• Focus on SW
– SCSI Architecture & Command set
– UFS HCI definition
• Result
– Quicker adoption of UFS
– Qualcomm committed to JEDEC Standard interface developments