Gordon: Using Flash Memory to Build Fast, Power-efficient Clusters for Data-intensive Applications Presenter: He Wang Department of Electrical and Computer Engineering University of Florida

Post on 13-Jul-2015

Gordon: Using Flash Memory to Build Fast, Power-efficient Clusters for Data-intensive Applications

Presenter: He Wang

Department of Electrical and Computer Engineering

University of Florida

Outline

• Motivation and Background

• Introduction to Gordon’s system architecture

• Gordon’s storage system

• Configuring Gordon

Wiki

• Gordon

o A flash-based system architecture for massively parallel, data-centric computing

• Features

o Power efficiency

o Performance advantage

o Aimed at data-centric applications

Motivation and Background

• Challenges with large-scale data processing

o Slowdown in uni-processor performance

o Latency and BW bottleneck of HDD

o Power constraints

• Improve performance and power efficiency

• Progress

o Programming models that parallelize data-processing programs

o Increased BW and reduced latency with SSD

o Recent power efficient processors

Motivation and Background (cont.)

• Gordon

o Programming system that parallelizes data-processing programs (e.g., MapReduce)

• Abstractions for specifying data-parallel computation

• Automating the parallelism

o SSD

• Improved flash translation layer (FTL)

o Power efficient processors

• 100s or 1000s of processors

• simple interconnect

Gordon system architecture

• Gordon nodes

o 256 GB flash memory, flash storage controller, 2 GB SDRAM, 1.9 GHz Intel Atom processor

o Connected through a 1 Gb Ethernet-style network

o A standard rack holds 16 enclosures for 256 nodes, with 64 TB of storage and 230 GB/s of I/O bandwidth

o Independent computer

• OS

• Network interfaces

Gordon system architecture

• Gordon node features

o Power efficient

• 19W to 81W

o High BW

• 900 MB/s
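
The rack-level figures on the architecture slides can be cross-checked from the per-node numbers. This is a minimal sketch of that arithmetic, not part of the original slides. The nodes-per-enclosure value is inferred from 256 nodes in 16 enclosures, and the unit conventions (binary for capacity, decimal for bandwidth) are assumptions.

```python
# Per-node and per-rack figures taken from the slides.
NODES_PER_RACK = 256
ENCLOSURES_PER_RACK = 16
FLASH_PER_NODE_GB = 256
BW_PER_NODE_MB_S = 900

# Inferred value (assumption): nodes are spread evenly over enclosures.
nodes_per_enclosure = NODES_PER_RACK // ENCLOSURES_PER_RACK

# Aggregate flash capacity per rack, using 1 TB = 1024 GB (assumption).
rack_storage_tb = NODES_PER_RACK * FLASH_PER_NODE_GB / 1024

# Aggregate I/O bandwidth per rack, using 1 GB/s = 1000 MB/s (assumption).
rack_bw_gb_s = NODES_PER_RACK * BW_PER_NODE_MB_S / 1000

print(nodes_per_enclosure, rack_storage_tb, rack_bw_gb_s)
```

Under these assumptions the totals agree with the 64 TB and roughly 230 GB/s quoted for a full rack.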

Figure 1. Gordon system architecture

Storage system

• Key to power efficiency and performance

• Support Erase, Program, Read operations

• Reliability issue

o Wear out, needs wear-leveling

• Flash translation layer (FTL)

Storage system

• Flash controller

o Implements FTL

o Link between CPU and flash array

• Shared buses, up to 4 packages

Storage system

• Gordon FTL

o Operates a write point

• Pointer to a page of flash memory

o Maintains a summary page in each block

• Logical block address (LBA)-to-physical mapping

• Benefit of this indirection

• Address organization

• Wear-leveling

• Workflow

o Receive write command

o Write data at the location given by the write point

o Update LBA table
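
The write workflow above can be sketched as a toy log-structured FTL with a single write point. This is an illustrative model only, not Gordon's implementation: the class name `SimpleFTL`, its methods, and the in-memory list standing in for flash are all hypothetical, and garbage collection, wear-leveling, and summary pages are left out.

```python
class SimpleFTL:
    """Toy FTL with one write point (hypothetical, for illustration)."""

    def __init__(self, num_pages):
        self.flash = [None] * num_pages  # physical pages
        self.lba_table = {}              # LBA -> physical page index
        self.write_point = 0             # next free physical page

    def write(self, lba, data):
        if self.write_point >= len(self.flash):
            raise RuntimeError("no free pages; garbage collection is not modeled")
        page = self.write_point
        self.flash[page] = data          # program the page at the write point
        self.lba_table[lba] = page       # update the LBA table
        self.write_point += 1            # advance the write point
        return page

    def read(self, lba):
        # Follow the LBA-to-physical mapping to find the latest copy.
        return self.flash[self.lba_table[lba]]


ftl = SimpleFTL(num_pages=8)
ftl.write(5, b"old")
ftl.write(5, b"new")   # rewriting an LBA lands on a fresh physical page
print(ftl.read(5), ftl.lba_table[5])
```

Because every write goes to a fresh page, rewriting the same LBA never programs the same physical page twice in a row, which is the indirection that makes wear-leveling possible.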

Storage system

• Gordon FTL advantage: write points

o Original FTL has only one write point, so no parallelism

o Multiple write points with spread access

o Sequence number

• Avoid conflict with occupied write point

• Assign the write point with the smallest available sequence number

Storage system

• Gordon FTL advantage: super-pages

o Manage the flash array at a larger granularity, with one write point for each

o Horizontal striping

o Vertical striping

o 2D striping
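
One way to picture the three striping schemes is to list which (bus, package) positions contribute a physical page to a given super-page. The sketch below assumes that horizontal striping spans all buses, vertical striping spans the packages on one bus, and 2D striping spans both. The function name, parameters, and the 4-by-4 array shape are hypothetical choices for illustration.

```python
def superpage_members(scheme, index, buses=4, packages_per_bus=4):
    """Return the (bus, package) positions that make up one super-page.

    Assumed layout (not taken from the slides): the flash array is a
    grid of `buses` x `packages_per_bus`.
    """
    if scheme == "horizontal":
        # Same package position on every bus.
        pkg = index % packages_per_bus
        return [(bus, pkg) for bus in range(buses)]
    if scheme == "vertical":
        # Every package on a single bus.
        bus = index % buses
        return [(bus, pkg) for pkg in range(packages_per_bus)]
    if scheme == "2d":
        # Every package on every bus.
        return [(bus, pkg) for bus in range(buses)
                for pkg in range(packages_per_bus)]
    raise ValueError(f"unknown scheme: {scheme}")


print(superpage_members("horizontal", 1))
print(superpage_members("vertical", 2))
print(len(superpage_members("2d", 0)))
```

Larger super-pages mean fewer mapping entries for the FTL to manage, at the cost of touching more physical pages per access.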

Storage system

• Super-page striping approaches

Figure 2. Three approaches to striping data across flash arrays

Storage system

• Super-page

o Pros

• Reduced overhead

o Cons

• Latency for sub-page access

• Wear-out affects a larger portion of the flash

Storage system

• Super-page evaluation

Figure 3. Flash storage array performance

Configuring Gordon

• Workloads

o Benchmarks that use MapReduce

• Power model

o Direct measurement of a running system

o Datasheet

P = IdlePower * (1-ActivityFactor) + ActivePower * ActivityFactor
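
The power formula above is a weighted average of idle and active power. A minimal sketch follows; the function name is hypothetical and the component values in the example are made up for illustration, not measurements from the slides.

```python
def average_power(idle_power, active_power, activity_factor):
    """P = IdlePower * (1 - ActivityFactor) + ActivePower * ActivityFactor."""
    if not 0.0 <= activity_factor <= 1.0:
        raise ValueError("activity_factor must be between 0 and 1")
    return idle_power * (1.0 - activity_factor) + active_power * activity_factor


# Hypothetical component: 2 W idle, 10 W active, busy 25% of the time.
print(average_power(2.0, 10.0, 0.25))
```

An activity factor of 0 gives the idle power and an activity factor of 1 gives the active power, so the result always lies between the two.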

Configuring Gordon

• Measuring cluster performance

o High-level simulator to measure overall performance

• Models 32 nodes by running 4 VMware virtual machines on each of 8 servers

o Sync mode provides an upper bound on execution time

o Nosync mode provides a lower bound

o Storage simulator

Configuring Gordon

• Pareto-optimal Gordon system design

Figure 4. Pareto-optimal Gordon system designs

Configuring Gordon

• Optimal Gordon configurations

Figure 5. Optimal Gordon configuration

Gordon outperforms disk-based configurations by 1.5X and delivers 2.5X more performance per watt

Configuring Gordon

• Gordon power consumption

o MaxE-flash consumes 40% of the energy of the disk-based configuration

o A factor of two increase in performance
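
If both bullets above describe the same configuration, and a factor-of-two performance increase is read as half the runtime for the same work, the implied average power relative to the disk-based system follows from energy = power × time. This is a back-of-the-envelope sketch under those two assumptions, not a result stated on the slides.

```python
# Figures from the slide, relative to the disk-based configuration.
relative_energy = 0.40   # MaxE-flash uses 40% of the energy
speedup = 2.0            # factor-of-two performance increase

# Assumption: same amount of work, so runtime scales as 1 / speedup.
relative_runtime = 1.0 / speedup

# Energy = power * time, so relative power = relative energy / relative time.
relative_power = relative_energy / relative_runtime

print(relative_runtime, relative_power)
```

Under this reading, most of the energy saving comes from finishing sooner rather than from a much lower average power draw.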

Figure 6. Relative energy consumption

Discussions

• Exploit disks for cheap redundancy

• Virtualizing Gordon

Thanks