Single-Chip Heterogeneous Computing: Does the Future Include Custom Logics, FPGA, and GPGPUs?

Presented by Kittisak Sajjapongse


Post on 23-Feb-2016



TRANSCRIPT

Page 1: Single-Chip Heterogeneous Computing Does the Future Include Custom Logics, FPGA, and GPGPUs?

Single-Chip Heterogeneous Computing: Does the Future Include Custom Logics, FPGA, and GPGPUs?

Presented by Kittisak Sajjapongse

Page 2: Introduction to the study

Page 3: Objective of the study

◦ Observe the trends in integrating unconventional cores (U-cores) into single-chip multicores

◦ Identify the factors that affect the decision to include U-cores

Page 4: Model in the study

Symmetric
- Multiple fast complex cores (FastCore)
- Highly optimized to minimize single-thread latency

Asymmetric
- One fast complex core (FastCore)
- Multiple simple cores (BCE)
- Intended for applications that have parallelism

Heterogeneous
- One fast complex core (FastCore)
- U-cores: ASICs, FPGAs, GPGPUs
- U-cores are the focus of this study

Page 5: ASIC, FPGA, and GPGPU

ASIC (Application-Specific Integrated Circuit)
◦ An integrated circuit customized for a specific application domain, e.g. an H.264 codec or a JPEG codec

FPGA (Field-Programmable Gate Array)
◦ A configurable digital integrated circuit capable of implementing hardware architectures

GPGPU (General-Purpose Graphics Processing Unit)
◦ Graphics devices that provide APIs (Application Programming Interfaces) for use with parallelizable applications

Page 6: ASIC, FPGA, and GPGPU

Features             | ASIC                                             | FPGA                                | GPGPU
Design/Program       | CAD/CAM, EDA (Electronic Design Automation) tool | Hardware Description Language (HDL) | OpenCL, CUDA, etc.
Design controls      | Transistors, standard cells                      | Logic components, RTL               | Processors, cache, memory
Flexibility          | Fixed-function (1)                               | Configurable (2)                    | Programmable (3)
Level of abstraction | Low (1)                                          | Medium (2)                          | High (3)
Power efficiency     | Extremely high (3)                               | High (2)                            | Moderate (1)

They are all used to exploit parallelism!

Page 7: What is the study about?

Constraints
◦ Power
◦ Bandwidth

Questions posed
Under bandwidth and power constraints:
◦ Would single-chip multicores benefit significantly from U-cores?
◦ Would ASICs be the best choice?

Page 8: Model for U-core

Page 9: What is BCE?

Baseline Core Equivalent
◦ Refers to a basic processor
◦ Used as the baseline reference for performance and power consumption

Page 10: What is BCE?

Two parameters used later
◦ n: total number of BCEs available
◦ r: amount of resources dedicated to complex cores (in units of BCE)

Page 11: Amdahl’s Law

Reference: http://en.wikipedia.org/wiki/Amdahl_law
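The formula shown on this slide did not survive in the transcript. In its standard form, Amdahl's Law gives the speedup of a program whose parallelizable fraction f runs on n processors as 1 / ((1 - f) + f / n). A minimal Python sketch of that standard form follows; the function name `amdahl_speedup` is illustrative and does not come from the slides.

```python
def amdahl_speedup(f: float, n: int) -> float:
    """Speedup predicted by the standard form of Amdahl's Law.

    f: fraction of the execution time that can be parallelized (0 <= f <= 1)
    n: number of processors working on the parallel part (n >= 1)
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be in [0, 1]")
    if n < 1:
        raise ValueError("n must be >= 1")
    # The serial part (1 - f) is not accelerated; the parallel part is split n ways.
    return 1.0 / ((1.0 - f) + f / n)


if __name__ == "__main__":
    for f in (0.5, 0.9, 0.99):
        print(f"f={f}: speedup on 16 processors = {amdahl_speedup(f, 16):.2f}")
```

The serial term (1 - f) bounds the achievable speedup at 1 / (1 - f) no matter how large n becomes, which is why the rest of the deck keeps returning to the amount of available parallelism.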

Page 12: Hill & Marty’s extended Amdahl’s Law

Reference: M. D. Hill et al., “Amdahl’s Law in the Multicore Era,” Computer
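The formulas on this slide are also missing from the transcript. As a rough sketch, Hill and Marty's symmetric and asymmetric multicore speedups can be written in terms of n (total BCEs), r (BCEs spent on one complex core), and perf(r) (sequential performance of an r-BCE core). The code below assumes perf(r) = sqrt(r), which is the assumption Hill and Marty use in their paper; the function names are illustrative.

```python
import math


def perf(r: float) -> float:
    """Sequential performance of a core built from r BCEs (assumed to be sqrt(r))."""
    return math.sqrt(r)


def speedup_symmetric(f: float, n: float, r: float) -> float:
    """n / r identical cores, each built from r BCEs."""
    serial = (1.0 - f) / perf(r)          # one core runs the serial part
    parallel = f * r / (perf(r) * n)      # all n / r cores run the parallel part
    return 1.0 / (serial + parallel)


def speedup_asymmetric(f: float, n: float, r: float) -> float:
    """One r-BCE core plus (n - r) single-BCE cores."""
    serial = (1.0 - f) / perf(r)          # the big core runs the serial part
    parallel = f / (perf(r) + n - r)      # big core and small cores share the parallel part
    return 1.0 / (serial + parallel)


if __name__ == "__main__":
    f, n = 0.9, 16
    for r in (1, 4, 16):
        print(f"r={r}: symmetric={speedup_symmetric(f, n, r):.2f}, "
              f"asymmetric={speedup_asymmetric(f, n, r):.2f}")
```

With r = 1 both expressions reduce to the classic Amdahl form, which is a quick sanity check on the sketch.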

Page 13: How about heterogeneous architectures?

Speedup_heterogeneous(?) = ?

Under power and bandwidth constraints

Page 14: Deriving model for U-core

Speedup_Amdahl = function of (f, n)

Speedup_Hill&Marty = function of (f, n, r)

Speedup_Het.(U-core) = function of (f, n, r, B, P, µ, φ)

New parameters:
◦ B: memory bandwidth of U-core (in units of BCE compulsory bandwidth)
◦ P: active power of U-core relative to BCE
◦ µ: performance of U-core relative to BCE
◦ φ: power efficiency of U-core relative to BCE

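Page 15: Deriving model for U-core

$$\mathrm{Speedup}_{\mathrm{asymmetric}} = \frac{1}{\dfrac{1-f}{\mathrm{perf}(r)} + \dfrac{f}{\mathrm{perf}(r) + n - r}}$$

$$\mathrm{Speedup}_{\mathrm{asym(offload)}} = \frac{1}{\dfrac{1-f}{\mathrm{perf}(r)} + \dfrac{f}{n - r}}$$

$$\mathrm{Speedup}_{\mathrm{het(U\text{-}core)}} = \frac{1}{\dfrac{1-f}{\mathrm{perf}(r)} + \dfrac{f}{\mu\,(n - r)}}$$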
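The offload and U-core speedup expressions on this slide can be sketched in Python as follows. This is only an illustration under stated assumptions: perf(r) = sqrt(r) is borrowed from Hill and Marty, the power (P, φ) and bandwidth (B) limits are deliberately left out, and the value of µ used in the demo is hypothetical rather than a measured one.

```python
import math


def perf(r: float) -> float:
    """Sequential performance of an r-BCE core (assumption: sqrt(r))."""
    return math.sqrt(r)


def speedup_asym_offload(f: float, n: float, r: float) -> float:
    """Asymmetric chip where the parallel part runs only on the (n - r) BCEs."""
    if n <= r:
        raise ValueError("need n > r so that some area is left for parallel cores")
    return 1.0 / ((1.0 - f) / perf(r) + f / (n - r))


def speedup_heterogeneous(f: float, n: float, r: float, mu: float) -> float:
    """Heterogeneous chip where the (n - r) BCEs of area are replaced by U-cores.

    mu: performance of a BCE-sized U-core relative to one BCE.
    Power and bandwidth constraints are NOT modeled in this sketch.
    """
    if n <= r:
        raise ValueError("need n > r so that some area is left for U-cores")
    return 1.0 / ((1.0 - f) / perf(r) + f / (mu * (n - r)))


if __name__ == "__main__":
    f, n, r = 0.9, 16, 4
    hypothetical_mu = 10.0  # illustrative value only, not taken from the study
    print(f"offload:       {speedup_asym_offload(f, n, r):.2f}")
    print(f"heterogeneous: {speedup_heterogeneous(f, n, r, hypothetical_mu):.2f}")
```

Setting µ = 1 makes the heterogeneous expression collapse to the offload expression, which is the only difference between the two formulas.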

Page 16: Obtaining µ, φ for U-core

Page 17: Devices & Workload

Workload:
- Dense Matrix Multiplication (MMM)
- Fast Fourier Transform (FFT, with input sizes from 2^4 to 2^20)
- Black-Scholes (BS)

Device:

Device        | Ref. Device
BCE           | Intel Atom
Symmetric CMP | Intel Core i7-960
ASIC (U-core) | 65nm technology (1.1V)
FPGA (U-core) | V6-LX760
GPU (U-core)  | GTX285, GTX480, R5870

Page 18: Deriving µ for ASIC in FFT-1024 (case study)

3500.5

Page 19: Deriving φ for ASIC in FFT-1024 (case study)

100

0.8

Page 20: Obtained Parameters

Page 21: Applying the Model for Results

Page 22: Scaling Projection

Page 23: Budget and Constraints

Page 24: Results for FFT-1024

Page 25: Results for MMM

Page 26: Results for Black-Scholes

Page 27: Answering the questions

◦ Would single-chip multicores benefit significantly from U-cores?
  Yes, if the application has enough (>90%) parallelism to exploit.

◦ Would ASICs be the best choice?
  It depends on the application: if there is not much parallelism, an ASIC might not be worth implementing.
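To see why the answer hinges on the parallel fraction f, the heterogeneous speedup expression from the model can be evaluated for a few values of f. The sketch below is only illustrative: n, r, and µ are hypothetical values, perf(r) = sqrt(r) is an assumption, and power and bandwidth limits are ignored, so the numbers it prints are not results from the study.

```python
import math


def speedup_heterogeneous(f: float, n: float, r: float, mu: float) -> float:
    """Heterogeneous speedup with perf(r) assumed to be sqrt(r); no power/bandwidth limits."""
    serial = (1.0 - f) / math.sqrt(r)     # serial part runs on the r-BCE fast core
    parallel = f / (mu * (n - r))         # parallel part runs on the U-cores
    return 1.0 / (serial + parallel)


def serial_bound(f: float, r: float) -> float:
    """Upper limit on speedup as mu grows without bound (requires f < 1)."""
    return math.sqrt(r) / (1.0 - f)


if __name__ == "__main__":
    n, r, mu = 16, 4, 100.0  # hypothetical chip and U-core parameters
    for f in (0.5, 0.9, 0.99):
        s = speedup_heterogeneous(f, n, r, mu)
        print(f"f={f}: speedup={s:.1f} (limit for infinitely fast U-core: {serial_bound(f, r):.1f})")
```

At low f the serial term dominates, so even a very fast U-core adds little; this is the effect behind the ">90% parallelism" condition stated above.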

Page 28: Conclusions

◦ Sufficient parallelism must exist to obtain a significant performance improvement from U-cores

◦ Flexible U-cores tend to be competitive with ASICs under limited bandwidth and limited parallelism

◦ U-cores such as ASICs are useful when low power is the primary goal