author: partha biswas, nikil d. dutt, laura pozzi, paolo ienne presentation by ganghee ... · 2009....


Page 1: Author: Partha Biswas, Nikil D. Dutt, Laura Pozzi, Paolo Ienne Presentation by Ganghee ... · 2009. 4. 3. · Author: Partha Biswas, Nikil D. Dutt, Laura Pozzi, Paolo Ienne Presentation

Author: Partha Biswas, Nikil D. Dutt, Laura Pozzi, Paolo Ienne

Presentation by Ganghee Jang

Page 2:

What is ISE & Architecturally Visible Storage?

Motivation

Related Work

Design Approaches with AVS

Results

Not a Bird's-Eye View: a Different View through my own Eyes

Page 3:

ISE (Instruction Set Extension)

◦ An Instruction Set extended to support a Customized Functional Unit

AVS (Architecturally Visible Storage)

◦ Storage that is (architecturally) visible from the Functional Unit

From my Understanding,

◦ "Visible" refers to "Where" the Data is synchronized to

◦ To own the Data it processes, the FU can have its own State and internal Data

◦ It can access Memory or Cache directly

◦ It can have Internal Memory

Page 4:

Let’s assume an Instruction that multiplies a Matrix by a Scalar:

Mult_s_M(scalar, Matrix)
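To make the running example concrete, here is a minimal software model of what such an instruction would compute. It is only a sketch: `mult_s_m` is a hypothetical stand-in for the slide's `Mult_s_M`, and the matrix is assumed to be a plain list of rows.

```python
def mult_s_m(scalar, matrix):
    """Software model of a hypothetical Mult_s_M(scalar, Matrix) instruction.

    Returns a new matrix in which every element is multiplied by `scalar`.
    The matrix is assumed to be a list of rows (an illustrative choice).
    """
    return [[scalar * element for element in row] for row in matrix]


result = mult_s_m(2, [[1, 2], [3, 4]])
print(result)  # [[2, 4], [6, 8]]
```

A whole matrix does not fit into a handful of register operands, which is presumably why this instruction is used to motivate storage inside the functional unit.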

Page 5:

The Benefit of AVS

Lowers Cache Pollution

◦ Data bypasses the Cache and the Registers

Increases the Scope of the ISE Algorithm

◦ Including load/store Instructions gives extra Chances for further Optimization of the Algorithm

Reduces Energy Consumption

◦ By reducing Memory Accesses

Page 6:

Scratchpad

Avoids some Cache Pollution

Still shares the Register File

Page 7:

Intelligent SRAM

Moves the ALU closer to the On-Chip SRAM

Increases the Complexity of the Identification Problem

Page 8:

Cut: the Code identified for an ISE

All Arguments for the AFU (Application-specific Functional Unit) come from the Register File
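One way to picture a cut whose arguments all come from the register file is to list the values that cross its boundary. The sketch below is an assumption-laden illustration, not the authors' identification algorithm: the data-flow graph `dfg`, its node names, and the helpers `cut_inputs` and `cut_outputs` are all hypothetical.

```python
def cut_inputs(dfg, cut):
    """Values produced outside `cut` but consumed by a node inside it."""
    return {src for node in cut for src in dfg[node] if src not in cut}


def cut_outputs(dfg, cut):
    """Nodes inside `cut` whose value is consumed by a node outside it."""
    return {src
            for node, sources in dfg.items() if node not in cut
            for src in sources if src in cut}


# Hypothetical data-flow graph: each node maps to the nodes it reads from.
dfg = {
    "a": [], "b": [], "c": [],
    "mul": ["a", "b"],
    "add": ["mul", "c"],
    "st": ["add"],
}
cut = {"mul", "add"}  # the operations that would be moved into the AFU

print(sorted(cut_inputs(dfg, cut)))   # ['a', 'b', 'c']
print(sorted(cut_outputs(dfg, cut)))  # ['add']
```

In this toy case the cut needs three values read from the register file and writes one back, so the number of register ports bounds how large such a cut can grow.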

Page 9:

Larger Cut, so larger Granularity

Needs DMA for efficient Data Transfer

Page 10:

CFG: Control Flow Graph

◦ Represents a Program as a Graph

Basic Block: each Node in the CFG

◦ A Piece of Code with the following Characteristics

◦ Entrance: only at the Beginning

◦ Exit: only at the End
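The two characteristics above are what the usual "leader" method relies on when splitting code into basic blocks. The following is a minimal sketch over a hypothetical toy instruction format (tuples of opcode and jump target); none of these names come from the slides.

```python
def split_basic_blocks(instructions):
    """Split a list of instructions into basic blocks (lists of indices).

    Each instruction is a tuple (opcode, target) in a hypothetical toy IR.
    `target` is the index of the destination for "jmp" and "br"
    (conditional branch) and None for every other opcode.
    """
    if not instructions:
        return []
    leaders = {0}  # the first instruction always starts a block
    for index, (opcode, target) in enumerate(instructions):
        if opcode in ("jmp", "br"):
            leaders.add(target)  # a jump target starts a block
            if index + 1 < len(instructions):
                leaders.add(index + 1)  # so does the instruction after a jump
    starts = sorted(leaders)
    ends = starts[1:] + [len(instructions)]
    return [list(range(start, end)) for start, end in zip(starts, ends)]


program = [
    ("load", None),   # 0
    ("cmp", None),    # 1
    ("br", 5),        # 2: conditional branch to instruction 5
    ("add", None),    # 3
    ("jmp", 6),       # 4: unconditional jump to instruction 6
    ("sub", None),    # 5
    ("store", None),  # 6
]
print(split_basic_blocks(program))
# [[0, 1, 2], [3, 4], [5], [6]]
```

Every block is entered only at its first instruction and left only at its last one, which matches the definition on this slide.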

Page 11:

A node i dominates a node j if every path from the entry node to j passes through i.

[Figure: example graph with nodes A, B, C, D, E, F]
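The definition of dominance can be turned into code with the classic iterative data-flow formulation. This is a sketch under stated assumptions: the CFG below is a made-up diamond, not the graph from the slide (its edges did not survive in this transcript), and every node is assumed to be reachable from the entry.

```python
def dominators(cfg, entry):
    """Return {node: set of nodes that dominate it}.

    `cfg` maps each node to the list of its successors. Every node is
    assumed to be reachable from `entry`. The rule applied until nothing
    changes is:
        dom(entry) = {entry}
        dom(n)     = {n} | intersection of dom(p) over all predecessors p
    """
    nodes = set(cfg)
    preds = {node: set() for node in nodes}
    for node, successors in cfg.items():
        for succ in successors:
            preds[succ].add(node)

    dom = {node: set(nodes) for node in nodes}  # start from "everything"
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for node in nodes - {entry}:
            if not preds[node]:
                continue  # would be unreachable; excluded by assumption
            new = set.intersection(*(dom[p] for p in preds[node])) | {node}
            if new != dom[node]:
                dom[node] = new
                changed = True
    return dom


# Hypothetical diamond-shaped CFG.
cfg = {
    "entry": ["left", "right"],
    "left": ["join"],
    "right": ["join"],
    "join": ["exit"],
    "exit": [],
}
dom = dominators(cfg, "entry")
print(sorted(dom["join"]))  # ['entry', 'join']
print(sorted(dom["exit"]))  # ['entry', 'exit', 'join']
```

Neither `left` nor `right` dominates `join`, because a path through the other branch avoids it.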

Page 12:

CFG of the fft Algorithm

p: the last Polluter of cbb

cbb: the basic block for which an ISE is currently being identified

Page 13:

R: Nodes that can be reached from p

D: Nodes that strictly dominate cbb

Candidates: R∩D = {4, 5, 6, 7, 8, 9}

Select the Node with the least Execution Count
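The selection rule on this slide (intersect R and D, then take the node with the least execution count) can be sketched as follows. This is an illustration rather than the authors' implementation: the CFG, the execution counts, and all function names are hypothetical, p itself is assumed to count as reachable from p, and strict dominance is checked directly from its definition by removing one node at a time.

```python
def reachable(cfg, start, removed=None):
    """Nodes reachable from `start` (inclusive), never entering `removed`."""
    if start == removed:
        return set()
    seen, stack = {start}, [start]
    while stack:
        for succ in cfg[stack.pop()]:
            if succ != removed and succ not in seen:
                seen.add(succ)
                stack.append(succ)
    return seen


def strict_dominators(cfg, entry, target):
    """Nodes other than `target` that lie on every entry-to-target path."""
    return {node for node in cfg
            if node != target
            and target not in reachable(cfg, entry, removed=node)}


def dma_candidates(cfg, entry, polluter, cbb):
    """R ∩ D: reachable from the polluter and strictly dominating cbb."""
    return reachable(cfg, polluter) & strict_dominators(cfg, entry, cbb)


def pick_dma_node(cfg, entry, polluter, cbb, exec_count):
    """Candidate with the least execution count (ties: smallest node id)."""
    candidates = dma_candidates(cfg, entry, polluter, cbb)
    return min(candidates, key=lambda n: (exec_count[n], n), default=None)


# Hypothetical CFG: node 2 is a loop that writes the data (the polluter),
# nodes 4 and 5 form the loop that contains cbb (node 5).
cfg = {1: [2], 2: [2, 3], 3: [4], 4: [5, 6], 5: [4], 6: []}
exec_count = {1: 1, 2: 10, 3: 1, 4: 101, 5: 100, 6: 1}  # made-up profile

print(sorted(dma_candidates(cfg, entry=1, polluter=2, cbb=5)))  # [2, 3, 4]
print(pick_dma_node(cfg, 1, 2, 5, exec_count))                  # 3
```

In this toy graph the chosen node sits after the polluting loop and before the loop around cbb, so the transfer would run once instead of once per iteration.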

Page 14:

- no MEM: memory inclusion is not allowed

- w/VEC: internal local memory with vector access

- w/VEC + SCA: internal local memory with vector access and scalar access

Page 15:

A good Example of H/W, S/W Co-Design

Solves a Problem in the ISE Design Domain

Speedup

◦ Addresses the Memory Wall Problem with Internal Memory

◦ Bigger Granularity

Page 16:

Hidden Assumption

◦ No or very small Dependency between successive AFU Operations

DMA can hurt Performance

◦ Granularity: do all AFU Operations need bigger Granularity?

◦ The Compiler must know all the Details about DMA Latency, which requires Profiling and Recompilation

Is it feasible with a Reconfigurable ISE?
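The worry that DMA can hurt performance can be put into a back-of-the-envelope model: the cycles saved by the AFU must outweigh the cycles spent on transfers. The function below and all its numbers are made up for illustration; they are not measurements from the paper.

```python
def net_cycles_saved(executions, cycles_saved_per_execution,
                     dma_transfers, cycles_per_dma_transfer):
    """Cycles gained by an AFU with local memory, minus the DMA cost.

    A positive result means the AFU pays off; a negative one means the
    DMA overhead dominates. All parameters are hypothetical inputs that
    a compiler would have to obtain from profiling.
    """
    gain = executions * cycles_saved_per_execution
    cost = dma_transfers * cycles_per_dma_transfer
    return gain - cost


# Same AFU and same transfer, but different execution counts.
print(net_cycles_saved(1000, 5, 1, 2000))  # 3000
print(net_cycles_saved(100, 5, 1, 2000))   # -1500
```

The break-even point depends on the execution count and on the DMA latency, which is why the compiler would need profile data and a recompile to decide.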
