introduction to convolution circuits synthesis image processing, speech processing, dsp, polynomial...


Upload: jeffery-hodge

Post on 26-Dec-2015


Page 1: Introduction to Convolution circuits synthesis image processing, speech processing, DSP, polynomial multiplication in robot control. convolution

Introduction to Convolution Circuits Synthesis

• image processing,
• speech processing,
• DSP,
• polynomial multiplication in robot control.

convolution

FIR-filter like structure

• Separate input and output
• Input and output move synchronized
• Weights stay in space

The weights b4 b3 b2 b1 stay in the taps. The inputs a4, a3, a2, a1 (followed by 0) shift in one per clock, and each clock produces one new output:

Clock 1: input a4, delay registers 0 0 0, output a4*b4
Clock 2: input a3, delay registers a4 0 0, output a3*b4 + a4*b3
Clock 3: input a2, delay registers a3 a4 0, output a2*b4 + a3*b3 + a4*b2
Clock 4: input a1, delay registers a2 a3 a4, output a1*b4 + a2*b3 + a3*b2 + a4*b1
Clock 5: input 0, delay registers a1 a2 a3, output a1*b3 + a2*b2 + a3*b1
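The clock-by-clock snapshots of this FIR-like structure can be replayed in software. The following is a minimal Python sketch, assuming numeric samples; the function name `fir_snapshots` and the list layout are illustrative choices that do not come from the slides.

```python
def fir_snapshots(a, b):
    """Replay the FIR-like structure clock by clock.

    a = [a1, ..., an] are the inputs, b = [b1, ..., bk] the weights.
    Inputs are fed highest index first (an, ..., a1), followed by k-1
    zeros that flush the delay line (an assumption of this sketch).
    """
    k = len(b)
    tap_weights = b[::-1]            # taps hold bk ... b1, left to right
    delay = [0] * (k - 1)            # delay registers, initially zero
    stream = a[::-1] + [0] * (k - 1)
    outputs = []
    for x in stream:
        taps = [x] + delay           # current input plus the delayed inputs
        outputs.append(sum(w * v for w, v in zip(tap_weights, taps)))
        delay = taps[:-1]            # shift the delay line by one position
    return outputs


# a1..a4 = 1..4 and b1..b4 = 5..8: the first output is a4*b4 = 4*8 = 32
print(fir_snapshots([1, 2, 3, 4], [5, 6, 7, 8]))
# prints [32, 52, 61, 60, 34, 16, 5]
```

Read from last to first, the outputs are the coefficients of the product of the two polynomials, which is why the same structure serves for polynomial multiplication.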

We insert Dffs to avoid many levels of logic

The weights b4 b3 b2 b1 stay in place; the inputs arrive in the order a4, a3, a2, a1, 0. Values in the circuit after each input:

After a4: a4*b4, a4*b3, a4*b2, a4*b1
After a3: a4*b4, a4*b3 + a3*b4, a4*b2 + a3*b3, a4*b1 + a3*b2, a3*b1
After a2: a4*b4, a4*b3 + a3*b4, a4*b2 + a3*b3 + a2*b4, a4*b1 + a3*b2 + a2*b3, a3*b1 + a2*b2, a2*b1

The disadvantage of this circuit is broadcasting.
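The effect of the inserted Dffs can be checked with a small simulation. This is a sketch under assumptions: the current input is taken to be broadcast to all multipliers, and one register is placed after every adder (an arrangement often called the transposed form); the name `broadcast_fir` is hypothetical.

```python
def broadcast_fir(a, b):
    """Broadcast-input FIR with a Dff after every adder.

    a = [a1, ..., an], b = [b1, ..., bk]; inputs arrive as an, ..., a1,
    followed by k-1 zeros that drain the registers.
    """
    k = len(b)
    weights = b[::-1]                # taps hold bk ... b1, left to right
    regs = [0] * k                   # regs[0] drives the output
    stream = a[::-1] + [0] * (k - 1)
    outputs = []
    for x in stream:
        # Every tap sees the same broadcast x; each register loads its
        # right neighbour's old value plus the local product.
        regs = [
            (regs[j + 1] if j + 1 < k else 0) + weights[j] * x
            for j in range(k)
        ]
        outputs.append(regs[0])
    return outputs


print(broadcast_fir([1, 2, 3, 4], [5, 6, 7, 8]))
# prints [32, 52, 61, 60, 34, 16, 5]
```

Each clock now passes through only one multiplier and one adder, but the input wire has to reach every tap, which is the broadcasting drawback noted above.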

We insert more Dffs to avoid broadcasting

[Diagram: the same array with extra Dffs on the input path. Weights b4 b3 b2 b1; inputs a4, a3, a2, a1; the first values shown are a4*b4, a3*b4 and a4*b3.]

Does not work correctly like this, try something new…

[Diagram: the partial products ai*bj laid out along the array, with the "First sum" and the "Second sum" marked.]

FIR-filter like structure, assume two delays

[Animation, 14 frames: the array with weights b4 b3 b2 b1 and three adders. The data values of the frames were not preserved in the transcript.]

• Convolution Algorithm

Two loops

Patterns of operations on vectors:

1. Vector product (element-wise; this is not the dot product): [a0*b0, a1*b1, …, an*bn]
2. Dot product (scalar product) = vector product with accumulation: a0*b0 + a1*b1 + … + an*bn
3. Polynomial multiplication
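The three patterns differ only in how the products are combined. A minimal Python sketch, with illustrative function names that are not from the slides:

```python
def vector_product(a, b):
    """Pattern 1: element-wise products [a0*b0, a1*b1, ..., an*bn]."""
    return [x * y for x, y in zip(a, b)]


def dot_product(a, b):
    """Pattern 2: the vector product followed by accumulation."""
    return sum(vector_product(a, b))


def polynomial_multiply(a, b):
    """Pattern 3: two loops; product a[i]*b[j] is added to coefficient i+j."""
    c = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            c[i + j] += x * y
    return c


print(vector_product([1, 2, 3], [4, 5, 6]))   # prints [4, 10, 18]
print(dot_product([1, 2, 3], [4, 5, 6]))      # prints 32
print(polynomial_multiply([1, 2], [3, 4]))    # (1 + 2x)(3 + 4x): prints [3, 10, 8]
```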

Example 3: FIR Filter or Convolution

Example 3: Convolution

• There are many ways to implement convolution using systolic arrays; one of them is shown:
– u(n): the input sequence, entering from the left.
– w(n): the weights, preloaded in the n PEs.
– y(n): the result sequence, entering from the right (initial value: 0) and moving at the same speed as u(n).
• In this operation each cell's function is:
– 1. Multiply the input coming from the left by the weight, and pass the received input on to the next cell.
– 2. Add the product to the input coming from the right.

[Diagram: cells W0 W1 W2 W3; the input stream ui … u0 enters on one side, the result stream yi … y0 (initially 0) on the other.]

Cell Wi (inputs ain, bin; outputs aout, bout):
aout = ain
bout = bin + ain * wi

Convolution (cont)

• Each cell operation:
aout = ain
bout = bin + ain * wi
• Systolic array: the input sequence enters from the left.
• Weights stay in space.
• Inputs and outputs in each cell.

This is just one solution to this problem.

Convolution
• Can be 1D, 2D, 3D, etc.
• Is very important in many applications.
• Can be implemented efficiently in various architectures.
• Is an excellent example to compare various computer architectures:
– SIMD,
– MIMD,
– CA,
– pipelined,
– systolic.

Various Possible Implementations

Convolution is very important; we use it in several applications. So let us think about all the possible ways to implement it.

• Convolution Algorithm

Bag of Tricks that can be used

• Preload-repeated-value

• Replace-feedback-with-register

• Internalize-data-flow

• Broadcast-common-input

• Propagate-common-input

• Retime-to-eliminate-broadcasting

How to invent such circuits?

1. Let us learn from existing designs

2. Let us learn from our own mistakes

3. Let us check all possibilities of moving every piece of data

Bogus Attempt at Systolic FIR: internal loop in space

for i = 1 to n in parallel
    for j = 1 to k in place
        y_i += w_j * x_{i+j-1}

Stage 1: directly from the equation; the inner loop is realized in place.

Stage 2: the feedback (y_i is computed from y_i) comes from the sequential implementation; replace it with a register.

Stage 3: [diagram not preserved in the transcript]
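Before mapping the loop nest onto hardware, it helps to have a plain sequential version to compare against. This is a minimal Python sketch with 0-based indices; the name `fir_loop_nest` is hypothetical, and the outer loop is restricted to the positions where all k inputs exist, which is an assumption of the sketch.

```python
def fir_loop_nest(w, x):
    """Sequential reference: y[i] = sum over j of w[j] * x[i + j]."""
    k, n = len(w), len(x)
    y = [0] * (n - k + 1)
    for i in range(n - k + 1):      # outer loop: "in parallel" in hardware
        for j in range(k):          # inner loop: accumulates into y[i]
            y[i] += w[j] * x[i + j]
    return y


print(fir_loop_nest([1, 2, 3], [1, 2, 3, 4, 5]))
# prints [14, 20, 26]
```

Every hardware variant of the loop nest should reproduce these values, only in a different order in time and space.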

Bogus Attempt continued: Outer Loop

for i = 1 to n in parallel
    for j = 1 to k in place
        y_i += w_j * x_{i+j-1}

Factorize w_j. This could work, but it has broadcast.

Bogus Attempt continued: Outer Loop - 2

for i = 1 to n in parallel
    for j = 1 to k in place
        y_i += w_j * x_{i+j-1}

Because we do not want to have broadcast, we retime the signal w; this also requires retiming of x_j.

Bogus Attempt continued: Outer Loop - 2a

for i = 1 to n in parallel
    for j = 1 to k in place
        y_i += w_j * x_{i+j-1}

• Another possibility of retiming

Bogus Attempt continued: Outer Loop - 3

for i = 1 to n in parallel
    for j = 1 to k in place
        y_i += w_j * x_{i+j-1}

• Yet another approach is to broadcast the common input x_{i-1}

What have we achieved?

• We showed several possible beginnings of creating architectures.

• They were not successful, but they show the principles.

• We will continue to create architectures. These attempts were not a complete waste of time: they can be used successfully in similar problems.

• You have to experiment with ideas!

Attempt at Systolic FIR: now the internal loop is in parallel

[Diagrams: the internal loop realized in parallel, shown in three steps labeled 1, 2, 3.]

Outer Loop continuation for FIR filter

Continue: Optimize Outer Loop, Preload-repeated Value

Based on the previous slide, we can preload the weights Wi.

Continue: Optimize Outer Loop, Broadcast Common Value

This design has broadcast.

Some purists say that this is not systolic, because a systolic design should have only short wires.

Continue: Optimize Outer Loop, Retime to Eliminate Broadcast

We delay the signals yi.

The design is no longer intuitive. Therefore, we have to explain in detail how it works.

[Diagram: the inputs x1, x2 enter the cells and the outputs leave them; the first value shown is y1 = x1w1.]

Was it a good idea to combine input and output streams to cells?

Polynomial Multiplication of 1-D convolution problem

Types of systolic structure

• Convolution problem

weights: {w_1, w_2, ..., w_k}

inputs: {x_1, x_2, ..., x_n}

results: {y_1, y_2, ..., y_{n+1-k}}

y_i = w_1 x_i + w_2 x_{i+1} + ... + w_k x_{i+k-1}

(combining two data streams)

H. T. Kung's grouping work

assume k = 3

A well-known family of systolic designs for convolution computation

• Given the sequence of weights {w_1, w_2, ..., w_k}
• and the input sequence {x_1, x_2, ..., x_n},
• compute the result sequence {y_1, y_2, ..., y_{n+1-k}}
• defined by

y_i = w_1 x_i + w_2 x_{i+1} + ... + w_k x_{i+k-1}

Design B1

- Broadcast inputs, move results systolically, weights stay
- Semi-systolic convolution arrays with global data communication

• Previously proposed for circuits to implement a pattern matching processor and for circuits to implement polynomial multiplication.

Types of systolic structure: design B1

• Wider systolic path (the partial results yi move)

[Diagram: the inputs x3 x2 x1 are broadcast to cells holding W1 W2 W3; the results y3 y2 y1 move through the cells and out.]

Cell (inputs yin, xin; output yout; stored weight W):
yout = yin + W * xin

Please analyze this circuit by drawing snapshots of the data at subsequent moments of time, like frames of an animated movie.

Discuss the disadvantages of broadcast.
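One way to produce the requested snapshots is a small register-level simulation of design B1. This is a sketch under assumptions: the cells hold w_1 ... w_k in the order in which the results travel, a zero enters the first cell on every clock, and the first k-1 values leaving the array are discarded as incomplete. The name `design_b1` is hypothetical.

```python
def design_b1(weights, xs):
    """Broadcast inputs, moving results, stationary weights."""
    k = len(weights)
    y_regs = [0] * k                 # y_regs[j] is the output register of cell j
    results = []
    for t, x in enumerate(xs):
        # All cells fire at once: yout = yin + W * xin, with x broadcast.
        y_regs = [
            (y_regs[j - 1] if j > 0 else 0) + weights[j] * x
            for j in range(k)
        ]
        if t >= k - 1:               # earlier values lack some of their terms
            results.append(y_regs[-1])
    return results


print(design_b1([1, 2, 3], [1, 2, 3, 4, 5]))
# prints [14, 20, 26]
```

Printing `y_regs` inside the loop gives exactly the frame-by-frame picture of the partial results moving to the right.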

Types of systolic structure: design B2

Inputs broadcast

Weights move

Results stay

• The wi circulate.

• Uses multiplier-accumulator hardware.

• Each wi has a tag bit (it signals the accumulator to output results).

• Needs a separate bus (or other global network) for collecting output.

[Diagram: the inputs x3 x2 x1 are broadcast to cells holding y1 y2 y3, while the weights circulate through the cells (shown in the order W2 W3 W1).]

Cell (inputs Win, xin; output Wout; stored result y):
y = y + Win * xin
Wout = Win
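The circulating weights of design B2 can be simulated with a ring of k cells. This is a sketch under assumptions that the slides do not state: the first weight resets a cell's accumulator, the last weight plays the role of the tag bit that releases the result, and each cell is reused for a new result once its weights have gone around. The name `design_b2` is hypothetical.

```python
def design_b2(weights, xs):
    """Broadcast inputs, circulating weights, stationary results."""
    k = len(weights)
    ring = [None] * k                # index of the weight now in each cell
    acc = [0] * k                    # one accumulator per cell
    results = []
    for t, x in enumerate(xs):
        # Weights move one cell per clock; the first k clocks load the ring,
        # afterwards the weight leaving the last cell re-enters the first.
        incoming = t if t < k else ring[-1]
        ring = [incoming] + ring[:-1]
        for c in range(k):
            j = ring[c]
            if j is None:
                continue             # no weight has reached this cell yet
            if j == 0:
                acc[c] = 0           # first weight: start a new result
            acc[c] += weights[j] * x # y = y + Win * xin, x is broadcast
            if j == k - 1:           # tagged weight: the result is complete
                results.append(acc[c])
    return results


print(design_b2([1, 2, 3], [1, 2, 3, 4, 5]))
# prints [14, 20, 26]
```

The `results.append` call stands in for the separate bus: at most one cell finishes per clock, but any cell may be the one that does.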

Design B2

Broadcast input, move weights, results stay [(semi-)systolic convolution arrays with global data communication]

• A path that moves the yi's (as in design B1) must be wider than one that moves the wi's, because for numerical accuracy the yi's carry more bits than the wi's.

• The use of multiplier-accumulators may also help increase the precision of the result, since extra bits can be kept in these accumulators at modest cost.

Semi-systolic because of broadcast.

Types of systolic structure: design F

Inputs move

Weights stay

Partial results fan in

• Needs an adder.

• Applications: signal processing, pattern matching.

[Diagram: the inputs x3 x2 x1 move through cells holding W3 W2 W1; the Zout of every cell goes to a single ADDER, which produces the yi's.]

Cell (input xin; outputs xout, Zout; stored weight W):
Zout = W * xin
xout = xin
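Design F can be sketched in a few lines, since each cell only multiplies and forwards. The following Python sketch assumes that the first cell reached by the inputs holds w_k and the last holds w_1, as the order W3 W2 W1 in the diagram suggests; the name `design_f` is hypothetical.

```python
def design_f(weights, xs):
    """Moving inputs, stationary weights, products fanned in to one adder."""
    k = len(weights)
    x_regs = [None] * k              # x_regs[0] is the first cell the inputs reach
    results = []
    for x in xs:
        x_regs = [x] + x_regs[:-1]   # xout = xin: inputs shift by one cell
        if x_regs[-1] is not None:   # the array is full: every cell has an input
            # Cell p holds w_(k-p) and emits Zout = W * xin; the adder sums all Zout.
            results.append(
                sum(weights[k - 1 - p] * x_regs[p] for p in range(k))
            )
    return results


print(design_f([1, 2, 3], [1, 2, 3, 4, 5]))
# prints [14, 20, 26]
```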

Design F

- Fan-in results, move inputs, weights stay

- Semi-systolic convolution arrays with global data communication

• When the number of cells is large, the adder can be implemented as a pipelined adder tree to avoid a large delay.

• Designs of this type use unbounded fan-in.
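The adder tree mentioned above reduces the fan-in to two inputs per adder at the price of roughly log2(k) pipeline levels. A minimal Python sketch that sums a non-empty list pairwise and reports the number of levels; the name `adder_tree` is hypothetical, and the pipeline registers between levels are not modeled.

```python
def adder_tree(values):
    """Sum a non-empty list with two-input adders; return (sum, levels)."""
    level = list(values)
    depth = 0
    while len(level) > 1:
        # One tree level: add neighbouring pairs.
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:           # an odd element passes through unchanged
            nxt.append(level[-1])
        level = nxt
        depth += 1
    return level[0], depth


print(adder_tree([1, 2, 3, 4, 5]))
# prints (15, 3)
```

In a pipelined implementation each level would end in a register, so the array still accepts one new set of products per clock while a sum takes `depth` clocks to emerge.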

Types of systolic structure: design R1

Inputs and weights move in opposite directions

Results stay

• Can use a tag bit.

• No bus (a systolic output path is sufficient).

• One-half of the cells work at any time.

• Applications: pattern matching.

[Diagram: the inputs x1, x2, x3 and the weights W1, W2 move in opposite directions through cells holding y3 y2 y1.]

Cell (inputs Win, xin; outputs Wout, xout; stored result y):
y = y + Win * xin
xout = xin
Wout = Win

Design R1

- Results stay, inputs and weights move in opposite directions
- Pure-systolic convolution arrays without global data communication

• Design R1 has the advantage that it does not require a bus, or any other global network, for collecting output from cells.

• The basic idea of this design has been used to implement a pattern matching chip.

Types of systolic structure: design R2

Inputs and weights move in the same direction at different speeds

Results stay

• The xj's move twice as fast as the wj's.

• All cells work at any time.

• Needs additional registers (to hold a w value).

• Applications: pipeline multiplier.

[Diagram: the weight stream W1 … W5 and the inputs x3 x2 x1 enter from the same side; the cells hold y1 y2 y3, and each cell has an extra register W.]

Cell (inputs Win, xin; outputs Wout, xout; stored values W, y):
y = y + Win * xin
W = Win
Wout = W
xout = xin

Design R2

- Results stay, inputs and weights move in the same direction but at different speeds
- Pure-systolic convolution arrays without global data communication

• A multiplier-accumulator can be used effectively, and so can the tag bit method to signal the output of each cell.

• Compared with R1, all cells work all the time, at the cost of an additional register in each cell to hold a w value.

Types of systolic structure: design W1

Inputs and results move in opposite directions

Weights stay

• One-half of the cells work at any time.

• Constant response time.

• Applications: polynomial division.

[Diagram: the inputs x1, x2, x3 and the results y move in opposite directions through cells holding W1 W2 W3.]

Cell (inputs yin, xin; outputs yout, xout; stored weight W):
yout = yin + W * xin
xout = xin
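The counter-flowing streams of design W1 are easiest to understand from a simulation. The sketch below makes several assumptions that the slides do not spell out: the inputs enter on the left on every second clock, zero-initialized results enter on the right on every second clock, and the leftmost cell holds w_k while the rightmost holds w_1. The name `design_w1` is hypothetical.

```python
def design_w1(weights, xs):
    """Stationary weights; inputs move right, results move left."""
    k, n = len(weights), len(xs)
    cell_w = weights[::-1]           # assumed layout: cell 0 holds w_k
    x_reg = [None] * k               # input present in each cell (None = gap)
    y_reg = [None] * k               # partial result present in each cell
    results = []
    for t in range(2 * n - 1):
        # x stream: a new sample enters on every second clock.
        x_in = xs[t // 2] if t % 2 == 0 else None
        x_reg = [x_in] + x_reg[:-1]
        # y stream: a fresh zero enters at the right on matching clocks.
        y_in = 0 if t >= k - 1 and (t - (k - 1)) % 2 == 0 else None
        y_reg = y_reg[1:] + [y_in]
        for c in range(k):
            if x_reg[c] is not None and y_reg[c] is not None:
                y_reg[c] += cell_w[c] * x_reg[c]   # yout = yin + W * xin
        if y_reg[0] is not None:
            results.append(y_reg[0]) # a finished result leaves on the left
    return results


print(design_w1([1, 2, 3], [1, 2, 3, 4, 5]))
# prints [14, 20, 26]
```

In this sketch each result leaves the array on the same clock on which its last input enters, which matches the constant response time listed above, and on any clock only the cells of one parity hold both an x and a y, which is the one-half utilization.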

Design W1

- Weights stay, inputs and results move in opposite directions
- Pure-systolic convolution arrays without global data communication

• This design is fundamental in the sense that it can be naturally extended to perform recursive filtering.

• This design suffers from the same drawback as R1: only approximately one-half of the cells work at any given time, unless two independent computations are interleaved in the same array.

Overlapping the executions of multiply-and-add in design W1

Types of systolic structure: design W2

Inputs and results move in the same direction at different speeds

Weights stay

• All cells work (high throughput rather than fast response).

[Diagram: the inputs x1 … x7 and the results y1 y2 y3 move in the same direction through cells holding W1 W2 W3; each cell has an extra register x.]

Cell (inputs xin, yin; outputs xout, yout; stored values W, x):
yout = yin + W * xin
x = xin
xout = x
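The two speeds in design W2 come from the extra register x in each cell: an input needs two clocks to cross a cell, a result only one. The sketch below simulates this under assumptions not given in the slides: the first cell holds w_k and the last holds w_1, and a zero-initialized result enters together with every input from the k-th input onward. The name `design_w2` is hypothetical.

```python
def design_w2(weights, xs):
    """Stationary weights; inputs and results move right at different speeds."""
    k, n = len(weights), len(xs)
    cell_w = weights[::-1]           # assumed layout: cell 0 holds w_k
    x_hold = [None] * k              # the extra register "x" of each cell
    x_out = [None] * k               # xout register of each cell
    y_out = [None] * k               # yout register of each cell
    results = []
    for t in range(n + k - 1):
        x_feed = xs[t] if t < n else None
        y_feed = 0 if k - 1 <= t < n else None
        new_hold, new_x_out, new_y_out = [], [], []
        for c in range(k):
            x_in = x_feed if c == 0 else x_out[c - 1]
            y_in = y_feed if c == 0 else y_out[c - 1]
            if x_in is not None and y_in is not None:
                new_y_out.append(y_in + cell_w[c] * x_in)  # yout = yin + W * xin
            else:
                new_y_out.append(None)
            new_hold.append(x_in)        # x = xin
            new_x_out.append(x_hold[c])  # xout = x (the old value of x)
        x_hold, x_out, y_out = new_hold, new_x_out, new_y_out
        if y_out[-1] is not None:
            results.append(y_out[-1])    # a finished result leaves on the right
    return results


print(design_w2([1, 2, 3], [1, 2, 3, 4, 5]))
# prints [14, 20, 26]
```

Here every cell does useful work on every clock once the array is full, but a result appears k-1 clocks after its last input has entered, so the response time grows with the number of cells.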

Design W2

- Weights stay, inputs and results move in the same direction but at different speeds
- Pure-systolic convolution arrays without global data communication

• This design loses one advantage of W1, the constant response time.

• This design has been extended to implement 2-D convolution, where high throughput rather than fast response is of concern.

Remarks on Linear Arrays

• The designs above are all possible systolic designs for the convolution problem (some are semi-systolic).

• Using a systolic control path, weights can be selected on the fly to implement interpolation or adaptive filtering.

• We need to understand precisely the strengths and drawbacks of each design so that an appropriate design can be selected for a given environment.

• To improve throughput, it may be worthwhile to implement the multiplier and the adder separately, so that their execution can overlap (see the overlapped multiply-and-add version of design W1).

• When chip pins are considered:
– pure-systolic requires four I/O ports;
– semi-systolic requires three I/O ports.

Retiming of filters

FIR circuit: initial design

[Diagram labels: delays; pipelining of xi.]

• We insert various numbers of unified delays

FIR circuit: registers added below weight multipliers

[Diagram labels: "Notice changed timing here"; "We insert delays here".]

FIR Summary: comparison of sequential and systolic

Conclusions on 1D and 1.5D Systolic Arrays

Systolic arrays are more than processor arrays which execute systolic algorithms.

– A systolic cell takes on one of the following forms:

1. A special purpose cell with hardwired functions,

2. A vector-computer-like cell with instruction decoding and a processing element,

3. A systolic processor complete with a control unit and a processing unit.

Smarter processor for SAT, Petrick, etc.

Large Systolic Arrays as general purpose computers

• Originally, systolic architectures were motivated by high-performance special-purpose computational systems that meet the constraints of VLSI.

• However, it is possible to design systolic systems which:
– have high throughput,
– yet are not constrained to a single VLSI chip.

Problems with systolic array design

1. Hard to design - hard to understand

low level realization may be hard to realize

2. Hard to explain

remote from the algorithm

function can’t readily be deduced from the structure

3. Hard to verify

Key architectural issues in designing special-purpose systems

• Simple and regular design

A simple, regular design yields cost-effective special systems.

• Concurrency and communication

Design the algorithm to support high concurrency and, at the same time, to employ only simple blocks.

• Balancing computation with I/O

A special-purpose system should be a match to a variety of I/O bandwidths.

Two Dimensional Systolic Arrays

• In 1978, the first systolic arrays were introduced as a feasible design for special purpose devices which meet the VLSI constraints.

• These special purpose devices were able to perform four types of matrix operations at high processing speeds:

– matrix-vector multiplication,

– matrix-matrix multiplication,

– LU-decomposition of a matrix,

– Solution of triangular linear systems.

General Systolic Organization

[Diagram: an array of 20 identical cells, each labeled "Systolic Element".]

Example 2: Matrix-Matrix Multiplication

All previously shown tricks can be applied.

Sources

• Seth Copen Goldstein, CMU
• A.R. Hurson
• David E. Culler, UC Berkeley
• [email protected]
• Syeda Mohsina Afroze and other students of Advanced Logic Synthesis, ECE 572, 1999 and 2000.