Factoring and Eliminating Common Subexpressions in Polynomial Expressions

International Conference on Computer Aided Design (ICCAD), 2004

Farzan Fallah (Advanced CAD Research, Fujitsu Labs. of America), Anup Hosangadi and Ryan Kastner (ECE Department, UCSB)

Post on 21-Dec-2015


TRANSCRIPT

Page 1: Factoring and Eliminating Common Subexpressions in Polynomial Expressions International Conference on Computer Aided Design (ICCAD), 2004 Farzan Fallah

Factoring and Eliminating Common Subexpressions in Polynomial Expressions

International Conference on Computer Aided Design (ICCAD), 2004

Farzan Fallah
Advanced CAD Research
Fujitsu Labs. of America

Anup Hosangadi
Ryan Kastner
ECE Department, UCSB

Page 2

Outline

Introduction
Related Work
Algebraic techniques for redundancy elimination
Experimental results
Conclusions

Page 3

Introduction

Embedded system applications need to compute polynomial expressions
– Continuous functions can be approximated by polynomials to a desired degree of accuracy.
– Adaptive signal processing (polynomial filters)
– Polynomial interpolation/extrapolation in computer graphics
– Encryption

$\sin(x) \approx x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!}$

Page 4

Introduction

Multiplications are expensive in embedded systems
No good optimization tool exists for reducing the complexity of polynomials
– Designers rely on hand-optimized libraries
Conventional optimization techniques
– CSE, value numbering: not suited for polynomials
– Horner form: most popular representation
  $a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0 = (\dots((a_n x + a_{n-1})x + a_{n-2})x + \dots + a_1)x + a_0$
– Not good for multivariate polynomials
– Only a single polynomial expression at a time
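The Horner form above can be sketched in a few lines of Python. This is only an illustration of the nesting, not code from the presentation; the function names `horner` and `naive` and the sample polynomial are made up for the example.

```python
def horner(coeffs, x):
    """Evaluate a_n*x^n + ... + a_1*x + a_0 using n multiplications.

    coeffs lists the coefficients from a_n down to a_0.
    """
    result = coeffs[0]
    for a in coeffs[1:]:
        result = result * x + a  # one multiplication and one addition per step
    return result


def naive(coeffs, x):
    """Reference evaluation that recomputes every power of x."""
    n = len(coeffs) - 1
    return sum(a * x ** (n - i) for i, a in enumerate(coeffs))


# 2x^3 - 6x^2 + 2x - 1 at x = 3
print(horner([2, -6, 2, -1], 3))  # 5
```

The nesting works on one variable and one polynomial at a time, which is the limitation the last two bullets point at.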

Page 5

Introduction

Quartic-spline polynomial (3-D graphics)
$P = zu^4 + 4avu^3 + 6bu^2v^2 + 4uv^3w + qv^4$

Horner form (from Maple™)
$P = zu^4 + (4au^3 + (6bu^2 + (4uw + qv)v)v)v$
(17 multiplications)

Proposed algebraic method: $d_1 = v^2$; $d_2 = 4v$
$P = u^3(uz + ad_2) + d_1(qd_1 + u(wd_2 + 6bu))$
(11 multiplications)

Page 6

Related Work

Expression factorization (M. A. Breuer, JACM '69)
– Allows only one kind of operator at a time
Symbolic algebra techniques (A. Peymandoust, De Micheli, DAC '01)
– Used for mapping DSP datapaths (polynomials) to library elements
– Results depend upon exponential library search, e.g. $a^2 - b^2 = (a+b)(a-b)$ iff $(a+b)$ or $(a-b)$ is in the library
– Manipulates only one expression at a time.

F1 = A + B + C + D;
F2 = A + P + D;
=> Extract (A + D)

Page 7

Motivating Example

Consider the set of expressions
$P_1 = x^3y + x^2y^2z$
$P_2 = 4x + 4yz - xyz$
$P_3 = 4xy - x^2y$
– Naïve implementation: 16 multiplications, 4 additions/subtractions

Using CSE
$P_1 = d_1d_2 + d_1y^2z$, $d_1 = x^2$
$P_2 = d_3 + 4yz - d_2z$, $d_2 = xy$
$P_3 = d_3y - d_1y$, $d_3 = 4x$
– 12 multiplications, 4 additions/subtractions

Page 8

Motivating Example

Using our algebraic techniques
$P_1 = xd_1d_3$, $d_1 = x + yz$
$P_2 = 4d_1 - d_3z$, $d_2 = 4 - x$
$P_3 = d_2d_3$, $d_3 = xy$
– Total 7 multiplications, 3 additions/subtractions
– Savings of 5 multiplications, 1 addition/subtraction compared to CSE

Impossible to obtain such results using conventional techniques
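A quick numeric check of the rewritten expressions, as a sketch. It assumes the three polynomials are $P_1 = x^3y + x^2y^2z$, $P_2 = 4x + 4yz - xyz$, $P_3 = 4xy - x^2y$ and that the optimized form uses $d_1 = x + yz$, $d_2 = 4 - x$, $d_3 = xy$. The function names are illustrative only.

```python
import itertools


def original(x, y, z):
    p1 = x**3 * y + x**2 * y**2 * z
    p2 = 4 * x + 4 * y * z - x * y * z
    p3 = 4 * x * y - x**2 * y
    return p1, p2, p3


def optimized(x, y, z):
    d1 = x + y * z         # 1 multiplication, 1 addition
    d2 = 4 - x             # 1 subtraction
    d3 = x * y             # 1 multiplication
    p1 = x * d3 * d1       # 2 multiplications
    p2 = 4 * d1 - d3 * z   # 2 multiplications, 1 subtraction
    p3 = d3 * d2           # 1 multiplication
    return p1, p2, p3


# Compare both forms on a small grid of integer points.
grid = itertools.product(range(-3, 4), repeat=3)
print(all(original(*p) == optimized(*p) for p in grid))  # True
```

The per-line comments add up to 7 multiplications and 3 additions/subtractions, the totals quoted on the slide.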

Page 9

Introduction to algebraic techniques for redundancy elimination

Algebraic techniques in multi-level logic synthesis (MLLS)
– Decomposition, factoring: reduce number of literals
– Distill and Condense use rectangle covering methods.
Polynomial expressions (our technique)
– Factoring, single-term common subexpressions: reduce number of multiplications
– Multiple-term common subexpressions: reduce number of additions and possibly multiplications
Key differences (generalization to handle higher orders)
– Kernelling techniques
– Finding single cube intersections

Page 10

Introduction to our technique (Outline)

Find a subset of all possible subexpressions (kernel generation)
Transformation of polynomial expressions
– Problem formulation
Extract multiple-term common subexpressions and factors
Extract single-term common factors

Page 11

Introduction to our technique

Terminology
– Literal: a variable or a constant, e.g. a, b, 2, 3.14
– Cube: product of literals, e.g. $+3a^2b$, $-2a^3b^2c$
– SOP: sum of cubes, e.g. $+3a^2b - 2a^3b^2c$
– Cube-free expression: no literal or cube can divide all the cubes of the expression
– Kernel: a cube-free sub-expression of an expression, e.g. $3 - 2abc$
– Co-kernel: a cube that is used to divide an expression to get a kernel, e.g. $a^2b$

Page 12

Introduction to our technique

Matrix representation of arithmetic expressions
– $F = x^3y - xy^2z$ is represented by

+/-  x  y  z
+    3  1  0
-    1  2  1

– Each row represents a product term
– Each column represents a variable/constant
– Each element (i,j) represents the power of variable j in term i
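One possible encoding of this matrix in Python is shown below as a sketch. The layout (a sign plus a tuple of exponents per row) and the helper names are assumptions made for illustration; constants, which the slides also treat as columns, are left out to keep it short.

```python
VARS = ("x", "y", "z")

# F = x^3*y - x*y^2*z, one (sign, exponents) row per product term.
F = [
    (+1, (3, 1, 0)),
    (-1, (1, 2, 1)),
]


def evaluate(poly, values):
    """Evaluate the polynomial at the given values, ordered as in VARS."""
    total = 0
    for sign, exps in poly:
        term = sign
        for value, power in zip(values, exps):
            term *= value ** power
        total += term
    return total


def count_multiplications(poly):
    """Multiplications for a term-by-term evaluation: a term with k literal
    factors (counting repeated ones) needs k - 1 multiplications."""
    return sum(sum(exps) - 1 for _, exps in poly if sum(exps) > 0)


print(evaluate(F, (2, 3, 5)))        # 8*3 - 2*9*5 = -66
print(count_multiplications(F))      # 3 + 3 = 6
```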

Page 13

Generation of Kernels (example)

$P_1 = x^3y + x^2y^2z$, {L} = {x, y, z}
– Divide by x: $F_t = P_1/x = x^2y + xy^2z$

P1:
x  y  z
3  1  0
2  2  1

Ft = P1/x:
x  y  z
2  1  0
1  2  1

Page 14

Generation of Kernels (example)

$F_t = P_1/x = x^2y + xy^2z$
C = biggest cube dividing all cubes of $F_t$

Ft:
x  y  z
2  1  0
1  2  1

C:
1  1  0

Ft / C:
x  y  z
1  0  0
0  1  1

Page 15

Generation of Kernels (example)

Obtain kernel: $F_1 = F_t/C = (x^2y + xy^2z)/(xy) = (x + yz)$
Obtain co-kernel: $D_1 = x \cdot (xy) = x^2y$
– No kernels within $F_1$. Go back to $P_1$
$P_1 = x^3y + x^2y^2z$
– Divide now by the next variable y: $F_t = x^3 + x^2yz$
– $C = x^2$
– But $x \in C$, and x comes before y in the literal order
Stop here, to avoid repeating the same kernel $F_t/C = (x + yz)$
– No more kernels extracted
– Record $P_1$ itself as a kernel with co-kernel '1'

Page 16

Concept of kernels and co-kernels

Theorem: Two expressions f and g can have a multiple-term common subexpression iff there are two kernels $K_f$ and $K_g$ having a multiple-term intersection
Detection of multiple-term common subexpressions by intersection of sets of kernels.
Each co-kernel : kernel pair represents a possible factorization
– e.g. $x^3y + x^2y^2z = [x^2y](x + yz)$
The set of kernels is a subset of all possible subexpressions

Page 17

All Kernels and Co-Kernels

$P_1 = x^3y + x^2y^2z$ (terms 1, 2)
$P_2 = 4x + 4yz - xyz$ (terms 3, 4, 5)
$P_3 = 4xy - x^2y$ (terms 6, 7)

Kernels with their co-kernels, written [kernel](co-kernel):
$P_1$: $[x + yz](x^2y)$, $[x^3y + x^2y^2z](1)$
$P_2$: $[4 - x](yz)$, $[4 - yz](x)$, $[x + yz](4)$, $[4x + 4yz - xyz](1)$
$P_3$: $[4 - x](xy)$, $[4xy - x^2y](1)$

Which kernels to choose?

Page 18

Kernel Cube Matrix (KCM)

One row for each kernel generated
One column for each distinct kernel cube
Each non-zero element represents a term

Rows are co-kernels, columns are kernel cubes; the number in parentheses is the term the entry covers.

Co-kernel   x      yz     4      -yz    -x
4           1(3)   1(4)   0      0      0
x^2y        1(1)   1(2)   0      0      0
x           0      0      1(3)   1(5)   0
xy          0      0      1(6)   0      1(7)
yz          0      0      1(4)   0      1(5)

For example, the entry 1(1) in row x^2y, column x stands for the term $x^3y$.

Page 19

Finding Kernel Intersections (Distill Algorithm)

Each kernel intersection or factor appears as a rectangle
– Rectangle: set of rows and columns such that all elements are '1'
Value of a rectangle = weighted sum of the number of operations saved
Goal: maximum-valued rectangular covering of the KCM
Greedy heuristic: covering by prime rectangles
– Prime rectangle: rectangle not covered by any other rectangle

Page 20

Finding Kernel Intersections (Distill Algorithm)

Formula for the value of a rectangle
R = number of rows; C = number of columns
$M(R_i)$ = # of multiplications in row (co-kernel) i
$M(C_i)$ = # of multiplications in column (kernel cube) i
m = ratio of weights of multiplication to addition

$\text{Value} = m \cdot \left\{ (C - 1)\left(R + \sum_{R} M(R_i)\right) + (R - 1)\sum_{C} M(C_i) \right\} + (R - 1)(C - 1)$

The formula calculates the savings in operation count
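The value formula can be tried out with a short sketch. It reads the formula as m times the multiplications saved, (C - 1)(R + sum of M(R_i)) + (R - 1)(sum of M(C_i)), plus the additions saved, (R - 1)(C - 1). The function names and the choice to return the two counts separately are illustrative assumptions.

```python
def rectangle_savings(row_mults, col_mults):
    """Return (multiplications saved, additions saved) for one KCM rectangle.

    row_mults[i] is M(R_i), the multiplications inside co-kernel i;
    col_mults[i] is M(C_i), the multiplications inside kernel cube i.
    """
    R, C = len(row_mults), len(col_mults)
    mults = (C - 1) * (R + sum(row_mults)) + (R - 1) * sum(col_mults)
    adds = (R - 1) * (C - 1)
    return mults, adds


def rectangle_value(row_mults, col_mults, m=1):
    """Weighted value, with m the weight of a multiplication relative to an addition."""
    mults, adds = rectangle_savings(row_mults, col_mults)
    return m * mults + adds


# Rows {4, x^2y}, columns {x, yz}: x^2y contains 2 multiplications, yz contains 1.
print(rectangle_savings([0, 2], [0, 1]))  # (5, 1)
# Row {xy}, columns {4, -x}: xy contains 1 multiplication.
print(rectangle_savings([1], [0, 0]))     # (2, 0)
```

Both results agree with the savings quoted for the two rectangles in the Distill example (5 multiplications and 1 addition, then 2 multiplications).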

Page 21

Distill Algorithm

Co-kernel   x      yz     4      -yz    -x
4           1(3)   1(4)   0      0      0
x^2y        1(1)   1(2)   0      0      0
x           0      0      1(3)   1(5)   0
xy          0      0      1(6)   0      1(7)
yz          0      0      1(4)   0      1(5)

$d_1 = (x + yz)$
$4x + 4yz = 4d_1$
$x^3y + x^2y^2z = x^2y\,d_1$
Saves 5 multiplications and 1 addition

Page 22

Distill Algorithm

Co-kernel   x      yz     4      -yz    -x
4           1(3)   1(4)   0      0      0
x^2y        1(1)   1(2)   0      0      0
x           0      0      1(3)   1(5)   0
xy          0      0      1(6)   0      1(7)
yz          0      0      1(4)   0      1(5)

Remove covered terms
$4xy - x^2y = xy\,d_2$
$d_2 = 4 - x$
Saves 2 multiplications

Page 23

Distill Algorithm

The Distill algorithm exits when no more kernel intersections can be found
$P_1 = x^2y\,d_1$, $d_1 = x + yz$
$P_2 = 4d_1 - xyz$, $d_2 = 4 - x$
$P_3 = xy\,d_2$
Can further optimize by finding single cube intersections

Page 24

Finding single cube intersections (Condense Algorithm)

Need an algorithm for finding single-term common subexpressions
Consider two single-term expressions
– $F_1 = a^4b^3c$
– $F_2 = a^2b^4c^2$
Form the Cube Variable Incidence Matrix (CIM)

a  b  c
4  3  1
2  4  2

One row for each product term.
One column for each variable

Page 25

Finding single cube intersections (Condense algorithm)

Each (single-term) common subexpression appears as a rectangle.
– Rectangle: set of rows and columns where all elements are non-zero
Value of a rectangle is the number of multiplications saved by selecting it
– C = cube corresponding to the rectangle
– $\text{Value} = \text{Rows} \cdot \left(\left(\sum_i C[i]\right) - 1\right)$
Maximum-valued rectangular covering will give the minimum number of multiplications
Use greedy iterative covering by prime rectangles
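A small sketch of the rectangle value for the CIM follows. It takes the cube of a rectangle to be the column-wise minimum exponent over the chosen rows, which is an assumption, as are the function names. One reading of the formula is that it counts the multiplications removed from the rows; computing the extracted cube once costs (sum of C[i]) - 1 multiplications, so the sketch also reports that net figure.

```python
def common_cube(rows):
    """Largest cube shared by all rows: column-wise minimum exponent."""
    return tuple(min(column) for column in zip(*rows))


def rectangle_value(rows):
    """Value = Rows * ((sum of C[i]) - 1), as the formula is stated."""
    cube = common_cube(rows)
    return len(rows) * (sum(cube) - 1)


def net_saving(rows):
    """Value minus the multiplications spent computing the cube once."""
    cube = common_cube(rows)
    return (len(rows) - 1) * (sum(cube) - 1)


# F1 = a^4 b^3 c and F2 = a^2 b^4 c^2 over the columns (a, b, c).
cim = [(4, 3, 1), (2, 4, 2)]
print(common_cube(cim), rectangle_value(cim), net_saving(cim))  # (2, 3, 1) 10 5

# Three terms sharing xy over the columns (x, y, z): x^2y, xyz, xy.
shared_xy = [(2, 1, 0), (1, 1, 1), (1, 1, 0)]
print(common_cube(shared_xy), net_saving(shared_xy))            # (1, 1, 0) 2
```

The net figure of 2 for the second case is consistent with the "save 2 multiplications by extracting xy" note that accompanies the cube literal matrix of the running example.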

Page 26

Finding single cube intersections (Condense algorithm)

a  b  c
4  3  1
2  4  2
Common cube: 2  3  1
$d_1 = a^2b^3c$

a  b  c  d1
2  0  0  1
0  1  1  1
2  3  1  0
Common cube: 0  1  1  0
$d_2 = bc$

Page 27

Finding single cube intersections (Condense algorithm)

a  b  c  d1  d2
2  0  0  1   0
0  0  0  1   1
2  2  0  0   1
0  1  1  0   0
Common cube: 2  0  0  0  0
$d_3 = a^2$

Page 28

Finding single cube intersections (Condense algorithm)

Final CIM

a  b  c  d1  d2  d3
0  0  0  1   0   1
0  0  0  1   1   0
0  2  0  0   1   1
0  1  1  0   0   0
2  0  0  0   0   0

Final implementation (7 multiplications)
d3 = a*a
d2 = b*c
d1 = b*b*d2*d3
F1 = d1*d3
F2 = d1*d2
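The final implementation can be checked against the two starting expressions $F_1 = a^4b^3c$ and $F_2 = a^2b^4c^2$ with a short sketch. The function names and the test grid are illustrative.

```python
import itertools


def direct(a, b, c):
    """F1 and F2 written out term by term."""
    return a**4 * b**3 * c, a**2 * b**4 * c**2


def condensed(a, b, c):
    d3 = a * a              # 1 multiplication
    d2 = b * c              # 1 multiplication
    d1 = b * b * d2 * d3    # 3 multiplications
    f1 = d1 * d3            # 1 multiplication
    f2 = d1 * d2            # 1 multiplication
    return f1, f2


points = itertools.product(range(-2, 3), repeat=3)
print(all(direct(*p) == condensed(*p) for p in points))  # True
```

The comments add up to the 7 multiplications stated for the final implementation.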

Page 29

Cube Literal Matrix (Condense Algorithm)

CIM for our example after the Distill algorithm (rows are cubes, columns are literals)

Term  +/-  x  y  z  4  d1  d2
1     +    2  1  0  0  1   0
2     +    0  0  0  1  1   0
3     -    1  1  1  0  0   0
4     +    1  1  0  0  0   1
5     +    1  0  0  0  0   0
6     +    0  1  1  0  0   0
7     +    0  0  0  1  0   0
8     -    1  0  0  0  0   0

Save 2 multiplications by extracting xy

Page 30

Condense Algorithm

Extracting xy (rows are cubes, columns are literals)

Term  +/-  x  y  z  4  d1  d2
1     +    1  0  0  0  1   0
2     +    0  0  0  1  1   0
3     -    0  0  1  0  0   0
4     +    0  0  0  0  0   1
5     +    1  0  0  0  0   0
6     +    0  1  1  0  0   0
7     +    0  0  0  1  0   0
8     -    1  0  0  0  0   0

No more favorable cube intersections found

Page 31

Final Implementation

$P_1 = xd_1d_3$, $d_1 = x + yz$
$P_2 = 4d_1 - d_3z$, $d_2 = 4 - x$
$P_3 = d_2d_3$, $d_3 = xy$
– Total 7 multiplications, 3 additions/subtractions
– Savings of 5 multiplications, 1 addition/subtraction compared to CSE

Impossible to obtain such results using conventional techniques

Page 32

Optimization of sin(x)

$\sin(x) = x - S_3x^3 + S_5x^5 - S_7x^7$

Rows are co-kernels, columns are kernel cubes.

Co-kernel  1     -S3x^2  S5x^4  -S7x^6  -S3   S5x^2  -S7x^4  S5    -S7x^2
x          1(1)  1(2)    1(3)   1(4)    0     0      0       0     0
x^3        0     0       0      0       1(2)  1(3)   1(4)    0     0
x^5        0     0       0      0       0     0      0       1(3)  1(4)

$\sin(x) = x + x^3(-S_3 + S_5x^2 - S_7x^4)$
Saves 6 multiplications

Page 33

Optimization of sin(x)

Co-kernel  1     x^2 d1  S5    -S7x^2
x          1(1)  1(2)    0     0
x^2        0     0       1(4)  1(5)

Final implementation:
X = x*x
sin(x) = x*(1 + (-S3 + (S5 - S7*X)*X)*X)
– Total 5 multiplications and 3 additions/subtractions
Same as the GNU C hand-optimized form
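The factored form can be compared with the expanded series in a short sketch. It assumes $S_3 = 1/3!$, $S_5 = 1/5!$, $S_7 = 1/7!$ (the Taylor coefficients of sin) and takes the sign of the $S_7$ term as negative so that the nested form expands to $x - S_3x^3 + S_5x^5 - S_7x^7$. The function names are illustrative.

```python
import math

# Assumed coefficients: the Taylor series of sin(x) up to x^7.
S3 = 1 / math.factorial(3)
S5 = 1 / math.factorial(5)
S7 = 1 / math.factorial(7)


def sin_expanded(x):
    """x - S3*x^3 + S5*x^5 - S7*x^7, written term by term."""
    return x - S3 * x**3 + S5 * x**5 - S7 * x**7


def sin_factored(x):
    """Nested form: 5 multiplications and 3 additions/subtractions."""
    X = x * x
    return x * (1 + (-S3 + (S5 - S7 * X) * X) * X)


for x in (0.0, 0.3, -1.2, 2.0):
    print(x, sin_expanded(x), sin_factored(x))
```

The two functions agree up to floating-point rounding, and for small arguments both stay close to `math.sin`.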

Page 34

Experimental Setup (Sequential processor)

Signal processing and multimedia applications
– MP3 decoder, Mesa (graphics), adaptive filter, FFT, FIR
– Taylor series approximation of trigonometric functions
– Optimizations on arithmetic subgraphs from dataflow graphs (DFGs)
Polynomials from computer graphics
– Multivariate polynomial approximation
Compared number of operations with CSE and Horner form
Estimated savings in clock cycles on ARM core

Page 35

Experimental results (comparing number of operations from different methods)

A = additions, M = multiplications

Application           Function        Unoptimized   CSE           Horner        Our technique
                                      A      M      A      M      A      M      A      M
MP3 decoder           hwin_init       80     260    72     162    80     110    64     86
MP3 decoder           imdct           63     189    63     108    63     90     63     54
Mesa                  gl_rotation     10     92     10     34     10     37     10     15
Adaptive filter       LMS             35     130    35     85     35     55     35     40
Gaussian noise filter FIR             36     224    36     143    36     89     36     63
Fast convolution      FFT             45     194    45     112    45     83     45     56
Graphics              quartic-spline  4      23     4      17     4      20     4      14
Graphics              quintic-spline  5      34     5      22     5      23     5      16
Graphics              chebyshev       8      32     8      18     8      18     8      11
Graphics              cos-wavelet     17     43     17     24     17     19     15     17
Average                               30.3   122    29.5   72.5   30.3   54.4   28.5   37.2

Average run time = 0.45s for our technique

Page 36

Experimental results (Improvement over CSE and Horner)

M = multiplications, Cycles = clock cycles on ARM 7

Application           Function        Over CSE          Over Horner
                                      M       Cycles    M       Cycles
MP3 decoder           hwin_init       46.9%   44.0%     21.8%   21.6%
MP3 decoder           imdct           50.0%   44.7%     40.0%   35.1%
Mesa                  gl_rotation     55.9%   52.8%     59.5%   56.4%
Adaptive filter       LMS             52.9%   48.9%     27.3%   24.2%
Gaussian noise filter FIR             55.9%   53.3%     29.2%   27.0%
Fast convolution      FFT             50.0%   46.3%     32.5%   29.3%
Graphics              quartic-spline  17.6%   16.8%     30.0%   28.8%
Graphics              quintic-spline  27.3%   26.1%     30.4%   29.2%
Graphics              chebyshev       38.9%   35.7%     38.9%   35.7%
Graphics              cos-wavelet     29.2%   27.0%     10.5%   10.7%
Average                               42.5%   39.6%     32.0%   29.8%

Page 37

Conclusions

Development of a new algebraic technique for optimizing polynomial expressions.
Currently used for minimizing the number of arithmetic operations using greedy rectangular covering
Results better than conventional techniques

Page 38

Future Work

Develop and implement optimal algorithms to compare results with our greedy heuristic.
Optimization for delay, energy.
Integrate our technique with a conventional compiler optimization pass to measure impact on the whole application.

Page 39

Thank You

Questions?

Page 40

Extra slides

Page 41

Finding Kernel Intersections (Distill Algorithm)

Worst-case scenario for the Distill algorithm
Number of prime rectangles is exponential in the number of rows/columns
– Heuristic methods to find the best prime rectangle
– In practice polynomial expressions are not so large

1  1  1  1
1  1  1  1
1  1  1  1
1  1  1  1
1  1  1  1