
IEEE TRANSACTIONS ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL. ASSP-22, NO. 1, FEBRUARY 1974

Fast One-Dimensional Digital Convolution by Multidimensional Techniques

RAMESH C. AGARWAL and CHARLES S. BURRUS, Member, IEEE

Abstract: This paper presents two formulations of multidimensional digital signals from one-dimensional digital signals so that multidimensional convolution will implement one-dimensional convolution of the original signals. This has reduced an important word length restriction when used with the Fermat number transform. The formulation is very general and includes block processing and sectioning as special cases and, when used with various fast algorithms for short length convolutions, results in improved multiplication efficiency.

I. Introduction

There are several advantages to formulating a one-dimensional digital convolution as a two- or higher dimensional problem. The first involves the Mersenne and Fermat number transforms which have recently been defined [1], [2] and which seem to have some advantages over the discrete Fourier transform (DFT) for implementing convolution on a digital computer. They can be computationally faster than the fast Fourier transform (FFT) implementation of the DFT and result in no roundoff error. These transforms have one limitation: the number of bits required for each word in an implementation is proportional to the length of the sequences to be convolved [1], [2].

It is the purpose of this paper to present a scheme whereby long sequences can be convolved by a two-dimensional convolution as mentioned by Rader [1]. This two-dimensional convolution can be implemented by a two-dimensional transform to allow a high-speed error-free convolution with the word length proportional to the square root of the length of the sequences.

This formulation also allows use of special short convolution algorithms similar to those proposed by Pitassi [7], Rayner [8], Davis [9], and Allwright [10]. These can be extended and combined with others to provide a very general and versatile format for doing fast digital convolution. The case of sequences of different lengths and the case where only part of the output sequence is desired are covered as special cases of the general problem.

Manuscript received March 15, 1973; revised July 10, 1973 and September 20, 1973. This work was supported by the National Science Foundation under Grant GK-23697.
The authors are with the Department of Electrical Engineering, Rice University, Houston, Tex. 77001.

II. Two-Dimensional Convolution Based on Overlap-Save

Consider the cyclic convolution of two sequences, $x(n)$ and $h(n)$, giving an output sequence $y(n)$, all of length $N$.

$$h(n) * x(n) = y(n). \tag{1}$$

This is defined by

$$y(n) = \sum_{q=0}^{N-1} h(n-q)\,x(q), \qquad n = 0, 1, \ldots, N-1 \tag{2}$$

where $h(n)$ and $x(n)$ are periodically extended outside their original domains of definition (or their indices evaluated modulo $N$).

In order to convert this one-dimensional problem to two dimensions a change of variables is made. First, it must be possible to factor $N$ into integer factors

$$N = LM. \tag{3}$$

If a change of variables is made such that

$$n = l + mL, \qquad k, l = 0, 1, \ldots, L-1$$
$$q = k + pL, \qquad m, p = 0, 1, \ldots, M-1$$

then (2) becomes

$$y(l + mL) = \sum_{k=0}^{L-1} \sum_{p=0}^{M-1} h(l + mL - k - pL)\, x(k + pL). \tag{4}$$

Define now a two-dimensional $L \times M$ array $\hat{x}$ from the original length $N = LM$ signal, $x(n)$, by

$$\hat{x}(l, m) = x(l + mL) \tag{5}$$

where the columns of $\hat{x}$ are the sections or blocks of $x$ and the rows are samples of $x$ taken every $L$ values of $n$. In a similar way $\hat{h}$ and $\hat{y}$ are defined by

$$\hat{h}(l, m) = h(l + mL)$$
$$\hat{y}(l, m) = y(l + mL). \tag{6}$$

In terms of the two-dimensional signals, (4) becomes

$$\hat{y}(l, m) = \sum_{p=0}^{M-1} \sum_{k=0}^{L-1} \hat{h}(l-k,\, m-p)\, \hat{x}(k, p) \tag{7}$$

which is a two-dimensional convolution. Note that values of $\hat{h}$ outside the $L \times M$ array are required. The values that are needed can be seen from (7), where $1 - L \le l - k \le L - 1$ and $1 - M \le m - p \le M - 1$. Values of $\hat{h}(l, m)$ outside the $L \times M$ array are defined by (6). We therefore define suitably extended arrays $\hat{H}$ and $\hat{X}$ so that two-dimensional convolution will give the desired answer. The extension of $\hat{H}$ is analogous to the overlap extensions used in the overlap-save algorithm for doing sectioning or block processing [3], [4]. Note that along the $m$ dimension the values of $\hat{H}$ are periodic with period $M$, i.e., the desired convolution in (7) is cyclic in the $m$ dimension (this is because the original desired convolution in (2) was defined as cyclic). Considering the values of $\hat{H}$ along $l$ shows the convolution in that dimension is not cyclic.
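The change of variables can be checked numerically. The following is a minimal sketch, assuming NumPy and small integer test sequences; the function names `cyclic_conv_direct` and `cyclic_conv_2d` are illustrative and do not come from the paper. It evaluates the double sum of the two-dimensional convolution directly, taking values of h outside the L x M array from the periodic extension, and maps the result back to one dimension.

```python
import numpy as np

def cyclic_conv_direct(h, x):
    """Length-N cyclic convolution, evaluated directly from the defining sum."""
    N = len(x)
    return np.array([sum(h[(n - q) % N] * x[q] for q in range(N))
                     for n in range(N)])

def cyclic_conv_2d(h, x, L, M):
    """Same result via the two-dimensional form with n = l + m*L, q = k + p*L."""
    N = L * M
    x_hat = x.reshape(M, L).T            # x_hat[l, m] = x[l + m*L]
    y_hat = np.zeros((L, M), dtype=np.int64)
    for l in range(L):
        for m in range(M):
            for k in range(L):
                for p in range(M):
                    # h_hat(l-k, m-p) = h(l - k + (m - p)*L), indices taken modulo N
                    y_hat[l, m] += h[(l - k + (m - p) * L) % N] * x_hat[k, p]
    return y_hat.T.reshape(N)            # y[l + m*L] = y_hat[l, m]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    h = rng.integers(-5, 6, 12)
    x = rng.integers(-5, 6, 12)
    print(np.array_equal(cyclic_conv_2d(h, x, 4, 3), cyclic_conv_direct(h, x)))
```

The nested loops are deliberately naive: they use the same number of multiplications as the direct sum and only demonstrate that the index mapping is consistent.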

An important factor when considering multidimensional convolution is the number of multiplications required in an implementation. If in (7) $m$ and $p$ are held constant then (7) becomes a scalar convolution along the $l$ dimension of a length-$L$ sequence of $x$ with a length-$(2L-1)$ sequence of $h$ giving a length-$L$ sequence of the output. There will be one of these length-$L$ scalar convolutions for each value of $m$ and $p$ in (7) so that along the $m$ dimension each "operation" will be a length-$L$ convolution rather than a simple scalar multiplication. This is a sort of convolution of convolutions [3]. To count the total number of multiplications, the number of multiplications to compute (7) with $l$ and $k$ held constant is found first. With $l$ and $k$ not constant, this will be the number of length-$L$ convolutions necessary, and the total number of multiplications will be the number of length-$L$ convolutions times the number of multiplications for a length-$L$ convolution. In this case along the $m$ dimension the number of multiplications is $M^2$ and along the $l$ dimension it is $L^2$, so the total is $M^2 L^2$ or $N^2$, which is the same as a direct calculation of (2) would require. We will later use transforms and other schemes to reduce this number. Note the convolution can be carried out in either order.

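If (7) is to be carried out by transforms and therefore cyclic convolution, $\hat{x}$ must be augmented with zeros so that aliasing of the noncyclic convolution along $l$ does not occur and so that all arrays are the same size. Consider the $(2L-1) \times M$ array $\hat{X}$ formed by appending $(L-1)$ rows of zeros to the bottom of $\hat{x}$.

$$\hat{X} = \begin{bmatrix} x(0) & x(L) & \cdots & x(N-L) \\ x(1) & x(L+1) & \cdots & x(N-L+1) \\ \vdots & \vdots & & \vdots \\ x(L-1) & x(2L-1) & \cdots & x(N-1) \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix} \tag{8}$$

$\hat{H}$ is formed so that the columns of $\hat{H}$ contain the periodic extension of the original $h(n)$ with period $N$.

$$\hat{H} = \begin{bmatrix} h(N-L+1) & h(1) & \cdots & h(N-2L+1) \\ \vdots & \vdots & & \vdots \\ h(N-1) & h(L-1) & \cdots & h(N-L-1) \\ h(0) & h(L) & \cdots & h(N-L) \\ h(1) & h(L+1) & \cdots & h(N-L+1) \\ \vdots & \vdots & & \vdots \\ h(L-1) & h(2L-1) & \cdots & h(N-1) \end{bmatrix} \tag{9}$$

If two-dimensional cyclic convolution is carried out we have

$$\hat{Y} = \hat{H} * \hat{X} \tag{10}$$

where the lower $L \times M$ partition of $\hat{Y}$ is $\hat{y}$, whose columns are the desired blocks of $y(n)$. Because of ease in implementation with transforms, the arrays would usually be extended one additional row to be $2L \times M$ rather than the minimum $(2L-1) \times M$.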
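The construction of the augmented arrays can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it uses NumPy's two-dimensional FFT as the cyclic-convolution engine, keeps the minimum (2L-1) x M size, and places the periodic extension of h so that row r of the extended array holds h(r - L + 1 + mL). The name `overlap_save_2d` is hypothetical.

```python
import numpy as np

def overlap_save_2d(h, x, L, M):
    """Length-N cyclic convolution (N = L*M) via (2L-1) x M 2-D cyclic convolution."""
    N = L * M
    # X_ext: the L x M block array of x with (L-1) rows of zeros appended below.
    X_ext = np.zeros((2 * L - 1, M))
    X_ext[:L, :] = x.reshape(M, L).T
    # H_ext[r, m] = h((r - L + 1 + m*L) mod N): periodic extension of h with period N.
    r = np.arange(2 * L - 1)[:, None]
    m = np.arange(M)[None, :]
    H_ext = h[(r - L + 1 + m * L) % N]
    # Two-dimensional cyclic convolution carried out with a 2-D transform.
    Y_ext = np.fft.ifft2(np.fft.fft2(H_ext) * np.fft.fft2(X_ext)).real
    # The lower L x M partition holds the desired blocks of y as its columns.
    return Y_ext[L - 1:, :].T.reshape(N)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    h = rng.integers(-5, 6, 12).astype(float)
    x = rng.integers(-5, 6, 12).astype(float)
    ref = np.array([sum(h[(n - q) % 12] * x[q] for q in range(12)) for n in range(12)])
    print(np.allclose(overlap_save_2d(h, x, 4, 3), ref))
```

The upper L-1 rows of the result are aliased and discarded, which mirrors the discarded samples in ordinary overlap-save sectioning.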

III. Two-Dimensional Transform Implementation

The two-dimensional transform of $\hat{X}$ is defined as

$$T\{\hat{X}\} = F(j, k) = \sum_{m=0}^{M-1} \sum_{l=0}^{2L-1} \hat{X}(l, m)\, \alpha_{2L}^{lj}\, \alpha_M^{mk} \tag{11}$$

and the inverse transform

$$T^{-1}\{F\} = \hat{X}(l, m) = (2N)^{-1} \sum_{k=0}^{M-1} \sum_{j=0}^{2L-1} F(j, k)\, \alpha_{2L}^{-lj}\, \alpha_M^{-mk} \tag{12}$$

where $\alpha_M$ is of order $M$ (i.e., $M$ is the least positive integer such that $\alpha_M^M = 1$ [2]). Applying the transform to (10) it can be shown that

$$T\{\hat{Y}\} = T\{\hat{H}\}\, T\{\hat{X}\} \tag{13}$$

so that, similar to the one-dimensional case, (10) can be carried out by

$$\hat{Y} = T^{-1}\left[ T\{\hat{H}\}\, T\{\hat{X}\} \right]. \tag{14}$$

If the transform is the DFT, then $\alpha_M = e^{-i2\pi/M}$.

To compare multiplication efficiencies we assume the DFT of $\hat{H}$ is already known and the number of multiplications for one $2L \times M$ transform, one $2L \times M$ complex multiplication, and one $2L \times M$ inverse transform are calculated. The number of complex multiplications is approximately $(2N \log 2N + 2N)$ as compared to $(N \log N + N)$ for a one-dimensional implementation and therefore one would not use the two-dimensional approach with the DFT for improved multiplication efficiency.

The computational advantage appears when used with the Fermat number transform [2] where word length requirements are a possible restriction. The Fermat number transform is defined in [2] and, although not named, is defined in a restricted form in the latter part of [1, eq. (38)]. It is a transform defined in a finite ring of integers with arithmetic performed modulo Fermat numbers ($2^b + 1$, $b = 2^t$), with $\alpha_{2b} = 2$ and $\alpha_{4b} = \sqrt{2}$, and having the property that multiplication of Fermat number transforms corresponds to conventional cyclic convolution (2) modulo $2^b + 1$. To perform convolution with the transform requires $N$ real multiplications and a number of additions and word shifts proportional to $N \log N$. Unfortunately the transform requires word lengths proportional to the length of the sequences to be convolved [1], [2]. Since the lengths of the two dimensions are $2L$ and $M$ rather than $N = LM$ for the one-dimensional signal, the word-length requirement using the two-dimensional transform is proportional to the square root of $N$ rather than to $N$ as for the one-dimensional problem. It is this reduction in the necessary word length that makes the two-dimensional formulation attractive with the Fermat number transform.
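As a concrete illustration of error-free convolution with a Fermat number transform, the sketch below works modulo 2^16 + 1 with root 2, which has order 32 in that ring, so the transform length is 32. These parameter choices are assumptions made for the example. The transform is evaluated directly as an O(N^2) sum for clarity; a practical implementation would use an FFT-type structure in which multiplications by powers of 2 become word shifts. The names `fnt` and `fnt_cyclic_conv` are illustrative, and `pow(a, -1, F)` requires Python 3.8 or later.

```python
F = 2**16 + 1      # Fermat number modulus (b = 16)
ALPHA = 2          # root of order 2b = 32 modulo F
N = 32             # transform length

def fnt(a, root):
    """Direct number-theoretic transform of a length-N list modulo F."""
    return [sum(a[n] * pow(root, n * k, F) for n in range(N)) % F
            for k in range(N)]

def fnt_cyclic_conv(h, x):
    """Cyclic convolution modulo F via transform, pointwise product, inverse."""
    H, X = fnt(h, ALPHA), fnt(x, ALPHA)
    Y = [(a * b) % F for a, b in zip(H, X)]
    inv_root = pow(ALPHA, -1, F)       # modular inverse of the root
    inv_n = pow(N, -1, F)              # modular inverse of the length
    return [(inv_n * v) % F for v in fnt(Y, inv_root)]

if __name__ == "__main__":
    x = list(range(32))
    h = [1, 2, 3] + [0] * 29
    direct = [sum(h[(n - q) % N] * x[q] for q in range(N)) % F for n in range(N)]
    print(fnt_cyclic_conv(h, x) == direct)
```

The result equals the true integer convolution only when every output value fits within the modulus, which is the word-length restriction discussed in the text.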

The consequence of this reduction in word length is of considerable practical importance. For example, using a word length of 16 b and the Fermat number transform [2] with $\alpha = 2$ to compute the complete noncyclic convolution of two sequences of equal length, the one-dimensional implementation restricts the sequence length to a maximum of 16 whereas the two-dimensional implementation increases the maximum to 256. A summary of the restrictions on sequence lengths is shown in Table I for the most practical word lengths and for two values of $\alpha$. Note that one-dimensional length restrictions would be too severe for many applications but the two-dimensional restrictions would include most practical filters. If cyclic convolution is desired, then all the length restrictions are doubled since the addition of zeros to prevent aliasing is unnecessary.
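The quoted limits can be reproduced with a short calculation. The sketch below rests on one reading of the restrictions, which is an assumption rather than a statement from the paper: the transform length is 2b for the basis 2 and 4b for the basis sqrt(2); a one-dimensional noncyclic convolution may use half the transform length; and in two dimensions both 2L and M equal the transform length, with half of N = LM available for noncyclic convolution. The function name and its `order_factor` parameter are hypothetical.

```python
def max_noncyclic_lengths(word_bits, order_factor):
    """Return (1-D limit, 2-D limit) on sequence length for noncyclic convolution.

    order_factor is 2 for basis alpha = 2 (transform length 2b) and
    4 for basis alpha = sqrt(2) (transform length 4b).
    """
    t = order_factor * word_bits     # maximum transform length
    one_d = t // 2                   # half is reserved for zero padding
    L, M = t // 2, t                 # 2-D: 2L = t and M = t
    two_d = (L * M) // 2             # half of N = L*M for noncyclic output
    return one_d, two_d

if __name__ == "__main__":
    for bits in (16, 32, 64):
        for factor, name in ((2, "2"), (4, "sqrt(2)")):
            print(bits, name, max_noncyclic_lengths(bits, factor))
```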

The two-dimensional transform, or inverse transform, can be taken in either order. There is, however, a computational advantage in taking the transform first along the $m$ direction (length $M$) and then along the $l$ direction (length $2L$); half the $\hat{X}$ sequences along the $m$ direction are zero and half the $\hat{H}$ sequences along the $m$ direction are cyclic shifts by one position of the other half of the sequences. Also, while taking the inverse transform, there is an advantage to first taking the inverse transform along $l$, then along $m$, because we need only half the $\hat{Y}$ sequences and therefore only half the sequences need be inverted along $m$.

TABLE I
Sequence Length Restrictions for One- and Two-Dimensional Implementation of Noncyclic Convolution by the Fermat Number Transform

Word Length   Transform    Maximum Sequence Length N/2
(Bits)        Basis α      1-D        2-D
16            2            16         256
16            √2           32         1024
32            2            32         1024
32            √2           64         4096
64            2            64         4096
64            √2           128        16384

IV. Generalizations and the Inverse Problem

A generalization of the approach in this paper to higher orders is fairly obvious. For example, $N$ could be factored into three integer factors $N = LMP$, as was done with two factors in (3). The signals $x(n)$ and $h(n)$ would then be redefined as three-dimensional $L \times M \times P$ arrays and (2) converted to a three-dimensional convolution by a change of variables as was done in two dimensions in (4)-(7). For $x$ this would be

$$\hat{x}(l, m, p) = x(l + mL + pML) \tag{15}$$

and with similar definitions for $\hat{h}$ and $\hat{y}$, and after augmentation to prevent aliasing, (2) would become a three-dimensional convolution.

We would then have one-dimensional cyclic convolution of $N$-length sequences being carried out by three-dimensional cyclic convolution with dimensions of lengths $2L$, $2M$, and $P$. Use of orders higher than two does not seem needed with the Fermat number transform at this time, but will be exploited with other schemes in the next sections.

Still another variation would apply to the case where the filter is periodically time-varying. This, when converted to a two-dimensional problem with $L$ equal to the period of the filter, becomes time-invariant along one dimension, $m$, and time-varying along the other, $l$ [5]. The DFT or Fermat transform could be applied to the two-dimensional signal along the time-invariant dimension and either direct calculation or another type of transform applied to the other dimension.

The inverse of the above problem can be considered where one wishes to implement two-dimensional convolution by one-dimensional methods. This involves reversing the process presented in Section II by constructing one-dimensional sequences from the given arrays. MacAdams [6] has presented a scheme for doing this that can be seen to be the inverse of the problem addressed in this paper and can be applied to cyclic or noncyclic two-dimensional convolution.

V. M-Dimensional Convolution

If the length of the signals to be cyclically convolved in (2) can be written $N = 2^M$ then the methods used in (5) and (15) can be extended to define an $M$-dimensional signal with each dimension of length two. As before, this is done by a change of variables.

$$\hat{x}(l, m, p, \ldots) = x(l + 2m + 4p + \cdots), \qquad l, m, p, \ldots = 0, 1. \tag{17}$$

After a similar definition for $\hat{h}$ and $\hat{y}$, cyclic convolution in (2) becomes

$$\hat{y}(l, m, p, \ldots) = \sum_{k=0}^{1} \sum_{j=0}^{1} \sum_{g=0}^{1} \cdots\, \hat{h}(l-k,\, m-j,\, p-g,\, \ldots)\, \hat{x}(k, j, g, \ldots) \tag{18}$$

for $l, m, p, \ldots = 0, 1$.

Both $\hat{x}$ and $\hat{y}$ are $M$-dimensional with each dimension of length two and $\hat{h}$ is also $M$-dimensional but must be defined with dimensions of length three in order to carry out (18). This is the same as was required in two dimensions where $\hat{H}$ was defined in (9). Since the original convolution in (2) was defined as cyclic, the convolution along the last dimension in (18) is also cyclic so that $\hat{h}$ need not be extended along that dimension but can be evaluated for this index modulo 2.

This is an extremely general and versatile formulation for the original problem that can be used to improve computational efficiency. The convolution along each dimension of length two can be written in terms of scalar variables as

$$\begin{bmatrix} y_0 \\ y_1 \end{bmatrix} = \begin{bmatrix} h_0 & h_{-1} \\ h_1 & h_0 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \end{bmatrix}. \tag{19}$$

The convolution of (18) can be viewed as $M$ nested length-2 convolutions, each separately of the form shown in (19) requiring four multiplications. Using the same reasoning that was explained for the two-dimensional formulation, the total number of multiplications is $F = 4^M$. Since $N = 2^M$ this becomes $F = N^2$, which is again the same as would be required by directly calculating (2).

VI. A High Speed Algorithm

With this formulation of scalar convolution in terms of multidimensional short convolutions, various tricks for efficient short convolution can be used. Consider one of several possible algorithms suggested by Pitassi [7] where, rather than directly calculating (19), three intermediate numbers are found:

$$g_0 = (h_0 + h_{-1})\, x_1$$
$$g_1 = h_0\, (x_0 - x_1)$$
$$g_2 = (h_1 + h_0)\, x_0. \tag{20}$$

The desired outputs are then calculated by

$$y_0 = g_0 + g_1$$
$$y_1 = g_2 - g_1. \tag{21}$$

This approach uses three multiplications in comparison with four multiplications for a direct calculation. Using this result, the total number of multiplications to calculate (18) becomes

$$F = 3^M. \tag{22}$$

If the fact pointed out earlier, that the convolution along the last dimension is cyclic, is used, then a further reduction is possible. Length-two cyclic convolution is given by

$$\begin{bmatrix} y_0 \\ y_1 \end{bmatrix} = \begin{bmatrix} h_0 & h_1 \\ h_1 & h_0 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \end{bmatrix}. \tag{23}$$

This can be calculated from two intermediate values by

$$f_0 = (h_0/2 + h_1/2)(x_0 + x_1)$$
$$f_1 = (h_0/2 - h_1/2)(x_0 - x_1)$$

to give

$$y_0 = f_0 + f_1$$
$$y_1 = f_0 - f_1 \tag{24}$$

requiring two multiplications (assuming the factors are either precalculated or obtained by shifting). Using this result on the last dimension reduces (22) to

$$F = 2 \cdot 3^{M-1} \tag{25}$$

which is the same number obtained by Pitassi [7] and Rayner [8].
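Both length-two algorithms are small enough to check by hand or in a few lines of code. The following sketch, with illustrative function names, implements the three-multiplication form for the general length-two convolution (where the upper-right coefficient is an independent value, written `h_m1` here) and the two-multiplication form for the cyclic case.

```python
def length2_three_mult(h_m1, h0, h1, x0, x1):
    """y0 = h0*x0 + h_m1*x1, y1 = h1*x0 + h0*x1 using three multiplications."""
    g0 = (h0 + h_m1) * x1
    g1 = h0 * (x0 - x1)
    g2 = (h1 + h0) * x0
    return g0 + g1, g2 - g1

def length2_cyclic_two_mult(h0, h1, x0, x1):
    """y0 = h0*x0 + h1*x1, y1 = h1*x0 + h0*x1 using two multiplications."""
    a = (h0 + h1) / 2        # depends only on h, so it can be precalculated
    b = (h0 - h1) / 2
    f0 = a * (x0 + x1)
    f1 = b * (x0 - x1)
    return f0 + f1, f0 - f1

if __name__ == "__main__":
    print(length2_three_mult(5, 2, 3, 7, 11))    # direct: (2*7 + 5*11, 3*7 + 2*11)
    print(length2_cyclic_two_mult(2, 4, 7, 11))  # direct: (2*7 + 4*11, 4*7 + 2*11)
```

In the three-multiplication form the sums of h values can also be precalculated when the filter is fixed, so the run-time cost is three multiplications plus the additions on x and on the intermediate products.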

Pitassi developed his algorithm by relating the cyclic convolution of two sequences to the convolution of subsequences in the same manner that the FFT is developed based on decimation in time. This was extended by Davis [9] to an approach similar to decimation in frequency where the subsequences are halves of the original sequences. Both of these are special cases of the multidimensional formulation since the values along any dimension are samples or blocks of the original sequence.

Another form of convolution that is sometimes desired is of the same form as (2) but with $h(n)$ not being periodically extended, but rather having independent values for negative indices. The transmission matrix formulation for $N = 3$ is

$$\begin{bmatrix} y_0 \\ y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} h_0 & h_{-1} & h_{-2} \\ h_1 & h_0 & h_{-1} \\ h_2 & h_1 & h_0 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \end{bmatrix} \tag{26}$$

and because of its structure the operator is called a constant diagonal convolution matrix. This requires $2N - 1$ values of $h(n)$ from $n = -N + 1$ to $n = N - 1$ and $N$ values of $x(n)$ to give $N$ values of $y(n)$.

The same approach that was used for cyclic convolution is applicable here except the reduction described in (23) and (24) does not work since the $M$-dimensional formulation of (18) is no longer cyclic along any dimension. Therefore the number of multiplications necessary to implement a length-$N$ constant diagonal convolution is

$$F = 3^M \tag{27}$$

which is the same as obtained by Allwright [10], using a matrix factorization approach.
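The multiplication count for constant diagonal convolution can be illustrated with a recursive sketch that applies the three-multiplication length-two step at every level and counts scalar multiplications. Two assumptions are made for the example: the sequences are split into halves at each level (the block form, rather than even and odd samples), and the 2N-1 filter values are stored in an array `t` with `t[i + N - 1]` holding h(i). The function name and the counter argument are illustrative.

```python
import numpy as np

def toeplitz_mul(t, x, counter):
    """y[n] = sum_q h(n - q) x[q], n = 0..N-1, with t[i + N - 1] = h(i), N = 2^M."""
    n = len(x)
    if n == 1:
        counter[0] += 1                     # one scalar multiplication
        return np.array([t[0] * x[0]])
    half = n // 2
    c = n - 1                               # position of h(0) inside t

    def block(shift):
        # Diagonals h(i + shift), i = -(half-1)..(half-1), of a half-size block.
        return t[c + shift - (half - 1): c + shift + half]

    t0, t_minus, t_plus = block(0), block(-half), block(half)
    x0, x1 = x[:half], x[half:]
    g0 = toeplitz_mul(t0 + t_minus, x1, counter)
    g1 = toeplitz_mul(t0, x0 - x1, counter)
    g2 = toeplitz_mul(t_plus + t0, x0, counter)
    return np.concatenate([g0 + g1, g2 - g1])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N = 8
    t = rng.integers(-5, 6, 2 * N - 1)
    x = rng.integers(-5, 6, N)
    count = [0]
    y = toeplitz_mul(t, x, count)
    ref = np.array([sum(t[n - q + N - 1] * x[q] for q in range(N)) for n in range(N)])
    print(np.array_equal(y, ref), count[0])
```

For N = 8 the counter reaches 27, i.e. three to the power M with M = 3. Cyclic convolution is obtained as a special case by filling `t` with the periodic extension of h.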

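Note that cyclic convolution in (2) can be viewed as a special case of constant diagonal convolution. Causal noncyclic convolution where $h(n)$ is zero for $n < 0$ is also a special case.

$$\begin{bmatrix} y_0 \\ y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} h_0 & 0 & 0 \\ h_1 & h_0 & 0 \\ h_2 & h_1 & h_0 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \end{bmatrix} \tag{28}$$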

The most common form of convolution desired is causal noncyclic convolution where all of the output sequence is obtained rather than only the first $N$ points as in (28). For this case two length-$N$ sequences are convolved to give a length-$(2N - 1)$ output. The transmission matrix formulation for $N = 3$ is

$$\begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix} = \begin{bmatrix} h_0 & 0 & 0 \\ h_1 & h_0 & 0 \\ h_2 & h_1 & h_0 \\ 0 & h_2 & h_1 \\ 0 & 0 & h_2 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \end{bmatrix}. \tag{29}$$

To apply the results from cyclic convolution, all sequences are extended with zeros to length $2N$. Cyclic convolution then gives the desired output of (29) and uses

$$F = 2 \cdot 3^M \tag{30}$$

multiplications.

A further reduction is possible by recognizing that along the last dimension only one multiplication is necessary rather than two for the cyclic case that gave (25) or three for the constant diagonal noncyclic case that gave (27). Consider the $(M + 1)$-dimensional convolution of (18) from the extended sequences with all indices except the last one held constant. The resulting length-two scalar convolution is

$$\begin{bmatrix} \hat{y}_0 \\ \hat{y}_1 \end{bmatrix} = \begin{bmatrix} \hat{h}_0 & \hat{h}_{-1} \\ \hat{h}_1 & \hat{h}_0 \end{bmatrix} \begin{bmatrix} \hat{x}_0 \\ \hat{x}_1 \end{bmatrix} \tag{31}$$

which becomes

$$\hat{y}_0 = \hat{h}_0 \hat{x}_0 + \hat{h}_{-1} \hat{x}_1$$
$$\hat{y}_1 = \hat{h}_1 \hat{x}_0 + \hat{h}_0 \hat{x}_1. \tag{32}$$

The $\hat{x}_1$ terms are always zero since the last half of the length-$2N$ sequence $x(n)$ has been added as zeros. Close examination shows that because $h(n)$ has also been extended with zeros, either $\hat{h}_0$ or $\hat{h}_1$ will always be zero, depending on values of the constant indices. Therefore the length-two convolution of (31) will require only one multiplication and the resulting total number of multiplications necessary to noncyclically convolve two length-$N$ sequences giving a length-$2N$ output is

$$F = 3^M \tag{33}$$

which is the same as for the length-$N$ constant diagonal convolution with a length-$N$ output.

VII. Multidimensional Convolution Based on Overlap-Add

Another two-dimensional formulation can be developed that is a generalization of the overlap-add algorithm [3], [4]. In (7) the $\hat{h}$ function was augmented so that the desired output $y(n)$ is in blocks that are the columns of $\hat{y}$. For this formulation $\hat{h}$ will not be augmented but $\hat{y}$ will be, and the desired $y(n)$ will be obtained by adding the overlapped columns of $\hat{y}$. This formulation is based on noncyclic convolution given by

$$y(n) = \sum_{q=0}^{N-1} h(n-q)\, x(q) \tag{34}$$

where $h(n)$ and $x(n)$ are sequences of length $N$ and $y(n)$ of length $2N - 1$ and all are defined to be zero outside these lengths. The transmission matrix formulation for $N = 3$ is given by (29).

Using a similar factoring of $N$ and change of variables as was done in (3) and (5) we have (34) becoming

$$\hat{y}(l, m) = \sum_{p=0}^{M-1} \sum_{k=0}^{L-1} \hat{h}(l-k,\, m-p)\, \hat{x}(k, p), \qquad l = 0, 1, \ldots, 2L-2, \quad m = 0, 1, \ldots, 2M-2. \tag{35}$$

In this case both $\hat{x}$ and $\hat{h}$ are $L \times M$ arrays and $\hat{y}$ is $(2L - 1) \times (2M - 1)$, with $\hat{x}$ and $\hat{h}$ having as their columns the blocks of $x(n)$ and $h(n)$, but the columns of $\hat{y}$ and the blocks of $y(n)$ have a more complicated relation than in (6). Both $\hat{x}$ and $\hat{h}$ are defined to be zero outside the domain of definition. Here we can show

$$y(l + mL) = \hat{y}(l, m) + \hat{y}(l + L,\, m - 1) \tag{36}$$

for $l = 0, 1, \ldots, L-1$ and $m = 0, 1, \ldots, 2M-1$, and $\hat{y}(l, m) = 0$ outside its domain of definition of

$$l = 0, 1, \ldots, 2L-2, \qquad m = 0, 1, \ldots, 2M-2. \tag{37}$$

This formulation gives the implementation of one-dimensional convolution by the sum of overlapped columns of an array obtained by noncyclic two-dimensional convolution. This can be viewed as a generalization of the overlap-add algorithm [3], [4] used for sectioning or block processing.
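A small numerical check of the overlap-add formulation follows. It is a sketch under stated assumptions: NumPy is used, the noncyclic two-dimensional convolution is computed by plain shift-and-add rather than by any fast method, and the function name `overlap_add_2d` is illustrative. Each element of the (2L-1) x (2M-1) output array is added into the output sample at index l + mL, which is the overlapped-column sum described above.

```python
import numpy as np

def overlap_add_2d(h, x, L, M):
    """Noncyclic convolution of two length-N sequences (N = L*M) via 2-D convolution."""
    N = L * M
    x_hat = x.reshape(M, L).T              # columns are the blocks of x
    h_hat = h.reshape(M, L).T              # columns are the blocks of h
    # Noncyclic two-dimensional convolution by shift-and-add.
    y_hat = np.zeros((2 * L - 1, 2 * M - 1), dtype=x.dtype)
    for k in range(L):
        for p in range(M):
            y_hat[k:k + L, p:p + M] += h_hat * x_hat[k, p]
    # Overlap and add: element (l, m) contributes to output sample l + m*L.
    y = np.zeros(2 * N - 1, dtype=x.dtype)
    for l in range(2 * L - 1):
        for m in range(2 * M - 1):
            y[l + m * L] += y_hat[l, m]
    return y

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    h = rng.integers(-5, 6, 12)
    x = rng.integers(-5, 6, 12)
    print(np.array_equal(overlap_add_2d(h, x, 4, 3), np.convolve(h, x)))
```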

If $N = 2^M$, the extension to $M$-dimensional convolution is similar to that done for the overlap-save technique in (17) and (18). Both $\hat{x}$ and $\hat{h}$ are $M$-dimensional with each dimension of length two and $\hat{y}$ is $M$-dimensional but with each dimension of length three. For

$$\hat{x}(l, m, p, \ldots) = x(l + 2m + 4p + \cdots)$$
$$\hat{h}(l, m, p, \ldots) = h(l + 2m + 4p + \cdots) \tag{38}$$

(34) becomes

$$\hat{y}(l, m, p, \ldots) = \sum_{k=0}^{1} \sum_{j=0}^{1} \sum_{g=0}^{1} \cdots\, \hat{h}(l-k,\, m-j,\, p-g,\, \ldots)\, \hat{x}(k, j, g, \ldots) \tag{39}$$

for $l, m, p, \ldots = 0, 1, 2$.

The calculation of $y(n)$ from $\hat{y}$ is a bit complicated but is a generalization of (36) and involves only additions. For example if $N = 2^3 = 8$ the $\hat{y}$ function would have three dimensions with a total of 27 elements. Along the $l$ dimension there would be nine length-three blocks that, when overlapped and added along the $m$ dimension, would give three length-seven blocks. These, when overlapped and added, would give the single length-fifteen sequence that would be $y(n)$. An example for $N = 4$ would have a two-dimensional three $\times$ three $\hat{y}$

$$\hat{y} = \begin{bmatrix} \hat{y}_{00} & \hat{y}_{01} & \hat{y}_{02} \\ \hat{y}_{10} & \hat{y}_{11} & \hat{y}_{12} \\ \hat{y}_{20} & \hat{y}_{21} & \hat{y}_{22} \end{bmatrix} \tag{40}$$

and $y(n)$ would be found by

$$y_0 = \hat{y}_{00}$$
$$y_1 = \hat{y}_{10}$$
$$y_2 = \hat{y}_{20} + \hat{y}_{01}$$
$$y_3 = \hat{y}_{11}$$
$$y_4 = \hat{y}_{21} + \hat{y}_{02}$$
$$y_5 = \hat{y}_{12}$$
$$y_6 = \hat{y}_{22}. \tag{41}$$

Along each dimension of (39) the matrix formulation of the length-two convolution is

$$\begin{bmatrix} y_0 \\ y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} h_0 & 0 \\ h_1 & h_0 \\ 0 & h_1 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \end{bmatrix} \tag{42}$$

which, if done directly, requires four multiplications and one addition. If an intermediate step is added for $y_1$, then

$$y_0 = h_0 x_0$$
$$y_1 = (h_0 + h_1)(x_0 + x_1) - y_0 - y_2$$
$$y_2 = h_1 x_1 \tag{43}$$

gives the three outputs with three multiplications and four additions/subtractions. Using this algorithm along each dimension of (39) and counting the number of required multiplications by the method used to find (33), we find that (39) can be computed with

$$F = 3^M \tag{44}$$

multiplications, which is the same as that obtained by the overlap-save method in (33).
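Nesting the three-multiplication step over all M dimensions can be sketched recursively. The version below is an illustration, not the authors' code: it splits each sequence into halves (the block form of the length-two step), counts scalar multiplications in a one-element list, and checks the result against NumPy's direct noncyclic convolution. The function name is hypothetical.

```python
import numpy as np

def nested_conv(h, x, counter):
    """Noncyclic convolution of two length-2^M sequences with 3^M multiplications."""
    n = len(x)
    if n == 1:
        counter[0] += 1                      # one scalar multiplication
        return np.array([h[0] * x[0]])
    half = n // 2
    h0, h1 = h[:half], h[half:]
    x0, x1 = x[:half], x[half:]
    y0 = nested_conv(h0, x0, counter)                       # block form of h0*x0
    y2 = nested_conv(h1, x1, counter)                       # block form of h1*x1
    y1 = nested_conv(h0 + h1, x0 + x1, counter) - y0 - y2   # middle term
    # Overlap and add the three length-(n-1) blocks at offsets 0, half, n.
    y = np.zeros(2 * n - 1, dtype=y0.dtype)
    y[:n - 1] += y0
    y[half:half + n - 1] += y1
    y[n:] += y2
    return y

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    h = rng.integers(-5, 6, 8)
    x = rng.integers(-5, 6, 8)
    count = [0]
    y = nested_conv(h, x, count)
    print(np.array_equal(y, np.convolve(h, x)), count[0])
```

For length 8 the counter reads 27, against 64 multiplications for the direct sum. The saving in multiplications is paid for with extra additions, so whether it is a net gain depends on the relative cost of the two operations.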

VIII. Sequences of Different Lengths and Partial Outputs

Two modifications of the usual formulation of convolution are often desired. The first occurs when the two sequences are significantly different in length and the second when one desires only a portion of the output rather than all of it. Both cases can be formulated in terms of multidimensional convolutions and computational savings can be realized.

If $h(n)$ is assumed to be the shorter sequence and if its length can be expressed as $L = 2^S$, the length of the longer sequence $x(n)$ will be expressed as $N = 2^S \cdot M$. It may be necessary to add zeros to both $x$ and $h$. The multidimensional signals are formed to give $S$ length-two dimensions and the last dimension of length $M$. If $m$ is the index for the length-$M$ dimension, then $\hat{h}$ is formulated to be zero for all $m$ other than $m = 0$. Therefore, if the fast algorithm of Section VI or of (43) is used along the $S$ length-two dimensions, there will be only $M$ multiplications required along the $M$ dimension. The total number of multiplications is then

$$F = 3^S \cdot M. \tag{45}$$

This could also be seen by considering the problem as requiring $M$ length-$2^S$ noncyclic convolutions with the outputs overlapped and added.

If only a portion of the total output from the convolution is desired, then a similar saving can be obtained. Assume that both $h(n)$ and $x(n)$ are of length $N = LM$, with a noncyclic output $y(n)$ of length $2N - 1$. Out of this only a block of $y(n)$ of length $L$ is desired. First, $N$ extra zeros are appended to both the sequences and cyclic convolution of length $2N$ is formulated. This one-dimensional cyclic convolution is reformulated as a two-dimensional convolution as in (5)-(7). The first dimension is of length $L$ and the convolution in this dimension is noncyclic and can be carried out as cyclic convolution of length $2L - 1$. In the second dimension the convolution is cyclic and of length $2M$. The two-dimensional arrays are formulated so that the desired length-$L$ block of $y(n)$ appears as a column of $\hat{y}$ in (6). Thus, $\hat{y}$ along the second dimension has to be computed only for one index. This replaces cyclic convolution of length $2M$ with a summation of $2M$ terms along the second dimension. The desired output block can be written as a summation of $2M$ convolutions of length $L$, where each convolution represents convolution of two sequences of lengths $2L - 1$ and $L$, respectively, giving a sequence of length $L$. Because zeros were appended to both $x(n)$ and $h(n)$, out of these $2M$ convolutions, between one and $M$ convolutions would be nonzero, depending on the second index of $\hat{y}$ for which the output is desired.

The convolutions along the first dimension of length $L$ can be carried out either by transform techniques or by the multidimensional techniques discussed in this paper. If the transform techniques are used, the summation of convolutions can be carried out in the transform domain, thus requiring only one inverse transform to obtain the desired output. Rader [11] has discussed a similar technique for the particular case of estimation of the autocorrelation function for the first few lag values. If $L = 2^S$ and the multidimensional methods are used, the number of multiplications is at most the same as given in (45).

The formulation just discussed can be extended for the situation where sampled output is desired, sampled at every $L$th value. In this case for the two-dimensional formulation, the output appears as a row of $\hat{y}$, and as before, the two-dimensional convolution again reduces to a summation of convolutions. If the output $y(n)$ is a narrow-band signal as compared to the sampling frequency, the samples of $y(n)$ at a lower sampling rate are sufficient to reconstruct the analog signal. In this situation the formulation discussed here can result in computational savings.

If a multidimensional formulation is considered, we can obtain partial outputs as combinations of blocks and samples.

IX. Relations to Logical Convolution and Walsh Transforms

Consider logical convolution [12] of two sequences $x(n)$ and $h(n)$ of length $N = 2^M$ each, giving an output sequence $y(n)$ also of length $N$. Logical convolution is defined similarly to the cyclic convolution of (2), but the addition and subtraction of indices is done differently. All indices are represented in binary form as an $M$-bit index. When indices have to be added or subtracted, they are added or subtracted bit by bit, modulo 2. Note that in logical convolution, addition and subtraction of indices are equivalent. We can convert this one-dimensional logical convolution problem to an $M$-dimensional convolution as in (17) and (18). Along any dimension, convolution appears as in (19), but since logical convolution is desired, $h(-1) = h(1)$. Therefore, if (18) is implemented as length-two cyclic convolutions along all the dimensions, $y(n)$ thus obtained is the logical convolution of $x(n)$ and $h(n)$. Alternatively, if (18) is implemented as a noncyclic convolution along all the dimensions, we obtain noncyclic convolution of $x(n)$ and $h(n)$ as in (26).

Length-two cyclic convolution can be implemented using just two multiplications as in (24), which is a length-two DFT implementation. Thus (18) can be implemented as an $M$-dimensional cyclic convolution using $M$-dimensional DFT's of $\hat{x}$ and $\hat{h}$, where each dimension is of length two. Alternatively, length-$N$ logical convolution can also be implemented using length-$N$ Walsh transforms of $x(n)$ and $h(n)$ [12]. The preceding development shows that the length-$(N = 2^M)$ Walsh transform is equivalent to the $M$-dimensional DFT. Thus the $M$-dimensional approach establishes the logical convolution theorem for the Walsh transforms and it also establishes the fast Walsh transform algorithm as an $M$-dimensional DFT. These facts have been noted before in the literature.

For a particular formulation of (20), Pitassi [7] and Davis [9] have shown that some of the intermediate products correspond to multiplication of the Walsh transforms of the two sequences.
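The equivalence between logical convolution and an M-dimensional cyclic convolution with every dimension of length two can be checked numerically. The sketch below assumes NumPy; it reshapes each length-2^M sequence into an M-dimensional array of shape (2, 2, ..., 2), multiplies the M-dimensional DFTs, and compares with a direct evaluation in which index subtraction is bitwise exclusive-or. The function names are illustrative.

```python
import numpy as np

def logical_conv_direct(h, x):
    """y[n] = sum_q h[n XOR q] x[q]; bitwise modulo-2 index arithmetic is XOR."""
    N = len(x)
    return np.array([sum(h[n ^ q] * x[q] for q in range(N)) for n in range(N)])

def logical_conv_mdim_dft(h, x):
    """Same result from M-dimensional DFTs with every dimension of length two."""
    N = len(x)
    M = N.bit_length() - 1                 # N = 2^M
    shape = (2,) * M                       # one axis per bit of the index
    H = np.fft.fftn(h.reshape(shape))
    X = np.fft.fftn(x.reshape(shape))
    return np.fft.ifftn(H * X).real.reshape(N)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    h = rng.integers(-5, 6, 8).astype(float)
    x = rng.integers(-5, 6, 8).astype(float)
    print(np.allclose(logical_conv_mdim_dft(h, x), logical_conv_direct(h, x)))
```

A length-two DFT uses only the factors +1 and -1, so the M-dimensional transform here involves no complex arithmetic beyond what the FFT routine carries internally, and it coincides with a Walsh-Hadamard transform of the sequence.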

X. Generalizations and Applications

There are several modifications and generalizations that are possible with the formulation used here. The first will illustrate that fast algorithms exist for sequences of length other than two.

Consider the noncyclic convolution of two length-three sequences.

$$\begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix} = \begin{bmatrix} h_0 & 0 & 0 \\ h_1 & h_0 & 0 \\ h_2 & h_1 & h_0 \\ 0 & h_2 & h_1 \\ 0 & 0 & h_2 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \end{bmatrix} \tag{46}$$

This would normally require nine multiplications and four additions. If six intermediate variables are calculated by

$$g_0 = h_0 x_0 \qquad g_3 = (h_0 + h_1)(x_0 + x_1)$$
$$g_1 = h_1 x_1 \qquad g_4 = (h_1 + h_2)(x_1 + x_2)$$
$$g_2 = h_2 x_2 \qquad g_5 = (h_2 + h_0)(x_2 + x_0) \tag{47}$$

then the desired output can be obtained by

$$y_0 = g_0$$
$$y_1 = g_3 - g_0 - g_1$$
$$y_2 = g_5 + g_1 - g_0 - g_2$$
$$y_3 = g_4 - g_1 - g_2$$
$$y_4 = g_2 \tag{48}$$

requiring six multiplications.
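The six-multiplication length-three algorithm can be verified with a worked example. The following minimal sketch uses an illustrative function name and plain Python integers.

```python
def conv3_six_mult(h, x):
    """Noncyclic convolution of two length-three sequences with six multiplications."""
    h0, h1, h2 = h
    x0, x1, x2 = x
    g0 = h0 * x0
    g1 = h1 * x1
    g2 = h2 * x2
    g3 = (h0 + h1) * (x0 + x1)
    g4 = (h1 + h2) * (x1 + x2)
    g5 = (h2 + h0) * (x2 + x0)
    return [g0,
            g3 - g0 - g1,
            g5 + g1 - g0 - g2,
            g4 - g1 - g2,
            g2]

if __name__ == "__main__":
    # Direct calculation for h = [1, 2, 3], x = [4, 5, 6] gives [4, 13, 28, 27, 18].
    print(conv3_six_mult([1, 2, 3], [4, 5, 6]))
```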

If two sequences of length $N = 3^M$ were to be convolved, then using an implementation with $M$ dimensions of length three would require for total multiplications

$$F = 6^M. \tag{49}$$

For short sequences of this length, this approach is faster than adding zeros and using the next larger power-of-two length with (39).

Similar results are possible with other lengths and a very general scheme can be developed for sequences of length

$$N = 2^M \cdot 3^S \cdots$$

using different algorithms along different dimensions in a manner similar to using a multiple radix FFT. Because of the generality of the multidimensional formulation there are mixtures of high-speed techniques that can be used.

In some situations, it may be advantageous to combine transform techniques with the fast algorithms for short convolutions discussed in this paper. One such situation uses the Fermat number transforms [2]. As discussed in Section III, using a $b$-bit implementation of Fermat number transforms, the maximum cyclic convolution length is $4b$. To cyclically convolve sequences longer than this, we need to formulate a two-dimensional convolution as in Section II. Assume $x(n)$, $h(n)$, and $y(n)$ are sequences of length $N = 8b$ each. Consider the formulation of (7), with $M = 4b$ and $L = 2$. Equation (7) can be rewritten as

$$\hat{y}(0, m) = \sum_{p=0}^{M-1} \hat{h}(0, m-p)\, \hat{x}(0, p) + \sum_{p=0}^{M-1} \hat{h}(-1, m-p)\, \hat{x}(1, p) \tag{50}$$

$$\hat{y}(1, m) = \sum_{p=0}^{M-1} \hat{h}(1, m-p)\, \hat{x}(0, p) + \sum_{p=0}^{M-1} \hat{h}(0, m-p)\, \hat{x}(1, p), \qquad m = 0, 1, \ldots, M-1. \tag{51}$$

In these equations, each summation represents a cyclic convolution of length $4b$, which can be carried out by Fermat transforms. Taking length-$4b$ Fermat transforms of all the sequences in (51), we obtain

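$$
\begin{aligned}
T\{\hat{y}(0, l)\} &= T\{\hat{h}(0, l)\}\, T\{\hat{x}(0, l)\} + T\{\hat{h}(-1, l)\}\, T\{\hat{x}(1, l)\} \\
T\{\hat{y}(1, l)\} &= T\{\hat{h}(1, l)\}\, T\{\hat{x}(0, l)\} + T\{\hat{h}(0, l)\}\, T\{\hat{x}(1, l)\}.
\end{aligned}
\tag{52}
$$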

These equations are similar to (19) and we could employ the tricks of (20) and (21) along the first dimension, giving

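$$
\begin{aligned}
T\{\hat{y}(0, l)\} &= \left[ T\{\hat{h}(0, l)\} + T\{\hat{h}(-1, l)\} \right] T\{\hat{x}(1, l)\}
 + \left[ T\{\hat{x}(0, l)\} - T\{\hat{x}(1, l)\} \right] T\{\hat{h}(0, l)\} \\
T\{\hat{y}(1, l)\} &= \left[ T\{\hat{h}(0, l)\} + T\{\hat{h}(1, l)\} \right] T\{\hat{x}(0, l)\}
 - \left[ T\{\hat{x}(0, l)\} - T\{\hat{x}(1, l)\} \right] T\{\hat{h}(0, l)\}.
\end{aligned}
\tag{53}
$$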

Note that $\hat{h}(-1, m) = \hat{h}(1, m-1)$; thus $\hat{h}(-1, l)$ is the cyclic shift by one position of $\hat{h}(1, l)$. We can employ the cyclic shift theorem to compute $T\{\hat{h}(-1, l)\}$ from $T\{\hat{h}(1, l)\}$. Assuming the $\hat{h}$ transforms are precalculated, this method requires computation of two length-($4b$) $\hat{x}$ transforms, $12b$ multiplications and $12b$ additions/subtractions to compute the $\hat{y}$ transforms using (53), and two length-($4b$) $\hat{y}$ inverse transforms. This is efficient because the only extra computation is $4b$ extra multiplications, and it is better than the two-dimensional Fermat number transform approach, which requires roughly twice the amount of computation. This use of $2 \times M$ convolution has a small computational advantage even when used with the FFT where, in effect, the last stage of the FFT algorithm is replaced by one of the fast length-two algorithms.
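The scheme of (50) through (53) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: it uses the Fermat number 257 ($b = 8$) with the root 2, which only supports transform length $2b = 16$ (reaching $4b$ requires a square root of 2 as the root, which is not shown here), it computes the transforms by the $O(M^2)$ definition for clarity, and the names `fnt`, `ifnt`, `cyclic_2xm`, and `cyclic_direct` are hypothetical. Results are exact only modulo 257.

```python
F = 257     # Fermat number 2**8 + 1 (assumed modulus for this sketch)
ALPHA = 2   # 2**8 = -1 (mod 257), so 2 has multiplicative order 16
M = 16      # transform length supported by ALPHA (assumption: 2b, not 4b)


def fnt(seq, root=ALPHA):
    """Length-M number theoretic transform modulo F, straight from the definition."""
    return [sum(v * pow(root, n * k, F) for n, v in enumerate(seq)) % F
            for k in range(M)]


def ifnt(spec):
    """Inverse transform: inverse root, then scale by the inverse of M (Python 3.8+)."""
    inv_root = pow(ALPHA, -1, F)
    inv_m = pow(M, -1, F)
    return [(inv_m * v) % F for v in fnt(spec, inv_root)]


def cyclic_direct(h, x):
    """Reference cyclic convolution modulo F."""
    n = len(h)
    return [sum(h[(i - k) % n] * x[k] for k in range(n)) % F for i in range(n)]


def cyclic_2xm(h, x):
    """Length-2M cyclic convolution via (53): three products per transform point."""
    h0, h1 = h[0::2], h[1::2]        # h(0, m) and h(1, m)
    x0, x1 = x[0::2], x[1::2]        # x(0, m) and x(1, m)
    H0, H1 = fnt(h0), fnt(h1)
    # Cyclic shift theorem: h(-1, m) = h(1, m - 1), so its transform is ALPHA**k * H1[k].
    Hm1 = [(pow(ALPHA, k, F) * v) % F for k, v in enumerate(H1)]
    X0, X1 = fnt(x0), fnt(x1)
    Y0, Y1 = [], []
    for k in range(M):
        common = ((X0[k] - X1[k]) * H0[k]) % F          # shared product
        Y0.append(((H0[k] + Hm1[k]) * X1[k] + common) % F)
        Y1.append(((H0[k] + H1[k]) * X0[k] - common) % F)
    y = [0] * (2 * M)
    y[0::2], y[1::2] = ifnt(Y0), ifnt(Y1)               # interleave the two rows
    return y


if __name__ == "__main__":
    h = [(3 * i + 1) % 7 for i in range(2 * M)]
    x = [(5 * i + 2) % 11 for i in range(2 * M)]
    assert cyclic_2xm(h, x) == cyclic_direct(h, x)
```

In a real filter the sums `H0 + Hm1` and `H0 + H1` would be precalculated along with the transforms of the filter, which is why only three multiplications per transform point are counted.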

There are many other possibilities of combinations of Fermat and Fourier transforms of various lengths and dimensions with short convolution algorithms or with the use of special hardware. The arbitrary order of the various operations can also be used to advantage. As pointed out by Rader [11] and illustrated in our discussion of partial outputs and in (53), it is often more efficient to take transforms along one dimension and then convolve along another before taking the inverse transform.

To illustrate the efficiencies of some of the techniques of this paper, a comparison of a particular case will be made with direct and FFT implementations. Consider the problem of noncyclic convolution of two length-($N = 2^M$) sequences to give an output of length $2N - 1$ as described in (34). The number of multiplications per output point will be calculated for three implementations. First, consider a direct implementation which requires $N^2$ multiplications for $2N - 1$ output points (we will approximate this by $2N$ for simplification). This gives for multiplications per output point

$$F_0 = \tfrac{1}{2} N.$$

If the FFT is used in an efficient way, taking advantage of the fact that the data are real and using a mixed radix algorithm (2, 4, and 8), the number of multiplications necessary per output point can be calculated as described by Singleton [13] and is denoted $F_1$.

Using the $M$-dimensional formulation with the fast algorithm of (33) or (44) gives for the multiplications per output point

$$F_2 = \frac{1}{2N}\, 3^M.$$

Table II compares these functions for various lengths up to 1024. Note the multidimensional implementation is more efficient than a direct implementation for all lengths and requires fewer multiplications than the FFT for lengths up to 128. If a less efficient FFT implementation requiring $4 \log N + 4$ multiplications per output point is used, the crossover length is above 2048.
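The direct and multidimensional counts, and the crossover against the simpler FFT estimate, are easy to tabulate. The sketch below assumes the logarithm in $4 \log N + 4$ is base 2 (the text does not state the base), and the function names are illustrative. The Singleton-based $F_1$ values are not reproduced, since they depend on the details in [13].

```python
def table_ii_row(M):
    """Multiplications per output point for length N = 2**M (about 2N outputs)."""
    N = 2 ** M
    f0 = N / 2               # direct: N**2 multiplications
    f2 = 3 ** M / (2 * N)    # M-dimensional fast algorithm: 3**M multiplications
    f1_simple = 4 * M + 4    # "less efficient" FFT estimate, log base 2 assumed
    return N, f0, f2, f1_simple


def first_length_where_fft_wins(max_M=20):
    """Smallest N = 2**M at which the simple FFT estimate needs fewer multiplications."""
    for M in range(1, max_M + 1):
        N, _, f2, f1_simple = table_ii_row(M)
        if f2 > f1_simple:
            return N
    return None


if __name__ == "__main__":
    print("    N      F0      F2")
    for M in range(1, 11):
        N, f0, f2, _ = table_ii_row(M)
        print(f"{N:5d} {f0:7.1f} {f2:7.2f}")
    print(first_length_where_fft_wins())   # 4096 under the base-2 assumption
```

Under that assumption the multidimensional count is still lower at $N = 2048$ and first exceeds the estimate at $N = 4096$, consistent with a crossover above 2048.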

The multidimensional approach can also be used in the same way the FFT is used to implement ongoing processing or sectioning [4]. In contrast to use with the FFT, this approach is most efficient when the length of the section or block is the same as the length of the convolution operator.

Initial observations indicate that block implementation of recursive filters [3] becomes more attractive when used with the techniques described in this paper. To illustrate this, we will again consider the multiplication efficiencies for three realizations. First consider a recursive filter with an equal order numerator and denominator of $N = 2^M$. The multiplications per output point for direct implementation are

$$F_0 = 2N.$$

Using efficient FFT algorithms, the results given by the methods in [3] are denoted $F_1$. Using three convolutions by (33) or (44) to implement the block recursive filter, with the block length equal to the order, gives as the multiplications per output point

$$F_2 = \frac{3 \cdot 3^M}{N}.$$

Table III compares these multiplication efficiencies for orders up to 128. Note the multidimensional approach is more efficient than the direct for orders above three and more efficient than the FFT for orders up to about 256. This is yet to be explored in detail.
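A short sketch of the two closed-form counts for the block recursive filter follows. It assumes, as the three-convolution description suggests, that each block of $N$ outputs costs three length-$N$ convolutions of $3^M$ multiplications each; the function names are illustrative, and the FFT-based $F_1$ values from [3] are not modeled.

```python
def table_iii_row(M):
    """Multiplications per output point for a block recursive filter of order N = 2**M."""
    N = 2 ** M
    f0 = 2 * N               # direct recursive implementation
    f2 = 3 * 3 ** M / N      # assumed: three convolutions of 3**M multiplications per N outputs
    return N, f0, f2


def first_order_where_multidim_wins(max_M=10):
    """Smallest order N = 2**M at which F2 drops below the direct count F0."""
    for M in range(1, max_M + 1):
        N, f0, f2 = table_iii_row(M)
        if f2 < f0:
            return N
    return None


if __name__ == "__main__":
    for M in range(1, 8):
        N, f0, f2 = table_iii_row(M)
        print(f"{N:4d} {f0:5d} {f2:7.2f}")
    print(first_order_where_multidim_wins())   # 4, i.e. orders above three
```

The computed values agree with the $F_2$ column of Table III up to the last printed digit.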

TABLE II
A Comparison of Multiplication Efficiencies for Three Implementations of Length-N Convolution

     N      F0      F1      F2
     2       1    1.5     0.75
     4       2    2.25    1.12
     8       4    4.12    1.70
    16       8    5.06    2.53
    32      16    6.03    3.80
    64      32    8.01    5.70
   128      64    9.00    8.54
   256     128   10.00   12.81
   512     256   12.00   19.22
  1024     512   13.00   28.83

TABLE III
A Comparison of Multiplication Efficiencies for Three Implementations of Block Recursive Filters of Order N

     N      F0      F1      F2
     2       4     [?]     4.5
     4       8      21     6.7
     8      16      31    10.1
    16      32      38    15.2
    32      64     [?]    22.8
    64     128      53    34.2
   128     256     [?]    51.2

XI. Conclusions

This paper has presented two formulations of convolution in terms of multidimensional convolution: one based on a generalization of the overlap-save algorithm for sectioning and the other on the overlap-add algorithm. The first proved to be well suited for cyclic and constant-diagonal convolution and the second for noncyclic convolution. Fast algorithms were developed based on length-two and -three convolution that lead to an improvement in multiplication efficiency. The reduction in required word lengths proved to be a valuable feature when used with the Fermat number transform. The formulation proved to be well suited for the special cases where unequal-length sequences were convolved or where only a portion of the output was desired. It was further shown that various mixtures of algorithms could be used along the different dimensions to achieve certain advantages or to fit particular requirements. Finally, examples were presented to compare the multiplication efficiencies of a few implementations.

The formulation is so general that a complete and systematic investigation of all possible applications is difficult. The main ideas and relations to other works that we know of have been presented here. The investigation of word length and storage requirements and a more complete consideration of recursive implementations is still to be made.

Acknowledgment

The authors would like to thank R. A. Meyer for valuable discussions.



References

[1] C. M. Rader, "Discrete convolutions via Mersenne transforms," IEEE Trans. Comput., vol. C-21, pp. 1269-1273, Dec. 1972.
[2] R. C. Agarwal and C. S. Burrus, "Fast digital convolutions using Fermat transforms," in Southwestern IEEE Conf. Rec., Houston, Tex., Apr. 1973, pp. 538-543.
[3] C. S. Burrus, "Block realization of digital filters," IEEE Trans. Audio Electroacoust., vol. AU-20, pp. 230-235, Oct. 1972.
[4] B. Gold and C. M. Rader, Digital Processing of Signals. New York: McGraw-Hill, 1969, pp. 208-211.
[5] R. A. Meyer and C. S. Burrus, "Certain properties of periodically time-varying digital filters," in Southwestern IEEE Conf. Rec., Houston, Tex., Apr. 1973, pp. 529-535.
[6] D. P. MacAdam, "Image restoration by constrained deconvolution," J. Opt. Soc. Amer., vol. 60, pp. 1617-1627, Dec. 1970.
[7] D. A. Pitassi, "Fast convolution using the Walsh transforms," in Proc. Conf. Applications of Walsh Functions, Washington, D.C., Apr. 1971, pp. 130-133.
[8] P. J. W. Rayner, "A fast cyclic convolution algorithm," presented at Symp. Digital Filtering, Imperial College, London, England, Aug. 1971.
[9] W. F. Davis, "A class of efficient convolution algorithms," in Proc. Symp. Applications of Walsh Functions, Washington, D.C., Mar. 1972, pp. 318-329.
[10] J. C. Allwright, "Real factorization of noncyclic convolution operators with applications to fast convolution," Electron. Lett., vol. 7, pp. 718-719, Dec. 1971.
[11] C. M. Rader, "An improved algorithm for high speed autocorrelation with applications to spectral estimation," IEEE Trans. Audio Electroacoust., vol. AU-..., pp. 439-441, Dec. 1970.
[12] G. S. Robinson, "Logical convolution and discrete Walsh and Fourier power spectra," IEEE Trans. Audio Electroacoust., vol. AU-20, pp. 271-280, Oct. 1972.
[13] R. C. Singleton, "An algorithm for computing the mixed radix fast Fourier transform," IEEE Trans. Audio Electroacoust., vol. AU-17, pp. 93-103, June 1969.

Digital Notch Filter Design Procedure

JAMES A. CADZOW

Abstract-An analytical procedure for designing a linear digital notch filter is presented. The resultant filter is sixth-order and is implemented by cascading three second-order filters so as to avoid instability which may arise from computer coefficient truncation. The procedure outlined is straightforward, requires only simple algebraic steps, and gives filter parameter selection criteria for reducing the effects of computer coefficient truncation.

Notch filters have utility in situations where a desired signal is corrupted by an additive sinusoidal pickup. One thus must process the noisy signal so as to remove the sinusoid without significantly distorting the desired signal.

I. Introduction

A frequently occurring linear data filtering application occurs when one wishes to process a signal of the form

$$u(t) = s(t) + A \sin \omega_0 t$$

so as to remove the additive sinusoidal component, $A \sin \omega_0 t$, without seriously distorting the desired signal $s(t)$. The situation in which a 60-Hz sinusoidal pickup corrupts a desired measurement signal nicely illustrates how such problems can arise in practice.

The ideal filter for the above application would then have a response $s(t)$ to the input $u(t)$. In the frequency domain, the required linear filter would have a gain of one for all frequencies except at $\omega_0$ where its gain is zero. As such, this processor is typically called a notch filter with notch at $\omega_0$. Its gain-frequency behavior is depicted in Fig. 1.

Unfortunately, the ideal notch filter is not physically realizable and must be approximated in practice. If one attempted to implement a notch filter approximation using analog devices (i.e., resistors, capacitors, and inductors), one would quickly realize the futility of this approach. On the other hand, one may readily design a digital filter whose frequency behavior closely resembles that shown in Fig. 1 (e.g., see [1]-[4]).

The approach to be taken in this paper is to then uniformly sample the signal $u(t)$ (every $T$ seconds) and use the resulting sequence as the input to a digital filter governed by

$$
y(k) = b_0 u(k) + b_1 u(k-1) + \cdots + b_m u(k-m) - a_1 y(k-1) - a_2 y(k-2) - \cdots - a_n y(k-n)
$$

where $u(k)$ and $y(k)$ denote the values of the filter's input and output signals, respectively, at the $k$th iteration. A procedure for selecting the coefficients $a_i$ and $b_i$ whereby the filter's gain factor-frequency behavior will be similar to that shown in Fig. 1 will be shortly given. One must realize, however, that since the filter is digital, its frequency behavior will be periodic with period $2\pi/T$ (e.g., [1, p. 297]) and will appear as shown in Fig. 2.

The selection of the sampling period $T$ to be used is of great importance from a number of viewpoints. Most importantly, it must be chosen small enough so that little distortion results from the analog-to-digital conversion of the desired signal $s(t)$. Quantitative
