IEEE TRANSACTIONS ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL. ASSP-22, NO. 1, FEBRUARY 1974
Fast One-Dimensional Digital Convolution by Multidimensional Techniques

RAMESH C. AGARWAL and CHARLES S. BURRUS, Member, IEEE
Abstract: This paper presents two formulations of multidimensional digital signals from one-dimensional digital signals so that multidimensional convolution will implement one-dimensional convolution of the original signals. This has reduced an important word length restriction when used with the Fermat number transform. The formulation is very general and includes block processing and sectioning as special cases and, when used with various fast algorithms for short length convolutions, results in improved multiplication efficiency.
I. Introduction
There are several advantages to formulating a one-dimensional digital convolution as a two- or higher dimensional problem. The first involves the Mersenne and Fermat number transforms which have recently been defined [1], [2] and which seem to have some advantages over the discrete Fourier transform (DFT) for implementing convolution on a digital computer. They can be computationally faster than the fast Fourier transform (FFT) implementation of the DFT and result in no roundoff error. These transforms have one limitation: the number of bits required for each word in an implementation is proportional to the length of the sequences to be convolved [1], [2].
It is the purpose of this paper to present a scheme whereby long sequences can be convolved by a two-dimensional convolution as mentioned by Rader [1]. This two-dimensional convolution can be implemented by a two-dimensional transform to allow a high-speed error-free convolution with the word length proportional to the square root of the length of the sequences.
This formulation also allows use of special short convolution algorithms similar to those proposed by Pitassi [7], Rayner [8], Davis [9], and Allwright [10]. These can be extended and combined with others to provide a very general and versatile format for doing fast digital convolution. The case of sequences of different lengths and the case where only part of the output sequence is desired are covered as special cases of the general problem.

Manuscript received March 15, 1973; revised July 10, 1973 and September 20, 1973. This work was supported by the National Science Foundation under Grant GK-23697.
The authors are with the Department of Electrical Engineering, Rice University, Houston, Tex. 77001.
II. Two-Dimensional Convolution Based on Overlap-Save
Consider the cyclic convolution of two sequences, $x(n)$ and $h(n)$, giving an output sequence $y(n)$, all of length $N$.
$$h(n) * x(n) = y(n). \tag{1}$$
This is defined by
$$y(n) = \sum_{q=0}^{N-1} h(n - q)\,x(q), \qquad n = 0, 1, \ldots, N-1 \tag{2}$$
where $h(n)$ and $x(n)$ are periodically extended outside their original domains of definition (or their indices evaluated modulo $N$).
In order to convert this one-dimensional problem to two dimensions a change of variables is made. First, it must be possible to factor $N$ into integer factors
$$N = LM. \tag{3}$$
If a change of variables is made such that
$$n = l + mL, \qquad k, l = 0, 1, \ldots, L-1$$
$$q = k + pL, \qquad m, p = 0, 1, \ldots, M-1$$
then (2) becomes
$$y(l + mL) = \sum_{p=0}^{M-1} \sum_{k=0}^{L-1} h(l + mL - k - pL)\,x(k + pL). \tag{4}$$
Define now a two-dimensional $L \times M$ array $\hat{X}$ from the original length $N = LM$ signal, $x(n)$, by
$$\hat{x}(l, m) = x(l + mL) \tag{5}$$
where columns of $\hat{X}$ are the sections or blocks of $x$ and the rows are samples of $x$ taken every $L$ values of $n$. In a similar way $\hat{H}$ and $\hat{Y}$ are defined by
$$\hat{h}(l, m) = h(l + mL), \qquad \hat{y}(l, m) = y(l + mL). \tag{6}$$
In terms of the two-dimensional signals, (4) becomes
$$\hat{y}(l, m) = \sum_{p=0}^{M-1} \sum_{k=0}^{L-1} \hat{h}(l - k, m - p)\,\hat{x}(k, p) \tag{7}$$
which is a two-dimensional convolution. Note that values of $\hat{H}$ outside the $L \times M$ array are required. The values that are needed can be seen from (7), where $1 - L \le l - k \le L - 1$ and $1 - M \le m - p \le M - 1$. Values of $\hat{h}(l, m)$ outside the $L \times M$ array are defined by (6). We therefore define suitably extended arrays $H$ and $X$ so that two-dimensional convolution will give the desired answer. The extension of $H$ is analogous to the overlap extensions used in the overlap-save algorithm for doing sectioning or block processing [3], [4]. Note that along the $m$ dimension the values of $H$ are periodic with period $M$, i.e., the desired convolution in (7) is cyclic in the $m$ dimension (this is because the original desired convolution in (2) was defined as cyclic). Considering the values of $H$ along $l$ shows the convolution in that dimension is not cyclic.
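The change of variables can be checked numerically. The following is a minimal sketch, assuming plain Python lists for the sequences; the function names `cyclic_conv` and `conv_2d_mapped` are illustrative and do not come from the paper. It evaluates the double sum in (7) directly and returns the result in one-dimensional order so it can be compared with (2).

```python
def cyclic_conv(h, x):
    """Direct length-N cyclic convolution, as in (2)."""
    n_len = len(x)
    return [sum(h[(n - q) % n_len] * x[q] for q in range(n_len))
            for n in range(n_len)]


def conv_2d_mapped(h, x, L, M):
    """Evaluate the double sum (7) for N = L*M and return y(n) in 1-D order."""
    N = L * M
    if len(h) != N or len(x) != N:
        raise ValueError("both sequences must have length L*M")

    def h_hat(l, m):
        # (6), with h periodically extended: the index is taken modulo N
        return h[(l + m * L) % N]

    def x_hat(k, p):
        # (5): column p holds block p of x
        return x[k + p * L]

    y = [0] * N
    for m in range(M):
        for l in range(L):
            y[l + m * L] = sum(h_hat(l - k, m - p) * x_hat(k, p)
                               for p in range(M) for k in range(L))
    return y
```

For a length-6 pair of sequences, both factorings (L = 2, M = 3 and L = 3, M = 2) should reproduce the direct cyclic convolution.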
An important factor when considering multidimensional convolution is the number of multiplications required in an implementation. If in (7) $m$ and $p$ are held constant then (7) becomes a scalar convolution along the $l$ dimension of a length-$L$ sequence of $x$ with a length-$(2L - 1)$ sequence of $h$ giving a length-$L$ sequence of the output. There will be one of these length-$L$ scalar convolutions for each value of $m$ and $p$ in (7) so that along the $m$ dimension each "operation" will be a length-$L$ convolution rather than a simple scalar multiplication. This is a sort of convolution of convolutions [3]. To count the total number of multiplications needed to compute (7), the number of multiplications along the $m$ dimension with $l$ and $k$ held constant is found first. With $l$ and $k$ not constant, this is the number of length-$L$ convolutions necessary, and the total number of multiplications is the number of length-$L$ convolutions times the number of multiplications for a length-$L$ convolution. In this case along the $m$ dimension the number of multiplications is $M^2$ and along the $l$ dimension it is $L^2$, so the total is $M^2 L^2$ or $N^2$, which is the same as a direct calculation of (2) would require. We will later use transforms and other schemes to reduce this number. Note the convolution can be carried out in either order.
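If (7) is to be carried out by transforms and therefore cyclic convolution, $\hat{X}$ must be augmented with zeros so that aliasing of the noncyclic convolution along $l$ does not occur and so that all arrays are the same size. Consider the $(2L - 1) \times M$ array $X$ formed by appending $(L - 1)$ rows of zeros to the bottom of $\hat{X}$.
$$X = \begin{bmatrix} x(0) & x(L) & \cdots & x(N - L) \\ x(1) & x(L + 1) & \cdots & x(N - L + 1) \\ \vdots & \vdots & & \vdots \\ x(L - 1) & x(2L - 1) & \cdots & x(N - 1) \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix}. \tag{8}$$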
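$H$ is formed so that the columns of $H$ contain the periodic extension of the original $h(n)$ with period $N$.
$$H = \begin{bmatrix} h(N - L + 1) & h(1) & \cdots & h(N - 2L + 1) \\ \vdots & \vdots & & \vdots \\ h(N - 1) & h(L - 1) & \cdots & h(N - L - 1) \\ h(0) & h(L) & \cdots & h(N - L) \\ h(1) & h(L + 1) & \cdots & h(N - L + 1) \\ \vdots & \vdots & & \vdots \\ h(L - 1) & h(2L - 1) & \cdots & h(N - 1) \end{bmatrix}. \tag{9}$$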
If two-dimensional cyclic convolution is carried out we have
$$Y = H * X \tag{10}$$
where the lower $L \times M$ partition of $Y$ is $\hat{Y}$ and the columns of $\hat{Y}$ are the desired blocks of $y(n)$. Because of ease in implementation with transforms, the arrays would usually be extended one additional row to be $2L \times M$ rather than the minimum $(2L - 1) \times M$.
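The augmented arrays can be built and tested in a few lines. This is a sketch under stated assumptions: arrays are stored as lists of rows, the minimum $(2L - 1) \times M$ size is used, the two-dimensional cyclic convolution is evaluated directly rather than by transforms, and the function names are hypothetical.

```python
def augmented_arrays(h, x, L, M):
    """Build the (2L-1) x M arrays H and X described by (9) and (8)."""
    N = L * M
    rows = 2 * L - 1
    # X: the L x M block array of x with L-1 rows of zeros appended
    X = [[x[r + m * L] if r < L else 0 for m in range(M)]
         for r in range(rows)]
    # H: row r holds h(r - (L-1) + mL), indices taken modulo N
    H = [[h[(r - (L - 1) + m * L) % N] for m in range(M)]
         for r in range(rows)]
    return H, X


def cyclic_conv_2d(H, X):
    """Direct two-dimensional cyclic convolution Y = H * X."""
    R, M = len(X), len(X[0])
    return [[sum(H[(r - k) % R][(m - p) % M] * X[k][p]
                 for k in range(R) for p in range(M))
             for m in range(M)] for r in range(R)]


def conv_via_2d(h, x, L, M):
    """Length-N cyclic convolution read from the lower L x M part of Y."""
    H, X = augmented_arrays(h, x, L, M)
    Y = cyclic_conv_2d(H, X)
    lower = Y[L - 1:]  # the lower L rows of Y
    # Column m of the lower partition is block m of y(n)
    return [lower[l][m] for m in range(M) for l in range(L)]
```

The upper $L - 1$ rows of `Y` are aliased and are simply discarded, which mirrors the discarded outputs of the overlap-save method.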
III. Two-Dimensional Transform Implementation
The two-dimensional transform of $X$ is defined as
$$T\{X\} = F(j, k) = \sum_{m=0}^{M-1} \sum_{l=0}^{2L-1} X(l, m)\,\alpha_{2L}^{lj}\,\alpha_M^{mk} \tag{11}$$
and the inverse transform
$$T^{-1}\{F\} = X(l, m) = (2N)^{-1} \sum_{k=0}^{M-1} \sum_{j=0}^{2L-1} F(j, k)\,\alpha_{2L}^{-lj}\,\alpha_M^{-mk} \tag{12}$$
where $\alpha_M$ is of order $M$ (i.e., $M$ is the least positive integer such that $\alpha_M^M = 1$ [2]). Applying the transform to (10) it can be shown that
$$T\{Y\} = T\{H\}\,T\{X\} \tag{13}$$
so that, similar to the one-dimensional case, (10) can be carried out by
$$Y = T^{-1}[T\{H\}\,T\{X\}]. \tag{14}$$
If the transform is the DFT, then $\alpha_M = e^{-i 2\pi / M}$.
To compare multiplication efficiencies we assume the DFT of $H$ is already known and the number of multiplications for one $2L \times M$ transform, one $2L \times M$ complex multiplication, and one $2L \times M$ inverse transform are calculated. The number of complex multiplications is approximately $(2N \log 2N + 2N)$ as compared to $(N \log N + N)$ for a one-dimensional implementation and therefore one would not use the two-dimensional approach with the DFT for improved multiplication efficiency.
The computational advantage appears when used with the Fermat number transform [2] where word length requirements are a possible restriction. The Fermat number transform is defined in [2] and, although not named, is defined in a restricted form in the latter part of [1, eq. (38)]. It is a transform defined in a finite ring of integers with arithmetic performed modulo Fermat numbers ($2^b + 1$, $b = 2^t$), with $\alpha = 2$ or $\alpha = \sqrt{2}$, and having the property that multiplication of Fermat number transforms corresponds to conventional cyclic convolution (2) modulo $2^b + 1$. To perform convolution with the transform requires $N$ real multiplications and a number of additions and word shifts proportional to $N \log N$. Unfortunately the transform requires word lengths proportional to the length of the sequences to be convolved [1], [2]. Since the lengths of the two dimensions are $2L$ and $M$ rather than $N = LM$ for the one-dimensional signal, the word-length requirement using the two-dimensional transform is proportional to the square root of $N$ rather than to $N$ as for the one-dimensional problem. It is this reduction in the necessary word length that makes the two-dimensional formulation attractive with the Fermat number transform.
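A small numerical sketch may make the transform concrete. It assumes $b = 8$ (modulus $2^8 + 1 = 257$) and $\alpha = 2$, which has order $2b = 16$ modulo 257, so the sequences must have length 16. The transform is written as a naive $O(N^2)$ sum for clarity; a practical implementation would use an FFT-like structure in which multiplications by powers of 2 become word shifts. The function names and the parameter `word_bits` are illustrative, not taken from [1] or [2].

```python
def number_transform(seq, modulus, alpha):
    """Naive transform F(k) = sum_n seq[n] * alpha**(n*k), taken mod modulus."""
    n_len = len(seq)
    return [sum(v * pow(alpha, n * k, modulus) for n, v in enumerate(seq)) % modulus
            for k in range(n_len)]


def fermat_cyclic_conv(h, x, word_bits=8):
    """Cyclic convolution modulo 2**word_bits + 1 using alpha = 2."""
    modulus = 2 ** word_bits + 1
    alpha = 2
    n_len = 2 * word_bits  # alpha = 2 has order 2b modulo 2**b + 1
    if len(h) != n_len or len(x) != n_len:
        raise ValueError("sequence length must equal 2 * word_bits")
    H = number_transform(h, modulus, alpha)
    X = number_transform(x, modulus, alpha)
    product = [(a * c) % modulus for a, c in zip(H, X)]
    alpha_inv = pow(alpha, n_len - 1, modulus)  # alpha**(N-1) inverts alpha
    n_inv = pow(n_len, -1, modulus)             # requires Python 3.8 or later
    y = number_transform(product, modulus, alpha_inv)
    return [(n_inv * v) % modulus for v in y]
```

The result is exact (no roundoff) as long as every true output value lies below the modulus, which is the word-length restriction discussed above.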
The consequence of this reduction in word length is of considerable practical importance. For example, using a word length of 16 b and the Fermat number transform [2] with $\alpha = 2$ to compute the complete noncyclic convolution of two sequences of equal length, the one-dimensional implementation restricts the sequence length to a maximum of 16, whereas the two-dimensional implementation increases the maximum to 256. A summary of the restrictions on sequence lengths is shown in Table I for the most practical word lengths and for two values of $\alpha$. Note that one-dimensional length restrictions would be too severe for many applications but the two-dimensional restrictions would include most practical filters. If cyclic convolution is desired, then all the length restrictions are doubled since the addition of zeros to prevent aliasing is unnecessary.
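The entries of Table I follow from simple arithmetic, sketched here. The sketch assumes the longest transform for a $b$-bit word is $2b$ for $\alpha = 2$ and $4b$ for $\alpha = \sqrt{2}$ (the $4b$ figure is quoted later in the paper; the $2b$ figure is the order of 2 modulo $2^b + 1$). The function name and the flag `alpha_is_sqrt2` are hypothetical.

```python
def max_sequence_lengths(word_bits, alpha_is_sqrt2=False):
    """Return (one_d, two_d) maximum lengths for noncyclic convolution."""
    # Longest transform for a b-bit word: 2b for alpha = 2, 4b for sqrt(2)
    t_len = (4 if alpha_is_sqrt2 else 2) * word_bits
    cyclic_1d = t_len                 # one transform of length t_len
    cyclic_2d = (t_len // 2) * t_len  # 2L = t_len and M = t_len, so N = L*M
    # Noncyclic convolution of equal-length sequences needs zero padding to
    # twice the sequence length, which halves both limits.
    return cyclic_1d // 2, cyclic_2d // 2
```

For a 16-bit word and $\alpha = 2$ this gives 16 and 256, the figures quoted in the text.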
The two-dimensional transform (or inverse transform) can be taken in either order. There is, however, a computational advantage in taking the transform first along the $m$ direction (length $M$) and then along the $l$ direction (length $2L$); half the $X$ sequences along the $m$ direction are zero and half the $H$ sequences along the $m$ direction are cyclically shifted by one position of the other half sequences. Also, while taking the inverse transform, there is an advantage to first taking the inverse transform along $l$, then along $m$, because we need only half the $Y$ sequences and therefore only half the sequences need be inverted along $m$.

TABLE I
Sequence Length Restrictions for One- and Two-Dimensional Implementation of Noncyclic Convolution by the Fermat Number Transform

Word Length    Transform    Maximum Sequence Length N/2
(Bits)         Basis α      1-D        2-D
16             2            16         256
16             √2           32         1024
32             2            32         1024
32             √2           64         4096
64             2            64         4096
64             √2           128        16 384
IV. Generalizations and the Inverse Problem
A generalization of the approach in this paper to higher orders is fairly obvious. For example, $N$ could be factored into three integer factors $N = LMP$, as was done with two factors in (3). The signals $x(n)$ and $h(n)$ would then be redefined as three-dimensional $L \times M \times P$ arrays and (2) converted to a three-dimensional convolution by a change of variables as was done in two dimensions in (4)-(7). For $x$ this would be
$$\hat{x}(l, m, p) = x(l + mL + pML) \tag{15}$$
and with similar definitions for $\hat{h}$ and $\hat{y}$ and after augmentation to prevent aliasing, (2) would become a three-dimensional convolution.
We would then have one-dimensional cyclic convolution of $N$-length sequences being carried out by three-dimensional cyclic convolution with dimensions of lengths $2L$, $2M$, and $P$. Use of orders higher than two does not seem needed with the Fermat number transform at this time, but will be exploited with other schemes in the next sections.
Still another variation would apply to the case where the filter is periodically time-varying. This, when converted to a two-dimensional problem with $L$ equal to the period of the filter, becomes time-invariant along one dimension, $m$, and time-varying along the other, $l$ [5]. The DFT or Fermat transform could be applied to the two-dimensional signal along the time-invariant dimension and either direct calculation or another type of transform applied to the other dimension.
The inverse of the above problem can be considered where one wishes to implement two-dimensional convolution by one-dimensional methods. This involves reversing the process presented in Section II by constructing one-dimensional sequences from the given arrays. MacAdams [6] has presented a scheme for doing this that can be seen to be the inverse of the problem addressed in this paper and can be applied to cyclic or noncyclic two-dimensional convolution.
V. M-Dimensional Convolution
If the length of the signals to be cyclically convolved in (2) can be written $N = 2^M$ then the methods used in (5) and (15) can be extended to define an $M$-dimensional signal with each dimension of length two. As before, this is done by a change of variables.
$$\hat{x}(l, m, p, \ldots) = x(l + 2m + 4p + \cdots), \qquad l, m, p, \ldots = 0, 1. \tag{17}$$
After a similar definition for $\hat{h}$ and $\hat{y}$, cyclic convolution in (2) becomes
$$\hat{y}(l, m, p, \ldots) = \sum_{k=0}^{1} \sum_{j=0}^{1} \sum_{g=0}^{1} \cdots\, \hat{h}(l - k, m - j, p - g, \ldots)\,\hat{x}(k, j, g, \ldots) \tag{18}$$
for $l, m, p, \ldots = 0, 1$.
Both $\hat{X}$ and $\hat{Y}$ are $M$-dimensional with each dimension of length two and $\hat{H}$ is also $M$-dimensional but must be defined with dimensions of length three in order to carry out (18). This is the same as was required in two dimensions where $H$ was defined in (9). Since the original convolution in (2) was defined as cyclic, the convolution along the last dimension in (18) is also cyclic so that $\hat{H}$ need not be extended along that dimension but can be evaluated for this index modulo 2.
This is an extremely general and versatile formulation for the original problem that can be used to improve computational efficiency. The convolution along each dimension of length two can be written in terms of scalar variables as
$$\begin{bmatrix} y_0 \\ y_1 \end{bmatrix} = \begin{bmatrix} h_0 & h_{-1} \\ h_1 & h_0 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \end{bmatrix}. \tag{19}$$
The convolution of (18) can be viewed as $M$ nested length-2 convolutions, each separately of the form shown in (19) requiring four multiplications. Using the same reasoning that was explained for the two-dimensional formulation, the total number of multiplications is $F = 4^M$. Since $N = 2^M$ this becomes $F = N^2$, which is again the same as would be required by directly calculating (2).
VI. A High Speed Algorithm
With this formulation of scalar convolution in terms of multidimensional short convolutions, various tricks for efficient short convolution can be used. Consider one of several possible algorithms suggested by Pitassi [7] where, rather than directly calculating (19), three intermediate numbers are found:
$$g_0 = (h_0 + h_{-1})\,x_1$$
$$g_1 = h_0\,(x_0 - x_1)$$
$$g_2 = (h_1 + h_0)\,x_0. \tag{20}$$
The desired outputs are then calculated by
$$y_0 = g_0 + g_1$$
$$y_1 = g_2 - g_1. \tag{21}$$
This approach uses three multiplications in comparison with four multiplications for a direct calculation. Using this result, the total number of multiplications to calculate (18) becomes
$$F = 3^M. \tag{22}$$
If the fact pointed out earlier, that the convolution along the last dimension is cyclic, is used, then a further reduction is possible.
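The three-multiplication form can be checked against the direct four-multiplication form of the length-two convolution. A minimal sketch follows; the function name and argument order are illustrative only, with `h_m1` standing for $h_{-1}$.

```python
def conv2_three_mults(h_m1, h0, h1, x0, x1):
    """Length-two convolution using three multiplications.

    Computes y0 = h0*x0 + h_m1*x1 and y1 = h1*x0 + h0*x1.
    The sums h0 + h_m1 and h1 + h0 depend only on the filter and
    could be precomputed when h is fixed.
    """
    g0 = (h0 + h_m1) * x1
    g1 = h0 * (x0 - x1)
    g2 = (h1 + h0) * x0
    return g0 + g1, g2 - g1
```

Applied along each of the $M$ length-two dimensions, each "multiplication" at one level becomes a full convolution at the next, which is how the count $3^M$ arises.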
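Length-two cyclic convolution is given by
$$\begin{bmatrix} y_0 \\ y_1 \end{bmatrix} = \begin{bmatrix} h_0 & h_1 \\ h_1 & h_0 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \end{bmatrix}. \tag{23}$$
This can be calculated from two intermediate values by
$$f_0 = (h_0/2 + h_1/2)(x_0 + x_1)$$
$$f_1 = (h_0/2 - h_1/2)(x_0 - x_1)$$
to give
$$y_0 = f_0 + f_1$$
$$y_1 = f_0 - f_1 \tag{24}$$
requiring two multiplications (assuming the factors of 1/2 are either precalculated or obtained by shifting). Using this result on the last dimension reduces (22) to
$$F = 2 \cdot 3^{M-1} \tag{25}$$
which is the same number obtained by Pitassi [7] and Rayner [8].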
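The two-multiplication cyclic form can be verified in the same way. This sketch uses exact rational arithmetic from the standard library so that the factors of 1/2 introduce no rounding; that choice, like the function name, is an assumption made for illustration.

```python
from fractions import Fraction


def cyclic_conv2_two_mults(h0, h1, x0, x1):
    """Length-two cyclic convolution using two multiplications.

    Computes y0 = h0*x0 + h1*x1 and y1 = h1*x0 + h0*x1.
    """
    half_sum = Fraction(h0 + h1, 2)   # precomputable when h is fixed
    half_diff = Fraction(h0 - h1, 2)  # precomputable when h is fixed
    f0 = half_sum * (x0 + x1)
    f1 = half_diff * (x0 - x1)
    return f0 + f1, f0 - f1
```

The sums and differences here are the butterflies of a length-two DFT, so this is a transform-domain multiplication in miniature.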
Pitassi developed his algorithm by relating the cyclic convolution of two sequences to the convolution of subsequences in the same manner that the FFT can be developed based on decimation in time. This was extended by Davis [9] to an approach similar to decimation in frequency where the subsequences are halves of the original sequences. Both of these are special cases of the multidimensional formulation since the values along any dimension are samples or blocks of the original sequence.
Another form of convolution that is sometimes desired is of the same form as (2) but with $h(n)$ not being periodically extended, rather having independent values for negative indices. The transmission matrix formulation for $N = 3$ is
$$\begin{bmatrix} y_0 \\ y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} h_0 & h_{-1} & h_{-2} \\ h_1 & h_0 & h_{-1} \\ h_2 & h_1 & h_0 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \end{bmatrix} \tag{26}$$
and because of its structure the operator is called a constant diagonal convolution matrix. This requires $2N - 1$ values of $h(n)$ from $n = -N + 1$ to $n = N - 1$ and $N$ values of $x(n)$ to give $N$ values of $y(n)$.
The same approach that was used for cyclic convolution is applicable here except the reduction described in (23) and (24) does not work since the $M$-dimensional formulation of (18) is no longer cyclic along any dimension. Therefore the number of multiplications necessary to implement a length-$N$ constant diagonal convolution is
$$F = 3^M \tag{27}$$
which is the same as obtained by Allwright [10], using a matrix factorization approach.
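Note that cyclic convolution in (2) can be viewed as a special case of constant diagonal convolution. Causal noncyclic convolution where $h(n)$ is zero for $n < 0$ is also a special case.
$$\begin{bmatrix} y_0 \\ y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} h_0 & 0 & 0 \\ h_1 & h_0 & 0 \\ h_2 & h_1 & h_0 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \end{bmatrix}. \tag{28}$$
The most common form of convolution desired is causal noncyclic convolution where all of the output sequence is obtained rather than only the first $N$ points as in (28). For this case two length-$N$ sequences are convolved to give a length-$(2N - 1)$ output. The transmission matrix formulation for $N = 3$ is
$$\begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix} = \begin{bmatrix} h_0 & 0 & 0 \\ h_1 & h_0 & 0 \\ h_2 & h_1 & h_0 \\ 0 & h_2 & h_1 \\ 0 & 0 & h_2 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \end{bmatrix}. \tag{29}$$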
To apply the results from cyclic convolution, all sequences are extended with zeros to length $2N$. Cyclic convolution then gives the desired output of (29) and uses
$$F = 2 \cdot 3^M \tag{30}$$
multiplications.
A further reduction is possible by recognizing that along the last dimension only one multiplication is necessary rather than two for the cyclic case that gave (25) or three for the constant diagonal noncyclic case that gave (27). Consider the $(M + 1)$-dimensional convolution of (18) from the extended sequences with all indices except the last one held constant. The resulting length-two scalar convolution is
$$\begin{bmatrix} s_0 \\ s_1 \end{bmatrix} = \begin{bmatrix} \bar{h}_0 & \bar{h}_{-1} \\ \bar{h}_1 & \bar{h}_0 \end{bmatrix} \begin{bmatrix} \bar{x}_0 \\ \bar{x}_1 \end{bmatrix} \tag{31}$$
which becomes
$$s_0 = \bar{h}_0\,\bar{x}_0 + \bar{h}_{-1}\,\bar{x}_1$$
$$s_1 = \bar{h}_1\,\bar{x}_0 + \bar{h}_0\,\bar{x}_1. \tag{32}$$
The $\bar{x}_1$ terms are always zero since the last half of the length-$2N$ sequence $x(n)$ has been added as zeros. Close examination shows that because $h(n)$ has also been extended with zeros, either $\bar{h}_0$ or $\bar{h}_1$ will always be zero, depending on values of the constant indices. Therefore the length-two convolution of (31) will require only one multiplication and the resulting total number of multiplications necessary to noncyclically convolve two length-$N$ sequences giving a length-$2N$ output is
$$F = 3^M \tag{33}$$
which is the same as for the length-$N$ constant diagonal convolution with a length-$N$ output.
VII. Multidimensional Convolution Based on Overlap-Add
Another two-dimensional formulation can be developed that is a generalization of the overlap-add algorithm [3], [4]. In (7) the $H$ function was augmented so that the desired output $y(n)$ is in blocks that are the columns of $Y$. For this formulation $H$ will not be augmented but $Y$ will be and the desired $y(n)$ will be obtained by adding the overlapped columns of $Y$. This formulation is based on noncyclic convolution given by
$$y(n) = \sum_{q=0}^{N-1} h(n - q)\,x(q), \qquad n = 0, 1, \ldots, 2N - 2 \tag{34}$$
where $h(n)$ and $x(n)$ are sequences of length $N$ and $y(n)$ of length $2N - 1$ and all are defined to be zero outside these lengths. The transmission matrix formulation for $N = 3$ is given by (29).
Using a similar factoring of $N$ and change of variables as was done in (3) and (5) we have (34) becoming
$$\hat{y}(l, m) = \sum_{p=0}^{M-1} \sum_{k=0}^{L-1} \hat{h}(l - k, m - p)\,\hat{x}(k, p), \qquad l = 0, 1, \ldots, 2L - 2, \quad m = 0, 1, \ldots, 2M - 2. \tag{35}$$
In this case both $\hat{X}$ and $\hat{H}$ are $L \times M$ arrays and $\hat{Y}$ is $(2L - 1) \times (2M - 1)$ with $\hat{X}$ and $\hat{H}$ having as their columns the blocks of $x(n)$ and $h(n)$, but the columns of $\hat{Y}$ and the blocks of $y(n)$ have a more complicated relation than in (6). Both $\hat{X}$ and $\hat{H}$ are defined to be zero outside the domain of definition. Here we can show
8/20/2019 1974 Fast One-Dimensional Digital Convolution by Multidimensional Techniques
6/10
6
IEEE TRANSACTIONSN ACOUSTICS SPEECH ANDIGNAL PROCESSING FEBRUARY
1
y(E + rnL) = ? l , rn) + ? I + L ,
rn
-
1) (36)
for
I = O , l , - . . L -
1
r n = 0 , 1 , - . . 2 M -
1
and ? Z,
rn)
=
0
outside its domain f definition of
I = 0 7 1 , * * * 2 L -
rn=0,1; . .2M- 2 . (37)
This formulation gives the implementation of one-dimensional convolution by the sum of overlapped columns of an array obtained by noncyclic two-dimensional convolution. This can be viewed as a generalization of the overlap-add algorithm [3], [4] used for sectioning or block processing.
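The overlap-add relation can be exercised with a short sketch. It assumes plain Python lists, evaluates the noncyclic two-dimensional convolution directly, and then adds the overlapped columns as in (36); `noncyclic_conv` is included only as a reference for comparison, and both function names are illustrative.

```python
def noncyclic_conv(h, x):
    """Direct noncyclic convolution, output length len(h) + len(x) - 1."""
    y = [0] * (len(h) + len(x) - 1)
    for i, a in enumerate(h):
        for j, b in enumerate(x):
            y[i + j] += a * b
    return y


def overlap_add_2d(h, x, L, M):
    """Noncyclic convolution of two length-(L*M) lists via (35) and (36)."""
    def at(seq, l, m):
        # L x M block array of seq, defined to be zero outside the array
        return seq[l + m * L] if 0 <= l < L and 0 <= m < M else 0

    rows, cols = 2 * L - 1, 2 * M - 1
    # (35): noncyclic two-dimensional convolution
    Y = [[sum(at(h, l - k, m - p) * at(x, k, p)
              for p in range(M) for k in range(L))
          for m in range(cols)] for l in range(rows)]

    def y_hat(l, m):
        return Y[l][m] if 0 <= l < rows and 0 <= m < cols else 0

    # (36): add the overlapped columns
    out = [y_hat(l, m) + y_hat(l + L, m - 1)
           for m in range(2 * M) for l in range(L)]
    return out[:2 * L * M - 1]  # the final entry is always zero
```

The lower $L - 1$ rows of each column of the two-dimensional result spill into the next block, which is exactly the overlap that gets added.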
If $N = 2^M$, the extension to $M$-dimensional convolution is similar to that done for the overlap-save technique in (17) and (18). Both $\hat{X}$ and $\hat{H}$ are $M$-dimensional with each dimension of length two and $\hat{Y}$ is $M$-dimensional but with each dimension of length three. For
$$\hat{x}(l, m, p, \ldots) = x(l + 2m + 4p + \cdots)$$
$$\hat{h}(l, m, p, \ldots) = h(l + 2m + 4p + \cdots) \tag{38}$$
(34) becomes
$$\hat{y}(l, m, p, \ldots) = \sum_{k=0}^{1} \sum_{j=0}^{1} \sum_{g=0}^{1} \cdots\, \hat{h}(l - k, m - j, p - g, \ldots)\,\hat{x}(k, j, g, \ldots) \tag{39}$$
for $l, m, p, \ldots = 0, 1, 2$.
The calculation of $y(n)$ from $\hat{Y}$ is a bit complicated but is a generalization of (36) and involves only additions. For example, if $N = 2^3 = 8$ the $\hat{Y}$ function would have three dimensions with a total of 27 elements. Along the $l$ dimension there would be nine length-three blocks that, when overlapped and added along the $m$ dimension, would give three length-seven blocks. These, when overlapped and added, would give the single length-fifteen sequence that would be $y(n)$. An example for $N = 4$ would have a two-dimensional three $\times$ three $\hat{Y}$
$$\hat{Y} = \begin{bmatrix} \hat{y}_{00} & \hat{y}_{01} & \hat{y}_{02} \\ \hat{y}_{10} & \hat{y}_{11} & \hat{y}_{12} \\ \hat{y}_{20} & \hat{y}_{21} & \hat{y}_{22} \end{bmatrix} \tag{40}$$
and $y(n)$ would be found by
$$y_0 = \hat{y}_{00}$$
$$y_1 = \hat{y}_{10}$$
$$y_2 = \hat{y}_{20} + \hat{y}_{01}$$
$$y_3 = \hat{y}_{11}$$
$$y_4 = \hat{y}_{21} + \hat{y}_{02}$$
$$y_5 = \hat{y}_{12}$$
$$y_6 = \hat{y}_{22}. \tag{41}$$
Along each dimension of (39) the matrix formulation of the length-two convolution is
$$\begin{bmatrix} y_0 \\ y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} h_0 & 0 \\ h_1 & h_0 \\ 0 & h_1 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \end{bmatrix} \tag{42}$$
which, if done directly, requires four multiplications and one addition. If an intermediate step is added for $y_1$, then
$$y_0 = h_0\,x_0$$
$$y_1 = (h_0 + h_1)(x_0 + x_1) - y_0 - y_2$$
$$y_2 = h_1\,x_1 \tag{43}$$
gives the three outputs with three multiplications and four additions/subtractions. Using this algorithm on each dimension of (39) and counting the number of required multiplications by the method used to find (33) we find that (39) can be computed with
$$F = 3^M \tag{44}$$
multiplications, which is the same as that obtained by the overlap-save method in (33).
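The multiplication count can be confirmed by applying the three-multiplication step recursively. The sketch below assumes sequence lengths that are powers of two and splits each sequence into halves, so every "multiplication" at one level is a half-length convolution at the next; readers may recognize the same splitting from Karatsuba's integer multiplication. The one-element list `counter` is a hypothetical device for tallying scalar multiplications.

```python
def fast_noncyclic(h, x, counter):
    """Noncyclic convolution of two length-2**M lists with 3**M multiplications."""
    n = len(x)
    if n == 1:
        counter[0] += 1  # one scalar multiplication
        return [h[0] * x[0]]
    half = n // 2
    h_lo, h_hi = h[:half], h[half:]
    x_lo, x_hi = x[:half], x[half:]
    y0 = fast_noncyclic(h_lo, x_lo, counter)            # h0 * x0
    y2 = fast_noncyclic(h_hi, x_hi, counter)            # h1 * x1
    mid = fast_noncyclic([a + b for a, b in zip(h_lo, h_hi)],
                         [a + b for a, b in zip(x_lo, x_hi)], counter)
    y1 = [m - a - b for m, a, b in zip(mid, y0, y2)]    # (h0+h1)(x0+x1) - y0 - y2
    out = [0] * (2 * n - 1)
    for i in range(n - 1):  # each partial result has length n - 1
        out[i] += y0[i]
        out[i + half] += y1[i]          # overlapped and added
        out[i + 2 * half] += y2[i]
    return out
```

For $N = 8$ the counter should read $3^3 = 27$, against 64 for direct evaluation.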
VIII. Sequences of Different Lengths and Partial Outputs
Two modifications of the usual formulation of convolution are often desired. The first occurs when the two sequences are significantly different in length and the second when one desires only a portion of the output rather than all of it. Both cases can be formulated in terms of multidimensional convolutions and computational savings can be realized.
If $h(n)$ is assumed to be the shorter sequence and if its length can be expressed as $L = 2^S$, the length of the longer sequence $x(n)$ will be expressed as $N = 2^S \cdot M$. It may be necessary to add zeros to both $x$ and $h$. The multidimensional signals are formed to give $S$ length-two dimensions and the last dimension of length $M$. If $m$ is the index for the length-$M$ dimension, then $\hat{H}$ is formulated to be zero for all $m$ other than $m = 0$. Therefore, if the fast algorithm of (20) or (43) is used along the $S$ length-two dimensions, there will be only $M$ multiplications required along the $M$ dimension. The total number of multiplications is then
$$F = 3^S \cdot M. \tag{45}$$
This could also be seen by considering the problem as requiring $M$ length-$2^S$ noncyclic convolutions with the outputs overlapped and added.
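The sectioning view in the last sentence can be sketched directly. In this illustration the longer sequence is cut into $M$ blocks of the filter length, each block is convolved with $h$, and the outputs are overlapped and added. The block convolution is done directly here for brevity; substituting a three-multiplication recursion for it is what would give $3^S$ multiplications per block. Both function names are hypothetical.

```python
def sectioned_conv(h, x):
    """Convolve a short h with a longer x by overlap-adding block outputs."""
    L = len(h)
    if len(x) % L != 0:
        raise ValueError("pad x with zeros to a multiple of len(h)")
    M = len(x) // L
    out = [0] * (len(x) + L - 1)
    for m in range(M):
        block = x[m * L:(m + 1) * L]
        # Length-L by length-L noncyclic convolution of this block with h
        for i, a in enumerate(h):
            for j, b in enumerate(block):
                out[m * L + i + j] += a * b  # overlapped and added
    return out


def multiplication_count(S, M):
    """Count from (45) for a length-2**S filter and M blocks."""
    return 3 ** S * M
```

For example, a length-8 filter ($S = 3$) applied to four blocks would need $27 \cdot 4 = 108$ multiplications by this count, against $64 \cdot 4 = 256$ when each block is convolved directly.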
If only a portion of the total output from the convolution is desired, then a similar saving can be obtained. Assume that both $h(n)$ and $x(n)$ are of length $N = LM$, with a noncyclic output $y(n)$ of length $2N - 1$. Out of this only a block of $y(n)$ of length $L$ is desired. First, $N$ extra zeros are appended to both the sequences and cyclic convolution of length $2N$ is formulated. This one-dimensional cyclic convolution is reformulated as a two-dimensional convolution as in (5)-(7). The first dimension is of length $L$ and the convolution in this dimension is noncyclic and can be carried out as cyclic convolution of length $2L - 1$. In the second dimension the convolution is cyclic and of length $2M$. The two-dimensional arrays are formulated so that the desired length-$L$ block of $y(n)$ appears as a column of $\hat{Y}$ in (6). Thus, $\hat{Y}$ along the second dimension has to be computed only for one index. This replaces cyclic convolution of length $2M$ with a summation of $2M$ terms along the second dimension. The desired output block can be written as a summation of $2M$ convolutions of length $L$, where each convolution represents convolution of two sequences of lengths $2L - 1$ and $L$, respectively, giving a sequence of length $L$. Because zeros were appended to both $x(n)$ and $h(n)$, out of these $2M$ convolutions, between one and $M$ convolutions would be nonzero, depending on the second index of $\hat{Y}$ for which the output is desired.
The convolutions along the first dimension of length $L$ can be carried out either by transform techniques or by the multidimensional techniques discussed in this paper. If the transform techniques are used, the summation of convolutions can be carried out in the transform domain, thus requiring only one inverse transform to obtain the desired output. Rader [11] has discussed a similar technique for the particular case of estimation of the autocorrelation function for the first few lag values. If $L = 2^S$ and the multidimensional methods are used, the number of multiplications is at most the same as given in (45).
The formulation just discussed can be extended for the situation where sampled output is desired, sampled at every $L$th value. In this case for the two-dimensional formulation, the output appears as a row of $\hat{Y}$, and as before, the two-dimensional convolution again reduces to a summation of convolutions. If the output $y(n)$ is a narrow-band signal as compared to the sampling frequency, the samples of $y(n)$ at a lower sampling rate are sufficient to reconstruct the analog signal. In this situation the formulation discussed here can result in computational savings.
If a multidimensional formulation is considered, we can obtain partial outputs as combinations of blocks and samples.
IX. Relations to Logical Convolution and Walsh Transforms
Consider logical convolution [12] of two sequences $x(n)$ and $h(n)$ of length $N = 2^M$ each, giving an output sequence $y(n)$ also of length $N$. Logical convolution is defined similarly to the cyclic convolution of (2), but the addition and subtraction of indices is done differently. All indices are represented in binary form as an $M$-bit index. When indices have to be added or subtracted, they are added or subtracted bit by bit, modulo 2. Note that in logical convolution, addition and subtraction of indices are equivalent. We can convert this one-dimensional logical convolution problem to an $M$-dimensional convolution as in (17) and (18). Along any dimension, convolution appears as in (19), but since logical convolution is desired, $h(-1) = h(1)$. Therefore, if (18) is implemented as length-two cyclic convolutions along all the dimensions, $y(n)$ thus obtained is the logical convolution of $x(n)$ and $h(n)$. Alternatively, if (18) is implemented as a noncyclic convolution along all the dimensions, we obtain noncyclic convolution of $x(n)$ and $h(n)$ as in (26).
Length-two cyclic convolution can be implemented using just two multiplications as in (24), which is a length-two DFT implementation. Thus (18) can be implemented as an $M$-dimensional cyclic convolution using $M$-dimensional DFT's of $\hat{X}$ and $\hat{H}$, where each dimension is of length two. Alternatively, length-$N$ logical convolution can also be implemented using length-$N$ Walsh transforms of $x(n)$ and $h(n)$ [12]. The preceding development shows that the length-($N = 2^M$) Walsh transform is equivalent to the $M$-dimensional DFT. Thus the $M$-dimensional approach establishes the logical convolution theorem for the Walsh transforms and it also establishes the fast Walsh transform algorithm as an $M$-dimensional DFT. These facts have been noted before in the literature.
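The logical convolution theorem can be demonstrated with a short sketch. It assumes integer sequences of length $2^M$ and uses the unnormalized Walsh-Hadamard transform in natural (Hadamard) ordering, one common convention among several; the function names are illustrative. Each pass of the transform loop is a set of length-two DFT butterflies along one of the $M$ dimensions.

```python
def logical_conv(h, x):
    """Direct logical (dyadic) convolution: indices combine by bitwise XOR."""
    n_len = len(x)
    return [sum(h[n ^ q] * x[q] for q in range(n_len)) for n in range(n_len)]


def walsh_hadamard(seq):
    """Unnormalized fast Walsh-Hadamard transform of a length-2**M list."""
    a = list(seq)
    step = 1
    while step < len(a):
        # One pass = length-two DFTs along one binary index dimension
        for start in range(0, len(a), 2 * step):
            for i in range(start, start + step):
                a[i], a[i + step] = a[i] + a[i + step], a[i] - a[i + step]
        step *= 2
    return a


def logical_conv_by_wht(h, x):
    """Logical convolution as a pointwise product of Walsh-Hadamard transforms."""
    n_len = len(x)
    product = [a * b for a, b in zip(walsh_hadamard(h), walsh_hadamard(x))]
    # The inverse is the same transform divided by N (exact for integers)
    return [v // n_len for v in walsh_hadamard(product)]
```

Because the transform is its own inverse up to the factor $N$, the same routine serves in both directions.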
For a particular formulation of (20), Pitassi [7] and Davis [9] have shown that some of the intermediate products correspond to multiplication of the Walsh transforms of the two sequences.
X. Generalizations and Applications
There are several modifications and generalizations that are possible with the formulation used here. The first will illustrate that fast algorithms exist for sequences of length other than two.
Consider the noncyclic convolution of two length-three sequences.
$$\begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix} = \begin{bmatrix} h_0 & 0 & 0 \\ h_1 & h_0 & 0 \\ h_2 & h_1 & h_0 \\ 0 & h_2 & h_1 \\ 0 & 0 & h_2 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \end{bmatrix}. \tag{46}$$
This would normally require nine multiplications and four additions. If six intermediate variables are calculated by
$$g_0 = h_0\,x_0 \qquad g_3 = (h_0 + h_1)(x_0 + x_1)$$
$$g_1 = h_1\,x_1 \qquad g_4 = (h_1 + h_2)(x_1 + x_2)$$
$$g_2 = h_2\,x_2 \qquad g_5 = (h_2 + h_0)(x_2 + x_0) \tag{47}$$
then the desired output can be obtained by
$$y_0 = g_0$$
$$y_1 = g_3 - g_0 - g_1$$
$$y_2 = g_5 + g_1 - g_0 - g_2$$
$$y_3 = g_4 - g_1 - g_2$$
$$y_4 = g_2 \tag{48}$$
requiring six multiplications.
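The six-multiplication scheme translates directly into code. This is a minimal sketch with an illustrative function name; it takes each length-three sequence as a list or tuple of three numbers.

```python
def conv3_six_mults(h, x):
    """Noncyclic convolution of two length-three sequences, six multiplications."""
    h0, h1, h2 = h
    x0, x1, x2 = x
    g0 = h0 * x0
    g1 = h1 * x1
    g2 = h2 * x2
    g3 = (h0 + h1) * (x0 + x1)
    g4 = (h1 + h2) * (x1 + x2)
    g5 = (h2 + h0) * (x2 + x0)
    return [g0,
            g3 - g0 - g1,
            g5 + g1 - g0 - g2,
            g4 - g1 - g2,
            g2]
```

As a usage check, convolving (1, 2, 3) with (4, 5, 6) should give (4, 13, 28, 27, 18), the coefficients of the polynomial product.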
If two sequences of length $N = 3^M$ were to be convolved, then using an implementation with $M$ dimensions of length three would require for total multiplications
$$F = 6^M. \tag{49}$$
For short sequences of this length, this approach is faster than adding zeros and using the next larger power-of-two length with (39).
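As a quick arithmetic check of this claim, the counts from (49) can be set against the $3^{M'}$ count of (44) after padding to the next power of two $2^{M'}$ (the choice of padded lengths is the only assumption here):

$$\begin{array}{rrrr} N = 3^M & 6^M & 2^{M'} & 3^{M'} \\ 3 & 6 & 4 & 9 \\ 9 & 36 & 16 & 81 \\ 27 & 216 & 32 & 243 \\ 81 & 1296 & 128 & 2187 \\ 243 & 7776 & 256 & 6561 \end{array}$$

The length-three algorithm needs fewer multiplications through $N = 81$, while at $N = 243$ padding to 256 needs fewer, which is consistent with restricting the claim to short sequences.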
Similar results are possible with other lengths and a very general scheme can be developed for sequences of length
$$N = 2^M \cdot 3^S \cdots$$
using different algorithms along different dimensions in a manner similar to using a multiple radix FFT. Because of the generality of the multidimensional formulation there are mixtures of high-speed techniques that can be used.
In some situations, it may be advantageous to combine transform techniques with the fast algorithms for short convolutions discussed in this paper. One such situation uses the Fermat number transforms [2]. As discussed in Section III, using a $b$-bit implementation of Fermat number transforms, the maximum cyclic convolution length is $4b$. To cyclically convolve sequences longer than this, we need to formulate a two-dimensional convolution as in Section II. Assume $x(n)$, $h(n)$, and $y(n)$ are sequences of length-($N = 8b$) each. Consider the formulation of (7), with $M = 4b$ and $L = 2$. Equation (7) can be rewritten as
$$\hat{y}(0, m) = \sum_{p=0}^{M-1} \hat{h}(0, m - p)\,\hat{x}(0, p) + \sum_{p=0}^{M-1} \hat{h}(-1, m - p)\,\hat{x}(1, p) \tag{50}$$
$$\hat{y}(1, m) = \sum_{p=0}^{M-1} \hat{h}(1, m - p)\,\hat{x}(0, p) + \sum_{p=0}^{M-1} \hat{h}(0, m - p)\,\hat{x}(1, p), \qquad m = 0, 1, \ldots, M - 1. \tag{51}$$
In these equations, each summation represents cyclic convolution of length $4b$, which can be carried out by Fermat transforms. Taking length-$4b$ Fermat transforms of all the sequences in (50) and (51), we obtain
$$T\{\hat{y}(0, \cdot)\} = T\{\hat{h}(0, \cdot)\}\,T\{\hat{x}(0, \cdot)\} + T\{\hat{h}(-1, \cdot)\}\,T\{\hat{x}(1, \cdot)\}$$
$$T\{\hat{y}(1, \cdot)\} = T\{\hat{h}(1, \cdot)\}\,T\{\hat{x}(0, \cdot)\} + T\{\hat{h}(0, \cdot)\}\,T\{\hat{x}(1, \cdot)\}. \tag{52}$$
These equations are similar to (19) and we could employ the tricks of (20) and (21) along the first dimension, giving
$$T\{\hat{y}(0, \cdot)\} = [T\{\hat{h}(0, \cdot)\} + T\{\hat{h}(-1, \cdot)\}]\,T\{\hat{x}(1, \cdot)\} + [T\{\hat{x}(0, \cdot)\} - T\{\hat{x}(1, \cdot)\}]\,T\{\hat{h}(0, \cdot)\}$$
$$T\{\hat{y}(1, \cdot)\} = [T\{\hat{h}(0, \cdot)\} + T\{\hat{h}(1, \cdot)\}]\,T\{\hat{x}(0, \cdot)\} - [T\{\hat{x}(0, \cdot)\} - T\{\hat{x}(1, \cdot)\}]\,T\{\hat{h}(0, \cdot)\}. \tag{53}$$
Note that $\hat{h}(-1,m) = \hat{h}(1,m-1)$; thus $\hat{h}(-1,m)$ is the cyclic shift by one position of $\hat{h}(1,m)$. We could employ the cyclic shift theorem to compute $T\{\hat{h}(-1,k)\}$ from $T\{\hat{h}(1,k)\}$. Assuming the $\hat{h}$ transforms are precalculated, this method requires computation of two length-$4b$ $\hat{x}$ transforms, $12b$ multiplications and $12b$ additions/subtractions to compute the $\hat{y}$ transforms using (53), and two length-$4b$ $\hat{y}$ inverse transforms. This is efficient because the only extra computation is $4b$ extra multiplications, and it is better than the two-dimensional Fermat number transform approach, which requires roughly twice the amount of computation. This use of a $2 \times M$ convolution has a small computational advantage even when used with the FFT where, in effect, the last stage of the FFT algorithm is replaced by one of the fast length-two algorithms.
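The $2 \times M$ scheme can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: NumPy's FFT stands in for the Fermat number transform (both support cyclic convolution), the index mapping $\hat{h}(l,m) = h(2m+l)$ is inferred from the relation $\hat{h}(-1,m) = \hat{h}(1,m-1)$, and the function names are hypothetical.

```python
import numpy as np


def cyclic_conv_direct(h, x):
    """Reference length-N cyclic convolution by the defining sum."""
    n = len(h)
    return np.array(
        [sum(h[(i - k) % n] * x[k] for k in range(n)) for i in range(n)],
        dtype=float,
    )


def cyclic_conv_2xM(h, x):
    """Length-2M cyclic convolution from length-M transforms (L = 2).

    Assumption: the FFT replaces the Fermat number transform.
    """
    h = np.asarray(h, dtype=float)
    x = np.asarray(x, dtype=float)
    # Assumed mapping: first index l = n mod 2, second index m = n // 2.
    h0, h1 = h[0::2], h[1::2]
    x0, x1 = x[0::2], x[1::2]
    # h_hat(-1, m) = h_hat(1, m - 1): a cyclic shift by one position.
    h_m1 = np.roll(h1, 1)

    T = np.fft.fft
    H0, H1, Hm1 = T(h0), T(h1), T(h_m1)  # could be precalculated
    X0, X1 = T(x0), T(x1)                # two forward transforms of x

    # Three products per transform point; the bracketed term is shared.
    shared = (X0 - X1) * H0
    Y0 = (H0 + Hm1) * X1 + shared
    Y1 = (H0 + H1) * X0 - shared

    y = np.empty(len(h))
    y[0::2] = np.fft.ifft(Y0).real       # two inverse transforms
    y[1::2] = np.fft.ifft(Y1).real
    return y


rng = np.random.default_rng(0)
h = rng.integers(-5, 6, size=16)
x = rng.integers(-5, 6, size=16)
print(np.allclose(cyclic_conv_2xM(h, x), cyclic_conv_direct(h, x)))
```

With the sums of the $\hat{h}$ transforms stored in advance, each of the $M$ transform points costs three multiplications, which is where the $3 \cdot 4b = 12b$ count comes from when $M = 4b$.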
There are many other possibilities of combinations of Fermat and Fourier transforms of various lengths and dimensions with short convolution algorithms or with the use of special hardware. The arbitrary order of the various operations can also be used to advantage. As pointed out by Rader [11] and illustrated in our discussion of partial outputs and in (53), it is often more efficient to take transforms along one dimension, then convolve along another before taking the inverse transform.
To illustrate the efficiencies of some of the techniques of this paper, a comparison of a particular case will be made with direct and FFT implementations. Consider the problem of noncyclic convolution of two length-$N$ ($N = 2^M$) sequences to give an output of length $2N - 1$ as described in (34). The number of multiplications per output point will be calculated for three implementations. First, consider a direct implementation, which requires $N^2$ multiplications for $2N - 1$ output points (we will approximate this by $2N$ for simplification). This gives for multiplications per output point
$$F_0 = \tfrac{1}{2} N.$$
If the FFT is used in an efficient way, taking advantage of the fact that the data are real and using a mixed radix algorithm (2, 4, and 8), the number of multiplications necessary per output point can be calculated as described by Singleton [13] and is denoted $F_1$.
Using the $M$-dimensional formulation with the fast algorithm of (33) or (44) gives for the multiplications per output point
$$F_2 = \frac{1}{2N}\,3^M.$$
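The $3^M$ multiplication count can be illustrated with a recursive sketch. Equations (33) and (44) are not reproduced here, so the code below assumes a Karatsuba-style length-two stage (three multiplications instead of four) applied recursively, which plays the same role as using a length-two algorithm along each of $M$ dimensions; the function names and the `counter` argument are hypothetical.

```python
def fast_noncyclic_conv(a, b, counter):
    """Noncyclic convolution of two length-2**M lists.

    Assumed stage: three half-length convolutions per split.
    counter[0] accumulates the number of scalar multiplications.
    """
    n = len(a)
    if n == 1:
        counter[0] += 1
        return [a[0] * b[0]]
    half = n // 2
    a_lo, a_hi = a[:half], a[half:]
    b_lo, b_hi = b[:half], b[half:]
    low = fast_noncyclic_conv(a_lo, b_lo, counter)
    high = fast_noncyclic_conv(a_hi, b_hi, counter)
    mixed = fast_noncyclic_conv(
        [p + q for p, q in zip(a_lo, a_hi)],
        [p + q for p, q in zip(b_lo, b_hi)],
        counter,
    )
    # Cross term recovered by subtraction instead of a fourth convolution.
    cross = [m - l - h for m, l, h in zip(mixed, low, high)]
    out = [0] * (2 * n - 1)
    for i in range(2 * half - 1):
        out[i] += low[i]
        out[i + half] += cross[i]
        out[i + n] += high[i]
    return out


def direct_noncyclic_conv(a, b):
    """Reference implementation using N**2 multiplications."""
    out = [0] * (len(a) + len(b) - 1)
    for i, u in enumerate(a):
        for j, v in enumerate(b):
            out[i + j] += u * v
    return out


a = [1, -2, 3, 0, 4, 1, -1, 2]
b = [2, 1, 0, -3, 1, 5, 2, -2]
count = [0]
y = fast_noncyclic_conv(a, b, count)
print(y == direct_noncyclic_conv(a, b), count[0])  # N = 8, M = 3
```

For $N = 8$ the counter reaches $3^3 = 27$, against $N^2 = 64$ for the direct method.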
Table II compares these functions for various lengths up to 1024. Note that the multidimensional implementation is more efficient than a direct implementation for all lengths and requires fewer multiplications than the FFT for lengths up to 128. If a less efficient FFT implementation requiring $4 \log_2 N + 4$ multiplications per output point is used, the crossover length is above 2048.
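The comparison can be reproduced with a few lines of arithmetic. The sketch below evaluates $F_0 = N/2$ and $F_2 = 3^M/(2N)$ and locates the crossover against the simpler FFT count; the logarithm is taken as base two, an assumption made here that is consistent with the crossover quoted above. The function names are illustrative, and $F_1$ is not computed because it depends on the details of Singleton's algorithm [13].

```python
def direct_per_output(N):
    """F0: direct noncyclic convolution, N**2 multiplications over ~2N outputs."""
    return N / 2


def multidim_per_output(M):
    """F2: 3**M multiplications over ~2N outputs, with N = 2**M."""
    return 3 ** M / (2 * 2 ** M)


def simple_fft_per_output(M):
    """Less efficient FFT count, 4*log2(N) + 4 (base two assumed)."""
    return 4 * M + 4


# Rows (N, F0, F2) for N = 2 ... 1024.
rows = [(2 ** M, direct_per_output(2 ** M), multidim_per_output(M))
        for M in range(1, 11)]
for N, f0, f2 in rows:
    print(f"{N:5d} {f0:7.1f} {f2:7.2f}")

# First power-of-two length at which F2 exceeds the simple FFT count.
crossover = next(2 ** M for M in range(1, 32)
                 if multidim_per_output(M) > simple_fft_per_output(M))
print(crossover)
```

Under these assumptions the multidimensional count stays below $4 \log_2 N + 4$ through $N = 2048$ and first exceeds it at $N = 4096$.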
The multidimensional approach can also be used in the same way the FFT is used to implement ongoing processing or sectioning [4]. In contrast to its use with the FFT, this approach is most efficient when the length of the section or block is the same as the length of the convolution operator.
Initial observations indicate that block implementation of recursive filters [3] becomes more attractive when used with the techniques described in this paper. To illustrate this, we will again consider the multiplication efficiencies for three realizations. First consider a recursive filter with an equal-order numerator and denominator of order $N = 2^M$. The multiplications per output point for direct implementation are
$$F_0 = 2N.$$
Using efficient FFT algorithms, the results given by the methods in [3] are denoted $F_1$. Using three convolutions by (33) or (44) to implement the block recursive filter, with the block length equal to the order, gives as the multiplications per output point
$$F_2 = \frac{3^{M+1}}{N}.$$
Table III compares these multiplication efficiencies for orders up to 128. Note that the multidimensional approach is more efficient than the direct for orders above three and more efficient than the FFT for orders up to about 256. This is yet to be explored in detail.
TABLE II
A Comparison of Multiplication Efficiencies for Three
Implementations of Length-N Convolution

    N      F0       F1       F2
    2       1     1.5      0.75
    4       2     2.25     1.12
    8       4     4.12     1.70
   16       8     5.06     2.53
   32      16     6.03     3.80
   64      32     8.01     5.70
  128      64     9.00     8.54
  256     128    10.00    12.81
  512     256    12.00    19.22
 1024     512    13.00    28.83
TABLE III
A Comparison of Multiplication Efficiencies for Three
Implementations of Block Recursive Filters of Order N
N  F0  F1  F2
4
24.5
8
8
21 6.7
16
16 31
32
10.1
38 15.2
32
~
~~
645
64
22.8
128
128
53
2568
34.2
51.2
XI. Conclusions
This paper has presented two formulations of convolution in terms of multidimensional convolution: one based on a generalization of the overlap-save algorithm for sectioning and the other on the overlap-add algorithm. The first proved to be well suited for cyclic and constant-diagonal convolution and the second for noncyclic convolution. Fast algorithms were developed based on length-two and -three convolution that lead to an improvement in multiplication efficiency. The reduction in required word lengths proved to be a valuable feature when used with the Fermat number transform. The formulation proved to be well suited for the special cases where unequal-length sequences were convolved or where only a portion of the output was desired. It was further shown that various mixtures of algorithms could be used along the different dimensions to achieve certain advantages or to fit particular requirements. Finally, examples were presented to compare the multiplication efficiencies of a few implementations.
The formulation is so general that a complete and systematic investigation of all possible applications is difficult. The main ideas and relations to other works that we know of have been presented here. The investigation of word length and storage requirements and a more complete consideration of recursive implementations are still to be made.
Acknowledgment
The authors would like to thank R. A. Meyer for valuable discussions.
References
[1] C. M. Rader, "Discrete convolutions via Mersenne transforms," IEEE Trans. Comput., vol. C-21, pp. 1269-1273, Dec. 1972.
[2] R. C. Agarwal and C. S. Burrus, "Fast digital convolutions using Fermat transforms," in Southwestern IEEE Conf. Rec., Houston, Tex., Apr. 1973, pp. 538-543.
[3] C. S. Burrus, "Block realization of digital filters," IEEE Trans. Audio Electroacoust., vol. AU-20, pp. 230-235, Oct. 1972.
[4] B. Gold and C. M. Rader, Digital Processing of Signals. New York: McGraw-Hill, 1969, pp. 208-211.
[5] R. A. Meyer and C. S. Burrus, "Certain properties of periodically time-varying digital filters," in Southwestern IEEE Conf. Rec., Houston, Tex., Apr. 1973, pp. 529-535.
[6] D. P. MacAdam, "Image restoration by constrained deconvolution," J. Opt. Soc. Amer., vol. 60, pp. 1617-1627, Dec. 1970.
[7] D. A. Pitassi, "Fast convolution using the Walsh transforms," in Proc. Conf. Applications of Walsh Functions, Washington, D.C., Apr. 1971, pp. 130-133.
[8] P. J. W. Rayner, "A fast cyclic convolution algorithm," presented at Symp. Digital Filtering, Imperial College, London, England, Aug. 1971.
[9] W. F. Davis, "A class of efficient convolution algorithms," in Proc. Symp. Applications of Walsh Functions, Washington, D.C., Mar. 1972, pp. 318-329.
[10] J. C. Allwright, "Real factorization of noncyclic convolution operators with applications to fast convolution," Electron. Lett., vol. 7, pp. 718-719, Dec. 1971.
[11] C. M. Rader, "An improved algorithm for high speed autocorrelation with applications to spectral estimation," IEEE Trans. Audio Electroacoust., vol. AU-18, pp. 439-441, Dec. 1970.
[12] G. S. Robinson, "Logical convolution and discrete Walsh and Fourier power spectra," IEEE Trans. Audio Electroacoust., vol. AU-20, pp. 271-280, Oct. 1972.
[13] R. C. Singleton, "An algorithm for computing the mixed radix fast Fourier transform," IEEE Trans. Audio Electroacoust., vol. AU-17, pp. 93-103, June 1969.

Digital Notch Filter Design Procedure
JAMES A. CADZOW
Abstract-An analytical procedure for designing a linear digital notch filter is presented. The resultant filter is sixth-order and is implemented by cascading three second-order filters so as to avoid instability which may arise from computer coefficient truncation. The procedure outlined is straightforward, requires only simple algebraic steps, and gives filter parameter selection criteria for reducing the effects of computer coefficient truncation.
Notch filters have utility in situations where a desired signal is corrupted by an additive sinusoidal pickup. One thus must process the noisy signal so as to remove the sinusoid without significantly distorting the desired signal.
I. Introduction
A frequently occurring linear data filtering application occurs when one wishes to process a signal of the form
$$u(t) = s(t) + A \sin \omega_0 t$$
so as to remove the additive sinusoidal component, $A \sin \omega_0 t$, without seriously distorting the desired signal $s(t)$. The situation in which a 60-Hz sinusoidal pickup corrupts a desired measurement signal nicely illustrates how such problems can arise in practice.
The ideal filter for the above application would then have a response $s(t)$ to the input $u(t)$. In the frequency domain, the required linear filter would have a gain of one for all frequencies except at $\omega_0$, where its gain is zero. As such, this processor is typically called a notch filter with notch at $\omega_0$. Its gain-frequency behavior is depicted in Fig. 1.
Unfortunately, the ideal notch filter is not physically realizable and must be approximated in practice. If one attempted to implement a notch filter approximation using analog devices (i.e., resistors, capacitors, and inductors), one would quickly realize the futility of this approach. On the other hand, one may readily design a digital filter whose frequency behavior closely resembles that shown in Fig. 1 (e.g., see [1], [4]).
The approach to be taken in this paper is to then uniformly sample the signal $u(t)$ (every $T$ seconds) and use the resulting sequence as the input to a digital filter governed by
$$y(k) = b_0 u(k) + b_1 u(k-1) + \cdots + b_m u(k-m) - a_1 y(k-1) - a_2 y(k-2) - \cdots - a_n y(k-n)$$
where $u(k)$ and $y(k)$ denote the values of the filter's input and output signals, respectively, at the $k$th iteration. A procedure for selecting the coefficients $a_i$ and $b_i$ whereby the filter's gain factor-frequency behavior will be similar to that shown in Fig. 1 will be shortly given. One must realize, however, that since the filter is digital, its frequency behavior will be periodic with period $2\pi/T$ (e.g., [1, p. 297]) and will appear as shown in Fig. 2.
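The recursion governing the digital filter can be written out directly. The following is a minimal sketch assuming zero initial conditions (samples before $k = 0$ are treated as zero); the function name `difference_filter` and the coefficient values in the usage lines are illustrative and are not taken from the paper.

```python
def difference_filter(b, a, u):
    """Evaluate y(k) = sum_i b[i]*u(k-i) - sum_i a_i*y(k-i).

    b = [b0, b1, ..., bm], a = [a1, a2, ..., an], u = input samples.
    Assumption: u(k) and y(k) are zero for k < 0.
    """
    y = []
    for k in range(len(u)):
        # Feedforward part: weighted current and past inputs.
        acc = sum(b[i] * u[k - i] for i in range(len(b)) if k - i >= 0)
        # Feedback part: weighted past outputs (a[i - 1] holds a_i).
        acc -= sum(a[i - 1] * y[k - i]
                   for i in range(1, len(a) + 1) if k - i >= 0)
        y.append(acc)
    return y


# Illustrative first-order example: y(k) = u(k) + 0.5 * y(k - 1).
impulse_response = difference_filter([1.0], [-0.5], [1.0, 0.0, 0.0])
print(impulse_response)
```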
The selection of the sampling period $T$ to be used is of great importance from a number of viewpoints. Most importantly, it must be chosen small enough that little distortion results from the analog-to-digital conversion of the desired signal $s(t)$.
Quantitative