1974 fast one-dimensional digital convolution by multidimensional techniques


    IEEE TRANSACTIONS ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL. ASSP-22, NO. 1, FEBRUARY 1974

    Fast One-Dimensional Digital Convolution by Multidimensional Techniques

    RAMESH C. AGARWAL and CHARLES S. BURRUS, Member, IEEE

    Abstract: This paper presents two formulations of multidimensional digital signals from one-dimensional digital signals so that multidimensional convolution will implement one-dimensional convolution of the original signals. This has reduced an important word length restriction when used with the Fermat number transform. The formulation is very general and includes block processing and sectioning as special cases and, when used with various fast algorithms for short length convolutions, results in improved multiplication efficiency.

    I. Introduction

    There are several advantages to formulating a one-dimensional digital convolution as a two- or higher dimensional problem. The first involves the Mersenne and Fermat number transforms, which have recently been defined [1], [2] and which seem to have some advantages over the discrete Fourier transform (DFT) for implementing convolution on a digital computer. They can be computationally faster than the fast Fourier transform (FFT) implementation of the DFT and result in no roundoff error. These transforms have one limitation: the number of bits required for each word in an implementation is proportional to the length of the sequences to be convolved [1], [2].

    It is the purpose of this paper to present a scheme whereby long sequences can be convolved by a two-dimensional convolution as mentioned by Rader [1]. This two-dimensional convolution can be implemented by a two-dimensional transform to allow a high-speed error-free convolution with the word length proportional to the square root of the length of the sequences.

    This formulation also allows use of special short convolution algorithms similar to those proposed by Pitassi [7], Rayner [8], Davis [9], and Allwright [10]. These can be extended and combined with others to provide a very general and versatile format for doing fast digital convolution. The case of sequences of different lengths and the case where only part of the output sequence is desired are covered as special cases of the general problem.

    Manuscript received March 15, 1973; revised July 10, 1973 and September 20, 1973. This work was supported by the National Science Foundation under Grant GK-23697.
    The authors are with the Department of Electrical Engineering, Rice University, Houston, Tex. 77001.

    II. Two-Dimensional Convolution Based on Overlap-Save

    Consider the cyclic convolution of two sequences, $x(n)$ and $h(n)$, giving an output sequence $y(n)$, all of length $N$.
    \[ h(n) * x(n) = y(n). \tag{1} \]
    This is defined by
    \[ y(n) = \sum_{q=0}^{N-1} h(n-q)\, x(q), \qquad n = 0, 1, \ldots, N-1 \tag{2} \]
    where $h(n)$ and $x(n)$ are periodically extended outside their original domains of definition (or their indices evaluated modulo $N$).

    In order to convert this one-dimensional problem to two dimensions a change of variables is made. First, it must be possible to factor $N$ into integer factors
    \[ N = LM. \tag{3} \]

    If a change of variables is made such that
    \[ n = l + mL, \qquad k, l = 0, 1, \ldots, L-1 \]
    \[ q = k + pL, \qquad m, p = 0, 1, \ldots, M-1 \]
    then (2) becomes
    \[ y(l + mL) = \sum_{k=0}^{L-1} \sum_{p=0}^{M-1} h(l + mL - k - pL)\, x(k + pL). \tag{4} \]

    Define now a two-dimensional $L \times M$ array $\hat{X}$ from the original length $N = LM$ signal, $x(n)$, by
    \[ \hat{x}(l, m) = x(l + mL) \tag{5} \]
    where columns of $\hat{X}$ are the sections or blocks of $x$ and the rows are samples of $x$ taken every $L$ values of $n$. In a similar way $\hat{H}$ and $\hat{Y}$ are defined by
    \[ \hat{h}(l, m) = h(l + mL) \]
    \[ \hat{y}(l, m) = y(l + mL). \tag{6} \]

    In terms of the two-dimensional signals, (4) becomes
    \[ \hat{y}(l, m) = \sum_{p=0}^{M-1} \sum_{k=0}^{L-1} \hat{h}(l-k,\, m-p)\, \hat{x}(k, p) \tag{7} \]
    which is a two-dimensional convolution. Note that values of $\hat{H}$ outside the $L \times M$ array are required. The values that are needed can be seen from (7), where $1-L \le l-k \le L-1$ and $1-M \le m-p \le M-1$. Values of $\hat{h}(l, m)$ outside the $L \times M$ array are defined by (6). We therefore define suitably extended


    arrays $\tilde{H}$ and $\tilde{X}$ so that two-dimensional convolution will give the desired answer. The extension of $\hat{H}$ is analogous to the overlap extensions used in the overlap-save algorithm for doing sectioning or block processing [3], [4]. Note that along the $m$ dimension the values of $\hat{H}$ are periodic with period $M$, i.e., the desired convolution in (7) is cyclic in the $m$ dimension (this is because the original desired convolution in (2) was defined as cyclic). Considering the values of $\hat{H}$ along $l$ shows the convolution in that dimension is not cyclic.

    An important factor when considering multidimensional convolution is the number of multiplications required in an implementation. If in (7) $m$ and $p$ are held constant, then (7) becomes a scalar convolution along the $l$ dimension of a length-$L$ sequence of $x$ with a length-$(2L-1)$ sequence of $h$, giving a length-$L$ sequence of the output. There will be one of these length-$L$ scalar convolutions for each value of $m$ and $p$ in (7), so that along the $m$ dimension each "operation" will be a length-$L$ convolution rather than a simple scalar multiplication. This is a sort of convolution of convolutions [3]. To count the total number of multiplications to compute (7), the multiplications along the $m$ dimension with $l$ and $k$ held constant are first found. With $l$ and $k$ not constant, this count becomes the number of length-$L$ convolutions necessary, and the total number of multiplications is the number of length-$L$ convolutions times the number of multiplications for a length-$L$ convolution. In this case the number of multiplications along the $m$ dimension is $M^2$ and along the $l$ dimension it is $L^2$, so the total is $M^2 L^2$ or $N^2$, which is the same as a direct calculation of (2) would require. We will later use transforms and other schemes to reduce this number. Note the convolution can be carried out in either order.
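The index mapping of (5)-(7) can be checked numerically. The following is a minimal sketch, assuming NumPy; the function names are illustrative and not from the paper. It evaluates the double sum of (7) directly, so it still costs $N^2$ multiplications and only demonstrates that the two-dimensional form reproduces (2).

```python
import numpy as np

def cyclic_conv_direct(h, x):
    """Length-N cyclic convolution of eq. (2), indices taken modulo N."""
    N = len(x)
    return np.array([sum(h[(n - q) % N] * x[q] for q in range(N))
                     for n in range(N)])

def cyclic_conv_2d(h, x, L, M):
    """Same result computed through the two-dimensional sum of eq. (7)."""
    N = L * M
    # Eq. (5): x_hat[l, m] = x[l + m*L]; columns are the length-L blocks.
    x_hat = np.reshape(x, (L, M), order="F")
    y_hat = np.zeros((L, M), dtype=x_hat.dtype)
    for l in range(L):
        for m in range(M):
            acc = 0
            for k in range(L):
                for p in range(M):
                    # h_hat(l-k, m-p) = h(l-k + (m-p)L), periodic with period N
                    acc += h[((l - k) + (m - p) * L) % N] * x_hat[k, p]
            y_hat[l, m] = acc
    # Eq. (6): y[l + m*L] = y_hat[l, m]
    return y_hat.reshape(N, order="F")
```

Column-major (`order="F"`) reshaping is used because it makes the first array index the fast one, which matches $n = l + mL$.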

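    If (7) is to be carried out by transforms and therefore cyclic convolution, $\hat{X}$ must be augmented with zeros so that aliasing of the noncyclic convolution along $l$ does not occur and so that all arrays are the same size. Consider the $(2L-1) \times M$ array $\tilde{X}$ formed by appending $(L-1)$ rows of zeros to the bottom of $\hat{X}$.
    \[ \tilde{X} = \begin{bmatrix} x(0) & x(L) & \cdots & x(N-L) \\ x(1) & x(L+1) & \cdots & x(N-L+1) \\ \vdots & \vdots & & \vdots \\ x(L-1) & x(2L-1) & \cdots & x(N-1) \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix}. \tag{8} \]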

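    The extended array $\tilde{H}$ is formed so that the columns of $\tilde{H}$ contain the periodic extension of the original $h(n)$ with period $N$.
    \[ \tilde{H} = \begin{bmatrix} h(N-L+1) & h(1) & \cdots & h(N-2L+1) \\ \vdots & \vdots & & \vdots \\ h(N-1) & h(L-1) & \cdots & h(N-L-1) \\ h(0) & h(L) & \cdots & h(N-L) \\ h(1) & h(L+1) & \cdots & h(N-L+1) \\ \vdots & \vdots & & \vdots \\ h(L-1) & h(2L-1) & \cdots & h(N-1) \end{bmatrix}. \tag{9} \]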

    If two-dimensional cyclic convolution of the extended arrays is carried out, we have
    \[ \tilde{Y} = \tilde{H} * \tilde{X} \tag{10} \]
    where the lower $L \times M$ partition of $\tilde{Y}$ is $\hat{Y}$ and the columns of $\hat{Y}$ are the desired blocks of $y(n)$. Because of ease in implementation with transforms, the arrays would usually be extended one additional row to be $2L \times M$ rather than the minimum $(2L-1) \times M$.
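The overlap-save construction of (8)-(10) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it uses NumPy's floating-point 2-D FFT in place of a number theoretic transform, uses the minimum $(2L-1) \times M$ size, and builds the extended filter array from the rule $\tilde{h}(j, m) = h\big((j - L + 1 + mL) \bmod N\big)$, which is how (9) is read here.

```python
import numpy as np

def cyclic_conv_overlap_save_2d(h, x, L, M):
    """Length-N cyclic convolution (N = L*M) via a (2L-1) x M 2-D cyclic convolution."""
    N = L * M
    R = 2 * L - 1                       # rows of the extended arrays
    # Eq. (8): x_hat with L-1 rows of zeros appended at the bottom.
    x_ext = np.zeros((R, M))
    x_ext[:L, :] = np.reshape(x, (L, M), order="F")
    # Eq. (9): periodic extension of h, shifted up by L-1 rows.
    j = np.arange(R)[:, None]
    m = np.arange(M)[None, :]
    h_ext = np.asarray(h, dtype=float)[(j - L + 1 + m * L) % N]
    # Eq. (10): 2-D cyclic convolution, done here with the 2-D DFT.
    y_ext = np.fft.ifft2(np.fft.fft2(h_ext) * np.fft.fft2(x_ext)).real
    # The lower L x M partition holds the blocks of y(n) as its columns.
    y_hat = y_ext[L - 1:, :]
    return y_hat.reshape(N, order="F")
```

The top $L-1$ rows of the result are aliased along $l$ and are discarded, which is the two-dimensional counterpart of the discarded samples in overlap-save sectioning.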

    III. Two-Dimensional Transform Implementation

    The two-dimensional transform of the extended $2L \times M$ array $\tilde{X}$ is defined as
    \[ T\{\tilde{X}\} = F(j, k) = \sum_{m=0}^{M-1} \sum_{l=0}^{2L-1} \tilde{x}(l, m)\, \alpha_{2L}^{lj}\, \alpha_M^{mk} \tag{11} \]
    and the inverse transform
    \[ T^{-1}\{F\} = \tilde{x}(l, m) = (2N)^{-1} \sum_{k=0}^{M-1} \sum_{j=0}^{2L-1} F(j, k)\, \alpha_{2L}^{-lj}\, \alpha_M^{-mk} \tag{12} \]
    where $\alpha_M$ is of order $M$ (i.e., $M$ is the least positive integer such that $\alpha_M^M = 1$ [2]). Applying the transform to (10) it can be shown that
    \[ T\{\tilde{Y}\} = T\{\tilde{H}\}\, T\{\tilde{X}\} \tag{13} \]
    so that, similar to the one-dimensional case, (10) can be carried out by
    \[ \tilde{Y} = T^{-1}\big[ T\{\tilde{H}\}\, T\{\tilde{X}\} \big]. \tag{14} \]
    If the transform is the DFT, then
    \[ \alpha_M = e^{-i 2\pi / M}. \]

    To compare multiplication efficiencies we assume the DFT of $\tilde{H}$ is already known, and the number of multiplications for one $2L \times M$ transform, one $2L \times M$ complex multiplication, and one $2L \times M$ inverse transform are calculated. The number of complex multiplications is approximately $(2N \log 2N + 2N)$ as compared to $(N \log N + N)$ for a one-dimensional implementation, and therefore one would not use the two-dimensional approach with the DFT for improved multiplication efficiency.

    The computational advantage appears when used with the Fermat number transform [2], where word length requirements are a possible restriction. The Fermat number transform is defined in [2] and, although not named, is defined in a restricted form in the latter part of [1, eq. (38)]. It is a transform defined in a finite ring of integers with arithmetic performed modulo Fermat numbers ($2^b + 1$, $b = 2^t$), with $\alpha = 2$ and $\alpha = \sqrt{2}$, and having the property that multiplication of Fermat number transforms corresponds to conventional cyclic convolution (2) modulo $2^b + 1$. To perform convolution with the transform requires $N$ real multiplications and a number of additions and word shifts proportional to $N \log N$. Unfortunately the transform requires word lengths proportional to the length of the sequences to be convolved [1], [2]. Since the lengths of the two dimensions are $2L$ and $M$ rather than $N = LM$ for the one-dimensional signal, the word-length requirement using the two-dimensional transform is proportional to the square root of $N$ rather than to $N$ as for the one-dimensional problem. It is this reduction in the necessary word length that makes the two-dimensional formulation attractive with the Fermat number transform.
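To make the exactness property concrete, here is a minimal one-dimensional sketch of cyclic convolution through a Fermat number transform, assuming $b = 16$ (modulus $2^{16} + 1 = 65537$) and $\alpha = 2$, which has order $2b = 32$ modulo that number. The names are illustrative. The transform is written as a direct $O(N^2)$ sum for clarity; a practical implementation would use an FFT-type structure and replace the multiplications by powers of two with word shifts.

```python
F = 2**16 + 1     # Fermat number with b = 16
ALPHA = 2         # 2**16 = -1 (mod F), so 2 has order 32
N = 32            # transform length equals the order of ALPHA

def fnt(seq, root=ALPHA):
    """Direct number theoretic transform of a length-N sequence, modulo F."""
    return [sum(v * pow(root, n * k, F) for n, v in enumerate(seq)) % F
            for k in range(N)]

def ifnt(spec):
    """Inverse transform: use the inverse root and scale by N^{-1} mod F."""
    inv_n = pow(N, -1, F)            # modular inverse (Python 3.8+)
    inv_root = pow(ALPHA, -1, F)
    return [inv_n * v % F for v in fnt(spec, inv_root)]

def cyclic_conv_fnt(h, x):
    """Exact length-32 cyclic convolution, valid while |y(n)| < F/2."""
    prod = [a * b % F for a, b in zip(fnt(h), fnt(x))]
    y = ifnt(prod)
    # Map residues back to signed integers.
    return [v - F if v > F // 2 else v for v in y]
```

The result is exact only as long as the true convolution values fit in the residue range, which is the word-length constraint discussed in the text.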

    The consequence of this reduction in word length is of considerable practical importance. For example, using a word length of 16 b and the Fermat number transform [2] with $\alpha = 2$ to compute the complete noncyclic convolution of two sequences of equal length, the one-dimensional implementation restricts the sequence length to a maximum of 16, whereas the two-dimensional implementation increases the maximum to 256. A summary of the restrictions on sequence lengths is shown in Table I for the most practical word lengths and for two values of $\alpha$. Note that one-dimensional length restrictions would be too severe for many applications but the two-dimensional restrictions would include most practical filters. If cyclic convolution is desired, then all the length restrictions are doubled, since the addition of zeros to prevent aliasing is unnecessary.
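The arithmetic behind these limits can be sketched as below. The sketch assumes the usual Fermat number transform property that the transform length is $2b$ for $\alpha = 2$ and $4b$ for $\alpha = \sqrt{2}$ with a $b$-bit word; that assumption is not spelled out in this section, but it reproduces the 16 and 256 quoted above. The function name is hypothetical.

```python
def max_noncyclic_length(bits, sqrt2_basis=False):
    """Return (1-D, 2-D) maximum sequence lengths for noncyclic convolution."""
    # Assumed transform length: order of alpha modulo 2**bits + 1.
    transform_len = 4 * bits if sqrt2_basis else 2 * bits
    # 1-D: both sequences are zero padded to the transform length.
    one_d = transform_len // 2
    # 2-D: the array dimensions 2L and M may each equal the transform length,
    # so N = L*M, and the zero-padded sequences have length N/2.
    L, M = transform_len // 2, transform_len
    two_d = (L * M) // 2
    return one_d, two_d

for bits in (16, 32, 64):
    for sqrt2 in (False, True):
        print(bits, "sqrt(2)" if sqrt2 else "2", max_noncyclic_length(bits, sqrt2))
```

Under this assumption the two-dimensional limit is always the square of the one-dimensional limit.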

    The two-dimensional transform, or inverse transform, can be taken in either order. There is, however, a computational advantage in taking the transform first along the $m$ direction (length $M$) and then along the $l$ direction (length $2L$); half the $\tilde{X}$ sequences along the $m$ direction are zero, and half the $\tilde{H}$ sequences along the $m$ direction are cyclically shifted by one position of the other half sequences. Also, while taking the inverse transform, there is an advantage to first taking the inverse transform along $l$, then along $m$, because we need only half the $\tilde{Y}$ sequences and therefore only half the sequences need be inverted along $m$.

    TABLE I
    Sequence Length Restrictions for One- and Two-Dimensional Implementation of Noncyclic Convolution by the Fermat Number Transform

    Word Length    Transform    Maximum Sequence Length N/2
    (Bits)         Basis α      1-D        2-D
    16             2            16         256
    16             √2           32         1024
    32             2            32         1024
    32             √2           64         4096
    64             2            64         4096
    64             √2           128        16 384

    IV. Generalizations and the Inverse Problem

    A generalization of the approach in this paper to higher orders is fairly obvious. For example, $N$ could be factored into three integer factors $N = LMP$, as was done with two factors in (3). The signals $x(n)$ and $h(n)$ would then be redefined as three-dimensional $L \times M \times P$ arrays and (2) converted to a three-dimensional convolution by a change of variables as was done in two dimensions in (4)-(7). For $x$ this would be
    \[ \hat{x}(l, m, p) = x(l + mL + pML) \tag{15} \]
    and with similar definitions for $\hat{H}$ and $\hat{Y}$, and after augmentation to prevent aliasing, (2) would become a three-dimensional convolution.

    We would then have one-dimensional cyclic convolution of $N$-length sequences being carried out by three-dimensional cyclic convolution with dimensions of lengths $2L$, $2M$, and $P$. Use of orders higher than two does not seem needed with the Fermat number transform at this time, but will be exploited with other schemes in the next sections.

    Still another variation would apply to the case where the filter is periodically time-varying. This, when converted to a two-dimensional problem with $L$ equal to the period of the filter, becomes time-invariant along one dimension, $m$, and time-varying along the other, $l$ [5]. The DFT or Fermat transform could be applied to the two-dimensional signal along the time-invariant dimension and either direct calculation or another type of transform applied to the other dimension.

    The inverse of the above problem can be considered where one wishes to implement two-dimensional convolution by one-dimensional methods. This involves reversing the process presented in Section II by constructing one-dimensional sequences from the given arrays. MacAdams [6] has presented a scheme for doing this that can be seen to be the inverse of the problem addressed in this paper and can be applied to cyclic or noncyclic two-dimensional convolution.

    V. M-Dimensional Convolution

    If the length of the signals to be cyclically convolved in (2) can be written $N = 2^M$, then the methods used in (5) and (15) can be extended to define an $M$-dimensional signal with each dimension of length two. As before, this is done by a change of variables.
    \[ \hat{x}(l, m, p, \ldots) = x(l + 2m + 4p + \cdots), \qquad l, m, p, \ldots = 0, 1. \tag{17} \]
    After a similar definition for $\hat{H}$ and $\hat{Y}$, cyclic convolution in (2) becomes
    \[ \hat{y}(l, m, p, \ldots) = \sum_{k=0}^{1} \sum_{j=0}^{1} \sum_{g=0}^{1} \cdots\, \hat{h}(l-k,\, m-j,\, p-g, \ldots)\, \hat{x}(k, j, g, \ldots) \tag{18} \]
    for $l, m, p, \ldots = 0, 1$.

    Both $\hat{Y}$ and $\hat{X}$ are $M$-dimensional with each dimension of length two, and $\hat{H}$ is also $M$-dimensional but must be defined with dimensions of length three in order to carry out (18). This is the same as was required in two dimensions where the extended $H$ array was defined in (9). Since the original convolution in (2) was defined as cyclic, the convolution along the last dimension in (18) is also cyclic, so that $\hat{H}$ need not be extended along that dimension but can be evaluated for this index modulo 2.

    This is an extremely general and versatile formulation for the original problem that can be used to improve computational efficiency. The convolution along each dimension of length two can be written in terms of scalar variables as
    \[ y_0 = h_0 x_0 + h_{-1} x_1 \]
    \[ y_1 = h_1 x_0 + h_0 x_1. \tag{19} \]
    The convolution of (18) can be viewed as $M$ nested length-2 convolutions, each separately of the form shown in (19) requiring four multiplications. Using the same reasoning that was explained for the two-dimensional formulation, the total number of multiplications is $F = 4^M$. Since $N = 2^M$ this becomes $F = N^2$, which is again the same as would be required by directly calculating (2).

    VI. A High Speed Algorithm

    With this formulation of scalar convolution in terms of multidimensional short convolutions, various tricks for efficient short convolution can be used. Consider one of several possible algorithms suggested by Pitassi [7] where, rather than directly calculating (19), three intermediate numbers are found:
    \[ g_0 = (h_0 + h_{-1})\, x_1 \]
    \[ g_1 = h_0\, (x_0 - x_1) \]
    \[ g_2 = (h_1 + h_0)\, x_0. \tag{20} \]
    The desired outputs are then calculated by
    \[ y_0 = g_0 + g_1 \]
    \[ y_1 = g_2 - g_1. \tag{21} \]
    This approach uses three multiplications in comparison with four multiplications for a direct calculation. Using this result, the total number of multiplications to calculate (18) becomes
    \[ F = 3^M. \tag{22} \]

    If the fact pointed out earlier, that the convolution along the last dimension is cyclic, is used, then a further reduction is possible. Length-two cyclic convolution is given by
    \[ y_0 = h_0 x_0 + h_1 x_1 \]
    \[ y_1 = h_1 x_0 + h_0 x_1. \tag{23} \]
    This can be calculated from two intermediate values by
    \[ f_0 = (h_0/2 + h_1/2)(x_0 + x_1) \]
    \[ f_1 = (h_0/2 - h_1/2)(x_0 - x_1) \]
    to give
    \[ y_0 = f_0 + f_1 \]
    \[ y_1 = f_0 - f_1 \tag{24} \]
    requiring two multiplications (assuming the factors of $h$ are either precalculated or obtained by shifting). Using this result on the last dimension reduces (22) to
    \[ F = 2 \cdot 3^{M-1} \tag{25} \]
    which is the same number obtained by Pitassi [7] and Rayner [8].
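As a check on the two short algorithms, expanding the intermediate products recovers the length-two convolutions term by term. The worked numbers for $M = 3$ are an added illustration, not taken from the paper.

```latex
\begin{aligned}
g_0 + g_1 &= (h_0 + h_{-1})x_1 + h_0(x_0 - x_1) = h_0 x_0 + h_{-1} x_1 = y_0, \\
g_2 - g_1 &= (h_1 + h_0)x_0 - h_0(x_0 - x_1) = h_1 x_0 + h_0 x_1 = y_1, \\
f_0 + f_1 &= \tfrac{1}{2}(h_0 + h_1)(x_0 + x_1) + \tfrac{1}{2}(h_0 - h_1)(x_0 - x_1) = h_0 x_0 + h_1 x_1, \\
f_0 - f_1 &= \tfrac{1}{2}(h_0 + h_1)(x_0 + x_1) - \tfrac{1}{2}(h_0 - h_1)(x_0 - x_1) = h_1 x_0 + h_0 x_1 .
\end{aligned}
\qquad
\text{For } N = 2^3:\quad 4^3 = 64, \quad 3^3 = 27, \quad 2 \cdot 3^{2} = 18 .
```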

    Pitassi developed his algorithm by relating the cyclic convolution of two sequences to the convolution of subsequences in the same manner that the FFT can be developed based on decimation in time. This was extended by Davis [9] to an approach similar to decimation in frequency where the subsequences are halves of the original sequences. Both of these are special cases of the multidimensional formulation, since the values along any dimension are samples or blocks of the original sequence.

    Another form of convolution that is sometimes desired is of the same form as (2) but with $h(n)$ not being periodically extended, rather having independent values for negative indices. The transmission matrix formulation for $N = 3$ is
    \[ \begin{bmatrix} y_0 \\ y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} h_0 & h_{-1} & h_{-2} \\ h_1 & h_0 & h_{-1} \\ h_2 & h_1 & h_0 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \end{bmatrix} \tag{26} \]
    and because of its structure the operator is called a constant diagonal convolution matrix. This requires $2N - 1$ values of $h(n)$ from $n = -N + 1$ to $n = N - 1$ and $N$ values of $x(n)$ to give $N$ values of $y(n)$.

    The same approach that was used for cyclic convolution is applicable here, except the reduction described in (23) and (24) does not work since the $M$-dimensional formulation of (18) is no longer cyclic along any dimension. Therefore the number of multiplications necessary to implement a length-$N$ constant diagonal convolution is
    \[ F = 3^M \tag{27} \]
    which is the same as obtained by Allwright [10], using a matrix factorization approach.
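The $3^M$ count for constant diagonal convolution can be illustrated with a short recursive sketch that nests the three-multiplication step. This is an assumption-laden illustration rather than the authors' code: it splits the sequences into halves (the block ordering attributed above to Davis) instead of even and odd samples, stores $h(n)$ for $n = -(N-1), \ldots, N-1$ in an array with $h(0)$ at index $N-1$, and uses a one-element list as a multiplication counter.

```python
import numpy as np

def constant_diagonal_conv(h, x, counter):
    """y(i) = sum_q h(i-q) x(q), i = 0..n-1, with n = len(x) a power of two.

    h has length 2n-1 and holds h(-(n-1)) ... h(n-1); counter[0] counts
    scalar multiplications.
    """
    n = len(x)
    if n == 1:
        counter[0] += 1
        return np.array([h[0] * x[0]])
    k = n // 2
    # Generating sequences of the three distinct k x k constant diagonal blocks.
    h_m1, h_0, h_p1 = h[:2 * k - 1], h[k:3 * k - 1], h[2 * k:]
    x0, x1 = x[:k], x[k:]
    # Three half-size products instead of four.
    g0 = constant_diagonal_conv(h_0 + h_m1, x1, counter)
    g1 = constant_diagonal_conv(h_0, x0 - x1, counter)
    g2 = constant_diagonal_conv(h_p1 + h_0, x0, counter)
    return np.concatenate([g0 + g1, g2 - g1])
```

The sums of generating sequences, such as `h_0 + h_m1`, depend only on the filter and could be precomputed when the same $h(n)$ is reused.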

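    Note that cyclic convolution in (2) can be viewed as a special case of constant diagonal convolution. Causal noncyclic convolution where $h(n)$ is zero for $n < 0$ is also a special case.
    \[ \begin{bmatrix} y_0 \\ y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} h_0 & 0 & 0 \\ h_1 & h_0 & 0 \\ h_2 & h_1 & h_0 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \end{bmatrix} \tag{28} \]
    The most common form of convolution desired is causal noncyclic convolution where all of the output sequence is obtained rather than only the first $N$ points as in (28). For this case two length-$N$ sequences are convolved to give a length-$(2N - 1)$ output. The transmission matrix formulation for $N = 3$ is
    \[ \begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix} = \begin{bmatrix} h_0 & 0 & 0 \\ h_1 & h_0 & 0 \\ h_2 & h_1 & h_0 \\ 0 & h_2 & h_1 \\ 0 & 0 & h_2 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \end{bmatrix}. \tag{29} \]
    To apply the results from cyclic convolution, all sequences are extended with zeros to length $2N$. Cyclic convolution then gives the desired output of (29) and uses
    \[ F = 2 \cdot 3^M \tag{30} \]
    multiplications.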

    A further reduction is possible by recognizing that along the last dimension only one multiplication is necessary, rather than two for the cyclic case that gave (25) or three for the constant diagonal noncyclic case that gave (27). Consider the $(M+1)$-dimensional convolution of (18) from the extended sequences with all indices except the last one held constant. The resulting length-two scalar convolution is
    \[ \begin{bmatrix} y_0 \\ y_1 \end{bmatrix} = \begin{bmatrix} h_0 & h_{-1} \\ h_1 & h_0 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \end{bmatrix} \tag{31} \]
    which becomes
    \[ y_0 = h_0 x_0 + h_{-1} x_1 \]
    \[ y_1 = h_1 x_0 + h_0 x_1. \tag{32} \]
    The $x_1$ terms are always zero since the last half of the length-$2N$ sequence $x(n)$ has been added as zeros. Close examination shows that because $h(n)$ has also been extended with zeros, either $h_0$ or $h_1$ will always be zero, depending on values of the constant indices. Therefore the length-two convolution of (31) will require only one multiplication, and the resulting total number of multiplications necessary to noncyclically convolve two length-$N$ sequences giving a length-$2N$ output is
    \[ F = 3^M \tag{33} \]
    which is the same as for the length-$N$ constant diagonal convolution with a length-$N$ output.

    VII. Multidimensional Convolution Based on Overlap-Add

    Another two-dimensional formulation can be developed that is a generalization of the overlap-add algorithm [3], [4]. In (7) the $\hat{H}$ function was augmented so that the desired output $y(n)$ is in blocks that are the columns of $\hat{Y}$. For this formulation $\hat{H}$ will not be augmented but $\hat{Y}$ will be, and the desired $y(n)$ will be obtained by adding the overlapped columns of $\hat{Y}$. This formulation is based on noncyclic convolution given by
    \[ y(n) = \sum_{q=0}^{N-1} h(n-q)\, x(q) \tag{34} \]
    where $h(n)$ and $x(n)$ are sequences of length $N$ and $y(n)$ of length $2N - 1$, and all are defined to be zero outside these lengths. The transmission matrix formulation for $N = 3$ is given by (29).

    Using a similar factoring of $N$ and change of variables as was done in (3) and (5), we have (34) becoming
    \[ \hat{y}(l, m) = \sum_{p=0}^{M-1} \sum_{k=0}^{L-1} \hat{h}(l-k,\, m-p)\, \hat{x}(k, p), \qquad l = 0, 1, \ldots, 2L-2, \quad m = 0, 1, \ldots, 2M-2. \tag{35} \]
    In this case both $\hat{X}$ and $\hat{H}$ are $L \times M$ arrays and $\hat{Y}$ is $(2L-1) \times (2M-1)$, with $\hat{X}$ and $\hat{H}$ having as their columns the blocks of $x(n)$ and $h(n)$, but the columns of $\hat{Y}$ and the blocks of $y(n)$ have a more complicated relation than in (6). Both $\hat{X}$ and $\hat{H}$ are defined to be zero outside the domain of definition. Here we can show
    \[ y(l + mL) = \hat{y}(l, m) + \hat{y}(l + L,\, m - 1) \tag{36} \]
    for
    \[ l = 0, 1, \ldots, L-1, \qquad m = 0, 1, \ldots, 2M-1 \]
    and $\hat{y}(l, m) = 0$ outside its domain of definition of
    \[ l = 0, 1, \ldots, 2L-2, \qquad m = 0, 1, \ldots, 2M-2. \tag{37} \]

    This formulation gives the implementation of one-dimensional convolution by the sum of overlapped columns of an array obtained by noncyclic two-dimensional convolution. This can be viewed as a generalization of the overlap-add algorithm [3], [4] used for sectioning or block processing.
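The overlap-add relation can be sketched numerically as follows, assuming NumPy and an illustrative function name. The two-dimensional noncyclic convolution is evaluated directly, so the sketch shows only the index bookkeeping of (35) and (36), not a fast algorithm.

```python
import numpy as np

def noncyclic_conv_overlap_add_2d(h, x, L, M):
    """Noncyclic convolution of two length-N sequences (N = L*M), output length 2N-1."""
    N = L * M
    # Columns of x_hat and h_hat are the length-L blocks of x(n) and h(n).
    x_hat = np.reshape(x, (L, M), order="F")
    h_hat = np.reshape(h, (L, M), order="F")
    # Eq. (35): noncyclic 2-D convolution, giving a (2L-1) x (2M-1) array.
    y_hat = np.zeros((2 * L - 1, 2 * M - 1), dtype=x_hat.dtype)
    for k in range(L):
        for p in range(M):
            y_hat[k:k + L, p:p + M] += h_hat * x_hat[k, p]
    # Eq. (36): column m starts at n = m*L, so adjacent columns overlap
    # by L-1 samples and are added.
    y = np.zeros(2 * N - 1, dtype=x_hat.dtype)
    for m in range(2 * M - 1):
        y[m * L:m * L + 2 * L - 1] += y_hat[:, m]
    return y
```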

    If $N = 2^M$, the extension to $M$-dimensional convolution is similar to that done for the overlap-save technique in (17) and (18). Both $\hat{X}$ and $\hat{H}$ are $M$-dimensional with each dimension of length two, and $\hat{Y}$ is $M$-dimensional but with each dimension of length three. For
    \[ \hat{x}(l, m, p, \ldots) = x(l + 2m + 4p + \cdots) \]
    \[ \hat{h}(l, m, p, \ldots) = h(l + 2m + 4p + \cdots) \tag{38} \]
    (34) becomes
    \[ \hat{y}(l, m, p, \ldots) = \sum_{k=0}^{1} \sum_{j=0}^{1} \sum_{g=0}^{1} \cdots\, \hat{h}(l-k,\, m-j,\, p-g, \ldots)\, \hat{x}(k, j, g, \ldots) \tag{39} \]
    for $l, m, p, \ldots = 0, 1, 2$.

    The calculation of $y(n)$ from $\hat{Y}$ is a bit complicated but is a generalization of (36) and involves only additions. For example, if $N = 2^3 = 8$ the $\hat{Y}$ function would have three dimensions with a total of 27 elements. Along the $l$ dimension there would be nine length-three blocks that, when overlapped and added along the $m$ dimension, would give three length-seven blocks. These, when overlapped and added, would give the single length-fifteen sequence that would be $y(n)$. An example for $N = 4$ would have a two-dimensional three $\times$ three $\hat{Y}$
    \[ \hat{Y} = \begin{bmatrix} y_{00} & y_{01} & y_{02} \\ y_{10} & y_{11} & y_{12} \\ y_{20} & y_{21} & y_{22} \end{bmatrix} \tag{40} \]
    and $y(n)$ would be found by
    \[ \begin{aligned} y_0 &= y_{00} \\ y_1 &= y_{10} \\ y_2 &= y_{20} + y_{01} \\ y_3 &= y_{11} \\ y_4 &= y_{21} + y_{02} \\ y_5 &= y_{12} \\ y_6 &= y_{22}. \end{aligned} \tag{41} \]

    Along each dimension of (39) the matrix formulation of the length-two convolution is
    \[ \begin{bmatrix} y_0 \\ y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} h_0 & 0 \\ h_1 & h_0 \\ 0 & h_1 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \end{bmatrix} \tag{42} \]
    which, if done directly, requires four multiplications and one addition. If an intermediate step is added, then
    \[ \begin{aligned} y_0 &= h_0 x_0 \\ y_1 &= (h_0 + h_1)(x_0 + x_1) - y_0 - y_2 \\ y_2 &= h_1 x_1 \end{aligned} \tag{43} \]
    gives the three outputs with three multiplications and four additions/subtractions. Using this algorithm on each dimension of (39) and counting the number of required multiplications by the method used to find (33), we find that (39) can be computed with
    \[ F = 3^M \tag{44} \]
    multiplications, which is the same as that obtained by the overlap-save method in (33).
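Nesting the three-multiplication step along every length-two dimension can be sketched recursively. In this sketch (illustrative names, NumPy assumed) the first dimension is the even/odd sample index, matching $n = l + 2m + 4p + \cdots$ in (38), and the interleaved additions at the end of each level are the overlap-add of (36) for $L = 2$. A one-element list counts scalar multiplications.

```python
import numpy as np

def nested_noncyclic_conv(h, x, counter):
    """Noncyclic convolution of two length-n arrays (n a power of two).

    Returns the length 2n-1 result; counter[0] accumulates multiplications.
    """
    n = len(x)
    if n == 1:
        counter[0] += 1
        return h * x
    # Split along the l dimension: l = 0 (even samples) and l = 1 (odd samples).
    h0, h1 = h[0::2], h[1::2]
    x0, x1 = x[0::2], x[1::2]
    a = nested_noncyclic_conv(h0, x0, counter)                 # y_0 of (43)
    c = nested_noncyclic_conv(h1, x1, counter)                 # y_2 of (43)
    b = nested_noncyclic_conv(h0 + h1, x0 + x1, counter) - a - c   # y_1 of (43)
    # Overlap-add: the block for output index l lands at n = l + 2m.
    y = np.zeros(2 * n - 1, dtype=a.dtype)
    y[0:-2:2] += a
    y[1:-1:2] += b
    y[2::2] += c
    return y
```

For $n = 8$ the counter reaches $3^3 = 27$, compared with 64 for the direct sum.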

    VIII. Sequences of Different Lengths and Partial Outputs

    Two modifications of the usual formulation of convolution are often desired. The first occurs when the two sequences are significantly different in length, and the second when one desires only a portion of the output rather than all of it. Both cases can be formulated in terms of multidimensional convolutions, and computational savings can be realized.

    If $h(n)$ is assumed to be the shorter sequence and if its length can be expressed as $L = 2^S$, the length of the longer sequence $x(n)$ will be expressed as $N = 2^S \cdot M$. It may be necessary to add zeros to both $x$ and $h$. The multidimensional signals are formed to give $S$ length-two dimensions and the last dimension of length $M$. If $m$ is the index for the length-$M$ dimension, then $\hat{H}$ is formulated to be zero for all $m$ other than $m = 0$. Therefore, if the fast algorithm of (20) or (43) is used along the $S$ length-two dimensions, there will be only $M$ multiplications required along the $M$ dimension. The total number of multiplications is then
    \[ F = 3^S \cdot M. \tag{45} \]
    This could also be seen by considering the problem as requiring $M$ length-$2^S$ noncyclic convolutions with the outputs overlapped and added.
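As an illustration of the saving, with numbers chosen here rather than taken from the paper, take a filter of length $L = 2^4 = 16$ and a signal of length $N = 16 \cdot 64 = 1024$. The direct count assumes one multiplication per filter tap per input sample.

```latex
F = 3^{S} \cdot M = 3^{4} \cdot 64 = 5184,
\qquad
F_{\text{direct}} = L \cdot N = 16 \cdot 1024 = 16\,384,
\qquad
\frac{F}{F_{\text{direct}}} = \frac{3^{S} M}{2^{S} \cdot 2^{S} M} = \left(\tfrac{3}{4}\right)^{S} = \left(\tfrac{3}{4}\right)^{4} \approx 0.32 .
```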

    If only a portion of the total output from the convolution is desired, then a similar saving can be obtained. Assume that both $h(n)$ and $x(n)$ are of length $N = LM$, with a noncyclic output $y(n)$ of length $2N - 1$. Out of this only a block of $y(n)$ of length $L$ is desired. First, $N$ extra zeros are appended to both the sequences and cyclic convolution of length $2N$ is formulated. This one-dimensional cyclic convolution is reformulated as a two-dimensional convolution as in (5)-(7). The first dimension is of length $L$, and the convolution in this dimension is noncyclic and can be carried out as cyclic convolution of length $2L - 1$. In the second dimension the convolution is cyclic and of length $2M$. The two-dimensional arrays are formulated so that the desired length-$L$ block of $y(n)$ appears as a column of $\hat{Y}$ in (6). Thus, $\hat{Y}$ along the second dimension has to be computed only for one index. This replaces cyclic convolution of length $2M$ with a summation of $2M$ terms along the second dimension. The desired output block can be written as a summation of $2M$ convolutions of length $L$, where each convolution represents convolution of two sequences of lengths $2L - 1$ and $L$, respectively, giving a sequence of length $L$. Because zeros were appended to both $x(n)$ and $h(n)$, out of these $2M$ convolutions between one and $M$ convolutions would be nonzero, depending on the second index of $\hat{Y}$ for which the output is desired.

    The convolutions along the first dimension of length $L$ can be carried out either by transform techniques or by the multidimensional techniques discussed in this paper. If the transform techniques are used, the summation of convolutions can be carried out in the transform domain, thus requiring only one inverse transform to obtain the desired output. Rader [11] has discussed a similar technique for the particular case of estimation of the autocorrelation function for the first few lag values. If $L = 2^S$ and the multidimensional methods are used, the number of multiplications is at most the same as given in (45).

    The formulation just discussed can be extended to the situation where a sampled output is desired, sampled at every $L$th value. In this case, for the two-dimensional formulation, the output appears as a row of $\hat{Y}$, and as before, the two-dimensional convolution again reduces to a summation of convolutions. If the output $y(n)$ is a narrow-band signal as compared to the sampling frequency, the samples of $y(n)$ at a lower sampling rate are sufficient to reconstruct the analog signal. In this situation the formulation discussed here can result in computational savings.
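    The reduction of a sampled output to a summation of short convolutions can be sketched as follows. This is a minimal illustration for noncyclic convolution, not the paper's array formulation of (5)-(7); the function names and the index split $k = j + Lp$ are assumptions made for the example.

```python
def direct_convolution(x, h):
    """Noncyclic convolution y(n) = sum_k h(n - k) x(k)."""
    y = [0] * (len(x) + len(h) - 1)
    for k, xk in enumerate(x):
        for q, hq in enumerate(h):
            y[k + q] += hq * xk
    return y


def sampled_convolution(x, h, L):
    """Return y(0), y(L), y(2L), ... without forming the other outputs.

    Writing k = j + L*p gives
        y(L*m) = sum_j sum_p h(L*(m - p) - j) x(j + L*p),
    a sum over j of short convolutions between subsampled sequences.
    """
    n_out = len(x) + len(h) - 1
    n_samples = (n_out + L - 1) // L           # indices 0, L, 2L, ...
    y = [0] * n_samples
    for j in range(L):
        x_j = x[j::L]                          # x_j(p) = x(j + L*p)
        q_max = (len(h) - 1 + j) // L
        # h_j(q) = h(L*q - j), taken as zero outside the support of h
        h_j = [h[L * q - j] if 0 <= L * q - j < len(h) else 0
               for q in range(q_max + 1)]
        for p, xv in enumerate(x_j):
            for q, hv in enumerate(h_j):
                if p + q < n_samples:
                    y[p + q] += hv * xv
    return y
```

    Each of the $L$ inner convolutions works on sequences that are shorter by a factor of about $L$, so only the retained samples are ever computed.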

    If a multidimensional formulation is considered, we

    can obtain partial outputs as combinations of blocks

    and samples.

    IX. Relations to Logical Convolution and Walsh Transforms

    Consider logical convolution [12] of two sequences $x(n)$ and $h(n)$, each of length $N = 2^M$, giving an output sequence $y(n)$ also of length $N$. Logical convolution is defined similarly to the cyclic convolution of (2), but the addition and subtraction of indices are done differently. All indices are represented in binary form as $M$-bit indices. When indices have to be added or subtracted, they are added or subtracted bit by bit, modulo 2. Note that in logical convolution, addition and subtraction of indices are equivalent. We can convert this one-dimensional logical convolution problem to an $M$-dimensional convolution as in (17) and (18). Along any dimension, convolution appears as in (19), but since logical convolution is desired, $h(-1) = h(1)$. Therefore, if (18) is implemented as length-two cyclic convolutions along all the dimensions, the $y(n)$ thus obtained is the logical convolution of $x(n)$ and $h(n)$. Alternatively, if (18) is implemented as a noncyclic convolution along all the dimensions, we obtain the noncyclic convolution of $x(n)$ and $h(n)$ as in (26).

    Length-two cyclic convolution can be implemented using just two multiplications as in (24), which is a length-two DFT implementation. Thus (18) can be implemented as an $M$-dimensional cyclic convolution using $M$-dimensional DFT's of $X$ and $H$, where each dimension is of length two. Alternatively, length-$N$ logical convolution can also be implemented using length-$N$ Walsh transforms of $x(n)$ and $h(n)$ [12]. The preceding development shows that the length-($N = 2^M$) Walsh transform is equivalent to the $M$-dimensional DFT. Thus the $M$-dimensional approach establishes the logical convolution theorem for the Walsh transforms and it also establishes the fast Walsh transform algorithm as an $M$-dimensional DFT. These facts have been noted before in the literature.
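    The equivalence can be checked numerically with a short sketch. The code below assumes the natural (Hadamard) ordering of the Walsh transform and integer data; the function names are illustrative and not taken from the paper. Bit-by-bit modulo-2 index addition is the XOR operator, and the fast transform is a length-two DFT (sum and difference) applied along each of the $M$ binary index dimensions.

```python
def logical_convolution(x, h):
    """Direct logical convolution: y(n) = sum_k h(n XOR k) x(k)."""
    n = len(x)
    return [sum(h[i ^ k] * x[k] for k in range(n)) for i in range(n)]


def walsh_hadamard(a):
    """Fast Walsh-Hadamard transform of a length-2^M sequence.

    Each pass applies the length-two DFT (u + v, u - v) along one
    binary dimension of the index.
    """
    a = list(a)
    n = len(a)
    step = 1
    while step < n:
        for start in range(0, n, 2 * step):
            for i in range(start, start + step):
                u, v = a[i], a[i + step]
                a[i], a[i + step] = u + v, u - v
        step *= 2
    return a


def logical_convolution_fast(x, h):
    """Logical convolution as a pointwise product of Walsh transforms."""
    n = len(x)
    X, H = walsh_hadamard(x), walsh_hadamard(h)
    Y = walsh_hadamard([xv * hv for xv, hv in zip(X, H)])
    # The inverse transform is the forward transform divided by N;
    # for integer inputs the division is exact.
    return [v // n for v in Y]
```

    The fast version uses $N$ multiplications for the pointwise product, compared with $N^2$ for the direct sum.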

    For a particular formulation of (20), Pitassi [7] and Davis [9] have shown that some of the intermediate products correspond to multiplication of the Walsh transforms of the two sequences.

    X. Generalizations and Applications

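    There are several modifications and generalizations that are possible with the formulation used here. The first will illustrate that fast algorithms exist for sequences of length other than two.

    Consider the noncyclic convolution of two length-three sequences.

    $$ \begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix} = \begin{bmatrix} h_0 & 0 & 0 \\ h_1 & h_0 & 0 \\ h_2 & h_1 & h_0 \\ 0 & h_2 & h_1 \\ 0 & 0 & h_2 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \end{bmatrix} \tag{46} $$

    This would normally require nine multiplications and four additions. If six intermediate variables are calculated by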


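    $$ \begin{aligned} g_0 &= h_0 x_0, & g_3 &= (h_0 + h_1)(x_0 + x_1) \\ g_1 &= h_1 x_1, & g_4 &= (h_1 + h_2)(x_1 + x_2) \\ g_2 &= h_2 x_2, & g_5 &= (h_2 + h_0)(x_2 + x_0) \end{aligned} \tag{47} $$

    then the desired output can be obtained by

    $$ \begin{aligned} y_0 &= g_0 \\ y_1 &= g_3 - g_0 - g_1 \\ y_2 &= g_5 + g_1 - g_0 - g_2 \\ y_3 &= g_4 - g_1 - g_2 \\ y_4 &= g_2 \end{aligned} \tag{48} $$

    requiring six multiplications.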
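    Equations (47) and (48) translate directly into code. The sketch below is only a check of the algebra; the function names are illustrative.

```python
def conv3_direct(h, x):
    """Noncyclic length-three convolution with nine multiplications."""
    y = [0] * 5
    for i in range(3):
        for j in range(3):
            y[i + j] += h[i] * x[j]
    return y


def conv3_six_mults(h, x):
    """Noncyclic length-three convolution using the six products of (47)."""
    h0, h1, h2 = h
    x0, x1, x2 = x
    g0 = h0 * x0
    g1 = h1 * x1
    g2 = h2 * x2
    g3 = (h0 + h1) * (x0 + x1)
    g4 = (h1 + h2) * (x1 + x2)
    g5 = (h2 + h0) * (x2 + x0)
    # Recombination of (48): additions and subtractions only.
    return [g0,
            g3 - g0 - g1,
            g5 + g1 - g0 - g2,
            g4 - g1 - g2,
            g2]
```

    The saving in multiplications is paid for with extra additions, and the sums $h_i + h_j$ can be precomputed when $h$ is fixed.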

    If two sequences of length $N = 3^M$ were to be convolved, then using an implementation with $M$ dimensions of length three would require for total multiplications

    $$ F = 6^M. \tag{49} $$

    For short sequences of this length, this approach is faster than adding zeros and using the next larger power-of-two length with (39).
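    The count $6^M$ can be confirmed with a small recursive sketch. It assumes that the $M$-dimensional formulation is realized by nesting: each sequence is split into three blocks of length $N/3$, the six-product algorithm is applied with block convolutions in place of scalar products, and the five result blocks are overlap-added. The function names and the mutable `counter` list are illustrative choices, not the paper's notation.

```python
def direct_convolution(h, x):
    """Reference noncyclic convolution."""
    y = [0] * (len(h) + len(x) - 1)
    for i, hv in enumerate(h):
        for j, xv in enumerate(x):
            y[i + j] += hv * xv
    return y


def _add(a, b):
    return [p + q for p, q in zip(a, b)]


def nested_conv3(h, x, counter):
    """Noncyclic convolution of two length-3^M sequences.

    counter[0] is incremented once per scalar multiplication.
    """
    n = len(x)
    if n == 1:
        counter[0] += 1
        return [h[0] * x[0]]
    t = n // 3
    h0, h1, h2 = h[:t], h[t:2 * t], h[2 * t:]
    x0, x1, x2 = x[:t], x[t:2 * t], x[2 * t:]
    g0 = nested_conv3(h0, x0, counter)
    g1 = nested_conv3(h1, x1, counter)
    g2 = nested_conv3(h2, x2, counter)
    g3 = nested_conv3(_add(h0, h1), _add(x0, x1), counter)
    g4 = nested_conv3(_add(h1, h2), _add(x1, x2), counter)
    g5 = nested_conv3(_add(h2, h0), _add(x2, x0), counter)
    blocks = [
        g0,
        [a - b - c for a, b, c in zip(g3, g0, g1)],
        [a + b - c - d for a, b, c, d in zip(g5, g1, g0, g2)],
        [a - b - c for a, b, c in zip(g4, g1, g2)],
        g2,
    ]
    # Overlap-add the five length-(2t - 1) blocks at offsets 0, t, ..., 4t.
    y = [0] * (2 * n - 1)
    for i, block in enumerate(blocks):
        for j, v in enumerate(block):
            y[i * t + j] += v
    return y
```

    For $N = 9$ the counter reaches 36, compared with 81 for the direct method.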

    Similar results are possible with other lengths and a very general scheme can be developed for sequences of length

    $$ N = 2^M \cdot 3^S \cdots $$

    using different algorithms along different dimensions in a manner similar to using a multiple radix FFT. Because of the generality of the multidimensional formulation, there are mixtures of high-speed techniques that can be used.

    In some situations, it may be advantageous to combine transform techniques with the fast algorithms for short convolutions discussed in this paper. One such situation uses the Fermat number transforms [2]. As discussed in Section III, using a $b$-bit implementation of Fermat number transforms, the maximum cyclic convolution length is $4b$. To cyclically convolve sequences longer than this, we need to formulate a two-dimensional convolution as in Section II. Assume $x(n)$, $h(n)$, and $y(n)$ are sequences of length-($N = 8b$) each. Consider the formulation of (7), with $M = 4b$ and $L = 2$. Equation (7) can be rewritten as

    $$ \hat{Y}(0, m) = \sum_{p=0}^{M-1} \hat{H}(0, m - p)\,\hat{X}(0, p) + \sum_{p=0}^{M-1} \hat{H}(-1, m - p)\,\hat{X}(1, p) \tag{50} $$

    $$ \hat{Y}(1, m) = \sum_{p=0}^{M-1} \hat{H}(1, m - p)\,\hat{X}(0, p) + \sum_{p=0}^{M-1} \hat{H}(0, m - p)\,\hat{X}(1, p), \qquad m = 0, 1, \cdots, M - 1. \tag{51} $$

    In these equations, each summation represents cyclic convolution of length $4b$, which can be carried out by Fermat transforms. Taking length-$4b$ Fermat transforms of all the sequences in (51), we obtain

    $$ \begin{aligned} T\{\hat{Y}(0, l)\} &= T\{\hat{H}(0, l)\}\,T\{\hat{X}(0, l)\} + T\{\hat{H}(-1, l)\}\,T\{\hat{X}(1, l)\} \\ T\{\hat{Y}(1, l)\} &= T\{\hat{H}(1, l)\}\,T\{\hat{X}(0, l)\} + T\{\hat{H}(0, l)\}\,T\{\hat{X}(1, l)\}. \end{aligned} \tag{52} $$

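    These equations are similar to (19) and we could employ the tricks of (20) and (21) along the first dimension, giving

    $$ \begin{aligned} T\{\hat{Y}(0, l)\} &= [T\{\hat{H}(0, l)\} + T\{\hat{H}(-1, l)\}]\,T\{\hat{X}(1, l)\} + [T\{\hat{X}(0, l)\} - T\{\hat{X}(1, l)\}]\,T\{\hat{H}(0, l)\} \\ T\{\hat{Y}(1, l)\} &= [T\{\hat{H}(0, l)\} + T\{\hat{H}(1, l)\}]\,T\{\hat{X}(0, l)\} - [T\{\hat{X}(0, l)\} - T\{\hat{X}(1, l)\}]\,T\{\hat{H}(0, l)\}. \end{aligned} \tag{53} $$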

    Note that $\hat{H}(-1, m) = \hat{H}(1, m - 1)$; thus $\hat{H}(-1, l)$ is the cyclic shift by one position of $\hat{H}(1, l)$. We could employ the cyclic shift theorem to compute $T\{\hat{H}(-1, l)\}$ from $T\{\hat{H}(1, l)\}$. Assuming the $\hat{H}$ transforms are precalculated, this method requires computation of two length-$(4b)$ $\hat{X}$ transforms, $12b$ multiplications and $12b$ additions/subtractions to compute the $\hat{Y}$ transforms using (53), and two length-$(4b)$ $\hat{Y}$ inverse transforms. This is efficient because the only extra computation is $4b$ extra multiplications and is better than the two-dimensional Fermat number transform approach, which requires roughly twice the amount of computation. This use of $2 \times M$ convolution has a small computational advantage even when used with the FFT where, in effect, the last stage of the FFT algorithm is replaced by one of the fast length-two algorithms.
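    The structure of this $2 \times M$ scheme can be checked with a sketch that works in the sequence domain. It assumes the index map $n = i + 2m$ with $i \in \{0, 1\}$, and it replaces the Fermat transform products by direct length-$M$ cyclic convolutions, which satisfy the same identities by linearity. All names are illustrative.

```python
def cyclic_convolution(h, x):
    """Direct cyclic convolution y(n) = sum_k h((n - k) mod N) x(k)."""
    n = len(x)
    return [sum(h[(i - k) % n] * x[k] for k in range(n)) for i in range(n)]


def cyclic_convolution_2xM(h, x):
    """Length-2M cyclic convolution from three length-M cyclic convolutions."""
    M = len(x) // 2
    X0, X1 = x[0::2], x[1::2]                    # X(i, m) = x(i + 2m)
    H0, H1 = h[0::2], h[1::2]                    # H(i, m) = h(i + 2m)
    Hm1 = [H1[(m - 1) % M] for m in range(M)]    # H(-1, m) = H(1, m - 1)

    # These two sums depend only on h and could be precomputed.
    A = [a + b for a, b in zip(H0, Hm1)]
    B = [a + b for a, b in zip(H0, H1)]

    D = [a - b for a, b in zip(X0, X1)]
    shared = cyclic_convolution(H0, D)           # product common to both outputs
    Y0 = [u + v for u, v in zip(cyclic_convolution(A, X1), shared)]
    Y1 = [u - v for u, v in zip(cyclic_convolution(B, X0), shared)]

    y = [0] * (2 * M)
    y[0::2] = Y0
    y[1::2] = Y1
    return y
```

    In the transform domain each of the three length-$M$ convolutions becomes $M$ pointwise multiplications, which gives the $3M = 12b$ count quoted above when $M = 4b$.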

    There are many other possibilities of combinations of Fermat and Fourier transforms of various lengths and dimensions with short convolution algorithms or with the use of special hardware. The arbitrary order of the various operations can also be used to advantage. As pointed out by Rader [11] and illustrated in our discussion of partial outputs and in (53), it is often more efficient to take transforms along one dimension, then convolve along another before taking the inverse transform.

    To illustrate the efficiencies of some of the techniques of this paper, a comparison of a particular case will be made with direct and FFT implementations. Consider the problem of noncyclic convolution of two length-($N = 2^M$) sequences to give an output of length $2N - 1$ as described in (34). The number of multiplications per output point will be calculated for three implementations. First, consider a direct implementation which requires $N^2$ multiplications for


    $2N - 1$ output points (we will approximate this by $2N$ for simplification). This gives for multiplications per output point

    $$ F_0 = \tfrac{1}{2} N. $$

    If the FFT is used in an efficient way, taking advantage of the fact that the data are real and using a mixed radix algorithm (2, 4, and 8), the number of multiplications necessary per output point can be calculated as described by Singleton [13] and is denoted $F_1$.

    Using the $M$-dimensional formulation with the fast algorithm of (33) or (44) gives for the multiplications per output point

    $$ F_2 = \frac{1}{2N}\, 3^M. $$

    Table II compares these functions for various lengths up to 1024. Note the multidimensional implementation is more efficient than a direct implementation for all lengths and requires fewer multiplications than the FFT for lengths up to 128. If a less efficient FFT implementation requiring $4 \log_2 N + 4$ multiplications per output point is used, the crossover length is above 2048.
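    The two closed-form columns and the second crossover can be reproduced with a few lines. The sketch assumes $F_0 = N/2$ and $F_2 = 3^M / (2N)$ as stated above and takes the logarithm in the FFT cost to be base two; $F_1$ is not computed because it depends on the details of Singleton's algorithm [13]. Function names are illustrative.

```python
from math import log2


def multiplications_per_output(M):
    """Return (N, F0, F2) for noncyclic convolution of length N = 2^M."""
    N = 2 ** M
    f0 = N / 2                  # direct: N^2 multiplications, about 2N outputs
    f2 = 3 ** M / (2 * N)       # M-dimensional method: 3^M multiplications
    return N, f0, f2


def crossover_length(fft_cost=lambda N: 4 * log2(N) + 4):
    """Smallest N = 2^M at which F2 exceeds the given FFT cost per output."""
    M = 1
    while 3 ** M / (2 * 2 ** M) <= fft_cost(2 ** M):
        M += 1
    return 2 ** M


if __name__ == "__main__":
    for M in range(1, 11):
        N, f0, f2 = multiplications_per_output(M)
        print(f"{N:5d} {f0:6.0f} {f2:7.2f}")
    print("crossover at N =", crossover_length())
```

    Under these assumptions the multidimensional method still wins at $N = 2048$ and first loses at $N = 4096$.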

    The multidimensional approach can also be used in the same way the FFT is used to implement ongoing processing or sectioning [4]. In contrast to use with the FFT, this approach is most efficient when the length of the section or block is the same as the length of the convolution operator.

    Initial observations indicate that block implementation of recursive filters [3] becomes more attractive when used with the techniques described in this paper. To illustrate this, we will again consider the multiplication efficiencies for three realizations. First consider a recursive filter with an equal order numerator and denominator of $N = 2^M$. The multiplications per output point for direct implementation are $F_0 = 2N$. Using efficient FFT algorithms, the results given by the methods in [3] are denoted $F_1$. Using three convolutions by (33) or (44) to implement the block recursive filter with the block length equal to the order gives as the multiplications per output point

    $$ F_2 = \frac{3 \cdot 3^M}{N}. $$

    Table III compares these multiplication efficiencies for orders up to 128. Note the multidimensional approach is more efficient than the direct for orders above three and more efficient than the FFT for orders up to about 256. This is yet to be explored in detail.

    TABLE II
    A Comparison of Multiplication Efficiencies for Three
    Implementations of Length-N Convolution

        N      F0      F1       F2
        2       1     1.5     0.75
        4       2    2.25     1.12
        8       4    4.12     1.70
       16       8    5.06     2.53
       32      16    6.03     3.80
       64      32    8.01     5.70
      128      64    9.00     8.54
      256     128   10.00    12.81
      512     256   12.00    19.22
     1024     512   13.00    28.83

    TABLE III
    A Comparison of Multiplication Efficiencies for Three
    Implementations of Block Recursive Filters of Order N

        N      F0      F1       F2

    4

    24.5

    8

    8

    21 6.7

    16

    16 31

    32

    10.1

    38 15.2

    32

    ~

    ~~

    645

    64

    22.8

    128

    128

    53

    2568

    34.2

    51.2

    XI. Conclusions

    This paper has presented two formulations of convolution in terms of multidimensional convolution: one based on a generalization of the overlap-save algorithm for sectioning and the other on the overlap-add algorithm. The first proved to be well suited for cyclic and constant-diagonal convolution and the second for noncyclic convolution. Fast algorithms were developed based on length-two and -three convolution that lead to an improvement in multiplication efficiency. The reduction in required word lengths proved to be a valuable feature when used with the Fermat number transform. The formulation proved to be well suited for the special cases where unequal-length sequences were convolved or where only a portion of the output was desired. It was further shown that various mixtures of algorithms could be used along the different dimensions to achieve certain advantages or to fit particular requirements. Finally, examples were presented to compare the multiplication efficiencies of a few implementations.

    The formulation is so general that a complete and systematic investigation of all possible applications is difficult. The main ideas and relations to other works that we know of have been presented here. The investigation of word length and storage requirements and a more complete consideration of recursive implementations are still to be made.

    Acknowledgment

    The authors would like to thank R. A. Meyer for valuable discussions.


    References

    [1] C. M. Rader, “Discrete convolutions via Mersenne transforms,” IEEE Trans. Comput., vol. C-21, pp. 1269-1273, Dec. 1972.
    [2] R. C. Agarwal and C. S. Burrus, “Fast digital convolutions using Fermat transforms,” in Southwestern IEEE Conf. Rec., Houston, Tex., Apr. 1973, pp. 538-543.
    [3] C. S. Burrus, “Block realization of digital filters,” IEEE Trans. Audio Electroacoust., vol. AU-20, pp. 230-235, Oct. 1972.
    [4] B. Gold and C. M. Rader, Digital Processing of Signals. New York: McGraw-Hill, 1969, pp. 208-211.
    [5] R. A. Meyer and C. S. Burrus, “Certain properties of periodically time-varying digital filters,” in Southwestern IEEE Conf. Rec., Houston, Tex., Apr. 1973, pp. 529-535.
    [6] D. P. MacAdam, “Image restoration by constrained deconvolution,” J. Opt. Soc. Amer., vol. 60, pp. 1617-1627, Dec. 1970.
    [7] D. A. Pitassi, “Fast convolution using the Walsh transforms,” in Proc. Conf. Applications of Walsh Functions, Washington, D.C., Apr. 1971, pp. 130-133.
    [8] P. J. W. Rayner, “A fast cyclic convolution algorithm,” presented at Symp. Digital Filtering, Imperial College, London, England, Aug. 1971.
    [9] W. F. Davis, “A class of efficient convolution algorithms,” in Proc. Symp. Applications of Walsh Functions, Washington, D.C., Mar. 1972, pp. 318-329.
    [10] J. C. Allwright, “Real factorization of noncyclic convolution operators with applications to fast convolution,” Electron. Lett., vol. 7, pp. 718-719, Dec. 1971.
    [11] C. M. Rader, “An improved algorithm for high speed autocorrelation with applications to spectral estimation,” IEEE Trans. Audio Electroacoust., vol. AU-18, pp. 439-441, Dec. 1970.
    [12] G. S. Robinson, “Logical convolution and discrete Walsh and Fourier power spectra,” IEEE Trans. Audio Electroacoust., vol. AU-20, pp. 271-280, Oct. 1972.
    [13] R. C. Singleton, “An algorithm for computing the mixed radix fast Fourier transform,” IEEE Trans. Audio Electroacoust., vol. AU-17, pp. 93-103, June 1969.

    Digital Notch Filter Design Procedure

    JAMES A. CADZOW

    Abstract-An analytical procedure for designing a linear digital notch filter is presented. The resultant filter is sixth-order and is implemented by cascading three second-order filters so as to avoid instability which may arise from computer coefficient truncation. The procedure outlined is straightforward, requires only simple algebraic steps, and gives filter parameter selection criteria for reducing the effects of computer coefficient truncation.

    Notch filters have utility in situations where a desired signal is corrupted by an additive sinusoidal pickup. One thus must process the noisy signal so as to remove the sinusoid without significantly distorting the desired signal.

    I. Introduction

    A frequently occurring linear data filtering application occurs when one wishes to process a signal of the form

    $$ u(t) = s(t) + A \sin \omega_0 t $$

    so as to remove the additive sinusoidal component, $A \sin \omega_0 t$, without seriously distorting the desired signal $s(t)$. The situation in which a 60-Hz sinusoidal pickup corrupts a desired measurement signal nicely illustrates how such problems can arise in practice.

    The ideal filter for the above application would then have a response $s(t)$ to the input $u(t)$. In the frequency domain, the required linear filter would have a gain of one for all frequencies except at $\omega_0$, where its gain is zero. As such, this processor is typically called a notch filter with notch at $\omega_0$. Its gain-frequency behavior is depicted in Fig. 1.

    Unfortunately, the ideal notch filter is not physically realizable and must be approximated in practice. If one attempted to implement a notch filter approximation using analog devices (i.e., resistors, capacitors, and inductors), one would quickly realize the futility of this approach. On the other hand, one may readily design a digital filter whose frequency behavior closely resembles that shown in Fig. 1 (e.g., see [1]-[4]).

    The approach to be taken in this paper is to then uniformly sample the signal $u(t)$ (every $T$ seconds) and use the resulting sequence as the input to a digital filter governed by

    $$ y(k) = b_0 u(k) + b_1 u(k - 1) + \cdots + b_m u(k - m) - a_1 y(k - 1) - a_2 y(k - 2) - \cdots - a_n y(k - n) $$

    where $u(k)$ and $y(k)$ denote the values of the filter input and output signals, respectively, at the $k$th iteration. A procedure for selecting the coefficients $a_i$ and $b_i$ whereby the filter's gain factor-frequency behavior will be similar to that shown in Fig. 1 will be shortly given. One must realize, however, that since the filter is digital, its frequency behavior will be periodic with period $2\pi/T$ (e.g., [1, p. 297]) and will appear as shown in Fig. 2.
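    The difference equation above can be evaluated sample by sample. The following is a minimal sketch, assuming zero initial conditions (all samples before $k = 0$ are taken as zero); the function name and argument layout are illustrative and the coefficient values in the usage note are arbitrary, not a notch design.

```python
def digital_filter(b, a, u):
    """Evaluate y(k) = sum_i b[i] u(k-i) - sum_i a_i y(k-i).

    b = [b0, b1, ..., bm] are the feedforward coefficients.
    a = [a1, a2, ..., an] are the feedback coefficients (a[0] holds a1).
    Samples with negative index are assumed to be zero.
    """
    y = []
    for k in range(len(u)):
        acc = sum(b[i] * u[k - i] for i in range(len(b)) if k - i >= 0)
        acc -= sum(a[i - 1] * y[k - i]
                   for i in range(1, len(a) + 1) if k - i >= 0)
        y.append(acc)
    return y
```

    For example, `digital_filter([1], [-0.5], [1, 0, 0, 0])` realizes $y(k) = u(k) + 0.5\,y(k - 1)$ and returns the first four samples of its impulse response.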

    The selection of the sampling period $T$ to be used is of great importance from a number of viewpoints. Most importantly, it must be chosen small enough so that little distortion results from the analog-to-digital conversion of the desired signal $s(t)$. Quantitative