Signal Processing with Lapped Transforms


For a complete listing of the Artech House Telecommunications Library, turn to the back of this book.

Henrique S. Malvar
Universidade de Brasília, Brazil

ARTECH HOUSE
Boston • London

Library of Congress Cataloging-in-Publication Data

Malvar, Henrique S., 1957-
Signal processing with lapped transforms / Henrique S. Malvar.
p. cm.
Includes bibliographical references and index.
ISBN 0-89006-467-9
1. Signal processing - Digital techniques. 2. Signal processing - Mathematics. 3. Transformations (Mathematics). I. Title. II. Title: Lapped transforms. III. Series.
TK5102.2.M275 1991
621.382'2-dc20

British Library Cataloguing in Publication Data

Malvar, Henrique S.
Signal processing with lapped transforms.
I. Title.
621.3822
ISBN 0-89006-467-9

© 1992 ARTECH HOUSE, INC.
685 Canton Street
Norwood, MA 02062

All rights reserved. Printed and bound in the United States of America. No part of this book may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without permission in writing from the publisher.

International Standard Book Number: 0-89006-467-9
Library of Congress Catalog Card Number: 91-35984

10 9 8 7 6 5 4 3 2 1
Contents

1.1.2 Power Spectrum 4
1.1.3 Autoregressive Models 7
1.1.4 Spectral Flatness 8
1.2 Block Transforms 9
1.2.1 Basic Concepts 10
1.2.4 Karhunen-Loève Transform 15
1.2.6 Type-IV Discrete Cosine Transform 20
1.2.7 Other Transforms 21
1.2.8 Two-Dimensional Transforms 22

2.1 Signal Filtering 32
2.1.2 Multichannel Filtering 38
2.1.3 Adaptive Filtering 38
2.2 Spectrum Estimation 44
2.3 Transform Coding 47
2.4 Other Applications 55
2.5 Fast Algorithms 55
2.5.1 Discrete Fourier Transform 56
2.5.2 Discrete Hartley Transform 63
2.5.3 Discrete Cosine Transform 67
2.5.4 Type-IV Discrete Cosine Transform 71
2.6 Summary 75

3 Signal Processing in Subbands 81
3.1 Multirate Signal Processing 82
3.1.1 Decimation and Interpolation 82
3.1.2 Cascade Connections 86
3.1.3 Polyphase Decompositions 88
3.2 Filter Banks 89
3.2.1 Structures FB-I and FB-II 89
3.2.2 Signal Reconstruction 91
3.2.3 Computational Complexity 93
3.2.4 DFT Filter Banks 94
3.3 Quadrature Mirror Filters 100
3.3.1 Two-channel QMF Banks 100
3.3.2 QMF Banks for M > 2 106
3.4 Perfect Reconstruction 109
3.4.1 Two-channel PR Filter Banks 110
3.4.2 PR Filter Banks for M > 2 119
3.5 Block Transforms versus Filter Banks 127
3.6 Applications 130
3.6.1 Signal Filtering 130
3.6.2 Adaptive Filtering 133
3.6.3 Spectrum Estimation 134
3.6.4 Signal Coding 134
3.6.5 Other Applications 135
3.7 Summary 136

4 Lapped Orthogonal Transforms 143
4.1.2 Connection with Filter Banks 148
4.2 The Lapped Orthogonal Transform 152
4.2.1 Recursive LOT Optimization 152
4.2.2 Quasioptimal LOTs 155
4.3 Fast Algorithms for the LOT 161
4.3.1 Structure of the Fast LOT 161
4.3.2 LOTs of Finite-Length Signals 166
4.4 Fast LOT for M > 16 167
4.5 Coding Performance 170
4.6 Summary 171

5 Modulated Lapped Transforms 175
5.1 The MLT 176
5.2 Extended Lapped Transforms 180
5.2.1 Perfect Reconstruction 181
5.2.2 Properties 185
5.3 Design Techniques 190
5.4.1 Fast ELT for K = 1
5.4.2 Fast ELT for K = 2
5.4.4 Computational Complexity
5.5 Coding Performance

6.1
6.2.2 Generalized HLT
6.3.2 Equivalent Subband Filters
6.3.4 Regularity
6.3.6 Computational Complexity

7.1 Spectrum Estimation 247

Appendix B Programs for Block Transforms
Appendix C Tables of ELT Butterfly Angles
Index
Preface
Digital signal processing (DSP) has been a growing field for more than three decades. With the availability of fast integrated circuits dedicated specifically to DSP applications, we now live in a world where DSP is not just a hot research topic, but part of our everyday life. If we look at what is attached to a standard telephone line in our modern office, for example, we see modems, fax machines, and tapeless answering machines, all of which could not exist without DSP. We could certainly spend many pages describing examples of DSP applications.

Advances in DSP have been so many that specialized areas within it are themselves becoming new fields. Among them, we may cite speech processing, image processing, adaptive filtering, and multirate signal processing. In all of these areas, fast transforms are frequently used, because it is often more efficient to process a signal in the transform domain.

The purpose of this book is to present to the reader a complete reference on the theory and applications of a new family of transforms, called lapped transforms (LTs). These transforms can be used in many applications, such as filtering, coding, spectral estimation, and any others where a traditional block transform is employed, such as the discrete Fourier transform (DFT) or the discrete cosine transform (DCT). In many cases, LTs will lead to a better complexity versus performance trade-off than other transforms. Until now, the theory of lapped transforms and many of their applications have appeared in theses, journal articles, and conference proceedings. This is the first book in which all of the known results are put together in an organized form.

We believe that this book is a useful reference for design engineers, graduate students, and researchers involved with DSP applications that make use of fast transforms. Many signal processing systems employing fast transforms are presented, as well as evaluations of implementations of those systems. Thus, the reader with a practical application in mind will be able to put LTs to work to his or her benefit
immediately. For that purpose, we have included in the appendices listings of computer programs with fast algorithms for lapped transforms, as well as programs for traditional block transforms based on optimized algorithms.

As is the case with any new topic, there are many interesting theoretical issues involving LTs, for example, the relationships that exist among LTs, multirate filter banks, and discrete wavelet transforms. Throughout the book, there are sections devoted entirely to these and other theoretical aspects. The reader is only assumed to have a solid background in the basic theory of discrete-time signal processing, including the fundamentals of random signal representation and introductory linear algebra. Undoubtedly, this book will be even more useful to those already familiar with the implementation of signal processing systems that employ fast transforms or filter banks.
Chapter 1 starts with a brief review of signal models and the basic definitions and properties of traditional block transforms and lapped transforms. A brief history of the development of lapped transforms is also presented. In Chapter 2, some of the applications of block transforms are discussed, with emphasis on the DFT, DHT, and the DCT. The current state of the art of fast algorithms for these transforms is reviewed, and the best known procedures for their computation are presented.

The basics of multirate signal processing are discussed in Chapter 3, with the purpose of studying maximally decimated filter banks, which are essential building blocks of subband coding systems. Special attention is given to quadrature mirror filter (QMF) banks and perfect reconstruction (PR) filter banks, including conjugate quadrature filters (CQF). The chapter discusses the fundamental idea that transform coding is in fact a special case of subband coding, and also discusses briefly the applications of subband signal processing.

In Chapter 4, the theory and properties of the lapped orthogonal transform (LOT) are studied in detail. The theoretical aspects leading to the PR property of lapped transforms are discussed, within the context that a lapped transform is a natural extension of a regular block transform. This extension is directed towards turning the transform into a filter bank with improved frequency resolution. Design techniques and fast algorithms for the LOT are presented. The coding performance of the LOT, which is better than that of the DCT, is also discussed.

The modulated lapped transform (MLT) family of LTs is studied in Chapter 5, together with its generalized version, the extended lapped transform (ELT). A detailed discussion of the design techniques for the generation of optimized MLTs and ELTs is presented. Fast algorithms for the MLT and ELT are also presented at
a level of detail previously unavailable in the literature. The chapter ends with a theoretical discussion of the coding performance of the MLT and ELT.
The hierarchical lapped transform (HLT) is discussed in Chapter 6. HLTs are useful for multiresolution signal analysis and coding because HLTs are in fact filter banks with subbands of unequal widths, and impulse responses of different lengths. Tree structures for the HLT are discussed, as well as the connections with the discrete wavelet transform.

Applications of lapped transforms are discussed in Chapter 7. Examples of the use of LTs in signal filtering are considered, with emphasis on adaptive filters and variable filters. The use of LTs in signal coding is also discussed, with many examples of the results obtained with speech and image coders based on LTs. From these results, it becomes clear that one of the main advantages of lapped transforms over traditional block transforms is the strong reduction in the discontinuities in the reconstructed signal at the block boundaries, the so-called blocking effects.
The appendices present valuable information for the reader interested in putting the ideas in this book to work immediately in his (or her) application that requires a block transform or a filter bank. When the desired number of bands is two, a good alternative for the implementation of perfect-reconstruction filter banks is the conjugate quadrature filters (CQFs). Thus, we have included a table of CQF coefficients in Appendix A. Computer programs for fast computation of some of the most commonly used block transforms are presented in Appendix B. In Appendix C there are several tables of butterfly angles for the MLT, and in Appendix D computer programs for fast computation of LTs are presented. The programs are all written in the C language for increased portability.
Acknowledgements
There are many people who had a strong influence on the material presented in this book, and whom I would like to thank. Professor David H. Staelin, my thesis supervisor at M.I.T. and a good friend, has always been very supportive and encouraging, giving me the right advice on everything. He was the originator of the term lapped transform. The research team that developed the basic ideas behind the lapped orthogonal transform at M.I.T., back in 1984, also included Philippe Cassereau, Brian Hinman, and Jeff Bernstein. Many encouraging discussions on the theory and applications of lapped transforms and related topics were held with Dr. A. Briançon, C. Clapp, M. Cruvinel, Dr. G. de Jager, Prof. P. S. R. Diniz, R. Duarte, Dr. P. Duhamel, Dr. S. Ericsson, Dr. D. Le Gall, Dr. F. A. O. Nascimento, A. Popat, Prof. K. R. Rao, R. Saraiva Jr., Dr. J. Shapiro, Prof. M. J. T. Smith, Dr. B. Szabo, Dr. A. Tabatabai, Prof. P. P. Vaidyanathan, Prof. M. Vetterli, Prof. A. S. Willsky, and Dr. G. Wornell.

In particular, I would like to express my sincere thanks to Ricardo L. de Queiroz for his many suggestions on the manuscript, and for carrying out all of the computer simulations of the applications of lapped transforms to image processing. I am also thankful to Eduardo M. Rubino, for writing a family of TeX device drivers that allowed this book to be entirely typeset by the author. Eduardo also wrote the software that produced the half-tone images of Chapter 7. The encouragement that I received from Mark Walsh, Pamela Ahl, Kim Field, and Dennis Ricci of Artech House helped keep my peace of mind as I was writing this book. It was certainly a pleasure working with them.

The financial support from the Brazilian Government, through the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), is gratefully acknowledged. CNPq supported most of my research on lapped transforms since 1987 through grants nos. 404.963-87, 404.519-88, 300.159-90, and 600.047-90.

Finally, a special note of gratitude goes to my wife Regina Helena, my daughter Ana Beatriz, and my son Henrique. Each hour spent writing this book was an hour taken away from them; and there were many, many such hours. Without their patience and understanding, this book could not have been written.
TeX is a trademark of the American Mathematical Society.
Chapter 1
Introduction
This is a book on signal processing with lapped transforms (LTs). At first, this might seem to be an obscure subject because LTs are relatively new. This is not so, however, and LTs are becoming more attractive for a wide variety of applications. This is mainly because LTs are a special family of filter banks that can be easily designed and implemented, even for a large number of subbands. Throughout this book, we shall see that LTs can lead to better system performance than other more usual transforms, like the discrete Fourier transform (DFT) or discrete cosine transform (DCT), in applications such as image coding, speech coding, and adaptive filtering. In any application where a block transform or a filter bank is employed, a lapped transform can also be used, since block transforms and lapped transforms can always be viewed as special cases of filter banks. As we will see in later chapters, in many cases LTs will lead to a better signal representation or reduced computational complexity, or both, when compared to the most commonly used block transforms or filter banks.

Before we start discussing LTs, it is important that we review the basics of discrete-time signal representation, so that we make clear what we mean by a signal. This is what we shall do in Section 1.1, where we will follow a statistical approach toward signal modeling. Looking at signals as sample functions of stochastic processes helps to predict quite accurately average system performance. We must also review the basic concepts behind traditional block transforms to support our later discussion of LTs, and this is the goal of Section 1.2. A brief introduction to lapped transforms, including a review of their history, is presented in Section 1.3.
1.1 Signal Models
We call a signal a sequence of real numbers that carry some meaningful information. Such a sequence of numbers generally comes from periodic sampling of a physical waveform, as is the case in speech and image processing. However, it can also represent an economic time series, such as the daily Dow Jones index, for example. The length of the signal can be finite or not, and that makes little difference in most situations. When we refer to signals as sequences of numbers, we are already considering them as functions of discrete time (or discrete space, according to the independent variable) n, and so we will not consider throughout this book signals that are continuous waveforms. This is nowadays a natural trend in signal processing; whenever we can process a signal digitally, it is generally cheaper and more efficient to do so than to use analog signal processing.

In order to process signals digitally, we need not only discrete time but also discrete amplitudes, because we must allocate a finite number of bits of storage to represent a number in a digital computer. However, as usual in most texts dealing with digital signal processing, we will always assume that the signal amplitudes are continuous. Thus, each signal sample is represented by a real number. Whenever appropriate, we will discuss briefly the effects of discretizing the signal amplitudes. In most cases we will simply assume that the errors in the signal representation due to finite precision storage are small enough to be negligible.
1.1.1 Deterministic and Stochastic Models
Depending on the application, signals can be viewed either as deterministic or as random waveforms. We say that a signal is deterministic when it is specified by a table of values or by some mathematical function that defines the values of each sample (some parameters of the function may be unknown). An example of a deterministic waveform is a sinusoid, as shown in Fig. 1.1, which may be generated by a modem in a digital communication system. For a random signal, we may not know its precise waveform in a particular observation of it, but we have a probabilistic model or measured statistics that describe some properties of the signal, such as mean, variance, and autocorrelation.

Random signals are good models for real-world signals such as speech and images [2]. Two examples of random signals are also shown in Fig. 1.1, where we have a typical sample waveform of a white noise process (in which all frequencies from 0 to π appear with equal contribution), and also a sample waveform of a colored random signal (in which some frequency components have higher energies than others).

Figure 1.1. Examples of signals. Top: deterministic; middle: random, white; bottom: random, colored. Note that these are discrete-time signals; the points corresponding to sample values were connected by straight lines for easier viewing.

A deterministic waveform x(n) may be characterized by a simple equation, for example,

x(n) = sin(ω0 n)

Note that x(n) means the value of the signal x at the time index (or space coordinate index) n. In the above example, x(n) is just a sampled sinusoid at a frequency of ω0 radians.

For a random signal, second-order statistics are often a sufficient characterization, particularly when we are dealing with linear processing [2]. For example, a stationary white noise signal w(n) is defined by having zero mean,

E[w(n)] = 0
and an autocorrelation sequence (or autocorrelation function) that is a scaled impulse,

Rww(n) = E[w(m) w(m − n)] = σw² δ(n)

where E[·] denotes expected value [3]. Note that σw² is the variance of each sample of w(n), i.e., σw² = E[w²(n)].

A white noise signal is a good model for some noises that are encountered in the real world, for example, background noise in radio communications. However, there are other noise sources that may be colored, i.e., their frequency components may have unequal energies. A colored noise x(n) may be generated by passing white noise through a linear system, in the form

x(n) = w(n) * h(n)     (1.1)

where h(n) is the impulse response of the system, and * denotes convolution.

Statistically speaking, many signals like speech and images behave like colored noise (at least in a short time or space scale), so that the model in (1.1) can also be used as a good one for the signals themselves [2]. The autocorrelation function of x(n) as generated in (1.1) is given by [4]

Rxx(n) = E[x(m) x(m − n)] = σw² h(n) * h(−n)     (1.2)

1.1.2 Power Spectrum

Autocorrelation functions describe the behavior of a random signal in the time domain. We see from (1.2) that if h(n) is a strong filter, i.e., if h(n) is quite different from an impulse, then the autocorrelation of x(n) will also be quite different from an impulse. Following this line of thought, if h(n) is a low-pass filter, with the values of h(n) decaying slowly with n, then Rxx(n) will also decay slowly with n, and thus samples of the signal x(n) that are close in time or space will be strongly correlated. Therefore, we conclude that if h(n) is a low-pass filter, then the random signal x(n) will look like a slowly-varying noise. We could reach the same conclusions by looking at the generation of a colored random signal in the frequency domain. For that purpose, we need the definition of the power density spectrum, or simply the power spectrum of a random signal, which is the Fourier transform of its autocorrelation [4],

Sxx(e^jω) = F{Rxx(n)} = Σ_{n=−∞}^{∞} Rxx(n) e^{−jωn}     (1.3)

where F{·} is the Fourier transform operator and j = √−1. We recall that, Rxx(n) being a discrete-time signal, the frequency variable ω is measured in radians and varies from −π to π.

The autocorrelation can be recovered simply by the inverse Fourier transform of the power spectrum [4], that is,

Rxx(n) = (1/2π) ∫_{−π}^{π} Sxx(e^jω) e^{jωn} dω

We should be careful not to look at the power spectrum as being the same thing as the Fourier transform of the signal itself. Recall that the Fourier transform of x(n) (also referred to as the spectrum of x(n) when the signal is deterministic) is given by [1]

X(e^jω) = Σ_{n=−∞}^{∞} x(n) e^{−jωn}

and so X(e^jω) is a random variable for every ω because it is a function of the random signal x(n), whereas Sxx(e^jω) is a deterministic function of ω. The relationship between these two is simple and intuitive, though. In order to see that, consider a block of M samples of x(n), and assume that (1.3) has been computed over such a block. Then, define the periodogram [2, 4] of that signal block as

Pxx(e^jω) = (1/M) |X(e^jω)|²     (1.4)

It is easy to verify that the power spectrum is simply the expected value of the periodogram,

Sxx(e^jω) = E[Pxx(e^jω)]     (1.5)
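The averaging suggested by (1.4) and (1.5) is easy to illustrate numerically. The sketch below (in Python for brevity, rather than the C used in the book's appendices; the function names are our own) averages the periodograms of successive M-sample blocks of a white noise signal. As (1.5) predicts, the average should approach the flat spectrum Sww(e^jω) = σw² at every frequency.

```python
import cmath
import random

def periodogram(block):
    """Pxx at the M DFT frequencies w_k = 2*pi*k/M: (1/M) |X(e^{j w_k})|^2."""
    M = len(block)
    X = [sum(block[n] * cmath.exp(-2j * cmath.pi * k * n / M) for n in range(M))
         for k in range(M)]
    return [abs(Xk) ** 2 / M for Xk in X]

def averaged_periodogram(x, M):
    """Average of the periodograms of the consecutive M-sample blocks of x."""
    nblocks = len(x) // M
    acc = [0.0] * M
    for b in range(nblocks):
        for k, p in enumerate(periodogram(x[b * M:(b + 1) * M])):
            acc[k] += p
    return [a / nblocks for a in acc]

rng = random.Random(1)
sigma_w = 1.0
w = [rng.gauss(0.0, sigma_w) for _ in range(4096)]   # white noise, variance sigma_w^2
S_est = averaged_periodogram(w, M=32)                # estimate of Sww(e^jw)
```

With 128 blocks averaged, every frequency bin of `S_est` should hover near σw² = 1, whereas a single periodogram fluctuates wildly.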
Thus, whereas X(e^jω) is the Fourier spectrum of a particular sample signal of the stochastic process x(n) (and therefore it specifies the frequency components of that particular sample signal), Sxx(e^jω) is the expected energy at frequency ω for the whole ensemble of functions in the process x(n). Hence, the power spectrum Sxx(e^jω) defines the average frequency distribution of energy for the random signal x(n). These concepts are illustrated in Fig. 1.2.

Figure 1.2. Examples of spectra. Top: spectrum of a deterministic sinusoid; middle: spectrum of a white noise; bottom: spectrum of a colored signal.

Looking back at (1.2), its frequency domain equivalent is easy to obtain by means of the convolution theorem [1], with the result

Sxx(e^jω) = σw² |H(e^jω)|²     (1.6)

where H(e^jω) = F{h(n)}. Therefore, for a given signal that we want to model as in (1.1), we can use an average of several periodogram measurements as a good estimate of Sxx(e^jω). Finding a filter h(n) that satisfies (1.6) is an approximation problem that can be easily solved with a number of techniques [1, 5, 6].

1.1.3 Autoregressive Models

For a wide class of signals, including speech and images, there is a particular form of H(e^jω) in (1.6) that is good enough for most applications. This form is an all-pole filter, with a z-transform that can be written as

H(z) = 1 / (1 − Σ_{m=1}^{N} a_m z^{−m})     (1.7)

or, in the Fourier domain (z = e^jω),

H(e^jω) = 1 / (1 − Σ_{m=1}^{N} a_m e^{−jωm})

which corresponds to the following difference equation

x(n) = Σ_{m=1}^{N} a_m x(n − m) + σw w(n)     (1.8)

We note that the above system has no memory of the input, but it memorizes N previous outputs, which are used together with the current input to generate the current output. Because of this, the random signal x(n) is referred to as an autoregressive signal [4] of order N, in short AR(N). A block diagram of the AR(N) model is shown in Fig. 1.3.

Figure 1.3. Generation of an AR(N) signal.

Autoregressive models are quite useful in image and speech processing. Based on these models, it is possible to compare objectively different signal processing systems. For speech, a model order N between 10 and 20 is good enough to represent the resonances of the vocal tract [2, 7]. Recently, with the development of the low-delay CELP (code-excited linear prediction) coders for speech, model orders of up to 50 have been suggested [8], with the goal of better modeling the pitch periodicity in female speech.

For images, a first-order model AR(1) is adequate in many cases [2, 9, 10]. It is representative of the image when we look at it as a one-dimensional signal, either by scanning its rows or columns. Strictly speaking, however, an image is a two-dimensional signal of the form x(n, m), where x represents the light intensity at the point (n, m) of the image plane. So, its autocorrelation function is also two-dimensional, and denoted by Rxx(n, m), where n and m now represent the lags in the horizontal and vertical directions. It is usual, in practice, to assume that the image can be modeled as a separable random process [9], i.e., its autocorrelation function can be written in the form

Rxx(n, m) = Rxx(n) Rxx(m)     (1.9)

where Rxx(n) and Rxx(m) are the one-dimensional autocorrelation functions along the rows and columns. The model in (1.9) is valid only approximately for real-world images, but the approximation is good enough for many applications. This separability of the autocorrelation function implies that we can use separable transforms to process the image, without any loss in performance, as we shall discuss in Section 1.2.
1.1.4 Spectral Flatness

From the discussion in the preceding subsections, it is clear that stochastic models for signals, based on autocorrelations and power spectra, should be adequate for many applications. Before we end this section, though, we discuss briefly one important consequence of the random signal models, which is the spectral flatness measure. For a signal x(n) with power spectrum Sxx(e^jω), the spectral flatness measure is defined by [2]

γx² = exp[ (1/2π) ∫_{−π}^{π} ln Sxx(e^jω) dω ] / [ (1/2π) ∫_{−π}^{π} Sxx(e^jω) dω ]     (1.10)

From the above definition, it follows that [2]

0 < γx² ≤ 1

A white noise signal w(n) with Rww(n) = σw² δ(n) has Sww(e^jω) = σw², which leads to γw² = 1. A colored signal x(n) will have γx² < 1 [2], and this is the reason for the name spectral flatness. The lower the measure, the farther Sxx(e^jω) will be from being flat.

Later in the book, we will make use of the concept of the distortion-rate function, which is related to the spectral flatness measure. For signal sources that are stationary and ergodic, the distortion-rate function D(R) specifies the minimum mean-square distortion D that could be achieved at an average rate of R bits per random variable [11]. Therefore, the distortion-rate function provides a lower bound on the fidelity that can be achieved by any coder, for a given communication channel. As intuitively expected, D(R) is a monotonically non-increasing function of R, with D → 0 as R → ∞. Thus, it is an invertible function, and its inverse is called the rate-distortion function, R(D).

For a stationary and ergodic source with a Gaussian probability density function, the distortion-rate function is given by [2]

D(R) = γx² σx² 2^{−2R}

Ideal (unrealizable) vector quantization of the signal at an average rate of R bits per sample would lead to a distortion DQ given by [2]

DQ = σx² 2^{−2R}

Thus, the spectral flatness measure γx² is also a measure of how much reduction in mean-square error (over straight signal quantization) we could obtain from the best possible coder. Knowledge of such a bound is certainly quite useful when we are designing a signal coding system, because it will tell us how much room there is for improvement in our particular system.
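Equation (1.10) can be evaluated numerically by replacing the two integrals with averages over a dense frequency grid. The sketch below (Python, illustrative; the names are our own) computes the flatness of a white spectrum, and of the spectrum obtained from (1.6) and (1.7) with N = 1; for the latter, a standard closed-form value is 1 − a1², consistent with γx² < 1 for colored signals.

```python
import cmath
import math

def spectral_flatness(S):
    """Discrete approximation of (1.10): exp(mean of ln S) divided by mean of S."""
    npts = len(S)
    geometric = math.exp(sum(math.log(s) for s in S) / npts)
    arithmetic = sum(S) / npts
    return geometric / arithmetic

K = 4096                       # frequency grid w_k = 2*pi*k/K over one full period
a1, sigma_w = 0.9, 1.0
flat = spectral_flatness([sigma_w ** 2] * K)          # white spectrum: flatness 1
ar1_spectrum = [sigma_w ** 2 / abs(1.0 - a1 * cmath.exp(-2j * math.pi * k / K)) ** 2
                for k in range(K)]                    # (1.6) with the all-pole H of (1.7)
shaped = spectral_flatness(ar1_spectrum)
# For this first-order model, shaped should be close to 1 - a1**2 = 0.19
```

The stronger the spectral shaping (a1 closer to 1), the smaller γx², and, by the two distortion expressions above, the larger the potential coding gain over straight quantization.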
1.2 Block Transforms
Transforms are frequently used in signal processing, and their importance can be
verified by the number of books written about the subject, for example [9, 10, 12].
This is because transforms can be employed in many basic signal processing operations, such as filtering, correlation, spectrum estimation, and signal enhancement and coding. In this section we review the definitions and properties of block transforms in general, and then we review the specific properties of the most usual transforms. In Chapter 2 we shall study these transforms in more detail, with a discussion of fast algorithms and applications.
The equation above is valid for any invertible transform matrix A, but we will consider throughout this book only orthogonal matrices, i,e., those for which
Therefore, the inverse transform relat.ion becomes
l.2.1 Dasic Concepts x=AX (113)
Opfn!'t' (·Olllpllt.illg t.11I' tnmsform of a. givcn a signal x(n), we must group its samples hiln hll)("k~. nl·rening to x as onc of thc~c blocks, we have
"V" Cilll X t.1w t.rall~fonn of x, and A the t.ransformation matrix, or simply the !'I'IIIIHful'lI1. rr till' cl(,lllcnt~ in A a.rc complex 1 the ~upers(Tipt T denotes conjugate t l'IIIIHll()sitioll,
wlll'/'I' t 1\1' stlpnscripl T denot.es transposition (\....e usually consider signal blocks as "nllllllll v('("t.ors), and 111 is t.he block index. The correct, notat.ion for x would be x( Ill), III lilnk.· drar thClt we ha.ve a sequence of signal blocks. However, to avoid j'lInrllsinll \Vii 11 0111 'r indices, we will leave the dependence on 111 implicit. The
dilll"llsioll nf x is AI, which is also referred to as the block size or block length.
'ollsidcr tilt' following direct linear transformation on x:
'''''''e have used AT in the direct transform and A in the inverse so that the basis vectors (also called basis functions) of the transform are the columns of A, Therefore, we see from (1.12) that the kth element of X is the inner product of x with the Hh basis function. Furthermore. (1.13) implies that the kth element of X is also the coefficient of the kth basis function in the linear combination of the
basis functions in (1.13) that recovers x from its transform. For traditional block
transforms, the matrix A is square of order 1\{, so that the number of transform
coefficients in X is the same as the number of signal samples in x.
In practical implementations, orthogonal transforms have many advantages, which come from three basic properties. First, when we choose an orthogonal transform matrix A, both the direct and inverse transform matrices are immediately defined, without the need for matrix inversion. Second, the fact that the inverse transform operator is just the transpose of the direct operator means that, given a flowgraph for the direct transform, the inverse transform can be implemented by transposing the flowgraph, i.e., by running it backwards [10, 13]. Finally, orthogonal transforms conserve energy, that is,
x = [x(mM)  x(mM - 1)  ...  x(mM - M + 1)]^T   (1.11)

X = A^T x   (1.12)
" "pccial rcmi-H'k regarding the notation is important here, In many texts, bnldflll''-' low!'!' case letters are used for vectors and boldface upper case letters for 1IIIlI.I'i('I'H. In d('Clling with transforms, however, it is also st.andard practice to use 11 IIIW('I' I'a$(~ l('t,lC'r for a signal and the same letter in tipper case for its transform.
SilWl' t.hf' two concepts are not totally compatible, we have decided to follO\V' more
"!Cl:kly till' :-;ccond, that is. when we call x a signal vector, we shall refer to its tnlllSrOl'm a~ X.
Recovering the signal block from its transform can be done simply by inverting (1.12):
||X|| = ||x||

where ||x|| is the Euclidean norm of x, that is,

||x||^2 = Σ_{k=0}^{M-1} [x]_k^2

where [x]_k denotes the kth element of the vector x.
Two other properties of orthogonal transforms that we use later in the book are that all eigenvalues of A lie on the unit circle of the complex plane [14], and that det A = ±1.

[X]_{M-k} = [X]_k^*   (1.17)

Y ≠ X × H   (1.18)
where the superscript * denotes complex conjugation. For M even, we see that [X]_0 and [X]_{M/2} are purely real (zero imaginary part). It is important to note that the relationships in (1.17) are a consequence of the fact that there are only M/2 distinct frequencies in the DFT basis functions defined in (1.15). The kth and (M - k)th basis functions have the same frequency, and their phases differ by π/2.
One of the main uses for the DFT in signal processing is in spectrum estimation [1, 5, 13], because the DFT of an M-point signal block can be viewed as an approximation to the Fourier transform of the signal at equally spaced frequencies, according to (1.16). By increasing the block length M, we can improve the frequency resolution of the DFT. If the signal x(n) is random, the DFT of each block will also be random, but an average of several periodograms (defined in (1.4)) will be a good estimate of the power spectrum S_x(e^{jω}) [4]. There are many other techniques for spectrum estimation [15, 16], but they are beyond the scope of this book. The main disadvantage of using the DFT for power spectrum estimation is that with a block of length M we will have information about only M/2 frequencies. There are other sinusoidal transforms that do not have this disadvantage, as we shall see in the following.
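As a rough numerical illustration of this averaging idea (an illustrative sketch, not the book's exact periodogram definition from (1.4)), consider white noise, whose true power spectrum is flat:

```python
import numpy as np

rng = np.random.default_rng(1)
M, n_blocks = 64, 200
x = rng.standard_normal(M * n_blocks)          # unit-variance white noise

blocks = x.reshape(n_blocks, M)                # group the samples into blocks
periodograms = np.abs(np.fft.fft(blocks, axis=1)) ** 2 / M
estimate = periodograms.mean(axis=0)           # averaged periodogram

# For white noise the true power spectrum equals 1 at every frequency,
# so each bin of the average should hover near 1.
assert np.all(np.abs(estimate - 1.0) < 0.5)
```

A single periodogram has a standard deviation comparable to its mean; averaging 200 blocks shrinks the fluctuations by roughly a factor of 14, which is why the estimate settles near the true flat spectrum.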
Another important use for the DFT in signal processing is in filtering, either fixed or adaptive [17]. The reason for that is the convolution theorem [1, 5], which states that if y(n) = x(n) * h(n), then Y(e^{jω}) = X(e^{jω}) H(e^{jω}).
This property is not immediately shared by the DFT; that is, if y(n) = x(n) * h(n) and y, x, and h are blocks of these signals, we have, in general,
Whatever interpretation we feel to be more appropriate, it is clear from (1.13) and (1.15) that the DFT represents the signal vector x as a linear combination of harmonically related complex sinusoids.
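The DFT basis of (1.15) is easy to build explicitly and compare against a library FFT. The sketch below is my own illustration (it uses a plain sample ordering rather than the reversed block ordering of (1.11)), and it also checks the conjugate-symmetry property discussed for real inputs:

```python
import numpy as np

M = 16
n = np.arange(M)
# a_nk = exp(j 2*pi*k*n/M)/sqrt(M): harmonically related complex sinusoids.
A = np.exp(2j * np.pi * np.outer(n, n) / M) / np.sqrt(M)

x = np.random.default_rng(2).standard_normal(M)
X = A.conj().T @ x                  # direct transform (conjugate transposition)

assert np.allclose(X, np.fft.fft(x) / np.sqrt(M))  # matches the scaled FFT
assert np.allclose(A @ X, x)                       # inverse recovers the block
assert np.allclose(X[1:][::-1], np.conj(X[1:]))    # [X]_{M-k} = [X]_k^* for real x
```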
Assuming that the input block x is real, (1.15) leads to the well-known property
where × denotes the Hadamard product operator [18], i.e., the product performed element by element. If the blocks x and h are padded with enough zeros, though,
A^T A = A A^T = I   (1.14)

a_{nk} = (1/√M) exp(j 2πkn/M)   (1.15)

[X]_k = (1/√M) X(e^{jω}) |_{ω = 2πk/M}   (1.16)
where I is the identity matrix. Since there are M(M + 1)/2 distinct equations in (1.14) on the M^2 elements of A, we have M(M - 1)/2 degrees of freedom.
Each choice of A leads to a different transform. Of course, the number of possible choices is infinite, but some of them lead to matrices with special properties that are useful in signal processing applications. In the following subsections we briefly discuss the most commonly used transforms, and the fundamental properties of each.
where a_{nk} means the element of A in the nth row and kth column or, equivalently, the nth sample of the kth basis function. As is usual in most of the literature dealing with transforms, we assume that the indices n and k vary from 0 to M - 1. The scaling factor 1/√M is chosen to keep the orthogonality of A. We note that the transform coefficient index k can be viewed as a frequency index.
The definition of the DFT basis functions in (1.15) has two basic interpretations [13]. In the first, the DFT corresponds to a discretization of the Fourier transform in (1.3), in the sense that the transform coefficients of the DFT are samples of the Fourier transform, in the form [1, 5]
There are M(M - 1)/2 degrees of freedom in choosing the M^2 elements of A. This nice property can easily be obtained by recalling that the orthogonality of A implies
The discrete Fourier transform (DFT) [1, 5, 13] is defined by basis functions that are complex sinusoids with frequencies varying linearly from 0 to π, in the form
1.2.2 Discrete Fourier Transform
where [X]_k is the kth element of X, i.e., the kth transform coefficient. In the second interpretation, the DFT coefficients in X can be viewed as the discrete Fourier series of a bandlimited continuous waveform x(t), which has been periodically sampled to obtain the signal vector x.
The discrete Hartley transform (DHT) [19] is a real version of the DFT. Its basis functions are obtained simply by adding the real and imaginary parts of the DFT, in the form
with which (1.18) becomes an identity. Thus, the convolution can be performed block by block with a direct DFT, a Hadamard product, and an inverse DFT, in an overlap-add or overlap-save procedure [1], as we see in the next chapter.
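The zero-padding argument can be verified directly. The following sketch (illustrative, not the book's overlap-add implementation) pads two length-M blocks to 2M - 1 samples, so that the Hadamard product of their DFTs corresponds to linear, rather than circular, convolution:

```python
import numpy as np

rng = np.random.default_rng(3)
M = 8
x, h = rng.standard_normal(M), rng.standard_normal(M)

N = 2 * M - 1                             # linear convolution has 2M-1 samples
Y = np.fft.fft(x, N) * np.fft.fft(h, N)   # Hadamard product of padded DFTs
y = np.fft.ifft(Y).real

assert np.allclose(y, np.convolve(x, h))  # equals the linear convolution

# Without zero padding, the product yields circular convolution instead:
y_circ = np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)).real
assert not np.allclose(y_circ, np.convolve(x, h)[:M])
```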
The use of the DFT in spectrum estimation and signal filtering is further justified by the existence of fast algorithms for the DFT, collectively known as the fast Fourier transform (FFT) [10, 13]. We return to these issues in Chapter 2.
1.2.3 Discrete Hartley Transform
a_{nk} = (1/√M) cas(2πkn/M)   (1.19)
Figure 1.4. The first four basis functions of the DHT, for M = 32.
Note that in the above definition we have already considered that x is composed of samples of a zero-mean signal. From (1.11) and (1.20), it is clear that the elements of the covariance matrix are samples of the autocorrelation function of x(n), that is, [R_xx]_{ij} = R_x(i - j).
1.2.4 Karhunen-Loève Transform

The Karhunen-Loève transform (KLT) [2], which is also referred to as the Hotelling transform [21], is an optimal transform from a statistical viewpoint. This is because the KLT is the unique orthogonal transform that will produce a set of uncorrelated coefficients from a nonwhite signal. We will see later why this decorrelation is important. For now, let us consider the covariance matrix R_xx of the input signal block x
R_xx = E[x x^T]   (1.20)
where cas(θ) = cos(θ) + sin(θ). Note that, as was the case with the DFT, there are only M/2 distinct sinusoids in (1.19). The kth and (M - k)th basis functions have the same frequency, but are 90 degrees out of phase. Some of the basis functions of the DHT are shown in Fig. 1.4, for M = 32. The discrete Hartley transform was also independently derived in [20], where it was called the type-I discrete W transform.
There are many more similarities than differences between the DHT and the DFT, and anything that can be done with the DFT can also be done with the DHT [19], including spectrum estimation and convolution. In fact, given the DHT coefficients of a signal block, it is easy to compute the corresponding DFT coefficients, and vice versa [19]. The DHT has two advantages, though: first, it is a real transform, so no complex arithmetic is necessary; second, we see from (1.19) that the DHT matrix is symmetric, and therefore the direct and inverse transforms are identical. This second point is quite important because it means that in signal filtering and coding applications, which require both the direct and inverse transforms, only one routine for computing the DHT is required, since it will also perform the inverse DHT.
Fast algorithms for the DHT, collectively referred to as FHT algorithms, also exist, and they have approximately the same computational complexity as the corresponding FFT algorithms, as we see in Chapter 2.
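The two advantages of the DHT noted above, real arithmetic and a self-inverse (symmetric orthogonal) matrix, can be checked numerically. This sketch assumes the 1/√M-scaled cas kernel; the relation to the DFT in the last assertion follows from cas θ = cos θ + sin θ:

```python
import numpy as np

M = 32
n = np.arange(M)
theta = 2 * np.pi * np.outer(n, n) / M
H = (np.cos(theta) + np.sin(theta)) / np.sqrt(M)   # cas(2*pi*k*n/M)/sqrt(M)

assert np.allclose(H, H.T)               # symmetric: direct = inverse transform
assert np.allclose(H @ H, np.eye(M))     # orthogonal, hence self-inverse

# Relation to the DFT: H = Re(F) - Im(F) for F_nk = exp(-2j*pi*k*n/M)/sqrt(M).
F = np.exp(-2j * np.pi * np.outer(n, n) / M) / np.sqrt(M)
assert np.allclose(H, F.real - F.imag)
```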
The equation above shows that R_xx is a symmetric and Toeplitz matrix (all diagonals have identical entries). Furthermore, it is well known that covariance matrices are positive semidefinite [2, 21], i.e., all of their eigenvalues are real and nonnegative.
The KLT is defined by the M × M matrix A_KL that will diagonalize R_xx, in the form
Given the input block x with covariance R_xx, the covariance of the transformed block X is easily obtained from (1.12) as
R_XX = A_KL^T R_xx A_KL = diag{λ_0, λ_1, ..., λ_{M-1}}   (1.21)
It is well known from matrix theory [14] that the matrix A_KL which satisfies the diagonalization condition in (1.21) is the orthogonal matrix whose columns are the eigenvectors of R_xx, with the diagonal entries of R_XX being the eigenvalues, that is,
R_xx [A]_{.k} = λ_k [A]_{.k}

Figure 1.5. The first four basis functions of the KLT, M = 32, for an AR(1) signal with intersample correlation coefficient of 0.9.
where [A]_{.k} is the kth column of A. Thus, the basis functions of the KLT are the eigenvectors of the covariance matrix of the input signal. For AR(1) signals, such eigenvectors have the property of being sinusoids [2], with frequencies that are not equally distributed on the unit circle. The phases of these sinusoids are determined from the fact that eigenvectors of Toeplitz matrices have either even or odd symmetry [22]. Some of the basis functions of the KLT are shown, for M = 32, in Fig. 1.5, for an AR(1) signal, and in Fig. 1.6, for an AR(12) model derived from a female speech signal. It is interesting to note that the basis functions shown in Fig. 1.6 are quite similar to the KLT functions obtained in [23] for different speech signals, which is a confirmation that statistical modeling can be helpful in capturing the average properties of most families of signals.
Note that (1.21) implies that the transform coefficients (the elements of X) are uncorrelated, and that their variances are given by

Var{[X]_k} = σ_k^2 = λ_k
By diagonalizing the covariance matrix of the transform coefficients, the KLT also maximizes the energy compaction in X, in the sense that the above variance distribution of X is the most concentrated (there will be a few transform coefficients with large variances, and most coefficients will have small variances). This implies that the product of all variances is minimized [2].
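The KLT construction can be carried out numerically for an AR(1) model. The sketch below is my own illustration (intersample correlation ρ = 0.9 is an assumed value): it forms the Toeplitz covariance, takes its eigenvectors as the transform, and checks the diagonalization in (1.21):

```python
import numpy as np

M, rho = 8, 0.9
idx = np.arange(M)
# AR(1) Toeplitz covariance: [R_xx]_{ij} = rho^|i-j| (unit-variance signal).
Rxx = rho ** np.abs(np.subtract.outer(idx, idx))

eigvals, A_kl = np.linalg.eigh(Rxx)      # columns of A_kl are the eigenvectors
RXX = A_kl.T @ Rxx @ A_kl                # covariance of the transformed block

# (1.21): the KLT diagonalizes the covariance, and the coefficient variances
# are the eigenvalues of R_xx.
assert np.allclose(RXX, np.diag(eigvals))
assert np.all(eigvals > 0)               # positive definite for |rho| < 1
```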
Figure 1.6. The first four basis functions of the KLT, M = 32, for an AR(12) model derived from a female speech signal.
1.2.5 Discrete Cosine Transform

The discrete cosine transform (DCT) [10] was originally derived from Chebyshev polynomials [24], but with the aim of approximating the eigenvectors of the autocorrelation matrix of an AR(1) signal block. The asymptotic equivalence of the DCT and the KLT was later demonstrated [25]. The definition of the DCT basis functions is
(f"llIllpal'cd 1,0 t.h(' O(A110gAJ) operations required for the FFT or FHT l as discussed III Ilapt.C'I" 2).
An important asymptot.ic property of the KLT is that for an AR(I) signal, as
I Ill' fon-elation coefficient between adja.cent samples Rn (l)/ ~7"(O) t.ends to one, 1111" b:"lsi:-:; functions of the KLT become sinusoids at frequencies equally-spaced in 11.(' interval [0,11'") (2, 9]. These sinusoids assume precisely the form of the discrete t'osilll' transform, which we discuss ne-xl.
a_{nk} = c(k) √(2/M) cos[(n + 1/2) kπ/M]   (1.22)

where c(0) = 1/√2 and c(k) = 1 for k ≠ 0.
(""-I 2) 11,\/IT "I 1=0
which is referred to as the transform coding gain. Note that the numerator in the equation above is in fact the average energy in the transformed block X, which is identical to the input signal variance, due to the total energy conservation of any orthogonal transform. The denominator, however, is the geometric mean of the transform coefficient variances, which is minimized by the KLT. Therefore, the KLT maximizes G_TC, and so it is the optimal transform for TC.
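The optimality of the KLT can be observed numerically. This illustrative sketch evaluates the coding gain (arithmetic over geometric mean of the coefficient variances) for both the KLT and the DCT of (1.22), on an AR(1) model with an assumed correlation ρ = 0.9:

```python
import numpy as np

def gtc(variances):
    """Coding gain: arithmetic mean over geometric mean of the variances."""
    return variances.mean() / np.exp(np.log(variances).mean())

M, rho = 8, 0.9
idx = np.arange(M)
Rxx = rho ** np.abs(np.subtract.outer(idx, idx))       # AR(1) covariance

# DCT basis functions from (1.22), columns indexed by k.
n, k = np.meshgrid(idx, idx, indexing="ij")
c = np.where(k == 0, 1 / np.sqrt(2), 1.0)
A_dct = c * np.sqrt(2 / M) * np.cos((n + 0.5) * k * np.pi / M)

klt_var = np.linalg.eigvalsh(Rxx)                      # KLT coefficient variances
dct_var = np.diag(A_dct.T @ Rxx @ A_dct)               # DCT coefficient variances

assert gtc(klt_var) >= gtc(dct_var) > 1.0   # KLT is optimal; both show a gain
```

The first inequality is guaranteed by Hadamard's inequality: among orthogonal transforms, only a diagonalizing transform attains the minimum geometric mean of the variances.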
With the KLT, transform coding is an asymptotically optimal coding system because [2]
This energy concentration has an important implication for signal coding applications. In transform coding (TC) [9], instead of quantizing the samples of the signal with a desired number of bits per sample (which is referred to as pulse-code modulation, or PCM), we can perform the quantization on the transform coefficients, with a different number of bits for each. The total number of bits should be the same in both cases. It is well known [2, 9] that a lower mean-square error will result from quantizing the transform coefficients. Assuming scalar quantizers, the reduction in TC mean-square error over PCM is given, as we see in Chapter 2, by [2, 9]
Thus, the reduction in mean-square error for TC reaches the maximum value of 1/γ_x^2 predicted by the distortion-rate function in (1.10).
From the discussion above it is clear that the KLT should ideally be the transform of choice in a signal coding system, due to its optimality. However, the KLT is in fact seldom used in practice, for two reasons. First, the KLT is signal-dependent, so we need to have a good model for the signal statistics, which is not always available. Furthermore, the signal statistics might change with time, and real-time computation of covariance matrices and their eigenvectors is virtually impossible for most applications. Second, the KLT matrix A_KL has no particular structure, so the transform operator in (1.12) would require M^2 multiplications and additions
A plot of the first four basis functions of the DCT is shown in Fig. 1.7, for M = 32. We should point out that there are other DCT definitions [10], but the DCT corresponding to (1.22) is the one most employed in practice (for example, IEEE Standard no. 1180 [26] is based on that definition).
The main difference between the DCT and the DFT or DHT is that, when the frequency index k varies from 0 to M - 1, there are M different frequencies in the interval [0, π]. Therefore, a spectral analysis performed with the DCT will produce twice the number of frequency bands as does the DFT or DHT, but with the disadvantage that the DCT spectrum has no magnitude/phase interpretation. Furthermore, if the input signal has a strong component at a particular DCT frequency, but that component is 90 degrees out of phase with respect to the DCT basis function, then the corresponding DCT coefficient will be zero. Thus, the DCT of a single block may not be very useful for spectral analysis, but a DCT-based
Figure 1.7. The first four basis functions of the DCT, for M = 32.
Figure 1.8. The first four basis functions of the DCT-IV, for M = 32.
Because of the frequency shift of the DCT-IV basis functions relative to the DCT basis, the DCT-IV is not as useful as the DCT for signal coding, as we will see in the next chapter. However, the DCT-IV can be successfully applied to spectrum estimation and adaptive filtering. Furthermore, the DCT-IV can be used as a building block for fast algorithms for the DCT and for lapped transforms.
Related to the DCT-IV is the DST-IV, defined by [20]
periodogram would also be a good estimate of the Fourier spectrum, as we see in Chapter 2.
The DCT can be used for convolution and correlation, because it satisfies a modified shift property [10], but with a higher computational complexity than DFT-based convolution. Therefore, the major uses of the DCT are in transform coding and frequency-domain adaptive filtering. The use of the DCT in signal coding is so widespread that the DCT is a candidate for becoming a CCITT standard for videophone and other image coding applications [10]. We will discuss the applications and fast algorithms of the DCT in Chapter 2.

a_{nk} = √(2/M) sin[(n + 1/2)(k + 1/2)π/M]   (1.24)
1.2.6 Type-IV Discrete Cosine Transform
The DCT-IV was introduced in [20] as an alternative transform for spectral analysis. It is obtained by shifting the frequencies of the DCT basis functions in (1.22) by π/2M, in the form
Note that if we take both the DCT-IV and the DST-IV of an input signal, then we would have M frequencies of spectrum information in the interval [0, π], which is twice the resolution of the DFT or DHT. Magnitude and phase information would then be preserved, because we have projections on cosines with the DCT-IV and on sines with the DST-IV.
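These properties of the type-IV transforms are easy to confirm. The sketch below (my own check, not from the book) builds the DCT-IV and DST-IV matrices from (1.23) and (1.24) and verifies that each is symmetric and orthogonal, hence its own inverse:

```python
import numpy as np

M = 32
n, k = np.meshgrid(np.arange(M), np.arange(M), indexing="ij")
arg = (n + 0.5) * (k + 0.5) * np.pi / M
C4 = np.sqrt(2 / M) * np.cos(arg)     # DCT-IV basis functions, (1.23)
S4 = np.sqrt(2 / M) * np.sin(arg)     # DST-IV basis functions, (1.24)

for T in (C4, S4):
    assert np.allclose(T, T.T)             # symmetric matrix
    assert np.allclose(T @ T, np.eye(M))   # orthogonal, hence self-inverse
```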
a_{nk} = √(2/M) cos[(n + 1/2)(k + 1/2)π/M]   (1.23)

1.2.7 Other Transforms
Note that, unlike the DCT, the scaling factor is identical for all basis functions. The first four basis functions of the DCT-IV are shown in Fig. 1.8, for M = 32.
Except for the KLT for non-AR(1) signals, all of the previously considered transforms have basis functions that are sampled sinusoids. In applications where the
computational complexity of sinusoidal transforms may be prohibitive, other transforms have been considered, such as the Walsh-Hadamard, Haar, slant, and high-correlation transforms [9, 21, 27]. All of these transforms have fast algorithms, which are less complex than those of the sinusoidal transforms. However, the easier computation is achieved at the cost of worse performance in signal filtering or coding. Thus, with the availability of DSP hardware of ever increasing computing speed and decreasing cost, the nonsinusoidal transforms are now employed only in rare situations. Therefore, we shall not discuss them further in this book.
1.2.8 Two-Dimensional Transforms
Our discussion of block transforms throughout this section has assumed that the input signal block is one-dimensional. In image coding applications, however, the blocks are actually M × M matrices, and the transform operator is a four-dimensional tensor. In these applications [9, 21, 27], there is no advantage in using general orthogonal tensors for two-dimensional (2-D) transforms, so separable transforms can always be employed. Then, the transform tensor for 2-D signals B can be put in the separable form

B = A ⊗ A

where ⊗ denotes the Kronecker product [14].
The equation above states that the 2-D transform operator can be applied in two steps: transform all the rows in the block with a 1-D transform, and then transform all the columns in the transformed block with the same 1-D transform (we could start by transforming the columns, and then the rows). Therefore, all the discussion on block transforms that we have carried on until now, as well as the subsequent discussions on lapped transforms, remains valid for 2-D transforms. We simply use separable transforms.
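The row-column procedure and the Kronecker-product form can be checked against each other. The sketch below is an illustration (it assumes column-major vectorization, for which vec(A^T X A) = (A^T ⊗ A^T) vec(X)) and uses an arbitrary orthogonal A:

```python
import numpy as np

rng = np.random.default_rng(4)
M = 8
A, _ = np.linalg.qr(rng.standard_normal((M, M)))   # any orthogonal 1-D transform
X = rng.standard_normal((M, M))                    # one M x M image block

Y = A.T @ X @ A     # separable 2-D transform: transform columns, then rows

# Same result via the Kronecker-product operator on the vectorized block.
y = np.kron(A.T, A.T) @ X.flatten(order="F")
assert np.allclose(y, Y.flatten(order="F"))
```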
1.3 Development of Lapped Transforms
The basic motivation behind the development of lapped transforms comes from one of the major disadvantages of traditional block transforms: the blocking effects, which are discontinuities in the reconstructed signal. In transform coding or block
filtering, we start by removing a block of M samples from the input signal. The block is then transformed, using one of the transforms discussed in the previous section. The transform coefficients are then processed according to the application: in transform coding [2, 9], they are quantized, either with scalar or vector quantization [28], and in fixed or adaptive filtering they are multiplied by appropriate scaling factors [29]. Finally, the inverse transform is computed, and the reconstructed signal block is appended to the output.
Because of the independent processing of each block, some of the coding errors will produce discontinuities in the signal, because the final samples of one block will, most likely, not match the first samples of the next, and thus generate the blocking effects. In image coding, the blocking effects produce a reconstructed image that seems to be built of small tiles [10, 28], whereas in speech coding the reconstructed speech will have an audible periodic noise [30].
Several approaches toward the reduction of blocking effects were developed, among them prefiltering and postfiltering [31, 32] and the short-space Fourier transform (SSFT) [33]. The filtering techniques have the disadvantage of reducing the transform coding gain and producing a visible low-pass effect around the block boundaries, but they have as an advantage a computational overhead of less than 10 percent. With the SSFT, there are no blocking effects, because the SSFT basis functions are theoretically infinite in length. The disadvantages of the SSFT are the appearance of ringing around the edges of an image, and that the SSFT can only be efficiently applied to finite-length signals, such as images [33], or to periodic signals.
The first lapped transform that was designed with the aim of reducing the blocking effects was the lapped orthogonal transform (LOT) [34, 35]. The key to the development of the LOT was the recognition that the blocking effects are really caused by the discontinuities in the basis functions of the transforms. These discontinuities do not exist when we are only looking at a single block, but they do appear when we realize that each block is reconstructed as a weighted combination of the basis functions. When each of these functions is pasted in its position in the output stream of samples, a discontinuity is caused. To better visualize this result, imagine that only a single reconstructed block is sent to the output. Then, the signal will jump from and to zero before and after the block. Another motivation in the same direction can be obtained by looking at Fig. 1.6, where we see that the first basis functions of the KLT of a physical signal do not have sharp discontinuities at their boundaries.
where P is the lapped transform matrix, with the LT basis functions as its columns, and x is an extended signal block having 2M samples
x = [x(mM - 2M + 1)  x(mM - 2M + 2)  ...  x(mM - 1)  x(mM)]^T   (1.26)
Figure 1.9. The overlapping of the basis functions in the LOT.
with m being the block index. Thus, it is clear that each sample of the input signal x(n) will be used in two blocks. Note that we have again made the dependence of x on m implicit, to simplify the notation.
In order to allow the reconstruction of the original signal x(n) from the sequence of LOT blocks X, the LOT matrix P should satisfy the following orthogonality conditions [34, 36]
P^T P = I   and   P^T W P = 0   (1.27)
Initially, the LOT basis functions were constrained to have their lengths equal to twice the block size, i.e., 2M. Then, the LOT of a signal block x can be obtained by [34, 35]
After a better understanding of the generation of blocking effects, the development of the LOT started with the fundamental question: can the basis functions of the transform be made longer than the transform length? In this way, the basis functions could have a smoother transition to and from zero at their ends. Thus, if the number of transform coefficients is M, the basis functions would have more than M samples. Nevertheless, each new input block would still be taken every M samples, and so the basis functions of the LOT would be projected not only on the current block, but also on the neighboring blocks, on both sides. Therefore, the basis functions from one block and one of its neighboring blocks would overlap, and this is the reason for the name lapped. This property is illustrated in Fig. 1.9.
If we want to use the same basis functions for analysis (direct transform) and synthesis (inverse transform), the overlapping portions of the basis functions of neighboring blocks must be orthogonal. This is the reason for the LOT denomination. Fortunately, the number of constraints incurred by these extra orthogonality restrictions is less than the number of extra degrees of freedom added by increasing the lengths of the basis functions. Therefore, LOT basis functions can be constructed.
X = P^T x   (1.25)
where I is the identity matrix and W is the M-sample shift operator defined by

W = [ 0  I ]
    [ 0  0 ]   (1.28)

where the identity matrix above is of order M.
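As a sanity check of (1.27) and (1.28), the sketch below (my own construction, not from the book) verifies that a block transform padded with zeros to basis length 2M, i.e., P^T = [0 A^T 0], satisfies both orthogonality conditions with this W:

```python
import numpy as np

rng = np.random.default_rng(5)
M = 8                                      # block size (assumed even here)
A, _ = np.linalg.qr(rng.standard_normal((M, M)))   # an ordinary block transform

# View A as a length-2M LOT: P^T = [0 A^T 0], i.e., A padded with M/2
# zero samples on each side.
Z = np.zeros((M // 2, M))
P = np.vstack([Z, A, Z])                   # 2M x M lapped transform matrix

# M-sample shift operator W of (1.28): identity of order M in the upper right.
W = np.zeros((2 * M, 2 * M))
W[:M, M:] = np.eye(M)

# Orthogonality conditions (1.27): orthonormal columns, orthogonal tails.
assert np.allclose(P.T @ P, np.eye(M))
assert np.allclose(P.T @ W @ P, 0.0)
```

The second condition holds here only because the nonzero portions of basis functions in adjacent blocks never overlap; a LOT with genuinely overlapping, nonzero tails must satisfy it in a nontrivial way.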
We note that the LOT definition in (1.27) is quite general. For example, a block transform A can also be viewed as a LOT, with P^T = [0 A^T 0]. In fact, when we look at a block transform in this way, the discontinuities in the basis functions become clear. We want to find a matrix P that has nonzero entries and is a valid LOT. The first approach toward that goal [34] was to solve the problem in (1.27) in an iterative way: starting with the first basis function that led to the maximum σ_0^2, and then solving a nonlinear optimization problem to find the second basis function that maximized σ_1^2, under the constraints in (1.27), and so on for the third and remaining basis functions. In this way, the LOT with the maximum G_TC would be obtained. This approach did not lead to good results, though, because the procedure may converge to local optima.
Another approach to the derivation of a LOT is to start within a subspace of valid LOTs [36, 37], and solve an eigenvector problem to derive an optimal LOT. In this way, the equivalent of a Karhunen-Loève LOT is obtained. An example of an optimal LOT derived in this way is shown in Fig. 1.10.

The first experiments with LOT coding of images [35, 38] were quite encouraging. By simply replacing the DCT with the LOT in a transform coder, the blocking
Figure 1.10. The first four basis functions of an optimal LOT, for M = 32.
References
[1] Oppenheim, A. V., and R. W. Schafer, Discrete-Time Signal Processing, Englewood Cliffs, NJ: Prentice-Hall, 1989.

[2] Jayant, N. S., and P. Noll, Digital Coding of Waveforms, Englewood Cliffs, NJ: Prentice-Hall, 1984.

[3] Drake, A. W., Fundamentals of Applied Probability Theory, New York: McGraw-Hill, 1967.

[4] Papoulis, A., Probability, Random Variables, and Stochastic Processes, 2nd ed., New York: McGraw-Hill, 1984.

[5] DeFatta, D. J., J. G. Lucas, and W. S. Hodgkiss, Digital Signal Processing: A System Design Approach, New York: Wiley, 1988.

[6] IEEE DSP Committee, ed., Programs for Digital Signal Processing, New York: IEEE Press, 1979.
effects were virtually eliminated. However, the optimal LOTs obtained with the aforementioned methods had the disadvantage of not being fast computable. After the first suggestion of a fast LOT algorithm in [37, 38], the LOT became a serious candidate for replacing the DCT in transform coding. Further experiments with the LOT have also demonstrated its superior performance in adaptive speech coding [30, 39], and in data recovery in packet video with packet loss [40].

Later, it became clear that the family of lapped transforms could be expanded by removing two of the restrictions on their basis functions. First, the even or odd symmetry of the functions, initially assumed in the developments in [34, 37], was relaxed, leading to the development of the modulated lapped transform (MLT) [39]. Second, the length of the basis functions was no longer restricted to 2M, so that lapped transforms with arbitrarily long basis functions could be designed, as first demonstrated in [41].

The purpose of this section was to give a brief review of the history of the development of lapped transforms. Throughout the book, all of the details of the theoretical properties and fast algorithms for LTs will be presented, as well as many examples of practical applications of LTs.
[7] Rabiner, L. R., and R. W. Schafer, Digital Processing of Speech Signals, Englewood Cliffs, NJ: Prentice-Hall, 1978.

[8] Chen, J. H., "High-quality 16 kb/s speech coding with a one-way delay less than 2 ms," IEEE Intl. Conf. Acoust., Speech, Signal Processing, Albuquerque, NM, pp. 453-457, Apr. 1990.

[9] Clarke, R. J., Transform Coding of Images, London: Academic Press, 1985.

[10] Rao, K. R., and P. Yip, Discrete Cosine Transform: Algorithms, Advantages, Applications, New York: Academic Press, 1990.

[11] Berger, T., Rate Distortion Theory, Englewood Cliffs, NJ: Prentice-Hall, 1971.

[12] Beauchamp, K. G., Transforms for Engineers: A Guide to Signal Processing, New York: Oxford University Press, 1987.

[13] Brigham, E. O., The Fast Fourier Transform, Englewood Cliffs, NJ: Prentice-Hall, 1974.

[14] Gantmacher, F. R., The Theory of Matrices, Vol. I, New York: Chelsea, 1977.

[15] Childers, D. G., ed., Modern Spectrum Analysis, New York: IEEE Press, 1978.
[16] Gardner, W. A., Statistical Spectral Analysis: A Nonprobabilistic Approach, Englewood Cliffs, NJ: Prentice-Hall, 1988.

[17] Widrow, B., and S. D. Stearns, Adaptive Signal Processing, Englewood Cliffs, NJ: Prentice-Hall, 1985.

[18] Graham, A., Kronecker Products and Matrix Calculus: with Applications, Chichester, England: Ellis Horwood, 1981.

[19] Bracewell, R. N., "Discrete Hartley transform," J. Opt. Soc. Am., vol. 73, Dec. 1983, pp. 1832-1835.

[20] Wang, Z., "Fast algorithms for the discrete W transform and for the discrete Fourier transform," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-32, Aug. 1984, pp. 803-816.

[21] Gonzalez, R. C., and P. Wintz, Digital Image Processing, Reading, MA: Addison-Wesley, 1977.

[22] Makhoul, J., "On the eigenvectors of symmetric Toeplitz matrices," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-29, Aug. 1981, pp. 868-872.

[23] Campanella, S. J., and G. S. Robinson, "A comparison of orthogonal transforms for digital speech processing," IEEE Trans. Commun. Tech., vol. COM-19, Dec. 1971, pp. 1040-1050.

[24] Ahmed, N., T. Natarajan, and K. R. Rao, "Discrete cosine transform," IEEE Trans. Comput., vol. C-23, Jan. 1974, pp. 90-93.

[25] Clarke, R. J., "Relation between the Karhunen-Loève and cosine transforms," Proc. IEE, Pt. F, vol. 128, Nov. 1981, pp. 359-360.

[26] IEEE Circuits and Systems Society, "IEEE standard no. 1180 - specifications for the implementation of 8 x 8 inverse discrete cosine transform," Tech. Rep., IEEE, New York, March 1991.

[27] Pratt, W. K., ed., Digital Image Processing, New York: Wiley, 1978.

[28] Netravali, A. N., and B. G. Haskell, Digital Pictures: Representation and Compression, New York: Plenum, 1989.

[29] Cowan, C. F. N., and P. M. Grant, eds., Adaptive Filters, Englewood Cliffs, NJ: Prentice-Hall, 1985.

[30] Malvar, H. S., and R. Duarte, "Transform/subband coding of speech with the lapped orthogonal transform," IEEE Intl. Symp. Circuits Syst., Portland, OR, pp. 1268-1271, May 1989.

[31] Reeve, H. C., and J. S. Lim, "Reduction of blocking effect in image coding," IEEE Intl. Conf. Acoust., Speech, Signal Processing, Boston, MA, pp. 1212-1215, March 1983.

[32] Malvar, H. S., "A pre- and post-filtering technique for the reduction of blocking effects," Picture Coding Symp., Stockholm, Sweden, June 1987.

[33] Hinman, B. L., J. G. Bernstein, and D. H. Staelin, "Short-space Fourier transform image processing," IEEE Intl. Conf. Acoust., Speech, Signal Processing, San Diego, CA, pp. 4.8.1-4.8.4, March 1984.

[34] Cassereau, P., A New Class of Optimal Unitary Transforms for Image Processing, Master's thesis, Mass. Inst. Tech., Cambridge, MA, May 1985.

[35] Cassereau, P. M., D. H. Staelin, and G. de Jager, "Encoding of images based on a lapped orthogonal transform," IEEE Trans. Commun., vol. 37, Feb. 1989, pp. 189-193.

[36] Malvar, H. S., Optimal Pre- and Post-filters in Noisy Sampled-Data Systems, PhD thesis, Mass. Inst. Tech., Cambridge, MA, Sep. 1986.

[37] Malvar, H. S., and D. H. Staelin, "The LOT: transform coding without blocking effects," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, Apr. 1989, pp. 553-559.

[38] Malvar, H. S., and D. H. Staelin, "Reduction of blocking effects in image coding with a lapped orthogonal transform," IEEE Intl. Conf. Acoust., Speech, Signal Processing, New York, pp. 781-784, Apr. 1988.

[39] Malvar, H. S., "Lapped transforms for efficient transform/subband coding," IEEE Trans. Acoust., Speech, Signal Processing, vol. 38, June 1990, pp. 969-978.

[40] Haskell, P., and D. Messerschmitt, "Reconstructing lost video data in a lapped orthogonal transform based coder," IEEE Intl. Conf. Acoust., Speech, Signal Processing, Albuquerque, NM, pp. 1985-1988, Apr. 1990.

[41] Malvar, H. S., "Modulated QMF filter banks with perfect reconstruction," Electron. Lett., vol. 26, no. 13, June 1990, pp. 906-907.
Chapter 2
Applications of Block Transforms
In this chapter we discuss in more detail the classical block transforms that were presented in Chapter 1. We start by looking at the most common applications of block transforms in signal processing: filtering, spectrum estimation, and coding, in Sections 2.1 to 2.3. Other applications are briefly discussed in Section 2.4. We certainly cannot cover all of those applications in great detail (there are entire books written on each one of them). Our purpose is to make clear how orthogonal transforms can be applied. This is essential to enable us to judge whether a lapped transform can be employed instead of a block transform for a given application.

In most cases, the time consumed in performing the transforms is an important issue. Simply computing the transforms by the matrix multiplications defined in Chapter 1 is unacceptable in virtually any application because it would generally result in an excessive requirement on the computational power of the processing system. Thus, in most applications the transforms are computed by fast algorithms. We believe that it is important for the system design engineer to have fast procedures for block transforms readily available, so the fastest known algorithms for the computation of the most commonly used block transforms are discussed in Section 2.5. Computer programs implementing these algorithms can be found in Appendix B.
2.1 Signal Filtering

The basic operation in digital signal processing is filtering. Assuming that x(n) is the input and y(n) the output of a linear shift-invariant filter with impulse response h(n), it is well known [1] that the relationship between y(n) and x(n) is given by y(n) = x(n) * h(n), where * is the convolution operator. In many practical cases, a finite impulse response (FIR) filter can be used. For a FIR filter, there are indices m1 and m2 such that h(m) = 0 if m is outside [m1, m2]. Without loss of generality, we can assume that m1 = 0 and m2 = L - 1, where L is the filter length, so that the convolution becomes the finite sum

    y(n) = sum_{m=0}^{L-1} h(m) x(n - m)                              (2.1)

The above equation can also be viewed as the inner product

    y(n) = h^T x_n                                                    (2.2)

where the vector h contains the filter coefficients, and the vector x_n is a signal block containing L samples, from x(n - L + 1) to x(n).

Figure 2.1. FIR filtering via direct form, or transversal structure.

Viewing a FIR filtering operation as an inner product is quite useful when we are working with block signal processing. In fact, comparing equation (2.2) with (1.12), we see that a transform operation is equivalent to the computation of the outputs of many filters on the same input signal block x. Each filter impulse response is a column of the transform matrix A. Therefore, a block transform can also be viewed as a filter bank, and this connection will be further discussed in the next chapter.
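As a concrete illustration of the inner-product view in (2.2), the short sketch below computes each output sample as the inner product of the coefficient vector with the current signal block. It is illustrative only; the function and variable names are not from the text.

```python
# FIR filtering as an inner product, as in (2.2).
# Illustrative sketch; names are not from the text.

def fir_inner_product(h, x, n):
    """Output sample y(n) as the inner product of the coefficient
    vector h with the block [x(n), x(n-1), ..., x(n-L+1)],
    assuming x(k) = 0 for k < 0."""
    L = len(h)
    block = [x[n - m] if n - m >= 0 else 0.0 for m in range(L)]
    return sum(hm * xm for hm, xm in zip(h, block))

h = [0.5, 0.3, 0.2]               # impulse response, L = 3
x = [1.0, 2.0, 3.0, 4.0]
y = [fir_inner_product(h, x, n) for n in range(len(x))]
```

Each column of a transform matrix A plays exactly the role of h here, which is why a block transform acts as a bank of such filters.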
2.1.1 Efficient FIR Filtering

The simplest practical implementation of the FIR filter in (2.1) and (2.2) is the direct form [1]. In this approach, whenever an input sample x(n) becomes available, it is shifted into the block x_n, and the oldest sample in x_n is discarded. The inner product in (2.2) is then recomputed, and a new output sample is available. This structure, also referred to as the transversal filter, is depicted in Fig. 2.1, where the operators z^{-1} are one-sample delays.

Either in hardware or software, the transversal filter is straightforward to implement, and it is such a basic processing unit that many digital signal processing integrated circuits have their hardware optimized to perform direct-form FIR filtering at the fastest possible speed (generally one tap per instruction cycle) [2]. The disadvantage of the direct form is that, in general, L multiplications and L - 1 additions must be performed for each output sample. In applications such as acoustic echo cancelation, where filter lengths in excess of one thousand are employed, the computational effort of the direct form may be prohibitive.

The basic idea behind faster FIR filtering is to use the circular convolution property of the discrete Fourier transform (DFT) [1]. Given two vectors x and h of length M, with DFTs X and H, respectively, we compute a vector U by

    U = X × H                                                         (2.3)

where × denotes the Hadamard (element-by-element) product. Then, the inverse DFT of U is related to x and h by the circular convolution of h and x, which can be written as

    u(n) = sum_{m=0}^{M-1} h(m) x((n - m) mod M)                      (2.4)

where i mod M denotes the remainder of the division of i by M. Note that the name circular convolution is related to the circular shift property of the mod operator; for example, -1 mod M = M - 1.
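The route of (2.3)-(2.4), transforming, multiplying element by element, and inverse transforming, can be checked numerically. The sketch below uses a naive O(M^2) DFT purely for illustration (in practice an FFT would be used, as discussed in Section 2.5); all names are illustrative.

```python
# Circular convolution via the DFT, as in (2.3)-(2.4).
# Illustrative sketch with a naive O(M^2) DFT; names are not from the text.
import cmath

def dft(v, inverse=False):
    """Naive length-M DFT; enough to illustrate the property."""
    M = len(v)
    s = 1 if inverse else -1
    out = [sum(v[m] * cmath.exp(s * 2j * cmath.pi * k * m / M)
               for m in range(M)) for k in range(M)]
    return [c / M for c in out] if inverse else out

def circular_convolution(h, x):
    """Direct evaluation of (2.4)."""
    M = len(x)
    return [sum(h[m] * x[(n - m) % M] for m in range(M)) for n in range(M)]

x = [1.0, 2.0, 0.0, -1.0]
h = [1.0, 1.0, 0.0, 0.0]
U = [Xk * Hk for Xk, Hk in zip(dft(x), dft(h))]    # (2.3): Hadamard product
u = [c.real for c in dft(U, inverse=True)]         # inverse DFT of U
assert all(abs(a - b) < 1e-9
           for a, b in zip(u, circular_convolution(h, x)))
```

The assertion confirms that the inverse DFT of the Hadamard product equals the circular convolution computed directly from (2.4).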
The circular convolution in (2.4) can be used to compute the linear convolution in (2.1). One way to accomplish that is to use the overlap-add procedure [1, 3], in which we overlap the results of circular convolutions to obtain the linear convolution. Another approach is the overlap-save, which is quite similar to the overlap-add. More details on the overlap-save procedure can be found in [1].

In order to see how the overlap-add procedure works, let us define two sequences of length M by

    x_r(n) = x(rL + n),  0 <= n <= L - 1
    x_r(n) = 0,          L <= n <= M - 1                              (2.5)

and

    h_r(n) = h(n),  0 <= n <= L - 1
    h_r(n) = 0,     L <= n <= M - 1                                   (2.6)

In the above definitions, we have assumed that M = 2L (we can always choose L and M in order to satisfy this relationship). Thus, h_r(n) is a block containing the L samples of h(n), whereas x_r(n) is a block containing L samples from x(n), starting at x(rL + n). Both blocks are padded with L zeros.

Assuming that the DFTs of x_r(n) and h_r(n) are multiplied according to (2.3), and calling u_r(n) the resulting sequence, we have, from (2.4),

    u_r(n) = sum_{m=0}^{M-1} h_r(m) x_r((n - m) mod M)                (2.7)

From the above equation and the definitions in (2.5) and (2.6), we can write

    u_r(n) = sum_{m=0}^{n} h_r(m) x_r(n - m),  0 <= n <= L - 1

and

    u_r(n) = sum_{m=n-L+1}^{L-1} h_r(m) x_r(n - m),  L <= n <= M - 1

with u_r(M - 1) = 0. Let us now overlap and add two consecutive blocks of u_r(n), in the form

    y(n) = u_r(n mod L) + u_{r-1}(n mod L + L),  rL <= n <= rL + L - 1    (2.8)

From the definitions in (2.5),

    x_r(n mod L - m) = x(n - m),  0 <= m <= n mod L

and

    x_{r-1}(n mod L + L - m) = x(n - m),  n mod L + 1 <= m <= L - 1

It is important to stress that these equations are only valid because we have padded h_r(n) and x_r(n) with zeros and taken length-M DFTs. Using the two equations above and also (2.6), we see that

    y(n) = sum_{m=0}^{L-1} h(m) x(n - m)

which is precisely the linear convolution that we originally wanted to implement in (2.1).
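The blockwise procedure of (2.5)-(2.8) can be sketched as follows. For brevity the length-M circular convolutions are evaluated directly rather than through DFTs, which changes only the cost, not the result; all names are illustrative, not from the text.

```python
# Overlap-add FIR filtering, following (2.5)-(2.8): length-L blocks,
# length-M = 2L circular convolutions, overlapped partial results added.
# Illustrative sketch; names are not from the text.

def overlap_add_filter(x, h):
    L = len(h)
    M = 2 * L
    hr = list(h) + [0.0] * L                      # (2.6): zero-padded h
    y = [0.0] * (len(x) + L - 1)
    r = 0
    while r * L < len(x):
        xr = list(x[r * L:(r + 1) * L])
        xr += [0.0] * (M - len(xr))               # (2.5): zero-padded block
        # Length-M circular convolution, as in (2.7):
        ur = [sum(hr[m] * xr[(n - m) % M] for m in range(M))
              for n in range(M)]
        for n in range(M):                        # (2.8): overlap and add
            if r * L + n < len(y):
                y[r * L + n] += ur[n]
        r += 1
    return y
```

For example, overlap_add_filter([1, 2, 3, 4, 5, 6], [1, 1]) reproduces the linear convolution of the two sequences, as (2.8) guarantees.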
A graphical representation of the overlap-add procedure is shown in Fig. 2.2, where the output sequence y(n) is formed by overlapping and adding the u_r(n) blocks. Assuming that L is a power of two (we can always pad the original h(n) with zeros in order to meet this), M is also a power of two, and so the DFTs can be computed by the popular radix-two or mixed-radix fast Fourier transform (FFT) algorithms, which are discussed in more detail in Section 2.5. In fact, as long as M is highly composite, we can use a fast DFT algorithm [4].

By replacing the direct-form linear convolution in (2.1) by the overlap-add procedure, we have to perform, for each block of L output samples, one direct DFT (the DFT of h(n) can be precomputed and stored), M complex multiplications,
one inverse DFT, and L - 1 additions (because the last sample of u_r(n) is always zero). In fact, due to the complex conjugate symmetry of the DFT for real sequences [see (1.17)], only L - 1 complex multiplications and two real multiplications are needed to perform the product in (2.3). The computation of a length-M DFT of a real sequence by means of an FFT takes approximately 2M log2 M real operations when M is a power of two (the precise numbers are given in Section 2.5), and so the computational complexity of FIR filtering with the DFT grows proportionally to the logarithm of the filter length. With direct-form filtering the computational effort grows proportionally to the filter length itself. As an example, if the filter length is L = 256, the direct-form implementation requires a total of 511 arithmetic operations per output sample. With the overlap-add procedure the computational load is only 62 arithmetic operations per output sample.

From the example discussed above, it is clear that when we want to perform FIR filtering with a long filter, we should use the DFT-based overlap-add procedure, because the savings in computations can be of an order of magnitude or more. This argument alone would be enough to justify the search for efficient FFT algorithms, a topic to which we will return in Section 2.5.

Figure 2.2. FIR filtering via the overlap-add procedure.

The DFT is not the only transform that can be used for circular convolution (and hence for FIR filtering). The discrete Hartley transform (DHT) also has a convolution property [5, 6]. Calling X, H, and U the DHTs of the vectors x, h, and u, respectively, the circular convolution of (2.4) leads to the following Hartley-domain relationship

    U(k) = X(k) He(k) + X(M - k) Ho(k)                                (2.9)

where X(k) denotes the kth element of X, and He(k) and Ho(k) are the even and odd parts of H, that is,

    He(k) = [H(k) + H(M - k)] / 2
    Ho(k) = [H(k) - H(M - k)] / 2

In many signal processing applications, it is desired to have a linear-phase filter, and so h(n) will have even symmetry. In this case, Ho(k) = 0 and (2.9) reduces to a simple element-by-element product, exactly like the DFT convolution property.

Since the DHT can be used to perform cyclic convolution, it can replace the DFT in the overlap-add algorithm for FIR filtering. Compared to the DFT-based overlap-add procedure, the DHT requires a small increase in the number of additions [6] (less than 1% for M = 512). However, the DHT has the advantage of being its own inverse, so only one subroutine is necessary for transform computation. This is not true for the DFT with real input data, as we will see in Section 2.5, where a direct DFT subroutine cannot be simply modified to compute the inverse DFT.

The discrete cosine transform (DCT) can also be used to approximate cyclic convolutions [7], but an exact FIR filtering procedure of the overlap-add kind cannot be easily performed with DCTs. Nevertheless, in the context of signal restoration or coding, where the reconstructed signal will necessarily be subject to errors, filtering in the DCT domain as part of a signal coding or restoration process may be useful. In fact, DCT-domain filtering based on the human visual system response has been successfully applied to the problem of progressive image transmission [8]. More details can be found in [9].
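The Hartley-domain relationship (2.9) can be verified numerically. The sketch below uses a naive O(M^2) DHT, with the transform and function names chosen for illustration only; note that the same routine, scaled by 1/M, serves as the inverse, which is the self-inverse property mentioned above.

```python
# Circular convolution via the DHT, as in (2.9).
# Illustrative sketch with a naive O(M^2) DHT; names are not from the text.
import math

def dht(v):
    """DHT: X(k) = sum_n x(n) cas(2*pi*n*k/M), cas(t) = cos(t) + sin(t)."""
    M = len(v)
    return [sum(v[n] * (math.cos(2 * math.pi * n * k / M) +
                        math.sin(2 * math.pi * n * k / M))
                for n in range(M)) for k in range(M)]

def dht_circular_convolution(h, x):
    """Circular convolution of h and x via (2.9)."""
    M = len(x)
    H, X = dht(h), dht(x)
    He = [(H[k] + H[(M - k) % M]) / 2 for k in range(M)]   # even part of H
    Ho = [(H[k] - H[(M - k) % M]) / 2 for k in range(M)]   # odd part of H
    U = [X[k] * He[k] + X[(M - k) % M] * Ho[k] for k in range(M)]
    return [c / M for c in dht(U)]          # DHT is its own inverse (up to 1/M)
```

A quick check against the direct evaluation of (2.4) for a short sequence confirms the relationship; for an even-symmetric h, Ho vanishes and only the element-by-element product X(k)He(k) remains.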
2.1.3 Adaptive Filtering

In applications such as automatic equalization of unknown channels, system identification, and interference cancelling, we need to be able to adjust the filter coefficients in Fig. 2.1 until the filter output matches a desired response. Assuming the filter is FIR, the transversal structure of Fig. 2.1 can be easily converted into an adaptive filter [13], as shown in Fig. 2.4.

Figure 2.4. Adaptive FIR filter, transversal structure.

The error signal e(n), which is the difference between the desired signal d(n) and the filter output y(n), is the key information for the block "adaptation algorithm" in Fig. 2.4, which should try to minimize the error in some average sense. In the widely used least-mean-squares (LMS) algorithm, the filter coefficients are updated according to

    h_{n+1} = h_n + 2 μ e(n) x_n

where h_n is the L-dimensional vector containing the filter coefficients, and x_n is a running signal block, i.e.,

    x_n = [x(n)  x(n - 1)  ...  x(n - L + 1)]^T

The subscript n in h_n is important to denote the time variance of the filter coefficients. In fact, for every new incoming sample, a new set of filter coefficients is employed.

The parameter μ controls a fundamental trade-off between speed of convergence and stability. If μ is too small, it may take many thousands of iterations for h_n to converge, whereas if it is too large, the filter coefficients may diverge. The standard of comparison for the definition of what are small or large values of μ is the set
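The adaptation loop above can be sketched in a short system-identification experiment. The setup (a known 2-tap "unknown" system, a uniform random input, and all function and variable names) is purely illustrative:

```python
# LMS adaptation h_{n+1} = h_n + 2*mu*e(n)*x_n in a system-identification
# setting. Illustrative sketch; names and setup are not from the text.
import random

def lms_identify(x, d, L, mu):
    """Adapt an L-tap FIR filter so its output tracks d(n)."""
    h = [0.0] * L
    for n in range(len(x)):
        xn = [x[n - m] if n - m >= 0 else 0.0 for m in range(L)]
        y = sum(hm * xm for hm, xm in zip(h, xn))    # filter output y(n)
        e = d[n] - y                                 # error signal e(n)
        h = [hm + 2 * mu * e * xm for hm, xm in zip(h, xn)]
    return h

random.seed(0)
true_h = [0.7, -0.2]                                 # "unknown" system
x = [random.uniform(-1, 1) for _ in range(2000)]
d = [sum(true_h[m] * (x[n - m] if n - m >= 0 else 0.0) for m in range(2))
     for n in range(len(x))]
h = lms_identify(x, d, L=2, mu=0.05)                 # h approaches true_h
```

With a modest μ the coefficients converge toward the unknown system; making μ much larger in this experiment readily produces the divergence described above.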