INTERNATIONAL JOURNAL OF ADAPTIVE CONTROL AND SIGNAL PROCESSING
Int. J. Adapt. Control Signal Process. 2004; 18:547–549
BOOK REVIEW
PARALLEL COMPUTING FOR REAL-TIME SIGNAL PROCESSING AND CONTROL, M. O. Tokhi, M. A. Hossain and M. H. Shaheed, Springer, London, 2003, xiii + 253 pp
Developments in control and signal processing applications are closely associated with the history of computation. At the level of the processor and software as seen by the programmer, the vast majority of this history has been and remains uniprocessor. However, this hides many layers of parallelism which are realized at chip level and utilized by the compiler. It is this, rather than raw clock speed, which is mostly responsible for the incredible increase in computing performance seen over the last several decades. Explicit parallelism, where the programmer has direct responsibility for structuring the computation, has been applied mostly in specialist areas, such as the use of super-computers for computationally intensive problems (e.g. ray-tracing and animation, quantum dynamics, modelling of plasmas and nonlinear fluids, etc.). The recent trend in this field is the use of computing clusters with software like MPI and PVM for handling task distribution. There are many types of problem where parallel processing is effective, ranging from applications which have huge but largely independent sets of data (the SETI screen-saver is perhaps the classic example) to relatively small, well-structured and generic algorithms (such as the FFT) where it may make sense to have a customised hardware implementation. The class of applications in control and digital signal processing which are addressed in this book lie somewhere between these extremes, with the added dimension of a hard real-time deadline to meet. They hover uncomfortably just beyond a satisfactory uniprocessor implementation but somewhat short of justifying the investment required by dedicated hardware. It is then natural to ask whether a programmable, multi-processor solution – involving a modest rather than large number of processors – can reduce the execution time sufficiently to comply with the real-time requirements. For many algorithms common to control and signal processing, it turns out that it is quite difficult to re-structure the algorithm
so that the code and data are effectively partitioned and distributed across several communicating processors. The generic problem has been the subject of much research in computer science [1]. Clearly, the situation also varies with time because, while both the complexity of applications and the capability of computers continually increase, they do not do so at the same rate. In particular, there was a lengthy period in the 1980s where the complexity of algorithms surged ahead of the computational performance of microprocessors, as a result of the rapid developments in control theory at that time. This stimulated a great deal of interest in multiprocessor solutions and, in the United Kingdom in particular, this was further spurred by the availability of the Inmos transputer and its parallel programming language, occam. It is here that many of the roots of this book lie.

The book's introductory chapter is an overview
of the concepts involved in parallel processing. Early on, there is a reference to the factors which limit the effectiveness of the approach: can the application be divided into sufficient independently computable components, and are the communication and synchronization overheads required to support parallel operations small compared to the computation itself? This sets the theme for much of the remainder of the book. The introduction also summarizes some general fields of application for parallel processing. Here, I would have liked to see some more specific examples of its use in control and signal processing, particularly if references to its use in industry were available. The authors could also have expressed the context of the book rather better by emphasizing its focus on comparatively small but tightly coupled algorithms which must be computed in a short space of time, i.e. the goal is response time and not throughput.

Chapter 2 is probably the part of the
book which will appeal to the widest audience. It contains a summary of the major computer architectures at a level appropriate for the non-specialist. It is a distillation of material found in other text books [2,3], stripped of detail which would obscure the principle for engineers having a limited background in the subject. It would, however, have benefited from a section at
Copyright © 2004 John Wiley & Sons, Ltd.
the front to remind the reader of the salient features of the von Neumann architecture, and it could have been extended to describe some features of contemporary microprocessors (e.g. cache memory, branch prediction, superscalar architecture), which are sometimes referred to in later chapters with little support material. In this respect, the book by Lawson [4] is somewhat superior.

In Chapter 3, which deals with performance
evaluation issues, the authors discuss how to decide whether a parallel processing solution yields superior performance, usually in comparison with a uniprocessor. They discuss the complex and inter-related factors – the hardware, the algorithm, the software and cost considerations – which affect the performance of a parallel computer and make performance comparisons difficult. As with any computer benchmarking, meaningful comparisons can only be made when all the variables are carefully controlled so that the effect of introducing parallelism is isolated. The second part of this chapter is a case study on the effect of the run-time/communication-time ratio and compiler efficiency on some sample applications, which include a finite difference simulation and an adaptive filter algorithm. Their 'real-world' approach, citing measurements on specific cases drawn from the authors' research, is characteristic of much of the book. It has the advantage of being practical and definite but is rather empirical, with the drawback that modelling or theory to give insight into the results is sparse, and generalization from the particular case presented can only be done rather loosely. Here, I found the order of the chapters a bit odd because the details of the example applications are not given until Chapter 6; I would have thought that interpreting the results would require an understanding of the algorithms used in the case study.

The focus of Chapter 4 is on performance
metrics, with an emphasis on speed-up, efficiency and scalability. Simple expressions for the performance metrics are developed for both homogeneous and heterogeneous architectures and illustrated by a number of examples. Again the chapter includes a case study, showing how the measured execution time of a flexible beam simulation algorithm varies with problem size and processor type – the comparison is done for the TMS320C40 DSP, the Intel i860 vector processor and the INMOS T8 transputer.
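To give a flavour of the kind of relationship such metrics capture, the following is a sketch of my own based on the classic Amdahl's-law argument for a homogeneous machine; it is not a reproduction of the expressions developed in the book, and it deliberately ignores the communication and synchronization overheads that the book's case studies measure.

```python
# Illustrative sketch (Amdahl's law): how the parallelizable fraction of an
# algorithm limits speed-up and efficiency on n identical processors,
# assuming no communication or synchronization costs.

def speedup(parallel_fraction: float, n_processors: int) -> float:
    """Ideal speed-up over a uniprocessor: the serial fraction runs
    unchanged, the parallel fraction is divided among n processors."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_processors)

def efficiency(parallel_fraction: float, n_processors: int) -> float:
    """Speed-up per processor; 1.0 would be perfect scaling."""
    return speedup(parallel_fraction, n_processors) / n_processors

if __name__ == "__main__":
    # Even with 90% of the work parallelizable, 8 processors achieve
    # well under 8x speed-up, and efficiency falls as n grows.
    for n in (1, 2, 4, 8):
        print(n, round(speedup(0.9, n), 2), round(efficiency(0.9, n), 2))
```

The point echoed throughout the book is that the overheads omitted here only make matters worse: the measured speed-ups in the case studies fall below even these ideal figures.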
In Chapter 5, attention is turned to programming for real-time parallel computing. The computer science community has generated much material on this topic [5–7] and, while it is obviously not possible to cover it in any depth in this book, I feel that too much is assumed here of the reader's knowledge about concurrency and process scheduling. A substantial part of the chapter describes a case study which implements the flexible beam simulation algorithm on a dual processor in a multi-threaded environment. The results are a striking demonstration of (a) the overhead which the thread synchronization mechanism can introduce and (b) the serious bottleneck which the I/O process can introduce, dominating the execution time despite any computational gains due to parallelism. Altogether, there is here a salutary message: obtaining the benefits of parallel processing in control or signal processing applications does not come easily.

Chapter 6 turns to the algorithms. As well as
the beam simulation example, the structures of algorithms for system identification, adaptive filtering, spectral analysis and a flexible manipulator are considered. Whilst these are all real-time applications, I'm not sure that any of them are control applications, at least in the sense of the processor being in a feedback loop. Even in the 'controlled beam' example it is not clear whether the real-time simulation, which forms the bulk of the computation, is actually needed when the physical system itself is being controlled. Perhaps image processing or large-scale systems could have been considered as applications, although these too could be regarded as pre-processing of complex sensor signals before application of a relatively simple 'control' algorithm. Some reference to the idea of the computational complexity of algorithms would further strengthen this chapter.

Chapter 7, which is a classification and survey
of major developments in microprocessors over the last 30 years or so, helps with interpretation of the results in earlier chapters. I would, however, categorize it as reference, rather than text book, material. Having summarized microprocessor development to the present day, it is a little surprising that no mention is made of possible future trends. For instance, nothing is said about FPGA implementation, where the rapid development in hardware and development tools is making FPGAs easier to use routinely in control and instrumentation, and even able to challenge the capability of programmable DSPs. In terms of
software, it would be appropriate in the context of this book to mention the parallel structure offered by Handel-C (which uses the principles of Communicating Sequential Processes and is influenced by occam).

The final chapter is a comprehensive comparison of the capabilities of several processor types on the algorithms described in Chapter 6. In the uniprocessor case, the i860 is a clear winner in 4 out of the 6 examples considered and the C40 wins in the other 2, which the authors attribute to the compatibility of the former with regularly structured algorithms and of the latter with irregular algorithms. The need for care with heterogeneous architectures is demonstrated too: it is shown, for instance, that combining a T8 with the i860 gives an 11% reduction in execution time on an RLS filter algorithm but a 20% increase on an LMS filter.

The book is part of a series on advanced text
books, and I would agree that this would not normally be considered undergraduate material. It is not that it is a difficult book; in fact, it is easily accessible to a wide audience and, with the exception of Chapter 6, requires only elementary mathematics. In my view, however, it is a greater priority for undergraduate students in these fields to learn the principles of real-time systems, operating systems, software engineering and numerical sensitivity before parallel processing is considered. A major problem faced by text books of this type is that they date rapidly, and that is apparent here, because the performance comparisons relate to processor types that are at or near the end of their lifetimes. This may not affect the principles involved, but the relevance of the performance comparisons, in the face of relentless increases in computing power, is weakened. On a detailed point, the production of the book is not always perfect. For instance, the running page headers for Chapter 3 are incorrect and there is a change in text size on p. 30. In the review copy, the right-hand margin is unattractively narrow throughout.

My overall impression is that the book does not
contain examples of spectacular speed-up arising from the use of multiple processors. Whether a particular application has the potential to benefit from parallel processing requires careful judgement. J. P. Eckert's cautionary words at the famous Moore School Lectures in 1946 may yet have a ring
of truth: "The arguments for parallel operation are only valid provided one applies them to the steps which the built-in or wired-in programming of the machine operates. Any steps which are controlled by the operator, who sets up the machine, should be set up in a serial fashion. It has been shown over and over again that any departure from this procedure results in a system which is far too complicated to use." This does not, of course, imply that we should not try to prove him wrong! Personally, I believe that we are presently at a stage in the cycle where the available computing power exceeds what is generally required for control systems implementation – but this may change. Perhaps cognitive control is a future field of application? There is certainly ample evidence (the brain!) that massive parallelism, combined with rather low computing speed, leads to a prodigious capability for control and signal processing.
Dr D. I. JONES
School of Informatics, University of Wales
Dean Street, Bangor, Gwynedd LL57 1UT
Wales, UK
E-mail: [email protected]
(DOI: 10.1002/acs.821)
REFERENCES
1. Chapin SJ. Distributed and multiprocessor scheduling. ACM Computing Surveys 1996; 28(1):233–235.
2. Hwang K, Briggs FA. Computer Architecture and Parallel Processing. McGraw-Hill: New York, 1985.
3. Flynn MJ. Computer Architecture: Pipelined and Parallel Processor Design. Jones & Bartlett: Boston, London, 1995.
4. Lawson HW. Parallel Processing in Industrial Real-Time Applications. Innovative Technology. Prentice-Hall: Englewood Cliffs, NJ, 1992.
5. Burns A, Wellings A. Real-Time Systems and their Programming Languages (2nd edn). Addison-Wesley: Reading, MA, 1997.
6. Nissanke N. Realtime Systems. Prentice-Hall: Reading, MA, 1997.
7. Liu JWS. Real-Time Systems. Prentice-Hall: Reading, MA, 2000.