![Page 1: Analysis of Computational System Performance in Automatic ...essrl.wustl.edu/~jao/Talks/ConferenceTalks/HPEC_2000.pdf · Analysis of Computational System Performance in Automatic](https://reader034.vdocuments.site/reader034/viewer/2022042100/5e7bfce717c1186d1b4b377d/html5/thumbnails/1.jpg)
Analysis of Computational System Performance in Automatic Target Recognition

Joseph A. O’Sullivan, Michael D. DeVore
Electronic Systems and Signals Research Laboratory

Mark A. Franklin, Roger D. Chamberlain
Computer and Communications Research Center

Supported by:
DARPA grant DAAL01-98-C-0074
Boeing Foundation
ONR grant N00014-98-1-06-06
System Performance in ATR
Overview

• Factors of Interest
  – Result Quality
  – Throughput
  – System Resources
• Illustration from Automatic Target Recognition (ATR)
• Relating Factors of Interest
• Computational Model
• Example
• Conclusions
Introduction

Goal: A method of making implementation decisions in terms of quality of final results

Approach: Model the application and system to relate three factors
1. Quality of Results
2. Required Throughput (not latency)
3. System Resources

Results: Apply the approach to automatic target recognition (ATR) from synthetic aperture radar (SAR) images
Factors of Interest

Dependencies between result quality, throughput, and computing resources help determine:
• type of platform (commercial or custom)
• number and speed of processors
• interconnection network bandwidth
• memory bandwidth
ATR Illustration

[Figure: block diagram. Target $a$ = T72 → SAR Platform → image $r$ → Target Classifier and Orientation Estimator → $\hat{a}$ = T72, $\hat{\theta}$ = 45°]

For classification/estimation components we relate:
• Quality - Probability of erroneous classification
• Throughput - Target images processed per second
• Resources - Processors, memory and I/O bandwidth, etc.
Factor Inter-relationships

• ATR systems are explicitly or implicitly based on models of targets with some complexity $C$
• More complex target models require more computation but can yield better results; $\Pr(\text{error}) = f(C, \alpha_{\mathrm{SAR}})$
• Target model complexity and computational power determine overall system throughput; $T_{\mathrm{CHIP}} = g(C, \alpha_{\mathrm{COMP}})$
• Given an architecture, both result quality, $\Pr(\text{error})$, and throughput, $R = 1/T_{\mathrm{CHIP}}$, are parameterized by target model complexity
ATR as an Optimization Problem

• ATR can be viewed as maximizing a measure of goodness over all classes, $a$, and orientations, $\theta$.
• Likelihood based approaches maximize the probability density function of an observed image, $r$.
• Example: Model pixel $i$ as independent, zero mean, complex conditionally Gaussian, with variance $\sigma_i^2(\theta,a)$

$$p_{R|\Theta,A}(r \mid \theta, a) = \prod_i \frac{1}{\pi \sigma_i^2(\theta,a)} \exp\left(-\frac{|r_i|^2}{\sigma_i^2(\theta,a)}\right)$$

• Variances, estimated from training data, must be stored
ATR as a Parallelizable Operation

• Maximizing $p_{R|\Theta,A}$ is equivalent to maximizing the log-likelihood, $l(r \mid \theta, a) \propto \ln p_{R|\Theta,A}$

$$l(r \mid \theta, a) = -\sum_i \left[ \ln \sigma_i^2(\theta,a) + \frac{|r_i|^2}{\sigma_i^2(\theta,a)} \right]$$

• Each measured value, $r_i$, undergoes operations of the same form for all pixels, orientations, and target classes
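The log-likelihood search described on this slide can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `log_likelihood` and `classify`, the nested-dictionary layout of the variance templates, and the toy data are all hypothetical.

```python
import math

def log_likelihood(r, var):
    """l(r | theta, a) = -sum_i [ln var_i + |r_i|^2 / var_i]."""
    return -sum(math.log(v) + abs(x) ** 2 / v for x, v in zip(r, var))

def classify(r, templates):
    """Return (a_hat, theta_hat, l_max), maximizing l over classes and orientations.

    templates: hypothetical layout {class_name: {theta_degrees: [var_1, ..., var_N]}}
    """
    best = None
    for a, by_theta in templates.items():
        for theta, var in by_theta.items():
            score = log_likelihood(r, var)
            if best is None or score > best[2]:
                best = (a, theta, score)
    return best

# Toy data (made up): two classes, two orientations, three complex pixels.
templates = {
    "T72": {0: [1.0, 4.0, 1.0], 45: [4.0, 1.0, 4.0]},
    "T62": {0: [2.0, 2.0, 2.0], 45: [0.5, 0.5, 8.0]},
}
r = [2 + 0j, 0.5j, 1.5 - 1.5j]
a_hat, theta_hat, l_max = classify(r, templates)
print(a_hat, theta_hat, l_max)
```

Every (class, orientation) pair runs the same per-pixel operation on independent template data, so the two loops in `classify` can be split across processors with only a final maximum to combine.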
ATR as a Parallelizable Operation

[Figure: three nested block diagrams of the parallel structure. (1) Images $r_1, r_2, \ldots, r_m$ feed independent ATR units that output $\hat{a}_1, \hat{a}_2, \ldots, \hat{a}_m$. (2) Within an ATR unit, $\max_\theta l(r \mid \theta, a_k)$ is computed for each target class $a_1, \ldots, a_t$, followed by a maximum over $a$. (3) For one class, $l(r \mid 0°, a)$, $l(r \mid 5°, a)$, ..., $l(r \mid 355°, a)$ are evaluated and maximized to give $l(r \mid \hat{\theta}, a)$; each $l(r \mid \theta, a)$ is a sum ($\Sigma$) of per-pixel functions $g$ of $r$ and $\sigma^2(\theta, a)$.]
Quality of Results and Complexity

In this context, target model complexity relates to resolution in the approximation of $\sigma^2(\theta,a)$

[Figure: Coarse model of a T62 tank, 1 template with 16K floats]

[Figure: Fine model of a T72 tank (1/5 relative scale), 72 templates totaling 1.1M floats]
Result Quality and Throughput

• ATR hinges on likelihood function evaluation
• Each implementation decision sets a maximum number of function evaluations per unit time
• Maximum number of function evaluations determines what level of model can be used
• Level of model determines ATR performance
• Approach is to determine, for any combination of system parameters, the best achievable performance as a function of required chip rate
Computational Models

Chip processing rate $R = 1/T_{\mathrm{CHIP}}$

Assumptions:
• Each CPU optimizes over a region of the search space
• Multi-issue CPU with 2 instructions/clock cycle
• 6 instructions per pixel

Parameters:
• $T_{\mathrm{CHIP}}$: sec/SAR image
• $T_1$: sec/clock cycle
• $T_2$: sec/memory read
• $T_3$: sec/SAR image load
• $L$: templates/target
• $M$: targets
• $N$: pixels/template
• $P$: processors

$$T_{\mathrm{CHIP}} = \frac{3LMN}{P}\,T_1 + \frac{LMN}{P}\,T_2 + T_3$$
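As a hedged illustration of this timing model, the Python sketch below evaluates $T_{\mathrm{CHIP}}$ and inverts it to find the largest model size $L \cdot N$ that still meets a required chip rate. The factor 3 is presumably the 6 instructions per pixel divided by 2 instructions per clock cycle. The function names, the inversion helper, and the numeric values are assumptions for illustration and do not come from the slides.

```python
def chip_time(L, M, N, P, T1, T2, T3):
    """T_CHIP = (3*L*M*N/P)*T1 + (L*M*N/P)*T2 + T3, in seconds per SAR image."""
    work = L * M * N / P  # pixels evaluated by each processor
    return 3 * work * T1 + work * T2 + T3

def max_model_size(rate, M, P, T1, T2, T3):
    """Largest L*N (floats per target) such that 1/T_CHIP >= rate.

    Returns 0.0 when the image load time T3 alone exceeds the time budget.
    """
    budget = 1.0 / rate - T3
    if budget <= 0:
        return 0.0
    return budget * P / (M * (3 * T1 + T2))

# Hypothetical numbers: 1 ns clock cycle, 1 ns memory read, 100 us image load.
t = chip_time(L=72, M=10, N=16384, P=16, T1=1e-9, T2=1e-9, T3=1e-4)
ln_max = max_model_size(rate=1.0 / t, M=10, P=16, T1=1e-9, T2=1e-9, T3=1e-4)
print(t, ln_max)
```

Feeding the rate `1/t` back into `max_model_size` recovers the original model size `72 * 16384`, which is a quick consistency check on the inversion.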
Example

• $T_2 = T_1$ with prefetch
• 1 GHz clock
• 16 KB/SAR image (4 B floats)
• $M = 10$ targets
• Varying target model complexity ($L$ and $N$)

[Figures: two panels, one for a 1 Gb/s image bus and one for a 10 Gb/s image bus]
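The example parameters can be plugged into the timing model from the Computational Models slide. The sketch below is illustrative only: it assumes 16 KB means 16 × 1024 bytes and 1 Gb/s means 10^9 bits per second (the slides do not say which conventions were used), and the model size `L=72`, `N=4096` and the processor counts are hypothetical choices, not values read off the original figures.

```python
def chip_rate(L, N, P, bus_bits_per_s, M=10, clock_hz=1e9, image_bytes=16 * 1024):
    """Chip rate R = 1/T_CHIP (images per second) for the example parameters."""
    T1 = 1.0 / clock_hz                      # seconds per clock cycle
    T2 = T1                                  # T2 = T1 with prefetch
    T3 = image_bytes * 8 / bus_bits_per_s    # seconds to load one SAR image
    work = L * M * N / P                     # pixels evaluated by each processor
    return 1.0 / (3 * work * T1 + work * T2 + T3)

# Hypothetical model size and processor counts, for both image bus speeds.
for bus in (1e9, 10e9):
    for P in (1, 4, 16, 64):
        print(f"bus={bus:.0e} b/s  P={P:3d}  R={chip_rate(72, 4096, P, bus):.1f} images/s")
```

Sweeping `L` and `N` instead of `P` gives one throughput curve per processor count, which is the kind of family of curves the two panels compare.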
Example

• Figures show increase of chip rate provided by more processors for fixed probability of error
• Alternatively, they show decreased probability of error with more processors for fixed chip rate
• Curve convergence at low chip rates indicates small recognition improvement at high target model complexities
• For 1 Gb/s bus, convergence at high chip rates indicates time to load SAR image dominates total chip processing time
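The last bullet can be checked against the timing model from the Computational Models slide. As the number of processors grows, or the model size shrinks, the compute and memory terms vanish and only the image load time remains:

$$\lim_{LMN/P \to 0} T_{\mathrm{CHIP}} = T_3 \quad\Longrightarrow\quad R \le \frac{1}{T_3}$$

Assuming 16 KB means $16 \times 1024$ bytes and 1 Gb/s means $10^9$ bits per second (conventions the slides do not state), the bound works out to:

$$T_3 = \frac{16 \times 1024 \times 8\ \text{bits}}{10^9\ \text{bits/s}} \approx 1.31 \times 10^{-4}\ \text{s} \quad\Longrightarrow\quad R \lesssim 7.6 \times 10^{3}\ \text{images/s}$$

On the same assumptions the 10 Gb/s bus raises the ceiling tenfold, to roughly $7.6 \times 10^{4}$ images/s, which would explain why the convergence is called out only for the 1 Gb/s case.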
Conclusions

• Throughput demands may vary with conditions of use
• Quality of results as a function of required throughput is determined in part by system implementation
• Models of application behavior and system performance can be combined to find acceptable combinations of result quality, throughput, and system design.