Ultra sound solution
Impact of C++ DSP optimization techniques
![Page 2: Ultra sound solution Impact of C++ DSP optimization techniques](https://reader035.vdocuments.site/reader035/viewer/2022062221/56649e535503460f94b49e13/html5/thumbnails/2.jpg)
Research team discussion
Ultrasound probe (20 MHz) sends signals into the body that reflect off moving blood cells (artery? vein?)
The received ultrasound frequency is Doppler shifted compared to the transmitted frequency. Same as the sound when an ambulance goes by: higher if approaching, lower if receding
They get the positive frequencies (towards) on the left audio channel and the negative frequencies (away) on the right audio channel.
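A minimal sketch of the Doppler relation behind this slide, using the standard pulse-echo formula f_d = 2 * f0 * v * cos(theta) / c. The function name and the example values (0.5 m/s blood velocity, 1540 m/s speed of sound in tissue, zero beam angle) are assumptions for illustration and are not taken from the slides.

```cpp
#include <cassert>
#include <cmath>

// Doppler shift (Hz) seen by a pulse-echo probe:
//   f_d = 2 * f0 * v * cos(theta) / c
// v > 0 means flow towards the probe (positive shift),
// v < 0 means flow away from the probe (negative shift).
double dopplerShiftHz(double transmitHz, double velocityMps,
                      double angleRad, double soundSpeedMps) {
    assert(soundSpeedMps > 0.0);
    return 2.0 * transmitHz * velocityMps * std::cos(angleRad) / soundSpeedMps;
}
```

With the assumed values, `dopplerShiftHz(20.0e6, 0.5, 0.0, 1540.0)` comes to roughly 13 kHz, which is why the shifted signal can be carried on ordinary audio channels.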
04/21/23 ENCM515 – Ultrasound Problem, Copyright [email protected]
Picture looks like this
Note that the display loses all direction information
Can I help them to output the maximum frequency?
Captured audio signal
Engineering Problems
Problem 5 – Different amplitudes are common
Problem 6 – Why are the funny dead spots not lining up in the left and right channels? Handling stereo, not mono, signals
Incorrect labeling / misinterpretation
Problem 7 – How to remove dead spots?
Max frequency – definition 1
Frequency below which X% of the frequencies fall
Noisy signal for large thresholds (> 80%)
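One possible reading of definition 1, sketched as a percentile over a power spectrum: walk up the bins until the running sum reaches X% of the total. Treating "X% of the frequencies" as power-weighted is an assumption, as are the function name `maxFrequencyBin` and the use of `std::vector`; the slides do not show the actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the index of the first spectrum bin at which the cumulative
// power reaches `fraction` of the total power.
std::size_t maxFrequencyBin(const std::vector<double>& power, double fraction) {
    assert(!power.empty());
    assert(fraction > 0.0 && fraction <= 1.0);

    double total = 0.0;
    for (double p : power) total += p;

    const double threshold = fraction * total;
    double cumulative = 0.0;
    for (std::size_t k = 0; k < power.size(); ++k) {
        cumulative += power[k];
        if (cumulative >= threshold) return k;
    }
    return power.size() - 1;  // Only reached through rounding when fraction == 1.
}
```

For a spectrum `{4, 3, 2, 1}` the 80% point falls in bin 2 and the 50% point in bin 1. At high thresholds the result depends on the small, noisy bins in the tail, which fits the slide's remark about thresholds above 80%.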
After XPI Stage 2
Have a working algorithm concept
Engineering Problem 1 – Complex math (a + jb) on SHARC!
Engineering Problem 2 – Define maximum frequency: zillions of blood cells, therefore a distribution of frequencies. Workable prototype – discuss more with customer
Engineering Problem 3 – SHARC D/A can’t handle DC signal. Workable prototype – discuss more with customer
Engineering Problem 4 – Can SHARC handle all this in real time?
Problem 5 – Are different amplitudes of input channels common? Yes
Problem 6 – Why are the funny dead spots not lining up in the left and right channels? Artifact – mislabeled and misinterpreted samples
Problem 7 – How to remove dead spots? – Discuss more with customer
ProcessBlock DONE OUTSIDE INTERRUPT – AVOIDS RACE
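A host-side sketch of the pattern this slide names: the interrupt only stores samples and raises a flag, and the heavy processing runs from the main loop. All names here (`audioInterrupt`, `pollAndProcess`, `kBlockSize`) are hypothetical, and `std::atomic` stands in for whatever mechanism the SHARC code actually uses. A real design would probably also double-buffer the capture data.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

const std::size_t kBlockSize = 4;          // Hypothetical block length.

int captureBuffer[kBlockSize];             // Filled by the interrupt.
int workBuffer[kBlockSize];                // Owned by the main loop.
std::atomic<bool> blockReady{false};

// Stand-in for the audio interrupt: store samples, raise a flag, return quickly.
void audioInterrupt(const int* samples) {
    assert(samples != nullptr);
    for (std::size_t i = 0; i < kBlockSize; ++i) captureBuffer[i] = samples[i];
    blockReady.store(true, std::memory_order_release);
}

// Hypothetical processing step; here it just doubles each sample.
void processBlock(int* block) {
    for (std::size_t i = 0; i < kBlockSize; ++i) block[i] *= 2;
}

// Called from the main loop: process only when a full block is waiting.
bool pollAndProcess() {
    if (!blockReady.exchange(false, std::memory_order_acquire)) return false;
    for (std::size_t i = 0; i < kBlockSize; ++i) workBuffer[i] = captureBuffer[i];
    processBlock(workBuffer);
    return true;
}
```

The interrupt stays short, and `processBlock` never runs while the interrupt routine is partway through writing a block.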
Real life problem -- Stereo
Minor changes to Audio Preemptive Task
Make “C” code more general
Moved buffer[ ] to external files
Unknown size of arrays being processed
Switch to Release mode
Switching to the optimizing compiler (ReleaseNWC) means breakpoints can no longer be set – fix with these steps
First look at code
Timing -- software loop with r2 as loop counter – test at end
N * (10 – 1) cycles (jump is not db, i.e. not a delayed branch)
-1 for 1 parallel instruction
Use Compiler Info button
3 Stalls – 2 on software jump. 1 on ?
Obvious things to do
We are already processing left and right channels in one program
Switch to left audio in dm memory and right audio in pm memory
Need to do
Make right buffers ‘pm’
Change prototype of function to pass pm
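A rough sketch of what those two changes might look like. The `dm` and `pm` memory qualifiers are extensions of the SHARC compiler, so the macros below expand to nothing on a host compiler. The `__ADSP21000__` check, the macro names, the buffer names and `scaleStereo` are all assumptions for illustration, not the code shown on the slide.

```cpp
#include <cassert>

// Assumption: on the SHARC toolchain these expand to the compiler's
// dm / pm memory qualifiers; on a host compiler they expand to nothing.
#if defined(__ADSP21000__)
#define LEFT_MEM dm
#define RIGHT_MEM pm
#else
#define LEFT_MEM
#define RIGHT_MEM
#endif

const int kPoints = 128;   // Hypothetical block size.

float LEFT_MEM leftBuffer[kPoints];    // Left audio kept in dm memory.
float RIGHT_MEM rightBuffer[kPoints];  // Right audio moved to pm memory.

// The prototype carries the memory space of each pointer, so the compiler
// can issue the left and right accesses in parallel.
void scaleStereo(float LEFT_MEM* left, float RIGHT_MEM* right, int n, float gain) {
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        left[i] *= gain;
        right[i] *= gain;
    }
}
```

On the host this is ordinary C++; the point is only that both the buffer declarations and the function prototype have to agree on which memory each channel lives in.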
As expected 2 cycles saved
Parallel dm and pm reads and writes
Why software loop?
Compiler does not know the size of the loop, so it can’t optimize the loop
THIS PRAGMA IS A CONTRACT BETWEEN THE DEVELOPER AND COMPILER – DON’T LIE
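The slide image with the pragma is not part of this transcript. Assuming it is a loop-count hint of the form `#pragma loop_count(min, max, modulo)`, the "contract" could be sketched as below. The function name, the bounds and the `__ADSP21000__` guard are illustrative assumptions. The `assert` is one way to catch a broken promise in debug builds.

```cpp
#include <cassert>

// The caller promises that n is at least 8, at most 1024, and a multiple of 4.
void addBlocks(const float* a, const float* b, float* out, int n) {
    assert(n >= 8 && n <= 1024 && n % 4 == 0);  // Check the promise in debug builds.
#if defined(__ADSP21000__)
#pragma loop_count(8, 1024, 4)   // Assumed form: (minimum, maximum, modulo).
#endif
    for (int i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

If the caller passes an `n` that breaks the stated bounds, the compiler may still generate code that relies on them, so the result can be wrong without any warning.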
This does not compile
Pragma variables not handled by preprocessor
Variable as end of loop
Compiler will not optimize when the loop parameter is declared external, internal, or static
Loop parameters all constants known to compiler
Drop from 8 cycles to 2 cycles as compiler knows enough to switch to hardware loop control – STALLS FROM JUMP GONE
Where am I getting all my info?
Can we switch to SIMD mode?
VECTORIZATION MAY NOT BE POSSIBLE IF COMPILER DOES NOT KNOW ABOUT ALIGNMENT OF ARRAYS
(How arrays are placed in memory)
Impact of vectorization
Before -- loop count was 0x80, with memory operations of the form r2 = dm(i4, m6) where m6 = 1, meaning the code is doing r2 = *i4++;
New instructions – SIMD mode
Bit set mode1 0x200000 (bit clr mode 1)
Processor doing r2 = dm(i5, 2)
Same as r2 = dm(i5, 1) AND s2 = dm(i5, 1)
Loading two registers
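A plain C++ model of the difference, offered as a sketch rather than as what the compiler emits. The scalar loop steps its pointer by 1 and handles one element per pass; the two-lane loop steps by 2 and fills two accumulators per pass, with `r2` and `s2` standing in for the two registers named on the slide. The function names and the `passes` counter are hypothetical.

```cpp
#include <cassert>

// Scalar form: one element per pass, pointer stepped by 1.
int sumScalar(const int* data, int n, int* passes) {
    assert(n >= 0);
    int r2 = 0;
    *passes = 0;
    for (const int* p = data; p != data + n; p += 1) {
        r2 += *p;
        ++*passes;
    }
    return r2;
}

// Two-lane form: each pass loads a pair and steps the pointer by 2.
int sumTwoLanes(const int* data, int n, int* passes) {
    assert(n >= 0 && n % 2 == 0);
    int r2 = 0;   // First lane.
    int s2 = 0;   // Second lane.
    *passes = 0;
    for (const int* p = data; p != data + n; p += 2) {
        r2 += p[0];
        s2 += p[1];
        ++*passes;
    }
    return r2 + s2;
}
```

Both functions return the same sum, but the two-lane version needs half as many passes through the loop.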
Try using #pragma inline
BEFORE
AFTER (20 cycles faster?)
C++ showing out of order execution
WARNING
Let’s do “inline”
ProcessOneBlock( ) is called by four subroutines – let’s inline it
Mixed mode view is interesting
Mixed Mode
Out of order execution with 4 copies of the code for DoCopyBlock( ) (one for each of Process 0, Process1, Process2, Process 3)
NO CODE OF ProcessOneBlock( )
Speed improvement
Moving from the software loop and using dm and pm memories caused a change from 8 cycles / pt to 2 cycles for two points processed in SIMD (4 CALLS * 7 CYCLES SAVED * N POINTS PROCESSED)
Moving to IN_LINE causes a change of around 120 cycles for each subroutine call (4 CALLS * 120 CYCLES SAVED)
N = 128 -- (4 * 1800 to 4 * 120)
480 MHz processor -- 15 us to 1 us
LESSON LEARNT – SPEND YOUR TIME OPTIMIZING THE LOOPS – REST IS SMALLER AND GETS SMALLER WITH LARGER N
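The arithmetic on this slide can be checked with a few helper functions. The function names are hypothetical; the numbers plugged in are the ones quoted on the slide.

```cpp
#include <cassert>

// Cycles saved in the loops: calls * cycles saved per point * points per call.
long loopCyclesSaved(long calls, long savedPerPoint, long points) {
    assert(calls >= 0 && savedPerPoint >= 0 && points >= 0);
    return calls * savedPerPoint * points;
}

// Cycles saved by inlining: calls * overhead removed per call.
long inlineCyclesSaved(long calls, long savedPerCall) {
    assert(calls >= 0 && savedPerCall >= 0);
    return calls * savedPerCall;
}

// Converts a cycle count to microseconds for a core clock given in MHz.
double cyclesToMicroseconds(long cycles, double clockMHz) {
    assert(clockMHz > 0.0);
    return static_cast<double>(cycles) / clockMHz;
}
```

With N = 128, the loop changes save 4 * 7 * 128 = 3584 cycles, against 4 * 120 = 480 cycles from inlining. At 480 MHz, 4 * 1800 = 7200 cycles is 15 us and 4 * 120 = 480 cycles is 1 us. The loop saving grows with N while the inlining saving stays fixed.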
Other improvements depend on code characteristics / specifics
Memory alignment can be important
After the first char fetch, the system can switch to moving 8 chars at a time in SIMD
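A small host-side sketch of the alignment idea: check whether a pointer sits on an 8-byte boundary and count how many leading bytes must be handled one at a time before wide moves can start. The helper names and the sample array are hypothetical, and standard `alignas` stands in here for the toolchain's own alignment pragmas.

```cpp
#include <cassert>
#include <cstdint>

// True when the pointer sits on a multiple of `alignment` bytes.
bool isAligned(const void* p, std::uintptr_t alignment) {
    assert(alignment != 0);
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Number of leading bytes to handle singly before p reaches the alignment.
std::uintptr_t leadingBytes(const void* p, std::uintptr_t alignment) {
    assert(alignment != 0);
    const std::uintptr_t misalign = reinterpret_cast<std::uintptr_t>(p) % alignment;
    return misalign == 0 ? 0 : alignment - misalign;
}

// Sample data forced onto an 8-byte boundary.
alignas(8) char text[16] = "ultrasound data";
```

Starting at `text + 7` leaves exactly one byte to fetch singly before the pointer reaches the next 8-byte boundary, which is the situation the slide describes. Starting at `text` needs no single-byte fetches.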
Conditional code (manual PGO)
Correct ways to process loops
#pragma all_aligned
#pragma loop_unroll N
#pragma SIMD_for
#pragma align num
#pragma alignment_region( and #pragma alignment_region_end