Post on 19-Dec-2015
Generation of highly parallel code for TigerSHARC processors
An introduction
04/18/23Introduction to highly parallel TigerSHARC code Copyright M. Smith and S. Lei Contact [email protected] / 45 + B14
Background assumed
Familiarity with TigerSHARC architecture
Familiarity with TigerSHARC programmer’s model for registers
Some assembly experience
An interest in beating the compiler in those special cases when you need the last drop of blood out of the CPU :-)
To be tackled
What’s causing the problem
– General limitations of instruction sets
How to recognize when you might be coming up against TigerSHARC architecture limitations
A process for optimizing the TigerSHARC parallelism
– Example -- Temperature conversion
– Bonus if time permits -- Average and instantaneous power
When are DSP instructions valid?
You are going to customize
– When can you use the DSP instructions?
– Most -- From Monday to Friday
– Some -- Only between 9:00 a.m. and 9:00 p.m.
Check against architecture
MIMD -- Parallel ops MUST be able to do this
– Can it be fetched in one cycle (1 instruction line)?
– Can it be executed in one cycle (resource question)?
– Can it execute without conflicting with other instructions?
– Then PROBABLY legal
HOWEVER -- The designers had the final decision and you have to live by that decision!
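The three questions above can be turned into a rough pre-check before consulting the manual. The following is a toy sketch only: the struct and function names are hypothetical, and the limits are simply the best-case budget this deck quotes for one instruction line (two memory accesses, one add, one multiply), not a complete statement of the TigerSHARC rules.

```c
#include <stdbool.h>

/* Toy model: operations requested in a single instruction line.
 * Hypothetical type, not part of any TigerSHARC toolchain. */
struct line_request {
    int memory_accesses;  /* loads or stores */
    int adds;             /* add-type compute operations */
    int multiplies;       /* multiply-type compute operations */
};

/* Returns true when the request fits the best-case budget quoted in
 * this deck. A true result only means "PROBABLY legal": the designers'
 * documented restrictions still have the final word. */
bool probably_legal(struct line_request r)
{
    return r.memory_accesses <= 2 && r.adds <= 1 && r.multiplies <= 1;
}
```

A request that fails this check is certainly worth re-examining; one that passes still has to be confirmed against the instruction set reference.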
Under best conditions
If instruction described the right way
– 2 data memory accesses (in or out) with a REQUIRED post modification operation, possibly with a modify register containing the value 0
– 1 add compute operation on data registers
– 1 multiply compute operation on data registers
– Ability to redo code using both X and Y
– Sometimes – audio for example – do left channel in X and right channel in Y
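The left-in-X, right-in-Y idea relies on the two channels having no data in common. A minimal "C" sketch of that independence, assuming an interleaved L R L R ... buffer; the function name and the gain parameters are hypothetical and do not come from the slides.

```c
#include <stddef.h>

/* Hypothetical helper: scale an interleaved stereo buffer
 * (L R L R ...) by separate left and right gains. */
void scale_stereo(float *samples, size_t frames,
                  float left_gain, float right_gain)
{
    for (size_t i = 0; i < frames; i++) {
        /* These two statements share no data, so in hand-written
         * assembly one could be placed in compute block X and the
         * other in compute block Y of the same instruction line. */
        samples[2 * i]     *= left_gain;   /* left channel  */
        samples[2 * i + 1] *= right_gain;  /* right channel */
    }
}
```

Whether a compiler actually pairs the two operations is not guaranteed; the sketch only shows the data layout that makes the pairing possible.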
Introduction to PPPPIC
Professor’s Personal Process for Parallel Instruction Coding
Basic code development -- any system
Write the “C” code for the function
void Convert(float *temperature, float *result, int N)
which converts an array of temperatures measured in “Celsius” (Canadian Market) to “Fahrenheit” (Tourist Trade)
Convert the code to TigerSHARC assembly code, following the standard coding and documentation practices, or just use the compiler to do the job for you
Standard “C” code
void Convert(float *temperature, float *result, int N) {
    int count;
    for (count = 0; count < N; count++) {
        *result = (*temperature) * 9 / 5 + 32;
        temperature++;
        result++;
    }
}
Process for developing custom code
Rewrite the “C” code using “LOAD/STORE” techniques
– TigerSHARC is essentially super-scalar RISC
Write the assembly code using a hardware loop
– Check that end of loop label is in the correct place
REWRITE the assembly code using registers and instructions that COULD be used in parallel IF you could find the correct optimization approach
Move algorithm to “Resource Usage Chart”
Optimize (Attempt to)
Compare and contrast time -- include set up and loop control time -- was it worth the effort?
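The last step, comparing times with set up and loop control included, comes down to simple arithmetic. A minimal sketch, assuming a hypothetical cost model in which each version has a fixed set-up cost plus a fixed cost per array element; the function names are illustrative and any cycle counts passed in are made up, not measured TigerSHARC figures.

```c
#include <stdbool.h>

/* Hypothetical cost model: fixed set-up cost plus a per-element cost. */
unsigned long total_cycles(unsigned long setup_cycles,
                           unsigned long cycles_per_element,
                           unsigned long elements)
{
    return setup_cycles + cycles_per_element * elements;
}

/* True when the optimized version is strictly faster for this array
 * size, i.e. when the effort paid off under the assumed model. */
bool optimized_wins(unsigned long plain_setup, unsigned long plain_per_element,
                    unsigned long opt_setup, unsigned long opt_per_element,
                    unsigned long elements)
{
    return total_cycles(opt_setup, opt_per_element, elements)
         < total_cycles(plain_setup, plain_per_element, elements);
}
```

With made-up figures of 10 + 6N cycles for the plain loop and 25 + 2N for the hand-optimized one, the optimized version loses for N = 3 (31 against 28 cycles) and wins for N = 100 (225 against 610), which is why set-up time belongs in the comparison.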
TigerSHARC-style load/store “C” code
void Convert(register float *temperature,
             register float *answer, register int N) {
    register int count;
    register float scratch;
    for (count = 0; count < N; count++) {
        scratch = *temperature;             /* load */
        scratch = scratch * (9.0f / 5.0f);  /* float constant: integer 9 / 5 would be 1 */
        scratch = scratch + 32;
        *answer = scratch;                  /* store */
        temperature++; answer++;
    }
}
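One detail worth checking when the multiply-by-9, divide-by-5 sequence is folded into a single scale factor: in “C”, 9 / 5 with integer operands evaluates to 1, so the factor must be written with floating-point constants. A small sketch of the difference; both helper names are hypothetical and exist only for this comparison.

```c
/* Integer operands: 9 / 5 is evaluated as the integer 1 before the
 * multiply, so the scaling is silently lost. */
float scale_with_int_ratio(float celsius)
{
    return celsius * (9 / 5) + 32;
}

/* Floating-point operands: 9.0f / 5.0f is 1.8f, as intended. */
float scale_with_float_ratio(float celsius)
{
    return celsius * (9.0f / 5.0f) + 32.0f;
}
```

For an input of 100 degrees Celsius the first version yields 132 and the second approximately 212, the correct Fahrenheit value.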