Generation of highly parallel code for TigerSHARC processors An introduction

Post on 19-Dec-2015


Page 1: Generation of highly parallel code for TigerSHARC processors An introduction This presentation will probably involve audience discussion, which will create

Generation of highly parallel code for TigerSHARC processors: An introduction

Page 2:

04/18/23Introduction to highly parallel TigerSHARC code Copyright M. Smith and S. Lei Contact [email protected] / 45 + B14

Background assumed

Familiarity with TigerSHARC architecture

Familiarity with TigerSHARC programmer’s model for registers

Some assembly experience

An interest in beating the compiler in those special cases when you need the last drop of blood out of the CPU :-)

Page 3:

To be tackled

What’s causing the problem

– General limitations of instruction sets

How to recognize when you might be coming up against TigerSHARC architecture limitations

A process for optimizing the TigerSHARC parallelism

– Example -- Temperature conversion

– Bonus if time permits -- Average and instantaneous power

Page 4:

When are DSP instructions valid?

You are going to customize

– When can you use the DSP instructions?

– Most -- From Monday to Friday

– Some -- Only between 9:00 a.m. and 9:00 p.m.

Check against architecture

MIMD -- Parallel ops MUST be able to do this

– Can it be fetched in one cycle (1 instruction line)?

– Can it be executed in one cycle (resource question)?

– Can it execute without conflicting with other instructions?

– Then PROBABLY legal

HOWEVER -- The designers had the final decision and you have to live by that decision!

Page 5:

Under best conditions

If the instruction is described the right way

– 2 data memory accesses (in or out), each with a REQUIRED post modification operation, possibly with a modify register containing the value 0

– 1 add compute operation on data registers

– 1 multiply compute operation on data registers

– Ability to redo code using both X and Y

– Sometimes – audio for example – do left channel in X and right channel in Y
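
The limits in this list can be treated as a bookkeeping exercise. The C sketch below is only an illustration: `LineUsage` and `probably_legal` are hypothetical names invented here, and the only limits encoded are the ones listed on this slide (two memory accesses per line, one add and one multiply per compute block). It is not a model of the real assembler rules; the designers' decisions in the processor documentation always win over a tally like this.

```c
#include <stdbool.h>

/* Hypothetical tally of what one instruction line asks of the hardware.
   The struct and field names are assumptions made for this sketch. */
typedef struct {
    int memory_accesses;  /* loads or stores, each with a post-modify */
    int x_adds;           /* add-type compute ops in compute block X */
    int x_multiplies;     /* multiply compute ops in compute block X */
    int y_adds;           /* add-type compute ops in compute block Y */
    int y_multiplies;     /* multiply compute ops in compute block Y */
} LineUsage;

/* Returns true when the line stays inside the limits listed on the slide.
   "Probably" is deliberate: passing this check does not prove legality. */
bool probably_legal(const LineUsage *u) {
    return u->memory_accesses <= 2
        && u->x_adds <= 1 && u->x_multiplies <= 1
        && u->y_adds <= 1 && u->y_multiplies <= 1;
}
```

A line with two loads, one X add and one X multiply passes the check; a line that asks for three memory accesses does not.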

Page 6:

Introduction to PPPPIC

Professor’s Personal Process for Parallel Instruction Coding

Page 7:

Basic code development -- any system

Write the “C” code for the function

void Convert(float *temperature, float *result, int N)

which converts an array of temperatures measured in “Celsius” (Canadian Market) to “Fahrenheit” (Tourist Trade)

Convert the code to TigerSHARC assembly code, following the standard coding and documentation practices, or just use the compiler to do the job for you

Page 8:

Standard “C” code

void Convert(float *temperature, float *result, int N) {
    int count;

    for (count = 0; count < N; count++) {
        *result = (*temperature) * 9 / 5 + 32;
        temperature++;
        result++;
    }
}

Page 9:

Process for developing custom code

Rewrite the “C” code using “LOAD/STORE” techniques

– TigerSHARC is essentially superscalar RISC

Write the assembly code using a hardware loop

– Check that end of loop label is in the correct place

REWRITE the assembly code using registers and instructions that COULD be used in parallel IF you could find the correct optimization approach

Move algorithm to “Resource Usage Chart”

Optimize (Attempt to)

Compare and contrast time -- include set up and loop control time -- was it worth the effort?
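
One way to picture the "registers that COULD be used in parallel" step is to stay in C and give each loop pass two independent chains of work. The sketch below is an assumption-laden illustration, not the presentation's own code: `convert_scalar` and `convert_two_lanes` are hypothetical names, and the variables `x` and `y` merely stand in for values that might later be placed in the X and Y compute blocks.

```c
static const float SCALE = 9.0f / 5.0f;   /* Celsius to Fahrenheit slope */
static const float OFFSET = 32.0f;

/* Baseline: one load, one multiply, one add, one store per pass. */
void convert_scalar(const float *temperature, float *result, int n) {
    for (int i = 0; i < n; i++) {
        float scratch = temperature[i];
        scratch = scratch * SCALE;
        scratch = scratch + OFFSET;
        result[i] = scratch;
    }
}

/* Two independent chains per pass. Nothing in the x chain depends on
   the y chain, so the two could in principle be scheduled side by side. */
void convert_two_lanes(const float *temperature, float *result, int n) {
    int i = 0;
    for (; i + 1 < n; i += 2) {
        float x = temperature[i];
        float y = temperature[i + 1];
        x = x * SCALE;
        y = y * SCALE;
        x = x + OFFSET;
        y = y + OFFSET;
        result[i] = x;
        result[i + 1] = y;
    }
    if (i < n) {                 /* odd element count: finish the last one */
        float x = temperature[i];
        x = x * SCALE;
        x = x + OFFSET;
        result[i] = x;
    }
}
```

Both functions should produce the same results; whether the second form actually runs faster is exactly the "was it worth the effort?" question, since the extra clean-up code counts against it.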

Page 10:

TigerSHARC-style load/store “C” code

void Convert(register float *temperature,
             register float *answer, register int N) {
    register int count;
    register float scratch;

    for (count = 0; count < N; count++) {
        scratch = *temperature;
        scratch = scratch * (9.0f / 5.0f);  /* float constants: 9 / 5 in integer arithmetic is 1 */
        scratch = scratch + 32;
        *answer = scratch;
        temperature++; answer++;
    }
}
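
Grouping the scale factor in parentheses changes how C evaluates it, which is worth checking when moving to load/store form. In `(*temperature) * 9 / 5` the float operand comes first, so every step is done in floating point; in a parenthesised ratio written with integer literals, `9 / 5` is integer division and yields 1. The two helper functions below are hypothetical and exist only to show the difference.

```c
/* Ratio written with integer literals: (9 / 5) is evaluated first,
   as integer division, so the scale factor is 1. */
float to_fahrenheit_int_ratio(float celsius) {
    return celsius * (9 / 5) + 32;
}

/* Ratio written with float literals: the scale factor is 1.8. */
float to_fahrenheit_float_ratio(float celsius) {
    return celsius * (9.0f / 5.0f) + 32.0f;
}
```

For an input of 100 the first version returns 132, while the second returns approximately 212.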
