Department of Electrical Engineering, Electronic Systems
Optimization of the mMIPS
Sander Stuijk
Outline
- Video processing
- I/O operations on the mMIPS
- Extending the LCC compiler
- Assignment
Video processing – motion estimation
Video processing – algorithm/architecture codesign
[Figure: block diagram of a video system-on-chip with the blocks MPEG, MBS + VIP, MMI + AICP, 1394, MSP, M-PI, T-PI, MIPS, TriMedia VLIW, and conditional access, next to a table that compares a general-purpose processor (GP) with an ASIP for a picture-rate up-converter on bandwidth (MB/s), power (mW), effective area (mm2), load (%), and area (mm2).]
What is an image?
A black-and-white image is a matrix of luminance values.
More pixels means higher image quality.
[Figure: the same image at 400 pixels/line x 300 lines and at 40 pixels/line x 30 lines]
How do you store an image?
An image is stored as a one-dimensional pixel array.
The first line occupies indices 0 to width-1; the last pixel has index width*height-1.
Address of pixel (x, y): [y*width+x]
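The addressing rule above can be sketched in C as follows. The function names `pixel_index` and `get_pixel` are illustrative and do not come from the course material.

```c
#include <stddef.h>

/* Offset of pixel (x, y) in a row-major image that is `width` pixels wide.
 * Rows are stored one after another, so row y starts at y * width. */
size_t pixel_index(size_t x, size_t y, size_t width)
{
    return y * width + x;
}

/* Read the luminance of pixel (x, y). The caller must ensure that
 * x < width and that y is a valid row of the image. */
unsigned char get_pixel(const unsigned char *image, size_t width,
                        size_t x, size_t y)
{
    return image[pixel_index(x, y, width)];
}
```

For a 400 x 300 image, the bottom-right pixel (399, 299) lands at index 400*300-1, which matches the last address given above.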
How many bits do we need per pixel?
Experiments show that we can distinguish about 200 luminance levels in an image.
We shall therefore use an 8-bit representation of luminance (256 levels).
Video processing
Spatial domain: image processing on a still image. Examples:
- Edge detection
- Blurring
- ...
Temporal domain: image processing across different points in time. Examples:
- Motion estimation
- Object recognition
- ...
The assignment deals with still images.
What does a 3x3 filter do with an image?
A 3x3 filter replaces each pixel (byte) in the file with the weighted sum of the pixel and its eight direct neighbors:
$$\mathrm{Output\_byte}_{n} = \frac{1}{N} \sum_{line=-1}^{1} \sum_{pixel=-1}^{1} C_{line,pixel} \cdot \mathrm{byte}_{n + line \cdot width + pixel}$$
With:
$$N = \sum_{line=-1}^{1} \sum_{pixel=-1}^{1} C_{line,pixel}$$
and each filter coefficient $C_{line,pixel}$ represented by one byte.
Example: filter coefficients are all "1"
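As a minimal sketch, the weighted sum and its normalisation by N can be written in C as below. The names `filter3x3_pixel` and `clip_to_byte` are hypothetical, as is the choice to pass the coefficients as a flat array of nine signed bytes in row order. The sketch follows the formula literally, so it omits the rounding offset that the code on the later slides adds before dividing.

```c
/* Clip a weighted sum back to the range of one byte (0..255). */
static int clip_to_byte(int v)
{
    if (v < 0) return 0;
    if (v > 255) return 255;
    return v;
}

/* Filter pixel n of a row-major image that is `width` pixels wide.
 * c[(line + 1) * 3 + (pixel + 1)] holds C_{line,pixel} (assumed layout).
 * n must not lie on the image border, and the coefficients must not
 * sum to zero, because the sum is divided by N. */
int filter3x3_pixel(const unsigned char *in, int width, int n,
                    const signed char *c)
{
    int line, pixel;
    int sum = 0;   /* weighted sum of the 3x3 neighbourhood */
    int norm = 0;  /* N, the sum of all coefficients */

    for (line = -1; line <= 1; line++) {
        for (pixel = -1; pixel <= 1; pixel++) {
            int coeff = c[(line + 1) * 3 + (pixel + 1)];
            sum += coeff * in[n + line * width + pixel];
            norm += coeff;
        }
    }
    return clip_to_byte(sum / norm);
}
```

With all coefficients equal to 1, N is 9 and the result is the plain average of the nine pixels.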
Blur filter
Filter coefficients:
+1 +1 +1
+1 +1 +1
+1 +1 +1
C code for blur filter
int a, result;
for (a = width + 1; a < width * height - (width + 1); a++) {
    result = ((
        1 * (int)buf_i[a - 1 - width] +
        1 * (int)buf_i[a - width] +
        1 * (int)buf_i[a + 1 - width] +
        1 * (int)buf_i[a - 1] +
        1 * (int)buf_i[a] +
        1 * (int)buf_i[a + 1] +
        1 * (int)buf_i[a - 1 + width] +
        1 * (int)buf_i[a + width] +
        1 * (int)buf_i[a + 1 + width]
        + 4) / 9);   /* +4 rounds to the nearest integer when dividing by 9 */
    /* clip weighted sum (pixel value) back to one byte */
    if (result < 0) buf_o[a] = 0;
    else if (result > 255) buf_o[a] = 255;
    else buf_o[a] = result;
}
Sharpening filter
Filter coefficients:
-1 -1 -1
-1 12 -1
-1 -1 -1
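A short worked example may help to see why the result has to be clipped. The sketch below applies the sharpening coefficients to a single pixel, given the centre value and the sum of its eight neighbours. The function name `sharpen_pixel` is hypothetical, and the sketch divides by the coefficient sum 12 - 8 = 4 without any rounding offset.

```c
/* Sharpening kernel applied to one pixel: 12 times the centre pixel
 * minus the sum of the eight neighbours, divided by the coefficient
 * sum (12 - 8 = 4), then clipped to the range of one byte. */
int sharpen_pixel(int centre, int neighbour_sum)
{
    int result = (12 * centre - neighbour_sum) / 4;

    if (result < 0) return 0;      /* darker than black: clip to 0 */
    if (result > 255) return 255;  /* brighter than white: clip to 255 */
    return result;
}
```

In a uniform region (centre 100, neighbours 8 x 100) the pixel is unchanged: (1200 - 800) / 4 = 100. A pixel of 120 among neighbours of 100 becomes (1440 - 800) / 4 = 160, so the local contrast grows from 20 to 60. A white pixel among black neighbours gives 3060 / 4 = 765, which no longer fits in one byte and is clipped to 255.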
Outline
- Video processing
- I/O operations on the mMIPS
- Extending the LCC compiler
- Assignment
Input / output
void main(void)
{
    int a, result;
    char *buf_i = (char*)0x600000,   /* input image */
         *buf_o = (char*)0x665400;   /* output image */

    for (a = WIDTH + 1; a < WIDTH * HEIGHT - (WIDTH + 1); a++) {
        result = ((
            -1 * (int)buf_i[a - 1 - WIDTH] +
            -1 * (int)buf_i[a - WIDTH] +
            -1 * (int)buf_i[a + 1 - WIDTH] +
            -1 * (int)buf_i[a - 1] +
            12 * (int)buf_i[a] +
            -1 * (int)buf_i[a + 1] +
            -1 * (int)buf_i[a - 1 + WIDTH] +
            -1 * (int)buf_i[a + WIDTH] +
            -1 * (int)buf_i[a + 1 + WIDTH]
            + 128) / 4);

        if (result < 0) buf_o[a] = 0;
        else if (result > 255) buf_o[a] = (char)255;
        else buf_o[a] = result;
    }
}
The input file with image data (name.y format)
File: {byte_0, byte_1, ..., byte_n, ..., byte_(width*height-1)}
byte_0 is the top-left pixel; the last byte is the bottom-right pixel.
Example: two pixels directly above each other are byte_n and byte_(n+width).
Example images: bicycle.y, football.y
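On a host machine, a file in this format could be read roughly as follows. This is a sketch under the assumption that a .y file contains only the width*height luminance bytes and no header; the helper names `load_y_image` and `pixel_below` are hypothetical and not part of the course tools.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read width * height luminance bytes from a raw .y file
 * (assumed to have no header). Returns a malloc'ed buffer that the
 * caller must free, or NULL if the file is missing or too short. */
unsigned char *load_y_image(const char *path, size_t width, size_t height)
{
    size_t size = width * height;
    unsigned char *buf;
    FILE *f = fopen(path, "rb");

    if (f == NULL)
        return NULL;
    buf = malloc(size);
    if (buf != NULL && fread(buf, 1, size, f) != size) {
        free(buf);
        buf = NULL;
    }
    fclose(f);
    return buf;
}

/* The pixel directly below byte n is byte n + width. */
unsigned char pixel_below(const unsigned char *image, size_t width, size_t n)
{
    return image[n + width];
}
```

The image dimensions are not stored in the file under this assumption, so the caller has to supply them.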
Placing an image in the mMIPS memory
1. lcc -o mips_mem.bin image.c
2. imgproc.exe -i bicycle.y mips_mem.bin
[Figure: memory map of mips_mem.bin showing the images and the start of the image in memory]
Extracting an image from the mMIPS memory
1. imgproc.exe -e mips_ram.dump bicycle.y
2. ImProc.exe
Outline
- Video processing
- I/O operations on the mMIPS
- Extending the LCC compiler
- Assignment
Adding a custom operation to LCC and the mMIPS
void main(void)
{
    int a, b, result;
    char *buf_i = (char*)0x600000, *buf_o = (char*)0x665400;

    for (a = 1; a < HEIGHT - 1; a++) {
        for (b = 1; b < WIDTH - 1; b++) {
            result = ((
                 5 * (int)buf_i[(a - 1) * WIDTH + b - 1] +
                -3 * (int)buf_i[(a - 1) * WIDTH + b    ] +
                 6 * (int)buf_i[(a - 1) * WIDTH + b + 1] +
                -7 * (int)buf_i[ a      * WIDTH + b - 1] +
                11 * (int)buf_i[ a      * WIDTH + b    ] +
                -7 * (int)buf_i[ a      * WIDTH + b + 1] +
                 6 * (int)buf_i[(a + 1) * WIDTH + b - 1] +
                -3 * (int)buf_i[(a + 1) * WIDTH + b    ] +
                 5 * (int)buf_i[(a + 1) * WIDTH + b + 1] +
                128) / 13);

            /* Absolute value */
            if (result < 0) buf_o[a * WIDTH + b] = -result;
            else buf_o[a * WIDTH + b] = result;
        }
    }
}
Add a custom pattern to the C code
#define abs(a,b) ((a) - ((b) + *(int *) 0x12344321))

void main(void)
{
    int a, b, result;
    char *buf_i = (char*)0x600000, *buf_o = (char*)0x665400;

    for (a = 1; a < HEIGHT - 1; a++) {
        for (b = 1; b < WIDTH - 1; b++) {
            result = ((
                 5 * (int)buf_i[(a - 1) * WIDTH + b - 1] +
                -3 * (int)buf_i[(a - 1) * WIDTH + b    ] +
                 6 * (int)buf_i[(a - 1) * WIDTH + b + 1] +
                -7 * (int)buf_i[ a      * WIDTH + b - 1] +
                11 * (int)buf_i[ a      * WIDTH + b    ] +
                -7 * (int)buf_i[ a      * WIDTH + b + 1] +
                 6 * (int)buf_i[(a + 1) * WIDTH + b - 1] +
                -3 * (int)buf_i[(a + 1) * WIDTH + b    ] +
                 5 * (int)buf_i[(a + 1) * WIDTH + b + 1] +
                128) / 13);

            /* Absolute value */
            result = abs(result, result);
            buf_o[a * WIDTH + b] = result;
        }
    }
}
Adding a custom operation (concept)
LCC defines four constructs that it maps to custom operations:
((a) - ((b) + *(int *) 0x12344321))
((a) + ((b) + *(int *) 0x12344321))
((a) - ((b) - *(int *) 0x12344321))
((a) + ((b) - *(int *) 0x12344321))
More operations (possibly with more operands) can be added. Look at the website for more information.
Add the custom pattern to the LCC compiler
Custom operations are listed in the file lcc/src/minimips.md
opcode = 0
function code = 0x31
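To recognise such an instruction in a memory image, the two fields can be extracted from the 32-bit instruction word. The sketch below assumes the standard MIPS R-type layout (opcode in bits 31-26, function code in bits 5-0, register fields in bits 25-21, 20-16 and 15-11). The helper names are hypothetical, and which register fields LCC actually fills in for a custom operation is not stated on the slides.

```c
#include <stdint.h>

#define CUSTOM_OPCODE 0x00u  /* opcode field, bits 31-26 */
#define CUSTOM_FUNCT  0x31u  /* function code field, bits 5-0 */

/* Extract the opcode field (bits 31-26) of an instruction word. */
uint32_t instr_opcode(uint32_t word)
{
    return (word >> 26) & 0x3Fu;
}

/* Extract the function code field (bits 5-0) of an instruction word. */
uint32_t instr_funct(uint32_t word)
{
    return word & 0x3Fu;
}

/* True if the word has opcode 0 and function code 0x31. */
int is_custom_instruction(uint32_t word)
{
    return instr_opcode(word) == CUSTOM_OPCODE &&
           instr_funct(word) == CUSTOM_FUNCT;
}

/* Build an R-type word with opcode 0 and shift amount 0
 * (assumed layout: rs 25-21, rt 20-16, rd 15-11, funct 5-0). */
uint32_t encode_rtype(uint32_t rs, uint32_t rt, uint32_t rd, uint32_t funct)
{
    return ((rs & 0x1Fu) << 21) | ((rt & 0x1Fu) << 16) |
           ((rd & 0x1Fu) << 11) | (funct & 0x3Fu);
}
```

Both fields have to be checked: a load or store whose immediate happens to end in 0x31 has the same low six bits but a non-zero opcode.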
Check that clipping is used
Open a Cygwin shell and run the following commands:
1) lcc image.c -o mips_mem.bin
2) disas mips_mem.bin | less
Look for an assembler instruction with opcode 0 and function code 0x31.
Adding a special function to the mMIPS (hardware)
[Figure: block diagram of the pipelined mMIPS datapath. Components: pc, add1, add2, registers, decoder, decoder_nb, ctrl, hazard, hazard_ctrl, branch_ctrl, imm2word, signextend, signextend byte, shift left, shift left_jmp, alu, aluctrl, hilo, memdev, mux1 to mux8, and the constants c4 and c31. The pipeline registers carry the if_*, id_*, ex_* and mem_* signals (for example id_instr_5_0, id_ctrl_ex_aluop, ex_alu_result, mem_alu_result). External ports: clk, enable, rst, rom_dout, rom_wait, rom_addr, rom_r, ram_dout, ram_wait, ram_din, ram_addr, ram_r, ram_w, dev_*. Instruction fields: [31-26], [25-21], [20-16], [15-11], [10-6], [5-0], [15-0], [25-0]. Pipe_en goes to all registers on which no write signal is depicted.]
Outline
- Video processing
- I/O operations on the mMIPS
- Extending the LCC compiler
- Assignment
Assignment
Optimize the run-time of an image processing algorithm running on the mMIPS without reducing the instruction set supported by the hardware.
Constraints:
- All programs that run on the original mMIPS must also run on your design.
- Programs running on your design and on the original mMIPS must produce bit-exact output.
Allowed:
- Adding special instructions to the mMIPS.
- Changing the design of the mMIPS (e.g. forwarding).
Not allowed:
- Modifications of the image processing algorithm that are not needed to use special instructions (e.g. replacing multiplications with shifts).
Testing and implementing the design
Test for functional correctness:
- Run the original mMIPS with the algorithm to produce a reference output.
- Compare the results of your mMIPS to the reference output.
- You can use the check-image utility for this purpose.
- Make sure that your image is large enough to cover all possible cases.
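A bit-exact comparison means that every output byte must be equal. The sketch below shows one way such a check could look on two buffers that are already in memory. It is not the check-image utility, whose interface is not described here, and the name `first_mismatch` is hypothetical.

```c
#include <stddef.h>

/* Return the index of the first byte that differs between the reference
 * output and the output under test, or `size` if all bytes are equal. */
size_t first_mismatch(const unsigned char *reference,
                      const unsigned char *test, size_t size)
{
    size_t i;

    for (i = 0; i < size; i++) {
        if (reference[i] != test[i])
            return i;
    }
    return size;
}
```

Reporting the index of the first difference, rather than only pass or fail, makes it easier to map a mismatch back to a pixel position (x = index % width, y = index / width).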
Synthesize your design:
- You must synthesize your design to determine the maximum clock frequency at which your mMIPS can run.
- This determines part of the speed-up.
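The clock frequency matters because run time is the number of clock cycles divided by the clock frequency, so a design that needs fewer cycles but clocks slower may not be faster overall. A minimal sketch of that arithmetic, with hypothetical function names:

```c
/* Run time in seconds of a program that needs `cycles` clock cycles
 * on a processor clocked at `freq_hz`. */
double run_time(double cycles, double freq_hz)
{
    return cycles / freq_hz;
}

/* Speed-up of the modified design over the original design;
 * a value above 1 means that the modified design is faster. */
double speed_up(double cycles_orig, double freq_orig_hz,
                double cycles_new, double freq_new_hz)
{
    return run_time(cycles_orig, freq_orig_hz) /
           run_time(cycles_new, freq_new_hz);
}
```

With made-up numbers: halving the cycle count at an unchanged clock frequency gives a speed-up of 2, but if the clock frequency also halves, the speed-up is back to 1.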
Submitting your design
Use the submit-design utility to submit your mMIPS design.
You can submit a new design as often as you like, but only your last design will be tested.
Important dates
Midterm meeting on March 16th in Pav b1 from 10.45 till 12.30.
Submit the first version of your modified mMIPS on March 18th before noon. This version must at least contain:
- a working forwarding unit,
- a working custom clipping instruction.
A separate document (A4, 10pt font, max 2 pages) with a description of all changes made, planned, or investigated:
- Provide a short description of the required changes.
- Provide a short motivation for each change.
- Explain the expected performance gain.
Send the document to [email protected]
Instructions at www.es.ele.tue.nl/education/5JJ55-65/mmips
If you submit a design and document, then you will get feedback on your ideas!
Important dates
Submit the final version of your modified mMIPS on April 1st before noon.
- Instructions at www.es.ele.tue.nl/education/5JJ55-65/mmips
- This is a hard deadline, no extension is possible!
Individual presentation of your mMIPS on April 4-5th.
- The presentation is about the changes you made to the mMIPS.
- Do not talk about the book, forwarding or clipping.
Support and information
Every Monday from 15.45 till 17.30 in PT 9.10
Also check www.es.ele.tue.nl/education/5JJ55-65/mmips for more information, hints, etc.