Ôn tập kiến trúc máy tính
DESCRIPTION
Chương 4,5TRANSCRIPT
-
TRNG I HC BCH KHOA
KHOA KHOA HC & K THUT MY TNH
n tp Mn: Kin Trc My Tnh - 504002
TP. HCM 4/2015
-
Ni dung
I. Kiu lnh ........................................................................................................................... 3
II. Single clock processor ...................................................................................................... 3
II.1 Single clock processor ................................................................................................. 3
II.2 Kin trc single cycle ................................................................................................... 3
II.3 Bi tp .......................................................................................................................... 7
II.4 p n/Gi ................................................................................................................ 7
III. Pipeline processor ............................................................................................................. 9
III.1 Pipeline processor ........................................................................................................ 9
III.1.1 Hazard ................................................................................................................ 9
III.2 Bi tp: ....................................................................................................................... 11
III.3 p n/gi ............................................................................................................... 13
IV. Memory ........................................................................................................................... 14
IV.1 Memory ...................................................................................................................... 14
IV.2 Cache .......................................................................................................................... 15
IV.2.1 Block placement .............................................................................................. 16
IV.2.2 Block identification ......................................................................................... 17
IV.2.3 Block replacement ........................................................................................... 17
IV.2.4 Write strategy .................................................................................................. 17
IV.2.5 Miss/hit ............................................................................................................ 18
IV.3 Bi tp: ....................................................................................................................... 19
IV.4 p n/Gi .............................................................................................................. 20
-
Thc hnh kin trc my tnh n tp ktmt CE 2015
3
Cc yu t nh hng n hiu sut ca h thng:
- di ca chng trnh (instruction count)
- S chu k trn 1 lnh (CPI)
- Thi gian ca 1 chu k (clock cycle time).
I. Kiu lnh
Cc kiu format ca tp lnh trong MIPS:
Loai R
Op (6 bit) Rs (5bit) Rt (5bit) Rd (5 bit) Sa (5 bit) Funct (6 bit)
Loai I
Op (6 bit) Rs (5bit) Rt (5bit) Immediate (16 bit)
Loai J
Op (6 bit) Immediate (26 bit)
Trong o:
- Op: opcode ca lnh.
- Rs, Rt, Rd: Thanh ghi ngun, target, ich.
- Sa (shift amount): s bit dich trong cac lnh shift.
- Immediate: i din cho s.
- Funct: 6 bit function.
II. Single clock processor
II.1 Single clock processor
- u im: mt clock mt chu k.
- Nhc im: mt chu k tn nhiu thi gian, mi lnh d nhanh hay chm u thc thi
trong mt chu k.
II.2 Kin trc single cycle
Tham kho Hnh II-1
- Thanh PC: tr n lnh ang thc thi.
- Instruction memory: cha code thc thi, khi ny ch cho php c.
Registers file: cha 32 thanh ghi, do cn 5 bit xc nh thanh ghi no (25=32).
xc nh chi tit thanh ghi, ta tham kho bng thanh ghi di.
-
Thc hnh kin trc my tnh n tp ktmt CE 2015
4
- B m rng du: mc ch c bn l m rng du t con s 16bit 32bits.
- B chn (MUX): dng chn input a vo trong trng hp c nhiu input v 1
output, hoc chn ng ra trong trng hp c 1 ng vo v nhiu ng ra. Tn hiu select
quyt nh s la chn
- ALU: thc hin tnh ton.
- Data memory: l vng nh cha d liu trong phn data. Ch c lnh LOAD v
STORE mi c th truy xut vo khi ny.
Hnh II-1: Kin trc single cycle processor
-
Thc hnh kin trc my tnh n tp ktmt CE 2015
5
Bng II-1 Danh sch 32 thanh ghi
Bng II-2 Gi tr ca ALUop
Input output 4bit encoding
Op[6bit] funct[6bit] ALUCtrl
R-type Add ADD 0
R-type Sub SUB 10
R-type And AND 100
R-type Or OR 101
R-type Xor XOR 110
R-type Slt SLT 1010
Addi X ADD 0
Slti X SLT 1010
Andi X AND 100
Ori X OR 101
Xori X XOR 110
Lw X ADD 0
Sw X ADD 0
Beq X SUB 10
Bne X SUB 10
J X X X
Bng II-3 ngha ca cc tn hiu iu khin.
(Ta gi s tn hiu tch cc l bng 1, tn hiu khng tch cc l 0)
Tn hiu ngha Gi tr = tch cc Gi tr = khng tch cc
-
Thc hnh kin trc my tnh n tp ktmt CE 2015
6
RegDest Chn thanh ghi kt qu Chn Rd ghi kt qu Chn Rt ghi kt qu
RegWrite Cho php ghi kt qu ngc vo thanh ghi
Cho php Khng cho php
ExtOp Dng cho phn m rng du khi dng s
M rng du Khng quan tm
ALUSrc Chn thanh ghi hoc s a vo ALU
Thanh ghi vi s Thanh ghi vi thanh ghi
Memwrite Cho php ghi vo vng
data memory Cho php ghi Khng cho php ghi
MemRead Cho php c t vng data memory
Cho php c Khng cho php c
MemtoReg Dng chn ng t data memory n thanh ghi
Chn ng t data memory n thanh ghi (lnh load)
Chn ng t kt qu ca ALU n thanh ghi
Beq, Bne Dng cho cc lnh nhy c iu kin
Nu iu kin nhy tha mn, PC s di 1 on n nhn
Khi iu kin khng tha, khi PC = PC + 4 (thc thi lnh tip theo)
J Dng cho lnh nhy khng c iu kin
Nhy n nhn cn PC ch n lnh tip theo
PCSrc Chn ngun cho PC Khi lnh nhy xy ra Khi lnh nhy ko xy ra
Ch :
- Khng quan tm n RegDes,Memread, MemtoReg khi tn hiu RegWrite = 0
- Khi ALUSrc = 0 th ta khng quan tm n Extop
Tham kho thm b tn hiu ca lnh c th slide Main Control Signal Values silde 45.
- Bng lit k ng i c tr lu nht ca cc lnh (b qua tr ca b m rng du,
MUX, ADDER, dy, PC)
ALU Instruction
Fetch Decode ALU Reg
Load Instruction
Fetch Decode ALU
Memory
Read Reg
Store Instruction
Fetch Decode ALU
Memory
Write
Branch Instruction
Fetch Decode ALU
Jump Instruction
Fetch Decode
-
Thc hnh kin trc my tnh n tp ktmt CE 2015
7
II.3 Bi tp
Dng li kin trc c miu t Hnh II-1 tr li cc cu hi bn di:
Cho bng delay ca cc khi nh sau:
I-MEM ADDER MUX ALU REG D-MEM Control
200ps 100ps 30ps 180ps 150ps 200ps 100ps
a) Xc nh ng i c tr lu nht ca lnh AND, LOAD, v tnh tr ?
b) Xc nh cc tn hiu ca khi control unit (main Unit) khi thc thi lnh BEQ $1, $2,
ABC. Vi $1 = 0x00FF, $2 = 0x00FE
c) Thnh phn phn cng no khng s dng khi ta thc thi lnh SLTI, lnh J
d) B qua delay ca cc khi adder, mux, control. Xc nh data path v thi gian ca
cc kiu lnh
o ALU
o LOAD
o STORE
o BRANCH
o JUMP
e) Xc nh thi gian ca single cycle v multi-cycle
f) Gi s c 1 chng trnh gm 40% ALU, 20% Loads, 10% stores, 20% branches, &
10% jumps. Tnh CPI trong trng hp single cycle , multi cylce
g) Tnh speed up
II.4 p n/Gi
a)
- lnh ADD : I-MEM (200) REGs(150) MUX(30) ALU(180) MUX(30)
REGs(150) = 740
- lnh LOAD : I-MEM (200) REGs(150) MUX(30) ALU(180) D-MEM(200)
MUX(30) REGs(150) = 940
b)
- Xt lnh BEQ $1, $2, ABC. Vi $1 = 0x00FF, $2 = 0x00FE, lnh ny c ngha: nu
thanh ghi $1 m bng thanh ghi $2 th n s nhy n nhn ABC
- Ta thy ni dung thanh ghi $1 v $2 khc nhau nn lnh BEQ khng thc hin nhy n
nhn ABC. T , ta a ra tn hiu cho lnh nh sau:
Tn hiu Gi tr Gii thch
RegDest X Khng quan tm
RegWrite 0 Khng ghi kt qu vo thanh ghi
ExtOp X Khng quan tm
ALUSrc 0 Thanh ghi vi thanh ghi
-
Thc hnh kin trc my tnh n tp ktmt CE 2015
8
Memwrite 0 Khng truy xut vo vng data
MemRead X Khng quan tm
MemtoReg X Khng quan tm
Beq 1 Lnh BEQ
Bne 0 Khng phi lnh BNE
J 0 Lnh branch khng phi lnh jump
PCSrc 0 iu kin nhy khng xy ra
ALUop 10 Tham kho bng xxx trn
c)
- Lnh SLTI dng set gi tr thanh ghi ch ln 1 nu thanh ghi em so snh nh hn 1 s
cho trc, ngc li n s reset gi tr thanh ghi ch xung 0 nu thanh ghi em so snh
ln hn 1 s cho trc.
- V d SLTI $1, $2, 100 th thanh ghi $1 = 1 khi $2 < 100, ngc li $1 = 0 khi $2 >= 100
t ta xt ng i ca lnh nh sau:
- Qua I-MEM (lnh no cng qua I-MEM) control unit, reg files, mux, signed extens,
khng dng b cng cho PC ALU khng qua D-MEM MUX , Reg files
d)
Instruction
class
Instruction
memory
Register
read
ALU
Operation
Data
memory
Register
write Total
ALU 200 150 180 150 680ps
Load 200 150 180 200 150 880ps
Store 200 150 180 200 730ps
Branch 200 150 180 530ps
Jump 200 150 350ps
e)
- Tnh thi gian ca single cycle = max ca tt c cc lnh, c th trong trng hp ny lnh
LOAD c gi tr ln nht (thi gian thc thi lu nht) single cycle = 880ps.
- Tnh thi gian ca multi cycle = max ca 5 bc (IF- INSTRUCTION MEMORY, ID
REG FILES, EXE- ALU, MEM DATA MEMORY, WRITE BACK REG FILES)
trong 1 lnh, c th trong trng hp ny bc Instruction memory tn nhiu thi gian
nht multi cycle = 200ps.
f)
- CPI l s chu k trn lnh.
- CPI ca single cycle = 1.
- CPI ca multi cycle = 0.44 + 0.25 + 0.14+ 0.23 + 0.12 = 3.8
g)
- Thi gian chy ca 1 chng trnh = CPI * thi gian ca mt cycle.
- Speed up = thi gian chy ca single cycle / thi gian chy ca multi cycle = (1 * 880) / (
3.8 * 200) = 880/760 = 1.16
-
Thc hnh kin trc my tnh n tp ktmt CE 2015
9
III. Pipeline processor
III.1 Pipeline processor
a) Pipeline chia lnh thc thi ra thnh 5 bc, mi bc thc thi trong trong mt chu k
- IF: ly lnh t I-MEM.
- ID: gii m lnh: bit c l lnh no, kiu lnh g, xc nh thanh ghi cn dng, a
ch cn nhy, s cn tnh ton
- EX: thc thi lnh hoc tnh ton a ch cho lnh load/store.
- MEM: truy xut data i vi lnh load/store.
- WB: ghi ngc kt qu li thanh ghi.
b) Hiu sut ca pipe line vi single cycle.
- Ta chia lnh ra thnh k bc.
- Thi gian thc thi n lnh ca single cycle = n * single cycle.
- Thi gian thc thi n lnh ca pipe line = (k + n -1) * pipeline clock cycle .
o Lnh u tin my k chu k, n -1 lnh cn li mi lnh 1 chu k.
- Ta gi s single cycle = k* pipeline clock cycle
- Khi speedup = (n* k* pipeline clock cycle)/ ((k + n -1) * pipeline clock cycle )
- Khi n th speedup k (tc l pipeline nhanh ti a gp k ln single cycle)
Ch :
Pipeline khng rt ngn thi gian thc thi ca mt lnh, m ch tng hiu sut ln bng cch
tng thng nng ca my. Khi cc bc ca mt lnh c thi gian thc thi khc nhau th s lm
gim speed up.
- Thi gian fill v drain cng ng thi lm gim speed up
- hin thc pipeline ngi ta dng thanh ghi lu kt qu li mi bc:
o Tn hiu t khi control unit (main control)
o Tt c tn hiu iu khin c sinh ra bc ID
o Mi bc dng 1 s tn hiu iu khin
o RegDst c dng trong bc ID
o ExtOp, ALUSrc, ALUCtrl ,J, Beq, Bne, zero c dng trong bc EXE
o MemRead, MemWrite, MemtoReg dng trong bc MEM
o RegWrite dng trong bc WB
III.1.1 Hazard
Khi hin thc pipeline s gy ra cc loi hazard: structural hazard, data hazard, control hazard.
III.1.1.1 Structural hazard
Xy ra khi c s tranh chp ti nguyn phn cng, 2 lnh cng dng chung phn cng trong
cng chu k.
-
Thc hnh kin trc my tnh n tp ktmt CE 2015
10
III.1.1.2 Data hazards
Xy ra khi c s ph thuc d liu.
III.1.1.2.1 S ph thuc d liu:
Read After Write RAW Hazard
I. add $s1, $s2, $s3 #thanh ghi $s1 c ghi
J. sub $s4, $s1, $s3 #thanh ghi $s1 c c
Khi , data hazard xut hin khi lnh J c $s1 m lnh I li cha tnh xong kt qu ca $1
Write After Read: Name Dependence
I: sub $t4, $t1, $t3 # $t1 c c trc
J: add $t1, $t2, $t3 # $t1 c ghi sau
R rang, ta thy khng c s ph thuc d liu y, ch c ph thuc tn bin. loi
b s ph thuc v tn bin, ta i tn thanh ghi.
I: sub $t4, $t1, $t3
J: add $t5, $t2, $t3
Write After write: Name Dependence
I: sub $t1, $t4, $t3 # $t1 c ghi
J: add $t1, $t2, $t3 # $t1 c ghi li ln na
R rng ta thy khng c s ph thuc d liu y, ch c ph thuc tn bin. Kt qu
ch ph thuc vo lnh J sau. loi b s ph thuc v tn bin, ta i tn thanh ghi.
I: sub $t1, $t4, $t3
J: add $t5, $t2, $t3
Read After Read: khng gy ra s ph thuc.
III.1.1.2.2 Phng php gii quyt data hazard
a) Chn stall vo m bo lnh trc tr kt qu v m lnh sau c th c c kt
qu trong chu k k tip. Phng php ny khng tn ti nguyn phn cng (gim
gi thnh), ch to ra delay cho chng trnh (gim hiu sut).
b) Dng k thut forward. Phng php ny cn thm ti nguyn phn cng (gi thnh
tng) hin thc + chn stall khi cn thit.
- Khi xy ra hazard i vi lnh load, cho d ta c dng k thut forward th cng phi tn
1 stall gii quyt chng.
- hin thc forward, ngi ta thm b mux cho vic la chn input cho ALU.
- Cc lnh khc (khc lnh load) th kt qu c cho ra bc ALU (EXE) nn khi ta
dng k thut forward s khng cn stall na.
V d: Gi s c s ph thuc gia cc lnh nh sau:
1
-
Thc hnh kin trc my tnh n tp ktmt CE 2015
11
2
3
4
- Lnh 2 ph thuc 1,
- Lnh 3 ph thuc 1,
- Lnh 4 ph thuc 1.
Khi dng k thut forward cho lnh
- lnh 2 th to forward t EXE(1) EXE(2)
- lnh 3 th to forward t MEM(1) EXE(3)
- lnh 4 th to forward t WB(1) EXE(4)
Ch : cc forwarding u forward v v tr EXE.
c) Sp xp li code
Vic sp xp phi m bo th t trc sau khi c s ph thuc v m bo tnh ng n
ca chng trnh.
III.1.1.2.3 Control hazard
Dng tin on tng hiu sut ca chng trnh.
- Tin on 1 bit
- Tin on 2 bit
III.2 Bi tp:
1) Cho s v cc thng s ca b x l single clock nh hnh bn di.
-
Thc hnh kin trc my tnh n tp ktmt CE 2015
12
Thi gian delay ca mi khi cho nh bng bn di.
I-MEM ALU REG D-MEM
200ps 150ps 200ps 200ps
a) B qua tr ca khi ADD, MUX, Control. Tnh single cycle, pipeline clock?
b) Tnh thi gian thc thi ca chng trnh gm 150 line code i vi single cycle v
pipeline. T tnh speed up so snh single cycle v pipeline (khng c stall)
c) Gi s chng trnh khng c stall v thng k c l c ALU 50%, Beq 25%, lw
15%, sw10%. Tnh speed up gia multi cycle v pipeline.
2) Cho on code sau:
addi $1, $zero, 100
addi $2, $zero, 100
add $3, $1, $2
lw $4, L_4
lw $5, L_5
and $6, $4, $5
sw $6, L_KQ
a) Xc nh s ph thuc gia cc lnh v thanh ghi no gy ra s ph thuc .
b) Chn stall gii quyt hazard trn, cn bao nhiu stall?
c) Sp xp li th t cc lnh sao cho khi chy on code th t stall nht m tnh logic
ca chng trnh vn khng i.
d) Dng k thut forward gii quyt hazard th khi chy s c bao nhiu stall.
-
Thc hnh kin trc my tnh n tp ktmt CE 2015
13
III.3 p n/gi
1a)
- Single cycle = thi gian thc thi lnh di nht (lnh load) = I-Mem Regs ALU
D-Mem Regs = 200 + 200 + 150 + 200 + 200 = 950ps
- Pipeline clock = max (I-Mem, Regs, ALU, D-Mem, Regs) = 200
1b)
- Thi gian thc thi 150 ca single cycle = 150 * 950 = 142500 ps
- Thi gian thc thi 150 ca pipeline = (5 + 150 - 1)* 200 =30800 ps
- Speed up = 142500/30800 = 4.62
1c)
- CPI ca multi cycle = (50%* 4 + 25%*3 + 15%*5 + 10%*4) = 3.9
- CPI ca pipeline khi khng c stall l = 1
- Thi gian thc thi = CPI * s lnh * thi gian 1 chu k
- Speed up = thi gian mutli cycle /thi gian pipeline = (3.9 * s lnh * 200)/( 1 *s lnh *
200)
2a)
(1) addi $1, $zero, 100 (2) addi $2, $zero, 100 (3) add $3, $1, $2 (4) lw $4, L_4 (5) lw $5, L_5 (6) and $6, $4, $5 (7) sw $6, L_KQ
Lnh no m cc ton hng c t mu m th hin s ph thuc qua thanh ghi .
2b)
- 9 stall, 3 stall gia (2) v (3), 3 stall gia (5) v (6), 3 stall gia (6) v (7)
- Lc chn stall vo m bo l nhng ch cn c gi tr thanh ghi (ID) phi sau chu k ghi
kt qu ca thanh ghi
2c)
Mt trong nhng cch sp xp lm gim stall
lw $4, L_4
lw $5, L_5
addi $1, $zero, 100
addi $2, $zero, 100
add $6, $4, $5
add $3, $1, $2
sw $6, L_KQ
-
Thc hnh kin trc my tnh n tp ktmt CE 2015
14
Cn li 3 stall
2d)
- 1 stall. Sinh vin v hnh hiu r hn
Hnh nh so snh single cycle, multi cycle, v pipe line
Single cycle
Load Add Jump store
5 5 5 5
Branch
5
Multi cycle
Load add Jump store branch
IF ID EXE MEM WB IF ID EXE WB IF ID IF ID EXE MEM IF ID EXE
Pipeline
IF ID EXE MEM WB
IF ID EXE MEM WB
IF ID EXE MEM WB
IF ID EXE MEM WB
IF ID EXE MEM WB
IV. Memory
IV.1 Memory
Gm cc bus c bn sau:
- Address:n bit dng xc nh a ch trong ram, khng gian a ch 2n
- Data: m bits dng xut/ nhp d liu, rng ca ram l mbits
- OE: output enable, khi tn hiu ny tch cc tng ng vi vic c d liu t RAM
- WE: write enable, khi tn hiu ny tch cc tng ng vi vic ghi d liu vo RAM
Cc im khc nhau c bn gia SRAM v DRAM
-
Thc hnh kin trc my tnh n tp ktmt CE 2015
15
SRAM DRAM
- Cu to 6 transistor - t, nhanh - Tnh, khng cn refesh gi tr
- nh a ch theo hng - .
- 1 transistor + 1 t - R, chm hn - ng, v t in r r in theo thi gian nn cn
phi refesh gi tr theo chu k - nh a ch theo ma trn - .
Hnh IV-1 SRAM
Hnh IV-2 DRAM
IV.2 Cache
Do tc pht trin ca ALU qu nhanh so vi Memory nn to ra khong cch kh xa gia
ALU v MEM, do cn c cache lm b m gia ALU v MEM. Cache thng c lm
bng SRAM. Mc ch lm gim thi gian truy xut memory.
Tc v thi gian truy xut ca memory c xp theo th t sau (ch mang tnh cht tham
kho)
- Registers (size < 1 KB), Access time < 0.5 ns
- Level 1 Cache (size 8 64 KB), Access time: 1 ns
- L2 Cache (512KB 8MB), Access time: 3 10 ns
- Main Memory (4 16 GB), Access time: 50 100 ns
- Disk Storage (> 200 GB), Access time: 5 10 ms
-
Thc hnh kin trc my tnh n tp ktmt CE 2015
16
Temporal Locality (thi gian) mt bin, thc th c truy xut th c th n s c truy
xut ln na. Thng xut hin trong nhng vng lp, hay gi hm/th tc nhiu ln. i vi
truy xut theo thi gian th xu hng thng gi block trong cache nhm truy xut ln sau
Spatial Locality (khng gian) lnh/data trong vng nh khi c truy xut c th cc
lnh/data gn n s c truy xut. Thng xut hin trong khai bo mng, thc thi tun t
i vi truy xut theo khng gian th xu hng thng chun b trc block k tip.
IV.2.1 Block placement
Phng php t block vo cache.
IV.2.1.1 Direct mapped
Mi block c xc nh mt v tr t duy nht. Cho n l s block trong cache. Block th m
trong b nh (RAM) s c t vo v tr m%n trong cache
IV.2.1.2 Full associative
Mi block c c vo v tr no cn trng trong cache.
IV.2.1.3 Set associative
Mi block c xc nh mt set duy nht. Trong set c k s la chn, t block vo 1
trong k ch trng . Cho n l s set trong cache, Block th m trong b nh (RAM) s c t
vo v tr m%n trong cache.
V d minh ha:
Trong K- way Set associative th k block s gp thnh 1 set, v d trn k = 2.
-
Thc hnh kin trc my tnh n tp ktmt CE 2015
17
IV.2.2 Block identification
xc nh a ch ngi ta chia a ch ra lm 3 phn (Tag, Index, block offset)
31 0
Tag Index Block offset
IV.2.2.1 Block offset
Xc nh thnh phn no trong block c truy xut. xc nh bock offset c bao nhiu
bit, ta i xc nh trong block c bao nhiu phn t.
Xc nh s phn t bng cch ly (size of block)/ (size of n v truy xut)
IV.2.2.2 Index
Dng xc nh s set trong b nh m
- Direct mapped: 1 block l 1 set.
- K-way (k = 2m) Set Associative: k block gom thnh 1 set.
- Full Associative: ton b block thnh 1 set. Index lc ny l 0 bit.
Xc nh s block bng cch ly (size of cache)/ (size of block)
IV.2.2.3 Tag
xc nh block no ang nm trong cache.
Tag bit = 32 index bits block offset bits (trong kin trc 32 bits)
IV.2.3 Block replacement
Khi mt block vo m khng cn ch trng t vo th cn phi thay block c bng block
mi.
- Trong trng hp direct mapped, v mi block ch c 1 ch t nn ta khng nhc n
y
- FIFO: ci no c t vo trc th s c ly ra trc.
- Ramdom
- LRU: ci no t dng nht th c thay th trc.
IV.2.4 Write strategy
Chin lc ghi ngc li cache, memory
-
Thc hnh kin trc my tnh n tp ktmt CE 2015
18
IV.2.4.1 Write Back
Ch updata cache, khi c yu cu hay cn thay th th mi update gi tr sau cng xung
memory. Cn bit valid xc nh block c valid hay khng v bit modified xc nh
block c update cha.
Kh hin thc, t tn lu lng bng thng ca h thng.
IV.2.4.2 Write Through
Updata c cache v memory, cn bit valid xc nh block c valid hay khng.
n gin, d hin thc, tn lu lng bng thng ca h thng v phi update nhiu.
IV.2.5 Miss/hit
- Miss: cn truy xut m tm khng thy trong cache. Do phi a block cha ci ta
mun truy xut vo cache, sau truy xut n.
- Hit: cn truy xut v tm thy ci mun truy xut trong cache.
- Miss Penaty: s chu k x l cache miss.
- Hit rate = hit / (hit + miss)
- Miss rate = miss / (hit + miss) = 1 hit rate
- I-Cache Miss Rate = Miss rate trong lc truy xut I-MEM
- D-Cache Miss Rate = Miss rate trong lc truy xut D-MEM
V d:
Chng trnh c 1000 lnh trong c 25% l load/store. Bit lc c I-MEM b miss 150,
D-MEM b miss 50. Tm I-Cache Miss Rate, D-Cache Miss Rate.
- I-Cache Miss Rate = s ln miss / s ln truy xut I-MEM = 150/1000 = 15%
- D-Cache Miss Rate = s ln miss / s ln truy xut D-MEM = 50/(1000*25%) =50/250 =
20%
Khi cache miss th s gy ra stall. xc nh bao nhiu stall, ta i tm cc thng s sau:
- Memory stall cycles = Combined Misses * Miss Penalty
- Miss Penalty: s chu k gii quyt vic miss.
- Combined Misses = I-Cache Misses + D-Cache Misses
- I-Cache Misses = I-Count I-Cache Miss Rate
- D-Cache Misses = LS-Count D-Cache Miss Rate
- LS-Count (Load & Store) = I-Count LS Frequency
- Memory Stall Cycles Per Instruction = Combined Misses Per Instruction Miss Penalty
- Combined Misses Per Instruction = I-Cache Miss Rate + LS Frequency D-Cache Miss
Rate
Memory Stall Cycles Per Instruction = I-Cache Miss Rate Miss Penalty + LS Frequency
D-Cache Miss Rate Miss Penalty
-
Thc hnh kin trc my tnh n tp ktmt CE 2015
19
V d: Instruction count (I-Count) = 106 lnh, 30% lnh loads/stores, D-cache miss rate l 5%
v I-cache miss rate l 1%, cho Miss penalty l 100 chu k, tnh combined misses per instruction
and memory stall cycles
1% + 30% * 5% = 0.025 combined misses mi lnh tng ng 25 misses trn 1000 lnh.
Memory stall cycles = 0.025 * 100 (miss penalty) = 2.5 stall cycles per instruction
Total memory stall cycles = 106 * 2.5 = 2,500,000
CPIMemoryStalls = CPIPerfectCache + Mem Stalls per Instruction
V d: cho CPI = 1.5 khi khng c stall, Cache miss rate l 2% i vi instruction v 5% i
vi data. Lnh loads v stores chim 20%. Cho trc miss penalty l 100 chu k i vi I-cache
v D-cache. Tnh CPI ca h thng?
- Mem stalls cho mi lnh = 0.02*100 + 20%*0.05*100 =3
- CPI Memory Stalls = 1.5 + 3 = 4.5 cycles
Average Memory Access Time (AMAT) thi gian truy xut b nh trung bnh
- AMAT = Hit time + Miss rate * Miss penalty
Do gim thi gian truy xut th
- Ta gim Hit time: bng cch dng b nh cache nh, n gin.
- Gim Miss Rate: bng cch dng b nh cache ln, block size ln v k-way set
associativity vi k ln
- Gim Miss Penalty bng cch dng cache nhiu mc.
V d: tm AMAT khi bit Cache access time (Hit time) of 1 cycle = 2 ns, Miss penalty = 20
clock cycles, miss rate of 0.05 per access
- AMAT = 1 + 0.05 20 = 2 cycles = 4 ns
IV.3 Bi tp:
1) Cho b nh cache c dung lng 256KB, block size l 4word, mi ln truy xut 1 byte.
Xc nh s bit ca cc trng tag, index, block offset trong cc trng hp.
- Direct mapped
- Full associcative
- 2 way set associcative
2) Cho b nh chnh c dung lng 1G, b nh cache c dung lng 1MB, block size l
256B, mi ln truy xut 1 word. Xc nh s bit ca cc trng tag, index, block offset
trong cc trng hp.
- Direct mapped
- Full associcative
- 4 way set associcative
-
Thc hnh kin trc my tnh n tp ktmt CE 2015
20
3) Trong cache c 8 block, mi block l 4words. Xc nh s ln miss/ hit khi h thng truy
xut vo cc a ch theo th t sau:
0x0001002A
0x00010020
0x0002006A
0x00020066
0x00020022
0x0001002B
Trong cc trng hp
- Direct mapped
- Full associcative
- 2 way set associcative
IV.4 p n/Gi
1)
- S phn t trong 1 block = (size of block)/(size of phn t truy xut) = 4 word /1 byte =
4*4 bytes/ 1 byte = 16
- S khi block trong cache = size of cache / size of block = 256KB / 4 word = 28*210/4*4 =
214 blocks
- Direct mapped: block offset 4 bits, index = 14 bits, tag = 32 4 -14 = 14 bits
- Full associcative: block offset 4 bits, index = 0 bits, tag = 32 4 = 28 bits.
- 2 Ways set associative: 2 block to thnh 1 set m c 214 block nn c 213 sets, block
offset 4 bits, index = 13 bits, tag = 32 4 -13 = 15 bits
2)
- S phn t trong 1 block = (size of block)/ (size of phn t truy xut) = 256B / 4 bytes =
28 bytes/ 22 byte = 26
- S khi block trong cache = size of cache / size of block = 1MB / 256B = 210*210/28 = 212
blocks.
- Khng gian a ch l 1G, do ta dng thanh ghi 30 bit l c th truy xut c 1G ram.
- Direct mapped: block offset 6 bits word, index = 12 bits, tag = 30 6 12 2 bit word =
10 bits.
- Full associcative: block offset 6 bits word, index = 0 bits, tag = 30 6 2 = 22 bits.
- 4 ways set associative: 4 block to thnh 1 set m c 212 block nn c 210 sets, Block
offset 6 bits word, index = 10 bits, tag = 30 6 -10 -2 = 12 bits.
3)
- Da vo , ta tnh c c 4 bits block offset, 3 bit index. Do ta phn tch a ch nh
bn di:
Direct map
-
Thc hnh kin trc my tnh n tp ktmt CE 2015
21
Address Tag Index Block Miss
Gii thch offset /hit
0x0001002A 0000 0000 0000 0001 0000 0000 0 010 1010 M First access
0x00010020 0000 0000 0000 0001 0000 0000 0 010 0000 H
0x0002006A 0000 0000 0000 0010 0000 0000 0 110 1010 M First access
0x00020066 0000 0000 0000 0010 0000 0000 0 110 0110 H
0x00020022 0000 0000 0000 0010 0000 0000 0 010 0010 M Khc tag
0x0001002B 0000 0000 0000 0001 0000 0000 0 010 1011 M Khc tag
Full associative
Address Tag Block Miss
Gii thch offset /hit
0x0001002A 0000 0000 0000 0001 0000 0000 0010 1010 M First access
0x00010020 0000 0000 0000 0001 0000 0000 0010 0000 H
0x0002006A 0000 0000 0000 0010 0000 0000 0110 1010 M First access
0x00020066 0000 0000 0000 0010 0000 0000 0110 0110 H
0x00020022 0000 0000 0000 0010 0000 0000 0010 0010 M First access
0x0001002B 0000 0000 0000 0001 0000 0000 0010 1011 H
2 ways set associative, cn 2 bit index
Address Tag Index Block Miss
Gii thch offset /hit
0x0001002A 0000 0000 0000 0001 0000 0000 00 10 1010 M First access
0x00010020 0000 0000 0000 0001 0000 0000 00 10 0000 H
0x0002006A 0000 0000 0000 0010 0000 0000 01 10 1010 M First access
0x00020066 0000 0000 0000 0010 0000 0000 01 10 0110 H
0x00020022 0000 0000 0000 0010 0000 0000 00 10 0010 M Khc tag
0x0001002B 0000 0000 0000 0001 0000 0000 00 10 1011 M Khc tag
I. Kiu lnhII. Single clock processorII.1 Single clock processorII.2 Kin trc single cycleII.3 Bi tpII.4 p n/Gi
III. Pipeline processorIII.1 Pipeline processorIII.1.1 HazardIII.1.1.1 Structural hazardIII.1.1.2 Data hazardsIII.1.1.2.1 S ph thuc d liu:Read After Write RAW HazardIII.1.1.2.2 Phng php gii quyt data hazardIII.1.1.2.3 Control hazard
III.2 Bi tp:III.3 p n/gi
IV. MemoryIV.1 MemoryIV.2 CacheIV.2.1 Block placementIV.2.1.1 Direct mappedIV.2.1.2 Full associativeIV.2.1.3 Set associative
IV.2.2 Block identificationIV.2.2.1 Block offsetIV.2.2.2 IndexIV.2.2.3 Tag
IV.2.3 Block replacementIV.2.4 Write strategyIV.2.4.1 Write BackIV.2.4.2 Write Through
IV.2.5 Miss/hit
IV.3 Bi tp:IV.4 p n/Gi