Computer Organization
TRANSCRIPT
7/28/2019 Computer Organization
Computer Organization: Module IV
Memory
Memory: memory hierarchy, the principle of inclusion, memory interleaving techniques. Disk memory: data organisation on disk, disk performance, disk caching. Main memory: SRAM, DRAM, ROM. Associative memory, scratchpad memory. Cache memory: levels of cache; mapping techniques (associative, direct, and set-associative); main memory update policies.
Basic Concepts
The maximum size of the memory that can be used in any computer is determined by the addressing scheme.
For example, a 16-bit computer that generates 16-bit addresses is capable of addressing up to 2^16 = 64K memory locations.
Similarly, 32-bit addresses can access a memory that contains up to 2^32 = 4G locations,
whereas machines with 40-bit addresses can access up to 2^40 = 1T (tera) locations.
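These figures follow directly from the address width; a small Python sketch (using the binary convention 1K = 2^10) confirms them:

```python
def addressable_locations(address_bits):
    """Number of distinct locations a machine with the
    given address width can reference."""
    return 2 ** address_bits

K = 2 ** 10   # 1K = 1024
G = 2 ** 30   # 1G
T = 2 ** 40   # 1T

print(addressable_locations(16) // K)   # 64  -> 64K locations
print(addressable_locations(32) // G)   # 4   -> 4G locations
print(addressable_locations(40) // T)   # 1   -> 1T locations
```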
Basic Concepts
Memory access time: the time between the Read and MFC (Memory Function Completed) signals.
Memory cycle time: the minimum time delay required between the initiation of two successive memory operations, for example, the time between two successive Read operations. The cycle time is usually slightly longer than the access time.
Memory Hierarchy
Based on speed, size, and cost.
Memory Interleaving
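The interleaving figures from the slides are not reproduced here. As a sketch of the idea, low-order interleaving places consecutive addresses in consecutive memory banks, so a run of sequential accesses is spread across all banks and can overlap in time. The bank count below is an illustrative assumption:

```python
def interleave(address, num_banks):
    """Low-order interleaving: consecutive addresses fall in
    consecutive banks, so sequential accesses can overlap."""
    bank = address % num_banks      # which module holds the word
    offset = address // num_banks   # location within that module
    return bank, offset

# With four banks, addresses 0..7 map to banks 0,1,2,3,0,1,2,3.
for a in range(8):
    print(a, interleave(a, 4))
```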
Main Memory
RAM
SRAMs
DRAMs
ROM
Static Memories
Memories that consist of circuits capable of retaining their state as long as power is applied are known as static memories.
Figure 5.4 illustrates how a static RAM (SRAM) cell can be implemented.
SRAM
Two cross-connected inverters form a latch.
The latch is connected to two bit lines by transistors T1 and T2.
These transistors act as switches that can be opened or closed under control of the word line.
When the word line is at ground level, the transistors are turned off and the latch retains its state.
This state is maintained as long as the signal on the word line is at ground level.
Read Operation
The word line is activated to close switches T1 and T2. If the cell is in state 1, the signal on bit line b is high and the signal on bit line b' is low. The opposite is true if the cell is in state 0.
Thus, b and b' are complements of each other. Sense/Write circuits at the ends of the bit lines monitor the state of b and b' and set the output accordingly.
Write Operation
The state of the cell is set by placing the appropriate value on bit line b and its complement on b', and then activating the word line.
This forces the cell into the corresponding state.
The required signals on the bit lines are generated by the Sense/Write circuit.
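As a purely behavioral sketch of the read and write protocols just described (the class and method names are invented for illustration, and the electrical details are abstracted away):

```python
class SRAMCell:
    """Behavioral model of the latch-based cell: the state is held
    as long as power is applied, and the word line gates access."""

    def __init__(self):
        self.state = 0   # the value held by the cross-coupled latch

    def write(self, b):
        # Drive bit line b and its complement b', then pulse the
        # word line: the latch is forced into the matching state.
        self.state = b

    def read(self):
        # Word line activated: b carries the state, b' its complement.
        b = self.state
        b_bar = 1 - b
        return b, b_bar

cell = SRAMCell()
cell.write(1)
print(cell.read())   # (1, 0): b high and b' low for state 1
```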
Dynamic Memories
Static RAMs are fast, but they come at a high cost because their cells require several transistors.
Less expensive RAMs can be implemented if simpler cells are used. However, such cells do not retain their state indefinitely; hence, they are called dynamic RAMs (DRAMs).
Information is stored in a dynamic memory cell in the form of a charge on a capacitor, and this charge can be maintained for only tens of milliseconds.
Since the cell is required to store information for a much longer time, its contents must be periodically refreshed by restoring the capacitor charge to its full value.
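The need for refresh can be sketched numerically. The leak rate and threshold below are illustrative assumptions, not device data; the point is only that the stored charge decays toward the sense threshold unless it is restored:

```python
# Why DRAM needs refresh: charge leaks away, and a refresh must
# restore it before it falls below the sense-amplifier threshold.
FULL, THRESHOLD, LEAK = 1.0, 0.5, 0.9   # illustrative values per ms step

def decay(charge, ms):
    """Charge remaining after ms one-millisecond leak steps."""
    for _ in range(ms):
        charge *= LEAK
    return charge

charge = decay(FULL, 5)          # a few ms later some charge has leaked
readable = charge > THRESHOLD    # still above threshold -> reads as 1
charge = FULL                    # refresh: restore the full charge
print(readable, charge)
```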
DRAM
In order to store information in this cell, transistor T is turned on and an appropriate voltage is applied to the bit line.
This causes a known amount of charge to be stored in the capacitor.
During a Read operation, the transistor in a selected cell is turned on.
A sense amplifier connected to the bit line detects whether the charge stored on the capacitor is above the threshold value.
If so, it drives the bit line to a full voltage that represents logic value 1.
This voltage recharges the capacitor to the full charge that corresponds to logic value 1.
If the sense amplifier detects that the charge on the capacitor is below the threshold value, it pulls the bit line to ground level, which ensures that the capacitor will have no charge, representing logic value 0.
Thus, reading the contents of the cell automatically refreshes its contents.
All cells in a selected row are read at the same time, which refreshes the contents of the entire row.
A 21-bit address is needed to access a byte in this memory:
a 12-bit row address and a 9-bit column address.
To reduce the number of pins needed for external connections, the row and column addresses are multiplexed on 12 pins.
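The 21-bit address split can be sketched in Python. The choice of putting the row address in the high-order bits matches the 12-bit row / 9-bit column division above; the sample address is arbitrary:

```python
def split_address(addr):
    """Split a 21-bit byte address into the 12-bit row and 9-bit
    column addresses that are time-multiplexed on 12 pins."""
    row = (addr >> 9) & 0xFFF    # high-order 12 bits
    col = addr & 0x1FF           # low-order 9 bits
    return row, col

addr = 0b101010101010_110011001      # an arbitrary 21-bit address
row, col = split_address(addr)
print(f"row={row:012b} col={col:09b}")
```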
Read Operation
During a Read or a Write operation, the row address is applied first.
It is loaded into the row address latch in response to a signal pulse on the Row Address Strobe (RAS) input of the chip. Then a Read operation is initiated, in which all cells on the selected row are read and refreshed.
Then the column address is applied to the address pins and loaded into the column address latch under control of the Column Address Strobe (CAS) signal.
The information in this latch is decoded and the appropriate group of 8 Sense/Write circuits is selected.
If the R/W control signal indicates a Read operation, the output values of the selected circuits are transferred to the data lines D7-0.
For a Write operation, the information on the D7-0 lines is transferred to the selected circuits. This information is then used to overwrite the contents of the selected cells in the corresponding 8 columns.
Synchronous DRAMs
DRAMs whose operation is directly synchronized with a clock signal are known as synchronous DRAMs (SDRAMs).
Double Data Rate SDRAMs
The standard SDRAM performs all actions on the rising edge of the clock signal.
A similar memory device is available, which accesses the cell array in the same way, but transfers data on both edges of the clock.
The latency of these devices is the same as for standard SDRAMs. But, since they transfer data on both edges of the clock, their bandwidth is essentially doubled for long burst transfers.
Such devices are known as double-data-rate SDRAMs (DDR SDRAMs).
The term memory latency refers to the amount of time it takes to transfer a word of data to or from the memory.
Memory bandwidth is the rate at which data can be read from or stored into a semiconductor memory by a processor. Memory bandwidth is usually expressed in units of bytes/second.
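The doubling effect of transferring on both clock edges can be seen with a quick calculation. The bus width and clock rate below are illustrative assumptions, not figures from the slides:

```python
def peak_bandwidth(bus_bytes, clock_hz, transfers_per_cycle):
    """Peak bandwidth in bytes/second: bus width times transfer rate."""
    return bus_bytes * clock_hz * transfers_per_cycle

clock = 100_000_000                  # assume a 100 MHz memory clock
sdr = peak_bandwidth(8, clock, 1)    # SDRAM: one transfer per cycle
ddr = peak_bandwidth(8, clock, 2)    # DDR: transfers on both edges
print(sdr, ddr, ddr / sdr)           # DDR doubles the burst bandwidth
```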
Read Only Memories
Both SRAM and DRAM chips are volatile, which means that they lose the stored information if power is turned off.
There are many applications that need memory devices which retain the stored information when power is turned off.
Since its normal operation involves only reading of stored data, a memory of this type is called read-only memory (ROM).
-
7/28/2019 Ccomputer organization
37/71
Types
ROM
PROM
EPROM
EEPROM
Cache Memories
The speed of the main memory is very low in comparison with the speed of modern processors.
For good performance, the processor cannot spend much of its time waiting to access instructions and data in main memory.
A scheme that reduces the time needed to access the necessary information was therefore introduced: the cache memory.
The effectiveness of the cache mechanism is based on a property of computer programs called locality of reference.
Many instructions in localized areas of the program are executed repeatedly during some time period, and the remainder of the program is accessed relatively infrequently. This is referred to as locality of reference.
There are two aspects of locality of reference: temporal and spatial.
Temporal: a recently executed instruction is likely to be executed again very soon.
Spatial: instructions in close proximity to a recently executed instruction (with respect to the instructions' addresses) are also likely to be executed soon.
If the active segments of a program can be placed in a fast cache memory, then the total execution time can be reduced significantly.
Conceptually, the operation of a cache memory is very simple. The memory control circuitry is designed to take advantage of the property of locality of reference.
The temporal aspect of the locality of reference suggests that whenever an information item (instruction or data) is first needed, this item should be brought into the cache, where it will hopefully remain until it is needed again.
The spatial aspect suggests that instead of fetching just one item from the main memory to the cache, it is useful to fetch several items that reside at adjacent addresses as well.
Usually, the cache memory can store a reasonable number of blocks at any given time, but this number is small compared to the total number of blocks in the main memory.
The correspondence between the main memory blocks and those in the cache is specified by a mapping function.
When the cache is full and a memory word (instruction or data) that is not in the cache is referenced, the cache control hardware must decide which block should be removed to create space for the new block that contains the referenced word.
The collection of rules for making this decision constitutes the replacement algorithm.
In a Read operation, the main memory is not involved.
For a Write operation, the system can proceed in two ways.
In the first technique, called the write-through protocol, the cache location and the main memory location are updated simultaneously.
The second technique is to update only the cache location and to mark it as updated with an associated flag bit.
The associated flag bit is often called the dirty or modified bit.
The main memory location of the word is updated later, when the block containing this marked word is to be removed from the cache to make room for a new block.
This technique is known as the write-back, or copy-back, protocol.
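The difference between the two update policies can be sketched for a single cached word. The class names and structure here are invented for illustration; a real cache tracks a dirty bit per block, not per word:

```python
class WriteThrough:
    """Write-through: cache and main memory updated simultaneously."""
    def __init__(self):
        self.cache, self.memory = 0, 0

    def write(self, value):
        self.cache = value
        self.memory = value          # memory updated at the same time

class WriteBack:
    """Write-back: only the cache is updated; memory waits for eviction."""
    def __init__(self):
        self.cache, self.memory, self.dirty = 0, 0, False

    def write(self, value):
        self.cache = value
        self.dirty = True            # mark as modified; memory is stale

    def evict(self):
        if self.dirty:               # copy back only if modified
            self.memory = self.cache
            self.dirty = False

wt, wb = WriteThrough(), WriteBack()
wt.write(7); wb.write(7)
print(wt.memory, wb.memory)          # 7 vs 0: write-back defers the update
wb.evict()
print(wb.memory)                     # 7 once the block is evicted
```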
When the addressed word in a Read operation is not in the cache, a read miss occurs.
The block of words that contains the requested word is copied from the main memory into the cache. After the entire block is loaded into the cache, the particular word requested is forwarded to the processor.
Alternatively, this word may be sent to the processor as soon as it is read from the main memory; this is called load-through, or early restart.
It reduces the processor's waiting period somewhat, but at the expense of more complex circuitry.
MAPPING FUNCTIONS
Consider a cache consisting of 128 blocks of 16 words each, for a total of 2048 (2K) words.
Assume that the main memory is addressable by a 16-bit address. The main memory has 64K words, which we will view as 4K blocks of 16 words each.
Direct Mapping
Associative Mapping
Set-Associative Mapping
Direct Mapping
The simplest way to determine cache locations in which to store memory blocks is the direct-mapping technique.
In this technique, block j of the main memory maps onto block j modulo 128 of the cache.
That is, when main memory block 0, 128, 256, . . . is loaded into the cache, it is stored in cache block 0.
Blocks 1, 129, 257, . . . are stored in cache block 1, and so on.
Since more than one memory block is mapped onto a given cache block position, contention may arise for that position even when the cache is not full.
For example, instructions of a program may start in block 1 and continue in block 129, possibly after a branch.
As this program is executed, both of these blocks must be transferred to the block-1 position in the cache.
Contention is resolved by allowing the new block to overwrite the currently resident block.
Placement of a block in the cache is determined from the memory address.
The low-order 4 bits select one of 16 words in a block.
The 7-bit cache block field determines the cache position in which this block must be stored.
The high-order 5 bits of the memory address of the block are stored in 5 tag bits associated with its location in the cache.
They identify which of the 32 blocks that are mapped into this cache position is currently resident in the cache.
As execution proceeds, the 7-bit cache block field of each address generated by the processor points to a particular block location in the cache.
The high-order 5 bits of the address are compared with the tag bits associated with that cache location.
If they match, then the desired word is in that block of the cache.
If there is no match, then the block containing the required word must first be read from the main memory and loaded into the cache.
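The tag/block/word decomposition above can be sketched directly in Python. The helper name is invented; the field widths are the 5/7/4 split of the 16-bit address from the running example:

```python
def direct_map_fields(addr):
    """Split a 16-bit word address into tag / block / word fields
    for the 128-block, 16-words-per-block direct-mapped cache."""
    word = addr & 0xF             # low-order 4 bits: word within block
    block = (addr >> 4) & 0x7F    # next 7 bits: cache block position
    tag = (addr >> 11) & 0x1F     # high-order 5 bits: tag
    return tag, block, word

# Memory blocks 1 and 129 contend for cache block 1, and only
# their tags (0 vs 1) tell them apart:
print(direct_map_fields(1 * 16))     # tag 0, block 1, word 0
print(direct_map_fields(129 * 16))   # tag 1, block 1, word 0
```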
The direct-mapping technique is easy to implement, but it is not very flexible.
Associative Mapping
A much more flexible mapping method is one in which a main memory block can be placed into any cache block position.
In this case, 12 tag bits are required to identify a memory block when it is resident in the cache.
The tag bits of an address received from the processor are compared to the tag bits of each block of the cache to see if the desired block is present.
The cost of an associative cache is higher than the cost of a direct-mapped cache because of the need to search all 128 tag patterns to determine whether a given block is in the cache.
A search of this kind is called an associative search.
Set Associative Mapping
Set-associative mapping is a combination of the direct- and associative-mapping techniques.
Blocks of the cache are grouped into sets, and the mapping allows a block of the main memory to reside in any block of a specific set.
The contention problem of the direct method is eased by having a few choices for block placement.
At the same time, the hardware cost is reduced by decreasing the size of the associative search.
Consider a cache with two blocks per set. In this case, memory blocks 0, 64, 128, . . . , 4032 map into cache set 0, and they can occupy either of the two block positions within this set.
Having 64 sets means that the 6-bit set field of the address determines which set of the cache might contain the desired block.
The tag field of the address must then be associatively compared to the tags of the two blocks of the set to check if the desired block is present.
This two-way associative search is simple to implement.
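The set-associative address split for this two-way example can be sketched like the direct-mapped one. With 64 sets and 16 words per block, the 16-bit address divides into a 6-bit tag, 6-bit set field, and 4-bit word field; the helper name is invented for illustration:

```python
def set_assoc_fields(addr):
    """Split a 16-bit word address for the two-way set-associative
    example: 64 sets, 16 words per block, hence 6 tag bits."""
    word = addr & 0xF             # 4-bit word field
    set_index = (addr >> 4) & 0x3F   # 6-bit set field
    tag = (addr >> 10) & 0x3F     # 6-bit tag field
    return tag, set_index, word

# Memory blocks 0 and 64 both map into set 0; the tag (0 vs 1)
# distinguishes them during the two-way associative compare:
print(set_assoc_fields(0 * 16))    # tag 0, set 0, word 0
print(set_assoc_fields(64 * 16))   # tag 1, set 0, word 0
```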