Computer Organization
TRANSCRIPT
7/28/2019 Computer Organization
Computer Organization: Module IV
Memory
Memory: memory hierarchy, the principle of inclusion, memory interleaving techniques. Disk memory: data organisation on disk, disk performance, disk caching. Main memory: SRAM, DRAM, ROM. Associative memory, scratchpad memory. Cache memory: levels of cache; mapping techniques (associative, direct, and set-associative); main memory update policies.
Basic Concepts
The maximum size of the memory that can be used in any computer is determined by the addressing scheme.
For example, a 16-bit computer that generates 16-bit addresses is capable of addressing up to 2^16 = 64K memory locations.
Similarly, 32-bit addresses can access a memory that contains up to 2^32 = 4G locations,
whereas machines with 40-bit addresses can access up to 2^40 = 1T (tera) locations.
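These figures follow directly from the address width; a small Python sketch (using the binary convention 1K = 2^10) confirms them:

```python
def addressable_locations(address_bits):
    """Number of distinct locations a machine with the
    given address width can reference."""
    return 2 ** address_bits

K = 2 ** 10   # 1K = 1024
G = 2 ** 30   # 1G
T = 2 ** 40   # 1T

print(addressable_locations(16) // K)   # 64  -> 64K locations
print(addressable_locations(32) // G)   # 4   -> 4G locations
print(addressable_locations(40) // T)   # 1   -> 1T locations
```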
Basic Concepts
Memory access time: the time between the Read and MFC (Memory Function Completed) signals.
Memory cycle time: the minimum time delay required between the initiation of two successive memory operations, for example, the time between two successive Read operations. The cycle time is usually slightly longer than the access time.
Memory Hierarchy
Based on speed, size, and cost.
Memory Interleaving
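The interleaving figures from the slides are not reproduced here. As a sketch of the idea, low-order interleaving places consecutive addresses in consecutive memory banks, so a run of sequential accesses is spread across all banks and can overlap in time. The bank count below is an illustrative assumption:

```python
def interleave(address, num_banks):
    """Low-order interleaving: consecutive addresses fall in
    consecutive banks, so sequential accesses can overlap."""
    bank = address % num_banks      # which module holds the word
    offset = address // num_banks   # location within that module
    return bank, offset

# With four banks, addresses 0..7 map to banks 0,1,2,3,0,1,2,3.
for a in range(8):
    print(a, interleave(a, 4))
```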
Main Memory
RAM
SRAMs
DRAMs
ROM
Static Memories
Memories that consist of circuits capable of retaining their state as long as power is applied are known as static memories.
Figure 5.4 illustrates how a static RAM (SRAM) cell can be implemented.
SRAM
Two cross-connected inverters form a latch.
The latch is connected to two bit lines by transistors T1 and T2.
These transistors act as switches that can be opened or closed under control of the word line.
When the word line is at ground level, the transistors are turned off and the latch retains its state.
This state is maintained as long as the signal on the word line is at ground level.
Read Operation
The word line is activated to close switches T1 and T2. If the cell is in state 1, the signal on bit line b is high and the signal on bit line b' is low. The opposite is true if the cell is in state 0.
Thus, b and b' are complements of each other. Sense/Write circuits at the ends of the bit lines monitor the state of b and b' and set the output accordingly.
Write Operation
The state of the cell is set by placing the appropriate value on bit line b and its complement on b', and then activating the word line.
This forces the cell into the corresponding state.
The required signals on the bit lines are generated by the Sense/Write circuit.
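As a purely behavioral sketch of the read and write protocols just described (the class and method names are invented for illustration, and the electrical details are abstracted away):

```python
class SRAMCell:
    """Behavioral model of the latch-based cell: the state is held
    as long as power is applied, and the word line gates access."""

    def __init__(self):
        self.state = 0   # the value held by the cross-coupled latch

    def write(self, b):
        # Drive bit line b and its complement b', then pulse the
        # word line: the latch is forced into the matching state.
        self.state = b

    def read(self):
        # Word line activated: b carries the state, b' its complement.
        b = self.state
        b_bar = 1 - b
        return b, b_bar

cell = SRAMCell()
cell.write(1)
print(cell.read())   # (1, 0): b high and b' low for state 1
```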
Dynamic Memories
Static RAMs are fast, but they come at a high cost because their cells require several transistors.
Less expensive RAMs can be implemented if simpler cells are used. However, such cells do not retain their state indefinitely; hence, they are called dynamic RAMs (DRAMs).
Information is stored in a dynamic memory cell in the form of a charge on a capacitor, and this charge can be maintained for only tens of milliseconds.
Since the cell is required to store information for a much longer time, its contents must be periodically refreshed by restoring the capacitor charge to its full value.
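The need for refresh can be sketched numerically. The leak rate and threshold below are illustrative assumptions, not device data; the point is only that the stored charge decays toward the sense threshold unless it is restored:

```python
# Why DRAM needs refresh: charge leaks away, and a refresh must
# restore it before it falls below the sense-amplifier threshold.
FULL, THRESHOLD, LEAK = 1.0, 0.5, 0.9   # illustrative values per ms step

def decay(charge, ms):
    """Charge remaining after ms one-millisecond leak steps."""
    for _ in range(ms):
        charge *= LEAK
    return charge

charge = decay(FULL, 5)          # a few ms later some charge has leaked
readable = charge > THRESHOLD    # still above threshold -> reads as 1
charge = FULL                    # refresh: restore the full charge
print(readable, charge)
```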
DRAM
In order to store information in this cell, transistor T is turned on and an appropriate voltage is applied to the bit line.
This causes a known amount of charge to be stored in the capacitor.
During a Read operation, the transistor in a selected cell is turned on.
A sense amplifier connected to the bit line detects whether the charge stored on the capacitor is above the threshold value.
If so, it drives the bit line to a full voltage that represents logic value 1.
This voltage recharges the capacitor to the full charge that corresponds to logic value 1.
If the sense amplifier detects that the charge on the capacitor is below the threshold value, it pulls the bit line to ground level, which ensures that the capacitor will have no charge, representing logic value 0.
Thus, reading the contents of the cell automatically refreshes its contents.
All cells in a selected row are read at the same time, which refreshes the contents of the entire row.
A 21-bit address is needed to access a byte in this memory:
a 12-bit row address and a 9-bit column address.
To reduce the number of pins needed for external connections, the row and column addresses are multiplexed on 12 pins.
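The 21-bit address split can be sketched in Python. The choice of putting the row address in the high-order bits matches the 12-bit row / 9-bit column division above; the sample address is arbitrary:

```python
def split_address(addr):
    """Split a 21-bit byte address into the 12-bit row and 9-bit
    column addresses that are time-multiplexed on 12 pins."""
    row = (addr >> 9) & 0xFFF    # high-order 12 bits
    col = addr & 0x1FF           # low-order 9 bits
    return row, col

addr = 0b101010101010_110011001      # an arbitrary 21-bit address
row, col = split_address(addr)
print(f"row={row:012b} col={col:09b}")
```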
Read Operation
During a Read or a Write operation, the row address is applied first.
It is loaded into the row address latch in response to a signal pulse on the Row Address Strobe (RAS) input of the chip. Then a Read operation is initiated, in which all cells on the selected row are read and refreshed.
Then the column address is applied to the address pins and loaded into the column address latch under control of the Column Address Strobe (CAS) signal.
The information in this latch is decoded and the appropriate group of 8 Sense/Write circuits is selected.
If the R/W control signal indicates a Read operation, the output values of the selected circuits are transferred to the data lines D7-0.
For a Write operation, the information on the D7-0 lines is transferred to the selected circuits. This information is then used to overwrite the contents of the selected cells in the corresponding 8 columns.
Synchronous DRAMs
DRAMs whose operation is directly synchronized with a clock signal are known as synchronous DRAMs (SDRAMs).
Double Data Rate SDRAMs
The standard SDRAM performs all actions on the rising edge of the clock signal.
A similar memory device is available, which accesses the cell array in the same way, but transfers data on both edges of the clock.
The latency of these devices is the same as for standard SDRAMs. But, since they transfer data on both edges of the clock, their bandwidth is essentially doubled for long burst transfers.
Such devices are known as double-data-rate SDRAMs (DDR SDRAMs).
The term memory latency refers to the amount of time it takes to transfer a word of data to or from the memory.
Memory bandwidth is the rate at which data can be read from or stored into a semiconductor memory by a processor. Memory bandwidth is usually expressed in units of bytes/second.
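The doubling effect of transferring on both clock edges can be seen with a quick calculation. The bus width and clock rate below are illustrative assumptions, not figures from the slides:

```python
def peak_bandwidth(bus_bytes, clock_hz, transfers_per_cycle):
    """Peak bandwidth in bytes/second: bus width times transfer rate."""
    return bus_bytes * clock_hz * transfers_per_cycle

clock = 100_000_000                  # assume a 100 MHz memory clock
sdr = peak_bandwidth(8, clock, 1)    # SDRAM: one transfer per cycle
ddr = peak_bandwidth(8, clock, 2)    # DDR: transfers on both edges
print(sdr, ddr, ddr / sdr)           # DDR doubles the burst bandwidth
```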
Read Only Memories
Both SRAM and DRAM chips are volatile, which means that they lose the stored information if power is turned off.
There are many applications that need memory devices which retain the stored information when power is turned off.
Since its normal operation involves only reading of stored data, a memory of this type is called read-only memory (ROM).
-
7/28/2019 Ccomputer organization
37/71
Types
ROM
PROM
EPROM
EEPROM
Cache Memories
The speed of the main memory is very low in comparison with the speed of modern processors.
For good performance, the processor cannot spend much of its time waiting to access instructions and data in main memory.
A scheme that reduces the time needed to access the necessary information was therefore introduced: the cache memory.
The effectiveness of the cache mechanism is based on a property of computer programs called locality of reference.
Many instructions in localized areas of the program are executed repeatedly during some time period, and the remainder of the program is accessed relatively infrequently. This is referred to as locality of reference.
There are two aspects of locality of reference: temporal and spatial.
Temporal: a recently executed instruction is likely to be executed again very soon.
Spatial: instructions in close proximity to a recently executed instruction (with respect to the instructions' addresses) are also likely to be executed soon.
If the active segments of a program can be placed in a fast cache memory, then the total execution time can be reduced significantly.
Conceptually, the operation of a cache memory is very simple. The memory control circuitry is designed to take advantage of the property of locality of reference.
The temporal aspect of the locality of reference suggests that whenever an information item (instruction or data) is first needed, this item should be brought into the cache, where it will hopefully remain until it is needed again.
The spatial aspect suggests that instead of fetching just one item from the main memory to the cache, it is useful to fetch several items that reside at adjacent addresses as well.
Usually, the cache memory can store a reasonable number of blocks at any given time, but this number is small compared to the total number of blocks in the main memory.
The correspondence between the main memory blocks and those in the cache is specified by a mapping function.
When the cache is full and a memory word (instruction or data) that is not in the cache is referenced, the cache control hardware must decide which block should be removed to create space for the new block that contains the referenced word.
The collection of rules for making this decision constitutes the replacement algorithm.
In a Read operation, the main memory is not involved.
For a Write operation, the system can proceed in two ways.
In the first technique, called the write-through protocol, the cache location and the main memory location are updated simultaneously.
The second technique is to update only the cache location and to mark it as updated with an associated flag bit.
The associated flag bit is often called the dirty or modified bit.
The main memory location of the word is updated later, when the block containing this marked word is to be removed from the cache to make room for a new block.
This technique is known as the write-back, or copy-back, protocol.
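The difference between the two update policies can be sketched for a single cached word. The class names and structure here are invented for illustration; a real cache tracks a dirty bit per block, not per word:

```python
class WriteThrough:
    """Write-through: cache and main memory updated simultaneously."""
    def __init__(self):
        self.cache, self.memory = 0, 0

    def write(self, value):
        self.cache = value
        self.memory = value          # memory updated at the same time

class WriteBack:
    """Write-back: only the cache is updated; memory waits for eviction."""
    def __init__(self):
        self.cache, self.memory, self.dirty = 0, 0, False

    def write(self, value):
        self.cache = value
        self.dirty = True            # mark as modified; memory is stale

    def evict(self):
        if self.dirty:               # copy back only if modified
            self.memory = self.cache
            self.dirty = False

wt, wb = WriteThrough(), WriteBack()
wt.write(7); wb.write(7)
print(wt.memory, wb.memory)          # 7 vs 0: write-back defers the update
wb.evict()
print(wb.memory)                     # 7 once the block is evicted
```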
When the addressed word in a Read operation is not in the cache, a read miss occurs.
The block of words that contains the requested word is copied from the main memory into the cache. After the entire block is loaded into the cache, the particular word requested is forwarded to the processor.
Alternatively, this word may be sent to the processor as soon as it is read from the main memory; this is called load-through, or early restart.
It reduces the processor's waiting period somewhat, but at the expense of more complex circuitry.
MAPPING FUNCTIONS
Consider a cache consisting of 128 blocks of 16 words each, for a total of 2048 (2K) words.
Assume that the main memory is addressable by a 16-bit address. The main memory has 64K words, which we will view as 4K blocks of 16 words each.
Direct Mapping
Associative Mapping
Set-Associative Mapping
Direct Mapping
The simplest way to determine cache locations in which to store memory blocks is the direct-mapping technique.
In this technique, block j of the main memory maps onto block j modulo 128 of the cache.
That is, when main memory block 0, 128, 256, . . . is loaded into the cache, it is stored in cache block 0.
Blocks 1, 129, 257, . . . are stored in cache block 1, and so on.
Since more than one memory block is mapped onto a given cache block position, contention may arise for that position even when the cache is not full.
For example, instructions of a program may start in block 1 and continue in block 129, possibly after a branch.
As this program is executed, both of these blocks must be transferred to the block-1 position in the cache.
Contention is resolved by allowing the new block to overwrite the currently resident block.
Placement of a block in the cache is determined from the memory address.
The low-order 4 bits select one of 16 words in a block.
The 7-bit cache block field determines the cache position in which this block must be stored.
The high-order 5 bits of the memory address of the block are stored in 5 tag bits associated with its location in the cache.
They identify which of the 32 blocks that are mapped into this cache position is currently resident in the cache.
As execution proceeds, the 7-bit cache block field of each address generated by the processor points to a particular block location in the cache.
The high-order 5 bits of the address are compared with the tag bits associated with that cache location.
If they match, then the desired word is in that block of the cache.
If there is no match, then the block containing the required word must first be read from the main memory and loaded into the cache.
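The tag/block/word decomposition above can be sketched directly in Python. The helper name is invented; the field widths are the 5/7/4 split of the 16-bit address from the running example:

```python
def direct_map_fields(addr):
    """Split a 16-bit word address into tag / block / word fields
    for the 128-block, 16-words-per-block direct-mapped cache."""
    word = addr & 0xF             # low-order 4 bits: word within block
    block = (addr >> 4) & 0x7F    # next 7 bits: cache block position
    tag = (addr >> 11) & 0x1F     # high-order 5 bits: tag
    return tag, block, word

# Memory blocks 1 and 129 contend for cache block 1, and only
# their tags (0 vs 1) tell them apart:
print(direct_map_fields(1 * 16))     # tag 0, block 1, word 0
print(direct_map_fields(129 * 16))   # tag 1, block 1, word 0
```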
The direct-mapping technique is easy to implement, but it is not very flexible.
Associative Mapping
A much more flexible mapping method is one in which a main memory block can be placed into any cache block position.
In this case, 12 tag bits are required to identify a memory block when it is resident in the cache.
The tag bits of an address received from the processor are compared to the tag bits of each block of the cache to see if the desired block is present.
The cost of an associative cache is higher than the cost of a direct-mapped cache because of the need to search all 128 tag patterns to determine whether a given block is in the cache.
A search of this kind is called an associative search.
Set Associative Mapping
Set-associative mapping is a combination of the direct- and associative-mapping techniques.
Blocks of the cache are grouped into sets, and the mapping allows a block of the main memory to reside in any block of a specific set.
The contention problem of the direct method is eased by having a few choices for block placement.
At the same time, the hardware cost is reduced by decreasing the size of the associative search.
Consider a cache with two blocks per set. In this case, memory blocks 0, 64, 128, . . . , 4032 map into cache set 0, and they can occupy either of the two block positions within this set.
Having 64 sets means that the 6-bit set field of the address determines which set of the cache might contain the desired block.
The tag field of the address must then be associatively compared to the tags of the two blocks of the set to check if the desired block is present.
This two-way associative search is simple to implement.
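The set-associative address split for this two-way example can be sketched like the direct-mapped one. With 64 sets and 16 words per block, the 16-bit address divides into a 6-bit tag, 6-bit set field, and 4-bit word field; the helper name is invented for illustration:

```python
def set_assoc_fields(addr):
    """Split a 16-bit word address for the two-way set-associative
    example: 64 sets, 16 words per block, hence 6 tag bits."""
    word = addr & 0xF             # 4-bit word field
    set_index = (addr >> 4) & 0x3F   # 6-bit set field
    tag = (addr >> 10) & 0x3F     # 6-bit tag field
    return tag, set_index, word

# Memory blocks 0 and 64 both map into set 0; the tag (0 vs 1)
# distinguishes them during the two-way associative compare:
print(set_assoc_fields(0 * 16))    # tag 0, set 0, word 0
print(set_assoc_fields(64 * 16))   # tag 1, set 0, word 0
```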