30/11/07 www.eej.ulster.ac.uk/~ian/modules/EEE515J1/files

EEE515J1 ASICs and DIGITAL DESIGN

Lecture 7: CPUs; The SHC1, Simple Hypothetical CPU #1

Ian McCrum, Room 5B18
Tel: 90 366364 (voice mail on 6th ring)
Email: [email protected]
Web site: http://www.eej.ulst.ac.uk (old archive http://tigger.engj.ulst.ac.uk/~ddij23 )

Last changed 30/11/07@18:00


Common ASM DATA PROCESSOR blocks

We find we often get data from the outside world or an internal storage register, process it in some way, and put the result back into an internal register or send it to the outside world.

DATA PROCESSOR: Simple blocks, each of which does a single, simple, easily expressed function.

CONTROL LOGIC: Actually an FSM; receiving inputs and deciding what sequences of outputs to generate.

[Diagram labels: Input Data, Output Data, Control Signals, Status Signals, External Inputs (only a few, and preferably synchronised to the system clock)]


More common DATA PROCESSOR blocks

In designing ASM machines we often need to repeat a set of operations a number of times. Hence we will often have counters and some means of detecting when a count is reached (or counters that count down, and a zero detector: a NOR gate!).

[Diagram 1 labels: COUNTER (RESETTABLE), DETECT 16, CLOCK, COUNTUP, CLEAR, EQ16]

[Diagram 2 labels: LOAD START VALUE, COUNTER (RESETTABLE), DETECT zero, CLOCK, COUNTDOWN, LOAD, EQ, REGISTER (load number or constant)]
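The two counter blocks can be modelled in a few lines of Python. This is a minimal behavioural sketch, not the lecture's design: the class names, the 5-bit width and the priority of CLEAR/LOAD over counting are all assumptions made for illustration.

```python
class UpCounter:
    """Resettable up-counter with an 'equals 16' detector."""

    def __init__(self, width=5):
        self.mask = (1 << width) - 1   # width=5 so that the value 16 fits
        self.count = 0

    def clock(self, countup=False, clear=False):
        # One clock edge. CLEAR taking priority over COUNTUP is an assumption.
        if clear:
            self.count = 0
        elif countup:
            self.count = (self.count + 1) & self.mask

    @property
    def eq16(self):
        return self.count == 16


class DownCounter:
    """Loadable down-counter with a zero detector."""

    def __init__(self, width=5):
        self.width = width
        self.mask = (1 << width) - 1
        self.count = 0

    def clock(self, countdown=False, load=False, start_value=0):
        # One clock edge. LOAD taking priority over COUNTDOWN is an assumption.
        if load:
            self.count = start_value & self.mask
        elif countdown:
            self.count = (self.count - 1) & self.mask

    @property
    def eq(self):
        # NOR gate: the output is true only when every counter bit is 0.
        bits = [(self.count >> i) & 1 for i in range(self.width)]
        return not any(bits)


up = UpCounter()
for _ in range(16):
    up.clock(countup=True)

down = DownCounter()
down.clock(load=True, start_value=3)
for _ in range(3):
    down.clock(countdown=True)

print(up.eq16, down.eq)
```

The down-counter needs only a NOR of its bits as a detector, whereas the up-counter needs a comparator for the particular terminal value.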


More general purpose data processing block

We could add the blocks on the left to every digital machine we design...

This is the start of designing a “general purpose digital machine” - a CPU

[Diagram 1 labels: REG, REGISTER, ADDER, RESULT register; control lines LOAD, LOAD, CLEAR, ADD, LOADRESULT, CLEARRESULT]

[Diagram 2 labels: REG, REGISTER, ALU, RESULT register; control lines LOAD, CLEAR, ALU Function code, LOADRESULT, CLEARRESULT]

The ALU can output A+B, A-B, B-A, A, B, A AND B, A OR B, A XOR B, NOT A, NOT B using 4 function code lines. It can also output STATUS bits Z, C, N, V (see the 74F181 datasheet).
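A behavioural sketch of such an ALU, assuming an 8-bit data path. The function is selected here by name because the actual 4-bit function code encoding is not given; the flag conventions (C is the carry out of the adder, so C=1 after a subtraction means "no borrow") are also assumptions, not taken from the 74F181 datasheet.

```python
MASK = 0xFF  # 8-bit data, an assumption matching the SHC01 buses


def _add(x, y, carry_in):
    """8-bit add returning (result, carry out, signed overflow)."""
    full = x + y + carry_in
    result = full & MASK
    carry = full >> 8
    overflow = ((x ^ result) & (y ^ result) & 0x80) >> 7
    return result, carry, overflow


def alu(fn, a, b):
    """fn is a name standing in for the 4-bit function code."""
    a &= MASK
    b &= MASK
    carry = overflow = 0
    if fn == "A+B":
        result, carry, overflow = _add(a, b, 0)
    elif fn == "A-B":
        result, carry, overflow = _add(a, ~b & MASK, 1)  # A + NOT B + 1
    elif fn == "B-A":
        result, carry, overflow = _add(b, ~a & MASK, 1)  # B + NOT A + 1
    else:
        logic = {
            "A": a, "B": b,
            "A AND B": a & b, "A OR B": a | b, "A XOR B": a ^ b,
            "NOT A": ~a & MASK, "NOT B": ~b & MASK,
        }
        result = logic[fn]
    flags = {"Z": int(result == 0), "C": carry, "N": result >> 7, "V": overflow}
    return result, flags


print(alu("A+B", 200, 100))
print(alu("A-B", 5, 5))
```

Ten functions need four select lines because three lines give only eight codes.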


The SHC01 (see SHC01.pdf)

The minimum to do useful work. It has many areas that can be improved: it only has one accumulator (and a temporary register). It cannot, as it stands, implement subroutines or even indexed memory accesses. It has only 8-bit data and address buses.

It has a PROGRAM ROM where every instruction code (OPCODE) and operand is stored, starting at address zero.

It requires 22 control signals emitted in the correct order for everything to work.

It allows up to 16 microinstructions for each OPCODE loaded into the IR (Instruction Register).

See the fetch-execute tables and microcode tables to see how this machine works (the .pdf on the website/handout in class).


[Block diagram of the SHC01. Labels: ACCA, MDR, ALU (C[2..0]), RESULT, IR, LAT, PC (with 'i' increment line), MAR, PROGRAM ROM, DATA RAM, INPUT BUFFER, OUTPUT REG, CONTROL UNIT ROM (13 ADDRESSES, 22 DATA OUTPUTS), DATA BUS (8 bits), ADDRESS BUS (8 bits). Registers and buffers carry 'S' (strobe) and 'E' (enable) lines.]

The control unit ROM outputs signals to:

control strobing data into a register (using the 'S' lines)
enable outputs from registers or buffers ('E')
control the function of the ALU (C2, C1 and C0)
increment the PC (the 'I' line)
supply a 4 bit number to the LAT latch (this causes the ROM to switch to (typically) the next microinstruction)

i.e. { ACCAS, MDRS, RESULTS, RESULTE, IRS, PCS, PCI, PCE, MARS, MARE, ALU[C2..C0], ROME, RAMS, RAME, INPE, OUTS, LAT[d3..d0] }

Hence the ROM is 2^13 x 22 bits in size
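One way to picture a 22-bit ROM output is as a packed control word. The sketch below assumes the signal names read from the list above (register name followed by S, E or I) and an arbitrary bit ordering; the real ordering is whatever the microcode tables in the handout use.

```python
# (name, width) pairs. The order, and hence the bit positions, is an assumption.
FIELDS = [
    ("ACCAS", 1), ("MDRS", 1), ("RESULTS", 1), ("RESULTE", 1), ("IRS", 1),
    ("PCS", 1), ("PCI", 1), ("PCE", 1), ("MARS", 1), ("MARE", 1),
    ("ALU", 3), ("ROME", 1), ("RAMS", 1), ("RAME", 1), ("INPE", 1),
    ("OUTS", 1), ("LAT", 4),
]
WIDTH = sum(width for _, width in FIELDS)


def encode(**values):
    """Pack named signal values into one control word (first field is the MSB)."""
    unknown = set(values) - {name for name, _ in FIELDS}
    if unknown:
        raise ValueError(f"unknown signals: {sorted(unknown)}")
    word = 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bit(s)")
        word = (word << width) | value
    return word


def decode(word):
    """Unpack a control word back into a dict of signal values."""
    out = {}
    for name, width in reversed(FIELDS):
        out[name] = word & ((1 << width) - 1)
        word >>= width
    return out


w = encode(PCI=1, PCE=1, IRS=1, LAT=3)
print(WIDTH, format(w, "022b"))
```

The widths add up to 22, which is where the ROM's 22 data outputs come from.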


POWER UP SEQUENCE and fetch-execute of first instruction (assumes immediate ADD)

Step # | ACTION              | RESULT (round brackets mean “contents of”) | COMMENT
0      | PCE=1               | (PC) -> AB                                 | PUT PROGRAM COUNTER CONTENTS ONTO ADDRESS BUS
1      | ROME=1              | (ROM) -> DB                                | READ THE PROGRAM ROM. OPCODE NOW ON DATA BUS
2      | PCI=1, PCE=1, IRS=1 | (PC)+1 -> PC, (PC) -> AB, (DB) -> IR       | POINT PC AT OPERAND, AND READ THE ROM; ITS CONTENTS GO INTO THE IR
3      | ROME=1              | (ROM) -> DB                                | ADDRESS BUS SETTLES WITH NEW VALUE; THE ADDRESS OF THE OPERAND
4      | MDRS=1              | (DB) -> MDR                                | PUT IT IN THE MDR
5      | ALU=ADD             | ALU=(ACC)+(MDR)                            | EXECUTE THE INSTRUCTION
6      | RESULTS=1           | (ALU) -> RESULT                            |
7      | RESULTE=1           | (RESULT) -> DB                             | PUT ANSWER ONTO DATA BUS
8      | ACCS=1              | (DB) -> ACC                                | AND INTO ACC.
9      | PCI=1               |                                            | THESE ARE PART OF
10     | PCE=1, ROME=1       |                                            | THE NEXT
11     | IRS=1, PCI=1, PCE=1 |                                            | FETCH-EXECUTE TABLE

Do examine the 5 page handout carefully: check the microcode tables that implement the above.
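Steps 0 to 8 of the table can be traced with a small register-transfer sketch. The opcode value `ADD_IMM`, the dictionary of register names and the function name are assumptions for illustration; the handout's microcode tables remain the reference.

```python
ADD_IMM = 0x01  # hypothetical opcode value for "ADD immediate"


def run_add_immediate(rom, acc=0):
    """Trace steps 0..8 of the fetch-execute table for an immediate ADD."""
    s = {"PC": 0, "IR": 0, "MDR": 0, "ACC": acc, "RESULT": 0, "AB": 0, "DB": 0}

    s["AB"] = s["PC"]                        # step 0: PCE,  (PC) -> AB
    s["DB"] = rom[s["AB"]]                   # step 1: ROME, (ROM) -> DB
    s["IR"] = s["DB"]                        # step 2: IRS,  (DB) -> IR
    s["PC"] = (s["PC"] + 1) & 0xFF           #         PCI,  (PC)+1 -> PC
    s["AB"] = s["PC"]                        #         PCE,  (PC) -> AB
    s["DB"] = rom[s["AB"]]                   # step 3: ROME, operand on data bus
    s["MDR"] = s["DB"]                       # step 4: MDRS, (DB) -> MDR
    alu_out = (s["ACC"] + s["MDR"]) & 0xFF   # step 5: ALU = (ACC) + (MDR)
    s["RESULT"] = alu_out                    # step 6: RESULTS
    s["DB"] = s["RESULT"]                    # step 7: RESULTE
    s["ACC"] = s["DB"]                       # step 8: ACCS
    return s


state = run_add_immediate([ADD_IMM, 5], acc=3)
print(state["ACC"], state["PC"])
```

Each assignment corresponds to one control signal going to '1', which is what a row of the microcode table has to emit.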


Improving the SHC01

1) Use a REGISTER BANK

[Diagram labels: REGISTER BANK (8 registers) with REG_WRITE_ADDRESS[2..0], REGA_READ_ADDRESS[2..0], REGB_READ_ADDRESS[2..0] and WRS; outputs A and B; ALU with C0, C1, C2; RESULT register with S and E]

The register bank needs 10 control signals instead of 2, but the control logic can be altered to make this efficient: take bits direct from the IR to the register address lines. Suits larger machines, 16 bits and above.
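A behavioural sketch of the register bank, assuming 8-bit registers, two combinational read ports and one clocked write port. The class and method names are illustrative only.

```python
class RegisterBank:
    """8 registers, two read ports (A and B) and one write port."""

    def __init__(self):
        self.regs = [0] * 8

    def read(self, rega_read_address, regb_read_address):
        # Reads are combinational: both outputs are available at once.
        return (self.regs[rega_read_address & 0b111],
                self.regs[regb_read_address & 0b111])

    def clock(self, reg_write_address, data, wrs):
        # The write happens only when the write strobe WRS is active.
        if wrs:
            self.regs[reg_write_address & 0b111] = data & 0xFF


bank = RegisterBank()
bank.clock(reg_write_address=3, data=0x12, wrs=True)
bank.clock(reg_write_address=5, data=0x34, wrs=False)  # ignored, WRS is low
print(bank.read(3, 5))
```

The 10 control signals are the three 3-bit addresses plus WRS.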


Improving the SHC01

2) Use a bigger ALU or 3) a secondary ALU

[Diagram labels: REG bank, ALU, ALU Function code, Secondary ALU (e.g. MULTIPLIER)]


Improving the SHC01

4) Improve memory addressing capability:

(a) Increment
(b) Double Increment
(c) Decrement
(d) Double Decrement
(e) Reset (to access address zero)

[Diagram labels: RAM, ROM, PC (S, E, I), MAR (MARS, MARE, MARI, MARII, MARD, MARDD, MARCLR)]
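The extended MAR can be sketched as a register with a few extra clocked operations. This assumes an 8-bit MAR, at most one control line active per clock, and that MARCLR and MARS take priority over counting; none of these details are stated on the slide.

```python
class MAR:
    """Memory address register with increment/decrement/clear operations."""

    def __init__(self):
        self.value = 0

    def clock(self, mars=False, data=0, mari=False, marii=False,
              mard=False, mardd=False, marclr=False):
        if marclr:                      # (e) reset, to access address zero
            self.value = 0
        elif mars:                      # ordinary strobe from the data bus
            self.value = data & 0xFF
        else:
            # (a)..(d): +1, +2, -1, -2; booleans act as 0/1 in arithmetic
            step = mari * 1 + marii * 2 - mard * 1 - mardd * 2
            self.value = (self.value + step) & 0xFF


m = MAR()
m.clock(mars=True, data=0x10)
m.clock(marii=True)
print(hex(m.value))
```

With these microorders the CPU can step through a table in RAM without routing the address through the ALU.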


Improving the SHC01

5) Add a second MAR

If you have a source and destination address in external RAM it makes sense to have two address pointers within the CPU.

MAR2 will need the usual S and E lines; it makes sense to also add others (c.f. previous slide).

[Diagram labels: RAM, ROM, PC (S, E, I), MAR1 (S, E), MAR2, DATA BUS, ADDRESS BUS]


Improving the SHC01

6) Add a second ALU, to allow calculated addresses

[Diagram labels: RAM, ROM, PC, MAR1, MAR2, REG, TEMPREG2, Secondary ALU (simple adder), DATA BUS, ADDRESS BUS]


Now to optimise the Control unit.

It currently needs 13 inputs and 22 outputs

If implemented as a large ROM it needs

2^13 * 22 bits = 180,224 bits


MICROPROGRAMMING

[Diagram labels: INSTRUCTION REGISTER (8), 4 BIT LATCH (4), STATUS BIT FROM ALU, clk, CONTROL UNIT ROM CONTAINING MICROCODE, 18 outputs: to all 'S' and 'E' control signals, also to ALU C2, C1 and C0 control lines, AS and BS strobe lines, PC increment line (PCI)]

On powerup the IR and LATCH are at zero, so the first address presented at the inputs of the MICROCODE ROM is

X 0000-0000 0000

The first thing to do is put the PC’s contents onto the address bus.

Next, enable the PROGRAM ROM's outputs (onto the data bus).

Next, the IR is strobed: the first real opcode is now in the IR and the ROM has a new address ... depending on what that opcode is!

The microcode performs a “microjump” to the new microcode.
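The 13-bit microcode ROM address can be sketched as a concatenation of the status bit, the 8-bit IR and the 4-bit latch. The bit order follows the "X 0000-0000 0000" pattern above (status bit first, latch last); treating that as the actual wiring is an assumption.

```python
def microcode_address(status, ir, lat):
    """Form the 13-bit microcode ROM address: status | IR[7..0] | LAT[3..0]."""
    return ((status & 1) << 12) | ((ir & 0xFF) << 4) | (lat & 0xF)


# Powerup: IR and latch are zero; the status bit is a "don't care" (X),
# so both possible addresses must hold the same first microinstruction.
powerup = [microcode_address(status, 0, 0) for status in (0, 1)]
print([format(a, "013b") for a in powerup])

# Strobing a new opcode into the IR changes the address: the "microjump".
print(format(microcode_address(0, 0x01, 3), "013b"))
```

Because the IR supplies the middle bits of the address, each opcode owns a block of 16 consecutive microinstructions, selected by the latch.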


Improving the CONTROL UNIT of SHC01

1) Replace LAT with “MICROPROGRAM COUNTER”

If we use just the microorders “COUNT” and “RESET” this saves 2 outputs from the control unit, so its new size is 2^13 X 20 (...163,840 bits...)

Actually we can remove the need for “RESET” if we complicate the microcode. Its new size is 2^13 X 19 (...155,648 bits...)

It is even possible to have “COUNT” as a default option and remove the need for it as well. At this stage the microcode becomes hard to follow, so this step is left until the very end, when a number of obfuscating optimisations can be carried out.

[Diagram labels: INSTRUCTION REGISTER (8), 4 bit LATCH (4), STATUS BIT FROM ALU, clk, CONTROL UNIT ROM CONTAINING MICROCODE, 2]


Improving the CONTROL UNIT of SHC01

2) Look for redundancy in the control signals - PCE/MARE

Drop MARE and use an inverter wired to PCE, since we see that PCE and MARE are never '1' at the same time and it does no harm to have one of these at '1' all the time. (“00” not used)

This saves an output; the CU ROM is now 2^13 X 18 (...147,456 bits...)

[Diagram labels: PC, MAR, PCE]

3) Look for redundancy in the control signals - mutually exclusive 'S' lines

It so happens that we never activate more than one S line at a time, so we can use a decoder. There are times when no S lines are active, so it is convenient to use a 3:8 decoder and provide 7 S lines with a 3 bit number emitted from the control unit ROM.

The CU ROM is 2^13 X 14 (...114,688 bits...)
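Both tricks are small pieces of logic outside the ROM. A minimal sketch, assuming that decoder code 0 is the "no strobe" case and that the seven strobe outputs are simply numbered 1 to 7 (the mapping to particular registers is not given here):

```python
def decode_strobe(code):
    """3:8 decoder: code 0 leaves all S lines low, codes 1..7 raise one line."""
    if not 0 <= code <= 7:
        raise ValueError("strobe code must fit in 3 bits")
    return [1 if code == line else 0 for line in range(1, 8)]


def mar_enable(pce):
    """MARE derived from PCE with an inverter instead of a ROM output."""
    return 0 if pce else 1


print(decode_strobe(0))
print(decode_strobe(3))
print(mar_enable(1), mar_enable(0))
```

Seven one-bit strobe outputs shrink to a 3-bit field, which is the saving of 4 ROM outputs (18 down to 14).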


Improving the CONTROL UNIT of SHC01

4) Look for redundancy in the control signals - NANOMEMORY

Although the CU ROM could output many different patterns, if we analyse the complete set of microcode we might discover, for example, that we only need 100 different emissions. Hence we use a “LOOKUP TABLE” to generate these. The CU ROM outputs a number between 0 and 99 and the NANOMEMORY emits the required wide microinstruction.

CU ROM is 2^13 X 7 = 57,344 bits and NANOMEMORY is 2^7 X 14 = 1,792 bits, giving a total of (...59,136 bits...)

[Diagram labels: INSTRUCTION REGISTER (8), 4 BIT LATCH (4), STATUS BIT FROM ALU, clk, CONTROL UNIT ROM CONTAINING lookup number of MICROCODE, 12, 7 lines to the NANOMEMORY (5 inputs and 24 outputs)]
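The lookup-table idea amounts to de-duplicating the wide control words. The sketch below assumes the microcode is available as a plain list of integers, one per ROM address; the function name and the tiny 3-bit example words are illustrative only.

```python
def build_nanomemory(microcode):
    """Split wide microcode into a narrow index ROM plus a table of unique words."""
    nano = []    # unique wide control words, in order of first appearance
    index = {}   # control word -> its position in nano
    cu_rom = []  # one small lookup number per original ROM address
    for word in microcode:
        if word not in index:
            index[word] = len(nano)
            nano.append(word)
        cu_rom.append(index[word])
    return cu_rom, nano


microcode = [0b101, 0b011, 0b101, 0b000, 0b011]
cu_rom, nano = build_nanomemory(microcode)
print(cu_rom, nano)

# Reading back through both memories reproduces the original microcode.
print([nano[i] for i in cu_rom] == microcode)
```

The CU ROM then only has to be wide enough to number the unique words: 7 bits covers up to 128 of them, enough for the 100 in the example above.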


Improving the CONTROL UNIT of SHC01

5) Only provide the opcodes actually wanted - probably fewer than 254

Although the CU ROM could provide many different opcodes, such a simple architecture may only need 50 or so opcodes. We can keep IR7 and IR6 low all the time, hence only apply 6 bits to the ROM from the IR.

CU ROM is 2^11 X 7 = 14,336 bits and NANOMEMORY is 2^7 X 14 = 1,792 bits, giving a total of (...16,128 bits...)

[Diagram labels: INSTRUCTION REGISTER (6), 4 BIT LATCH (4), STATUS BIT FROM ALU, clk, CONTROL UNIT ROM CONTAINING lookup number of MICROCODE, 12, 7 lines to the NANOMEMORY (5 inputs and 24 outputs)]
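The ROM sizes quoted across the control unit improvements can be checked by evaluating 2^(address lines) x (data lines) for each step. The step labels below are informal descriptions; the address and data widths are the ones stated on the slides.

```python
def rom_bits(address_lines, data_lines):
    """Size in bits of a ROM with the given address and data widths."""
    return (2 ** address_lines) * data_lines


# Each step lists the (address lines, data lines) of every memory it uses.
steps = [
    ("single large ROM",               [(13, 22)]),
    ("microprogram counter",           [(13, 20)]),
    ("no RESET microorder",            [(13, 19)]),
    ("MARE from inverted PCE",         [(13, 18)]),
    ("3:8 decoder for S lines",        [(13, 14)]),
    ("CU ROM + nanomemory",            [(13, 7), (7, 14)]),
    ("6 IR bits, CU ROM + nanomemory", [(11, 7), (7, 14)]),
]

totals = {name: sum(rom_bits(a, d) for a, d in roms) for name, roms in steps}
for name, bits in totals.items():
    print(f"{name:32s} {bits:>8,d} bits")
```

The largest single saving comes from the nanomemory step, because it narrows the word stored at every one of the 2^13 addresses.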


Improving the CONTROL UNIT of SHC01

6) Use fields in the IR to drive control signals directly

Although more common in bigger machines (e.g. 16 bits) we can divide the IR into fields and “wire” them directly to parts of the CPU, bypassing the CU and saving space there.

If a field in the IR is used as a “MODE” field it can drive multiplexors and switches to route the other IR fields to different parts of the CPU.

This is used in, for example, the PDP11 to allow fields to be used to drive the ALU or the ADDRESS calculation sections.

At this point the architecture (and microcode) become complicated - and beyond the course!

[Diagram labels: INSTRUCTION REGISTER, CONTROL UNIT ROM, 2, IR fields wired directly e.g. to ALU fn or REG bank addresses]
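A sketch of the idea with a purely hypothetical 16-bit instruction layout: 2 mode bits, 4 ALU function bits, one spare bit and three 3-bit register addresses. This is not the PDP11 format or any real machine's; it only shows how a MODE field can steer the remaining bits.

```python
def split_ir(ir):
    """Split a 16-bit instruction word into (hypothetical) fields."""
    return {
        "mode":   (ir >> 14) & 0b11,    # bits 15..14
        "alu_fn": (ir >> 10) & 0b1111,  # bits 13..10 (bit 9 is unused)
        "dest":   (ir >> 6) & 0b111,    # bits 8..6
        "reg_a":  (ir >> 3) & 0b111,    # bits 5..3
        "reg_b":  ir & 0b111,           # bits 2..0
    }


def route(ir):
    """Use the MODE field to decide where the other IR bits are wired."""
    f = split_ir(ir)
    if f["mode"] == 0:
        # Register-to-register: fields go straight to the ALU and register bank.
        return {
            "ALU_FN": f["alu_fn"],
            "REG_WRITE_ADDRESS": f["dest"],
            "REGA_READ_ADDRESS": f["reg_a"],
            "REGB_READ_ADDRESS": f["reg_b"],
        }
    # Any other mode (an assumption): low 10 bits feed the address calculation.
    return {"ADDRESS_OFFSET": ir & 0x3FF}


instruction = (0b0101 << 10) | (3 << 6) | (1 << 3) | 2
print(route(instruction))
print(route((1 << 14) | 0x155))
```

None of these bits pass through the control unit ROM, so they cost it no address or data lines.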


Summary

Be able to sketch a typical CPU

Be able to sketch a typical CONTROL UNIT

Be able to work out FETCH-EXECUTE tables for simple (explained) instructions

Be able to write out a MICROCODE table, including whatever steps are required at powerup to get the machine going

Be able to suggest architectural improvements to the CPU

Be able to sketch CONTROL UNIT improvements and calculate the resulting savings in ROM sizes.