ICC Module 3 Lesson 1 – Computer Architecture 1 / 12 © 2015 Ph. Janson Information, Computing & Communication Computer Architecture Clip 6 – Logic parallelism School of Computer Science & Communications P. Ienne (charts), Ph. Janson (commentary)

Upload: marvin-lang

Post on 18-Jan-2016


This video clip is part of the E.P.F.L. introductory course on Information, Computing, and Communication. It is the sixth in a set of video clips on computer architecture.

Slide: Outline
Clip 0 Introduction
Clip 1 Software technology: Assembler language, Algorithms, Registers, Data instructions, Instruction numbering, Control instructions
Clip 2 Hardware architecture: Von Neumann's stored program computer architecture, Data storage and processing, Control storage and processing
Clip 3 Hardware design: Instruction encoding, Hardware implementation, Transistor technology
Clip 4 Computing circuits
Clip 5 Memory circuits
Hardware performance
Clip 6 Logic parallelism
Clip 7 Architecture parallelism

Earlier video clips have shown how computing devices can be built out of transistors and programmed in languages that can be compiled into binary codes executable on such computers. The remaining two clips in this series on computer architecture will explain how computers can be made faster by resorting to different forms of parallelism. The present clip focuses on logic parallelism.

Slide: What about performance? Step 5

(Chart annotations: processor performance increase: 52% / year; ~20% / year comes from technology (= transistor speed); the remainder is labeled "Architecture!". Source: Hennessy & Patterson, MK 2011.)
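To make the two growth rates on this slide concrete, here is a minimal Python sketch. It assumes that the technology and architecture contributions multiply each year, and it uses a 10-year horizon that is purely hypothetical; neither assumption comes from the slide, and the function names are illustrative.

```python
# Annual growth rates quoted on the slide.
TOTAL_GROWTH = 0.52       # overall processor performance growth per year
TECHNOLOGY_GROWTH = 0.20  # part attributed to transistor speed

def architecture_factor(total: float, technology: float) -> float:
    """Yearly speed-up factor left once technology is factored out.

    Assumption: the contributions multiply, i.e.
    (1 + total) = (1 + technology) * architecture_factor.
    """
    return (1 + total) / (1 + technology)

def compound(rate: float, years: int) -> float:
    """Cumulative speed-up after `years` of growth at `rate` per year."""
    return (1 + rate) ** years

YEARS = 10  # hypothetical horizon, chosen only for illustration
print(f"architecture factor per year: {architecture_factor(TOTAL_GROWTH, TECHNOLOGY_GROWTH):.3f}")
print(f"overall speed-up after {YEARS} years: x{compound(TOTAL_GROWTH, YEARS):.1f}")
print(f"technology alone after {YEARS} years: x{compound(TECHNOLOGY_GROWTH, YEARS):.1f}")
```

Under these assumptions the gap between the two curves widens every year, which is the point the chart makes about the role of architecture.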

Gaining computer performance is of course always desirable, and architecture plays a key role in it. Indeed, if one looks back at the evolution of processor performance, one can see on this slide that it has grown by a little over 50% per year. Yet only about 20% per year is due to transistors having become smaller and faster. The difference between this 20% technology progress and the over 50% real progress is entirely due to smart hardware design.

Slide: How can one increase performance beyond transistor speed?
- Reduce delay (= time spent waiting to get a result)
- Increase throughput (= number of results per time unit)
Two simple examples of performance increase:
- At the circuit level: reducing the delay of an adder (=> this clip)
- At the processor structure level: increasing the throughput of instructions

The performance of anything can always be increased in one of two ways: either by reducing the time, or delay, it takes to do something, or by increasing the throughput, that is, the number of things done per unit of time. This clip focuses on an example of performance gain through reduction of the time to perform operations thanks to logic parallelism. The next and last clip will discuss an example of performance gain through increased throughput thanks to architectural parallelism.

Slide: Addition is easy
carry     1 1 1 0 0 0 1 1 1 1 0 0 0 1 1 0 1 0
A         0 1 1 1 0 1 0 1 0 1 1 0 0 0 1 1 0 1 0
+ B       1 0 1 1 1 0 0 0 1 0 1 1 1 0 0 1 0 1 1
=       1 0 0 1 0 1 1 1 0 0 0 0 1 1 1 0 0 1 0 1
Bit additions:
0 + 0 = 0
0 + 1 = 1
1 + 0 = 1
1 + 1 = 10 (base 2) = 1 × 2^1 + 0 × 2^0 = 2 (base 10), where the leading 1 is the carry

Consider how one could reduce the time to perform an addition. In doing so, recall all possible additions of two bits, observing that if both bits are 1, the result includes a carry bit into the next higher-order addition. A basic adder starts by adding the two lowest-order bits, then proceeds to add the two next-lowest-order bits. It cannot proceed faster because at every stage it must compute the carry bit before it can compute the next stage. Only then can it move on, and so on up to the highest-order bit.
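The four two-bit additions recalled above can be reproduced with a minimal Python sketch. The function name `half_adder` and the use of XOR for the sum bit and AND for the carry bit are illustrative choices, not taken from the slides.

```python
from typing import Tuple

def half_adder(a: int, b: int) -> Tuple[int, int]:
    """Add two single bits and return (carry, sum)."""
    carry = a & b  # the carry is 1 only when both bits are 1
    total = a ^ b  # the sum bit is 1 when exactly one bit is 1
    return carry, total

# Print the four possible two-bit additions as two-digit binary results.
for a in (0, 1):
    for b in (0, 1):
        carry, total = half_adder(a, b)
        print(f"{a} + {b} = {carry}{total}")
```

Only the case 1 + 1 produces a carry, which is what forces each stage of the adder to wait for the stage below it.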

Slide: Building an adder circuit is also (relatively) easy
A         1 0 0 0 1 1 0 1 0
+ B       1 1 1 0 0 1 0 1 1
=         0 1 1 1 0 0 1 0 1
(low-order 9 bits of the previous example; the carry out of the leftmost column is not shown)
Bit addition:
0 + 0 = 0
0 + 1 = 1
1 + 0 = 1
1 + 1 = 10
One needs to factor in the carry:
0 + 0 + 0 = 00
0 + 0 + 1 = 01
0 + 1 + 0 = 01
0 + 1 + 1 = 10
1 + 0 + 0 = 01
1 + 0 + 1 = 10
1 + 1 + 0 = 10
1 + 1 + 1 = 11

Observing the foregoing, building a full adder requires the following. First, computing every result bit requires a bit-adding circuit to add the two bits of the same order from the two operands. Because of the carry, however, the bit-adding circuits for all higher-order bits need to be able to add three bits rather than just two. So we need the same circuit at every stage, up to the highest-order bit, where each of the 3-bit adding circuits is one of those seen in the 4th clip of this series on computer architecture.

Slide: But such an adder is slow!
A         1 0 0 0 1 1 0 1 0
+ B       1 1 1 0 0 1 0 1 1
=         0 1 1 1 0 0 1 0 1
- Propagation of the carry is a fundamental aspect of additions!
- The delay of an adder is thus proportional to the number of bits to be added.
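The chain of 3-bit adding circuits described above can be sketched in Python as follows. This is an illustrative software model, not the circuit from the 4th clip: the names `full_adder` and `ripple_carry_add` and the list-of-bits representation (most significant bit first) are assumptions made for the example.

```python
from typing import List, Tuple

def full_adder(a: int, b: int, carry_in: int) -> Tuple[int, int]:
    """Add three single bits and return (carry_out, sum)."""
    total = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return carry_out, total

def ripple_carry_add(a_bits: List[int], b_bits: List[int],
                     carry_in: int = 0) -> Tuple[List[int], int]:
    """Add two equal-length bit lists, most significant bit first.

    Returns (sum_bits, carry_out). Stages are visited from the lowest-order
    bit upward because each stage needs the carry of the stage below it.
    """
    if len(a_bits) != len(b_bits):
        raise ValueError("operands must have the same number of bits")
    carry = carry_in
    sum_bits: List[int] = []
    for a, b in zip(reversed(a_bits), reversed(b_bits)):
        carry, total = full_adder(a, b, carry)
        sum_bits.append(total)
    sum_bits.reverse()
    return sum_bits, carry

# The 9-bit operands shown on the slide.
A = [1, 0, 0, 0, 1, 1, 0, 1, 0]
B = [1, 1, 1, 0, 0, 1, 0, 1, 1]
result, carry_out = ripple_carry_add(A, B)
print("sum bits :", result)
print("carry out:", carry_out)
```

The loop body runs once per bit and cannot be reordered, which mirrors the slide's point that the delay grows with the number of bits.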

Such a basic adding circuit thus suffers from a fundamental limitation. The propagation of the carry bit is an intrinsic requirement that prevents computing the addition of all the bits in parallel. The total time required by such an adder is thus proportional to the number of bits to be added.

Slide: Can one do better?
(Diagram labels: 64-bit adder, bits 0, bits 63, T)

Can one do better? If we imagine that we need to add two 64-bit numbers, the total time to do this is dictated by the need to compute 63 carry bits one after the other.

Slide: Can one do better?
(Diagram labels: 32-bit adder, bits 0, 32-bit adder, bits 63, carry from bits 31)

Can we hope to gain anything by splitting the 64-bit numbers into two 32-bit numbers and transferring one carry bit between the two operations?

Slide: Can one do better?
(Diagram labels: 32-bit adder, bits 0, 32-bit adder, bits 63, carry from bits 31, T/2, T/2)
One has gained nothing.
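The "gained nothing" conclusion on this slide can be checked with a small timing model in Python. It assumes, for illustration only, that every one-bit stage takes the same delay and that a block can start only once the carry from the block below it is available; the function names are hypothetical.

```python
from typing import Sequence

def ripple_delay(n_bits: int, stage_delay: float = 1.0) -> float:
    """Worst-case delay of an n-bit ripple-carry adder: one stage per bit."""
    return n_bits * stage_delay

def chained_delay(block_widths: Sequence[int], stage_delay: float = 1.0) -> float:
    """Delay of several ripple-carry blocks chained through their carries.

    Each block waits for the carry of the previous block, so the block
    delays simply add up.
    """
    ready = 0.0
    for width in block_widths:
        ready += ripple_delay(width, stage_delay)
    return ready

T = ripple_delay(64)
print("one 64-bit adder        :", T)
print("two chained 32-bit adders:", chained_delay([32, 32]))
```

In this model both configurations take the same time T, because the second 32-bit block spends the first T/2 idle, waiting for its carry.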

Computing the sum of the two lower-order 32-bit numbers will certainly take only half the time it would take to add the two 64-bit numbers. However, one cannot start adding the two higher-order 32-bit numbers until one has the carry bit from the first addition. Thus one has gained nothing.

Slide: Can one do better?
(Diagram labels: 32-bit adder, bits 0, bits 63, T/2, 1, 0, 32-bit adder, T/2)
That only takes half the time!

Now imagine for a second that we invest twice as many transistors in the higher-order half of the design, building two separate adders for the two higher-order 32-bit numbers: one that assumes a carry bit of 0 and one that assumes a carry bit of 1. Now we can start adding the lower-order and higher-order 32-bit numbers in parallel, thus taking only half the time. Indeed, by the time the real carry bit becomes available from the lower-order addition, we have the sum of the higher-order addition for either case: carry 0 or carry 1. All that remains is to use that carry bit to select which of the two higher-order sums is the right one. So we have essentially cut the total computation time in half at the cost of investing roughly 1.5 times as many transistors (three 32-bit adders instead of two).

Slide: Performance engineering (1)
- One can thus profoundly change the performance of a circuit without changing its functionality.
- One can invest more transistors and energy to obtain faster circuits.
- Or one can use slower circuits to spare energy.
- This is an example of logic synthesis, which is one of the branches of Computer Engineering.
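The duplicated-adder scheme from the commentary above is commonly called a carry-select adder. Below is a minimal Python sketch of its logic. The helper `add32` merely stands in for a 32-bit hardware adder, and all names are illustrative; Python runs the three additions one after the other, so the parallelism exists only in the hardware being modelled.

```python
from typing import Tuple

MASK32 = (1 << 32) - 1

def add32(a: int, b: int, carry_in: int) -> Tuple[int, int]:
    """Stand-in for a 32-bit adder: returns (32-bit sum, carry out)."""
    total = a + b + carry_in
    return total & MASK32, total >> 32

def carry_select_add64(a: int, b: int) -> Tuple[int, int]:
    """Add two 64-bit values with three 32-bit adders; return (sum, carry out)."""
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32

    # In hardware these three additions would all start at the same time.
    lo_sum, lo_carry = add32(a_lo, b_lo, 0)
    hi_if_carry_0 = add32(a_hi, b_hi, 0)  # assumes the low half yields carry 0
    hi_if_carry_1 = add32(a_hi, b_hi, 1)  # assumes the low half yields carry 1

    # Once the real carry is known, it only selects between two ready results.
    hi_sum, carry_out = hi_if_carry_1 if lo_carry else hi_if_carry_0
    return (hi_sum << 32) | lo_sum, carry_out

print(carry_select_add64(0xFFFFFFFF, 1))
```

The sketch uses three 32-bit adders where the chained design uses two, which matches the factor of roughly 1.5 quoted in the commentary; the cost of the final selection step is ignored here.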

One can thus profoundly change the performance of a circuit without changing its functionality. This can be achieved by investing in more circuits, thus raising power consumption. One could equally well use slower circuits to save power. This is the field of logic synthesis, one of the branches of computer engineering.