ch13_2_

Upload: vighnesh-sarwankar

Post on 07-Jan-2016


DESCRIPTION

COA

TRANSCRIPT

  • 7/17/2019 Ch13_2_

    1/41

    Multiprocessors

    Computer Organization Computer Architectures Lab

    MULTIPROCESSORS

    Characteristics of Multiprocessors

    Interconnection Structures

    Interprocessor Arbitration

    Interprocessor Communication and Synchronization

    Cache Coherence


    TERMINOLOGY

    Parallel Computing
    Simultaneous use of multiple processors, all components of a single architecture, to solve a task. Typically the processors are identical and serve a single user (even if the machine is multiuser).

    Distributed Computing
    Use of a network of processors, each capable of being viewed as a computer in its own right, to solve a problem. Processors may be heterogeneous and multiuser; usually an individual task is assigned to a single processor.

    Concurrent Computing
    All of the above?


    TERMINOLOGY

    Supercomputing
    Use of the fastest, biggest machines to solve big, computationally intensive problems. Historically these machines were vector computers, but parallel/vector or parallel machines are becoming the norm.

    Pipelining
    Breaking a task into steps performed by different units, with multiple inputs streaming through the units. The next input starts in a unit when the previous input is done with that unit, but not necessarily done with the task.

    Vector Computing
    Use of vector processors, where an operation such as multiply is broken into several steps and applied to a stream of operands ("vectors"). Most common special case of pipelining.

    Systolic
    Similar to pipelining, but units are not necessarily arranged linearly; steps are typically small and more numerous, and are performed in lockstep fashion. Often used in special-purpose hardware such as image or signal processors.


    SPEEDUP AND EFFICIENCY

    A: Given problem

    T


    AMDAHL'S LAW

    Given a program
    f: Fraction of time that represents operations that must be performed serially

    Maximum Possible Speedup: S
    S <= 1 / (f + (1 - f) / p), with p processors
    S < 1 / f, with unlimited number of processors

    - Ignores possibility of new algorithm, with much smaller f
    - Ignores possibility that more of program is run from higher speed memory such as Registers, Cache, Main Memory
    - Often problem is scaled with number of processors, and f is a function of size which may be decreasing (Serial code may take constant amount of time, independent of size)
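
    The bound above can be evaluated directly. The following is a minimal sketch, assuming the usual reading of the formula; the function names amdahl_speedup and amdahl_limit are illustrative and do not come from the slides.

```python
import math


def amdahl_speedup(f: float, p: int) -> float:
    """Speedup bound 1 / (f + (1 - f) / p) for serial fraction f on p processors."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("serial fraction f must lie in [0, 1]")
    if p < 1:
        raise ValueError("processor count p must be at least 1")
    return 1.0 / (f + (1.0 - f) / p)


def amdahl_limit(f: float) -> float:
    """Limit of the bound as p grows without limit: 1 / f (infinite when f is 0)."""
    return math.inf if f == 0 else 1.0 / f


if __name__ == "__main__":
    # With 10% serial work the bound approaches, but never reaches, 10.
    for p in (1, 10, 100, 1000):
        print(f"p = {p:4d}  S = {amdahl_speedup(0.1, p):.2f}")
    print("limit for f = 0.1:", amdahl_limit(0.1))
```

    Adding processors helps less and less once (1 - f) / p becomes small compared with f, which is why the serial fraction dominates for large p.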


    FLYNN'S HARDWARE TAXONOMY

    I: Instruction Stream, D: Data Stream
    Classification: [S|M] I [S|M] D, where S is Single and M is Multiple

    SI: Single Instruction Stream
    - All processors are executing the same instruction in the same cycle
    - Instruction may be conditional
    - For multiple processors, the control processor issues an instruction

    MI: Multiple Instruction Stream
    - Different processors may be simultaneously executing different instructions

    SD: Single Data Stream
    - All of the processors are operating on the same data items at any given time

    MD: Multiple Data Stream
    - Different processors may be simultaneously operating on different data items

    SISD: standard serial computer
    MISD: very rare
    MIMD and SIMD: parallel processing computers


    COUPLING OF PROCESSORS

    Tightly Coupled System
    - Tasks and/or processors communicate in a highly synchronized fashion
    - Communicates through a common shared memory
    - Shared memory system

    Loosely Coupled System
    - Tasks or processors do not communicate in a synchronized fashion
    - Communicates by message passing packets
    - Overhead for data exchange is high
    - Distributed memory system


    GRANULARITY OF PARALLELISM

    Coarse-grain
    - A task is broken into a handful of pieces, each of which is executed by a powerful processor
    - Processors may be heterogeneous
    - Computation/communication ratio is very high

    Medium-grain
    - Tens to few thousands of pieces
    - Processors typically run the same code
    - Computation/communication ratio is often hundreds or more

    Fine-grain
    - Thousands to perhaps millions of small pieces, executed by very small, simple processors or through pipelines
    - Processors typically have instructions broadcast to them
    - Compute/communicate ratio often near unity


    MEMORY

    [Figure: SHARED MEMORY (processors and memory on opposite sides of a network) and DISTRIBUTED MEMORY (processor/memory pairs attached to a network)]

    Shared (Global) Memory
    - A Global Memory Space accessible by all processors
    - Processors may also have some local memory

    Distributed (Local, Message-Passing) Memory
    - All memory units are associated with processors
    - To retrieve information from another processor's memory a message must be sent there

    Uniform Memory
    - All processors take the same time to reach all memory locations

    Nonuniform (NUMA) Memory
    - Memory access is not uniform


    SHARED MEMORY MULTIPROCESSORS

    [Figure: processors (P) and memory modules (M) connected through an interconnection network: buses, multistage IN, or crossbar switch]

    Characteristics
    All processors have equally direct access to one large memory address space

    Example systems
    - Bus and cache-based systems: Sequent Balance, Encore Multimax
    - Multistage IN-based systems: Ultracomputer, Butterfly, RP3, HEP
    - Crossbar switch-based systems: C.mmp, Alliant FX/8

    Limitations
    - Memory access latency
    - Hot spot problem


    MESSAGE-PASSING MULTIPROCESSORS

    [Figure: processor/memory pairs (P, M) joined by a message-passing network of point-to-point connections]

    Characteristics
    - Interconnected computers
    - Each processor has its own memory, and processors communicate via message-passing

    Example systems
    - Tree structure: Teradata, DADO
    - Mesh-connected: Rediflow, Series 2010, J-Machine
    - Hypercube: Cosmic Cube, iPSC, NCUBE, FPS T Series, Mark III

    Limitations
    - Communication overhead
    - Hard to program

    INTERCONNECTION STRUCTURES

    * Time-Shared Common Bus
    * Multiport Memory
    * Crossbar Switch
    * Multistage Switching Network
    * Hypercube System

    Bus
    All processors (and memory) are connected to a common bus or busses
    - Memory access is fairly uniform, but not very scalable


    BUS

    - A collection of signal lines that carry module-to-module communication
    - Data highways connecting several digital system elements

    Operations of Bus
    [Figure: master (M) and slave (S) devices, among them M3, S7, S5, M4, and S2, attached to a common bus]
    M3 wishes to communicate with S5
    [1] M3 sends signals (address) on the bus that cause S5 to respond
    [2] M3 sends data to S5, or S5 sends data to M3 (determined by the command line)

    Master Device: Device that initiates and controls the communication
    Slave Device: Responding device

    Multiple-master buses
    - Bus conflict
    - Need bus arbitration


    SYSTEM BUS STRUCTURE FOR MULTIPROCESSORS

    [Figure: three processor groups, each with a CPU, local memory, and a system bus controller on its own local bus (two of the groups also include an IOP), all connected through the system bus to the common shared memory]


    MULTIPORT MEMORY

    [Figure: CPU 1 through CPU 4, each connected to a port on every memory module MM 1 through MM 4]

    Multiport Memory Module
    - Each port serves a CPU

    Memory Module Control Logic
    - Each memory module has control logic
    - Resolve memory module conflicts: fixed priority among CPUs

    Advantages
    - Multiple paths -> high transfer rate

    Disadvantages
    - Memory control logic
    - Large number of cables and connections


    CROSSBAR SWITCH

    [Figure: CPU1 through CPU4 connected to memory modules MM1 through MM4 through a grid of crosspoint switches]

    [Figure: Block Diagram of Crossbar Switch. Data, address, and control lines from CPU 1 through CPU 4 enter multiplexers and arbitration logic, which drive the data, address, R/W, and memory enable lines of a memory module]


    MULTISTAGE SWITCHING NETWORK

    Interstage Switch
    [Figure: a 2 x 2 switch with inputs A and B and outputs 0 and 1, shown in its four settings]
    - A connected to 0
    - A connected to 1
    - B connected to 0
    - B connected to 1


    MULTISTAGE INTERCONNECTION NETWORK

    [Figure: Binary Tree with 2 x 2 Switches. Processors P1 and P2 reach destinations 000 through 111 through three levels of switches, each with outputs 0 and 1]

    [Figure: 8 x 8 Omega Switching Network. Inputs 0 through 7 are connected to outputs 000 through 111]
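
    The slide only names the omega network. A minimal sketch of how a message could be steered through it follows, assuming the standard construction that the slide does not spell out: each stage is preceded by a perfect shuffle (a left rotation of the line number), and each 2 x 2 switch picks output 0 or 1 from one bit of the destination address, most significant bit first. The function name omega_route is illustrative.

```python
from typing import List, Tuple


def omega_route(src: int, dst: int, n_bits: int = 3) -> Tuple[List[Tuple[int, int, int]], int]:
    """Route from input src to output dst in a 2**n_bits omega network.

    Returns the list of (stage, switch index, output port) hops and the
    line number reached after the last stage.
    """
    size = 1 << n_bits
    if not (0 <= src < size and 0 <= dst < size):
        raise ValueError("src and dst must be valid line numbers")
    pos = src
    path = []
    for stage in range(n_bits):
        # Perfect shuffle: rotate the n-bit line number left by one position.
        pos = ((pos << 1) | (pos >> (n_bits - 1))) & (size - 1)
        # The switch output is chosen by the next destination bit (MSB first).
        port = (dst >> (n_bits - 1 - stage)) & 1
        pos = (pos & ~1) | port
        path.append((stage, pos >> 1, port))
    return path, pos


if __name__ == "__main__":
    hops, reached = omega_route(2, 6)
    print(hops, "reached output", format(reached, "03b"))
```

    Because every stage shifts one destination bit into the line number, the message arrives at dst after n_bits stages regardless of where it started. Two messages can still need the same switch output at the same time, which is the blocking behaviour of this kind of network.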


    HYPERCUBE INTERCONNECTION

    - p = 2^n
    - Processors are conceptually on the corners of an n-dimensional hypercube, and each is directly connected to the n neighboring nodes
    - Degree = n

    [Figure: n-dimensional hypercube (binary n-cube). One-cube with nodes 0 and 1; two-cube with nodes 00, 01, 10, 11; three-cube with nodes 000 through 111]
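
    In a binary n-cube, two nodes are neighbors exactly when their labels differ in one bit. The sketch below illustrates that property and a simple bit-by-bit routing rule; the rule (fix differing bits from the lowest upward) and the function names are assumptions made for illustration, not something the slides prescribe.

```python
from typing import List


def neighbors(node: int, n: int) -> List[int]:
    """The n neighbors of a node: flip each of its n address bits in turn."""
    return [node ^ (1 << k) for k in range(n)]


def hop_distance(src: int, dst: int) -> int:
    """Minimum hop count: the number of bit positions in which the labels differ."""
    return bin(src ^ dst).count("1")


def route(src: int, dst: int, n: int) -> List[int]:
    """One shortest path, correcting differing bits from lowest to highest."""
    path = [src]
    cur = src
    for k in range(n):
        if (cur ^ dst) & (1 << k):
            cur ^= 1 << k
            path.append(cur)
    return path


if __name__ == "__main__":
    print([format(v, "03b") for v in neighbors(0b000, 3)])
    print([format(v, "03b") for v in route(0b000, 0b111, 3)])
```

    The degree is n because each node has n bits to flip, and no route needs more than n hops.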


    INTERPROCESSOR ARBITRATION

    Bus
    - Board level bus
    - Backplane level bus
    - Interface level bus

    System Bus: a backplane level bus
    - Printed Circuit Board
    - Connects CPU, IOP, and Memory
    - Each of CPU, IOP, and Memory board can be plugged into a slot in the backplane (system bus)
    - Bus signals are grouped into 3 groups: Data, Address, and Control (plus power)
    - Only one of CPU, IOP, and Memory can be granted use of the bus at a time
    - Arbitration mechanism is needed to handle multiple requests

    e.g. IEEE standard 796 bus: 86 lines
    - Data: 16 (multiple of 8)
    - Address: 24
    - Control: 26
    - Power: 20


    SYNCHRONOUS & ASYNCHRONOUS DATA TRANSFER

    Synchronous Bus
    Each data item is transferred over a time slice known to both source and destination unit
    - Common clock source
    - Or separate clocks, with a synchronization signal transmitted periodically to synchronize the clocks in the system

    Asynchronous Bus
    * Each data item is transferred by a handshake mechanism
    - The unit that transmits the data transmits a control signal that indicates the presence of data
    - The unit receiving the data responds with another control signal to acknowledge the receipt of the data
    * Strobe pulse: supplied by one of the units to indicate to the other unit when the data transfer has to occur


    BUS SIGNALS

    Bus signal allocation
    - address
    - data
    - control
    - arbitration
    - interrupt
    - timing
    - power, ground

    IEEE Standard 796 Multibus Signals

    Data and address
      Data lines (16 lines)        DATA0 - DATA15
      Address lines (24 lines)     ADRS0 - ADRS23

    Data transfer
      Memory read                  MRDC
      Memory write                 MWTC
      IO read                      IORC
      IO write                     IOWC
      Transfer acknowledge         TACK (XACK)

    Interrupt control
      Interrupt request            INT0 - INT7
      Interrupt acknowledge        INTA


    BUS SIGNALS

    IEEE Standard 796 Multibus Signals (Cont'd)

    Miscellaneous control
      Master clock                 CCLK
      System initialization        INIT
      Byte high enable             BHEN
      Memory inhibit (2 lines)     INH1 - INH2
      Bus lock                     LOCK

    Bus arbitration
      Bus request                  BREQ
      Common bus request           CBRQ
      Bus busy                     BUSY
      Bus clock                    BCLK
      Bus priority in              BPRN
      Bus priority out             BPRO

    Power and ground (20 lines)


    INTERPROCESSOR ARBITRATION: STATIC ARBITRATION

    Serial Arbitration Procedure
    [Figure: bus arbiters 1 through 4 daisy-chained through their PI (priority in) and PO (priority out) lines. PI of arbiter 1, the highest priority, is tied to 1; PO of arbiter 4 goes to the next arbiter; all arbiters share the bus busy line]

    Parallel Arbitration Procedure
    [Figure: bus arbiters 1 through 4, each with a Req output feeding a 4 x 2 priority encoder and an Ack input driven by a 2 x 4 decoder; all arbiters share the bus busy line]
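
    The two static schemes can be mimicked in a few lines. This is a sketch under stated assumptions: in the serial chain an arbiter passes priority on (PO = 1) only when it has PI = 1 and is not requesting, and in the parallel scheme arbiter 1 is taken to have the highest encoder priority. Function names are illustrative.

```python
from typing import List, Optional


def daisy_chain_grant(requests: List[bool]) -> Optional[int]:
    """Serial arbitration: return the index of the arbiter that wins the bus.

    Index 0 is the arbiter whose PI is tied to 1 (highest priority).
    """
    pi = True
    for index, requesting in enumerate(requests):
        po = pi and not requesting
        if pi and not po:
            # PI = 1 and PO = 0: this arbiter keeps the priority for itself.
            return index
        pi = po
    return None


def parallel_grant(requests: List[bool]) -> List[bool]:
    """Parallel arbitration: priority encoder followed by a decoder.

    Returns the acknowledge lines, at most one of which is True.
    """
    code = None
    for index, requesting in enumerate(requests):  # priority encoder
        if requesting:
            code = index
            break
    if code is None:
        return [False] * len(requests)
    return [index == code for index in range(len(requests))]  # decoder


if __name__ == "__main__":
    print(daisy_chain_grant([False, True, True, False]))
    print(parallel_grant([False, False, True, True]))
```

    Both versions give the same winner; the difference in hardware is that the serial chain ripples through every arbiter, while the encoder and decoder resolve all requests at once.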


    INTERPROCESSOR ARBITRATION: DYNAMIC ARBITRATION

    Priorities of the units can be changed dynamically while the system is in operation

    Time Slice
    - Fixed length time slice is given sequentially to each processor, round-robin fashion

    Polling
    - Unit address polling: the bus controller advances the address to identify the requesting unit

    LRU

    FIFO

    Rotating Daisy Chain
    - Conventional Daisy Chain: highest priority to the unit nearest to the bus controller
    - Rotating Daisy Chain: highest priority to the unit that is nearest to the unit that has most recently accessed the bus (it becomes the bus controller)
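
    Two of the dynamic policies above are easy to state as code. The sketch below assumes units are numbered 0 to n - 1 around the chain and that "nearest" means the next unit in increasing order, wrapping around; both function names are illustrative.

```python
from typing import List, Optional


def rotating_daisy_chain(requests: List[bool], last_granted: int) -> Optional[int]:
    """Grant the bus to the requesting unit closest after the last bus user.

    The unit that used the bus most recently ends up with the lowest priority.
    """
    n = len(requests)
    for offset in range(1, n + 1):
        unit = (last_granted + offset) % n
        if requests[unit]:
            return unit
    return None


def time_slice_owner(cycle: int, n_units: int) -> int:
    """Time-slice policy: slice number `cycle` belongs to unit cycle mod n_units."""
    return cycle % n_units


if __name__ == "__main__":
    # With every unit requesting all the time, the grants rotate fairly.
    last = 0
    order = []
    for _ in range(6):
        last = rotating_daisy_chain([True] * 4, last)
        order.append(last)
    print(order)
```

    Unlike the fixed chain, no unit can be starved here, because holding the bus pushes a unit to the back of the priority order.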


    INTERPROCESSOR COMMUNICATION

    [Figure: Interprocessor communication through shared memory. A sending processor places a message in a communication area in shared memory, together with its receiver(s) and a mark; the receiving processors pick it up from there]

    [Figure: The same arrangement, with an instruction or interrupt path from the sending processor to the receiving processors]


    INTERPROCESSOR SYNCHRONIZATION

    Synchronization
    Communication of control information between processors
    - To enforce the correct sequence of processes
    - To ensure mutually exclusive access to shared writable data

    Hardware Implementation

    Mutual Exclusion with a Semaphore

    Mutual Exclusion
    - One processor excludes or locks out access to a shared resource by other processors when it is in a Critical Section
    - A Critical Section is a program sequence that, once begun, must complete execution before another processor accesses the same shared resource

    Semaphore
    - A binary variable
    - 1: A processor is executing a critical section, so the resource is not available to other processors
    - 0: Available to any requesting processor
    - Software controlled flag that is stored in memory that all processors can access


    SEMAPHORE

    Testing and Setting the Semaphore
    - Two or more processors must be prevented from testing or setting the same semaphore at the same time
    - Otherwise two or more processors may enter the same critical section at the same time
    - Must be implemented with an indivisible operation

    R <- M[SEM]      / Test semaphore /
    M[SEM] <- 1      / Set semaphore /

    These two steps are done while the bus is locked, so that other processors cannot test and set the semaphore while the current processor is executing these instructions

    If R = 1, another processor is executing the critical section, and the processor that executed this instruction does not access the shared memory

    If R = 0, the resource is available for access; the semaphore has been set to 1 and the processor proceeds

    The last instruction in the program must clear the semaphore
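
    The test-and-set protocol can be imitated in software. In this sketch a threading.Lock stands in for the hardware bus lock that makes the two steps indivisible (an assumption for illustration only); the class and function names are hypothetical, and real hardware would use a single atomic instruction instead.

```python
import threading
import time


class SharedMemory:
    """A memory holding one semaphore word, with an indivisible test-and-set."""

    def __init__(self) -> None:
        self._bus_lock = threading.Lock()  # stands in for the hardware bus lock
        self.sem = 0

    def test_and_set(self) -> int:
        with self._bus_lock:
            r = self.sem   # R <- M[SEM]   (test semaphore)
            self.sem = 1   # M[SEM] <- 1   (set semaphore)
        return r

    def clear(self) -> None:
        self.sem = 0       # last instruction of the critical section


def worker(mem: SharedMemory, counter: list, iterations: int) -> None:
    for _ in range(iterations):
        while mem.test_and_set() == 1:
            time.sleep(0)  # R = 1: another processor is inside, try again
        counter[0] += 1    # critical section: update the shared data
        mem.clear()


def run(n_threads: int = 4, iterations: int = 200) -> int:
    mem = SharedMemory()
    counter = [0]
    threads = [threading.Thread(target=worker, args=(mem, counter, iterations))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0]


if __name__ == "__main__":
    print(run())
```

    Only the thread that reads R = 0 enters the critical section, so every increment of the shared counter is counted.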


    CACHE COHERENCE

    [Figure: processors P1, P2, and P3, each with a cache, connected by a bus to main memory, shown in three cases]

    Case                                        Main memory   P1 cache   P2 cache   P3 cache
    Caches are Coherent                         X = 52        X = 52     X = 52     X = 52
    Cache Incoherency in Write Through Policy   X = 120       X = 120    X = 52     X = 52
    Cache Incoherency in Write Back Policy      X = 52        X = 120    X = 52     X = 52


    MAINTAINING CACHE COHERENCY

    Shared Cache
    - Disallow private cache
    - Access time delay

    Software Approaches
    * Read-Only Data are Cacheable
    - Private Cache is for Read-Only data
    - Shared Writable Data are not cacheable
    - Compiler tags data as cacheable and noncacheable
    - Degrades performance due to software overhead
    * Centralized Global Table
    - Status of each memory block is maintained in CGT: RO (Read-Only), RW (Read and Write)
    - All caches can have copies of RO blocks
    - Only one cache can have a copy of an RW block

    Hardware Approaches
    * Snoopy Cache Controller
    - Cache Controllers monitor all the bus requests from CPUs and IOPs
    - All caches attached to the bus monitor the write operations
    - When a word in a cache is written, memory is also updated (write through)
    - Local snoopy controllers in all other caches check their memory to determine if they have a copy of that word. If they have, that location is marked invalid (future reference to this location causes cache miss)
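
    The snoopy write-through scheme can be sketched as a small simulation. The classes below are hypothetical and model only what the bullets describe: a write updates memory and every other cache drops its copy. The values 52 and 120 follow the cache coherence example in this deck.

```python
from typing import Dict, List


class Bus:
    """Shared bus connecting main memory to all attached caches."""

    def __init__(self, memory: Dict[str, int]) -> None:
        self.memory = memory
        self.caches: List["SnoopyCache"] = []

    def write_through(self, writer: "SnoopyCache", addr: str, value: int) -> None:
        self.memory[addr] = value          # memory is updated on every write
        for cache in self.caches:
            if cache is not writer:
                cache.snoop_write(addr)    # other controllers see the write


class SnoopyCache:
    def __init__(self, bus: Bus) -> None:
        self.bus = bus
        self.lines: Dict[str, int] = {}
        self.misses = 0
        bus.caches.append(self)

    def read(self, addr: str) -> int:
        if addr not in self.lines:         # cache miss: fetch from memory
            self.misses += 1
            self.lines[addr] = self.bus.memory[addr]
        return self.lines[addr]

    def write(self, addr: str, value: int) -> None:
        self.lines[addr] = value
        self.bus.write_through(self, addr, value)

    def snoop_write(self, addr: str) -> None:
        self.lines.pop(addr, None)         # mark the local copy invalid


if __name__ == "__main__":
    bus = Bus({"X": 52})
    p1, p2, p3 = SnoopyCache(bus), SnoopyCache(bus), SnoopyCache(bus)
    print([c.read("X") for c in (p1, p2, p3)])
    p1.write("X", 120)
    print(bus.memory["X"], p2.read("X"), p3.read("X"))
```

    After P1 writes 120, the next read by P2 or P3 misses and fetches the new value from memory, so no cache keeps the stale 52.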


    PARALLEL COMPUTING

    Grosch's Law

    Grosch's Law states that the speed of computers is proportional to the square of their cost. Thus if you are looking for a fast computer, you are better off spending your money buying one large computer than two small computers and connecting them.

    Grosch's Law is true within classes of computers, but not true between classes. Computers may be priced according to Grosch's Law, but the Law cannot be true asymptotically.

    Minsky's Conjecture

    Minsky's conjecture states that the speedup achievable by a parallel computer increases as the logarithm of the number of processing elements, thus making large-scale parallelism unproductive.

    Many experimental results have shown linear speedup for over 100 processors.


    PARALLEL COMPUTING

    Amdahl's Law

    A small number of sequential operations can effectively limit the speedup of a parallel algorithm. Let f be the fraction of operations in a computation that must be performed sequentially, where 0 <= f <= 1. Then the maximum speedup S achievable by a parallel computer with p processors performing the computation is S <= 1 / [f + (1 - f) / p]. For example, if 10% of the computation must be performed sequentially, then the maximum speedup achievable is 10, no matter how many processors a parallel computer has.

    There exist some parallel algorithms with almost no sequential operations. As the problem size (n) increases, f becomes smaller (f -> 0 as n -> infinity). In this case, lim S = p.

    History

    History tells us that the speed of traditional single CPU computers has increased 10-fold every 5 years. Why should great effort be expended to devise a parallel computer that will perform tasks 10 times faster when, by the time the new architecture is developed and implemented, single CPU computers will be just as fast?

    Utilizing parallelism is better than waiting.


    PARALLEL COMPUTING

    Pipelined Computers are Sufficient

    Most supercomputers are vector computers, and most of the successes attributed to supercomputers have been accomplished on pipelined vector processors, especially the Cray-1 and Cyber-205.

    If only vector operations can be executed at high speed, supercomputers will not be able to tackle a large number of important problems. The latest supercomputers incorporate both pipelining and high level parallelism (e.g., Cray-2).

    Software Inertia

    Billions of dollars worth of FORTRAN software exists. Who will rewrite it? Virtually no programmers have any experience with a machine other than a single CPU computer. Who will retrain them?


    INTERCONNECTION NETWORKS

    Switching Network (Dynamic Network)
    Processors (and Memory) are connected to routing switches like in a telephone system
    - Switches might have queues (combining logic), which improve functionality but increase latency
    - Switch settings may be determined by message headers or preset by controller
    - Connections can be packet-switched or circuit-switched (remain connected as long as needed)
    - Usually NUMA, blocking, often scalable and upgradable

    Point-Point (Static Network)
    Processors are directly connected to only certain other processors and must go multiple hops to get to additional processors
    - Usually distributed memory
    - Hardware may handle only single hops, or multiple hops
    - Software may mask hardware limitations
    - Latency is related to graph diameter, among many other factors
    - Usually NUMA, nonblocking, scalable, upgradable
    - Ring, Mesh, Torus, Hypercube, Binary Tree


    INTERCONNECTION NETWORKS

    [Figure: example networks built from switches and processors, labeled Multistage Interconnect and Bus]


    INTERCONNECTION NETWORKS

    Static Topology: Direct Connection
    - Provides a direct inter-processor communication path
    - Usually for distributed-memory multiprocessor

    Dynamic Topology: Indirect Connection
    - Provides a physically separate switching network for inter-processor communication
    - Usually for shared-memory multiprocessor

    Direct Connection
    Interconnection Network: a graph G(V, E)
    V: a set of processors (nodes)
    E: a set of wires (edges)

    Performance Measures
    - degree, diameter, etc.


    INTERCONNECTION NETWORKS

    Complete connection
    - Every processor is directly connected to every other processor
    - Diameter = 1, Degree = p - 1
    - Number of wires = p (p - 1) / 2; dominant cost
    - Fan-in/fanout limitation makes it impractical for large p
    - Interesting as a theoretical model because algorithm bounds for this model are automatically lower bounds for all direct connection machines

    Ring
    - Degree = 2 (not a function of p)
    - Diameter = floor(p / 2)


    INTERCONNECTION NETWORKS

    2-Mesh
    [Figure: an m x m grid of processors, p = m^2]
    - Degree = 4
    - Diameter = 2 (m - 1)
    - In general, an n-dimensional mesh has diameter = n (p^(1/n) - 1)
    - Diameter can be halved by having wrap-around connections (-> Torus)
    - Ring is a 1-dimensional mesh with wrap-around connection


    INTERCONNECTION NETWORK

    Binary Tree
    - Degree = 3
    - Diameter = 2 log2((p + 1) / 2)
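
    The degree and diameter figures quoted for the static topologies can be checked by brute force on small instances. The sketch below builds each network as an adjacency list and measures it with breadth-first search; the builder functions and their names are illustrative, and the binary tree is assumed to be complete with p = 2^levels - 1 nodes.

```python
from collections import deque
from typing import Dict, Hashable, List

Graph = Dict[Hashable, List[Hashable]]


def eccentricity(adj: Graph, start: Hashable) -> int:
    """Largest hop count from start to any other node (breadth-first search)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


def diameter(adj: Graph) -> int:
    return max(eccentricity(adj, node) for node in adj)


def max_degree(adj: Graph) -> int:
    return max(len(set(nbrs)) for nbrs in adj.values())


def complete(p: int) -> Graph:
    return {i: [j for j in range(p) if j != i] for i in range(p)}


def ring(p: int) -> Graph:
    return {i: [(i - 1) % p, (i + 1) % p] for i in range(p)}


def mesh(m: int) -> Graph:
    """m x m mesh without wrap-around connections."""
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return {(r, c): [(r + dr, c + dc) for dr, dc in steps
                     if 0 <= r + dr < m and 0 <= c + dc < m]
            for r in range(m) for c in range(m)}


def binary_tree(levels: int) -> Graph:
    """Complete binary tree with nodes 1 .. 2**levels - 1 (node i has parent i // 2)."""
    p = 2 ** levels - 1
    adj: Graph = {i: [] for i in range(1, p + 1)}
    for child in range(2, p + 1):
        adj[child].append(child // 2)
        adj[child // 2].append(child)
    return adj


if __name__ == "__main__":
    for name, g in (("complete(6)", complete(6)), ("ring(8)", ring(8)),
                    ("mesh(4)", mesh(4)), ("binary_tree(3)", binary_tree(3))):
        print(name, "degree", max_degree(g), "diameter", diameter(g))
```

    For these small cases the measured values agree with the formulas: p - 1 and 1 for the complete connection, 2 and floor(p / 2) for the ring, 4 and 2 (m - 1) for the mesh, and 3 and 2 log2((p + 1) / 2) for the binary tree.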


    MIN SPACE

    Banyan network (unique path network)
    - Baseline [Wu80]
    - Flip [Batcher76]
    - Indirect binary n-cube [Peas77]
    - Omega [Lawrie75]
    - Regular SW banyan [Goke73]
    - Delta network [Patel81]

    PM2I network
    - Data Manipulator [Feng74]
    - Augmented DM [Siegel78]
    - Inverse ADM [Siegel79]
    - Gamma [Parker84]

    Multiple Path Network
    - Extra stage Cube [Adams82]
    - Replicated/Dilated Delta network [Kruskal83]
    - B-delta [Yoon88]

    Permutation/Sorting Network (N!)
    - Clos network [53]
    - Benes network [62]
    - Batcher sorting network [68]


    SOME CURRENT PARALLEL COMPUTERS

    DM-SIMD
    - AMT DAP
    - Goodyear MPP
    - Thinking Machines CM series
    - MasPar MP1
    - IBM GF11

    SM-MIMD
    - Alliant FX
    - BBN Butterfly
    - Encore Multimax
    - Sequent Balance/Symmetry
    - CRAY 2, X-MP, Y-MP
    - IBM RP3
    - U. Illinois CEDAR

    DM-MIMD
    - Intel iPSC series, Delta machine
    - NCUBE series
    - Meiko Computing Surface
    - Carnegie-Mellon/Intel iWarp