(C) 2000 Mark D. Hill University of Wisconsin-Madison
How Computer Architecture Trends May Affect Future Distributed Systems
Mark D. Hill
Computer Sciences Department
University of Wisconsin--Madison
http://www.cs.wisc.edu/~markhill
PODC ‘00 Invited Talk
(C) 2000 Mark D. Hill PODC00: Computer Architecture Trends
Three Questions
• What is a System Area Network (SAN) and how will it affect clusters?
  – E.g., InfiniBand
• How fat will multiprocessor servers be and how do we build larger ones?
  – E.g., Wisconsin Multifacet’s Multicast & Timestamp Snooping
• Future of multiprocessor servers & clusters?
  – A merging of both?
Outline
• Motivation
• System Area Networks
• Designing Multiprocessor Servers
• Server & Cluster Trends
Technology Push: Moore’s Law
• What do the following intervals have in common?
  – Prehistory to 2000
  – 2001 to 2002
• Answer: Equal progress in absolute processor speed (and more doubling 2003-4, 2005-6, etc.)
  – Consider salary doubling
• Corollary: Cost halves every two years
  – Jim Gray: In a decade you can buy a computer for less than its sales tax today
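The doubling argument can be checked with a little arithmetic. The sketch below assumes, as the slide does, that speed doubles and cost halves every two years. The starting speed of 1.0 and the 5% sales-tax rate are illustrative assumptions, not figures from the talk.

```python
# Assumption taken from the slide: performance doubles every two years.
DOUBLING_PERIOD_YEARS = 2


def relative_speed(years_from_now: float) -> float:
    """Speed relative to today under steady doubling."""
    return 2.0 ** (years_from_now / DOUBLING_PERIOD_YEARS)


def relative_cost(years_from_now: float) -> float:
    """Cost of a fixed amount of performance relative to today."""
    return 1.0 / relative_speed(years_from_now)


# Absolute progress accumulated up to today versus during the next period.
progress_so_far = relative_speed(0)
progress_next_period = relative_speed(2) - relative_speed(0)

# Hypothetical 5% sales tax, used only to illustrate the Jim Gray remark.
ASSUMED_SALES_TAX = 0.05
cost_in_a_decade = relative_cost(10)  # five halvings: 1/32 of today's price

print(progress_so_far, progress_next_period)
print(cost_in_a_decade, cost_in_a_decade < ASSUMED_SALES_TAX)
```

Under these assumptions the next two years add as much absolute speed as all prior history, and a decade of halving brings the price to about 3% of today's.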
Application Pull
• Should use computers in currently wasteful ways
  – Already computers in electric razors & greeting cards
• New business models
  – B2C, B2B, C2B, C2C
  – Mass customization
• More proactive (beyond interactive) [Tennenhouse]
  – Today: P2C where P==Person & C==Computer
  – More C2P: mattress adjusts to save your back
  – More C2C: Agents surf the web for optimal deal
  – More sensors (physical/logic worlds coupled)
  – More hidden computers (cf. electric motors)
• Furthermore, I am wrong
The Internet Iceberg
• Internet Components
  – Clients -- mobile, wireless
  – “On Ramp” -- LANs/DSL/Cable Modems
  – WAN Backbone -- IPv6, massive BW
  – and ...
• SERVICES
  – Scale Storage
  – Scale Bandwidth
  – Scale Computation
  – High Availability
Outline
• Motivation
• System Area Networks
  – What is a SAN?
  – InfiniBand
  – Virtualizing I/O with Queue Pairs
  – Predictions
• Designing Multiprocessor Servers
• Server & Cluster Trends
Regarding Storage/Bandwidth
• Currently resides on I/O Bus (PCI)
  – HW & SW protocol stacks
  – Must add hosts to add storage/bandwidth
[Figure: processors (proc) and memory on a memory interconnect; a bridge leads to an I/O bus with I/O slots 0 through n-1]
Want System Area Network (SAN)
• SAN vs. Local Area Network (LAN)
  – Higher bandwidth (10 Gbps)
  – Lower latency (few microseconds or less)
  – More limited size
  – Other (e.g., single administrative domain, short distance)
  – Examples: Tandem Servernet & Myricom Myrinet
• Emerging Standard: InfiniBand
  – www.infinibandTA.org w/ spec 1.0 Summer 2000
  – Compaq, Dell, HP, IBM, Intel, Microsoft, Sun, & others
  – 2.5 Gbits/s times 1, 4, or 12 wires
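As a quick check on how the per-wire rate relates to the 10 Gbps figure, the sketch below multiplies out the three link widths. The 8b/10b line-coding factor is an assumption added here to estimate usable data rate; the slide quotes only the raw signaling rate.

```python
# Per-wire signaling rate and link widths quoted on the slide.
LANE_RATE_GBPS = 2.5
LINK_WIDTHS = (1, 4, 12)

# Assumption not stated on the slide: 8b/10b line coding, so 8 of every
# 10 signaled bits carry data.
ASSUMED_CODING_EFFICIENCY = 8 / 10


def link_rates(widths=LINK_WIDTHS):
    """Return {width: (raw Gbit/s, data Gbit/s under the assumed coding)}."""
    return {
        w: (w * LANE_RATE_GBPS, w * LANE_RATE_GBPS * ASSUMED_CODING_EFFICIENCY)
        for w in widths
    }


for width, (raw, data) in link_rates().items():
    print(f"{width:>2}x link: {raw:5.1f} Gbit/s raw, {data:5.1f} Gbit/s data")
```

The four-wire configuration is the one that matches the 10 Gbps SAN bandwidth mentioned above.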
InfiniBand Model (from website)
[Figure: InfiniBand model. Labels: proc, memory, memory interconnect, HCA (host channel adapter), link, switch, TCA, target (disks), XCA, router, other networks; other switches, hosts, targets, etc.]
InfiniBand Advantages
• Storage/Network made orthogonal from Computation
• Reduce “hardware” stack -- no i/o bridge
• Reduce “software” stack; hardware support for
  – Connected Reliable
  – Connected Unreliable
  – Datagram
  – Reliable Datagram
  – Raw Datagram
• Can eliminate system call for SAN use (next slide)
Virtualizing InfiniBand
• I/O traditionally virtualized with system call
  – System enforces isolation
  – System permits authorized sharing
• Memory virtualized
  – System trap/call for setup
  – Virtual memory hardware for common-case translation
• InfiniBand exploits “queue pairs” (QPs) in memory
  – Cf. Intel Virtual Interface Architecture (VIA) [IEEE Micro, Mar/Apr ‘98]
  – Users issue sends, receives, & remote DMA reads/writes
Queue Pair
• QP setup system call
  – Connect with process
  – Connect with remote QP (not shown here)
• QP placed in “pinned” virtual memory
• User directly accesses QP
  – E.g., sends, receives & remote DMA reads/writes
[Figure: proc and HCA share a queue pair in main memory; one queue holds send1, send2, dma-R3, dma-W4 and the other holds receive1, receive2]
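The queue-pair idea can be sketched as a toy data structure: a one-time setup step stands in for the system call, after which the user posts work by writing to queues in memory. All class and function names below are hypothetical illustrations and do not correspond to the InfiniBand or VIA programming interfaces.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class WorkRequest:
    op: str                 # "send", "receive", "dma-read", or "dma-write"
    payload: bytes = b""


@dataclass
class QueuePair:
    """Toy model: two queues that would live in pinned user memory."""
    send_queue: deque = field(default_factory=deque)
    recv_queue: deque = field(default_factory=deque)

    # User-level operations: plain memory writes, no system call.
    def post_send(self, wr: WorkRequest) -> None:
        self.send_queue.append(wr)

    def post_receive(self, wr: WorkRequest) -> None:
        self.recv_queue.append(wr)


def setup_queue_pair() -> QueuePair:
    """Stands in for the one-time setup system call (allocate, pin, connect)."""
    return QueuePair()


def hca_drain(qp: QueuePair) -> list:
    """Stands in for the HCA consuming posted send-side work in FIFO order."""
    done = []
    while qp.send_queue:
        done.append(qp.send_queue.popleft().op)
    return done


qp = setup_queue_pair()
qp.post_send(WorkRequest("send", b"hello"))
qp.post_send(WorkRequest("send", b"world"))
qp.post_send(WorkRequest("dma-read"))
qp.post_send(WorkRequest("dma-write", b"data"))
qp.post_receive(WorkRequest("receive"))
print(hca_drain(qp))
```

The point of the design is that only `setup_queue_pair` needs operating-system involvement; the common-case posts touch memory only, which parallels how virtual memory handles common-case translation in hardware.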
InfiniBand, cont.
• Roadmap
  – NGIO/FIO merger in ‘99
  – Spec in ‘00
  – Products in ‘03-’10
• My Assessment
  – PCI needs successor
  – InfiniBand has the necessary features (but also many others)
  – InfiniBand has considerable industry buy-in (but it is recent)
  – Gigabit Ethernet will be only competitor
    • Good name with backing from Cisco et al.
    • But TCP/IP is a killer
  – InfiniBand for storage will be key
InfiniBand Research Issues
• Software Wide Open
  – Industry will do local optimization (e.g., still have device driver virtualized with system calls)
  – But what is the “right” way to do software?
  – Is there a theoretical model for this software?
• Other SAN Issues
  – A theoretical model of a service provider’s site?
  – How to trade performance and availability?
  – Utility of broadcast or multicast support?
  – Obtaining quasi-real-time performance?
Outline
• Motivation
• System Area Networks
• Designing Multiprocessor Servers
  – How Fat?
  – Coherence for Servers
  – E.g., Multicast Snooping
  – E.g., Timestamp Snooping
• Server & Cluster Trends
How Fat Should Servers Be?
• Use
  – PCs -- cheap but small
  – Workgroup servers -- medium cost; medium size
  – Large servers -- premium cost & size
• One answer: “yes”
[Figure: PCs w/ “soft” state; servers running databases for “hard” state]
How Do We Build the Big Servers?
• (Industry knows how to build the small ones)
• A key problem is the memory system
  – Memory Wall: E.g., 100ns memory access = 400 instruction opportunities for 4-way 1GHz processor
• Use per-processor caches to reduce
  – Effective Latency
  – Effective Bandwidth Used
• But cache coherence problem ...
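The memory-wall figure follows from multiplying latency, clock rate, and issue width. The sketch below reproduces it and also shows the standard average-access-time formula for why caches lower effective latency; the 2 ns hit time and 5% miss rate are made-up illustrative values, not numbers from the talk.

```python
def lost_instruction_slots(latency_ns: float, clock_ghz: float, issue_width: int) -> float:
    """Issue slots that pass while one memory access is outstanding."""
    cycles = latency_ns * clock_ghz   # at 1 GHz, one cycle per nanosecond
    return cycles * issue_width


def average_access_ns(hit_ns: float, miss_rate: float, miss_penalty_ns: float) -> float:
    """Average memory access time: hit time plus miss rate times miss penalty."""
    return hit_ns + miss_rate * miss_penalty_ns


# The slide's example: 100 ns access, 1 GHz clock, 4-way issue.
print(lost_instruction_slots(100, 1.0, 4))  # 400.0

# Hypothetical cache parameters, for illustration only.
print(average_access_ns(hit_ns=2, miss_rate=0.05, miss_penalty_ns=100))
```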
Coherence 101
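[Figure: processors P0, P1, ..., Pn-1, each with a cache, and memory on an interconnection network. Memory holds 4 at address 100. Loads r0<-m[100] and r1<-m[100] return “4” and leave copies 100: 4 in two caches; a store m[100]<-5 then replaces one copy with 5; the later loads r2<-m[100] and r3<-m[100] are marked “?”.]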
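The stale-read problem on this slide can be reproduced with private caches that have no coherence mechanism. Which processor performs which access is an assumption made here for illustration; the `Cache` class is a hypothetical toy, with write-through used only to keep it short.

```python
class Cache:
    """Private cache with no coherence mechanism at all."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}

    def read(self, addr):
        if addr not in self.lines:        # miss: fetch from memory
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]           # hit: may return a stale value

    def write(self, addr, value):
        self.lines[addr] = value          # update own copy
        self.memory[addr] = value         # write through to memory


memory = {100: 4}
p0, p1 = Cache(memory), Cache(memory)

r0 = p0.read(100)    # 4
r1 = p1.read(100)    # 4; P1 now holds its own copy
p0.write(100, 5)     # nothing invalidates or updates P1's copy
r2 = p1.read(100)    # P1 still sees the old value
print(r0, r1, r2, memory[100])  # 4 4 4 5
```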
Broadcast Snooping
[Figure: Mem, P0, P1, and P2 attached to an ordered address network and a data network. P1:GETX and P2:GETX are broadcast on the address network and seen by every node in the same order; data returns over the data network.]
Broadcast Snooping
• Symmetric Multiprocessor (SMP)
  – Most commercially-successful parallel computer architecture
  – Performs well by finding data directly
  – Scales poorly
• Improvements, e.g., Sun E10000
  – Split address & data transactions
  – Split address & data network (e.g., bus & crossbar)
  – Multiple address buses (e.g., four multiplexed by address)
  – Address bus is broadcast tree (not shared wires)
• But…
  – Broadcast all address transactions (expensive)
  – All processors must snoop all transactions
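A minimal sketch of broadcast snooping, under simplifying assumptions: a single ordered bus, valid/invalid lines only, and write-through to memory. Real protocols track more states and supply data cache-to-cache; the class and function names here are hypothetical.

```python
class SnoopingCache:
    def __init__(self, name):
        self.name = name
        self.lines = {}                      # addr -> value for valid lines

    def snoop(self, requester, op, addr):
        """Every cache observes every transaction, in bus order."""
        if op == "GETX" and requester != self.name:
            self.lines.pop(addr, None)       # another writer: invalidate


class OrderedBus:
    """Toy ordered address network: one transaction at a time, seen by all."""

    def __init__(self, memory, caches):
        self.memory, self.caches = memory, caches

    def broadcast(self, requester, op, addr):
        for cache in self.caches:
            cache.snoop(requester, op, addr)


def read(bus, cache, addr):
    if addr not in cache.lines:
        bus.broadcast(cache.name, "GETS", addr)
        cache.lines[addr] = bus.memory[addr]
    return cache.lines[addr]


def write(bus, cache, addr, value):
    bus.broadcast(cache.name, "GETX", addr)  # gain exclusive access first
    cache.lines[addr] = value
    bus.memory[addr] = value                 # write-through keeps the toy simple


memory = {100: 4}
p0, p1 = SnoopingCache("P0"), SnoopingCache("P1")
bus = OrderedBus(memory, [p0, p1])
read(bus, p0, 100)
read(bus, p1, 100)
write(bus, p0, 100, 5)
print(read(bus, p1, 100))  # 5
```

The loop in `broadcast` is also where the scaling cost shows up: every address transaction reaches every cache, whether or not that cache holds the block.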
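Directories
[Figure: Dir/Mem, P0, P1, and P2 attached to an address network and a data network. P1:GETX and P2:GETX travel point-to-point on the address network to the directory, followed by “send” messages; data returns over the data network.]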
Directories
• Directory Based Cache Coherence
  – E.g., SGI/Cray Origin2000
  – Allows arbitrary point-to-point interconnection network
  – Scales up well
• But
  – Cache-to-cache transfers common in demanding apps (55-62% sharing misses for OLTP [Barroso ISCA ‘98])
  – Many applications can’t use 100s of processors
  – Must also “scale down” well
Wisconsin Multifacet: Big Picture
• Build Servers For Internet economy
  – Moderate multiprocessor sizes: 2-8 then 16-64, but not 1K
  – Optimize for these workloads (e.g., cache-to-cache transfers)
• Key Tool: Multiprocessor Prediction & Speculation
  – Make a guess... verify it later
  – Uniprocessor predecessors: branch & set predictors
  – Recent multiprocessor work: [Mukherjee/Hill ISCA98], [Kaxiras/Goodman HPCA99] & [Lai/Falsafi ISCA99]
  – Multicast Snooping
  – Timestamp Snooping
Comparison of Coherence Methods
| Coherence Attribute | Snooping | Directories | Multicast Snooping |
|---|---|---|---|
| Find previous owner directly? | Yes | Sometimes | Usually (good) |
| Always broadcast? | Yes | No | No (good) |
| Ordering w/o acks? | Yes | No | Yes (good) |
| Stateless at memory? | Yes | No | No, but simpler |
| Ordered network? | Yes | No | Yes, a challenge |

Use prediction to improve on both?
Multicast Snooping
• On cache miss
  – Predict "multicast mask" (e.g., bit vector of processors)
  – Issue transaction on multicast address network
• Networks
  – Address network that totally-orders address multicasts
  – Separate point-to-point data network
• Processors snoop all incoming transactions
  – If it's your own, it "occurs" now
  – If another's, then invalidate and/or respond
• Simplified directory (at memory)
  – Purpose: Allows masks to be wrong (explained later)
Predicting Masks
• Performed at Requesting Processor
  – Include owner (GETS/GETX) & all sharers (GETX only)
  – Exclude most other processors
• Techniques
  – Many straightforward cases (e.g., stack, code, space-sharing)
  – Many options (network load, PC, software, local/global)
[Figure: a mask predictor takes a block address and feedback as inputs and produces a predicted mask]
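The mask rules above can be sketched with bit vectors. The sufficiency check follows the slide (owner for GETS/GETX, plus all sharers for GETX). The last-value `MaskPredictor` is one hypothetical design among the many options the slide mentions, and how the corrected mask reaches the requester is left as an assumption.

```python
def mask_is_sufficient(op, mask, owner, sharers):
    """Check a multicast mask against the rule on the slide.

    GETS and GETX must reach the owner; GETX must also reach every sharer.
    `owner` is a processor id, or None when memory owns the block.
    """
    required = 0
    if owner is not None:
        required |= 1 << owner
    if op == "GETX":
        for s in sharers:
            required |= 1 << s
    return mask & required == required


class MaskPredictor:
    """Hypothetical last-value predictor indexed by block address."""

    def __init__(self):
        self.table = {}

    def predict(self, block_addr, self_id):
        # Always include the requester; add whatever worked last time.
        return self.table.get(block_addr, 0) | (1 << self_id)

    def feedback(self, block_addr, correct_mask):
        self.table[block_addr] = correct_mask


predictor = MaskPredictor()
owner, sharers = 2, {1, 3}

first = predictor.predict(0x40, self_id=0)           # only the requester
print(mask_is_sufficient("GETX", first, owner, sharers))   # False: retry needed

predictor.feedback(0x40, 0b1111)                      # assumed corrected mask
second = predictor.predict(0x40, self_id=0)
print(mask_is_sufficient("GETX", second, owner, sharers))  # True
```

A wrong first guess is tolerable because the simplified directory at memory can detect an insufficient mask, which is what lets the predictor be aggressive about excluding processors.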
Implementing an Ordered Multicast Network
• Address Network
  – Must create the illusion of total order of multicasts
  – May deliver a multicast to destinations at different times
• Wish List
  – High throughput for multicasts
  – No centralized bottlenecks
  – Low latency and cost (~ pipelined broadcast tree)
  – ...
• Sample Solutions
  – Isotach Networks [Reynolds et al., IEEE TPDS 4/97]
  – Indirect Fat Tree [ISCA ‘99]
  – Direct Torus
Indirect Fat Tree [ISCA ‘99]
[Figure: indirect fat tree; node labels P, $, D, M]
Indirect Fat Tree, cont.
• Basic Idea
  – Processors send transactions up to roots
  – Roots send transactions down with logical timestamp
  – Switches stall transactions to keep in order
  – Null transaction sent to avoid deadlock
• Assessment
  – Viable & high cross-section bandwidth
  – Many "backplane" ASICs means higher cost
  – Often stalls transactions
• Want
  – Lower cost of direct connections
  – Always deliver transactions as soon as possible (ASAP)
  – Sacrifice some cross-section bandwidth
Direct 2-D Torus (work in progress)
• Features
  – Each processor is a switch
  – Switches directly connected
  – E.g., network of Compaq 21364
• Network order?
  – Broadcasts unordered
  – Snooping needs total order
• Solution
  – Create order with logical timestamps instead of network delivery order
  – Called Timestamp Snooping [ASPLOS ‘00]
[Figure: 2-D torus of processor nodes, with node labels including 0, 1, 14, and 15]
Timestamp Snooping
• Timestamp Snooping
  – Snooping with order determined by logical timestamps
  – Broadcast (not multicast) in ASPLOS ‘00
• Basic Idea
  – Assign timestamp to coherence transactions at sender
  – Broadcast transactions over unordered network ASAP
  – Transactions carry timestamp (2 bits)
  – Processors process transactions in timestamp order
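Processing in timestamp order can be sketched with a priority queue at each processor. The sketch assumes the network supplies a guaranteed time below which no further transactions will arrive, uses unbounded integer timestamps rather than the 2-bit encoding on the slide, and breaks ties by sender id; all of these are simplifying assumptions, and the class name is hypothetical.

```python
import heapq


class TimestampOrderer:
    """Buffers transactions and releases them in logical-timestamp order."""

    def __init__(self):
        self.pending = []   # min-heap of (timestamp, sender, transaction)

    def receive(self, timestamp, sender, transaction):
        """Accept a transaction in whatever order the network delivers it."""
        heapq.heappush(self.pending, (timestamp, sender, transaction))

    def release(self, guaranteed_time):
        """Return everything with timestamp <= guaranteed_time, in order.

        `guaranteed_time` models a network guarantee that all future
        arrivals carry strictly greater timestamps.
        """
        ready = []
        while self.pending and self.pending[0][0] <= guaranteed_time:
            ready.append(heapq.heappop(self.pending))
        return ready


orderer = TimestampOrderer()
orderer.receive(3, 1, "P1:GETX")   # arrives first, ordered last
orderer.receive(1, 2, "P2:GETX")
orderer.receive(2, 0, "P0:GETS")

print(orderer.release(guaranteed_time=2))  # timestamps 1 and 2, in order
print(orderer.release(guaranteed_time=3))  # timestamp 3
```

Because every processor applies the same (timestamp, sender) ordering, all of them process transactions in the same total order even though the network delivered them differently.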
Timestamp Snooping Issues
• More address bandwidth
  – For 16 processors, 4-ary butterfly, 64-byte blocks
  – Directory: 3*8 + 3*72 + more = 240 + more
  – Timestamp Snooping: 21*8 + 3*72 = 384 (< 60% more)
• Network must guarantee timestamps
  – Assert future transactions will have greater timestamps (so processor can process older transactions)
  – Isotach [Reynolds IEEE TPDS 4/97] more aggressively
• Other
  – Priority queue at processor to order transactions
  – Flow control and buffering issues
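The bandwidth comparison can be reproduced with a short calculation. The interpretation below is an assumption about what the slide's factors mean: 8-byte address messages, 72-byte data messages (a 64-byte block plus 8 bytes), 3 links for a point-to-point traversal, and 1 + 4 + 16 = 21 links for a broadcast tree through a two-stage 4-ary butterfly.

```python
ADDRESS_BYTES = 8      # assumed address/control message size ("*8")
DATA_BYTES = 72        # assumed 64-byte block plus 8 bytes ("*72")
UNICAST_LINKS = 3      # processor -> switch -> switch -> destination


def broadcast_links(arity=4, stages=2):
    """Links in a broadcast tree through the butterfly: 1 + 4 + 16 = 21."""
    return sum(arity ** level for level in range(stages + 1))


# Link-bytes per miss; the directory figure omits the slide's "+ more".
directory = UNICAST_LINKS * ADDRESS_BYTES + UNICAST_LINKS * DATA_BYTES
timestamp = broadcast_links() * ADDRESS_BYTES + UNICAST_LINKS * DATA_BYTES

print(directory, timestamp, timestamp / directory)  # 240 384 1.6
```

Since the directory total is really "240 + more", the 1.6 ratio is an upper bound, which is why the slide writes "< 60% more".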
Initial Multifacet Results
• Multicast Snooping [ISCA ‘99]
  – Ordered multicast of coherence transactions
  – Find data directly from memory or caches
  – Reduce bandwidth to permit some scaling
  – 32-processor results show 2-6 destinations per multicast
• Timestamp Snooping [ASPLOS ‘00]
  – Broadcast snooping with “order” determined by logical timestamps carried by coherence transactions
  – No bus: Allows arbitrary memory interconnects
  – No directory or directory indirection
  – 16-processor results show 25% faster for 25% more traffic
Selected Issues
• Multicast Snooping
  – What program property are mask predictors exploiting?
  – Why is there no good model of locality or the “90-10” rule in general?
  – How does one build multicast networks?
  – What about fault tolerance?
• Timestamp Snooping
  – What is an optimal network topology?
  – What about buffering, deadlock, etc.?
  – Implementing switches and priority queues?
Outline
• Motivation
• System Area Networks
• Designing Multiprocessor Servers
• Server & Cluster Trends
  – Out-of-box and highly-available servers
  – High-performance communication for clusters
Multiprocessor Servers
• High-Performance Communication “within box”
  – SMPs (e.g., Intel PentiumPro Quads)
  – Directory-based (SGI Origin2000)
• Trend toward hierarchical “out of box” solutions
  – Build bigger servers from smaller ones
  – Intel Profusion, Sequent NUMA-Q, Sun WildFire (pictured)
[Figure: four SMP boxes connected into one larger server]
Multiprocessor Servers, cont.
• Traditionally had poor error isolation
  – Double-bit ECC error crashes everything
  – Kernel error crashes everything
  – Poor match for highly available Internet infrastructure
• Improve error isolation
  – IBM 370 “virtual machines”
  – Stanford HIVE “cells”
Clusters
• Traditionally
  – Good error isolation
  – Poor communication performance (especially latency)
  – LANs are not optimized for clusters
• Enter Early SANs
  – Berkeley NOW w/ Myricom Myrinet
  – IBM SP w/ proprietary network
• What now with InfiniBand SAN (or alternatives)?
A Prediction
• Blurring of cluster & server boundaries
• Clusters
  – High communication performance
• Servers
  – Better error isolation
  – Multi-box solutions
• Use same hardware & configure in the field
• Issues
  – How do we model these hybrids?
  – Should PODC & SPAA also converge?
Three Questions
• What is a System Area Network (SAN) and how will it affect clusters?
  – E.g., InfiniBand
  – Make computation, storage, & network orthogonal
• How fat will multiprocessor servers be and how do we build larger ones?
  – Varying sizes for soft & hard state
  – E.g., Multicast Snooping & Timestamp Snooping
• Future of multiprocessor servers & clusters?
  – Servers will support higher availability & extra-box solutions
  – Clusters will get server communication performance