8/9/2019 gr DOS Unit-1
University of Pennsylvania
Introduction to Distributed Systems
Why do we develop distributed systems?
Availability of powerful yet cheap microprocessors (PCs, workstations) and continuing advances in communication technology.
What is a distributed system?
A distributed system is a collection of independent computers that appear to the users of the system as a single system.
Examples:
Network of workstations
Distributed manufacturing system (e.g., automated assembly line)
Network of branch office computers
Advantages of Distributed Systems over Centralized Systems
Economics: A collection of microprocessors offers better price/performance than a mainframe. Low price/performance ratio: a cost-effective way to increase computing power.
Speed: A distributed system may have more total computing power than a mainframe. Ex.: 10,000 CPU chips, each running at 50 MIPS, give 500,000 MIPS in total. It is not possible to build a 500,000 MIPS single processor, since it would require a 0.002 nsec instruction cycle. Enhanced performance through load distribution.
Inherent distribution: Some applications are inherently distributed. Ex.: a supermarket chain.
Reliability: If one machine crashes, the system as a whole can still survive. Higher availability and improved reliability.
Incremental growth: Computing power can be added in small increments. Modular expandability.
Another driving force: The existence of a large number of personal computers and the need for people to collaborate and share information.
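The speed example above can be checked with a few lines of arithmetic. This is only a sketch using the figures quoted on the slide (10,000 CPUs at 50 MIPS each); the variable names are illustrative.

```python
# Figures taken from the slide: 10,000 CPU chips, each running at 50 MIPS.
cpus = 10_000
mips_per_cpu = 50

# Aggregate instruction rate of the whole collection, in MIPS.
total_mips = cpus * mips_per_cpu

# A single processor matching that rate would have to finish one
# instruction every (1 / rate) seconds; express that in nanoseconds.
instructions_per_sec = total_mips * 1_000_000
cycle_nsec = 1e9 / instructions_per_sec

print(total_mips, "MIPS in total")
print(cycle_nsec, "nsec per instruction for an equivalent single CPU")
```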
Advantages of Distributed Systems over Independent PCs
Data sharing: Allow many users access to a common database
Resource sharing: Share expensive peripherals such as color printers
Communication: Enhance human-to-human communication, e.g., email, chat
Flexibility: Spread the workload over the available machines
Disadvantages of Distributed Systems
Software: Difficult to develop software for distributed systems
Network: The network can saturate or cause other problems
Security: Easy access also applies to secret data
Hardware Concepts
All distributed systems consist of multiple CPUs, but there are several different ways the hardware can be organized in terms of how the CPUs are interconnected and how they communicate.
Flynn picked two characteristics: the number of instruction streams and the number of data streams.
A computer with a single instruction stream and a single data stream is called SISD. All traditional uniprocessor computers fall into this category.
The next category is SIMD, single instruction stream, multiple data stream. This type refers to array processors with one instruction unit that fetches an instruction and then commands many data units to carry it out in parallel, each with its own data. Some supercomputers are SIMD.
The next category is MISD, multiple instruction stream, single data stream. No known computers fit this model.
Next comes MIMD, multiple instruction stream, multiple data stream, which essentially means a group of independent computers, each with its own program counter, program, and data. All distributed systems are MIMD.
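Flynn's two characteristics map directly onto the four category names. The helper below is a minimal sketch of that mapping; the function name flynn_class is hypothetical and chosen only for illustration.

```python
def flynn_class(instruction_streams: int, data_streams: int) -> str:
    """Name the Flynn category for the given numbers of streams."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("a machine needs at least one stream of each kind")
    i = "S" if instruction_streams == 1 else "M"   # single or multiple instruction
    d = "S" if data_streams == 1 else "M"          # single or multiple data
    return f"{i}I{d}D"


print(flynn_class(1, 1))    # a traditional uniprocessor
print(flynn_class(1, 64))   # an array processor
print(flynn_class(16, 16))  # a group of independent computers
```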
Hardware Concepts
Figure: A taxonomy of parallel and distributed computer systems.
MIMD (Multiple-Instruction Multiple-Data)
Tightly Coupled versus Loosely Coupled
• Tightly coupled systems (multiprocessors)
  o shared memory
  o intermachine delay short, data rate high
• Loosely coupled systems (multicomputers)
  o private memory
  o intermachine delay long, data rate low
Bus versus Switched MIMD
Bus: a single network, backplane, bus, cable or other medium that connects all machines. E.g., cable TV
Switched: individual wires from machine to machine, with many different wiring patterns in use.
Multiprocessors (shared memory): Bus or Switched
Multicomputers (private memory): Bus or Switched
Bus-based Multiprocessors
Bus-based multiprocessors
• cache memory
• hit rate
• cache coherence
• write-through cache: propagate write immediately
• snoopy cache: monitor when its entry becomes obsolete
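The interaction between write-through and snooping can be sketched as a toy simulation. Everything here is an assumption made for illustration: the class names Bus and SnoopyCache are hypothetical, and the snooping cache is assumed to invalidate an obsolete entry (updating it in place would be the other common choice).

```python
class Bus:
    """Shared bus with main memory attached; every write is visible to all caches."""

    def __init__(self):
        self.memory = {}
        self.caches = []

    def write(self, writer, addr, value):
        self.memory[addr] = value            # write-through: memory is updated at once
        for cache in self.caches:
            if cache is not writer:
                cache.snoop(addr)            # other caches watch the bus traffic


class SnoopyCache:
    def __init__(self, bus):
        self.bus = bus
        self.lines = {}
        self.hits = 0
        self.misses = 0
        bus.caches.append(self)

    def read(self, addr):
        if addr in self.lines:
            self.hits += 1
        else:
            self.misses += 1
            self.lines[addr] = self.bus.memory.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.bus.write(self, addr, value)    # propagate the write immediately

    def snoop(self, addr):
        self.lines.pop(addr, None)           # assumed policy: drop the obsolete entry


bus = Bus()
a, b = SnoopyCache(bus), SnoopyCache(bus)
a.write(0x10, 1)
first = b.read(0x10)     # miss: fetched from memory
a.write(0x10, 2)         # b's cached copy becomes obsolete and is invalidated
second = b.read(0x10)    # miss again, so b never sees the stale value
print(first, second, "hit rate of b:", b.hits / (b.hits + b.misses))
```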
Switched Multiprocessors
Figures: a crossbar switch; an omega switching network.
Switched Multiprocessors
• for connecting a large number (say over 64) of processors
• crossbar switch: n**2 switch points
• omega network: 2x2 switches for n CPUs and n memories, log n switching stages, each with n/2 switches, total (n log n)/2 switches
• delay problem: E.g., n=1024, 10 switching stages from CPU to memory, a total of 20 switching stages for the round trip. At 100 MIPS (10 nsec instruction execution time) this needs 0.5 nsec switching time
• NUMA (Non-Uniform Memory Access): placement of program and data
• building a large, tightly-coupled, shared memory multiprocessor is possible, but is difficult and expensive
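The switch counts and the delay figure above follow from simple arithmetic. The sketch below reproduces them for n = 1024, assuming n is a power of two and the logarithm is base 2; the function names are illustrative.

```python
def crossbar_switch_points(n: int) -> int:
    """A crossbar needs one switch point per CPU/memory pair."""
    return n * n


def omega_network(n: int):
    """Return (stages, total 2x2 switches) for n CPUs and n memories."""
    if n < 2 or n & (n - 1):
        raise ValueError("n is assumed to be a power of two")
    stages = n.bit_length() - 1          # log2(n) for a power of two
    return stages, stages * (n // 2)     # n/2 switches in every stage


n = 1024
stages, total_switches = omega_network(n)

# A memory request crosses every stage on the way out and again on the way back.
round_trip_stages = 2 * stages

# 100 MIPS means one instruction every 10 nsec; that budget is shared by all stages.
instruction_nsec = 1000 / 100
switching_nsec = instruction_nsec / round_trip_stages

print(crossbar_switch_points(n), "crossbar switch points")
print(stages, "stages,", total_switches, "omega switches")
print(switching_nsec, "nsec allowed per switching stage")
```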
Multicomputers
Bus-Based Multicomputers
• easy to build
• communication volume much smaller
• relatively slow speed LAN (10-100 Mbps, compared to 300 Mbps and up for a backplane bus)
Figure: A multicomputer consisting of workstations on a LAN
Switched Multicomputers
• interconnection networks: E.g., grid, hypercube
• hypercube: n-dimensional cube
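One common way to picture the hypercube mentioned above is to label the 2**n nodes with n-bit numbers, so that two nodes are wired together exactly when their labels differ in one bit. The sketch below assumes that labeling; the function names are hypothetical.

```python
def hypercube_neighbors(node: int, dims: int) -> list[int]:
    """Nodes directly wired to `node` in a dims-dimensional hypercube."""
    return [node ^ (1 << bit) for bit in range(dims)]   # flip one bit per dimension


def hops(a: int, b: int) -> int:
    """Minimum number of links between two nodes: the count of differing bits."""
    return bin(a ^ b).count("1")


print(hypercube_neighbors(0b000, 3))   # the three neighbors of node 0 in a 3-cube
print(hops(0b000, 0b111))              # opposite corners are 3 links apart
```

Each node has only n wires while the machine has 2**n nodes, which is why the wiring grows slowly compared with a crossbar.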
Software Concepts
Software more important for users
Three types:
1. Network Operating Systems
2. (True) Distributed Systems
3. Multiprocessor Time Sharing
Network Operating Systems
• loosely-coupled software on loosely-coupled hardware
• A network of workstations connected by a LAN
• each machine has a high degree of autonomy
  o rlogin machine
  o rcp machine1:file1 machine2:file2
• File servers: client and server model
• Clients mount directories on file servers
• Best known network OS:
  o Sun's NFS (Network File System) for shared file systems
• a few system-wide requirements: format and meaning of all the messages exchanged
NFS
NFS Architecture
Server exports directories
Clients mount exported directories
NFS Protocols
For handling mounting
For read/write: no open/close, stateless
NFS Implementation
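The "no open/close, stateless" point can be illustrated with a toy in-memory server in which every request names the file, the offset, and the length, so the server keeps no per-client state between calls. This is only a sketch of the idea, not the NFS wire protocol; the class StatelessFileServer, its methods, and the handle "fh1" are all hypothetical.

```python
class StatelessFileServer:
    """Toy server: no open-file table, no per-client file position."""

    def __init__(self, files: dict[str, bytes]):
        self.files = dict(files)      # file handle -> contents

    def read(self, handle: str, offset: int, count: int) -> bytes:
        # The request is self-contained, so it can be retried after a server restart.
        return self.files[handle][offset:offset + count]

    def write(self, handle: str, offset: int, data: bytes) -> int:
        old = self.files.get(handle, b"")
        if len(old) < offset:                      # pad if writing past the end
            old += b"\0" * (offset - len(old))
        self.files[handle] = old[:offset] + data + old[offset + len(data):]
        return len(data)


server = StatelessFileServer({"fh1": b"hello world"})
tail = server.read("fh1", 6, 5)       # the client supplies the position each time
server.write("fh1", 0, b"J")
head = server.read("fh1", 0, 5)
print(tail, head)
```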
(True) Distributed Systems
tightly-coupled software on loosely-coupled hardware
provide a single-system image or a virtual uniprocessor
a single, global interprocess communication mechanism, process management, file system; the same system call interface everywhere
Ideal definition:
A distributed system runs on a collection of computers that do not have shared memory, yet looks like a single computer to its users.
Multiprocessor Operating Systems
Tightly-coupled software on tightly-coupled hardware
• Examples: high-performance servers
• shared memory
• single run queue
• traditional file system as on a single-processor system: central block cache
Design Issues of Distributed Systems
Transparency
Flexibility
Reliability
Performance
Scalability
1. Transparency
How to achieve the single-system image, i.e., how to make a collection of computers appear as a single computer.
Hiding all the distribution from the users as well as the application programs can be achieved at two levels:
1) hide the distribution from users
2) at a lower level, make the system look transparent to programs.
1) and 2) require uniform interfaces, such as access to files and communication.
Types of transparency
Location Transparency: users cannot tell where hardware and software resources such as CPUs, printers, files, and databases are located.
Migration Transparency: resources must be free to move from one location to another without their names changed. E.g., /usr/lee, /central/usr/lee
Replication Transparency: OS can make additional copies of files and resources without users noticing.
Concurrency Transparency: The users are not aware of the existence of other users. Need to allow multiple users to concurrently access the same resource. Lock and unlock for mutual exclusion.
Parallelism Transparency: Automatic use of parallelism without having to program explicitly. The holy grail for distributed and parallel system designers.
Users do not always want complete transparency: a fancy printer 1000 miles away
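The lock and unlock step mentioned under concurrency transparency can be sketched on a single machine with threads standing in for users. This is an analogy only (a real distributed system would need a distributed locking mechanism); the shared account dictionary and the deposit function are hypothetical.

```python
import threading

account = {"balance": 0}          # the shared resource
lock = threading.Lock()


def deposit(times: int) -> None:
    for _ in range(times):
        with lock:                # lock ... unlock around the critical section
            account["balance"] += 1


# Four "users" update the same resource concurrently.
users = [threading.Thread(target=deposit, args=(10_000,)) for _ in range(4)]
for t in users:
    t.start()
for t in users:
    t.join()

print(account["balance"])
```

Because every update happens while holding the lock, no increment is lost and each user can behave as if it were alone.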
2. Flexibility
Make it easier to change
Monolithic Kernel: system calls are trapped and executed by the kernel. All system calls are served by the kernel, e.g., UNIX.
Microkernel: provides minimal services. Shown in above Fig.
1) IPC
2) some memory management
3) some low-level process management and scheduling
4) low-level I/O
E.g., Mach can support multiple file systems, multiple system interfaces.
3. Reliability
A distributed system should be more reliable than a single system. Example: 3 machines, each with .95 probability of being up. The probability that at least one of them is up is 1-.05**3.
Availability: fraction of time the system is usable. Redundancy improves it.
Need to maintain consistency
Need to be secure
Fault tolerance: need to mask failures, recover from errors.
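The three-machine example can be worked out numerically. The sketch assumes that machines fail independently and that the system counts as up when at least one machine is up; both assumptions are implicit in the 1-.05**3 expression.

```python
p_up = 0.95        # probability that one machine is up (from the slide)
machines = 3

# The system is down only if every machine is down at the same time.
p_all_down = (1 - p_up) ** machines
p_system_up = 1 - p_all_down

print(round(p_all_down, 6))    # 0.000125
print(round(p_system_up, 6))   # 0.999875
```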
5. Scalability
Systems grow with time or become obsolete. Techniques that require resources that grow linearly with the size of the system are not scalable, e.g., a broadcast-based query won't work for large distributed systems.
Examples of bottlenecks
  o Centralized components: a single mail server
  o Centralized tables: a single URL address book
  o Centralized algorithms: routing based on complete information
Communication Networks
Computers are connected through a communication network
Wide Area Networks (WAN)
connect computers spread over a wide geographic area
point-to-point or store-and-forward -- data is transferred between computers through a series of switches
switch -- a special purpose computer responsible for routing data (to avoid network congestion)
data can be lost due to: switch crashes, communication link failures, limited buffers at switches, transmission errors, etc.