![Page 1: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/1.jpg)
Distributed Systems COEN 317 Introduction
Chapters 1, 2, 3
![Page 2: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/2.jpg)
COEN 317
JoAnne Holliday
Email: [email protected] (best way to reach me)
Office: Engineering 247, (408) 551-1941
Office Hours: TW 3:00-4:30 and by appointment
Class web page: http://www.cse.scu.edu/~jholliday/
![Page 3: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/3.jpg)
Textbook: Distributed Systems, Principles and Paradigms
By Tanenbaum and van Steen
We will cover chapters 4-8 and parts of chapter 9.
Read chapter 1. Review chapter 2 if needed for networks and chapter 3 as needed for threads and processes.
Chapter 1: Introduction
Chapter 2: Communication, Networking
Chapter 3: Processes
![Page 4: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/4.jpg)
Definition of a Distributed System (1)
A distributed system is:
A collection of independent computers that appears to its
users as a single coherent system.
![Page 5: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/5.jpg)
Definition of a Distributed System (2)
A distributed system organized as middleware. Note that the middleware layer extends over multiple machines.
1.1
![Page 6: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/6.jpg)
Threads (chapter 3)
Message propagation times are long compared with local computation. Send a message and let one thread wait for the response while another thread continues with the task.
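A minimal sketch of this idea in Python, assuming the standard `threading` module. The function `slow_request` is a hypothetical stand-in for a remote call; its `time.sleep` only simulates message propagation delay and no real network is involved.

```python
import threading
import time

def slow_request(results):
    """Hypothetical remote call: the sleep simulates message propagation delay."""
    time.sleep(0.2)
    results["reply"] = "pong"

def run():
    results = {}
    # One thread sends the "message" and waits for the response.
    waiter = threading.Thread(target=slow_request, args=(results,))
    waiter.start()
    # Meanwhile the main thread continues with its own task.
    local_work = sum(range(1000))
    # Join only when the reply is actually needed.
    waiter.join()
    return local_work, results["reply"]

if __name__ == "__main__":
    print(run())
```

The main thread never sits idle during the simulated delay; it blocks only at `join()`, when it needs the reply.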
![Page 7: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/7.jpg)
Distributed systems
“Distributed System” covers a wide range of architectures from slightly more distributed than a centralized system to a truly distributed network of peers.
![Page 8: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/8.jpg)
One Extreme: Centralized
Centralized: mainframe and dumb terminals
All of the computation is done on the mainframe. Each line or keystroke is sent from the terminal to the mainframe.
![Page 9: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/9.jpg)
Moving Towards Distribution
In a client-server system, the clients are workstations or computers in their own right and perform computations and formatting of the data.
However, the data and the application that manipulates it ultimately reside on the server.
![Page 10: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/10.jpg)
More Decentralization
In Distributed-with-Coordinator, the nodes or sites depend on a coordinator node with extra knowledge or processing abilities.
The coordinator might be used only in case of failures or other problems.
![Page 11: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/11.jpg)
True Decentralization
A truly distributed system has no distinguished node that acts as a coordinator; all nodes or sites are equals.
The nodes may choose to elect one of their own to act as a temporary coordinator or leader.
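As an illustration only, here is one simple election rule, sketched in Python: every node picks the highest ID among the nodes it believes are alive. This rule is an assumption made for the example and not the method of these slides. It also assumes all nodes share the same view of which peers are alive, which a real election algorithm cannot take for granted.

```python
def elect_leader(live_node_ids):
    """Return the highest ID among the live nodes, or None if none are alive."""
    return max(live_node_ids, default=None)

def leader_after_failure(node_ids, failed):
    """Re-run the election after removing the failed nodes."""
    live = [n for n in node_ids if n not in failed]
    return elect_leader(live)

if __name__ == "__main__":
    nodes = [3, 7, 5]
    print(elect_leader(nodes))               # current leader
    print(leader_after_failure(nodes, {7}))  # leader once node 7 has failed
```

Because the leader is only temporary, losing it does not stop the system: the surviving equals simply elect again.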
![Page 12: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/12.jpg)
Distributed Systems: Pro and Con
Some things that were difficult in a centralized system become easier:
– Doing tasks faster by doing them in parallel
– Avoiding a single point of failure (all eggs in one basket)
– Geographical distribution
Some things become more difficult:
– Transaction commit
– Snapshots, time and causality
– Agreement (consensus)
![Page 13: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/13.jpg)
Advantages of the True Distributed System
• No central server or coordinator means it is scalable
• SDDS, Scalable Distributed Data Structures, attempt to move distributed systems from a small number of nodes to thousands of nodes
• We need scalable algorithms to operate on these networks/structures
– For example, peer-to-peer networks
![Page 14: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/14.jpg)
Transparency in a Distributed System
Important: location, migration (relocation), replication, concurrency, failure.
| Transparency | Description |
| --- | --- |
| Access | Hide differences in data representation and how a resource is accessed |
| Location | Hide where a resource is located |
| Migration | Hide that a resource may move to another location |
| Relocation | Hide that a resource may be moved to another location while in use |
| Replication | Hide that copies of a resource exist and a user might use different ones at different times |
| Concurrency | Hide that a resource may be shared by several competing users |
| Failure | Hide the failure and recovery of a resource |
| Persistence | Hide whether a (software) resource is in memory or on disk |
![Page 15: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/15.jpg)
Scalability
• Something is scalable if it “increases linearly with size”, where size is usually the number of nodes or the distance.
• We say “X is scalable with the number of nodes”.
• Every site (node) is directly connected to every other site through a communication channel. The number of channels is NOT scalable: for N sites there are N(N-1)/2 channels, which grows quadratically.
• Sites connected in a ring. The number of channels IS scalable (N channels for N sites).
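The two topologies can be compared with a short Python calculation. A fully connected network needs one channel per pair of sites, N(N-1)/2 in total, while a ring needs one channel per site. The function names are illustrative, and the ring count assumes at least three sites.

```python
def full_mesh_channels(n):
    """One channel per unordered pair of sites: n(n-1)/2."""
    return n * (n - 1) // 2

def ring_channels(n):
    """One channel per site in a ring (assumes n >= 3)."""
    if n < 3:
        raise ValueError("a ring needs at least 3 sites")
    return n

if __name__ == "__main__":
    for n in (4, 10, 100, 1000):
        print(n, full_mesh_channels(n), ring_channels(n))
```

Going from 10 to 100 sites multiplies the ring's channels by 10 but the full mesh's channels by 110 (45 to 4950), which is why the full mesh is not scalable.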
![Page 16: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/16.jpg)
Scalability Problems
Examples of scalability limitations.
| Concept | Example |
| --- | --- |
| Centralized services | A single server for all users |
| Centralized data | A single on-line telephone book |
| Centralized algorithms | Doing routing based on complete information |
![Page 17: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/17.jpg)
Scaling Techniques (1)
1.4
The difference between letting:
a) a server or
b) a client check forms as they are being filled
![Page 18: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/18.jpg)
Scaling Techniques (2)
1.5
An example of dividing the DNS name space into zones.
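A rough Python sketch of why zones help with scaling: each zone holds only its own records plus pointers to child zones, so no single server needs the whole name space. The zone names, the `ZONES` table, and the address are hypothetical values invented for this example, and the lookup is a simplification that does not follow the real DNS protocol.

```python
# Hypothetical zone data: each zone answers for its own names or delegates.
ZONES = {
    "root": {"delegations": {"edu": "edu-zone"}, "records": {}},
    "edu-zone": {"delegations": {"example.edu": "example-zone"}, "records": {}},
    "example-zone": {
        "delegations": {},
        "records": {"www.example.edu": "192.0.2.10"},
    },
}

def resolve(name, zone="root", path=None):
    """Follow delegations from zone to zone; return (address, zones visited)."""
    path = (path or []) + [zone]
    data = ZONES[zone]
    if name in data["records"]:
        return data["records"][name], path
    for suffix, child in data["delegations"].items():
        if name == suffix or name.endswith("." + suffix):
            return resolve(name, child, path)
    return None, path  # no record and no matching delegation

if __name__ == "__main__":
    print(resolve("www.example.edu"))
```

Each lookup touches only the zones along one path of the name tree, so adding names under a new zone does not add load or data to unrelated zones.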
![Page 19: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/19.jpg)
Characteristics of Scalable Distributed Algorithms
• No machine (node, site) has complete information about the system state.
• Sites make decisions based only on local information.
• Failure of one site does not ruin the algorithm.
• There is no implicit assumption that a global clock exists.
![Page 20: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/20.jpg)
Homogeneous and tightly coupled vs heterogeneous and loosely coupled
We will study heterogeneous and loosely coupled systems.
![Page 21: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/21.jpg)
Multiprocessors (1)
A bus-based multiprocessor.
1.7
![Page 22: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/22.jpg)
Multiprocessors (2)
a) A crossbar switch
b) An omega switching network
1.8
![Page 23: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/23.jpg)
Homogeneous Multicomputer Systems
a) Grid
b) Hypercube: 2^N nodes, each of degree N
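The hypercube structure can be checked with a small Python sketch. It assumes the common labeling in which each of the 2^N nodes gets an N-bit number and two nodes are linked when their labels differ in exactly one bit. The function names are illustrative.

```python
def hypercube_neighbors(node, dim):
    """Neighbors of `node` in a dim-dimensional hypercube: flip one bit each."""
    return [node ^ (1 << bit) for bit in range(dim)]

def hypercube_edges(dim):
    """Set of undirected links, each stored as a (smaller, larger) pair."""
    edges = set()
    for node in range(2 ** dim):
        for nb in hypercube_neighbors(node, dim):
            edges.add((min(node, nb), max(node, nb)))
    return edges

if __name__ == "__main__":
    print(hypercube_neighbors(0, 3))
    print(len(hypercube_edges(3)))
```

Every node has exactly N neighbors, so the total number of links is N * 2^(N-1), for example 12 links for the 8-node cube.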
1-9
![Page 24: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/24.jpg)
Software Concepts
• DOS (Distributed Operating Systems)
• NOS (Network Operating Systems)
• Middleware
| System | Description | Main Goal |
| --- | --- | --- |
| DOS | Tightly-coupled operating system for multi-processors and homogeneous multicomputers | Hide and manage hardware resources |
| NOS | Loosely-coupled operating system for heterogeneous multicomputers (LAN and WAN) | Offer local services to remote clients |
| Middleware | Additional layer atop NOS implementing general-purpose services | Provide distribution transparency |
![Page 25: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/25.jpg)
Uniprocessor Operating Systems
Separating applications from operating system code through a microkernel.
1.11
![Page 26: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/26.jpg)
Distributed Operating Systems
May share memory or other resources.
1.14
![Page 27: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/27.jpg)
Network Operating System
General structure of a network operating system.
1-19
![Page 28: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/28.jpg)
Middleware-based Distributed System
General structure of a distributed system as middleware.
1-22
![Page 29: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/29.jpg)
Middleware and Openness
In an open middleware-based distributed system, the protocols used by each middleware layer should be the same, as well as the interfaces they offer to applications.
1.23
![Page 30: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/30.jpg)
Comparison between Systems
A comparison between multiprocessor operating systems, multicomputer operating systems, network operating systems, and middleware based distributed systems.
| Item | Distributed OS (Multiproc.) | Distributed OS (Multicomp.) | Network OS | Middleware-based OS |
| --- | --- | --- | --- | --- |
| Degree of transparency | Very High | High | Low | High |
| Same OS on all nodes | Yes | Yes | No | No |
| Number of copies of OS | 1 | N | N | N |
| Basis for communication | Shared memory | Messages | Files | Model specific |
| Resource management | Global, central | Global, distributed | Per node | Per node |
| Scalability | No | Moderately | Yes | Varies |
| Openness | Closed | Closed | Open | Open |
![Page 31: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/31.jpg)
Modern Architectures
An example of horizontal distribution of a Web service.
1-31
![Page 32: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/32.jpg)
Two meanings of synchronous and asynchronous communications
• First meaning: synchronous communication is where a process blocks, either after sending a message to wait for the answer or before receiving one.
• Second meaning: sync and async have also come to describe the communication channels over which the messages travel.
• Synchronous: message transit time is short and bounded. If a site does not respond within x seconds, it can be declared dead. Simplifies algorithms!
• Asynchronous: message transit time is unbounded. If a message is not received in a given time interval, it could just be slow.
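The timeout rule can be sketched in Python with a queue standing in for the channel. All names here (`wait_for_reply`, `probe`, the delay and timeout values) are hypothetical, and the responder thread only simulates a remote site.

```python
import queue
import threading
import time

def wait_for_reply(replies, timeout_s):
    """Block up to timeout_s for a reply; None means the bound was exceeded."""
    try:
        return replies.get(timeout=timeout_s)
    except queue.Empty:
        return None

def probe(delay_s, timeout_s):
    """Simulate a site that answers after delay_s and judge it by a timeout."""
    replies = queue.Queue()

    def responder():
        time.sleep(delay_s)   # simulated message transit time
        replies.put("alive")

    threading.Thread(target=responder, daemon=True).start()
    reply = wait_for_reply(replies, timeout_s)
    return "alive" if reply is not None else "suspected dead"

if __name__ == "__main__":
    print(probe(delay_s=0.01, timeout_s=0.5))
    print(probe(delay_s=0.5, timeout_s=0.05))
```

On a synchronous channel whose transit bound is below the timeout, the second verdict would be correct. On an asynchronous channel the same verdict is only a suspicion: in this run the "dead" site was merely slow.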
![Page 33: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/33.jpg)
What makes Distributed Systems Difficult?
• Asynchrony – even “synchronous” systems have time lag.
• Limited local knowledge – algorithms can consider only information acquired locally.
• Failures – parts of the distributed system can fail independently leaving some nodes operational and some not.
![Page 34: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/34.jpg)
Example: Byzantine Agreement
Introduced as a voting problem (Lamport, Shostak, Pease ’82)
A and B can defeat the enemy iff both attack
A sends message to B: Attack at Noon!
General A General B
The Enemy
![Page 35: Distributed Systems COEN 317 Introduction](https://reader035.vdocuments.site/reader035/viewer/2022062407/56812a74550346895d8df9fd/html5/thumbnails/35.jpg)
Byzantine Agreement
Impossible with unreliable networks
Possible given some guarantees of reliability:
– Guaranteed delivery within bounded time
– Limitations on corruption of messages
– Probabilistic guarantees (send multiple messages)
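The effect of sending multiple messages can be estimated with a short Python calculation. It rests on a simplifying assumption that is not in the slides: each copy is lost independently with the same probability. The function names are illustrative.

```python
def delivery_probability(loss_prob, copies):
    """Chance that at least one of `copies` independent messages arrives."""
    return 1.0 - loss_prob ** copies

def copies_needed(loss_prob, target):
    """Smallest number of copies whose delivery probability reaches target."""
    if not (0.0 <= loss_prob < 1.0 and 0.0 <= target < 1.0):
        raise ValueError("need 0 <= loss_prob < 1 and 0 <= target < 1")
    k = 1
    while delivery_probability(loss_prob, k) < target:
        k += 1
    return k

if __name__ == "__main__":
    print(delivery_probability(0.5, 2))
    print(copies_needed(0.5, 0.99))
```

Under this assumption the delivery probability approaches 1 as more copies are sent, but reaches it only if the channel never loses messages. That matches the slide: repetition gives a probabilistic guarantee, not certainty.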