Distributed systems
TRANSCRIPT
What is a distributed system?
• A collection of independent computers that appears to its users as a single coherent system
• A distributed system is often organized as middleware. The middleware layer runs on all machines and offers a uniform interface to the system.
What is scale?
• A system is said to be scalable if it can handle the addition of users and resources without suffering a noticeable loss of performance or increase in administrative complexity.
• Building a system that fully uses its resources requires an understanding of the problems of scale.
• The scale of a system has three dimensions:
– Numerical
– Geographical
– Administrative
Dimensions of scale
• Numerical: the number of users and objects that are part of the system.
• Geographical: the distance between the farthest nodes in the system.
• Administrative: the number of organizations that exert administrative control over pieces of the system.
• If a system is expected to grow, its ability to scale must be considered when the system is designed
Distributed systems consciously designed to scale
• Grapevine was one of the earliest distributed computer systems consciously designed to scale
More recent projects have concentrated on particular subsystems:
Internet Domain Name System
Kerberos
Sprite
DEC’s Global Naming and Authentication services
Projects concentrating on complete scalable systems:
Locus
Dash
Amoeba
The Effects of Scale
• Scale affects systems in numerous ways.
• Scalability is negatively affected when the system is based on:
– Centralized server: one server for all users
– Centralized data: a single database for all users
– Centralized algorithms: one site collects all information, processes it, and distributes the results to all sites
Contd..,
• The following issues affect the scalability of a system as a whole.
Reliability:- As the system scales geographically, it becomes less likely that all components will be able to communicate. This can be overcome by:
– Autonomy
– Replication
System Load:- As the system gets bigger, the amount of data that network services must manage grows, and so does the total number of requests. This can be overcome by:
– Replication
– Distribution and caching
Contd..,
Administration:- As the number of nodes in a system grows, it becomes impractical to maintain information about the system and its users on each node.
This can be overcome by maintaining common information centrally.
Heterogeneity:- Systems that cross administrative boundaries are likely to include not only hardware of different types but also different operating systems, or different versions of the same operating system.
Dealing with heterogeneity:
– Coherence
The effects of scale on particular subsystems.
• If a system is expected to grow, its ability to scale must be considered when the system is designed
• The three dimensions of scale affect distributed systems in many ways.
• Scale also affects the user's ability to easily interact with the system.
• Among the affected components are:
– Naming and directory services
– The security subsystem
– Remote resources
Naming and Directory Services
• A name server maps a name to information about the name’s binding
• The information might be the address of an object or it might be the general
Granularity of Naming
UID- Based Naming
Directory Services
contd..,
• Granularity of Naming :- Name servers differ in the size of the objects they name.
Some name servers name only hosts, some name individual users and services, and a few name individual files.
The size of the naming database, the frequency of queries, and the read-to-write ratio are all affected by the granularity of the objects named.
• UID-Based Naming :- Uses unique identifiers (UIDs) to name objects. A UID usually contains information identifying the server, plus an identifier for the object on that server.
The problem with UID-based naming is that objects move, while the UID often identifies the server on which the object resides, so a UID can go stale when its object is relocated.
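The stale-UID problem can be made concrete with a small sketch. Everything here is hypothetical and chosen only for illustration: the `UID`, `Server`, `move`, and `resolve` names, and the forwarding pointer, which is one possible workaround and is not something the slides prescribe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UID:
    server: str    # server the object lived on when the UID was issued
    local_id: int  # identifier that is unique on that server


class Server:
    def __init__(self, name):
        self.name = name
        self.objects = {}     # local_id -> object
        self.forwarding = {}  # local_id -> UID at the object's new location


def move(uid, dest, new_local_id, servers, leave_forwarding=True):
    """Relocate an object; optionally leave a forwarding pointer behind."""
    src = servers[uid.server]
    obj = src.objects.pop(uid.local_id)
    servers[dest].objects[new_local_id] = obj
    new_uid = UID(dest, new_local_id)
    if leave_forwarding:
        src.forwarding[uid.local_id] = new_uid
    return new_uid


def resolve(uid, servers, max_hops=8):
    """Go to the server named in the UID, chasing forwarding pointers if any."""
    for _ in range(max_hops):
        server = servers[uid.server]
        if uid.local_id in server.objects:
            return server.objects[uid.local_id]
        if uid.local_id not in server.forwarding:
            raise KeyError(f"{uid} is stale: object is no longer on {server.name}")
        uid = server.forwarding[uid.local_id]
    raise RuntimeError("too many forwarding hops")


servers = {"A": Server("A"), "B": Server("B")}
servers["A"].objects[1] = "report.txt"
old_uid = UID("A", 1)
move(old_uid, "B", 7, servers)
print(resolve(old_uid, servers))  # the old UID still works via the pointer
```

Without the forwarding pointer (`leave_forwarding=False`), the old UID raises `KeyError` after the move, which is exactly the weakness described above.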
Contd..,
• Directory Services :- A directory contains UIDs for files, other directories, or in fact any object for which a UID exists.
There is no requirement that a subdirectory be on the same server as its parent.
Different parts of the name space can reside on different machines.
A directory server can support pieces of independent name spaces, and it is possible for those name spaces to overlap, or even to contain cycles.
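A minimal sketch of how a path might be resolved when each directory maps names to UIDs and subdirectories live on other servers. The server names (`s1`, `s2`, `s3`), the in-memory `STORE`, and the `lookup` function are assumptions made up for this illustration. The `root` entry deliberately points back to the top directory to show a cycle.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class UID:
    server: str
    local_id: int


@dataclass
class Directory:
    entries: dict = field(default_factory=dict)  # name -> UID


# Hypothetical contents of three servers: server name -> {local_id: object}
STORE = {
    "s1": {0: Directory({"home": UID("s2", 0)})},
    "s2": {0: Directory({"alice": UID("s3", 0), "root": UID("s1", 0)})},
    "s3": {0: Directory({"notes.txt": UID("s3", 1)}), 1: "file contents"},
}


def lookup(root, path):
    """Resolve a slash-separated path; return the object and the servers visited."""
    uid = root
    visited = [root.server]
    for part in path.strip("/").split("/"):
        node = STORE[uid.server][uid.local_id]
        if not isinstance(node, Directory):
            raise NotADirectoryError(part)
        uid = node.entries[part]  # the entry may point to any server
        visited.append(uid.server)
    return STORE[uid.server][uid.local_id], visited


obj, visited = lookup(UID("s1", 0), "home/alice/notes.txt")
print(obj, visited)
```

Because a path is finite, the cycle through `root` does no harm during lookup: `home/root/home/alice/notes.txt` resolves to the same file. A tool that walks the whole name space would need its own cycle detection.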
The Security Subsystem
• As the size of a system grows, security becomes increasingly important and increasingly difficult to implement.
• The bigger the system, the more vulnerable it is to attack
• Security has several aspects:
– Authentication :- how the system verifies a user's identity. Common approaches:
Passwords
Host-based authentication
Encryption-based authentication
Contd..,
• Authorization :- There are two approaches:
– A request is sent to an authorization service whenever a server needs to make an access control decision. The authorization service makes the decision and sends its answer back to the server.
– The client is first authenticated, then the server makes its own decision about whether the client is authorized to perform an operation.
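The two authorization approaches can be contrasted in a short sketch. The class names, the tuple-based access control list, and the `requests` counter are all hypothetical, and authentication is reduced to a set of already verified principals.

```python
class AuthorizationService:
    """Central service that servers consult for each access decision."""

    def __init__(self, acl):
        self.acl = acl     # set of (principal, operation, resource) tuples
        self.requests = 0  # number of decisions this service was asked to make

    def decide(self, principal, operation, resource):
        self.requests += 1
        return (principal, operation, resource) in self.acl


class DelegatingServer:
    """First approach: forward every access control decision to the service."""

    def __init__(self, authz):
        self.authz = authz

    def handle(self, principal, operation, resource):
        return self.authz.decide(principal, operation, resource)


class SelfDecidingServer:
    """Second approach: authenticate the client, then decide locally."""

    def __init__(self, acl, authenticated):
        self.acl = acl
        self.authenticated = authenticated  # principals whose identity is verified

    def handle(self, principal, operation, resource):
        if principal not in self.authenticated:
            return False
        return (principal, operation, resource) in self.acl


acl = {("alice", "read", "report")}
authz = AuthorizationService(acl)
remote = DelegatingServer(authz)
local = SelfDecidingServer(acl, authenticated={"alice"})
print(remote.handle("alice", "read", "report"), authz.requests)
print(local.handle("alice", "read", "report"), authz.requests)
```

In this sketch the first approach adds one request to the central service per decision, so its load grows with the whole system, while the second keeps decisions local at the cost of each server holding its own access control data.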
Remote Resources
• Scale affects the sharing of many kinds of resources like processors, memory, storage, programs, and physical devices.
• The services that provide access to these resources often inherit scalability problems from the naming and security mechanisms they use.
• This section looks at the scaling issues related to network communication in such services:
– Communication
– File systems
Communication
Communication typically takes one of two forms:
Point-to-point :- the client sends messages to the particular server that can satisfy the request.
Broadcast :- the client sends the message to everyone, and only those servers that can satisfy the request respond.
As a system grows geographically, the communication medium places limits on the system's performance.
These limits must be considered when deciding how best to access a remote resource.
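A rough way to see why broadcast scales poorly is to count how many servers have to process each request. The sketch below is a toy in-memory model; the `Server` class, the resource names, and the figure of 100 servers are assumptions picked for illustration.

```python
class Server:
    def __init__(self, name, resources):
        self.name = name
        self.resources = set(resources)
        self.received = 0  # messages this server had to process

    def handle(self, resource):
        self.received += 1
        return self.name if resource in self.resources else None


def point_to_point(servers, target, resource):
    """The client already knows which server to ask: one message is processed."""
    reply = servers[target].handle(resource)
    return [reply] if reply else []


def broadcast(servers, resource):
    """Every server processes the message; only capable ones reply."""
    return [r for s in servers.values() if (r := s.handle(resource))]


servers = {f"s{i}": Server(f"s{i}", []) for i in range(100)}
servers["s42"].resources.add("printer")

print(point_to_point(servers, "s42", "printer"))
print(sum(s.received for s in servers.values()))  # messages processed so far

print(broadcast(servers, "printer"))
print(sum(s.received for s in servers.values()))  # grows by the number of servers
```

Point-to-point costs the same regardless of system size but requires the client to know the right server, which is where the naming service comes in. Broadcast needs no such knowledge, but its cost grows with the number of servers.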
File Systems
• The file system provides an excellent example of a service affected by scale.
• It is heavily used, and it requires the transfer of large amounts of data.
• With distribution, files are spread across many servers, and each server only processes requests for the files that it stores.
• With replication, files are assigned to multiple servers, and clients contact a subset of the servers when making requests.
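The two placement strategies for files can be sketched as follows. Hash-based placement is an assumption made for this example (the slides do not say how files are assigned), and the server names and function names are hypothetical.

```python
import hashlib

SERVERS = ["fs0", "fs1", "fs2", "fs3"]  # hypothetical file servers


def _slot(path, n):
    """Map a path to a stable slot in range(n) using a cryptographic hash."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n


def home_server(path, servers=SERVERS):
    """Spread files across servers: each file is stored on exactly one server."""
    return servers[_slot(path, len(servers))]


def replica_servers(path, copies=2, servers=SERVERS):
    """Assign each file to `copies` consecutive servers, starting at its home."""
    start = _slot(path, len(servers))
    return [servers[(start + i) % len(servers)] for i in range(copies)]


path = "/home/alice/notes.txt"
print(home_server(path))
print(replica_servers(path, copies=2))
```

With a single home server, a request for a file touches only that server. With several copies, a client can contact any subset of the returned servers, for example one for a read.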
Solutions for scalability
• Shed load, but not too much.
• Avoid global broadcast.
• Support multiple access mechanisms.
• Keep the user in mind.
• Building and evaluating scaling techniques:
– Replication
– Distribution
– Caching
Scaling Techniques
• Replication –
– Replicate important resources.
– Distribute the replicas.
– Use loose consistency.
• Distribution –
– Distribution can be used as a solution to size scalability.
– Distribution means taking a component, splitting it into parts, and spreading those parts among the nodes of the system (Ex: DNS).
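"Use loose consistency" can be illustrated with a store whose replicas are updated lazily. This is a simplified sketch under assumptions that are not in the slides: a single primary copy, an explicit `propagate` step, and no failures.

```python
class LooselyConsistentStore:
    """Writes go to a primary copy; replicas catch up only when propagate() runs."""

    def __init__(self, n_replicas):
        self.primary = {}
        self.replicas = [{} for _ in range(n_replicas)]
        self.pending = []  # updates not yet sent to the replicas

    def write(self, key, value):
        self.primary[key] = value
        self.pending.append((key, value))

    def read(self, key, replica):
        # Reads are served by a replica and may return stale (or no) data.
        return self.replicas[replica].get(key)

    def propagate(self):
        """Apply pending updates to every replica, in the order they were written."""
        for key, value in self.pending:
            for replica in self.replicas:
                replica[key] = value
        self.pending.clear()


store = LooselyConsistentStore(n_replicas=3)
store.write("motd", "hello")
print(store.read("motd", replica=1))  # replica has not seen the write yet
store.propagate()
print(store.read("motd", replica=1))  # replica has caught up
```

The trade is that reads can be served by any replica without coordinating on every write, in exchange for a window during which replicas return old data.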
Scaling Techniques
• Caching –
– Cache frequently accessed data.
– Consider access patterns when caching.
– Use cache timeouts.
– Cache at multiple levels.
– Look first locally.
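The caching guidelines (timeouts, multiple levels, look first locally) can be combined in one small sketch. The `TTLCache` class, the `lookup` helper, and the two-level arrangement are hypothetical names and choices made for this example. The injectable `clock` parameter exists only to make the timeout behaviour easy to demonstrate.

```python
import time


class TTLCache:
    """Cache whose entries expire `ttl` seconds after they are stored."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self.entries = {}  # key -> (value, expiry time)

    def get(self, key):
        hit = self.entries.get(key)
        if hit is None:
            return None
        value, expires = hit
        if self.clock() >= expires:  # timed out: drop the entry
            del self.entries[key]
            return None
        return value

    def put(self, key, value):
        self.entries[key] = (value, self.clock() + self.ttl)


def lookup(key, levels, fetch):
    """Check caches from nearest to farthest, then fall back to the origin."""
    for i, cache in enumerate(levels):
        value = cache.get(key)
        if value is not None:
            for nearer in levels[:i]:  # refill the closer levels
                nearer.put(key, value)
            return value, f"cache level {i}"
    value = fetch(key)
    for cache in levels:
        cache.put(key, value)
    return value, "origin"


local = TTLCache(ttl=5)   # e.g. on the client machine
site = TTLCache(ttl=60)   # e.g. shared by a site
print(lookup("a", [local, site], fetch=str.upper))
print(lookup("a", [local, site], fetch=str.upper))
```

The first call goes to the origin and fills both levels. The second is answered by the local cache. Choosing the timeouts is the "consider access patterns" part: short timeouts suit data that changes often, long ones suit data that is read far more than it is written.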
Conclusion
• A distributed system must be designed such that it is scalable.
• To build a successful, long-lasting distributed system, we need to face all the problems discussed so far and find solutions for them.
• Once again, the problems discussed here are a single drop in an ocean full of problems.