
Page 1: FIT5174 Distributed & Parallel Systems

Lecture 11: Unit Summary and Test Preparation

Dr. Ronald Pose, 2013

Page 2: Why did we study parallel and distributed systems?

• Traditional "centralised" computing has largely been displaced by distributed computing over the last 20 years; centralised mainframes are used much less.
• Software applications are increasingly developed for distributed computing environments, and older applications are being ported to such environments.
• Parallel hardware is becoming more common, with multi-core processors and GPUs in consumer machines.
• For a software application to both perform well and be reliable, it must be designed around the environments in which it is expected to operate, that is, parallel and distributed computing systems.
• Programmers and software architects therefore need a good understanding of parallel and distributed computing environments to be effective and successful in current and future industry.
• Many badly behaved applications in parallel and distributed environments are the result of insufficient understanding by programmers.

Page 3: What is a distributed system?

• A distributed computer system is a collection of computers, interconnected by computer networks and communicating by passing messages, that appears to the user as a single coherent system.
• Potential advantages of distributed systems:
– Increased reliability, since applications can be designed to run correctly even when not all of the distributed computers are working properly
– Incremental growth in system capacity by adding extra distributed computers
– Sharing of common resources
– Cost savings, since multiple smaller, cheaper computers may be cheaper overall than a large central computer
– Parallel operation of distributed components
• Potential disadvantages of distributed systems:
– Reliance on the computer network for correct operation
– More difficult to program correctly
– Management and security difficulties
– Upgrade and debugging difficulties

Page 4: What is a parallel system?

• A parallel computer system has multiple components that run at the same time.
• These parallel components may be hardware and/or software.
• Potential advantages of parallel systems:
– Increased performance, since multiple things happen at once
– Increased reliability, since concurrent redundant operations allow some components to fail without overall failure
– Reduced time to job completion
– More jobs run per hour
– Possibly the only way to meet tight real-time computing deadlines
– Parallel operation of distributed components
• Potential disadvantages of parallel systems:
– More difficult to program correctly
– More expensive hardware
– More difficult to predict system behaviour and performance
– Needs specialized software to take advantage of the parallelism

Page 5: Distributed system challenges

• Heterogeneity - different networks, hardware, operating systems, software etc. make it difficult to produce standard, portable programs.
• Openness - it can be difficult to add new distributed components unless all the interfaces are very well specified.
• Transparency - it can be difficult to make the distributed system appear transparently as a single system.
• Performance - it can be difficult to achieve the potential performance of a distributed system.
• Scalability - it is not always easy to scale up to large distributed systems.
• Failure handling - failures are harder to deal with because they are distributed and more likely overall.
• Security - a distributed system by its nature offers more opportunities for security problems in its many components.
• Concurrency - concurrent operation of distributed components can lead to interference between subsystems.

Page 6: Parallel system challenges

• Heterogeneity - different processor core types, different parallel architectures, different software architectures, etc. make it difficult to produce standard, portable programs.
• Performance - it can be difficult to achieve the potential performance of a parallel system.
• Scalability - it is not always easy to scale up to highly parallel systems.
• Concurrency - concurrent operation of parallel components can lead to unexpected and undesirable interactions between them.
• Dealing with both fine-grained and coarse-grained parallelism in ways that maximize use of available resources is difficult.
• Efficient use of parallel systems is difficult.

Page 7: What background material did we need to cover?

• Networks and data communication - distributed systems obviously rely on computer networks, and overall distributed system performance and behaviour depend on those of the underlying networks.
• Computer architecture - to achieve parallelism within a computer one needs a suitably parallel hardware architecture, and to program parallel computers effectively one must be aware of the underlying computer architecture.
• Operating systems - to run and control parallel and distributed programs one must make use of the available operating system facilities. Not all operating systems can support all kinds of distributed and parallel systems.
• Memory hierarchy - data storage, virtual memory and multi-level caching must all be considered in the design of parallel and distributed systems, since concurrent operations can easily disturb the behaviour of the memory hierarchy and hence degrade system performance.
• Programming paradigms - to build parallel and distributed systems one needs to choose appropriate programming systems and models.

Page 8: Special purpose versus general purpose

• Both parallel and distributed systems can be designed to solve particular problems or problem domains, or they may be general-purpose systems.
• GPUs, for instance, began as special-purpose co-processors doing computer graphics in parallel, but have now evolved to be much more general purpose.
• Multiple cores, on the other hand, are a more general-purpose mechanism.
• We looked at a highly specialized, highly parallel computer system designed to solve a difficult problem in virtual reality systems that conventional computers could not handle, even highly parallel general-purpose computers.
• We also looked at a general-purpose parallel and distributed computer system designed to facilitate a variety of programming paradigms while providing convenient ways to share data and keep things secure.

Page 9: Parallel versus distributed

• A parallel computer can operate on many levels:
– Parallel operation within an instruction
– Parallel execution of multiple instructions
– Parallel execution of multiple processes
– Parallel operation of multiple different execution units
– Parallel operation of multiple ALUs, registers, buses etc.
– The components of a parallel system may or may not be distributed
• A distributed computer system may run in parallel, where multiple distributed components operate concurrently, or it may distribute function but not operate in parallel.
– For instance, in client-server computing a client may ask a server to do something and simply wait for the answer.
– Alternatively, such a system might ask many servers to do work concurrently, and the client itself may work in parallel, doing something else while waiting for the servers to complete their tasks.
• So while many parallel computers are not distributed and many distributed computer systems do not run in parallel, many such systems are both parallel and distributed.

Page 10: Distributed computing models

• Client-server
– Via message passing
– Via remote procedure calls (RPC)
– Via object-oriented approaches such as CORBA
• Clusters
• Grids
• Clouds
• Interprocess communication
– Message passing
– Distributed shared memory
– RPC
– HTTP (World Wide Web protocols)

Page 11: Parallel processing models

• Parallel applications
• Multiple serial applications
• Coarse-grained systems
– Symmetrical multiprocessors
• Medium-grained systems
– Clusters
• Fine-grained systems
– Massively parallel processing
• Parallel and distributed data storage
• Shared memory
• Message passing
• Hybrid shared memory / message passing
• One-to-one, one-to-many, many-to-one, many-to-many communication

Page 12: Synchronization

• Synchronization is essential to coordinate operations that might run in parallel, or when sharing resources.
• Depending on whether the system is distributed or not, one has various options.
• Shared memory variables may have to be protected by locking or semaphores (a minimal example is sketched below this list).
• Message passing can be used both for communication and for synchronization.
• RPC is a form of synchronized communication.
• Clock synchronization may be important in order to coordinate and establish ordering in distributed systems.
• Mutual exclusion and transactions may be important in parallel and distributed systems - we can use techniques borrowed from operating systems and database management systems.
• Deadlock and other problems can be introduced when synchronization is poorly designed or implemented.
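As revision, here is a minimal sketch (not from the slides themselves) of protecting a shared variable with a lock using POSIX threads; the counter name and thread count are arbitrary choices for illustration:

```c
#include <pthread.h>
#include <stdio.h>

/* Shared state: a counter protected by a mutex. */
static long counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&counter_lock);   /* enter critical section */
        counter++;                           /* safe: one thread at a time */
        pthread_mutex_unlock(&counter_lock); /* leave critical section */
    }
    return NULL;
}

int main(void)
{
    pthread_t t[4];
    for (int i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    /* Without the mutex the increments would race and the final
     * value would be unpredictable. */
    printf("counter = %ld\n", counter);
    return 0;
}
```

Compile with `gcc -pthread`; removing the lock/unlock pair demonstrates the lost-update problem the slide alludes to.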


Page 13: Processes versus threads

• Parallel systems at a medium or coarse grain have multiple pieces of code executing concurrently.
• This can take the form of multiple processes, running the same or different programs, interacting and communicating to form a parallel or distributed system.
• The processes can communicate by sending messages over the network, or via shared memory segments if the hardware and operating system support them.
• Threads are often thought of as light-weight processes: they are part of a single program, in effect procedures or functions that run concurrently and share global variables and data structures.
• A set of cooperating processes is more flexible, in that it can easily be distributed and can run a variety of software, but it has higher overheads. (The contrast is sketched below.)
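To make the process/thread contrast concrete, here is a small sketch (an illustration, not lecture code) showing that a forked child gets its own copy of a variable, whereas threads, as in the mutex example earlier, share one address space:

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int value = 42; /* each process gets its own copy after fork() */

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        exit(1);
    }
    if (pid == 0) {
        /* Child: modifying value here does not affect the parent,
         * because fork() duplicated the whole address space. */
        value = 99;
        printf("child:  value = %d\n", value);
        exit(0);
    }
    waitpid(pid, NULL, 0);
    /* The parent still sees 42; a thread, by contrast, would see 99
     * because threads share global variables. */
    printf("parent: value = %d\n", value);
    return 0;
}
```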


Page 14: Shared memory versus message passing

• Message passing is easily applied to distributed systems, whereas a distributed shared memory system in effect hides an underlying message-passing system.
• MPI is a popular programming model for parallel and distributed systems using message passing (a minimal sketch follows this list).
• Shared memory is generally only used in non-distributed systems. It tends to be easier to program, in that it is the 'conventional' way people program, using variables and data structures in memory and performing operations on them.
• Shared memory operation tends to be more efficient, provided one has suitable hardware to support it.
• Remote procedure calls can be implemented using either shared memory or message passing.
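A minimal MPI point-to-point sketch, for revision; the payload value and tag are arbitrary choices for the example:

```c
#include <mpi.h>
#include <stdio.h>

/* Build and run with e.g.: mpicc ping.c -o ping && mpirun -np 2 ./ping */
int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        int payload = 123;
        /* Rank 0 sends one integer to rank 1 with message tag 0. */
        MPI_Send(&payload, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        int payload;
        /* Rank 1 blocks until the message arrives: the receive is
         * communication and synchronization at once. */
        MPI_Recv(&payload, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", payload);
    }

    MPI_Finalize();
    return 0;
}
```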



Page 16: Flynn's classification

• SISD - single instruction, single data: a conventional single processor
• SIMD - single instruction, multiple data: a data-parallel approach
• MISD - multiple instruction, single data: not common
• MIMD - multiple instruction, multiple data: a multiprocessor

Page 17: Common useful algorithms

• Election algorithms
• Clock synchronization algorithms
• Concurrency control algorithms
– Using locks, semaphores etc.; timestamps; optimistic approaches
• Transactions
– Two-phase locking and commit operations (sketched below this list)
• Locking
– Lock granularity
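A minimal sketch of the idea behind two-phase locking, using pthread mutexes; the account structure and transfer function are invented for this illustration, not part of the unit material:

```c
#include <pthread.h>

/* Hypothetical account type, invented for this example. */
struct account {
    pthread_mutex_t lock;
    long balance;
};

/* Two-phase locking: acquire every lock the transaction needs (the
 * growing phase), do the work, then release them all (the shrinking
 * phase). Taking locks in a fixed order (here, by address) also
 * avoids deadlock between concurrent transfers. */
void transfer(struct account *from, struct account *to, long amount)
{
    struct account *first  = (from < to) ? from : to;
    struct account *second = (from < to) ? to : from;

    /* Growing phase: take all locks before touching shared data. */
    pthread_mutex_lock(&first->lock);
    pthread_mutex_lock(&second->lock);

    from->balance -= amount;
    to->balance   += amount;

    /* Shrinking phase: release; no new locks may be taken after this. */
    pthread_mutex_unlock(&second->lock);
    pthread_mutex_unlock(&first->lock);
}
```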


Page 18: Failure types

• Various kinds of failure, and how one can deal with them
• Fault-tolerant systems
• Graceful degradation

Page 19: Parallel computer systems

• Parallel computer memory architectures
• Parallel programming models suiting the various parallel computer and memory architectures
• Parallel computer performance models
– Amdahl's law (restated below)


Page 20: Parallel CPU organization

• Pipelining - overlapping (concurrently performing) the various stages of executing a single instruction, in a production-line style arrangement
– Pipeline stalls
– Pipeline forwarding
– Data hazards
– Conditional branches
– Pipeline latency and throughput
• Superscalar - having multiple ALUs operating concurrently, running multiple instructions in parallel
– Instruction-level parallelism
– Depends on there being no dependency between the instructions running in parallel
• Pipelined superscalar machines are possible
• Vector computers
• Massively parallel computer architectures
• Problems (a source-level example of dependency follows this list):
– Data dependency
– Procedural dependency
– Resource conflicts
– Anti-dependency
• Out-of-order execution; VLIW explicit parallel execution
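As a revision aid (not from the slide), data dependency can be seen at source level: the first loop below carries a dependency between iterations, so the hardware cannot overlap them, while the second has none:

```c
#include <stddef.h>

/* Each iteration reads the result of the previous one: a true
 * (read-after-write) data dependency, so neither a superscalar core
 * nor a parallelising compiler can overlap the iterations. */
void prefix_sum(double *a, size_t n)
{
    for (size_t i = 1; i < n; i++)
        a[i] = a[i] + a[i - 1];
}

/* Iterations are independent: the hardware can issue several at once
 * (instruction-level parallelism) and the loop can be vectorised. */
void scale(double *a, size_t n, double k)
{
    for (size_t i = 0; i < n; i++)
        a[i] = a[i] * k;
}
```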


Page 21: Parallel organization

• Speculative execution - run multiple instructions in parallel just in case.
– E.g. both sides of a branch may be run in parallel until you decide which branch to take, then the results you don't need are ignored.
– Perhaps run multiple different programs in parallel to achieve the same result and use the first one to finish.
– These approaches might be wasteful, but may be acceptable if there are spare hardware resources and speed is important.
• Different computer interconnection topologies:
– Mesh
– Hypercube
– Bus
– Tree
– Graphs
• Special-purpose co-processors such as GPGPUs, vector processors, signal processors, etc.

Page 22: Storage hierarchy and caches

• Consider the storage hierarchy.
• Virtual memory versus physical memory.
• Hierarchy:
– Disk
– Memory
– Cache (possibly multiple levels of cache)
– Registers
• How should programs and data be designed to suit the memory hierarchy and achieve maximum performance? (One classic example follows this list.)
• How does parallel processing affect memory hierarchy performance?
• How should one optimise parallel programs to enable good memory performance?
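One standard illustration of designing code around the memory hierarchy (an example of the kind of issue the slide raises, not something from the lecture): C stores 2-D arrays row-major, so traversing row by row keeps successive accesses within the same cache lines, while column order touches a new line on nearly every access:

```c
#define N 1024

/* Row-major traversal: consecutive accesses fall in the same cache
 * line, so most memory references are cache hits. */
double sum_row_order(double m[N][N])
{
    double s = 0.0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            s += m[i][j];
    return s;
}

/* Column-major traversal of the same data: each access jumps
 * N * sizeof(double) bytes, so nearly every reference misses the
 * cache and the loop can run several times slower despite doing
 * identical arithmetic. */
double sum_col_order(double m[N][N])
{
    double s = 0.0;
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            s += m[i][j];
    return s;
}
```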


Page 23: Monash multiprocessor

• An example of a large-scale shared memory multiprocessor that scales up to a distributed and parallel system with thousands of processors, each sharing memory in a controlled way.
• Its novel password-capability system integrates distributed shared persistent virtual memory, covering both data storage and processes, in a unified model.
• Has both message passing and shared memory.
• Is object-based but does not impose any particular programming paradigm.
• Medium-grained parallel hardware.
• Supports distributed operation while maintaining a single address space.

Page 24: Address recalculation pipeline

• A specialized architecture to facilitate a low-latency virtual reality display system.
• An example of a highly parallel special-purpose architecture involving a great deal of parallelism in both the hardware and the software:
– Pipelining
– Multiple processors
– Multiple memory units
– Parallel access to each memory unit
– Scalable hardware and software

Page 25: Parallel programming

• What have we done in practical programming?
– Shared memory segments in Unix (Linux) systems (see the sketch below)
– Running multiple programs sharing memory
– Forking new processes from within one program
– Pthreads: multiple concurrent threads within the one program
– MPI: multiple tasks, possibly distributed across multiple computers, communicating by message passing
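A minimal sketch combining the first two practical items: a System V shared memory segment shared between a parent and a forked child. The segment size and message text are arbitrary choices for the example:

```c
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    /* Create an anonymous shared memory segment of one page. */
    int shmid = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    char *buf = shmat(shmid, NULL, 0);   /* map it into this process */

    if (fork() == 0) {
        /* Child: the same physical memory is mapped here too, so a
         * write in the child is visible to the parent. */
        strcpy(buf, "hello from the child");
        shmdt(buf);
        return 0;
    }
    wait(NULL);
    printf("parent read: %s\n", buf);    /* sees the child's write */

    shmdt(buf);
    shmctl(shmid, IPC_RMID, NULL);       /* remove the segment */
    return 0;
}
```

Unlike the fork-only example earlier, here the explicitly shared segment behaves like thread-style shared memory between separate processes; real code would also synchronise access, e.g. with a semaphore.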


Page 26: Summary

• We have covered a great number of topics that provide background to, or indeed are fundamental to, parallel and distributed computing.
• We have looked at different parallel and distributed architectures.
• We have looked at various programming paradigms that exploit parallel and distributed architectures.
• We have considered the performance of parallel and distributed systems.
• We have also considered factors such as modes of failure, reliability and efficiency of such systems.
• We have looked at parallel algorithms and how to design them.
• We have examined the hardware of the computer system to see how parallel and distributed programs can affect its efficient operation, and to learn how to structure such programs to maximize their performance.
• We have looked at some examples of large-scale parallel systems, both general purpose and specialized.