Post on 20-Dec-2015

University of Colorado – ECEN5043 SW Eng of Multiprogram Systems

Getting started

• Introduce me

• Change of text motivated by feedback

• Course overview (next slide)

Course overview – not in order

• SW Engineering of multiprogram systems
  – Multiprogram systems
    • Single CPU, shared resources
    • Examples
  – Real-time scheduling – view as single CPU plus peripheral processors
  – Performance Engineering
    • Simulation tool – evaluation copies

ECEN5043 Software Engineering of Multi-program Systems

University of Colorado, Boulder

2nd course in the Software Engineering Certificate series

Introduction

• An intro to some ideas

• A bit of history

• More ideas

• Some problems to think about

Sequential debugging

• See Fig 1.1 (next slide) – from Concurrent Programming by M. Ben-Ari – an interchange sort program

• The program is sequential

• How do we debug a sequential program?
  – trace – execute one statement at a time
  – breakpoints and snapshots – suspend execution and check values of variables

• This is a simplistic sort but let’s improve it by executing portions of the sort in parallel

Ben-Ari -- Figure 1.1

Sequential --> two halves

• Suppose for n = 10, the numbers are 4,2,7,6,1,8,5,0,3,9

• Divide the array into two halves

• Have 2 people (programs) sort the two halves simultaneously

• Merge the two sorted halves

Efficiency?

• In the inner loop of an interchange sort, there are $(n-1) + (n-2) + \dots + 1 = n(n-1)/2$ comparisons. This is approximately $n^2/2$.

• To sort half the elements requires only $(n/2)^2/2 = n^2/8$ comparisons.

• The parallel algorithm can perform the entire sort in two times $n^2/8$ comparisons, which is $n^2/4$ comparisons (plus another $n$ comparisons to do the merge). See next slide for table.

• Additional savings can be achieved if the two sorts are performed “simultaneously”
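
The comparison counts above can be checked with a small experiment. The following is a minimal Python sketch, not Ben-Ari's Pascal program: the function names (`interchange_sort`, `merge`, `sort_whole`, `sort_in_halves`) are made up for illustration, and the sort is a plain interchange sort instrumented to count comparisons.

```python
def interchange_sort(a):
    """Sort list a in place and return the number of comparisons made."""
    comparisons = 0
    n = len(a)
    for i in range(n - 1):
        for j in range(i + 1, n):
            comparisons += 1
            if a[j] < a[i]:
                a[i], a[j] = a[j], a[i]
    return comparisons


def merge(left, right):
    """Merge two sorted lists; return (merged list, comparisons made)."""
    merged, comparisons = [], 0
    i = j = 0
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, comparisons


def sort_whole(data):
    """Sort everything with one interchange sort."""
    a = list(data)
    return a, interchange_sort(a)


def sort_in_halves(data):
    """Sort each half separately, then merge the two sorted halves."""
    mid = len(data) // 2
    left, right = list(data[:mid]), list(data[mid:])
    count = interchange_sort(left) + interchange_sort(right)
    merged, merge_count = merge(left, right)
    return merged, count + merge_count


if __name__ == "__main__":
    numbers = [4, 2, 7, 6, 1, 8, 5, 0, 3, 9]
    print(sort_whole(numbers))
    print(sort_in_halves(numbers))
```

For the ten numbers from the earlier slide, the whole-array sort makes n(n-1)/2 = 45 comparisons, while the two half-sorts make 10 each and the merge adds fewer than n more, even though everything here still runs sequentially.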

Number of comparisons performed by sort

See Figure 1.3 Sequential halves version

• See Figure 1.3 from Ben-Ari’s book to see a sequential algorithm for this program.

• (It’s written in Pascal which is almost like pseudo-code in readability – if there is a notation you don’t understand, ask!)

• Suppose two processors are available. If the language had a construct that let the program use both when both were available, the same program could be run either sequentially or in parallel.

Sequential Halves, part 1, example, Fig. 1.3

Sequential Halves, part 2, example, Fig. 1.3

Concurrent

• Concurrent processes have the potential for parallel execution

• An algorithm can be improved by identifying procedures that may be executed concurrently.

• Greatest improvement is under true parallel execution

• Still, the concurrent algorithm is superior to the sequential algorithm even if the concurrent one is executed sequentially.

Figure 1.4 Concurrent main program
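
The figure itself is not reproduced in this transcript. As a rough stand-in, here is a Python sketch of the same idea: sort the two halves between a "cobegin" and a "coend", then merge. This is an assumption about the shape of the program, not Ben-Ari's Pascal; the names `interchange_sort`, `merge`, and `concurrent_sort` are hypothetical, and threads are used only to express potential parallelism (in CPython they will not necessarily run on two processors at once).

```python
import threading


def interchange_sort(a, lo, hi):
    """Sort the slice a[lo:hi] in place."""
    for i in range(lo, hi - 1):
        for j in range(i + 1, hi):
            if a[j] < a[i]:
                a[i], a[j] = a[j], a[i]


def merge(left, right):
    """Merge two sorted lists into a new sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def concurrent_sort(data):
    a = list(data)
    mid = len(a) // 2
    # "cobegin": the two sorts work on disjoint halves, so they may overlap.
    first = threading.Thread(target=interchange_sort, args=(a, 0, mid))
    second = threading.Thread(target=interchange_sort, args=(a, mid, len(a)))
    first.start()
    second.start()
    # "coend": wait for both sorts to finish before going on.
    first.join()
    second.join()
    # The merge runs only after both halves are completely sorted.
    return merge(a[:mid], a[mid:])


if __name__ == "__main__":
    print(concurrent_sort([4, 2, 7, 6, 1, 8, 5, 0, 3, 9]))
```

The two sorts touch disjoint parts of the array, which is why no further synchronization is needed between them.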

Concurrent programming

• Programming notations and techniques for
  – expressing potential parallelism
  – solving the resulting synchronization and communication problems

• Implementation of parallelism is a separate topic

• Basic problem – which activities may be done concurrently?
  – merge is not ok to do concurrently with the sortings
  – unless there was a way to synchronize its execution with the sorts
  – while count1 < middle do ...

Synchronizing the merge

• while count1 < middle do
    wait until
      i of procedure call sort(1,n) is greater than count1 and
      i of procedure call sort(n+1, two_n) is greater than count2
    and ONLY then:
      if a[count1] < a[count2] then ...

• What would happen if we put the merge inside the cobegin/coend?
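
One way to picture a merge that does run alongside the sorts is sketched below in Python. It is an illustration under assumptions, not the course's solution: each sort publishes how many positions of its half are final (the "i" in the wait condition), and the merge blocks until the element it needs is final. The `Progress` class and all function names are hypothetical.

```python
import threading


class Progress:
    """Number of finalized positions in one half, guarded by a condition."""

    def __init__(self):
        self.done = 0
        self.cond = threading.Condition()

    def advance(self, new_done):
        with self.cond:
            self.done = new_done
            self.cond.notify_all()

    def wait_beyond(self, k):
        """Block until more than k positions are final."""
        with self.cond:
            self.cond.wait_for(lambda: self.done > k)


def sort_half(a, lo, hi, progress):
    """Interchange sort of a[lo:hi]; after pass i, a[i] never moves again."""
    for i in range(lo, hi):
        for j in range(i + 1, hi):
            if a[j] < a[i]:
                a[i], a[j] = a[j], a[i]
        progress.advance(i - lo + 1)


def synchronized_merge(a, mid, p1, p2):
    """Merge the halves, waiting for each element to become final first."""
    out = []
    c1, c2 = 0, mid
    while c1 < mid and c2 < len(a):
        p1.wait_beyond(c1)
        p2.wait_beyond(c2 - mid)
        if a[c1] <= a[c2]:
            out.append(a[c1])
            c1 += 1
        else:
            out.append(a[c2])
            c2 += 1
    while c1 < mid:  # drain whatever is left of the first half
        p1.wait_beyond(c1)
        out.append(a[c1])
        c1 += 1
    while c2 < len(a):  # drain whatever is left of the second half
        p2.wait_beyond(c2 - mid)
        out.append(a[c2])
        c2 += 1
    return out


def sort_with_concurrent_merge(data):
    a = list(data)
    mid = len(a) // 2
    p1, p2 = Progress(), Progress()
    sorters = [
        threading.Thread(target=sort_half, args=(a, 0, mid, p1)),
        threading.Thread(target=sort_half, args=(a, mid, len(a), p2)),
    ]
    for t in sorters:
        t.start()
    result = synchronized_merge(a, mid, p1, p2)  # overlaps with the sorts
    for t in sorters:
        t.join()
    return result


if __name__ == "__main__":
    print(sort_with_concurrent_merge([4, 2, 7, 6, 1, 8, 5, 0, 3, 9]))
```

Without the two `wait_beyond` calls, the merge could read an element that a sort is still going to move, which is the hazard the question above is pointing at.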

Quality

• Parallelism can improve not only performance but also quality

• Consider:
  – Read 100-character messages and print the contents on 125-character lines.
  – However, runs of n = 1 to 9 blanks are to be replaced by a single blank followed by the numeral n. (Runs of blanks may start in one message and continue on another.)

• Harder to write as a sequential program (although of course you can do it)

3-program solution

1. Read message lines and write them to a temporary file

2. Read that character stream and modify runs of blanks, write to a second temporary file

3. Read the second temporary file and print lines of 125 characters each

• Not acceptable because of the file overhead

• If run concurrently, with communication paths established between them, the programs would be efficient
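
A sketch of the concurrent version, in Python, with queues standing in for the "communication paths". Several details are assumptions made for the example and are not in the problem statement: the `END` sentinel, the queue sizes, the function names, and the choice to split a run longer than 9 blanks into chunks of at most 9.

```python
import queue
import threading

END = None  # sentinel marking end of stream (an assumption of this sketch)


def emit_chars(messages, out_q):
    """Stage 1: turn messages into a stream of characters."""
    for msg in messages:
        for ch in msg:
            out_q.put(ch)
    out_q.put(END)


def encode_blank_runs(in_q, out_q, max_run=9):
    """Stage 2: replace a run of n blanks by one blank and the numeral n."""
    run = 0

    def flush():
        nonlocal run
        if run:
            out_q.put(" ")
            out_q.put(str(run))
            run = 0

    while True:
        ch = in_q.get()
        if ch is END:
            flush()
            out_q.put(END)
            return
        if ch == " ":
            run += 1
            if run == max_run:  # assumed handling of runs longer than 9
                flush()
        else:
            flush()
            out_q.put(ch)


def build_lines(in_q, lines, width=125):
    """Stage 3: collect characters into lines of the given width."""
    buf = []
    while True:
        ch = in_q.get()
        if ch is END:
            break
        buf.append(ch)
        if len(buf) == width:
            lines.append("".join(buf))
            buf.clear()
    if buf:
        lines.append("".join(buf))


def run_pipeline(messages, width=125):
    q1, q2 = queue.Queue(maxsize=64), queue.Queue(maxsize=64)
    lines = []
    stages = [
        threading.Thread(target=emit_chars, args=(messages, q1)),
        threading.Thread(target=encode_blank_runs, args=(q1, q2)),
        threading.Thread(target=build_lines, args=(q2, lines, width)),
    ]
    for t in stages:
        t.start()
    for t in stages:
        t.join()
    return lines


if __name__ == "__main__":
    print(run_pipeline(["ab  ", "  cd e"]))
```

Each stage stays as simple as its file-based counterpart; the run of blanks that spans the two messages is handled naturally because stage 2 sees one continuous character stream.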

3-program solution picture

[Figure: 100-character msg file → convert msgs to chars → temporary file of characters → strip blanks → de-blanked chars → print 125-char lines → line-formatted version]

Sequence Diagram

[Diagram: objects :messageMgr, :charsStore, :linesMgr; messages conv_to_chars(msg), strip_runs_of_blanks, conv_to_lines(125chars)]

What are some characteristics of the classes that would have to be true for this to work?

What makes concurrent programming hard?

• Generally concurrent programming is harder than sequential programming because it’s hard to ensure the program is correct

• One can analyze the correctness of the component parts of the software but their execution may be interleaved

• So what??

Interleaving

• Suppose concurrent program P consists of two processes P1 and P2.

• P executes any one of the execution sequences that can be obtained by interleaving the execution sequences of the two processes.

• Can be as random as flipping a coin to decide if the next instruction will be from P1 or from P2
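
To make the definition concrete, the Python sketch below enumerates every interleaving of two short instruction sequences. The function name `interleavings` and the instruction labels are made up for the example; the only constraint is the one stated above, that each process's own instructions stay in order.

```python
from math import comb


def interleavings(p1, p2):
    """Yield every merge of p1 and p2 that keeps each sequence's own order."""
    p1, p2 = tuple(p1), tuple(p2)
    if not p1:
        yield p2
        return
    if not p2:
        yield p1
        return
    # The next instruction comes either from P1 ...
    for rest in interleavings(p1[1:], p2):
        yield (p1[0],) + rest
    # ... or from P2.
    for rest in interleavings(p1, p2[1:]):
        yield (p2[0],) + rest


if __name__ == "__main__":
    p1 = ["a1", "a2"]
    p2 = ["b1", "b2"]
    for sequence in interleavings(p1, p2):
        print(sequence)
    # With m and n instructions there are C(m + n, m) interleavings.
    print(comb(len(p1) + len(p2), len(p1)))
```

Two processes of two instructions each already give 6 possible execution sequences, and the count grows very quickly with program length.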

Reasoning about concurrent programs

• The result of an individual instruction on any given data cannot depend upon the circumstances of its execution

• Only the external behavior of the system may change
  – depending on the interaction of the instructions through the common data

• Interleaved execution sequences can be huge in number but they are approachable with formal methods in order to demonstrate correctness.
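
A toy illustration in Python, under assumptions chosen for the example: two processes each increment a shared variable x using three instructions (load, add, store), and every interleaving is executed exhaustively. Each instruction always does the same thing; only the final value of x depends on how the instructions interact through the shared data. Exhaustive enumeration like this is only feasible for tiny programs, which is where formal methods come in.

```python
def interleavings(p1, p2):
    """Yield every merge of p1 and p2 that keeps each sequence's own order."""
    p1, p2 = tuple(p1), tuple(p2)
    if not p1:
        yield p2
        return
    if not p2:
        yield p1
        return
    for rest in interleavings(p1[1:], p2):
        yield (p1[0],) + rest
    for rest in interleavings(p1, p2[1:]):
        yield (p2[0],) + rest


def run(schedule):
    """Execute one interleaving and return the final value of shared x."""
    shared_x = 0
    register = {"P1": 0, "P2": 0}  # each process has a private register
    for process, op in schedule:
        if op == "load":
            register[process] = shared_x
        elif op == "add":
            register[process] += 1
        elif op == "store":
            shared_x = register[process]
    return shared_x


def final_values():
    """Collect the final x over all interleavings of two increments."""
    steps = ("load", "add", "store")
    p1 = [("P1", op) for op in steps]
    p2 = [("P2", op) for op in steps]
    return {run(schedule) for schedule in interleavings(p1, p2)}


if __name__ == "__main__":
    print(final_values())
```

The set contains both 1 and 2: whenever both loads happen before either store, one increment is lost.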

A Bit of History ... about Operating Systems

• Concurrent programming grew out of problems associated with operating systems

• 1950’s – you signed up to use the computer and wasted everyone’s time if you stopped to figure out a problem. What is optimized?

• Next generation: “supervisor program” batched jobs; an operator sat at a console. Programmers’ card batches were fed in hourly. If one “died”, the next one was started up right away. What is optimized?

Efficiency ... sort of

• Batched jobs were a more efficient use of the computer but ...

• an inefficient use of the programmers who could no longer track their program dynamically

• If a program failed, you got your cards and a dump to look at... ... eventually ...

• Dramatic INCREASE in computer throughput! So what’s the problem??

New type of inefficiency

• Suppose a computer executed one million instructions per second, connected to a card reader reading 300 cards per minute (5/sec).

• From the time the read instruction was executed until the card was read, 200,000 instructions could have been executed ... but weren’t.

• A program to read the cards and calculate the average of the punched numbers spent 99% of its time doing nothing.
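
The arithmetic behind these figures, as a small Python sketch. The function names are hypothetical, and the figure of 2,000 instructions of useful work per card is an assumption chosen to reproduce the 99% on the slide; it is not stated in the source.

```python
def instructions_per_card(instr_per_sec, cards_per_min):
    """Instructions the CPU could execute while one card is being read."""
    cards_per_sec = cards_per_min / 60
    return instr_per_sec / cards_per_sec


def idle_fraction(instr_per_sec, cards_per_min, work_per_card):
    """Fraction of CPU time wasted if work_per_card instructions are useful."""
    available = instructions_per_card(instr_per_sec, cards_per_min)
    return 1 - work_per_card / available


if __name__ == "__main__":
    # 1,000,000 instructions/sec and 300 cards/min (5 cards/sec).
    print(instructions_per_card(1_000_000, 300))
    # Assumed: about 2,000 instructions to process one card.
    print(idle_fraction(1_000_000, 300, 2_000))
```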

Solution # 1 ... long time ago

• Spooling

• Decomposed the operation of card-reading into 3 steps:
  – read cards to tape
  – read programs from tape into computer to execute, writing results to a second tape
  – print information from the second tape

• Disjoint processes yielded increased throughput by running the processes on separate computers (very simple ones to read/write tape) to minimize usage of the expensive main one

More modern approach -- multiprogramming

• Switch the processor among several computations whose programs and data are held simultaneously in memory.

• We call this multiprogramming

• While I/O is in progress for program P1, the computer will execute several thousand instructions of program P2 and return to process the data obtained for P1 when it is ready or when P2 is waiting on I/O

• While user-A is typing in the form information, the computer conducts the search for user-B ...

Interleaved computation

• Interleaved computation has its roots in multiprogrammed systems

• We don’t try to deal with the global behavior of the switched computer.

• Instead, we consider the actual processor to be merely a means of interleaving the computations of several processors.

Blue Screen of Death

• Original concern was to improve throughput

• Throughput, however, was soon impacted by system crashes
  – system stopped functioning as it was supposed to
  – extensive recovery and restart measures delayed execution

• Defects caused by our inadequate understanding of how to execute several programs simultaneously -- c. 1960’s.

• This ... inadequacy ... has returned today even though the understanding is available.

Better than fiction

• Even though a computer is switching from one sequential process to another every few milliseconds, it seems to be performing the tasks simultaneously

• It is more than just a useful fiction to assume the computer is in fact performing its tasks concurrently.

• Most computers use interrupts to accomplish the above purpose ...

Typical scenario

• P1 makes a read request and suspends its execution until the read is complete

• CPU may execute P2

• When the read requested by P1 has completed, the device interrupts the execution of P2 to allow the o.s. to record the completion of the read.

• Either P1 or P2 may now be resumed.
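
The scenario can be imitated with threads, as in the Python sketch below. This is only an analogy under assumptions: a real interrupt is a hardware mechanism handled by the o.s., whereas here a hypothetical `device` thread signals completion through a `threading.Event`, and the log messages are made up for the example.

```python
import threading
import time


def scenario():
    """Run P1 (blocked on a read) and P2 (computing); return the event log."""
    log = []
    log_lock = threading.Lock()
    read_done = threading.Event()

    def record(message):
        with log_lock:
            log.append(message)

    def device():
        time.sleep(0.01)  # the slow card reader
        record("device: interrupt, read complete")
        read_done.set()

    def p1():
        record("P1: read requested, suspending")
        reader = threading.Thread(target=device)
        reader.start()
        read_done.wait()  # P1 does not use the CPU while waiting
        record("P1: resumed")
        reader.join()

    def p2():
        sum(range(10_000))  # some unrelated computation
        record("P2: finished a computation")

    threads = [threading.Thread(target=p1), threading.Thread(target=p2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


if __name__ == "__main__":
    for line in scenario():
        print(line)
```

The position of the P2 line in the log can vary from run to run; the only guaranteed ordering is request, then interrupt, then P1 resuming.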

Abstraction: o.s. executes many programs concurrently

• Interrupts occur asynchronously during execution of other programs by the CPU

• No way of predicting or coordinating the occurrence of the interrupt with the execution of any arbitrary instruction by the CPU

• The computer is executing any one of a vast number of execution sequences that may be obtained by arbitrarily interleaving the instructions of programs and I/O device handlers

Process vs. program

• Use of the term process emphasizes the fact that we need not differentiate between ordinary programs and external devices

• They are all independent processes that may need to communicate with each other

• The abstraction tries to ignore as many details of the actual application as possible

• We will study the producer-consumer problem ...

Producer-consumer problem abstraction

• This is an abstraction of
  – a program producing data for consumption by an output device
  – an input device producing data for consumption by a program
  – or a program that modifies data through a series of processing steps that can run concurrently

• Problems/requirements are the same
  – synchronization
  – communication

• But we assume each process is sequential

Rule Number One

• It is always possible to refine the description of a system until it is given in terms of sequential processes.

Concurrent programming paradigm

• Applicable to wide range of systems, not just operating systems

• Every computer (including embedded processors) is executing programs that can be considered to be interleaved concurrent processes
  – example: real-time system expected to concurrently absorb and process dozens of different asynchronous external signals and operator commands

• Abstract concurrency that models switched systems is not well understood

• Distributed processing adds a new wrinkle ... next semester

What’s Coming

• Context: writing correct software

• We will look at what correctness means in programs that never “finish” (do forever)

• Look at these issues w.r.t. concurrent programming
  – Synchronization
  – Communication

• The problems are subtle and ignoring details can lead to “spectacular” bugs!

Tying it together

• Back to our simple program to convert msgs to lines of characters without runs of blanks

• Look at these issues w.r.t. concurrent programming
  – Synchronization
  – Communication

• In the (incomplete) sequence diagram,
  – who needs to synchronize
  – who needs to communicate

Sequence Diagram

[Diagram: objects :messageMgr, :charsStore, :linesMgr; messages conv_to_chars(msg), strip_runs_of_blanks, conv_to_lines(125chars)]

What are some characteristics of the classes that would have to be true for this to work?

More

• Mutual exclusion – processes competing for single resource that requires only one access at a time

• Deadlock -- each process waiting for another to do something before proceeding

• Avoid unnecessary delay

• Synchronization primitives and how they are implemented in a few environments

• Semaphore concept develops into variations
  – Producer-consumer solution
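
As a preview of where these topics lead, here is one common textbook shape of a semaphore-based producer-consumer solution, sketched in Python. It is an illustration under assumptions, not necessarily the version the course develops: the `BoundedBuffer` class, the buffer capacity, and the `demo` function are hypothetical.

```python
import threading
from collections import deque


class BoundedBuffer:
    """Fixed-capacity buffer shared by one or more producers and consumers."""

    def __init__(self, capacity):
        self.items = deque()
        self.mutex = threading.Lock()  # mutual exclusion on the buffer
        self.free_slots = threading.Semaphore(capacity)  # producer waits if full
        self.filled_slots = threading.Semaphore(0)  # consumer waits if empty

    def put(self, item):
        self.free_slots.acquire()  # synchronization: wait for room
        with self.mutex:
            self.items.append(item)
        self.filled_slots.release()  # signal: one more item available

    def get(self):
        self.filled_slots.acquire()  # synchronization: wait for data
        with self.mutex:
            item = self.items.popleft()
        self.free_slots.release()  # signal: one more free slot
        return item


def demo(n_items=20, capacity=3):
    """Pass n_items from a producer to a consumer through a small buffer."""
    buffer = BoundedBuffer(capacity)
    received = []

    def producer():
        for i in range(n_items):
            buffer.put(i)

    def consumer():
        for _ in range(n_items):
            received.append(buffer.get())

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received


if __name__ == "__main__":
    print(demo())
```

The two counting semaphores handle synchronization (neither side runs ahead of the other by more than the buffer capacity), the lock handles mutual exclusion, and the buffer itself is the communication path.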