Parallel and Distributed Computing in Education (Invited Talk)

Kent Academic Repository: full text document (pdf).

Citation for published version: Welch, Peter H. (1998) Parallel and Distributed Computing in Education (Invited Talk). In: VECPAR'98: Third International Conference on Vector and Parallel Processing - Selected Papers, 21/06/1998, Porto, Portugal.

Link to record in KAR: http://kar.kent.ac.uk/21644/


Parallel and Distributed Computing in Education (Invited Talk)

Peter H. Welch

Computing Laboratory, University of Kent at Canterbury, CT2 7NF.
P.H.Welch@ukc.ac.uk

Abstract. The natural world is certainly not organised through a central thread of control. Things happen as the result of the actions and interactions of unimaginably large numbers of independent agents, operating at all levels of scale from nuclear to astronomic. Computer systems aiming to be of real use in this real world need to model, at the appropriate level of abstraction, that part of it for which it is to be of service. If that modelling can reflect the natural concurrency in the system, it ought to be much simpler.

Yet, traditionally, concurrent programming is considered to be an advanced and difficult topic - certainly much harder than serial computing which, therefore, needs to be mastered first. But this tradition is wrong. This talk presents an intuitive, sound and practical model of parallel computing that can be mastered by undergraduate students in the first year of a computing (major) degree. It is based upon Hoare's mathematical theory of Communicating Sequential Processes (CSP), but does not require mathematical maturity from the students - that maturity is pre-engineered in the model. Fluency can be quickly developed in both message-passing and shared-memory concurrency, whilst learning to cope with key issues such as race hazards, deadlock, livelock, process starvation and the efficient use of resources. Practical work can be hosted on commodity PCs or UNIX workstations using either Java or the occam multiprocessing language. Armed with this maturity, students are well-prepared for coping with real problems on real parallel architectures that have, possibly, less robust mathematical foundations.

1 Introduction

At Kent, we have been teaching parallel computing at the undergraduate level for the past ten years. Originally, this was presented to first-year students before they became too set in the ways of serial logic. When this course was expanded into a full unit (about 30 hours of teaching), timetable pressure moved it into the second year. Either way, the material is easy to absorb and, after only a few (around 5) hours of teaching, students have no difficulty in grappling with the interactions of 25 (say) threads of control, appreciating and eliminating race hazards and deadlock.

Parallel computing is still an immature discipline with many conflicting cultures. Our approach to educating people into successful exploitation of parallel mechanisms is based upon focusing on parallelism as a powerful tool for simplifying the description of systems, rather than simply as a means for improving their performance. We never start with an existing serial algorithm and say: 'OK, let's parallelise that!'. And we work solely with a model of concurrency whose semantics is compositional - a fancy word for WYSIWYG - since, without that property, combinatorial explosions of complexity always get us as soon as we step away from simple examples. In our view, this rules out low-level concurrency mechanisms, such as spin-locks, mutexes and semaphores, as well as some of the higher-level ones (like monitors).

Communicating Sequential Processes (CSP) [1-3] is a mathematical theory for specifying and verifying complex patterns of behaviour arising from interactions between concurrent objects. Developed by Tony Hoare in the light of earlier work on monitors, CSP has a compositional semantics that greatly simplifies the design and engineering of such systems - so much so, that parallel design often becomes easier to manage than its serial counterpart. CSP primitives have also proven to be extremely lightweight, with overheads in the order of a few hundred nanoseconds for channel synchronisation (including context-switch) on current microprocessors [4, 5].

Recently, the CSP model has been introduced into the Java programming language [6-10]. Implemented as a library of packages [11, 12], JavaPP [10] enables multithreaded systems to be designed, implemented and reasoned about entirely in terms of CSP synchronisation primitives (channels, events, etc.) and constructors (parallel, choice, etc.). This allows 20 years of theory, design patterns (with formally proven good properties - such as the absence of race hazards, deadlock, livelock and thread starvation), tools supporting those design patterns, education and experience to be deployed in support of Java-based multithreaded applications.

2 Processes, Channels and Message Passing

This section describes a simple and structured multiprocessing model derived from CSP. It is easy to teach and can describe arbitrarily complex systems. No formal mathematics need be presented - we rely on an intuitive understanding of how the world works.

2.1 Processes

A process is a component that encapsulates some data structures and algorithms for manipulating that data. Both its data and algorithms are private. The outside world can neither see that data nor execute those algorithms. Each process is alive, executing its own algorithms on its own data. Because those algorithms are executed by the component in its own thread (or threads) of control, they express the behaviour of the component from its own point of view [1]. This considerably simplifies that expression.

A sequential process is simply a process whose algorithms execute in a single thread of control. A network is a collection of processes (and is, itself, a process). Note that recursive hierarchies of structure are part of this model: a network is a collection of processes, each of which may be a sub-network or a sequential process.

But how do the processes within a network interact to achieve the behaviour required from the network? They can't see each other's data nor execute each other's algorithms - at least, not if they abide by the rules.

2.2 Synchronising Channels

The simplest form of interaction is synchronised message-passing along channels. The simplest form of channel is zero-buffered and point-to-point. Such channels correspond very closely to our intuitive understanding of a wire connecting two (hardware) components.

Fig. 1. A simple network: process A connected to process B by channel c

In Figure 1, A and B are processes and c is a channel connecting them. A wire has no capacity to hold data and is only a medium for transmission. To avoid undetected loss of data, channel communication is synchronised. This means that if A transmits before B is ready to receive, then A will block. Similarly, if B tries to receive before A transmits, B will block. When both are ready, a data packet is transferred - directly from the state space of A into the state space of B. We have a synchronised distributed assignment.
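This synchronised handover can be sketched in executable form. The paper's practical work uses occam or JCSP; the following is just an illustrative Python sketch of ours (the class and method names are our own, not from any cited library), using two semaphores to force writer and reader to rendezvous before the value crosses.

```python
import threading

class Channel:
    """Zero-buffered, point-to-point channel: write() and read() each
    block until the other side arrives, then the value crosses directly."""
    def __init__(self):
        self._item = None
        self._writer_here = threading.Semaphore(0)
        self._reader_done = threading.Semaphore(0)

    def write(self, value):              # models 'out ! value'
        self._item = value
        self._writer_here.release()      # announce the data...
        self._reader_done.acquire()      # ...and block until it is taken

    def read(self):                      # models 'in ? x'
        self._writer_here.acquire()      # block until a writer arrives
        value = self._item
        self._reader_done.release()      # release the blocked writer
        return value

def demo():
    # process B reads from channel c; process A (the main thread) writes
    c = Channel()
    received = []
    b = threading.Thread(target=lambda: received.append(c.read()))
    b.start()
    c.write(42)      # A transmits; this blocks until B has taken the packet
    b.join()
    return received[0]
```

With a single writer and a single reader (the point-to-point rule), no further locking is needed; breaking that rule would need more machinery.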

2.3 Legoland

Much can be done, or simplified, just with this basic model - for example the design and simulation of self-timed digital logic, multiprocessor embedded control systems (for which occam [13-16] was originally designed), GUIs etc.

Here are some simple examples to build up fluency. First we introduce some elementary components from our 'teaching' catalogue - see Figure 2. All processes are cyclic and all transmit and receive just numbers. The Id process cycles through waiting for a number to arrive and, then, sending it on. Although inserting an Id process in a wire will clearly not affect the data flowing through it, it does make a difference. A bare wire has no buffering capacity. A wire containing an Id process gives us a one-place FIFO. Connect 20 in series and we get a 20-place FIFO - sophisticated function from a trivial design.

[1] This is in contrast with simple 'objects' and their 'methods'. A method body normally executes in the thread of control of the invoking object. Consequently, object behaviour is expressed from the point of view of its environment rather than the object itself. This is a slightly confusing property of traditional 'object-oriented' programming.

Fig. 2. Extract from a component catalogue: Id (in, out), Succ (in, out), Plus (in0, in1, out), Delta (in, out0, out1), Prefix (n, in, out) and Tail (in, out)

Succ is like Id, but increments each number as it flows through. The Plus component waits until a number arrives on each input line (accepting their arrival in either order) and outputs their sum. Delta waits for a number to arrive and, then, broadcasts it in parallel on its two output lines - both those outputs must complete (in either order) before it cycles round to accept further input. Prefix first outputs the number stamped on it and then behaves like Id. Tail swallows its first input without passing it on and then, also, behaves like Id. Prefix and Tail are so named because they perform, respectively, prefixing and tail operations on the streams of data flowing through them.

It's essential to provide a practical environment in which students can develop executable versions of these components and play with them (by plugging them together and seeing what happens). This is easy to do in occam and now, with the JCSP library [11], in Java. Appendices A and B give some of the details. Here we only give some CSP pseudo-code for our catalogue (because that's shorter than the real code):

  Id (in, out)   = in ? x --> out ! x --> Id (in, out)

  Succ (in, out) = in ? x --> out ! (x+1) --> Succ (in, out)

  Plus (in0, in1, out)
    = ((in0 ? x0 --> SKIP) || (in1 ? x1 --> SKIP));
      out ! (x0 + x1) --> Plus (in0, in1, out)

  Delta (in, out0, out1)
    = in ? x --> ((out0 ! x --> SKIP) || (out1 ! x --> SKIP));
      Delta (in, out0, out1)

  Prefix (n, in, out) = out ! n --> Id (in, out)

  Tail (in, out) = in ? x --> Id (in, out)

[Notes: 'free' variables used in these pseudo-codes are assumed to be locally declared and hidden from outside view. All these components are sequential processes. The process (in ? x --> P (...)) means: "wait until you can engage in the input event (in ? x) and, then, become the process P (...)". The input operator (?) and output operator (!) bind more tightly than the -->.]
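For playing with stream contents only, the catalogue also maps neatly onto Python generators. This is an illustrative sketch of ours, not the occam or JCSP code of the appendices; each function consumes and produces lazy streams of numbers:

```python
from itertools import chain, islice, tee

def Id(in_):                 # copy the stream through unchanged
    for x in in_:
        yield x

def Succ(in_):               # increment each number as it flows through
    for x in in_:
        yield x + 1

def Plus(in0, in1):          # sum corresponding pairs from two streams
    for x0, x1 in zip(in0, in1):
        yield x0 + x1

def Delta(in_):              # broadcast: two independent copies of the stream
    return tee(in_, 2)

def Prefix(n, in_):          # output n first, then behave like Id
    return chain([n], in_)

def Tail(in_):               # swallow the first input, then behave like Id
    return islice(in_, 1, None)
```

The caveat: generators carry implicit buffering, so this models what flows through the wires, not how the processes synchronise - blocking and deadlock cannot show up here.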

2.4 Plug and Play

Plugging these components together and reasoning about the resulting behaviour is easy. Thanks to the rules on process privacy [2], race hazards leading to unpredictable internal state do not arise. Thanks to the rules on channel synchronisation, data loss or corruption during communication cannot occur [3]. What makes the reasoning simple is that the parallel constructor and channel primitives are deterministic. Non-determinism has to be explicitly designed into a process and coded - it can't sneak in by accident!

Figure 3 shows a simple example of reasoning about network composition. Connect a Prefix and a Tail and we get two Ids:

  (Prefix (in, c) || Tail (c, out)) = (Id (in, c) || Id (c, out))

Equivalence means that no environment (i.e. external network in which they are placed) can tell them apart. In this case, both circuit fragments implement a 2-place FIFO. The only place where anything different happens is on the internal wire and that's undetectable from outside. The formal proof is a one-liner from the definition of the parallel (||), communications (!, ?) and and-then-becomes (-->) operators in CSP. But the good thing about CSP is that the mathematics engineered into its design and semantics cleanly reflects an intuitive human feel for the model. We can see the equivalence at a glance and this quickly builds confidence both for us and our students.
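The equivalence can also be spot-checked on stream contents with a throwaway Python sketch (ours, using generators for the streams): Tail drops the n that Prefix stamps on the front, so both fragments pass the input stream through unchanged.

```python
from itertools import chain, islice

def Id(in_):                 # copy the stream through unchanged
    for x in in_:
        yield x

def Prefix(n, in_):          # output n first, then copy the stream
    return chain([n], in_)

def Tail(in_):               # drop the first element, then copy
    return islice(in_, 1, None)

# in --> Prefix (42) --> c --> Tail --> out
left = list(Tail(Prefix(42, iter(range(10)))))

# in --> Id --> c --> Id --> out
right = list(Id(Id(iter(range(10)))))
```

Of course this only checks the data; the CSP proof also covers the synchronisation behaviour (both fragments buffer two places).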

[2] No external access to internal data. No external execution of internal algorithms (methods).

[3] Unreliable communications over a distributed network can be accommodated in this model - the unreliable network being another active process (or set of processes) that happens not to guarantee to pass things through correctly.

Fig. 3. A simple equivalence: Prefix (n) followed by Tail (over internal channel c) is equivalent to Id followed by Id

Fig. 4. Some more interesting circuits: Numbers (out), Integrate (in, out) and Pairs (in, out)

Figure 4 shows some more interesting circuits, with the first two incorporating feedback. What do they do? Ask the students! Here are some CSP pseudo-codes for these circuits:

  Numbers (out)
    = Prefix (0, c, a) || Delta (a, out, b) || Succ (b, c)

  Integrate (in, out)
    = Plus (in, c, a) || Delta (a, out, b) || Prefix (0, b, c)

  Pairs (in, out)
    = Delta (in, a, b) || Tail (b, c) || Plus (a, c, out)

Again, our rule for these pseudo-codes means that a, b and c are locally declared channels (hidden, in the CSP sense, from the outside world). Appendices A and B list occam and Java executables - notice how closely they reflect the CSP.

Back to what these circuits do: Numbers generates the sequence of natural numbers, Integrate computes running sums of its inputs and Pairs outputs the sum of its last two inputs. If we wish to be more formal, let c<i> represent the i'th element that passes through channel c - i.e. the first element through c is c<1>. Then, for any i >= 1:

  Numbers:   out<i> = i - 1
  Integrate: out<i> = Sum {in<j> | j = 1..i}
  Pairs:     out<i> = in<i> + in<i + 1>

Be careful that the above details only part of the specification of these circuits: how the values in their output stream(s) relate to the values in their input stream(s). We also have to be aware of how flexible they are in synchronising with their environments, as they generate and consume those streams. The base level components Id, Succ, Plus and Delta each demand one input (or pair of inputs) before generating one output (or pair of outputs). Tail demands two inputs before its first output, but thereafter gives one output for each input. This effect carries over into Pairs. Integrate adds 2-place buffering between its input and output channels (ignoring the transformation in the actual values passed). Numbers will always deliver to anything trying to take input from it.

If necessary, we can make these synchronisation properties mathematically precise. That is, after all, one of the reasons for which CSP was designed.
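The three stream specifications above can be checked with a small Python sketch. Rather than build the feedback loops, we use direct sequential equivalents of each circuit's stream behaviour - our own shortcut, which captures contents but says nothing about synchronisation:

```python
from itertools import islice

def Numbers():               # out<i> = i - 1
    n = 0
    while True:
        yield n
        n += 1

def Integrate(in_):          # out<i> = Sum {in<j> | j = 1..i}
    total = 0
    for x in in_:
        total += x
        yield total

def Pairs(in_):              # out<i> = in<i> + in<i + 1>
    prev = next(in_)
    for x in in_:
        yield prev + x
        prev = x

def first(k, stream):        # the first k elements of a stream
    return list(islice(stream, k))
```

For example, `first(5, Numbers())` delivers the natural numbers from 0.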

2.5 Deadlock - First Contact

Consider the circuit in Figure 5. A simple stream analysis would indicate that:

  Pairs2: a<i>   = in<i>
  Pairs2: b<i>   = in<i>
  Pairs2: c<i>   = b<i + 1> = in<i + 1>
  Pairs2: d<i>   = c<i + 1> = in<i + 2>
  Pairs2: out<i> = a<i> + d<i> = in<i> + in<i + 2>

Fig. 5. A dangerous circuit: Pairs2 (in, out), in which Delta feeds Plus directly on channel a and through two Tails in series on channels b, c and d

But this analysis only shows what would be generated if anything were generated. In this case, nothing is generated since the system deadlocks. The two Tail processes demand three items from Delta before delivering anything to Plus. But Delta can't deliver a third item to the Tails until it's got rid of its second item to Plus. But Plus won't accept a second item from Delta until it's had its first item from the Tails. Deadlock!

In this case, deadlock can be designed out by inserting an Id process on the upper (a) channel. Id processes (and FIFOs in general) have no impact on stream contents analysis but, by allowing a more decoupled synchronisation, can impact on whether streams actually flow. Beware, though, that adding buffering to channels is not a general cure for deadlock.
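The stream analysis itself can be checked with the generator sketch below (our own illustration). Note the limitation this exposes: generators sit on implicitly buffered streams, so the sketch computes what the circuit would produce and cannot exhibit the deadlock - exactly the gap between stream contents analysis and synchronisation analysis discussed above.

```python
from itertools import islice, tee

def Tail(in_):               # drop the first element of the stream
    return islice(in_, 1, None)

def Plus(in0, in1):          # sum corresponding pairs from two streams
    for x0, x1 in zip(in0, in1):
        yield x0 + x1

def Pairs2(in_):
    a, b = tee(in_, 2)       # Delta: two copies of the input
    d = Tail(Tail(b))        # two Tails in series (channels b, c, d)
    return Plus(a, d)        # out<i> = in<i> + in<i + 2>

out = list(islice(Pairs2(iter(range(100))), 5))
```

Here `tee` quietly buffers the a-stream while d runs ahead - precisely the decoupling the inserted Id process provides in the real circuit.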

So, there are always two questions to answer: what data flows through the channels, assuming data does flow, and are the circuits deadlock-free? Deadlock is a monster that must - and can - be vanquished. In CSP, deadlock only occurs from a cycle of committed attempts to communicate (input or output): each process in the cycle refusing its predecessor's call as it tries to contact its successor. Deadlock potential is very visible - we even have a deadlock primitive (STOP) to represent it, on the grounds that it is a good idea to know your enemy!

In practice, there now exists a wealth of design rules that provide formally proven guarantees of deadlock freedom [17-22]. Design tools supporting these rules - both constructive and analytical - have been researched [23, 24]. Deadlock, together with related problems such as livelock and starvation, need threaten us no longer - even in the most complex of parallel systems.

2.6 Structured Plug and Play

Consider the circuits of Figure 6. They are similar to the previous circuits, but contain components other than those from our base catalogue - they use components we have just constructed. Here is the CSP:

  Fibonacci (out)
    = Prefix (1, d, a) || Prefix (0, a, b) ||
      Delta (b, out, c) || Pairs (c, d)

  Squares (out)
    = Numbers (a) || Integrate (a, b) || Pairs (b, out)

  Demo (out)
    = Numbers (a) || Fibonacci (b) || Squares (c) ||
      Tabulate3 (a, b, c, out)

Fig. 6. Circuits of circuits: Fibonacci (out), Squares (out) and Demo (out)

One of the powers of CSP is that its semantics obey simple composition rules. To understand the behaviour implemented by a network, we only need to know the behaviour of its nodes - not their implementations.

For example, Fibonacci is a feedback loop of four components. At this level, we can remain happily ignorant of the fact that its Pairs node contains another three. We only need to know that Pairs requires two numbers before it outputs anything and that, thereafter, it outputs once for every input. The two Prefixes initially inject two numbers (0 and 1) into the circuit. Both go into Pairs, but only one (their sum) emerges. After this, the feedback loop just contains a single circulating packet of information (successive elements of the Fibonacci sequence). The Delta process taps this circuit to provide external output.

Squares is a simple pipeline of three components. It's best not to think of the nine processes actually involved. Clearly, for i >= 1:

  Squares: a<i>   = i - 1
  Squares: b<i>   = Sum {j - 1 | j = 1..i} = Sum {j | j = 0..(i - 1)}
  Squares: out<i> = Sum {j | j = 0..(i - 1)} + Sum {j | j = 0..i} = i * i

So, Squares outputs the increasing sequence of squared natural numbers. It doesn't deadlock because Integrate and Pairs only add buffering properties and it's safe to connect buffers in series.
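Both pipelines can be spot-checked on stream contents with Python generators - an illustrative sketch of ours, in which the Fibonacci feedback loop is collapsed to the single circulating packet analysed above:

```python
from itertools import islice

def Fibonacci():             # the feedback loop reduces to one circulating pair
    a, b = 1, 0              # the two Prefix stamps
    while True:
        yield b              # Delta taps the loop to provide external output
        a, b = b, a + b      # Pairs: only the sum goes round again

def Numbers():               # out<i> = i - 1
    n = 0
    while True:
        yield n
        n += 1

def Integrate(in_):          # running sums
    total = 0
    for x in in_:
        total += x
        yield total

def Pairs(in_):              # sum of the last two inputs
    prev = next(in_)
    for x in in_:
        yield prev + x
        prev = x

def Squares():               # the pipeline: Numbers -> Integrate -> Pairs
    return Pairs(Integrate(Numbers()))
```

The composed pipeline really does deliver the perfect squares, as the stream equations predict.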

Tabulate3 is from our base catalogue. Like the others, it is cyclic. In each cycle, it inputs in parallel one number from each of its three input channels and, then, generates a line of text on its output channel consisting of a tabulated (15-wide, in this example) decimal representation of those numbers.

  Tabulate3 (in0, in1, in2, out)
    = ((in0 ? x0 --> SKIP) || (in1 ? x1 --> SKIP) || (in2 ? x2 --> SKIP));
      print (x0, 15, out); print (x1, 15, out); println (x2, 15, out);
      Tabulate3 (in0, in1, in2, out)

Connecting the output channel from Demo to a text window displays three columns of numbers: the natural numbers, the Fibonacci sequence and perfect squares.

It's easy to understand all this - thanks to the structuring. In fact, Demo consists of 27 threads of control, 19 of them permanent with the other 8 being repeatedly created and destroyed by the low-level parallel inputs and outputs in the Delta, Plus and Tabulate3 components. If we tried to understand it on those terms, however, we would get nowhere.

Please note that we are not advocating designing at such a fine level of granularity as normal practice! These are only exercises and demonstrations to build up fluency and confidence in concurrent logic. Having said that, the process management overheads for the occam Demo executables are only around 30 microseconds per output line of text (i.e. too low to see) and three milliseconds for the Java (still too low to see). And, of course, if we are using these techniques for designing real hardware [25], we will be working at much finer levels of granularity than this.

2.7 Coping with the Real World - Making Choices

The model we have considered so far - parallel processes communicating through dedicated (point-to-point) channels - is deterministic. If we input the same data in repeated runs, we will always receive the same results. This is true regardless of how the processes are scheduled or distributed. This provides a very stable base from which to explore the real world, which doesn't always behave like this.

Any machine with externally operatable controls that influence its internal operation, but whose internal operations will continue to run in the absence of that external control, is not deterministic in the above sense. The scheduling of that external control will make a difference. Consider a car and its driver heading for a brick wall. Depending on when the driver applies the brakes, they will end up in very different states!

CSP provides operators for internal and external choice. An external choice is when a process waits for its environment to engage in one of several events - what happens next is something the environment can determine (e.g. a driver can press the accelerator or brake pedal to make the car go faster or slower). An internal choice is when a process changes state for reasons its environment cannot determine (e.g. a self-clocked timeout or the car runs out of petrol). Note that for the combined (parallel) system of car-and-driver, the accelerating and braking become internal choices so far as the rest of the world is concerned.

occam provides a constructor (ALT) that lets a process wait for one of many events. These events are restricted to channel input, timeouts and SKIP (a null event that has always happened). We can also set pre-conditions - run-time tests on internal state - that mask whether a listed event should be included in any particular execution of the ALT. This allows very flexible internal choice within a component as to whether it is prepared to accept an external communication [4]. The JavaPP libraries provide an exact analogue (Alternative.select) for these choice mechanisms.

If several events are pending at an ALT, an internal choice is normally made between them. However, occam allows a PRI ALT which resolves the choice between pending events in order of their listing. This returns control of the operation to the environment, since the reaction of the PRI ALTing process to multiple communications is now predictable. This control is crucial for the provision of real-time guarantees in multi-process systems and for the design of hardware. Recently, extensions to CSP to provide a formal treatment of these mechanisms have been made [26, 27].

Fig. 7. Two control processes: Replace (in, out, inject) and Scale (in, out, inject), the latter holding a scale factor m

[4] This is in contrast to monitors, whose methods cannot refuse an external call when they are unlocked and have to wait on condition variables should their state prevent them from servicing the call. The close coupling necessary between sibling monitor methods to undo the resulting mess is not WYSIWYG [9].

Figure 7 shows two simple components with this kind of control. Replace listens for incoming data on its in and inject lines. Most of the time, data arrives from in and is immediately copied to its out line. Occasionally, a signal from the inject line occurs. When this happens, the signal is copied out but, at the same time, the next input from in is waited for and discarded. In case both inject and in communications are on offer, priority is given to the (less frequently occurring) inject:

  Replace (in, inject, out)
    = (inject ? signal --> ((in ? x --> SKIP) || (out ! signal --> SKIP))
       [PRI]
       in ? x --> out ! x --> SKIP
      );
      Replace (in, inject, out)

Replace is something that can be spliced into any channel. If we don't use the inject line, all it does is add a one-place buffer to the circuit. If we send something down the inject line, it gets injected into the circuit - replacing the next piece of data that would have travelled through that channel.

Fig. 8. Two controllable processes: RNumbers (out, reset) and RIntegrate (in, out, reset)

Figure 8 shows RNumbers and RIntegrate, which are just Numbers and Integrate with an added Replace component. We now have components that are resettable by their environments. RNumbers can be reset at any time to continue its output sequence from any chosen value. RIntegrate can have its internal running sum redefined.

Like Replace, Scale (Figure 7) normally copies numbers straight through, but scales them by its factor m. An inject signal resets the scale factor:

  Scale (m, in, inject, out)
    = (inject ? m --> SKIP
       [PRI]
       in ? x --> out ! m*x --> SKIP
      );
      Scale (m, in, inject, out)
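Python has no ALT, so a faithful executable needs occam or JCSP; still, the shape of Scale can be sketched with threads and queues. Polling the inject queue first on every cycle approximates the PRI preference - this idiom and all the names in it are our own illustration, and the queues are buffered, unlike CSP channels:

```python
import queue
import threading

def Scale(m, in_q, inject_q, out_q):
    """Copy numbers from in_q to out_q, scaled by m; a value arriving on
    inject_q resets the scale factor.  inject_q is checked first on every
    cycle, crudely approximating occam's PRI ALT."""
    while True:
        try:
            m = inject_q.get_nowait()        # prefer a pending inject
            continue
        except queue.Empty:
            pass
        try:
            x = in_q.get(timeout=0.01)       # otherwise wait briefly for data
        except queue.Empty:
            continue                         # re-check inject, then try again
        out_q.put(m * x)

def start_scale(m):
    """Wire up a Scale process on fresh channels and set it running."""
    in_q, inject_q, out_q = queue.Queue(), queue.Queue(), queue.Queue()
    threading.Thread(target=Scale, args=(m, in_q, inject_q, out_q),
                     daemon=True).start()
    return in_q, inject_q, out_q
```

A real ALT blocks on both events at once; the timeout loop here is only a stand-in for that, good enough to demonstrate the behaviour.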

Figure 9 shows RPairs, which is Pairs with the Scale control component added. If we send just +1 or -1 down the reset line of RPairs, we control whether it's adding or subtracting successive pairs of inputs. When it's subtracting, its behaviour changes to that of a differentiator - in the sense that it undoes the effect of Integrate.

Fig. 9. Sometimes Pairs, sometimes Differentiate: RPairs (in, out, reset)

This allows a nice control demonstration. Figure 10 shows a circuit whose core is a resettable version of the Squares pipeline. The Monitor process reacts to characters from the keyboard channel. Depending on the value, it outputs an appropriate signal down an appropriate reset channel:

  Monitor (keyboard, resetN, resetI, resetP)
    = (keyboard ? ch -->
       CASE ch
         'N': resetN ! 0 --> SKIP
         'I': resetI ! 0 --> SKIP
         '+': resetP ! +1 --> SKIP
         '-': resetP ! -1 --> SKIP
      );
      Monitor (keyboard, resetN, resetI, resetP)

Fig. 10. A user controllable machine: Demo2 (keyboard, screen), with Monitor driving the resets of RNumbers, RIntegrate and RPairs, and Tabulate3 driving the screen

When Demo2 runs and we don't type anything, we see the inner workings of the Squares pipeline tabulated in three columns of output. Keying in an 'N', 'I', '+' or '-' character allows the user some control over those workings [5]. Note that after a '-', the output from RPairs should be the same as that taken from RNumbers.

2.8 A Nastier Deadlock

One last exercise should be done. Modify the system so that output freezes if an 'F' is typed and unfreezes following the next character.

Two 'solutions' offer themselves and Figure 11 shows the wrong one (Demo3). This feeds the output from Tabulate3 back to a modified Monitor2 and then on to the screen. The Monitor2 process PRI ALTs between the keyboard channel and this feedback:

  Monitor2 (keyboard, feedback, resetN, resetI, resetP, screen)
    = (keyboard ? ch -->
       CASE ch
         ... deal with 'N', 'I', '+', '-' as before
         'F': keyboard ? ch --> SKIP
       [PRI]
       feedback ? x --> screen ! x --> SKIP
      );
      Monitor2 (keyboard, feedback, resetN, resetI, resetP, screen)

[5] In practice, we need to add another process after Tabulate3 to slow down the rate of output to around 10 lines per second. Otherwise, the user cannot properly appreciate the immediacy of control that has been obtained.

Fig. 11. A machine over which we may lose control: Demo3 (keyboard, screen), with the output of Tabulate3 fed back through Monitor2 to the screen

TraÆ will normally be owing along the feedba k-s reen route, inter-

rupted only when Monitor2 servi es the keyboard. The attra tion is that if

an `F' arrives, Monitor2 simply waits for the next hara ter (and dis ards it).

As a side-e�e t of this waiting, the s reen traÆ is frozen.

But if we implement this, we get some worrying behaviour. The freeze oper-

ation works �ne and so, probably, do the `N' and `I' resets. Sometimes, however,

a `+' or `-' reset deadlo ks the whole system { the s reen freezes and all further

keyboard events are refused!

The problem is that one of the rules for deadlo k-free design has been broken:

any data- ow ir uit must ontrol the number of pa kets ir ulating! If this num-

ber rises to the number of sequential (i.e. lowest level) pro esses in the ir uit,

deadlo k always results. Ea h node will be trying to output to its su essor and

refusing input from its prede essor.

The Numbers, RNumbers, Integrate, RIntegrate and Fibonacci networks all contain data-flow loops, but the number of packets concurrently in flight is kept at one6.

In Demo3 however, packets are continually being generated within RNumbers, flowing through several paths to Monitor2 and, then, to the screen. Whenever Monitor2 feeds a reset back into the circuit, deadlock is possible – although not certain. It depends on the scheduling. RNumbers is always pressing new packets into the system, so the circuits are likely to be fairly full. If Monitor2 generates a reset when they are full, the system deadlocks. The shortest feedback loop is from Monitor2, RPairs, Tabulate3 and back to Monitor2 – hence, it is the `+' and `-' inputs from keyboard that are most likely to trigger the deadlock.

6 Initially, Fibonacci has two packets, but they combine into one before the end of their first circuit.


[Figure: Demo4 (keyboard, screen) - the pipeline RNumbers, RIntegrate, RPairs and Tabulate3, with a separate Freeze process driving the screen and Monitor3 driving only the keyboard and reset/freeze channels]

Fig. 12. A machine over which we will not lose control

The design is simply fixed by removing that feedback at this level – see Demo4 in Figure 12. We have abstracted the freezing operation into its own component (and catalogued it). It's never a good idea to try to do too many functions in one sequential process. That needlessly constrains the synchronisation freedom of the network and heightens the risk of deadlock. Note that the idea being pushed here is that, unless there are special circumstances, parallel design is safer and simpler than its serial counterpart!

Demo4 obeys another golden rule: every device should be driven from its own separate process. The keyboard and screen channels interface to separate devices and should be operated concurrently (in Demo3, both were driven from one sequential process – Monitor2). Here are the driver processes from Demo4:

Freeze (in, freeze, out)
  = (freeze ? x --> freeze ? x --> SKIP
     [PRI]
     in ? x --> out ! x --> SKIP
    );
    Freeze (in, freeze, out)

Monitor3 (keyboard, resetN, resetI, resetP, freeze)
  = (keyboard ? ch -->
       CASE ch
         ... deal with `N', `I', `+', `-' as before
         `F': freeze ! ch --> keyboard ? ch --> freeze ! ch --> SKIP
    );
    Monitor3 (keyboard, resetN, resetI, resetP, freeze)


2.9 Buffered and Asynchronous Communications

We have seen how fixed capacity FIFO buffers can be added as active processes to CSP channels. For the occam binding, the overheads for such extra processes are negligible.

With the JavaPP libraries, the same technique may be used, but the channel objects can be directly configured to support buffered communications – which saves a couple of context switches. The user may supply objects supporting any buffering strategy for channel configuration, including normal blocking buffers, overwrite-when-full buffers, infinite buffers and black-hole buffers (channels that can be written to but not read from – useful for masking off unwanted outputs from components that, otherwise, we wish to reuse intact). However, the user had better stay aware of the semantics of the channels thus created!

Asynchronous communication is commonly found in libraries supporting inter-processor message-passing (such as PVM and MPI). However, the concurrency model usually supported is one for which there is only one thread of control on each processor. Asynchronous communication lets that thread of control launch an external communication and continue with its computation. At some point, that computation may need to block until that communication has completed.

These mechanisms are easy to obtain from the concurrency model we are teaching (and which we claim to be general). We don't need anything new. Asynchronous sends are what happen when we output to a buffer (or buffered channel). If we are worried about being blocked when the buffer is full or if we need to block at some later point (should the communication still be unfinished), we can simply spawn off another process7 to do the send:

(out ! packet --> SKIP |PRI| someMoreComputation (...));
continue (...)

The continue process only starts when both the packet has been sent and someMoreComputation has finished. someMoreComputation and sending the packet proceed concurrently. We have used the priority version of the parallel operator (|PRI|, which gives priority to its left operand) to ensure that the sending process initiates the transfer before someMoreComputation is scheduled. Asynchronous receives are implemented in the same way:

(in ? packet --> SKIP |PRI| someMoreComputation (...));
continue (...)
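With no |PRI| operator to hand in plain Java, the same effect can be sketched with an extra thread plus a join (a hedged analogue, not the JCSP idiom): the spawned thread performs the communication, the main thread overlaps its computation, and the continuation runs only when both are done.

```java
import java.util.concurrent.SynchronousQueue;

// Sketch: an asynchronous send built from a spawned thread plus join.
// The SynchronousQueue plays the role of an unbuffered CSP channel.
class AsyncSendDemo {

    static String run() {
        SynchronousQueue<Integer> out = new SynchronousQueue<>();
        StringBuilder received = new StringBuilder();

        // the process at the other end of the channel
        Thread reader = new Thread(() -> {
            try {
                received.append("got ").append(out.take());
            } catch (InterruptedException e) { }
        });
        reader.start();

        // spawn off the send, so that it overlaps someMoreComputation
        Thread send = new Thread(() -> {
            try {
                out.put(42);              // out ! packet
            } catch (InterruptedException e) { }
        });
        send.start();

        someMoreComputation();            // proceeds concurrently with the send

        try {
            send.join();                  // block only if the send is unfinished
            reader.join();
        } catch (InterruptedException e) { }

        return received.toString();       // 'continue' would start here
    }

    static void someMoreComputation() {
        // overlapped work goes here
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```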

2.10 Shared Channels

CSP channels are strictly point-to-point. occam3 [28] introduced the notion of (securely) shared channels and channel structures. These are further extended in the KRoC occam [29] and JavaPP libraries and are included in the teaching model.

7 The occam overheads for doing this are less than half a microsecond.


A channel structure is just a record (or object) holding two or more CSP channels. Usually, there would be just two channels – one for each direction of communication. The channel structure is used to conduct a two-way conversation between two processes. To avoid deadlock, of course, they will have to understand protocols for using the channel structure – such as who speaks first and when the conversation finishes. We call the process that opens the conversation a client and the process that listens for that call a server8.

[Figure: several client processes and several server processes all attached to a single shared channel]

Fig. 13. A many-many shared channel

The CSP model is extended by allowing multiple clients and servers to share the same channel (or channel structure) – see Figure 13. Sanity is preserved by ensuring that only one client and one server use the shared object at any one time. Clients wishing to use the channel queue up first on a client-queue (associated with the shared channel) – servers on a server-queue (also associated with the shared channel). A client only completes its actions on the shared channel when it gets to the front of its queue, finds a server (for which it may have to wait if business is good) and completes its transaction. A server only completes when it reaches the front of its queue, finds a client (for which it may have to wait in times of recession) and completes its transaction.

Note that shared channels – like the choice operator between multiple events – introduce scheduling-dependent non-determinism. The order in which processes are granted access to the shared channel depends on the order in which they join the queues.

Shared channels provide a very efficient mechanism for a common form of choice. Any server that offers a non-discriminatory service9 to multiple clients should use a shared channel, rather than ALTing between individual channels from those clients. The shared channel has a constant time overhead – ALTing is linear on the number of clients. However, if the server needs to discriminate between its clients (e.g. to refuse service to some, depending upon its internal state), ALTing gives us that flexibility. The mechanisms can be efficiently combined. Clients can be grouped into equal-treatment partitions, with each group clustered on its own shared channel and the server ALTing between them.
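The queueing discipline can be sketched in Java with a blocking queue standing in for the shared channel (an analogy for teaching, not the occam3/JCSP implementation): many clients write, one server reads, and the cost to the server is constant regardless of how many clients there are.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch: a many-to-one shared channel modelled by a blocking queue.
// Clients are granted access in whatever order scheduling allows -
// the scheduling-dependent non-determinism mentioned in the text.
class SharedChannelDemo {

    static int serve(int nClients) {
        BlockingQueue<Integer> shared = new LinkedBlockingQueue<>();

        for (int i = 1; i <= nClients; i++) {
            final int id = i;
            new Thread(() -> shared.add(id)).start();   // one transaction per client
        }

        int total = 0;
        try {
            for (int i = 0; i < nClients; i++) {
                total += shared.take();                 // the server end
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return total;   // arrival order varies; the set of transactions does not
    }

    public static void main(String[] args) {
        System.out.println(serve(4));
    }
}
```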

8 In fact, the client/server relationship is with respect to the channel structure. A process may be both a server on one interface and a client on another.
9 Examples for such servers include window managers for multiple animation processes, data loggers for recording traces from multiple components from some machine, etc.


For deadlock freedom, each server must guarantee to respond to a client call within some bounded time. During its transaction with the client, it must follow the protocols for communication defined for the channel structure and it may engage in separate client transactions with other servers. A client may open a transaction at any time but may not interleave its communications with the server with any other synchronisation (e.g. with another server). These rules have been formalised as CSP specifications [21]. Client-server networks may have plenty of data-flow feedback but, so long as no cycle of client-server relations exists, [21] gives formal proof that the system is deadlock, livelock and starvation free.

Shared channel structures may be stretched across distributed memory (e.g. networked) multiprocessors [15]. Channels may carry all kinds of object – including channels and processes themselves. A shared channel is an excellent means for a client and server to find each other, pass over a private channel and communicate independently of the shared one. Processes will drag pre-attached channels with them as they are moved and can have local channels dynamically (and temporarily) attached when they arrive. See David May's work on Icarus [30, 31] for a consistent, simple and practical realisation of this model for distributed and mobile computing.

3 Events and Shared Memory

Shared memory concurrency is often described as being `easier' than message passing. But great care must be taken to synchronise concurrent access to shared data, else we will be plagued with race hazards and our systems will be useless. CSP primitives provide a sharp set of tools for exercising this control.

3.1 Symmetric Multi-Processing (SMP)

The private memory/algorithm principles of the underlying model – and the security guarantees that go with them – are a powerful way of programming shared memory multiprocessors. Processes can be automatically and dynamically scheduled between available processors (one object code fits all). So long as there is an excess of (runnable) processes over processors and the scheduling overheads are sufficiently low, high multiprocessor efficiency can be achieved – with guaranteed freedom from race hazards. With the design methods we have been describing, it's very easy to generate lots of processes with most of them runnable most of the time.

3.2 Token Passing and Dynamic CREW

Taking advantage of shared memory to communicate between processes is an extension to this model and must be synchronised. The shared data does not belong to any of the sharing processes, but must be globally visible to them – either on the stack (for occam) or heap (for Java).


The JavaPP channels in previous examples were only used to send data values between processes – but they can also be used to send objects. This steps outside the automatic guarantees against race hazard since, unconstrained, it allows parallel access to the same data. One common and useful constraint is only to send immutable objects. Another design pattern treats the sent object as a token conferring permission to use it – the sending process losing the token as a side-effect of the communication. The trick is to ensure that only one copy of the token ever exists for each sharable object.
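A minimal sketch of the token pattern (the class and channel names here are illustrative assumptions): the mutable object travels down a channel, the sender drops its reference, and so at most one process can ever mutate it at a time.

```java
import java.util.concurrent.SynchronousQueue;

// Sketch: token passing over unbuffered channels. Process A builds a mutable
// buffer, hands it to B (forgetting its own reference), and B hands it back.
class TokenPassingDemo {

    static String run() {
        SynchronousQueue<StringBuilder> toB = new SynchronousQueue<>();
        SynchronousQueue<StringBuilder> toA = new SynchronousQueue<>();

        Thread b = new Thread(() -> {
            try {
                StringBuilder token = toB.take();   // B is now the sole holder
                token.append("+B");                 // safe: nobody else has it
                toA.put(token);
                token = null;                       // B loses the token on send
            } catch (InterruptedException e) { }
        });
        b.start();

        String result = "";
        try {
            StringBuilder token = new StringBuilder("A");
            toB.put(token);
            token = null;                           // A loses the token on send
            token = toA.take();                     // ... and regains it later
            result = token.toString();
            b.join();
        } catch (InterruptedException e) { }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The nulling of the local reference is the whole discipline: nothing in the language enforces it, which is why the text stresses that only one copy of the token may ever exist.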

Dynamic CREW (Concurrent Read Exclusive Write) operations are also possible with shared memory. Shared channels give us an efficient, elegant and easily provable way to construct an active guardian process with which application processes synchronise to effect CREW access to the shared data. Guarantees against starvation of writers by readers – and vice-versa – are made. Details will appear in a later report (available from [32]).
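While the guardian-process construction must wait for that report, the CREW discipline itself can be illustrated with a fair read-write lock from the standard Java library (an analogue only, not the construction the text refers to); the fairness setting is what supplies a no-starvation guarantee:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch: CREW access with a fair ReentrantReadWriteLock - any number of
// concurrent readers, exclusive writers, FIFO fairness against starvation.
class CrewDemo {

    private final ReentrantReadWriteLock crew = new ReentrantReadWriteLock(true);
    private int shared = 0;

    void write(int x) {
        crew.writeLock().lock();        // exclusive write
        try {
            shared = x;
        } finally {
            crew.writeLock().unlock();
        }
    }

    int read() {
        crew.readLock().lock();         // concurrent read
        try {
            return shared;
        } finally {
            crew.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        CrewDemo d = new CrewDemo();
        d.write(7);
        System.out.println(d.read());
    }
}
```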

3.3 Structured Barrier Synchronisation and SPMD

Point-to-point channels are just a specialised form of the general CSP multi-process synchronising event. The CSP parallel operator binds processes together with events. When one process synchronises on an event, all processes registered for that event must synchronise on it before that first process may continue. Events give us structured multiway barrier synchronisation [29].

[Figure: horizontal execution traces for processes P, M and D, punctuated by synchronisations on the barriers b0, b1 and b2]

Fig. 14. Multiple barriers to three processes

We can have many event barriers in a system, with different (and not necessarily disjoint) subsets of processes registered for each barrier. Figure 14 shows the execution traces for three processes (P, M and D) with time flowing horizontally. They do not all progress at the same – or even constant – speed. From time to time, the faster ones will have to wait for their slower partners to reach an agreed barrier before all of them can proceed. We can wrap up the system in typical SPMD form as:

|| <i = 0 FOR 3>
     S (i, ..., b0, b1, b2)


where b0, b1 and b2 are events. The replicated parallel operator runs 3 instances of S in parallel (with i taking the values 0, 1 and 2 respectively in the different instances). The S process simply switches into the required form:

S (i, ..., b0, b1, b2)
  = CASE i
      0 : P (..., b0, b1)
      1 : M (..., b0, b1, b2)
      2 : D (..., b1, b2)

and where P, M and D are registered only for the events in their parameters. The code for P has the form:

P (..., b0, b1)
  = someWork (...); b0 --> SKIP;
    moreWork (...); b0 --> SKIP;
    lastBitOfWork (...); b1 --> SKIP;
    P (..., b0, b1)
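For readers working in Java, the same SPMD shape can be sketched with java.util.concurrent's CyclicBarrier standing in for an occam event (an analogy only – the constant-overhead guarantees quoted later for occam events do not carry over):

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: several workers doing phases of work separated by a shared barrier,
// as in the P/M/D traces of Figure 14 (simplified to a single barrier b0).
class BarrierSpmdDemo {

    static int run(int workers, int phases) {
        CyclicBarrier b0 = new CyclicBarrier(workers);
        AtomicInteger steps = new AtomicInteger();

        Thread[] ts = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            ts[i] = new Thread(() -> {
                try {
                    for (int p = 0; p < phases; p++) {
                        steps.incrementAndGet();   // someWork (...)
                        b0.await();                // synchronise on event b0
                    }
                } catch (Exception e) { }
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try { t.join(); } catch (InterruptedException e) { }
        }
        return steps.get();   // every worker completed every phase
    }

    public static void main(String[] args) {
        System.out.println(run(3, 4));
    }
}
```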

3.4 Non-Blocking Barrier Synchronisation

In the same way that asynchronous communications can be expressed (section 2.9), we can also achieve the somewhat contradictory sounding, but potentially useful, non-blocking barrier synchronisation.

In terms of serial programming, this is a two-phase commitment to the barrier. The first phase declares that we have done everything we need to do this side of the barrier, but does not block us. We can then continue for a while, doing things that do not disturb what we have set up for our partners in the barrier and do not need whatever it is that they have to set. When we need their work, we enter the second phase of our synchronisation on the barrier. This blocks us only if there is one, or more, of our partners who has not reached the first phase of its synchronisation. With luck, this window on the barrier will enable most processes most of the time to pass through without blocking:

doOurWorkNeededByOthers (...);
barrier.firstPhase ();
privateWork (...);
barrier.secondPhase ();
useSharedResourcesProtectedByTheBarrier (...);

With our lightweight CSP processes, we do not need these special phases to get the same effect:

doOurWorkNeededByOthers (...);
(barrier --> SKIP |PRI| privateWork (...));
useSharedResourcesProtectedByTheBarrier (...);

The explanation as to why this works is just the same as for the asynchronous sends and receives.
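A hedged Java rendering of the lightweight version (spawn the synchronisation, overlap the private work, join before touching the shared resources):

```java
import java.util.concurrent.CyclicBarrier;

// Sketch: non-blocking barrier synchronisation via a spawned thread.
// Our await overlaps privateWork; only the join can block us, and then
// only if a partner has not yet reached the barrier.
class NonBlockingBarrierDemo {

    static String run() {
        CyclicBarrier barrier = new CyclicBarrier(2);
        StringBuilder log = new StringBuilder();

        Thread partner = new Thread(() -> {
            try { barrier.await(); } catch (Exception e) { }
        });
        partner.start();

        // doOurWorkNeededByOthers (...) would have happened by now
        Thread sync = new Thread(() -> {       // our barrier phase, spawned off
            try { barrier.await(); } catch (Exception e) { }
        });
        sync.start();

        log.append("private;");                // privateWork (...) overlaps

        try {
            sync.join();                       // second phase: block if needed
            partner.join();
        } catch (InterruptedException e) { }

        log.append("shared");                  // the barrier has now completed
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```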


3.5 Bucket Synchronisation

Although CSP allows choice over general events, the occam and Java bindings do not. The reasons are practical – a concern for run-time overheads10. So, synchronising on an event commits a process to wait until everyone registered for the event has synchronised. These multi-way events, therefore, do not introduce non-determinism into a system and provide a stable platform for much scientific and engineering modelling.

Buckets [15] provide a non-deterministic version of events that is useful when the system being modelled is irregular and dynamic (e.g. motor vehicle traffic [33]). Buckets have just two operations: jump and kick. There is no limit to the number of processes that can jump into a bucket – where they all block. Usually, there will only be one process with responsibility for kicking over the bucket. This can be done at any time of its own (internal) choosing – hence the non-determinism. The result of kicking over a bucket is the unblocking of all the processes that had jumped into it11.
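A bucket is easy to build from a Java monitor. This sketch assumes the jump/kick naming from [15]; it is not the occam library implementation:

```java
// Sketch: the bucket primitive. jump blocks the caller; kick releases every
// process currently in the bucket and reports how many were flushed.
class Bucket {

    private int waiting = 0;   // processes currently in the bucket
    private int epoch = 0;     // bumped on every kick, guards against wake-ups

    public synchronized void jump() throws InterruptedException {
        waiting++;
        int myEpoch = epoch;
        while (epoch == myEpoch) {
            wait();            // block until the bucket is kicked over
        }
    }

    public synchronized int kick() {
        int flushed = waiting;
        waiting = 0;
        epoch++;
        notifyAll();           // unblock everyone who had jumped
        return flushed;
    }

    public synchronized int holding() {
        return waiting;
    }
}

class BucketDemo {
    static int run(int n) {
        Bucket bucket = new Bucket();
        for (int i = 0; i < n; i++) {
            new Thread(() -> {
                try { bucket.jump(); } catch (InterruptedException e) { }
            }).start();
        }
        try {
            while (bucket.holding() < n) {
                Thread.sleep(1);   // the kicker picks its own moment
            }
        } catch (InterruptedException e) { }
        return bucket.kick();
    }

    public static void main(String[] args) {
        System.out.println(run(3));
    }
}
```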

4 Conclusions

A simple model for parallel computing has been presented that is easy to learn, teach and use. Based upon the mathematically sound framework of Hoare's CSP, it has a compositional semantics that corresponds well with our intuition about how the world is constructed. The basic model encompasses object-oriented design with active processes (i.e. objects whose methods are exclusively under their own thread of control) communicating via passive, but synchronising, wires. Systems can be composed through natural layers of communicating components so that an understanding of each layer does not depend on an understanding of the inner ones. In this way, systems with arbitrarily complex behaviour can be safely constructed – free from race hazard, deadlock, livelock and process starvation.

A small extension to the model addresses fundamental issues and paradigms for shared memory concurrency (such as token passing, CREW dynamics and bulk synchronisation). We can explore with equal fluency serial, message-passing and shared-memory logic and strike whatever balance between them is appropriate for the problem under study. Applications include hardware design (e.g. FPGAs and ASICs), real-time control systems, animation, GUIs, regular and irregular modelling, distributed and mobile computing.

occam and Java bindings for the model are available to support practical work on commodity PCs and workstations. Currently, the occam bindings are

10 Synchronising on an event in occam has a unit time overhead, regardless of the number of processes registered. This includes being the last process to synchronise, when all blocked processes are released. These overheads are well below a microsecond for modern microprocessors.
11 As for events, the jump and kick operations have constant time overhead, regardless of the number of processes involved. The bucket overheads are slightly lower than those for events.


the fastest (context-switch times under 300 nano-seconds), lightest (in terms of memory demands), most secure (in terms of guaranteed thread safety) and quickest to learn. But Java has the libraries (e.g. for GUIs and graphics) and will get faster. Java thread safety, in this context, depends on following the CSP design patterns – and these are easy to acquire12.

The JavaPP JCSP library [11] also includes an extension to the Java AWT package that drops channel interfaces on all GUI components13. Each item (e.g. a Button) is a process with a configure and action channel interface. These are connected to separate internal handler processes. To change the text or colour of a Button, an application process outputs to its configure channel. If someone presses the Button, it outputs down its action channel to an application process (which can accept or refuse the communication as it chooses). Example demonstrations of the use of this package may be found at [11]. Whether GUI programming through the process-channel design pattern is simpler than the listener-callback pattern offered by the underlying AWT, we leave for the interested reader to experiment and decide.

All the primitives described in this paper are available for KRoC occam and Java. Multiprocessor versions of the KRoC kernel targeting NoWs and SMPs will be available later this year. SMP versions of the JCSP [11] and CJT [12] libraries are automatic if your JVM supports SMP threads. Hooks are provided in the channel libraries to allow user-defined network drivers to be installed. Research is continuing on portable/faster kernels and language/tool design for enforcing higher level aspects of CSP design patterns (e.g. for shared memory safety and deadlock freedom) that currently rely on self-discipline.

Finally, we stress that this is undergraduate material. The concepts are mature and fundamental – not advanced – and the earlier they are introduced the better. For developing fluency in concurrent design and implementation, no special hardware is needed. Students can graduate to real parallel systems once they have mastered this fluency. The CSP model is neutral with respect to parallel architecture, so that coping with a change in language or paradigm is straightforward. However, even for uni-processor applications, the ability to do safe and lightweight multithreading is becoming crucial both to improve response times and simplify their design.

The experience at Kent is that students absorb these ideas very quickly and become very creative14. Now that they can apply them in the context of Java, they are smiling indeed.

12 Java active objects (processes) do not invoke each other's methods, but communicate only through shared passive objects with carefully designed synchronisation properties (e.g. channels and events). Shared use of user-defined passive objects will be automatically thread-safe so long as the usage patterns outlined in Section 3 are kept – their methods should not be synchronized (in the sense of Java monitors).
13 We believe that the new Swing GUI libraries from Sun (that will replace the AWT) can also be extended through a channel interface for secure use in parallel designs – despite the warnings concerning the use of Swing and multithreading [34].
14 The JCSP libraries used in Appendix B were produced by Paul Austin, an undergraduate student at Kent.


References

1. C.A.R. Hoare. Communicating Sequential Processes. CACM, 21(8):666–677, August 1978.
2. C.A.R. Hoare. Communicating Sequential Processes. Prentice Hall, 1985.
3. Oxford University Computer Laboratory. The CSP Archive. <URL:http://www.comlab.ox.ac.uk/archive/csp.html>, 1997.
4. P.H. Welch and D.C. Wood. KRoC – the Kent Retargetable occam Compiler. In B. O'Neill, editor, Proceedings of WoTUG 19, Amsterdam, March 1996. WoTUG, IOS Press. <URL:http://www.hensa.ac.uk/parallel/occam/projects/occam-for-all/kroc/>.
5. Peter H. Welch and Michael D. Poole. occam for Multi-Processor DEC Alphas. In A. Bakkers, editor, Parallel Programming and Java, Proceedings of WoTUG 20, volume 50 of Concurrent Systems Engineering, pages 189–198, Amsterdam, Netherlands, April 1997. World occam and Transputer User Group (WoTUG), IOS Press.
6. Peter Welch et al. Java Threads Workshop – Post Workshop Discussion. <URL:http://www.hensa.ac.uk/parallel/groups/wotug/java/discussion/>, February 1997.
7. Gerald Hilderink, Jan Broenink, Wiek Vervoort, and Andre Bakkers. Communicating Java Threads. In Parallel Programming and Java, Proceedings of WoTUG 20, pages 48–76, 1997. (See reference [5]).
8. G.H. Hilderink. Communicating Java Threads Reference Manual. In Parallel Programming and Java, Proceedings of WoTUG 20, pages 283–325, 1997. (See reference [5]).
9. Peter Welch. Java Threads in the Light of occam/CSP. In P.H. Welch and A. Bakkers, editors, Architectures, Languages and Patterns, Proceedings of WoTUG 21, volume 52 of Concurrent Systems Engineering, pages 259–284, Amsterdam, Netherlands, April 1998. World occam and Transputer User Group (WoTUG), IOS Press. ISBN 90-5199-391-9.
10. Alan Chalmers. JavaPP Page – Bristol. <URL:http://www.cs.bris.ac.uk/~alan/javapp.html/>, May 1998.
11. P.D. Austin. JCSP Home Page. <URL:http://www.hensa.ac.uk/parallel/languages/java/jcsp/>, May 1998.
12. Gerald Hilderink. JavaPP Page – Twente. <URL:http://www.rt.el.utwente.nl/javapp/>, May 1998.
13. Ian East. Parallel Processing with Communicating Process Architecture. UCL Press, 1995. ISBN 1-85728-239-6.
14. John Galletly. occam 2 – including occam 2.1. UCL Press, 1996. ISBN 1-85728-362-7.
15. occam-for-all Team. occam-for-all Home Page. <URL:http://www.hensa.ac.uk/parallel/occam/occam-for-all/>, February 1997.
16. Mark Debbage, Mark Hill, Sean Wykes, and Denis Nicole. Southampton's Portable occam Compiler (SPoC). In R. Miles and A. Chalmers, editors, Progress in Transputer and occam Research, Proceedings of WoTUG 17, Concurrent Systems Engineering, pages 40–55, Amsterdam, Netherlands, April 1994. World occam and Transputer User Group (WoTUG), IOS Press. <URL:http://www.hensa.ac.uk/parallel/occam/compilers/spoc/>.
17. J.M.R. Martin and S.A. Jassim. How to Design Deadlock-Free Networks Using CSP and Verification Tools – a Tutorial Introduction. In Parallel Programming and Java, Proceedings of WoTUG 20, pages 326–338, 1997. (See reference [5]).


18. A.W. Roscoe and N. Dathi. The Pursuit of Deadlock Freedom. Technical Monograph PRG-57, Oxford University Computing Laboratory, 1986.
19. J. Martin, I. East, and S. Jassim. Design Rules for Deadlock Freedom. Transputer Communications, 2(3):121–133, September 1994. John Wiley & Sons, Ltd. ISSN 1070-454X.
20. P.H. Welch, G.R.R. Justo, and C. Willcock. High-Level Paradigms for Deadlock-Free High-Performance Systems. In Grebe et al., editors, Transputer Applications and Systems '93, pages 981–1004, Amsterdam, 1993. IOS Press. ISBN 90-5199-140-1.
21. J.M.R. Martin and P.H. Welch. A Design Strategy for Deadlock-Free Concurrent Systems. Transputer Communications, 3(4):215–232, October 1996. John Wiley & Sons, Ltd. ISSN 1070-454X.
22. A.W. Roscoe. Model Checking CSP. In A Classical Mind, Prentice Hall, 1994.
23. J.M.R. Martin and S.A. Jassim. A Tool for Proving Deadlock Freedom. In Parallel Programming and Java, Proceedings of WoTUG 20, pages 1–16, 1997. (See reference [5]).
24. D.J. Beckett and P.H. Welch. A Strict occam Design Tool. In Proceedings of UK Parallel '96, pages 53–69, London, July 1996. BCS PPSIG, Springer-Verlag. ISBN 3-540-76068-7.
25. M. Aubury, I. Page, D. Plunkett, M. Sauer, and J. Saul. Advanced Silicon Prototyping in a Reconfigurable Environment. In Architectures, Languages and Patterns, Proceedings of WoTUG 21, pages 81–92, 1998. (See reference [9]).
26. A.E. Lawrence. Extending CSP. In Architectures, Languages and Patterns, Proceedings of WoTUG 21, pages 111–132, 1998. (See reference [9]).
27. A.E. Lawrence. HCSP: Extending CSP for Co-design and Shared Memory. In Architectures, Languages and Patterns, Proceedings of WoTUG 21, pages 133–156, 1998. (See reference [9]).
28. Geoff Barrett. occam3 reference manual (draft). <URL:http://www.hensa.ac.uk/parallel/occam/documents/>, March 1992. (Unpublished in paper.)
29. Peter H. Welch and David C. Wood. Higher Levels of Process Synchronisation. In Parallel Programming and Java, Proceedings of WoTUG 20, pages 104–129, 1997. (See reference [5]).
30. David May and Henk L. Muller. Icarus language definition. Technical Report CSTR-97-007, Department of Computer Science, University of Bristol, January 1997.
31. Henk L. Muller and David May. A simple protocol to communicate channels over channels. Technical Report CSTR-98-001, Department of Computer Science, University of Bristol, January 1998.
32. D.J. Beckett. Java Resources Page. <URL:http://www.hensa.ac.uk/parallel/languages/java/>, May 1998.
33. Kang Hsin Lu, Jeff Jones, and Jon Kerridge. Modelling Congested Road Traffic Networks Using a Highly Parallel System. In A. DeGloria, M.R. Jane, and D. Marini, editors, Transputer Applications and Systems '94, volume 42 of Concurrent Systems Engineering, pages 634–647, Amsterdam, Netherlands, September 1994. The Transputer Consortium, IOS Press. ISBN 90-5199-177-0.
34. Hans Muller and Kathy Walrath. Threads and Swing. <URL:http://java.sun.com/products/jfc/swingdoc-archive/threads.html>, April 1998.


Appendix A: occam Executables

Space only permits a sample of the examples to be shown here. This first group is from the `Legoland' catalogue (Section 2.3):

PROC Id (CHAN OF INT in, out)
  WHILE TRUE
    INT x:
    SEQ
      in ? x
      out ! x
:

PROC Succ (CHAN OF INT in, out)
  WHILE TRUE
    INT x:
    SEQ
      in ? x
      out ! x PLUS 1
:

PROC Plus (CHAN OF INT in0, in1, out)
  WHILE TRUE
    INT x0, x1:
    SEQ
      PAR
        in0 ? x0
        in1 ? x1
      out ! x0 PLUS x1
:

PROC Prefix (VAL INT n, CHAN OF INT in, out)
  SEQ
    out ! n
    Id (in, out)
:

Next come four of the `Plug and Play' examples from Sections 2.4 and 2.6:

PROC Numbers (CHAN OF INT out)
  CHAN OF INT a, b, c:
  PAR
    Prefix (0, c, a)
    Delta (a, out, b)
    Succ (b, c)
:

PROC Integrate (CHAN OF INT in, out)
  CHAN OF INT a, b, c:
  PAR
    Plus (in, c, a)
    Delta (a, out, b)
    Prefix (0, b, c)
:

PROC Pairs (CHAN OF INT in, out)
  CHAN OF INT a, b, c:
  PAR
    Delta (in, a, b)
    Tail (b, c)
    Plus (a, c, out)
:

PROC Squares (CHAN OF INT out)
  CHAN OF INT a, b:
  PAR
    Numbers (a)
    Integrate (a, b)
    Pairs (b, out)
:


Here is one of the controllers from Section 2.7:

PROC Replace (CHAN OF INT in, inject, out)
  WHILE TRUE
    PRI ALT
      INT x:
      inject ? x
        PAR
          INT discard:
          in ? discard
          out ! x
      INT x:
      in ? x
        out ! x
:

Asynchronous receive from Section 2.9:

SEQ
  PRI PAR
    in ? packet
    someMoreComputation (...)
  continue (...)

Barrier synchronisation from Section 3.3:

PROC P (..., EVENT b0, b1)
  ... local state declarations
  SEQ
    ... initialise local state
    WHILE TRUE
      SEQ
        someWork (...)
        synchronise.event (b0)
        moreWork (...)
        synchronise.event (b0)
        lastBitOfWork (...)
        synchronise.event (b1)
:

Finally, non-blocking barrier synchronisation from Section 3.4:

SEQ
  doOurWorkNeededByOthers (...)
  PRI PAR
    synchronise.event (barrier)
    privateWork (...)
  useSharedResourcesProtectedByTheBarrier (...)


Appendix B: Java Executables

These examples use the JCSP library for processes and channels [11]. A process is an instance of a class that implements the CSProcess interface. This is similar to, but different from, the standard Runnable interface:

package jcsp.lang;

public interface CSProcess {
  public void run ();
}

For example, from the `Legoland' catalogue (Section 2.3):

import jcsp.lang.*;       // processes and object-carrying channels
import jcsp.lang.ints.*;  // integer versions of channels

class Succ implements CSProcess {

  private ChannelInputInt in;
  private ChannelOutputInt out;

  public Succ (ChannelInputInt in, ChannelOutputInt out) {
    this.in = in;
    this.out = out;
  }

  public void run () {
    while (true) {
      int x = in.read ();
      out.write (x + 1);
    }
  }
}

class Prefix implements CSProcess {

  private int n;
  private ChannelInputInt in;
  private ChannelOutputInt out;

  public Prefix (int n, ChannelInputInt in, ChannelOutputInt out) {
    this.n = n;
    this.in = in;
    this.out = out;
  }


  public void run () {
    out.write (n);
    new Id (in, out).run ();
  }
}

JCSP provides a Parallel class that combines an array of CSProcesses into a CSProcess. Its execution is the parallel composition of that array. For example, here are two of the `Plug and Play' examples from Sections 2.4 and 2.6:

class Numbers implements CSProcess {

  private ChannelOutputInt out;

  public Numbers (ChannelOutputInt out) {
    this.out = out;
  }

  public void run () {
    One2OneChannelInt a = new One2OneChannelInt ();
    One2OneChannelInt b = new One2OneChannelInt ();
    One2OneChannelInt c = new One2OneChannelInt ();
    new Parallel (
      new CSProcess[] {
        new Delta (a, out, b),
        new Succ (b, c),
        new Prefix (0, c, a),
      }
    ).run ();
  }
}

class Squares implements CSProcess {

  private ChannelOutputInt out;

  public Squares (ChannelOutputInt out) {
    this.out = out;
  }

  public void run () {
    One2OneChannelInt a = new One2OneChannelInt ();
    One2OneChannelInt b = new One2OneChannelInt ();
    new Parallel (
      new CSProcess[] {
        new Numbers (a),
        new Integrate (a, b),
        new Pairs (b, out),
      }
    ).run ();
  }
}


Here is one of the controllers from Section 2.7. The processes ProcessReadInt and ProcessWriteInt just read and write a single integer (into and from a public value field) and, then, terminate:

class Replace implements CSProcess {

  private AltingChannelInputInt in;
  private AltingChannelInputInt inject;
  private ChannelOutputInt out;

  public Replace (AltingChannelInputInt in,
                  AltingChannelInputInt inject,
                  ChannelOutputInt out) {
    this.in = in;
    this.inject = inject;
    this.out = out;
  }

  public void run () {
    Alternative alt = new Alternative (new Guard[] {inject, in});
    final int INJECT = 0, IN = 1;  // Guard indices (prioritised)
    ProcessWriteInt forward = new ProcessWriteInt (out);  // a CSProcess
    ProcessReadInt discard = new ProcessReadInt (in);     // a CSProcess
    CSProcess parIO = new Parallel (new CSProcess[] {discard, forward});
    while (true) {
      switch (alt.priSelect ()) {
        case INJECT:
          forward.value = inject.read ();
          parIO.run ();
          break;
        case IN:
          out.write (in.read ());
          break;
      }
    }
  }
}

JCSP also has channels for sending and receiving arbitrary Objects. Here is an asynchronous receive (from Section 2.9) of an expected Packet:

// set up processes once (before we start looping ...)
ProcessRead readObj = new ProcessRead (in);        // a CSProcess
CSProcess someMore = new someMoreComputation (...);
CSProcess async = new PriParallel (new CSProcess[] {readObj, someMore});

while (looping) {
  async.run ();
  Packet packet = (Packet) readObj.value;
  continue (...);
}
