Pitfalls in Teaching Development and Testing of Concurrent Programs and How to Overcome Them

Post on 06-Jan-2016


  • Pitfalls in Teaching Development and Testing of Concurrent Programs and How to Overcome Them

    Eitan Farchi

    IBM Labs in Haifa

    Objectives of the course I wanted to teach

      • Background: the process abstraction, mutual exclusion and conditional synchronization, scheduling policies and fairness, the process life cycle, synchronization primitives (semaphores, monitors), message passing, logical time, examples, ...
      • Design the protocol through an abstraction
          • Use atomic and atomic wait primitives
              (c1, s1) -> s2  =>  (c1 || c2, s1) -> (c2, s2)
              (c1, s1) -> s2  =>  (<c1>, s1) -> s2
              (b, s1) -> true and (c, s1) -> s2  =>  (<await b -> c>, s1) -> s2
          • The use of higher abstraction level synchronization primitives leads to
              • A lower number of possible interleavings
              • Mistakes being less likely
      • The design is validated through
          • Reviewing the important interleavings
          • Formal reasoning (invariants, proofs, model checking, ...)
      • Higher abstraction level synchronization primitives are correctly translated to lower abstraction level synchronization primitives
          • For example, an atomic primitive is carefully translated to locks and unlocks
          • Bug patterns are used to avoid mistakes
      • The implementation is tested using ConTest
          • At this stage a good test plan is readily available from the previous development phases

    I taught this course for several years in various formats

      • To third-year computer science students
      • To professional programmers
          • With and without experience in developing concurrent programs
          • With at least a first degree in computer science
      • To testers with varying degrees of programming skill

    Real-world description of the ticket algorithm (start with something concrete)

      • Some stores and government offices use the following method to ensure that customers are served in order of arrival
          • Upon entering the store, a customer draws a number that is larger than the number held by any other customer
          • The customer then waits until all customers holding smaller numbers have been served
      • The algorithm is implemented by a number dispenser and by a display indicating which customer is being served
      • If the store has one employee behind the service counter, customers are served one at a time in their order of arrival

    High-level implementation of the ticket algorithm

        var number := 1, next := 1, turn[1:n] := ([n] 0)
        P[i: 1..n]:: do true ->
                <turn[i] := number; number := number + 1>
                <await turn[i] = next>
                critical section
                <next := next + 1>
                non-critical section
            od

    Mapping of the previous two abstraction levels (real-world and high-level descriptions)

        // customer obtains a ticket
        // customers wait their turn
        // call for next customer
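
    The three steps named in the mapping can be sketched in Java as a spin-waiting ticket lock. This is a minimal sketch, not the course's implementation: the class names TicketLock and TicketDemo, the use of AtomicInteger, and the Thread.yield() busy wait are all assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical name: a spin-waiting ticket lock mirroring the three protocol steps. */
final class TicketLock {
    private final AtomicInteger number = new AtomicInteger(1); // next ticket to hand out
    private final AtomicInteger next = new AtomicInteger(1);   // ticket currently being served

    /** Draws a ticket and waits until it is served. Returns the ticket drawn. */
    int acquire() {
        int turn = number.getAndIncrement();   // customer obtains a ticket (one atomic step)
        while (next.get() != turn) {           // customers wait their turn
            Thread.yield();
        }
        return turn;
    }

    /** Calls for the next customer. Only the current holder may call this. */
    void release() {
        next.incrementAndGet();
    }
}

/** Hypothetical driver: several threads increment a plain counter under the lock. */
final class TicketDemo {
    static int run(int threads, int iterations) {
        TicketLock lock = new TicketLock();
        int[] counter = {0};
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int k = 0; k < iterations; k++) {
                    lock.acquire();
                    try {
                        counter[0]++;          // critical section
                    } finally {
                        lock.release();
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter[0];
    }
}
```

    Calling TicketDemo.run(4, 1000) should return 4000 if the lock provides mutual exclusion. A single passing run is not evidence of correctness, which is the point the testing slides make.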

    Testing/validating the protocol

      • Even if the synchronization primitives are high level, there are typically too many interleavings to review
      • This is addressed by inductive proofs and invariants
          • Assume process i entered the critical section; then turn[i] == next right after the atomic wait statement
          • It is easy to prove that turn[i] != turn[j] if i != j, turn[i] != 0 and turn[j] != 0
          • Thus, as long as the critical section is not exited, any process that reaches the atomic wait statement will have to wait, and at most one process can be in the critical section
      • Students: "We don't like mathematics and we don't like proofs; in fact, we hate them. And by the way, the ticket algorithm is ridiculously simple: it's only a loop with four lines of code."
      • Maybe they don't understand that there is an exponential space of possible interleavings?

    Objectives of the course - updated

      • Background: the process abstraction, mutual exclusion and conditional synchronization, scheduling policies and fairness, the process life cycle, synchronization primitives (semaphores, monitors), message passing, logical time, examples, ...
      • Design the protocol through an abstraction
          • Use atomic and atomic wait primitives
              (c1, s1) -> s2  =>  (c1 || c2, s1) -> (c2, s2)
              (c1, s1) -> s2  =>  (<c1>, s1) -> s2
              (b, s1) -> true and (c, s1) -> s2  =>  (<await b -> c>, s1) -> s2
          • The use of higher abstraction level synchronization primitives leads to
              • A lower number of possible interleavings
              • Mistakes being less likely
      • The design is validated through
          • Systematically representing the set of possible interleavings, typically through the use of Cartesian product models
          • Reviewing the important interleavings
      • Higher abstraction level synchronization primitives are correctly translated to lower abstraction level synchronization primitives
          • For example, an atomic primitive is carefully translated to locks and unlocks
          • Bug patterns are used to avoid mistakes
      • The implementation is tested using ConTest
          • At this stage a good test plan is readily available from the previous development phases

    Helping the students realize that there is an exponential interleaving space

      • First attempt: counting
          • The number of possible interleavings is enormous
          • For (a;b;c;e;f;g) || (h;i;j;k;l;m), two sequences of six non-blocking atomic actions each, the number of possible traces is 12!/(6!*6!) = 924
      • Second attempt: riddles
          • 100 threads are executing x++ on a shared variable initialized to 0; what are the possible outcomes?
      • Students: "OK, there are many things happening together in parallel and they can occur in many ways, but it is hard, too hard, to think about things happening in parallel"
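
    Both attempts can be explored mechanically for small cases. The sketch below counts interleavings with the binomial formula used on the slide and enumerates the final values of x for a few threads. It rests on stated assumptions: x++ is modeled as two atomic steps (read x, then write the read value plus one), and the outcome is the value of x after every thread has finished. The class name Interleavings and its methods are hypothetical.

```java
import java.util.TreeSet;

/** Hypothetical helper for exploring small interleaving spaces. */
final class Interleavings {
    /** Interleavings of two sequences of m and n atomic actions: (m+n)! / (m! * n!). */
    static long count(int m, int n) {
        long result = 1;
        for (int k = 1; k <= n; k++) {
            // After this step result equals C(m + k, k), so the division is exact.
            result = result * (m + k) / k;
        }
        return result;
    }

    /** Final values of x when each thread runs x++ as: read x, then write (read value + 1). */
    static TreeSet<Integer> outcomes(int threads) {
        TreeSet<Integer> finals = new TreeSet<>();
        explore(new int[threads], new int[threads], 0, finals);
        return finals;
    }

    // pc[t]: 0 = before the read, 1 = before the write, 2 = finished.
    private static void explore(int[] pc, int[] local, int x, TreeSet<Integer> finals) {
        boolean allDone = true;
        for (int t = 0; t < pc.length; t++) {
            if (pc[t] == 2) {
                continue;
            }
            allDone = false;
            int savedLocal = local[t];
            int nextX = x;
            if (pc[t] == 0) {
                local[t] = x;            // read step
            } else {
                nextX = local[t] + 1;    // write step
            }
            pc[t]++;
            explore(pc, local, nextX, finals);
            pc[t]--;                     // backtrack and try the next thread
            local[t] = savedLocal;
        }
        if (allDone) {
            finals.add(x);
        }
    }
}
```

    Interleavings.count(6, 6) reproduces the 924 from the slide. Running outcomes(2) or outcomes(3) and printing the set gives students a concrete space to inspect before reasoning about 100 threads, where full enumeration is no longer feasible.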

    Serialization helps understand the algorithm

    Serialization helps understand the algorithm (Continued)

    Next we implement the protocol

      • Students: "Locks are easy to use, no need to read the instructions"

    Avoid errors by understanding the synchronization primitives [precise-java]

      • In Java each object is associated with a lock
      • Consider the following class

            class Conflict {
                Conflict() { synchronized (Conflict.class) { } }
                synchronized static void f() { ... }
                synchronized void g() { ... }
                void h() { synchronized (this) { ... } }
                void r() { }
            }

      • Which of the following pairs of methods, when executing concurrently, can cause a conflict?
          • f || g, f || h, f || r, g || h, g || r, h || r
          • Pairs of the constructor and one of the other methods
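
    One way to let students check their answer is to ask the JVM which monitor each method holds while it runs. The sketch below is a variation on the Conflict class, with the hypothetical name ConflictLocks and hypothetical recording fields; it relies only on the standard Thread.holdsLock method.

```java
/** Hypothetical variant of Conflict that records which monitor each method holds. */
final class ConflictLocks {
    static boolean fHoldsClassLock;
    boolean gHoldsThis;
    boolean gHoldsClassLock;
    boolean hHoldsThis;
    boolean rHoldsThis;

    // A static synchronized method locks the Class object.
    static synchronized void f() {
        fHoldsClassLock = Thread.holdsLock(ConflictLocks.class);
    }

    // An instance synchronized method locks the receiver, not the Class object.
    synchronized void g() {
        gHoldsThis = Thread.holdsLock(this);
        gHoldsClassLock = Thread.holdsLock(ConflictLocks.class);
    }

    // A synchronized(this) block takes the same monitor as g().
    void h() {
        synchronized (this) {
            hHoldsThis = Thread.holdsLock(this);
        }
    }

    // No synchronization at all.
    void r() {
        rHoldsThis = Thread.holdsLock(this);
    }
}
```

    Methods that hold the same monitor exclude each other; methods that hold different monitors, or none, can run concurrently and may conflict on shared state.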

    Translating from abstract to concrete - implementation pitfalls are explained

      • Difference between atomicity and locking
          • What protection is provided by synchronized(o){x++} occurring in parallel to x++?
          • When translating from an atomic block to locks/unlocks we need to identify all program locations that contend on the shared resource
      • Check that the lock was obtained; lock() ... unlock() without such a check is not good
      • Check that the lock is released along all error paths
          • What happens if a signal is taken while in the critical section (pthreads)?
          • What happens if an interrupt exception is taken while in wait()?
                try{synchronized(o){o.wait();}}catch(Exception e){}
      • When an atomic conditional wait is implemented we typically introduce a race, and we need to recheck the condition once in the critical section
      • Teaching pitfalls is highly effective in reducing the learning curve
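
    A few of these pitfalls have well-known Java idioms, sketched below under stated assumptions. GuardedCounter and Gate are hypothetical names. The sketch uses java.util.concurrent.locks.ReentrantLock for the lock/unlock translation and the intrinsic monitor for the conditional wait, which is one possible choice rather than the one the course prescribes.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical example: an atomic block translated to lock/unlock. */
final class GuardedCounter {
    private final ReentrantLock lock = new ReentrantLock();
    private int x;

    /** Releases the lock on every path, including exceptions, via finally. */
    void increment() {
        lock.lock();
        try {
            x++;
        } finally {
            lock.unlock();
        }
    }

    /** Checks that the lock was actually obtained before touching shared state. */
    boolean tryIncrement() {
        if (!lock.tryLock()) {
            return false;              // lock not obtained: do not enter, do not unlock
        }
        try {
            x++;
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Readers contend on the same resource, so they take the same lock. */
    int get() {
        lock.lock();
        try {
            return x;
        } finally {
            lock.unlock();
        }
    }
}

/** Hypothetical example: an atomic conditional wait translated to wait/notifyAll. */
final class Gate {
    private boolean open;

    synchronized void open() {
        open = true;
        notifyAll();
    }

    /** Returns true once the gate is open, false if the wait was interrupted. */
    synchronized boolean awaitOpen() {
        try {
            while (!open) {            // recheck the condition after every wake-up
                wait();
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();   // restore the flag for the caller
            return false;
        }
    }
}
```

    The while loop around wait() is the recheck mentioned in the list: between the notification and the moment the waiter reacquires the monitor, another thread may have changed the condition.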

    Hiding the protocol implementation

      • Prepare general synchronization services for the system, located in a separate class (see picture on the right)
      • Students: "OK, but we'll implement the protocol all over the place anyway"
      • Hard to teach without experience of large real-life systems
      • Hard to suggest to engineers who maintain an existing system that is not structured like that
          • "If it's not broken, don't fix it"

    Testing

      • Running a test that has a concurrency problem many times does not necessarily reproduce the problem
          • Especially in unit test environments
          • Easy to demonstrate through examples
      • Create an empty implementation in which the synchronization primitives used are mapped to no-ops, and show that the test still reports that the protocol works fine
          • Best practice: your test should at least expose a problem with the empty implementation
      • Running black box tests that have the required contention (e.g., customers accessing the ticketing system simultaneously) does not necessarily produce the white box contention you are after
          • The blocking in the atomic wait to occur and not occur
          • A context switch to occur right before and right after the protocol's atomic statements
      • Defining the coverage tasks you are after and checking their coverage helps
          • E.g., ConTest synchronization coverage
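
    The best practice above can be illustrated with a test that forces the contention it needs instead of hoping for it. This is a sketch under stated assumptions: the Mutex interface, RealMutex, EmptyMutex, and MutexProbe are hypothetical names, and the probe assumes the second thread gets scheduled within the hold time. It does not use ConTest.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical abstraction of the synchronization service under test. */
interface Mutex {
    void acquire();
    void release();
}

final class RealMutex implements Mutex {
    private final ReentrantLock lock = new ReentrantLock();
    public void acquire() { lock.lock(); }
    public void release() { lock.unlock(); }
}

/** The "empty implementation": every primitive is a no-op. */
final class EmptyMutex implements Mutex {
    public void acquire() { }
    public void release() { }
}

final class MutexProbe {
    /** Returns true if a second thread entered while the first was still inside. */
    static boolean overlapObserved(Mutex m, long holdMillis) {
        CountDownLatch firstInside = new CountDownLatch(1);
        CountDownLatch secondInside = new CountDownLatch(1);
        boolean[] overlap = {false};
        Thread first = new Thread(() -> {
            m.acquire();
            try {
                firstInside.countDown();
                // Stay inside and watch for an intruder until the hold time expires.
                overlap[0] = secondInside.await(holdMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                m.release();
            }
        });
        Thread second = new Thread(() -> {
            try {
                firstInside.await();       // contend only once the first thread is inside
                m.acquire();
                try {
                    secondInside.countDown();
                } finally {
                    m.release();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return overlap[0];
    }
}
```

    A test built on MutexProbe.overlapObserved reports an overlap for EmptyMutex and none for RealMutex, so it meets the minimum bar of exposing the empty implementation. A test that simply runs many threads and checks a final count may pass for both.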

  • BACKUP

    Exercises - knowing the synchronization primitives (Java)

      • 100 threads execute i++ where i is a global variable. Describe all possible outcomes
      • The following thread is interrupted while waiting at the blue statement below

            try {
                synchronized (foo) {
                    foo.wait();
                }
            } catch (Exception e) { };

      • Is the thread still holding the lock, and is the thread interrupt bit turned on, at the red statement above?
      • What are the answers to the same questions if we change the program to:

            synchronized (foo) {
                try {
                    foo.wait();
                } catch (Exception e) { };
            }
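
    The two variants can be probed directly rather than answered from memory. The sketch below records, inside the catch block, whether the thread holds the monitor and whether its interrupt flag is set. WaitInterruptProbe and its fields are hypothetical names, and the sketch catches InterruptedException where the slide catches Exception.

```java
/** Hypothetical probe for the two wait/interrupt variants in the exercise. */
final class WaitInterruptProbe {
    private final Object foo = new Object();
    boolean caught;
    boolean heldLockInCatch;
    boolean interruptFlagInCatch;

    /** First variant: the try/catch wraps the synchronized block. */
    void catchOutside() {
        try {
            synchronized (foo) {
                foo.wait();
            }
        } catch (InterruptedException e) {
            record();
        }
    }

    /** Second variant: the try/catch sits inside the synchronized block. */
    void catchInside() {
        synchronized (foo) {
            try {
                foo.wait();
            } catch (InterruptedException e) {
                record();
            }
        }
    }

    private void record() {
        caught = true;
        heldLockInCatch = Thread.holdsLock(foo);
        interruptFlagInCatch = Thread.currentThread().isInterrupted();
    }

    /** Starts the chosen variant, interrupts it once it is waiting, returns the probe. */
    static WaitInterruptProbe run(boolean inside) {
        WaitInterruptProbe p = new WaitInterruptProbe();
        Runnable body = inside ? p::catchInside : p::catchOutside;
        Thread t = new Thread(body);
        t.start();
        try {
            // Poll until the thread is parked in wait(); also stop if it ended early.
            while (t.getState() != Thread.State.WAITING
                    && t.getState() != Thread.State.TERMINATED) {
                Thread.sleep(1);
            }
            t.interrupt();
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return p;
    }
}
```

    Comparing the fields of WaitInterruptProbe.run(false) and WaitInterruptProbe.run(true) shows how the placement of the try block changes the state in which the handler runs.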

    Exercises - knowing the synchronization primitives (Java)

      • What happens if one thread executes the following method recursively, e.g., by executing factorial(7)?

            synchronized int factorial(int i) {
                if (i == 0) return 1;
                else return i * factorial(i - 1);
            }
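
    The exercise can be run as is. The sketch below wraps the method in a hypothetical class, ReentrantFactorial, and adds one hypothetical field that records whether the monitor is held at the deepest level of the recursion.

```java
/** Hypothetical wrapper around the exercise's method. */
final class ReentrantFactorial {
    boolean heldAtBaseCase;

    synchronized int factorial(int i) {
        if (i == 0) {
            // Reached after re-entering the same monitor once per recursive call.
            heldAtBaseCase = Thread.holdsLock(this);
            return 1;
        }
        return i * factorial(i - 1);
    }
}
```

    Java monitors are reentrant: a thread that already owns an object's monitor can enter another synchronized method on the same object without blocking, so the recursive call does not deadlock.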

    Will Parallel Programming Become Common Knowledge and the Parallel Programmer the Programmer of the Future?

      • It is hard to teach parallel program development and verification to novices
          • Comprehending the space of possible interleavings is hard
          • Accurately and correctly defining the behavior of many threads acting in parallel is hard
      • With the introduction of multi-core, there is an increasing need for programmers who are able to reliably develop parallel programs
      • But maybe a different solution is possible?
          • Can we avoid the need for the parallel programmer?
          • Can we have the compiler or the programming language encapsulate the difficulties of parallelism and return the genie to the bottle?
      • Will parallel programming become common knowledge and the parallel programmer the agent of the next revolution in programming paradigms?

    Will Parallel Programming Become Common Knowledge and the Parallel Programmer the Programmer of the Future? (continued)

      • How will future multi-core systems be programmed? How well do existing primitives address various application domains and how well do they coexist? (3)
      • What is the role of high-level primitives (e.g., the transaction model)? Can it hide performance? (3)
      • Is the major difficulty in programming parallel programs testing them? (2)
      • How do we address students' huge difficulties in predicting possible interleavings and, most especially, the unwanted/undesired ones? (2)
      • What courses should be added to the curriculum and what should be taught on the job? (2)
      • What is the minimum knowledge one needs if the underlying program is parallel? To be more specific, most programmers probably know close to nothing about compiler optimization and about the processor structure. Will they need more knowledge in the future, or can the details be hidden from them? (1)
      • What will be the minimum knowledge needed by a parallel programmer and how will he or she acquire it, with emphasis on testing/debugging? (1)

    Note: background was actually added and was assumed in the first version of the class
    Some details are left out; can you identify them?
    Note: formal methods dropped
    Answer: 0 to 100

    Note: ConTest synchronization coverage is used as a way to demonstrate to practitioners that they are not testing concurrency

    Sent the abstract to PADTAD community