Parallel Computing is Now Mainstream
> Cores are reaching performance limits
  > More transistors per core just makes it hot
> New processors are multi-core
  > and maybe multithreaded as well
> Uniform shared memory within a socket
  > Multi-socket may be pretty non-uniform
> Logic cost ($ per gate-Hz) keeps falling
> New “killer apps” will doubtless need more performance
> How should we write parallel programs?
Parallel Programming Practice Today
> Threads and locks
> SPMD languages
  > OpenMP
  > Co-array Fortran, UPC, and Titanium
> Message-passing languages
  > MPI, Erlang
> Data-parallel languages
  > CUDA, OpenCL
> Most of these are pretty low level
Higher Level Parallel Languages
> Allow higher level data-parallel operations
  > E.g. programmer-defined reductions and scans
> Exploit architectural support for parallelism
  > SIMD instructions, inexpensive synchronization
> Provide for abstract specification of locality
> Present a transparent performance model
> Make data races impossible
  > For this last item, something must be done about unrestricted use of variables
Shared Memory is not the Problem
> Shared memory has some benefits:
  > Forms a delivery vehicle for high bandwidth
  > Permits unpredictable, data-dependent sharing
  > Provides a large synchronization namespace
  > Facilitates high-level language implementations
    > Language implementers like it as a target
  > Non-uniform memory can even scale
> But shared variables are an issue: stores do not commute with other loads or stores
> Shared memory isn’t a programming model
Pure Functional Languages
> Imperative languages do computations by scheduling values into variables
  > Their parallel dialects are prone to data races
  > There are far too many parallel schedules
> Pure functional languages avoid data races simply by avoiding variables entirely
  > They compute new constants from old
  > Loads commute, so data races can’t happen
  > Dead constants can be reclaimed efficiently
> But no variables implies no mutable state
Mutable State is Crucial for Efficiency
> To let data structures inexpensively evolve
> To avoid always copying nearly all of them
> Monads were added to pure functional languages to allow mutable state (and I/O)
  > Plain monadic updates may still have data races
> The problem is maintaining state invariants
  > These are just a program’s “conservation laws”
  > They describe the legal attributes of the state
  > As with physics, they are associated with a certain generalized type of commutativity
Maintaining Invariants
> Updates perturb, then restore an invariant
  > Program composability depends on this
  > It’s automatic for us once we learn to program
> How can we maintain invariants in parallel? Two requirements must be met:
  > Updates must not interfere with each other
    > That is, they must be isolated in some fashion
  > Updates must finish once they start
    > …lest the next update see the invariant false
    > We say the state updates must be atomic
> Updates that are both isolated and atomic are called transactions
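A minimal sketch of these two requirements, using an invented two-account example: the invariant (a “conservation law”) is that the balances always sum to the same total. A lock provides isolation, and doing both halves of the update before releasing it provides atomicity.

```python
import threading

# Hypothetical state: two accounts whose balances must always sum to 200.
balances = {"a": 100, "b": 100}
lock = threading.Lock()          # provides isolation

def transfer(src, dst, amount):
    # The lock isolates the update; completing both the debit and the
    # credit before releasing it makes the update atomic, so no other
    # thread ever observes the invariant false.
    with lock:
        balances[src] -= amount   # invariant temporarily perturbed...
        balances[dst] += amount   # ...and restored before anyone looks

threads = [threading.Thread(target=transfer, args=("a", "b", 1))
           for _ in range(50)]
for t in threads: t.start()
for t in threads: t.join()
assert balances["a"] + balances["b"] == 200   # invariant preserved
```

Each `transfer` is a tiny transaction: isolated (the lock) and atomic (both writes happen before the lock is released).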
Commutativity and Non-Determinism
> If p and q preserve invariant I and do not interfere, their parallel execution { p || q } also preserves I †
> If p and q are performed in isolation and atomically, i.e. as transactions, then they will not interfere ‡
> Operations may not commute with respect to state
  > But we always get commutativity with respect to the invariant
> This leads to a weaker form of determinism
  > Long ago some of us called it “good non-determinism”
  > It’s the non-determinism operating systems rely on

† Susan Owicki and David Gries. Verifying Properties of Parallel Programs: An Axiomatic Approach. CACM 19(5), pp. 279–285, May 1976.
‡ Leslie Lamport and Fred Schneider. The “Hoare Logic” of CSP, and All That. ACM TOPLAS 6(2), pp. 281–296, April 1984.
Example: Hash Tables
> Hash tables implement sets of items
> The key invariant is that an item is in the set iff its insertion followed all removals
> There are also storage structure invariants, e.g. hash buckets must be well-formed linked lists
> Parallel insertions and removals need only maintain the logical AND of these invariants
> This may not result in a deterministic state
  > The order of items in a bucket is unspecified
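A sketch of this example (the class and its names are illustrative, not from the talk): a hash set with one lock per bucket. Operations on different buckets are isolated by construction; operations on the same bucket are serialized, so each bucket is always well-formed, while the order of items within a bucket remains unspecified.

```python
import threading

NBUCKETS = 8

class LockedHashSet:
    """A toy hash set with per-bucket locking for isolation."""
    def __init__(self):
        self.buckets = [[] for _ in range(NBUCKETS)]
        self.locks = [threading.Lock() for _ in range(NBUCKETS)]

    def insert(self, item):
        i = hash(item) % NBUCKETS
        with self.locks[i]:               # isolate this bucket only
            if item not in self.buckets[i]:
                self.buckets[i].append(item)

    def remove(self, item):
        i = hash(item) % NBUCKETS
        with self.locks[i]:
            if item in self.buckets[i]:
                self.buckets[i].remove(item)

    def contains(self, item):
        i = hash(item) % NBUCKETS
        with self.locks[i]:
            return item in self.buckets[i]

s = LockedHashSet()
threads = [threading.Thread(target=s.insert, args=(n,)) for n in range(100)]
for t in threads: t.start()
for t in threads: t.join()
# The membership invariant holds even though bucket order may vary run to run.
assert all(s.contains(n) for n in range(100))
```

This is “good non-determinism” in miniature: the set’s contents are deterministic, the bucket layout is not.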
High Level Data Races
> Some loads and stores can be isolated and atomic but cover only part of the invariant
  > E.g. copying data from one structure to another
  > If atomicity is violated, the data can be lost
> Another example is isolating a graph node while deleting it, but then decrementing its neighbors’ reference counts with LOCK DEC
  > Some of the neighbors may no longer exist
> It is challenging to see how to automate data race detection for examples like these
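The first example can be made concrete with a small sketch (illustrative names): moving an item between two lists uses two individually atomic steps, one lock per list, yet neither lock covers the composite invariant “the item is in exactly one list”. Here the bad interleaving is simulated deterministically by observing between the two steps.

```python
import threading

src, dst = ["x"], []
src_lock, dst_lock = threading.Lock(), threading.Lock()

def move_with_midpoint_observation():
    with src_lock:
        item = src.pop()              # step 1: atomic w.r.t. src
    # Between the two steps, an observer sees the item in *neither*
    # list; we check that here to make the window explicit.
    seen_somewhere = ("x" in src) or ("x" in dst)
    with dst_lock:
        dst.append(item)              # step 2: atomic w.r.t. dst
    return seen_somewhere

assert move_with_midpoint_observation() is False  # invariant visibly broken mid-move
assert dst == ["x"]                               # yet the final state looks fine
```

Each step is race-free at the level of loads and stores; the race is at the level of the invariant, which is why it is so hard to detect automatically.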
Other Examples
> Databases and operating systems commonly mutate state in parallel
> Databases use transactions to achieve consistency via atomicity and isolation
  > SQL programming is pretty simple
  > SQL is arguably not general-purpose
> Operating systems use locks for isolation
  > Atomicity is left to the OS developer
  > Lock ordering is used to prevent deadlock
> A general-purpose parallel language should easily handle applications like these
Implementing Isolation
> Analysis
  > Proving concurrent state updates are isolated
> Locking
  > Deadlock must be handled somehow
> Buffering
  > Often used for wait-free updates
> Partitioning
  > Partitions can be dynamic, e.g. as in quicksort
> Serializing
> These schemes can be nested
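The quicksort remark can be sketched as follows (a toy, thread-per-call version): after partitioning, the two recursive sorts touch disjoint slices of the array, so they are isolated dynamically with no locks at all.

```python
import threading

def parallel_quicksort(a, lo=0, hi=None):
    """Sort a[lo..hi] in place; recursive calls run in parallel."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    # Lomuto partition around a[hi]
    pivot, i = a[hi], lo
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    # The two halves are disjoint partitions of the array, so the two
    # recursive sorts cannot interfere: isolation by partitioning.
    left = threading.Thread(target=parallel_quicksort, args=(a, lo, i - 1))
    right = threading.Thread(target=parallel_quicksort, args=(a, i + 1, hi))
    left.start(); right.start()
    left.join(); right.join()

data = [5, 3, 8, 1, 9, 2, 7]
parallel_quicksort(data)
assert data == [1, 2, 3, 5, 7, 8, 9]
```

The partitions are computed at run time, which is what makes this scheme dynamic rather than static.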
Isolation in Existing Languages
> Static in space: MPI, Erlang
> Dynamic in space: Refined C, Jade
> Static in time: Serial execution
> Dynamic in time: Single global lock
> Static in both: Dependence analysis
> Semi-static in both: Inspector-executor
> Dynamic in both: Multiple locks
Atomicity
> Atomicity means “all or nothing” execution
  > State changes must be all done or all undone
> Isolation without atomicity has little value
  > But atomicity is vital even in the serial case
> Implementation techniques:
  > Compensating, i.e. reversing a computation “in place”
  > Logging, i.e. remembering and restoring the original state values
> Atomicity is challenging for distributed computing and I/O
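The logging technique can be sketched in a few lines (class and method names are invented for illustration): before each write we record the original value, and on failure we replay the log backwards so the update is all or nothing.

```python
class LoggedUpdate:
    """Toy undo log: record old values, restore them if the update aborts."""
    def __init__(self, state):
        self.state = state
        self.log = []                     # (key, original value) pairs

    def write(self, key, value):
        self.log.append((key, self.state.get(key)))
        self.state[key] = value

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            # Abort: undo in reverse order, restoring original values.
            for key, old in reversed(self.log):
                if old is None:           # key did not exist before
                    self.state.pop(key, None)
                else:
                    self.state[key] = old
        return False                      # re-raise any exception

state = {"x": 1, "y": 2}
try:
    with LoggedUpdate(state) as tx:
        tx.write("x", 10)
        raise RuntimeError("update failed mid-way")
except RuntimeError:
    pass
assert state == {"x": 1, "y": 2}          # the partial update was undone
```

Compensating would instead run inverse operations; logging trades extra memory for generality, since it needs no inverses.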
Exceptions
> Exceptions can threaten atomicity
  > An aborted state update must be undone
> What if a state update depends on querying a remote service and the query fails?
  > The remote service should send exception information in lieu of the data
  > Message arrival can then throw as usual, and the partial update can be undone
Transactional Memory
> “Transactional memory” means transaction semantics within lexically scoped blocks
> TM has been a hot topic of late
  > As usual, lexical scope seems a virtue here
  > Adding TM to existing languages has problems
> There is a lot of optimization work to do to make atomicity and isolation highly efficient
> Meanwhile, we shouldn’t ignore traditional ways to get transactional semantics
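A toy sketch of the optimistic scheme many TM implementations use (purely illustrative, not any real TM API): reads are recorded with version numbers, writes go to a private buffer, and commit succeeds only if nothing read has changed since, otherwise the block retries.

```python
import threading

class TVar:
    """A versioned transactional variable (toy)."""
    def __init__(self, value):
        self.value, self.version = value, 0

_commit_lock = threading.Lock()

def atomically(block):
    while True:
        reads, writes = {}, {}

        def read(tv):
            if tv in writes:                      # see our own writes
                return writes[tv]
            reads.setdefault(tv, tv.version)      # remember version seen
            return tv.value

        def write(tv, v):
            writes[tv] = v                        # buffer, don't publish

        result = block(read, write)
        with _commit_lock:
            # Validate: did anything we read change under us?
            if all(tv.version == ver for tv, ver in reads.items()):
                for tv, v in writes.items():      # publish atomically
                    tv.value, tv.version = v, tv.version + 1
                return result
        # Validation failed: some TVar changed; retry the whole block.

a, b = TVar(100), TVar(0)

def transfer(read, write):
    write(a, read(a) - 10)
    write(b, read(b) + 10)

atomically(transfer)
assert a.value == 90 and b.value == 10
```

The lexically scoped block is the transaction: its writes become visible all at once or not at all, and interfering runs simply retry.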
Whence Invariants?
> Can we generate invariants from code?
  > Only sometimes, and it is difficult even then
> Can we generate code from invariants?
  > Is this the same as intentional programming?
> Can we write invariants plus code and let the compiler check invariant preservation?
  > This is much easier, but may be less attractive
> Can languages make it more likely that a transaction covers the invariant’s domain?
  > E.g. by leveraging objects with encapsulated state
> Can we at least debug our mistakes?
Conclusions
> Functional languages with transactions enable higher level parallel programming
  > Microsoft is heading in this general direction
> Efficient implementations of isolation and atomicity are important
  > We trust architecture will ultimately help support these things
> The von Neumann model needs replacing, and soon
© 2009 Microsoft Corporation. All rights reserved.