Thread with care: concurrency pitfalls in Java [Iași CodeCamp, 25th October 2014]
DESCRIPTION
We live in a world where parallel computing is becoming more and more ubiquitous: from large-scale GPU-based computing grids for Bitcoin mining and Monte Carlo simulations, to desktop processors with 16+ cores, GPUs with 5760 shader cores, and octa-core smartphones. Leveraging this computing power effectively while also guaranteeing correctness requires a deep understanding of the threading capabilities and of the memory model behaviors hidden behind the language primitives that we use every day while coding. Yet concurrency is one of the least taught subjects in programming courses around the world, and one of the most obscure topics for developers, often including very senior ones. We'll explore the topic together by walking through cases in the Java world. We'll also delve into novel approaches that can simplify it, such as Actors and Software Transactional Memory.
TRANSCRIPT
Thread with care: concurrency pitfalls in Java
Luigi Lauro (Enterprise Architect - Product Line Group Risk Management), UniCredit Business Integrated Solutions
Iași, 25 October 2014
2 Iași, 25 October 2014 Luigi Lauro - UniCredit Business Integrated Solutions
DISCLAIMER: this presentation was made for a talk at CodeCamp. It was a very interactive session, so most of the explanations and teaching were delivered by voice and are not present in the text of the slides, because I wanted to push people to think, talk and interact.
I hope you can still follow the topic and the overall message I wanted to convey.
Agenda
1. Concurrent and Parallel Computing
2. Why parallel computing matters?
3. Threading in Java
4. Common Pitfalls
5. Solutions
6. Novel Approaches
Concurrent and Parallel Computing: What’s the difference?
Concurrent computing
Computing in which computations are executing during overlapping time periods – concurrently – instead of sequentially (one completing before the next starts)
Parallel computing
Computing in which computations are carried out simultaneously, in parallel
Do they have the same hardware/software requirements?
What’s the relationship between them? Is one a superset of the other? Are they mutually exclusive?
Do they share the same risks, the same dangers?
Why parallel computing matters? From Moore to Amdahl
Moore’s Law
Over the history of computing hardware, the number of transistors in a dense integrated circuit doubles approximately every two years
Amdahl’s Law
Maximum speedup of parallel computing is limited by the fraction of the problem that must be performed sequentially
What does it mean for performance?
What does it mean for the programmer?
What does it mean for the user?
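To put a number on Amdahl’s Law, here is a minimal sketch. It assumes the usual formulation, speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of workers. The class and method names are mine, purely for illustration.

```java
// Illustrative sketch: the names Amdahl and speedup are assumptions, not from the talk.
class Amdahl {

    /** Theoretical speedup for a parallel fraction p (0..1) spread over n workers. */
    static double speedup(double p, int n) {
        return 1.0 / ((1.0 - p) + p / n);
    }

    public static void main(String[] args) {
        // With 90% of the work parallelizable, the speedup approaches 10x but never reaches it,
        // no matter how many workers are added.
        for (int n : new int[] {1, 2, 4, 8, 16, 1024}) {
            System.out.printf("%4d workers -> %.2fx%n", n, speedup(0.90, n));
        }
    }
}
```

The sequential 10% sets the ceiling: adding cores past a certain point buys almost nothing, which is why shrinking the sequential part matters more than adding hardware.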
Threading in Java: So easy, yet so complex
EASY
DANGEROUS
COMPLEX
NOT KNOWN
NOT TAUGHT
NOT DOCUMENTED
BITES YOU IN THE (_!_)
… really?
Threading in Java: Quotes on parallel computing
“The way the processor industry is going, is to add more and more cores, but nobody knows how to program those things. I mean, two, yeah; four, not really; eight, forget it.”
Steve Jobs, Apple
“Everybody who learns concurrency thinks they understand it, ends up finding mysterious races they thought weren’t possible, and discovers that they didn’t actually understand it yet after all.”
Herb Sutter, chair of the ISO C++ standards committee, Microsoft
“I decided long ago to stick to what I know best. Other people understand parallel machines much better than I do; programmers should listen to them, not me, for guidance on how to deal with simultaneity.”
Donald Knuth, Professor Emeritus at Stanford University
“Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.”
Brian Kernighan, professor at Princeton University
EASY
Can it get easier than this?
Can you do without threads?
Where are threads?
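A minimal sketch of how little code it takes to start a thread in Java. This is an illustrative example with names of my own choosing, not necessarily the code shown on the slide.

```java
// Illustrative sketch; HelloThread and "worker-1" are hypothetical names.
class HelloThread {

    /** Starts one thread, waits for it, and returns the name of the thread that did the work. */
    static String runInNewThread() {
        final String[] seenBy = new String[1];
        Thread worker = new Thread(
                () -> seenBy[0] = Thread.currentThread().getName(), "worker-1");
        worker.start();                 // the lambda now runs on a separate thread
        try {
            worker.join();              // wait for it; join() also makes its writes visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seenBy[0];
    }

    public static void main(String[] args) {
        System.out.println("main runs on: " + Thread.currentThread().getName());
        System.out.println("task ran on:  " + runInNewThread());
    }
}
```

Three lines to start it, and that is exactly the trap: the easy part is creating the thread, not sharing data with it. Even code that never calls new Thread(...) usually runs inside someone else's threads, for example those of a servlet container, a GUI toolkit or a timer.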
DANGEROUS
And of course you know that perfectly, right?
What is Thread Safety?
A class is thread-safe if it behaves correctly when accessed from multiple threads, regardless of the scheduling or interleaving of the execution of those threads by the runtime environment, and with no additional synchronization or other coordination on the part of the calling code.
How does it become dangerous?
Without proper synchronization, the JVM is designed to exploit any possible optimization, relaxing the guarantees and semantics of Java code in multi-threaded environments in order to ensure maximum performance for single-threaded code.
How can I fix it?
Basically, the assumption is that you know when your classes will be accessed by multiple threads, that you know what the dangers are, and that you will take proper action to notify the JVM of this and ensure proper behavior using the tools Java offers (synchronized, volatile, etc…)
COMPLEX
Understanding concurrent/parallel programming and threading in Java means understanding in detail the Java Language Specification, how the Java Memory Model (JSR 133) works, and the partial-ordering rules (“happens-before”) that are defined there
“The Java Memory Model describes what behaviors are legal in multithreaded code, and
how threads may interact through memory. It describes the relationship between variables
in a program and the low-level details of storing and retrieving them to and from memory or
registers in a real computer system. It does this in a way that can be implemented correctly
using a wide variety of hardware and a wide variety of compiler optimizations.”
Brian Goetz
COMPLEX: What you need to know
Safely share mutable state across threads
Ensure proper visibility and avoid reading stale/partial values
Correctly manage low-level JVM optimizations so that they do not affect the software's semantics
Avoid deadlock and livelock situations
Know what is thread-safe and what is not, and use each accordingly
Properly use the tools Java offers (immutable objects, volatile keyword, implicit/explicit locking, atomic variables, synchronized/concurrent collections, blocking queues and deques, latches, semaphores, barriers, phasers, executors, FutureTask, fork-join framework, parallel bulk operations, non-blocking synchronization, thread and stack confinement, client-side locking, safe publication, fail-fast iterators, lock striping, piggybacking, copy-on-write)
Minimize critical sections, avoid unneeded bottlenecks and maximize code performance in multi-core scenarios
You tick all the checkboxes, right?
NOT KNOWN
Where do you think you stand here?
Most junior programmers don’t know it
Most senior programmers don’t know it
Most “guru“ programmers don’t know it
The seniors and gurus who say they know it only THINK they know it; actually, they don’t
The ones who really do know it are the ones who will tell you that it’s a very complex topic, and that they really don’t know it all
NOT TAUGHT
Did someone or something teach you about this topic?
In Education: although threading is so ubiquitous, it’s considered an ‘advanced topic’ and is mostly not taught at all in schools and universities. Not to mention the fact that most teachers do not actually know anything about it (there are exceptions, of course!)
In Books: most people who write Java books do not really understand the topic, and therefore do not really teach it (example: the best-selling and best-known Java book, by Bruce Eckel, took 10+ years and 4 editions to include a proper and correct discussion of the topic)
In Online Articles: … they are too busy talking about “The Cloud” (whatever that means) or showing you how to build a “HelloWorld” web application in 5 minutes and 10 easy steps with the latest fancy, trendy framework
In Companies: you have to deliver for yesterday something based on the requirements that will come in tomorrow… do you think you have time to think about thread safety??
NOT DOCUMENTED
Try looking for the words “thread” or “concurrent” in the JDBC specification
Is a ServletContext thread-safe? An HttpSession? A DataSource?
Documentation is one of the most powerful tools for managing thread safety. Users look to the documentation to find out if a class is thread-safe, and maintainers look to the documentation to understand the implementation strategy so they can maintain it without inadvertently compromising safety
but…
Most Java frameworks, APIs, libraries and technologies document their thread-safety and threading behaviors very poorly, if at all.
Even the official Java libraries are no exception: more often than not, you have to make assumptions based on common sense, because the Java technology specifications are silent, or at least unforthcoming, about thread-safety guarantees and requirements for the interfaces to implement
BITES YOU IN THE ***
Do you still want to write multi-threaded software?
Multi-threading bugs are the kind of bug that:
Can wreak havoc: visibility/reordering issues can make your application read old or even completely wrong values from a variable; deadlock and livelock situations can make your application hang completely, or run at 100% of the hardware resources without making any progress. Think about what this can do to the real-world application you manage. You will be fired faster than you can read the Exception message.
Are impossible to replicate: it works correctly 1.000.000.000 times, then it fails once, then it works correctly again for another 1.000.000.000 times. Good luck debugging that.
Choose their target well: they work correctly on your development machine (one processor, fewer cores, client JVM) and fail on the production machine (multiple processors, more cores, server JVM). That’s unfair, I know.
Leave you naked against them: you have almost no tools to defend yourself. How can you detect or debug something that happens once every 3 months and manifests as completely unknown behavior? The only way is painstakingly going through *ALL* your code to find where the threading issue is. Good luck with that too.
… really? Yes. Really.
Seems simple enough, doesn’t it?
You think you know the answer?
#ReplyIfYouCan
Anyone up to the challenge?
Challenge
Post the solution to this challenge on Twitter using the hashtag #ReplyIfYouCan
Anyone can participate; you have until Sunday 26th October 2014, 23:59 to submit your answer. At that time, I will post the ‘solution’ to the challenge.
The best answer will be selected for a special prize offered by UniCredit Business Integrated Solutions, and its author will be given a job opportunity for an IT career in Iași
So, where are the problems?
ATOMICITY
Whenever a program has instructions that are logically connected to one another (e.g. “Check Then Act” or “If Then Else” logic), I have to ensure that they are executed atomically (all or none) in a multi-threaded environment; otherwise race conditions can arise that violate the program's intended behavior
VISIBILITY
In a multi-threaded environment, due to the optimization techniques used by the VM and the JIT compiler, there is no guarantee of visibility of mutable state across threads. Whenever I share state across threads, I have to take the proper steps to notify the VM, so that it enforces the actions needed to achieve visibility
ORDERING
The JVM specification allows the JVM to freely re-order the instructions of any program: instead of executing them strictly in order, it may re-order them in a more efficient way for performance purposes, as long as the re-ordering does not affect the intended behavior in a single-threaded environment. This effectively means that in a multi-threaded environment I have to understand what the re-ordering could mean, and take proper action to tell the JVM not to do it in case it would break correct program execution
Scary stuff, isn’t it?
Common Pitfalls: If you have done this, don’t worry, you are not alone
How would you fix this?
ATOMICITY
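A minimal sketch of a classic atomicity pitfall and one possible fix. The classes are hypothetical and written for illustration, not the code from the slide. The broken counter relies on value++, which looks like one step but is really read, add, write.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch; all names here are assumptions made for the example.
class Counters {

    /** Broken under concurrency: value++ is a read, an add and a write, and threads can interleave. */
    static class UnsafeCounter {
        private int value;
        void increment() { value++; }
        int get() { return value; }
    }

    /** One possible fix: make the read-modify-write a single atomic step. */
    static class SafeCounter {
        private final AtomicInteger value = new AtomicInteger();
        void increment() { value.incrementAndGet(); }
        int get() { return value.get(); }
    }

    /** Runs the task callsPerThread times on each of the given number of threads, then waits. */
    static void hammer(int threads, int callsPerThread, Runnable task) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < callsPerThread; j++) task.run();
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        UnsafeCounter unsafe = new UnsafeCounter();
        SafeCounter safe = new SafeCounter();
        hammer(4, 100_000, unsafe::increment);
        hammer(4, 100_000, safe::increment);
        System.out.println("unsafe: " + unsafe.get() + " (may be less than 400000)");
        System.out.println("safe:   " + safe.get());
    }
}
```

The unsafe version can lose updates silently, and how many it loses changes from run to run. Marking increment() as synchronized would be another valid fix.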
VISIBILITY
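A minimal sketch of the typical visibility pitfall, the stop flag. It is a hypothetical example, not the code from the slide. The version below is the fixed one; the comment marks the single keyword that makes the difference.

```java
// Illustrative sketch; StopFlag and Worker are hypothetical names.
class StopFlag {

    static class Worker implements Runnable {
        // Remove "volatile" and the loop in run() is allowed to spin forever:
        // nothing would oblige the JVM to re-read a field written by another thread.
        private volatile boolean running = true;

        @Override
        public void run() {
            while (running) {
                // busy work
            }
        }

        void stop() {
            running = false;
        }
    }

    /** Starts a worker, asks it to stop, and reports whether it ended within the timeout. */
    static boolean stopsWithin(long millis) {
        Worker worker = new Worker();
        Thread t = new Thread(worker);
        t.setDaemon(true);          // do not keep the JVM alive if something goes wrong
        t.start();
        worker.stop();
        try {
            t.join(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !t.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + stopsWithin(5_000));
    }
}
```

The nasty part is that the non-volatile version often works on a development machine and hangs only under the server JIT, which is exactly the "choose their target well" behavior described earlier.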
ORDERING
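A minimal sketch of an ordering pitfall: publishing data behind a "ready" flag. This is a hypothetical example, not the code from the slide. If ready were a plain field, the two writes in publish() could be re-ordered, and a reader might see ready == true while data is still 0.

```java
// Illustrative sketch; Publication and Box are hypothetical names.
class Publication {

    static class Box {
        private int data;                 // plain field
        private volatile boolean ready;   // volatile: the write to data cannot move after this write

        void publish(int value) {
            data = value;   // 1: write the payload
            ready = true;   // 2: raise the flag (volatile write)
        }

        /** Returns the payload, or -1 while nothing has been published yet. */
        int tryRead() {
            return ready ? data : -1;   // volatile read of ready first, then read of data
        }
    }

    /** Publishes from one thread and spins on the calling thread until the value is visible. */
    static int publishAndRead(int value) {
        Box box = new Box();
        Thread writer = new Thread(() -> box.publish(value));
        writer.start();
        int seen;
        while ((seen = box.tryRead()) == -1) {
            // spin until the flag becomes visible
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("read: " + publishAndRead(42));
    }
}
```

The volatile write and the volatile read that observes it form a happens-before edge, so everything written before the flag was raised is visible to whoever sees the flag.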
Solutions… Ahhh, finally!
Now I KNOW how to fix this!
STATELESS
CONFINEMENT
VOLATILE
LOCKING
IMMUTABLE
THREAD-SAFE CLASSES
STATELESS
if you have no state, you have no risk
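A minimal sketch of what "stateless" means in practice, using a hypothetical class of my own. There are no fields at all: the method works only on its parameters and local variables, which live on the calling thread's stack.

```java
// Illustrative sketch; VatCalculator is a hypothetical class, not from the talk.
class VatCalculator {

    /** Uses only parameters and locals, so there is nothing to share between threads. */
    static long grossCents(long netCents, int vatPercent) {
        long vat = netCents * vatPercent / 100;   // local variable: private to the calling thread
        return netCents + vat;
    }

    public static void main(String[] args) {
        System.out.println(grossCents(1_000, 24));
    }
}
```

Any number of threads can call grossCents at the same time without any synchronization. The moment someone adds a field, for example a "last result" cache, the class stops being stateless and all the earlier pitfalls apply again.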
CONFINEMENT
if your state can only be accessed by one thread, you’re safe
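A minimal sketch of thread confinement with ThreadLocal. SimpleDateFormat is a well-known example of a JDK class that is not safe to share, so each thread gets its own instance. The Dates class is hypothetical, and ThreadLocal.withInitial assumes Java 8 or later.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Illustrative sketch; Dates is a hypothetical helper class.
class Dates {

    // One formatter per thread: the mutable formatter is never shared, so no locking is needed.
    private static final ThreadLocal<SimpleDateFormat> DAY_FORMAT =
            ThreadLocal.withInitial(() -> {
                SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
                f.setTimeZone(TimeZone.getTimeZone("UTC"));
                return f;
            });

    static String format(Date date) {
        return DAY_FORMAT.get().format(date);   // get() returns the calling thread's own instance
    }

    public static void main(String[] args) {
        System.out.println(format(new Date(0L)));
    }
}
```

Local variables give you the same guarantee for free (stack confinement); ThreadLocal is for objects that are expensive to create and that you want to reuse across calls on the same thread.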
VOLATILE
when you have shared state, and you need to enforce visibility ONLY, use volatile
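A minimal sketch of a case where visibility really is all you need: one writer, many readers, and no compound operation. Heartbeat is a hypothetical class, with timestamps passed in to keep the example deterministic.

```java
// Illustrative sketch; Heartbeat is a hypothetical class, not from the talk.
class Heartbeat {

    // Written by one thread, read by many: visibility is all that is needed here.
    // volatile also guarantees that the 64-bit write is never seen half-done (JLS 17.7).
    private volatile long lastBeatMillis;

    /** Called by the single monitored thread. */
    void beat(long nowMillis) {
        lastBeatMillis = nowMillis;
    }

    /** Called by any number of watcher threads. */
    boolean isAlive(long nowMillis, long timeoutMillis) {
        return nowMillis - lastBeatMillis <= timeoutMillis;
    }

    public static void main(String[] args) {
        Heartbeat hb = new Heartbeat();
        hb.beat(1_000);
        System.out.println(hb.isAlive(1_500, 1_000));
        System.out.println(hb.isAlive(3_000, 1_000));
    }
}
```

Note what volatile does not give you: something like lastBeatMillis++ or "check the value, then update it" would still be a race, because those are compound operations.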
LOCKING
when you have shared state, and you need to enforce visibility AND atomicity, synchronize
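A minimal sketch of locking with synchronized to protect a check-then-act sequence. Account is a hypothetical class; the point is that the check and the update happen under the same lock, and that the read is synchronized too.

```java
// Illustrative sketch; Account is a hypothetical class, not from the talk.
class Account {

    private long balance;   // guarded by the intrinsic lock on "this"

    Account(long initialBalance) {
        this.balance = initialBalance;
    }

    /** Check and act under one lock, so two threads can never both pass the check. */
    synchronized boolean withdraw(long amount) {
        if (balance < amount) {
            return false;       // check
        }
        balance -= amount;      // act
        return true;
    }

    /** Reads need the same lock, otherwise they may see a stale value. */
    synchronized long balance() {
        return balance;
    }

    public static void main(String[] args) {
        Account account = new Account(100);
        System.out.println(account.withdraw(60));
        System.out.println(account.withdraw(60));
        System.out.println(account.balance());
    }
}
```

A common mistake is to synchronize only the writers: the unsynchronized reader then has a visibility problem even though every update is atomic.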
IMMUTABLE
Immutable objects are always thread-safe
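A minimal sketch of an immutable class, using a hypothetical Route type. It follows the usual recipe: final class, final fields, a defensive copy of mutable inputs, and "modification" that returns a new object instead of changing the existing one.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative sketch; Route is a hypothetical class, not from the talk.
final class Route {

    private final String name;
    private final List<String> stops;

    Route(String name, List<String> stops) {
        this.name = name;
        // Defensive copy plus read-only view: later changes to the caller's list cannot leak in.
        this.stops = Collections.unmodifiableList(new ArrayList<>(stops));
    }

    String name() { return name; }

    List<String> stops() { return stops; }

    /** "Modifying" a route creates a new one; the original is never touched. */
    Route withStop(String stop) {
        List<String> extended = new ArrayList<>(stops);
        extended.add(stop);
        return new Route(name, extended);
    }
}
```

Because all fields are final and set in the constructor, a Route can be handed to any thread without synchronization, provided the reference to it does not escape during construction.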
THREAD-SAFE CLASSES
when you can’t do something, ask someone else to do it
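A minimal sketch of delegating thread safety to a class that already provides it. HitCounter is a hypothetical class; the real work is done by ConcurrentHashMap, whose merge method (Java 8 and later) performs the read-modify-write atomically.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrative sketch; HitCounter is a hypothetical class, not from the talk.
class HitCounter {

    private final ConcurrentMap<String, Integer> hits = new ConcurrentHashMap<>();

    /** merge() inserts 1 or adds 1 to the existing count as one atomic operation. */
    void record(String page) {
        hits.merge(page, 1, Integer::sum);
    }

    int hitsFor(String page) {
        return hits.getOrDefault(page, 0);
    }

    public static void main(String[] args) {
        HitCounter counter = new HitCounter();
        counter.record("/home");
        counter.record("/home");
        System.out.println(counter.hitsFor("/home"));
    }
}
```

The trap to avoid is rebuilding the compound operation by hand with get() followed by put(): each call is thread-safe on its own, but the pair is a check-then-act race.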
Solutions: Let’s see what Java offers us…
Learn these tools and use them!
Fail-fast and fail-safe iterators: almost all non-thread-safe Java collections offer fail-fast iterators, which in most cases throw ConcurrentModificationException when used incorrectly, helping to detect threading issues; most concurrent collections offer fail-safe iterators, which can be used safely while the collection is modified concurrently
Easy ways to ensure immutability and to force confinement: final keyword, ThreadLocal* classes
Direct un-optimized memory access: with the “volatile” keyword, reads and writes behave as if they went directly to main memory, so that no caching or optimization interferes (and, since Java 5, volatile accesses also act as memory barriers that restrict instruction re-ordering), giving you a fast and easy solution to simple visibility issues
Implicit and Explicit Locking: you have at your disposal exclusive implicit locking through the “synchronized” keyword, but you can also use explicit lock classes such as ReentrantLock and ReentrantReadWriteLock, which give more flexibility in dealing with lock unavailability and greater control over queueing behavior (fair vs unfair lock, read-write, etc…)
Atomic classes: many atomic classes that are thread-safe, very fast (thanks to native support for Compare-and-Swap (CAS) hardware CPU instructions) and that also allow thread-safe compound operations like addAndGet, compareAndSet, etc… (AtomicInteger, AtomicLongArray, AtomicReference, etc…)
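A minimal sketch of what explicit locks add over synchronized, namely the ability to deal with lock unavailability instead of blocking. JobLog is a hypothetical class; ReentrantLock.tryLock() returns immediately with false if another thread holds the lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch; JobLog is a hypothetical class, not from the talk.
class JobLog {

    private final ReentrantLock lock = new ReentrantLock();
    private final List<String> jobs = new ArrayList<>();   // guarded by lock

    /** Adds the job only if the log is free right now; never blocks the caller. */
    boolean tryAdd(String job) {
        if (!lock.tryLock()) {
            return false;          // someone else holds the lock: give up instead of waiting
        }
        try {
            jobs.add(job);
            return true;
        } finally {
            lock.unlock();         // always release in finally
        }
    }

    int size() {
        lock.lock();               // blocking acquisition, equivalent to entering synchronized
        try {
            return jobs.size();
        } finally {
            lock.unlock();
        }
    }
}
```

With synchronized a thread that cannot get the lock simply waits, with no way out. With an explicit lock the caller can skip the work, retry later, or use the timed and interruptible variants.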
Thread-safe classes and wrappers: many thread-safe classes for a lot of different operations (StringBuffer, etc…), thread-safe collections (CopyOnWriteArrayList, etc…), and library wrappers that make collections thread-safe through exclusive locking (Collections.synchronizedList/Map/Set, etc…)
Concurrent collections: collections that are thread-safe, very fast, and that support compound operations, through implementation approaches like copy-on-write, lock striping, piggybacking and other advanced non-blocking, lock-free synchronization techniques (CopyOnWriteArrayList/Set, ConcurrentHashMap, ConcurrentSkipListSet/Map, ConcurrentLinkedQueue/Deque, PriorityBlockingQueue, LinkedTransferQueue, SynchronousQueue, etc…)
Synchronizers: when you need to coordinate the work and lifecycle of multiple threads, you have a lot of different classes to help you do so (CountDownLatch, CyclicBarrier, Semaphore, Phaser, Exchanger, etc…)
Threading Tools: when you need to manage multiple threads for tasks and pool/re-use them, you have a lot of thread pool options, executor services and tools to make the job easier (FutureTask, Executor, ExecutorService, ThreadFactory, ScheduledExecutorService, CompletionService, Fork-join framework, Thread.isInterrupted()/interrupted(), daemon threads, etc…)
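A minimal sketch of the threading tools in action: an ExecutorService splits a sum across a fixed pool, and Future.get() collects the partial results. ParallelSum is a hypothetical class and the chunking scheme is just one simple choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative sketch; ParallelSum is a hypothetical class, not from the talk.
class ParallelSum {

    /** Sums the array by giving each of the "parts" pool threads one contiguous chunk. */
    static long sum(int[] data, int parts) {
        ExecutorService pool = Executors.newFixedThreadPool(parts);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            int chunk = (data.length + parts - 1) / parts;   // ceiling division
            for (int p = 0; p < parts; p++) {
                final int from = Math.min(data.length, p * chunk);
                final int to = Math.min(data.length, from + chunk);
                tasks.add(() -> {
                    long partial = 0;                         // confined to this task
                    for (int i = from; i < to; i++) partial += data[i];
                    return partial;
                });
            }
            long total = 0;
            for (Future<Long> f : pool.invokeAll(tasks)) {
                total += f.get();                             // get() makes the task's result visible
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();                                  // otherwise the pool threads keep the JVM alive
        }
    }

    public static void main(String[] args) {
        int[] data = new int[100];
        for (int i = 0; i < data.length; i++) data[i] = i + 1;
        System.out.println(sum(data, 4));
    }
}
```

Notice that there is no lock anywhere: each task works on its own local variable and only reads the shared array, and the hand-off of results is done by the library. In real code the pool would normally be created once and reused, not built per call.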
Novel Approaches: Going forward
LOCKING
LOCKS
PRO
Available: synchronization primitives are available by default in any modern programming language, Java included
Flexible: by mixing all the tools that you have at your disposal, you can design and implement exactly the concurrency model and architecture you need for your purpose
Fast: nothing beats being able to write your very own thread-synchronization code at the lowest level possible, using volatile, atomics, locks, synchronizers
CONS
Knowledge: they require a very deep knowledge of the Java Language Specification and of the Java Memory Model to fully grasp and master the concepts behind them (“happens-before”)
Complicated: managing threading primitives at the lowest level possible is often a big and cumbersome task, a major effort in its own right
Error-prone: it’s very easy to forget to properly handle one corner case, and suddenly turn a perfectly thread-safe class into a broken one
ACTORS
Actors are very lightweight concurrent entities. They hold the state, encapsulate the programming logic, and interact with each other through asynchronous messages and message queues. They raise the abstraction level and make it much easier to write, test, understand and maintain concurrent and/or distributed systems. You focus on workflow (how the messages flow in the system) instead of low-level primitives like threads, locks and socket IO
PRO
Safe: no shared state, and no blocking, so your software is safe “by default”
High Level: you don’t have to deal with threading primitives, and you can focus on the real task at hand
CONS
Restructuring & Rethinking: if you want to use actors, ALL your code will need to use actors, and all of it will need to be designed, written and thought out in terms of actors and message handling
Are you ready to *THINK* in actors?
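A minimal sketch of the actor idea in plain Java, to show the shape of it without any actor library. CounterActor is a hypothetical class: its "mailbox" is a single-threaded executor, so messages are processed one at a time and the state is only ever touched by that one thread. A real actor framework adds much more (supervision, addressing, distribution); none of that is modeled here.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative sketch only; this is not how any particular actor library is implemented.
class CounterActor {

    // The mailbox: one thread, FIFO queue, one message processed at a time.
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();
    private long count;   // state owned by the actor; only the mailbox thread touches it

    /** Fire-and-forget message: "add delta". Returns immediately. */
    void tell(long delta) {
        mailbox.execute(() -> count += delta);
    }

    /** Request-reply message: "what is the count?". Blocks the caller for demo convenience. */
    long ask() {
        Callable<Long> query = () -> count;
        try {
            return mailbox.submit(query).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    /** Stops accepting messages; already queued ones are still processed. */
    void stop() {
        mailbox.shutdown();
    }

    public static void main(String[] args) {
        CounterActor actor = new CounterActor();
        actor.tell(1);
        actor.tell(2);
        actor.tell(3);
        System.out.println(actor.ask());
        actor.stop();
    }
}
```

There is no lock and no volatile in the class, yet any number of threads can call tell() safely: the only shared thing is the queue, and the executor takes care of that.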
SOFTWARE TRANSACTIONAL MEMORY
Software transactional memory (STM) is a concurrency control mechanism analogous to database transactions for controlling access to shared memory in concurrent computing. It is an alternative to lock-based synchronization. A transaction in this context occurs when a piece of code executes a series of reads and writes to shared memory. These reads and writes logically occur at a single instant in time; intermediate states are not visible to other (successful) transactions.
PRO
Increased concurrency: thanks to the optimistic approach, no thread needs to wait for a resource, and different threads can safely access the same resource
Straightforward: you just need to mark the sections that you need to be atomic as transactions, but you don’t have to reason about the state, variables, conditions
CONS
Repeatable operations: every critical section needs to be re-executed if it fails to ‘commit’, which means that side effects could trigger twice, and that do-once operations, such as I/O, cannot be handled this way
Overhead: maintaining the log and verifying and committing transactions cost time and performance, and so does potentially repeating operations again and again
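The JDK has no STM, so the following is only a minimal sketch of the optimistic read, compute, commit, retry cycle that STM is built on, reduced to a single AtomicReference holding an immutable snapshot. OptimisticBank is a hypothetical class, and this is not an STM implementation: a real one tracks many variables per transaction.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative sketch of optimistic commit-and-retry; not a real STM.
class OptimisticBank {

    /** Immutable snapshot of both balances: readers always see a consistent pair. */
    static final class Balances {
        final long a;
        final long b;
        Balances(long a, long b) { this.a = a; this.b = b; }
    }

    private final AtomicReference<Balances> state;

    OptimisticBank(long a, long b) {
        state = new AtomicReference<>(new Balances(a, b));
    }

    /** Moves amount from a to b; returns false if a has insufficient funds. */
    boolean transferAtoB(long amount) {
        while (true) {
            Balances before = state.get();                   // "begin": read a snapshot
            if (before.a < amount) {
                return false;                                // "abort": nothing was changed
            }
            Balances after = new Balances(before.a - amount, before.b + amount);
            if (state.compareAndSet(before, after)) {
                return true;                                 // "commit": nobody changed it meanwhile
            }
            // Conflict: another thread committed first. Loop and re-execute the whole body,
            // which is why the body must have no side effects such as I/O.
        }
    }

    Balances snapshot() {
        return state.get();
    }
}
```

No thread ever waits for a lock, and no reader can observe money that has left a but not yet arrived in b. The price is exactly the one listed in the CONS: under contention the body may run several times.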
Thread with care: concurrency pitfalls in Java
Luigi Lauro - [email protected]
That’s all folks!
Books:
Java Concurrency in Practice - Brian Goetz
Concurrent Programming in Java - Doug Lea
The Art of Multiprocessor Programming - Maurice Herlihy
Programming Concurrency on the JVM - Venkat Subramaniam
Seven Concurrency Models in Seven Weeks - Paul Butcher
Online Articles:
http://docs.oracle.com/javase/tutorial/essential/concurrency/
http://www.vogella.com/tutorials/JavaConcurrency/article.html
http://baptiste-wicht.com/posts/2010/05/java-concurrency-part-1-threads.html
http://www.javaworld.com/article/2078809/java-concurrency/
http://adit.io/posts/2013-05-15-Locks,-Actors,-And-STM-In-Pictures.html
Get in touch with me at [email protected] to ask for this presentation, or for questions, doubts, solutions, discussions, insults, suggestions, and pretty much everything!