Monday, January 25, 2016

Book review: Java Concurrency In Practice

I just finished reading the monumental book Java Concurrency in Practice, the definitive guide to writing concurrent programs in Java from Brian Goetz at al. This books gives you lots of information in a single easy place to find, so I'll delve immediately into describing what can you learn from it.

A small distributed system

On modern processor architectures, multithreading and concurrency have in general become a small distributed system inside a motherboard, spanning the centimeters that separate the CPU cores and the RAM.
In fact, you can see many parallels between the two field: CPUs are different machines, and coordinating between them is relatively more costly than allowing independent executions. The L1, L2 and L3 caches near the CPU cores behave as replicas, showing tunable consistency models and forcing compilers to introduce synchronization where needed.
Moreover, partial failure is always round the corner as threads run independently. Forcing programmers to deal with possible failure is one of the few usages of checked exceptions that I find not only acceptable but also desirable. I tend not to like checked exceptions too much as they tend to be replicated in too many places in the code, creating coupling. Still, they make forgetting about a possible thread interruption harder and also push for isolating the concurrent code from the domain models it is using underneath, to avoid throws clause contaminations.

Relevant JVM topics

The book is ripe with Java Virtual Machine concurrency concepts, building a pattern language for talking about thread safety and performance (which are the goals we are pursuing with concurrent applications.) Java's model is based on multithreading and shared memory, where the virtual threads are mapped 1:1 over the OS threads:
  • thread safety is based on confinement, atomicity, and visibility. These are not generic terms but are really concrete, explained with many code samples.
  • Publication and synchronization makes threads communicate, and immutable objects help keeping the collaboration simple. Immutability is not just a conceptual suggestion, because the JVM actually behaves differently when final fields are in place.
  • Every concept boils down to an explanation built over the underlying Java Memory Model, a specification that JVMs have to respect when implementing primitive operations.

Libraries

Basic concepts are necessary for understanding what's going on in your VM, but they are an insufficient level of abstraction for productive work. For this reason, the book explains the usage of several standard libraries:
  • synchronized data structures and their higher performance. ConcurrentHashMap is a work of art as it implements lock striping to avoid coordination when accessing different buckets in the map.
  • The Executor framework provides thread pools, futures, task cancellation and clean shutdown. Creating threads by hand is a beginner's solution.
  • Middleware such as latches, semaphores, and barriers to coordinate threads and stop them from clashing with each other without having to manually write synchronized blocks all over the place.
Thus part of the book has an emphasis of using the best tools available in Java SE instead of reinventing the wheel with Object.wait() and Object.notifyAll(), which are still explained thoroughly in the advanced chapters. Reinventing the wheel can be an error-prone task that produces inferior results, and it should not be the only option just because it's the only approach you know.

Be aware that...

The book is updated to Java 6 (it's missing the Fork/Join framework for example), but fortunately this version contains much of what you need on the theory and basic libraries. You will still need to integrate this knowledge with Java 8 parallel streams.
It takes focus to get through this book, and I spent several dozen hours to read the 16 chapters.
The annotations (such as @GuardedBy) won't compile if you don't download a separate package; it's too bad they're not a standard, since the authors are luminaries of the Java concurrency field, experts from many JSR groups and Java programming language authors.
As always for very technical books, I suggest to read it on a pc, with your preferred IDE and JUnit open to write tests and experiment with what you are learning. You probably will need some review on the most difficult topics, just to hear them as explained from different people. Stack Overflow and many blog articles will be your friend as you look for examples of unsafe publication or of the Java Memory Model.

Conclusions

I'm a fan of getting to the bottom of how things do work (and don't). I would definitely recommend this book if you are executing your code in multiple threads, as sooner or later you will be bitten without even understanding what went wrong. Even if you're just writing a Servlet, that code could become a target for concurrency.
Moreover, as for distributed systems, in concurrency simple testing is not enough: problems can be hard to find and combinatorially difficult to reproduce. You need theory, code review, static analysis: this book is one of the tools that can help you avoiding pesky bugs and much wasted time.

No comments:

ShareThis