Monday, January 25, 2016

Book review: Java Concurrency In Practice

I just finished reading the monumental book Java Concurrency in Practice, the definitive guide to writing concurrent programs in Java from Brian Goetz at al. This books gives you lots of information in a single easy place to find, so I'll delve immediately into describing what can you learn from it.

A small distributed system

On modern processor architectures, multithreading and concurrency have in general become a small distributed system inside a motherboard, spanning the centimeters that separate the CPU cores and the RAM.
In fact, you can see many parallels between the two field: CPUs are different machines, and coordinating between them is relatively more costly than allowing independent executions. The L1, L2 and L3 caches near the CPU cores behave as replicas, showing tunable consistency models and forcing compilers to introduce synchronization where needed.
Moreover, partial failure is always round the corner as threads run independently. Forcing programmers to deal with possible failure is one of the few usages of checked exceptions that I find not only acceptable but also desirable. I tend not to like checked exceptions too much as they tend to be replicated in too many places in the code, creating coupling. Still, they make forgetting about a possible thread interruption harder and also push for isolating the concurrent code from the domain models it is using underneath, to avoid throws clause contaminations.

Relevant JVM topics

The book is ripe with Java Virtual Machine concurrency concepts, building a pattern language for talking about thread safety and performance (which are the goals we are pursuing with concurrent applications.) Java's model is based on multithreading and shared memory, where the virtual threads are mapped 1:1 over the OS threads:
  • thread safety is based on confinement, atomicity, and visibility. These are not generic terms but are really concrete, explained with many code samples.
  • Publication and synchronization makes threads communicate, and immutable objects help keeping the collaboration simple. Immutability is not just a conceptual suggestion, because the JVM actually behaves differently when final fields are in place.
  • Every concept boils down to an explanation built over the underlying Java Memory Model, a specification that JVMs have to respect when implementing primitive operations.

Libraries

Basic concepts are necessary for understanding what's going on in your VM, but they are an insufficient level of abstraction for productive work. For this reason, the book explains the usage of several standard libraries:
  • synchronized data structures and their higher performance. ConcurrentHashMap is a work of art as it implements lock striping to avoid coordination when accessing different buckets in the map.
  • The Executor framework provides thread pools, futures, task cancellation and clean shutdown. Creating threads by hand is a beginner's solution.
  • Middleware such as latches, semaphores, and barriers to coordinate threads and stop them from clashing with each other without having to manually write synchronized blocks all over the place.
Thus part of the book has an emphasis of using the best tools available in Java SE instead of reinventing the wheel with Object.wait() and Object.notifyAll(), which are still explained thoroughly in the advanced chapters. Reinventing the wheel can be an error-prone task that produces inferior results, and it should not be the only option just because it's the only approach you know.

Be aware that...

The book is updated to Java 6 (it's missing the Fork/Join framework for example), but fortunately this version contains much of what you need on the theory and basic libraries. You will still need to integrate this knowledge with Java 8 parallel streams.
It takes focus to get through this book, and I spent several dozen hours to read the 16 chapters.
The annotations (such as @GuardedBy) won't compile if you don't download a separate package; it's too bad they're not a standard, since the authors are luminaries of the Java concurrency field, experts from many JSR groups and Java programming language authors.
As always for very technical books, I suggest to read it on a pc, with your preferred IDE and JUnit open to write tests and experiment with what you are learning. You probably will need some review on the most difficult topics, just to hear them as explained from different people. Stack Overflow and many blog articles will be your friend as you look for examples of unsafe publication or of the Java Memory Model.

Conclusions

I'm a fan of getting to the bottom of how things do work (and don't). I would definitely recommend this book if you are executing your code in multiple threads, as sooner or later you will be bitten without even understanding what went wrong. Even if you're just writing a Servlet, that code could become a target for concurrency.
Moreover, as for distributed systems, in concurrency simple testing is not enough: problems can be hard to find and combinatorially difficult to reproduce. You need theory, code review, static analysis: this book is one of the tools that can help you avoiding pesky bugs and much wasted time.

Sunday, January 10, 2016

Book review: Thinking in Java

I recently read the 1000-page tome Thinking in Java, written by Bruce Eckel, with the goal of getting my feet wet in the parts of the language that were still obscure to me. Here is my review of the book, containing its strong and weak points.

Basic topics

This is a book that touches on every basic aspect of the Java language, keeping an introductory level throughout but delving into deep usage of the Java standard libraries when treating a specific subject.
The basic concepts are well covered by the first 200 pages:
  • primitive values, classes and objects, control structures, and operators.
  • Access control: private, package, protected, public for classes, methods and fields.
  • Constructors and garbage collection.
  • Polymorphism and interfaces that enable it.
Most of the basic topics are oriented to programmers coming from a C procedural background, so they don't dwell on syntax but instead focus on the semantics and the JVM memory model.
Even if you come from a modern dynamic language, you will still find this first part useful to intimately understand how the language works. You will learn common idioms and patterns such as delegating to parent methods, getting a feel for Java code instead of trying to write Ruby code in a Java environment.

Wide coverage

The larger part of the book instead will be useful to cover new ground, if your knowledge is lacking in some areas or if you want a complete understanding of it. For example, I have a good knowledge of data structures such as ArrayList, HashMap and HashSet; still the 90 pages on the Java collections framework introduced me to structures such as the PriorityQueue whose usage is infrequent but can be very useful when you encounter a problem that calls for them.
Here is a full list of the specific topics treated in the book:
  • Inner static and non-static classes.
  • Exceptions management with try/catch/finally blocks.
  • Generics and all their advanced cases including example such as class SelfBounded<T extends SelfBounded<T>>.  I thought I knew generics until I read this chapter.
  • The peculiarities of arrays, still present in some parts of the language such as method calls with a variable number of arguments.
  • The Java collections framework, much more than List, Set and Map; lots of different implementations and a conceptual map of all the interfaces and classes.
  • Input/Output at the byte and text level, plus part of the java.nio evolution.
  • Enumerated types.
  • Reflection and annotations (definition and usage).
By the way, a few of the chapters can safely be skipped to make the book shorter. Drop the graphical user interfaces chapter as totally outdated, I don't even write user interfaces different without HTML anymore nowadays. probably the most popular GUI framework right now is the Android Platform rather than what's described here.
I also suggest to skip the concurrency and threading chapter, since such a small treatment cannot do justice to this topic. I would prefer another dedicated introduction and then go on with a more advanced book like Java Concurrency in Practice, which will tell you also what not to do instead of showing all the language features.
On this point, I find the writing of Bruce Eckel conservative, showing caution with advanced and obscure features rather than showing off with the risk of writing unmaintainable code down the line. The point is making you able to read complex Java code, not enabling you to write a mess more quickly.

Style

The book is quite lengthy, but lets you select a subset of the chapters pretty well if you need to dig into a particular topic. The text is driven by code samples, and to experimenting instead of reciting theory.
I suggest to read a digital version with your IDE ready: at least in my case, I found it easier to pick up concepts and get involved by writing my own examples. A 1000-page book would be pretty daunting if read on a device with no interaction, as it's the polar opposite of dense.
I created many test cases like this one, which lead me to verify the assumed behavior of Java libraries and features with my own hands:

Currency

The drawback of this book is its not being up-to-date with the current version of the platform, Java 8. The most recent version is the 4th edition, available on Amazon since 2006, which treats every feature up to Java 5.
You will have to piece together knowledge of Java 8 and the intermediate versions from other sources. I would have expected this book to at least be up-to-date with Java 7 due to its popularity.
However, due to Java's backward compatibility, what you read is still correct: I only found one code sample to have a compilation problem. I wonder how long would this book be if it was edited again to include Java 8: it could probably get to 1500 pages or more and implode under its own weight.

Conclusions

If you work with Java, Thinking in Java is a must-read, either to get a quick introduction to the basic features or to delve into one of the specific areas when you need it. You will probably never be surprised by reading Java syntax or idioms again. However, don't expect a complete coverage of such a huge world: this should be your first Java book, not the last.

Monday, January 04, 2016

PHPUnit_Selenium 2.0.0 is out

Here is the text of the change I have just merged to make a new major version of PHPUnit_Selenium a reality:
As signaled in #351, there are incompatibilities between the current version of PHPUnit_Selenium and PHPUnit 5.
It is a losing proposition to still support Selenium 1 API (SeleniumTestCase), as it redefines methods that have even changed signatures. It has not been maintained for years.
So to support PHPUnit 5.x there will be a new major version of this project, 2.x. The old 1.x branch will remain available but not updated anymore.
2.x will contain:
  • Selenium2TestCase
and work with PHPUnit 4.x or 5.x, with correspondent PHP versions.
1.x will contain:
  • SeleniumTestCase
  • Selenium2TestCase
but will only work with PHPUnit 4.x, with correspondent PHP versions. In general, it will not be updated anymore. 
Supported PHP versions vary from 5.3 (!) to 5.6, according to the PHPUnit's version requirement.
Installation is available through Composer, as before.

Saturday, January 02, 2016

Building an application with modern Java technologies

Sometimes Java gets a bad rap from Agile software developers, who suspect it to be a legacy technology on par of Cobol. I understand that being the most used language on the planet means there must be projects of any kind out there, including lots of legacy. That said, Java and the JVM are a huge and alive ecosystem producing value in all kind of domains from banking to Android applications and scientific software.

So I built a sample Game Of Life with what I believe are modern Java tecnologies:
  • Java 8 with lambdas support
  • Gradle for build automation and dependency resolution (substitutes both Ant and Maven)
  • Jetty as an embedded server to respond to HTTP requests
  • Jersey for building the RESTful web service calculating new generations of a plane, using JAX-RS
  • Freemarker templating engine to build HTML
  • Log4j 2 for logging, encapsulated behind the interface slf4j.
On the testing side of things:
Here's the result:
The application is self-contained and can be compiled, tested and run without any IDE or previous machine setup except having a JDK 8 on your machine. See the project on Github for details.

ShareThis