Thursday, June 18, 2015

Property-based testing primer

I'm a great advocate of automated testing and of finding out that your code does not work on your machine, 30 seconds after having written it, instead of in production after it has caused a monetary loss and some repair work. This is true for many different kinds of testing, from the unit level (which also has benefits for internal quality and design feedback) to the acceptance level (which ensures the stakeholders get what they need and documents it for the future). Your System Under Test can be a single class, a project or even a collaboration of [micro]services accessed through HTTP from another process or machine.

However, classic test suites written with xUnit and BDD styles hit some scaling problems when you want to exercise more than a few happy paths:
  • It is difficult to cover many different inputs by hand-writing test cases, so we stick with at most a dozen cases for a particular method.
  • There are maintenance costs for every new input we want to test: each needs some assertions to be written, and updated in the future if the System Under Test changes its API.
  • We tend not to test external dependencies such as the language or libraries, since we trust them to do a good job even if their failure is our responsibility. We chose them and we are deploying our project, not the original authors who provided the code "as is", without warranty.
Note that here I define input as the existing state of a system, plus the new input provided by a test (the Given and When part of a scenario) and output as not only the actual response produced by the System Under Test but also the new state it has assumed.


Let's take as an example a sort() function, which no one implements today except in job interviews and exercises.
Assuming an array (or list, depending on your language), we can produce several inputs for the function like we would do in a kata:
  • [1, 2, 3]
  • [3, 1]
  • [3, 6, 5, 1, 4]
and so on. When do we stop? Maybe we also need some tricky input:
  • []
  • [1]
  • [1, 1]
  • [2, 3, 5, 6, 8, 9, 1, 3, 6, 7, 8, 9]
Once we have all the inputs gathered, we need to define what we expect for each of them:
  • [1, 2, 3] => [1, 2, 3]
  • [3, 1] => [1, 3]
  • [3, 6, 5, 1, 4] => [1, 3, 4, 5, 6]
  • [] => []
  • [1] => [1]
  • [1, 1] => [1, 1]
  • [2, 3, 5, 6, 8, 9, 1, 3, 6, 7, 8, 9] => [1, 2, 3, 3, 5, 6, 6, 7, 8, 8, 9, 9]
You can do this incrementally, growing the code one new test case at a time, but you have to do it anyway. If this can become boring and error-prone in a toy example, imagine what happens when instead of sort() you have a RangeOfMoney class which is key to your business and manages point-like and arbitrary intervals of monetary amounts (true story).
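To make the cost of this approach concrete, here is a minimal sketch of the table above as an example-based test. It is plain Python rather than PHP, and CASES and check_examples are names made up for this illustration; every new input means one more hand-written row.

```python
# Hand-written example table for a sort function (cases taken from the list above).
CASES = [
    ([1, 2, 3], [1, 2, 3]),
    ([3, 1], [1, 3]),
    ([3, 6, 5, 1, 4], [1, 3, 4, 5, 6]),
    ([], []),
    ([1], [1]),
    ([1, 1], [1, 1]),
    ([2, 3, 5, 6, 8, 9, 1, 3, 6, 7, 8, 9], [1, 2, 3, 3, 5, 6, 6, 7, 8, 8, 9, 9]),
]


def check_examples(sort_function):
    """Return the (input, expected, actual) triples that do not match."""
    failures = []
    for given, expected in CASES:
        actual = sort_function(list(given))  # copy so the table is never mutated
        if actual != expected:
            failures.append((given, expected, actual))
    return failures
```

Calling check_examples(sorted) returns an empty list, while a function that returns its input untouched is caught only by the rows that happen to be unsorted.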

Property-based testing in a nutshell

Property-based testing is an approach to testing coming from the functional programming world. To solve the aforementioned problems (and get new, more interesting ones), it follows these steps:
  1. Generate a random sample of possible inputs.
  2. Exercise the SUT with each of them.
  3. Verify properties which should be true on every output instead of making precise comparisons.
  4. (Optionally) if the verification of a property fails, shrink the input to find a minimal one that still causes the failure.

How does this work for the sort() function?

We can use rand() to generate an input array:
This array is composed of natural numbers (Gen\nat) and is up to 100 elements long (Gen\pos(100)), since very long arrays could make our tests slow.

Then, for each of these inputs, we exercise sort() and verify a simple property on the output, which is the order of the elements:
This is not the only property that sort() maintains, but it's the first I would specify. There are other possible ones:
  • every element in the input is also in the output
  • every element in the output is also in the input
  • the lengths of the input and output arrays are the same.
Here is the complete example written using Eris, our extension for PHPUnit that automates the generation of many kinds of random data and their verification.
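The same generate, exercise and verify loop can also be sketched without any library. What follows is plain Python, not the Eris API: random_naturals, is_ordered and check_sort_properties are hypothetical names, and the 0..1000 range for the values is an arbitrary assumption.

```python
import random
from collections import Counter


def random_naturals(rng, max_length=100, max_value=1000):
    """Generate a list of natural numbers, at most max_length elements long."""
    length = rng.randint(0, max_length)
    return [rng.randint(0, max_value) for _ in range(length)]


def is_ordered(values):
    """True when every element is less than or equal to the one after it."""
    return all(a <= b for a, b in zip(values, values[1:]))


def check_sort_properties(sort_function, runs=200, seed=None):
    """Return the first generated input that violates a property, or None."""
    rng = random.Random(seed)
    for _ in range(runs):
        given = random_naturals(rng)
        output = sort_function(list(given))
        # Equal multisets: same elements on both sides, hence same length too.
        same_elements = Counter(output) == Counter(given)
        if not (is_ordered(output) and same_elements):
            return given
    return None
```

Comparing the two arrays as multisets covers the three additional properties in one check; a real tool would report them separately and shrink the failing input instead of just returning it.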

How to find properties?

How do we apply property-based testing to code we actually write every day? It certainly fits some areas of the code, such as Domain Model classes, better than others.

Some rules of thumb for defining properties are:
  • Look for inverse functions (e.g. addition and subtraction, or doubling an image in size and shrinking it to 50%). You can use the inverse on the output and verify equality with the input.
  • Relate input and output on some property that is true or false on both (e.g. in the sort() example, that an element that is in one of the two arrays is also in the other).
  • Define postconditions and invariants that always hold in a particular situation (e.g. in the sort() example, that the output is sorted; but in general you can restrict the possible output values of a function considerably by saying it is an array, it contains only integers, its length is equal to the input's length).
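The first rule of thumb, inverse functions, can be sketched as a small round-trip check. This is an illustrative Python sketch under my own assumptions: check_inverse is a hypothetical helper, and adding and subtracting 7 stands in for whatever pair of inverse operations your domain offers.

```python
import random


def check_inverse(forward, backward, generate, runs=200, seed=None):
    """Return the first value for which backward(forward(x)) != x, or None."""
    rng = random.Random(seed)
    for _ in range(runs):
        value = generate(rng)
        if backward(forward(value)) != value:
            return value
    return None


def random_integer(rng):
    """Generator for the demo: any integer in a symmetric range."""
    return rng.randint(-10**6, 10**6)


# Addition and subtraction of the same amount should cancel out on integers.
counterexample = check_inverse(lambda n: n + 7, lambda n: n - 7, random_integer, seed=42)
```

With integers no counterexample exists. The same property is not guaranteed with floating point amounts, where (0.1 + 0.2) - 0.2 is not exactly 0.1; that is the kind of surprise this style of test tends to surface.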

[2, 3, 5, 6, 8, 9, 1, 3, 6, 7, 8, 9] makes my test fail

Defining the valid range of inputs with generators and the properties to be satisfied gives a rich description of the behavior of the System Under Test. Therefore, when a sort() implementation fails we can work on the input in order to shrink it: trying to reduce its complexity and size in order to provide a minimal failing test case.

It's the same work we do when opening a bug report for someone else's code: we try to find a minimal combination that triggers the bug in order to throw away all unnecessary details that would slow down fixing it.

So in property-based testing [2, 3, 5, 6, 8, 9, 1, 3, 6, 7, 8, 9] can probably be shrunk to [2, 3, 5, 6, 8, 9, 1, 3, 6, 7, 8] and maybe down to [1, 0], depending on the bug. This process is accomplished by trying to shrink all the random values generated, which in our case were the length of the array and the values contained.
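A greedy version of this process can be sketched in a few lines. This is not how Eris implements shrinking; it is a simplified Python illustration in which buggy_sort is a deliberately broken stand-in that drops duplicates, and candidates and shrink are hypothetical names.

```python
def buggy_sort(values):
    """Deliberately broken for the demo: duplicates are silently dropped."""
    return sorted(set(values))


def keeps_length(values):
    """Property under test: output and input have the same length."""
    return len(buggy_sort(values)) == len(values)


def candidates(values):
    """Yield simpler variants: first shorter lists, then smaller elements."""
    for i in range(len(values)):
        yield values[:i] + values[i + 1:]
    for i, v in enumerate(values):
        if v > 0:
            yield values[:i] + [v - 1] + values[i + 1:]


def shrink(values, holds):
    """Greedily move to any simpler variant that still violates the property."""
    current = list(values)
    progress = True
    while progress:
        progress = False
        for candidate in candidates(current):
            if not holds(candidate):
                current = candidate
                progress = True
                break
    return current


minimal = shrink([2, 3, 5, 6, 8, 9, 1, 3, 6, 7, 8, 9], keeps_length)
```

For this particular bug the twelve-element input collapses to a two-element list holding one repeated value, which points straight at the handling of duplicates. The result is a local minimum: a different order of candidates may end on a different, equally small input.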

Testing the language

So here's some code I expect to work:
This function creates a PHP DateTime instance using the native datetime extension, which is a standard in the PHP world. It starts from a year and a day number ranging from 0 to 364 (or 365) and builds a DateTime pointing to the midnight of that particular day.

Here is a property-based test for this function:
We generate two random integers in the [0, 364] range, and test that the difference in seconds between the two generated DateTime objects is equal to 86400 seconds multiplied by the number of days between the two selected dates. A property of the input (distance) is maintained over the output in a different form (seconds instead of days).

Surprisingly, this test fails with the following message:
What happened is that we triggered a bug in the DateTime object while creating it with a particular combination of format and timezone. The net effect of this bug could have been that our financial reports (telling daily revenue) would have started showing the wrong numbers from February 29th of the next year.

Notice that the input is shrunk to the simplest possible values that trigger the problem: January 1st on one value and March 1st on the other.
Eventually we found an easy workaround, as with a couple more lines of code we can avoid this behavior. We could do that only after discovering the bug, of course.
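The shape of that property can be sketched outside PHP too. The following is a Python illustration, not the original test: day_of_year and check_distance_property are hypothetical names, and since Python's datetime module is a different implementation from the PHP extension, the check is expected to pass here.

```python
import random
from datetime import datetime


def day_of_year(year, day):
    """Midnight of the given day of the year; day counts from 0, as in the text."""
    # %j is 1-based and zero-padded to three digits, hence day + 1.
    return datetime.strptime(f"{year} {day + 1:03d}", "%Y %j")


def check_distance_property(year, runs=200, seed=None):
    """Return the first pair of days whose distance in seconds is wrong, or None."""
    rng = random.Random(seed)
    for _ in range(runs):
        first, second = rng.randint(0, 364), rng.randint(0, 364)
        elapsed = day_of_year(year, second) - day_of_year(year, first)
        if elapsed.total_seconds() != 86400 * (second - first):
            return (first, second)
    return None
```

The datetimes are naive (no timezone), so daylight saving time never enters the picture; with timezone-aware values the property itself would need to be restated.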

In conclusion

Testing an application is a necessary burden for catching defects early and fixing them at an acceptable cost instead of letting them run wild on real users. Property-based testing pushes automation into the generation of inputs for the System Under Test and into the verification of results, hoping to lower the maintenance cost while increasing coverage at the same time.

Given the domain complexity handled by the datetime extension, it's doing a fantastic job and it's being developed by very competent programmers. Nevertheless, if they can slip in bugs I trust that my own code will, too. Property-based testing is an additional tool that can work side by side with example-based testing to uncover problems in our projects.

We named the property-based PHPUnit extension after Eris, the Greek goddess of chaos, since serious testing means attacking your code and the platform it is built on in an attempt to break it before someone else does.


Wednesday, May 27, 2015

Eris 0.4.0 is out

Eris is a PHPUnit extension for property-based testing, that is to say testing based on generating inputs to a system and checking that its state and output respect a set of properties. The project is a port of QuickCheck, and Eris is the name of the Greek goddess of chaos, since its aim is to break the System Under Test with unexpected inputs and actions.

I am planning a talk at the PHP User Group Milano and a longer blog post to introduce the general public to how property-based testing works. I held the same talk for a few friends at the phpDay 2015 Unconference.

Meanwhile version 0.4.0 is out, with the following ChangeLog:
  • Showing generated input with ERIS_ORIGINAL_INPUT=1.
  • names and date (DateTime) new Generator.
  • tuple Generator supports variadic arguments.
  • Shrinking respects when() clauses.
  • Dates and sorting examples.

As for all semantic-versioned projects, the 0.x series should be considered alpha and no API backward compatibility is guaranteed on update.


Sunday, March 29, 2015

Evolution in software development

Evolution can be cited as a metaphor for iterative development: every new iteration (or commit at an ideal smallest scale) produces a shippable, new version of the software that has evolved from the previous one.

Greatly simplifying the process and skipping some steps, we can see something like:
  1. you start from a unicellular organism
  2. which evolves (slowly) into a multicellular organism
  3. which evolves into a fish
  4. which evolves into a mammal such as a primate and then into Homo Sapiens
as a metaphor for:
  1. you start from a Walking Skeleton
  2. you add the features necessary to get to a Minimum Viable Product
  3. you add, modify and drop features to tackle one part of a hierarchical backlog and get to some business objective.
  4. you pick another area of the backlog to continue working
Every metaphor has its limits: for example not all software descends from a common ancestor; variations in the new generations of software are not necessarily produced by randomness nor selected by reproductive ability of the software itself.
Still, we can take some lessons where patterns observed over millions of years of evolution apply to a software development scenario.

If by any chance you happen to be a creationist it's better for you to stop reading this post.

Lesson: each step has to keep working

My father is a human, working organism. His father was too - and also the mother of his father. We all descend from an unbroken line of ancestors who were capable of staying alive and, during their lifespans, reproducing. This line goes back to ancestors 4 billion years ago who were unicellular organisms.
Thus evolution has to keep all intermediate steps working. Evolution cannot produce current versions that do not have value in the hope that they can be turned into something valuable later.
Value here is not necessarily money as it can be a user base or market share that can be monetized later: the investors market puts a price tag on a large user base but often not on a sound software architecture.
In fact, this lesson is a reality in capitalism-based companies whose funding is received through demonstrating business results; again, not necessarily profitability but acquisition, retention, growth metrics (Twitter) or revenue (Amazon).
Short-term capital investments are a very common business model in companies adopting Agile software development. They're not the only possible model to develop wealth: the Internet and GPS were created through generous public funding by the US government (which had its own military reasons).

Lesson: evolution takes a long time

If you take a look at the timeline of evolution on Earth, you'll see that it took a long time for more and more complex organisms to appear:
  • for the last 3.6 billion years, simple cells (prokaryotes);
  • for the last 300 million years, reptiles;
  • for the last 200 million years, mammals;
  • for the last 130 million years, flowers;
  • for the last 60 million years, the primates,
  • for the last 20 million years, the family Hominidae (great apes);
  • for the last 2.5 million years, the genus Homo (including humans and their predecessors);
  • for the last 200,000 years, anatomically modern humans.
We can argue that flowers are simpler than dinosaurs, and I'm going to address this later, but it is widely believed that modern humans express the most complex abilities ever seen on Earth: speaking, reading, writing, and tool-building. It can also take a long time for complex software and an adequate design to emerge purely from iteration.

Lesson: we may not be capable of anything else

A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over with a working simple system. – John Gall
Evolution has been wonderful at producing animals that can run, while for robotics controlling artificial legs on disparate terrains has been a hard problem compared to ordinary computation.
But, if we believe Gall's law, there may not be another way to get to complex systems than to pass from simple systems first. In a sense, robotics engineers have not designed robot dogs from scratch but evolved them from simpler artificial beings at a faster pace than natural selection.
Exercises on evolving a product in thin slices such as the Elephant carpaccio are often made fun of because of the triviality of the slices in a controlled exercise: "add an input field", "support two discounts instead of one". However, in a real environment the slices become more "launch the product in France for this 10 million people segment" and "launch for another segment of the population". To believe we can design the perfect solution and never slice the problem into steps really is a God complex.

Lesson: local minima are traps

Humans and in general vertebrates have variants of the recurrent laryngeal nerve, which connects the brain to the larynx or an equivalent organ. This nerve developed in fishes, where it took a short route around the main artery exiting from the heart. It still takes this route today in humans.
However, giraffes have undergone a series of local optimizations (boy-scout rule anyone?) where the giraffes with longer necks outcompeted the shorter ones, raising the average length of their necks up to what we see today. The nerve grew with the neck, as the only possible local modification that kept the new giraffe capable of living and reproducing was to lengthen the nerve, not to redesign it.
Longer nerves have a cost: the more cells they are composed of, the easier it is for them to develop diseases or to suffer injuries. An engineer would never make such a design mistake when trying to build a giraffe.
It is open to speculation whether it is possible to develop a giraffe with a nerve taking a shorter route: an engineer would surely try to simplify the DNA with this large refactoring.
But there may be, for example, embryological reasons that prevent giraffes from being able to be grown from an embryo if the nerve takes a different route. We often estimate the time for a new feature as a critical path where nothing goes wrong, but what if you have to take offline and migrate your database for hours in order to deploy this brand new, clean object-oriented design? Your perfectly designed giraffe may be destined to be stillborn.
Nature isn't bad or good: it just is. Local minima are a fact of life and software. We should take care not to preach local improvements as the silver bullet solution, nor to jump into too large refactorings which will kill the giraffe.

Lesson: you don't necessarily get better

Fitness functions describe how well-adapted an organism is to its environment, and evolutionary pressure selects the organisms with the better fitness to survive for the next generations. Even when selective pressure is very slightly skewed in one direction, over many generations a trait may very well disappear or appear due to a cumulative effect.
Depending on your choice for fitness, you may get a different result. Modern-day mosquitoes evolved from the same tree of life as humans, so our evolutionary projects can either become humans, elephants, cats, cockroaches or mosquitoes depending on which direction the forces in play are selecting.
In fact, there are so many legacy code projects around that you wonder what has produced them. One line at a time, they evolved into monsters to satisfy the fitness function of external quality: add this feature, make more money, save server resources, return a response to the user faster.
Worse is better is another example of a philosophy where it is more important for software to be simple than correct. Simple software is produced and spreads faster than cathedral-like software, then takes advantage of network effects such as the newly created community to improve. We have seen that at work with C and x86 (and so many other standards); we are seeing it now with JavaScript and MongoDB.
Again, I'm not saying this is good or bad: I'm saying this is happening and that if you want to replace JavaScript with a better language you have to take this into account instead of complaining about it. One wonders how many extinct animals we have never heard about, which leads us to the last lesson of evolution.

Lesson: extinction is around the corner

Much of this article is a deconstruction of iterative development, as a way to swing the pendulum on the other side of the argument for once. There has to be a devil's advocate.
I will leave you instead with a final point on why iterative development is a crucial part of Agile. After all, even this post is iterative: it was written, edited and reviewed at separate times, not with pen and paper but on a digital medium that is very easy to modify.
More than 99 percent of all species that ever lived on the planet are estimated to be extinct -- Extinction on Wikipedia

Why does a species go extinct? It may evolve into another species, but this happens very slowly. What usually happens is that the members of a species are no longer able to survive in changing conditions or against superior competition. Which sounds like something extracted from a management book.

In fact, one of the fundamental maxims of Agile software design is to keep software easy to change. Resist the over-optimization of today to survive and thrive tomorrow, as we can't foresee the future but we can foresee that we will likely have to change.

Sunday, September 21, 2014

Microservices are not Jars

The cult of the monolith
I've been building microservices for two years and my main complaint is that they're still not micro- enough. Here's a rebuke of Uncle Bob's recent post Microservices and Jars, which he apparently has written after forming an opinion based on an article in Martin Fowler's bliki:
One of my clients recently told me that they were investigating a micro-service-architecture. My first reaction was: "What's that?" So I did a little research and found Martin Fowler's and James Lewis' writeup on the topic.
 "I didn't even know what microservices were up until several days ago. Now I'm ready to pontificate about the topic."
So what is a micro-service? It's just a little stand-alone executable that communicates with other stand-alone executables through some kind of mailbox; like an http socket. Lots of people like to use REST as the message format between them.
Why is this desirable? Two words. Independent Deployability.
Let's ignore the REST as the message format terminology. Only two words? Independent deployability is nice, but I've seen cases where independence is total, and cases where an end-to-end test suite still needs to run including the production version of services A and B and the new version C' that we want to deploy to substitute C.
Other interesting properties of microservices such as scaling them independently come to mind. Or writing them in different languages. Or adapting to Conway's law by aligning teams with microservices for most of their work.
You can fire up your little MS and talk with it via REST. No other part of the system needs to be running. Nobody can change a source file in a different part of the system, and screw your little MS up. Your MS is immune to all the other code out there.
You can test your MS with simple REST commands; and you can mock out other MSs in the system with little dummy MSs that do just what your tests need them to do.
Moreover, you can control the deployment. You don't have to coordinate with some huge deployment effort, and merge deployment commands into nasty deployment scripts. You just fire up your little MS and make sure it keeps running.
You can use your own database. You can use your own webserver. You can use any language you like. You can use any framework you like.
Freedom! Freedom!
<sarcasm> tag needed.

But wait. Why is this better? Are the advantages I just listed absent from a normal Java, or Ruby, or .Net system?
  • existing databases tend to be attractors when new persistence requirements come up. So if I have MySQL up and running in my application and a job that would be a good fit for MongoDB comes up, I'm definitely not going to introduce MongoDB given the infrastructure setup time. I'll just go with the existing infrastructure and create some new tables, perpetuating the growth of the monolith.
  • Web servers are often tied to languages. If I want to use Node.js it will listen on the port 80 by itself, while PHP is commonly used with Apache, and Java with Tomcat or Jetty. 
  • JARs are a pretty JVM-specific packaging system. I'm definitely not going to put PHP code into JARs.
  • Frameworks come from the language, and even inside the same language I can have multiple PHP applications where one has a custom user interface and one serves an Angular single-page application.
Also the ones not listed:
  • It's easier to find the machines that contain a bottleneck and replace them, since CPU and IO usage map directly to applications.
  • It's easier to get started working as a new developer because you need just a single microservice to run on your machine.
  • It's easier to throw away one microservice and replace it with a new one doing the same job, but better written.
What about: Independent Deployability?
We have these things called jar files. Or Gems. Or DLLs. Or Shared Libraries. The reason we have these things is so we can have independent deployability.
Replacing single JARs or DLLs seems pretty dangerous to me where there are compile-time and binary dependencies in play. Since Uncle Bob has experience with that, I'm going to trust him to deploy safely this way.
Most people have forgotten this. Most people think that jar files are just convenient little folders that they can jam their classes into any way they see fit. They forget that a jar, or a DLL, or a Gem, or a shared library, is loaded and linked at runtime. Indeed, DLL stands for Dynamically Linked Library.
So if you design your jars well, you can make them just as independently deployable as a MS. Your team can be responsible for your jar file. Your team can deploy your DLL without massive coordination with other teams. Your team can test your GEM by writing unit tests and mocking out all the other Gems that it communicates with. You can write a jar in Java or Scala, or Clojure, or JRuby, or any other JVM compatible language. You can even use your own database and webserver if you like.
You can use any language you like as long as you run it on the JVM. Surely there must be people who work on other infrastructure or don't want to run their languages on a compatible-yet-really-secondary platform? PHP applications? Ruby programmers?
If you'd like proof that jars can be independently deployable, just look at the plugins you use for your editor or IDE. They are deployed entirely independently of their host! And often these plugins are nothing more than simple jar files.
So what have you gained by taking your jar file and putting it behind a socket and communicating with REST?
SOAP is the last acronym where simple was used this way. Look, by generating a WSDL from your objects along with an XSD file that can be used to validate XML messages you can pass requests over HTTP with a Soap-Action header and regenerate Java (or other compatible languages) code from the WSDL...
One thing you lose is time. It takes time to communicate through a socket. It takes time to decode REST messages. And that time means you cannot use micro-services with the impunity of a jar. If I want two jars to get into a rapid chat with each other, I can. But I don't dare do that with a MS because the communication time will kill me.
Of course, chatty fine-grained interfaces are not a microservices trait. I prefer "accept a Command, emit Events" as an integration style. After all, microservices can become dangerous if integrated with purely synchronous calls, so the kind of interfaces they expose to each other is necessarily different from that of objects working in the same process. This is a property of every distributed system, as we have known since 1996.
On my laptop it takes 50ms to set up a socket connection, and then about 3us per byte transmitted through that connection. And that's all in a single process on a single machine. Imagine the cost when the connection is over the wire!
It takes more to write a file line by line rather than doing it in a single shot. However, if the file is 2GB long, I prefer the first solution in order to preserve memory. I'm just trading off time for another resource.
In the case of microservices, I'm trading off the latency of single interactions between different services for more important resources: programmer time, independent scalability, even time experienced by the end user. A front end asynchronously publishing events to a backend service feels faster to the user than a monolithic application where I respond to user requests and generate report lines in the same process or on the same machines.
Another thing you lose (and I hate to say this) is debuggability. You can't single step across a REST call, but you can single step across jar files. You can't follow a stack trace across a REST call. Exceptions get lost across a REST interface.
To me debuggability and introspection into an application improves when using microservices, because you will be full of all the HTTP logs of every service calling one another. You don't have to predispose logging cut points as they come for free with the HttpChannel objects. For a more business-oriented monitoring, take a look at Domain Events: we publish them from different applications in order to build reports based on data from different components.
After reading this you might think I'm totally against the whole notion of Micro-Services. But, of course, I'm not. I've built applications that way in the past, and I'll likely build them that way in the future. It's just that I don't want to see a big fad tearing through the industry leaving lots of broken systems in it's wake.
For most systems the independent deployability of jar files (or DLLS, or Gems, or Shared Libraries) is more than adequate. For most systems the cost of communicating over sockets using REST is quite restrictive; and will force uncomfortable trade-offs.
Paraphrasing Stroustrup, there are only two kinds of architectures: the ones people complain about and the ones nobody uses. We are here proposing microservices because they have provided value in many systems that were once thought not to need them. As long as you have reporting needs you don't want to burden your front end with, or need to scale up in the number of users or programmers, you can consider microservices (and their cost).
My advice:
Don't leap into microservices just because it sounds cool. Segregate the system into jars using a plugin architecture first. If that's not sufficient, then consider introducing service boundaries at strategic points.
Please don't! The interactions between microservices are very different from the ones between objects inside a single application. Each call outside of the boundary is a potential failure mode that you should try to model as an asynchronous message that can be retried when delivery fails (the receiving microservice being down, slow or not reachable). Retrofitting microservices over an existing code base is a costly endeavour and you should only embark on it if you have an adequate time and money budget, possibly bigger than the one necessary to build with microservices in the first place.
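As a rough illustration of what "a message that can be retried" means in code, here is a minimal Python sketch. Everything in it is an assumption made for the example: DeliveryFailed and deliver_with_retries are hypothetical names, send stands for whatever transport you use, and the doubling delay is just one common choice.

```python
import time


class DeliveryFailed(Exception):
    """Raised by the (hypothetical) transport when the receiver is down or slow."""


def deliver_with_retries(send, message, attempts=5, base_delay=0.1, sleep=time.sleep):
    """Try to deliver a message, waiting longer after each failed attempt."""
    for attempt in range(attempts):
        try:
            return send(message)
        except DeliveryFailed:
            if attempt == attempts - 1:
                raise  # give up: the caller can park the message for later
            sleep(base_delay * 2 ** attempt)  # 0.1s, 0.2s, 0.4s, ...
```

A retried message may arrive more than once, so the receiving service has to tolerate duplicates; that constraint is part of the design cost described above.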

Sunday, August 17, 2014

Tabular data in Behat

All of this has happened before, and all this will happen again. -- BSG
I just watched Steve Freeman's short talk "Given When Then" considered harmful (requires free login), and I was looking for some ways to cheaply eliminate duplication in Behat scenarios.

Fortunately, Behat supports Scenario Outlines for tabular data which is an 80/20 solution to transform lots of duplicated scenarios:
    Scenario: 3 is Fizz
        Given the input is 3
        When it is converted
        Then it becomes Fizz

    Scenario: 6 is Fizz too because it's multiple of 3
        Given the input is 6
        When it is converted
        Then it becomes Fizz

    Scenario: 2 is itself
        Given the input is 2
        When it is converted
        Then it becomes 2
into a table:
    Scenario Outline: conversion of numbers
        Given the input is <input>
        When it is converted
        Then it becomes <output>

        Examples:
            | input | output |
            | 2     | 2      |
            | 3     | Fizz   |
            | 6     | Fizz   |

Moreover, you can also pass tabular data to a single step with Table Nodes:
    Scenario: two items in the cart
        Given the following items are in the cart:
            | name    | price |
            | Cake    |     4 |
            | Shrimps |    10 |
        When I check out
        Then I pay 14
It takes a few minutes to learn how to do this in an existing Behat infrastructure. There are minimal changes to perform in the FeatureContext in the case of the Table Nodes, while Scenario Outlines are a pure Gherkin-side refactoring.

My code is on Github in my behat-tables-kata repository. If this reminds you of PHPUnit's @dataProvider, try to think of other patterns that can be borrowed from the xUnit world to fast-forward Cucumber and Behat development.

Saturday, August 16, 2014

PHPUnit Essentials review
PHPUnit Essentials by Zdenek Machek is a modern and complete book about PHPUnit usage. I've been sent an electronic copy by Packt Publishing and am now reviewing it here.

The first thing that struck me about the book was the breadth of subjects: you start from mocks and command line options and get as far as Selenium usage. You have to know your tools and, given that PHPUnit is a standard, this is all knowledge that will accompany you for several years.

Every book on PHPUnit must be compared with the wonderful manual, to see what it adds to the picture with respect to the documentation. PHPUnit Essentials, in this respect, also looks at 3rd party libraries such as mocking libraries or "competitors" such as PHPSpec to enlarge the picture to the whole open source PHP landscape. This is something the documentation of a single project cannot do, and where a bit of opinionated advice can be given.

There is a bit of what may seem outdated information in the book, such as how to perform a PEAR-based installation, but it's identified as such (PEAR being deprecated and dismissed by the end of the year). Another seemingly outdated tool is Selenium IDE, but once upgraded with a formatter for Selenium2TestCase as explained in this book it becomes usable again. This kind of advice demonstrates the real world experience of the author and makes you trust the content.

On the whole, by reading this book you go in as a naive tester and come out with lots of skills in using PHPUnit in different scenarios, so I would recommend it to programmers wanting to dive into testing PHP applications. It's probably not worth a read for medium-to-advanced users, for whom most of the content is already known from the PHPUnit manual or personal experience. After all, the book is named Essentials, so it delivers all that you expect from the title in a convenient single package.

Saturday, July 19, 2014

Skateboards, rockets and math

This slide from Spotify has been popular for a while:
It explains how a product can be built iteratively, satisfying first the need for transport with lesser means and then evolving to a more powerful platform. In this model, feedback such as business model validation and satisfaction from the project sponsors can arrive early, even when it is negative (especially then).

From what I read about Spotify, they're also well aware that incremental development can only take you so far: you don't get a car by making a better bicycle. Sometimes you have to take a leap to a new platform; or, if it's clear that simpler technology won't support your vision, start from a higher level of essential complexity.

Here's someone that didn't start from a skateboard:
Imagine telling Spotify to install WebSphere (or some other technological terror) as the first step when starting a brand new project; or telling SpaceX teams "Come on, Elon, just give us a bicycle and we'll get some first sales!"

Or telling Larry Page that programming isn't math:

Keeping in mind this strong dependency on context, where do the competitive advantages of your product lie?
In finding a better fit with the needs of users, maybe a lower time to market? In solving technology problems to carry humanity into space at an acceptable cost? In algorithms that can find high quality information in the web ocean? In fooling VCs into giving you free money?

From your vision follow your choices of education, process, and technology.

Friday, April 25, 2014

The full list of my articles on DZone

From 2010 to the end of 2013 I wrote a few articles each week on DZone. Here is the full list as a reference.

Update: 98% of these articles are serving correctly again (only 10 links are being updated).

Practical PHP Patterns

PHP implementations of the patterns in the GoF Design Patterns book and in Martin Fowler's Patterns of Enterprise Application Architecture.

Practical PHP Patterns: Transaction Script
Practical PHP Patterns: Domain Model
Practical PHP Patterns: Table Module
Practical PHP Patterns: Service Layer
Practical PHP Patterns: Table Data Gateway
Practical PHP Patterns: Row Data Gateway
Practical PHP Patterns: Active Record
Practical PHP Patterns: Data Mapper
Practical PHP Patterns: Unit of Work
Practical PHP Patterns: Identity Map
Practical PHP Patterns: Lazy Loading
Practical PHP Patterns: Identity Field
Practical PHP Patterns: Foreign Key Mapping
Practical PHP Patterns: Association Table
Practical PHP Patterns: Dependent Mapping
Practical PHP Patterns: Embedded Value
Practical PHP Patterns: Serialized LOB
Practical PHP Patterns: Single Table Inheritance
Practical PHP Patterns: Concrete Table Inheritance
Practical PHP Patterns: Inheritance Mapping
Practical PHP Patterns: Metadata Mapping
Practical PHP Patterns: Query Object
Practical PHP Patterns: Repository
Practical PHP Patterns: Page Controller
Practical PHP Patterns: Front Controller
Practical PHP Patterns: Template View
Practical PHP Patterns: Transform View
Practical PHP Patterns: Two Step View
Practical PHP Patterns: Remote Facade
Practical PHP Patterns: Pessimistic Offline Lock
Practical PHP Patterns: Coarse Grained Lock
Practical PHP Patterns: Implicit Lock
Practical PHP Patterns: Database Session State
Practical PHP Patterns: Gateway
Practical PHP Patterns: Mapper
Practical PHP Patterns: Separated Interface
Practical PHP Patterns: Layer Supertype
Practical PHP Patterns: Registry
Practical PHP Patterns: Value Object
Practical PHP Patterns: Money
Practical PHP Patterns: Special Case
Practical PHP Patterns: Plugin
Practical PHP Patterns: Service Stub
Practical PHP Patterns: Record Set
Practical PHP Patterns: Application Controller
Practical PHP Patterns: Client Session State
Practical PHP Patterns: Optimistic Offline Lock
Practical PHP Patterns: Server Session State
Practical PHP Patterns: Class Table Inheritance
Practical PHP Patterns: Data Transfer Object
Practical PHP Patterns: Visitor
Practical PHP Patterns: Memento
Practical PHP Patterns: Mediator

Practical PHP Refactoring

PHP examples for Martin Fowler's Refactoring book.

Practical PHP Refactoring: Inline Temp
Practical PHP Refactoring: Move Method
Practical PHP Refactoring: Move Field
Practical PHP Refactoring: Extract Class
Practical PHP Refactoring: Hide Delegate
Practical PHP Refactoring: Inline Class
Practical PHP Refactoring: Remove Middle Man
Practical PHP Refactoring: Introduce Foreign Method
Practical PHP Refactoring: Introduce Local Extension
Practical PHP Refactoring: Self Encapsulate Field
Practical PHP Refactoring: Replace Data Value with Object
Practical PHP Refactoring: Change Value to Reference
Practical PHP Refactoring: Change Reference to Value
Practical PHP Refactoring: Replace Array with Object
Practical PHP Refactoring: Duplicate Observed Data
Practical PHP Refactoring: Change Unidirectional Association to Bidirectional
Practical PHP Refactoring: Change Bidirectional Association to Unidirectional
Practical PHP Refactoring: Replace Magic Number with Symbolic Constant
Practical PHP Refactoring: Encapsulate Field
Practical PHP Refactoring: Encapsulate Collection
Practical PHP Refactoring: Replace Type Code with Class
Practical PHP Refactoring: Replace Type Code with Subclasses
Practical PHP Refactoring: Replace Type Code with State or Strategy
Practical PHP Refactoring: Replace Subclass with Fields
Practical PHP Refactoring: Decompose Conditional
Practical PHP Refactoring: Consolidate Conditional Expression
Practical PHP Refactoring: Consolidate Duplicate Conditional Fragments
Practical PHP Refactoring: Remove Control Flag
Practical PHP Refactoring: Replace Nested Conditionals with Guard Clauses
Practical PHP Refactoring: Replace Conditional with Polymorphism
Practical PHP Refactoring: Introduce Null Object
Practical PHP Refactoring: Introduce Assertion
Practical PHP Refactoring: Rename Method
Practical PHP Refactoring: Add Parameter
Practical PHP Refactoring: Remove Parameter
Practical PHP Refactoring: Separate Query from Modifier
Practical PHP Refactoring: Parameterize Method
Practical PHP Refactoring: Replace Parameter with Explicit Methods
Practical PHP Refactoring: Preserve Whole Object
Practical PHP Refactoring: Replace Parameter with Method
Practical PHP Refactoring: Introduce Parameter Object
Practical PHP Refactoring: Hide Method
Practical PHP Refactoring: Replace Constructor with Factory Method
Practical PHP Refactoring: Encapsulate Downcast (and Wrapping)
Practical PHP Refactoring: Remove Setting Method
Practical PHP Refactoring: Replace Exception with Test
Practical PHP Refactoring: Pull Up Field
Practical PHP Refactoring: Pull Up Method
Practical PHP Refactoring: Replace Error Code with Exception
Practical PHP Refactoring: Pull Up Constructor Body
Practical PHP Refactoring: Push Down Method
Practical PHP Refactoring: Push Down Field
Practical PHP Refactoring: Extract Subclass
Practical PHP Refactoring: Replace Record with Data Class
Practical PHP Refactoring: Extract Superclass
Practical PHP Refactoring: Extract Interface
Practical PHP Refactoring: Collapse Hierarchy
Practical PHP Refactoring: Form Template Method
Practical PHP Refactoring: Replace Inheritance with Delegation
Practical PHP Refactoring: Replace Delegation with Inheritance
Practical PHP Refactoring: Tease Apart Inheritance
Practical PHP Refactoring: Convert Procedural Design to Objects
Practical PHP Refactoring: Separate Domain from Presentation
Practical PHP Refactoring: Extract Hierarchy
Practical PHP Refactoring: Extract Method
Practical PHP Refactoring: Inline Method
Practical PHP Refactoring: Replace Temp with Query
Practical PHP Refactoring: Introduce Explaining Variable
Practical PHP Refactoring: Split Temporary Variable
Practical PHP Refactoring: Remove Assignments to Parameters
Practical PHP Refactoring: Replace Method with Method Object
Practical PHP Refactoring: Substitute Algorithm

Practical PHP Testing Patterns

PHP implementations of the xUnit testing patterns by Gerard Meszaros.

Practical PHP Testing Patterns: Behavior Verification
Practical PHP Testing Patterns: Recorded Test
Practical PHP Testing Patterns: Scripted Test
Practical PHP Testing Patterns: Data-Driven Test
Practical PHP Testing Patterns: Test Automation Framework
Practical PHP Testing Patterns: Minimal Fixture
Practical PHP Testing Patterns: Standard Fixture
Practical PHP Testing Patterns: Fresh Fixture
Practical PHP Testing Patterns: Shared Fixture
Practical PHP Testing Patterns: Back Door Manipulation
Practical PHP Testing Patterns: Layer Test
Practical PHP Testing Patterns: Four Phase Test
Practical PHP Testing Patterns: Assertion Method
Practical PHP Testing Patterns: Test Method
Practical PHP Testing Patterns: Assertion Message
Practical PHP Testing Patterns: Testcase Class
Practical PHP Testing Patterns: Test Runner
Practical PHP Testing Patterns: Testcase Object
Practical PHP Testing Patterns: Test Suite
Practical PHP Testing Patterns: Test Discovery
Practical PHP Testing Patterns: Inline Setup
Practical PHP Testing Patterns: Delegated Setup
Practical PHP Testing Patterns: Creation Method
Practical PHP Testing Patterns: Implicit Setup
Practical PHP Testing Patterns: Prebuilt Fixture
Practical PHP Testing Patterns: Lazy Setup
Practical PHP Testing Patterns: Suite Fixture Setup
Practical PHP Testing Patterns: Setup Decorator
Practical PHP Testing Patterns: Chained Tests
Practical PHP Testing Patterns: State Verification
Practical PHP Testing Patterns: Custom Assertion
Practical PHP Testing Patterns: Delta Assertion
Practical PHP Testing Patterns: Guard Assertion
Practical PHP Testing Patterns: Unfinished Test Assertion
Practical PHP Testing Patterns: Garbage-Collected Teardown
Practical PHP Testing Patterns: Automated Teardown
Practical PHP Testing Patterns: In-Line Teardown
Practical PHP Testing Patterns: Implicit Teardown
Practical PHP Testing Patterns: Test Double
Practical PHP Testing Patterns: Test Stub
Practical PHP Testing Patterns: Test Spy
Practical PHP Testing Patterns: Mock Object
Practical PHP Testing Patterns: Fake Object
Practical PHP Testing Patterns: Configurable Test Double
Practical PHP Testing Patterns: Hard-Coded Test Double
Practical PHP Testing Patterns: Test-Specific Subclass
Practical PHP Testing Patterns: Named Test Suite
Practical PHP Testing Patterns: Test Utility Method
Practical PHP Testing Patterns: Parameterized Test
Practical PHP Testing Patterns: Testcase Class Per Class
Practical PHP Testing Patterns: Testcase Class per Fixture
Practical PHP Testing Patterns: Testcase Superclass
Practical PHP Testing Patterns: Testcase Class per Feature
Practical PHP Testing Patterns: Test Helper
Practical PHP Testing Patterns: Database Sandbox
Practical PHP Testing Patterns: Stored Procedure Test
Practical PHP Testing Patterns: Table Truncation Teardown
Practical PHP Testing Patterns: Dependency Injection
Practical PHP Testing Patterns: Transaction Rollback Teardown
Practical PHP Testing Patterns: Dependency Lookup
Practical PHP Testing Patterns: Humble Object
Practical PHP Testing Patterns: Test Hook
Practical PHP Testing Patterns: Literal Value
Practical PHP Testing Patterns: Derived Value
Practical PHP Testing Patterns: Generated Value
Practical PHP Testing Patterns: Dummy Object

Lean tools

Reflections on the Poppendiecks' Lean Software Development: An Agile Toolkit.

Lean Tools: Seeing Waste
Lean Tools: Value Stream Mapping
Lean Tools: the Last Responsible Moment
Lean Tools: Queuing Theory
Lean Tools: Self-Determination
Lean Tools: Motivation
Lean Tools: Expertise
Lean Tools: Perceived Integrity
Lean Tools: Conceptual Integrity
Lean Tools: Refactoring
Lean Tools: Measurements
Lean Tools: Contracts


My experiments following O'Reilly's Erlang Programming.
Erlang: Hello World
Erlang: tuples and lists
Erlang: build and test
Erlang: functions (part 1)
Erlang: functions (part 2)
Erlang: Concurrency
Erlang: client/server
Erlang: linking processes
Erlang: monitoring
Erlang: records
Erlang: macros
Erlang: live upgrade
Erlang: higher order functions
Erlang: list comprehensions
Erlang: binaries and bitstrings
Erlang: references
Erlang: sets
Erlang: bags

The wheel

A small series highlighting open source libraries, to counter my bias towards building my own tools.
The Wheel: Symfony Console
The Wheel: Symfony Filesystem
The Wheel: Symfony Stopwatch
The Wheel: Monolog
The Wheel: Twig
The Wheel: Guzzle
The Wheel: Symfony Routing
The Wheel: Assetic


Why I'm leaving Subversion for Git
Acceptance Test-Driven Development
How improved hardware changed programming
Contributing to open source projects
Introducing NakedPhp 0.1
How I learned to stop worrying and love new words
Graphical tips for the average coder
Domain-Driven Design Review
The class design checklist
HTTP verbs in PHP
Java versus PHP
TDD: Always code as...
The TDD Checklist (Red-Green-Refactor in Detail)
The Model-View-Controller pattern in PHP
The guide to configuration of PHP applications
Vim for PHP development
Synchronization in PHP
Evolution of a programmer
PHP 2.x frameworks and Ruby on Rails
Zend_Test for Acceptance TDD
Yahoo! Query Language
OSGi and servlets can work together
Writing user stories for web applications
CSS3 pseudo-classes
Death by buzzwords
Testing web applications with Selenium
The refactoring breakthrough on a CoffeeMachine
Lower your bar in Test-Driven Development
Web MVC in Java (without frameworks)
Software engineering in the rail system
Web applications as enterprise software
The absolute minimum you'll ever have to know about session persistence on the web
Exceptional JavaScript
JSP are more than templates
Firebug is beautiful
A Dojo primer
PHP inclusions
10 HTML tags which are not used as often as they deserve
WebML: overcoming UML for web applications
The buzzword glossary
The shortest guide to character sets you'll ever read
The wonders of the input tag in HTML 5
Native jQuery animations
NetBeans vs. Vim for PHP development
The different kinds of testing
Selenium is not a panacea
Is graceful degradation dead?
From Subversion to Git in a morning
Why a Pomodoro helps you getting in the zone
PHPUnit 3.5: easier asserting and mocking
CSS refactoring
The PHP paradigms poll results: OOP wins
How to set up the Pomodoro Technique in your office
What you need to know about your version control system
Paint on a canvas like Van Gogh
The must-know of color theory
You don't have to always stare at a screen
What we don't need in object-oriented programming
INVEST in user stories
Primitive Obsession
5 features of PHP that seem hacks, but save your life
From Doctrine 1 to Doctrine 2
The Dark Side of Lean
It's just like putting LEGO bricks together... Or not?
The best tools for writing UML diagrams
Date and time in PHP 5
Zend_Validate for the win
Meaningless docblocks considered harmful
Double Dispatch: the next best thing with respect to Dependency Injection
Zend_Locale for the win
Technical Investment, or quality vs. time
Real-life closures examples ...for real
Client applications with Ajax Solr: JavaScript vs. servlets
Sitting on the couch
What cooking can teach to a software developer
CouchDB from JavaScript
Reuse your closures with functors
These are not the buzzwords you're looking for
TDD for algorithms: the state of the art
An humble infographic on methodologies
Why Twitter is not an RSS replacement
5 things that PHP envies Java for
PageRank in 5 minutes
Where has XHTML gone?
Behavior-Driven Development in PHP with Behat
Do not fear the command line
Can you use PHP without frameworks nowadays?
Why Ruby's monkey patching is better than land mines...wait, what?
How to remove getters and setters
SOLID for packag... err, namespaces
What you must know about PHP errors to avoid scratching your forehead when something goes wrong
A programmer on the cloud
GitHub is a web application, Twitter is not (yet)
Eliminating duplication
Table-free CSS layouts in 10 minutes
Web Workers, for a responsive JavaScript application
How to enrich lawyers
What Firefox 4 means to web developers?
The PHP frameworks poll results
Struts vs. Zend Framework
HTTP is your wrench
The measures of programming
We cannot avoid testing JavaScript anymore
WebSockets in 5 minutes
Linear trees with Git rebase
Exploring TDD in JavaScript with a small kata
All you want to know about Web Storage
Classical inheritance in JavaScript
A Mockery review
The Gang of Four patterns as everyday objects
PHP UML generation from a live object graph
Bleeding edge JavaScript for object orientation
The 4 rules of simple design
Git backups, and no, it's not just about pushing
How to bomb a technical talk
The eXtreme Programming Values
Web services in Java
The Kindle is ready for programmers
On commits and commit messages
The Victorian Internet, and the Victorian social networks
Parallelism for dummies
Automated code reviews for PHP
Monitoring on Unix from scratch
I don't know how to test this
A week without Flash
Self-Initializing Fakes in PHP
Testing JavaScript when the DOM gets in the way
The era of Object-Document Mapping
HATEOAS, the scary acronym
Unit testing JavaScript code when Ajax gets in the way
Phantom JS: an alternative to Selenium
Symfony 2 from the eyes of a ZFer
PHP 5.4 features poll: the results
How to be a worse programmer
CoffeeScript: a TDD example
Assetic: JavaScript and CSS files management
The fastest browser poll: results
Syntactically Awesome Stylesheets
Edge Side Includes with Varnish in 10 minutes
Raphaël: cross-browser drawings
Future JavaScript, today: Google's Traceur
Backbone.js: MVC in JavaScript
Web typography in 2011
Practical Google+ Api
Phar: PHP libraries included with a single file
Cross-Site Request Forgery explained
Pretotyping: a complete example
Zend Application demystified
All the Git hooks you need
Temporal correlation in Git repositories
The Goal of software development
What I have learned at DDD Day
OAuth in headless applications
A look at Dart from the eyes of an OO programmer
And now instead, 5 things Java envies PHP for
Tell, Don't Ask in the case of a web service
Getting started with Selenium 2
I've had enough of running Scala in a terminal, let's try with a web application
Using a virtual machine to play with multiple versions of PHP
PHP on a Java application server
Web applications with the Play framework
Selenium 2 from PHP code
Eventual consistency is everywhere in the real world
Setting up a LAMP box with Puppet
PhoneGap: native applications written in HTML
HTML5 Drag and Drop uploading
Testing and specifying JavaScript code with Jasmine
What I learned in the Global Day of Code Retreat
Creating a virtual server with Vagrant: a practical walkthrough
Clojure for dummies: a kata
Rails from the point of view of a PHP developer
The Spark micro framework
3D experience in a browser with Three.js
Clojure libraries and builds with Leiningen
Open source PHP projects of 2011
TDD for multithreaded applications
Web application in Clojure: the starting point
Object-oriented Clojure
Open/Closed Principle on real world code
Python Hello World, for a web application
Offline web applications: a working example
jQuery plugins with jsTestDriver
Ajax requests to other domains with Cross-Origin Resource Sharing
Unit testing when Value Objects get in the way
An Introduction to the R Language
My use case for checked exceptions
What WSGI is
The Decorator pattern, or its cousin, in JavaScript
Bottle: a lightweight Python framework
Spam filtering with a Naive Bayes Classifier in R
Erlang's actor model
The 7 habits of highly effective developers
Our experience with Domain Events
Audio in HTML 5: state of the art
Running JavaScript inside PHP code
Gradient descent in Octave
A Zend Framework 2 tryout
Asynchronous and negative testing
All the mouse events in JavaScript
Everything you need to know about Python exceptions
CSS Bits: The Mouse Cursor
Bootstrap: rapid development and the complexity of a framework
Test-Driven Emergent Design vs. Analysis
PHP objects in MongoDB with Doctrine
TravisCI Intro and PHP Example
Sometimes Python is magic
Writing clean code in PHP 5.4
Object Calisthenics
Ajax and MVC
TDD in Python in 5 minutes
Test-Driven Development with OSGi
Including PHP libraries via Composer
There's no reason not to switch to DocBlox
The unknown acronym: GRASP
Bullets for legacy code
Finding wiring bugs
2 years of Vim and PHP distilled
All about JMS messages
Asynchronous processing in PHP
The Page Object pattern
Software versions, the necessary evil
Commodities in the IT world
The return of Vim
What's in a name?
The standard PHP setup
Selenium on Android
Hexagonal architecture in JavaScript
Why everyone is talking about APIs
Testing PHP scripts
Software Metaphors
MongoDB and Java
PHPSpec: BDD for your classes
What is global state?
A crash course for the MongoDB console
My love story with SSH
The surgery metaphor
The Turing test
Functional JavaScript with Underscore.js
Record and replay for testing of legacy PHP applications
PHP 5.4 by examples
My take on Utility and Strategic software
The Duck is a Lie
Set Up Solr and Get it Running
All debugging and no testing makes the PHP programmer a dull boy
All the ways to perform HTTP requests in PHP
The Roman numerals kata: TDD with and without analysis
Refactoring away from spaghetti PHP
What is statistical learning?
Why I am functophobic
How to build a Kanban board
An Introduction to WEKA - Machine Learning in Java
How to Take Unit Testing (and Test-Driving) Seriously
Transform switches in maps
Manual Test-Driven Development
Errors: part of the learning curve
Build your own Java stopwatch
Development of Latex documents
The Pomodoro updates
The problem of user identity
Don't overspecify your mocks
Factory patterns: Collaborators Map
The perils of long-running test suites
A CRC cards primer
No one always needs a framework
Scheduling is not the same for computers and people
Don't ignore errors
Why having an API matters: testing
What I learned at the Italian Agile Day 2012
Preparing to coach with the Game of Life
Lessons learned from the Code Retreat
The danger of large releases: Trenord case study
OO vs. functional: the Game of Life
Code Katas: Ruzzle solver
Agile traveling
How ACID is MongoDB?
SOLID principles: are they enough for OO?
Caring about build files
Thinking in value terms
Why HATEOAS is not the witch to burn
Carriers vs. the OSI model
How to correctly work with PHP serialization
Pomodoro, 2013 edition
Experiences with the book club
External processes and PHP
PHP streams for everything
Isolation in MongoDB
PHP's mcrypt
Design Choices: Return Values and Mocks
Contributing to Paratest
From Java to PHP
Continuous Integration and Pull Requests
MongoDB 2.4 is Out!
Monoids in PHP
Automated Testing is Cancer
Diving into Behat
Monitoring with DataDog
Trying out PHP Refactoring Browser
The difficult relationship between developers and business
What's in a constructor?
PHPUnit vs. Phake cheatsheet
How to stub SOAP in PHP
Many Ubiquitous Languages
Accessing APIs without taking down your own application
Game of life in Haskell
Cloning in PHP
Slack, the missing concept
A simple strategy for dotfiles
NoSQL does not mean no migrations (but opens up new ways of doing them)
Serialization and injection
The R-word
Selenium screenshots for rendering tests
The pitfalls of O(N)
How to think about patterns
Why not add this new feature?
Continuous Deployment Demystified
Backward compatibility, even inside a single project
The Legacy Code Retreat
XP Values: Simplicity
XP Values: Feedback
Memcache 102
My Vim values
Review: Implementing Domain-Driven Design
Upgrading PHP, from the trenches
Elephant Carpaccio (on user stories)
Battle with legacy: reducing ifs
Management 3.0 review
An Open/Closed Principle kata
Notes to a Software Team Leader Review
XP Values: Courage
Importing data, the API way
XP Values: Communication
How to shard a cron
The little toolbox of PHP performance optimization
How a LazyDecorator can let your application avoid building massive object graphs
Karate Chop
What your test suite won't catch
Book review: Slack
Unix commands for dealing with structured text
The programmer's information diet
Six months of Behat
Revisiting Conway's law
Unix lessons: sed
Object-relational mapping: seriously
Migration to AWS: part 1
A pull model for Event Stores
Book review: The Puritan Gift
Book review: Feedback control for computer systems
Migration to AWS: part 2
A different kind of kata: Harry Potter books
Migration to AWS, part 3
HTTP katas
Two days in the business side
Configuration is code
REST callbacks
Distributed time
A course with J.B. Rainsberger
Italian Agile Days 2013
MongoDB and its locks
Roman numerals, towards reuse
Global Day of Code Retreat 2013
No return statements
Long-running PHP processes: external resources
MongoDB Christmas optimization
Stand back, I'm going to try science!
AngularJS: first impression
Using APC correctly
Parallel PHPUnit


What paradigm should PHP applications embrace?
Is touch typing mandatory?
Which PHP framework would you use today for a brand new application?
Which browser do you consider the fastest?
What new feature in PHP 5.4 is the most important to you?