Sunday, September 21, 2014

Microservices are not Jars

The cult of the monolith
I've been building microservices for two years and my main complaint is that they're still not micro enough. Here's a rebuttal to Uncle Bob's recent post Microservices and Jars, which he apparently wrote after forming an opinion based on an article in Martin Fowler's bliki:
One of my clients recently told me that they were investigating a micro-service-architecture. My first reaction was: "What's that?" So I did a little research and found Martin Fowler's and James Lewis' writeup on the topic.
 "I didn't even know what microservices were up until several days ago. Now I'm ready to pontificate about the topic."
So what is a micro-service? It's just a little stand-alone executable that communicates with other stand-alone executables through some kind of mailbox; like an http socket. Lots of people like to use REST as the message format between them.
Why is this desirable? Two words. Independent Deployability.
Let's ignore the "REST as the message format" terminology. Only two words? Independent deployability is nice, but I've seen cases where independence is total, and cases where an end-to-end test suite still needs to run against the production versions of services A and B plus the new version C' that we want to deploy in place of C.
Other interesting properties of microservices such as scaling them independently come to mind. Or writing them in different languages. Or adapting to Conway's law by aligning teams with microservices for most of their work.
You can fire up your little MS and talk with it via REST. No other part of the system needs to be running. Nobody can change a source file in a different part of the system, and screw your little MS up. Your MS is immune to all the other code out there.
You can test your MS with simple REST commands; and you can mock out other MSs in the system with little dummy MSs that do just what your tests need them to do.
Moreover, you can control the deployment. You don't have to coordinate with some huge deployment effort, and merge deployment commands into nasty deployment scripts. You just fire up your little MS and make sure it keeps running.
You can use your own database. You can use your own webserver. You can use any language you like. You can use any framework you like.
Freedom! Freedom!
<sarcasm> tag needed.

But wait. Why is this better? Are the advantages I just listed absent from a normal Java, or Ruby, or .Net system?
  • existing databases tend to be attractors when new persistence requirements come up. So if I have MySQL up and running in my application and a job that would be a good fit for MongoDB comes up, I'm definitely not going to introduce MongoDB given the infrastructure setup time. I'll just go with the existing infrastructure and create some new tables, perpetuating the growth of the monolith.
  • Web servers are often tied to languages. If I want to use Node.js it will listen on port 80 by itself, while PHP is commonly used with Apache, and Java with Tomcat or Jetty.
  • JARs are a pretty JVM-specific packaging system. I'm definitely not going to put PHP code into JARs.
  • Frameworks come from the language, and even inside the same language I can have multiple PHP applications where one has a custom user interface and one serves an Angular single-page application.
Also the ones not listed:
  • It's easier to find the machines that contain bottlenecks and replace them, because CPU and IO usage map directly to applications.
  • It's easier to get started working as a new developer because you need just a single microservice to run on your machine.
  • It's easier to throw away one microservice and replace it with a new one doing the same job, but better written.
What about: Independent Deployability?
We have these things called jar files. Or Gems. Or DLLs. Or Shared Libraries. The reason we have these things is so we can have independent deployability.
Replacing single JARs or DLLs seems pretty dangerous to me where there are compile-time and binary dependencies in play. Since Uncle Bob has experience with that, I'm going to trust him to deploy safely this way.
Most people have forgotten this. Most people think that jar files are just convenient little folders that they can jam their classes into any way they see fit. They forget that a jar, or a DLL, or a Gem, or a shared library, is loaded and linked at runtime. Indeed, DLL stands for Dynamically Linked Library.
So if you design your jars well, you can make them just as independently deployable as a MS. Your team can be responsible for your jar file. Your team can deploy your DLL without massive coordination with other teams. Your team can test your GEM by writing unit tests and mocking out all the other Gems that it communicates with. You can write a jar in Java or Scala, or Clojure, or JRuby, or any other JVM compatible language. You can even use your own database and webserver if you like.
You can use any language you like as long as you run it on the JVM. Surely there must be people who work on other infrastructure, or who don't want to run their languages on a compatible-yet-really-secondary platform? PHP applications? Ruby programmers?
If you'd like proof that jars can be independently deployable, just look at the plugins you use for your editor or IDE. They are deployed entirely independently of their host! And often these plugins are nothing more than simple jar files.
So what have you gained by taking your jar file and putting it behind a socket and communicating with REST?
SOAP is the last acronym where simple was used this way. Look, by generating a WSDL from your objects along with an XSD file that can be used to validate XML messages you can pass requests over HTTP with a Soap-Action header and regenerate Java (or other compatible languages) code from the WSDL...
One thing you lose is time. It takes time to communicate through a socket. It takes time to decode REST messages. And that time means you cannot use micro-services with the impunity of a jar. If I want two jars to get into a rapid chat with each other, I can. But I don't dare do that with a MS because the communication time will kill me.
Of course, chatty fine-grained interfaces are not a microservices trait. I prefer "accept a Command, emit Events" as an integration style. After all, microservices can become dangerous if integrated with purely synchronous calls, so the kind of interface they expose to each other is necessarily different from the one exposed by objects that work in the same process. This is a property of every distributed system, as we have known since 1996.
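As a minimal sketch of the "accept a Command, emit Events" style, assume a hypothetical order service and an in-memory list standing in for a message broker. PlaceOrder, OrderPlaced, OrderService and the outbox are illustrative names only, not taken from any real system:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaceOrder:
    """Command: a request to do something, sent to exactly one service."""
    order_id: str
    amount: int


@dataclass(frozen=True)
class OrderPlaced:
    """Event: a fact that already happened, published for anyone interested."""
    order_id: str
    amount: int


class OrderService:
    def __init__(self, outbox):
        # The outbox stands in for a message broker; here it is a plain list.
        self.outbox = outbox
        self.orders = {}

    def handle(self, command):
        # One coarse-grained call carries everything the service needs,
        # instead of a chatty sequence of fine-grained calls.
        self.orders[command.order_id] = command.amount
        self.outbox.append(OrderPlaced(command.order_id, command.amount))


outbox = []
service = OrderService(outbox)
service.handle(PlaceOrder(order_id="42", amount=14))
print(outbox)
```

The caller never waits for the consumers of OrderPlaced: whoever cares about the event reads it from the broker in its own time.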
On my laptop it takes 50ms to set up a socket connection, and then about 3us per byte transmitted through that connection. And that's all in a single process on a single machine. Imagine the cost when the connection is over the wire!
It takes more time to write a file line by line than to write it in a single shot. However, if the file is 2GB long, I prefer the first solution in order to preserve memory. I'm just trading off time for another resource.
In the case of microservices, I'm trading off the latency of single interactions between different services for more important resources: programmer time, independent scalability, even time experienced by the end user. A front end asynchronously publishing events to a backend service feels faster to the user than a monolithic application where I respond to user requests and generate report lines in the same process or on the same machines.
Another thing you lose (and I hate to say this) is debuggability. You can't single step across a REST call, but you can single step across jar files. You can't follow a stack trace across a REST call. Exceptions get lost across a REST interface.
To me, debuggability and introspection into an application improve when using microservices, because you get the HTTP logs of every service calling another. You don't have to set up logging cut points in advance, as they come for free with the HttpChannel objects. For more business-oriented monitoring, take a look at Domain Events: we publish them from different applications in order to build reports based on data from different components.
After reading this you might think I'm totally against the whole notion of Micro-Services. But, of course, I'm not. I've built applications that way in the past, and I'll likely build them that way in the future. It's just that I don't want to see a big fad tearing through the industry leaving lots of broken systems in its wake.
For most systems the independent deployability of jar files (or DLLs, or Gems, or Shared Libraries) is more than adequate. For most systems the cost of communicating over sockets using REST is quite restrictive; and will force uncomfortable trade-offs.
Paraphrasing Stroustrup, there are only two kinds of architectures: the ones people complain about and the ones nobody uses. We are proposing microservices here because they have provided value in many systems that were once thought not to need them. As long as you have reporting needs you don't want to burden your front end with, or need to scale up in the number of users or programmers, you can consider microservices (and their cost).
My advice:
Don't leap into microservices just because it sounds cool. Segregate the system into jars using a plugin architecture first. If that's not sufficient, then consider introducing service boundaries at strategic points.
Please don't! The interactions between microservices are very different from the ones between objects inside a single application. Each call outside of the boundary is a potential failure mode that you should try to model as an asynchronous message that can be retried when delivery fails (the receiving microservice being down, slow or not reachable). Retrofitting microservices over an existing code base is a costly endeavour and you should only embark on it if you have an adequate time and money budget, possibly bigger than the one necessary to build with microservices in the first place.
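Here is a rough sketch of what "a message that can be retried when delivery fails" could look like. DeliveryFailed, deliver_with_retries and the flaky receiver are hypothetical names made up for illustration, and the exponential backoff is one common choice among many, not something prescribed above:

```python
import time


class DeliveryFailed(Exception):
    """Raised by a sender when the receiver is down, slow or unreachable."""


def deliver_with_retries(send, message, attempts=3, delay=0.1, sleep=time.sleep):
    """Try to deliver a message, waiting longer after each failure.

    Returns the number of attempts used; re-raises after the last failure.
    """
    for attempt in range(1, attempts + 1):
        try:
            send(message)
            return attempt
        except DeliveryFailed:
            if attempt == attempts:
                raise
            sleep(delay * 2 ** (attempt - 1))  # exponential backoff


# Hypothetical receiver that is unavailable for the first two calls.
received = []
calls = {"count": 0}


def flaky_send(message):
    calls["count"] += 1
    if calls["count"] < 3:
        raise DeliveryFailed("receiver not reachable")
    received.append(message)


# The sleep is replaced with a no-op so the example runs instantly.
used = deliver_with_retries(flaky_send, {"event": "OrderPlaced"}, sleep=lambda s: None)
print(used, received)
```

A retried message may arrive more than once, so the receiving side should be able to process duplicates safely.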

Sunday, August 17, 2014

Tabular data in Behat

All of this has happened before, and all this will happen again. -- BSG
I just watched Steve Freeman's short talk "Given When Then" considered harmful (requires free login), and I was looking for some ways to cheaply eliminate duplication in Behat scenarios.

Fortunately, Behat supports Scenario Outlines for tabular data which is an 80/20 solution to transform lots of duplicated scenarios:
    Scenario: 3 is Fizz
        Given the input is 3
        When it is converted
        Then it becomes Fizz

    Scenario: 6 is Fizz too because it's multiple of 3
        Given the input is 6
        When it is converted
        Then it becomes Fizz

    Scenario: 2 is itself
        Given the input is 2
        When it is converted
        Then it becomes 2
into a table:
    Scenario Outline: conversion of numbers
        Given the input is <input>
        When it is converted
        Then it becomes <output>

        Examples:
            | input | output |
            | 2     | 2      |
            | 3     | Fizz   |
            | 6     | Fizz   |

Moreover, you can also pass tabular data to a single step with Table Nodes:
    Scenario: two items in the cart
        Given the following items are in the cart:
            | name    | price |
            | Cake    |     4 |
            | Shrimps |    10 |
        When I check out
        Then I pay 14
It takes a few minutes to learn how to do this in an existing Behat infrastructure. There are minimal changes to perform in the FeatureContext in the case of the Table Nodes, while Scenario Outlines are a pure Gherkin-side refactoring.
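To give an idea of how small the step-side change is, here is a language-neutral sketch in Python (not Behat's actual API) of what a step does with a table: turn the rows into records keyed by the header, then use them. The parse_table helper is a hypothetical stand-in for the table object the framework hands to the step:

```python
def parse_table(lines):
    """Turn Gherkin table lines into a list of dicts keyed by the header row."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in lines
    ]
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


table = [
    "| name    | price |",
    "| Cake    |     4 |",
    "| Shrimps |    10 |",
]

cart = parse_table(table)
# Cells arrive as strings, so the step converts them before doing arithmetic.
total = sum(int(item["price"]) for item in cart)
print(cart)
print(total)
```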

My code is on Github in my behat-tables-kata repository. If this reminds you of PHPUnit's @dataProvider, try to think of other patterns that can be borrowed from the xUnit world to fast-forward Cucumber and Behat development.
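The xUnit counterpart of a Scenario Outline is a table-driven test. A minimal sketch in Python, where convert is a hypothetical implementation written only to make the example runnable and the rows are the same as in the table of the Scenario Outline:

```python
def convert(number):
    """Hypothetical conversion under test: multiples of 3 become Fizz."""
    return "Fizz" if number % 3 == 0 else str(number)


# One tuple per row of the table: (input, output).
EXAMPLES = [
    (2, "2"),
    (3, "Fizz"),
    (6, "Fizz"),
]

for given, expected in EXAMPLES:
    # Each row plays the role of one scenario generated from the outline.
    assert convert(given) == expected, f"{given} should become {expected}"

print("all examples pass")
```

Adding a case means adding a row, exactly as in the Gherkin table.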

Saturday, August 16, 2014

PHPUnit Essentials review
PHPUnit Essentials by Zdenek Machek is a modern and complete book about PHPUnit usage. I've been sent an electronic copy by Packt Publishing and am now reviewing it here.

The first thing that struck me about the book was the breadth of subjects: you start from mocks and command line options and get as far as Selenium usage. You have to know your tools, and since PHPUnit is a standard, this is all knowledge that will accompany you for several years.

Every book on PHPUnit must be compared with the wonderful manual, to see what it adds to the picture with respect to the documentation. PHPUnit Essentials, in this respect, also looks at 3rd party libraries such as mocking libraries or "competitors" such as PHPSpec to enlarge the picture to the whole open source PHP landscape. This is something the documentation of single projects cannot do, and where a bit of opinionated advice can be taken.

There is a bit of what may seem outdated information in the book, such as how to perform a PEAR-based installation, but it's identified as such (PEAR being deprecated and dismissed by the end of the year). Another seemingly outdated tool is Selenium IDE, but once upgraded with a formatter for Selenium2TestCase as explained in this book it becomes usable again. This kind of advice demonstrates the real world experience of the author and makes you trust the content.

On the whole, by reading this book you go in as a naive tester and you come out with lots of skills on using PHPUnit in different scenarios; so I would recommend it to programmers wanting to dive into testing PHP applications. Probably it's not worth a read for medium-to-advanced users, for whom most of the content is already known from the PHPUnit manual or personal experience. After all the book's named Essentials, so it delivers all that you expect from the title in a convenient single package.

Saturday, July 19, 2014

Skateboards, rockets and math

This slide from Spotify has been popular for a while:
It explains how a product can be built iteratively, satisfying first the need for transport with lesser means and then evolving to a more powerful platform. In this model feedback such as business model validation and satisfaction from the project sponsors can arrive early, even when it's negative (especially so).

From what I read about Spotify, they're also well aware that incremental development can only take you so far: you don't get a car by making a better bicycle. Sometimes you have to take a leap to a new platform; or if it's clear that simpler technology won't support your vision, start from a higher level of essential complexity.

Here's someone that didn't start from a skateboard:
Imagine telling Spotify to install WebSphere (or some other technological terror) as the first step when starting a brand new project; or telling SpaceX teams "Come on, Elon, just give us a bicycle and we'll get some first sales!"

Or telling Larry Page that programming isn't math:

Keeping in mind this strong dependency on context, where do the competitive advantages of your product lie?
In finding a better fit with the needs of users, maybe a lower time to market? In solving technology problems to carry humanity into space at an acceptable cost? In algorithms that can find high quality information in the web ocean? In fooling VCs into giving you free money?

From your vision follow your choices of education, process, and technology.

Friday, April 25, 2014

The full list of my articles on DZone

From 2010 to the end of 2013 I have written a few articles each week on DZone. Here is the full list as a reference.

Update: 98% of these articles are serving correctly again (only 10 links are being updated).

Practical PHP Patterns

PHP implementations for the GoF Design Patterns book and Martin Fowler's Patterns of Enterprise Application Architecture.

Practical PHP Patterns: Transaction Script
Practical PHP Patterns: Domain Model
Practical PHP Patterns: Table Module
Practical PHP Patterns: Service Layer
Practical PHP Patterns: Table Data Gateway
Practical PHP Patterns: Row Data Gateway
Practical PHP Patterns: Active Record
Practical PHP Patterns: Data Mapper
Practical PHP Patterns: Unit of Work
Practical PHP Patterns: Identity Map
Practical PHP Patterns: Lazy Loading
Practical PHP Patterns: Identity Field
Practical PHP Patterns: Foreign Key Mapping
Practical PHP Patterns: Association Table
Practical PHP Patterns: Dependent Mapping
Practical PHP Patterns: Embedded Value
Practical PHP Patterns: Serialized LOB
Practical PHP Patterns: Single Table Inheritance
Practical PHP Patterns: Concrete Table Inheritance
Practical PHP Patterns: Inheritance Mapping
Practical PHP Patterns: Metadata Mapping
Practical PHP Patterns: Query Object
Practical PHP Patterns: Repository
Practical PHP Patterns: Page Controller
Practical PHP Patterns: Front Controller
Practical PHP Patterns: Template View
Practical PHP Patterns: Transform View
Practical PHP Patterns: Two Step View
Practical PHP Patterns: Remote Facade
Practical PHP Patterns: Pessimistic Offline Lock
Practical PHP Patterns: Coarse Grained Lock
Practical PHP Patterns: Implicit Lock
Practical PHP Patterns: Database Session State
Practical PHP Patterns: Gateway
Practical PHP Patterns: Mapper
Practical PHP Patterns: Separated Interface
Practical PHP Patterns: Layer Supertype
Practical PHP Patterns: Registry
Practical PHP Patterns: Value Object
Practical PHP Patterns: Money
Practical PHP Patterns: Special Case
Practical PHP Patterns: Plugin
Practical PHP Patterns: Service Stub
Practical PHP Patterns: Record Set
Practical PHP Patterns: Application Controller
Practical PHP Patterns: Client Session State
Practical PHP Patterns: Optimistic Offline Lock
Practical PHP Patterns: Server Session State
Practical PHP Patterns: Class Table Inheritance
Practical PHP Patterns: Data Transfer Object
Practical PHP Patterns: Visitor
Practical PHP Patterns: Memento
Practical PHP Patterns: Mediator

Practical PHP Refactoring

PHP examples for Martin Fowler's Refactoring book.

Practical PHP Refactoring: Inline Temp
Practical PHP Refactoring: Move Method
Practical PHP Refactoring: Move Field
Practical PHP Refactoring: Extract Class
Practical PHP Refactoring: Hide Delegate
Practical PHP Refactoring: Inline Class
Practical PHP Refactoring: Remove Middle Man
Practical PHP Refactoring: Introduce Foreign Method
Practical PHP Refactoring: Introduce Local Extension
Practical PHP Refactoring: Self Encapsulate Field
Practical PHP Refactoring: Replace Data Value with Object
Practical PHP Refactoring: Change Value to Reference
Practical PHP Refactoring: Change Reference to Value
Practical PHP Refactoring: Replace Array with Object
Practical PHP Refactoring: Duplicate Observed Data
Practical PHP Refactoring: Change Unidirectional Association to Bidirectional
Practical PHP Refactoring: Change Bidirectional Association to Unidirectional
Practical PHP Refactoring: Replace Magic Number with Symbolic Constant
Practical PHP Refactoring: Encapsulate Field
Practical PHP Refactoring: Encapsulate Collection
Practical PHP Refactoring: Replace Type Code with Class
Practical PHP Refactoring: Replace Type Code with Subclasses
Practical PHP Refactoring: Replace Type Code with State or Strategy
Practical PHP Refactoring: Replace Subclass with Fields
Practical PHP Refactoring: Decompose Conditional
Practical PHP Refactoring: Consolidate Conditional Expression
Practical PHP Refactoring: Consolidate Duplicate Conditional Fragments
Practical PHP Refactoring: Remove Control Flag
Practical PHP Refactoring: Replace Nested Conditionals with Guard Clauses
Practical PHP Refactoring: Replace Conditional with Polymorphism
Practical PHP Refactoring: Introduce Null Object
Practical PHP Refactoring: Introduce Assertion
Practical PHP Refactoring: Rename Method
Practical PHP Refactoring: Add Parameter
Practical PHP Refactoring: Remove Parameter
Practical PHP Refactoring: Separate Query from Modifier
Practical PHP Refactoring: Parameterize Method
Practical PHP Refactoring: Replace Parameter with Explicit Methods
Practical PHP Refactoring: Preserve Whole Object
Practical PHP Refactoring: Replace Parameter with Method
Practical PHP Refactoring: Introduce Parameter Object
Practical PHP Refactoring: Hide Method
Practical PHP Refactoring: Replace Constructor with Factory Method
Practical PHP Refactoring: Encapsulate Downcast (and Wrapping)
Practical PHP Refactoring: Remove Setting Method
Practical PHP Refactoring: Replace Exception with Test
Practical PHP Refactoring: Pull Up Field
Practical PHP Refactoring: Pull Up Method
Practical PHP Refactoring: Replace Error Code with Exception
Practical PHP Refactoring: Pull Up Constructor Body
Practical PHP Refactoring: Push Down Method
Practical PHP Refactoring: Push Down Field
Practical PHP Refactoring: Extract Subclass
Practical PHP Refactoring: Replace Record with Data Class
Practical PHP Refactoring: Extract Superclass
Practical PHP Refactoring: Extract Interface
Practical PHP Refactoring: Collapse Hierarchy
Practical PHP Refactoring: Form Template Method
Practical PHP Refactoring: Replace Inheritance with Delegation
Practical PHP Refactoring: Replace Delegation with Inheritance
Practical PHP Refactoring: Tease Apart Inheritance
Practical PHP Refactoring: Convert Procedural Design to Objects
Practical PHP Refactoring: Separate Domain from Presentation
Practical PHP Refactoring: Extract Hierarchy
Practical PHP Refactoring: Extract Method
Practical PHP Refactoring: Inline Method
Practical PHP Refactoring: Replace Temp with Query
Practical PHP Refactoring: Introduce Explaining Variable
Practical PHP Refactoring: Split Temporary Variable
Practical PHP Refactoring: Remove Assignments to Parameters
Practical PHP Refactoring: Replace Method with Method Object
Practical PHP Refactoring: Substitute Algorithm

Practical PHP Testing Patterns

PHP implementations of the xUnit testing patterns by Gerard Meszaros.

Practical PHP Testing Patterns: Behavior Verification
Practical PHP Testing Patterns: Recorded Test
Practical PHP Testing Patterns: Scripted Test
Practical PHP Testing Patterns: Data-Driven Test
Practical PHP Testing Patterns: Test Automation Framework
Practical PHP Testing Patterns: Minimal Fixture
Practical PHP Testing Patterns: Standard Fixture
Practical PHP Testing Patterns: Fresh Fixture
Practical PHP Testing Patterns: Shared Fixture
Practical PHP Testing Patterns: Back Door Manipulation
Practical PHP Testing Patterns: Layer Test
Practical PHP Testing Patterns: Four Phase Test
Practical PHP Testing Patterns: Assertion Method
Practical PHP Testing Patterns: Test Method
Practical PHP Testing Patterns: Assertion Message
Practical PHP Testing Patterns: Testcase Class
Practical PHP Testing Patterns: Test Runner
Practical PHP Testing Patterns: Testcase Object
Practical PHP Testing Patterns: Test Suite
Practical PHP Testing Patterns: Test Discovery
Practical PHP Testing Patterns: Inline Setup
Practical PHP Testing Patterns: Delegated Setup
Practical PHP Testing Patterns: Creation Method
Practical PHP Testing Patterns: Implicit Setup
Practical PHP Testing Patterns: Prebuilt Fixture
Practical PHP Testing Patterns: Lazy Setup
Practical PHP Testing Patterns: Suite Fixture Setup
Practical PHP Testing Patterns: Setup Decorator
Practical PHP Testing Patterns: Chained Tests
Practical PHP Testing Patterns: State Verification
Practical PHP Testing Patterns: Custom Assertion
Practical PHP Testing Patterns: Delta Assertion
Practical PHP Testing Patterns: Guard Assertion
Practical PHP Testing Patterns: Unfinished Test Assertion
Practical PHP Testing Patterns: Garbage-Collected Teardown
Practical PHP Testing Patterns: Automated Teardown
Practical PHP Testing Patterns: In-Line Teardown
Practical PHP Testing Patterns: Implicit Teardown
Practical PHP Testing Patterns: Test Double
Practical PHP Testing Patterns: Test Stub
Practical PHP Testing Patterns: Test Spy
Practical PHP Testing Patterns: Mock Object
Practical PHP Testing Patterns: Fake Object
Practical PHP Testing Patterns: Configurable Test Double
Practical PHP Testing Patterns: Hard-Coded Test Double
Practical PHP Testing Patterns: Test-Specific Subclass
Practical PHP Testing Patterns: Named Test Suite
Practical PHP Testing Patterns: Test Utility Method
Practical PHP Testing Patterns: Parameterized Test
Practical PHP Testing Patterns: Testcase Class Per Class
Practical PHP Testing Patterns: Testcase Class per Fixture
Practical PHP Testing Patterns: Testcase Superclass
Practical PHP Testing Patterns: Testcase Class per Feature
Practical PHP Testing Patterns: Test Helper
Practical PHP Testing Patterns: Database Sandbox
Practical PHP Testing Patterns: Stored Procedure Test
Practical PHP Testing Patterns: Table Truncation Teardown
Practical PHP Testing Patterns: Dependency Injection
Practical PHP Testing Patterns: Transaction Rollback Teardown
Practical PHP Testing Patterns: Dependency Lookup
Practical PHP Testing Patterns: Humble Object
Practical PHP Testing Patterns: Test Hook
Practical PHP Testing Patterns: Literal Value
Practical PHP Testing Patterns: Derived Value
Practical PHP Testing Patterns: Generated Value
Practical PHP Testing Patterns: Dummy Object

Lean tools

Reflections on the Poppendiecks' Lean Software Development: An Agile Toolkit.

Lean Tools: Seeing Waste
Lean Tools: Value Stream Mapping
Lean Tools: the Last Responsible Moment
Lean Tools: Queuing Theory
Lean Tools: Self-Determination
Lean Tools: Motivation
Lean Tools: Expertise
Lean Tools: Perceived Integrity
Lean Tools: Conceptual Integrity
Lean Tools: Refactoring
Lean Tools: Measurements
Lean Tools: Contracts


Erlang

My experiments following O'Reilly's Erlang Programming.
Erlang: Hello World
Erlang: tuples and lists
Erlang: build and test
Erlang: functions (part 1)
Erlang: functions (part 2)
Erlang: Concurrency
Erlang: client/server
Erlang: linking processes
Erlang: monitoring
Erlang: records
Erlang: macros
Erlang: live upgrade
Erlang: higher order functions
Erlang: list comprehensions
Erlang: binaries and bitstrings
Erlang: references
Erlang: sets
Erlang: bags

The wheel

A small series highlighting open source libraries to counter my bias towards building my own tools.
The Wheel: Symfony Console
The Wheel: Symfony Filesystem
The Wheel: Symfony Stopwatch
The Wheel: Monolog
The Wheel: Twig
The Wheel: Guzzle
The Wheel: Symfony Routing
The Wheel: Assetic


Why I'm leaving Subversion for Git
Acceptance Test-Driven Development
How improved hardware changed programming
Contributing to open source projects
Introducing NakedPhp 0.1
How I learned to stop worrying and love new words
Graphical tips for the average coder
Domain-Driven Design Review
The class design checklist
HTTP verbs in PHP
Java versus PHP
TDD: Always code as...
The TDD Checklist (Red-Green-Refactor in Detail)
The Model-View-Controller pattern in PHP
The guide to configuration of PHP applications
Vim for PHP development
Synchronization in PHP
Evolution of a programmer
PHP 2.x frameworks and Ruby on Rails
Zend_Test for Acceptance TDD
Yahoo! Query Language
OSGi and servlets can work together
Writing user stories for web applications
CSS3 pseudo-classes
Death by buzzwords
Testing web applications with Selenium
The refactoring breakthrough on a CoffeeMachine
Lower your bar in Test-Driven Development
Web MVC in Java (without frameworks)
Software engineering in the rail system
Web applications as enterprise software
The absolute minimum you'll ever have to know about session persistence on the web
Exceptional JavaScript
JSP are more than templates
Firebug is beautiful
A Dojo primer
PHP inclusions
10 HTML tags which are not used as often as they deserve
WebML: overcoming UML for web applications
The buzzword glossary
The shortest guide to character sets you'll ever read
The wonders of the input tag in HTML 5
Native jQuery animations
NetBeans vs. Vim for PHP development
The different kinds of testing
Selenium is not a panacea
Is graceful degradation dead?
From Subversion to Git in a morning
Why a Pomodoro helps you getting in the zone
PHPUnit 3.5: easier asserting and mocking
CSS refactoring
The PHP paradigms poll results: OOP wins
How to set up the Pomodoro Technique in your office
What you need to know about your version control system
Paint on a canvas like Van Gogh
The must-know of color theory
You don't have to always stare at a screen
What we don't need in object-oriented programming
INVEST in user stories
Primitive Obsession
5 features of PHP that seem hacks, but save your life
From Doctrine 1 to Doctrine 2
The Dark Side of Lean
It's just like putting LEGO bricks together... Or not?
The best tools for writing UML diagrams
Date and time in PHP 5
Zend_Validate for the win
Meaningless docblocks considered harmful
Double Dispatch: the next best thing with respect to Dependency Injection
Zend_Locale for the win
Technical Investment, or quality vs. time
Real-life closures examples ...for real
Client applications with Ajax Solr: JavaScript vs. servlets
Sitting on the couch
What cooking can teach to a software developer
CouchDB from JavaScript
Reuse your closures with functors
These are not the buzzwords you're looking for
TDD for algorithms: the state of the art
An humble infographic on methodologies
Why Twitter is not an RSS replacement
5 things that PHP envies Java for
PageRank in 5 minutes
Where has XHTML gone?
Behavior-Driven Development in PHP with Behat
Do not fear the command line
Can you use PHP without frameworks nowadays?
Why Ruby's monkey patching is better than land mines...wait, what?
How to remove getters and setters
SOLID for packag... err, namespaces
What you must know about PHP errors to avoid scratching your forehead when something goes wrong
A programmer on the cloud
GitHub is a web application, Twitter is not (yet)
Eliminating duplication
Table-free CSS layouts in 10 minutes
Web Workers, for a responsive JavaScript application
How to enrich lawyers
What Firefox 4 means to web developers?
The PHP frameworks poll results
Struts vs. Zend Framework
HTTP is your wrench
The measures of programming
We cannot avoid testing JavaScript anymore
WebSockets in 5 minutes
Linear trees with Git rebase
Exploring TDD in JavaScript with a small kata
All you want to know about Web Storage
Classical inheritance in JavaScript
A Mockery review
The Gang of Four patterns as everyday objects
PHP UML generation from a live object graph
Bleeding edge JavaScript for object orientation
The 4 rules of simple design
Git backups, and no, it's not just about pushing
How to bomb a technical talk
The eXtreme Programming Values
Web services in Java
The Kindle is ready for programmers
On commits and commit messages
The Victorian Internet, and the Victorian social networks
Parallelism for dummies
Automated code reviews for PHP
Monitoring on Unix from scratch
I don't know how to test this
A week without Flash
Self-Initializing Fakes in PHP
Testing JavaScript when the DOM gets in the way
The era of Object-Document Mapping
HATEOAS, the scary acronym
Unit testing JavaScript code when Ajax gets in the way
Phantom JS: an alternative to Selenium
Symfony 2 from the eyes of a ZFer
PHP 5.4 features poll: the results
How to be a worse programmer
CoffeeScript: a TDD example
Assetic: JavaScript and CSS files management
The fastest browser poll: results
Syntactically Awesome Stylesheets
Edge Side Includes with Varnish in 10 minutes
Raphaël: cross-browser drawings
Future JavaScript, today: Google's Traceur
Backbone.js: MVC in JavaScript
Web typography in 2011
Practical Google+ Api
Phar: PHP libraries included with a single file
Cross-Site Request Forgery explained
Pretotyping: a complete example
Zend Application demystified
All the Git hooks you need
Temporal correlation in Git repositories
The Goal of software development
What I have learned at DDD Day
OAuth in headless applications
A look at Dart from the eyes of an OO programmer
And now instead, 5 things Java envies PHP for
Tell, Don't Ask in the case of a web service
Getting started with Selenium 2
I've had enough of running Scala in a terminal, let's try with a web application
Using a virtual machine to play with multiple versions of PHP
PHP on a Java application server
Web applications with the Play framework
Selenium 2 from PHP code
Eventual consistency is everywhere in the real world
Setting up a LAMP box with Puppet
PhoneGap: native applications written in HTML
HTML5 Drag and Drop uploading
Testing and specifying JavaScript code with Jasmine
What I learned in the Global Day of Code Retreat
Creating a virtual server with Vagrant: a practical walkthrough
Clojure for dummies: a kata
Rails from the point of view of a PHP developer
The Spark micro framework
3D experience in a browser with Three.js
Clojure libraries and builds with Leiningen
Open source PHP projects of 2011
TDD for multithreaded applications
Web application in Clojure: the starting point
Object-oriented Clojure
Open/Closed Principle on real world code
Python Hello World, for a web application
Offline web applications: a working example
jQuery plugins with jsTestDriver
Ajax requests to other domains with Cross-Origin Resource Sharing
Unit testing when Value Objects get in the way
An Introduction to the R Language
My use case for checked exceptions
What WSGI is
The Decorator pattern, or its cousin, in JavaScript
Bottle: a lightweight Python framework
Spam filtering with a Naive Bayes Classifier in R
Erlang's actor model
The 7 habits of highly effective developers
Our experience with Domain Events
Audio in HTML 5: state of the art
Running JavaScript inside PHP code
Gradient descent in Octave
A Zend Framework 2 tryout
Asynchronous and negative testing
All the mouse events in JavaScript
Everything you need to know about Python exceptions
CSS Bits: The Mouse Cursor
Bootstrap: rapid development and the complexity of a framework
Test-Driven Emergent Design vs. Analysis
PHP objects in MongoDB with Doctrine
TravisCI Intro and PHP Example
Sometimes Python is magic
Writing clean code in PHP 5.4
Object Calisthenics
Ajax and MVC
TDD in Python in 5 minutes
Test-Driven Development with OSGi
Including PHP libraries via Composer
There's no reason not to switch to DocBlox
The unknown acronym: GRASP
Bullets for legacy code
Finding wiring bugs
2 years of Vim and PHP distilled
All about JMS messages
Asynchronous processing in PHP
The Page Object pattern
Software versions, the necessary evil
Commodities in the IT world
The return of Vim
What's in a name?
The standard PHP setup
Selenium on Android
Hexagonal architecture in JavaScript
Why everyone is talking about APIs
Testing PHP scripts
Software Metaphors
MongoDB and Java
PHPSpec: BDD for your classes
What is global state?
A crash course for the MongoDB console
My love story with SSH
The surgery metaphor
The Turing test
Functional JavaScript with Underscore.js
Record and replay for testing of legacy PHP applications
PHP 5.4 by examples
My take on Utility and Strategic software
The Duck is a Lie
Set Up Solr and Get it Running
All debugging and no testing makes the PHP programmer a dull boy
All the ways to perform HTTP requests in PHP
The Roman numerals kata: TDD with and without analysis
Refactoring away from spaghetti PHP
What is statistical learning?
Why I am functophobic
How to build a Kanban board
An Introduction to WEKA - Machine Learning in Java
How to Take Unit Testing (and Test-Driving) Seriously
Transform switches in maps
Manual Test-Driven Development
Errors: part of the learning curve
Build your own Java stopwatch
Development of Latex documents
The Pomodoro updates
The problem of user identity
Don't overspecify your mocks
Factory patterns: Collaborators Map
The perils of long-running test suites
A CRC cards primer
No one always needs a framework
Scheduling is not the same for computers and people
Don't ignore errors
Why having an API matters: testing
What I learned at the Italian Agile Day 2012
Preparing to coach with the Game of Life
Lessons learned from the Code Retreat
The danger of large releases: Trenord case study
OO vs. functional: the Game of Life
Code Katas: Ruzzle solver
Agile traveling
How ACID is MongoDB?
SOLID principles: are they enough for OO?
Caring about build files
Thinking in value terms
Why HATEOAS is not the witch to burn
Carriers vs. the OSI model
How to correctly work with PHP serialization
Pomodoro, 2013 edition
Experiences with the book club
External processes and PHP
PHP streams for everything
Isolation in MongoDB
PHP's mcrypt
Design Choices: Return Values and Mocks
Contributing to Paratest
From Java to PHP
Continuous Integration and Pull Requests
MongoDB 2.4 is Out!
Monoids in PHP
Automated Testing is Cancer
Diving into Behat
Monitoring with DataDog
Trying out PHP Refactoring Browser
The difficult relationship between developers and business
What's in a constructor?
PHPUnit vs. Phake cheatsheet
How to stub SOAP in PHP
Many Ubiquitous Languages
Accessing APIs without taking down your own application
Game of life in Haskell
Cloning in PHP
Slack, the missing concept
A simple strategy for dotfiles
NoSQL does not mean no migrations (but opens up new ways of doing them)
Serialization and injection
The R-word
Selenium screenshots for rendering tests
The pitfalls of O(N)
How to think about patterns
Why not add this new feature?
Continuous Deployment Demystified
Backward compatibility, even inside a single project
The Legacy Code Retreat
XP Values: Simplicity
XP Values: Feedback
Memcache 102
My Vim values
Review: Implementing Domain-Driven Design
Upgrading PHP, from the trenches
Elephant Carpaccio (on user stories)
Battle with legacy: reducing ifs
Management 3.0 review
An Open/Closed Principle kata
Notes to a Software Team Leader Review
XP Values: Courage
Importing data, the API way
XP Values: Communication
How to shard a cron
The little toolbox of PHP performance optimization
How a LazyDecorator can let your application avoid building massive object graphs
Karate Chop
What your test suite won't catch
Book review: Slack
Unix commands for dealing with structured text
The programmer's information diet
Six months of Behat
Revisiting Conway's law
Unix lessons: sed
Object-relational mapping: seriously
Migration to AWS: part 1
A pull model for Event Stores
Book review: The Puritan Gift
Book review: Feedback control for computer systems
Migration to AWS: part 2
A different kind of kata: Harry Potter books
Migration to AWS, part 3
HTTP katas
Two days in the business side
Configuration is code
REST callbacks
Distributed time
A course with J.B. Rainsberger
Italian Agile Days 2013
MongoDB and its locks
Roman numerals, towards reuse
Global Day of Code Retreat 2013
No return statements
Long-running PHP processes: external resources
MongoDB Christmas optimization
Stand back, I'm going to try science!
AngularJS: first impression
Using APC correctly
Parallel PHPUnit



Wednesday, April 23, 2014

Integrated tests are not feeling well. Long live design.

Take me down to the big Rails city / where the tests are green and they take 10 minutes
I checked the date this afternoon:
$ date
Wed Apr 23 14:38:08 CEST 2014

and apparently it's 2014, but there's still a widely held belief (in some circles) that integrated (or end-to-end) tests should be favored over unit tests. A belief that Test-Driven Development does not have a beneficial influence on the quality of your tests and code.
So today I'm repeating a few things I have been writing about over the last few years.
Don't be too proud of this technological terror you've constructed. The ability to INSERT a row is insignificant next to the power of Domain Models.

A few properties of unit tests

Unit tests target one or a few objects at a time, without accessing any resources other than the CPU and memory of the current process. Compared to integrated tests, they are:
  • Easier to write: their setup phase takes a few lines where Test Doubles are injected in a constructor.
  • Faster: a 1000-test suite takes seconds to run.
  • Isolated: they cannot interfere with each other or with the global state of the process.
  • Repeatable: they always give the same result no matter the initial conditions.
  • Precise: they tell you which wire does not work, instead of just that the car does not start.
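As a minimal sketch of the first point, here is a unit test whose whole setup phase is one line that injects a Test Double through a constructor. It is written in Python for brevity; Subscription, RecordingGateway and renew are hypothetical names made up for this illustration, not code from a real project.

```python
class RecordingGateway:
    """Hypothetical Test Double: remembers charges instead of contacting a payment provider."""

    def __init__(self):
        self.charges = []

    def charge(self, customer_id, cents):
        self.charges.append((customer_id, cents))


class Subscription:
    """Hypothetical domain object; its collaborator is injected in the constructor."""

    def __init__(self, gateway, customer_id, price_cents):
        self._gateway = gateway
        self._customer_id = customer_id
        self._price_cents = price_cents

    def renew(self):
        self._gateway.charge(self._customer_id, self._price_cents)


def test_renewal_charges_the_customer_once():
    gateway = RecordingGateway()  # the whole setup phase
    Subscription(gateway, "customer-42", 100).renew()
    assert gateway.charges == [("customer-42", 100)]


test_renewal_charges_the_customer_once()
```

No database, HTTP server or payment provider is touched, which is what keeps a test like this fast, isolated and repeatable.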

Listen to your tests

How many asserts must a man write down / before you call him a man

That's not to say integrated tests are not useful: I have a talk coming up at phpDay in which I will explain how our Behat test suite works and the optimizations we made to keep it under 5 minutes. Some concerns where integration tests shine are making sure systems built by different teams work together, refactoring legacy code, and acceptance-level tests written in the customer's language.
However, the ratio of unit tests to integrated tests should be in the range of 10 to 1, or even 50 to 1. If there is a force at work in a project that pushes for more integrated tests than unit tests, you are falling into a vicious cycle where instead of writing:
assertEquals("1.00", new Money(100).toString());
you're writing, more often than not:
assertEquals(
    "<span class=\"money\">1.00</span>",
    findPriceTag(get("/subscription/" + id))
);
and promoting coupling between the Money, Subscription and PageTemplate objects.
The Listen to the tests principle tells us to treat difficult-to-write integrated tests as a smell: a warning that we need to break the dependencies between objects to be able to reuse them, for example in isolated tests (lowering coupling); and move responsibilities around until objects respond to an interface with a small surface area (increasing cohesion).
The benefit of TDD is continuously applying these two forces in your codebase. Renouncing it in favor of integrated tests means believing you're able to do the same in your head, for the rest of the life of your codebase. We test because we don't want to break features, such as being able to perform a payment; but we unit test because we don't want to impact non-functional concerns such as reuse and the ability to change.
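To make the contrast concrete, here is a rough Python sketch of the decoupled version: formatting lives in Money, markup lives in a separate function, and each gets its own one-line check. The price_tag helper and the cents-based constructor are assumptions made for illustration, and the sketch only handles non-negative amounts.

```python
class Money:
    """Amount stored as an integer number of cents (assumed non-negative)."""

    def __init__(self, cents):
        self._cents = cents

    def __str__(self):
        return f"{self._cents // 100}.{self._cents % 100:02d}"


def price_tag(text):
    """Hypothetical template helper: knows about markup, not about Money."""
    return f'<span class="money">{text}</span>'


# Each responsibility is checked on its own, with no HTTP request involved.
assert str(Money(100)) == "1.00"
assert str(Money(1905)) == "19.05"
assert price_tag("1.00") == '<span class="money">1.00</span>'
```

A change to the markup now breaks only the price_tag check, and a change to the formatting rules breaks only the Money ones.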
Take me to the magic of the moment / on a glory night / where the objects of tomorrow dream away / in the wind of change


I'll stop short. Here's where you can learn more about integrated and unit tests, explained by some of the best people in the field.
And finally the style of this post is inspired by Call me maybe: MongoDB.

Sunday, March 09, 2014

The good old TCP/IP stack

There are theoretical models, such as the ISO/OSI one, that cast the Internet into a set of many levels in an attempt at standardization. The Internet protocol suite, also known as TCP/IP, describes instead what goes on in reality to show you this blog post. The suite is divided into multiple layers, each building on the previous one and containing several protocols that can be theoretically swapped with each other.
I may use some protocol-specific terminology by antonomasia, such as frame.

Link layer

The link layer solves the problem: how do I get a frame of bytes from one physical device to another? Consider that the network resources, such as physical cables and radio frequencies, may be shared so that collision is possible. For the same reason, sometimes routing has to be available to identify whom I am sending these bytes to; however this routing is physical, consisting of single point-to-point connections or of network card addresses.
Inside a local network, Ethernet and the wireless IEEE 802.11 standards have the lion's share of the market. Devices are identified by their firmware-based MAC addresses and the network may contain switches sending the frames travelling through them to the correct recipient.
However, a local network is of limited utility nowadays. To talk with the rest of the world, more complex link layer protocols are needed: they get you from your DSL router to your ISP's routers, maybe even involving multiple hops such as a section based on copper wires and one on optical fiber.
The link layer is closely coupled to the hardware available: different protocols work on different mediums such as wires, glass and electromagnetic waves. It is possible in theory to abstract the business logic (say, how to detect a collision) from the medium; however, it's like testing a Repository object by looking at the query that it generates instead of running it against the real database.
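As an illustration of the MAC addressing mentioned above, this Python sketch unpacks the 14-byte header of an Ethernet II frame: destination MAC, source MAC and a 2-byte EtherType. The frame bytes are hand-made sample data, not a capture from a real network, and parse_ethernet_header is a name invented here.

```python
import struct


def format_mac(raw):
    """Render 6 raw bytes in the usual colon-separated hexadecimal notation."""
    return ":".join(f"{byte:02x}" for byte in raw)


def parse_ethernet_header(frame):
    """Return (destination MAC, source MAC, EtherType) from the first 14 bytes."""
    destination, source, ether_type = struct.unpack("!6s6sH", frame[:14])
    return format_mac(destination), format_mac(source), ether_type


sample_frame = (
    bytes.fromhex("ffffffffffff")     # destination: the broadcast address
    + bytes.fromhex("020000000001")   # source: a locally administered address
    + bytes.fromhex("0800")           # EtherType 0x0800 announces an IPv4 payload
    + b"payload bytes would follow"
)

destination, source, ether_type = parse_ethernet_header(sample_frame)
print(destination, source, hex(ether_type))
```

A switch only needs the first field to decide which port the frame should leave from; the payload is opaque to it.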

Internet layer

In the Internet model, machines may have globally-recognizable addresses that have meaning outside their local network. Thanks to these IP addresses and the related protocols, you can solve the problem of getting packets of data from one node in the world to another.
However, these packets have severe limitations:
  • They are of a limited or fixed size, which cannot be increased beyond a few thousand bytes due to the packet switching model.
  • No order is guaranteed: packets may take different paths to get to the target host and arrive in any order.
  • Their transmission is best-effort, as there can be arbitrary packet loss.
Inside the global network, all hops at the Internet layer level have an IP address; the source and target IP addresses are written inside each packet so that each intermediate node can route it towards the neighbor that is probably nearest to the target. You can imagine the complexity of constantly updating this routing table while addresses are (re)assigned every day.
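The forwarding decision each node takes can be sketched as a longest-prefix match over a routing table. This is a toy Python version using the standard ipaddress module; the table entries and neighbor names are made up (using address ranges reserved for documentation), and real routers rely on far more efficient data structures.

```python
import ipaddress

# Hypothetical routing table: (destination network, neighbor to forward to).
ROUTING_TABLE = [
    (ipaddress.ip_network("0.0.0.0/0"), "isp-gateway"),        # default route
    (ipaddress.ip_network("203.0.113.0/24"), "neighbor-a"),
    (ipaddress.ip_network("203.0.113.128/25"), "neighbor-b"),
]


def next_hop(destination):
    """Pick the neighbor whose network matches the destination most specifically."""
    address = ipaddress.ip_address(destination)
    candidates = [(network, hop) for network, hop in ROUTING_TABLE if address in network]
    network, hop = max(candidates, key=lambda entry: entry[0].prefixlen)
    return hop


assert next_hop("203.0.113.200") == "neighbor-b"   # matches /0, /24 and /25
assert next_hop("203.0.113.7") == "neighbor-a"     # matches /0 and /24
assert next_hop("198.51.100.1") == "isp-gateway"   # only the default route matches
```

The default route guarantees that every packet has somewhere to go, even when the node knows nothing specific about the destination.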
IP (version 4 or 6) is not the only Internet layer protocol. ICMP is one of the other famous ones, used for example by ping and traceroute for troubleshooting.
Finally, note that due to the limitations of the public address ranges containing only 4 billion IPs, NAT and other techniques have been developed to provide private address spaces to local networks. This severely breaks the model of globally addressable nodes, as for example nodes inside your home or office network cannot accept incoming connections (without resorting to port forwarding). It is a necessary evil due to the ubiquity of IPv4 and its 32-bit address fields.
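The numbers behind this paragraph can be checked with Python's ipaddress module: 32-bit fields give about 4.3 billion addresses, and the ranges reserved for private networks are flagged as such. The sample addresses are arbitrary picks for illustration.

```python
import ipaddress

# 32-bit address fields: the whole IPv4 space.
assert 2 ** 32 == 4_294_967_296
assert ipaddress.ip_network("0.0.0.0/0").num_addresses == 2 ** 32

# A typical address handed out by a home router sits in a private range,
# so it is not reachable from outside without NAT or port forwarding.
assert ipaddress.ip_address("192.168.1.10").is_private
assert not ipaddress.ip_address("8.8.8.8").is_private

# IPv6 widens the address field to 128 bits.
assert ipaddress.ip_network("::/0").num_addresses == 2 ** 128
```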

Transport layer

The Internet layer provides global connectivity, but with the limitations described above. To provide a useful bidirectional communication channel, the Transport layer builds upon the unreliable packets of the Internet layer to provide the illusion of a local IO stream, the same you could get by reading a file.
Consider for example the Transmission Control Protocol, TCP; it provides:
  • reliable and ordered communication between hosts. Lost packets are retransmitted, and sequence numbers are used to correct out-of-order arrival.
  • multiplexing of communication channels over the single link between two nodes via ports. I can connect to the same web server with multiple browsers without the HTML pages and images being returned getting mixed up with each other.
Other protocols such as UDP are optimized not for reliable communication, but for other parameters like latency. What matters is that with a transport layer we can build a remote terminal which is conceptually the same as a local one, sending streams of text and receiving other text back.
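The reordering job that sequence numbers make possible can be sketched in a few lines of Python. This is a simplification, not TCP itself: there are no acknowledgements, windows or timers, and reassemble is a name invented for the example. As in TCP, each sequence number is the byte offset of the segment in the stream.

```python
def reassemble(segments, first_seq=0):
    """Rebuild a byte stream from (sequence number, payload) pairs in any order.

    Returns the contiguous data received so far and the next sequence number
    expected, which is what a receiver would ask the sender to (re)transmit.
    """
    pending = dict(segments)  # duplicated segments collapse into one entry
    stream = bytearray()
    expected = first_seq
    while expected in pending:
        payload = pending.pop(expected)
        stream += payload
        expected += len(payload)
    return bytes(stream), expected


# Segments arriving out of order are put back in sequence.
assert reassemble([(5, b" worl"), (0, b"hello"), (10, b"d")]) == (b"hello world", 11)

# A lost segment leaves a gap: the stream stops at byte 5 until it is retransmitted.
assert reassemble([(0, b"hello"), (10, b"d")]) == (b"hello", 5)
```

The application above only ever sees the contiguous part of the stream, which is where the illusion of reading a local file comes from.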

Application layer

Once we have transformed the mess of wires and network devices into a universal interface made of text and bytes, it's up to the application to do something useful with it. Protocols at the application layer differ in what they offer to the end user:
  • Identification of nodes with a host name even if their IP address changes or they are physically moved elsewhere (DNS).
  • A way to read and create hypertext/hypermedia documents and related resources (HTTP).
  • A secure terminal session on a remote machine (SSH).
  • Updates for the local clock of your machine so that it's always correctly set (NTP).
  • Voice and video chat (proprietary protocols usually).
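To show how thin an application protocol can be once the transport layer provides a stream of bytes, here is a Python sketch that builds an HTTP/1.1 request and reads the status line of a response as plain text. The helper names and the example.org host are placeholders, the response is a hand-written sample, and no network connection is opened.

```python
def build_get_request(host, path="/"):
    """Return the bytes a client would write on the TCP stream for a simple GET."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}", "Connection: close", "", ""]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response):
    """Extract (protocol version, status code, reason phrase) from a raw response."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


request = build_get_request("example.org")
assert request == b"GET / HTTP/1.1\r\nHost: example.org\r\nConnection: close\r\n\r\n"

sample_response = b"HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\n\r\n"
assert parse_status_line(sample_response) == ("HTTP/1.1", 404, "Not Found")
```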


Why is it important to know how the full stack of Internet protocols works?
  • When something breaks or slows down, it helps to identify the level at which the failure is happening, and contact the right person such as your ISP, a system administrator that has to restart a VPN or a programmer not targeting the correct HTTP response code.
  • Layers are isolated from each other, so you can usually swap implementations inside one layer while keeping a system functional, sometimes sacrificing non-functional requirements such as performance. If your DSL line is down, you can use a mobile broadband Internet key without changing software.
  • Some problems are best solved inside a particular layer: congestion control by the transport layer, routing and visibility at the Internet layer. Why waste energy on segregating responsibilities when there is already a standard division of labor we cannot change...