Monday, April 03, 2017

Pipeline Conf 2017

Not liquorice
Post originally shared on eLife's internal blog, but in the spirit of (green) open access here it is.

Last month I have attended the PIPELINE conference in London, 2017 edition. This event is a not-for-profit day dedicated to Continuous Delivery, the ability to get software changes into the hands of users, and to do so safely, quickly, and in a sustainable way. It is run by practitioners for practitioners, everyone on different sides of the spectrum like development, operations, testing, project management, or coaching.

The day is run with parallel tracks, divided into time slots of 40-minute talks and breaks for discussions and, of course, some sponsor pitches. I have been picking talks from the various tracks depending on their utility to eLife's testing and deployment platform, since our tech team has been developing every new project with this approach for a good part of 2016.

The conceptual model of Continuous Delivery and of eLife's implementation of it is not dissimilar to the scientific publishing process:
  • there is some work performed into an isolated environment, such as a laboratory, but also someone's laptop;
  • which leads to a transferable piece of knowledge, such as a manuscript, but also a series of commits, roughly speaking some lines of code;
  • which is then submitted and peer reviewed. We do so through pull requests, which perform a series of automated tests to aid human reviewers inside the team; part of the review is also running the code to reproduce the same results on a machine which is not that original laptop.
  • after zero or more rounds of revisions, this work gets accepted and published...
  • which means integrating it with the rest of human knowledge, typesetting it, organizing citations and lots of metadata about the newly published paper. In software, the code has to be transformed into an efficient representation, or virtual machines have to be configured to work with it.
  • until, finally, this new knowledge (or feature) is in the hands of a real person, who can read a paper or enjoy the new search functionalities
Forgive me for the raw description of scientific work.

In software, Continuous Delivery tries to automate and simplify this process to be able to perform it on microchanges multiple times per day. It aims for speed to be able to bring a new feature live in tens of minutes; it aims for safety to avoid breaking the users work on new changes; and to do all of this in a sustainable way, not to sacrifice tomorrow's ability to evolve for a quick gain today.

Even without the last mile of real user traffic, the 2.0 software services have been running in production or production-like servers from the first weeks of their development. A common anti-pattern in software development is to say "It works on my machine" (imagine some saying "It reproduces the results, but only with my microscope"); what we strive for is "It works on multiple machines, that can be reliably created; if we break a feature we know within minutes and can go back the latest version known to work."

Dan North: opening keynote

Dan North started to experiment with Continuous Delivery in 2004, at a time when builds were taking 2 days and a half to run in a testing environment contended by multiple teams. He spoke about several concepts underpinning Continuous Delivery:
  • conceptual consistency: the ability of different people to make similar decisions without coordination. It's an holy grail for scaling the efforts of an organization to more and more members and teams.
  • supportability: championing Mean Time To Repair over Mean Time Between Failures. The three important questions for facing a problem as what happened? Who is impacted? How do we fix it?
  • operability: what does it feel like to build your software? To deploy it? Test it? Releasing it? Monitor it? Support it? Essentially, developer experience in additio to user experience.
Operability is a challenge we have to face ourselves more and more as we move from running our own platform to provide open source software for other people to use. Not only reading an article has to be a beautiful experience, but publishing one should be.

John Clapham: team design for Continuous Delivery

This talk was more people-oriented, I agree with the speaker that engagement of workers is what really drive profits (or value in case of non-profits).
Practically speaking:
  • reward the right behaviors to promote the process you want;
  • ignore your job title as everyone's job is to deliver value together;
  • think small: it's easier to do 100 things 1% better than to do 1 thing 100% better (aka aggregation of marginal gains)

Abraham Marin: architectural patterns for a more efficient pipeline

The target for a build is for it to take less than 10 minutes. The speaker promotes the fastest builds as the one you don't have to run, introducing a series of patterns (and the related architectural refactorings) to be executed, safely, to simplify your software components:
  • decoupling an api from implementation: extracting an interface package to reduce the dependencies to a component to a dependency to an interface;
  • dividing responsibiliteis vertically or horizontally trying to isolate the most frequent changes and minimizing cross-cutting requirements;
  • transform a library into a service;
  • transform configuration into a service.
Some of these lessons are somewhat oriented to compiled languages, but not limited to them. My feeling is that even if you reduce compile times, you still have to test some components in integration, which is a large source of delay.

Steve Smith: measuring Continuous Delivery

How do you know whether a Continuous Delivery effort is going well? Or more pragmatically,  which of your projects is in trouble?
The abstract parameters to measure in pipelines are speed (throughput, cycle time) and stability. Each declines differently depending on the context.
In deployment pipelines that go from a commit to a new version release in production, lead time and the interval of new deployments can be measured. But also failure rate (how many runs fail) and failure recovery time are interesting. In more general builds or test suites, execution time is a key parameter but a more holistic view includes interval (how frequent are builds executed).
I liked some of these metrics so much that they are now in my OKRs for the new quarter. Simplistic quote: you can't manage what you can't measure.

Alastair Smith: Test-driving your database

To continuously deploy new software versions, you need an iterative approach to evolve your database and the data within it. When you evolve, you also have to test every new schema change. Even in the context of stored procedures for maximum efficiency (and lock-in), Alastair showed how to write tests that can reliably run on multiple environments.

Rachel Laycock: closing keynote, Continuous Delivery at Scale

Rachel Laycock is the Head of technology for North America at Thoughtworks, the main sponsor of the conference. The keynote however had nothing to do with sales pitches. Here are some anti-patterns:
  • "We have a DevOps team" is an oxymoron, as that kind of team doesn't exist; what often happens is that the Ops team gets renamed."
  • Do we chose Kubernetes or Mesos?" as in getting excited about the technology before you understand the problem to solve.
The "at scale" in the title pushes for seeing automation as a way to build a self-serving platform, where infrastructure people are not bottlenecks but enablers for the developers to build their own services.
The best quote however really was "yesterday's best practice becomes tomorrow's anti-pattern". What we look for is not to be the first to market but to have an adaptable advantage, a product that can evolve to meet new demands rather than being a dead end.