Sunday, September 21, 2014

Microservices are not Jars

The cult of the monolith
I've been building microservices for two years and my main complaint is that they're still not micro- enough. Here's a rebuke of Uncle Bob's recent post Microservices and Jars, which he apparently has written after forming an opinion based on an article in Martin Fowler's bliki:
One of my clients recently told me that they were investigating a micro-service-architecture. My first reaction was: "What's that?" So I did a little research and found Martin Fowler's and James Lewis' writeup on the topic.
 "I didn't even know what microservices were up until several days ago. Now I'm ready to pontificate about the topic."
So what is a micro-service? It's just a little stand-alone executable that communicates with other stand-alone executables through some kind of mailbox; like an http socket. Lots of people like to use REST as the message format between them.
Why is this desirable? Two words. Independent Deployability.
Let's ignore the REST as the message format terminology. Only two words? Independent deployability is nice, but I've seen cases where independence is total, and cases where an end-to-end test suite still needs to run including the production version of services A and B and the new version C' that we want to deploy to substitute C.
Other interesting properties of microservices such as scaling them independently come to mind. Or writing them in different languages. Or adapting to Conway's law by aligning teams with microservices for most of their work.
You can fire up your little MS and talk with it via REST. No other part of the system needs to be running. Nobody can change a source file in a different part of the system, and screw your little MS up. Your MS is immune to all the other code out there.
You can test your MS with simple REST commands; and you can mock out other MSs in the system with little dummy MSs that do just what your tests need them to do.
Moreover, you can control the deployment. You don't have to coordinate with some huge deployment effort, and merge deployment commands into nasty deployment scripts. You just fire up your little MS and make sure it keeps running.
You can use your own database. You can use your own webserver. You can use any language you like. You can use any framework you like.
Freedom! Freedom!
<sarcasm> tag needed.

But wait. Why is this better? Are the advantages I just listed absent from a normal Java, or Ruby, or .Net system?
Yes:
  • existing databases tend to be attractors when new persistence requirements come up. So if I have MySQL up and running in my application and a job that would be a good fit for MongoDB comes up, I'm definitely not going to introduce MongoDB given the infrastructure setup time. I'll just go with the existing infrastructure and create some new tables, perpetuating the growth of the monolith.
  • Web servers are often tied to languages. If I want to use Node.js it will listen on the port 80 by itself, while PHP is commonly used with Apache, and Java with Tomcat or Jetty. 
  • JARs are a pretty JVM-specific packaging system. I'm definitely not going to put PHP code into JARs.
  • Frameworks come from the language, and even inside the same language I can have multiple PHP applications where one has a custom user interface and one serves a Angular single-page application.
Also the ones not listed:
  • It's easier to find out machines which contain bottlenecks and replace them, CPU and IO usage maps directly to applications.
  • It's easier to get started working as a new developer because you need just a single microservice to run on your machine.
  • It's easier to throw away one microservice and replace it with a new one doing the same job, but better written.
What about: Independent Deployability?
We have these things called jar files. Or Gems. Or DLLs. Or Shared Libraries. The reason we have these things is so we can have independent deployability.
Replacing single JARs or DLLs seems pretty dangerous to me where there are compile-time and binary dependencies in play. Since Uncle Bob has experience with that, I'm going to trust him to deploy safely this way.
Most people have forgotten this. Most people think that jar files are just convenient little folders that they can jam their classes into any way they see fit. They forget that a jar, or a DLL, or a Gem, or a shared library, is loaded and linked at runtime. Indeed, DLL stands for Dynamically Linked Library.
So if you design your jars well, you can make them just as independently deployable as a MS. Your team can be responsible for your jar file. Your team can deploy your DLL without massive coordination with other teams. Your team can test your GEM by writing unit tests and mocking out all the other Gems that it communicates with. You can write a jar in Java or Scala, or Clojure, or JRuby, or any other JVM compatible language. You can even use your own database and wesbserver if you like.
You can use any language you like as long as you run it on the JVM. Sure there must be people who work on other infrastructure or don't want to run their languages on a compatibly-yet-really-secondary platform? PHP applications? Ruby programmers?
If you'd like proof that jars can be independently deployable, just look at the plugins you use for your editor or IDE. They are deployed entirely independently of their host! And often these plugins are nothing more than simple jar files.
So what have you gained by taking your jar file and putting it behind a socket and communicating with REST?
SOAP is the last acronym where simple was used this way. Look, by generating a WSDL from your objects along with an XSD file that can be used to validate XML messages you can pass requests over HTTP with a Soap-Action header and regenerate Java (or other compatible languages) code from the WSDL...
One thing you lose is time. It takes time to communicate through a socket. It takes time to decode REST messages. And that time means you cannot use micro-services with the impunity of a jar. If I want two jars to get into a rapid chat with each other, I can. But I don't dare do that with a MS because the communication time will kill me.
Of course, chatty fine-grained interfaces are not a microservices trait. I prefer accept a Command, emit Events as an integration style. After all, microservices can become dangerous if integrated with purely synchronous calls so the kind of interfaces they expose to each other is necessarily different from the one of objects that work in the same process. This is a property of every distributed system, as we know from 1996.
On my laptop it takes 50ms to set up a socket connection, and then about 3us per byte transmitted through that connection. And that's all in a single process on a single machine. Imagine the cost when the connection is over the wire!
It takes more to write a file line by line rather than doing it in a single shot. However, if the file is 2GB long, I prefer the first solution in order to preserve memory. I'm just trading off time for another resource.
In the case of microservices, I'm trading off the latency of single interactions between different services for more important resources: programmer time, independent scalability, even time experienced by the end user. A front end asynchronously publishing events to a backend service feels faster to the user than a monolithic application where I respond to user requests and generate report lines in the same process or on the same machines.
Another thing you lose (and I hate to say this) is debuggability. You can't single step across a REST call, but you can single step across jar files. You can't follow a stack trace across a REST call. Exceptions get lost across a REST interface.
To me debuggability and introspection into an application improves when using microservices, because you will be full of all the HTTP logs of every service calling one another. You don't have to predispose logging cut points as they come for free with the HttpChannel objects. For a more business-oriented monitoring, take a look at Domain Events: we publish them from different applications in order to build reports based on data from different components.
After reading this you might think I'm totally against the whole notion of Micro-Services. But, of course, I'm not. I've built applications that way in the past, and I'll likely build them that way in the future. It's just that I don't want to see a big fad tearing through the industry leaving lots of broken systems in it's wake.
For most systems the independent deployability of jar files (or DLLS, or Gems, or Shared Libraries) is more than adequate. For most systems the cost of communicating over sockets using REST is quite restrictive; and will force uncomfortable trade-offs.
Paraphrasing Stroustrup, there are only two kinds of achitectures: the ones people complain about and the ones nobody uses. We are here proposing microservices because they have provided value in many systems that were once thought not to need them. As long as you have reporting needs you don't want to burden your front end with, or need to scale up in the number of users or programmer, you can consider microservices (and their cost).
My advice:
Don't leap into microservices just because it sounds cool. Segregate the system into jars using a plugin architecture first. If that's not sufficient, then consider introducing service boundaries at strategic points.
Please don't! The interaction between microservices are very different from the ones between objects inside a single application. Each call outside of the boundary is a potential failure mode that you should try to model as an asynchronous message that can be retried when delivery fails (the receiving microservice being down, slow or not reachable). Retrofitting microservices over an existing code base is a costly endeauvour and you should only embark on it if you have an adequate time and money budget, possibly bigger than the one necessary to build with microservices in the first place.

ShareThis