Saturday, July 04, 2009

Domain model is everything

What is a good domain model? The one that bridges together domain and infrastructure...
According to Wikipedia, Domain Model is a conceptual model of a system which describes the various entities involved in that system and their relationships. Note that is written capitalized, because it is recognized as a pattern.
Domain Model is typical complemented by infrastructure, the set of all the data and code needed for an application to work, which are not part of the model.
An important part of Domain Model concept is that it is a model: it does not take into account all the world variables but only the interesting ones for our application. If you need a home banking system, you don't need to persist in a relational database the eye color of the customers...

A practical approach
Producing a Domain Model is tipically achieved with classes (business object, a very overloaded term as Fowler says), while in the past a domain model was constructed using data structures and functions which processes them. OOP encapsulate data and methods that act on them in the same place: that's why it's so natural to produce object oriented code.
Not all the classes in an OO project are part of the model: the majority of them constitutes infrastructure. Collections classes, persistence mechanism for objects, file writers and readers, serialization systems, controllers, java servlet, iterators, view helpers, image manipulators - all part of infrastructure, the low level of an application.
The most important level is the model, that sits on the infrastructure (but it does not means that should be aware of all implementation detail of infrastructure classes) and provides the functionality needed to run the most of your user stories. It is the layer that unit tests absolutely have to cover. All the business logic should be encapsulated by the model.
Practically speaking, the User, Group and Article classes which you've seen in many posts and blogs as examples of code, are at the heart of a Domain Model. Controllers and views in a php framework (such as Zend Framework for php developers) are already present, as the job of a framework is to provide infrastructure: it's your model that differentiates the application from another.

Iterative work
Choosing the right model could lead to simple development: failing to do so would complicate subsequent actions. Because it's not possible to write code for a complete model from the start, a developer must be open to further model refinements. For instance, in Ossigeno there is a single script that rebuilds an environment (a particular combination of database and php files) from scratch updating the tables that persists the domain objects. Adding a field to an object means only writing a line of yaml and push the regenerate button.
Requirements gathering is not a definite phase: it's iterative. Whenever further requirements and modelling occurs, it's useful to have a simple way to refine the model code with a push-the-button script. The more you refine and regenerate, the more the Domain Model adapts to the real world; the more it adapts to the part of the world that interest us for our application, bridging the gap between code and domain.

A Linear Programming example
In math, the right model can do the difference with solving a problem or not being capable of solve it in a zillion of years.
Suppose you have to maximize a function of some variables (real numbers or integers): suppose also you have some constraints on the variables that cannot be violated; the solution to the problem should be a bunch of numbers that gives the top value for the objective function and does not infringe constraints.
Well, if you choose particular variables that let you write linear constraints, there's good news for you. You can apply Linear Programming and have the problem solved by a computer in polinomial time. If your model is not so well-crafted, and you have to resort to write constraints where two variables are multiplicated or divided, Linear Programming can't help you. And very bad things happen, such as finding a numeric solution taking weeks.
This example shows us that having a good Domain Model means not only drawing a correct painting of the business domain, but also use the right colors and lines to fit in the canvas. Programmatically speaking, the painting is our model while our infrastructure constitutes the canvas or panel where we draw.
A good model is one that can solve our problems, mapping the application entities into the code, while not overengineering the infrastructure. It can discosts from the real world, but the correct naming (we'll see in an example right now) can lighten the path...

Bridging the gap
We said that a good model stands in between the infrastructure and the business, so we can make an example of what is not a good model.
A bad model is too close to the infrastructure: for example if we model our users, groups and articles in a unique big, multidimensional array, the infrastructure used is very simple and is not subjected to particular stress.
A bad model is also too tight to the business domain: if we take a picture of the users, groups and articles world with static classes, deep inheritance trees and generated fields in the objects will make very hard to save and restore our application state.
However, the need for abstracting away the persistence concerns of objects from a model (for example in Domain Driven Design) have lead the developers to find a way to keep in touch to the domain. This is done by extending the infrastructure using ORMs. An Orm is a superset of the common infrastructure of a framework that cares about persistence of the objects involved in an application: it adds to the allowed infrastructure the typical object composition and inheritance features. Persistence ignorance is a wide argument and it is not treated here.

Sql relationships as an example
A common example of a model where the mapping to the real world is not immediate, and is sacrificed to performance and cleanliness, is the handling of many-to-many relationship in relational databases.
A relational database has been the state-of-the-art tool for persisting the state of an application, and today it's still the a widely used one. However, a relational database in its simplest form consists of tables, columns and rows, so it has no notion of how to handle a many-to-many relationship between our User objects and Group objects. If a user could belong to a single group, we would persist the relationship by a field group_id in the User table; but a User can belong to many groups, and conversely a particular group could have many number of users.
The standard solution is to create a new table, where there are only two columns: group_id and user_id, containing the numerical index of the Group and User rows. A User belonging to a particular group is described by an entry in this table, so we'll name it UserSubscription.
Our modelling has introduced another entity, that does not exists in the real world, but it is useful to mantain a homogeneous infrastructure. This is the best crafted model at this stage of development (not using ORMs).
This solution works particularly well in a rdbms context, so why not abstracting it away? This way we'll have a model withour UserSubscription that maps even better to the real world. This is the next step taken by the Orm family (Hibernate, Doctrine, ...), where you can define any type of relationship and let the Orm do the dirty job of persisting it in the database, with the aid of developer metadata, and providing a language similar to Sql to query the infrastructure layer and obtaining the stored User and Group objects, correctly composed.
This is stretching the gap between domain and infrastructure, letting your good model (that sits in the half) to go even near the domain side. That's why OOP is so successful, that's why Orms are a complex answer to a simple problem, but still so successful.

No comments:

ShareThis