Saturday, November 28, 2009

Saturday question: mixing Repository and Active Record

Saturday is becoming the 'questions day of the week': it is not the first time that, after a week of work, readers email me to carry on the discussion on design and testability, two topics I stress in my blog posts. :)
This week, Fedyashev wrote to me about mixing architectural patterns in a single application:
I really like these Active record and Repository patterns.
The drawback of Repository pattern is its cost (takes more time than
Active record). Benefit is higher abstraction which really helps on
complicated business logic.
The drawback of Active record is lower testability (db interaction
is required) and harder in handling complicated domain logic.
Is it acceptable to take the best of these two patterns to be used in
the same application?
I was thinking about using Active record for simple CRUDs and Repository
for complicated domain objects.
The idea behind this intention is to keep cost of code lower but still
have a good code.
What would you recommend?
There are cases in which Active Record is an acceptable pattern. Since its main drawback is low testability, the primary scenario for applying it is when there is nothing to test: some applications are data intensive and only need to move information back and forth from the database.
CRUD screens, as you suggest, often have little logic and can take advantage of active records. But we should evaluate case by case, since it is very easy for logic to leak into Active Record instances, and logic should be thoroughly tested.
For example, there is logic in validating entities upon insertion and editing: a classic case is searching for already existing nicks upon user registration. A Repository can perform this validation using external resources, since they can be injected at construction or passed as a method parameter, while an Active Record probably cannot (and testing the validation will be more complex).
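A minimal sketch of the registration check described above, assuming a hypothetical NickDirectory interface that stands for the external resource. None of these names come from a real library, and the repository here only keeps users in an array. Because the directory is injected at construction, a test can swap in an in-memory fake and never touch a database.

```php
<?php
// Hypothetical abstraction over whatever storage knows which nicks are taken.
interface NickDirectory
{
    public function isTaken(string $nick): bool;
}

// In-memory fake for tests; a real implementation would query the database.
final class InMemoryNickDirectory implements NickDirectory
{
    private $nicks;

    public function __construct(array $nicks)
    {
        $this->nicks = $nicks;
    }

    public function isTaken(string $nick): bool
    {
        return in_array($nick, $this->nicks, true);
    }
}

// Persistence-ignorant entity, kept deliberately tiny.
final class User
{
    public $nick;
}

final class UserRepository
{
    private $directory;
    private $users = [];

    // The external resource is injected, so tests can pass a fake.
    public function __construct(NickDirectory $directory)
    {
        $this->directory = $directory;
    }

    // Rejects the user when the nick is already registered.
    public function add(User $user): void
    {
        if ($this->directory->isTaken($user->nick)) {
            throw new DomainException("Nick '{$user->nick}' is already taken");
        }
        $this->users[] = $user;
    }

    public function count(): int
    {
        return count($this->users);
    }
}
```

The same check inside an Active Record would have to reach the database from within the record itself, which is what makes it harder to isolate in a test.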

Another problem I see in mixing these patterns is their different library requirements. Typically, we want repositories to aggregate an instance of a lower-layer framework that encapsulates SQL queries or whatever storage we are using (Hibernate or Doctrine 2), while Active Records are subclasses of a framework's abstract base classes (Zend_Db or Doctrine 1).
The paradoxical result is that implementing both patterns leads to using two different versions of Doctrine at the same time, which I do not recommend for reasons of maintenance and code clarity.
A solution would be to keep the implementations in two separate Bounded Contexts, which are different domain models that can communicate, for instance by using the same underlying relational database. However, Bounded Context is a DDD term and presupposes that you work with persistence-ignorant models in both contexts.

However, the real choice is not between Active Record and Repository but between Active Record and Data Mapper (persistence-ignorant domain model). It seems, for instance, that Doctrine 2 provides a default repository class you can tweak later, although its default methods only retrieve entities and do not insert them (I think insertion can be managed with events). It's not really difficult to change your approach from:
$user = new User();
$user->nick = 'John Doe';
$user->save(); // Active Record: the object persists itself
to:
$user = new User();
$user->nick = 'John Doe';
$em->persist($user); // Data Mapper: $em is the Doctrine 2 EntityManager
when what you gain is freedom from starting a MySQL daemon to test the User class, without using Repositories. Repositories may come into play later, when and where you want finer control over the bridge with the database.
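To illustrate the testability gain, here is a hedged sketch of a persistence-ignorant User. The rename() method and its trimming rule are invented for the example. The class extends no framework base class, so exercising it needs nothing but the PHP interpreter.

```php
<?php
// Plain object: no base class, no save(), no knowledge of the database.
final class User
{
    private $nick;

    // Hypothetical rule for the example: nicks are trimmed and cannot be empty.
    public function rename(string $nick): void
    {
        $nick = trim($nick);
        if ($nick === '') {
            throw new InvalidArgumentException('A nick cannot be empty');
        }
        $this->nick = $nick;
    }

    public function nick(): ?string
    {
        return $this->nick;
    }
}

// Runs entirely in memory, with no database daemon started.
$user = new User();
$user->rename('  John Doe ');
echo $user->nick(), PHP_EOL; // prints "John Doe"
```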


Fedyashev Nikita said...

Thanks, Giorgio!

So many new ideas :)

zampano said...

The problem is not Repository vs. AR/DM or any other persistence adapter, but the way it is seen and implemented in the DDD way.

As you surely know, in DDD a Repository is the in-memory abstraction of a collection of Aggregates and it resides in the domain itself. The implementation is infrastructure and may be whatever you want, even pure SQL.

What you mention in this post is not a problem at all, but it becomes one when you see a Repository only as something that retrieves ARs in a classic active record (or ORM) way, with the database-focused glasses on.

But when its main responsibility is retrieving ARs and ensuring that only *valid* ARs (fulfilling all invariants, being a whole in terms of consistency and in a valid state) are returned, you'd never have something like
new User().

Giorgio said...

An Active Record is by definition a one-to-one correspondence between objects and database rows; isn't that enough to violate the persistence abstraction required by DDD?

zampano said...

Not really - because it is not my (the Repository's) problem HOW and WHERE to store and retrieve the Aggregates I am working with.

You can work with Doctrine 1.x or any other Active Record framework with no problems. It does not look very nice but the domain does not (and should not) care.
It loads, reconstitutes and saves Aggregates.
It could even be the file system or key/value storage, it simply does not matter - or shouldn't.

Doctrine 1.x Repos may also just look like your example.
Just inject a genericRepo into your single Repos and work with a fake in-memory implementation; now you can test that stuff, too.
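One way to read this suggestion, as a sketch only: GenericRepository, InMemoryGenericRepository and the methods on UserRepository are hypothetical names, not Doctrine classes. A production implementation of the interface could wrap Doctrine 1.x records, while tests use the array-backed fake.

```php
<?php
// Hypothetical generic storage contract; an Active Record-backed class could implement it.
interface GenericRepository
{
    public function save(string $id, array $data): void;
    public function find(string $id): ?array;
}

// Fake implementation backed by a PHP array, for tests only.
final class InMemoryGenericRepository implements GenericRepository
{
    private $rows = [];

    public function save(string $id, array $data): void
    {
        $this->rows[$id] = $data;
    }

    public function find(string $id): ?array
    {
        return $this->rows[$id] ?? null;
    }
}

// The specific repository speaks the domain's language and delegates storage.
final class UserRepository
{
    private $storage;

    public function __construct(GenericRepository $storage)
    {
        $this->storage = $storage;
    }

    public function register(string $nick): void
    {
        $this->storage->save($nick, ['nick' => $nick]);
    }

    public function hasNick(string $nick): bool
    {
        return $this->storage->find($nick) !== null;
    }
}
```

Swapping the fake for a database-backed implementation changes only the object passed to the constructor; UserRepository itself stays the same.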

But there are almost never methods that change state directly, like $user->name(); instead there are domain commands like userAddressChange($street, $zip, $city) that capture a real domain event.
The single parts of the entity (or VO if you want) Address may reside in different sources/tables.
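A hedged sketch of what such a domain command might look like. Only the name userAddressChange() comes from the comment above; the Address value object, the validation rule and the recorded event name are assumptions made for illustration.

```php
<?php
// Value object: validated in the constructor and never modified afterwards.
final class Address
{
    private $street;
    private $zip;
    private $city;

    public function __construct(string $street, string $zip, string $city)
    {
        if ($street === '' || $zip === '' || $city === '') {
            throw new InvalidArgumentException('An address needs street, zip and city');
        }
        $this->street = $street;
        $this->zip = $zip;
        $this->city = $city;
    }

    public function __toString(): string
    {
        return "{$this->street}, {$this->zip} {$this->city}";
    }
}

final class User
{
    private $address;
    private $events = [];

    // Domain command: the only way to change the address, so it is always complete.
    public function userAddressChange(string $street, string $zip, string $city): void
    {
        $this->address = new Address($street, $zip, $city);
        $this->events[] = 'UserAddressChanged'; // hypothetical event name
    }

    public function address(): ?Address
    {
        return $this->address;
    }

    public function recordedEvents(): array
    {
        return $this->events;
    }
}
```

Where the street, zip and city are eventually stored is left to the repository implementation; the entity only guarantees that they change together.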

And honestly, Doctrine 2 is the other extreme: too many domain concerns are outsourced to a framework. That could hurt too.
No problem with a generic repository but it can become one with entities/VOs that play different roles in different Aggregates or BCs.

Anonymous said...

Amazing as always