Monday, December 14, 2009

How an Orm works

Some readers have been confused by the terms I often use when talking about Object-relational mappers, so I want to describe some concepts of Orms and give some definitions. In particular, I want to focus on how a real Orm works and how it lets you write classes that do not extend anything (Plain Old Php Objects or Plain Old <insert oo-language here> Objects).
The persistence-abstraction standard is the Java Persistence Api, which was largely modeled on Hibernate, and I will refer to it in this post. Doctrine 2 is the Orm that brings this specification's concepts to the php world, and it will be the reference implementation of these concepts in the explanation that follows.

The primary classification of Domain Model classes divides them into two categories: Entities, whose primary responsibility is to maintain the state of the application, and Services, whose responsibility is to perform operations that involve more than one Entity and to link to the outside of the domain model, breaking direct dependencies. This distinction leaves out Specifications, Value Objects, etc., which add richness to a model but are less crucial parts of it. Repositories and Factories are still particular kinds of Service.
I know that "primary responsibility" of a class sounds bad, since a class should have only one responsibility; however, there is a trade-off between responsibility and encapsulation, and an Entity class should certainly hide the workings of the many operations that involve only its private data.
Examples of Entity classes are User, Group, Post, Forum, Section, and so on. Typical Service class names are UserRepository, UserFactory, HttpManager, TwitterClient, MyMailer. You can often recognize entities by their serializability: an entity holds state that is worth saving, while serializing a service rarely makes sense.
Assuming you want to take advantage of an Orm's features, once you have defined your Entity classes it is up to you to define their mapping to relational tables in a format that the Orm understands: xml, yaml, ini files, or simple annotations. The Orm will use this information not only to move objects back and forth from the database, but also to create and maintain your schema, so the structure is described in one place and not duplicated.
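As a rough illustration of how a single mapping definition can also drive schema creation, here is a minimal sketch. The array format and the buildCreateTable() function are hypothetical names invented for this example; they are not Doctrine 2's mapping format or schema tool, which cover many more types and database platforms.

```php
<?php
// Hypothetical mapping format, invented for this sketch (not Doctrine's).
$cityMapping = array(
    'table'   => 'city',
    'columns' => array(
        'id'   => array('type' => 'integer', 'id' => true),
        'name' => array('type' => 'string'),
    ),
);

/**
 * Builds a CREATE TABLE statement from a mapping like the one above.
 * Only two column types are known to this toy translator.
 */
function buildCreateTable(array $mapping)
{
    $sqlTypes = array('integer' => 'INTEGER', 'string' => 'VARCHAR(255)');
    $definitions = array();
    foreach ($mapping['columns'] as $name => $column) {
        $definition = $name . ' ' . $sqlTypes[$column['type']];
        if (!empty($column['id'])) {
            $definition .= ' PRIMARY KEY';
        }
        $definitions[] = $definition;
    }
    return 'CREATE TABLE ' . $mapping['table']
         . ' (' . implode(', ', $definitions) . ')';
}

// prints: CREATE TABLE city (id INTEGER PRIMARY KEY, name VARCHAR(255))
echo buildCreateTable($cityMapping), "\n";
```

The same array could be read again when loading or storing objects, which is why the column names never have to be written twice.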
The mapping consists of metadata that describe which properties of an entity you want to store, and how. There are multiple ways to map objects to tables, and an Orm should not guess on its own how to fit them in a database.
Java annotations are a native language construct that the compiler checks, while in php they are only comments included in the docblock, due to the lack of native support. This also means that with Doctrine 2 there is no dependency from the Entity class file to the Orm source code.
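To make the "annotations are only comments" point concrete, here is a minimal sketch of how a tool could read a mapping annotation out of a docblock. The AnnotatedCity class and the readColumnType() function are hypothetical, and the regular expression is a naive stand-in for a real annotation parser such as the one Doctrine 2 ships.

```php
<?php
class AnnotatedCity
{
    /**
     * @Column(type="string")
     */
    private $_name;
}

/**
 * Returns the type declared in a @Column annotation, or null if absent.
 * A naive regular expression stands in for a real annotation parser.
 */
function readColumnType($className, $propertyName)
{
    $property = new ReflectionProperty($className, $propertyName);
    $docblock = $property->getDocComment();   // false when there is no docblock
    if ($docblock !== false
        && preg_match('/@Column\(type="(\w+)"\)/', $docblock, $matches)) {
        return $matches[1];
    }
    return null;
}

// prints: string
echo readColumnType('AnnotatedCity', '_name'), "\n";
```

AnnotatedCity loads and runs even when no Orm is installed, because the interpreter ignores the comment entirely.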
This is the simplest Entity I can think of, a City class, complete with mapping for Doctrine 2:
<?php
/**
 * Naked Php is a framework that implements the Naked Objects pattern.
 * @copyright Copyright (C) 2009  Giorgio Sironi
 * @license
 * This library is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Lesser General Public
 * License as published by the Free Software Foundation; either
 * version 2.1 of the License, or (at your option) any later version.
 * @category   Example
 * @package    Example_Model
 */

/**
 * @Entity
 */
class Example_Model_City
{
    /**
     * @Id @Column(type="integer")
     * @GeneratedValue(strategy="AUTO")
     */
    private $_id;

    /**
     * @Column(type="string")
     */
    private $_name;

    public function __construct($name)
    {
        // Assumed body: the listing lost it; storing the name is the likely intent.
        $this->_name = $name;
    }

    /**
     * @return string   the name
     */
    public function getName()
    {
        return $this->_name;
    }

    public function setName($name)
    {
        $this->_name = $name;
    }

    public function __toString()
    {
        return (string) $this->_name;
    }
}
Private properties are accessed via reflection.
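As a minimal sketch of what accessing private properties via reflection looks like in plain php, consider the following. ReflectedCity and accessibleProperty() are hypothetical names used only for this example; this is not Doctrine 2's internal code.

```php
<?php
class ReflectedCity
{
    private $_id;
    private $_name;

    public function __construct($name)
    {
        $this->_name = $name;
    }

    public function getName()
    {
        return $this->_name;
    }
}

/**
 * Returns a ReflectionProperty that can read and write a private field.
 */
function accessibleProperty($className, $propertyName)
{
    $property = new ReflectionProperty($className, $propertyName);
    if (PHP_VERSION_ID < 80100) {
        // Older php versions need this call; since 8.1 it is unnecessary.
        $property->setAccessible(true);
    }
    return $property;
}

$city = new ReflectedCity('Milan');

// Read private state, as an Orm would before writing a row.
$name = accessibleProperty('ReflectedCity', '_name')->getValue($city);

// Write private state, as an Orm would after an INSERT generated the key.
accessibleProperty('ReflectedCity', '_id')->setValue($city, 42);
$id = accessibleProperty('ReflectedCity', '_id')->getValue($city);

// prints: Milan 42
echo $name, ' ', $id, "\n";
```

The entity needs no setId() method and no public fields, so its encapsulation stays intact for the rest of the application.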

A JPA-compliant Orm presents a single point of access to its functionality: the Entity Manager, which is a Facade class. You should now understand the meaning of its name: it manages the Entities described above.
The Entity Manager object usually has two important collaborators, the Identity Map and the Unit Of Work, plus the generated proxy classes, which serve many purposes:
  • The Identity Map is - as the name suggests - a Map which maintains a reference to every object that has actually been reconstituted from the database, or that the Orm knows about in some other way (e.g. because it has been told explicitly to persist it).
  • Proxies, whose classes are generated on the fly, substitute a regular object in the graph with a subclass instance capable of lazy loading itself if and only if needed. The methods of the Entity class are overridden to execute the loading procedure before dispatching the call to the original versions.
  • The Unit Of Work calculates (or maintains) a diff between the object graph and the relational database; it commits everything at the end of a request, or session, or whenever the developer requires it.
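The first two collaborators can be sketched in a few lines of plain php. Everything here is hypothetical and hand-written for illustration: LazyCity, CityProxy and IdentityMap are invented names, the loader closure stands in for a SELECT, and the properties are protected only to keep the example short. A real Orm generates its proxy classes and populates private state through reflection.

```php
<?php
class LazyCity
{
    protected $_id;
    protected $_name;

    public function __construct($id, $name)
    {
        $this->_id = $id;
        $this->_name = $name;
    }

    public function getName()
    {
        return $this->_name;
    }
}

// Hand-written stand-in for a generated proxy class.
class CityProxy extends LazyCity
{
    private $_loader;
    private $_loaded = false;

    public function __construct($id, $loader)
    {
        $this->_id = $id;          // only the identifier is known up front
        $this->_loader = $loader;
    }

    public function getName()
    {
        $this->_load();            // load first, then delegate to the original
        return parent::getName();
    }

    private function _load()
    {
        if (!$this->_loaded) {
            $row = call_user_func($this->_loader, $this->_id);
            $this->_name = $row['name'];
            $this->_loaded = true;
        }
    }
}

class IdentityMap
{
    private $_objects = array();

    public function add($className, $id, $object)
    {
        $this->_objects[$className . '#' . $id] = $object;
    }

    public function get($className, $id)
    {
        $key = $className . '#' . $id;
        return isset($this->_objects[$key]) ? $this->_objects[$key] : null;
    }
}

$queries = 0;
$loader = function ($id) use (&$queries) {
    $queries++;                    // stands in for a SELECT by primary key
    return array('name' => 'Milan');
};

$map = new IdentityMap();
$map->add('LazyCity', 1, new CityProxy(1, $loader));

$first  = $map->get('LazyCity', 1);
$second = $map->get('LazyCity', 1);
$sameInstance  = ($first === $second);  // one object per identifier
$queriesBefore = $queries;              // nothing has been loaded yet
$name = $first->getName();              // triggers the single load
$second->getName();                     // already loaded, no new query
```

Client code that receives the proxy cannot tell it apart from a LazyCity, which is what lets the Orm defer the query until a method is actually called.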
The shift in the workflow is from the classic ActiveRecord::save() method to EntityManager::flush(). It is the developer's responsibility to maintain a correct object graph, but it is the Orm's to reflect the changes in the relational database. The power of this approach lies in letting you work on an object graph as if it were (almost) the only version of the Domain Model you know of.
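A toy Unit Of Work shows the difference from save(): the entity is only modified, and the diff is computed at flush time. This is a sketch under simplifying assumptions, with hypothetical names (TrackedCity, ToyUnitOfWork) and strings in place of real SQL; Doctrine 2's actual Unit Of Work also handles associations, deletions and statement ordering.

```php
<?php
class TrackedCity
{
    private $_name;

    public function __construct($name)
    {
        $this->_name = $name;
    }

    public function setName($name)
    {
        $this->_name = $name;
    }
}

class ToyUnitOfWork
{
    private $_managed = array();     // objects being tracked
    private $_snapshots = array();   // their state as of the last flush

    public function persist($object)
    {
        $key = spl_object_hash($object);
        $this->_managed[$key] = $object;
        $this->_snapshots[$key] = null;   // null marks "not yet in the database"
    }

    /**
     * Compares every managed object with its snapshot and returns
     * the pseudo-statements needed to bring the database up to date.
     */
    public function flush()
    {
        $statements = array();
        foreach ($this->_managed as $key => $object) {
            $state = $this->_extractState($object);
            if ($this->_snapshots[$key] === null) {
                $statements[] = 'INSERT ' . json_encode($state);
            } elseif ($this->_snapshots[$key] !== $state) {
                $changed = array_diff_assoc($state, $this->_snapshots[$key]);
                $statements[] = 'UPDATE ' . json_encode($changed);
            }
            $this->_snapshots[$key] = $state;
        }
        return $statements;
    }

    private function _extractState($object)
    {
        $state = array();
        $class = new ReflectionClass($object);
        foreach ($class->getProperties() as $property) {
            if (PHP_VERSION_ID < 80100) {
                $property->setAccessible(true);   // needed before php 8.1
            }
            $state[$property->getName()] = $property->getValue($object);
        }
        return $state;
    }
}

$unitOfWork = new ToyUnitOfWork();
$city = new TrackedCity('Milan');
$unitOfWork->persist($city);

$first = $unitOfWork->flush();    // one INSERT for the new object
$city->setName('Milano');         // no save() call on the entity
$second = $unitOfWork->flush();   // one UPDATE with only the changed field
$third = $unitOfWork->flush();    // nothing changed, nothing to do

print_r($first);
print_r($second);
print_r($third);
```

Snapshot comparison is only one way to find the changes; the "or maintains" alternative would have the entities notify the Unit Of Work whenever they are modified.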


Fedyashev Nikita said...

Thanks for mentioning of POJO term in PHP stack. I want to ask you about that.

Can reading a book like "POJO in Action" benefit me, as PHP programmer?

There isn't a lot of information in this field for PHP environment. So maybe it can be a good start..

What do you think, Giorgio?

Giorgio said...

You have to make sure that you have the tools to work with POxO: Doctrine would substitute Hibernate in this fashion. Some parts of Pojo in Action are very technology specific as you would expect from a Java book. You can benefit from the Domain Model development strategies described there (but there is DDD which is the masterpiece).

Alan said...

Great article.

It is worth to mention that object graphs, you are talking about, are often referred to as aggregates

Fedyashev Nikita said...

>>You have to make sure that you have the tools to work with POxO

I'm waiting for Doctrine2 beta version release :)

OK, thanks a lot, I will concentrate on Java-agnostic, DDD parts :)

Giorgio said...

An object graph is the more general term: an object graph composed only of entities and with a single root is an Aggregate (a DDD term), while aggregation is a typical Uml relation. Services which compose each other also form an object graph; of course, you want to store in the database only graphs composed of entities.

Anonymous said...

Thanks, that explanation helps a lot. I think that once you graduate, you'll be surprised how much better you understand this stuff than many professionals who have been programming for a decade (like myself) - especially in the PHP field.