Wednesday, September 30, 2009

Practical testing in php part 7: stubs

This is the seventh part in the php testing series. You may want to check out the other parts or to subscribe to the feed to be updated when new posts are available.

The name unit testing conveys the most distinctive and powerful characteristic of this type of testing: the granularity of verification is brought down to the class, the smallest cohesive unit we can find in an object-oriented application.
Despite these good intentions, tests often cross the borders of the single class, because the system under test consists of more than one object: the main one, whose methods are called (to verify the returned results with assertions), and its collaborators.
This is expected, as no object is an island: there are very few classes which work alone. While we can skip the middle man on the upper side, by calling methods directly on the system under test, there is no trivial way to wire the internal method calls to predefined results.
However, it's very important to find a way to isolate a class from its collaborators: while a small percentage of functional tests (which exercise an object graph instead of a single unit) is acceptable and even required in a test suite, the main concern of a test-infected developer should be to have the most fine-grained test suite around:
  • tests are very coupled to the production code, in particular to the classes they use directly. It's a good idea to limit this coupling to the unit under test as much as possible;
  • the possible interactions with a single object are few, but integrating collaborators makes the number of situations that can happen grow very quickly;
  • the collaborators could be very slow (relying on a database or querying a web service) or not even worth testing, because they are not our responsibility;
  • finally, if a collaborator breaks, not only will its own tests fail, but also the tests of the classes which use it. The defect could then be hard to locate.
In short, every collaborator introduced in the test arrangement is a step towards integration testing rather than unit testing. However, don't take this advice as a prohibition: many classes (like Entities/newables, but also infrastructure ones like an ArrayIterator) are too much of a hassle to substitute with the techniques described later in this post. You should definitely instantiate User and Group in the tests of service classes which act on them.

The verb substitute is appropriate: the only way to keep collaborators out of a unit test is to replace them with other objects. These objects are commonly known as Test Doubles, and they are subclasses or alternative implementations, respectively, of the class or interface required by the SUT. This property allows them to be considered instanceof the original collaborator class/interface, and to be kept as a field reference or passed as a method parameter, which are the only ways I can think of to recognize a collaborator.
Dependency Injection plays a key role in many unit tests: in the case of a field reference kept in a class, there is the need to replace this reference with the Test Double one, to hijack the method calls directed at the collaborator itself. Since field references are usually private, it is difficult to access them (reflection would be required), and violating a class's encapsulation to simplify testing does not seem a good idea.
Thus, the natural way to provide Test Doubles for collaborators is Dependency Injection, whether it is implemented via the constructor or via setters. Instead of instantiating the production class, simply produce an object of a custom-made subclass and use it during the construction of the SUT. My SUTs usually have optional constructor parameters, to allow passing in a null when a collaborator is not needed at all.
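As an illustration, here is a minimal sketch of such a constructor (UserRepository and DatabaseConnection are invented names):
class UserRepository
{
    private $_connection;

    public function __construct(DatabaseConnection $connection = null)
    {
        // a Factory passes a real connection in production code;
        // a test passes a Test Double subclass, or null when
        // the collaborator is not exercised at all
        $this->_connection = $connection;
    }
}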

While entering the Test Doubles world, the average programmer hears many terms which describe substitutes of an object's collaborators. In increasing order of complexity, they are:
  • Dummies: objects which do not provide any interaction, but are placeholders used to satisfy compile-time dependencies and to avoid breaking null checks: no method is called on them, but they are likely to be required parameters of a SUT method, or part of its constructor signature. To avoid the creation of Dummies, I prefer to keep all constructor parameters optional, trusting the Factory which creates the object to pass regular collaborators instead of null values.
  • Stubs: objects whose methods have been overridden to return canned results. They may have more than one precalculated result available, depending on the combination of method parameters, but these are known values once you reach the act part of the test method.
  • Mock objects: also known as Test Spies, mock objects are used in a testing style (behavior based testing) different from the state based one we have practiced since the first part of this series. They will be the main topic of the next episode.
  • Fakes: objects which have a working implementation, but one much simpler than the real collaborator's. An in-memory ArrayObject which substitutes a database result Iterator is an example of a Fake.
You generally don't need to write a Dummy object, since there is no interaction with it: the real collaborator can be used instead, if its constructor is thin. A Fake is a running implementation, so if an already existing class cannot do the job, there's usually no other choice than to write a real subclass and reuse it in all the tests that require the collaborator.
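As a sketch of a Fake, an in-memory iterator can stand in for a database result set; ReportGenerator is an invented class which only needs something traversable:
// the real collaborator would lazily fetch rows from a database;
// the Fake provides the same Iterator contract from plain memory
$fakeResultSet = new ArrayIterator(array(
    array('id' => 1, 'name' => 'Alice'),
    array('id' => 2, 'name' => 'Bob')
));
$report = new ReportGenerator();
$report->process($fakeResultSet);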
For Stubs and Mocks the situation is different: there are plenty of frameworks, for nearly every language, which help in generating them, taking care of evaluating the subclass code and instantiating an object. Phpunit incorporates a small mocking framework, accessible through the getMock() method of the test case class. Remember that, while the method is named getMock(), both Stubs and Mocks can be created via this Api. In this part we'll focus again on state verification, and we'll use a Stub to improve the granularity of a test.

We are going to give a meaningful example of unit testing with a Stub. In this example, a GeolocationService takes a User object and fills its latitude and longitude fields using the location specified. GeolocationService requires an instance of a fictional GoogleMaps class to work and, since we all love Dependency Injection, it is passed to its constructor.
Note that GoogleMaps could also be an interface or a base abstract class: there is no technical difference. Moreover, if it were a less important collaborator, it could even be passed as a locate() parameter.
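The full, running example is linked at the end of this section; as a reference, a minimal sketch of the two production classes (my reconstruction, which may differ from the linked code) could be:
class GoogleMaps
{
    public function getLatitudeAndLongitude($location)
    {
        // the real implementation would perform an http request
        // to the external web service here
    }
}

class GeolocationService
{
    private $_maps;

    public function __construct(GoogleMaps $maps)
    {
        $this->_maps = $maps;
    }

    public function locate(User $user)
    {
        $coordinates = $this->_maps->getLatitudeAndLongitude($user->location);
        $user->latitude = $coordinates['latitude'];
        $user->longitude = $coordinates['longitude'];
    }
}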
This is the test case:
class GeolocationServiceTest extends PHPUnit_Framework_TestCase
{
    public function testProvidesLatitudeAndLongitudeForAnUser()
    {
        $coordinates = array('latitude' => '42N', 'longitude' => '12E');
        $googleMapsMock = $this->getMock('GoogleMaps', array('getLatitudeAndLongitude'));
        $googleMapsMock->expects($this->any())
                       ->method('getLatitudeAndLongitude')
                       ->will($this->returnValue($coordinates));
        $service = new GeolocationService($googleMapsMock);
        $user = new User;
        $user->location = 'Rome';
        $service->locate($user);
        $this->assertEquals('42N', $user->latitude);
        $this->assertEquals('12E', $user->longitude);
    }
}
and here is the full, running example.
Note that I have created a Stub only for the external service, and not for the User class. The former is external, slow, and unpredictable; the latter is simple, with little or no internal behavior, and there's little chance it will break. Nothing behaves like a String more than a String, as Misko says. The test method now focuses on exercising the locate() method, and not also the GoogleMaps class.

Finally, let's take a look at the getMock() Api:
object getMock($originalClassName, [array $methods, [array $arguments, 
[string $mockClassName, [boolean $callOriginalConstructor, 
[boolean $callOriginalClone, [boolean $callAutoload]]]]]])
$originalClassName is the name of the class you want to create a Stub/Mock of. $methods is a numerically indexed array containing the names of the methods you want to override; if left empty, every method will be substituted. $arguments are the arguments to pass to the constructor of the original class (almost never used), while $mockClassName is a custom name for the generated subclass.
The last three arguments are boolean values which determine whether to keep the original constructor and clone method, and whether to allow autoloading of $originalClassName. They all default to true. Often you will want to set $callOriginalConstructor to false when the constructor signature requires other collaborators to be passed in. All arguments but the first one are optional.
The produced mock, along with the original class methods, also makes an expects() method available. For now, simply calling it with the argument $this->any() will do the job. This method returns a PHPUnit_Framework_MockObject_Builder_InvocationMocker instance: in short, an internal object on which you can call method() and will() to choose the method name to replace and its predefined behavior. The simplest possible behavior is $this->returnValue(...), but $this->returnArgument($argumentNumber) is also available, along with $this->returnCallback($callbackName); refer to the phpunit documentation for supplemental information.
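For instance, this sketch stubs the same GoogleMaps method so that it echoes back its first argument:
$stub = $this->getMock('GoogleMaps', array('getLatitudeAndLongitude'));
$stub->expects($this->any())
     ->method('getLatitudeAndLongitude')
     ->will($this->returnArgument(0)); // returns the $location passed in
Substituting $this->returnCallback('someFunction') would instead delegate the computation of the result to the named php callback (someFunction being any callback of yours).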

I hope this introduction to Stub objects has helped you grasp the essence of Unit testing in php. Feel free to ask for clarifications in the comments. In the next part of this series, we will explore the possibilities of Mock objects and of behavior based testing.

You may want to subscribe to the feed to remain updated on new posts in this series.

Tuesday, September 29, 2009

Practical testing in php part 6: refactoring and patterns


This is the sixth part of the php testing series. You may want to check out the other parts or to subscribe to the feed to remain updated on new posts.

In the xUnit world, tests are code. While there are testing tools which treat tests as data, phpunit and its companions recognize classes and objects: this means that tests are first class citizens, and there should be no difference in importance between production and test code.

Why is it important to refactor production code? To improve maintainability and to ensure that changes which break the system appear less often. The same can be said of tests: a suite that embraces change and is maintainable will make developers actually use it, from the start through the long run. While the focus is usually on production code refactoring, today we will talk about test refactoring and the patterns you should head towards.
The worst thing that can happen is having an outdated test suite which fails because it is not maintained along with the production code: it will quickly lose credibility, thus it will be run sparingly, and then forgotten.
One of the best methodologies to improve production code maintainability is to test it: the easier a class is to test, the more decoupled and maintainable it becomes. You will often find yourself refactoring a production class to simplify testing, for instance making Demeter happy, resulting in a simpler design for the application.
Following our duality of production and test code, sometimes test methods and test cases grow and present a lot of repetition. What can be done to avoid these problems and maintain an agile (with the lowercase a) test suite is to refactor test code towards some patterns, some of which you have already started to grasp in the previous parts of this series. Test code usually has a low complexity compared to production code: it runs in an isolated environment, with nearly no state to maintain, with very decoupled classes (the test cases) and the help of a framework. Thus, it's tempting to use a lot of copy&paste to write tests, but knowing a bunch of patterns can flatten even this little complexity and help you avoid duplication. Like all patterns, these have been catalogued and given standard names.
  • Standard Fixture and Fresh Fixture reuse the code which builds the fixtures for the tests (and not the fixture itself). These patterns can be implemented with phpunit's setUp() method.
  • Shared Fixture reuses the same object graph for a set of tests: obviously it should have no state, or a way to reach a particular state for testing purposes. This pattern can be implemented with phpunit's setUpBeforeClass() method.
  • Four Phase Test is the classical motif of a test method: arrange, act, assert, and the optional teardown.
  • Test Runner and Test Suite Object are patterns which phpunit implements for you. You can then specify metadata to alter the building of a test suite or its execution options, or specific annotations which the runner supports.
  • State Verification is the simplest way of using phpunit and it's what we have done until now, writing assertions on explicit results of the system under test. Behavior Verification is based on making assertions on indirect results, like method calls on collaborators, and will be treated in the next parts of this series; Mock and Stub are patterns used in Behavior Verification, and phpunit provides support for their dynamic creation.
  • Table Truncation Teardown and Transaction Rollback Teardown are standard patterns for testing components which interact with a database.
  • Literal, Derived and Generated Value are patterns to provide fake data to the system under test. They all have their place in unit testing, depending on the unit's purpose.
If you are interested in learning more about patterns you should check out the book xUnit Test Patterns: Refactoring Test Code and its website, which is a very complete guide to probably every single testing construct that has been explored in the xUnit frameworks. On the website you can find description and usage examples of all the patterns described here and of other specific ones.
Moreover, remember that test code is still code, and basic refactorings like Extract Method, Extract Superclass, Introduce Explaining Variable, etc. are valid in the testing land too. Simply refactoring some boilerplate code into private methods of a test case can save you the boring job of updating duplicated blocks.
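For example, a repeated arrange phase can be extracted into a private creation method (the class and methods here are invented for illustration):
class UserServiceTest extends PHPUnit_Framework_TestCase
{
    public function testPromotesAnUser()
    {
        $user = $this->_createConfirmedUser('giorgio');
        // act, assert...
    }

    public function testDemotesAnUser()
    {
        $user = $this->_createConfirmedUser('isaac');
        // act, assert...
    }

    private function _createConfirmedUser($name)
    {
        // Extract Method applied to test code: the duplicated
        // arrange code now lives in one place only
        $user = new User;
        $user->name = $name;
        $user->confirmed = true;
        return $user;
    }
}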

As a side note, remember that when refactoring production code you have the safety net of the test suite, which will tell you when you have just broken something. No one tests the tests, however, so you may want to temporarily break the behavior under test before refactoring a test method or a test case: simply altering the return statements of the production methods can make the test fail, so that you can check it continues to fail after the refactoring. When writing the original test, the TDD methodology crafts the method even before the production code exists, and this is one of the main reasons why such a test is solid; a test is valid if it's able to find problems in the production code: that is, if it fails when it should fail.

I hope this series is becoming interesting, as now you have learned that your tests have the same importance as the production code. They can even be more important: if I had to choose between throwing away the production code along with its documentation, and losing a good test suite, I would definitely throw away the first. It can be rewritten using the tests, while writing a complete test suite for an application without any tests is a harder task.
In the next parts, we'll enter the world of Test Doubles and of behavior verification, taking advantage of Mocks, Stubs, Fakes and similar patterns.

You may want to subscribe to the feed to remain updated about new articles in this series.

Monday, September 28, 2009

Practical testing in php part 5: annotations

This is the fifth part of the php testing series. You may want to check out the other parts or to subscribe to the feed to be informed of new articles.

Now that we have learned much about writing tests (with or without fixtures) and using assertions, we can improve our tests further by exploiting phpunit features. This awesome testing tool provides support for several annotations, which can add behavior to your test case class without making you write boilerplate code. Annotations are a standard way to add metadata to code entities, such as classes or methods, and are composed of a @ tag followed by zero, one or more arguments. While the parsing implementation is left to the tool which uses them, their appearance is consistent: phpDocumentor also collects @param and @return annotations to describe an Api.
Remember that annotations must be placed in docblock comments, as the php engine has no native support for them: phpunit extracts them from the comment block using reflection.

While writing an Api, or even a simple class, corner cases and incorrect inputs have to be taken into consideration. The standard way to manage errors and bad situations in an oop application is to use exceptions. But how do we test that a method raises an exception when needed? Of course the normal behavior is tested with real data that returns a real result. For the exceptional behavior, we can start with this test method:
    public function testMethodRaisesException()
    {
        try {
            $iterator = new ArrayIterator(42);
            $this->fail();
        } catch (InvalidArgumentException $e) {
        }
    }
The purpose of this code is to raise an exception by passing invalid data to the constructor of ArrayIterator, which requires an array. If the exception is raised accordingly, execution jumps past the fail() call and the exception is caught correctly, making the test pass. If the exception is not thrown, the call to fail() declares the test failed.
However, this paradigm would be repeated every time you need to test an exception, so it can be abstracted away. Also, this code does not convey the intent of testing an exception, since it is cluttered with details like an empty catch block and calls to fail().
Phpunit already abstracts away this code providing an annotation, @expectedException, which has to be put in the method docblock:
    /**
     * @expectedException InvalidArgumentException
     */
    public function testMethodRaisesExceptionAgain()
    {
        $iterator = new ArrayIterator(42);
    }
This code is much clearer than the construct we used earlier. The only code present in the method is the one required to throw the exception, while the intent is described in the method name and in its annotations.

Another common repetition is testing a method with different kinds of input, while executing always the same code. This is commonly resolved with a loop:
    public function testBooleanEvaluationInALoop()
    {
        $values = array(1, '1', 'on', true);
        foreach ($values as $value) {
            $actual = (bool) $value;
            $this->assertTrue($actual);
        }
    }
But phpunit can do the loop for you, taking advantage of the @dataProvider annotation:
    public static function trueValues()
    {
        return array(
            array(1),
            array('1'),
            array('on'),
            array(true)
        );
    }
    /**
     * @dataProvider trueValues
     */
    public function testBooleanEvaluation($value)
    {
        $actual = (bool) $value;
        $this->assertTrue($actual);
    }
This annotation should be followed by the name of a static method of the test case which returns an array of data sets to be passed to the test method. Phpunit will iterate over this array, using each of its elements (which is itself an array containing the arguments) to run the test method, and telling you which data set was in use in case of a test failure. Of course you can put anything in the data sets: input for the SUT, or the expected result, or both.
The code becomes a bit longer, but the expressivity of defining the concept of different data sets in a standard way is worth considering.
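For instance, a data provider can carry both the input and the expected result, so that the same test method verifies different computations; this sketch exercises the native strtoupper() function:
    public static function stringsAndUppercasedVersions()
    {
        return array(
            array('foo', 'FOO'),
            array('Bar', 'BAR'),
            array('123', '123')
        );
    }

    /**
     * @dataProvider stringsAndUppercasedVersions
     */
    public function testUppercasesStrings($input, $expected)
    {
        $this->assertEquals($expected, strtoupper($input));
    }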

The last common situation we will look at today is test dependency. Again, we are talking about dependency within the same test case, since interdependencies between unit tests are a smell of high coupling and should raise suspicions about your class design.
It often happens that some test methods are more specific than the first ones you wrote, and they will obviously fail if the former do. The classic example is the add()/remove() pair of tests on a container: to make sure remove() works, you have to use add() in the arrange part of the test method. Phpunit solves this common problem of logical and temporal precedence with @depends (I won't present a workaround as in the other cases, since it was not possible to solve this issue before phpunit 3.4 introduced this annotation):
    public function testArrayAdditionWorks()
    {
        $array = array();
        $array[0] = 'foo';
        $this->assertTrue(isset($array[0]));
        return $array;
    }

    /**
     * @depends testArrayAdditionWorks
     */
    public function testArrayRemovalWorks($fixture)
    {
        unset($fixture[0]);
        $this->assertFalse(isset($fixture[0]));
    }
Not only is testArrayAdditionWorks() executed before testArrayRemovalWorks(), but since it returns something, this result is passed as an argument to the dependent method. If the former test fails, however, the dependent ones are marked as skipped, since they would fail anyway by definition. They would also clutter the output, while it is clear that the functionality that needs repair is the array addition.

I hope these standard phpunit annotations can help you enjoy writing tests for your php classes, leaving you the exciting work and taking away the boring one. In the next part, we'll look at refactoring of test code, before taking a journey with stubs and mocks.

I have uploaded these code examples to pastebin as a running phpunit test case. You may also want to subscribe to the feed to be informed of new posts in this series.

Saturday, September 26, 2009

Practical testing in php part 4: fixtures

This is the fourth part in the php testing series. You may want to subscribe to the feed to be informed when new posts are available.

In the previous parts, we have explored how to install phpunit and how to write tests which exercise our production code. We have also learned to use the assertion methods to check the actual results: now we are ready to improve the test code from a refactoring point of view, and to take advantage of phpunit features.
While writing more and more test methods, you will notice that you follow a common pattern, commonly known as arrange-act-assert; this is the main motif of state based testing.
The first operation in a test is usually to set up the system under test, whether it is a single object or a complex object graph; then the methods of the object are called during the act part, and some assertions (hopefully not more than one) are made on the results returned from these calls. In some cases, when you have allocated external resources like a fake database, a final cleanup phase is needed.
What you will actually discover is that often part of the arrange phase and the final cleanup code are shared between test methods: for example, when you are testing a single class, the instantiation of an object is a simple operation you can extract from the test methods. To support this extraction, phpunit (like all xUnit frameworks) provides the setUp() and tearDown() template methods.
These methods are executed respectively before and after every test method: default implementations with an empty body are provided in PHPUnit_Framework_TestCase. You can override these empty methods when useful, to share arrange/cleanup code between tests in the same test case and to prepare a known state before every run. This known state is called a fixture.
Your test case class can go from this:
<?php
class ArrayIteratorTest extends PHPUnit_Framework_TestCase
{
    public function testSomething()
    {
        $iterator = new ArrayIterator(array('a', 'b', 'c'));
        // act, assert...
    }

    public function testOtherFeature()
    {
        $iterator = new ArrayIterator(array('a', 'b', 'c'));
        // act, assert...
    }
}
to this:
<?php
class ArrayIteratorwithFixtureTest extends PHPUnit_Framework_TestCase
{
    private $_iterator;

    public function setUp()
    {
        $this->_iterator = new ArrayIterator(array('a', 'b', 'c'));
    }

    public function testSomething()
    {
        // act on $this->_iterator, assert...
    }

    public function testOtherFeature()
    {
        // act on $this->_iterator, assert...
    }
}
Observe that, since an object of this class will be created to run the tests, you can keep any variable you want as a private member, and then have a reference to it available in the test methods. setUp() usage provides a cleaner and DRY solution, and saves many lines of code when many test methods are needed.

Here is some know-how on using fixtures:
  • usually the tearDown() method does not need to be provided, since the fixture is an object graph which will be garbage-collected after all the tests are executed, or overwritten by the next setUp() call. Thus, the empty body provided by default is often enough; when external resources are allocated, though, tearDown() is the right place to release them (see the sketch after this list).
  • the fixture methods are executed for every test, so all the test methods have the same state as a starting point. When more than one fixture is needed, the common practice is to break down the test case, preparing more than one test case class for the same system under test; these classes represent different scenarios and together constitute the overall test suite for this system.
  • sharing a fixture between test cases can be a smell of bad design, since the classes are not insulated enough and know too much of each other. This cannot be done with setUp() methods anyway, but there is suite-level setup available in phpunit if you must share a fixture. However, keep in mind that you probably can refactor your classes to improve the maintainability of the application and of its test suite.
  • setUpBeforeClass() and tearDownAfterClass() are two hooks (static methods) which are executed before the first test method of a test case runs and after the last one has finished. They are the equivalent of setUp() and tearDown(), but at the test case level instead of the test method one.
  • finally, assertPreConditions() and assertPostConditions() are two methods executed before and after each test method. They differ from setUp() and tearDown() since they are executed only if the test has not already failed, and they can halt the execution with a failing assertion. setUp() and tearDown() should never throw exceptions and are executed in any case, regardless of the current test outcome.
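As a sketch of the external resource case cited above, a test case which writes a temporary file can release it in tearDown() (the file name is invented):
class LoggerTest extends PHPUnit_Framework_TestCase
{
    private $_file = '/tmp/logger_test.log';

    public function setUp()
    {
        touch($this->_file);
    }

    public function tearDown()
    {
        // the file is not a php object: garbage collection will not
        // remove it, so the cleanup must be explicit
        unlink($this->_file);
    }

    // test methods which exercise a logger writing to $this->_file...
}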
This is all you need to know about test fixtures to start experimenting with them. I hope your test code will be much better written after introducing setUp().
In the next part, we'll explore the annotations that can influence phpunit test runner, like @depends and @dataProvider.

You may want to subscribe to the feed to be informed when new articles in this testing series will be available.

Friday, September 25, 2009

Practical testing in php part 3: assertions

This is the third part in the php testing series. You may want to check out the previous parts or to subscribe to the feed to be notified of new posts.

Assertions are declarations that must hold true for a test to be considered successful: a test passes when it executes no assertions, or when all the ones called are verified correctly. Assertions are the final goal of a test, the place where you compare the expected, precalculated values of your variables with the ones that come from the system under test.
Assertions are implemented as methods, and you have to make sure they are actually executed: thus, an if() construct inside a test is considered an anti-pattern, as test methods should follow only one possible execution path, where they find the assertions defined by the programmer.
There is also an assert() construct in php, used to enable checks on variables in production code. The assertions used in tests are a little different, as they are real code (and not code passed in a string) and they do not clutter the production code, but constitute a valuable part of the test cases.
In phpunit there are many convenience methods which help to write expressive code and to make different kinds of assertions. These methods are available as public methods of the test case class which is extended by default.

The first assertion which fails causes an exception to be raised and captured by the phpunit runner. This means that if you are using one assertion per test you are safe, but if you are writing test methods which contain multiple assertions, beware that the first failure will prevent the subsequent assertions from being executed. Only the assert*() calls which strictly depend on the previous ones to make sense should be placed in the same method.
Here is a list of the most common assertions available in phpunit: since the documentation covers these features very well, I'm not going to go into the details. The most important and widely used methods are highlighted in bold.
  • assertTrue($argument) takes a boolean as a mandatory parameter and makes the test fail if $argument is not true. You should pass to it the result of a method which returns booleans, such as a comparison operator result. assertFalse($argument) presents the inverse behavior, failing if $argument is different from false.
  • assertEquals($expected, $actual) takes two arguments and compares them with the == operator, declaring the test failed if they do not match. The canned result should be put in the $expected argument, and the result obtained from the system under test in the $actual one: they will be shown in this order if the test fails, along with a comparison of the arguments' dumps when applicable. assertNotEquals() is this method's opposite.
  • assertSame($expected, $actual) does the same job as assertEquals(), but compares the arguments with the === operator, which also checks the equality of the variable types along with their values.
  • assertContains($needle, $haystack) searches for $needle in $haystack, which can be an array or an Iterator implementation. assertNotContains() can also be very handy.
  • assertArrayHasKey($key, $array) checks whether $key is present in $array. It can be used for both numeric and associative arrays.
  • assertContainsOnly($type, $haystack) fails if $haystack contains elements whose type differs from $type. $type is one of the possible results of gettype().
  • assertType($type, $variable) fails if $variable is not of type $type. $type is specified as in assertContainsOnly(), or with the PHPUnit type constants.
  • assertNotNull($variable) fails if $variable is the special value null.
  • assertLessThan(), assertGreaterThan(), assertGreaterThanOrEqual(), assertLessThanOrEqual() perform verifications on numbers, and their names are probably self explanatory. They all take two arguments.
  • assertStringStartsWith($prefix, $string) and assertStringEndsWith($suffix, $string) are also self explanatory, and section a string for you, avoiding the need for substr() magic in a test.
Remember that you can still make up nearly any assertion by calling a verification method and passing the result to assertTrue(). Moreover, nearly every one of these methods supports a supplemental string parameter named $message, which will be shown in the case of a failing test caused by that assertion; if you're making up a complex method for a custom assertion, you may want to provide $message to assertTrue() to give information in case the production code regresses. Obviously, custom assertion methods should be tested too.
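For instance, a custom assertion can be built upon assertTrue(); assertIsEven() here is an invented helper, not a phpunit method:
public function testSumOfTwoOddNumbersIsEven()
{
    $this->assertIsEven(3 + 5);
}

private function assertIsEven($number)
{
    // the $message argument explains the failure to the future reader
    $this->assertTrue($number % 2 == 0, "$number is not an even number.");
}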
If you want to see more code, I have put up some examples of the assertion methods' usage on pastebin.

I think you will soon start to use the most expressive assertion for what you are testing: test methods should be short and easily understandable, and assertion methods which abstract away the verification burden are very beneficial. In the next parts, we'll dig into ways to reuse test code, and into the annotations which phpunit recognizes to drive test execution, such as @dataProvider and @depends.

You may want to subscribe to the feed to be informed when new posts in this series are published.

Thursday, September 24, 2009

Practical testing in php part 2: write clever tests

This is the second part of the php testing series. You may want to subscribe to the feed to check out the previous parts and to not miss the next ones.

In the previous part, we discovered the syntax and the infrastructure needed to run a test with phpunit. Now we are going to show a practical example, using a couple of test case/production code classes.
What we are going to test is the Spl class ArrayIterator; for the readers who do not know this class, it is a simple Iterator implementation which abstracts away a foreach on the elements of an array.
Of course it would be very useful to write the tests before the production code class, but this is not the time to talk about TDD and its advantages: let's simply write a few tests to ensure the implementation works as we expect. This is also a common way to study the components of an object-oriented system: reading and understanding its unit tests, and writing more of them to verify that our expectations about the production classes' behavior are fulfilled.
Let's start with the simplest test case: an empty array.
class ArrayIteratorTest extends PHPUnit_Framework_TestCase
{
    public function testEmptyArrayIsNotIteratedOver()
    {
        $iterator = new ArrayIterator(array());
        foreach ($iterator as $element) {
            $this->fail();
        }
    }
}
The test case class is named ArrayIteratorTest, following the convention of a 1:1 mapping from production classes to test ones. The test method simply creates a new instance of the system under test, setting up the situation to have it iterate over the empty array. If the execution path enters the foreach, the test fails, as the call to fail() is equivalent to assertTrue(false).
The next step is to cover other possible situations:
public function testIteratesOverOneElementArrays()
    {
        $iterator = new ArrayIterator(array('foo'));
        $i = 0;
        foreach ($iterator as $element) {
            $this->assertEquals('foo', $element);
            $i++;
        }
        $this->assertEquals(1, $i);
    }
This test ensures that one-element numeric arrays are iterated over correctly. The first assertion states that every element which is passed as the foreach argument is the element in the array, while the second states that the foreach is executed only one time. You have probably guessed that assertEquals() compares its two arguments with the == operator and fails if the result is false.
When it is not too computationally expensive, we should strive to have the fewest possible assertions per method; so we can separate the test method testIteratesOverOneElementArrays() into two distinct ones:
public function testIteratesOverOneElementArraysUsingValues()
    {
        $iterator = new ArrayIterator(array('foo'));
        foreach ($iterator as $element) {
            $this->assertEquals('foo', $element);
        }
    }
    
    public function testIteratesOneTimeOverOneElementArrays()
    {
        $iterator = new ArrayIterator(array('foo'));
        $i = 0;
        foreach ($iterator as $element) {
            $i++;
        }
        $this->assertEquals(1, $i);
    }
Now the two test methods are nearly independent and can fail independently, providing information on two different broken behaviors: not using the array values, and iterating more than one time over an element. This is a very simple case, but try to think of this example as a methodology to identify the responsibilities of a production class: the test names should describe what features the class provides, at a good level of specification (and they are really used for this purpose in Agile documentation). This is what we are doing by adopting descriptive test names and using a single assertion per test where possible: breaking up the role of the class into tiny pieces which together give the full picture of the unit requirements.
We can go further and test also the use of ArrayIterator on associative arrays:
public function testIteratesOverAssociativeArrays()
    {
        $iterator = new ArrayIterator(array('foo' => 'bar'));
        $i = 0;
        foreach ($iterator as $key => $element) {
            $i++;
            $this->assertEquals('foo', $key);
            $this->assertEquals('bar', $element);
        }
        $this->assertEquals(1, $i);
    }
As an exercise you can try to refine this method into three independent ones, for instance creating the first of them with a name such as testIteratesOverAssociativeArraysUsingArrayKeysAsForeachKeys(). Don't worry about long method names when their length strengthens the specification, but use them only when the code can be refactored into smaller test methods. Even then, finding descriptive test names is the most difficult part of the process.
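For reference, the first of the three methods could be sketched like this:
public function testIteratesOverAssociativeArraysUsingArrayKeysAsForeachKeys()
    {
        $iterator = new ArrayIterator(array('foo' => 'bar'));
        foreach ($iterator as $key => $element) {
            $this->assertEquals('foo', $key);
        }
    }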
We could go on and add other test methods, and the Spl test suite has many of them.

Whenever a bug is found which you can impute to the class under test, you should add a test method which exposes the bug, and thus fails; then you can fix the class to make the test pass. This methodology helps you not to reintroduce the bug in subsequent changes to the class, as a regression test is in place. It also defines more and more the behavior of a class, one method at a time.
The TDD methodology not only prescribes adding test methods to expose bugs, but also to define new features: implementing a user story is done by first writing a failing test which exposes the problem (the feature is not present in the class at that time) and then by implementing the feature itself.

I hope you're enjoying this journey in testing, and that you're considering testing your code extensively if you are not currently using phpunit or similar tools. In the next part, we will take a panoramic view of the assertion methods which phpunit provides to simplify the tester's work. Remember that, in software unit testing, developer and tester coincide, or at least sit one next to the other, in the case of pair programming.

I have posted the ArrayIteratorTest class on pastebin, in case you want to play with it by yourself.
You may want to subscribe to the feed to not miss subsequent posts in this series.

Wednesday, September 23, 2009

Practical testing in php part 1: PHPUnit usage

What is unit testing, and why should a php programmer adopt it? It may seem simple, but testing is the only way to ensure your work is complete and that you will not be called in the middle of the night by a client whose website is going nuts. The need for quality is particularly pressing in the php environment, where it is very simple to deploy an interpreted script, but also very simple to break something: a missing semicolon in a common file can halt the entire application.
Unit testing is the process of writing tests which exercise the basic functionality of the smallest cohesive unit in php: a class. You can also write tests for procedures and functions, but unit testing works at its best with cohesive and decoupled classes. Thus, object-oriented programming is a requirement. This process stands in contrast to functional and integration testing, which build medium and large graphs of objects when run. Unit testing instantiates one, or very few, units at a time, which implies that unit tests are typically easy to run in every environment and do not burden the system with heavy operations.

When the time comes for unit and functional testing, there's only one leader in the php world: PHPUnit. PHPUnit is the xUnit instance for the average and top-class php programmer; born as a port of JUnit, it has quickly filled the gap with the original Java tool thanks to the potential of a dynamic language such as php. It has even surpassed the features and the scope of JUnit by providing a simple mocking framework (whose utility will be discovered during this article series).

The most common and simplest way to install PHPUnit is as a Pear package. On your development box, the php binary and the pear executable have to be available.
sudo pear channel-discover pear.phpunit.de
sudo pear install phpunit/PHPUnit
Use a root shell (or an administrator account if you develop on other systems) instead of sudo if you prefer. These commands tell pear to add the channel of the phpunit developers as if it were a package repository, and to install the PHPUnit package from the phpunit channel. The grabbed release is the latest stable one; at the time of this writing, the 3.4 version.
If the installation is successful, you now have a phpunit executable available from the command line. This is how you will run tests; if you use an IDE, there is probably a menu for running tests that will show you the command line output (and you may need to point the IDE to the phpunit executable to make it discover the new tool).
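To check that everything is in place, you can ask the new executable for its version; it should print something like:
[giorgio@Marty:~]$ phpunit --version
PHPUnit 3.4.0 by Sebastian Bergmann.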

Before exploring the endless possibilities of testing, let's write our first one: the simplest test that could possibly work. I saved this php code in MyTest.php:
class MyTest extends PHPUnit_Framework_TestCase
{
    public function testComparesNumbers()
    {
        $this->assertTrue(1 == 1);
    }
}
What is a test? And a test case? A test case is a class which extends PHPUnit_Framework_TestCase, which as its name tells is an abstract test case class provided by the PHPUnit framework; a test is a method of such a class. When developing an object-oriented application, you may want to start with one test case for every class you want to test (and if you're going the TDD way, every class will be tested), so there will be a 1:1 correspondence between classes and test cases. For the moment, we don't want to go too fast, so we simply write a class that tests common php functionality.
Every test is a method whose name by convention starts with the word 'test'. Also by convention, the method name should tell what operation the system under test is capable of; in this test, comparing numbers.
Every method will be run independently in an isolated environment, and will make some assertions about what should happen. assertTrue() is one of the many assert*() methods inherited from the abstract test case, and it declares the test failed if an argument different from true is passed to it. The test as it is written now should pass. In fact, we can simply run it and find out:
[giorgio@Marty:~]$ phpunit MyTest.php
PHPUnit 3.4.0 by Sebastian Bergmann.

.

Time: 1 second

OK (1 test, 1 assertion)
Instant feedback is one of the pillars of TDD and of unit testing in general: the code in tests should instantiate your classes and exercise their functionality, to ensure they don't blow up and that they respect the specifications. With the phpunit script, it's very simple and fast to run a test case, or a group of them, after you have made a change to your class, making sure you haven't broken an existing feature.
The result of a phpunit run is easily interpreted: a dot (.) for every test method which passed (without failed assertions), and a statistic of the number of tests and assertions encountered.
Let's try to make it fail, changing 1==1 to 1==0:
[giorgio@Marty:~]$ phpunit MyTest.php
PHPUnit 3.4.0 by Sebastian Bergmann.

F

Time: 0 seconds

There was 1 failure:

1) MyTest::testComparesNumbers
Failed asserting that  is true.

/media/sdb1/giorgio/MyTest.php:6

FAILURES!
Tests: 1, Assertions: 1, Failures: 1.
For every failed test, you get an F instead of the dot in the summary. Other letters can be encountered, for instance an E if the test caused no assertion to fail but raised an exception or a php error. The only passed tests are the ones which present a dot: you should strive for a long list of dots that fills more than one row.
You also get a description of the assertion which failed, and of the test case, method and line where it resides. Since you can run many test cases with a single command, this is very useful to localize defects and regressions.
This time the test has failed because it is badly written: zero is not equal to one, and php is right in giving out false. But assertTrue() does not know this, and in the next part we'll write some tests which work upon userland code and are in fact useful to detect whether production classes are still satisfying all their requirements.

You may want to subscribe to the feed to not miss the subsequent parts of this php testing series. Feel free to ask for clarifications in the comments, or to raise testing topics you particularly care about.

Tuesday, September 22, 2009

The power of tracking and logging

Tracking resources - time, money, code changes and whatever else - is a powerful way to improve your understanding of them. Logging is even more powerful as it's automatic tracking done by your tools for you.

The practice of tracking the time spent on the various activities one undertakes is one of the pillars of time management. Analyzing what is sucking up precious man-months gives you a clue for eliminating the tasks that really aren't worth the effort, instead of finding them by chance or through some preconceptions you may have. Tracking your spending is the first advice a financial counselor will give you if you experience money problems.
The concept of time tracking is also present in programming: the only sure-fire way to optimize an algorithm or an application is to profile it as the first step, and then make changes to the code which constitutes the bottleneck. While the theoretical analysis of an algorithm produces a result expressed in O(f(n)), profiling it on real data allows the programmer even to compare different O(n log n) sorting algorithms.

From a different perspective, tracking is present in the modern methodologies of development through tools such as source control systems. Every code check-in is tracked and remains forever in the history of the codebase: no line of code is ever lost in shared directories or email folders. This type of tracking information is better named logging.
It is very cheap for a software system, if built correctly, to conserve every chunk of data that passes over its bridge. Subversion and other vcs do exactly this, logging every single commit, and the logs prove useful when viewed as changesets or when generating a changelog for a new release. Project management tools like trac, built upon subversion, log every change to the tickets which report bugs and feature requests, along with the edits of the wiki pages. It is a small job compared to the extent of the tracking Wikipedia does.
If you're talking on irc or another instant messenger, your client is probably writing logs of the conversations to disk. Every enterprise java application logs exceptions to a file or to a database, too.
Having a large amount of organized data is a source of valuable information, as these logging capabilities allow:
  • posting a conversation between developers on the wiki for further reference;
  • giving commit access to new developers knowing their commits can be rolled back;
  • listing the commits which affect a particular bug, which trac does;
  • generating a changelog file by looking at the list of commits in a particular period on a branch;
  • updating a working copy or a deployment of a php application by transmitting only the modified files;
  • generating a list of the locally modified files in a few seconds to see what is being committed (svn status).
There are infinite possibilities for the usage of logged data. It is often said that every item of a project which cannot be automatically generated (like builds) should be put under version control.
Logging was once expensive, when it was done by hand in dusty ledgers: the data was patiently annotated during the day and it wasn't going to be useful anywhere else. Now that the information era has arrived, take advantage of the logging capabilities of your tools, and never write the same thing twice when a machine can do it for you.

The image is a photo of a Renaissance ledger, used for accounting. Money transactions have a long logging tradition, for fiscal purposes.

Monday, September 21, 2009

Setting SMART goals for developers

The journey of a developer is made of hard work and self-improvement. Maybe you are going to work on a big freelance project, to solve programming challenges, or to expand your knowledge in an area where you lack expertise; whatever you want to be productive for, setting SMART goals can help you stay focused and reach big achievements.

According to Wiktionary, the definition of goal is a result that one is attempting to achieve: setting a goal is the first step in deciding what you want to do. But to find out how you're gonna achieve a goal and to concentrate on it, you must also refine and work on its definition.

Verba volant, scripta manent
The most important aspect of a goal is where it's kept: while verbal and mental goals are volatile and can change over time, a written goal is carved in stone until you consciously decide to modify it. The characteristics of a SMART goal are best exploited when goals are consistently written down and reviewed. You're gonna write a Unix clone? Write it down. You're gonna launch a new search engine? Force yourself to write it down.
To help your productivity, you should definitely do what works for you; however, keep in mind that this system has helped a lot of people get out of laziness and maintain a good attitude while working on their passions.

The five SMART keywords
SMART is an acronym for the five properties a goal must implement to be really useful. The SMART criteria are used in many management fields, for instance to question a project's objectives. Writing user stories instead of a list of features is an implementation of SMART goals.
These keywords are:
  • Specific: you must be specific in your goal, as vague ones cannot give you the needed focus. Learning Java is not a specific goal, while Write a small application in Java is a little better. The logic of being specific is that writing a specific, narrow goal will make you think of a plan and respect the other SMART characteristics; if a goal is too big, you should break it into smaller sub-goals that are specific enough, doing a sort of idea refactoring.
  • Measurable: the achievement should be scientifically and quantitatively measurable. While it seems obvious, this characteristic is really important: otherwise you'll never know if the goal has been achieved and should be deleted from your list. Again as an example, Learning Java is not measurable, while Writing a clone of the ls command in Java is very measurable.
  • Attainable: while a goal can be out of your comfort zone, you must consider the time and the work you will put into this experience. Earn $100,000 in one week is often not an attainable goal, for the majority of us.
  • Relevant: a goal has to be challenging. Otherwise, why set it? If a goal is too simple or small, it is more a task on your todo list than a life-changing experience. In our example, if you're already experienced with Java you should strive for writing a more complex application than the simple ls clone, while the latter can be a good programming problem for a person who has never tried an object-oriented language.
  • Time-bound: you must set a deadline against which to measure your achievements. As Keith Ferrazzi says, a goal is a dream with a deadline: if you do not set a specific date by which you want to have accomplished a goal, it will remain a dream. A deadline is what forces you to work and stimulates your productivity, even if it's self-imposed.
In sum, our goal has gone from Learning Java to Write wget in Java before Sep 30, 2009, written for instance in a GOALS.txt file in my home directory, which I really have. I hope seeing this version of the goal helps you to think of what you need to do in order to accomplish it: learn the basic Java syntax and approach to programming, find out what libraries for networking and filesystem management are available, and how to set up a working development environment. Note that these are all googleable tasks, while Learning Java will point you to entry level courses where you learn to create Dog and Cat classes which extend Animal.
TDD is another example of goal setting: writing a test is in fact setting a goal, one which is very Measurable, since the red/green bar will tell you if the test has passed, and a computer is deterministic (as long as you do not write brittle tests). There are other problems to solve, though: productivity is halted if you do not implement enough user stories per day; or you might fall into testing only getters and setters, which is not a Relevant goal.

Feel free to raise questions: productivity is a difficult and subjective field, in which everyone of us can learn from each other.

Coaches talk about schemes, game plans and tactic, but the purpose of football is only to score a goal more than the opponent team. Set your goals to know what you're heading towards.

Friday, September 18, 2009

Zend Framework Api: what is $options?

I wanted to share an insight that repeatedly using and studying the Zend Framework API has given me. I know finding a gap in the documentation can be annoying, but ZF is usually very coherent in its API, and I found it is predictable also in this case, which is a good trait for a very large set of classes such as the ones from Zend.

The typical constructor of a Zend Framework class (or its factory method) has a signature such as:
Zend_Form::__construct([mixed $options = null]);
and this always left me wondering what $options would contain, since I could not find documentation for this constructor. I depended on finding examples in the reference manual for what I wanted to do with my objects.
What I have learned is a convention widely used in the Zend Framework main components and in the incubator: the options are passed to the setters, after the mandatory construction process has taken place. This means that if we build a submit button in this way:
$button = new Zend_Form_Element_Submit('submit', array('ignore' => true, 'value' => 'Click me'));
the result is equivalent to:
$button = new Zend_Form_Element_Submit('submit');
$button->setIgnore();
$button->setValue('Click me');
So you can simply refer to the documentation of the particular setter you want to call through the constructor, and provide a key in the array with a lowercase initial (to respect the Pear/Zend coding standard): setElementsBelongTo() will require an elementsBelongTo key.
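The underlying mechanism can be sketched roughly like this (a simplification, not the actual Zend Framework code):
public function setOptions(array $options)
{
    foreach ($options as $key => $value) {
        $method = 'set' . ucfirst($key);
        if (method_exists($this, $method)) {
            // 'ignore' => true becomes $this->setIgnore(true)
            $this->$method($value);
        }
    }
    return $this;
}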

It would be interesting to know if there is an official guideline for providing constructors like the ones in the Zend_Form component, or if some classes follow a different convention: feel free to share what you know in the comments.

What's new in PHPUnit 3.4

PHPUnit is the leading testing tool in the php world: the equivalent of JUnit for Java, and what comes to mind when you're thinking of testing a php application.
Although the Api of PHPUnit 3 is stable, some useful improvements and functionalities have been developed for the 3.4 branch, which saw its first official release two days ago.

If you have a Pear-based installation, it's very easy to get PHPUnit 3.4. Simply run:
# pear upgrade phpunit/PHPUnit
as root or preceded by sudo.
For the most important features, you can refer to the official manual, which is currently in the process of being updated.

Here's a list of some of the novelties, which I have experimented with this morning on NakedPhp:
  • test method dependencies: mark a test method with @depends, followed by the name of the test method it relies upon. For instance, a test method for the deletion of an object from a container depends upon the addition test; this way, if the first fails, the dependent tests will be automatically skipped and you will locate the bug instantly. Moreover, you can pass fixtures from the testAdd() method to testDelete(), effectively reusing them. Keep in mind, however, that this feature works on test methods and not between different test cases.
  • the annotations @runTestsInSeparateProcesses and @runInSeparateProcess allow tests to run in separate, isolated php processes. They should be used on the test case class and on the test method respectively. However, I've had trouble adding them, because some kind of recursive dependency is raised when exporting global variables, and xdebug stops it mercilessly.
  • setUpBeforeClass() and tearDownAfterClass() a la JUnit have been introduced. Heavy fixture set up can now be done one time per test case.
  • getMockForAbstractClass() will fill in the abstract methods for you (see the sketch after this list).
  • assertStringStartsWith() and assertStringEndsWith() have been added for rapid assertions.
  • PHPUnit_Extensions_TicketListener_Trac can open and close Trac tickets basing on a test result: I think the @ticket annotation is stricly related to this component. I have not yet tried it since I do not have a Trac instance available (and I think this process can be slow), but it seems a good step forward to expose brittle tests. Especially if used in continuos integration.
  • Mock Objects can now be passed to with() without issues. The cloning problems for with() were tough to resolve, since objects are cloned when passed to with() to allow assertions to run on them after the test method has returned: obviously asserting that an object is identical to an expected one would never work, and asserting simple equality is often problematic since very big objects can be passed around and duplicated.
  • You can mock namespaced classes (in php 5.3) without going through a complicated process to autoload them first and then specify all the parameters to getMock(). Good!
  • @covers annotation support has been extended to setUp() and tearDown(), and the corresponding code coverage calculation has been improved.
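Here is a sketch of the @depends feature described in the first point (class and method names are made up for illustration): testDelete() is skipped if testAdd() fails, and receives its return value as a parameter.
class ContainerTest extends PHPUnit_Framework_TestCase
{
    public function testAdd()
    {
        $container = new ArrayObject();
        $container->append('an element');
        $this->assertEquals(1, count($container));
        return $container;
    }

    /**
     * @depends testAdd
     */
    public function testDelete(ArrayObject $container)
    {
        unset($container[0]);
        $this->assertEquals(0, count($container));
    }
}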
These are the most important features I have found in the ChangeLog, and I have already experimented with a few of them personally in my projects. Feel free to add your discoveries in the comments.

Thursday, September 17, 2009

Gang of Four Design Patterns review


Design Patterns: Elements of Reusable Object-Oriented Software was the first published book to identify patterns in object-oriented programming, and it has since become a classic of the field.

What is a design pattern?
In computer science, but also in architecture, a design pattern is a standard solution to a common problem. In the software engineering field, patterns make up for functionality that an object-oriented language lacks.
Whenever you hear the words Factory, Lazy Loading, Singleton or Iterator, the argument is design patterns:
  • typically the name of a design pattern is written capitalized: Iterator is the pattern, while an iterator is a specific object;
  • design patterns are built with classes and objects, and their representation is often a Uml diagram. When a set of classes you write fits the diagram definition, it is said that you are implementing the pattern;
  • a pattern is a reusable technique and does not consist of reusable code: you may write many Factory classes but they will never be interchangeable. What is similar among them is the underlying concept and contract structure, the recurring theme;
  • Anti-patterns also exist, but they are not as commonly taught.
This book, written by the Gang of Four and published in 1994, was the first presentation of patterns to the general developer public. It has sold more than 500,000 copies and even nowadays it has no real alternative as a complete reference manual for the original patterns.
The name Gang of Four for the authors Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides is borrowed from the original Chinese Gang of Four, an alliance of (obviously four) communist party officials.

Chapter 1 (Introduction) and chapter 2 (A case study)
The beginning of this book already provides great value. The first chapter is about object-oriented programming: a recap of its principles and a 50,000-foot view of the patterns and how they fit together. Composition and inheritance are among the topics covered; a guide for the reader helps to find one's way around the book, searching for the right pattern for the job.
The second chapter is a case study on the design of a WYSIWYG text editor a la OpenOffice, whose architecture is built up a piece at a time by applying eight design patterns. It is useful to read through the practical solution of a real world problem and see how the Gang of Four manage to implement the patterns presented.

After this introduction, the patterns are presented. Every pattern is described in the book as follows:
  • Name and classification: the names of these patterns have become a standard and are referenced all over the Internet. The classification tells you what kind of pattern you are viewing (creational, structural or behavioral).
  • Intent: the problem that the pattern solves, stated in its most general form.
  • Also known as: other names for the pattern, now mostly in disuse.
  • Motivation: why the problem is a real, painful problem and how the pattern solves it brilliantly. This is the main explanation of how the solution works.
  • Applicability: the typical cases where you will reach for these pages of the book to read how to implement the pattern.
  • Structure: a Uml class diagram or object diagram.
  • Participants: descriptions of the classes involved in the diagram, with generic names.
  • Collaborations: the method calls between the objects.
  • Consequences: what applying this pattern implies, with its advantages and limitations.
  • Implementation: long notes on the best ways to implement the pattern in code.
  • Sample code: it speaks for itself; a complete example of C++ (or Smalltalk) code which shows you how to do practical work with this pattern. Although there is little mumbo jumbo in this book, I really appreciate these parts, which keep it out and present something that compiles.
  • Known uses: object-oriented applications which have applied this pattern successfully.
  • Related patterns: similar patterns to consult if the current solution feels wrong or raises issues beyond the one it solves.
As you can see, nearly everything you want to know about writing an Iterator or a Builder is kept in this book.

Creational patterns (5 chapters)
This collection of patterns is dedicated to the creation of objects, solving problems like abstracting a particular kind of object away behind an interface and simplifying the creation process by eliminating repetitive code.
Abstract Factory, Builder, Factory Method, Prototype and Singleton are presented in this first big part.

Structural patterns (7 chapters)
This second collection presents patterns used to maintain a clean object graph: the typical intents are sharing interfaces and objects for reuse, or adding responsibilities to objects without cluttering the class tree.
Adapter, Bridge, Composite, Decorator, Facade, Flyweight and Proxy are presented in this section.

Behavioral patterns (11 chapters)
These patterns are concerned with building a standard architecture for objects and with assigning responsibilities to the right owner. The intent is to retain loose coupling between classes while at the same time allowing complex algorithms and control flow to happen: for instance, abstracting away the State of an object to decouple it.
Chain Of Responsibility, Command, Interpreter, Iterator, Mediator, Memento, Observer, State, Strategy, Template Method and Visitor are presented here in this last part.

A glossary, a guide to the Uml notation and the foundation classes on which the code examples are based are also provided at the end of the book. This bonus will also teach you the basics of Uml.

In sum, this book is a milestone in the history of object-oriented programming, and it will always have a reserved space next to my desk (or rather, on my hard disk) to allow rapid consultation. I would not advise reading it cover to cover the first time, as it can be boring, but having it at hand helps whenever you wonder if the design of your application can be improved.

The image at the top is the cover of the current edition of the book. There is also an electronic version: Design Patterns CD: Elements of Reusable Object-Oriented Software (Professional Computing).

Wednesday, September 16, 2009

How to TDD a database application

After the publication of the Failed attempt to apply Tdd response, some readers have asked me how to test a database application, as an example of code which is not suitable for xUnit frameworks.

Let's start by saying that the database is not a special case for testing, as it is only a port of your application. Whenever the application interacts with an external system, I would say it presents a port; this pattern is called Hexagonal Architecture.
Your effort should go into testing the core of the application thoroughly, building a solid object-oriented Domain Model piece after piece, by writing one test at a time and making it pass. The Domain Model should not have dependencies on any infrastructure like the database, http requests, twitter adapters, and so on: this refinement of the Domain Model pattern is called Domain-Driven Design, and when applied it produces easily testable code. The Domain Model is tested by injecting fake adapters, as there should be no logic in the database: this is object-oriented programming, and in the database objects do not exist, nor can we test it easily and automatically with JUnit.
However, this approach is quite demanding, and it pays off when the domain has a rich set of rules and behavior, while many applications have only CRUD capabilities.

Think of your application as existing only in Ram memory and strip out all the unnecessary code. The classes which remain are the core of the Domain Model. For instance, if I had to manage the list of the users of a forum, I would initially write only a User class and a generic collection of User objects. User, from my point of view, is a POJO or POPO, which does not extend anything:
class User { ...
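Expanded into a minimal sketch, it could look like this (the getters are illustrative additions, included so that the persistence sketches later in this post can read the entity's state):
class User
{
    private $nickname;
    private $password;

    public function __construct($nickname, $password)
    {
        $this->nickname = $nickname;
        $this->password = $password;
    }

    public function getNickname() { return $this->nickname; }
    public function getPassword() { return $this->password; }

    // behavior lives on the object, not in the database
    public function changePassword($newPassword)
    {
        $this->password = $newPassword;
    }
}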

Insulating the database means hiding this generic collection behind an abstraction, since it is not present at all in memory except for caching: subsets of it can be reconstituted as needed. This abstraction is called a Repository or, at a lower level, a DataMapper. Hibernate for Java and Doctrine 2 for php are examples of the DataMapper pattern: they let you work on your objects and then take them and synchronize the database with the new data you have inserted: a change in a User's password, or new Users to be added. To polish the DataMapper Api, which is very generic, a UserRepository class can be created.
Even if you do not have a generic DataMapper and work with mysql_query, PDO or JDBC queries, you can write a UserRepository which will act as the port for the database (or maybe for a text file; since the repository decouples it, the storage mechanism can be anything from a memcache server to a group of monkeys writing serializations of objects down on paper).
Depending on your architecture, your controllers or other domain classes will now have the UserRepository as a collaborator, and will talk to it and call its methods instead of accessing the database directly; this is a form of Dependency Injection. Obviously, if there are other entities which are persisted, like Groups, Posts, etc., they should have their own repository classes.
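To fix the ideas, here is a sketch of what the port may look like; the interface and the PDO-backed adapter are hypothetical examples, not prescriptions:
interface UserRepository
{
    public function add(User $user);
    public function findByNickname($nickname);
}

class PdoUserRepository implements UserRepository
{
    private $connection;

    public function __construct(PDO $connection)
    {
        $this->connection = $connection;
    }

    public function add(User $user)
    {
        $stmt = $this->connection->prepare('INSERT INTO users (nickname, password) VALUES (?, ?)');
        $stmt->execute(array($user->getNickname(), $user->getPassword()));
    }

    public function findByNickname($nickname)
    {
        $stmt = $this->connection->prepare('SELECT nickname, password FROM users WHERE nickname = ?');
        $stmt->execute(array($nickname));
        $row = $stmt->fetch(PDO::FETCH_ASSOC);
        return $row ? new User($row['nickname'], $row['password']) : null;
    }
}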

Of course, the point of this discussion is how to test this code, since to write it we have to prepare a test beforehand (the cycle is called red-green-refactor, not code-and-then-try-to-test). If we manage to write these tests first, the code will automatically be testable, and also very decoupled and reusable, as these are characteristics of testable code.
What we need to write are not generic tests, but unit tests. If you want to check the entire path of data through the application you can write integration tests, which exercise even the user interface, but Test-Driven Development prescribes writing unit tests: your test methods should depend only on the system under test and on the interfaces it uses.
Continuing our example, we need:
  • unit tests for the User, Group, Post classes (Domain Model entities);
  • unit tests for controllers or other Domain Model classes which use the adapters;
  • unit tests for UserRepository and similar classes (adapters);
  • unit tests for the DataMapper or other infrastructure.
The point of unit testing is that you can test singular components separately, and not the entire application as you would with functional testing. Moreover, you must test each component separately, or it would prove too difficult to write classes to satisfy an integration test, and the TDD benefit of adding a bit of design with every test method would be lost.
With this knowledge, we can say:
  • unit tests for Domain Model entities must be written by the developer of the application. If the project is mostly a CRUD application and is data-intensive, these test cases will be very short;
  • unit tests for controllers and everything that uses the adapters must also be written by the developer, mocking the adapters out (see the sketch after this list). If you use the real adapters in testing, it becomes integration testing, heavy and brittle; after a red test you won't know whether it is the controller or the adapter that does not work;
  • unit tests for the concrete adapter classes: this is the only interaction with the database. Fortunately, we are unit testing these classes, so we can even use a fresh database for every test method, since every class exposes only a few methods; compare this approach with testing every single feature of the application against a real database;
  • unit tests for infrastructure: these are included in the framework, so we shouldn't worry.
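Here is a sketch of the second point, using PHPUnit's mock Api; UserService and its register() method are hypothetical, standing in for any class that has a UserRepository as a collaborator:
class UserServiceTest extends PHPUnit_Framework_TestCase
{
    public function testRegistrationAddsTheUserToTheRepository()
    {
        $repository = $this->getMock('UserRepository');
        $repository->expects($this->once())
                   ->method('add')
                   ->with($this->isInstanceOf('User'));

        $service = new UserService($repository);
        $service->register('giorgio', 'secretPassword');
    }
}
No database is touched: the test fails if UserService does not call add() exactly once with a User instance.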
Note that if you use a DataMapper which abstracts away the particular database vendor as infrastructure, you can run the unit tests for the adapters against an in-memory database like sqlite, instead of the real database which will likely be very heavy to manage and recreate every time; this way, you de facto exclude the database from your unit tests. Of course some integration tests will be necessary, but they are not part of TDD and of the design process.
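For instance, the hypothetical PdoUserRepository sketched earlier could be exercised like this, with a fresh in-memory schema recreated for every test method:
class PdoUserRepositoryTest extends PHPUnit_Framework_TestCase
{
    private $repository;

    protected function setUp()
    {
        $connection = new PDO('sqlite::memory:');
        $connection->exec('CREATE TABLE users (nickname VARCHAR(255), password VARCHAR(255))');
        $this->repository = new PdoUserRepository($connection);
    }

    public function testAddedUsersCanBeFoundBack()
    {
        $this->repository->add(new User('giorgio', 'secret'));
        $user = $this->repository->findByNickname('giorgio');
        $this->assertEquals('giorgio', $user->getNickname());
    }
}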

I hope to have given you an idea of what unit testing means and how to deal with an application which uses a database for persistence, which is a very common scenario. Feel free to raise questions in the comments if something is not clear.

Do you want more? There's a book for .NET developers which explains DDD and database-independent testing.
The image at the top is a photograph of integrated circuits: real world components designed for testability, which are tested independently from the board where they will work.

Tuesday, September 15, 2009

SOLID part 6 (bonus): how SOLID is the Standard Php Library?

This is a bonus part for the SOLID principles series. You can check it out to view the articles which explain all the principles or subscribe to the feed to be informed of new posts.

Now that we have talked at length about the five SOLID principles, it's time to apply them to a real project, to see the good design ideas they introduce and what trade-offs are accepted in everyday coding. Since I have mainly a php audience, I have chosen the only object-oriented library that the php core features natively: the Standard Php Library (SPL).

The Iterator interface and its implementations were the first Spl components to be included in php. This interface is very cohesive and it does not assume much about the source the items come from: for instance, implementations have no need to define a method that counts the elements, since that responsibility is segregated in the standard Countable interface.
Iterator is extended by RecursiveIterator and OuterIterator, which model respectively an iterator whose items have children and an iterator which decorates another one. These components add only one or two methods to the parent interface and are also very cohesive: they are the hooks for opening Spl to extension.
The responsibilities of these classes are evenly distributed: the basic Iterator implementations have proven to be very useful (such as DirectoryIterator, which produces the list of files contained in a folder), while other functionalities are kept in their own OuterIterator implementations, like LimitIterator and CachingIterator.
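A small usage sketch of this composability: LimitIterator decorates any inner Iterator, here a DirectoryIterator, without the two classes knowing anything about each other.
// print only the first ten entries of the current folder
$firstTen = new LimitIterator(new DirectoryIterator('.'), 0, 10);
foreach ($firstTen as $entry) {
    echo $entry->getFilename(), "\n";
}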
The primary use of Iterator (and of its parent interface Traversable) is in the foreach construct, where it can be passed as if it were an array. The standard implementations can be swapped without problems in a foreach, which means the Liskov Substitution Principle is more or less respected. What comes to mind is that different Iterators will return different types of items, but php is a dynamic language and this caveat does not strictly affect the application of the principle, as long as the methods retain the same meaning and intent. To solve the problem completely, the interface would need generics like in Java: Iterator<T>.

Spl also provides an object-oriented way to access the filesystem (SplFileInfo and similar classes), which is not a point of interest in this discussion since these classes are wrappers around basic functions such as opendir() and filesize(). The same goes for Exception and its tree of subclasses. The Serializable interface is also a good hook for extension of behavior, but we are not going to overanalyze the library.

What does raise attention are the SplObserver and SplSubject interfaces, which suggest a standard way to implement the Observer pattern. These components are not useful in my opinion, and a comparison with statically typed languages will show why.
In a static language like Java, when a parameter is passed to a method with a determinate signature, only the contract defined by that signature can be used in the body of the method. This is the definition of SplObserver:
public function update(SplSubject $subject);
where SplSubject contains the methods attach(), detach() and notify(). Since the contract is defined by the SplSubject interface, if php were a static language we could only call these three methods, which are totally useless for getting the state of the subject (think of some getXXX() methods to call on it); only the fact that php is a dynamic language, which does not complain if you call methods that may not exist, allows such a method call. Moreover, these interfaces force the developer to adopt a pull-style Observer pattern even where a push-style one would be simpler.
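A sketch of the problem (Auditor and getName() are made up for illustration):
class Auditor implements SplObserver
{
    public function update(SplSubject $subject)
    {
        // getName() is not part of the SplSubject contract: a static
        // language would reject this call at compile time, while php
        // fails only at runtime if the method does not exist.
        echo $subject->getName(), " has changed\n";
    }
}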
In sum, a design pattern like the Observer one is a common solution to a problem and should be implemented every time with a different flavour, requiring different classes. SplObserver and SplSubject fail to satisfy the Dependency Inversion Principle, since they don't decouple their implementors, which need to know an abstraction larger than the one these interfaces capture.

With the release of php 5.3, it has become clear that the goal of Spl is to provide fast components, implemented in C for particular purposes, and not a well designed object-oriented library. SplQueue is an example of this, along with its companion SplStack:
class SplQueue extends SplDoublyLinkedList implements Iterator, ArrayAccess, Countable { ...
SplDoublyLinkedList is a classic data structure, a general purpose list that is subclassed by SplQueue and SplStack. Since it implements Iterator and other Spl interfaces, its children inherit these features, but in my opinion this is not a good idea.
A queue can be accessed only one element at a time; a stack has the same limitation, voluntarily imposed to encapsulate the internal data structure and limit the operations a client class can perform on it. I would rather have SplQueue as an interface and not a concrete implementation, with just the enqueue() and dequeue() methods, and SplStack as another interface with push() and pop(); SplDoublyLinkedList would be a generic implementation which satisfies both interfaces (see the sketch below).
Why would I prefer this design? Because it places the functionalities in two compact interfaces; otherwise the Interface Segregation Principle is not applied and no interfaces are even defined. SplQueue can't be a child of SplDoublyLinkedList without violating the Liskov Substitution Principle: an implementation of a queue can use a doubly linked list internally, but if it inherits methods from it, it is not a queue anymore.
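This is what the proposal could look like; the Queue and Stack interfaces below are hypothetical, not part of Spl:
interface Queue
{
    public function enqueue($element);
    public function dequeue();
}

interface Stack
{
    public function push($element);
    public function pop();
}

// SplDoublyLinkedList already provides push(), pop(), shift() and
// unshift(), so a thin adapter can satisfy the Queue contract while
// keeping the list encapsulated.
class LinkedListQueue implements Queue
{
    private $list;

    public function __construct()
    {
        $this->list = new SplDoublyLinkedList();
    }

    public function enqueue($element)
    {
        $this->list->push($element);
    }

    public function dequeue()
    {
        return $this->list->shift();
    }
}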
A benefit of a Queue interface would be real decoupling, with its consequences: for instance, a mock of Queue can be used in testing to make sure only enqueue() and dequeue() are called, while mocking SplQueue as it is will also expose all the other Iterator methods as empty ones, leading to strange results (and tying my high level modules to details instead of to a queue abstraction, infringing the Dependency Inversion Principle).

I would say that Spl is a good product for the goals it was conceived for: a reliable and fast object-oriented layer which can overload common constructs like count() and foreach; it has been a very successful feature of php. But its design could be vastly improved, also through the application of the SOLID principles.

The image at the top is an upside down library.
Did you like this series? Leave a comment or subscribe to the feed to stay in touch.
