Invisible to the eye: April 2010

Friday, April 30, 2010

Zen-Driven Development, part 1

Inspired by Eric S. Raymond's Unix Koans.

A long time ago, a student wanted to learn Test-Driven Development. Thus, he went to the Great Master to learn the ways of test-first design.
The student said to the Master "How can I learn TDD, this new methodology that would fix my broken designs?"
The master answered "If you want answers from me, I cannot help."
But the student did not understand.

The day after that, the student came back to the Great Master.
He said "Master, I am humble and eager to learn. Can you tell me how I can employ Test-Driven Development in my projects? I have so much legacy code to tame... I want to be Agile."
And the Master answered "Again, I do not have answers."

The third day, the student came back discouraged.
"Master, I do not want answers. Just give me a little clue, since in three days I have not written a single line of code. I want to be a TDDer."
"Listen to me, for a great truth is being revealed to you, without any certification course."
The student looked at the Master impatiently. The Master continued:
"You told me you want to learn TDD. But TDD does not start working on a problem by producing its answer. It starts from defining the problem, posing a question. Only when there is a clear, formal question defined can you verify you have answered it when the time comes."
Then there was a bit of silence, as the student nodded in agreement.
"Are you able to write a wget clone by using TDD?" the Great Master asked.
"Of course not, Master..."
"Then we get a red bar."
And then, the student was enlightened.

Tuesday, April 27, 2010

Transparent remoting is a fable

Transparent remoting is the use of the Proxy pattern to create remote proxies, which conform to the same interface as a remote object. Instead of executing a method locally, a remote proxy marshals (serializes) the parameters, sends them over the network, receives the marshalled result and returns it to the client code.
The idea behind this pattern's implementations is to use a remote object, discovered via some kind of infrastructure service, as if it were in the same address space as the current process. The most famous implementation is probably Java Remote Method Invocation (RMI).

This is nice!
However, the famous paper A Note on Distributed Computing outlines the inherent issues of treating remote and local objects with the same interface and contract:

latency: obviously, remote calls take more time (some orders of magnitude) than local calls.
memory access: it is performed via pointers and handles on local machines, while it is more complex in remote invocations. More or less solved in Java RMI via stubs and skeletons.
partial failure: remote objects can suffer different failures which are not included in the original interface. For instance, they can raise RemoteExceptions, which you must deal with in a try/catch block. Or they can leave you waiting for a method's return value forever (actually, until a timeout is reached).
concurrency: calls from different nodes can happen at the same time, and there is no single point for managing shared resources like a common operating system.

The first two are problems we can live with, while the third and the fourth have no conceptual solution. In sum, remote proxies are a leaky abstraction, and it is difficult not to notice that an object is remote.

Isn't transparency over the network a simplification?
It is more of an old myth, and often too much of a simplification. Single-machine deployments of web applications exist, and they can be transformed into multiple-machine ones. How do they tackle the problem?
These infrastructures do the job backwards. They assume a service like a database is going to be remote, and if it is local, well, nothing changes: you can just put localhost or 127.0.0.1 in the configuration. Sockets have been used for communication between Unix components since the 1980s, and they are not a leaky abstraction. Even RMI forces you to declare RemoteException on an object that happens to be local (and this is the first failure of transparent remoting).
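
As a rough illustration of this "remote by default" approach, here is a minimal PHP sketch. The buildDsn() helper, the host names and the database name are all hypothetical; only the PDO-style DSN format is real. The point is that the code path is the same in both deployments and only the configuration changes.

```php
<?php
// Hypothetical configuration arrays: host names and database name are made up.
// The database is always addressed as a network service.
$singleMachine = array('host' => '127.0.0.1', 'port' => 3306, 'dbname' => 'blog');
$multiMachine  = array('host' => 'db.example.internal', 'port' => 3306, 'dbname' => 'blog');

/**
 * Builds a PDO-style MySQL DSN from a configuration array.
 * Nothing in this function knows whether the database is local or remote.
 */
function buildDsn(array $config)
{
    return sprintf(
        'mysql:host=%s;port=%d;dbname=%s',
        $config['host'],
        $config['port'],
        $config['dbname']
    );
}

echo buildDsn($singleMachine), "\n";
echo buildDsn($multiMachine), "\n";
```

Moving the database to another machine would then be a one-line configuration change, with no local-only shortcut to remove from the code.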

Isn't RMI useful?
Of course: in the right use cases RMI can be handy. For example, if you are on a LAN and in a hurry to separate an application server from your web servers, you will certainly use RMI to access beans on the former. As long as it is not used transparently, it is a great technology.

So should we stop doing remoting with objects?
No. Even JavaScript nowadays does remoting in Ajax applications, but it has a different model, which is asynchronous. Of course, some Java infrastructure is asynchronous by nature, and Java frameworks are introducing an asynchronous model for calling remote objects (R-OSGi).

IRemoteService remoteService = (IRemoteService) reference.getProperty(REMOTE);
// This futureExec returns immediately
IFuture future = RemoteServiceHelper.futureExec(remoteService, "hello", new Object[] { CONSUMER_NAME });
// ...do other computation here...
// This method blocks until the result is returned
future.get();
System.out.println("Called future.get() successfully");

You can wrap future.get() in a separate thread and continue with your life. This is almost like doing an Ajax call: your browser does not freeze while the request is being completed. Asynchronous method calls are an interesting innovation (not really an innovation since they are older than dirt) over transparent invocations, and open up new models for distributed computing.

Saturday, April 24, 2010

First week at DZone roundup

This is the first week in which I can include links to my original articles at DZone. I've been a bit busy with my TOEFL exam, but I hope to keep writing frequently on this blog as well.

php|architect
Zend launches user group affiliate program
Oracle PHP Generator

DZone
The Model-View-Controller pattern in PHP
The guide to configuration of PHP applications
Practical PHP Patterns: Transaction Script

You may have noticed I published only a small post on this blog this week, but as you can see I am actually writing more than before. :)

Wednesday, April 21, 2010

The impact of a personal blog

This week, I have become the Zone Leader of the Web Builder Zone of the DZone Network. As you may have noticed, my republished articles on DZone have received thousands of visits and many upvotes, so I'm moving to a position which involves writing original content for DZone, as I have done with php|architect in recent months.
What does this imply for the subscribers of this blog? Really not much. You will continue to get content for free, but I will divide my writing between this personal blog and DZone. Mainly PHP-related content will be published on DZone during the first part of the week, while I will continue to write here about other topics such as Agile methodologies, general object-oriented programming and Java, Domain-Driven Design and so on.
I will continue however to post in my weekly roundup the links to original articles of mine published all over the web, so that you don't miss anything. There is no author's personal feed to subscribe to on DZone, but this is the full one for the Web Builder Zone if you want to be promptly updated.

As a side note, I'm impressed by the impact that a personal blog like this has had on my opportunities, as a means to demonstrate my expertise. In 2009, before opening this hosted blog at zero cost, I was a freelance developer who worked on custom PHP applications on a per-project basis. Italy isn't full of opportunities for talented people and every project seemed equal to the next.
Fast forward two hundred posts: my name is recognized and I get to write PHP-related articles for two of the most popular content providers in the field, DZone and php|architect. I was also invited to propose a talk for the Italian PHP conference, phpDay, which was subsequently accepted. I will be a speaker at phpDay, a really unexpected event.
I started an open source project, NakedPhp, which has seen some good recognition thanks to its appearance on this blog. I wish I had more time to dedicate to it, as I feel the idea it is based on, the Naked Objects pattern, is really valuable for enterprise applications, a field PHP could conquer if it had more standards. Does anyone want to push to the git repository? :)
My English has greatly improved and my writing is now fluent. I have no fear about the upcoming TOEFL exam I must pass in order to get my Bachelor's degree (feel free to find a grammatical error in this article now :)
The point is, this journey has helped me a lot in my career path: this blog is not a year old yet and I have landed two flexible freelance writing positions, a showcase for my personal projects, and a talk at a conference. Thank you all for the time you give to my articles, and for your retweets.

Saturday, April 17, 2010

First half of April roundup

After the last roundup, I have published original posts at php|architect you may want to take a look at:
PHP 5.3 namespaces for the rest of us
Zend Framework 1.9.8 and 1.10.3 released
Netsparker Community Edition released

Meanwhile, I halted the work on NakedPhp for a bit because I'm allocating time to the development of my thesis on a search application oriented to multimedia files: this time, Java has been distracting me from the PHP world.

Friday, April 16, 2010

Growing object-oriented software review

Growing object-oriented software, guided by tests is a masterpiece on Test-Driven Development, a valid guide for the beginner in this field and for the almost expert as well. This book guides the reader through a world where TDD is not only always applied, but it is kept in its pure form.
The title describes exactly the purpose of the practices presented in the book: starting an ambitious project from scratch, and expanding the product from an empty skeleton to a fully-featured application. Unit, integration and acceptance tests are the specifications which are written before the production code, and that drive the development in an agile way towards the end of an iteration or a release.

Structure
The book is divided into five parts:

Introduction to Test-Driven Development and design principles for object-oriented applications. This part bridges the beginner with the rest of the book, written for a programmer at an intermediate level.
The process of Test-Driven Development: TDD in a double cycle, mocking, third-party code, and every bit of theory you need to know about Red-Green-Refactor.
A worked example: 150 pages where the authors build from scratch an application which interfaces with the web and a user interface, by adding one test at a time and justifying every single step. The code is available in different programming languages on the official web site.
Sustainable Test-Driven Development: TDD is often abandoned when issues arise due to its incorrect usage. Managing the test suite is the most important part of long-term maintainability; readability, expressiveness and consistency are taught here.
Advanced topics: persistence, threading and asynchronous calls are often a difficult field for test-driven applications, increasing the ever-present danger of introducing brittle and slow tests. However, it is not impossible to extend testing to these areas by tweaking the original approach. Remember that if you code test-first, nothing can stop you from writing testable code.

As you can see from the Advanced topics part, the book is oriented to a Java audience, but if you exclude the last two chapters every practice and principle is tied to real object-oriented programming and not to particular programming languages. For example, PHP 5 has a complete object paradigm and the necessary tools (PHPUnit, Zend_Test) to practice the techniques described in the book.
A basic knowledge of Java helps, though, because the code samples are written in Java and make use of frameworks such as Swing; of course, JUnit-based tests are perfectly equivalent to tests written with any other xUnit testing framework.

Some concepts explained by this book - such as the Walking Skeleton and Acceptance TDD - were very enlightening. I guess there is a reason why Misko Hevery recommends this book:

Reading the book I sometimes felt that I was listening to myself, especially when the authors warned about global state, singletons, overusing mocks, and doing work in constructors among other things. But unlike myself, who draws sharp lines between right and wrong, the authors did a good job of presenting things on the gray scale of benefits and drawbacks. The book shows what a typical code most people will write, and then show how tests point a way towards refactoring.

Also Robert C. Martin, also known as Uncle Bob, has words of praise for Growing object-oriented software:

At last a book, suffused with code, that exposes the deep symbiosis between TDD and OOD. The authors, pioneers in test-driven development, have packed it with principles, practices, heuristics, and (best of all) anecdotes drawn from their decades of professional experience. Every software craftsman will want to pore over the chapters of worked examples and study the advanced testing and design principles. This one’s a keeper.

I am glad I have finished reading it before starting my thesis on a large project. If you do not trust me, trust them. :)

Thursday, April 15, 2010

Java versus PHP

If you exclude C and its child C++, the most popular programming languages in the world are Java and PHP, which power most of the dynamic web. I have working experience with PHP and for academic purposes I am deepening my knowledge of Java, thus I'd like to point out similarities and key differences between these two languages. Every language has its pros and cons, so there's no absolute winner here.

History
Java was originally developed at Sun Microsystems in 1995, as part of the Java platform. Java applications are compiled to an intermediate bytecode which is run by a virtual machine. While originally intended for client software and in-browser applets (with the motto write once, run anywhere), Java is now a key infrastructure for many web applications.
PHP was born in the same year, and it was instead specifically created for the web, as a server-side scripting language to embed in HTML pages. It has evolved through 5 major versions to become one of the most popular web programming languages, thanks also to shared hosting services, which are particularly simple to set up for PHP applications and drive down their operational costs.

Typing
Java is built over static typing: variables must have a declared (possibly polymorphic) type. Thus, it is commonly judged as verbose, although the verbosity isn't necessarily linked to static typing.
PHP uses dynamic typing instead: variables assume the type of the value currently contained in them, and can change their type to satisfy implicit casts and conversions. This approach is prone to errors which would be detected by a compiler, but unit tests are a wonderful substitute for compile-time checks in dynamic languages.
Note that both languages have primitive types: Java because of the performance problems of wrapping integer and character variables in objects at the time of its creation in 1995, and PHP because it wasn't originally conceived as an object-oriented language.
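
To make the dynamic typing described above concrete, here is a minimal PHP sketch. The quantityTotal() helper and its input are hypothetical; the final check stands in for the kind of unit test that would play the compiler's role.

```php
<?php
// Dynamic typing: the variable follows the type of its current value.
$value = '5';
$types = array(gettype($value));   // the value is a string

$value = $value + 3;               // the numeric string is converted by +
$types[] = gettype($value);        // the value is now an integer

$value = $value . ' apples';       // the integer is converted by the . operator
$types[] = gettype($value);        // the value is a string again

echo implode(' -> ', $types), "\n"; // prints: string -> integer -> string

/**
 * Hypothetical helper: sums quantities that may arrive as strings,
 * for example from a submitted form.
 */
function quantityTotal(array $quantities)
{
    return array_sum($quantities);
}

// A test-like check: no compiler verifies the type of the result for us.
$total = quantityTotal(array('2', '3'));
var_dump($total === 5);
```

In Java the equivalent reassignments would not compile, while here only a check executed at runtime can reveal what type a variable ended up with.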

Object-oriented programming
PHP 5 borrowed its object-oriented paradigm from Java, which is the standard implementation of an object-oriented language today. After the release of PHP 5, a key point in the introduction of a serious object paradigm, PHP has been evolving towards OOP and has borrowed more from Java products: Doctrine 2 is an object-relational mapper inspired by Hibernate and JPA; phpDocumentor is built on the example of Javadoc; PHPUnit is one of the xUnit products, which derive from the original JUnit.

Execution model
PHP classes, functions and data structures, when they do not employ external infrastructure like caches or databases, are created in a script and garbage-collected at the end of the request. Java applications are instead kept in memory between requests, and the architecture of these two kinds of applications is fundamentally different, though I would not say one model is superior to the other. PHP pulls into execution only what it needs, and it pays for this with the inability to run periodic tasks without external help such as cron. Java applications can start multiple threads, but their management is much more complex, from the compilation phase to the deployment, which includes reloading servlets.

Infrastructure and web
PHP is simple to deploy in its basic form (.php scripts), but this means that increasingly often the average developer has to use frameworks which build standard infrastructure features over the simple PHP interpreter. Ironically, these frameworks are similar to Java ones; for example, Zend Framework's controllers are the equivalent of servlets: classes with a standard constructor that extend a common base class and act on a request object to produce a response one.
Java has fewer native features built into the language, as it is not strictly oriented to the web, but it has them in frameworks which adhere to a standard, the servlet containers. PHP's capabilities are hindered by the absence of standards.
For instance, the web.xml file in WAR packages, which represent web applications, defines routes that map URLs to servlets. Imagine if PHP had a standard for defining routes that Zend Framework, Symfony and their siblings respected: that seems like science fiction.

Wednesday, April 14, 2010

The class design checklist

Given the good reception of the TDD checklist, I've decided to put together a similar one with suggestions for the generic class and interface design. These entities are the basic artifacts of object-oriented programming, thus this checklist is used at a lower level than the TDD one.
The form, however, is the same - a list of questions which should be answered by a developer before committing a new code artifact. There are really no particular phases in which to apply these questions, though, like in the Red-Green-Refactor cycle, so I organized them by topic. Feel free to pose these questions to yourself or to a developer you're pairing with anytime you perceive a smell in the code.
Note that many static analysis tools would point out some of these issues, but I think it is better to tackle possible problems early in the development cycle, at the very moment of the initial design. If you employ TDD, design is performed one step at a time, thus leaving you free to aggressively refactor the newly introduced concepts: this process converges towards cohesive and loosely coupled classes, which are blueprints for a good object graph.

Naming

Does the class (or interface) name describe what an instance of this class (or interface) does? Usually a noun, or an adjective plus a noun, is good for a class, while an adjective is more appropriate for an interface. Some interfaces have names composed of one or more nouns.
Does the name contain unnecessary implementation details? Interfaces and abstract classes should not contain any reference to a particular implementation, but you should analyze this issue in context. For instance, XmlParser is not correct if at least one possible parser implementation does not work with XML, while it is appropriate for a family of XML parsers that differ in performance. In the same vein, class names should not contain private implementation details of the class which may change, only the class's real distinguishing trait with respect to the other implementations (e.g. XmlParser, HtmlParser, YamlParser).
Is the fully qualified name of the class or interface correct? (also known as: is this artifact placed in the right package or namespace?) You can make a guess based on the number of dependencies that this new artifact introduces.
Naming conventions and best practices are also valid for method names, parameters, local variables, inline comments.
Is the name consistent with the rest of the object model? Is it part of the Ubiquitous Language? The more public the named entity is (in the ascending order private, protected, [package where applicable,] public, published), the more important it is to get a valid name immediately.
Should the name be influenced by a standard Ubiquitous Language? This is usually true for design patterns implementation, where leveraging the role names communicates much about the code structure. Other examples of a standard Ubiquitous Language are framework and programming language conventions.

Structure

How many levels of indentation are there in your code? Supposing that the first level is dedicated to methods, two more levels are acceptable, with one (thus two in total) being the norm. Extract Method will help break up the complexity into different, orthogonal methods.
Have you inserted switch constructs, especially similar ones? This structure is usually a smell, which can be refactored with a State or Strategy pattern.
If-elseif chains, or even if-else constructs, are equivalent to a switch when repeated, with if-else being its two-case version.
Are there new operators mixed with business logic? This is a no-brainer.
Are there any controversial constructs in the code, such as the static keyword, or goto?
If the code artifact is a subclass, does it extend the right parent class? If it is an implementation, does it implement the right interface? Check the semantics and analyze the proposition An instance of [entity] is always an instance of [parent], too.
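
The questions about switch constructs and about new operators mixed with business logic can be sketched in PHP as follows. All names here (ShippingMethod, Order and the cost formulas) are hypothetical and only serve as an illustration of the Strategy refactoring.

```php
<?php
// Before: a switch on a type code, which tends to be repeated elsewhere.
function shippingCostWithSwitch($method, $weight)
{
    switch ($method) {
        case 'standard':
            return 5 + $weight;
        case 'express':
            return 10 + 2 * $weight;
        default:
            throw new InvalidArgumentException("Unknown method: $method");
    }
}

// After: every case becomes a Strategy object behind a common interface.
interface ShippingMethod
{
    public function costFor($weight);
}

class StandardShipping implements ShippingMethod
{
    public function costFor($weight)
    {
        return 5 + $weight;
    }
}

class ExpressShipping implements ShippingMethod
{
    public function costFor($weight)
    {
        return 10 + 2 * $weight;
    }
}

// The business class receives its collaborator instead of creating it,
// so no new operator is mixed with its logic.
class Order
{
    private $_shipping;
    private $_weight;

    public function __construct(ShippingMethod $shipping, $weight)
    {
        $this->_shipping = $shipping;
        $this->_weight = $weight;
    }

    public function shippingCost()
    {
        return $this->_shipping->costFor($this->_weight);
    }
}

// Client code (wiring): the only place where new appears.
$order = new Order(new ExpressShipping(), 3);
echo $order->shippingCost(), "\n";
```

Adding a new shipping method now means adding a class rather than editing every switch, and Order can be tested with a stub ShippingMethod.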

Length

May a class want to implement only part of this interface? Segregate it in different pieces as much as possible.
Is the class longer than the standard size for your project? The suggested length varies with the programming language and the particular application, but a long class may be the sign of an imminent Extract Class refactoring.
Is the class size of the same order of magnitude as that of other similar implementations?
How many characters are the most complicated lines long? You may want to introduce intermediate methods or data structures to keep such lines readable. A common rule of thumb is the 80 characters of the original text terminal.
Method length is also a useful metric. The rule of thumb is that a method should be fully visible on a single screen, but this doesn't mean that with a 30'' LCD monitor you should write longer methods. The original screen was 25 lines high, but you may want to extend it a bit for practical purposes and refactor later: Extract Method is one of the simplest refactorings and it has a very limited impact on the rest of the codebase.

I hope you already had many of these questions in your mind while coding. As always, feel free to add new insights in the comments: I would be glad to learn new practices from you.

Tuesday, April 13, 2010

Practical Php Patterns: Template Method

This post is part of the Practical Php Patterns series.

The pattern of today is the Template Method one. Template Method is an inheritance solution to the problem of hooking into steps of the execution of an algorithm.

The pattern is very simple: the Template Method on an AbstractClass defines the algorithm by composing small hook methods, which can be implemented or overridden by the different ConcreteClasses that inherit from AbstractClass. The AbstractClass may provide the hooks as abstract methods, or as concrete methods with default implementations; in PHP their visibility is usually protected, so that they are visible only to the hierarchy's internal code.
An example of Template Method where a standard hook is used is the Factory Method pattern: businessMethod() is a Template Method which composes factoryMethod() as a hook.
From the design and testability point of view, Template Method should be used sparingly, and mostly in high-level components which act more as a declarative layer and are not the subject of extensive unit testing. This means they should not contain much real logic.
In fact, the testing of AbstractClass is usually non-standard for programmers who have made a habit of TDD, but it is indeed possible: a custom subclass of AbstractClass is built specifically for the test, or a mock is generated which overrides only the hook methods. Testing the single ConcreteClasses is instead difficult, as in every test you will also throw in the business logic of AbstractClass, which was factored out specifically to avoid dealing with it.
You can lie to the production code by factoring out business logic with an extends keyword and hiding it under the carpet, but you can't lie to unit tests that exercise this production code. So if you find yourself struggling with testing ConcreteClass instances in isolation, you may want to refactor into a Strategy pattern or a similar composition solution built with Dependency Injection in mind. The particular pattern depends on the semantics of your object graph.

The code sample deals with multiple implementations of binary operations, which only define the sources of the operands and the business logic of the actual operation, sharing the wiring code.

<?php
/**
 * The AbstractClass.
 */
abstract class BinaryOperation
{
    /**
     * These are the three hooks defined here, which should
     * provide the two numbers which the operation is
     * applied to and its business logic.
     */
    protected abstract function _getFirstNumber();
    protected abstract function _getSecondNumber();
    protected abstract function _operator($a, $b);

    /**
     * This is the Template Method.
     * It uses all the three hooks, but a typical
     * Template Method can coexist with other ones, and
     * share hooks with them.
     * @return numeric
     */
    public function getOperationResult()
    {
        $a = $this->_getFirstNumber();
        $b = $this->_getSecondNumber();
        return $this->_operator($a, $b);
    }
}

/**
 * A ConcreteClass.
 */
class Sum extends BinaryOperation
{
    private $_a;
    private $_b;

    public function __construct($a = 0, $b = 0)
    {
        $this->_a = $a;
        $this->_b = $b;
    }

    protected function _getFirstNumber()
    {
        return $this->_a;
    }

    protected function _getSecondNumber()
    {
        return $this->_b;
    }

    protected function _operator($a, $b)
    {
        return $a + $b;
    }
}

/**
 * A ConcreteClass.
 */
class NonNegativeSubtraction extends BinaryOperation
{
    private $_a;
    private $_b;

    public function __construct($a = 0, $b = 0)
    {
        $this->_a = $a;
        $this->_b = $b;
    }

    protected function _getFirstNumber()
    {
        return $this->_a;
    }

    protected function _getSecondNumber()
    {
        return min($this->_a, $this->_b);
    }

    protected function _operator($a, $b)
    {
        return $a - $b;
    }

}

// Client code
$sum = new Sum(84, 56);
echo $sum->getOperationResult(), "\n";
$nonNegativeSubtraction = new NonNegativeSubtraction(9, 14);
echo $nonNegativeSubtraction->getOperationResult(), "\n";
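
For comparison, the composition alternative mentioned earlier (a Strategy built with Dependency Injection in mind) might look like the following minimal sketch. The Operator interface and the Calculation, Addition, Subtraction and RecordingOperator classes are hypothetical names introduced for illustration, not part of the sample above.

```php
<?php
// The varying step becomes a collaborator instead of an overridden hook.
interface Operator
{
    public function apply($a, $b);
}

class Addition implements Operator
{
    public function apply($a, $b)
    {
        return $a + $b;
    }
}

class Subtraction implements Operator
{
    public function apply($a, $b)
    {
        return $a - $b;
    }
}

// The wiring code lives in one concrete class, with no subclasses.
class Calculation
{
    private $_operator;
    private $_a;
    private $_b;

    public function __construct(Operator $operator, $a = 0, $b = 0)
    {
        $this->_operator = $operator;
        $this->_a = $a;
        $this->_b = $b;
    }

    public function getOperationResult()
    {
        return $this->_operator->apply($this->_a, $this->_b);
    }
}

// A hand-written test double: records its arguments, returns a canned value.
class RecordingOperator implements Operator
{
    public $calls = array();

    public function apply($a, $b)
    {
        $this->calls[] = array($a, $b);
        return 42;
    }
}

// Client code
$sum = new Calculation(new Addition(), 84, 56);
echo $sum->getOperationResult(), "\n";
```

With this structure each Operator can be tested alone, and Calculation can be tested with the recording double, so neither test drags in the logic of the other class.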

Monday, April 12, 2010

The dangers of Late Static Bindings

There's a lot of (justified) excitement about PHP 5.3's new features, such as the support for namespaces and anonymous functions. However, some glittering capabilities of the language are definitely not gold: the goto statement is probably the most debated example, but the long-awaited Late Static Bindings support is also a hammer which may hurt your fingers...

Technically speaking, Late Static Bindings give you the possibility to implement an effective class hierarchy with static classes. You can already see that there are two dangerous words in this definition: static and hierarchy.
In the code sample, you can see that pre-PHP 5.3 static method resolution was not affected by subclassing, as self always targeted (and will continue to target) the class it was written in. Therefore, the static keyword has been reused in the context of method calls, resolving to the class the static call was made on.
It's simpler to show it than to describe it:

<?php
class Base
{
    public static function getHelloText()
    {
        return 'Hello from Base!';
    }

    /**
     * 'self' is always resolved to the class it is
     * written in.
     */
    public static function helloWorld()
    {
        echo self::getHelloText(), "\n";
    }
}

class Subclass extends Base
{
    public static function getHelloText()
    {
        return 'Hello from Subclass!';
    }

    // helloWorld() is inherited
}

Subclass::helloWorld();

With Late Static Bindings:

<?php
class Base
{
    public static function getHelloText()
    {
        return 'Hello from Base!';
    }

    /**
     * 'static' is resolved at runtime (late binding)
     * to the class the method is called on.
     */
    public static function helloWorld()
    {
        echo static::getHelloText(), "\n";
    }
}

class Subclass extends Base
{
    public static function getHelloText()
    {
        return 'Hello from Subclass!';
    }

    // helloWorld() is inherited
}

Subclass::helloWorld();

Equivalent functionality is available through the function get_called_class(), which returns at runtime the name of the class a static method was called on. This was not possible at all before PHP 5.3, and discovering which class is called in a static method is the central point of LSB.
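
The three resolution mechanisms can be compared side by side in a small sketch. The Shape and Circle classes and their method names are hypothetical; self, static and get_called_class() are the actual language features.

```php
<?php
class Shape
{
    public static function name()
    {
        return 'Shape';
    }

    // self:: is bound to the class where this method is written.
    public static function describedBySelf()
    {
        return self::name();
    }

    // static:: is bound at runtime to the class the call was made on.
    public static function describedByStatic()
    {
        return static::name();
    }

    // get_called_class() returns the name of that same class as a string.
    public static function calledClass()
    {
        return get_called_class();
    }
}

class Circle extends Shape
{
    public static function name()
    {
        return 'Circle';
    }
}

echo Circle::describedBySelf(), "\n";   // Shape
echo Circle::describedByStatic(), "\n"; // Circle
echo Circle::calledClass(), "\n";       // Circle
echo Shape::calledClass(), "\n";        // Shape
```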

It's no mystery that static methods should be used sparingly and for particular use cases, as they are a procedural solution. Moreover, inheritance is an overrated method of code reuse that favors implicit assumptions and exposure over explicit decisions. In this article we will compare two examples: one where Late Static Bindings support proves useful and one where it does not.

Good example: abstract implementation of a Factory Method.
Let's suppose we have a small, two-level hierarchy of classes that emulate an enumerated type. If we have a static factory method on the base class, we can implement as Flyweights the whole hierarchy of objects with only one method:

<?php
/**
 * Every subclass will have a method getInstance()
 * that returns the singleton.
 */
class AbstractEnum
{
    private static $_instances = array();

    public static function getInstance()
    {
        $class = get_called_class();
        if (!isset(self::$_instances[$class])) {
            self::$_instances[$class] = new $class();
        }
        return self::$_instances[$class];
    }
}

class Subclass extends AbstractEnum {}

$subclass = Subclass::getInstance();
$otherSubclass = Subclass::getInstance();
var_dump($subclass === $otherSubclass);

This is only syntactic sugar for a hypothetical Base::getInstance('Class'), and the single objects must be mere ValueObjects or data containers with no behavior to mock out, since static methods are the death of testability.

Bad example: Active Record which does not evolve towards Repository
Suppose instead we have the classic ActiveRecord class, which we extend to produce a very simple object-relational mapper (most of the PHP ORMs are based on this pattern). If we have a find() static method on the ActiveRecord class, with LSB we can now make it work on every subclass, so that to find our users we can call User::find($id).
This is the death of testability and good design, however: how can we mock out this class from a service that uses it? We are stuck with a procedural, hardcoded User::find($id) call.
Instead, libraries such as Doctrine 2 have evolved towards the Repository pattern, extracting and managing a Repository object specific to User instances, which can be obtained from the library and which we can call find() on. Moreover, we can inject it into any service.

What can we learn from these two examples? Static methods are often the sign of a missing concept in the design, something which would encapsulate a collection of objects of the same class, or their shared metadata. Introducing a new object, be it a Table Data Gateway or a Repository, makes this concept explicit and promotes it to a first-class object, which you can pass around and mock as you want.
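
A minimal sketch of what this explicit concept buys us follows. The UserRepository interface, the in-memory implementation and GreetingService are hypothetical names for illustration; this is not Doctrine 2's actual API.

```php
<?php
class User
{
    private $_id;
    private $_name;

    public function __construct($id, $name)
    {
        $this->_id = $id;
        $this->_name = $name;
    }

    public function getId()
    {
        return $this->_id;
    }

    public function getName()
    {
        return $this->_name;
    }
}

// The missing concept made explicit: a collection of User objects.
interface UserRepository
{
    public function find($id);
}

// A stand-in usable in tests; a real one would talk to the database.
class InMemoryUserRepository implements UserRepository
{
    private $_users = array();

    public function add(User $user)
    {
        $this->_users[$user->getId()] = $user;
    }

    public function find($id)
    {
        return isset($this->_users[$id]) ? $this->_users[$id] : null;
    }
}

// The service depends on the interface, not on a static User::find() call.
class GreetingService
{
    private $_users;

    public function __construct(UserRepository $users)
    {
        $this->_users = $users;
    }

    public function greet($id)
    {
        $user = $this->_users->find($id);
        if ($user === null) {
            return 'Hello, stranger!';
        }
        return 'Hello, ' . $user->getName() . '!';
    }
}

$repository = new InMemoryUserRepository();
$repository->add(new User(42, 'Alice'));
$service = new GreetingService($repository);
echo $service->greet(42), "\n";
```

Because the repository is injected, the service can be exercised without a database, which a hardcoded static call would not allow.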

Saturday, April 10, 2010

The Apple of sin

I realize that I am biased against Apple as I'm not a fanboi of them and do not buy their products, but this time the situation is really ridiculous. As you may have read in one of the dozens of posts about the issue, Apple has decided that if you want to use the iPhone 4.0 SDK, you cannot choose the language to write your applications in (yes, the linked posts are biased too.)

3.3.1 … Applications must be originally written in Objective-C, C, C++, or JavaScript as executed by the iPhone OS WebKit engine, and only code written in C, C++, and Objective-C may compile and directly link against the Documented APIs (e.g., Applications that link to Documented APIs through an intermediary translation or compatibility layer or tool are prohibited).

It simply does not make sense. Well, it makes sense from their business perspective, but not from the users' and developers' point of view. Users get less choice of apps, while developers are forced to use a particular environment which they may not be familiar with. Someone said that this is a move to prevent compilation of Flash applications into C (like Facebook has done with HipHop for PHP), a possible solution to get them running on the iPad.
Many people have a confused perception of software development, so let's extend the metaphor to other fields and see what would happen if this prohibition were applied there:

you're not allowed to edit images that will be displayed by Apple products with Photoshop (oops, Adobe). You shall use the iPencil instead and scan your drawings into an iJpeg, which is like a normal JPEG but costs a dollar apiece.
you're not allowed to write your PDF files displayed on Apple products with OpenOffice.org and export them in this format. You must use iWork (this one really exists.)
you're not allowed to play musical instruments to produce songs that will be stored on the iPod or similar products. You must use GarageBand instead.

And of course, you're not allowed to write your own source code the way you want it, and then compile (oops, this is real.) But my source code is my own business: if I want to write it in Brainfuck, I'll definitely write it in Brainfuck. It's Turing-complete, so Steve, where's the problem?
Of course I will continue not buying anything from Apple.

Friday, April 09, 2010

Domain-Driven Design review

Domain-Driven Design: Tackling Complexity in the Heart of Software

Domain-Driven Design, by Eric Evans, is the masterpiece of the DDD movement. It introduced to the world the experience Evans harvested while practicing model-driven development on a varied group of object-oriented projects, presenting it as a series of patterns and meta-patterns (such as the Ubiquitous Language), and ranging from the design of a single class to a large map of different applications.

A brief summary
DDD is a very dense book, and it is logically divided into four parts:

Putting the Domain Model to work, which covers the importance of the Ubiquitous Language and the centrality of the model of the application domain.
The building blocks describes the patterns used at the finer level, such as Entity, Value Object, Aggregate, Service, Factory and Repository.
Refactoring towards deeper insight covers the continuous process of adapting the Domain Model to new insights and crunched knowledge, believing firmly that a hidden model exists and will be reached with a supple design. Modelling as described here is like applying a unification theory to software.
Strategic design introduces the techniques for large scale structure and separation of contexts, such as the famous Anticorruption Layer (although it is not the only communication medium between different Domain Models.)

In my opinion the third and fourth parts are the most difficult to digest, due to their abstractness (not every one of us has worked at that scale) and the perceived distance from working code. In fact, the first two parts are rich in code samples, which gradually vanish towards the end of the book while the view on object-oriented applications is taken to a higher level. Diagrams, both UML-based and not, are preferred throughout the last chapters.

How to read it
Many people start reading DDD full of confidence, scheduling five chapters a day, only to struggle very quickly, usually in the second part, and abandon it. Even in my recently completed reading from cover to cover, I waited some time before starting the third part.
This book is intended as a series of patterns - notice the capital letters used for most of the names: they are specific terms with a precise meaning. In the case of the Gang of Four book on design patterns, this made it hard to read cover to cover because of the lack of a common thread between the chapters.
But in this book's case, the main theme is the model-driven design practice that evolves from the finest level - the class and its methods - to Bounded Contexts and Responsibility Layers. That said, the information contained is very dense and I have never read more than 15-20 pages on a given day. I suggest you do the same, or you risk burning out before reaching the middle of the book. Taking notes also forces you to focus on the content and may help.

Other resources
There is a lighter book freely available - Domain-Driven Design quickly - which promises to explain DDD to you in 100 pages. Don't read it: when I tried, I understood a concept only if I had already read about it in the original DDD book, and I never learned anything from the material that was new to me. The original is 400 pages long, but it is already condensed as much as possible.
A book that expands on the code samples and the design process is Applying Domain-Driven Design and Patterns, which is a good read to combine with DDD (in fact, it is 600 pages long.) You may refer to it when you want more practical examples of a pattern treated in the original book.

In sum, this book is often cited as the DDD book, and in the future it may be considered the bible of real object-oriented development. If you have it on your shelf, I suggest you read as much as you can beginning from the start, skipping only the fourth part if you do not ordinarily deal with large-scale applications.

Thursday, April 08, 2010

phpDay 2010 (plus discount code)

I will be at phpDay 2010, the Italian conference on PHP and related topics, as a speaker.
There are many talks in English from foreign speakers such as Fabien Potencier, but my talk is in Italian. It is scheduled for the last day of the conference, 15th May 2010.
http://www.phpday.it/session/architettura-e-testabilita
Talk page on joind.in

Architecture and testability: the design of an application can be positively influenced by several practices. Ease of testing is a sufficient condition for an architecture that guarantees simple maintenance and high cohesion of its components. Topics covered: Dependency Injection, Law of Demeter, creational Design Patterns (Factory vs. Singleton), honest APIs.

If you are interested, you can book a seat here. I have a 20% discount code that I can give you if you ask me via email, as I'm not sure publishing it here is allowed.

Wednesday, April 07, 2010

HTTP verbs in PHP

While PHP is capable of performing HTTP requests towards external servers with any method, either via the HTTP extension or by opening streams directly, the support of the various GET, POST, PUT and other verbs on the receiving side of HTTP requests is a bit more complicated.

Background
The HTTP 1.0 specification (RFC 1945) officially defined only the GET, HEAD and POST methods, leaving open the possibility of adding extension methods:

Method        = "GET"                    ; Section 8.1
              | "HEAD"                   ; Section 8.2
              | "POST"                   ; Section 8.3
              | extension-method

The specification of HTTP 1.1 (in its latest incarnation, RFC 2616 from 1999) explicitly defines other methods:

Method         = "OPTIONS"                ; Section 9.2
               | "GET"                    ; Section 9.3
               | "HEAD"                   ; Section 9.4
               | "POST"                   ; Section 9.5
               | "PUT"                    ; Section 9.6
               | "DELETE"                 ; Section 9.7
               | "TRACE"                  ; Section 9.8
               | "CONNECT"                ; Section 9.9
               | extension-method

Of these methods, the most interesting ones due to their usage in RESTful applications are GET, POST, PUT and DELETE:

GET is a safe, idempotent method and it is used to retrieve a resource.
POST is considered a catch-all method nowadays, but its intent is to create a resource subordinate to the current one. For instance, posting to a blog resource may create a new post.
PUT is the analogue of GET used to send a resource to the HTTP server.
DELETE is the analogue of GET used to, of course, delete a particular resource.

Client support
GET and POST are the bread and butter of requests sent towards PHP applications. They are commonly generated directly by the browser. A GET request is generated by a link or by a form with its method attribute set to get. A POST request instead is usually obtainable in native HTML 4 only with a form, which may also contain an enctype attribute to set the Content-Type header of the request (usually employed for the upload of files via the POST method.)
Browsers often limit their support for HTTP requests to these two, as the HTML specification does not define a standard means to generate other types of requests on the client side. We have a programmatic way of making asynchronous HTTP requests, Javascript, but it does not help either as it's limited by the implementor's capabilities.
Javascript libraries do not impose particular restrictions on the allowed methods: we may send GET, POST, PUT or DELETE requests to an endpoint on the server, but since the requests go through an XMLHttpRequest object (which is their standard back-end), the unsupported methods will be emulated via overloaded POST. This means a POST request will be produced with an additional parameter (which may be named _method or _requestType, depending on the particular library) that describes the actual method used in the client. You get to maintain the semantics of the request in the client code, though, and maybe in the future native support for PUT and DELETE requests will be available.
There are ways to make a real PUT or DELETE request, and they usually require more complex infrastructure on the client, like Java applets or non-standard Javascript (which is not supported in the majority of browsers), or even not using a browser as the client at all, for instance when a web service acts as a client of another one.

Server support
But how can we detect the HTTP request method in a PHP script? The name of the method is available in $_SERVER['REQUEST_METHOD']. For the common methods, such as GET and POST, the superglobal arrays $_GET and $_POST are always available and contain the request parameters. You may want to wrap them with an object-oriented interface, and note that in the case of file uploads via POST you should also look at the $_FILES superglobal array.
For the other methods, the first thing to do is to set up your webserver with a directive that routes all the PUT requests to a single, dynamic entry point. In the case of Apache, this is described in the PHP manual as:

Script PUT /put.php

The script it points to simply has to read the body of the PUT request from the input stream:

$putdata = fopen("php://input", "r");

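The manual reads this stream in chunks with fread(), which keeps memory usage low for large uploads; for small bodies, file_get_contents('php://input') works as well. Welcome back to the world of CGI! :)
The manual and the user comments are also a valuable resource on the common pitfalls of implementing this type of endpoint.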
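A sketch of the chunked read, modelled on the approach of the manual. The copyStream() helper is hypothetical, and two php://memory streams stand in for the request body and for the destination file so that the example runs on its own; in a real endpoint the source would be fopen("php://input", "r").

```php
<?php
/**
 * Copies a readable stream to a writable one in fixed-size chunks.
 * Hypothetical helper, not part of PHP.
 *
 * @return integer  number of bytes copied
 */
function copyStream($source, $destination, $chunkSize = 1024)
{
    $copied = 0;
    while (!feof($source)) {
        $data = fread($source, $chunkSize);
        if ($data === false || $data === '') {
            break;
        }
        $copied += fwrite($destination, $data);
    }
    return $copied;
}

// Stand-in for fopen("php://input", "r")
$putData = fopen('php://memory', 'w+');
fwrite($putData, str_repeat('x', 2500));
rewind($putData);

// Stand-in for the file where the uploaded resource is stored
$target = fopen('php://memory', 'w+');
$bytes = copyStream($putData, $target);
echo $bytes, " bytes copied\n";
```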
With regard to DELETE requests, the same kind of configuration routes them to a single entry point. Then, it's a matter of identifying the right environment variables, which may vary depending on your HTTP server. Fortunately, PHP frameworks do most of the work, and you are free to programmatically implement different behaviors depending on the type of the request.
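As a rough sketch of what a framework does for you, the function below resolves the method of a request while honoring the overloaded POST convention described earlier. resolveRequestMethod() is a hypothetical helper, and the _method parameter name is only one of the conventions in use; the arrays are passed in explicitly (instead of reading $_SERVER and $_POST directly) so that the logic can be exercised without a web server.

```php
<?php
/**
 * Resolves the HTTP method of a request, honoring overloaded POST.
 * Hypothetical helper: "_method" is an assumed parameter name.
 *
 * @param array $server  typically $_SERVER
 * @param array $post    typically $_POST
 * @return string        upper-case method name
 */
function resolveRequestMethod(array $server, array $post)
{
    $method = isset($server['REQUEST_METHOD'])
            ? strtoupper($server['REQUEST_METHOD'])
            : 'GET';
    // only a POST may be promoted, and only to these methods
    $overridable = array('PUT', 'DELETE');
    if ($method === 'POST' && isset($post['_method'])) {
        $override = strtoupper($post['_method']);
        if (in_array($override, $overridable, true)) {
            return $override;
        }
    }
    return $method;
}

echo resolveRequestMethod(
    array('REQUEST_METHOD' => 'POST'),
    array('_method' => 'delete')
), "\n";
echo resolveRequestMethod(array('REQUEST_METHOD' => 'PUT'), array()), "\n";
```

Restricting the override to POST requests and to a short list of methods keeps a stray _method parameter from turning a safe GET into something destructive.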

Tuesday, April 06, 2010

Graphical tips for the average coder

I am not a front-end guy and surely not a web designer, but I promised myself not to avoid amateur graphical work on occasion. While doing photomontages and altering images such as photos and movie posters, I learned some basic techniques and I now take great advantage of the editor's tools. I want to share these insights with the average programmer, who like me has seldom used a graphics editor and is scared by design work.

Selection is underrated in the minds of us programmers. Rectangular, circular, magic wand selections and similar tools are very sophisticated and composable: addition, subtraction and inversion of selections are the norm in a graphic designer's workflow. Selecting the right part of an image is fundamental for applying the effects you want, such as brush strokes, gaussian blur or even a bucket fill, to the correct zone instead of spreading them all over the image.

Working at a high zoom level - from 200% to 300% - gives you the opportunity for fine surgery. I guess this is why graphic designers have giant monitors, besides fitting all the toolbars from Photoshop or GIMP on a single virtual desktop. Even with an enlarged main view, you can open another small view at a realistic scale to keep an eye on the overall picture.

Layers are the bread and butter of image editing: every decent editor lets you modify, show and merge them independently, and combine layers of different sizes and positions into a single image. If you are a web developer, I'm sure you have worked with CSS and the box model, so the concept of layering won't be a novelty. I store graphical works in the XCF format, the native GIMP format akin to the widely known PSD: both keep layers separate instead of rasterizing them together as a JPEG or PNG does. The relationship between XCF and JPEG or PNG can be thought of as the one between source code and binaries: when you start playing with the source code you are turning from a user into a programmer.

Transparency, often in the form of an alpha channel, is the glue that allows the composition of rectangular layers. Technically speaking, the alpha channel occupies 8 of the 32 bits of each pixel, so that every pixel can specify its red, green and blue content with a total of 24 bits, but also its transparency (or conversely its opacity) to the point of being completely invisible.
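For the programmers in the audience, the 8 + 24 bit arithmetic can be sketched with a few shifts and masks. The packPixel() and unpackPixel() helpers are hypothetical, and the 0xAARRGGBB layout is an assumption made for illustration: real image formats and libraries order the channels in different ways.

```php
<?php
/**
 * Packs four 8-bit channels into a single 32-bit value laid out
 * as 0xAARRGGBB (assumed layout, chosen only for illustration).
 */
function packPixel($red, $green, $blue, $alpha)
{
    return ($alpha << 24) | ($red << 16) | ($green << 8) | $blue;
}

/**
 * Extracts the four channels: each one is isolated by shifting it
 * to the lowest 8 bits and masking everything else away.
 */
function unpackPixel($pixel)
{
    return array(
        'alpha' => ($pixel >> 24) & 0xFF,
        'red'   => ($pixel >> 16) & 0xFF,
        'green' => ($pixel >> 8) & 0xFF,
        'blue'  => $pixel & 0xFF,
    );
}

$pixel = packPixel(255, 128, 0, 128);
$channels = unpackPixel($pixel);
foreach ($channels as $name => $value) {
    echo $name, ': ', $value, "\n";
}
```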

Finally, another tool I find very useful in my pixel-obsessed, mathematical weltanschauung of image editing is the guide, together with the automatic snapping of selections and layers to guides or to the image borders. I simply cannot produce decent work without setting up guides to attach layers to. I do not have a perfectly steady hand while moving a mouse, which is not a precision device, but the combination of zoom and snapping lets me work as if I were playing with Lego bricks or with my good old classes and objects.

Monday, April 05, 2010

How I learned to stop worrying and love new words

Because of the subject of my Bachelor's degree thesis, I am currently busy learning more and more about Java technologies, in particular the OSGi specification and the frameworks that implement it.
In the past, I saw articles about OSGi passing through DZone's feeds, and never cared much about it. It's possible and desirable to avoid contact with the many technologies we are not considering right now: somehow we have to limit the amount of new information in our self-improvement process to the fields that are actually useful.
However, the thesis subject (audio and video search), while interesting, involves a large amount of Java technology, in particular a framework built on top of the OSGi model. Here comes the pain: I did not know what OSGi was at all. Instead of continuing to worry about it, I decided to dive into OSGi, and I'd like to recall my steps here so that you can take a similar journey on a topic that you are required to know.

Step 1: Wikipedia
Wikipedia is the starting point of most of my research, even if it is not 100% reliable, as is the case with any community-crafted content. Wikipedia took me from an empty word (OSGi) to a definition:

The OSGi framework is a module system and service platform for the Java programming language that implements a complete and dynamic component model, something that does not exist in standalone Java/VM environments.

Moreover, well-written Wikipedia articles have many internal links to related material, both in the page body and in the See also section. Every new term you encounter usually points to its own definition, a scenario that can lead to a tab explosion but also to deep knowledge.

Step 2: define concepts in your own terms
I quickly discovered that OSGi applications are composed of bundles. The bundle term is part of the Ubiquitous Language of OSGi, but I did not know an exact definition. When you are surveying a technology, it's useful to start with a good approximation which uses concepts you have already grasped:

OSGi bundles are particular JARs which include some metadata files along with the hierarchically organized .class files that contain the bytecode. Bundles export or import the Java packages they respectively provide or require.

This is an approximation, since JARs and bundles do not strictly overlap. But it is a good one and lets me abstract away most of the bundle's internal organization for a while.

Step 3: resources to learn more

Google is often the best friend of a developer (as the GIYF acronym correctly says.) You can look for tutorials, but also for particular frequently asked questions.
Books on the subject, particularly if they contain good code samples, are the best road to a deep understanding of the technology. However they may not be the right material if you're only looking for a crash course.
YouTube videos are instead highly distilled knowledge, and the equivalent of a crash course. A 1-hour long talk can teach you very much about the assumptions and the usage of a framework like OSGi without effort: you just need to listen to the speaker.
A search on Wikimedia Commons will also bring you a lot of diagrams about your preferred technology (example), used throughout all the Wikimedia Foundation wikis.

Step 4: practice
My first practical step was producing an OSGi bundle and deploying it in an OSGi framework. I did it even before step 3 because I like to get a walking skeleton as soon as possible, but I went back to review my code once I had learned a larger part of the theory. Getting a running example is always a confidence booster, however, even if you are copying much of the code without knowing its meaning.

The same learning process is going on for me for other material I'm using in the thesis, such as BPEL and the SMILA framework. While there may be aliens with genetic memories, you shouldn't be afraid of new concepts: everything you know was learnt at some point in your life.

Friday, April 02, 2010

Practical Php Patterns: Strategy

This post is part of the Practical Php Pattern series.

The pattern of the day (and of the week) is one of the most important ones in object-oriented programming: the Strategy pattern. Its intent is encapsulating an algorithm into a specific object, defining a clear input and output exposed to the Context where the Strategy is used, to let them both vary independently.
The Strategy used in a Context object can change for configuration purposes, thus allowing the selection of a specific behavior. Other use cases include the tuning of non-functional requirements, such as quickly switching between algorithms that implement the same behavior with different performance or memory usage trade-offs (a classical example is sorting).
Finally, the use of Strategy objects simplifies unit testing. For example, injecting in-memory Strategies instead of disk-related ones is the standard way to test classes that depend on infrastructure.

Participants

Context: uses a Strategy object, outsourcing part of its behavior.
Strategy: contract that Context sees.
ConcreteStrategy: implementation of Strategy as a particular behavior.

Switch statements or if-else chains are candidates for refactoring to a Strategy pattern (as they are for the State pattern). The difference between the two is their intent: State encapsulates data and possibly the transition to other States; a Strategy object usually does not produce other Strategy implementations and hides complex behavior.
The implementation of the Strategy pattern usually follows the classical composition paradigm: Context has a private field reference to a Strategy, while Strategy may be shareable as a Flyweight if the Context passes to it the necessary parameters (or even itself) when calling its methods.
The composition of a Strategy object is a valuable alternative to inheritance: in my opinion Strategy can be thought of as a generalization of many patterns that gain their power from favoring composition. An Abstract Factory is in fact a Strategy dedicated to the creation of objects; an Adapter allows retrofitting an object as a Strategy for another one; and so on.

The code sample uses hidden Strategy objects for the sorting process of a Collection, in particular for comparing two values.

<?php
/**
 * The Strategy. Defines a behavior for comparing two objects
 * of the Collection.
 */
interface Comparator
{
    /**
     * @return integer  -1 if $a should precede $b
     *                  1 if $b should precede $a
     *                  0 if considered equal
     */
    public function compare($a, $b);
}

/**
 * The Context where the Strategy is employed.
 * Strategy is stored as a private field which can
 * be initialized only one time.
 */
class Collection implements Countable
{
    private $_elements;
    private $_comparator;

    public function __construct(array $elements = array())
    {
        $this->_elements = $elements;
    }

    public function initComparator(Comparator $comparator)
    {
        if (isset($this->_comparator)) {
            throw new Exception("A Comparator is already present.");
        }
        $this->_comparator = $comparator;
    }

    public function sort()
    {
        $callback = array($this->_comparator, 'compare');
        uasort($this->_elements, $callback);
    }

    /**
     * A representation for a clear output.
     */
    public function __toString()
    {
        $elements = array();
        foreach ($this->_elements as $value) {
            if (is_array($value)) {
                $value = 'Array with ' . count($value) . ' elements';
            }
            $elements[] = $value;
        }
        return '(' . implode(', ', $elements) . ')';
    }

    public function count()
    {
        return count($this->_elements);
    }
}

/**
 * A ConcreteStrategy that compares via the native operator.
 */
class NumericComparator implements Comparator
{
    public function compare($a, $b)
    {
        if ($a == $b) {
            return 0;
        }
        return $a < $b ? -1 : 1;
    }
}

/**
 * A ConcreteStrategy that compares via the result
 * of the count() function.
 */
class CountableObjectComparator implements Comparator
{
    public function compare($a, $b)
    {
        if (count($a) == count($b)) {
            return 0;
        }
        return count($a) < count($b) ? -1 : 1;
    }
}

// ordering numbers
$numbers = new Collection(array(4, 6, 1, 7, 3));
$numbers->initComparator(new NumericComparator);
$numbers->sort();
echo $numbers, "\n";

// ordering Countable objects
$first = array(1, 2, 3);
$second = array(1, 2, 3, 4);
$third = new Collection(array(1, 2, 3, 4, 5));
$objects = new Collection(array($third, $second, $first));
$objects->initComparator(new CountableObjectComparator);
$objects->sort();
echo $objects, "\n";
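
The remark that an Adapter can retrofit an object as a Strategy can be sketched in the same spirit. CallbackComparator is a hypothetical name, and the Comparator interface is repeated from the sample so that the snippet runs on its own: it wraps any PHP callable, such as the native strcmp(), so that it can be used wherever a Comparator is expected.

```php
<?php
// Repeated from the sample above to keep this snippet self-contained.
interface Comparator
{
    public function compare($a, $b);
}

/**
 * Adapter: retrofits a callable as a Comparator Strategy.
 */
class CallbackComparator implements Comparator
{
    private $_callback;

    public function __construct($callback)
    {
        if (!is_callable($callback)) {
            throw new InvalidArgumentException('A callable is required.');
        }
        $this->_callback = $callback;
    }

    public function compare($a, $b)
    {
        $result = call_user_func($this->_callback, $a, $b);
        // normalize to the -1, 0, 1 contract of Comparator
        if ($result == 0) {
            return 0;
        }
        return $result < 0 ? -1 : 1;
    }
}

// ordering strings with the adapted native function
$words = array('pear', 'apple', 'fig');
usort($words, array(new CallbackComparator('strcmp'), 'compare'));
echo implode(', ', $words), "\n";
```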

Thursday, April 01, 2010

PHP 6 finally released

The versioning process of PHP has been exceptionally modified to clear up the situation around the long-expected PHP 6. The last stable release, PHP 5.3.2, has been transformed into PHP 6 using the powerful Unix-based tool sed and finally released as PHP 6.0.

The goals of this new major release are multiple: the first is simply to set in stone, once and for all, the features included in PHP 6, regarding for example the Unicode implementation. To avoid conflicts within the development team, which were dividing the volunteers into groups that advocated different solutions, the encoding used to store strings will be UTF-1, where the 1 stands for 1K, the size of the standard bitmap-based character glyphs which are strung together to compose the source code of PHP 6 scripts. No more question marks (?) will be displayed due to character set issues, at least while reading PHP source code.
The principal IDEs with PHP support, like Eclipse PDT and Zend Studio, have already announced the upgrade of their principal tools to simplify the editing process in the next days. This is probably the final death of the ugly old-style text editors like Vim which plagued the hardware vendors for years, delaying the adoption of new workstations needed to run the IDEs.
However, the selection of the font to use in the set of glyphs is still being discussed on the developers' mailing list, along with the standard size, which must be readable even by PHP developers with imperfect eyesight.

A second, important objective is to finally give dignity to the large set of PHP 6 books released in the last three years. By re-releasing PHP 5.3 as PHP 6, there is finally the possibility of reading these books without wondering what PHP 6 is, or whether the real release of PHP 6 has brought to the public different features from the ones described in a Professional PHP 6 book. Now the books' publishers can claim to have told the truth all along about PHP 6 and its support for namespaces and anonymous functions, with the notable exclusion of the little transition from a character-oriented programming language to an image-oriented one.

To give you a picture (no pun intended) of this radical solution, which has solved the Unicode support and the blog post syntax highlighting problems in a single shot, I will include here the source code for a hello world program written in PHP 6.

Save the file and run php6 hello-world-php6.png to see PHP 6 at work.
Here is a sample from my Ubuntu terminal:

Note that PHP 6 added a ! and a new line, which were inferred from my enthusiasm in building and running it.
