Invisible to the eye

Wednesday, March 31, 2010

India, and the Great Indian Developer Summit

At least according to Google Analytics, and leaving aside my Western audience for a moment, I have a significant number of readers from India. One of the advantages of the Internet era is that it lets us communicate daily with distant parts of the world. Here in Italy, India is famous for its history as one of the most important centers of commerce in the world, and for its varied culture and religions.
But India also has a somewhat controversial reputation in the Western software niche: some think it is a rapidly growing country after its economic liberalisation and that Indians will soon take all our jobs, while many others imply that India is the ideal destination for low-quality work you don't want to pay much for.
For example, the Pragmatic Bookshelf used to sell a book named My Job Went to India: 52 Ways to Save Your Job. It is a very good book whose goal is improving your career path. That said, it has nothing to do with India: the title was chosen to shock, because of the outsourcing hype at the time. It was subsequently renamed The Passionate Programmer.

While it is true that the cost of living in India (at least in a large part of the country) is lower than in Western countries, I do not believe the fable that there are only terrible developers there. India is a large country, and even if the statistical distribution of educated developers were the same as in Europe and the United States, it would be normal to encounter a vast number of mediocre developers. We encounter them every day in our own cities.
If you are an Indian developer, the fact that you're reading this suggests that the tail of the curve also contains quality-aware, conscientious programmers, reflecting the overall situation of every developed country. I think that here in Italy we have a percentage of simply bad developers in the web engineering field at least equal to India's, given all the people who jumped on the bandwagon of the information era.

By the way, a reader pitched me an Indian event for programmers. I hope, if you are an Indian reader, that it interests you and that I have not wasted your time.
This event is the Great Indian Developer Summit, perhaps the biggest conference for Indian software developers. It will be held in Bangalore, from April 20 to 23. The majority of the Indian visits to this blog come from Bangalore, more than from other important cities like Calcutta. Is Bangalore a technological centre? For comparison, the majority of Italian visits come from Milan, while the peak of US visits is from San Francisco.
The conference is composed of 80 sessions, divided across the four days by technology or programming language: .NET, web-related, Java, and workshops.
As for the topics, I am not an expert in .NET technologies, but those covered include ASP.NET, SQL Server 2008, Visual Basic 2010, C#, Azure and Silverlight, among others. For the web day, we have Rich Internet Applications, Ajax libraries (Dojo, jQuery) vs. Flash, HTML5, and frameworks such as Ruby on Rails and the Python-powered Django.
The third day is dedicated to Java: the talks are mainly about frameworks (Spring, Struts, GWT, Wicket) and alternative languages that compile to bytecode for the JVM (Scala, Groovy, JRuby). The fourth day comprises workshops on Java, Cloud Computing and rich applications, Agile development, and Microsoft technologies. A free Internet connection is also provided during all four days.
A conference provides many learning and networking opportunities, especially in a big city like Bangalore. I'm only a beginner when it comes to these events, but I can see how a full immersion in this environment can benefit an average web developer.

Original image of the Taj Mahal from Wikimedia Commons.

Tuesday, March 30, 2010

Always code as...

In programming there's an old saying that goes like this:

Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live. -- attributed to John Woods

I must say he's right. Fortunately there are usually no psychopaths hanging around on your legacy applications, but it's a good karma rule to adhere to. And you may end up maintaining your own code, which is an outcome I hope you had considered. :)
I'd like to extend this maxim with some other suggestions similar in form.

Always code as if you were pair programming (mental/social metaphor)
When we are pair programming, we have to explain what we are doing. We get into a mindset that challenges our code to make it better. We never get angry at the machine or at a library's developers.
A benefit of pair programming is having a different mind that works over the same problem, so that there is a failsafe in place to stop bugs or technical debt from being introduced. But the habit of trying to explain, or at least state, a problem before solving it is a productivity booster, even if you only talk to a rubber duck. This habit is enforced by TDD too.

Always code as if you were paying your lines' weight in gold (financial metaphor)
The less code you write to solve a problem, the less code you'll have to maintain: code is widely considered a liability more than an asset (a high-level programming language is crucial here.) Moreover, there is a limit to the lines of code you can write every day while maintaining an acceptable level of quality.
You should favor verbosity only to improve readability and encapsulation: the right trade-off is difficult to find here, but introducing new domain concepts as classes or methods is often a valuable asset that outweighs the burden of the extra code, as long as they are significant for the application (e.g. customer's credit card number vs. customer's eye color.)
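
As a minimal sketch of what introducing a domain concept as a class can look like, here is the credit card number example in Python (chosen only for illustration). The CreditCardNumber name, the length check and the masked() method are assumptions of mine, not taken from any real application:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CreditCardNumber:
    """Hypothetical value object wrapping a raw card number string."""

    digits: str

    def __post_init__(self):
        # Normalize separators once, so the rest of the code never has to.
        cleaned = self.digits.replace(" ", "").replace("-", "")
        if not cleaned.isdigit() or not 12 <= len(cleaned) <= 19:
            raise ValueError("not a plausible card number")
        # Frozen dataclasses need object.__setattr__ inside __post_init__.
        object.__setattr__(self, "digits", cleaned)

    def masked(self):
        """Return the number with all but the last four digits hidden."""
        return "*" * (len(self.digits) - 4) + self.digits[-4:]
```

The few extra lines buy a single place where validation and display rules live, which is the kind of verbosity the paragraph above considers worth paying for.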

Always code as if you had to deploy and use your application at the end of the day
Which may be the case if it's a web application.
Portability is not a feature you can add as a single user story: the best way to make an application portable, configurable, deployable and most of all working is to build it as simply as possible with these characteristics (a walking skeleton), and keep them while you expand the codebase with new features.
This practice reduces risk (chances that you may find your ideal deployment requirements cannot be met) and helps early automation of the most boring tasks, defining them clearly as soon as possible. Once the technicalities have been removed and a clean environment is ready, you will be free to work on the pure domain model.

Monday, March 29, 2010

The TDD checklist (Red-Green-Refactor in detail)

I have written up a checklist to use for unit-level Test-Driven Development, to make sure I do not skip steps while writing code, at a very low level of the development process. Ideally I will soon internalize this process to the point that I would recognize smells as soon as they show up the first time.
This checklist is also applicable to the outer cycle of Acceptance TDD, but there the Green part becomes much longer and includes writing other tests. Ignore this paragraph if it confuses you.

TDD is described by a basic red-green-refactor cycle, constantly repeated to add new features or fix bugs. I do not want to go too deep into object-oriented design in this post, as you may prefer different techniques than I do, so I will concentrate on the best practices to apply as early as possible in the development of tests and production code. The checklist is written in the form of questions we should ask ourselves while going through the different phases, questions that are often overlooked because of the perceived simplicity of this cycle.

Red
The development of every new feature should start with a failing test.

Have you checked in the code in your remote or local repository? In case the code breaks, a revert is faster than a rewrite.
Have you already written some production code? If so, comment it out or (best) delete it, so that you are not implicitly tied to an API while writing the test.
Have you chosen the right unit to expand? The modified class should be the one that remains most cohesive after the change, and often new classes should be introduced instead of accommodating functionality in existing ones.
Does the test fail? If not, rewrite the test to expose the lack of functionality.
Does a subset of the test already fail? If so, you can remove the surplus part of the test, avoiding verbosity; it can come back in different test methods.
Does the test prescribe overly specific assertions or expectations? If so, loosen the mock expectations by not checking the order of method calls or how many times a method is called; improve the assertions by substituting equality matches with matches over properties of the result object.
Does the test name describe its intent? Make sure it is not tied to implementation details and works as low-level documentation.
How much can you change in a hypothetical implementation without breaking the test? If the answer is very little, the test is brittle.
Is the failure message expressive about what is broken? Make sure it describes where the failing functionality resides, highlighting the right location if it breaks in the future.
Are magic numbers and strings expressed as constants? Is there repeated code? Test code refactoring is easy when done early and while a test fails, since in this phase it is more important to keep it failing than to keep it passing.

Green
Enough production code should be written to make the test pass.

Does the production code make the test pass? (Plainly obvious)
Does a subset of the production code make the test pass? If so, you can comment out or (best) remove the unnecessary production code. Any extra lines you write are untested lines you'll have to read and maintain in the future.
Every other specific action will be taken in the Refactor phase.
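
A minimal sketch of the Green phase, again in Python for illustration only, assuming a hypothetical Invoice class and a single test that expects the total to exceed the net price:

```python
import io
import unittest


class Invoice:
    """Hypothetical unit under test, with just enough code to go green."""

    def __init__(self, net_price, tax_rate):
        self.net_price = net_price
        self.tax_rate = tax_rate

    def total(self):
        # The simplest implementation the current test demands.
        # No rounding, currencies or discounts: no test asks for them yet.
        return self.net_price * (1 + self.tax_rate)


class InvoiceTest(unittest.TestCase):
    def test_total_is_greater_than_net_price_when_taxed(self):
        invoice = Invoice(100.0, 0.20)
        self.assertGreater(invoice.total(), invoice.net_price)


def run_green_phase():
    """Run the test case quietly and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(InvoiceTest)
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(suite)
```

Anything beyond the one-line total() would be code that no test exercises, which is what the second question of this phase is meant to catch.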

Refactor
Improve the structure of the code to ease future changes and maintenance.

Does repeated code exist in the current class?
Is the name of the class under test appropriate?
Do the public and protected method names describe their intent? Are they readable? Rename refactorings are among the most powerful ones.
Does repeated code exist in different classes? Is there a missing domain concept? You can extract abstract classes or refactor towards composition. At this higher level the refactoring should also be applied to the unit tests, and there are many orthogonal techniques you can apply, so I won't describe them all here.
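
As one possible illustration of the last item, the sketch below (Python, with hypothetical Money, Invoice and Receipt classes, and amounts assumed to be non-negative) shows duplicated formatting logic replaced by a small extracted domain concept used through composition:

```python
class Money:
    """Hypothetical domain concept extracted from duplicated formatting code."""

    def __init__(self, cents, currency="EUR"):
        self.cents = cents
        self.currency = currency

    def __str__(self):
        return f"{self.cents // 100}.{self.cents % 100:02d} {self.currency}"


class Invoice:
    def __init__(self, total):
        self.total = total  # a Money instance, used by composition

    def summary(self):
        return f"Invoice total: {self.total}"


class Receipt:
    def __init__(self, paid):
        self.paid = paid  # the same Money class, no copied formatting logic

    def summary(self):
        return f"Paid: {self.paid}"
```

Before the refactoring, both Invoice and Receipt would have carried their own copy of the formatting expression; afterwards the rule lives in one place and can be unit tested on its own.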

Feel free to add insights and items to the list in the comments. I very much value feedback from other TDDers.

Saturday, March 27, 2010

Weekly roundup: March is ending

I just wanted to inform you about some content you may find interesting.

This week I have published three new articles at php|architect:
Google releases skipfish
Impel, the Javascript ORM
Ten Top PHP people to follow on Twitter

Moreover, my post Contributing to open source projects has been republished on DZone.

Friday, March 26, 2010

The rest of the NakedPhp walkthrough

This post's title is a parody of The Rest of the Robots.

In the previous post we saw how to manipulate PHP objects directly, and how to call methods on them, thanks to NakedPhp's user interface. Today we will see how to interact with the database: essentially, how to store objects in it and retrieve them via Repositories.
The situation in which we left the example application was the following:

There are two Example_Model_City objects in memory (London and Paris), and an Example_Model_Place (Eiffel Tower) which references Paris with a many-to-one directional association.
We are ready to save our session. Generally, a subset of the application's object graph is instantiated in memory, modified and put back in the storage. We are not assuming that the storage is a relational database: it is simply a component that takes an object graph and persists it in some way. In PHP, one of the few libraries that can do this is Doctrine 2, which will map our objects onto a set of relational tables.
We check that there are no objects to remove, and then click Save:

Note that after saving the session, the objects' color changed. In the example application I associated via CSS rules green to transient objects and orange to managed ones. The difference between the two is that a transient object will vanish if we let the session expire, while a managed one is already present in the storage (so it should be actively removed if we want it to vanish.) The save action response tells us three new entities have been persisted, and no managed ones. Moreover, no objects have been removed.
Let's close this session and click Clear.

The session is now empty, and in the last screen we have already gone to the PlaceFactory object. We notice that a new findAllCities action is available. It has appeared now because there is a hideFindAllCities() method on Example_Model_PlaceFactory, which returns true as long as there are no cities in the database (Services have access to external infrastructure like storage and whatever else we want them to use, while Entities usually have no external references since they have to be serialized.)
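
The naming convention just described can be sketched as follows. This is a language-neutral illustration written in Python, not NakedPhp's actual PHP implementation: the visible_actions() helper, the in-memory list standing in for the database and the simplified PlaceFactory are all assumptions of mine.

```python
class PlaceFactory:
    """Hypothetical service, loosely modeled on Example_Model_PlaceFactory."""

    def __init__(self, stored_cities):
        self._stored_cities = stored_cities  # stands in for the database

    def findAllCities(self):
        return list(self._stored_cities)

    def hideFindAllCities(self):
        # Hide the action for as long as the storage contains no cities.
        return len(self._stored_cities) == 0

    def createCity(self, name):
        return name


def visible_actions(service):
    """List public methods whose hide<Action>() companion does not return True."""
    actions = []
    for name in dir(service):
        if name.startswith("_") or name.startswith("hide"):
            continue
        if not callable(getattr(service, name)):
            continue
        hider = getattr(service, "hide" + name[0].upper() + name[1:], None)
        if callable(hider) and hider():
            continue
        actions.append(name)
    return actions
```

With an empty storage only createCity would be listed; once a city is stored, findAllCities shows up as well, mirroring what happened in the walkthrough after saving.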

The findAllCities action brings us back all the cities stored in the database. This is an "object" - really an array, but every variable saved in the session is wrapped in a NakedObject instance.
The difference between a normal object and an array is that a wrapped array has a Collection Facet, and the views recognize this Facet and treat it differently. Particularly, they list the contained objects and provide access to them.
Finally, note that this Entity is managed and not transient, because it has been retrieved from the storage.

By clicking on the row of an item of the collection, the object is extracted into the session (it is considered transient because the semantics for extracted objects have not been written yet. This will be fixed in the future.) It still refers to the same instance in the collection, so if we modify one we'll see the changes in the other.

We now see an example of method call with object parameters. The createPlaceFromCity action will take a City and create for us a new Place object with the city property already set. But we don't like the current cities: we want another one.

As was the case when editing Entities, the context is preserved between different method calls. We have clicked on createCity and we are now creating Rome.

The new object is available for the original method call now.

And finally, the action returns a new Place object which is stored in the session. The city has been set correctly.

That's all for now. I have presented a practical example of NakedPhp's features and of the direct manipulation of objects, which have been persisted, retrieved, and moved around.
Now I can go back to adding other features, such as semantics for the removal and extraction of objects, and a complete method merging feature. What do I mean? It will be useful to have the createPlaceFromCity method on every City object as well, so that, given a city, it presents the user with a Factory Method for Places. In this case, we could have put the method on the Example_Model_City class, but such a method may require collaborators which we should not inject into City: imagine sendByMail(City) or searchSimilar(City) methods, which access mailers or the database. With method merging, any service method which has an object of class A among its parameters will be callable from A objects as well, with the particular A parameter automatically passed and hidden.
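
A rough sketch of the method merging idea, in Python for illustration only. NakedPhp is written in PHP and reads phpdoc annotations, so the merged_methods() helper, the type hints and the simplified classes below are assumptions, not the framework's real code; the sketch also matches only the exact class, not subclasses.

```python
import functools
import inspect


class City:
    def __init__(self, name):
        self.name = name


class Place:
    def __init__(self, name, city):
        self.name = name
        self.city = city


class PlaceFactory:
    """Hypothetical service whose methods may take entities as parameters."""

    def createPlaceFromCity(self, city: City) -> Place:
        return Place("Default Name", city)

    def createCity(self, name: str) -> City:
        return City(name)


def merged_methods(entity, service):
    """Return the service methods that accept `entity`, with it already bound."""
    merged = {}
    for name, method in inspect.getmembers(service, inspect.ismethod):
        if name.startswith("_"):
            continue
        for param in inspect.signature(method).parameters.values():
            if param.annotation is type(entity):
                # Bind the entity so the caller no longer sees that parameter.
                merged[name] = functools.partial(method, **{param.name: entity})
                break
    return merged
```

Given a City, merged_methods() would offer createPlaceFromCity as a zero-argument action, while createCity is left out because none of its parameters is a City.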

The method merging feature was already present in NakedPhp but is currently broken (if I had acceptance-TDDed it, it wouldn't have been.)
See you in April for NakedPhp 0.2 and some news!

Thursday, March 25, 2010

A NakedPhp walkthrough

Today I'll present a walkthrough of the NakedPhp example application, to show the direct manipulation of objects it provides, with features like calling methods and editing Entity objects.
The example application manages a Domain Model that contains places (shops, sights, and so on) and cities. Places have a directional many-to-one relationship with cities.
I will upload a screenshot for every step. The graphics are very basic, but they are not a responsibility of the framework. Every application can write its own layout, complete with style sheets, and assemble the content pieces differently.

Supposing we have set up the example application correctly, to start working we have to load the default action of the naked-php controller. In my Apache configuration I set up the public/ directory as the DocumentRoot of the example virtual host. There are really no differences from other Zend Framework applications.

The starting page shows in the header the two declared services (managed as singleton-like instances), an empty session and a context bar, which we can ignore for now. Classes of a Domain Model are divided in two parts: Services (always available and never serialized) and Entities (managed in the session bar and passed around). Factories and Repositories go under the Services umbrella, while ValueObjects for now are not supported because the underlying object-relational mapper does not support them yet.
Clicking on the PlaceFactory service in the header will send us to the object Example_Model_PlaceFactory, with a list of the available actions (exposed methods, not filtered for now):

If we click on createPlace, a new Example_Model_Place object will be created by this factory method. This method has no parameters and creates an instance with default values. We are redirected to that instance, which is now in the session bar:

If we click again on PlaceFactory and choose the createCity action, a form will be shown since this method requires one parameter (the name of the city):

We insert London as the name and submit the form. A new object is put in the session bar and we are redirected to it, as it is the result returned by the method. The session bar keeps objects in the PHP session: we are not touching the database. This means that entities should be serializable, and this leaves us free to use Plain Old PHP Objects that do not extend any framework class. The only requirement is that the phpdoc annotations and a few other ones are present to determine the type of parameters and method return values.
After the creation and the redirect, a list of the properties of the Example_Model_City object is shown:

Note that the different object specifications (classes) are distinguished by different icons.
If we go back to the Default Name Place by clicking on it and follow the link on the pencil, an editing form is generated based on the setters available on the object.
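
How a form can be derived from setters may be sketched like this. It is an illustration in Python under my own assumptions (the editable_fields() helper and the simplified Place class are hypothetical); NakedPhp does this in PHP and its real implementation may differ.

```python
class Place:
    """Hypothetical plain entity with conventional getter and setter methods."""

    def __init__(self):
        self._name = "Default Name"
        self._city = None

    def getName(self):
        return self._name

    def setName(self, name):
        self._name = name

    def setCity(self, city):
        self._city = city


def editable_fields(entity):
    """Derive form field names from the set<Field>() methods of an object."""
    fields = []
    for name in dir(entity):
        is_setter = name.startswith("set") and len(name) > 3
        if is_setter and callable(getattr(entity, name)):
            # setName -> name, setCity -> city
            fields.append(name[3].lower() + name[4:])
    return fields
```

For the Place above this yields one field for the city and one for the name, which matches the two inputs edited in the walkthrough.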

Note that the context bar now contains one more link (other than Index). When a form for selecting method parameters or object fields is shown, the context is preserved. If we don't like the objects we have in the session, we can go around looking for better ones, or create them. In the example, we want to create the Eiffel Tower Place and since it is not in London, we need a Paris City object. So even if we are on the form, we simply go to the PlaceFactory service and select the createCity action again:

The context has grown again (it can be reset by going to the index if someone screws up.) Then we submit the form and we are redirected to the last action we were calling, the editing of the entity 1:

We can now select Paris for the city field and change the name to Eiffel Tower, then submit the form. Unfortunately collections are not supported and we will get an error, but the process works well until it encounters the events collection (not yet implemented). If we simply reload the index page and click on the newly renamed Eiffel Tower object, we'll see it has been correctly edited.

We have worked only in memory and have not had any interaction with the database yet. This post is getting long, so tomorrow I will show you the second part of the walkthrough, where we save these entities and retrieve them with a method that is hidden or shown automatically based on the current state.

Wednesday, March 24, 2010

PHP in Action review

PHP in Action is a hands-on PHP book written by Dagfinn Reiersol, Marcus Baker and, most notably, Chris Shiflett. PHP in Action is maybe the only PHP-specific book which bridges typical PHP topics, such as forms and database handling, to object-oriented design. It is very rare to encounter a book like this, which teaches object-oriented programming in the PHP environment from a non-naive point of view (not just "How do I write those classes?"). PHP is still catching up with other languages in this field and many developers can only benefit from improving their modelling skills and design practices.

As I said, this book is PHP-specific; though, many other titles proclaim they're teaching object-oriented PHP on their covers, while the only touched topics are public and private fields, and how to extend classes with inheritance (if that seems normal to you, read this book.) Many publishers jumped on the bandwagon of PHP 5 and proposed books focused on the language constructs instead of the things you can build with them.
PHP in Action is a bit different. For example, it includes some of the SOLID principles and examples of their application in PHP code, without too many assumptions about the overall knowledge of the reader. The most important Design Patterns are explained, with an eye to the native support offered by PHP 5 (SPL).
Advanced techniques (for the average developer) are also introduced, such as refactoring, unit testing and Test-Driven Development. This is by no means an in-depth read on these topics, but the average developer who has a deep understanding of the PHP technology (but not of OO, as decent support was introduced in the language only a few years ago) will find this book useful to start upgrading his skills to the next level. I think this is a common situation, and it was also mine; if I had found this book earlier, my journey would have been simpler as I wouldn't have had to translate knowledge from Java books. I hope these advanced parts will become standard in the future.
That said, there is really no PHP book that describes in full depth object-oriented design. There are specific books on object-oriented development, which are very long and insightful and still not complete. These books usually choose Java for their code samples (or C++ if they're very old); you may want to refer to different titles for pure object-oriented learning.

As for the material provided, the code samples are inserted in each chapter, and are also refactored as iterative development takes place. There are many small UML diagrams to help the reader understand what's going on - mainly class and sequence diagrams. Highlighting of relevant code and changed lines is the norm, along with ordered explanation lists, linked to different points of the code samples, that replace intrusive comments.

The level of the book is adequate for the intermediate coder, so I found it easy to read. Nevertheless, it is a good overview of the PHP landscape in terms of the transition to object-oriented programming. The first edition is from 2007, and it is not outdated; still, you may consider using a framework to provide much of the infrastructure seen in the book, which is presented more for teaching than for actual everyday usage.

Some of the links in this post are affiliate links.