Monday, November 16, 2009

Defaulting to private

While writing classes, which scope do you normally choose for your fields and methods? Do you consciously choose a visibility or just stick with what your IDE proposes?

Before dig in the philosophical discussion, let's summarize the different visibilities available in object-oriented languages:
  • Public: no limitations.
  • Package: scope level available in Java but not in C++ and php. The member can be accessed by code that resides in its own package.
  • Protected: the member can be accessed only by its own class and by subclasses. Similarly, Friend visibility in some languages is used to allow friend classes to access the field or method.
  • Private: the member can be accessed only by code of its own class.
Note that I said its own class, and not its own object, as private fields are usually accessible by other objects of the same class.
<?php
class ComplexNumber
{
    private $_real;
    private $_imaginary;

    /* constructor and other methods... */

    public function equals(ComplexNumber $another)
    {
        if ($another->_real == $this->_real 
        and $another->_imaginary == $this->_imaginary) {
            return true; 
        }
        return false;
    }
}
The reason behind this behavior is that encapsulation with limited visibility facilitates changing the code. If we are modifying the $_real and $_imaginary fields we are changing the class, so there is no problem in limiting visibility to the class code instead of a particular object (forcing an object to access only its own private fields and not its brothers' ones).

I am a proponent of test-first approaches to software development and this means I often implement in production code the simplest thing that could possibly work, and that makes my tests pass. Another limitation I follow during development is visibility: if there are no tests that access a property or a method, there is no reason for it to be public.
Whenever I create a new class member, being it a field or a method, I default to private or protected for its visibility. Only if a method is the subject of a test it becomes public, while it is very rare that I need a public field.
This rule of thumb gives the code the advantage of increased encapsulation, since public visibility is chosen only if mandatory. The distinction between private and protected is relevant only if you allow subclassing, an action that you usually control if your code is not part of a framework or public library.

In php 4 there were no visibility modifiers, and all members were public. This was one of the serious limitation of supporting php 4 for object-oriented applications, like CakePHP did. Apart from naming conventions, there was no way to tell apart methods which were in the Api and internal ones, which could change in any subsequent release. You can tell developers that private methods start with '_', but if there is no forced limitation of scope it is very simple for a developer to do the quick hack such as calling an internal method.
Thus, the advantage of greater encapsulation is to keep the Api small, reducing coupling and having less method signatures carved in stone. Any source file can call a public method, so you cannot simply change its parameters and side-effects. While renaming a method is often simple thanks to modern IDEs (or sed), if you change the behavior of the method you have to review every place in the codebase to make sure there are no incorrect assumptions.
On the other hand, if I decide to expose a method as public, I want a test that forces me to comply and that would be red if the visibility is different. This process explicates the contract of the class, and avoid breaking a method in the future.

Sometimes refactoring produces a private method worth of testing. However, a private method cannot be tested directly but only trough public methods of the same class. Still, it is covered indirectly because if it were not you would simply remove it as unreachable code.
If you feel like testing it independently, it is probably the sign this method carry out tasks out of the current class's responsibility, and you should move the private method in a collaborator (which may have to be created from scratch).
Encapsulation would be maintained since the collaborator would be stored as a private property of the SUT. If the method cannot be moved out and cannot be exposed, it should not be tested: you maintain the freedom to change it later as long as your public methods does not make the tests red.

So, whenever you are coding object-oriented applications, I suggest to keep as many methods as you can private, and expose public methods only for contracts between classes.

2 comments:

Unknown said...

I think this is good advice and I second it. It especially pays off the larger the system and the code base is and the larger the userbase of the codebase. Good API design is incredibly hard and backwards compatibility is a major issue. As soon as you have a protected member and your class is not final you can theoretically not even rename the member without potentially breaking BC. If the class was not intended to be inherited then it might be a good idea to make it final (this may have some negative consequences on testing/mocking though).

It really pays off to invest in good APIs and member visibility is a very important part of that. Its easy to relax visibility later but much harder to go the other way around so starting with the lowest visibility that is needed to get the job done is good advice.

Some random thoughts:

- protected in Java actually allows access from all classes in the same package. Thats quite confusing to many people.

- I only know of Scala supporting object-private accessibility but maybe others do, too. In Scala its:

private[this] val foo = "hello"

Good post!

Giorgio said...

Thanks Roman,
backward compatibility is important for classes that are meant to be used externally and I'm sure you have a lot of them in Doctrine 2. :)
I just can't make class final anyway, unless I have already an interface which I can mock instead of the concrete class, which is a fine approach but not always practical. Otherwise, making everything final (such as method parameters and class members) would be very defensive.

ShareThis