Saturday, July 04, 2009

Keep growth under control

Systems grows and also classes do, but we should respect the Single Responsibility Principle. Rule of thumb for noticing potentially God classes.
In a good designed oop application, every class should do one thing, and doing it well, paraphrasing the unix philosophy.
The Single Responsibility Principle (SRP) is one of the five SOLID principles that governs object oriented programming.

A class should have one, and only one, reason to change.

Real world
But in reality, when you open .java and .php files or whatever else, you often find yourself scrolling up and down the class code to fix disparate things. What can be done to keep classes navigable and simple?
I have my quick dirty approach, that uses unix shell; it takes advantage of the one-to-one correspondency of class and source file, and also of the fact that:
Measuring programming progress by lines of code is like measuring aircraft building progress by weight.
-- Bill Gates
In my opinion, lines of code are a metric, but not the way many thinks: the more lines you commit, the more bloat you introduce. If you absolutely need hundreds of lines of code, you should at least divide them in a bunch of methods, functions, files and classes to keep them manageable.
This cli interface is available on ubuntu and every other linux distribution, since this commands are tipically built-in.

$ wc `find library/Otk/ tests/ -name '*.php'`| sort -n

Obviously you can replace library/Otk and tests/ with folders you like, and .php extension with .java or .c one.
What this command outputs is the full list of source files ordered by line count:
3 6 66 tests/TestHelperMysql.php
3 6 67 tests/TestHelperSqlite.php
8 21 202 tests/stubs.php
273 766 7975 library/Otk/Image.php
274 641 8949 library/Otk/Controller/Scaffolding.php
281 787 9713 library/Otk/Form/Generator.php
10298 24649 315267 total

The output of wc is defined as:
linesCount wordsCount charsCount fileName
so you will notice on the last lines of the output the files that contains the highest number lines. All lines are counted: blanks and comments are included. You might want to use cloc or other more specialized (but not standard) programs to count only lines of real code, but as I said before this is a quick'n'dirty approach.

Where I should start refactor?
Now observe the last lines of this output: if the bigger classes in your project are the ones which you are frequently working on, and you struggle to move up and down in their source code, it can be a good sign of where refactoring is needed. I started from having some classes for forms and repository that had 500+ lines and now I'm down to the half of them for my biggest class, which is Otk_Form_Generator. This also has improved my testcases that went from a full integration test of the form class to the unit testing of form generator and collaborator classes.
Typical tdd worflow is Red - Green - Refactor, but when Refactor is executed after some time, where you should start? Here's a brutal indicator.

No comments: