Invisible to the eye: August 2010

Saturday, August 28, 2010

Weekly roundup: consulting in Siena

It has been a busy week in Siena, where I am consulting on web development topics. The most valuable part of my work has been repairing the test suite of a PHP project and optimizing it to run in 1/5 of the original time.
On the whole, Siena is a beautiful city, with almost no factories nearby and small enough for me to walk to my workplace through the historical part of the city.

Here are my original articles for this week on DZone.
Practical PHP Patterns: Coarse Grained Lock
The wonders of the input tag in HTML 5
Practical PHP Patterns: Implicit Lock
NetBeans vs. Vim for PHP development

Sunday, August 22, 2010

Weekly roundup: double edition, packing for Siena

Last week I got caught in the Google search vs. Oracle joke and I couldn't do a roundup, so this one spans two weeks' worth of posts.
Anyway, I will be in Siena for the next few weeks on software consulting business.

Here are my articles written in the last two weeks.
WebML: overcoming UML for web applications
Practical PHP Patterns: Remote Facade
The buzzword glossary, a dictionary for frequently used generic words such as model, domain, role...
Practical PHP Patterns: Data Transfer Object
The shortest guide to character sets you'll ever read, which is fully explained by its title.
Practical PHP Patterns: Optimistic Offline Lock
Native jQuery animations, which shows what you can do with the jquery.js file only.
Practical PHP Patterns: Pessimistic Offline Lock

Friday, August 20, 2010

PHPUnit MockBuilder in master branch

A quick note: my MockBuilder class has been integrated by Sebastian Bergmann into the master branch of phpunit. The name MockBuilder reflects the nature of this addition, a Builder pattern for mock objects, better than the original MockSpecification did.
He said it should be included in PHPUnit 3.5, the next minor version. I'll soon add some documentation by forking the phpunit-documentation repository.

Wednesday, August 18, 2010

Refactoring PHPUnit's getMock()

Not an actual refactoring, but at least the introduction of a layer of indirection, a Parameter object, called PHPUnit_Framework_MockSpecification. I have already written the patch in a branch of my github repository. There are actually two independent patches, since PHPUnit core and the mocking component are in two separate repositories:
http://github.com/giorgiosironi/phpunit/commit/c7d62874ff9c1ed6f520e98cab2568c9bb933ec6
http://github.com/giorgiosironi/phpunit-mock-objects/blob/mock-specification/PHPUnit/Framework/MockSpecification.php
http://github.com/giorgiosironi/phpunit-mock-objects/blob/mock-specification/Tests/MockSpecificationTest.php
All the functionality was developed with Test-Driven Development.

Use cases
The current API of getMock(), the Facade for the mocking library, actually prescribes 7 parameters. Most of them are optional, like in use case (a):

$this->getMock('MyClass');

But if you want to specify an uncommon parameter, you have to include the previous ones, and hunt around for their default values, praying that you will get them right and insert the boolean or empty values in the correct order, like in (b):

return $this->getMock('MyClass', array(), array(), '', false);

In some cases (c):

$this->getMock('MyClass', array(), array(), '', true, true, false);

A ~~Specification object~~ Builder pattern, which I intend to propose as a feature request after getting some feedback from the community, will aid some of these use cases. For example, (a) remains the same: there is no need to complicate the API here.

$this->getMock('MyClass');

For case (b):

$this->getMockSpecification('MyClass')
     ->disableOriginalConstructor()
     ->getMock();

For case (c):

$this->getMockSpecification('MyClass')
     ->disableAutoload()
     ->getMock();

State of development
I have currently implemented support for 6/7 of the getMock parameters in the MockSpecification object (only the autoload-related parameter is missing). This solution is an instance of the Builder pattern (I need a new name for MockSpecification, which started out as a parameter object but then acquired a getMock() method for faster access to the created object).
Once the mock is created, it behaves exactly like an ordinary mock: MockSpecification calls getMock() internally.
This would be a totally backward compatible change, since it only adds a new way to create a mock.
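To make the idea concrete, here is a minimal sketch of how such a Builder could collect the seven getMock() arguments and hand them over in the right order. It is not the code from the patches: the class name MockSpecificationSketch, its property names and the $factory callable (a stand-in for the test case's getMock() method) are all assumptions made for illustration.

```php
<?php
// Hypothetical sketch: names are illustrative, not the code in the patches.
class MockSpecificationSketch
{
    private $factory;
    private $className;
    private $methods = array();
    private $constructorArguments = array();
    private $mockClassName = '';
    private $callOriginalConstructor = true;
    private $callOriginalClone = true;
    private $callAutoload = true;

    // $factory stands in for the test case's getMock() method.
    public function __construct($factory, $className)
    {
        $this->factory = $factory;
        $this->className = $className;
    }

    public function disableOriginalConstructor()
    {
        $this->callOriginalConstructor = false;
        return $this; // returning $this is what allows chaining
    }

    public function disableAutoload()
    {
        $this->callAutoload = false;
        return $this;
    }

    // The seven positional arguments, in the order getMock() expects them.
    public function getArguments()
    {
        return array(
            $this->className,
            $this->methods,
            $this->constructorArguments,
            $this->mockClassName,
            $this->callOriginalConstructor,
            $this->callOriginalClone,
            $this->callAutoload,
        );
    }

    public function getMock()
    {
        return call_user_func_array($this->factory, $this->getArguments());
    }
}

// Usage with a stand-in factory that only records what it receives.
$received = null;
$factory = function () use (&$received) {
    $received = func_get_args();
    return 'mock';
};
$spec = new MockSpecificationSketch($factory, 'MyClass');
$spec->disableOriginalConstructor()->getMock();
// $received now holds the same argument list as use case (b),
// with the two trailing defaults filled in.
```

The point of the design is that the caller names only the option that differs from the defaults, while the positional order stays hidden in one place.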

What I want from you
Any feedback, from glitches in the code to better names for the API methods and the class itself. I guess PHPUnit_Framework_Mock_MockBuilder can be the right name. Once I have tidied up the code, I'll open a ticket for a feature request on PHPUnit's trac asking for it to be assessed and merged into the master repository.

Monday, August 16, 2010

The Flattr model

Following examples from Germany, I have integrated the Flattr buttons in this blog. I also reduced the only ad, an AdSense banner, to a small box on the right.

Basically Flattr is a micropayment system:

Users charge their account via Paypal or other means.
They flattr really amazing blog entries or useful articles they found on the Internet, when they want to repay the authors for their time and effort.
At the end of every month a small, configurable amount of money (like 2 or 10 €) is set aside. This is divided between all the flattered articles the user has selected.
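The split described in the last step is plain division. The sketch below assumes an equal share for every flattered item and ignores any fees or rounding the real service may apply; the function name and the handling of a month with nothing flattered are my own assumptions, not Flattr's documented behavior.

```php
<?php
// Illustrative only: equal split, no fees, no rounding rules.
function shareForEachItem($monthlyAmount, $flatteredItems)
{
    if ($flatteredItems < 1) {
        return 0.0; // assumption: nothing flattered, nothing to split
    }
    return $monthlyAmount / $flatteredItems;
}

// 2 euros set aside and 8 flattered articles: 0.25 euros each.
echo shareForEachItem(2.00, 8), "\n";
```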

Fixed expense, almost no barrier to micropayment once registered to Flattr (Paypal has fees that render donating something like € 0.10 impossible). We'll see how this works out.
If you also want to be flattered, try setting up the system on your blog too.

Sunday, August 15, 2010

A public response to Gene Quinn on the "Google removed Oracle" forgery

My quick debunking of Google Briefly Punishes Oracle by Removal from Google Search has been retweeted a lot and even linked from Tech Crunch.
I can't reply on all the different web sites where Gene Quinn, the author, is "sticking to his guns" (I learnt a new English expression today) and saying that we are all wrong.
So I'll sum up my thoughts here, with a response to one of his typical comments. The comment is nearly identical to one posted by him on his own article (particularly the bits about typing oracle instead of using a link and the screenshot as proof), so we can assume it's authentic.

It is Tech Crunch, not me, that has been duped.

Unfortunately, someone at Tech Crunch knows what Unicode is. You probably don't.

You can believe what you want, but I was not provided a link. I watched someone type “oracle” into Google search and this was what was produced.

This does not imply anything - I can configure a keyboard to produce homograph Cyrillic characters when I press keys like a and o. Everyone who has ever installed a wrong keyboard driver knows that the characters printed on the keyboard are not electronically hardcoded and depend on a software configuration.

I requested a screen shot. So those, whoever they are (including Tech Crunch) that are claiming this is false are incorrect. Those saying I was sent a link with an intentionally malformed search term are likewise wrong.

This does not imply anything, again. Holy crap, Batman, you are an attorney, do you bring screenshots in court? I can easily make one by simply saving the page and modifying the HTML source.
Furthermore, the provided screenshot shows exactly links to pages which contain the fabricated query, like http://dvlprs.com/link/2483939. Those pages are the only ones shown simply because they were the only ones containing the word oracle spelled with 4 of its 6 letters as Cyrillic characters. By now, the same query will include all the articles which talk about this story.
This is the frozen version of his article in case he decides to take the image down (basically this is a screenshot made by a trusted third party, freezepage.com). And this is the frozen version of his screenshot:

If you try visiting the links, you will be brought to pages containing the fabricated query.

If Tech Crunch has any journalistic standards they would remove this post which offers nothing but speculation passed off as fact. My guess is that if and when Oracle makes this an issue during their litigation Tech Crunch will be printing a retraction. So, you have been warned. I am sticking 100% behind the report because it is true.

This is only FUD. You are expected to publish an amendment to your article, based on the evidence that your own screenshot links to a forged query, which explains everything. You may want to know that when you publish links in an image, people can actually follow them by typing their URLs in the location bar of browsers.
Next step, you'll threaten me to take down my blog?

Saturday, August 14, 2010

Google never removed Oracle from its index

Some folks have been reporting strange behavior on Google's part after the lawsuit filed by Oracle against Android and Google: it supposedly removed oracle.com pages, and all the pages that talk about Oracle, from its search index. Even the Wikipedia page on the Delphic oracle.
I initially retweeted the news and explained that it was a trick shortly after.
It would have been a low blow, really. I don't think it's even possible to remove such a large set of results on all the datacenters of Google in a short time frame.

What really happened
Someone made up this query:
http://www.google.com/search?q=оrаcІе
Initially the result page was empty (Your search - ... - did not match any documents). Then people began tweeting and sharing the query, and Google started showing their pages as the only results.

So how did they do it?
At first I thought someone used a capital i (I) to replace the lowercase L of oracle, but Google is smart and would perform a case-insensitive search in this case:
http://www.google.com/search?q=oracIe

Nevertheless, the difference between capital i and lowercase L is not so visible in Google's font.
But, if you try to paste the link or save the page and go over it with hexedit, you'll notice this:
http://www.google.com/search?q=%D0%BEr%D0%B0c%D0%86%D0%B5
This is a clear sign that someone has inserted non-ASCII characters in the query.
The character table for Unicode/UTF-8 says that we have, in sequence:

CYRILLIC SMALL LETTER O
LATIN SMALL LETTER R
CYRILLIC SMALL LETTER A
LATIN SMALL LETTER C
CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
CYRILLIC SMALL LETTER IE
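You can check the sequence yourself by decoding the percent-encoded query and printing the code point of every character. This is a small sketch rather than a polished tool: the codePoints() helper is my own and decodes UTF-8 by hand so that it needs no extension beyond PCRE.

```php
<?php
// Split a UTF-8 string into characters and return each code point as U+XXXX.
function codePoints($utf8)
{
    $points = array();
    foreach (preg_split('//u', $utf8, -1, PREG_SPLIT_NO_EMPTY) as $char) {
        $bytes = array_values(unpack('C*', $char));
        $lead = $bytes[0];
        if ($lead < 0x80) {
            $code = $lead;         // single byte: plain ASCII
        } elseif ($lead < 0xE0) {
            $code = $lead & 0x1F;  // lead byte of a 2-byte sequence
        } elseif ($lead < 0xF0) {
            $code = $lead & 0x0F;  // lead byte of a 3-byte sequence
        } else {
            $code = $lead & 0x07;  // lead byte of a 4-byte sequence
        }
        // Every continuation byte contributes its low 6 bits.
        for ($i = 1; $i < count($bytes); $i++) {
            $code = ($code << 6) | ($bytes[$i] & 0x3F);
        }
        $points[] = sprintf('U+%04X', $code);
    }
    return $points;
}

$query = rawurldecode('%D0%BEr%D0%B0c%D0%86%D0%B5');
echo implode(' ', codePoints($query)), "\n";
// U+043E U+0072 U+0430 U+0063 U+0406 U+0435
```

Only U+0072 and U+0063 fall in the ASCII range; the other four code points sit in the Cyrillic block (U+0400 to U+04FF).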

This combination of characters is very unlikely to be found in actual documents. In fact, at first it did not produce results. Furthermore, in Google's font of choice, Arial, the difference between these letters and their Latin counterparts (if there is any) is again not clear to the naked eye. It makes sense to reuse glyphs that are actually the same in ordinary printed text.
And finally, the forgery replaces the majority of the Latin letters, because replacing only one or two would lead to a Did you mean: oracle notice.

Mystery solved
So Unicode struck again, and some of us were fooled by an ingenious, well-forged Google query. Technically this is called a homograph attack.
The potential of Unicode as a dangerous means of fooling users is great - imagine what happens when non-Latin URLs become a reality. Fortunately, the ICANN and major browsers have been working on a solution, but we as web developers should be aware of the problem too.
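As a first line of defense, a web application can flag strings that mix scripts. The following is only a heuristic sketch, assuming PCRE is compiled with Unicode property support; the function name is hypothetical, and legitimate text (a Russian sentence quoting an English brand, for example) will trigger it too, so it is better suited to warning than to rejecting.

```php
<?php
// Heuristic: true when a string contains both Latin and Cyrillic letters.
function mixesLatinAndCyrillic($text)
{
    return preg_match('/\p{Latin}/u', $text) === 1
        && preg_match('/\p{Cyrillic}/u', $text) === 1;
}

// The forged query mixes the two scripts, the genuine word does not.
var_dump(mixesLatinAndCyrillic(rawurldecode('%D0%BEr%D0%B0c%D0%86%D0%B5'))); // bool(true)
var_dump(mixesLatinAndCyrillic('oracle'));                                   // bool(false)
```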

Thursday, August 12, 2010

Get an ebook reader

Only if you enjoy reading books, of course.

I recently ordered a Bebook Neo, the European equivalent of the Amazon Kindle. It arrived from the Netherlands in two business days (there are no customs between European Union countries). It has wi-fi and a headphone jack if you want to listen to music, but these features do not interest me. I want to tell you about the reading experience.

As a programmer and engineer, I read a lot of books, on various subjects:

technical ones (Kent Beck, Martin Fowler, Uncle Bob...)
managerial ones (Peopleware, Agile Estimating and Planning)
science fiction (Asimov, Dune, Philip K. Dick...)
fiction (Zen and the art of motorcycle maintenance)
personal development (Getting things done)
cooking (well...)

For some of these books, I don't even know if an Italian version exists. Ebooks (mostly in PDF format) are an ideal way to get them in their original English edition. For example, Packt Publishing sent me PDF copies of some of their books for review less than a minute after my registration.
Furthermore, there are other publications available in electronic format only, like DZone Refcards and free ebooks from SitePoint.

The only problem with reading ebooks is the device you use to read them. My Asus EeePC 701 (the first netbook in the market segment) is good for writing articles, and skimming blog posts, or for a bit of PHP programming by SSHing into my home machine. But for reading extensively, LCD screens will kill us.

First, there is backlight. You know when, as a child, you were told not to stare at the Sun? The same mechanism is at work here, on a smaller scale. Direct light is sent from the LCD screen to your eyes, and staring at it for minutes or hours does them no good. Besides the protracted direct lighting, the LCD screen has a very different light intensity from the surrounding environment (made of objects which do not emit their own light, of course), which strains the eyes as they continuously adapt between the screen and the rest of the world.

When reading a good book, I usually get into a flow state and do not take 5-minute pauses, which feel unnatural and break the page-turning rhythm I have been accustomed to since I was 6.
With an LCD, the brighter the environment, the less you see on the screen (outdoors you can't see anything, especially with modern glossy displays). With an e-ink screen like the Kindle's, the Nook's or the Bebook's, you have to actually provide external light to read. This is an advantage for e-ink devices, since you can easily provide light when it's needed, like you do for paper books. I do so with a lamp over my bed. By contrast, it's quite difficult to obscure the sun if you want to read outdoors with an LCD screen.
As any book lover can tell you, an LCD-based device is not an ebook reader, period. Forget about iPads - those can make for wonderful trays for Martini glasses like my CD drive does, not for readers that do not cause headaches.

Second, there is the user experience. On my netbook I used Evince, Ubuntu's document viewer. The ebook reader software is specialized and offers a simpler interface. Zooming has several acceptable levels and text is automatically reflowed to fit the pages. There's no fine tuning of the horizontal scroll bar to have all the text visible at the same time, nor continuous adjustment of the vertical one; only two buttons to go to the next or previous page.

Third, there is battery life. I still haven't recharged my Bebook since the day I unpacked it, more than a week ago. It basically consumes energy only for its idle cycle and during page turns. Since it needs no lighting mechanism of its own, it consumes essentially nothing while you're reading a page. Its charge is estimated to last between 4,000 and 7,000 page turns. In comparison, for my netbook I have a choice between attaching a cable to the nearest power outlet or using a half-kilogram battery.

Fourth, there is size and weight convenience: the internal memory of the Bebook Neo is 512 MB and a book commonly occupies from 1 to 10 MB of storage space. This particular device has an SD card slot that you can use to expand the memory further: with 8-16 GB of SD cards, you have practically infinite memory (unless you own the Library of Congress). You're helping the environment at the same time - I bet the material used for manufacturing an ebook reader and the energy it uses during its lifecycle are a more efficient choice than printed books. If there are doubts, maybe we can recharge readers with solar panels. :)

Fifth, there is the absence of distractions. On my netbook, I'm one click away from opening my mailbox or twitter account. In the latest versions of Ubuntu, new messages from instant messenger buddies show up as notifications (in some also Twitter mentions), and I have to close Pidgin or Empathy.

Sixth, it is very trendy. No one I know in Italy has an ebook reader. When people brag about their iPhones you can peacefully continue reading your favorite book and ignore them. :)

The only downside of the solution is its cost. A device from the Bebook series costs between 250 and 350 Euros. If you live in the US, the Kindle and the Nook are available and cost much less. Take into account customs duties if you decide to buy from another country.

These costs are comparable to or lower than those of netbooks, but ebook readers are much more specific devices. However, if you read a lot, embracing an ebook reader will give you all the advantages above.

Sunday, August 08, 2010

Weekly roundup: cooking

I dedicated my weekend to cooking pastries and meringues for my girlfriend, who is returning from Sardinia. Software engineers are good chefs when they give their best effort in the kitchen, probably due to their attention to detail, dosages and practices (like using leftover egg whites for making meringues, which is a best practice of the domain).

Here are my original articles published this week on the Web Builder Zone.

PHP inclusions on include(), include_once(), require(), readfile(), autoloading and so on.
Practical PHP Patterns: Application Controller
10 HTML tags which are not used as often as they deserve, which has gained 12,000 views (and counting) so far.
Practical PHP Patterns: Visitor, a pattern which was skipped in the original Gang of Four series on this blog.

Tuesday, August 03, 2010

Munchkin, learning Test-Driven Development in PHP

Mészáros Márton and other PHP coders have started a Test-Driven Development project centered on showing the methodology to new adopters in a green field. The goal of the project, named Munchkin, is creating a feed aggregator - like Google Reader - from scratch.

The authors will post a series of articles about their development process along the way. If you want to follow a step-by-step guide to implementing an Agile project with TDD in PHP, follow them.

Monday, August 02, 2010

We're not superheroes

I was reading bits of Coders at work, a book of interviews with famous programmers. I didn't like it very much: here's why.

The first thing I noticed is that many questions are biographical. I do not care about knowing if Ken Thompson, who built Unix, worked on a PDP-10 or a PDP-11 or a PDP-7. By the way, I do not even know what those things look like: I was born in 1988.

Besides that, this kind of book tells you how to live like a superhero coder, but most of us just aren't (me included).
Thompson could work out the design of software in his mind for a month before starting to code. I can't (or I can, but I waste less time by writing something in code).
Knuth could design and code TeX in pencil and write for six months before doing any testing. I can't (or I can, but I would be much more efficient with a quick feedback loop like Test-Driven Development's one).
Zawinski could pick up rolls of duct tape and make Netscape work in six months (picking up also a lot of technical debt and vanishing from the market in the following years). I prefer working software over comprehensive documentation but not over sustainable development.

So we're not superheroes: test suites, source control and Continuous Integration are our bat-gadgets, which enable us to deliver software while working for a living and not living for work, like Ken Thompson and his 28-hour days or Bill Gates and his nights at school programming. Let the people with the superpowers shoot webs all night long, while we go back to Wayne Manor and throw a party. At least Batman doesn't have a day job: poor Peter Parker must live a miserable life.
