Friday, October 23, 2009

Being a developer is often referred to as fun and challenging, but also as a dull corporate job. Indeed there is some truth in both points of view, since most of the fun depends on the domain a developer is dealing with.
Moreover, some activities can be very boring and frustrating, leaving you clueless about what is not working. Suppose you're performing manual tests in the browser for your web application, loading pages and submitting forms. Often when changes to the code are fresh you'll end up with blank pages, forms that fail to submit, or forms that even persist incorrect data.
So what can you do? Start to automate testing and focus each test case on a smaller unit, so that when a test fails there are only one or two things that could have gone wrong. But the difference between manual/integration and automated/unit testing is only one example, and a more general automation pattern can be seen.
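As a minimal sketch of what "smaller units" means, here is one small function tested in isolation. It is written in Python rather than PHP only for illustration, and normalize_email is a hypothetical form helper, not part of any real application. If one of these tests fails, there is only one short function to look at.

```python
import unittest


def normalize_email(raw):
    """Hypothetical form helper: trim, lowercase, reject input without '@'."""
    value = raw.strip().lower()
    if "@" not in value:
        raise ValueError("not an email address: %r" % raw)
    return value


class NormalizeEmailTest(unittest.TestCase):
    def test_trims_and_lowercases(self):
        self.assertEqual(normalize_email("  Alice@Example.COM "),
                         "alice@example.com")

    def test_rejects_missing_at_sign(self):
        with self.assertRaises(ValueError):
            normalize_email("alice.example.com")


# Run the two test cases and keep the result object for inspection.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormalizeEmailTest)
result = unittest.TextTestRunner().run(suite)
```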

I call this approach taking away the pain. Let's dig into some real-world situations:

you and your colleague have just overwritten each other's changes to the live PHP files on a production website. I guess you two should take the time to configure a Subversion repository, so that every version of a file is stored and restorable; moreover, different changes to plain text files such as source code are merged (almost) harmlessly. Maybe you can even build a staging server to test your application extensively.

deployment of your web application puts you under pressure, as you have to complete twenty-three steps in the right order. So why not automate it via a Phing/Ant script?

you are bored of writing similar SQL queries for inserting and updating your tables, when the only things that change between two queries are the table and the field names. Why not implement or extend an ActiveRecord class?

you are bored to death of writing forms for hundreds of entities without any behavior. Maybe you should look for a general solution such as Zend_Form, which abstracts away the HTML rendering and provides automatic population and validation of input values. You can even automate more and write a code generation tool, followed by fine tuning of your elements by hand.

you are even more tired of duplicating the list of fields and all their validation rules in the domain layer and in the user interface. So start using Naked Objects.
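To make the ActiveRecord item above concrete, here is a minimal sketch, assuming a table with an integer "id" primary key. It is written in Python with the standard sqlite3 module rather than PHP, and the ActiveRecord and User classes, the "users" table and its columns are all hypothetical names chosen for illustration. The point is that the INSERT and UPDATE statements are built once from the table and field names, so each new entity only declares those two things.

```python
import sqlite3


class ActiveRecord:
    """Hypothetical base class: subclasses only declare table and fields."""

    table = None   # table name, set by each subclass
    fields = ()    # column names other than the primary key "id"

    def __init__(self, **values):
        self.id = None
        for name in self.fields:
            setattr(self, name, values.get(name))

    def save(self, conn):
        values = [getattr(self, name) for name in self.fields]
        if self.id is None:
            # New object: build the INSERT from the declared field names.
            columns = ", ".join(self.fields)
            marks = ", ".join("?" for _ in self.fields)
            sql = f"INSERT INTO {self.table} ({columns}) VALUES ({marks})"
            self.id = conn.execute(sql, values).lastrowid
        else:
            # Already stored: build the UPDATE from the same field names.
            assignments = ", ".join(f"{name} = ?" for name in self.fields)
            sql = f"UPDATE {self.table} SET {assignments} WHERE id = ?"
            conn.execute(sql, values + [self.id])
        return self


class User(ActiveRecord):
    table = "users"
    fields = ("name", "email")


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")

user = User(name="alice", email="alice@example.com").save(conn)  # INSERT
user.email = "alice@example.org"
user.save(conn)                                                  # UPDATE
rows = conn.execute("SELECT id, name, email FROM users").fetchall()
```

Table and column names are interpolated into the SQL here because they come from class definitions, not from user input; the values themselves always go through "?" placeholders.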

The advantages of "taking away the pain" as a human behavioral pattern are multiple: you automate a reusable solution to your problems, which may come in handy in more than one project. Plus, crafting it is far more challenging and fun than continuing to write boilerplate code: you're applying automation and creating abstractions and other levels of indirection.
The problem is that it is often very challenging, sometimes even too difficult, for a single person to come up with a reusable, general solution to these kinds of problems. Consider the following situation:

You achieve persistence ignorance of your complex domain layer, and your User/Group/Post/CommentRepository classes do not extend anything. They are POJOs (or Plain Old <insert your language here> Objects) and contain much logic. But writing the UserDao, GroupMapper, etc. classes to persist the instances in a database is boring as hell, since you have to reflect the private fields of every object and put each one in a table field. So you decide to find a generic solution, since this problem persists (no pun intended) in subsequent projects.
And you find yourself writing a full-fledged ORM. Every User/Group/Post class has different peculiarities that were treated one at a time in your specialized classes, but writing a single mapper class that can deal with all the special cases quickly becomes impossible. You struggle with lazy loading, annotation formats, detaching of entities, tracking policies, special mapping of value objects, relationships... The component grows to reach thousands and thousands of lines of code.
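The naive first step of such a generic mapper can be sketched in a few lines. This is only an illustration in Python rather than PHP, under the assumption that "private" fields are the attributes prefixed with a single underscore; to_row, from_row and the User class are hypothetical names. It handles none of the hard cases listed above (lazy loading, relationships, value objects), which is exactly where the thousands of lines come from.

```python
class User:
    """Plain object: no base class, no persistence code."""

    def __init__(self, name, email):
        self._name = name
        self._email = email


def to_row(entity):
    """Map every underscore-prefixed attribute to a column of the same name."""
    return {name.lstrip("_"): value
            for name, value in vars(entity).items()
            if name.startswith("_")}


def from_row(cls, row):
    """Rebuild an entity from a row without calling its constructor."""
    entity = cls.__new__(cls)
    for column, value in row.items():
        setattr(entity, "_" + column, value)
    return entity


row = to_row(User("alice", "alice@example.com"))
copy = from_row(User, row)
```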

Indeed, an abstract DataMapper layer is very difficult for a single person to write, unless specifications and similar implementations already exist: the PHP implementations of the generic DataMapper pattern are ports of Hibernate/JPA.
Fortunately, there is open source, and we can participate in a project with other people, sharing our knowledge and time. I think this is why open source is so successful: it allows sharing our efforts to take away the pain from programmers' lives.

3 comments:

Just some clarifications that you're probably aware of but that just did not come across: JPA is a specification, Hibernate is an implementation. And I'm not sure "port" is the right term here. For example, JPA has lots of different implementations, but you would not say they are all ports of each other just because they share parts of the public API, even though they're written in the same language. In the same way I don't consider Doctrine2 to be a port of Hibernate: it's just too different, and even in a drastically different language. In my eyes porting a project rather means taking large parts of the source code and adjusting it to fit the new environment, like NHibernate is a port of Hibernate and they regularly re-synch their code bases.

Roman, I do not consider Doctrine 2 and Zend_Entity copies of Hibernate, but I wanted to highlight the contribution in knowledge that Hibernate and JPA (which was extracted from Hibernate as far as I know) give to everyone who wants to write an ORM today. For instance, consider HQL/JPQL/DQL. In other components of the ORM there's much work to do, since PHP is a dynamic language and it's radically different from Java.

Yes, I absolutely agree. All the knowledge that is shared through all these open source projects is extremely valuable.

And yes, the JPA was mainly extracted out of Hibernate and a few other ORMs (like TopLink, now EclipseLink). The specification obviously always only has a subset of the features of the individual implementations, kind of a lowest common denominator.