I Love Cocoon!

A survey of extensible server strategies S V Ramu (2002-05-10)

A 'home grown' strategy for designing a website

About two years ago I had my first opportunity to design and implement a
dynamic website with a team. Though it took a few weeks to digest HTTP's stateless
model and the associated browser quirks, we could soon apply all the standard application
tricks to HTML page design. For an OOP addict, the ASP or plain JSP model
of mixing script and HTML was as harsh as metal scratching metal. It was immediately
clear that if we used JSP at all, custom tags were the way to go: they separate
the tag structures from the Java code through the elegant taglib model.

Even so, this total dependence of the application on the web infrastructure
still smelled bad. After all, a web interface is only one face of an
application. Also, we were not ready for a full-fledged application-server-based
design, because of its seeming heaviness. So JSP was out and the servlet was in. At least
JSP, as a way of mixing HTML and script, was definitely out. The servlet has its own
drawbacks, though. Generating HTML tags with Java code felt like squashing a bug
with a sledgehammer. We needed the flexibility of JSP with the independence of
the servlet.

The newly learnt jargon of XML and XSL came to the rescue. With XSL you can
generate HTML, just as in JSP, but unlike JSP you cannot mix server-side script
into it. XSL is a way of generating pure presentation output. Any code-based
manipulation has to be performed externally and fed in as XSL parameters. This was
at once appealing and liberating. So the model now was to use a servlet for
request marshaling, with the HTML generated by a battery of XSL files through
Java's XSLT API. This meant that all our content had to be converted from the regular
SQL result sets to XML, through Java.
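To make the idea concrete, here is a minimal sketch of that rendering step using the standard javax.xml.transform API. The class name XslPageRenderer, the sample XML and the sample stylesheet are all made up for illustration (they are not our original code); the XML string stands in for whatever the SQL-to-XML conversion would produce, and the page title is the kind of externally computed value that gets fed in as an XSL parameter.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Collections;
import java.util.Map;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: renders XML content to HTML with an XSL stylesheet.
class XslPageRenderer {
    // Stand-in for XML produced from a SQL result set (sample data, assumed).
    static final String SAMPLE_XML =
        "<customers><customer>Asha</customer><customer>Ravi</customer></customers>";

    // Pure presentation: the stylesheet only reads the XML and the 'title' parameter.
    static final String SAMPLE_XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html' indent='no'/>"
      + "<xsl:param name='title'/>"
      + "<xsl:template match='/customers'>"
      + "<html><body><h1><xsl:value-of select='$title'/></h1>"
      + "<ul><xsl:for-each select='customer'><li><xsl:value-of select='.'/></li></xsl:for-each></ul>"
      + "</body></html>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet to the XML, passing computed values in as XSL parameters. */
    static String render(String xml, String xsl, Map<String, String> params) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            for (Map.Entry<String, String> p : params.entrySet()) {
                transformer.setParameter(p.getKey(), p.getValue());
            }
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSL transformation failed", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = Collections.singletonMap("title", "Customer list");
        System.out.println(render(SAMPLE_XML, SAMPLE_XSL, params));
    }
}
```

In a servlet, the returned string would simply be written to the response; nothing in the renderer itself depends on the web layer.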

Soon we also decided that we needed only one servlet, whose sole job was to
route the requests and their parameters to the appropriate Java classes (we even made
this dynamic, by loading the appropriate Java classes only at runtime, as configured
in a property file). Initially we felt guilty about such a drastic simplification,
reducing the application's whole dependence on the web architecture to that one servlet. But it made
sense once we realized that a servlet is for just that: connecting our server-side
code to the client's browser.
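The routing idea can be sketched roughly as follows. This is not our original servlet: the names RequestRouter, Handler and GreetingHandler are hypothetical, and the servlet API is deliberately left out so that the sketch stays self-contained. The single servlet would only extract an action name and the request parameters, and then delegate to something like dispatch().

```java
import java.util.Collections;
import java.util.Map;
import java.util.Properties;

// Hypothetical router: maps an action name to a class name taken from a property set.
class RequestRouter {
    /** Assumed contract that every application class implements (not a servlet type). */
    interface Handler {
        String handle(Map<String, String> params);
    }

    /** Sample handler; a real one would fetch data and produce XML for the XSL stage. */
    static class GreetingHandler implements Handler {
        public GreetingHandler() {}

        @Override
        public String handle(Map<String, String> params) {
            String name = params.containsKey("name") ? params.get("name") : "world";
            return "Hello, " + name;
        }
    }

    private final Properties routes;

    RequestRouter(Properties routes) {
        this.routes = routes;
    }

    /** Looks up the configured class, loads it only now, and hands it the parameters. */
    String dispatch(String action, Map<String, String> params) {
        String className = routes.getProperty(action);
        if (className == null) {
            throw new IllegalArgumentException("No handler configured for action: " + action);
        }
        try {
            Handler handler = (Handler) Class.forName(className)
                    .getDeclaredConstructor().newInstance();
            return handler.handle(params);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load handler " + className, e);
        }
    }

    public static void main(String[] args) {
        // In practice these entries would be read from a .properties file.
        Properties routes = new Properties();
        routes.setProperty("greet", GreetingHandler.class.getName());

        RequestRouter router = new RequestRouter(routes);
        System.out.println(router.dispatch("greet", Collections.singletonMap("name", "Cocoon")));
    }
}
```

Because the class names live in configuration, adding a new page means adding one property and one class; the servlet itself never changes.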

The curiosity, the Struts and the Cocoon

Even today, I'm happy that we hit upon such a model. But we were quietly aware
that we could not be alone in this drastic rethinking of the web architecture. I soon
stumbled upon the Apache project called Cocoon. Initially I was put
off by its huge size and seemingly steep learning curve. We also heard from many people
that there was another project, called Struts, that played in the
same arena with promise. I wondered why there were two projects from the same stable
solving the same problem. So the exploration began, and this essay is the result.
I should warn you that this article is in no way a tutorial on these open source
products. It is just a sort of first-impression report, and a leisurely
reflection on the architecture of these two products with respect to our model
above.

Put very simply: both Struts and Cocoon, just like the 'Home Grown Model', rely
very little on servlets (usually on just one servlet, in fact)! This was very
heartwarming. Where, then, did they differ? What extra did they offer? From the very
thin study I've done, it seems that Struts relies more on Java, while
Cocoon relies almost completely on the XML-XSL infrastructure. As Cocoon was close to
my heart, I went into it in some depth, while giving Struts just a quick glance.
So this article is more about Cocoon and its philosophy than about Struts.
I'm in fact an advocate of a completely XMLised world, with XSLT as the transforming node
(see the next section for more explanation).

There was an interesting surprise in the Cocoon model, which made me think that we could
have been bolder in simplifying the 'Home Grown Model'. From the conventional
viewpoint, a run-time XSL transformation is a heavy overhead when compared with pure
static HTML. True, but the ASP/JSP/Servlet load is not much different from the XSLT load.
In fact there is not much to design in a static website that has only HTML pages
and some pictures. So, once we have decided that we need dynamic content, XSLT is not too
different from JSP-like models. With this mental background, where XSLT barely
breaks even with JSP, the idea of multiple XSL transformations before sending the response over
HTTP seemed like an unpardonable sin! But this is exactly what Cocoon does. Short of
completely eliminating Java, it manages with just XML and XSL, with Java only in the
silent background.

'Isn't OOP dead?'

Cocoon reminded me of my pet theory that,
properly planned, future programming could reduce to a suitable graph network,
where the nodes are XSLT converters and the edges are the XML content transmission
protocols.

...what are then the modern Application
Architecture options available to us? If key module-to-module communication can
be in neutral terms with ports and XML, then it doesn't matter if these modules
are on the same machine or across the world. Of course, performance is still a
deciding factor, before going overboard and converting all our method calls into
port calls. All the same, an application can now be imagined as a bunch of Service
modules which communicate over neutral channels and formats, and possibly with neutral
semantics as well (i.e. the XML schema too might be a global standard, instead of a
proprietary one). And all that remains of programming is to code these Service
Nodes, which just transform some input XML into some other output XML, not
an OO demand at all.

If you imagine all the service nodes as points, and their interconnections with
other services as directed lines, then what we have for an application is a
network of points and lines. Now, if the semantics of a communication (i.e. if XML is
the universal format, a particular schema of tag structure is the semantics)
is an international standard, draw that line in red, and if it is proprietary,
draw it in black. Our application network would then be many points with
red or black directed (arrowed) lines connecting them. We can say that as the
number of red lines grows, the application is to that extent an extensible and
maintainable product, since any new vendor can deliver a module with better
performance and yet with complete integration assurances. This is really the
promise of the Web Services paradigm, where a service is the software equivalent
of the IC of electronics.

(Isn't OOP dead?)
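The points-and-lines picture can be written down as a tiny data structure. This is only an illustration of the metaphor: the class ServiceGraph, its method names and the sample services are all invented here, and treating the share of red lines as a number is my own simplification rather than anything the quoted passage defines.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of an application as service nodes joined by directed links.
class ServiceGraph {
    /** A directed link between two service nodes. */
    static final class Link {
        final String from;
        final String to;
        final boolean standardSchema; // true = "red" line, false = "black" line

        Link(String from, String to, boolean standardSchema) {
            this.from = from;
            this.to = to;
            this.standardSchema = standardSchema;
        }
    }

    private final List<Link> links = new ArrayList<>();

    void connect(String from, String to, boolean standardSchema) {
        links.add(new Link(from, to, standardSchema));
    }

    /** Share of links whose schema is a public standard; 0.0 when there are no links. */
    double redShare() {
        if (links.isEmpty()) {
            return 0.0;
        }
        long red = links.stream().filter(l -> l.standardSchema).count();
        return (double) red / links.size();
    }

    public static void main(String[] args) {
        ServiceGraph app = new ServiceGraph();
        app.connect("orders", "billing", true);   // red: standard schema
        app.connect("billing", "ledger", false);  // black: proprietary schema
        app.connect("orders", "shipping", true);  // red: standard schema
        System.out.println("share of red lines = " + app.redShare());
    }
}
```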

In this light, seeing Cocoon made me envious of those admirable minds who dared to
go beyond the fear of too many runtime transformations becoming a bottleneck, towards a
dream of completely separating the concerns, to the point of reducing Java-like
coding to the absolute minimum. You must realize that today, with tremendous
processor speeds, spacious RAM and, above all, optimized monster servers,
speed is really not a concern. You can always throw in more hardware. The issue now
is having portable content that is ultimately extensible and scalable. Cocoon
realizes this fully, and hence exploits and combines the simplicity of XML with the
versatility of XSL. Its design advocates multi-level XSL transformation before sending
out the response. All the same, Cocoon 2 claims to optimize fiercely for
production quality, by using SAX parsers instead of the memory- and CPU-gobbling DOM.
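To give a feel for multi-level transformation over SAX, here is a rough sketch using only the JDK's own JAXP classes (SAXTransformerFactory and TransformerHandler). It is not how Cocoon is implemented internally; the class SaxPipeline and the two sample stylesheets are assumptions made for illustration. The point is that each stage hands SAX events directly to the next one, so nothing is serialized to text and re-parsed between the stages.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

// Hypothetical helper: runs XML through several stylesheets chained by SAX events.
class SaxPipeline {
    private static final String HEAD =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>";

    // Stage 1 (sample): turn <list><item> into <entries><entry>.
    static final String TO_ENTRIES = HEAD
      + "<xsl:template match='/list'><entries><xsl:for-each select='item'>"
      + "<entry><xsl:value-of select='.'/></entry></xsl:for-each></entries></xsl:template>"
      + "</xsl:stylesheet>";

    // Stage 2 (sample): turn <entries><entry> into an HTML-style list.
    static final String TO_HTML_LIST = HEAD
      + "<xsl:output method='xml' omit-xml-declaration='yes' indent='no'/>"
      + "<xsl:template match='/entries'><ul><xsl:for-each select='entry'>"
      + "<li><xsl:value-of select='.'/></li></xsl:for-each></ul></xsl:template>"
      + "</xsl:stylesheet>";

    static String run(String xml, String... stylesheets) {
        if (stylesheets.length == 0) {
            throw new IllegalArgumentException("at least one stylesheet is required");
        }
        try {
            SAXTransformerFactory factory =
                (SAXTransformerFactory) TransformerFactory.newInstance();
            TransformerHandler[] stages = new TransformerHandler[stylesheets.length];
            for (int i = 0; i < stylesheets.length; i++) {
                stages[i] = factory.newTransformerHandler(
                    new StreamSource(new StringReader(stylesheets[i])));
            }
            // Each stage feeds its SAX events to the next; the last one serializes.
            for (int i = 0; i + 1 < stages.length; i++) {
                stages[i].setResult(new SAXResult(stages[i + 1]));
            }
            StringWriter out = new StringWriter();
            stages[stages.length - 1].setResult(new StreamResult(out));

            // The SAX parser plays the role of the generator for the first stage.
            SAXParserFactory parsers = SAXParserFactory.newInstance();
            parsers.setNamespaceAware(true);
            XMLReader reader = parsers.newSAXParser().getXMLReader();
            reader.setContentHandler(stages[0]);
            reader.parse(new InputSource(new StringReader(xml)));
            return out.toString();
        } catch (TransformerConfigurationException | ParserConfigurationException
                 | SAXException | IOException e) {
            throw new IllegalStateException("pipeline failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<list><item>a</item><item>b</item></list>";
        System.out.println(run(xml, TO_ENTRIES, TO_HTML_LIST));
    }
}
```

Adding a third transformation is just one more argument to run(), which is roughly the freedom that a multi-stage design is after.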

A very rough overview of Cocoon

There is a lot of jargon to learn in Cocoon. As Cocoon itself admits, the
concepts of the SiteMap and XSP (eXtensible Server Pages) have a steep learning curve.
But the heart of the whole framework is nothing short of a revolution (as its early
founder, Stefano Mazzocchi -mad-zoki-, rightfully claims). After some time with its
docs, I'm a bit uncomfortable with its oversimplified model of Actions, which
ridiculously simplifies all the programming needs with the elegant
Apache Jakarta Avalon Framework.
If this is true, it means that the whole site management can be done with
XML-XSL alone, with Java only for producing those starting XML 'seeds' (so to say).

Basically the model consists of the following concepts...

Parse the URI and select the appropriate process:
Match the request URI against regular expressions, wildcards etc. and branch off to an
appropriate Pipeline. This process selection phase can also be done
with Selectors (which can use things like browser type, parameters etc.),
and with Actions, which are just Java classes that take in a list
of name-value pairs and churn out a modified list of those name-value pairs.

Set up a Pipeline of XML transformers:
A pipeline is just the Generation of the initial XML for the given request,
Transforming it in many stages, and finally Serializing it into a
response format. There is also a very nice capability of Aggregating the XML
output of two or more Pipelines and continuing with the Transformation.

Generate the Initial XML:
The key idea here is to use one of the Generators to create the initial
XML. The source of this XML could be a physical XML file, a directory
structure generated just in time (with Ant-like selectability), an RDBMS, another
template-based content creator like Apache Jakarta
Velocity, a legacy Java Script file, etc.

Transform the XML:
Once the initial XML is created, we can Transform it with our own XSL file, or
with any of the standard transformers that come with Cocoon, such as those for I18N, logging
or SQL. There can be as many transformations as you like.
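The first concept above, matching the URI and running Actions, can be imitated in a few lines of plain Java. To be clear, this is a simplified stand-in and not Cocoon's sitemap or its Action API: the class SitemapSketch, the Action interface shown here, the pipeline names and the wildcard rules ("*" stays within one path segment, "**" may cross "/") are all assumptions of this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch of URI-to-pipeline selection plus a name-value Action.
class SitemapSketch {
    /** Assumed shape of an action: name-value pairs in, modified name-value pairs out. */
    interface Action {
        Map<String, String> act(Map<String, String> params);
    }

    /** Sample action: supplies a default language when the request has none. */
    static final Action DEFAULT_LANGUAGE = params -> {
        Map<String, String> copy = new LinkedHashMap<>(params);
        copy.putIfAbsent("lang", "en");
        return copy;
    };

    // Wildcard pattern -> pipeline name, kept in insertion order (first match wins).
    private final Map<String, String> routes = new LinkedHashMap<>();

    void addRoute(String wildcard, String pipelineName) {
        routes.put(wildcard, pipelineName);
    }

    /** Converts a wildcard into a regex: "*" excludes "/", "**" matches anything. */
    static Pattern compileWildcard(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < wildcard.length(); i++) {
            char c = wildcard.charAt(i);
            if (c == '*') {
                boolean doubleStar = i + 1 < wildcard.length() && wildcard.charAt(i + 1) == '*';
                regex.append(doubleStar ? "(.*)" : "([^/]*)");
                if (doubleStar) {
                    i++; // skip the second star
                }
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    /** Returns the first pipeline whose pattern matches the whole URI, or null. */
    String select(String uri) {
        for (Map.Entry<String, String> route : routes.entrySet()) {
            if (compileWildcard(route.getKey()).matcher(uri).matches()) {
                return route.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        SitemapSketch sitemap = new SitemapSketch();
        sitemap.addRoute("docs/*.html", "doc-pipeline");
        sitemap.addRoute("images/**", "image-pipeline");

        System.out.println(sitemap.select("docs/index.html"));
        System.out.println(sitemap.select("images/icons/logo.png"));
        System.out.println(DEFAULT_LANGUAGE.act(new LinkedHashMap<String, String>()));
    }
}
```

The selected pipeline name would then decide which generator, transformers and serializer handle the request.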

Of course, there are many other tricks, for handling errors, Views etc., for
which you can use the decent documentation available with the Cocoon downloads. Installing
Cocoon is just a matter of copying the Cocoon.war file to the Tomcat webapps folder (but due to
some mismatch of XML parser versions, you have to follow a few jar-copying
rules stipulated in the Cocoon installation pages).

Epilogue

While trying to understand the Cocoon project, I stumbled upon many nice
projects in Apache Jakarta that are used by Cocoon. I do realize that what little
I've explained here about Cocoon is pathetically cryptic. But my idea is to start
on this survey of distributed server/web strategies, and to give you a taste of the
motivation behind such notable efforts, through the eyes of personal experience. To me,
the realization that the efforts of my previous team were up-to-date enough was
heartwarming. I hope this gives you as much confidence in innovating as
it gave us. Soon, I'll try to continue exploring Cocoon, Struts and others
in much more detail. Mainly I'd like to arrive at some unifying architectures that
we can discuss and standardize at TATTVUM, so as not to be put off by so many
wonderful upcoming projects and models. So, please do comment.