In the last post, I alluded to how shifting complexity from one area to another can actually improve perceived simplicity. Here is a concrete example of how we got rid of scores of wsdl-client generated jaxb classes and brought in a fresh wave of simplicity to our Grails applications.

Problem

One or more of our Grails applications use several of our own plugins, and each plugin talks to several web services that share a lot of xsd schemas. Generating Java clients for these services ends up proliferating duplicated jaxb classes in each of our plugins and also in the applications which use them, resulting in multiple classpath conflicts.

We also don’t like the date conversions from Date to Calendar to GregorianCalendar to XmlGregorianCalendar, or the other conversions from Byte to Int to Short. Jaxb also does some non-intuitive magic with some of the xsd:any types in the schemas. If xsd schemas want strict type checking, fine. We from the “def” world just look at everything as plain data. And finally, the jaxb classes look ugly.

The “withSoap” is a DSL that is supposed to be injected automatically into controllers, but on other classes it didn’t seem to work. I tried using WsliteEnhancer, but that didn’t help either. So I ended up calling withSoap directly through the WsliteConnector singleton.

The example shows all params injected from grailsApplication config, to allow webservice urls for different environments.
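As a rough sketch of per-environment urls, here is the same idea with Groovy's plain ConfigSlurper, which is what sits behind Grails' Config.groovy. The orderService block, its urls and the timeout are hypothetical names made up for illustration, not our actual settings.

```groovy
// Hypothetical config script: a default block plus a production override.
def configScript = '''
orderService {
    url = 'http://localhost:8080/services/order'
    timeoutMillis = 5000
}
environments {
    production {
        orderService {
            url = 'https://services.example.com/order'
        }
    }
}
'''

// The environment name passed to ConfigSlurper decides which overrides apply.
def devConfig  = new ConfigSlurper('development').parse(configScript)
def prodConfig = new ConfigSlurper('production').parse(configScript)

println "dev  url: ${devConfig.orderService.url}"
println "prod url: ${prodConfig.orderService.url}"
```

Inside a Grails class the same values would be read from grailsApplication.config rather than parsed by hand.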

The interesting aspect is the envelopeAttributes, which is a map of namespaces. Since we are not using generated client classes, which would do the work for us, the namespace prefixes must be specified somehow in the xml itself as part of the soap request. Adding namespaces to envelopeAttributes will append the namespaces to the soap-env tag. There is no need to include the namespace for soap-env itself. The wslite plugin does that.

With the complexity of creating the xml shifted into MarkupBuilder, creating soap requests becomes trivial, even more so with the withSoap DSL.
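To give a feel for it, here is a minimal sketch that builds a namespaced soap envelope with nothing but Groovy's MarkupBuilder. It is not the wslite plugin's DSL: the envelope and its namespace attributes are written by hand here, and the ord namespace, element names and values are all hypothetical.

```groovy
import groovy.xml.MarkupBuilder

// Hypothetical namespaces; with wslite these would go into envelopeAttributes.
def namespaces = [
    'xmlns:soap-env': 'http://schemas.xmlsoap.org/soap/envelope/',
    'xmlns:ord'     : 'http://example.com/schemas/order'
]

def writer = new StringWriter()
def xml = new MarkupBuilder(writer)

// The nesting of the closures mirrors the nesting of the xml elements.
xml.'soap-env:Envelope'(namespaces) {
    'soap-env:Body' {
        'ord:GetOrderRequest' {
            'ord:orderId'('12345')                   // tag value
            'ord:customer'(type: 'retail', 'ACME')   // tag attribute plus value
        }
    }
}

def soapRequest = writer.toString()
println soapRequest
```

The code has to know the prefixes and has to say which bits are attributes and which are values, but in return it reads like the xml it produces.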

Wsdl import generated classes

Pros
No need to worry about namespaces
Set and Get of values use Java-like syntax
The order of setting values does not matter, as xml marshalling happens at a later time than setting the values
Setting tag attribute or tag value is no different (ie code doesn’t care)

Cons
Every minor change in wsdl forces wsdl client generation
Duplicated client generated classes could cause classpath conflicts
The hierarchy of the xml elements is not apparent in the code, the only way to know the hierarchy is to look at the xml samples

WSLite + MarkupBuilder

Pros
No confusion about which client to use: there are scores of them and each has some unique issue (jax-ws, jax-b, cxf and so on)
No wsdl import client generated sources and jars
No crazy xerces, xalan and other marshalling runtime errors, especially if deploying to different app servers (tomcat, weblogic etc)
If there are trivial changes to elements (say, adding a new optional element), the client code will still work without changes (if that element is not used)
Creation of xml is very straight-forward
Code documents the hierarchy of the xml!
Response returns XmlSlurped data, so dot-notation can directly be applied on the objects

Cons
Code must be namespace-prefix aware
Must know tag attributes and tag values to create the xml correctly (this must be a Pro)
Code must generate xml in the same sequence of elements; blocks of elements can still be called out-of-order, but the final xml body must be created in a sequence
Unvalidated namespace prefixes increase testing time
Extra testing time for un-type-checked variables
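The dot-notation point and the un-type-checked point are two sides of the same coin, which a small sketch may make clearer. The response xml below is a hypothetical sample, parsed with XmlSlurper directly rather than returned by wslite, and the import assumes Groovy 3 or later (older versions have XmlSlurper in groovy.util).

```groovy
import groovy.xml.XmlSlurper

// Hypothetical soap response, for illustration only.
def responseXml = '''<soap-env:Envelope xmlns:soap-env="http://schemas.xmlsoap.org/soap/envelope/"
                   xmlns:ord="http://example.com/schemas/order">
  <soap-env:Body>
    <ord:GetOrderResponse>
      <ord:order status="SHIPPED">
        <ord:orderId>12345</ord:orderId>
        <ord:total>99.50</ord:total>
      </ord:order>
    </ord:GetOrderResponse>
  </soap-env:Body>
</soap-env:Envelope>'''

def envelope = new XmlSlurper().parseText(responseXml)

// Dot-notation walks the element hierarchy; @ reads an attribute.
def order   = envelope.Body.GetOrderResponse.order
def orderId = order.orderId.text()
def status  = order.@status.text()

// Everything comes back as text, so any typing is up to the caller.
def total = order.total.text() as BigDecimal

println "order ${orderId} is ${status}, total ${total}"
```

Nothing stops a typo like order.orderID from silently returning an empty result, which is where the extra testing time goes.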

Compared to our previous jaxb code (with ObjectFactory, creation of new Objects etc.), the code is now 40% smaller, and we additionally got rid of the client sources too.

Software is a big grandpa pendulum clock. It oscillates between simplicity and complexity perpetually. Ideas and concepts that start off simple rapidly become complex, eventually complex ideas are broken down into simple components, and again these simple components evolve into complex creatures of their own. After all, we evolved from a single cell into complex multi-cell living beings, so why blame the software for that?

Web services were necessitated by the complex interaction between disparate systems (remember CORBA, JNI, DCOM, RPC, mainframe apis?). All those were dissolved into ‘simple’ wsdl based services: as long as client and server maintain the contract, everything was promised to be a simple hookup.

But it didn’t end there. Generating wsdl clients, either in Java or .NET, was complex in its own right. Java found its way, as usual, with hundreds of wsdl clients and xml jars (anyone still using xerces, xalan, sax4j directly?): jaxb, jax-ws, jws, cxf and several more proprietary ones. .NET did not get it in one shot either: first asmx, then wcf and then several flavors of .NET versions on top of that. Most of the time, it’s not a one-step process to call a WCF wsdl from a Java based client; there are always some surprises around the corner in terms of configuration. It’s heartening to see nobody got it right the first time.

If you think about it, what was the necessity to generate classes from wsdl in order to send plain xml text data? The early xml apis described the xml in terms of hardcore hierarchy: root, nodes, parent, children, siblings, single. It could have been easily extended to represent an oh-so-lovable joint family: cousins, uncles, aunts, nephews and nieces.

Such family friendly APIs would have thrilled developers who often work only after the family goes to sleep.

I am also starting to see that xml and xsd schemas are not an ideal way to describe data models. Describing a model in xsd using the semantics of xml itself is overcomplicated. On top of that, extensions, simpleTypes and complexTypes are way too tedious and garrulous. If you have been exposed to the C++/Ruby/Groovy world, you can visualize the models pretty quickly in terms of encapsulations, compositions and interfaces. In the xml world, I only see tags and attribute names. Lots and lots of them. The content is lost in the tag-tsunami.

While recently reviewing a project where all our models are xsd based, with data models extending from base xsds and several simple, complex and xsd:any types, I pretty soon found them going over my head. I’m not even thinking about how xsds maintain versions of models and references.

Coming to actually using them in our Grails project, a colleague of mine tried to generate the client classes using jaxb (before trying a few other clients) and pretty soon found himself in a maze of conflicting classpaths from different generated wsdls. Jaxb generates classes for all the dependent xsds (obviously) and generating them for different wsdls (in the same project) ended up with several classpath issues.

Taking a break, we decided to ask a fundamental question: “Why do we have to worry about all the xml types in the Grails code, if all we need is to send plain textual xml data?”.

Who cares what the model hierarchy of xsd is? Most of the time that model is useless to the Grails code, where typically the representation of model is different from the xsd models. All those xsd generated classes, objects, factories, abstract factories were required because the complex familial Java APIs were the only way to process xml (unless using StringBuffer and appending data manually).

Enter Groovy’s MarkupBuilder: with it, building an xml is much easier than generating the client. So if all we care about is sending data, and generating that data itself is easier than generating clients, why bother at all with the latter? And so we switched to the Grails wslite plugin, with which we create soap requests directly and send them to the server.

So that’s how the pendulum swung back to its other position, back to creating hand made soaps. Eventually this may get complex too. But again, the important thing to remember is that the original complexity is not at the same level as the complexity of the broken down components. The “absolute complexity” remains unaltered and the “perceived complexity” depends on how lazy or smart we are.

Alright, I will have to admit that the misuse of xml is irritating me quite a bit. Xml is reaching a critical desalination point (sorry Quaid) and is set to freeze the world over. I sincerely hope that xml abuse will be the cause of the next Internet meltdown. A disproportionate amount of control information thrown along with a tiny amount of data is going to upset someone over the wire, causing data-strife and leading to an abrupt strike by tcp services.

That was pretty much my dream when I struggled over what I thought was a simple solution to a really simple problem. I was working on a data-migration project, you know, the routine thingy: read data from a legacy database, convert it to xml and stuff it down the throat of another database. I picked up Grails to get it done, uh well, on the premise that I would not have to handwrite a single query and could simply map tables to domains and magically invoke “save()”.

I was short-circuiting Grails by using it only for its GORM features, running as a grails script, and not for any ui, scaffolding or many of its “traditional” strengths.

1. GORM hates tables with no keys

The legacy tables I dealt with did not have any PKs at all. And GORM does not like them because of the required implicit id. Yes, composite keys work fine, but the legacy tables did not have any unique rows either. So the solution was to simply create a view on the table:

select rownum as uniquekey, t.* from table t

I couldn’t care a damn what the rownum was, as long as it was unique. I mapped the GORM id to the uniquekey and all was okay.
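The mapping could look roughly like the sketch below. The class name, the view name legacy_table_v and the name property are hypothetical stand-ins; only the uniquekey column comes from the view above. Since rownum is only meaningful within a single query, a mapping like this is best treated as read-only.

```groovy
// Hypothetical domain class mapped onto the view; names are illustrative.
class LegacyRecord {
    Long id        // backed by the view's uniquekey (rownum) column
    String name    // stand-in for a real legacy column

    static mapping = {
        table 'legacy_table_v'                         // the view, not the raw table
        id column: 'uniquekey', generator: 'assigned'  // the id is never generated by Hibernate
        version false                                  // legacy tables have no version column
    }
}
```

Dropped into grails-app/domain, this lets GORM read the rows with the usual list() and findBy calls without the table ever having a real key.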

2. Grails Spring Context in a script environment

Grails does an amazing job of injecting Spring beans wherever you expect it to. All’s well when running as a web app, but I was running this as a grails script, so the Spring services were not getting injected.

The fix was to look the service up from the Spring application context instead of calling new MyService(). That’s one of the great strengths of grails: the spring context is available throughout the script environment (not just Web and Unit Test) and it’s easy to experiment with. It just takes a bit of hunting, and asking for help.

3. XMLType – The Clown Prince of Column types

I ran the script and it ran well for about 1000 records, and I almost declared success. Almost. And then came the bouncer.

Colleges and Universities teach a lot about how to do programming. Sadly there are no professors who take a class on “How to write BAD code”. And set a test on some really professional-industry-level bad code. The experience gained from doing wrong things is hardly equivalent to making the right choices. Yeah, you know what Edison said.

Oracle’s SQL Errors (and JDBC Sql Exceptions) will always be dear to my heart in realizing how to write bad code. So I got the error:

ORA-01461: can bind a LONG value only for insert into a LONG column

I had a few long values in my domain, so I scratched them off one by one, like a senile lottery ticket buyer, and after some debugging found that all the numbers were good (Damn lottery pun!). By process of elimination I realized it was the XMLType column that wasn’t behaving. The reason was that the xml data was 4100 bytes in length. And Oracle’s XMLType is an “XMLType” if its data size is < 4kb, but if it’s above 4kb it’s a CLOB. Good deal! A dynamically self-morphing peekaboo data column type, FWIW.

Someone suggested using a stored procedure instead of domain mapping, so I cajoled my co-worker into quickly writing a wrapper storedproc for me. Try as I might, I could not get past the next error:

ORA-06553: PLS-306: wrong number or types of arguments in call.

When things go wrong, switch to basics, so they say. So I went back to basics of counting numbers by my fingers, nails and toes: all arguments are correct, all types are correct. Yet Oracle thinks I pass wrong arguments.

I abandoned the stored procedure route and turned to using Oracle XDB for the XMLType. Oracle does not respect the Maven community, but somehow I located an xdb.jar and dropped it into the grails lib directory. But STS Eclipse has issues with properly refreshing libraries. So I had to clean, close, sweep and brush the project from both STS and the command line to get it included in the classpath. But surprise, xdb.jar has its own SAXParser classes which interfere with the Grails scripts.

Googling around, I found a suggestion to use xmlparserv2.jar in grails/lib. Again I hunted the jar down, but if I put it in the lib directory, grails wouldn’t even compile. Then I put it in the classpath at runtime, but I simply got a ClassNotFoundException.

I guess the -cp option works fine when running as a run-app, but when invoking the gant scripts, the -cp is not honored (Grails 2.1.0). I realized I might have to tweak the Run-Script, but I didn’t want to go that far, because the script would be run in environments I don’t even know and I didn’t want to assume a lot.

I abandoned the XMLType route. Oracle’s half-cooked XMLType implementation and its support api aren’t for someone who likes to be productive. I took a break, wondering what the need was for storing 200 bytes of data (yes, that’s what the actual data size was) as a 4100 byte xml with all kinds of tags. That’s the anguish about xml abuse, and the internet meltdown that I fervently hope for.

4. When in Rome, treat everybody like British

Finally I found a piece of code which converted the XMLType into a Clob in a round-about way. Not sure why converting one data type to another needs a java.sql.Connection parameter. This is like saying that if you want to pour water from one container to another, you need to send it through a hydro-electric plant.

Only to discover that the second parameter has to be an OracleConnection, not just any java.sql.Connection, and certainly not what I get from Hibernate’s sessionFactory.currentSession.connection(), which is Proxy19, a connection proxy. So how do I get an OracleConnection from a proxy connection?

And so finally, after a long self-guided tour of google, oracle, stackoverflow and several more sites that even my browser history could not remember, I got the xmltype working, but not before I scratched around to fix the sql statement itself:

String SQL = /insert into table1 (col1, col2, col3) values (?, ?, xmltype(?))/

And so the conspiracy was settled – Grails, Hibernate, Script, Spring, Oracle, XMLType, CLOB, Connection Proxy – all concocted together to make a nightmare of a simple insert statement.

If you took a course in compilers in college, you would have learnt about syntax trees. The primary goal of a compiler is to convert code to the target cpu’s instruction set. Those who worked with C compilers in the early 1990s would know that there were several C compilers for DOS, Windows, Mac, *nix etc. and programs had to be compiled on each system individually.

Before the compilers output the executable code, they create something called an abstract syntax tree (AST) – which is a hierarchical representation of the code.

Take a simple example:

c = a + b

would be converted to

     [equals]
      /    \
    [c]   [plus]
           /  \
         [a]  [b]

So that’s the job of the compiler (technically, of the parser): take a human readable, well understood mathematical expression and convert it into a semantic graph, which we really don’t care much about.
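For the curious, the same tree can be written down and walked in a few lines of Groovy. This is only a toy sketch using nested maps; the node names (assign, plus, var) are my own labels, not any real compiler's representation.

```groovy
// The tree for: c = a + b, written as nested maps.
def ast = [op: 'assign', target: 'c',
           value: [op: 'plus',
                   left : [op: 'var', name: 'a'],
                   right: [op: 'var', name: 'b']]]

// Walk the tree recursively; env holds the variable values.
def evaluate
evaluate = { node, env ->
    switch (node.op) {
        case 'var':
            return env[node.name]
        case 'plus':
            return evaluate(node.left, env) + evaluate(node.right, env)
        case 'assign':
            env[node.target] = evaluate(node.value, env)
            return env[node.target]
        default:
            throw new IllegalArgumentException("Unknown node: ${node.op}")
    }
}

def env = [a: 2, b: 3]
evaluate(ast, env)
println "c = ${env.c}"
```

One line of source became a nested structure plus a walker, and that is exactly the part the compiler is supposed to do for us.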

A few days ago I had this huge xml in front of me which I had to convert to a flat file. For xml conversions, xslt generally seems to be the most popular solution. But as I started to write the xsl, I was reminded of college days spent reading about compilers, semantic graphs and syntax trees. That’s when I felt like I had borrowed an elephant to move my chairs around. And for a moment I felt like a jack-ass. What is xslt making me do? It is making me write some complex hierarchical code, which is very similar to the AST! But isn’t that the job of the compiler?

So as a programmer, I have to give up the most intuitive way of representing logic and start writing code as hierarchical syntax tree. If compilers were alive, they would either appreciate my generosity or make fun of me for doing half their work.

The cases above illustrate amply that xslt is a superficial construct, end to end. In retrospect, there was absolutely no necessity for xslt to respect the semantics of xml itself. It could have been a simple set of instructions. Worse than this is the CAML query syntax, where a simple SQL-like DSL is converted into complex hierarchy based conditions. The complexity grows quickly as more conditions are added. In other words, I feel these have been designed specifically to kill productivity.

It is easier to think of data models as hierarchical structures, but it is much harder to think of program instruction sets in terms of hierarchies. In data structures we are concerned about relations between entities, while programming is about flow of logic, not hierarchy of statements. So that’s the mantra: when you use xslt or xml to write code next time, be aware that the compilers are having a party at your expense.

And that’s when I snapped and switched to a simple Groovy script to do the flat file conversion using MarkupBuilders, which got the job done about an order of magnitude faster.
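A minimal sketch of such a script, assuming the direction described earlier (xml in, flat file out), for which XmlSlurper is the reading half; MarkupBuilder would be the tool for going the other way. The customers xml, the pipe delimiter and the field order are hypothetical, and the import assumes Groovy 3 or later.

```groovy
import groovy.xml.XmlSlurper

// Hypothetical input; the real xml was far larger.
def xml = '''<customers>
  <customer id="1"><name>Alice</name><city>Austin</city></customer>
  <customer id="2"><name>Bob</name><city>Boston</city></customer>
</customers>'''

def root = new XmlSlurper().parseText(xml)

// One flat line per customer: plain sequential logic, no templates.
def lines = root.customer.collect { c ->
    [c.@id.text(), c.name.text(), c.city.text()].join('|')
}

def flatFile = lines.join('\n')
println flatFile
```

The whole conversion is a loop and a join, written in the order I think about it.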

Time is cyclic, they say. The Sun, moon, seasons, tides, rains, storms, tornadoes and likely the universe itself: they all go and come back again with indifferent precision. There is something beyond the universe which definitely looks like a huge unimaginable recycle bin. Similarly, innovation is also cyclic. Old ideas are thrown into a recycle bin and they come out as what we feel is a new idea. What we worked on before, we work on again. It’s not just déjà vu, but also déjà fait.

I was working on a project in Oracle Fusion middleware. Sitting in the bus on the way back home, an xml <copy> instruction from that project set off a train of thought.

That’s an interesting syntax for setting a variable, isn’t it? For 400 years since Galileo, we have been used to mathematical notations like:

outputVariable.response = 404

Anyone remember writing in the 8085 instruction set? Here is a refresher:

LXI H, 0194H   ; load 404 (0194H) into the 16-bit register pair HL
SHLD 2000H     ; store HL at memory address 2000H

If compilers can understand ‘a = 404’ and if humans can understand that the same ‘a = 404’ really means ‘a = 404’, who are we pacifying with the above xml construct? The only participant that is satisfied by this complicated piece of assignment is the schema validator. Even the assembler instructions look simpler compared to the xml-instruction set. So we progressed from a complicated assembler copy instruction to a simpler a = b to a complicated xml <copy> instruction. Innovation is cyclic.

When xml became a rage a few years ago, every bit of configuration imaginable was ported to xml. We all enjoyed learning Spring xml configuration, where we could create classes and reference objects using declarative xml. Suddenly everything was hierarchical. Eventually the limitations of xml were keenly felt and out came the non-xml frameworks (Wicket and Rails, for example). Recent versions of Spring and Grails also do not rely on xml anymore as the backbone of configuration. Json and Dsl have made us realize that representing data simply, without a purposefully complicated hierarchical structure, is still possible. We progressed from non-standard-simple-configurations to standard-hierarchical-configurations-in-xml to standard-non-xml-configurations. Innovation is cyclic.

But again, Json and Dsl go back to a similar format of key=value, just like the properties files. So we have moved from flat-file-properties to hierarchical-xml-configuration to flat-file-look-alike-optionally-hierarchical-dsl config files. Innovation is cyclic.
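Groovy's own ConfigSlurper shows this nicely: the config is written as an optionally hierarchical DSL, and a single call turns it back into the flat key=value form of a properties file. The datasource settings here are hypothetical, chosen only for illustration.

```groovy
// Hypothetical DSL-style config, nested where nesting helps.
def config = new ConfigSlurper().parse('''
datasource {
    url = 'jdbc:h2:mem:demo'
    pool {
        maxActive = 8
    }
}
''')

// Hierarchical access with dot-notation...
println config.datasource.pool.maxActive

// ...and the same data flattened back into properties-style keys.
def flat = config.flatten()
flat.each { key, value -> println "${key}=${value}" }
```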

From Javascript validations to everything-must-be-on-server-validations back to ajax-type-javascript-validations. Innovation is cyclic.

Whenever data is transmitted, whether across a network or between circuits, there is always a small amount of control data that is sent along with it, like source, destination, checksum etc. Yet somehow we bought into the idea of xml, where the size of the control data is about 100 times the size of the actual data itself. We have been wasting a huge amount of network bandwidth, sending the same tags for a million records. We wrap 10 bytes of data with about 1kb of tags, so that the control data can be ripped off and the actual data be used. Much like a silk-embroidered golden gift wrap for a lego toy.
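The ratio is easy to measure for any given payload. Here is a small sketch that wraps a hypothetical three-field record in fairly modest tags and compares the bytes of data with the bytes on the wire; real schemas with namespaces and nested wrappers fare much worse.

```groovy
// Hypothetical record: the only bytes we actually care about.
def record = [id: '42', name: 'Alice', city: 'Austin']

// Wrap each field in a descriptive tag, as a typical schema would.
def fields = record.collect { key, value ->
    def tag = "customer${key.capitalize()}"
    "<${tag}>${value}</${tag}>"
}.join('')
def xml = "<customerRecord>${fields}</customerRecord>".toString()

def dataBytes  = record.values().join('').getBytes('UTF-8').length
def totalBytes = xml.getBytes('UTF-8').length
def wrapBytes  = totalBytes - dataBytes

println "data: ${dataBytes} bytes, wrapping: ${wrapBytes} bytes, total: ${totalBytes} bytes"
```

Multiply the wrapping by a million records and the gift wrap starts to cost more than the toy.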

Xml is not wrong. It provides a great way to exchange data between two systems when they don’t agree to agree with each other on anything, like a divorced couple. Or a teen and his or her dad. But using xml hierarchy to write mathematically or logically expressible code like the one above, or its cousin, the inexplicable-complexity-on-steroids Caml query, superimposes an overtly artificial hierarchical structure where there is only a conceptual linearity. It is like asking a husband to do laundry on a superbowl night.