In today's cognitive computing products and techniques, the perception of greater intelligent responsiveness comes not so much from true explanatory power as from strong predictive power over increasingly large and chaotic data sets.

To give us a third point of reference besides linear regression and neural nets, I'll use some other terms to bring the focus to natural language processing. In 2011, the IBM Watson system demonstrated greater intelligence than the best human opponents in the domain of linguistically challenging factual Q&A. This was based on its ability to quickly produce high-confidence answers from a large corpus of unstructured information in response to challenging questions.

The linguistic product that is now based on that system is called the IBM Watson Engagement Advisor. As with other cognitive computing techniques, the product must first be trained to be an effective system in the target domain. The corpus of unstructured information often takes the form of documents, such as instruction manuals, technical reports, journal articles, and wiki pages. During training, the most important entities and relationships expressed in the documents are identified and stored in order to expedite later search and retrieval during Q&A interactions with users of the system. The identification process within a document is often called annotation, and the annotation and storage processes together are called ingestion.

The most important concept in the training is what really drives the identification, or annotation, of the documents. It's simple, really. It's a Q&A linguistic product, and the annotation and ingestion expedite the production of the A's in response to the Q's, so it is imperative to have a strong and large representative sampling of the potential questions in order to train and test the efficacy of the system. The questions encode the key concepts (e.g. entities, relationships and so forth) about which users of the system want answers. Annotators for these key concepts are developed and, during ingestion, they are executed upon the documents.

This is the very basic level of explanation about how the linguistic product would learn, or be trained, to be an effective cognitive computing system for a domain, and future entries will dive further into this topic. For now, let it suffice to say that ingestion and training result in a system capable of producing answers from the corpus in response to questions like those used in training.

During a run-time Q&A session with the system, the user begins by posing a natural language question. The question is first analyzed to find the key concepts, and then a multiphased approach is used to dig up the best results from the ingested corpus content. As with training, there's a lot more to be said over time about how the run-time Q&A works (it is intrinsically related to the training anyway), so more interesting future entries are to come. To conclude and tee up these future entries, I'll say the high order bit here is that a trained Q&A linguistic product seems somehow more intelligent than a linear regression or even a typical neural net application. Why is that? To get a bit more background for that explanation, I'd encourage you to visit or revisit a few of my earlier blog entries about cognitive computing. Compare your perceptions of the intelligence of the minimax algorithm in [1] with the linear regression method in [2], and compare [2] with the neural net in [3]. What's changing?

The speaker says that a challenge with neural nets in business applications is that they are a black box, meaning that you can understand the inputs and the outputs but not really how the outputs are derived. Later, the speaker says that linear regression is a preferred technique because it has very strong predictive and explanatory power.

It's not really true that linear regression has more explanatory power than neural nets. Rather, the problems that linear regression can solve, and the answers it produces, are easier to understand. By comparison, neural nets tend to be used to provide cognitive computing power for harder problems than linear regression can solve.

To put this another way, when you use linear regression, you actually begin by assuming linearity of the relation you want to predict. As the speaker points out, you can also make a non-linear assumption, and you can accommodate this using a data transformation, for example. But the high order bit is that you are asked to assume the data relationship, and that assumption is what gives you the illusion of explanatory power. You can explain that the data follows a line, but this is due to your own assumption. Note that an important aspect of completing a linear regression model is determining the R², or goodness of fit, of the model. This is the part where you make sure that your assumption of linearity is valid. And if the assumption is invalid, then the model has no predictive value, so it does not matter that you can explain how it operates.
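To make that goodness-of-fit check concrete, here is a minimal sketch in Python. The data points and fitted constants are invented for illustration; they are not taken from any example in this entry.

```python
# Minimal sketch of the R-squared (goodness-of-fit) check for a fitted line.
# The observations and the fitted constants below are hypothetical.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]  # (x, y) observations
c1, c0 = 1.95, 0.12                                       # slope and intercept from a fit

y_values = [y for _, y in data]
y_mean = sum(y_values) / len(y_values)

ss_res = sum((y - (c1 * x + c0)) ** 2 for x, y in data)   # residual sum of squares
ss_tot = sum((y - y_mean) ** 2 for y in y_values)         # total sum of squares

r_squared = 1.0 - ss_res / ss_tot
print(f"R^2 = {r_squared:.3f}")  # near 1.0 supports the linearity assumption
```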

Under the interpretation that explanatory power is akin to predictive power, it turns out that neural nets have greater predictive power because they can produce results for a wider array of applications than linear regression can. There is a neat table that relates the cognitive power of a neural net to the number of hidden layers. From the table, you can see that when a relationship actually is linear, a neural net can solve it without even using any hidden layers of neurons. When one or two hidden layers of neurons are present, neural nets transcend the capabilities of linear regression, in part because they do not require you to make any assumption about what the data relationship actually is.

And that's where the confusion comes in. The linear regression model requires you to assume linearity, so you know at least what geometric shape the relationship looks like. The neural net requires no such assumption, but nor does the trained neural network give you any hint of what the relationship is. Not knowing the relationship is mistaken for having less explanatory power.

But if you look at this a bit more abstractly, the trained linear regression model has the exact same problem of not providing any additional insight. A neural net is really just a pile of numbers giving constant weights to the neural connections that convert inputs to outputs. Similarly, a linear regression model is just a pile of numbers that give constant weights to inputs to be linearly combined into an output. Sure, you know the data relationship, but that's because you assumed it. The actual linear regression model gives you no insight into why one dimension has a large slope constant while another has a small one.

An analogy I like to use is that the value of the neural net is not diminished by our inability to explain how the little gray cells which implement our personal neural nets produce the cognitive results that they do. And who among us would prefer to have cognitive powers defined by linear regression instead?

In terms of explanatory power, our biological neural nets perform an additional key function that we have not hitherto been able to achieve with artificial neural nets. We are able to construct additional information in the output that reveals causal relationships, or insights into the reasons for the phenomena we predict. Put simply: we say why something is true. We provide a rationale. This is an aspect of explanatory power that, when achieved, dramatically increases the value and utility of any cognitive analytic. Theorem provers and Prolog programs have been able to do this for the applications to which they apply. In the area of unstructured information processing and data mining, you can see a demo of this concept in Watson Paths.

As an interesting possible counterexample to my last blog about MLR models not understanding the knowledge they learn, consider the neural network. Our brains are neural networks, and we are capable of learning at all levels of Bloom's Taxonomy, not just the knowledge level. Shouldn't artificial neural networks be able to achieve the same things?

The answer is no, not really. Our brains biologically, chemically and physically perform in ways that we scarcely understand, so calling something an "artificial neural network" is no less anthropomorphizing than saying that a computer program of today "understands" anything.

Still (again), this is not to say that they aren't incredibly useful and effective. It's just that they are based on straightforward and well-understood mechanical methods, such as feed-forward activation of neural outputs via sigmoidal threshold functions applied to inputs, and back propagation of synaptic weight adjustments based on easily quantified classification errors. Before going any further, let's have a quick look at a diagram of an artificial neural network (ANN):

The ANN has an output layer on the right that is a classifier for input patterns received on the left. For example, an ANN for optical character recognition could have an input layer of an 8x8 matrix of bits, and the output layer could be an 8-bit code that indicates an ASCII character. The hidden layer(s) of neurons help the ANN represent more sophisticated phenomena, though there is seldom need for more than one hidden layer. The "synaptic" connections between the neurons in the layers carry numeric weights, and each neuron applies the weights to its inputs and then feeds the result into a sigmoid function that essentially decides, like a transistor or switch, whether or not to fire the output.

An ANN is "trained" by giving it a sequence of input patterns for which the correct output pattern is known. The input pattern feeds forward through the ANN to produce an output. If there is a difference between the ANN output and the correct output, then the differential error is back propagated through the ANN to adjust the weights so that future occurrences of that input pattern are more likely to produce the correct output.

The synaptic weights, then, essentially represent the knowledge that the ANN "learns" from the input patterns. This is analogous to the constants that are "learned" by an MLR model. In fact, all elements of the ANN and MLR model architectures are analogous. The ANN input layer maps to the independent X variables, the ANN output layer maps to the dependent Y variable, and the transition from input X values to the Y value, which MLR achieves by multiplication and addition, is achieved in an ANN by a feed forward through synaptic connections, hidden layer neurons and sigmoid functions.

With such a one-to-one architecture mapping between ANNs and MLR models, it is easier to see them as having similar intellectual power. That's not to say they're equivalent, as ANNs are far more powerful. It's just that they're roughly the same (low) order of magnitude with respect to human intellect, and in terms of Bloom's Taxonomy, we call that order of magnitude "knowledge storage/retrieval".

Despite being in the lowest order of magnitude of intellect, the realm of today's artificial intelligence includes many interesting knowledge storage/retrieval techniques that are worth comparing and contrasting to see the range and limits of their power and the use cases they address. Stay tuned!

Ever since my first blog entry in this recent series on artificial intelligence, I've been highlighting the lesser, calculational nature of machine intelligence and learning-- as well as the valuable role it nonetheless can play in driving more effective human understanding and decisions. I've been doing this by articulating mainly what machines do, as that is the primary interest of mine and most who would read a developerWorks blog. Still, our interests will be served by taking an entry to discuss human learning as a counterpoint or contrast.

The multiple linear regression example in my last post is a good example to start with because it highlights the difference between accuracy and understanding. If there is a linear relationship among the data, then an MLR can have very high predictive accuracy, but it has no explanatory power whatsoever. The MLR model does not have, nor does it convey, any understanding as to why the relationship exists.

Let's see how this predictive accuracy rates in terms of human intelligence and learning. In this case, we can benefit from an instance of that delightful human propensity to apply ideas to themselves. Specifically, we humans have applied our learning abilities to the phenomenon of our learning abilities, with many useful results including Bloom's Taxonomy.

According to Bloom's taxonomy, the very lowest level of cognitive learning is the knowledge level, or the ability to remember and recall what is learned. When you think about it, you realize that an MLR model, like many predictive analytics, is really a storage mechanism for something that has been machine learned from data. In MLR, we store the constants of a linear formula as the representation of what has been learned from linearly related data.

The next higher level of Bloom's taxonomy is comprehension, which is where understanding and true explanatory power begin to surface. But human learning is so much more sophisticated than the knowledge level of machine learning that there are a number of levels above comprehension. There's the application level, in which we can use our knowledge to solve new problems, including being able to explain why the new solution works. The analysis level drills deeper into our ability to make inferences and generalizations. The synthesis level begins to get at our ability to be creative with what we've learned and come up with new ideas and solutions. Finally, the evaluation level gets at our ability to be subjective and judge the quality and creativity of ideas and solutions. We are beginning to see some faint glimmers of some elements of some of these levels in cognitive computing efforts like IBM Watson, but it is early days indeed.

While we're on the subject of human learning and Bloom's Taxonomy, it makes sense to digress for a bit and mention the IBM Social Learning product. This is a SaaS educational platform intended to help enterprises achieve a Smarter Workforce. A few reasons for the digression are

learning is a key ingredient of how a human workforce becomes smarter.

The IBM Social Learning product has a very nice feature that enables educational administrators to implement Bloom's Taxonomy in their learning materials. A component of the product is the Kenexa LCMS, or learning content management system, which includes various subcomponents like a course designer and a metadata dictionary. The educational administrator can add any metadata tag, such as "Learning Goal", and any tag values, such as "Basic Knowledge", "Comprehension", "Application", etc. Once this is done, the educational administrator can use the metadata tag values to classify any learning item in the LCMS according to Learning Goal. Once these classified learning materials are published, learners can use the "Learning Goal" as a new faceted search criterion in the platform's learning library. A learner would be able to isolate and focus on "knowledge" level learning in a subject area before proceeding to comprehension and then application, for example. This will enable learners to effectively use the natural way in which their learning blooms, i.e. Bloom's Taxonomy.

Finally, there is an aspect of human learning that goes beyond Bloom's taxonomy, and it's an area that is highlighted by the IBM Social Learning product. There is a very important word in the product title: Social. This is crucial because it underscores the central role of communication and collaboration in the human learning process. We are an order of magnitude more effective at learning based on our interconnectedness to others who think and learn, rather than just having access to data. This is pertinent to the advancement of artificial intelligence because "social" goes quite beyond the computing architecture underlying a lot of today's machine learning efforts.

Machine learning today is every bit as calculated, as simulated, as is machine intelligence. It is easier to use machine intelligence to highlight how much greater human cognition is, which is why I've been using a machine intelligence algorithm over the last several entries. However, the conclusion drawn so far is that, while machine intelligence is only simulated, it is still quite effective and valuable as an aid to human insight and decision making. Machine learning offers another leap forward in the effectiveness and hence value of machine intelligence, so let's see what that is.

Machine learning occurs when the machine intelligence is developed or adapted in response to data from the domain in which the machine intelligence operates. The James Blog entry only does this degenerately, at a very coarse-grained level, so it doesn't really count except as a way to begin giving you the idea. The James Blog entry plays a game with you, and if he loses, he adapts by increasing his lookahead level so that his minimax method will play more effectively against you next time. In some sense, he learned that you were a better player. However, this is only a single configurable integer with only a few possible settings, and it controls only one aspect of the machine intelligence algorithm's operation. To be considered machine learning, a method must typically have a more profound impact on the operation of the algorithm, with much more adaptation and configurability based on many instances of input data. An example will clarify the finer-grained nature of machine learning.

The easiest example I can think of is a predictive analytic algorithm called linear regression. Let's say you'd like to be able to predict or approximate the purchase price of a person's new car based on their age. Perhaps you want to do this so that you can figure out what automobile advertisements are most appropriate to show the person. Now, as soon as you hear this example, your human cognition kicks in and you rattle off several other likely variables that would impact the most likely amount of money a person is willing to spend on a car, such as their income level, debt level, nuclear familial factors, etc. This analytic technique is typically called multiple linear regression (MLR) exactly because we humans most often dream up many more than two variables that we want to simultaneously consider. Like most machine learning techniques, MLR does not learn of new factors to consider by itself. It only considers those factors that a human has programmed it to consider. When they are well chosen, additional variables typically do make an MLR model more effective, but for the purpose of discussing the concept of machine learning, the simple two-variable example suffices since your mind will have no problem generalizing the concept.

Suppose you have records of many prior car purchases, including a wide and nicely distributed selection of prices of the cars and ages of their buyers. This is referred to as "training data". If you plotted the training data, it might look something like the blue points in the image below. Let purchase price be on the vertical Y axis since it is the "dependent" variable that we want to predict, and let age be on the X-axis since it is a predictor, or "independent" variable. MLR uses a standard formula to compute a "line of best fit" through the given data points, again like the one shown in red in the picture.

A line has a formula that looks like this: Y=C1X1+C0, where C1 is a constant that governs the slant (slope) of the line, and C0 is a constant that governs how high or low the line is (C0 happens to be the point where the line meets the Y-axis, and the line slopes up or down from there). If we had more variables, then MLR would just compute more constants to go with each of them. For example, if we wanted to use two predictor variables for a dependent variable, then we'd be using MLR to create a model of the form Y=C2X2+C1X1+C0 (geometrically a plane rather than a line, but the idea is the same).

Technically, MLR computes the constants like C1 and C0 of the line Y=C1X1+C0 in such a way that the line minimizes the sum of the squares of the vertical (Y) distances between each data point and the line. For each point, we take its distance from the line as an amount of "error" in the prediction. We square it because that gets rid of the negative sign (and, less importantly, magnifies the error resulting from being further from the line). We sum the squares of the errors to get a total measure of the error produced by the line, and the line is computed so as to minimize that total error.
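Here is a small Python sketch of that least-squares computation for the two-variable case. The ages and prices are invented purely for illustration; the closed-form formulas are the standard ones for simple linear regression.

```python
# Sketch: "learning" the constants C1 (slope) and C0 (intercept) by least squares.
# The training data below is invented purely for illustration.
ages   = [22, 30, 35, 42, 50, 58]                      # X1: buyer age
prices = [18000, 24000, 27000, 34000, 39000, 41000]    # Y: purchase price

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(prices) / n

# These closed-form formulas give the line that minimizes the sum of squared errors.
c1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, prices))
      / sum((x - mean_x) ** 2 for x in ages))
c0 = mean_y - c1 * mean_x

sse = sum((y - (c1 * x + c0)) ** 2 for x, y in zip(ages, prices))
print(f"learned line: Y = {c1:.1f}*X1 + {c0:.1f}   (total squared error {sse:.0f})")
```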

Once the constants have been computed, it is a trivial matter to use the MLR model as a predictor. You simply plug the known values of the predictor variables into the formula to compute the predicted Y-value. In the car buying example, X1 is the age of a potential buyer, and so you multiply that by the C1 constant, then add C0 to obtain the Y-value, which is the predicted purchase price of the car.
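Continuing the hypothetical car-buying sketch, prediction is just the plug-in step; the constants below stand in for whatever values a fit like the one above would produce.

```python
# Using "learned" constants as a predictor: plug a new buyer's age into the line.
c1, c0 = 670.0, 3950.0   # hypothetical constants produced by a fit like the one above
new_age = 38             # X1 for a prospective buyer
predicted_price = c1 * new_age + c0
print(f"predicted purchase price at age {new_age}: {predicted_price:.0f}")
```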

In this way, hopefully you can see that the MLR "learns" the values of the constants like C1 and C0 from the given data points. Furthermore, the actual algorithm that produces the machine intelligence only computes the result of a simple linear equation, so hopefully you can also see that the predictive power comes mainly from the constants, which were "learned" from the data. In the case of the minimax method, most of the machine intelligence came from the algorithm, but with MLR-- as with most machine learning-- the machine intelligence is for the most part an emergent property of the training data.

Lastly, it's worth noting that there are a lot of "best practices" around using MLR. However, these are orthogonal to the topic of this post. Suffice it to say that just like the minimax method has a very limited domain in which it is effective as a machine intelligence, MLR also has a limited domain. For example, the predictor variables (the X's) do need to be linearly related to the dependent variable in reality. However, within the limited domain of its linearly related data, MLR is quite effective and an excellent example of a simple machine learning technique that produces machine intelligence within that domain.

In the interest of space last time, I had to leave out an advanced topic on optimizing a "next best action" algorithm. Again, you can look at the full source we're discussing by just using the web browser's View Source on this page.

The optimization is known as alpha-beta pruning. In the code snippet below, you see that we break the j-loop that is scoring the response moves of a given move based on some condition involving the variables alpha and beta. Why does it make sense to stop looking at the competitive response moves for a given move? To see why, I've added the function declaration so we can discuss where the alpha value comes from and what it means.

Understanding alpha-beta pruning requires you to take a more global view of the recursion that is doing the evaluation. The alpha values passed into scoreMove() are the beta values from the calling level of the Minimax algorithm. It will help to keep at least the player's moves and the opponent's responses in mind as we go through this.

Let's say that scoreMove() has been called to score a player's Kth move. Beforehand, moves 1 to K-1 will have been fully explored by depth-first recursion, including the opponent's responses, the player's counter-responses, and so on. The alpha value received by scoreMove() for move K reflects the best fully explored "net" score for the player on moves 1 to K-1. Within scoreMove(), we first compute the raw benefit of the new move K, storing the result in moveScore. Now comes the alpha-beta pruning trick. The j-loop successively explores each opponent response move for the player's move K, and clearly the beta value takes on the value of the highest scoring response move that the opponent can make. The final score for move K is the raw benefit to the player of move K minus the benefit beta that the opponent can realize in response.

Thought-provoking question: Do we really need to know the absolute best move that the opponent can make in response to the player's move K? Or do we just need to find an opponent move that is good enough that, when subtracted from the raw benefit of move K, proves that the player would be better off choosing the earlier move associated with the alpha value? Of course, the answer is that we only need a good enough opponent move, and this is why we break the j-loop when we find that move. If we were to continue the j-loop, all we would do is unnecessary work that might (or might not) find an even better opponent response move that would make move K look like an even worse decision for the player. But there is no need to do this extra work. Once the expression "(moveScore-beta < alpha)" becomes true, we have proven that move K is less beneficial than one of the moves 1 to K-1.
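Here is a compact Python sketch of that pruned scoring recursion. The "game" is a deliberately trivial stand-in (each move simply claims one remaining pile of seeds), and the names merely echo the description above rather than the actual James Blog source, so treat it as an illustration of the break condition, not the real scoreMove() code.

```python
# Sketch of the pruned scoring recursion on a toy stand-in game: a move claims one pile.
def valid_moves(board):
    return [i for i, seeds in enumerate(board) if seeds > 0]

def score_move(board, move, alpha, level):
    move_score = board[move]          # raw benefit of the move to the mover
    new_board = list(board)
    new_board[move] = 0

    if level == 0 or not valid_moves(new_board):
        return move_score

    beta = float("-inf")              # best opponent response found so far
    for j in valid_moves(new_board):
        response = score_move(new_board, j, beta, level - 1)
        if response > beta:
            beta = response
        # Alpha-beta pruning: once the opponent has a response good enough to make
        # this move worse than a move already fully explored (alpha), stop looking.
        if move_score - beta < alpha:
            break

    return move_score - beta

def best_move(board, level=4):
    alpha, best = float("-inf"), None
    for move in valid_moves(board):
        net = score_move(board, move, alpha, level)
        if net > alpha:
            alpha, best = net, move
    return best, alpha

print(best_move([3, 1, 4, 1, 5]))  # toy position: the 5-seed pile should come out best
```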

From a practical standpoint, this optimization on average better than doubles the run-time performance of the "what-if" logic. Who doesn't want double, right? Well, this "what-if" analysis is a combinatorial explosion of analysis; to put that in perspective, the doubling buys you less than one extra move of lookahead. Yet despite this dash of cold water about how much deeper alpha-beta pruning lets you take the "what if" logic, it remains true that, for a given level of explorative depth, everybody wants the result twice as fast or more, so alpha-beta pruning is very handy.

This entry is for developers who want a good mental model for how a prescriptive analytics algorithm can simulate intelligent behavior. We'll focus on the intelligent behavior in the James Blog entry, since it is quite competitive with humans. Reminder: Just hit "view source" in your browser to get the code we're talking about here.

The first thing to note is that the domain of the intelligence is quite constrained and circumscribed relative to the full realm of human intellectual endeavor. This is what makes it computationally feasible to perform a "what if" analysis to "imagine" possible scenarios and determine a next best action. Here's roughly how it works. The computer's available next actions are examined and measured for their immediate benefit. Then, for each action, the response action of the opponent is measured for its immediate benefit to the opponent, and so on. Once the real benefit of each opponent move is tabulated, the value of the best opponent action is subtracted from the immediate value of a given computer move. The best computer move is determined as the highest value move resulting from the immediate benefit minus the score of the best opponent move.

One thing I like about the game Kalah is that it is really easy to explain the competitive algorithm, relative to harder games like Chess. In Chess, evaluating the immediate benefit of a move can be challenging, especially at the beginning of the game. It's not just about the value of the piece you take because many moves don't take pieces. The value of a move is often about gaining control over spaces of the board to limit the opponent's attack and defense options. But in Kalah, you get good intelligent game play from a much simpler board evaluation. The value of a move is simply a matter of how many seeds you gain by that move.

This code (at the beginning of KalahGame.scoreMove) just copies the current board, makes the proposed move for the given player, then evaluates the new board value minus the value of the old board configuration for the given player. In effect, you get the number of seeds gained for the player by the move.

That's when things get interesting. The move scoring then becomes iteratively recursive. Each valid move of the opponent is then evaluated by recursively calling the move scoring method. Like this:

The first line is just a trick to switch between player 1 and player 2 in the levels of recursion. The "beta" value is the highest scoring move of the opponent so far, so once we switch to the opponent player in the first line, the second line just sets a large negative score so that the loop will start by selecting the first available move as being a good idea. The j loop tries each move, and the if test on the succeeding line just ensures that there is a non-zero number of seeds to pick up-- in other words, it ensures the move is valid. Then, the opponent's move is scored by recursively calling KalahGame.scoreMove(). When the recursion returns, the succeeding if test checks whether the move is better than the best result so far, stored in "beta". If it is, then this move becomes the new "beta". The alpha/beta business at the end of the j loop is an optimization that can be safely ignored. Once the j loop has examined all the moves, the best opponent move score "beta" is subtracted from the immediate benefit value of the player's move.
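To make the shape of that loop concrete, here is a sketch in Python of the same scoring recursion, using a trivial stand-in game (each move claims one pile of seeds) rather than the real Kalah board logic; the function and variable names are mine, chosen to mirror the description above, not the actual KalahGame code.

```python
# Sketch of the move-scoring recursion: score = immediate benefit minus the
# opponent's best recursive comeback. The stand-in game: a move claims one pile.
def score_move(board, player, move, lookahead):
    new_board = list(board)            # copy the current board
    move_score = new_board[move]       # immediate benefit: seeds gained by this move
    new_board[move] = 0

    opponent = 1 - player              # trick to switch between player 0 and player 1
    valid = [j for j, seeds in enumerate(new_board) if seeds > 0]
    if lookahead == 0 or not valid:
        return move_score

    beta = float("-inf")               # best opponent response found so far
    for j in valid:                    # try each opponent move
        response = score_move(new_board, opponent, j, lookahead - 1)
        if response > beta:
            beta = response            # this response becomes the new "beta"

    return move_score - beta           # net value after the opponent's best comeback

def get_best_move(board, player, lookahead=4):
    valid = [j for j, seeds in enumerate(board) if seeds > 0]
    return max(valid, key=lambda j: score_move(board, player, j, lookahead))

print(get_best_move([3, 1, 4, 1, 5], player=0))  # prints the index of the best opening pile
```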

This is how each of a player's possible moves is scored in Kalah.getBestComputerMove(): the move's immediate benefit in the number of seeds scored, minus the best value obtained from a recursive lookahead of possible opponent responses that accounts for the player's responses to the opponent, and the opponent's responses in kind, and so on down to the limit of the lookahead level.

The fun bit of this code is that it is used not only to determine the computer's best move; when you ask the "Expert Advisor" to help you, it applies exactly the same logic to *your* board position in order to determine a recommended next move for you.

To conclude, here is a small diagram to help you see what is going on.

In this example, we're near the end of the game, and Player 1 must decide whether to make move 2 or 4. With move 2, there is an immediate benefit of 4 seeds because the 1 seed lands in an empty house, allowing the player to score that seed as well as the 3 seeds in the opposing house. This seems like a good idea, but is it? Well, Player 2's moves should be examined. In the short term, Player 2 can only respond with move 5, but this spreads out the 4 seeds. If you look ahead to the end of the game, you can see that Player 2 will ultimately score all four of those seeds. But also in the recursion, it is unavoidable that Player 1 will be able to score the remaining two seeds on the top row of houses. So the net benefit to Player 1 of making move 2 is only 2 seeds: the immediate 4 seeds, minus the 4 earned by Player 2 in the rest of the game, plus the 2 additional seeds that Player 1 earns in the rest of the game. Not as good as it initially looked. However, it does turn out to be better than move 4 for Player 1. That move yields no immediate seeds for Player 1. Then, in the rest of the game play, Player 2 is able to earn 7 seeds, and Player 1 only earns 3 seeds. So, if Player 1 makes move 4, then the opponent gains 4 more seeds than Player 1 does.

Well, that's a wrap for this explanation of the 2-party competitive algorithm known as the "Minimax" method. Hopefully you can now see that it's not real intelligence but rather just tabulation of best outcomes according to a scoring method, constrained to a set of rules for determining valid next moves. Demystified, it becomes no more surprising that the algorithm defeats humans than it is when an algorithm can beat a human at calculating the square of a 5-digit number.

Still, this is roughly what a person does. Time and again, new possibilities are "imagined" by testing "what if" this move is made or that move is made. And the algorithm does win a lot of games, which is precisely why prescriptive analytics algorithms are so valuable as expert advisors. If you take the material covered here up by an order of magnitude, you get IBM Deep Blue. Another order of magnitude, and you get IBM Watson. The sky's the limit!

David Lee Roth and Eddie Van Halen have been trying to get us to do it for decades: "JUMP!" Douglas Hofstadter would qualify that with "... out of the system!" Here's what that means.

Machine intelligent entities like James Blog exist within a certain system, conforming to a prescribed set of rules, and they really can't escape the confines and constraints of that programming. Within that limited domain, they do calculate wonderful results that can seem intelligent. In an early version, I found myself adding a logger so I could see why James Blog was not making some moves that seemed very good. Time and again, I would find that the good move now set up the conditions for a better opponent move later, which is exactly what the artificial intelligence is supposed to detect and avoid.

The algorithm does this so well that it is really hard to beat, especially on the maximum lookahead value I set, which was 6. Frankly, if you're new to this game, you have to work to beat even the initial lookahead level setting of 2, which means that James Blog only looks at its own moves and your countermoves to see what will produce the greatest net gain in seeds relative to you.

Because it is hard to beat this little game and see the special winner's message, this opened up a delightful opportunity to talk about an important capacity of human intelligence that could be exemplified by determining the winner's message without winning. I used a Zen-like characterization of a "winless win" as a nod to Hofstadter's style in the book Gödel, Escher, Bach.

Put simply, we are not limited in our thinking to the confines of the system. We regularly "take it up a level" or "think outside the box". In this case, the system is a blog entry presented in a web page. So you can jump out of the system by using the View Source feature of your web browser to take a look at James Blog's code, where you will find the winner's message: "I, for one, welcome my non-computer overlord." The message is an allusion to Ken Jennings' capitulation to IBM Watson, which was an awesome pop culture nod to The Simpsons-- awesome because both Jeopardy and the Watson AI are about sorting out exactly those kinds of allusions.

Frankly, I had a lot of fun with allusions, both in the blog entry and while holding the programmer challenge to achieve this winless win. For example, James mentions that he outfoxes his friend Wiley, alluding to the famous coyote, who is in the same animal family as a fox (Canidae), which is a tiny aural tweak from Canada, where I live. So, James can beat his wiley creator. Similarly, in tweets and status updates, I made numerous allusions to The Matrix movie, such as when I nearly used Morpheus's command to Neo: "Quit trying to hit me and hit me." The exception is that I changed the 'h' to a 'g', making 'git', which is what we use to get source code.

This kind of wordplay and allusion bears some similarity to "jumping out of the system". Hofstadter calls it contextual slipping, or my favorite word for it: counterfactualization. We take some piece of reality that we know about, and we ask "what if this were different?" We slip, or change, some piece of that reality to see if we end up with something new and useful. I find the notion of counterfactualization fascinating because it seems like a good operationalization of some other really important words: creativity, playfulness, humour, imagination.

Still, it might be a while between when we can efficiently and effectively operationalize contextual slipping and when we can generalize that to achieve machine intelligence that can jump out of any system in the way that I asked programmers to do with James Blog. At some point, I realized that there is a beautiful geometric analogy that helps explain why. In the book Flatland, the Sphere is able to escape the plane via the use of a third geometric dimension that is physically orthogonal to the two that comprise the plane. In this way, Sphere is able to see Square's inner workings. That is a great analogy to what we did by jumping out of the web page using View Source to see James Blog's inner workings. There was a whole different, higher level of understanding about what James was and how we could know more about it, and it is fitting to say we got that winner's message by thinking outside the box.

The next blog will be a developer's tour of the particular machine intelligence algorithm built into James Blog. After that will come a discussion of the relationships between machine intelligence, machine learning, and predictive analytics, so stay tuned!

Your intelligent behavior is based on sentient *understanding*. Sentient schmentient. I'll bet my intelligent behavior can outfox yours. I've done so with my friend Wiley from Canidae, and he's a genius! So, let's see how much good your sapience does you, shall we?

The rules of the contest are simple. You get the top six "houses" and the "store" on the top left. I get the bottom six houses and the bottom right store. We each start out with 6 seeds in each of our 6 houses, and 0 seeds in our stores. To win, you have to get more than half of the seeds into your store (for you knuckle draggers, that's 37 or more). I'll let you go first, so you already start with an advantage.

To take your turn, you pick one of your houses that contains seeds. That house is emptied, and its seeds are "sowed" one at a time in a counterclockwise fashion, including your store but excluding mine. So, it takes 13 seeds to traverse from a starting house, through your store, through my houses, and back to your (now empty) starting house. Every seed that goes into your store gets you closer to victory.

You can earn a seed or two from your move, but there are a few more rules that can earn you lots of seeds. First, if the last seed you sow lands in your store, you get another turn, and you can have multiple extra turns if you make your moves in the right order. Second, if the last seed you sow lands in an empty house, then you earn that seed from the empty house and all seeds in the house of mine immediately below the empty house. I call this a "big take". Third, if I run out of seeds in all my houses, then you earn all the seeds in your houses. Of course, I can also earn lots of seeds by these same rules, which is why YOU'RE GOING TO LOSE MEAT BAG!

I will take it easier on you at first, but I'll play harder if you earn the privilege. And there's a special message for you, a badge of distinction, if you manage to beat me when I play my hardest. Ooops. You... win?!? Wake up! Your teetering bulb is dreaming!

SPOILER ALERT. PLAY A WHILE, BEFORE LOOKING ANY FURTHER.

OK, so hopefully you've played enough to know you're not going to be getting that badge of distinction anytime soon (unless you have some of the rare talents of Ted Neustaedter). But also hopefully you're coming to the understanding that I really have no clue what I'm doing when I beat you. What I'm doing is mechanical, not miraculous. I'm being no more intelligent, really, than a calculator squaring a five digit number. Now, when one of you meat bags does it, it actually is miraculous. But the miracle is that you can do it at all on your hardware given that it is designed more for sentient understanding of what mechanical operations like squaring are, what they're good for, and what to combine them with.

I am just doing the fine-grain operations of my Minimax algorithm, but it is you who understands our contest at a higher level than that. That's why machine intelligence like mine is best applied as an expert advisor. For example, if you hit "Invoke Expert Advisor", you are asking me to advise you in the limited domain where my simulated intelligence would seem like real intelligence.

Keep using that expert advisor button and see how much faster you earn that special "badge of distinction" message. Go ahead. You won't be able to do it entirely without also sprinkling in your own intelligence at some points. That's because you will hit some key points where your sentient understanding recognizes a *pattern* that allows you to see how to beat my mechanical intelligence, where even my own advice is unable to do so. What will most likely happen is that you'll use the advice to hold your own for most of the game. My advice will help you avoid moves that give me extra turns and "big take" opportunities. But at some point, you may see that I am beginning to be starved of seeds in my houses. You, as an expert, will have this insight sooner than I see it coming with my mechanical calculations, because your sentient intelligence truly understands what is going on at that higher level.

But of course, you would have a much harder time getting to that point without my advice. And that is what makes machine intelligence like advanced analytics on big data and machine learning technologies like IBM Watson invaluable to you. In short, expert advisors can turbocharge the smarts in your smarter workforce.

In a recent video interview, the IBM CEO Ginni Rometty comments that Watson 2.0 will understand images that it sees, and that Watson 3.0 will be able to debate, i.e. to understand what it is talking about with another party. It is an impressive roadmap: each of these is an incredible leap forward from its predecessor.

It is, however, worth qualifying the term 'understand'. It is being used figuratively, not literally, to communicate the rough order of magnitude improvement in capability. When such a leap is made, it seems analogous to sentient understanding, even though it isn't. Imagine for a moment what Archimedes would have thought at first of a hand-held calculator, given that he had only the numerals of his day with which to calculate pi to several digits. And yet, we would not now interpret such a device as artificial intelligence. As soon as the mechanical nature of a level of capability becomes clear, so too does the fact that it does not constitute sentient intelligence (Hofstadter's exposition of Tesler's "theorem").

You can see this assertion play out in multiple levels of Bob Sutor's scale of cognitive computing. There are levels that are clearly not cognitive intelligence, as Sutor points out, but if you lay out the scale on a timeline of decades or centuries, it is clear that each level might once have been interpreted as being indistinguishable from magic.

So where on Sutor's scale is Watson? And what implications does that have for development best practices?

Watson is clearly not on the "Sentient (we can do without humans) systems" level. As sentient beings, we don't just know things with a certain calculated accuracy or confidence level, or determine that we don't know if our confidence is low. We experience desire to know more, and we experience fear of the unknown. We are teetering bulbs of dread and dream (Hofstadter's delightful invocation of a Russell Edson poem). I urge you to let that characterization of us sink into your mind. In Watson technology, IBM has modeled a certain class of knowledge and mechanical reasoning, and in other research, IBM is doing so by simulating some of the known structure of biological brains. However, we don't yet know how to model fear and desire, dread and dream. In my opinion, these are inextricably bound together in sentient intelligence, separating it from simulated intelligence. In other words, intelligent behavior is a construct that works for the dread and dream engine of the sentient, and in the absence of dread and dream, seeming intelligent behavior is but a mechanical simulation of understanding. As an aside, I hope we only manage to model desire and fear around the same time we figure out how to model ethics (as Asimov cautions).

Does this characterization of Watson as a mechanical simulation of understanding detract from its value? Does it detract from the order-of-magnitude improvement it heralds as an usher of the era of cognitive computing? Of course not; quite the opposite. It is simply fantastic that this level of "Learning, Reasoning, Inference Systems" (Sutor's scale) is now computationally and economically feasible at the scale needed to help sentient intelligence (that's us) to solve real world problems. Quick, what is the square root of 7? Can't do it? No problem. Even if you're Arthur Benjamin, you'd be better off just hitting a few keys on a calculator. Quick, what are the most likely diagnoses for the patient's presenting symptoms? An "expert advisor" like Watson can be just what it takes to help determine the next best action, especially when time is of the essence because a life hangs in the balance.

The term "expert advisor" is appropriate. It conveys that the system is a "Learning, Reasoning, Inference System" that does not have sentient understanding and is therefore made available to advise and guide the actions of an expert. This is analogous to the way spreadsheets guide the results reported by accountants and chief financial officers. That being said, we also know not to put spreadsheets in the hands of toddlers. From a development practice standpoint, it is crucial to keep in mind that "expert advisor" means that the deployed system should be advising someone who is a qualified expert in the exact domain in which the "expert advisor" system was trained. Especially when a life hangs in the balance, access to the "expert advisor" system needs to be performed by those with expert qualifications in the domain because only they can reasonably be expected to use sentient understanding to interpret and follow up on the advice. In other words, the term 'expert' in 'expert advisor' should apply to the user more so than the advisor.

Now, given an enterprise workforce of those with qualified sentient understanding of their topic areas, Watson-style expert advisors are just the type of technological advancement that will help them work smarter, not harder, to meet the needs of customers and colleagues and to produce a competitive advantage for the business.

Since this is an eponymous blog, the time has come to redirect it and increase its aperture to cover a much wider range of IBM-related topics that developers will find interesting and that reflect my own broader range of pursuits and thoughts within IBM.

These days I work in the Smarter Workforce segment of IBM Collaboration Solutions, which is responsible for building out cloud-based solutions for employee talent optimization. How do you attract employees? Retain them? Provide education when they are recruited, promoted or need remediation? How do you best equip employees to share information and enable one another to achieve better customer satisfaction and better business results? How do you measure the results?

So, if you're not in this particular problem space, why should you care? Well, there is a remarkable dynamism in this problem space due to the fact that it seeks to help human beings interact more effectively and efficiently with other human beings. As a result, many of today's most interesting topics, technologies and techniques are applicable: social computing, cloud computing, mobile computing, security, big data, business analytics and algorithms, and even psychological science and cognitive computing.

Think about what it takes to give everyone a smarter edge. Think of everything that might be needed to do it, plus everything they might want to do, and everything they might want to do it with. Then, think of enabling them to do it everywhere. Now we're talking the same language.

When I started on JavaServer Pages (JSP) as a topic, I had intended it to be a blog entry. But it grew quite beyond blog size, so now that the technical work is finished, I can give you the meta-level view on using JSP with Enterprise IBM Forms.

The work I'm telling you about here is intended to make it easy for you to exploit the powerful, simplifying JSP technique within the XFDL+XForms markup of IBM Forms documents. It took some work to sort it all out, but with that done, it is easy for you to replicate what I did and gain the benefits. I wrote this wiki page on the IBM Forms product wiki to help you get set up, and the page references the developerWorks article I put together to show how to use JSP in your XFDL+XForms forms.

The first hurdle was how to get JSP to work with the IBM Forms Webform Server. It already works with the IBM Forms Viewer by just setting the JSP contentType to application/vnd.xfdl, but the Viewer is a client-side program used only in the minority of cases, to support offline/disconnected form filling. The majority of customers deploy Webform Server because it translates the XFDL+XForms into HTML and Javascript automatically so that end-users only need a web browser to fill out their enterprise IBM Forms.

It was pretty challenging to get the JSP to talk to the Webform Server Translator module, so I was pretty happy when that started to work for me. It's one of those cases of only needing a line or two of code, but it being really hard to get exactly the right line or two. As Mark Twain once said, it's like the difference between lightning and the lightning bug. Anyway, now that we know the smidge of code, it's easy for you to copy and use in your XFDL-based JSPs.

At first I thought, OK I have a good blog topic, but then I realized we weren't covering the full Forms information lifecycle. Put simply, a form is possibly prepopulated and then served, it collects data, but then it comes back and you have to do something with the data collected. So, back for more work sorting out how to receive a completed form into a JSP and use its values in JSP scriptlet code that helps prepopulate the next outbound form. This was a fair bit less challenging, as it maps very closely to how you start up the IBM Forms API in a regular Java servlet. Remember, JSP is just a convenient notation that the web application server knows how to turn into a Java servlet. JSP just makes it easier for you to focus on your special sauce application code.

Well, now that I could handle the whole Forms information lifecycle, I realized I hadn't covered the software development lifecycle. Back to the salt mines again. The problem was that JSP annotations are incompatible with XML. Although there is an alternative XML syntax for JSP, I devote a section in the article to explaining why it's a bit of a train wreck, and I focus instead on the normal JSP annotations. By representing them as XML processing instructions, we're able to maintain the XFDL and the JSP logic together using the IBM Forms Designer, and then use an XSLT to convert to actual JSP when it's time to deploy the IBM Form. This was really important to me because, quite frankly, if a new feature does not work in the Design environment for a language, then the feature essentially does not exist in the language.

Now, that's a wrap! I hope you like the article and get accelerated development benefit from it. JSP is really for building quick prototypes and demos, and also for solving simpler problems much more simply than using straight Java servlet coding. It's even a really nice complement to using Java servlet coding within a larger project. So don't delay, get ready to use JSP with XFDL today.

How would you like to be able to construct, deploy and get results from IT solutions using only your web browser?

Don't believe me? Well, how about coming to the IBM Forms wiki, where you can watch a few short videos that show you.

You'll be intrigued and want to go to the next step. One of the prominently available wiki pages is a community article that gives you a starter pack of prebuilt solutions like the ones you see in the videos. You can download any one or all of them because they're just single files that describe the forms, access control, workflow stages and other resources of each solution. You can import any of them into your own IBM Forms Experience Builder server, and then deploy them, use them, get results from them, and of course edit them to see how they work or to change them and redeploy them. All from your web browser.

Don't have an IBM Forms Experience Builder server to try it out? Well, now we've gotten to the main topic of this blog article. You can get your own free public access to an IBM Forms Experience Builder server. You can try out any of these starter pack solutions as well as build and deploy any of your own solutions.

Since you will be a builder of forms experience solutions, we will need to be able to present your solutions to you, distinguished from everyone else's solutions. So, you'll have to start by registering yourself with the system that hosts the IBM Forms Experience Builder server. The system is called Lotus Greenhouse, so click the link and then choose "Sign up" to get your account.

Once you're able to log in to Greenhouse, you'll get access to a number of software products including IBM's social business software (Connections), IBM Websphere Portal Server, and of course IBM Forms Experience Builder. However, you don't really need to log in to Greenhouse and then navigate the menus to IBM Forms Experience Builder when you can just bookmark the direct link to IBM Forms Experience Builder on Greenhouse.

Once you log in with your Greenhouse user id and password, you'll see the "Manage" solutions page, which lists all of the Forms Experience Builder (FEB) applications that you have designed. This is the page that gives you the ability to create a "New Application" or "Import" one of those starter pack applications, all at the press of a button.

So, you can try out and evaluate IBM Forms Experience Builder now and see for yourself that there really is a smarter web where you can construct valuable solutions without coding. If you are building IT solutions for your organization, you owe it to yourself to see how much more effective you'll be at satisfying your organization's IT solution demands. But even more importantly, if you're competing for IT solution services contracts, you owe it to yourself to become an IBM business partner or to expand your partnership to include IBM Forms Experience Builder. And finally, if you like to build industry-specific data management products, then you should consider becoming an IBM value-added reseller (VAR) so you can build your products more efficiently with IBM Forms Experience Builder and go to market with IBM to sell the bundle. In all these cases, you now have the access you need above so you can learn more and get started today.

Forms exist to collect data from web users involved in business processes. Are you a business partner who wants to build solutions more quickly in order to make a higher margin? Then read on!

What if you could use a web browser to design not only the user interfaces of the multiple pages of a form, but also the whole solution for which it collects data?

Now, with IBM Forms Experience Builder, you finally can.

You can define the roles of users in the business process, and you can assign users and groups to those roles. You can even set up open roles whose users are defined dynamically during the business process once the right information is collected earlier in the process. For example, only once you take in a person's name can you access an LDAP service to look up his manager and then assign that person to the manager role for an approval step.

You can define the user interface of a Form, and have an automatic database created on the server side to store database records corresponding to completed instances of that Form. You can even define multiple Forms that work together within a solution that collects data according to different record schemas.

You can define the stages of a business process workflow that uses the Form or Forms to create and update database records. Stage transitions can branch forward, backward or even stay on the same stage to update a database record that still needs more work.

You can define access control for each workflow stage and determine which Forms, pages, and UI elements are available in each stage.

You can even use the database records collected with one Form as a GUI configurable web service within the fill experience of a second Form. For example, you could have one Form of a solution that collects inventory data, and then use that data in a second Form that makes it possible to order from available inventory.

You can make the Form fill experience available within a portlet of an IBM Websphere Portal website.

You can extend the Form's web interface with your own javascript, CSS and HTML widgets. You can extend the server-side solution behavior with your own value-added web services.

You can create a solution with your web browser, you can save it to the server, you can hit Deploy in your web browser, and then your users can access the Forms of the solution from web links. If you later decide it is necessary to add to or change the solution, you can edit the solution again using your web browser and hit Deploy again. The data is retained for all the remaining form UI elements, and the database tables are altered as needed to make space to store data collected by any new form UI elements.

Via web links, users can access the list of database records collected by the solution. Only the records to which the user has access are presented. If you're the solution creator/administrator, you can get access to all the records. Whoever is given a link to view the records can also set up their own customized filters for the data, so a user can truly use the view as a business process task list, and even filter down to tasks of a particular type, from a particular person, having met or exceeded some value, etc.

Continuing with the amazing stuff you can do with the eval() function: You can use it in a user interface binding to enable your form to programmatically control what the user sees.

As a demonstration of this capability, I'll give you the pertinent parts of an XML editor form that dynamically adjusts to the XML structure: it lets you edit the content of any leaf nodes, gives you link buttons to drill deeper into element subtrees, and provides a "back" button to go to the parent of any subtree whose leaves you may be editing. It starts with an XForms repeat, like this:
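In sketch form (the XFDL table wrapper and the instance id 'editor' are just for illustration), the repeat could look like this:

<table sid="xmleditor">
   <xforms:repeat nodeset="eval(instance('editor')/repeatexpr)">
      ... <!-- form controls and drill-down triggers for the children of the currently selected node -->
   </xforms:repeat>
</table>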

The repeat expression is computed by the form and is changed by user actions that drill deeper into the XML tree or go back to parent elements. The repeat expression will end with "/*" so that the controls in the repeat will show the children of whatever node the repeat expression selects before the "/*".

For simplicity, I've put the XML data to be edited as the first element of the instance that also manages the calculation of the repeat expression, but you could do this as two separate instances instead. Here's the instance structure I used in this example:
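A rough sketch of that instance (assuming it is the first instance of the model, and with a Purchase-Order element standing in for whatever XML you want to edit):

<xforms:instance id="editor">
   <editor xmlns="">
      <Purchase-Order>
         <!-- the XML to be edited goes here; any element structure will do -->
      </Purchase-Order>
      <expr/>
      <scratchexpr/>
      <path/>
      <repeatexpr/>
   </editor>
</xforms:instance>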

The first element could be anything, but I used a "purchase order" data structure, so this form will magically morph into a purchase order editor. Further, it should now be clear why in the last blog I concentrated on data that carried its own formula calculations and data validation rules. If I replace the Purchase-Order element above with the loan calculation data below, then this same form will help calculate your monthly payment on a loan:
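A minimal stand-in for that loan data, using a simple-interest approximation just to keep the formulas plain XPath arithmetic (element names and numbers are illustrative, and the formulas are written relative to the parent element, matching the value-attribute convention from the last blog):

<Loan xmlns="">
   <principal>12000</principal>
   <annualRate>0.05</annualRate>
   <months>24</months>
   <interest value="principal * annualRate * (months div 12)"/>
   <payment value="(principal + interest) div months"/>
</Loan>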

Within the repeat, we can use different kinds of form controls to be responsive to the identified types of data and also to the issue of whether something is an input or an output based on whether it has a computed value. Here are two examples at the XFDL+XForms level:
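Two such controls might be sketched like this (I've assumed the currency annotation is a plain type="currency" attribute on the data, and the sid values and label are placeholders):

<field sid="currencyInput">
   <xforms:input ref="self::*[not(*) and @type = 'currency' and not(@value)]">
      <xforms:label>Amount</xforms:label>
   </xforms:input>
</field>

<label sid="currencyOutput">
   <xforms:output ref="self::*[not(*) and @type = 'currency' and @value]"/>
</label>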

In the predicates of the form controls, "not(*)" ensures that these form controls are only relevant if the data node is a leaf that is to be filled with character content. The "value" attribute in the data provides a calculation formula, so that has been used to distinguish when to provide an input versus an output form control. The two examples above make relevant form controls for data elements annotated with a currency type attribute. Other form controls for checkboxes and dates can be created to bind to types like booleans and dates.

By design of this particular "magic morphing" (XML editor) form, the first element in the computed XPath expression is the first element of the instance. Then we compute the full path to the element whose children will become editable by the repeat and the form controls within the repeat. This is based on the "expr" data node that will be programmatically manipulated. Because expr is initially empty in the instance data above, that means you will initially see the children of the first element of the instance, because the repeatexpr is calculated to the first element plus the initially empty "expr" plus "/*" to get the children.
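In sketch form, the two calculated binds could look like this (node names as in the instance above):

<xforms:bind nodeset="path" calculate="concat(name(../*[1]), ../expr)"/>
<xforms:bind nodeset="repeatexpr" calculate="concat(../path, '/*')"/>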

OK, so how do we adjust what the XForms repeat presents to the user? Basically, we want to either add a child element name to drill down into a subtree or we want to subtract a child element name to go up a level. First, let's cover how to add an element, i.e. add a step to the location "path". Inside the repeat, each child element that is a subtree root (has children) gets an XForms trigger in a link style button. If you activate the trigger (press the button) then you drill down into the corresponding node. Here's what that looks like:
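Sketched (the link-button styling options are omitted, and the setvalue assumes its value expression is evaluated in the trigger's context, i.e. the repeat item):

<button sid="drilldown">
   <xforms:trigger ref="self::*[*]">
      <xforms:label><xforms:output value="local-name(.)"/></xforms:label>
      <xforms:action ev:event="DOMActivate">
         <!-- append a slash plus this element's name to expr -->
         <xforms:setvalue ref="instance('editor')/expr" value="concat(instance('editor')/expr, '/', local-name())"/>
      </xforms:action>
   </xforms:trigger>
</button>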

The trigger ref binds to a node that has children, as tested by the predicate "[*]". The label shows the name of the child element whose subtree you will drill into if you activate the trigger. The action sequence simply chucks a slash plus that name onto the end of the "expr" as a new step in the location path. This adds to the "path" which adds to the "repeatexpr" which updates the XForms repeat to show the children of that subtree root.

The trigger to go back up to a parent from a child is something that would live outside of the repeat because you only need one "back" button. It's actually a bit trickier because XPath 1.0 gives you no direct way to find the last slash in order to lop off the last location step in the path. Fortunately, XPath lets you find the first occurrence of a substring, and XForms actions include a loop. So, the way I did this was to construct a new expression out of all the location steps in the old one, except the last, which was detectable by there being no more slashes. Here's what that looks like:
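A sketch of that back button (all paths are written absolutely so the evaluation context does not matter):

<button sid="back">
   <xforms:trigger>
      <xforms:label>Back</xforms:label>
      <xforms:action ev:event="DOMActivate">
         <!-- copy expr, minus its leading slash, into scratchexpr -->
         <xforms:setvalue ref="instance('editor')/scratchexpr" value="substring-after(instance('editor')/expr, '/')"/>
         <!-- clear expr so it can be rebuilt -->
         <xforms:setvalue ref="instance('editor')/expr"/>
         <!-- copy every location step except the last back onto expr -->
         <xforms:action while="contains(instance('editor')/scratchexpr, '/')">
            <xforms:setvalue ref="instance('editor')/expr" value="concat(instance('editor')/expr, '/', substring-before(instance('editor')/scratchexpr, '/'))"/>
            <xforms:setvalue ref="instance('editor')/scratchexpr" value="substring-after(instance('editor')/scratchexpr, '/')"/>
         </xforms:action>
      </xforms:action>
   </xforms:trigger>
</button>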

The first setvalue copies the "expr" less the leading slash into the "scratchexpr". Then, we clear out the "expr" so we can build it up anew from the parts of the scratchexpr. Now, we execute while "scratchexpr" still contains a slash, so the loop stops short of copying the last location step from scratchexpr to expr. Once the processing is complete, the modifications made to expr once again reverberate to "path" and then to "repeatexpr" due to the XForms binds above, and so the XForms repeat updates to show and allow editing of the content of the parent element.

And that's it! Thanks to eval() used in combination with all other pre-existing features of XForms, you can make a form that edits any XML element data structure.

There are a number of new XPath extension functions available to XForms developers in the latest release of IBM Forms, and I'd like to draw your attention to two of them, eval() and eval-in-context(), because they are wicked cool!

The function eval(expr) evaluates expr in the context of the function call and returns the result. The eval-in-context(expr, contextExpr) function does a similar thing, except it first evaluates the contextExpr and uses the result as the context for the main expression. This is desperately needed for XPath 1.0 expressions to eliminate the infestation of pesky ".." operations that typically occurs. I've used it, rather than eval(), in the samples below.

One use of these functions is to enable the powerful capability to let XML data carry sophisticated dynamic metadata, which can then be implemented and enforced with singular XForms bind elements that attach the semantics of the metadata wherever it is found in the data.

It turns out that the xsi:type attribute from XML schema is already a rudimentary version of the metadata idea we're pursuing here, so the question becomes what if you could do it for all of the juicy metadata that XForms contains, like calculated data values, data validity constraints, and so forth? Let's look at what this "decorated" data might look like for a simple expense report:
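Something along these lines (the element names and amounts are illustrative; the value and constraint attributes carry the metadata):

<expensereport xmlns="">
   <item>
      <description>Hotel</description>
      <quantity>3</quantity>
      <price>129.00</price>
      <subtotal value="quantity * price"/>
   </item>
   <item>
      <description>Meals</description>
      <quantity>3</quantity>
      <price>40.00</price>
      <subtotal value="quantity * price"/>
   </item>
   <total value="sum(item/subtotal)" constraint=". &lt; 10000"/>
</expensereport>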

It is really easy to use XPath capabilities to find all elements having a value attribute and then to bind an XForms calculate formula to those elements, and then eval-in-context() is used to determine the result according to whatever expression is given in the data, like this:
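A sketch of that singular bind:

<xforms:bind nodeset="descendant::*[@value]" calculate="eval-in-context(@value, '..')"/>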

The nodeset expression starts with descendant::* to explore all elements of the data, and then the predicate [@value] selects all elements that have a value attribute. For each such element node, a calculate formula is bound to it by the XForms bind. The eval-in-context() call uses ".." to go up a level so that the formulas in the value attributes can omit "../", e.g. so the expression can simply be "quantity * price" rather than "../quantity * ../price".
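A companion bind can attach the constraint metadata in the same way. Sketched, assuming the attribute is simply named constraint and is evaluated in the context of the element that carries it:

<xforms:bind nodeset="descendant::*[@constraint]" constraint="boolean(eval-in-context(@constraint, '.'))"/>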

On the expense report data above, this binding evaluates the constraint expression attached to the total element, and then converts the result to a boolean. Due to the constraint, the expense form data cannot be submitted to a server for processing unless the total expense is less than 10000.

In other scenarios, you may want to control the other metadata properties like relevant, required and readonly. Here's an example of data where relevance and requiredness control is needed:
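For example (illustrative element names; the dynamic expression is written relative to the parent element, matching the convention above):

<applicant xmlns=""
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xmlns:xsd="http://www.w3.org/2001/XMLSchema">
   <name required="true"/>
   <age xsi:type="xsd:integer">16</age>
   <guardian relevant="age &lt; 18" required="true">
      <guardianName/>
      <guardianPhone/>
   </guardian>
</applicant>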

The xsi:type assignment for age already works in XForms without needing a bind. The required setting for name is statically true, not dynamically changeable, so it is very handy to be able to allow either a static boolean value or an expression, like this:
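One sketch of such a bind, where boolean-from-string() picks up a static "true" and eval-in-context() handles an expression:

<xforms:bind nodeset="descendant::*[@required]"
             required="boolean-from-string(@required) or boolean(eval-in-context(@required, '..'))"/>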

Technically, the required property on the parent element is conditional on the age value, but that is an automatic feature of XForms, i.e. nodes marked required are only required if they are relevant. It should not be too surprising to see that the bind for relevance looks like this:
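Sketched, mirroring the required bind:

<xforms:bind nodeset="descendant::*[@relevant]"
             relevant="boolean-from-string(@relevant) or boolean(eval-in-context(@relevant, '..'))"/>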

Information system architects, and even vertical industry standardization or government IT standardization bodies, derive immense value from XML schema definitions that describe the data structures and data types expected in valid transactions of the information system. However, these assets are focused on defining the completed transaction. In many a presentation, I've talked to potential customers about how IBM Forms documents express far more value because they are about the human interaction that takes place during the fill experience needed to produce those completed transactions.

As a bit of an aside, it's important for the technically minded reader to be familiar with the "sell" side of this equation. It is important to be able to easily justify technology adoption decisions with business owners who need to understand how you will be able to better serve your customers, reduce development and maintenance costs, increase competitiveness, eliminate vendor lock-in, etc.

In this blog, I'd like to enumerate various benefits that you get from the standards basis of IBM Forms documents and their implementations, but I'd also like to separate the enumerations into two lists: 1) the benefits beyond XML Schema that you get just from using the XForms markup within an IBM Form, and 2) the benefits you get from the XFDL processor (XFDL is the XML vocabulary that IBM Forms provides as a presentation layer for XForms).

Firstly, an XForm is clearly a superset of XML Schema since an XForm can incorporate the XML Schema if it is available and provide its validation information set to the fill experience and the submission experience. But an XForm provides many additional benefits, including the ability to:

Express data validation constraints that are based on other XML data values entered during the fill experience.

Automatically compute data values based on other input entered during the fill experience, rather than requiring users to perform the error-prone task of calculating and inputting summative results manually.

Describe the inputs, outputs and triggering mechanisms of web services to invoke, such as to use entered XML data to obtain database query results that fill other parts of the form or to invoke server-side validation or calculation logic from a business rules engine.

Control whether a data value is readonly or whether the user can enter a value, for example based on conditions related to a business process step or an access rule.

Express user interface controls that indicate which XML data values will be available to the presentation layer for input or output.

Conditionally show or hide the user interface controls (and hence their presentation) in response to conditions, such as those that may relate to a business process step or access rule.

Provide customized help and validation error messages to help users fix data input errors when they occur.

Provide prompting label text to be associated with the presentation of each input or output control, both visually and aurally (for accessibility)

Associate a selection list with any XML data node in a way that constrains user input capability to the provided list

Define labels, help messages, validation error messages and selection lists in more than one human language within the same form, thereby ensuring citizens receive the same form logic and interaction behaviors regardless of which official language they select to request government services

Associate repeated data with a logically tabular set of user interface controls and encode the means by which logical rows of the table are added or removed in response to insertions or deletions of data

While the above benefits indicate what additional behaviors and features of XML forms can be expressed above those that can be expressed by an XML Schema alone, XForms is also more interesting for standardization of a forms repository due to what it does not express. An XForm does not rigorously bind its many behavioral benefits to a specific presentation layer of the form. The intention of this language architecture was to address multimodal requirements of forms applications, e.g. rendition on a desktop, tablet, smartphone, telephone call-in voice service or instant messaging interaction. Different presentation layer implementations can address these requirements, and such implementations can even be provided by different vendors. The XForms working group also anticipated that there would be a wide array of varied technical requirements for presentation layers, and this language architecture allows XForms to be used with fundamentally different XML presentation languages that address these disparate requirements. Examples range from ODF for flowing text with fill-in-the-blanks fields to XFDL with its high-precision contract-style layout capability.

Due to the above mentioned language architecture, XForms markup does not comprise a well-formed XML document until it is incorporated into a presentation layer XML document. XFDL is the XML vocabulary used in IBM Forms to provide a presentation layer for XForms.

Of course, in a "baseline standard" version of a form, the default presentation layer associated with the XForms markup can be minimal in nature so that the only benefit is to provide a well-formed XML document to host the XForms markup. Interestingly, once you have this from XFDL, then the result is in fact an XML document, and so it can be processed by readily available XML processing tools like XSLT. These XML tools can be used to automate creation of different versions of the baseline form that may have a richer presentation description, alternative natural language usage, or even different presentation layer markup. In addition, various consumers could use simple XML tooling to rebrand the forms.

The Extensible Forms Description Language (XFDL) is an XML vocabulary describing the presentation layer and richly interactive behaviors of modern web-based electronic forms. This XML vocabulary was first introduced to the W3C in 1998 (http://www.w3.org/TR/NOTE-XFDL) and over the years, the versions of the language have consumed XML data-processing components of the W3C XML technology stack as they have been standardized by the W3C, especially including XML Schema, XForms, and XML Signatures. XFDL is a royalty-free open format, and the specification of its current version is publicly available.

XFDL is a host XML language for the XForms standard, and so the many benefits of XForms described above are inherited by XFDL. The implementations of XFDL as a presentation layer language add the following benefits over the core XForms processing:

very high precision control over the layout and rendition of the user interface

integration of XML Signatures with both XForms and a high precision user interface

comprehensive treatment for accessibility, localization and language support

integrations to standard application server processing of form results and to run-time processing by JSR 168 and JSR 286 compliant portlets

the choice of zero-install operation within a web browser, using a server translator module

the choice of a client install to support both offline and online processing

In a larger sense, though, this is only part of the benefit derived from XFDL. Still more benefit is derived from the availability of the XFDL forms visual design environment, which gives form authors integrated access to GUI features for XForms, for XML Signatures, for schema-driven design, for web service connections, and for the XFDL language benefits included in the list. The design environment even includes a converter that helps preserve the layout of PDF Forms that are brought over into XFDL. Finally, the XFDL forms visual design environment also provides features for maintaining a collection of forms, such as SVN and other team repository plugins and management of form parts.

Industry solutions is an important area of endeavour for IBM. An industry solution is an IT asset that helps solve an industry-specific problem and is easily reconfigurable to meet specific needs of each client. A solution often helps a client to reach out to and interact with their own customers or users.

An important segment within industry solutions is called case management, which takes the view that a customer/user interaction pattern can be orchestrated by a case. The definition of a case includes data structure and data type definitions, metadata definitions, business process and user access rules, and other possible resources. Based on an initial request by a customer or user, the case management system instantiates the case definition, and the resultant case orchestrates the interaction to achieve the goal or goals implicit in the defined pattern. For example, a case management system could be used to orchestrate the means by which a customer makes and successfully completes a warranty claim for a defective product. The process would begin with collecting initial information about the defective product, about the defect, and about the purchase. The process would include determining the legitimacy of the warranty claim, providing basic support to qualify the defect and determine a course of action, and ultimately to effect a repair or replacement of the product.

A case management user interface is a collection of interactive components that collect data from a user and store it in the data structure of a case (an instance of a case definition). The presentation of the user interface is also affected by the metadata of the case. A common piece of metadata is an enumeration of the valid values that a data item may take. If such a list is available, then it would be presented in a dropdown menu or list box, and the input would be collected via list selection rather than by free-form typing in a text entry field. Other common metadata are boolean flags such as for indicating whether a data node is readonly or required to fill in a step of the case processing. The user interface components would be affected by enforcing the readonly property or providing a sensory indication of the required property. Still other metadata can define validity constraints, such as a numeric datatype or a minimum or maximum value or length. The user interface of a component associated with a data node would be affected by indicating whether the current data value is or is not valid according to the constraints. A case management asset would also typically forbid progress to the next step of the orchestrated pattern when the data associated with the current step contains invalid values.

The term “case management solution” has been used to describe a software solution that supports the design, deployment, execution and reconfiguration of case management assets. As the field of case management matures, the term “advanced case management” has emerged as a way to characterize case management assets that have advanced feature requirements, such as the requirement to collect many dozens, scores or hundreds of fields of data. Some examples of advanced case management include: home or car insurance claims, credit card charge dispute resolution, citizen-facing ombudsperson cases, contagious disease outbreak tracking, and management of complex medical or psychological treatment cases. IBM Case Manager (ICM) is IBM's advanced case management solution.

A problem arises in advanced case management solutions with respect to the expected maturity of the user interface. The typical case management solution generates a user interface presentation layer for the data of a step using a simple linear columnar approach or a column of expandable stacks of related data values. The advantage of this approach is that it most easily adapts to a reconfiguration of the case management asset in which the data structure is amended to add or remove data nodes. However, there are a number of disadvantages to this approach. For one, it provides a one-size-fits-all approach to the user interface layout in which usability substantially degrades in quality as the size of the data set grows.

As well, larger data sets tend to correlate to more advanced requirements in an overall solution that a case management asset simply cannot begin to address using only a simple user interface approach. There are many such features, including creating multipage guided interview style wizards, creating multiple print-style pages to reflect a “document of record” for the case, and of course adding the ability to digitally sign the “document of record” as a way to create a legally binding agreement or a record that can stand up to rigorous auditability requirements.

The new release of IBM Forms is the strategic IBM forms technology that now solves this problem for IBM Case Manager (ICM). An IBM Form combines an XML data structure with a template describing interaction behavior rules and a comprehensive user interface definition. The how-to for connecting an IBM Form to an ICM solution is as simple as going into the IBM Forms Designer, right-clicking on any number of XML data nodes, making them “public” and giving them public names equal to the ICM case property names they must map to, and then attaching the IBM Form into the ICM solution.

The IBM Form can then be used in the Case eForm widget anywhere in the ICM solution where the Case Data Widget would have been used. During execution of a case under the ICM solution, the case property values are automatically injected into the XML data of the IBM Form as it is rendered to the user, based on the public data mapping mentioned above. When the user completes or saves the form, the updated data values from the IBM Form are injected back to case properties so that the Form and the Case are in synch.

Oh, but the story is so much cooler than just adding high precision, multipage user interface control for the data. And it's cooler than having a “document of record” for the data that can be digitally signed. IBM Forms contains this fabulous technology for defining live interaction with XML. It's called... you guessed it... XForms. XForms manages not just XML data, but also metadata pertaining to each data node, such as the node's datatype, or whether the node is required, or whether it must be valid according to some constraint expression. XForms also allows a list of values to be associated with the input mechanism for a data node. Remember above where I said that case management assets define metadata just like this for case properties? Well, when you design a case solution that includes this metadata, and then map an XML node to a case property by assigning the public data name, the IBM Forms integration automatically injects not just the case property values, but also any lists as well as XForms binds for the metadata. During the form run-time, the XForms processor then automatically combines these XForms bind results with any metadata settings that might be defined within the form itself. In effect, there is a seamless bridge from ICM case processing to the user's interaction with the IBM Form.

Finally, this seamless bridge works in the other direction too, from the IBM Form back to the ICM case solution. In addition to synchronizing data updates from the form back to the case, the key lifecycle operations of saving or completing a form interaction are gated by a validation operation. IBM Case Manager delegates the validation operation to the IBM Form technology, which executes a validation operation based on the behavior of an XForms submission. This means that non-relevant nodes are automatically pruned from the validation, and the validation result is the sensible combination of validation rules injected from the case solution and validation rules expressed directly in the form.

Netting it out, IBM Forms is a first-class citizen of IBM Case Manager solutions, and the case data and metadata of an IBM Case Manager solution are handled as first-class citizens of the user experience provided by an IBM Form.

Recently, I was experimenting with one of the features planned for the next version of XForms. The feature is the iterate attribute for XForms actions, which will perform a for-each loop operation based on a nodeset obtained from the xpath expression in the iterate attribute value. XForms 1.1 already has a while loop, but iterate makes many data processing loops easier to write and also more performant (subject of a future blog). There were lots of iteration use cases to choose from, but I decided to experiment with sorting because it is a well-known benchmark algorithm.

Before we go any further, let me say up front that XForms action scripting is intended for very lightweight data manipulation, like adding or deleting a data node corresponding to a table row or copying data results to or from the SOAP envelopes of a web service. By the time you get to nested iterations like those needed for sorting, you should be considering alternatives expressed in full-blown imperative languages available in the information system within which the form is being used. For example, in the case of sorting, it is a better idea to request sorting in the database query whose results are returned from a web service into your form so that your form logic does not even have to do the sorting.

So, with the disclaimer out of the way, let's abuse the technology a bit to get a better sense of what is feasible in those customer-needs-it-yesterday circumstances. It turns out that XForms 1.1 does allow full nodeset processing in the insert action's origin attribute and the delete action's nodeset attribute. Without even needing the new iterate attribute, this is just enough iteration capability to perform efficient sorting -- so there are some kinds of iterations that can be done now without the iterate attribute.

We're going to do a divide-and-conquer "partition" sort that I personally created as a university freshman after my 1st semester instructor told our class that linked lists could only be sorted slowly. At the time, the usual computer languages only allowed static allocation for arrays, and even though I didn't know what a "quick sort" was, I had seen the light of dynamic allocation, and I was never going back! I later learned how great a merge sort is on a linked list, but the effort of turning an array quicksort into a linked list partition sort comes in handy now because a merge sort cannot be efficiently expressed in XForms until the iterate attribute is added.

The way a quicksort works on an array (or subarray) is that you pick a random element to be the 'pivot' value. Then you run two index variables at the same time, one from the start of the array upward and the other going from the end of the array downward. The 'up' index is advanced until it finds a value greater than the pivot, the 'down' index is decremented until a value less than the pivot is found, and then the values at the 'up' and 'down' locations are swapped. This keeps happening until 'up' and 'down' meet somewhere near the middle of the array. At this point you've partitioned the array into a subarray of values less than the pivot value and a subarray of values greater than the pivot value. The quicksort is then invoked recursively to sort both subarrays.

The main challenge with this approach is the 'down' index, which is a reverse iteration. In a singly linked list, you can only go forward. XForms insert and delete actions have a similar limitation: they can only identify a nodeset of nodes to insert or delete, but not really a direction of iteration. But the important bit is what the quicksort is doing, not how it is done. Think of the list content as being completely messy, and each partitioning stage must make it somewhat less messy by dividing the content into a partition of lesser elements and a partition of greater elements. Then, the next partitioning stage is invoked recursively to do a better job of cleaning up the mess within each partition.

Let's explore this concept by sorting a list of elements, such as sorting a list of <person> elements by a <lastname> child element. We begin by copying the list into an initial partition element of a temporary instance called 'sortdata', like this:
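In sketch form (the main data instance id 'data' is an assumption; sortdata and sortpivot are the temporary instances used below):

<!-- in the model: the temporary instances -->
<xforms:instance id="sortdata">
   <sortdata xmlns=""><partition/></sortdata>
</xforms:instance>
<xforms:instance id="sortpivot">
   <sortpivot xmlns=""/>
</xforms:instance>

<!-- in the sort action: copy the person elements into the initial partition -->
<xforms:insert context="instance('sortdata')/partition" origin="instance('data')/person"/>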

Next, we initialize the random number generator so we can randomly select pivot values for all the partitioning stages:

4) <xforms:setvalue ref="instance('sortpivot')" value="random(true)"/>

Next, we start up a simple while loop that continues to process partitions until none are left.

5) <xforms:action while="instance('sortdata')/partition">

Within the loop, we grab the last partition from the sort data and determine whether it is non-trivial or trivial (only 1 or 2 elements). A non-trivial partition is subjected to further divide-and-conquer processing.
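A sketch of that step (the 'prototypes' instance, assumed to hold an empty partition element, supplies the prototype for the insert):

<xforms:action if="count(instance('sortdata')/*[last()]/*) &gt; 2">
   <!-- 5.1.1: push a new, empty partition just before the last one -->
   <xforms:insert nodeset="instance('sortdata')/*" at="last()" position="before" origin="instance('prototypes')/partition"/>
   ... <!-- pivot selection and the partition move, sketched below -->
</xforms:action>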

Step 5 and step 5.1.1 are more interesting than they seem at first. The list of partition elements in the sortdata actually implements the recursion stack, and we just pushed a new element into that stack at the second-to-last position. Because we have an explicit stack, we only need a loop in step 5 to implement recursion.

The next thing we do here is grab a random last name to serve as a pivot value for the partitioning. The first setvalue just picks a random location, and then the second step uses the location to get the value. Notice also that I use * rather than partition before the [last()] predicate because the sort data only contains partition elements, so there is no point in doing a name test for partition.
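Sketched as two setvalue actions (sortpivot first holds the random position, then is overwritten with the lastname found there):

<xforms:setvalue ref="instance('sortpivot')" value="floor(random() * count(instance('sortdata')/*[last()]/*)) + 1"/>
<xforms:setvalue ref="instance('sortpivot')" value="instance('sortdata')/*[last()]/*[number(instance('sortpivot'))]/lastname"/>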

Now the magical part happens. All elements in the last partition whose key element (lastname) is less than or equal to the pivot value are moved to the newly created second-to-last partition. By combining the nodeset processing capability of XForms insert and delete actions with the predicate-based node selection capability of XPath, the matching nodes can be selected and moved using two single XForms actions, i.e. without using an XForms while loop.
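Sketched as one insert plus one delete (compare() is the XForms 1.1 lexicographic string comparison function):

<xforms:insert context="instance('sortdata')/*[last() - 1]"
               origin="instance('sortdata')/*[last()]/*[compare(lastname, instance('sortpivot')) &lt;= 0]"/>
<xforms:delete nodeset="instance('sortdata')/*[last()]/*[compare(lastname, instance('sortpivot')) &lt;= 0]"/>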

If the last partition is now empty due to the move operation in step 5.1.3, then the new second-to-last partition received all its elements. If all the moved elements are equal to the pivot value, then we can output them back into the original data list and then remove the last two partitions. Note that the insert is configured to prepend the elements into the data list, and we're copying them from the last non-empty partition, which has the elements with the greatest key value.

Now, we've finished with the non-trivial partition handler, and we turn our attention to processing a trivial partition containing at most 2 elements. The content of the partition is moved to the original data list and the partition is removed. Again, note that we're processing the last partition, which has the greatest key values, and the insert prepends to the data list, so the sorted data list starts with the greatest values and grows as lesser and lesser values are prepended over time as all the partitions are processed.

As a final note on all this algorithmic fun, the question arises whether this sort achieves optimal O(N log N) performance. The answer is no, not quite, due to hidden costs of data instance management and data node selection. However, the sort will be much faster than a "simple" sort because it does perform only O(N) XForms actions.

I've recently realized that there are a few milestones to celebrate all at once here: This blog turned 5, my W3C XForms group age is 10, and this is the 100th exciting episode of the IBM Forms dW blog. I took a look back at what you, the readers, seem to like best, and those 2000-3000 hit entries are at the intersection of strong technical content and the confluence of open standards. How serendipitous!

The power and value of the XFDL markup language underlying IBM Forms comes from the way it brings together all the features of open standards needed to design sophisticated solutions, including XML, XML Schema, XForms, XML Signatures, and XPath. In this entry, I'd like to continue the schema driven design story by talking to you about the information architect's half of the equation. Specifically, we'll cover the XML Schema constructs that the IBM Forms Designer converts into various user interface controls like XFDL popups skinning XForms select1 elements and XFDL tables skinning XForms repeat elements. If you haven't yet watched the short demo videos mentioned in the prior blog entry, take a moment to do so now because I'd like for us to dig into parts of the "Medical Preapproval" schema used in that demo. The top-level element definitions for the schema look like this:
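In rough outline (the health prefix comes up again below, the type names are illustrative, and the appinfo label vocabulary is shown schematically rather than in the exact form the Designer expects):

<xsd:element name="medicalPreapproval">
   <xsd:complexType>
      <xsd:sequence>
         <xsd:element name="patient" type="health:patientType"/>
         <xsd:element name="contactInfo" type="health:contactInfoType">
            <xsd:annotation>
               <xsd:appinfo>
                  <label>Contact Information:</label>
               </xsd:appinfo>
            </xsd:annotation>
         </xsd:element>
         <xsd:element name="details" type="health:detailsType"/>
      </xsd:sequence>
   </xsd:complexType>
</xsd:element>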

In this sample, the data needed for a medical insurance preapproval consists of some basic patient data like name and date of birth, contact information like email and address, and details of the medical procedures required. When the IBM Forms Designer is provided with this schema, it can generate XML data conformant to the schema in an XForms instance element, and then the form author can drag and drop the XML elements in the instance view onto the Design canvas. For example, if the form author drags and drops the "patient" element, then the IBM Forms Designer creates an XFDL pane that skins an XForms group element. Within the group, various XFDL items like text input fields and calendar pickers are created and mapped to the child elements of the "patient" element so that the textual content can be collected by the form. So you can get an idea of how to write XML schema for structured content, here's a part of the patientType definition:
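A sketch of that portion (child element names other than those discussed below are illustrative):

<xsd:complexType name="patientType">
   <xsd:sequence>
      <xsd:element name="firstName" type="xsd:string"/>
      <xsd:element name="lastName" type="xsd:string"/>
      <xsd:element name="ssn" type="health:SSN"/>
      <xsd:element name="dateOfBirth" type="xsd:date"/>
      <xsd:element name="sex" type="health:genderType"/>
   </xsd:sequence>
</xsd:complexType>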

At design time, text entry fields are created for these data items, and any specialized schema rules, such as the regex pattern definition for the SSN, are applied at XForms run-time. In case you're curious, here's how you'd set up a regex pattern, in this case a very simple one that takes 9 digits but also allows an empty string:
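A sketch of such a type:

<xsd:simpleType name="SSN">
   <xsd:restriction base="xsd:string">
      <xsd:pattern value="([0-9]{9})?"/>
   </xsd:restriction>
</xsd:simpleType>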

For data types like the xsd:date (above) and xsd:boolean, you will get an alternative form control, such as a calendar picker or a checkbox. The IBM Forms Designer will even generate a calendar picker if you have a type derived from xsd:date. For example, you may want your schema to allow a date or an empty value. So, if you change the type of the dateOfBirth element above to health:dateOrEmpty, then you can use the declaration below to achieve that effect:
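Sketched as a union of xsd:date with an empty-string type (the helper type name is illustrative):

<xsd:simpleType name="dateOrEmpty">
   <xsd:union memberTypes="xsd:date health:emptyString"/>
</xsd:simpleType>

<xsd:simpleType name="emptyString">
   <xsd:restriction base="xsd:string">
      <xsd:enumeration value=""/>
   </xsd:restriction>
</xsd:simpleType>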

By default, the label text generated for each element is based on the element name. Camel-casing is used to determine when spaces should be added to the label. For example, the default label generated for the "patient" group is "Patient", and the default label for the "lastName" element is "Last Name". However, the information architect is provided greater control over the labels in the schema. In the example above, an appinfo annotation is used to define the desired label of "Contact Information:" for the contactInfo element. The same appinfo annotation can be used to control the labels of the UI controls bound to leaf (text content) nodes. For example, below is a definition that could go into the patientType definition, but it provides an alternative label for an element that has a name that is perhaps medically accurate but perhaps less desirable:
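For example (again, the appinfo label vocabulary is shown schematically):

<xsd:element name="sex" type="health:genderType">
   <xsd:annotation>
      <xsd:appinfo>
         <label>Gender</label>
      </xsd:appinfo>
   </xsd:annotation>
</xsd:element>

<xsd:simpleType name="genderType">
   <xsd:restriction base="xsd:string">
      <xsd:enumeration value="Male"/>
      <xsd:enumeration value="Female"/>
   </xsd:restriction>
</xsd:simpleType>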

An enumeration like this in the schema causes an XFDL "popup" item (a dropdown menu) to be generated, along with an xforms:select1 element. The dropdown menu entries are Male and Female, and the user's choice at run-time places the word Male or Female in the "sex" element. But maybe you'd prefer to have the data be based on codes like M and F, but still show menu entry labels like Male and Female. Again, the appinfo annotation comes to the rescue:
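Sketched, with a label annotation on each enumeration value:

<xsd:simpleType name="genderType">
   <xsd:restriction base="xsd:string">
      <xsd:enumeration value="M">
         <xsd:annotation>
            <xsd:appinfo><label>Male</label></xsd:appinfo>
         </xsd:annotation>
      </xsd:enumeration>
      <xsd:enumeration value="F">
         <xsd:annotation>
            <xsd:appinfo><label>Female</label></xsd:appinfo>
         </xsd:annotation>
      </xsd:enumeration>
   </xsd:restriction>
</xsd:simpleType>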

I've used Gender in these examples for brevity, but you can use enumerations to describe other lists, such as for States and Provinces. The IBM Forms Designer will generate an XFDL popup item by default, but the Designer also lets you right-click convert the popup to other user interface items that can legally skin an xforms:select1, such as a radiogroup, checkgroup or list box.

Also, sometimes what you need is a combobox, so that the user can access a dropdown menu but also have the option to type an open-ended answer. For example, the open-ended response for a gender might provide medically relevant information. The information architect can control this in the XML schema by creating an open-ended enumeration, i.e. an enumeration unioned with a string:
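Sketched:

<xsd:simpleType name="genderOrOther">
   <xsd:union memberTypes="health:genderType xsd:string"/>
</xsd:simpleType>

And for the repeating procedures, the relevant portion of the schema might be sketched like this (the child element names are illustrative):

<xsd:element name="recommendedProcedures">
   <xsd:complexType>
      <xsd:sequence>
         <xsd:element name="procedure" maxOccurs="unbounded">
            <xsd:complexType>
               <xsd:sequence>
                  <xsd:element name="code" type="xsd:string">
                     <xsd:annotation>
                        <xsd:appinfo><label>Procedure Code</label></xsd:appinfo>
                     </xsd:annotation>
                  </xsd:element>
                  <xsd:element name="description" type="xsd:string"/>
                  <xsd:element name="estimatedCost" type="xsd:decimal"/>
               </xsd:sequence>
            </xsd:complexType>
         </xsd:element>
      </xsd:sequence>
   </xsd:complexType>
</xsd:element>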

Within the recommendedProcedures element, the schema indicates that a "procedure" element can appear more than once (maxOccurs is unbounded in this example, but any value greater than 1 will also work). The IBM Forms Designer responds with the XFDL table editor. It reads the child elements in the sequence to determine the default set of table columns. Notice the use of the appinfo annotation to control the column header label text. When the form author drags and drops the details element, XFDL fields skinning xforms:input elements are created for elements like reason and total, but the table editor also produces an XFDL table and xforms:repeat whose nodeset attribute contains the XPath needed to bind to however many procedure elements are present in the data at form run-time. The table editor also generates XFDL buttons containing xforms:trigger elements that provide the "add row" and "delete row" capabilities using xforms:insert, xforms:delete and xforms:setfocus actions.

Now hopefully you have a more complete picture of how IBM Forms concentrates the combined value of all these standards-- XML schema, XPath, XForms, XML-- into one solution creation machine. Well, that may have been a tad long, but we're celebrating three milestones here :-) Besides, the length of this entry is a simple reflection of how much cool stuff you can now exploit in your XML Schemas to help your form authors create the forms that feed the right XML to your backend systems.

One of the New IBM Forms 4.0 Demonstrations has just won the top-rated video award at Lotusphere. This is a fantastic victory for the Websphere Portal segment of Lotus, which focuses on products like IBM Forms that create and provide exceptional web experiences. It is also a victory for the W3C XForms standard since the Wizard Creator featured in the video is, as far as I know, the first point-and-click design experience for an XForms switch. This feature exemplifies the principle that declarative markup languages result in more powerful application design environments because they express what the author wants. With imperative languages, a design environment must either operate at a much lower level (less powerful) or do some wicked reverse engineering/pattern matching to discern what the author wanted from how he did it.

So, XForms has definitely been my friend in helping to create an award-winning feature in IBM Forms. Wanna meet my newest friend? Here we are, just me and Watson, celebrating our victories at the closing session of Lotusphere 2011:

To put it in Watson's terms:

Category: Lotusphere Best Demo Video Winners

Answer: An IBM Form that uses XForms to express Wizard interfaces for forms.

A major new release of our forms software, version 4.0, is now only a few weeks from shipping. And as of this release, the product line will be known as IBM Forms! This is an incredibly important indicator of the strategic value IBM sees in the Forms business as a key component in building a Smarter Planet. The feature set coming in this new major release combines the best of Web 2.0 client application behaviors and design experience with the traditional strength of interactive XML data collection for which the prior releases of our product line are well known. IBM Forms documents are interactive web application instances that have many traditional capabilities that we have been building into them since 1993.

They can always be serialized to provide a user with their own saved copy of their web application experience, to archive, to email, or to pass to the next step of a business process or workflow. They can be reinstantiated at any time to continue the interactive fill experience.

They can be instantiated directly into a web browser without using a plugin, and they can also be instantiated in a client application for an offline fill experience, dramatically simplifying IT maintenance and platform support.

They provide a multipage high precision visual layout with comprehensive accessibility support and localization.

They operate over XML payloads that directly conform to the needs of back-end business functions, enabling straight-through XML processing systems.

They enable rich web application interaction and behaviors based on the W3C XForms 1.1 standard.

They allow users to attach supporting documents into the form in support of their business function, including spreadsheets, word processing files, images, videos, etc.

They secure the business function implemented by the form with a comprehensive interactive document digital signature system that includes sectional signing, multiple signatures, and overlapping signatures, all of which protect not just the form data, but the form behaviors, the form appearance and the attached documents as well.

Now let's add to that what's coming in the new release:

The ability to drive customer satisfaction and employee productivity through more efficient, compelling interfaces that can include

In "enterprise" service-oriented architectures (SOAs), XML is a clearly entrenched format for server-to-server communications. But there's a movement afoot to define a "Web SOA" as a separate entity that describes server-to-client communications... based on JSON as the format.

The rationale is that JSON parses "a hundred times faster" in a web browser than does XML. This is like the tail wagging the dog. It is a myopic, accept-the-status-quo approach to web application computing, and it just won't do.

First of all, the difference between parsing different serializations is negligible (when done right), and nodes are nodes are nodes, so if there is a 100-fold difference, it is because one parsing method is direct and the other is doing something phenomenally suboptimal and probably easy to fix. One way to solve this is to alter all the system architectures on the web and introduce the complexity of a data format transformation to cross the illusory chasm between a "Web SOA" and the "Enterprise SOA". Another idea would be to put the pressure where it rightly belongs, on web browser makers to fix the feature of efficient XML processing.

Particularly as vertical industry standardization continues, such as is happening in both insurance and healthcare, this latter solution is really the only viable one in the long term. XForms technologies like Lotus Forms and Ubiquity XForms wrap a logical client application around the exact XML that is needed by the server side because this is the best way to minimize system development and maintenance costs. Once there is a defined XML data model, there should not be some other second data model introduced into web applications for some reason as vapid as poor implementation of the required feature.

This conversion is not a trifling matter either; have you seen how ugly the JSON is that is actually capable of preserving XML fidelity (i.e. supporting round-tripping)? Trying to build a reasonable client-side around that JSON is not nearly such a pretty sight as when the data representational limitations of JSON are accepted in an application. If the full representational power of XML schema is used to define an application data model, then we need to be able to carry that out to the client without significant alteration so that we can make references to the data transparently, not through the lens of some horrendous transformation. And make no mistake, XML schemas for vertical industry standards are full-featured indeed.
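To make the point concrete, here is what one common round-tripping convention (BadgerFish-style "@" and "$" keys) does to even a tiny XML fragment; namespaces, mixed content and comments make it uglier still:

<order id="7"><item qty="2">Widget</item><item qty="1">Gadget</item></order>

{ "order": { "@id": "7", "item": [ { "@qty": "2", "$": "Widget" }, { "@qty": "1", "$": "Gadget" } ] } }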

Finally, I would be interested in hearing about any research into whether the overall throughput of a web application is increased or decreased by the introduction of JSON down to the client. It's nice that parsing gets a hundred times faster on the client, but when your server is trying to handle more than a hundred concurrent users, and the server has to translate from XML to JSON (and back again), then it sure seems like the use of JSON is pushing processing burden in the wrong direction.

A customer asked me recently how they could use XForms constructs to create a form that dynamically populates a table with available products that can be ordered based on selection of a product provider. In this case, the customer also wanted to have the product order list be editable once provided, allowing the user to delete rows or even to add more rows to the product order table initially provided for the selected store. Seemed like another good example for the blog.

So, suppose you have a main data instance for a form that looks something like this:
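For example (the instance id and element names are illustrative):

<xforms:instance id="main">
   <order xmlns="">
      <storeID/>
      <products>
         <product>
            <name/>
            <code/>
            <cost>0</cost>
            <quantity>0</quantity>
         </product>
      </products>
   </order>
</xforms:instance>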

Each product has a name, code, and cost, and the end-user will indicate the quantity of the product they desire. The above data corresponds to showing an empty table until the user chooses a "storeID". Here is the XForms user interface markup for showing a four column table having as many rows as there are "product" elements in the data.
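Sketched in XFDL+XForms (the sid values and label are placeholders; the repeat id is used later by index() and setfocus):

<table sid="productTable">
   <xforms:repeat id="productRepeat" nodeset="instance('main')/products/product">
      <label sid="nameCol"><xforms:output ref="name"/></label>
      <label sid="codeCol"><xforms:output ref="code"/></label>
      <label sid="costCol"><xforms:output ref="cost"/></label>
      <field sid="qtyCol">
         <xforms:input ref="quantity">
            <xforms:label>Quantity</xforms:label>
         </xforms:input>
      </field>
   </xforms:repeat>
</table>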

Initially, the table will have just one row of four user interface elements containing essentially empty or zero values. However, now let's hook up something that allows us to pick a store ID so we can fill the table with an initial order of products. Now, it would be reasonable in a full form application to obtain the list of stores from a web service and then get the starting list of products for the selected store from another web service. Getting data from web services is an important but orthogonal point, so in this mock-up, I'm going to shorten all of that down to just having the data available in a format that looks like this:
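Something like this (abbreviated; the product details are illustrative):

<xforms:instance id="stores">
   <stores xmlns="">
      <store id="A">
         <product><name>Widgets</name><code>W-1</code><cost>1.00</cost><quantity>0</quantity></product>
         <product><name>Gadgets</name><code>G-1</code><cost>2.50</cost><quantity>0</quantity></product>
         <product><name>Trinkets</name><code>T-1</code><cost>0.75</cost><quantity>0</quantity></product>
      </store>
      <store id="D">
         <product><name>Lockets</name><code>L-1</code><cost>5.00</cost><quantity>0</quantity></product>
         <!-- ...and likewise Pockets, Rockets, Sprockets and Sockets... -->
      </store>
   </stores>
</xforms:instance>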

The user is provided the ability to select a store using the "select1" control, and the list of stores can be easily picked up from the data using an "itemset". Once the user makes a choice, an "xforms-value-changed" event on the select1 could be used to run a web service to get the product list, but here we'll just mock that up with an "insert" because the data is already available:
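Sketched (an XFDL popup item skinning the select1; the event handler does the delete-then-insert described below):

<popup sid="storePicker">
   <xforms:select1 ref="instance('main')/storeID">
      <xforms:label>Store</xforms:label>
      <xforms:itemset nodeset="instance('stores')/store">
         <xforms:label ref="@id"/>
         <xforms:value ref="@id"/>
      </xforms:itemset>
      <xforms:action ev:event="xforms-value-changed">
         <xforms:delete nodeset="instance('main')/products/product"/>
         <xforms:insert context="instance('main')/products"
                        origin="instance('stores')/store[@id = instance('main')/storeID]/product"/>
      </xforms:action>
   </xforms:select1>
</popup>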

The "ref" on the select1 tells where to store the resulting store selection. The "nodeset" on the itemset tells where to get the list of stores from. The "ref" attribute on the xforms:label in the itemset tells what to show for each item in the list of choices, and the "ref" xforms:value tells what to store in the data ("storeID" due to the ref on select1) when a particular list choice is selected.

The "xforms-value-changed" event handler recognizes when a selection has been made, since that results in a value change on the "storeID" data node. The delete action gets rid of any preceding list, and then the insert action copies the list of products for the selected store into the main data. In particular notice that the XPath predicate in the origin attribute selects a store element to copy based on the store element's ID attribute matching the selected store identity placed in the storeID element by the value change behavior of the select1.

Once this insert occurs, the xforms:repeat is automatically responsive to the change of the data. It generates a four column row of user interface controls for each of the inserted product elements. For example, if the user picks store A, then they get three rows for Widgets, Gadgets and Trinkets. If they then pick store D, the form automatically adjusts to five rows for Lockets, Pockets, Rockets, Sprockets and Sockets.

Once the table content has been set with the product list for a particular store, the user may choose to add or delete rows from the table. Here is an additional instance that would be used to store the data prototype for a product:
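Sketched:

<xforms:instance id="prototype">
   <prototype xmlns="">
      <product><name/><code/><cost>0</cost><quantity>0</quantity></product>
   </prototype>
</xforms:instance>

An "Add Row" button can then insert a copy of that prototype, roughly as follows (the trigger markup and ids are illustrative):

<button sid="addRow">
   <xforms:trigger>
      <xforms:label>Add Row</xforms:label>
      <xforms:action ev:event="DOMActivate">
         <xforms:insert context="instance('main')/products" nodeset="product"
                        at="index('productRepeat')" position="after"
                        origin="instance('prototype')/product"/>
         <xforms:setfocus control="productRepeat"/>
      </xforms:action>
   </xforms:trigger>
</button>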

This simply inserts a product prototype, obtained via the origin attribute, into the location defined by the context and nodeset attributes at a position corresponding to the row of the table that currently has the input focus. When the data is inserted, the repeat table automatically generates another four-column row of user interface elements to allow the user to interact with the new data.

Similarly, a button to delete a row from the repeat table would trigger the deletion using the following XForms markup:
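Sketched (the if attribute implements the empty-list check described below):

<button sid="deleteRow">
   <xforms:trigger>
      <xforms:label>Delete Row</xforms:label>
      <xforms:action ev:event="DOMActivate">
         <xforms:delete nodeset="instance('main')/products/product" at="index('productRepeat')"/>
         <xforms:insert if="not(instance('main')/products/product)"
                        context="instance('main')/products"
                        origin="instance('prototype')/product"/>
         <xforms:setfocus control="productRepeat"/>
      </xforms:action>
   </xforms:trigger>
</button>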

This simply deletes the product data element corresponding to the row of the table that currently contains the input focus. The row of user interface elements that presented this data is automatically deleted. As the next step of the action script, if the product list data becomes empty due to the prior delete, a new empty product prototype is inserted. The repeat table then presents one row of interface elements, so this extra insert ensures the user is never left with an unsightly empty table.

Last but not least, the action scripts of both of the triggers above end with an xforms:setfocus action. This is because pressing a button, be it to add or delete an item from a table, transfers focus to the button. That's just how the web works. But the user's focus is not really on the buttons; those are just tools. The user's focus is on changing the table, so it is a better user experience to push the focus back to the repeat table.

As an example of the powerful data-driven dynamism available in Lotus Forms due to features of XForms, I'd like to take you through a brief conceptual tour on the focused example of creating a Lotus Form template for a Questionnaire or Survey. This template is able to handle not just any number of questions and any amount of question text, but also any kind of answer type. And all of this would be controlled by the data so that the actual design of the Lotus Form template is the same.

The power of being purely data-driven should not be glossed over. You can easily have web application servlet code that obtains the questionnaire template and then prepopulates it with specific questionnaire data so that the client side receives a specific questionnaire selected in a previous step of the web application. But, XForms-based Lotus Forms also have that AJAX property of being responsive during run-time to new data obtained by a form via web services or other http submissions. So, you could even have a Lotus Form that obtains and adds new questions on the fly in response to answers provided to initial questions.

This post will focus on the main repeating template that provides the dynamic presentation layer for each question of a questionnaire or survey. As this is an example of a purely data-driven questionnaire, let's start by looking at a sample data format. Suppose you have a survey consisting of any number of items, each of which can contain a question text, an indication of the type of question being asked, a place for an answer, and optionally some possible choices for those answers. Something like this:
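For example (the question wording, the type keyword for the closed selection, and the choice element name are illustrative):

<survey xmlns="">
   <item>
      <question type="yesno">Have you used the product before?</question>
      <answer/>
   </item>
   <item>
      <question type="likertscale">The product is easy to use.</question>
      <answer/>
   </item>
   <item>
      <question type="closedlist">Which edition do you use?</question>
      <answer/>
      <choice>Standard</choice>
      <choice>Enterprise</choice>
   </item>
</survey>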

In the XFDL presentational language that Lotus Forms combines with the XForms data processing layer, every XForms user interface element has a container XFDL element for presentation. The survey format consists of a number of <item> elements, so an XFDL <table> containing an <xforms:repeat> is the correct top-level presentation element:

<table sid="survey">
   <xforms:repeat nodeset="/survey/item">
      ... <!-- UI for showing one item of data -->
   </xforms:repeat>
   ... <!-- More XFDL options for styling the whole table -->
</table>

The table has a scope identifier (sid) attribute that allows the table to be programmatically referenced, but we won't be using that feature in this example. The table can also have XFDL options outside of the <xforms:repeat> to control presentational aspects like borders and background colors, and we aren't focusing on that either.

The <xforms:repeat> has an attribute called "nodeset" which uses an XPath expression to make a reference to however many <item> elements are in the <survey>. This is an automatic or "declarative" loop construct. For each <item> node in the data, no matter how many there are, the template content of the <xforms:repeat> is generated to present that <item> to the user. Even if new <item> elements are added at run-time, e.g. by a web service or an <xforms:insert> action, the XFDL table in the Lotus Form will dynamically grow to present the new <item> elements. And even if some <item> elements are removed from the data, e.g. by an <xforms:delete> action associated with an XFDL <button> by an <xforms:trigger>, the XFDL table will dynamically and automatically remove the corresponding user interface elements that were presenting those removed <item> elements.

So, the magic really happens in the template inside the <xforms:repeat>. In Lotus Forms, you can put any and all kinds of XFDL items in the <xforms:repeat>, including more XFDL table items. In this example, we will be showing a few variations that present different kinds of user interface controls for collecting a few different kinds of answers to questions.

First off, though, presenting the actual question text for an <item> is a simple matter of using an XFDL label item with an <xforms:output>, like this:
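Sketched:

<label sid="questionText">
   <xforms:output ref="question"/>
</label>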

For each survey <item>, an XFDL <label> item is generated, and it binds to the <question> child element of that associated <item> using the "ref" attribute. The XFDL label item presents the text of the bound <question> node, and other XFDL options can be used to provide styling such as the block layout flow as well as alternative font color, background color, font selection and so forth.

More XFDL items can be added to the <xforms:repeat> to collect the answer for the given question. In many cases of XFDL tables, each XFDL item within the <xforms:repeat> template is actually presented to the user. An example would be using each XFDL item in the <xforms:repeat> to represent one column of a purchase order table. However, it is not necessary to show all of the XFDL items within the <xforms:repeat> template. In fact, XForms user interface controls have a selective binding feature that XFDL items support, since the XFDL items are wrappers for the XForms user interface elements.

The selective binding feature of XForms will be used to help easily choose one XFDL item from among many to collect the user's answer to the question. Each question can have a different type of answer, so each "row" of the table can make a different choice of user interface control used to collect the answer. The selective binding feature uses an XPath predicate to decide whether or not the XForms user interface element binds to a node of data or not, and the control is invisible if it is not bound to a data node.

In the example survey data above, the first <item> contains a <question> whose type attribute indicates it is a "yes/no" question. Inside the <xforms:repeat> we can create a checkbox item that can collect a (schema valid boolean) true/false answer, as follows:
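Sketched (this assumes the answer node is treated as a boolean, e.g. via an xsi:type or bind; the sid and label are placeholders):

<check sid="yesnoAnswer">
   <xforms:input ref="answer[../question/@type = 'yesno']">
      <xforms:label>Yes</xforms:label>
   </xforms:input>
</check>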

The above checkbox widget only binds to <answer>, and therefore is only visible, if the corresponding question type is 'yesno'. Otherwise, the XPath in the ref attribute of the <xforms:input> does not select any nodes, so the XFDL <check> item is not visible.

The second <item> of sample data above has a type of 'likertscale', so we would like to show a 5-point radio button group rather than a checkbox. As explained above, the check box on the second row of the survey table automatically hides itself due to selective binding, so all we have to do is add an XFDL <radiogroup> item to the <xforms:repeat> to provide the interface for collecting the 'likertscale' type of answer, as follows:
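Here is a hedged sketch of what that might look like, using static <xforms:item> choices for the five points of the scale (the sid, labels and values are just placeholders):

    <radiogroup sid="LIKERT_ANSWER">
      <xforms:select1 ref="answer[../question/@type = 'likertscale']" appearance="full">
        <xforms:label>Your rating</xforms:label>
        <xforms:item>
          <xforms:label>1 - Strongly disagree</xforms:label>
          <xforms:value>1</xforms:value>
        </xforms:item>
        <!-- items for 2, 3 and 4 omitted for brevity -->
        <xforms:item>
          <xforms:label>5 - Strongly agree</xforms:label>
          <xforms:value>5</xforms:value>
        </xforms:item>
      </xforms:select1>
    </radiogroup>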

The third survey <item> in the sample data above provides a closed selection of choices. That could be styled using a pair of radio buttons, a pair of mutually exclusive checkboxes, a list box, or a popup control that provides a simple dropdown list. The answer types in the survey format could be made to distinguish these possibilities using more keywords, but for this example we'll just assume that a <popup> control is the desired presentation for a closed selection. The XFDL markup below shows how this can be done, and it is also interesting because it shows that the data can dynamically control the choices, rather than having only static choices as shown in the <radiogroup> above.
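As a sketch (the 'selectone' type keyword and the <choice> child elements of the <item> are assumptions for illustration), an <xforms:itemset> can build the dropdown list from the data:

    <popup sid="CHOICE_ANSWER">
      <xforms:select1 ref="answer[../question/@type = 'selectone']">
        <xforms:label>Your choice</xforms:label>
        <!-- the choices come from the data, not from static markup -->
        <xforms:itemset nodeset="../choice">
          <xforms:label ref="."/>
          <xforms:value ref="."/>
        </xforms:itemset>
      </xforms:select1>
    </popup>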

It seems useful, now, to round out this blog post by presenting a few more examples for other common types of input, such as single-line strings, multiline text, and dates. Here's what the data would look like:
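The element names follow the survey format described above, and the type keywords ('text', 'multiline', 'date') are just illustrative choices for this sketch:

    <item>
      <question type="text">What is your name?</question>
      <answer/>
    </item>
    <item>
      <question type="multiline">Any additional comments?</question>
      <answer/>
    </item>
    <item>
      <question type="date">When should we follow up?</question>
      <answer/>
    </item>

Each of these can be presented with the same selective binding pattern shown above, e.g. an XFDL <field> item whose <xforms:input> (or <xforms:textarea> for the multiline case) uses an XPath predicate on the type attribute.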

So, hopefully you now have the idea that a completely dynamic and completely data-driven survey or questionnaire can be created using the features of XForms in XFDL (Lotus Forms). Any number of XFDL items can be added to the <xforms:repeat>, XPath predicate selection can be used to choose one XFDL item from among many to collect an answer for a survey question, and, most importantly, a different user interface control can be dynamically selected for each survey question.

Lotus Forms has supported XForms for a number of years now, and you can get a good idea of all the features supported from the XFDL reference manual.

However, now that XForms 1.1 has been finalized, I've had a number of questions about shining a spotlight on the XForms 1.1-specific features in Lotus Forms. Quite a number of XForms 1.1 features were improvements to the semantics of pre-existing XForms 1.0 features, and no small number of those improvements were based on feedback from the IBM Victoria Software Lab, so naturally Lotus Forms implements them, but it would take too long to cover them all here. The spotlight will instead be on Lotus Forms features syntactically activated by new XForms 1.1 vocabulary that was not available in XForms 1.0.

One of the coolest and most powerful additions in XForms 1.1 is the pair of if and while attributes on XForms actions. XForms actions are behaviors like changing data values, inserting or deleting nodes, or making web service calls, and they can be set to happen in response to events like a button press (DOMActivate) or user input (xforms-value-changed). The if and while attributes enable XForms actions to be conditionally or iteratively executed when these events occur.
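For instance, here is a hedged sketch of an if condition guarding a submission; the submission id, button label, and data shape are placeholders, and the ev: prefix is the usual XML Events namespace:

    <xforms:trigger>
      <xforms:label>Submit survey</xforms:label>
      <!-- only send the survey if every answer has been filled in -->
      <xforms:send ev:event="DOMActivate" submission="sendSurvey"
                   if="count(//answer[. = '']) = 0"/>
    </xforms:trigger>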

Lotus Forms supports the context and origin attributes on insert and delete actions. These attributes enable handling of empty repeating data, and they allow repeating data to be copied from a data template, which makes dynamic table data much easier to handle. Also, when deleting the data node representing the last row of a table, the if attribute described above can be used on a subsequent insert to detect that the table has become empty and to insert a new empty data node. The net result observed by the user is that deleting the last row of a table looks like it just clears out that row, so the user can immediately start entering more data.
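A hedged sketch of that pattern is below; the instance names ('survey' and 'template'), the repeat id ('itemRepeat'), and the button label are assumptions for illustration:

    <xforms:trigger>
      <xforms:label>Delete row</xforms:label>
      <xforms:action ev:event="DOMActivate">
        <xforms:delete nodeset="instance('survey')/item" at="index('itemRepeat')"/>
        <!-- if that was the last row, copy an empty row back in from the template instance -->
        <xforms:insert context="instance('survey')"
                       origin="instance('template')/item"
                       if="count(instance('survey')/item) = 0"/>
      </xforms:action>
    </xforms:trigger>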

Lotus Forms also implements the XPath function compare(), which means a form author could use XForms actions with the if and while attributes to sort or search data, if the need arose. Several other functions are implemented, including:

random() - in case you want to write a Lotus Form that plays Black Jack

current() - to help with data table lookups

power() - for exponential calculations such as compounded interest payments

days-to-date() - can be used in combination with days-from-date() to do simple date math like "today plus 90 days" (see the sketch after this list)

seconds-to-dateTime() - can be used in combination with seconds-from-dateTime() to do dateTime math like "now plus 3 hours"

local-date() - provides the date for the end-user, rather than the UTC date

local-dateTime() - provides the end-user date and time, rather than the UTC date and time.
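As a small illustration of the date functions (the dueDate node is hypothetical), a calculation like "today plus 90 days" can be expressed in a bind:

    <xforms:bind nodeset="dueDate"
                 calculate="days-to-date(days-from-date(local-date()) + 90)"/>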

Lotus Forms supports the display of images obtained from XForms instance data, in both button and label items, using the mediatype="image/*" attribute setting on xforms:output.
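A minimal sketch, assuming a hypothetical <photo> node holding base64-encoded image content in an instance named 'survey':

    <label sid="PHOTO">
      <xforms:output ref="instance('survey')/photo" mediatype="image/*"/>
    </label>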

Lotus Forms supports the xforms:* datatypes, which are like the corresponding xsd:* datatypes (such as xsd:date) except that they also accept an empty string as valid. Whereas pure XML Schema datatype definitions are intended to define what constitutes valid completed data, this feature of XForms recognizes the importance of a good user experience before and during completion of the form.
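For example (a sketch, reusing the survey data shape and the illustrative 'date' type keyword from above), a bind can apply xforms:date to the answers of date-type questions so that an as-yet-unanswered question does not flag the form as invalid:

    <xforms:bind nodeset="item/answer[../question/@type = 'date']" type="xforms:date"/>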

Finally, Lotus Forms supports several of the new features of xforms:submission, including:

the method="put" and method="delete" attribute settings to round out access to ATOM publishing services

the relevant and validate attributes, which allow a submission to turn off data validation and relevance pruning. This can be used to implement a "Save to Server" capability so that a user can complete a form over multiple sessions (a sketch appears after this list).

the serialization="none" attribute setting to enable an xforms:submission to perform simple URL activation.

the targetref attribute, which enables a web service call to replace only a portion or subtree of a data instance.

The replace="text" attribute setting, which allows a web service call to replace the content of the target data node, rather than the data node itself. This is useful for accessing web services that return textual content rather than XML.

Lotus Forms also has a number of XFDL extensions that add value to the integration with XForms, but that is the subject of another blog for another time.