cocoon-dev mailing list archives

On Wed, 2004-11-24 at 20:50 +0100, Daniel Fagerstrom wrote:
snip
> I'll continue to describe a possible design of a template generator (TG)
> that uses pre-compilation and has custom tags. This will be done in
> three steps: We start with a trivial pre-compiled template language (TL)
> _without_ expressions ;) In the next step we add an expression language
> to it and we finish by adding executable tags.
>
>
> A Trivial Pre-Compiled "Template Generator"
> ===========================================
>
> This step is not that useful in itself; the result will just be an
> unnecessarily complicated implementation of the file generator. This is
> for introducing some ideas.
>
> The pre-compiling TG works in two steps: First it compiles the content
> of the input source to a script, and stores the script in its cache
> together with an appropriate cache key. Then the cached script will be
> used as long as the cache key is valid. It is important that the script
> is thread safe, otherwise it will be unsafe to reuse. As the store,
> Cocoon's transient store could be used.
>
> In the second step the script is executed and SAX output is generated.
>
> The Trivial Script
> ------------------
>
> The task of the trivial "template generator" is to output its input
> template document as is :) For this task there already is a
> feature-complete script implementation in Cocoon: o.a.c.xml.SAXBuffer.
>
> The SAXBuffer contains a list of SAXBits where each SAXBit is an
> (immutable) representation of a SAX event. The SAXBit interface contains
> the method "send(ContentHandler)". And the SAXBuffer contains an inner
> class implementing SAXBit for each SAX event type.
>
> The SAXBuffer implements XMLConsumer which records the input events as
> a list of SAXBits. It also implements XMLizable with the method
> "toSAX(ContentHandler)" that iterates through the recorded SAXBits and
> executes "send(ContentHandler)" on them.
>
> So far, compiling a script means sending the input source to a
> SAXBuffer and executing a script means calling the "toSAX" method on the
> SAXBuffer.

Hello. I have made an attempt to implement what you describe above
(filed it in bugzilla,
http://issues.apache.org/bugzilla/show_bug.cgi?id=25288). The block
contains a JXTemplateGenerator that compiles a template to a script,
caches it and then invokes it.

Script compilation
------------------

A template is compiled into a Script by a ScriptCompiler. The script
compiler consumes SAX events and stores tokens representing those events
in the Script. Here's the Token interface:

    public interface Token {
        public int getStart();
        public void setStart(int start);
        public int getEnd();
        public void setEnd(int end);
    }

As an example of what a Script might look like, consider the following
XML snippet:

    <root>
      <item attr="1">
        Some text
      </item>
      <item attr="2"/>
    </root>

If we disregard whitespace this will be translated to the following
Script:

    PlainElementToken start=0, end=8, bodyStart=1, localname="root"
    PlainElementToken start=1, end=5, bodyStart=4, localname="item"
    AttributeToken start=2, end=4, localname="attr"
    CharactersToken start=3, end=4, characters="1"
    CharactersToken start=4, end=5, characters="Some text"
    PlainElementToken start=5, end=8, bodyStart=8, localname="item"
    AttributeToken start=6, end=8, localname="attr"
    CharactersToken start=7, end=8, characters="2"

A difference from SaxBuffer is that SaxBuffer stores separate objects
for the open and close events (e.g. startElement/endElement), whereas
the Script stores one token per element that records where the element
ends. The reason for doing it this way is that it is much easier to
play parts of the buffer. E.g. if we want to play the body of the root
element we know that we should invoke the tokens from bodyStart up to,
but not including, end (i.e. 1 to 7 inclusive).
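
To make the index arithmetic concrete, here is a minimal sketch of
playing an element's body by range. It uses a shortened version of the
example (only the first item) and a simplified stand-in class; the names
BodyRangeSketch, Tok and playBody are made up for illustration and are
not classes from the block. It lists every token in the range flatly,
whereas a real invoker would presumably let each element handle its own
nested tokens.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; the names here are illustrative only.
class BodyRangeSketch {

    // Simplified stand-in for a script token: a label plus the index fields.
    static final class Tok {
        final String label;
        final int start;
        final int end;        // index one past the last token belonging to this one
        final int bodyStart;  // first body token, or -1 if the token has no body

        Tok(String label, int start, int end, int bodyStart) {
            this.label = label;
            this.start = start;
            this.end = end;
            this.bodyStart = bodyStart;
        }
    }

    // Shortened example: <root><item attr="1">Some text</item></root>
    static List<Tok> exampleScript() {
        return Arrays.asList(
            new Tok("element root", 0, 5, 1),
            new Tok("element item", 1, 5, 4),
            new Tok("attribute attr", 2, 4, -1),
            new Tok("characters 1", 3, 4, -1),
            new Tok("characters Some text", 4, 5, -1));
    }

    // Collects the labels of the tokens at indices bodyStart .. end-1
    // of the element found at elementIndex.
    static List<String> playBody(List<Tok> script, int elementIndex) {
        Tok element = script.get(elementIndex);
        List<String> played = new ArrayList<>();
        for (int i = element.bodyStart; i < element.end; i++) {
            played.add(script.get(i).label);
        }
        return played;
    }

    public static void main(String[] args) {
        List<Tok> script = exampleScript();
        System.out.println(playBody(script, 0)); // everything inside <root>
        System.out.println(playBody(script, 1)); // everything inside <item>
    }
}
```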

Another difference that can be seen above is that attributes are stored
as tokens in the script just like any other type of SAX event. The
beauty of this approach is that when expression tokens are introduced
there will be no difference between how expressions in body content and
expressions in attributes are stored. For example:

    <root attr="AA ${1+1} BB">
      CC ${2+2} DD
    </root>

    =>

    PlainElementToken start=0, end=8, bodyStart=5, localname="root"
    AttributeToken start=1, end=5, localname="attr"
    CharactersToken start=2, end=3, characters="AA"
    ExpressionToken start=3, end=4, expression="1+1"
    CharactersToken start=4, end=5, characters="BB"
    CharactersToken start=5, end=6, characters="CC"
    ExpressionToken start=6, end=7, expression="2+2"
    CharactersToken start=7, end=8, characters="DD"

Note that I have made some simplifications regarding whitespace above.
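
As a rough illustration of why the same handling can serve both
attribute values and body text, the sketch below splits a string into
literal and expression parts. It is only an assumption about how such a
split could work: the class ExpressionSplitSketch, the method split and
the "text:"/"expr:" labels are invented here, expressions are assumed
to be delimited by "${" and "}" without nesting, and whitespace is kept
rather than simplified.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of the block described in this mail.
class ExpressionSplitSketch {

    // Splits text such as "AA ${1+1} BB" into literal and expression parts.
    // Literal parts are labelled "text:", expression parts "expr:".
    static List<String> split(String text) {
        List<String> parts = new ArrayList<>();
        int pos = 0;
        while (pos < text.length()) {
            int open = text.indexOf("${", pos);
            int close = open < 0 ? -1 : text.indexOf('}', open + 2);
            if (open < 0 || close < 0) {
                // No further complete expression: the rest is literal text.
                parts.add("text:" + text.substring(pos));
                break;
            }
            if (open > pos) {
                // Literal text in front of the expression.
                parts.add("text:" + text.substring(pos, open));
            }
            parts.add("expr:" + text.substring(open + 2, close));
            pos = close + 1;
        }
        return parts;
    }

    public static void main(String[] args) {
        // The same method handles an attribute value and body text.
        System.out.println(split("AA ${1+1} BB"));
        System.out.println(split("CC ${2+2} DD"));
    }
}
```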

Tags
----

In addition to the tokens mentioned in the samples above a user can make
custom tags by implementing the Tag interface.

    public interface Tag extends ElementToken {
        public void invoke(JXTGContext context) throws Exception;
    }

    public interface ElementToken extends Token {
        public int getBodyStart();
        public void setBodyStart(int bodyStart);
    }

A Tag is a Token and is placed inside the Script just like other tokens
(so tags need to be thread safe). The ScriptCompiler uses a
TagRepository (configured in the generator entry in sitemap.xmap) to
differentiate between registered tags (stored as Tags in the Script) and
normal elements (stored as PlainElementTokens).

Script invocation
-----------------

When the script has been compiled and cached it is then invoked by a
ScriptInvoker. The ScriptInvoker walks through the script and fires off
the appropriate SAX events. When it stumbles upon a Tag it executes
Tag.invoke(context). The tag has the choice of firing off SAX events of
its own or invoking its body through AbstractTag.invokeBody(context).
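
The walk can be sketched roughly as follows. This is a simplified model
and not the code in the block: Context stands in for JXTGContext and
collects strings where the real thing would fire SAX events, invokeBody
lives directly on the Tag class, and InvokerSketch, RepeatTwice and
demo are names invented for the example. Note that the tag keeps no
per-invocation state of its own, which is what lets a cached script be
shared between threads.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the invoker; all names are illustrative.
class InvokerSketch {

    // Stand-in for JXTGContext: holds the script and collects the output.
    static final class Context {
        final List<Token> script;
        final List<String> output = new ArrayList<>();
        Context(List<Token> script) { this.script = script; }
    }

    // A plain token that simply emits its text when played.
    static class Token {
        final String text;
        final int start;
        final int end; // index one past the last token belonging to this one
        Token(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    // A custom tag: a token with a body that decides what gets emitted.
    abstract static class Tag extends Token {
        final int bodyStart;
        Tag(int start, int end, int bodyStart) {
            super(null, start, end);
            this.bodyStart = bodyStart;
        }
        abstract void invoke(Context context);
        void invokeBody(Context context) {
            InvokerSketch.invoke(context, bodyStart, end);
        }
    }

    // Example tag that plays its body twice.
    static final class RepeatTwice extends Tag {
        RepeatTwice(int start, int end, int bodyStart) { super(start, end, bodyStart); }
        @Override
        void invoke(Context context) {
            invokeBody(context);
            invokeBody(context);
        }
    }

    // Walks the tokens in [from, to). Plain tokens are emitted directly;
    // a Tag is handed control and the walk resumes after the tag's end.
    static void invoke(Context context, int from, int to) {
        int i = from;
        while (i < to) {
            Token token = context.script.get(i);
            if (token instanceof Tag) {
                ((Tag) token).invoke(context);
                i = token.end;
            } else {
                context.output.add(token.text);
                i++;
            }
        }
    }

    // Builds a four-token script with one tag and returns what it emits.
    static List<String> demo() {
        List<Token> script = Arrays.asList(
            new Token("a", 0, 1),
            new RepeatTwice(1, 3, 2),
            new Token("b", 2, 3),
            new Token("c", 3, 4));
        Context context = new Context(script);
        invoke(context, 0, script.size());
        return context.output;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```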

Cache handling
--------------

Unfortunately I don't know all that much about how caching works so the
implementation can best be described as naive in that respect.
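
For reference, the simplest form of "use the cached script as long as
the cache key is valid" might look like the sketch below. It is an
assumption made for illustration only: it does not use Cocoon's store
or cache key classes, it treats a last-modified timestamp as the
validity check, and ScriptCacheSketch, Entry and get are invented
names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch; not Cocoon's store or validity API.
class ScriptCacheSketch {

    // A cached script together with the timestamp it was compiled from.
    private static final class Entry {
        final long lastModified;
        final Object script;
        Entry(long lastModified, Object script) {
            this.lastModified = lastModified;
            this.script = script;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();

    // Returns the cached script for uri, recompiling when there is no entry
    // or the source timestamp differs from the one recorded at compile time.
    Object get(String uri, long lastModified, Function<String, Object> compiler) {
        Entry entry = entries.get(uri);
        if (entry == null || entry.lastModified != lastModified) {
            // Two threads may both compile here; that is wasteful but
            // harmless as long as scripts are immutable once built.
            entry = new Entry(lastModified, compiler.apply(uri));
            entries.put(uri, entry);
        }
        return entry.script;
    }

    public static void main(String[] args) {
        ScriptCacheSketch cache = new ScriptCacheSketch();
        AtomicInteger compilations = new AtomicInteger();
        // Fake compiler that just numbers the scripts it produces.
        Function<String, Object> compiler =
            uri -> "script#" + compilations.incrementAndGet();

        System.out.println(cache.get("page.xml", 100L, compiler)); // compiles
        System.out.println(cache.get("page.xml", 100L, compiler)); // cache hit
        System.out.println(cache.get("page.xml", 200L, compiler)); // source changed
    }
}
```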

Issues, questions
-----------------

* Do you have any pointers on load testing? I tried JMeter but couldn't
  get it to work; are there any alternatives you could recommend? I
  compared the generator to the original JXTemplateGenerator on a plain
  document (no expressions or custom tags). It was a wee bit faster, but
  I measured it using the time comment generated on each page and I
  don't know how reliable that is. I would also suspect that the
  simplifications in cache handling and the lack of expression handling
  affect the result.
* The interfaces will probably have to undergo major changes to allow
  for expression handling.
* I would really appreciate comments on exception handling as I don't
  really know what is considered best practice.
* What does JX stand for?

Cheers Jonas