Python implementation of HAML

UPDATE: Dmsl is an active project on GitHub, and I am looking for active developers who can report any bugs they come across. I have been using it in production environments for nearly one year as of 12/2011. Refer to the repository README for an in-depth look at current development: http://github.com/dasacc22/dmsl/blob/master/README

It has been a busy past few days, during which I have written up an implementation of HAML in Python. The closest thing I saw to this before was GRHML, whose site seems non-functional, and I had a hard time finding information on its status. So without further ado, let me introduce DAML, my HAML implementation.

OK, this isn’t all it does, but let me talk about a couple of things first. For one, this project, as of today, is only 3 or 4 days old. It’s hard for me to recall because I’ve poured a lot of time into trying to make this run as fast as possible. Today, on the other hand, has produced a lot of slowdowns and hackish code for implementing features like Django code blocks. For you XSL people (which would be me), that means a way to name and call templates from other templates. So anyway, a lot of this code from today (and yesterday) needs some love. With that said, let’s go over some differences between HAML and DAML when it comes to marking up HTML.

You’ll notice right away that declaring tags is straightforward and precisely the same. I’ve never actually used HAML, so I don’t know its exact syntax, but I spent a lot of time perusing the HAML documentation. The actual text processor in DAML is quite fast for building documents. At one point during development, when I was still testing speeds in comparison to HAML, I had results along the lines of 0.21ms processing time for DAML versus 2.4ms for HAML. This was for plain-jane HTML declaration. Let’s look at some of that now with DAML.

%html
%head
%title Good Stuff
%body
#header Some stuff here
and indentions of plain text
will all be part of this div
.span while this child div whose class="span"
will be embedded within the above div
while this text is tailed
and multilined too
%p and heres some random content too
%strong that can be played
however you like.
%p one thing worth noting is that new tags
%strong need
to be on
%em new lines
so keep that in mind.

All of this renders fine, of course. But I still have a TODO list for handling things unrelated to evaluating Python expressions, namely comments, escaping, and whitespace control. This is pretty much in line with HAML thus far, although I often see blank lines in HAML appended with an equals sign and I can’t recall what for. Regardless, the next thing to note is the use of variables, which loosely follows string.Formatter (and would fully support it if not for speed issues at the moment, but soon, hopefully). Let’s go over an example of setting and using variables.

:title = 'Hello World!'
%h1 {title}

Simple, eh? Now here’s the deal. There is a sandboxed Python eval going on. All lines starting with a colon are added to an evaluation queue that gets compiled and eval’d. Currently you tag variables just as you would with string.Formatter, and in the future this will support all the goodies associated with it. For now, though, it simply accesses the declared variable (versus string.Formatter being set up with the document’s namespace and getting called). Nevertheless, this works fine so far. Let’s look at some other things we can do with “:”.
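To make the colon-line idea concrete, here is a minimal sketch of queueing those lines, executing them once, and then filling the {name} placeholders. It is not DAML's actual code: the function name render_lines is hypothetical, and emptying __builtins__ is only an assumed stand-in for the real sandbox (it hides names but is not a security boundary).

```python
def render_lines(source, namespace=None):
    """Collect ':' lines, exec them once, then format the remaining lines."""
    # Hypothetical sandbox: an empty __builtins__ only hides names, it is not secure.
    sandbox = {"__builtins__": {}}
    sandbox.update(namespace or {})

    queue, markup = [], []
    for line in source.splitlines():
        if line.lstrip().startswith(":"):
            # Drop the leading colon; this sketch ignores indented Python blocks.
            queue.append(line.lstrip()[1:])
        else:
            markup.append(line)

    # Compile the whole queue in one go so the Python code runs a single time.
    code = compile("\n".join(queue), "<daml>", "exec")
    exec(code, sandbox)

    # Plain {name} substitution, similar in spirit to string.Formatter.
    return [line.format(**sandbox) for line in markup]


doc = ":title = 'Hello World!'\n%h1 {title}"
print(render_lines(doc))  # ['%h1 Hello World!']
```

The markup lines come back untouched apart from the substitution; turning %h1 into an actual element would be the job of a later building step.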

Notice first, that you can basically freely declare normal python code blocks. You can also declare functions to make use of as shown with the initial intro document. I would say, ideally, the syntax I would want to go for in the majority of cases is something like (for-loop with plain-text block not yet functional)

So this is something I am working towards, but you can totally declare :func(*args) inline right now and it works. Notice, going back to the plain text, that you can freely declare %tag#id.class(attr=val) in almost any order. The only exceptions are that you cannot start a line with (attr=val), nor can you declare (attr=val)(attr=val), though the latter may be added. Part of my TODO list is the ability to span attributes across multiple lines.
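As a rough illustration of the %tag#id.class(attr=val) shorthand, the sketch below splits one declaration into a tag name and an attribute dict using only string builtins. The function parse_tag is hypothetical and is not taken from DAML; the comma-separated pairs inside the parentheses and the default tag of div for lines starting with # or . are assumptions.

```python
def parse_tag(decl):
    """Split a declaration like '%a#home.nav(href=/index)' into (tag, attrs)."""
    attrs = {}
    head, paren, tail = decl.partition("(")
    if paren:
        # Pull out the single (attr=val) group, wherever it appears.
        inner, _, after = tail.partition(")")
        head += after
        for pair in inner.split(","):  # comma-separated pairs are an assumption
            key, _, value = pair.partition("=")
            attrs[key.strip()] = value.strip().strip("\"'")

    tag, classes = "div", []  # assumed default when no %tag is given
    kind, token = None, ""
    for ch in head + "\0":  # the sentinel flushes the final token
        if ch in "%#.\0":
            if kind == "%":
                tag = token
            elif kind == "#":
                attrs["id"] = token
            elif kind == ".":
                classes.append(token)
            kind, token = ch, ""
        else:
            token += ch
    if classes:
        attrs["class"] = " ".join(classes)
    return tag, attrs


print(parse_tag("%p(title=x).lead"))
```

Because the parenthesised group is removed before the scan, a dot inside an attribute value is not mistaken for a class marker.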

I’ll touch on one last bit here: I implemented something similar to Django blocks (at its most basic level). It’s a wee bit limited until I implement multiline filter options as part of the preprocessor I *just* started working on not 4 hours ago. But let me show what the final syntax would look like:

:extends('template.daml')
:block header
%h1 {title} OVERRIDE!

Now, the above doesn’t work until I finish a bit on this preprocessor, but to show you how that will work, let me show you the form that currently does work:

:extends('template.daml')
:block('header\n %h1 {title} OVERRIDE!')

Now that does work, and as you can see, my preprocessor is basically going to look for indented portions of text under colon directive calls and push them into the related function that’s in the sandboxed globals. This means you can write all sorts of text filters really easily, because back on your end you may have a need to do “blank” with “blah”. So...
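The rewrite described above could look roughly like the following sketch, which folds a bare colon directive and its indented body into a single function-call line. The name fold_directive_blocks is hypothetical and this is not the preprocessor in the repository; it assumes that directives already containing parentheses or an assignment should pass through unchanged.

```python
def fold_directive_blocks(source):
    """Rewrite ':name arg' plus indented lines into ':name(<repr of text>)'."""
    out = []
    lines = source.splitlines()
    i = 0
    while i < len(lines):
        line = lines[i]
        stripped = line.lstrip()
        indent = len(line) - len(stripped)
        # Only bare directives such as ':block header' are folded; calls like
        # ':extends(...)' and assignments pass through untouched.
        if stripped.startswith(":") and "(" not in stripped and "=" not in stripped:
            name, _, arg = stripped[1:].partition(" ")
            body = []
            j = i + 1
            # Gather the following lines that are indented deeper than the directive.
            while (j < len(lines) and lines[j].strip()
                   and len(lines[j]) - len(lines[j].lstrip()) > indent):
                body.append(lines[j][indent:])
                j += 1
            if body:
                text = "\n".join([arg] + body)
                out.append("%s:%s(%r)" % (line[:indent], name, text))
                i = j
                continue
        out.append(line)
        i += 1
    return "\n".join(out)


src = ":extends('template.daml')\n:block header\n %h1 {title} OVERRIDE!"
print(fold_directive_blocks(src))
```

Run on the indented form of the block example, this produces the same :block('header\n %h1 {title} OVERRIDE!') line that already works today.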

I’ve basically tried to unify a number of features of HAML here so that anything can be easily overridden, customized, and extended. Nothing is just built in and untouchable. There’s surely even more to come, but I am honestly way burnt out on this.

Let me lastly explain just how FAST this works currently. Before I went in today/yesterday adding all these crazy bits and pieces, I grabbed the bench suite from Genshi that benchmarks the most popular Python templating engines. I wrote my template to do everything to spec as the others were doing, and these were my results:

Mako: 0.38ms
DAML: 0.44ms
Cheetah: 0.66ms
Genshi-Text: 0.96ms

After that it just trailed off into really big numbers, with Genshi coming in at 1.5ms?? and Django at 2.4ms?? So this thing has the potential to be FAST. Since the hackery I’ve pulled in the past number of hours, I’ve actually almost doubled my processing time, and the above test for DAML now runs at 0.85ms. But keep in mind I have done no optimization, refactoring, code cleanup, or obviously needed bug fixes on the codebase I just recently committed. So that portion needs some love, and I’d love to get the speed back down to what it was just early yesterday morning.

Now, if you’re interested in the project, pleeease check it out on GitHub, http://github.com/dasacc22/dmsl, and play around with it. I’m going to slow my development down to a crawl, comparatively, as I’m really burnt out. I won’t be implementing any new features; I’ll be cleaning up my code base yet again and trying to write beautiful code for everything I’ve hacked together today and last night.

And please feel free to contact me by email or comments or however regarding this project. I would like to get the code cleaned up so that I can collaborate on different things to really bring this project up to par.

Thanks! Yes, the intention is for it to integrate with a framework. First on my list is a framework I have developed that is mostly just a carefully put together set of cherrypy toolbox tools and cherrypy namespace funcs. After that, though I’ve never sat down and actually used Django, it would be foolish of me not to get it working there since there is such a large user base. The way the processor works is that there is a dictionary used as the namespace for the eval. Essentially, at some point during the request, you’ll want to update the dictionary with the response from the controller. In the case of my personal toolbox, I am also going to investigate scraping the locals of the controller so as to negate the need for a return at all. I’ve updated the README and cleaned up the source quite a lot within the past hour, so feel free to check it out again on GitHub.

One thing I’m a little concerned about in your syntax is the syntax you use for including Python expressions. A colon at the beginning of the line is actually meaningful in Haml; see the documentation on filters: http://haml-lang.com/docs/yardoc/file.HAML_REFERENCE.html#filters. Why not use a hyphen and an equals sign, like most implementations do?

Hi, thanks for your interest. Currently I am going through a 0.1 series, hammering out the details of implementation and trying to take care of any bugs. After I finish refactoring my code, I anticipate the possibility of syntax changes taking place in the 0.2 series. For the most part, if I chose to follow HAML’s implementation more closely, those types of changes to the code base should be minimal. I was already looking to modularize the various directives. My choice of syntax is explained in the README on GitHub under the header “Explanation of Syntax Choices”. To quickly sum up, I felt there was a lot of overlap in HAML’s implementation. For example, while a colon refers to a filter in HAML, it refers to anything Pythonic in DAML. I actually have a preprocessor (though rudimentary at the moment) that allows you to declare HAML filters, and it simply converts them to function calls. This kind of unification, I feel, would allow seamless and easy extensibility in DAML. A text filter could even be declared in the document and used later on, and it’s all done via a simple colon. Anyway, I’m still hammering away at a lot of stuff and completely open to debate on any subject regarding this. I’m no Ruby programmer and have never used HAML, but from what I understand, HAML leans on ERB for certain features that I am implementing directly in DAML, and in this way I sort of feel that this project is becoming something more than a mere HAML implementation in Python. If you have any clarification on this, I’d be happy to hear it via email.

Yes, I have. Actually, I realized Shpaml is probably more closely related to HAML than DAML is. This is what I mean: from what I understand (I have never used HAML), HAML leans on ERB for missing feature sets. It’s more like a layer on ERB. ERB, I guess, is like someone’s Django templates. And that’s exactly what Shpaml is as well. You have to use django templates and shpaml together to do any kind of serious work. DAML is actually going to be a full featured template engine that runs fast and doesn’t need to rely on anything else for missing feature sets. You can view the README on GitHub to see what I am talking about for speed.

Haml certainly does not lean on ERB. It is most definitely “a full featured template language that runs fast and doesn’t need to rely on anything else for missing feature sets.” Even if it did compile to ERB, I don’t see how the implementation would have any bearing on the language itself.

Aah see, like I said, I could easily misstate something here since I’ve never used it. For some reason I recalled reading “HAML compiles your templates to ERB”, but this is totally wrong. On haml-lang there’s the obvious statement, “Haml is a drop-in replacement for ERB”.

I couldn’t find a use case in HAML that relates to XSL’s template naming, import/include, and call-template. How would this be achieved in HAML? From what I could tell, you would have to render partials for each page, for each section.

Also, can you provide more solid benchmarks? I had to essentially write my own, and I am not experienced with Ruby whatsoever. You have a “bench” in your git repo, but it’s converting HAML to ERB, which isn’t suitable for comparison. How about a bench from start to finish with purely HAML?

Haml is intended to be used in combination with helpers in the host language. These helpers handle dynamic things, like those you’re talking about. Rendering partials is one example of a helper, but if you want something more complicated you can implement it yourself without modifying the template code. That’s the beauty of modularity and everything being for its purpose.

I see, when I was going over the HAML documentation, nothing ever really stood out as a “here’s how to write your own HAML extensions/helpers and here’s the possibilities”.

I mean, the Helpers section in the documentation is two sentences long and rather vague, and doesn’t even mention the possibility of easily writing your own, just that HAML offers useful helpers for various things. This could simply be an oversight because I’m not a Ruby person. I’m glad to hear it is modular and able to stand alone over in Ruby land. Apparently the workflow for some Django users is to mix HAML and Django templates (from user comments elsewhere).

But again, I’d still like to see some benchmarks that compile to HTML instead of ERB, making use of whatever is available with HAML out of the box, because my speed tests may have been erroneous: they clocked in longer than a Django template, which is already slow. Of course, slow isn’t *that* slow compared to a number of other choices out there that are simply unbearable for any decently trafficked site.

The documentation is written with the assumption that those reading it will be familiar with the typical helper pattern in Ruby, which I suppose isn’t the case for you. Ruby web frameworks, whether they use Haml or ERB, typically provide some built-in mechanism for defining helper methods, as well as a suite of their own helpers that Haml can use.

As for benchmarks, the one in the repository doesn’t convert Haml to ERB. It compares the speed of compiling a Haml template to that of compiling an ERB template. In any case, it’s just a simple Haml file compared to a simple ERB file, so it shouldn’t be too hard to create your own.

I wouldn’t be terribly surprised if Daml ended up being substantially slower than existing engines. It took Haml a good long time to figure out all the tricks for getting up to speed with ERB.

Eh, I dunno about that. My first code write was very much a hack fest, and I still wrote it for speed, constantly profiling as I was writing it. There are speed benchmarks in my README, and it’s a close second to Mako, which is possibly the fastest template engine available for Python for many use cases.

I’ve given a lot of thought to implementation. The code is procedural and short, removing a lot of function overhead. I have completely avoided or replaced any regex with Python builtins, reducing lots of overhead. I only evaluate Python expressions in the document one time, which means it scales really well. And now that I have hammered out most of the details of implementation, I anticipate more minor speed gains once I rewrite the code.

One thing making Daml fast (or possibly even slow) is that it currently depends on lxml. But this is totally unnecessary for what I’m actually using the library for. So I will try to write a builtin solution, as lxml actually has a lot of function overhead and makes use of some regex here and there.

I wouldn’t be surprised at all if Daml was very competitive speed-wise

I’m not sure exactly what form Python templating tends to take, but I’ll hazard a guess that if the speed of the parsing matters, you’re doing something wrong. You should almost certainly be parsing everything once and then using an internal representation of some sort to generate the HTML for each request.

The way Haml does it is to compile each Haml template into a single Ruby method, which it then dynamically evaluates in the context of a module. This method is then called for each template, in many cases completely avoiding any Haml code at all and just doing string concatenation.

The speed of the parsing matters only for highly trafficked sites. The less overhead the parser has, the less overhead there is for the entire stack handling the request. Say your web stack, minus the template parser and ignoring bandwidth/hardware issues, takes 3ms to process a request. Now say you have two different parsers: one takes 1ms and the other takes 3ms. That makes 4ms versus 6ms per request, which is the difference between being able to handle 250 requests per second and roughly 166. Compared to testing I’ve actually done, this example has a narrow margin. Many times I see much longer processing times for the parser.
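The arithmetic behind those two figures, written out as a small sketch. It assumes a single worker handling requests one after another; requests_per_second is just a helper name made up for this illustration.

```python
def requests_per_second(stack_ms, parser_ms):
    """Sequential requests per second for one worker, given per-request times in ms."""
    return 1000.0 / (stack_ms + parser_ms)


fast = requests_per_second(3, 1)  # 4 ms per request
slow = requests_per_second(3, 3)  # 6 ms per request
print(int(fast), int(slow))  # 250 166
```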

As for the general way templates usually go in Python, there is a parser method that is passed a dict (indexed array). When looking over Ruby web controller code (not that I even vaguely understand Ruby), I noticed there were no returns, which fits the model of a template being evaluated in the context of a module. In Python, this can be done simply by having the controller return locals(), and then, after the controller call, the template can be rendered. There’s a lot of variation and preference here depending on the framework.
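A minimal sketch of that locals() pattern, with str.format standing in for a real template engine. The names render, index_controller, and handle_request are made up for the example and do not come from any particular framework.

```python
def render(template, context):
    """Stand-in for a template engine: fill {name} placeholders from a dict."""
    return template.format(**context)


def index_controller():
    # Everything assigned here becomes available to the template.
    title = "Hello World!"
    user = "guest"
    return locals()


def handle_request(controller, template):
    # The framework calls the controller, then renders with whatever it returned.
    context = controller()
    return render(template, context)


print(handle_request(index_controller, "<h1>{title}</h1><p>{user}</p>"))
```

The controller never names the template variables explicitly; whatever local names exist when it returns are what the template sees.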

From the speed tests I’ve already done, it’s simply not an issue. Daml makes three passes over the document: normalization, evaluation, and building. Having the first two separated allows for more dynamic usage by extensions. The block/extends feature makes use of this, for example. By the time it comes to building the doc, it’s as if the whole thing was typed out by hand. At this point I calculate where everything goes very quickly, in a way that allows for variable indention as well.

I actually tried looking over the Haml code related to indention parsing (related to the variable indention issue filed on GitHub), but I just don’t know enough to really tell what I’m looking at.

My point is that you shouldn’t need to care about parsing time at all, ever. By compiling down to code (Ruby or Python), you let the VM handle your speed, and the VM is always going to be faster than manual evaluation. If none of the other Python template engines do this, then you’ll have a major competitive advantage if you add support for it to Daml.
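For readers unfamiliar with the compile-once approach being suggested, here is a toy Python sketch. It turns a template with {name} placeholders into the source of a function, compiles that source a single time, and then each request only pays for string concatenation. compile_template is a hypothetical name; this is not how Haml or Daml is implemented.

```python
def compile_template(template):
    """Turn 'Hello {name}' into a Python function that concatenates strings."""
    parts = []
    rest = template
    while rest:
        before, brace, after = rest.partition("{")
        if before:
            parts.append(repr(before))  # literal text becomes a string constant
        if not brace:
            break
        name, _, rest = after.partition("}")
        parts.append("str(ctx[%r])" % name)  # placeholder becomes a dict lookup
    source = "def render(ctx):\n    return %s\n" % (" + ".join(parts) or "''")
    namespace = {}
    exec(compile(source, "<template>", "exec"), namespace)
    return namespace["render"]


render = compile_template("<h1>{title}</h1>")  # parse and compile once
print(render({"title": "First request"}))      # then call per request
print(render({"title": "Second request"}))
```

The parsing loop runs only inside compile_template; the returned function contains no template logic at all.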

Okay, I’ll bite. So, in terms of speed of processing vs. speed of compilation: since Apache and things like Passenger spawn child processes to handle requests, are the compiled haml/hamlpy bits reused in some way across processes, or do new child processes have to reparse templates?

I’m not sure, as I deal with lighttpd and cherrypy. But as of right now, all requests do a complete parse from start to finish, and it still manages to be quite fast. Currently in the works is a Template class that accepts a sandbox and a dictionary of variables to finish the parse. In this case, you would pass this Template instance around to your controller code (or, more likely, have your controller code access this instance in memory).

Hi, I’m the author of shpaml. First of all, I’m glad other folks in the Python community are pursuing HAML-like things.

I just want to clarify a couple things about shpaml. First of all, you are completely correct that shpaml is not intended to be a templating language. It is either intended to generate static HTML, or, more likely, it is intended to sit on top of a more complex templating language like django. Shpaml is all about syntax sugar; it leaves the heavy lifting for somewhere else in the stack.

A common misconception about shpaml is that it actually relies on Django. It does not. It runs completely independently of Django. The only direct relationship between shpaml and Django is that they do play nice together, and the shpaml website happens to run on top of Django. The statement “You have to use django templates and shpaml together to do any kind of serious work” is false, but if you replace “django templates” with “other good tools,” then it is completely correct.