The speed issue is the killer here. We can't have a 100% performance hit on template rendering times -- speed in that area is one of Django's selling points. I'm going to remove the "patch" keyword for now, Gary, since I don't think we can take this patch. Leaving the ticket open, though, since the idea is a good one and if we can fix it, we should.

I'll try to work up something a bit faster; I had something that was almost right a while back, how hard can it be? :-(

On first inspection it looks reasonably good. The performance stats are promising, too.

I can see a couple of minor areas for improvement:

Trailing whitespace - a lot of the diff is actually just your editor adding spaces to the end of lines.

Edge cases in tests - futzing with regular expressions makes me nervous, so I'd like to see a lot more tests to verify that nothing weird is going on. In particular, the test cases are fairly minimal - most of them start with the newline that is going to be stripped. I'd be interested in seeing more verification that if there is text/newlines *before* the newline that is to be stripped, that text won't be affected adversely.

Documentation. This behaviour needs to be documented -- nothing major; a quick note in the description of the template language should do.

The other thing holding up this ticket is the start of feature development for 1.3. We can't apply this until be branch for 1.2 support.

I wrote some more tests for this and it turned out that the presence of a tag would strip out all whitespace occurring before it, not just one line. I don't know if that was intentional, but I thought it surprising.

Another thing I noticed is that trailing whitespace is not stripped. I don't know how relevant that is, but I mimiced that behavior in this patch.

I seriously doubt anybody has used vertical tabs (\v) or form feeds (\f) in Django templates ever. There's no need to put them in the reg-exp.

Also, realise that this is backwards incompatible and not in a trivial way. Anybody using templates to generate output in a format where whitespace is always significant (plain text email is one very big case) would have their output broken by this (unclear if Jacob and Russell both thought this was an acceptable sacrifice -- I definitely don't -- or if it was overlooked). It has to be optional and off by default in the rendering process. (The other case where I can immediately think of this breaking something is Django's debug page's "paste traceback" button -- I remember Erik Karulf spending ages to make sure the whitespace in what we POST'ed to dpaste was sensible.

I agree with Malcolm that outright changing this is a backwards-compatibility issue. Unfortunately.

On the other hand, I think the new behavior is clearly superior in pretty much every case, and a major win, so I think it's worth finding a way to do it via an optional flag (an argument to Template.init? Would then need to be added to several shortcuts as well...), and then consider (later?) possible deprecation paths to migrate towards making it the default.

I'd like to see this enabled with a template tag, as it is template authors who will be needing to use it most. I'd also like to see a deprecation path so that this can become the default behaviour.

Similar to when autoescaping was introduced, we could add a template tag {% stripwhitespace on|off %}{% endstripwhitespace %} that people can put in their base templates to preserve the existing behaviour when the default changes or enable the new behaviour before it becomes the default. We could raise a deprecation warning whenever a template is rendered without this tag.

I'm not very familiar with the behavior of the blocktrans tag, but I guess there would need to be some handling of the whitespace at the end of a msgid being present in the po file, but not in the compiled template. I don't know if gettext is that strict. I guess the message extractor script should handle template syntax on its own line in the same way that the latest patch here does.

If the django message extractor script already uses the same classes, then there may be no problem, but it's something to look into.

I've spent the morning updating the patch against the latest development version.

It was fairly straightforward, except for the Lexer.create_token() method, which took some time. It was already fairly hairy in 1.4, and the addition of the verbatim tag in 1.5-dev had made it even more so. I've added comments to make it easier for the next person to unravel the logic.

All tests pass, pre- and post-patch, and the results of Gary Wilson's benchmark remain good.

You could try reading through the ticket and django-developers group history so that you're familiar with all the options and opinions that have been presented. Then outline a proposal on the django-developers group to move the ticket forward by choosing a preferred option (probably the one which had the most support), summarise why it has been chosen over any others, update any patches so that they apply cleanly and ask for a code review and any final objections that would delay this being committed.

There's no way I'd want to have _only_ a global flag to enable this behaviour. Certainly, there are cases where I'd rather avoid leading spaces and blank lines, but equally there are cases where it's important.

Not all the world is HTML. Django templates are frequently used for content types where the leading whitespace may be important, in the same projects where it's used for HTML, where it's less important.

Is there an open ticket for tracking the progress of integrating Django's newly blessed template engine (Jinja)? Or any indication on how "long term" that plan is? Was there a google groups discussion about this long term plan?

I did a quick search and couldn't find either. That being the case, it seems that if no new improvements will be made to Django's template engine because it might be replaced in the next 5-10 years (seems like a big job), that's simply leaving everyone who finds the current engine lacking in any way, stuck in limbo indefinitely.