= Genshi Tutorial =
This tutorial is intended to give an introduction on how to use Genshi in your web application, and present common patterns and best practices. It is aimed at developers new to Genshi as well as those who've already used Genshi, but are looking for advice or inspiration on how to improve that usage.
== Introduction ==
In this tutorial we'll create a simple Python web application based on [http://cherrypy.org/ CherryPy 3]. !CherryPy was chosen because it provides a convenient level of abstraction over raw CGI or [http://wsgi.org/wsgi WSGI] development, but is less ambitious than full-stack web frameworks such as [http://pylonshq.com/ Pylons] or [http://www.djangoproject.com/ Django], which tend to come with a preferred templating language, and often show significant bias towards that language.
The application we'll build here is a stripped-down version of sites such as [http://reddit.com/ reddit] or [http://digg.com/ digg]: it lets users submit links to online articles they find interesting, and then lets other users comment on those stories. Just for kicks, we'll call that application '''Geddit?'''.
We'll keep the project as simple as possible, while still showing many of Genshi's features and how to best use them:
* For persistence, we'll use native Python object serialization (via the `pickle` module), instead of an SQL database and an ORM.
* There's no authentication of any kind. Anyone can submit links, anyone can comment.
* We'll start with the basics (rendering templates, handling forms, etc.), and then continue by adding features such as Atom feeds and an AJAX interface.
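The persistence approach from the list above can be sketched with nothing but the standard library. This is only an illustration, not the tutorial's actual storage code: the helper names `load_data()` and `save_data()` are hypothetical, and plain dictionaries stand in for the model objects.

```python
import os
import pickle


def load_data(filename):
    """Return the stored dictionary, or an empty one if there is no file yet."""
    if not os.path.exists(filename):
        return {}
    with open(filename, 'rb') as fileobj:
        return pickle.load(fileobj)


def save_data(filename, data):
    """Serialize the whole dictionary to ``filename`` in one go."""
    with open(filename, 'wb') as fileobj:
        pickle.dump(data, fileobj)
```

The whole "database" is read and written as a single object, which is fine for a tutorial but would not scale to concurrent writers.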
[[PageOutline(2-3, Content, inline)]]
== Getting Started ==
=== Prerequisites ===
First, make sure you have !CherryPy 3.0.x installed, as well as recent versions of [http://formencode.org/ FormEncode] and obviously Genshi. You can download and install those manually, or just use [http://peak.telecommunity.com/DevCenter/EasyInstall easy_install]:
{{{
$ easy_install CherryPy
$ easy_install FormEncode
$ easy_install Genshi
}}}
=== The !CherryPy Application ===
Next, set up the basic !CherryPy application.
1. Create a directory that should contain the application
2. Inside that directory create a Python package named geddit by doing the following:
* Create a `geddit` directory
* Create an empty file called `__init__.py` inside the `geddit` directory
3. Inside the `geddit` package directory, create a file called `controller.py` with the following content:
{{{
#!python
#!/usr/bin/env python

import operator, os, pickle, sys

import cherrypy


class Root(object):

    def __init__(self, data):
        self.data = data

    @cherrypy.expose
    def index(self):
        return 'Geddit'


def main(filename):
    data = {} # We'll replace this later

    # Some global configuration; note that this could be moved into a
    # configuration file
    cherrypy.config.update({
        'tools.encode.on': True, 'tools.encode.encoding': 'utf-8',
        'tools.decode.on': True,
        'tools.trailing_slash.on': True,
        'tools.staticdir.root': os.path.abspath(os.path.dirname(__file__)),
    })

    cherrypy.quickstart(Root(data), '/', {
        '/media': {
            'tools.staticdir.on': True,
            'tools.staticdir.dir': 'static'
        }
    })

if __name__ == '__main__':
    main(sys.argv[1])
}}}
Enter the tutorial directory in the terminal, and run:
{{{
$ PYTHONPATH=. python geddit/controller.py geddit.db
}}}
'''Note:''' On some Windows systems you may have to enter two lines:
{{{
SET PYTHONPATH=.
python geddit/controller.py geddit.db
}}}
You should see a log message pointing you to the URL where the application is being served, which is usually http://localhost:8080/. Visiting that page will respond with just the string “Geddit”, as that's what the `index()` method of the `Root` object returns.
Note that we've configured !CherryPy to serve static files from the `geddit/static` directory. !CherryPy will complain that that directory does not exist, so create it, but leave it empty for now. We'll add static resources later on in the tutorial.
=== Basic Template Rendering ===
So far the code doesn't actually use Genshi, or even any kind of templating. Let's change that.
Inside of the `geddit` directory, create a directory called `templates`, and inside that directory create a file called `index.html`, with the following content:
{{{
#!genshi

News

}}}
This template demonstrates some aspects of Genshi that we've not seen so far:
* We declare the `py:` namespace prefix on the `<html>` element, which is required to be able to add [wiki:Documentation/xml-templates.html#template-directives directives] to the template.
* There's a `py:if` [wiki:Documentation/xml-templates.html#conditional-sections condition] on the `<ol>` element. That means that the `<ol>` and everything it contains will only be included in the output stream if the expression `links` evaluates to a truth value. In this case we know that `links` is a list (assembled by the `Root.index()` method), so if the list is empty, the `<ol>` will be skipped.
* Next up, we've attached a `py:for` [wiki:Documentation/xml-templates.html#looping loop] to the `<li>` element. This means that the `<li>` element will be repeated for every item in the `links` list. The `link` variable is bound to the current item in the list on every step.
* You can tell that we can also use more complex expressions than just simple variable substitutions: the directives such as `py:if` and `py:for` take Python expressions of any complexity, and you can include Python expressions in other places by putting them inside curly braces prefixed with a dollar sign (`${...}`).
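To make the semantics of the directives above concrete, here is a rough plain-Python equivalent. It is only a sketch: the function name `render_links()` and the list markup it emits are assumptions for illustration, Genshi does not generate code like this, and the example assumes Python 3 for `html.escape`.

```python
from html import escape


def render_links(links):
    """Return the links as an HTML list, or an empty string if there are none."""
    # Equivalent of py:if="links": skip the whole list when it is empty.
    if not links:
        return ''
    items = []
    # Equivalent of py:for="link in links": one item per element.
    for link in links:
        # Equivalent of ${...}: substitute the value, escaped for HTML.
        items.append('<li>%s</li>' % escape(link['title']))
    return '<ol>%s</ol>' % ''.join(items)
```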
When you reload the page in the browser, you should get something like this:
[[Image(tutorial01.png)]]
=== Adding a Submission Form ===
In the previous step, we've already added a link to a submission form to the template, but we haven't implemented the logic to handle requests to that link yet.
To do that, we need to add a method to the `Root` class in `geddit/controller.py`:
{{{
#!python
    @cherrypy.expose
    def submit(self, cancel=False, **data):
        if cherrypy.request.method == 'POST':
            if cancel:
                raise cherrypy.HTTPRedirect('/')
            # TODO: validate the input data!
            link = Link(**data)
            self.data[link.id] = link
            raise cherrypy.HTTPRedirect('/')

        tmpl = loader.load('submit.html')
        stream = tmpl.generate()
        return stream.render('html', doctype='html')
}}}
'''Note:''' we explicitly check the HTTP request method here, and only look into the submitted data and add it to the database if it's a “POST” request. That's because “GET” requests in HTTP are [http://www.w3.org/DesignIssues/Axioms#state supposed to be idempotent], and more specifically safe: they should not have side effects on the server. If we didn't make this check, we'd also be accepting “GET” or “HEAD” requests that change the database, thereby violating that rule.
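The rule from the note can be shown in isolation, without !CherryPy. In this sketch the function `handle_submit()` and its return values are hypothetical; it only demonstrates that a request changes the stored data when, and only when, its method is “POST”.

```python
def handle_submit(method, store, data):
    """Return the name of the action taken for a request."""
    if method == 'POST':
        # Only POST requests are allowed to modify the stored data.
        store[data['id']] = data
        return 'redirect'
    # GET and HEAD requests just display the form and change nothing.
    return 'show_form'
```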
And of course we'll need to add a template to display the submission form. In `geddit/templates`, create a file named `submit.html`, with the following content:
{{{
#!genshi
Geddit: Submit new link

Submit new link

}}}
Now, if you click on the “Submit new link” link on the start page, you should see the submission form. Filling out the form and clicking “Submit” will post a new link and take you to the start page. Clicking the “Cancel” button will take you back to the start page without adding a link.
Please note though that we're not performing ''any'' kind of validation on the input, and that's of course a bad thing. So let's add validation next.
=== Adding Form Validation ===
We'll use [http://formencode.org/ FormEncode] to do the validation, but we'll keep it all fairly basic. Let's declare our form in a separate file, namely `geddit/form.py`, which will have the following content:
{{{
#!python
from formencode import Schema, validators


class LinkForm(Schema):
    username = validators.UnicodeString(not_empty=True)
    url = validators.URL(not_empty=True, add_http=True, check_exists=False)
    title = validators.UnicodeString(not_empty=True)


class CommentForm(Schema):
    username = validators.UnicodeString(not_empty=True)
    content = validators.UnicodeString(not_empty=True)
}}}
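For readers who have not used a validation library before, the idea behind such a schema can be sketched in plain Python. This is not !FormEncode's API or implementation: `ValidationFailed` and `validate_link()` are hypothetical names, and the URL handling is only a rough analogue of what the `add_http=True` option suggests.

```python
class ValidationFailed(Exception):
    """Hypothetical error type carrying one message per invalid field."""

    def __init__(self, errors):
        Exception.__init__(self, 'invalid input')
        self.errors = errors


def validate_link(data):
    """Return cleaned data, or raise ValidationFailed with per-field errors."""
    cleaned, errors = {}, {}
    for field in ('username', 'url', 'title'):
        value = (data.get(field) or '').strip()
        if not value:
            errors[field] = 'Please enter a value'
        cleaned[field] = value
    url = cleaned['url']
    if url and '://' not in url:
        # Assume HTTP when the user left out the scheme.
        cleaned['url'] = 'http://' + url
    if errors:
        raise ValidationFailed(errors)
    return cleaned
```

The important property is that all field errors are collected into one dictionary, so a form can display every problem at once instead of stopping at the first one.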
Now let's use those in the `Root.submit()` method. First add the form classes, as well as the `Invalid` exception type used by !FormEncode, to the imports at the top of `geddit/controller.py`, which should then look something like this:
{{{
#!python
import cherrypy
from formencode import Invalid
from genshi.template import TemplateLoader
from geddit.form import LinkForm, CommentForm
from geddit.model import Link, Comment
}}}
Then, update the `submit()` method to match the following:
{{{
#!python
    @cherrypy.expose
    def submit(self, cancel=False, **data):
        if cherrypy.request.method == 'POST':
            if cancel:
                raise cherrypy.HTTPRedirect('/')
            form = LinkForm()
            try:
                data = form.to_python(data)
                link = Link(**data)
                self.data[link.id] = link
                raise cherrypy.HTTPRedirect('/')
            except Invalid as e:
                errors = e.unpack_errors()
        else:
            errors = {}

        tmpl = loader.load('submit.html')
        stream = tmpl.generate(errors=errors)
        return stream.render('html', doctype='html')
}}}
As you can tell, we now only add the submitted link to our database when validation is successful: all fields need to be filled out, and the `url` field needs to contain a valid URL. If the submission is valid, we proceed as before. If it is not valid, we render the submission form template again, passing it the dictionary of validation errors. Let's modify the `submit.html` template so that it displays those error messages:
{{{
#!genshi
Geddit: Submit new link

Submit new link

}}}
So now, if you submit the form without entering a title, and having entered an invalid URL, you'd see something like the following:
[[Image(tutorial02.png)]]
But there's a problem here: Note how the input values have vanished from the form! We'd have to repopulate the form manually from the data submitted so far. We could do that by adding the required `value=""` attributes to the text fields in the template, but Genshi provides a more elegant way: the [wiki:Documentation/filters.html#html-form-filler HTMLFormFiller] stream filter. Given a dictionary of values, it can automatically populate HTML forms in the template output stream.
To enable this functionality, first you'll need to add the following import to the `geddit/controller.py` file:
{{{
#!python
from genshi.filters import HTMLFormFiller
}}}
Next, update the bottom lines of the `Root.submit()` method implementation so that they look as follows:
{{{
#!python
        tmpl = loader.load('submit.html')
        stream = tmpl.generate(errors=errors) | HTMLFormFiller(data=data)
        return stream.render('html', doctype='html')
}}}
Now, all entered values are preserved when validation errors occur. Note that the form is populated as the template is being generated; there is no reparsing and reserialization of the output.
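The `|` operator used to attach the filter may look unusual. The following sketch shows the general idea with a toy class; `ToyStream` and `uppercase()` are hypothetical stand-ins, and the events are plain strings rather than real Genshi markup events.

```python
class ToyStream(object):
    """Hypothetical, simplified stand-in for a markup stream."""

    def __init__(self, events):
        self.events = events

    def __iter__(self):
        return iter(self.events)

    def __or__(self, function):
        # "stream | f" wraps the result of f(stream) in a new stream, so
        # filters can be chained and only run when the stream is consumed.
        return ToyStream(function(self))


def uppercase(stream):
    # A filter is just a callable that takes a stream and yields events.
    for event in stream:
        yield event.upper()
```

Because the filter is a generator, nothing is processed until the resulting stream is iterated, which is why filters have to be attached before the output is serialized.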
== Improving the Application ==
=== Factoring out the Templating ===
By now, we already have some repetitive code when it comes to rendering templates: both the `Root.index()` and the `Root.submit()` methods look very similar in that regard: they load a specific template, call its `generate()` method passing it some data, and then call the `render()` method of the resulting stream. As we're going to be adding more controller methods, let's factor out those things into a library module.
There's a special challenge here, though: we still want to be able to add the `HTMLFormFiller` or other stream filters to the template output stream, which needs to be done before that output stream is serialized. We'll use a combination of a decorator and a regular function to achieve that, which collaborate using the !CherryPy thread-local context.
Create a directory called `lib` inside the `geddit` directory, and inside the `lib` directory create two files, named `__init__.py` and `template.py`, respectively. Leave the first one empty, and in the second one, insert the following code:
{{{
#!python
import os

import cherrypy
from genshi.core import Stream
from genshi.output import encode, get_serializer
from genshi.template import Context, TemplateLoader

loader = TemplateLoader(
    os.path.join(os.path.dirname(__file__), '..', 'templates'),
    auto_reload=True
)

def output(filename, method='html', encoding='utf-8', **options):
    """Decorator for exposed methods to specify what template they should use
    for rendering, and which serialization method and options should be
    applied.
    """
    def decorate(func):
        def wrapper(*args, **kwargs):
            cherrypy.thread_data.template = loader.load(filename)
            opt = options.copy()
            if method == 'html':
                opt.setdefault('doctype', 'html')
            serializer = get_serializer(method, **opt)
            stream = func(*args, **kwargs)
            if not isinstance(stream, Stream):
                return stream
            return encode(serializer(stream), method=serializer,
                          encoding=encoding)
        return wrapper
    return decorate

def render(*args, **kwargs):
    """Function to render the given data to the template specified via the
    ``@output`` decorator.
    """
    if args:
        assert len(args) == 1, \
            'Expected exactly one argument, but got %r' % (args,)
        template = loader.load(args[0])
    else:
        template = cherrypy.thread_data.template
    ctxt = Context(url=cherrypy.url)
    ctxt.push(kwargs)
    return template.generate(ctxt)
}}}
In the `geddit/controller.py` file, you can now remove the `from genshi.template import TemplateLoader` line, and also the instantiation of the `TemplateLoader`, as that is now done in our new library module. Of course, you'll have to import that library module instead:
{{{
#!python
from geddit.lib import template
}}}
Now, we can change the `Root` class to match the following:
{{{
#!python
class Root(object):

    def __init__(self, data):
        self.data = data

    @cherrypy.expose
    @template.output('index.html')
    def index(self):
        links = sorted(self.data.values(), key=operator.attrgetter('time'))
        return template.render(links=links)

    @cherrypy.expose
    @template.output('submit.html')
    def submit(self, cancel=False, **data):
        if cherrypy.request.method == 'POST':
            if cancel:
                raise cherrypy.HTTPRedirect('/')
            form = LinkForm()
            try:
                data = form.to_python(data)
                link = Link(**data)
                self.data[link.id] = link
                raise cherrypy.HTTPRedirect('/')
            except Invalid as e:
                errors = e.unpack_errors()
        else:
            errors = {}

        return template.render(errors=errors) | HTMLFormFiller(data=data)
}}}
As you can see here, the code is now less repetitive: there's a simple decorator to define which template should be used, and the `render()` function produces the template output stream which can then be further processed if necessary.
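The way the decorator and the `render()` function collaborate through thread-local state can be reduced to a few lines. This sketch uses `threading.local` from the standard library instead of !CherryPy's thread data, and returns a descriptive string instead of a real template stream; all names in it are illustrative.

```python
import threading

_local = threading.local()


def output(filename):
    """Decorator that records which template the wrapped function should use."""
    def decorate(func):
        def wrapper(*args, **kwargs):
            # Remember the template name for the duration of this call.
            _local.template = filename
            return func(*args, **kwargs)
        return wrapper
    return decorate


def render(**data):
    # Look up the template name stored by the decorator in this thread.
    return '%s rendered with %r' % (_local.template, sorted(data))


@output('index.html')
def index():
    return render(links=[])
```

Because the state is per thread, two requests handled concurrently in different threads do not overwrite each other's template choice.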
=== Adding a Layout Template ===
But there's also duplication in the template files themselves: each template has to redefine the complete header and footer, and any other “decoration” markup that we may want to apply to the complete site. Now, we could simply put those commonly used markup snippets into separate HTML files and [wiki:Documentation/xml-templates.html#includes include] them in the templates where they are needed. But Genshi provides a more elegant way to apply a common structure to different templates: [wiki:Documentation/xml-templates.html#match-templates match templates].
Most template languages provide an inheritance mechanism to allow different templates to share some kind of common structure, such as a common header, navigation, and footer. Using this mechanism, you create a “master template” in which you declare slots that “derived templates” can fill in. The problem with this approach is that it is fairly rigid: the master needs to know which content the templates will produce, and what kind of slots need to be provided for them to stuff their content in. Also, a derived template is itself not a valid or even well-formed HTML file, and cannot be easily previewed or edited in a WYSIWYG authoring tool.
Match templates in Genshi turn this upside down. They are conceptually similar to running an XSLT transformation over your template output: you create rules that match elements in the template output stream based on XPath patterns. Whenever there is a match, the matched content is replaced by what the match template produces. This sounds complicated in theory, but is fairly intuitive in practice, so let's look at a concrete example.
In the `geddit/templates/` directory, add a file named `layout.html`, with the following content:
{{{
#!genshi
Geddit: ${title}
${select('*[local-name()!="title"]')}

}}}
That contains a whole lot of things, so let's break it up into smaller pieces and go through the various aspects to clarify them.
1. '''The Document Element'''
{{{
#!genshi
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:py="http://genshi.edgewall.org/" py:strip="">
}}}
First, note that the root element of the template is an `<html>` tag. This is needed because markup templates are XML documents, and XML documents require a single root element (we also use it to attach our namespace declarations, but we could just as well do that on the nested `<py:match>` elements). However, because the page templates that include this file will also have `<html>` root elements, we add the `py:strip=""` directive so that this second `<html>` tag doesn't make it through into the output stream.
2. '''Match Template Definition'''
{{{
#!genshi
<py:match path="head" once="true">
}}}
Here we define the first match template. The `path` attribute contains an XPath pattern specifying which elements this match template should be applied to. In this case, the XPath is very simple: it matches any element with the tag name “head”, so it will be applied to the `<head>...</head>` element. We also add the `once="true"` attribute to tell Genshi that we only expect a single occurrence of the `<head>` element in the stream. Genshi can perform some optimizations based on this information.
3. '''Selecting Matched Content'''
{{{
#!genshi
<head py:attrs="select('@*')">
}}}
Inside match templates, you can use the special function `select(path)` to access the element that matched the pattern. Here we use that function in the `py:attrs` directive, which basically translates to “''get all attributes on the matched element, and add them to this element''”. So for example if your page template contained `<head id="foo">`, the element produced by this match template would also have the same `id="foo"` attribute.
{{{
#!genshi
Geddit: ${title}
}}}
This is a more complex example for selecting matched content: it fetches the text contained in the `<title>` element of the original `<head>` and prefixes it with the string “Geddit: ”. But as page templates may not even contain a `<title>` element, we first check whether it exists, and only add the colon if it does. Thus, if the page has no title of its own, the result will be “Geddit”.
{{{
#!genshi
${select('*[local-name()!="title"]')}
}}}
Finally, this is an example for using a more complex XPath pattern. This `select()` incantation here returns a stream that contains all child elements of the original `<head>`, except for those elements with the tag name “title”. If we didn't add that predicate, the output stream would contain two `<title>` tags.
If you've done a bit of XSLT, match templates should look familiar. Otherwise, you may want to familiarize yourself with the basics of [http://en.wikipedia.org/wiki/XPath XPath 1]—but note that Genshi only implements a subset of the full spec as explained in [wiki:Documentation/xpath.html Using XPath in Genshi]. Just play around with match templates a bit; at the core, the concept is actually pretty simple and consistent.
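If XPath is new to you, it may help to see the two selections discussed above written as ordinary Python over a parsed document. This sketch uses the standard library's `xml.etree.ElementTree` rather than Genshi's XPath engine, and the sample markup and function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical page header, standing in for the matched element.
PAGE_HEAD = ET.fromstring(
    '<head><title>News</title><meta name="x" content="y" /></head>'
)


def select_all_but_title(head):
    """Return the child elements of ``head`` whose tag is not "title"."""
    return [child for child in head if child.tag != 'title']


def page_title(head):
    """Build the combined title, adding the colon only if a title exists."""
    title = head.findtext('title')
    return 'Geddit: %s' % title if title else 'Geddit'
```

The first function corresponds to the `*[local-name()!="title"]` pattern, the second to the conditional prefixing of the page title.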
Now we need to update the page templates: they no longer need the header and footer, and we'll have to include the `layout.html` file so that the match templates are applied. For the inclusion, we add the namespace prefix for XInclude, and an `xi:include` element.
Let's see how the template should look now for `index.html`:
{{{
#!genshi
News

News

}}}
Also change the `submit.html` template analogously, by adding the namespace prefix and the `<xi:include>` element, and by removing the header and footer `<div>`s:
{{{
#!genshi
Submit new link

Submit new link

Your name:

${errors.username}

Link URL:

${errors.url}

Title:

${errors.title}

}}}
And speaking of “layout”, you can see that we've added references to some static resources in the layout template: there's an embedded image as well as a linked stylesheet and javascript file. [http://svn.edgewall.org/repos/genshi/trunk/examples/tutorial/geddit/static Download] those files and put them in your `geddit/static/` directory.
When you reload the front page in your browser, you should now see something similar to the following:
[[Image(tutorial03.png)]]
=== Implementing Comments ===
We're still missing an important bit of functionality: people should be able to comment on submitted links. Three things are needed to implement that:
* a detail view of a link, showing all comments made so far,
* a way to get to that page from the list of links, and,
* a form to add new comments.
Note that on the model side we're covered: there's already a `Comment` class in `geddit.model`, and we even have two comments in our database already. And we already have the form that'll be used to validate comment submissions, in the form of the `CommentForm` class in `geddit.form`.
So let's add the rest by extending the `index.html` template to show how many comments there are so far, and make that a link to the detail page. Change your `geddit/templates/index.html` file to match the following:
{{{
#!genshi
News

News

}}}
The template now renders an extra element for every link in the list, each containing the number of comments, and linking to the detail page.
Of course, if you click on those links, you'll get an error page: we haven't implemented the `info()` view yet!
Let's do that now. Add the following method to the `Root` class:
{{{
#!python
    @cherrypy.expose
    @template.output('info.html')
    def info(self, id):
        link = self.data.get(id)
        if not link:
            raise cherrypy.NotFound()
        return template.render(link=link)
}}}
And then add the needed template `geddit/templates/info.html` with the following content:
{{{
#!genshi
${link.title}

${link.title}

}}}
At this point you should be able to see the number of comments on the start page, click on that link to get to the details page, where you should see all comments listed for the corresponding link submission. That page also contains a link for submitting additional comments, and that's what we'll need to set up next.
We need to add the method for handling comment submissions to our `Root` object. It should look like this:
{{{
#!python
    @cherrypy.expose
    @template.output('comment.html')
    def comment(self, id, cancel=False, **data):
        link = self.data.get(id)
        if not link:
            raise cherrypy.NotFound()
        if cherrypy.request.method == 'POST':
            if cancel:
                raise cherrypy.HTTPRedirect('/info/%s' % link.id)
            form = CommentForm()
            try:
                data = form.to_python(data)
                comment = link.add_comment(**data)
                raise cherrypy.HTTPRedirect('/info/%s' % link.id)
            except Invalid as e:
                errors = e.unpack_errors()
        else:
            errors = {}

        return template.render(link=link, comment=None,
                               errors=errors) | HTMLFormFiller(data=data)
}}}
Last but not least, we need the template that renders the comment submission form. Inside `geddit/templates`, add a file named `comment.html`, and insert the following content:
{{{
#!genshi
Comment on “${link.title}”

Comment on “${link.title}”

In reply to ${comment.username}
at ${comment.time.strftime('%x %X')}:

${comment.content}

Your name:

${errors.username}

Comment:

${errors.content}

}}}
Phew! We should be done with the commenting now. Play around with the application a bit to get a feel for what we've achieved so far. The next section will look into various things that can be done to further improve the application.
[[Image(tutorial04.png)]]
== Advanced Topics ==
=== Adding an Atom Feed ===
Every web site needs an RSS or [http://www.atomenabled.org/ Atom] feed these days. So we shall provide one too.
Adding Atom feeds to Geddit is fairly straightforward. First, we'll need to add auto-discovery links to the index and detail pages.
Inside the `<head>` element of `geddit/templates/index.html`, add:
{{{
#!genshi
}}}
And inside the `<head>` element of `geddit/templates/info.html`, add:
{{{
#!genshi
}}}
Now we need to add the `feed()` method to our `Root` class in `geddit/controller.py`:
{{{
#!python
    @cherrypy.expose
    @template.output('index.xml', method='xml')
    def feed(self, id=None):
        if id:
            link = self.data.get(id)
            if not link:
                raise cherrypy.NotFound()
            return template.render('info.xml', link=link)
        else:
            links = sorted(self.data.values(), key=operator.attrgetter('time'))
            return template.render(links=links)
}}}
Note that this method dispatches to different templates depending on whether the `id` parameter was provided. So, for the URL `/feed/`, we'll render the list of links using the template `index.xml`, and for the URL `/feed/{link_id}/`, we'll render a link and the list of related comments using the template `info.xml`.
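The dispatch logic can be exercised on its own, without templates or a web server. In this sketch `Entry` and `choose_feed()` are hypothetical names, the records carry only the two fields needed here, and a `KeyError` stands in for the “not found” response raised by the controller.

```python
import operator
from collections import namedtuple

# Hypothetical minimal record; the tutorial's model has more fields.
Entry = namedtuple('Entry', 'id time')


def choose_feed(data, id=None):
    """Return (template name, template data) for a feed request."""
    if id:
        link = data.get(id)
        if link is None:
            raise KeyError(id)  # the controller raises NotFound here
        return 'info.xml', {'link': link}
    # Without an id, list all links ordered by submission time.
    links = sorted(data.values(), key=operator.attrgetter('time'))
    return 'index.xml', {'links': links}
```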
The templates for this are also pretty simple. First, `geddit/templates/index.xml`:
{{{
#!genshi
Geddit News${links[0].time.isoformat()}${link.url}${url('/info/%s/' % link.id)}${link.username}${link.time.isoformat()}${link.title}
}}}
And now, `geddit/templates/info.xml`:
{{{
#!genshi
Geddit: ${link.title}
${time.isoformat()}
Comment ${len(link.comments) - idx} on “${link.title}”${url('/info/%s/' % link.id)}#comment${idx}${comment.username}${comment.time.isoformat()}${comment.content}
}}}
Voila! We now provide Atom feeds for all our content.
[[Image(tutorial05.png)]]
=== Ajaxified Commenting ===
[http://www.adaptivepath.com/publications/essays/archives/000385.php AJAX] (Asynchronous Javascript And XML) is all the rage today, and in many ways, it is indeed a helpful technique for improving the usability and responsiveness of web-based applications.
To demonstrate how you'd use Genshi in a project that uses AJAX, let's enhance the Geddit commenting feature to use AJAX. We'll implement this in such a way that the current way comments work remains available, to serve those who don't have Javascript available, and also just to be good web citizens. That approach to using funky new techniques is often referred to as “unobtrusive Javascript”, and what it provides is “graceful degradation.”
'''Note''': technically, what we'll be doing here isn't AJAX in the literal sense, because we'll not be transmitting XML. Instead, we'll respond with simple HTML fragments, a technique that is sometimes referred to as “AJAH” (“H” as in HTML).
We'll go about this in the following way: on a link submission detail page, the “Add comment” button will now load the comment form into the current page, instead of going to the dedicated comment submission page. When the user clicks the “Cancel” button, we simply remove the form from the page. On the other hand, if the user clicks the “Submit” button, we validate the entry, and if it's okay, we remove the form and load the new comment into the list on the page. That means that the user never leaves or reloads the link submission detail page in the process!
The first thing we need to do is to make the comment form available as a fragment, outside of the normally needed HTML skeleton. To do that, we create a new template file, in `geddit/templates/_form.html`, with the following content:
{{{
#!genshi

Your name:

${errors.username}

Comment:

${errors.content}

}}}
And as that is the same form as the one used in the `geddit/templates/comment.html` template, let's replace the markup in that template with an include:
{{{
#!genshi
Comment on “${link.title}”

Comment on “${link.title}”

In reply to ${comment.username}
at ${comment.time.strftime('%x %X')}:

${comment.content}

}}}
We'll also need to make the display of an individual comment available as an HTML fragment, so let's factor it out into a separate template file as well.
Add a template called `_comment.html` to the `geddit/templates` directory, and insert the following lines:
{{{
#!genshi

${comment.username} at ${comment.time.strftime('%x %X')}

${comment.content}

}}}
And in `geddit/templates/info.html` replace the `<li>` element rendering the comments with the following:
{{{
#!genshi

}}}
Now we'll need to look into modifying the `Root.comment()` method so that it correctly deals with the AJAX requests we'll be adding.
For convenience, let's add a new small module to our `lib` package. Inside the `geddit/lib` directory, create a file named `ajax.py`, and add the following code to it:
{{{
#!python
import cherrypy

def is_xhr():
    requested_with = cherrypy.request.headers.get('X-Requested-With')
    return requested_with and requested_with.lower() == 'xmlhttprequest'
}}}
This checks whether the current request originates from usage of AJAX (technically, the `XMLHttpRequest` Javascript object), based on a convention commonly used in Javascript libraries to add the special HTTP header “`X-Requested-With: XMLHttpRequest`” to all requests.
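The header check is easy to try out in isolation. This variant is a sketch that takes the headers as an argument (a hypothetical signature, chosen so that no running !CherryPy request is needed) and always returns a real boolean. Note that a plain dictionary lookup is case-sensitive with respect to the header name, which a real header map typically is not.

```python
def is_xhr(headers):
    """Return True if the headers mark the request as an XMLHttpRequest."""
    requested_with = headers.get('X-Requested-With')
    # bool() makes the result a real boolean even when the header is missing.
    return bool(requested_with) and requested_with.lower() == 'xmlhttprequest'
```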
Add an import of that module to the top of the `geddit/controller.py` file, replacing:
{{{
#!python
from geddit.lib import template
}}}
with:
{{{
#!python
from geddit.lib import ajax, template
}}}
Then, replace the `Root.comment()` method in `geddit/controller.py` with the following code:
{{{
#!python
    @cherrypy.expose
    @template.output('comment.html')
    def comment(self, id, cancel=False, **data):
        link = self.data.get(id)
        if not link:
            raise cherrypy.NotFound()
        if cherrypy.request.method == 'POST':
            if cancel:
                raise cherrypy.HTTPRedirect('/info/%s' % link.id)
            form = CommentForm()
            try:
                data = form.to_python(data)
                comment = link.add_comment(**data)
                if not ajax.is_xhr():
                    raise cherrypy.HTTPRedirect('/info/%s' % link.id)
                return template.render('_comment.html', comment=comment,
                                       num=len(link.comments))
            except Invalid as e:
                errors = e.unpack_errors()
        else:
            errors = {}

        if ajax.is_xhr():
            stream = template.render('_form.html', link=link, errors=errors)
        else:
            stream = template.render(link=link, comment=None, errors=errors)
        return stream | HTMLFormFiller(data=data)
}}}
There's another small detail we'll need to take care of: in our `@template.output` decorator, we're automatically adding a `<!DOCTYPE>` declaration to any template output stream that is being serialized to HTML. For AJAX responses containing HTML fragments, we don't really want to add any kind of DOCTYPE, so we'll need to adjust the implementation of the decorator.
To do that, first add an import of our `geddit/lib/ajax.py` file to the `geddit/lib/template.py` file:
{{{
#!python
from geddit.lib import ajax
}}}
Then, replace the implementation of the `output()` function with the following:
{{{
#!python
def output(filename, method='html', encoding='utf-8', **options):
    """Decorator for exposed methods to specify what template they should use
    for rendering, and which serialization method and options should be
    applied.
    """
    def decorate(func):
        def wrapper(*args, **kwargs):
            cherrypy.thread_data.template = loader.load(filename)
            opt = options.copy()
            if not ajax.is_xhr() and method == 'html':
                opt.setdefault('doctype', 'html')
            serializer = get_serializer(method, **opt)
            stream = func(*args, **kwargs)
            if not isinstance(stream, Stream):
                return stream
            return encode(serializer(stream), method=serializer,
                          encoding=encoding)
        return wrapper
    return decorate
}}}
Note how we're now only adding the `doctype='html'` serialization option when we're not handling an AJAX request.
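The option handling is the only part that changed, and it can be checked separately from the serializer. The helper `serializer_options()` below is hypothetical; it mirrors the condition in the decorator, with the AJAX check passed in as a plain flag.

```python
def serializer_options(method, xhr, **options):
    """Return the serialization options for one response."""
    opt = options.copy()
    if not xhr and method == 'html':
        # Full pages get a DOCTYPE unless the caller already chose one.
        opt.setdefault('doctype', 'html')
    return opt
```

Using `setdefault()` means an explicitly passed `doctype` option still wins over the default.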
Finally, we need to add the actual Javascript logic needed to orchestrate all this. Add the following code at the bottom of the `<head>` element in the `geddit/templates/info.html` template:
{{{
#!genshi
}}}
This Javascript snippet uses [http://jquery.com/ jQuery] (via the `jquery.js` file you've already added to your `geddit/static` directory). We won't go into the details of the script here; suffice it to say that it implements our goals in a fairly lightweight manner. For a nice introduction to jQuery, see [http://simonwillison.net/ Simon Willison]'s blog post [http://simonwillison.net/2007/Aug/15/jquery/ jQuery for JavaScript programmers].
Now, when you click on the “Add comment” link on the link submission detail page, with Javascript enabled, you should see the comment form appear on the same page:
[[Image(tutorial06.png)]]
=== Allowing Markup in Comments ===
At this point we allow users to post plain text comments, but those comments can't include niceties such as hyperlinks or HTML inline formatting (emphasis, etc.). A very naive application would simply accept HTML tags in the input, and pass those tags through to the output. That is generally a bad thing, however, as it [http://neomeme.net/2007/05/26/reddit-hacked/ opens up] your site to [http://ha.ckers.org/cross-site-scripting.html cross-site scripting] (XSS) attacks, which can undermine any security measures you try to put into effect (including SSL). And because this is generally not the behavior you want, Genshi XML-escapes everything by default, which makes it safe to include in (X)HTML output.
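What escaping does to a malicious comment can be seen with a few lines of standard-library Python. This is only an illustration of the principle: `html.escape` (Python 3) stands in for Genshi's own escaping, and the sample comment text is made up.

```python
from html import escape

# Hypothetical comment text submitted by a user.
comment = 'Nice <b>link</b> <script>alert("x")</script>'

# Escaping turns markup characters into entities, so the browser displays
# the text instead of interpreting it as tags.
escaped = escape(comment)
```

After escaping, the script element arrives in the browser as visible text and is never executed.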
(''Note that as Geddit allows anyone to do anything, we don't actually have any valuable assets to protect, so this exercise is somewhat theoretical. For the rest of this section, just imagine we required users to register and login to submit links or post comments.'')
So what we want to do in this section is to allow users to include HTML tags in their comments, but do so in a safe manner. We do not want to enable malicious users to include Javascript code, or CSS styles that turn the whole page black, or other things that may be considered harmful. In other words, we need to “sanitize” the markup in the comments.
But let's ignore that aspect for now, and start by making Genshi not escape HTML tags in comments. We'll start by editing `geddit/templates/_comment.html`:
{{{
#!genshi

${comment.username} at ${comment.time.strftime('%x %X')}

${HTML(comment.content)}

}}}
Here, we've added an import for the Genshi `HTML()` function. This is done using a [wiki:Documentation/templates.html#code-blocks Python code block] via the `<?python ?>` processing instruction. We've already seen that we can use complex Python expressions in templates. By using the `<?python ?>` processing instruction, we can embed any Python statements directly in the template, for example to define classes or functions. In this case we simply import a function that we need to use.
The `HTML()` function parses a snippet of HTML and returns a Genshi markup stream. It tries to do this in a way that invalid HTML is corrected (for example by fixing the nesting of tags). We then use that function to render the content of the comment. So what does this do, exactly? Well, the comment text is parsed using an HTML parser, fixed up if necessary (and possible), and injected into the template as a markup stream. A template expression that evaluates to a markup stream is treated differently than other data types: it is injected directly into the template output stream, effectively resulting in tags not getting escaped.
'''Note:''' Genshi also provides the `genshi.core.Markup` class, which is just a special string class that flags its content as safe for being included in HTML/XML output for Genshi. So instead of wrapping the comment text inside a call to the `HTML()` function, you could also use `Markup(comment.content)`. That would avoid the reparsing of the content, but at the cost of that content not being subject to stream filters and different serialization methods. In a nutshell, using `Markup` is not recommended unless you really know what you're doing.
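The difference between a plain string, a parsed markup stream, and a `Markup` instance can be tried out in isolation. The following is a small sketch, assuming a recent Genshi release on Python 3; the one-line template and the `render()` helper are made up for illustration and are not part of Geddit:
{{{
#!python
from genshi.core import Markup
from genshi.input import HTML
from genshi.template import MarkupTemplate

# Throwaway template with a single expression (not a Geddit template).
tmpl = MarkupTemplate('<p>${content}</p>')
text = '<em>hi</em>'


def render(value):
    # encoding=None asks Genshi for a text string instead of encoded bytes.
    return tmpl.generate(content=value).render('xhtml', encoding=None)


as_string = render(text)          # plain string: the tags are escaped
as_stream = render(HTML(text))    # markup stream: the tags pass through
as_markup = render(Markup(text))  # Markup: flagged as safe, not reparsed

print(as_string)
print(as_stream)
print(as_markup)
}}}
Only the first variant shows the tags as literal text; the other two emit a real `em` element, which is why both need sanitized input.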
So at this point our users can include HTML tags in their comments, and the comments will be rendered as HTML. But as noted above, that approach is very dangerous for most real-world applications, so we've got more work to do: we need to sanitize the markup in the comment so that only markup that can be considered safe is let through. Genshi provides a stream filter to help us here: [wiki:Documentation/filters.html#html-sanitizer HTMLSanitizer].
To add sanitization, first add the imports for the `HTML` function and the `HTMLSanitizer` filter to `geddit/controller.py`, so that the imports at the top of that file look something like this:
{{{
#!python
import cherrypy
from formencode import Invalid
from genshi.input import HTML
from genshi.filters import HTMLFormFiller, HTMLSanitizer
}}}
Then we'll update the `Root.comment()` method so that it sanitizes comments as they are submitted:
{{{
#!python
    @cherrypy.expose
    @template.output('comment.html')
    def comment(self, id, cancel=False, **data):
        link = self.data.get(id)
        if not link:
            raise cherrypy.NotFound()
        if cherrypy.request.method == 'POST':
            if cancel:
                raise cherrypy.HTTPRedirect('/info/%s' % link.id)
            form = CommentForm()
            try:
                data = form.to_python(data)
                markup = HTML(data['content']) | HTMLSanitizer()
                data['content'] = markup.render('xhtml')
                comment = link.add_comment(**data)
                if not ajax.is_xhr():
                    raise cherrypy.HTTPRedirect('/info/%s' % link.id)
                return template.render('_comment.html', comment=comment,
                                       num=len(link.comments))
            except Invalid, e:
                errors = e.unpack_errors()
        else:
            errors = {}

        if ajax.is_xhr():
            stream = template.render('_form.html', link=link, errors=errors)
        else:
            stream = template.render(link=link, comment=None, errors=errors)
        return stream | HTMLFormFiller(data=data)
}}}
We've just added two lines here, namely:
{{{
#!python
markup = HTML(data['content']) | HTMLSanitizer()
data['content'] = markup.render('xhtml')
}}}
This parses the comment text, runs it through the sanitizer, and serializes it to XHTML. And the result of the transformation is what we'll save to our “database”. We use XHTML here just because that can be processed by a wider variety of tools. For the purposes of this tutorial we could just as well be storing the content using HTML serialization, because Genshi can handle both.
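If you'd like to see what the sanitizer does to a given input before wiring it into the controller, the same two steps can be run on their own. This is a sketch assuming a recent Genshi release on Python 3; `sanitize_comment()` is a hypothetical helper name, not something the tutorial code defines:
{{{
#!python
from genshi.filters import HTMLSanitizer
from genshi.input import HTML


def sanitize_comment(text):
    """Parse, sanitize, and serialize a comment (hypothetical helper)."""
    stream = HTML(text) | HTMLSanitizer()
    # encoding=None asks for a text string rather than encoded bytes.
    return stream.render('xhtml', encoding=None)


# Made-up input mixing harmless formatting with an event handler and a script.
cleaned = sanitize_comment(
    '<b onclick="evil()">hi</b><script>alert(1)</script>'
)
print(cleaned)
}}}
The harmless `b` element should survive, while the `onclick` attribute and the `script` element are dropped.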
'''Note:''' this is just one way to add sanitization. Another equally valid approach would be to store comment submissions exactly how they were entered, and sanitize them when they are displayed. Or you could have two fields in the model: one to store the text as originally submitted, and the other to store the sanitized content ready for display. Or, if you were really paranoid, you'd sanitize both the input and the output. Which method you choose depends on the needs of your particular application.
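The two-field variant mentioned in the note could look roughly like the following. This is only a sketch written for Python 3: the `StoredComment` class and its attribute names are made up, and `html.escape` stands in for the sanitizer so that the example runs without Genshi. In Geddit you would pass a function that applies the `HTML()` and `HTMLSanitizer` steps instead:
{{{
#!python
from html import escape


class StoredComment:
    """Hypothetical model keeping both the submitted and the display text."""

    def __init__(self, username, raw, sanitize=escape):
        self.username = username
        self.raw = raw             # exactly as the user entered it
        self.safe = sanitize(raw)  # processed once, ready for display


comment = StoredComment('joe', '<b>Nice</b> link <script>evil()</script>')
print(comment.raw)
print(comment.safe)
}}}
Keeping the original text around means the content can be sanitized again later, for example after the sanitizer rules change, at the cost of storing every comment twice.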
You may want to try performing some XSS attacks by including malicious HTML markup in comments. Try some of the methods shown on the [http://ha.ckers.org/xss.html XSS Cheat Sheet]. You should not be able to get past the sanitizer; if you are, please [/newticket let us know].
We're almost done—the only remaining task is to update the Atom feed so that it, too, includes the user-submitted HTML tags as markup, instead of as escaped text. Open `geddit/templates/info.xml`, and update it to look as follows:
{{{
#!genshi
Geddit: ${link.title}
${time.isoformat()}
Comment ${len(link.comments) - idx} on “${link.title}”${url('/info/%s/' % link.id)}#comment${idx}${comment.username}${comment.time.isoformat()}

${HTML(comment.content)}

}}}
Just like above, we've added the import of the Genshi `HTML()` function. On the element that carries the comment text we've added the `type="xhtml"` attribute, and we've added a wrapper `<div>` inside that element to declare the XHTML namespace. Finally, inside that `<div>`, we inject the comment text as an HTML-parsed stream, analogous to what we've done in the HTML template.
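The structure that Atom expects for `type="xhtml"` content can also be built by hand to see how the pieces nest. The sketch below uses the Python 3 standard library's `ElementTree` rather than a Genshi template; using Atom's `content` element and the sample text are assumptions made for illustration:
{{{
#!python
import xml.etree.ElementTree as ET

ATOM_NS = 'http://www.w3.org/2005/Atom'
XHTML_NS = 'http://www.w3.org/1999/xhtml'

# The Atom element announces that its payload is XHTML ...
content = ET.Element('{%s}content' % ATOM_NS, {'type': 'xhtml'})
# ... and a single wrapper div in the XHTML namespace holds the markup.
div = ET.SubElement(content, '{%s}div' % XHTML_NS)
em = ET.SubElement(div, '{%s}em' % XHTML_NS)
em.text = 'Nice link!'

print(ET.tostring(content, encoding='unicode'))
}}}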
== Summary ==
This brings the tutorial to a close. We've demonstrated how you would generally use Genshi in a small Python web application. We've shown some best practices and recipes for making effective use of the features Genshi provides.
You can check out the complete code for this tutorial here:
http://svn.edgewall.org/repos/genshi/trunk/examples/tutorial
If you like the application we've built here and would like to experiment with further enhancements, feel free to do so. Here are a couple of ideas:
* [wiki:GenshiTutorial/Authentication Add authentication], preferably based on [http://openid.net/ OpenID] ([http://openidenabled.com/python-openid/ Python libraries for OpenID] are available.)
* [wiki:GenshiTutorial/Internationalization Internationalize the application], using Genshi's built-in [wiki:Documentation/i18n.html I18n support] and [http://babel.edgewall.org/ Babel]. '''DONE'''
* [wiki:GenshiTutorial/CsrfProtection Add protection against cross-site request forgery (CSRF) attacks], using the [wiki:Documentation/filters.html#transformer Transformer] filter to inject form tokens in HTML forms.
* [wiki:GenshiTutorial/Voting Add voting on links]. See the [http://developer.yahoo.com/ypatterns/pattern.php?pattern=votetopromote Vote to Promote] pattern in Yahoo's “Design Pattern Library” for some inspiration.
* [wiki:GenshiTutorial/CommentThreading Add comment threading], so that people can reply to comments, and comments and replies are displayed in a hierarchical manner.
* [wiki:GenshiTutorial/AtomPublishing Add support for the Atom Publishing Protocol]. See http://bitworking.org/projects/atom/
* (your idea here)
Thanks for reading, we hope the tutorial has been useful!