Templating in Python

By Mike Orr (Sluggo)

There are several templating systems for Python, but here we'll look at PTL and Cheetah. Actually, I lied; we'll focus on some little-known templating features of Quixote that aren't PTL per se but are related to it. These can be used to get around some annoyances in PTL. We'll also compare Cheetah against PTL/Quixote to see whether one of the two is more convenient overall, or which niches each system works best in. Both systems can be used standalone in Web or non-Web applications. You can download Quixote at http://www.mems-exchange.org/software/quixote/, and Cheetah at http://cheetahtemplate.org/. Install them via the usual "python setup.py install" mantra.

PTL

The Quixote documentation has a thorough description of PTL, so we'll just give a brief overview here. A PTL template looks like a Python function, but bare expressions are concatenated and used as the implicit return value. Here's an example:

def add [plain] (a, b):
    answer = a + b 
    'a plus b equals '
    answer

Calling add(2, 3) returns "a plus b equals 5". Doing this in ordinary Python returns None; the two bare expressions are thrown away. To build an equivalent to this template, you'd have to use StringIO or build a list of values and join them. And you'd have to convert non-string values to strings. So PTL is a much cleaner syntax for functions that "concatenate" a return value.
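For comparison, here is a minimal sketch of the plain-Python equivalent just described, using the list-and-join approach. The name add_plain is only for illustration.

```python
def add_plain(a, b):
    """Plain-Python version of the PTL 'add' template."""
    parts = []                          # collect the output pieces by hand
    answer = a + b
    parts.append('a plus b equals ')
    parts.append(str(answer))           # non-strings must be converted explicitly
    return ''.join(parts)
```

Every bare expression in the PTL version becomes an explicit append() here, which is the bookkeeping PTL saves you.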

The [plain] is not valid Python syntax, so you have to put this function in a *.ptl module and teach Python how to import it. Assume your module is called myptl.ptl.

$ python
Python 2.3.4 (#1, Nov 30 2004, 10:15:28)
[GCC 3.3.4 20040623 (Gentoo Linux 3.3.4-r1, ssp-3.3.2-2, pie-8.7.6)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from quixote.ptl import install    # Install PTL import hook
>>> import myptl
>>> print myptl.add(2, 3)
a plus b equals 5
>>> myptl.add(2, 3)
'a plus b equals 5'

One of PTL's features is automatic HTML quoting. Suppose you had this:

def greeting [html] (what):
    "<strong>Hello, %s!</strong>\n" % what   

A nice user types 'world' into a form and your function returns:

>>> print myptl.greeting("world")
<strong>Hello, world!</strong>

But say a malicious user types '<script type="text/javascript">BAD_STUFF</script>' instead:

>>> print myptl.greeting('<script type="text/javascript">BAD_STUFF</script>')
<strong>Hello, &lt;script type=&quot;text/javascript&quot;&gt;BAD_STUFF&lt;/script&gt;!</strong>

PTL escapes it automatically in case you forgot to. How does it know which values to escape? It escapes everything that's in a bare expression and not defined literally in the function: arguments, subroutine return values, and global variables. To protect a string from further escaping, wrap it in an htmltext instance:

>>> from quixote.html import htmltext
>>> text = htmltext("<em>world</em>")
>>> print myptl.greeting(text)
<strong>Hello, <em>world</em>!</strong>

In fact, the return value is itself an htmltext instance:

>>> myptl.greeting(text)
<htmltext '<strong>Hello, <em>world</em>!</strong>'>

htmltext is mostly string compatible, but some Python library functions require actual strings:

>>> "The universe is a big place.".replace("universe", text)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: expected a character buffer object

This is one of the annoyances of PTL. The other is overquoting. Sometimes you have to use str() and htmltext() to get around these. Sometimes this is a pain in the butt. It causes parenthesesitis, long lines, obfuscated code, makes generic modules dependent on Quixote, etc. At least htmltext dictionary keys match their equivalent string keys. But if you intend to use the dict as **keyword_args, you'd better str() the keys.
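The keyword-argument caveat can be handled with a small helper. This is only a sketch: str_keys is a hypothetical name, and the Label class merely stands in for a string-like key type such as htmltext so the example runs without Quixote.

```python
class Label(object):
    """Stand-in for a string-like key object (NOT Quixote's htmltext)."""
    def __init__(self, text):
        self.text = text
    def __str__(self):
        return self.text

def str_keys(mapping):
    """Return a copy of `mapping` whose keys are plain strings."""
    return dict((str(key), value) for key, value in mapping.items())

def describe(**kw):
    """Any function that takes keyword arguments."""
    return sorted(kw.items())

attrs = {Label('width'): 200, Label('height'): 160}
result = describe(**str_keys(attrs))    # plain-str keys are safe to pass as **kw
```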

PTL's third annoyance is the import hook. It's "magic", it may break sometime, it doesn't play well with other import hooks, and it has a failed-import bug. (The latter two are probably Python's fault rather than PTL's.) The failed-import bug is that if you import a module that doesn't exist, the variable is set to None rather than raising an ImportError. This causes a cascading error later when you try to access an attribute of it, similar to a null pointer dereference in other languages. You just have to remember that if a variable is unexpectedly None, it may mean a failed import. (This bug happens only in some circumstances, but I haven't figured out which.)
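One defensive measure, sketched here on the assumption that the failure shows up as a None module object, is to check the result of each important import right away. require_module is a hypothetical helper, not part of Quixote.

```python
import sys

def require_module(name):
    """Import `name` and fail loudly if the module object turns out to be None."""
    __import__(name)                    # raises ImportError on an ordinary failure
    module = sys.modules.get(name)
    if module is None:                  # the silent-failure case described above
        raise ImportError("import of %r silently produced None" % name)
    return module

os_module = require_module("os")
```

This moves the error to the import site instead of the first attribute access much later.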

When using PTL with ZODB, the Quixote docs warn to import ZODB before PTL. ZODB has its own import hook, and they must be installed in this order or you'll get errors. I discovered the same thing happens with Python's fcntl module on the Macintosh. fcntl doesn't have an import hook, but PTL's hook has an unexpected interaction that causes fcntl to fail. On Mac OS X 10.3 (Python 2.3.0), fcntl.so is in a separate directory along with other "C" extensions. After installing PTL, import fcntl finds the deprecated FCNTL.py due to the Mac's case-insensitive filesystem. This is a dummy module that has constants but no functions. So you try to do file locking and blammo! AttributeError. To get around this you have to import fcntl before PTL, or put the extension directory at the start of the Python path before importing fcntl. If you're doing this at the start of your application because a third-party module later uses fcntl, it can be confusing to future application maintainers. (Python 2.4 supposedly doesn't have this problem because FCNTL.py doesn't exist.)

When the import hook works, it works great. But you may be leery of it due to known or unknown problems. One way to lean on it less: PTL creates a *.pyc file, so once the module has been imported you don't need the hook again unless the source changes. But *.pyc files aren't compatible between Python versions, and you may forget to import-with-hook after making changes. So what other alternatives are there?

TemplateIO

PTL is built from components that can also be used standalone in ordinary Python functions. This is not covered in the Quixote documentation but can be deduced from the source. Our first example above translates to:

from quixote.html import TemplateIO
def add(a, b):
    tio = TemplateIO()
    answer = a + b 
    tio += 'a plus b equals '
    tio += answer
    return tio.getvalue()

>>> import mymodule
>>> mymodule.add(2, 3)
'a plus b equals 5'

As you can see, it's similar to StringIO but with a cleaner interface. It also automatically converts the right side to a string. There's a flag to do HTML escaping:

from quixote.html import TemplateIO, htmltext
def greeting(what):
    tio = TemplateIO(html=True)
    tio += "&"
    tio += htmltext("<strong>Hello, %s!</strong>") % what
    return tio.getvalue()

>>> reload(mymodule)
>>> mymodule.greeting("<javascript>")
<htmltext '&amp;<strong>Hello, &lt;javascript&gt;!</strong>'>

Here we have to explicitly htmltext() everything we don't want escaped. Is this better or worse than PTL? Is the TemplateIO syntax better or worse than PTL? That's for you to decide. I prefer PTL for some modules and TemplateIO for others. TemplateIO is also better for generic modules that shouldn't depend on the import hook. The TemplateIO class resides in quixote/html/_py_htmltext.py. (There's also a faster "C" version, _c_htmltext.c.) You can copy the module to your own project (check the license first), or write a simple non-escaping TemplateIO in a few lines of code.
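The "few lines of code" for a non-escaping TemplateIO might look like the sketch below. SimpleTemplateIO is a hypothetical name, and skipping None is a design choice of this sketch rather than a statement about Quixote's class.

```python
class SimpleTemplateIO(object):
    """A non-escaping accumulator in the spirit of TemplateIO (illustration only)."""

    def __init__(self):
        self.data = []

    def __iadd__(self, other):
        # Design choice of this sketch: silently skip None, stringify the rest.
        if other is not None:
            self.data.append(str(other))
        return self                     # '+=' rebinds the name to this return value

    def getvalue(self):
        return ''.join(self.data)

def add(a, b):
    tio = SimpleTemplateIO()
    tio += 'a plus b equals '
    tio += a + b
    return tio.getvalue()
```

The only trick is that __iadd__ must return self, or the variable on the left of += would be rebound to None.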

htmltext

_py_htmltext.py also contains other classes and functions used by PTL and TemplateIO: htmltext, htmlescape, and stringify. stringify is a function that converts anything to string or unicode, a kind of enhanced str(). htmlescape calls stringify, escapes the result, and returns an htmltext object. But if the argument is already htmltext, htmlescape doesn't escape it. So when we said htmltext protects a string from being escaped, we really meant htmlescape treats htmltext specially.
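A simplified model of that relationship, runnable without Quixote, is sketched below. SafeText and escape are hypothetical stand-ins for htmltext and htmlescape; the real classes do considerably more.

```python
class SafeText(object):
    """Marks a string as already HTML-safe (a stand-in for htmltext)."""
    def __init__(self, text):
        self.text = text
    def __str__(self):
        return self.text

def escape(value):
    """Escape `value` unless it is already SafeText; always return SafeText."""
    if isinstance(value, SafeText):
        return value                    # already safe: do not escape again
    text = str(value)                   # plays the role of stringify
    text = text.replace("&", "&amp;")   # '&' must be replaced first
    text = text.replace("<", "&lt;").replace(">", "&gt;")
    text = text.replace('"', "&quot;")
    return SafeText(text)

once = escape("<em>R&D</em>")
twice = escape(once)                    # a second pass changes nothing
```

Because the escaper returns the marker type, running a value through it twice is harmless, which is what makes "escape everything" a safe default.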

When you use one of htmltext's "string methods", it calls htmlescape on its arguments. (Actually it inlines the code, but close enough.) So where we used the % operator in greeting() above, it escaped the right side. This is a common idiom in programs that use htmltext: put the htmltext wrapper on the left side of the operator, and let it escape the arguments on the right side:

result = htmltext("<em>format string %s %s</em>") % (arg1, arg2)

def em(content):
    return htmltext("<em>%s</em>") % content

Don't do this unless you really mean it:

result = htmltext("<em>%s</em>" % arg)    # BAD!!! 'arg' won't be escaped.
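The difference between the two forms can be seen in a small model that runs without Quixote. SafeText and quote are hypothetical stand-ins; only str() and the % operator are modeled, and only %s placeholders will work because every argument is turned into a string.

```python
def quote(value):
    """Escape the HTML special characters in str(value)."""
    text = str(value).replace("&", "&amp;")
    return text.replace("<", "&lt;").replace(">", "&gt;").replace('"', "&quot;")

class SafeText(object):
    """Simplified stand-in for htmltext: only str() and '%' are modeled."""
    def __init__(self, text):
        self.text = text
    def __str__(self):
        return self.text
    def __mod__(self, args):
        if not isinstance(args, tuple):
            args = (args,)
        # Escape every right-hand value unless it is already SafeText.
        safe = tuple(str(a) if isinstance(a, SafeText) else quote(a) for a in args)
        return SafeText(self.text % safe)

arg = "<script>"
good = SafeText("<em>%s</em>") % arg    # wrapper on the left: arg is escaped
bad = SafeText("<em>%s</em>" % arg)     # formatted first: arg slips through raw
```

In the bad form the ordinary string % runs before the wrapper ever sees the value, so nothing gets a chance to escape it.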

It's usually most convenient to put the htmltext() call as close to the variable definition or import/input location as possible. That way you don't have to worry about whether it's been wrapped or not. This can be a problem for generic modules that would suddenly depend on Quixote, but again you can copy _py_htmltext.py into your project to eliminate that dependency.

html

quixote.html contains a few convenience functions that build htmltext objects. The source is in quixote/html/__init__.py.

htmltag(tag_name, add_xml_empty_slash=False, css_class=None, **attrs) 
href(url, text, title=None, **attrs)
url_with_query(path, **attrs)

Here are some examples:

>>> from quixote.html import htmltag, href, url_with_query
>>> htmltag('table')
<htmltext '<table>'>
>>> print htmltag('table')
<table>
>>> print htmltag('/table')
</table>
>>> print htmltag('table', False, 'foo')
<table class="foo">
>>> print htmltag('br', True)
<br />
>>> print htmltag('div', False, 'chapter', style="border-style:raised", foo="bar")
<div class="chapter" style="border-style:raised" foo="bar">
>>> print htmltag('img', src="foo.jpg", width="200", height="160")
<img src="foo.jpg" height="160" width="200">
>>> print href("foo.html", "Foo!", name="foo")
<a href="foo.html" name="foo">Foo!</a>
>>> url = url_with_query("delete_user", fname="ben", lname="okopnik")
>>> print url
delete_user?fname=ben&amp;lname=okopnik
>>> print href(url, "Page 2")
<a href="delete_user?fname=ben&amp;lname=okopnik">Page 2</a>
>>> input_dict = {'page': 2, 'printable': 'y'}
>>> print url_with_query("display", **input_dict)
display?page=2&amp;printable=y
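To show where the &amp; in those URLs comes from, here is a dependency-free sketch of the same idea using only the standard library. query_url is a hypothetical name, not Quixote's url_with_query, and it returns a plain string rather than htmltext.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode        # Python 2

def query_url(path, **params):
    """Build 'path?key=value&...' and escape it for use inside HTML."""
    query = urlencode(sorted(params.items()))   # sorted for a stable order
    url = path + "?" + query
    return url.replace("&", "&amp;")    # HTML expects '&' written as '&amp;'

link = query_url("delete_user", fname="ben", lname="okopnik")
```

Since the result here is an ordinary string, it would be escaped a second time if passed through an HTML-escaping template; returning htmltext, as the Quixote functions do, avoids that.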

Cheetah

But what if you really want your template to be a large string with placeholders that "looks like" the final output? PTL is fine for templates with lots of calculations and small amounts of literal text, but it's less convenient with large chunks of text. You either have large multiline strings in the function, making the expressions hard to find, or you use global variables for the literal text. Sometimes you'd just rather use a traditional-looking template like this:

<html><head><title>$title</title></head><body>
$content
</body></html>

Cheetah does this. It has a users' guide (which I mostly wrote), so we'll just complete the example without explaining it in detail:

from Cheetah.Template import Template
t = Template(file="mytemplate.tmpl")
t.title = "Greetings"
t.content = "<em>Hello, world!</em>"
print str(t)
<html><head><title>Greetings</title></head><body>
<em>Hello, world!</em>
</body></html>
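If Cheetah isn't installed, the basic $placeholder idea can still be tried with the standard library's string.Template (Python 2.4 and later). This is only a dependency-free sketch of the substitution step, with the template text inlined instead of read from a file; it has none of Cheetah's directives or filters, and its Template class is unrelated to Cheetah's despite the name.

```python
from string import Template   # standard library, not Cheetah

page = Template(
    "<html><head><title>$title</title></head><body>\n"
    "$content\n"
    "</body></html>")

# substitute() raises KeyError if a placeholder has no value.
output = page.substitute(title="Greetings",
                         content="<em>Hello, world!</em>")
```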

Cheetah has many features we won't discuss here, but one feature it doesn't have is smart escaping. You can set a built-in filter that escapes all values, and you can turn the filter on and off at different points in the template, but you can't escape certain values while protecting htmltext values.

Well, actually you can if you write your own filter.

from Cheetah.Filters import Filter
from quixote.html import htmlescape

class HtmltextFilter(Filter):
    """Safer than WebSafe: escapes values that aren't htmltext instances."""
    def filter(self, val, **kw):
        return htmlescape(val)

Instantiate the template thus:

t = Template(file="mytemplate.tmpl", filter=HtmltextFilter)

Or put this in the template:

#from my_filter_module import HtmltextFilter
#filter $HtmltextFilter

Sometimes you want to put an HTML table in a Cheetah template, but you don't want to type all the tags by hand. I've written a table module that builds a table intuitively, using TemplateIO and htmltext. Here's the source. The module docstring has the complete usage, but here are a few examples:

import table

# A simple two-column table with headers on the left, no gridlines.
data = [
    ('First Name', 'Fred'),
    ('Last Name', 'Flintstone')]
print table.ReportTable.build(data)

# A table with headers at the top.
headers = ['Name', 'Noise Level']
data = [
    ('Pebbles', 'quiet'),
    ('Bam-Bam', 'loud')]
print table.Table.build(data, headers)

# A table with custom tags.
from quixote.html import htmltag
data = [
    ('Fred', 'Flintstone', '555-1212')]
td = htmltag('td')
td_phone = htmltag('td', css_class='phone')
tds = [td, td, td_phone]
t = table.Table()  # assumed: an instance, reached through the imported module
t.table = htmltag('table', css_class='my_table')
for row in data:
    t.row(row, tds)  # Match each cell with its corresponding <td> tag.
print t.finish()

The output is an htmltext object, which you can set as a placeholder value for Cheetah.

quixote.form lets you build forms in a similar way, and the same object does form display, validation, getting values, and redisplay after errors. I highly recommend it. Like everything else here, it can be used standalone without the Quixote publisher.

Other template packages

PTL and Cheetah use a non-tag syntax for replaceable values, so they work just as well for non-HTML output as HTML. Zope Page Templates (ZPT/TAL) and Nevow's template system, among others, use XML-style tags for placeholders. This limits their usability for non-HTML output. I prefer to use one template system for all my output rather than one for HTML and another for non-HTML, and I hate XML tags. Those who love XML tags may prefer ZPT or Nevow. Nevow has an interesting way of building replacement values via callback functions, which literally "put" the value into the template object. (I wrote about Nevow in a previous PyCon article.) More ZPT/TAL information is here. These all can be used without their library's publishing loop.

I hope this article gave you some ideas on the many ways you can structure a template in Python.

 


Mike is a Contributing Editor at Linux Gazette. He has been a Linux enthusiast since 1991, a Debian user since 1995, and now Gentoo. His favorite tool for programming is Python. Non-computer interests include martial arts, wrestling, ska and oi! and ambient music, and the international language Esperanto. He's been known to listen to Dvorak, Schubert, Mendelssohn, and Khachaturian too.

Copyright © 2005, Mike Orr (Sluggo). Released under the Open Publication license unless otherwise noted in the body of the article. Linux Gazette is not produced, sponsored, or endorsed by its prior host, SSC, Inc.

Published in Issue 117 of Linux Gazette, August 2005
