Note: the Tartan project is not under active development.
We'll leave the code here on RubyForge, so feel free to
do with it as you will. If you are interested in taking
over the project, please contact one of the developers.

Welcome to Tartan

Tartan is a general-purpose text parsing engine whose main target is wiki
text parsing (see c2.com and Wikipedia). It doesn’t implement one specific
mark-up but instead provides a way to specify a variety of mark-ups. So,
Tartan is a bit more "involved" than a purpose-built parser like RedCloth
or BlueCloth, but it provides the following benefits:

separates the specific wiki syntax specification from the implementation

allows layering and extension of parsing rules

allows multiple output formats from the same syntax specification

The current implementation of Tartan is in Ruby and includes a full Markdown parser
(described in YAML). The format of the parsing specification has been
created with an eye to having a language-independent definition of wiki
(and possibly other) mark-ups. That’s a lofty goal, and Tartan
hasn’t quite gotten there yet, but we think there’s a clear
path. In any case, even if it is only available in Ruby, it will hopefully
be helpful for projects that need to do more than just convert wiki
text directly into HTML.

Usage

So, really all you want to do is generate HTML from Markdown text.
Here’s how you do it:

Other parsers would have similar names and the same usage. In
particular, you will need to require the parser class file, create a
new instance of the parser, and call to_html on that instance.

You can also have other output methods, say to_xml, which would be
called in the same way on the instance of the parser object.
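
The calling pattern described above can be sketched in plain Ruby. This is only an illustration: SketchParser and its OUTPUT_MARKS table are hypothetical stand-ins, not Tartan classes, and the paragraph splitting is deliberately naive. It shows one instance answering both to_html and to_xml from the same description.

```ruby
# Hypothetical stand-in for a generated parser class; not Tartan's real API.
class SketchParser
  # One paragraph description with two assumed output formats.
  OUTPUT_MARKS = {
    html: { start_mark: "<p>",    end_mark: "</p>" },
    xml:  { start_mark: "<para>", end_mark: "</para>" }
  }.freeze

  def initialize(text)
    @text = text
  end

  def to_html
    render(:html)
  end

  def to_xml
    render(:xml)
  end

  private

  # Wrap each chunk separated by blank lines in the marks for the format.
  def render(format)
    marks = OUTPUT_MARKS.fetch(format)
    chunks = @text.strip.split(/\n{2,}/)
    chunks.map { |c| "#{marks[:start_mark]}#{c}#{marks[:end_mark]}" }.join("\n")
  end
end

parser = SketchParser.new("First paragraph.\n\nSecond paragraph.")
sketch_html = parser.to_html
sketch_xml = parser.to_xml
```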

Layering Parsers

You can add parsing syntax to existing parsers. This is done by building up
a set of parser specifications that work together.

In the Tartan distribution you have a specification for Markdown and you
also have a specification for table mark-up. You can combine them by
creating a new class that layers the tables onto the Markdown definition
as follows in a file called tartan_markdown_tables.rb:

The parsing rules are defined as a set of contexts, and each context is
a list of parsing rules. The base context defaults to block; that
is, the parser starts with the block context, which may point the
parser off to other contexts to parse blocks of the parsed text. More on
this after the explanation of the parsing rules.

Parsing Rules

The following is a simple parsing rule to match paragraphs and mark them up
in HTML:

The parser will repeatedly apply the match regular expression,
and if it matches, the html output sub-rule will put the
start_mark, <p>, and the end_mark,
</p>, around the text that is matched as a paragraph.
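
As a rough illustration of that behaviour, here is a minimal Ruby sketch. The hash layout, the regular expression, and the apply_rules helper are assumptions made for this example; only the names match, html, start_mark and end_mark are taken from the text, and Tartan's own engine may work differently.

```ruby
# Hypothetical rule layout; not Tartan's actual specification format.
PARAGRAPH_RULE = {
  title: "paragraph",
  # One or more consecutive non-empty lines, anchored at the current position.
  match: /\A(?:[^\n]+(?:\n|\z))+/,
  html: { start_mark: "<p>", end_mark: "</p>" }
}.freeze

# Repeatedly try the rules against the front of the remaining text.
def apply_rules(text, rules)
  output = []
  rest = text
  loop do
    rest = rest.sub(/\A\n+/, "") # skip blank lines between blocks
    break if rest.empty?

    m = nil
    rule = rules.find { |r| m = r[:match].match(rest) }
    if rule
      marks = rule[:html]
      output << "#{marks[:start_mark]}#{m[0].chomp}#{marks[:end_mark]}"
      rest = m.post_match
    else
      # No rule matched: pass one line through unchanged.
      line, _, rest = rest.partition("\n")
      output << line
    end
  end
  output.join("\n")
end

paragraph_html = apply_rules("Hello\nworld\n\nSecond one.\n", [PARAGRAPH_RULE])
```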

If we wanted to also mark off blocks of code that are indented by say 2 or
more spaces at the beginning of the line, we could use the following rule:

When we want to add the code rule, the ordering becomes important.
If we put the paragraph rule first, it will gobble up both the
paragraphs and the code blocks since it’s just looking for groups of
non blank lines. To prevent this we need to put the code rule
first. So the overall definition would be:
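
The ordering effect can be checked with a small Ruby sketch. The two regular expressions and the first_match helper are assumptions for illustration, not Tartan's definitions; the point is only that the first matching rule in the list wins.

```ruby
# Hypothetical rules: indented lines are code, any non-empty lines are a paragraph.
CODE_RULE = {
  title: "code",
  match: /\A(?: {2,}[^\n]*(?:\n|\z))+/
}.freeze

PARAGRAPH_RULE = {
  title: "paragraph",
  match: /\A(?:[^\n]+(?:\n|\z))+/
}.freeze

# Return the title of the first rule in the list that matches the text.
def first_match(text, rules)
  rules.find { |r| r[:match].match?(text) }&.fetch(:title)
end

indented = "  x = 1\n  y = 2\n"

paragraph_first = first_match(indented, [PARAGRAPH_RULE, CODE_RULE])
code_first      = first_match(indented, [CODE_RULE, PARAGRAPH_RULE])
plain_text      = first_match("Just some text.\n", [CODE_RULE, PARAGRAPH_RULE])
```

With the paragraph rule listed first, the indented block is claimed as a paragraph; with the code rule first, it is recognized as code while ordinary text still falls through to the paragraph rule.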

Now, let’s say we want to be able to mark up text with emphasis (HTML
<em>) and strong emphasis (HTML <strong>) in
paragraph text, but not in code blocks. We’ll use an asterisk (*) around text
we want to have emphasis and a double asterisk (**) around text we want to
have strong emphasis.

To do this, we set up a new parsing context for paragraph body text and
"point" the parser to the context when it recognizes a paragraph.

The rescan directive between the strong and
emphasis rules tells the parser to "start over". This is
needed because otherwise the strong rule would "claim"
all the text it matched and the emphasis rule wouldn’t have
a chance to parse any of it. This would come into play if we had a
paragraph such as:

To point the parser at the new context, we use the subparse directive,
which tells the parser that the contents of the paragraph should be parsed
by the paragraph context.
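
The interplay of contexts, subparse and rescan can be sketched in Ruby under a simplified model. Everything here is an assumption for illustration: the spec layout, the regular expressions, the <pre> marks for code, and the idea of tracking "claimed" text as segments. Only the directive names come from the text, and Tartan's real mechanism may differ.

```ruby
# Hypothetical spec layout; not Tartan's actual specification format.
CONTEXTS = {
  block: [
    { title: "code", match: /\A(?: {2,}[^\n]*(?:\n|\z))+/,
      html: { start_mark: "<pre>", end_mark: "</pre>" } },
    { title: "paragraph", match: /\A(?:[^\n]+(?:\n|\z))+/,
      html: { start_mark: "<p>", end_mark: "</p>" },
      subparse: :paragraph }
  ],
  paragraph: [
    { title: "strong", match: /\*\*(.+?)\*\*/,
      html: { start_mark: "<strong>", end_mark: "</strong>" } },
    :rescan,
    { title: "emphasis", match: /\*(.+?)\*/,
      html: { start_mark: "<em>", end_mark: "</em>" } }
  ]
}.freeze

def wrap(rule, body)
  "#{rule[:html][:start_mark]}#{body}#{rule[:html][:end_mark]}"
end

# Segments are [text, claimed] pairs. A rule only looks at unclaimed text
# and claims whatever it matches.
def apply_inline(segments, rule)
  segments.flat_map do |text, claimed|
    next [[text, claimed]] if claimed

    pieces = []
    while (m = rule[:match].match(text))
      pieces << [m.pre_match, false] unless m.pre_match.empty?
      pieces << [wrap(rule, m[1]), true]
      text = m.post_match
    end
    pieces << [text, false] unless text.empty?
    pieces
  end
end

# Run the steps of an inline context; :rescan releases all claims.
def parse_inline(text, steps)
  segments = [[text, false]]
  steps.each do |step|
    segments =
      if step == :rescan
        [[segments.map(&:first).join, false]]
      else
        apply_inline(segments, step)
      end
  end
  segments.map(&:first).join
end

# Block pass: the first matching rule wins, and subparse hands the
# matched body to another context.
def parse_blocks(text, contexts = CONTEXTS)
  output = []
  rest = text
  loop do
    rest = rest.sub(/\A\n+/, "")
    break if rest.empty?

    m = nil
    rule = contexts[:block].find { |r| m = r[:match].match(rest) }
    break unless rule

    body = m[0].chomp
    body = parse_inline(body, contexts[rule[:subparse]]) if rule[:subparse]
    output << wrap(rule, body)
    rest = m.post_match
  end
  output.join("\n")
end

source = "**very *important* point** and *more*\n\n  code with *stars*\n"
layered_html = parse_blocks(source)

# The same paragraph context without the rescan step, for comparison.
no_rescan = CONTEXTS[:paragraph].reject { |step| step == :rescan }
html_without_rescan = parse_inline("**very *important* point**", no_rescan)
```

In this model the code block keeps its asterisks because the code rule has no subparse entry, and dropping the rescan step leaves the inner *important* untouched because the strong rule has already claimed it.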

Creating a Mix-in

It’s possible to mix-in or layer a parsing specification with a base
parser. This allows you to add additional markup or change the markup of an
existing syntax. You could use this to add table mark-up to Markdown (in fact,
this mix-in to Markdown is available as part of the Tartan code
distribution).

To show how this works, we’ll look at how to specify and then add
character element markup to the parser example we’ve been working
with. We want to turn things like "<", "&" and
"->" into "&lt;", "&amp;" and
"&rarr;".

We want these transformations to be done in the context of parsing
paragraphs, so we’ll only want to add to the paragraph
context in our previous example.

So, to add this syntax parsing, you would create the following
specification:

That’s it for the mix-in specification. Now we add these to the
previous set. We didn’t touch on file naming of specifications
before, but now we need to. Let’s say that we put the previous
specification in a file called example-parser.yml and we put the
new spec in entities.yml. To combine them, we would create a new
Ruby class like this:
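
The layering idea itself can be sketched in Ruby without the YAML files. This is a hedged illustration, not Tartan's loader: BASE_SPEC, ENTITIES_SPEC, the replace key, and the layer and render helpers are all hypothetical, and the specifications are held in memory as hashes rather than read from example-parser.yml and entities.yml. Placing the mix-in rules ahead of the base rules is a choice of this sketch, so that entity escaping happens before any tags are inserted.

```ruby
# Hypothetical in-memory stand-in for the base specification.
BASE_SPEC = {
  paragraph: [
    { title: "emphasis", match: /\*(.+?)\*/, replace: '<em>\1</em>' }
  ]
}.freeze

# Hypothetical stand-in for the entities mix-in. The ampersand rule
# comes first so that later entities are not escaped twice.
ENTITIES_SPEC = {
  paragraph: [
    { title: "ampersand",   match: /&/,  replace: "&amp;" },
    { title: "less_than",   match: /</,  replace: "&lt;" },
    { title: "right_arrow", match: /->/, replace: "&rarr;" }
  ]
}.freeze

# Combine two specs context by context, mix-in rules first.
def layer(base, mixin)
  base.merge(mixin) { |_context, base_rules, mixin_rules| mixin_rules + base_rules }
end

# Apply every rule of one context in order.
def render(spec, context, text)
  spec.fetch(context).reduce(text) { |acc, rule| acc.gsub(rule[:match], rule[:replace]) }
end

COMBINED_SPEC = layer(BASE_SPEC, ENTITIES_SPEC)

combined_html = render(COMBINED_SPEC, :paragraph, "a -> b & *c* < d")
base_html     = render(BASE_SPEC, :paragraph, "a & *c*")
```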

Going Further

Honestly, this brief tutorial just provides you with the basics of Tartan.
If you want to know more, the best thing for now is to look at the Markdown
and table extension specifications in the code. They show a real-world
example of how to create a base parser and a mix-in.

Additional documentation will follow, in particular a reference
guide that covers the parser rule directives one at a time.

If you need some help in getting Tartan to work for your project, please
don’t hesitate to post to the Tartan help forum or
write me directly at bitherder@rubyforge.org.

The Name

Tartan is intended to weave together different parsing elements. It’s
intended to be an alternative to both RedCloth and BlueCloth. Tartan is
a kind of cloth that weaves different colors together in an interesting
pattern.