txt2html README
===============

Intro
-----

txt2html is a plain text to HTML converter written as a Korn Shell 93
script function. It converts unobtrusive text markup for lists, bold,
italics, tables and headings into the corresponding HTML markup, so the
source text files stay readable.

No installation is necessary; txt2html can be used from the command line
or as a function called from another shell script. The program "txt2html"
is itself a Korn Shell 93 script.

    txt2html someFile.txt

txt2html accepts several command-line arguments; list them with

    txt2html -?

This README file contains all the markup that txt2html understands, so it
also serves as a demo. To see how this file looks after being converted
by txt2html, run

    txt2html < txt2html.README > README.html

and look at the result.
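
The same redirection form can be wrapped in a small helper to convert a
whole directory. This is only a sketch: the function name convert_all is
made up for illustration, and it assumes that txt2html is on your PATH and
reads text on standard input and writes HTML to standard output, as in the
command above.

```shell
# convert_all DIR
# Hypothetical helper (not part of txt2html): converts every DIR/*.txt
# file to a DIR/*.html file next to it.
convert_all() {
    for src in "$1"/*.txt; do
        # When nothing matches, the glob stays literal; skip it.
        [ -e "$src" ] || continue
        # Same stdin/stdout usage as "txt2html < file > file.html".
        txt2html < "$src" > "${src%.txt}.html" || return 1
    done
}
```

Calling convert_all ./docs would then leave one .html file beside each
.txt file in ./docs.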

Text paragraphs
---------------

A text paragraph is any group of lines containing text delimited by one or
more blank lines, provided that none of them begins with a blank space.
So you just write lines as usual (wrapped or not) and separate
paragraphs as in a word processor.

Headings
--------

A line is understood as a heading if it's immediately followed by another
one that contains only a repetition of a special character (see 'Text
paragraphs' and 'Headings' for an example). There are three heading
levels depending on this special character: if it's a line of = (equal
sign), it's a first level heading, used for titles and tagged with h1 HTML
tags. If it's - (hyphen), it's a second level heading, and if it's ~
(tilde), a third level one. This document shows the three heading levels.
It's suggested that the first level heading be used only once, as it's
automatically taken as the title for the HTML page unless a title is
given as a command line argument.
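
The underline rule can be sketched as a small shell function. This is an
illustration of the rule as described here, not the code txt2html itself
uses; the name heading_level is hypothetical.

```shell
# heading_level LINE
# Hypothetical helper: prints 1, 2 or 3 when LINE consists only of
# '=', '-' or '~' characters respectively. Prints nothing and returns 1
# for any other line, including an empty one.
heading_level() {
    [ -n "$1" ] || return 1
    case $1 in
        *[!=]*) ;;                  # something other than '=' present
        *) echo 1; return 0 ;;
    esac
    case $1 in
        *[!-]*) ;;                  # something other than '-' present
        *) echo 2; return 0 ;;
    esac
    case $1 in
        *[!~]*) ;;                  # something other than '~' present
        *) echo 3; return 0 ;;
    esac
    return 1
}
```

A converter would call this on the line that follows a candidate heading
and, on success, wrap the previous line in the matching h1, h2 or h3 tag.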

Text effects
------------

If some text is surrounded by asterisks, as *this one*, it's marked as
bold (you probably wrote text this way in email to emphasize something).
Likewise, text surrounded by the _ symbol (underscore), as _this one_, is
marked as italic. Bold can also be marked up by surrounding the text with
three apostrophes ('''this way''') and italics with two (''this way''). If
you ever used a WikiWikiWeb system you'll be familiar with these.
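
As a rough sketch of these four rules, the sed filter below rewrites each
marker pair on a single line. It is not how txt2html is implemented: the
function name inline_markup is hypothetical, and the choice of b and i
tags is an assumption, since this README does not say which tags are
emitted.

```shell
# inline_markup
# Hypothetical filter: reads text on stdin, writes it to stdout with
# '''x''' and *x* wrapped in <b> (assumed tag) and ''x'' and _x_ wrapped
# in <i> (assumed tag). Markers must open and close on the same line.
inline_markup() {
    sed -e "s|'''\([^'][^']*\)'''|<b>\1</b>|g" \
        -e "s|''\([^'][^']*\)''|<i>\1</i>|g" \
        -e 's|\*\([^*][^*]*\)\*|<b>\1</b>|g' \
        -e 's|_\([^_][^_]*\)_|<i>\1</i>|g'
}
```

The triple-apostrophe rule runs before the double-apostrophe one so that
bold markers are not consumed as italics. A naive filter like this would
also italicize names containing two underscores, which a real converter
has to guard against.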

Other special text is automatically recognized, such as URLs (so the URL
http://www.mtxia.com should be clickable). Text beginning with ./ is
interpreted as a relative URL, so ./index.html should also be clickable.

txt2html can also be useful when documenting source code, as function names
like printf() or variables like $username are also highlighted. There are
command line arguments to make the parentheses and/or the leading dollar
sign disappear from the output document.

URLs are simply substituted as shown above; if a URL is followed by a
phrase surrounded by parentheses (just as you naturally would to explain
the contents of a web page), this phrase is used as the link text, as
in this example pointing to
http://www.mtxia.com/fancyIndex/Tools/Scripts/Korn/K93_Unix/txt2html.html
(the txt2html Home Page).

Lists
-----

txt2html is powerful at rendering lists. There are three types of lists:
unnumbered ones (bulleted), numbered ones and definition lists. They are
recognized as lines starting with a blank (space or tab) immediately
followed by a special character.

Unnumbered lists start with some blanks, followed by an asterisk,
followed by another blank. If the following lines are indented with
spaces, they are treated as part of the same list element. The asterisk
can also be a - (hyphen).

 * Lists can have multiple levels. To add another level,
   * Just indent a bit deeper,
     * and have hours of fun
       * nesting.
     * unindent 1 level
   * unindent a 2nd level

Numbered lists are marked up almost the same, just by substituting the
asterisk with a # (sharp) or 1 (number one).

Definition lists are marked up in much the same way, but with a colon
separating the definition term from the definition itself.
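
The marker rules for unnumbered and numbered lists can be sketched as a
line classifier. The function name list_kind is hypothetical and the
patterns are an assumption drawn only from the description above;
definition lists are left out because their exact layout is not spelled
out here.

```shell
# list_kind LINE
# Hypothetical helper: prints "ul" for a line that starts with blanks
# followed by '*' or '-' and another blank, "ol" for blanks followed by
# '#' or '1' and another blank, and "none" for anything else.
list_kind() {
    if printf '%s\n' "$1" | grep -Eq '^[[:blank:]]+[*-][[:blank:]]'; then
        echo ul
    elif printf '%s\n' "$1" | grep -Eq '^[[:blank:]]+[#1][[:blank:]]'; then
        echo ol
    else
        echo none
    fi
}
```

The nesting level would then follow from the width of the leading blanks,
which is why elements at the same level must share the same indentation.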

List examples
~~~~~~~~~~~~~

Unnumbered list:

 * First element. Elements at the same level must be indented
   by the same number of spaces.
 * The second one.
   * The second element has one sub-element.
   * And another...
     * that, itself, has another one
   * unindent 1 level
 * The third one...
   * Has another extremely long sub-element to show that long
     ones are rendered correctly. Please note that the elements
     of a list cannot be separated by blank lines or they will
     be interpreted as different lists.
 * The 4th and final one...
   * And its final child.

Ordered list:

 # First element.
 # The second one.
   # The second element has one sub-element.
   # And another...
     # that, itself, has another one
   # unindent 1 level
 # The third one...
   # Has another extremely long sub-element to show that long
     ones are rendered correctly. Please note that the elements
     of a list cannot be separated by blank lines or they will
     be interpreted as different lists.
   # And another sub-element, to show this is not a cut & paste
     from the unsorted example.
 # The 4th and final one. Note also that ordered and unsorted
   lists cannot be combined.

Definition list:

 first: the first element and
        this is the second line of the first
        definition list and it will wrap around the full line
        of the browser so that it is visible across multiple
        lines
 second: the second element
 third: the third element

Preformatted text
-----------------

Text that should be rendered as is must be written with at least one
blank at the beginning of every line. This can be an example:

If you ever wrote any Perl POD documentation, you'll be familiar with this.

If you write preformatted text and its first line collides with the list
markup (i.e. text with lines beginning with blanks and an asterisk or
sharp), just insert a line containing only spaces before it.

Cites
-----

If you want to quote a (possibly long) paragraph of text, use a blank
followed by a " (double quote) in its first line, as in the following
example:

 "BRAIN, n. An apparatus with which we think what we think. That which
 distinguishes the man who is content to _be_ something from the man
 who wishes to _do_ something. A man of great wealth, or one who has
 been pitchforked into high station, has commonly such a headful of
 brain that his neighbors cannot keep their hats on. In our
 civilization, and under our republican form of government, brain is so
 highly honored that it is rewarded by exemption from the cares of
 office." -- Ambrose Bierce

The leading double quote remains as part of the cited paragraph.

HTML
----

If you need to insert HTML as is (for rendering, say, images or
complicated layouts), you can also do it. Anything between two < symbols
and two > symbols will be passed without any further processing. So, to
insert an image, just do this:

Tables
------

Where txt2html really shines is rendering tables. They are created
using the + (plus) sign for corners, the - (hyphen) for horizontal lines
and the | (pipe) for vertical lines. So this is a table:

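+------------------------+----------------------+-----------------+
| Band Name              | Album Name           | Number of Songs |
| second Band Name       | second Album Name    | second Songs    |
| third Band Name        | third Album Name     | third Songs     |
+------------------------+----------------------+-----------------+
| Dead Can Dance         | A Passage in Time    | 16              |
| second line            | second passage       | 216             |
+------------------------+----------------------+-----------------+
| Bel Canto              | White-Out Conditions | 10              |
+------------------------+----------------------+-----------------+
| Depeche Mode           | Speak and Spell      | 16              |
+------------------------+----------------------+-----------------+
| Love Spirals Downwards | Temporal             | 13              |
+------------------------+----------------------+-----------------+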

One or more header rows can be embedded in a table by marking the header
row with an exclamation point (!) immediately following the first pipe of
that row ("|!"). Only one "!" is necessary, in the first cell; however,
every cell in a header row may be marked with a "!" for consistency, if
desired. A header row may also be marked by using an asterisk (*) instead
of a plus sign (+) to mark the cell divisions on the table border line
above the row.

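*------------------------*----------------------*-----------------*
| Band Name              | Album Name           | Number of Songs |
| second Band Name       | second Album Name    | second Songs    |
| third Band Name        | third Album Name     | third Songs     |
+------------------------+----------------------+-----------------+
| Dead Can Dance         | A Passage in Time    | 16              |
+------------------------+----------------------+-----------------+
| Bel Canto              | White-Out Conditions | 10              |
+------------------------+----------------------+-----------------+
| Depeche Mode           | Speak and Spell      | 16              |
+------------------------+----------------------+-----------------+
| Love Spirals Downwards | Temporal             | 13              |
+------------------------+----------------------+-----------------+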

The following is a table with multiple header lines identified.

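*----------*----------*----------*----------*
| Head 1   | Head 2   | Head 3   | Head 4   |
+----------+----------+----------+----------+
| Cell 1-1 | Cell 1-2 | Cell 1-3 | Cell 1-4 |
+----------+----------+----------+----------+
| Cell 2-1 | Cell 2-2 | Cell 2-3 | Cell 2-4 |
+----------+----------+----------+----------+
| Cell 3-1 | Cell 3-2 | Cell 3-3 | Cell 3-4 |
+----------+----------+----------+----------+
|! Head 5  | Head 6   | Head 7   | Head 8   |
+----------+----------+----------+----------+
| Cell 4-1 | Cell 4-2 | Cell 4-3 | Cell 4-4 |
+----------+----------+----------+----------+
| Cell 5-1 | Cell 5-2 | Cell 5-3 | Cell 5-4 |
+----------+----------+----------+----------+
| Cell 6-1 | Cell 6-2 | Cell 6-3 | Cell 6-4 |
+----------+----------+----------+----------+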

Separators
----------

A separator line (horizontal rule) can be inserted by typing four or
more hash marks (#) on a line. At the end of this document there should
be a separator, above my signature.
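
A check for this rule might look like the sketch below. The function name
is_separator is hypothetical, and it assumes the hash marks must start at
the left margin and be the only thing on the line apart from trailing
blanks, which this README does not state explicitly.

```shell
# is_separator LINE
# Hypothetical helper: succeeds when LINE is made of four or more '#'
# characters, optionally followed by trailing blanks (an assumption).
is_separator() {
    printf '%s\n' "$1" | grep -Eq '^#{4,}[[:blank:]]*$'
}
```

For example, is_separator '####' succeeds, while is_separator '###' and
is_separator ' # item' both fail, so a numbered list element is never
mistaken for a separator.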