ASCIIMathML.js (ver 1.4.7): Syntax and List of Constants

The main aims of the ASCIIMathML syntax are:
1. close to standard mathematical notation
2. easy to read
3. easy to type

You can use your favorite editor to write XHTML pages that use this
JavaScript program. If the page is viewed by a browser that does not
support MathML or JavaScript, the ASCII formulas are still quite
readable. Most users will not have to read the technicalities on
this page. If you type

`x^2` or `a_(mn)` or `a_{mn}` or `(x+1)/y` or `sqrtx`

you pretty much get what you expect: `x^2` or `a_(mn)` or `a_{mn}` or
`(x+1)/y` or `sqrtx`. The choice of grouping parentheses is up to you
(they don't have to match either). If the displayed expression can be
parsed uniquely without them, they are omitted. Printing the table of
constant symbols (below) may be helpful (but is not necessary if you
know the LaTeX equivalents).

It is hoped that this simple input format for MathML will further
encourage its use on the web. The remainder of this page gives a fairly
detailed specification of the ASCII syntax. The expressions described here
correspond to a well-specified subset of Presentation MathML and behave
in a predictable way.

The syntax is very permissive and does not generate syntax
errors. This allows mathematically incorrect expressions to be
displayed, which is important for teaching purposes. It also causes
less frustration when previewing formulas.

The parser uses no operator precedence and only respects the grouping
brackets, subscripts, superscripts, fractions and (square) roots. This
is done for reasons of efficiency and generality. The resulting MathML
code can quite easily be processed further to ensure additional syntactic
requirements of any particular application.

The grammar: Here is a definition of the grammar used to parse
ASCIIMathML expressions. In the Backus-Naur form given below, the
letter on the left of the ::= represents a category of symbols that
could be one of the possible sequences of symbols listed on the right.
The vertical bar | separates the alternatives.

The translation rules: Each terminal symbol is translated into
a corresponding MathML node. The constants are mostly converted to
their respective Unicode symbols. The other expressions are converted
as follows:

l`S`r  `to`  <mrow>l`S`r</mrow>
(note that any pair of brackets can be used to delimit subexpressions;
they don't have to match)

In the rules above, the expression `S'` is the same as `S`, except that if
`S` has an outer level of brackets, then `S'` is the expression inside
these brackets.
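
The definition of `S'` can be sketched in a few lines of JavaScript. This is only an illustration, not code taken from ASCIIMathML.js: the function name stripOuterBrackets is hypothetical, and the bracket set is assumed to be limited to ( [ { and ) ] }. The sketch works on the input string and treats the brackets as "outer" only if the opening bracket stays open until the very last character.

```javascript
// Assumed bracket characters; any left bracket may pair with any right one.
const LEFT = "([{";
const RIGHT = ")]}";

// Return S' for the expression string s: the inside of the outer brackets
// if s has an outer level of brackets, otherwise s itself.
function stripOuterBrackets(s) {
  if (s.length < 2) return s;
  if (!LEFT.includes(s[0]) || !RIGHT.includes(s[s.length - 1])) return s;

  // Walk everything except the last character. If the nesting depth drops
  // back to zero early, the first bracket does not enclose the whole string.
  let depth = 0;
  for (let i = 0; i < s.length - 1; i++) {
    if (LEFT.includes(s[i])) depth++;
    else if (RIGHT.includes(s[i])) depth--;
    if (depth === 0) return s;
  }
  return s.slice(1, -1);
}
```

Under these assumptions, stripOuterBrackets("(x+1)") and stripOuterBrackets("[x+1)") both give x+1, while "(a)(b)" is returned unchanged because its first bracket closes before the end.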

Matrices: A simple syntax for matrices is also recognized:
l(`S_(11)`,...,`S_(1n)`),(...),(`S_(m1)`,...,`S_(mn)`)r
or
l[`S_(11)`,...,`S_(1n)`],[...],[`S_(m1)`,...,`S_(mn)`]r.
Here l and r stand for any of the left and right
brackets (just as in the grammar, they do not have to match). Both of
these expressions are translated to
<mrow>l<mtable><mtr><mtd>`S_(11)`</mtd>...
<mtd>`S_(1n)`</mtd></mtr>...
<mtr><mtd>`S_(m1)`</mtd>...
<mtd>`S_(mn)`</mtd></mtr></mtable>r</mrow>.
For example {(S_(11),...,S_(1n)),(,...,),(S_(m1),...,S_(mn))]
displays as `{(S_(11),...,S_(1n)),(,...,),(S_(m1),...,S_(mn))]`.
Note that each row must have the same number of expressions, and there
should be at least two rows.
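
The matrix translation above can be sketched as a small JavaScript function. It is a hedged illustration rather than the script's actual code: the name matrixToMathML is hypothetical, the cells are assumed to be already translated MathML strings, and returning null when the row conditions fail is an assumption chosen so that no syntax error is raised and the caller can fall back to ordinary bracket handling.

```javascript
// left, right: bracket strings; rows: array of rows, each an array of
// MathML strings for the entries S_(ij).
function matrixToMathML(left, rows, right) {
  // At least two rows, and every row must have the same number of entries.
  if (rows.length < 2) return null;
  const width = rows[0].length;
  if (!rows.every((row) => row.length === width)) return null;

  const body = rows
    .map((row) => "<mtr>" + row.map((cell) => `<mtd>${cell}</mtd>`).join("") + "</mtr>")
    .join("");
  // Brackets are emitted as in the template above, without further wrapping.
  return `<mrow>${left}<mtable>${body}</mtable>${right}</mrow>`;
}
```

Because l and r need not match, a call such as matrixToMathML("{", rows, "]") is as valid as one with matching brackets.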

Tokenization: The input formula is broken into tokens using a
"longest matching initial substring search". Suppose the input formula
has been processed from left to right up to a fixed position. The
longest string from the list of constants (given below) that matches
the initial part of the remainder of the formula is the next token. If
there is no matching string, then the first character of the remainder
is the next token. The symbol table at the top of the ASCIIMathML.js
script specifies whether a symbol is a math operator (surrounded by a
<mo> tag) or a math identifier (surrounded by a <mi> tag). For
single character tokens, letters are treated as math identifiers, and
non-alphanumeric characters are treated as math operators. For digits,
see "Numbers" below.

Spaces are significant when they separate characters and thus prevent
a certain string of characters from matching one of the
constants. Multiple spaces and end-of-line characters are equivalent
to a single space.
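
The "longest matching initial substring search" and the role of spaces can be illustrated with a short JavaScript sketch. It is not the code from ASCIIMathML.js: the table SYMBOLS is a hypothetical miniature stand-in for the real list of constants, the function names are made up for this example, and digits are handled one character at a time here because number tokens follow their own rule.

```javascript
// Hypothetical miniature symbol table; the real one is far longer.
const SYMBOLS = [
  { input: "alpha", tag: "mi" },
  { input: "sqrt", tag: "mo" },
  { input: "in", tag: "mo" },
  { input: "int", tag: "mo" },
  { input: "<=", tag: "mo" },
  { input: "<", tag: "mo" },
];

// Return the next token of str at or after pos, or null at end of input.
function nextToken(str, pos) {
  // Runs of spaces and end-of-line characters only separate tokens.
  while (pos < str.length && /\s/.test(str[pos])) pos++;
  if (pos >= str.length) return null;

  // Longest constant matching the initial part of the remainder.
  let best = null;
  for (const sym of SYMBOLS) {
    if (str.startsWith(sym.input, pos) &&
        (best === null || sym.input.length > best.input.length)) {
      best = sym;
    }
  }
  if (best !== null) {
    return { text: best.input, tag: best.tag, end: pos + best.input.length };
  }

  // No constant matches: the first character is the token. Letters are
  // identifiers, non-alphanumeric characters are operators. Tagging a
  // single digit as <mn> is a simplification made for this sketch.
  const ch = str[pos];
  const tag = /[A-Za-z]/.test(ch) ? "mi" : /[0-9]/.test(ch) ? "mn" : "mo";
  return { text: ch, tag, end: pos + 1 };
}

// Collect the token texts of a whole formula.
function tokenize(str) {
  const tokens = [];
  for (let t = nextToken(str, 0); t !== null; t = nextToken(str, t.end)) {
    tokens.push(t.text);
  }
  return tokens;
}
```

With this table, tokenize("intx") yields the tokens int and x, whereas tokenize("i nt") yields i, n and t, because the space prevents the string from matching either constant.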

Numbers: A string of digits, optionally preceded by a minus sign, and
optionally followed by a decimal point (a period) and another string
of digits, is parsed as a single token and converted to a MathML
number, i.e., enclosed with the <mn> tag. If it is not desirable to
have a preceding minus sign be part of the number, a space should be inserted.
Thus x-1 is converted to <mi>x</mi><mn>-1</mn>, whereas
x - 1 is converted to <mi>x</mi><mo>-</mo><mn>1</mn>.
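
The number rule can be expressed as a regular expression. The sketch below is an illustration under stated assumptions, not the script's own implementation: readNumber and toMathML are hypothetical names, only single letters and single-character operators are handled besides numbers, and no escaping of characters such as < is attempted.

```javascript
// Digits, optionally preceded by "-", optionally followed by "." and more
// digits. The sticky flag (y) anchors the match at lastIndex.
const NUMBER = /-?\d+(?:\.\d+)?/y;

// Return the number token starting exactly at pos, or null if there is none.
function readNumber(str, pos) {
  NUMBER.lastIndex = pos;
  const m = NUMBER.exec(str);
  return m === null ? null : m[0];
}

// Translate a formula made of numbers, single letters and operators.
function toMathML(str) {
  let out = "";
  let pos = 0;
  while (pos < str.length) {
    if (/\s/.test(str[pos])) { pos++; continue; } // spaces only separate
    const num = readNumber(str, pos);
    if (num !== null) {
      out += `<mn>${num}</mn>`;
      pos += num.length;
    } else {
      const ch = str[pos];
      const tag = /[A-Za-z]/.test(ch) ? "mi" : "mo";
      out += `<${tag}>${ch}</${tag}>`;
      pos++;
    }
  }
  return out;
}
```

Here toMathML("x-1") produces <mi>x</mi><mn>-1</mn>, while toMathML("x - 1") produces <mi>x</mi><mo>-</mo><mn>1</mn>, since the space keeps the minus sign from attaching to the digits.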

Of course you may want or need other symbols from the thousands of LaTeX
symbols or Unicode symbols. Fortunately ASCIIMathML.js is very easy
to extend, so you can tailor it to your specific needs. (This
could be compared to the LaTeX macro files that many users have
developed over the years.)