What is a Reasonable Authoring DTD under SGML or XML for MathML?

William F. Hammond

Email: hammond@math.albany.edu

1. Introduction

There has been a recent resurgence of interest in MathML, the rather
granular XML language developed by the World Wide Web Consortium (W3C)
HTML Math Working Group during the period 1996-2000. The resurgence is
due to the availability of MathML-capable builds of the browser
Mozilla, the open-source development version of the popular browser
Netscape.

2. A Few Examples

Compound fractions:

{{a}/{b}}/{{c}/{d}} = {a d}/{b c} .

The formula for solving the quadratic equation a x^{2} + b x + c = 0
(in a field of characteristic ≠ 2):

x = {- b ± \sqrt{b^{2} - 4 a c}}/{2 a} .

of a centrally important object that one might choose to declare
as the symbol “galQ”. In this instance, however, the expression
is formed using the following three declared symbols: [1]

Name   Rendering   GELLMU expansion
----   ---------   ------------------------
Q      Q           \regch{\bold{Q}}
Qbar   \={Q}       \ovbar{\regch{\bold{Q}}}
Gal    Gal         \mbox{Gal}

Here the example is repeated

Gal(\={Q} / Q)

with the same presented appearance but this time as the declared
symbol galQ, which is defined without using other
declared symbols in its definition.

3. Generating MathML

There is a serious issue surrounding how one might migrate from
traditional TeX-like mathematical markup, which is reasonably succinct
and based on the long tradition of Western mathematical notation, to an
authoring markup that is fully adequate for translation to MathML. For
example, how can we automatically
translate, with full confidence, the XML versions of the above
mathematical examples into MathML? Or, if we cannot, what additional
information needs to be added?
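
To make the question concrete, here is a minimal sketch, in Python with the standard library's xml.etree.ElementTree, of presentation MathML built by hand for the compound fraction example. It is not the output of any GELLMU processor. The helper names (mi, mo, mrow, mfrac, product) are chosen here purely for illustration. The decision to read the juxtapositions a d and b c as products, marked with the invisible times character U+2062, is an assumption supplied by hand; it is exactly the kind of information an automated translator would otherwise have to guess.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
INVISIBLE_TIMES = "\u2062"  # MathML's invisible multiplication operator


def mi(name):
    """Identifier element, e.g. <mi>a</mi>."""
    elem = ET.Element("mi")
    elem.text = name
    return elem


def mo(op):
    """Operator element, e.g. <mo>=</mo>."""
    elem = ET.Element("mo")
    elem.text = op
    return elem


def mrow(*children):
    """Horizontal group of sub-expressions."""
    elem = ET.Element("mrow")
    elem.extend(children)
    return elem


def mfrac(numerator, denominator):
    """Fraction with exactly two children: numerator, then denominator."""
    elem = ET.Element("mfrac")
    elem.append(numerator)
    elem.append(denominator)
    return elem


def product(x, y):
    # ASSUMPTION: juxtaposition of two letters means multiplication.
    # The source markup "a d" does not itself say so.
    return mrow(mi(x), mo(INVISIBLE_TIMES), mi(y))


def compound_fraction_example():
    """Build {{a}/{b}}/{{c}/{d}} = {a d}/{b c} as presentation MathML."""
    lhs = mfrac(mfrac(mi("a"), mi("b")), mfrac(mi("c"), mi("d")))
    rhs = mfrac(product("a", "d"), product("b", "c"))
    math = ET.Element("math", {"xmlns": MATHML_NS})
    math.append(mrow(lhs, mo("="), rhs))
    return math


mathml_text = ET.tostring(compound_fraction_example(), encoding="unicode")
print(mathml_text)
```

Had a d been meant as the application of a function a to d, or as a single two-letter identifier, a different operator or a single mi element would be appropriate, which is why the source markup alone does not determine the MathML.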

One possibility is offered by my draft on mathematical notation at the
URL

It attempts to explain what additional information is needed in
this document to eliminate the need for guessing by an automated
rendering system at work on these examples, as marked up in the XML
version of this document. Note that no guessing is needed to render
this document either in HTML, with mathematics set crudely but
reasonably, or in LaTeX. (Perhaps one may not fully appreciate this
latter point without examining the XML version of this document.)

To assist automated rendering to MathML and to supply semantic
information for computer algebra systems, GELLMU provides a
metacommand mathsym [2] for the formal declaration of mathematical
symbols with the usage:

\mathsym{symbol-name}{symbol-rendering}[symbol-meta-info] .

Here symbol-name is a case-sensitive alphanumeric string
beginning with a letter. The second argument is the presentation
rendering of the symbol in GELLMU markup. It is like the definition
of a newcommand except that it may not involve arguments. [3]
The optional third argument symbol-meta-info is an alphanumeric
string that may also include a few other characters such as
‘/’, ‘-’, ‘,’, ‘.’, and ‘*’. Its exact structure depends on the
production system. For example, it might consist of (name, value) pairs
for conveying meta-information about the symbol.
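
The three-part shape of a declaration can be illustrated with a small Python sketch that splits a mathsym declaration into its name, rendering, and optional meta information. This is not the GELLMU syntactic translator, whose implementation is not reproduced here. The function names parse_mathsym and read_braced are hypothetical, escaped braces are ignored for simplicity, and the meta string type/field is invented for the example, since the actual structure depends on the production system.

```python
import re
from typing import NamedTuple, Optional


class MathSym(NamedTuple):
    name: str
    rendering: str
    meta: Optional[str]


# symbol-name: alphanumeric, case-sensitive, beginning with a letter
NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9]*\Z")


def read_braced(text, pos):
    """Return (content, next_pos) for the balanced {...} group at pos.

    Simplification: escaped braces such as \\{ are not treated specially.
    """
    if pos >= len(text) or text[pos] != "{":
        raise ValueError("expected '{' at position %d" % pos)
    depth = 0
    for i in range(pos, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return text[pos + 1:i], i + 1
    raise ValueError("unbalanced braces")


def parse_mathsym(decl):
    """Split one \\mathsym{name}{rendering}[meta] declaration."""
    prefix = "\\mathsym"
    if not decl.startswith(prefix):
        raise ValueError("not a mathsym declaration")
    name, pos = read_braced(decl, len(prefix))
    if not NAME_RE.match(name):
        raise ValueError("invalid symbol name: %r" % name)
    rendering, pos = read_braced(decl, pos)
    rest = decl[pos:].strip()
    meta = None
    if rest:
        if not (rest.startswith("[") and rest.endswith("]")):
            raise ValueError("malformed optional argument")
        meta = rest[1:-1]
    return MathSym(name, rendering, meta)


# Declarations assembled from the table of declared symbols above.
# The meta string on the first one is HYPOTHETICAL, for illustration only.
declarations = [
    r"\mathsym{Q}{\regch{\bold{Q}}}[type/field]",
    r"\mathsym{Qbar}{\ovbar{\regch{\bold{Q}}}}",
    r"\mathsym{Gal}{\mbox{Gal}}",
]
symbols = [parse_mathsym(d) for d in declarations]
for sym in symbols:
    print(sym)
```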

The syntactic translator replaces each invocation of a given
mathsym with the specified rendering. For each mathsym definition it
also writes a corresponding element in the SGML output. If there is
no meta information, the content of that element consists solely of
the declared symbol name; otherwise it consists of the symbol name
followed by a blank space and then whatever string of meta information
is provided in the optional argument. Additionally, each invocation is
wrapped in a rendering-inert Sym element whose key
attribute reveals the name given to the symbol at the point of
declaration (and by which the symbol is invoked). This makes it
possible for a downstream processor in the authoring platform that has
remembered the list of declared symbol names to match each invocation
of a declared symbol with its associated meta information, if any,
provided by the author in the symbol declaration.
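
The matching step described above might be sketched as follows in Python. Only the Sym element and its key attribute come from the description; the declaration element name mathsym, the wrapper elements doc and p, the placeholder element content, and the meta string type/field are all assumptions made for the example, since the actual names in the translator's output are not stated here.

```python
import xml.etree.ElementTree as ET

# HYPOTHETICAL translator output: element names other than Sym are invented.
SAMPLE = """<doc>
  <mathsym>Q type/field</mathsym>
  <mathsym>Gal</mathsym>
  <p><Sym key="Gal">rendering of Gal</Sym>
     <Sym key="Q">rendering of Q</Sym></p>
</doc>"""


def collect_declarations(root, decl_tag="mathsym"):
    """Map each declared symbol name to its meta information (or None).

    Element content is the symbol name, optionally followed by a blank
    space and the meta-information string.
    """
    table = {}
    for decl in root.iter(decl_tag):
        name, _, meta = (decl.text or "").strip().partition(" ")
        table[name] = meta or None
    return table


def match_invocations(root, table):
    """Pair every Sym invocation, in document order, with its meta info."""
    return [(sym.get("key"), table.get(sym.get("key")))
            for sym in root.iter("Sym")]


root = ET.fromstring(SAMPLE)
table = collect_declarations(root)
matches = match_invocations(root, table)
print(matches)
```

Because Sym is inert for rendering, a processor with no interest in meta information can ignore the wrapper and use its content unchanged.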

A related feature in the didactic GELLMU document type is the
mlg tag for marking mathematical logical groups. This is
somewhat akin to the lgg tag for TeX-like logical groups,
traditionally created in TeX markup with braces that are not
attached to a command. [4] As with lgg there is no
obvious evidence of an mlg tag in a typeset rendering, but the
presence of such a tag is intended as a signal to downstream
mathematical parsers that the contents of the tag be given grouping
priority as, say, with visible parentheses. Furthermore, the
mtype and mml attributes of the mlg tag may be
used to pass semantic information about the tag's contents to a
processor.

The reader is invited to do one or more of the following:

* Point out inadequacies in my draft on notation.
* Improve my draft on notation.
* Provide code to format the above examples in MathML.

Footnotes

[1] The command regch is a variant of mbox that is intended to denote
the normal version of a “regular” character found in a mathematical
context when that character is suitable for a hypothetical algorithmic
application of an accent such as ovbar. A general mbox is regarded as
not suitable for hypothetical algorithmic accenting.

[2] The name mathsym is the default value of the variable
gellmu-mathsym-name in the syntactic translator.

[3] However, a declared math symbol may be invoked in a newcommand
that takes arguments.

[4] Such unattached braces in GELLMU markup lead to an lg0 tag in the
output of the syntactic translator that is translated to an lgg tag in
the XML version of the didactic document type.