FYI - Dave Raggett
------- Forwarded Message
Date: Mon, 20 May 1996 09:13:05 -0700
From: "Richard J. Fateman" <fateman@cs.berkeley.edu>
Message-Id: <199605201613.JAA27194@peoplesparc.CS.Berkeley.EDU>
To: dli-math@ncsa.uiuc.edu, s-harum@uiuc.edu
Subject: Re: SGML Math Workshop/ a suggestion
Content-Length: 4269
From the summary of the workshop, it sounds like it covered the
same ground as the OpenMath workshops.
I have a suggestion to try to make some progress here.
1. A small set of documents or pages from documents
should be made available for examination.
2. All contenders for adoption as an encoding should
demonstrate their encoding for the given set of pages
and, if appropriate, add to the set further pages
that highlight the advantages of their encoding.
3. Programs which map from one of the encodings to the
visual appearance of the page presented should be made
available for experimentation.
You can look at two pages that are given in the context
of OCR, if you visit
http://www.cs.berkeley.edu/~fateman/ocrchallenge.html
The pages linked to this may suggest to the careful
reader that neither the syntactic nor the semantic
approach suggested in the Workshop summary can succeed
in a "pure" form.
Let me mention a few additional pro/con issues that may not have
been addressed.
con: syntactic approach.
If the parameters of the display system change, how does the display
change? For example, on a narrow display, how does one display a wide
expression? Can break-points and re-display really be computed on a
purely syntactic basis? I doubt it. The person entering text in a
syntactic fashion is doomed to encoding accidental relationships as
though they were deliberate. Only a semantic understanding (at SOME
nontrivial level) tells the encoder if the relationship of the text as
it is on the draft manuscript page must be EXACTLY preserved.
con: semantic approach.
It is pretty much inevitable that the notation given by
any predetermined collection will be inadequate, since authors
are not restricted a priori to notation that can be mapped into
existing Mathematica (etc.) semantics.
An extension language is therefore necessary. Computer algebra systems
(for at least 20 years before Mathematica) have traditionally taken an
object-oriented approach which associates with each operator a
"formatting" program. Thus the POWER[a,b] operator has a format
program that does this:
format(POWER[a,b]) --> {format[a]} <super>{format[b]}</super>
or some such thing. (I've used Mathematica / SGML bastardized
notation, but you can get the idea).
If you are unwilling to allow people to define new formats based
on new operators "on the fly" then a semantic encoding would be
impossible to live with.
(By the way, the simple example of formatting POWER is inadequate
because if the format[b] doesn't fit in the space allocated to it on
one line some alternative must be found).
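
A minimal Python sketch of the per-operator "formatting program" idea
described above. It is an illustration under stated assumptions, not the
mechanism of any particular computer algebra system: expressions are
modeled as nested tuples (op, arg, ...), and the names FORMATTERS,
define_format and fmt are hypothetical. It does not attempt the
line-width fallback mentioned in the preceding remark.

```python
# Hypothetical registry: operator name -> formatting function.
FORMATTERS = {}


def define_format(op):
    """Return a decorator that registers a formatter for operator `op`.

    It can be called at any time, so new operators can be given
    formats "on the fly".
    """
    def register(fn):
        FORMATTERS[op] = fn
        return fn
    return register


def fmt(expr):
    """Format an atom, or a tuple of the form (op, arg1, arg2, ...)."""
    if not isinstance(expr, tuple):
        return str(expr)
    op, *args = expr
    if op in FORMATTERS:
        return FORMATTERS[op](*args)
    # Fallback for operators with no registered format: op[arg,...]
    return f"{op}[{','.join(fmt(a) for a in args)}]"


@define_format("POWER")
def format_power(base, exponent):
    # Mirrors the POWER rule in the text: base, then superscripted exponent.
    return f"{fmt(base)}<super>{fmt(exponent)}</super>"


print(fmt(("POWER", "a", "b")))  # a<super>b</super>

# A new operator gets a format later, without touching fmt itself.
define_format("FOO")(lambda x, y: f"{fmt(x)} foo {fmt(y)}")
print(fmt(("FOO", "x", "y")))    # x foo y
```

The dispatch table keeps the set of operators open: an unknown operator
still prints in a generic form until someone registers a format for it.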
........
My own (in my experience pretty much workable) solution is to use a
language which can (in cases where it makes sense) be mapped into a
CAS language and can (in cases where it makes sense) be mapped into
TeX, or directly displayed on a bit-map screen in a crude version
of TeX.
What is needed is a language which is extensible by nature, and in
which new notations are defined in the same language that describes
the mathematical data.
Lisp does fine.
Trivial examples:
a+b+c is (+ a b c)
(a+b)*c*f(x,y) is (* (+ a b) c (f x y))
If you want to put in extra markers you can do this:
[a+b]*c*f(x,y) is (* (squarebracket (+ a b)) c (f x y))
and
a + b + c
  + d + e
is {perhaps} (multiple-lines + (a b c) (d e))
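
The single-line examples above can be turned back into conventional
notation by a short recursive walk. The following Python sketch assumes
the prefix forms are modeled as nested tuples and uses a hypothetical
function name, to_infix. It covers only +, *, the squarebracket marker
and ordinary function application; the multiple-lines form is left out.

```python
def to_infix(expr):
    """Render a prefix expression (nested tuples) as an infix string."""
    if not isinstance(expr, tuple):
        return str(expr)
    op, *args = expr
    if op == "+":
        return "+".join(to_infix(a) for a in args)
    if op == "*":
        # A sum inside a product needs parentheses; other operands do not.
        return "*".join(
            f"({to_infix(a)})" if isinstance(a, tuple) and a[0] == "+"
            else to_infix(a)
            for a in args
        )
    if op == "squarebracket":
        # Explicit marker: the author asked for square brackets here.
        return f"[{to_infix(args[0])}]"
    # Anything else is treated as ordinary function application.
    return f"{op}({','.join(to_infix(a) for a in args)})"


print(to_infix(("+", "a", "b", "c")))
# a+b+c
print(to_infix(("*", ("+", "a", "b"), "c", ("f", "x", "y"))))
# (a+b)*c*f(x,y)
print(to_infix(("*", ("squarebracket", ("+", "a", "b")), "c", ("f", "x", "y"))))
# [a+b]*c*f(x,y)
```

The squarebracket case shows the point of the extra marker: the
grouping is recorded in the data, so the printer does not have to guess
which kind of bracket was intended.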
The nice part of it is you can include in your file something like
(define-printformat g (args)
  (horizontal-format (Gamma font-name ....)
                     (parenthesis-list args)))
(define-printformat multiple-lines (+ args)
  (vertical-format (format `(+ ,(first args))
                           ....)))
As is usual in what I do, it is possible to use a conventional display
syntax suitable for "advanced calculus" or "applied math" on top of
this, but it is hardly required to do so. One can consider building
some syntax where other, perhaps conflicting, notations prevail. For
example, numerous conflicting notations prevail in old calculus
references.
Much of what is expressed in Lisp can be written in Mathematica,
and probably in other language systems. Translating between
Lisp and Mathematica is approximated by f[a,b] <--> (f a b).
There are numerous small, free Lisp systems.
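
As an illustration of the f[a,b] <--> (f a b) correspondence, here is a
small Python sketch that reads a Lisp-style expression and writes it in
bracket form. The function names (tokenize, read, to_brackets,
lisp_to_brackets) are hypothetical, and the HEADS table is an assumption
added for this example: it maps + and * to the Mathematica heads Plus
and Times. There is no error handling for unbalanced or empty lists.

```python
# Assumed mapping from Lisp operator symbols to Mathematica heads.
HEADS = {"+": "Plus", "*": "Times"}


def tokenize(text):
    """Split an s-expression string into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def read(tokens):
    """Parse one expression from the front of `tokens` (consumed in place)."""
    tok = tokens.pop(0)
    if tok == "(":
        items = []
        while tokens[0] != ")":
            items.append(read(tokens))
        tokens.pop(0)  # drop the closing ")"
        return items
    return tok


def to_brackets(expr):
    """Write a parsed expression as head[arg,...]."""
    if isinstance(expr, str):
        return HEADS.get(expr, expr)
    head, *args = expr
    return f"{to_brackets(head)}[{','.join(to_brackets(a) for a in args)}]"


def lisp_to_brackets(text):
    return to_brackets(read(tokenize(text)))


print(lisp_to_brackets("(f a b)"))
# f[a,b]
print(lisp_to_brackets("(* (+ a b) c (f x y))"))
# Times[Plus[a,b],c,f[x,y]]
```

The translation is only approximate, as the text says: it moves the
head outside the brackets and renames two operators, and it says nothing
about differences in evaluation between the two systems.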
------- End of Forwarded Message