Because MathML is typically embedded in a wider context, it is important
to describe the conditions under which processors should recognize
XML fragments as MathML.
This chapter describes the fundamental
mechanisms for recognizing and transferring MathML markup fragments within a larger
environment such as an XML document or a desktop file system. It then raises the
issues of combining external markup with MathML, and finally indicates how
cascading style sheets can be used with MathML.

7.1 Invoking MathML Processors: namespace, extensions, and mime-types

7.1.1 Recognizing MathML in an XML Model

Within an XML document supporting namespaces (TODO: cite xmlns and xml specs),
the preferred method of recognizing
MathML markup is to identify the math element
in the MathML namespace, i.e. the namespace
with URI http://www.w3.org/1998/Math/MathML.
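
As an informal illustration, the recognition rule above could be sketched as follows using Python's standard xml.etree.ElementTree module. The function name find_math_elements and the sample document are hypothetical and are not part of this specification.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

def find_math_elements(xml_text):
    """Return every math element in the MathML namespace, in document order."""
    root = ET.fromstring(xml_text)
    # ElementTree spells a namespaced name as "{namespace-uri}local-name".
    return list(root.iter("{%s}math" % MATHML_NS))

# Hypothetical XHTML host document: two MathML islands (one using a default
# namespace, one using a prefix) and one math element that inherits the
# XHTML namespace and therefore must not be recognized as MathML.
doc = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <p><math xmlns="http://www.w3.org/1998/Math/MathML"><mi>x</mi></math></p>
    <p><m:math xmlns:m="http://www.w3.org/1998/Math/MathML"><m:mn>2</m:mn></m:math></p>
    <p><math>not in the MathML namespace</math></p>
  </body>
</html>"""

print(len(find_math_elements(doc)))
```

Note that the match depends only on the namespace URI and the local name, so the choice of prefix in the host document is irrelevant.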

This is the recommended method to embed MathML within [XHTML]
documents. Some user agents' setups may require supplementary information
to be available, such as the Microsoft behaviour specification (TODO: quote)
used in the MathType browser extension (TODO:quote).

Markup-language specifications that wish to embed MathML may provide special
conditions independent of this recommendation. The conditions should be
equivalent and the elements' local-names should remain the same.

7.1.2 Resource Types for MathML Documents

Although rendering MathML expressions often occurs in place in
a Web browser, other MathML processing functions take place more
naturally in other applications. Particularly common tasks include
opening a MathML expression in an equation editor or computer algebra
system. It is therefore important to specify the encoding names by which
MathML fragments should be identified:

MIME types [RFC2045], [RFC2046] offer
a strategy that can be used in current user agents
to invoke a MathML processor. This is primarily useful when
referencing separate files containing MathML markup from an embed or object element,
or within a desktop environment. (TODO: check that this still applies)

[RFC3023] assigns MathML the MIME type
application/mathml+xml, which is the official MIME type.
The W3C Math Working Group recommends the standard file extension
.mml for use in registries
that associate file formats with file extensions.
In MathML 1.0, text/mathml was given as the suggested
MIME type. This has been superseded by RFC3023.
In the next section, alternate encoding names are provided for the purposes of
desktop transfers.

Encoding names are specified in the section below, and are described
in Chapter 5 as attribute values of the annotation* elements.
Moreover, content types and encoding names have fairly similar semantics
and are even used interchangeably in some environments (e.g. Java's DataFlavor).

It might be worth trying to homogenize our list and perhaps specify a MIME type
equivalent for each encoding name.

Resolution

None recorded

7.2 Transferring MathML in Desktop Environments

MathML expressions are often exchanged between applications using the familiar
copy-and-paste or drag-and-drop paradigms. This section provides recommended
ways to process MathML while applying these paradigms.

Applying them transfers
MathML fragments between the contexts of two applications
by making them available in several flavors, often called
clipboard formats or data flavors.
The copy-and-paste paradigm lets an application place content in a central
clipboard, one data stream per clipboard format; consuming applications
negotiate by choosing to read the data in the format they prefer.
The drag-and-drop paradigm lets an application offer content by declaring
the available formats; potential recipients accept or reject a drop based
on this list, and the drop action then lets the receiving application request
delivery of the data in the indicated format.
The list of flavors is generally ordered, going from the most
preferred to the least preferred flavor.

Current desktop platforms offer both of these transfer paradigms using similar
transfer architectures. In this section we specify what applications should
provide as transfer-flavors, how they should be named, and how they should handle
the special semantics, annotation, and annotation-xml
elements.

To cover both negotiation mechanisms,
we shall speak here of flavors, each having a name
(a character string) and a content (a stream of binary data),
which applications export.

7.2.1 Basic Flavors' Names and Contents

MathML contains two distinct vocabularies: one for
encoding mathematical semantics, called Content Markup
(see Chapter 4 Content Markup), and one for encoding visual presentation, called
Presentation Markup (see Chapter 3 Presentation Markup).
Some MathML-aware applications import and export only one of these
vocabularies, while others may be capable of producing and consuming
both. Consequently, we propose three distinct MathML flavors:

Flavor Name            Description

MathML Content         Instance contains content MathML markup only

MathML Presentation    Instance contains presentation MathML markup only

MathML                 Any well-formed MathML instance; presentation markup,
                       content markup, or a mixture of the two is allowed

Note that MathML Content, MathML Presentation and
MathML are the exact strings that should be used to name the
flavors described above.
On operating systems that allow it, applications should register these
names (e.g. with Windows' RegisterClipboardFormat).

When transferring MathML, for example when placing it within a clipboard,
an application MUST ensure the content is a
well-formed XML
instance of a MathML schema. Specifically:

The instance MUST begin with an XML declaration,
e.g. <?xml version="1.0"?>

The instance MUST contain exactly one root math element.

Since MathML is frequently embedded within other XML document
types, the instance MUST declare the MathML namespace
on the root math element. In addition, the instance SHOULD use a
schemaLocation attribute on the math element to indicate
the location of MathML schema documents against which the instance is valid.
Note that the presence of the schemaLocation attribute does not require a
consumer of the MathML instance to obtain or use the cited schema documents.

The instance MUST use numeric character references (e.g. &#x03b1;)
rather than character entity names (e.g. &alpha;) for greater interoperability.

The character encoding of the instance MUST either be specified in the
XML declaration or be UTF-16 or UTF-8. UTF-16-encoded data MUST begin with a
byte-order mark (BOM). If neither a BOM nor an encoding declaration is given,
the character encoding will be assumed to be UTF-8.
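
The requirements above lend themselves to a mechanical check before data is placed on the clipboard. The following is a minimal sketch only, written in Python with the standard library; the function name check_transfer_instance is hypothetical. It covers the XML declaration, named-entity and root-element requirements, it does not examine the schemaLocation attribute, and its textual checks assume UTF-8 data or UTF-16 data with a BOM. The entity check is a simple pattern match and may report false positives inside comments or CDATA sections.

```python
import re
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
# The five entities predefined by XML itself are always allowed.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

def check_transfer_instance(data):
    """Return a list of problems found in a candidate transfer instance (bytes)."""
    problems = []
    # Decode for the textual checks only; a BOM selects UTF-16, otherwise UTF-8.
    if data.startswith((b"\xff\xfe", b"\xfe\xff")):
        text = data.decode("utf-16", errors="replace")
    else:
        text = data.decode("utf-8-sig", errors="replace")
    if not text.startswith("<?xml"):
        problems.append("does not begin with an XML declaration")
    named = set(re.findall(r"&([A-Za-z][A-Za-z0-9]*);", text)) - XML_PREDEFINED
    if named:
        problems.append("uses named entities: " + ", ".join(sorted(named)))
    try:
        root = ET.fromstring(data)
    except ET.ParseError as err:
        problems.append("not well-formed XML: %s" % err)
        return problems
    # A well-formed document has exactly one root; it must be math in the
    # MathML namespace, which can only have been declared on the root itself.
    if root.tag != "{%s}math" % MATHML_NS:
        problems.append("root element is not math in the MathML namespace")
    return problems

good = (b'<?xml version="1.0"?>'
        b'<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>&#x03b1;</mi></math>')
print(check_transfer_instance(good))
```

An exporting application might run such a check as a final step and fall back to the plain-text flavor alone if any problem is reported.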

7.2.2 Recommended Behaviours when Transferring

Applications that transfer MathML SHOULD adhere to the following conventions:

Applications that have pure presentation markup and/or pure content markup
versions of an expression SHOULD offer as many of these two flavors as are available.

When both
presentation and content are exported, recipients should consider it equivalent to a
single MathML instance in which presentation and content are combined at the top
level using MathML's semantics element (see
Section 5.5.1 Top-level Parallel Markup).
(TODO: issue: in DnD you can't read several, at least in java)
The order between flavors determines
whether presentation wraps content, or vice-versa. Usually, Presentation MathML
should be offered first so that it wraps the Content MathML.

When an application has a mixed presentation and content version
in addition to pure presentation and/or content versions, it should
export the mixed version after the pure presentation and/or
content markup versions, and mark it as the generic MathML flavor.

When an application cannot produce pure presentation and/or
content markup versions, or cannot determine whether MathML data is
pure presentation or content markup (e.g. data being passed through
from a third application), it should export only one version
marked as the generic MathML flavor.

An application that only has pure presentation and/or content
markup versions of an expression available SHOULD NOT export a second
copy of the data marked as the generic MathML
flavor.
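
The ordering conventions above can be summarized in a few lines of code. The following Python sketch is illustrative only: the function name ordered_flavors and its parameters are hypothetical, and the flavor names are those of the table in Section 7.2.1. Data that cannot be classified as pure presentation or pure content is passed as the mixed argument, so that it is exported once under the generic flavor.

```python
def ordered_flavors(presentation=None, content=None, mixed=None):
    """Return (flavor name, data) pairs in the recommended export order."""
    flavors = []
    # Presentation is usually offered first so that it wraps the content.
    if presentation is not None:
        flavors.append(("MathML Presentation", presentation))
    if content is not None:
        flavors.append(("MathML Content", content))
    # The generic flavor follows the pure versions and is never a second
    # copy of pure data: it is only emitted when mixed (or unclassified)
    # data was actually supplied.
    if mixed is not None:
        flavors.append(("MathML", mixed))
    return flavors

# Hypothetical payloads; real data would be complete MathML instances.
for name, _ in ordered_flavors(presentation=b"<math/>", mixed=b"<math/>"):
    print(name)
```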

When an application exports a MathML fragment whose root element is
a semantics element, it SHOULD offer, after the flavors above,
a flavor for each annotation or annotation-xml element:
the flavor name should be given by the encoding attribute value,
and the content should be the child text in UTF-8 (if the annotation
element contains only textual data), a valid XML fragment (if the annotation-xml
element contains children), or the data obtained by requesting the URL
given by the href attribute.
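
As a rough sketch of how the additional flavors might be derived, the following Python function walks the children of a top-level semantics element. The name annotation_flavors, the three-part result tuples and the sample encoding values are assumptions made for illustration. The sketch does not fetch href targets; it returns the URL and leaves retrieval to the caller.

```python
import xml.etree.ElementTree as ET

M = "{http://www.w3.org/1998/Math/MathML}"

def annotation_flavors(math_xml):
    """Return (flavor name, kind, payload) for each annotation of a
    top-level semantics element; kind is "text", "xml" or "href"."""
    root = ET.fromstring(math_xml)
    semantics = root if root.tag == M + "semantics" else root.find(M + "semantics")
    flavors = []
    if semantics is None:
        return flavors
    for child in semantics:
        if child.tag not in (M + "annotation", M + "annotation-xml"):
            continue
        name = child.get("encoding")
        if name is None:
            continue  # without an encoding value there is no flavor name
        if child.get("href") is not None:
            flavors.append((name, "href", child.get("href")))
        elif child.tag == M + "annotation":
            flavors.append((name, "text", (child.text or "").encode("utf-8")))
        else:
            # Serialize each child element; the default output is ASCII with
            # numeric character references, which is also valid UTF-8.
            payload = b"".join(ET.tostring(c) for c in child)
            flavors.append((name, "xml", payload))
    return flavors

# Hypothetical instance; the encoding values are examples only.
sample = ('<math xmlns="http://www.w3.org/1998/Math/MathML"><semantics>'
          '<mi>x</mi>'
          '<annotation encoding="TeX">x</annotation>'
          '<annotation-xml encoding="MathML Content"><ci>x</ci></annotation-xml>'
          '<annotation encoding="image/png" href="x.png"/>'
          '</semantics></math>')
for name, kind, _ in annotation_flavors(sample):
    print(name, kind)
```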

As a final fallback applications SHOULD export a version of the data in plain-text
flavor (such as CF_UNICODETEXT, UnicodeText, NSStringPboardType, text/plain, ...).
When an application has multiple versions of an expression available, it
may choose the version to export as text at its
discretion. Since some older MathML-aware programs expect MathML
instances transferred as text to begin with a math
element, the text version should generally omit the XML declaration,
DOCTYPE declaration and other XML prolog material before
the math element. Similarly, the BOM should be omitted for
Unicode text encoded as UTF-16. Note, the Unicode text version of the
data should always be the last flavor exported,
following the principle that exported flavors should be ordered
with the most specific flavor first and the least specific flavor
last.
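
A plain-text version can often be derived from an existing MathML instance by discarding everything before the math start tag. The Python sketch below shows one possible approach; the function name plain_text_version is hypothetical, and the pattern match is a heuristic that assumes the prolog contains no comment that itself includes a math start tag.

```python
import re

# Matches the start tag of a math element, with or without a namespace prefix.
MATH_START = re.compile(r"<(?:[A-Za-z_][\w.-]*:)?math[\s>/]")

def plain_text_version(xml_text):
    """Drop the BOM and any prolog material preceding the math element."""
    text = xml_text.lstrip("\ufeff")
    match = MATH_START.search(text)
    return text[match.start():] if match else text

source = ('\ufeff<?xml version="1.0"?>\n'
          '<!DOCTYPE math SYSTEM "mathml.dtd">\n'
          '<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>x</mi></math>')
print(plain_text_version(source))
```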

7.2.3 Discussion

For purposes of determining whether a MathML instance is pure
content markup or pure presentation markup, the math element
and the semantics, annotation and
annotation-xml elements should be regarded as belonging to
both the presentation and content markup vocabularies. This is
obvious for the root math element which is required for all
MathML expressions. However, the semantics element and its
child annotation elements comprise an arbitrary annotation mechanism
within MathML, and are not tied to either presentation or content
markup. Consequently, applications consuming MathML should always
process these four elements even if the application only implements
one of the two vocabularies.

It is worth noting that the above recommendations allow agents producing
MathML to provide binary data for the clipboard, for example as an image
or an application-specific format.
The sole method to do so is to reference the binary data by the href
attribute since XML child-text does not allow arbitrary byte-streams.

While the above recommendations are intended to improve
interoperability between MathML-aware applications utilizing the
transfer flavors, it should be noted that they do not guarantee
interoperability. For example, references to external resources
(e.g. stylesheets, etc.) in MathML data can also cause
interoperability problems if the consumer of the data is unable to
locate them, just as can happen when cutting and pasting HTML or many
other data types. Applications that make use of references to
external resources are encouraged to make users aware of potential
problems and provide alternate ways for obtaining the referenced
resources. In general, consumers of MathML data containing references
they cannot resolve or do not understand should ignore them.

7.2.4 Examples

7.2.4.1 Example 1

An e-Learning application has a database of quiz questions, some of
which contain MathML. The MathML comes from multiple sources, and the
e-Learning application merely passes the data on for display, but does
not have sophisticated MathML analysis capabilities. Consequently,
the application is not aware whether a given MathML instance is pure
presentation or pure content markup, nor does it know whether the
instance is valid with respect to a particular version of the MathML schema. It therefore
places the following data formats on the clipboard:

7.2.4.3 Example 3

A schema-based content management system contains multiple MathML
representations of a collection of mathematical expressions, including mixed
markup from authors, pure content markup for interfacing to symbolic computation
engines, and pure presentation markup for print publication. Due to the system's
use of schemas, markup is stored with a namespace prefix.
The system therefore can transfer the following data:

7.2.4.4 Example 4

A similar content management system is web-based and delivers MathML
representations of mathematical expressions. The system is able to produce
presentation MathML, content MathML, TeX and pictures in PNG format.
In web-pages being browsed, it could produce a MathML fragment such as the following:

7.3 Combining MathML and Other Formats

Since MathML is most often generated by authoring tools, it is
particularly important that opening a MathML expression in an editor should
be easy to do and to implement. In many cases, it will be desirable for an
authoring tool to record some information about its internal state along
with a MathML expression, so that an author can pick up editing where he or
she left off. The following markup is proposed:

7.3.1 Mixing MathML and HTML

In order to fully integrate MathML into XHTML, it should be possible
not only to embed MathML in XHTML, as described in Section 7.1.1 Recognizing MathML in an XML Model,
but also to embed XHTML in MathML.
However, the problem of supporting XHTML in MathML presents many
difficulties. Therefore, at present, the MathML specification does not
permit any XHTML elements within a MathML expression, although this
may be subject to change in a future revision of MathML.

In most cases, XHTML elements (headings, paragraphs, lists, etc.)
either do not apply in mathematical contexts, or MathML already
provides equivalent or better functionality specifically tailored to
mathematical content (tables, mathematics style changes,
etc.). However, there are two notable exceptions, the XHTML anchor and
image elements. For this functionality, MathML relies on the general
XML linking and graphics mechanisms being developed by other W3C
Activities.

7.3.2 Linking

We wish to stop using XLink for links, since it seems unimplemented,
and to add the necessary
attributes to presentation elements.

Resolution

None recorded

MathML has no element that corresponds to the XHTML anchor element
a. In XHTML, anchors are used both to make links, and to
provide locations to which a link can be made. MathML, as an XML
application, defines links by the use of the mechanism described in
the W3C Recommendation "XML
Linking Language" [XLink].

A MathML element is designated as a link by the presence of the
attribute xlink:href. To use the attribute xlink:href, it is also necessary to declare the
appropriate namespace. Thus, a typical MathML link might look like:

MathML designates that almost all elements can be used as XML linking
elements. The only elements that cannot serve as linking elements are those
which exist primarily to
disambiguate other MathML constructs and in general do not correspond to
any part of a typical visual rendering. The full list of exceptional
elements that cannot be used as linking elements is given in the table
below.

MathML elements that cannot be linking elements

mprescripts

none

malignmark

maligngroup

Note that the XML Linking [XLink] and XML Pointer Language [XPointer] specifications also define how to link
into a MathML expression. Be aware, however, that such
links may or may not be properly interpreted in current software.

7.3.3 Images

The img element has no MathML
equivalent. The decision to omit a general mechanism for image
inclusion from MathML was based on several factors. However, the main
reason for not providing an image facility is that MathML takes great
pains to make the notational structure and mathematical content it
encodes easily available to processors, whereas information contained
in images is only available to a human reader looking at a visual
representation. Thus, for example, in the MathML paradigm, it would be
preferable to introduce new glyphs via the mglyph element which at a minimum identifies them
as glyphs, rather than simply including them as images.

7.3.4 MathML and Graphical Markup

Apart from the introduction of new glyphs, many of the situations
where one might be inclined to use an image amount to displaying
labeled diagrams. For example, knot diagrams, Venn diagrams, Dynkin
diagrams, Feynman diagrams and commutative diagrams all fall into this
category. As such, their content would be better encoded via some
combination of structured graphics and MathML markup. However, at the
time of this writing, it is beyond the scope of the W3C Math Activity
to define a markup language to encode such a general concept as
"labeled diagrams." (See http://www.w3.org/Math for
current W3C activity in mathematics and http://www.w3.org/Graphics
for the W3C graphics activity.)

One mechanism for embedding additional graphical content is via the
semantics element, as in the following example:

Here, the annotation-xml elements are used to indicate alternative
representations of the Content MathML depiction of the
intersection of two sets.
The first one is in the "Scalable Vector
Graphics" format [SVG1.1]
(see [XHTML-MathML-SVG] for the definition of an XHTML profile integrating MathML and SVG), the second one uses the
XHTML img element embedded as an XHTML fragment.
In this situation, a MathML processor can use any of these
representations for display, perhaps producing a graphical format
such as the image below.

Note that the semantics representation of this example is given
in the Content MathML markup, as the first child of the
semantics element. In this regard, it is the
representation most analogous to the alt attribute of the
img element in XHTML, and would likely be
the best choice for non-visual rendering.

7.4 Using CSS with MathML

When MathML is rendered in an environment
that supports [CSS21], controlling mathematics style properties with a CSS
stylesheet is obviously desirable.
MathML 2.0 has significantly redesigned the way presentation element
style properties are organized to facilitate better interaction
between MathML renderers and CSS style mechanisms. It introduces four
new mathematics style attributes with logical values. Roughly
speaking, these attributes can be viewed as the proper selectors for
CSS rules that affect MathML.

Controlling mathematics styling is not as simple as it might first appear
because mathematics styling and text styling are quite different in
character. In text, meaning is primarily carried by the relative
positioning of characters next to one another to form words. Thus,
although the font used to render text may impart nuances to the
meaning, transforming the typographic properties of the individual
characters leaves the meaning of text basically intact. By contrast,
in mathematical expressions, individual characters in specific
typefaces tend to function as atomic symbols. Thus, in the same
equation, a bold italic 'x' and a normal italic 'x' are almost always
intended to be two distinct symbols that mean different things. In
traditional usage, there are eight basic typographical categories
of symbols. These categories are described by mathematics style
attributes, primarily the mathvariant
attribute.

Text and mathematics layout also obviously differ in that
mathematics uses 2-dimensional layout. As a result, many of the style
parameters that affect mathematics layout have no textual analogs.
Even in cases where there are analogous properties, the sensible
values for these properties may not correspond. For example,
traditional mathematical typography usually uses italic fonts for
single character identifiers, and upright fonts for multicharacter
identifiers. In text, italicization does not usually depend on the
number of letters in a word. Thus although a font-slant property
makes sense for both mathematics and text, the natural default values
are quite different.

Because of the difference between text and mathematics styling, only the
styling aspects that do not affect layout are good candidates for CSS control.
MathML 3.0 captures the most important properties with the new mathematics style
attributes, and users should try to use them whenever possible over
more direct, but less robust, approaches. A sample CSS stylesheet
illustrating the use of the mathematical
style attributes is available in Appendix E Sample CSS Style Sheet for MathML.
Users should not count on MathML implementations to implement any other properties
than those in the Font, Colors, and Outlines families of properties described in
[CSS2] and implementations should only implement these properties
within MathML-elements.
Note that these prohibitions do not apply to CSS stylesheets
that implement the MathML-CSS profile. (TODO: quote).

TODO: add equivalence statements and conflict resolution and stress that CSS
changes should not be considered meaningful.

Generally speaking, the model for CSS interaction with the math
style attributes runs as follows. A CSS style sheet might provide a style
rule such as:

math *[mathsize="small"] {
  font-size: 80%;
}

This rule sets the CSS font-size property for all descendants of the
math element that have the mathsize attribute set to small.
A MathML renderer
would then query the style engine for the CSS environment, and use the
values returned as input to its own layout algorithms. MathML does
not specify the mechanism by which style information is inherited from
the environment. However, some suggested rendering rules for the
interaction between properties of the ambient style environment and
MathML-specific rendering rules are discussed in Section 3.2.2 Mathematics style attributes common to token
elements, and more generally throughout Chapter 3 Presentation Markup.

It should be stressed, however, that some caution is required in
writing CSS stylesheets for MathML. Because changing typographic
properties of mathematics symbols can change the meaning of an equation,
stylesheets should be written in a way such that changes to document-wide
typographic styles do not affect embedded MathML expressions. By
using the MathML 2.0 mathematics style attributes as selectors for CSS rules,
this danger is minimized.

Another pitfall to be avoided is using CSS to provide
typographic style information necessary to the proper understanding of
an expression.
Expressions dependent on CSS for meaning will not be
portable to non-CSS environments such as computer algebra systems. By
using the logical values of the new MathML 3.0 mathematics style attributes
as selectors for CSS rules, it can be assured that style information
necessary to the sense of an expression is encoded directly in the
MathML.

MathML 3.0 does not specify how a user agent should process style
information, because there are many non-CSS MathML environments, and
because different user agents and renderers have widely varying
degrees of access to CSS information. In general, however, developers
are urged to provide as much CSS support for MathML as possible.