Introduction to Groves and Property Sets

1. Introduction

This paper is an introduction to the grove paradigm. It serves
several purposes:

It is to be part of the documentation for grove-based software such
as pygrove and Jade/DSSSL.

It is intended to clarify in people's minds what the result of parsing
an SGML or XML document should look like. Some variation on the grove model
could be imagined, but the basics of the model seem fundamental and unavoidable
to me: for instance, W3C's DOM reflects the same basic concepts.

Groves were invented to solve the problems that had become apparent
at a particular point in the development of the SGML family of standards.
XML has reached the same point so the time is right to popularize the grove
idea.

Please send me your comments on this document. It will eventually become an
ISOGEN technical paper, but it is still rough.

2. Background

2.1 The problem

In the early and mid-1990s, the ISO groups that were responsible
for the SGML family of standards realized that they had a large problem.
The people working on the DSSSL and HyTime standards found that they
had slightly different ideas of the abstract structure of an SGML document.
Understanding an SGML document's structure is easy for simple things, but
there are many issues that are quite complex. For instance, it is not
clear whether comments should be available for a DSSSL spec. to work
on, or whether they should be addressable by hyperlinks. It isn't clear
whether it should be possible to address every character, or only
non-contiguous spans of characters. Should it be possible to address and
process tokens in an attribute value or only character spans? Should
it be possible to address markup declarations? XLink and XSL must solve all
of the same issues.

The problem is that XML is defined in terms of its syntax, just
as SGML was at first. Linking and processing are done in terms of some
data model, not in terms of syntax. When you make a link between
two elements, you are not linking in terms of the character positions
of the start- and end-tags in an SGML or XML entity. You are linking in terms
of abstract notions such as "element", "attributes" and "parse tree".
The role of an XML parser is to throw away the syntax and rebuild the
logical ("abstract") view. The role of a linking engine (such as a
web browser) is to make links in terms of that logical view. The role
of a stylesheet engine is to apply formatting in terms of that logical
view.

Unless stylesheet languages, text databases, formatting engines
and editors share a view, processing will be unreliable
and complicated. It is not very common for XML and SGML applications
and toolkits to provide all of the information necessary for building
many classes of sophisticated applications, such as editors. There is
not even a standardized way for a toolkit to express what information
from the SGML/XML document it will preserve. Even if two toolkits preserve
exactly the same information, it is quite possible that they use
different terminology to describe the information. In some cases, APIs
might be identical except for trivial wording!

A related problem is how to address components of data types other
than SGML and XML. For instance, how do you make a hyperlink to a
particular frame in an MPEG movie, or a particular note in a MIDI
sequence? How would you extract that information in a stylesheet (for
instance, for sequencing a multimedia hyperdocument)? It makes no sense
to address in terms of bytes, because often a single logical entity,
like a frame, is actually spread across several bytes and they may not
be contiguous. Addressing in terms of characters would make even less
sense because MPEG movies and MIDI sequences are not character based.
The web solves this problem by inventing a new "query language" (in the
form of extensions to URLs) for each data type. This more or less works,
but it leads to a proliferation of similar, but incompatible query
languages doing the same basic thing, but with different underlying
models.

2.2 The solution: groves

The grove concept is intended to solve these problems. You can decide
whether it succeeds for yourself.

The basic idea underlying groves is that notations like XML and
SGML exist only as a syntax for some underlying data model. It is well
understood, for instance, that SGML/XML elements form a tree. That is an
abstract data model. The sequence of characters in an XML document is
not literally a tree structure: it is the syntactic representation of
a tree. The actual tree exists only as an abstraction in the head of
the author, or in the data structures of the software that processes
the document. XML is only interesting because it helps us to serialize
these trees so that we can move them from place to place.
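
The difference between the syntax and the tree can be seen with any
XML parser. The following is a minimal sketch that uses Python's
standard xml.etree.ElementTree module, which is not a grove
implementation. The sample document and the shape() helper are invented
for illustration. Two different character sequences produce the same
abstract tree.

```python
import xml.etree.ElementTree as ET

# Two different character sequences: the attribute quoting and the
# empty-element syntax differ.
text_a = '<memo id="m1"><to>Ann</to><body/></memo>'
text_b = "<memo id='m1'><to>Ann</to><body></body></memo>"

def shape(elem):
    """Reduce an element to a nested tuple: (name, attributes, text, children)."""
    return (elem.tag, sorted(elem.attrib.items()), elem.text or "",
            [shape(child) for child in elem])

tree_a = shape(ET.fromstring(text_a))
tree_b = shape(ET.fromstring(text_b))

# The serializations differ, but the trees they stand for do not.
print(text_a == text_b, tree_a == tree_b)
```

Note that this parser has already thrown away markup details such as
the quoting style. A complete grove would be able to keep them.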

An XML document is much more than a simple tree. It has links built by
ID/IDREF attributes. If the document uses XPointer, these links can use
more complicated addressing mechanisms. Knowing about these links is
important for any tool that provides even primitive hypertext
facilities. An XML document also has logical relationships between
elements, element types and the declarations for those types. This
might be important for a DTD-editing application to track. Furthermore,
an XML document has markup-level details like ignorable whitespace and
comments. Knowing about these details is important for editing tools.
As you can see, the abstract data model for an XML document is actually
quite intricate and complicated. It must address relationships, details
of markup, and information about both the physical and logical makeup
of the document!

The problem is that XML's data model is implicit. The XML
specification alludes to it, but does not describe it. Surprisingly, the
XML specification requires a processor to pass all whitespace to the
"application", but does not require the processor to pass anything else
on! This is not because whitespace is the only important thing: rather
it is because everything else is underspecified. SGML was in the same
sorry state of underspecification before the grove was developed. The
grove solves this by providing a language for describing XML's abstract
data model in a rigorous and complete way. You can think of the grove as
a meta-"data model" for media applications. It is a data model for
building data models. Or you could think of it as providing the
low-level primitives for building high-level, media-specific models.

Groves are usually, but not always, tied to particular media types
("notations"). So groves for CGM documents would look fairly different
from groves for XML documents. The terminology and semantics of these
two specifications are quite different, so you would expect their APIs
and query languages to also be fairly different. The grove model is
designed to allow them to be exactly as different as they need to be,
and no more! What that means is that the basic concepts are the same,
but that every media type ("notation") defines its own vocabulary of
"properties" in terms of the basic concepts underlying the notation. We
call these vocabularies "property sets." In a grove-based view of the
world, an XML document is a collection of hundreds of properties, all
drawn from the XML property set. This is analogous to the way that a
valid XML document is a collection of element types, all drawn from some
DTD.

All properties are held in containers called nodes. Nodes represent
everything in an XML document: elements, attributes, every significant
character, all insignificant whitespace, etc. Groves are so complete that
given a complete implementation (GroveMinder is almost complete, Jade is
not) HyTime can make a hypertext link to the keyword "#REQUIRED" in
an attribute list declaration and DSSSL stylesheets can (in theory)
vary their formatting based on the amount of whitespace between attribute values
in a start-tag. Of course nobody is likely to go that far, but the point
is that all of that information is available and addressable. For someone
creating (for example) an XML editing or maintenance system, these issues
could be important. Consider the case where the only difference between a
checked-in document and the version in the archive is insignificant whitespace.
A smart repository might choose not to increment the version number.

System designers can ask a grove builder to trim nodes that they do not
need from the grove using a "grove plan". This means that your applications
do not need to keep track of all of that information if you are not using
it. Limited grove builders (such as the free Jade software) can describe
their capabilities in terms of grove plans. Two products that claim to
support the same grove plan should build identical groves for a particular
document.

As an example: one sort of node in an XML document would be an
"Element" node. Examples of its properties include its generic
identifier (type name), attributes and content. Examples of information that most
applications would "trim" out of the grove might include the markup
details of the elements' start- and end-tags, entity starts and ends and any ignored whitespace
at the start and end of the element.

Groves can also contain so-called "emergent properties." An
emergent property is one that is not directly obvious from the
syntax, but emerges when a document is processed as a whole. For
example, the list of elements with unique IDs in a document is only
available when the document has been completely processed. In the
SGML property set, this list is stored in a property called "elements"
on the "SGML Document" node. An emergent property is any property
that is computed based on information from various parts of the
document.
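
The ID list can be sketched in a few lines. This is only an
illustration built on Python's standard xml.etree.ElementTree module,
not on a grove builder. The sample document, the elements_by_id()
helper and the decision to treat any attribute named "id" as an ID are
all assumptions of the sketch. A real parser learns which attributes
are IDs from the DTD.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<book><chapter id="c1"><para id="p1">Hi</para></chapter>'
    '<chapter><para id="p2">Bye</para></chapter></book>'
)

def elements_by_id(root, id_attr="id"):
    """Index every element that carries an ID, keyed by that ID."""
    index = {}
    for elem in root.iter():      # the whole document must be visited
        if id_attr in elem.attrib:
            index[elem.attrib[id_attr]] = elem
    return index

# The index does not exist anywhere in the syntax. It emerges only
# after the complete document has been processed.
elements = elements_by_id(doc)
print(sorted(elements))
```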

Another emergent property would be the logical relationship between
elements and their element type nodes. If your grove plan contains DTD
information, then you can ask an element for its "element type" and get
back a node with information about the content model of the element
type, allowed attributes, tag omissibility and so forth. Note that even
the relationship between element types and their allowed attributes is
an emergent property because in SGML/XML, attributes are specified
separately from the elements that they apply to.

Property sets are defined in documents that conform to
the "propset" DTD. You can think of these documents as simple
schemas for property sets. They can specify that particular
properties must contain particular types of values (integers,
strings, nodes, lists of nodes). They can specify that
some properties are so-called "sub-nodal" properties, which
means that in the logical tree, the node with the property
logically "owns" the node that is the value of the
property. For example, elements in the SGML property set
have a "subnodal" property called "attributes." This means
that elements "own" attributes.
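
To make the schema idea concrete, here is a minimal sketch of a class
definition held as data. It does not use the propset DTD. PropertyDef,
check() and the abbreviated ELEMENT definition are hypothetical, and
Python types stand in for the declared value types.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PropertyDef:
    name: str
    value_type: type        # Python stand-in for the declared value type
    subnode: bool = False   # True when the owner "owns" the value

# Hypothetical, heavily abbreviated definition of an element class.
ELEMENT = [
    PropertyDef("gi", str),
    PropertyDef("attributes", list, subnode=True),
    PropertyDef("content", list, subnode=True),
]

def check(values, class_def):
    """Return the names of properties that are missing or mistyped."""
    problems = []
    for prop in class_def:
        if prop.name not in values:
            problems.append(prop.name)
        elif not isinstance(values[prop.name], prop.value_type):
            problems.append(prop.name)
    return problems

good = {"gi": "para", "attributes": [], "content": []}
bad = {"gi": 42, "attributes": []}

# The properties through which an element owns other nodes.
owned = [prop.name for prop in ELEMENT if prop.subnode]
print(check(good, ELEMENT), check(bad, ELEMENT), owned)
```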

The property set definition
language is defined in Annex A of the HyTime specification.
As a schema language, it is not as powerful as those used
for databases (SQL) or CAD systems (such as EXPRESS), but it is
sufficient for our current needs. In the future it may
be augmented or replaced by something like EXPRESS.

2.3 Examples of property sets

The most important existing property set is the SGML property
set. Although the SGML property set should theoretically be specified
in the SGML standard, it is actually described in the HyTime
standard. This is due to the fact that the SGML standard
predates the grove concept! Nevertheless, the SGML property set is
quite complete, robust and well thought out. The SGML property
set can be used for XML documents, but it may sometimes be worth
creating an XML property set in terms of the terminology and feature
set of the XML specification.

Another important property set is the HyTime property set.
This provides a data model for the HyTime links in a document. It
should be possible to use the HyTime property set to describe XLink
links. SGML groves are built from single documents, but HyTime
groves are built from "hyperdocuments" constructed from many
interlinked SGML documents. Nodes in the HyTime grove point back
to the SGML document that contained the HyTime construct.

Other property sets include the Plain Text property set,
which has only two classes, one for plaintext documents and
one for each data character, and the Data Tokenizer property
set, which can be used to break text up into tokens by
separating on whitespace or other characters.
These property sets are ISO standards. It is quite
likely that other ISO standards will
incorporate property sets in the future. Work is underway to
create property sets for EXPRESS and STEP data and for the
Computer Graphics Metafile (CGM).

ISO is not the only organization that can create property
sets. Private individuals such as Sam Hunting and corporations
such as TechnoTeacher have also created property sets for everything
from schema languages to contract law.

3. Nodes, Groves and Property Sets

3.1 Nodes and their properties

The two most fundamental concepts in the property set paradigm
are nodes and properties. Technically speaking, a node is just an
"ordered set of properties, representing a single object." In
a grove conforming to the SGML property set, examples of objects
include elements, start-tags, generic identifiers or anything else in
the property set. In a HyTime grove, examples of objects include
links, anchors, hyperdocuments etc.

A property is a combination of a name and a value. Conceptually,
this is similar to an attribute in XML or SGML. It is important not
to confuse properties with XML attributes, however. The concepts are
similar because the basic idea of name/value pairs is fundamental
to information: think of a phone book or dictionary. Objects in
object-oriented programming languages are also sets of name/value
pairs. But the
properties in an SGML grove are defined in the SGML property set,
not in some particular DTD.

Every node conforms to some node class. Node classes are defined
in the property set. All nodes of the same class have exactly the
same properties, in exactly the same order. The class also restricts
the possible types of each property.

Properties on nodes appear in the same order as their definition
in the property set. Even if an implementation uses an unordered storage model
(e.g. a dictionary or hashtable), it can present an ordered view of
the properties by checking for the correct order in the property set.
In the SGML property set, the first property of
an element is its generic identifier (element type name). The second
property is its unique identifier (ID), the third property is
attributes. By convention, these properties are ordered so that the
most commonly used, widely understood properties are first and the
less commonly used or understood properties are later. Thanks to this
organization, and to places in the propset DTD that allow commentary, a
property set definition document can serve as documentation for
the property set.
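
The point about unordered storage can be sketched as follows. The
property names and the ordered_properties() helper are illustrative
assumptions, not part of any grove API. The values sit in a dictionary
in arbitrary order and the declared order is consulted only when an
ordered view is needed.

```python
# Property order as declared in a (hypothetical, abbreviated) property set.
ELEMENT_PROPERTY_ORDER = ["gi", "id", "attributes", "content"]

# An implementation is free to store the values in any order.
stored = {"content": [], "attributes": {}, "gi": "para", "id": None}

def ordered_properties(values, declared_order):
    """Yield (name, value) pairs in property-set order, not storage order."""
    for name in declared_order:
        yield name, values[name]

view = list(ordered_properties(stored, ELEMENT_PROPERTY_ORDER))
print([name for name, _ in view])
```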

3.2 Reading a property set

The easiest way to read a property set is with the output of
TechnoTeacher's PropGrinder program. This program reads a property set and
generates interlinked HTML pages for the various parts. The propset DTD is
very compact and uses very short element type and attribute names.
The PropGrinder program turns this terse document into something more
readable and navigable.

As you read this document, you should follow along with
PropGrinder's description of a simplified SGML property set at
the HyTime Users' Group Web Site. Of course, you need to open that
document in a separate window.

The so-called "SGML-ESIS" set is simplified in that it only supports the
most commonly used parts of the property set (often termed
the "ESIS"). It is simplified through a "grove plan." We will
discuss grove plans more later.

3.3 Property datatypes

A node class in a property set acts as a schema for nodes of
that class. Every node must have a value for every property declared
in its class. We say that the node "exhibits" that property. It is
possible for it to exhibit a "null" value, just as in relational
database or object oriented theory, but it must always have the property nevertheless.
As an interesting diversion, this comparison to relational
databases provides a hint of an implementation strategy for
property sets. Each class can have a table with as many columns
as there are properties.
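
A minimal sketch of that strategy, using Python's standard sqlite3
module. The table layout and column names are invented for
illustration and are not taken from the SGML property set. Every row
has every column, and a property that exhibits a null value is stored
as NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table per node class, one column per property.
conn.execute(
    "CREATE TABLE element ("
    " node_id INTEGER PRIMARY KEY,"
    " gi TEXT,"
    " unique_id TEXT,"      # NULL when the element has no ID
    " origin INTEGER)"      # NULL when the node has no origin
)
conn.execute("INSERT INTO element VALUES (1, 'book', NULL, NULL)")
conn.execute("INSERT INTO element VALUES (2, 'para', 'p1', 1)")

rows = conn.execute(
    "SELECT gi, unique_id FROM element ORDER BY node_id"
).fetchall()
print(rows)
```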

An example of a node class is the Element class. It will help
you to understand what follows if you look at the PropGrinder output
for this class while you read. To get there, click on the link
called "Classes" and then click on the class called "Element".
Once you are there, you will see a display dedicated to information
about the element node class. At the top is information about the
node class. If you have frames turned on, the bottom left lists the
node class's properties. The bottom right
displays information about a particular property.

Just as the columns of a relational table have particular types
(string, number, date, etc.), property values have types. The list of
these types is in section A.4.1.1 of the HyTime specification. Some of
them are very simple: "char" is a character. The "char" property of the
"datachar" node is an example of a character property. Character set
issues are beyond the scope of this tutorial. "String" is an ordered
list of zero or more characters and "strlist" is an ordered list of
zero or more strings.

The element's "GI" (generic identifier) property is a
string property. Click on the string "GI" in the bottom left to see
information about the property in the bottom right. PropGrinder
describes the full name of the property: "Generic identifier". It
also says that the property is in the default grove plan which
means that it should be provided by any grove builder that has not
been asked not to provide it. Because it is a string, it gets a
"string normalization rule" which basically describes how strings
are normalized by the parser.

Its "verify type" is "other." The
HyTime specification has this to say about the "verify type":
"The attribute verify type (vrfytype) is used by the DSSSL
transformation language. It is fully described in the
DSSSL standard." The verify type is otherwise beyond the scope
of this tutorial. Finally, the property set tells us what clause of
the SGML specification defines the concept of a GI (clause 7.8,
paragraph 1) and gives us a short description of the property:
"Generic identifier (element type name) of element."

Other types of properties include "integer", for integral numbers,
"intlist", for ordered lists of integers and "boolean", for
true/false values.

A slightly more complex
type of value is an "enum" or enumeration. This is similar to an
enumeration in a programming language or in an SGML/XML attribute
value. The property set designer can specify a list of named, valid
values for the property. For instance for SGML entity types, the
enumerated values are "text", "cdata", "ndata", "sdata", "subdoc"
and "pi". You can see this by looking at the "Entity Type" property
of the "Entity" class. To do this, click on "Classes" (at the top),
then "Entity" (in the list) and then "Entity Type" (in the properties
list in the bottom left corner).
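
In a programming language the same idea might look like this. The
sketch uses Python's standard enum module with the six values listed
above. The class and function names are hypothetical.

```python
from enum import Enum

class EntityType(Enum):
    """The enumerated entity types listed in the text."""
    TEXT = "text"
    CDATA = "cdata"
    NDATA = "ndata"
    SDATA = "sdata"
    SUBDOC = "subdoc"
    PI = "pi"

def parse_entity_type(value):
    # Lookup by value raises ValueError for anything not declared.
    return EntityType(value)

kind = parse_entity_type("sdata")
print(kind.name)
```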

There are also node value types called "component name" and
"component name lists." A component name represents the name of
some grove property, class name or enumerated value. Component names are not just
strings, although some grove-based APIs may treat them as strings.
You can think of component names as strings that are known to
the grove processor "in advance" because they are from the property
set. Ordinary strings are not known in advance. The grove makes the
distinction because strings that are known in advance may be
"compiled" (or, more formally, "interned") into integers and
referred to more efficiently.
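
One way to picture this "compiling" is a table that maps each name from
the property set to a small integer. The ComponentNames class below is
a hypothetical sketch, not part of any grove API.

```python
class ComponentNames:
    """Map names known from the property set to small integers."""

    def __init__(self, names):
        self._name = list(names)
        self._code = {name: i for i, name in enumerate(self._name)}

    def intern(self, name):
        # Only names declared in advance are accepted. An arbitrary
        # string raises KeyError.
        return self._code[name]

    def name_of(self, code):
        return self._name[code]

names = ComponentNames(["gi", "id", "attributes", "content"])
code = names.intern("content")
print(code, names.name_of(code))
```

After interning, comparing two component names is an integer
comparison rather than a string comparison.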

Another type of value is a nodal value. This is conceptually
a "pointer to" or "reference to" another node. You can also have lists
of these references to nodes called "node lists." For instance, HyTime
links would point to other nodes through nodal values. In fact, the
entire grove is constructed through nodal (and node list) values. In the SGML property
set, the "SGML Document Node" has a property called "governing doctype"
that refers to the DTD that is in use. The type of that property is
"document type." It also has a property that refers to the root element
of the document, called the "document element" property. The class of this
element is just "element". All element nodes (including the document
element) point to their children (elements, characters, etc.) through a
node list property called "content". This continues down
to every data character in the content of an element.

You can follow this path by starting at the SGML document node.
Do this by clicking on the "Classes" link at the top and "SgmlDocument"
in the class list. You can see the GoverningDoctype and DocumentElement
properties I described above. If you click on the DocumentElement
property, you will see that it allows nodes of type Element. If you
click on the word Element, the display will change to present information
about the Element class. It should look familiar. We've been here before.
From there, you can drill down into the valid
content for an element by clicking on the Content property in the bottom
left corner. From there you can see that the Content property is a node list which
allows "Data Char" nodes, along with elements, external data, processing
instructions and sdata.

Attributes are slightly different. The attributes property of
Element nodes is of type named node list. A named node
list is like a node list in that it is an ordered list of nodes. But it
is more than an ordinary node list. Each
node in the named node list is assigned a name based on some property of
the node. For example, the attributes property of element nodes can
contain only attribute assignment nodes. Each of these nodes must have
a "name" property.

You can see that this property is the "name property"
of the "attributes" named node list. First go to the Element class
page as we did before. Then click on the word "Attributes"
in the properties list for the class "element." In the bottom
right hand corner, there should be a box labelled "Allowed Classes"
with a small table (with perhaps only one item in it). On the left side
is the name of a class that is allowed, and on the right side is the
"name property" of that class. The word "name" should appear because
"name" happens to be the "name property" for the "attribute assignment"
nodes in the "attributes" named node list of the "element" node. Click
on the word "name" to see the "attribute assignment" class description.
As a shortcut, we can say that the attributes named node list is "indexed"
by the attributes' names. This is another hint of an implementation strategy.
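
A named node list can be sketched as a list paired with an index. The
NamedNodeList class below is hypothetical and nodes are plain
dictionaries for brevity.

```python
class NamedNodeList:
    """An ordered list of nodes that can also be looked up by name.

    The name of each node is the value of one of its properties,
    the "name property" of the list.
    """

    def __init__(self, name_property, nodes=()):
        self._name_property = name_property
        self._nodes = []
        self._by_name = {}
        for node in nodes:
            self.append(node)

    def append(self, node):
        self._by_name[node[self._name_property]] = node
        self._nodes.append(node)

    def __getitem__(self, key):
        # Integers index by position, strings by name.
        if isinstance(key, int):
            return self._nodes[key]
        return self._by_name[key]

    def __len__(self):
        return len(self._nodes)

attributes = NamedNodeList("name", [
    {"name": "id", "value": "p1"},
    {"name": "lang", "value": "en"},
])
print(attributes[0]["value"], attributes["lang"]["value"])
```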

As you can see, the grove can be thought of as a sort of "parse
tree" with nodes containing nodes. But not every property expresses a
logical "contains" relationship. Some properties express other sorts of
relationships. For instance, an entity reference node must point to its
entity definition node and similarly an element node must point to its
element type node. There is no logical container relationship there. We
refer to properties that express a container relationship as "sub-nodal"
properties. We refer to properties that express any other relationship
as "internal reference" ("irefnode") or "unrestrained reference"
("urefnode") properties. A node referenced through an irefnode property
must be in the same grove as the referencing (property-containing) node.
Nodes referenced through a urefnode property may be in the same grove or
another one.

It is quite common to want to do something with each node in a
grove. For instance you might want a printed representation of a grove,
or you might want to store it in a database. It is easy to visit each
node because every node except the grove's root node occurs once and
only once as a subnode of some other node. The other node is referred
to as the subnode's origin. Every node has an "origin" property that
allows us to find the node's origin. The grove root's "origin" property
is null.
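
Visiting every node can be sketched as a recursive walk over sub-nodal
properties only. The Node class, the walk() function and the
abbreviated class and property names are assumptions made for this
illustration.

```python
class Node:
    def __init__(self, class_name, subnode_names=(), **props):
        self.class_name = class_name
        self.subnode_names = list(subnode_names)  # sub-nodal property names
        self.props = props
        self.origin = None                        # null for the grove root
        for name in self.subnode_names:
            value = props[name]
            for sub in (value if isinstance(value, list) else [value]):
                sub.origin = self

def walk(node):
    """Yield a node and then every node beneath it, each exactly once."""
    yield node
    for name in node.subnode_names:
        value = node.props[name]
        for sub in (value if isinstance(value, list) else [value]):
            yield from walk(sub)

char = Node("datachar", char="x")
attr = Node("attasgn", name="id", value="p1")
para = Node("element", ["attributes", "content"],
            gi="para", attributes=[attr], content=[char])
root = Node("sgmldoc", ["docelem"], docelem=para)

visited = [node.class_name for node in walk(root)]
print(visited)
```

Reference properties are never followed, so the walk cannot loop or
visit a node twice.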

"Origin" is the first property we have discussed that is common to
every single grove, no matter what its property set. These common
properties are called "intrinsic properties." We will discuss these
later.

3.4 Children, Data and Content properties

In many media types, there is a distinction between content data and
metadata. Arguably, in SGML and XML, the DTD is metadata, attributes are
metadata and character data is "real" data. The property set paradigm
allows the property set definer to specify the difference between
metadata and data. Every node is allowed to have one property that it
distinguishes as being its "children" property. The name of the property
does not have to be "children". For SGML/XML elements, it is just
"content." The children property (whatever its name) is considered to be
the data. All other properties are not. Presumably they are metadata.
There is only one children property allowed per node.

Imagine we are writing software that works with any sort of grove.
We might want to know what a node's children property is. One way to
do that would be to read the property set document (it is an SGML
document, after all). That is rather a hassle, though.
It is possible to ask a node what its children property's name is,
rather than reading the property set document. Every node exhibits a
component name-valued property called "Children Property Name". If the
node class has a children property, the value of the children property
name property must be the name of some other property that the class
exhibits. For instance, element nodes would have a "children property
name" property with a value of "content" because the element's content
property is its children property.

Intuitively, this corresponds to the tree that you would draw of
an SGML document: the logical parse tree. It includes elements and
their contents, but not things such as attributes, the DTD, start- and
end-tags, etc. This tree also corresponds to the logical tree that
almost every existing stylesheet language works upon. By default,
stylesheet languages typically look at each element and output its data
content, ignoring its attributes. They make the same distinction between
metadata and content data.

Of course, any particular DTD designer might encode something she
logically considers metadata as an element instead of an attribute, but
she will find that the grove and most SGML/XML software (whether
grove-based or not) will not really support her in that decision.
SGML/XML practitioners have various techniques for working around this:
for instance, if your application does not need the distinction between
data and metadata, it can treat ordinary subnode properties and children
properties the same.

A similar, related property is the data property. Some nodes carry
content data but have no subnodes. The node might carry its data in
a character or string property instead of in subnodes. Obviously a
data character node cannot have some other node as its subnode and
so on infinitely! Instead of having a children property, these nodes
have a data property. The data property is specified with a data
property name intrinsic property. Every node class can have either a
data property or a children property, but not both. Either one can be
referred to as the content property. For instance, the notation class
does not have either, so it has no content property. On the other hand,
the Element
class has "Content" as its content property. You can see this by looking
at the class description for the Element class output by PropGrinder.

Using the content property, it is possible to extract the data
of a node automatically. The data of a node with a data property is the
value of that property. Of course this data can only be a character or
string. The data of a node with a children property is the concatenation
of the data produced by each of the node's children, perhaps separated
by a data separator. A data separator is specified through a
Data Separator property which is found based on a
Data Separator Property Name property.
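
The extraction rule can be sketched as a short recursive function. The
Node class, its field names and the sample "sep" property are
hypothetical. Only the rule itself comes from the description above.

```python
class Node:
    def __init__(self, children_prop=None, data_prop=None,
                 data_sep_prop=None, **props):
        # A class may name a children property or a data property,
        # but not both.
        assert children_prop is None or data_prop is None
        self.children_prop = children_prop
        self.data_prop = data_prop
        self.data_sep_prop = data_sep_prop
        self.props = props

def data_of(node):
    """Return the data of a node as a string."""
    if node.data_prop is not None:
        return node.props[node.data_prop]
    if node.children_prop is not None:
        sep = ""
        if node.data_sep_prop is not None:
            sep = node.props[node.data_sep_prop] or ""
        children = node.props[node.children_prop]
        return sep.join(data_of(child) for child in children)
    return ""   # no content property, so no data

def chars(text):
    """Build one data character node per character."""
    return [Node(data_prop="char", char=c) for c in text]

word1 = Node(children_prop="content", content=chars("Hi"))
word2 = Node(children_prop="content", content=chars("you"))
line = Node(children_prop="content", data_sep_prop="sep",
            sep=" ", content=[word1, word2])

print(data_of(line))
```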

3.5 Intrinsic Properties

There are certain properties that every node exhibits,
no matter what its class or its property set. These are called
the intrinsic properties. We do not repetitively define these
in a property set definition document, so you will not find
them in the definitions for every node.

We have already discussed the Children Property Name,
Data Property Name and Data Separator Property Name properties. These say which property
(if any) should be considered to be the content property
of a node and if the node has children, what separator should be
placed between the children's
data when they are concatenated. Since a node can have at most one
of these "content" properties, either the children or data
property is required to be null, and in some cases they may
both be null. These
properties are provided to make the property set "reflexive".
The goal is to be able to learn things about a node's definition
from its properties. The data separator property works with
the data property name to allow us to properly construct the data
of the node.

Another important intrinsic property is All Property Names.
This is also a property that reflects the node's definition.
By asking a node for all of its property names, we can
write software that can work with groves without knowing
about particular property sets. For example, an application could
store nodes in a database or transmit them over the Web without
reading their property set definition document or having hard-coded
information about them. A similar property is Subnode Property
Names which is a list of the names of all of the subnode
properties of the node.
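
As a sketch of such generic software, the function below turns any
node into a plain record without knowing its class. Nodes are
dictionaries here, the key "all_property_names" is a hypothetical
spelling of the intrinsic property, and every list value is assumed to
be a node list.

```python
import json

def to_record(node):
    """Turn a node into a plain dict using only its own property names."""
    record = {}
    for name in node["all_property_names"]:
        value = node[name]
        if isinstance(value, list):   # assumed to be a node list
            value = [to_record(sub) for sub in value]
        record[name] = value
    return record

char = {"all_property_names": ["char"], "char": "x"}
elem = {"all_property_names": ["gi", "content"],
        "gi": "para", "content": [char]}

# The record can now be stored or transmitted as text.
text = json.dumps(to_record(elem))
print(text)
```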

Also related to reflexiveness is the Class Name property.
Given a node, it is possible to ask it for its class name and
base special processing on that name. For example, a class name
could be used to look up the documentation for a class in the
property set definition document.

Other intrinsic properties help with navigation around the
grove. Every node has a Grove Root property that describes how
to get to the root node of the current grove. Of course you could
find this property in code by looking at the origin of the node,
and the origin of its origin, and so on. Eventually when you find
a node with a null origin, you have found the grove root. The grove
root property makes this easier, however.

Another navigational property is the Origin to Subnode
Relationship Property Name property. Every node in the grove except
the root has an origin. The node must be referred to in some
sub-nodal property of its origin. The name of that property is the
value of the Origin to Subnode Relationship Property Name property. This
property allows you to navigate up to the origin node and then
back down to the subnode. If the Origin to Subnode Relationship
is a children property, then we say that the subnode is a child
and the origin node is a parent. If a node is a child, we can
get its parent node with the intrinsic Parent property.

The final intrinsic property is the Tree Root property. This is
equivalent to walking up from node to node as long as each
node has a parent. The node that has no parent is the tree
root. It might not be the grove root, however. If the node
has an origin, then it is not the grove root, but
if the origin is not a parent then the node is a tree root.
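
The navigational properties can be sketched as three small functions
over the origin links. The Node class and its field names are
hypothetical. Treating the document element as a subnode but not a
child of the document node is a modelling choice of this sketch.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.origin = None          # owning node, None for the grove root
        self.origin_rel = None      # name of the owning sub-nodal property
        self.children_prop = None   # name of the children property, if any
        self.props = {}

    def own(self, prop, subs, children=False):
        """Attach subs under the sub-nodal property prop."""
        self.props[prop] = subs
        if children:
            self.children_prop = prop
        for sub in subs:
            sub.origin, sub.origin_rel = self, prop

def grove_root(node):
    while node.origin is not None:
        node = node.origin
    return node

def parent(node):
    """The origin, but only if the node sits in its children property."""
    origin = node.origin
    if origin is not None and node.origin_rel == origin.children_prop:
        return origin
    return None

def tree_root(node):
    while parent(node) is not None:
        node = parent(node)
    return node

doc, elem, attr, char = Node("doc"), Node("elem"), Node("attr"), Node("char")
doc.own("docelem", [elem])                   # subnode, not a child
elem.own("attributes", [attr])               # subnode, not a child
elem.own("content", [char], children=True)   # child

print(grove_root(char).name, tree_root(char).name, tree_root(attr).name)
```

The attribute node has an origin but no parent, so it is the root of
its own tree without being the grove root.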

4. Grove Plans and Modules

The HyTime specification defines a grove plan as "a specification
of what modules, classes, and properties to include in a grove. Grove
plans are used both to construct groves and to view existing groves."
Grove plans are defined through "grove plan" elements. These elements are
defined in section 7.1.4.2 of the HyTime specification. They can include
or exclude classes and properties.

Sets of property set components may be grouped together into modules so
that the grove plan element can include or exclude them all at once.
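
As a rough illustration of the idea, a grove plan can be thought of as a
filter over the properties a node exposes. The sketch below is not the
HyTime grove plan syntax; the module names, the property groupings and the
helper functions are all hypothetical and chosen only to show inclusion by
module and exclusion of individual properties.

```python
# Hypothetical module names mapped to the properties they group together.
MODULES = {
    "core": {"gi", "content", "attributes"},
    "dtd-info": {"element type"},
}


def plan_properties(include_modules, exclude_properties=()):
    """Collect the properties of the included modules, minus exclusions."""
    selected = set()
    for module in include_modules:
        selected |= MODULES[module]
    return selected - set(exclude_properties)


def view(node_properties, plan):
    """Return only those properties of a node that the plan keeps."""
    return {k: v for k, v in node_properties.items() if k in plan}


element = {
    "gi": "PARA",
    "content": ["Hello"],
    "attributes": {},
    "element type": "PARA declaration",
}

plan = plan_properties(["core"])
print(sorted(view(element, plan)))  # ['attributes', 'content', 'gi']
```

Because the "dtd-info" module is left out of the plan, the element type
property is simply not visible in the resulting view.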

5. The SGML Property Set

The SGML property set defines the results of an SGML
parse. It serves as a data model for hypertext linking and
as a basis for SGML creation and management APIs. This section
will describe the major node types in the SGML property set.

5.1 The SGML Document Node

The root node of an SGML grove is always an SGML document
node. SGML Document nodes never occur as subnodes of other nodes.
The SGML document node has among its subnodes the document element, the
DTD and various sorts of document-global information. Some important
properties of the SGML document node are:

governing doctype node property

This document type node contains all of the DTD information.

document element node property

This element node contains all of the document's content.

elements node property

This named node list contains all of the elements in the document with IDs, indexed by their IDs.

entities named node list

This lists all of the document's entities, indexed by their names.

5.2 Element Nodes

Most SGML processing is controlled by elements and their generic
identifiers (element type names). Element nodes can occur as the document
element or in the content of some other element.

gi property

This string is the name of the element's type. If there is no minimization, it appears in the element's start-
and end-tags.

unique identifier

This string is the unique identifier of the element. It is used as the name property in the SGML document node's elements
property.

attributes property

This named subnode list property contains attribute assignment nodes indexed by their names.

content property

This subnode list contains all of the content of the element. That might include data characters,
processing instructions, comments, other elements and various miscellany and esoterica.

element type property

If the grove plan includes DTD information, this IREF node property
points to an element type node with information about the content model,
allowed attributes and tag omissibility information relevant to the element.

5.3 Attribute Assignment Nodes

Attributes are very common in SGML/XML documents. They allow authors to attach
extra information (strings) to elements. The nodes that represent attribute assignments
are different from those that represent attribute definitions (in the DTD). Here are
some of the important properties of attribute assignment nodes:

name

This string property contains the attribute assignment's name. The attributes are indexed by this name
in an element's attributes node list.

value

This subnode list property contains either tokens or characters, depending on whether the attribute
value is one of the tokenized types (i.e. not CDATA). If the attribute is implied (not specified and
not defaulted) then the value is null. Because value is the content property of the
attribute assignment node, you can get the string value of the node by asking for
the node's "data". In Python, that would look like attassignNode.data()

implied

This boolean property is true if the attribute value was implied (not specified and not defaulted) and false
otherwise.

token separator

This character property describes what character was used to separate tokens from each other. This is
almost always the space character, but could be something else in a variant concrete syntax (SGML esoterica).

5.3.1 Attribute Value Token Node

Attribute value token nodes represent the value of tokenized attributes. If you want to
work with each token of a tokenized attribute value individually, you should use attribute value
tokens instead of asking for the data of the attribute assignment.

token

This string property has the text of the token. It is the content property of the token,
which is why it is included when you ask for the data of the attribute assignment node.
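
The two ways of looking at a tokenized attribute value can be sketched as
follows. The Token and AttributeAssignment classes are hypothetical
stand-ins, not the pygrove classes, and the sketch assumes that the data of
the assignment is the tokens' text joined by the token separator.

```python
class Token:
    """Hypothetical attribute value token node."""

    def __init__(self, token):
        self.token = token  # the token's text (its content property)


class AttributeAssignment:
    """Hypothetical attribute assignment node with a tokenized value."""

    def __init__(self, name, tokens, token_separator=" "):
        self.name = name
        self.value = [Token(t) for t in tokens]
        self.token_separator = token_separator

    def data(self):
        """Join the text of all tokens with the token separator."""
        return self.token_separator.join(t.token for t in self.value)


refs = AttributeAssignment("REFS", ["FIG1", "FIG2", "TAB3"])

print(refs.data())                    # FIG1 FIG2 TAB3
print([t.token for t in refs.value])  # ['FIG1', 'FIG2', 'TAB3']
```

Asking for the data gives one string that would have to be split again,
while iterating over the value gives each token as a node of its own.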

5.4 Grove navigation how-to

You can get to the grove root (the SGML document node, in an SGML grove) by asking
for the grove root property of any node in the grove. In Python this would look like

root=node.GroveRoot

You can find an element with a certain ID by asking for the elements property
of the SGML document node. In Python, given any node, you could get the element
with the ID FOO like this:

node.GroveRoot.Elements["FOO"]

For instance, to resolve an IDREF attribute named REF:

element.GroveRoot.Elements[element.Attributes["REF"].data()]

You can find an entity by name using the same procedure as for elements,
but use the "Entities" property of the SGML Document node instead of the
"Elements" property:

element.GroveRoot.Entities[element.Attributes["ENT"].data()]
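
To try the ID lookup without a real grove implementation, the following
self-contained sketch builds a tiny in-memory model that mimics the property
names used in the snippets above (GroveRoot, Elements, Attributes, data).
The classes are hypothetical and are not pygrove's; the sketch also assumes,
purely for simplicity, that the ID attribute is always named ID.

```python
class Att:
    """Hypothetical attribute assignment node holding a plain string."""

    def __init__(self, text):
        self._text = text

    def data(self):
        return self._text


class Element:
    """Hypothetical element node."""

    def __init__(self, gi, attributes=None):
        self.GI = gi
        self.Attributes = {n: Att(v) for n, v in (attributes or {}).items()}
        self.GroveRoot = None  # set when the element joins a document


class Document:
    """Hypothetical SGML document node that indexes elements by ID."""

    def __init__(self, elements):
        self.GroveRoot = self
        self.Elements = {}
        for element in elements:
            element.GroveRoot = self
            # Assumption: the ID attribute is named "ID" in this toy model.
            if "ID" in element.Attributes:
                self.Elements[element.Attributes["ID"].data()] = element


fig = Element("FIGURE", {"ID": "FIG1"})
xref = Element("XREF", {"REF": "FIG1"})
doc = Document([fig, xref])

# Resolve the IDREF attribute named REF, starting from the referring element.
target = xref.GroveRoot.Elements[xref.Attributes["REF"].data()]
print(target.GI)  # FIGURE
```

An entity lookup would follow the same pattern with an Entities mapping on
the document node.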

5.5 More Information

The definition of the grove paradigm is the Property Set Definition
Requirements annex of the HyTime specification. It is quite readable, especially if you have already
read this tutorial. You can also contact me with questions, but I can't guarantee
rapid response. For longer engagements, we can arrange a consulting contract.
My employer, ISOGEN, is the leading supplier of grove-based document management
know-how.