EXPOSING HTML/DOM VIA ATK/AT-SPI

OVERVIEW

This document specifies how document
content can be efficiently exposed using the existing ATK/AT-SPI
interfaces, with minimal modifications to the ATK/AT-SPI specifications. It
does so by creating a containment hierarchy based upon a
relevant subset of the DOM, where each container has an accessible
role which is mapped to an AtkRole from the
associated element type (e.g., "html:h1", "html:h2", "html:a", "html:p",
etc.). Contiguous regions of text typically appear as children that are
accessible text objects, and other components (e.g., widgets inside
forms) appear as their associated ATK/AT-SPI counterparts. Furthermore,
CSS attributes will be represented as AT-SPI text
attributes.

PROBLEM STATEMENT

Screen reader users need the ability to:

1. Navigate documents by higher level structure

2. Know where they are within the current document's structure

Although requirement 1 can be filled by an improved caret navigation
capability, that would force synchronization of the screen reader and
the document's point of regard, which is not always possible -- for
example, the Firefox caret cannot be set within a list of options. A
caret browsing system is also difficult to maintain because the two can
get out of sync.

Furthermore, requirement 2 really means that the screen reader needs to
be able to determine document structure anyway.

The current ATK implementation in Gecko has the following
problems when exposing document structure:

It does not expose headings, quotations, paragraphs, forms,
list containers and other structural objects

It is not possible to differentiate sequential or nested
lists from each other or even from a single large list. The list items
are simply exposed one after the next.

There is confusion about embedded objects. They are
currently exposed out of order relative to where they actually exist in
the document. Gecko must do extra error-prone work to expose them in
the new order, and the AT must unravel that. In addition, the extra
work in the Gecko implementation makes code reuse with the Windows/MSAA
implementation difficult.

There is confusion about links. In HTML, links can span
entire paragraphs or a complex set of objects. This is not modeled well
with the current system.

PLAN

The basic plan is relatively simple:

Embedded objects: Treat container
elements as AtkHypertext containers which can have AtkHyperlink children.
Embedded objects, links and other containers will appear as AtkHyperlink
children of these containers. Non-link children will not have ATK_ROLE_LINK
but will still implement AtkHyperlink, and AtkHyperlink::getURI
will be used to expose any associated file
locations. [ Aaron: Bill seems to be suggesting a hyperlink-URI
attribute instead, not sure why at the moment. ] Implement AtkHyperlinks as
full-fledged AtkObjects. The technical plan
for doing this in Mozilla will involve moving code out of mozilla/accessible/src/atk
and into cross-platform classes. We will move a lot of implementation code into
classes in the nsAccessible inheritance chain, instead of using delegation.

Because children can still be walked via other methods, an object with children
needs to implement AtkHypertext only when it:

contains text, and thus the position of the children within that text must be
exposed, or

has child contents that could be broken into several lines with multiple
children, and thus the line breaks must be exposed via start and end offsets
that can contain several children, or

has a child with a URI, which is exposed via AtkHyperlink. [Aaron: let's skip
this test, in the future it will be possible to QI directly to nsIAccessibleHyperlink
for this info]
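
The three tests above can be sketched as a single predicate. This is only an illustrative model: the Node class and its fields (text, can_wrap, uri) are hypothetical stand-ins, not Gecko or ATK types.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for an accessible object; not a Gecko or ATK type."""
    text: str = ""                      # text directly inside this object
    children: List["Node"] = field(default_factory=list)
    can_wrap: bool = False              # contents may break across several lines
    uri: Optional[str] = None           # associated file location, if any


def needs_hypertext(node: Node) -> bool:
    """Apply the three tests from the list above to an object with children."""
    if not node.children:
        return False
    contains_text = bool(node.text.strip())
    wraps_children = node.can_wrap and len(node.children) > 1
    # The bracketed note above suggests this third test may be skipped.
    child_has_uri = any(child.uri for child in node.children)
    return contains_text or wraps_children or child_has_uri
```

An object whose children are not positioned within any text, cannot wrap, and carry no URI would return False here and could rely on ordinary child traversal.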

Embedded object characters: When
arrowing over the child would move over it in one jump, an embedded object character
will be used to mark the child's place in the container's text, rather than
the child's accessible name/text. In practice, this applies to objects such as
images, image links and image maps. [ Aaron: I'm not sure this is a good rule in the case where the next
character is not in the child, but in a grandchild. ]
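
As a rough sketch of this rule, the container's text can be built by substituting U+FFFC for each child that is crossed in one jump, while recording the child's start and end offsets. The Child class and the single_jump flag are hypothetical names used only for illustration; the fallback of inlining the child's name is an assumption based on the wording above.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

EMBEDDED_OBJECT_CHAR = "\ufffc"  # U+FFFC, OBJECT REPLACEMENT CHARACTER


@dataclass
class Child:
    """Hypothetical embedded child; not an ATK type."""
    name: str          # accessible name/text of the child
    single_jump: bool  # True when arrowing moves over the child in one jump


def container_text(parts: List[Union[str, Child]]) -> Tuple[str, List[Tuple[int, int]]]:
    """Return the container text plus (start, end) offsets for each child."""
    pieces: List[str] = []
    offsets: List[Tuple[int, int]] = []
    pos = 0
    for part in parts:
        if isinstance(part, str):
            piece = part
        elif part.single_jump:
            # Mark the child's place with one embedded object character.
            piece = EMBEDDED_OBJECT_CHAR
            offsets.append((pos, pos + 1))
        else:
            # Assumption: otherwise the child's name/text appears inline.
            piece = part.name
            offsets.append((pos, pos + len(piece)))
        pieces.append(piece)
        pos += len(piece)
    return "".join(pieces), offsets
```

For example, container_text(["Here is a ", Child("beer glass", True), "."]) yields the text "Here is a \ufffc." with the child at offsets (10, 11).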

Tag names (bug
340809): Tag names will be exposed
using an ATK object or text attribute in the form of tag = namespaceabbrev:tagname. For example, "tag
= html:blockquote". In addition, the closest matching role will
be available via the normal role mechanism. A text attribute will be used instead
of an object attribute whenever the current object is an AtkText. Possible namespace
abbreviations will be "html", "svg", "xforms", "mathml" and "xul". [ Aaron: this
proposal does not yet provide a way to expose tag names for all elements,
since not all elements in the DOM are being exposed. However, if an element turns
out to be useful we can change the rules to expose it ]
[Bill: ANSWER]: We already have
Accessibility_Accessible::getAttributes, so this problem is solved in
theory. [ Aaron: the nodes we expose
are only a subset of the DOM nodes, so it does not completely provide all info about
the doc. If the AT wants to traverse the whole DOM and get everything, it can't.
Wouldn't call it a P1 though -- may not be necessary to provide entire DOM.]

DOM attributes (covered by bug
340809 above): Attributes will
be exposed via ATK object or text attributes named in the form namespace-abbrev.attrname. Use a period to separate the namespace from the attribute name, not
a colon; otherwise the AT will think that the attribute name is just the namespace
(the AT will receive a single "name:value" string, and parsing the key name logically
stops at the first colon). A text attribute will be used instead
of an object attribute whenever the current object is an AtkText.
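
A minimal sketch of why the period matters, assuming the AT splits each "name:value" string at the first colon as described. The function name parse_attribute and the sample attribute html.title are illustrative only.

```python
from typing import Tuple


def parse_attribute(raw: str) -> Tuple[str, str]:
    """Split a 'name:value' string at the first colon, as an AT might."""
    key, _, value = raw.partition(":")
    return key.strip(), value.strip()


# With a period separating the namespace, the full key survives.
period_form = parse_attribute("html.title:Home page")

# With a colon separating the namespace, the key is cut off at the namespace.
colon_form = parse_attribute("html:title:Home page")
```

Here period_form is ("html.title", "Home page"), while colon_form is ("html", "title:Home page"), so the attribute name is lost in the second case.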

CSS attributes (covered by bug
340809 above): CSS attributes will be exposed via ATK object or text attributes
in the form of css.attrname=value. A text attribute will be used instead of an object
attribute whenever the current object is an AtkText. Only the differences from the
parent's CSS will be exposed, so that the AT can see where the CSS changes relative to the container, which
is what makes the CSS important for the user. Ancestor collection will help when the AT needs
computed style.
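
The "differences only" rule, and how an AT might rebuild a computed style from its ancestors, could look roughly like this. Both function names are hypothetical, and plain dictionaries stand in for the real style data.

```python
from typing import Dict, List


def css_text_attributes(own: Dict[str, str], parent: Dict[str, str]) -> Dict[str, str]:
    """Expose only the CSS properties whose values differ from the parent's."""
    return {
        f"css.{name}": value
        for name, value in own.items()
        if parent.get(name) != value
    }


def computed_style(diffs_root_to_node: List[Dict[str, str]]) -> Dict[str, str]:
    """Merge per-object differences from the root down to one object."""
    merged: Dict[str, str] = {}
    for diff in diffs_root_to_node:
        merged.update(diff)  # nearer objects override their ancestors
    return merged
```

For a blue span inside a black 12px paragraph, only css.color would be exposed on the span; merging the ancestor chain recovers the font size as well.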

Dynamic content roles assigned by authors:
Like all attributes, dynamic roles and landmarks supplied by the author will be available
through an object attribute. For example, expect something
like "html:role = wairole:liveregion". In addition, the closest matching ATK role will be available
via the normal mechanism. In the future, the role attribute may contain a space-separated
list of roles. [ Aaron: I'm still hoping that PFWG does not move to multiple roles,
it would be difficult to deal with ]

Actions: Implement AtkAction directly
on the object. AtkHyperlink::getObject() will return null when there is no
action, but return self and implement AtkAction when there is one. AtkHyperlink::getObject()
needs to return self to be compatible with legacy ATs (Gnopernicus, GOK) that don't
expect the hyperlink to implement AccessibleAction. Modern ATs may also depend on
this functionality to have common code to deal with both paradigms. (The API isn't
changing, only the convention.)

Scroll to (no bug filed, test
and see if it works): We will ensure that
a magnifier or other AT can scroll to any element by placing the caret at the start
of the element. Scroll bar objects will also be implemented as siblings of the
object with the scroll bars.

Cross-node text selection:
[ Bill: ANSWER: See the AtkText interface docs, in particular the
text-selection-changed events, and atk_text_get_selection and atk_text_get_n_selections. ]
[ Aaron: how does this solve the problem where the selection can start in
one AtkText and end in another? ]

Value change events (bug 340672): we need to
expose these, but we won't expose the old value. It's
too difficult to get in Mozilla
and there is no use case.

We will also implement tweaks to the way individual types of items are exposed:

Tables (no bug, this may
work already, need to test): AccessibleTable would be implemented on the <table> object,
and the children could also be obtained by walking the regular
hierarchy.

Links (bug
340665): expose as ATK_ROLE_LINK, and
keep link text in parent AtkHypertext [ Aaron: need consistent rule for when to
put child object text in parent -- why just links? ]. Expose focus events using the normal focus
event, not the link-selected event. To indicate visited links, use ATK_STATE_VISITED if it is available, otherwise use ATK_STATE_CHECKED.

Forced line breaks (bug 340667): expose a <br>
simply as a \n character in the accessible text

Descriptions: use
ATK_RELATION_DESCRIPTION_FOR/DESCRIBED_BY
when they become available

Forms (bug
340671): use
ATK_ROLE_FORM if decision
makers agree to add that role. If not available, use ATK_ROLE_PANEL with "tag =
html:form"

Progress bars: Expose ATK_STATE_INDETERMINATE if the progress bar is in an indeterminate state. Alternatively, we may fail or throw an exception
when the value is requested in that case.

Alerts (no bug, mutation
events already fired but need testing, ancestor collection is an open issue): For now, use ordinary
mutation events and don't do anything special for alerts. However, ATs will not
be able to get this right without a new event for the ROLE_ALERT. For example, if
an alert or liveregion was already there but the text inside changes, you may want
to re-speak the region. In order to do that the AT needs to walk the ancestor chain
to check for a ROLE_ALERT ancestor on every children-changed or text-changed event,
which would be extremely slow. So we propose to modify the AT-SPI collection interface
to allow ancestor collection, which would be useful for many cases (such as finding
if an ancestor is editable or has focus).
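
To illustrate the cost described above, here is a sketch of the ancestor walk an AT would otherwise perform on every event. The Accessible class is a hypothetical in-memory model; with real AT-SPI objects, each parent lookup would typically be a separate cross-process call, which is the expense that ancestor collection is meant to avoid.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Accessible:
    """Hypothetical in-memory model of an accessible object."""
    role: str
    parent: Optional["Accessible"] = None


def find_ancestor_with_role(obj: Accessible, role: str) -> Tuple[Optional[Accessible], int]:
    """Walk up the parent chain; return the match and the number of lookups."""
    lookups = 0
    node = obj.parent
    while node is not None:
        lookups += 1  # with real AT-SPI objects, each step would be a remote call
        if node.role == role:
            return node, lookups
        node = node.parent
    return None, lookups
```

An event source nested deep in a document with no alert ancestor costs one lookup per level up to the root, and that cost repeats for every children-changed or text-changed event.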

ADVANTAGES

Document structure exposed. Screen reader can get info it
needs.

Simplifies implementation both for Gecko and for screen
readers

Backwards-compatible with GOK

Synchronizes ATK
implementation with MSAA implementation, making an
implementation for advanced text interfaces on Windows much easier down
the road
by allowing reuse of all the classes which implement special interfaces
(text, tables, etc.). With this huge difference removed, most of the
important code to implement those interfaces can be moved into the
cross-platform implementation.

MOZILLA IMPLEMENTATION NOTES

HTML allows almost anything, anywhere. For example, almost anything can have text
inside and embedded objects within it (exception: images and some form controls),
and thus would need to implement AtkText and AtkHypertext. In addition,
almost anything can be embedded within another object that contains text, and thus
can be an AtkHyperlink. Therefore these interfaces must be implemented
in the base accessible (nsAccessible) or base HTML accessible (nsHTMLAccessible
-- TBD) class. QueryInterface() will need to support interfaces based on tests of
what attributes and children the object has, not what kind of accessible it is.
Because the results of QueryInterface() must be constant for the lifetime of an
object (according to the rules of XPCOM), if any of these factors change, the accessible
object must be invalidated and recreated when needed.

The implementation for new-atk will move the code from nsAccessibleText and nsAccessibleHyperText
into nsHTMLAccessible. The nsIAccessibleText and nsIAccessibleHypertext interfaces
will be supported on this object, and the associated node will be the container
node, not the first text node as it is now. This will simplify the code by removing
delegation and using inheritance to add functionality. Further delegation will
be removed as we remove the unnecessary nsMai classes and implement the callbacks
and logic on ns*AccessibleWrap classes.

All ATK specialization interfaces will be moved out of mozilla/accessible/public/atk
and into mozilla/accessible/public (bug
340822). The implementations for these interfaces will
move to cross-platform classes, away from accessible/src/atk. This will enable future
extensions across platforms to take advantage of these interfaces.

We will need a new nsHTMLContainerAccessible which inherits from nsAccessibleWrap.
nsBlockAccessible and nsLinkableAccessible would inherit from it.

ATK interfaces and what needs to be changed in Mozilla to support them with the new-atk methodology

ATK interface | Mozilla XPCOM interface | Implementation

AtkText | nsIAccessibleText | nsHTMLContainerAccessible; also covers xul:label and xul:description, since those use HTML frames to contain the text (bug 340829)

AtkEditableText | nsIAccessibleEditableText | nsXULTextboxAccessible, nsHTMLTextfieldAccessible, etc. (bug 340830). With Midas, anything can be editable; we would need to handle that part in nsHTMLContainerAccessible.

AtkHypertext | nsIAccessibleHyperText | nsHTMLContainerAccessible; also covers xul:label and xul:description (bug 340829, same as for AtkText)

AtkHyperlink | nsIAccessibleHyperLink | nsAccessible -- anything can be embedded in HTML, even something from another namespace (bug 340833)

AtkValue | nsIAccessibleValue | nsAccessible -- anything can have a value via aaa:valuenow (bug 340825). Also need to move atk/nsXULProgressMeterAccessibleWrap code into xul/nsXULProgressMeterAccessible.

EXAMPLES

Mapping between HTML and the ATK representation.

Each ATK accessible object is
encapsulated in braces ("{}"), and meaningful ATK interfaces and attributes,
including specializations, are represented as name/value pairs inside
the braces.
For convenience, accessible text is shown merely as
'text="contents of the text"'.

* = Embedded object character (0xfffc),
used when no text from the object will be inserted in the parent AccessibleHypertext

HTML content

HTML source

ATK representation (all objects support AtkObject and AtkComponent)

This is a heading

This is a paragraph with an image in it.

This is another heading

<h1>This is a heading</h1><p> This is a paragraph with an <image src="image.gif" alt="some image"/> image in it.</p><h2>This is another heading</h2>

{parent AtkHypertext,
role=ATK_ROLE_PARAGRAPH, attr="html:tag=p" text="Here is a *." attribute run for "bartending site."
with textattr="link=true,
css:text-decoration=underline, css:color=(0,0,255)"} {child
AtkHypertext, AtkHyperlink,
role=ATK_ROLE_LINK, text="*bartending site"
hypertext-indices=[10,11], [not sure if we need to dup textattrs
here, or add them to defaulttextattrs],
hypertext-URI="http://foo.bar.com"}
{grandchild AtkImage, AtkHyperlink
role=ATK_ROLE_IMAGE, attr="html:tag=img,
link-type=image"
AccName/ImageDescription="beer glass",
hyperlink-indices=[0,1]
URI="beerglass.GIF"} [don't know if the URIs should
always be fully specified, or if omitting the base URI is OK] [At the moment, not planning to do this, instead plan
to expose repair text in the name if no alt/title exists ]

[Bill: we should be able to support list-style=image, and
"list-style-image=URL()", etc. this way. In the above example, it's
not clear whether the bullet should be a unicode char or just omitted
and implied by the list style. My guess is the latter (i.e. bullets
don't appear in the text)]

This is a list item.

This is another list item.

<ol> <li>This is a list item.</li> <li>This is another list item.</li></ol>

[Note that unlike user interface ATK_ROLE_LIST objects, these lists don't implement AtkSelection, and the list items' AtkStateSets do not include ATK_STATE_SELECTABLE. There is a question here as to whether <ul> and <ol> elements should always implement AtkText or not. I think it would be better if they did not, unless they had non-empty text content, but this may prove impractical.]

Tell me a little more:

<form> <div> <label for="self"> Tell me a little more: </label> </div> <div> <textarea> I am a monkey with a long tail. I like to swing from trees and eat bananas. I've recently taken up typing and plan to write my memoirs. </textarea> </div></form>

[ attribute run over the portion of the text scrolled into view? CONTROLLER_FOR/CONTROLLED_BY for the scrollbar/viewport? Alternative would be to treat all the text content as though it were visible, but that's no good for magnifiers and ATs for the mobility-impaired. Probably the textarea needs to be expanded somewhat, or at least fitted with AtkActions for scrolling plus text attribution for determining what parts of the text are currently scrolled into view, without the AT client having to resort to bounds checking in the "screen review" fashion. This is, however, a general problem with multiline text in viewports. The relatively new AtkText getBoundedRanges API reduces the pain somewhat, since you can feed it the AtkComponent bounds and it will give you back the visible text.]

[Note that because the entry field is not editable, but just displays the current selection, I think it should not be exposed (especially since it represents a node which is not present in the HTML DOM). The list items need not implement AtkAction, since the AtkSelection interface is used by the client to select among them.]